[PROPOSAL] Shared Ispell dictionaries
Hello, hackers!
Introduction
------------
I'm going to implement a patch that stores Ispell dictionaries in shared memory.
There is an extension, shared_ispell [1], developed by Tomas Vondra. But it is a bad candidate for inclusion in contrib, because it has to know a lot about the internals of the IspellDict struct in order to copy it into shared memory.
Why
---
Shared Ispell dictionaries give the following improvements:
- lower memory consumption: an Ispell dictionary is currently loaded into memory by every backend, and some dictionaries require more than 100MB
- no overhead during the first call of a full text search function (such as to_tsvector(), to_tsquery())
Implementation
--------------
It is necessary to change all structures related to IspellDict: SPNode, AffixNode, AFFIX, CMPDAffix, and IspellDict itself. None of them should use pointers, because they will be stored in shared memory. Other structures are used only during dictionary building.
It would be good to store the StopList struct in shared memory too.
All fields of the IspellDict struct that are used only during dictionary building will be moved into a new IspellDictBuild struct to decrease the required shared memory size. They are going to be released together with buildCxt.
Each dictionary will be stored in its own DSM segment. Structures for regular expressions won't be stored in shared memory; they will be compiled in every backend.
The patch will be ready and added into the 2018-03 commitfest.
Thank you for your attention. Any thoughts?
[1] github.com/tvondra/shared_ispell or github.com/postgrespro/shared_ispell
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Arthur Zakirov wrote:
Implementation
--------------
It is necessary to change all structures related with IspellDict:
SPNode, AffixNode, AFFIX, CMPDAffix, IspellDict itself. They all
shouldn't use pointers for this reason. Others are used only during
dictionary building.
So what are you going to use instead?
It would be good to store in a shared memory StopList struct too.
Sure (probably a separate patch though).
All fields of IspellDict struct, which are used only during dictionary
building, will be move into new IspellDictBuild to decrease needed
shared memory size. And they are going to be released by buildCxt.
Each dictionary will be stored in its own dsm segment.
All that sounds reasonable.
The patch will be ready and added into the 2018-03 commitfest.
So this will be a large patch not submitted to 2018-01? Depending on
size/complexity I'm not sure it's OK to submit 2018-03 only -- it may be
too late.
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
2017-12-26 17:55 GMT+01:00 Alvaro Herrera <alvherre@alvh.no-ip.org>:
Arthur Zakirov wrote:
Implementation
--------------
It is necessary to change all structures related with IspellDict:
SPNode, AffixNode, AFFIX, CMPDAffix, IspellDict itself. They all
shouldn't use pointers for this reason. Others are used only during
dictionary building.
So what are you going to use instead?
It would be good to store in a shared memory StopList struct too.
Sure (probably a separate patch though).
All fields of IspellDict struct, which are used only during dictionary
building, will be move into new IspellDictBuild to decrease needed
shared memory size. And they are going to be released by buildCxt.
Each dictionary will be stored in its own dsm segment.
All that sounds reasonable.
The patch will be ready and added into the 2018-03 commitfest.
So this will be a large patch not submitted to 2018-01? Depending on
size/complexity I'm not sure it's OK to submit 2018-03 only -- it may be
too late.
Tomas had some workable patches related to this topic.
Regards
Pavel
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Thank you for your feedback.
On Tue, Dec 26, 2017 at 01:55:57PM -0300, Alvaro Herrera wrote:
So what are you going to use instead?
For example, AffixNode and AffixNodeData represent the prefix tree of an
affix list. They are currently accessed through the Suffix and Prefix
pointers of the IspellDict struct. Instead, all affix nodes should be
placed into an array and accessed by offset. The Suffix array goes first,
the Prefix array goes after it. AffixNodeData will access a child node by
offset too.
The AffixNodeData struct has an array of pointers to AFFIX structs. This
array, with all the AFFIX data, can be stored within AffixNodeData. Or
AffixNodeData can have an array of indexes into a single AFFIX array,
which is stored within IspellDict before or after Suffix and Prefix.
The same goes for the prefix tree of the word list, represented by the
SPNode struct. It might be stored as an array after the Prefix array.
The AffixData and CompoundAffix arrays go after them.
To allocate IspellDict in this case it is necessary to calculate the
required memory size. I think the arrays mentioned above will be built
first and then memcpy'ed into IspellDict, if that doesn't take much time.
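To illustrate the idea of replacing pointers with offsets, here is a minimal standalone sketch. It is not code from spell.c or from the patch: FlatEdge, FlatNode, FlatTree and the flat_* functions are hypothetical names invented for this example, and plain realloc() stands in for the PostgreSQL allocators. It only shows that a tree whose children are referenced by byte offset keeps working after the whole buffer is copied to a different address.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical edge: the child is a byte offset into the buffer, not a pointer. */
typedef struct
{
    uint8_t  ch;      /* character leading to the child */
    uint32_t child;   /* byte offset of the child node; 0 means "no child" */
} FlatEdge;

/* Hypothetical node header followed directly by its edges. */
typedef struct
{
    uint32_t nedges;
    FlatEdge edges[]; /* flexible array member */
} FlatNode;

/* Growable buffer holding all nodes back to back; the root sits at offset 0. */
typedef struct
{
    char    *buf;
    uint32_t used;
    uint32_t cap;
} FlatTree;

/* Append a zeroed node with room for nedges edges and return its byte offset. */
static uint32_t
flat_add_node(FlatTree *t, uint32_t nedges)
{
    uint32_t size = (uint32_t) (sizeof(FlatNode) + nedges * sizeof(FlatEdge));
    uint32_t off = t->used;

    if (t->used + size > t->cap)
    {
        uint32_t newcap = t->cap ? t->cap : 256;

        while (newcap < t->used + size)
            newcap *= 2;
        t->buf = realloc(t->buf, newcap);
        assert(t->buf != NULL);
        t->cap = newcap;
    }

    memset(t->buf + off, 0, size);
    ((FlatNode *) (t->buf + off))->nedges = nedges;
    t->used += size;
    return off;
}

/* Fill edge number idx of the node stored at offset node_off. */
static void
flat_set_edge(FlatTree *t, uint32_t node_off, uint32_t idx,
              uint8_t ch, uint32_t child_off)
{
    FlatNode *node = (FlatNode *) (t->buf + node_off);

    assert(idx < node->nedges);
    node->edges[idx].ch = ch;
    node->edges[idx].child = child_off;
}

/* Follow the edge labelled ch; needs only the base address of any copy. */
static uint32_t
flat_find_child(const char *base, uint32_t node_off, uint8_t ch)
{
    const FlatNode *node = (const FlatNode *) (base + node_off);

    for (uint32_t i = 0; i < node->nedges; i++)
        if (node->edges[i].ch == ch)
            return node->edges[i].child;
    return 0;
}
```

Because the root is never anyone's child, offset 0 can double as the "no child" marker in this sketch. A reader only needs the base address of its own mapping, which is the property a DSM segment requires.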
Hope it makes sense and is reasonable.
So this will be a large patch not submitted to 2018-01? Depending on
size/complexity I'm not sure it's OK to submit 2018-03 only -- it may be
too late.
Oh, I see. I will try to prepare the patch while 2018-01 is open.
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Arthur Zakirov wrote:
On Tue, Dec 26, 2017 at 01:55:57PM -0300, Alvaro Herrera wrote:
So what are you going to use instead?
[ ... ]
To allocate IspellDict in this case it is necessary to calculate needed
memory size. I think arrays mentioned above will be built first then
memcpy'ed into IspellDict, if it won't take much time.
OK, that sounds sensible on first blush. If there are many processes
concurrently doing text searches, then the amount of memory saved may
be large enough to justify the additional processing (more so if it's
just one more memcpy cycle).
I hope that there is some way to cope with the ispell data changing
underneath -- maybe you'll need some sort of RCU?
So this will be a large patch not submitted to 2018-01? Depending on
size/complexity I'm not sure it's OK to submit 2018-03 only -- it may be
too late.
Oh, I see. I try to prepare the patch while 2018-01 is open.
It isn't necessary that the patch you present to 2018-01 be final and
complete (so don't kill yourself to achieve that) -- a preliminary patch
that reviewers can comment on is enough, as long as the final patch you
present to 2018-03 is not *too* different. But any medium-large patch
whose first post is to the last commitfest of a cycle is likely to be
thrown out to the next cycle's first commitfest very quickly.
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Dec 26, 2017 at 07:03:48PM +0100, Pavel Stehule wrote:
Tomas had some workable patches related to this topic
Tomas, are you planning to propose them?
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Hello, hackers,
On Tue, Dec 26, 2017 at 07:48:27PM +0300, Arthur Zakirov wrote:
The patch will be ready and added into the 2018-03 commitfest.
I have attached the patches.
0001-Fix-ispell-memory-handling.patch:
Some strings are allocated via compact_palloc0(), but they are not
persistent, so they should be allocated in the temporary memory context.
Also, a couple of strings are not released if the .aff file has the new format.
0002-Retreive-shmem-location-for-ispell.patch:
Adds the ispell_shmem_location() function, which looks up the location of a
dictionary using its .dict and .aff file names. If the location hasn't been
allocated in DSM yet, the function allocates it. A shared hash table is used
to search for the location.
The maximum number of hash table elements is currently NUM_DICTIONARIES=20. It
would be better to use a GUC variable. Also, if the number of elements
reaches the limit, it would be good to fall back to the backend's local
memory instead of shared memory.
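The lookup described above, including the proposed fallback when the table is full, can be sketched without any PostgreSQL APIs. This is only an illustration under stated assumptions: DictRegistry, DictSlot, DictLookupResult and registry_lookup_or_add() are hypothetical names, a linear scan over a tiny fixed array stands in for the shared hash table, an int stands in for a dsm_handle, and all locking is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define MAX_SHARED_DICTS 2   /* small stand-in for NUM_DICTIONARIES */
#define DICT_PATH_LEN    64  /* small stand-in for MAXPGPATH */

/* Hypothetical table entry keyed by the two file names. */
typedef struct
{
    bool used;
    char dictfile[DICT_PATH_LEN];
    char afffile[DICT_PATH_LEN];
    int  handle;             /* stand-in for a dsm_handle */
} DictSlot;

typedef struct
{
    DictSlot slots[MAX_SHARED_DICTS];
    int      next_handle;
} DictRegistry;

typedef enum
{
    DICT_FOUND,              /* already registered, reuse its handle */
    DICT_CREATED,            /* registered now, caller must build the dictionary */
    DICT_LOCAL_FALLBACK      /* table is full, caller builds in local memory */
} DictLookupResult;

/*
 * Look up a dictionary by its file names and register it if it is missing.
 * A real implementation would hold a lock around the whole operation.
 * Names longer than DICT_PATH_LEN - 1 are truncated in this sketch.
 */
static DictLookupResult
registry_lookup_or_add(DictRegistry *reg, const char *dictfile,
                       const char *afffile, int *handle)
{
    DictSlot *free_slot = NULL;

    for (int i = 0; i < MAX_SHARED_DICTS; i++)
    {
        DictSlot *slot = &reg->slots[i];

        if (!slot->used)
        {
            if (free_slot == NULL)
                free_slot = slot;
            continue;
        }
        if (strcmp(slot->dictfile, dictfile) == 0 &&
            strcmp(slot->afffile, afffile) == 0)
        {
            *handle = slot->handle;
            return DICT_FOUND;
        }
    }

    if (free_slot == NULL)
    {
        *handle = -1;
        return DICT_LOCAL_FALLBACK;
    }

    snprintf(free_slot->dictfile, DICT_PATH_LEN, "%s", dictfile);
    snprintf(free_slot->afffile, DICT_PATH_LEN, "%s", afffile);
    free_slot->handle = ++reg->next_handle;
    free_slot->used = true;
    *handle = free_slot->handle;
    return DICT_CREATED;
}
```

Returning a distinct DICT_LOCAL_FALLBACK result, rather than raising an error, is one possible way to express the "use local memory when the limit is reached" idea; the attached patch does not implement it yet.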
0003-Store-ispell-structures-in-shmem.patch:
Introduces the IspellDictBuild and IspellDictData structures and removes the
IspellDict structure. IspellDictBuild is used within the dispell_build()
function while building the dictionary, if it hasn't been allocated in DSM
earlier. IspellDictBuild has a pointer to the
IspellDictData structure, which is filled with the persistent data.
After the dictionary is built, IspellDictData is copied into its
DSM location and the temporary data of IspellDictBuild is released.
All prefix trees are now stored as flat arrays. Those arrays are
allocated and stored using the NodeArray struct. A required node can be
retrieved by its node offset. The AffixData and Affix arrays have an
additional offset array to retrieve an element by index.
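The "offset array to retrieve an element by index" idea can be sketched as a small string pool. This is a standalone illustration, not the patch's code: StringPool, pool_add() and pool_get() are hypothetical names, and realloc() stands in for repalloc(). Strings of different lengths are appended to one buffer, and a parallel array records where each one starts.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical pool: variable-length strings plus an index of start offsets. */
typedef struct
{
    char     *data;      /* concatenated NUL-terminated strings */
    uint32_t  data_end;  /* number of bytes used in data */
    uint32_t  data_cap;  /* number of bytes allocated for data */
    uint32_t *offsets;   /* offsets[i] is where string i starts in data */
    uint32_t  n;         /* number of strings stored */
    uint32_t  m;         /* number of offsets allocated */
} StringPool;

/* Append a copy of s and return its index. */
static uint32_t
pool_add(StringPool *p, const char *s)
{
    uint32_t len = (uint32_t) strlen(s) + 1;    /* include the trailing NUL */

    /* Grow the data buffer if the string does not fit. */
    if (p->data_end + len > p->data_cap)
    {
        uint32_t newcap = p->data_cap ? p->data_cap : 64;

        while (newcap < p->data_end + len)
            newcap *= 2;
        p->data = realloc(p->data, newcap);
        assert(p->data != NULL);
        p->data_cap = newcap;
    }

    /* Grow the offset array if all slots are taken. */
    if (p->n >= p->m)
    {
        uint32_t newm = p->m ? p->m * 2 : 8;

        p->offsets = realloc(p->offsets, newm * sizeof(uint32_t));
        assert(p->offsets != NULL);
        p->m = newm;
    }

    p->offsets[p->n] = p->data_end;
    memcpy(p->data + p->data_end, s, len);
    p->data_end += len;
    return p->n++;
}

/* Retrieve string number i through the offset array. */
static const char *
pool_get(const StringPool *p, uint32_t i)
{
    assert(i < p->n);
    return p->data + p->offsets[i];
}
```

Both arrays contain only integers and bytes, so they can be copied into another memory segment without any fix-ups.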
The Affix field (array of AFFIX) of IspellDictBuild is persistent data too, but it is
constructed as a temporary array first, because the Affix array needs to be sorted
via qsort() within NISortAffixes().
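A minimal sketch of "sort a temporary array first, then store it flat" follows. It is an illustration only, assuming the temporary array holds pointers to variable-length records: Rec, rec_new(), rec_size(), cmp_rec_ptr() and pack_sorted() are hypothetical names, and the record layout is much simpler than the real AFFIX struct. Note that the comparator receives pointers to the array elements, so it has to dereference twice.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical variable-length record; its alignment requirement is 1. */
typedef struct
{
    uint8_t type;        /* illustrative kind of the record */
    char    repl[];      /* NUL-terminated string stored inline */
} Rec;

/* Size of a record including its inline string. */
static size_t
rec_size(const Rec *r)
{
    return sizeof(Rec) + strlen(r->repl) + 1;
}

/* Allocate a temporary record. */
static Rec *
rec_new(uint8_t type, const char *repl)
{
    Rec *r = malloc(sizeof(Rec) + strlen(repl) + 1);

    assert(r != NULL);
    r->type = type;
    strcpy(r->repl, repl);
    return r;
}

/* qsort() comparator for an array of Rec pointers. */
static int
cmp_rec_ptr(const void *a, const void *b)
{
    const Rec *r1 = *(Rec *const *) a;
    const Rec *r2 = *(Rec *const *) b;

    if (r1->type != r2->type)
        return (r1->type < r2->type) ? -1 : 1;
    return strcmp(r1->repl, r2->repl);
}

/*
 * Sort the temporary pointer array, then copy the records back to back into
 * buf and record each start in offsets[]. Returns the number of bytes
 * written. The caller must provide a large enough buf. Records with a
 * stricter alignment requirement would need padding between them.
 */
static size_t
pack_sorted(Rec **recs, size_t n, char *buf, uint32_t *offsets)
{
    size_t end = 0;

    qsort(recs, n, sizeof(Rec *), cmp_rec_ptr);

    for (size_t i = 0; i < n; i++)
    {
        size_t size = rec_size(recs[i]);

        offsets[i] = (uint32_t) end;
        memcpy(buf + end, recs[i], size);
        end += size;
    }
    return end;
}
```

Sorting pointers keeps qsort() usable even though the records have different sizes; only the final packed copy needs the offset array.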
So IspellDictData stores:
- AffixData - array of strings, accessed via AffixDataOffset
- Affix - array of AFFIX, accessed via AffixOffset
- DictNodes, PrefixNodes, SuffixNodes - prefix trees as plain arrays
- CompoundAffix - array of CMPDAffix, accessed sequentially
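One way to picture such a structure is a fixed header that records where each region starts, followed by the regions themselves in a single allocation. The sketch below is an assumption-laden illustration, not the layout used by the patch: PackedDict, Region, packed_build() and packed_region() are hypothetical names, only three regions are used, and alignment padding between regions is ignored.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define NREGIONS 3

/* Hypothetical description of one source array built in temporary memory. */
typedef struct
{
    const void *src;
    uint32_t    size;    /* size in bytes */
} Region;

/* Hypothetical packed result: a header followed by all regions back to back. */
typedef struct
{
    uint32_t start[NREGIONS];   /* start of each region, relative to data */
    uint32_t size[NREGIONS];    /* size of each region in bytes */
    char     data[];            /* flexible array member */
} PackedDict;

/*
 * Compute the total size first, allocate once, then copy each region.
 * Each region starts where the previous one ends.
 */
static PackedDict *
packed_build(const Region *regions, size_t *total_size)
{
    uint32_t    payload = 0;
    uint32_t    offset = 0;
    PackedDict *d;

    for (int i = 0; i < NREGIONS; i++)
        payload += regions[i].size;

    *total_size = sizeof(PackedDict) + payload;
    d = malloc(*total_size);
    assert(d != NULL);

    for (int i = 0; i < NREGIONS; i++)
    {
        d->start[i] = offset;
        d->size[i] = regions[i].size;
        memcpy(d->data + offset, regions[i].src, regions[i].size);
        offset += regions[i].size;
    }
    return d;
}

/* Accessor: address of region i inside any copy of the packed structure. */
static const char *
packed_region(const PackedDict *d, int i)
{
    assert(i >= 0 && i < NREGIONS);
    return d->data + d->start[i];
}
```

Since every start is derived from the sizes of the regions before it, the result can be copied with a single memcpy() of total_size bytes, which matches the "build first, then memcpy" approach discussed earlier in the thread.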
I had to remove compact_palloc0(), added by Pavel in
3e5f9412d0a818be77c974e5af710928097b91f3. The Ispell dictionary doesn't need
such allocation anymore. It was used to allocate many small chunks. I
will definitely check the performance of the Czech dictionary.
Remaining to-do items:
- add the GUC-variable for hash table limit
- fix bugs
- improve comments
- performance testing
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Attachments:
0001-Fix-ispell-memory-handling.patch (text/plain; charset=us-ascii)
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index 9a09ffb20a..6617c2cf05 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -498,7 +498,7 @@ NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
strcpy(Conf->Spell[Conf->nspell]->word, word);
Conf->Spell[Conf->nspell]->p.flag = (*flag != '\0')
- ? cpstrdup(Conf, flag) : VoidString;
+ ? MemoryContextStrdup(Conf->buildCxt, flag) : VoidString;
Conf->nspell++;
}
@@ -1040,7 +1040,7 @@ setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
entry->flag.i = i;
}
else
- entry->flag.s = cpstrdup(Conf, s);
+ entry->flag.s = MemoryContextStrdup(Conf->buildCxt, s);
entry->flagMode = Conf->flagMode;
entry->value = val;
@@ -1536,6 +1536,9 @@ nextline:
return;
isnewformat:
+ pfree(recoded);
+ pfree(pstr);
+
if (oldformat)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
0002-Retreive-shmem-location-for-ispell.patch (text/plain; charset=us-ascii)
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 2d1ed143e0..86a6df131b 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -44,6 +44,7 @@
#include "storage/procsignal.h"
#include "storage/sinvaladt.h"
#include "storage/spin.h"
+#include "tsearch/ts_shared.h"
#include "utils/backend_random.h"
#include "utils/snapmgr.h"
@@ -150,6 +151,7 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
size = add_size(size, SyncScanShmemSize());
size = add_size(size, AsyncShmemSize());
size = add_size(size, BackendRandomShmemSize());
+ size = add_size(size, TsearchShmemSize());
#ifdef EXEC_BACKEND
size = add_size(size, ShmemBackendArraySize());
#endif
@@ -271,6 +273,11 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
AsyncShmemInit();
BackendRandomShmemInit();
+ /*
+ * Set up shared memory for tsearch
+ */
+ TsearchShmemInit();
+
#ifdef EXEC_BACKEND
/*
diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c
index eab98b0760..d8c8cc8cc3 100644
--- a/src/backend/storage/lmgr/lwlock.c
+++ b/src/backend/storage/lmgr/lwlock.c
@@ -520,6 +520,7 @@ RegisterLWLockTranches(void)
"shared_tuplestore");
LWLockRegisterTranche(LWTRANCHE_TBM, "tbm");
LWLockRegisterTranche(LWTRANCHE_PARALLEL_APPEND, "parallel_append");
+ LWLockRegisterTranche(LWTRANCHE_TSEARCH_DSA, "tsearch_dsa");
/* Register named tranches. */
for (i = 0; i < NamedLWLockTrancheRequests; i++)
diff --git a/src/backend/tsearch/Makefile b/src/backend/tsearch/Makefile
index 34fe4c5b3c..1c8c9c5ed7 100644
--- a/src/backend/tsearch/Makefile
+++ b/src/backend/tsearch/Makefile
@@ -26,7 +26,7 @@ DICTFILES_PATH=$(addprefix dicts/,$(DICTFILES))
OBJS = ts_locale.o ts_parse.o wparser.o wparser_def.o dict.o \
dict_simple.o dict_synonym.o dict_thesaurus.o \
dict_ispell.o regis.o spell.o \
- to_tsany.o ts_selfuncs.o ts_typanalyze.o ts_utils.o
+ to_tsany.o ts_selfuncs.o ts_shared.o ts_typanalyze.o ts_utils.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/tsearch/ts_shared.c b/src/backend/tsearch/ts_shared.c
new file mode 100644
index 0000000000..03fe615b1c
--- /dev/null
+++ b/src/backend/tsearch/ts_shared.c
@@ -0,0 +1,163 @@
+/*-------------------------------------------------------------------------
+ *
+ * ts_shared.c
+ * tsearch shared memory management
+ *
+ * Portions Copyright (c) 1996-2017, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/backend/tsearch/ts_shared.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "storage/dsm.h"
+#include "storage/lwlock.h"
+#include "storage/shmem.h"
+#include "tsearch/ts_shared.h"
+
+/* XXX should it be a GUC-variable? */
+#define NUM_DICTIONARIES 20
+
+typedef struct
+{
+ char dictfile[MAXPGPATH];
+ char afffile[MAXPGPATH];
+} TsearchDictKey;
+
+typedef struct
+{
+ TsearchDictKey key;
+ dsm_handle dict_dsm;
+} TsearchDictEntry;
+
+typedef struct
+{
+ LWLock lock;
+} TsearchCtlData;
+
+static TsearchCtlData *tsearch_ctl;
+static HTAB *dict_table;
+
+/*
+ * Return handle to a dynamic shared memory.
+ *
+ * dictbuild: building structure for the dictionary.
+ * dictfile: .dict file of the dictionary.
+ * afffile: .aff file of the dictionary.
+ * allocate_cb: function to build the dictionary, if it wasn't found in DSM.
+ */
+void *
+ispell_shmem_location(void *dictbuild,
+ const char *dictfile, const char *afffile,
+ ispell_build_callback allocate_cb)
+{
+ TsearchDictKey key;
+ TsearchDictEntry *entry;
+ bool found;
+ dsm_segment *seg;
+ void *res;
+
+ StrNCpy(key.dictfile, dictfile, MAXPGPATH);
+ StrNCpy(key.afffile, afffile, MAXPGPATH);
+
+refind_entry:
+ LWLockAcquire(&tsearch_ctl->lock, LW_SHARED);
+
+ entry = (TsearchDictEntry *) hash_search(dict_table, &key, HASH_FIND,
+ &found);
+
+ /* Dictionary hasn't been loaded into memory yet */
+ if (!found)
+ {
+ void *ispell_dict;
+ Size ispell_size;
+
+ /* Try to get exclusive lock */
+ LWLockRelease(&tsearch_ctl->lock);
+ if (!LWLockAcquireOrWait(&tsearch_ctl->lock, LW_EXCLUSIVE))
+ {
+ /*
+ * The lock was released by another backend, try to refind an entry.
+ */
+ goto refind_entry;
+ }
+
+ entry = (TsearchDictEntry *) hash_search(dict_table, &key, HASH_ENTER,
+ &found);
+
+ Assert(!found);
+
+ /* The lock was free so add new entry */
+ ispell_dict = allocate_cb(dictbuild, dictfile, afffile, &ispell_size);
+
+ seg = dsm_create(ispell_size, 0);
+ res = dsm_segment_address(seg);
+ memcpy(res, ispell_dict, ispell_size);
+
+ pfree(ispell_dict);
+
+ entry->dict_dsm = dsm_segment_handle(seg);
+
+ /* Remain attached until end of postmaster */
+ dsm_pin_segment(seg);
+
+ dsm_detach(seg);
+ }
+ else
+ {
+ seg = dsm_attach(entry->dict_dsm);
+ res = dsm_segment_address(seg);
+
+ dsm_detach(seg);
+ }
+
+ LWLockRelease(&tsearch_ctl->lock);
+
+ return res;
+}
+
+/*
+ * Allocate and initialize tsearch-related shared memory.
+ */
+void
+TsearchShmemInit(void)
+{
+ HASHCTL ctl;
+ bool found;
+
+ tsearch_ctl = (TsearchCtlData *)
+ ShmemInitStruct("Full Text Search Ctl", TsearchShmemSize(), &found);
+
+ if (!found)
+ LWLockInitialize(&tsearch_ctl->lock, LWTRANCHE_TSEARCH_DSA);
+
+ memset(&ctl, 0, sizeof(ctl));
+ ctl.keysize = sizeof(TsearchDictKey);
+ ctl.entrysize = sizeof(TsearchDictEntry);
+
+ dict_table = ShmemInitHash("Shared Tsearch Lookup Table",
+ NUM_DICTIONARIES, NUM_DICTIONARIES,
+ &ctl,
+ HASH_ELEM | HASH_BLOBS);
+}
+
+/*
+ * Report shared memory space needed by TsearchShmemInit.
+ */
+Size
+TsearchShmemSize(void)
+{
+ Size size = 0;
+
+ /* size of service structure */
+ size = add_size(size, MAXALIGN(sizeof(TsearchCtlData)));
+
+ /* size of lookup hash table */
+ size = add_size(size, hash_estimate_size(NUM_DICTIONARIES,
+ sizeof(TsearchDictEntry)));
+
+ return size;
+}
diff --git a/src/include/storage/lwlock.h b/src/include/storage/lwlock.h
index 97e4a0bbbd..3d41073b60 100644
--- a/src/include/storage/lwlock.h
+++ b/src/include/storage/lwlock.h
@@ -219,6 +219,7 @@ typedef enum BuiltinTrancheIds
LWTRANCHE_SHARED_TUPLESTORE,
LWTRANCHE_TBM,
LWTRANCHE_PARALLEL_APPEND,
+ LWTRANCHE_TSEARCH_DSA,
LWTRANCHE_FIRST_USER_DEFINED
} BuiltinTrancheIds;
diff --git a/src/include/tsearch/ts_shared.h b/src/include/tsearch/ts_shared.h
new file mode 100644
index 0000000000..ded3a7c2ec
--- /dev/null
+++ b/src/include/tsearch/ts_shared.h
@@ -0,0 +1,30 @@
+/*-------------------------------------------------------------------------
+ *
+ * ts_shared.h
+ * tsearch shared memory management
+ *
+ * Portions Copyright (c) 1996-2017, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/tsearch/ts_shared.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef TS_SHARED_H
+#define TS_SHARED_H
+
+#include "c.h"
+
+typedef void *(*ispell_build_callback) (void *dictbuild,
+ const char *dictfile,
+ const char *afffile,
+ Size *size);
+
+extern void *ispell_shmem_location(void *dictbuild,
+ const char *dictfile, const char *afffile,
+ ispell_build_callback allocate_cb);
+
+extern void TsearchShmemInit(void);
+extern Size TsearchShmemSize(void);
+
+#endif /* TS_SHARED_H */
0003-Store-ispell-structures-in-shmem.patch (text/plain; charset=us-ascii)
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index 8f61bd2830..970ce868df 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -16,6 +16,7 @@
#include "commands/defrem.h"
#include "tsearch/dicts/spell.h"
#include "tsearch/ts_locale.h"
+#include "tsearch/ts_shared.h"
#include "tsearch/ts_utils.h"
#include "utils/builtins.h"
@@ -23,48 +24,44 @@
typedef struct
{
StopList stoplist;
- IspellDict obj;
+ IspellDictBuild obj;
} DictISpell;
+static void *dispell_build(void *dictbuild,
+ const char *dictfile, const char *afffile,
+ Size *size);
+
Datum
dispell_init(PG_FUNCTION_ARGS)
{
List *dictoptions = (List *) PG_GETARG_POINTER(0);
DictISpell *d;
- bool affloaded = false,
- dictloaded = false,
- stoploaded = false;
+ char *dictfile = NULL,
+ *afffile = NULL;
+ bool stoploaded = false;
ListCell *l;
d = (DictISpell *) palloc0(sizeof(DictISpell));
- NIStartBuild(&(d->obj));
-
foreach(l, dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
if (pg_strcasecmp(defel->defname, "DictFile") == 0)
{
- if (dictloaded)
+ if (dictfile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple DictFile parameters")));
- NIImportDictionary(&(d->obj),
- get_tsearch_config_filename(defGetString(defel),
- "dict"));
- dictloaded = true;
+ dictfile = get_tsearch_config_filename(defGetString(defel), "dict");
}
else if (pg_strcasecmp(defel->defname, "AffFile") == 0)
{
- if (affloaded)
+ if (afffile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple AffFile parameters")));
- NIImportAffixes(&(d->obj),
- get_tsearch_config_filename(defGetString(defel),
- "affix"));
- affloaded = true;
+ afffile = get_tsearch_config_filename(defGetString(defel), "affix");
}
else if (pg_strcasecmp(defel->defname, "StopWords") == 0)
{
@@ -84,12 +81,16 @@ dispell_init(PG_FUNCTION_ARGS)
}
}
- if (affloaded && dictloaded)
+ if (dictfile && afffile)
{
- NISortDictionary(&(d->obj));
- NISortAffixes(&(d->obj));
+ IspellDictData *dict;
+
+ dict = ispell_shmem_location(&d->obj, dictfile, afffile,
+ dispell_build);
+
+ d->obj.dict = (IspellDictData *) dict;
}
- else if (!affloaded)
+ else if (!afffile)
{
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -102,8 +103,6 @@ dispell_init(PG_FUNCTION_ARGS)
errmsg("missing DictFile parameter")));
}
- NIFinishBuild(&(d->obj));
-
PG_RETURN_POINTER(d);
}
@@ -122,7 +121,7 @@ dispell_lexize(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(NULL);
txt = lowerstr_with_len(in, len);
- res = NINormalizeWord(&(d->obj), txt);
+ res = NINormalizeWord(d->obj.dict, txt);
if (res == NULL)
PG_RETURN_POINTER(NULL);
@@ -146,3 +145,36 @@ dispell_lexize(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(res);
}
+
+/*
+ * Build the dictionary.
+ *
+ * Result is palloc'ed.
+ */
+static void *
+dispell_build(void *dictbuild, const char *dictfile, const char *afffile,
+ Size *size)
+{
+ IspellDictBuild *build = (IspellDictBuild *) dictbuild;
+
+ Assert(dictfile && afffile);
+
+ NIStartBuild(build);
+
+ /* Read files */
+ NIImportDictionary(build, dictfile);
+ NIImportAffixes(build, afffile);
+
+ /* Build persistent data to use by backends */
+ NISortDictionary(build);
+ NISortAffixes(build);
+
+ NICopyData(build);
+
+ /* Release temporary data */
+ NIFinishBuild(build);
+
+ /* Return the buffer and its size */
+ *size = build->dict_size;
+ return build->dict;
+}
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index 6617c2cf05..5ce5f6f735 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -46,9 +46,9 @@
* Memory management
* -----------------
*
- * The IspellDict structure has the Spell field which is used only in compile
- * time. The Spell field stores a words list. It can take a lot of memory.
- * Therefore when a dictionary is compiled this field is cleared by
+ * The IspellDictBuild structure has the Spell field which is used only in
+ * compile time. The Spell field stores a words list. It can take a lot of
+ * memory. Therefore when a dictionary is compiled this field is cleared by
* NIFinishBuild().
*
* All resources which should cleared by NIFinishBuild() is initialized using
@@ -73,110 +73,145 @@
* after the initialization is done. During initialization,
* CurrentMemoryContext is the long-lived memory context associated
* with the dictionary cache entry. We keep the short-lived stuff
- * in the Conf->buildCxt context.
+ * in the ConfBuild->buildCxt context.
*/
-#define tmpalloc(sz) MemoryContextAlloc(Conf->buildCxt, (sz))
-#define tmpalloc0(sz) MemoryContextAllocZero(Conf->buildCxt, (sz))
+#define tmpalloc(sz) MemoryContextAlloc(ConfBuild->buildCxt, (sz))
+#define tmpalloc0(sz) MemoryContextAllocZero(ConfBuild->buildCxt, (sz))
/*
* Prepare for constructing an ISpell dictionary.
*
- * The IspellDict struct is assumed to be zeroed when allocated.
+ * The IspellDictBuild struct is assumed to be zeroed when allocated.
*/
void
-NIStartBuild(IspellDict *Conf)
+NIStartBuild(IspellDictBuild *ConfBuild)
{
+ uint32 dict_size;
+
/*
* The temp context is a child of CurTransactionContext, so that it will
* go away automatically on error.
*/
- Conf->buildCxt = AllocSetContextCreate(CurTransactionContext,
- "Ispell dictionary init context",
- ALLOCSET_DEFAULT_SIZES);
+ ConfBuild->buildCxt = AllocSetContextCreate(CurTransactionContext,
+ "Ispell dictionary init context",
+ ALLOCSET_DEFAULT_SIZES);
+
+ /*
+ * Allocate buffer for the dictionary in current context not in buildCxt.
+ */
+ dict_size = MAXALIGN(IspellDictDataHdrSize);
+ ConfBuild->dict = palloc0(dict_size);
+ ConfBuild->dict_size = dict_size;
}
/*
- * Clean up when dictionary construction is complete.
+ * Copy temporary data into IspellDictData.
*/
void
-NIFinishBuild(IspellDict *Conf)
+NICopyData(IspellDictBuild *ConfBuild)
{
- /* Release no-longer-needed temp memory */
- MemoryContextDelete(Conf->buildCxt);
- /* Just for cleanliness, zero the now-dangling pointers */
- Conf->buildCxt = NULL;
- Conf->Spell = NULL;
- Conf->firstfree = NULL;
- Conf->CompoundAffixFlags = NULL;
-}
+ IspellDictData *dict;
+ uint32 size;
+ int i;
+ uint32 *offsets,
+ offset;
+ /*
+ * Calculate necessary space
+ */
+ size = ConfBuild->nAffixData * sizeof(uint32);
+ size += ConfBuild->AffixDataEnd;
-/*
- * "Compact" palloc: allocate without extra palloc overhead.
- *
- * Since we have no need to free the ispell data items individually, there's
- * not much value in the per-chunk overhead normally consumed by palloc.
- * Getting rid of it is helpful since ispell can allocate a lot of small nodes.
- *
- * We currently pre-zero all data allocated this way, even though some of it
- * doesn't need that. The cpalloc and cpalloc0 macros are just documentation
- * to indicate which allocations actually require zeroing.
- */
-#define COMPACT_ALLOC_CHUNK 8192 /* amount to get from palloc at once */
-#define COMPACT_MAX_REQ 1024 /* must be < COMPACT_ALLOC_CHUNK */
+ size += ConfBuild->nAffix * sizeof(uint32);
+ size += ConfBuild->AffixSize;
-static void *
-compact_palloc0(IspellDict *Conf, size_t size)
-{
- void *result;
+ size += ConfBuild->DictNodes.NodesEnd;
+ size += ConfBuild->PrefixNodes.NodesEnd;
+ size += ConfBuild->SuffixNodes.NodesEnd;
- /* Should only be called during init */
- Assert(Conf->buildCxt != NULL);
+ size += ConfBuild->nCompoundAffix;
- /* No point in this for large chunks */
- if (size > COMPACT_MAX_REQ)
- return palloc0(size);
+ /*
+ * Copy data itself
+ */
+ ConfBuild->dict_size = IspellDictDataHdrSize + size;
+ ConfBuild->dict = repalloc(ConfBuild->dict, ConfBuild->dict_size);
+
+ dict = ConfBuild->dict;
+
+ /* AffixData */
+ dict->nAffixData = ConfBuild->nAffixData;
+ dict->AffixDataStart = sizeof(uint32) * ConfBuild->nAffixData;
+ memcpy(DictAffixDataOffset(dict), ConfBuild->AffixDataOffset,
+ sizeof(uint32) * ConfBuild->nAffixData);
+ memcpy(DictAffixData(dict), ConfBuild->AffixData, ConfBuild->AffixDataEnd);
+
+ /* Affix array */
+ dict->nAffix = ConfBuild->nAffix;
+ dict->AffixOffsetStart = dict->AffixDataStart + ConfBuild->AffixDataEnd;
+ dict->AffixStart = dict->AffixOffsetStart + sizeof(uint32) * ConfBuild->nAffix;
+ offsets = (uint32 *) DictAffixOffset(dict);
+ offset = 0;
+ for (i = 0; i < ConfBuild->nAffix; i++)
+ {
+ AFFIX *affix;
+ uint32 size = AffixGetSize(ConfBuild->Affix[i]);
- /* Keep everything maxaligned */
- size = MAXALIGN(size);
+ offsets[i] = offset;
+ affix = (AFFIX *) DictAffixGet(dict, i);
+ Assert(affix);
- /* Need more space? */
- if (size > Conf->avail)
- {
- Conf->firstfree = palloc0(COMPACT_ALLOC_CHUNK);
- Conf->avail = COMPACT_ALLOC_CHUNK;
- }
+ memcpy(affix, ConfBuild->Affix[i], size);
- result = (void *) Conf->firstfree;
- Conf->firstfree += size;
- Conf->avail -= size;
+ offset += size;
+ }
- return result;
+ /* DictNodes prefix tree */
+ dict->DictNodesStart = dict->AffixStart + ConfBuild->DictNodes.NodesEnd;
+ memcpy(DictDictNodes(dict), ConfBuild->DictNodes.Nodes,
+ ConfBuild->DictNodes.NodesEnd);
+
+ /* PrefixNodes prefix tree */
+ dict->PrefixNodesStart = dict->DictNodesStart + ConfBuild->PrefixNodes.NodesEnd;
+ memcpy(DictPrefixNodes(dict), ConfBuild->PrefixNodes.Nodes,
+ ConfBuild->PrefixNodes.NodesEnd);
+
+ /* SuffixNodes prefix tree */
+ dict->SuffixNodesStart = dict->PrefixNodesStart + ConfBuild->SuffixNodes.NodesEnd;
+ memcpy(DictSuffixNodes(dict), ConfBuild->SuffixNodes.Nodes,
+ ConfBuild->SuffixNodes.NodesEnd);
+
+ /* CompoundAffix array */
+ dict->CompoundAffixStart = dict->SuffixNodesStart +
+ sizeof(CMPDAffix) * ConfBuild->nCompoundAffix;
+ memcpy(DictCompoundAffix(dict), ConfBuild->CompoundAffix,
+ sizeof(CMPDAffix) * ConfBuild->nCompoundAffix);
}
-#define cpalloc(size) compact_palloc0(Conf, size)
-#define cpalloc0(size) compact_palloc0(Conf, size)
-
-static char *
-cpstrdup(IspellDict *Conf, const char *str)
+/*
+ * Clean up when dictionary construction is complete.
+ */
+void
+NIFinishBuild(IspellDictBuild *ConfBuild)
{
- char *res = cpalloc(strlen(str) + 1);
-
- strcpy(res, str);
- return res;
+ /* Release no-longer-needed temp memory */
+ MemoryContextDelete(ConfBuild->buildCxt);
+ /* Just for cleanliness, zero the now-dangling pointers */
+ ConfBuild->buildCxt = NULL;
+ ConfBuild->Spell = NULL;
+ ConfBuild->CompoundAffixFlags = NULL;
}
-
/*
* Apply lowerstr(), producing a temporary result (in the buildCxt).
*/
static char *
-lowerstr_ctx(IspellDict *Conf, const char *src)
+lowerstr_ctx(IspellDictBuild *ConfBuild, const char *src)
{
MemoryContext saveCtx;
char *dst;
- saveCtx = MemoryContextSwitchTo(Conf->buildCxt);
+ saveCtx = MemoryContextSwitchTo(ConfBuild->buildCxt);
dst = lowerstr(src);
MemoryContextSwitchTo(saveCtx);
@@ -188,7 +223,7 @@ lowerstr_ctx(IspellDict *Conf, const char *src)
#define STRNCMP(s,p) strncmp( (s), (p), strlen(p) )
#define GETWCHAR(W,L,N,T) ( ((const uint8*)(W))[ ((T)==FF_PREFIX) ? (N) : ( (L) - 1 - (N) ) ] )
-#define GETCHAR(A,N,T) GETWCHAR( (A)->repl, (A)->replen, N, T )
+#define GETCHAR(A,N,T) GETWCHAR( AffixFieldRepl(A), (A)->replen, N, T )
static char *VoidString = "";
@@ -309,18 +344,119 @@ strbncmp(const unsigned char *s1, const unsigned char *s2, size_t count)
static int
cmpaffix(const void *s1, const void *s2)
{
- const AFFIX *a1 = (const AFFIX *) s1;
- const AFFIX *a2 = (const AFFIX *) s2;
+ const AFFIX *a1 = *((AFFIX *const *) s1);
+ const AFFIX *a2 = *((AFFIX *const *) s2);
if (a1->type < a2->type)
return -1;
if (a1->type > a2->type)
return 1;
if (a1->type == FF_PREFIX)
- return strcmp(a1->repl, a2->repl);
+ return strcmp(AffixFieldRepl(a1), AffixFieldRepl(a2));
else
- return strbcmp((const unsigned char *) a1->repl,
- (const unsigned char *) a2->repl);
+ return strbcmp((const unsigned char *) AffixFieldRepl(a1),
+ (const unsigned char *) AffixFieldRepl(a2));
+}
+
+/*
+ * Allocate space for AffixData.
+ */
+static void
+NIInitAffixData(IspellDictBuild *ConfBuild, int numAffixData)
+{
+ uint32 size;
+
+ size = 8 * 1024 /* Reserve 8KB for data */;
+
+ ConfBuild->AffixData = (char *) tmpalloc(size);
+ ConfBuild->AffixDataSize = size;
+ ConfBuild->AffixDataOffset = (uint32 *) tmpalloc(numAffixData * sizeof(uint32));
+ ConfBuild->nAffixData = 0;
+ ConfBuild->mAffixData= numAffixData;
+
+ /* Save offset of the end of data */
+ ConfBuild->AffixDataEnd = 0;
+}
+
+/*
+ * Add affix set of affix flags into IspellDict struct. If IspellDict doesn't
+ * fit new affix set then resize it.
+ *
+ * ConfBuild: building structure for the current dictionary.
+ * AffixSet: set of affix flags.
+ */
+static void
+NIAddAffixSet(IspellDictBuild *ConfBuild, const char *AffixSet,
+ uint32 AffixSetLen)
+{
+ /*
+ * Check available space for AffixSet.
+ */
+ if (ConfBuild->AffixDataEnd + AffixSetLen + 1 /* \0 */ >=
+ ConfBuild->AffixDataSize)
+ {
+ uint32 newsize = Max(ConfBuild->AffixDataSize + 8 * 1024 /* 8KB */,
+ ConfBuild->AffixDataSize + AffixSetLen + 1);
+
+ ConfBuild->AffixData = (char *) repalloc(ConfBuild->AffixData, newsize);
+ ConfBuild->AffixDataSize = newsize;
+ }
+
+ /* Check available number of offsets */
+ if (ConfBuild->nAffixData >= ConfBuild->mAffixData)
+ {
+ ConfBuild->mAffixData *= 2;
+ ConfBuild->AffixDataOffset = (uint32 *) repalloc(ConfBuild->AffixDataOffset,
+ sizeof(uint32) * ConfBuild->mAffixData);
+ }
+
+ ConfBuild->AffixDataOffset[ConfBuild->nAffixData] = ConfBuild->AffixDataEnd;
+ StrNCpy(AffixDataGet(ConfBuild, ConfBuild->nAffixData),
+ AffixSet, AffixSetLen + 1);
+
+ /* Save offset of the end of data */
+ ConfBuild->AffixDataEnd += AffixSetLen + 1;
+ ConfBuild->nAffixData++;
+}
+
+/*
+ * Allocate space for prefix tree node.
+ *
+ * ConfBuild: building structure for the current dictionary.
+ * array: NodeArray where to allocate new node.
+ * length: number of allocated NodeData.
+ * sizeNodeData: minimum size of each NodeData.
+ * sizeNodeHeader: size of header of new node.
+ *
+ * Returns an offset of new node in NodeArray.
+ */
+static uint32
+NIAllocateNode(IspellDictBuild *ConfBuild, NodeArray *array, uint32 length,
+ uint32 sizeNodeData, uint32 sizeNodeHeader)
+{
+ uint32 node_offset;
+ uint32 size;
+
+ size = sizeNodeHeader + length * sizeNodeData;
+ size = MAXALIGN(size);
+
+ if (array->NodesSize == 0)
+ {
+ array->NodesSize = size * 32; /* Reserve space for next levels of the
+ * prefix tree */
+ array->Nodes = (char *) tmpalloc(array->NodesSize);
+ array->NodesEnd = 0;
+ }
+ else if (array->NodesEnd + size >= array->NodesSize)
+ {
+ array->NodesSize = Max(array->NodesSize * 2, array->NodesSize + size);
+ array->Nodes = (char *) repalloc(array->Nodes, array->NodesSize);
+ }
+
+ node_offset = array->NodesEnd;
+ array->NodesEnd += size;
+
+ return node_offset;
}
/*
@@ -331,7 +467,7 @@ cmpaffix(const void *s1, const void *s2)
* - 2 characters (FM_LONG). A character may be Unicode.
* - numbers from 1 to 65000 (FM_NUM).
*
- * Depending on the flagMode an affix string can have the following format:
+ * Depending on the flagmode an affix string can have the following format:
* - FM_CHAR: ABCD
* Here we have 4 flags: A, B, C and D
* - FM_LONG: ABCDE*
@@ -339,13 +475,13 @@ cmpaffix(const void *s1, const void *s2)
* - FM_NUM: 200,205,50
* Here we have 3 flags: 200, 205 and 50
*
- * Conf: current dictionary.
+ * flagmode: flag mode of the dictionary.
* sflagset: the set of affix flags. Returns a reference to the start of a next
* affix flag.
* sflag: returns an affix flag from sflagset.
*/
static void
-getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
+getNextFlagFromString(FlagMode flagmode, char **sflagset, char *sflag)
{
int32 s;
char *next,
@@ -354,11 +490,11 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
bool stop = false;
bool met_comma = false;
- maxstep = (Conf->flagMode == FM_LONG) ? 2 : 1;
+ maxstep = (flagmode == FM_LONG) ? 2 : 1;
while (**sflagset)
{
- switch (Conf->flagMode)
+ switch (flagmode)
{
case FM_LONG:
case FM_CHAR:
@@ -420,15 +556,15 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
stop = true;
break;
default:
- elog(ERROR, "unrecognized type of Conf->flagMode: %d",
- Conf->flagMode);
+ elog(ERROR, "unrecognized type of flagmode: %d",
+ flagmode);
}
if (stop)
break;
}
- if (Conf->flagMode == FM_LONG && maxstep > 0)
+ if (flagmode == FM_LONG && maxstep > 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix flag \"%s\" with \"long\" flag value",
@@ -438,31 +574,28 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
}
/*
- * Checks if the affix set Conf->AffixData[affix] contains affixflag.
- * Conf->AffixData[affix] does not contain affixflag if this flag is not used
- * actually by the .dict file.
+ * Checks if the affix set from AffixData contains affixflag. The affix set
+ * does not contain affixflag if the .dict file does not actually use it.
*
- * Conf: current dictionary.
- * affix: index of the Conf->AffixData array.
+ * flagmode: flag mode of the dictionary.
+ * sflagset: the set of affix flags.
* affixflag: the affix flag.
*
- * Returns true if the string Conf->AffixData[affix] contains affixflag,
- * otherwise returns false.
+ * Returns true if the affix set string contains affixflag, otherwise returns
+ * false.
*/
static bool
-IsAffixFlagInUse(IspellDict *Conf, int affix, const char *affixflag)
+IsAffixFlagInUse(FlagMode flagmode, char *sflagset, const char *affixflag)
{
- char *flagcur;
+ char *flagcur = sflagset;
char flag[BUFSIZ];
if (*affixflag == 0)
return true;
- flagcur = Conf->AffixData[affix];
-
while (*flagcur)
{
- getNextFlagFromString(Conf, &flagcur, flag);
+ getNextFlagFromString(flagmode, &flagcur, flag);
/* Compare first affix flag in flagcur with affixflag */
if (strcmp(flag, affixflag) == 0)
return true;
@@ -475,31 +608,33 @@ IsAffixFlagInUse(IspellDict *Conf, int affix, const char *affixflag)
/*
* Adds the new word into the temporary array Spell.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* word: new word.
* flag: set of affix flags. Single flag can be get by getNextFlagFromString().
*/
static void
-NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
+NIAddSpell(IspellDictBuild *ConfBuild, const char *word, const char *flag)
{
- if (Conf->nspell >= Conf->mspell)
+ if (ConfBuild->nSpell >= ConfBuild->mSpell)
{
- if (Conf->mspell)
+ if (ConfBuild->mSpell)
{
- Conf->mspell *= 2;
- Conf->Spell = (SPELL **) repalloc(Conf->Spell, Conf->mspell * sizeof(SPELL *));
+ ConfBuild->mSpell *= 2;
+ ConfBuild->Spell = (SPELL **) repalloc(ConfBuild->Spell,
+ ConfBuild->mSpell * sizeof(SPELL *));
}
else
{
- Conf->mspell = 1024 * 20;
- Conf->Spell = (SPELL **) tmpalloc(Conf->mspell * sizeof(SPELL *));
+ ConfBuild->mSpell = 1024 * 20;
+ ConfBuild->Spell = (SPELL **) tmpalloc(ConfBuild->mSpell * sizeof(SPELL *));
}
}
- Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
- strcpy(Conf->Spell[Conf->nspell]->word, word);
- Conf->Spell[Conf->nspell]->p.flag = (*flag != '\0')
- ? MemoryContextStrdup(Conf->buildCxt, flag) : VoidString;
- Conf->nspell++;
+ ConfBuild->Spell[ConfBuild->nSpell] =
+ (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
+ strcpy(ConfBuild->Spell[ConfBuild->nSpell]->word, word);
+ ConfBuild->Spell[ConfBuild->nSpell]->p.flag = (*flag != '\0')
+ ? MemoryContextStrdup(ConfBuild->buildCxt, flag) : VoidString;
+ ConfBuild->nSpell++;
}
/*
@@ -507,11 +642,11 @@ NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
*
* Note caller must already have applied get_tsearch_config_filename.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* filename: path to the .dict file.
*/
void
-NIImportDictionary(IspellDict *Conf, const char *filename)
+NIImportDictionary(IspellDictBuild *ConfBuild, const char *filename)
{
tsearch_readline_state trst;
char *line;
@@ -562,9 +697,9 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
}
s += pg_mblen(s);
}
- pstr = lowerstr_ctx(Conf, line);
+ pstr = lowerstr_ctx(ConfBuild, line);
- NIAddSpell(Conf, pstr, flag);
+ NIAddSpell(ConfBuild, pstr, flag);
pfree(pstr);
pfree(line);
@@ -596,9 +731,9 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
* Returns 1 if the word was found in the prefix tree, else returns 0.
*/
static int
-FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
+FindWord(IspellDictData *Conf, const char *word, const char *affixflag, int flag)
{
- SPNode *node = Conf->Dictionary;
+ SPNode *node = (SPNode *) DictDictNodes(Conf);
SPNodeData *StopLow,
*StopHigh,
*StopMiddle;
@@ -634,10 +769,14 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
* Check if this affix rule is presented in the affix set
* with index StopMiddle->affix.
*/
- if (IsAffixFlagInUse(Conf, StopMiddle->affix, affixflag))
+ if (IsAffixFlagInUse(Conf->flagMode,
+ DictAffixDataGet(Conf, StopMiddle->affix),
+ affixflag))
return 1;
}
- node = StopMiddle->node;
+ /* Retrieve SPNode by the offset */
+ node = (SPNode *) DictNodeGet(DictDictNodes(Conf),
+ StopMiddle->node_offset);
ptr++;
break;
}
@@ -655,7 +794,8 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
/*
* Adds a new affix rule to the Affix field.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary; it is used to
+ * allocate temporary data.
* flag: affix flag ('\' in the below example).
* flagflags: set of flags from the flagval field for this affix rule. This set
* is listed after '/' character in the added string (repl).
@@ -671,26 +811,49 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
* type: FF_SUFFIX or FF_PREFIX.
*/
static void
-NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
- const char *find, const char *repl, int type)
+NIAddAffix(IspellDictBuild *ConfBuild, const char *flag, char flagflags,
+ const char *mask, const char *find, const char *repl, int type)
{
AFFIX *Affix;
+ uint32 size;
+ uint32 flaglen = strlen(flag),
+ findlen = strlen(find),
+ repllen = strlen(repl);
+
+ /* Sanity checks */
+ if (flaglen > AF_FLAG_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix flag \"%s\" too long", flag)));
+ if (findlen > AF_FIND_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix find field \"%s\" too long", find)));
+ if (repllen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix repl field \"%s\" too long", repl)));
- if (Conf->naffixes >= Conf->maffixes)
+ if (ConfBuild->nAffix >= ConfBuild->mAffix)
{
- if (Conf->maffixes)
+ if (ConfBuild->mAffix)
{
- Conf->maffixes *= 2;
- Conf->Affix = (AFFIX *) repalloc((void *) Conf->Affix, Conf->maffixes * sizeof(AFFIX));
+ ConfBuild->mAffix *= 2;
+ ConfBuild->Affix = (AFFIX **) repalloc(ConfBuild->Affix,
+ ConfBuild->mAffix * sizeof(AFFIX *));
}
else
{
- Conf->maffixes = 16;
- Conf->Affix = (AFFIX *) palloc(Conf->maffixes * sizeof(AFFIX));
+ ConfBuild->mAffix = 255;
+ ConfBuild->Affix = (AFFIX **) tmpalloc(ConfBuild->mAffix * sizeof(AFFIX *));
}
}
- Affix = Conf->Affix + Conf->naffixes;
+ size = AFFIXHDRSZ + flaglen + 1 /* \0 */ + findlen + 1 /* \0 */ +
+ repllen + 1 /* \0 */;
+
+ Affix = (AFFIX *) tmpalloc(size);
+ ConfBuild->Affix[ConfBuild->nAffix] = Affix;
/* This affix rule can be applied for words with any ending */
if (strcmp(mask, ".") == 0 || *mask == '\0')
@@ -703,42 +866,14 @@ NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
{
Affix->issimple = 0;
Affix->isregis = 1;
- RS_compile(&(Affix->reg.regis), (type == FF_SUFFIX),
- *mask ? mask : VoidString);
+ /* TODO Compile regular expressions */
}
/* This affix rule will use regex_t to search word ending */
else
{
- int masklen;
- int wmasklen;
- int err;
- pg_wchar *wmask;
- char *tmask;
-
Affix->issimple = 0;
Affix->isregis = 0;
- tmask = (char *) tmpalloc(strlen(mask) + 3);
- if (type == FF_SUFFIX)
- sprintf(tmask, "%s$", mask);
- else
- sprintf(tmask, "^%s", mask);
-
- masklen = strlen(tmask);
- wmask = (pg_wchar *) tmpalloc((masklen + 1) * sizeof(pg_wchar));
- wmasklen = pg_mb2wchar_with_len(tmask, wmask, masklen);
-
- err = pg_regcomp(&(Affix->reg.regex), wmask, wmasklen,
- REG_ADVANCED | REG_NOSUB,
- DEFAULT_COLLATION_OID);
- if (err)
- {
- char errstr[100];
-
- pg_regerror(err, &(Affix->reg.regex), errstr, sizeof(errstr));
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
- errmsg("invalid regular expression: %s", errstr)));
- }
+ /* TODO Compile regular expressions */
}
Affix->flagflags = flagflags;
@@ -747,15 +882,19 @@ NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
if ((Affix->flagflags & FF_COMPOUNDFLAG) == 0)
Affix->flagflags |= FF_COMPOUNDFLAG;
}
- Affix->flag = cpstrdup(Conf, flag);
+
Affix->type = type;
- Affix->find = (find && *find) ? cpstrdup(Conf, find) : VoidString;
- if ((Affix->replen = strlen(repl)) > 0)
- Affix->repl = cpstrdup(Conf, repl);
- else
- Affix->repl = VoidString;
- Conf->naffixes++;
+ Affix->replen = repllen;
+ StrNCpy(AffixFieldRepl(Affix), repl, repllen + 1);
+
+ Affix->findlen = findlen;
+ StrNCpy(AffixFieldFind(Affix), find, findlen + 1);
+
+ StrNCpy(AffixFieldFlag(Affix), flag, flaglen + 1);
+
+ ConfBuild->nAffix++;
+ ConfBuild->AffixSize += size;
}
/* Parsing states for parse_affentry() and friends */
@@ -1019,10 +1158,10 @@ parse_affentry(char *str, char *mask, char *find, char *repl)
* Sets a Hunspell options depending on flag type.
*/
static void
-setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
+setCompoundAffixFlagValue(IspellDictBuild *ConfBuild, CompoundAffixFlag *entry,
char *s, uint32 val)
{
- if (Conf->flagMode == FM_NUM)
+ if (ConfBuild->dict->flagMode == FM_NUM)
{
char *next;
int i;
@@ -1040,21 +1179,21 @@ setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
entry->flag.i = i;
}
else
- entry->flag.s = MemoryContextStrdup(Conf->buildCxt, s);
+ entry->flag.s = MemoryContextStrdup(ConfBuild->buildCxt, s);
- entry->flagMode = Conf->flagMode;
+ entry->flagMode = ConfBuild->dict->flagMode;
entry->value = val;
}
/*
* Sets up a correspondence for the affix parameter with the affix flag.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* s: affix flag in string.
* val: affix parameter.
*/
static void
-addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
+addCompoundAffixFlagValue(IspellDictBuild *ConfBuild, char *s, uint32 val)
{
CompoundAffixFlag *newValue;
char sbuf[BUFSIZ];
@@ -1081,29 +1220,29 @@ addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
*sflag = '\0';
/* Resize array or allocate memory for array CompoundAffixFlag */
- if (Conf->nCompoundAffixFlag >= Conf->mCompoundAffixFlag)
+ if (ConfBuild->nCompoundAffixFlag >= ConfBuild->mCompoundAffixFlag)
{
- if (Conf->mCompoundAffixFlag)
+ if (ConfBuild->mCompoundAffixFlag)
{
- Conf->mCompoundAffixFlag *= 2;
- Conf->CompoundAffixFlags = (CompoundAffixFlag *)
- repalloc((void *) Conf->CompoundAffixFlags,
- Conf->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
+ ConfBuild->mCompoundAffixFlag *= 2;
+ ConfBuild->CompoundAffixFlags = (CompoundAffixFlag *)
+ repalloc((void *) ConfBuild->CompoundAffixFlags,
+ ConfBuild->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
}
else
{
- Conf->mCompoundAffixFlag = 10;
- Conf->CompoundAffixFlags = (CompoundAffixFlag *)
- tmpalloc(Conf->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
+ ConfBuild->mCompoundAffixFlag = 10;
+ ConfBuild->CompoundAffixFlags = (CompoundAffixFlag *)
+ tmpalloc(ConfBuild->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
}
}
- newValue = Conf->CompoundAffixFlags + Conf->nCompoundAffixFlag;
+ newValue = ConfBuild->CompoundAffixFlags + ConfBuild->nCompoundAffixFlag;
- setCompoundAffixFlagValue(Conf, newValue, sbuf, val);
+ setCompoundAffixFlagValue(ConfBuild, newValue, sbuf, val);
- Conf->usecompound = true;
- Conf->nCompoundAffixFlag++;
+ ConfBuild->dict->usecompound = true;
+ ConfBuild->nCompoundAffixFlag++;
}
/*
@@ -1111,7 +1250,7 @@ addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
* flags s.
*/
static int
-getCompoundAffixFlagValue(IspellDict *Conf, char *s)
+getCompoundAffixFlagValue(IspellDictBuild *ConfBuild, char *s)
{
uint32 flag = 0;
CompoundAffixFlag *found,
@@ -1119,18 +1258,18 @@ getCompoundAffixFlagValue(IspellDict *Conf, char *s)
char sflag[BUFSIZ];
char *flagcur;
- if (Conf->nCompoundAffixFlag == 0)
+ if (ConfBuild->nCompoundAffixFlag == 0)
return 0;
flagcur = s;
while (*flagcur)
{
- getNextFlagFromString(Conf, &flagcur, sflag);
- setCompoundAffixFlagValue(Conf, &key, sflag, 0);
+ getNextFlagFromString(ConfBuild->dict->flagMode, &flagcur, sflag);
+ setCompoundAffixFlagValue(ConfBuild, &key, sflag, 0);
found = (CompoundAffixFlag *)
- bsearch(&key, (void *) Conf->CompoundAffixFlags,
- Conf->nCompoundAffixFlag, sizeof(CompoundAffixFlag),
+ bsearch(&key, (void *) ConfBuild->CompoundAffixFlags,
+ ConfBuild->nCompoundAffixFlag, sizeof(CompoundAffixFlag),
cmpcmdflag);
if (found != NULL)
flag |= found->value;
@@ -1142,14 +1281,13 @@ getCompoundAffixFlagValue(IspellDict *Conf, char *s)
/*
* Returns a flag set using the s parameter.
*
- * If Conf->useFlagAliases is true then the s parameter is index of the
- * Conf->AffixData array and function returns its entry.
- * Else function returns the s parameter.
+ * If useFlagAliases is true, s is an index into the AffixData array and the
+ * function returns that entry. Otherwise the function returns s itself.
*/
static char *
-getAffixFlagSet(IspellDict *Conf, char *s)
+getAffixFlagSet(IspellDictBuild *ConfBuild, char *s)
{
- if (Conf->useFlagAliases && *s != '\0')
+ if (ConfBuild->dict->useFlagAliases && *s != '\0')
{
int curaffix;
char *end;
@@ -1160,13 +1298,13 @@ getAffixFlagSet(IspellDict *Conf, char *s)
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix alias \"%s\"", s)));
- if (curaffix > 0 && curaffix <= Conf->nAffixData)
+ if (curaffix > 0 && curaffix <= ConfBuild->nAffixData)
/*
* Do not subtract 1 from curaffix because empty string was added
* in NIImportOOAffixes
*/
- return Conf->AffixData[curaffix];
+ return AffixDataGet(ConfBuild, curaffix);
else
return VoidString;
}
@@ -1177,11 +1315,11 @@ getAffixFlagSet(IspellDict *Conf, char *s)
/*
* Import an affix file that follows MySpell or Hunspell format.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* filename: path to the .affix file.
*/
static void
-NIImportOOAffixes(IspellDict *Conf, const char *filename)
+NIImportOOAffixes(IspellDictBuild *ConfBuild, const char *filename)
{
char type[BUFSIZ],
*ptype = NULL;
@@ -1193,17 +1331,16 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
char repl[BUFSIZ],
*prepl;
bool isSuffix = false;
- int naffix = 0,
- curaffix = 0;
+ int naffix = 0;
int sflaglen = 0;
char flagflags = 0;
tsearch_readline_state trst;
char *recoded;
/* read file to find any flag */
- Conf->usecompound = false;
- Conf->useFlagAliases = false;
- Conf->flagMode = FM_CHAR;
+ ConfBuild->dict->usecompound = false;
+ ConfBuild->dict->useFlagAliases = false;
+ ConfBuild->dict->flagMode = FM_CHAR;
if (!tsearch_readline_begin(&trst, filename))
ereport(ERROR,
@@ -1220,30 +1357,36 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
}
if (STRNCMP(recoded, "COMPOUNDFLAG") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDFLAG"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDFLAG"),
FF_COMPOUNDFLAG);
else if (STRNCMP(recoded, "COMPOUNDBEGIN") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDBEGIN"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDBEGIN"),
FF_COMPOUNDBEGIN);
else if (STRNCMP(recoded, "COMPOUNDLAST") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDLAST"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDLAST"),
FF_COMPOUNDLAST);
/* COMPOUNDLAST and COMPOUNDEND are synonyms */
else if (STRNCMP(recoded, "COMPOUNDEND") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDEND"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDEND"),
FF_COMPOUNDLAST);
else if (STRNCMP(recoded, "COMPOUNDMIDDLE") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDMIDDLE"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDMIDDLE"),
FF_COMPOUNDMIDDLE);
else if (STRNCMP(recoded, "ONLYINCOMPOUND") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("ONLYINCOMPOUND"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("ONLYINCOMPOUND"),
FF_COMPOUNDONLY);
else if (STRNCMP(recoded, "COMPOUNDPERMITFLAG") == 0)
- addCompoundAffixFlagValue(Conf,
+ addCompoundAffixFlagValue(ConfBuild,
recoded + strlen("COMPOUNDPERMITFLAG"),
FF_COMPOUNDPERMITFLAG);
else if (STRNCMP(recoded, "COMPOUNDFORBIDFLAG") == 0)
- addCompoundAffixFlagValue(Conf,
+ addCompoundAffixFlagValue(ConfBuild,
recoded + strlen("COMPOUNDFORBIDFLAG"),
FF_COMPOUNDFORBIDFLAG);
else if (STRNCMP(recoded, "FLAG") == 0)
@@ -1256,9 +1399,9 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
if (*s)
{
if (STRNCMP(s, "long") == 0)
- Conf->flagMode = FM_LONG;
+ ConfBuild->dict->flagMode = FM_LONG;
else if (STRNCMP(s, "num") == 0)
- Conf->flagMode = FM_NUM;
+ ConfBuild->dict->flagMode = FM_NUM;
else if (STRNCMP(s, "default") != 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
@@ -1272,8 +1415,8 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
}
tsearch_readline_end(&trst);
- if (Conf->nCompoundAffixFlag > 1)
- qsort((void *) Conf->CompoundAffixFlags, Conf->nCompoundAffixFlag,
+ if (ConfBuild->nCompoundAffixFlag > 1)
+ qsort((void *) ConfBuild->CompoundAffixFlags, ConfBuild->nCompoundAffixFlag,
sizeof(CompoundAffixFlag), cmpcmdflag);
if (!tsearch_readline_begin(&trst, filename))
@@ -1293,15 +1436,15 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
if (ptype)
pfree(ptype);
- ptype = lowerstr_ctx(Conf, type);
+ ptype = lowerstr_ctx(ConfBuild, type);
/* First try to parse AF parameter (alias compression) */
if (STRNCMP(ptype, "af") == 0)
{
/* First line is the number of aliases */
- if (!Conf->useFlagAliases)
+ if (!ConfBuild->dict->useFlagAliases)
{
- Conf->useFlagAliases = true;
+ ConfBuild->dict->useFlagAliases = true;
naffix = atoi(sflag);
if (naffix == 0)
ereport(ERROR,
@@ -1311,21 +1454,15 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
/* Also reserve place for empty flag set */
naffix++;
- Conf->AffixData = (char **) palloc0(naffix * sizeof(char *));
- Conf->lenAffixData = Conf->nAffixData = naffix;
+ NIInitAffixData(ConfBuild, naffix);
/* Add empty flag set into AffixData */
- Conf->AffixData[curaffix] = VoidString;
- curaffix++;
+ NIAddAffixSet(ConfBuild, VoidString, 0);
}
/* Other lines is aliases */
else
{
- if (curaffix < naffix)
- {
- Conf->AffixData[curaffix] = cpstrdup(Conf, sflag);
- curaffix++;
- }
+ NIAddAffixSet(ConfBuild, sflag, strlen(sflag));
}
goto nextline;
}
@@ -1336,8 +1473,8 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
sflaglen = strlen(sflag);
if (sflaglen == 0
- || (sflaglen > 1 && Conf->flagMode == FM_CHAR)
- || (sflaglen > 2 && Conf->flagMode == FM_LONG))
+ || (sflaglen > 1 && ConfBuild->dict->flagMode == FM_CHAR)
+ || (sflaglen > 2 && ConfBuild->dict->flagMode == FM_LONG))
goto nextline;
/*--------
@@ -1365,21 +1502,21 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
/* Get flags after '/' (flags are case sensitive) */
if ((ptr = strchr(repl, '/')) != NULL)
- aflg |= getCompoundAffixFlagValue(Conf,
- getAffixFlagSet(Conf,
+ aflg |= getCompoundAffixFlagValue(ConfBuild,
+ getAffixFlagSet(ConfBuild,
ptr + 1));
/* Get lowercased version of string before '/' */
- prepl = lowerstr_ctx(Conf, repl);
+ prepl = lowerstr_ctx(ConfBuild, repl);
if ((ptr = strchr(prepl, '/')) != NULL)
*ptr = '\0';
- pfind = lowerstr_ctx(Conf, find);
- pmask = lowerstr_ctx(Conf, mask);
+ pfind = lowerstr_ctx(ConfBuild, find);
+ pmask = lowerstr_ctx(ConfBuild, mask);
if (t_iseq(find, '0'))
*pfind = '\0';
if (t_iseq(repl, '0'))
*prepl = '\0';
- NIAddAffix(Conf, sflag, flagflags | aflg, pmask, pfind, prepl,
+ NIAddAffix(ConfBuild, sflag, flagflags | aflg, pmask, pfind, prepl,
isSuffix ? FF_SUFFIX : FF_PREFIX);
pfree(prepl);
pfree(pfind);
@@ -1405,7 +1542,7 @@ nextline:
* work to NIImportOOAffixes(), which will re-read the whole file.
*/
void
-NIImportAffixes(IspellDict *Conf, const char *filename)
+NIImportAffixes(IspellDictBuild *ConfBuild, const char *filename)
{
char *pstr = NULL;
char flag[BUFSIZ];
@@ -1426,9 +1563,9 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
errmsg("could not open affix file \"%s\": %m",
filename)));
- Conf->usecompound = false;
- Conf->useFlagAliases = false;
- Conf->flagMode = FM_CHAR;
+ ConfBuild->dict->usecompound = false;
+ ConfBuild->dict->useFlagAliases = false;
+ ConfBuild->dict->flagMode = FM_CHAR;
while ((recoded = tsearch_readline(&trst)) != NULL)
{
@@ -1450,10 +1587,8 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
s += pg_mblen(s);
if (*s && pg_mblen(s) == 1)
- {
- addCompoundAffixFlagValue(Conf, s, FF_COMPOUNDFLAG);
- Conf->usecompound = true;
- }
+ addCompoundAffixFlagValue(ConfBuild, s, FF_COMPOUNDFLAG);
+
oldformat = true;
goto nextline;
}
@@ -1526,7 +1661,8 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
if (!parse_affentry(pstr, mask, find, repl))
goto nextline;
- NIAddAffix(Conf, flag, flagflags, mask, find, repl, suffixes ? FF_SUFFIX : FF_PREFIX);
+ NIAddAffix(ConfBuild, flag, flagflags, mask, find, repl,
+ suffixes ? FF_SUFFIX : FF_PREFIX);
nextline:
pfree(recoded);
@@ -1545,53 +1681,48 @@ isnewformat:
errmsg("affix file contains both old-style and new-style commands")));
tsearch_readline_end(&trst);
- NIImportOOAffixes(Conf, filename);
+ NIImportOOAffixes(ConfBuild, filename);
}
/*
* Merges two affix flag sets and stores a new affix flag set into
- * Conf->AffixData.
+ * ConfBuild->AffixData.
*
* Returns index of a new affix flag set.
*/
static int
-MergeAffix(IspellDict *Conf, int a1, int a2)
+MergeAffix(IspellDictBuild *ConfBuild, int a1, int a2)
{
- char **ptr;
+ char *ptr;
+ uint32 len;
/* Do not merge affix flags if one of affix flags is empty */
- if (*Conf->AffixData[a1] == '\0')
+ if (*AffixDataGet(ConfBuild, a1) == '\0')
return a2;
- else if (*Conf->AffixData[a2] == '\0')
+ else if (*AffixDataGet(ConfBuild, a2) == '\0')
return a1;
- while (Conf->nAffixData + 1 >= Conf->lenAffixData)
+ if (ConfBuild->dict->flagMode == FM_NUM)
{
- Conf->lenAffixData *= 2;
- Conf->AffixData = (char **) repalloc(Conf->AffixData,
- sizeof(char *) * Conf->lenAffixData);
- }
-
- ptr = Conf->AffixData + Conf->nAffixData;
- if (Conf->flagMode == FM_NUM)
- {
- *ptr = cpalloc(strlen(Conf->AffixData[a1]) +
- strlen(Conf->AffixData[a2]) +
- 1 /* comma */ + 1 /* \0 */ );
- sprintf(*ptr, "%s,%s", Conf->AffixData[a1], Conf->AffixData[a2]);
+ len = strlen(AffixDataGet(ConfBuild, a1)) + 1 /* comma */ +
+ strlen(AffixDataGet(ConfBuild, a2));
+ ptr = tmpalloc(len + 1 /* \0 */);
+ sprintf(ptr, "%s,%s", AffixDataGet(ConfBuild, a1),
+ AffixDataGet(ConfBuild, a2));
}
else
{
- *ptr = cpalloc(strlen(Conf->AffixData[a1]) +
- strlen(Conf->AffixData[a2]) +
- 1 /* \0 */ );
- sprintf(*ptr, "%s%s", Conf->AffixData[a1], Conf->AffixData[a2]);
+ len = strlen(AffixDataGet(ConfBuild, a1)) +
+ strlen(AffixDataGet(ConfBuild, a2));
+ ptr = tmpalloc(len + 1 /* \0 */ );
+ sprintf(ptr, "%s%s", AffixDataGet(ConfBuild, a1),
+ AffixDataGet(ConfBuild, a2));
}
- ptr++;
- *ptr = NULL;
- Conf->nAffixData++;
- return Conf->nAffixData - 1;
+ NIAddAffixSet(ConfBuild, ptr, len);
+ pfree(ptr);
+
+ return ConfBuild->nAffixData - 1;
}
/*
@@ -1599,66 +1730,72 @@ MergeAffix(IspellDict *Conf, int a1, int a2)
* flags with the given index.
*/
static uint32
-makeCompoundFlags(IspellDict *Conf, int affix)
+makeCompoundFlags(IspellDictBuild *ConfBuild, int affix)
{
- char *str = Conf->AffixData[affix];
+ char *str = AffixDataGet(ConfBuild, affix);
- return (getCompoundAffixFlagValue(Conf, str) & FF_COMPOUNDFLAGMASK);
+ return (getCompoundAffixFlagValue(ConfBuild, str) & FF_COMPOUNDFLAGMASK);
}
/*
* Makes a prefix tree for the given level.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* low: lower index of the Conf->Spell array.
* high: upper index of the Conf->Spell array.
* level: current prefix tree level.
+ *
+ * Returns the offset of the new SPNode in DictNodes.
*/
-static SPNode *
-mkSPNode(IspellDict *Conf, int low, int high, int level)
+static uint32
+mkSPNode(IspellDictBuild *ConfBuild, int low, int high, int level)
{
int i;
int nchar = 0;
char lastchar = '\0';
+ uint32 rs_offset;
SPNode *rs;
SPNodeData *data;
int lownew = low;
for (i = low; i < high; i++)
- if (Conf->Spell[i]->p.d.len > level && lastchar != Conf->Spell[i]->word[level])
+ if (ConfBuild->Spell[i]->p.d.len > level &&
+ lastchar != ConfBuild->Spell[i]->word[level])
{
nchar++;
- lastchar = Conf->Spell[i]->word[level];
+ lastchar = ConfBuild->Spell[i]->word[level];
}
if (!nchar)
- return NULL;
+ return ISPELL_INVALID_OFFSET;
- rs = (SPNode *) cpalloc0(SPNHDRSZ + nchar * sizeof(SPNodeData));
+ rs_offset = NIAllocateNode(ConfBuild, &ConfBuild->DictNodes, nchar,
+ sizeof(SPNodeData), SPNHDRSZ);
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
rs->length = nchar;
data = rs->data;
lastchar = '\0';
for (i = low; i < high; i++)
- if (Conf->Spell[i]->p.d.len > level)
+ if (ConfBuild->Spell[i]->p.d.len > level)
{
- if (lastchar != Conf->Spell[i]->word[level])
+ if (lastchar != ConfBuild->Spell[i]->word[level])
{
if (lastchar)
{
/* Next level of the prefix tree */
- data->node = mkSPNode(Conf, lownew, i, level + 1);
+ data->node_offset = mkSPNode(ConfBuild, lownew, i, level + 1);
lownew = i;
data++;
}
- lastchar = Conf->Spell[i]->word[level];
+ lastchar = ConfBuild->Spell[i]->word[level];
}
- data->val = ((uint8 *) (Conf->Spell[i]->word))[level];
- if (Conf->Spell[i]->p.d.len == level + 1)
+ data->val = ((uint8 *) (ConfBuild->Spell[i]->word))[level];
+ if (ConfBuild->Spell[i]->p.d.len == level + 1)
{
bool clearCompoundOnly = false;
- if (data->isword && data->affix != Conf->Spell[i]->p.d.affix)
+ if (data->isword && data->affix != ConfBuild->Spell[i]->p.d.affix)
{
/*
* MergeAffix called a few times. If one of word is
@@ -1667,15 +1804,17 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
*/
clearCompoundOnly = (FF_COMPOUNDONLY & data->compoundflag
- & makeCompoundFlags(Conf, Conf->Spell[i]->p.d.affix))
+ & makeCompoundFlags(ConfBuild,
+ ConfBuild->Spell[i]->p.d.affix))
? false : true;
- data->affix = MergeAffix(Conf, data->affix, Conf->Spell[i]->p.d.affix);
+ data->affix = MergeAffix(ConfBuild, data->affix,
+ ConfBuild->Spell[i]->p.d.affix);
}
else
- data->affix = Conf->Spell[i]->p.d.affix;
+ data->affix = ConfBuild->Spell[i]->p.d.affix;
data->isword = 1;
- data->compoundflag = makeCompoundFlags(Conf, data->affix);
+ data->compoundflag = makeCompoundFlags(ConfBuild, data->affix);
if ((data->compoundflag & FF_COMPOUNDONLY) &&
(data->compoundflag & FF_COMPOUNDFLAG) == 0)
@@ -1687,9 +1826,9 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
}
/* Next level of the prefix tree */
- data->node = mkSPNode(Conf, lownew, high, level + 1);
+ data->node_offset = mkSPNode(ConfBuild, lownew, high, level + 1);
- return rs;
+ return rs_offset;
}
/*
@@ -1697,7 +1836,7 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
* and affixes.
*/
void
-NISortDictionary(IspellDict *Conf)
+NISortDictionary(IspellDictBuild *ConfBuild)
{
int i;
int naffix = 0;
@@ -1709,78 +1848,78 @@ NISortDictionary(IspellDict *Conf)
* If we use flag aliases then we need to use Conf->AffixData filled in
* the NIImportOOAffixes().
*/
- if (Conf->useFlagAliases)
+ if (ConfBuild->dict->useFlagAliases)
{
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
char *end;
- if (*Conf->Spell[i]->p.flag != '\0')
+ if (*ConfBuild->Spell[i]->p.flag != '\0')
{
- curaffix = strtol(Conf->Spell[i]->p.flag, &end, 10);
- if (Conf->Spell[i]->p.flag == end || errno == ERANGE)
+ curaffix = strtol(ConfBuild->Spell[i]->p.flag, &end, 10);
+ if (ConfBuild->Spell[i]->p.flag == end || errno == ERANGE)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix alias \"%s\"",
- Conf->Spell[i]->p.flag)));
+ ConfBuild->Spell[i]->p.flag)));
}
else
{
/*
- * If Conf->Spell[i]->p.flag is empty, then get empty value of
+ * If ConfBuild->Spell[i]->p.flag is empty, then get empty value of
* Conf->AffixData (0 index).
*/
curaffix = 0;
}
- Conf->Spell[i]->p.d.affix = curaffix;
- Conf->Spell[i]->p.d.len = strlen(Conf->Spell[i]->word);
+ ConfBuild->Spell[i]->p.d.affix = curaffix;
+ ConfBuild->Spell[i]->p.d.len = strlen(ConfBuild->Spell[i]->word);
}
}
/* Otherwise fill Conf->AffixData here */
else
{
/* Count the number of different flags used in the dictionary */
- qsort((void *) Conf->Spell, Conf->nspell, sizeof(SPELL *),
+ qsort((void *) ConfBuild->Spell, ConfBuild->nSpell, sizeof(SPELL *),
cmpspellaffix);
naffix = 0;
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
if (i == 0
- || strcmp(Conf->Spell[i]->p.flag, Conf->Spell[i - 1]->p.flag))
+ || strcmp(ConfBuild->Spell[i]->p.flag,
+ ConfBuild->Spell[i - 1]->p.flag))
naffix++;
}
/*
- * Fill in Conf->AffixData with the affixes that were used in the
- * dictionary. Replace textual flag-field of Conf->Spell entries with
- * indexes into Conf->AffixData array.
+ * Fill in AffixData with the affix sets that were used in the
+ * dictionary. Replace the textual flag field of ConfBuild->Spell entries
+ * with indexes into the ConfBuild->AffixData array.
*/
- Conf->AffixData = (char **) palloc0(naffix * sizeof(char *));
+ NIInitAffixData(ConfBuild, naffix);
curaffix = -1;
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
if (i == 0
- || strcmp(Conf->Spell[i]->p.flag, Conf->AffixData[curaffix]))
+ || strcmp(ConfBuild->Spell[i]->p.flag,
+ AffixDataGet(ConfBuild, curaffix)))
{
curaffix++;
Assert(curaffix < naffix);
- Conf->AffixData[curaffix] = cpstrdup(Conf,
- Conf->Spell[i]->p.flag);
+ NIAddAffixSet(ConfBuild, ConfBuild->Spell[i]->p.flag,
+ strlen(ConfBuild->Spell[i]->p.flag));
}
- Conf->Spell[i]->p.d.affix = curaffix;
- Conf->Spell[i]->p.d.len = strlen(Conf->Spell[i]->word);
+ ConfBuild->Spell[i]->p.d.affix = curaffix;
+ ConfBuild->Spell[i]->p.d.len = strlen(ConfBuild->Spell[i]->word);
}
-
- Conf->lenAffixData = Conf->nAffixData = naffix;
}
/* Start build a prefix tree */
- qsort((void *) Conf->Spell, Conf->nspell, sizeof(SPELL *), cmpspell);
- Conf->Dictionary = mkSPNode(Conf, 0, Conf->nspell, 0);
+ qsort((void *) ConfBuild->Spell, ConfBuild->nSpell, sizeof(SPELL *), cmpspell);
+ mkSPNode(ConfBuild, 0, ConfBuild->nSpell, 0);
}
/*
@@ -1788,83 +1927,88 @@ NISortDictionary(IspellDict *Conf)
* rule. Affixes with empty replace string do not include in the prefix tree.
* This affixes are included by mkVoidAffix().
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* low: lower index of the Conf->Affix array.
* high: upper index of the Conf->Affix array.
* level: current prefix tree level.
* type: FF_SUFFIX or FF_PREFIX.
+ *
+ * Returns an offset in nodes array.
*/
-static AffixNode *
-mkANode(IspellDict *Conf, int low, int high, int level, int type)
+static uint32
+mkANode(IspellDictBuild *ConfBuild, int low, int high, int level, int type)
{
int i;
int nchar = 0;
uint8 lastchar = '\0';
+ NodeArray *array;
+ uint32 rs_offset;
AffixNode *rs;
AffixNodeData *data;
int lownew = low;
- int naff;
- AFFIX **aff;
for (i = low; i < high; i++)
- if (Conf->Affix[i].replen > level && lastchar != GETCHAR(Conf->Affix + i, level, type))
+ if (ConfBuild->Affix[i]->replen > level &&
+ lastchar != GETCHAR(ConfBuild->Affix[i], level, type))
{
nchar++;
- lastchar = GETCHAR(Conf->Affix + i, level, type);
+ lastchar = GETCHAR(ConfBuild->Affix[i], level, type);
}
if (!nchar)
- return NULL;
+ return ISPELL_INVALID_OFFSET;
- aff = (AFFIX **) tmpalloc(sizeof(AFFIX *) * (high - low + 1));
- naff = 0;
+ if (type == FF_SUFFIX)
+ array = &ConfBuild->SuffixNodes;
+ else
+ array = &ConfBuild->PrefixNodes;
- rs = (AffixNode *) cpalloc0(ANHRDSZ + nchar * sizeof(AffixNodeData));
+ rs_offset = NIAllocateNode(ConfBuild, array, nchar, sizeof(AffixNodeData),
+ ANHRDSZ);
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
rs->length = nchar;
- data = rs->data;
+ rs->isvoid = 0;
+ data = (AffixNodeData *) rs->data;
+ data->affstart = ISPELL_INVALID_INDEX;
+ data->affend = ISPELL_INVALID_INDEX;
lastchar = '\0';
for (i = low; i < high; i++)
- if (Conf->Affix[i].replen > level)
+ if (ConfBuild->Affix[i]->replen > level)
{
- if (lastchar != GETCHAR(Conf->Affix + i, level, type))
+ if (lastchar != GETCHAR(ConfBuild->Affix[i], level, type))
{
if (lastchar)
{
/* Next level of the prefix tree */
- data->node = mkANode(Conf, lownew, i, level + 1, type);
- if (naff)
- {
- data->naff = naff;
- data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * naff);
- memcpy(data->aff, aff, sizeof(AFFIX *) * naff);
- naff = 0;
- }
+ data->node_offset = mkANode(ConfBuild, lownew, i,
+ level + 1, type);
+
+ /* Handle next data node */
data++;
+ data->affstart = ISPELL_INVALID_INDEX;
+ data->affend = ISPELL_INVALID_INDEX;
lownew = i;
}
- lastchar = GETCHAR(Conf->Affix + i, level, type);
+ lastchar = GETCHAR(ConfBuild->Affix[i], level, type);
}
- data->val = GETCHAR(Conf->Affix + i, level, type);
- if (Conf->Affix[i].replen == level + 1)
+ data->val = GETCHAR(ConfBuild->Affix[i], level, type);
+ if (ConfBuild->Affix[i]->replen == level + 1)
{ /* affix stopped */
- aff[naff++] = Conf->Affix + i;
+ if (data->affstart == ISPELL_INVALID_INDEX)
+ {
+ data->affstart = i;
+ data->affend = i;
+ }
+ else
+ data->affend = i;
}
}
/* Next level of the prefix tree */
- data->node = mkANode(Conf, lownew, high, level + 1, type);
- if (naff)
- {
- data->naff = naff;
- data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * naff);
- memcpy(data->aff, aff, sizeof(AFFIX *) * naff);
- naff = 0;
- }
+ data->node_offset = mkANode(ConfBuild, lownew, high, level + 1, type);
- pfree(aff);
-
- return rs;
+ return rs_offset;
}
/*
@@ -1872,137 +2016,153 @@ mkANode(IspellDict *Conf, int low, int high, int level, int type)
* for affixes which have empty replace string ("repl" field).
*/
static void
-mkVoidAffix(IspellDict *Conf, bool issuffix, int startsuffix)
+mkVoidAffix(IspellDictBuild *ConfBuild, bool issuffix, int startsuffix)
{
- int i,
- cnt = 0;
+ int i;
int start = (issuffix) ? startsuffix : 0;
- int end = (issuffix) ? Conf->naffixes : startsuffix;
- AffixNode *Affix = (AffixNode *) palloc0(ANHRDSZ + sizeof(AffixNodeData));
-
- Affix->length = 1;
- Affix->isvoid = 1;
+ int end = (issuffix) ? ConfBuild->nAffix : startsuffix;
+ uint32 node_offset;
+ NodeArray *array;
+ AffixNode *Affix;
+ AffixNodeData *AffixData;
if (issuffix)
- {
- Affix->data->node = Conf->Suffix;
- Conf->Suffix = Affix;
- }
+ array = &ConfBuild->SuffixNodes;
else
- {
- Affix->data->node = Conf->Prefix;
- Conf->Prefix = Affix;
- }
-
- /* Count affixes with empty replace string */
- for (i = start; i < end; i++)
- if (Conf->Affix[i].replen == 0)
- cnt++;
+ array = &ConfBuild->PrefixNodes;
- /* There is not affixes with empty replace string */
- if (cnt == 0)
- return;
+ node_offset = NIAllocateNode(ConfBuild, array, 1,
+ sizeof(AffixNodeData), ANHRDSZ);
+ Affix = (AffixNode *) NodeArrayGet(array, node_offset);
- Affix->data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * cnt);
- Affix->data->naff = (uint32) cnt;
+ Affix->length = 1;
+ Affix->isvoid = 1;
+ AffixData = (AffixNodeData *) Affix->data;
+ AffixData->affstart = ISPELL_INVALID_INDEX;
+ AffixData->affend = ISPELL_INVALID_INDEX;
- cnt = 0;
for (i = start; i < end; i++)
- if (Conf->Affix[i].replen == 0)
+ if (ConfBuild->Affix[i]->replen == 0)
{
- Affix->data->aff[cnt] = Conf->Affix + i;
- cnt++;
+ if (AffixData->affstart == ISPELL_INVALID_INDEX)
+ {
+ AffixData->affstart = i;
+ AffixData->affend = i;
+ }
+ else
+ AffixData->affend = i;
}
}
/*
- * Checks if the affixflag is used by dictionary. Conf->AffixData does not
+ * Checks if the affixflag is used by dictionary. AffixData does not
* contain affixflag if this flag is not used actually by the .dict file.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* affixflag: affix flag.
*
* Returns true if the Conf->AffixData array contains affixflag, otherwise
* returns false.
*/
static bool
-isAffixInUse(IspellDict *Conf, char *affixflag)
+isAffixInUse(IspellDictBuild *ConfBuild, char *affixflag)
{
int i;
- for (i = 0; i < Conf->nAffixData; i++)
- if (IsAffixFlagInUse(Conf, i, affixflag))
+ for (i = 0; i < ConfBuild->nAffixData; i++)
+ if (IsAffixFlagInUse(ConfBuild->dict->flagMode,
+ AffixDataGet(ConfBuild, i), affixflag))
return true;
return false;
}
/*
- * Builds Conf->Prefix and Conf->Suffix trees from the imported affixes.
+ * Builds Prefix and Suffix trees from the imported affixes.
*/
void
-NISortAffixes(IspellDict *Conf)
+NISortAffixes(IspellDictBuild *ConfBuild)
{
AFFIX *Affix;
+ AffixNode *voidPrefix,
+ *voidSuffix;
size_t i;
CMPDAffix *ptr;
- int firstsuffix = Conf->naffixes;
+ int firstsuffix = ConfBuild->nAffix;
- if (Conf->naffixes == 0)
+ if (ConfBuild->nAffix == 0)
return;
/* Store compound affixes in the Conf->CompoundAffix array */
- if (Conf->naffixes > 1)
- qsort((void *) Conf->Affix, Conf->naffixes, sizeof(AFFIX), cmpaffix);
- Conf->CompoundAffix = ptr = (CMPDAffix *) palloc(sizeof(CMPDAffix) * Conf->naffixes);
- ptr->affix = NULL;
-
- for (i = 0; i < Conf->naffixes; i++)
+ if (ConfBuild->nAffix > 1)
+ qsort((void *) ConfBuild->Affix, ConfBuild->nAffix,
+ sizeof(AFFIX *), cmpaffix);
+ ConfBuild->nCompoundAffix = ConfBuild->nAffix;
+ ConfBuild->CompoundAffix = ptr =
+ (CMPDAffix *) tmpalloc(sizeof(CMPDAffix) * ConfBuild->nCompoundAffix);
+ ptr->affix = ISPELL_INVALID_INDEX;
+
+ for (i = 0; i < ConfBuild->nAffix; i++)
{
- Affix = &(((AFFIX *) Conf->Affix)[i]);
+ Affix = ConfBuild->Affix[i];
if (Affix->type == FF_SUFFIX && i < firstsuffix)
firstsuffix = i;
if ((Affix->flagflags & FF_COMPOUNDFLAG) && Affix->replen > 0 &&
- isAffixInUse(Conf, Affix->flag))
+ isAffixInUse(ConfBuild, AffixFieldFlag(Affix)))
{
- if (ptr == Conf->CompoundAffix ||
+ if (ptr == ConfBuild->CompoundAffix ||
ptr->issuffix != (ptr - 1)->issuffix ||
- strbncmp((const unsigned char *) (ptr - 1)->affix,
- (const unsigned char *) Affix->repl,
+ strbncmp((const unsigned char *) AffixFieldRepl(ConfBuild->Affix[(ptr - 1)->affix]),
+ (const unsigned char *) AffixFieldRepl(Affix),
(ptr - 1)->len))
{
/* leave only unique and minimals suffixes */
- ptr->affix = Affix->repl;
+ ptr->affix = i;
ptr->len = Affix->replen;
ptr->issuffix = (Affix->type == FF_SUFFIX);
ptr++;
}
}
}
- ptr->affix = NULL;
- Conf->CompoundAffix = (CMPDAffix *) repalloc(Conf->CompoundAffix, sizeof(CMPDAffix) * (ptr - Conf->CompoundAffix + 1));
+ ptr->affix = ISPELL_INVALID_INDEX;
+ ConfBuild->nCompoundAffix = ptr - ConfBuild->CompoundAffix + 1;
+ ConfBuild->CompoundAffix = (CMPDAffix *) repalloc(ConfBuild->CompoundAffix,
+ sizeof(CMPDAffix) * (ConfBuild->nCompoundAffix));
/* Start build a prefix tree */
- Conf->Prefix = mkANode(Conf, 0, firstsuffix, 0, FF_PREFIX);
- Conf->Suffix = mkANode(Conf, firstsuffix, Conf->naffixes, 0, FF_SUFFIX);
- mkVoidAffix(Conf, true, firstsuffix);
- mkVoidAffix(Conf, false, firstsuffix);
+ mkVoidAffix(ConfBuild, true, firstsuffix);
+ mkVoidAffix(ConfBuild, false, firstsuffix);
+
+ voidPrefix = (AffixNode *) NodeArrayGet(&ConfBuild->PrefixNodes, 0);
+ voidSuffix = (AffixNode *) NodeArrayGet(&ConfBuild->SuffixNodes, 0);
+
+ voidPrefix->data[0].node_offset = mkANode(ConfBuild, 0, firstsuffix, 0,
+ FF_PREFIX);
+ voidSuffix->data[0].node_offset = mkANode(ConfBuild, firstsuffix,
+ ConfBuild->nAffix, 0, FF_SUFFIX);
}
static AffixNodeData *
-FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
+FindAffixes(IspellDictData *Conf, AffixNode *node, const char *word, int wrdlen,
+ int *level, int type)
{
+ AffixNode *node_start;
AffixNodeData *StopLow,
*StopHigh,
*StopMiddle;
uint8 symbol;
+ if (type == FF_PREFIX)
+ node_start = (AffixNode *) DictPrefixNodes(Conf);
+ else
+ node_start = (AffixNode *) DictSuffixNodes(Conf);
+
if (node->isvoid)
{ /* search void affixes */
- if (node->data->naff)
+ if (node->data->affstart != ISPELL_INVALID_INDEX)
return node->data;
- node = node->data->node;
+ node = (AffixNode *) DictNodeGet(node_start, node->data->node_offset);
}
while (node && *level < wrdlen)
@@ -2017,9 +2177,10 @@ FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
if (StopMiddle->val == symbol)
{
(*level)++;
- if (StopMiddle->naff)
+ if (StopMiddle->affstart != ISPELL_INVALID_INDEX)
return StopMiddle;
- node = StopMiddle->node;
+ node = (AffixNode *) DictNodeGet(node_start,
+ StopMiddle->node_offset);
break;
}
else if (StopMiddle->val < symbol)
@@ -2074,7 +2235,7 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
if (Affix->type == FF_SUFFIX)
{
strcpy(newword, word);
- strcpy(newword + len - Affix->replen, Affix->find);
+ strcpy(newword + len - Affix->replen, AffixFieldFind(Affix));
if (baselen) /* store length of non-changed part of word */
*baselen = len - Affix->replen;
}
@@ -2084,9 +2245,9 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
* if prefix is an all non-changed part's length then all word
* contains only prefix and suffix, so out
*/
- if (baselen && *baselen + strlen(Affix->find) <= Affix->replen)
+ if (baselen && *baselen + Affix->findlen <= Affix->replen)
return NULL;
- strcpy(newword, Affix->find);
+ strcpy(newword, AffixFieldFind(Affix));
strcat(newword, word + Affix->replen);
}
@@ -2097,27 +2258,27 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
return newword;
else if (Affix->isregis)
{
- if (RS_execute(&(Affix->reg.regis), newword))
- return newword;
+// if (RS_execute(&(Affix->reg.regis), newword))
+// return newword;
}
else
{
- int err;
- pg_wchar *data;
- size_t data_len;
- int newword_len;
-
- /* Convert data string to wide characters */
- newword_len = strlen(newword);
- data = (pg_wchar *) palloc((newword_len + 1) * sizeof(pg_wchar));
- data_len = pg_mb2wchar_with_len(newword, data, newword_len);
-
- if (!(err = pg_regexec(&(Affix->reg.regex), data, data_len, 0, NULL, 0, NULL, 0)))
- {
- pfree(data);
- return newword;
- }
- pfree(data);
+// int err;
+// pg_wchar *data;
+// size_t data_len;
+// int newword_len;
+
+// /* Convert data string to wide characters */
+// newword_len = strlen(newword);
+// data = (pg_wchar *) palloc((newword_len + 1) * sizeof(pg_wchar));
+// data_len = pg_mb2wchar_with_len(newword, data, newword_len);
+
+// if (!(err = pg_regexec(&(Affix->reg.regex), data, data_len, 0, NULL, 0, NULL, 0)))
+// {
+// pfree(data);
+// return newword;
+// }
+// pfree(data);
}
return NULL;
@@ -2139,7 +2300,7 @@ addToResult(char **forms, char **cur, char *word)
}
static char **
-NormalizeSubWord(IspellDict *Conf, char *word, int flag)
+NormalizeSubWord(IspellDictData *Conf, char *word, int flag)
{
AffixNodeData *suffix = NULL,
*prefix = NULL;
@@ -2151,7 +2312,7 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
char **cur;
char newword[2 * MAXNORMLEN] = "";
char pnewword[2 * MAXNORMLEN] = "";
- AffixNode *snode = Conf->Suffix,
+ AffixNode *snode = (AffixNode *) DictSuffixNodes(Conf),
*pnode;
int i,
j;
@@ -2171,23 +2332,27 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
}
/* Find all other NORMAL forms of the 'word' (check only prefix) */
- pnode = Conf->Prefix;
+ pnode = (AffixNode *) DictPrefixNodes(Conf);
plevel = 0;
while (pnode)
{
- prefix = FindAffixes(pnode, word, wrdlen, &plevel, FF_PREFIX);
+ prefix = FindAffixes(Conf, pnode, word, wrdlen, &plevel, FF_PREFIX);
if (!prefix)
break;
- for (j = 0; j < prefix->naff; j++)
+ for (j = prefix->affstart; j <= prefix->affend; j++)
{
- if (CheckAffix(word, wrdlen, prefix->aff[j], flag, newword, NULL))
+ AFFIX *affix = (AFFIX *) DictAffixGet(Conf, j);
+
+ if (affix &&
+ CheckAffix(word, wrdlen, affix, flag, newword, NULL))
{
/* prefix success */
- if (FindWord(Conf, newword, prefix->aff[j]->flag, flag))
+ if (FindWord(Conf, newword, AffixFieldFlag(affix), flag))
cur += addToResult(forms, cur, newword);
}
}
- pnode = prefix->node;
+ pnode = (AffixNode *) DictNodeGet(DictPrefixNodes(Conf),
+ prefix->node_offset);
}
/*
@@ -2199,45 +2364,55 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
int baselen = 0;
/* find possible suffix */
- suffix = FindAffixes(snode, word, wrdlen, &slevel, FF_SUFFIX);
+ suffix = FindAffixes(Conf, snode, word, wrdlen, &slevel, FF_SUFFIX);
if (!suffix)
break;
/* foreach suffix check affix */
- for (i = 0; i < suffix->naff; i++)
+ for (i = suffix->affstart; i <= suffix->affend; i++)
{
- if (CheckAffix(word, wrdlen, suffix->aff[i], flag, newword, &baselen))
+ AFFIX *sufentry = (AFFIX *) DictAffixGet(Conf, i);
+
+ if (sufentry &&
+ CheckAffix(word, wrdlen, sufentry, flag, newword, &baselen))
{
/* suffix success */
- if (FindWord(Conf, newword, suffix->aff[i]->flag, flag))
+ if (FindWord(Conf, newword, AffixFieldFlag(sufentry), flag))
cur += addToResult(forms, cur, newword);
/* now we will look changed word with prefixes */
- pnode = Conf->Prefix;
+ pnode = (AffixNode *) DictPrefixNodes(Conf);
plevel = 0;
swrdlen = strlen(newword);
while (pnode)
{
- prefix = FindAffixes(pnode, newword, swrdlen, &plevel, FF_PREFIX);
+ prefix = FindAffixes(Conf, pnode, newword, swrdlen, &plevel,
+ FF_PREFIX);
if (!prefix)
break;
- for (j = 0; j < prefix->naff; j++)
+ for (j = prefix->affstart; j <= prefix->affend; j++)
{
- if (CheckAffix(newword, swrdlen, prefix->aff[j], flag, pnewword, &baselen))
+ AFFIX *prefentry = (AFFIX *) DictAffixGet(Conf, j);
+
+ if (prefentry &&
+ CheckAffix(newword, swrdlen, prefentry, flag,
+ pnewword, &baselen))
{
/* prefix success */
- char *ff = (prefix->aff[j]->flagflags & suffix->aff[i]->flagflags & FF_CROSSPRODUCT) ?
- VoidString : prefix->aff[j]->flag;
+ char *ff = (prefentry->flagflags & sufentry->flagflags & FF_CROSSPRODUCT) ?
+ VoidString : AffixFieldFlag(prefentry);
if (FindWord(Conf, pnewword, ff, flag))
cur += addToResult(forms, cur, pnewword);
}
}
- pnode = prefix->node;
+ pnode = (AffixNode *) DictNodeGet(DictPrefixNodes(Conf),
+ prefix->node_offset);
}
}
}
- snode = suffix->node;
+ snode = (AffixNode *) DictNodeGet(DictSuffixNodes(Conf),
+ suffix->node_offset);
}
if (cur == forms)
@@ -2257,7 +2432,8 @@ typedef struct SplitVar
} SplitVar;
static int
-CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
+CheckCompoundAffixes(IspellDictData *Conf, CMPDAffix **ptr,
+ char *word, int len, bool CheckInPlace)
{
bool issuffix;
@@ -2267,9 +2443,12 @@ CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
if (CheckInPlace)
{
- while ((*ptr)->affix)
+ while ((*ptr)->affix != ISPELL_INVALID_INDEX)
{
- if (len > (*ptr)->len && strncmp((*ptr)->affix, word, (*ptr)->len) == 0)
+ AFFIX *affix = (AFFIX *) DictAffixGet(Conf, (*ptr)->affix);
+
+ if (len > (*ptr)->len &&
+ strncmp(AffixFieldRepl(affix), word, (*ptr)->len) == 0)
{
len = (*ptr)->len;
issuffix = (*ptr)->issuffix;
@@ -2283,9 +2462,12 @@ CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
{
char *affbegin;
- while ((*ptr)->affix)
+ while ((*ptr)->affix != ISPELL_INVALID_INDEX)
{
- if (len > (*ptr)->len && (affbegin = strstr(word, (*ptr)->affix)) != NULL)
+ AFFIX *affix = (AFFIX *) DictAffixGet(Conf, (*ptr)->affix);
+
+ if (len > (*ptr)->len &&
+ (affbegin = strstr(word, AffixFieldRepl(affix))) != NULL)
{
len = (*ptr)->len + (affbegin - word);
issuffix = (*ptr)->issuffix;
@@ -2337,13 +2519,14 @@ AddStem(SplitVar *v, char *word)
}
static SplitVar *
-SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int wordlen, int startpos, int minpos)
+SplitToVariants(IspellDictData *Conf, SPNode *snode, SplitVar *orig,
+ char *word, int wordlen, int startpos, int minpos)
{
SplitVar *var = NULL;
SPNodeData *StopLow,
*StopHigh,
*StopMiddle = NULL;
- SPNode *node = (snode) ? snode : Conf->Dictionary;
+ SPNode *node = (snode) ? snode : (SPNode *) DictDictNodes(Conf);
int level = (snode) ? minpos : startpos; /* recursive
* minpos==level */
int lenaff;
@@ -2358,8 +2541,11 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
while (level < wordlen)
{
/* find word with epenthetic or/and compound affix */
- caff = Conf->CompoundAffix;
- while (level > startpos && (lenaff = CheckCompoundAffixes(&caff, word + level, wordlen - level, (node) ? true : false)) >= 0)
+ caff = (CMPDAffix *) DictCompoundAffix(Conf);
+ while (level > startpos &&
+ (lenaff = CheckCompoundAffixes(Conf, &caff,
+ word + level, wordlen - level,
+ (node) ? true : false)) >= 0)
{
/*
* there is one of compound affixes, so check word for existings
@@ -2406,7 +2592,8 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
while (ptr->next)
ptr = ptr->next;
- ptr->next = SplitToVariants(Conf, NULL, new, word, wordlen, startpos + lenaff, startpos + lenaff);
+ ptr->next = SplitToVariants(Conf, NULL, new, word, wordlen,
+ startpos + lenaff, startpos + lenaff);
pfree(new->stem);
pfree(new);
@@ -2465,13 +2652,14 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
/* we can find next word */
level++;
AddStem(var, pnstrdup(word + startpos, level - startpos));
- node = Conf->Dictionary;
+ node = (SPNode *) DictDictNodes(Conf);
startpos = level;
continue;
}
}
}
- node = StopMiddle->node;
+ node = (SPNode *) DictNodeGet(DictDictNodes(Conf),
+ StopMiddle->node_offset);
}
else
node = NULL;
@@ -2500,7 +2688,7 @@ addNorm(TSLexeme **lres, TSLexeme **lcur, char *word, int flags, uint16 NVariant
}
TSLexeme *
-NINormalizeWord(IspellDict *Conf, char *word)
+NINormalizeWord(IspellDictData *Conf, char *word)
{
char **res;
TSLexeme *lcur = NULL,
diff --git a/src/include/tsearch/dicts/spell.h b/src/include/tsearch/dicts/spell.h
index 3032d0b508..b0fc8729d7 100644
--- a/src/include/tsearch/dicts/spell.h
+++ b/src/include/tsearch/dicts/spell.h
@@ -18,6 +18,9 @@
#include "tsearch/dicts/regis.h"
#include "tsearch/ts_public.h"
+#define ISPELL_INVALID_INDEX (0xFFFFF)
+#define ISPELL_INVALID_OFFSET (0xFFFFFFFF)
+
/*
* SPNode and SPNodeData are used to represent prefix tree (Trie) to store
* a words list.
@@ -30,9 +33,10 @@ typedef struct
isword:1,
/* Stores compound flags listed below */
compoundflag:4,
- /* Reference to an entry of the AffixData field */
+ /* Index of an entry of the AffixData field */
affix:19;
- struct SPNode *node;
+ /* Offset to a node of the DictNodes field */
+ uint32 node_offset;
} SPNodeData;
/*
@@ -86,22 +90,40 @@ typedef struct spell_struct
*/
typedef struct aff_struct
{
- char *flag;
/* FF_SUFFIX or FF_PREFIX */
- uint32 type:1,
+ uint16 type:1,
flagflags:7,
issimple:1,
isregis:1,
- replen:14;
- char *find;
- char *repl;
- union
- {
- regex_t regex;
- Regis regis;
- } reg;
+ flaglen:2;
+
+ /* 8 bits could be too much for repl and find, but who knows */
+ uint8 replen;
+ uint8 findlen;
+
+ /*
+ * fields stores the following data (each ends with \0):
+ * - repl
+ * - find
+ * - flag - one character (if FM_CHAR),
+ * two characters (if FM_LONG),
+ * number, >= 0 and < 65536 (if FM_NUM).
+ */
+ char fields[FLEXIBLE_ARRAY_MEMBER];
} AFFIX;
+#define AF_FLAG_MAXSIZE 5 /* strlen(65536) */
+#define AF_REPL_MAXSIZE 255 /* 8 bits */
+#define AF_FIND_MAXSIZE 255 /* 8 bits */
+
+#define AFFIXHDRSZ (offsetof(AFFIX, fields))
+
+#define AffixFieldRepl(af) ((af)->fields)
+#define AffixFieldFind(af) ((af)->fields + (af)->replen + 1)
+#define AffixFieldFlag(af) (AffixFieldFind(af) + (af)->findlen + 1)
+#define AffixGetSize(af) (AFFIXHDRSZ + (af)->replen + 1 + (af)->findlen + 1 \
+ + strlen(AffixFieldFlag(af)) + 1)
+
/*
* affixes use dictionary flags too
*/
@@ -124,12 +146,16 @@ struct AffixNode;
typedef struct
{
- uint32 val:8,
- naff:24;
- AFFIX **aff;
- struct AffixNode *node;
+ uint8 val;
+ uint32 affstart;
+ uint32 affend;
+ /* Offset to a node of the PrefixNodes or SuffixNodes field */
+ uint32 node_offset;
} AffixNodeData;
+#define ANDHDRSZ (offsetof(AffixNodeData, aff))
+#define AffixNodeDataSize(an) (ANDHDRSZ + sizeof(uint32) * (an)->naff)
+
typedef struct AffixNode
{
uint32 isvoid:1,
@@ -139,9 +165,19 @@ typedef struct AffixNode
#define ANHRDSZ (offsetof(AffixNode, data))
+typedef struct NodeArray
+{
+ char *Nodes;
+ uint32 NodesSize; /* allocated size of Nodes */
+ uint32 NodesEnd; /* end of data in Nodes */
+} NodeArray;
+
+#define NodeArrayGet(na, of) (((of) == ISPELL_INVALID_OFFSET) ? NULL : (na)->Nodes + (of))
+
typedef struct
{
- char *affix;
+ /* Index of an affix of the Affix field */
+ uint32 affix;
int len;
bool issuffix;
} CMPDAffix;
@@ -176,30 +212,70 @@ typedef struct CompoundAffixFlag
#define FLAGNUM_MAXSIZE (1 << 16)
-typedef struct
+typedef struct IspellDictData
{
- int maffixes;
- int naffixes;
- AFFIX *Affix;
-
- AffixNode *Suffix;
- AffixNode *Prefix;
+ FlagMode flagMode;
+ bool usecompound;
- SPNode *Dictionary;
- /* Array of sets of affixes */
- char **AffixData;
- int lenAffixData;
- int nAffixData;
bool useFlagAliases;
- CMPDAffix *CompoundAffix;
+ uint32 nAffixData;
+ uint32 AffixDataStart;
- bool usecompound;
- FlagMode flagMode;
+ uint32 AffixOffsetStart;
+ uint32 AffixStart;
+ uint32 nAffix;
+
+ uint32 DictNodesStart;
+ uint32 PrefixNodesStart;
+ uint32 SuffixNodesStart;
+
+ uint32 CompoundAffixStart;
/*
- * All follow fields are actually needed only for initialization
+ * data stores:
+ * - AffixData - array of affix sets
+ * - Affix - sorted array of affixes
+ * - DictNodes - prefix tree of a word list
+ * - PrefixNodes - prefix tree of a prefix list
+ * - SuffixNodes - prefix tree of a suffix list
+ * - CompoundAffix - array of compound affixes
*/
+ char data[FLEXIBLE_ARRAY_MEMBER];
+} IspellDictData;
+
+#define IspellDictDataHdrSize (offsetof(IspellDictData, data))
+
+#define DictAffixDataOffset(d) ((d)->data)
+#define DictAffixData(d) ((d)->data + (d)->AffixDataStart)
+#define DictAffixDataGet(d, i) (((i) == ISPELL_INVALID_INDEX) ? NULL : \
+ DictAffixData(d) + ((uint32 *) DictAffixDataOffset(d))[i])
+
+#define DictAffixOffset(d) ((d)->data + (d)->AffixOffsetStart)
+#define DictAffix(d) ((d)->data + (d)->AffixStart)
+#define DictAffixGet(d, i) (((i) == ISPELL_INVALID_INDEX) ? NULL : \
+ DictAffix(d) + ((uint32 *) DictAffixOffset(d))[i])
+
+#define DictDictNodes(d) ((d)->data + (d)->DictNodesStart)
+#define DictPrefixNodes(d) ((d)->data + (d)->PrefixNodesStart)
+#define DictSuffixNodes(d) ((d)->data + (d)->SuffixNodesStart)
+#define DictNodeGet(node_start, of) (((of) == ISPELL_INVALID_OFFSET) ? NULL : (node_start) + (of))
+
+#define DictCompoundAffix(d) ((d)->data + (d)->CompoundAffixStart)
+
+/*
+ * IspellDictBuild is used to initialize IspellDictData struct. This is a
+ * temporary structure which is set up by NIStartBuild() and released by
+ * NIFinishBuild().
+ */
+typedef struct IspellDictBuild
+{
+ MemoryContext buildCxt; /* temp context for construction */
+
+ IspellDictData *dict;
+ uint32 dict_size;
+
+ /* Temporary data */
/* Array of Hunspell options in affix file */
CompoundAffixFlag *CompoundAffixFlags;
@@ -208,29 +284,52 @@ typedef struct
/* allocated length of CompoundAffixFlags array */
int mCompoundAffixFlag;
- /*
- * Remaining fields are only used during dictionary construction; they are
- * set up by NIStartBuild and cleared by NIFinishBuild.
- */
- MemoryContext buildCxt; /* temp context for construction */
-
- /* Temporary array of all words in the dict file */
+ /* Array of all words in the dict file */
SPELL **Spell;
- int nspell; /* number of valid entries in Spell array */
- int mspell; /* allocated length of Spell array */
-
- /* These are used to allocate "compact" data without palloc overhead */
- char *firstfree; /* first free address (always maxaligned) */
- size_t avail; /* free space remaining at firstfree */
-} IspellDict;
-
-extern TSLexeme *NINormalizeWord(IspellDict *Conf, char *word);
-
-extern void NIStartBuild(IspellDict *Conf);
-extern void NIImportAffixes(IspellDict *Conf, const char *filename);
-extern void NIImportDictionary(IspellDict *Conf, const char *filename);
-extern void NISortDictionary(IspellDict *Conf);
-extern void NISortAffixes(IspellDict *Conf);
-extern void NIFinishBuild(IspellDict *Conf);
+ int nSpell; /* number of valid entries in Spell array */
+ int mSpell; /* allocated length of Spell array */
+
+ /* Array of all affixes in the aff file */
+ AFFIX **Affix;
+ int nAffix; /* number of valid entries in Affix array */
+ int mAffix; /* allocated length of Affix array */
+ uint32 AffixSize;
+
+ /* Data for IspellDictData */
+
+ /* Array of sets of affixes */
+ uint32 *AffixDataOffset;
+ int nAffixData; /* number of affix sets */
+ int mAffixData; /* allocated number of affix sets */
+ char *AffixData;
+ uint32 AffixDataSize; /* allocated size of AffixData */
+ uint32 AffixDataEnd; /* end of data in AffixData */
+
+ /* Prefix tree which stores a word list */
+ NodeArray DictNodes;
+
+ /* Prefix tree which stores a prefix list */
+ NodeArray PrefixNodes;
+
+ /* Prefix tree which stores a suffix list */
+ NodeArray SuffixNodes;
+
+ /* Array of compound affixes */
+ CMPDAffix *CompoundAffix;
+ int nCompoundAffix; /* number of entries of CompoundAffix */
+} IspellDictBuild;
+
+#define AffixDataGet(d, i) ((d)->AffixData + (d)->AffixDataOffset[i])
+
+extern TSLexeme *NINormalizeWord(IspellDictData *Conf, char *word);
+
+extern void NIStartBuild(IspellDictBuild *ConfBuild);
+extern void NIImportAffixes(IspellDictBuild *ConfBuild, const char *filename);
+extern void NIImportDictionary(IspellDictBuild *ConfBuild,
+ const char *filename);
+extern void NISortDictionary(IspellDictBuild *ConfBuild);
+extern void NISortAffixes(IspellDictBuild *ConfBuild);
+extern void NICopyData(IspellDictBuild *ConfBuild);
+extern void NIFinishBuild(IspellDictBuild *ConfBuild);
#endif
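The packed AFFIX layout from the spell.h hunk above (repl, find and flag stored back to back in one flexible array, each terminated by \0) can be sketched in isolation as follows. This is only an illustration under simplified assumptions: PackedAffix, the PA_* macros, packed_affix_make() and packed_affix_size() are hypothetical names that mirror AffixFieldRepl(), AffixFieldFind(), AffixFieldFlag() and AffixGetSize(), they are not part of the patch, and the bit fields of the real struct are omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for the patched AFFIX: two lengths plus one buffer */
typedef struct
{
	uint8_t		replen;		/* length of repl, at most 255 */
	uint8_t		findlen;	/* length of find, at most 255 */
	char		fields[];	/* repl \0 find \0 flag \0 */
} PackedAffix;

#define PA_HDRSZ	(offsetof(PackedAffix, fields))
#define PA_REPL(af)	((af)->fields)
#define PA_FIND(af)	((af)->fields + (af)->replen + 1)
#define PA_FLAG(af)	(PA_FIND(af) + (af)->findlen + 1)

/* Total size of one entry, i.e. what would be copied into shared memory */
static size_t
packed_affix_size(const PackedAffix *af)
{
	return PA_HDRSZ + af->replen + 1 + af->findlen + 1 +
		strlen(PA_FLAG(af)) + 1;
}

/* Hypothetical builder: packs three strings into a single allocation */
static PackedAffix *
packed_affix_make(const char *repl, const char *find, const char *flag)
{
	size_t		replen = strlen(repl);
	size_t		findlen = strlen(find);
	size_t		flaglen = strlen(flag);
	PackedAffix *af;

	/* the lengths must fit into the 8-bit fields */
	assert(replen <= 255 && findlen <= 255);

	af = malloc(PA_HDRSZ + replen + 1 + findlen + 1 + flaglen + 1);
	assert(af != NULL);		/* a sketch; real code would report an error */

	/* the lengths must be set first, the accessor macros depend on them */
	af->replen = (uint8_t) replen;
	af->findlen = (uint8_t) findlen;
	memcpy(PA_REPL(af), repl, replen + 1);
	memcpy(PA_FIND(af), find, findlen + 1);
	memcpy(PA_FLAG(af), flag, flaglen + 1);

	return af;
}
```

Since an entry built this way contains no pointers, it can be copied with a single memcpy() of packed_affix_size() bytes and stays valid at any address.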
On Sun, Dec 31, 2017 at 06:28:13PM +0300, Arthur Zakirov wrote:
There are issues to do:
- add the GUC-variable for hash table limit
- fix bugs
- improve comments
- performance testing
Here is the second version of the patch.
0002-Retreive-shmem-location-for-ispell-v2.patch:
Fixed some bugs and added the GUC variable "shared_dictionaries".
Added documentation for it. I'm not sure about the order of the
configuration parameters in section "19.4.1. Memory": for now
"shared_dictionaries" goes right after "shared_buffers". Maybe it would
be good to make a separate patch which sorts the parameters in
alphabetical order?
0003-Store-ispell-structures-in-shmem-v2.patch:
Fixed some bugs; regression tests pass now. I added more comments
and fixed the old ones. I also tested the patch with Hunspell
dictionaries [1]. They work well too.
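To illustrate why 0003 replaces pointers with offsets, here is a minimal standalone sketch. It assumes a simplified node with a single child; FlatNodes, FlatNode and the flat_* functions are hypothetical names loosely modeled on NodeArray, NodeArrayGet() and ISPELL_INVALID_OFFSET, and are not code from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define INVALID_OFFSET 0xFFFFFFFFu	/* plays the role of ISPELL_INVALID_OFFSET */

/* Hypothetical growable byte buffer, loosely modeled on NodeArray */
typedef struct
{
	char	   *nodes;		/* start of the buffer */
	uint32_t	size;		/* allocated bytes */
	uint32_t	end;		/* bytes in use */
} FlatNodes;

/* Hypothetical node: it stores the offset of its child, never a pointer */
typedef struct
{
	uint32_t	val;
	uint32_t	child_offset;
} FlatNode;

/* Reserve room for one node and return its offset from the buffer start */
static uint32_t
flat_alloc(FlatNodes *a)
{
	uint32_t	off = a->end;
	FlatNode   *n;

	if (a->end + sizeof(FlatNode) > a->size)
	{
		a->size = (a->size == 0) ? 64 : a->size * 2;
		a->nodes = realloc(a->nodes, a->size);
		assert(a->nodes != NULL);	/* a sketch; real code would report an error */
	}

	n = (FlatNode *) (a->nodes + off);
	n->val = 0;
	n->child_offset = INVALID_OFFSET;	/* no child yet */
	a->end += sizeof(FlatNode);

	return off;
}

/* Resolve an offset against whatever base address the buffer has right now */
static FlatNode *
flat_get(char *base, uint32_t off)
{
	return (off == INVALID_OFFSET) ? NULL : (FlatNode *) (base + off);
}

/* Walk the chain starting at root and sum the values */
static uint32_t
flat_chain_sum(char *base, uint32_t root)
{
	uint32_t	sum = 0;
	FlatNode   *n;

	for (n = flat_get(base, root); n != NULL;
		 n = flat_get(base, n->child_offset))
		sum += n->val;

	return sum;
}
```

The point is that the buffer can be copied to a different address with memcpy(), as happens when each backend maps the segment at its own address, and flat_chain_sum() still works on the copy. A stored pointer would dangle after such a copy, and also after a realloc() during the build.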
Results of performance testing of Ispell and Hunspell dictionaries will
be ready soon.
1 - github.com/postgrespro/hunspell_dicts
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Attachments:
0001-Fix-ispell-memory-handling-v2.patch (text/plain; charset=us-ascii)
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index b9fdd77e19..25614f2d31 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -498,7 +498,7 @@ NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
strcpy(Conf->Spell[Conf->nspell]->word, word);
Conf->Spell[Conf->nspell]->p.flag = (*flag != '\0')
- ? cpstrdup(Conf, flag) : VoidString;
+ ? MemoryContextStrdup(Conf->buildCxt, flag) : VoidString;
Conf->nspell++;
}
@@ -1040,7 +1040,7 @@ setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
entry->flag.i = i;
}
else
- entry->flag.s = cpstrdup(Conf, s);
+ entry->flag.s = MemoryContextStrdup(Conf->buildCxt, s);
entry->flagMode = Conf->flagMode;
entry->value = val;
@@ -1536,6 +1536,9 @@ nextline:
return;
isnewformat:
+ pfree(recoded);
+ pfree(pstr);
+
if (oldformat)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
0002-Retreive-shmem-location-for-ispell-v2.patch (text/plain; charset=us-ascii)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index e4a01699e4..858423354e 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1355,6 +1355,36 @@ include_dir 'conf.d'
</listitem>
</varlistentry>
+ <varlistentry id="guc-shared-dictionaries" xreflabel="shared_dictionaries">
+ <term><varname>shared_dictionaries</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>shared_dictionaries</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of text search dictionaries loaded into shared
+ memory. The default is 10 dictionaries.
+ </para>
+
+ <para>
+ Currently controls only loading of <application>Ispell</application>
+ dictionaries (see <xref linkend="textsearch-ispell-dictionary"/>).
+ After a backend compiles the dictionary, it is copied into shared memory.
+ Other backends then use the copy in shared memory on their first use of
+ the dictionary, so they do not need to compile it a second time.
+ <literal>DictFile</literal> and <literal>AffFile</literal> are used to
+ look up the dictionary in shared memory.
+ </para>
+
+ <para>
+ If the number of simultaneously loaded dictionaries reaches this limit,
+ a new dictionary is loaded into the local memory of the backend
+ instead.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-huge-pages" xreflabel="huge_pages">
<term><varname>huge_pages</varname> (<type>enum</type>)
<indexterm>
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 0c86a581c0..c7dce8cac5 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -44,6 +44,7 @@
#include "storage/procsignal.h"
#include "storage/sinvaladt.h"
#include "storage/spin.h"
+#include "tsearch/ts_shared.h"
#include "utils/backend_random.h"
#include "utils/snapmgr.h"
@@ -150,6 +151,7 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
size = add_size(size, SyncScanShmemSize());
size = add_size(size, AsyncShmemSize());
size = add_size(size, BackendRandomShmemSize());
+ size = add_size(size, TsearchShmemSize());
#ifdef EXEC_BACKEND
size = add_size(size, ShmemBackendArraySize());
#endif
@@ -271,6 +273,11 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
AsyncShmemInit();
BackendRandomShmemInit();
+ /*
+ * Set up shared memory for tsearch
+ */
+ TsearchShmemInit();
+
#ifdef EXEC_BACKEND
/*
diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c
index 71caac1a1f..2446db7266 100644
--- a/src/backend/storage/lmgr/lwlock.c
+++ b/src/backend/storage/lmgr/lwlock.c
@@ -520,6 +520,7 @@ RegisterLWLockTranches(void)
"shared_tuplestore");
LWLockRegisterTranche(LWTRANCHE_TBM, "tbm");
LWLockRegisterTranche(LWTRANCHE_PARALLEL_APPEND, "parallel_append");
+ LWLockRegisterTranche(LWTRANCHE_TSEARCH_DSA, "tsearch_dsa");
/* Register named tranches. */
for (i = 0; i < NamedLWLockTrancheRequests; i++)
diff --git a/src/backend/tsearch/Makefile b/src/backend/tsearch/Makefile
index 227468ae9e..860cd196e9 100644
--- a/src/backend/tsearch/Makefile
+++ b/src/backend/tsearch/Makefile
@@ -26,7 +26,7 @@ DICTFILES_PATH=$(addprefix dicts/,$(DICTFILES))
OBJS = ts_locale.o ts_parse.o wparser.o wparser_def.o dict.o \
dict_simple.o dict_synonym.o dict_thesaurus.o \
dict_ispell.o regis.o spell.o \
- to_tsany.o ts_selfuncs.o ts_typanalyze.o ts_utils.o
+ to_tsany.o ts_selfuncs.o ts_shared.o ts_typanalyze.o ts_utils.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/tsearch/ts_shared.c b/src/backend/tsearch/ts_shared.c
new file mode 100644
index 0000000000..4682eab506
--- /dev/null
+++ b/src/backend/tsearch/ts_shared.c
@@ -0,0 +1,179 @@
+/*-------------------------------------------------------------------------
+ *
+ * ts_shared.c
+ * tsearch shared memory management
+ *
+ * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/tsearch/ts_shared.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "storage/lwlock.h"
+#include "storage/shmem.h"
+#include "tsearch/ts_shared.h"
+
+
+/*
+ * Hash table structures
+ */
+
+typedef struct
+{
+ char dictfile[MAXPGPATH];
+ char afffile[MAXPGPATH];
+} TsearchDictKey;
+
+typedef struct
+{
+ TsearchDictKey key;
+ dsm_handle dict_dsm;
+} TsearchDictEntry;
+
+static HTAB *dict_table;
+
+/*
+ * Shared struct for locking
+ */
+typedef struct
+{
+ LWLock lock;
+} TsearchCtlData;
+
+static TsearchCtlData *tsearch_ctl;
+
+/*
+ * GUC variable for maximum number of shared dictionaries
+ */
+int shared_dictionaries = 10;
+
+/*
+ * Return a handle to the dynamic shared memory segment found through the hash
+ * table. If shared memory for dictfile and afffile isn't allocated yet, do it.
+ *
+ * dictbuild: building structure for the dictionary.
+ * dictfile: .dict file of the dictionary.
+ * afffile: .aff file of the dictionary.
+ * allocate_cb: function to build the dictionary, if it wasn't found in DSM.
+ */
+dsm_handle
+ispell_shmem_location(void *dictbuild,
+ const char *dictfile, const char *afffile,
+ ispell_build_callback allocate_cb)
+{
+ TsearchDictKey key;
+ TsearchDictEntry *entry;
+ bool found;
+ dsm_segment *seg;
+ dsm_handle res;
+
+ StrNCpy(key.dictfile, dictfile, MAXPGPATH);
+ StrNCpy(key.afffile, afffile, MAXPGPATH);
+
+refind_entry:
+ LWLockAcquire(&tsearch_ctl->lock, LW_SHARED);
+
+ entry = (TsearchDictEntry *) hash_search(dict_table, &key, HASH_FIND,
+ &found);
+
+ /* Dictionary wasn't loaded into memory */
+ if (!found)
+ {
+ void *ispell_dict,
+ *dict_location;
+ Size ispell_size;
+
+ /* Try to get exclusive lock */
+ LWLockRelease(&tsearch_ctl->lock);
+ if (!LWLockAcquireOrWait(&tsearch_ctl->lock, LW_EXCLUSIVE))
+ {
+ /*
+ * The lock was released by another backend, try to refind an entry.
+ */
+ goto refind_entry;
+ }
+
+ entry = (TsearchDictEntry *) hash_search(dict_table, &key,
+ HASH_ENTER_NULL,
+ &found);
+
+ /* No space in shared hash table: let the backend build the dictionary locally */
+ if (entry == NULL)
+ {
+ LWLockRelease(&tsearch_ctl->lock);
+ return DSM_HANDLE_INVALID;
+ }
+
+ /* The lock was free so add new entry */
+ ispell_dict = allocate_cb(dictbuild, dictfile, afffile, &ispell_size);
+
+ seg = dsm_create(ispell_size, 0);
+ dict_location = dsm_segment_address(seg);
+ memcpy(dict_location, ispell_dict, ispell_size);
+
+ pfree(ispell_dict);
+
+ entry->dict_dsm = dsm_segment_handle(seg);
+ res = entry->dict_dsm;
+
+ /* Keep the segment alive until postmaster shutdown */
+ dsm_pin_segment(seg);
+
+ dsm_detach(seg);
+ }
+ else
+ {
+ res = entry->dict_dsm;
+ }
+
+ LWLockRelease(&tsearch_ctl->lock);
+
+ return res;
+}
+
+/*
+ * Allocate and initialize tsearch-related shared memory.
+ */
+void
+TsearchShmemInit(void)
+{
+ HASHCTL ctl;
+ bool found;
+
+ tsearch_ctl = (TsearchCtlData *)
+ ShmemInitStruct("Full Text Search Ctl", sizeof(TsearchCtlData), &found);
+
+ if (!found)
+ LWLockInitialize(&tsearch_ctl->lock, LWTRANCHE_TSEARCH_DSA);
+
+ memset(&ctl, 0, sizeof(ctl));
+ ctl.keysize = sizeof(TsearchDictKey);
+ ctl.entrysize = sizeof(TsearchDictEntry);
+
+ dict_table = ShmemInitHash("Shared Tsearch Lookup Table",
+ shared_dictionaries, shared_dictionaries,
+ &ctl,
+ HASH_ELEM | HASH_BLOBS);
+}
+
+/*
+ * Report shared memory space needed by TsearchShmemInit.
+ */
+Size
+TsearchShmemSize(void)
+{
+ Size size = 0;
+
+ /* size of service structure */
+ size = add_size(size, MAXALIGN(sizeof(TsearchCtlData)));
+
+ /* size of lookup hash table */
+ size = add_size(size, hash_estimate_size(shared_dictionaries,
+ sizeof(TsearchDictEntry)));
+
+ return size;
+}
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 72f6be329e..dbc9bf93f0 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -76,6 +76,7 @@
#include "storage/predicate.h"
#include "tcop/tcopprot.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_shared.h"
#include "utils/builtins.h"
#include "utils/bytea.h"
#include "utils/guc_tables.h"
@@ -2910,6 +2911,19 @@ static struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"shared_dictionaries", PGC_POSTMASTER, RESOURCES_MEM,
+ gettext_noop("Sets the maximum number of text search dictionaries loaded into shared memory."),
+ gettext_noop("Currently controls only loading of Ispell dictionaries. "
+ "If the number of simultaneously loaded dictionaries "
+ "reaches the maximum allowed number then a new dictionary "
+ "will be loaded into local memory of a backend.")
+ },
+ &shared_dictionaries,
+ 10, 0, INT_MAX,
+ NULL, NULL, NULL
+ },
+
/* End-of-list marker */
{
{NULL, 0, 0, NULL, NULL}, NULL, 0, 0, 0, NULL, NULL, NULL
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 69f40f04b0..b83ffe6a39 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -131,6 +131,7 @@
# mmap
# use none to disable dynamic shared memory
# (change requires restart)
+#shared_dictionaries = 10 # (change requires restart)
# - Disk -
diff --git a/src/include/storage/lwlock.h b/src/include/storage/lwlock.h
index c21bfe2f66..2bb80cdd26 100644
--- a/src/include/storage/lwlock.h
+++ b/src/include/storage/lwlock.h
@@ -219,6 +219,7 @@ typedef enum BuiltinTrancheIds
LWTRANCHE_SHARED_TUPLESTORE,
LWTRANCHE_TBM,
LWTRANCHE_PARALLEL_APPEND,
+ LWTRANCHE_TSEARCH_DSA,
LWTRANCHE_FIRST_USER_DEFINED
} BuiltinTrancheIds;
diff --git a/src/include/tsearch/ts_shared.h b/src/include/tsearch/ts_shared.h
new file mode 100644
index 0000000000..4bcfb437ef
--- /dev/null
+++ b/src/include/tsearch/ts_shared.h
@@ -0,0 +1,35 @@
+/*-------------------------------------------------------------------------
+ *
+ * ts_shared.h
+ * tsearch shared memory management
+ *
+ * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
+ *
+ * src/include/tsearch/ts_shared.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef TS_SHARED_H
+#define TS_SHARED_H
+
+#include "c.h"
+#include "storage/dsm.h"
+
+/*
+ * GUC variable for maximum number of shared dictionaries
+ */
+extern int shared_dictionaries;
+
+typedef void *(*ispell_build_callback) (void *dictbuild,
+ const char *dictfile,
+ const char *afffile,
+ Size *size);
+
+extern dsm_handle ispell_shmem_location(void *dictbuild,
+ const char *dictfile, const char *afffile,
+ ispell_build_callback allocate_cb);
+
+extern void TsearchShmemInit(void);
+extern Size TsearchShmemSize(void);
+
+#endif /* TS_SHARED_H */
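
The lookup in ispell_shmem_location() can be hard to follow inside the diff, so here is a minimal standalone sketch of the same idea: find an entry by (dictfile, afffile), build it if it is missing, and return an invalid handle when the table is full so the caller can fall back to backend-local memory. It is not the patch's code. It uses a plain array instead of a shared hash table, takes no lock, and every name in it (DictSlot, lookup_or_build, build_dictionary, KEY_LEN, MAX_SHARED, HANDLE_INVALID) is a hypothetical stand-in.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define KEY_LEN        64   /* hypothetical stand-in for MAXPGPATH */
#define MAX_SHARED     2    /* hypothetical stand-in for shared_dictionaries */
#define HANDLE_INVALID 0u   /* hypothetical stand-in for DSM_HANDLE_INVALID */

typedef struct
{
    char     dictfile[KEY_LEN];
    char     afffile[KEY_LEN];
    uint32_t handle;        /* HANDLE_INVALID marks a free slot */
} DictSlot;

static DictSlot slots[MAX_SHARED];  /* zero-initialized: all slots free */
static uint32_t next_handle = 1;
static int      builds = 0;         /* counts how often a build was needed */

/* Hypothetical build step: pretend to compile and hand out a new handle. */
static uint32_t
build_dictionary(void)
{
    builds++;
    return next_handle++;
}

/*
 * Return the handle for (dictfile, afffile), building the dictionary on
 * first use. Returns HANDLE_INVALID when every slot is taken, which tells
 * the caller to build the dictionary in its own memory instead.
 */
static uint32_t
lookup_or_build(const char *dictfile, const char *afffile)
{
    int free_slot = -1;

    assert(dictfile != NULL && afffile != NULL);

    for (int i = 0; i < MAX_SHARED; i++)
    {
        if (slots[i].handle == HANDLE_INVALID)
        {
            if (free_slot < 0)
                free_slot = i;      /* remember the first free slot */
        }
        else if (strcmp(slots[i].dictfile, dictfile) == 0 &&
                 strcmp(slots[i].afffile, afffile) == 0)
            return slots[i].handle; /* already loaded by someone else */
    }

    if (free_slot < 0)
        return HANDLE_INVALID;      /* table is full */

    strncpy(slots[free_slot].dictfile, dictfile, KEY_LEN - 1);
    slots[free_slot].dictfile[KEY_LEN - 1] = '\0';
    strncpy(slots[free_slot].afffile, afffile, KEY_LEN - 1);
    slots[free_slot].afffile[KEY_LEN - 1] = '\0';
    slots[free_slot].handle = build_dictionary();

    return slots[free_slot].handle;
}
```

In the real patch the whole find-or-insert step has to run under the LWLock, and every exit path, including the "table is full" one, must release that lock before returning.
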
0003-Store-ispell-structures-in-shmem-v2.patch
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index 0d706795ad..08f5d20ac5 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -5,6 +5,15 @@
*
* Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
*
+ * By default all Ispell dictionaries are stored in DSM. But if the number of
+ * loaded dictionaries reaches the maximum allowed value, a new dictionary is
+ * allocated within its own memory context (dictCtx).
+ *
+ * All necessary data is built within the dispell_build() function. But
+ * structures for regular expressions are compiled on first demand and
+ * stored in the AffixReg array, because regex_t and Regis cannot be
+ * stored in shared memory.
+ *
*
* IDENTIFICATION
* src/backend/tsearch/dict_ispell.c
@@ -14,8 +23,10 @@
#include "postgres.h"
#include "commands/defrem.h"
+#include "storage/dsm.h"
#include "tsearch/dicts/spell.h"
#include "tsearch/ts_locale.h"
+#include "tsearch/ts_shared.h"
#include "tsearch/ts_utils.h"
#include "utils/builtins.h"
@@ -23,48 +34,46 @@
typedef struct
{
StopList stoplist;
+ IspellDictBuild build;
IspellDict obj;
+ dsm_handle dict_handle;
} DictISpell;
+static void *dispell_build(void *dictbuild,
+ const char *dictfile, const char *afffile,
+ Size *size);
+
Datum
dispell_init(PG_FUNCTION_ARGS)
{
List *dictoptions = (List *) PG_GETARG_POINTER(0);
DictISpell *d;
- bool affloaded = false,
- dictloaded = false,
- stoploaded = false;
+ char *dictfile = NULL,
+ *afffile = NULL;
+ bool stoploaded = false;
ListCell *l;
d = (DictISpell *) palloc0(sizeof(DictISpell));
- NIStartBuild(&(d->obj));
-
foreach(l, dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
if (pg_strcasecmp(defel->defname, "DictFile") == 0)
{
- if (dictloaded)
+ if (dictfile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple DictFile parameters")));
- NIImportDictionary(&(d->obj),
- get_tsearch_config_filename(defGetString(defel),
- "dict"));
- dictloaded = true;
+ dictfile = get_tsearch_config_filename(defGetString(defel), "dict");
}
else if (pg_strcasecmp(defel->defname, "AffFile") == 0)
{
- if (affloaded)
+ if (afffile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple AffFile parameters")));
- NIImportAffixes(&(d->obj),
- get_tsearch_config_filename(defGetString(defel),
- "affix"));
- affloaded = true;
+ afffile = get_tsearch_config_filename(defGetString(defel), "affix");
}
else if (pg_strcasecmp(defel->defname, "StopWords") == 0)
{
@@ -84,12 +93,46 @@ dispell_init(PG_FUNCTION_ARGS)
}
}
- if (affloaded && dictloaded)
+ if (dictfile && afffile)
{
- NISortDictionary(&(d->obj));
- NISortAffixes(&(d->obj));
+ dsm_segment *seg;
+ uint32 naffix;
+
+ d->dict_handle = ispell_shmem_location(&(d->build), dictfile, afffile,
+ dispell_build);
+
+ /*
+ * There is no space in shared memory, build the dictionary within its
+ * memory context.
+ */
+ if (d->dict_handle == DSM_HANDLE_INVALID)
+ {
+ Size ispell_size;
+
+ d->obj.dict = (IspellDictData *) dispell_build(&(d->build),
+ dictfile, afffile,
+ &ispell_size);
+ naffix = d->obj.dict->nAffix;
+ }
+ /* The dictionary was allocated in DSM */
+ else
+ {
+ IspellDictData *dict;
+
+ seg = dsm_attach(d->dict_handle);
+ dict = (IspellDictData *) dsm_segment_address(seg);
+
+ /* We need to save naffix here because seg will be detached */
+ naffix = dict->nAffix;
+
+ dsm_detach(seg);
+ }
+
+ d->obj.reg = (AffixReg *) palloc0(naffix * sizeof(AffixReg));
+ /* Current memory context is dictionary's private memory context */
+ d->obj.dictCtx = CurrentMemoryContext;
}
- else if (!affloaded)
+ else if (!afffile)
{
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -102,8 +145,6 @@ dispell_init(PG_FUNCTION_ARGS)
errmsg("missing DictFile parameter")));
}
- NIFinishBuild(&(d->obj));
-
PG_RETURN_POINTER(d);
}
@@ -113,6 +154,7 @@ dispell_lexize(PG_FUNCTION_ARGS)
DictISpell *d = (DictISpell *) PG_GETARG_POINTER(0);
char *in = (char *) PG_GETARG_POINTER(1);
int32 len = PG_GETARG_INT32(2);
+ dsm_segment *seg = NULL;
char *txt;
TSLexeme *res;
TSLexeme *ptr,
@@ -121,11 +163,26 @@ dispell_lexize(PG_FUNCTION_ARGS)
if (len <= 0)
PG_RETURN_POINTER(NULL);
+ /*
+ * If the dictionary is allocated in DSM, get a pointer to IspellDictData.
+ * Otherwise d->obj.dict already points to IspellDictData allocated within
+ * the dictionary's memory context.
+ */
+ if (d->dict_handle != DSM_HANDLE_INVALID)
+ {
+ seg = dsm_attach(d->dict_handle);
+ d->obj.dict = (IspellDictData *) dsm_segment_address(seg);
+ }
+
txt = lowerstr_with_len(in, len);
res = NINormalizeWord(&(d->obj), txt);
if (res == NULL)
+ {
+ if (seg)
+ dsm_detach(seg);
PG_RETURN_POINTER(NULL);
+ }
cptr = res;
for (ptr = cptr; ptr->lexeme; ptr++)
@@ -144,5 +201,40 @@ dispell_lexize(PG_FUNCTION_ARGS)
}
cptr->lexeme = NULL;
+ if (seg)
+ dsm_detach(seg);
PG_RETURN_POINTER(res);
}
+
+/*
+ * Build the dictionary.
+ *
+ * Result is palloc'ed.
+ */
+static void *
+dispell_build(void *dictbuild, const char *dictfile, const char *afffile,
+ Size *size)
+{
+ IspellDictBuild *build = (IspellDictBuild *) dictbuild;
+
+ Assert(dictfile && afffile);
+
+ NIStartBuild(build);
+
+ /* Read files */
+ NIImportDictionary(build, dictfile);
+ NIImportAffixes(build, afffile);
+
+ /* Build persistent data to be used by backends */
+ NISortDictionary(build);
+ NISortAffixes(build);
+
+ NICopyData(build);
+
+ /* Release temporary data */
+ NIFinishBuild(build);
+
+ /* Return the buffer and its size */
+ *size = build->dict_size;
+ return build->dict;
+}
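
dispell_build() returns one buffer plus its size so that the caller can memcpy() it into a DSM segment. That only works if the buffer contains no pointers, because each backend may map the segment at a different address. The sketch below illustrates the general technique with a flat string pool addressed by offsets. It is an illustration under my own assumptions, not the IspellDictData layout from the patch, and FlatPool, flat_pool_build and flat_pool_get are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * A pointer-free container: a header, then an array of offsets, then the
 * string bytes. Everything is addressed relative to data[], so the whole
 * buffer stays valid after being copied to another address.
 */
typedef struct
{
    uint32_t nitems;        /* number of strings */
    uint32_t data_start;    /* offset of the string bytes within data[] */
    char     data[];        /* uint32_t offsets[nitems], then the strings */
} FlatPool;

/* Build the pool in one allocation; *size receives the total byte size. */
static FlatPool *
flat_pool_build(const char *const *items, uint32_t nitems, size_t *size)
{
    size_t    bytes = 0;
    size_t    total;
    uint32_t  off = 0;
    uint32_t *offsets;
    FlatPool *pool;

    assert(items != NULL && size != NULL);

    for (uint32_t i = 0; i < nitems; i++)
        bytes += strlen(items[i]) + 1;

    total = sizeof(FlatPool) + nitems * sizeof(uint32_t) + bytes;
    pool = calloc(1, total);
    if (pool == NULL)
        return NULL;

    pool->nitems = nitems;
    pool->data_start = (uint32_t) (nitems * sizeof(uint32_t));
    offsets = (uint32_t *) pool->data;

    for (uint32_t i = 0; i < nitems; i++)
    {
        size_t len = strlen(items[i]) + 1;

        offsets[i] = off;   /* store an offset, never a pointer */
        memcpy(pool->data + pool->data_start + off, items[i], len);
        off += (uint32_t) len;
    }

    *size = total;
    return pool;
}

/* Resolve the i-th string from its stored offset. */
static const char *
flat_pool_get(const FlatPool *pool, uint32_t i)
{
    const uint32_t *offsets = (const uint32_t *) pool->data;

    assert(i < pool->nitems);
    return pool->data + pool->data_start + offsets[i];
}
```

Copying the buffer with memcpy() and reading from the copy gives the same strings, which is the property the shared segment relies on.
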
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index 25614f2d31..68db529307 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -23,33 +23,35 @@
* Compilation of a dictionary
* ---------------------------
*
- * A compiled dictionary is stored in the IspellDict structure. Compilation of
- * a dictionary is divided into the several steps:
+ * A compiled dictionary is stored in the following structures:
+ * - IspellDictBuild - stores temporary data and IspellDictData
+ * - IspellDictData - stores permanent data used within NINormalizeWord()
+ * Compilation of the dictionary is divided into several steps:
* - NIImportDictionary() - stores each word of a .dict file in the
* temporary Spell field.
- * - NIImportAffixes() - stores affix rules of an .affix file in the
- * Affix field (not temporary) if an .affix file has the Ispell format.
+ * - NIImportAffixes() - stores affix rules of an .affix file in the temporary
+ * Affix field if an .affix file has the Ispell format.
* -> NIImportOOAffixes() - stores affix rules if an .affix file has the
* Hunspell format. The AffixData field is initialized if AF parameter
* is defined.
* - NISortDictionary() - builds a prefix tree (Trie) from the words list
- * and stores it in the Dictionary field. The words list is got from the
+ * and stores it in the DictNodes field. The words list is got from the
* Spell field. The AffixData field is initialized if AF parameter is not
* defined.
* - NISortAffixes():
* - builds a list of compound affixes from the affix list and stores it
* in the CompoundAffix.
* - builds prefix trees (Trie) from the affix list for prefixes and suffixes
- * and stores them in Suffix and Prefix fields.
+ * and stores them in SuffixNodes and PrefixNodes fields.
* The affix list is got from the Affix field.
+ * Persistent data of the dictionary is copied within NICopyData().
*
* Memory management
* -----------------
*
- * The IspellDict structure has the Spell field which is used only in compile
- * time. The Spell field stores a words list. It can take a lot of memory.
- * Therefore when a dictionary is compiled this field is cleared by
- * NIFinishBuild().
+ * The IspellDictBuild structure holds temporary data which is used only at
+ * compile time. It can take a lot of memory. Therefore, after the dictionary
+ * is compiled, this data is released by NIFinishBuild().
*
* All resources which should cleared by NIFinishBuild() is initialized using
* tmpalloc() and tmpalloc0().
@@ -73,110 +75,145 @@
* after the initialization is done. During initialization,
* CurrentMemoryContext is the long-lived memory context associated
* with the dictionary cache entry. We keep the short-lived stuff
- * in the Conf->buildCxt context.
+ * in the ConfBuild->buildCxt context.
*/
-#define tmpalloc(sz) MemoryContextAlloc(Conf->buildCxt, (sz))
-#define tmpalloc0(sz) MemoryContextAllocZero(Conf->buildCxt, (sz))
+#define tmpalloc(sz) MemoryContextAlloc(ConfBuild->buildCxt, (sz))
+#define tmpalloc0(sz) MemoryContextAllocZero(ConfBuild->buildCxt, (sz))
/*
* Prepare for constructing an ISpell dictionary.
*
- * The IspellDict struct is assumed to be zeroed when allocated.
+ * The IspellDictBuild struct is assumed to be zeroed when allocated.
*/
void
-NIStartBuild(IspellDict *Conf)
+NIStartBuild(IspellDictBuild *ConfBuild)
{
+ uint32 dict_size;
+
/*
* The temp context is a child of CurTransactionContext, so that it will
* go away automatically on error.
*/
- Conf->buildCxt = AllocSetContextCreate(CurTransactionContext,
- "Ispell dictionary init context",
- ALLOCSET_DEFAULT_SIZES);
+ ConfBuild->buildCxt = AllocSetContextCreate(CurTransactionContext,
+ "Ispell dictionary init context",
+ ALLOCSET_DEFAULT_SIZES);
+
+ /*
+ * Allocate the buffer for the dictionary in the current context, not in buildCxt.
+ */
+ dict_size = MAXALIGN(IspellDictDataHdrSize);
+ ConfBuild->dict = palloc0(dict_size);
+ ConfBuild->dict_size = dict_size;
}
/*
- * Clean up when dictionary construction is complete.
+ * Copy compiled and persistent data into IspellDictData.
*/
void
-NIFinishBuild(IspellDict *Conf)
+NICopyData(IspellDictBuild *ConfBuild)
{
- /* Release no-longer-needed temp memory */
- MemoryContextDelete(Conf->buildCxt);
- /* Just for cleanliness, zero the now-dangling pointers */
- Conf->buildCxt = NULL;
- Conf->Spell = NULL;
- Conf->firstfree = NULL;
- Conf->CompoundAffixFlags = NULL;
-}
+ IspellDictData *dict;
+ uint32 size;
+ int i;
+ uint32 *offsets,
+ offset;
+ /*
+ * Calculate necessary space
+ */
+ size = ConfBuild->nAffixData * sizeof(uint32);
+ size += ConfBuild->AffixDataEnd;
-/*
- * "Compact" palloc: allocate without extra palloc overhead.
- *
- * Since we have no need to free the ispell data items individually, there's
- * not much value in the per-chunk overhead normally consumed by palloc.
- * Getting rid of it is helpful since ispell can allocate a lot of small nodes.
- *
- * We currently pre-zero all data allocated this way, even though some of it
- * doesn't need that. The cpalloc and cpalloc0 macros are just documentation
- * to indicate which allocations actually require zeroing.
- */
-#define COMPACT_ALLOC_CHUNK 8192 /* amount to get from palloc at once */
-#define COMPACT_MAX_REQ 1024 /* must be < COMPACT_ALLOC_CHUNK */
+ size += ConfBuild->nAffix * sizeof(uint32);
+ size += ConfBuild->AffixSize;
-static void *
-compact_palloc0(IspellDict *Conf, size_t size)
-{
- void *result;
+ size += ConfBuild->DictNodes.NodesEnd;
+ size += ConfBuild->PrefixNodes.NodesEnd;
+ size += ConfBuild->SuffixNodes.NodesEnd;
- /* Should only be called during init */
- Assert(Conf->buildCxt != NULL);
+ size += sizeof(CMPDAffix) * ConfBuild->nCompoundAffix;
- /* No point in this for large chunks */
- if (size > COMPACT_MAX_REQ)
- return palloc0(size);
+ /*
+ * Copy data itself
+ */
+ ConfBuild->dict_size = IspellDictDataHdrSize + size;
+ ConfBuild->dict = repalloc(ConfBuild->dict, ConfBuild->dict_size);
+
+ dict = ConfBuild->dict;
+
+ /* AffixData */
+ dict->nAffixData = ConfBuild->nAffixData;
+ dict->AffixDataStart = sizeof(uint32) * ConfBuild->nAffixData;
+ memcpy(DictAffixDataOffset(dict), ConfBuild->AffixDataOffset,
+ sizeof(uint32) * ConfBuild->nAffixData);
+ memcpy(DictAffixData(dict), ConfBuild->AffixData, ConfBuild->AffixDataEnd);
+
+ /* Affix array */
+ dict->nAffix = ConfBuild->nAffix;
+ dict->AffixOffsetStart = dict->AffixDataStart + ConfBuild->AffixDataEnd;
+ dict->AffixStart = dict->AffixOffsetStart + sizeof(uint32) * ConfBuild->nAffix;
+ offsets = (uint32 *) DictAffixOffset(dict);
+ offset = 0;
+ for (i = 0; i < ConfBuild->nAffix; i++)
+ {
+ AFFIX *affix;
+ uint32 size = AffixGetSize(ConfBuild->Affix[i]);
- /* Keep everything maxaligned */
- size = MAXALIGN(size);
+ offsets[i] = offset;
+ affix = (AFFIX *) DictAffixGet(dict, i);
+ Assert(affix);
- /* Need more space? */
- if (size > Conf->avail)
- {
- Conf->firstfree = palloc0(COMPACT_ALLOC_CHUNK);
- Conf->avail = COMPACT_ALLOC_CHUNK;
- }
+ memcpy(affix, ConfBuild->Affix[i], size);
- result = (void *) Conf->firstfree;
- Conf->firstfree += size;
- Conf->avail -= size;
+ offset += size;
+ }
- return result;
+ /* DictNodes prefix tree */
+ dict->DictNodesStart = dict->AffixStart + offset;
+ memcpy(DictDictNodes(dict), ConfBuild->DictNodes.Nodes,
+ ConfBuild->DictNodes.NodesEnd);
+
+ /* PrefixNodes prefix tree */
+ dict->PrefixNodesStart = dict->DictNodesStart + ConfBuild->DictNodes.NodesEnd;
+ memcpy(DictPrefixNodes(dict), ConfBuild->PrefixNodes.Nodes,
+ ConfBuild->PrefixNodes.NodesEnd);
+
+ /* SuffixNodes prefix tree */
+ dict->SuffixNodesStart = dict->PrefixNodesStart + ConfBuild->PrefixNodes.NodesEnd;
+ memcpy(DictSuffixNodes(dict), ConfBuild->SuffixNodes.Nodes,
+ ConfBuild->SuffixNodes.NodesEnd);
+
+ /* CompoundAffix array */
+ dict->CompoundAffixStart = dict->SuffixNodesStart +
+ ConfBuild->SuffixNodes.NodesEnd;
+ memcpy(DictCompoundAffix(dict), ConfBuild->CompoundAffix,
+ sizeof(CMPDAffix) * ConfBuild->nCompoundAffix);
}
-#define cpalloc(size) compact_palloc0(Conf, size)
-#define cpalloc0(size) compact_palloc0(Conf, size)
-
-static char *
-cpstrdup(IspellDict *Conf, const char *str)
+/*
+ * Clean up when dictionary construction is complete.
+ */
+void
+NIFinishBuild(IspellDictBuild *ConfBuild)
{
- char *res = cpalloc(strlen(str) + 1);
-
- strcpy(res, str);
- return res;
+ /* Release no-longer-needed temp memory */
+ MemoryContextDelete(ConfBuild->buildCxt);
+ /* Just for cleanliness, zero the now-dangling pointers */
+ ConfBuild->buildCxt = NULL;
+ ConfBuild->Spell = NULL;
+ ConfBuild->CompoundAffixFlags = NULL;
}
-
/*
* Apply lowerstr(), producing a temporary result (in the buildCxt).
*/
static char *
-lowerstr_ctx(IspellDict *Conf, const char *src)
+lowerstr_ctx(IspellDictBuild *ConfBuild, const char *src)
{
MemoryContext saveCtx;
char *dst;
- saveCtx = MemoryContextSwitchTo(Conf->buildCxt);
+ saveCtx = MemoryContextSwitchTo(ConfBuild->buildCxt);
dst = lowerstr(src);
MemoryContextSwitchTo(saveCtx);
@@ -188,7 +225,7 @@ lowerstr_ctx(IspellDict *Conf, const char *src)
#define STRNCMP(s,p) strncmp( (s), (p), strlen(p) )
#define GETWCHAR(W,L,N,T) ( ((const uint8*)(W))[ ((T)==FF_PREFIX) ? (N) : ( (L) - 1 - (N) ) ] )
-#define GETCHAR(A,N,T) GETWCHAR( (A)->repl, (A)->replen, N, T )
+#define GETCHAR(A,N,T) GETWCHAR( AffixFieldRepl(A), (A)->replen, N, T )
static char *VoidString = "";
@@ -309,18 +346,189 @@ strbncmp(const unsigned char *s1, const unsigned char *s2, size_t count)
static int
cmpaffix(const void *s1, const void *s2)
{
- const AFFIX *a1 = (const AFFIX *) s1;
- const AFFIX *a2 = (const AFFIX *) s2;
+ const AFFIX *a1 = *((AFFIX *const *) s1);
+ const AFFIX *a2 = *((AFFIX *const *) s2);
if (a1->type < a2->type)
return -1;
if (a1->type > a2->type)
return 1;
if (a1->type == FF_PREFIX)
- return strcmp(a1->repl, a2->repl);
+ return strcmp(AffixFieldRepl(a1), AffixFieldRepl(a2));
else
- return strbcmp((const unsigned char *) a1->repl,
- (const unsigned char *) a2->repl);
+ return strbcmp((const unsigned char *) AffixFieldRepl(a1),
+ (const unsigned char *) AffixFieldRepl(a2));
+}
+
+/*
+ * Allocate space for AffixData.
+ */
+static void
+InitAffixData(IspellDictBuild *ConfBuild, int numAffixData)
+{
+ uint32 size;
+
+ size = 8 * 1024; /* Reserve 8KB for data */
+
+ ConfBuild->AffixData = (char *) tmpalloc(size);
+ ConfBuild->AffixDataSize = size;
+ ConfBuild->AffixDataOffset = (uint32 *) tmpalloc(numAffixData * sizeof(uint32));
+ ConfBuild->nAffixData = 0;
+ ConfBuild->mAffixData = numAffixData;
+
+ /* Save offset of the end of data */
+ ConfBuild->AffixDataEnd = 0;
+}
+
+/*
+ * Add a set of affix flags to the IspellDictBuild struct. If the AffixData
+ * buffer cannot fit the new affix set, enlarge it.
+ *
+ * ConfBuild: building structure for the current dictionary.
+ * AffixSet: set of affix flags.
+ */
+static void
+AddAffixSet(IspellDictBuild *ConfBuild, const char *AffixSet,
+ uint32 AffixSetLen)
+{
+ /*
+ * Check available space for AffixSet.
+ */
+ if (ConfBuild->AffixDataEnd + AffixSetLen + 1 /* \0 */ >=
+ ConfBuild->AffixDataSize)
+ {
+ uint32 newsize = Max(ConfBuild->AffixDataSize + 8 * 1024 /* 8KB */,
+ ConfBuild->AffixDataSize + AffixSetLen + 1);
+
+ ConfBuild->AffixData = (char *) repalloc(ConfBuild->AffixData, newsize);
+ ConfBuild->AffixDataSize = newsize;
+ }
+
+ /* Check available number of offsets */
+ if (ConfBuild->nAffixData >= ConfBuild->mAffixData)
+ {
+ ConfBuild->mAffixData *= 2;
+ ConfBuild->AffixDataOffset = (uint32 *) repalloc(ConfBuild->AffixDataOffset,
+ sizeof(uint32) * ConfBuild->mAffixData);
+ }
+
+ ConfBuild->AffixDataOffset[ConfBuild->nAffixData] = ConfBuild->AffixDataEnd;
+ StrNCpy(AffixDataGet(ConfBuild, ConfBuild->nAffixData),
+ AffixSet, AffixSetLen + 1);
+
+ /* Save offset of the end of data */
+ ConfBuild->AffixDataEnd += AffixSetLen + 1;
+ ConfBuild->nAffixData++;
+}
+
+/*
+ * Allocate space for prefix tree node.
+ *
+ * ConfBuild: building structure for the current dictionary.
+ * array: NodeArray where to allocate new node.
+ * length: number of allocated NodeData.
+ * sizeNodeData: minimum size of each NodeData.
+ * sizeNodeHeader: size of header of new node.
+ *
+ * Returns an offset of new node in NodeArray->Nodes.
+ */
+static uint32
+AllocateNode(IspellDictBuild *ConfBuild, NodeArray *array, uint32 length,
+ uint32 sizeNodeData, uint32 sizeNodeHeader)
+{
+ uint32 node_offset;
+ uint32 size;
+
+ size = sizeNodeHeader + length * sizeNodeData;
+ size = MAXALIGN(size);
+
+ if (array->NodesSize == 0)
+ {
+ array->NodesSize = size * 32; /* Reserve space for next levels of the
+ * prefix tree */
+ array->Nodes = (char *) tmpalloc(array->NodesSize);
+ array->NodesEnd = 0;
+ }
+ else if (array->NodesEnd + size >= array->NodesSize)
+ {
+ array->NodesSize = Max(array->NodesSize * 2, array->NodesSize + size);
+ array->Nodes = (char *) repalloc(array->Nodes, array->NodesSize);
+ }
+
+ node_offset = array->NodesEnd;
+ array->NodesEnd += size;
+
+ return node_offset;
+}
+
+/*
+ * Allocate space for SPNode.
+ *
+ * Returns an offset of new node in ConfBuild->DictNodes.Nodes.
+ */
+static uint32
+AllocateSPNode(IspellDictBuild *ConfBuild, uint32 length)
+{
+ uint32 offset;
+ SPNode *node;
+ SPNodeData *data;
+ uint32 i;
+
+ offset = AllocateNode(ConfBuild, &ConfBuild->DictNodes, length,
+ sizeof(SPNodeData), SPNHDRSZ);
+ node = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, offset);
+ node->length = length;
+
+ /*
+ * Initialize all SPNodeData with default values. We cannot use memset()
+ * here because not all fields have 0 as default value.
+ */
+ for (i = 0; i < length; i++)
+ {
+ data = &(node->data[i]);
+ data->val = 0;
+ data->affix = ISPELL_INVALID_INDEX;
+ data->compoundflag = 0;
+ data->isword = 0;
+ data->node_offset = ISPELL_INVALID_OFFSET;
+ }
+
+ return offset;
+}
+
+/*
+ * Allocate space for AffixNode.
+ *
+ * Returns an offset of new node in NodeArray->Nodes.
+ */
+static uint32
+AllocateAffixNode(IspellDictBuild *ConfBuild, NodeArray *array, uint32 length)
+{
+ uint32 offset;
+ AffixNode *node;
+ AffixNodeData *data;
+ uint32 i;
+
+ offset = AllocateNode(ConfBuild, array, length, sizeof(AffixNodeData),
+ ANHRDSZ);
+ node = (AffixNode *) NodeArrayGet(array, offset);
+ node->length = length;
+ node->isvoid = 0;
+
+ /*
+ * Initialize all AffixNodeData with default values. We cannot use memset()
+ * here because not all fields have 0 as default value.
+ */
+ for (i = 0; i < length; i++)
+ {
+ data = &(node->data[i]);
+ data->val = 0;
+ data->affstart = ISPELL_INVALID_INDEX;
+ data->affend = ISPELL_INVALID_INDEX;
+ data->node_offset = ISPELL_INVALID_OFFSET;
+ }
+
+ return offset;
}
/*
@@ -331,7 +539,7 @@ cmpaffix(const void *s1, const void *s2)
* - 2 characters (FM_LONG). A character may be Unicode.
* - numbers from 1 to 65000 (FM_NUM).
*
- * Depending on the flagMode an affix string can have the following format:
+ * Depending on the flagmode an affix string can have the following format:
* - FM_CHAR: ABCD
* Here we have 4 flags: A, B, C and D
* - FM_LONG: ABCDE*
@@ -339,13 +547,13 @@ cmpaffix(const void *s1, const void *s2)
* - FM_NUM: 200,205,50
* Here we have 3 flags: 200, 205 and 50
*
- * Conf: current dictionary.
+ * flagmode: flag mode of the dictionary
* sflagset: the set of affix flags. Returns a reference to the start of a next
* affix flag.
* sflag: returns an affix flag from sflagset.
*/
static void
-getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
+getNextFlagFromString(FlagMode flagmode, char **sflagset, char *sflag)
{
int32 s;
char *next,
@@ -354,11 +562,11 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
bool stop = false;
bool met_comma = false;
- maxstep = (Conf->flagMode == FM_LONG) ? 2 : 1;
+ maxstep = (flagmode == FM_LONG) ? 2 : 1;
while (**sflagset)
{
- switch (Conf->flagMode)
+ switch (flagmode)
{
case FM_LONG:
case FM_CHAR:
@@ -420,15 +628,15 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
stop = true;
break;
default:
- elog(ERROR, "unrecognized type of Conf->flagMode: %d",
- Conf->flagMode);
+ elog(ERROR, "unrecognized type of flagmode: %d",
+ flagmode);
}
if (stop)
break;
}
- if (Conf->flagMode == FM_LONG && maxstep > 0)
+ if (flagmode == FM_LONG && maxstep > 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix flag \"%s\" with \"long\" flag value",
@@ -438,31 +646,28 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
}
/*
- * Checks if the affix set Conf->AffixData[affix] contains affixflag.
- * Conf->AffixData[affix] does not contain affixflag if this flag is not used
- * actually by the .dict file.
+ * Checks if the affix set from AffixData contains affixflag. The affix set
+ * does not contain affixflag if this flag is not actually used by the .dict file.
*
- * Conf: current dictionary.
- * affix: index of the Conf->AffixData array.
+ * flagmode: flag mode of the dictionary.
+ * sflagset: the set of affix flags.
* affixflag: the affix flag.
*
- * Returns true if the string Conf->AffixData[affix] contains affixflag,
- * otherwise returns false.
+ * Returns true if the affix set string contains affixflag, otherwise returns
+ * false.
*/
static bool
-IsAffixFlagInUse(IspellDict *Conf, int affix, const char *affixflag)
+IsAffixFlagInUse(FlagMode flagmode, char *sflagset, const char *affixflag)
{
- char *flagcur;
+ char *flagcur = sflagset;
char flag[BUFSIZ];
if (*affixflag == 0)
return true;
- flagcur = Conf->AffixData[affix];
-
while (*flagcur)
{
- getNextFlagFromString(Conf, &flagcur, flag);
+ getNextFlagFromString(flagmode, &flagcur, flag);
/* Compare first affix flag in flagcur with affixflag */
if (strcmp(flag, affixflag) == 0)
return true;
@@ -475,31 +680,33 @@ IsAffixFlagInUse(IspellDict *Conf, int affix, const char *affixflag)
/*
* Adds the new word into the temporary array Spell.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* word: new word.
* flag: set of affix flags. Single flag can be get by getNextFlagFromString().
*/
static void
-NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
+NIAddSpell(IspellDictBuild *ConfBuild, const char *word, const char *flag)
{
- if (Conf->nspell >= Conf->mspell)
+ if (ConfBuild->nSpell >= ConfBuild->mSpell)
{
- if (Conf->mspell)
+ if (ConfBuild->mSpell)
{
- Conf->mspell *= 2;
- Conf->Spell = (SPELL **) repalloc(Conf->Spell, Conf->mspell * sizeof(SPELL *));
+ ConfBuild->mSpell *= 2;
+ ConfBuild->Spell = (SPELL **) repalloc(ConfBuild->Spell,
+ ConfBuild->mSpell * sizeof(SPELL *));
}
else
{
- Conf->mspell = 1024 * 20;
- Conf->Spell = (SPELL **) tmpalloc(Conf->mspell * sizeof(SPELL *));
+ ConfBuild->mSpell = 1024 * 20;
+ ConfBuild->Spell = (SPELL **) tmpalloc(ConfBuild->mSpell * sizeof(SPELL *));
}
}
- Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
- strcpy(Conf->Spell[Conf->nspell]->word, word);
- Conf->Spell[Conf->nspell]->p.flag = (*flag != '\0')
- ? MemoryContextStrdup(Conf->buildCxt, flag) : VoidString;
- Conf->nspell++;
+ ConfBuild->Spell[ConfBuild->nSpell] =
+ (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
+ strcpy(ConfBuild->Spell[ConfBuild->nSpell]->word, word);
+ ConfBuild->Spell[ConfBuild->nSpell]->p.flag = (*flag != '\0')
+ ? MemoryContextStrdup(ConfBuild->buildCxt, flag) : VoidString;
+ ConfBuild->nSpell++;
}
/*
@@ -507,11 +714,11 @@ NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
*
* Note caller must already have applied get_tsearch_config_filename.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* filename: path to the .dict file.
*/
void
-NIImportDictionary(IspellDict *Conf, const char *filename)
+NIImportDictionary(IspellDictBuild *ConfBuild, const char *filename)
{
tsearch_readline_state trst;
char *line;
@@ -562,9 +769,9 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
}
s += pg_mblen(s);
}
- pstr = lowerstr_ctx(Conf, line);
+ pstr = lowerstr_ctx(ConfBuild, line);
- NIAddSpell(Conf, pstr, flag);
+ NIAddSpell(ConfBuild, pstr, flag);
pfree(pstr);
pfree(line);
@@ -588,7 +795,7 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
* SFX M 0 's .
* is presented here.
*
- * Conf: current dictionary.
+ * dict: current dictionary.
* word: basic form of word.
* affixflag: affix flag, by which a basic form of word was generated.
* flag: compound flag used to compare with StopMiddle->compoundflag.
@@ -596,9 +803,9 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
* Returns 1 if the word was found in the prefix tree, else returns 0.
*/
static int
-FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
+FindWord(IspellDictData *dict, const char *word, const char *affixflag, int flag)
{
- SPNode *node = Conf->Dictionary;
+ SPNode *node = (SPNode *) DictDictNodes(dict);
SPNodeData *StopLow,
*StopHigh,
*StopMiddle;
@@ -634,10 +841,14 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
* Check if this affix rule is presented in the affix set
* with index StopMiddle->affix.
*/
- if (IsAffixFlagInUse(Conf, StopMiddle->affix, affixflag))
+ if (IsAffixFlagInUse(dict->flagMode,
+ DictAffixDataGet(dict, StopMiddle->affix),
+ affixflag))
return 1;
}
- node = StopMiddle->node;
+ /* Retrieve SPNode by the offset */
+ node = (SPNode *) DictNodeGet(DictDictNodes(dict),
+ StopMiddle->node_offset);
ptr++;
break;
}
@@ -655,7 +866,8 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
/*
* Adds a new affix rule to the Affix field.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary, is used to allocate
+ * temporary data.
* flag: affix flag ('\' in the below example).
* flagflags: set of flags from the flagval field for this affix rule. This set
* is listed after '/' character in the added string (repl).
@@ -671,26 +883,54 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
* type: FF_SUFFIX or FF_PREFIX.
*/
static void
-NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
- const char *find, const char *repl, int type)
+NIAddAffix(IspellDictBuild *ConfBuild, const char *flag, char flagflags,
+ const char *mask, const char *find, const char *repl, int type)
{
AFFIX *Affix;
+ uint32 size;
+ uint32 flaglen = strlen(flag),
+ findlen = strlen(find),
+ repllen = strlen(repl),
+ masklen = strlen(mask);
+
+ /* Sanity checks */
+ if (flaglen > AF_FLAG_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix flag \"%s\" too long", flag)));
+ if (findlen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix find field \"%s\" too long", find)));
+ if (repllen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix repl field \"%s\" too long", repl)));
+ if (masklen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix mask field \"%s\" too long", mask)));
- if (Conf->naffixes >= Conf->maffixes)
+ if (ConfBuild->nAffix >= ConfBuild->mAffix)
{
- if (Conf->maffixes)
+ if (ConfBuild->mAffix)
{
- Conf->maffixes *= 2;
- Conf->Affix = (AFFIX *) repalloc((void *) Conf->Affix, Conf->maffixes * sizeof(AFFIX));
+ ConfBuild->mAffix *= 2;
+ ConfBuild->Affix = (AFFIX **) repalloc(ConfBuild->Affix,
+ ConfBuild->mAffix * sizeof(AFFIX *));
}
else
{
- Conf->maffixes = 16;
- Conf->Affix = (AFFIX *) palloc(Conf->maffixes * sizeof(AFFIX));
+ ConfBuild->mAffix = 255;
+ ConfBuild->Affix = (AFFIX **) tmpalloc(ConfBuild->mAffix * sizeof(AFFIX *));
}
}
- Affix = Conf->Affix + Conf->naffixes;
+ size = AFFIXHDRSZ + flaglen + 1 /* \0 */ + findlen + 1 /* \0 */ +
+ repllen + 1 /* \0 */ + masklen + 1 /* \0 */;
+
+ Affix = (AFFIX *) tmpalloc(size);
+ ConfBuild->Affix[ConfBuild->nAffix] = Affix;
/* This affix rule can be applied for words with any ending */
if (strcmp(mask, ".") == 0 || *mask == '\0')
@@ -703,42 +943,12 @@ NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
{
Affix->issimple = 0;
Affix->isregis = 1;
- RS_compile(&(Affix->reg.regis), (type == FF_SUFFIX),
- *mask ? mask : VoidString);
}
/* This affix rule will use regex_t to search word ending */
else
{
- int masklen;
- int wmasklen;
- int err;
- pg_wchar *wmask;
- char *tmask;
-
Affix->issimple = 0;
Affix->isregis = 0;
- tmask = (char *) tmpalloc(strlen(mask) + 3);
- if (type == FF_SUFFIX)
- sprintf(tmask, "%s$", mask);
- else
- sprintf(tmask, "^%s", mask);
-
- masklen = strlen(tmask);
- wmask = (pg_wchar *) tmpalloc((masklen + 1) * sizeof(pg_wchar));
- wmasklen = pg_mb2wchar_with_len(tmask, wmask, masklen);
-
- err = pg_regcomp(&(Affix->reg.regex), wmask, wmasklen,
- REG_ADVANCED | REG_NOSUB,
- DEFAULT_COLLATION_OID);
- if (err)
- {
- char errstr[100];
-
- pg_regerror(err, &(Affix->reg.regex), errstr, sizeof(errstr));
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
- errmsg("invalid regular expression: %s", errstr)));
- }
}
Affix->flagflags = flagflags;
@@ -747,15 +957,22 @@ NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
if ((Affix->flagflags & FF_COMPOUNDFLAG) == 0)
Affix->flagflags |= FF_COMPOUNDFLAG;
}
- Affix->flag = cpstrdup(Conf, flag);
+
Affix->type = type;
- Affix->find = (find && *find) ? cpstrdup(Conf, find) : VoidString;
- if ((Affix->replen = strlen(repl)) > 0)
- Affix->repl = cpstrdup(Conf, repl);
- else
- Affix->repl = VoidString;
- Conf->naffixes++;
+ Affix->replen = repllen;
+ StrNCpy(AffixFieldRepl(Affix), repl, repllen + 1);
+
+ Affix->findlen = findlen;
+ StrNCpy(AffixFieldFind(Affix), find, findlen + 1);
+
+ Affix->masklen = masklen;
+ StrNCpy(AffixFieldMask(Affix), mask, masklen + 1);
+
+ StrNCpy(AffixFieldFlag(Affix), flag, flaglen + 1);
+
+ ConfBuild->nAffix++;
+ ConfBuild->AffixSize += size;
}
/* Parsing states for parse_affentry() and friends */
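Reviewer's note on the NIAddAffix() hunk above: the rule is now allocated as one contiguous block (a fixed header followed by the flag, find, repl and mask strings), so it can be copied into shared memory without fixing up pointers. The AffixField*() macros are not shown in this part of the patch, so the following is only a minimal sketch of one possible layout. The PackedAffix type, its field order and the accessor names are assumptions made for illustration, not the patch's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical packed affix record: a fixed header followed by four
 * NUL-terminated strings stored back to back as repl, find, mask, flag.
 * No pointers are stored, so the block is position-independent.
 */
typedef struct
{
	uint16_t	replen;
	uint16_t	findlen;
	uint16_t	masklen;
	char		fields[];	/* repl\0 find\0 mask\0 flag\0 */
} PackedAffix;

/* Accessors compute each string's address from the stored lengths. */
static char *
affix_repl(PackedAffix *a)
{
	return a->fields;
}

static char *
affix_find(PackedAffix *a)
{
	return affix_repl(a) + a->replen + 1;
}

static char *
affix_mask(PackedAffix *a)
{
	return affix_find(a) + a->findlen + 1;
}

static char *
affix_flag(PackedAffix *a)
{
	return affix_mask(a) + a->masklen + 1;
}

/* Allocate one block large enough for the header and all four strings. */
static PackedAffix *
packed_affix_new(const char *flag, const char *find,
				 const char *repl, const char *mask)
{
	size_t		flaglen = strlen(flag),
				findlen = strlen(find),
				repllen = strlen(repl),
				masklen = strlen(mask);
	size_t		size = offsetof(PackedAffix, fields) +
		repllen + 1 + findlen + 1 + masklen + 1 + flaglen + 1;
	PackedAffix *a = malloc(size);

	assert(a != NULL);
	a->replen = (uint16_t) repllen;
	a->findlen = (uint16_t) findlen;
	a->masklen = (uint16_t) masklen;

	/* Each copy includes the terminating NUL byte. */
	memcpy(affix_repl(a), repl, repllen + 1);
	memcpy(affix_find(a), find, findlen + 1);
	memcpy(affix_mask(a), mask, masklen + 1);
	memcpy(affix_flag(a), flag, flaglen + 1);
	return a;
}
```

Because every string is located through lengths kept in the header, the same accessors work whether the block lives in backend-local memory or in a shared segment mapped at a different address.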
@@ -1019,10 +1236,10 @@ parse_affentry(char *str, char *mask, char *find, char *repl)
* Sets a Hunspell options depending on flag type.
*/
static void
-setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
+setCompoundAffixFlagValue(IspellDictBuild *ConfBuild, CompoundAffixFlag *entry,
char *s, uint32 val)
{
- if (Conf->flagMode == FM_NUM)
+ if (ConfBuild->dict->flagMode == FM_NUM)
{
char *next;
int i;
@@ -1040,21 +1257,21 @@ setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
entry->flag.i = i;
}
else
- entry->flag.s = MemoryContextStrdup(Conf->buildCxt, s);
+ entry->flag.s = MemoryContextStrdup(ConfBuild->buildCxt, s);
- entry->flagMode = Conf->flagMode;
+ entry->flagMode = ConfBuild->dict->flagMode;
entry->value = val;
}
/*
* Sets up a correspondence for the affix parameter with the affix flag.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* s: affix flag in string.
* val: affix parameter.
*/
static void
-addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
+addCompoundAffixFlagValue(IspellDictBuild *ConfBuild, char *s, uint32 val)
{
CompoundAffixFlag *newValue;
char sbuf[BUFSIZ];
@@ -1081,29 +1298,29 @@ addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
*sflag = '\0';
/* Resize array or allocate memory for array CompoundAffixFlag */
- if (Conf->nCompoundAffixFlag >= Conf->mCompoundAffixFlag)
+ if (ConfBuild->nCompoundAffixFlag >= ConfBuild->mCompoundAffixFlag)
{
- if (Conf->mCompoundAffixFlag)
+ if (ConfBuild->mCompoundAffixFlag)
{
- Conf->mCompoundAffixFlag *= 2;
- Conf->CompoundAffixFlags = (CompoundAffixFlag *)
- repalloc((void *) Conf->CompoundAffixFlags,
- Conf->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
+ ConfBuild->mCompoundAffixFlag *= 2;
+ ConfBuild->CompoundAffixFlags = (CompoundAffixFlag *)
+ repalloc((void *) ConfBuild->CompoundAffixFlags,
+ ConfBuild->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
}
else
{
- Conf->mCompoundAffixFlag = 10;
- Conf->CompoundAffixFlags = (CompoundAffixFlag *)
- tmpalloc(Conf->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
+ ConfBuild->mCompoundAffixFlag = 10;
+ ConfBuild->CompoundAffixFlags = (CompoundAffixFlag *)
+ tmpalloc(ConfBuild->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
}
}
- newValue = Conf->CompoundAffixFlags + Conf->nCompoundAffixFlag;
+ newValue = ConfBuild->CompoundAffixFlags + ConfBuild->nCompoundAffixFlag;
- setCompoundAffixFlagValue(Conf, newValue, sbuf, val);
+ setCompoundAffixFlagValue(ConfBuild, newValue, sbuf, val);
- Conf->usecompound = true;
- Conf->nCompoundAffixFlag++;
+ ConfBuild->dict->usecompound = true;
+ ConfBuild->nCompoundAffixFlag++;
}
/*
@@ -1111,7 +1328,7 @@ addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
* flags s.
*/
static int
-getCompoundAffixFlagValue(IspellDict *Conf, char *s)
+getCompoundAffixFlagValue(IspellDictBuild *ConfBuild, char *s)
{
uint32 flag = 0;
CompoundAffixFlag *found,
@@ -1119,18 +1336,18 @@ getCompoundAffixFlagValue(IspellDict *Conf, char *s)
char sflag[BUFSIZ];
char *flagcur;
- if (Conf->nCompoundAffixFlag == 0)
+ if (ConfBuild->nCompoundAffixFlag == 0)
return 0;
flagcur = s;
while (*flagcur)
{
- getNextFlagFromString(Conf, &flagcur, sflag);
- setCompoundAffixFlagValue(Conf, &key, sflag, 0);
+ getNextFlagFromString(ConfBuild->dict->flagMode, &flagcur, sflag);
+ setCompoundAffixFlagValue(ConfBuild, &key, sflag, 0);
found = (CompoundAffixFlag *)
- bsearch(&key, (void *) Conf->CompoundAffixFlags,
- Conf->nCompoundAffixFlag, sizeof(CompoundAffixFlag),
+ bsearch(&key, (void *) ConfBuild->CompoundAffixFlags,
+ ConfBuild->nCompoundAffixFlag, sizeof(CompoundAffixFlag),
cmpcmdflag);
if (found != NULL)
flag |= found->value;
@@ -1142,14 +1359,13 @@ getCompoundAffixFlagValue(IspellDict *Conf, char *s)
/*
* Returns a flag set using the s parameter.
*
- * If Conf->useFlagAliases is true then the s parameter is index of the
- * Conf->AffixData array and function returns its entry.
- * Else function returns the s parameter.
+ * If useFlagAliases is true then the s parameter is index of the AffixData
+ * array and function returns its entry. Else function returns the s parameter.
*/
static char *
-getAffixFlagSet(IspellDict *Conf, char *s)
+getAffixFlagSet(IspellDictBuild *ConfBuild, char *s)
{
- if (Conf->useFlagAliases && *s != '\0')
+ if (ConfBuild->dict->useFlagAliases && *s != '\0')
{
int curaffix;
char *end;
@@ -1160,13 +1376,13 @@ getAffixFlagSet(IspellDict *Conf, char *s)
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix alias \"%s\"", s)));
- if (curaffix > 0 && curaffix <= Conf->nAffixData)
+ if (curaffix > 0 && curaffix <= ConfBuild->nAffixData)
/*
* Do not subtract 1 from curaffix because empty string was added
* in NIImportOOAffixes
*/
- return Conf->AffixData[curaffix];
+ return AffixDataGet(ConfBuild, curaffix);
else
return VoidString;
}
@@ -1177,11 +1393,11 @@ getAffixFlagSet(IspellDict *Conf, char *s)
/*
* Import an affix file that follows MySpell or Hunspell format.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* filename: path to the .affix file.
*/
static void
-NIImportOOAffixes(IspellDict *Conf, const char *filename)
+NIImportOOAffixes(IspellDictBuild *ConfBuild, const char *filename)
{
char type[BUFSIZ],
*ptype = NULL;
@@ -1193,17 +1409,16 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
char repl[BUFSIZ],
*prepl;
bool isSuffix = false;
- int naffix = 0,
- curaffix = 0;
+ int naffix = 0;
int sflaglen = 0;
char flagflags = 0;
tsearch_readline_state trst;
char *recoded;
/* read file to find any flag */
- Conf->usecompound = false;
- Conf->useFlagAliases = false;
- Conf->flagMode = FM_CHAR;
+ ConfBuild->dict->usecompound = false;
+ ConfBuild->dict->useFlagAliases = false;
+ ConfBuild->dict->flagMode = FM_CHAR;
if (!tsearch_readline_begin(&trst, filename))
ereport(ERROR,
@@ -1220,30 +1435,36 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
}
if (STRNCMP(recoded, "COMPOUNDFLAG") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDFLAG"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDFLAG"),
FF_COMPOUNDFLAG);
else if (STRNCMP(recoded, "COMPOUNDBEGIN") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDBEGIN"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDBEGIN"),
FF_COMPOUNDBEGIN);
else if (STRNCMP(recoded, "COMPOUNDLAST") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDLAST"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDLAST"),
FF_COMPOUNDLAST);
/* COMPOUNDLAST and COMPOUNDEND are synonyms */
else if (STRNCMP(recoded, "COMPOUNDEND") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDEND"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDEND"),
FF_COMPOUNDLAST);
else if (STRNCMP(recoded, "COMPOUNDMIDDLE") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDMIDDLE"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDMIDDLE"),
FF_COMPOUNDMIDDLE);
else if (STRNCMP(recoded, "ONLYINCOMPOUND") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("ONLYINCOMPOUND"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("ONLYINCOMPOUND"),
FF_COMPOUNDONLY);
else if (STRNCMP(recoded, "COMPOUNDPERMITFLAG") == 0)
- addCompoundAffixFlagValue(Conf,
+ addCompoundAffixFlagValue(ConfBuild,
recoded + strlen("COMPOUNDPERMITFLAG"),
FF_COMPOUNDPERMITFLAG);
else if (STRNCMP(recoded, "COMPOUNDFORBIDFLAG") == 0)
- addCompoundAffixFlagValue(Conf,
+ addCompoundAffixFlagValue(ConfBuild,
recoded + strlen("COMPOUNDFORBIDFLAG"),
FF_COMPOUNDFORBIDFLAG);
else if (STRNCMP(recoded, "FLAG") == 0)
@@ -1256,9 +1477,9 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
if (*s)
{
if (STRNCMP(s, "long") == 0)
- Conf->flagMode = FM_LONG;
+ ConfBuild->dict->flagMode = FM_LONG;
else if (STRNCMP(s, "num") == 0)
- Conf->flagMode = FM_NUM;
+ ConfBuild->dict->flagMode = FM_NUM;
else if (STRNCMP(s, "default") != 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
@@ -1272,8 +1493,8 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
}
tsearch_readline_end(&trst);
- if (Conf->nCompoundAffixFlag > 1)
- qsort((void *) Conf->CompoundAffixFlags, Conf->nCompoundAffixFlag,
+ if (ConfBuild->nCompoundAffixFlag > 1)
+ qsort((void *) ConfBuild->CompoundAffixFlags, ConfBuild->nCompoundAffixFlag,
sizeof(CompoundAffixFlag), cmpcmdflag);
if (!tsearch_readline_begin(&trst, filename))
@@ -1293,15 +1514,15 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
if (ptype)
pfree(ptype);
- ptype = lowerstr_ctx(Conf, type);
+ ptype = lowerstr_ctx(ConfBuild, type);
/* First try to parse AF parameter (alias compression) */
if (STRNCMP(ptype, "af") == 0)
{
/* First line is the number of aliases */
- if (!Conf->useFlagAliases)
+ if (!ConfBuild->dict->useFlagAliases)
{
- Conf->useFlagAliases = true;
+ ConfBuild->dict->useFlagAliases = true;
naffix = atoi(sflag);
if (naffix == 0)
ereport(ERROR,
@@ -1311,21 +1532,15 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
/* Also reserve place for empty flag set */
naffix++;
- Conf->AffixData = (char **) palloc0(naffix * sizeof(char *));
- Conf->lenAffixData = Conf->nAffixData = naffix;
+ InitAffixData(ConfBuild, naffix);
/* Add empty flag set into AffixData */
- Conf->AffixData[curaffix] = VoidString;
- curaffix++;
+ AddAffixSet(ConfBuild, VoidString, 0);
}
/* Other lines is aliases */
else
{
- if (curaffix < naffix)
- {
- Conf->AffixData[curaffix] = cpstrdup(Conf, sflag);
- curaffix++;
- }
+ AddAffixSet(ConfBuild, sflag, strlen(sflag));
}
goto nextline;
}
@@ -1336,8 +1551,8 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
sflaglen = strlen(sflag);
if (sflaglen == 0
- || (sflaglen > 1 && Conf->flagMode == FM_CHAR)
- || (sflaglen > 2 && Conf->flagMode == FM_LONG))
+ || (sflaglen > 1 && ConfBuild->dict->flagMode == FM_CHAR)
+ || (sflaglen > 2 && ConfBuild->dict->flagMode == FM_LONG))
goto nextline;
/*--------
@@ -1365,21 +1580,21 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
/* Get flags after '/' (flags are case sensitive) */
if ((ptr = strchr(repl, '/')) != NULL)
- aflg |= getCompoundAffixFlagValue(Conf,
- getAffixFlagSet(Conf,
+ aflg |= getCompoundAffixFlagValue(ConfBuild,
+ getAffixFlagSet(ConfBuild,
ptr + 1));
/* Get lowercased version of string before '/' */
- prepl = lowerstr_ctx(Conf, repl);
+ prepl = lowerstr_ctx(ConfBuild, repl);
if ((ptr = strchr(prepl, '/')) != NULL)
*ptr = '\0';
- pfind = lowerstr_ctx(Conf, find);
- pmask = lowerstr_ctx(Conf, mask);
+ pfind = lowerstr_ctx(ConfBuild, find);
+ pmask = lowerstr_ctx(ConfBuild, mask);
if (t_iseq(find, '0'))
*pfind = '\0';
if (t_iseq(repl, '0'))
*prepl = '\0';
- NIAddAffix(Conf, sflag, flagflags | aflg, pmask, pfind, prepl,
+ NIAddAffix(ConfBuild, sflag, flagflags | aflg, pmask, pfind, prepl,
isSuffix ? FF_SUFFIX : FF_PREFIX);
pfree(prepl);
pfree(pfind);
@@ -1405,7 +1620,7 @@ nextline:
* work to NIImportOOAffixes(), which will re-read the whole file.
*/
void
-NIImportAffixes(IspellDict *Conf, const char *filename)
+NIImportAffixes(IspellDictBuild *ConfBuild, const char *filename)
{
char *pstr = NULL;
char flag[BUFSIZ];
@@ -1426,9 +1641,9 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
errmsg("could not open affix file \"%s\": %m",
filename)));
- Conf->usecompound = false;
- Conf->useFlagAliases = false;
- Conf->flagMode = FM_CHAR;
+ ConfBuild->dict->usecompound = false;
+ ConfBuild->dict->useFlagAliases = false;
+ ConfBuild->dict->flagMode = FM_CHAR;
while ((recoded = tsearch_readline(&trst)) != NULL)
{
@@ -1450,10 +1665,8 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
s += pg_mblen(s);
if (*s && pg_mblen(s) == 1)
- {
- addCompoundAffixFlagValue(Conf, s, FF_COMPOUNDFLAG);
- Conf->usecompound = true;
- }
+ addCompoundAffixFlagValue(ConfBuild, s, FF_COMPOUNDFLAG);
+
oldformat = true;
goto nextline;
}
@@ -1526,7 +1739,8 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
if (!parse_affentry(pstr, mask, find, repl))
goto nextline;
- NIAddAffix(Conf, flag, flagflags, mask, find, repl, suffixes ? FF_SUFFIX : FF_PREFIX);
+ NIAddAffix(ConfBuild, flag, flagflags, mask, find, repl,
+ suffixes ? FF_SUFFIX : FF_PREFIX);
nextline:
pfree(recoded);
@@ -1545,53 +1759,48 @@ isnewformat:
errmsg("affix file contains both old-style and new-style commands")));
tsearch_readline_end(&trst);
- NIImportOOAffixes(Conf, filename);
+ NIImportOOAffixes(ConfBuild, filename);
}
/*
* Merges two affix flag sets and stores a new affix flag set into
- * Conf->AffixData.
+ * ConfBuild->AffixData.
*
* Returns index of a new affix flag set.
*/
static int
-MergeAffix(IspellDict *Conf, int a1, int a2)
+MergeAffix(IspellDictBuild *ConfBuild, int a1, int a2)
{
- char **ptr;
+ char *ptr;
+ uint32 len;
/* Do not merge affix flags if one of affix flags is empty */
- if (*Conf->AffixData[a1] == '\0')
+ if (*AffixDataGet(ConfBuild, a1) == '\0')
return a2;
- else if (*Conf->AffixData[a2] == '\0')
+ else if (*AffixDataGet(ConfBuild, a2) == '\0')
return a1;
- while (Conf->nAffixData + 1 >= Conf->lenAffixData)
- {
- Conf->lenAffixData *= 2;
- Conf->AffixData = (char **) repalloc(Conf->AffixData,
- sizeof(char *) * Conf->lenAffixData);
- }
-
- ptr = Conf->AffixData + Conf->nAffixData;
- if (Conf->flagMode == FM_NUM)
+ if (ConfBuild->dict->flagMode == FM_NUM)
{
- *ptr = cpalloc(strlen(Conf->AffixData[a1]) +
- strlen(Conf->AffixData[a2]) +
- 1 /* comma */ + 1 /* \0 */ );
- sprintf(*ptr, "%s,%s", Conf->AffixData[a1], Conf->AffixData[a2]);
+ len = strlen(AffixDataGet(ConfBuild, a1)) + 1 /* comma */ +
+ strlen(AffixDataGet(ConfBuild, a2));
+ ptr = tmpalloc(len + 1 /* \0 */);
+ sprintf(ptr, "%s,%s", AffixDataGet(ConfBuild, a1),
+ AffixDataGet(ConfBuild, a2));
}
else
{
- *ptr = cpalloc(strlen(Conf->AffixData[a1]) +
- strlen(Conf->AffixData[a2]) +
- 1 /* \0 */ );
- sprintf(*ptr, "%s%s", Conf->AffixData[a1], Conf->AffixData[a2]);
+ len = strlen(AffixDataGet(ConfBuild, a1)) +
+ strlen(AffixDataGet(ConfBuild, a2));
+ ptr = tmpalloc(len + 1 /* \0 */ );
+ sprintf(ptr, "%s%s", AffixDataGet(ConfBuild, a1),
+ AffixDataGet(ConfBuild, a2));
}
- ptr++;
- *ptr = NULL;
- Conf->nAffixData++;
- return Conf->nAffixData - 1;
+ AddAffixSet(ConfBuild, ptr, len);
+ pfree(ptr);
+
+ return ConfBuild->nAffixData - 1;
}
/*
@@ -1599,66 +1808,87 @@ MergeAffix(IspellDict *Conf, int a1, int a2)
* flags with the given index.
*/
static uint32
-makeCompoundFlags(IspellDict *Conf, int affix)
+makeCompoundFlags(IspellDictBuild *ConfBuild, int affix)
{
- char *str = Conf->AffixData[affix];
+ char *str = AffixDataGet(ConfBuild, affix);
- return (getCompoundAffixFlagValue(Conf, str) & FF_COMPOUNDFLAGMASK);
+ return (getCompoundAffixFlagValue(ConfBuild, str) & FF_COMPOUNDFLAGMASK);
}
/*
* Makes a prefix tree for the given level.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* low: lower index of the Conf->Spell array.
* high: upper index of the Conf->Spell array.
* level: current prefix tree level.
+ *
+ * Returns an offset of SPNode in DictNodes.
*/
-static SPNode *
-mkSPNode(IspellDict *Conf, int low, int high, int level)
+static uint32
+mkSPNode(IspellDictBuild *ConfBuild, int low, int high, int level)
{
int i;
int nchar = 0;
char lastchar = '\0';
+ uint32 rs_offset,
+ new_offset;
SPNode *rs;
SPNodeData *data;
+ int data_index = 0;
int lownew = low;
for (i = low; i < high; i++)
- if (Conf->Spell[i]->p.d.len > level && lastchar != Conf->Spell[i]->word[level])
+ if (ConfBuild->Spell[i]->p.d.len > level &&
+ lastchar != ConfBuild->Spell[i]->word[level])
{
nchar++;
- lastchar = Conf->Spell[i]->word[level];
+ lastchar = ConfBuild->Spell[i]->word[level];
}
if (!nchar)
- return NULL;
+ return ISPELL_INVALID_OFFSET;
- rs = (SPNode *) cpalloc0(SPNHDRSZ + nchar * sizeof(SPNodeData));
- rs->length = nchar;
+ rs_offset = AllocateSPNode(ConfBuild, nchar);
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
data = rs->data;
lastchar = '\0';
for (i = low; i < high; i++)
- if (Conf->Spell[i]->p.d.len > level)
+ if (ConfBuild->Spell[i]->p.d.len > level)
{
- if (lastchar != Conf->Spell[i]->word[level])
+ if (lastchar != ConfBuild->Spell[i]->word[level])
{
if (lastchar)
{
/* Next level of the prefix tree */
- data->node = mkSPNode(Conf, lownew, i, level + 1);
+ new_offset = mkSPNode(ConfBuild, lownew, i, level + 1);
+
+ /*
+ * ConfBuild->DictNodes can be repalloc'ed within
+ * mkSPNode(), so reinitialize pointers.
+ */
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
+
+ /* First save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ /* Work with next node */
+ data_index++;
+ Assert(data_index < nchar);
+ data = &(rs->data[data_index]);
+
lownew = i;
- data++;
}
- lastchar = Conf->Spell[i]->word[level];
+ lastchar = ConfBuild->Spell[i]->word[level];
}
- data->val = ((uint8 *) (Conf->Spell[i]->word))[level];
- if (Conf->Spell[i]->p.d.len == level + 1)
+ data->val = ((uint8 *) (ConfBuild->Spell[i]->word))[level];
+ if (ConfBuild->Spell[i]->p.d.len == level + 1)
{
bool clearCompoundOnly = false;
- if (data->isword && data->affix != Conf->Spell[i]->p.d.affix)
+ if (data->isword && data->affix != ConfBuild->Spell[i]->p.d.affix)
{
/*
* MergeAffix called a few times. If one of word is
@@ -1667,15 +1897,17 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
*/
clearCompoundOnly = (FF_COMPOUNDONLY & data->compoundflag
- & makeCompoundFlags(Conf, Conf->Spell[i]->p.d.affix))
+ & makeCompoundFlags(ConfBuild,
+ ConfBuild->Spell[i]->p.d.affix))
? false : true;
- data->affix = MergeAffix(Conf, data->affix, Conf->Spell[i]->p.d.affix);
+ data->affix = MergeAffix(ConfBuild, data->affix,
+ ConfBuild->Spell[i]->p.d.affix);
}
else
- data->affix = Conf->Spell[i]->p.d.affix;
+ data->affix = ConfBuild->Spell[i]->p.d.affix;
data->isword = 1;
- data->compoundflag = makeCompoundFlags(Conf, data->affix);
+ data->compoundflag = makeCompoundFlags(ConfBuild, data->affix);
if ((data->compoundflag & FF_COMPOUNDONLY) &&
(data->compoundflag & FF_COMPOUNDFLAG) == 0)
@@ -1687,9 +1919,19 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
}
/* Next level of the prefix tree */
- data->node = mkSPNode(Conf, lownew, high, level + 1);
+ new_offset = mkSPNode(ConfBuild, lownew, high, level + 1);
+
+ /*
+ * ConfBuild->DictNodes can be repalloc'ed within mkSPNode(), so
+ * reinitialize pointers.
+ */
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
- return rs;
+ /* Save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ return rs_offset;
}
/*
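Reviewer's note on the mkSPNode() hunk above: child links are stored as offsets into a growable node array, and the parent pointer is looked up again after every recursive call because the array may have been moved by repalloc. The sketch below shows that pattern in isolation. Arena, Node, arena_alloc_node() and the other names are hypothetical stand-ins for the patch's NodeArray, SPNode and AllocateSPNode(), whose definitions are not in this part of the diff, and plain realloc() is used in place of repalloc().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for ISPELL_INVALID_OFFSET. */
#define INVALID_OFFSET UINT32_MAX

/* Hypothetical growable byte arena standing in for NodeArray. */
typedef struct
{
	char	   *data;
	uint32_t	used;
	uint32_t	cap;
} Arena;

typedef struct
{
	uint32_t	length;		/* number of child slots */
	uint32_t	child[];	/* offsets of children inside the arena */
} Node;

/* Translate an offset into a pointer that is valid until the next allocation. */
static Node *
arena_get(Arena *a, uint32_t off)
{
	return (Node *) (a->data + off);
}

/* Reserve room for a node and return its offset, never its address. */
static uint32_t
arena_alloc_node(Arena *a, uint32_t nchildren)
{
	uint32_t	size = (uint32_t) (sizeof(Node) + nchildren * sizeof(uint32_t));
	uint32_t	off = a->used;
	uint32_t	i;
	Node	   *n;

	if (a->used + size > a->cap)
	{
		a->cap = (a->used + size) * 2;
		a->data = realloc(a->data, a->cap);	/* may move every node */
		assert(a->data != NULL);
	}
	a->used += size;

	n = arena_get(a, off);
	n->length = nchildren;
	for (i = 0; i < nchildren; i++)
		n->child[i] = INVALID_OFFSET;
	return off;
}

/* Build a complete binary tree of the given depth; returns the root offset. */
static uint32_t
build_tree(Arena *a, int depth)
{
	uint32_t	self;
	uint32_t	i;

	if (depth == 0)
		return INVALID_OFFSET;

	self = arena_alloc_node(a, 2);
	for (i = 0; i < 2; i++)
	{
		uint32_t	child = build_tree(a, depth - 1);

		/*
		 * The recursive call may have reallocated the arena, so the parent
		 * is re-derived from its offset instead of reusing an old pointer.
		 */
		arena_get(a, self)->child[i] = child;
	}
	return self;
}

/* Walk the tree through offsets only and count the reachable nodes. */
static uint32_t
count_nodes(Arena *a, uint32_t off)
{
	uint32_t	total = 1;
	uint32_t	i;

	if (off == INVALID_OFFSET)
		return 0;
	for (i = 0; i < arena_get(a, off)->length; i++)
		total += count_nodes(a, arena_get(a, off)->child[i]);
	return total;
}
```

Holding a Node pointer across build_tree() would be a use-after-free whenever the arena grows, which is the hazard the "reinitialize pointers" comments in the patch guard against.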
@@ -1697,7 +1939,7 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
* and affixes.
*/
void
-NISortDictionary(IspellDict *Conf)
+NISortDictionary(IspellDictBuild *ConfBuild)
{
int i;
int naffix = 0;
@@ -1706,81 +1948,81 @@ NISortDictionary(IspellDict *Conf)
/* compress affixes */
/*
- * If we use flag aliases then we need to use Conf->AffixData filled in
+ * If we use flag aliases then we need to use ConfBuild->AffixData filled in
* the NIImportOOAffixes().
*/
- if (Conf->useFlagAliases)
+ if (ConfBuild->dict->useFlagAliases)
{
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
char *end;
- if (*Conf->Spell[i]->p.flag != '\0')
+ if (*ConfBuild->Spell[i]->p.flag != '\0')
{
- curaffix = strtol(Conf->Spell[i]->p.flag, &end, 10);
- if (Conf->Spell[i]->p.flag == end || errno == ERANGE)
+ curaffix = strtol(ConfBuild->Spell[i]->p.flag, &end, 10);
+ if (ConfBuild->Spell[i]->p.flag == end || errno == ERANGE)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix alias \"%s\"",
- Conf->Spell[i]->p.flag)));
+ ConfBuild->Spell[i]->p.flag)));
}
else
{
/*
- * If Conf->Spell[i]->p.flag is empty, then get empty value of
- * Conf->AffixData (0 index).
+ * If ConfBuild->Spell[i]->p.flag is empty, then get empty
+ * value of ConfBuild->AffixData (0 index).
*/
curaffix = 0;
}
- Conf->Spell[i]->p.d.affix = curaffix;
- Conf->Spell[i]->p.d.len = strlen(Conf->Spell[i]->word);
+ ConfBuild->Spell[i]->p.d.affix = curaffix;
+ ConfBuild->Spell[i]->p.d.len = strlen(ConfBuild->Spell[i]->word);
}
}
- /* Otherwise fill Conf->AffixData here */
+ /* Otherwise fill ConfBuild->AffixData here */
else
{
/* Count the number of different flags used in the dictionary */
- qsort((void *) Conf->Spell, Conf->nspell, sizeof(SPELL *),
+ qsort((void *) ConfBuild->Spell, ConfBuild->nSpell, sizeof(SPELL *),
cmpspellaffix);
naffix = 0;
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
if (i == 0
- || strcmp(Conf->Spell[i]->p.flag, Conf->Spell[i - 1]->p.flag))
+ || strcmp(ConfBuild->Spell[i]->p.flag,
+ ConfBuild->Spell[i - 1]->p.flag))
naffix++;
}
/*
- * Fill in Conf->AffixData with the affixes that were used in the
- * dictionary. Replace textual flag-field of Conf->Spell entries with
- * indexes into Conf->AffixData array.
+ * Fill in AffixData with the affixes that were used in the
+ * dictionary. Replace textual flag-field of ConfBuild->Spell entries
+ * with indexes into ConfBuild->AffixData array.
*/
- Conf->AffixData = (char **) palloc0(naffix * sizeof(char *));
+ InitAffixData(ConfBuild, naffix);
curaffix = -1;
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
if (i == 0
- || strcmp(Conf->Spell[i]->p.flag, Conf->AffixData[curaffix]))
+ || strcmp(ConfBuild->Spell[i]->p.flag,
+ AffixDataGet(ConfBuild, curaffix)))
{
curaffix++;
Assert(curaffix < naffix);
- Conf->AffixData[curaffix] = cpstrdup(Conf,
- Conf->Spell[i]->p.flag);
+ AddAffixSet(ConfBuild, ConfBuild->Spell[i]->p.flag,
+ strlen(ConfBuild->Spell[i]->p.flag));
}
- Conf->Spell[i]->p.d.affix = curaffix;
- Conf->Spell[i]->p.d.len = strlen(Conf->Spell[i]->word);
+ ConfBuild->Spell[i]->p.d.affix = curaffix;
+ ConfBuild->Spell[i]->p.d.len = strlen(ConfBuild->Spell[i]->word);
}
-
- Conf->lenAffixData = Conf->nAffixData = naffix;
}
/* Start build a prefix tree */
- qsort((void *) Conf->Spell, Conf->nspell, sizeof(SPELL *), cmpspell);
- Conf->Dictionary = mkSPNode(Conf, 0, Conf->nspell, 0);
+ qsort((void *) ConfBuild->Spell, ConfBuild->nSpell, sizeof(SPELL *), cmpspell);
+ mkSPNode(ConfBuild, 0, ConfBuild->nSpell, 0);
}
/*
@@ -1788,83 +2030,104 @@ NISortDictionary(IspellDict *Conf)
* rule. Affixes with empty replace string do not include in the prefix tree.
* This affixes are included by mkVoidAffix().
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* low: lower index of the Conf->Affix array.
* high: upper index of the Conf->Affix array.
* level: current prefix tree level.
* type: FF_SUFFIX or FF_PREFIX.
+ *
+ * Returns an offset in nodes array.
*/
-static AffixNode *
-mkANode(IspellDict *Conf, int low, int high, int level, int type)
+static uint32
+mkANode(IspellDictBuild *ConfBuild, int low, int high, int level, int type)
{
int i;
int nchar = 0;
uint8 lastchar = '\0';
+ NodeArray *array;
+ uint32 rs_offset,
+ new_offset;
AffixNode *rs;
AffixNodeData *data;
+ int data_index = 0;
int lownew = low;
- int naff;
- AFFIX **aff;
for (i = low; i < high; i++)
- if (Conf->Affix[i].replen > level && lastchar != GETCHAR(Conf->Affix + i, level, type))
+ if (ConfBuild->Affix[i]->replen > level &&
+ lastchar != GETCHAR(ConfBuild->Affix[i], level, type))
{
nchar++;
- lastchar = GETCHAR(Conf->Affix + i, level, type);
+ lastchar = GETCHAR(ConfBuild->Affix[i], level, type);
}
if (!nchar)
- return NULL;
+ return ISPELL_INVALID_OFFSET;
- aff = (AFFIX **) tmpalloc(sizeof(AFFIX *) * (high - low + 1));
- naff = 0;
+ if (type == FF_SUFFIX)
+ array = &ConfBuild->SuffixNodes;
+ else
+ array = &ConfBuild->PrefixNodes;
- rs = (AffixNode *) cpalloc0(ANHRDSZ + nchar * sizeof(AffixNodeData));
- rs->length = nchar;
- data = rs->data;
+ rs_offset = AllocateAffixNode(ConfBuild, array, nchar);
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
+ data = (AffixNodeData *) rs->data;
lastchar = '\0';
for (i = low; i < high; i++)
- if (Conf->Affix[i].replen > level)
+ if (ConfBuild->Affix[i]->replen > level)
{
- if (lastchar != GETCHAR(Conf->Affix + i, level, type))
+ if (lastchar != GETCHAR(ConfBuild->Affix[i], level, type))
{
if (lastchar)
{
/* Next level of the prefix tree */
- data->node = mkANode(Conf, lownew, i, level + 1, type);
- if (naff)
- {
- data->naff = naff;
- data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * naff);
- memcpy(data->aff, aff, sizeof(AFFIX *) * naff);
- naff = 0;
- }
- data++;
+ new_offset = mkANode(ConfBuild, lownew, i, level + 1, type);
+
+ /*
+ * array can be repalloc'ed within mkANode(), so
+ * reinitialize pointers.
+ */
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
+
+ /* First save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ /* Handle next data node */
+ data_index++;
+ Assert(data_index < nchar);
+ data = &(rs->data[data_index]);
+
lownew = i;
}
- lastchar = GETCHAR(Conf->Affix + i, level, type);
+ lastchar = GETCHAR(ConfBuild->Affix[i], level, type);
}
- data->val = GETCHAR(Conf->Affix + i, level, type);
- if (Conf->Affix[i].replen == level + 1)
+ data->val = GETCHAR(ConfBuild->Affix[i], level, type);
+ if (ConfBuild->Affix[i]->replen == level + 1)
{ /* affix stopped */
- aff[naff++] = Conf->Affix + i;
+ if (data->affstart == ISPELL_INVALID_INDEX)
+ {
+ data->affstart = i;
+ data->affend = i;
+ }
+ else
+ data->affend = i;
}
}
/* Next level of the prefix tree */
- data->node = mkANode(Conf, lownew, high, level + 1, type);
- if (naff)
- {
- data->naff = naff;
- data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * naff);
- memcpy(data->aff, aff, sizeof(AFFIX *) * naff);
- naff = 0;
- }
+ new_offset = mkANode(ConfBuild, lownew, high, level + 1, type);
+
+ /*
+ * array can be repalloc'ed within mkANode(), so reinitialize pointers.
+ */
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
- pfree(aff);
+ /* Save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
- return rs;
+ return rs_offset;
}
/*
@@ -1872,137 +2135,154 @@ mkANode(IspellDict *Conf, int low, int high, int level, int type)
* for affixes which have empty replace string ("repl" field).
*/
static void
-mkVoidAffix(IspellDict *Conf, bool issuffix, int startsuffix)
+mkVoidAffix(IspellDictBuild *ConfBuild, bool issuffix, int startsuffix)
{
- int i,
- cnt = 0;
+ int i;
int start = (issuffix) ? startsuffix : 0;
- int end = (issuffix) ? Conf->naffixes : startsuffix;
- AffixNode *Affix = (AffixNode *) palloc0(ANHRDSZ + sizeof(AffixNodeData));
-
- Affix->length = 1;
- Affix->isvoid = 1;
+ int end = (issuffix) ? ConfBuild->nAffix : startsuffix;
+ uint32 node_offset;
+ NodeArray *array;
+ AffixNode *Affix;
+ AffixNodeData *AffixData;
if (issuffix)
- {
- Affix->data->node = Conf->Suffix;
- Conf->Suffix = Affix;
- }
+ array = &ConfBuild->SuffixNodes;
else
- {
- Affix->data->node = Conf->Prefix;
- Conf->Prefix = Affix;
- }
+ array = &ConfBuild->PrefixNodes;
- /* Count affixes with empty replace string */
- for (i = start; i < end; i++)
- if (Conf->Affix[i].replen == 0)
- cnt++;
-
- /* There is not affixes with empty replace string */
- if (cnt == 0)
- return;
+ node_offset = AllocateAffixNode(ConfBuild, array, 1);
+ Affix = (AffixNode *) NodeArrayGet(array, node_offset);
- Affix->data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * cnt);
- Affix->data->naff = (uint32) cnt;
+ Affix->isvoid = 1;
+ AffixData = (AffixNodeData *) Affix->data;
- cnt = 0;
for (i = start; i < end; i++)
- if (Conf->Affix[i].replen == 0)
+ if (ConfBuild->Affix[i]->replen == 0)
{
- Affix->data->aff[cnt] = Conf->Affix + i;
- cnt++;
+ if (AffixData->affstart == ISPELL_INVALID_INDEX)
+ {
+ AffixData->affstart = i;
+ AffixData->affend = i;
+ }
+ else
+ AffixData->affend = i;
}
}
/*
- * Checks if the affixflag is used by dictionary. Conf->AffixData does not
+ * Checks if the affixflag is used by dictionary. AffixData does not
* contain affixflag if this flag is not used actually by the .dict file.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* affixflag: affix flag.
*
- * Returns true if the Conf->AffixData array contains affixflag, otherwise
+ * Returns true if the ConfBuild->AffixData array contains affixflag, otherwise
* returns false.
*/
static bool
-isAffixInUse(IspellDict *Conf, char *affixflag)
+isAffixInUse(IspellDictBuild *ConfBuild, char *affixflag)
{
int i;
- for (i = 0; i < Conf->nAffixData; i++)
- if (IsAffixFlagInUse(Conf, i, affixflag))
+ for (i = 0; i < ConfBuild->nAffixData; i++)
+ if (IsAffixFlagInUse(ConfBuild->dict->flagMode,
+ AffixDataGet(ConfBuild, i), affixflag))
return true;
return false;
}
/*
- * Builds Conf->Prefix and Conf->Suffix trees from the imported affixes.
+ * Builds Prefix and Suffix trees from the imported affixes.
*/
void
-NISortAffixes(IspellDict *Conf)
+NISortAffixes(IspellDictBuild *ConfBuild)
{
AFFIX *Affix;
+ AffixNode *voidPrefix,
+ *voidSuffix;
size_t i;
CMPDAffix *ptr;
- int firstsuffix = Conf->naffixes;
+ int firstsuffix = ConfBuild->nAffix;
+ uint32 prefix_offset,
+ suffix_offset;
- if (Conf->naffixes == 0)
+ if (ConfBuild->nAffix == 0)
return;
/* Store compound affixes in the Conf->CompoundAffix array */
- if (Conf->naffixes > 1)
- qsort((void *) Conf->Affix, Conf->naffixes, sizeof(AFFIX), cmpaffix);
- Conf->CompoundAffix = ptr = (CMPDAffix *) palloc(sizeof(CMPDAffix) * Conf->naffixes);
- ptr->affix = NULL;
-
- for (i = 0; i < Conf->naffixes; i++)
+ if (ConfBuild->nAffix > 1)
+ qsort((void *) ConfBuild->Affix, ConfBuild->nAffix,
+ sizeof(AFFIX *), cmpaffix);
+ ConfBuild->nCompoundAffix = ConfBuild->nAffix;
+ ConfBuild->CompoundAffix = ptr =
+ (CMPDAffix *) tmpalloc(sizeof(CMPDAffix) * ConfBuild->nCompoundAffix);
+ ptr->affix = ISPELL_INVALID_INDEX;
+
+ for (i = 0; i < ConfBuild->nAffix; i++)
{
- Affix = &(((AFFIX *) Conf->Affix)[i]);
+ Affix = ConfBuild->Affix[i];
if (Affix->type == FF_SUFFIX && i < firstsuffix)
firstsuffix = i;
if ((Affix->flagflags & FF_COMPOUNDFLAG) && Affix->replen > 0 &&
- isAffixInUse(Conf, Affix->flag))
+ isAffixInUse(ConfBuild, AffixFieldFlag(Affix)))
{
- if (ptr == Conf->CompoundAffix ||
+ if (ptr == ConfBuild->CompoundAffix ||
ptr->issuffix != (ptr - 1)->issuffix ||
- strbncmp((const unsigned char *) (ptr - 1)->affix,
- (const unsigned char *) Affix->repl,
+ strbncmp((const unsigned char *) AffixFieldRepl(ConfBuild->Affix[(ptr - 1)->affix]),
+ (const unsigned char *) AffixFieldRepl(Affix),
(ptr - 1)->len))
{
/* leave only unique and minimals suffixes */
- ptr->affix = Affix->repl;
+ ptr->affix = i;
ptr->len = Affix->replen;
ptr->issuffix = (Affix->type == FF_SUFFIX);
ptr++;
}
}
}
- ptr->affix = NULL;
- Conf->CompoundAffix = (CMPDAffix *) repalloc(Conf->CompoundAffix, sizeof(CMPDAffix) * (ptr - Conf->CompoundAffix + 1));
+ ptr->affix = ISPELL_INVALID_INDEX;
+ ConfBuild->nCompoundAffix = ptr - ConfBuild->CompoundAffix + 1;
+ ConfBuild->CompoundAffix = (CMPDAffix *) repalloc(ConfBuild->CompoundAffix,
+ sizeof(CMPDAffix) * (ConfBuild->nCompoundAffix));
/* Start build a prefix tree */
- Conf->Prefix = mkANode(Conf, 0, firstsuffix, 0, FF_PREFIX);
- Conf->Suffix = mkANode(Conf, firstsuffix, Conf->naffixes, 0, FF_SUFFIX);
- mkVoidAffix(Conf, true, firstsuffix);
- mkVoidAffix(Conf, false, firstsuffix);
+ mkVoidAffix(ConfBuild, true, firstsuffix);
+ mkVoidAffix(ConfBuild, false, firstsuffix);
+
+ prefix_offset = mkANode(ConfBuild, 0, firstsuffix, 0, FF_PREFIX);
+ suffix_offset = mkANode(ConfBuild, firstsuffix, ConfBuild->nAffix, 0,
+ FF_SUFFIX);
+
+ /* Adjust offsets of new nodes for nodes of void affixes */
+ voidPrefix = (AffixNode *) NodeArrayGet(&ConfBuild->PrefixNodes, 0);
+ voidPrefix->data[0].node_offset = prefix_offset;
+
+ voidSuffix = (AffixNode *) NodeArrayGet(&ConfBuild->SuffixNodes, 0);
+ voidSuffix->data[0].node_offset = suffix_offset;
}
static AffixNodeData *
-FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
+FindAffixes(IspellDictData *dict, AffixNode *node, const char *word, int wrdlen,
+ int *level, int type)
{
+ AffixNode *node_start;
AffixNodeData *StopLow,
*StopHigh,
*StopMiddle;
uint8 symbol;
+ if (type == FF_PREFIX)
+ node_start = (AffixNode *) DictPrefixNodes(dict);
+ else
+ node_start = (AffixNode *) DictSuffixNodes(dict);
+
if (node->isvoid)
{ /* search void affixes */
- if (node->data->naff)
+ if (node->data->affstart != ISPELL_INVALID_INDEX)
return node->data;
- node = node->data->node;
+ node = (AffixNode *) DictNodeGet(node_start, node->data->node_offset);
}
while (node && *level < wrdlen)
@@ -2017,9 +2297,10 @@ FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
if (StopMiddle->val == symbol)
{
(*level)++;
- if (StopMiddle->naff)
+ if (StopMiddle->affstart != ISPELL_INVALID_INDEX)
return StopMiddle;
- node = StopMiddle->node;
+ node = (AffixNode *) DictNodeGet(node_start,
+ StopMiddle->node_offset);
break;
}
else if (StopMiddle->val < symbol)
@@ -2033,8 +2314,67 @@ FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
return NULL;
}
+/*
+ * Compile regular expression on first use and store it within reg.
+ */
+static void
+CompileAffixReg(AffixReg *reg, bool isregis, int type,
+ const char *mask, int masklen, MemoryContext dictCtx)
+{
+ MemoryContext oldcontext;
+
+ Assert(dictCtx);
+
+ /*
+ * Switch to memory context of the dictionary, so compiled expression can be
+ * used in other queries.
+ */
+ oldcontext = MemoryContextSwitchTo(dictCtx);
+
+ if (isregis)
+ RS_compile(&reg->r.regis, (type == FF_SUFFIX), mask);
+ else
+ {
+ int wmasklen;
+ int err;
+ pg_wchar *wmask;
+ char *tmask;
+
+ tmask = (char *) palloc(masklen + 3);
+ if (type == FF_SUFFIX)
+ sprintf(tmask, "%s$", mask);
+ else
+ sprintf(tmask, "^%s", mask);
+
+ masklen = strlen(tmask);
+ wmask = (pg_wchar *) palloc((masklen + 1) * sizeof(pg_wchar));
+ wmasklen = pg_mb2wchar_with_len(tmask, wmask, masklen);
+
+ err = pg_regcomp(&(reg->r.regex), wmask, wmasklen,
+ REG_ADVANCED | REG_NOSUB,
+ DEFAULT_COLLATION_OID);
+ if (err)
+ {
+ char errstr[100];
+
+ pg_regerror(err, &(reg->r.regex), errstr, sizeof(errstr));
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
+ errmsg("invalid regular expression: %s", errstr)));
+ }
+
+ pfree(wmask);
+ pfree(tmask);
+ }
+
+ reg->iscompiled = true;
+
+ MemoryContextSwitchTo(oldcontext);
+}
+
static char *
-CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *newword, int *baselen)
+CheckAffix(const char *word, size_t len, AFFIX *Affix, AffixReg *reg,
+ int flagflags, char *newword, int *baselen, MemoryContext dictCtx)
{
/*
* Check compound allow flags
@@ -2074,7 +2414,7 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
if (Affix->type == FF_SUFFIX)
{
strcpy(newword, word);
- strcpy(newword + len - Affix->replen, Affix->find);
+ strcpy(newword + len - Affix->replen, AffixFieldFind(Affix));
if (baselen) /* store length of non-changed part of word */
*baselen = len - Affix->replen;
}
@@ -2084,9 +2424,9 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
* if prefix is an all non-changed part's length then all word
* contains only prefix and suffix, so out
*/
- if (baselen && *baselen + strlen(Affix->find) <= Affix->replen)
+ if (baselen && *baselen + Affix->findlen <= Affix->replen)
return NULL;
- strcpy(newword, Affix->find);
+ strcpy(newword, AffixFieldFind(Affix));
strcat(newword, word + Affix->replen);
}
@@ -2097,7 +2437,12 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
return newword;
else if (Affix->isregis)
{
- if (RS_execute(&(Affix->reg.regis), newword))
+ /* Compile the regular expression on first demand */
+ if (!reg->iscompiled)
+ CompileAffixReg(reg, Affix->isregis, Affix->type,
+ AffixFieldMask(Affix), Affix->masklen, dictCtx);
+
+ if (RS_execute(&(reg->r.regis), newword))
return newword;
}
else
@@ -2107,12 +2452,17 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
size_t data_len;
int newword_len;
+ /* Compile the regular expression on first demand */
+ if (!reg->iscompiled)
+ CompileAffixReg(reg, Affix->isregis, Affix->type,
+ AffixFieldMask(Affix), Affix->masklen, dictCtx);
+
/* Convert data string to wide characters */
newword_len = strlen(newword);
data = (pg_wchar *) palloc((newword_len + 1) * sizeof(pg_wchar));
data_len = pg_mb2wchar_with_len(newword, data, newword_len);
- if (!(err = pg_regexec(&(Affix->reg.regex), data, data_len, 0, NULL, 0, NULL, 0)))
+ if (!(err = pg_regexec(&(reg->r.regex), data, data_len, 0, NULL, 0, NULL, 0)))
{
pfree(data);
return newword;
@@ -2151,7 +2501,7 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
char **cur;
char newword[2 * MAXNORMLEN] = "";
char pnewword[2 * MAXNORMLEN] = "";
- AffixNode *snode = Conf->Suffix,
+ AffixNode *snode = (AffixNode *) DictSuffixNodes(Conf->dict),
*pnode;
int i,
j;
@@ -2163,7 +2513,7 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
/* Check that the word itself is normal form */
- if (FindWord(Conf, word, VoidString, flag))
+ if (FindWord(Conf->dict, word, VoidString, flag))
{
*cur = pstrdup(word);
cur++;
@@ -2171,23 +2521,29 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
}
/* Find all other NORMAL forms of the 'word' (check only prefix) */
- pnode = Conf->Prefix;
+ pnode = (AffixNode *) DictPrefixNodes(Conf->dict);
plevel = 0;
while (pnode)
{
- prefix = FindAffixes(pnode, word, wrdlen, &plevel, FF_PREFIX);
+ prefix = FindAffixes(Conf->dict, pnode, word, wrdlen, &plevel, FF_PREFIX);
if (!prefix)
break;
- for (j = 0; j < prefix->naff; j++)
+ for (j = prefix->affstart; j <= prefix->affend; j++)
{
- if (CheckAffix(word, wrdlen, prefix->aff[j], flag, newword, NULL))
+ AFFIX *affix = (AFFIX *) DictAffixGet(Conf->dict, j);
+ AffixReg *reg = &(Conf->reg[j]);
+
+ if (affix &&
+ CheckAffix(word, wrdlen, affix, reg, flag, newword, NULL,
+ Conf->dictCtx))
{
/* prefix success */
- if (FindWord(Conf, newword, prefix->aff[j]->flag, flag))
+ if (FindWord(Conf->dict, newword, AffixFieldFlag(affix), flag))
cur += addToResult(forms, cur, newword);
}
}
- pnode = prefix->node;
+ pnode = (AffixNode *) DictNodeGet(DictPrefixNodes(Conf->dict),
+ prefix->node_offset);
}
/*
@@ -2199,45 +2555,59 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
int baselen = 0;
/* find possible suffix */
- suffix = FindAffixes(snode, word, wrdlen, &slevel, FF_SUFFIX);
+ suffix = FindAffixes(Conf->dict, snode, word, wrdlen, &slevel,
+ FF_SUFFIX);
if (!suffix)
break;
/* foreach suffix check affix */
- for (i = 0; i < suffix->naff; i++)
+ for (i = suffix->affstart; i <= suffix->affend; i++)
{
- if (CheckAffix(word, wrdlen, suffix->aff[i], flag, newword, &baselen))
+ AFFIX *sufentry = (AFFIX *) DictAffixGet(Conf->dict, i);
+ AffixReg *sufreg = &(Conf->reg[i]);
+
+ if (sufentry &&
+ CheckAffix(word, wrdlen, sufentry, sufreg, flag, newword, &baselen,
+ Conf->dictCtx))
{
/* suffix success */
- if (FindWord(Conf, newword, suffix->aff[i]->flag, flag))
+ if (FindWord(Conf->dict, newword, AffixFieldFlag(sufentry), flag))
cur += addToResult(forms, cur, newword);
/* now we will look changed word with prefixes */
- pnode = Conf->Prefix;
+ pnode = (AffixNode *) DictPrefixNodes(Conf->dict);
plevel = 0;
swrdlen = strlen(newword);
while (pnode)
{
- prefix = FindAffixes(pnode, newword, swrdlen, &plevel, FF_PREFIX);
+ prefix = FindAffixes(Conf->dict, pnode, newword, swrdlen,
+ &plevel, FF_PREFIX);
if (!prefix)
break;
- for (j = 0; j < prefix->naff; j++)
+ for (j = prefix->affstart; j <= prefix->affend; j++)
{
- if (CheckAffix(newword, swrdlen, prefix->aff[j], flag, pnewword, &baselen))
+ AFFIX *prefentry = (AFFIX *) DictAffixGet(Conf->dict, j);
+ AffixReg *prefreg = &(Conf->reg[j]);
+
+ if (prefentry &&
+ CheckAffix(newword, swrdlen, prefentry, prefreg,
+ flag, pnewword, &baselen, Conf->dictCtx))
{
/* prefix success */
- char *ff = (prefix->aff[j]->flagflags & suffix->aff[i]->flagflags & FF_CROSSPRODUCT) ?
- VoidString : prefix->aff[j]->flag;
+ char *ff = (prefentry->flagflags & sufentry->flagflags & FF_CROSSPRODUCT) ?
+ VoidString : AffixFieldFlag(prefentry);
- if (FindWord(Conf, pnewword, ff, flag))
+ if (FindWord(Conf->dict, pnewword, ff, flag))
cur += addToResult(forms, cur, pnewword);
}
}
- pnode = prefix->node;
+ pnode = (AffixNode *) DictNodeGet(DictPrefixNodes(Conf->dict),
+ prefix->node_offset);
}
}
}
- snode = suffix->node;
+ snode = (AffixNode *) DictNodeGet(DictSuffixNodes(Conf->dict),
+ suffix->node_offset);
}
if (cur == forms)
@@ -2257,7 +2627,8 @@ typedef struct SplitVar
} SplitVar;
static int
-CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
+CheckCompoundAffixes(IspellDictData *dict, CMPDAffix **ptr,
+ char *word, int len, bool CheckInPlace)
{
bool issuffix;
@@ -2267,9 +2638,12 @@ CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
if (CheckInPlace)
{
- while ((*ptr)->affix)
+ while ((*ptr)->affix != ISPELL_INVALID_INDEX)
{
- if (len > (*ptr)->len && strncmp((*ptr)->affix, word, (*ptr)->len) == 0)
+ AFFIX *affix = (AFFIX *) DictAffixGet(dict, (*ptr)->affix);
+
+ if (len > (*ptr)->len &&
+ strncmp(AffixFieldRepl(affix), word, (*ptr)->len) == 0)
{
len = (*ptr)->len;
issuffix = (*ptr)->issuffix;
@@ -2283,9 +2657,12 @@ CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
{
char *affbegin;
- while ((*ptr)->affix)
+ while ((*ptr)->affix != ISPELL_INVALID_INDEX)
{
- if (len > (*ptr)->len && (affbegin = strstr(word, (*ptr)->affix)) != NULL)
+ AFFIX *affix = (AFFIX *) DictAffixGet(dict, (*ptr)->affix);
+
+ if (len > (*ptr)->len &&
+ (affbegin = strstr(word, AffixFieldRepl(affix))) != NULL)
{
len = (*ptr)->len + (affbegin - word);
issuffix = (*ptr)->issuffix;
@@ -2337,13 +2714,14 @@ AddStem(SplitVar *v, char *word)
}
static SplitVar *
-SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int wordlen, int startpos, int minpos)
+SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig,
+ char *word, int wordlen, int startpos, int minpos)
{
SplitVar *var = NULL;
SPNodeData *StopLow,
*StopHigh,
*StopMiddle = NULL;
- SPNode *node = (snode) ? snode : Conf->Dictionary;
+ SPNode *node = (snode) ? snode : (SPNode *) DictDictNodes(Conf->dict);
int level = (snode) ? minpos : startpos; /* recursive
* minpos==level */
int lenaff;
@@ -2358,8 +2736,11 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
while (level < wordlen)
{
/* find word with epenthetic or/and compound affix */
- caff = Conf->CompoundAffix;
- while (level > startpos && (lenaff = CheckCompoundAffixes(&caff, word + level, wordlen - level, (node) ? true : false)) >= 0)
+ caff = (CMPDAffix *) DictCompoundAffix(Conf->dict);
+ while (level > startpos &&
+ (lenaff = CheckCompoundAffixes(Conf->dict, &caff,
+ word + level, wordlen - level,
+ (node) ? true : false)) >= 0)
{
/*
* there is one of compound affixes, so check word for existings
@@ -2406,7 +2787,8 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
while (ptr->next)
ptr = ptr->next;
- ptr->next = SplitToVariants(Conf, NULL, new, word, wordlen, startpos + lenaff, startpos + lenaff);
+ ptr->next = SplitToVariants(Conf, NULL, new, word, wordlen,
+ startpos + lenaff, startpos + lenaff);
pfree(new->stem);
pfree(new);
@@ -2465,13 +2847,14 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
/* we can find next word */
level++;
AddStem(var, pnstrdup(word + startpos, level - startpos));
- node = Conf->Dictionary;
+ node = (SPNode *) DictDictNodes(Conf->dict);
startpos = level;
continue;
}
}
}
- node = StopMiddle->node;
+ node = (SPNode *) DictNodeGet(DictDictNodes(Conf->dict),
+ StopMiddle->node_offset);
}
else
node = NULL;
@@ -2521,7 +2904,7 @@ NINormalizeWord(IspellDict *Conf, char *word)
pfree(res);
}
- if (Conf->usecompound)
+ if (Conf->dict->usecompound)
{
int wordlen = strlen(word);
SplitVar *ptr,
diff --git a/src/include/tsearch/dicts/spell.h b/src/include/tsearch/dicts/spell.h
index 210f97dda9..b40cf379eb 100644
--- a/src/include/tsearch/dicts/spell.h
+++ b/src/include/tsearch/dicts/spell.h
@@ -18,21 +18,23 @@
#include "tsearch/dicts/regis.h"
#include "tsearch/ts_public.h"
+#define ISPELL_INVALID_INDEX (0x7FFFF)
+#define ISPELL_INVALID_OFFSET (0xFFFFFFFF)
+
/*
* SPNode and SPNodeData are used to represent prefix tree (Trie) to store
* a words list.
*/
-struct SPNode;
-
typedef struct
{
uint32 val:8,
isword:1,
/* Stores compound flags listed below */
compoundflag:4,
- /* Reference to an entry of the AffixData field */
+ /* Index of an entry of the AffixData field */
affix:19;
- struct SPNode *node;
+ /* Offset to a node of the DictNodes field */
+ uint32 node_offset;
} SPNodeData;
/*
@@ -86,21 +88,55 @@ typedef struct spell_struct
*/
typedef struct aff_struct
{
- char *flag;
/* FF_SUFFIX or FF_PREFIX */
- uint32 type:1,
+ uint16 type:1,
flagflags:7,
issimple:1,
isregis:1,
- replen:14;
- char *find;
- char *repl;
+ flaglen:2;
+
+ /* 8 bits could be too much for repl, find and mask, but who knows */
+ uint8 replen;
+ uint8 findlen;
+ uint8 masklen;
+
+ /*
+ * fields stores the following data (each ends with \0):
+ * - repl
+ * - find
+ * - mask
+ * - flag - one character (if FM_CHAR),
+ * two characters (if FM_LONG),
+ * number, >= 0 and < 65536 (if FM_NUM).
+ */
+ char fields[FLEXIBLE_ARRAY_MEMBER];
+} AFFIX;
+
+#define AF_FLAG_MAXSIZE 5 /* strlen(65536) */
+#define AF_REPL_MAXSIZE 255 /* 8 bits */
+
+#define AFFIXHDRSZ (offsetof(AFFIX, fields))
+
+#define AffixFieldRepl(af) ((af)->fields)
+#define AffixFieldFind(af) ((af)->fields + (af)->replen + 1)
+#define AffixFieldMask(af) (AffixFieldFind(af) + (af)->findlen + 1)
+#define AffixFieldFlag(af) (AffixFieldMask(af) + (af)->masklen + 1)
+#define AffixGetSize(af) (AFFIXHDRSZ + (af)->replen + 1 + (af)->findlen + 1 \
+ + (af)->masklen + 1 + strlen(AffixFieldFlag(af)) + 1)
+
+/*
+ * Stores compiled regular expression of affix. AffixReg uses mask field of
+ * AFFIX as a regular expression.
+ */
+typedef struct AffixReg
+{
+ bool iscompiled;
union
{
regex_t regex;
Regis regis;
- } reg;
-} AFFIX;
+ } r;
+} AffixReg;
/*
* affixes use dictionary flags too
@@ -120,14 +156,13 @@ typedef struct aff_struct
* AffixNode and AffixNodeData are used to represent prefix tree (Trie) to store
* an affix list.
*/
-struct AffixNode;
-
typedef struct
{
- uint32 val:8,
- naff:24;
- AFFIX **aff;
- struct AffixNode *node;
+ uint8 val;
+ uint32 affstart;
+ uint32 affend;
+ /* Offset to a node of the PrefixNodes or SuffixNodes field */
+ uint32 node_offset;
} AffixNodeData;
typedef struct AffixNode
@@ -139,9 +174,19 @@ typedef struct AffixNode
#define ANHRDSZ (offsetof(AffixNode, data))
+typedef struct NodeArray
+{
+ char *Nodes;
+ uint32 NodesSize; /* allocated size of Nodes */
+ uint32 NodesEnd; /* end of data in Nodes */
+} NodeArray;
+
+#define NodeArrayGet(na, of) (((of) == ISPELL_INVALID_OFFSET) ? NULL : (na)->Nodes + (of))
+
typedef struct
{
- char *affix;
+ /* Index of an affix of the Affix field */
+ uint32 affix;
int len;
bool issuffix;
} CMPDAffix;
@@ -176,30 +221,71 @@ typedef struct CompoundAffixFlag
#define FLAGNUM_MAXSIZE (1 << 16)
-typedef struct
+typedef struct IspellDictData
{
- int maffixes;
- int naffixes;
- AFFIX *Affix;
-
- AffixNode *Suffix;
- AffixNode *Prefix;
+ FlagMode flagMode;
+ bool usecompound;
- SPNode *Dictionary;
- /* Array of sets of affixes */
- char **AffixData;
- int lenAffixData;
- int nAffixData;
bool useFlagAliases;
- CMPDAffix *CompoundAffix;
+ uint32 nAffixData;
+ uint32 AffixDataStart;
- bool usecompound;
- FlagMode flagMode;
+ uint32 AffixOffsetStart;
+ uint32 AffixStart;
+ uint32 nAffix;
+
+ uint32 DictNodesStart;
+ uint32 PrefixNodesStart;
+ uint32 SuffixNodesStart;
+
+ uint32 CompoundAffixStart;
/*
- * All follow fields are actually needed only for initialization
+ * data stores:
+ * - AffixData - array of affix sets
+ * - Affix - sorted array of affixes
+ * - DictNodes - prefix tree of a word list
+ * - PrefixNodes - prefix tree of a prefix list
+ * - SuffixNodes - prefix tree of a suffix list
+ * - CompoundAffix - array of compound affixes
*/
+ char data[FLEXIBLE_ARRAY_MEMBER];
+} IspellDictData;
+
+#define IspellDictDataHdrSize (offsetof(IspellDictData, data))
+
+#define DictAffixDataOffset(d) ((d)->data)
+#define DictAffixData(d) ((d)->data + (d)->AffixDataStart)
+#define DictAffixDataGet(d, i) (((i) == ISPELL_INVALID_INDEX) ? NULL : \
+ DictAffixData(d) + ((uint32 *) DictAffixDataOffset(d))[i])
+
+#define DictAffixOffset(d) ((d)->data + (d)->AffixOffsetStart)
+#define DictAffix(d) ((d)->data + (d)->AffixStart)
+#define DictAffixGet(d, i) (((i) == ISPELL_INVALID_INDEX) ? NULL : \
+ DictAffix(d) + ((uint32 *) DictAffixOffset(d))[i])
+
+#define DictDictNodes(d) ((d)->data + (d)->DictNodesStart)
+#define DictPrefixNodes(d) ((d)->data + (d)->PrefixNodesStart)
+#define DictSuffixNodes(d) ((d)->data + (d)->SuffixNodesStart)
+#define DictNodeGet(node_start, of) (((of) == ISPELL_INVALID_OFFSET) ? NULL : \
+ (char *) (node_start) + (of))
+
+#define DictCompoundAffix(d) ((d)->data + (d)->CompoundAffixStart)
+
+/*
+ * IspellDictBuild is used to initialize IspellDictData struct. This is a
+ * temporary structure which is set up by NIStartBuild() and released by
+ * NIFinishBuild().
+ */
+typedef struct IspellDictBuild
+{
+ MemoryContext buildCxt; /* temp context for construction */
+
+ IspellDictData *dict;
+ uint32 dict_size;
+
+ /* Temporary data */
/* Array of Hunspell options in affix file */
CompoundAffixFlag *CompoundAffixFlags;
@@ -208,29 +294,73 @@ typedef struct
/* allocated length of CompoundAffixFlags array */
int mCompoundAffixFlag;
- /*
- * Remaining fields are only used during dictionary construction; they are
- * set up by NIStartBuild and cleared by NIFinishBuild.
- */
- MemoryContext buildCxt; /* temp context for construction */
-
- /* Temporary array of all words in the dict file */
+ /* Array of all words in the dict file */
SPELL **Spell;
- int nspell; /* number of valid entries in Spell array */
- int mspell; /* allocated length of Spell array */
+ int nSpell; /* number of valid entries in Spell array */
+ int mSpell; /* allocated length of Spell array */
+
+ /* Array of all affixes in the aff file */
+ AFFIX **Affix;
+ int nAffix; /* number of valid entries in Affix array */
+ int mAffix; /* allocated length of Affix array */
+ uint32 AffixSize;
+
+ /* Data for IspellDictData */
+
+ /* Array of sets of affixes */
+ uint32 *AffixDataOffset;
+ int nAffixData; /* number of affix sets */
+ int mAffixData; /* allocated number of affix sets */
+ char *AffixData;
+ uint32 AffixDataSize; /* allocated size of AffixData */
+ uint32 AffixDataEnd; /* end of data in AffixData */
+
+ /* Prefix tree which stores a word list */
+ NodeArray DictNodes;
+
+ /* Prefix tree which stores a prefix list */
+ NodeArray PrefixNodes;
+
+ /* Prefix tree which stores a suffix list */
+ NodeArray SuffixNodes;
- /* These are used to allocate "compact" data without palloc overhead */
- char *firstfree; /* first free address (always maxaligned) */
- size_t avail; /* free space remaining at firstfree */
+ /* Array of compound affixes */
+ CMPDAffix *CompoundAffix;
+ int nCompoundAffix; /* number of entries of CompoundAffix */
+} IspellDictBuild;
+
+#define AffixDataGet(d, i) ((d)->AffixData + (d)->AffixDataOffset[i])
+
+/*
+ * IspellDict is used within NINormalizeWord.
+ */
+typedef struct IspellDict
+{
+ /*
+ * Pointer to a DSM location of IspellDictData. Should be retrieved on
+ * every dispell_lexize() call.
+ */
+ IspellDictData *dict;
+ /*
+ * Array of regular expressions of affixes. Each regular expression is
+ * compiled only on demand.
+ */
+ AffixReg *reg;
+ /*
+ * Memory context for compiling regular expressions.
+ */
+ MemoryContext dictCtx;
} IspellDict;
extern TSLexeme *NINormalizeWord(IspellDict *Conf, char *word);
-extern void NIStartBuild(IspellDict *Conf);
-extern void NIImportAffixes(IspellDict *Conf, const char *filename);
-extern void NIImportDictionary(IspellDict *Conf, const char *filename);
-extern void NISortDictionary(IspellDict *Conf);
-extern void NISortAffixes(IspellDict *Conf);
-extern void NIFinishBuild(IspellDict *Conf);
+extern void NIStartBuild(IspellDictBuild *ConfBuild);
+extern void NIImportAffixes(IspellDictBuild *ConfBuild, const char *filename);
+extern void NIImportDictionary(IspellDictBuild *ConfBuild,
+ const char *filename);
+extern void NISortDictionary(IspellDictBuild *ConfBuild);
+extern void NISortAffixes(IspellDictBuild *ConfBuild);
+extern void NICopyData(IspellDictBuild *ConfBuild);
+extern void NIFinishBuild(IspellDictBuild *ConfBuild);
#endif
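The core idea of the patch above is that tree nodes refer to each other by uint32 offsets into a flat buffer instead of by pointers, so the data can be copied into a DSM segment as-is. One consequence, noted in the mkANode() comments, is that a pointer obtained from an offset becomes stale as soon as the buffer grows. The following is only a minimal standalone sketch of that technique: DemoArray, DemoNode, demo_alloc(), demo_get() and demo_build() are hypothetical names made up for illustration, and realloc() stands in for repalloc(). It is not the patch's NodeArray or AllocateAffixNode() code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Plays the role of ISPELL_INVALID_OFFSET in this sketch. */
#define INVALID_OFFSET 0xFFFFFFFFu

/* Hypothetical, simplified stand-in for a growable node buffer. */
typedef struct
{
    char       *nodes;      /* flat buffer holding all nodes */
    uint32_t    size;       /* allocated bytes */
    uint32_t    end;        /* used bytes */
} DemoArray;

/* Hypothetical node: a value plus the offset of its child node. */
typedef struct
{
    int         val;
    uint32_t    child_offset;
} DemoNode;

/* Translate an offset into a pointer; valid only until the next allocation. */
static DemoNode *
demo_get(DemoArray *a, uint32_t off)
{
    assert(off == INVALID_OFFSET || off + sizeof(DemoNode) <= a->end);
    return (off == INVALID_OFFSET) ? NULL : (DemoNode *) (a->nodes + off);
}

/* Append one node, growing the buffer if needed; returns the node's offset. */
static uint32_t
demo_alloc(DemoArray *a, int val)
{
    uint32_t    off = a->end;
    DemoNode   *n;

    if (a->end + sizeof(DemoNode) > a->size)
    {
        uint32_t    newsize = a->size ? a->size * 2 : (uint32_t) sizeof(DemoNode);
        char       *p = realloc(a->nodes, newsize);

        if (p == NULL)
            abort();
        a->nodes = p;           /* the buffer may have moved */
        a->size = newsize;
    }

    n = (DemoNode *) (a->nodes + off);
    n->val = val;
    n->child_offset = INVALID_OFFSET;
    a->end += sizeof(DemoNode);

    return off;
}

/* Build parent -> child; the parent is looked up again after the child exists. */
static uint32_t
demo_build(DemoArray *a)
{
    uint32_t    parent = demo_alloc(a, 1);
    uint32_t    child = demo_alloc(a, 2);   /* may move a->nodes */

    /* Re-fetch the parent by offset instead of reusing an old pointer. */
    demo_get(a, parent)->child_offset = child;

    return parent;
}
```

Starting from a zero-initialized DemoArray, demo_build() returns the offset of the parent node, and following child_offset through demo_get() reaches the child even though the buffer was reallocated in between. Because only offsets are stored inside the buffer, the same bytes stay meaningful after being copied to a different address.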
Hi Arthur,
Sorry for the delay, I somehow missed this thread ...
On 12/27/2017 10:20 AM, Arthur Zakirov wrote:
On Tue, Dec 26, 2017 at 07:03:48PM +0100, Pavel Stehule wrote:
Tomas had some workable patches related to this topic
Tomas, have you planned to propose it?
I believe Pavel was referring to this extension:
https://github.com/tvondra/shared_ispell
I wasn't going to submit that as in-core solution, but I'm happy you're
making improvements in that direction. I'll take a look at your patch
shortly.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Thank you for your answer.
On Mon, Jan 08, 2018 at 06:12:37PM +0100, Tomas Vondra wrote:
I believe Pavel was referring to this extension:
Oh, understood.
I wasn't going to submit that as in-core solution, but I'm happy you're
making improvements in that direction. I'll take a look at your patch
shortly.
Here is the second version of the patch. However, I've noticed a performance regression in ts_lexize(), and I will try to find where the overhead comes from.
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Hi Arthur,
I've done some initial review of the patch today, and here are some
thoughts:
0001-Fix-ispell-memory-handling-v2.patch
This makes sense. The patch simply replaces two cpstrdup() calls with
MemoryContextStrdup, but I see spell.c already has two macros to
allocate memory in the buildCxt. What about adding tmpstrdup to copy a
string into the context? I admit this is mostly nitpicking though.
0002-Retreive-shmem-location-for-ispell-v2.patch
I think the GUC name should make it clear it's a maximum number of
something, just like "max_parallel_workers" and other such GUCs. When I
first saw "shared_dictionaries" in the patch I thought it's a list of
dictionary names, or something like that.
I have a bunch of additional design questions and proposals (not
necessarily required for v1, but perhaps useful for shaping it).
1) Why do we actually need the limit? Is it really necessary / useful?
When I wrote shared_ispell back in 2012, all we had were fixed segments
allocated at start, and so similar limits were a built-in restriction.
But after the DSM stuff was introduced I imagined it would not be necessary.
I realize the current implementation requires that, because the hash
table is still created in an old-style memory context (and only the
dictionaries are in DSM segments).
But that seems fairly straightforward to fix by maintaining the hash
table in a separate DSM segment too. So lookup of the dictionary DSM
would have to first check what the current hash table segment is, and
then continue as now.
I'm not sure if dynahash can live in a DSM segment, but we already have
a hash table that supports that in dshash.c (which is also concurrent,
although I'm not sure if that's a major advantage for this use case).
2) Do we actually want/need some limits? Which ones?
That is not to say we don't need/want some limits, but the current limit
may not be the droid we're looking for, for a couple of reasons.
Firstly, currently it only matters during startup, when the dynahash is
created. So to change the limit (e.g. to increase it) you actually have
to restart the database, which is obviously a major hassle.
Secondly, dynahash tweaks the values to get proper behavior. For example
it's not using the values directly but some higher value of 2^N form.
Which means the limit may not be enforced immediately when hitting the GUC,
but unexpectedly somewhat later.
And finally, I believe this is log-worthy - right now the dictionary
load silently switches to backend memory (thus incurring all the parsing
overhead). This certainly deserves at least a log message.
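To illustrate the 2^N point above: assuming the table rounds the requested entry count up to a power of two (a simplification, dynahash's real sizing logic is more involved), a configured limit of 10 would only start to bite at 16. The helper name next_pow2_sketch is hypothetical and is not a PostgreSQL function.

```c
#include <stdint.h>

/*
 * Round n up to the next power of two (returns 1 for n == 0).
 * Illustrative only: this mimics the kind of 2^N rounding described
 * above, not dynahash's exact sizing code. Values above 2^31 are
 * clamped to 2^31 so the shift cannot overflow.
 */
uint32_t
next_pow2_sketch(uint32_t n)
{
    uint32_t p = 1;

    while (p < n && p < (UINT32_C(1) << 31))
        p <<= 1;
    return p;
}
```

With this rounding, a limit of 10 and a limit of 16 are indistinguishable, which is why the limit can appear to be enforced "somewhat later" than configured.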
Actually, I'm not sure "number of dictionaries" is a particularly useful
limit in the first place - that's not a number I really care about. But
I do care about amount of memory consumed by the loaded dictionaries.
So I do suggest adding such "max memory for shared dictionaries" limit.
I'm not sure we can enforce it strictly, because when deciding where to
load the dict we haven't parsed it yet and so don't know how much memory
will be required. But I believe a lazy check should be fine (load it,
and if we exceeded the total memory disable loading additional ones).
3) How do I unload a dictionary from the shared memory?
Assume we've reached the limit (it does not matter if it's the number of
dictionaries or memory used by them). How do I resolve that without
restarting the database? How do I unload a dictionary (which may be
unused) from shared memory?
ALTER TEXT SEARCH DICTIONARY x UNLOAD
4) How do I reload a dictionary?
Assume I've updated the dictionary files (added new words into the
files, or something like that). How do I reload the dictionary? Do I
have to restart the server, DROP/CREATE everything again, or what?
What about instead having something like this:
ALTER TEXT SEARCH DICTIONARY x RELOAD
5) Actually, how do I list currently loaded dictionaries (and how much
memory they use in the shared memory)?
6) What other restrictions would be useful?
I think it should be possible to specify which ispell dictionaries may
be loaded into shared memory, and which should be always loaded into
local backend memory. That is, something like
CREATE TEXT SEARCH DICTIONARY x (
TEMPLATE = ispell,
DictFile = czech,
AffFile = czech,
StopWords = czech,
SharedMemory = true/false (default: false)
);
because otherwise the dictionaries will compete for shared memory, and
it's unclear which of them will get loaded. For a server with a single
application that may not be a huge issue, but think about servers shared
by multiple applications, etc.
In the extension this was achieved kinda explicitly by definition of a
separate 'shared_ispell' template, but if you modify the current one
that won't work, of course.
7) You mentioned you had to get rid of the compact_palloc0 - can you
elaborate a bit why that was necessary? Also, when benchmarking the
impact of this make sure to measure not only the time but also memory
consumption.
In fact, that was the main reason why Pavel implemented it in 2010,
because the czech dictionary takes quite a bit of memory, and without
the shared memory a copy was kept in every backend.
Of course, maybe that would be mostly irrelevant thanks to this patch
(due to changes to the representation and keeping just a single copy).
8) One more thing - I've noticed that the hash table uses this key:
typedef struct
{
char dictfile[MAXPGPATH];
char afffile[MAXPGPATH];
} TsearchDictKey;
That is, full paths to the two files, and I'm not sure that's a very
good idea. Firstly, it's a bit wasteful (1kB per path). But more
importantly it means all dictionaries referencing the same files will
share the same chunk of shared memory - not only within a single
database, but across the whole cluster. That may lead to surprising
behavior, because e.g. unloading a dictionary in one database will
affect dictionaries in all other databases referencing the same files.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hello,
Thank you Tomas for your review.
On Sat, Jan 13, 2018 at 03:25:55AM +0100, Tomas Vondra wrote:
allocate memory in the buildCxt. What about adding tmpstrdup to copy a
string into the context? I admit this is mostly nitpicking though.
I agree about tmpstrdup(). It will be consistent with tmpalloc().
1) Why do we actually need the limit? Is it really necessary / useful?
...
I realize the current implementation requires that, because the hash
table is still created in an old-style memory context (and only the
dictionaries are in DSM segments).
Yes indeed. I tried to implement dynahash via DSM, but I failed. It
seems to me that dynahash can work only in an old-style memory context.
I'm not sure if dynahash can live in a DSM segment, but we already have
a hash table that supports that in dshash.c (which is also concurrent,
although I'm not sure if that's a major advantage for this use case).
Thanks a lot for pointing out dshash.c. I think this is just what we
need. I will try to use it in the new version of the patch.
2) Do we actually want/need some limits? Which ones?
...
And finally, I believe this is log-worthy - right now the dictionary
load silently switches to backend memory (thus incurring all the parsing
overhead). This certainly deserves at least a log message.
I think such a log message may be useful, so I will add it too.
So I do suggest adding such "max memory for shared dictionaries" limit.
I'm not sure we can enforce it strictly, because when deciding where to
load the dict we haven't parsed it yet and so don't know how much memory
will be required. But I believe a lazy check should be fine (load it,
and if we exceeded the total memory disable loading additional ones).
With dshash in DSM it seems that the shared_dictionaries GUC variable is not
needed anymore. I agree that another GUC variable (for example,
max_shared_dictionaries_size) may be useful. But maybe it's worth
checking the size of a dictionary only after actually compiling it? We can do
the following:
- within ispell_shmem_location(), build the dictionary using the callback
function
- the callback function returns the dictionary's size; if the dictionary
doesn't fit into the remaining shared space, ispell_shmem_location() just
returns a pointer to the palloc'ed and compiled dictionary without creating
a DSM segment.
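The two steps above could be sketched roughly as follows. This is a standalone illustration and not the patch's code: every name ending in _sketch is hypothetical, the limit is assumed to be given in kilobytes, and locking is left out entirely.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bookkeeping; names are not taken from the patch. */
typedef struct
{
    uint64_t loaded_bytes;  /* total size of dictionaries already shared */
    uint64_t max_kb;        /* configured limit, in kilobytes */
} SharedBudgetSketch;

/* Callback that compiles a dictionary and reports its size in bytes. */
typedef void *(*build_cb_sketch) (void *arg, size_t *size);

typedef enum { PLACE_SHARED, PLACE_LOCAL } PlacementSketch;

/*
 * Build the dictionary first, then decide where it lives. If it does not
 * fit into the remaining budget, the caller keeps the backend-local copy.
 */
PlacementSketch
place_dictionary_sketch(SharedBudgetSketch *budget, build_cb_sketch build,
                        void *arg, void **dict)
{
    size_t   size = 0;
    uint64_t max_bytes = budget->max_kb * UINT64_C(1024);

    *dict = build(arg, &size);

    /* Written so that neither side of the comparison can wrap around. */
    if (size > max_bytes || budget->loaded_bytes > max_bytes - size)
        return PLACE_LOCAL;

    budget->loaded_bytes += size;   /* a real version would copy to DSM here */
    return PLACE_SHARED;
}

/* Toy callback for demonstration: builds nothing, reports *arg bytes. */
void *
fake_build_sketch(void *arg, size_t *size)
{
    *size = *(size_t *) arg;
    return arg;
}
```

The point of the design is that the size is known only after compilation, so the fallback costs no extra parsing: the already compiled copy is simply kept in backend memory.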
3) How do I unload a dictionary from the shared memory?
...
ALTER TEXT SEARCH DICTIONARY x UNLOAD
4) How do I reload a dictionary?
...
ALTER TEXT SEARCH DICTIONARY x RELOAD
I think this syntax will be very useful not only for Ispell but for
other dictionaries too. So the init_function of a text search template may
return pointers to C functions which unload and reload dictionaries.
This approach doesn't require changing the catalog by adding additional
functions for the template [1].
If the init_function of a template doesn't return such pointers, then the
template doesn't support unloading or reloading, and the UNLOAD and RELOAD
commands should throw an error if a user calls them for such a template.
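A minimal sketch of that idea, assuming the init function hands back a small struct with optional callbacks. All type and function names here are hypothetical; nothing like this exists in the current template API.

```c
#include <stddef.h>

/* Hypothetical result of a template's init function. */
typedef struct
{
    void *dict_data;
    int (*unload) (void *dict_data);    /* NULL if unloading is unsupported */
    int (*reload) (void *dict_data);    /* NULL if reloading is unsupported */
} DictInitResultSketch;

typedef enum { DICT_OK = 0, DICT_NOT_SUPPORTED = -1 } DictStatusSketch;

/*
 * What a RELOAD command would do: call the callback if the template
 * provided one, otherwise report that the operation is unsupported
 * (the real command would raise an error at that point).
 */
DictStatusSketch
dict_reload_sketch(const DictInitResultSketch *res)
{
    if (res->reload == NULL)
        return DICT_NOT_SUPPORTED;

    res->reload(res->dict_data);
    return DICT_OK;
}

/* Toy reload callback: just counts how often it was invoked. */
int
counting_reload_sketch(void *dict_data)
{
    (*(int *) dict_data)++;
    return 0;
}
```

Templates that never set the pointers keep working unchanged, which is what makes this approach possible without a catalog change.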
5) Actually, how do I list currently loaded dictionaries (and how much
memory they use in the shared memory)?
This may be very useful too. The function could be called
pg_get_shared_dictionaries().
6) What other restrictions would be useful?
...
CREATE TEXT SEARCH DICTIONARY x (
TEMPLATE = ispell,
DictFile = czech,
AffFile = czech,
StopWords = czech,
SharedMemory = true/false (default: false)
);
Hm, I hadn't thought about such an option. It would be a very simple way
for a user to control dictionary sharing.
7) You mentioned you had to get rid of the compact_palloc0 - can you
elaborate a bit why that was necessary? Also, when benchmarking the
impact of this make sure to measure not only the time but also memory
consumption.
As I understood from the commit 3e5f9412d0a818be77c974e5af710928097b91f3,
compact_palloc0() reduces the overhead from a lot of palloc's for small
chunks of data. The persistent data of the patch should not suffer from
this overhead, because persistent data is allocated in big chunks.
But now I realize that we can keep compact_palloc0() for small chunks of
temporary data. So it may be worth keeping compact_palloc0().
8) One more thing - I've noticed that the hash table uses this key:
...
That is, full paths to the two files, and I'm not sure that's a very
good idea. Firstly, it's a bit wasteful (1kB per path). But more
importantly it means all dictionaries referencing the same files will
share the same chunk of shared memory - not only within a single
database, but across the whole cluster. That may lead to surprising
behavior, because e.g. unloading a dictionary in one database will
affect dictionaries in all other databases referencing the same files.
Hm, indeed. It's worth using only file names instead of full paths. And
it is a good idea to use more information besides file names. It could be
the Oid of the database and maybe the Oid of the namespace, because a
dictionary can be created in different schemas.
I think your proposals may be implemented in several patches, so they can
be applied independently but consistently. I suppose I will prepare a new
version of the patch with fixes and with an initial design of the new
functions and commands soon.
1 - https://www.postgresql.org/docs/current/static/sql-createtstemplate.html
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
On 01/13/2018 04:22 PM, Arthur Zakirov wrote:
Hello,
Thank you Tomas for your review.
On Sat, Jan 13, 2018 at 03:25:55AM +0100, Tomas Vondra wrote:
allocate memory in the buildCxt. What about adding tmpstrdup to copy a
string into the context? I admit this is mostly nitpicking though.
... snip ...
8) One more thing - I've noticed that the hash table uses this key:
...
That is, full paths to the two files, and I'm not sure that's a very
good idea. Firstly, it's a bit wasteful (1kB per path). But more
importantly it means all dictionaries referencing the same files will
share the same chunk of shared memory - not only within a single
database, but across the whole cluster. That may lead to surprising
behavior, because e.g. unloading a dictionary in one database will
affect dictionaries in all other databases referencing the same files.
Hm, indeed. It's worth using only file names instead of full paths. And
it is a good idea to use more information besides file names. It could be
the Oid of the database and maybe the Oid of the namespace, because a
dictionary can be created in different schemas.
I doubt using filenames (without the directory paths) solves anything,
really. The keys still have to be MAXPGPATH because someone could create
a very long filename. But I don't think memory consumption is such a big
deal, really. With 1000 dictionaries it's still just ~2MB of data, which
is negligible compared to the amount of memory saved by sharing the
dictionaries.
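The ~2MB estimate can be checked with a small sketch. It assumes the key layout quoted earlier in the thread and MAXPGPATH = 1024 (PostgreSQL's value); the _SKETCH names are made up for this example, and hash table overhead beyond the keys is not counted.

```c
#include <stddef.h>

#define MAXPGPATH_SKETCH 1024   /* mirrors PostgreSQL's MAXPGPATH */

/* Same layout as the TsearchDictKey quoted in the review. */
typedef struct
{
    char dictfile[MAXPGPATH_SKETCH];
    char afffile[MAXPGPATH_SKETCH];
} TsearchDictKeySketch;

/* Bytes needed for the keys of ndicts dictionaries. */
size_t
key_bytes_sketch(size_t ndicts)
{
    return ndicts * sizeof(TsearchDictKeySketch);
}
```

Each key is 2048 bytes, so 1000 dictionaries need 2,048,000 bytes for keys, which matches the figure above.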
Not sure if we really need to add the database/schema OIDs. I mentioned
the unexpected consequences (cross-db sharing) but maybe that's a
feature we should keep (it reduces memory usage). So perhaps this should
be another CREATE TEXT SEARCH DICTIONARY parameter, allowing sharing the
dictionary with other databases?
Aren't we overengineering this?
I think your proposals may be implemented in several patches, so they can
be applied independently but consistently. I suppose I will prepare new
version of the patch with fixes and with initial design of new functions
and commands soon.
Yes, splitting patches into smaller, more focused bits is a good idea.
BTW the current patch fails to document the dictionary sharing. It only
mentions it when describing the shared_dictionaries GUC. IMHO the right
place for additional details is
https://www.postgresql.org/docs/10/static/textsearch-dictionaries.html
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sat, Jan 13, 2018 at 10:33:14PM +0100, Tomas Vondra wrote:
Not sure if we really need to add the database/schema OIDs. I mentioned
the unexpected consequences (cross-db sharing) but maybe that's a
feature we should keep (it reduces memory usage). So perhaps this should
be another CREATE TEXT SEARCH DICTIONARY parameter, allowing sharing the
dictionary with other databases?
Aren't we overengineering this?
Another related problem I've noticed is a memory leak. When a dictionary
is loaded and then dropped, it won't be unloaded. I see several approaches:
1 - Use the Oid of the dictionary itself as the key instead of dictfile and
afffile. When the dictionary is dropped, it can easily be unloaded if it
was loaded. Implementation should be easy, but the drawback is higher memory consumption.
2 - Use a reference counter with cross-db sharing. When the dictionary is
loaded, the counter increases. Once all records of a loaded dictionary are
dropped, it will be unloaded.
3 - Or use reference counters without cross-db sharing to avoid possible confusion.
Here dictfile, afffile and the database Oid will be used as the key.
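As a rough illustration of option 3, here is a standalone sketch of reference counting keyed by (database Oid, dictfile, afffile). It uses a tiny fixed-size table and no locking; all names are hypothetical, and a real implementation would live in shared memory and take a lock around every access.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAX_ENTRIES_SKETCH 8
#define NAME_LEN_SKETCH 64

/* Hypothetical entry keyed by (database Oid, dictfile, afffile). */
typedef struct
{
    bool     in_use;
    uint32_t db_oid;
    char     dictfile[NAME_LEN_SKETCH];
    char     afffile[NAME_LEN_SKETCH];
    uint32_t refcnt;
} SharedDictSketch;

static SharedDictSketch dict_table_sketch[MAX_ENTRIES_SKETCH];

static SharedDictSketch *
find_entry_sketch(uint32_t db_oid, const char *dictfile, const char *afffile)
{
    for (int i = 0; i < MAX_ENTRIES_SKETCH; i++)
    {
        SharedDictSketch *e = &dict_table_sketch[i];

        if (e->in_use && e->db_oid == db_oid &&
            strcmp(e->dictfile, dictfile) == 0 &&
            strcmp(e->afffile, afffile) == 0)
            return e;
    }
    return NULL;
}

/* Take a reference; returns the new count, or 0 if it cannot be stored. */
uint32_t
dict_acquire_sketch(uint32_t db_oid, const char *dictfile, const char *afffile)
{
    SharedDictSketch *e = find_entry_sketch(db_oid, dictfile, afffile);

    if (e != NULL)
        return ++e->refcnt;

    if (strlen(dictfile) >= NAME_LEN_SKETCH || strlen(afffile) >= NAME_LEN_SKETCH)
        return 0;

    for (int i = 0; i < MAX_ENTRIES_SKETCH; i++)
    {
        e = &dict_table_sketch[i];
        if (!e->in_use)
        {
            e->in_use = true;
            e->db_oid = db_oid;
            strcpy(e->dictfile, dictfile);
            strcpy(e->afffile, afffile);
            e->refcnt = 1;
            return 1;
        }
    }
    return 0;
}

/* Drop a reference; returns true when the last one is gone. */
bool
dict_release_sketch(uint32_t db_oid, const char *dictfile, const char *afffile)
{
    SharedDictSketch *e = find_entry_sketch(db_oid, dictfile, afffile);

    if (e == NULL)
        return false;
    if (--e->refcnt > 0)
        return false;

    e->in_use = false;  /* a real version would free the shared memory here */
    return true;
}
```

Dropping db_oid from the key turns this into option 2 (cross-db sharing): the same files in two databases would then share one entry and one counter.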
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
On 01/15/2018 08:02 PM, Arthur Zakirov wrote:
On Sat, Jan 13, 2018 at 10:33:14PM +0100, Tomas Vondra wrote:
Not sure if we really need to add the database/schema OIDs. I mentioned
the unexpected consequences (cross-db sharing) but maybe that's a
feature we should keep (it reduces memory usage). So perhaps this should
be another CREATE TEXT SEARCH DICTIONARY parameter, allowing sharing the
dictionary with other databases?
Aren't we overengineering this?
Another related problem I've noticed is a memory leak. When a
dictionary is loaded and then dropped, it won't be unloaded.
Good point.
I see several approaches:
1 - Use the Oid of the dictionary itself as the key instead of dictfile and
afffile. When the dictionary is dropped, it can easily be unloaded if it
was loaded. Implementation should be easy, but the drawback is higher memory consumption.
2 - Use a reference counter with cross-db sharing. When the dictionary is
loaded, the counter increases. Once all records of a loaded dictionary are
dropped, it will be unloaded.
3 - Or use reference counters without cross-db sharing to avoid possible confusion.
Here dictfile, afffile and the database Oid will be used as the key.
I think you're approaching the problem from the wrong direction, hence
asking the wrong question.
I think the primary question is "Do we want to share dictionaries cross
databases?" and the answer will determine which of the three options is
the right one.
Another important consideration is the complexity of the patch. In fact,
I suggest to make it your goal to make the initial patch as simple as
possible. If something is "nice to have" it may wait for v2.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sat, Jan 13, 2018 at 06:22:41PM +0300, Arthur Zakirov wrote:
I think your proposals may be implemented in several patches, so they can
be applied independently but consistently. I suppose I will prepare new
version of the patch with fixes and with initial design of new functions
and commands soon.
I've attached a new version of the patch.
0001-Fix-ispell-memory-handling-v3.patch:
allocate memory in the buildCxt. What about adding tmpstrdup to copy a
string into the context? I admit this is mostly nitpicking though.
Fixed. Added tmpstrdup.
0002-Retreive-shmem-location-for-ispell-v3.patch:
dshash.c is now used instead of dynahash.c. The hash table is created during the first call of a text search function in an instance. The hash table uses the OID of a dictionary as the key instead of file names, so there is no cross-db sharing at all.
Added the max_shared_dictionaries_size GUC instead of shared_dictionaries. In the current version it can be set only at server start. If a dictionary is allocated in a backend's memory instead of shared memory, a LOG message is raised which includes the OID of the dictionary.
Fixed the memory leak. ts_dict_shmem_release() is called when a dictionary is removed and when the dictionary cache is invalidated. It unpins the mapping of the dictionary; if the reference count reaches zero, the DSM segment is unpinned as well, so the allocated shared memory will be released by Postgres.
0003-Store-ispell-structures-in-shmem-v3.patch:
Added documentation fixes. dispell_init() (and tmplinit in general) now has a second argument, dictid.
0004-Update-tmplinit-arguments-v3.patch:
It is necessary to fix all dictionaries, including contrib extensions, because of the second argument for tmplinit.
tmplinit has the following signature now:
dict_init(internal, internal)
0005-pg-ts-shared-dictinaries-view-v3.patch:
Added pg_ts_shared_dictionaries() function and pg_ts_shared_dictionaries system view. They return a list of dictionaries currently in shared memory, with the columns:
- dictoid
- schemaname
- dictname
- size
0006-Shared-memory-ispell-option-v3.patch:
Added the SharedMemory option for the Ispell dictionary template. It is true by default, because I think it would be good if people didn't have to do anything to get dictionaries allocated in shared memory.
Setting SharedMemory=false during ALTER TEXT SEARCH DICTIONARY has no immediate effect, because ALTER doesn't force invalidation of the dictionary cache, if I'm not mistaken.
3) How do I unload a dictionary from the shared memory?
...
ALTER TEXT SEARCH DICTIONARY x UNLOAD
4) How do I reload a dictionary?
...
ALTER TEXT SEARCH DICTIONARY x RELOAD
I thought about it, and it seems to me that we can use functions ts_unload() and ts_reload() instead of new syntax. We already have text search functions like ts_lexize() and ts_debug(), and it is better to stay consistent. I think there are two approaches for ts_unload():
- use DSM's pin and unpin methods and the invalidation callback, as was done for the memory leak fix. The drawback is that it won't take effect immediately, because the DSM segment will be released only when all backends unpin their DSM mappings.
- use DSA and the dsa_free() method. As far as I understand, dsa_free() frees allocated memory immediately. But it requires more work, because we will need some more locks. For instance, what happens when someone calls ts_lexize() and someone else calls dsa_free() at the same time?
7) You mentioned you had to get rid of the compact_palloc0 - can you
elaborate a bit why that was necessary? Also, when benchmarking the
impact of this make sure to measure not only the time but also memory
consumption.
It seems to me that there is no need for compact_palloc0() anymore. Tests show that the czech dictionary doesn't consume more memory after the patch.
Tests
-----
I've measured the creation time of dictionaries on my 64-bit machine. You can get them from [1]. Here the master is 434e6e1484418c55561914600de9e180fc408378. I've measured the french dictionary too because it has an even bigger affix file than the czech dictionary.
With patch:
czech_hunspell - 247 ms
english_hunspell - 59 ms
french_hunspell - 103 ms
Master:
czech_hunspell - 224 ms
english_hunspell - 52 ms
french_hunspell - 101 ms
Memory:
With patch (shared memory size + backend's memory):
czech_hunspell - 9573049 + 192584 total in 5 blocks; 1896 free (11 chunks); 190688 used
english_hunspell - 1985299 + 21064 total in 6 blocks; 7736 free (13 chunks); 13328 used
french_hunspell - 4763456 + 626960 total in 7 blocks; 7680 free (14 chunks); 619280 used
Here the french dictionary uses more backend memory because it has a big affix file. Regular expression structures are still stored in backend memory.
Master (backend's memory):
czech_hunspell - 17181544 total in 2034 blocks; 3584 free (10 chunks); 17177960 used
english_hunspell - 4160120 total in 506 blocks; 2792 free (10 chunks); 4157328 used
french_hunspell - 11439184 total in 1187 blocks; 18832 free (171 chunks); 11420352 used
You can see that the dictionaries now take roughly half as much memory.
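The ratio can be checked directly from the "total" figures above. The helper below is only an arithmetic sketch with a made-up name; it divides the master footprint by the sum of the shared and backend-local parts measured with the patch.

```c
#include <stdint.h>

/* master / (shared + backend): how many times smaller the patched footprint is. */
double
memory_ratio_sketch(uint64_t master_bytes, uint64_t shared_bytes,
                    uint64_t backend_bytes)
{
    return (double) master_bytes / (double) (shared_bytes + backend_bytes);
}
```

Plugging in the totals gives roughly 1.76 for czech_hunspell (17181544 / 9765633), 2.07 for english_hunspell (4160120 / 2006363) and 2.12 for french_hunspell (11439184 / 5390416). These numbers are for a single backend; since the shared part exists only once, the saving should grow with the number of backends that use the dictionary.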
pgbench with a select-only script:
SELECT ts_lexize('czech_hunspell', 'slon');
patch: 30431 TPS
master: 30419 TPS
SELECT ts_lexize('english_hunspell', 'elephant');
patch: 35029 TPS
master: 35276 TPS
SELECT ts_lexize('french_hunspell', 'éléphante');
patch: 22264 TPS
master: 22744 TPS
1 - https://github.com/postgrespro/hunspell_dicts
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Attachments:
0001-Fix-ispell-memory-handling-v3.patch (text/plain; charset=us-ascii)
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index b9fdd77e19..7d2382045a 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -75,8 +75,10 @@
* with the dictionary cache entry. We keep the short-lived stuff
* in the Conf->buildCxt context.
*/
-#define tmpalloc(sz) MemoryContextAlloc(Conf->buildCxt, (sz))
-#define tmpalloc0(sz) MemoryContextAllocZero(Conf->buildCxt, (sz))
+#define tmpalloc(sz) MemoryContextAlloc(Conf->buildCxt, (sz))
+#define tmpalloc0(sz) MemoryContextAllocZero(Conf->buildCxt, (sz))
+
+#define tmpstrdup(str) MemoryContextStrdup(Conf->buildCxt, (str))
/*
* Prepare for constructing an ISpell dictionary.
@@ -498,7 +500,7 @@ NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
strcpy(Conf->Spell[Conf->nspell]->word, word);
Conf->Spell[Conf->nspell]->p.flag = (*flag != '\0')
- ? cpstrdup(Conf, flag) : VoidString;
+ ? tmpstrdup(flag) : VoidString;
Conf->nspell++;
}
@@ -1040,7 +1042,7 @@ setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
entry->flag.i = i;
}
else
- entry->flag.s = cpstrdup(Conf, s);
+ entry->flag.s = tmpstrdup(s);
entry->flagMode = Conf->flagMode;
entry->value = val;
@@ -1536,6 +1538,9 @@ nextline:
return;
isnewformat:
+ pfree(recoded);
+ pfree(pstr);
+
if (oldformat)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
0002-Retreive-shmem-location-for-ispell-v3.patch (text/plain; charset=us-ascii)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 45b2af14eb..46617df852 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1364,6 +1364,35 @@ include_dir 'conf.d'
</listitem>
</varlistentry>
+ <varlistentry id="guc-max-shared-dictionaries-size" xreflabel="max_shared_dictionaries_size">
+ <term><varname>max_shared_dictionaries_size</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>max_shared_dictionaries_size</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum size of all text search dictionaries loaded into shared
+ memory. The default is 100 megabytes (<literal>100MB</literal>). This
+ parameter can only be set at server start.
+ </para>
+
+ <para>
+ Currently controls only loading of <application>Ispell</application>
+ dictionaries (see <xref linkend="textsearch-ispell-dictionary"/>).
+ After compiling the dictionary it will be copied into shared memory.
+ Other backends will use it from shared memory on first use of the
+ dictionary, so they don't need to compile the dictionary a second time.
+ </para>
+
+ <para>
+ If total size of simultaneously loaded dictionaries reaches the maximum
+ allowed size then a new dictionary will be loaded into local memory of
+ a backend.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-huge-pages" xreflabel="huge_pages">
<term><varname>huge_pages</varname> (<type>enum</type>)
<indexterm>
diff --git a/src/backend/commands/tsearchcmds.c b/src/backend/commands/tsearchcmds.c
index bdf3857ce4..42be77d045 100644
--- a/src/backend/commands/tsearchcmds.c
+++ b/src/backend/commands/tsearchcmds.c
@@ -39,6 +39,7 @@
#include "nodes/makefuncs.h"
#include "parser/parse_func.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_shared.h"
#include "tsearch/ts_utils.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
@@ -396,7 +397,8 @@ verify_dictoptions(Oid tmplId, List *dictoptions)
* Call the init method and see if it complains. We don't worry about
* it leaking memory, since our command will soon be over anyway.
*/
- (void) OidFunctionCall1(initmethod, PointerGetDatum(dictoptions));
+ (void) OidFunctionCall2(initmethod, PointerGetDatum(dictoptions),
+ ObjectIdGetDatum(InvalidOid));
}
ReleaseSysCache(tup);
@@ -513,6 +515,8 @@ RemoveTSDictionaryById(Oid dictId)
CatalogTupleDelete(relation, &tup->t_self);
+ ts_dict_shmem_release(dictId);
+
ReleaseSysCache(tup);
heap_close(relation, RowExclusiveLock);
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 0c86a581c0..c7dce8cac5 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -44,6 +44,7 @@
#include "storage/procsignal.h"
#include "storage/sinvaladt.h"
#include "storage/spin.h"
+#include "tsearch/ts_shared.h"
#include "utils/backend_random.h"
#include "utils/snapmgr.h"
@@ -150,6 +151,7 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
size = add_size(size, SyncScanShmemSize());
size = add_size(size, AsyncShmemSize());
size = add_size(size, BackendRandomShmemSize());
+ size = add_size(size, TsearchShmemSize());
#ifdef EXEC_BACKEND
size = add_size(size, ShmemBackendArraySize());
#endif
@@ -271,6 +273,11 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
AsyncShmemInit();
BackendRandomShmemInit();
+ /*
+ * Set up shared memory to tsearch
+ */
+ TsearchShmemInit();
+
#ifdef EXEC_BACKEND
/*
diff --git a/src/backend/tsearch/Makefile b/src/backend/tsearch/Makefile
index 227468ae9e..860cd196e9 100644
--- a/src/backend/tsearch/Makefile
+++ b/src/backend/tsearch/Makefile
@@ -26,7 +26,7 @@ DICTFILES_PATH=$(addprefix dicts/,$(DICTFILES))
OBJS = ts_locale.o ts_parse.o wparser.o wparser_def.o dict.o \
dict_simple.o dict_synonym.o dict_thesaurus.o \
dict_ispell.o regis.o spell.o \
- to_tsany.o ts_selfuncs.o ts_typanalyze.o ts_utils.o
+ to_tsany.o ts_selfuncs.o ts_shared.o ts_typanalyze.o ts_utils.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/tsearch/ts_shared.c b/src/backend/tsearch/ts_shared.c
new file mode 100644
index 0000000000..c7ee71412a
--- /dev/null
+++ b/src/backend/tsearch/ts_shared.c
@@ -0,0 +1,375 @@
+/*-------------------------------------------------------------------------
+ *
+ * ts_shared.c
+ * tsearch shared memory management
+ *
+ * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/tsearch/ts_shared.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "lib/dshash.h"
+#include "storage/lwlock.h"
+#include "storage/shmem.h"
+#include "tsearch/ts_shared.h"
+#include "utils/hashutils.h"
+#include "utils/memutils.h"
+
+
+/*
+ * Hash table structures
+ */
+typedef struct
+{
+ Oid dict_id;
+ dsm_handle dict_dsm;
+ Size dict_size;
+
+ /* How many backends have DSM mapping */
+ uint32 refcnt;
+} TsearchDictEntry;
+
+static dshash_table *dict_table = NULL;
+
+/*
+ * Shared struct for locking
+ */
+typedef struct
+{
+ dsa_handle area;
+ dshash_table_handle dict_table_handle;
+
+ /* Total size of loaded dictionaries into shared memory in bytes */
+ Size loaded_size;
+
+ LWLock lock;
+} TsearchCtlData;
+
+static TsearchCtlData *tsearch_ctl;
+
+/*
+ * GUC variable for maximum total size of shared dictionaries, in kilobytes.
+ * Default value is 100MB.
+ */
+int max_shared_dictionaries_size = 100 * 1024;
+
+static void init_dict_table(void);
+
+/* Parameters for dict_table */
+static const dshash_parameters dict_table_params ={
+ sizeof(Oid),
+ sizeof(TsearchDictEntry),
+ dshash_memcmp,
+ dshash_memhash,
+ LWTRANCHE_TSEARCH_TABLE
+};
+
+/*
+ * Build the dictionary using allocate_cb callback. If there is a space in
+ * shared memory and max_shared_dictionaries_size is greater than 0 copy the
+ * dictionary into DSM.
+ *
+ * If max_shared_dictionaries_size is greater than 0 then try to find the
+ * dictionary in shared hash table first. If it was built by someone earlier
+ * just return its location in DSM.
+ *
+ * dictid: Oid of the dictionary.
+ * arg: an argument to the callback function.
+ * allocate_cb: function to build the dictionary, if it wasn't found in DSM.
+ *
+ * Returns address in the dynamic shared memory segment or in backend memory.
+ */
+void *
+ts_dict_shmem_location(Oid dictid, List *dictoptions,
+ ispell_build_callback allocate_cb)
+{
+ TsearchDictEntry *entry;
+ bool found;
+ dsm_segment *seg;
+ void *dict,
+ *dict_location;
+
+#define CHECK_SHARED_SPACE() \
+ if (entry->dict_size + tsearch_ctl->loaded_size > \
+ max_shared_dictionaries_size * 1024L) \
+ { \
+ LWLockRelease(&tsearch_ctl->lock); \
+ ereport(LOG, \
+ (errmsg("there is no space in shared memory for text search " \
+ "dictionary %u, it will be loaded into backend's memory", \
+ dictid))); \
+ dshash_delete_entry(dict_table, entry); \
+ return dict; \
+ } \
+
+ init_dict_table();
+
+ /*
+ * Build the dictionary in backend's memory if a hash table wasn't created
+ * or dictid is invalid (it may happen if the dictionary's init method was
+ * called within verify_dictoptions()).
+ */
+ if (!DsaPointerIsValid(tsearch_ctl->dict_table_handle) ||
+ !OidIsValid(dictid))
+ {
+ Size dict_size;
+
+ dict = allocate_cb(dictoptions, &dict_size);
+
+ return dict;
+ }
+
+ /* Try to find an entry in the hash table */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, &dictid, false);
+
+ if (entry)
+ {
+ seg = dsm_find_mapping(entry->dict_dsm);
+ if (!seg)
+ {
+ seg = dsm_attach(entry->dict_dsm);
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+ }
+
+ entry->refcnt++;
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+ }
+
+ /* Dictionary hasn't been loaded into memory yet */
+ entry = (TsearchDictEntry *) dshash_find_or_insert(dict_table, &dictid,
+ &found);
+
+ if (found)
+ {
+ /*
+ * Someone concurrently inserted a dictionary entry since the first time
+ * we checked.
+ */
+ seg = dsm_attach(entry->dict_dsm);
+
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+
+ entry->refcnt++;
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+ }
+
+ /* Build the dictionary */
+ dict = allocate_cb(dictoptions, &entry->dict_size);
+
+ LWLockAcquire(&tsearch_ctl->lock, LW_SHARED);
+
+ /* Before allocating a DSM segment check remaining shared space */
+ Assert(max_shared_dictionaries_size);
+
+ CHECK_SHARED_SPACE();
+
+ LWLockRelease(&tsearch_ctl->lock);
+ /* If we come here, we need an exclusive lock */
+ while (!LWLockAcquireOrWait(&tsearch_ctl->lock, LW_EXCLUSIVE))
+ {
+ /*
+ * Check again in case there is no space anymore while we were
+ * waiting for exclusive lock.
+ */
+ CHECK_SHARED_SPACE();
+ }
+
+ tsearch_ctl->loaded_size += entry->dict_size;
+
+ LWLockRelease(&tsearch_ctl->lock);
+
+ /* Finally, allocate a DSM segment for the compiled dictionary */
+ seg = dsm_create(entry->dict_size, 0);
+ dict_location = dsm_segment_address(seg);
+ memcpy(dict_location, dict, entry->dict_size);
+
+ pfree(dict);
+
+ entry->dict_id = dictid;
+ entry->dict_dsm = dsm_segment_handle(seg);
+ entry->refcnt++;
+
+ /* Remain attached until end of postmaster */
+ dsm_pin_segment(seg);
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+}
+
+/*
+ * Release memory occupied by the dictionary. Function just unpins DSM mapping.
+ * If nobody else has a mapping to this DSM segment then unpin the segment.
+ *
+ * dictid: Oid of the dictionary.
+ */
+void
+ts_dict_shmem_release(Oid dictid)
+{
+ TsearchDictEntry *entry;
+
+ LWLockAcquire(&tsearch_ctl->lock, LW_SHARED);
+ /*
+ * If we didn't attach to a hash table then do nothing.
+ */
+ if (!dict_table)
+ {
+ LWLockRelease(&tsearch_ctl->lock);
+ return;
+ }
+ LWLockRelease(&tsearch_ctl->lock);
+
+ /* Try to find an entry in the hash table */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, &dictid, true);
+
+ if (entry)
+ {
+ dsm_segment *seg;
+
+ seg = dsm_find_mapping(entry->dict_dsm);
+ /*
+ * If current backend didn't pin a mapping then we don't need to do
+ * unpinning.
+ */
+ if (!seg)
+ {
+ dshash_release_lock(dict_table, entry);
+ return;
+ }
+
+ dsm_unpin_mapping(seg);
+ dsm_detach(seg);
+
+ entry->refcnt--;
+
+ if (entry->refcnt == 0)
+ {
+ dsm_unpin_segment(entry->dict_dsm);
+ dshash_delete_entry(dict_table, entry);
+ }
+ else
+ dshash_release_lock(dict_table, entry);
+ }
+}
+
+/*
+ * Allocate and initialize tsearch-related shared memory.
+ */
+void
+TsearchShmemInit(void)
+{
+ bool found;
+
+ tsearch_ctl = (TsearchCtlData *)
+ ShmemInitStruct("Full Text Search Ctl", sizeof(TsearchCtlData), &found);
+
+ if (!found)
+ {
+ LWLockRegisterTranche(LWTRANCHE_TSEARCH_DSA, "tsearch_dsa");
+ LWLockRegisterTranche(LWTRANCHE_TSEARCH_TABLE, "tsearch_table");
+
+ LWLockInitialize(&tsearch_ctl->lock, LWTRANCHE_TSEARCH_DSA);
+
+ tsearch_ctl->area = DSM_HANDLE_INVALID;
+ tsearch_ctl->dict_table_handle = InvalidDsaPointer;
+ tsearch_ctl->loaded_size = 0;
+ }
+}
+
+/*
+ * Report shared memory space needed by TsearchShmemInit.
+ */
+Size
+TsearchShmemSize(void)
+{
+ Size size = 0;
+
+ /* size of service structure */
+ size = add_size(size, MAXALIGN(sizeof(TsearchCtlData)));
+
+ /* size of lookup hash table */
+ size = add_size(size, hash_estimate_size(max_shared_dictionaries_size,
+ sizeof(TsearchDictEntry)));
+
+ return size;
+}
+
+/*
+ * Initialize hash table located in DSM.
+ *
+ * The hash table should be created and initialized iff
+ * max_shared_dictionaries_size GUC is greater than zero and it doesn't exist
+ * yet.
+ */
+static void
+init_dict_table(void)
+{
+ MemoryContext old_context;
+ dsa_area *dsa;
+
+ if (max_shared_dictionaries_size == 0)
+ return;
+
+ if (dict_table)
+ return;
+
+ old_context = MemoryContextSwitchTo(TopMemoryContext);
+
+recheck_table:
+ LWLockAcquire(&tsearch_ctl->lock, LW_SHARED);
+
+ /* The hash table has already been created by another backend */
+ if (DsaPointerIsValid(tsearch_ctl->dict_table_handle))
+ {
+ Assert(tsearch_ctl->area != DSM_HANDLE_INVALID);
+
+ dsa = dsa_attach(tsearch_ctl->area);
+
+ dict_table = dshash_attach(dsa,
+ &dict_table_params,
+ tsearch_ctl->dict_table_handle,
+ NULL);
+ }
+ else
+ {
+ /* Try to get exclusive lock */
+ LWLockRelease(&tsearch_ctl->lock);
+ if (!LWLockAcquireOrWait(&tsearch_ctl->lock, LW_EXCLUSIVE))
+ {
+ /*
+ * The lock was released by another backend, which has already
+ * created the hash table concurrently.
+ */
+ goto recheck_table;
+ }
+
+ dsa = dsa_create(LWTRANCHE_TSEARCH_DSA);
+ tsearch_ctl->area = dsa_get_handle(dsa);
+
+ dict_table = dshash_create(dsa, &dict_table_params, NULL);
+ tsearch_ctl->dict_table_handle = dshash_get_hash_table_handle(dict_table);
+
+ /* Remain attached until end of postmaster */
+ dsa_pin(dsa);
+ }
+
+ LWLockRelease(&tsearch_ctl->lock);
+
+ /* Remain attached until end of session */
+ dsa_pin_mapping(dsa);
+
+ MemoryContextSwitchTo(old_context);
+}
diff --git a/src/backend/utils/cache/ts_cache.c b/src/backend/utils/cache/ts_cache.c
index 3d5c194148..c078503111 100644
--- a/src/backend/utils/cache/ts_cache.c
+++ b/src/backend/utils/cache/ts_cache.c
@@ -39,6 +39,7 @@
#include "catalog/pg_ts_template.h"
#include "commands/defrem.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_shared.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
#include "utils/fmgroids.h"
@@ -98,7 +99,16 @@ InvalidateTSCacheCallBack(Datum arg, int cacheid, uint32 hashvalue)
hash_seq_init(&status, hash);
while ((entry = (TSAnyCacheEntry *) hash_seq_search(&status)) != NULL)
+ {
+ if (entry->isvalid && hash == TSDictionaryCacheHash)
+ {
+ TSDictionaryCacheEntry *dict_entry = (TSDictionaryCacheEntry *) entry;
+
+ ts_dict_shmem_release(dict_entry->dictId);
+ }
+
entry->isvalid = false;
+ }
/* Also invalidate the current-config cache if it's pg_ts_config */
if (hash == TSConfigCacheHash)
@@ -334,8 +344,9 @@ lookup_ts_dictionary_cache(Oid dictId)
dictoptions = deserialize_deflist(opt);
entry->dictData =
- DatumGetPointer(OidFunctionCall1(template->tmplinit,
- PointerGetDatum(dictoptions)));
+ DatumGetPointer(OidFunctionCall2(template->tmplinit,
+ PointerGetDatum(dictoptions),
+ ObjectIdGetDatum(dictId)));
MemoryContextSwitchTo(oldcontext);
}
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 5884fa905e..f910528c78 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -76,6 +76,7 @@
#include "storage/predicate.h"
#include "tcop/tcopprot.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_shared.h"
#include "utils/builtins.h"
#include "utils/bytea.h"
#include "utils/guc_tables.h"
@@ -2912,6 +2913,20 @@ static struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"max_shared_dictionaries_size", PGC_POSTMASTER, RESOURCES_MEM,
+ gettext_noop("Sets the maximum size of all text search dictionaries loaded into shared memory."),
+ gettext_noop("Currently controls only loading of Ispell dictionaries. "
+ "If total size of simultaneously loaded dictionaries "
+ "reaches the maximum allowed size then a new dictionary "
+ "will be loaded into local memory of a backend."),
+ GUC_UNIT_KB,
+ },
+ &max_shared_dictionaries_size,
+ 100 * 1024, 0, MAX_KILOBYTES,
+ NULL, NULL, NULL
+ },
+
/* End-of-list marker */
{
{NULL, 0, 0, NULL, NULL}, NULL, 0, 0, 0, NULL, NULL, NULL
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index abffde6b2b..9fe2b0f85e 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -133,6 +133,7 @@
# mmap
# use none to disable dynamic shared memory
# (change requires restart)
+#max_shared_dictionaries_size = 100MB # (change requires restart)
# - Disk -
diff --git a/src/include/storage/lwlock.h b/src/include/storage/lwlock.h
index c21bfe2f66..16b0858eda 100644
--- a/src/include/storage/lwlock.h
+++ b/src/include/storage/lwlock.h
@@ -219,6 +219,8 @@ typedef enum BuiltinTrancheIds
LWTRANCHE_SHARED_TUPLESTORE,
LWTRANCHE_TBM,
LWTRANCHE_PARALLEL_APPEND,
+ LWTRANCHE_TSEARCH_DSA,
+ LWTRANCHE_TSEARCH_TABLE,
LWTRANCHE_FIRST_USER_DEFINED
} BuiltinTrancheIds;
diff --git a/src/include/tsearch/ts_shared.h b/src/include/tsearch/ts_shared.h
new file mode 100644
index 0000000000..d6a27c9037
--- /dev/null
+++ b/src/include/tsearch/ts_shared.h
@@ -0,0 +1,33 @@
+/*-------------------------------------------------------------------------
+ *
+ * ts_shared.h
+ * tsearch shared memory management
+ *
+ * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
+ *
+ * src/include/tsearch/ts_shared.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef TS_SHARED_H
+#define TS_SHARED_H
+
+#include "c.h"
+
+#include "nodes/pg_list.h"
+
+/*
+ * GUC variable for maximum total size of shared dictionaries, in kilobytes
+ */
+extern int max_shared_dictionaries_size;
+
+typedef void *(*ispell_build_callback) (List *dictoptions, Size *size);
+
+extern void *ts_dict_shmem_location(Oid dictid, List *dictoptions,
+ ispell_build_callback allocate_cb);
+extern void ts_dict_shmem_release(Oid dictid);
+
+extern void TsearchShmemInit(void);
+extern Size TsearchShmemSize(void);
+
+#endif /* TS_SHARED_H */
0003-Store-ispell-structures-in-shmem-v3.patch
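The central idea of this patch is that the compiled dictionary becomes one flat buffer in which every reference is a byte offset rather than a pointer, so the buffer can be copied into a DSM segment and read at whatever address each backend maps it. The following is only a minimal standalone sketch of that idea, not code from the patch: FlatStrings, flat_build() and flat_get() are hypothetical names, and plain malloc() stands in for palloc() and dsm_create(). It mirrors how the AffixData strings are stored next to an array of offsets.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical flat container: a header, then n offsets, then the string
 * bytes. It holds no pointers, so it can be memcpy'd to any address (for
 * example into a shared memory segment) and still be read correctly.
 */
typedef struct FlatStrings
{
	uint32_t	n;				/* number of strings */
	uint32_t	data_start;		/* byte offset of string data in payload[] */
	char		payload[];		/* offsets[n], then NUL-terminated strings */
} FlatStrings;

/*
 * Build a flat copy of strs[0..n-1]. Returns a malloc'ed buffer (or NULL on
 * allocation failure) and stores its total size in *size.
 */
static FlatStrings *
flat_build(const char *const *strs, uint32_t n, size_t *size)
{
	size_t		data_len = 0;
	uint32_t	off = 0;
	uint32_t	i;
	uint32_t   *offsets;
	FlatStrings *flat;

	assert(strs != NULL || n == 0);

	/* First pass: compute the space needed for all strings */
	for (i = 0; i < n; i++)
		data_len += strlen(strs[i]) + 1;

	*size = sizeof(FlatStrings) + n * sizeof(uint32_t) + data_len;
	flat = malloc(*size);
	if (flat == NULL)
		return NULL;

	flat->n = n;
	flat->data_start = n * (uint32_t) sizeof(uint32_t);
	offsets = (uint32_t *) flat->payload;

	/* Second pass: record each offset and copy the string bytes */
	for (i = 0; i < n; i++)
	{
		size_t		len = strlen(strs[i]) + 1;

		offsets[i] = off;
		memcpy(flat->payload + flat->data_start + off, strs[i], len);
		off += (uint32_t) len;
	}

	return flat;
}

/* Return the i-th string, resolved relative to the buffer's own address. */
static const char *
flat_get(const FlatStrings *flat, uint32_t i)
{
	const uint32_t *offsets = (const uint32_t *) flat->payload;

	if (i >= flat->n)
		return NULL;
	return flat->payload + flat->data_start + offsets[i];
}
```

Because flat_get() computes addresses from the start of the buffer it is given, a byte-for-byte copy of the buffer stays usable after the original is freed. A pointer-based structure would not survive such a copy, since its pointers would still refer to the original's memory.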
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index 1a2f04019c..20d637dde3 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -3029,6 +3029,23 @@ CREATE TEXT SEARCH DICTIONARY english_stem (
</sect2>
+ <sect2 id="textsearch-shared-dictionaries">
+ <title>Dictionaries in Shared Memory</title>
+
+ <para>
+ Some dictionaries, especially <application>Ispell</application>, consume a
+ noticeable amount of memory. The size of a dictionary can reach tens of megabytes.
+ Most of them also store their configuration in text files. A dictionary is compiled
+ on first access in each user session.
+ </para>
+
+ <para>
+ To store dictionaries in shared memory, set the <xref linkend="guc-max-shared-dictionaries-size"/>
+ parameter to a value greater than zero before starting the server.
+ </para>
+
+ </sect2>
+
</sect1>
<sect1 id="textsearch-configuration">
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index 0d706795ad..60ef770dbd 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -5,6 +5,15 @@
*
* Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
*
+ * By default all Ispell dictionaries are stored in DSM. But if the total size
+ * of loaded dictionaries reaches the maximum allowed value, a new dictionary is
+ * allocated within its own memory context (dictCtx).
+ *
+ * All necessary data is built within the dispell_build() function. But
+ * structures for regular expressions are compiled on first demand and
+ * stored in the AffixReg array, because regex_t and Regis cannot be
+ * stored in shared memory.
+ *
*
* IDENTIFICATION
* src/backend/tsearch/dict_ispell.c
@@ -14,8 +23,10 @@
#include "postgres.h"
#include "commands/defrem.h"
+#include "storage/dsm.h"
#include "tsearch/dicts/spell.h"
#include "tsearch/ts_locale.h"
+#include "tsearch/ts_shared.h"
#include "tsearch/ts_utils.h"
#include "utils/builtins.h"
@@ -26,19 +37,90 @@ typedef struct
IspellDict obj;
} DictISpell;
+static void parse_dictoptions(List *dictoptions,
+ char **dictfile, char **afffile, char **stopfile);
+static void *dispell_build(List *dictoptions, Size *size);
+
Datum
dispell_init(PG_FUNCTION_ARGS)
{
List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ Oid dictid = PG_GETARG_OID(1);
DictISpell *d;
- bool affloaded = false,
- dictloaded = false,
- stoploaded = false;
- ListCell *l;
+ void *dict_location;
+ char *stopfile;
d = (DictISpell *) palloc0(sizeof(DictISpell));
- NIStartBuild(&(d->obj));
+ parse_dictoptions(dictoptions, NULL, NULL, &stopfile);
+
+ if (stopfile)
+ readstoplist(stopfile, &(d->stoplist), lowerstr);
+
+ dict_location = ts_dict_shmem_location(dictid, dictoptions, dispell_build);
+ Assert(dict_location);
+
+ d->obj.dict = (IspellDictData *) dict_location;
+ d->obj.reg = (AffixReg *) palloc0(d->obj.dict->nAffix *
+ sizeof(AffixReg));
+ /* Current memory context is dictionary's private memory context */
+ d->obj.dictCtx = CurrentMemoryContext;
+
+ PG_RETURN_POINTER(d);
+}
+
+Datum
+dispell_lexize(PG_FUNCTION_ARGS)
+{
+ DictISpell *d = (DictISpell *) PG_GETARG_POINTER(0);
+ char *in = (char *) PG_GETARG_POINTER(1);
+ int32 len = PG_GETARG_INT32(2);
+ char *txt;
+ TSLexeme *res;
+ TSLexeme *ptr,
+ *cptr;
+
+ if (len <= 0)
+ PG_RETURN_POINTER(NULL);
+
+ txt = lowerstr_with_len(in, len);
+ res = NINormalizeWord(&(d->obj), txt);
+
+ if (res == NULL)
+ PG_RETURN_POINTER(NULL);
+
+ cptr = res;
+ for (ptr = cptr; ptr->lexeme; ptr++)
+ {
+ if (searchstoplist(&(d->stoplist), ptr->lexeme))
+ {
+ pfree(ptr->lexeme);
+ ptr->lexeme = NULL;
+ }
+ else
+ {
+ if (cptr != ptr)
+ memcpy(cptr, ptr, sizeof(TSLexeme));
+ cptr++;
+ }
+ }
+ cptr->lexeme = NULL;
+
+ PG_RETURN_POINTER(res);
+}
+
+static void
+parse_dictoptions(List *dictoptions, char **dictfile, char **afffile,
+ char **stopfile)
+{
+ ListCell *l;
+
+ if (dictfile)
+ *dictfile = NULL;
+ if (afffile)
+ *afffile = NULL;
+ if (stopfile)
+ *stopfile = NULL;
foreach(l, dictoptions)
{
@@ -46,34 +128,36 @@ dispell_init(PG_FUNCTION_ARGS)
if (pg_strcasecmp(defel->defname, "DictFile") == 0)
{
- if (dictloaded)
+ if (!dictfile)
+ continue;
+
+ if (*dictfile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple DictFile parameters")));
- NIImportDictionary(&(d->obj),
- get_tsearch_config_filename(defGetString(defel),
- "dict"));
- dictloaded = true;
+ *dictfile = get_tsearch_config_filename(defGetString(defel), "dict");
}
else if (pg_strcasecmp(defel->defname, "AffFile") == 0)
{
- if (affloaded)
+ if (!afffile)
+ continue;
+
+ if (*afffile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple AffFile parameters")));
- NIImportAffixes(&(d->obj),
- get_tsearch_config_filename(defGetString(defel),
- "affix"));
- affloaded = true;
+ *afffile = get_tsearch_config_filename(defGetString(defel), "affix");
}
else if (pg_strcasecmp(defel->defname, "StopWords") == 0)
{
- if (stoploaded)
+ if (!stopfile)
+ continue;
+
+ if (*stopfile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple StopWords parameters")));
- readstoplist(defGetString(defel), &(d->stoplist), lowerstr);
- stoploaded = true;
+ *stopfile = defGetString(defel);
}
else
{
@@ -83,66 +167,52 @@ dispell_init(PG_FUNCTION_ARGS)
defel->defname)));
}
}
+}
- if (affloaded && dictloaded)
- {
- NISortDictionary(&(d->obj));
- NISortAffixes(&(d->obj));
- }
- else if (!affloaded)
+/*
+ * Build the dictionary.
+ *
+ * Result is palloc'ed.
+ */
+static void *
+dispell_build(List *dictoptions, Size *size)
+{
+ IspellDictBuild build;
+ char *dictfile,
+ *afffile;
+
+ parse_dictoptions(dictoptions, &dictfile, &afffile, NULL);
+
+ if (!afffile)
{
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("missing AffFile parameter")));
}
- else
+ else if (!dictfile)
{
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("missing DictFile parameter")));
}
- NIFinishBuild(&(d->obj));
+ MemSet(&build, 0, sizeof(build));
+ NIStartBuild(&build);
- PG_RETURN_POINTER(d);
-}
+ /* Read files */
+ NIImportDictionary(&build, dictfile);
+ NIImportAffixes(&build, afffile);
-Datum
-dispell_lexize(PG_FUNCTION_ARGS)
-{
- DictISpell *d = (DictISpell *) PG_GETARG_POINTER(0);
- char *in = (char *) PG_GETARG_POINTER(1);
- int32 len = PG_GETARG_INT32(2);
- char *txt;
- TSLexeme *res;
- TSLexeme *ptr,
- *cptr;
+ /* Build persistent data to use by backends */
+ NISortDictionary(&build);
+ NISortAffixes(&build);
- if (len <= 0)
- PG_RETURN_POINTER(NULL);
+ NICopyData(&build);
- txt = lowerstr_with_len(in, len);
- res = NINormalizeWord(&(d->obj), txt);
-
- if (res == NULL)
- PG_RETURN_POINTER(NULL);
+ /* Release temporary data */
+ NIFinishBuild(&build);
- cptr = res;
- for (ptr = cptr; ptr->lexeme; ptr++)
- {
- if (searchstoplist(&(d->stoplist), ptr->lexeme))
- {
- pfree(ptr->lexeme);
- ptr->lexeme = NULL;
- }
- else
- {
- if (cptr != ptr)
- memcpy(cptr, ptr, sizeof(TSLexeme));
- cptr++;
- }
- }
- cptr->lexeme = NULL;
-
- PG_RETURN_POINTER(res);
+ /* Return the buffer and its size */
+ *size = build.dict_size;
+ return build.dict;
}
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index 7d2382045a..9419100982 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -23,33 +23,35 @@
* Compilation of a dictionary
* ---------------------------
*
- * A compiled dictionary is stored in the IspellDict structure. Compilation of
- * a dictionary is divided into the several steps:
+ * A compiled dictionary is stored in the following structures:
+ * - IspellDictBuild - stores temporary data and IspellDictData
+ * - IspellDictData - stores permanent data used within NINormalizeWord()
+ * Compilation of the dictionary is divided into several steps:
* - NIImportDictionary() - stores each word of a .dict file in the
* temporary Spell field.
- * - NIImportAffixes() - stores affix rules of an .affix file in the
- * Affix field (not temporary) if an .affix file has the Ispell format.
+ * - NIImportAffixes() - stores affix rules of an .affix file in the temporary
+ * Affix field if an .affix file has the Ispell format.
* -> NIImportOOAffixes() - stores affix rules if an .affix file has the
* Hunspell format. The AffixData field is initialized if AF parameter
* is defined.
* - NISortDictionary() - builds a prefix tree (Trie) from the words list
- * and stores it in the Dictionary field. The words list is got from the
+ * and stores it in the DictNodes field. The words list is got from the
* Spell field. The AffixData field is initialized if AF parameter is not
* defined.
* - NISortAffixes():
* - builds a list of compound affixes from the affix list and stores it
* in the CompoundAffix.
* - builds prefix trees (Trie) from the affix list for prefixes and suffixes
- * and stores them in Suffix and Prefix fields.
+ * and stores them in SuffixNodes and PrefixNodes fields.
* The affix list is got from the Affix field.
+ * Persistent data of the dictionary is copied within NICopyData().
*
* Memory management
* -----------------
*
- * The IspellDict structure has the Spell field which is used only in compile
- * time. The Spell field stores a words list. It can take a lot of memory.
- * Therefore when a dictionary is compiled this field is cleared by
- * NIFinishBuild().
+ * The IspellDictBuild structure holds temporary data which is used only at
+ * compile time. It can take a lot of memory. Therefore, after the dictionary
+ * is compiled, this data is released by NIFinishBuild().
*
* All resources which should cleared by NIFinishBuild() is initialized using
* tmpalloc() and tmpalloc0().
@@ -73,112 +75,147 @@
* after the initialization is done. During initialization,
* CurrentMemoryContext is the long-lived memory context associated
* with the dictionary cache entry. We keep the short-lived stuff
- * in the Conf->buildCxt context.
+ * in the ConfBuild->buildCxt context.
*/
-#define tmpalloc(sz) MemoryContextAlloc(Conf->buildCxt, (sz))
-#define tmpalloc0(sz) MemoryContextAllocZero(Conf->buildCxt, (sz))
+#define tmpalloc(sz) MemoryContextAlloc(ConfBuild->buildCxt, (sz))
+#define tmpalloc0(sz) MemoryContextAllocZero(ConfBuild->buildCxt, (sz))
-#define tmpstrdup(str) MemoryContextStrdup(Conf->buildCxt, (str))
+#define tmpstrdup(str) MemoryContextStrdup(ConfBuild->buildCxt, (str))
/*
* Prepare for constructing an ISpell dictionary.
*
- * The IspellDict struct is assumed to be zeroed when allocated.
+ * The IspellDictBuild struct is assumed to be zeroed when allocated.
*/
void
-NIStartBuild(IspellDict *Conf)
+NIStartBuild(IspellDictBuild *ConfBuild)
{
+ uint32 dict_size;
+
/*
* The temp context is a child of CurTransactionContext, so that it will
* go away automatically on error.
*/
- Conf->buildCxt = AllocSetContextCreate(CurTransactionContext,
- "Ispell dictionary init context",
- ALLOCSET_DEFAULT_SIZES);
+ ConfBuild->buildCxt = AllocSetContextCreate(CurTransactionContext,
+ "Ispell dictionary init context",
+ ALLOCSET_DEFAULT_SIZES);
+
+ /*
+ * Allocate the buffer for the dictionary in the current context, not in buildCxt.
+ */
+ dict_size = MAXALIGN(IspellDictDataHdrSize);
+ ConfBuild->dict = palloc0(dict_size);
+ ConfBuild->dict_size = dict_size;
}
/*
- * Clean up when dictionary construction is complete.
+ * Copy compiled and persistent data into IspellDictData.
*/
void
-NIFinishBuild(IspellDict *Conf)
+NICopyData(IspellDictBuild *ConfBuild)
{
- /* Release no-longer-needed temp memory */
- MemoryContextDelete(Conf->buildCxt);
- /* Just for cleanliness, zero the now-dangling pointers */
- Conf->buildCxt = NULL;
- Conf->Spell = NULL;
- Conf->firstfree = NULL;
- Conf->CompoundAffixFlags = NULL;
-}
+ IspellDictData *dict;
+ uint32 size;
+ int i;
+ uint32 *offsets,
+ offset;
+ /*
+ * Calculate necessary space
+ */
+ size = ConfBuild->nAffixData * sizeof(uint32);
+ size += ConfBuild->AffixDataEnd;
-/*
- * "Compact" palloc: allocate without extra palloc overhead.
- *
- * Since we have no need to free the ispell data items individually, there's
- * not much value in the per-chunk overhead normally consumed by palloc.
- * Getting rid of it is helpful since ispell can allocate a lot of small nodes.
- *
- * We currently pre-zero all data allocated this way, even though some of it
- * doesn't need that. The cpalloc and cpalloc0 macros are just documentation
- * to indicate which allocations actually require zeroing.
- */
-#define COMPACT_ALLOC_CHUNK 8192 /* amount to get from palloc at once */
-#define COMPACT_MAX_REQ 1024 /* must be < COMPACT_ALLOC_CHUNK */
+ size += ConfBuild->nAffix * sizeof(uint32);
+ size += ConfBuild->AffixSize;
-static void *
-compact_palloc0(IspellDict *Conf, size_t size)
-{
- void *result;
+ size += ConfBuild->DictNodes.NodesEnd;
+ size += ConfBuild->PrefixNodes.NodesEnd;
+ size += ConfBuild->SuffixNodes.NodesEnd;
- /* Should only be called during init */
- Assert(Conf->buildCxt != NULL);
+ size += sizeof(CMPDAffix) * ConfBuild->nCompoundAffix;
- /* No point in this for large chunks */
- if (size > COMPACT_MAX_REQ)
- return palloc0(size);
+ /*
+ * Copy data itself
+ */
+ ConfBuild->dict_size = IspellDictDataHdrSize + size;
+ ConfBuild->dict = repalloc(ConfBuild->dict, ConfBuild->dict_size);
+
+ dict = ConfBuild->dict;
+
+ /* AffixData */
+ dict->nAffixData = ConfBuild->nAffixData;
+ dict->AffixDataStart = sizeof(uint32) * ConfBuild->nAffixData;
+ memcpy(DictAffixDataOffset(dict), ConfBuild->AffixDataOffset,
+ sizeof(uint32) * ConfBuild->nAffixData);
+ memcpy(DictAffixData(dict), ConfBuild->AffixData, ConfBuild->AffixDataEnd);
+
+ /* Affix array */
+ dict->nAffix = ConfBuild->nAffix;
+ dict->AffixOffsetStart = dict->AffixDataStart + ConfBuild->AffixDataEnd;
+ dict->AffixStart = dict->AffixOffsetStart + sizeof(uint32) * ConfBuild->nAffix;
+ offsets = (uint32 *) DictAffixOffset(dict);
+ offset = 0;
+ for (i = 0; i < ConfBuild->nAffix; i++)
+ {
+ AFFIX *affix;
+ uint32 size = AffixGetSize(ConfBuild->Affix[i]);
- /* Keep everything maxaligned */
- size = MAXALIGN(size);
+ offsets[i] = offset;
+ affix = (AFFIX *) DictAffixGet(dict, i);
+ Assert(affix);
- /* Need more space? */
- if (size > Conf->avail)
- {
- Conf->firstfree = palloc0(COMPACT_ALLOC_CHUNK);
- Conf->avail = COMPACT_ALLOC_CHUNK;
- }
+ memcpy(affix, ConfBuild->Affix[i], size);
- result = (void *) Conf->firstfree;
- Conf->firstfree += size;
- Conf->avail -= size;
+ offset += size;
+ }
- return result;
+ /* DictNodes prefix tree */
+ dict->DictNodesStart = dict->AffixStart + offset;
+ memcpy(DictDictNodes(dict), ConfBuild->DictNodes.Nodes,
+ ConfBuild->DictNodes.NodesEnd);
+
+ /* PrefixNodes prefix tree */
+ dict->PrefixNodesStart = dict->DictNodesStart + ConfBuild->DictNodes.NodesEnd;
+ memcpy(DictPrefixNodes(dict), ConfBuild->PrefixNodes.Nodes,
+ ConfBuild->PrefixNodes.NodesEnd);
+
+ /* SuffixNodes prefix tree */
+ dict->SuffixNodesStart = dict->PrefixNodesStart + ConfBuild->PrefixNodes.NodesEnd;
+ memcpy(DictSuffixNodes(dict), ConfBuild->SuffixNodes.Nodes,
+ ConfBuild->SuffixNodes.NodesEnd);
+
+ /* CompoundAffix array */
+ dict->CompoundAffixStart = dict->SuffixNodesStart +
+ ConfBuild->SuffixNodes.NodesEnd;
+ memcpy(DictCompoundAffix(dict), ConfBuild->CompoundAffix,
+ sizeof(CMPDAffix) * ConfBuild->nCompoundAffix);
}
-#define cpalloc(size) compact_palloc0(Conf, size)
-#define cpalloc0(size) compact_palloc0(Conf, size)
-
-static char *
-cpstrdup(IspellDict *Conf, const char *str)
+/*
+ * Clean up when dictionary construction is complete.
+ */
+void
+NIFinishBuild(IspellDictBuild *ConfBuild)
{
- char *res = cpalloc(strlen(str) + 1);
-
- strcpy(res, str);
- return res;
+ /* Release no-longer-needed temp memory */
+ MemoryContextDelete(ConfBuild->buildCxt);
+ /* Just for cleanliness, zero the now-dangling pointers */
+ ConfBuild->buildCxt = NULL;
+ ConfBuild->Spell = NULL;
+ ConfBuild->CompoundAffixFlags = NULL;
}
-
/*
* Apply lowerstr(), producing a temporary result (in the buildCxt).
*/
static char *
-lowerstr_ctx(IspellDict *Conf, const char *src)
+lowerstr_ctx(IspellDictBuild *ConfBuild, const char *src)
{
MemoryContext saveCtx;
char *dst;
- saveCtx = MemoryContextSwitchTo(Conf->buildCxt);
+ saveCtx = MemoryContextSwitchTo(ConfBuild->buildCxt);
dst = lowerstr(src);
MemoryContextSwitchTo(saveCtx);
@@ -190,7 +227,7 @@ lowerstr_ctx(IspellDict *Conf, const char *src)
#define STRNCMP(s,p) strncmp( (s), (p), strlen(p) )
#define GETWCHAR(W,L,N,T) ( ((const uint8*)(W))[ ((T)==FF_PREFIX) ? (N) : ( (L) - 1 - (N) ) ] )
-#define GETCHAR(A,N,T) GETWCHAR( (A)->repl, (A)->replen, N, T )
+#define GETCHAR(A,N,T) GETWCHAR( AffixFieldRepl(A), (A)->replen, N, T )
static char *VoidString = "";
@@ -311,18 +348,189 @@ strbncmp(const unsigned char *s1, const unsigned char *s2, size_t count)
static int
cmpaffix(const void *s1, const void *s2)
{
- const AFFIX *a1 = (const AFFIX *) s1;
- const AFFIX *a2 = (const AFFIX *) s2;
+ const AFFIX *a1 = *((AFFIX *const *) s1);
+ const AFFIX *a2 = *((AFFIX *const *) s2);
if (a1->type < a2->type)
return -1;
if (a1->type > a2->type)
return 1;
if (a1->type == FF_PREFIX)
- return strcmp(a1->repl, a2->repl);
+ return strcmp(AffixFieldRepl(a1), AffixFieldRepl(a2));
else
- return strbcmp((const unsigned char *) a1->repl,
- (const unsigned char *) a2->repl);
+ return strbcmp((const unsigned char *) AffixFieldRepl(a1),
+ (const unsigned char *) AffixFieldRepl(a2));
+}
+
+/*
+ * Allocate space for AffixData.
+ */
+static void
+InitAffixData(IspellDictBuild *ConfBuild, int numAffixData)
+{
+ uint32 size;
+
+ size = 8 * 1024; /* Reserve 8KB for data */
+
+ ConfBuild->AffixData = (char *) tmpalloc(size);
+ ConfBuild->AffixDataSize = size;
+ ConfBuild->AffixDataOffset = (uint32 *) tmpalloc(numAffixData * sizeof(uint32));
+ ConfBuild->nAffixData = 0;
+ ConfBuild->mAffixData = numAffixData;
+
+ /* Save offset of the end of data */
+ ConfBuild->AffixDataEnd = 0;
+}
+
+/*
+ * Add a set of affix flags to the AffixData buffer of IspellDictBuild. If the
+ * buffer cannot fit the new affix set then enlarge it.
+ *
+ * ConfBuild: building structure for the current dictionary.
+ * AffixSet: set of affix flags.
+ */
+static void
+AddAffixSet(IspellDictBuild *ConfBuild, const char *AffixSet,
+ uint32 AffixSetLen)
+{
+ /*
+ * Check available space for AffixSet.
+ */
+ if (ConfBuild->AffixDataEnd + AffixSetLen + 1 /* \0 */ >=
+ ConfBuild->AffixDataSize)
+ {
+ uint32 newsize = Max(ConfBuild->AffixDataSize + 8 * 1024 /* 8KB */,
+ ConfBuild->AffixDataSize + AffixSetLen + 1);
+
+ ConfBuild->AffixData = (char *) repalloc(ConfBuild->AffixData, newsize);
+ ConfBuild->AffixDataSize = newsize;
+ }
+
+ /* Check available number of offsets */
+ if (ConfBuild->nAffixData >= ConfBuild->mAffixData)
+ {
+ ConfBuild->mAffixData *= 2;
+ ConfBuild->AffixDataOffset = (uint32 *) repalloc(ConfBuild->AffixDataOffset,
+ sizeof(uint32) * ConfBuild->mAffixData);
+ }
+
+ ConfBuild->AffixDataOffset[ConfBuild->nAffixData] = ConfBuild->AffixDataEnd;
+ StrNCpy(AffixDataGet(ConfBuild, ConfBuild->nAffixData),
+ AffixSet, AffixSetLen + 1);
+
+ /* Save offset of the end of data */
+ ConfBuild->AffixDataEnd += AffixSetLen + 1;
+ ConfBuild->nAffixData++;
+}
+
+/*
+ * Allocate space for prefix tree node.
+ *
+ * ConfBuild: building structure for the current dictionary.
+ * array: NodeArray where to allocate new node.
+ * length: number of allocated NodeData.
+ * sizeNodeData: minimum size of each NodeData.
+ * sizeNodeHeader: size of header of new node.
+ *
+ * Returns an offset of new node in NodeArray->Nodes.
+ */
+static uint32
+AllocateNode(IspellDictBuild *ConfBuild, NodeArray *array, uint32 length,
+ uint32 sizeNodeData, uint32 sizeNodeHeader)
+{
+ uint32 node_offset;
+ uint32 size;
+
+ size = sizeNodeHeader + length * sizeNodeData;
+ size = MAXALIGN(size);
+
+ if (array->NodesSize == 0)
+ {
+ array->NodesSize = size * 32; /* Reserve space for next levels of the
+ * prefix tree */
+ array->Nodes = (char *) tmpalloc(array->NodesSize);
+ array->NodesEnd = 0;
+ }
+ else if (array->NodesEnd + size >= array->NodesSize)
+ {
+ array->NodesSize = Max(array->NodesSize * 2, array->NodesSize + size);
+ array->Nodes = (char *) repalloc(array->Nodes, array->NodesSize);
+ }
+
+ node_offset = array->NodesEnd;
+ array->NodesEnd += size;
+
+ return node_offset;
+}
+
+/*
+ * Allocate space for SPNode.
+ *
+ * Returns an offset of new node in ConfBuild->DictNodes.Nodes.
+ */
+static uint32
+AllocateSPNode(IspellDictBuild *ConfBuild, uint32 length)
+{
+ uint32 offset;
+ SPNode *node;
+ SPNodeData *data;
+ uint32 i;
+
+ offset = AllocateNode(ConfBuild, &ConfBuild->DictNodes, length,
+ sizeof(SPNodeData), SPNHDRSZ);
+ node = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, offset);
+ node->length = length;
+
+ /*
+ * Initialize all SPNodeData with default values. We cannot use memset()
+ * here because not all fields have 0 as default value.
+ */
+ for (i = 0; i < length; i++)
+ {
+ data = &(node->data[i]);
+ data->val = 0;
+ data->affix = ISPELL_INVALID_INDEX;
+ data->compoundflag = 0;
+ data->isword = 0;
+ data->node_offset = ISPELL_INVALID_OFFSET;
+ }
+
+ return offset;
+}
+
+/*
+ * Allocate space for AffixNode.
+ *
+ * Returns an offset of new node in NodeArray->Nodes.
+ */
+static uint32
+AllocateAffixNode(IspellDictBuild *ConfBuild, NodeArray *array, uint32 length)
+{
+ uint32 offset;
+ AffixNode *node;
+ AffixNodeData *data;
+ uint32 i;
+
+ offset = AllocateNode(ConfBuild, array, length, sizeof(AffixNodeData),
+ ANHRDSZ);
+ node = (AffixNode *) NodeArrayGet(array, offset);
+ node->length = length;
+ node->isvoid = 0;
+
+ /*
+ * Initialize all AffixNodeData with default values. We cannot use memset()
+ * here because not all fields have 0 as default value.
+ */
+ for (i = 0; i < length; i++)
+ {
+ data = &(node->data[i]);
+ data->val = 0;
+ data->affstart = ISPELL_INVALID_INDEX;
+ data->affend = ISPELL_INVALID_INDEX;
+ data->node_offset = ISPELL_INVALID_OFFSET;
+ }
+
+ return offset;
}
/*
@@ -333,7 +541,7 @@ cmpaffix(const void *s1, const void *s2)
* - 2 characters (FM_LONG). A character may be Unicode.
* - numbers from 1 to 65000 (FM_NUM).
*
- * Depending on the flagMode an affix string can have the following format:
+ * Depending on the flagmode an affix string can have the following format:
* - FM_CHAR: ABCD
* Here we have 4 flags: A, B, C and D
* - FM_LONG: ABCDE*
@@ -341,13 +549,13 @@ cmpaffix(const void *s1, const void *s2)
* - FM_NUM: 200,205,50
* Here we have 3 flags: 200, 205 and 50
*
- * Conf: current dictionary.
+ * flagmode: flag mode of the dictionary
* sflagset: the set of affix flags. Returns a reference to the start of a next
* affix flag.
* sflag: returns an affix flag from sflagset.
*/
static void
-getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
+getNextFlagFromString(FlagMode flagmode, char **sflagset, char *sflag)
{
int32 s;
char *next,
@@ -356,11 +564,11 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
bool stop = false;
bool met_comma = false;
- maxstep = (Conf->flagMode == FM_LONG) ? 2 : 1;
+ maxstep = (flagmode == FM_LONG) ? 2 : 1;
while (**sflagset)
{
- switch (Conf->flagMode)
+ switch (flagmode)
{
case FM_LONG:
case FM_CHAR:
@@ -422,15 +630,15 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
stop = true;
break;
default:
- elog(ERROR, "unrecognized type of Conf->flagMode: %d",
- Conf->flagMode);
+ elog(ERROR, "unrecognized type of flagmode: %d",
+ flagmode);
}
if (stop)
break;
}
- if (Conf->flagMode == FM_LONG && maxstep > 0)
+ if (flagmode == FM_LONG && maxstep > 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix flag \"%s\" with \"long\" flag value",
@@ -440,31 +648,28 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
}
/*
- * Checks if the affix set Conf->AffixData[affix] contains affixflag.
- * Conf->AffixData[affix] does not contain affixflag if this flag is not used
- * actually by the .dict file.
+ * Checks if the affix set from AffixData contains affixflag. The affix set does
+ * not contain affixflag if this flag is not actually used by the .dict file.
*
- * Conf: current dictionary.
- * affix: index of the Conf->AffixData array.
+ * flagmode: flag mode of the dictionary.
+ * sflagset: the set of affix flags.
* affixflag: the affix flag.
*
- * Returns true if the string Conf->AffixData[affix] contains affixflag,
- * otherwise returns false.
+ * Returns true if the affix set string contains affixflag, otherwise returns
+ * false.
*/
static bool
-IsAffixFlagInUse(IspellDict *Conf, int affix, const char *affixflag)
+IsAffixFlagInUse(FlagMode flagmode, char *sflagset, const char *affixflag)
{
- char *flagcur;
+ char *flagcur = sflagset;
char flag[BUFSIZ];
if (*affixflag == 0)
return true;
- flagcur = Conf->AffixData[affix];
-
while (*flagcur)
{
- getNextFlagFromString(Conf, &flagcur, flag);
+ getNextFlagFromString(flagmode, &flagcur, flag);
/* Compare first affix flag in flagcur with affixflag */
if (strcmp(flag, affixflag) == 0)
return true;
@@ -477,31 +682,33 @@ IsAffixFlagInUse(IspellDict *Conf, int affix, const char *affixflag)
/*
* Adds the new word into the temporary array Spell.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* word: new word.
* flag: set of affix flags. Single flag can be get by getNextFlagFromString().
*/
static void
-NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
+NIAddSpell(IspellDictBuild *ConfBuild, const char *word, const char *flag)
{
- if (Conf->nspell >= Conf->mspell)
+ if (ConfBuild->nSpell >= ConfBuild->mSpell)
{
- if (Conf->mspell)
+ if (ConfBuild->mSpell)
{
- Conf->mspell *= 2;
- Conf->Spell = (SPELL **) repalloc(Conf->Spell, Conf->mspell * sizeof(SPELL *));
+ ConfBuild->mSpell *= 2;
+ ConfBuild->Spell = (SPELL **) repalloc(ConfBuild->Spell,
+ ConfBuild->mSpell * sizeof(SPELL *));
}
else
{
- Conf->mspell = 1024 * 20;
- Conf->Spell = (SPELL **) tmpalloc(Conf->mspell * sizeof(SPELL *));
+ ConfBuild->mSpell = 1024 * 20;
+ ConfBuild->Spell = (SPELL **) tmpalloc(ConfBuild->mSpell * sizeof(SPELL *));
}
}
- Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
- strcpy(Conf->Spell[Conf->nspell]->word, word);
- Conf->Spell[Conf->nspell]->p.flag = (*flag != '\0')
+ ConfBuild->Spell[ConfBuild->nSpell] =
+ (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
+ strcpy(ConfBuild->Spell[ConfBuild->nSpell]->word, word);
+ ConfBuild->Spell[ConfBuild->nSpell]->p.flag = (*flag != '\0')
? tmpstrdup(flag) : VoidString;
- Conf->nspell++;
+ ConfBuild->nSpell++;
}
/*
@@ -509,11 +716,11 @@ NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
*
* Note caller must already have applied get_tsearch_config_filename.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* filename: path to the .dict file.
*/
void
-NIImportDictionary(IspellDict *Conf, const char *filename)
+NIImportDictionary(IspellDictBuild *ConfBuild, const char *filename)
{
tsearch_readline_state trst;
char *line;
@@ -564,9 +771,9 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
}
s += pg_mblen(s);
}
- pstr = lowerstr_ctx(Conf, line);
+ pstr = lowerstr_ctx(ConfBuild, line);
- NIAddSpell(Conf, pstr, flag);
+ NIAddSpell(ConfBuild, pstr, flag);
pfree(pstr);
pfree(line);
@@ -590,7 +797,7 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
* SFX M 0 's .
* is presented here.
*
- * Conf: current dictionary.
+ * dict: current dictionary.
* word: basic form of word.
* affixflag: affix flag, by which a basic form of word was generated.
* flag: compound flag used to compare with StopMiddle->compoundflag.
@@ -598,9 +805,9 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
* Returns 1 if the word was found in the prefix tree, else returns 0.
*/
static int
-FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
+FindWord(IspellDictData *dict, const char *word, const char *affixflag, int flag)
{
- SPNode *node = Conf->Dictionary;
+ SPNode *node = (SPNode *) DictDictNodes(dict);
SPNodeData *StopLow,
*StopHigh,
*StopMiddle;
@@ -636,10 +843,14 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
* Check if this affix rule is presented in the affix set
* with index StopMiddle->affix.
*/
- if (IsAffixFlagInUse(Conf, StopMiddle->affix, affixflag))
+ if (IsAffixFlagInUse(dict->flagMode,
+ DictAffixDataGet(dict, StopMiddle->affix),
+ affixflag))
return 1;
}
- node = StopMiddle->node;
+ /* Retrieve SPNode by the offset */
+ node = (SPNode *) DictNodeGet(DictDictNodes(dict),
+ StopMiddle->node_offset);
ptr++;
break;
}
@@ -657,7 +868,8 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
/*
* Adds a new affix rule to the Affix field.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary, used to allocate
+ * temporary data.
* flag: affix flag ('\' in the below example).
* flagflags: set of flags from the flagval field for this affix rule. This set
* is listed after '/' character in the added string (repl).
@@ -673,26 +885,54 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
* type: FF_SUFFIX or FF_PREFIX.
*/
static void
-NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
- const char *find, const char *repl, int type)
+NIAddAffix(IspellDictBuild *ConfBuild, const char *flag, char flagflags,
+ const char *mask, const char *find, const char *repl, int type)
{
AFFIX *Affix;
+ uint32 size;
+ uint32 flaglen = strlen(flag),
+ findlen = strlen(find),
+ repllen = strlen(repl),
+ masklen = strlen(mask);
+
+ /* Sanity checks */
+ if (flaglen > AF_FLAG_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix flag \"%s\" too long", flag)));
+ if (findlen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix find field \"%s\" too long", find)));
+ if (repllen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix repl field \"%s\" too long", repl)));
+ if (masklen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix mask field \"%s\" too long", mask)));
- if (Conf->naffixes >= Conf->maffixes)
+ if (ConfBuild->nAffix >= ConfBuild->mAffix)
{
- if (Conf->maffixes)
+ if (ConfBuild->mAffix)
{
- Conf->maffixes *= 2;
- Conf->Affix = (AFFIX *) repalloc((void *) Conf->Affix, Conf->maffixes * sizeof(AFFIX));
+ ConfBuild->mAffix *= 2;
+ ConfBuild->Affix = (AFFIX **) repalloc(ConfBuild->Affix,
+ ConfBuild->mAffix * sizeof(AFFIX *));
}
else
{
- Conf->maffixes = 16;
- Conf->Affix = (AFFIX *) palloc(Conf->maffixes * sizeof(AFFIX));
+ ConfBuild->mAffix = 255;
+ ConfBuild->Affix = (AFFIX **) tmpalloc(ConfBuild->mAffix * sizeof(AFFIX *));
}
}
- Affix = Conf->Affix + Conf->naffixes;
+ size = AFFIXHDRSZ + flaglen + 1 /* \0 */ + findlen + 1 /* \0 */ +
+ repllen + 1 /* \0 */ + masklen + 1 /* \0 */;
+
+ Affix = (AFFIX *) tmpalloc(size);
+ ConfBuild->Affix[ConfBuild->nAffix] = Affix;
/* This affix rule can be applied for words with any ending */
if (strcmp(mask, ".") == 0 || *mask == '\0')
@@ -705,42 +945,12 @@ NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
{
Affix->issimple = 0;
Affix->isregis = 1;
- RS_compile(&(Affix->reg.regis), (type == FF_SUFFIX),
- *mask ? mask : VoidString);
}
/* This affix rule will use regex_t to search word ending */
else
{
- int masklen;
- int wmasklen;
- int err;
- pg_wchar *wmask;
- char *tmask;
-
Affix->issimple = 0;
Affix->isregis = 0;
- tmask = (char *) tmpalloc(strlen(mask) + 3);
- if (type == FF_SUFFIX)
- sprintf(tmask, "%s$", mask);
- else
- sprintf(tmask, "^%s", mask);
-
- masklen = strlen(tmask);
- wmask = (pg_wchar *) tmpalloc((masklen + 1) * sizeof(pg_wchar));
- wmasklen = pg_mb2wchar_with_len(tmask, wmask, masklen);
-
- err = pg_regcomp(&(Affix->reg.regex), wmask, wmasklen,
- REG_ADVANCED | REG_NOSUB,
- DEFAULT_COLLATION_OID);
- if (err)
- {
- char errstr[100];
-
- pg_regerror(err, &(Affix->reg.regex), errstr, sizeof(errstr));
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
- errmsg("invalid regular expression: %s", errstr)));
- }
}
Affix->flagflags = flagflags;
@@ -749,15 +959,22 @@ NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
if ((Affix->flagflags & FF_COMPOUNDFLAG) == 0)
Affix->flagflags |= FF_COMPOUNDFLAG;
}
- Affix->flag = cpstrdup(Conf, flag);
+
Affix->type = type;
- Affix->find = (find && *find) ? cpstrdup(Conf, find) : VoidString;
- if ((Affix->replen = strlen(repl)) > 0)
- Affix->repl = cpstrdup(Conf, repl);
- else
- Affix->repl = VoidString;
- Conf->naffixes++;
+ Affix->replen = repllen;
+ StrNCpy(AffixFieldRepl(Affix), repl, repllen + 1);
+
+ Affix->findlen = findlen;
+ StrNCpy(AffixFieldFind(Affix), find, findlen + 1);
+
+ Affix->masklen = masklen;
+ StrNCpy(AffixFieldMask(Affix), mask, masklen + 1);
+
+ StrNCpy(AffixFieldFlag(Affix), flag, flaglen + 1);
+
+ ConfBuild->nAffix++;
+ ConfBuild->AffixSize += size;
}
/* Parsing states for parse_affentry() and friends */
@@ -1021,10 +1238,10 @@ parse_affentry(char *str, char *mask, char *find, char *repl)
* Sets a Hunspell options depending on flag type.
*/
static void
-setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
+setCompoundAffixFlagValue(IspellDictBuild *ConfBuild, CompoundAffixFlag *entry,
char *s, uint32 val)
{
- if (Conf->flagMode == FM_NUM)
+ if (ConfBuild->dict->flagMode == FM_NUM)
{
char *next;
int i;
@@ -1044,19 +1261,19 @@ setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
else
entry->flag.s = tmpstrdup(s);
- entry->flagMode = Conf->flagMode;
+ entry->flagMode = ConfBuild->dict->flagMode;
entry->value = val;
}
/*
* Sets up a correspondence for the affix parameter with the affix flag.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* s: affix flag in string.
* val: affix parameter.
*/
static void
-addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
+addCompoundAffixFlagValue(IspellDictBuild *ConfBuild, char *s, uint32 val)
{
CompoundAffixFlag *newValue;
char sbuf[BUFSIZ];
@@ -1083,29 +1300,29 @@ addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
*sflag = '\0';
/* Resize array or allocate memory for array CompoundAffixFlag */
- if (Conf->nCompoundAffixFlag >= Conf->mCompoundAffixFlag)
+ if (ConfBuild->nCompoundAffixFlag >= ConfBuild->mCompoundAffixFlag)
{
- if (Conf->mCompoundAffixFlag)
+ if (ConfBuild->mCompoundAffixFlag)
{
- Conf->mCompoundAffixFlag *= 2;
- Conf->CompoundAffixFlags = (CompoundAffixFlag *)
- repalloc((void *) Conf->CompoundAffixFlags,
- Conf->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
+ ConfBuild->mCompoundAffixFlag *= 2;
+ ConfBuild->CompoundAffixFlags = (CompoundAffixFlag *)
+ repalloc((void *) ConfBuild->CompoundAffixFlags,
+ ConfBuild->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
}
else
{
- Conf->mCompoundAffixFlag = 10;
- Conf->CompoundAffixFlags = (CompoundAffixFlag *)
- tmpalloc(Conf->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
+ ConfBuild->mCompoundAffixFlag = 10;
+ ConfBuild->CompoundAffixFlags = (CompoundAffixFlag *)
+ tmpalloc(ConfBuild->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
}
}
- newValue = Conf->CompoundAffixFlags + Conf->nCompoundAffixFlag;
+ newValue = ConfBuild->CompoundAffixFlags + ConfBuild->nCompoundAffixFlag;
- setCompoundAffixFlagValue(Conf, newValue, sbuf, val);
+ setCompoundAffixFlagValue(ConfBuild, newValue, sbuf, val);
- Conf->usecompound = true;
- Conf->nCompoundAffixFlag++;
+ ConfBuild->dict->usecompound = true;
+ ConfBuild->nCompoundAffixFlag++;
}
/*
@@ -1113,7 +1330,7 @@ addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
* flags s.
*/
static int
-getCompoundAffixFlagValue(IspellDict *Conf, char *s)
+getCompoundAffixFlagValue(IspellDictBuild *ConfBuild, char *s)
{
uint32 flag = 0;
CompoundAffixFlag *found,
@@ -1121,18 +1338,18 @@ getCompoundAffixFlagValue(IspellDict *Conf, char *s)
char sflag[BUFSIZ];
char *flagcur;
- if (Conf->nCompoundAffixFlag == 0)
+ if (ConfBuild->nCompoundAffixFlag == 0)
return 0;
flagcur = s;
while (*flagcur)
{
- getNextFlagFromString(Conf, &flagcur, sflag);
- setCompoundAffixFlagValue(Conf, &key, sflag, 0);
+ getNextFlagFromString(ConfBuild->dict->flagMode, &flagcur, sflag);
+ setCompoundAffixFlagValue(ConfBuild, &key, sflag, 0);
found = (CompoundAffixFlag *)
- bsearch(&key, (void *) Conf->CompoundAffixFlags,
- Conf->nCompoundAffixFlag, sizeof(CompoundAffixFlag),
+ bsearch(&key, (void *) ConfBuild->CompoundAffixFlags,
+ ConfBuild->nCompoundAffixFlag, sizeof(CompoundAffixFlag),
cmpcmdflag);
if (found != NULL)
flag |= found->value;
@@ -1144,14 +1361,13 @@ getCompoundAffixFlagValue(IspellDict *Conf, char *s)
/*
* Returns a flag set using the s parameter.
*
- * If Conf->useFlagAliases is true then the s parameter is index of the
- * Conf->AffixData array and function returns its entry.
- * Else function returns the s parameter.
+ * If useFlagAliases is true then the s parameter is an index into the AffixData
+ * array and the function returns that entry. Otherwise it returns s itself.
*/
static char *
-getAffixFlagSet(IspellDict *Conf, char *s)
+getAffixFlagSet(IspellDictBuild *ConfBuild, char *s)
{
- if (Conf->useFlagAliases && *s != '\0')
+ if (ConfBuild->dict->useFlagAliases && *s != '\0')
{
int curaffix;
char *end;
@@ -1162,13 +1378,13 @@ getAffixFlagSet(IspellDict *Conf, char *s)
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix alias \"%s\"", s)));
- if (curaffix > 0 && curaffix <= Conf->nAffixData)
+ if (curaffix > 0 && curaffix <= ConfBuild->nAffixData)
/*
* Do not subtract 1 from curaffix because empty string was added
* in NIImportOOAffixes
*/
- return Conf->AffixData[curaffix];
+ return AffixDataGet(ConfBuild, curaffix);
else
return VoidString;
}
@@ -1179,11 +1395,11 @@ getAffixFlagSet(IspellDict *Conf, char *s)
/*
* Import an affix file that follows MySpell or Hunspell format.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* filename: path to the .affix file.
*/
static void
-NIImportOOAffixes(IspellDict *Conf, const char *filename)
+NIImportOOAffixes(IspellDictBuild *ConfBuild, const char *filename)
{
char type[BUFSIZ],
*ptype = NULL;
@@ -1195,17 +1411,16 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
char repl[BUFSIZ],
*prepl;
bool isSuffix = false;
- int naffix = 0,
- curaffix = 0;
+ int naffix = 0;
int sflaglen = 0;
char flagflags = 0;
tsearch_readline_state trst;
char *recoded;
/* read file to find any flag */
- Conf->usecompound = false;
- Conf->useFlagAliases = false;
- Conf->flagMode = FM_CHAR;
+ ConfBuild->dict->usecompound = false;
+ ConfBuild->dict->useFlagAliases = false;
+ ConfBuild->dict->flagMode = FM_CHAR;
if (!tsearch_readline_begin(&trst, filename))
ereport(ERROR,
@@ -1222,30 +1437,36 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
}
if (STRNCMP(recoded, "COMPOUNDFLAG") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDFLAG"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDFLAG"),
FF_COMPOUNDFLAG);
else if (STRNCMP(recoded, "COMPOUNDBEGIN") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDBEGIN"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDBEGIN"),
FF_COMPOUNDBEGIN);
else if (STRNCMP(recoded, "COMPOUNDLAST") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDLAST"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDLAST"),
FF_COMPOUNDLAST);
/* COMPOUNDLAST and COMPOUNDEND are synonyms */
else if (STRNCMP(recoded, "COMPOUNDEND") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDEND"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDEND"),
FF_COMPOUNDLAST);
else if (STRNCMP(recoded, "COMPOUNDMIDDLE") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDMIDDLE"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDMIDDLE"),
FF_COMPOUNDMIDDLE);
else if (STRNCMP(recoded, "ONLYINCOMPOUND") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("ONLYINCOMPOUND"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("ONLYINCOMPOUND"),
FF_COMPOUNDONLY);
else if (STRNCMP(recoded, "COMPOUNDPERMITFLAG") == 0)
- addCompoundAffixFlagValue(Conf,
+ addCompoundAffixFlagValue(ConfBuild,
recoded + strlen("COMPOUNDPERMITFLAG"),
FF_COMPOUNDPERMITFLAG);
else if (STRNCMP(recoded, "COMPOUNDFORBIDFLAG") == 0)
- addCompoundAffixFlagValue(Conf,
+ addCompoundAffixFlagValue(ConfBuild,
recoded + strlen("COMPOUNDFORBIDFLAG"),
FF_COMPOUNDFORBIDFLAG);
else if (STRNCMP(recoded, "FLAG") == 0)
@@ -1258,9 +1479,9 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
if (*s)
{
if (STRNCMP(s, "long") == 0)
- Conf->flagMode = FM_LONG;
+ ConfBuild->dict->flagMode = FM_LONG;
else if (STRNCMP(s, "num") == 0)
- Conf->flagMode = FM_NUM;
+ ConfBuild->dict->flagMode = FM_NUM;
else if (STRNCMP(s, "default") != 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
@@ -1274,8 +1495,8 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
}
tsearch_readline_end(&trst);
- if (Conf->nCompoundAffixFlag > 1)
- qsort((void *) Conf->CompoundAffixFlags, Conf->nCompoundAffixFlag,
+ if (ConfBuild->nCompoundAffixFlag > 1)
+ qsort((void *) ConfBuild->CompoundAffixFlags, ConfBuild->nCompoundAffixFlag,
sizeof(CompoundAffixFlag), cmpcmdflag);
if (!tsearch_readline_begin(&trst, filename))
@@ -1295,15 +1516,15 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
if (ptype)
pfree(ptype);
- ptype = lowerstr_ctx(Conf, type);
+ ptype = lowerstr_ctx(ConfBuild, type);
/* First try to parse AF parameter (alias compression) */
if (STRNCMP(ptype, "af") == 0)
{
/* First line is the number of aliases */
- if (!Conf->useFlagAliases)
+ if (!ConfBuild->dict->useFlagAliases)
{
- Conf->useFlagAliases = true;
+ ConfBuild->dict->useFlagAliases = true;
naffix = atoi(sflag);
if (naffix == 0)
ereport(ERROR,
@@ -1313,21 +1534,15 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
/* Also reserve place for empty flag set */
naffix++;
- Conf->AffixData = (char **) palloc0(naffix * sizeof(char *));
- Conf->lenAffixData = Conf->nAffixData = naffix;
+ InitAffixData(ConfBuild, naffix);
/* Add empty flag set into AffixData */
- Conf->AffixData[curaffix] = VoidString;
- curaffix++;
+ AddAffixSet(ConfBuild, VoidString, 0);
}
/* Other lines is aliases */
else
{
- if (curaffix < naffix)
- {
- Conf->AffixData[curaffix] = cpstrdup(Conf, sflag);
- curaffix++;
- }
+ AddAffixSet(ConfBuild, sflag, strlen(sflag));
}
goto nextline;
}
@@ -1338,8 +1553,8 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
sflaglen = strlen(sflag);
if (sflaglen == 0
- || (sflaglen > 1 && Conf->flagMode == FM_CHAR)
- || (sflaglen > 2 && Conf->flagMode == FM_LONG))
+ || (sflaglen > 1 && ConfBuild->dict->flagMode == FM_CHAR)
+ || (sflaglen > 2 && ConfBuild->dict->flagMode == FM_LONG))
goto nextline;
/*--------
@@ -1367,21 +1582,21 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
/* Get flags after '/' (flags are case sensitive) */
if ((ptr = strchr(repl, '/')) != NULL)
- aflg |= getCompoundAffixFlagValue(Conf,
- getAffixFlagSet(Conf,
+ aflg |= getCompoundAffixFlagValue(ConfBuild,
+ getAffixFlagSet(ConfBuild,
ptr + 1));
/* Get lowercased version of string before '/' */
- prepl = lowerstr_ctx(Conf, repl);
+ prepl = lowerstr_ctx(ConfBuild, repl);
if ((ptr = strchr(prepl, '/')) != NULL)
*ptr = '\0';
- pfind = lowerstr_ctx(Conf, find);
- pmask = lowerstr_ctx(Conf, mask);
+ pfind = lowerstr_ctx(ConfBuild, find);
+ pmask = lowerstr_ctx(ConfBuild, mask);
if (t_iseq(find, '0'))
*pfind = '\0';
if (t_iseq(repl, '0'))
*prepl = '\0';
- NIAddAffix(Conf, sflag, flagflags | aflg, pmask, pfind, prepl,
+ NIAddAffix(ConfBuild, sflag, flagflags | aflg, pmask, pfind, prepl,
isSuffix ? FF_SUFFIX : FF_PREFIX);
pfree(prepl);
pfree(pfind);
@@ -1407,7 +1622,7 @@ nextline:
* work to NIImportOOAffixes(), which will re-read the whole file.
*/
void
-NIImportAffixes(IspellDict *Conf, const char *filename)
+NIImportAffixes(IspellDictBuild *ConfBuild, const char *filename)
{
char *pstr = NULL;
char flag[BUFSIZ];
@@ -1428,9 +1643,9 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
errmsg("could not open affix file \"%s\": %m",
filename)));
- Conf->usecompound = false;
- Conf->useFlagAliases = false;
- Conf->flagMode = FM_CHAR;
+ ConfBuild->dict->usecompound = false;
+ ConfBuild->dict->useFlagAliases = false;
+ ConfBuild->dict->flagMode = FM_CHAR;
while ((recoded = tsearch_readline(&trst)) != NULL)
{
@@ -1452,10 +1667,8 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
s += pg_mblen(s);
if (*s && pg_mblen(s) == 1)
- {
- addCompoundAffixFlagValue(Conf, s, FF_COMPOUNDFLAG);
- Conf->usecompound = true;
- }
+ addCompoundAffixFlagValue(ConfBuild, s, FF_COMPOUNDFLAG);
+
oldformat = true;
goto nextline;
}
@@ -1528,7 +1741,8 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
if (!parse_affentry(pstr, mask, find, repl))
goto nextline;
- NIAddAffix(Conf, flag, flagflags, mask, find, repl, suffixes ? FF_SUFFIX : FF_PREFIX);
+ NIAddAffix(ConfBuild, flag, flagflags, mask, find, repl,
+ suffixes ? FF_SUFFIX : FF_PREFIX);
nextline:
pfree(recoded);
@@ -1547,53 +1761,48 @@ isnewformat:
errmsg("affix file contains both old-style and new-style commands")));
tsearch_readline_end(&trst);
- NIImportOOAffixes(Conf, filename);
+ NIImportOOAffixes(ConfBuild, filename);
}
/*
* Merges two affix flag sets and stores a new affix flag set into
- * Conf->AffixData.
+ * ConfBuild->AffixData.
*
* Returns index of a new affix flag set.
*/
static int
-MergeAffix(IspellDict *Conf, int a1, int a2)
+MergeAffix(IspellDictBuild *ConfBuild, int a1, int a2)
{
- char **ptr;
+ char *ptr;
+ uint32 len;
/* Do not merge affix flags if one of affix flags is empty */
- if (*Conf->AffixData[a1] == '\0')
+ if (*AffixDataGet(ConfBuild, a1) == '\0')
return a2;
- else if (*Conf->AffixData[a2] == '\0')
+ else if (*AffixDataGet(ConfBuild, a2) == '\0')
return a1;
- while (Conf->nAffixData + 1 >= Conf->lenAffixData)
- {
- Conf->lenAffixData *= 2;
- Conf->AffixData = (char **) repalloc(Conf->AffixData,
- sizeof(char *) * Conf->lenAffixData);
- }
-
- ptr = Conf->AffixData + Conf->nAffixData;
- if (Conf->flagMode == FM_NUM)
+ if (ConfBuild->dict->flagMode == FM_NUM)
{
- *ptr = cpalloc(strlen(Conf->AffixData[a1]) +
- strlen(Conf->AffixData[a2]) +
- 1 /* comma */ + 1 /* \0 */ );
- sprintf(*ptr, "%s,%s", Conf->AffixData[a1], Conf->AffixData[a2]);
+ len = strlen(AffixDataGet(ConfBuild, a1)) + 1 /* comma */ +
+ strlen(AffixDataGet(ConfBuild, a2));
+ ptr = tmpalloc(len + 1 /* \0 */);
+ sprintf(ptr, "%s,%s", AffixDataGet(ConfBuild, a1),
+ AffixDataGet(ConfBuild, a2));
}
else
{
- *ptr = cpalloc(strlen(Conf->AffixData[a1]) +
- strlen(Conf->AffixData[a2]) +
- 1 /* \0 */ );
- sprintf(*ptr, "%s%s", Conf->AffixData[a1], Conf->AffixData[a2]);
+ len = strlen(AffixDataGet(ConfBuild, a1)) +
+ strlen(AffixDataGet(ConfBuild, a2));
+ ptr = tmpalloc(len + 1 /* \0 */ );
+ sprintf(ptr, "%s%s", AffixDataGet(ConfBuild, a1),
+ AffixDataGet(ConfBuild, a2));
}
- ptr++;
- *ptr = NULL;
- Conf->nAffixData++;
- return Conf->nAffixData - 1;
+ AddAffixSet(ConfBuild, ptr, len);
+ pfree(ptr);
+
+ return ConfBuild->nAffixData - 1;
}
/*
@@ -1601,66 +1810,87 @@ MergeAffix(IspellDict *Conf, int a1, int a2)
* flags with the given index.
*/
static uint32
-makeCompoundFlags(IspellDict *Conf, int affix)
+makeCompoundFlags(IspellDictBuild *ConfBuild, int affix)
{
- char *str = Conf->AffixData[affix];
+ char *str = AffixDataGet(ConfBuild, affix);
- return (getCompoundAffixFlagValue(Conf, str) & FF_COMPOUNDFLAGMASK);
+ return (getCompoundAffixFlagValue(ConfBuild, str) & FF_COMPOUNDFLAGMASK);
}
/*
* Makes a prefix tree for the given level.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* low: lower index of the Conf->Spell array.
* high: upper index of the Conf->Spell array.
* level: current prefix tree level.
+ *
+ * Returns an offset of SPNode in DictNodes.
*/
-static SPNode *
-mkSPNode(IspellDict *Conf, int low, int high, int level)
+static uint32
+mkSPNode(IspellDictBuild *ConfBuild, int low, int high, int level)
{
int i;
int nchar = 0;
char lastchar = '\0';
+ uint32 rs_offset,
+ new_offset;
SPNode *rs;
SPNodeData *data;
+ int data_index = 0;
int lownew = low;
for (i = low; i < high; i++)
- if (Conf->Spell[i]->p.d.len > level && lastchar != Conf->Spell[i]->word[level])
+ if (ConfBuild->Spell[i]->p.d.len > level &&
+ lastchar != ConfBuild->Spell[i]->word[level])
{
nchar++;
- lastchar = Conf->Spell[i]->word[level];
+ lastchar = ConfBuild->Spell[i]->word[level];
}
if (!nchar)
- return NULL;
+ return ISPELL_INVALID_OFFSET;
- rs = (SPNode *) cpalloc0(SPNHDRSZ + nchar * sizeof(SPNodeData));
- rs->length = nchar;
+ rs_offset = AllocateSPNode(ConfBuild, nchar);
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
data = rs->data;
lastchar = '\0';
for (i = low; i < high; i++)
- if (Conf->Spell[i]->p.d.len > level)
+ if (ConfBuild->Spell[i]->p.d.len > level)
{
- if (lastchar != Conf->Spell[i]->word[level])
+ if (lastchar != ConfBuild->Spell[i]->word[level])
{
if (lastchar)
{
/* Next level of the prefix tree */
- data->node = mkSPNode(Conf, lownew, i, level + 1);
+ new_offset = mkSPNode(ConfBuild, lownew, i, level + 1);
+
+ /*
+ * ConfBuild->DictNodes can be repalloc'ed within
+ * mkSPNode(), so reinitialize pointers.
+ */
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
+
+ /* First save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ /* Work with next node */
+ data_index++;
+ Assert(data_index < nchar);
+ data = &(rs->data[data_index]);
+
lownew = i;
- data++;
}
- lastchar = Conf->Spell[i]->word[level];
+ lastchar = ConfBuild->Spell[i]->word[level];
}
- data->val = ((uint8 *) (Conf->Spell[i]->word))[level];
- if (Conf->Spell[i]->p.d.len == level + 1)
+ data->val = ((uint8 *) (ConfBuild->Spell[i]->word))[level];
+ if (ConfBuild->Spell[i]->p.d.len == level + 1)
{
bool clearCompoundOnly = false;
- if (data->isword && data->affix != Conf->Spell[i]->p.d.affix)
+ if (data->isword && data->affix != ConfBuild->Spell[i]->p.d.affix)
{
/*
* MergeAffix called a few times. If one of word is
@@ -1669,15 +1899,17 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
*/
clearCompoundOnly = (FF_COMPOUNDONLY & data->compoundflag
- & makeCompoundFlags(Conf, Conf->Spell[i]->p.d.affix))
+ & makeCompoundFlags(ConfBuild,
+ ConfBuild->Spell[i]->p.d.affix))
? false : true;
- data->affix = MergeAffix(Conf, data->affix, Conf->Spell[i]->p.d.affix);
+ data->affix = MergeAffix(ConfBuild, data->affix,
+ ConfBuild->Spell[i]->p.d.affix);
}
else
- data->affix = Conf->Spell[i]->p.d.affix;
+ data->affix = ConfBuild->Spell[i]->p.d.affix;
data->isword = 1;
- data->compoundflag = makeCompoundFlags(Conf, data->affix);
+ data->compoundflag = makeCompoundFlags(ConfBuild, data->affix);
if ((data->compoundflag & FF_COMPOUNDONLY) &&
(data->compoundflag & FF_COMPOUNDFLAG) == 0)
@@ -1689,9 +1921,19 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
}
/* Next level of the prefix tree */
- data->node = mkSPNode(Conf, lownew, high, level + 1);
+ new_offset = mkSPNode(ConfBuild, lownew, high, level + 1);
+
+ /*
+ * ConfBuild->DictNodes can be repalloc'ed within mkSPNode(), so
+ * reinitialize pointers.
+ */
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
- return rs;
+ /* Save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ return rs_offset;
}
/*
@@ -1699,7 +1941,7 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
* and affixes.
*/
void
-NISortDictionary(IspellDict *Conf)
+NISortDictionary(IspellDictBuild *ConfBuild)
{
int i;
int naffix = 0;
@@ -1708,81 +1950,81 @@ NISortDictionary(IspellDict *Conf)
/* compress affixes */
/*
- * If we use flag aliases then we need to use Conf->AffixData filled in
+ * If we use flag aliases then we need to use ConfBuild->AffixData filled in
* the NIImportOOAffixes().
*/
- if (Conf->useFlagAliases)
+ if (ConfBuild->dict->useFlagAliases)
{
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
char *end;
- if (*Conf->Spell[i]->p.flag != '\0')
+ if (*ConfBuild->Spell[i]->p.flag != '\0')
{
- curaffix = strtol(Conf->Spell[i]->p.flag, &end, 10);
- if (Conf->Spell[i]->p.flag == end || errno == ERANGE)
+ curaffix = strtol(ConfBuild->Spell[i]->p.flag, &end, 10);
+ if (ConfBuild->Spell[i]->p.flag == end || errno == ERANGE)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix alias \"%s\"",
- Conf->Spell[i]->p.flag)));
+ ConfBuild->Spell[i]->p.flag)));
}
else
{
/*
- * If Conf->Spell[i]->p.flag is empty, then get empty value of
- * Conf->AffixData (0 index).
+ * If ConfBuild->Spell[i]->p.flag is empty, then get empty
+ * value of ConfBuild->AffixData (0 index).
*/
curaffix = 0;
}
- Conf->Spell[i]->p.d.affix = curaffix;
- Conf->Spell[i]->p.d.len = strlen(Conf->Spell[i]->word);
+ ConfBuild->Spell[i]->p.d.affix = curaffix;
+ ConfBuild->Spell[i]->p.d.len = strlen(ConfBuild->Spell[i]->word);
}
}
- /* Otherwise fill Conf->AffixData here */
+ /* Otherwise fill ConfBuild->AffixData here */
else
{
/* Count the number of different flags used in the dictionary */
- qsort((void *) Conf->Spell, Conf->nspell, sizeof(SPELL *),
+ qsort((void *) ConfBuild->Spell, ConfBuild->nSpell, sizeof(SPELL *),
cmpspellaffix);
naffix = 0;
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
if (i == 0
- || strcmp(Conf->Spell[i]->p.flag, Conf->Spell[i - 1]->p.flag))
+ || strcmp(ConfBuild->Spell[i]->p.flag,
+ ConfBuild->Spell[i - 1]->p.flag))
naffix++;
}
/*
- * Fill in Conf->AffixData with the affixes that were used in the
- * dictionary. Replace textual flag-field of Conf->Spell entries with
- * indexes into Conf->AffixData array.
+ * Fill in AffixData with the affixes that were used in the
+ * dictionary. Replace textual flag-field of ConfBuild->Spell entries
+ * with indexes into ConfBuild->AffixData array.
*/
- Conf->AffixData = (char **) palloc0(naffix * sizeof(char *));
+ InitAffixData(ConfBuild, naffix);
curaffix = -1;
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
if (i == 0
- || strcmp(Conf->Spell[i]->p.flag, Conf->AffixData[curaffix]))
+ || strcmp(ConfBuild->Spell[i]->p.flag,
+ AffixDataGet(ConfBuild, curaffix)))
{
curaffix++;
Assert(curaffix < naffix);
- Conf->AffixData[curaffix] = cpstrdup(Conf,
- Conf->Spell[i]->p.flag);
+ AddAffixSet(ConfBuild, ConfBuild->Spell[i]->p.flag,
+ strlen(ConfBuild->Spell[i]->p.flag));
}
- Conf->Spell[i]->p.d.affix = curaffix;
- Conf->Spell[i]->p.d.len = strlen(Conf->Spell[i]->word);
+ ConfBuild->Spell[i]->p.d.affix = curaffix;
+ ConfBuild->Spell[i]->p.d.len = strlen(ConfBuild->Spell[i]->word);
}
-
- Conf->lenAffixData = Conf->nAffixData = naffix;
}
/* Start build a prefix tree */
- qsort((void *) Conf->Spell, Conf->nspell, sizeof(SPELL *), cmpspell);
- Conf->Dictionary = mkSPNode(Conf, 0, Conf->nspell, 0);
+ qsort((void *) ConfBuild->Spell, ConfBuild->nSpell, sizeof(SPELL *), cmpspell);
+ mkSPNode(ConfBuild, 0, ConfBuild->nSpell, 0);
}
/*
@@ -1790,83 +2032,104 @@ NISortDictionary(IspellDict *Conf)
* rule. Affixes with empty replace string do not include in the prefix tree.
* This affixes are included by mkVoidAffix().
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* low: lower index of the Conf->Affix array.
* high: upper index of the Conf->Affix array.
* level: current prefix tree level.
* type: FF_SUFFIX or FF_PREFIX.
+ *
+ * Returns an offset in nodes array.
*/
-static AffixNode *
-mkANode(IspellDict *Conf, int low, int high, int level, int type)
+static uint32
+mkANode(IspellDictBuild *ConfBuild, int low, int high, int level, int type)
{
int i;
int nchar = 0;
uint8 lastchar = '\0';
+ NodeArray *array;
+ uint32 rs_offset,
+ new_offset;
AffixNode *rs;
AffixNodeData *data;
+ int data_index = 0;
int lownew = low;
- int naff;
- AFFIX **aff;
for (i = low; i < high; i++)
- if (Conf->Affix[i].replen > level && lastchar != GETCHAR(Conf->Affix + i, level, type))
+ if (ConfBuild->Affix[i]->replen > level &&
+ lastchar != GETCHAR(ConfBuild->Affix[i], level, type))
{
nchar++;
- lastchar = GETCHAR(Conf->Affix + i, level, type);
+ lastchar = GETCHAR(ConfBuild->Affix[i], level, type);
}
if (!nchar)
- return NULL;
+ return ISPELL_INVALID_OFFSET;
- aff = (AFFIX **) tmpalloc(sizeof(AFFIX *) * (high - low + 1));
- naff = 0;
+ if (type == FF_SUFFIX)
+ array = &ConfBuild->SuffixNodes;
+ else
+ array = &ConfBuild->PrefixNodes;
- rs = (AffixNode *) cpalloc0(ANHRDSZ + nchar * sizeof(AffixNodeData));
- rs->length = nchar;
- data = rs->data;
+ rs_offset = AllocateAffixNode(ConfBuild, array, nchar);
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
+ data = (AffixNodeData *) rs->data;
lastchar = '\0';
for (i = low; i < high; i++)
- if (Conf->Affix[i].replen > level)
+ if (ConfBuild->Affix[i]->replen > level)
{
- if (lastchar != GETCHAR(Conf->Affix + i, level, type))
+ if (lastchar != GETCHAR(ConfBuild->Affix[i], level, type))
{
if (lastchar)
{
/* Next level of the prefix tree */
- data->node = mkANode(Conf, lownew, i, level + 1, type);
- if (naff)
- {
- data->naff = naff;
- data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * naff);
- memcpy(data->aff, aff, sizeof(AFFIX *) * naff);
- naff = 0;
- }
- data++;
+ new_offset = mkANode(ConfBuild, lownew, i, level + 1, type);
+
+ /*
+ * array can be repalloc'ed within mkANode(), so
+ * reinitialize pointers.
+ */
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
+
+ /* First save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ /* Handle next data node */
+ data_index++;
+ Assert(data_index < nchar);
+ data = &(rs->data[data_index]);
+
lownew = i;
}
- lastchar = GETCHAR(Conf->Affix + i, level, type);
+ lastchar = GETCHAR(ConfBuild->Affix[i], level, type);
}
- data->val = GETCHAR(Conf->Affix + i, level, type);
- if (Conf->Affix[i].replen == level + 1)
+ data->val = GETCHAR(ConfBuild->Affix[i], level, type);
+ if (ConfBuild->Affix[i]->replen == level + 1)
{ /* affix stopped */
- aff[naff++] = Conf->Affix + i;
+ if (data->affstart == ISPELL_INVALID_INDEX)
+ {
+ data->affstart = i;
+ data->affend = i;
+ }
+ else
+ data->affend = i;
}
}
/* Next level of the prefix tree */
- data->node = mkANode(Conf, lownew, high, level + 1, type);
- if (naff)
- {
- data->naff = naff;
- data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * naff);
- memcpy(data->aff, aff, sizeof(AFFIX *) * naff);
- naff = 0;
- }
+ new_offset = mkANode(ConfBuild, lownew, high, level + 1, type);
+
+ /*
+ * array can be repalloc'ed within mkANode(), so reinitialize pointers.
+ */
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
- pfree(aff);
+ /* Save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
- return rs;
+ return rs_offset;
}
/*
@@ -1874,137 +2137,154 @@ mkANode(IspellDict *Conf, int low, int high, int level, int type)
* for affixes which have empty replace string ("repl" field).
*/
static void
-mkVoidAffix(IspellDict *Conf, bool issuffix, int startsuffix)
+mkVoidAffix(IspellDictBuild *ConfBuild, bool issuffix, int startsuffix)
{
- int i,
- cnt = 0;
+ int i;
int start = (issuffix) ? startsuffix : 0;
- int end = (issuffix) ? Conf->naffixes : startsuffix;
- AffixNode *Affix = (AffixNode *) palloc0(ANHRDSZ + sizeof(AffixNodeData));
-
- Affix->length = 1;
- Affix->isvoid = 1;
+ int end = (issuffix) ? ConfBuild->nAffix : startsuffix;
+ uint32 node_offset;
+ NodeArray *array;
+ AffixNode *Affix;
+ AffixNodeData *AffixData;
if (issuffix)
- {
- Affix->data->node = Conf->Suffix;
- Conf->Suffix = Affix;
- }
+ array = &ConfBuild->SuffixNodes;
else
- {
- Affix->data->node = Conf->Prefix;
- Conf->Prefix = Affix;
- }
+ array = &ConfBuild->PrefixNodes;
- /* Count affixes with empty replace string */
- for (i = start; i < end; i++)
- if (Conf->Affix[i].replen == 0)
- cnt++;
-
- /* There is not affixes with empty replace string */
- if (cnt == 0)
- return;
+ node_offset = AllocateAffixNode(ConfBuild, array, 1);
+ Affix = (AffixNode *) NodeArrayGet(array, node_offset);
- Affix->data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * cnt);
- Affix->data->naff = (uint32) cnt;
+ Affix->isvoid = 1;
+ AffixData = (AffixNodeData *) Affix->data;
- cnt = 0;
for (i = start; i < end; i++)
- if (Conf->Affix[i].replen == 0)
+ if (ConfBuild->Affix[i]->replen == 0)
{
- Affix->data->aff[cnt] = Conf->Affix + i;
- cnt++;
+ if (AffixData->affstart == ISPELL_INVALID_INDEX)
+ {
+ AffixData->affstart = i;
+ AffixData->affend = i;
+ }
+ else
+ AffixData->affend = i;
}
}
/*
- * Checks if the affixflag is used by dictionary. Conf->AffixData does not
+ * Checks if the affixflag is used by dictionary. AffixData does not
* contain affixflag if this flag is not used actually by the .dict file.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* affixflag: affix flag.
*
- * Returns true if the Conf->AffixData array contains affixflag, otherwise
+ * Returns true if the ConfBuild->AffixData array contains affixflag, otherwise
* returns false.
*/
static bool
-isAffixInUse(IspellDict *Conf, char *affixflag)
+isAffixInUse(IspellDictBuild *ConfBuild, char *affixflag)
{
int i;
- for (i = 0; i < Conf->nAffixData; i++)
- if (IsAffixFlagInUse(Conf, i, affixflag))
+ for (i = 0; i < ConfBuild->nAffixData; i++)
+ if (IsAffixFlagInUse(ConfBuild->dict->flagMode,
+ AffixDataGet(ConfBuild, i), affixflag))
return true;
return false;
}
/*
- * Builds Conf->Prefix and Conf->Suffix trees from the imported affixes.
+ * Builds Prefix and Suffix trees from the imported affixes.
*/
void
-NISortAffixes(IspellDict *Conf)
+NISortAffixes(IspellDictBuild *ConfBuild)
{
AFFIX *Affix;
+ AffixNode *voidPrefix,
+ *voidSuffix;
size_t i;
CMPDAffix *ptr;
- int firstsuffix = Conf->naffixes;
+ int firstsuffix = ConfBuild->nAffix;
+ uint32 prefix_offset,
+ suffix_offset;
- if (Conf->naffixes == 0)
+ if (ConfBuild->nAffix == 0)
return;
/* Store compound affixes in the Conf->CompoundAffix array */
- if (Conf->naffixes > 1)
- qsort((void *) Conf->Affix, Conf->naffixes, sizeof(AFFIX), cmpaffix);
- Conf->CompoundAffix = ptr = (CMPDAffix *) palloc(sizeof(CMPDAffix) * Conf->naffixes);
- ptr->affix = NULL;
-
- for (i = 0; i < Conf->naffixes; i++)
+ if (ConfBuild->nAffix > 1)
+ qsort((void *) ConfBuild->Affix, ConfBuild->nAffix,
+ sizeof(AFFIX *), cmpaffix);
+ ConfBuild->nCompoundAffix = ConfBuild->nAffix;
+ ConfBuild->CompoundAffix = ptr =
+ (CMPDAffix *) tmpalloc(sizeof(CMPDAffix) * ConfBuild->nCompoundAffix);
+ ptr->affix = ISPELL_INVALID_INDEX;
+
+ for (i = 0; i < ConfBuild->nAffix; i++)
{
- Affix = &(((AFFIX *) Conf->Affix)[i]);
+ Affix = ConfBuild->Affix[i];
if (Affix->type == FF_SUFFIX && i < firstsuffix)
firstsuffix = i;
if ((Affix->flagflags & FF_COMPOUNDFLAG) && Affix->replen > 0 &&
- isAffixInUse(Conf, Affix->flag))
+ isAffixInUse(ConfBuild, AffixFieldFlag(Affix)))
{
- if (ptr == Conf->CompoundAffix ||
+ if (ptr == ConfBuild->CompoundAffix ||
ptr->issuffix != (ptr - 1)->issuffix ||
- strbncmp((const unsigned char *) (ptr - 1)->affix,
- (const unsigned char *) Affix->repl,
+ strbncmp((const unsigned char *) AffixFieldRepl(ConfBuild->Affix[(ptr - 1)->affix]),
+ (const unsigned char *) AffixFieldRepl(Affix),
(ptr - 1)->len))
{
/* leave only unique and minimals suffixes */
- ptr->affix = Affix->repl;
+ ptr->affix = i;
ptr->len = Affix->replen;
ptr->issuffix = (Affix->type == FF_SUFFIX);
ptr++;
}
}
}
- ptr->affix = NULL;
- Conf->CompoundAffix = (CMPDAffix *) repalloc(Conf->CompoundAffix, sizeof(CMPDAffix) * (ptr - Conf->CompoundAffix + 1));
+ ptr->affix = ISPELL_INVALID_INDEX;
+ ConfBuild->nCompoundAffix = ptr - ConfBuild->CompoundAffix + 1;
+ ConfBuild->CompoundAffix = (CMPDAffix *) repalloc(ConfBuild->CompoundAffix,
+ sizeof(CMPDAffix) * (ConfBuild->nCompoundAffix));
/* Start build a prefix tree */
- Conf->Prefix = mkANode(Conf, 0, firstsuffix, 0, FF_PREFIX);
- Conf->Suffix = mkANode(Conf, firstsuffix, Conf->naffixes, 0, FF_SUFFIX);
- mkVoidAffix(Conf, true, firstsuffix);
- mkVoidAffix(Conf, false, firstsuffix);
+ mkVoidAffix(ConfBuild, true, firstsuffix);
+ mkVoidAffix(ConfBuild, false, firstsuffix);
+
+ prefix_offset = mkANode(ConfBuild, 0, firstsuffix, 0, FF_PREFIX);
+ suffix_offset = mkANode(ConfBuild, firstsuffix, ConfBuild->nAffix, 0,
+ FF_SUFFIX);
+
+ /* Adjust offsets of new nodes for nodes of void affixes */
+ voidPrefix = (AffixNode *) NodeArrayGet(&ConfBuild->PrefixNodes, 0);
+ voidPrefix->data[0].node_offset = prefix_offset;
+
+ voidSuffix = (AffixNode *) NodeArrayGet(&ConfBuild->SuffixNodes, 0);
+ voidSuffix->data[0].node_offset = suffix_offset;
}
static AffixNodeData *
-FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
+FindAffixes(IspellDictData *dict, AffixNode *node, const char *word, int wrdlen,
+ int *level, int type)
{
+ AffixNode *node_start;
AffixNodeData *StopLow,
*StopHigh,
*StopMiddle;
uint8 symbol;
+ if (type == FF_PREFIX)
+ node_start = (AffixNode *) DictPrefixNodes(dict);
+ else
+ node_start = (AffixNode *) DictSuffixNodes(dict);
+
if (node->isvoid)
{ /* search void affixes */
- if (node->data->naff)
+ if (node->data->affstart != ISPELL_INVALID_INDEX)
return node->data;
- node = node->data->node;
+ node = (AffixNode *) DictNodeGet(node_start, node->data->node_offset);
}
while (node && *level < wrdlen)
@@ -2019,9 +2299,10 @@ FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
if (StopMiddle->val == symbol)
{
(*level)++;
- if (StopMiddle->naff)
+ if (StopMiddle->affstart != ISPELL_INVALID_INDEX)
return StopMiddle;
- node = StopMiddle->node;
+ node = (AffixNode *) DictNodeGet(node_start,
+ StopMiddle->node_offset);
break;
}
else if (StopMiddle->val < symbol)
@@ -2035,8 +2316,67 @@ FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
return NULL;
}
+/*
+ * Compile regular expression on first use and store it within reg.
+ */
+static void
+CompileAffixReg(AffixReg *reg, bool isregis, int type,
+ const char *mask, int masklen, MemoryContext dictCtx)
+{
+ MemoryContext oldcontext;
+
+ Assert(dictCtx);
+
+ /*
+ * Switch to memory context of the dictionary, so compiled expression can be
+ * used in other queries.
+ */
+ oldcontext = MemoryContextSwitchTo(dictCtx);
+
+ if (isregis)
+ RS_compile(&reg->r.regis, (type == FF_SUFFIX), mask);
+ else
+ {
+ int wmasklen;
+ int err;
+ pg_wchar *wmask;
+ char *tmask;
+
+ tmask = (char *) palloc(masklen + 3);
+ if (type == FF_SUFFIX)
+ sprintf(tmask, "%s$", mask);
+ else
+ sprintf(tmask, "^%s", mask);
+
+ masklen = strlen(tmask);
+ wmask = (pg_wchar *) palloc((masklen + 1) * sizeof(pg_wchar));
+ wmasklen = pg_mb2wchar_with_len(tmask, wmask, masklen);
+
+ err = pg_regcomp(&(reg->r.regex), wmask, wmasklen,
+ REG_ADVANCED | REG_NOSUB,
+ DEFAULT_COLLATION_OID);
+ if (err)
+ {
+ char errstr[100];
+
+ pg_regerror(err, &(reg->r.regex), errstr, sizeof(errstr));
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
+ errmsg("invalid regular expression: %s", errstr)));
+ }
+
+ pfree(wmask);
+ pfree(tmask);
+ }
+
+ reg->iscompiled = true;
+
+ MemoryContextSwitchTo(oldcontext);
+}
+
static char *
-CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *newword, int *baselen)
+CheckAffix(const char *word, size_t len, AFFIX *Affix, AffixReg *reg,
+ int flagflags, char *newword, int *baselen, MemoryContext dictCtx)
{
/*
* Check compound allow flags
@@ -2076,7 +2416,7 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
if (Affix->type == FF_SUFFIX)
{
strcpy(newword, word);
- strcpy(newword + len - Affix->replen, Affix->find);
+ strcpy(newword + len - Affix->replen, AffixFieldFind(Affix));
if (baselen) /* store length of non-changed part of word */
*baselen = len - Affix->replen;
}
@@ -2086,9 +2426,9 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
* if prefix is an all non-changed part's length then all word
* contains only prefix and suffix, so out
*/
- if (baselen && *baselen + strlen(Affix->find) <= Affix->replen)
+ if (baselen && *baselen + Affix->findlen <= Affix->replen)
return NULL;
- strcpy(newword, Affix->find);
+ strcpy(newword, AffixFieldFind(Affix));
strcat(newword, word + Affix->replen);
}
@@ -2099,7 +2439,12 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
return newword;
else if (Affix->isregis)
{
- if (RS_execute(&(Affix->reg.regis), newword))
+ /* Compile the regular expression on first demand */
+ if (!reg->iscompiled)
+ CompileAffixReg(reg, Affix->isregis, Affix->type,
+ AffixFieldMask(Affix), Affix->masklen, dictCtx);
+
+ if (RS_execute(&(reg->r.regis), newword))
return newword;
}
else
@@ -2109,12 +2454,17 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
size_t data_len;
int newword_len;
+ /* Compile the regular expression on first demand */
+ if (!reg->iscompiled)
+ CompileAffixReg(reg, Affix->isregis, Affix->type,
+ AffixFieldMask(Affix), Affix->masklen, dictCtx);
+
/* Convert data string to wide characters */
newword_len = strlen(newword);
data = (pg_wchar *) palloc((newword_len + 1) * sizeof(pg_wchar));
data_len = pg_mb2wchar_with_len(newword, data, newword_len);
- if (!(err = pg_regexec(&(Affix->reg.regex), data, data_len, 0, NULL, 0, NULL, 0)))
+ if (!(err = pg_regexec(&(reg->r.regex), data, data_len, 0, NULL, 0, NULL, 0)))
{
pfree(data);
return newword;
@@ -2153,7 +2503,7 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
char **cur;
char newword[2 * MAXNORMLEN] = "";
char pnewword[2 * MAXNORMLEN] = "";
- AffixNode *snode = Conf->Suffix,
+ AffixNode *snode = (AffixNode *) DictSuffixNodes(Conf->dict),
*pnode;
int i,
j;
@@ -2165,7 +2515,7 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
/* Check that the word itself is normal form */
- if (FindWord(Conf, word, VoidString, flag))
+ if (FindWord(Conf->dict, word, VoidString, flag))
{
*cur = pstrdup(word);
cur++;
@@ -2173,23 +2523,29 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
}
/* Find all other NORMAL forms of the 'word' (check only prefix) */
- pnode = Conf->Prefix;
+ pnode = (AffixNode *) DictPrefixNodes(Conf->dict);
plevel = 0;
while (pnode)
{
- prefix = FindAffixes(pnode, word, wrdlen, &plevel, FF_PREFIX);
+ prefix = FindAffixes(Conf->dict, pnode, word, wrdlen, &plevel, FF_PREFIX);
if (!prefix)
break;
- for (j = 0; j < prefix->naff; j++)
+ for (j = prefix->affstart; j <= prefix->affend; j++)
{
- if (CheckAffix(word, wrdlen, prefix->aff[j], flag, newword, NULL))
+ AFFIX *affix = (AFFIX *) DictAffixGet(Conf->dict, j);
+ AffixReg *reg = &(Conf->reg[j]);
+
+ if (affix &&
+ CheckAffix(word, wrdlen, affix, reg, flag, newword, NULL,
+ Conf->dictCtx))
{
/* prefix success */
- if (FindWord(Conf, newword, prefix->aff[j]->flag, flag))
+ if (FindWord(Conf->dict, newword, AffixFieldFlag(affix), flag))
cur += addToResult(forms, cur, newword);
}
}
- pnode = prefix->node;
+ pnode = (AffixNode *) DictNodeGet(DictPrefixNodes(Conf->dict),
+ prefix->node_offset);
}
/*
@@ -2201,45 +2557,59 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
int baselen = 0;
/* find possible suffix */
- suffix = FindAffixes(snode, word, wrdlen, &slevel, FF_SUFFIX);
+ suffix = FindAffixes(Conf->dict, snode, word, wrdlen, &slevel,
+ FF_SUFFIX);
if (!suffix)
break;
/* foreach suffix check affix */
- for (i = 0; i < suffix->naff; i++)
+ for (i = suffix->affstart; i <= suffix->affend; i++)
{
- if (CheckAffix(word, wrdlen, suffix->aff[i], flag, newword, &baselen))
+ AFFIX *sufentry = (AFFIX *) DictAffixGet(Conf->dict, i);
+ AffixReg *sufreg = &(Conf->reg[i]);
+
+ if (sufentry &&
+ CheckAffix(word, wrdlen, sufentry, sufreg, flag, newword, &baselen,
+ Conf->dictCtx))
{
/* suffix success */
- if (FindWord(Conf, newword, suffix->aff[i]->flag, flag))
+ if (FindWord(Conf->dict, newword, AffixFieldFlag(sufentry), flag))
cur += addToResult(forms, cur, newword);
/* now we will look changed word with prefixes */
- pnode = Conf->Prefix;
+ pnode = (AffixNode *) DictPrefixNodes(Conf->dict);
plevel = 0;
swrdlen = strlen(newword);
while (pnode)
{
- prefix = FindAffixes(pnode, newword, swrdlen, &plevel, FF_PREFIX);
+ prefix = FindAffixes(Conf->dict, pnode, newword, swrdlen,
+ &plevel, FF_PREFIX);
if (!prefix)
break;
- for (j = 0; j < prefix->naff; j++)
+ for (j = prefix->affstart; j <= prefix->affend; j++)
{
- if (CheckAffix(newword, swrdlen, prefix->aff[j], flag, pnewword, &baselen))
+ AFFIX *prefentry = (AFFIX *) DictAffixGet(Conf->dict, j);
+ AffixReg *prefreg = &(Conf->reg[j]);
+
+ if (prefentry &&
+ CheckAffix(newword, swrdlen, prefentry, prefreg,
+ flag, pnewword, &baselen, Conf->dictCtx))
{
/* prefix success */
- char *ff = (prefix->aff[j]->flagflags & suffix->aff[i]->flagflags & FF_CROSSPRODUCT) ?
- VoidString : prefix->aff[j]->flag;
+ char *ff = (prefentry->flagflags & sufentry->flagflags & FF_CROSSPRODUCT) ?
+ VoidString : AffixFieldFlag(prefentry);
- if (FindWord(Conf, pnewword, ff, flag))
+ if (FindWord(Conf->dict, pnewword, ff, flag))
cur += addToResult(forms, cur, pnewword);
}
}
- pnode = prefix->node;
+ pnode = (AffixNode *) DictNodeGet(DictPrefixNodes(Conf->dict),
+ prefix->node_offset);
}
}
}
- snode = suffix->node;
+ snode = (AffixNode *) DictNodeGet(DictSuffixNodes(Conf->dict),
+ suffix->node_offset);
}
if (cur == forms)
@@ -2259,7 +2629,8 @@ typedef struct SplitVar
} SplitVar;
static int
-CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
+CheckCompoundAffixes(IspellDictData *dict, CMPDAffix **ptr,
+ char *word, int len, bool CheckInPlace)
{
bool issuffix;
@@ -2269,9 +2640,12 @@ CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
if (CheckInPlace)
{
- while ((*ptr)->affix)
+ while ((*ptr)->affix != ISPELL_INVALID_INDEX)
{
- if (len > (*ptr)->len && strncmp((*ptr)->affix, word, (*ptr)->len) == 0)
+ AFFIX *affix = (AFFIX *) DictAffixGet(dict, (*ptr)->affix);
+
+ if (len > (*ptr)->len &&
+ strncmp(AffixFieldRepl(affix), word, (*ptr)->len) == 0)
{
len = (*ptr)->len;
issuffix = (*ptr)->issuffix;
@@ -2285,9 +2659,12 @@ CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
{
char *affbegin;
- while ((*ptr)->affix)
+ while ((*ptr)->affix != ISPELL_INVALID_INDEX)
{
- if (len > (*ptr)->len && (affbegin = strstr(word, (*ptr)->affix)) != NULL)
+ AFFIX *affix = (AFFIX *) DictAffixGet(dict, (*ptr)->affix);
+
+ if (len > (*ptr)->len &&
+ (affbegin = strstr(word, AffixFieldRepl(affix))) != NULL)
{
len = (*ptr)->len + (affbegin - word);
issuffix = (*ptr)->issuffix;
@@ -2339,13 +2716,14 @@ AddStem(SplitVar *v, char *word)
}
static SplitVar *
-SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int wordlen, int startpos, int minpos)
+SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig,
+ char *word, int wordlen, int startpos, int minpos)
{
SplitVar *var = NULL;
SPNodeData *StopLow,
*StopHigh,
*StopMiddle = NULL;
- SPNode *node = (snode) ? snode : Conf->Dictionary;
+ SPNode *node = (snode) ? snode : (SPNode *) DictDictNodes(Conf->dict);
int level = (snode) ? minpos : startpos; /* recursive
* minpos==level */
int lenaff;
@@ -2360,8 +2738,11 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
while (level < wordlen)
{
/* find word with epenthetic or/and compound affix */
- caff = Conf->CompoundAffix;
- while (level > startpos && (lenaff = CheckCompoundAffixes(&caff, word + level, wordlen - level, (node) ? true : false)) >= 0)
+ caff = (CMPDAffix *) DictCompoundAffix(Conf->dict);
+ while (level > startpos &&
+ (lenaff = CheckCompoundAffixes(Conf->dict, &caff,
+ word + level, wordlen - level,
+ (node) ? true : false)) >= 0)
{
/*
* there is one of compound affixes, so check word for existings
@@ -2408,7 +2789,8 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
while (ptr->next)
ptr = ptr->next;
- ptr->next = SplitToVariants(Conf, NULL, new, word, wordlen, startpos + lenaff, startpos + lenaff);
+ ptr->next = SplitToVariants(Conf, NULL, new, word, wordlen,
+ startpos + lenaff, startpos + lenaff);
pfree(new->stem);
pfree(new);
@@ -2467,13 +2849,14 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
/* we can find next word */
level++;
AddStem(var, pnstrdup(word + startpos, level - startpos));
- node = Conf->Dictionary;
+ node = (SPNode *) DictDictNodes(Conf->dict);
startpos = level;
continue;
}
}
}
- node = StopMiddle->node;
+ node = (SPNode *) DictNodeGet(DictDictNodes(Conf->dict),
+ StopMiddle->node_offset);
}
else
node = NULL;
@@ -2523,7 +2906,7 @@ NINormalizeWord(IspellDict *Conf, char *word)
pfree(res);
}
- if (Conf->usecompound)
+ if (Conf->dict->usecompound)
{
int wordlen = strlen(word);
SplitVar *ptr,
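
A note on the spell.c changes above: mkANode() now returns an offset into a growable node array instead of a pointer, and re-fetches `rs` after each recursive call because the array may have been repalloc'ed. The following is only a minimal standalone sketch of that pattern, not code from the patch. DemoArray, DemoNode and the demo_* functions are hypothetical stand-ins for NodeArray, AffixNodeData and AllocateAffixNode()/NodeArrayGet(), and it uses plain realloc() rather than PostgreSQL memory contexts.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Plays the role of ISPELL_INVALID_OFFSET (hypothetical name). */
#define DEMO_INVALID_OFFSET 0xFFFFFFFFu

/* Hypothetical, simplified analogue of NodeArray. */
typedef struct
{
	char	   *nodes;			/* growable buffer holding all nodes */
	uint32_t	size;			/* allocated bytes */
	uint32_t	end;			/* used bytes */
} DemoArray;

/* Hypothetical, simplified analogue of AffixNodeData. */
typedef struct
{
	uint8_t		val;			/* character stored in this node */
	uint32_t	child;			/* offset of the next node, not a pointer */
} DemoNode;

/* Append one node, growing the buffer if needed; returns its offset. */
static uint32_t
demo_alloc(DemoArray *a)
{
	uint32_t	off = a->end;

	if (a->end + sizeof(DemoNode) > a->size)
	{
		uint32_t	newsize = a->size ? a->size * 2 : (uint32_t) sizeof(DemoNode);
		char	   *p = realloc(a->nodes, newsize);

		assert(p != NULL);
		a->nodes = p;			/* the buffer may have moved */
		a->size = newsize;
	}
	memset(a->nodes + off, 0, sizeof(DemoNode));
	((DemoNode *) (a->nodes + off))->child = DEMO_INVALID_OFFSET;
	a->end += (uint32_t) sizeof(DemoNode);
	return off;
}

/* Translate an offset into a pointer that is valid until the next alloc. */
static DemoNode *
demo_get(DemoArray *a, uint32_t off)
{
	return (off == DEMO_INVALID_OFFSET) ? NULL : (DemoNode *) (a->nodes + off);
}

/* Build a chain with one node per byte of s; returns the first node's offset. */
static uint32_t
demo_build(DemoArray *a, const char *s)
{
	uint32_t	self,
				child;

	if (*s == '\0')
		return DEMO_INVALID_OFFSET;

	self = demo_alloc(a);
	demo_get(a, self)->val = (uint8_t) *s;

	/* The recursive call can realloc the buffer ... */
	child = demo_build(a, s + 1);
	/* ... so look the node up again by offset before writing to it. */
	demo_get(a, self)->child = child;
	return self;
}

/* Follow child offsets from root, copying characters into out. */
static int
demo_walk(DemoArray *a, uint32_t root, char *out)
{
	int			n = 0;
	DemoNode   *node;

	for (node = demo_get(a, root); node != NULL; node = demo_get(a, node->child))
		out[n++] = (char) node->val;
	out[n] = '\0';
	return n;
}
```

Because every link is an offset relative to the start of the buffer, the finished buffer can be copied byte for byte to another address (for example into a DSM segment) and the links remain valid, which is presumably why the patch avoids raw pointers in these structures.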
diff --git a/src/include/tsearch/dicts/spell.h b/src/include/tsearch/dicts/spell.h
index 210f97dda9..b40cf379eb 100644
--- a/src/include/tsearch/dicts/spell.h
+++ b/src/include/tsearch/dicts/spell.h
@@ -18,21 +18,23 @@
#include "tsearch/dicts/regis.h"
#include "tsearch/ts_public.h"
+#define ISPELL_INVALID_INDEX (0x7FFFF)
+#define ISPELL_INVALID_OFFSET (0xFFFFFFFF)
+
/*
* SPNode and SPNodeData are used to represent prefix tree (Trie) to store
* a words list.
*/
-struct SPNode;
-
typedef struct
{
uint32 val:8,
isword:1,
/* Stores compound flags listed below */
compoundflag:4,
- /* Reference to an entry of the AffixData field */
+ /* Index of an entry of the AffixData field */
affix:19;
- struct SPNode *node;
+ /* Offset to a node of the DictNodes field */
+ uint32 node_offset;
} SPNodeData;
/*
@@ -86,21 +88,55 @@ typedef struct spell_struct
*/
typedef struct aff_struct
{
- char *flag;
/* FF_SUFFIX or FF_PREFIX */
- uint32 type:1,
+ uint16 type:1,
flagflags:7,
issimple:1,
isregis:1,
- replen:14;
- char *find;
- char *repl;
+ flaglen:2;
+
+ /* 8 bits could be too much for repl, find and mask, but who knows */
+ uint8 replen;
+ uint8 findlen;
+ uint8 masklen;
+
+ /*
+ * fields stores the following data (each ends with \0):
+ * - repl
+ * - find
+ * - mask
+ * - flag - one character (if FM_CHAR),
+ * two characters (if FM_LONG),
+ * number, >= 0 and < 65536 (if FM_NUM).
+ */
+ char fields[FLEXIBLE_ARRAY_MEMBER];
+} AFFIX;
+
+#define AF_FLAG_MAXSIZE 5 /* strlen(65536) */
+#define AF_REPL_MAXSIZE 255 /* 8 bits */
+
+#define AFFIXHDRSZ (offsetof(AFFIX, fields))
+
+#define AffixFieldRepl(af) ((af)->fields)
+#define AffixFieldFind(af) ((af)->fields + (af)->replen + 1)
+#define AffixFieldMask(af) (AffixFieldFind(af) + (af)->findlen + 1)
+#define AffixFieldFlag(af) (AffixFieldMask(af) + (af)->masklen + 1)
+#define AffixGetSize(af) (AFFIXHDRSZ + (af)->replen + 1 + (af)->findlen + 1 \
+ + (af)->masklen + 1 + strlen(AffixFieldFlag(af)) + 1)
+
+/*
+ * Stores compiled regular expression of affix. AffixReg uses mask field of
+ * AFFIX as a regular expression.
+ */
+typedef struct AffixReg
+{
+ bool iscompiled;
union
{
regex_t regex;
Regis regis;
- } reg;
-} AFFIX;
+ } r;
+} AffixReg;
/*
* affixes use dictionary flags too
@@ -120,14 +156,13 @@ typedef struct aff_struct
* AffixNode and AffixNodeData are used to represent prefix tree (Trie) to store
* an affix list.
*/
-struct AffixNode;
-
typedef struct
{
- uint32 val:8,
- naff:24;
- AFFIX **aff;
- struct AffixNode *node;
+ uint8 val;
+ uint32 affstart;
+ uint32 affend;
+ /* Offset to a node of the PrefixNodes or SuffixNodes field */
+ uint32 node_offset;
} AffixNodeData;
typedef struct AffixNode
@@ -139,9 +174,19 @@ typedef struct AffixNode
#define ANHRDSZ (offsetof(AffixNode, data))
+typedef struct NodeArray
+{
+ char *Nodes;
+ uint32 NodesSize; /* allocated size of Nodes */
+ uint32 NodesEnd; /* end of data in Nodes */
+} NodeArray;
+
+#define NodeArrayGet(na, of) (((of) == ISPELL_INVALID_OFFSET) ? NULL : (na)->Nodes + (of))
+
typedef struct
{
- char *affix;
+ /* Index of an affix of the Affix field */
+ uint32 affix;
int len;
bool issuffix;
} CMPDAffix;
@@ -176,30 +221,71 @@ typedef struct CompoundAffixFlag
#define FLAGNUM_MAXSIZE (1 << 16)
-typedef struct
+typedef struct IspellDictData
{
- int maffixes;
- int naffixes;
- AFFIX *Affix;
-
- AffixNode *Suffix;
- AffixNode *Prefix;
+ FlagMode flagMode;
+ bool usecompound;
- SPNode *Dictionary;
- /* Array of sets of affixes */
- char **AffixData;
- int lenAffixData;
- int nAffixData;
bool useFlagAliases;
- CMPDAffix *CompoundAffix;
+ uint32 nAffixData;
+ uint32 AffixDataStart;
- bool usecompound;
- FlagMode flagMode;
+ uint32 AffixOffsetStart;
+ uint32 AffixStart;
+ uint32 nAffix;
+
+ uint32 DictNodesStart;
+ uint32 PrefixNodesStart;
+ uint32 SuffixNodesStart;
+
+ uint32 CompoundAffixStart;
/*
- * All follow fields are actually needed only for initialization
+ * data stores:
+ * - AffixData - array of affix sets
+ * - Affix - sorted array of affixes
+ * - DictNodes - prefix tree of a word list
+ * - PrefixNodes - prefix tree of a prefix list
+ * - SuffixNodes - prefix tree of a suffix list
+ * - CompoundAffix - array of compound affixes
*/
+ char data[FLEXIBLE_ARRAY_MEMBER];
+} IspellDictData;
+
+#define IspellDictDataHdrSize (offsetof(IspellDictData, data))
+
+#define DictAffixDataOffset(d) ((d)->data)
+#define DictAffixData(d) ((d)->data + (d)->AffixDataStart)
+#define DictAffixDataGet(d, i) (((i) == ISPELL_INVALID_INDEX) ? NULL : \
+ DictAffixData(d) + ((uint32 *) DictAffixDataOffset(d))[i])
+
+#define DictAffixOffset(d) ((d)->data + (d)->AffixOffsetStart)
+#define DictAffix(d) ((d)->data + (d)->AffixStart)
+#define DictAffixGet(d, i) (((i) == ISPELL_INVALID_INDEX) ? NULL : \
+ DictAffix(d) + ((uint32 *) DictAffixOffset(d))[i])
+
+#define DictDictNodes(d) ((d)->data + (d)->DictNodesStart)
+#define DictPrefixNodes(d) ((d)->data + (d)->PrefixNodesStart)
+#define DictSuffixNodes(d) ((d)->data + (d)->SuffixNodesStart)
+#define DictNodeGet(node_start, of) (((of) == ISPELL_INVALID_OFFSET) ? NULL : \
+ (char *) (node_start) + (of))
+
+#define DictCompoundAffix(d) ((d)->data + (d)->CompoundAffixStart)
+
+/*
+ * IspellDictBuild is used to initialize IspellDictData struct. This is a
+ * temporary structure which is set up by NIStartBuild() and released by
+ * NIFinishBuild().
+ */
+typedef struct IspellDictBuild
+{
+ MemoryContext buildCxt; /* temp context for construction */
+
+ IspellDictData *dict;
+ uint32 dict_size;
+
+ /* Temporary data */
/* Array of Hunspell options in affix file */
CompoundAffixFlag *CompoundAffixFlags;
@@ -208,29 +294,73 @@ typedef struct
/* allocated length of CompoundAffixFlags array */
int mCompoundAffixFlag;
- /*
- * Remaining fields are only used during dictionary construction; they are
- * set up by NIStartBuild and cleared by NIFinishBuild.
- */
- MemoryContext buildCxt; /* temp context for construction */
-
- /* Temporary array of all words in the dict file */
+ /* Array of all words in the dict file */
SPELL **Spell;
- int nspell; /* number of valid entries in Spell array */
- int mspell; /* allocated length of Spell array */
+ int nSpell; /* number of valid entries in Spell array */
+ int mSpell; /* allocated length of Spell array */
+
+ /* Array of all affixes in the aff file */
+ AFFIX **Affix;
+ int nAffix; /* number of valid entries in Affix array */
+ int mAffix; /* allocated length of Affix array */
+ uint32 AffixSize;
+
+ /* Data for IspellDictData */
+
+ /* Array of sets of affixes */
+ uint32 *AffixDataOffset;
+ int nAffixData; /* number of affix sets */
+ int mAffixData; /* allocated number of affix sets */
+ char *AffixData;
+ uint32 AffixDataSize; /* allocated size of AffixData */
+ uint32 AffixDataEnd; /* end of data in AffixData */
+
+ /* Prefix tree which stores a word list */
+ NodeArray DictNodes;
+
+ /* Prefix tree which stores a prefix list */
+ NodeArray PrefixNodes;
+
+ /* Prefix tree which stores a suffix list */
+ NodeArray SuffixNodes;
- /* These are used to allocate "compact" data without palloc overhead */
- char *firstfree; /* first free address (always maxaligned) */
- size_t avail; /* free space remaining at firstfree */
+ /* Array of compound affixes */
+ CMPDAffix *CompoundAffix;
+ int nCompoundAffix; /* number of entries of CompoundAffix */
+} IspellDictBuild;
+
+#define AffixDataGet(d, i) ((d)->AffixData + (d)->AffixDataOffset[i])
+
+/*
+ * IspellDict is used within NINormalizeWord.
+ */
+typedef struct IspellDict
+{
+ /*
+ * Pointer to a DSM location of IspellDictData. Should be retrieved on
+ * every dispell_lexize() call.
+ */
+ IspellDictData *dict;
+ /*
+ * Array of regular expressions of affixes. Each regular expression is
+ * compiled only on demand.
+ */
+ AffixReg *reg;
+ /*
+ * Memory context for compiling regular expressions.
+ */
+ MemoryContext dictCtx;
} IspellDict;
extern TSLexeme *NINormalizeWord(IspellDict *Conf, char *word);
-extern void NIStartBuild(IspellDict *Conf);
-extern void NIImportAffixes(IspellDict *Conf, const char *filename);
-extern void NIImportDictionary(IspellDict *Conf, const char *filename);
-extern void NISortDictionary(IspellDict *Conf);
-extern void NISortAffixes(IspellDict *Conf);
-extern void NIFinishBuild(IspellDict *Conf);
+extern void NIStartBuild(IspellDictBuild *ConfBuild);
+extern void NIImportAffixes(IspellDictBuild *ConfBuild, const char *filename);
+extern void NIImportDictionary(IspellDictBuild *ConfBuild,
+ const char *filename);
+extern void NISortDictionary(IspellDictBuild *ConfBuild);
+extern void NISortAffixes(IspellDictBuild *ConfBuild);
+extern void NICopyData(IspellDictBuild *ConfBuild);
+extern void NIFinishBuild(IspellDictBuild *ConfBuild);
#endif
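
A note on the new AFFIX layout in spell.h above: the repl, find, mask and flag strings are stored back to back in one flexible array, each terminated by \0, and the AffixField*() macros locate them from the stored lengths. The sketch below illustrates that layout in isolation and is not taken from the patch. DemoAffix, the Demo*() macros and demo_affix_make() are hypothetical names, the bit-fields of the real struct are omitted, and malloc() stands in for the patch's allocators.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified analogue of the patch's AFFIX struct. */
typedef struct
{
	uint8_t		replen;
	uint8_t		findlen;
	uint8_t		masklen;
	char		fields[];		/* repl\0 find\0 mask\0 flag\0 */
} DemoAffix;

/* Accessors mirroring AffixFieldRepl/Find/Mask/Flag. */
#define DemoRepl(af) ((af)->fields)
#define DemoFind(af) ((af)->fields + (af)->replen + 1)
#define DemoMask(af) (DemoFind(af) + (af)->findlen + 1)
#define DemoFlag(af) (DemoMask(af) + (af)->masklen + 1)

/*
 * Pack four strings into a single allocation.  Returns NULL if a length
 * does not fit into its uint8 counter or if the allocation fails.
 */
static DemoAffix *
demo_affix_make(const char *repl, const char *find,
				const char *mask, const char *flag)
{
	size_t		replen = strlen(repl);
	size_t		findlen = strlen(find);
	size_t		masklen = strlen(mask);
	size_t		flaglen = strlen(flag);
	DemoAffix  *af;

	if (replen > 255 || findlen > 255 || masklen > 255)
		return NULL;

	af = malloc(offsetof(DemoAffix, fields) +
				replen + 1 + findlen + 1 + masklen + 1 + flaglen + 1);
	if (af == NULL)
		return NULL;

	/* The lengths must be set first: the accessors depend on them. */
	af->replen = (uint8_t) replen;
	af->findlen = (uint8_t) findlen;
	af->masklen = (uint8_t) masklen;

	/* Each copy includes the terminating \0. */
	memcpy(DemoRepl(af), repl, replen + 1);
	memcpy(DemoFind(af), find, findlen + 1);
	memcpy(DemoMask(af), mask, masklen + 1);
	memcpy(DemoFlag(af), flag, flaglen + 1);
	return af;
}
```

Since the struct contains no pointers, an entry built this way can be copied with a single memcpy() of its total size, which is what a macro such as AffixGetSize() appears to be for.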
0004-Update-tmplinit-arguments-v3.patch
diff --git a/contrib/dict_int/Makefile b/contrib/dict_int/Makefile
index f6ae24aa4d..897be348ff 100644
--- a/contrib/dict_int/Makefile
+++ b/contrib/dict_int/Makefile
@@ -4,7 +4,7 @@ MODULE_big = dict_int
OBJS = dict_int.o $(WIN32RES)
EXTENSION = dict_int
-DATA = dict_int--1.0.sql dict_int--unpackaged--1.0.sql
+DATA = dict_int--1.1.sql dict_int--1.0--1.1.sql dict_int--unpackaged--1.0.sql
PGFILEDESC = "dict_int - add-on dictionary template for full-text search"
REGRESS = dict_int
diff --git a/contrib/dict_int/dict_int--1.0--1.1.sql b/contrib/dict_int/dict_int--1.0--1.1.sql
new file mode 100644
index 0000000000..3517a5ecd1
--- /dev/null
+++ b/contrib/dict_int/dict_int--1.0--1.1.sql
@@ -0,0 +1,9 @@
+/* contrib/dict_int/dict_int--1.0--1.1.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION dict_int UPDATE TO '1.1'" to load this file. \quit
+
+CREATE FUNCTION dintdict_init(internal, internal)
+ RETURNS internal
+ AS 'MODULE_PATHNAME'
+ LANGUAGE C STRICT;
diff --git a/contrib/dict_int/dict_int--1.0.sql b/contrib/dict_int/dict_int--1.1.sql
similarity index 93%
rename from contrib/dict_int/dict_int--1.0.sql
rename to contrib/dict_int/dict_int--1.1.sql
index acb1461b56..6d3933e3d3 100644
--- a/contrib/dict_int/dict_int--1.0.sql
+++ b/contrib/dict_int/dict_int--1.1.sql
@@ -3,7 +3,7 @@
-- complain if script is sourced in psql, rather than via CREATE EXTENSION
\echo Use "CREATE EXTENSION dict_int" to load this file. \quit
-CREATE FUNCTION dintdict_init(internal)
+CREATE FUNCTION dintdict_init(internal, internal)
RETURNS internal
AS 'MODULE_PATHNAME'
LANGUAGE C STRICT;
diff --git a/contrib/dict_int/dict_int.control b/contrib/dict_int/dict_int.control
index 6e2d2b351a..51894171f6 100644
--- a/contrib/dict_int/dict_int.control
+++ b/contrib/dict_int/dict_int.control
@@ -1,5 +1,5 @@
# dict_int extension
comment = 'text search dictionary template for integers'
-default_version = '1.0'
+default_version = '1.1'
module_pathname = '$libdir/dict_int'
relocatable = true
diff --git a/contrib/dict_xsyn/Makefile b/contrib/dict_xsyn/Makefile
index 0c401cf3c8..d1cf8d0b5d 100644
--- a/contrib/dict_xsyn/Makefile
+++ b/contrib/dict_xsyn/Makefile
@@ -4,7 +4,7 @@ MODULE_big = dict_xsyn
OBJS = dict_xsyn.o $(WIN32RES)
EXTENSION = dict_xsyn
-DATA = dict_xsyn--1.0.sql dict_xsyn--unpackaged--1.0.sql
+DATA = dict_xsyn--1.1.sql dict_xsyn--1.0--1.1.sql dict_xsyn--unpackaged--1.0.sql
DATA_TSEARCH = xsyn_sample.rules
PGFILEDESC = "dict_xsyn - add-on dictionary template for full-text search"
diff --git a/contrib/dict_xsyn/dict_xsyn--1.0--1.1.sql b/contrib/dict_xsyn/dict_xsyn--1.0--1.1.sql
new file mode 100644
index 0000000000..35a576bfee
--- /dev/null
+++ b/contrib/dict_xsyn/dict_xsyn--1.0--1.1.sql
@@ -0,0 +1,9 @@
+/* contrib/dict_xsyn/dict_xsyn--1.0--1.1.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION dict_xsyn UPDATE TO '1.1'" to load this file. \quit
+
+CREATE FUNCTION dxsyn_init(internal, internal)
+ RETURNS internal
+ AS 'MODULE_PATHNAME'
+ LANGUAGE C STRICT;
diff --git a/contrib/dict_xsyn/dict_xsyn--1.0.sql b/contrib/dict_xsyn/dict_xsyn--1.1.sql
similarity index 93%
rename from contrib/dict_xsyn/dict_xsyn--1.0.sql
rename to contrib/dict_xsyn/dict_xsyn--1.1.sql
index 3d6bb51ca8..d8d1de1aa4 100644
--- a/contrib/dict_xsyn/dict_xsyn--1.0.sql
+++ b/contrib/dict_xsyn/dict_xsyn--1.1.sql
@@ -3,7 +3,7 @@
-- complain if script is sourced in psql, rather than via CREATE EXTENSION
\echo Use "CREATE EXTENSION dict_xsyn" to load this file. \quit
-CREATE FUNCTION dxsyn_init(internal)
+CREATE FUNCTION dxsyn_init(internal, internal)
RETURNS internal
AS 'MODULE_PATHNAME'
LANGUAGE C STRICT;
diff --git a/contrib/dict_xsyn/dict_xsyn.control b/contrib/dict_xsyn/dict_xsyn.control
index 3fd465a955..50358374a7 100644
--- a/contrib/dict_xsyn/dict_xsyn.control
+++ b/contrib/dict_xsyn/dict_xsyn.control
@@ -1,5 +1,5 @@
# dict_xsyn extension
comment = 'text search dictionary template for extended synonym processing'
-default_version = '1.0'
+default_version = '1.1'
module_pathname = '$libdir/dict_xsyn'
relocatable = true
diff --git a/contrib/unaccent/Makefile b/contrib/unaccent/Makefile
index f8e3860926..b0ba23ed37 100644
--- a/contrib/unaccent/Makefile
+++ b/contrib/unaccent/Makefile
@@ -4,7 +4,8 @@ MODULE_big = unaccent
OBJS = unaccent.o $(WIN32RES)
EXTENSION = unaccent
-DATA = unaccent--1.1.sql unaccent--1.0--1.1.sql unaccent--unpackaged--1.0.sql
+DATA = unaccent--1.2.sql unaccent--1.1--1.2.sql unaccent--1.0--1.1.sql \
+ unaccent--unpackaged--1.0.sql
DATA_TSEARCH = unaccent.rules
PGFILEDESC = "unaccent - text search dictionary that removes accents"
diff --git a/contrib/unaccent/unaccent--1.1--1.2.sql b/contrib/unaccent/unaccent--1.1--1.2.sql
new file mode 100644
index 0000000000..eaef37f87e
--- /dev/null
+++ b/contrib/unaccent/unaccent--1.1--1.2.sql
@@ -0,0 +1,9 @@
+/* contrib/unaccent/unaccent--1.1--1.2.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION unaccent UPDATE TO '1.2'" to load this file. \quit
+
+CREATE FUNCTION unaccent_init(internal,internal)
+ RETURNS internal
+ AS 'MODULE_PATHNAME', 'unaccent_init'
+ LANGUAGE C PARALLEL SAFE;
diff --git a/contrib/unaccent/unaccent--1.1.sql b/contrib/unaccent/unaccent--1.2.sql
similarity index 94%
rename from contrib/unaccent/unaccent--1.1.sql
rename to contrib/unaccent/unaccent--1.2.sql
index ecc8651780..d6ce193e82 100644
--- a/contrib/unaccent/unaccent--1.1.sql
+++ b/contrib/unaccent/unaccent--1.2.sql
@@ -13,7 +13,7 @@ CREATE FUNCTION unaccent(text)
AS 'MODULE_PATHNAME', 'unaccent_dict'
LANGUAGE C STABLE STRICT PARALLEL SAFE;
-CREATE FUNCTION unaccent_init(internal)
+CREATE FUNCTION unaccent_init(internal,internal)
RETURNS internal
AS 'MODULE_PATHNAME', 'unaccent_init'
LANGUAGE C PARALLEL SAFE;
diff --git a/contrib/unaccent/unaccent.control b/contrib/unaccent/unaccent.control
index a77a65f891..aec53b5ad5 100644
--- a/contrib/unaccent/unaccent.control
+++ b/contrib/unaccent/unaccent.control
@@ -1,5 +1,5 @@
# unaccent extension
comment = 'text search dictionary that removes accents'
-default_version = '1.1'
+default_version = '1.2'
module_pathname = '$libdir/unaccent'
relocatable = true
diff --git a/src/backend/commands/tsearchcmds.c b/src/backend/commands/tsearchcmds.c
index 42be77d045..01230a2936 100644
--- a/src/backend/commands/tsearchcmds.c
+++ b/src/backend/commands/tsearchcmds.c
@@ -664,7 +664,7 @@ get_ts_template_func(DefElem *defel, int attnum)
switch (attnum)
{
case Anum_pg_ts_template_tmplinit:
- nargs = 1;
+ nargs = 2;
break;
case Anum_pg_ts_template_tmpllexize:
nargs = 4;
diff --git a/src/backend/snowball/snowball_func.sql.in b/src/backend/snowball/snowball_func.sql.in
index c02dad43e3..9b85e41ff8 100644
--- a/src/backend/snowball/snowball_func.sql.in
+++ b/src/backend/snowball/snowball_func.sql.in
@@ -19,7 +19,7 @@
SET search_path = pg_catalog;
-CREATE FUNCTION dsnowball_init(INTERNAL)
+CREATE FUNCTION dsnowball_init(INTERNAL, INTERNAL)
RETURNS INTERNAL AS '$libdir/dict_snowball', 'dsnowball_init'
LANGUAGE C STRICT;
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index f01648c961..4c45c432e7 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -4853,22 +4853,22 @@ DESCR("(internal)");
DATA(insert OID = 3723 ( ts_lexize PGNSP PGUID 12 1 0 0 0 f f f f t f i s 2 0 1009 "3769 25" _null_ _null_ _null_ _null_ _null_ ts_lexize _null_ _null_ _null_ ));
DESCR("normalize one word by dictionary");
-DATA(insert OID = 3725 ( dsimple_init PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 2281 "2281" _null_ _null_ _null_ _null_ _null_ dsimple_init _null_ _null_ _null_ ));
+DATA(insert OID = 3725 ( dsimple_init PGNSP PGUID 12 1 0 0 0 f f f f t f i s 2 0 2281 "2281 2281" _null_ _null_ _null_ _null_ _null_ dsimple_init _null_ _null_ _null_ ));
DESCR("(internal)");
DATA(insert OID = 3726 ( dsimple_lexize PGNSP PGUID 12 1 0 0 0 f f f f t f i s 4 0 2281 "2281 2281 2281 2281" _null_ _null_ _null_ _null_ _null_ dsimple_lexize _null_ _null_ _null_ ));
DESCR("(internal)");
-DATA(insert OID = 3728 ( dsynonym_init PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 2281 "2281" _null_ _null_ _null_ _null_ _null_ dsynonym_init _null_ _null_ _null_ ));
+DATA(insert OID = 3728 ( dsynonym_init PGNSP PGUID 12 1 0 0 0 f f f f t f i s 2 0 2281 "2281 2281" _null_ _null_ _null_ _null_ _null_ dsynonym_init _null_ _null_ _null_ ));
DESCR("(internal)");
DATA(insert OID = 3729 ( dsynonym_lexize PGNSP PGUID 12 1 0 0 0 f f f f t f i s 4 0 2281 "2281 2281 2281 2281" _null_ _null_ _null_ _null_ _null_ dsynonym_lexize _null_ _null_ _null_ ));
DESCR("(internal)");
-DATA(insert OID = 3731 ( dispell_init PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 2281 "2281" _null_ _null_ _null_ _null_ _null_ dispell_init _null_ _null_ _null_ ));
+DATA(insert OID = 3731 ( dispell_init PGNSP PGUID 12 1 0 0 0 f f f f t f i s 2 0 2281 "2281 2281" _null_ _null_ _null_ _null_ _null_ dispell_init _null_ _null_ _null_ ));
DESCR("(internal)");
DATA(insert OID = 3732 ( dispell_lexize PGNSP PGUID 12 1 0 0 0 f f f f t f i s 4 0 2281 "2281 2281 2281 2281" _null_ _null_ _null_ _null_ _null_ dispell_lexize _null_ _null_ _null_ ));
DESCR("(internal)");
-DATA(insert OID = 3740 ( thesaurus_init PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 2281 "2281" _null_ _null_ _null_ _null_ _null_ thesaurus_init _null_ _null_ _null_ ));
+DATA(insert OID = 3740 ( thesaurus_init PGNSP PGUID 12 1 0 0 0 f f f f t f i s 2 0 2281 "2281 2281" _null_ _null_ _null_ _null_ _null_ thesaurus_init _null_ _null_ _null_ ));
DESCR("(internal)");
DATA(insert OID = 3741 ( thesaurus_lexize PGNSP PGUID 12 1 0 0 0 f f f f t f i s 4 0 2281 "2281 2281 2281 2281" _null_ _null_ _null_ _null_ _null_ thesaurus_lexize _null_ _null_ _null_ ));
DESCR("(internal)");
0005-pg-ts-shared-dictinaries-view-v3.patch (text/plain; charset=us-ascii)
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 71e20f2740..00faef73ed 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -8228,6 +8228,11 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
<entry>time zone names</entry>
</row>
+ <row>
+ <entry><link linkend="view-pg-ts-shared-dictionaries"><structname>pg_ts_shared_dictionaries</structname></link></entry>
+ <entry>dictionaries currently in shared memory</entry>
+ </row>
+
<row>
<entry><link linkend="view-pg-user"><structname>pg_user</structname></link></entry>
<entry>database users</entry>
@@ -10983,6 +10988,63 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</sect1>
+ <sect1 id="view-pg-ts-shared-dictionaries">
+ <title><structname>pg_ts_shared_dictionaries</structname></title>
+
+ <indexterm zone="view-pg-ts-shared-dictionaries">
+ <primary>pg_ts_shared_dictionaries</primary>
+ </indexterm>
+
+ <para>
+ The <structname>pg_ts_shared_dictionaries</structname> view provides a
+ listing of all text search dictionaries that are currently allocated in the
+ shared memory. The size of available space in shared memory is controlled by
+ <xref linkend="guc-shared-buffers"/>. A dictionary may have an option which
+ controls allocation in shared memory (see <xref linkend="textsearch-ispell-dictionary"/>).
+ </para>
+
+ <table>
+ <title><structname>pg_ts_shared_dictionaries</structname> Columns</title>
+
+ <tgroup cols="4">
+ <thead>
+ <row>
+ <entry>Name</entry>
+ <entry>Type</entry>
+ <entry>References</entry>
+ <entry>Description</entry>
+ </row>
+ </thead>
+ <tbody>
+ <row>
+ <entry><structfield>dictoid</structfield></entry>
+ <entry><type>oid</type></entry>
+ <entry><literal><link linkend="catalog-pg-ts-dict"><structname>pg_ts_dict</structname></link>.oid</literal></entry>
+ <entry>The OID of the text search dictionary located in shared memory</entry>
+ </row>
+ <row>
+ <entry><structfield>schemaname</structfield></entry>
+ <entry><type>name</type></entry>
+ <entry><literal><link linkend="catalog-pg-namespace"><structname>pg_namespace</structname></link>.nspname</literal></entry>
+ <entry>The name of the schema containing the text search dictionary</entry>
+ </row>
+ <row>
+ <entry><structfield>dictname</structfield></entry>
+ <entry><type>name</type></entry>
+ <entry><literal><link linkend="catalog-pg-ts-dict"><structname>pg_ts_dict</structname></link>.dictname</literal></entry>
+ <entry>The text search dictionary name</entry>
+ </row>
+ <row>
+ <entry><structfield>size</structfield></entry>
+ <entry><type>bigint</type></entry>
+ <entry></entry>
+ <entry>Size of the text search dictionary in bytes</entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+ </sect1>
+
<sect1 id="view-pg-user">
<title><structname>pg_user</structname></title>
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index 20d637dde3..e1829277d0 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -3044,6 +3044,12 @@ CREATE TEXT SEARCH DICTIONARY english_stem (
parameter value greater than zero before server starting.
</para>
+ <para>
+ The list of dictionaries currently located in shared memory can be retrieved from the
+ <link linkend="view-pg-ts-shared-dictionaries"><structname>pg_ts_shared_dictionaries</structname></link>
+ view.
+ </para>
+
</sect2>
</sect1>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 5652e9ee6d..c663db3cf2 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -504,6 +504,9 @@ CREATE VIEW pg_config AS
REVOKE ALL on pg_config FROM PUBLIC;
REVOKE EXECUTE ON FUNCTION pg_config() FROM PUBLIC;
+CREATE VIEW pg_ts_shared_dictionaries AS
+ SELECT * FROM pg_ts_shared_dictionaries();
+
-- Statistics views
CREATE VIEW pg_stat_all_tables AS
diff --git a/src/backend/tsearch/ts_shared.c b/src/backend/tsearch/ts_shared.c
index c7ee71412a..ff9163ab6f 100644
--- a/src/backend/tsearch/ts_shared.c
+++ b/src/backend/tsearch/ts_shared.c
@@ -13,11 +13,18 @@
*/
#include "postgres.h"
+#include "funcapi.h"
+#include "miscadmin.h"
+
+#include "access/htup_details.h"
+#include "catalog/pg_ts_dict.h"
#include "lib/dshash.h"
#include "storage/lwlock.h"
#include "storage/shmem.h"
#include "tsearch/ts_shared.h"
+#include "utils/builtins.h"
#include "utils/hashutils.h"
+#include "utils/lsyscache.h"
#include "utils/memutils.h"
@@ -373,3 +380,100 @@ recheck_table:
MemoryContextSwitchTo(old_context);
}
+
+/*
+ * pg_ts_shared_dictionaries - SQL SRF showing dictionaries currently in
+ * shared memory.
+ */
+Datum
+pg_ts_shared_dictionaries(PG_FUNCTION_ARGS)
+{
+ ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
+ MemoryContext oldcontext;
+ TupleDesc tupdesc;
+ Tuplestorestate *tupstore;
+ Relation rel;
+ HeapTuple tuple;
+ SysScanDesc scan;
+
+ /* check to see if caller supports us returning a tuplestore */
+ if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("set-valued function called in context that cannot accept a set")));
+ if (!(rsinfo->allowedModes & SFRM_Materialize))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("materialize mode required, but it is not " \
+ "allowed in this context")));
+
+ /* Build a tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ elog(ERROR, "return type must be a row type");
+
+ /* Build tuplestore to hold the result rows */
+ oldcontext = MemoryContextSwitchTo(rsinfo->econtext->ecxt_per_query_memory);
+
+ tupstore = tuplestore_begin_heap(true, false, work_mem);
+ rsinfo->returnMode = SFRM_Materialize;
+ rsinfo->setResult = tupstore;
+ rsinfo->setDesc = tupdesc;
+
+ MemoryContextSwitchTo(oldcontext);
+
+ init_dict_table();
+
+ /*
+ * If a hash table wasn't created return zero records.
+ */
+ if (!DsaPointerIsValid(tsearch_ctl->dict_table_handle))
+ {
+ tuplestore_donestoring(tupstore);
+
+ PG_RETURN_VOID();
+ }
+
+ /* Start to scan pg_ts_dict */
+ rel = heap_open(TSDictionaryRelationId, AccessShareLock);
+ scan = systable_beginscan(rel, InvalidOid, false, NULL, 0, NULL);
+
+ while (HeapTupleIsValid(tuple = systable_getnext(scan)))
+ {
+ Datum values[4];
+ bool nulls[4];
+ Form_pg_ts_dict dict = (Form_pg_ts_dict) GETSTRUCT(tuple);
+ Oid dictid = HeapTupleGetOid(tuple);
+ TsearchDictEntry *entry;
+ NameData dict_name;
+
+ /* If the dictionary isn't located in shared memory, try the next one */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, &dictid, false);
+ if (!entry)
+ continue;
+
+ namecpy(&dict_name, &dict->dictname);
+
+ memset(nulls, 0, sizeof(nulls));
+
+ values[0] = ObjectIdGetDatum(dictid);
+
+ if (OidIsValid(dict->dictnamespace))
+ values[1] = CStringGetDatum(get_namespace_name(dict->dictnamespace));
+ else
+ nulls[1] = true;
+
+ values[2] = NameGetDatum(&dict_name);
+ values[3] = Int64GetDatum(entry->dict_size);
+
+ dshash_release_lock(dict_table, entry);
+
+ tuplestore_putvalues(tupstore, tupdesc, values, nulls);
+ }
+
+ systable_endscan(scan);
+ heap_close(rel, AccessShareLock);
+
+ tuplestore_donestoring(tupstore);
+
+ PG_RETURN_VOID();
+}
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 4c45c432e7..1a6e7662ef 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -4933,6 +4933,9 @@ DESCR("trigger for automatic update of tsvector column");
DATA(insert OID = 3759 ( get_current_ts_config PGNSP PGUID 12 1 0 0 0 f f f f t f s s 0 0 3734 "" _null_ _null_ _null_ _null_ _null_ get_current_ts_config _null_ _null_ _null_ ));
DESCR("get current tsearch configuration");
+DATA(insert OID = 4213 ( pg_ts_shared_dictionaries PGNSP PGUID 12 1 10 0 0 f f f f f t s s 0 0 2249 "" "{26,19,19,20}" "{o,o,o,o}" "{dictoid,schemaname,dictname,size}" _null_ _null_ pg_ts_shared_dictionaries _null_ _null_ _null_ ));
+DESCR("information about text search dictionaries currently in shared memory");
+
DATA(insert OID = 3736 ( regconfigin PGNSP PGUID 12 1 0 0 0 f f f f t f s s 1 0 3734 "2275" _null_ _null_ _null_ _null_ _null_ regconfigin _null_ _null_ _null_ ));
DESCR("I/O");
DATA(insert OID = 3737 ( regconfigout PGNSP PGUID 12 1 0 0 0 f f f f t f s s 1 0 2275 "3734" _null_ _null_ _null_ _null_ _null_ regconfigout _null_ _null_ _null_ ));
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 5433944c6a..235b066119 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2209,6 +2209,11 @@ pg_timezone_names| SELECT pg_timezone_names.name,
pg_timezone_names.utc_offset,
pg_timezone_names.is_dst
FROM pg_timezone_names() pg_timezone_names(name, abbrev, utc_offset, is_dst);
+pg_ts_shared_dictionaries| SELECT pg_ts_shared_dictionaries.dictoid,
+ pg_ts_shared_dictionaries.schemaname,
+ pg_ts_shared_dictionaries.dictname,
+ pg_ts_shared_dictionaries.size
+ FROM pg_ts_shared_dictionaries() pg_ts_shared_dictionaries(dictoid, schemaname, dictname, size);
pg_user| SELECT pg_shadow.usename,
pg_shadow.usesysid,
pg_shadow.usecreatedb,
diff --git a/src/test/regress/expected/tsdicts.out b/src/test/regress/expected/tsdicts.out
index 0744ef803b..e844f92f4e 100644
--- a/src/test/regress/expected/tsdicts.out
+++ b/src/test/regress/expected/tsdicts.out
@@ -580,3 +580,55 @@ SELECT to_tsvector('thesaurus_tst', 'Booking tickets is looking like a booking a
'card':3,10 'invit':2,9 'like':6 'look':5 'order':1,8
(1 row)
+-- Test shared dictionaries
+CREATE TEXT SEARCH DICTIONARY shared_ispell (
+ Template=ispell,
+ DictFile=ispell_sample,
+ AffFile=ispell_sample
+);
+-- Make sure that dictionaries in shared memory
+SELECT ts_lexize('ispell', 'skies');
+ ts_lexize
+-----------
+ {sky}
+(1 row)
+
+SELECT ts_lexize('hunspell', 'skies');
+ ts_lexize
+-----------
+ {sky}
+(1 row)
+
+SELECT ts_lexize('hunspell_long', 'skies');
+ ts_lexize
+-----------
+ {sky}
+(1 row)
+
+SELECT ts_lexize('shared_ispell', 'skies');
+ ts_lexize
+-----------
+ {sky}
+(1 row)
+
+SELECT schemaname, dictname FROM pg_ts_shared_dictionaries;
+ schemaname | dictname
+------------+---------------
+ public | ispell
+ public | hunspell
+ public | hunspell_long
+ public | hunspell_num
+ public | shared_ispell
+(5 rows)
+
+-- shared_ispell space should be released in shared memory
+DROP TEXT SEARCH DICTIONARY shared_ispell;
+SELECT schemaname, dictname FROM pg_ts_shared_dictionaries;
+ schemaname | dictname
+------------+---------------
+ public | ispell
+ public | hunspell
+ public | hunspell_long
+ public | hunspell_num
+(4 rows)
+
diff --git a/src/test/regress/sql/tsdicts.sql b/src/test/regress/sql/tsdicts.sql
index a5a569e1ad..cdcde447f4 100644
--- a/src/test/regress/sql/tsdicts.sql
+++ b/src/test/regress/sql/tsdicts.sql
@@ -188,3 +188,22 @@ ALTER TEXT SEARCH CONFIGURATION thesaurus_tst ALTER MAPPING FOR
SELECT to_tsvector('thesaurus_tst', 'one postgres one two one two three one');
SELECT to_tsvector('thesaurus_tst', 'Supernovae star is very new star and usually called supernovae (abbreviation SN)');
SELECT to_tsvector('thesaurus_tst', 'Booking tickets is looking like a booking a tickets');
+
+-- Test shared dictionaries
+CREATE TEXT SEARCH DICTIONARY shared_ispell (
+ Template=ispell,
+ DictFile=ispell_sample,
+ AffFile=ispell_sample
+);
+-- Make sure that dictionaries in shared memory
+SELECT ts_lexize('ispell', 'skies');
+SELECT ts_lexize('hunspell', 'skies');
+SELECT ts_lexize('hunspell_long', 'skies');
+SELECT ts_lexize('shared_ispell', 'skies');
+
+SELECT schemaname, dictname FROM pg_ts_shared_dictionaries;
+
+-- shared_ispell space should be released in shared memory
+DROP TEXT SEARCH DICTIONARY shared_ispell;
+
+SELECT schemaname, dictname FROM pg_ts_shared_dictionaries;
0006-Shared-memory-ispell-option-v3.patch (text/plain; charset=us-ascii)
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index e1829277d0..ed11a162c2 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -2828,6 +2828,7 @@ iconv -f ISO_8859-1 -t UTF-8 -o nn_no.dict nn_NO.dic
<programlisting>
CREATE TEXT SEARCH DICTIONARY english_hunspell (
TEMPLATE = ispell,
+ SharedMemory = false,
DictFile = en_us,
AffFile = en_us,
Stopwords = english);
@@ -2842,6 +2843,9 @@ CREATE TEXT SEARCH DICTIONARY english_hunspell (
The stop-words file has the same format explained above for the
<literal>simple</literal> dictionary type. The format of the other files is
not specified here but is available from the above-mentioned web sites.
+ <literal>SharedMemory</literal> controls loading into shared memory. By
+ default it is <literal>true</literal> (see more in
+ <xref linkend="textsearch-shared-dictionaries"/>).
</para>
<para>
@@ -3036,7 +3040,8 @@ CREATE TEXT SEARCH DICTIONARY english_stem (
Some dictionaries, especially <application>Ispell</application>, consumes a
noticable value of memory. Size of a dictionary can reach tens of megabytes.
Most of them also stores configuration in text files. A dictionary is compiled
- during first access per a user session.
+ during first access per a user session. Currently only
+ <application>Ispell</application> supports loading into shared memory.
</para>
<para>
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index 60ef770dbd..7119d30820 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -38,7 +38,8 @@ typedef struct
} DictISpell;
static void parse_dictoptions(List *dictoptions,
- char **dictfile, char **afffile, char **stopfile);
+ char **dictfile, char **afffile, char **stopfile,
+ bool *isshared);
static void *dispell_build(List *dictoptions, Size *size);
Datum
@@ -49,15 +50,22 @@ dispell_init(PG_FUNCTION_ARGS)
DictISpell *d;
void *dict_location;
char *stopfile;
+ bool isshared;
d = (DictISpell *) palloc0(sizeof(DictISpell));
- parse_dictoptions(dictoptions, NULL, NULL, &stopfile);
+ parse_dictoptions(dictoptions, NULL, NULL, &stopfile, &isshared);
+ /* Make stop word list */
if (stopfile)
readstoplist(stopfile, &(d->stoplist), lowerstr);
- dict_location = ts_dict_shmem_location(dictid, dictoptions, dispell_build);
+ /* Build the dictionary itself, or get it from shared memory */
+ if (isshared)
+ dict_location = ts_dict_shmem_location(dictid, dictoptions, dispell_build);
+ else
+ dict_location = dispell_build(dictoptions, NULL);
+
Assert(dict_location);
d->obj.dict = (IspellDictData *) dict_location;
@@ -111,9 +119,10 @@ dispell_lexize(PG_FUNCTION_ARGS)
static void
parse_dictoptions(List *dictoptions, char **dictfile, char **afffile,
- char **stopfile)
+ char **stopfile, bool *isshared)
{
ListCell *l;
+ bool isshared_defined = false;
if (dictfile)
*dictfile = NULL;
@@ -121,6 +130,8 @@ parse_dictoptions(List *dictoptions, char **dictfile, char **afffile,
*afffile = NULL;
if (stopfile)
*stopfile = NULL;
+ if (isshared)
+ *isshared = true;
foreach(l, dictoptions)
{
@@ -159,6 +170,19 @@ parse_dictoptions(List *dictoptions, char **dictfile, char **afffile,
errmsg("multiple StopWords parameters")));
*stopfile = defGetString(defel);
}
+ else if (pg_strcasecmp(defel->defname, "SharedMemory") == 0)
+ {
+ if (!isshared)
+ continue;
+
+ if (isshared_defined)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("multiple SharedMemory parameters")));
+
+ *isshared = defGetBoolean(defel);
+ isshared_defined = true;
+ }
else
{
ereport(ERROR,
@@ -181,7 +205,7 @@ dispell_build(List *dictoptions, Size *size)
char *dictfile,
*afffile;
- parse_dictoptions(dictoptions, &dictfile, &afffile, NULL);
+ parse_dictoptions(dictoptions, &dictfile, &afffile, NULL, NULL);
if (!afffile)
{
@@ -213,6 +237,7 @@ dispell_build(List *dictoptions, Size *size)
NIFinishBuild(&build);
/* Return the buffer and its size */
- *size = build.dict_size;
+ if (size)
+ *size = build.dict_size;
return build.dict;
}
diff --git a/src/test/regress/expected/tsdicts.out b/src/test/regress/expected/tsdicts.out
index e844f92f4e..778b5ee0c1 100644
--- a/src/test/regress/expected/tsdicts.out
+++ b/src/test/regress/expected/tsdicts.out
@@ -586,6 +586,12 @@ CREATE TEXT SEARCH DICTIONARY shared_ispell (
DictFile=ispell_sample,
AffFile=ispell_sample
);
+CREATE TEXT SEARCH DICTIONARY nonshared_ispell (
+ Template=ispell,
+ SharedMemory=false,
+ DictFile=ispell_sample,
+ AffFile=ispell_sample
+);
-- Make sure that dictionaries in shared memory
SELECT ts_lexize('ispell', 'skies');
ts_lexize
@@ -611,6 +617,12 @@ SELECT ts_lexize('shared_ispell', 'skies');
{sky}
(1 row)
+SELECT ts_lexize('nonshared_ispell', 'skies');
+ ts_lexize
+-----------
+ {sky}
+(1 row)
+
SELECT schemaname, dictname FROM pg_ts_shared_dictionaries;
schemaname | dictname
------------+---------------
diff --git a/src/test/regress/sql/tsdicts.sql b/src/test/regress/sql/tsdicts.sql
index cdcde447f4..6f0d8d444a 100644
--- a/src/test/regress/sql/tsdicts.sql
+++ b/src/test/regress/sql/tsdicts.sql
@@ -195,11 +195,20 @@ CREATE TEXT SEARCH DICTIONARY shared_ispell (
DictFile=ispell_sample,
AffFile=ispell_sample
);
+
+CREATE TEXT SEARCH DICTIONARY nonshared_ispell (
+ Template=ispell,
+ SharedMemory=false,
+ DictFile=ispell_sample,
+ AffFile=ispell_sample
+);
+
-- Make sure that dictionaries in shared memory
SELECT ts_lexize('ispell', 'skies');
SELECT ts_lexize('hunspell', 'skies');
SELECT ts_lexize('hunspell_long', 'skies');
SELECT ts_lexize('shared_ispell', 'skies');
+SELECT ts_lexize('nonshared_ispell', 'skies');
SELECT schemaname, dictname FROM pg_ts_shared_dictionaries;
Hi,
On 01/24/2018 06:20 PM, Arthur Zakirov wrote:
On Sat, Jan 13, 2018 at 06:22:41PM +0300, Arthur Zakirov wrote:
I think your proposals may be implemented in several patches, so
they can be applied independently but consistently. I suppose I
will prepare a new version of the patch with fixes and with an initial
design of new functions and commands soon.
I attached a new version of the patch.
Thanks. I don't have time to review/test this before FOSDEM, but a
couple of comments regarding some of the points you mentioned.
3) How do I unload a dictionary from the shared memory?
...
ALTER TEXT SEARCH DICTIONARY x UNLOAD
4) How do I reload a dictionary?
...
ALTER TEXT SEARCH DICTIONARY x RELOAD
I thought about it. And it seems to me that we can use functions
ts_unload() and ts_reload() instead of new syntax. We already have
text search functions like ts_lexize() and ts_debug(), and it is
better to keep consistency.
This argument seems a bit strange. Both ts_lexize() and ts_debug() are
operating on text values, and are meant to be executed as functions from
SQL - particularly ts_lexize(). It's hard to imagine this implemented as
DDL commands.
The unload/reload is something that operates on a database object
(dictionary), which already has create/drop/alter DDL. So it seems
somewhat natural to treat unload/reload as another DDL action.
Taken to an extreme, this argument would essentially mean we should not
have any DDL commands because we have SQL functions.
That being said, I'm not particularly attached to having this DDL now.
Implementing it seems straight-forward (particularly when we already
have the stuff implemented as functions), and some of the other open
questions seem more important to tackle now.
I think there are two approaches for ts_unload():
- use DSM's pin and unpin methods and the invalidation callback, as
it was done when fixing the memory leak. It has the drawback that it won't
have an immediate effect, because DSM will be released only when all
backends unpin the DSM mapping.
- use DSA and the dsa_free() method. As far as I understand, dsa_free()
frees allocated memory immediately. But it requires more work,
because we will need some more locks. For instance, what happens
when someone calls ts_lexize() and someone else calls dsa_free() at
the same time?
No opinion on this yet, I have to think about it for a bit and look at
the code first.
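To make the first approach concrete, here is a minimal standalone sketch of deferred release, assuming a simple per-segment pin counter. It does not use the real DSM API; DictSegment and the segment_* functions are hypothetical names invented for illustration. It only shows why an unload request has no immediate effect while some backend still has the segment mapped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a shared segment holding one dictionary. */
typedef struct DictSegment
{
	int		pin_count;	/* backends that currently map the segment */
	bool	unloaded;	/* an unload was requested */
	bool	released;	/* the memory was actually given back */
} DictSegment;

/* Give the memory back only if unload was requested and nobody maps it. */
void
segment_maybe_release(DictSegment *seg)
{
	if (seg->unloaded && seg->pin_count == 0)
		seg->released = true;
}

/* A backend starts using the dictionary. */
void
segment_pin(DictSegment *seg)
{
	assert(!seg->released);
	seg->pin_count++;
}

/* A backend stops using the dictionary (e.g. from an invalidation callback). */
void
segment_unpin(DictSegment *seg)
{
	assert(seg->pin_count > 0);
	seg->pin_count--;
	segment_maybe_release(seg);
}

/* An unload request: takes effect at once only if there are no pins. */
void
segment_request_unload(DictSegment *seg)
{
	seg->unloaded = true;
	segment_maybe_release(seg);
}
```

With two backends pinned, segment_request_unload() alone leaves the segment in place; it is released only by the second segment_unpin(). The dsa_free() approach would free the memory inside the unload call itself, which is why it needs extra locking against concurrent readers.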
7) You mentioned you had to get rid of the compact_palloc0 - can you
elaborate a bit why that was necessary? Also, when benchmarking the
impact of this make sure to measure not only the time but also memory
consumption.
It seems to me that there is no need for compact_palloc0() anymore. Tests
show that the Czech dictionary doesn't consume more memory after the
patch.
That's interesting. I'll do some additional tests to verify the finding.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
2018-01-24 20:57 GMT+03:00 Tomas Vondra <tomas.vondra@2ndquadrant.com>:
Thanks. I don't have time to review/test this before FOSDEM, but a
couple of comments regarding some of the points you mentioned.
Thank you for your thoughts.
I thought about it. And it seems to me that we can use functions
ts_unload() and ts_reload() instead of new syntax. We already have
text search functions like ts_lexize() and ts_debug(), and it is
better to keep consistency.
This argument seems a bit strange. Both ts_lexize() and ts_debug() are
operating on text values, and are meant to be executed as functions from
SQL - particularly ts_lexize(). It's hard to imagine this implemented as
DDL commands.
The unload/reload is something that operates on a database object
(dictionary), which already has create/drop/alter DDL. So it seems
somewhat natural to treat unload/reload as another DDL action.
Taken to an extreme, this argument would essentially mean we should not
have any DDL commands because we have SQL functions.
That being said, I'm not particularly attached to having this DDL now.
Implementing it seems straight-forward (particularly when we already
have the stuff implemented as functions), and some of the other open
questions seem more important to tackle now.
I understand your point. I don't have a strong opinion on the subject yet.
And I agree that they can be implemented in future improvements for shared
dictionaries.
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
On Wed, 24 Jan 2018 20:20:41 +0300
Arthur Zakirov <a.zakirov@postgrespro.ru> wrote:
Hi, I did some review of the patch.
In 0001 there are a few lines where only the indentation has changed.
0002:
- TsearchShmemSize - calculating size using hash_estimate_size seems
redundant since you use DSA hash now.
- ts_dict_shmem_release - LWLockAcquire in the beginning makes no
sense, since dict_table couldn't change anyway.
0003:
- ts_dict_shmem_location could return IspellDictData, it makes more
sense.
0006:
It's very subjective, but I think it would be nicer to call the option
Shared (as a property of the dictionary) or UseSharedMemory; a boolean
option called SharedMemory sounds weird.
Overall the patches look good, and all tests passed. I tried to break it in
a few places where I thought it could be unsafe, but did not succeed.
--
---
Ildus Kurbangaliev
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Hello,
Thank you for your review! Good catches.
On Thu, Jan 25, 2018 at 03:26:46PM +0300, Ildus Kurbangaliev wrote:
In 0001 there are a few lines where only the indentation has changed.
Fixed.
0002:
- TsearchShmemSize - calculating size using hash_estimate_size seems
redundant since you use DSA hash now.
Fixed. True, there is no need for hash_estimate_size anymore.
- ts_dict_shmem_release - LWLockAcquire in the beginning makes no
sense, since dict_table couldn't change anyway.
Fixed. In an earlier version tsearch_ctl was used here, but I forgot to remove LWLockAcquire.
0003:
- ts_dict_shmem_location could return IspellDictData, it makes more
sense.
I assume that ts_dict_shmem_location can be used by various types of dictionaries, not only by Ispell. So void * is more suitable here.
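The point about void * can be illustrated with a small standalone sketch. It is not the patch's code: dict_location, dict_build_fn, ToyDict and toy_build are hypothetical names, and the shared memory lookup is left out entirely. It only shows a generic entry point that takes a builder callback and returns an untyped pointer, which each dictionary type casts to its own struct. The optional size argument mirrors the NULL check the 0006 patch adds to dispell_build().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Builder callback: returns the compiled dictionary; reports its size if asked. */
typedef void *(*dict_build_fn) (const char *options, size_t *size);

/*
 * Hypothetical generic entry point. Real code would first look the dictionary
 * up in shared memory; this sketch always calls the builder.
 */
void *
dict_location(const char *options, dict_build_fn build, size_t *size)
{
	assert(build != NULL);
	return build(options, size);
}

/* One possible dictionary type; another template would have its own struct. */
typedef struct ToyDict
{
	int		nwords;
} ToyDict;

void *
toy_build(const char *options, size_t *size)
{
	ToyDict    *d = malloc(sizeof(ToyDict));

	if (d == NULL)
		return NULL;
	d->nwords = (options != NULL) ? 1 : 0;
	if (size)					/* callers may pass NULL if they don't care */
		*size = sizeof(ToyDict);
	return d;
}
```

A caller for this toy type would write (ToyDict *) dict_location(opts, toy_build, &sz), while an Ispell caller would cast the same return value to its own data struct.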
0006:
It's very subjective, but I think it would be nicer to call the option
Shared (as a property of the dictionary) or UseSharedMemory; a boolean
option called SharedMemory sounds weird.
Agreed. In our offline conversation we settled on Shareable, meaning that a dictionary can be shared. It may be more appropriate, because setting Shareable=true doesn't guarantee that a dictionary will be allocated in shared memory, due to the max_shared_dictionaries_size GUC.
A new version of the patch is attached.
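The distinction between "can be shared" and "is shared" can be sketched as a small decision function. This is an illustration only, not code from the patch: choose_dict_placement and DictPlacement are hypothetical names, max_bytes stands in for max_shared_dictionaries_size, and the exact boundary condition (a dictionary that fits exactly is still shared) is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum DictPlacement
{
	DICT_LOCAL,					/* built in backend-local memory */
	DICT_SHARED					/* copied into shared memory */
} DictPlacement;

/*
 * Decide where a freshly compiled dictionary goes.
 *
 * used_bytes: space already taken by shared dictionaries
 * max_bytes:  stand-in for max_shared_dictionaries_size
 */
DictPlacement
choose_dict_placement(bool shareable, size_t dict_size,
					  size_t used_bytes, size_t max_bytes)
{
	assert(used_bytes <= max_bytes);

	/* Shareable=false always means a backend-local copy */
	if (!shareable)
		return DICT_LOCAL;

	/* Compare against the remaining space to avoid overflow in an addition */
	if (dict_size > max_bytes - used_bytes)
		return DICT_LOCAL;

	return DICT_SHARED;
}
```

So Shareable=true is a permission, not a guarantee: with 95 of 100 units already used, a 10-unit dictionary falls back to local memory even though it is shareable.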
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Attachments:
0001-Fix-ispell-memory-handling-v4.patch (text/plain; charset=us-ascii)
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index b9fdd77e19..e071994523 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -78,6 +78,8 @@
#define tmpalloc(sz) MemoryContextAlloc(Conf->buildCxt, (sz))
#define tmpalloc0(sz) MemoryContextAllocZero(Conf->buildCxt, (sz))
+#define tmpstrdup(str) MemoryContextStrdup(Conf->buildCxt, (str))
+
/*
* Prepare for constructing an ISpell dictionary.
*
@@ -498,7 +500,7 @@ NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
strcpy(Conf->Spell[Conf->nspell]->word, word);
Conf->Spell[Conf->nspell]->p.flag = (*flag != '\0')
- ? cpstrdup(Conf, flag) : VoidString;
+ ? tmpstrdup(flag) : VoidString;
Conf->nspell++;
}
@@ -1040,7 +1042,7 @@ setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
entry->flag.i = i;
}
else
- entry->flag.s = cpstrdup(Conf, s);
+ entry->flag.s = tmpstrdup(s);
entry->flagMode = Conf->flagMode;
entry->value = val;
@@ -1536,6 +1538,9 @@ nextline:
return;
isnewformat:
+ pfree(recoded);
+ pfree(pstr);
+
if (oldformat)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
0002-Retreive-shmem-location-for-ispell-v4.patch (text/plain; charset=us-ascii)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 45b2af14eb..46617df852 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1364,6 +1364,35 @@ include_dir 'conf.d'
</listitem>
</varlistentry>
+ <varlistentry id="guc-max-shared-dictionaries-size" xreflabel="max_shared_dictionaries_size">
+ <term><varname>max_shared_dictionaries_size</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>max_shared_dictionaries_size</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum size of all text search dictionaries loaded into shared
+ memory. The default is 100 megabytes (<literal>100MB</literal>). This
+ parameter can only be set at server start.
+ </para>
+
+ <para>
+ Currently controls only loading of <application>Ispell</application>
+ dictionaries (see <xref linkend="textsearch-ispell-dictionary"/>).
+ After compiling the dictionary it will be copied into shared memory.
+ Other backends will use the dictionary from shared memory on first use,
+ so they don't need to compile it a second time.
+ </para>
+
+ <para>
+ If total size of simultaneously loaded dictionaries reaches the maximum
+ allowed size then a new dictionary will be loaded into local memory of
+ a backend.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-huge-pages" xreflabel="huge_pages">
<term><varname>huge_pages</varname> (<type>enum</type>)
<indexterm>
diff --git a/src/backend/commands/tsearchcmds.c b/src/backend/commands/tsearchcmds.c
index bdf3857ce4..42be77d045 100644
--- a/src/backend/commands/tsearchcmds.c
+++ b/src/backend/commands/tsearchcmds.c
@@ -39,6 +39,7 @@
#include "nodes/makefuncs.h"
#include "parser/parse_func.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_shared.h"
#include "tsearch/ts_utils.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
@@ -396,7 +397,8 @@ verify_dictoptions(Oid tmplId, List *dictoptions)
* Call the init method and see if it complains. We don't worry about
* it leaking memory, since our command will soon be over anyway.
*/
- (void) OidFunctionCall1(initmethod, PointerGetDatum(dictoptions));
+ (void) OidFunctionCall2(initmethod, PointerGetDatum(dictoptions),
+ ObjectIdGetDatum(InvalidOid));
}
ReleaseSysCache(tup);
@@ -513,6 +515,8 @@ RemoveTSDictionaryById(Oid dictId)
CatalogTupleDelete(relation, &tup->t_self);
+ ts_dict_shmem_release(dictId);
+
ReleaseSysCache(tup);
heap_close(relation, RowExclusiveLock);
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 0c86a581c0..c7dce8cac5 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -44,6 +44,7 @@
#include "storage/procsignal.h"
#include "storage/sinvaladt.h"
#include "storage/spin.h"
+#include "tsearch/ts_shared.h"
#include "utils/backend_random.h"
#include "utils/snapmgr.h"
@@ -150,6 +151,7 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
size = add_size(size, SyncScanShmemSize());
size = add_size(size, AsyncShmemSize());
size = add_size(size, BackendRandomShmemSize());
+ size = add_size(size, TsearchShmemSize());
#ifdef EXEC_BACKEND
size = add_size(size, ShmemBackendArraySize());
#endif
@@ -271,6 +273,11 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
AsyncShmemInit();
BackendRandomShmemInit();
+ /*
+ * Set up shared memory for tsearch
+ */
+ TsearchShmemInit();
+
#ifdef EXEC_BACKEND
/*
diff --git a/src/backend/tsearch/Makefile b/src/backend/tsearch/Makefile
index 227468ae9e..860cd196e9 100644
--- a/src/backend/tsearch/Makefile
+++ b/src/backend/tsearch/Makefile
@@ -26,7 +26,7 @@ DICTFILES_PATH=$(addprefix dicts/,$(DICTFILES))
OBJS = ts_locale.o ts_parse.o wparser.o wparser_def.o dict.o \
dict_simple.o dict_synonym.o dict_thesaurus.o \
dict_ispell.o regis.o spell.o \
- to_tsany.o ts_selfuncs.o ts_typanalyze.o ts_utils.o
+ to_tsany.o ts_selfuncs.o ts_shared.o ts_typanalyze.o ts_utils.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/tsearch/ts_shared.c b/src/backend/tsearch/ts_shared.c
new file mode 100644
index 0000000000..7d1f7544cf
--- /dev/null
+++ b/src/backend/tsearch/ts_shared.c
@@ -0,0 +1,366 @@
+/*-------------------------------------------------------------------------
+ *
+ * ts_shared.c
+ * tsearch shared memory management
+ *
+ * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/tsearch/ts_shared.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "lib/dshash.h"
+#include "storage/lwlock.h"
+#include "storage/shmem.h"
+#include "tsearch/ts_shared.h"
+#include "utils/hashutils.h"
+#include "utils/memutils.h"
+
+
+/*
+ * Hash table structures
+ */
+typedef struct
+{
+ Oid dict_id;
+ dsm_handle dict_dsm;
+ Size dict_size;
+
+ /* How many backends have DSM mapping */
+ uint32 refcnt;
+} TsearchDictEntry;
+
+static dshash_table *dict_table = NULL;
+
+/*
+ * Shared struct for locking
+ */
+typedef struct
+{
+ dsa_handle area;
+ dshash_table_handle dict_table_handle;
+
+ /* Total size of loaded dictionaries into shared memory in bytes */
+ Size loaded_size;
+
+ LWLock lock;
+} TsearchCtlData;
+
+static TsearchCtlData *tsearch_ctl;
+
+/*
+ * GUC variable for maximum total size of shared dictionaries, in kilobytes.
+ * Default value is 100MB.
+ */
+int max_shared_dictionaries_size = 100 * 1024;
+
+static void init_dict_table(void);
+
+/* Parameters for dict_table */
+static const dshash_parameters dict_table_params ={
+ sizeof(Oid),
+ sizeof(TsearchDictEntry),
+ dshash_memcmp,
+ dshash_memhash,
+ LWTRANCHE_TSEARCH_TABLE
+};
+
+/*
+ * Build the dictionary using allocate_cb callback. If there is a space in
+ * shared memory and max_shared_dictionaries_size is greater than 0 copy the
+ * dictionary into DSM.
+ *
+ * If max_shared_dictionaries_size is greater than 0 then try to find the
+ * dictionary in shared hash table first. If it was built by someone earlier
+ * just return its location in DSM.
+ *
+ * dictid: Oid of the dictionary.
+ * arg: an argument to the callback function.
+ * allocate_cb: function to build the dictionary, if it wasn't found in DSM.
+ *
+ * Returns address in the dynamic shared memory segment or in backend memory.
+ */
+void *
+ts_dict_shmem_location(Oid dictid, List *dictoptions,
+ ispell_build_callback allocate_cb)
+{
+ TsearchDictEntry *entry;
+ bool found;
+ dsm_segment *seg;
+ void *dict,
+ *dict_location;
+
+#define CHECK_SHARED_SPACE() \
+ if (entry->dict_size + tsearch_ctl->loaded_size > \
+ max_shared_dictionaries_size * 1024L) \
+ { \
+ LWLockRelease(&tsearch_ctl->lock); \
+ ereport(LOG, \
+ (errmsg("there is no space in shared memory for text search " \
+ "dictionary %u, it will be loaded into backend's memory", \
+ dictid))); \
+ dshash_delete_entry(dict_table, entry); \
+ return dict; \
+ } \
+
+ init_dict_table();
+
+ /*
+ * Build the dictionary in backend's memory if a hash table wasn't created
+ * or dictid is invalid (it may happen if the dictionary's init method was
+ * called within verify_dictoptions()).
+ */
+ if (!DsaPointerIsValid(tsearch_ctl->dict_table_handle) ||
+ !OidIsValid(dictid))
+ {
+ Size dict_size;
+
+ dict = allocate_cb(dictoptions, &dict_size);
+
+ return dict;
+ }
+
+ /* Try to find an entry in the hash table */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, &dictid, false);
+
+ if (entry)
+ {
+ seg = dsm_find_mapping(entry->dict_dsm);
+ if (!seg)
+ {
+ seg = dsm_attach(entry->dict_dsm);
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+ }
+
+ entry->refcnt++;
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+ }
+
+ /* Dictionary hasn't been loaded into memory yet */
+ entry = (TsearchDictEntry *) dshash_find_or_insert(dict_table, &dictid,
+ &found);
+
+ if (found)
+ {
+ /*
+ * Someone concurrently inserted a dictionary entry since the first time
+ * we checked.
+ */
+ seg = dsm_attach(entry->dict_dsm);
+
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+
+ entry->refcnt++;
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+ }
+
+ /* Build the dictionary */
+ dict = allocate_cb(dictoptions, &entry->dict_size);
+
+ LWLockAcquire(&tsearch_ctl->lock, LW_SHARED);
+
+ /* Before allocating a DSM segment check remaining shared space */
+ Assert(max_shared_dictionaries_size);
+
+ CHECK_SHARED_SPACE();
+
+ LWLockRelease(&tsearch_ctl->lock);
+ /* If we come here, we need an exclusive lock */
+ while (!LWLockAcquireOrWait(&tsearch_ctl->lock, LW_EXCLUSIVE))
+ {
+ /*
+ * Check again in case there is no space left anymore after we were
+ * waiting for the exclusive lock.
+ */
+ CHECK_SHARED_SPACE();
+ }
+
+ tsearch_ctl->loaded_size += entry->dict_size;
+
+ LWLockRelease(&tsearch_ctl->lock);
+
+ /* Finally, allocate a DSM segment for the compiled dictionary */
+ seg = dsm_create(entry->dict_size, 0);
+ dict_location = dsm_segment_address(seg);
+ memcpy(dict_location, dict, entry->dict_size);
+
+ pfree(dict);
+
+ entry->dict_id = dictid;
+ entry->dict_dsm = dsm_segment_handle(seg);
+ entry->refcnt++;
+
+ /* Remain attached until end of postmaster */
+ dsm_pin_segment(seg);
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+}
+
+/*
+ * Release memory occupied by the dictionary. Function just unpins DSM mapping.
+ * If nobody else has a mapping to this DSM segment then unpin the segment.
+ *
+ * dictid: Oid of the dictionary.
+ */
+void
+ts_dict_shmem_release(Oid dictid)
+{
+ TsearchDictEntry *entry;
+
+ /*
+ * If we didn't attach to a hash table then do nothing.
+ */
+ if (!dict_table)
+ return;
+
+ /* Try to find an entry in the hash table */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, &dictid, true);
+
+ if (entry)
+ {
+ dsm_segment *seg;
+
+ seg = dsm_find_mapping(entry->dict_dsm);
+ /*
+ * If current backend didn't pin a mapping then we don't need to do
+ * unpinning.
+ */
+ if (!seg)
+ {
+ dshash_release_lock(dict_table, entry);
+ return;
+ }
+
+ dsm_unpin_mapping(seg);
+ dsm_detach(seg);
+
+ entry->refcnt--;
+
+ if (entry->refcnt == 0)
+ {
+ dsm_unpin_segment(entry->dict_dsm);
+ dshash_delete_entry(dict_table, entry);
+ }
+ else
+ dshash_release_lock(dict_table, entry);
+ }
+}
+
+/*
+ * Allocate and initialize tsearch-related shared memory.
+ */
+void
+TsearchShmemInit(void)
+{
+ bool found;
+
+ tsearch_ctl = (TsearchCtlData *)
+ ShmemInitStruct("Full Text Search Ctl", sizeof(TsearchCtlData), &found);
+
+ if (!found)
+ {
+ LWLockRegisterTranche(LWTRANCHE_TSEARCH_DSA, "tsearch_dsa");
+ LWLockRegisterTranche(LWTRANCHE_TSEARCH_TABLE, "tsearch_table");
+
+ LWLockInitialize(&tsearch_ctl->lock, LWTRANCHE_TSEARCH_DSA);
+
+ tsearch_ctl->area = DSM_HANDLE_INVALID;
+ tsearch_ctl->dict_table_handle = InvalidDsaPointer;
+ tsearch_ctl->loaded_size = 0;
+ }
+}
+
+/*
+ * Report shared memory space needed by TsearchShmemInit.
+ */
+Size
+TsearchShmemSize(void)
+{
+ Size size = 0;
+
+ /* size of service structure */
+ size = add_size(size, MAXALIGN(sizeof(TsearchCtlData)));
+
+ return size;
+}
+
+/*
+ * Initialize hash table located in DSM.
+ *
+ * The hash table should be created and initialized iff
+ * max_shared_dictionaries_size GUC is greater than zero and it doesn't exist
+ * yet.
+ */
+static void
+init_dict_table(void)
+{
+ MemoryContext old_context;
+ dsa_area *dsa;
+
+ if (max_shared_dictionaries_size == 0)
+ return;
+
+ if (dict_table)
+ return;
+
+ old_context = MemoryContextSwitchTo(TopMemoryContext);
+
+recheck_table:
+ LWLockAcquire(&tsearch_ctl->lock, LW_SHARED);
+
+ /* Hash table has already been created by someone */
+ if (DsaPointerIsValid(tsearch_ctl->dict_table_handle))
+ {
+ Assert(tsearch_ctl->area != DSM_HANDLE_INVALID);
+
+ dsa = dsa_attach(tsearch_ctl->area);
+
+ dict_table = dshash_attach(dsa,
+ &dict_table_params,
+ tsearch_ctl->dict_table_handle,
+ NULL);
+ }
+ else
+ {
+ /* Try to get exclusive lock */
+ LWLockRelease(&tsearch_ctl->lock);
+ if (!LWLockAcquireOrWait(&tsearch_ctl->lock, LW_EXCLUSIVE))
+ {
+ /*
+ * The lock was released by another backend, and that backend
+ * has concurrently created the hash table already.
+ */
+ goto recheck_table;
+ }
+
+ dsa = dsa_create(LWTRANCHE_TSEARCH_DSA);
+ tsearch_ctl->area = dsa_get_handle(dsa);
+
+ dict_table = dshash_create(dsa, &dict_table_params, NULL);
+ tsearch_ctl->dict_table_handle = dshash_get_hash_table_handle(dict_table);
+
+ /* Remain attached until end of postmaster */
+ dsa_pin(dsa);
+ }
+
+ LWLockRelease(&tsearch_ctl->lock);
+
+ /* Remain attached until end of session */
+ dsa_pin_mapping(dsa);
+
+ MemoryContextSwitchTo(old_context);
+}
diff --git a/src/backend/utils/cache/ts_cache.c b/src/backend/utils/cache/ts_cache.c
index 3d5c194148..c078503111 100644
--- a/src/backend/utils/cache/ts_cache.c
+++ b/src/backend/utils/cache/ts_cache.c
@@ -39,6 +39,7 @@
#include "catalog/pg_ts_template.h"
#include "commands/defrem.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_shared.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
#include "utils/fmgroids.h"
@@ -98,7 +99,16 @@ InvalidateTSCacheCallBack(Datum arg, int cacheid, uint32 hashvalue)
hash_seq_init(&status, hash);
while ((entry = (TSAnyCacheEntry *) hash_seq_search(&status)) != NULL)
+ {
+ if (entry->isvalid && hash == TSDictionaryCacheHash)
+ {
+ TSDictionaryCacheEntry *dict_entry = (TSDictionaryCacheEntry *) entry;
+
+ ts_dict_shmem_release(dict_entry->dictId);
+ }
+
entry->isvalid = false;
+ }
/* Also invalidate the current-config cache if it's pg_ts_config */
if (hash == TSConfigCacheHash)
@@ -334,8 +344,9 @@ lookup_ts_dictionary_cache(Oid dictId)
dictoptions = deserialize_deflist(opt);
entry->dictData =
- DatumGetPointer(OidFunctionCall1(template->tmplinit,
- PointerGetDatum(dictoptions)));
+ DatumGetPointer(OidFunctionCall2(template->tmplinit,
+ PointerGetDatum(dictoptions),
+ ObjectIdGetDatum(dictId)));
MemoryContextSwitchTo(oldcontext);
}
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 5884fa905e..f910528c78 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -76,6 +76,7 @@
#include "storage/predicate.h"
#include "tcop/tcopprot.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_shared.h"
#include "utils/builtins.h"
#include "utils/bytea.h"
#include "utils/guc_tables.h"
@@ -2912,6 +2913,20 @@ static struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"max_shared_dictionaries_size", PGC_POSTMASTER, RESOURCES_MEM,
+ gettext_noop("Sets the maximum size of all text search dictionaries loaded into shared memory."),
+ gettext_noop("Currently controls only loading of Ispell dictionaries. "
+ "If total size of simultaneously loaded dictionaries "
+ "reaches the maximum allowed size then a new dictionary "
+ "will be loaded into local memory of a backend."),
+ GUC_UNIT_KB,
+ },
+ &max_shared_dictionaries_size,
+ 100 * 1024, 0, MAX_KILOBYTES,
+ NULL, NULL, NULL
+ },
+
/* End-of-list marker */
{
{NULL, 0, 0, NULL, NULL}, NULL, 0, 0, 0, NULL, NULL, NULL
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index abffde6b2b..9fe2b0f85e 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -133,6 +133,7 @@
# mmap
# use none to disable dynamic shared memory
# (change requires restart)
+#max_shared_dictionaries_size = 100MB # (change requires restart)
# - Disk -
diff --git a/src/include/storage/lwlock.h b/src/include/storage/lwlock.h
index c21bfe2f66..16b0858eda 100644
--- a/src/include/storage/lwlock.h
+++ b/src/include/storage/lwlock.h
@@ -219,6 +219,8 @@ typedef enum BuiltinTrancheIds
LWTRANCHE_SHARED_TUPLESTORE,
LWTRANCHE_TBM,
LWTRANCHE_PARALLEL_APPEND,
+ LWTRANCHE_TSEARCH_DSA,
+ LWTRANCHE_TSEARCH_TABLE,
LWTRANCHE_FIRST_USER_DEFINED
} BuiltinTrancheIds;
diff --git a/src/include/tsearch/ts_shared.h b/src/include/tsearch/ts_shared.h
new file mode 100644
index 0000000000..d6a27c9037
--- /dev/null
+++ b/src/include/tsearch/ts_shared.h
@@ -0,0 +1,33 @@
+/*-------------------------------------------------------------------------
+ *
+ * ts_shared.h
+ * tsearch shared memory management
+ *
+ * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
+ *
+ * src/include/tsearch/ts_shared.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef TS_SHARED_H
+#define TS_SHARED_H
+
+#include "c.h"
+
+#include "nodes/pg_list.h"
+
+/*
+ * GUC variable for maximum total size of shared dictionaries (in kilobytes)
+ */
+extern int max_shared_dictionaries_size;
+
+typedef void *(*ispell_build_callback) (List *dictoptions, Size *size);
+
+extern void *ts_dict_shmem_location(Oid dictid, List *dictoptions,
+ ispell_build_callback allocate_cb);
+extern void ts_dict_shmem_release(Oid dictid);
+
+extern void TsearchShmemInit(void);
+extern Size TsearchShmemSize(void);
+
+#endif /* TS_SHARED_H */
0003-Store-ispell-structures-in-shmem-v4.patch (text/plain; charset=us-ascii)
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index 1a2f04019c..20d637dde3 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -3029,6 +3029,23 @@ CREATE TEXT SEARCH DICTIONARY english_stem (
</sect2>
+ <sect2 id="textsearch-shared-dictionaries">
+ <title>Dictionaries in Shared Memory</title>
+
+ <para>
+ Some dictionaries, especially <application>Ispell</application>, consume a
+ noticeable amount of memory. The size of a dictionary can reach tens of megabytes.
+ Most of them also store their configuration in text files. A dictionary is compiled
+ on first access in each user session.
+ </para>
+
+ <para>
+ To store dictionaries in shared memory, set the <xref linkend="guc-max-shared-dictionaries-size"/>
+ parameter to a value greater than zero before starting the server.
+ </para>
+
+ </sect2>
+
</sect1>
<sect1 id="textsearch-configuration">
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index 0d706795ad..60ef770dbd 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -5,6 +5,15 @@
*
* Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
*
+ * By default all Ispell dictionaries are stored in DSM. But if the total
+ * size of loaded dictionaries reaches the maximum allowed value then a
+ * dictionary is allocated within its own memory context (dictCtx).
+ *
+ * All necessary data are built within dispell_build() function. But
+ * structures for regular expressions are compiled on first demand and
+ * stored using AffixReg array. It is because regex_t and Regis cannot be
+ * stored in shared memory.
+ *
*
* IDENTIFICATION
* src/backend/tsearch/dict_ispell.c
@@ -14,8 +23,10 @@
#include "postgres.h"
#include "commands/defrem.h"
+#include "storage/dsm.h"
#include "tsearch/dicts/spell.h"
#include "tsearch/ts_locale.h"
+#include "tsearch/ts_shared.h"
#include "tsearch/ts_utils.h"
#include "utils/builtins.h"
@@ -26,19 +37,90 @@ typedef struct
IspellDict obj;
} DictISpell;
+static void parse_dictoptions(List *dictoptions,
+ char **dictfile, char **afffile, char **stopfile);
+static void *dispell_build(List *dictoptions, Size *size);
+
Datum
dispell_init(PG_FUNCTION_ARGS)
{
List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ Oid dictid = PG_GETARG_OID(1);
DictISpell *d;
- bool affloaded = false,
- dictloaded = false,
- stoploaded = false;
- ListCell *l;
+ void *dict_location;
+ char *stopfile;
d = (DictISpell *) palloc0(sizeof(DictISpell));
- NIStartBuild(&(d->obj));
+ parse_dictoptions(dictoptions, NULL, NULL, &stopfile);
+
+ if (stopfile)
+ readstoplist(stopfile, &(d->stoplist), lowerstr);
+
+ dict_location = ts_dict_shmem_location(dictid, dictoptions, dispell_build);
+ Assert(dict_location);
+
+ d->obj.dict = (IspellDictData *) dict_location;
+ d->obj.reg = (AffixReg *) palloc0(d->obj.dict->nAffix *
+ sizeof(AffixReg));
+ /* Current memory context is dictionary's private memory context */
+ d->obj.dictCtx = CurrentMemoryContext;
+
+ PG_RETURN_POINTER(d);
+}
+
+Datum
+dispell_lexize(PG_FUNCTION_ARGS)
+{
+ DictISpell *d = (DictISpell *) PG_GETARG_POINTER(0);
+ char *in = (char *) PG_GETARG_POINTER(1);
+ int32 len = PG_GETARG_INT32(2);
+ char *txt;
+ TSLexeme *res;
+ TSLexeme *ptr,
+ *cptr;
+
+ if (len <= 0)
+ PG_RETURN_POINTER(NULL);
+
+ txt = lowerstr_with_len(in, len);
+ res = NINormalizeWord(&(d->obj), txt);
+
+ if (res == NULL)
+ PG_RETURN_POINTER(NULL);
+
+ cptr = res;
+ for (ptr = cptr; ptr->lexeme; ptr++)
+ {
+ if (searchstoplist(&(d->stoplist), ptr->lexeme))
+ {
+ pfree(ptr->lexeme);
+ ptr->lexeme = NULL;
+ }
+ else
+ {
+ if (cptr != ptr)
+ memcpy(cptr, ptr, sizeof(TSLexeme));
+ cptr++;
+ }
+ }
+ cptr->lexeme = NULL;
+
+ PG_RETURN_POINTER(res);
+}
+
+static void
+parse_dictoptions(List *dictoptions, char **dictfile, char **afffile,
+ char **stopfile)
+{
+ ListCell *l;
+
+ if (dictfile)
+ *dictfile = NULL;
+ if (afffile)
+ *afffile = NULL;
+ if (stopfile)
+ *stopfile = NULL;
foreach(l, dictoptions)
{
@@ -46,34 +128,36 @@ dispell_init(PG_FUNCTION_ARGS)
if (pg_strcasecmp(defel->defname, "DictFile") == 0)
{
- if (dictloaded)
+ if (!dictfile)
+ continue;
+
+ if (*dictfile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple DictFile parameters")));
- NIImportDictionary(&(d->obj),
- get_tsearch_config_filename(defGetString(defel),
- "dict"));
- dictloaded = true;
+ *dictfile = get_tsearch_config_filename(defGetString(defel), "dict");
}
else if (pg_strcasecmp(defel->defname, "AffFile") == 0)
{
- if (affloaded)
+ if (!afffile)
+ continue;
+
+ if (*afffile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple AffFile parameters")));
- NIImportAffixes(&(d->obj),
- get_tsearch_config_filename(defGetString(defel),
- "affix"));
- affloaded = true;
+ *afffile = get_tsearch_config_filename(defGetString(defel), "affix");
}
else if (pg_strcasecmp(defel->defname, "StopWords") == 0)
{
- if (stoploaded)
+ if (!stopfile)
+ continue;
+
+ if (*stopfile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple StopWords parameters")));
- readstoplist(defGetString(defel), &(d->stoplist), lowerstr);
- stoploaded = true;
+ *stopfile = defGetString(defel);
}
else
{
@@ -83,66 +167,52 @@ dispell_init(PG_FUNCTION_ARGS)
defel->defname)));
}
}
+}
- if (affloaded && dictloaded)
- {
- NISortDictionary(&(d->obj));
- NISortAffixes(&(d->obj));
- }
- else if (!affloaded)
+/*
+ * Build the dictionary.
+ *
+ * Result is palloc'ed.
+ */
+static void *
+dispell_build(List *dictoptions, Size *size)
+{
+ IspellDictBuild build;
+ char *dictfile,
+ *afffile;
+
+ parse_dictoptions(dictoptions, &dictfile, &afffile, NULL);
+
+ if (!afffile)
{
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("missing AffFile parameter")));
}
- else
+ else if (!dictfile)
{
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("missing DictFile parameter")));
}
- NIFinishBuild(&(d->obj));
+ MemSet(&build, 0, sizeof(build));
+ NIStartBuild(&build);
- PG_RETURN_POINTER(d);
-}
+ /* Read files */
+ NIImportDictionary(&build, dictfile);
+ NIImportAffixes(&build, afffile);
-Datum
-dispell_lexize(PG_FUNCTION_ARGS)
-{
- DictISpell *d = (DictISpell *) PG_GETARG_POINTER(0);
- char *in = (char *) PG_GETARG_POINTER(1);
- int32 len = PG_GETARG_INT32(2);
- char *txt;
- TSLexeme *res;
- TSLexeme *ptr,
- *cptr;
+ /* Build persistent data to use by backends */
+ NISortDictionary(&build);
+ NISortAffixes(&build);
- if (len <= 0)
- PG_RETURN_POINTER(NULL);
+ NICopyData(&build);
- txt = lowerstr_with_len(in, len);
- res = NINormalizeWord(&(d->obj), txt);
-
- if (res == NULL)
- PG_RETURN_POINTER(NULL);
+ /* Release temporary data */
+ NIFinishBuild(&build);
- cptr = res;
- for (ptr = cptr; ptr->lexeme; ptr++)
- {
- if (searchstoplist(&(d->stoplist), ptr->lexeme))
- {
- pfree(ptr->lexeme);
- ptr->lexeme = NULL;
- }
- else
- {
- if (cptr != ptr)
- memcpy(cptr, ptr, sizeof(TSLexeme));
- cptr++;
- }
- }
- cptr->lexeme = NULL;
-
- PG_RETURN_POINTER(res);
+ /* Return the buffer and its size */
+ *size = build.dict_size;
+ return build.dict;
}
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index e071994523..1c560ef56a 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -23,33 +23,35 @@
* Compilation of a dictionary
* ---------------------------
*
- * A compiled dictionary is stored in the IspellDict structure. Compilation of
- * a dictionary is divided into the several steps:
+ * A compiled dictionary is stored in the following structures:
+ * - IspellDictBuild - stores temporary data and IspellDictData
+ * - IspellDictData - stores permanent data used within NINormalizeWord()
+ * Compilation of the dictionary is divided into the several steps:
* - NIImportDictionary() - stores each word of a .dict file in the
* temporary Spell field.
- * - NIImportAffixes() - stores affix rules of an .affix file in the
- * Affix field (not temporary) if an .affix file has the Ispell format.
+ * - NIImportAffixes() - stores affix rules of an .affix file in the temporary
+ * Affix field if an .affix file has the Ispell format.
* -> NIImportOOAffixes() - stores affix rules if an .affix file has the
* Hunspell format. The AffixData field is initialized if AF parameter
* is defined.
* - NISortDictionary() - builds a prefix tree (Trie) from the words list
- * and stores it in the Dictionary field. The words list is got from the
+ * and stores it in the DictNodes field. The words list is got from the
* Spell field. The AffixData field is initialized if AF parameter is not
* defined.
* - NISortAffixes():
* - builds a list of compound affixes from the affix list and stores it
* in the CompoundAffix.
* - builds prefix trees (Trie) from the affix list for prefixes and suffixes
- * and stores them in Suffix and Prefix fields.
+ * and stores them in SuffixNodes and PrefixNodes fields.
* The affix list is got from the Affix field.
+ * Persistent data of the dictionary is copied within NICopyData().
*
* Memory management
* -----------------
*
- * The IspellDict structure has the Spell field which is used only in compile
- * time. The Spell field stores a words list. It can take a lot of memory.
- * Therefore when a dictionary is compiled this field is cleared by
- * NIFinishBuild().
+ * The IspellDictBuild structure has the temporary data which is used only in
+ * compile time. It can take a lot of memory. Therefore after compiling the
+ * dictionary this data is cleared by NIFinishBuild().
*
* All resources which should cleared by NIFinishBuild() is initialized using
* tmpalloc() and tmpalloc0().
@@ -73,112 +75,147 @@
* after the initialization is done. During initialization,
* CurrentMemoryContext is the long-lived memory context associated
* with the dictionary cache entry. We keep the short-lived stuff
- * in the Conf->buildCxt context.
+ * in the ConfBuild->buildCxt context.
*/
-#define tmpalloc(sz) MemoryContextAlloc(Conf->buildCxt, (sz))
-#define tmpalloc0(sz) MemoryContextAllocZero(Conf->buildCxt, (sz))
+#define tmpalloc(sz) MemoryContextAlloc(ConfBuild->buildCxt, (sz))
+#define tmpalloc0(sz) MemoryContextAllocZero(ConfBuild->buildCxt, (sz))
-#define tmpstrdup(str) MemoryContextStrdup(Conf->buildCxt, (str))
+#define tmpstrdup(str) MemoryContextStrdup(ConfBuild->buildCxt, (str))
/*
* Prepare for constructing an ISpell dictionary.
*
- * The IspellDict struct is assumed to be zeroed when allocated.
+ * The IspellDictBuild struct is assumed to be zeroed when allocated.
*/
void
-NIStartBuild(IspellDict *Conf)
+NIStartBuild(IspellDictBuild *ConfBuild)
{
+ uint32 dict_size;
+
/*
* The temp context is a child of CurTransactionContext, so that it will
* go away automatically on error.
*/
- Conf->buildCxt = AllocSetContextCreate(CurTransactionContext,
- "Ispell dictionary init context",
- ALLOCSET_DEFAULT_SIZES);
+ ConfBuild->buildCxt = AllocSetContextCreate(CurTransactionContext,
+ "Ispell dictionary init context",
+ ALLOCSET_DEFAULT_SIZES);
+
+ /*
+ * Allocate buffer for the dictionary in current context not in buildCxt.
+ */
+ dict_size = MAXALIGN(IspellDictDataHdrSize);
+ ConfBuild->dict = palloc0(dict_size);
+ ConfBuild->dict_size = dict_size;
}
/*
- * Clean up when dictionary construction is complete.
+ * Copy compiled and persistent data into IspellDictData.
*/
void
-NIFinishBuild(IspellDict *Conf)
+NICopyData(IspellDictBuild *ConfBuild)
{
- /* Release no-longer-needed temp memory */
- MemoryContextDelete(Conf->buildCxt);
- /* Just for cleanliness, zero the now-dangling pointers */
- Conf->buildCxt = NULL;
- Conf->Spell = NULL;
- Conf->firstfree = NULL;
- Conf->CompoundAffixFlags = NULL;
-}
+ IspellDictData *dict;
+ uint32 size;
+ int i;
+ uint32 *offsets,
+ offset;
+ /*
+ * Calculate necessary space
+ */
+ size = ConfBuild->nAffixData * sizeof(uint32);
+ size += ConfBuild->AffixDataEnd;
-/*
- * "Compact" palloc: allocate without extra palloc overhead.
- *
- * Since we have no need to free the ispell data items individually, there's
- * not much value in the per-chunk overhead normally consumed by palloc.
- * Getting rid of it is helpful since ispell can allocate a lot of small nodes.
- *
- * We currently pre-zero all data allocated this way, even though some of it
- * doesn't need that. The cpalloc and cpalloc0 macros are just documentation
- * to indicate which allocations actually require zeroing.
- */
-#define COMPACT_ALLOC_CHUNK 8192 /* amount to get from palloc at once */
-#define COMPACT_MAX_REQ 1024 /* must be < COMPACT_ALLOC_CHUNK */
+ size += ConfBuild->nAffix * sizeof(uint32);
+ size += ConfBuild->AffixSize;
-static void *
-compact_palloc0(IspellDict *Conf, size_t size)
-{
- void *result;
+ size += ConfBuild->DictNodes.NodesEnd;
+ size += ConfBuild->PrefixNodes.NodesEnd;
+ size += ConfBuild->SuffixNodes.NodesEnd;
- /* Should only be called during init */
- Assert(Conf->buildCxt != NULL);
+ size += sizeof(CMPDAffix) * ConfBuild->nCompoundAffix;
- /* No point in this for large chunks */
- if (size > COMPACT_MAX_REQ)
- return palloc0(size);
+ /*
+ * Copy data itself
+ */
+ ConfBuild->dict_size = IspellDictDataHdrSize + size;
+ ConfBuild->dict = repalloc(ConfBuild->dict, ConfBuild->dict_size);
+
+ dict = ConfBuild->dict;
+
+ /* AffixData */
+ dict->nAffixData = ConfBuild->nAffixData;
+ dict->AffixDataStart = sizeof(uint32) * ConfBuild->nAffixData;
+ memcpy(DictAffixDataOffset(dict), ConfBuild->AffixDataOffset,
+ sizeof(uint32) * ConfBuild->nAffixData);
+ memcpy(DictAffixData(dict), ConfBuild->AffixData, ConfBuild->AffixDataEnd);
+
+ /* Affix array */
+ dict->nAffix = ConfBuild->nAffix;
+ dict->AffixOffsetStart = dict->AffixDataStart + ConfBuild->AffixDataEnd;
+ dict->AffixStart = dict->AffixOffsetStart + sizeof(uint32) * ConfBuild->nAffix;
+ offsets = (uint32 *) DictAffixOffset(dict);
+ offset = 0;
+ for (i = 0; i < ConfBuild->nAffix; i++)
+ {
+ AFFIX *affix;
+ uint32 size = AffixGetSize(ConfBuild->Affix[i]);
- /* Keep everything maxaligned */
- size = MAXALIGN(size);
+ offsets[i] = offset;
+ affix = (AFFIX *) DictAffixGet(dict, i);
+ Assert(affix);
- /* Need more space? */
- if (size > Conf->avail)
- {
- Conf->firstfree = palloc0(COMPACT_ALLOC_CHUNK);
- Conf->avail = COMPACT_ALLOC_CHUNK;
- }
+ memcpy(affix, ConfBuild->Affix[i], size);
- result = (void *) Conf->firstfree;
- Conf->firstfree += size;
- Conf->avail -= size;
+ offset += size;
+ }
- return result;
+ /* DictNodes prefix tree */
+ dict->DictNodesStart = dict->AffixStart + offset;
+ memcpy(DictDictNodes(dict), ConfBuild->DictNodes.Nodes,
+ ConfBuild->DictNodes.NodesEnd);
+
+ /* PrefixNodes prefix tree */
+ dict->PrefixNodesStart = dict->DictNodesStart + ConfBuild->DictNodes.NodesEnd;
+ memcpy(DictPrefixNodes(dict), ConfBuild->PrefixNodes.Nodes,
+ ConfBuild->PrefixNodes.NodesEnd);
+
+ /* SuffixNodes prefix tree */
+ dict->SuffixNodesStart = dict->PrefixNodesStart + ConfBuild->PrefixNodes.NodesEnd;
+ memcpy(DictSuffixNodes(dict), ConfBuild->SuffixNodes.Nodes,
+ ConfBuild->SuffixNodes.NodesEnd);
+
+ /* CompoundAffix array */
+ dict->CompoundAffixStart = dict->SuffixNodesStart +
+ ConfBuild->SuffixNodes.NodesEnd;
+ memcpy(DictCompoundAffix(dict), ConfBuild->CompoundAffix,
+ sizeof(CMPDAffix) * ConfBuild->nCompoundAffix);
}
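
Note for readers of this hunk: the copy step above places every array back to back in one allocation and keeps only start offsets in the header, so the result contains no pointers and can be copied byte for byte into a DSM segment. The following is a minimal stand-alone sketch of that idea, not the patch's code. DemoDict, DemoOffsets, DemoWord, demo_pack and demo_relocate are hypothetical names, the header is reduced to a single string region, and malloc stands in for palloc and DSM allocation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header; the patch's IspellDictData carries more regions. */
typedef struct
{
    uint32_t total_size;    /* size of the whole blob in bytes */
    uint32_t nwords;        /* number of strings */
    uint32_t data_start;    /* offset of the string bytes inside data[] */
    char     data[];        /* offsets array, then the string bytes */
} DemoDict;

#define DemoOffsets(d)  ((uint32_t *) (d)->data)
#define DemoWord(d, i)  ((d)->data + (d)->data_start + DemoOffsets(d)[i])

/* Pack nwords strings into one pointer-free allocation. */
static DemoDict *
demo_pack(const char **words, uint32_t nwords)
{
    uint32_t  bytes = 0;
    uint32_t  pos = 0;
    uint32_t  i;
    size_t    total;
    DemoDict *d;

    for (i = 0; i < nwords; i++)
        bytes += (uint32_t) strlen(words[i]) + 1;

    total = sizeof(DemoDict) + sizeof(uint32_t) * nwords + bytes;
    d = malloc(total);
    assert(d != NULL);
    d->total_size = (uint32_t) total;
    d->nwords = nwords;
    d->data_start = (uint32_t) (sizeof(uint32_t) * nwords);

    for (i = 0; i < nwords; i++)
    {
        DemoOffsets(d)[i] = pos;    /* offset relative to data_start */
        strcpy(DemoWord(d, i), words[i]);
        pos += (uint32_t) strlen(words[i]) + 1;
    }
    return d;
}

/* Copy the blob elsewhere, as a backend might copy it into shared memory. */
static DemoDict *
demo_relocate(const DemoDict *src)
{
    DemoDict *dst = malloc(src->total_size);

    assert(dst != NULL);
    memcpy(dst, src, src->total_size);
    return dst;
}
```

Because DemoWord() computes addresses from the blob's own base, the relocated copy is readable without any fix-up pass, which is the property the offset-based layout is after.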
-#define cpalloc(size) compact_palloc0(Conf, size)
-#define cpalloc0(size) compact_palloc0(Conf, size)
-
-static char *
-cpstrdup(IspellDict *Conf, const char *str)
+/*
+ * Clean up when dictionary construction is complete.
+ */
+void
+NIFinishBuild(IspellDictBuild *ConfBuild)
{
- char *res = cpalloc(strlen(str) + 1);
-
- strcpy(res, str);
- return res;
+ /* Release no-longer-needed temp memory */
+ MemoryContextDelete(ConfBuild->buildCxt);
+ /* Just for cleanliness, zero the now-dangling pointers */
+ ConfBuild->buildCxt = NULL;
+ ConfBuild->Spell = NULL;
+ ConfBuild->CompoundAffixFlags = NULL;
}
-
/*
* Apply lowerstr(), producing a temporary result (in the buildCxt).
*/
static char *
-lowerstr_ctx(IspellDict *Conf, const char *src)
+lowerstr_ctx(IspellDictBuild *ConfBuild, const char *src)
{
MemoryContext saveCtx;
char *dst;
- saveCtx = MemoryContextSwitchTo(Conf->buildCxt);
+ saveCtx = MemoryContextSwitchTo(ConfBuild->buildCxt);
dst = lowerstr(src);
MemoryContextSwitchTo(saveCtx);
@@ -190,7 +227,7 @@ lowerstr_ctx(IspellDict *Conf, const char *src)
#define STRNCMP(s,p) strncmp( (s), (p), strlen(p) )
#define GETWCHAR(W,L,N,T) ( ((const uint8*)(W))[ ((T)==FF_PREFIX) ? (N) : ( (L) - 1 - (N) ) ] )
-#define GETCHAR(A,N,T) GETWCHAR( (A)->repl, (A)->replen, N, T )
+#define GETCHAR(A,N,T) GETWCHAR( AffixFieldRepl(A), (A)->replen, N, T )
static char *VoidString = "";
@@ -311,18 +348,189 @@ strbncmp(const unsigned char *s1, const unsigned char *s2, size_t count)
static int
cmpaffix(const void *s1, const void *s2)
{
- const AFFIX *a1 = (const AFFIX *) s1;
- const AFFIX *a2 = (const AFFIX *) s2;
+ const AFFIX *a1 = *((AFFIX *const *) s1);
+ const AFFIX *a2 = *((AFFIX *const *) s2);
if (a1->type < a2->type)
return -1;
if (a1->type > a2->type)
return 1;
if (a1->type == FF_PREFIX)
- return strcmp(a1->repl, a2->repl);
+ return strcmp(AffixFieldRepl(a1), AffixFieldRepl(a2));
else
- return strbcmp((const unsigned char *) a1->repl,
- (const unsigned char *) a2->repl);
+ return strbcmp((const unsigned char *) AffixFieldRepl(a1),
+ (const unsigned char *) AffixFieldRepl(a2));
+}
+
+/*
+ * Allocate space for AffixData.
+ */
+static void
+InitAffixData(IspellDictBuild *ConfBuild, int numAffixData)
+{
+ uint32 size;
+
+ size = 8 * 1024; /* Reserve 8KB for data */
+
+ ConfBuild->AffixData = (char *) tmpalloc(size);
+ ConfBuild->AffixDataSize = size;
+ ConfBuild->AffixDataOffset = (uint32 *) tmpalloc(numAffixData * sizeof(uint32));
+ ConfBuild->nAffixData = 0;
+ ConfBuild->mAffixData = numAffixData;
+
+ /* Save offset of the end of data */
+ ConfBuild->AffixDataEnd = 0;
+}
+
+/*
+ * Add a set of affix flags to the AffixData buffer of the build structure.
+ * If the buffer cannot hold the new set, enlarge it.
+ *
+ * ConfBuild: building structure for the current dictionary.
+ * AffixSet: set of affix flags (AffixSetLen bytes, not counting the \0).
+ */
+static void
+AddAffixSet(IspellDictBuild *ConfBuild, const char *AffixSet,
+ uint32 AffixSetLen)
+{
+ /*
+ * Check available space for AffixSet.
+ */
+ if (ConfBuild->AffixDataEnd + AffixSetLen + 1 /* \0 */ >=
+ ConfBuild->AffixDataSize)
+ {
+ uint32 newsize = Max(ConfBuild->AffixDataSize + 8 * 1024 /* 8KB */,
+ ConfBuild->AffixDataSize + AffixSetLen + 1);
+
+ ConfBuild->AffixData = (char *) repalloc(ConfBuild->AffixData, newsize);
+ ConfBuild->AffixDataSize = newsize;
+ }
+
+ /* Check available number of offsets */
+ if (ConfBuild->nAffixData >= ConfBuild->mAffixData)
+ {
+ ConfBuild->mAffixData *= 2;
+ ConfBuild->AffixDataOffset = (uint32 *) repalloc(ConfBuild->AffixDataOffset,
+ sizeof(uint32) * ConfBuild->mAffixData);
+ }
+
+ ConfBuild->AffixDataOffset[ConfBuild->nAffixData] = ConfBuild->AffixDataEnd;
+ StrNCpy(AffixDataGet(ConfBuild, ConfBuild->nAffixData),
+ AffixSet, AffixSetLen + 1);
+
+ /* Save offset of the end of data */
+ ConfBuild->AffixDataEnd += AffixSetLen + 1;
+ ConfBuild->nAffixData++;
+}
+
+/*
+ * Allocate space for prefix tree node.
+ *
+ * ConfBuild: building structure for the current dictionary.
+ * array: NodeArray where to allocate new node.
+ * length: number of allocated NodeData.
+ * sizeNodeData: minimum size of each NodeData.
+ * sizeNodeHeader: size of header of new node.
+ *
+ * Returns an offset of new node in NodeArray->Nodes.
+ */
+static uint32
+AllocateNode(IspellDictBuild *ConfBuild, NodeArray *array, uint32 length,
+ uint32 sizeNodeData, uint32 sizeNodeHeader)
+{
+ uint32 node_offset;
+ uint32 size;
+
+ size = sizeNodeHeader + length * sizeNodeData;
+ size = MAXALIGN(size);
+
+ if (array->NodesSize == 0)
+ {
+ array->NodesSize = size * 32; /* Reserve space for next levels of the
+ * prefix tree */
+ array->Nodes = (char *) tmpalloc(array->NodesSize);
+ array->NodesEnd = 0;
+ }
+ else if (array->NodesEnd + size >= array->NodesSize)
+ {
+ array->NodesSize = Max(array->NodesSize * 2, array->NodesSize + size);
+ array->Nodes = (char *) repalloc(array->Nodes, array->NodesSize);
+ }
+
+ node_offset = array->NodesEnd;
+ array->NodesEnd += size;
+
+ return node_offset;
+}
+
+/*
+ * Allocate space for SPNode.
+ *
+ * Returns the offset of the new node in ConfBuild->DictNodes.Nodes.
+ */
+static uint32
+AllocateSPNode(IspellDictBuild *ConfBuild, uint32 length)
+{
+ uint32 offset;
+ SPNode *node;
+ SPNodeData *data;
+ uint32 i;
+
+ offset = AllocateNode(ConfBuild, &ConfBuild->DictNodes, length,
+ sizeof(SPNodeData), SPNHDRSZ);
+ node = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, offset);
+ node->length = length;
+
+ /*
+ * Initialize all SPNodeData with default values. We cannot use memset()
+ * here because not all fields have 0 as default value.
+ */
+ for (i = 0; i < length; i++)
+ {
+ data = &(node->data[i]);
+ data->val = 0;
+ data->affix = ISPELL_INVALID_INDEX;
+ data->compoundflag = 0;
+ data->isword = 0;
+ data->node_offset = ISPELL_INVALID_OFFSET;
+ }
+
+ return offset;
+}
+
+/*
+ * Allocate space for AffixNode.
+ *
+ * Returns an offset of new node in NodeArray->Nodes.
+ */
+static uint32
+AllocateAffixNode(IspellDictBuild *ConfBuild, NodeArray *array, uint32 length)
+{
+ uint32 offset;
+ AffixNode *node;
+ AffixNodeData *data;
+ uint32 i;
+
+ offset = AllocateNode(ConfBuild, array, length, sizeof(AffixNodeData),
+ ANHRDSZ);
+ node = (AffixNode *) NodeArrayGet(array, offset);
+ node->length = length;
+ node->isvoid = 0;
+
+ /*
+ * Initialize all AffixNodeData with default values. We cannot use memset()
+ * here because not all fields have 0 as default value.
+ */
+ for (i = 0; i < length; i++)
+ {
+ data = &(node->data[i]);
+ data->val = 0;
+ data->affstart = ISPELL_INVALID_INDEX;
+ data->affend = ISPELL_INVALID_INDEX;
+ data->node_offset = ISPELL_INVALID_OFFSET;
+ }
+
+ return offset;
}
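
Note for readers of this hunk: AllocateNode() hands back an offset into the Nodes buffer rather than a pointer. An offset stays valid when repalloc() moves the buffer as it grows, and it stays valid when the finished array is later mapped at a different address. The sketch below shows the same pattern in isolation. It is an illustration under assumptions, not the patch's code: DemoArena, DEMO_ALIGN, demo_alloc, demo_get and demo_roundtrip are hypothetical names, realloc replaces repalloc, and 8-byte alignment is assumed in place of MAXALIGN.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the patch's NodeArray. */
typedef struct
{
    char    *nodes;     /* backing buffer; may move when it grows */
    uint32_t size;      /* allocated bytes */
    uint32_t end;       /* bytes in use, also the offset of the next node */
} DemoArena;

#define DEMO_ALIGN(n) (((n) + 7u) & ~7u)    /* assumed 8-byte alignment */

/* Reserve size bytes and return their offset from the start of the buffer. */
static uint32_t
demo_alloc(DemoArena *a, uint32_t size)
{
    uint32_t offset;

    size = DEMO_ALIGN(size);
    if (a->end + size > a->size)
    {
        uint32_t newsize = a->size ? a->size * 2 : 64;
        char    *grown;

        while (newsize < a->end + size)
            newsize *= 2;
        grown = realloc(a->nodes, newsize);
        assert(grown != NULL);
        a->nodes = grown;
        a->size = newsize;
    }
    offset = a->end;
    a->end += size;
    return offset;
}

/* Turn an offset into a pointer; valid only until the next demo_alloc(). */
static void *
demo_get(DemoArena *a, uint32_t offset)
{
    return a->nodes + offset;
}

/* Write through one offset, force the buffer to grow, then read it back. */
static int
demo_roundtrip(void)
{
    DemoArena a = {NULL, 0, 0};
    uint32_t  first = demo_alloc(&a, sizeof(int));
    uint32_t  second;
    int       value;

    *(int *) demo_get(&a, first) = 42;
    second = demo_alloc(&a, 1000);      /* buffer is reallocated here */
    memset(demo_get(&a, second), 0, 1000);

    value = *(int *) demo_get(&a, first);
    free(a.nodes);
    return value;
}
```

The caller resolves an offset with demo_get() only at the moment of use and never keeps the resulting pointer across another allocation, which mirrors how the patch re-fetches nodes through NodeArrayGet().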
/*
@@ -333,7 +541,7 @@ cmpaffix(const void *s1, const void *s2)
* - 2 characters (FM_LONG). A character may be Unicode.
* - numbers from 1 to 65000 (FM_NUM).
*
- * Depending on the flagMode an affix string can have the following format:
+ * Depending on the flagmode an affix string can have the following format:
* - FM_CHAR: ABCD
* Here we have 4 flags: A, B, C and D
* - FM_LONG: ABCDE*
@@ -341,13 +549,13 @@ cmpaffix(const void *s1, const void *s2)
* - FM_NUM: 200,205,50
* Here we have 3 flags: 200, 205 and 50
*
- * Conf: current dictionary.
+ * flagmode: flag mode of the dictionary
* sflagset: the set of affix flags. Returns a reference to the start of a next
* affix flag.
* sflag: returns an affix flag from sflagset.
*/
static void
-getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
+getNextFlagFromString(FlagMode flagmode, char **sflagset, char *sflag)
{
int32 s;
char *next,
@@ -356,11 +564,11 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
bool stop = false;
bool met_comma = false;
- maxstep = (Conf->flagMode == FM_LONG) ? 2 : 1;
+ maxstep = (flagmode == FM_LONG) ? 2 : 1;
while (**sflagset)
{
- switch (Conf->flagMode)
+ switch (flagmode)
{
case FM_LONG:
case FM_CHAR:
@@ -422,15 +630,15 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
stop = true;
break;
default:
- elog(ERROR, "unrecognized type of Conf->flagMode: %d",
- Conf->flagMode);
+ elog(ERROR, "unrecognized type of flagmode: %d",
+ flagmode);
}
if (stop)
break;
}
- if (Conf->flagMode == FM_LONG && maxstep > 0)
+ if (flagmode == FM_LONG && maxstep > 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix flag \"%s\" with \"long\" flag value",
@@ -440,31 +648,28 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
}
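
Note for readers of this hunk: the three flag formats listed in the comment before getNextFlagFromString() (FM_CHAR, FM_LONG, FM_NUM) can be illustrated with a much simplified splitter. This is only a sketch under stated assumptions: it handles single-byte characters only, it omits the validation and error reporting that the real function performs, and every demo_* and DEMO_* name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { DEMO_FM_CHAR, DEMO_FM_LONG, DEMO_FM_NUM } DemoFlagMode;

/*
 * Copy the next flag from *set into out and advance *set past it.
 * Returns 1 if a flag was produced, 0 at the end of the string.
 */
static int
demo_next_flag(DemoFlagMode mode, const char **set, char *out, size_t outlen)
{
    const char *s = *set;
    size_t      n = 0;

    assert(outlen > 0);
    if (mode == DEMO_FM_NUM)
    {
        /* Numbers are separated by commas: "200,205,50". */
        while (*s == ',')
            s++;
        while (*s && *s != ',' && n + 1 < outlen)
            out[n++] = *s++;
        if (*s == ',')
            s++;
    }
    else
    {
        /* One character per flag, or two in "long" mode. */
        size_t want = (mode == DEMO_FM_LONG) ? 2 : 1;

        while (*s && n < want && n + 1 < outlen)
            out[n++] = *s++;
    }
    out[n] = '\0';
    *set = s;
    return n > 0;
}

/* Simplified counterpart of IsAffixFlagInUse(): scan the set for flag. */
static int
demo_flag_in_use(DemoFlagMode mode, const char *set, const char *flag)
{
    char cur[16];

    if (*flag == '\0')
        return 1;               /* an empty flag always matches */
    while (demo_next_flag(mode, &set, cur, sizeof(cur)))
        if (strcmp(cur, flag) == 0)
            return 1;
    return 0;
}
```

Passing the flag mode as a plain argument, as the patch now does, means a helper like this needs neither the build structure nor the shared dictionary data.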
/*
- * Checks if the affix set Conf->AffixData[affix] contains affixflag.
- * Conf->AffixData[affix] does not contain affixflag if this flag is not used
- * actually by the .dict file.
+ * Checks if the affix set from AffixData contains affixflag. The affix set
+ * does not contain affixflag if the .dict file never actually uses this flag.
*
- * Conf: current dictionary.
- * affix: index of the Conf->AffixData array.
+ * flagmode: flag mode of the dictionary.
+ * sflagset: the set of affix flags.
* affixflag: the affix flag.
*
- * Returns true if the string Conf->AffixData[affix] contains affixflag,
- * otherwise returns false.
+ * Returns true if the affix set string contains affixflag, otherwise returns
+ * false.
*/
static bool
-IsAffixFlagInUse(IspellDict *Conf, int affix, const char *affixflag)
+IsAffixFlagInUse(FlagMode flagmode, char *sflagset, const char *affixflag)
{
- char *flagcur;
+ char *flagcur = sflagset;
char flag[BUFSIZ];
if (*affixflag == 0)
return true;
- flagcur = Conf->AffixData[affix];
-
while (*flagcur)
{
- getNextFlagFromString(Conf, &flagcur, flag);
+ getNextFlagFromString(flagmode, &flagcur, flag);
/* Compare first affix flag in flagcur with affixflag */
if (strcmp(flag, affixflag) == 0)
return true;
@@ -477,31 +682,33 @@ IsAffixFlagInUse(IspellDict *Conf, int affix, const char *affixflag)
/*
* Adds the new word into the temporary array Spell.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* word: new word.
* flag: set of affix flags. Single flag can be get by getNextFlagFromString().
*/
static void
-NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
+NIAddSpell(IspellDictBuild *ConfBuild, const char *word, const char *flag)
{
- if (Conf->nspell >= Conf->mspell)
+ if (ConfBuild->nSpell >= ConfBuild->mSpell)
{
- if (Conf->mspell)
+ if (ConfBuild->mSpell)
{
- Conf->mspell *= 2;
- Conf->Spell = (SPELL **) repalloc(Conf->Spell, Conf->mspell * sizeof(SPELL *));
+ ConfBuild->mSpell *= 2;
+ ConfBuild->Spell = (SPELL **) repalloc(ConfBuild->Spell,
+ ConfBuild->mSpell * sizeof(SPELL *));
}
else
{
- Conf->mspell = 1024 * 20;
- Conf->Spell = (SPELL **) tmpalloc(Conf->mspell * sizeof(SPELL *));
+ ConfBuild->mSpell = 1024 * 20;
+ ConfBuild->Spell = (SPELL **) tmpalloc(ConfBuild->mSpell * sizeof(SPELL *));
}
}
- Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
- strcpy(Conf->Spell[Conf->nspell]->word, word);
- Conf->Spell[Conf->nspell]->p.flag = (*flag != '\0')
+ ConfBuild->Spell[ConfBuild->nSpell] =
+ (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
+ strcpy(ConfBuild->Spell[ConfBuild->nSpell]->word, word);
+ ConfBuild->Spell[ConfBuild->nSpell]->p.flag = (*flag != '\0')
? tmpstrdup(flag) : VoidString;
- Conf->nspell++;
+ ConfBuild->nSpell++;
}
/*
@@ -509,11 +716,11 @@ NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
*
* Note caller must already have applied get_tsearch_config_filename.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* filename: path to the .dict file.
*/
void
-NIImportDictionary(IspellDict *Conf, const char *filename)
+NIImportDictionary(IspellDictBuild *ConfBuild, const char *filename)
{
tsearch_readline_state trst;
char *line;
@@ -564,9 +771,9 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
}
s += pg_mblen(s);
}
- pstr = lowerstr_ctx(Conf, line);
+ pstr = lowerstr_ctx(ConfBuild, line);
- NIAddSpell(Conf, pstr, flag);
+ NIAddSpell(ConfBuild, pstr, flag);
pfree(pstr);
pfree(line);
@@ -590,7 +797,7 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
* SFX M 0 's .
* is presented here.
*
- * Conf: current dictionary.
+ * dict: current dictionary.
* word: basic form of word.
* affixflag: affix flag, by which a basic form of word was generated.
* flag: compound flag used to compare with StopMiddle->compoundflag.
@@ -598,9 +805,9 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
* Returns 1 if the word was found in the prefix tree, else returns 0.
*/
static int
-FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
+FindWord(IspellDictData *dict, const char *word, const char *affixflag, int flag)
{
- SPNode *node = Conf->Dictionary;
+ SPNode *node = (SPNode *) DictDictNodes(dict);
SPNodeData *StopLow,
*StopHigh,
*StopMiddle;
@@ -636,10 +843,14 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
* Check if this affix rule is presented in the affix set
* with index StopMiddle->affix.
*/
- if (IsAffixFlagInUse(Conf, StopMiddle->affix, affixflag))
+ if (IsAffixFlagInUse(dict->flagMode,
+ DictAffixDataGet(dict, StopMiddle->affix),
+ affixflag))
return 1;
}
- node = StopMiddle->node;
+ /* Retrieve SPNode by the offset */
+ node = (SPNode *) DictNodeGet(DictDictNodes(dict),
+ StopMiddle->node_offset);
ptr++;
break;
}
@@ -657,7 +868,8 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
/*
* Adds a new affix rule to the Affix field.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary; it is used to
+ * allocate temporary data.
* flag: affix flag ('\' in the below example).
* flagflags: set of flags from the flagval field for this affix rule. This set
* is listed after '/' character in the added string (repl).
@@ -673,26 +885,54 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
* type: FF_SUFFIX or FF_PREFIX.
*/
static void
-NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
- const char *find, const char *repl, int type)
+NIAddAffix(IspellDictBuild *ConfBuild, const char *flag, char flagflags,
+ const char *mask, const char *find, const char *repl, int type)
{
AFFIX *Affix;
+ uint32 size;
+ uint32 flaglen = strlen(flag),
+ findlen = strlen(find),
+ repllen = strlen(repl),
+ masklen = strlen(mask);
+
+ /* Sanity checks */
+ if (flaglen > AF_FLAG_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix flag \"%s\" too long", flag)));
+ if (findlen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix find field \"%s\" too long", find)));
+ if (repllen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix repl field \"%s\" too long", repl)));
+ if (masklen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix mask field \"%s\" too long", mask)));
- if (Conf->naffixes >= Conf->maffixes)
+ if (ConfBuild->nAffix >= ConfBuild->mAffix)
{
- if (Conf->maffixes)
+ if (ConfBuild->mAffix)
{
- Conf->maffixes *= 2;
- Conf->Affix = (AFFIX *) repalloc((void *) Conf->Affix, Conf->maffixes * sizeof(AFFIX));
+ ConfBuild->mAffix *= 2;
+ ConfBuild->Affix = (AFFIX **) repalloc(ConfBuild->Affix,
+ ConfBuild->mAffix * sizeof(AFFIX *));
}
else
{
- Conf->maffixes = 16;
- Conf->Affix = (AFFIX *) palloc(Conf->maffixes * sizeof(AFFIX));
+ ConfBuild->mAffix = 255;
+ ConfBuild->Affix = (AFFIX **) tmpalloc(ConfBuild->mAffix * sizeof(AFFIX *));
}
}
- Affix = Conf->Affix + Conf->naffixes;
+ size = AFFIXHDRSZ + flaglen + 1 /* \0 */ + findlen + 1 /* \0 */ +
+ repllen + 1 /* \0 */ + masklen + 1 /* \0 */;
+
+ Affix = (AFFIX *) tmpalloc(size);
+ ConfBuild->Affix[ConfBuild->nAffix] = Affix;
/* This affix rule can be applied for words with any ending */
if (strcmp(mask, ".") == 0 || *mask == '\0')
@@ -705,42 +945,12 @@ NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
{
Affix->issimple = 0;
Affix->isregis = 1;
- RS_compile(&(Affix->reg.regis), (type == FF_SUFFIX),
- *mask ? mask : VoidString);
}
/* This affix rule will use regex_t to search word ending */
else
{
- int masklen;
- int wmasklen;
- int err;
- pg_wchar *wmask;
- char *tmask;
-
Affix->issimple = 0;
Affix->isregis = 0;
- tmask = (char *) tmpalloc(strlen(mask) + 3);
- if (type == FF_SUFFIX)
- sprintf(tmask, "%s$", mask);
- else
- sprintf(tmask, "^%s", mask);
-
- masklen = strlen(tmask);
- wmask = (pg_wchar *) tmpalloc((masklen + 1) * sizeof(pg_wchar));
- wmasklen = pg_mb2wchar_with_len(tmask, wmask, masklen);
-
- err = pg_regcomp(&(Affix->reg.regex), wmask, wmasklen,
- REG_ADVANCED | REG_NOSUB,
- DEFAULT_COLLATION_OID);
- if (err)
- {
- char errstr[100];
-
- pg_regerror(err, &(Affix->reg.regex), errstr, sizeof(errstr));
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
- errmsg("invalid regular expression: %s", errstr)));
- }
}
Affix->flagflags = flagflags;
@@ -749,15 +959,22 @@ NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
if ((Affix->flagflags & FF_COMPOUNDFLAG) == 0)
Affix->flagflags |= FF_COMPOUNDFLAG;
}
- Affix->flag = cpstrdup(Conf, flag);
+
Affix->type = type;
- Affix->find = (find && *find) ? cpstrdup(Conf, find) : VoidString;
- if ((Affix->replen = strlen(repl)) > 0)
- Affix->repl = cpstrdup(Conf, repl);
- else
- Affix->repl = VoidString;
- Conf->naffixes++;
+ Affix->replen = repllen;
+ StrNCpy(AffixFieldRepl(Affix), repl, repllen + 1);
+
+ Affix->findlen = findlen;
+ StrNCpy(AffixFieldFind(Affix), find, findlen + 1);
+
+ Affix->masklen = masklen;
+ StrNCpy(AffixFieldMask(Affix), mask, masklen + 1);
+
+ StrNCpy(AffixFieldFlag(Affix), flag, flaglen + 1);
+
+ ConfBuild->nAffix++;
+ ConfBuild->AffixSize += size;
}
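
Note for readers of this hunk: NIAddAffix() now allocates each AFFIX as one variable-length chunk, AFFIXHDRSZ plus the NUL-terminated flag, find, repl and mask strings, and reaches the strings through the AffixField*() macros. The real field order is not visible in this hunk, so the sketch below assumes one (repl, then find, then flag) and leaves mask out for brevity. DemoAffix, DemoRepl, DemoFind, DemoFlag, demo_affix_new and demo_affix_size are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout: fixed header, then "repl\0find\0flag\0". */
typedef struct
{
    uint16_t replen;    /* strlen(repl) */
    uint16_t findlen;   /* strlen(find) */
    char     fields[];  /* packed strings */
} DemoAffix;

/* Each accessor skips the strings stored before it, using the lengths. */
#define DemoRepl(a) ((a)->fields)
#define DemoFind(a) (DemoRepl(a) + (a)->replen + 1)
#define DemoFlag(a) (DemoFind(a) + (a)->findlen + 1)

/* Build one self-contained affix record; returns NULL if a field is too long. */
static DemoAffix *
demo_affix_new(const char *flag, const char *find, const char *repl)
{
    size_t     replen = strlen(repl);
    size_t     findlen = strlen(find);
    size_t     flaglen = strlen(flag);
    DemoAffix *a;

    if (replen > UINT16_MAX || findlen > UINT16_MAX)
        return NULL;

    a = malloc(sizeof(DemoAffix) + replen + 1 + findlen + 1 + flaglen + 1);
    assert(a != NULL);
    a->replen = (uint16_t) replen;
    a->findlen = (uint16_t) findlen;
    memcpy(DemoRepl(a), repl, replen + 1);
    memcpy(DemoFind(a), find, findlen + 1);
    memcpy(DemoFlag(a), flag, flaglen + 1);
    return a;
}

/* Total size of the record, which is what a flat copy needs to know. */
static size_t
demo_affix_size(const DemoAffix *a)
{
    return sizeof(DemoAffix) + a->replen + 1 + a->findlen + 1 +
        strlen(DemoFlag(a)) + 1;
}
```

Since the record holds no pointers, an array of such records can be written into the shared dictionary with memcpy(), with a separate offsets array locating each entry.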
/* Parsing states for parse_affentry() and friends */
@@ -1021,10 +1238,10 @@ parse_affentry(char *str, char *mask, char *find, char *repl)
* Sets a Hunspell options depending on flag type.
*/
static void
-setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
+setCompoundAffixFlagValue(IspellDictBuild *ConfBuild, CompoundAffixFlag *entry,
char *s, uint32 val)
{
- if (Conf->flagMode == FM_NUM)
+ if (ConfBuild->dict->flagMode == FM_NUM)
{
char *next;
int i;
@@ -1044,19 +1261,19 @@ setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
else
entry->flag.s = tmpstrdup(s);
- entry->flagMode = Conf->flagMode;
+ entry->flagMode = ConfBuild->dict->flagMode;
entry->value = val;
}
/*
* Sets up a correspondence for the affix parameter with the affix flag.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* s: affix flag in string.
* val: affix parameter.
*/
static void
-addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
+addCompoundAffixFlagValue(IspellDictBuild *ConfBuild, char *s, uint32 val)
{
CompoundAffixFlag *newValue;
char sbuf[BUFSIZ];
@@ -1083,29 +1300,29 @@ addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
*sflag = '\0';
/* Resize array or allocate memory for array CompoundAffixFlag */
- if (Conf->nCompoundAffixFlag >= Conf->mCompoundAffixFlag)
+ if (ConfBuild->nCompoundAffixFlag >= ConfBuild->mCompoundAffixFlag)
{
- if (Conf->mCompoundAffixFlag)
+ if (ConfBuild->mCompoundAffixFlag)
{
- Conf->mCompoundAffixFlag *= 2;
- Conf->CompoundAffixFlags = (CompoundAffixFlag *)
- repalloc((void *) Conf->CompoundAffixFlags,
- Conf->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
+ ConfBuild->mCompoundAffixFlag *= 2;
+ ConfBuild->CompoundAffixFlags = (CompoundAffixFlag *)
+ repalloc((void *) ConfBuild->CompoundAffixFlags,
+ ConfBuild->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
}
else
{
- Conf->mCompoundAffixFlag = 10;
- Conf->CompoundAffixFlags = (CompoundAffixFlag *)
- tmpalloc(Conf->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
+ ConfBuild->mCompoundAffixFlag = 10;
+ ConfBuild->CompoundAffixFlags = (CompoundAffixFlag *)
+ tmpalloc(ConfBuild->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
}
}
- newValue = Conf->CompoundAffixFlags + Conf->nCompoundAffixFlag;
+ newValue = ConfBuild->CompoundAffixFlags + ConfBuild->nCompoundAffixFlag;
- setCompoundAffixFlagValue(Conf, newValue, sbuf, val);
+ setCompoundAffixFlagValue(ConfBuild, newValue, sbuf, val);
- Conf->usecompound = true;
- Conf->nCompoundAffixFlag++;
+ ConfBuild->dict->usecompound = true;
+ ConfBuild->nCompoundAffixFlag++;
}
/*
@@ -1113,7 +1330,7 @@ addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
* flags s.
*/
static int
-getCompoundAffixFlagValue(IspellDict *Conf, char *s)
+getCompoundAffixFlagValue(IspellDictBuild *ConfBuild, char *s)
{
uint32 flag = 0;
CompoundAffixFlag *found,
@@ -1121,18 +1338,18 @@ getCompoundAffixFlagValue(IspellDict *Conf, char *s)
char sflag[BUFSIZ];
char *flagcur;
- if (Conf->nCompoundAffixFlag == 0)
+ if (ConfBuild->nCompoundAffixFlag == 0)
return 0;
flagcur = s;
while (*flagcur)
{
- getNextFlagFromString(Conf, &flagcur, sflag);
- setCompoundAffixFlagValue(Conf, &key, sflag, 0);
+ getNextFlagFromString(ConfBuild->dict->flagMode, &flagcur, sflag);
+ setCompoundAffixFlagValue(ConfBuild, &key, sflag, 0);
found = (CompoundAffixFlag *)
- bsearch(&key, (void *) Conf->CompoundAffixFlags,
- Conf->nCompoundAffixFlag, sizeof(CompoundAffixFlag),
+ bsearch(&key, (void *) ConfBuild->CompoundAffixFlags,
+ ConfBuild->nCompoundAffixFlag, sizeof(CompoundAffixFlag),
cmpcmdflag);
if (found != NULL)
flag |= found->value;
@@ -1144,14 +1361,13 @@ getCompoundAffixFlagValue(IspellDict *Conf, char *s)
/*
* Returns a flag set using the s parameter.
*
- * If Conf->useFlagAliases is true then the s parameter is index of the
- * Conf->AffixData array and function returns its entry.
- * Else function returns the s parameter.
+ * If useFlagAliases is true then the s parameter is an index into the AffixData
+ * array and the function returns that entry. Otherwise it returns s itself.
*/
static char *
-getAffixFlagSet(IspellDict *Conf, char *s)
+getAffixFlagSet(IspellDictBuild *ConfBuild, char *s)
{
- if (Conf->useFlagAliases && *s != '\0')
+ if (ConfBuild->dict->useFlagAliases && *s != '\0')
{
int curaffix;
char *end;
@@ -1162,13 +1378,13 @@ getAffixFlagSet(IspellDict *Conf, char *s)
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix alias \"%s\"", s)));
- if (curaffix > 0 && curaffix <= Conf->nAffixData)
+ if (curaffix > 0 && curaffix <= ConfBuild->nAffixData)
/*
* Do not subtract 1 from curaffix because empty string was added
* in NIImportOOAffixes
*/
- return Conf->AffixData[curaffix];
+ return AffixDataGet(ConfBuild, curaffix);
else
return VoidString;
}
@@ -1179,11 +1395,11 @@ getAffixFlagSet(IspellDict *Conf, char *s)
/*
* Import an affix file that follows MySpell or Hunspell format.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* filename: path to the .affix file.
*/
static void
-NIImportOOAffixes(IspellDict *Conf, const char *filename)
+NIImportOOAffixes(IspellDictBuild *ConfBuild, const char *filename)
{
char type[BUFSIZ],
*ptype = NULL;
@@ -1195,17 +1411,16 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
char repl[BUFSIZ],
*prepl;
bool isSuffix = false;
- int naffix = 0,
- curaffix = 0;
+ int naffix = 0;
int sflaglen = 0;
char flagflags = 0;
tsearch_readline_state trst;
char *recoded;
/* read file to find any flag */
- Conf->usecompound = false;
- Conf->useFlagAliases = false;
- Conf->flagMode = FM_CHAR;
+ ConfBuild->dict->usecompound = false;
+ ConfBuild->dict->useFlagAliases = false;
+ ConfBuild->dict->flagMode = FM_CHAR;
if (!tsearch_readline_begin(&trst, filename))
ereport(ERROR,
@@ -1222,30 +1437,36 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
}
if (STRNCMP(recoded, "COMPOUNDFLAG") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDFLAG"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDFLAG"),
FF_COMPOUNDFLAG);
else if (STRNCMP(recoded, "COMPOUNDBEGIN") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDBEGIN"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDBEGIN"),
FF_COMPOUNDBEGIN);
else if (STRNCMP(recoded, "COMPOUNDLAST") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDLAST"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDLAST"),
FF_COMPOUNDLAST);
/* COMPOUNDLAST and COMPOUNDEND are synonyms */
else if (STRNCMP(recoded, "COMPOUNDEND") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDEND"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDEND"),
FF_COMPOUNDLAST);
else if (STRNCMP(recoded, "COMPOUNDMIDDLE") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDMIDDLE"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDMIDDLE"),
FF_COMPOUNDMIDDLE);
else if (STRNCMP(recoded, "ONLYINCOMPOUND") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("ONLYINCOMPOUND"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("ONLYINCOMPOUND"),
FF_COMPOUNDONLY);
else if (STRNCMP(recoded, "COMPOUNDPERMITFLAG") == 0)
- addCompoundAffixFlagValue(Conf,
+ addCompoundAffixFlagValue(ConfBuild,
recoded + strlen("COMPOUNDPERMITFLAG"),
FF_COMPOUNDPERMITFLAG);
else if (STRNCMP(recoded, "COMPOUNDFORBIDFLAG") == 0)
- addCompoundAffixFlagValue(Conf,
+ addCompoundAffixFlagValue(ConfBuild,
recoded + strlen("COMPOUNDFORBIDFLAG"),
FF_COMPOUNDFORBIDFLAG);
else if (STRNCMP(recoded, "FLAG") == 0)
@@ -1258,9 +1479,9 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
if (*s)
{
if (STRNCMP(s, "long") == 0)
- Conf->flagMode = FM_LONG;
+ ConfBuild->dict->flagMode = FM_LONG;
else if (STRNCMP(s, "num") == 0)
- Conf->flagMode = FM_NUM;
+ ConfBuild->dict->flagMode = FM_NUM;
else if (STRNCMP(s, "default") != 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
@@ -1274,8 +1495,8 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
}
tsearch_readline_end(&trst);
- if (Conf->nCompoundAffixFlag > 1)
- qsort((void *) Conf->CompoundAffixFlags, Conf->nCompoundAffixFlag,
+ if (ConfBuild->nCompoundAffixFlag > 1)
+ qsort((void *) ConfBuild->CompoundAffixFlags, ConfBuild->nCompoundAffixFlag,
sizeof(CompoundAffixFlag), cmpcmdflag);
if (!tsearch_readline_begin(&trst, filename))
@@ -1295,15 +1516,15 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
if (ptype)
pfree(ptype);
- ptype = lowerstr_ctx(Conf, type);
+ ptype = lowerstr_ctx(ConfBuild, type);
/* First try to parse AF parameter (alias compression) */
if (STRNCMP(ptype, "af") == 0)
{
/* First line is the number of aliases */
- if (!Conf->useFlagAliases)
+ if (!ConfBuild->dict->useFlagAliases)
{
- Conf->useFlagAliases = true;
+ ConfBuild->dict->useFlagAliases = true;
naffix = atoi(sflag);
if (naffix == 0)
ereport(ERROR,
@@ -1313,21 +1534,15 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
/* Also reserve place for empty flag set */
naffix++;
- Conf->AffixData = (char **) palloc0(naffix * sizeof(char *));
- Conf->lenAffixData = Conf->nAffixData = naffix;
+ InitAffixData(ConfBuild, naffix);
/* Add empty flag set into AffixData */
- Conf->AffixData[curaffix] = VoidString;
- curaffix++;
+ AddAffixSet(ConfBuild, VoidString, 0);
}
/* Other lines is aliases */
else
{
- if (curaffix < naffix)
- {
- Conf->AffixData[curaffix] = cpstrdup(Conf, sflag);
- curaffix++;
- }
+ AddAffixSet(ConfBuild, sflag, strlen(sflag));
}
goto nextline;
}
@@ -1338,8 +1553,8 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
sflaglen = strlen(sflag);
if (sflaglen == 0
- || (sflaglen > 1 && Conf->flagMode == FM_CHAR)
- || (sflaglen > 2 && Conf->flagMode == FM_LONG))
+ || (sflaglen > 1 && ConfBuild->dict->flagMode == FM_CHAR)
+ || (sflaglen > 2 && ConfBuild->dict->flagMode == FM_LONG))
goto nextline;
/*--------
@@ -1367,21 +1582,21 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
/* Get flags after '/' (flags are case sensitive) */
if ((ptr = strchr(repl, '/')) != NULL)
- aflg |= getCompoundAffixFlagValue(Conf,
- getAffixFlagSet(Conf,
+ aflg |= getCompoundAffixFlagValue(ConfBuild,
+ getAffixFlagSet(ConfBuild,
ptr + 1));
/* Get lowercased version of string before '/' */
- prepl = lowerstr_ctx(Conf, repl);
+ prepl = lowerstr_ctx(ConfBuild, repl);
if ((ptr = strchr(prepl, '/')) != NULL)
*ptr = '\0';
- pfind = lowerstr_ctx(Conf, find);
- pmask = lowerstr_ctx(Conf, mask);
+ pfind = lowerstr_ctx(ConfBuild, find);
+ pmask = lowerstr_ctx(ConfBuild, mask);
if (t_iseq(find, '0'))
*pfind = '\0';
if (t_iseq(repl, '0'))
*prepl = '\0';
- NIAddAffix(Conf, sflag, flagflags | aflg, pmask, pfind, prepl,
+ NIAddAffix(ConfBuild, sflag, flagflags | aflg, pmask, pfind, prepl,
isSuffix ? FF_SUFFIX : FF_PREFIX);
pfree(prepl);
pfree(pfind);
@@ -1407,7 +1622,7 @@ nextline:
* work to NIImportOOAffixes(), which will re-read the whole file.
*/
void
-NIImportAffixes(IspellDict *Conf, const char *filename)
+NIImportAffixes(IspellDictBuild *ConfBuild, const char *filename)
{
char *pstr = NULL;
char flag[BUFSIZ];
@@ -1428,9 +1643,9 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
errmsg("could not open affix file \"%s\": %m",
filename)));
- Conf->usecompound = false;
- Conf->useFlagAliases = false;
- Conf->flagMode = FM_CHAR;
+ ConfBuild->dict->usecompound = false;
+ ConfBuild->dict->useFlagAliases = false;
+ ConfBuild->dict->flagMode = FM_CHAR;
while ((recoded = tsearch_readline(&trst)) != NULL)
{
@@ -1452,10 +1667,8 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
s += pg_mblen(s);
if (*s && pg_mblen(s) == 1)
- {
- addCompoundAffixFlagValue(Conf, s, FF_COMPOUNDFLAG);
- Conf->usecompound = true;
- }
+ addCompoundAffixFlagValue(ConfBuild, s, FF_COMPOUNDFLAG);
+
oldformat = true;
goto nextline;
}
@@ -1528,7 +1741,8 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
if (!parse_affentry(pstr, mask, find, repl))
goto nextline;
- NIAddAffix(Conf, flag, flagflags, mask, find, repl, suffixes ? FF_SUFFIX : FF_PREFIX);
+ NIAddAffix(ConfBuild, flag, flagflags, mask, find, repl,
+ suffixes ? FF_SUFFIX : FF_PREFIX);
nextline:
pfree(recoded);
@@ -1547,53 +1761,48 @@ isnewformat:
errmsg("affix file contains both old-style and new-style commands")));
tsearch_readline_end(&trst);
- NIImportOOAffixes(Conf, filename);
+ NIImportOOAffixes(ConfBuild, filename);
}
/*
* Merges two affix flag sets and stores a new affix flag set into
- * Conf->AffixData.
+ * ConfBuild->AffixData.
*
* Returns index of a new affix flag set.
*/
static int
-MergeAffix(IspellDict *Conf, int a1, int a2)
+MergeAffix(IspellDictBuild *ConfBuild, int a1, int a2)
{
- char **ptr;
+ char *ptr;
+ uint32 len;
/* Do not merge affix flags if one of affix flags is empty */
- if (*Conf->AffixData[a1] == '\0')
+ if (*AffixDataGet(ConfBuild, a1) == '\0')
return a2;
- else if (*Conf->AffixData[a2] == '\0')
+ else if (*AffixDataGet(ConfBuild, a2) == '\0')
return a1;
- while (Conf->nAffixData + 1 >= Conf->lenAffixData)
- {
- Conf->lenAffixData *= 2;
- Conf->AffixData = (char **) repalloc(Conf->AffixData,
- sizeof(char *) * Conf->lenAffixData);
- }
-
- ptr = Conf->AffixData + Conf->nAffixData;
- if (Conf->flagMode == FM_NUM)
+ if (ConfBuild->dict->flagMode == FM_NUM)
{
- *ptr = cpalloc(strlen(Conf->AffixData[a1]) +
- strlen(Conf->AffixData[a2]) +
- 1 /* comma */ + 1 /* \0 */ );
- sprintf(*ptr, "%s,%s", Conf->AffixData[a1], Conf->AffixData[a2]);
+ len = strlen(AffixDataGet(ConfBuild, a1)) + 1 /* comma */ +
+ strlen(AffixDataGet(ConfBuild, a2));
+ ptr = tmpalloc(len + 1 /* \0 */);
+ sprintf(ptr, "%s,%s", AffixDataGet(ConfBuild, a1),
+ AffixDataGet(ConfBuild, a2));
}
else
{
- *ptr = cpalloc(strlen(Conf->AffixData[a1]) +
- strlen(Conf->AffixData[a2]) +
- 1 /* \0 */ );
- sprintf(*ptr, "%s%s", Conf->AffixData[a1], Conf->AffixData[a2]);
+ len = strlen(AffixDataGet(ConfBuild, a1)) +
+ strlen(AffixDataGet(ConfBuild, a2));
+ ptr = tmpalloc(len + 1 /* \0 */ );
+ sprintf(ptr, "%s%s", AffixDataGet(ConfBuild, a1),
+ AffixDataGet(ConfBuild, a2));
}
- ptr++;
- *ptr = NULL;
- Conf->nAffixData++;
- return Conf->nAffixData - 1;
+ AddAffixSet(ConfBuild, ptr, len);
+ pfree(ptr);
+
+ return ConfBuild->nAffixData - 1;
}
/*
@@ -1601,66 +1810,87 @@ MergeAffix(IspellDict *Conf, int a1, int a2)
* flags with the given index.
*/
static uint32
-makeCompoundFlags(IspellDict *Conf, int affix)
+makeCompoundFlags(IspellDictBuild *ConfBuild, int affix)
{
- char *str = Conf->AffixData[affix];
+ char *str = AffixDataGet(ConfBuild, affix);
- return (getCompoundAffixFlagValue(Conf, str) & FF_COMPOUNDFLAGMASK);
+ return (getCompoundAffixFlagValue(ConfBuild, str) & FF_COMPOUNDFLAGMASK);
}
/*
* Makes a prefix tree for the given level.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* low: lower index of the Conf->Spell array.
* high: upper index of the Conf->Spell array.
* level: current prefix tree level.
+ *
+ * Returns an offset of SPNode in DictNodes.
*/
-static SPNode *
-mkSPNode(IspellDict *Conf, int low, int high, int level)
+static uint32
+mkSPNode(IspellDictBuild *ConfBuild, int low, int high, int level)
{
int i;
int nchar = 0;
char lastchar = '\0';
+ uint32 rs_offset,
+ new_offset;
SPNode *rs;
SPNodeData *data;
+ int data_index = 0;
int lownew = low;
for (i = low; i < high; i++)
- if (Conf->Spell[i]->p.d.len > level && lastchar != Conf->Spell[i]->word[level])
+ if (ConfBuild->Spell[i]->p.d.len > level &&
+ lastchar != ConfBuild->Spell[i]->word[level])
{
nchar++;
- lastchar = Conf->Spell[i]->word[level];
+ lastchar = ConfBuild->Spell[i]->word[level];
}
if (!nchar)
- return NULL;
+ return ISPELL_INVALID_OFFSET;
- rs = (SPNode *) cpalloc0(SPNHDRSZ + nchar * sizeof(SPNodeData));
- rs->length = nchar;
+ rs_offset = AllocateSPNode(ConfBuild, nchar);
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
data = rs->data;
lastchar = '\0';
for (i = low; i < high; i++)
- if (Conf->Spell[i]->p.d.len > level)
+ if (ConfBuild->Spell[i]->p.d.len > level)
{
- if (lastchar != Conf->Spell[i]->word[level])
+ if (lastchar != ConfBuild->Spell[i]->word[level])
{
if (lastchar)
{
/* Next level of the prefix tree */
- data->node = mkSPNode(Conf, lownew, i, level + 1);
+ new_offset = mkSPNode(ConfBuild, lownew, i, level + 1);
+
+ /*
+ * ConfBuild->DictNodes can be repalloc'ed within
+ * mkSPNode(), so reinitialize pointers.
+ */
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
+
+ /* First save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ /* Work with next node */
+ data_index++;
+ Assert(data_index < nchar);
+ data = &(rs->data[data_index]);
+
lownew = i;
- data++;
}
- lastchar = Conf->Spell[i]->word[level];
+ lastchar = ConfBuild->Spell[i]->word[level];
}
- data->val = ((uint8 *) (Conf->Spell[i]->word))[level];
- if (Conf->Spell[i]->p.d.len == level + 1)
+ data->val = ((uint8 *) (ConfBuild->Spell[i]->word))[level];
+ if (ConfBuild->Spell[i]->p.d.len == level + 1)
{
bool clearCompoundOnly = false;
- if (data->isword && data->affix != Conf->Spell[i]->p.d.affix)
+ if (data->isword && data->affix != ConfBuild->Spell[i]->p.d.affix)
{
/*
* MergeAffix called a few times. If one of word is
@@ -1669,15 +1899,17 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
*/
clearCompoundOnly = (FF_COMPOUNDONLY & data->compoundflag
- & makeCompoundFlags(Conf, Conf->Spell[i]->p.d.affix))
+ & makeCompoundFlags(ConfBuild,
+ ConfBuild->Spell[i]->p.d.affix))
? false : true;
- data->affix = MergeAffix(Conf, data->affix, Conf->Spell[i]->p.d.affix);
+ data->affix = MergeAffix(ConfBuild, data->affix,
+ ConfBuild->Spell[i]->p.d.affix);
}
else
- data->affix = Conf->Spell[i]->p.d.affix;
+ data->affix = ConfBuild->Spell[i]->p.d.affix;
data->isword = 1;
- data->compoundflag = makeCompoundFlags(Conf, data->affix);
+ data->compoundflag = makeCompoundFlags(ConfBuild, data->affix);
if ((data->compoundflag & FF_COMPOUNDONLY) &&
(data->compoundflag & FF_COMPOUNDFLAG) == 0)
@@ -1689,9 +1921,19 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
}
/* Next level of the prefix tree */
- data->node = mkSPNode(Conf, lownew, high, level + 1);
+ new_offset = mkSPNode(ConfBuild, lownew, high, level + 1);
+
+ /*
+ * ConfBuild->DictNodes can be repalloc'ed within mkSPNode(), so
+ * reinitialize pointers.
+ */
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
- return rs;
+ /* Save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ return rs_offset;
}
/*
@@ -1699,7 +1941,7 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
* and affixes.
*/
void
-NISortDictionary(IspellDict *Conf)
+NISortDictionary(IspellDictBuild *ConfBuild)
{
int i;
int naffix = 0;
@@ -1708,81 +1950,81 @@ NISortDictionary(IspellDict *Conf)
/* compress affixes */
/*
- * If we use flag aliases then we need to use Conf->AffixData filled in
+ * If we use flag aliases then we need to use ConfBuild->AffixData filled in
* the NIImportOOAffixes().
*/
- if (Conf->useFlagAliases)
+ if (ConfBuild->dict->useFlagAliases)
{
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
char *end;
- if (*Conf->Spell[i]->p.flag != '\0')
+ if (*ConfBuild->Spell[i]->p.flag != '\0')
{
- curaffix = strtol(Conf->Spell[i]->p.flag, &end, 10);
- if (Conf->Spell[i]->p.flag == end || errno == ERANGE)
+ curaffix = strtol(ConfBuild->Spell[i]->p.flag, &end, 10);
+ if (ConfBuild->Spell[i]->p.flag == end || errno == ERANGE)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix alias \"%s\"",
- Conf->Spell[i]->p.flag)));
+ ConfBuild->Spell[i]->p.flag)));
}
else
{
/*
- * If Conf->Spell[i]->p.flag is empty, then get empty value of
- * Conf->AffixData (0 index).
+ * If ConfBuild->Spell[i]->p.flag is empty, then get empty
+ * value of ConfBuild->AffixData (0 index).
*/
curaffix = 0;
}
- Conf->Spell[i]->p.d.affix = curaffix;
- Conf->Spell[i]->p.d.len = strlen(Conf->Spell[i]->word);
+ ConfBuild->Spell[i]->p.d.affix = curaffix;
+ ConfBuild->Spell[i]->p.d.len = strlen(ConfBuild->Spell[i]->word);
}
}
- /* Otherwise fill Conf->AffixData here */
+ /* Otherwise fill ConfBuild->AffixData here */
else
{
/* Count the number of different flags used in the dictionary */
- qsort((void *) Conf->Spell, Conf->nspell, sizeof(SPELL *),
+ qsort((void *) ConfBuild->Spell, ConfBuild->nSpell, sizeof(SPELL *),
cmpspellaffix);
naffix = 0;
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
if (i == 0
- || strcmp(Conf->Spell[i]->p.flag, Conf->Spell[i - 1]->p.flag))
+ || strcmp(ConfBuild->Spell[i]->p.flag,
+ ConfBuild->Spell[i - 1]->p.flag))
naffix++;
}
/*
- * Fill in Conf->AffixData with the affixes that were used in the
- * dictionary. Replace textual flag-field of Conf->Spell entries with
- * indexes into Conf->AffixData array.
+ * Fill in AffixData with the affixes that were used in the
+ * dictionary. Replace textual flag-field of ConfBuild->Spell entries
+ * with indexes into ConfBuild->AffixData array.
*/
- Conf->AffixData = (char **) palloc0(naffix * sizeof(char *));
+ InitAffixData(ConfBuild, naffix);
curaffix = -1;
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
if (i == 0
- || strcmp(Conf->Spell[i]->p.flag, Conf->AffixData[curaffix]))
+ || strcmp(ConfBuild->Spell[i]->p.flag,
+ AffixDataGet(ConfBuild, curaffix)))
{
curaffix++;
Assert(curaffix < naffix);
- Conf->AffixData[curaffix] = cpstrdup(Conf,
- Conf->Spell[i]->p.flag);
+ AddAffixSet(ConfBuild, ConfBuild->Spell[i]->p.flag,
+ strlen(ConfBuild->Spell[i]->p.flag));
}
- Conf->Spell[i]->p.d.affix = curaffix;
- Conf->Spell[i]->p.d.len = strlen(Conf->Spell[i]->word);
+ ConfBuild->Spell[i]->p.d.affix = curaffix;
+ ConfBuild->Spell[i]->p.d.len = strlen(ConfBuild->Spell[i]->word);
}
-
- Conf->lenAffixData = Conf->nAffixData = naffix;
}
/* Start build a prefix tree */
- qsort((void *) Conf->Spell, Conf->nspell, sizeof(SPELL *), cmpspell);
- Conf->Dictionary = mkSPNode(Conf, 0, Conf->nspell, 0);
+ qsort((void *) ConfBuild->Spell, ConfBuild->nSpell, sizeof(SPELL *), cmpspell);
+ mkSPNode(ConfBuild, 0, ConfBuild->nSpell, 0);
}
/*
@@ -1790,83 +2032,104 @@ NISortDictionary(IspellDict *Conf)
* rule. Affixes with empty replace string do not include in the prefix tree.
* This affixes are included by mkVoidAffix().
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* low: lower index of the Conf->Affix array.
* high: upper index of the Conf->Affix array.
* level: current prefix tree level.
* type: FF_SUFFIX or FF_PREFIX.
+ *
+ * Returns an offset in nodes array.
*/
-static AffixNode *
-mkANode(IspellDict *Conf, int low, int high, int level, int type)
+static uint32
+mkANode(IspellDictBuild *ConfBuild, int low, int high, int level, int type)
{
int i;
int nchar = 0;
uint8 lastchar = '\0';
+ NodeArray *array;
+ uint32 rs_offset,
+ new_offset;
AffixNode *rs;
AffixNodeData *data;
+ int data_index = 0;
int lownew = low;
- int naff;
- AFFIX **aff;
for (i = low; i < high; i++)
- if (Conf->Affix[i].replen > level && lastchar != GETCHAR(Conf->Affix + i, level, type))
+ if (ConfBuild->Affix[i]->replen > level &&
+ lastchar != GETCHAR(ConfBuild->Affix[i], level, type))
{
nchar++;
- lastchar = GETCHAR(Conf->Affix + i, level, type);
+ lastchar = GETCHAR(ConfBuild->Affix[i], level, type);
}
if (!nchar)
- return NULL;
+ return ISPELL_INVALID_OFFSET;
- aff = (AFFIX **) tmpalloc(sizeof(AFFIX *) * (high - low + 1));
- naff = 0;
+ if (type == FF_SUFFIX)
+ array = &ConfBuild->SuffixNodes;
+ else
+ array = &ConfBuild->PrefixNodes;
- rs = (AffixNode *) cpalloc0(ANHRDSZ + nchar * sizeof(AffixNodeData));
- rs->length = nchar;
- data = rs->data;
+ rs_offset = AllocateAffixNode(ConfBuild, array, nchar);
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
+ data = (AffixNodeData *) rs->data;
lastchar = '\0';
for (i = low; i < high; i++)
- if (Conf->Affix[i].replen > level)
+ if (ConfBuild->Affix[i]->replen > level)
{
- if (lastchar != GETCHAR(Conf->Affix + i, level, type))
+ if (lastchar != GETCHAR(ConfBuild->Affix[i], level, type))
{
if (lastchar)
{
/* Next level of the prefix tree */
- data->node = mkANode(Conf, lownew, i, level + 1, type);
- if (naff)
- {
- data->naff = naff;
- data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * naff);
- memcpy(data->aff, aff, sizeof(AFFIX *) * naff);
- naff = 0;
- }
- data++;
+ new_offset = mkANode(ConfBuild, lownew, i, level + 1, type);
+
+ /*
+ * array can be repalloc'ed within mkANode(), so
+ * reinitialize pointers.
+ */
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
+
+ /* First save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ /* Handle next data node */
+ data_index++;
+ Assert(data_index < nchar);
+ data = &(rs->data[data_index]);
+
lownew = i;
}
- lastchar = GETCHAR(Conf->Affix + i, level, type);
+ lastchar = GETCHAR(ConfBuild->Affix[i], level, type);
}
- data->val = GETCHAR(Conf->Affix + i, level, type);
- if (Conf->Affix[i].replen == level + 1)
+ data->val = GETCHAR(ConfBuild->Affix[i], level, type);
+ if (ConfBuild->Affix[i]->replen == level + 1)
{ /* affix stopped */
- aff[naff++] = Conf->Affix + i;
+ if (data->affstart == ISPELL_INVALID_INDEX)
+ {
+ data->affstart = i;
+ data->affend = i;
+ }
+ else
+ data->affend = i;
}
}
/* Next level of the prefix tree */
- data->node = mkANode(Conf, lownew, high, level + 1, type);
- if (naff)
- {
- data->naff = naff;
- data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * naff);
- memcpy(data->aff, aff, sizeof(AFFIX *) * naff);
- naff = 0;
- }
+ new_offset = mkANode(ConfBuild, lownew, high, level + 1, type);
+
+ /*
+ * array can be repalloc'ed within mkANode(), so reinitialize pointers.
+ */
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
- pfree(aff);
+ /* Save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
- return rs;
+ return rs_offset;
}
/*
@@ -1874,137 +2137,154 @@ mkANode(IspellDict *Conf, int low, int high, int level, int type)
* for affixes which have empty replace string ("repl" field).
*/
static void
-mkVoidAffix(IspellDict *Conf, bool issuffix, int startsuffix)
+mkVoidAffix(IspellDictBuild *ConfBuild, bool issuffix, int startsuffix)
{
- int i,
- cnt = 0;
+ int i;
int start = (issuffix) ? startsuffix : 0;
- int end = (issuffix) ? Conf->naffixes : startsuffix;
- AffixNode *Affix = (AffixNode *) palloc0(ANHRDSZ + sizeof(AffixNodeData));
-
- Affix->length = 1;
- Affix->isvoid = 1;
+ int end = (issuffix) ? ConfBuild->nAffix : startsuffix;
+ uint32 node_offset;
+ NodeArray *array;
+ AffixNode *Affix;
+ AffixNodeData *AffixData;
if (issuffix)
- {
- Affix->data->node = Conf->Suffix;
- Conf->Suffix = Affix;
- }
+ array = &ConfBuild->SuffixNodes;
else
- {
- Affix->data->node = Conf->Prefix;
- Conf->Prefix = Affix;
- }
+ array = &ConfBuild->PrefixNodes;
- /* Count affixes with empty replace string */
- for (i = start; i < end; i++)
- if (Conf->Affix[i].replen == 0)
- cnt++;
-
- /* There is not affixes with empty replace string */
- if (cnt == 0)
- return;
+ node_offset = AllocateAffixNode(ConfBuild, array, 1);
+ Affix = (AffixNode *) NodeArrayGet(array, node_offset);
- Affix->data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * cnt);
- Affix->data->naff = (uint32) cnt;
+ Affix->isvoid = 1;
+ AffixData = (AffixNodeData *) Affix->data;
- cnt = 0;
for (i = start; i < end; i++)
- if (Conf->Affix[i].replen == 0)
+ if (ConfBuild->Affix[i]->replen == 0)
{
- Affix->data->aff[cnt] = Conf->Affix + i;
- cnt++;
+ if (AffixData->affstart == ISPELL_INVALID_INDEX)
+ {
+ AffixData->affstart = i;
+ AffixData->affend = i;
+ }
+ else
+ AffixData->affend = i;
}
}
/*
- * Checks if the affixflag is used by dictionary. Conf->AffixData does not
+ * Checks if the affixflag is used by dictionary. AffixData does not
* contain affixflag if this flag is not used actually by the .dict file.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* affixflag: affix flag.
*
- * Returns true if the Conf->AffixData array contains affixflag, otherwise
+ * Returns true if the ConfBuild->AffixData array contains affixflag, otherwise
* returns false.
*/
static bool
-isAffixInUse(IspellDict *Conf, char *affixflag)
+isAffixInUse(IspellDictBuild *ConfBuild, char *affixflag)
{
int i;
- for (i = 0; i < Conf->nAffixData; i++)
- if (IsAffixFlagInUse(Conf, i, affixflag))
+ for (i = 0; i < ConfBuild->nAffixData; i++)
+ if (IsAffixFlagInUse(ConfBuild->dict->flagMode,
+ AffixDataGet(ConfBuild, i), affixflag))
return true;
return false;
}
/*
- * Builds Conf->Prefix and Conf->Suffix trees from the imported affixes.
+ * Builds Prefix and Suffix trees from the imported affixes.
*/
void
-NISortAffixes(IspellDict *Conf)
+NISortAffixes(IspellDictBuild *ConfBuild)
{
AFFIX *Affix;
+ AffixNode *voidPrefix,
+ *voidSuffix;
size_t i;
CMPDAffix *ptr;
- int firstsuffix = Conf->naffixes;
+ int firstsuffix = ConfBuild->nAffix;
+ uint32 prefix_offset,
+ suffix_offset;
- if (Conf->naffixes == 0)
+ if (ConfBuild->nAffix == 0)
return;
/* Store compound affixes in the Conf->CompoundAffix array */
- if (Conf->naffixes > 1)
- qsort((void *) Conf->Affix, Conf->naffixes, sizeof(AFFIX), cmpaffix);
- Conf->CompoundAffix = ptr = (CMPDAffix *) palloc(sizeof(CMPDAffix) * Conf->naffixes);
- ptr->affix = NULL;
-
- for (i = 0; i < Conf->naffixes; i++)
+ if (ConfBuild->nAffix > 1)
+ qsort((void *) ConfBuild->Affix, ConfBuild->nAffix,
+ sizeof(AFFIX *), cmpaffix);
+ ConfBuild->nCompoundAffix = ConfBuild->nAffix;
+ ConfBuild->CompoundAffix = ptr =
+ (CMPDAffix *) tmpalloc(sizeof(CMPDAffix) * ConfBuild->nCompoundAffix);
+ ptr->affix = ISPELL_INVALID_INDEX;
+
+ for (i = 0; i < ConfBuild->nAffix; i++)
{
- Affix = &(((AFFIX *) Conf->Affix)[i]);
+ Affix = ConfBuild->Affix[i];
if (Affix->type == FF_SUFFIX && i < firstsuffix)
firstsuffix = i;
if ((Affix->flagflags & FF_COMPOUNDFLAG) && Affix->replen > 0 &&
- isAffixInUse(Conf, Affix->flag))
+ isAffixInUse(ConfBuild, AffixFieldFlag(Affix)))
{
- if (ptr == Conf->CompoundAffix ||
+ if (ptr == ConfBuild->CompoundAffix ||
ptr->issuffix != (ptr - 1)->issuffix ||
- strbncmp((const unsigned char *) (ptr - 1)->affix,
- (const unsigned char *) Affix->repl,
+ strbncmp((const unsigned char *) AffixFieldRepl(ConfBuild->Affix[(ptr - 1)->affix]),
+ (const unsigned char *) AffixFieldRepl(Affix),
(ptr - 1)->len))
{
/* leave only unique and minimals suffixes */
- ptr->affix = Affix->repl;
+ ptr->affix = i;
ptr->len = Affix->replen;
ptr->issuffix = (Affix->type == FF_SUFFIX);
ptr++;
}
}
}
- ptr->affix = NULL;
- Conf->CompoundAffix = (CMPDAffix *) repalloc(Conf->CompoundAffix, sizeof(CMPDAffix) * (ptr - Conf->CompoundAffix + 1));
+ ptr->affix = ISPELL_INVALID_INDEX;
+ ConfBuild->nCompoundAffix = ptr - ConfBuild->CompoundAffix + 1;
+ ConfBuild->CompoundAffix = (CMPDAffix *) repalloc(ConfBuild->CompoundAffix,
+ sizeof(CMPDAffix) * (ConfBuild->nCompoundAffix));
/* Start build a prefix tree */
- Conf->Prefix = mkANode(Conf, 0, firstsuffix, 0, FF_PREFIX);
- Conf->Suffix = mkANode(Conf, firstsuffix, Conf->naffixes, 0, FF_SUFFIX);
- mkVoidAffix(Conf, true, firstsuffix);
- mkVoidAffix(Conf, false, firstsuffix);
+ mkVoidAffix(ConfBuild, true, firstsuffix);
+ mkVoidAffix(ConfBuild, false, firstsuffix);
+
+ prefix_offset = mkANode(ConfBuild, 0, firstsuffix, 0, FF_PREFIX);
+ suffix_offset = mkANode(ConfBuild, firstsuffix, ConfBuild->nAffix, 0,
+ FF_SUFFIX);
+
+ /* Adjust offsets of new nodes for nodes of void affixes */
+ voidPrefix = (AffixNode *) NodeArrayGet(&ConfBuild->PrefixNodes, 0);
+ voidPrefix->data[0].node_offset = prefix_offset;
+
+ voidSuffix = (AffixNode *) NodeArrayGet(&ConfBuild->SuffixNodes, 0);
+ voidSuffix->data[0].node_offset = suffix_offset;
}
static AffixNodeData *
-FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
+FindAffixes(IspellDictData *dict, AffixNode *node, const char *word, int wrdlen,
+ int *level, int type)
{
+ AffixNode *node_start;
AffixNodeData *StopLow,
*StopHigh,
*StopMiddle;
uint8 symbol;
+ if (type == FF_PREFIX)
+ node_start = (AffixNode *) DictPrefixNodes(dict);
+ else
+ node_start = (AffixNode *) DictSuffixNodes(dict);
+
if (node->isvoid)
{ /* search void affixes */
- if (node->data->naff)
+ if (node->data->affstart != ISPELL_INVALID_INDEX)
return node->data;
- node = node->data->node;
+ node = (AffixNode *) DictNodeGet(node_start, node->data->node_offset);
}
while (node && *level < wrdlen)
@@ -2019,9 +2299,10 @@ FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
if (StopMiddle->val == symbol)
{
(*level)++;
- if (StopMiddle->naff)
+ if (StopMiddle->affstart != ISPELL_INVALID_INDEX)
return StopMiddle;
- node = StopMiddle->node;
+ node = (AffixNode *) DictNodeGet(node_start,
+ StopMiddle->node_offset);
break;
}
else if (StopMiddle->val < symbol)
@@ -2035,8 +2316,67 @@ FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
return NULL;
}
+/*
+ * Compile regular expression on first use and store it within reg.
+ */
+static void
+CompileAffixReg(AffixReg *reg, bool isregis, int type,
+ const char *mask, int masklen, MemoryContext dictCtx)
+{
+ MemoryContext oldcontext;
+
+ Assert(dictCtx);
+
+ /*
+ * Switch to memory context of the dictionary, so compiled expression can be
+ * used in other queries.
+ */
+ oldcontext = MemoryContextSwitchTo(dictCtx);
+
+ if (isregis)
+ RS_compile(&reg->r.regis, (type == FF_SUFFIX), mask);
+ else
+ {
+ int wmasklen;
+ int err;
+ pg_wchar *wmask;
+ char *tmask;
+
+ tmask = (char *) palloc(masklen + 3);
+ if (type == FF_SUFFIX)
+ sprintf(tmask, "%s$", mask);
+ else
+ sprintf(tmask, "^%s", mask);
+
+ masklen = strlen(tmask);
+ wmask = (pg_wchar *) palloc((masklen + 1) * sizeof(pg_wchar));
+ wmasklen = pg_mb2wchar_with_len(tmask, wmask, masklen);
+
+ err = pg_regcomp(&(reg->r.regex), wmask, wmasklen,
+ REG_ADVANCED | REG_NOSUB,
+ DEFAULT_COLLATION_OID);
+ if (err)
+ {
+ char errstr[100];
+
+ pg_regerror(err, &(reg->r.regex), errstr, sizeof(errstr));
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
+ errmsg("invalid regular expression: %s", errstr)));
+ }
+
+ pfree(wmask);
+ pfree(tmask);
+ }
+
+ reg->iscompiled = true;
+
+ MemoryContextSwitchTo(oldcontext);
+}
+
static char *
-CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *newword, int *baselen)
+CheckAffix(const char *word, size_t len, AFFIX *Affix, AffixReg *reg,
+ int flagflags, char *newword, int *baselen, MemoryContext dictCtx)
{
/*
* Check compound allow flags
@@ -2076,7 +2416,7 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
if (Affix->type == FF_SUFFIX)
{
strcpy(newword, word);
- strcpy(newword + len - Affix->replen, Affix->find);
+ strcpy(newword + len - Affix->replen, AffixFieldFind(Affix));
if (baselen) /* store length of non-changed part of word */
*baselen = len - Affix->replen;
}
@@ -2086,9 +2426,9 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
* if prefix is an all non-changed part's length then all word
* contains only prefix and suffix, so out
*/
- if (baselen && *baselen + strlen(Affix->find) <= Affix->replen)
+ if (baselen && *baselen + Affix->findlen <= Affix->replen)
return NULL;
- strcpy(newword, Affix->find);
+ strcpy(newword, AffixFieldFind(Affix));
strcat(newword, word + Affix->replen);
}
@@ -2099,7 +2439,12 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
return newword;
else if (Affix->isregis)
{
- if (RS_execute(&(Affix->reg.regis), newword))
+ /* Compile the regular expression on first demand */
+ if (!reg->iscompiled)
+ CompileAffixReg(reg, Affix->isregis, Affix->type,
+ AffixFieldMask(Affix), Affix->masklen, dictCtx);
+
+ if (RS_execute(&(reg->r.regis), newword))
return newword;
}
else
@@ -2109,12 +2454,17 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
size_t data_len;
int newword_len;
+ /* Compile the regular expression on first demand */
+ if (!reg->iscompiled)
+ CompileAffixReg(reg, Affix->isregis, Affix->type,
+ AffixFieldMask(Affix), Affix->masklen, dictCtx);
+
/* Convert data string to wide characters */
newword_len = strlen(newword);
data = (pg_wchar *) palloc((newword_len + 1) * sizeof(pg_wchar));
data_len = pg_mb2wchar_with_len(newword, data, newword_len);
- if (!(err = pg_regexec(&(Affix->reg.regex), data, data_len, 0, NULL, 0, NULL, 0)))
+ if (!(err = pg_regexec(&(reg->r.regex), data, data_len, 0, NULL, 0, NULL, 0)))
{
pfree(data);
return newword;
@@ -2153,7 +2503,7 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
char **cur;
char newword[2 * MAXNORMLEN] = "";
char pnewword[2 * MAXNORMLEN] = "";
- AffixNode *snode = Conf->Suffix,
+ AffixNode *snode = (AffixNode *) DictSuffixNodes(Conf->dict),
*pnode;
int i,
j;
@@ -2165,7 +2515,7 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
/* Check that the word itself is normal form */
- if (FindWord(Conf, word, VoidString, flag))
+ if (FindWord(Conf->dict, word, VoidString, flag))
{
*cur = pstrdup(word);
cur++;
@@ -2173,23 +2523,29 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
}
/* Find all other NORMAL forms of the 'word' (check only prefix) */
- pnode = Conf->Prefix;
+ pnode = (AffixNode *) DictPrefixNodes(Conf->dict);
plevel = 0;
while (pnode)
{
- prefix = FindAffixes(pnode, word, wrdlen, &plevel, FF_PREFIX);
+ prefix = FindAffixes(Conf->dict, pnode, word, wrdlen, &plevel, FF_PREFIX);
if (!prefix)
break;
- for (j = 0; j < prefix->naff; j++)
+ for (j = prefix->affstart; j <= prefix->affend; j++)
{
- if (CheckAffix(word, wrdlen, prefix->aff[j], flag, newword, NULL))
+ AFFIX *affix = (AFFIX *) DictAffixGet(Conf->dict, j);
+ AffixReg *reg = &(Conf->reg[j]);
+
+ if (affix &&
+ CheckAffix(word, wrdlen, affix, reg, flag, newword, NULL,
+ Conf->dictCtx))
{
/* prefix success */
- if (FindWord(Conf, newword, prefix->aff[j]->flag, flag))
+ if (FindWord(Conf->dict, newword, AffixFieldFlag(affix), flag))
cur += addToResult(forms, cur, newword);
}
}
- pnode = prefix->node;
+ pnode = (AffixNode *) DictNodeGet(DictPrefixNodes(Conf->dict),
+ prefix->node_offset);
}
/*
@@ -2201,45 +2557,59 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
int baselen = 0;
/* find possible suffix */
- suffix = FindAffixes(snode, word, wrdlen, &slevel, FF_SUFFIX);
+ suffix = FindAffixes(Conf->dict, snode, word, wrdlen, &slevel,
+ FF_SUFFIX);
if (!suffix)
break;
/* foreach suffix check affix */
- for (i = 0; i < suffix->naff; i++)
+ for (i = suffix->affstart; i <= suffix->affend; i++)
{
- if (CheckAffix(word, wrdlen, suffix->aff[i], flag, newword, &baselen))
+ AFFIX *sufentry = (AFFIX *) DictAffixGet(Conf->dict, i);
+ AffixReg *sufreg = &(Conf->reg[i]);
+
+ if (sufentry &&
+ CheckAffix(word, wrdlen, sufentry, sufreg, flag, newword, &baselen,
+ Conf->dictCtx))
{
/* suffix success */
- if (FindWord(Conf, newword, suffix->aff[i]->flag, flag))
+ if (FindWord(Conf->dict, newword, AffixFieldFlag(sufentry), flag))
cur += addToResult(forms, cur, newword);
/* now we will look changed word with prefixes */
- pnode = Conf->Prefix;
+ pnode = (AffixNode *) DictPrefixNodes(Conf->dict);
plevel = 0;
swrdlen = strlen(newword);
while (pnode)
{
- prefix = FindAffixes(pnode, newword, swrdlen, &plevel, FF_PREFIX);
+ prefix = FindAffixes(Conf->dict, pnode, newword, swrdlen,
+ &plevel, FF_PREFIX);
if (!prefix)
break;
- for (j = 0; j < prefix->naff; j++)
+ for (j = prefix->affstart; j <= prefix->affend; j++)
{
- if (CheckAffix(newword, swrdlen, prefix->aff[j], flag, pnewword, &baselen))
+ AFFIX *prefentry = (AFFIX *) DictAffixGet(Conf->dict, j);
+ AffixReg *prefreg = &(Conf->reg[j]);
+
+ if (prefentry &&
+ CheckAffix(newword, swrdlen, prefentry, prefreg,
+ flag, pnewword, &baselen, Conf->dictCtx))
{
/* prefix success */
- char *ff = (prefix->aff[j]->flagflags & suffix->aff[i]->flagflags & FF_CROSSPRODUCT) ?
- VoidString : prefix->aff[j]->flag;
+ char *ff = (prefentry->flagflags & sufentry->flagflags & FF_CROSSPRODUCT) ?
+ VoidString : AffixFieldFlag(prefentry);
- if (FindWord(Conf, pnewword, ff, flag))
+ if (FindWord(Conf->dict, pnewword, ff, flag))
cur += addToResult(forms, cur, pnewword);
}
}
- pnode = prefix->node;
+ pnode = (AffixNode *) DictNodeGet(DictPrefixNodes(Conf->dict),
+ prefix->node_offset);
}
}
}
- snode = suffix->node;
+ snode = (AffixNode *) DictNodeGet(DictSuffixNodes(Conf->dict),
+ suffix->node_offset);
}
if (cur == forms)
@@ -2259,7 +2629,8 @@ typedef struct SplitVar
} SplitVar;
static int
-CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
+CheckCompoundAffixes(IspellDictData *dict, CMPDAffix **ptr,
+ char *word, int len, bool CheckInPlace)
{
bool issuffix;
@@ -2269,9 +2640,12 @@ CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
if (CheckInPlace)
{
- while ((*ptr)->affix)
+ while ((*ptr)->affix != ISPELL_INVALID_INDEX)
{
- if (len > (*ptr)->len && strncmp((*ptr)->affix, word, (*ptr)->len) == 0)
+ AFFIX *affix = (AFFIX *) DictAffixGet(dict, (*ptr)->affix);
+
+ if (len > (*ptr)->len &&
+ strncmp(AffixFieldRepl(affix), word, (*ptr)->len) == 0)
{
len = (*ptr)->len;
issuffix = (*ptr)->issuffix;
@@ -2285,9 +2659,12 @@ CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
{
char *affbegin;
- while ((*ptr)->affix)
+ while ((*ptr)->affix != ISPELL_INVALID_INDEX)
{
- if (len > (*ptr)->len && (affbegin = strstr(word, (*ptr)->affix)) != NULL)
+ AFFIX *affix = (AFFIX *) DictAffixGet(dict, (*ptr)->affix);
+
+ if (len > (*ptr)->len &&
+ (affbegin = strstr(word, AffixFieldRepl(affix))) != NULL)
{
len = (*ptr)->len + (affbegin - word);
issuffix = (*ptr)->issuffix;
@@ -2339,13 +2716,14 @@ AddStem(SplitVar *v, char *word)
}
static SplitVar *
-SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int wordlen, int startpos, int minpos)
+SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig,
+ char *word, int wordlen, int startpos, int minpos)
{
SplitVar *var = NULL;
SPNodeData *StopLow,
*StopHigh,
*StopMiddle = NULL;
- SPNode *node = (snode) ? snode : Conf->Dictionary;
+ SPNode *node = (snode) ? snode : (SPNode *) DictDictNodes(Conf->dict);
int level = (snode) ? minpos : startpos; /* recursive
* minpos==level */
int lenaff;
@@ -2360,8 +2738,11 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
while (level < wordlen)
{
/* find word with epenthetic or/and compound affix */
- caff = Conf->CompoundAffix;
- while (level > startpos && (lenaff = CheckCompoundAffixes(&caff, word + level, wordlen - level, (node) ? true : false)) >= 0)
+ caff = (CMPDAffix *) DictCompoundAffix(Conf->dict);
+ while (level > startpos &&
+ (lenaff = CheckCompoundAffixes(Conf->dict, &caff,
+ word + level, wordlen - level,
+ (node) ? true : false)) >= 0)
{
/*
* there is one of compound affixes, so check word for existings
@@ -2408,7 +2789,8 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
while (ptr->next)
ptr = ptr->next;
- ptr->next = SplitToVariants(Conf, NULL, new, word, wordlen, startpos + lenaff, startpos + lenaff);
+ ptr->next = SplitToVariants(Conf, NULL, new, word, wordlen,
+ startpos + lenaff, startpos + lenaff);
pfree(new->stem);
pfree(new);
@@ -2467,13 +2849,14 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
/* we can find next word */
level++;
AddStem(var, pnstrdup(word + startpos, level - startpos));
- node = Conf->Dictionary;
+ node = (SPNode *) DictDictNodes(Conf->dict);
startpos = level;
continue;
}
}
}
- node = StopMiddle->node;
+ node = (SPNode *) DictNodeGet(DictDictNodes(Conf->dict),
+ StopMiddle->node_offset);
}
else
node = NULL;
@@ -2523,7 +2906,7 @@ NINormalizeWord(IspellDict *Conf, char *word)
pfree(res);
}
- if (Conf->usecompound)
+ if (Conf->dict->usecompound)
{
int wordlen = strlen(word);
SplitVar *ptr,
diff --git a/src/include/tsearch/dicts/spell.h b/src/include/tsearch/dicts/spell.h
index 210f97dda9..b40cf379eb 100644
--- a/src/include/tsearch/dicts/spell.h
+++ b/src/include/tsearch/dicts/spell.h
@@ -18,21 +18,23 @@
#include "tsearch/dicts/regis.h"
#include "tsearch/ts_public.h"
+#define ISPELL_INVALID_INDEX (0x7FFFF)
+#define ISPELL_INVALID_OFFSET (0xFFFFFFFF)
+
/*
* SPNode and SPNodeData are used to represent prefix tree (Trie) to store
* a words list.
*/
-struct SPNode;
-
typedef struct
{
uint32 val:8,
isword:1,
/* Stores compound flags listed below */
compoundflag:4,
- /* Reference to an entry of the AffixData field */
+ /* Index of an entry of the AffixData field */
affix:19;
- struct SPNode *node;
+ /* Offset to a node of the DictNodes field */
+ uint32 node_offset;
} SPNodeData;
/*
@@ -86,21 +88,55 @@ typedef struct spell_struct
*/
typedef struct aff_struct
{
- char *flag;
/* FF_SUFFIX or FF_PREFIX */
- uint32 type:1,
+ uint16 type:1,
flagflags:7,
issimple:1,
isregis:1,
- replen:14;
- char *find;
- char *repl;
+ flaglen:2;
+
+ /* 8 bits could be too much for repl, find and mask, but who knows */
+ uint8 replen;
+ uint8 findlen;
+ uint8 masklen;
+
+ /*
+ * fields stores the following data (each ends with \0):
+ * - repl
+ * - find
+ * - mask
+ * - flag - one character (if FM_CHAR),
+ * two characters (if FM_LONG),
+ * number, >= 0 and < 65536 (if FM_NUM).
+ */
+ char fields[FLEXIBLE_ARRAY_MEMBER];
+} AFFIX;
+
+#define AF_FLAG_MAXSIZE 5 /* strlen(65536) */
+#define AF_REPL_MAXSIZE 255 /* 8 bits */
+
+#define AFFIXHDRSZ (offsetof(AFFIX, fields))
+
+#define AffixFieldRepl(af) ((af)->fields)
+#define AffixFieldFind(af) ((af)->fields + (af)->replen + 1)
+#define AffixFieldMask(af) (AffixFieldFind(af) + (af)->findlen + 1)
+#define AffixFieldFlag(af) (AffixFieldMask(af) + (af)->masklen + 1)
+#define AffixGetSize(af) (AFFIXHDRSZ + (af)->replen + 1 + (af)->findlen + 1 \
+ + (af)->masklen + 1 + strlen(AffixFieldFlag(af)) + 1)
+
+/*
+ * Stores compiled regular expression of affix. AffixReg uses mask field of
+ * AFFIX as a regular expression.
+ */
+typedef struct AffixReg
+{
+ bool iscompiled;
union
{
regex_t regex;
Regis regis;
- } reg;
-} AFFIX;
+ } r;
+} AffixReg;
/*
* affixes use dictionary flags too
@@ -120,14 +156,13 @@ typedef struct aff_struct
* AffixNode and AffixNodeData are used to represent prefix tree (Trie) to store
* an affix list.
*/
-struct AffixNode;
-
typedef struct
{
- uint32 val:8,
- naff:24;
- AFFIX **aff;
- struct AffixNode *node;
+ uint8 val;
+ uint32 affstart;
+ uint32 affend;
+ /* Offset to a node of the PrefixNodes or SuffixNodes field */
+ uint32 node_offset;
} AffixNodeData;
typedef struct AffixNode
@@ -139,9 +174,19 @@ typedef struct AffixNode
#define ANHRDSZ (offsetof(AffixNode, data))
+typedef struct NodeArray
+{
+ char *Nodes;
+ uint32 NodesSize; /* allocated size of Nodes */
+ uint32 NodesEnd; /* end of data in Nodes */
+} NodeArray;
+
+#define NodeArrayGet(na, of) (((of) == ISPELL_INVALID_OFFSET) ? NULL : (na)->Nodes + (of))
+
typedef struct
{
- char *affix;
+ /* Index of an affix of the Affix field */
+ uint32 affix;
int len;
bool issuffix;
} CMPDAffix;
@@ -176,30 +221,71 @@ typedef struct CompoundAffixFlag
#define FLAGNUM_MAXSIZE (1 << 16)
-typedef struct
+typedef struct IspellDictData
{
- int maffixes;
- int naffixes;
- AFFIX *Affix;
-
- AffixNode *Suffix;
- AffixNode *Prefix;
+ FlagMode flagMode;
+ bool usecompound;
- SPNode *Dictionary;
- /* Array of sets of affixes */
- char **AffixData;
- int lenAffixData;
- int nAffixData;
bool useFlagAliases;
- CMPDAffix *CompoundAffix;
+ uint32 nAffixData;
+ uint32 AffixDataStart;
- bool usecompound;
- FlagMode flagMode;
+ uint32 AffixOffsetStart;
+ uint32 AffixStart;
+ uint32 nAffix;
+
+ uint32 DictNodesStart;
+ uint32 PrefixNodesStart;
+ uint32 SuffixNodesStart;
+
+ uint32 CompoundAffixStart;
/*
- * All follow fields are actually needed only for initialization
+ * data stores:
+ * - AffixData - array of affix sets
+ * - Affix - sorted array of affixes
+ * - DictNodes - prefix tree of a word list
+ * - PrefixNodes - prefix tree of a prefix list
+ * - SuffixNodes - prefix tree of a suffix list
+ * - CompoundAffix - array of compound affixes
*/
+ char data[FLEXIBLE_ARRAY_MEMBER];
+} IspellDictData;
+
+#define IspellDictDataHdrSize (offsetof(IspellDictData, data))
+
+#define DictAffixDataOffset(d) ((d)->data)
+#define DictAffixData(d) ((d)->data + (d)->AffixDataStart)
+#define DictAffixDataGet(d, i) (((i) == ISPELL_INVALID_INDEX) ? NULL : \
+ DictAffixData(d) + ((uint32 *) DictAffixDataOffset(d))[i])
+
+#define DictAffixOffset(d) ((d)->data + (d)->AffixOffsetStart)
+#define DictAffix(d) ((d)->data + (d)->AffixStart)
+#define DictAffixGet(d, i) (((i) == ISPELL_INVALID_INDEX) ? NULL : \
+ DictAffix(d) + ((uint32 *) DictAffixOffset(d))[i])
+
+#define DictDictNodes(d) ((d)->data + (d)->DictNodesStart)
+#define DictPrefixNodes(d) ((d)->data + (d)->PrefixNodesStart)
+#define DictSuffixNodes(d) ((d)->data + (d)->SuffixNodesStart)
+#define DictNodeGet(node_start, of) (((of) == ISPELL_INVALID_OFFSET) ? NULL : \
+ (char *) (node_start) + (of))
+
+#define DictCompoundAffix(d) ((d)->data + (d)->CompoundAffixStart)
+
+/*
+ * IspellDictBuild is used to initialize IspellDictData struct. This is a
+ * temporary structure which is set up by NIStartBuild() and released by
+ * NIFinishBuild().
+ */
+typedef struct IspellDictBuild
+{
+ MemoryContext buildCxt; /* temp context for construction */
+
+ IspellDictData *dict;
+ uint32 dict_size;
+
+ /* Temporary data */
/* Array of Hunspell options in affix file */
CompoundAffixFlag *CompoundAffixFlags;
@@ -208,29 +294,73 @@ typedef struct
/* allocated length of CompoundAffixFlags array */
int mCompoundAffixFlag;
- /*
- * Remaining fields are only used during dictionary construction; they are
- * set up by NIStartBuild and cleared by NIFinishBuild.
- */
- MemoryContext buildCxt; /* temp context for construction */
-
- /* Temporary array of all words in the dict file */
+ /* Array of all words in the dict file */
SPELL **Spell;
- int nspell; /* number of valid entries in Spell array */
- int mspell; /* allocated length of Spell array */
+ int nSpell; /* number of valid entries in Spell array */
+ int mSpell; /* allocated length of Spell array */
+
+ /* Array of all affixes in the aff file */
+ AFFIX **Affix;
+ int nAffix; /* number of valid entries in Affix array */
+ int mAffix; /* allocated length of Affix array */
+ uint32 AffixSize;
+
+ /* Data for IspellDictData */
+
+ /* Array of sets of affixes */
+ uint32 *AffixDataOffset;
+ int nAffixData; /* number of affix sets */
+ int mAffixData; /* allocated number of affix sets */
+ char *AffixData;
+ uint32 AffixDataSize; /* allocated size of AffixData */
+ uint32 AffixDataEnd; /* end of data in AffixData */
+
+ /* Prefix tree which stores a word list */
+ NodeArray DictNodes;
+
+ /* Prefix tree which stores a prefix list */
+ NodeArray PrefixNodes;
+
+ /* Prefix tree which stores a suffix list */
+ NodeArray SuffixNodes;
- /* These are used to allocate "compact" data without palloc overhead */
- char *firstfree; /* first free address (always maxaligned) */
- size_t avail; /* free space remaining at firstfree */
+ /* Array of compound affixes */
+ CMPDAffix *CompoundAffix;
+ int nCompoundAffix; /* number of entries of CompoundAffix */
+} IspellDictBuild;
+
+#define AffixDataGet(d, i) ((d)->AffixData + (d)->AffixDataOffset[i])
+
+/*
+ * IspellDict is used within NINormalizeWord.
+ */
+typedef struct IspellDict
+{
+ /*
+ * Pointer to a DSM location of IspellDictData. Should be retrieved on
+ * every dispell_lexize() call.
+ */
+ IspellDictData *dict;
+ /*
+ * Array of regular expressions of affixes. Each regular expression is
+ * compiled only on demand.
+ */
+ AffixReg *reg;
+ /*
+ * Memory context for compiling regular expressions.
+ */
+ MemoryContext dictCtx;
} IspellDict;
extern TSLexeme *NINormalizeWord(IspellDict *Conf, char *word);
-extern void NIStartBuild(IspellDict *Conf);
-extern void NIImportAffixes(IspellDict *Conf, const char *filename);
-extern void NIImportDictionary(IspellDict *Conf, const char *filename);
-extern void NISortDictionary(IspellDict *Conf);
-extern void NISortAffixes(IspellDict *Conf);
-extern void NIFinishBuild(IspellDict *Conf);
+extern void NIStartBuild(IspellDictBuild *ConfBuild);
+extern void NIImportAffixes(IspellDictBuild *ConfBuild, const char *filename);
+extern void NIImportDictionary(IspellDictBuild *ConfBuild,
+ const char *filename);
+extern void NISortDictionary(IspellDictBuild *ConfBuild);
+extern void NISortAffixes(IspellDictBuild *ConfBuild);
+extern void NICopyData(IspellDictBuild *ConfBuild);
+extern void NIFinishBuild(IspellDictBuild *ConfBuild);
#endif
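
To make the pointer-free AFFIX layout from spell.h above easier to follow, here is a minimal standalone sketch. It is not part of the patch: DemoAffix, demo_affix_write() and demo_affix_get() are hypothetical names, the struct keeps only the three length fields, and plain C types replace the PostgreSQL ones. It mirrors the idea of the AffixField* and DictAffixGet() macros: strings are packed back to back after a small header, and records are found through an offset table rather than through pointers, so the whole buffer can be mapped at any address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_INVALID_INDEX 0x7FFFF	/* same sentinel value as ISPELL_INVALID_INDEX */

/* Simplified stand-in for AFFIX: lengths first, then the packed strings */
typedef struct
{
	uint8_t		replen;
	uint8_t		findlen;
	uint8_t		masklen;
	char		fields[];		/* repl\0 find\0 mask\0 flag\0 */
} DemoAffix;

#define DEMO_HDRSZ			(offsetof(DemoAffix, fields))
#define DemoFieldRepl(af)	((af)->fields)
#define DemoFieldFind(af)	((af)->fields + (af)->replen + 1)
#define DemoFieldMask(af)	(DemoFieldFind(af) + (af)->findlen + 1)
#define DemoFieldFlag(af)	(DemoFieldMask(af) + (af)->masklen + 1)
#define DemoGetSize(af)		(DEMO_HDRSZ + (af)->replen + 1 + (af)->findlen + 1 \
							 + (af)->masklen + 1 + strlen(DemoFieldFlag(af)) + 1)

/*
 * Write one packed record at dst and return its size in bytes.
 * The caller must guarantee that dst has enough room.
 */
static size_t
demo_affix_write(char *dst, const char *repl, const char *find,
				 const char *mask, const char *flag)
{
	DemoAffix  *af = (DemoAffix *) dst;
	size_t		rl = strlen(repl);
	size_t		fl = strlen(find);
	size_t		ml = strlen(mask);

	/* each length must fit into its uint8 field */
	assert(rl <= UINT8_MAX && fl <= UINT8_MAX && ml <= UINT8_MAX);

	af->replen = (uint8_t) rl;
	af->findlen = (uint8_t) fl;
	af->masklen = (uint8_t) ml;

	/* the lengths are set, so the accessor macros already point correctly */
	memcpy(DemoFieldRepl(af), repl, rl + 1);
	memcpy(DemoFieldFind(af), find, fl + 1);
	memcpy(DemoFieldMask(af), mask, ml + 1);
	memcpy(DemoFieldFlag(af), flag, strlen(flag) + 1);

	return DemoGetSize(af);
}

/* Look up record i through the offset table; no pointers are stored */
static const DemoAffix *
demo_affix_get(const char *base, const uint32_t *offsets, uint32_t i)
{
	if (i == DEMO_INVALID_INDEX)
		return NULL;
	return (const DemoAffix *) (base + offsets[i]);
}
```

A caller would append records to one buffer, remember each starting offset in a uint32 array, and later read the strings back with DemoFieldRepl() and friends. Since only offsets are stored, the buffer could be copied into a DSM segment with a single memcpy, which is the property the patch relies on.
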
0004-Update-tmplinit-arguments-v4.patch
diff --git a/contrib/dict_int/Makefile b/contrib/dict_int/Makefile
index f6ae24aa4d..897be348ff 100644
--- a/contrib/dict_int/Makefile
+++ b/contrib/dict_int/Makefile
@@ -4,7 +4,7 @@ MODULE_big = dict_int
OBJS = dict_int.o $(WIN32RES)
EXTENSION = dict_int
-DATA = dict_int--1.0.sql dict_int--unpackaged--1.0.sql
+DATA = dict_int--1.1.sql dict_int--1.0--1.1.sql dict_int--unpackaged--1.0.sql
PGFILEDESC = "dict_int - add-on dictionary template for full-text search"
REGRESS = dict_int
diff --git a/contrib/dict_int/dict_int--1.0--1.1.sql b/contrib/dict_int/dict_int--1.0--1.1.sql
new file mode 100644
index 0000000000..3517a5ecd1
--- /dev/null
+++ b/contrib/dict_int/dict_int--1.0--1.1.sql
@@ -0,0 +1,9 @@
+/* contrib/dict_int/dict_int--1.0--1.1.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION dict_int UPDATE TO '1.1'" to load this file. \quit
+
+CREATE FUNCTION dintdict_init(internal, internal)
+ RETURNS internal
+ AS 'MODULE_PATHNAME'
+ LANGUAGE C STRICT;
diff --git a/contrib/dict_int/dict_int--1.0.sql b/contrib/dict_int/dict_int--1.1.sql
similarity index 93%
rename from contrib/dict_int/dict_int--1.0.sql
rename to contrib/dict_int/dict_int--1.1.sql
index acb1461b56..6d3933e3d3 100644
--- a/contrib/dict_int/dict_int--1.0.sql
+++ b/contrib/dict_int/dict_int--1.1.sql
@@ -3,7 +3,7 @@
-- complain if script is sourced in psql, rather than via CREATE EXTENSION
\echo Use "CREATE EXTENSION dict_int" to load this file. \quit
-CREATE FUNCTION dintdict_init(internal)
+CREATE FUNCTION dintdict_init(internal, internal)
RETURNS internal
AS 'MODULE_PATHNAME'
LANGUAGE C STRICT;
diff --git a/contrib/dict_int/dict_int.control b/contrib/dict_int/dict_int.control
index 6e2d2b351a..51894171f6 100644
--- a/contrib/dict_int/dict_int.control
+++ b/contrib/dict_int/dict_int.control
@@ -1,5 +1,5 @@
# dict_int extension
comment = 'text search dictionary template for integers'
-default_version = '1.0'
+default_version = '1.1'
module_pathname = '$libdir/dict_int'
relocatable = true
diff --git a/contrib/dict_xsyn/Makefile b/contrib/dict_xsyn/Makefile
index 0c401cf3c8..d1cf8d0b5d 100644
--- a/contrib/dict_xsyn/Makefile
+++ b/contrib/dict_xsyn/Makefile
@@ -4,7 +4,7 @@ MODULE_big = dict_xsyn
OBJS = dict_xsyn.o $(WIN32RES)
EXTENSION = dict_xsyn
-DATA = dict_xsyn--1.0.sql dict_xsyn--unpackaged--1.0.sql
+DATA = dict_xsyn--1.1.sql dict_xsyn--1.0--1.1.sql dict_xsyn--unpackaged--1.0.sql
DATA_TSEARCH = xsyn_sample.rules
PGFILEDESC = "dict_xsyn - add-on dictionary template for full-text search"
diff --git a/contrib/dict_xsyn/dict_xsyn--1.0--1.1.sql b/contrib/dict_xsyn/dict_xsyn--1.0--1.1.sql
new file mode 100644
index 0000000000..35a576bfee
--- /dev/null
+++ b/contrib/dict_xsyn/dict_xsyn--1.0--1.1.sql
@@ -0,0 +1,9 @@
+/* contrib/dict_xsyn/dict_xsyn--1.0--1.1.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION dict_xsyn UPDATE TO '1.1'" to load this file. \quit
+
+CREATE FUNCTION dxsyn_init(internal, internal)
+ RETURNS internal
+ AS 'MODULE_PATHNAME'
+ LANGUAGE C STRICT;
diff --git a/contrib/dict_xsyn/dict_xsyn--1.0.sql b/contrib/dict_xsyn/dict_xsyn--1.1.sql
similarity index 93%
rename from contrib/dict_xsyn/dict_xsyn--1.0.sql
rename to contrib/dict_xsyn/dict_xsyn--1.1.sql
index 3d6bb51ca8..d8d1de1aa4 100644
--- a/contrib/dict_xsyn/dict_xsyn--1.0.sql
+++ b/contrib/dict_xsyn/dict_xsyn--1.1.sql
@@ -3,7 +3,7 @@
-- complain if script is sourced in psql, rather than via CREATE EXTENSION
\echo Use "CREATE EXTENSION dict_xsyn" to load this file. \quit
-CREATE FUNCTION dxsyn_init(internal)
+CREATE FUNCTION dxsyn_init(internal, internal)
RETURNS internal
AS 'MODULE_PATHNAME'
LANGUAGE C STRICT;
diff --git a/contrib/dict_xsyn/dict_xsyn.control b/contrib/dict_xsyn/dict_xsyn.control
index 3fd465a955..50358374a7 100644
--- a/contrib/dict_xsyn/dict_xsyn.control
+++ b/contrib/dict_xsyn/dict_xsyn.control
@@ -1,5 +1,5 @@
# dict_xsyn extension
comment = 'text search dictionary template for extended synonym processing'
-default_version = '1.0'
+default_version = '1.1'
module_pathname = '$libdir/dict_xsyn'
relocatable = true
diff --git a/contrib/unaccent/Makefile b/contrib/unaccent/Makefile
index f8e3860926..b0ba23ed37 100644
--- a/contrib/unaccent/Makefile
+++ b/contrib/unaccent/Makefile
@@ -4,7 +4,8 @@ MODULE_big = unaccent
OBJS = unaccent.o $(WIN32RES)
EXTENSION = unaccent
-DATA = unaccent--1.1.sql unaccent--1.0--1.1.sql unaccent--unpackaged--1.0.sql
+DATA = unaccent--1.2.sql unaccent--1.1--1.2.sql unaccent--1.0--1.1.sql \
+ unaccent--unpackaged--1.0.sql
DATA_TSEARCH = unaccent.rules
PGFILEDESC = "unaccent - text search dictionary that removes accents"
diff --git a/contrib/unaccent/unaccent--1.1--1.2.sql b/contrib/unaccent/unaccent--1.1--1.2.sql
new file mode 100644
index 0000000000..eaef37f87e
--- /dev/null
+++ b/contrib/unaccent/unaccent--1.1--1.2.sql
@@ -0,0 +1,9 @@
+/* contrib/unaccent/unaccent--1.1--1.2.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION unaccent UPDATE TO '1.2'" to load this file. \quit
+
+CREATE FUNCTION unaccent_init(internal,internal)
+ RETURNS internal
+ AS 'MODULE_PATHNAME', 'unaccent_init'
+ LANGUAGE C PARALLEL SAFE;
diff --git a/contrib/unaccent/unaccent--1.1.sql b/contrib/unaccent/unaccent--1.2.sql
similarity index 94%
rename from contrib/unaccent/unaccent--1.1.sql
rename to contrib/unaccent/unaccent--1.2.sql
index ecc8651780..d6ce193e82 100644
--- a/contrib/unaccent/unaccent--1.1.sql
+++ b/contrib/unaccent/unaccent--1.2.sql
@@ -13,7 +13,7 @@ CREATE FUNCTION unaccent(text)
AS 'MODULE_PATHNAME', 'unaccent_dict'
LANGUAGE C STABLE STRICT PARALLEL SAFE;
-CREATE FUNCTION unaccent_init(internal)
+CREATE FUNCTION unaccent_init(internal,internal)
RETURNS internal
AS 'MODULE_PATHNAME', 'unaccent_init'
LANGUAGE C PARALLEL SAFE;
diff --git a/contrib/unaccent/unaccent.control b/contrib/unaccent/unaccent.control
index a77a65f891..aec53b5ad5 100644
--- a/contrib/unaccent/unaccent.control
+++ b/contrib/unaccent/unaccent.control
@@ -1,5 +1,5 @@
# unaccent extension
comment = 'text search dictionary that removes accents'
-default_version = '1.1'
+default_version = '1.2'
module_pathname = '$libdir/unaccent'
relocatable = true
diff --git a/src/backend/commands/tsearchcmds.c b/src/backend/commands/tsearchcmds.c
index 42be77d045..01230a2936 100644
--- a/src/backend/commands/tsearchcmds.c
+++ b/src/backend/commands/tsearchcmds.c
@@ -664,7 +664,7 @@ get_ts_template_func(DefElem *defel, int attnum)
switch (attnum)
{
case Anum_pg_ts_template_tmplinit:
- nargs = 1;
+ nargs = 2;
break;
case Anum_pg_ts_template_tmpllexize:
nargs = 4;
diff --git a/src/backend/snowball/snowball_func.sql.in b/src/backend/snowball/snowball_func.sql.in
index c02dad43e3..9b85e41ff8 100644
--- a/src/backend/snowball/snowball_func.sql.in
+++ b/src/backend/snowball/snowball_func.sql.in
@@ -19,7 +19,7 @@
SET search_path = pg_catalog;
-CREATE FUNCTION dsnowball_init(INTERNAL)
+CREATE FUNCTION dsnowball_init(INTERNAL, INTERNAL)
RETURNS INTERNAL AS '$libdir/dict_snowball', 'dsnowball_init'
LANGUAGE C STRICT;
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index f01648c961..4c45c432e7 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -4853,22 +4853,22 @@ DESCR("(internal)");
DATA(insert OID = 3723 ( ts_lexize PGNSP PGUID 12 1 0 0 0 f f f f t f i s 2 0 1009 "3769 25" _null_ _null_ _null_ _null_ _null_ ts_lexize _null_ _null_ _null_ ));
DESCR("normalize one word by dictionary");
-DATA(insert OID = 3725 ( dsimple_init PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 2281 "2281" _null_ _null_ _null_ _null_ _null_ dsimple_init _null_ _null_ _null_ ));
+DATA(insert OID = 3725 ( dsimple_init PGNSP PGUID 12 1 0 0 0 f f f f t f i s 2 0 2281 "2281 2281" _null_ _null_ _null_ _null_ _null_ dsimple_init _null_ _null_ _null_ ));
DESCR("(internal)");
DATA(insert OID = 3726 ( dsimple_lexize PGNSP PGUID 12 1 0 0 0 f f f f t f i s 4 0 2281 "2281 2281 2281 2281" _null_ _null_ _null_ _null_ _null_ dsimple_lexize _null_ _null_ _null_ ));
DESCR("(internal)");
-DATA(insert OID = 3728 ( dsynonym_init PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 2281 "2281" _null_ _null_ _null_ _null_ _null_ dsynonym_init _null_ _null_ _null_ ));
+DATA(insert OID = 3728 ( dsynonym_init PGNSP PGUID 12 1 0 0 0 f f f f t f i s 2 0 2281 "2281 2281" _null_ _null_ _null_ _null_ _null_ dsynonym_init _null_ _null_ _null_ ));
DESCR("(internal)");
DATA(insert OID = 3729 ( dsynonym_lexize PGNSP PGUID 12 1 0 0 0 f f f f t f i s 4 0 2281 "2281 2281 2281 2281" _null_ _null_ _null_ _null_ _null_ dsynonym_lexize _null_ _null_ _null_ ));
DESCR("(internal)");
-DATA(insert OID = 3731 ( dispell_init PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 2281 "2281" _null_ _null_ _null_ _null_ _null_ dispell_init _null_ _null_ _null_ ));
+DATA(insert OID = 3731 ( dispell_init PGNSP PGUID 12 1 0 0 0 f f f f t f i s 2 0 2281 "2281 2281" _null_ _null_ _null_ _null_ _null_ dispell_init _null_ _null_ _null_ ));
DESCR("(internal)");
DATA(insert OID = 3732 ( dispell_lexize PGNSP PGUID 12 1 0 0 0 f f f f t f i s 4 0 2281 "2281 2281 2281 2281" _null_ _null_ _null_ _null_ _null_ dispell_lexize _null_ _null_ _null_ ));
DESCR("(internal)");
-DATA(insert OID = 3740 ( thesaurus_init PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 2281 "2281" _null_ _null_ _null_ _null_ _null_ thesaurus_init _null_ _null_ _null_ ));
+DATA(insert OID = 3740 ( thesaurus_init PGNSP PGUID 12 1 0 0 0 f f f f t f i s 2 0 2281 "2281 2281" _null_ _null_ _null_ _null_ _null_ thesaurus_init _null_ _null_ _null_ ));
DESCR("(internal)");
DATA(insert OID = 3741 ( thesaurus_lexize PGNSP PGUID 12 1 0 0 0 f f f f t f i s 4 0 2281 "2281 2281 2281 2281" _null_ _null_ _null_ _null_ _null_ thesaurus_lexize _null_ _null_ _null_ ));
DESCR("(internal)");
0005-pg-ts-shared-dictinaries-view-v4.patch
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 71e20f2740..00faef73ed 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -8228,6 +8228,11 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
<entry>time zone names</entry>
</row>
+ <row>
+ <entry><link linkend="view-pg-ts-shared-dictionaries"><structname>pg_ts_shared_dictionaries</structname></link></entry>
+ <entry>dictionaries currently in shared memory</entry>
+ </row>
+
<row>
<entry><link linkend="view-pg-user"><structname>pg_user</structname></link></entry>
<entry>database users</entry>
@@ -10983,6 +10988,63 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</sect1>
+ <sect1 id="view-pg-ts-shared-dictionaries">
+ <title><structname>pg_ts_shared_dictionaries</structname></title>
+
+ <indexterm zone="view-pg-ts-shared-dictionaries">
+ <primary>pg_ts_shared_dictionaries</primary>
+ </indexterm>
+
+ <para>
+ The <structname>pg_ts_shared_dictionaries</structname> view provides a
+ listing of all text search dictionaries that are currently allocated in
+ shared memory. The size of available space in shared memory is controlled by
+ <xref linkend="guc-shared-buffers"/>. A dictionary may have an option which
+ controls allocation in shared memory (see <xref linkend="textsearch-ispell-dictionary"/>).
+ </para>
+
+ <table>
+ <title><structname>pg_ts_shared_dictionaries</structname> Columns</title>
+
+ <tgroup cols="4">
+ <thead>
+ <row>
+ <entry>Name</entry>
+ <entry>Type</entry>
+ <entry>References</entry>
+ <entry>Description</entry>
+ </row>
+ </thead>
+ <tbody>
+ <row>
+ <entry><structfield>dictoid</structfield></entry>
+ <entry><type>oid</type></entry>
+ <entry><literal><link linkend="catalog-pg-ts-dict"><structname>pg_ts_dict</structname></link>.oid</literal></entry>
+ <entry>The OID of the text search dictionary located in shared memory</entry>
+ </row>
+ <row>
+ <entry><structfield>schemaname</structfield></entry>
+ <entry><type>name</type></entry>
+ <entry><literal><link linkend="catalog-pg-namespace"><structname>pg_namespace</structname></link>.nspname</literal></entry>
+ <entry>The name of the schema containing the text search dictionary</entry>
+ </row>
+ <row>
+ <entry><structfield>dictname</structfield></entry>
+ <entry><type>name</type></entry>
+ <entry><literal><link linkend="catalog-pg-ts-dict"><structname>pg_ts_dict</structname></link>.dictname</literal></entry>
+ <entry>The text search dictionary name</entry>
+ </row>
+ <row>
+ <entry><structfield>size</structfield></entry>
+ <entry><type>bigint</type></entry>
+ <entry></entry>
+ <entry>Size of the text search dictionary in bytes</entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+ </sect1>
+
<sect1 id="view-pg-user">
<title><structname>pg_user</structname></title>
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index 20d637dde3..e1829277d0 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -3044,6 +3044,12 @@ CREATE TEXT SEARCH DICTIONARY english_stem (
parameter value greater than zero before server starting.
</para>
+ <para>
+ The list of dictionaries currently located in shared memory can be retrieved
+ from the <link linkend="view-pg-ts-shared-dictionaries"><structname>pg_ts_shared_dictionaries</structname></link>
+ view.
+ </para>
+
</sect2>
</sect1>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 5652e9ee6d..c663db3cf2 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -504,6 +504,9 @@ CREATE VIEW pg_config AS
REVOKE ALL on pg_config FROM PUBLIC;
REVOKE EXECUTE ON FUNCTION pg_config() FROM PUBLIC;
+CREATE VIEW pg_ts_shared_dictionaries AS
+ SELECT * FROM pg_ts_shared_dictionaries();
+
-- Statistics views
CREATE VIEW pg_stat_all_tables AS
diff --git a/src/backend/tsearch/ts_shared.c b/src/backend/tsearch/ts_shared.c
index 7d1f7544cf..ff3127f207 100644
--- a/src/backend/tsearch/ts_shared.c
+++ b/src/backend/tsearch/ts_shared.c
@@ -13,11 +13,18 @@
*/
#include "postgres.h"
+#include "funcapi.h"
+#include "miscadmin.h"
+
+#include "access/htup_details.h"
+#include "catalog/pg_ts_dict.h"
#include "lib/dshash.h"
#include "storage/lwlock.h"
#include "storage/shmem.h"
#include "tsearch/ts_shared.h"
+#include "utils/builtins.h"
#include "utils/hashutils.h"
+#include "utils/lsyscache.h"
#include "utils/memutils.h"
@@ -364,3 +371,100 @@ recheck_table:
MemoryContextSwitchTo(old_context);
}
+
+/*
+ * pg_ts_shared_dictionaries - SQL SRF showing dictionaries currently in
+ * shared memory.
+ */
+Datum
+pg_ts_shared_dictionaries(PG_FUNCTION_ARGS)
+{
+ ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
+ MemoryContext oldcontext;
+ TupleDesc tupdesc;
+ Tuplestorestate *tupstore;
+ Relation rel;
+ HeapTuple tuple;
+ SysScanDesc scan;
+
+ /* check to see if caller supports us returning a tuplestore */
+ if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("set-valued function called in context that cannot accept a set")));
+ if (!(rsinfo->allowedModes & SFRM_Materialize))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("materialize mode required, but it is not " \
+ "allowed in this context")));
+
+ /* Build a tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ elog(ERROR, "return type must be a row type");
+
+ /* Build tuplestore to hold the result rows */
+ oldcontext = MemoryContextSwitchTo(rsinfo->econtext->ecxt_per_query_memory);
+
+ tupstore = tuplestore_begin_heap(true, false, work_mem);
+ rsinfo->returnMode = SFRM_Materialize;
+ rsinfo->setResult = tupstore;
+ rsinfo->setDesc = tupdesc;
+
+ MemoryContextSwitchTo(oldcontext);
+
+ init_dict_table();
+
+ /*
+ * If the hash table wasn't created, return zero records.
+ */
+ if (!DsaPointerIsValid(tsearch_ctl->dict_table_handle))
+ {
+ tuplestore_donestoring(tupstore);
+
+ PG_RETURN_VOID();
+ }
+
+ /* Start to scan pg_ts_dict */
+ rel = heap_open(TSDictionaryRelationId, AccessShareLock);
+ scan = systable_beginscan(rel, InvalidOid, false, NULL, 0, NULL);
+
+ while (HeapTupleIsValid(tuple = systable_getnext(scan)))
+ {
+ Datum values[4];
+ bool nulls[4];
+ Form_pg_ts_dict dict = (Form_pg_ts_dict) GETSTRUCT(tuple);
+ Oid dictid = HeapTupleGetOid(tuple);
+ TsearchDictEntry *entry;
+ NameData dict_name;
+
+ /* Skip dictionaries that aren't located in shared memory */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, &dictid, false);
+ if (!entry)
+ continue;
+
+ namecpy(&dict_name, &dict->dictname);
+
+ memset(nulls, 0, sizeof(nulls));
+
+ values[0] = ObjectIdGetDatum(dictid);
+
+ if (OidIsValid(dict->dictnamespace))
+ values[1] = CStringGetDatum(get_namespace_name(dict->dictnamespace));
+ else
+ nulls[1] = true;
+
+ values[2] = NameGetDatum(&dict_name);
+ values[3] = Int64GetDatum(entry->dict_size);
+
+ dshash_release_lock(dict_table, entry);
+
+ tuplestore_putvalues(tupstore, tupdesc, values, nulls);
+ }
+
+ systable_endscan(scan);
+ heap_close(rel, AccessShareLock);
+
+ tuplestore_donestoring(tupstore);
+
+ PG_RETURN_VOID();
+}
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 4c45c432e7..1a6e7662ef 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -4933,6 +4933,9 @@ DESCR("trigger for automatic update of tsvector column");
DATA(insert OID = 3759 ( get_current_ts_config PGNSP PGUID 12 1 0 0 0 f f f f t f s s 0 0 3734 "" _null_ _null_ _null_ _null_ _null_ get_current_ts_config _null_ _null_ _null_ ));
DESCR("get current tsearch configuration");
+DATA(insert OID = 4213 ( pg_ts_shared_dictionaries PGNSP PGUID 12 1 10 0 0 f f f f f t s s 0 0 2249 "" "{26,19,19,20}" "{o,o,o,o}" "{dictoid,schemaname,dictname,size}" _null_ _null_ pg_ts_shared_dictionaries _null_ _null_ _null_ ));
+DESCR("information about text search dictionaries currently in shared memory");
+
DATA(insert OID = 3736 ( regconfigin PGNSP PGUID 12 1 0 0 0 f f f f t f s s 1 0 3734 "2275" _null_ _null_ _null_ _null_ _null_ regconfigin _null_ _null_ _null_ ));
DESCR("I/O");
DATA(insert OID = 3737 ( regconfigout PGNSP PGUID 12 1 0 0 0 f f f f t f s s 1 0 2275 "3734" _null_ _null_ _null_ _null_ _null_ regconfigout _null_ _null_ _null_ ));
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 5433944c6a..235b066119 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2209,6 +2209,11 @@ pg_timezone_names| SELECT pg_timezone_names.name,
pg_timezone_names.utc_offset,
pg_timezone_names.is_dst
FROM pg_timezone_names() pg_timezone_names(name, abbrev, utc_offset, is_dst);
+pg_ts_shared_dictionaries| SELECT pg_ts_shared_dictionaries.dictoid,
+ pg_ts_shared_dictionaries.schemaname,
+ pg_ts_shared_dictionaries.dictname,
+ pg_ts_shared_dictionaries.size
+ FROM pg_ts_shared_dictionaries() pg_ts_shared_dictionaries(dictoid, schemaname, dictname, size);
pg_user| SELECT pg_shadow.usename,
pg_shadow.usesysid,
pg_shadow.usecreatedb,
0006-Shared-memory-ispell-option-v4.patch
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index e1829277d0..c534487743 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -2828,6 +2828,7 @@ iconv -f ISO_8859-1 -t UTF-8 -o nn_no.dict nn_NO.dic
<programlisting>
CREATE TEXT SEARCH DICTIONARY english_hunspell (
TEMPLATE = ispell,
+ Shareable = false,
DictFile = en_us,
AffFile = en_us,
Stopwords = english);
@@ -2842,6 +2843,9 @@ CREATE TEXT SEARCH DICTIONARY english_hunspell (
The stop-words file has the same format explained above for the
<literal>simple</literal> dictionary type. The format of the other files is
not specified here but is available from the above-mentioned web sites.
+ The <literal>Shareable</literal> option controls whether the dictionary is
+ loaded into shared memory. It is <literal>true</literal> by default (see
+ <xref linkend="textsearch-shared-dictionaries"/>).
</para>
<para>
@@ -3036,7 +3040,8 @@ CREATE TEXT SEARCH DICTIONARY english_stem (
Some dictionaries, especially <application>Ispell</application>, consumes a
noticable value of memory. Size of a dictionary can reach tens of megabytes.
Most of them also stores configuration in text files. A dictionary is compiled
- during first access per a user session.
+ during first access per a user session. Currently only
+ <application>Ispell</application> supports loading into shared memory.
</para>
<para>
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index 60ef770dbd..5384f7d87a 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -38,7 +38,8 @@ typedef struct
} DictISpell;
static void parse_dictoptions(List *dictoptions,
- char **dictfile, char **afffile, char **stopfile);
+ char **dictfile, char **afffile, char **stopfile,
+ bool *isshared);
static void *dispell_build(List *dictoptions, Size *size);
Datum
@@ -49,15 +50,22 @@ dispell_init(PG_FUNCTION_ARGS)
DictISpell *d;
void *dict_location;
char *stopfile;
+ bool isshared;
d = (DictISpell *) palloc0(sizeof(DictISpell));
- parse_dictoptions(dictoptions, NULL, NULL, &stopfile);
+ parse_dictoptions(dictoptions, NULL, NULL, &stopfile, &isshared);
+ /* Make stop word list */
if (stopfile)
readstoplist(stopfile, &(d->stoplist), lowerstr);
- dict_location = ts_dict_shmem_location(dictid, dictoptions, dispell_build);
+ /* Build the dictionary itself, or get it from shared memory */
+ if (isshared)
+ dict_location = ts_dict_shmem_location(dictid, dictoptions, dispell_build);
+ else
+ dict_location = dispell_build(dictoptions, NULL);
+
Assert(dict_location);
d->obj.dict = (IspellDictData *) dict_location;
@@ -111,9 +119,10 @@ dispell_lexize(PG_FUNCTION_ARGS)
static void
parse_dictoptions(List *dictoptions, char **dictfile, char **afffile,
- char **stopfile)
+ char **stopfile, bool *isshared)
{
ListCell *l;
+ bool isshared_defined = false;
if (dictfile)
*dictfile = NULL;
@@ -121,6 +130,8 @@ parse_dictoptions(List *dictoptions, char **dictfile, char **afffile,
*afffile = NULL;
if (stopfile)
*stopfile = NULL;
+ if (isshared)
+ *isshared = true;
foreach(l, dictoptions)
{
@@ -159,6 +170,19 @@ parse_dictoptions(List *dictoptions, char **dictfile, char **afffile,
errmsg("multiple StopWords parameters")));
*stopfile = defGetString(defel);
}
+ else if (pg_strcasecmp(defel->defname, "Shareable") == 0)
+ {
+ if (!isshared)
+ continue;
+
+ if (isshared_defined)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("multiple Shareable parameters")));
+
+ *isshared = defGetBoolean(defel);
+ isshared_defined = true;
+ }
else
{
ereport(ERROR,
@@ -181,7 +205,7 @@ dispell_build(List *dictoptions, Size *size)
char *dictfile,
*afffile;
- parse_dictoptions(dictoptions, &dictfile, &afffile, NULL);
+ parse_dictoptions(dictoptions, &dictfile, &afffile, NULL, NULL);
if (!afffile)
{
@@ -213,6 +237,7 @@ dispell_build(List *dictoptions, Size *size)
NIFinishBuild(&build);
/* Return the buffer and its size */
- *size = build.dict_size;
+ if (size)
+ *size = build.dict_size;
return build.dict;
}
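
The handling of the Shareable option in parse_dictoptions() above (default true, case-insensitive name match, error on a repeated option) can be sketched outside the backend roughly as follows. This is only an illustration under stated assumptions: DemoOption, DemoStatus, demo_ieq() and demo_parse_shareable() are hypothetical names, status codes replace ereport(), and only the spellings "true" and "false" are accepted, whereas defGetBoolean() in the backend accepts more forms.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a DefElem: option name and its string value */
typedef struct
{
	const char *name;
	const char *value;
} DemoOption;

typedef enum
{
	DEMO_OK,
	DEMO_DUPLICATE,				/* "multiple Shareable parameters" */
	DEMO_BAD_VALUE
} DemoStatus;

/* Case-insensitive string equality, in the spirit of pg_strcasecmp() == 0 */
static bool
demo_ieq(const char *a, const char *b)
{
	while (*a && *b)
	{
		if (tolower((unsigned char) *a) != tolower((unsigned char) *b))
			return false;
		a++;
		b++;
	}
	return *a == *b;
}

/*
 * Scan the option list for Shareable. The result defaults to true when the
 * option is absent; a second occurrence is reported as an error.
 */
static DemoStatus
demo_parse_shareable(const DemoOption *opts, size_t nopts, bool *isshared)
{
	bool		defined = false;
	size_t		i;

	assert(isshared != NULL);
	assert(opts != NULL || nopts == 0);

	*isshared = true;			/* default when the option is not given */

	for (i = 0; i < nopts; i++)
	{
		if (!demo_ieq(opts[i].name, "Shareable"))
			continue;			/* other options are handled elsewhere */

		if (defined)
			return DEMO_DUPLICATE;

		if (demo_ieq(opts[i].value, "true"))
			*isshared = true;
		else if (demo_ieq(opts[i].value, "false"))
			*isshared = false;
		else
			return DEMO_BAD_VALUE;

		defined = true;
	}

	return DEMO_OK;
}
```

With this shape, a dictionary created without the option stays shared, while Shareable = false makes the caller build a private copy, as dispell_init() does in the hunk above.
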
diff --git a/src/test/regress/expected/tsdicts.out b/src/test/regress/expected/tsdicts.out
index 0744ef803b..932d75acec 100644
--- a/src/test/regress/expected/tsdicts.out
+++ b/src/test/regress/expected/tsdicts.out
@@ -194,6 +194,7 @@ SELECT ts_lexize('hunspell', 'footballyklubber');
-- Test ISpell dictionary with hunspell affix file with FLAG long parameter
CREATE TEXT SEARCH DICTIONARY hunspell_long (
Template=ispell,
+ Shareable=false,
DictFile=hunspell_sample_long,
AffFile=hunspell_sample_long
);
@@ -290,6 +291,7 @@ SELECT ts_lexize('hunspell_long', 'footballyklubber');
-- Test ISpell dictionary with hunspell affix file with FLAG num parameter
CREATE TEXT SEARCH DICTIONARY hunspell_num (
Template=ispell,
+ Shareable=false,
DictFile=hunspell_sample_num,
AffFile=hunspell_sample_num
);
@@ -580,3 +582,58 @@ SELECT to_tsvector('thesaurus_tst', 'Booking tickets is looking like a booking a
'card':3,10 'invit':2,9 'like':6 'look':5 'order':1,8
(1 row)
+-- Test shared dictionaries
+CREATE TEXT SEARCH DICTIONARY shared_ispell (
+ Template=ispell,
+ DictFile=ispell_sample,
+ AffFile=ispell_sample
+);
+-- Make sure that dictionaries in shared memory
+SELECT ts_lexize('ispell', 'skies');
+ ts_lexize
+-----------
+ {sky}
+(1 row)
+
+SELECT ts_lexize('hunspell', 'skies');
+ ts_lexize
+-----------
+ {sky}
+(1 row)
+
+SELECT ts_lexize('shared_ispell', 'skies');
+ ts_lexize
+-----------
+ {sky}
+(1 row)
+
+SELECT schemaname, dictname FROM pg_ts_shared_dictionaries;
+ schemaname | dictname
+------------+---------------
+ public | ispell
+ public | hunspell
+ public | shared_ispell
+(3 rows)
+
+-- shared_ispell space should be released in shared memory
+DROP TEXT SEARCH DICTIONARY shared_ispell;
+-- Make sure that dictionaries in shared memory, DROP invalidates cache
+SELECT ts_lexize('ispell', 'skies');
+ ts_lexize
+-----------
+ {sky}
+(1 row)
+
+SELECT ts_lexize('hunspell', 'skies');
+ ts_lexize
+-----------
+ {sky}
+(1 row)
+
+SELECT schemaname, dictname FROM pg_ts_shared_dictionaries;
+ schemaname | dictname
+------------+----------
+ public | ispell
+ public | hunspell
+(2 rows)
+
diff --git a/src/test/regress/sql/tsdicts.sql b/src/test/regress/sql/tsdicts.sql
index a5a569e1ad..04c1161141 100644
--- a/src/test/regress/sql/tsdicts.sql
+++ b/src/test/regress/sql/tsdicts.sql
@@ -51,6 +51,7 @@ SELECT ts_lexize('hunspell', 'footballyklubber');
-- Test ISpell dictionary with hunspell affix file with FLAG long parameter
CREATE TEXT SEARCH DICTIONARY hunspell_long (
Template=ispell,
+ Shareable=false,
DictFile=hunspell_sample_long,
AffFile=hunspell_sample_long
);
@@ -75,6 +76,7 @@ SELECT ts_lexize('hunspell_long', 'footballyklubber');
-- Test ISpell dictionary with hunspell affix file with FLAG num parameter
CREATE TEXT SEARCH DICTIONARY hunspell_num (
Template=ispell,
+ Shareable=false,
DictFile=hunspell_sample_num,
AffFile=hunspell_sample_num
);
@@ -188,3 +190,26 @@ ALTER TEXT SEARCH CONFIGURATION thesaurus_tst ALTER MAPPING FOR
SELECT to_tsvector('thesaurus_tst', 'one postgres one two one two three one');
SELECT to_tsvector('thesaurus_tst', 'Supernovae star is very new star and usually called supernovae (abbreviation SN)');
SELECT to_tsvector('thesaurus_tst', 'Booking tickets is looking like a booking a tickets');
+
+-- Test shared dictionaries
+CREATE TEXT SEARCH DICTIONARY shared_ispell (
+ Template=ispell,
+ DictFile=ispell_sample,
+ AffFile=ispell_sample
+);
+
+-- Make sure that dictionaries in shared memory
+SELECT ts_lexize('ispell', 'skies');
+SELECT ts_lexize('hunspell', 'skies');
+SELECT ts_lexize('shared_ispell', 'skies');
+
+SELECT schemaname, dictname FROM pg_ts_shared_dictionaries;
+
+-- shared_ispell space should be released in shared memory
+DROP TEXT SEARCH DICTIONARY shared_ispell;
+
+-- Make sure that dictionaries in shared memory, DROP invalidates cache
+SELECT ts_lexize('ispell', 'skies');
+SELECT ts_lexize('hunspell', 'skies');
+
+SELECT schemaname, dictname FROM pg_ts_shared_dictionaries;
On Thu, Jan 25, 2018 at 07:51:58PM +0300, Arthur Zakirov wrote:
Attached new version of the patch.
Here is a rebased version of the patch. The rebase was needed because of recent changes to dict_ispell.c.
The patch itself is unchanged.
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Attachments:
0001-Fix-ispell-memory-handling-v5.patch (text/plain; charset=us-ascii)
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index b9fdd77e19..e071994523 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -78,6 +78,8 @@
#define tmpalloc(sz) MemoryContextAlloc(Conf->buildCxt, (sz))
#define tmpalloc0(sz) MemoryContextAllocZero(Conf->buildCxt, (sz))
+#define tmpstrdup(str) MemoryContextStrdup(Conf->buildCxt, (str))
+
/*
* Prepare for constructing an ISpell dictionary.
*
@@ -498,7 +500,7 @@ NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
strcpy(Conf->Spell[Conf->nspell]->word, word);
Conf->Spell[Conf->nspell]->p.flag = (*flag != '\0')
- ? cpstrdup(Conf, flag) : VoidString;
+ ? tmpstrdup(flag) : VoidString;
Conf->nspell++;
}
@@ -1040,7 +1042,7 @@ setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
entry->flag.i = i;
}
else
- entry->flag.s = cpstrdup(Conf, s);
+ entry->flag.s = tmpstrdup(s);
entry->flagMode = Conf->flagMode;
entry->value = val;
@@ -1536,6 +1538,9 @@ nextline:
return;
isnewformat:
+ pfree(recoded);
+ pfree(pstr);
+
if (oldformat)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
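The idea behind tmpstrdup() in the patch above is that flag strings needed only while the dictionary is being compiled should live in the short-lived build context, so they disappear when that context is deleted. The following is a minimal sketch of that pattern in plain C. It is not PostgreSQL code: ToyContext, toy_alloc(), toy_strdup() and toy_reset() are hypothetical names that stand in for a memory context, MemoryContextAlloc(), MemoryContextStrdup() and MemoryContextDelete().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a memory context (hypothetical, not the PostgreSQL API). */
typedef struct ToyChunk
{
	struct ToyChunk *next;		/* next allocation owned by the same context */
} ToyChunk;

typedef struct
{
	ToyChunk   *chunks;			/* every allocation made in this context */
	size_t		nchunks;		/* number of live allocations */
} ToyContext;

/* Allocate size bytes that are owned by cxt. */
static void *
toy_alloc(ToyContext *cxt, size_t size)
{
	ToyChunk   *chunk;

	assert(cxt != NULL);
	chunk = malloc(sizeof(ToyChunk) + size);
	if (chunk == NULL)
		return NULL;
	chunk->next = cxt->chunks;
	cxt->chunks = chunk;
	cxt->nchunks++;
	return chunk + 1;			/* payload starts right after the header */
}

/* Duplicate str into cxt, in the spirit of tmpstrdup(). */
static char *
toy_strdup(ToyContext *cxt, const char *str)
{
	size_t		len = strlen(str) + 1;
	char	   *copy = toy_alloc(cxt, len);

	if (copy != NULL)
		memcpy(copy, str, len);
	return copy;
}

/* Free everything owned by cxt at once; returns the number of chunks freed. */
static size_t
toy_reset(ToyContext *cxt)
{
	size_t		freed = 0;

	assert(cxt != NULL);
	while (cxt->chunks != NULL)
	{
		ToyChunk   *next = cxt->chunks->next;

		free(cxt->chunks);
		cxt->chunks = next;
		freed++;
	}
	cxt->nchunks = 0;
	return freed;
}
```

With this layout the caller never frees individual strings: a single toy_reset() at the end of the build releases every temporary copy, which is the behaviour the patch relies on when NIFinishBuild() deletes buildCxt.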
0002-Retreive-shmem-location-for-ispell-v5.patch (text/plain; charset=us-ascii)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index c45979dee4..725473b7c2 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1364,6 +1364,35 @@ include_dir 'conf.d'
</listitem>
</varlistentry>
+ <varlistentry id="guc-max-shared-dictionaries-size" xreflabel="max_shared_dictionaries_size">
+ <term><varname>max_shared_dictionaries_size</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>max_shared_dictionaries_size</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum size of all text search dictionaries loaded into shared
+ memory. The default is 100 megabytes (<literal>100MB</literal>). This
+ parameter can only be set at server start.
+ </para>
+
+ <para>
+ Currently controls only loading of <application>Ispell</application>
+ dictionaries (see <xref linkend="textsearch-ispell-dictionary"/>).
+ After a dictionary is compiled, it is copied into shared memory.
+ Other backends then use the dictionary from shared memory on first
+ access, so they do not need to compile it a second time.
+ </para>
+
+ <para>
+ If the total size of simultaneously loaded dictionaries reaches the maximum
+ allowed size, a new dictionary is loaded into the local memory of
+ a backend.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-huge-pages" xreflabel="huge_pages">
<term><varname>huge_pages</varname> (<type>enum</type>)
<indexterm>
diff --git a/src/backend/commands/tsearchcmds.c b/src/backend/commands/tsearchcmds.c
index 3a843512d1..b6aeae449b 100644
--- a/src/backend/commands/tsearchcmds.c
+++ b/src/backend/commands/tsearchcmds.c
@@ -39,6 +39,7 @@
#include "nodes/makefuncs.h"
#include "parser/parse_func.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_shared.h"
#include "tsearch/ts_utils.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
@@ -396,7 +397,8 @@ verify_dictoptions(Oid tmplId, List *dictoptions)
* Call the init method and see if it complains. We don't worry about
* it leaking memory, since our command will soon be over anyway.
*/
- (void) OidFunctionCall1(initmethod, PointerGetDatum(dictoptions));
+ (void) OidFunctionCall2(initmethod, PointerGetDatum(dictoptions),
+ ObjectIdGetDatum(InvalidOid));
}
ReleaseSysCache(tup);
@@ -513,6 +515,8 @@ RemoveTSDictionaryById(Oid dictId)
CatalogTupleDelete(relation, &tup->t_self);
+ ts_dict_shmem_release(dictId);
+
ReleaseSysCache(tup);
heap_close(relation, RowExclusiveLock);
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 0c86a581c0..c7dce8cac5 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -44,6 +44,7 @@
#include "storage/procsignal.h"
#include "storage/sinvaladt.h"
#include "storage/spin.h"
+#include "tsearch/ts_shared.h"
#include "utils/backend_random.h"
#include "utils/snapmgr.h"
@@ -150,6 +151,7 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
size = add_size(size, SyncScanShmemSize());
size = add_size(size, AsyncShmemSize());
size = add_size(size, BackendRandomShmemSize());
+ size = add_size(size, TsearchShmemSize());
#ifdef EXEC_BACKEND
size = add_size(size, ShmemBackendArraySize());
#endif
@@ -271,6 +273,11 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
AsyncShmemInit();
BackendRandomShmemInit();
+ /*
+ * Set up shared memory for tsearch
+ */
+ TsearchShmemInit();
+
#ifdef EXEC_BACKEND
/*
diff --git a/src/backend/tsearch/Makefile b/src/backend/tsearch/Makefile
index 227468ae9e..860cd196e9 100644
--- a/src/backend/tsearch/Makefile
+++ b/src/backend/tsearch/Makefile
@@ -26,7 +26,7 @@ DICTFILES_PATH=$(addprefix dicts/,$(DICTFILES))
OBJS = ts_locale.o ts_parse.o wparser.o wparser_def.o dict.o \
dict_simple.o dict_synonym.o dict_thesaurus.o \
dict_ispell.o regis.o spell.o \
- to_tsany.o ts_selfuncs.o ts_typanalyze.o ts_utils.o
+ to_tsany.o ts_selfuncs.o ts_shared.o ts_typanalyze.o ts_utils.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/tsearch/ts_shared.c b/src/backend/tsearch/ts_shared.c
new file mode 100644
index 0000000000..7d1f7544cf
--- /dev/null
+++ b/src/backend/tsearch/ts_shared.c
@@ -0,0 +1,366 @@
+/*-------------------------------------------------------------------------
+ *
+ * ts_shared.c
+ * tsearch shared memory management
+ *
+ * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/tsearch/ts_shared.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "lib/dshash.h"
+#include "storage/lwlock.h"
+#include "storage/shmem.h"
+#include "tsearch/ts_shared.h"
+#include "utils/hashutils.h"
+#include "utils/memutils.h"
+
+
+/*
+ * Hash table structures
+ */
+typedef struct
+{
+ Oid dict_id;
+ dsm_handle dict_dsm;
+ Size dict_size;
+
+ /* How many backends have DSM mapping */
+ uint32 refcnt;
+} TsearchDictEntry;
+
+static dshash_table *dict_table = NULL;
+
+/*
+ * Shared struct for locking
+ */
+typedef struct
+{
+ dsa_handle area;
+ dshash_table_handle dict_table_handle;
+
+ /* Total size of loaded dictionaries into shared memory in bytes */
+ Size loaded_size;
+
+ LWLock lock;
+} TsearchCtlData;
+
+static TsearchCtlData *tsearch_ctl;
+
+/*
+ * GUC variable for the maximum total size of shared dictionaries, measured
+ * in kilobytes. Default value is 100MB.
+ */
+int max_shared_dictionaries_size = 100 * 1024;
+
+static void init_dict_table(void);
+
+/* Parameters for dict_table */
+static const dshash_parameters dict_table_params ={
+ sizeof(Oid),
+ sizeof(TsearchDictEntry),
+ dshash_memcmp,
+ dshash_memhash,
+ LWTRANCHE_TSEARCH_TABLE
+};
+
+/*
+ * Build the dictionary using the allocate_cb callback. If
+ * max_shared_dictionaries_size is greater than 0 and there is space in shared
+ * memory, copy the dictionary into DSM.
+ *
+ * If max_shared_dictionaries_size is greater than 0 then try to find the
+ * dictionary in the shared hash table first. If it was built by another
+ * backend earlier, just return its location in DSM.
+ *
+ * dictid: Oid of the dictionary.
+ * dictoptions: dictionary options, passed to the callback function.
+ * allocate_cb: function to build the dictionary, if it wasn't found in DSM.
+ *
+ * Returns address in the dynamic shared memory segment or in backend memory.
+ */
+void *
+ts_dict_shmem_location(Oid dictid, List *dictoptions,
+ ispell_build_callback allocate_cb)
+{
+ TsearchDictEntry *entry;
+ bool found;
+ dsm_segment *seg;
+ void *dict,
+ *dict_location;
+
+#define CHECK_SHARED_SPACE() \
+ if (entry->dict_size + tsearch_ctl->loaded_size > \
+ max_shared_dictionaries_size * 1024L) \
+ { \
+ LWLockRelease(&tsearch_ctl->lock); \
+ ereport(LOG, \
+ (errmsg("there is no space in shared memory for text search " \
+ "dictionary %u, it will be loaded into backend's memory", \
+ dictid))); \
+ dshash_delete_entry(dict_table, entry); \
+ return dict; \
+ } \
+
+ init_dict_table();
+
+ /*
+ * Build the dictionary in the backend's memory if the hash table wasn't
+ * created or dictid is invalid (this may happen if the dictionary's init
+ * method was called within verify_dictoptions()).
+ */
+ if (!DsaPointerIsValid(tsearch_ctl->dict_table_handle) ||
+ !OidIsValid(dictid))
+ {
+ Size dict_size;
+
+ dict = allocate_cb(dictoptions, &dict_size);
+
+ return dict;
+ }
+
+ /* Try to find an entry in the hash table */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, &dictid, false);
+
+ if (entry)
+ {
+ seg = dsm_find_mapping(entry->dict_dsm);
+ if (!seg)
+ {
+ seg = dsm_attach(entry->dict_dsm);
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+ }
+
+ entry->refcnt++;
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+ }
+
+ /* The dictionary hasn't been loaded into shared memory yet */
+ entry = (TsearchDictEntry *) dshash_find_or_insert(dict_table, &dictid,
+ &found);
+
+ if (found)
+ {
+ /*
+ * Someone concurrently inserted a dictionary entry since the first time
+ * we checked.
+ */
+ seg = dsm_attach(entry->dict_dsm);
+
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+
+ entry->refcnt++;
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+ }
+
+ /* Build the dictionary */
+ dict = allocate_cb(dictoptions, &entry->dict_size);
+
+ LWLockAcquire(&tsearch_ctl->lock, LW_SHARED);
+
+ /* Before allocating a DSM segment check remaining shared space */
+ Assert(max_shared_dictionaries_size);
+
+ CHECK_SHARED_SPACE();
+
+ LWLockRelease(&tsearch_ctl->lock);
+ /* If we come here, we need an exclusive lock */
+ while (!LWLockAcquireOrWait(&tsearch_ctl->lock, LW_EXCLUSIVE))
+ {
+ /*
+ * Check again in case no space is left after waiting for the
+ * exclusive lock.
+ */
+ CHECK_SHARED_SPACE();
+ }
+
+ tsearch_ctl->loaded_size += entry->dict_size;
+
+ LWLockRelease(&tsearch_ctl->lock);
+
+ /* Finally, allocate a DSM segment for the compiled dictionary */
+ seg = dsm_create(entry->dict_size, 0);
+ dict_location = dsm_segment_address(seg);
+ memcpy(dict_location, dict, entry->dict_size);
+
+ pfree(dict);
+
+ entry->dict_id = dictid;
+ entry->dict_dsm = dsm_segment_handle(seg);
+ entry->refcnt++;
+
+ /* Remain attached until end of postmaster */
+ dsm_pin_segment(seg);
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+}
+
+/*
+ * Release memory occupied by the dictionary. The function just unpins the DSM
+ * mapping. If no other backend maps this DSM, the segment is unpinned too.
+ *
+ * dictid: Oid of the dictionary.
+ */
+void
+ts_dict_shmem_release(Oid dictid)
+{
+ TsearchDictEntry *entry;
+
+ /*
+ * If we didn't attach to a hash table then do nothing.
+ */
+ if (!dict_table)
+ return;
+
+ /* Try to find an entry in the hash table */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, &dictid, true);
+
+ if (entry)
+ {
+ dsm_segment *seg;
+
+ seg = dsm_find_mapping(entry->dict_dsm);
+ /*
+ * If current backend didn't pin a mapping then we don't need to do
+ * unpinning.
+ */
+ if (!seg)
+ {
+ dshash_release_lock(dict_table, entry);
+ return;
+ }
+
+ dsm_unpin_mapping(seg);
+ dsm_detach(seg);
+
+ entry->refcnt--;
+
+ if (entry->refcnt == 0)
+ {
+ dsm_unpin_segment(entry->dict_dsm);
+ dshash_delete_entry(dict_table, entry);
+ }
+ else
+ dshash_release_lock(dict_table, entry);
+ }
+}
+
+/*
+ * Allocate and initialize tsearch-related shared memory.
+ */
+void
+TsearchShmemInit(void)
+{
+ bool found;
+
+ tsearch_ctl = (TsearchCtlData *)
+ ShmemInitStruct("Full Text Search Ctl", sizeof(TsearchCtlData), &found);
+
+ if (!found)
+ {
+ LWLockRegisterTranche(LWTRANCHE_TSEARCH_DSA, "tsearch_dsa");
+ LWLockRegisterTranche(LWTRANCHE_TSEARCH_TABLE, "tsearch_table");
+
+ LWLockInitialize(&tsearch_ctl->lock, LWTRANCHE_TSEARCH_DSA);
+
+ tsearch_ctl->area = DSM_HANDLE_INVALID;
+ tsearch_ctl->dict_table_handle = InvalidDsaPointer;
+ tsearch_ctl->loaded_size = 0;
+ }
+}
+
+/*
+ * Report shared memory space needed by TsearchShmemInit.
+ */
+Size
+TsearchShmemSize(void)
+{
+ Size size = 0;
+
+ /* size of service structure */
+ size = add_size(size, MAXALIGN(sizeof(TsearchCtlData)));
+
+ return size;
+}
+
+/*
+ * Initialize hash table located in DSM.
+ *
+ * The hash table should be created and initialized iff
+ * max_shared_dictionaries_size GUC is greater than zero and it doesn't exist
+ * yet.
+ */
+static void
+init_dict_table(void)
+{
+ MemoryContext old_context;
+ dsa_area *dsa;
+
+ if (max_shared_dictionaries_size == 0)
+ return;
+
+ if (dict_table)
+ return;
+
+ old_context = MemoryContextSwitchTo(TopMemoryContext);
+
+recheck_table:
+ LWLockAcquire(&tsearch_ctl->lock, LW_SHARED);
+
+ /* The hash table has already been created by another backend */
+ if (DsaPointerIsValid(tsearch_ctl->dict_table_handle))
+ {
+ Assert(tsearch_ctl->area != DSM_HANDLE_INVALID);
+
+ dsa = dsa_attach(tsearch_ctl->area);
+
+ dict_table = dshash_attach(dsa,
+ &dict_table_params,
+ tsearch_ctl->dict_table_handle,
+ NULL);
+ }
+ else
+ {
+ /* Try to get exclusive lock */
+ LWLockRelease(&tsearch_ctl->lock);
+ if (!LWLockAcquireOrWait(&tsearch_ctl->lock, LW_EXCLUSIVE))
+ {
+ /*
+ * The lock was released by another backend, which has already
+ * created the hash table concurrently.
+ */
+ goto recheck_table;
+ }
+
+ dsa = dsa_create(LWTRANCHE_TSEARCH_DSA);
+ tsearch_ctl->area = dsa_get_handle(dsa);
+
+ dict_table = dshash_create(dsa, &dict_table_params, NULL);
+ tsearch_ctl->dict_table_handle = dshash_get_hash_table_handle(dict_table);
+
+ /* Remain attached until end of postmaster */
+ dsa_pin(dsa);
+ }
+
+ LWLockRelease(&tsearch_ctl->lock);
+
+ /* Remain attached until end of session */
+ dsa_pin_mapping(dsa);
+
+ MemoryContextSwitchTo(old_context);
+}
diff --git a/src/backend/utils/cache/ts_cache.c b/src/backend/utils/cache/ts_cache.c
index 3d5c194148..c078503111 100644
--- a/src/backend/utils/cache/ts_cache.c
+++ b/src/backend/utils/cache/ts_cache.c
@@ -39,6 +39,7 @@
#include "catalog/pg_ts_template.h"
#include "commands/defrem.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_shared.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
#include "utils/fmgroids.h"
@@ -98,7 +99,16 @@ InvalidateTSCacheCallBack(Datum arg, int cacheid, uint32 hashvalue)
hash_seq_init(&status, hash);
while ((entry = (TSAnyCacheEntry *) hash_seq_search(&status)) != NULL)
+ {
+ if (entry->isvalid && hash == TSDictionaryCacheHash)
+ {
+ TSDictionaryCacheEntry *dict_entry = (TSDictionaryCacheEntry *) entry;
+
+ ts_dict_shmem_release(dict_entry->dictId);
+ }
+
entry->isvalid = false;
+ }
/* Also invalidate the current-config cache if it's pg_ts_config */
if (hash == TSConfigCacheHash)
@@ -334,8 +344,9 @@ lookup_ts_dictionary_cache(Oid dictId)
dictoptions = deserialize_deflist(opt);
entry->dictData =
- DatumGetPointer(OidFunctionCall1(template->tmplinit,
- PointerGetDatum(dictoptions)));
+ DatumGetPointer(OidFunctionCall2(template->tmplinit,
+ PointerGetDatum(dictoptions),
+ ObjectIdGetDatum(dictId)));
MemoryContextSwitchTo(oldcontext);
}
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 87ba67661a..53230bc37f 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -76,6 +76,7 @@
#include "storage/predicate.h"
#include "tcop/tcopprot.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_shared.h"
#include "utils/builtins.h"
#include "utils/bytea.h"
#include "utils/guc_tables.h"
@@ -2922,6 +2923,20 @@ static struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"max_shared_dictionaries_size", PGC_POSTMASTER, RESOURCES_MEM,
+ gettext_noop("Sets the maximum size of all text search dictionaries loaded into shared memory."),
+ gettext_noop("Currently controls only loading of Ispell dictionaries. "
+ "If total size of simultaneously loaded dictionaries "
+ "reaches the maximum allowed size then a new dictionary "
+ "will be loaded into local memory of a backend."),
+ GUC_UNIT_KB,
+ },
+ &max_shared_dictionaries_size,
+ 100 * 1024, 0, MAX_KILOBYTES,
+ NULL, NULL, NULL
+ },
+
/* End-of-list marker */
{
{NULL, 0, 0, NULL, NULL}, NULL, 0, 0, 0, NULL, NULL, NULL
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 9a3535559e..908ccebb52 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -133,6 +133,7 @@
# mmap
# use none to disable dynamic shared memory
# (change requires restart)
+#max_shared_dictionaries_size = 100MB # (change requires restart)
# - Disk -
diff --git a/src/include/storage/lwlock.h b/src/include/storage/lwlock.h
index c21bfe2f66..16b0858eda 100644
--- a/src/include/storage/lwlock.h
+++ b/src/include/storage/lwlock.h
@@ -219,6 +219,8 @@ typedef enum BuiltinTrancheIds
LWTRANCHE_SHARED_TUPLESTORE,
LWTRANCHE_TBM,
LWTRANCHE_PARALLEL_APPEND,
+ LWTRANCHE_TSEARCH_DSA,
+ LWTRANCHE_TSEARCH_TABLE,
LWTRANCHE_FIRST_USER_DEFINED
} BuiltinTrancheIds;
diff --git a/src/include/tsearch/ts_shared.h b/src/include/tsearch/ts_shared.h
new file mode 100644
index 0000000000..d6a27c9037
--- /dev/null
+++ b/src/include/tsearch/ts_shared.h
@@ -0,0 +1,33 @@
+/*-------------------------------------------------------------------------
+ *
+ * ts_shared.h
+ * tsearch shared memory management
+ *
+ * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
+ *
+ * src/include/tsearch/ts_shared.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef TS_SHARED_H
+#define TS_SHARED_H
+
+#include "c.h"
+
+#include "nodes/pg_list.h"
+
+/*
+ * GUC variable for the maximum total size of shared dictionaries (in kilobytes)
+ */
+extern int max_shared_dictionaries_size;
+
+typedef void *(*ispell_build_callback) (List *dictoptions, Size *size);
+
+extern void *ts_dict_shmem_location(Oid dictid, List *dictoptions,
+ ispell_build_callback allocate_cb);
+extern void ts_dict_shmem_release(Oid dictid);
+
+extern void TsearchShmemInit(void);
+extern Size TsearchShmemSize(void);
+
+#endif /* TS_SHARED_H */
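The space check in ts_dict_shmem_location() compares a byte count against a GUC that is measured in kilobytes (GUC_UNIT_KB, default 100 * 1024). The following is a minimal sketch of that accounting in plain C, under stated assumptions: ToyCtl and toy_reserve() are hypothetical names, locking is omitted entirely, and the check and the reservation happen in one step, whereas the patch checks under a shared lock and then updates loaded_size under an exclusive lock. The comparison is also rearranged so that the addition cannot overflow, which the patch does not do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the shared control struct; only the counter. */
typedef struct
{
	size_t		loaded_size;	/* bytes already reserved for dictionaries */
} ToyCtl;

/*
 * Try to reserve dict_size bytes of shared space.
 *
 * limit_kb plays the role of max_shared_dictionaries_size and is given in
 * kilobytes, so it is converted to bytes before comparing. Returns true and
 * updates the counter if the dictionary fits, false otherwise (the caller
 * would then keep the dictionary in backend-local memory).
 */
static bool
toy_reserve(ToyCtl *ctl, size_t dict_size, int limit_kb)
{
	size_t		limit_bytes;

	assert(ctl != NULL && limit_kb >= 0);
	limit_bytes = (size_t) limit_kb * 1024;

	/* Same test as dict_size + loaded_size > limit, without overflow. */
	if (dict_size > limit_bytes ||
		ctl->loaded_size > limit_bytes - dict_size)
		return false;

	ctl->loaded_size += dict_size;
	return true;
}
```

For example, with the default limit of 100 * 1024 kilobytes, a 60MB dictionary is accepted, a following 50MB dictionary is rejected, and a following 40MB dictionary still fits exactly. A limit of 0 rejects every dictionary, which matches the documented way of disabling the feature.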
0003-Store-ispell-structures-in-shmem-v5.patch (text/plain; charset=us-ascii)
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index 610b7bf033..82afe201f8 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -3030,6 +3030,23 @@ CREATE TEXT SEARCH DICTIONARY english_stem (
</sect2>
+ <sect2 id="textsearch-shared-dictionaries">
+ <title>Dictionaries in Shared Memory</title>
+
+ <para>
+ Some dictionaries, especially <application>Ispell</application>, consume a
+ noticeable amount of memory. The size of a dictionary can reach tens of megabytes.
+ Most of them also store their configuration in text files. A dictionary is compiled
+ on first access in each user session.
+ </para>
+
+ <para>
+ To store dictionaries in shared memory, set the <xref linkend="guc-max-shared-dictionaries-size"/>
+ parameter to a value greater than zero before starting the server.
+ </para>
+
+ </sect2>
+
</sect1>
<sect1 id="textsearch-configuration">
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index edc6547700..e7f4d5a48d 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -5,6 +5,15 @@
*
* Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
*
+ * By default all Ispell dictionaries are stored in DSM. But if the total size
+ * of loaded dictionaries reaches the maximum allowed value, a new dictionary
+ * is allocated within its own memory context (dictCtx).
+ *
+ * All necessary data is built within the dispell_build() function. But
+ * structures for regular expressions are compiled on first demand and
+ * stored in the AffixReg array, because regex_t and Regis cannot be
+ * stored in shared memory.
+ *
*
* IDENTIFICATION
* src/backend/tsearch/dict_ispell.c
@@ -14,8 +23,10 @@
#include "postgres.h"
#include "commands/defrem.h"
+#include "storage/dsm.h"
#include "tsearch/dicts/spell.h"
#include "tsearch/ts_locale.h"
+#include "tsearch/ts_shared.h"
#include "tsearch/ts_utils.h"
#include "utils/builtins.h"
@@ -26,19 +37,90 @@ typedef struct
IspellDict obj;
} DictISpell;
+static void parse_dictoptions(List *dictoptions,
+ char **dictfile, char **afffile, char **stopfile);
+static void *dispell_build(List *dictoptions, Size *size);
+
Datum
dispell_init(PG_FUNCTION_ARGS)
{
List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ Oid dictid = PG_GETARG_OID(1);
DictISpell *d;
- bool affloaded = false,
- dictloaded = false,
- stoploaded = false;
- ListCell *l;
+ void *dict_location;
+ char *stopfile;
d = (DictISpell *) palloc0(sizeof(DictISpell));
- NIStartBuild(&(d->obj));
+ parse_dictoptions(dictoptions, NULL, NULL, &stopfile);
+
+ if (stopfile)
+ readstoplist(stopfile, &(d->stoplist), lowerstr);
+
+ dict_location = ts_dict_shmem_location(dictid, dictoptions, dispell_build);
+ Assert(dict_location);
+
+ d->obj.dict = (IspellDictData *) dict_location;
+ d->obj.reg = (AffixReg *) palloc0(d->obj.dict->nAffix *
+ sizeof(AffixReg));
+ /* Current memory context is dictionary's private memory context */
+ d->obj.dictCtx = CurrentMemoryContext;
+
+ PG_RETURN_POINTER(d);
+}
+
+Datum
+dispell_lexize(PG_FUNCTION_ARGS)
+{
+ DictISpell *d = (DictISpell *) PG_GETARG_POINTER(0);
+ char *in = (char *) PG_GETARG_POINTER(1);
+ int32 len = PG_GETARG_INT32(2);
+ char *txt;
+ TSLexeme *res;
+ TSLexeme *ptr,
+ *cptr;
+
+ if (len <= 0)
+ PG_RETURN_POINTER(NULL);
+
+ txt = lowerstr_with_len(in, len);
+ res = NINormalizeWord(&(d->obj), txt);
+
+ if (res == NULL)
+ PG_RETURN_POINTER(NULL);
+
+ cptr = res;
+ for (ptr = cptr; ptr->lexeme; ptr++)
+ {
+ if (searchstoplist(&(d->stoplist), ptr->lexeme))
+ {
+ pfree(ptr->lexeme);
+ ptr->lexeme = NULL;
+ }
+ else
+ {
+ if (cptr != ptr)
+ memcpy(cptr, ptr, sizeof(TSLexeme));
+ cptr++;
+ }
+ }
+ cptr->lexeme = NULL;
+
+ PG_RETURN_POINTER(res);
+}
+
+static void
+parse_dictoptions(List *dictoptions, char **dictfile, char **afffile,
+ char **stopfile)
+{
+ ListCell *l;
+
+ if (dictfile)
+ *dictfile = NULL;
+ if (afffile)
+ *afffile = NULL;
+ if (stopfile)
+ *stopfile = NULL;
foreach(l, dictoptions)
{
@@ -46,34 +128,36 @@ dispell_init(PG_FUNCTION_ARGS)
if (strcmp(defel->defname, "dictfile") == 0)
{
- if (dictloaded)
+ if (!dictfile)
+ continue;
+
+ if (*dictfile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple DictFile parameters")));
- NIImportDictionary(&(d->obj),
- get_tsearch_config_filename(defGetString(defel),
- "dict"));
- dictloaded = true;
+ *dictfile = get_tsearch_config_filename(defGetString(defel), "dict");
}
else if (strcmp(defel->defname, "afffile") == 0)
{
- if (affloaded)
+ if (!afffile)
+ continue;
+
+ if (*afffile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple AffFile parameters")));
- NIImportAffixes(&(d->obj),
- get_tsearch_config_filename(defGetString(defel),
- "affix"));
- affloaded = true;
+ *afffile = get_tsearch_config_filename(defGetString(defel), "affix");
}
else if (strcmp(defel->defname, "stopwords") == 0)
{
- if (stoploaded)
+ if (!stopfile)
+ continue;
+
+ if (*stopfile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple StopWords parameters")));
- readstoplist(defGetString(defel), &(d->stoplist), lowerstr);
- stoploaded = true;
+ *stopfile = defGetString(defel);
}
else
{
@@ -83,66 +167,52 @@ dispell_init(PG_FUNCTION_ARGS)
defel->defname)));
}
}
+}
- if (affloaded && dictloaded)
- {
- NISortDictionary(&(d->obj));
- NISortAffixes(&(d->obj));
- }
- else if (!affloaded)
+/*
+ * Build the dictionary.
+ *
+ * Result is palloc'ed.
+ */
+static void *
+dispell_build(List *dictoptions, Size *size)
+{
+ IspellDictBuild build;
+ char *dictfile,
+ *afffile;
+
+ parse_dictoptions(dictoptions, &dictfile, &afffile, NULL);
+
+ if (!afffile)
{
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("missing AffFile parameter")));
}
- else
+ else if (!dictfile)
{
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("missing DictFile parameter")));
}
- NIFinishBuild(&(d->obj));
+ MemSet(&build, 0, sizeof(build));
+ NIStartBuild(&build);
- PG_RETURN_POINTER(d);
-}
+ /* Read files */
+ NIImportDictionary(&build, dictfile);
+ NIImportAffixes(&build, afffile);
-Datum
-dispell_lexize(PG_FUNCTION_ARGS)
-{
- DictISpell *d = (DictISpell *) PG_GETARG_POINTER(0);
- char *in = (char *) PG_GETARG_POINTER(1);
- int32 len = PG_GETARG_INT32(2);
- char *txt;
- TSLexeme *res;
- TSLexeme *ptr,
- *cptr;
+ /* Build persistent data to use by backends */
+ NISortDictionary(&build);
+ NISortAffixes(&build);
- if (len <= 0)
- PG_RETURN_POINTER(NULL);
+ NICopyData(&build);
- txt = lowerstr_with_len(in, len);
- res = NINormalizeWord(&(d->obj), txt);
-
- if (res == NULL)
- PG_RETURN_POINTER(NULL);
+ /* Release temporary data */
+ NIFinishBuild(&build);
- cptr = res;
- for (ptr = cptr; ptr->lexeme; ptr++)
- {
- if (searchstoplist(&(d->stoplist), ptr->lexeme))
- {
- pfree(ptr->lexeme);
- ptr->lexeme = NULL;
- }
- else
- {
- if (cptr != ptr)
- memcpy(cptr, ptr, sizeof(TSLexeme));
- cptr++;
- }
- }
- cptr->lexeme = NULL;
-
- PG_RETURN_POINTER(res);
+ /* Return the buffer and its size */
+ *size = build.dict_size;
+ return build.dict;
}
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index e071994523..1c560ef56a 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -23,33 +23,35 @@
* Compilation of a dictionary
* ---------------------------
*
- * A compiled dictionary is stored in the IspellDict structure. Compilation of
- * a dictionary is divided into the several steps:
+ * A compiled dictionary is stored in the following structures:
+ * - IspellDictBuild - stores temporary data and IspellDictData
+ * - IspellDictData - stores permanent data used within NINormalizeWord()
+ * Compilation of the dictionary is divided into the several steps:
* - NIImportDictionary() - stores each word of a .dict file in the
* temporary Spell field.
- * - NIImportAffixes() - stores affix rules of an .affix file in the
- * Affix field (not temporary) if an .affix file has the Ispell format.
+ * - NIImportAffixes() - stores affix rules of an .affix file in the temporary
+ * Affix field if an .affix file has the Ispell format.
* -> NIImportOOAffixes() - stores affix rules if an .affix file has the
* Hunspell format. The AffixData field is initialized if AF parameter
* is defined.
* - NISortDictionary() - builds a prefix tree (Trie) from the words list
- * and stores it in the Dictionary field. The words list is got from the
+ * and stores it in the DictNodes field. The words list is got from the
* Spell field. The AffixData field is initialized if AF parameter is not
* defined.
* - NISortAffixes():
* - builds a list of compound affixes from the affix list and stores it
* in the CompoundAffix.
* - builds prefix trees (Trie) from the affix list for prefixes and suffixes
- * and stores them in Suffix and Prefix fields.
+ * and stores them in SuffixNodes and PrefixNodes fields.
* The affix list is got from the Affix field.
+ * Persistent data of the dictionary is copied within NICopyData().
*
* Memory management
* -----------------
*
- * The IspellDict structure has the Spell field which is used only in compile
- * time. The Spell field stores a words list. It can take a lot of memory.
- * Therefore when a dictionary is compiled this field is cleared by
- * NIFinishBuild().
+ * The IspellDictBuild structure holds temporary data that is used only at
+ * compile time. It can take a lot of memory. Therefore after compiling the
+ * dictionary this data is cleared by NIFinishBuild().
*
* All resources which should cleared by NIFinishBuild() is initialized using
* tmpalloc() and tmpalloc0().
@@ -73,112 +75,147 @@
* after the initialization is done. During initialization,
* CurrentMemoryContext is the long-lived memory context associated
* with the dictionary cache entry. We keep the short-lived stuff
- * in the Conf->buildCxt context.
+ * in the ConfBuild->buildCxt context.
*/
-#define tmpalloc(sz) MemoryContextAlloc(Conf->buildCxt, (sz))
-#define tmpalloc0(sz) MemoryContextAllocZero(Conf->buildCxt, (sz))
+#define tmpalloc(sz) MemoryContextAlloc(ConfBuild->buildCxt, (sz))
+#define tmpalloc0(sz) MemoryContextAllocZero(ConfBuild->buildCxt, (sz))
-#define tmpstrdup(str) MemoryContextStrdup(Conf->buildCxt, (str))
+#define tmpstrdup(str) MemoryContextStrdup(ConfBuild->buildCxt, (str))
/*
* Prepare for constructing an ISpell dictionary.
*
- * The IspellDict struct is assumed to be zeroed when allocated.
+ * The IspellDictBuild struct is assumed to be zeroed when allocated.
*/
void
-NIStartBuild(IspellDict *Conf)
+NIStartBuild(IspellDictBuild *ConfBuild)
{
+ uint32 dict_size;
+
/*
* The temp context is a child of CurTransactionContext, so that it will
* go away automatically on error.
*/
- Conf->buildCxt = AllocSetContextCreate(CurTransactionContext,
- "Ispell dictionary init context",
- ALLOCSET_DEFAULT_SIZES);
+ ConfBuild->buildCxt = AllocSetContextCreate(CurTransactionContext,
+ "Ispell dictionary init context",
+ ALLOCSET_DEFAULT_SIZES);
+
+ /*
+ * Allocate the dictionary buffer in the current context, not in buildCxt.
+ */
+ dict_size = MAXALIGN(IspellDictDataHdrSize);
+ ConfBuild->dict = palloc0(dict_size);
+ ConfBuild->dict_size = dict_size;
}
/*
- * Clean up when dictionary construction is complete.
+ * Copy compiled and persistent data into IspellDictData.
*/
void
-NIFinishBuild(IspellDict *Conf)
+NICopyData(IspellDictBuild *ConfBuild)
{
- /* Release no-longer-needed temp memory */
- MemoryContextDelete(Conf->buildCxt);
- /* Just for cleanliness, zero the now-dangling pointers */
- Conf->buildCxt = NULL;
- Conf->Spell = NULL;
- Conf->firstfree = NULL;
- Conf->CompoundAffixFlags = NULL;
-}
+ IspellDictData *dict;
+ uint32 size;
+ int i;
+ uint32 *offsets,
+ offset;
+ /*
+ * Calculate necessary space
+ */
+ size = ConfBuild->nAffixData * sizeof(uint32);
+ size += ConfBuild->AffixDataEnd;
-/*
- * "Compact" palloc: allocate without extra palloc overhead.
- *
- * Since we have no need to free the ispell data items individually, there's
- * not much value in the per-chunk overhead normally consumed by palloc.
- * Getting rid of it is helpful since ispell can allocate a lot of small nodes.
- *
- * We currently pre-zero all data allocated this way, even though some of it
- * doesn't need that. The cpalloc and cpalloc0 macros are just documentation
- * to indicate which allocations actually require zeroing.
- */
-#define COMPACT_ALLOC_CHUNK 8192 /* amount to get from palloc at once */
-#define COMPACT_MAX_REQ 1024 /* must be < COMPACT_ALLOC_CHUNK */
+ size += ConfBuild->nAffix * sizeof(uint32);
+ size += ConfBuild->AffixSize;
-static void *
-compact_palloc0(IspellDict *Conf, size_t size)
-{
- void *result;
+ size += ConfBuild->DictNodes.NodesEnd;
+ size += ConfBuild->PrefixNodes.NodesEnd;
+ size += ConfBuild->SuffixNodes.NodesEnd;
- /* Should only be called during init */
- Assert(Conf->buildCxt != NULL);
+ size += sizeof(CMPDAffix) * ConfBuild->nCompoundAffix;
- /* No point in this for large chunks */
- if (size > COMPACT_MAX_REQ)
- return palloc0(size);
+ /*
+ * Copy data itself
+ */
+ ConfBuild->dict_size = IspellDictDataHdrSize + size;
+ ConfBuild->dict = repalloc(ConfBuild->dict, ConfBuild->dict_size);
+
+ dict = ConfBuild->dict;
+
+ /* AffixData */
+ dict->nAffixData = ConfBuild->nAffixData;
+ dict->AffixDataStart = sizeof(uint32) * ConfBuild->nAffixData;
+ memcpy(DictAffixDataOffset(dict), ConfBuild->AffixDataOffset,
+ sizeof(uint32) * ConfBuild->nAffixData);
+ memcpy(DictAffixData(dict), ConfBuild->AffixData, ConfBuild->AffixDataEnd);
+
+ /* Affix array */
+ dict->nAffix = ConfBuild->nAffix;
+ dict->AffixOffsetStart = dict->AffixDataStart + ConfBuild->AffixDataEnd;
+ dict->AffixStart = dict->AffixOffsetStart + sizeof(uint32) * ConfBuild->nAffix;
+ offsets = (uint32 *) DictAffixOffset(dict);
+ offset = 0;
+ for (i = 0; i < ConfBuild->nAffix; i++)
+ {
+ AFFIX *affix;
+ uint32 size = AffixGetSize(ConfBuild->Affix[i]);
- /* Keep everything maxaligned */
- size = MAXALIGN(size);
+ offsets[i] = offset;
+ affix = (AFFIX *) DictAffixGet(dict, i);
+ Assert(affix);
- /* Need more space? */
- if (size > Conf->avail)
- {
- Conf->firstfree = palloc0(COMPACT_ALLOC_CHUNK);
- Conf->avail = COMPACT_ALLOC_CHUNK;
- }
+ memcpy(affix, ConfBuild->Affix[i], size);
- result = (void *) Conf->firstfree;
- Conf->firstfree += size;
- Conf->avail -= size;
+ offset += size;
+ }
- return result;
+ /* DictNodes prefix tree */
+ dict->DictNodesStart = dict->AffixStart + offset;
+ memcpy(DictDictNodes(dict), ConfBuild->DictNodes.Nodes,
+ ConfBuild->DictNodes.NodesEnd);
+
+ /* PrefixNodes prefix tree */
+ dict->PrefixNodesStart = dict->DictNodesStart + ConfBuild->DictNodes.NodesEnd;
+ memcpy(DictPrefixNodes(dict), ConfBuild->PrefixNodes.Nodes,
+ ConfBuild->PrefixNodes.NodesEnd);
+
+ /* SuffixNodes prefix tree */
+ dict->SuffixNodesStart = dict->PrefixNodesStart + ConfBuild->PrefixNodes.NodesEnd;
+ memcpy(DictSuffixNodes(dict), ConfBuild->SuffixNodes.Nodes,
+ ConfBuild->SuffixNodes.NodesEnd);
+
+ /* CompoundAffix array */
+ dict->CompoundAffixStart = dict->SuffixNodesStart +
+ ConfBuild->SuffixNodes.NodesEnd;
+ memcpy(DictCompoundAffix(dict), ConfBuild->CompoundAffix,
+ sizeof(CMPDAffix) * ConfBuild->nCompoundAffix);
}
-#define cpalloc(size) compact_palloc0(Conf, size)
-#define cpalloc0(size) compact_palloc0(Conf, size)
-
-static char *
-cpstrdup(IspellDict *Conf, const char *str)
+/*
+ * Clean up when dictionary construction is complete.
+ */
+void
+NIFinishBuild(IspellDictBuild *ConfBuild)
{
- char *res = cpalloc(strlen(str) + 1);
-
- strcpy(res, str);
- return res;
+ /* Release no-longer-needed temp memory */
+ MemoryContextDelete(ConfBuild->buildCxt);
+ /* Just for cleanliness, zero the now-dangling pointers */
+ ConfBuild->buildCxt = NULL;
+ ConfBuild->Spell = NULL;
+ ConfBuild->CompoundAffixFlags = NULL;
}
-
/*
* Apply lowerstr(), producing a temporary result (in the buildCxt).
*/
static char *
-lowerstr_ctx(IspellDict *Conf, const char *src)
+lowerstr_ctx(IspellDictBuild *ConfBuild, const char *src)
{
MemoryContext saveCtx;
char *dst;
- saveCtx = MemoryContextSwitchTo(Conf->buildCxt);
+ saveCtx = MemoryContextSwitchTo(ConfBuild->buildCxt);
dst = lowerstr(src);
MemoryContextSwitchTo(saveCtx);
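
To illustrate the layout that NICopyData() builds above (an offsets array followed by the data it indexes, all addressed relative to the start of one buffer), here is a minimal standalone sketch. It is not code from the patch: FlatDict, flat_pack() and the Flat* macros are hypothetical names, plain malloc() stands in for palloc(), and only the AffixData-like string section is modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Header of the flat image; section starts are relative to data[]. */
typedef struct FlatDict
{
    uint32_t nstrings;      /* number of stored strings */
    uint32_t strings_start; /* where the string bytes begin within data[] */
    char     data[];        /* offsets array, then NUL-terminated strings */
} FlatDict;

#define FLAT_HDRSZ       offsetof(FlatDict, data)
#define FlatOffsets(d)   ((uint32_t *) (d)->data)
#define FlatString(d, i) ((d)->data + (d)->strings_start + FlatOffsets(d)[i])

/*
 * Pack n strings into one contiguous allocation and report its total size.
 * The result contains no pointers, so it can be copied byte for byte.
 */
FlatDict *
flat_pack(const char *const *strs, uint32_t n, size_t *size_out)
{
    uint32_t    pool = 0;   /* total bytes needed for all strings */
    uint32_t    off = 0;    /* running offset inside the string section */
    uint32_t    i;
    size_t      total;
    FlatDict   *d;

    for (i = 0; i < n; i++)
        pool += (uint32_t) strlen(strs[i]) + 1;

    total = FLAT_HDRSZ + n * sizeof(uint32_t) + pool;
    d = malloc(total);
    if (d == NULL)
        abort();            /* sketch only: no real error reporting */

    d->nstrings = n;
    d->strings_start = n * (uint32_t) sizeof(uint32_t);

    for (i = 0; i < n; i++)
    {
        uint32_t    len = (uint32_t) strlen(strs[i]) + 1;

        FlatOffsets(d)[i] = off;
        memcpy(FlatString(d, i), strs[i], len);
        off += len;
    }

    *size_out = total;
    return d;
}
```

Because every lookup goes through an offset from the header, the image stays valid after a memcpy() into memory mapped at a different address, which is the property a shared segment needs.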
@@ -190,7 +227,7 @@ lowerstr_ctx(IspellDict *Conf, const char *src)
#define STRNCMP(s,p) strncmp( (s), (p), strlen(p) )
#define GETWCHAR(W,L,N,T) ( ((const uint8*)(W))[ ((T)==FF_PREFIX) ? (N) : ( (L) - 1 - (N) ) ] )
-#define GETCHAR(A,N,T) GETWCHAR( (A)->repl, (A)->replen, N, T )
+#define GETCHAR(A,N,T) GETWCHAR( AffixFieldRepl(A), (A)->replen, N, T )
static char *VoidString = "";
@@ -311,18 +348,189 @@ strbncmp(const unsigned char *s1, const unsigned char *s2, size_t count)
static int
cmpaffix(const void *s1, const void *s2)
{
- const AFFIX *a1 = (const AFFIX *) s1;
- const AFFIX *a2 = (const AFFIX *) s2;
+ const AFFIX *a1 = *((AFFIX *const *) s1);
+ const AFFIX *a2 = *((AFFIX *const *) s2);
if (a1->type < a2->type)
return -1;
if (a1->type > a2->type)
return 1;
if (a1->type == FF_PREFIX)
- return strcmp(a1->repl, a2->repl);
+ return strcmp(AffixFieldRepl(a1), AffixFieldRepl(a2));
else
- return strbcmp((const unsigned char *) a1->repl,
- (const unsigned char *) a2->repl);
+ return strbcmp((const unsigned char *) AffixFieldRepl(a1),
+ (const unsigned char *) AffixFieldRepl(a2));
+}
+
+/*
+ * Allocate space for AffixData.
+ */
+static void
+InitAffixData(IspellDictBuild *ConfBuild, int numAffixData)
+{
+ uint32 size;
+
+ size = 8 * 1024; /* Reserve 8KB for data */
+
+ ConfBuild->AffixData = (char *) tmpalloc(size);
+ ConfBuild->AffixDataSize = size;
+ ConfBuild->AffixDataOffset = (uint32 *) tmpalloc(numAffixData * sizeof(uint32));
+ ConfBuild->nAffixData = 0;
+ ConfBuild->mAffixData = numAffixData;
+
+ /* Save offset of the end of data */
+ ConfBuild->AffixDataEnd = 0;
+}
+
+/*
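+ * Add a set of affix flags to the IspellDictBuild struct. If the AffixData
+ * buffer or the offsets array is too small for the new affix set, enlarge it.
+ *
+ * ConfBuild: building structure for the current dictionary.
+ * AffixSet: set of affix flags.
+ * AffixSetLen: length of AffixSet, not counting the terminating \0.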
+ */
+static void
+AddAffixSet(IspellDictBuild *ConfBuild, const char *AffixSet,
+ uint32 AffixSetLen)
+{
+ /*
+ * Check available space for AffixSet.
+ */
+ if (ConfBuild->AffixDataEnd + AffixSetLen + 1 /* \0 */ >=
+ ConfBuild->AffixDataSize)
+ {
+ uint32 newsize = Max(ConfBuild->AffixDataSize + 8 * 1024 /* 8KB */,
+ ConfBuild->AffixDataSize + AffixSetLen + 1);
+
+ ConfBuild->AffixData = (char *) repalloc(ConfBuild->AffixData, newsize);
+ ConfBuild->AffixDataSize = newsize;
+ }
+
+ /* Check available number of offsets */
+ if (ConfBuild->nAffixData >= ConfBuild->mAffixData)
+ {
+ ConfBuild->mAffixData *= 2;
+ ConfBuild->AffixDataOffset = (uint32 *) repalloc(ConfBuild->AffixDataOffset,
+ sizeof(uint32) * ConfBuild->mAffixData);
+ }
+
+ ConfBuild->AffixDataOffset[ConfBuild->nAffixData] = ConfBuild->AffixDataEnd;
+ StrNCpy(AffixDataGet(ConfBuild, ConfBuild->nAffixData),
+ AffixSet, AffixSetLen + 1);
+
+ /* Save offset of the end of data */
+ ConfBuild->AffixDataEnd += AffixSetLen + 1;
+ ConfBuild->nAffixData++;
+}
+
+/*
+ * Allocate space for prefix tree node.
+ *
+ * ConfBuild: building structure for the current dictionary.
+ * array: NodeArray where to allocate new node.
+ * length: number of allocated NodeData.
+ * sizeNodeData: minimum size of each NodeData.
+ * sizeNodeHeader: size of header of new node.
+ *
+ * Returns an offset of new node in NodeArray->Nodes.
+ */
+static uint32
+AllocateNode(IspellDictBuild *ConfBuild, NodeArray *array, uint32 length,
+ uint32 sizeNodeData, uint32 sizeNodeHeader)
+{
+ uint32 node_offset;
+ uint32 size;
+
+ size = sizeNodeHeader + length * sizeNodeData;
+ size = MAXALIGN(size);
+
+ if (array->NodesSize == 0)
+ {
+ array->NodesSize = size * 32; /* Reserve space for next levels of the
+ * prefix tree */
+ array->Nodes = (char *) tmpalloc(array->NodesSize);
+ array->NodesEnd = 0;
+ }
+ else if (array->NodesEnd + size >= array->NodesSize)
+ {
+ array->NodesSize = Max(array->NodesSize * 2, array->NodesSize + size);
+ array->Nodes = (char *) repalloc(array->Nodes, array->NodesSize);
+ }
+
+ node_offset = array->NodesEnd;
+ array->NodesEnd += size;
+
+ return node_offset;
+}
+
+/*
+ * Allocate space for SPNode.
+ *
+ * Returns an offset of new node in ConfBuild->DictNodes.Nodes.
+ */
+static uint32
+AllocateSPNode(IspellDictBuild *ConfBuild, uint32 length)
+{
+ uint32 offset;
+ SPNode *node;
+ SPNodeData *data;
+ uint32 i;
+
+ offset = AllocateNode(ConfBuild, &ConfBuild->DictNodes, length,
+ sizeof(SPNodeData), SPNHDRSZ);
+ node = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, offset);
+ node->length = length;
+
+ /*
+ * Initialize all SPNodeData with default values. We cannot use memset()
+ * here because not all fields have 0 as default value.
+ */
+ for (i = 0; i < length; i++)
+ {
+ data = &(node->data[i]);
+ data->val = 0;
+ data->affix = ISPELL_INVALID_INDEX;
+ data->compoundflag = 0;
+ data->isword = 0;
+ data->node_offset = ISPELL_INVALID_OFFSET;
+ }
+
+ return offset;
+}
+
+/*
+ * Allocate space for AffixNode.
+ *
+ * Returns an offset of new node in NodeArray->Nodes.
+ */
+static uint32
+AllocateAffixNode(IspellDictBuild *ConfBuild, NodeArray *array, uint32 length)
+{
+ uint32 offset;
+ AffixNode *node;
+ AffixNodeData *data;
+ uint32 i;
+
+ offset = AllocateNode(ConfBuild, array, length, sizeof(AffixNodeData),
+ ANHRDSZ);
+ node = (AffixNode *) NodeArrayGet(array, offset);
+ node->length = length;
+ node->isvoid = 0;
+
+ /*
+ * Initialize all AffixNodeData with default values. We cannot use memset()
+ * here because not all fields have 0 as default value.
+ */
+ for (i = 0; i < length; i++)
+ {
+ data = &(node->data[i]);
+ data->val = 0;
+ data->affstart = ISPELL_INVALID_INDEX;
+ data->affend = ISPELL_INVALID_INDEX;
+ data->node_offset = ISPELL_INVALID_OFFSET;
+ }
+
+ return offset;
}
/*
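
The node allocators in this hunk return offsets instead of pointers, because the underlying buffer can move when it is enlarged with repalloc(). A minimal standalone sketch of that idea follows. It is an illustration only, not code from the patch: Arena, Node, arena_alloc(), arena_node() and node_new() are hypothetical names, INVALID_OFFSET stands in for ISPELL_INVALID_OFFSET, and realloc() replaces repalloc().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define INVALID_OFFSET UINT32_MAX   /* stand-in for ISPELL_INVALID_OFFSET */

/* Growable byte buffer; nodes are addressed by offset, never by pointer. */
typedef struct Arena
{
    char       *base;   /* start of the buffer, may move on growth */
    uint32_t    used;   /* offset of the first free byte */
    uint32_t    cap;    /* allocated size in bytes */
} Arena;

typedef struct Node
{
    uint32_t    value;
    uint32_t    child;  /* offset of the child node, or INVALID_OFFSET */
} Node;

/* Reserve size bytes (rounded up to a multiple of 8) and return their offset. */
uint32_t
arena_alloc(Arena *a, uint32_t size)
{
    uint32_t    off;

    size = (size + 7u) & ~7u;
    if (a->used + size > a->cap)
    {
        uint32_t    newcap = a->cap ? a->cap * 2 : 64;
        char       *p;

        while (newcap < a->used + size)
            newcap *= 2;
        p = realloc(a->base, newcap);
        if (p == NULL)
            abort();    /* sketch only: no real error reporting */
        a->base = p;
        a->cap = newcap;
    }
    off = a->used;
    a->used += size;
    return off;
}

/* Turn an offset into a pointer that is valid only until the next allocation. */
Node *
arena_node(Arena *a, uint32_t off)
{
    assert(off != INVALID_OFFSET && off < a->used);
    return (Node *) (a->base + off);
}

/* Allocate a node and initialize its fields explicitly, as the patch does. */
uint32_t
node_new(Arena *a, uint32_t value)
{
    uint32_t    off = arena_alloc(a, (uint32_t) sizeof(Node));
    Node       *n = arena_node(a, off);

    n->value = value;
    n->child = INVALID_OFFSET;
    return off;
}
```

Callers should keep offsets, not Node pointers, across calls to arena_alloc(): a parent is linked with arena_node(&a, parent)->child = node_new(&a, v) only after node_new() has returned.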
@@ -333,7 +541,7 @@ cmpaffix(const void *s1, const void *s2)
* - 2 characters (FM_LONG). A character may be Unicode.
* - numbers from 1 to 65000 (FM_NUM).
*
- * Depending on the flagMode an affix string can have the following format:
+ * Depending on the flagmode an affix string can have the following format:
* - FM_CHAR: ABCD
* Here we have 4 flags: A, B, C and D
* - FM_LONG: ABCDE*
@@ -341,13 +549,13 @@ cmpaffix(const void *s1, const void *s2)
* - FM_NUM: 200,205,50
* Here we have 3 flags: 200, 205 and 50
*
- * Conf: current dictionary.
+ * flagmode: flag mode of the dictionary.
* sflagset: the set of affix flags. Returns a reference to the start of a next
* affix flag.
* sflag: returns an affix flag from sflagset.
*/
static void
-getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
+getNextFlagFromString(FlagMode flagmode, char **sflagset, char *sflag)
{
int32 s;
char *next,
@@ -356,11 +564,11 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
bool stop = false;
bool met_comma = false;
- maxstep = (Conf->flagMode == FM_LONG) ? 2 : 1;
+ maxstep = (flagmode == FM_LONG) ? 2 : 1;
while (**sflagset)
{
- switch (Conf->flagMode)
+ switch (flagmode)
{
case FM_LONG:
case FM_CHAR:
@@ -422,15 +630,15 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
stop = true;
break;
default:
- elog(ERROR, "unrecognized type of Conf->flagMode: %d",
- Conf->flagMode);
+ elog(ERROR, "unrecognized type of flagmode: %d",
+ flagmode);
}
if (stop)
break;
}
- if (Conf->flagMode == FM_LONG && maxstep > 0)
+ if (flagmode == FM_LONG && maxstep > 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix flag \"%s\" with \"long\" flag value",
@@ -440,31 +648,28 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
}
/*
- * Checks if the affix set Conf->AffixData[affix] contains affixflag.
- * Conf->AffixData[affix] does not contain affixflag if this flag is not used
- * actually by the .dict file.
+ * Checks if the affix set from AffixData contains affixflag. The affix set
+ * does not contain affixflag if the flag is not actually used by the .dict file.
*
- * Conf: current dictionary.
- * affix: index of the Conf->AffixData array.
+ * flagmode: flag mode of the dictionary.
+ * sflagset: the set of affix flags.
* affixflag: the affix flag.
*
- * Returns true if the string Conf->AffixData[affix] contains affixflag,
- * otherwise returns false.
+ * Returns true if the affix set string contains affixflag, otherwise returns
+ * false.
*/
static bool
-IsAffixFlagInUse(IspellDict *Conf, int affix, const char *affixflag)
+IsAffixFlagInUse(FlagMode flagmode, char *sflagset, const char *affixflag)
{
- char *flagcur;
+ char *flagcur = sflagset;
char flag[BUFSIZ];
if (*affixflag == 0)
return true;
- flagcur = Conf->AffixData[affix];
-
while (*flagcur)
{
- getNextFlagFromString(Conf, &flagcur, flag);
+ getNextFlagFromString(flagmode, &flagcur, flag);
/* Compare first affix flag in flagcur with affixflag */
if (strcmp(flag, affixflag) == 0)
return true;
@@ -477,31 +682,33 @@ IsAffixFlagInUse(IspellDict *Conf, int affix, const char *affixflag)
/*
* Adds the new word into the temporary array Spell.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* word: new word.
* flag: set of affix flags. Single flag can be get by getNextFlagFromString().
*/
static void
-NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
+NIAddSpell(IspellDictBuild *ConfBuild, const char *word, const char *flag)
{
- if (Conf->nspell >= Conf->mspell)
+ if (ConfBuild->nSpell >= ConfBuild->mSpell)
{
- if (Conf->mspell)
+ if (ConfBuild->mSpell)
{
- Conf->mspell *= 2;
- Conf->Spell = (SPELL **) repalloc(Conf->Spell, Conf->mspell * sizeof(SPELL *));
+ ConfBuild->mSpell *= 2;
+ ConfBuild->Spell = (SPELL **) repalloc(ConfBuild->Spell,
+ ConfBuild->mSpell * sizeof(SPELL *));
}
else
{
- Conf->mspell = 1024 * 20;
- Conf->Spell = (SPELL **) tmpalloc(Conf->mspell * sizeof(SPELL *));
+ ConfBuild->mSpell = 1024 * 20;
+ ConfBuild->Spell = (SPELL **) tmpalloc(ConfBuild->mSpell * sizeof(SPELL *));
}
}
- Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
- strcpy(Conf->Spell[Conf->nspell]->word, word);
- Conf->Spell[Conf->nspell]->p.flag = (*flag != '\0')
+ ConfBuild->Spell[ConfBuild->nSpell] =
+ (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
+ strcpy(ConfBuild->Spell[ConfBuild->nSpell]->word, word);
+ ConfBuild->Spell[ConfBuild->nSpell]->p.flag = (*flag != '\0')
? tmpstrdup(flag) : VoidString;
- Conf->nspell++;
+ ConfBuild->nSpell++;
}
/*
@@ -509,11 +716,11 @@ NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
*
* Note caller must already have applied get_tsearch_config_filename.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* filename: path to the .dict file.
*/
void
-NIImportDictionary(IspellDict *Conf, const char *filename)
+NIImportDictionary(IspellDictBuild *ConfBuild, const char *filename)
{
tsearch_readline_state trst;
char *line;
@@ -564,9 +771,9 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
}
s += pg_mblen(s);
}
- pstr = lowerstr_ctx(Conf, line);
+ pstr = lowerstr_ctx(ConfBuild, line);
- NIAddSpell(Conf, pstr, flag);
+ NIAddSpell(ConfBuild, pstr, flag);
pfree(pstr);
pfree(line);
@@ -590,7 +797,7 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
* SFX M 0 's .
* is presented here.
*
- * Conf: current dictionary.
+ * dict: current dictionary.
* word: basic form of word.
* affixflag: affix flag, by which a basic form of word was generated.
* flag: compound flag used to compare with StopMiddle->compoundflag.
@@ -598,9 +805,9 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
* Returns 1 if the word was found in the prefix tree, else returns 0.
*/
static int
-FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
+FindWord(IspellDictData *dict, const char *word, const char *affixflag, int flag)
{
- SPNode *node = Conf->Dictionary;
+ SPNode *node = (SPNode *) DictDictNodes(dict);
SPNodeData *StopLow,
*StopHigh,
*StopMiddle;
@@ -636,10 +843,14 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
* Check if this affix rule is presented in the affix set
* with index StopMiddle->affix.
*/
- if (IsAffixFlagInUse(Conf, StopMiddle->affix, affixflag))
+ if (IsAffixFlagInUse(dict->flagMode,
+ DictAffixDataGet(dict, StopMiddle->affix),
+ affixflag))
return 1;
}
- node = StopMiddle->node;
+ /* Retrieve SPNode by the offset */
+ node = (SPNode *) DictNodeGet(DictDictNodes(dict),
+ StopMiddle->node_offset);
ptr++;
break;
}
@@ -657,7 +868,8 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
/*
* Adds a new affix rule to the Affix field.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary; it is used to
+ * allocate temporary data.
* flag: affix flag ('\' in the below example).
* flagflags: set of flags from the flagval field for this affix rule. This set
* is listed after '/' character in the added string (repl).
@@ -673,26 +885,54 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
* type: FF_SUFFIX or FF_PREFIX.
*/
static void
-NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
- const char *find, const char *repl, int type)
+NIAddAffix(IspellDictBuild *ConfBuild, const char *flag, char flagflags,
+ const char *mask, const char *find, const char *repl, int type)
{
AFFIX *Affix;
+ uint32 size;
+ uint32 flaglen = strlen(flag),
+ findlen = strlen(find),
+ repllen = strlen(repl),
+ masklen = strlen(mask);
+
+ /* Sanity checks */
+ if (flaglen > AF_FLAG_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix flag \"%s\" too long", flag)));
+ if (findlen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix find field \"%s\" too long", find)));
+ if (repllen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix repl field \"%s\" too long", repl)));
+ if (masklen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix mask field \"%s\" too long", mask)));
- if (Conf->naffixes >= Conf->maffixes)
+ if (ConfBuild->nAffix >= ConfBuild->mAffix)
{
- if (Conf->maffixes)
+ if (ConfBuild->mAffix)
{
- Conf->maffixes *= 2;
- Conf->Affix = (AFFIX *) repalloc((void *) Conf->Affix, Conf->maffixes * sizeof(AFFIX));
+ ConfBuild->mAffix *= 2;
+ ConfBuild->Affix = (AFFIX **) repalloc(ConfBuild->Affix,
+ ConfBuild->mAffix * sizeof(AFFIX *));
}
else
{
- Conf->maffixes = 16;
- Conf->Affix = (AFFIX *) palloc(Conf->maffixes * sizeof(AFFIX));
+ ConfBuild->mAffix = 255;
+ ConfBuild->Affix = (AFFIX **) tmpalloc(ConfBuild->mAffix * sizeof(AFFIX *));
}
}
- Affix = Conf->Affix + Conf->naffixes;
+ size = AFFIXHDRSZ + flaglen + 1 /* \0 */ + findlen + 1 /* \0 */ +
+ repllen + 1 /* \0 */ + masklen + 1 /* \0 */;
+
+ Affix = (AFFIX *) tmpalloc(size);
+ ConfBuild->Affix[ConfBuild->nAffix] = Affix;
/* This affix rule can be applied for words with any ending */
if (strcmp(mask, ".") == 0 || *mask == '\0')
@@ -705,42 +945,12 @@ NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
{
Affix->issimple = 0;
Affix->isregis = 1;
- RS_compile(&(Affix->reg.regis), (type == FF_SUFFIX),
- *mask ? mask : VoidString);
}
/* This affix rule will use regex_t to search word ending */
else
{
- int masklen;
- int wmasklen;
- int err;
- pg_wchar *wmask;
- char *tmask;
-
Affix->issimple = 0;
Affix->isregis = 0;
- tmask = (char *) tmpalloc(strlen(mask) + 3);
- if (type == FF_SUFFIX)
- sprintf(tmask, "%s$", mask);
- else
- sprintf(tmask, "^%s", mask);
-
- masklen = strlen(tmask);
- wmask = (pg_wchar *) tmpalloc((masklen + 1) * sizeof(pg_wchar));
- wmasklen = pg_mb2wchar_with_len(tmask, wmask, masklen);
-
- err = pg_regcomp(&(Affix->reg.regex), wmask, wmasklen,
- REG_ADVANCED | REG_NOSUB,
- DEFAULT_COLLATION_OID);
- if (err)
- {
- char errstr[100];
-
- pg_regerror(err, &(Affix->reg.regex), errstr, sizeof(errstr));
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
- errmsg("invalid regular expression: %s", errstr)));
- }
}
Affix->flagflags = flagflags;
@@ -749,15 +959,22 @@ NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
if ((Affix->flagflags & FF_COMPOUNDFLAG) == 0)
Affix->flagflags |= FF_COMPOUNDFLAG;
}
- Affix->flag = cpstrdup(Conf, flag);
+
Affix->type = type;
- Affix->find = (find && *find) ? cpstrdup(Conf, find) : VoidString;
- if ((Affix->replen = strlen(repl)) > 0)
- Affix->repl = cpstrdup(Conf, repl);
- else
- Affix->repl = VoidString;
- Conf->naffixes++;
+ Affix->replen = repllen;
+ StrNCpy(AffixFieldRepl(Affix), repl, repllen + 1);
+
+ Affix->findlen = findlen;
+ StrNCpy(AffixFieldFind(Affix), find, findlen + 1);
+
+ Affix->masklen = masklen;
+ StrNCpy(AffixFieldMask(Affix), mask, masklen + 1);
+
+ StrNCpy(AffixFieldFlag(Affix), flag, flaglen + 1);
+
+ ConfBuild->nAffix++;
+ ConfBuild->AffixSize += size;
}
/* Parsing states for parse_affentry() and friends */
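
NIAddAffix() above now stores each rule as one variable-length record, with the strings placed directly after a fixed header, so the record can be copied into the flat dictionary without fixing up pointers. The sketch below shows the general technique for two fields only. It is not the patch's AFFIX definition: Rule, rule_new() and the Rule* macros are hypothetical, and the real field order behind AffixFieldRepl() and friends is not shown in this part of the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Fixed header followed by the strings themselves: find\0repl\0 */
typedef struct Rule
{
    uint16_t    findlen;    /* length of find, without the trailing \0 */
    uint16_t    repllen;    /* length of repl, without the trailing \0 */
    char        fields[];   /* string data follows the header */
} Rule;

#define RULE_HDRSZ  offsetof(Rule, fields)
#define RuleFind(r) ((r)->fields)
#define RuleRepl(r) ((r)->fields + (r)->findlen + 1)
#define RuleSize(r) (RULE_HDRSZ + (r)->findlen + 1 + (r)->repllen + 1)

/*
 * Build one record. Returns NULL if a string is too long for the 16-bit
 * length fields, mirroring the sanity checks done before allocation.
 */
Rule *
rule_new(const char *find, const char *repl)
{
    size_t      findlen = strlen(find);
    size_t      repllen = strlen(repl);
    Rule       *r;

    if (findlen > UINT16_MAX || repllen > UINT16_MAX)
        return NULL;

    r = malloc(RULE_HDRSZ + findlen + 1 + repllen + 1);
    if (r == NULL)
        abort();            /* sketch only: no real error reporting */

    /* The lengths must be set first: RuleRepl() depends on findlen. */
    r->findlen = (uint16_t) findlen;
    r->repllen = (uint16_t) repllen;
    memcpy(RuleFind(r), find, findlen + 1);
    memcpy(RuleRepl(r), repl, repllen + 1);
    return r;
}
```

Since RuleSize() can be computed from the header alone, an array of such records can be laid out back to back with a separate offsets array, which is how the Affix section is copied in NICopyData().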
@@ -1021,10 +1238,10 @@ parse_affentry(char *str, char *mask, char *find, char *repl)
* Sets a Hunspell options depending on flag type.
*/
static void
-setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
+setCompoundAffixFlagValue(IspellDictBuild *ConfBuild, CompoundAffixFlag *entry,
char *s, uint32 val)
{
- if (Conf->flagMode == FM_NUM)
+ if (ConfBuild->dict->flagMode == FM_NUM)
{
char *next;
int i;
@@ -1044,19 +1261,19 @@ setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
else
entry->flag.s = tmpstrdup(s);
- entry->flagMode = Conf->flagMode;
+ entry->flagMode = ConfBuild->dict->flagMode;
entry->value = val;
}
/*
* Sets up a correspondence for the affix parameter with the affix flag.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* s: affix flag in string.
* val: affix parameter.
*/
static void
-addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
+addCompoundAffixFlagValue(IspellDictBuild *ConfBuild, char *s, uint32 val)
{
CompoundAffixFlag *newValue;
char sbuf[BUFSIZ];
@@ -1083,29 +1300,29 @@ addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
*sflag = '\0';
/* Resize array or allocate memory for array CompoundAffixFlag */
- if (Conf->nCompoundAffixFlag >= Conf->mCompoundAffixFlag)
+ if (ConfBuild->nCompoundAffixFlag >= ConfBuild->mCompoundAffixFlag)
{
- if (Conf->mCompoundAffixFlag)
+ if (ConfBuild->mCompoundAffixFlag)
{
- Conf->mCompoundAffixFlag *= 2;
- Conf->CompoundAffixFlags = (CompoundAffixFlag *)
- repalloc((void *) Conf->CompoundAffixFlags,
- Conf->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
+ ConfBuild->mCompoundAffixFlag *= 2;
+ ConfBuild->CompoundAffixFlags = (CompoundAffixFlag *)
+ repalloc((void *) ConfBuild->CompoundAffixFlags,
+ ConfBuild->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
}
else
{
- Conf->mCompoundAffixFlag = 10;
- Conf->CompoundAffixFlags = (CompoundAffixFlag *)
- tmpalloc(Conf->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
+ ConfBuild->mCompoundAffixFlag = 10;
+ ConfBuild->CompoundAffixFlags = (CompoundAffixFlag *)
+ tmpalloc(ConfBuild->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
}
}
- newValue = Conf->CompoundAffixFlags + Conf->nCompoundAffixFlag;
+ newValue = ConfBuild->CompoundAffixFlags + ConfBuild->nCompoundAffixFlag;
- setCompoundAffixFlagValue(Conf, newValue, sbuf, val);
+ setCompoundAffixFlagValue(ConfBuild, newValue, sbuf, val);
- Conf->usecompound = true;
- Conf->nCompoundAffixFlag++;
+ ConfBuild->dict->usecompound = true;
+ ConfBuild->nCompoundAffixFlag++;
}
/*
@@ -1113,7 +1330,7 @@ addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
* flags s.
*/
static int
-getCompoundAffixFlagValue(IspellDict *Conf, char *s)
+getCompoundAffixFlagValue(IspellDictBuild *ConfBuild, char *s)
{
uint32 flag = 0;
CompoundAffixFlag *found,
@@ -1121,18 +1338,18 @@ getCompoundAffixFlagValue(IspellDict *Conf, char *s)
char sflag[BUFSIZ];
char *flagcur;
- if (Conf->nCompoundAffixFlag == 0)
+ if (ConfBuild->nCompoundAffixFlag == 0)
return 0;
flagcur = s;
while (*flagcur)
{
- getNextFlagFromString(Conf, &flagcur, sflag);
- setCompoundAffixFlagValue(Conf, &key, sflag, 0);
+ getNextFlagFromString(ConfBuild->dict->flagMode, &flagcur, sflag);
+ setCompoundAffixFlagValue(ConfBuild, &key, sflag, 0);
found = (CompoundAffixFlag *)
- bsearch(&key, (void *) Conf->CompoundAffixFlags,
- Conf->nCompoundAffixFlag, sizeof(CompoundAffixFlag),
+ bsearch(&key, (void *) ConfBuild->CompoundAffixFlags,
+ ConfBuild->nCompoundAffixFlag, sizeof(CompoundAffixFlag),
cmpcmdflag);
if (found != NULL)
flag |= found->value;
@@ -1144,14 +1361,13 @@ getCompoundAffixFlagValue(IspellDict *Conf, char *s)
/*
* Returns a flag set using the s parameter.
*
- * If Conf->useFlagAliases is true then the s parameter is index of the
- * Conf->AffixData array and function returns its entry.
- * Else function returns the s parameter.
+ * If useFlagAliases is true, the s parameter is an index into the AffixData
+ * array and the function returns that entry. Otherwise it returns s itself.
*/
static char *
-getAffixFlagSet(IspellDict *Conf, char *s)
+getAffixFlagSet(IspellDictBuild *ConfBuild, char *s)
{
- if (Conf->useFlagAliases && *s != '\0')
+ if (ConfBuild->dict->useFlagAliases && *s != '\0')
{
int curaffix;
char *end;
@@ -1162,13 +1378,13 @@ getAffixFlagSet(IspellDict *Conf, char *s)
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix alias \"%s\"", s)));
- if (curaffix > 0 && curaffix <= Conf->nAffixData)
+ if (curaffix > 0 && curaffix <= ConfBuild->nAffixData)
/*
* Do not subtract 1 from curaffix because empty string was added
* in NIImportOOAffixes
*/
- return Conf->AffixData[curaffix];
+ return AffixDataGet(ConfBuild, curaffix);
else
return VoidString;
}
@@ -1179,11 +1395,11 @@ getAffixFlagSet(IspellDict *Conf, char *s)
/*
* Import an affix file that follows MySpell or Hunspell format.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* filename: path to the .affix file.
*/
static void
-NIImportOOAffixes(IspellDict *Conf, const char *filename)
+NIImportOOAffixes(IspellDictBuild *ConfBuild, const char *filename)
{
char type[BUFSIZ],
*ptype = NULL;
@@ -1195,17 +1411,16 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
char repl[BUFSIZ],
*prepl;
bool isSuffix = false;
- int naffix = 0,
- curaffix = 0;
+ int naffix = 0;
int sflaglen = 0;
char flagflags = 0;
tsearch_readline_state trst;
char *recoded;
/* read file to find any flag */
- Conf->usecompound = false;
- Conf->useFlagAliases = false;
- Conf->flagMode = FM_CHAR;
+ ConfBuild->dict->usecompound = false;
+ ConfBuild->dict->useFlagAliases = false;
+ ConfBuild->dict->flagMode = FM_CHAR;
if (!tsearch_readline_begin(&trst, filename))
ereport(ERROR,
@@ -1222,30 +1437,36 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
}
if (STRNCMP(recoded, "COMPOUNDFLAG") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDFLAG"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDFLAG"),
FF_COMPOUNDFLAG);
else if (STRNCMP(recoded, "COMPOUNDBEGIN") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDBEGIN"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDBEGIN"),
FF_COMPOUNDBEGIN);
else if (STRNCMP(recoded, "COMPOUNDLAST") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDLAST"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDLAST"),
FF_COMPOUNDLAST);
/* COMPOUNDLAST and COMPOUNDEND are synonyms */
else if (STRNCMP(recoded, "COMPOUNDEND") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDEND"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDEND"),
FF_COMPOUNDLAST);
else if (STRNCMP(recoded, "COMPOUNDMIDDLE") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDMIDDLE"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDMIDDLE"),
FF_COMPOUNDMIDDLE);
else if (STRNCMP(recoded, "ONLYINCOMPOUND") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("ONLYINCOMPOUND"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("ONLYINCOMPOUND"),
FF_COMPOUNDONLY);
else if (STRNCMP(recoded, "COMPOUNDPERMITFLAG") == 0)
- addCompoundAffixFlagValue(Conf,
+ addCompoundAffixFlagValue(ConfBuild,
recoded + strlen("COMPOUNDPERMITFLAG"),
FF_COMPOUNDPERMITFLAG);
else if (STRNCMP(recoded, "COMPOUNDFORBIDFLAG") == 0)
- addCompoundAffixFlagValue(Conf,
+ addCompoundAffixFlagValue(ConfBuild,
recoded + strlen("COMPOUNDFORBIDFLAG"),
FF_COMPOUNDFORBIDFLAG);
else if (STRNCMP(recoded, "FLAG") == 0)
@@ -1258,9 +1479,9 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
if (*s)
{
if (STRNCMP(s, "long") == 0)
- Conf->flagMode = FM_LONG;
+ ConfBuild->dict->flagMode = FM_LONG;
else if (STRNCMP(s, "num") == 0)
- Conf->flagMode = FM_NUM;
+ ConfBuild->dict->flagMode = FM_NUM;
else if (STRNCMP(s, "default") != 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
@@ -1274,8 +1495,8 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
}
tsearch_readline_end(&trst);
- if (Conf->nCompoundAffixFlag > 1)
- qsort((void *) Conf->CompoundAffixFlags, Conf->nCompoundAffixFlag,
+ if (ConfBuild->nCompoundAffixFlag > 1)
+ qsort((void *) ConfBuild->CompoundAffixFlags, ConfBuild->nCompoundAffixFlag,
sizeof(CompoundAffixFlag), cmpcmdflag);
if (!tsearch_readline_begin(&trst, filename))
@@ -1295,15 +1516,15 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
if (ptype)
pfree(ptype);
- ptype = lowerstr_ctx(Conf, type);
+ ptype = lowerstr_ctx(ConfBuild, type);
/* First try to parse AF parameter (alias compression) */
if (STRNCMP(ptype, "af") == 0)
{
/* First line is the number of aliases */
- if (!Conf->useFlagAliases)
+ if (!ConfBuild->dict->useFlagAliases)
{
- Conf->useFlagAliases = true;
+ ConfBuild->dict->useFlagAliases = true;
naffix = atoi(sflag);
if (naffix == 0)
ereport(ERROR,
@@ -1313,21 +1534,15 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
/* Also reserve place for empty flag set */
naffix++;
- Conf->AffixData = (char **) palloc0(naffix * sizeof(char *));
- Conf->lenAffixData = Conf->nAffixData = naffix;
+ InitAffixData(ConfBuild, naffix);
/* Add empty flag set into AffixData */
- Conf->AffixData[curaffix] = VoidString;
- curaffix++;
+ AddAffixSet(ConfBuild, VoidString, 0);
}
/* Other lines is aliases */
else
{
- if (curaffix < naffix)
- {
- Conf->AffixData[curaffix] = cpstrdup(Conf, sflag);
- curaffix++;
- }
+ AddAffixSet(ConfBuild, sflag, strlen(sflag));
}
goto nextline;
}
@@ -1338,8 +1553,8 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
sflaglen = strlen(sflag);
if (sflaglen == 0
- || (sflaglen > 1 && Conf->flagMode == FM_CHAR)
- || (sflaglen > 2 && Conf->flagMode == FM_LONG))
+ || (sflaglen > 1 && ConfBuild->dict->flagMode == FM_CHAR)
+ || (sflaglen > 2 && ConfBuild->dict->flagMode == FM_LONG))
goto nextline;
/*--------
@@ -1367,21 +1582,21 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
/* Get flags after '/' (flags are case sensitive) */
if ((ptr = strchr(repl, '/')) != NULL)
- aflg |= getCompoundAffixFlagValue(Conf,
- getAffixFlagSet(Conf,
+ aflg |= getCompoundAffixFlagValue(ConfBuild,
+ getAffixFlagSet(ConfBuild,
ptr + 1));
/* Get lowercased version of string before '/' */
- prepl = lowerstr_ctx(Conf, repl);
+ prepl = lowerstr_ctx(ConfBuild, repl);
if ((ptr = strchr(prepl, '/')) != NULL)
*ptr = '\0';
- pfind = lowerstr_ctx(Conf, find);
- pmask = lowerstr_ctx(Conf, mask);
+ pfind = lowerstr_ctx(ConfBuild, find);
+ pmask = lowerstr_ctx(ConfBuild, mask);
if (t_iseq(find, '0'))
*pfind = '\0';
if (t_iseq(repl, '0'))
*prepl = '\0';
- NIAddAffix(Conf, sflag, flagflags | aflg, pmask, pfind, prepl,
+ NIAddAffix(ConfBuild, sflag, flagflags | aflg, pmask, pfind, prepl,
isSuffix ? FF_SUFFIX : FF_PREFIX);
pfree(prepl);
pfree(pfind);
@@ -1407,7 +1622,7 @@ nextline:
* work to NIImportOOAffixes(), which will re-read the whole file.
*/
void
-NIImportAffixes(IspellDict *Conf, const char *filename)
+NIImportAffixes(IspellDictBuild *ConfBuild, const char *filename)
{
char *pstr = NULL;
char flag[BUFSIZ];
@@ -1428,9 +1643,9 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
errmsg("could not open affix file \"%s\": %m",
filename)));
- Conf->usecompound = false;
- Conf->useFlagAliases = false;
- Conf->flagMode = FM_CHAR;
+ ConfBuild->dict->usecompound = false;
+ ConfBuild->dict->useFlagAliases = false;
+ ConfBuild->dict->flagMode = FM_CHAR;
while ((recoded = tsearch_readline(&trst)) != NULL)
{
@@ -1452,10 +1667,8 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
s += pg_mblen(s);
if (*s && pg_mblen(s) == 1)
- {
- addCompoundAffixFlagValue(Conf, s, FF_COMPOUNDFLAG);
- Conf->usecompound = true;
- }
+ addCompoundAffixFlagValue(ConfBuild, s, FF_COMPOUNDFLAG);
+
oldformat = true;
goto nextline;
}
@@ -1528,7 +1741,8 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
if (!parse_affentry(pstr, mask, find, repl))
goto nextline;
- NIAddAffix(Conf, flag, flagflags, mask, find, repl, suffixes ? FF_SUFFIX : FF_PREFIX);
+ NIAddAffix(ConfBuild, flag, flagflags, mask, find, repl,
+ suffixes ? FF_SUFFIX : FF_PREFIX);
nextline:
pfree(recoded);
@@ -1547,53 +1761,48 @@ isnewformat:
errmsg("affix file contains both old-style and new-style commands")));
tsearch_readline_end(&trst);
- NIImportOOAffixes(Conf, filename);
+ NIImportOOAffixes(ConfBuild, filename);
}
/*
* Merges two affix flag sets and stores a new affix flag set into
- * Conf->AffixData.
+ * ConfBuild->AffixData.
*
* Returns index of a new affix flag set.
*/
static int
-MergeAffix(IspellDict *Conf, int a1, int a2)
+MergeAffix(IspellDictBuild *ConfBuild, int a1, int a2)
{
- char **ptr;
+ char *ptr;
+ uint32 len;
/* Do not merge affix flags if one of affix flags is empty */
- if (*Conf->AffixData[a1] == '\0')
+ if (*AffixDataGet(ConfBuild, a1) == '\0')
return a2;
- else if (*Conf->AffixData[a2] == '\0')
+ else if (*AffixDataGet(ConfBuild, a2) == '\0')
return a1;
- while (Conf->nAffixData + 1 >= Conf->lenAffixData)
- {
- Conf->lenAffixData *= 2;
- Conf->AffixData = (char **) repalloc(Conf->AffixData,
- sizeof(char *) * Conf->lenAffixData);
- }
-
- ptr = Conf->AffixData + Conf->nAffixData;
- if (Conf->flagMode == FM_NUM)
+ if (ConfBuild->dict->flagMode == FM_NUM)
{
- *ptr = cpalloc(strlen(Conf->AffixData[a1]) +
- strlen(Conf->AffixData[a2]) +
- 1 /* comma */ + 1 /* \0 */ );
- sprintf(*ptr, "%s,%s", Conf->AffixData[a1], Conf->AffixData[a2]);
+ len = strlen(AffixDataGet(ConfBuild, a1)) + 1 /* comma */ +
+ strlen(AffixDataGet(ConfBuild, a2));
+ ptr = tmpalloc(len + 1 /* \0 */);
+ sprintf(ptr, "%s,%s", AffixDataGet(ConfBuild, a1),
+ AffixDataGet(ConfBuild, a2));
}
else
{
- *ptr = cpalloc(strlen(Conf->AffixData[a1]) +
- strlen(Conf->AffixData[a2]) +
- 1 /* \0 */ );
- sprintf(*ptr, "%s%s", Conf->AffixData[a1], Conf->AffixData[a2]);
+ len = strlen(AffixDataGet(ConfBuild, a1)) +
+ strlen(AffixDataGet(ConfBuild, a2));
+ ptr = tmpalloc(len + 1 /* \0 */ );
+ sprintf(ptr, "%s%s", AffixDataGet(ConfBuild, a1),
+ AffixDataGet(ConfBuild, a2));
}
- ptr++;
- *ptr = NULL;
- Conf->nAffixData++;
- return Conf->nAffixData - 1;
+ AddAffixSet(ConfBuild, ptr, len);
+ pfree(ptr);
+
+ return ConfBuild->nAffixData - 1;
}
/*
@@ -1601,66 +1810,87 @@ MergeAffix(IspellDict *Conf, int a1, int a2)
* flags with the given index.
*/
static uint32
-makeCompoundFlags(IspellDict *Conf, int affix)
+makeCompoundFlags(IspellDictBuild *ConfBuild, int affix)
{
- char *str = Conf->AffixData[affix];
+ char *str = AffixDataGet(ConfBuild, affix);
- return (getCompoundAffixFlagValue(Conf, str) & FF_COMPOUNDFLAGMASK);
+ return (getCompoundAffixFlagValue(ConfBuild, str) & FF_COMPOUNDFLAGMASK);
}
/*
* Makes a prefix tree for the given level.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* low: lower index of the Conf->Spell array.
* high: upper index of the Conf->Spell array.
* level: current prefix tree level.
+ *
+ * Returns an offset of SPNode in DictNodes.
*/
-static SPNode *
-mkSPNode(IspellDict *Conf, int low, int high, int level)
+static uint32
+mkSPNode(IspellDictBuild *ConfBuild, int low, int high, int level)
{
int i;
int nchar = 0;
char lastchar = '\0';
+ uint32 rs_offset,
+ new_offset;
SPNode *rs;
SPNodeData *data;
+ int data_index = 0;
int lownew = low;
for (i = low; i < high; i++)
- if (Conf->Spell[i]->p.d.len > level && lastchar != Conf->Spell[i]->word[level])
+ if (ConfBuild->Spell[i]->p.d.len > level &&
+ lastchar != ConfBuild->Spell[i]->word[level])
{
nchar++;
- lastchar = Conf->Spell[i]->word[level];
+ lastchar = ConfBuild->Spell[i]->word[level];
}
if (!nchar)
- return NULL;
+ return ISPELL_INVALID_OFFSET;
- rs = (SPNode *) cpalloc0(SPNHDRSZ + nchar * sizeof(SPNodeData));
- rs->length = nchar;
+ rs_offset = AllocateSPNode(ConfBuild, nchar);
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
data = rs->data;
lastchar = '\0';
for (i = low; i < high; i++)
- if (Conf->Spell[i]->p.d.len > level)
+ if (ConfBuild->Spell[i]->p.d.len > level)
{
- if (lastchar != Conf->Spell[i]->word[level])
+ if (lastchar != ConfBuild->Spell[i]->word[level])
{
if (lastchar)
{
/* Next level of the prefix tree */
- data->node = mkSPNode(Conf, lownew, i, level + 1);
+ new_offset = mkSPNode(ConfBuild, lownew, i, level + 1);
+
+ /*
+ * ConfBuild->DictNodes can be repalloc'ed within
+ * mkSPNode(), so reinitialize pointers.
+ */
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
+
+ /* First save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ /* Work with next node */
+ data_index++;
+ Assert(data_index < nchar);
+ data = &(rs->data[data_index]);
+
lownew = i;
- data++;
}
- lastchar = Conf->Spell[i]->word[level];
+ lastchar = ConfBuild->Spell[i]->word[level];
}
- data->val = ((uint8 *) (Conf->Spell[i]->word))[level];
- if (Conf->Spell[i]->p.d.len == level + 1)
+ data->val = ((uint8 *) (ConfBuild->Spell[i]->word))[level];
+ if (ConfBuild->Spell[i]->p.d.len == level + 1)
{
bool clearCompoundOnly = false;
- if (data->isword && data->affix != Conf->Spell[i]->p.d.affix)
+ if (data->isword && data->affix != ConfBuild->Spell[i]->p.d.affix)
{
/*
* MergeAffix called a few times. If one of word is
@@ -1669,15 +1899,17 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
*/
clearCompoundOnly = (FF_COMPOUNDONLY & data->compoundflag
- & makeCompoundFlags(Conf, Conf->Spell[i]->p.d.affix))
+ & makeCompoundFlags(ConfBuild,
+ ConfBuild->Spell[i]->p.d.affix))
? false : true;
- data->affix = MergeAffix(Conf, data->affix, Conf->Spell[i]->p.d.affix);
+ data->affix = MergeAffix(ConfBuild, data->affix,
+ ConfBuild->Spell[i]->p.d.affix);
}
else
- data->affix = Conf->Spell[i]->p.d.affix;
+ data->affix = ConfBuild->Spell[i]->p.d.affix;
data->isword = 1;
- data->compoundflag = makeCompoundFlags(Conf, data->affix);
+ data->compoundflag = makeCompoundFlags(ConfBuild, data->affix);
if ((data->compoundflag & FF_COMPOUNDONLY) &&
(data->compoundflag & FF_COMPOUNDFLAG) == 0)
@@ -1689,9 +1921,19 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
}
/* Next level of the prefix tree */
- data->node = mkSPNode(Conf, lownew, high, level + 1);
+ new_offset = mkSPNode(ConfBuild, lownew, high, level + 1);
+
+ /*
+ * ConfBuild->DictNodes can be repalloc'ed within mkSPNode(), so
+ * reinitialize pointers.
+ */
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
- return rs;
+ /* Save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ return rs_offset;
}
/*
@@ -1699,7 +1941,7 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
* and affixes.
*/
void
-NISortDictionary(IspellDict *Conf)
+NISortDictionary(IspellDictBuild *ConfBuild)
{
int i;
int naffix = 0;
@@ -1708,81 +1950,81 @@ NISortDictionary(IspellDict *Conf)
/* compress affixes */
/*
- * If we use flag aliases then we need to use Conf->AffixData filled in
+ * If we use flag aliases then we need to use ConfBuild->AffixData filled in
* the NIImportOOAffixes().
*/
- if (Conf->useFlagAliases)
+ if (ConfBuild->dict->useFlagAliases)
{
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
char *end;
- if (*Conf->Spell[i]->p.flag != '\0')
+ if (*ConfBuild->Spell[i]->p.flag != '\0')
{
- curaffix = strtol(Conf->Spell[i]->p.flag, &end, 10);
- if (Conf->Spell[i]->p.flag == end || errno == ERANGE)
+ curaffix = strtol(ConfBuild->Spell[i]->p.flag, &end, 10);
+ if (ConfBuild->Spell[i]->p.flag == end || errno == ERANGE)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix alias \"%s\"",
- Conf->Spell[i]->p.flag)));
+ ConfBuild->Spell[i]->p.flag)));
}
else
{
/*
- * If Conf->Spell[i]->p.flag is empty, then get empty value of
- * Conf->AffixData (0 index).
+ * If ConfBuild->Spell[i]->p.flag is empty, then get empty
+ * value of ConfBuild->AffixData (0 index).
*/
curaffix = 0;
}
- Conf->Spell[i]->p.d.affix = curaffix;
- Conf->Spell[i]->p.d.len = strlen(Conf->Spell[i]->word);
+ ConfBuild->Spell[i]->p.d.affix = curaffix;
+ ConfBuild->Spell[i]->p.d.len = strlen(ConfBuild->Spell[i]->word);
}
}
- /* Otherwise fill Conf->AffixData here */
+ /* Otherwise fill ConfBuild->AffixData here */
else
{
/* Count the number of different flags used in the dictionary */
- qsort((void *) Conf->Spell, Conf->nspell, sizeof(SPELL *),
+ qsort((void *) ConfBuild->Spell, ConfBuild->nSpell, sizeof(SPELL *),
cmpspellaffix);
naffix = 0;
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
if (i == 0
- || strcmp(Conf->Spell[i]->p.flag, Conf->Spell[i - 1]->p.flag))
+ || strcmp(ConfBuild->Spell[i]->p.flag,
+ ConfBuild->Spell[i - 1]->p.flag))
naffix++;
}
/*
- * Fill in Conf->AffixData with the affixes that were used in the
- * dictionary. Replace textual flag-field of Conf->Spell entries with
- * indexes into Conf->AffixData array.
+ * Fill in AffixData with the affixes that were used in the
+ * dictionary. Replace textual flag-field of ConfBuild->Spell entries
+ * with indexes into ConfBuild->AffixData array.
*/
- Conf->AffixData = (char **) palloc0(naffix * sizeof(char *));
+ InitAffixData(ConfBuild, naffix);
curaffix = -1;
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
if (i == 0
- || strcmp(Conf->Spell[i]->p.flag, Conf->AffixData[curaffix]))
+ || strcmp(ConfBuild->Spell[i]->p.flag,
+ AffixDataGet(ConfBuild, curaffix)))
{
curaffix++;
Assert(curaffix < naffix);
- Conf->AffixData[curaffix] = cpstrdup(Conf,
- Conf->Spell[i]->p.flag);
+ AddAffixSet(ConfBuild, ConfBuild->Spell[i]->p.flag,
+ strlen(ConfBuild->Spell[i]->p.flag));
}
- Conf->Spell[i]->p.d.affix = curaffix;
- Conf->Spell[i]->p.d.len = strlen(Conf->Spell[i]->word);
+ ConfBuild->Spell[i]->p.d.affix = curaffix;
+ ConfBuild->Spell[i]->p.d.len = strlen(ConfBuild->Spell[i]->word);
}
-
- Conf->lenAffixData = Conf->nAffixData = naffix;
}
/* Start build a prefix tree */
- qsort((void *) Conf->Spell, Conf->nspell, sizeof(SPELL *), cmpspell);
- Conf->Dictionary = mkSPNode(Conf, 0, Conf->nspell, 0);
+ qsort((void *) ConfBuild->Spell, ConfBuild->nSpell, sizeof(SPELL *), cmpspell);
+ mkSPNode(ConfBuild, 0, ConfBuild->nSpell, 0);
}
/*
@@ -1790,83 +2032,104 @@ NISortDictionary(IspellDict *Conf)
* rule. Affixes with empty replace string do not include in the prefix tree.
* This affixes are included by mkVoidAffix().
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* low: lower index of the Conf->Affix array.
* high: upper index of the Conf->Affix array.
* level: current prefix tree level.
* type: FF_SUFFIX or FF_PREFIX.
+ *
+ * Returns the offset of the new node in the prefix or suffix nodes array.
*/
-static AffixNode *
-mkANode(IspellDict *Conf, int low, int high, int level, int type)
+static uint32
+mkANode(IspellDictBuild *ConfBuild, int low, int high, int level, int type)
{
int i;
int nchar = 0;
uint8 lastchar = '\0';
+ NodeArray *array;
+ uint32 rs_offset,
+ new_offset;
AffixNode *rs;
AffixNodeData *data;
+ int data_index = 0;
int lownew = low;
- int naff;
- AFFIX **aff;
for (i = low; i < high; i++)
- if (Conf->Affix[i].replen > level && lastchar != GETCHAR(Conf->Affix + i, level, type))
+ if (ConfBuild->Affix[i]->replen > level &&
+ lastchar != GETCHAR(ConfBuild->Affix[i], level, type))
{
nchar++;
- lastchar = GETCHAR(Conf->Affix + i, level, type);
+ lastchar = GETCHAR(ConfBuild->Affix[i], level, type);
}
if (!nchar)
- return NULL;
+ return ISPELL_INVALID_OFFSET;
- aff = (AFFIX **) tmpalloc(sizeof(AFFIX *) * (high - low + 1));
- naff = 0;
+ if (type == FF_SUFFIX)
+ array = &ConfBuild->SuffixNodes;
+ else
+ array = &ConfBuild->PrefixNodes;
- rs = (AffixNode *) cpalloc0(ANHRDSZ + nchar * sizeof(AffixNodeData));
- rs->length = nchar;
- data = rs->data;
+ rs_offset = AllocateAffixNode(ConfBuild, array, nchar);
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
+ data = (AffixNodeData *) rs->data;
lastchar = '\0';
for (i = low; i < high; i++)
- if (Conf->Affix[i].replen > level)
+ if (ConfBuild->Affix[i]->replen > level)
{
- if (lastchar != GETCHAR(Conf->Affix + i, level, type))
+ if (lastchar != GETCHAR(ConfBuild->Affix[i], level, type))
{
if (lastchar)
{
/* Next level of the prefix tree */
- data->node = mkANode(Conf, lownew, i, level + 1, type);
- if (naff)
- {
- data->naff = naff;
- data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * naff);
- memcpy(data->aff, aff, sizeof(AFFIX *) * naff);
- naff = 0;
- }
- data++;
+ new_offset = mkANode(ConfBuild, lownew, i, level + 1, type);
+
+ /*
+ * array can be repalloc'ed within mkANode(), so
+ * reinitialize pointers.
+ */
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
+
+ /* First save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ /* Handle next data node */
+ data_index++;
+ Assert(data_index < nchar);
+ data = &(rs->data[data_index]);
+
lownew = i;
}
- lastchar = GETCHAR(Conf->Affix + i, level, type);
+ lastchar = GETCHAR(ConfBuild->Affix[i], level, type);
}
- data->val = GETCHAR(Conf->Affix + i, level, type);
- if (Conf->Affix[i].replen == level + 1)
+ data->val = GETCHAR(ConfBuild->Affix[i], level, type);
+ if (ConfBuild->Affix[i]->replen == level + 1)
{ /* affix stopped */
- aff[naff++] = Conf->Affix + i;
+ if (data->affstart == ISPELL_INVALID_INDEX)
+ {
+ data->affstart = i;
+ data->affend = i;
+ }
+ else
+ data->affend = i;
}
}
/* Next level of the prefix tree */
- data->node = mkANode(Conf, lownew, high, level + 1, type);
- if (naff)
- {
- data->naff = naff;
- data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * naff);
- memcpy(data->aff, aff, sizeof(AFFIX *) * naff);
- naff = 0;
- }
+ new_offset = mkANode(ConfBuild, lownew, high, level + 1, type);
+
+ /*
+ * array can be repalloc'ed within mkANode(), so reinitialize pointers.
+ */
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
- pfree(aff);
+ /* Save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
- return rs;
+ return rs_offset;
}
/*
@@ -1874,137 +2137,154 @@ mkANode(IspellDict *Conf, int low, int high, int level, int type)
* for affixes which have empty replace string ("repl" field).
*/
static void
-mkVoidAffix(IspellDict *Conf, bool issuffix, int startsuffix)
+mkVoidAffix(IspellDictBuild *ConfBuild, bool issuffix, int startsuffix)
{
- int i,
- cnt = 0;
+ int i;
int start = (issuffix) ? startsuffix : 0;
- int end = (issuffix) ? Conf->naffixes : startsuffix;
- AffixNode *Affix = (AffixNode *) palloc0(ANHRDSZ + sizeof(AffixNodeData));
-
- Affix->length = 1;
- Affix->isvoid = 1;
+ int end = (issuffix) ? ConfBuild->nAffix : startsuffix;
+ uint32 node_offset;
+ NodeArray *array;
+ AffixNode *Affix;
+ AffixNodeData *AffixData;
if (issuffix)
- {
- Affix->data->node = Conf->Suffix;
- Conf->Suffix = Affix;
- }
+ array = &ConfBuild->SuffixNodes;
else
- {
- Affix->data->node = Conf->Prefix;
- Conf->Prefix = Affix;
- }
+ array = &ConfBuild->PrefixNodes;
- /* Count affixes with empty replace string */
- for (i = start; i < end; i++)
- if (Conf->Affix[i].replen == 0)
- cnt++;
-
- /* There is not affixes with empty replace string */
- if (cnt == 0)
- return;
+ node_offset = AllocateAffixNode(ConfBuild, array, 1);
+ Affix = (AffixNode *) NodeArrayGet(array, node_offset);
- Affix->data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * cnt);
- Affix->data->naff = (uint32) cnt;
+ Affix->isvoid = 1;
+ AffixData = (AffixNodeData *) Affix->data;
- cnt = 0;
for (i = start; i < end; i++)
- if (Conf->Affix[i].replen == 0)
+ if (ConfBuild->Affix[i]->replen == 0)
{
- Affix->data->aff[cnt] = Conf->Affix + i;
- cnt++;
+ if (AffixData->affstart == ISPELL_INVALID_INDEX)
+ {
+ AffixData->affstart = i;
+ AffixData->affend = i;
+ }
+ else
+ AffixData->affend = i;
}
}
/*
- * Checks if the affixflag is used by dictionary. Conf->AffixData does not
+ * Checks if the affixflag is used by dictionary. AffixData does not
* contain affixflag if this flag is not used actually by the .dict file.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* affixflag: affix flag.
*
- * Returns true if the Conf->AffixData array contains affixflag, otherwise
+ * Returns true if the ConfBuild->AffixData array contains affixflag, otherwise
* returns false.
*/
static bool
-isAffixInUse(IspellDict *Conf, char *affixflag)
+isAffixInUse(IspellDictBuild *ConfBuild, char *affixflag)
{
int i;
- for (i = 0; i < Conf->nAffixData; i++)
- if (IsAffixFlagInUse(Conf, i, affixflag))
+ for (i = 0; i < ConfBuild->nAffixData; i++)
+ if (IsAffixFlagInUse(ConfBuild->dict->flagMode,
+ AffixDataGet(ConfBuild, i), affixflag))
return true;
return false;
}
/*
- * Builds Conf->Prefix and Conf->Suffix trees from the imported affixes.
+ * Builds Prefix and Suffix trees from the imported affixes.
*/
void
-NISortAffixes(IspellDict *Conf)
+NISortAffixes(IspellDictBuild *ConfBuild)
{
AFFIX *Affix;
+ AffixNode *voidPrefix,
+ *voidSuffix;
size_t i;
CMPDAffix *ptr;
- int firstsuffix = Conf->naffixes;
+ int firstsuffix = ConfBuild->nAffix;
+ uint32 prefix_offset,
+ suffix_offset;
- if (Conf->naffixes == 0)
+ if (ConfBuild->nAffix == 0)
return;
/* Store compound affixes in the Conf->CompoundAffix array */
- if (Conf->naffixes > 1)
- qsort((void *) Conf->Affix, Conf->naffixes, sizeof(AFFIX), cmpaffix);
- Conf->CompoundAffix = ptr = (CMPDAffix *) palloc(sizeof(CMPDAffix) * Conf->naffixes);
- ptr->affix = NULL;
-
- for (i = 0; i < Conf->naffixes; i++)
+ if (ConfBuild->nAffix > 1)
+ qsort((void *) ConfBuild->Affix, ConfBuild->nAffix,
+ sizeof(AFFIX *), cmpaffix);
+ ConfBuild->nCompoundAffix = ConfBuild->nAffix;
+ ConfBuild->CompoundAffix = ptr =
+ (CMPDAffix *) tmpalloc(sizeof(CMPDAffix) * ConfBuild->nCompoundAffix);
+ ptr->affix = ISPELL_INVALID_INDEX;
+
+ for (i = 0; i < ConfBuild->nAffix; i++)
{
- Affix = &(((AFFIX *) Conf->Affix)[i]);
+ Affix = ConfBuild->Affix[i];
if (Affix->type == FF_SUFFIX && i < firstsuffix)
firstsuffix = i;
if ((Affix->flagflags & FF_COMPOUNDFLAG) && Affix->replen > 0 &&
- isAffixInUse(Conf, Affix->flag))
+ isAffixInUse(ConfBuild, AffixFieldFlag(Affix)))
{
- if (ptr == Conf->CompoundAffix ||
+ if (ptr == ConfBuild->CompoundAffix ||
ptr->issuffix != (ptr - 1)->issuffix ||
- strbncmp((const unsigned char *) (ptr - 1)->affix,
- (const unsigned char *) Affix->repl,
+ strbncmp((const unsigned char *) AffixFieldRepl(ConfBuild->Affix[(ptr - 1)->affix]),
+ (const unsigned char *) AffixFieldRepl(Affix),
(ptr - 1)->len))
{
/* leave only unique and minimals suffixes */
- ptr->affix = Affix->repl;
+ ptr->affix = i;
ptr->len = Affix->replen;
ptr->issuffix = (Affix->type == FF_SUFFIX);
ptr++;
}
}
}
- ptr->affix = NULL;
- Conf->CompoundAffix = (CMPDAffix *) repalloc(Conf->CompoundAffix, sizeof(CMPDAffix) * (ptr - Conf->CompoundAffix + 1));
+ ptr->affix = ISPELL_INVALID_INDEX;
+ ConfBuild->nCompoundAffix = ptr - ConfBuild->CompoundAffix + 1;
+ ConfBuild->CompoundAffix = (CMPDAffix *) repalloc(ConfBuild->CompoundAffix,
+ sizeof(CMPDAffix) * (ConfBuild->nCompoundAffix));
/* Start build a prefix tree */
- Conf->Prefix = mkANode(Conf, 0, firstsuffix, 0, FF_PREFIX);
- Conf->Suffix = mkANode(Conf, firstsuffix, Conf->naffixes, 0, FF_SUFFIX);
- mkVoidAffix(Conf, true, firstsuffix);
- mkVoidAffix(Conf, false, firstsuffix);
+ mkVoidAffix(ConfBuild, true, firstsuffix);
+ mkVoidAffix(ConfBuild, false, firstsuffix);
+
+ prefix_offset = mkANode(ConfBuild, 0, firstsuffix, 0, FF_PREFIX);
+ suffix_offset = mkANode(ConfBuild, firstsuffix, ConfBuild->nAffix, 0,
+ FF_SUFFIX);
+
+ /* Link void-affix nodes to the roots of the newly built trees */
+ voidPrefix = (AffixNode *) NodeArrayGet(&ConfBuild->PrefixNodes, 0);
+ voidPrefix->data[0].node_offset = prefix_offset;
+
+ voidSuffix = (AffixNode *) NodeArrayGet(&ConfBuild->SuffixNodes, 0);
+ voidSuffix->data[0].node_offset = suffix_offset;
}
static AffixNodeData *
-FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
+FindAffixes(IspellDictData *dict, AffixNode *node, const char *word, int wrdlen,
+ int *level, int type)
{
+ AffixNode *node_start;
AffixNodeData *StopLow,
*StopHigh,
*StopMiddle;
uint8 symbol;
+ if (type == FF_PREFIX)
+ node_start = (AffixNode *) DictPrefixNodes(dict);
+ else
+ node_start = (AffixNode *) DictSuffixNodes(dict);
+
if (node->isvoid)
{ /* search void affixes */
- if (node->data->naff)
+ if (node->data->affstart != ISPELL_INVALID_INDEX)
return node->data;
- node = node->data->node;
+ node = (AffixNode *) DictNodeGet(node_start, node->data->node_offset);
}
while (node && *level < wrdlen)
@@ -2019,9 +2299,10 @@ FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
if (StopMiddle->val == symbol)
{
(*level)++;
- if (StopMiddle->naff)
+ if (StopMiddle->affstart != ISPELL_INVALID_INDEX)
return StopMiddle;
- node = StopMiddle->node;
+ node = (AffixNode *) DictNodeGet(node_start,
+ StopMiddle->node_offset);
break;
}
else if (StopMiddle->val < symbol)
@@ -2035,8 +2316,67 @@ FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
return NULL;
}
+/*
+ * Compile regular expression on first use and store it within reg.
+ */
+static void
+CompileAffixReg(AffixReg *reg, bool isregis, int type,
+ const char *mask, int masklen, MemoryContext dictCtx)
+{
+ MemoryContext oldcontext;
+
+ Assert(dictCtx);
+
+ /*
+ * Switch to memory context of the dictionary, so compiled expression can be
+ * used in other queries.
+ */
+ oldcontext = MemoryContextSwitchTo(dictCtx);
+
+ if (isregis)
+ RS_compile(&reg->r.regis, (type == FF_SUFFIX), mask);
+ else
+ {
+ int wmasklen;
+ int err;
+ pg_wchar *wmask;
+ char *tmask;
+
+ tmask = (char *) palloc(masklen + 3);
+ if (type == FF_SUFFIX)
+ sprintf(tmask, "%s$", mask);
+ else
+ sprintf(tmask, "^%s", mask);
+
+ masklen = strlen(tmask);
+ wmask = (pg_wchar *) palloc((masklen + 1) * sizeof(pg_wchar));
+ wmasklen = pg_mb2wchar_with_len(tmask, wmask, masklen);
+
+ err = pg_regcomp(&(reg->r.regex), wmask, wmasklen,
+ REG_ADVANCED | REG_NOSUB,
+ DEFAULT_COLLATION_OID);
+ if (err)
+ {
+ char errstr[100];
+
+ pg_regerror(err, &(reg->r.regex), errstr, sizeof(errstr));
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
+ errmsg("invalid regular expression: %s", errstr)));
+ }
+
+ pfree(wmask);
+ pfree(tmask);
+ }
+
+ reg->iscompiled = true;
+
+ MemoryContextSwitchTo(oldcontext);
+}
+
static char *
-CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *newword, int *baselen)
+CheckAffix(const char *word, size_t len, AFFIX *Affix, AffixReg *reg,
+ int flagflags, char *newword, int *baselen, MemoryContext dictCtx)
{
/*
* Check compound allow flags
@@ -2076,7 +2416,7 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
if (Affix->type == FF_SUFFIX)
{
strcpy(newword, word);
- strcpy(newword + len - Affix->replen, Affix->find);
+ strcpy(newword + len - Affix->replen, AffixFieldFind(Affix));
if (baselen) /* store length of non-changed part of word */
*baselen = len - Affix->replen;
}
@@ -2086,9 +2426,9 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
* if prefix is an all non-changed part's length then all word
* contains only prefix and suffix, so out
*/
- if (baselen && *baselen + strlen(Affix->find) <= Affix->replen)
+ if (baselen && *baselen + Affix->findlen <= Affix->replen)
return NULL;
- strcpy(newword, Affix->find);
+ strcpy(newword, AffixFieldFind(Affix));
strcat(newword, word + Affix->replen);
}
@@ -2099,7 +2439,12 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
return newword;
else if (Affix->isregis)
{
- if (RS_execute(&(Affix->reg.regis), newword))
+ /* Compile the regular expression on first demand */
+ if (!reg->iscompiled)
+ CompileAffixReg(reg, Affix->isregis, Affix->type,
+ AffixFieldMask(Affix), Affix->masklen, dictCtx);
+
+ if (RS_execute(&(reg->r.regis), newword))
return newword;
}
else
@@ -2109,12 +2454,17 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
size_t data_len;
int newword_len;
+ /* Compile the regular expression on first demand */
+ if (!reg->iscompiled)
+ CompileAffixReg(reg, Affix->isregis, Affix->type,
+ AffixFieldMask(Affix), Affix->masklen, dictCtx);
+
/* Convert data string to wide characters */
newword_len = strlen(newword);
data = (pg_wchar *) palloc((newword_len + 1) * sizeof(pg_wchar));
data_len = pg_mb2wchar_with_len(newword, data, newword_len);
- if (!(err = pg_regexec(&(Affix->reg.regex), data, data_len, 0, NULL, 0, NULL, 0)))
+ if (!(err = pg_regexec(&(reg->r.regex), data, data_len, 0, NULL, 0, NULL, 0)))
{
pfree(data);
return newword;
@@ -2153,7 +2503,7 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
char **cur;
char newword[2 * MAXNORMLEN] = "";
char pnewword[2 * MAXNORMLEN] = "";
- AffixNode *snode = Conf->Suffix,
+ AffixNode *snode = (AffixNode *) DictSuffixNodes(Conf->dict),
*pnode;
int i,
j;
@@ -2165,7 +2515,7 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
/* Check that the word itself is normal form */
- if (FindWord(Conf, word, VoidString, flag))
+ if (FindWord(Conf->dict, word, VoidString, flag))
{
*cur = pstrdup(word);
cur++;
@@ -2173,23 +2523,29 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
}
/* Find all other NORMAL forms of the 'word' (check only prefix) */
- pnode = Conf->Prefix;
+ pnode = (AffixNode *) DictPrefixNodes(Conf->dict);
plevel = 0;
while (pnode)
{
- prefix = FindAffixes(pnode, word, wrdlen, &plevel, FF_PREFIX);
+ prefix = FindAffixes(Conf->dict, pnode, word, wrdlen, &plevel, FF_PREFIX);
if (!prefix)
break;
- for (j = 0; j < prefix->naff; j++)
+ for (j = prefix->affstart; j <= prefix->affend; j++)
{
- if (CheckAffix(word, wrdlen, prefix->aff[j], flag, newword, NULL))
+ AFFIX *affix = (AFFIX *) DictAffixGet(Conf->dict, j);
+ AffixReg *reg = &(Conf->reg[j]);
+
+ if (affix &&
+ CheckAffix(word, wrdlen, affix, reg, flag, newword, NULL,
+ Conf->dictCtx))
{
/* prefix success */
- if (FindWord(Conf, newword, prefix->aff[j]->flag, flag))
+ if (FindWord(Conf->dict, newword, AffixFieldFlag(affix), flag))
cur += addToResult(forms, cur, newword);
}
}
- pnode = prefix->node;
+ pnode = (AffixNode *) DictNodeGet(DictPrefixNodes(Conf->dict),
+ prefix->node_offset);
}
/*
@@ -2201,45 +2557,59 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
int baselen = 0;
/* find possible suffix */
- suffix = FindAffixes(snode, word, wrdlen, &slevel, FF_SUFFIX);
+ suffix = FindAffixes(Conf->dict, snode, word, wrdlen, &slevel,
+ FF_SUFFIX);
if (!suffix)
break;
/* foreach suffix check affix */
- for (i = 0; i < suffix->naff; i++)
+ for (i = suffix->affstart; i <= suffix->affend; i++)
{
- if (CheckAffix(word, wrdlen, suffix->aff[i], flag, newword, &baselen))
+ AFFIX *sufentry = (AFFIX *) DictAffixGet(Conf->dict, i);
+ AffixReg *sufreg = &(Conf->reg[i]);
+
+ if (sufentry &&
+ CheckAffix(word, wrdlen, sufentry, sufreg, flag, newword, &baselen,
+ Conf->dictCtx))
{
/* suffix success */
- if (FindWord(Conf, newword, suffix->aff[i]->flag, flag))
+ if (FindWord(Conf->dict, newword, AffixFieldFlag(sufentry), flag))
cur += addToResult(forms, cur, newword);
/* now we will look changed word with prefixes */
- pnode = Conf->Prefix;
+ pnode = (AffixNode *) DictPrefixNodes(Conf->dict);
plevel = 0;
swrdlen = strlen(newword);
while (pnode)
{
- prefix = FindAffixes(pnode, newword, swrdlen, &plevel, FF_PREFIX);
+ prefix = FindAffixes(Conf->dict, pnode, newword, swrdlen,
+ &plevel, FF_PREFIX);
if (!prefix)
break;
- for (j = 0; j < prefix->naff; j++)
+ for (j = prefix->affstart; j <= prefix->affend; j++)
{
- if (CheckAffix(newword, swrdlen, prefix->aff[j], flag, pnewword, &baselen))
+ AFFIX *prefentry = (AFFIX *) DictAffixGet(Conf->dict, j);
+ AffixReg *prefreg = &(Conf->reg[j]);
+
+ if (prefentry &&
+ CheckAffix(newword, swrdlen, prefentry, prefreg,
+ flag, pnewword, &baselen, Conf->dictCtx))
{
/* prefix success */
- char *ff = (prefix->aff[j]->flagflags & suffix->aff[i]->flagflags & FF_CROSSPRODUCT) ?
- VoidString : prefix->aff[j]->flag;
+ char *ff = (prefentry->flagflags & sufentry->flagflags & FF_CROSSPRODUCT) ?
+ VoidString : AffixFieldFlag(prefentry);
- if (FindWord(Conf, pnewword, ff, flag))
+ if (FindWord(Conf->dict, pnewword, ff, flag))
cur += addToResult(forms, cur, pnewword);
}
}
- pnode = prefix->node;
+ pnode = (AffixNode *) DictNodeGet(DictPrefixNodes(Conf->dict),
+ prefix->node_offset);
}
}
}
- snode = suffix->node;
+ snode = (AffixNode *) DictNodeGet(DictSuffixNodes(Conf->dict),
+ suffix->node_offset);
}
if (cur == forms)
@@ -2259,7 +2629,8 @@ typedef struct SplitVar
} SplitVar;
static int
-CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
+CheckCompoundAffixes(IspellDictData *dict, CMPDAffix **ptr,
+ char *word, int len, bool CheckInPlace)
{
bool issuffix;
@@ -2269,9 +2640,12 @@ CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
if (CheckInPlace)
{
- while ((*ptr)->affix)
+ while ((*ptr)->affix != ISPELL_INVALID_INDEX)
{
- if (len > (*ptr)->len && strncmp((*ptr)->affix, word, (*ptr)->len) == 0)
+ AFFIX *affix = (AFFIX *) DictAffixGet(dict, (*ptr)->affix);
+
+ if (len > (*ptr)->len &&
+ strncmp(AffixFieldRepl(affix), word, (*ptr)->len) == 0)
{
len = (*ptr)->len;
issuffix = (*ptr)->issuffix;
@@ -2285,9 +2659,12 @@ CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
{
char *affbegin;
- while ((*ptr)->affix)
+ while ((*ptr)->affix != ISPELL_INVALID_INDEX)
{
- if (len > (*ptr)->len && (affbegin = strstr(word, (*ptr)->affix)) != NULL)
+ AFFIX *affix = (AFFIX *) DictAffixGet(dict, (*ptr)->affix);
+
+ if (len > (*ptr)->len &&
+ (affbegin = strstr(word, AffixFieldRepl(affix))) != NULL)
{
len = (*ptr)->len + (affbegin - word);
issuffix = (*ptr)->issuffix;
@@ -2339,13 +2716,14 @@ AddStem(SplitVar *v, char *word)
}
static SplitVar *
-SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int wordlen, int startpos, int minpos)
+SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig,
+ char *word, int wordlen, int startpos, int minpos)
{
SplitVar *var = NULL;
SPNodeData *StopLow,
*StopHigh,
*StopMiddle = NULL;
- SPNode *node = (snode) ? snode : Conf->Dictionary;
+ SPNode *node = (snode) ? snode : (SPNode *) DictDictNodes(Conf->dict);
int level = (snode) ? minpos : startpos; /* recursive
* minpos==level */
int lenaff;
@@ -2360,8 +2738,11 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
while (level < wordlen)
{
/* find word with epenthetic or/and compound affix */
- caff = Conf->CompoundAffix;
- while (level > startpos && (lenaff = CheckCompoundAffixes(&caff, word + level, wordlen - level, (node) ? true : false)) >= 0)
+ caff = (CMPDAffix *) DictCompoundAffix(Conf->dict);
+ while (level > startpos &&
+ (lenaff = CheckCompoundAffixes(Conf->dict, &caff,
+ word + level, wordlen - level,
+ (node) ? true : false)) >= 0)
{
/*
* there is one of compound affixes, so check word for existings
@@ -2408,7 +2789,8 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
while (ptr->next)
ptr = ptr->next;
- ptr->next = SplitToVariants(Conf, NULL, new, word, wordlen, startpos + lenaff, startpos + lenaff);
+ ptr->next = SplitToVariants(Conf, NULL, new, word, wordlen,
+ startpos + lenaff, startpos + lenaff);
pfree(new->stem);
pfree(new);
@@ -2467,13 +2849,14 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
/* we can find next word */
level++;
AddStem(var, pnstrdup(word + startpos, level - startpos));
- node = Conf->Dictionary;
+ node = (SPNode *) DictDictNodes(Conf->dict);
startpos = level;
continue;
}
}
}
- node = StopMiddle->node;
+ node = (SPNode *) DictNodeGet(DictDictNodes(Conf->dict),
+ StopMiddle->node_offset);
}
else
node = NULL;
@@ -2523,7 +2906,7 @@ NINormalizeWord(IspellDict *Conf, char *word)
pfree(res);
}
- if (Conf->usecompound)
+ if (Conf->dict->usecompound)
{
int wordlen = strlen(word);
SplitVar *ptr,
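The DictNodeGet() calls in the spell.c hunks above replace stored pointers with byte offsets into a flat buffer, which is what lets the same data be mapped at different addresses in different backends. A minimal standalone sketch of that idea follows. DemoNode, demo_node_get(), demo_node_add() and demo_walk_copy() are hypothetical names for illustration only and are not part of the patch; the node layout is deliberately simpler than SPNodeData.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Plays the role of ISPELL_INVALID_OFFSET in the patch. */
#define DEMO_INVALID_OFFSET 0xFFFFFFFFu

/* Hypothetical, simplified node: one character plus the offset of a child. */
typedef struct
{
	uint8_t		val;
	uint32_t	child_offset;	/* byte offset from the start of the buffer */
} DemoNode;

/* Same idea as DictNodeGet(): turn an offset into a pointer, or NULL. */
static DemoNode *
demo_node_get(char *base, uint32_t offset)
{
	return (offset == DEMO_INVALID_OFFSET) ? NULL : (DemoNode *) (base + offset);
}

/* Append a childless node at the end of the buffer and return its offset. */
static uint32_t
demo_node_add(char *base, uint32_t *end, uint8_t val)
{
	uint32_t	offset = *end;
	DemoNode   *node = (DemoNode *) (base + offset);

	node->val = val;
	node->child_offset = DEMO_INVALID_OFFSET;
	*end += sizeof(DemoNode);
	return offset;
}

/*
 * Build the chain 'a' -> 'b' in one buffer, copy the raw bytes to a second
 * buffer, free the first one and walk the copy.  Returns the number of nodes
 * visited, or -1 if allocation fails.
 */
static int
demo_walk_copy(void)
{
	char	   *src = malloc(64);
	char	   *dst = malloc(64);
	uint32_t	end = 0;
	uint32_t	a,
				b;
	int			visited = 0;
	DemoNode   *node;

	if (src == NULL || dst == NULL)
	{
		free(src);
		free(dst);
		return -1;
	}

	a = demo_node_add(src, &end, 'a');
	b = demo_node_add(src, &end, 'b');
	demo_node_get(src, a)->child_offset = b;

	/* The copy lives at a different address; offsets stay valid. */
	memcpy(dst, src, end);
	free(src);

	for (node = demo_node_get(dst, 0);
		 node != NULL;
		 node = demo_node_get(dst, node->child_offset))
		visited++;

	free(dst);
	return visited;
}
```

With real pointers the walk over the copy would dereference freed memory; with offsets it only depends on the base address passed to demo_node_get().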
diff --git a/src/include/tsearch/dicts/spell.h b/src/include/tsearch/dicts/spell.h
index 210f97dda9..b40cf379eb 100644
--- a/src/include/tsearch/dicts/spell.h
+++ b/src/include/tsearch/dicts/spell.h
@@ -18,21 +18,23 @@
#include "tsearch/dicts/regis.h"
#include "tsearch/ts_public.h"
+#define ISPELL_INVALID_INDEX (0x7FFFF)
+#define ISPELL_INVALID_OFFSET (0xFFFFFFFF)
+
/*
* SPNode and SPNodeData are used to represent prefix tree (Trie) to store
* a words list.
*/
-struct SPNode;
-
typedef struct
{
uint32 val:8,
isword:1,
/* Stores compound flags listed below */
compoundflag:4,
- /* Reference to an entry of the AffixData field */
+ /* Index of an entry of the AffixData field */
affix:19;
- struct SPNode *node;
+ /* Offset to a node of the DictNodes field */
+ uint32 node_offset;
} SPNodeData;
/*
@@ -86,21 +88,55 @@ typedef struct spell_struct
*/
typedef struct aff_struct
{
- char *flag;
/* FF_SUFFIX or FF_PREFIX */
- uint32 type:1,
+ uint16 type:1,
flagflags:7,
issimple:1,
isregis:1,
- replen:14;
- char *find;
- char *repl;
+ flaglen:2;
+
+ /* 8 bits could be too much for repl, find and mask, but who knows */
+ uint8 replen;
+ uint8 findlen;
+ uint8 masklen;
+
+ /*
+ * fields stores the following data (each ends with \0):
+ * - repl
+ * - find
+ * - mask
+ * - flag - one character (if FM_CHAR),
+ * two characters (if FM_LONG),
+ * number, >= 0 and < 65536 (if FM_NUM).
+ */
+ char fields[FLEXIBLE_ARRAY_MEMBER];
+} AFFIX;
+
+#define AF_FLAG_MAXSIZE 5 /* strlen("65535") */
+#define AF_REPL_MAXSIZE 255 /* 8 bits */
+
+#define AFFIXHDRSZ (offsetof(AFFIX, fields))
+
+#define AffixFieldRepl(af) ((af)->fields)
+#define AffixFieldFind(af) ((af)->fields + (af)->replen + 1)
+#define AffixFieldMask(af) (AffixFieldFind(af) + (af)->findlen + 1)
+#define AffixFieldFlag(af) (AffixFieldMask(af) + (af)->masklen + 1)
+#define AffixGetSize(af) (AFFIXHDRSZ + (af)->replen + 1 + (af)->findlen + 1 \
+ + (af)->masklen + 1 + strlen(AffixFieldFlag(af)) + 1)
+
+/*
+ * Stores compiled regular expression of affix. AffixReg uses mask field of
+ * AFFIX as a regular expression.
+ */
+typedef struct AffixReg
+{
+ bool iscompiled;
union
{
regex_t regex;
Regis regis;
- } reg;
-} AFFIX;
+ } r;
+} AffixReg;
/*
* affixes use dictionary flags too
@@ -120,14 +156,13 @@ typedef struct aff_struct
* AffixNode and AffixNodeData are used to represent prefix tree (Trie) to store
* an affix list.
*/
-struct AffixNode;
-
typedef struct
{
- uint32 val:8,
- naff:24;
- AFFIX **aff;
- struct AffixNode *node;
+ uint8 val;
+ uint32 affstart;
+ uint32 affend;
+ /* Offset to a node of the PrefixNodes or SuffixNodes field */
+ uint32 node_offset;
} AffixNodeData;
typedef struct AffixNode
@@ -139,9 +174,19 @@ typedef struct AffixNode
#define ANHRDSZ (offsetof(AffixNode, data))
+typedef struct NodeArray
+{
+ char *Nodes;
+ uint32 NodesSize; /* allocated size of Nodes */
+ uint32 NodesEnd; /* end of data in Nodes */
+} NodeArray;
+
+#define NodeArrayGet(na, of) (((of) == ISPELL_INVALID_OFFSET) ? NULL : (na)->Nodes + (of))
+
typedef struct
{
- char *affix;
+ /* Index of an affix of the Affix field */
+ uint32 affix;
int len;
bool issuffix;
} CMPDAffix;
@@ -176,30 +221,71 @@ typedef struct CompoundAffixFlag
#define FLAGNUM_MAXSIZE (1 << 16)
-typedef struct
+typedef struct IspellDictData
{
- int maffixes;
- int naffixes;
- AFFIX *Affix;
-
- AffixNode *Suffix;
- AffixNode *Prefix;
+ FlagMode flagMode;
+ bool usecompound;
- SPNode *Dictionary;
- /* Array of sets of affixes */
- char **AffixData;
- int lenAffixData;
- int nAffixData;
bool useFlagAliases;
- CMPDAffix *CompoundAffix;
+ uint32 nAffixData;
+ uint32 AffixDataStart;
- bool usecompound;
- FlagMode flagMode;
+ uint32 AffixOffsetStart;
+ uint32 AffixStart;
+ uint32 nAffix;
+
+ uint32 DictNodesStart;
+ uint32 PrefixNodesStart;
+ uint32 SuffixNodesStart;
+
+ uint32 CompoundAffixStart;
/*
- * All follow fields are actually needed only for initialization
+ * data stores:
+ * - AffixData - array of affix sets
+ * - Affix - sorted array of affixes
+ * - DictNodes - prefix tree of a word list
+ * - PrefixNodes - prefix tree of a prefix list
+ * - SuffixNodes - prefix tree of a suffix list
+ * - CompoundAffix - array of compound affixes
*/
+ char data[FLEXIBLE_ARRAY_MEMBER];
+} IspellDictData;
+
+#define IspellDictDataHdrSize (offsetof(IspellDictData, data))
+
+#define DictAffixDataOffset(d) ((d)->data)
+#define DictAffixData(d) ((d)->data + (d)->AffixDataStart)
+#define DictAffixDataGet(d, i) (((i) == ISPELL_INVALID_INDEX) ? NULL : \
+ DictAffixData(d) + ((uint32 *) DictAffixDataOffset(d))[i])
+
+#define DictAffixOffset(d) ((d)->data + (d)->AffixOffsetStart)
+#define DictAffix(d) ((d)->data + (d)->AffixStart)
+#define DictAffixGet(d, i) (((i) == ISPELL_INVALID_INDEX) ? NULL : \
+ DictAffix(d) + ((uint32 *) DictAffixOffset(d))[i])
+
+#define DictDictNodes(d) ((d)->data + (d)->DictNodesStart)
+#define DictPrefixNodes(d) ((d)->data + (d)->PrefixNodesStart)
+#define DictSuffixNodes(d) ((d)->data + (d)->SuffixNodesStart)
+#define DictNodeGet(node_start, of) (((of) == ISPELL_INVALID_OFFSET) ? NULL : \
+ (char *) (node_start) + (of))
+
+#define DictCompoundAffix(d) ((d)->data + (d)->CompoundAffixStart)
+
+/*
+ * IspellDictBuild is used to initialize IspellDictData struct. This is a
+ * temporary structure which is set up by NIStartBuild() and released by
+ * NIFinishBuild().
+ */
+typedef struct IspellDictBuild
+{
+ MemoryContext buildCxt; /* temp context for construction */
+
+ IspellDictData *dict;
+ uint32 dict_size;
+
+ /* Temporary data */
/* Array of Hunspell options in affix file */
CompoundAffixFlag *CompoundAffixFlags;
@@ -208,29 +294,73 @@ typedef struct
/* allocated length of CompoundAffixFlags array */
int mCompoundAffixFlag;
- /*
- * Remaining fields are only used during dictionary construction; they are
- * set up by NIStartBuild and cleared by NIFinishBuild.
- */
- MemoryContext buildCxt; /* temp context for construction */
-
- /* Temporary array of all words in the dict file */
+ /* Array of all words in the dict file */
SPELL **Spell;
- int nspell; /* number of valid entries in Spell array */
- int mspell; /* allocated length of Spell array */
+ int nSpell; /* number of valid entries in Spell array */
+ int mSpell; /* allocated length of Spell array */
+
+ /* Array of all affixes in the aff file */
+ AFFIX **Affix;
+ int nAffix; /* number of valid entries in Affix array */
+ int mAffix; /* allocated length of Affix array */
+ uint32 AffixSize;
+
+ /* Data for IspellDictData */
+
+ /* Array of sets of affixes */
+ uint32 *AffixDataOffset;
+ int nAffixData; /* number of affix sets */
+ int mAffixData; /* allocated number of affix sets */
+ char *AffixData;
+ uint32 AffixDataSize; /* allocated size of AffixData */
+ uint32 AffixDataEnd; /* end of data in AffixData */
+
+ /* Prefix tree which stores a word list */
+ NodeArray DictNodes;
+
+ /* Prefix tree which stores a prefix list */
+ NodeArray PrefixNodes;
+
+ /* Prefix tree which stores a suffix list */
+ NodeArray SuffixNodes;
- /* These are used to allocate "compact" data without palloc overhead */
- char *firstfree; /* first free address (always maxaligned) */
- size_t avail; /* free space remaining at firstfree */
+ /* Array of compound affixes */
+ CMPDAffix *CompoundAffix;
+ int nCompoundAffix; /* number of entries of CompoundAffix */
+} IspellDictBuild;
+
+#define AffixDataGet(d, i) ((d)->AffixData + (d)->AffixDataOffset[i])
+
+/*
+ * IspellDict is used within NINormalizeWord.
+ */
+typedef struct IspellDict
+{
+ /*
+ * Pointer to a DSM location of IspellDictData. Should be retrieved on
+ * every dispell_lexize() call.
+ */
+ IspellDictData *dict;
+ /*
+ * Array of regular expressions of affixes. Each regular expression is
+ * compiled only on demand.
+ */
+ AffixReg *reg;
+ /*
+ * Memory context for compiling regular expressions.
+ */
+ MemoryContext dictCtx;
} IspellDict;
extern TSLexeme *NINormalizeWord(IspellDict *Conf, char *word);
-extern void NIStartBuild(IspellDict *Conf);
-extern void NIImportAffixes(IspellDict *Conf, const char *filename);
-extern void NIImportDictionary(IspellDict *Conf, const char *filename);
-extern void NISortDictionary(IspellDict *Conf);
-extern void NISortAffixes(IspellDict *Conf);
-extern void NIFinishBuild(IspellDict *Conf);
+extern void NIStartBuild(IspellDictBuild *ConfBuild);
+extern void NIImportAffixes(IspellDictBuild *ConfBuild, const char *filename);
+extern void NIImportDictionary(IspellDictBuild *ConfBuild,
+ const char *filename);
+extern void NISortDictionary(IspellDictBuild *ConfBuild);
+extern void NISortAffixes(IspellDictBuild *ConfBuild);
+extern void NICopyData(IspellDictBuild *ConfBuild);
+extern void NIFinishBuild(IspellDictBuild *ConfBuild);
#endif
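The new AFFIX struct in spell.h stores repl, find, mask and flag back to back in one flexible array, each terminated by \0, and locates them with the AffixField*() macros. The sketch below shows the same packing technique in isolation. It is an illustration under simplifying assumptions: DemoAffix, the Demo*() macros and demo_affix_make() are hypothetical names, and the bit-fields of the real struct are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified counterpart of the patch's AFFIX struct. */
typedef struct
{
	uint8_t		replen;
	uint8_t		findlen;
	uint8_t		masklen;
	char		fields[];		/* repl \0 find \0 mask \0 flag \0 */
} DemoAffix;

/* Each accessor skips the preceding strings and their terminators. */
#define DemoRepl(af) ((af)->fields)
#define DemoFind(af) (DemoRepl(af) + (af)->replen + 1)
#define DemoMask(af) (DemoFind(af) + (af)->findlen + 1)
#define DemoFlag(af) (DemoMask(af) + (af)->masklen + 1)

/*
 * Allocate one contiguous record holding all four strings.  Returns NULL if
 * a length does not fit into 8 bits or if allocation fails.  The caller
 * releases the record with free().
 */
static DemoAffix *
demo_affix_make(const char *repl, const char *find,
				const char *mask, const char *flag)
{
	size_t		replen = strlen(repl);
	size_t		findlen = strlen(find);
	size_t		masklen = strlen(mask);
	size_t		flaglen = strlen(flag);
	DemoAffix  *af;

	if (replen > 255 || findlen > 255 || masklen > 255)
		return NULL;

	af = malloc(offsetof(DemoAffix, fields) +
				replen + 1 + findlen + 1 + masklen + 1 + flaglen + 1);
	if (af == NULL)
		return NULL;

	/* Lengths must be set first: the accessor macros depend on them. */
	af->replen = (uint8_t) replen;
	af->findlen = (uint8_t) findlen;
	af->masklen = (uint8_t) masklen;

	memcpy(DemoRepl(af), repl, replen + 1);
	memcpy(DemoFind(af), find, findlen + 1);
	memcpy(DemoMask(af), mask, masklen + 1);
	memcpy(DemoFlag(af), flag, flaglen + 1);

	return af;
}
```

Because the record contains no pointers, it can be copied byte for byte into another memory area and read there with the same macros.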
0004-Update-tmplinit-arguments-v5.patch
diff --git a/contrib/dict_int/Makefile b/contrib/dict_int/Makefile
index f6ae24aa4d..897be348ff 100644
--- a/contrib/dict_int/Makefile
+++ b/contrib/dict_int/Makefile
@@ -4,7 +4,7 @@ MODULE_big = dict_int
OBJS = dict_int.o $(WIN32RES)
EXTENSION = dict_int
-DATA = dict_int--1.0.sql dict_int--unpackaged--1.0.sql
+DATA = dict_int--1.1.sql dict_int--1.0--1.1.sql dict_int--unpackaged--1.0.sql
PGFILEDESC = "dict_int - add-on dictionary template for full-text search"
REGRESS = dict_int
diff --git a/contrib/dict_int/dict_int--1.0--1.1.sql b/contrib/dict_int/dict_int--1.0--1.1.sql
new file mode 100644
index 0000000000..3517a5ecd1
--- /dev/null
+++ b/contrib/dict_int/dict_int--1.0--1.1.sql
@@ -0,0 +1,9 @@
+/* contrib/dict_int/dict_int--1.0--1.1.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION dict_int UPDATE TO '1.1'" to load this file. \quit
+
+CREATE FUNCTION dintdict_init(internal, internal)
+ RETURNS internal
+ AS 'MODULE_PATHNAME'
+ LANGUAGE C STRICT;
diff --git a/contrib/dict_int/dict_int--1.0.sql b/contrib/dict_int/dict_int--1.1.sql
similarity index 93%
rename from contrib/dict_int/dict_int--1.0.sql
rename to contrib/dict_int/dict_int--1.1.sql
index acb1461b56..6d3933e3d3 100644
--- a/contrib/dict_int/dict_int--1.0.sql
+++ b/contrib/dict_int/dict_int--1.1.sql
@@ -3,7 +3,7 @@
-- complain if script is sourced in psql, rather than via CREATE EXTENSION
\echo Use "CREATE EXTENSION dict_int" to load this file. \quit
-CREATE FUNCTION dintdict_init(internal)
+CREATE FUNCTION dintdict_init(internal, internal)
RETURNS internal
AS 'MODULE_PATHNAME'
LANGUAGE C STRICT;
diff --git a/contrib/dict_int/dict_int.control b/contrib/dict_int/dict_int.control
index 6e2d2b351a..51894171f6 100644
--- a/contrib/dict_int/dict_int.control
+++ b/contrib/dict_int/dict_int.control
@@ -1,5 +1,5 @@
# dict_int extension
comment = 'text search dictionary template for integers'
-default_version = '1.0'
+default_version = '1.1'
module_pathname = '$libdir/dict_int'
relocatable = true
diff --git a/contrib/dict_xsyn/Makefile b/contrib/dict_xsyn/Makefile
index 0c401cf3c8..d1cf8d0b5d 100644
--- a/contrib/dict_xsyn/Makefile
+++ b/contrib/dict_xsyn/Makefile
@@ -4,7 +4,7 @@ MODULE_big = dict_xsyn
OBJS = dict_xsyn.o $(WIN32RES)
EXTENSION = dict_xsyn
-DATA = dict_xsyn--1.0.sql dict_xsyn--unpackaged--1.0.sql
+DATA = dict_xsyn--1.1.sql dict_xsyn--1.0--1.1.sql dict_xsyn--unpackaged--1.0.sql
DATA_TSEARCH = xsyn_sample.rules
PGFILEDESC = "dict_xsyn - add-on dictionary template for full-text search"
diff --git a/contrib/dict_xsyn/dict_xsyn--1.0--1.1.sql b/contrib/dict_xsyn/dict_xsyn--1.0--1.1.sql
new file mode 100644
index 0000000000..35a576bfee
--- /dev/null
+++ b/contrib/dict_xsyn/dict_xsyn--1.0--1.1.sql
@@ -0,0 +1,9 @@
+/* contrib/dict_xsyn/dict_xsyn--1.0--1.1.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION dict_xsyn UPDATE TO '1.1'" to load this file. \quit
+
+CREATE FUNCTION dxsyn_init(internal, internal)
+ RETURNS internal
+ AS 'MODULE_PATHNAME'
+ LANGUAGE C STRICT;
diff --git a/contrib/dict_xsyn/dict_xsyn--1.0.sql b/contrib/dict_xsyn/dict_xsyn--1.1.sql
similarity index 93%
rename from contrib/dict_xsyn/dict_xsyn--1.0.sql
rename to contrib/dict_xsyn/dict_xsyn--1.1.sql
index 3d6bb51ca8..d8d1de1aa4 100644
--- a/contrib/dict_xsyn/dict_xsyn--1.0.sql
+++ b/contrib/dict_xsyn/dict_xsyn--1.1.sql
@@ -3,7 +3,7 @@
-- complain if script is sourced in psql, rather than via CREATE EXTENSION
\echo Use "CREATE EXTENSION dict_xsyn" to load this file. \quit
-CREATE FUNCTION dxsyn_init(internal)
+CREATE FUNCTION dxsyn_init(internal, internal)
RETURNS internal
AS 'MODULE_PATHNAME'
LANGUAGE C STRICT;
diff --git a/contrib/dict_xsyn/dict_xsyn.control b/contrib/dict_xsyn/dict_xsyn.control
index 3fd465a955..50358374a7 100644
--- a/contrib/dict_xsyn/dict_xsyn.control
+++ b/contrib/dict_xsyn/dict_xsyn.control
@@ -1,5 +1,5 @@
# dict_xsyn extension
comment = 'text search dictionary template for extended synonym processing'
-default_version = '1.0'
+default_version = '1.1'
module_pathname = '$libdir/dict_xsyn'
relocatable = true
diff --git a/contrib/unaccent/Makefile b/contrib/unaccent/Makefile
index f8e3860926..b0ba23ed37 100644
--- a/contrib/unaccent/Makefile
+++ b/contrib/unaccent/Makefile
@@ -4,7 +4,8 @@ MODULE_big = unaccent
OBJS = unaccent.o $(WIN32RES)
EXTENSION = unaccent
-DATA = unaccent--1.1.sql unaccent--1.0--1.1.sql unaccent--unpackaged--1.0.sql
+DATA = unaccent--1.2.sql unaccent--1.1--1.2.sql unaccent--1.0--1.1.sql \
+ unaccent--unpackaged--1.0.sql
DATA_TSEARCH = unaccent.rules
PGFILEDESC = "unaccent - text search dictionary that removes accents"
diff --git a/contrib/unaccent/unaccent--1.1--1.2.sql b/contrib/unaccent/unaccent--1.1--1.2.sql
new file mode 100644
index 0000000000..eaef37f87e
--- /dev/null
+++ b/contrib/unaccent/unaccent--1.1--1.2.sql
@@ -0,0 +1,9 @@
+/* contrib/unaccent/unaccent--1.1--1.2.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION unaccent UPDATE TO '1.2'" to load this file. \quit
+
+CREATE FUNCTION unaccent_init(internal,internal)
+ RETURNS internal
+ AS 'MODULE_PATHNAME', 'unaccent_init'
+ LANGUAGE C PARALLEL SAFE;
diff --git a/contrib/unaccent/unaccent--1.1.sql b/contrib/unaccent/unaccent--1.2.sql
similarity index 94%
rename from contrib/unaccent/unaccent--1.1.sql
rename to contrib/unaccent/unaccent--1.2.sql
index ecc8651780..d6ce193e82 100644
--- a/contrib/unaccent/unaccent--1.1.sql
+++ b/contrib/unaccent/unaccent--1.2.sql
@@ -13,7 +13,7 @@ CREATE FUNCTION unaccent(text)
AS 'MODULE_PATHNAME', 'unaccent_dict'
LANGUAGE C STABLE STRICT PARALLEL SAFE;
-CREATE FUNCTION unaccent_init(internal)
+CREATE FUNCTION unaccent_init(internal,internal)
RETURNS internal
AS 'MODULE_PATHNAME', 'unaccent_init'
LANGUAGE C PARALLEL SAFE;
diff --git a/contrib/unaccent/unaccent.control b/contrib/unaccent/unaccent.control
index a77a65f891..aec53b5ad5 100644
--- a/contrib/unaccent/unaccent.control
+++ b/contrib/unaccent/unaccent.control
@@ -1,5 +1,5 @@
# unaccent extension
comment = 'text search dictionary that removes accents'
-default_version = '1.1'
+default_version = '1.2'
module_pathname = '$libdir/unaccent'
relocatable = true
diff --git a/src/backend/commands/tsearchcmds.c b/src/backend/commands/tsearchcmds.c
index b6aeae449b..32ab98b6a7 100644
--- a/src/backend/commands/tsearchcmds.c
+++ b/src/backend/commands/tsearchcmds.c
@@ -664,7 +664,7 @@ get_ts_template_func(DefElem *defel, int attnum)
switch (attnum)
{
case Anum_pg_ts_template_tmplinit:
- nargs = 1;
+ nargs = 2;
break;
case Anum_pg_ts_template_tmpllexize:
nargs = 4;
diff --git a/src/backend/snowball/snowball_func.sql.in b/src/backend/snowball/snowball_func.sql.in
index c02dad43e3..9b85e41ff8 100644
--- a/src/backend/snowball/snowball_func.sql.in
+++ b/src/backend/snowball/snowball_func.sql.in
@@ -19,7 +19,7 @@
SET search_path = pg_catalog;
-CREATE FUNCTION dsnowball_init(INTERNAL)
+CREATE FUNCTION dsnowball_init(INTERNAL, INTERNAL)
RETURNS INTERNAL AS '$libdir/dict_snowball', 'dsnowball_init'
LANGUAGE C STRICT;
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 2a5321315a..ecec8f7ff8 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -4881,22 +4881,22 @@ DESCR("(internal)");
DATA(insert OID = 3723 ( ts_lexize PGNSP PGUID 12 1 0 0 0 f f f f t f i s 2 0 1009 "3769 25" _null_ _null_ _null_ _null_ _null_ ts_lexize _null_ _null_ _null_ ));
DESCR("normalize one word by dictionary");
-DATA(insert OID = 3725 ( dsimple_init PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 2281 "2281" _null_ _null_ _null_ _null_ _null_ dsimple_init _null_ _null_ _null_ ));
+DATA(insert OID = 3725 ( dsimple_init PGNSP PGUID 12 1 0 0 0 f f f f t f i s 2 0 2281 "2281 2281" _null_ _null_ _null_ _null_ _null_ dsimple_init _null_ _null_ _null_ ));
DESCR("(internal)");
DATA(insert OID = 3726 ( dsimple_lexize PGNSP PGUID 12 1 0 0 0 f f f f t f i s 4 0 2281 "2281 2281 2281 2281" _null_ _null_ _null_ _null_ _null_ dsimple_lexize _null_ _null_ _null_ ));
DESCR("(internal)");
-DATA(insert OID = 3728 ( dsynonym_init PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 2281 "2281" _null_ _null_ _null_ _null_ _null_ dsynonym_init _null_ _null_ _null_ ));
+DATA(insert OID = 3728 ( dsynonym_init PGNSP PGUID 12 1 0 0 0 f f f f t f i s 2 0 2281 "2281 2281" _null_ _null_ _null_ _null_ _null_ dsynonym_init _null_ _null_ _null_ ));
DESCR("(internal)");
DATA(insert OID = 3729 ( dsynonym_lexize PGNSP PGUID 12 1 0 0 0 f f f f t f i s 4 0 2281 "2281 2281 2281 2281" _null_ _null_ _null_ _null_ _null_ dsynonym_lexize _null_ _null_ _null_ ));
DESCR("(internal)");
-DATA(insert OID = 3731 ( dispell_init PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 2281 "2281" _null_ _null_ _null_ _null_ _null_ dispell_init _null_ _null_ _null_ ));
+DATA(insert OID = 3731 ( dispell_init PGNSP PGUID 12 1 0 0 0 f f f f t f i s 2 0 2281 "2281 2281" _null_ _null_ _null_ _null_ _null_ dispell_init _null_ _null_ _null_ ));
DESCR("(internal)");
DATA(insert OID = 3732 ( dispell_lexize PGNSP PGUID 12 1 0 0 0 f f f f t f i s 4 0 2281 "2281 2281 2281 2281" _null_ _null_ _null_ _null_ _null_ dispell_lexize _null_ _null_ _null_ ));
DESCR("(internal)");
-DATA(insert OID = 3740 ( thesaurus_init PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 2281 "2281" _null_ _null_ _null_ _null_ _null_ thesaurus_init _null_ _null_ _null_ ));
+DATA(insert OID = 3740 ( thesaurus_init PGNSP PGUID 12 1 0 0 0 f f f f t f i s 2 0 2281 "2281 2281" _null_ _null_ _null_ _null_ _null_ thesaurus_init _null_ _null_ _null_ ));
DESCR("(internal)");
DATA(insert OID = 3741 ( thesaurus_lexize PGNSP PGUID 12 1 0 0 0 f f f f t f i s 4 0 2281 "2281 2281 2281 2281" _null_ _null_ _null_ _null_ _null_ thesaurus_lexize _null_ _null_ _null_ ));
DESCR("(internal)");
0005-pg-ts-shared-dictinaries-view-v5.patch
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 71e20f2740..00faef73ed 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -8228,6 +8228,11 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
<entry>time zone names</entry>
</row>
+ <row>
+ <entry><link linkend="view-pg-ts-shared-dictionaries"><structname>pg_ts_shared_dictionaries</structname></link></entry>
+ <entry>dictionaries currently in shared memory</entry>
+ </row>
+
<row>
<entry><link linkend="view-pg-user"><structname>pg_user</structname></link></entry>
<entry>database users</entry>
@@ -10983,6 +10988,63 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</sect1>
+ <sect1 id="view-pg-ts-shared-dictionaries">
+ <title><structname>pg_ts_shared_dictionaries</structname></title>
+
+ <indexterm zone="view-pg-ts-shared-dictionaries">
+ <primary>pg_ts_shared_dictionaries</primary>
+ </indexterm>
+
+ <para>
+ The <structname>pg_ts_shared_dictionaries</structname> view provides a
+ listing of all text search dictionaries that are currently allocated in
+ shared memory. The size of available space in shared memory is controlled by
+ <xref linkend="guc-shared-buffers"/>. A dictionary may have an option which
+ controls allocation in shared memory (see <xref linkend="textsearch-ispell-dictionary"/>).
+ </para>
+
+ <table>
+ <title><structname>pg_ts_shared_dictionaries</structname> Columns</title>
+
+ <tgroup cols="4">
+ <thead>
+ <row>
+ <entry>Name</entry>
+ <entry>Type</entry>
+ <entry>References</entry>
+ <entry>Description</entry>
+ </row>
+ </thead>
+ <tbody>
+ <row>
+ <entry><structfield>dictoid</structfield></entry>
+ <entry><type>oid</type></entry>
+ <entry><literal><link linkend="catalog-pg-ts-dict"><structname>pg_ts_dict</structname></link>.oid</literal></entry>
+ <entry>The OID of the text search dictionary located in shared memory</entry>
+ </row>
+ <row>
+ <entry><structfield>schemaname</structfield></entry>
+ <entry><type>name</type></entry>
+ <entry><literal><link linkend="catalog-pg-namespace"><structname>pg_namespace</structname></link>.nspname</literal></entry>
+ <entry>The name of the schema containing the text search dictionary</entry>
+ </row>
+ <row>
+ <entry><structfield>dictname</structfield></entry>
+ <entry><type>name</type></entry>
+ <entry><literal><link linkend="catalog-pg-ts-dict"><structname>pg_ts_dict</structname></link>.dictname</literal></entry>
+ <entry>The text search dictionary name</entry>
+ </row>
+ <row>
+ <entry><structfield>size</structfield></entry>
+ <entry><type>bigint</type></entry>
+ <entry></entry>
+ <entry>Size of the text search dictionary in bytes</entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+ </sect1>
+
<sect1 id="view-pg-user">
<title><structname>pg_user</structname></title>
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index 82afe201f8..78ed082994 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -3045,6 +3045,12 @@ CREATE TEXT SEARCH DICTIONARY english_stem (
parameter value greater than zero before server starting.
</para>
+ <para>
+ The list of dictionaries currently located in shared memory can be retrieved from the
+ <link linkend="view-pg-ts-shared-dictionaries"><structname>pg_ts_shared_dictionaries</structname></link>
+ view.
+ </para>
+
</sect2>
</sect1>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 5652e9ee6d..c663db3cf2 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -504,6 +504,9 @@ CREATE VIEW pg_config AS
REVOKE ALL on pg_config FROM PUBLIC;
REVOKE EXECUTE ON FUNCTION pg_config() FROM PUBLIC;
+CREATE VIEW pg_ts_shared_dictionaries AS
+ SELECT * FROM pg_ts_shared_dictionaries();
+
-- Statistics views
CREATE VIEW pg_stat_all_tables AS
diff --git a/src/backend/tsearch/ts_shared.c b/src/backend/tsearch/ts_shared.c
index 7d1f7544cf..ff3127f207 100644
--- a/src/backend/tsearch/ts_shared.c
+++ b/src/backend/tsearch/ts_shared.c
@@ -13,11 +13,18 @@
*/
#include "postgres.h"
+#include "funcapi.h"
+#include "miscadmin.h"
+
+#include "access/htup_details.h"
+#include "catalog/pg_ts_dict.h"
#include "lib/dshash.h"
#include "storage/lwlock.h"
#include "storage/shmem.h"
#include "tsearch/ts_shared.h"
+#include "utils/builtins.h"
#include "utils/hashutils.h"
+#include "utils/lsyscache.h"
#include "utils/memutils.h"
@@ -364,3 +371,100 @@ recheck_table:
MemoryContextSwitchTo(old_context);
}
+
+/*
+ * pg_ts_shared_dictionaries - SQL SRF showing dictionaries currently in
+ * shared memory.
+ */
+Datum
+pg_ts_shared_dictionaries(PG_FUNCTION_ARGS)
+{
+ ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
+ MemoryContext oldcontext;
+ TupleDesc tupdesc;
+ Tuplestorestate *tupstore;
+ Relation rel;
+ HeapTuple tuple;
+ SysScanDesc scan;
+
+ /* check to see if caller supports us returning a tuplestore */
+ if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("set-valued function called in context that cannot accept a set")));
+ if (!(rsinfo->allowedModes & SFRM_Materialize))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("materialize mode required, but it is not " \
+ "allowed in this context")));
+
+ /* Build a tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ elog(ERROR, "return type must be a row type");
+
+ /* Build tuplestore to hold the result rows */
+ oldcontext = MemoryContextSwitchTo(rsinfo->econtext->ecxt_per_query_memory);
+
+ tupstore = tuplestore_begin_heap(true, false, work_mem);
+ rsinfo->returnMode = SFRM_Materialize;
+ rsinfo->setResult = tupstore;
+ rsinfo->setDesc = tupdesc;
+
+ MemoryContextSwitchTo(oldcontext);
+
+ init_dict_table();
+
+ /*
+ * If the hash table wasn't created, return zero records.
+ */
+ if (!DsaPointerIsValid(tsearch_ctl->dict_table_handle))
+ {
+ tuplestore_donestoring(tupstore);
+
+ PG_RETURN_VOID();
+ }
+
+ /* Start to scan pg_ts_dict */
+ rel = heap_open(TSDictionaryRelationId, AccessShareLock);
+ scan = systable_beginscan(rel, InvalidOid, false, NULL, 0, NULL);
+
+ while (HeapTupleIsValid(tuple = systable_getnext(scan)))
+ {
+ Datum values[4];
+ bool nulls[4];
+ Form_pg_ts_dict dict = (Form_pg_ts_dict) GETSTRUCT(tuple);
+ Oid dictid = HeapTupleGetOid(tuple);
+ TsearchDictEntry *entry;
+ NameData dict_name;
+
+ /* If the dictionary isn't located in shared memory, try the next one */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, &dictid, false);
+ if (!entry)
+ continue;
+
+ namecpy(&dict_name, &dict->dictname);
+
+ memset(nulls, 0, sizeof(nulls));
+
+ values[0] = ObjectIdGetDatum(dictid);
+
+ if (OidIsValid(dict->dictnamespace))
+ values[1] = DirectFunctionCall1(namein,
+ CStringGetDatum(get_namespace_name(dict->dictnamespace)));
+ else
+ nulls[1] = true;
+
+ values[2] = NameGetDatum(&dict_name);
+ values[3] = Int64GetDatum(entry->dict_size);
+
+ dshash_release_lock(dict_table, entry);
+
+ tuplestore_putvalues(tupstore, tupdesc, values, nulls);
+ }
+
+ systable_endscan(scan);
+ heap_close(rel, AccessShareLock);
+
+ tuplestore_donestoring(tupstore);
+
+ PG_RETURN_VOID();
+}
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index ecec8f7ff8..71f704fc92 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -4961,6 +4961,9 @@ DESCR("trigger for automatic update of tsvector column");
DATA(insert OID = 3759 ( get_current_ts_config PGNSP PGUID 12 1 0 0 0 f f f f t f s s 0 0 3734 "" _null_ _null_ _null_ _null_ _null_ get_current_ts_config _null_ _null_ _null_ ));
DESCR("get current tsearch configuration");
+DATA(insert OID = 4213 ( pg_ts_shared_dictionaries PGNSP PGUID 12 1 10 0 0 f f f f f t s s 0 0 2249 "" "{26,19,19,20}" "{o,o,o,o}" "{dictoid,schemaname,dictname,size}" _null_ _null_ pg_ts_shared_dictionaries _null_ _null_ _null_ ));
+DESCR("information about text search dictionaries currently in shared memory");
+
DATA(insert OID = 3736 ( regconfigin PGNSP PGUID 12 1 0 0 0 f f f f t f s s 1 0 3734 "2275" _null_ _null_ _null_ _null_ _null_ regconfigin _null_ _null_ _null_ ));
DESCR("I/O");
DATA(insert OID = 3737 ( regconfigout PGNSP PGUID 12 1 0 0 0 f f f f t f s s 1 0 2275 "3734" _null_ _null_ _null_ _null_ _null_ regconfigout _null_ _null_ _null_ ));
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 5433944c6a..235b066119 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2209,6 +2209,11 @@ pg_timezone_names| SELECT pg_timezone_names.name,
pg_timezone_names.utc_offset,
pg_timezone_names.is_dst
FROM pg_timezone_names() pg_timezone_names(name, abbrev, utc_offset, is_dst);
+pg_ts_shared_dictionaries| SELECT pg_ts_shared_dictionaries.dictoid,
+ pg_ts_shared_dictionaries.schemaname,
+ pg_ts_shared_dictionaries.dictname,
+ pg_ts_shared_dictionaries.size
+ FROM pg_ts_shared_dictionaries() pg_ts_shared_dictionaries(dictoid, schemaname, dictname, size);
pg_user| SELECT pg_shadow.usename,
pg_shadow.usesysid,
pg_shadow.usecreatedb,
0006-Shared-memory-ispell-option-v5.patch
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index 78ed082994..f5e88f7c86 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -2829,6 +2829,7 @@ iconv -f ISO_8859-1 -t UTF-8 -o nn_no.dict nn_NO.dic
<programlisting>
CREATE TEXT SEARCH DICTIONARY english_hunspell (
TEMPLATE = ispell,
+ Shareable = false,
DictFile = en_us,
AffFile = en_us,
Stopwords = english);
@@ -2843,6 +2844,9 @@ CREATE TEXT SEARCH DICTIONARY english_hunspell (
The stop-words file has the same format explained above for the
<literal>simple</literal> dictionary type. The format of the other files is
not specified here but is available from the above-mentioned web sites.
+ <literal>Shareable</literal> controls whether the dictionary is loaded into
+ shared memory. The default is <literal>true</literal> (see
+ <xref linkend="textsearch-shared-dictionaries"/>).
</para>
<para>
@@ -3037,7 +3041,8 @@ CREATE TEXT SEARCH DICTIONARY english_stem (
Some dictionaries, especially <application>Ispell</application>, consumes a
noticable value of memory. Size of a dictionary can reach tens of megabytes.
Most of them also stores configuration in text files. A dictionary is compiled
- during first access per a user session.
+ during first access per a user session. Currently only
+ <application>Ispell</application> supports loading into shared memory.
</para>
<para>
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index e7f4d5a48d..8a714cec54 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -38,7 +38,8 @@ typedef struct
} DictISpell;
static void parse_dictoptions(List *dictoptions,
- char **dictfile, char **afffile, char **stopfile);
+ char **dictfile, char **afffile, char **stopfile,
+ bool *isshared);
static void *dispell_build(List *dictoptions, Size *size);
Datum
@@ -49,15 +50,22 @@ dispell_init(PG_FUNCTION_ARGS)
DictISpell *d;
void *dict_location;
char *stopfile;
+ bool isshared;
d = (DictISpell *) palloc0(sizeof(DictISpell));
- parse_dictoptions(dictoptions, NULL, NULL, &stopfile);
+ parse_dictoptions(dictoptions, NULL, NULL, &stopfile, &isshared);
+ /* Build the stop word list */
if (stopfile)
readstoplist(stopfile, &(d->stoplist), lowerstr);
- dict_location = ts_dict_shmem_location(dictid, dictoptions, dispell_build);
+ /* Build the dictionary itself, or get it from shared memory */
+ if (isshared)
+ dict_location = ts_dict_shmem_location(dictid, dictoptions, dispell_build);
+ else
+ dict_location = dispell_build(dictoptions, NULL);
+
Assert(dict_location);
d->obj.dict = (IspellDictData *) dict_location;
@@ -111,9 +119,10 @@ dispell_lexize(PG_FUNCTION_ARGS)
static void
parse_dictoptions(List *dictoptions, char **dictfile, char **afffile,
- char **stopfile)
+ char **stopfile, bool *isshared)
{
ListCell *l;
+ bool isshared_defined = false;
if (dictfile)
*dictfile = NULL;
@@ -121,6 +130,8 @@ parse_dictoptions(List *dictoptions, char **dictfile, char **afffile,
*afffile = NULL;
if (stopfile)
*stopfile = NULL;
+ if (isshared)
+ *isshared = true;
foreach(l, dictoptions)
{
@@ -159,6 +170,19 @@ parse_dictoptions(List *dictoptions, char **dictfile, char **afffile,
errmsg("multiple StopWords parameters")));
*stopfile = defGetString(defel);
}
+ else if (pg_strcasecmp(defel->defname, "Shareable") == 0)
+ {
+ if (!isshared)
+ continue;
+
+ if (isshared_defined)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("multiple Shareable parameters")));
+
+ *isshared = defGetBoolean(defel);
+ isshared_defined = true;
+ }
else
{
ereport(ERROR,
@@ -181,7 +205,7 @@ dispell_build(List *dictoptions, Size *size)
char *dictfile,
*afffile;
- parse_dictoptions(dictoptions, &dictfile, &afffile, NULL);
+ parse_dictoptions(dictoptions, &dictfile, &afffile, NULL, NULL);
if (!afffile)
{
@@ -213,6 +237,7 @@ dispell_build(List *dictoptions, Size *size)
NIFinishBuild(&build);
/* Return the buffer and its size */
- *size = build.dict_size;
+ if (size)
+ *size = build.dict_size;
return build.dict;
}
diff --git a/src/test/regress/expected/tsdicts.out b/src/test/regress/expected/tsdicts.out
index 0c1d7c7675..6f6bca4f42 100644
--- a/src/test/regress/expected/tsdicts.out
+++ b/src/test/regress/expected/tsdicts.out
@@ -194,6 +194,7 @@ SELECT ts_lexize('hunspell', 'footballyklubber');
-- Test ISpell dictionary with hunspell affix file with FLAG long parameter
CREATE TEXT SEARCH DICTIONARY hunspell_long (
Template=ispell,
+ Shareable=false,
DictFile=hunspell_sample_long,
AffFile=hunspell_sample_long
);
@@ -290,6 +291,7 @@ SELECT ts_lexize('hunspell_long', 'footballyklubber');
-- Test ISpell dictionary with hunspell affix file with FLAG num parameter
CREATE TEXT SEARCH DICTIONARY hunspell_num (
Template=ispell,
+ Shareable=false,
DictFile=hunspell_sample_num,
AffFile=hunspell_sample_num
);
@@ -588,3 +590,58 @@ CREATE TEXT SEARCH DICTIONARY tsdict_case
"AffFile" = ispell_sample
);
ERROR: unrecognized Ispell parameter: "DictFile"
+-- Test shared dictionaries
+CREATE TEXT SEARCH DICTIONARY shared_ispell (
+ Template=ispell,
+ DictFile=ispell_sample,
+ AffFile=ispell_sample
+);
+-- Make sure that dictionaries in shared memory
+SELECT ts_lexize('ispell', 'skies');
+ ts_lexize
+-----------
+ {sky}
+(1 row)
+
+SELECT ts_lexize('hunspell', 'skies');
+ ts_lexize
+-----------
+ {sky}
+(1 row)
+
+SELECT ts_lexize('shared_ispell', 'skies');
+ ts_lexize
+-----------
+ {sky}
+(1 row)
+
+SELECT schemaname, dictname FROM pg_ts_shared_dictionaries;
+ schemaname | dictname
+------------+---------------
+ public | ispell
+ public | hunspell
+ public | shared_ispell
+(3 rows)
+
+-- shared_ispell space should be released in shared memory
+DROP TEXT SEARCH DICTIONARY shared_ispell;
+-- Make sure that dictionaries in shared memory, DROP invalidates cache
+SELECT ts_lexize('ispell', 'skies');
+ ts_lexize
+-----------
+ {sky}
+(1 row)
+
+SELECT ts_lexize('hunspell', 'skies');
+ ts_lexize
+-----------
+ {sky}
+(1 row)
+
+SELECT schemaname, dictname FROM pg_ts_shared_dictionaries;
+ schemaname | dictname
+------------+----------
+ public | ispell
+ public | hunspell
+(2 rows)
+
diff --git a/src/test/regress/sql/tsdicts.sql b/src/test/regress/sql/tsdicts.sql
index 1633c0d066..66a7c37e53 100644
--- a/src/test/regress/sql/tsdicts.sql
+++ b/src/test/regress/sql/tsdicts.sql
@@ -51,6 +51,7 @@ SELECT ts_lexize('hunspell', 'footballyklubber');
-- Test ISpell dictionary with hunspell affix file with FLAG long parameter
CREATE TEXT SEARCH DICTIONARY hunspell_long (
Template=ispell,
+ Shareable=false,
DictFile=hunspell_sample_long,
AffFile=hunspell_sample_long
);
@@ -75,6 +76,7 @@ SELECT ts_lexize('hunspell_long', 'footballyklubber');
-- Test ISpell dictionary with hunspell affix file with FLAG num parameter
CREATE TEXT SEARCH DICTIONARY hunspell_num (
Template=ispell,
+ Shareable=false,
DictFile=hunspell_sample_num,
AffFile=hunspell_sample_num
);
@@ -196,3 +198,26 @@ CREATE TEXT SEARCH DICTIONARY tsdict_case
"DictFile" = ispell_sample,
"AffFile" = ispell_sample
);
+
+-- Test shared dictionaries
+CREATE TEXT SEARCH DICTIONARY shared_ispell (
+ Template=ispell,
+ DictFile=ispell_sample,
+ AffFile=ispell_sample
+);
+
+-- Make sure that dictionaries in shared memory
+SELECT ts_lexize('ispell', 'skies');
+SELECT ts_lexize('hunspell', 'skies');
+SELECT ts_lexize('shared_ispell', 'skies');
+
+SELECT schemaname, dictname FROM pg_ts_shared_dictionaries;
+
+-- shared_ispell space should be released in shared memory
+DROP TEXT SEARCH DICTIONARY shared_ispell;
+
+-- Make sure that dictionaries in shared memory, DROP invalidates cache
+SELECT ts_lexize('ispell', 'skies');
+SELECT ts_lexize('hunspell', 'skies');
+
+SELECT schemaname, dictname FROM pg_ts_shared_dictionaries;
Hi,
On 2018-02-07 19:28:29 +0300, Arthur Zakirov wrote:
+ {
+ {"max_shared_dictionaries_size", PGC_POSTMASTER, RESOURCES_MEM,
+ gettext_noop("Sets the maximum size of all text search dictionaries loaded into shared memory."),
+ gettext_noop("Currently controls only loading of Ispell dictionaries. "
+ "If total size of simultaneously loaded dictionaries "
+ "reaches the maximum allowed size then a new dictionary "
+ "will be loaded into local memory of a backend."),
+ GUC_UNIT_KB,
+ },
+ &max_shared_dictionaries_size,
+ 100 * 1024, 0, MAX_KILOBYTES,
+ NULL, NULL, NULL
+ },
So this uses shared memory, allocated at server start? That doesn't
seem right. Wouldn't it make more sense to have a
'num_shared_dictionaries' GUC, and then allocate them with dsm? Or even
better not have any such limit and use a dshash table to point to
individual loaded tables?
Is there any chance we can instead convert dictionaries into a form
we can just mmap() into memory? That'd scale a lot higher and more
dynamically?
Regards,
Andres
Hello,
Thank you for your comments.
On Thu, Mar 01, 2018 at 08:31:49PM -0800, Andres Freund wrote:
Hi,
On 2018-02-07 19:28:29 +0300, Arthur Zakirov wrote:
+ {
+ {"max_shared_dictionaries_size", PGC_POSTMASTER, RESOURCES_MEM,
+ gettext_noop("Sets the maximum size of all text search dictionaries loaded into shared memory."),
+ gettext_noop("Currently controls only loading of Ispell dictionaries. "
+ "If total size of simultaneously loaded dictionaries "
+ "reaches the maximum allowed size then a new dictionary "
+ "will be loaded into local memory of a backend."),
+ GUC_UNIT_KB,
+ },
+ &max_shared_dictionaries_size,
+ 100 * 1024, 0, MAX_KILOBYTES,
+ NULL, NULL, NULL
+ },
So this uses shared memory, allocated at server start? That doesn't
seem right. Wouldn't it make more sense to have a
'num_shared_dictionaries' GUC, and then allocate them with dsm? Or even
better not have any such limit and use a dshash table to point to
individual loaded tables?
The patch already uses DSM and a dshash table.
The 'max_shared_dictionaries_size' GUC was introduced after a discussion
with Tomas [1], to limit the amount of memory consumed by loaded
dictionaries and to prevent possible memory bloat. Its default value is 100MB.
There used to be a 'shared_dictionaries' GUC. It was needed because
ordinary hash tables were used at the time, not dshash. I replaced the
ordinary hash tables with dshash, removed 'shared_dictionaries', and added
'max_shared_dictionaries_size'.
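To make the limit concrete, here is a minimal standalone sketch of the accounting that 'max_shared_dictionaries_size' implies. It is not the patch code: DictBudget, budget_reserve() and budget_release() are hypothetical names, sizes are in bytes rather than kilobytes, and locking is left out (the patch does the equivalent check while holding an LWLock).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the shared control data. */
typedef struct
{
    size_t      max_bytes;      /* plays the role of max_shared_dictionaries_size */
    size_t      loaded_bytes;   /* total size of dictionaries already shared */
} DictBudget;

/*
 * Try to reserve room for a dictionary of dict_bytes.
 *
 * Returns true if the dictionary fits into the shared budget. Returns false
 * if it does not; the caller would then keep the dictionary in backend-local
 * memory instead.
 */
static bool
budget_reserve(DictBudget *b, size_t dict_bytes)
{
    assert(b->loaded_bytes <= b->max_bytes);

    /* Written as a subtraction so the comparison cannot overflow. */
    if (dict_bytes > b->max_bytes - b->loaded_bytes)
        return false;

    b->loaded_bytes += dict_bytes;
    return true;
}

/* Give the space back when a shared dictionary is dropped. */
static void
budget_release(DictBudget *b, size_t dict_bytes)
{
    assert(dict_bytes <= b->loaded_bytes);
    b->loaded_bytes -= dict_bytes;
}
```

A dictionary that does not fit is not an error in this scheme; it only loses the sharing benefit for that backend.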
Is there any chance we can instead convert dictionaries into a form
we can just mmap() into memory? That'd scale a lot higher and more
dynamically?
I think the new IspellDictData structure (in 0003-Store-ispell-structures-in-shmem-v5.patch)
could already be stored in a binary file and mapped into memory. But
mmap() is not used in this patch yet.
I can do some experiments and make a prototype.
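For illustration only, a rough sketch of what "stored in a binary file and mapped into memory" could look like for a pointer-free structure. This is not IspellDictData and not part of the patch: FlatDict, flat_write(), flat_map() and flat_word() are hypothetical, and the layout (byte offsets relative to the start of the image) is an assumption chosen to show why the structure must not contain pointers.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical flat image: offsets instead of pointers. */
typedef struct
{
    uint32_t    nwords;
    uint32_t    word_off[];     /* byte offset of each word from image start */
} FlatDict;

/* Resolve an offset against wherever the image happens to be mapped. */
static const char *
flat_word(const FlatDict *d, uint32_t i)
{
    assert(i < d->nwords);
    return (const char *) d + d->word_off[i];
}

/* Build the image in memory and write it out. Returns 0 on success. */
static int
flat_write(const char *path, const char *const *words, uint32_t n)
{
    size_t      hdr = sizeof(FlatDict) + (size_t) n * sizeof(uint32_t);
    size_t      total = hdr;
    FlatDict   *d;
    FILE       *f;
    int         ok;

    for (uint32_t i = 0; i < n; i++)
        total += strlen(words[i]) + 1;

    d = calloc(1, total);
    if (d == NULL)
        return -1;

    d->nwords = n;
    for (uint32_t i = 0, off = (uint32_t) hdr; i < n; i++)
    {
        d->word_off[i] = off;
        strcpy((char *) d + off, words[i]);
        off += (uint32_t) strlen(words[i]) + 1;
    }

    f = fopen(path, "wb");
    ok = (f != NULL && fwrite(d, 1, total, f) == total);
    if (f != NULL && fclose(f) != 0)
        ok = 0;
    free(d);
    return ok ? 0 : -1;
}

/* Map the image read-only. Returns NULL on failure. */
static const FlatDict *
flat_map(const char *path, size_t *len)
{
    struct stat st;
    void       *p;
    int         fd = open(path, O_RDONLY);

    if (fd < 0)
        return NULL;
    if (fstat(fd, &st) != 0 || st.st_size < (off_t) sizeof(FlatDict))
    {
        close(fd);
        return NULL;
    }
    p = mmap(NULL, (size_t) st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                  /* the mapping stays valid after close() */
    if (p == MAP_FAILED)
        return NULL;
    *len = (size_t) st.st_size;
    return (const FlatDict *) p;
}
```

Because every reference is an offset, each process may map the file at a different address and still read the same image, which is the same property the DSM-based approach relies on.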
1 - /messages/by-id/d12d9395-922c-64c9-c87d-dd0e1d31440e@2ndquadrant.com
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
On Wed, Feb 07, 2018 at 07:28:29PM +0300, Arthur Zakirov wrote:
Here is a rebased version of the patch, made necessary by changes to dict_ispell.c.
The patch itself wasn't changed.
Here is a rebased version of the patch, made necessary by changes to pg_proc.h.
I haven't implemented an mmap prototype yet, though.
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Attachments:
0006-Shared-memory-ispell-option-v6.patch (text/plain; charset=us-ascii)
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index 78ed082994..f5e88f7c86 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -2829,6 +2829,7 @@ iconv -f ISO_8859-1 -t UTF-8 -o nn_no.dict nn_NO.dic
<programlisting>
CREATE TEXT SEARCH DICTIONARY english_hunspell (
TEMPLATE = ispell,
+ Shareable = false,
DictFile = en_us,
AffFile = en_us,
Stopwords = english);
@@ -2843,6 +2844,9 @@ CREATE TEXT SEARCH DICTIONARY english_hunspell (
The stop-words file has the same format explained above for the
<literal>simple</literal> dictionary type. The format of the other files is
not specified here but is available from the above-mentioned web sites.
+ <literal>Shareable</literal> controls loading into shared memory. By
+ default it is <literal>true</literal> (see more in
+ <xref linkend="textsearch-shared-dictionaries"/>).
</para>
<para>
@@ -3037,7 +3041,8 @@ CREATE TEXT SEARCH DICTIONARY english_stem (
Some dictionaries, especially <application>Ispell</application>, consumes a
noticable value of memory. Size of a dictionary can reach tens of megabytes.
Most of them also stores configuration in text files. A dictionary is compiled
- during first access per a user session.
+ during first access per a user session. Currently only
+ <application>Ispell</application> supports loading into shared memory.
</para>
<para>
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index e7f4d5a48d..8a714cec54 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -38,7 +38,8 @@ typedef struct
} DictISpell;
static void parse_dictoptions(List *dictoptions,
- char **dictfile, char **afffile, char **stopfile);
+ char **dictfile, char **afffile, char **stopfile,
+ bool *isshared);
static void *dispell_build(List *dictoptions, Size *size);
Datum
@@ -49,15 +50,22 @@ dispell_init(PG_FUNCTION_ARGS)
DictISpell *d;
void *dict_location;
char *stopfile;
+ bool isshared;
d = (DictISpell *) palloc0(sizeof(DictISpell));
- parse_dictoptions(dictoptions, NULL, NULL, &stopfile);
+ parse_dictoptions(dictoptions, NULL, NULL, &stopfile, &isshared);
+ /* Make stop word list */
if (stopfile)
readstoplist(stopfile, &(d->stoplist), lowerstr);
- dict_location = ts_dict_shmem_location(dictid, dictoptions, dispell_build);
+ /* Make or get from shared memory dictionary itself */
+ if (isshared)
+ dict_location = ts_dict_shmem_location(dictid, dictoptions, dispell_build);
+ else
+ dict_location = dispell_build(dictoptions, NULL);
+
Assert(dict_location);
d->obj.dict = (IspellDictData *) dict_location;
@@ -111,9 +119,10 @@ dispell_lexize(PG_FUNCTION_ARGS)
static void
parse_dictoptions(List *dictoptions, char **dictfile, char **afffile,
- char **stopfile)
+ char **stopfile, bool *isshared)
{
ListCell *l;
+ bool isshared_defined = false;
if (dictfile)
*dictfile = NULL;
@@ -121,6 +130,8 @@ parse_dictoptions(List *dictoptions, char **dictfile, char **afffile,
*afffile = NULL;
if (stopfile)
*stopfile = NULL;
+ if (isshared)
+ *isshared = true;
foreach(l, dictoptions)
{
@@ -159,6 +170,19 @@ parse_dictoptions(List *dictoptions, char **dictfile, char **afffile,
errmsg("multiple StopWords parameters")));
*stopfile = defGetString(defel);
}
+ else if (pg_strcasecmp(defel->defname, "Shareable") == 0)
+ {
+ if (!isshared)
+ continue;
+
+ if (isshared_defined)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("multiple Shareable parameters")));
+
+ *isshared = defGetBoolean(defel);
+ isshared_defined = true;
+ }
else
{
ereport(ERROR,
@@ -181,7 +205,7 @@ dispell_build(List *dictoptions, Size *size)
char *dictfile,
*afffile;
- parse_dictoptions(dictoptions, &dictfile, &afffile, NULL);
+ parse_dictoptions(dictoptions, &dictfile, &afffile, NULL, NULL);
if (!afffile)
{
@@ -213,6 +237,7 @@ dispell_build(List *dictoptions, Size *size)
NIFinishBuild(&build);
/* Return the buffer and its size */
- *size = build.dict_size;
+ if (size)
+ *size = build.dict_size;
return build.dict;
}
diff --git a/src/test/regress/expected/tsdicts.out b/src/test/regress/expected/tsdicts.out
index 0c1d7c7675..6f6bca4f42 100644
--- a/src/test/regress/expected/tsdicts.out
+++ b/src/test/regress/expected/tsdicts.out
@@ -194,6 +194,7 @@ SELECT ts_lexize('hunspell', 'footballyklubber');
-- Test ISpell dictionary with hunspell affix file with FLAG long parameter
CREATE TEXT SEARCH DICTIONARY hunspell_long (
Template=ispell,
+ Shareable=false,
DictFile=hunspell_sample_long,
AffFile=hunspell_sample_long
);
@@ -290,6 +291,7 @@ SELECT ts_lexize('hunspell_long', 'footballyklubber');
-- Test ISpell dictionary with hunspell affix file with FLAG num parameter
CREATE TEXT SEARCH DICTIONARY hunspell_num (
Template=ispell,
+ Shareable=false,
DictFile=hunspell_sample_num,
AffFile=hunspell_sample_num
);
@@ -588,3 +590,58 @@ CREATE TEXT SEARCH DICTIONARY tsdict_case
"AffFile" = ispell_sample
);
ERROR: unrecognized Ispell parameter: "DictFile"
+-- Test shared dictionaries
+CREATE TEXT SEARCH DICTIONARY shared_ispell (
+ Template=ispell,
+ DictFile=ispell_sample,
+ AffFile=ispell_sample
+);
+-- Make sure that dictionaries in shared memory
+SELECT ts_lexize('ispell', 'skies');
+ ts_lexize
+-----------
+ {sky}
+(1 row)
+
+SELECT ts_lexize('hunspell', 'skies');
+ ts_lexize
+-----------
+ {sky}
+(1 row)
+
+SELECT ts_lexize('shared_ispell', 'skies');
+ ts_lexize
+-----------
+ {sky}
+(1 row)
+
+SELECT schemaname, dictname FROM pg_ts_shared_dictionaries;
+ schemaname | dictname
+------------+---------------
+ public | ispell
+ public | hunspell
+ public | shared_ispell
+(3 rows)
+
+-- shared_ispell space should be released in shared memory
+DROP TEXT SEARCH DICTIONARY shared_ispell;
+-- Make sure that dictionaries in shared memory, DROP invalidates cache
+SELECT ts_lexize('ispell', 'skies');
+ ts_lexize
+-----------
+ {sky}
+(1 row)
+
+SELECT ts_lexize('hunspell', 'skies');
+ ts_lexize
+-----------
+ {sky}
+(1 row)
+
+SELECT schemaname, dictname FROM pg_ts_shared_dictionaries;
+ schemaname | dictname
+------------+----------
+ public | ispell
+ public | hunspell
+(2 rows)
+
diff --git a/src/test/regress/sql/tsdicts.sql b/src/test/regress/sql/tsdicts.sql
index 1633c0d066..66a7c37e53 100644
--- a/src/test/regress/sql/tsdicts.sql
+++ b/src/test/regress/sql/tsdicts.sql
@@ -51,6 +51,7 @@ SELECT ts_lexize('hunspell', 'footballyklubber');
-- Test ISpell dictionary with hunspell affix file with FLAG long parameter
CREATE TEXT SEARCH DICTIONARY hunspell_long (
Template=ispell,
+ Shareable=false,
DictFile=hunspell_sample_long,
AffFile=hunspell_sample_long
);
@@ -75,6 +76,7 @@ SELECT ts_lexize('hunspell_long', 'footballyklubber');
-- Test ISpell dictionary with hunspell affix file with FLAG num parameter
CREATE TEXT SEARCH DICTIONARY hunspell_num (
Template=ispell,
+ Shareable=false,
DictFile=hunspell_sample_num,
AffFile=hunspell_sample_num
);
@@ -196,3 +198,26 @@ CREATE TEXT SEARCH DICTIONARY tsdict_case
"DictFile" = ispell_sample,
"AffFile" = ispell_sample
);
+
+-- Test shared dictionaries
+CREATE TEXT SEARCH DICTIONARY shared_ispell (
+ Template=ispell,
+ DictFile=ispell_sample,
+ AffFile=ispell_sample
+);
+
+-- Make sure that dictionaries in shared memory
+SELECT ts_lexize('ispell', 'skies');
+SELECT ts_lexize('hunspell', 'skies');
+SELECT ts_lexize('shared_ispell', 'skies');
+
+SELECT schemaname, dictname FROM pg_ts_shared_dictionaries;
+
+-- shared_ispell space should be released in shared memory
+DROP TEXT SEARCH DICTIONARY shared_ispell;
+
+-- Make sure that dictionaries in shared memory, DROP invalidates cache
+SELECT ts_lexize('ispell', 'skies');
+SELECT ts_lexize('hunspell', 'skies');
+
+SELECT schemaname, dictname FROM pg_ts_shared_dictionaries;
0001-Fix-ispell-memory-handling-v6.patch (text/plain; charset=us-ascii)
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index b9fdd77e19..e071994523 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -78,6 +78,8 @@
#define tmpalloc(sz) MemoryContextAlloc(Conf->buildCxt, (sz))
#define tmpalloc0(sz) MemoryContextAllocZero(Conf->buildCxt, (sz))
+#define tmpstrdup(str) MemoryContextStrdup(Conf->buildCxt, (str))
+
/*
* Prepare for constructing an ISpell dictionary.
*
@@ -498,7 +500,7 @@ NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
strcpy(Conf->Spell[Conf->nspell]->word, word);
Conf->Spell[Conf->nspell]->p.flag = (*flag != '\0')
- ? cpstrdup(Conf, flag) : VoidString;
+ ? tmpstrdup(flag) : VoidString;
Conf->nspell++;
}
@@ -1040,7 +1042,7 @@ setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
entry->flag.i = i;
}
else
- entry->flag.s = cpstrdup(Conf, s);
+ entry->flag.s = tmpstrdup(s);
entry->flagMode = Conf->flagMode;
entry->value = val;
@@ -1536,6 +1538,9 @@ nextline:
return;
isnewformat:
+ pfree(recoded);
+ pfree(pstr);
+
if (oldformat)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
0002-Retreive-shmem-location-for-ispell-v6.patch (text/plain; charset=us-ascii)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 259a2d83b4..439d2cdf87 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1364,6 +1364,35 @@ include_dir 'conf.d'
</listitem>
</varlistentry>
+ <varlistentry id="guc-max-shared-dictionaries-size" xreflabel="max_shared_dictionaries_size">
+ <term><varname>max_shared_dictionaries_size</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>max_shared_dictionaries_size</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum size of all text search dictionaries loaded into shared
+ memory. The default is 100 megabytes (<literal>100MB</literal>). This
+ parameter can only be set at server start.
+ </para>
+
+ <para>
+ Currently controls only loading of <application>Ispell</application>
+ dictionaries (see <xref linkend="textsearch-ispell-dictionary"/>).
+ After the dictionary is compiled, it is copied into shared memory.
+ Other backends then use it from shared memory on first access, so
+ they do not need to compile the dictionary a second time.
+ </para>
+
+ <para>
+ If the total size of simultaneously loaded dictionaries reaches the maximum
+ allowed size, a new dictionary is loaded into the local memory of
+ a backend instead.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-huge-pages" xreflabel="huge_pages">
<term><varname>huge_pages</varname> (<type>enum</type>)
<indexterm>
diff --git a/src/backend/commands/tsearchcmds.c b/src/backend/commands/tsearchcmds.c
index 3a843512d1..b6aeae449b 100644
--- a/src/backend/commands/tsearchcmds.c
+++ b/src/backend/commands/tsearchcmds.c
@@ -39,6 +39,7 @@
#include "nodes/makefuncs.h"
#include "parser/parse_func.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_shared.h"
#include "tsearch/ts_utils.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
@@ -396,7 +397,8 @@ verify_dictoptions(Oid tmplId, List *dictoptions)
* Call the init method and see if it complains. We don't worry about
* it leaking memory, since our command will soon be over anyway.
*/
- (void) OidFunctionCall1(initmethod, PointerGetDatum(dictoptions));
+ (void) OidFunctionCall2(initmethod, PointerGetDatum(dictoptions),
+ ObjectIdGetDatum(InvalidOid));
}
ReleaseSysCache(tup);
@@ -513,6 +515,8 @@ RemoveTSDictionaryById(Oid dictId)
CatalogTupleDelete(relation, &tup->t_self);
+ ts_dict_shmem_release(dictId);
+
ReleaseSysCache(tup);
heap_close(relation, RowExclusiveLock);
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 0c86a581c0..c7dce8cac5 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -44,6 +44,7 @@
#include "storage/procsignal.h"
#include "storage/sinvaladt.h"
#include "storage/spin.h"
+#include "tsearch/ts_shared.h"
#include "utils/backend_random.h"
#include "utils/snapmgr.h"
@@ -150,6 +151,7 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
size = add_size(size, SyncScanShmemSize());
size = add_size(size, AsyncShmemSize());
size = add_size(size, BackendRandomShmemSize());
+ size = add_size(size, TsearchShmemSize());
#ifdef EXEC_BACKEND
size = add_size(size, ShmemBackendArraySize());
#endif
@@ -271,6 +273,11 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
AsyncShmemInit();
BackendRandomShmemInit();
+ /*
+ * Set up shared memory for tsearch
+ */
+ TsearchShmemInit();
+
#ifdef EXEC_BACKEND
/*
diff --git a/src/backend/tsearch/Makefile b/src/backend/tsearch/Makefile
index 227468ae9e..860cd196e9 100644
--- a/src/backend/tsearch/Makefile
+++ b/src/backend/tsearch/Makefile
@@ -26,7 +26,7 @@ DICTFILES_PATH=$(addprefix dicts/,$(DICTFILES))
OBJS = ts_locale.o ts_parse.o wparser.o wparser_def.o dict.o \
dict_simple.o dict_synonym.o dict_thesaurus.o \
dict_ispell.o regis.o spell.o \
- to_tsany.o ts_selfuncs.o ts_typanalyze.o ts_utils.o
+ to_tsany.o ts_selfuncs.o ts_shared.o ts_typanalyze.o ts_utils.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/tsearch/ts_shared.c b/src/backend/tsearch/ts_shared.c
new file mode 100644
index 0000000000..7d1f7544cf
--- /dev/null
+++ b/src/backend/tsearch/ts_shared.c
@@ -0,0 +1,366 @@
+/*-------------------------------------------------------------------------
+ *
+ * ts_shared.c
+ * tsearch shared memory management
+ *
+ * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/tsearch/ts_shared.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "lib/dshash.h"
+#include "storage/lwlock.h"
+#include "storage/shmem.h"
+#include "tsearch/ts_shared.h"
+#include "utils/hashutils.h"
+#include "utils/memutils.h"
+
+
+/*
+ * Hash table structures
+ */
+typedef struct
+{
+ Oid dict_id;
+ dsm_handle dict_dsm;
+ Size dict_size;
+
+ /* How many backends have DSM mapping */
+ uint32 refcnt;
+} TsearchDictEntry;
+
+static dshash_table *dict_table = NULL;
+
+/*
+ * Shared struct for locking
+ */
+typedef struct
+{
+ dsa_handle area;
+ dshash_table_handle dict_table_handle;
+
+ /* Total size of loaded dictionaries into shared memory in bytes */
+ Size loaded_size;
+
+ LWLock lock;
+} TsearchCtlData;
+
+static TsearchCtlData *tsearch_ctl;
+
+/*
+ * GUC variable for the maximum total size of shared dictionaries, in
+ * kilobytes. Default value is 100MB.
+ */
+int max_shared_dictionaries_size = 100 * 1024;
+
+static void init_dict_table(void);
+
+/* Parameters for dict_table */
+static const dshash_parameters dict_table_params ={
+ sizeof(Oid),
+ sizeof(TsearchDictEntry),
+ dshash_memcmp,
+ dshash_memhash,
+ LWTRANCHE_TSEARCH_TABLE
+};
+
+/*
+ * Build the dictionary using the allocate_cb callback. If there is space in
+ * shared memory and max_shared_dictionaries_size is greater than 0, copy the
+ * dictionary into DSM.
+ *
+ * If max_shared_dictionaries_size is greater than 0 then try to find the
+ * dictionary in the shared hash table first. If it was built by another
+ * backend earlier, just return its location in DSM.
+ *
+ * dictid: Oid of the dictionary.
+ * dictoptions: dictionary options, passed to the callback function.
+ * allocate_cb: function to build the dictionary, if it wasn't found in DSM.
+ *
+ * Returns address in the dynamic shared memory segment or in backend memory.
+ */
+void *
+ts_dict_shmem_location(Oid dictid, List *dictoptions,
+ ispell_build_callback allocate_cb)
+{
+ TsearchDictEntry *entry;
+ bool found;
+ dsm_segment *seg;
+ void *dict,
+ *dict_location;
+
+#define CHECK_SHARED_SPACE() \
+ if (entry->dict_size + tsearch_ctl->loaded_size > \
+ max_shared_dictionaries_size * 1024L) \
+ { \
+ LWLockRelease(&tsearch_ctl->lock); \
+ ereport(LOG, \
+ (errmsg("there is no space in shared memory for text search " \
+ "dictionary %u, it will be loaded into backend's memory", \
+ dictid))); \
+ dshash_delete_entry(dict_table, entry); \
+ return dict; \
+ } \
+
+ init_dict_table();
+
+ /*
+ * Build the dictionary in backend's memory if a hash table wasn't created
+ * or dictid is invalid (it may happen if the dictionary's init method was
+ * called within verify_dictoptions()).
+ */
+ if (!DsaPointerIsValid(tsearch_ctl->dict_table_handle) ||
+ !OidIsValid(dictid))
+ {
+ Size dict_size;
+
+ dict = allocate_cb(dictoptions, &dict_size);
+
+ return dict;
+ }
+
+ /* Try to find an entry in the hash table */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, &dictid, false);
+
+ if (entry)
+ {
+ seg = dsm_find_mapping(entry->dict_dsm);
+ if (!seg)
+ {
+ seg = dsm_attach(entry->dict_dsm);
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+ }
+
+ entry->refcnt++;
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+ }
+
+ /* The dictionary hasn't been loaded into shared memory yet */
+ entry = (TsearchDictEntry *) dshash_find_or_insert(dict_table, &dictid,
+ &found);
+
+ if (found)
+ {
+ /*
+ * Someone concurrently inserted a dictionary entry since the first time
+ * we checked.
+ */
+ seg = dsm_attach(entry->dict_dsm);
+
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+
+ entry->refcnt++;
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+ }
+
+ /* Build the dictionary */
+ dict = allocate_cb(dictoptions, &entry->dict_size);
+
+ LWLockAcquire(&tsearch_ctl->lock, LW_SHARED);
+
+ /* Before allocating a DSM segment check remaining shared space */
+ Assert(max_shared_dictionaries_size);
+
+ CHECK_SHARED_SPACE();
+
+ LWLockRelease(&tsearch_ctl->lock);
+ /* If we come here, we need an exclusive lock */
+ while (!LWLockAcquireOrWait(&tsearch_ctl->lock, LW_EXCLUSIVE))
+ {
+ /*
+ * Check again in case there is no space left anymore after we were
+ * waiting for the exclusive lock.
+ */
+ CHECK_SHARED_SPACE();
+ }
+
+ tsearch_ctl->loaded_size += entry->dict_size;
+
+ LWLockRelease(&tsearch_ctl->lock);
+
+ /* Finally, allocate a DSM segment for the compiled dictionary */
+ seg = dsm_create(entry->dict_size, 0);
+ dict_location = dsm_segment_address(seg);
+ memcpy(dict_location, dict, entry->dict_size);
+
+ pfree(dict);
+
+ entry->dict_id = dictid;
+ entry->dict_dsm = dsm_segment_handle(seg);
+ entry->refcnt++;
+
+ /* Remain attached until end of postmaster */
+ dsm_pin_segment(seg);
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+}
+
+/*
+ * Release memory occupied by the dictionary. This just unpins the DSM mapping.
+ * If no other backend maps this DSM segment, the segment is unpinned as well.
+ *
+ * dictid: Oid of the dictionary.
+ */
+void
+ts_dict_shmem_release(Oid dictid)
+{
+ TsearchDictEntry *entry;
+
+ /*
+ * If we didn't attach to a hash table then do nothing.
+ */
+ if (!dict_table)
+ return;
+
+ /* Try to find an entry in the hash table */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, &dictid, true);
+
+ if (entry)
+ {
+ dsm_segment *seg;
+
+ seg = dsm_find_mapping(entry->dict_dsm);
+ /*
+ * If current backend didn't pin a mapping then we don't need to do
+ * unpinning.
+ */
+ if (!seg)
+ {
+ dshash_release_lock(dict_table, entry);
+ return;
+ }
+
+ dsm_unpin_mapping(seg);
+ dsm_detach(seg);
+
+ entry->refcnt--;
+
+ if (entry->refcnt == 0)
+ {
+ dsm_unpin_segment(entry->dict_dsm);
+ dshash_delete_entry(dict_table, entry);
+ }
+ else
+ dshash_release_lock(dict_table, entry);
+ }
+}
+
+/*
+ * Allocate and initialize tsearch-related shared memory.
+ */
+void
+TsearchShmemInit(void)
+{
+ bool found;
+
+ tsearch_ctl = (TsearchCtlData *)
+ ShmemInitStruct("Full Text Search Ctl", sizeof(TsearchCtlData), &found);
+
+ if (!found)
+ {
+ LWLockRegisterTranche(LWTRANCHE_TSEARCH_DSA, "tsearch_dsa");
+ LWLockRegisterTranche(LWTRANCHE_TSEARCH_TABLE, "tsearch_table");
+
+ LWLockInitialize(&tsearch_ctl->lock, LWTRANCHE_TSEARCH_DSA);
+
+ tsearch_ctl->area = DSM_HANDLE_INVALID;
+ tsearch_ctl->dict_table_handle = InvalidDsaPointer;
+ tsearch_ctl->loaded_size = 0;
+ }
+}
+
+/*
+ * Report shared memory space needed by TsearchShmemInit.
+ */
+Size
+TsearchShmemSize(void)
+{
+ Size size = 0;
+
+ /* size of service structure */
+ size = add_size(size, MAXALIGN(sizeof(TsearchCtlData)));
+
+ return size;
+}
+
+/*
+ * Initialize hash table located in DSM.
+ *
+ * The hash table should be created and initialized iff
+ * max_shared_dictionaries_size GUC is greater than zero and it doesn't exist
+ * yet.
+ */
+static void
+init_dict_table(void)
+{
+ MemoryContext old_context;
+ dsa_area *dsa;
+
+ if (max_shared_dictionaries_size == 0)
+ return;
+
+ if (dict_table)
+ return;
+
+ old_context = MemoryContextSwitchTo(TopMemoryContext);
+
+recheck_table:
+ LWLockAcquire(&tsearch_ctl->lock, LW_SHARED);
+
+ /* The hash table has already been created by another backend */
+ if (DsaPointerIsValid(tsearch_ctl->dict_table_handle))
+ {
+ Assert(tsearch_ctl->area != DSM_HANDLE_INVALID);
+
+ dsa = dsa_attach(tsearch_ctl->area);
+
+ dict_table = dshash_attach(dsa,
+ &dict_table_params,
+ tsearch_ctl->dict_table_handle,
+ NULL);
+ }
+ else
+ {
+ /* Try to get exclusive lock */
+ LWLockRelease(&tsearch_ctl->lock);
+ if (!LWLockAcquireOrWait(&tsearch_ctl->lock, LW_EXCLUSIVE))
+ {
+ /*
+ * The lock was released by another backend and other backend
+ * has concurrently created the hash table already.
+ */
+ goto recheck_table;
+ }
+
+ dsa = dsa_create(LWTRANCHE_TSEARCH_DSA);
+ tsearch_ctl->area = dsa_get_handle(dsa);
+
+ dict_table = dshash_create(dsa, &dict_table_params, NULL);
+ tsearch_ctl->dict_table_handle = dshash_get_hash_table_handle(dict_table);
+
+ /* Remain attached until end of postmaster */
+ dsa_pin(dsa);
+ }
+
+ LWLockRelease(&tsearch_ctl->lock);
+
+ /* Remain attached until end of session */
+ dsa_pin_mapping(dsa);
+
+ MemoryContextSwitchTo(old_context);
+}
diff --git a/src/backend/utils/cache/ts_cache.c b/src/backend/utils/cache/ts_cache.c
index 3d5c194148..c078503111 100644
--- a/src/backend/utils/cache/ts_cache.c
+++ b/src/backend/utils/cache/ts_cache.c
@@ -39,6 +39,7 @@
#include "catalog/pg_ts_template.h"
#include "commands/defrem.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_shared.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
#include "utils/fmgroids.h"
@@ -98,7 +99,16 @@ InvalidateTSCacheCallBack(Datum arg, int cacheid, uint32 hashvalue)
hash_seq_init(&status, hash);
while ((entry = (TSAnyCacheEntry *) hash_seq_search(&status)) != NULL)
+ {
+ if (entry->isvalid && hash == TSDictionaryCacheHash)
+ {
+ TSDictionaryCacheEntry *dict_entry = (TSDictionaryCacheEntry *) entry;
+
+ ts_dict_shmem_release(dict_entry->dictId);
+ }
+
entry->isvalid = false;
+ }
/* Also invalidate the current-config cache if it's pg_ts_config */
if (hash == TSConfigCacheHash)
@@ -334,8 +344,9 @@ lookup_ts_dictionary_cache(Oid dictId)
dictoptions = deserialize_deflist(opt);
entry->dictData =
- DatumGetPointer(OidFunctionCall1(template->tmplinit,
- PointerGetDatum(dictoptions)));
+ DatumGetPointer(OidFunctionCall2(template->tmplinit,
+ PointerGetDatum(dictoptions),
+ ObjectIdGetDatum(dictId)));
MemoryContextSwitchTo(oldcontext);
}
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 1db7845d5a..3488a8fb6a 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -76,6 +76,7 @@
#include "storage/predicate.h"
#include "tcop/tcopprot.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_shared.h"
#include "utils/builtins.h"
#include "utils/bytea.h"
#include "utils/guc_tables.h"
@@ -2922,6 +2923,20 @@ static struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"max_shared_dictionaries_size", PGC_POSTMASTER, RESOURCES_MEM,
+ gettext_noop("Sets the maximum size of all text search dictionaries loaded into shared memory."),
+ gettext_noop("Currently controls only loading of Ispell dictionaries. "
+ "If total size of simultaneously loaded dictionaries "
+ "reaches the maximum allowed size then a new dictionary "
+ "will be loaded into local memory of a backend."),
+ GUC_UNIT_KB,
+ },
+ &max_shared_dictionaries_size,
+ 100 * 1024, 0, MAX_KILOBYTES,
+ NULL, NULL, NULL
+ },
+
/* End-of-list marker */
{
{NULL, 0, 0, NULL, NULL}, NULL, 0, 0, 0, NULL, NULL, NULL
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 39272925fb..eb3e348b48 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -133,6 +133,7 @@
# mmap
# use none to disable dynamic shared memory
# (change requires restart)
+#max_shared_dictionaries_size = 100MB # (change requires restart)
# - Disk -
diff --git a/src/include/storage/lwlock.h b/src/include/storage/lwlock.h
index c21bfe2f66..16b0858eda 100644
--- a/src/include/storage/lwlock.h
+++ b/src/include/storage/lwlock.h
@@ -219,6 +219,8 @@ typedef enum BuiltinTrancheIds
LWTRANCHE_SHARED_TUPLESTORE,
LWTRANCHE_TBM,
LWTRANCHE_PARALLEL_APPEND,
+ LWTRANCHE_TSEARCH_DSA,
+ LWTRANCHE_TSEARCH_TABLE,
LWTRANCHE_FIRST_USER_DEFINED
} BuiltinTrancheIds;
diff --git a/src/include/tsearch/ts_shared.h b/src/include/tsearch/ts_shared.h
new file mode 100644
index 0000000000..d6a27c9037
--- /dev/null
+++ b/src/include/tsearch/ts_shared.h
@@ -0,0 +1,33 @@
+/*-------------------------------------------------------------------------
+ *
+ * ts_shared.h
+ * tsearch shared memory management
+ *
+ * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
+ *
+ * src/include/tsearch/ts_shared.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef TS_SHARED_H
+#define TS_SHARED_H
+
+#include "c.h"
+
+#include "nodes/pg_list.h"
+
+/*
+ * GUC variable for maximum total size of shared dictionaries
+ */
+extern int max_shared_dictionaries_size;
+
+typedef void *(*ispell_build_callback) (List *dictoptions, Size *size);
+
+extern void *ts_dict_shmem_location(Oid dictid, List *dictoptions,
+ ispell_build_callback allocate_cb);
+extern void ts_dict_shmem_release(Oid dictid);
+
+extern void TsearchShmemInit(void);
+extern Size TsearchShmemSize(void);
+
+#endif /* TS_SHARED_H */
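
A note for readers of the header above: ts_dict_shmem_location() takes a build callback that returns a buffer and its size. What follows is a minimal, self-contained sketch of that lookup-or-build pattern, not the patch's implementation. It assumes a plain static array instead of the DSA/dshash table, malloc() instead of palloc(), and keeps the built pointer rather than copying it into DSM. The names demo_location, demo_build, DemoEntry and DEMO_MAX_BYTES are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the ispell_build_callback typedef. */
typedef void *(*demo_build_callback) (const char *options, size_t *size);

typedef struct DemoEntry
{
	unsigned int id;		/* dictionary identifier (stand-in for Oid) */
	void	   *data;		/* cached compiled dictionary */
	size_t		size;
} DemoEntry;

#define DEMO_MAX_ENTRIES 8
#define DEMO_MAX_BYTES   64	/* stand-in for max_shared_dictionaries_size */

static DemoEntry demo_table[DEMO_MAX_ENTRIES];
static int	demo_nentries = 0;
static size_t demo_used = 0;
static int	demo_build_calls = 0;

/* Example build callback: "compiles" the options string into a buffer. */
static void *
demo_build(const char *options, size_t *size)
{
	char	   *buf;

	demo_build_calls++;
	*size = strlen(options) + 1;
	buf = malloc(*size);
	if (buf != NULL)
		memcpy(buf, options, *size);
	return buf;
}

/*
 * Return the cached dictionary for id, building it on first use.  If the
 * size budget would be exceeded, the freshly built private buffer is
 * returned and nothing is cached.
 */
static void *
demo_location(unsigned int id, const char *options, demo_build_callback cb)
{
	void	   *built;
	size_t		size;
	int			i;

	for (i = 0; i < demo_nentries; i++)
		if (demo_table[i].id == id)
			return demo_table[i].data;

	built = cb(options, &size);
	if (built == NULL)
		return NULL;

	if (demo_nentries >= DEMO_MAX_ENTRIES || demo_used + size > DEMO_MAX_BYTES)
		return built;		/* fall back to backend-local memory */

	demo_table[demo_nentries].id = id;
	demo_table[demo_nentries].data = built;
	demo_table[demo_nentries].size = size;
	demo_nentries++;
	demo_used += size;
	return built;
}
```

The second call with the same identifier returns the cached buffer without invoking the callback again, which is the behaviour the GUC description implies for dictionaries that fit into the shared budget.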
0003-Store-ispell-structures-in-shmem-v6.patch (text/plain; charset=us-ascii)
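
The patch that follows replaces pointers between prefix-tree nodes with offsets into one contiguous buffer, so that a compiled dictionary can be copied into shared memory that may be mapped at different addresses in different backends. Here is a minimal sketch of that idea, assuming a much simplified node layout. DemoArray, DemoNode, demo_alloc_node, demo_node and DEMO_INVALID_OFFSET are hypothetical names and are not part of the patch.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_INVALID_OFFSET UINT32_MAX

typedef struct DemoNode
{
	char		val;		/* character stored in this node */
	uint32_t	child;		/* offset of the child node, not a pointer */
} DemoNode;

typedef struct DemoArray
{
	char	   *nodes;		/* contiguous buffer holding all nodes */
	uint32_t	size;		/* allocated bytes */
	uint32_t	end;		/* used bytes */
} DemoArray;

/*
 * Reserve space for one node, growing the buffer if needed.  Returns the
 * node's offset, or DEMO_INVALID_OFFSET if the allocation fails.
 */
static uint32_t
demo_alloc_node(DemoArray *array, char val)
{
	uint32_t	offset;
	DemoNode   *node;

	if (array->end + sizeof(DemoNode) > array->size)
	{
		uint32_t	newsize = array->size ? array->size * 2
										  : 4 * (uint32_t) sizeof(DemoNode);
		char	   *newbuf = realloc(array->nodes, newsize);

		if (newbuf == NULL)
			return DEMO_INVALID_OFFSET;
		array->nodes = newbuf;
		array->size = newsize;
	}

	offset = array->end;
	array->end += (uint32_t) sizeof(DemoNode);

	node = (DemoNode *) (array->nodes + offset);
	node->val = val;
	node->child = DEMO_INVALID_OFFSET;
	return offset;
}

/* Resolve an offset against whatever base address the buffer has now. */
static DemoNode *
demo_node(char *base, uint32_t offset)
{
	return (DemoNode *) (base + offset);
}
```

Because links are offsets, the buffer may move when it grows, and a memcpy() of its used part yields a valid tree at the new address. This appears to be the property NICopyData() relies on when it copies the node arrays.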
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index 610b7bf033..82afe201f8 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -3030,6 +3030,23 @@ CREATE TEXT SEARCH DICTIONARY english_stem (
</sect2>
+ <sect2 id="textsearch-shared-dictionaries">
+ <title>Dictionaries in Shared Memory</title>
+
+ <para>
+ Some dictionaries, especially <application>Ispell</application>, consume a
+ noticeable amount of memory. The size of a dictionary can reach tens of megabytes.
+ Most of them also store their configuration in text files. A dictionary is compiled
+ during the first access in each user session.
+ </para>
+
+ <para>
+ To store dictionaries in shared memory, set the <xref linkend="guc-max-shared-dictionaries-size"/>
+ parameter to a value greater than zero before starting the server.
+ </para>
+
+ </sect2>
+
</sect1>
<sect1 id="textsearch-configuration">
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index edc6547700..e7f4d5a48d 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -5,6 +5,15 @@
*
* Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
*
+ * By default all Ispell dictionaries are stored in DSM. But if the total size
+ * of loaded dictionaries reaches the maximum allowed value, a new dictionary
+ * is allocated within its own memory context (dictCtx).
+ *
+ * All necessary data is built within the dispell_build() function. But
+ * structures for regular expressions are compiled on first demand and
+ * stored in the AffixReg array, because regex_t and Regis cannot be
+ * stored in shared memory.
+ *
*
* IDENTIFICATION
* src/backend/tsearch/dict_ispell.c
@@ -14,8 +23,10 @@
#include "postgres.h"
#include "commands/defrem.h"
+#include "storage/dsm.h"
#include "tsearch/dicts/spell.h"
#include "tsearch/ts_locale.h"
+#include "tsearch/ts_shared.h"
#include "tsearch/ts_utils.h"
#include "utils/builtins.h"
@@ -26,19 +37,90 @@ typedef struct
IspellDict obj;
} DictISpell;
+static void parse_dictoptions(List *dictoptions,
+ char **dictfile, char **afffile, char **stopfile);
+static void *dispell_build(List *dictoptions, Size *size);
+
Datum
dispell_init(PG_FUNCTION_ARGS)
{
List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ Oid dictid = PG_GETARG_OID(1);
DictISpell *d;
- bool affloaded = false,
- dictloaded = false,
- stoploaded = false;
- ListCell *l;
+ void *dict_location;
+ char *stopfile;
d = (DictISpell *) palloc0(sizeof(DictISpell));
- NIStartBuild(&(d->obj));
+ parse_dictoptions(dictoptions, NULL, NULL, &stopfile);
+
+ if (stopfile)
+ readstoplist(stopfile, &(d->stoplist), lowerstr);
+
+ dict_location = ts_dict_shmem_location(dictid, dictoptions, dispell_build);
+ Assert(dict_location);
+
+ d->obj.dict = (IspellDictData *) dict_location;
+ d->obj.reg = (AffixReg *) palloc0(d->obj.dict->nAffix *
+ sizeof(AffixReg));
+ /* Current memory context is dictionary's private memory context */
+ d->obj.dictCtx = CurrentMemoryContext;
+
+ PG_RETURN_POINTER(d);
+}
+
+Datum
+dispell_lexize(PG_FUNCTION_ARGS)
+{
+ DictISpell *d = (DictISpell *) PG_GETARG_POINTER(0);
+ char *in = (char *) PG_GETARG_POINTER(1);
+ int32 len = PG_GETARG_INT32(2);
+ char *txt;
+ TSLexeme *res;
+ TSLexeme *ptr,
+ *cptr;
+
+ if (len <= 0)
+ PG_RETURN_POINTER(NULL);
+
+ txt = lowerstr_with_len(in, len);
+ res = NINormalizeWord(&(d->obj), txt);
+
+ if (res == NULL)
+ PG_RETURN_POINTER(NULL);
+
+ cptr = res;
+ for (ptr = cptr; ptr->lexeme; ptr++)
+ {
+ if (searchstoplist(&(d->stoplist), ptr->lexeme))
+ {
+ pfree(ptr->lexeme);
+ ptr->lexeme = NULL;
+ }
+ else
+ {
+ if (cptr != ptr)
+ memcpy(cptr, ptr, sizeof(TSLexeme));
+ cptr++;
+ }
+ }
+ cptr->lexeme = NULL;
+
+ PG_RETURN_POINTER(res);
+}
+
+static void
+parse_dictoptions(List *dictoptions, char **dictfile, char **afffile,
+ char **stopfile)
+{
+ ListCell *l;
+
+ if (dictfile)
+ *dictfile = NULL;
+ if (afffile)
+ *afffile = NULL;
+ if (stopfile)
+ *stopfile = NULL;
foreach(l, dictoptions)
{
@@ -46,34 +128,36 @@ dispell_init(PG_FUNCTION_ARGS)
if (strcmp(defel->defname, "dictfile") == 0)
{
- if (dictloaded)
+ if (!dictfile)
+ continue;
+
+ if (*dictfile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple DictFile parameters")));
- NIImportDictionary(&(d->obj),
- get_tsearch_config_filename(defGetString(defel),
- "dict"));
- dictloaded = true;
+ *dictfile = get_tsearch_config_filename(defGetString(defel), "dict");
}
else if (strcmp(defel->defname, "afffile") == 0)
{
- if (affloaded)
+ if (!afffile)
+ continue;
+
+ if (*afffile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple AffFile parameters")));
- NIImportAffixes(&(d->obj),
- get_tsearch_config_filename(defGetString(defel),
- "affix"));
- affloaded = true;
+ *afffile = get_tsearch_config_filename(defGetString(defel), "affix");
}
else if (strcmp(defel->defname, "stopwords") == 0)
{
- if (stoploaded)
+ if (!stopfile)
+ continue;
+
+ if (*stopfile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple StopWords parameters")));
- readstoplist(defGetString(defel), &(d->stoplist), lowerstr);
- stoploaded = true;
+ *stopfile = defGetString(defel);
}
else
{
@@ -83,66 +167,52 @@ dispell_init(PG_FUNCTION_ARGS)
defel->defname)));
}
}
+}
- if (affloaded && dictloaded)
- {
- NISortDictionary(&(d->obj));
- NISortAffixes(&(d->obj));
- }
- else if (!affloaded)
+/*
+ * Build the dictionary.
+ *
+ * Result is palloc'ed.
+ */
+static void *
+dispell_build(List *dictoptions, Size *size)
+{
+ IspellDictBuild build;
+ char *dictfile,
+ *afffile;
+
+ parse_dictoptions(dictoptions, &dictfile, &afffile, NULL);
+
+ if (!afffile)
{
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("missing AffFile parameter")));
}
- else
+ else if (!dictfile)
{
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("missing DictFile parameter")));
}
- NIFinishBuild(&(d->obj));
+ MemSet(&build, 0, sizeof(build));
+ NIStartBuild(&build);
- PG_RETURN_POINTER(d);
-}
+ /* Read files */
+ NIImportDictionary(&build, dictfile);
+ NIImportAffixes(&build, afffile);
-Datum
-dispell_lexize(PG_FUNCTION_ARGS)
-{
- DictISpell *d = (DictISpell *) PG_GETARG_POINTER(0);
- char *in = (char *) PG_GETARG_POINTER(1);
- int32 len = PG_GETARG_INT32(2);
- char *txt;
- TSLexeme *res;
- TSLexeme *ptr,
- *cptr;
+ /* Build persistent data to use by backends */
+ NISortDictionary(&build);
+ NISortAffixes(&build);
- if (len <= 0)
- PG_RETURN_POINTER(NULL);
+ NICopyData(&build);
- txt = lowerstr_with_len(in, len);
- res = NINormalizeWord(&(d->obj), txt);
-
- if (res == NULL)
- PG_RETURN_POINTER(NULL);
+ /* Release temporary data */
+ NIFinishBuild(&build);
- cptr = res;
- for (ptr = cptr; ptr->lexeme; ptr++)
- {
- if (searchstoplist(&(d->stoplist), ptr->lexeme))
- {
- pfree(ptr->lexeme);
- ptr->lexeme = NULL;
- }
- else
- {
- if (cptr != ptr)
- memcpy(cptr, ptr, sizeof(TSLexeme));
- cptr++;
- }
- }
- cptr->lexeme = NULL;
-
- PG_RETURN_POINTER(res);
+ /* Return the buffer and its size */
+ *size = build.dict_size;
+ return build.dict;
}
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index e071994523..1c560ef56a 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -23,33 +23,35 @@
* Compilation of a dictionary
* ---------------------------
*
- * A compiled dictionary is stored in the IspellDict structure. Compilation of
- * a dictionary is divided into the several steps:
+ * A compiled dictionary is stored in the following structures:
+ * - IspellDictBuild - stores temporary data and IspellDictData
+ * - IspellDictData - stores permanent data used within NINormalizeWord()
+ * Compilation of the dictionary is divided into several steps:
* - NIImportDictionary() - stores each word of a .dict file in the
* temporary Spell field.
- * - NIImportAffixes() - stores affix rules of an .affix file in the
- * Affix field (not temporary) if an .affix file has the Ispell format.
+ * - NIImportAffixes() - stores affix rules of an .affix file in the temporary
+ * Affix field if an .affix file has the Ispell format.
* -> NIImportOOAffixes() - stores affix rules if an .affix file has the
* Hunspell format. The AffixData field is initialized if AF parameter
* is defined.
* - NISortDictionary() - builds a prefix tree (Trie) from the words list
- * and stores it in the Dictionary field. The words list is got from the
+ * and stores it in the DictNodes field. The words list is got from the
* Spell field. The AffixData field is initialized if AF parameter is not
* defined.
* - NISortAffixes():
* - builds a list of compound affixes from the affix list and stores it
* in the CompoundAffix.
* - builds prefix trees (Trie) from the affix list for prefixes and suffixes
- * and stores them in Suffix and Prefix fields.
+ * and stores them in SuffixNodes and PrefixNodes fields.
* The affix list is got from the Affix field.
+ * Persistent data of the dictionary is copied within NICopyData().
*
* Memory management
* -----------------
*
- * The IspellDict structure has the Spell field which is used only in compile
- * time. The Spell field stores a words list. It can take a lot of memory.
- * Therefore when a dictionary is compiled this field is cleared by
- * NIFinishBuild().
+ * The IspellDictBuild structure holds temporary data which is used only at
+ * compile time. It can take a lot of memory. Therefore after compiling the
+ * dictionary this data is released by NIFinishBuild().
*
* All resources which should cleared by NIFinishBuild() is initialized using
* tmpalloc() and tmpalloc0().
@@ -73,112 +75,147 @@
* after the initialization is done. During initialization,
* CurrentMemoryContext is the long-lived memory context associated
* with the dictionary cache entry. We keep the short-lived stuff
- * in the Conf->buildCxt context.
+ * in the ConfBuild->buildCxt context.
*/
-#define tmpalloc(sz) MemoryContextAlloc(Conf->buildCxt, (sz))
-#define tmpalloc0(sz) MemoryContextAllocZero(Conf->buildCxt, (sz))
+#define tmpalloc(sz) MemoryContextAlloc(ConfBuild->buildCxt, (sz))
+#define tmpalloc0(sz) MemoryContextAllocZero(ConfBuild->buildCxt, (sz))
-#define tmpstrdup(str) MemoryContextStrdup(Conf->buildCxt, (str))
+#define tmpstrdup(str) MemoryContextStrdup(ConfBuild->buildCxt, (str))
/*
* Prepare for constructing an ISpell dictionary.
*
- * The IspellDict struct is assumed to be zeroed when allocated.
+ * The IspellDictBuild struct is assumed to be zeroed when allocated.
*/
void
-NIStartBuild(IspellDict *Conf)
+NIStartBuild(IspellDictBuild *ConfBuild)
{
+ uint32 dict_size;
+
/*
* The temp context is a child of CurTransactionContext, so that it will
* go away automatically on error.
*/
- Conf->buildCxt = AllocSetContextCreate(CurTransactionContext,
- "Ispell dictionary init context",
- ALLOCSET_DEFAULT_SIZES);
+ ConfBuild->buildCxt = AllocSetContextCreate(CurTransactionContext,
+ "Ispell dictionary init context",
+ ALLOCSET_DEFAULT_SIZES);
+
+ /*
+ * Allocate the buffer for the dictionary in the current context, not in buildCxt.
+ */
+ dict_size = MAXALIGN(IspellDictDataHdrSize);
+ ConfBuild->dict = palloc0(dict_size);
+ ConfBuild->dict_size = dict_size;
}
/*
- * Clean up when dictionary construction is complete.
+ * Copy compiled and persistent data into IspellDictData.
*/
void
-NIFinishBuild(IspellDict *Conf)
+NICopyData(IspellDictBuild *ConfBuild)
{
- /* Release no-longer-needed temp memory */
- MemoryContextDelete(Conf->buildCxt);
- /* Just for cleanliness, zero the now-dangling pointers */
- Conf->buildCxt = NULL;
- Conf->Spell = NULL;
- Conf->firstfree = NULL;
- Conf->CompoundAffixFlags = NULL;
-}
+ IspellDictData *dict;
+ uint32 size;
+ int i;
+ uint32 *offsets,
+ offset;
+ /*
+ * Calculate necessary space
+ */
+ size = ConfBuild->nAffixData * sizeof(uint32);
+ size += ConfBuild->AffixDataEnd;
-/*
- * "Compact" palloc: allocate without extra palloc overhead.
- *
- * Since we have no need to free the ispell data items individually, there's
- * not much value in the per-chunk overhead normally consumed by palloc.
- * Getting rid of it is helpful since ispell can allocate a lot of small nodes.
- *
- * We currently pre-zero all data allocated this way, even though some of it
- * doesn't need that. The cpalloc and cpalloc0 macros are just documentation
- * to indicate which allocations actually require zeroing.
- */
-#define COMPACT_ALLOC_CHUNK 8192 /* amount to get from palloc at once */
-#define COMPACT_MAX_REQ 1024 /* must be < COMPACT_ALLOC_CHUNK */
+ size += ConfBuild->nAffix * sizeof(uint32);
+ size += ConfBuild->AffixSize;
-static void *
-compact_palloc0(IspellDict *Conf, size_t size)
-{
- void *result;
+ size += ConfBuild->DictNodes.NodesEnd;
+ size += ConfBuild->PrefixNodes.NodesEnd;
+ size += ConfBuild->SuffixNodes.NodesEnd;
- /* Should only be called during init */
- Assert(Conf->buildCxt != NULL);
+ size += sizeof(CMPDAffix) * ConfBuild->nCompoundAffix;
- /* No point in this for large chunks */
- if (size > COMPACT_MAX_REQ)
- return palloc0(size);
+ /*
+ * Copy data itself
+ */
+ ConfBuild->dict_size = IspellDictDataHdrSize + size;
+ ConfBuild->dict = repalloc(ConfBuild->dict, ConfBuild->dict_size);
+
+ dict = ConfBuild->dict;
+
+ /* AffixData */
+ dict->nAffixData = ConfBuild->nAffixData;
+ dict->AffixDataStart = sizeof(uint32) * ConfBuild->nAffixData;
+ memcpy(DictAffixDataOffset(dict), ConfBuild->AffixDataOffset,
+ sizeof(uint32) * ConfBuild->nAffixData);
+ memcpy(DictAffixData(dict), ConfBuild->AffixData, ConfBuild->AffixDataEnd);
+
+ /* Affix array */
+ dict->nAffix = ConfBuild->nAffix;
+ dict->AffixOffsetStart = dict->AffixDataStart + ConfBuild->AffixDataEnd;
+ dict->AffixStart = dict->AffixOffsetStart + sizeof(uint32) * ConfBuild->nAffix;
+ offsets = (uint32 *) DictAffixOffset(dict);
+ offset = 0;
+ for (i = 0; i < ConfBuild->nAffix; i++)
+ {
+ AFFIX *affix;
+ uint32 size = AffixGetSize(ConfBuild->Affix[i]);
- /* Keep everything maxaligned */
- size = MAXALIGN(size);
+ offsets[i] = offset;
+ affix = (AFFIX *) DictAffixGet(dict, i);
+ Assert(affix);
- /* Need more space? */
- if (size > Conf->avail)
- {
- Conf->firstfree = palloc0(COMPACT_ALLOC_CHUNK);
- Conf->avail = COMPACT_ALLOC_CHUNK;
- }
+ memcpy(affix, ConfBuild->Affix[i], size);
- result = (void *) Conf->firstfree;
- Conf->firstfree += size;
- Conf->avail -= size;
+ offset += size;
+ }
- return result;
+ /* DictNodes prefix tree */
+ dict->DictNodesStart = dict->AffixStart + offset;
+ memcpy(DictDictNodes(dict), ConfBuild->DictNodes.Nodes,
+ ConfBuild->DictNodes.NodesEnd);
+
+ /* PrefixNodes prefix tree */
+ dict->PrefixNodesStart = dict->DictNodesStart + ConfBuild->DictNodes.NodesEnd;
+ memcpy(DictPrefixNodes(dict), ConfBuild->PrefixNodes.Nodes,
+ ConfBuild->PrefixNodes.NodesEnd);
+
+ /* SuffixNodes prefix tree */
+ dict->SuffixNodesStart = dict->PrefixNodesStart + ConfBuild->PrefixNodes.NodesEnd;
+ memcpy(DictSuffixNodes(dict), ConfBuild->SuffixNodes.Nodes,
+ ConfBuild->SuffixNodes.NodesEnd);
+
+ /* CompoundAffix array */
+ dict->CompoundAffixStart = dict->SuffixNodesStart +
+ ConfBuild->SuffixNodes.NodesEnd;
+ memcpy(DictCompoundAffix(dict), ConfBuild->CompoundAffix,
+ sizeof(CMPDAffix) * ConfBuild->nCompoundAffix);
}
-#define cpalloc(size) compact_palloc0(Conf, size)
-#define cpalloc0(size) compact_palloc0(Conf, size)
-
-static char *
-cpstrdup(IspellDict *Conf, const char *str)
+/*
+ * Clean up when dictionary construction is complete.
+ */
+void
+NIFinishBuild(IspellDictBuild *ConfBuild)
{
- char *res = cpalloc(strlen(str) + 1);
-
- strcpy(res, str);
- return res;
+ /* Release no-longer-needed temp memory */
+ MemoryContextDelete(ConfBuild->buildCxt);
+ /* Just for cleanliness, zero the now-dangling pointers */
+ ConfBuild->buildCxt = NULL;
+ ConfBuild->Spell = NULL;
+ ConfBuild->CompoundAffixFlags = NULL;
}
-
/*
* Apply lowerstr(), producing a temporary result (in the buildCxt).
*/
static char *
-lowerstr_ctx(IspellDict *Conf, const char *src)
+lowerstr_ctx(IspellDictBuild *ConfBuild, const char *src)
{
MemoryContext saveCtx;
char *dst;
- saveCtx = MemoryContextSwitchTo(Conf->buildCxt);
+ saveCtx = MemoryContextSwitchTo(ConfBuild->buildCxt);
dst = lowerstr(src);
MemoryContextSwitchTo(saveCtx);
@@ -190,7 +227,7 @@ lowerstr_ctx(IspellDict *Conf, const char *src)
#define STRNCMP(s,p) strncmp( (s), (p), strlen(p) )
#define GETWCHAR(W,L,N,T) ( ((const uint8*)(W))[ ((T)==FF_PREFIX) ? (N) : ( (L) - 1 - (N) ) ] )
-#define GETCHAR(A,N,T) GETWCHAR( (A)->repl, (A)->replen, N, T )
+#define GETCHAR(A,N,T) GETWCHAR( AffixFieldRepl(A), (A)->replen, N, T )
static char *VoidString = "";
@@ -311,18 +348,189 @@ strbncmp(const unsigned char *s1, const unsigned char *s2, size_t count)
static int
cmpaffix(const void *s1, const void *s2)
{
- const AFFIX *a1 = (const AFFIX *) s1;
- const AFFIX *a2 = (const AFFIX *) s2;
+ const AFFIX *a1 = *((AFFIX *const *) s1);
+ const AFFIX *a2 = *((AFFIX *const *) s2);
if (a1->type < a2->type)
return -1;
if (a1->type > a2->type)
return 1;
if (a1->type == FF_PREFIX)
- return strcmp(a1->repl, a2->repl);
+ return strcmp(AffixFieldRepl(a1), AffixFieldRepl(a2));
else
- return strbcmp((const unsigned char *) a1->repl,
- (const unsigned char *) a2->repl);
+ return strbcmp((const unsigned char *) AffixFieldRepl(a1),
+ (const unsigned char *) AffixFieldRepl(a2));
+}
+
+/*
+ * Allocate space for AffixData.
+ */
+static void
+InitAffixData(IspellDictBuild *ConfBuild, int numAffixData)
+{
+ uint32 size;
+
+ size = 8 * 1024; /* Reserve 8KB for data */
+
+ ConfBuild->AffixData = (char *) tmpalloc(size);
+ ConfBuild->AffixDataSize = size;
+ ConfBuild->AffixDataOffset = (uint32 *) tmpalloc(numAffixData * sizeof(uint32));
+ ConfBuild->nAffixData = 0;
+ ConfBuild->mAffixData = numAffixData;
+
+ /* Save offset of the end of data */
+ ConfBuild->AffixDataEnd = 0;
+}
+
+/*
+ * Add a set of affix flags into the IspellDictBuild struct. If the AffixData
+ * buffer cannot fit the new affix set then resize it.
+ *
+ * ConfBuild: building structure for the current dictionary.
+ * AffixSet: set of affix flags.
+ */
+static void
+AddAffixSet(IspellDictBuild *ConfBuild, const char *AffixSet,
+ uint32 AffixSetLen)
+{
+ /*
+ * Check available space for AffixSet.
+ */
+ if (ConfBuild->AffixDataEnd + AffixSetLen + 1 /* \0 */ >=
+ ConfBuild->AffixDataSize)
+ {
+ uint32 newsize = Max(ConfBuild->AffixDataSize + 8 * 1024 /* 8KB */,
+ ConfBuild->AffixDataSize + AffixSetLen + 1);
+
+ ConfBuild->AffixData = (char *) repalloc(ConfBuild->AffixData, newsize);
+ ConfBuild->AffixDataSize = newsize;
+ }
+
+ /* Check available number of offsets */
+ if (ConfBuild->nAffixData >= ConfBuild->mAffixData)
+ {
+ ConfBuild->mAffixData *= 2;
+ ConfBuild->AffixDataOffset = (uint32 *) repalloc(ConfBuild->AffixDataOffset,
+ sizeof(uint32) * ConfBuild->mAffixData);
+ }
+
+ ConfBuild->AffixDataOffset[ConfBuild->nAffixData] = ConfBuild->AffixDataEnd;
+ StrNCpy(AffixDataGet(ConfBuild, ConfBuild->nAffixData),
+ AffixSet, AffixSetLen + 1);
+
+ /* Save offset of the end of data */
+ ConfBuild->AffixDataEnd += AffixSetLen + 1;
+ ConfBuild->nAffixData++;
+}
+
+/*
+ * Allocate space for prefix tree node.
+ *
+ * ConfBuild: building structure for the current dictionary.
+ * array: NodeArray where to allocate new node.
+ * length: number of allocated NodeData.
+ * sizeNodeData: minimum size of each NodeData.
+ * sizeNodeHeader: size of header of new node.
+ *
+ * Returns an offset of new node in NodeArray->Nodes.
+ */
+static uint32
+AllocateNode(IspellDictBuild *ConfBuild, NodeArray *array, uint32 length,
+ uint32 sizeNodeData, uint32 sizeNodeHeader)
+{
+ uint32 node_offset;
+ uint32 size;
+
+ size = sizeNodeHeader + length * sizeNodeData;
+ size = MAXALIGN(size);
+
+ if (array->NodesSize == 0)
+ {
+ array->NodesSize = size * 32; /* Reserve space for next levels of the
+ * prefix tree */
+ array->Nodes = (char *) tmpalloc(array->NodesSize);
+ array->NodesEnd = 0;
+ }
+ else if (array->NodesEnd + size >= array->NodesSize)
+ {
+ array->NodesSize = Max(array->NodesSize * 2, array->NodesSize + size);
+ array->Nodes = (char *) repalloc(array->Nodes, array->NodesSize);
+ }
+
+ node_offset = array->NodesEnd;
+ array->NodesEnd += size;
+
+ return node_offset;
+}
+
+/*
+ * Allocate space for SPNode.
+ *
+ * Returns an offset of new node in ConfBuild->DictNodes.Nodes.
+ */
+static uint32
+AllocateSPNode(IspellDictBuild *ConfBuild, uint32 length)
+{
+ uint32 offset;
+ SPNode *node;
+ SPNodeData *data;
+ uint32 i;
+
+ offset = AllocateNode(ConfBuild, &ConfBuild->DictNodes, length,
+ sizeof(SPNodeData), SPNHDRSZ);
+ node = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, offset);
+ node->length = length;
+
+ /*
+ * Initialize all SPNodeData with default values. We cannot use memset()
+ * here because not all fields have 0 as default value.
+ */
+ for (i = 0; i < length; i++)
+ {
+ data = &(node->data[i]);
+ data->val = 0;
+ data->affix = ISPELL_INVALID_INDEX;
+ data->compoundflag = 0;
+ data->isword = 0;
+ data->node_offset = ISPELL_INVALID_OFFSET;
+ }
+
+ return offset;
+}
+
+/*
+ * Allocate space for AffixNode.
+ *
+ * Returns an offset of new node in NodeArray->Nodes.
+ */
+static uint32
+AllocateAffixNode(IspellDictBuild *ConfBuild, NodeArray *array, uint32 length)
+{
+ uint32 offset;
+ AffixNode *node;
+ AffixNodeData *data;
+ uint32 i;
+
+ offset = AllocateNode(ConfBuild, array, length, sizeof(AffixNodeData),
+ ANHRDSZ);
+ node = (AffixNode *) NodeArrayGet(array, offset);
+ node->length = length;
+ node->isvoid = 0;
+
+ /*
+ * Initialize all AffixNodeData with default values. We cannot use memset()
+ * here because not all fields have 0 as default value.
+ */
+ for (i = 0; i < length; i++)
+ {
+ data = &(node->data[i]);
+ data->val = 0;
+ data->affstart = ISPELL_INVALID_INDEX;
+ data->affend = ISPELL_INVALID_INDEX;
+ data->node_offset = ISPELL_INVALID_OFFSET;
+ }
+
+ return offset;
}
/*
@@ -333,7 +541,7 @@ cmpaffix(const void *s1, const void *s2)
* - 2 characters (FM_LONG). A character may be Unicode.
* - numbers from 1 to 65000 (FM_NUM).
*
- * Depending on the flagMode an affix string can have the following format:
+ * Depending on the flagmode an affix string can have the following format:
* - FM_CHAR: ABCD
* Here we have 4 flags: A, B, C and D
* - FM_LONG: ABCDE*
@@ -341,13 +549,13 @@ cmpaffix(const void *s1, const void *s2)
* - FM_NUM: 200,205,50
* Here we have 3 flags: 200, 205 and 50
*
- * Conf: current dictionary.
+ * flagmode: flag mode of the dictionary
* sflagset: the set of affix flags. Returns a reference to the start of a next
* affix flag.
* sflag: returns an affix flag from sflagset.
*/
static void
-getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
+getNextFlagFromString(FlagMode flagmode, char **sflagset, char *sflag)
{
int32 s;
char *next,
@@ -356,11 +564,11 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
bool stop = false;
bool met_comma = false;
- maxstep = (Conf->flagMode == FM_LONG) ? 2 : 1;
+ maxstep = (flagmode == FM_LONG) ? 2 : 1;
while (**sflagset)
{
- switch (Conf->flagMode)
+ switch (flagmode)
{
case FM_LONG:
case FM_CHAR:
@@ -422,15 +630,15 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
stop = true;
break;
default:
- elog(ERROR, "unrecognized type of Conf->flagMode: %d",
- Conf->flagMode);
+ elog(ERROR, "unrecognized type of flagmode: %d",
+ flagmode);
}
if (stop)
break;
}
- if (Conf->flagMode == FM_LONG && maxstep > 0)
+ if (flagmode == FM_LONG && maxstep > 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix flag \"%s\" with \"long\" flag value",
@@ -440,31 +648,28 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
}
/*
- * Checks if the affix set Conf->AffixData[affix] contains affixflag.
- * Conf->AffixData[affix] does not contain affixflag if this flag is not used
- * actually by the .dict file.
+ * Checks if the affix set from AffixData contains affixflag. Affix set does
+ * not contain affixflag if this flag is not used actually by the .dict file.
*
- * Conf: current dictionary.
- * affix: index of the Conf->AffixData array.
+ * flagmode: flag mode of the dictionary.
+ * sflagset: the set of affix flags.
* affixflag: the affix flag.
*
- * Returns true if the string Conf->AffixData[affix] contains affixflag,
- * otherwise returns false.
+ * Returns true if the affix set string contains affixflag, otherwise returns
+ * false.
*/
static bool
-IsAffixFlagInUse(IspellDict *Conf, int affix, const char *affixflag)
+IsAffixFlagInUse(FlagMode flagmode, char *sflagset, const char *affixflag)
{
- char *flagcur;
+ char *flagcur = sflagset;
char flag[BUFSIZ];
if (*affixflag == 0)
return true;
- flagcur = Conf->AffixData[affix];
-
while (*flagcur)
{
- getNextFlagFromString(Conf, &flagcur, flag);
+ getNextFlagFromString(flagmode, &flagcur, flag);
/* Compare first affix flag in flagcur with affixflag */
if (strcmp(flag, affixflag) == 0)
return true;
@@ -477,31 +682,33 @@ IsAffixFlagInUse(IspellDict *Conf, int affix, const char *affixflag)
/*
* Adds the new word into the temporary array Spell.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* word: new word.
* flag: set of affix flags. Single flag can be get by getNextFlagFromString().
*/
static void
-NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
+NIAddSpell(IspellDictBuild *ConfBuild, const char *word, const char *flag)
{
- if (Conf->nspell >= Conf->mspell)
+ if (ConfBuild->nSpell >= ConfBuild->mSpell)
{
- if (Conf->mspell)
+ if (ConfBuild->mSpell)
{
- Conf->mspell *= 2;
- Conf->Spell = (SPELL **) repalloc(Conf->Spell, Conf->mspell * sizeof(SPELL *));
+ ConfBuild->mSpell *= 2;
+ ConfBuild->Spell = (SPELL **) repalloc(ConfBuild->Spell,
+ ConfBuild->mSpell * sizeof(SPELL *));
}
else
{
- Conf->mspell = 1024 * 20;
- Conf->Spell = (SPELL **) tmpalloc(Conf->mspell * sizeof(SPELL *));
+ ConfBuild->mSpell = 1024 * 20;
+ ConfBuild->Spell = (SPELL **) tmpalloc(ConfBuild->mSpell * sizeof(SPELL *));
}
}
- Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
- strcpy(Conf->Spell[Conf->nspell]->word, word);
- Conf->Spell[Conf->nspell]->p.flag = (*flag != '\0')
+ ConfBuild->Spell[ConfBuild->nSpell] =
+ (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
+ strcpy(ConfBuild->Spell[ConfBuild->nSpell]->word, word);
+ ConfBuild->Spell[ConfBuild->nSpell]->p.flag = (*flag != '\0')
? tmpstrdup(flag) : VoidString;
- Conf->nspell++;
+ ConfBuild->nSpell++;
}
/*
@@ -509,11 +716,11 @@ NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
*
* Note caller must already have applied get_tsearch_config_filename.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* filename: path to the .dict file.
*/
void
-NIImportDictionary(IspellDict *Conf, const char *filename)
+NIImportDictionary(IspellDictBuild *ConfBuild, const char *filename)
{
tsearch_readline_state trst;
char *line;
@@ -564,9 +771,9 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
}
s += pg_mblen(s);
}
- pstr = lowerstr_ctx(Conf, line);
+ pstr = lowerstr_ctx(ConfBuild, line);
- NIAddSpell(Conf, pstr, flag);
+ NIAddSpell(ConfBuild, pstr, flag);
pfree(pstr);
pfree(line);
@@ -590,7 +797,7 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
* SFX M 0 's .
* is presented here.
*
- * Conf: current dictionary.
+ * dict: current dictionary.
* word: basic form of word.
* affixflag: affix flag, by which a basic form of word was generated.
* flag: compound flag used to compare with StopMiddle->compoundflag.
@@ -598,9 +805,9 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
* Returns 1 if the word was found in the prefix tree, else returns 0.
*/
static int
-FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
+FindWord(IspellDictData *dict, const char *word, const char *affixflag, int flag)
{
- SPNode *node = Conf->Dictionary;
+ SPNode *node = (SPNode *) DictDictNodes(dict);
SPNodeData *StopLow,
*StopHigh,
*StopMiddle;
@@ -636,10 +843,14 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
* Check if this affix rule is presented in the affix set
* with index StopMiddle->affix.
*/
- if (IsAffixFlagInUse(Conf, StopMiddle->affix, affixflag))
+ if (IsAffixFlagInUse(dict->flagMode,
+ DictAffixDataGet(dict, StopMiddle->affix),
+ affixflag))
return 1;
}
- node = StopMiddle->node;
+ /* Retrieve SPNode by the offset */
+ node = (SPNode *) DictNodeGet(DictDictNodes(dict),
+ StopMiddle->node_offset);
ptr++;
break;
}
@@ -657,7 +868,8 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
/*
* Adds a new affix rule to the Affix field.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary, is used to allocate
+ * temporary data.
* flag: affix flag ('\' in the below example).
* flagflags: set of flags from the flagval field for this affix rule. This set
* is listed after '/' character in the added string (repl).
@@ -673,26 +885,54 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
* type: FF_SUFFIX or FF_PREFIX.
*/
static void
-NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
- const char *find, const char *repl, int type)
+NIAddAffix(IspellDictBuild *ConfBuild, const char *flag, char flagflags,
+ const char *mask, const char *find, const char *repl, int type)
{
AFFIX *Affix;
+ uint32 size;
+ uint32 flaglen = strlen(flag),
+ findlen = strlen(find),
+ repllen = strlen(repl),
+ masklen = strlen(mask);
+
+ /* Sanity checks */
+ if (flaglen > AF_FLAG_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix flag \"%s\" too long", flag)));
+ if (findlen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix find field \"%s\" too long", find)));
+ if (repllen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix repl field \"%s\" too long", repl)));
+ if (masklen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix mask field \"%s\" too long", mask)));
- if (Conf->naffixes >= Conf->maffixes)
+ if (ConfBuild->nAffix >= ConfBuild->mAffix)
{
- if (Conf->maffixes)
+ if (ConfBuild->mAffix)
{
- Conf->maffixes *= 2;
- Conf->Affix = (AFFIX *) repalloc((void *) Conf->Affix, Conf->maffixes * sizeof(AFFIX));
+ ConfBuild->mAffix *= 2;
+ ConfBuild->Affix = (AFFIX **) repalloc(ConfBuild->Affix,
+ ConfBuild->mAffix * sizeof(AFFIX *));
}
else
{
- Conf->maffixes = 16;
- Conf->Affix = (AFFIX *) palloc(Conf->maffixes * sizeof(AFFIX));
+ ConfBuild->mAffix = 255;
+ ConfBuild->Affix = (AFFIX **) tmpalloc(ConfBuild->mAffix * sizeof(AFFIX *));
}
}
- Affix = Conf->Affix + Conf->naffixes;
+ size = AFFIXHDRSZ + flaglen + 1 /* \0 */ + findlen + 1 /* \0 */ +
+ repllen + 1 /* \0 */ + masklen + 1 /* \0 */;
+
+ Affix = (AFFIX *) tmpalloc(size);
+ ConfBuild->Affix[ConfBuild->nAffix] = Affix;
/* This affix rule can be applied for words with any ending */
if (strcmp(mask, ".") == 0 || *mask == '\0')
@@ -705,42 +945,12 @@ NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
{
Affix->issimple = 0;
Affix->isregis = 1;
- RS_compile(&(Affix->reg.regis), (type == FF_SUFFIX),
- *mask ? mask : VoidString);
}
/* This affix rule will use regex_t to search word ending */
else
{
- int masklen;
- int wmasklen;
- int err;
- pg_wchar *wmask;
- char *tmask;
-
Affix->issimple = 0;
Affix->isregis = 0;
- tmask = (char *) tmpalloc(strlen(mask) + 3);
- if (type == FF_SUFFIX)
- sprintf(tmask, "%s$", mask);
- else
- sprintf(tmask, "^%s", mask);
-
- masklen = strlen(tmask);
- wmask = (pg_wchar *) tmpalloc((masklen + 1) * sizeof(pg_wchar));
- wmasklen = pg_mb2wchar_with_len(tmask, wmask, masklen);
-
- err = pg_regcomp(&(Affix->reg.regex), wmask, wmasklen,
- REG_ADVANCED | REG_NOSUB,
- DEFAULT_COLLATION_OID);
- if (err)
- {
- char errstr[100];
-
- pg_regerror(err, &(Affix->reg.regex), errstr, sizeof(errstr));
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
- errmsg("invalid regular expression: %s", errstr)));
- }
}
Affix->flagflags = flagflags;
@@ -749,15 +959,22 @@ NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
if ((Affix->flagflags & FF_COMPOUNDFLAG) == 0)
Affix->flagflags |= FF_COMPOUNDFLAG;
}
- Affix->flag = cpstrdup(Conf, flag);
+
Affix->type = type;
- Affix->find = (find && *find) ? cpstrdup(Conf, find) : VoidString;
- if ((Affix->replen = strlen(repl)) > 0)
- Affix->repl = cpstrdup(Conf, repl);
- else
- Affix->repl = VoidString;
- Conf->naffixes++;
+ Affix->replen = repllen;
+ StrNCpy(AffixFieldRepl(Affix), repl, repllen + 1);
+
+ Affix->findlen = findlen;
+ StrNCpy(AffixFieldFind(Affix), find, findlen + 1);
+
+ Affix->masklen = masklen;
+ StrNCpy(AffixFieldMask(Affix), mask, masklen + 1);
+
+ StrNCpy(AffixFieldFlag(Affix), flag, flaglen + 1);
+
+ ConfBuild->nAffix++;
+ ConfBuild->AffixSize += size;
}
/* Parsing states for parse_affentry() and friends */
@@ -1021,10 +1238,10 @@ parse_affentry(char *str, char *mask, char *find, char *repl)
* Sets a Hunspell options depending on flag type.
*/
static void
-setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
+setCompoundAffixFlagValue(IspellDictBuild *ConfBuild, CompoundAffixFlag *entry,
char *s, uint32 val)
{
- if (Conf->flagMode == FM_NUM)
+ if (ConfBuild->dict->flagMode == FM_NUM)
{
char *next;
int i;
@@ -1044,19 +1261,19 @@ setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
else
entry->flag.s = tmpstrdup(s);
- entry->flagMode = Conf->flagMode;
+ entry->flagMode = ConfBuild->dict->flagMode;
entry->value = val;
}
/*
* Sets up a correspondence for the affix parameter with the affix flag.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* s: affix flag in string.
* val: affix parameter.
*/
static void
-addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
+addCompoundAffixFlagValue(IspellDictBuild *ConfBuild, char *s, uint32 val)
{
CompoundAffixFlag *newValue;
char sbuf[BUFSIZ];
@@ -1083,29 +1300,29 @@ addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
*sflag = '\0';
/* Resize array or allocate memory for array CompoundAffixFlag */
- if (Conf->nCompoundAffixFlag >= Conf->mCompoundAffixFlag)
+ if (ConfBuild->nCompoundAffixFlag >= ConfBuild->mCompoundAffixFlag)
{
- if (Conf->mCompoundAffixFlag)
+ if (ConfBuild->mCompoundAffixFlag)
{
- Conf->mCompoundAffixFlag *= 2;
- Conf->CompoundAffixFlags = (CompoundAffixFlag *)
- repalloc((void *) Conf->CompoundAffixFlags,
- Conf->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
+ ConfBuild->mCompoundAffixFlag *= 2;
+ ConfBuild->CompoundAffixFlags = (CompoundAffixFlag *)
+ repalloc((void *) ConfBuild->CompoundAffixFlags,
+ ConfBuild->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
}
else
{
- Conf->mCompoundAffixFlag = 10;
- Conf->CompoundAffixFlags = (CompoundAffixFlag *)
- tmpalloc(Conf->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
+ ConfBuild->mCompoundAffixFlag = 10;
+ ConfBuild->CompoundAffixFlags = (CompoundAffixFlag *)
+ tmpalloc(ConfBuild->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
}
}
- newValue = Conf->CompoundAffixFlags + Conf->nCompoundAffixFlag;
+ newValue = ConfBuild->CompoundAffixFlags + ConfBuild->nCompoundAffixFlag;
- setCompoundAffixFlagValue(Conf, newValue, sbuf, val);
+ setCompoundAffixFlagValue(ConfBuild, newValue, sbuf, val);
- Conf->usecompound = true;
- Conf->nCompoundAffixFlag++;
+ ConfBuild->dict->usecompound = true;
+ ConfBuild->nCompoundAffixFlag++;
}
/*
@@ -1113,7 +1330,7 @@ addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
* flags s.
*/
static int
-getCompoundAffixFlagValue(IspellDict *Conf, char *s)
+getCompoundAffixFlagValue(IspellDictBuild *ConfBuild, char *s)
{
uint32 flag = 0;
CompoundAffixFlag *found,
@@ -1121,18 +1338,18 @@ getCompoundAffixFlagValue(IspellDict *Conf, char *s)
char sflag[BUFSIZ];
char *flagcur;
- if (Conf->nCompoundAffixFlag == 0)
+ if (ConfBuild->nCompoundAffixFlag == 0)
return 0;
flagcur = s;
while (*flagcur)
{
- getNextFlagFromString(Conf, &flagcur, sflag);
- setCompoundAffixFlagValue(Conf, &key, sflag, 0);
+ getNextFlagFromString(ConfBuild->dict->flagMode, &flagcur, sflag);
+ setCompoundAffixFlagValue(ConfBuild, &key, sflag, 0);
found = (CompoundAffixFlag *)
- bsearch(&key, (void *) Conf->CompoundAffixFlags,
- Conf->nCompoundAffixFlag, sizeof(CompoundAffixFlag),
+ bsearch(&key, (void *) ConfBuild->CompoundAffixFlags,
+ ConfBuild->nCompoundAffixFlag, sizeof(CompoundAffixFlag),
cmpcmdflag);
if (found != NULL)
flag |= found->value;
@@ -1144,14 +1361,13 @@ getCompoundAffixFlagValue(IspellDict *Conf, char *s)
/*
* Returns a flag set using the s parameter.
*
- * If Conf->useFlagAliases is true then the s parameter is index of the
- * Conf->AffixData array and function returns its entry.
- * Else function returns the s parameter.
+ * If useFlagAliases is true, the s parameter is an index into the AffixData
+ * array and the function returns that entry. Otherwise it returns s itself.
*/
static char *
-getAffixFlagSet(IspellDict *Conf, char *s)
+getAffixFlagSet(IspellDictBuild *ConfBuild, char *s)
{
- if (Conf->useFlagAliases && *s != '\0')
+ if (ConfBuild->dict->useFlagAliases && *s != '\0')
{
int curaffix;
char *end;
@@ -1162,13 +1378,13 @@ getAffixFlagSet(IspellDict *Conf, char *s)
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix alias \"%s\"", s)));
- if (curaffix > 0 && curaffix <= Conf->nAffixData)
+ if (curaffix > 0 && curaffix <= ConfBuild->nAffixData)
/*
* Do not subtract 1 from curaffix because empty string was added
* in NIImportOOAffixes
*/
- return Conf->AffixData[curaffix];
+ return AffixDataGet(ConfBuild, curaffix);
else
return VoidString;
}
@@ -1179,11 +1395,11 @@ getAffixFlagSet(IspellDict *Conf, char *s)
/*
* Import an affix file that follows MySpell or Hunspell format.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* filename: path to the .affix file.
*/
static void
-NIImportOOAffixes(IspellDict *Conf, const char *filename)
+NIImportOOAffixes(IspellDictBuild *ConfBuild, const char *filename)
{
char type[BUFSIZ],
*ptype = NULL;
@@ -1195,17 +1411,16 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
char repl[BUFSIZ],
*prepl;
bool isSuffix = false;
- int naffix = 0,
- curaffix = 0;
+ int naffix = 0;
int sflaglen = 0;
char flagflags = 0;
tsearch_readline_state trst;
char *recoded;
/* read file to find any flag */
- Conf->usecompound = false;
- Conf->useFlagAliases = false;
- Conf->flagMode = FM_CHAR;
+ ConfBuild->dict->usecompound = false;
+ ConfBuild->dict->useFlagAliases = false;
+ ConfBuild->dict->flagMode = FM_CHAR;
if (!tsearch_readline_begin(&trst, filename))
ereport(ERROR,
@@ -1222,30 +1437,36 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
}
if (STRNCMP(recoded, "COMPOUNDFLAG") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDFLAG"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDFLAG"),
FF_COMPOUNDFLAG);
else if (STRNCMP(recoded, "COMPOUNDBEGIN") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDBEGIN"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDBEGIN"),
FF_COMPOUNDBEGIN);
else if (STRNCMP(recoded, "COMPOUNDLAST") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDLAST"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDLAST"),
FF_COMPOUNDLAST);
/* COMPOUNDLAST and COMPOUNDEND are synonyms */
else if (STRNCMP(recoded, "COMPOUNDEND") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDEND"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDEND"),
FF_COMPOUNDLAST);
else if (STRNCMP(recoded, "COMPOUNDMIDDLE") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDMIDDLE"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDMIDDLE"),
FF_COMPOUNDMIDDLE);
else if (STRNCMP(recoded, "ONLYINCOMPOUND") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("ONLYINCOMPOUND"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("ONLYINCOMPOUND"),
FF_COMPOUNDONLY);
else if (STRNCMP(recoded, "COMPOUNDPERMITFLAG") == 0)
- addCompoundAffixFlagValue(Conf,
+ addCompoundAffixFlagValue(ConfBuild,
recoded + strlen("COMPOUNDPERMITFLAG"),
FF_COMPOUNDPERMITFLAG);
else if (STRNCMP(recoded, "COMPOUNDFORBIDFLAG") == 0)
- addCompoundAffixFlagValue(Conf,
+ addCompoundAffixFlagValue(ConfBuild,
recoded + strlen("COMPOUNDFORBIDFLAG"),
FF_COMPOUNDFORBIDFLAG);
else if (STRNCMP(recoded, "FLAG") == 0)
@@ -1258,9 +1479,9 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
if (*s)
{
if (STRNCMP(s, "long") == 0)
- Conf->flagMode = FM_LONG;
+ ConfBuild->dict->flagMode = FM_LONG;
else if (STRNCMP(s, "num") == 0)
- Conf->flagMode = FM_NUM;
+ ConfBuild->dict->flagMode = FM_NUM;
else if (STRNCMP(s, "default") != 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
@@ -1274,8 +1495,8 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
}
tsearch_readline_end(&trst);
- if (Conf->nCompoundAffixFlag > 1)
- qsort((void *) Conf->CompoundAffixFlags, Conf->nCompoundAffixFlag,
+ if (ConfBuild->nCompoundAffixFlag > 1)
+ qsort((void *) ConfBuild->CompoundAffixFlags, ConfBuild->nCompoundAffixFlag,
sizeof(CompoundAffixFlag), cmpcmdflag);
if (!tsearch_readline_begin(&trst, filename))
@@ -1295,15 +1516,15 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
if (ptype)
pfree(ptype);
- ptype = lowerstr_ctx(Conf, type);
+ ptype = lowerstr_ctx(ConfBuild, type);
/* First try to parse AF parameter (alias compression) */
if (STRNCMP(ptype, "af") == 0)
{
/* First line is the number of aliases */
- if (!Conf->useFlagAliases)
+ if (!ConfBuild->dict->useFlagAliases)
{
- Conf->useFlagAliases = true;
+ ConfBuild->dict->useFlagAliases = true;
naffix = atoi(sflag);
if (naffix == 0)
ereport(ERROR,
@@ -1313,21 +1534,15 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
/* Also reserve place for empty flag set */
naffix++;
- Conf->AffixData = (char **) palloc0(naffix * sizeof(char *));
- Conf->lenAffixData = Conf->nAffixData = naffix;
+ InitAffixData(ConfBuild, naffix);
/* Add empty flag set into AffixData */
- Conf->AffixData[curaffix] = VoidString;
- curaffix++;
+ AddAffixSet(ConfBuild, VoidString, 0);
}
/* Other lines is aliases */
else
{
- if (curaffix < naffix)
- {
- Conf->AffixData[curaffix] = cpstrdup(Conf, sflag);
- curaffix++;
- }
+ AddAffixSet(ConfBuild, sflag, strlen(sflag));
}
goto nextline;
}
@@ -1338,8 +1553,8 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
sflaglen = strlen(sflag);
if (sflaglen == 0
- || (sflaglen > 1 && Conf->flagMode == FM_CHAR)
- || (sflaglen > 2 && Conf->flagMode == FM_LONG))
+ || (sflaglen > 1 && ConfBuild->dict->flagMode == FM_CHAR)
+ || (sflaglen > 2 && ConfBuild->dict->flagMode == FM_LONG))
goto nextline;
/*--------
@@ -1367,21 +1582,21 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
/* Get flags after '/' (flags are case sensitive) */
if ((ptr = strchr(repl, '/')) != NULL)
- aflg |= getCompoundAffixFlagValue(Conf,
- getAffixFlagSet(Conf,
+ aflg |= getCompoundAffixFlagValue(ConfBuild,
+ getAffixFlagSet(ConfBuild,
ptr + 1));
/* Get lowercased version of string before '/' */
- prepl = lowerstr_ctx(Conf, repl);
+ prepl = lowerstr_ctx(ConfBuild, repl);
if ((ptr = strchr(prepl, '/')) != NULL)
*ptr = '\0';
- pfind = lowerstr_ctx(Conf, find);
- pmask = lowerstr_ctx(Conf, mask);
+ pfind = lowerstr_ctx(ConfBuild, find);
+ pmask = lowerstr_ctx(ConfBuild, mask);
if (t_iseq(find, '0'))
*pfind = '\0';
if (t_iseq(repl, '0'))
*prepl = '\0';
- NIAddAffix(Conf, sflag, flagflags | aflg, pmask, pfind, prepl,
+ NIAddAffix(ConfBuild, sflag, flagflags | aflg, pmask, pfind, prepl,
isSuffix ? FF_SUFFIX : FF_PREFIX);
pfree(prepl);
pfree(pfind);
@@ -1407,7 +1622,7 @@ nextline:
* work to NIImportOOAffixes(), which will re-read the whole file.
*/
void
-NIImportAffixes(IspellDict *Conf, const char *filename)
+NIImportAffixes(IspellDictBuild *ConfBuild, const char *filename)
{
char *pstr = NULL;
char flag[BUFSIZ];
@@ -1428,9 +1643,9 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
errmsg("could not open affix file \"%s\": %m",
filename)));
- Conf->usecompound = false;
- Conf->useFlagAliases = false;
- Conf->flagMode = FM_CHAR;
+ ConfBuild->dict->usecompound = false;
+ ConfBuild->dict->useFlagAliases = false;
+ ConfBuild->dict->flagMode = FM_CHAR;
while ((recoded = tsearch_readline(&trst)) != NULL)
{
@@ -1452,10 +1667,8 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
s += pg_mblen(s);
if (*s && pg_mblen(s) == 1)
- {
- addCompoundAffixFlagValue(Conf, s, FF_COMPOUNDFLAG);
- Conf->usecompound = true;
- }
+ addCompoundAffixFlagValue(ConfBuild, s, FF_COMPOUNDFLAG);
+
oldformat = true;
goto nextline;
}
@@ -1528,7 +1741,8 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
if (!parse_affentry(pstr, mask, find, repl))
goto nextline;
- NIAddAffix(Conf, flag, flagflags, mask, find, repl, suffixes ? FF_SUFFIX : FF_PREFIX);
+ NIAddAffix(ConfBuild, flag, flagflags, mask, find, repl,
+ suffixes ? FF_SUFFIX : FF_PREFIX);
nextline:
pfree(recoded);
@@ -1547,53 +1761,48 @@ isnewformat:
errmsg("affix file contains both old-style and new-style commands")));
tsearch_readline_end(&trst);
- NIImportOOAffixes(Conf, filename);
+ NIImportOOAffixes(ConfBuild, filename);
}
/*
* Merges two affix flag sets and stores a new affix flag set into
- * Conf->AffixData.
+ * ConfBuild->AffixData.
*
* Returns index of a new affix flag set.
*/
static int
-MergeAffix(IspellDict *Conf, int a1, int a2)
+MergeAffix(IspellDictBuild *ConfBuild, int a1, int a2)
{
- char **ptr;
+ char *ptr;
+ uint32 len;
/* Do not merge affix flags if one of affix flags is empty */
- if (*Conf->AffixData[a1] == '\0')
+ if (*AffixDataGet(ConfBuild, a1) == '\0')
return a2;
- else if (*Conf->AffixData[a2] == '\0')
+ else if (*AffixDataGet(ConfBuild, a2) == '\0')
return a1;
- while (Conf->nAffixData + 1 >= Conf->lenAffixData)
- {
- Conf->lenAffixData *= 2;
- Conf->AffixData = (char **) repalloc(Conf->AffixData,
- sizeof(char *) * Conf->lenAffixData);
- }
-
- ptr = Conf->AffixData + Conf->nAffixData;
- if (Conf->flagMode == FM_NUM)
+ if (ConfBuild->dict->flagMode == FM_NUM)
{
- *ptr = cpalloc(strlen(Conf->AffixData[a1]) +
- strlen(Conf->AffixData[a2]) +
- 1 /* comma */ + 1 /* \0 */ );
- sprintf(*ptr, "%s,%s", Conf->AffixData[a1], Conf->AffixData[a2]);
+ len = strlen(AffixDataGet(ConfBuild, a1)) + 1 /* comma */ +
+ strlen(AffixDataGet(ConfBuild, a2));
+ ptr = tmpalloc(len + 1 /* \0 */);
+ sprintf(ptr, "%s,%s", AffixDataGet(ConfBuild, a1),
+ AffixDataGet(ConfBuild, a2));
}
else
{
- *ptr = cpalloc(strlen(Conf->AffixData[a1]) +
- strlen(Conf->AffixData[a2]) +
- 1 /* \0 */ );
- sprintf(*ptr, "%s%s", Conf->AffixData[a1], Conf->AffixData[a2]);
+ len = strlen(AffixDataGet(ConfBuild, a1)) +
+ strlen(AffixDataGet(ConfBuild, a2));
+ ptr = tmpalloc(len + 1 /* \0 */ );
+ sprintf(ptr, "%s%s", AffixDataGet(ConfBuild, a1),
+ AffixDataGet(ConfBuild, a2));
}
- ptr++;
- *ptr = NULL;
- Conf->nAffixData++;
- return Conf->nAffixData - 1;
+ AddAffixSet(ConfBuild, ptr, len);
+ pfree(ptr);
+
+ return ConfBuild->nAffixData - 1;
}
/*
@@ -1601,66 +1810,87 @@ MergeAffix(IspellDict *Conf, int a1, int a2)
* flags with the given index.
*/
static uint32
-makeCompoundFlags(IspellDict *Conf, int affix)
+makeCompoundFlags(IspellDictBuild *ConfBuild, int affix)
{
- char *str = Conf->AffixData[affix];
+ char *str = AffixDataGet(ConfBuild, affix);
- return (getCompoundAffixFlagValue(Conf, str) & FF_COMPOUNDFLAGMASK);
+ return (getCompoundAffixFlagValue(ConfBuild, str) & FF_COMPOUNDFLAGMASK);
}
/*
* Makes a prefix tree for the given level.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
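- * low: lower index of the Conf->Spell array.
- * high: upper index of the Conf->Spell array.
+ * low: lower index of the ConfBuild->Spell array.
+ * high: upper index of the ConfBuild->Spell array.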
* level: current prefix tree level.
+ *
+ * Returns the offset of the new SPNode in DictNodes.
*/
-static SPNode *
-mkSPNode(IspellDict *Conf, int low, int high, int level)
+static uint32
+mkSPNode(IspellDictBuild *ConfBuild, int low, int high, int level)
{
int i;
int nchar = 0;
char lastchar = '\0';
+ uint32 rs_offset,
+ new_offset;
SPNode *rs;
SPNodeData *data;
+ int data_index = 0;
int lownew = low;
for (i = low; i < high; i++)
- if (Conf->Spell[i]->p.d.len > level && lastchar != Conf->Spell[i]->word[level])
+ if (ConfBuild->Spell[i]->p.d.len > level &&
+ lastchar != ConfBuild->Spell[i]->word[level])
{
nchar++;
- lastchar = Conf->Spell[i]->word[level];
+ lastchar = ConfBuild->Spell[i]->word[level];
}
if (!nchar)
- return NULL;
+ return ISPELL_INVALID_OFFSET;
- rs = (SPNode *) cpalloc0(SPNHDRSZ + nchar * sizeof(SPNodeData));
- rs->length = nchar;
+ rs_offset = AllocateSPNode(ConfBuild, nchar);
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
data = rs->data;
lastchar = '\0';
for (i = low; i < high; i++)
- if (Conf->Spell[i]->p.d.len > level)
+ if (ConfBuild->Spell[i]->p.d.len > level)
{
- if (lastchar != Conf->Spell[i]->word[level])
+ if (lastchar != ConfBuild->Spell[i]->word[level])
{
if (lastchar)
{
/* Next level of the prefix tree */
- data->node = mkSPNode(Conf, lownew, i, level + 1);
+ new_offset = mkSPNode(ConfBuild, lownew, i, level + 1);
+
+ /*
+ * ConfBuild->DictNodes can be repalloc'ed within
+ * mkSPNode(), so reinitialize pointers.
+ */
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
+
+ /* First save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ /* Work with next node */
+ data_index++;
+ Assert(data_index < nchar);
+ data = &(rs->data[data_index]);
+
lownew = i;
- data++;
}
- lastchar = Conf->Spell[i]->word[level];
+ lastchar = ConfBuild->Spell[i]->word[level];
}
- data->val = ((uint8 *) (Conf->Spell[i]->word))[level];
- if (Conf->Spell[i]->p.d.len == level + 1)
+ data->val = ((uint8 *) (ConfBuild->Spell[i]->word))[level];
+ if (ConfBuild->Spell[i]->p.d.len == level + 1)
{
bool clearCompoundOnly = false;
- if (data->isword && data->affix != Conf->Spell[i]->p.d.affix)
+ if (data->isword && data->affix != ConfBuild->Spell[i]->p.d.affix)
{
/*
* MergeAffix called a few times. If one of word is
@@ -1669,15 +1899,17 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
*/
clearCompoundOnly = (FF_COMPOUNDONLY & data->compoundflag
- & makeCompoundFlags(Conf, Conf->Spell[i]->p.d.affix))
+ & makeCompoundFlags(ConfBuild,
+ ConfBuild->Spell[i]->p.d.affix))
? false : true;
- data->affix = MergeAffix(Conf, data->affix, Conf->Spell[i]->p.d.affix);
+ data->affix = MergeAffix(ConfBuild, data->affix,
+ ConfBuild->Spell[i]->p.d.affix);
}
else
- data->affix = Conf->Spell[i]->p.d.affix;
+ data->affix = ConfBuild->Spell[i]->p.d.affix;
data->isword = 1;
- data->compoundflag = makeCompoundFlags(Conf, data->affix);
+ data->compoundflag = makeCompoundFlags(ConfBuild, data->affix);
if ((data->compoundflag & FF_COMPOUNDONLY) &&
(data->compoundflag & FF_COMPOUNDFLAG) == 0)
@@ -1689,9 +1921,19 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
}
/* Next level of the prefix tree */
- data->node = mkSPNode(Conf, lownew, high, level + 1);
+ new_offset = mkSPNode(ConfBuild, lownew, high, level + 1);
+
+ /*
+ * ConfBuild->DictNodes can be repalloc'ed within mkSPNode(), so
+ * reinitialize pointers.
+ */
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
- return rs;
+ /* Save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ return rs_offset;
}
/*
@@ -1699,7 +1941,7 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
* and affixes.
*/
void
-NISortDictionary(IspellDict *Conf)
+NISortDictionary(IspellDictBuild *ConfBuild)
{
int i;
int naffix = 0;
@@ -1708,81 +1950,81 @@ NISortDictionary(IspellDict *Conf)
/* compress affixes */
/*
- * If we use flag aliases then we need to use Conf->AffixData filled in
+ * If we use flag aliases then we need to use ConfBuild->AffixData filled in
* the NIImportOOAffixes().
*/
- if (Conf->useFlagAliases)
+ if (ConfBuild->dict->useFlagAliases)
{
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
char *end;
- if (*Conf->Spell[i]->p.flag != '\0')
+ if (*ConfBuild->Spell[i]->p.flag != '\0')
{
- curaffix = strtol(Conf->Spell[i]->p.flag, &end, 10);
- if (Conf->Spell[i]->p.flag == end || errno == ERANGE)
+ curaffix = strtol(ConfBuild->Spell[i]->p.flag, &end, 10);
+ if (ConfBuild->Spell[i]->p.flag == end || errno == ERANGE)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix alias \"%s\"",
- Conf->Spell[i]->p.flag)));
+ ConfBuild->Spell[i]->p.flag)));
}
else
{
/*
- * If Conf->Spell[i]->p.flag is empty, then get empty value of
- * Conf->AffixData (0 index).
+ * If ConfBuild->Spell[i]->p.flag is empty, then get empty
+ * value of ConfBuild->AffixData (0 index).
*/
curaffix = 0;
}
- Conf->Spell[i]->p.d.affix = curaffix;
- Conf->Spell[i]->p.d.len = strlen(Conf->Spell[i]->word);
+ ConfBuild->Spell[i]->p.d.affix = curaffix;
+ ConfBuild->Spell[i]->p.d.len = strlen(ConfBuild->Spell[i]->word);
}
}
- /* Otherwise fill Conf->AffixData here */
+ /* Otherwise fill ConfBuild->AffixData here */
else
{
/* Count the number of different flags used in the dictionary */
- qsort((void *) Conf->Spell, Conf->nspell, sizeof(SPELL *),
+ qsort((void *) ConfBuild->Spell, ConfBuild->nSpell, sizeof(SPELL *),
cmpspellaffix);
naffix = 0;
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
if (i == 0
- || strcmp(Conf->Spell[i]->p.flag, Conf->Spell[i - 1]->p.flag))
+ || strcmp(ConfBuild->Spell[i]->p.flag,
+ ConfBuild->Spell[i - 1]->p.flag))
naffix++;
}
/*
- * Fill in Conf->AffixData with the affixes that were used in the
- * dictionary. Replace textual flag-field of Conf->Spell entries with
- * indexes into Conf->AffixData array.
+ * Fill in AffixData with the affixes that were used in the
+ * dictionary. Replace textual flag-field of ConfBuild->Spell entries
+ * with indexes into ConfBuild->AffixData array.
*/
- Conf->AffixData = (char **) palloc0(naffix * sizeof(char *));
+ InitAffixData(ConfBuild, naffix);
curaffix = -1;
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
if (i == 0
- || strcmp(Conf->Spell[i]->p.flag, Conf->AffixData[curaffix]))
+ || strcmp(ConfBuild->Spell[i]->p.flag,
+ AffixDataGet(ConfBuild, curaffix)))
{
curaffix++;
Assert(curaffix < naffix);
- Conf->AffixData[curaffix] = cpstrdup(Conf,
- Conf->Spell[i]->p.flag);
+ AddAffixSet(ConfBuild, ConfBuild->Spell[i]->p.flag,
+ strlen(ConfBuild->Spell[i]->p.flag));
}
- Conf->Spell[i]->p.d.affix = curaffix;
- Conf->Spell[i]->p.d.len = strlen(Conf->Spell[i]->word);
+ ConfBuild->Spell[i]->p.d.affix = curaffix;
+ ConfBuild->Spell[i]->p.d.len = strlen(ConfBuild->Spell[i]->word);
}
-
- Conf->lenAffixData = Conf->nAffixData = naffix;
}
/* Start build a prefix tree */
- qsort((void *) Conf->Spell, Conf->nspell, sizeof(SPELL *), cmpspell);
- Conf->Dictionary = mkSPNode(Conf, 0, Conf->nspell, 0);
+ qsort((void *) ConfBuild->Spell, ConfBuild->nSpell, sizeof(SPELL *), cmpspell);
+ mkSPNode(ConfBuild, 0, ConfBuild->nSpell, 0);
}
/*
@@ -1790,83 +2032,104 @@ NISortDictionary(IspellDict *Conf)
* rule. Affixes with empty replace string do not include in the prefix tree.
* This affixes are included by mkVoidAffix().
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
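- * low: lower index of the Conf->Affix array.
- * high: upper index of the Conf->Affix array.
+ * low: lower index of the ConfBuild->Affix array.
+ * high: upper index of the ConfBuild->Affix array.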
* level: current prefix tree level.
* type: FF_SUFFIX or FF_PREFIX.
+ *
+ * Returns the offset of the new node in the nodes array.
*/
-static AffixNode *
-mkANode(IspellDict *Conf, int low, int high, int level, int type)
+static uint32
+mkANode(IspellDictBuild *ConfBuild, int low, int high, int level, int type)
{
int i;
int nchar = 0;
uint8 lastchar = '\0';
+ NodeArray *array;
+ uint32 rs_offset,
+ new_offset;
AffixNode *rs;
AffixNodeData *data;
+ int data_index = 0;
int lownew = low;
- int naff;
- AFFIX **aff;
for (i = low; i < high; i++)
- if (Conf->Affix[i].replen > level && lastchar != GETCHAR(Conf->Affix + i, level, type))
+ if (ConfBuild->Affix[i]->replen > level &&
+ lastchar != GETCHAR(ConfBuild->Affix[i], level, type))
{
nchar++;
- lastchar = GETCHAR(Conf->Affix + i, level, type);
+ lastchar = GETCHAR(ConfBuild->Affix[i], level, type);
}
if (!nchar)
- return NULL;
+ return ISPELL_INVALID_OFFSET;
- aff = (AFFIX **) tmpalloc(sizeof(AFFIX *) * (high - low + 1));
- naff = 0;
+ if (type == FF_SUFFIX)
+ array = &ConfBuild->SuffixNodes;
+ else
+ array = &ConfBuild->PrefixNodes;
- rs = (AffixNode *) cpalloc0(ANHRDSZ + nchar * sizeof(AffixNodeData));
- rs->length = nchar;
- data = rs->data;
+ rs_offset = AllocateAffixNode(ConfBuild, array, nchar);
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
+ data = (AffixNodeData *) rs->data;
lastchar = '\0';
for (i = low; i < high; i++)
- if (Conf->Affix[i].replen > level)
+ if (ConfBuild->Affix[i]->replen > level)
{
- if (lastchar != GETCHAR(Conf->Affix + i, level, type))
+ if (lastchar != GETCHAR(ConfBuild->Affix[i], level, type))
{
if (lastchar)
{
/* Next level of the prefix tree */
- data->node = mkANode(Conf, lownew, i, level + 1, type);
- if (naff)
- {
- data->naff = naff;
- data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * naff);
- memcpy(data->aff, aff, sizeof(AFFIX *) * naff);
- naff = 0;
- }
- data++;
+ new_offset = mkANode(ConfBuild, lownew, i, level + 1, type);
+
+ /*
+ * array can be repalloc'ed within mkANode(), so
+ * reinitialize pointers.
+ */
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
+
+ /* First save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ /* Handle next data node */
+ data_index++;
+ Assert(data_index < nchar);
+ data = &(rs->data[data_index]);
+
lownew = i;
}
- lastchar = GETCHAR(Conf->Affix + i, level, type);
+ lastchar = GETCHAR(ConfBuild->Affix[i], level, type);
}
- data->val = GETCHAR(Conf->Affix + i, level, type);
- if (Conf->Affix[i].replen == level + 1)
+ data->val = GETCHAR(ConfBuild->Affix[i], level, type);
+ if (ConfBuild->Affix[i]->replen == level + 1)
{ /* affix stopped */
- aff[naff++] = Conf->Affix + i;
+ if (data->affstart == ISPELL_INVALID_INDEX)
+ {
+ data->affstart = i;
+ data->affend = i;
+ }
+ else
+ data->affend = i;
}
}
/* Next level of the prefix tree */
- data->node = mkANode(Conf, lownew, high, level + 1, type);
- if (naff)
- {
- data->naff = naff;
- data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * naff);
- memcpy(data->aff, aff, sizeof(AFFIX *) * naff);
- naff = 0;
- }
+ new_offset = mkANode(ConfBuild, lownew, high, level + 1, type);
+
+ /*
+ * array can be repalloc'ed within mkANode(), so reinitialize pointers.
+ */
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
- pfree(aff);
+ /* Save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
- return rs;
+ return rs_offset;
}
/*
@@ -1874,137 +2137,154 @@ mkANode(IspellDict *Conf, int low, int high, int level, int type)
* for affixes which have empty replace string ("repl" field).
*/
static void
-mkVoidAffix(IspellDict *Conf, bool issuffix, int startsuffix)
+mkVoidAffix(IspellDictBuild *ConfBuild, bool issuffix, int startsuffix)
{
- int i,
- cnt = 0;
+ int i;
int start = (issuffix) ? startsuffix : 0;
- int end = (issuffix) ? Conf->naffixes : startsuffix;
- AffixNode *Affix = (AffixNode *) palloc0(ANHRDSZ + sizeof(AffixNodeData));
-
- Affix->length = 1;
- Affix->isvoid = 1;
+ int end = (issuffix) ? ConfBuild->nAffix : startsuffix;
+ uint32 node_offset;
+ NodeArray *array;
+ AffixNode *Affix;
+ AffixNodeData *AffixData;
if (issuffix)
- {
- Affix->data->node = Conf->Suffix;
- Conf->Suffix = Affix;
- }
+ array = &ConfBuild->SuffixNodes;
else
- {
- Affix->data->node = Conf->Prefix;
- Conf->Prefix = Affix;
- }
+ array = &ConfBuild->PrefixNodes;
- /* Count affixes with empty replace string */
- for (i = start; i < end; i++)
- if (Conf->Affix[i].replen == 0)
- cnt++;
-
- /* There is not affixes with empty replace string */
- if (cnt == 0)
- return;
+ node_offset = AllocateAffixNode(ConfBuild, array, 1);
+ Affix = (AffixNode *) NodeArrayGet(array, node_offset);
- Affix->data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * cnt);
- Affix->data->naff = (uint32) cnt;
+ Affix->isvoid = 1;
+ AffixData = (AffixNodeData *) Affix->data;
- cnt = 0;
for (i = start; i < end; i++)
- if (Conf->Affix[i].replen == 0)
+ if (ConfBuild->Affix[i]->replen == 0)
{
- Affix->data->aff[cnt] = Conf->Affix + i;
- cnt++;
+ if (AffixData->affstart == ISPELL_INVALID_INDEX)
+ {
+ AffixData->affstart = i;
+ AffixData->affend = i;
+ }
+ else
+ AffixData->affend = i;
}
}
/*
- * Checks if the affixflag is used by dictionary. Conf->AffixData does not
+ * Checks if the affixflag is used by dictionary. AffixData does not
* contain affixflag if this flag is not used actually by the .dict file.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* affixflag: affix flag.
*
- * Returns true if the Conf->AffixData array contains affixflag, otherwise
+ * Returns true if the ConfBuild->AffixData array contains affixflag, otherwise
* returns false.
*/
static bool
-isAffixInUse(IspellDict *Conf, char *affixflag)
+isAffixInUse(IspellDictBuild *ConfBuild, char *affixflag)
{
int i;
- for (i = 0; i < Conf->nAffixData; i++)
- if (IsAffixFlagInUse(Conf, i, affixflag))
+ for (i = 0; i < ConfBuild->nAffixData; i++)
+ if (IsAffixFlagInUse(ConfBuild->dict->flagMode,
+ AffixDataGet(ConfBuild, i), affixflag))
return true;
return false;
}
/*
- * Builds Conf->Prefix and Conf->Suffix trees from the imported affixes.
+ * Builds Prefix and Suffix trees from the imported affixes.
*/
void
-NISortAffixes(IspellDict *Conf)
+NISortAffixes(IspellDictBuild *ConfBuild)
{
AFFIX *Affix;
+ AffixNode *voidPrefix,
+ *voidSuffix;
size_t i;
CMPDAffix *ptr;
- int firstsuffix = Conf->naffixes;
+ int firstsuffix = ConfBuild->nAffix;
+ uint32 prefix_offset,
+ suffix_offset;
- if (Conf->naffixes == 0)
+ if (ConfBuild->nAffix == 0)
return;
/* Store compound affixes in the Conf->CompoundAffix array */
- if (Conf->naffixes > 1)
- qsort((void *) Conf->Affix, Conf->naffixes, sizeof(AFFIX), cmpaffix);
- Conf->CompoundAffix = ptr = (CMPDAffix *) palloc(sizeof(CMPDAffix) * Conf->naffixes);
- ptr->affix = NULL;
-
- for (i = 0; i < Conf->naffixes; i++)
+ if (ConfBuild->nAffix > 1)
+ qsort((void *) ConfBuild->Affix, ConfBuild->nAffix,
+ sizeof(AFFIX *), cmpaffix);
+ ConfBuild->nCompoundAffix = ConfBuild->nAffix;
+ ConfBuild->CompoundAffix = ptr =
+ (CMPDAffix *) tmpalloc(sizeof(CMPDAffix) * ConfBuild->nCompoundAffix);
+ ptr->affix = ISPELL_INVALID_INDEX;
+
+ for (i = 0; i < ConfBuild->nAffix; i++)
{
- Affix = &(((AFFIX *) Conf->Affix)[i]);
+ Affix = ConfBuild->Affix[i];
if (Affix->type == FF_SUFFIX && i < firstsuffix)
firstsuffix = i;
if ((Affix->flagflags & FF_COMPOUNDFLAG) && Affix->replen > 0 &&
- isAffixInUse(Conf, Affix->flag))
+ isAffixInUse(ConfBuild, AffixFieldFlag(Affix)))
{
- if (ptr == Conf->CompoundAffix ||
+ if (ptr == ConfBuild->CompoundAffix ||
ptr->issuffix != (ptr - 1)->issuffix ||
- strbncmp((const unsigned char *) (ptr - 1)->affix,
- (const unsigned char *) Affix->repl,
+ strbncmp((const unsigned char *) AffixFieldRepl(ConfBuild->Affix[(ptr - 1)->affix]),
+ (const unsigned char *) AffixFieldRepl(Affix),
(ptr - 1)->len))
{
/* leave only unique and minimals suffixes */
- ptr->affix = Affix->repl;
+ ptr->affix = i;
ptr->len = Affix->replen;
ptr->issuffix = (Affix->type == FF_SUFFIX);
ptr++;
}
}
}
- ptr->affix = NULL;
- Conf->CompoundAffix = (CMPDAffix *) repalloc(Conf->CompoundAffix, sizeof(CMPDAffix) * (ptr - Conf->CompoundAffix + 1));
+ ptr->affix = ISPELL_INVALID_INDEX;
+ ConfBuild->nCompoundAffix = ptr - ConfBuild->CompoundAffix + 1;
+ ConfBuild->CompoundAffix = (CMPDAffix *) repalloc(ConfBuild->CompoundAffix,
+ sizeof(CMPDAffix) * (ConfBuild->nCompoundAffix));
/* Start build a prefix tree */
- Conf->Prefix = mkANode(Conf, 0, firstsuffix, 0, FF_PREFIX);
- Conf->Suffix = mkANode(Conf, firstsuffix, Conf->naffixes, 0, FF_SUFFIX);
- mkVoidAffix(Conf, true, firstsuffix);
- mkVoidAffix(Conf, false, firstsuffix);
+ mkVoidAffix(ConfBuild, true, firstsuffix);
+ mkVoidAffix(ConfBuild, false, firstsuffix);
+
+ prefix_offset = mkANode(ConfBuild, 0, firstsuffix, 0, FF_PREFIX);
+ suffix_offset = mkANode(ConfBuild, firstsuffix, ConfBuild->nAffix, 0,
+ FF_SUFFIX);
+
+ /* Adjust offsets of new nodes for nodes of void affixes */
+ voidPrefix = (AffixNode *) NodeArrayGet(&ConfBuild->PrefixNodes, 0);
+ voidPrefix->data[0].node_offset = prefix_offset;
+
+ voidSuffix = (AffixNode *) NodeArrayGet(&ConfBuild->SuffixNodes, 0);
+ voidSuffix->data[0].node_offset = suffix_offset;
}
static AffixNodeData *
-FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
+FindAffixes(IspellDictData *dict, AffixNode *node, const char *word, int wrdlen,
+ int *level, int type)
{
+ AffixNode *node_start;
AffixNodeData *StopLow,
*StopHigh,
*StopMiddle;
uint8 symbol;
+ if (type == FF_PREFIX)
+ node_start = (AffixNode *) DictPrefixNodes(dict);
+ else
+ node_start = (AffixNode *) DictSuffixNodes(dict);
+
if (node->isvoid)
{ /* search void affixes */
- if (node->data->naff)
+ if (node->data->affstart != ISPELL_INVALID_INDEX)
return node->data;
- node = node->data->node;
+ node = (AffixNode *) DictNodeGet(node_start, node->data->node_offset);
}
while (node && *level < wrdlen)
@@ -2019,9 +2299,10 @@ FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
if (StopMiddle->val == symbol)
{
(*level)++;
- if (StopMiddle->naff)
+ if (StopMiddle->affstart != ISPELL_INVALID_INDEX)
return StopMiddle;
- node = StopMiddle->node;
+ node = (AffixNode *) DictNodeGet(node_start,
+ StopMiddle->node_offset);
break;
}
else if (StopMiddle->val < symbol)
@@ -2035,8 +2316,67 @@ FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
return NULL;
}
+/*
+ * Compile regular expression on first use and store it within reg.
+ */
+static void
+CompileAffixReg(AffixReg *reg, bool isregis, int type,
+ const char *mask, int masklen, MemoryContext dictCtx)
+{
+ MemoryContext oldcontext;
+
+ Assert(dictCtx);
+
+ /*
+ * Switch to memory context of the dictionary, so compiled expression can be
+ * used in other queries.
+ */
+ oldcontext = MemoryContextSwitchTo(dictCtx);
+
+ if (isregis)
+ RS_compile(&reg->r.regis, (type == FF_SUFFIX), mask);
+ else
+ {
+ int wmasklen;
+ int err;
+ pg_wchar *wmask;
+ char *tmask;
+
+ tmask = (char *) palloc(masklen + 3);
+ if (type == FF_SUFFIX)
+ sprintf(tmask, "%s$", mask);
+ else
+ sprintf(tmask, "^%s", mask);
+
+ masklen = strlen(tmask);
+ wmask = (pg_wchar *) palloc((masklen + 1) * sizeof(pg_wchar));
+ wmasklen = pg_mb2wchar_with_len(tmask, wmask, masklen);
+
+ err = pg_regcomp(&(reg->r.regex), wmask, wmasklen,
+ REG_ADVANCED | REG_NOSUB,
+ DEFAULT_COLLATION_OID);
+ if (err)
+ {
+ char errstr[100];
+
+ pg_regerror(err, &(reg->r.regex), errstr, sizeof(errstr));
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
+ errmsg("invalid regular expression: %s", errstr)));
+ }
+
+ pfree(wmask);
+ pfree(tmask);
+ }
+
+ reg->iscompiled = true;
+
+ MemoryContextSwitchTo(oldcontext);
+}
+
static char *
-CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *newword, int *baselen)
+CheckAffix(const char *word, size_t len, AFFIX *Affix, AffixReg *reg,
+ int flagflags, char *newword, int *baselen, MemoryContext dictCtx)
{
/*
* Check compound allow flags
@@ -2076,7 +2416,7 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
if (Affix->type == FF_SUFFIX)
{
strcpy(newword, word);
- strcpy(newword + len - Affix->replen, Affix->find);
+ strcpy(newword + len - Affix->replen, AffixFieldFind(Affix));
if (baselen) /* store length of non-changed part of word */
*baselen = len - Affix->replen;
}
@@ -2086,9 +2426,9 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
* if prefix is an all non-changed part's length then all word
* contains only prefix and suffix, so out
*/
- if (baselen && *baselen + strlen(Affix->find) <= Affix->replen)
+ if (baselen && *baselen + Affix->findlen <= Affix->replen)
return NULL;
- strcpy(newword, Affix->find);
+ strcpy(newword, AffixFieldFind(Affix));
strcat(newword, word + Affix->replen);
}
@@ -2099,7 +2439,12 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
return newword;
else if (Affix->isregis)
{
- if (RS_execute(&(Affix->reg.regis), newword))
+ /* Compile the regular expression on first demand */
+ if (!reg->iscompiled)
+ CompileAffixReg(reg, Affix->isregis, Affix->type,
+ AffixFieldMask(Affix), Affix->masklen, dictCtx);
+
+ if (RS_execute(&(reg->r.regis), newword))
return newword;
}
else
@@ -2109,12 +2454,17 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
size_t data_len;
int newword_len;
+ /* Compile the regular expression on first demand */
+ if (!reg->iscompiled)
+ CompileAffixReg(reg, Affix->isregis, Affix->type,
+ AffixFieldMask(Affix), Affix->masklen, dictCtx);
+
/* Convert data string to wide characters */
newword_len = strlen(newword);
data = (pg_wchar *) palloc((newword_len + 1) * sizeof(pg_wchar));
data_len = pg_mb2wchar_with_len(newword, data, newword_len);
- if (!(err = pg_regexec(&(Affix->reg.regex), data, data_len, 0, NULL, 0, NULL, 0)))
+ if (!(err = pg_regexec(&(reg->r.regex), data, data_len, 0, NULL, 0, NULL, 0)))
{
pfree(data);
return newword;
@@ -2153,7 +2503,7 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
char **cur;
char newword[2 * MAXNORMLEN] = "";
char pnewword[2 * MAXNORMLEN] = "";
- AffixNode *snode = Conf->Suffix,
+ AffixNode *snode = (AffixNode *) DictSuffixNodes(Conf->dict),
*pnode;
int i,
j;
@@ -2165,7 +2515,7 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
/* Check that the word itself is normal form */
- if (FindWord(Conf, word, VoidString, flag))
+ if (FindWord(Conf->dict, word, VoidString, flag))
{
*cur = pstrdup(word);
cur++;
@@ -2173,23 +2523,29 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
}
/* Find all other NORMAL forms of the 'word' (check only prefix) */
- pnode = Conf->Prefix;
+ pnode = (AffixNode *) DictPrefixNodes(Conf->dict);
plevel = 0;
while (pnode)
{
- prefix = FindAffixes(pnode, word, wrdlen, &plevel, FF_PREFIX);
+ prefix = FindAffixes(Conf->dict, pnode, word, wrdlen, &plevel, FF_PREFIX);
if (!prefix)
break;
- for (j = 0; j < prefix->naff; j++)
+ for (j = prefix->affstart; j <= prefix->affend; j++)
{
- if (CheckAffix(word, wrdlen, prefix->aff[j], flag, newword, NULL))
+ AFFIX *affix = (AFFIX *) DictAffixGet(Conf->dict, j);
+ AffixReg *reg = &(Conf->reg[j]);
+
+ if (affix &&
+ CheckAffix(word, wrdlen, affix, reg, flag, newword, NULL,
+ Conf->dictCtx))
{
/* prefix success */
- if (FindWord(Conf, newword, prefix->aff[j]->flag, flag))
+ if (FindWord(Conf->dict, newword, AffixFieldFlag(affix), flag))
cur += addToResult(forms, cur, newword);
}
}
- pnode = prefix->node;
+ pnode = (AffixNode *) DictNodeGet(DictPrefixNodes(Conf->dict),
+ prefix->node_offset);
}
/*
@@ -2201,45 +2557,59 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
int baselen = 0;
/* find possible suffix */
- suffix = FindAffixes(snode, word, wrdlen, &slevel, FF_SUFFIX);
+ suffix = FindAffixes(Conf->dict, snode, word, wrdlen, &slevel,
+ FF_SUFFIX);
if (!suffix)
break;
/* foreach suffix check affix */
- for (i = 0; i < suffix->naff; i++)
+ for (i = suffix->affstart; i <= suffix->affend; i++)
{
- if (CheckAffix(word, wrdlen, suffix->aff[i], flag, newword, &baselen))
+ AFFIX *sufentry = (AFFIX *) DictAffixGet(Conf->dict, i);
+ AffixReg *sufreg = &(Conf->reg[i]);
+
+ if (sufentry &&
+ CheckAffix(word, wrdlen, sufentry, sufreg, flag, newword, &baselen,
+ Conf->dictCtx))
{
/* suffix success */
- if (FindWord(Conf, newword, suffix->aff[i]->flag, flag))
+ if (FindWord(Conf->dict, newword, AffixFieldFlag(sufentry), flag))
cur += addToResult(forms, cur, newword);
/* now we will look changed word with prefixes */
- pnode = Conf->Prefix;
+ pnode = (AffixNode *) DictPrefixNodes(Conf->dict);
plevel = 0;
swrdlen = strlen(newword);
while (pnode)
{
- prefix = FindAffixes(pnode, newword, swrdlen, &plevel, FF_PREFIX);
+ prefix = FindAffixes(Conf->dict, pnode, newword, swrdlen,
+ &plevel, FF_PREFIX);
if (!prefix)
break;
- for (j = 0; j < prefix->naff; j++)
+ for (j = prefix->affstart; j <= prefix->affend; j++)
{
- if (CheckAffix(newword, swrdlen, prefix->aff[j], flag, pnewword, &baselen))
+ AFFIX *prefentry = (AFFIX *) DictAffixGet(Conf->dict, j);
+ AffixReg *prefreg = &(Conf->reg[j]);
+
+ if (prefentry &&
+ CheckAffix(newword, swrdlen, prefentry, prefreg,
+ flag, pnewword, &baselen, Conf->dictCtx))
{
/* prefix success */
- char *ff = (prefix->aff[j]->flagflags & suffix->aff[i]->flagflags & FF_CROSSPRODUCT) ?
- VoidString : prefix->aff[j]->flag;
+ char *ff = (prefentry->flagflags & sufentry->flagflags & FF_CROSSPRODUCT) ?
+ VoidString : AffixFieldFlag(prefentry);
- if (FindWord(Conf, pnewword, ff, flag))
+ if (FindWord(Conf->dict, pnewword, ff, flag))
cur += addToResult(forms, cur, pnewword);
}
}
- pnode = prefix->node;
+ pnode = (AffixNode *) DictNodeGet(DictPrefixNodes(Conf->dict),
+ prefix->node_offset);
}
}
}
- snode = suffix->node;
+ snode = (AffixNode *) DictNodeGet(DictSuffixNodes(Conf->dict),
+ suffix->node_offset);
}
if (cur == forms)
@@ -2259,7 +2629,8 @@ typedef struct SplitVar
} SplitVar;
static int
-CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
+CheckCompoundAffixes(IspellDictData *dict, CMPDAffix **ptr,
+ char *word, int len, bool CheckInPlace)
{
bool issuffix;
@@ -2269,9 +2640,12 @@ CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
if (CheckInPlace)
{
- while ((*ptr)->affix)
+ while ((*ptr)->affix != ISPELL_INVALID_INDEX)
{
- if (len > (*ptr)->len && strncmp((*ptr)->affix, word, (*ptr)->len) == 0)
+ AFFIX *affix = (AFFIX *) DictAffixGet(dict, (*ptr)->affix);
+
+ if (len > (*ptr)->len &&
+ strncmp(AffixFieldRepl(affix), word, (*ptr)->len) == 0)
{
len = (*ptr)->len;
issuffix = (*ptr)->issuffix;
@@ -2285,9 +2659,12 @@ CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
{
char *affbegin;
- while ((*ptr)->affix)
+ while ((*ptr)->affix != ISPELL_INVALID_INDEX)
{
- if (len > (*ptr)->len && (affbegin = strstr(word, (*ptr)->affix)) != NULL)
+ AFFIX *affix = (AFFIX *) DictAffixGet(dict, (*ptr)->affix);
+
+ if (len > (*ptr)->len &&
+ (affbegin = strstr(word, AffixFieldRepl(affix))) != NULL)
{
len = (*ptr)->len + (affbegin - word);
issuffix = (*ptr)->issuffix;
@@ -2339,13 +2716,14 @@ AddStem(SplitVar *v, char *word)
}
static SplitVar *
-SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int wordlen, int startpos, int minpos)
+SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig,
+ char *word, int wordlen, int startpos, int minpos)
{
SplitVar *var = NULL;
SPNodeData *StopLow,
*StopHigh,
*StopMiddle = NULL;
- SPNode *node = (snode) ? snode : Conf->Dictionary;
+ SPNode *node = (snode) ? snode : (SPNode *) DictDictNodes(Conf->dict);
int level = (snode) ? minpos : startpos; /* recursive
* minpos==level */
int lenaff;
@@ -2360,8 +2738,11 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
while (level < wordlen)
{
/* find word with epenthetic or/and compound affix */
- caff = Conf->CompoundAffix;
- while (level > startpos && (lenaff = CheckCompoundAffixes(&caff, word + level, wordlen - level, (node) ? true : false)) >= 0)
+ caff = (CMPDAffix *) DictCompoundAffix(Conf->dict);
+ while (level > startpos &&
+ (lenaff = CheckCompoundAffixes(Conf->dict, &caff,
+ word + level, wordlen - level,
+ (node) ? true : false)) >= 0)
{
/*
* there is one of compound affixes, so check word for existings
@@ -2408,7 +2789,8 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
while (ptr->next)
ptr = ptr->next;
- ptr->next = SplitToVariants(Conf, NULL, new, word, wordlen, startpos + lenaff, startpos + lenaff);
+ ptr->next = SplitToVariants(Conf, NULL, new, word, wordlen,
+ startpos + lenaff, startpos + lenaff);
pfree(new->stem);
pfree(new);
@@ -2467,13 +2849,14 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
/* we can find next word */
level++;
AddStem(var, pnstrdup(word + startpos, level - startpos));
- node = Conf->Dictionary;
+ node = (SPNode *) DictDictNodes(Conf->dict);
startpos = level;
continue;
}
}
}
- node = StopMiddle->node;
+ node = (SPNode *) DictNodeGet(DictDictNodes(Conf->dict),
+ StopMiddle->node_offset);
}
else
node = NULL;
@@ -2523,7 +2906,7 @@ NINormalizeWord(IspellDict *Conf, char *word)
pfree(res);
}
- if (Conf->usecompound)
+ if (Conf->dict->usecompound)
{
int wordlen = strlen(word);
SplitVar *ptr,
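
A note for readers of the spell.c changes above: the recurring pattern is that every "node" pointer becomes a byte offset into one contiguous buffer, resolved through DictNodeGet()/NodeArrayGet(), with ISPELL_INVALID_OFFSET standing in for NULL. The following is only a minimal standalone sketch of that idea, not code from the patch. DemoNode, DemoArena and the demo_* functions are hypothetical names, and realloc() stands in for whatever allocator the build phase really uses.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_INVALID_OFFSET 0xFFFFFFFFu	/* plays the role of ISPELL_INVALID_OFFSET */

/* Hypothetical, simplified node: one symbol plus the offset of the next node. */
typedef struct DemoNode
{
	uint8_t		val;
	uint32_t	node_offset;	/* byte offset from the start of the buffer */
} DemoNode;

/* Growable byte buffer, loosely modelled on NodeArray. */
typedef struct DemoArena
{
	char	   *nodes;
	uint32_t	size;			/* allocated size of nodes */
	uint32_t	end;			/* end of data in nodes */
} DemoArena;

/* Resolve an offset against a base address; the sentinel maps to NULL. */
static DemoNode *
demo_node_get(char *base, uint32_t off)
{
	return (off == DEMO_INVALID_OFFSET) ? NULL : (DemoNode *) (base + off);
}

/* Append one node and return its offset. The buffer may move when it grows. */
static uint32_t
demo_alloc_node(DemoArena *a, uint8_t val)
{
	uint32_t	off = a->end;
	DemoNode   *n;

	if (a->end + sizeof(DemoNode) > a->size)
	{
		uint32_t	newsize = a->size ? a->size * 2 : 64;
		char	   *p = realloc(a->nodes, newsize);

		assert(p != NULL);
		a->nodes = p;
		a->size = newsize;
	}
	n = (DemoNode *) (a->nodes + off);
	n->val = val;
	n->node_offset = DEMO_INVALID_OFFSET;
	a->end += sizeof(DemoNode);
	return off;
}

/* Store the bytes of word as a chain of nodes; returns the first offset. */
static uint32_t
demo_build_chain(DemoArena *a, const char *word)
{
	uint32_t	first = DEMO_INVALID_OFFSET;
	uint32_t	prev = DEMO_INVALID_OFFSET;

	for (; *word; word++)
	{
		uint32_t	cur = demo_alloc_node(a, (uint8_t) *word);

		if (prev == DEMO_INVALID_OFFSET)
			first = cur;
		else
			/* re-resolve prev here: the buffer may have been moved */
			demo_node_get(a->nodes, prev)->node_offset = cur;
		prev = cur;
	}
	return first;
}

/* Follow the chain starting at base + first; copies symbols into out. */
static size_t
demo_read_chain(char *base, uint32_t first, char *out, size_t outsz)
{
	size_t		n = 0;
	DemoNode   *node = demo_node_get(base, first);

	while (node != NULL && n + 1 < outsz)
	{
		out[n++] = (char) node->val;
		node = demo_node_get(base, node->node_offset);
	}
	out[n] = '\0';
	return n;
}
```

Because the links are offsets, the whole buffer can be copied with memcpy() to a different address (which is roughly what mapping a DSM segment into another backend amounts to) and demo_read_chain() still works on the copy without any fix-up pass.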
diff --git a/src/include/tsearch/dicts/spell.h b/src/include/tsearch/dicts/spell.h
index 210f97dda9..b40cf379eb 100644
--- a/src/include/tsearch/dicts/spell.h
+++ b/src/include/tsearch/dicts/spell.h
@@ -18,21 +18,23 @@
#include "tsearch/dicts/regis.h"
#include "tsearch/ts_public.h"
+#define ISPELL_INVALID_INDEX (0x7FFFF)
+#define ISPELL_INVALID_OFFSET (0xFFFFFFFF)
+
/*
* SPNode and SPNodeData are used to represent prefix tree (Trie) to store
* a words list.
*/
-struct SPNode;
-
typedef struct
{
uint32 val:8,
isword:1,
/* Stores compound flags listed below */
compoundflag:4,
- /* Reference to an entry of the AffixData field */
+ /* Index of an entry of the AffixData field */
affix:19;
- struct SPNode *node;
+ /* Offset to a node of the DictNodes field */
+ uint32 node_offset;
} SPNodeData;
/*
@@ -86,21 +88,55 @@ typedef struct spell_struct
*/
typedef struct aff_struct
{
- char *flag;
/* FF_SUFFIX or FF_PREFIX */
- uint32 type:1,
+ uint16 type:1,
flagflags:7,
issimple:1,
isregis:1,
- replen:14;
- char *find;
- char *repl;
+ flaglen:2;
+
+ /* 8 bits could be too much for repl, find and mask, but who knows */
+ uint8 replen;
+ uint8 findlen;
+ uint8 masklen;
+
+ /*
+ * fields stores the following data (each ends with \0):
+ * - repl
+ * - find
+ * - mask
+ * - flag - one character (if FM_CHAR),
+ * two characters (if FM_LONG),
+ * number, >= 0 and < 65536 (if FM_NUM).
+ */
+ char fields[FLEXIBLE_ARRAY_MEMBER];
+} AFFIX;
+
+#define AF_FLAG_MAXSIZE 5 /* strlen(65536) */
+#define AF_REPL_MAXSIZE 255 /* 8 bits */
+
+#define AFFIXHDRSZ (offsetof(AFFIX, fields))
+
+#define AffixFieldRepl(af) ((af)->fields)
+#define AffixFieldFind(af) ((af)->fields + (af)->replen + 1)
+#define AffixFieldMask(af) (AffixFieldFind(af) + (af)->findlen + 1)
+#define AffixFieldFlag(af) (AffixFieldMask(af) + (af)->masklen + 1)
+#define AffixGetSize(af) (AFFIXHDRSZ + (af)->replen + 1 + (af)->findlen + 1 \
+ + (af)->masklen + 1 + strlen(AffixFieldFlag(af)) + 1)
+
+/*
+ * Stores compiled regular expression of affix. AffixReg uses mask field of
+ * AFFIX as a regular expression.
+ */
+typedef struct AffixReg
+{
+ bool iscompiled;
union
{
regex_t regex;
Regis regis;
- } reg;
-} AFFIX;
+ } r;
+} AffixReg;
/*
* affixes use dictionary flags too
@@ -120,14 +156,13 @@ typedef struct aff_struct
* AffixNode and AffixNodeData are used to represent prefix tree (Trie) to store
* an affix list.
*/
-struct AffixNode;
-
typedef struct
{
- uint32 val:8,
- naff:24;
- AFFIX **aff;
- struct AffixNode *node;
+ uint8 val;
+ uint32 affstart;
+ uint32 affend;
+ /* Offset to a node of the PrefixNodes or SuffixNodes field */
+ uint32 node_offset;
} AffixNodeData;
typedef struct AffixNode
@@ -139,9 +174,19 @@ typedef struct AffixNode
#define ANHRDSZ (offsetof(AffixNode, data))
+typedef struct NodeArray
+{
+ char *Nodes;
+ uint32 NodesSize; /* allocated size of Nodes */
+ uint32 NodesEnd; /* end of data in Nodes */
+} NodeArray;
+
+#define NodeArrayGet(na, of) (((of) == ISPELL_INVALID_OFFSET) ? NULL : (na)->Nodes + (of))
+
typedef struct
{
- char *affix;
+ /* Index of an affix of the Affix field */
+ uint32 affix;
int len;
bool issuffix;
} CMPDAffix;
@@ -176,30 +221,71 @@ typedef struct CompoundAffixFlag
#define FLAGNUM_MAXSIZE (1 << 16)
-typedef struct
+typedef struct IspellDictData
{
- int maffixes;
- int naffixes;
- AFFIX *Affix;
-
- AffixNode *Suffix;
- AffixNode *Prefix;
+ FlagMode flagMode;
+ bool usecompound;
- SPNode *Dictionary;
- /* Array of sets of affixes */
- char **AffixData;
- int lenAffixData;
- int nAffixData;
bool useFlagAliases;
- CMPDAffix *CompoundAffix;
+ uint32 nAffixData;
+ uint32 AffixDataStart;
- bool usecompound;
- FlagMode flagMode;
+ uint32 AffixOffsetStart;
+ uint32 AffixStart;
+ uint32 nAffix;
+
+ uint32 DictNodesStart;
+ uint32 PrefixNodesStart;
+ uint32 SuffixNodesStart;
+
+ uint32 CompoundAffixStart;
/*
- * All follow fields are actually needed only for initialization
+ * data stores:
+ * - AffixData - array of affix sets
+ * - Affix - sorted array of affixes
+ * - DictNodes - prefix tree of a word list
+ * - PrefixNodes - prefix tree of a prefix list
+ * - SuffixNodes - prefix tree of a suffix list
+ * - CompoundAffix - array of compound affixes
*/
+ char data[FLEXIBLE_ARRAY_MEMBER];
+} IspellDictData;
+
+#define IspellDictDataHdrSize (offsetof(IspellDictData, data))
+
+#define DictAffixDataOffset(d) ((d)->data)
+#define DictAffixData(d) ((d)->data + (d)->AffixDataStart)
+#define DictAffixDataGet(d, i) (((i) == ISPELL_INVALID_INDEX) ? NULL : \
+ DictAffixData(d) + ((uint32 *) DictAffixDataOffset(d))[i])
+
+#define DictAffixOffset(d) ((d)->data + (d)->AffixOffsetStart)
+#define DictAffix(d) ((d)->data + (d)->AffixStart)
+#define DictAffixGet(d, i) (((i) == ISPELL_INVALID_INDEX) ? NULL : \
+ DictAffix(d) + ((uint32 *) DictAffixOffset(d))[i])
+
+#define DictDictNodes(d) ((d)->data + (d)->DictNodesStart)
+#define DictPrefixNodes(d) ((d)->data + (d)->PrefixNodesStart)
+#define DictSuffixNodes(d) ((d)->data + (d)->SuffixNodesStart)
+#define DictNodeGet(node_start, of) (((of) == ISPELL_INVALID_OFFSET) ? NULL : \
+ (char *) (node_start) + (of))
+
+#define DictCompoundAffix(d) ((d)->data + (d)->CompoundAffixStart)
+
+/*
+ * IspellDictBuild is used to initialize IspellDictData struct. This is a
+ * temporary structure which is set up by NIStartBuild() and released by
+ * NIFinishBuild().
+ */
+typedef struct IspellDictBuild
+{
+ MemoryContext buildCxt; /* temp context for construction */
+
+ IspellDictData *dict;
+ uint32 dict_size;
+
+ /* Temporary data */
/* Array of Hunspell options in affix file */
CompoundAffixFlag *CompoundAffixFlags;
@@ -208,29 +294,73 @@ typedef struct
/* allocated length of CompoundAffixFlags array */
int mCompoundAffixFlag;
- /*
- * Remaining fields are only used during dictionary construction; they are
- * set up by NIStartBuild and cleared by NIFinishBuild.
- */
- MemoryContext buildCxt; /* temp context for construction */
-
- /* Temporary array of all words in the dict file */
+ /* Array of all words in the dict file */
SPELL **Spell;
- int nspell; /* number of valid entries in Spell array */
- int mspell; /* allocated length of Spell array */
+ int nSpell; /* number of valid entries in Spell array */
+ int mSpell; /* allocated length of Spell array */
+
+ /* Array of all affixes in the aff file */
+ AFFIX **Affix;
+ int nAffix; /* number of valid entries in Affix array */
+ int mAffix; /* allocated length of Affix array */
+ uint32 AffixSize;
+
+ /* Data for IspellDictData */
+
+ /* Array of sets of affixes */
+ uint32 *AffixDataOffset;
+ int nAffixData; /* number of affix sets */
+ int mAffixData; /* allocated number of affix sets */
+ char *AffixData;
+ uint32 AffixDataSize; /* allocated size of AffixData */
+ uint32 AffixDataEnd; /* end of data in AffixData */
+
+ /* Prefix tree which stores a word list */
+ NodeArray DictNodes;
+
+ /* Prefix tree which stores a prefix list */
+ NodeArray PrefixNodes;
+
+ /* Prefix tree which stores a suffix list */
+ NodeArray SuffixNodes;
- /* These are used to allocate "compact" data without palloc overhead */
- char *firstfree; /* first free address (always maxaligned) */
- size_t avail; /* free space remaining at firstfree */
+ /* Array of compound affixes */
+ CMPDAffix *CompoundAffix;
+ int nCompoundAffix; /* number of entries of CompoundAffix */
+} IspellDictBuild;
+
+#define AffixDataGet(d, i) ((d)->AffixData + (d)->AffixDataOffset[i])
+
+/*
+ * IspellDict is used within NINormalizeWord.
+ */
+typedef struct IspellDict
+{
+ /*
+ * Pointer to a DSM location of IspellDictData. Should be retrieved on
+ * every dispell_lexize() call.
+ */
+ IspellDictData *dict;
+ /*
+ * Array of regular expressions of affixes. Each regular expression is
+ * compiled only on demand.
+ */
+ AffixReg *reg;
+ /*
+ * Memory context for compiling regular expressions.
+ */
+ MemoryContext dictCtx;
} IspellDict;
extern TSLexeme *NINormalizeWord(IspellDict *Conf, char *word);
-extern void NIStartBuild(IspellDict *Conf);
-extern void NIImportAffixes(IspellDict *Conf, const char *filename);
-extern void NIImportDictionary(IspellDict *Conf, const char *filename);
-extern void NISortDictionary(IspellDict *Conf);
-extern void NISortAffixes(IspellDict *Conf);
-extern void NIFinishBuild(IspellDict *Conf);
+extern void NIStartBuild(IspellDictBuild *ConfBuild);
+extern void NIImportAffixes(IspellDictBuild *ConfBuild, const char *filename);
+extern void NIImportDictionary(IspellDictBuild *ConfBuild,
+ const char *filename);
+extern void NISortDictionary(IspellDictBuild *ConfBuild);
+extern void NISortAffixes(IspellDictBuild *ConfBuild);
+extern void NICopyData(IspellDictBuild *ConfBuild);
+extern void NIFinishBuild(IspellDictBuild *ConfBuild);
#endif
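
For anyone reviewing the new AFFIX layout in spell.h: the struct no longer holds char pointers. Instead repl, find, mask and flag are packed back to back into the trailing "fields" array, each terminated by \0, and the AffixField*() macros compute where each string starts from the stored lengths. The standalone sketch below only illustrates that packing scheme under simplified assumptions. DemoAffix, the Demo*() macros and demo_affix_make() are hypothetical names, the bitfields are omitted, and malloc() is used in place of the patch's allocators.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reduced version of the AFFIX record. */
typedef struct DemoAffix
{
	uint8_t		replen;
	uint8_t		findlen;
	uint8_t		masklen;
	char		fields[];		/* repl \0 find \0 mask \0 flag \0 */
} DemoAffix;

/* Accessors in the style of AffixFieldRepl() and friends. */
#define DemoRepl(af) ((af)->fields)
#define DemoFind(af) ((af)->fields + (af)->replen + 1)
#define DemoMask(af) (DemoFind(af) + (af)->findlen + 1)
#define DemoFlag(af) (DemoMask(af) + (af)->masklen + 1)

/* Total size of a record, header plus the four terminated strings. */
static size_t
demo_affix_size(const DemoAffix *af)
{
	return offsetof(DemoAffix, fields) + af->replen + 1 + af->findlen + 1 +
		af->masklen + 1 + strlen(DemoFlag(af)) + 1;
}

/* Build one packed record; returns NULL if a string does not fit in uint8. */
static DemoAffix *
demo_affix_make(const char *repl, const char *find,
				const char *mask, const char *flag)
{
	size_t		replen,
				findlen,
				masklen,
				flaglen;
	DemoAffix  *af;

	assert(repl && find && mask && flag);

	replen = strlen(repl);
	findlen = strlen(find);
	masklen = strlen(mask);
	flaglen = strlen(flag);

	if (replen > 255 || findlen > 255 || masklen > 255)
		return NULL;

	af = malloc(offsetof(DemoAffix, fields) +
				replen + 1 + findlen + 1 + masklen + 1 + flaglen + 1);
	if (af == NULL)
		return NULL;

	/* Lengths must be set first: the accessor macros depend on them. */
	af->replen = (uint8_t) replen;
	af->findlen = (uint8_t) findlen;
	af->masklen = (uint8_t) masklen;

	memcpy(DemoRepl(af), repl, replen + 1);
	memcpy(DemoFind(af), find, findlen + 1);
	memcpy(DemoMask(af), mask, masklen + 1);
	memcpy(DemoFlag(af), flag, flaglen + 1);
	return af;
}
```

A record built this way contains no addresses at all, so it can be copied byte for byte into a shared segment. The price is that the lengths are capped by the width of the length fields, which is presumably why the patch defines AF_REPL_MAXSIZE.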
0004-Update-tmplinit-arguments-v6.patch
diff --git a/contrib/dict_int/Makefile b/contrib/dict_int/Makefile
index f6ae24aa4d..897be348ff 100644
--- a/contrib/dict_int/Makefile
+++ b/contrib/dict_int/Makefile
@@ -4,7 +4,7 @@ MODULE_big = dict_int
OBJS = dict_int.o $(WIN32RES)
EXTENSION = dict_int
-DATA = dict_int--1.0.sql dict_int--unpackaged--1.0.sql
+DATA = dict_int--1.1.sql dict_int--1.0--1.1.sql dict_int--unpackaged--1.0.sql
PGFILEDESC = "dict_int - add-on dictionary template for full-text search"
REGRESS = dict_int
diff --git a/contrib/dict_int/dict_int--1.0--1.1.sql b/contrib/dict_int/dict_int--1.0--1.1.sql
new file mode 100644
index 0000000000..3517a5ecd1
--- /dev/null
+++ b/contrib/dict_int/dict_int--1.0--1.1.sql
@@ -0,0 +1,9 @@
+/* contrib/dict_int/dict_int--1.0--1.1.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION dict_int UPDATE TO '1.1'" to load this file. \quit
+
+CREATE FUNCTION dintdict_init(internal, internal)
+ RETURNS internal
+ AS 'MODULE_PATHNAME'
+ LANGUAGE C STRICT;
diff --git a/contrib/dict_int/dict_int--1.0.sql b/contrib/dict_int/dict_int--1.1.sql
similarity index 93%
rename from contrib/dict_int/dict_int--1.0.sql
rename to contrib/dict_int/dict_int--1.1.sql
index acb1461b56..6d3933e3d3 100644
--- a/contrib/dict_int/dict_int--1.0.sql
+++ b/contrib/dict_int/dict_int--1.1.sql
@@ -3,7 +3,7 @@
-- complain if script is sourced in psql, rather than via CREATE EXTENSION
\echo Use "CREATE EXTENSION dict_int" to load this file. \quit
-CREATE FUNCTION dintdict_init(internal)
+CREATE FUNCTION dintdict_init(internal, internal)
RETURNS internal
AS 'MODULE_PATHNAME'
LANGUAGE C STRICT;
diff --git a/contrib/dict_int/dict_int.control b/contrib/dict_int/dict_int.control
index 6e2d2b351a..51894171f6 100644
--- a/contrib/dict_int/dict_int.control
+++ b/contrib/dict_int/dict_int.control
@@ -1,5 +1,5 @@
# dict_int extension
comment = 'text search dictionary template for integers'
-default_version = '1.0'
+default_version = '1.1'
module_pathname = '$libdir/dict_int'
relocatable = true
diff --git a/contrib/dict_xsyn/Makefile b/contrib/dict_xsyn/Makefile
index 0c401cf3c8..d1cf8d0b5d 100644
--- a/contrib/dict_xsyn/Makefile
+++ b/contrib/dict_xsyn/Makefile
@@ -4,7 +4,7 @@ MODULE_big = dict_xsyn
OBJS = dict_xsyn.o $(WIN32RES)
EXTENSION = dict_xsyn
-DATA = dict_xsyn--1.0.sql dict_xsyn--unpackaged--1.0.sql
+DATA = dict_xsyn--1.1.sql dict_xsyn--1.0--1.1.sql dict_xsyn--unpackaged--1.0.sql
DATA_TSEARCH = xsyn_sample.rules
PGFILEDESC = "dict_xsyn - add-on dictionary template for full-text search"
diff --git a/contrib/dict_xsyn/dict_xsyn--1.0--1.1.sql b/contrib/dict_xsyn/dict_xsyn--1.0--1.1.sql
new file mode 100644
index 0000000000..35a576bfee
--- /dev/null
+++ b/contrib/dict_xsyn/dict_xsyn--1.0--1.1.sql
@@ -0,0 +1,9 @@
+/* contrib/dict_xsyn/dict_xsyn--1.0--1.1.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION dict_xsyn UPDATE TO '1.1'" to load this file. \quit
+
+CREATE FUNCTION dxsyn_init(internal, internal)
+ RETURNS internal
+ AS 'MODULE_PATHNAME'
+ LANGUAGE C STRICT;
diff --git a/contrib/dict_xsyn/dict_xsyn--1.0.sql b/contrib/dict_xsyn/dict_xsyn--1.1.sql
similarity index 93%
rename from contrib/dict_xsyn/dict_xsyn--1.0.sql
rename to contrib/dict_xsyn/dict_xsyn--1.1.sql
index 3d6bb51ca8..d8d1de1aa4 100644
--- a/contrib/dict_xsyn/dict_xsyn--1.0.sql
+++ b/contrib/dict_xsyn/dict_xsyn--1.1.sql
@@ -3,7 +3,7 @@
-- complain if script is sourced in psql, rather than via CREATE EXTENSION
\echo Use "CREATE EXTENSION dict_xsyn" to load this file. \quit
-CREATE FUNCTION dxsyn_init(internal)
+CREATE FUNCTION dxsyn_init(internal, internal)
RETURNS internal
AS 'MODULE_PATHNAME'
LANGUAGE C STRICT;
diff --git a/contrib/dict_xsyn/dict_xsyn.control b/contrib/dict_xsyn/dict_xsyn.control
index 3fd465a955..50358374a7 100644
--- a/contrib/dict_xsyn/dict_xsyn.control
+++ b/contrib/dict_xsyn/dict_xsyn.control
@@ -1,5 +1,5 @@
# dict_xsyn extension
comment = 'text search dictionary template for extended synonym processing'
-default_version = '1.0'
+default_version = '1.1'
module_pathname = '$libdir/dict_xsyn'
relocatable = true
diff --git a/contrib/unaccent/Makefile b/contrib/unaccent/Makefile
index f8e3860926..b0ba23ed37 100644
--- a/contrib/unaccent/Makefile
+++ b/contrib/unaccent/Makefile
@@ -4,7 +4,8 @@ MODULE_big = unaccent
OBJS = unaccent.o $(WIN32RES)
EXTENSION = unaccent
-DATA = unaccent--1.1.sql unaccent--1.0--1.1.sql unaccent--unpackaged--1.0.sql
+DATA = unaccent--1.2.sql unaccent--1.1--1.2.sql unaccent--1.0--1.1.sql \
+ unaccent--unpackaged--1.0.sql
DATA_TSEARCH = unaccent.rules
PGFILEDESC = "unaccent - text search dictionary that removes accents"
diff --git a/contrib/unaccent/unaccent--1.1--1.2.sql b/contrib/unaccent/unaccent--1.1--1.2.sql
new file mode 100644
index 0000000000..eaef37f87e
--- /dev/null
+++ b/contrib/unaccent/unaccent--1.1--1.2.sql
@@ -0,0 +1,9 @@
+/* contrib/unaccent/unaccent--1.1--1.2.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION unaccent UPDATE TO '1.2'" to load this file. \quit
+
+CREATE FUNCTION unaccent_init(internal,internal)
+ RETURNS internal
+ AS 'MODULE_PATHNAME', 'unaccent_init'
+ LANGUAGE C PARALLEL SAFE;
diff --git a/contrib/unaccent/unaccent--1.1.sql b/contrib/unaccent/unaccent--1.2.sql
similarity index 94%
rename from contrib/unaccent/unaccent--1.1.sql
rename to contrib/unaccent/unaccent--1.2.sql
index ecc8651780..d6ce193e82 100644
--- a/contrib/unaccent/unaccent--1.1.sql
+++ b/contrib/unaccent/unaccent--1.2.sql
@@ -13,7 +13,7 @@ CREATE FUNCTION unaccent(text)
AS 'MODULE_PATHNAME', 'unaccent_dict'
LANGUAGE C STABLE STRICT PARALLEL SAFE;
-CREATE FUNCTION unaccent_init(internal)
+CREATE FUNCTION unaccent_init(internal,internal)
RETURNS internal
AS 'MODULE_PATHNAME', 'unaccent_init'
LANGUAGE C PARALLEL SAFE;
diff --git a/contrib/unaccent/unaccent.control b/contrib/unaccent/unaccent.control
index a77a65f891..aec53b5ad5 100644
--- a/contrib/unaccent/unaccent.control
+++ b/contrib/unaccent/unaccent.control
@@ -1,5 +1,5 @@
# unaccent extension
comment = 'text search dictionary that removes accents'
-default_version = '1.1'
+default_version = '1.2'
module_pathname = '$libdir/unaccent'
relocatable = true
diff --git a/src/backend/commands/tsearchcmds.c b/src/backend/commands/tsearchcmds.c
index b6aeae449b..32ab98b6a7 100644
--- a/src/backend/commands/tsearchcmds.c
+++ b/src/backend/commands/tsearchcmds.c
@@ -664,7 +664,7 @@ get_ts_template_func(DefElem *defel, int attnum)
switch (attnum)
{
case Anum_pg_ts_template_tmplinit:
- nargs = 1;
+ nargs = 2;
break;
case Anum_pg_ts_template_tmpllexize:
nargs = 4;
diff --git a/src/backend/snowball/snowball_func.sql.in b/src/backend/snowball/snowball_func.sql.in
index c02dad43e3..9b85e41ff8 100644
--- a/src/backend/snowball/snowball_func.sql.in
+++ b/src/backend/snowball/snowball_func.sql.in
@@ -19,7 +19,7 @@
SET search_path = pg_catalog;
-CREATE FUNCTION dsnowball_init(INTERNAL)
+CREATE FUNCTION dsnowball_init(INTERNAL, INTERNAL)
RETURNS INTERNAL AS '$libdir/dict_snowball', 'dsnowball_init'
LANGUAGE C STRICT;
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 0fdb42f639..163885840d 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -4893,22 +4893,22 @@ DESCR("(internal)");
DATA(insert OID = 3723 ( ts_lexize PGNSP PGUID 12 1 0 0 0 f f f t f i s 2 0 1009 "3769 25" _null_ _null_ _null_ _null_ _null_ ts_lexize _null_ _null_ _null_ ));
DESCR("normalize one word by dictionary");
-DATA(insert OID = 3725 ( dsimple_init PGNSP PGUID 12 1 0 0 0 f f f t f i s 1 0 2281 "2281" _null_ _null_ _null_ _null_ _null_ dsimple_init _null_ _null_ _null_ ));
+DATA(insert OID = 3725 ( dsimple_init PGNSP PGUID 12 1 0 0 0 f f f t f i s 2 0 2281 "2281 2281" _null_ _null_ _null_ _null_ _null_ dsimple_init _null_ _null_ _null_ ));
DESCR("(internal)");
DATA(insert OID = 3726 ( dsimple_lexize PGNSP PGUID 12 1 0 0 0 f f f t f i s 4 0 2281 "2281 2281 2281 2281" _null_ _null_ _null_ _null_ _null_ dsimple_lexize _null_ _null_ _null_ ));
DESCR("(internal)");
-DATA(insert OID = 3728 ( dsynonym_init PGNSP PGUID 12 1 0 0 0 f f f t f i s 1 0 2281 "2281" _null_ _null_ _null_ _null_ _null_ dsynonym_init _null_ _null_ _null_ ));
+DATA(insert OID = 3728 ( dsynonym_init PGNSP PGUID 12 1 0 0 0 f f f t f i s 2 0 2281 "2281 2281" _null_ _null_ _null_ _null_ _null_ dsynonym_init _null_ _null_ _null_ ));
DESCR("(internal)");
DATA(insert OID = 3729 ( dsynonym_lexize PGNSP PGUID 12 1 0 0 0 f f f t f i s 4 0 2281 "2281 2281 2281 2281" _null_ _null_ _null_ _null_ _null_ dsynonym_lexize _null_ _null_ _null_ ));
DESCR("(internal)");
-DATA(insert OID = 3731 ( dispell_init PGNSP PGUID 12 1 0 0 0 f f f t f i s 1 0 2281 "2281" _null_ _null_ _null_ _null_ _null_ dispell_init _null_ _null_ _null_ ));
+DATA(insert OID = 3731 ( dispell_init PGNSP PGUID 12 1 0 0 0 f f f t f i s 2 0 2281 "2281 2281" _null_ _null_ _null_ _null_ _null_ dispell_init _null_ _null_ _null_ ));
DESCR("(internal)");
DATA(insert OID = 3732 ( dispell_lexize PGNSP PGUID 12 1 0 0 0 f f f t f i s 4 0 2281 "2281 2281 2281 2281" _null_ _null_ _null_ _null_ _null_ dispell_lexize _null_ _null_ _null_ ));
DESCR("(internal)");
-DATA(insert OID = 3740 ( thesaurus_init PGNSP PGUID 12 1 0 0 0 f f f t f i s 1 0 2281 "2281" _null_ _null_ _null_ _null_ _null_ thesaurus_init _null_ _null_ _null_ ));
+DATA(insert OID = 3740 ( thesaurus_init PGNSP PGUID 12 1 0 0 0 f f f t f i s 2 0 2281 "2281 2281" _null_ _null_ _null_ _null_ _null_ thesaurus_init _null_ _null_ _null_ ));
DESCR("(internal)");
DATA(insert OID = 3741 ( thesaurus_lexize PGNSP PGUID 12 1 0 0 0 f f f t f i s 4 0 2281 "2281 2281 2281 2281" _null_ _null_ _null_ _null_ _null_ thesaurus_lexize _null_ _null_ _null_ ));
DESCR("(internal)");
0005-pg-ts-shared-dictinaries-view-v6.patch (text/plain; charset=us-ascii)
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index a0e6d7062b..b0c86804c8 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -8225,6 +8225,11 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
<entry>time zone names</entry>
</row>
+ <row>
+ <entry><link linkend="view-pg-ts-shared-dictionaries"><structname>pg_ts_shared_dictionaries</structname></link></entry>
+ <entry>dictionaries currently in shared memory</entry>
+ </row>
+
<row>
<entry><link linkend="view-pg-user"><structname>pg_user</structname></link></entry>
<entry>database users</entry>
@@ -10980,6 +10985,63 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</sect1>
+ <sect1 id="view-pg-ts-shared-dictionaries">
+ <title><structname>pg_ts_shared_dictionaries</structname></title>
+
+ <indexterm zone="view-pg-ts-shared-dictionaries">
+ <primary>pg_ts_shared_dictionaries</primary>
+ </indexterm>
+
+ <para>
+ The <structname>pg_ts_shared_dictionaries</structname> view provides a
+ listing of all text search dictionaries that are currently allocated in the
+ shared memory. The size of available space in shared memory is controlled by
+ <xref linkend="guc-shared-buffers"/>. A dictionary may have an option which
+ controls allocation in shared memory (see <xref linkend="textsearch-ispell-dictionary"/>).
+ </para>
+
+ <table>
+ <title><structname>pg_ts_shared_dictionaries</structname> Columns</title>
+
+ <tgroup cols="4">
+ <thead>
+ <row>
+ <entry>Name</entry>
+ <entry>Type</entry>
+ <entry>References</entry>
+ <entry>Description</entry>
+ </row>
+ </thead>
+ <tbody>
+ <row>
+ <entry><structfield>dictoid</structfield></entry>
+ <entry><type>oid</type></entry>
+ <entry><literal><link linkend="catalog-pg-ts-dict"><structname>pg_ts_dict</structname></link>.oid</literal></entry>
+ <entry>The OID of the text search dictionary located in shared memory</entry>
+ </row>
+ <row>
+ <entry><structfield>schemaname</structfield></entry>
+ <entry><type>name</type></entry>
+ <entry><literal><link linkend="catalog-pg-namespace"><structname>pg_namespace</structname></link>.nspname</literal></entry>
+ <entry>The name of schema containing the text search dictionary</entry>
+ </row>
+ <row>
+ <entry><structfield>dictname</structfield></entry>
+ <entry><type>name</type></entry>
+ <entry><literal><link linkend="catalog-pg-ts-dict"><structname>pg_ts_dict</structname></link>.dictname</literal></entry>
+ <entry>The text search dictionary name</entry>
+ </row>
+ <row>
+ <entry><structfield>size</structfield></entry>
+ <entry><type>bigint</type></entry>
+ <entry></entry>
+ <entry>Size of the text search dictionary in bytes</entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+ </sect1>
+
<sect1 id="view-pg-user">
<title><structname>pg_user</structname></title>
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index 82afe201f8..78ed082994 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -3045,6 +3045,12 @@ CREATE TEXT SEARCH DICTIONARY english_stem (
parameter value greater than zero before server starting.
</para>
+ <para>
+ The list of dictionaries currently located in shared memory can be retrieved from the
+ <link linkend="view-pg-ts-shared-dictionaries"><structname>pg_ts_shared_dictionaries</structname></link>
+ view.
+ </para>
+
</sect2>
</sect1>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 5e6e8a64f6..ab7ee973d9 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -506,6 +506,9 @@ CREATE VIEW pg_config AS
REVOKE ALL on pg_config FROM PUBLIC;
REVOKE EXECUTE ON FUNCTION pg_config() FROM PUBLIC;
+CREATE VIEW pg_ts_shared_dictionaries AS
+ SELECT * FROM pg_ts_shared_dictionaries();
+
-- Statistics views
CREATE VIEW pg_stat_all_tables AS
diff --git a/src/backend/tsearch/ts_shared.c b/src/backend/tsearch/ts_shared.c
index 7d1f7544cf..ff3127f207 100644
--- a/src/backend/tsearch/ts_shared.c
+++ b/src/backend/tsearch/ts_shared.c
@@ -13,11 +13,18 @@
*/
#include "postgres.h"
+#include "funcapi.h"
+#include "miscadmin.h"
+
+#include "access/htup_details.h"
+#include "catalog/pg_ts_dict.h"
#include "lib/dshash.h"
#include "storage/lwlock.h"
#include "storage/shmem.h"
#include "tsearch/ts_shared.h"
+#include "utils/builtins.h"
#include "utils/hashutils.h"
+#include "utils/lsyscache.h"
#include "utils/memutils.h"
@@ -364,3 +371,100 @@ recheck_table:
MemoryContextSwitchTo(old_context);
}
+
+/*
+ * pg_ts_shared_dictionaries - SQL SRF showing dictionaries currently in
+ * shared memory.
+ */
+Datum
+pg_ts_shared_dictionaries(PG_FUNCTION_ARGS)
+{
+ ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
+ MemoryContext oldcontext;
+ TupleDesc tupdesc;
+ Tuplestorestate *tupstore;
+ Relation rel;
+ HeapTuple tuple;
+ SysScanDesc scan;
+
+ /* check to see if caller supports us returning a tuplestore */
+ if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("set-valued function called in context that cannot accept a set")));
+ if (!(rsinfo->allowedModes & SFRM_Materialize))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("materialize mode required, but it is not " \
+ "allowed in this context")));
+
+ /* Build a tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ elog(ERROR, "return type must be a row type");
+
+ /* Build tuplestore to hold the result rows */
+ oldcontext = MemoryContextSwitchTo(rsinfo->econtext->ecxt_per_query_memory);
+
+ tupstore = tuplestore_begin_heap(true, false, work_mem);
+ rsinfo->returnMode = SFRM_Materialize;
+ rsinfo->setResult = tupstore;
+ rsinfo->setDesc = tupdesc;
+
+ MemoryContextSwitchTo(oldcontext);
+
+ init_dict_table();
+
+ /*
+ * If a hash table wasn't created return zero records.
+ */
+ if (!DsaPointerIsValid(tsearch_ctl->dict_table_handle))
+ {
+ tuplestore_donestoring(tupstore);
+
+ PG_RETURN_VOID();
+ }
+
+ /* Start to scan pg_ts_dict */
+ rel = heap_open(TSDictionaryRelationId, AccessShareLock);
+ scan = systable_beginscan(rel, InvalidOid, false, NULL, 0, NULL);
+
+ while (HeapTupleIsValid(tuple = systable_getnext(scan)))
+ {
+ Datum values[4];
+ bool nulls[4];
+ Form_pg_ts_dict dict = (Form_pg_ts_dict) GETSTRUCT(tuple);
+ Oid dictid = HeapTupleGetOid(tuple);
+ TsearchDictEntry *entry;
+ NameData dict_name;
+
+ /* Skip dictionaries that are not located in shared memory */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, &dictid, false);
+ if (!entry)
+ continue;
+
+ namecpy(&dict_name, &dict->dictname);
+
+ memset(nulls, 0, sizeof(nulls));
+
+ values[0] = ObjectIdGetDatum(dictid);
+
+ if (OidIsValid(dict->dictnamespace))
+ values[1] = CStringGetDatum(get_namespace_name(dict->dictnamespace));
+ else
+ nulls[1] = true;
+
+ values[2] = NameGetDatum(&dict_name);
+ values[3] = Int64GetDatum(entry->dict_size);
+
+ dshash_release_lock(dict_table, entry);
+
+ tuplestore_putvalues(tupstore, tupdesc, values, nulls);
+ }
+
+ systable_endscan(scan);
+ heap_close(rel, AccessShareLock);
+
+ tuplestore_donestoring(tupstore);
+
+ PG_RETURN_VOID();
+}
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 163885840d..00ffee8837 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -4973,6 +4973,9 @@ DESCR("trigger for automatic update of tsvector column");
DATA(insert OID = 3759 ( get_current_ts_config PGNSP PGUID 12 1 0 0 0 f f f t f s s 0 0 3734 "" _null_ _null_ _null_ _null_ _null_ get_current_ts_config _null_ _null_ _null_ ));
DESCR("get current tsearch configuration");
+DATA(insert OID = 4213 ( pg_ts_shared_dictionaries PGNSP PGUID 12 1 10 0 0 f f f f t s s 0 0 2249 "" "{26,19,19,20}" "{o,o,o,o}" "{dictoid,schemaname,dictname,size}" _null_ _null_ pg_ts_shared_dictionaries _null_ _null_ _null_ ));
+DESCR("information about text search dictionaries currently in shared memory");
+
DATA(insert OID = 3736 ( regconfigin PGNSP PGUID 12 1 0 0 0 f f f t f s s 1 0 3734 "2275" _null_ _null_ _null_ _null_ _null_ regconfigin _null_ _null_ _null_ ));
DESCR("I/O");
DATA(insert OID = 3737 ( regconfigout PGNSP PGUID 12 1 0 0 0 f f f t f s s 1 0 2275 "3734" _null_ _null_ _null_ _null_ _null_ regconfigout _null_ _null_ _null_ ));
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index d7eff6c0a7..7e5a20470e 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2211,6 +2211,11 @@ pg_timezone_names| SELECT pg_timezone_names.name,
pg_timezone_names.utc_offset,
pg_timezone_names.is_dst
FROM pg_timezone_names() pg_timezone_names(name, abbrev, utc_offset, is_dst);
+pg_ts_shared_dictionaries| SELECT pg_ts_shared_dictionaries.dictoid,
+ pg_ts_shared_dictionaries.schemaname,
+ pg_ts_shared_dictionaries.dictname,
+ pg_ts_shared_dictionaries.size
+ FROM pg_ts_shared_dictionaries() pg_ts_shared_dictionaries(dictoid, schemaname, dictname, size);
pg_user| SELECT pg_shadow.usename,
pg_shadow.usesysid,
pg_shadow.usecreatedb,
Hello Andres,
On Thu, Mar 01, 2018 at 08:31:49PM -0800, Andres Freund wrote:
Is there any chance we can instead convert dictionaries into a form
we can just mmap() into memory? That'd scale a lot higher and more
dynamically?
To avoid misunderstanding, could you please elaborate on using mmap()? The
DSM approach looks simpler and requires less code. Also, DSM may
use mmap() if I'm not mistaken.
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
On 03/07/2018 09:55 AM, Arthur Zakirov wrote:
Hello Andres,
On Thu, Mar 01, 2018 at 08:31:49PM -0800, Andres Freund wrote:
Is there any chance we can instead convert dictionaries into a form
we can just mmap() into memory? That'd scale a lot higher and more
dynamically?
To avoid misunderstanding, could you please elaborate on using mmap()? The
DSM approach looks simpler and requires less code. Also, DSM may
use mmap() if I'm not mistaken.
I think the mmap() idea is that you preprocess the dictionary, store the
result in a file, and then mmap it when needed, without the expensive
preprocessing.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Mar 07, 2018 at 10:55:29AM +0100, Tomas Vondra wrote:
On 03/07/2018 09:55 AM, Arthur Zakirov wrote:
Hello Andres,
On Thu, Mar 01, 2018 at 08:31:49PM -0800, Andres Freund wrote:
Is there any chance we can instead convert dictionaries into a form
we can just mmap() into memory? That'd scale a lot higher and more
dynamically?
To avoid misunderstanding, could you please elaborate on using mmap()? The
DSM approach looks simpler and requires less code. Also, DSM may
use mmap() if I'm not mistaken.
I think the mmap() idea is that you preprocess the dictionary, store the
result in a file, and then mmap it when needed, without the expensive
preprocessing.
Understood. I'm not against the mmap() approach; I just didn't fully
understand its benefits. The current shared Ispell approach requires
preprocessing after a server restart, and the main advantage of mmap() here
is that it doesn't require preprocessing after a restart.
Speaking about the implementation:
It seems that the most appropriate place to store preprocessed files is
the 'pg_dynshmem' folder. The file prefix could be 'ts_dict.'; otherwise
dsm_cleanup_for_mmap() will remove them.
I'm not sure about reusing dsm_impl_mmap() and dsm_impl_windows(), but
it may be worth reusing them.
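The "preprocess once, store in a file, mmap when needed" idea discussed above can be sketched roughly as follows. This is only an illustration for POSIX platforms, not code from the patch: save_dict_image() and map_dict_image() are hypothetical helper names, and the sketch assumes the preprocessed dictionary is already a flat, pointer-free byte image.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: write an already-built, pointer-free dictionary
 * image to a file. Returns 0 on success, -1 on failure.
 */
int
save_dict_image(const char *path, const void *image, size_t size)
{
	const char *p = image;
	size_t		left = size;
	int			fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);

	if (fd < 0)
		return -1;

	/* write() may be partial, so loop until everything is written */
	while (left > 0)
	{
		ssize_t		n = write(fd, p, left);

		if (n <= 0)
		{
			close(fd);
			return -1;
		}
		p += n;
		left -= (size_t) n;
	}
	return close(fd);
}

/*
 * Hypothetical helper: map the stored image read-only. Every process that
 * maps the same file shares the same pages through the OS page cache.
 * Returns NULL on failure; on success *size receives the image size.
 */
const void *
map_dict_image(const char *path, size_t *size)
{
	struct stat st;
	void	   *addr;
	int			fd;

	assert(size != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return NULL;
	if (fstat(fd, &st) != 0 || st.st_size == 0)
	{
		close(fd);
		return NULL;
	}

	addr = mmap(NULL, (size_t) st.st_size, PROT_READ, MAP_SHARED, fd, 0);
	close(fd);					/* the mapping stays valid after close() */
	if (addr == MAP_FAILED)
		return NULL;

	*size = (size_t) st.st_size;
	return addr;
}
```

A caller would build the image once, call save_dict_image(), and later call map_dict_image() in each backend instead of re-parsing the .affix and .dict files. Windows would need a different API, as noted elsewhere in the thread.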
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
2018-03-07 12:55 GMT+01:00 Arthur Zakirov <a.zakirov@postgrespro.ru>:
On Wed, Mar 07, 2018 at 10:55:29AM +0100, Tomas Vondra wrote:
On 03/07/2018 09:55 AM, Arthur Zakirov wrote:
Hello Andres,
On Thu, Mar 01, 2018 at 08:31:49PM -0800, Andres Freund wrote:
Is there any chance we can instead convert dictionaries into a form
we can just mmap() into memory? That'd scale a lot higher and more
dynamically?
To avoid misunderstanding, could you please elaborate on using mmap()? The
DSM approach looks simpler and requires less code. Also, DSM may
use mmap() if I'm not mistaken.
I think the mmap() idea is that you preprocess the dictionary, store the
result in a file, and then mmap it when needed, without the expensive
preprocessing.
Understood. I'm not against the mmap() approach; I just didn't fully
understand its benefits. The current shared Ispell approach requires
preprocessing after a server restart, and the main advantage of mmap() here
is that it doesn't require preprocessing after a restart.
Speaking about the implementation:
It seems that the most appropriate place to store preprocessed files is
the 'pg_dynshmem' folder. The file prefix could be 'ts_dict.'; otherwise
dsm_cleanup_for_mmap() will remove them.
I'm not sure about reusing dsm_impl_mmap() and dsm_impl_windows(), but
it may be worth reusing them.
I don't think serialization to a file (mmap) makes much sense. But the
shared dictionary should be loaded every time, and should be released every
time if possible. Maybe there could be a background worker that
holds the dictionary in memory.
Regards
Pavel
On Wed, Mar 07, 2018 at 01:02:07PM +0100, Pavel Stehule wrote:
Understood. I'm not against the mmap() approach; I just didn't fully
understand its benefits. The current shared Ispell approach requires
preprocessing after a server restart, and the main advantage of mmap() here
is that it doesn't require preprocessing after a restart.
Speaking about the implementation:
It seems that the most appropriate place to store preprocessed files is
the 'pg_dynshmem' folder. The file prefix could be 'ts_dict.'; otherwise
dsm_cleanup_for_mmap() will remove them.
I'm not sure about reusing dsm_impl_mmap() and dsm_impl_windows(), but
it may be worth reusing them.
I don't think serialization to a file (mmap) makes much sense. But the
shared dictionary should be loaded every time, and should be released every
time if possible. Maybe there could be a background worker that
holds the dictionary in memory.
Do you mean that a shared dictionary should be reloaded if its .affix
and .dict files were changed? IMHO we can store their last modification
timestamps in a preprocessed file, and then rebuild the
dictionary if the files were changed.
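The timestamp idea above could look roughly like the following sketch. It is only an illustration under stated assumptions, not part of the patch: DictSourceStamp, read_source_stamp() and dict_needs_rebuild() are hypothetical names, and the stamp is assumed to be saved next to the preprocessed dictionary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical record stored alongside a preprocessed dictionary. */
typedef struct DictSourceStamp
{
	time_t		affix_mtime;	/* last modification time of the .affix file */
	time_t		dict_mtime;		/* last modification time of the .dict file */
} DictSourceStamp;

/*
 * Read the current modification times of both source files.
 * Returns false if either file cannot be examined.
 */
bool
read_source_stamp(const char *affix_path, const char *dict_path,
				  DictSourceStamp *stamp)
{
	struct stat affix_st;
	struct stat dict_st;

	assert(stamp != NULL);

	if (stat(affix_path, &affix_st) != 0 || stat(dict_path, &dict_st) != 0)
		return false;

	stamp->affix_mtime = affix_st.st_mtime;
	stamp->dict_mtime = dict_st.st_mtime;
	return true;
}

/* The dictionary is stale if either source file changed since it was built. */
bool
dict_needs_rebuild(const DictSourceStamp *saved,
				   const DictSourceStamp *current)
{
	return saved->affix_mtime != current->affix_mtime ||
		saved->dict_mtime != current->dict_mtime;
}
```

A loader would call read_source_stamp() before reusing a preprocessed dictionary and rebuild only when dict_needs_rebuild() reports a mismatch with the saved stamp.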
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
2018-03-07 13:43 GMT+01:00 Arthur Zakirov <a.zakirov@postgrespro.ru>:
On Wed, Mar 07, 2018 at 01:02:07PM +0100, Pavel Stehule wrote:
Understood. I'm not against the mmap() approach; I just didn't fully
understand its benefits. The current shared Ispell approach requires
preprocessing after a server restart, and the main advantage of mmap() here
is that it doesn't require preprocessing after a restart.
Speaking about the implementation:
It seems that the most appropriate place to store preprocessed files is
the 'pg_dynshmem' folder. The file prefix could be 'ts_dict.'; otherwise
dsm_cleanup_for_mmap() will remove them.
I'm not sure about reusing dsm_impl_mmap() and dsm_impl_windows(), but
it may be worth reusing them.
I don't think serialization to a file (mmap) makes much sense. But the
shared dictionary should be loaded every time, and should be released every
time if possible. Maybe there could be a background worker that
holds the dictionary in memory.
Do you mean that a shared dictionary should be reloaded if its .affix
and .dict files were changed? IMHO we can store their last modification
timestamps in a preprocessed file, and then rebuild the
dictionary if the files were changed.
No, that is not necessary. There should just be commands (functions) to
preload and unload a dictionary.
On Wed, Mar 07, 2018 at 01:47:25PM +0100, Pavel Stehule wrote:
Do you mean that a shared dictionary should be reloaded if its .affix
and .dict files were changed? IMHO we can store their last modification
timestamps in a preprocessed file, and then rebuild the
dictionary if the files were changed.
No, that is not necessary. There should just be commands (functions) to
preload and unload a dictionary.
Oh, understood. Tomas suggested those commands earlier too. I'll
implement them. But I think it is better to track file modification times
too, because now, without the patch, users don't have to call additional
commands to refresh their dictionaries, so without such tracking we'll
make dictionary maintenance harder.
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
2018-03-07 13:58 GMT+01:00 Arthur Zakirov <a.zakirov@postgrespro.ru>:
On Wed, Mar 07, 2018 at 01:47:25PM +0100, Pavel Stehule wrote:
Do you mean that a shared dictionary should be reloaded if its .affix
and .dict files were changed? IMHO we can store their last modification
timestamps in a preprocessed file, and then rebuild the
dictionary if the files were changed.
No, that is not necessary. There should just be commands (functions) to
preload and unload a dictionary.
Oh, understood. Tomas suggested those commands earlier too. I'll
implement them. But I think it is better to track file modification times
too, because now, without the patch, users don't have to call additional
commands to refresh their dictionaries, so without such tracking we'll
make dictionary maintenance harder.
Postgres doesn't have any subsystem based on modification time, so I don't
see introducing this sensitivity as practical.
Regards
Pavel
2018-03-07 14:10 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:
2018-03-07 13:58 GMT+01:00 Arthur Zakirov <a.zakirov@postgrespro.ru>:
On Wed, Mar 07, 2018 at 01:47:25PM +0100, Pavel Stehule wrote:
Do you mean that a shared dictionary should be reloaded if its .affix
and .dict files were changed? IMHO we can store their last modification
timestamps in a preprocessed file, and then rebuild the
dictionary if the files were changed.
No, that is not necessary. There should just be commands (functions) to
preload and unload a dictionary.
Oh, understood. Tomas suggested those commands earlier too. I'll
implement them. But I think it is better to track file modification times
too, because now, without the patch, users don't have to call additional
commands to refresh their dictionaries, so without such tracking we'll
make dictionary maintenance harder.
Postgres doesn't have any subsystem based on modification time, so I don't
see introducing this sensitivity as practical.
Usually shared dictionaries are used for full-text search in complex
languages. These dictionaries are updated less frequently than
PostgreSQL itself. The Czech dictionary has been the same for 10 years.
Regards
Pavel
On Wed, Mar 07, 2018 at 02:12:32PM +0100, Pavel Stehule wrote:
2018-03-07 14:10 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:
2018-03-07 13:58 GMT+01:00 Arthur Zakirov <a.zakirov@postgrespro.ru>:
Oh, understood. Tomas suggested those commands earlier too. I'll
implement them. But I think it is better to track file modification times
too, because now, without the patch, users don't have to call additional
commands to refresh their dictionaries, so without such tracking we'll
make dictionary maintenance harder.
Postgres doesn't have any subsystem based on modification time, so I don't
see introducing this sensitivity as practical.
Usually shared dictionaries are used for full-text search in complex
languages. These dictionaries are updated less frequently than
PostgreSQL itself. The Czech dictionary has been the same for 10 years.
Agreed. In that case auto-reloading isn't an important feature here.
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
On 03/07/2018 02:18 PM, Arthur Zakirov wrote:
On Wed, Mar 07, 2018 at 02:12:32PM +0100, Pavel Stehule wrote:
2018-03-07 14:10 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:
2018-03-07 13:58 GMT+01:00 Arthur Zakirov <a.zakirov@postgrespro.ru>:
Oh, understood. Tomas suggested those commands earlier too. I'll
implement them. But I think it is better to track file modification times
too, because now, without the patch, users don't have to call additional
commands to refresh their dictionaries, so without such tracking we'll
make dictionary maintenance harder.
Postgres doesn't have any subsystem based on modification time, so I don't
see introducing this sensitivity as practical.
Usually shared dictionaries are used for full-text search in complex
languages. These dictionaries are updated less frequently than
PostgreSQL itself. The Czech dictionary has been the same for 10 years.
Agreed. In that case auto-reloading isn't an important feature here.
Arthur, what are your plans with this patch in the current CF?
It does not seem to be moving towards RFC very much, and reworking the
patch to use mmap() seems like quite a significant change late in the
CF, which means it's likely to cause the patch to get bumped to the
next CF (2018-09).
FWIW I am not quite sure if the mmap() approach is better than what was
implemented by the patch. I'm not sure how exactly it will behave under
memory pressure (AFAIK it goes through page cache, which means random
parts of dictionaries might get evicted) or how well it is supported on
various platforms (say, Windows).
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hello Tomas,
Arthur, what are your plans with this patch in the current CF?
I think the dsm-based approach is already in good shape and works nicely.
I've planned only to improve the documentation a little. Also, it seems I
should change the 0004 part; I found that the extension upgrade scripts may be
done in the wrong way.
In my opinion, the RELOAD and UNLOAD commands can be done in the next commitfest
(2018-09).
Did you look at it? Do you have any comments on how the shared memory allocation and
release functions are done?
It does not seem to be moving towards RFC very much, and reworking the
patch to use mmap() seems like quite a significant change late in the
CF, which means it's likely to cause the patch to get bumped to the
next CF (2018-09).
Agreed. I have a draft version of the mmap-based approach which works on
platforms with mmap. On Windows it is necessary to use another API
(CreateFileMapping, etc). But this approach requires more work on handling
processed dictionary files (how to name them, when to remove them).
FWIW I am not quite sure if the mmap() approach is better than what was
implemented by the patch. I'm not sure how exactly it will behave under
memory pressure (AFAIK it goes through page cache, which means random
parts of dictionaries might get evicted) or how well it is supported on
various platforms (say, Windows).
Yes, as I wrote, the mmap-based approach requires more work. The only benefit I
see is that you don't need to process a dictionary after a server restart.
I'd vote for the dsm-based approach.
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
On 03/17/2018 05:43 AM, Arthur Zakirov wrote:
Hello Tomas,
Arthur, what are your plans with this patch in the current CF?
I think the dsm-based approach is already in good shape and works nicely.
I've planned only to improve the documentation a little. Also, it seems I
should change the 0004 part; I found that the extension upgrade scripts may be
done in the wrong way.
In my opinion, the RELOAD and UNLOAD commands can be done in the next commitfest
(2018-09).
Did you look at it? Do you have any comments on how the shared memory allocation
and release functions are done?
It does not seem to be moving towards RFC very much, and reworking the
patch to use mmap() seems like quite a significant change late in the
CF, which means it's likely to cause the patch to get bumped to the
next CF (2018-09).
Agreed. I have a draft version of the mmap-based approach which works on
platforms with mmap. On Windows it is necessary to use another API
(CreateFileMapping, etc). But this approach requires more work on
handling processed dictionary files (how to name them, when to remove them).
FWIW I am not quite sure if the mmap() approach is better than what was
implemented by the patch. I'm not sure how exactly it will behave under
memory pressure (AFAIK it goes through page cache, which means random
parts of dictionaries might get evicted) or how well it is supported on
various platforms (say, Windows).
Yes, as I wrote, the mmap-based approach requires more work. The only
benefit I see is that you don't need to process a dictionary after a
server restart. I'd vote for the dsm-based approach.
I do agree with that. We have a working well-understood dsm-based
solution, addressing the goals initially explained in this thread.
I don't see a reason to stall this patch based on a mere assumption that
the mmap-based approach might be magically better in some unknown
aspects. It might be, but we may as well leave that as a future work.
I wonder how much of this patch would be affected by the switch from dsm
to mmap? I guess the memory limit would get mostly irrelevant (mmap
would rely on the OS to page the memory in/out depending on memory
pressure), and so would the UNLOAD/RELOAD commands (because each backend
would do its own mmap).
In any case, I suggest polishing the dsm-based patch, and seeing if we can
get that one into PG11.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi,
On 2018-03-19 01:52:41 +0100, Tomas Vondra wrote:
I do agree with that. We have a working well-understood dsm-based
solution, addressing the goals initially explained in this thread.
Well, it's also awkward and manual to use. I do think that's something
we've to pay attention to.
I wonder how much of this patch would be affected by the switch from dsm
to mmap? I guess the memory limit would get mostly irrelevant (mmap
would rely on the OS to page the memory in/out depending on memory
pressure), and so would the UNLOAD/RELOAD commands (because each backend
would do its own mmap).
Those seem fairly major.
Greetings,
Andres Freund
Arthur Zakirov wrote:
I've planned only to improve the documentation a little. Also, it seems I
should change the 0004 part; I found that the extension upgrade scripts may be
done in the wrong way.
I've attached a new version of the patch. In this version I removed
0004-Update-tmplinit-arguments-v6.patch. In my opinion it handled
extension upgrades in the wrong way. If I'm not mistaken, there is currently
no way to upgrade a template's init function signature, and I didn't
find a way to change init_method(internal) to init_method(internal,
internal) within an extension's upgrade script.
Therefore I added 0002-Change-tmplinit-argument-v7.patch. Now a
DictInitData struct is passed to a template's init method. It contains
the necessary data: dictoptions and dictid, so there is no need to change
the method's signature.
The other parts of the patch are the same, except that they now use the
DictInitData structure.
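To illustrate why bundling the arguments into one struct keeps the init method's signature stable, here is a standalone sketch. It does not use PostgreSQL headers: Oid, DictOption, DemoDict and demo_dict_init() are simplified stand-ins invented for this example, and the DictInitData layout shown is only an assumption based on the description above (the real struct holds a List of DefElem options).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef unsigned int Oid;		/* stand-in for PostgreSQL's Oid type */

/* Stand-in for one dictionary option (the real code uses DefElem). */
typedef struct DictOption
{
	const char *name;
	const char *value;
} DictOption;

/* Assumed shape: everything an init method needs, behind one pointer. */
typedef struct DictInitData
{
	const DictOption *dictoptions;	/* options given to the dictionary */
	size_t		noptions;			/* number of entries in dictoptions */
	Oid			dictid;				/* OID of the dictionary being built */
} DictInitData;

/* Hypothetical dictionary state produced by the init method. */
typedef struct DemoDict
{
	Oid			dictid;
	int			maxlen;
} DemoDict;

/*
 * The init method still takes a single pointer argument. Adding a field
 * to DictInitData later does not change this signature.
 */
DemoDict
demo_dict_init(const DictInitData *init_data)
{
	DemoDict	d;
	size_t		i;

	assert(init_data != NULL);

	d.dictid = init_data->dictid;
	d.maxlen = 6;				/* default when no option is given */

	for (i = 0; i < init_data->noptions; i++)
	{
		if (strcmp(init_data->dictoptions[i].name, "maxlen") == 0)
			d.maxlen = atoi(init_data->dictoptions[i].value);
	}
	return d;
}
```

Because the SQL-level declaration stays init_method(internal), existing extensions would need no upgrade script for the signature; only their C code changes to read the options through the struct.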
On Mon, Mar 19, 2018 at 01:52:41AM +0100, Tomas Vondra wrote:
I wonder how much of this patch would be affected by the switch from dsm
to mmap? I guess the memory limit would get mostly irrelevant (mmap
would rely on the OS to page the memory in/out depending on memory
pressure), and so would the UNLOAD/RELOAD commands (because each backend
would do its own mmap).
I believe mmap requires completely rewriting the 0003 part of the patch and
small changes in 0005.
In any case, I suggest polishing the dsm-based patch, and seeing if we can
get that one into PG11.
Yes, we have more time in future commitfests if the dsm-based patch isn't
approved.
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Attachments:
0001-Fix-ispell-memory-handling-v7.patch (text/plain; charset=us-ascii)
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index b9fdd77e19..e071994523 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -78,6 +78,8 @@
#define tmpalloc(sz) MemoryContextAlloc(Conf->buildCxt, (sz))
#define tmpalloc0(sz) MemoryContextAllocZero(Conf->buildCxt, (sz))
+#define tmpstrdup(str) MemoryContextStrdup(Conf->buildCxt, (str))
+
/*
* Prepare for constructing an ISpell dictionary.
*
@@ -498,7 +500,7 @@ NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
strcpy(Conf->Spell[Conf->nspell]->word, word);
Conf->Spell[Conf->nspell]->p.flag = (*flag != '\0')
- ? cpstrdup(Conf, flag) : VoidString;
+ ? tmpstrdup(flag) : VoidString;
Conf->nspell++;
}
@@ -1040,7 +1042,7 @@ setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
entry->flag.i = i;
}
else
- entry->flag.s = cpstrdup(Conf, s);
+ entry->flag.s = tmpstrdup(s);
entry->flagMode = Conf->flagMode;
entry->value = val;
@@ -1536,6 +1538,9 @@ nextline:
return;
isnewformat:
+ pfree(recoded);
+ pfree(pstr);
+
if (oldformat)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
0002-Change-tmplinit-argument-v7.patch (text/plain; charset=us-ascii)
diff --git a/contrib/dict_int/dict_int.c b/contrib/dict_int/dict_int.c
index 56ede37089..e11d1129e9 100644
--- a/contrib/dict_int/dict_int.c
+++ b/contrib/dict_int/dict_int.c
@@ -30,7 +30,7 @@ PG_FUNCTION_INFO_V1(dintdict_lexize);
Datum
dintdict_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictInt *d;
ListCell *l;
@@ -38,7 +38,7 @@ dintdict_init(PG_FUNCTION_ARGS)
d->maxlen = 6;
d->rejectlong = false;
- foreach(l, dictoptions)
+ foreach(l, init_data->dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/contrib/dict_xsyn/dict_xsyn.c b/contrib/dict_xsyn/dict_xsyn.c
index a79ece240c..c3146bae3c 100644
--- a/contrib/dict_xsyn/dict_xsyn.c
+++ b/contrib/dict_xsyn/dict_xsyn.c
@@ -140,7 +140,7 @@ read_dictionary(DictSyn *d, const char *filename)
Datum
dxsyn_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSyn *d;
ListCell *l;
char *filename = NULL;
@@ -153,7 +153,7 @@ dxsyn_init(PG_FUNCTION_ARGS)
d->matchsynonyms = false;
d->keepsynonyms = true;
- foreach(l, dictoptions)
+ foreach(l, init_data->dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/contrib/unaccent/unaccent.c b/contrib/unaccent/unaccent.c
index 247c202755..2e66331ed8 100644
--- a/contrib/unaccent/unaccent.c
+++ b/contrib/unaccent/unaccent.c
@@ -267,12 +267,12 @@ PG_FUNCTION_INFO_V1(unaccent_init);
Datum
unaccent_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
TrieChar *rootTrie = NULL;
bool fileloaded = false;
ListCell *l;
- foreach(l, dictoptions)
+ foreach(l, init_data->dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/commands/tsearchcmds.c b/src/backend/commands/tsearchcmds.c
index 3a843512d1..967fe5a6f4 100644
--- a/src/backend/commands/tsearchcmds.c
+++ b/src/backend/commands/tsearchcmds.c
@@ -386,17 +386,22 @@ verify_dictoptions(Oid tmplId, List *dictoptions)
}
else
{
+ DictInitData init_data;
+
/*
* Copy the options just in case init method thinks it can scribble on
* them ...
*/
dictoptions = copyObject(dictoptions);
+ init_data.dictoptions = dictoptions;
+ init_data.dictid = InvalidOid;
+
/*
* Call the init method and see if it complains. We don't worry about
* it leaking memory, since our command will soon be over anyway.
*/
- (void) OidFunctionCall1(initmethod, PointerGetDatum(dictoptions));
+ (void) OidFunctionCall1(initmethod, PointerGetDatum(&init_data));
}
ReleaseSysCache(tup);
diff --git a/src/backend/snowball/dict_snowball.c b/src/backend/snowball/dict_snowball.c
index 78c9f73ef0..db12606fdd 100644
--- a/src/backend/snowball/dict_snowball.c
+++ b/src/backend/snowball/dict_snowball.c
@@ -181,14 +181,14 @@ locate_stem_module(DictSnowball *d, const char *lang)
Datum
dsnowball_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSnowball *d;
bool stoploaded = false;
ListCell *l;
d = (DictSnowball *) palloc0(sizeof(DictSnowball));
- foreach(l, dictoptions)
+ foreach(l, init_data->dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index edc6547700..6d0dedbefb 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -29,7 +29,7 @@ typedef struct
Datum
dispell_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictISpell *d;
bool affloaded = false,
dictloaded = false,
@@ -40,7 +40,7 @@ dispell_init(PG_FUNCTION_ARGS)
NIStartBuild(&(d->obj));
- foreach(l, dictoptions)
+ foreach(l, init_data->dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_simple.c b/src/backend/tsearch/dict_simple.c
index ac6a24eba5..80f2d1535d 100644
--- a/src/backend/tsearch/dict_simple.c
+++ b/src/backend/tsearch/dict_simple.c
@@ -29,7 +29,7 @@ typedef struct
Datum
dsimple_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSimple *d = (DictSimple *) palloc0(sizeof(DictSimple));
bool stoploaded = false,
acceptloaded = false;
@@ -37,7 +37,7 @@ dsimple_init(PG_FUNCTION_ARGS)
d->accept = true; /* default */
- foreach(l, dictoptions)
+ foreach(l, init_data->dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_synonym.c b/src/backend/tsearch/dict_synonym.c
index c011886cb0..29f86472a4 100644
--- a/src/backend/tsearch/dict_synonym.c
+++ b/src/backend/tsearch/dict_synonym.c
@@ -91,7 +91,7 @@ compareSyn(const void *a, const void *b)
Datum
dsynonym_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSyn *d;
ListCell *l;
char *filename = NULL;
@@ -104,7 +104,7 @@ dsynonym_init(PG_FUNCTION_ARGS)
char *line = NULL;
uint16 flags = 0;
- foreach(l, dictoptions)
+ foreach(l, init_data->dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_thesaurus.c b/src/backend/tsearch/dict_thesaurus.c
index 24364e646d..7f87ed1c97 100644
--- a/src/backend/tsearch/dict_thesaurus.c
+++ b/src/backend/tsearch/dict_thesaurus.c
@@ -604,7 +604,7 @@ compileTheSubstitute(DictThesaurus *d)
Datum
thesaurus_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictThesaurus *d;
char *subdictname = NULL;
bool fileloaded = false;
@@ -612,7 +612,7 @@ thesaurus_init(PG_FUNCTION_ARGS)
d = (DictThesaurus *) palloc0(sizeof(DictThesaurus));
- foreach(l, dictoptions)
+ foreach(l, init_data->dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/utils/cache/ts_cache.c b/src/backend/utils/cache/ts_cache.c
index 3d5c194148..adb9c60b72 100644
--- a/src/backend/utils/cache/ts_cache.c
+++ b/src/backend/utils/cache/ts_cache.c
@@ -39,6 +39,7 @@
#include "catalog/pg_ts_template.h"
#include "commands/defrem.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_public.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
#include "utils/fmgroids.h"
@@ -314,6 +315,7 @@ lookup_ts_dictionary_cache(Oid dictId)
if (OidIsValid(template->tmplinit))
{
+ DictInitData init_data;
List *dictoptions;
Datum opt;
bool isnull;
@@ -333,9 +335,12 @@ lookup_ts_dictionary_cache(Oid dictId)
else
dictoptions = deserialize_deflist(opt);
+ init_data.dictoptions = dictoptions;
+ init_data.dictid = dictId;
+
entry->dictData =
DatumGetPointer(OidFunctionCall1(template->tmplinit,
- PointerGetDatum(dictoptions)));
+ PointerGetDatum(&init_data)));
MemoryContextSwitchTo(oldcontext);
}
diff --git a/src/include/tsearch/ts_public.h b/src/include/tsearch/ts_public.h
index 0b7a5aa68e..723862981d 100644
--- a/src/include/tsearch/ts_public.h
+++ b/src/include/tsearch/ts_public.h
@@ -13,6 +13,7 @@
#ifndef _PG_TS_PUBLIC_H_
#define _PG_TS_PUBLIC_H_
+#include "nodes/pg_list.h"
#include "tsearch/ts_type.h"
/*
@@ -84,6 +85,19 @@ extern bool searchstoplist(StopList *s, char *key);
* Interface with dictionaries
*/
+/*
+ * Argument which is passed to a template's init method.
+ */
+typedef struct
+{
+ /*
+ * A dictionary option list for a template's init method. Should go first
+ * for backward compatibility.
+ */
+ List *dictoptions;
+ Oid dictid;
+} DictInitData;
+
/* return struct for any lexize function */
typedef struct
{
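To make the intent of the 0002 patch easier to follow, here is a small standalone sketch of how an init method might consume the new argument. It does not use PostgreSQL headers: Oid, InvalidOid and the Option list are simplified stand-ins for the real Oid, DefElem and List types, and find_option() and can_use_shared_memory() are hypothetical helpers, not functions from the patch. The one behaviour taken from the patches is that verify_dictoptions() passes InvalidOid as dictid, while lookup_ts_dictionary_cache() passes the real dictionary OID.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for PostgreSQL's Oid and InvalidOid */
typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)

/* Simplified stand-in for a List of DefElem nodes */
typedef struct Option
{
	const char *name;
	const char *value;
	struct Option *next;
} Option;

/* Same shape as DictInitData in the 0002 patch, using the stand-in list */
typedef struct
{
	Option	   *dictoptions;	/* dictionary options, as passed before */
	Oid			dictid;			/* InvalidOid when only validating options */
} DictInitData;

/* Hypothetical helper: return the value of the named option, or NULL */
const char *
find_option(const DictInitData *init_data, const char *name)
{
	const Option *opt;

	assert(init_data != NULL && name != NULL);

	for (opt = init_data->dictoptions; opt != NULL; opt = opt->next)
	{
		if (strcmp(opt->name, name) == 0)
			return opt->value;
	}
	return NULL;
}

/*
 * Hypothetical helper: a dictionary can only be keyed in a shared table
 * when the caller supplied a valid OID.
 */
int
can_use_shared_memory(const DictInitData *init_data)
{
	return init_data->dictid != InvalidOid;
}
```

An init method called from option verification would see can_use_shared_memory() return 0 and build the dictionary in local memory, which matches the early-return branch in ts_dict_shmem_location() in the 0003 patch.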
0003-Retreive-shared-location-for-dict-v7.patch (text/plain; charset=us-ascii)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index f18d2b3353..6862d5eef9 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1425,6 +1425,35 @@ include_dir 'conf.d'
</listitem>
</varlistentry>
+ <varlistentry id="guc-max-shared-dictionaries-size" xreflabel="max_shared_dictionaries_size">
+ <term><varname>max_shared_dictionaries_size</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>max_shared_dictionaries_size</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum size of all text search dictionaries loaded into shared
+ memory. The default is 100 megabytes (<literal>100MB</literal>). This
+ parameter can only be set at server start.
+ </para>
+
+ <para>
+ Currently controls only loading of <application>Ispell</application>
+ dictionaries (see <xref linkend="textsearch-ispell-dictionary"/>).
+ After a dictionary is compiled, it is copied into shared memory. Other
+ backends then use it from shared memory on first access, so they do not
+ need to compile the dictionary a second time.
+ </para>
+
+ <para>
+ If the total size of simultaneously loaded dictionaries reaches the maximum
+ allowed size, a new dictionary is loaded into the local memory of the
+ backend instead.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-huge-pages" xreflabel="huge_pages">
<term><varname>huge_pages</varname> (<type>enum</type>)
<indexterm>
diff --git a/src/backend/commands/tsearchcmds.c b/src/backend/commands/tsearchcmds.c
index 967fe5a6f4..742ff58c72 100644
--- a/src/backend/commands/tsearchcmds.c
+++ b/src/backend/commands/tsearchcmds.c
@@ -39,6 +39,7 @@
#include "nodes/makefuncs.h"
#include "parser/parse_func.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_shared.h"
#include "tsearch/ts_utils.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
@@ -518,6 +519,8 @@ RemoveTSDictionaryById(Oid dictId)
CatalogTupleDelete(relation, &tup->t_self);
+ ts_dict_shmem_release(dictId);
+
ReleaseSysCache(tup);
heap_close(relation, RowExclusiveLock);
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 0c86a581c0..c7dce8cac5 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -44,6 +44,7 @@
#include "storage/procsignal.h"
#include "storage/sinvaladt.h"
#include "storage/spin.h"
+#include "tsearch/ts_shared.h"
#include "utils/backend_random.h"
#include "utils/snapmgr.h"
@@ -150,6 +151,7 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
size = add_size(size, SyncScanShmemSize());
size = add_size(size, AsyncShmemSize());
size = add_size(size, BackendRandomShmemSize());
+ size = add_size(size, TsearchShmemSize());
#ifdef EXEC_BACKEND
size = add_size(size, ShmemBackendArraySize());
#endif
@@ -271,6 +273,11 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
AsyncShmemInit();
BackendRandomShmemInit();
+ /*
+ * Set up shared memory for tsearch
+ */
+ TsearchShmemInit();
+
#ifdef EXEC_BACKEND
/*
diff --git a/src/backend/tsearch/Makefile b/src/backend/tsearch/Makefile
index 227468ae9e..860cd196e9 100644
--- a/src/backend/tsearch/Makefile
+++ b/src/backend/tsearch/Makefile
@@ -26,7 +26,7 @@ DICTFILES_PATH=$(addprefix dicts/,$(DICTFILES))
OBJS = ts_locale.o ts_parse.o wparser.o wparser_def.o dict.o \
dict_simple.o dict_synonym.o dict_thesaurus.o \
dict_ispell.o regis.o spell.o \
- to_tsany.o ts_selfuncs.o ts_typanalyze.o ts_utils.o
+ to_tsany.o ts_selfuncs.o ts_shared.o ts_typanalyze.o ts_utils.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/tsearch/ts_shared.c b/src/backend/tsearch/ts_shared.c
new file mode 100644
index 0000000000..bfc52923e0
--- /dev/null
+++ b/src/backend/tsearch/ts_shared.c
@@ -0,0 +1,367 @@
+/*-------------------------------------------------------------------------
+ *
+ * ts_shared.c
+ * tsearch shared memory management
+ *
+ * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/tsearch/ts_shared.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "lib/dshash.h"
+#include "storage/lwlock.h"
+#include "storage/shmem.h"
+#include "tsearch/ts_shared.h"
+#include "utils/hashutils.h"
+#include "utils/memutils.h"
+
+
+/*
+ * Hash table structures
+ */
+typedef struct
+{
+ Oid dict_id;
+ dsm_handle dict_dsm;
+ Size dict_size;
+
+ /* How many backends have DSM mapping */
+ uint32 refcnt;
+} TsearchDictEntry;
+
+static dshash_table *dict_table = NULL;
+
+/*
+ * Shared struct for locking
+ */
+typedef struct
+{
+ dsa_handle area;
+ dshash_table_handle dict_table_handle;
+
+ /* Total size in bytes of dictionaries loaded into shared memory */
+ Size loaded_size;
+
+ LWLock lock;
+} TsearchCtlData;
+
+static TsearchCtlData *tsearch_ctl;
+
+/*
+ * GUC variable for the maximum total size of shared dictionaries, in
+ * kilobytes. Default value is 100MB.
+ */
+int max_shared_dictionaries_size = 100 * 1024;
+
+static void init_dict_table(void);
+
+/* Parameters for dict_table */
+static const dshash_parameters dict_table_params ={
+ sizeof(Oid),
+ sizeof(TsearchDictEntry),
+ dshash_memcmp,
+ dshash_memhash,
+ LWTRANCHE_TSEARCH_TABLE
+};
+
+/*
+ * Build the dictionary using the allocate_cb callback. If there is space in
+ * shared memory and max_shared_dictionaries_size is greater than 0, copy the
+ * dictionary into DSM.
+ *
+ * If max_shared_dictionaries_size is greater than 0 then try to find the
+ * dictionary in shared hash table first. If it was built by someone earlier
+ * just return its location in DSM.
+ *
+ * initoptions: an argument used within a template's init method.
+ * allocate_cb: function to build the dictionary, if it wasn't found in DSM.
+ *
+ * Returns address in the dynamic shared memory segment or in backend memory.
+ */
+void *
+ts_dict_shmem_location(DictInitData *initoptions,
+ ispell_build_callback allocate_cb)
+{
+ TsearchDictEntry *entry;
+ bool found;
+ dsm_segment *seg;
+ void *dict,
+ *dict_location;
+
+#define CHECK_SHARED_SPACE() \
+ if (entry->dict_size + tsearch_ctl->loaded_size > \
+ max_shared_dictionaries_size * 1024L) \
+ { \
+ LWLockRelease(&tsearch_ctl->lock); \
+ ereport(LOG, \
+ (errmsg("there is no space in shared memory for text search " \
+ "dictionary %u, it will be loaded into backend's memory", \
+ initoptions->dictid))); \
+ dshash_delete_entry(dict_table, entry); \
+ return dict; \
+ } \
+
+ init_dict_table();
+
+ /*
+ * Build the dictionary in backend's memory if a hash table wasn't created
+ * or dictid is invalid (it may happen if the dictionary's init method was
+ * called within verify_dictoptions()).
+ */
+ if (!DsaPointerIsValid(tsearch_ctl->dict_table_handle) ||
+ !OidIsValid(initoptions->dictid))
+ {
+ Size dict_size;
+
+ dict = allocate_cb(initoptions->dictoptions, &dict_size);
+
+ return dict;
+ }
+
+ /* Try to find an entry in the hash table */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, &initoptions->dictid,
+ false);
+
+ if (entry)
+ {
+ seg = dsm_find_mapping(entry->dict_dsm);
+ if (!seg)
+ {
+ seg = dsm_attach(entry->dict_dsm);
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+ }
+
+ entry->refcnt++;
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+ }
+
+ /* Dictionary hasn't been loaded into memory yet */
+ entry = (TsearchDictEntry *) dshash_find_or_insert(dict_table,
+ &initoptions->dictid,
+ &found);
+
+ if (found)
+ {
+ /*
+ * Someone concurrently inserted a dictionary entry since the first time
+ * we checked.
+ */
+ seg = dsm_attach(entry->dict_dsm);
+
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+
+ entry->refcnt++;
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+ }
+
+ /* Build the dictionary */
+ dict = allocate_cb(initoptions->dictoptions, &entry->dict_size);
+
+ LWLockAcquire(&tsearch_ctl->lock, LW_SHARED);
+
+ /* Before allocating a DSM segment check remaining shared space */
+ Assert(max_shared_dictionaries_size);
+
+ CHECK_SHARED_SPACE();
+
+ LWLockRelease(&tsearch_ctl->lock);
+ /* If we come here, we need an exclusive lock */
+ while (!LWLockAcquireOrWait(&tsearch_ctl->lock, LW_EXCLUSIVE))
+ {
+ /*
+ * Check again in case the space ran out while we were waiting for the
+ * exclusive lock.
+ */
+ CHECK_SHARED_SPACE();
+ }
+
+ tsearch_ctl->loaded_size += entry->dict_size;
+
+ LWLockRelease(&tsearch_ctl->lock);
+
+ /* Finally, allocate a DSM segment for the compiled dictionary */
+ seg = dsm_create(entry->dict_size, 0);
+ dict_location = dsm_segment_address(seg);
+ memcpy(dict_location, dict, entry->dict_size);
+
+ pfree(dict);
+
+ entry->dict_id = initoptions->dictid;
+ entry->dict_dsm = dsm_segment_handle(seg);
+ entry->refcnt++;
+
+ /* Remain attached until end of postmaster */
+ dsm_pin_segment(seg);
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+}
+
+/*
+ * Release memory occupied by the dictionary. The function just unpins the DSM
+ * mapping. If no other backend maps this segment, the segment is unpinned too.
+ *
+ * dictid: Oid of the dictionary.
+ */
+void
+ts_dict_shmem_release(Oid dictid)
+{
+ TsearchDictEntry *entry;
+
+ /*
+ * If we didn't attach to a hash table then do nothing.
+ */
+ if (!dict_table)
+ return;
+
+ /* Try to find an entry in the hash table */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, &dictid, true);
+
+ if (entry)
+ {
+ dsm_segment *seg;
+
+ seg = dsm_find_mapping(entry->dict_dsm);
+ /*
+ * If current backend didn't pin a mapping then we don't need to do
+ * unpinning.
+ */
+ if (!seg)
+ {
+ dshash_release_lock(dict_table, entry);
+ return;
+ }
+
+ dsm_unpin_mapping(seg);
+ dsm_detach(seg);
+
+ entry->refcnt--;
+
+ if (entry->refcnt == 0)
+ {
+ dsm_unpin_segment(entry->dict_dsm);
+ dshash_delete_entry(dict_table, entry);
+ }
+ else
+ dshash_release_lock(dict_table, entry);
+ }
+}
+
+/*
+ * Allocate and initialize tsearch-related shared memory.
+ */
+void
+TsearchShmemInit(void)
+{
+ bool found;
+
+ tsearch_ctl = (TsearchCtlData *)
+ ShmemInitStruct("Full Text Search Ctl", sizeof(TsearchCtlData), &found);
+
+ if (!found)
+ {
+ LWLockRegisterTranche(LWTRANCHE_TSEARCH_DSA, "tsearch_dsa");
+ LWLockRegisterTranche(LWTRANCHE_TSEARCH_TABLE, "tsearch_table");
+
+ LWLockInitialize(&tsearch_ctl->lock, LWTRANCHE_TSEARCH_DSA);
+
+ tsearch_ctl->area = DSM_HANDLE_INVALID;
+ tsearch_ctl->dict_table_handle = InvalidDsaPointer;
+ tsearch_ctl->loaded_size = 0;
+ }
+}
+
+/*
+ * Report shared memory space needed by TsearchShmemInit.
+ */
+Size
+TsearchShmemSize(void)
+{
+ Size size = 0;
+
+ /* size of service structure */
+ size = add_size(size, MAXALIGN(sizeof(TsearchCtlData)));
+
+ return size;
+}
+
+/*
+ * Initialize hash table located in DSM.
+ *
+ * The hash table should be created and initialized iff
+ * max_shared_dictionaries_size GUC is greater than zero and it doesn't exist
+ * yet.
+ */
+static void
+init_dict_table(void)
+{
+ MemoryContext old_context;
+ dsa_area *dsa;
+
+ if (max_shared_dictionaries_size == 0)
+ return;
+
+ if (dict_table)
+ return;
+
+ old_context = MemoryContextSwitchTo(TopMemoryContext);
+
+recheck_table:
+ LWLockAcquire(&tsearch_ctl->lock, LW_SHARED);
+
+ /* Hash table has already been created by another backend */
+ if (DsaPointerIsValid(tsearch_ctl->dict_table_handle))
+ {
+ Assert(tsearch_ctl->area != DSM_HANDLE_INVALID);
+
+ dsa = dsa_attach(tsearch_ctl->area);
+
+ dict_table = dshash_attach(dsa,
+ &dict_table_params,
+ tsearch_ctl->dict_table_handle,
+ NULL);
+ }
+ else
+ {
+ /* Try to get exclusive lock */
+ LWLockRelease(&tsearch_ctl->lock);
+ if (!LWLockAcquireOrWait(&tsearch_ctl->lock, LW_EXCLUSIVE))
+ {
+ /*
+ * The lock was released by another backend, and that backend
+ * has concurrently created the hash table already.
+ */
+ goto recheck_table;
+ }
+
+ dsa = dsa_create(LWTRANCHE_TSEARCH_DSA);
+ tsearch_ctl->area = dsa_get_handle(dsa);
+
+ dict_table = dshash_create(dsa, &dict_table_params, NULL);
+ tsearch_ctl->dict_table_handle = dshash_get_hash_table_handle(dict_table);
+
+ /* Remain attached until end of postmaster */
+ dsa_pin(dsa);
+ }
+
+ LWLockRelease(&tsearch_ctl->lock);
+
+ /* Remain attached until end of session */
+ dsa_pin_mapping(dsa);
+
+ MemoryContextSwitchTo(old_context);
+}
diff --git a/src/backend/utils/cache/ts_cache.c b/src/backend/utils/cache/ts_cache.c
index adb9c60b72..aed3395075 100644
--- a/src/backend/utils/cache/ts_cache.c
+++ b/src/backend/utils/cache/ts_cache.c
@@ -40,6 +40,7 @@
#include "commands/defrem.h"
#include "tsearch/ts_cache.h"
#include "tsearch/ts_public.h"
+#include "tsearch/ts_shared.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
#include "utils/fmgroids.h"
@@ -99,7 +100,16 @@ InvalidateTSCacheCallBack(Datum arg, int cacheid, uint32 hashvalue)
hash_seq_init(&status, hash);
while ((entry = (TSAnyCacheEntry *) hash_seq_search(&status)) != NULL)
+ {
+ if (entry->isvalid && hash == TSDictionaryCacheHash)
+ {
+ TSDictionaryCacheEntry *dict_entry = (TSDictionaryCacheEntry *) entry;
+
+ ts_dict_shmem_release(dict_entry->dictId);
+ }
+
entry->isvalid = false;
+ }
/* Also invalidate the current-config cache if it's pg_ts_config */
if (hash == TSConfigCacheHash)
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 7a7ac479c1..172627a94b 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -76,6 +76,7 @@
#include "storage/predicate.h"
#include "tcop/tcopprot.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_shared.h"
#include "utils/builtins.h"
#include "utils/bytea.h"
#include "utils/guc_tables.h"
@@ -2932,6 +2933,20 @@ static struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"max_shared_dictionaries_size", PGC_POSTMASTER, RESOURCES_MEM,
+ gettext_noop("Sets the maximum size of all text search dictionaries loaded into shared memory."),
+ gettext_noop("Currently controls only loading of Ispell dictionaries. "
+ "If total size of simultaneously loaded dictionaries "
+ "reaches the maximum allowed size then a new dictionary "
+ "will be loaded into local memory of a backend."),
+ GUC_UNIT_KB,
+ },
+ &max_shared_dictionaries_size,
+ 100 * 1024, 0, MAX_KILOBYTES,
+ NULL, NULL, NULL
+ },
+
/* End-of-list marker */
{
{NULL, 0, 0, NULL, NULL}, NULL, 0, 0, 0, NULL, NULL, NULL
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 048bf4cccd..10cdb656be 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -135,6 +135,7 @@
# mmap
# use none to disable dynamic shared memory
# (change requires restart)
+#max_shared_dictionaries_size = 100MB # (change requires restart)
# - Disk -
diff --git a/src/include/storage/lwlock.h b/src/include/storage/lwlock.h
index c21bfe2f66..16b0858eda 100644
--- a/src/include/storage/lwlock.h
+++ b/src/include/storage/lwlock.h
@@ -219,6 +219,8 @@ typedef enum BuiltinTrancheIds
LWTRANCHE_SHARED_TUPLESTORE,
LWTRANCHE_TBM,
LWTRANCHE_PARALLEL_APPEND,
+ LWTRANCHE_TSEARCH_DSA,
+ LWTRANCHE_TSEARCH_TABLE,
LWTRANCHE_FIRST_USER_DEFINED
} BuiltinTrancheIds;
diff --git a/src/include/tsearch/ts_shared.h b/src/include/tsearch/ts_shared.h
new file mode 100644
index 0000000000..7a8ca80554
--- /dev/null
+++ b/src/include/tsearch/ts_shared.h
@@ -0,0 +1,31 @@
+/*-------------------------------------------------------------------------
+ *
+ * ts_shared.h
+ * tsearch shared memory management
+ *
+ * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
+ *
+ * src/include/tsearch/ts_shared.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef TS_SHARED_H
+#define TS_SHARED_H
+
+#include "tsearch/ts_public.h"
+
+/*
+ * GUC variable for the maximum total size of shared dictionaries (in kB)
+ */
+extern int max_shared_dictionaries_size;
+
+typedef void *(*ispell_build_callback) (List *dictoptions, Size *size);
+
+extern void *ts_dict_shmem_location(DictInitData *initoptions,
+ ispell_build_callback allocate_cb);
+extern void ts_dict_shmem_release(Oid dictid);
+
+extern void TsearchShmemInit(void);
+extern Size TsearchShmemSize(void);
+
+#endif /* TS_SHARED_H */
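The size check performed by CHECK_SHARED_SPACE() in the 0003 patch can be illustrated with a small standalone sketch. This is not the patch's code: SharedDictCtl and reserve_shared_space() are hypothetical names, locking is left out entirely, and the check and the update are folded into one step, which a real backend would need to do while holding a single exclusive lock. The limit is given in kilobytes, like the max_shared_dictionaries_size GUC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the accounting part of TsearchCtlData */
typedef struct
{
	size_t		loaded_size;	/* bytes already accounted to shared dictionaries */
} SharedDictCtl;

/*
 * Try to account dict_size bytes against a limit of max_kb kilobytes.
 *
 * Returns true and increases loaded_size if the dictionary fits. Returns
 * false and leaves loaded_size untouched otherwise; the caller would then
 * keep the dictionary in backend-local memory.
 */
bool
reserve_shared_space(SharedDictCtl *ctl, size_t dict_size, size_t max_kb)
{
	size_t		limit = max_kb * 1024;

	assert(ctl != NULL);

	/* Same test as loaded_size + dict_size > limit, written to avoid overflow */
	if (dict_size > limit || ctl->loaded_size > limit - dict_size)
		return false;

	ctl->loaded_size += dict_size;
	return true;
}
```

For example, with a 100kB limit and 60kB already loaded, a 50kB dictionary is rejected and stays local, while a 40kB dictionary still fits exactly.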
0004-Store-ispell-in-shared-location-v7.patch (text/plain; charset=us-ascii)
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index 610b7bf033..82afe201f8 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -3030,6 +3030,23 @@ CREATE TEXT SEARCH DICTIONARY english_stem (
</sect2>
+ <sect2 id="textsearch-shared-dictionaries">
+ <title>Dictionaries in Shared Memory</title>
+
+ <para>
+ Some dictionaries, especially <application>Ispell</application>, consume a
+ noticeable amount of memory. The size of a dictionary can reach tens of megabytes.
+ Most of them also store their configuration in text files. A dictionary is compiled
+ on first access in each user session.
+ </para>
+
+ <para>
+ To store dictionaries in shared memory, set the <xref linkend="guc-max-shared-dictionaries-size"/>
+ parameter to a value greater than zero before starting the server.
+ </para>
+
+ </sect2>
+
</sect1>
<sect1 id="textsearch-configuration">
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index 6d0dedbefb..f8ab16d825 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -5,6 +5,15 @@
*
* Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
*
+ * By default all Ispell dictionaries are stored in DSM. But if the total size
+ * of loaded dictionaries reaches the maximum allowed value, a dictionary is
+ * allocated within its own memory context (dictCtx).
+ *
+ * All necessary data is built within the dispell_build() function. But
+ * structures for regular expressions are compiled on first demand and
+ * stored in the AffixReg array. This is because regex_t and Regis cannot be
+ * stored in shared memory.
+ *
*
* IDENTIFICATION
* src/backend/tsearch/dict_ispell.c
@@ -14,8 +23,10 @@
#include "postgres.h"
#include "commands/defrem.h"
+#include "storage/dsm.h"
#include "tsearch/dicts/spell.h"
#include "tsearch/ts_locale.h"
+#include "tsearch/ts_shared.h"
#include "tsearch/ts_utils.h"
#include "utils/builtins.h"
@@ -26,54 +37,126 @@ typedef struct
IspellDict obj;
} DictISpell;
+static void parse_dictoptions(List *dictoptions,
+ char **dictfile, char **afffile, char **stopfile);
+static void *dispell_build(List *dictoptions, Size *size);
+
Datum
dispell_init(PG_FUNCTION_ARGS)
{
DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictISpell *d;
- bool affloaded = false,
- dictloaded = false,
- stoploaded = false;
- ListCell *l;
+ void *dict_location;
+ char *stopfile;
d = (DictISpell *) palloc0(sizeof(DictISpell));
- NIStartBuild(&(d->obj));
+ parse_dictoptions(init_data->dictoptions, NULL, NULL, &stopfile);
+
+ if (stopfile)
+ readstoplist(stopfile, &(d->stoplist), lowerstr);
+
+ dict_location = ts_dict_shmem_location(init_data, dispell_build);
+ Assert(dict_location);
+
+ d->obj.dict = (IspellDictData *) dict_location;
+ d->obj.reg = (AffixReg *) palloc0(d->obj.dict->nAffix *
+ sizeof(AffixReg));
+ /* Current memory context is dictionary's private memory context */
+ d->obj.dictCtx = CurrentMemoryContext;
+
+ PG_RETURN_POINTER(d);
+}
+
+Datum
+dispell_lexize(PG_FUNCTION_ARGS)
+{
+ DictISpell *d = (DictISpell *) PG_GETARG_POINTER(0);
+ char *in = (char *) PG_GETARG_POINTER(1);
+ int32 len = PG_GETARG_INT32(2);
+ char *txt;
+ TSLexeme *res;
+ TSLexeme *ptr,
+ *cptr;
+
+ if (len <= 0)
+ PG_RETURN_POINTER(NULL);
+
+ txt = lowerstr_with_len(in, len);
+ res = NINormalizeWord(&(d->obj), txt);
+
+ if (res == NULL)
+ PG_RETURN_POINTER(NULL);
+
+ cptr = res;
+ for (ptr = cptr; ptr->lexeme; ptr++)
+ {
+ if (searchstoplist(&(d->stoplist), ptr->lexeme))
+ {
+ pfree(ptr->lexeme);
+ ptr->lexeme = NULL;
+ }
+ else
+ {
+ if (cptr != ptr)
+ memcpy(cptr, ptr, sizeof(TSLexeme));
+ cptr++;
+ }
+ }
+ cptr->lexeme = NULL;
+
+ PG_RETURN_POINTER(res);
+}
+
+static void
+parse_dictoptions(List *dictoptions, char **dictfile, char **afffile,
+ char **stopfile)
+{
+ ListCell *l;
- foreach(l, init_data->dictoptions)
+ if (dictfile)
+ *dictfile = NULL;
+ if (afffile)
+ *afffile = NULL;
+ if (stopfile)
+ *stopfile = NULL;
+
+ foreach(l, dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
if (strcmp(defel->defname, "dictfile") == 0)
{
- if (dictloaded)
+ if (!dictfile)
+ continue;
+
+ if (*dictfile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple DictFile parameters")));
- NIImportDictionary(&(d->obj),
- get_tsearch_config_filename(defGetString(defel),
- "dict"));
- dictloaded = true;
+ *dictfile = get_tsearch_config_filename(defGetString(defel), "dict");
}
else if (strcmp(defel->defname, "afffile") == 0)
{
- if (affloaded)
+ if (!afffile)
+ continue;
+
+ if (*afffile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple AffFile parameters")));
- NIImportAffixes(&(d->obj),
- get_tsearch_config_filename(defGetString(defel),
- "affix"));
- affloaded = true;
+ *afffile = get_tsearch_config_filename(defGetString(defel), "affix");
}
else if (strcmp(defel->defname, "stopwords") == 0)
{
- if (stoploaded)
+ if (!stopfile)
+ continue;
+
+ if (*stopfile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple StopWords parameters")));
- readstoplist(defGetString(defel), &(d->stoplist), lowerstr);
- stoploaded = true;
+ *stopfile = defGetString(defel);
}
else
{
@@ -83,66 +166,52 @@ dispell_init(PG_FUNCTION_ARGS)
defel->defname)));
}
}
+}
- if (affloaded && dictloaded)
- {
- NISortDictionary(&(d->obj));
- NISortAffixes(&(d->obj));
- }
- else if (!affloaded)
+/*
+ * Build the dictionary.
+ *
+ * Result is palloc'ed.
+ */
+static void *
+dispell_build(List *dictoptions, Size *size)
+{
+ IspellDictBuild build;
+ char *dictfile,
+ *afffile;
+
+ parse_dictoptions(dictoptions, &dictfile, &afffile, NULL);
+
+ if (!afffile)
{
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("missing AffFile parameter")));
}
- else
+ else if (!dictfile)
{
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("missing DictFile parameter")));
}
- NIFinishBuild(&(d->obj));
+ MemSet(&build, 0, sizeof(build));
+ NIStartBuild(&build);
- PG_RETURN_POINTER(d);
-}
+ /* Read files */
+ NIImportDictionary(&build, dictfile);
+ NIImportAffixes(&build, afffile);
-Datum
-dispell_lexize(PG_FUNCTION_ARGS)
-{
- DictISpell *d = (DictISpell *) PG_GETARG_POINTER(0);
- char *in = (char *) PG_GETARG_POINTER(1);
- int32 len = PG_GETARG_INT32(2);
- char *txt;
- TSLexeme *res;
- TSLexeme *ptr,
- *cptr;
+ /* Build persistent data to use by backends */
+ NISortDictionary(&build);
+ NISortAffixes(&build);
- if (len <= 0)
- PG_RETURN_POINTER(NULL);
+ NICopyData(&build);
- txt = lowerstr_with_len(in, len);
- res = NINormalizeWord(&(d->obj), txt);
-
- if (res == NULL)
- PG_RETURN_POINTER(NULL);
+ /* Release temporary data */
+ NIFinishBuild(&build);
- cptr = res;
- for (ptr = cptr; ptr->lexeme; ptr++)
- {
- if (searchstoplist(&(d->stoplist), ptr->lexeme))
- {
- pfree(ptr->lexeme);
- ptr->lexeme = NULL;
- }
- else
- {
- if (cptr != ptr)
- memcpy(cptr, ptr, sizeof(TSLexeme));
- cptr++;
- }
- }
- cptr->lexeme = NULL;
-
- PG_RETURN_POINTER(res);
+ /* Return the buffer and its size */
+ *size = build.dict_size;
+ return build.dict;
}
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index e071994523..1c560ef56a 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -23,33 +23,35 @@
* Compilation of a dictionary
* ---------------------------
*
- * A compiled dictionary is stored in the IspellDict structure. Compilation of
- * a dictionary is divided into the several steps:
+ * A compiled dictionary is stored in the following structures:
+ * - IspellDictBuild - stores temporary data and IspellDictData
+ * - IspellDictData - stores permanent data used within NINormalizeWord()
+ * Compilation of the dictionary is divided into the several steps:
* - NIImportDictionary() - stores each word of a .dict file in the
* temporary Spell field.
- * - NIImportAffixes() - stores affix rules of an .affix file in the
- * Affix field (not temporary) if an .affix file has the Ispell format.
+ * - NIImportAffixes() - stores affix rules of an .affix file in the temporary
+ * Affix field if an .affix file has the Ispell format.
* -> NIImportOOAffixes() - stores affix rules if an .affix file has the
* Hunspell format. The AffixData field is initialized if AF parameter
* is defined.
* - NISortDictionary() - builds a prefix tree (Trie) from the words list
- * and stores it in the Dictionary field. The words list is got from the
+ * and stores it in the DictNodes field. The words list is got from the
* Spell field. The AffixData field is initialized if AF parameter is not
* defined.
* - NISortAffixes():
* - builds a list of compound affixes from the affix list and stores it
* in the CompoundAffix.
* - builds prefix trees (Trie) from the affix list for prefixes and suffixes
- * and stores them in Suffix and Prefix fields.
+ * and stores them in SuffixNodes and PrefixNodes fields.
* The affix list is got from the Affix field.
+ * Persistent data of the dictionary is copied within NICopyData().
*
* Memory management
* -----------------
*
- * The IspellDict structure has the Spell field which is used only in compile
- * time. The Spell field stores a words list. It can take a lot of memory.
- * Therefore when a dictionary is compiled this field is cleared by
- * NIFinishBuild().
+ * The IspellDictBuild structure holds temporary data which is used only at
+ * compile time. It can take a lot of memory. Therefore, after the dictionary
+ * is compiled, this data is released by NIFinishBuild().
*
* All resources which should cleared by NIFinishBuild() is initialized using
* tmpalloc() and tmpalloc0().
@@ -73,112 +75,147 @@
* after the initialization is done. During initialization,
* CurrentMemoryContext is the long-lived memory context associated
* with the dictionary cache entry. We keep the short-lived stuff
- * in the Conf->buildCxt context.
+ * in the ConfBuild->buildCxt context.
*/
-#define tmpalloc(sz) MemoryContextAlloc(Conf->buildCxt, (sz))
-#define tmpalloc0(sz) MemoryContextAllocZero(Conf->buildCxt, (sz))
+#define tmpalloc(sz) MemoryContextAlloc(ConfBuild->buildCxt, (sz))
+#define tmpalloc0(sz) MemoryContextAllocZero(ConfBuild->buildCxt, (sz))
-#define tmpstrdup(str) MemoryContextStrdup(Conf->buildCxt, (str))
+#define tmpstrdup(str) MemoryContextStrdup(ConfBuild->buildCxt, (str))
/*
* Prepare for constructing an ISpell dictionary.
*
- * The IspellDict struct is assumed to be zeroed when allocated.
+ * The IspellDictBuild struct is assumed to be zeroed when allocated.
*/
void
-NIStartBuild(IspellDict *Conf)
+NIStartBuild(IspellDictBuild *ConfBuild)
{
+ uint32 dict_size;
+
/*
* The temp context is a child of CurTransactionContext, so that it will
* go away automatically on error.
*/
- Conf->buildCxt = AllocSetContextCreate(CurTransactionContext,
- "Ispell dictionary init context",
- ALLOCSET_DEFAULT_SIZES);
+ ConfBuild->buildCxt = AllocSetContextCreate(CurTransactionContext,
+ "Ispell dictionary init context",
+ ALLOCSET_DEFAULT_SIZES);
+
+ /*
+ * Allocate the dictionary buffer in the current context, not in buildCxt.
+ */
+ dict_size = MAXALIGN(IspellDictDataHdrSize);
+ ConfBuild->dict = palloc0(dict_size);
+ ConfBuild->dict_size = dict_size;
}
/*
- * Clean up when dictionary construction is complete.
+ * Copy compiled and persistent data into IspellDictData.
*/
void
-NIFinishBuild(IspellDict *Conf)
+NICopyData(IspellDictBuild *ConfBuild)
{
- /* Release no-longer-needed temp memory */
- MemoryContextDelete(Conf->buildCxt);
- /* Just for cleanliness, zero the now-dangling pointers */
- Conf->buildCxt = NULL;
- Conf->Spell = NULL;
- Conf->firstfree = NULL;
- Conf->CompoundAffixFlags = NULL;
-}
+ IspellDictData *dict;
+ uint32 size;
+ int i;
+ uint32 *offsets,
+ offset;
+ /*
+ * Calculate necessary space
+ */
+ size = ConfBuild->nAffixData * sizeof(uint32);
+ size += ConfBuild->AffixDataEnd;
-/*
- * "Compact" palloc: allocate without extra palloc overhead.
- *
- * Since we have no need to free the ispell data items individually, there's
- * not much value in the per-chunk overhead normally consumed by palloc.
- * Getting rid of it is helpful since ispell can allocate a lot of small nodes.
- *
- * We currently pre-zero all data allocated this way, even though some of it
- * doesn't need that. The cpalloc and cpalloc0 macros are just documentation
- * to indicate which allocations actually require zeroing.
- */
-#define COMPACT_ALLOC_CHUNK 8192 /* amount to get from palloc at once */
-#define COMPACT_MAX_REQ 1024 /* must be < COMPACT_ALLOC_CHUNK */
+ size += ConfBuild->nAffix * sizeof(uint32);
+ size += ConfBuild->AffixSize;
-static void *
-compact_palloc0(IspellDict *Conf, size_t size)
-{
- void *result;
+ size += ConfBuild->DictNodes.NodesEnd;
+ size += ConfBuild->PrefixNodes.NodesEnd;
+ size += ConfBuild->SuffixNodes.NodesEnd;
- /* Should only be called during init */
- Assert(Conf->buildCxt != NULL);
+ size += sizeof(CMPDAffix) * ConfBuild->nCompoundAffix;
- /* No point in this for large chunks */
- if (size > COMPACT_MAX_REQ)
- return palloc0(size);
+ /*
+ * Copy data itself
+ */
+ ConfBuild->dict_size = IspellDictDataHdrSize + size;
+ ConfBuild->dict = repalloc(ConfBuild->dict, ConfBuild->dict_size);
+
+ dict = ConfBuild->dict;
+
+ /* AffixData */
+ dict->nAffixData = ConfBuild->nAffixData;
+ dict->AffixDataStart = sizeof(uint32) * ConfBuild->nAffixData;
+ memcpy(DictAffixDataOffset(dict), ConfBuild->AffixDataOffset,
+ sizeof(uint32) * ConfBuild->nAffixData);
+ memcpy(DictAffixData(dict), ConfBuild->AffixData, ConfBuild->AffixDataEnd);
+
+ /* Affix array */
+ dict->nAffix = ConfBuild->nAffix;
+ dict->AffixOffsetStart = dict->AffixDataStart + ConfBuild->AffixDataEnd;
+ dict->AffixStart = dict->AffixOffsetStart + sizeof(uint32) * ConfBuild->nAffix;
+ offsets = (uint32 *) DictAffixOffset(dict);
+ offset = 0;
+ for (i = 0; i < ConfBuild->nAffix; i++)
+ {
+ AFFIX *affix;
+ uint32 size = AffixGetSize(ConfBuild->Affix[i]);
- /* Keep everything maxaligned */
- size = MAXALIGN(size);
+ offsets[i] = offset;
+ affix = (AFFIX *) DictAffixGet(dict, i);
+ Assert(affix);
- /* Need more space? */
- if (size > Conf->avail)
- {
- Conf->firstfree = palloc0(COMPACT_ALLOC_CHUNK);
- Conf->avail = COMPACT_ALLOC_CHUNK;
- }
+ memcpy(affix, ConfBuild->Affix[i], size);
- result = (void *) Conf->firstfree;
- Conf->firstfree += size;
- Conf->avail -= size;
+ offset += size;
+ }
- return result;
+ /* DictNodes prefix tree */
+ dict->DictNodesStart = dict->AffixStart + offset;
+ memcpy(DictDictNodes(dict), ConfBuild->DictNodes.Nodes,
+ ConfBuild->DictNodes.NodesEnd);
+
+ /* PrefixNodes prefix tree */
+ dict->PrefixNodesStart = dict->DictNodesStart + ConfBuild->DictNodes.NodesEnd;
+ memcpy(DictPrefixNodes(dict), ConfBuild->PrefixNodes.Nodes,
+ ConfBuild->PrefixNodes.NodesEnd);
+
+ /* SuffixNodes prefix tree */
+ dict->SuffixNodesStart = dict->PrefixNodesStart + ConfBuild->PrefixNodes.NodesEnd;
+ memcpy(DictSuffixNodes(dict), ConfBuild->SuffixNodes.Nodes,
+ ConfBuild->SuffixNodes.NodesEnd);
+
+ /* CompoundAffix array */
+ dict->CompoundAffixStart = dict->SuffixNodesStart +
+ ConfBuild->SuffixNodes.NodesEnd;
+ memcpy(DictCompoundAffix(dict), ConfBuild->CompoundAffix,
+ sizeof(CMPDAffix) * ConfBuild->nCompoundAffix);
}
-#define cpalloc(size) compact_palloc0(Conf, size)
-#define cpalloc0(size) compact_palloc0(Conf, size)
-
-static char *
-cpstrdup(IspellDict *Conf, const char *str)
+/*
+ * Clean up when dictionary construction is complete.
+ */
+void
+NIFinishBuild(IspellDictBuild *ConfBuild)
{
- char *res = cpalloc(strlen(str) + 1);
-
- strcpy(res, str);
- return res;
+ /* Release no-longer-needed temp memory */
+ MemoryContextDelete(ConfBuild->buildCxt);
+ /* Just for cleanliness, zero the now-dangling pointers */
+ ConfBuild->buildCxt = NULL;
+ ConfBuild->Spell = NULL;
+ ConfBuild->CompoundAffixFlags = NULL;
}
-
/*
* Apply lowerstr(), producing a temporary result (in the buildCxt).
*/
static char *
-lowerstr_ctx(IspellDict *Conf, const char *src)
+lowerstr_ctx(IspellDictBuild *ConfBuild, const char *src)
{
MemoryContext saveCtx;
char *dst;
- saveCtx = MemoryContextSwitchTo(Conf->buildCxt);
+ saveCtx = MemoryContextSwitchTo(ConfBuild->buildCxt);
dst = lowerstr(src);
MemoryContextSwitchTo(saveCtx);
@@ -190,7 +227,7 @@ lowerstr_ctx(IspellDict *Conf, const char *src)
#define STRNCMP(s,p) strncmp( (s), (p), strlen(p) )
#define GETWCHAR(W,L,N,T) ( ((const uint8*)(W))[ ((T)==FF_PREFIX) ? (N) : ( (L) - 1 - (N) ) ] )
-#define GETCHAR(A,N,T) GETWCHAR( (A)->repl, (A)->replen, N, T )
+#define GETCHAR(A,N,T) GETWCHAR( AffixFieldRepl(A), (A)->replen, N, T )
static char *VoidString = "";
@@ -311,18 +348,189 @@ strbncmp(const unsigned char *s1, const unsigned char *s2, size_t count)
static int
cmpaffix(const void *s1, const void *s2)
{
- const AFFIX *a1 = (const AFFIX *) s1;
- const AFFIX *a2 = (const AFFIX *) s2;
+ const AFFIX *a1 = *((AFFIX *const *) s1);
+ const AFFIX *a2 = *((AFFIX *const *) s2);
if (a1->type < a2->type)
return -1;
if (a1->type > a2->type)
return 1;
if (a1->type == FF_PREFIX)
- return strcmp(a1->repl, a2->repl);
+ return strcmp(AffixFieldRepl(a1), AffixFieldRepl(a2));
else
- return strbcmp((const unsigned char *) a1->repl,
- (const unsigned char *) a2->repl);
+ return strbcmp((const unsigned char *) AffixFieldRepl(a1),
+ (const unsigned char *) AffixFieldRepl(a2));
+}
+
+/*
+ * Allocate space for AffixData.
+ */
+static void
+InitAffixData(IspellDictBuild *ConfBuild, int numAffixData)
+{
+ uint32 size;
+
+ size = 8 * 1024 /* Reserve 8KB for data */;
+
+ ConfBuild->AffixData = (char *) tmpalloc(size);
+ ConfBuild->AffixDataSize = size;
+ ConfBuild->AffixDataOffset = (uint32 *) tmpalloc(numAffixData * sizeof(uint32));
+ ConfBuild->nAffixData = 0;
+ ConfBuild->mAffixData = numAffixData;
+
+ /* Save offset of the end of data */
+ ConfBuild->AffixDataEnd = 0;
+}
+
+/*
+ * Add a set of affix flags to the AffixData buffer of the IspellDictBuild
+ * struct. If the buffer cannot fit the new affix set, enlarge it.
+ *
+ * ConfBuild: building structure for the current dictionary.
+ * AffixSet: set of affix flags (AffixSetLen bytes, not counting the \0).
+ */
+static void
+AddAffixSet(IspellDictBuild *ConfBuild, const char *AffixSet,
+ uint32 AffixSetLen)
+{
+ /*
+ * Check available space for AffixSet.
+ */
+ if (ConfBuild->AffixDataEnd + AffixSetLen + 1 /* \0 */ >=
+ ConfBuild->AffixDataSize)
+ {
+ uint32 newsize = Max(ConfBuild->AffixDataSize + 8 * 1024 /* 8KB */,
+ ConfBuild->AffixDataSize + AffixSetLen + 1);
+
+ ConfBuild->AffixData = (char *) repalloc(ConfBuild->AffixData, newsize);
+ ConfBuild->AffixDataSize = newsize;
+ }
+
+ /* Check available number of offsets */
+ if (ConfBuild->nAffixData >= ConfBuild->mAffixData)
+ {
+ ConfBuild->mAffixData *= 2;
+ ConfBuild->AffixDataOffset = (uint32 *) repalloc(ConfBuild->AffixDataOffset,
+ sizeof(uint32) * ConfBuild->mAffixData);
+ }
+
+ ConfBuild->AffixDataOffset[ConfBuild->nAffixData] = ConfBuild->AffixDataEnd;
+ StrNCpy(AffixDataGet(ConfBuild, ConfBuild->nAffixData),
+ AffixSet, AffixSetLen + 1);
+
+ /* Save offset of the end of data */
+ ConfBuild->AffixDataEnd += AffixSetLen + 1;
+ ConfBuild->nAffixData++;
+}
+
+/*
+ * Allocate space for prefix tree node.
+ *
+ * ConfBuild: building structure for the current dictionary.
+ * array: NodeArray where to allocate new node.
+ * length: number of allocated NodeData.
+ * sizeNodeData: minimum size of each NodeData.
+ * sizeNodeHeader: size of header of new node.
+ *
+ * Returns an offset of new node in NodeArray->Nodes.
+ */
+static uint32
+AllocateNode(IspellDictBuild *ConfBuild, NodeArray *array, uint32 length,
+ uint32 sizeNodeData, uint32 sizeNodeHeader)
+{
+ uint32 node_offset;
+ uint32 size;
+
+ size = sizeNodeHeader + length * sizeNodeData;
+ size = MAXALIGN(size);
+
+ if (array->NodesSize == 0)
+ {
+ array->NodesSize = size * 32; /* Reserve space for next levels of the
+ * prefix tree */
+ array->Nodes = (char *) tmpalloc(array->NodesSize);
+ array->NodesEnd = 0;
+ }
+ else if (array->NodesEnd + size >= array->NodesSize)
+ {
+ array->NodesSize = Max(array->NodesSize * 2, array->NodesSize + size);
+ array->Nodes = (char *) repalloc(array->Nodes, array->NodesSize);
+ }
+
+ node_offset = array->NodesEnd;
+ array->NodesEnd += size;
+
+ return node_offset;
+}
+
+/*
+ * Allocate space for SPNode.
+ *
+ * Returns the offset of the new node in ConfBuild->DictNodes.Nodes.
+ */
+static uint32
+AllocateSPNode(IspellDictBuild *ConfBuild, uint32 length)
+{
+ uint32 offset;
+ SPNode *node;
+ SPNodeData *data;
+ uint32 i;
+
+ offset = AllocateNode(ConfBuild, &ConfBuild->DictNodes, length,
+ sizeof(SPNodeData), SPNHDRSZ);
+ node = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, offset);
+ node->length = length;
+
+ /*
+ * Initialize all SPNodeData with default values. We cannot use memset()
+ * here because not all fields have 0 as default value.
+ */
+ for (i = 0; i < length; i++)
+ {
+ data = &(node->data[i]);
+ data->val = 0;
+ data->affix = ISPELL_INVALID_INDEX;
+ data->compoundflag = 0;
+ data->isword = 0;
+ data->node_offset = ISPELL_INVALID_OFFSET;
+ }
+
+ return offset;
+}
+
+/*
+ * Allocate space for AffixNode.
+ *
+ * Returns an offset of new node in NodeArray->Nodes.
+ */
+static uint32
+AllocateAffixNode(IspellDictBuild *ConfBuild, NodeArray *array, uint32 length)
+{
+ uint32 offset;
+ AffixNode *node;
+ AffixNodeData *data;
+ uint32 i;
+
+ offset = AllocateNode(ConfBuild, array, length, sizeof(AffixNodeData),
+ ANHRDSZ);
+ node = (AffixNode *) NodeArrayGet(array, offset);
+ node->length = length;
+ node->isvoid = 0;
+
+ /*
+ * Initialize all AffixNodeData with default values. We cannot use memset()
+ * here because not all fields have 0 as default value.
+ */
+ for (i = 0; i < length; i++)
+ {
+ data = &(node->data[i]);
+ data->val = 0;
+ data->affstart = ISPELL_INVALID_INDEX;
+ data->affend = ISPELL_INVALID_INDEX;
+ data->node_offset = ISPELL_INVALID_OFFSET;
+ }
+
+ return offset;
}
/*
@@ -333,7 +541,7 @@ cmpaffix(const void *s1, const void *s2)
* - 2 characters (FM_LONG). A character may be Unicode.
* - numbers from 1 to 65000 (FM_NUM).
*
- * Depending on the flagMode an affix string can have the following format:
+ * Depending on the flagmode an affix string can have the following format:
* - FM_CHAR: ABCD
* Here we have 4 flags: A, B, C and D
* - FM_LONG: ABCDE*
@@ -341,13 +549,13 @@ cmpaffix(const void *s1, const void *s2)
* - FM_NUM: 200,205,50
* Here we have 3 flags: 200, 205 and 50
*
- * Conf: current dictionary.
+ * flagmode: flag mode of the dictionary
* sflagset: the set of affix flags. Returns a reference to the start of a next
* affix flag.
* sflag: returns an affix flag from sflagset.
*/
static void
-getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
+getNextFlagFromString(FlagMode flagmode, char **sflagset, char *sflag)
{
int32 s;
char *next,
@@ -356,11 +564,11 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
bool stop = false;
bool met_comma = false;
- maxstep = (Conf->flagMode == FM_LONG) ? 2 : 1;
+ maxstep = (flagmode == FM_LONG) ? 2 : 1;
while (**sflagset)
{
- switch (Conf->flagMode)
+ switch (flagmode)
{
case FM_LONG:
case FM_CHAR:
@@ -422,15 +630,15 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
stop = true;
break;
default:
- elog(ERROR, "unrecognized type of Conf->flagMode: %d",
- Conf->flagMode);
+ elog(ERROR, "unrecognized type of flagmode: %d",
+ flagmode);
}
if (stop)
break;
}
- if (Conf->flagMode == FM_LONG && maxstep > 0)
+ if (flagmode == FM_LONG && maxstep > 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix flag \"%s\" with \"long\" flag value",
@@ -440,31 +648,28 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
}
/*
- * Checks if the affix set Conf->AffixData[affix] contains affixflag.
- * Conf->AffixData[affix] does not contain affixflag if this flag is not used
- * actually by the .dict file.
+ * Checks if the affix set from AffixData contains affixflag. Affix set does
+ * not contain affixflag if this flag is not used actually by the .dict file.
*
- * Conf: current dictionary.
- * affix: index of the Conf->AffixData array.
+ * flagmode: flag mode of the dictionary.
+ * sflagset: the set of affix flags.
* affixflag: the affix flag.
*
- * Returns true if the string Conf->AffixData[affix] contains affixflag,
- * otherwise returns false.
+ * Returns true if the affix set string contains affixflag, otherwise returns
+ * false.
*/
static bool
-IsAffixFlagInUse(IspellDict *Conf, int affix, const char *affixflag)
+IsAffixFlagInUse(FlagMode flagmode, char *sflagset, const char *affixflag)
{
- char *flagcur;
+ char *flagcur = sflagset;
char flag[BUFSIZ];
if (*affixflag == 0)
return true;
- flagcur = Conf->AffixData[affix];
-
while (*flagcur)
{
- getNextFlagFromString(Conf, &flagcur, flag);
+ getNextFlagFromString(flagmode, &flagcur, flag);
/* Compare first affix flag in flagcur with affixflag */
if (strcmp(flag, affixflag) == 0)
return true;
@@ -477,31 +682,33 @@ IsAffixFlagInUse(IspellDict *Conf, int affix, const char *affixflag)
/*
* Adds the new word into the temporary array Spell.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* word: new word.
* flag: set of affix flags. Single flag can be get by getNextFlagFromString().
*/
static void
-NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
+NIAddSpell(IspellDictBuild *ConfBuild, const char *word, const char *flag)
{
- if (Conf->nspell >= Conf->mspell)
+ if (ConfBuild->nSpell >= ConfBuild->mSpell)
{
- if (Conf->mspell)
+ if (ConfBuild->mSpell)
{
- Conf->mspell *= 2;
- Conf->Spell = (SPELL **) repalloc(Conf->Spell, Conf->mspell * sizeof(SPELL *));
+ ConfBuild->mSpell *= 2;
+ ConfBuild->Spell = (SPELL **) repalloc(ConfBuild->Spell,
+ ConfBuild->mSpell * sizeof(SPELL *));
}
else
{
- Conf->mspell = 1024 * 20;
- Conf->Spell = (SPELL **) tmpalloc(Conf->mspell * sizeof(SPELL *));
+ ConfBuild->mSpell = 1024 * 20;
+ ConfBuild->Spell = (SPELL **) tmpalloc(ConfBuild->mSpell * sizeof(SPELL *));
}
}
- Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
- strcpy(Conf->Spell[Conf->nspell]->word, word);
- Conf->Spell[Conf->nspell]->p.flag = (*flag != '\0')
+ ConfBuild->Spell[ConfBuild->nSpell] =
+ (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
+ strcpy(ConfBuild->Spell[ConfBuild->nSpell]->word, word);
+ ConfBuild->Spell[ConfBuild->nSpell]->p.flag = (*flag != '\0')
? tmpstrdup(flag) : VoidString;
- Conf->nspell++;
+ ConfBuild->nSpell++;
}
/*
@@ -509,11 +716,11 @@ NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
*
* Note caller must already have applied get_tsearch_config_filename.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* filename: path to the .dict file.
*/
void
-NIImportDictionary(IspellDict *Conf, const char *filename)
+NIImportDictionary(IspellDictBuild *ConfBuild, const char *filename)
{
tsearch_readline_state trst;
char *line;
@@ -564,9 +771,9 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
}
s += pg_mblen(s);
}
- pstr = lowerstr_ctx(Conf, line);
+ pstr = lowerstr_ctx(ConfBuild, line);
- NIAddSpell(Conf, pstr, flag);
+ NIAddSpell(ConfBuild, pstr, flag);
pfree(pstr);
pfree(line);
@@ -590,7 +797,7 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
* SFX M 0 's .
* is presented here.
*
- * Conf: current dictionary.
+ * dict: current dictionary.
* word: basic form of word.
* affixflag: affix flag, by which a basic form of word was generated.
* flag: compound flag used to compare with StopMiddle->compoundflag.
@@ -598,9 +805,9 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
* Returns 1 if the word was found in the prefix tree, else returns 0.
*/
static int
-FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
+FindWord(IspellDictData *dict, const char *word, const char *affixflag, int flag)
{
- SPNode *node = Conf->Dictionary;
+ SPNode *node = (SPNode *) DictDictNodes(dict);
SPNodeData *StopLow,
*StopHigh,
*StopMiddle;
@@ -636,10 +843,14 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
* Check if this affix rule is presented in the affix set
* with index StopMiddle->affix.
*/
- if (IsAffixFlagInUse(Conf, StopMiddle->affix, affixflag))
+ if (IsAffixFlagInUse(dict->flagMode,
+ DictAffixDataGet(dict, StopMiddle->affix),
+ affixflag))
return 1;
}
- node = StopMiddle->node;
+ /* Retrieve SPNode by the offset */
+ node = (SPNode *) DictNodeGet(DictDictNodes(dict),
+ StopMiddle->node_offset);
ptr++;
break;
}
@@ -657,7 +868,8 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
/*
* Adds a new affix rule to the Affix field.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary, is used to allocate
+ * temporary data.
* flag: affix flag ('\' in the below example).
* flagflags: set of flags from the flagval field for this affix rule. This set
* is listed after '/' character in the added string (repl).
@@ -673,26 +885,54 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
* type: FF_SUFFIX or FF_PREFIX.
*/
static void
-NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
- const char *find, const char *repl, int type)
+NIAddAffix(IspellDictBuild *ConfBuild, const char *flag, char flagflags,
+ const char *mask, const char *find, const char *repl, int type)
{
AFFIX *Affix;
+ uint32 size;
+ uint32 flaglen = strlen(flag),
+ findlen = strlen(find),
+ repllen = strlen(repl),
+ masklen = strlen(mask);
+
+ /* Sanity checks */
+ if (flaglen > AF_FLAG_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix flag \"%s\" too long", flag)));
+ if (findlen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix find field \"%s\" too long", find)));
+ if (repllen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix repl field \"%s\" too long", repl)));
+ if (masklen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix mask field \"%s\" too long", mask)));
- if (Conf->naffixes >= Conf->maffixes)
+ if (ConfBuild->nAffix >= ConfBuild->mAffix)
{
- if (Conf->maffixes)
+ if (ConfBuild->mAffix)
{
- Conf->maffixes *= 2;
- Conf->Affix = (AFFIX *) repalloc((void *) Conf->Affix, Conf->maffixes * sizeof(AFFIX));
+ ConfBuild->mAffix *= 2;
+ ConfBuild->Affix = (AFFIX **) repalloc(ConfBuild->Affix,
+ ConfBuild->mAffix * sizeof(AFFIX *));
}
else
{
- Conf->maffixes = 16;
- Conf->Affix = (AFFIX *) palloc(Conf->maffixes * sizeof(AFFIX));
+ ConfBuild->mAffix = 255;
+ ConfBuild->Affix = (AFFIX **) tmpalloc(ConfBuild->mAffix * sizeof(AFFIX *));
}
}
- Affix = Conf->Affix + Conf->naffixes;
+ size = AFFIXHDRSZ + flaglen + 1 /* \0 */ + findlen + 1 /* \0 */ +
+ repllen + 1 /* \0 */ + masklen + 1 /* \0 */;
+
+ Affix = (AFFIX *) tmpalloc(size);
+ ConfBuild->Affix[ConfBuild->nAffix] = Affix;
/* This affix rule can be applied for words with any ending */
if (strcmp(mask, ".") == 0 || *mask == '\0')
@@ -705,42 +945,12 @@ NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
{
Affix->issimple = 0;
Affix->isregis = 1;
- RS_compile(&(Affix->reg.regis), (type == FF_SUFFIX),
- *mask ? mask : VoidString);
}
/* This affix rule will use regex_t to search word ending */
else
{
- int masklen;
- int wmasklen;
- int err;
- pg_wchar *wmask;
- char *tmask;
-
Affix->issimple = 0;
Affix->isregis = 0;
- tmask = (char *) tmpalloc(strlen(mask) + 3);
- if (type == FF_SUFFIX)
- sprintf(tmask, "%s$", mask);
- else
- sprintf(tmask, "^%s", mask);
-
- masklen = strlen(tmask);
- wmask = (pg_wchar *) tmpalloc((masklen + 1) * sizeof(pg_wchar));
- wmasklen = pg_mb2wchar_with_len(tmask, wmask, masklen);
-
- err = pg_regcomp(&(Affix->reg.regex), wmask, wmasklen,
- REG_ADVANCED | REG_NOSUB,
- DEFAULT_COLLATION_OID);
- if (err)
- {
- char errstr[100];
-
- pg_regerror(err, &(Affix->reg.regex), errstr, sizeof(errstr));
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
- errmsg("invalid regular expression: %s", errstr)));
- }
}
Affix->flagflags = flagflags;
@@ -749,15 +959,22 @@ NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
if ((Affix->flagflags & FF_COMPOUNDFLAG) == 0)
Affix->flagflags |= FF_COMPOUNDFLAG;
}
- Affix->flag = cpstrdup(Conf, flag);
+
Affix->type = type;
- Affix->find = (find && *find) ? cpstrdup(Conf, find) : VoidString;
- if ((Affix->replen = strlen(repl)) > 0)
- Affix->repl = cpstrdup(Conf, repl);
- else
- Affix->repl = VoidString;
- Conf->naffixes++;
+ Affix->replen = repllen;
+ StrNCpy(AffixFieldRepl(Affix), repl, repllen + 1);
+
+ Affix->findlen = findlen;
+ StrNCpy(AffixFieldFind(Affix), find, findlen + 1);
+
+ Affix->masklen = masklen;
+ StrNCpy(AffixFieldMask(Affix), mask, masklen + 1);
+
+ StrNCpy(AffixFieldFlag(Affix), flag, flaglen + 1);
+
+ ConfBuild->nAffix++;
+ ConfBuild->AffixSize += size;
}
/* Parsing states for parse_affentry() and friends */
@@ -1021,10 +1238,10 @@ parse_affentry(char *str, char *mask, char *find, char *repl)
* Sets a Hunspell options depending on flag type.
*/
static void
-setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
+setCompoundAffixFlagValue(IspellDictBuild *ConfBuild, CompoundAffixFlag *entry,
char *s, uint32 val)
{
- if (Conf->flagMode == FM_NUM)
+ if (ConfBuild->dict->flagMode == FM_NUM)
{
char *next;
int i;
@@ -1044,19 +1261,19 @@ setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
else
entry->flag.s = tmpstrdup(s);
- entry->flagMode = Conf->flagMode;
+ entry->flagMode = ConfBuild->dict->flagMode;
entry->value = val;
}
/*
* Sets up a correspondence for the affix parameter with the affix flag.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* s: affix flag in string.
* val: affix parameter.
*/
static void
-addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
+addCompoundAffixFlagValue(IspellDictBuild *ConfBuild, char *s, uint32 val)
{
CompoundAffixFlag *newValue;
char sbuf[BUFSIZ];
@@ -1083,29 +1300,29 @@ addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
*sflag = '\0';
/* Resize array or allocate memory for array CompoundAffixFlag */
- if (Conf->nCompoundAffixFlag >= Conf->mCompoundAffixFlag)
+ if (ConfBuild->nCompoundAffixFlag >= ConfBuild->mCompoundAffixFlag)
{
- if (Conf->mCompoundAffixFlag)
+ if (ConfBuild->mCompoundAffixFlag)
{
- Conf->mCompoundAffixFlag *= 2;
- Conf->CompoundAffixFlags = (CompoundAffixFlag *)
- repalloc((void *) Conf->CompoundAffixFlags,
- Conf->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
+ ConfBuild->mCompoundAffixFlag *= 2;
+ ConfBuild->CompoundAffixFlags = (CompoundAffixFlag *)
+ repalloc((void *) ConfBuild->CompoundAffixFlags,
+ ConfBuild->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
}
else
{
- Conf->mCompoundAffixFlag = 10;
- Conf->CompoundAffixFlags = (CompoundAffixFlag *)
- tmpalloc(Conf->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
+ ConfBuild->mCompoundAffixFlag = 10;
+ ConfBuild->CompoundAffixFlags = (CompoundAffixFlag *)
+ tmpalloc(ConfBuild->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
}
}
- newValue = Conf->CompoundAffixFlags + Conf->nCompoundAffixFlag;
+ newValue = ConfBuild->CompoundAffixFlags + ConfBuild->nCompoundAffixFlag;
- setCompoundAffixFlagValue(Conf, newValue, sbuf, val);
+ setCompoundAffixFlagValue(ConfBuild, newValue, sbuf, val);
- Conf->usecompound = true;
- Conf->nCompoundAffixFlag++;
+ ConfBuild->dict->usecompound = true;
+ ConfBuild->nCompoundAffixFlag++;
}
/*
@@ -1113,7 +1330,7 @@ addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
* flags s.
*/
static int
-getCompoundAffixFlagValue(IspellDict *Conf, char *s)
+getCompoundAffixFlagValue(IspellDictBuild *ConfBuild, char *s)
{
uint32 flag = 0;
CompoundAffixFlag *found,
@@ -1121,18 +1338,18 @@ getCompoundAffixFlagValue(IspellDict *Conf, char *s)
char sflag[BUFSIZ];
char *flagcur;
- if (Conf->nCompoundAffixFlag == 0)
+ if (ConfBuild->nCompoundAffixFlag == 0)
return 0;
flagcur = s;
while (*flagcur)
{
- getNextFlagFromString(Conf, &flagcur, sflag);
- setCompoundAffixFlagValue(Conf, &key, sflag, 0);
+ getNextFlagFromString(ConfBuild->dict->flagMode, &flagcur, sflag);
+ setCompoundAffixFlagValue(ConfBuild, &key, sflag, 0);
found = (CompoundAffixFlag *)
- bsearch(&key, (void *) Conf->CompoundAffixFlags,
- Conf->nCompoundAffixFlag, sizeof(CompoundAffixFlag),
+ bsearch(&key, (void *) ConfBuild->CompoundAffixFlags,
+ ConfBuild->nCompoundAffixFlag, sizeof(CompoundAffixFlag),
cmpcmdflag);
if (found != NULL)
flag |= found->value;
@@ -1144,14 +1361,13 @@ getCompoundAffixFlagValue(IspellDict *Conf, char *s)
/*
* Returns a flag set using the s parameter.
*
- * If Conf->useFlagAliases is true then the s parameter is index of the
- * Conf->AffixData array and function returns its entry.
- * Else function returns the s parameter.
+ * If useFlagAliases is true then the s parameter is index of the AffixData
+ * array and function returns its entry. Else function returns the s parameter.
*/
static char *
-getAffixFlagSet(IspellDict *Conf, char *s)
+getAffixFlagSet(IspellDictBuild *ConfBuild, char *s)
{
- if (Conf->useFlagAliases && *s != '\0')
+ if (ConfBuild->dict->useFlagAliases && *s != '\0')
{
int curaffix;
char *end;
@@ -1162,13 +1378,13 @@ getAffixFlagSet(IspellDict *Conf, char *s)
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix alias \"%s\"", s)));
- if (curaffix > 0 && curaffix <= Conf->nAffixData)
+ if (curaffix > 0 && curaffix <= ConfBuild->nAffixData)
/*
* Do not subtract 1 from curaffix because empty string was added
* in NIImportOOAffixes
*/
- return Conf->AffixData[curaffix];
+ return AffixDataGet(ConfBuild, curaffix);
else
return VoidString;
}
@@ -1179,11 +1395,11 @@ getAffixFlagSet(IspellDict *Conf, char *s)
/*
* Import an affix file that follows MySpell or Hunspell format.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* filename: path to the .affix file.
*/
static void
-NIImportOOAffixes(IspellDict *Conf, const char *filename)
+NIImportOOAffixes(IspellDictBuild *ConfBuild, const char *filename)
{
char type[BUFSIZ],
*ptype = NULL;
@@ -1195,17 +1411,16 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
char repl[BUFSIZ],
*prepl;
bool isSuffix = false;
- int naffix = 0,
- curaffix = 0;
+ int naffix = 0;
int sflaglen = 0;
char flagflags = 0;
tsearch_readline_state trst;
char *recoded;
/* read file to find any flag */
- Conf->usecompound = false;
- Conf->useFlagAliases = false;
- Conf->flagMode = FM_CHAR;
+ ConfBuild->dict->usecompound = false;
+ ConfBuild->dict->useFlagAliases = false;
+ ConfBuild->dict->flagMode = FM_CHAR;
if (!tsearch_readline_begin(&trst, filename))
ereport(ERROR,
@@ -1222,30 +1437,36 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
}
if (STRNCMP(recoded, "COMPOUNDFLAG") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDFLAG"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDFLAG"),
FF_COMPOUNDFLAG);
else if (STRNCMP(recoded, "COMPOUNDBEGIN") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDBEGIN"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDBEGIN"),
FF_COMPOUNDBEGIN);
else if (STRNCMP(recoded, "COMPOUNDLAST") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDLAST"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDLAST"),
FF_COMPOUNDLAST);
/* COMPOUNDLAST and COMPOUNDEND are synonyms */
else if (STRNCMP(recoded, "COMPOUNDEND") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDEND"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDEND"),
FF_COMPOUNDLAST);
else if (STRNCMP(recoded, "COMPOUNDMIDDLE") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDMIDDLE"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDMIDDLE"),
FF_COMPOUNDMIDDLE);
else if (STRNCMP(recoded, "ONLYINCOMPOUND") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("ONLYINCOMPOUND"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("ONLYINCOMPOUND"),
FF_COMPOUNDONLY);
else if (STRNCMP(recoded, "COMPOUNDPERMITFLAG") == 0)
- addCompoundAffixFlagValue(Conf,
+ addCompoundAffixFlagValue(ConfBuild,
recoded + strlen("COMPOUNDPERMITFLAG"),
FF_COMPOUNDPERMITFLAG);
else if (STRNCMP(recoded, "COMPOUNDFORBIDFLAG") == 0)
- addCompoundAffixFlagValue(Conf,
+ addCompoundAffixFlagValue(ConfBuild,
recoded + strlen("COMPOUNDFORBIDFLAG"),
FF_COMPOUNDFORBIDFLAG);
else if (STRNCMP(recoded, "FLAG") == 0)
@@ -1258,9 +1479,9 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
if (*s)
{
if (STRNCMP(s, "long") == 0)
- Conf->flagMode = FM_LONG;
+ ConfBuild->dict->flagMode = FM_LONG;
else if (STRNCMP(s, "num") == 0)
- Conf->flagMode = FM_NUM;
+ ConfBuild->dict->flagMode = FM_NUM;
else if (STRNCMP(s, "default") != 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
@@ -1274,8 +1495,8 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
}
tsearch_readline_end(&trst);
- if (Conf->nCompoundAffixFlag > 1)
- qsort((void *) Conf->CompoundAffixFlags, Conf->nCompoundAffixFlag,
+ if (ConfBuild->nCompoundAffixFlag > 1)
+ qsort((void *) ConfBuild->CompoundAffixFlags, ConfBuild->nCompoundAffixFlag,
sizeof(CompoundAffixFlag), cmpcmdflag);
if (!tsearch_readline_begin(&trst, filename))
@@ -1295,15 +1516,15 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
if (ptype)
pfree(ptype);
- ptype = lowerstr_ctx(Conf, type);
+ ptype = lowerstr_ctx(ConfBuild, type);
/* First try to parse AF parameter (alias compression) */
if (STRNCMP(ptype, "af") == 0)
{
/* First line is the number of aliases */
- if (!Conf->useFlagAliases)
+ if (!ConfBuild->dict->useFlagAliases)
{
- Conf->useFlagAliases = true;
+ ConfBuild->dict->useFlagAliases = true;
naffix = atoi(sflag);
if (naffix == 0)
ereport(ERROR,
@@ -1313,21 +1534,15 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
/* Also reserve place for empty flag set */
naffix++;
- Conf->AffixData = (char **) palloc0(naffix * sizeof(char *));
- Conf->lenAffixData = Conf->nAffixData = naffix;
+ InitAffixData(ConfBuild, naffix);
/* Add empty flag set into AffixData */
- Conf->AffixData[curaffix] = VoidString;
- curaffix++;
+ AddAffixSet(ConfBuild, VoidString, 0);
}
/* Other lines is aliases */
else
{
- if (curaffix < naffix)
- {
- Conf->AffixData[curaffix] = cpstrdup(Conf, sflag);
- curaffix++;
- }
+ AddAffixSet(ConfBuild, sflag, strlen(sflag));
}
goto nextline;
}
@@ -1338,8 +1553,8 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
sflaglen = strlen(sflag);
if (sflaglen == 0
- || (sflaglen > 1 && Conf->flagMode == FM_CHAR)
- || (sflaglen > 2 && Conf->flagMode == FM_LONG))
+ || (sflaglen > 1 && ConfBuild->dict->flagMode == FM_CHAR)
+ || (sflaglen > 2 && ConfBuild->dict->flagMode == FM_LONG))
goto nextline;
/*--------
@@ -1367,21 +1582,21 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
/* Get flags after '/' (flags are case sensitive) */
if ((ptr = strchr(repl, '/')) != NULL)
- aflg |= getCompoundAffixFlagValue(Conf,
- getAffixFlagSet(Conf,
+ aflg |= getCompoundAffixFlagValue(ConfBuild,
+ getAffixFlagSet(ConfBuild,
ptr + 1));
/* Get lowercased version of string before '/' */
- prepl = lowerstr_ctx(Conf, repl);
+ prepl = lowerstr_ctx(ConfBuild, repl);
if ((ptr = strchr(prepl, '/')) != NULL)
*ptr = '\0';
- pfind = lowerstr_ctx(Conf, find);
- pmask = lowerstr_ctx(Conf, mask);
+ pfind = lowerstr_ctx(ConfBuild, find);
+ pmask = lowerstr_ctx(ConfBuild, mask);
if (t_iseq(find, '0'))
*pfind = '\0';
if (t_iseq(repl, '0'))
*prepl = '\0';
- NIAddAffix(Conf, sflag, flagflags | aflg, pmask, pfind, prepl,
+ NIAddAffix(ConfBuild, sflag, flagflags | aflg, pmask, pfind, prepl,
isSuffix ? FF_SUFFIX : FF_PREFIX);
pfree(prepl);
pfree(pfind);
@@ -1407,7 +1622,7 @@ nextline:
* work to NIImportOOAffixes(), which will re-read the whole file.
*/
void
-NIImportAffixes(IspellDict *Conf, const char *filename)
+NIImportAffixes(IspellDictBuild *ConfBuild, const char *filename)
{
char *pstr = NULL;
char flag[BUFSIZ];
@@ -1428,9 +1643,9 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
errmsg("could not open affix file \"%s\": %m",
filename)));
- Conf->usecompound = false;
- Conf->useFlagAliases = false;
- Conf->flagMode = FM_CHAR;
+ ConfBuild->dict->usecompound = false;
+ ConfBuild->dict->useFlagAliases = false;
+ ConfBuild->dict->flagMode = FM_CHAR;
while ((recoded = tsearch_readline(&trst)) != NULL)
{
@@ -1452,10 +1667,8 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
s += pg_mblen(s);
if (*s && pg_mblen(s) == 1)
- {
- addCompoundAffixFlagValue(Conf, s, FF_COMPOUNDFLAG);
- Conf->usecompound = true;
- }
+ addCompoundAffixFlagValue(ConfBuild, s, FF_COMPOUNDFLAG);
+
oldformat = true;
goto nextline;
}
@@ -1528,7 +1741,8 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
if (!parse_affentry(pstr, mask, find, repl))
goto nextline;
- NIAddAffix(Conf, flag, flagflags, mask, find, repl, suffixes ? FF_SUFFIX : FF_PREFIX);
+ NIAddAffix(ConfBuild, flag, flagflags, mask, find, repl,
+ suffixes ? FF_SUFFIX : FF_PREFIX);
nextline:
pfree(recoded);
@@ -1547,53 +1761,48 @@ isnewformat:
errmsg("affix file contains both old-style and new-style commands")));
tsearch_readline_end(&trst);
- NIImportOOAffixes(Conf, filename);
+ NIImportOOAffixes(ConfBuild, filename);
}
/*
* Merges two affix flag sets and stores a new affix flag set into
- * Conf->AffixData.
+ * ConfBuild->AffixData.
*
* Returns index of a new affix flag set.
*/
static int
-MergeAffix(IspellDict *Conf, int a1, int a2)
+MergeAffix(IspellDictBuild *ConfBuild, int a1, int a2)
{
- char **ptr;
+ char *ptr;
+ uint32 len;
/* Do not merge affix flags if one of affix flags is empty */
- if (*Conf->AffixData[a1] == '\0')
+ if (*AffixDataGet(ConfBuild, a1) == '\0')
return a2;
- else if (*Conf->AffixData[a2] == '\0')
+ else if (*AffixDataGet(ConfBuild, a2) == '\0')
return a1;
- while (Conf->nAffixData + 1 >= Conf->lenAffixData)
- {
- Conf->lenAffixData *= 2;
- Conf->AffixData = (char **) repalloc(Conf->AffixData,
- sizeof(char *) * Conf->lenAffixData);
- }
-
- ptr = Conf->AffixData + Conf->nAffixData;
- if (Conf->flagMode == FM_NUM)
+ if (ConfBuild->dict->flagMode == FM_NUM)
{
- *ptr = cpalloc(strlen(Conf->AffixData[a1]) +
- strlen(Conf->AffixData[a2]) +
- 1 /* comma */ + 1 /* \0 */ );
- sprintf(*ptr, "%s,%s", Conf->AffixData[a1], Conf->AffixData[a2]);
+ len = strlen(AffixDataGet(ConfBuild, a1)) + 1 /* comma */ +
+ strlen(AffixDataGet(ConfBuild, a2));
+ ptr = tmpalloc(len + 1 /* \0 */);
+ sprintf(ptr, "%s,%s", AffixDataGet(ConfBuild, a1),
+ AffixDataGet(ConfBuild, a2));
}
else
{
- *ptr = cpalloc(strlen(Conf->AffixData[a1]) +
- strlen(Conf->AffixData[a2]) +
- 1 /* \0 */ );
- sprintf(*ptr, "%s%s", Conf->AffixData[a1], Conf->AffixData[a2]);
+ len = strlen(AffixDataGet(ConfBuild, a1)) +
+ strlen(AffixDataGet(ConfBuild, a2));
+ ptr = tmpalloc(len + 1 /* \0 */ );
+ sprintf(ptr, "%s%s", AffixDataGet(ConfBuild, a1),
+ AffixDataGet(ConfBuild, a2));
}
- ptr++;
- *ptr = NULL;
- Conf->nAffixData++;
- return Conf->nAffixData - 1;
+ AddAffixSet(ConfBuild, ptr, len);
+ pfree(ptr);
+
+ return ConfBuild->nAffixData - 1;
}
/*
@@ -1601,66 +1810,87 @@ MergeAffix(IspellDict *Conf, int a1, int a2)
* flags with the given index.
*/
static uint32
-makeCompoundFlags(IspellDict *Conf, int affix)
+makeCompoundFlags(IspellDictBuild *ConfBuild, int affix)
{
- char *str = Conf->AffixData[affix];
+ char *str = AffixDataGet(ConfBuild, affix);
- return (getCompoundAffixFlagValue(Conf, str) & FF_COMPOUNDFLAGMASK);
+ return (getCompoundAffixFlagValue(ConfBuild, str) & FF_COMPOUNDFLAGMASK);
}
/*
* Makes a prefix tree for the given level.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* low: lower index of the Conf->Spell array.
* high: upper index of the Conf->Spell array.
* level: current prefix tree level.
+ *
+ * Returns the offset of the new SPNode within ConfBuild->DictNodes.
*/
-static SPNode *
-mkSPNode(IspellDict *Conf, int low, int high, int level)
+static uint32
+mkSPNode(IspellDictBuild *ConfBuild, int low, int high, int level)
{
int i;
int nchar = 0;
char lastchar = '\0';
+ uint32 rs_offset,
+ new_offset;
SPNode *rs;
SPNodeData *data;
+ int data_index = 0;
int lownew = low;
for (i = low; i < high; i++)
- if (Conf->Spell[i]->p.d.len > level && lastchar != Conf->Spell[i]->word[level])
+ if (ConfBuild->Spell[i]->p.d.len > level &&
+ lastchar != ConfBuild->Spell[i]->word[level])
{
nchar++;
- lastchar = Conf->Spell[i]->word[level];
+ lastchar = ConfBuild->Spell[i]->word[level];
}
if (!nchar)
- return NULL;
+ return ISPELL_INVALID_OFFSET;
- rs = (SPNode *) cpalloc0(SPNHDRSZ + nchar * sizeof(SPNodeData));
- rs->length = nchar;
+ rs_offset = AllocateSPNode(ConfBuild, nchar);
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
data = rs->data;
lastchar = '\0';
for (i = low; i < high; i++)
- if (Conf->Spell[i]->p.d.len > level)
+ if (ConfBuild->Spell[i]->p.d.len > level)
{
- if (lastchar != Conf->Spell[i]->word[level])
+ if (lastchar != ConfBuild->Spell[i]->word[level])
{
if (lastchar)
{
/* Next level of the prefix tree */
- data->node = mkSPNode(Conf, lownew, i, level + 1);
+ new_offset = mkSPNode(ConfBuild, lownew, i, level + 1);
+
+ /*
+ * ConfBuild->DictNodes can be repalloc'ed within
+ * mkSPNode(), so reinitialize pointers.
+ */
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
+
+ /* First save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ /* Work with next node */
+ data_index++;
+ Assert(data_index < nchar);
+ data = &(rs->data[data_index]);
+
lownew = i;
- data++;
}
- lastchar = Conf->Spell[i]->word[level];
+ lastchar = ConfBuild->Spell[i]->word[level];
}
- data->val = ((uint8 *) (Conf->Spell[i]->word))[level];
- if (Conf->Spell[i]->p.d.len == level + 1)
+ data->val = ((uint8 *) (ConfBuild->Spell[i]->word))[level];
+ if (ConfBuild->Spell[i]->p.d.len == level + 1)
{
bool clearCompoundOnly = false;
- if (data->isword && data->affix != Conf->Spell[i]->p.d.affix)
+ if (data->isword && data->affix != ConfBuild->Spell[i]->p.d.affix)
{
/*
* MergeAffix called a few times. If one of word is
@@ -1669,15 +1899,17 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
*/
clearCompoundOnly = (FF_COMPOUNDONLY & data->compoundflag
- & makeCompoundFlags(Conf, Conf->Spell[i]->p.d.affix))
+ & makeCompoundFlags(ConfBuild,
+ ConfBuild->Spell[i]->p.d.affix))
? false : true;
- data->affix = MergeAffix(Conf, data->affix, Conf->Spell[i]->p.d.affix);
+ data->affix = MergeAffix(ConfBuild, data->affix,
+ ConfBuild->Spell[i]->p.d.affix);
}
else
- data->affix = Conf->Spell[i]->p.d.affix;
+ data->affix = ConfBuild->Spell[i]->p.d.affix;
data->isword = 1;
- data->compoundflag = makeCompoundFlags(Conf, data->affix);
+ data->compoundflag = makeCompoundFlags(ConfBuild, data->affix);
if ((data->compoundflag & FF_COMPOUNDONLY) &&
(data->compoundflag & FF_COMPOUNDFLAG) == 0)
@@ -1689,9 +1921,19 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
}
/* Next level of the prefix tree */
- data->node = mkSPNode(Conf, lownew, high, level + 1);
+ new_offset = mkSPNode(ConfBuild, lownew, high, level + 1);
+
+ /*
+ * ConfBuild->DictNodes can be repalloc'ed within mkSPNode(), so
+ * reinitialize pointers.
+ */
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
- return rs;
+ /* Save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ return rs_offset;
}
/*
@@ -1699,7 +1941,7 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
* and affixes.
*/
void
-NISortDictionary(IspellDict *Conf)
+NISortDictionary(IspellDictBuild *ConfBuild)
{
int i;
int naffix = 0;
@@ -1708,81 +1950,81 @@ NISortDictionary(IspellDict *Conf)
/* compress affixes */
/*
- * If we use flag aliases then we need to use Conf->AffixData filled in
+ * If we use flag aliases then we need to use ConfBuild->AffixData filled in
* the NIImportOOAffixes().
*/
- if (Conf->useFlagAliases)
+ if (ConfBuild->dict->useFlagAliases)
{
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
char *end;
- if (*Conf->Spell[i]->p.flag != '\0')
+ if (*ConfBuild->Spell[i]->p.flag != '\0')
{
- curaffix = strtol(Conf->Spell[i]->p.flag, &end, 10);
- if (Conf->Spell[i]->p.flag == end || errno == ERANGE)
+ curaffix = strtol(ConfBuild->Spell[i]->p.flag, &end, 10);
+ if (ConfBuild->Spell[i]->p.flag == end || errno == ERANGE)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix alias \"%s\"",
- Conf->Spell[i]->p.flag)));
+ ConfBuild->Spell[i]->p.flag)));
}
else
{
/*
- * If Conf->Spell[i]->p.flag is empty, then get empty value of
- * Conf->AffixData (0 index).
+ * If ConfBuild->Spell[i]->p.flag is empty, then get empty
+ * value of ConfBuild->AffixData (0 index).
*/
curaffix = 0;
}
- Conf->Spell[i]->p.d.affix = curaffix;
- Conf->Spell[i]->p.d.len = strlen(Conf->Spell[i]->word);
+ ConfBuild->Spell[i]->p.d.affix = curaffix;
+ ConfBuild->Spell[i]->p.d.len = strlen(ConfBuild->Spell[i]->word);
}
}
- /* Otherwise fill Conf->AffixData here */
+ /* Otherwise fill ConfBuild->AffixData here */
else
{
/* Count the number of different flags used in the dictionary */
- qsort((void *) Conf->Spell, Conf->nspell, sizeof(SPELL *),
+ qsort((void *) ConfBuild->Spell, ConfBuild->nSpell, sizeof(SPELL *),
cmpspellaffix);
naffix = 0;
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
if (i == 0
- || strcmp(Conf->Spell[i]->p.flag, Conf->Spell[i - 1]->p.flag))
+ || strcmp(ConfBuild->Spell[i]->p.flag,
+ ConfBuild->Spell[i - 1]->p.flag))
naffix++;
}
/*
- * Fill in Conf->AffixData with the affixes that were used in the
- * dictionary. Replace textual flag-field of Conf->Spell entries with
- * indexes into Conf->AffixData array.
+ * Fill in AffixData with the affixes that were used in the
+ * dictionary. Replace textual flag-field of ConfBuild->Spell entries
+ * with indexes into ConfBuild->AffixData array.
*/
- Conf->AffixData = (char **) palloc0(naffix * sizeof(char *));
+ InitAffixData(ConfBuild, naffix);
curaffix = -1;
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
if (i == 0
- || strcmp(Conf->Spell[i]->p.flag, Conf->AffixData[curaffix]))
+ || strcmp(ConfBuild->Spell[i]->p.flag,
+ AffixDataGet(ConfBuild, curaffix)))
{
curaffix++;
Assert(curaffix < naffix);
- Conf->AffixData[curaffix] = cpstrdup(Conf,
- Conf->Spell[i]->p.flag);
+ AddAffixSet(ConfBuild, ConfBuild->Spell[i]->p.flag,
+ strlen(ConfBuild->Spell[i]->p.flag));
}
- Conf->Spell[i]->p.d.affix = curaffix;
- Conf->Spell[i]->p.d.len = strlen(Conf->Spell[i]->word);
+ ConfBuild->Spell[i]->p.d.affix = curaffix;
+ ConfBuild->Spell[i]->p.d.len = strlen(ConfBuild->Spell[i]->word);
}
-
- Conf->lenAffixData = Conf->nAffixData = naffix;
}
/* Start build a prefix tree */
- qsort((void *) Conf->Spell, Conf->nspell, sizeof(SPELL *), cmpspell);
- Conf->Dictionary = mkSPNode(Conf, 0, Conf->nspell, 0);
+ qsort((void *) ConfBuild->Spell, ConfBuild->nSpell, sizeof(SPELL *), cmpspell);
+ mkSPNode(ConfBuild, 0, ConfBuild->nSpell, 0);
}
/*
@@ -1790,83 +2032,104 @@ NISortDictionary(IspellDict *Conf)
* rule. Affixes with empty replace string do not include in the prefix tree.
* This affixes are included by mkVoidAffix().
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* low: lower index of the Conf->Affix array.
* high: upper index of the Conf->Affix array.
* level: current prefix tree level.
* type: FF_SUFFIX or FF_PREFIX.
+ *
+ * Returns the offset of the new node within the prefix or suffix nodes array.
*/
-static AffixNode *
-mkANode(IspellDict *Conf, int low, int high, int level, int type)
+static uint32
+mkANode(IspellDictBuild *ConfBuild, int low, int high, int level, int type)
{
int i;
int nchar = 0;
uint8 lastchar = '\0';
+ NodeArray *array;
+ uint32 rs_offset,
+ new_offset;
AffixNode *rs;
AffixNodeData *data;
+ int data_index = 0;
int lownew = low;
- int naff;
- AFFIX **aff;
for (i = low; i < high; i++)
- if (Conf->Affix[i].replen > level && lastchar != GETCHAR(Conf->Affix + i, level, type))
+ if (ConfBuild->Affix[i]->replen > level &&
+ lastchar != GETCHAR(ConfBuild->Affix[i], level, type))
{
nchar++;
- lastchar = GETCHAR(Conf->Affix + i, level, type);
+ lastchar = GETCHAR(ConfBuild->Affix[i], level, type);
}
if (!nchar)
- return NULL;
+ return ISPELL_INVALID_OFFSET;
- aff = (AFFIX **) tmpalloc(sizeof(AFFIX *) * (high - low + 1));
- naff = 0;
+ if (type == FF_SUFFIX)
+ array = &ConfBuild->SuffixNodes;
+ else
+ array = &ConfBuild->PrefixNodes;
- rs = (AffixNode *) cpalloc0(ANHRDSZ + nchar * sizeof(AffixNodeData));
- rs->length = nchar;
- data = rs->data;
+ rs_offset = AllocateAffixNode(ConfBuild, array, nchar);
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
+ data = (AffixNodeData *) rs->data;
lastchar = '\0';
for (i = low; i < high; i++)
- if (Conf->Affix[i].replen > level)
+ if (ConfBuild->Affix[i]->replen > level)
{
- if (lastchar != GETCHAR(Conf->Affix + i, level, type))
+ if (lastchar != GETCHAR(ConfBuild->Affix[i], level, type))
{
if (lastchar)
{
/* Next level of the prefix tree */
- data->node = mkANode(Conf, lownew, i, level + 1, type);
- if (naff)
- {
- data->naff = naff;
- data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * naff);
- memcpy(data->aff, aff, sizeof(AFFIX *) * naff);
- naff = 0;
- }
- data++;
+ new_offset = mkANode(ConfBuild, lownew, i, level + 1, type);
+
+ /*
+ * array can be repalloc'ed within mkANode(), so
+ * reinitialize pointers.
+ */
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
+
+ /* First save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ /* Handle next data node */
+ data_index++;
+ Assert(data_index < nchar);
+ data = &(rs->data[data_index]);
+
lownew = i;
}
- lastchar = GETCHAR(Conf->Affix + i, level, type);
+ lastchar = GETCHAR(ConfBuild->Affix[i], level, type);
}
- data->val = GETCHAR(Conf->Affix + i, level, type);
- if (Conf->Affix[i].replen == level + 1)
+ data->val = GETCHAR(ConfBuild->Affix[i], level, type);
+ if (ConfBuild->Affix[i]->replen == level + 1)
{ /* affix stopped */
- aff[naff++] = Conf->Affix + i;
+ if (data->affstart == ISPELL_INVALID_INDEX)
+ {
+ data->affstart = i;
+ data->affend = i;
+ }
+ else
+ data->affend = i;
}
}
/* Next level of the prefix tree */
- data->node = mkANode(Conf, lownew, high, level + 1, type);
- if (naff)
- {
- data->naff = naff;
- data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * naff);
- memcpy(data->aff, aff, sizeof(AFFIX *) * naff);
- naff = 0;
- }
+ new_offset = mkANode(ConfBuild, lownew, high, level + 1, type);
+
+ /*
+ * array can be repalloc'ed within mkANode(), so reinitialize pointers.
+ */
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
- pfree(aff);
+ /* Save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
- return rs;
+ return rs_offset;
}
/*
@@ -1874,137 +2137,154 @@ mkANode(IspellDict *Conf, int low, int high, int level, int type)
* for affixes which have empty replace string ("repl" field).
*/
static void
-mkVoidAffix(IspellDict *Conf, bool issuffix, int startsuffix)
+mkVoidAffix(IspellDictBuild *ConfBuild, bool issuffix, int startsuffix)
{
- int i,
- cnt = 0;
+ int i;
int start = (issuffix) ? startsuffix : 0;
- int end = (issuffix) ? Conf->naffixes : startsuffix;
- AffixNode *Affix = (AffixNode *) palloc0(ANHRDSZ + sizeof(AffixNodeData));
-
- Affix->length = 1;
- Affix->isvoid = 1;
+ int end = (issuffix) ? ConfBuild->nAffix : startsuffix;
+ uint32 node_offset;
+ NodeArray *array;
+ AffixNode *Affix;
+ AffixNodeData *AffixData;
if (issuffix)
- {
- Affix->data->node = Conf->Suffix;
- Conf->Suffix = Affix;
- }
+ array = &ConfBuild->SuffixNodes;
else
- {
- Affix->data->node = Conf->Prefix;
- Conf->Prefix = Affix;
- }
+ array = &ConfBuild->PrefixNodes;
- /* Count affixes with empty replace string */
- for (i = start; i < end; i++)
- if (Conf->Affix[i].replen == 0)
- cnt++;
-
- /* There is not affixes with empty replace string */
- if (cnt == 0)
- return;
+ node_offset = AllocateAffixNode(ConfBuild, array, 1);
+ Affix = (AffixNode *) NodeArrayGet(array, node_offset);
- Affix->data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * cnt);
- Affix->data->naff = (uint32) cnt;
+ Affix->isvoid = 1;
+ AffixData = (AffixNodeData *) Affix->data;
- cnt = 0;
for (i = start; i < end; i++)
- if (Conf->Affix[i].replen == 0)
+ if (ConfBuild->Affix[i]->replen == 0)
{
- Affix->data->aff[cnt] = Conf->Affix + i;
- cnt++;
+ if (AffixData->affstart == ISPELL_INVALID_INDEX)
+ {
+ AffixData->affstart = i;
+ AffixData->affend = i;
+ }
+ else
+ AffixData->affend = i;
}
}
/*
- * Checks if the affixflag is used by dictionary. Conf->AffixData does not
+ * Checks if the affixflag is used by dictionary. AffixData does not
* contain affixflag if this flag is not used actually by the .dict file.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* affixflag: affix flag.
*
- * Returns true if the Conf->AffixData array contains affixflag, otherwise
+ * Returns true if the ConfBuild->AffixData array contains affixflag, otherwise
* returns false.
*/
static bool
-isAffixInUse(IspellDict *Conf, char *affixflag)
+isAffixInUse(IspellDictBuild *ConfBuild, char *affixflag)
{
int i;
- for (i = 0; i < Conf->nAffixData; i++)
- if (IsAffixFlagInUse(Conf, i, affixflag))
+ for (i = 0; i < ConfBuild->nAffixData; i++)
+ if (IsAffixFlagInUse(ConfBuild->dict->flagMode,
+ AffixDataGet(ConfBuild, i), affixflag))
return true;
return false;
}
/*
- * Builds Conf->Prefix and Conf->Suffix trees from the imported affixes.
+ * Builds Prefix and Suffix trees from the imported affixes.
*/
void
-NISortAffixes(IspellDict *Conf)
+NISortAffixes(IspellDictBuild *ConfBuild)
{
AFFIX *Affix;
+ AffixNode *voidPrefix,
+ *voidSuffix;
size_t i;
CMPDAffix *ptr;
- int firstsuffix = Conf->naffixes;
+ int firstsuffix = ConfBuild->nAffix;
+ uint32 prefix_offset,
+ suffix_offset;
- if (Conf->naffixes == 0)
+ if (ConfBuild->nAffix == 0)
return;
/* Store compound affixes in the Conf->CompoundAffix array */
- if (Conf->naffixes > 1)
- qsort((void *) Conf->Affix, Conf->naffixes, sizeof(AFFIX), cmpaffix);
- Conf->CompoundAffix = ptr = (CMPDAffix *) palloc(sizeof(CMPDAffix) * Conf->naffixes);
- ptr->affix = NULL;
-
- for (i = 0; i < Conf->naffixes; i++)
+ if (ConfBuild->nAffix > 1)
+ qsort((void *) ConfBuild->Affix, ConfBuild->nAffix,
+ sizeof(AFFIX *), cmpaffix);
+ ConfBuild->nCompoundAffix = ConfBuild->nAffix;
+ ConfBuild->CompoundAffix = ptr =
+ (CMPDAffix *) tmpalloc(sizeof(CMPDAffix) * ConfBuild->nCompoundAffix);
+ ptr->affix = ISPELL_INVALID_INDEX;
+
+ for (i = 0; i < ConfBuild->nAffix; i++)
{
- Affix = &(((AFFIX *) Conf->Affix)[i]);
+ Affix = ConfBuild->Affix[i];
if (Affix->type == FF_SUFFIX && i < firstsuffix)
firstsuffix = i;
if ((Affix->flagflags & FF_COMPOUNDFLAG) && Affix->replen > 0 &&
- isAffixInUse(Conf, Affix->flag))
+ isAffixInUse(ConfBuild, AffixFieldFlag(Affix)))
{
- if (ptr == Conf->CompoundAffix ||
+ if (ptr == ConfBuild->CompoundAffix ||
ptr->issuffix != (ptr - 1)->issuffix ||
- strbncmp((const unsigned char *) (ptr - 1)->affix,
- (const unsigned char *) Affix->repl,
+ strbncmp((const unsigned char *) AffixFieldRepl(ConfBuild->Affix[(ptr - 1)->affix]),
+ (const unsigned char *) AffixFieldRepl(Affix),
(ptr - 1)->len))
{
/* leave only unique and minimals suffixes */
- ptr->affix = Affix->repl;
+ ptr->affix = i;
ptr->len = Affix->replen;
ptr->issuffix = (Affix->type == FF_SUFFIX);
ptr++;
}
}
}
- ptr->affix = NULL;
- Conf->CompoundAffix = (CMPDAffix *) repalloc(Conf->CompoundAffix, sizeof(CMPDAffix) * (ptr - Conf->CompoundAffix + 1));
+ ptr->affix = ISPELL_INVALID_INDEX;
+ ConfBuild->nCompoundAffix = ptr - ConfBuild->CompoundAffix + 1;
+ ConfBuild->CompoundAffix = (CMPDAffix *) repalloc(ConfBuild->CompoundAffix,
+ sizeof(CMPDAffix) * (ConfBuild->nCompoundAffix));
/* Start build a prefix tree */
- Conf->Prefix = mkANode(Conf, 0, firstsuffix, 0, FF_PREFIX);
- Conf->Suffix = mkANode(Conf, firstsuffix, Conf->naffixes, 0, FF_SUFFIX);
- mkVoidAffix(Conf, true, firstsuffix);
- mkVoidAffix(Conf, false, firstsuffix);
+ mkVoidAffix(ConfBuild, true, firstsuffix);
+ mkVoidAffix(ConfBuild, false, firstsuffix);
+
+ prefix_offset = mkANode(ConfBuild, 0, firstsuffix, 0, FF_PREFIX);
+ suffix_offset = mkANode(ConfBuild, firstsuffix, ConfBuild->nAffix, 0,
+ FF_SUFFIX);
+
+ /* Point the void-affix nodes (offset 0) at the roots of the new trees */
+ voidPrefix = (AffixNode *) NodeArrayGet(&ConfBuild->PrefixNodes, 0);
+ voidPrefix->data[0].node_offset = prefix_offset;
+
+ voidSuffix = (AffixNode *) NodeArrayGet(&ConfBuild->SuffixNodes, 0);
+ voidSuffix->data[0].node_offset = suffix_offset;
}
static AffixNodeData *
-FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
+FindAffixes(IspellDictData *dict, AffixNode *node, const char *word, int wrdlen,
+ int *level, int type)
{
+ AffixNode *node_start;
AffixNodeData *StopLow,
*StopHigh,
*StopMiddle;
uint8 symbol;
+ if (type == FF_PREFIX)
+ node_start = (AffixNode *) DictPrefixNodes(dict);
+ else
+ node_start = (AffixNode *) DictSuffixNodes(dict);
+
if (node->isvoid)
{ /* search void affixes */
- if (node->data->naff)
+ if (node->data->affstart != ISPELL_INVALID_INDEX)
return node->data;
- node = node->data->node;
+ node = (AffixNode *) DictNodeGet(node_start, node->data->node_offset);
}
while (node && *level < wrdlen)
@@ -2019,9 +2299,10 @@ FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
if (StopMiddle->val == symbol)
{
(*level)++;
- if (StopMiddle->naff)
+ if (StopMiddle->affstart != ISPELL_INVALID_INDEX)
return StopMiddle;
- node = StopMiddle->node;
+ node = (AffixNode *) DictNodeGet(node_start,
+ StopMiddle->node_offset);
break;
}
else if (StopMiddle->val < symbol)
@@ -2035,8 +2316,67 @@ FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
return NULL;
}
+/*
+ * Compile regular expression on first use and store it within reg.
+ */
+static void
+CompileAffixReg(AffixReg *reg, bool isregis, int type,
+ const char *mask, int masklen, MemoryContext dictCtx)
+{
+ MemoryContext oldcontext;
+
+ Assert(dictCtx);
+
+ /*
+ * Switch to the dictionary's memory context so that the compiled expression
+ * outlives this call and can be reused by later queries.
+ */
+ oldcontext = MemoryContextSwitchTo(dictCtx);
+
+ if (isregis)
+ RS_compile(&reg->r.regis, (type == FF_SUFFIX), mask);
+ else
+ {
+ int wmasklen;
+ int err;
+ pg_wchar *wmask;
+ char *tmask;
+
+ tmask = (char *) palloc(masklen + 3);
+ if (type == FF_SUFFIX)
+ sprintf(tmask, "%s$", mask);
+ else
+ sprintf(tmask, "^%s", mask);
+
+ masklen = strlen(tmask);
+ wmask = (pg_wchar *) palloc((masklen + 1) * sizeof(pg_wchar));
+ wmasklen = pg_mb2wchar_with_len(tmask, wmask, masklen);
+
+ err = pg_regcomp(&(reg->r.regex), wmask, wmasklen,
+ REG_ADVANCED | REG_NOSUB,
+ DEFAULT_COLLATION_OID);
+ if (err)
+ {
+ char errstr[100];
+
+ pg_regerror(err, &(reg->r.regex), errstr, sizeof(errstr));
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
+ errmsg("invalid regular expression: %s", errstr)));
+ }
+
+ pfree(wmask);
+ pfree(tmask);
+ }
+
+ reg->iscompiled = true;
+
+ MemoryContextSwitchTo(oldcontext);
+}
+
static char *
-CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *newword, int *baselen)
+CheckAffix(const char *word, size_t len, AFFIX *Affix, AffixReg *reg,
+ int flagflags, char *newword, int *baselen, MemoryContext dictCtx)
{
/*
* Check compound allow flags
@@ -2076,7 +2416,7 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
if (Affix->type == FF_SUFFIX)
{
strcpy(newword, word);
- strcpy(newword + len - Affix->replen, Affix->find);
+ strcpy(newword + len - Affix->replen, AffixFieldFind(Affix));
if (baselen) /* store length of non-changed part of word */
*baselen = len - Affix->replen;
}
@@ -2086,9 +2426,9 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
* if prefix is an all non-changed part's length then all word
* contains only prefix and suffix, so out
*/
- if (baselen && *baselen + strlen(Affix->find) <= Affix->replen)
+ if (baselen && *baselen + Affix->findlen <= Affix->replen)
return NULL;
- strcpy(newword, Affix->find);
+ strcpy(newword, AffixFieldFind(Affix));
strcat(newword, word + Affix->replen);
}
@@ -2099,7 +2439,12 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
return newword;
else if (Affix->isregis)
{
- if (RS_execute(&(Affix->reg.regis), newword))
+ /* Compile the regular expression on first use */
+ if (!reg->iscompiled)
+ CompileAffixReg(reg, Affix->isregis, Affix->type,
+ AffixFieldMask(Affix), Affix->masklen, dictCtx);
+
+ if (RS_execute(&(reg->r.regis), newword))
return newword;
}
else
@@ -2109,12 +2454,17 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
size_t data_len;
int newword_len;
+ /* Compile the regular expression on first use */
+ if (!reg->iscompiled)
+ CompileAffixReg(reg, Affix->isregis, Affix->type,
+ AffixFieldMask(Affix), Affix->masklen, dictCtx);
+
/* Convert data string to wide characters */
newword_len = strlen(newword);
data = (pg_wchar *) palloc((newword_len + 1) * sizeof(pg_wchar));
data_len = pg_mb2wchar_with_len(newword, data, newword_len);
- if (!(err = pg_regexec(&(Affix->reg.regex), data, data_len, 0, NULL, 0, NULL, 0)))
+ if (!(err = pg_regexec(&(reg->r.regex), data, data_len, 0, NULL, 0, NULL, 0)))
{
pfree(data);
return newword;
@@ -2153,7 +2503,7 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
char **cur;
char newword[2 * MAXNORMLEN] = "";
char pnewword[2 * MAXNORMLEN] = "";
- AffixNode *snode = Conf->Suffix,
+ AffixNode *snode = (AffixNode *) DictSuffixNodes(Conf->dict),
*pnode;
int i,
j;
@@ -2165,7 +2515,7 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
/* Check that the word itself is normal form */
- if (FindWord(Conf, word, VoidString, flag))
+ if (FindWord(Conf->dict, word, VoidString, flag))
{
*cur = pstrdup(word);
cur++;
@@ -2173,23 +2523,29 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
}
/* Find all other NORMAL forms of the 'word' (check only prefix) */
- pnode = Conf->Prefix;
+ pnode = (AffixNode *) DictPrefixNodes(Conf->dict);
plevel = 0;
while (pnode)
{
- prefix = FindAffixes(pnode, word, wrdlen, &plevel, FF_PREFIX);
+ prefix = FindAffixes(Conf->dict, pnode, word, wrdlen, &plevel, FF_PREFIX);
if (!prefix)
break;
- for (j = 0; j < prefix->naff; j++)
+ for (j = prefix->affstart; j <= prefix->affend; j++)
{
- if (CheckAffix(word, wrdlen, prefix->aff[j], flag, newword, NULL))
+ AFFIX *affix = (AFFIX *) DictAffixGet(Conf->dict, j);
+ AffixReg *reg = &(Conf->reg[j]);
+
+ if (affix &&
+ CheckAffix(word, wrdlen, affix, reg, flag, newword, NULL,
+ Conf->dictCtx))
{
/* prefix success */
- if (FindWord(Conf, newword, prefix->aff[j]->flag, flag))
+ if (FindWord(Conf->dict, newword, AffixFieldFlag(affix), flag))
cur += addToResult(forms, cur, newword);
}
}
- pnode = prefix->node;
+ pnode = (AffixNode *) DictNodeGet(DictPrefixNodes(Conf->dict),
+ prefix->node_offset);
}
/*
@@ -2201,45 +2557,59 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
int baselen = 0;
/* find possible suffix */
- suffix = FindAffixes(snode, word, wrdlen, &slevel, FF_SUFFIX);
+ suffix = FindAffixes(Conf->dict, snode, word, wrdlen, &slevel,
+ FF_SUFFIX);
if (!suffix)
break;
/* foreach suffix check affix */
- for (i = 0; i < suffix->naff; i++)
+ for (i = suffix->affstart; i <= suffix->affend; i++)
{
- if (CheckAffix(word, wrdlen, suffix->aff[i], flag, newword, &baselen))
+ AFFIX *sufentry = (AFFIX *) DictAffixGet(Conf->dict, i);
+ AffixReg *sufreg = &(Conf->reg[i]);
+
+ if (sufentry &&
+ CheckAffix(word, wrdlen, sufentry, sufreg, flag, newword, &baselen,
+ Conf->dictCtx))
{
/* suffix success */
- if (FindWord(Conf, newword, suffix->aff[i]->flag, flag))
+ if (FindWord(Conf->dict, newword, AffixFieldFlag(sufentry), flag))
cur += addToResult(forms, cur, newword);
/* now we will look changed word with prefixes */
- pnode = Conf->Prefix;
+ pnode = (AffixNode *) DictPrefixNodes(Conf->dict);
plevel = 0;
swrdlen = strlen(newword);
while (pnode)
{
- prefix = FindAffixes(pnode, newword, swrdlen, &plevel, FF_PREFIX);
+ prefix = FindAffixes(Conf->dict, pnode, newword, swrdlen,
+ &plevel, FF_PREFIX);
if (!prefix)
break;
- for (j = 0; j < prefix->naff; j++)
+ for (j = prefix->affstart; j <= prefix->affend; j++)
{
- if (CheckAffix(newword, swrdlen, prefix->aff[j], flag, pnewword, &baselen))
+ AFFIX *prefentry = (AFFIX *) DictAffixGet(Conf->dict, j);
+ AffixReg *prefreg = &(Conf->reg[j]);
+
+ if (prefentry &&
+ CheckAffix(newword, swrdlen, prefentry, prefreg,
+ flag, pnewword, &baselen, Conf->dictCtx))
{
/* prefix success */
- char *ff = (prefix->aff[j]->flagflags & suffix->aff[i]->flagflags & FF_CROSSPRODUCT) ?
- VoidString : prefix->aff[j]->flag;
+ char *ff = (prefentry->flagflags & sufentry->flagflags & FF_CROSSPRODUCT) ?
+ VoidString : AffixFieldFlag(prefentry);
- if (FindWord(Conf, pnewword, ff, flag))
+ if (FindWord(Conf->dict, pnewword, ff, flag))
cur += addToResult(forms, cur, pnewword);
}
}
- pnode = prefix->node;
+ pnode = (AffixNode *) DictNodeGet(DictPrefixNodes(Conf->dict),
+ prefix->node_offset);
}
}
}
- snode = suffix->node;
+ snode = (AffixNode *) DictNodeGet(DictSuffixNodes(Conf->dict),
+ suffix->node_offset);
}
if (cur == forms)
@@ -2259,7 +2629,8 @@ typedef struct SplitVar
} SplitVar;
static int
-CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
+CheckCompoundAffixes(IspellDictData *dict, CMPDAffix **ptr,
+ char *word, int len, bool CheckInPlace)
{
bool issuffix;
@@ -2269,9 +2640,12 @@ CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
if (CheckInPlace)
{
- while ((*ptr)->affix)
+ while ((*ptr)->affix != ISPELL_INVALID_INDEX)
{
- if (len > (*ptr)->len && strncmp((*ptr)->affix, word, (*ptr)->len) == 0)
+ AFFIX *affix = (AFFIX *) DictAffixGet(dict, (*ptr)->affix);
+
+ if (len > (*ptr)->len &&
+ strncmp(AffixFieldRepl(affix), word, (*ptr)->len) == 0)
{
len = (*ptr)->len;
issuffix = (*ptr)->issuffix;
@@ -2285,9 +2659,12 @@ CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
{
char *affbegin;
- while ((*ptr)->affix)
+ while ((*ptr)->affix != ISPELL_INVALID_INDEX)
{
- if (len > (*ptr)->len && (affbegin = strstr(word, (*ptr)->affix)) != NULL)
+ AFFIX *affix = (AFFIX *) DictAffixGet(dict, (*ptr)->affix);
+
+ if (len > (*ptr)->len &&
+ (affbegin = strstr(word, AffixFieldRepl(affix))) != NULL)
{
len = (*ptr)->len + (affbegin - word);
issuffix = (*ptr)->issuffix;
@@ -2339,13 +2716,14 @@ AddStem(SplitVar *v, char *word)
}
static SplitVar *
-SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int wordlen, int startpos, int minpos)
+SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig,
+ char *word, int wordlen, int startpos, int minpos)
{
SplitVar *var = NULL;
SPNodeData *StopLow,
*StopHigh,
*StopMiddle = NULL;
- SPNode *node = (snode) ? snode : Conf->Dictionary;
+ SPNode *node = (snode) ? snode : (SPNode *) DictDictNodes(Conf->dict);
int level = (snode) ? minpos : startpos; /* recursive
* minpos==level */
int lenaff;
@@ -2360,8 +2738,11 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
while (level < wordlen)
{
/* find word with epenthetic or/and compound affix */
- caff = Conf->CompoundAffix;
- while (level > startpos && (lenaff = CheckCompoundAffixes(&caff, word + level, wordlen - level, (node) ? true : false)) >= 0)
+ caff = (CMPDAffix *) DictCompoundAffix(Conf->dict);
+ while (level > startpos &&
+ (lenaff = CheckCompoundAffixes(Conf->dict, &caff,
+ word + level, wordlen - level,
+ (node) ? true : false)) >= 0)
{
/*
* there is one of compound affixes, so check word for existings
@@ -2408,7 +2789,8 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
while (ptr->next)
ptr = ptr->next;
- ptr->next = SplitToVariants(Conf, NULL, new, word, wordlen, startpos + lenaff, startpos + lenaff);
+ ptr->next = SplitToVariants(Conf, NULL, new, word, wordlen,
+ startpos + lenaff, startpos + lenaff);
pfree(new->stem);
pfree(new);
@@ -2467,13 +2849,14 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
/* we can find next word */
level++;
AddStem(var, pnstrdup(word + startpos, level - startpos));
- node = Conf->Dictionary;
+ node = (SPNode *) DictDictNodes(Conf->dict);
startpos = level;
continue;
}
}
}
- node = StopMiddle->node;
+ node = (SPNode *) DictNodeGet(DictDictNodes(Conf->dict),
+ StopMiddle->node_offset);
}
else
node = NULL;
@@ -2523,7 +2906,7 @@ NINormalizeWord(IspellDict *Conf, char *word)
pfree(res);
}
- if (Conf->usecompound)
+ if (Conf->dict->usecompound)
{
int wordlen = strlen(word);
SplitVar *ptr,
diff --git a/src/include/tsearch/dicts/spell.h b/src/include/tsearch/dicts/spell.h
index 210f97dda9..b40cf379eb 100644
--- a/src/include/tsearch/dicts/spell.h
+++ b/src/include/tsearch/dicts/spell.h
@@ -18,21 +18,23 @@
#include "tsearch/dicts/regis.h"
#include "tsearch/ts_public.h"
+#define ISPELL_INVALID_INDEX (0x7FFFF)
+#define ISPELL_INVALID_OFFSET (0xFFFFFFFF)
+
/*
* SPNode and SPNodeData are used to represent prefix tree (Trie) to store
* a words list.
*/
-struct SPNode;
-
typedef struct
{
uint32 val:8,
isword:1,
/* Stores compound flags listed below */
compoundflag:4,
- /* Reference to an entry of the AffixData field */
+ /* Index of an entry of the AffixData field */
affix:19;
- struct SPNode *node;
+ /* Offset to a node of the DictNodes field */
+ uint32 node_offset;
} SPNodeData;
/*
@@ -86,21 +88,55 @@ typedef struct spell_struct
*/
typedef struct aff_struct
{
- char *flag;
/* FF_SUFFIX or FF_PREFIX */
- uint32 type:1,
+ uint16 type:1,
flagflags:7,
issimple:1,
isregis:1,
- replen:14;
- char *find;
- char *repl;
+ flaglen:2;
+
+ /* 8 bytes could be too much for repl, find and mask, but who knows */
+ uint8 replen;
+ uint8 findlen;
+ uint8 masklen;
+
+ /*
+ * fields stores the following data (each ends with \0):
+ * - repl
+ * - find
+ * - mask
+ * - flag - one character (if FM_CHAR),
+ * two characters (if FM_LONG),
+ * number, >= 0 and < 65536 (if FM_NUM).
+ */
+ char fields[FLEXIBLE_ARRAY_MEMBER];
+} AFFIX;
+
+#define AF_FLAG_MAXSIZE 5 /* strlen(65536) */
+#define AF_REPL_MAXSIZE 255 /* 8 bytes */
+
+#define AFFIXHDRSZ (offsetof(AFFIX, fields))
+
+#define AffixFieldRepl(af) ((af)->fields)
+#define AffixFieldFind(af) ((af)->fields + (af)->replen + 1)
+#define AffixFieldMask(af) (AffixFieldFind(af) + (af)->findlen + 1)
+#define AffixFieldFlag(af) (AffixFieldMask(af) + (af)->masklen + 1)
+#define AffixGetSize(af) (AFFIXHDRSZ + (af)->replen + 1 + (af)->findlen + 1 \
+ + (af)->masklen + 1 + strlen(AffixFieldFlag(af)) + 1)
+
+/*
+ * Stores compiled regular expression of affix. AffixReg uses mask field of
+ * AFFIX as a regular expression.
+ */
+typedef struct AffixReg
+{
+ bool iscompiled;
union
{
regex_t regex;
Regis regis;
- } reg;
-} AFFIX;
+ } r;
+} AffixReg;
/*
* affixes use dictionary flags too
@@ -120,14 +156,13 @@ typedef struct aff_struct
* AffixNode and AffixNodeData are used to represent prefix tree (Trie) to store
* an affix list.
*/
-struct AffixNode;
-
typedef struct
{
- uint32 val:8,
- naff:24;
- AFFIX **aff;
- struct AffixNode *node;
+ uint8 val;
+ uint32 affstart;
+ uint32 affend;
+ /* Offset to a node of the PrefixNodes or SuffixNodes field */
+ uint32 node_offset;
} AffixNodeData;
typedef struct AffixNode
@@ -139,9 +174,19 @@ typedef struct AffixNode
#define ANHRDSZ (offsetof(AffixNode, data))
+typedef struct NodeArray
+{
+ char *Nodes;
+ uint32 NodesSize; /* allocated size of Nodes */
+ uint32 NodesEnd; /* end of data in Nodes */
+} NodeArray;
+
+#define NodeArrayGet(na, of) (((of) == ISPELL_INVALID_OFFSET) ? NULL : (na)->Nodes + (of))
+
typedef struct
{
- char *affix;
+ /* Index of an affix of the Affix field */
+ uint32 affix;
int len;
bool issuffix;
} CMPDAffix;
@@ -176,30 +221,71 @@ typedef struct CompoundAffixFlag
#define FLAGNUM_MAXSIZE (1 << 16)
-typedef struct
+typedef struct IspellDictData
{
- int maffixes;
- int naffixes;
- AFFIX *Affix;
-
- AffixNode *Suffix;
- AffixNode *Prefix;
+ FlagMode flagMode;
+ bool usecompound;
- SPNode *Dictionary;
- /* Array of sets of affixes */
- char **AffixData;
- int lenAffixData;
- int nAffixData;
bool useFlagAliases;
- CMPDAffix *CompoundAffix;
+ uint32 nAffixData;
+ uint32 AffixDataStart;
- bool usecompound;
- FlagMode flagMode;
+ uint32 AffixOffsetStart;
+ uint32 AffixStart;
+ uint32 nAffix;
+
+ uint32 DictNodesStart;
+ uint32 PrefixNodesStart;
+ uint32 SuffixNodesStart;
+
+ uint32 CompoundAffixStart;
/*
- * All follow fields are actually needed only for initialization
+ * data stores:
+ * - AffixData - array of affix sets
+ * - Affix - sorted array of affixes
+ * - DictNodes - prefix tree of a word list
+ * - PrefixNodes - prefix tree of a prefix list
+ * - SuffixNodes - prefix tree of a suffix list
+ * - CompoundAffix - array of compound affixes
*/
+ char data[FLEXIBLE_ARRAY_MEMBER];
+} IspellDictData;
+
+#define IspellDictDataHdrSize (offsetof(IspellDictData, data))
+
+#define DictAffixDataOffset(d) ((d)->data)
+#define DictAffixData(d) ((d)->data + (d)->AffixDataStart)
+#define DictAffixDataGet(d, i) (((i) == ISPELL_INVALID_INDEX) ? NULL : \
+ DictAffixData(d) + ((uint32 *) DictAffixDataOffset(d))[i])
+
+#define DictAffixOffset(d) ((d)->data + (d)->AffixOffsetStart)
+#define DictAffix(d) ((d)->data + (d)->AffixStart)
+#define DictAffixGet(d, i) (((i) == ISPELL_INVALID_INDEX) ? NULL : \
+ DictAffix(d) + ((uint32 *) DictAffixOffset(d))[i])
+
+#define DictDictNodes(d) ((d)->data + (d)->DictNodesStart)
+#define DictPrefixNodes(d) ((d)->data + (d)->PrefixNodesStart)
+#define DictSuffixNodes(d) ((d)->data + (d)->SuffixNodesStart)
+#define DictNodeGet(node_start, of) (((of) == ISPELL_INVALID_OFFSET) ? NULL : \
+ (char *) (node_start) + (of))
+
+#define DictCompoundAffix(d) ((d)->data + (d)->CompoundAffixStart)
+
+/*
+ * IspellDictBuild is used to initialize IspellDictData struct. This is a
+ * temporary structure which is set up by NIStartBuild() and released by
+ * NIFinishBuild().
+ */
+typedef struct IspellDictBuild
+{
+ MemoryContext buildCxt; /* temp context for construction */
+
+ IspellDictData *dict;
+ uint32 dict_size;
+
+ /* Temporary data */
/* Array of Hunspell options in affix file */
CompoundAffixFlag *CompoundAffixFlags;
@@ -208,29 +294,73 @@ typedef struct
/* allocated length of CompoundAffixFlags array */
int mCompoundAffixFlag;
- /*
- * Remaining fields are only used during dictionary construction; they are
- * set up by NIStartBuild and cleared by NIFinishBuild.
- */
- MemoryContext buildCxt; /* temp context for construction */
-
- /* Temporary array of all words in the dict file */
+ /* Array of all words in the dict file */
SPELL **Spell;
- int nspell; /* number of valid entries in Spell array */
- int mspell; /* allocated length of Spell array */
+ int nSpell; /* number of valid entries in Spell array */
+ int mSpell; /* allocated length of Spell array */
+
+ /* Array of all affixes in the aff file */
+ AFFIX **Affix;
+ int nAffix; /* number of valid entries in Affix array */
+ int mAffix; /* allocated length of Affix array */
+ uint32 AffixSize;
+
+ /* Data for IspellDictData */
+
+ /* Array of sets of affixes */
+ uint32 *AffixDataOffset;
+ int nAffixData; /* number of affix sets */
+ int mAffixData; /* allocated number of affix sets */
+ char *AffixData;
+ uint32 AffixDataSize; /* allocated size of AffixData */
+ uint32 AffixDataEnd; /* end of data in AffixData */
+
+ /* Prefix tree which stores a word list */
+ NodeArray DictNodes;
+
+ /* Prefix tree which stores a prefix list */
+ NodeArray PrefixNodes;
+
+ /* Prefix tree which stores a suffix list */
+ NodeArray SuffixNodes;
- /* These are used to allocate "compact" data without palloc overhead */
- char *firstfree; /* first free address (always maxaligned) */
- size_t avail; /* free space remaining at firstfree */
+ /* Array of compound affixes */
+ CMPDAffix *CompoundAffix;
+ int nCompoundAffix; /* number of entries of CompoundAffix */
+} IspellDictBuild;
+
+#define AffixDataGet(d, i) ((d)->AffixData + (d)->AffixDataOffset[i])
+
+/*
+ * IspellDict is used within NINormalizeWord.
+ */
+typedef struct IspellDict
+{
+ /*
+ * Pointer to a DSM location of IspellDictData. Should be retrieved on
+ * every dispell_lexize() call.
+ */
+ IspellDictData *dict;
+ /*
+ * Array of regular expression of affixes. Each regular expression is
+ * compiled only on demand.
+ */
+ AffixReg *reg;
+ /*
+ * Memory context for compiling regular expressions.
+ */
+ MemoryContext dictCtx;
} IspellDict;
extern TSLexeme *NINormalizeWord(IspellDict *Conf, char *word);
-extern void NIStartBuild(IspellDict *Conf);
-extern void NIImportAffixes(IspellDict *Conf, const char *filename);
-extern void NIImportDictionary(IspellDict *Conf, const char *filename);
-extern void NISortDictionary(IspellDict *Conf);
-extern void NISortAffixes(IspellDict *Conf);
-extern void NIFinishBuild(IspellDict *Conf);
+extern void NIStartBuild(IspellDictBuild *ConfBuild);
+extern void NIImportAffixes(IspellDictBuild *ConfBuild, const char *filename);
+extern void NIImportDictionary(IspellDictBuild *ConfBuild,
+ const char *filename);
+extern void NISortDictionary(IspellDictBuild *ConfBuild);
+extern void NISortAffixes(IspellDictBuild *ConfBuild);
+extern void NICopyData(IspellDictBuild *ConfBuild);
+extern void NIFinishBuild(IspellDictBuild *ConfBuild);
#endif
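The header changes above replace every `struct ... *node` pointer with a `uint32` offset into a flat buffer, so that the same bytes stay valid at whatever address a backend maps the segment. The idea can be sketched in isolation as follows. This is a minimal illustration, not code from the patch: the `Demo*` types and `demo_*` functions are made up, and the node layout (first-child / next-sibling links) is deliberately simpler than SPNode. `DEMO_INVALID_OFFSET` plays the role of `ISPELL_INVALID_OFFSET`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_INVALID_OFFSET 0xFFFFFFFFu   /* hypothetical stand-in for ISPELL_INVALID_OFFSET */

/* Trie node that links to other nodes by byte offset, never by pointer. */
typedef struct
{
    uint8_t     val;        /* character stored in this node */
    uint8_t     isword;     /* 1 if a word ends here */
    uint32_t    child;      /* offset of the first child, or DEMO_INVALID_OFFSET */
    uint32_t    sibling;    /* offset of the next sibling, or DEMO_INVALID_OFFSET */
} DemoNode;

/* Growable flat buffer that holds all nodes (compare NodeArray in the patch). */
typedef struct
{
    char       *base;
    uint32_t    size;       /* allocated size */
    uint32_t    end;        /* end of used data */
} DemoArena;

/* Turn an offset into a pointer relative to the given base address. */
static DemoNode *
demo_node_get(char *base, uint32_t off)
{
    return (off == DEMO_INVALID_OFFSET) ? NULL : (DemoNode *) (base + off);
}

/* Append a node and return its offset; the buffer may move while growing. */
static uint32_t
demo_alloc_node(DemoArena *a, uint8_t val)
{
    uint32_t    off = a->end;
    DemoNode   *n;

    if (a->end + sizeof(DemoNode) > a->size)
    {
        a->size = a->size ? a->size * 2 : 256;
        a->base = realloc(a->base, a->size);
        assert(a->base != NULL);
    }
    n = (DemoNode *) (a->base + off);
    n->val = val;
    n->isword = 0;
    n->child = DEMO_INVALID_OFFSET;
    n->sibling = DEMO_INVALID_OFFSET;
    a->end += (uint32_t) sizeof(DemoNode);
    return off;
}

/* Insert a word; the root node is expected at offset 0. */
static void
demo_insert(DemoArena *a, const char *word)
{
    uint32_t    cur = 0;

    for (; *word; word++)
    {
        uint32_t    off = demo_node_get(a->base, cur)->child;
        uint32_t    prev = DEMO_INVALID_OFFSET;

        while (off != DEMO_INVALID_OFFSET &&
               demo_node_get(a->base, off)->val != (uint8_t) *word)
        {
            prev = off;
            off = demo_node_get(a->base, off)->sibling;
        }
        if (off == DEMO_INVALID_OFFSET)
        {
            /* Offsets survive the realloc; pointers are re-derived afterwards. */
            off = demo_alloc_node(a, (uint8_t) *word);
            if (prev == DEMO_INVALID_OFFSET)
                demo_node_get(a->base, cur)->child = off;
            else
                demo_node_get(a->base, prev)->sibling = off;
        }
        cur = off;
    }
    demo_node_get(a->base, cur)->isword = 1;
}

/* Look a word up using only the base address of some copy of the buffer. */
static int
demo_find(char *base, const char *word)
{
    DemoNode   *node = demo_node_get(base, 0);

    for (; *word; word++)
    {
        DemoNode   *c = demo_node_get(base, node->child);

        while (c && c->val != (uint8_t) *word)
            c = demo_node_get(base, c->sibling);
        if (!c)
            return 0;
        node = c;
    }
    return node->isword;
}

/* Build a trie, copy the raw bytes elsewhere, and search only the copy. */
int
demo_roundtrip(void)
{
    DemoArena   a = {NULL, 0, 0};
    char       *copy;
    int         ok;

    demo_alloc_node(&a, 0);     /* root at offset 0 */
    demo_insert(&a, "sky");
    demo_insert(&a, "skies");
    demo_insert(&a, "book");

    copy = malloc(a.end);
    assert(copy != NULL);
    memcpy(copy, a.base, a.end);
    free(a.base);               /* the original address is gone */

    ok = demo_find(copy, "sky") && demo_find(copy, "skies") &&
        demo_find(copy, "book") && !demo_find(copy, "ski") &&
        !demo_find(copy, "boo");
    free(copy);
    return ok;
}
```

Because lookups need nothing but a base address, the copy made with `memcpy` behaves the same as the original buffer. A DSM segment attached at different addresses in different backends relies on the same property.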
0005-pg-ts-shared-dictinaries-view-v7.patch
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 30e6741305..fe7d31c057 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -8216,6 +8216,11 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
<entry>time zone names</entry>
</row>
+ <row>
+ <entry><link linkend="view-pg-ts-shared-dictionaries"><structname>pg_ts_shared_dictionaries</structname></link></entry>
+ <entry>dictionaries currently in shared memory</entry>
+ </row>
+
<row>
<entry><link linkend="view-pg-user"><structname>pg_user</structname></link></entry>
<entry>database users</entry>
@@ -10971,6 +10976,63 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</sect1>
+ <sect1 id="view-pg-ts-shared-dictionaries">
+ <title><structname>pg_ts_shared_dictionaries</structname></title>
+
+ <indexterm zone="view-pg-ts-shared-dictionaries">
+ <primary>pg_ts_shared_dictionaries</primary>
+ </indexterm>
+
+ <para>
+ The <structname>pg_ts_shared_dictionaries</structname> view provides a
+ listing of all text search dictionaries that are currently allocated in the
+ shared memory. The size of available space in shared memory is controlled by
+ <xref linkend="guc-shared-buffers"/>. A dictionary may have an option which
+ controls allocation in shared memory (see <xref linkend="textsearch-ispell-dictionary"/>).
+ </para>
+
+ <table>
+ <title><structname>pg_ts_shared_dictionaries</structname> Columns</title>
+
+ <tgroup cols="4">
+ <thead>
+ <row>
+ <entry>Name</entry>
+ <entry>Type</entry>
+ <entry>References</entry>
+ <entry>Description</entry>
+ </row>
+ </thead>
+ <tbody>
+ <row>
+ <entry><structfield>dictoid</structfield></entry>
+ <entry><type>oid</type></entry>
+ <entry><literal><link linkend="catalog-pg-ts-dict"><structname>pg_ts_dict</structname></link>.oid</literal></entry>
+ <entry>The OID of the text search dictionary located in shared memory</entry>
+ </row>
+ <row>
+ <entry><structfield>schemaname</structfield></entry>
+ <entry><type>name</type></entry>
+ <entry><literal><link linkend="catalog-pg-namespace"><structname>pg_namespace</structname></link>.nspname</literal></entry>
+ <entry>The name of schema containing the text search dictionary</entry>
+ </row>
+ <row>
+ <entry><structfield>dictname</structfield></entry>
+ <entry><type>name</type></entry>
+ <entry><literal><link linkend="catalog-pg-ts-dict"><structname>pg_ts_dict</structname></link>.dictname</literal></entry>
+ <entry>The text search dictionary name</entry>
+ </row>
+ <row>
+ <entry><structfield>size</structfield></entry>
+ <entry><type>bigint</type></entry>
+ <entry></entry>
+ <entry>Size of the text search dictionary in bytes</entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+ </sect1>
+
<sect1 id="view-pg-user">
<title><structname>pg_user</structname></title>
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index 82afe201f8..78ed082994 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -3045,6 +3045,12 @@ CREATE TEXT SEARCH DICTIONARY english_stem (
parameter value greater than zero before server starting.
</para>
+ <para>
+ The list of dictionaries currently located in shared memory can be retrieved from the
+ <link linkend="view-pg-ts-shared-dictionaries"><structname>pg_ts_shared_dictionaries</structname></link>
+ view.
+ </para>
+
</sect2>
</sect1>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 5e6e8a64f6..ab7ee973d9 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -506,6 +506,9 @@ CREATE VIEW pg_config AS
REVOKE ALL on pg_config FROM PUBLIC;
REVOKE EXECUTE ON FUNCTION pg_config() FROM PUBLIC;
+CREATE VIEW pg_ts_shared_dictionaries AS
+ SELECT * FROM pg_ts_shared_dictionaries();
+
-- Statistics views
CREATE VIEW pg_stat_all_tables AS
diff --git a/src/backend/tsearch/ts_shared.c b/src/backend/tsearch/ts_shared.c
index bfc52923e0..f28e0a09e3 100644
--- a/src/backend/tsearch/ts_shared.c
+++ b/src/backend/tsearch/ts_shared.c
@@ -13,11 +13,18 @@
*/
#include "postgres.h"
+#include "funcapi.h"
+#include "miscadmin.h"
+
+#include "access/htup_details.h"
+#include "catalog/pg_ts_dict.h"
#include "lib/dshash.h"
#include "storage/lwlock.h"
#include "storage/shmem.h"
#include "tsearch/ts_shared.h"
+#include "utils/builtins.h"
#include "utils/hashutils.h"
+#include "utils/lsyscache.h"
#include "utils/memutils.h"
@@ -365,3 +372,100 @@ recheck_table:
MemoryContextSwitchTo(old_context);
}
+
+/*
+ * pg_ts_shared_dictionaries - SQL SRF showing dictionaries currently in
+ * shared memory.
+ */
+Datum
+pg_ts_shared_dictionaries(PG_FUNCTION_ARGS)
+{
+ ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
+ MemoryContext oldcontext;
+ TupleDesc tupdesc;
+ Tuplestorestate *tupstore;
+ Relation rel;
+ HeapTuple tuple;
+ SysScanDesc scan;
+
+ /* check to see if caller supports us returning a tuplestore */
+ if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("set-valued function called in context that cannot accept a set")));
+ if (!(rsinfo->allowedModes & SFRM_Materialize))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("materialize mode required, but it is not " \
+ "allowed in this context")));
+
+ /* Build a tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ elog(ERROR, "return type must be a row type");
+
+ /* Build tuplestore to hold the result rows */
+ oldcontext = MemoryContextSwitchTo(rsinfo->econtext->ecxt_per_query_memory);
+
+ tupstore = tuplestore_begin_heap(true, false, work_mem);
+ rsinfo->returnMode = SFRM_Materialize;
+ rsinfo->setResult = tupstore;
+ rsinfo->setDesc = tupdesc;
+
+ MemoryContextSwitchTo(oldcontext);
+
+ init_dict_table();
+
+ /*
+ * If a hash table wasn't created return zero records.
+ */
+ if (!DsaPointerIsValid(tsearch_ctl->dict_table_handle))
+ {
+ tuplestore_donestoring(tupstore);
+
+ PG_RETURN_VOID();
+ }
+
+ /* Start to scan pg_ts_dict */
+ rel = heap_open(TSDictionaryRelationId, AccessShareLock);
+ scan = systable_beginscan(rel, InvalidOid, false, NULL, 0, NULL);
+
+ while (HeapTupleIsValid(tuple = systable_getnext(scan)))
+ {
+ Datum values[4];
+ bool nulls[4];
+ Form_pg_ts_dict dict = (Form_pg_ts_dict) GETSTRUCT(tuple);
+ Oid dictid = HeapTupleGetOid(tuple);
+ TsearchDictEntry *entry;
+ NameData dict_name;
+
+ /* Skip dictionaries that are not located in shared memory */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, &dictid, false);
+ if (!entry)
+ continue;
+
+ namecpy(&dict_name, &dict->dictname);
+
+ memset(nulls, 0, sizeof(nulls));
+
+ values[0] = ObjectIdGetDatum(dictid);
+
+ if (OidIsValid(dict->dictnamespace))
+ values[1] = CStringGetDatum(get_namespace_name(dict->dictnamespace));
+ else
+ nulls[1] = true;
+
+ values[2] = NameGetDatum(&dict_name);
+ values[3] = Int64GetDatum(entry->dict_size);
+
+ dshash_release_lock(dict_table, entry);
+
+ tuplestore_putvalues(tupstore, tupdesc, values, nulls);
+ }
+
+ systable_endscan(scan);
+ heap_close(rel, AccessShareLock);
+
+ tuplestore_donestoring(tupstore);
+
+ PG_RETURN_VOID();
+}
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 0fdb42f639..31cd0c91b2 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -4973,6 +4973,9 @@ DESCR("trigger for automatic update of tsvector column");
DATA(insert OID = 3759 ( get_current_ts_config PGNSP PGUID 12 1 0 0 0 f f f t f s s 0 0 3734 "" _null_ _null_ _null_ _null_ _null_ get_current_ts_config _null_ _null_ _null_ ));
DESCR("get current tsearch configuration");
+DATA(insert OID = 4213 ( pg_ts_shared_dictionaries PGNSP PGUID 12 1 10 0 0 f f f f t s s 0 0 2249 "" "{26,19,19,20}" "{o,o,o,o}" "{dictoid,schemaname,dictname,size}" _null_ _null_ pg_ts_shared_dictionaries _null_ _null_ _null_ ));
+DESCR("information about text search dictionaries currently in shared memory");
+
DATA(insert OID = 3736 ( regconfigin PGNSP PGUID 12 1 0 0 0 f f f t f s s 1 0 3734 "2275" _null_ _null_ _null_ _null_ _null_ regconfigin _null_ _null_ _null_ ));
DESCR("I/O");
DATA(insert OID = 3737 ( regconfigout PGNSP PGUID 12 1 0 0 0 f f f t f s s 1 0 2275 "3734" _null_ _null_ _null_ _null_ _null_ regconfigout _null_ _null_ _null_ ));
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 5e0597e091..d25b5f5ed9 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2211,6 +2211,11 @@ pg_timezone_names| SELECT pg_timezone_names.name,
pg_timezone_names.utc_offset,
pg_timezone_names.is_dst
FROM pg_timezone_names() pg_timezone_names(name, abbrev, utc_offset, is_dst);
+pg_ts_shared_dictionaries| SELECT pg_ts_shared_dictionaries.dictoid,
+ pg_ts_shared_dictionaries.schemaname,
+ pg_ts_shared_dictionaries.dictname,
+ pg_ts_shared_dictionaries.size
+ FROM pg_ts_shared_dictionaries() pg_ts_shared_dictionaries(dictoid, schemaname, dictname, size);
pg_user| SELECT pg_shadow.usename,
pg_shadow.usesysid,
pg_shadow.usecreatedb,
0006-Shared-memory-ispell-option-v7.patch
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index 78ed082994..f5e88f7c86 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -2829,6 +2829,7 @@ iconv -f ISO_8859-1 -t UTF-8 -o nn_no.dict nn_NO.dic
<programlisting>
CREATE TEXT SEARCH DICTIONARY english_hunspell (
TEMPLATE = ispell,
+ Shareable = false,
DictFile = en_us,
AffFile = en_us,
Stopwords = english);
@@ -2843,6 +2844,9 @@ CREATE TEXT SEARCH DICTIONARY english_hunspell (
The stop-words file has the same format explained above for the
<literal>simple</literal> dictionary type. The format of the other files is
not specified here but is available from the above-mentioned web sites.
+ <literal>Shareable</literal> controls loading into shared memory. By
+ default it is <literal>true</literal> (see more in
+ <xref linkend="textsearch-shared-dictionaries"/>).
</para>
<para>
@@ -3037,7 +3041,8 @@ CREATE TEXT SEARCH DICTIONARY english_stem (
Some dictionaries, especially <application>Ispell</application>, consumes a
noticable value of memory. Size of a dictionary can reach tens of megabytes.
Most of them also stores configuration in text files. A dictionary is compiled
- during first access per a user session.
+ during first access per a user session. Currently only
+ <application>Ispell</application> supports loading into shared memory.
</para>
<para>
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index f8ab16d825..b423e403cb 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -38,7 +38,8 @@ typedef struct
} DictISpell;
static void parse_dictoptions(List *dictoptions,
- char **dictfile, char **afffile, char **stopfile);
+ char **dictfile, char **afffile, char **stopfile,
+ bool *isshared);
static void *dispell_build(List *dictoptions, Size *size);
Datum
@@ -48,15 +49,21 @@ dispell_init(PG_FUNCTION_ARGS)
DictISpell *d;
void *dict_location;
char *stopfile;
+ bool isshared;
d = (DictISpell *) palloc0(sizeof(DictISpell));
- parse_dictoptions(init_data->dictoptions, NULL, NULL, &stopfile);
+ parse_dictoptions(init_data->dictoptions, NULL, NULL, &stopfile, &isshared);
+ /* Make stop word list */
if (stopfile)
readstoplist(stopfile, &(d->stoplist), lowerstr);
- dict_location = ts_dict_shmem_location(init_data, dispell_build);
+ /* Make or get from shared memory dictionary itself */
+ if (isshared)
+ dict_location = ts_dict_shmem_location(init_data, dispell_build);
+ else
+ dict_location = dispell_build(init_data->dictoptions, NULL);
Assert(dict_location);
d->obj.dict = (IspellDictData *) dict_location;
@@ -110,9 +117,10 @@ dispell_lexize(PG_FUNCTION_ARGS)
static void
parse_dictoptions(List *dictoptions, char **dictfile, char **afffile,
- char **stopfile)
+ char **stopfile, bool *isshared)
{
ListCell *l;
+ bool isshared_defined = false;
if (dictfile)
*dictfile = NULL;
@@ -120,6 +128,8 @@ parse_dictoptions(List *dictoptions, char **dictfile, char **afffile,
*afffile = NULL;
if (stopfile)
*stopfile = NULL;
+ if (isshared)
+ *isshared = true;
foreach(l, dictoptions)
{
@@ -158,6 +168,19 @@ parse_dictoptions(List *dictoptions, char **dictfile, char **afffile,
errmsg("multiple StopWords parameters")));
*stopfile = defGetString(defel);
}
+ else if (pg_strcasecmp(defel->defname, "Shareable") == 0)
+ {
+ if (!isshared)
+ continue;
+
+ if (isshared_defined)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("multiple Shareable parameters")));
+
+ *isshared = defGetBoolean(defel);
+ isshared_defined = true;
+ }
else
{
ereport(ERROR,
@@ -180,7 +203,7 @@ dispell_build(List *dictoptions, Size *size)
char *dictfile,
*afffile;
- parse_dictoptions(dictoptions, &dictfile, &afffile, NULL);
+ parse_dictoptions(dictoptions, &dictfile, &afffile, NULL, NULL);
if (!afffile)
{
@@ -212,6 +235,7 @@ dispell_build(List *dictoptions, Size *size)
NIFinishBuild(&build);
/* Return the buffer and its size */
- *size = build.dict_size;
+ if (size)
+ *size = build.dict_size;
return build.dict;
}
diff --git a/src/test/regress/expected/tsdicts.out b/src/test/regress/expected/tsdicts.out
index 0c1d7c7675..6f6bca4f42 100644
--- a/src/test/regress/expected/tsdicts.out
+++ b/src/test/regress/expected/tsdicts.out
@@ -194,6 +194,7 @@ SELECT ts_lexize('hunspell', 'footballyklubber');
-- Test ISpell dictionary with hunspell affix file with FLAG long parameter
CREATE TEXT SEARCH DICTIONARY hunspell_long (
Template=ispell,
+ Shareable=false,
DictFile=hunspell_sample_long,
AffFile=hunspell_sample_long
);
@@ -290,6 +291,7 @@ SELECT ts_lexize('hunspell_long', 'footballyklubber');
-- Test ISpell dictionary with hunspell affix file with FLAG num parameter
CREATE TEXT SEARCH DICTIONARY hunspell_num (
Template=ispell,
+ Shareable=false,
DictFile=hunspell_sample_num,
AffFile=hunspell_sample_num
);
@@ -588,3 +590,58 @@ CREATE TEXT SEARCH DICTIONARY tsdict_case
"AffFile" = ispell_sample
);
ERROR: unrecognized Ispell parameter: "DictFile"
+-- Test shared dictionaries
+CREATE TEXT SEARCH DICTIONARY shared_ispell (
+ Template=ispell,
+ DictFile=ispell_sample,
+ AffFile=ispell_sample
+);
+-- Make sure that dictionaries in shared memory
+SELECT ts_lexize('ispell', 'skies');
+ ts_lexize
+-----------
+ {sky}
+(1 row)
+
+SELECT ts_lexize('hunspell', 'skies');
+ ts_lexize
+-----------
+ {sky}
+(1 row)
+
+SELECT ts_lexize('shared_ispell', 'skies');
+ ts_lexize
+-----------
+ {sky}
+(1 row)
+
+SELECT schemaname, dictname FROM pg_ts_shared_dictionaries;
+ schemaname | dictname
+------------+---------------
+ public | ispell
+ public | hunspell
+ public | shared_ispell
+(3 rows)
+
+-- shared_ispell space should be released in shared memory
+DROP TEXT SEARCH DICTIONARY shared_ispell;
+-- Make sure that dictionaries in shared memory, DROP invalidates cache
+SELECT ts_lexize('ispell', 'skies');
+ ts_lexize
+-----------
+ {sky}
+(1 row)
+
+SELECT ts_lexize('hunspell', 'skies');
+ ts_lexize
+-----------
+ {sky}
+(1 row)
+
+SELECT schemaname, dictname FROM pg_ts_shared_dictionaries;
+ schemaname | dictname
+------------+----------
+ public | ispell
+ public | hunspell
+(2 rows)
+
diff --git a/src/test/regress/sql/tsdicts.sql b/src/test/regress/sql/tsdicts.sql
index 1633c0d066..66a7c37e53 100644
--- a/src/test/regress/sql/tsdicts.sql
+++ b/src/test/regress/sql/tsdicts.sql
@@ -51,6 +51,7 @@ SELECT ts_lexize('hunspell', 'footballyklubber');
-- Test ISpell dictionary with hunspell affix file with FLAG long parameter
CREATE TEXT SEARCH DICTIONARY hunspell_long (
Template=ispell,
+ Shareable=false,
DictFile=hunspell_sample_long,
AffFile=hunspell_sample_long
);
@@ -75,6 +76,7 @@ SELECT ts_lexize('hunspell_long', 'footballyklubber');
-- Test ISpell dictionary with hunspell affix file with FLAG num parameter
CREATE TEXT SEARCH DICTIONARY hunspell_num (
Template=ispell,
+ Shareable=false,
DictFile=hunspell_sample_num,
AffFile=hunspell_sample_num
);
@@ -196,3 +198,26 @@ CREATE TEXT SEARCH DICTIONARY tsdict_case
"DictFile" = ispell_sample,
"AffFile" = ispell_sample
);
+
+-- Test shared dictionaries
+CREATE TEXT SEARCH DICTIONARY shared_ispell (
+ Template=ispell,
+ DictFile=ispell_sample,
+ AffFile=ispell_sample
+);
+
+-- Make sure that dictionaries in shared memory
+SELECT ts_lexize('ispell', 'skies');
+SELECT ts_lexize('hunspell', 'skies');
+SELECT ts_lexize('shared_ispell', 'skies');
+
+SELECT schemaname, dictname FROM pg_ts_shared_dictionaries;
+
+-- shared_ispell space should be released in shared memory
+DROP TEXT SEARCH DICTIONARY shared_ispell;
+
+-- Make sure that dictionaries in shared memory, DROP invalidates cache
+SELECT ts_lexize('ispell', 'skies');
+SELECT ts_lexize('hunspell', 'skies');
+
+SELECT schemaname, dictname FROM pg_ts_shared_dictionaries;
On Mon, 19 Mar 2018 14:06:50 +0300
Arthur Zakirov <a.zakirov@postgrespro.ru> wrote:
I believe mmap requires a complete rewrite of the 0003 part of the patch and
a few changes in 0005.
In any case, I suggest polishing the dsm-based patch, and see if we
can get that one into PG11.
Yes, we have more time in future commitfests if the dsm-based patch isn't
approved.
Hi, I'm not sure about the mmap approach; it would just bring other
problems. I like the dsm approach because it doesn't invent any new
files in the database, whereas the mmap approach would possibly require a new
folder in the data directory and management of a bunch of new files, with
additional issues related to pg_upgrade and so on. Also, with the dsm approach,
if someone needs to update dictionaries then he (or his package
manager) can just replace the files and be done with it.
--
---
Ildus Kurbangaliev
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
On 03/19/2018 02:34 AM, Andres Freund wrote:
Hi,
On 2018-03-19 01:52:41 +0100, Tomas Vondra wrote:
I do agree with that. We have a working well-understood dsm-based
solution, addressing the goals initially explained in this thread.
Well, it's also awkward and manual to use. I do think that's
something we've to pay attention to.
Awkward in what sense?
I don't think the manual aspect is an issue. Currently we have no way to
reload the dictionary, except for restarting all the backends. I don't
see that as a particularly convenient solution. Also, this is pretty
much how the shared_ispell extension works, although you might argue
that was more due to the limitation of how shared memory could be used
in extensions before DSM was introduced. In any case, I've never heard
complaints about this aspect of the extension.
There are about two things that might be automated - reloading of
dictionaries and evicting them when hitting the memory limit. I have
tried to implement that in the shared_ispell dictionary but it's a bit
more complicated than it looks.
For example, it seems obvious to reload the dictionary when the file
timestamp changes. But in fact there are three files - dict, affixes,
stopwords. So will you reload when a single file changes? All of them?
Keep in mind that the new version of dictionary may use different
affixes, so a reload at the wrong moment may result in broken result.
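To make the timestamp question concrete, here is a minimal sketch of the two obvious policies: reload when any of the three files changed, or only when all of them changed. It is not taken from the patch or from shared_ispell; DictFileTimes, ReloadPolicy and dict_needs_reload are hypothetical names, and the modification times are assumed to have been collected elsewhere (for example with stat()).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical snapshot of the modification times of the three files. */
typedef struct DictFileTimes
{
    time_t dict_mtime;   /* dictionary file */
    time_t affix_mtime;  /* affix file */
    time_t stop_mtime;   /* stopword file */
} DictFileTimes;

typedef enum ReloadPolicy
{
    RELOAD_IF_ANY_CHANGED,  /* may catch a half-finished update */
    RELOAD_IF_ALL_CHANGED   /* never fires if one file is left untouched */
} ReloadPolicy;

/* Decide whether to reload, given the times seen at load time and now. */
bool
dict_needs_reload(const DictFileTimes *loaded,
                  const DictFileTimes *current,
                  ReloadPolicy policy)
{
    int changed;

    assert(loaded != NULL && current != NULL);

    changed = (loaded->dict_mtime != current->dict_mtime)
            + (loaded->affix_mtime != current->affix_mtime)
            + (loaded->stop_mtime != current->stop_mtime);

    if (policy == RELOAD_IF_ALL_CHANGED)
        return changed == 3;
    return changed > 0;
}
```

Neither policy is safe on its own: the first can pick up a new dictionary file together with old affixes, and the second never triggers when an update legitimately leaves one file (say, the stopwords) unchanged.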
I wonder how much of this patch would be affected by the switch
from dsm to mmap? I guess the memory limit would get mostly
irrelevant (mmap would rely on the OS to page the memory in/out
depending on memory pressure), and so would the UNLOAD/RELOAD
commands (because each backend would do its own mmap).
Those seem fairly major.
I'm not sure I'd say those are major. And you might also see the lack of
these capabilities as negative points for the mmap approach.
So, I'm not at all convinced the mmap approach is actually better than
the dsm one. And I believe that if we come up with a good way to
automate some of the tasks, I don't see why would that be possible in
the mmap and not dsm.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2018-03-19 14:52:34 +0100, Tomas Vondra wrote:
On 03/19/2018 02:34 AM, Andres Freund wrote:
Hi,
On 2018-03-19 01:52:41 +0100, Tomas Vondra wrote:
I do agree with that. We have a working well-understood dsm-based
solution, addressing the goals initially explained in this thread.
Well, it's also awkward and manual to use. I do think that's
something we've to pay attention to.
Awkward in what sense?
You've to manually configure a setting that can only be set at server
start. You can't set it as big as necessary because it might use up
memory better used for other things. It needs the full space for
dictionaries even if the majority of it never will be needed. All of
those aren't needed in an mmap world.
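For readers less familiar with it, the mmap variant boils down to something like the following minimal sketch. It assumes the compiled dictionary has already been written to a cache file; the function name map_dict_file and the idea of a single flat cache file are illustrative assumptions, not part of any patch in this thread. The OS then pages the contents in and out on demand, and all backends mapping the same file share the page cache.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Map a (hypothetical) precompiled dictionary file read-only.
 * Returns the mapped address and stores its length in *len,
 * or returns NULL if the file cannot be opened, is empty, or
 * cannot be mapped.
 */
const void *
map_dict_file(const char *path, size_t *len)
{
    struct stat st;
    void       *addr;
    int         fd;

    assert(path != NULL && len != NULL);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    if (fstat(fd, &st) != 0 || st.st_size == 0)
    {
        close(fd);
        return NULL;
    }

    /* No memory is reserved up front; pages are faulted in when touched. */
    addr = mmap(NULL, (size_t) st.st_size, PROT_READ, MAP_SHARED, fd, 0);

    /* The mapping remains valid after the descriptor is closed. */
    close(fd);

    if (addr == MAP_FAILED)
        return NULL;

    *len = (size_t) st.st_size;
    return addr;
}
```

The sketch says nothing about how the cache file is produced or replaced, which is where most of the remaining work would be.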
So, I'm not at all convinced the mmap approach is actually better than
the dsm one. And I believe that if we come up with a good way to
automate some of the tasks, I don't see why would that be possible in
the mmap and not dsm.
To me it seems we'll end up needing a heck of a lot more code that the
OS already implements if we do it ourselves.
Greetings,
Andres Freund
On 03/19/2018 07:07 PM, Andres Freund wrote:
On 2018-03-19 14:52:34 +0100, Tomas Vondra wrote:
On 03/19/2018 02:34 AM, Andres Freund wrote:
Hi,
On 2018-03-19 01:52:41 +0100, Tomas Vondra wrote:
I do agree with that. We have a working well-understood dsm-based
solution, addressing the goals initially explained in this thread.
Well, it's also awkward and manual to use. I do think that's
something we've to pay attention to.
Awkward in what sense?
You've to manually configure a setting that can only be set at server
start. You can't set it as big as necessary because it might use up
memory better used for other things. It needs the full space for
dictionaries even if the majority of it never will be needed. All of
those aren't needed in an mmap world.
Which is not quite true, because that's not what the patch does.
Each dictionary is loaded into a separate dsm segment when needed, which
is then stored in a dshash table. So most of what you wrote is not really
true - the patch does not pre-allocate the space, and the setting might
be set even after server start (it's not defined like that currently,
but that should be trivial to change).
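The behaviour described above can be modelled roughly as follows. This is a toy sketch, not the patch code: a fixed array stands in for the dshash table, no DSM segment is actually created, and all names (ToyRegistry, toy_attach_dict, and so on) are hypothetical. The point is only that space is accounted per dictionary at the moment it is first requested, and nothing is reserved in advance.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_ENTRIES 8   /* toy capacity; the patch uses a hash table */

typedef struct ToyDictEntry
{
    unsigned int dict_id;    /* dictionary identifier */
    size_t       dict_size;  /* bytes used by the compiled dictionary */
    unsigned int refcnt;     /* how many backends are attached */
} ToyDictEntry;

typedef struct ToyRegistry
{
    ToyDictEntry entries[TOY_MAX_ENTRIES];
    int          nentries;
    size_t       loaded_size;  /* bytes held by shared dictionaries */
    size_t       max_size;     /* budget in bytes */
} ToyRegistry;

/*
 * Attach to a dictionary, registering it on first use.  Returns true if
 * the dictionary is shared, false if the caller has to keep a private
 * copy because the budget (or the toy table) is exhausted.
 */
bool
toy_attach_dict(ToyRegistry *reg, unsigned int dict_id, size_t dict_size)
{
    int i;

    assert(reg != NULL && reg->loaded_size <= reg->max_size);

    /* Already loaded by some backend: just take another reference. */
    for (i = 0; i < reg->nentries; i++)
    {
        if (reg->entries[i].dict_id == dict_id)
        {
            reg->entries[i].refcnt++;
            return true;
        }
    }

    /* First request: account for the space only now. */
    if (reg->nentries == TOY_MAX_ENTRIES ||
        dict_size > reg->max_size - reg->loaded_size)
        return false;

    reg->entries[reg->nentries].dict_id = dict_id;
    reg->entries[reg->nentries].dict_size = dict_size;
    reg->entries[reg->nentries].refcnt = 1;
    reg->nentries++;
    reg->loaded_size += dict_size;
    return true;
}
```

A dictionary that is never requested costs nothing in this model, and a second backend asking for the same dictionary only bumps the reference count.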
So, I'm not at all convinced the mmap approach is actually better
than the dsm one. And I believe that if we come up with a good way
to automate some of the tasks, I don't see why would that be
possible in the mmap and not dsm.
To me it seems we'll end up needing a heck of a lot more code that
the OS already implements if we do it ourselves.
Like what? Which features do you expect to need much more code?
The automated reloading will need a fairly small amount of code - the
main issue is deciding when to reload, and as I mentioned before that's
more complicated than you seem to believe. In fact, it may not even be
possible - there's no way to decide if all files are already updated.
Currently we kinda ignore that, on the assumption that dictionaries
change only rarely. We may do the same thing and reload the dict if at
least one file changes. In any case, the amount of code is trivial.
In fact, it may be more complicated in the mmap case - how do you update
a dictionary that is already mapped to multiple processes?
The eviction is harder - I'll give you that. But then again, I'm not
sure the mmap approach is really what we want here - it seems better to
evict the whole dictionary, than some random pages from many of them.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Mar 19, 2018 at 07:40:54PM +0100, Tomas Vondra wrote:
On 03/19/2018 07:07 PM, Andres Freund wrote:
You've to manually configure a setting that can only be set at server
start. You can't set it as big as necessary because it might use up
memory better used for other things. It needs the full space for
dictionaries even if the majority of it never will be needed. All of
those aren't needed in an mmap world.
Which is not quite true, because that's not what the patch does.
Each dictionary is loaded into a separate dsm segment when needed, which
is then stored in a dshash table. So most of what you wrote is not really
true - the patch does not pre-allocate the space, and the setting might
be set even after server start (it's not defined like that currently,
but that should be trivial to change).
Oh, that's true. I had planned to fix it, but somehow I forgot to allow
changing the max_shared_dictionaries_size GUC via pg_reload_conf(). I'll fix
it and send a new version of the patch.
To me it seems we'll end up needing a heck of a lot more code that
the OS already implements if we do it ourselves.
Like what? Which features do you expect to need much more code?
The automated reloading will need a fairly small amount of code - the
main issue is deciding when to reload, and as I mentioned before that's
more complicated than you seem to believe. In fact, it may not even be
possible - there's no way to decide if all files are already updated.
Currently we kinda ignore that, on the assumption that dictionaries
change only rarely. We may do the same thing and reload the dict if at
least one file changes. In any case, the amount of code is trivial.
In fact, it may be more complicated in the mmap case - how do you update
a dictionary that is already mapped to multiple processes?
The eviction is harder - I'll give you that. But then again, I'm not
sure the mmap approach is really what we want here - it seems better to
evict the whole dictionary, than some random pages from many of them.
Agreed. The mmap approach requires the same code plus code to handle the
cache files that will be mapped into memory. With mmap we need to solve the
same issues we face now, and more. Also, in both cases we need some way to
reload dictionaries automatically.
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Hi Arthur,
I went through the patch - just skimming through the diffs, will do more
testing tomorrow. Here are a few initial comments.
1) max_shared_dictionaries_size / PGC_POSTMASTER
I'm not quite sure why the GUC is defined as PGC_POSTMASTER, i.e. why it
can't be changed after server start. That seems like a fairly useful
thing to do (e.g. increase the limit while the server is running), and
after looking at the code I think it shouldn't be difficult to change.
The other thing I'd suggest is handling "-1" as "no limit".
2) max_shared_dictionaries_size / size of number
Some of the comments dealing with the GUC treat it as a number of
dictionaries (instead of a size). I suppose that's due to how the
original patch was implemented.
3) Assert(max_shared_dictionaries_size);
I'd say that assert is not very clear - it should be
Assert(max_shared_dictionaries_size > 0);
or something along the lines. It's also a good idea to add a comment
explaining the assert, say
/* we can only get here when shared dictionaries are enabled */
Assert(max_shared_dictionaries_size > 0);
4) I took the liberty of rewording some of the docs/comments. See the
attached diffs, that should apply on top of 0003 and 0004 patches.
Please, treat those as mere suggestions.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
0003.diff (text/x-patch)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 6862d5e..6747fe2 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1433,23 +1433,23 @@ include_dir 'conf.d'
</term>
<listitem>
<para>
- Sets the maximum size of all text search dictionaries loaded into shared
- memory. The default is 100 megabytes (<literal>100MB</literal>). This
- parameter can only be set at server start.
+ Specifies the amount of shared memory to be used to store full-text search
+ search dictionaries. The default is 100 megabytes (<literal>100MB</literal>).
+ This parameter can only be set at server start.
</para>
<para>
- Currently controls only loading of <application>Ispell</application>
- dictionaries (see <xref linkend="textsearch-ispell-dictionary"/>).
- After compiling the dictionary it will be copied into shared memory.
- Another backends on first use of the dictionary will use it from shared
- memory, so it doesn't need to compile the dictionary second time.
+ Currently only <application>Ispell</application> dictionaries (see
+ <xref linkend="textsearch-ispell-dictionary"/>) may be loaded into
+ shared memory. The first backend requesting the dictionary will
+ build it and copy it into shared memory, so that other backends can
+ reuse it without having to build it again.
</para>
<para>
- If total size of simultaneously loaded dictionaries reaches the maximum
- allowed size then a new dictionary will be loaded into local memory of
- a backend.
+ If the size of simultaneously loaded dictionaries reaches the maximum
+ allowed size, additional dictionares will be loaded into private backend
+ memory (effectively disabling the sharing).
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/tsearch/ts_shared.c b/src/backend/tsearch/ts_shared.c
index bfc5292..22d58a0 100644
--- a/src/backend/tsearch/ts_shared.c
+++ b/src/backend/tsearch/ts_shared.c
@@ -22,7 +22,7 @@
/*
- * Hash table structures
+ * Hash table entries representing shared dictionaries.
*/
typedef struct
{
@@ -37,7 +37,8 @@ typedef struct
static dshash_table *dict_table = NULL;
/*
- * Shared struct for locking
+ * Information about the main shmem segment, used to coordinate
+ * access to the hash table and dictionaries.
*/
typedef struct
{
@@ -53,8 +54,8 @@ typedef struct
static TsearchCtlData *tsearch_ctl;
/*
- * GUC variable for maximum number of shared dictionaries. Default value is
- * 100MB.
+ * Maximum allowed amount of shared memory for shared dictionaries,
+ * in kilobytes. Default value is 100MB.
*/
int max_shared_dictionaries_size = 100 * 1024;
@@ -213,7 +202,7 @@ ts_dict_shmem_location(DictInitData *initoptions,
/*
* Release memory occupied by the dictionary. Function just unpins DSM mapping.
- * If nobody else hasn't mapping to this DSM then unping DSM segment.
+ * If nobody else hasn't mapping to this DSM then unpin DSM segment.
*
* dictid: Oid of the dictionary.
*/
@@ -312,6 +301,7 @@ init_dict_table(void)
MemoryContext old_context;
dsa_area *dsa;
+ /* bail out if shared dictionaries not allowed */
if (max_shared_dictionaries_size == 0)
return;
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 172627a..b10ec48 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -2939,7 +2939,7 @@ static struct config_int ConfigureNamesInt[] =
gettext_noop("Currently controls only loading of Ispell dictionaries. "
"If total size of simultaneously loaded dictionaries "
"reaches the maximum allowed size then a new dictionary "
- "will be loaded into local memory of a backend."),
+ "will be loaded into private backend memory."),
GUC_UNIT_KB,
},
&max_shared_dictionaries_size,
0004.diff (text/x-patch)
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index 82afe20..d1ae7b7 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -3034,15 +3034,17 @@ CREATE TEXT SEARCH DICTIONARY english_stem (
<title>Dictionaries in Shared Memory</title>
<para>
- Some dictionaries, especially <application>Ispell</application>, consumes a
- noticable value of memory. Size of a dictionary can reach tens of megabytes.
- Most of them also stores configuration in text files. A dictionary is compiled
- during first access per a user session.
+ Some dictionaries, especially <application>Ispell</application>, consumes
+ a significant amount of memory, in some cases tens of megabytes. Most of
+ them store the data in text files, and building the in-memory structure is
+ both CPU and time-consuming. Instead of doing this in each backend when
+ it needs a dictionary for the first time, the compiled dictionary may be
+ stored in shared memory so that it may be reused by other backends.
</para>
<para>
- To store dictionaries in shared memory set to <xref linkend="guc-max-shared-dictionaries-size"/>
- parameter value greater than zero before server starting.
+ To enable storing dictionaries in shared memory, set <xref linkend="guc-max-shared-dictionaries-size"/>
+ parameter to a value greater than zero.
</para>
</sect2>
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index f8ab16d..43aa27a 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -5,14 +5,14 @@
*
* Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
*
- * By default all Ispell dictionaries are stored in DSM. But if number of
- * loaded dictionaries reached maximum allowed value then it will be
- * allocated within its memory context (dictCtx).
+ * By default all Ispell dictionaries are stored in DSM. But if the amount
+ * of memory exceeds max_shared_dictionaries_size, then the dictionary be
+ * allocated in private backend memory (in dictCtx context).
*
* All necessary data are built within dispell_build() function. But
* structures for regular expressions are compiled on first demand and
* stored using AffixReg array. It is because regex_t and Regis cannot be
- * stored in shared memory.
+ * stored in shared memory easily.
*
*
* IDENTIFICATION
Hello,
On Mon, Mar 19, 2018 at 08:50:46PM +0100, Tomas Vondra wrote:
Hi Arthur,
I went through the patch - just skimming through the diffs, will do more
testing tomorrow. Here are a few initial comments.
Thank you for the review!
1) max_shared_dictionaries_size / PGC_POSTMASTER
I'm not quite sure why the GUC is defined as PGC_POSTMASTER, i.e. why it
can't be changed after server start. That seems like a fairly useful
thing to do (e.g. increase the limit while the server is running), and
after looking at the code I think it shouldn't be difficult to change.
max_shared_dictionaries_size is defined as PGC_SIGHUP now. I added a check
of the new value: it disallows setting zero if there are loaded
dictionaries, and disallows decreasing the maximum allowed size if the
loaded size is greater than the new value.
The other thing I'd suggest is handling "-1" as "no limit".
I added the ability to set '-1'. I also fixed some comments and the
documentation.
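The rule for accepting a new value can be summarised by the sketch below. It is a standalone illustration under stated assumptions, not the check from the attached patch: the function name new_limit_is_acceptable and the DICTS_UNLIMITED macro are hypothetical, and the limit is taken to be in kilobytes as in the GUC definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DICTS_UNLIMITED (-1)  /* hypothetical stand-in for "no limit" */

/*
 * Decide whether a new max_shared_dictionaries_size value (in kB) may be
 * applied while loaded_bytes of dictionaries already sit in shared memory.
 */
bool
new_limit_is_acceptable(int new_limit_kb, uint64_t loaded_bytes)
{
    assert(new_limit_kb >= DICTS_UNLIMITED);

    /* -1 means no limit, so it is always fine. */
    if (new_limit_kb == DICTS_UNLIMITED)
        return true;

    /* 0 disables the feature; refuse while dictionaries are loaded. */
    if (new_limit_kb == 0)
        return loaded_bytes == 0;

    /* Refuse to shrink the limit below what is already loaded. */
    return loaded_bytes <= (uint64_t) new_limit_kb * 1024;
}
```

With 50 kB loaded, for example, this accepts 100 or -1 and rejects 0 or 10.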
2) max_shared_dictionaries_size / size of number
Some of the comments dealing with the GUC treat it as a number of
dictionaries (instead of a size). I suppose that's due to how the
original patch was implemented.
Fixed. Should be good now.
3) Assert(max_shared_dictionaries_size);
I'd say that assert is not very clear - it should be
Assert(max_shared_dictionaries_size > 0);
or something along the lines. It's also a good idea to add a comment
explaining the assert, say
/* we can only get here when shared dictionaries are enabled */
Assert(max_shared_dictionaries_size > 0);
Fixed the assert and added the comment. I extended the assert so that it
also takes the -1 value into account.
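In other words, the condition now looks roughly like the sketch below. It uses the standard assert() instead of PostgreSQL's Assert() so that it stands alone, and the helper names and the DICTS_UNLIMITED macro are hypothetical; see the attached patch for the real code.

```c
#include <assert.h>
#include <stdbool.h>

#define DICTS_UNLIMITED (-1)  /* hypothetical stand-in for "no limit" */

/* Shared dictionaries are enabled for any positive size or for -1. */
bool
shared_dicts_enabled(int max_size_kb)
{
    return max_size_kb > 0 || max_size_kb == DICTS_UNLIMITED;
}

/* Code path that must only run when the feature is switched on. */
void
enter_shared_dict_path(int max_size_kb)
{
    /* we can only get here when shared dictionaries are enabled */
    assert(shared_dicts_enabled(max_size_kb));
}
```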
4) I took the liberty of rewording some of the docs/comments. See the
attached diffs, that should apply on top of 0003 and 0004 patches.
Please, treat those as mere suggestions.
I applied your diffs and added changes to max_shared_dictionaries_size.
Please find the attached new version of the patch.
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Attachments:
0001-Fix-ispell-memory-handling-v8.patch (text/plain; charset=us-ascii)
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index b9fdd77e19..e071994523 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -78,6 +78,8 @@
#define tmpalloc(sz) MemoryContextAlloc(Conf->buildCxt, (sz))
#define tmpalloc0(sz) MemoryContextAllocZero(Conf->buildCxt, (sz))
+#define tmpstrdup(str) MemoryContextStrdup(Conf->buildCxt, (str))
+
/*
* Prepare for constructing an ISpell dictionary.
*
@@ -498,7 +500,7 @@ NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
strcpy(Conf->Spell[Conf->nspell]->word, word);
Conf->Spell[Conf->nspell]->p.flag = (*flag != '\0')
- ? cpstrdup(Conf, flag) : VoidString;
+ ? tmpstrdup(flag) : VoidString;
Conf->nspell++;
}
@@ -1040,7 +1042,7 @@ setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
entry->flag.i = i;
}
else
- entry->flag.s = cpstrdup(Conf, s);
+ entry->flag.s = tmpstrdup(s);
entry->flagMode = Conf->flagMode;
entry->value = val;
@@ -1536,6 +1538,9 @@ nextline:
return;
isnewformat:
+ pfree(recoded);
+ pfree(pstr);
+
if (oldformat)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
0002-Change-tmplinit-argument-v8.patch (text/plain; charset=us-ascii)
diff --git a/contrib/dict_int/dict_int.c b/contrib/dict_int/dict_int.c
index 56ede37089..e11d1129e9 100644
--- a/contrib/dict_int/dict_int.c
+++ b/contrib/dict_int/dict_int.c
@@ -30,7 +30,7 @@ PG_FUNCTION_INFO_V1(dintdict_lexize);
Datum
dintdict_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictInt *d;
ListCell *l;
@@ -38,7 +38,7 @@ dintdict_init(PG_FUNCTION_ARGS)
d->maxlen = 6;
d->rejectlong = false;
- foreach(l, dictoptions)
+ foreach(l, init_data->dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/contrib/dict_xsyn/dict_xsyn.c b/contrib/dict_xsyn/dict_xsyn.c
index a79ece240c..c3146bae3c 100644
--- a/contrib/dict_xsyn/dict_xsyn.c
+++ b/contrib/dict_xsyn/dict_xsyn.c
@@ -140,7 +140,7 @@ read_dictionary(DictSyn *d, const char *filename)
Datum
dxsyn_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSyn *d;
ListCell *l;
char *filename = NULL;
@@ -153,7 +153,7 @@ dxsyn_init(PG_FUNCTION_ARGS)
d->matchsynonyms = false;
d->keepsynonyms = true;
- foreach(l, dictoptions)
+ foreach(l, init_data->dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/contrib/unaccent/unaccent.c b/contrib/unaccent/unaccent.c
index 247c202755..2e66331ed8 100644
--- a/contrib/unaccent/unaccent.c
+++ b/contrib/unaccent/unaccent.c
@@ -267,12 +267,12 @@ PG_FUNCTION_INFO_V1(unaccent_init);
Datum
unaccent_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
TrieChar *rootTrie = NULL;
bool fileloaded = false;
ListCell *l;
- foreach(l, dictoptions)
+ foreach(l, init_data->dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/commands/tsearchcmds.c b/src/backend/commands/tsearchcmds.c
index 3a843512d1..967fe5a6f4 100644
--- a/src/backend/commands/tsearchcmds.c
+++ b/src/backend/commands/tsearchcmds.c
@@ -386,17 +386,22 @@ verify_dictoptions(Oid tmplId, List *dictoptions)
}
else
{
+ DictInitData init_data;
+
/*
* Copy the options just in case init method thinks it can scribble on
* them ...
*/
dictoptions = copyObject(dictoptions);
+ init_data.dictoptions = dictoptions;
+ init_data.dictid = InvalidOid;
+
/*
* Call the init method and see if it complains. We don't worry about
* it leaking memory, since our command will soon be over anyway.
*/
- (void) OidFunctionCall1(initmethod, PointerGetDatum(dictoptions));
+ (void) OidFunctionCall1(initmethod, PointerGetDatum(&init_data));
}
ReleaseSysCache(tup);
diff --git a/src/backend/snowball/dict_snowball.c b/src/backend/snowball/dict_snowball.c
index 78c9f73ef0..db12606fdd 100644
--- a/src/backend/snowball/dict_snowball.c
+++ b/src/backend/snowball/dict_snowball.c
@@ -181,14 +181,14 @@ locate_stem_module(DictSnowball *d, const char *lang)
Datum
dsnowball_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSnowball *d;
bool stoploaded = false;
ListCell *l;
d = (DictSnowball *) palloc0(sizeof(DictSnowball));
- foreach(l, dictoptions)
+ foreach(l, init_data->dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index edc6547700..6d0dedbefb 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -29,7 +29,7 @@ typedef struct
Datum
dispell_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictISpell *d;
bool affloaded = false,
dictloaded = false,
@@ -40,7 +40,7 @@ dispell_init(PG_FUNCTION_ARGS)
NIStartBuild(&(d->obj));
- foreach(l, dictoptions)
+ foreach(l, init_data->dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_simple.c b/src/backend/tsearch/dict_simple.c
index ac6a24eba5..80f2d1535d 100644
--- a/src/backend/tsearch/dict_simple.c
+++ b/src/backend/tsearch/dict_simple.c
@@ -29,7 +29,7 @@ typedef struct
Datum
dsimple_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSimple *d = (DictSimple *) palloc0(sizeof(DictSimple));
bool stoploaded = false,
acceptloaded = false;
@@ -37,7 +37,7 @@ dsimple_init(PG_FUNCTION_ARGS)
d->accept = true; /* default */
- foreach(l, dictoptions)
+ foreach(l, init_data->dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_synonym.c b/src/backend/tsearch/dict_synonym.c
index c011886cb0..29f86472a4 100644
--- a/src/backend/tsearch/dict_synonym.c
+++ b/src/backend/tsearch/dict_synonym.c
@@ -91,7 +91,7 @@ compareSyn(const void *a, const void *b)
Datum
dsynonym_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSyn *d;
ListCell *l;
char *filename = NULL;
@@ -104,7 +104,7 @@ dsynonym_init(PG_FUNCTION_ARGS)
char *line = NULL;
uint16 flags = 0;
- foreach(l, dictoptions)
+ foreach(l, init_data->dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_thesaurus.c b/src/backend/tsearch/dict_thesaurus.c
index 24364e646d..7f87ed1c97 100644
--- a/src/backend/tsearch/dict_thesaurus.c
+++ b/src/backend/tsearch/dict_thesaurus.c
@@ -604,7 +604,7 @@ compileTheSubstitute(DictThesaurus *d)
Datum
thesaurus_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictThesaurus *d;
char *subdictname = NULL;
bool fileloaded = false;
@@ -612,7 +612,7 @@ thesaurus_init(PG_FUNCTION_ARGS)
d = (DictThesaurus *) palloc0(sizeof(DictThesaurus));
- foreach(l, dictoptions)
+ foreach(l, init_data->dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/utils/cache/ts_cache.c b/src/backend/utils/cache/ts_cache.c
index 3d5c194148..adb9c60b72 100644
--- a/src/backend/utils/cache/ts_cache.c
+++ b/src/backend/utils/cache/ts_cache.c
@@ -39,6 +39,7 @@
#include "catalog/pg_ts_template.h"
#include "commands/defrem.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_public.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
#include "utils/fmgroids.h"
@@ -314,6 +315,7 @@ lookup_ts_dictionary_cache(Oid dictId)
if (OidIsValid(template->tmplinit))
{
+ DictInitData init_data;
List *dictoptions;
Datum opt;
bool isnull;
@@ -333,9 +335,12 @@ lookup_ts_dictionary_cache(Oid dictId)
else
dictoptions = deserialize_deflist(opt);
+ init_data.dictoptions = dictoptions;
+ init_data.dictid = dictId;
+
entry->dictData =
DatumGetPointer(OidFunctionCall1(template->tmplinit,
- PointerGetDatum(dictoptions)));
+ PointerGetDatum(&init_data)));
MemoryContextSwitchTo(oldcontext);
}
diff --git a/src/include/tsearch/ts_public.h b/src/include/tsearch/ts_public.h
index 0b7a5aa68e..723862981d 100644
--- a/src/include/tsearch/ts_public.h
+++ b/src/include/tsearch/ts_public.h
@@ -13,6 +13,7 @@
#ifndef _PG_TS_PUBLIC_H_
#define _PG_TS_PUBLIC_H_
+#include "nodes/pg_list.h"
#include "tsearch/ts_type.h"
/*
@@ -84,6 +85,19 @@ extern bool searchstoplist(StopList *s, char *key);
* Interface with dictionaries
*/
+/*
+ * Argument which is passed to a template's init method.
+ */
+typedef struct
+{
+ /*
+ * A dictionary option list for a template's init method. Should go first
+ * for backward compatibility.
+ */
+ List *dictoptions;
+ Oid dictid;
+} DictInitData;
+
/* return struct for any lexize function */
typedef struct
{
0003-Retreive-shared-location-for-dict-v8.patch (text/plain; charset=us-ascii)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index f18d2b3353..8520089f2e 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1425,6 +1425,42 @@ include_dir 'conf.d'
</listitem>
</varlistentry>
+ <varlistentry id="guc-max-shared-dictionaries-size" xreflabel="max_shared_dictionaries_size">
+ <term><varname>max_shared_dictionaries_size</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>max_shared_dictionaries_size</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Specifies the amount of shared memory to be used to store full-text search
+ search dictionaries. A value of <literal>-1</literal> means no limit of
+ the size of loaded dictionaries. A value of <literal>0</literal>
+ disables shared dictionaries feature. The default is 100 megabytes
+ (<literal>100MB</literal>).
+ </para>
+
+ <para>
+ Currently only <application>Ispell</application> dictionaries (see
+ <xref linkend="textsearch-ispell-dictionary"/>) may be loaded into
+ shared memory. The first backend requesting the dictionary will
+ build it and copy it into shared memory, so that other backends can
+ reuse it without having to build it again.
+ </para>
+
+ <para>
+ If the size of simultaneously loaded dictionaries reaches the maximum
+ allowed size, additional dictionares will be loaded into private backend
+ memory (effectively disabling the sharing).
+ </para>
+
+ <para>
+ This parameter can only be set in the <filename>postgresql.conf</filename>
+ file or on the server command line.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-huge-pages" xreflabel="huge_pages">
<term><varname>huge_pages</varname> (<type>enum</type>)
<indexterm>
diff --git a/src/backend/commands/tsearchcmds.c b/src/backend/commands/tsearchcmds.c
index 967fe5a6f4..742ff58c72 100644
--- a/src/backend/commands/tsearchcmds.c
+++ b/src/backend/commands/tsearchcmds.c
@@ -39,6 +39,7 @@
#include "nodes/makefuncs.h"
#include "parser/parse_func.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_shared.h"
#include "tsearch/ts_utils.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
@@ -518,6 +519,8 @@ RemoveTSDictionaryById(Oid dictId)
CatalogTupleDelete(relation, &tup->t_self);
+ ts_dict_shmem_release(dictId);
+
ReleaseSysCache(tup);
heap_close(relation, RowExclusiveLock);
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 0c86a581c0..c7dce8cac5 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -44,6 +44,7 @@
#include "storage/procsignal.h"
#include "storage/sinvaladt.h"
#include "storage/spin.h"
+#include "tsearch/ts_shared.h"
#include "utils/backend_random.h"
#include "utils/snapmgr.h"
@@ -150,6 +151,7 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
size = add_size(size, SyncScanShmemSize());
size = add_size(size, AsyncShmemSize());
size = add_size(size, BackendRandomShmemSize());
+ size = add_size(size, TsearchShmemSize());
#ifdef EXEC_BACKEND
size = add_size(size, ShmemBackendArraySize());
#endif
@@ -271,6 +273,11 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
AsyncShmemInit();
BackendRandomShmemInit();
+ /*
+ * Set up shared memory to tsearch
+ */
+ TsearchShmemInit();
+
#ifdef EXEC_BACKEND
/*
diff --git a/src/backend/tsearch/Makefile b/src/backend/tsearch/Makefile
index 227468ae9e..860cd196e9 100644
--- a/src/backend/tsearch/Makefile
+++ b/src/backend/tsearch/Makefile
@@ -26,7 +26,7 @@ DICTFILES_PATH=$(addprefix dicts/,$(DICTFILES))
OBJS = ts_locale.o ts_parse.o wparser.o wparser_def.o dict.o \
dict_simple.o dict_synonym.o dict_thesaurus.o \
dict_ispell.o regis.o spell.o \
- to_tsany.o ts_selfuncs.o ts_typanalyze.o ts_utils.o
+ to_tsany.o ts_selfuncs.o ts_shared.o ts_typanalyze.o ts_utils.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/tsearch/ts_shared.c b/src/backend/tsearch/ts_shared.c
new file mode 100644
index 0000000000..872dd3f6f5
--- /dev/null
+++ b/src/backend/tsearch/ts_shared.c
@@ -0,0 +1,396 @@
+/*-------------------------------------------------------------------------
+ *
+ * ts_shared.c
+ * tsearch shared memory management
+ *
+ * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/tsearch/ts_shared.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "lib/dshash.h"
+#include "storage/lwlock.h"
+#include "storage/shmem.h"
+#include "tsearch/ts_shared.h"
+#include "utils/hashutils.h"
+#include "utils/memutils.h"
+
+
+/*
+ * Hash table entries representing shared dictionaries.
+ */
+typedef struct
+{
+ Oid dict_id;
+ dsm_handle dict_dsm;
+ Size dict_size;
+
+ /* How many backends have DSM mapping */
+ uint32 refcnt;
+} TsearchDictEntry;
+
+static dshash_table *dict_table = NULL;
+
+/*
+ * Information about the main shmem segment, used to coordinate
+ * access to the hash table and dictionaries.
+ */
+typedef struct
+{
+ dsa_handle area;
+ dshash_table_handle dict_table_handle;
+
+ /* Total size of loaded dictionaries into shared memory in bytes */
+ Size loaded_size;
+
+ LWLock lock;
+} TsearchCtlData;
+
+static TsearchCtlData *tsearch_ctl;
+
+/*
+ * Maximum allowed amount of shared memory for shared dictionaries,
+ * in kilobytes. Default value is 100MB.
+ */
+int max_shared_dictionaries_size = 100 * 1024;
+
+static void init_dict_table(void);
+
+/* Parameters for dict_table */
+static const dshash_parameters dict_table_params ={
+ sizeof(Oid),
+ sizeof(TsearchDictEntry),
+ dshash_memcmp,
+ dshash_memhash,
+ LWTRANCHE_TSEARCH_TABLE
+};
+
+/*
+ * Build the dictionary using the allocate_cb callback. If
+ * max_shared_dictionaries_size is TS_DICT_SHMEM_UNLIMITED, or if it is greater
+ * than 0 and there is enough space left in shared memory, copy the dictionary
+ * into DSM.
+ *
+ * If max_shared_dictionaries_size isn't 0, first try to find the dictionary in
+ * the shared hash table. If another backend has already built it, just return
+ * its location in DSM.
+ *
+ * initoptions: an argument used within a template's init method.
+ * allocate_cb: function to build the dictionary, if it wasn't found in DSM.
+ *
+ * Returns address in the dynamic shared memory segment or in backend memory.
+ */
+void *
+ts_dict_shmem_location(DictInitData *initoptions,
+ ts_dict_build_callback allocate_cb)
+{
+ TsearchDictEntry *entry;
+ bool found;
+ dsm_segment *seg;
+ void *dict,
+ *dict_location;
+
+#define CHECK_SHARED_SPACE() \
+ if (max_shared_dictionaries_size != TS_DICT_SHMEM_UNLIMITED && \
+ entry->dict_size + tsearch_ctl->loaded_size > \
+ max_shared_dictionaries_size * 1024L) \
+ { \
+ LWLockRelease(&tsearch_ctl->lock); \
+ ereport(LOG, \
+ (errmsg("there is no space in shared memory for text search " \
+ "dictionary %u, it will be loaded into backend's memory", \
+ initoptions->dictid))); \
+ dshash_delete_entry(dict_table, entry); \
+ return dict; \
+ } \
+
+ init_dict_table();
+
+ /*
+ * Build the dictionary in backend's memory if a hash table wasn't created
+ * or dictid is invalid (it may happen if the dictionary's init method was
+ * called within verify_dictoptions()).
+ */
+ if (!DsaPointerIsValid(tsearch_ctl->dict_table_handle) ||
+ !OidIsValid(initoptions->dictid))
+ {
+ Size dict_size;
+
+ dict = allocate_cb(initoptions->dictoptions, &dict_size);
+
+ return dict;
+ }
+
+ /* Try to find an entry in the hash table */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, &initoptions->dictid,
+ false);
+
+ if (entry)
+ {
+ seg = dsm_find_mapping(entry->dict_dsm);
+ if (!seg)
+ {
+ seg = dsm_attach(entry->dict_dsm);
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+ }
+
+ entry->refcnt++;
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+ }
+
+ /* The dictionary hasn't been loaded into shared memory yet */
+ entry = (TsearchDictEntry *) dshash_find_or_insert(dict_table,
+ &initoptions->dictid,
+ &found);
+
+ if (found)
+ {
+ /*
+ * Someone concurrently inserted a dictionary entry since the first time
+ * we checked.
+ */
+ seg = dsm_attach(entry->dict_dsm);
+
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+
+ entry->refcnt++;
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+ }
+
+ /* Build the dictionary */
+ dict = allocate_cb(initoptions->dictoptions, &entry->dict_size);
+
+ LWLockAcquire(&tsearch_ctl->lock, LW_SHARED);
+
+ /* We can only get here when shared dictionaries are enabled */
+ Assert(max_shared_dictionaries_size > 0 ||
+ max_shared_dictionaries_size == TS_DICT_SHMEM_UNLIMITED);
+
+ /* Before allocating a DSM segment check remaining shared space */
+ CHECK_SHARED_SPACE();
+
+ LWLockRelease(&tsearch_ctl->lock);
+ /* If we come here, we need an exclusive lock */
+ while (!LWLockAcquireOrWait(&tsearch_ctl->lock, LW_EXCLUSIVE))
+ {
+ /*
+ * Check again in case the remaining space was used up while we were
+ * waiting for the exclusive lock.
+ */
+ CHECK_SHARED_SPACE();
+ }
+
+ tsearch_ctl->loaded_size += entry->dict_size;
+
+ LWLockRelease(&tsearch_ctl->lock);
+
+ /* Finally, allocate a DSM segment for the compiled dictionary */
+ seg = dsm_create(entry->dict_size, 0);
+ dict_location = dsm_segment_address(seg);
+ memcpy(dict_location, dict, entry->dict_size);
+
+ pfree(dict);
+
+ entry->dict_id = initoptions->dictid;
+ entry->dict_dsm = dsm_segment_handle(seg);
+ entry->refcnt++;
+
+ /* Keep the segment alive until postmaster shutdown */
+ dsm_pin_segment(seg);
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+}
+
+/*
+ * Release memory occupied by the dictionary. The function just unpins the DSM
+ * mapping. If no other backend has it mapped, unpin the DSM segment as well.
+ *
+ * dictid: Oid of the dictionary.
+ */
+void
+ts_dict_shmem_release(Oid dictid)
+{
+ TsearchDictEntry *entry;
+
+ /*
+ * If we didn't attach to a hash table then do nothing.
+ */
+ if (!dict_table)
+ return;
+
+ /* Try to find an entry in the hash table */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, &dictid, true);
+
+ if (entry)
+ {
+ dsm_segment *seg;
+
+ seg = dsm_find_mapping(entry->dict_dsm);
+ /*
+ * If the current backend didn't pin a mapping, there is nothing to
+ * unpin.
+ */
+ if (!seg)
+ {
+ dshash_release_lock(dict_table, entry);
+ return;
+ }
+
+ dsm_unpin_mapping(seg);
+ dsm_detach(seg);
+
+ entry->refcnt--;
+
+ if (entry->refcnt == 0)
+ {
+ dsm_unpin_segment(entry->dict_dsm);
+
+ /* Decrease total loaded size */
+ LWLockAcquire(&tsearch_ctl->lock, LW_EXCLUSIVE);
+ tsearch_ctl->loaded_size -= entry->dict_size;
+ LWLockRelease(&tsearch_ctl->lock);
+
+ dshash_delete_entry(dict_table, entry);
+ }
+ else
+ dshash_release_lock(dict_table, entry);
+ }
+}
+
+/*
+ * Get the total size, in bytes, of dictionaries loaded into shared memory.
+ */
+Size
+ts_dict_shmem_loaded_size(void)
+{
+ Size res;
+
+ LWLockAcquire(&tsearch_ctl->lock, LW_SHARED);
+
+ res = tsearch_ctl->loaded_size;
+
+ LWLockRelease(&tsearch_ctl->lock);
+
+ return res;
+}
+
+/*
+ * Allocate and initialize tsearch-related shared memory.
+ */
+void
+TsearchShmemInit(void)
+{
+ bool found;
+
+ tsearch_ctl = (TsearchCtlData *)
+ ShmemInitStruct("Full Text Search Ctl", sizeof(TsearchCtlData), &found);
+
+ if (!found)
+ {
+ LWLockRegisterTranche(LWTRANCHE_TSEARCH_DSA, "tsearch_dsa");
+ LWLockRegisterTranche(LWTRANCHE_TSEARCH_TABLE, "tsearch_table");
+
+ LWLockInitialize(&tsearch_ctl->lock, LWTRANCHE_TSEARCH_DSA);
+
+ tsearch_ctl->area = DSM_HANDLE_INVALID;
+ tsearch_ctl->dict_table_handle = InvalidDsaPointer;
+ tsearch_ctl->loaded_size = 0;
+ }
+}
+
+/*
+ * Report shared memory space needed by TsearchShmemInit.
+ */
+Size
+TsearchShmemSize(void)
+{
+ Size size = 0;
+
+ /* size of service structure */
+ size = add_size(size, MAXALIGN(sizeof(TsearchCtlData)));
+
+ return size;
+}
+
+/*
+ * Initialize hash table located in DSM.
+ *
+ * The hash table should be created and initialized iff
+ * max_shared_dictionaries_size GUC isn't equal to zero and it doesn't exist
+ * yet.
+ */
+static void
+init_dict_table(void)
+{
+ MemoryContext old_context;
+ dsa_area *dsa;
+
+ /* Bail out if shared dictionaries are not allowed */
+ if (max_shared_dictionaries_size == 0)
+ return;
+
+ if (dict_table)
+ return;
+
+ old_context = MemoryContextSwitchTo(TopMemoryContext);
+
+recheck_table:
+ LWLockAcquire(&tsearch_ctl->lock, LW_SHARED);
+
+ /* The hash table has already been created by another backend */
+ if (DsaPointerIsValid(tsearch_ctl->dict_table_handle))
+ {
+ Assert(tsearch_ctl->area != DSM_HANDLE_INVALID);
+
+ dsa = dsa_attach(tsearch_ctl->area);
+
+ dict_table = dshash_attach(dsa,
+ &dict_table_params,
+ tsearch_ctl->dict_table_handle,
+ NULL);
+ }
+ else
+ {
+ /* Try to get exclusive lock */
+ LWLockRelease(&tsearch_ctl->lock);
+ if (!LWLockAcquireOrWait(&tsearch_ctl->lock, LW_EXCLUSIVE))
+ {
+ /*
+ * Another backend held the lock and has released it; it may have
+ * created the hash table in the meantime, so check again.
+ */
+ goto recheck_table;
+ }
+
+ dsa = dsa_create(LWTRANCHE_TSEARCH_DSA);
+ tsearch_ctl->area = dsa_get_handle(dsa);
+
+ dict_table = dshash_create(dsa, &dict_table_params, NULL);
+ tsearch_ctl->dict_table_handle = dshash_get_hash_table_handle(dict_table);
+
+ /* Keep the area alive until postmaster shutdown */
+ dsa_pin(dsa);
+ }
+
+ LWLockRelease(&tsearch_ctl->lock);
+
+ /* Remain attached until end of session */
+ dsa_pin_mapping(dsa);
+
+ MemoryContextSwitchTo(old_context);
+}
diff --git a/src/backend/utils/cache/ts_cache.c b/src/backend/utils/cache/ts_cache.c
index adb9c60b72..aed3395075 100644
--- a/src/backend/utils/cache/ts_cache.c
+++ b/src/backend/utils/cache/ts_cache.c
@@ -40,6 +40,7 @@
#include "commands/defrem.h"
#include "tsearch/ts_cache.h"
#include "tsearch/ts_public.h"
+#include "tsearch/ts_shared.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
#include "utils/fmgroids.h"
@@ -99,7 +100,16 @@ InvalidateTSCacheCallBack(Datum arg, int cacheid, uint32 hashvalue)
hash_seq_init(&status, hash);
while ((entry = (TSAnyCacheEntry *) hash_seq_search(&status)) != NULL)
+ {
+ if (entry->isvalid && hash == TSDictionaryCacheHash)
+ {
+ TSDictionaryCacheEntry *dict_entry = (TSDictionaryCacheEntry *) entry;
+
+ ts_dict_shmem_release(dict_entry->dictId);
+ }
+
entry->isvalid = false;
+ }
/* Also invalidate the current-config cache if it's pg_ts_config */
if (hash == TSConfigCacheHash)
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 7a7ac479c1..ec1a926ab3 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -76,6 +76,7 @@
#include "storage/predicate.h"
#include "tcop/tcopprot.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_shared.h"
#include "utils/builtins.h"
#include "utils/bytea.h"
#include "utils/guc_tables.h"
@@ -185,6 +186,8 @@ static bool check_autovacuum_max_workers(int *newval, void **extra, GucSource so
static bool check_autovacuum_work_mem(int *newval, void **extra, GucSource source);
static bool check_effective_io_concurrency(int *newval, void **extra, GucSource source);
static void assign_effective_io_concurrency(int newval, void *extra);
+static bool check_max_shared_dictionaries_size(int *newval, void **extra,
+ GucSource source);
static void assign_pgstat_temp_directory(const char *newval, void *extra);
static bool check_application_name(char **newval, void **extra, GucSource source);
static void assign_application_name(const char *newval, void *extra);
@@ -2932,6 +2935,20 @@ static struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"max_shared_dictionaries_size", PGC_SIGHUP, RESOURCES_MEM,
+ gettext_noop("Sets the maximum size of all text search dictionaries loaded into shared memory."),
+ gettext_noop("Currently controls only loading of Ispell dictionaries. "
+ "If total size of simultaneously loaded dictionaries "
+ "reaches the maximum allowed size then a new dictionary "
+ "will be loaded into private backend memory."),
+ GUC_UNIT_KB,
+ },
+ &max_shared_dictionaries_size,
+ 100 * 1024, -1, MAX_KILOBYTES,
+ check_max_shared_dictionaries_size, NULL, NULL
+ },
+
/* End-of-list marker */
{
{NULL, 0, 0, NULL, NULL}, NULL, 0, 0, 0, NULL, NULL, NULL
@@ -10478,6 +10495,39 @@ assign_effective_io_concurrency(int newval, void *extra)
#endif /* USE_PREFETCH */
}
+static bool
+check_max_shared_dictionaries_size(int *newval, void **extra, GucSource source)
+{
+ /*
+ * We can check the size of loaded dictionaries only if we are in normal
+ * processing mode.
+ */
+ if (!IsNormalProcessingMode())
+ return true;
+
+ if (*newval != TS_DICT_SHMEM_UNLIMITED)
+ {
+ Size loaded_size = ts_dict_shmem_loaded_size();
+
+ if (*newval == 0 && loaded_size > 0)
+ {
+ GUC_check_errmsg("cannot disable shared dictionaries while "
+ "dictionaries are loaded into shared memory");
+ return false;
+ }
+ else if (*newval * 1024L < loaded_size)
+ {
+ GUC_check_errmsg("cannot decrease the maximum allowed size: "
+ "the total size of dictionaries loaded "
+ "into shared memory is greater than "
+ "the new value");
+ return false;
+ }
+ }
+
+ return true;
+}
+
static void
assign_pgstat_temp_directory(const char *newval, void *extra)
{
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 048bf4cccd..60a109246b 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -135,6 +135,7 @@
# mmap
# use none to disable dynamic shared memory
# (change requires restart)
+#max_shared_dictionaries_size = 100MB
# - Disk -
diff --git a/src/include/storage/lwlock.h b/src/include/storage/lwlock.h
index c21bfe2f66..16b0858eda 100644
--- a/src/include/storage/lwlock.h
+++ b/src/include/storage/lwlock.h
@@ -219,6 +219,8 @@ typedef enum BuiltinTrancheIds
LWTRANCHE_SHARED_TUPLESTORE,
LWTRANCHE_TBM,
LWTRANCHE_PARALLEL_APPEND,
+ LWTRANCHE_TSEARCH_DSA,
+ LWTRANCHE_TSEARCH_TABLE,
LWTRANCHE_FIRST_USER_DEFINED
} BuiltinTrancheIds;
diff --git a/src/include/tsearch/ts_shared.h b/src/include/tsearch/ts_shared.h
new file mode 100644
index 0000000000..fbe27d436c
--- /dev/null
+++ b/src/include/tsearch/ts_shared.h
@@ -0,0 +1,38 @@
+/*-------------------------------------------------------------------------
+ *
+ * ts_shared.h
+ * tsearch shared memory management
+ *
+ * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
+ *
+ * src/include/tsearch/ts_shared.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef TS_SHARED_H
+#define TS_SHARED_H
+
+#include "tsearch/ts_public.h"
+
+/*
+ * Value of max_shared_dictionaries_size meaning that the amount of shared
+ * memory used by shared dictionaries is unlimited.
+ */
+#define TS_DICT_SHMEM_UNLIMITED (-1)
+
+/*
+ * GUC variable: maximum total size of shared dictionaries, in kilobytes
+ */
+extern int max_shared_dictionaries_size;
+
+typedef void *(*ts_dict_build_callback) (List *dictoptions, Size *size);
+
+extern void *ts_dict_shmem_location(DictInitData *initoptions,
+ ts_dict_build_callback allocate_cb);
+extern void ts_dict_shmem_release(Oid dictid);
+extern Size ts_dict_shmem_loaded_size(void);
+
+extern void TsearchShmemInit(void);
+extern Size TsearchShmemSize(void);
+
+#endif /* TS_SHARED_H */
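
The space check that ts_dict_shmem_location() performs through CHECK_SHARED_SPACE() can be read in isolation as a small predicate. The following is only a sketch under stated assumptions, not code from the patch: dict_fits_in_shmem() and DICT_SHMEM_UNLIMITED are hypothetical names, locking is left out entirely, and a GUC value of 0 is handled inside the predicate here, whereas the patch handles it earlier by never creating the hash table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for TS_DICT_SHMEM_UNLIMITED from ts_shared.h. */
#define DICT_SHMEM_UNLIMITED (-1)

/*
 * Hypothetical helper: may a dictionary of dict_size bytes be copied into
 * shared memory?
 *
 * max_size_kb: value of max_shared_dictionaries_size (kilobytes, -1 or 0).
 * loaded_size: bytes already used by shared dictionaries.
 * dict_size:   bytes needed by the newly built dictionary.
 */
static bool
dict_fits_in_shmem(int max_size_kb, size_t loaded_size, size_t dict_size)
{
    assert(max_size_kb >= DICT_SHMEM_UNLIMITED);

    /* 0 disables shared dictionaries altogether. */
    if (max_size_kb == 0)
        return false;

    /* -1 means that there is no limit. */
    if (max_size_kb == DICT_SHMEM_UNLIMITED)
        return true;

    /* The GUC is in kilobytes; widen before converting it to bytes. */
    return dict_size + loaded_size <= (size_t) max_size_kb * 1024;
}
```

With the default limit of 100MB, a 50MB dictionary fits while nothing else is loaded, but not once 60MB are already in use. In that case the patch logs a message and keeps the dictionary in the backend's private memory.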
Attachment: 0004-Store-ispell-in-shared-location-v8.patch (text/plain; charset=us-ascii)
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index 610b7bf033..f3288fbb3f 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -3030,6 +3030,25 @@ CREATE TEXT SEARCH DICTIONARY english_stem (
</sect2>
+ <sect2 id="textsearch-shared-dictionaries">
+ <title>Dictionaries in Shared Memory</title>
+
+ <para>
+ Some dictionaries, especially <application>Ispell</application>, consume
+ a significant amount of memory, in some cases tens of megabytes. Most of
+ them store the data in text files, and building the in-memory structure is
+ both CPU-intensive and time-consuming. Instead of doing this in each backend
+ when it needs a dictionary for the first time, the compiled dictionary may be
+ stored in shared memory so that it may be reused by other backends.
+ </para>
+
+ <para>
+ To enable storing dictionaries in shared memory, set the <xref linkend="guc-max-shared-dictionaries-size"/>
+ parameter to a value greater than zero or to <literal>-1</literal>.
+ </para>
+
+ </sect2>
+
</sect1>
<sect1 id="textsearch-configuration">
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index 6d0dedbefb..6294a52af3 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -5,6 +5,15 @@
*
* Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
*
+ * By default all Ispell dictionaries are stored in DSM. If the amount of
+ * shared memory would exceed max_shared_dictionaries_size, the dictionary is
+ * allocated in private backend memory instead (in the dictCtx context).
+ *
+ * All necessary data is built within the dispell_build() function. However,
+ * structures for regular expressions are compiled on first demand and
+ * stored in the AffixReg array, because regex_t and Regis cannot easily be
+ * stored in shared memory.
+ *
*
* IDENTIFICATION
* src/backend/tsearch/dict_ispell.c
@@ -14,8 +23,10 @@
#include "postgres.h"
#include "commands/defrem.h"
+#include "storage/dsm.h"
#include "tsearch/dicts/spell.h"
#include "tsearch/ts_locale.h"
+#include "tsearch/ts_shared.h"
#include "tsearch/ts_utils.h"
#include "utils/builtins.h"
@@ -26,54 +37,126 @@ typedef struct
IspellDict obj;
} DictISpell;
+static void parse_dictoptions(List *dictoptions,
+ char **dictfile, char **afffile, char **stopfile);
+static void *dispell_build(List *dictoptions, Size *size);
+
Datum
dispell_init(PG_FUNCTION_ARGS)
{
DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictISpell *d;
- bool affloaded = false,
- dictloaded = false,
- stoploaded = false;
- ListCell *l;
+ void *dict_location;
+ char *stopfile;
d = (DictISpell *) palloc0(sizeof(DictISpell));
- NIStartBuild(&(d->obj));
+ parse_dictoptions(init_data->dictoptions, NULL, NULL, &stopfile);
+
+ if (stopfile)
+ readstoplist(stopfile, &(d->stoplist), lowerstr);
+
+ dict_location = ts_dict_shmem_location(init_data, dispell_build);
+ Assert(dict_location);
+
+ d->obj.dict = (IspellDictData *) dict_location;
+ d->obj.reg = (AffixReg *) palloc0(d->obj.dict->nAffix *
+ sizeof(AffixReg));
+ /* Current memory context is dictionary's private memory context */
+ d->obj.dictCtx = CurrentMemoryContext;
+
+ PG_RETURN_POINTER(d);
+}
+
+Datum
+dispell_lexize(PG_FUNCTION_ARGS)
+{
+ DictISpell *d = (DictISpell *) PG_GETARG_POINTER(0);
+ char *in = (char *) PG_GETARG_POINTER(1);
+ int32 len = PG_GETARG_INT32(2);
+ char *txt;
+ TSLexeme *res;
+ TSLexeme *ptr,
+ *cptr;
+
+ if (len <= 0)
+ PG_RETURN_POINTER(NULL);
+
+ txt = lowerstr_with_len(in, len);
+ res = NINormalizeWord(&(d->obj), txt);
+
+ if (res == NULL)
+ PG_RETURN_POINTER(NULL);
+
+ cptr = res;
+ for (ptr = cptr; ptr->lexeme; ptr++)
+ {
+ if (searchstoplist(&(d->stoplist), ptr->lexeme))
+ {
+ pfree(ptr->lexeme);
+ ptr->lexeme = NULL;
+ }
+ else
+ {
+ if (cptr != ptr)
+ memcpy(cptr, ptr, sizeof(TSLexeme));
+ cptr++;
+ }
+ }
+ cptr->lexeme = NULL;
+
+ PG_RETURN_POINTER(res);
+}
+
+static void
+parse_dictoptions(List *dictoptions, char **dictfile, char **afffile,
+ char **stopfile)
+{
+ ListCell *l;
- foreach(l, init_data->dictoptions)
+ if (dictfile)
+ *dictfile = NULL;
+ if (afffile)
+ *afffile = NULL;
+ if (stopfile)
+ *stopfile = NULL;
+
+ foreach(l, dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
if (strcmp(defel->defname, "dictfile") == 0)
{
- if (dictloaded)
+ if (!dictfile)
+ continue;
+
+ if (*dictfile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple DictFile parameters")));
- NIImportDictionary(&(d->obj),
- get_tsearch_config_filename(defGetString(defel),
- "dict"));
- dictloaded = true;
+ *dictfile = get_tsearch_config_filename(defGetString(defel), "dict");
}
else if (strcmp(defel->defname, "afffile") == 0)
{
- if (affloaded)
+ if (!afffile)
+ continue;
+
+ if (*afffile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple AffFile parameters")));
- NIImportAffixes(&(d->obj),
- get_tsearch_config_filename(defGetString(defel),
- "affix"));
- affloaded = true;
+ *afffile = get_tsearch_config_filename(defGetString(defel), "affix");
}
else if (strcmp(defel->defname, "stopwords") == 0)
{
- if (stoploaded)
+ if (!stopfile)
+ continue;
+
+ if (*stopfile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple StopWords parameters")));
- readstoplist(defGetString(defel), &(d->stoplist), lowerstr);
- stoploaded = true;
+ *stopfile = defGetString(defel);
}
else
{
@@ -83,66 +166,52 @@ dispell_init(PG_FUNCTION_ARGS)
defel->defname)));
}
}
+}
- if (affloaded && dictloaded)
- {
- NISortDictionary(&(d->obj));
- NISortAffixes(&(d->obj));
- }
- else if (!affloaded)
+/*
+ * Build the dictionary.
+ *
+ * Result is palloc'ed.
+ */
+static void *
+dispell_build(List *dictoptions, Size *size)
+{
+ IspellDictBuild build;
+ char *dictfile,
+ *afffile;
+
+ parse_dictoptions(dictoptions, &dictfile, &afffile, NULL);
+
+ if (!afffile)
{
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("missing AffFile parameter")));
}
- else
+ else if (!dictfile)
{
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("missing DictFile parameter")));
}
- NIFinishBuild(&(d->obj));
+ MemSet(&build, 0, sizeof(build));
+ NIStartBuild(&build);
- PG_RETURN_POINTER(d);
-}
+ /* Read files */
+ NIImportDictionary(&build, dictfile);
+ NIImportAffixes(&build, afffile);
-Datum
-dispell_lexize(PG_FUNCTION_ARGS)
-{
- DictISpell *d = (DictISpell *) PG_GETARG_POINTER(0);
- char *in = (char *) PG_GETARG_POINTER(1);
- int32 len = PG_GETARG_INT32(2);
- char *txt;
- TSLexeme *res;
- TSLexeme *ptr,
- *cptr;
+ /* Build persistent data to use by backends */
+ NISortDictionary(&build);
+ NISortAffixes(&build);
- if (len <= 0)
- PG_RETURN_POINTER(NULL);
+ NICopyData(&build);
- txt = lowerstr_with_len(in, len);
- res = NINormalizeWord(&(d->obj), txt);
-
- if (res == NULL)
- PG_RETURN_POINTER(NULL);
+ /* Release temporary data */
+ NIFinishBuild(&build);
- cptr = res;
- for (ptr = cptr; ptr->lexeme; ptr++)
- {
- if (searchstoplist(&(d->stoplist), ptr->lexeme))
- {
- pfree(ptr->lexeme);
- ptr->lexeme = NULL;
- }
- else
- {
- if (cptr != ptr)
- memcpy(cptr, ptr, sizeof(TSLexeme));
- cptr++;
- }
- }
- cptr->lexeme = NULL;
-
- PG_RETURN_POINTER(res);
+ /* Return the buffer and its size */
+ *size = build.dict_size;
+ return build.dict;
}
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index e071994523..1c560ef56a 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -23,33 +23,35 @@
* Compilation of a dictionary
* ---------------------------
*
- * A compiled dictionary is stored in the IspellDict structure. Compilation of
- * a dictionary is divided into the several steps:
+ * A compiled dictionary is stored in the following structures:
+ * - IspellDictBuild - stores temporary data and IspellDictData
+ * - IspellDictData - stores permanent data used within NINormalizeWord()
+ * Compilation of the dictionary is divided into several steps:
* - NIImportDictionary() - stores each word of a .dict file in the
* temporary Spell field.
- * - NIImportAffixes() - stores affix rules of an .affix file in the
- * Affix field (not temporary) if an .affix file has the Ispell format.
+ * - NIImportAffixes() - stores affix rules of an .affix file in the temporary
+ * Affix field if an .affix file has the Ispell format.
* -> NIImportOOAffixes() - stores affix rules if an .affix file has the
* Hunspell format. The AffixData field is initialized if AF parameter
* is defined.
* - NISortDictionary() - builds a prefix tree (Trie) from the words list
- * and stores it in the Dictionary field. The words list is got from the
+ * and stores it in the DictNodes field. The words list is got from the
* Spell field. The AffixData field is initialized if AF parameter is not
* defined.
* - NISortAffixes():
* - builds a list of compound affixes from the affix list and stores it
* in the CompoundAffix.
* - builds prefix trees (Trie) from the affix list for prefixes and suffixes
- * and stores them in Suffix and Prefix fields.
+ * and stores them in SuffixNodes and PrefixNodes fields.
* The affix list is got from the Affix field.
+ * Persistent data of the dictionary is copied within NICopyData().
*
* Memory management
* -----------------
*
- * The IspellDict structure has the Spell field which is used only in compile
- * time. The Spell field stores a words list. It can take a lot of memory.
- * Therefore when a dictionary is compiled this field is cleared by
- * NIFinishBuild().
+ * The IspellDictBuild structure holds temporary data that is used only at
+ * compile time. It can take a lot of memory. Therefore this data is released
+ * by NIFinishBuild() after the dictionary has been compiled.
*
* All resources which should cleared by NIFinishBuild() is initialized using
* tmpalloc() and tmpalloc0().
@@ -73,112 +75,147 @@
* after the initialization is done. During initialization,
* CurrentMemoryContext is the long-lived memory context associated
* with the dictionary cache entry. We keep the short-lived stuff
- * in the Conf->buildCxt context.
+ * in the ConfBuild->buildCxt context.
*/
-#define tmpalloc(sz) MemoryContextAlloc(Conf->buildCxt, (sz))
-#define tmpalloc0(sz) MemoryContextAllocZero(Conf->buildCxt, (sz))
+#define tmpalloc(sz) MemoryContextAlloc(ConfBuild->buildCxt, (sz))
+#define tmpalloc0(sz) MemoryContextAllocZero(ConfBuild->buildCxt, (sz))
-#define tmpstrdup(str) MemoryContextStrdup(Conf->buildCxt, (str))
+#define tmpstrdup(str) MemoryContextStrdup(ConfBuild->buildCxt, (str))
/*
* Prepare for constructing an ISpell dictionary.
*
- * The IspellDict struct is assumed to be zeroed when allocated.
+ * The IspellDictBuild struct is assumed to be zeroed when allocated.
*/
void
-NIStartBuild(IspellDict *Conf)
+NIStartBuild(IspellDictBuild *ConfBuild)
{
+ uint32 dict_size;
+
/*
* The temp context is a child of CurTransactionContext, so that it will
* go away automatically on error.
*/
- Conf->buildCxt = AllocSetContextCreate(CurTransactionContext,
- "Ispell dictionary init context",
- ALLOCSET_DEFAULT_SIZES);
+ ConfBuild->buildCxt = AllocSetContextCreate(CurTransactionContext,
+ "Ispell dictionary init context",
+ ALLOCSET_DEFAULT_SIZES);
+
+ /*
+ * Allocate the dictionary buffer in the current context, not in buildCxt.
+ */
+ dict_size = MAXALIGN(IspellDictDataHdrSize);
+ ConfBuild->dict = palloc0(dict_size);
+ ConfBuild->dict_size = dict_size;
}
/*
- * Clean up when dictionary construction is complete.
+ * Copy compiled and persistent data into IspellDictData.
*/
void
-NIFinishBuild(IspellDict *Conf)
+NICopyData(IspellDictBuild *ConfBuild)
{
- /* Release no-longer-needed temp memory */
- MemoryContextDelete(Conf->buildCxt);
- /* Just for cleanliness, zero the now-dangling pointers */
- Conf->buildCxt = NULL;
- Conf->Spell = NULL;
- Conf->firstfree = NULL;
- Conf->CompoundAffixFlags = NULL;
-}
+ IspellDictData *dict;
+ uint32 size;
+ int i;
+ uint32 *offsets,
+ offset;
+ /*
+ * Calculate necessary space
+ */
+ size = ConfBuild->nAffixData * sizeof(uint32);
+ size += ConfBuild->AffixDataEnd;
-/*
- * "Compact" palloc: allocate without extra palloc overhead.
- *
- * Since we have no need to free the ispell data items individually, there's
- * not much value in the per-chunk overhead normally consumed by palloc.
- * Getting rid of it is helpful since ispell can allocate a lot of small nodes.
- *
- * We currently pre-zero all data allocated this way, even though some of it
- * doesn't need that. The cpalloc and cpalloc0 macros are just documentation
- * to indicate which allocations actually require zeroing.
- */
-#define COMPACT_ALLOC_CHUNK 8192 /* amount to get from palloc at once */
-#define COMPACT_MAX_REQ 1024 /* must be < COMPACT_ALLOC_CHUNK */
+ size += ConfBuild->nAffix * sizeof(uint32);
+ size += ConfBuild->AffixSize;
-static void *
-compact_palloc0(IspellDict *Conf, size_t size)
-{
- void *result;
+ size += ConfBuild->DictNodes.NodesEnd;
+ size += ConfBuild->PrefixNodes.NodesEnd;
+ size += ConfBuild->SuffixNodes.NodesEnd;
- /* Should only be called during init */
- Assert(Conf->buildCxt != NULL);
+ size += sizeof(CMPDAffix) * ConfBuild->nCompoundAffix;
- /* No point in this for large chunks */
- if (size > COMPACT_MAX_REQ)
- return palloc0(size);
+ /*
+ * Copy data itself
+ */
+ ConfBuild->dict_size = IspellDictDataHdrSize + size;
+ ConfBuild->dict = repalloc(ConfBuild->dict, ConfBuild->dict_size);
+
+ dict = ConfBuild->dict;
+
+ /* AffixData */
+ dict->nAffixData = ConfBuild->nAffixData;
+ dict->AffixDataStart = sizeof(uint32) * ConfBuild->nAffixData;
+ memcpy(DictAffixDataOffset(dict), ConfBuild->AffixDataOffset,
+ sizeof(uint32) * ConfBuild->nAffixData);
+ memcpy(DictAffixData(dict), ConfBuild->AffixData, ConfBuild->AffixDataEnd);
+
+ /* Affix array */
+ dict->nAffix = ConfBuild->nAffix;
+ dict->AffixOffsetStart = dict->AffixDataStart + ConfBuild->AffixDataEnd;
+ dict->AffixStart = dict->AffixOffsetStart + sizeof(uint32) * ConfBuild->nAffix;
+ offsets = (uint32 *) DictAffixOffset(dict);
+ offset = 0;
+ for (i = 0; i < ConfBuild->nAffix; i++)
+ {
+ AFFIX *affix;
+ uint32 size = AffixGetSize(ConfBuild->Affix[i]);
- /* Keep everything maxaligned */
- size = MAXALIGN(size);
+ offsets[i] = offset;
+ affix = (AFFIX *) DictAffixGet(dict, i);
+ Assert(affix);
- /* Need more space? */
- if (size > Conf->avail)
- {
- Conf->firstfree = palloc0(COMPACT_ALLOC_CHUNK);
- Conf->avail = COMPACT_ALLOC_CHUNK;
- }
+ memcpy(affix, ConfBuild->Affix[i], size);
- result = (void *) Conf->firstfree;
- Conf->firstfree += size;
- Conf->avail -= size;
+ offset += size;
+ }
- return result;
+ /* DictNodes prefix tree */
+ dict->DictNodesStart = dict->AffixStart + offset;
+ memcpy(DictDictNodes(dict), ConfBuild->DictNodes.Nodes,
+ ConfBuild->DictNodes.NodesEnd);
+
+ /* PrefixNodes prefix tree */
+ dict->PrefixNodesStart = dict->DictNodesStart + ConfBuild->DictNodes.NodesEnd;
+ memcpy(DictPrefixNodes(dict), ConfBuild->PrefixNodes.Nodes,
+ ConfBuild->PrefixNodes.NodesEnd);
+
+ /* SuffixNodes prefix tree */
+ dict->SuffixNodesStart = dict->PrefixNodesStart + ConfBuild->PrefixNodes.NodesEnd;
+ memcpy(DictSuffixNodes(dict), ConfBuild->SuffixNodes.Nodes,
+ ConfBuild->SuffixNodes.NodesEnd);
+
+ /* CompoundAffix array */
+ dict->CompoundAffixStart = dict->SuffixNodesStart +
+ ConfBuild->SuffixNodes.NodesEnd;
+ memcpy(DictCompoundAffix(dict), ConfBuild->CompoundAffix,
+ sizeof(CMPDAffix) * ConfBuild->nCompoundAffix);
}
-#define cpalloc(size) compact_palloc0(Conf, size)
-#define cpalloc0(size) compact_palloc0(Conf, size)
-
-static char *
-cpstrdup(IspellDict *Conf, const char *str)
+/*
+ * Clean up when dictionary construction is complete.
+ */
+void
+NIFinishBuild(IspellDictBuild *ConfBuild)
{
- char *res = cpalloc(strlen(str) + 1);
-
- strcpy(res, str);
- return res;
+ /* Release no-longer-needed temp memory */
+ MemoryContextDelete(ConfBuild->buildCxt);
+ /* Just for cleanliness, zero the now-dangling pointers */
+ ConfBuild->buildCxt = NULL;
+ ConfBuild->Spell = NULL;
+ ConfBuild->CompoundAffixFlags = NULL;
}
-
/*
* Apply lowerstr(), producing a temporary result (in the buildCxt).
*/
static char *
-lowerstr_ctx(IspellDict *Conf, const char *src)
+lowerstr_ctx(IspellDictBuild *ConfBuild, const char *src)
{
MemoryContext saveCtx;
char *dst;
- saveCtx = MemoryContextSwitchTo(Conf->buildCxt);
+ saveCtx = MemoryContextSwitchTo(ConfBuild->buildCxt);
dst = lowerstr(src);
MemoryContextSwitchTo(saveCtx);
@@ -190,7 +227,7 @@ lowerstr_ctx(IspellDict *Conf, const char *src)
#define STRNCMP(s,p) strncmp( (s), (p), strlen(p) )
#define GETWCHAR(W,L,N,T) ( ((const uint8*)(W))[ ((T)==FF_PREFIX) ? (N) : ( (L) - 1 - (N) ) ] )
-#define GETCHAR(A,N,T) GETWCHAR( (A)->repl, (A)->replen, N, T )
+#define GETCHAR(A,N,T) GETWCHAR( AffixFieldRepl(A), (A)->replen, N, T )
static char *VoidString = "";
@@ -311,18 +348,189 @@ strbncmp(const unsigned char *s1, const unsigned char *s2, size_t count)
static int
cmpaffix(const void *s1, const void *s2)
{
- const AFFIX *a1 = (const AFFIX *) s1;
- const AFFIX *a2 = (const AFFIX *) s2;
+ const AFFIX *a1 = *((AFFIX *const *) s1);
+ const AFFIX *a2 = *((AFFIX *const *) s2);
if (a1->type < a2->type)
return -1;
if (a1->type > a2->type)
return 1;
if (a1->type == FF_PREFIX)
- return strcmp(a1->repl, a2->repl);
+ return strcmp(AffixFieldRepl(a1), AffixFieldRepl(a2));
else
- return strbcmp((const unsigned char *) a1->repl,
- (const unsigned char *) a2->repl);
+ return strbcmp((const unsigned char *) AffixFieldRepl(a1),
+ (const unsigned char *) AffixFieldRepl(a2));
+}
+
+/*
+ * Allocate space for AffixData.
+ */
+static void
+InitAffixData(IspellDictBuild *ConfBuild, int numAffixData)
+{
+ uint32 size;
+
+ size = 8 * 1024; /* Reserve 8KB for data */
+
+ ConfBuild->AffixData = (char *) tmpalloc(size);
+ ConfBuild->AffixDataSize = size;
+ ConfBuild->AffixDataOffset = (uint32 *) tmpalloc(numAffixData * sizeof(uint32));
+ ConfBuild->nAffixData = 0;
+ ConfBuild->mAffixData = numAffixData;
+
+ /* Save offset of the end of data */
+ ConfBuild->AffixDataEnd = 0;
+}
+
+/*
+ * Add a set of affix flags to the IspellDictBuild struct. If the AffixData
+ * buffer cannot hold the new set, enlarge it.
+ *
+ * ConfBuild: building structure for the current dictionary.
+ * AffixSet: set of affix flags (AffixSetLen bytes, excluding the final \0).
+ */
+static void
+AddAffixSet(IspellDictBuild *ConfBuild, const char *AffixSet,
+ uint32 AffixSetLen)
+{
+ /*
+ * Check available space for AffixSet.
+ */
+ if (ConfBuild->AffixDataEnd + AffixSetLen + 1 /* \0 */ >=
+ ConfBuild->AffixDataSize)
+ {
+ uint32 newsize = Max(ConfBuild->AffixDataSize + 8 * 1024 /* 8KB */,
+ ConfBuild->AffixDataSize + AffixSetLen + 1);
+
+ ConfBuild->AffixData = (char *) repalloc(ConfBuild->AffixData, newsize);
+ ConfBuild->AffixDataSize = newsize;
+ }
+
+ /* Check available number of offsets */
+ if (ConfBuild->nAffixData >= ConfBuild->mAffixData)
+ {
+ ConfBuild->mAffixData *= 2;
+ ConfBuild->AffixDataOffset = (uint32 *) repalloc(ConfBuild->AffixDataOffset,
+ sizeof(uint32) * ConfBuild->mAffixData);
+ }
+
+ ConfBuild->AffixDataOffset[ConfBuild->nAffixData] = ConfBuild->AffixDataEnd;
+ StrNCpy(AffixDataGet(ConfBuild, ConfBuild->nAffixData),
+ AffixSet, AffixSetLen + 1);
+
+ /* Save offset of the end of data */
+ ConfBuild->AffixDataEnd += AffixSetLen + 1;
+ ConfBuild->nAffixData++;
+}
+
+/*
+ * Allocate space for prefix tree node.
+ *
+ * ConfBuild: building structure for the current dictionary.
+ * array: NodeArray where to allocate new node.
+ * length: number of allocated NodeData.
+ * sizeNodeData: minimum size of each NodeData.
+ * sizeNodeHeader: size of header of new node.
+ *
+ * Returns an offset of new node in NodeArray->Nodes.
+ */
+static uint32
+AllocateNode(IspellDictBuild *ConfBuild, NodeArray *array, uint32 length,
+ uint32 sizeNodeData, uint32 sizeNodeHeader)
+{
+ uint32 node_offset;
+ uint32 size;
+
+ size = sizeNodeHeader + length * sizeNodeData;
+ size = MAXALIGN(size);
+
+ if (array->NodesSize == 0)
+ {
+ array->NodesSize = size * 32; /* Reserve space for next levels of the
+ * prefix tree */
+ array->Nodes = (char *) tmpalloc(array->NodesSize);
+ array->NodesEnd = 0;
+ }
+ else if (array->NodesEnd + size >= array->NodesSize)
+ {
+ array->NodesSize = Max(array->NodesSize * 2, array->NodesSize + size);
+ array->Nodes = (char *) repalloc(array->Nodes, array->NodesSize);
+ }
+
+ node_offset = array->NodesEnd;
+ array->NodesEnd += size;
+
+ return node_offset;
+}
+
+/*
+ * Allocate space for SPNode.
+ *
+ * Returns the offset of the new node in ConfBuild->DictNodes.Nodes.
+ */
+static uint32
+AllocateSPNode(IspellDictBuild *ConfBuild, uint32 length)
+{
+ uint32 offset;
+ SPNode *node;
+ SPNodeData *data;
+ uint32 i;
+
+ offset = AllocateNode(ConfBuild, &ConfBuild->DictNodes, length,
+ sizeof(SPNodeData), SPNHDRSZ);
+ node = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, offset);
+ node->length = length;
+
+ /*
+ * Initialize all SPNodeData with default values. We cannot use memset()
+ * here because not all fields have 0 as default value.
+ */
+ for (i = 0; i < length; i++)
+ {
+ data = &(node->data[i]);
+ data->val = 0;
+ data->affix = ISPELL_INVALID_INDEX;
+ data->compoundflag = 0;
+ data->isword = 0;
+ data->node_offset = ISPELL_INVALID_OFFSET;
+ }
+
+ return offset;
+}
+
+/*
+ * Allocate space for AffixNode.
+ *
+ * Returns an offset of new node in NodeArray->Nodes.
+ */
+static uint32
+AllocateAffixNode(IspellDictBuild *ConfBuild, NodeArray *array, uint32 length)
+{
+ uint32 offset;
+ AffixNode *node;
+ AffixNodeData *data;
+ uint32 i;
+
+ offset = AllocateNode(ConfBuild, array, length, sizeof(AffixNodeData),
+ ANHRDSZ);
+ node = (AffixNode *) NodeArrayGet(array, offset);
+ node->length = length;
+ node->isvoid = 0;
+
+ /*
+ * Initialize all AffixNodeData with default values. We cannot use memset()
+ * here because not all fields have 0 as default value.
+ */
+ for (i = 0; i < length; i++)
+ {
+ data = &(node->data[i]);
+ data->val = 0;
+ data->affstart = ISPELL_INVALID_INDEX;
+ data->affend = ISPELL_INVALID_INDEX;
+ data->node_offset = ISPELL_INVALID_OFFSET;
+ }
+
+ return offset;
}
/*
@@ -333,7 +541,7 @@ cmpaffix(const void *s1, const void *s2)
* - 2 characters (FM_LONG). A character may be Unicode.
* - numbers from 1 to 65000 (FM_NUM).
*
- * Depending on the flagMode an affix string can have the following format:
+ * Depending on the flagmode an affix string can have the following format:
* - FM_CHAR: ABCD
* Here we have 4 flags: A, B, C and D
* - FM_LONG: ABCDE*
@@ -341,13 +549,13 @@ cmpaffix(const void *s1, const void *s2)
* - FM_NUM: 200,205,50
* Here we have 3 flags: 200, 205 and 50
*
- * Conf: current dictionary.
+ * flagmode: flag mode of the dictionary
* sflagset: the set of affix flags. Returns a reference to the start of a next
* affix flag.
* sflag: returns an affix flag from sflagset.
*/
static void
-getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
+getNextFlagFromString(FlagMode flagmode, char **sflagset, char *sflag)
{
int32 s;
char *next,
@@ -356,11 +564,11 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
bool stop = false;
bool met_comma = false;
- maxstep = (Conf->flagMode == FM_LONG) ? 2 : 1;
+ maxstep = (flagmode == FM_LONG) ? 2 : 1;
while (**sflagset)
{
- switch (Conf->flagMode)
+ switch (flagmode)
{
case FM_LONG:
case FM_CHAR:
@@ -422,15 +630,15 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
stop = true;
break;
default:
- elog(ERROR, "unrecognized type of Conf->flagMode: %d",
- Conf->flagMode);
+ elog(ERROR, "unrecognized type of flagmode: %d",
+ flagmode);
}
if (stop)
break;
}
- if (Conf->flagMode == FM_LONG && maxstep > 0)
+ if (flagmode == FM_LONG && maxstep > 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix flag \"%s\" with \"long\" flag value",
@@ -440,31 +648,28 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
}
/*
- * Checks if the affix set Conf->AffixData[affix] contains affixflag.
- * Conf->AffixData[affix] does not contain affixflag if this flag is not used
- * actually by the .dict file.
+ * Checks if the affix set from AffixData contains affixflag. Affix set does
+ * not contain affixflag if this flag is not actually used by the .dict file.
*
- * Conf: current dictionary.
- * affix: index of the Conf->AffixData array.
+ * flagmode: flag mode of the dictionary.
+ * sflagset: the set of affix flags.
* affixflag: the affix flag.
*
- * Returns true if the string Conf->AffixData[affix] contains affixflag,
- * otherwise returns false.
+ * Returns true if the affix set string contains affixflag, otherwise returns
+ * false.
*/
static bool
-IsAffixFlagInUse(IspellDict *Conf, int affix, const char *affixflag)
+IsAffixFlagInUse(FlagMode flagmode, char *sflagset, const char *affixflag)
{
- char *flagcur;
+ char *flagcur = sflagset;
char flag[BUFSIZ];
if (*affixflag == 0)
return true;
- flagcur = Conf->AffixData[affix];
-
while (*flagcur)
{
- getNextFlagFromString(Conf, &flagcur, flag);
+ getNextFlagFromString(flagmode, &flagcur, flag);
/* Compare first affix flag in flagcur with affixflag */
if (strcmp(flag, affixflag) == 0)
return true;
@@ -477,31 +682,33 @@ IsAffixFlagInUse(IspellDict *Conf, int affix, const char *affixflag)
/*
* Adds the new word into the temporary array Spell.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* word: new word.
* flag: set of affix flags. Single flag can be get by getNextFlagFromString().
*/
static void
-NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
+NIAddSpell(IspellDictBuild *ConfBuild, const char *word, const char *flag)
{
- if (Conf->nspell >= Conf->mspell)
+ if (ConfBuild->nSpell >= ConfBuild->mSpell)
{
- if (Conf->mspell)
+ if (ConfBuild->mSpell)
{
- Conf->mspell *= 2;
- Conf->Spell = (SPELL **) repalloc(Conf->Spell, Conf->mspell * sizeof(SPELL *));
+ ConfBuild->mSpell *= 2;
+ ConfBuild->Spell = (SPELL **) repalloc(ConfBuild->Spell,
+ ConfBuild->mSpell * sizeof(SPELL *));
}
else
{
- Conf->mspell = 1024 * 20;
- Conf->Spell = (SPELL **) tmpalloc(Conf->mspell * sizeof(SPELL *));
+ ConfBuild->mSpell = 1024 * 20;
+ ConfBuild->Spell = (SPELL **) tmpalloc(ConfBuild->mSpell * sizeof(SPELL *));
}
}
- Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
- strcpy(Conf->Spell[Conf->nspell]->word, word);
- Conf->Spell[Conf->nspell]->p.flag = (*flag != '\0')
+ ConfBuild->Spell[ConfBuild->nSpell] =
+ (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
+ strcpy(ConfBuild->Spell[ConfBuild->nSpell]->word, word);
+ ConfBuild->Spell[ConfBuild->nSpell]->p.flag = (*flag != '\0')
? tmpstrdup(flag) : VoidString;
- Conf->nspell++;
+ ConfBuild->nSpell++;
}
/*
@@ -509,11 +716,11 @@ NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
*
* Note caller must already have applied get_tsearch_config_filename.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* filename: path to the .dict file.
*/
void
-NIImportDictionary(IspellDict *Conf, const char *filename)
+NIImportDictionary(IspellDictBuild *ConfBuild, const char *filename)
{
tsearch_readline_state trst;
char *line;
@@ -564,9 +771,9 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
}
s += pg_mblen(s);
}
- pstr = lowerstr_ctx(Conf, line);
+ pstr = lowerstr_ctx(ConfBuild, line);
- NIAddSpell(Conf, pstr, flag);
+ NIAddSpell(ConfBuild, pstr, flag);
pfree(pstr);
pfree(line);
@@ -590,7 +797,7 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
* SFX M 0 's .
* is presented here.
*
- * Conf: current dictionary.
+ * dict: current dictionary.
* word: basic form of word.
* affixflag: affix flag, by which a basic form of word was generated.
* flag: compound flag used to compare with StopMiddle->compoundflag.
@@ -598,9 +805,9 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
* Returns 1 if the word was found in the prefix tree, else returns 0.
*/
static int
-FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
+FindWord(IspellDictData *dict, const char *word, const char *affixflag, int flag)
{
- SPNode *node = Conf->Dictionary;
+ SPNode *node = (SPNode *) DictDictNodes(dict);
SPNodeData *StopLow,
*StopHigh,
*StopMiddle;
@@ -636,10 +843,14 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
* Check if this affix rule is presented in the affix set
* with index StopMiddle->affix.
*/
- if (IsAffixFlagInUse(Conf, StopMiddle->affix, affixflag))
+ if (IsAffixFlagInUse(dict->flagMode,
+ DictAffixDataGet(dict, StopMiddle->affix),
+ affixflag))
return 1;
}
- node = StopMiddle->node;
+ /* Retrieve SPNode by the offset */
+ node = (SPNode *) DictNodeGet(DictDictNodes(dict),
+ StopMiddle->node_offset);
ptr++;
break;
}
@@ -657,7 +868,8 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
/*
* Adds a new affix rule to the Affix field.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary; it is used to
+ * allocate temporary data.
* flag: affix flag ('\' in the below example).
* flagflags: set of flags from the flagval field for this affix rule. This set
* is listed after '/' character in the added string (repl).
@@ -673,26 +885,54 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
* type: FF_SUFFIX or FF_PREFIX.
*/
static void
-NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
- const char *find, const char *repl, int type)
+NIAddAffix(IspellDictBuild *ConfBuild, const char *flag, char flagflags,
+ const char *mask, const char *find, const char *repl, int type)
{
AFFIX *Affix;
+ uint32 size;
+ uint32 flaglen = strlen(flag),
+ findlen = strlen(find),
+ repllen = strlen(repl),
+ masklen = strlen(mask);
+
+ /* Sanity checks */
+ if (flaglen > AF_FLAG_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix flag \"%s\" too long", flag)));
+ if (findlen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix find field \"%s\" too long", find)));
+ if (repllen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix repl field \"%s\" too long", repl)));
+ if (masklen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix mask field \"%s\" too long", mask)));
- if (Conf->naffixes >= Conf->maffixes)
+ if (ConfBuild->nAffix >= ConfBuild->mAffix)
{
- if (Conf->maffixes)
+ if (ConfBuild->mAffix)
{
- Conf->maffixes *= 2;
- Conf->Affix = (AFFIX *) repalloc((void *) Conf->Affix, Conf->maffixes * sizeof(AFFIX));
+ ConfBuild->mAffix *= 2;
+ ConfBuild->Affix = (AFFIX **) repalloc(ConfBuild->Affix,
+ ConfBuild->mAffix * sizeof(AFFIX *));
}
else
{
- Conf->maffixes = 16;
- Conf->Affix = (AFFIX *) palloc(Conf->maffixes * sizeof(AFFIX));
+ ConfBuild->mAffix = 255;
+ ConfBuild->Affix = (AFFIX **) tmpalloc(ConfBuild->mAffix * sizeof(AFFIX *));
}
}
- Affix = Conf->Affix + Conf->naffixes;
+ size = AFFIXHDRSZ + flaglen + 1 /* \0 */ + findlen + 1 /* \0 */ +
+ repllen + 1 /* \0 */ + masklen + 1 /* \0 */;
+
+ Affix = (AFFIX *) tmpalloc(size);
+ ConfBuild->Affix[ConfBuild->nAffix] = Affix;
/* This affix rule can be applied for words with any ending */
if (strcmp(mask, ".") == 0 || *mask == '\0')
@@ -705,42 +945,12 @@ NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
{
Affix->issimple = 0;
Affix->isregis = 1;
- RS_compile(&(Affix->reg.regis), (type == FF_SUFFIX),
- *mask ? mask : VoidString);
}
/* This affix rule will use regex_t to search word ending */
else
{
- int masklen;
- int wmasklen;
- int err;
- pg_wchar *wmask;
- char *tmask;
-
Affix->issimple = 0;
Affix->isregis = 0;
- tmask = (char *) tmpalloc(strlen(mask) + 3);
- if (type == FF_SUFFIX)
- sprintf(tmask, "%s$", mask);
- else
- sprintf(tmask, "^%s", mask);
-
- masklen = strlen(tmask);
- wmask = (pg_wchar *) tmpalloc((masklen + 1) * sizeof(pg_wchar));
- wmasklen = pg_mb2wchar_with_len(tmask, wmask, masklen);
-
- err = pg_regcomp(&(Affix->reg.regex), wmask, wmasklen,
- REG_ADVANCED | REG_NOSUB,
- DEFAULT_COLLATION_OID);
- if (err)
- {
- char errstr[100];
-
- pg_regerror(err, &(Affix->reg.regex), errstr, sizeof(errstr));
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
- errmsg("invalid regular expression: %s", errstr)));
- }
}
Affix->flagflags = flagflags;
@@ -749,15 +959,22 @@ NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
if ((Affix->flagflags & FF_COMPOUNDFLAG) == 0)
Affix->flagflags |= FF_COMPOUNDFLAG;
}
- Affix->flag = cpstrdup(Conf, flag);
+
Affix->type = type;
- Affix->find = (find && *find) ? cpstrdup(Conf, find) : VoidString;
- if ((Affix->replen = strlen(repl)) > 0)
- Affix->repl = cpstrdup(Conf, repl);
- else
- Affix->repl = VoidString;
- Conf->naffixes++;
+ Affix->replen = repllen;
+ StrNCpy(AffixFieldRepl(Affix), repl, repllen + 1);
+
+ Affix->findlen = findlen;
+ StrNCpy(AffixFieldFind(Affix), find, findlen + 1);
+
+ Affix->masklen = masklen;
+ StrNCpy(AffixFieldMask(Affix), mask, masklen + 1);
+
+ StrNCpy(AffixFieldFlag(Affix), flag, flaglen + 1);
+
+ ConfBuild->nAffix++;
+ ConfBuild->AffixSize += size;
}
/* Parsing states for parse_affentry() and friends */
@@ -1021,10 +1238,10 @@ parse_affentry(char *str, char *mask, char *find, char *repl)
* Sets a Hunspell options depending on flag type.
*/
static void
-setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
+setCompoundAffixFlagValue(IspellDictBuild *ConfBuild, CompoundAffixFlag *entry,
char *s, uint32 val)
{
- if (Conf->flagMode == FM_NUM)
+ if (ConfBuild->dict->flagMode == FM_NUM)
{
char *next;
int i;
@@ -1044,19 +1261,19 @@ setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
else
entry->flag.s = tmpstrdup(s);
- entry->flagMode = Conf->flagMode;
+ entry->flagMode = ConfBuild->dict->flagMode;
entry->value = val;
}
/*
* Sets up a correspondence for the affix parameter with the affix flag.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* s: affix flag in string.
* val: affix parameter.
*/
static void
-addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
+addCompoundAffixFlagValue(IspellDictBuild *ConfBuild, char *s, uint32 val)
{
CompoundAffixFlag *newValue;
char sbuf[BUFSIZ];
@@ -1083,29 +1300,29 @@ addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
*sflag = '\0';
/* Resize array or allocate memory for array CompoundAffixFlag */
- if (Conf->nCompoundAffixFlag >= Conf->mCompoundAffixFlag)
+ if (ConfBuild->nCompoundAffixFlag >= ConfBuild->mCompoundAffixFlag)
{
- if (Conf->mCompoundAffixFlag)
+ if (ConfBuild->mCompoundAffixFlag)
{
- Conf->mCompoundAffixFlag *= 2;
- Conf->CompoundAffixFlags = (CompoundAffixFlag *)
- repalloc((void *) Conf->CompoundAffixFlags,
- Conf->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
+ ConfBuild->mCompoundAffixFlag *= 2;
+ ConfBuild->CompoundAffixFlags = (CompoundAffixFlag *)
+ repalloc((void *) ConfBuild->CompoundAffixFlags,
+ ConfBuild->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
}
else
{
- Conf->mCompoundAffixFlag = 10;
- Conf->CompoundAffixFlags = (CompoundAffixFlag *)
- tmpalloc(Conf->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
+ ConfBuild->mCompoundAffixFlag = 10;
+ ConfBuild->CompoundAffixFlags = (CompoundAffixFlag *)
+ tmpalloc(ConfBuild->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
}
}
- newValue = Conf->CompoundAffixFlags + Conf->nCompoundAffixFlag;
+ newValue = ConfBuild->CompoundAffixFlags + ConfBuild->nCompoundAffixFlag;
- setCompoundAffixFlagValue(Conf, newValue, sbuf, val);
+ setCompoundAffixFlagValue(ConfBuild, newValue, sbuf, val);
- Conf->usecompound = true;
- Conf->nCompoundAffixFlag++;
+ ConfBuild->dict->usecompound = true;
+ ConfBuild->nCompoundAffixFlag++;
}
/*
@@ -1113,7 +1330,7 @@ addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
* flags s.
*/
static int
-getCompoundAffixFlagValue(IspellDict *Conf, char *s)
+getCompoundAffixFlagValue(IspellDictBuild *ConfBuild, char *s)
{
uint32 flag = 0;
CompoundAffixFlag *found,
@@ -1121,18 +1338,18 @@ getCompoundAffixFlagValue(IspellDict *Conf, char *s)
char sflag[BUFSIZ];
char *flagcur;
- if (Conf->nCompoundAffixFlag == 0)
+ if (ConfBuild->nCompoundAffixFlag == 0)
return 0;
flagcur = s;
while (*flagcur)
{
- getNextFlagFromString(Conf, &flagcur, sflag);
- setCompoundAffixFlagValue(Conf, &key, sflag, 0);
+ getNextFlagFromString(ConfBuild->dict->flagMode, &flagcur, sflag);
+ setCompoundAffixFlagValue(ConfBuild, &key, sflag, 0);
found = (CompoundAffixFlag *)
- bsearch(&key, (void *) Conf->CompoundAffixFlags,
- Conf->nCompoundAffixFlag, sizeof(CompoundAffixFlag),
+ bsearch(&key, (void *) ConfBuild->CompoundAffixFlags,
+ ConfBuild->nCompoundAffixFlag, sizeof(CompoundAffixFlag),
cmpcmdflag);
if (found != NULL)
flag |= found->value;
@@ -1144,14 +1361,13 @@ getCompoundAffixFlagValue(IspellDict *Conf, char *s)
/*
* Returns a flag set using the s parameter.
*
- * If Conf->useFlagAliases is true then the s parameter is index of the
- * Conf->AffixData array and function returns its entry.
- * Else function returns the s parameter.
+ * If useFlagAliases is true, the s parameter is an index into the AffixData
+ * array and the function returns that entry. Otherwise it returns s itself.
*/
static char *
-getAffixFlagSet(IspellDict *Conf, char *s)
+getAffixFlagSet(IspellDictBuild *ConfBuild, char *s)
{
- if (Conf->useFlagAliases && *s != '\0')
+ if (ConfBuild->dict->useFlagAliases && *s != '\0')
{
int curaffix;
char *end;
@@ -1162,13 +1378,13 @@ getAffixFlagSet(IspellDict *Conf, char *s)
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix alias \"%s\"", s)));
- if (curaffix > 0 && curaffix <= Conf->nAffixData)
+ if (curaffix > 0 && curaffix <= ConfBuild->nAffixData)
/*
* Do not subtract 1 from curaffix because empty string was added
* in NIImportOOAffixes
*/
- return Conf->AffixData[curaffix];
+ return AffixDataGet(ConfBuild, curaffix);
else
return VoidString;
}
@@ -1179,11 +1395,11 @@ getAffixFlagSet(IspellDict *Conf, char *s)
/*
* Import an affix file that follows MySpell or Hunspell format.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* filename: path to the .affix file.
*/
static void
-NIImportOOAffixes(IspellDict *Conf, const char *filename)
+NIImportOOAffixes(IspellDictBuild *ConfBuild, const char *filename)
{
char type[BUFSIZ],
*ptype = NULL;
@@ -1195,17 +1411,16 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
char repl[BUFSIZ],
*prepl;
bool isSuffix = false;
- int naffix = 0,
- curaffix = 0;
+ int naffix = 0;
int sflaglen = 0;
char flagflags = 0;
tsearch_readline_state trst;
char *recoded;
/* read file to find any flag */
- Conf->usecompound = false;
- Conf->useFlagAliases = false;
- Conf->flagMode = FM_CHAR;
+ ConfBuild->dict->usecompound = false;
+ ConfBuild->dict->useFlagAliases = false;
+ ConfBuild->dict->flagMode = FM_CHAR;
if (!tsearch_readline_begin(&trst, filename))
ereport(ERROR,
@@ -1222,30 +1437,36 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
}
if (STRNCMP(recoded, "COMPOUNDFLAG") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDFLAG"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDFLAG"),
FF_COMPOUNDFLAG);
else if (STRNCMP(recoded, "COMPOUNDBEGIN") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDBEGIN"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDBEGIN"),
FF_COMPOUNDBEGIN);
else if (STRNCMP(recoded, "COMPOUNDLAST") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDLAST"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDLAST"),
FF_COMPOUNDLAST);
/* COMPOUNDLAST and COMPOUNDEND are synonyms */
else if (STRNCMP(recoded, "COMPOUNDEND") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDEND"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDEND"),
FF_COMPOUNDLAST);
else if (STRNCMP(recoded, "COMPOUNDMIDDLE") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDMIDDLE"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDMIDDLE"),
FF_COMPOUNDMIDDLE);
else if (STRNCMP(recoded, "ONLYINCOMPOUND") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("ONLYINCOMPOUND"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("ONLYINCOMPOUND"),
FF_COMPOUNDONLY);
else if (STRNCMP(recoded, "COMPOUNDPERMITFLAG") == 0)
- addCompoundAffixFlagValue(Conf,
+ addCompoundAffixFlagValue(ConfBuild,
recoded + strlen("COMPOUNDPERMITFLAG"),
FF_COMPOUNDPERMITFLAG);
else if (STRNCMP(recoded, "COMPOUNDFORBIDFLAG") == 0)
- addCompoundAffixFlagValue(Conf,
+ addCompoundAffixFlagValue(ConfBuild,
recoded + strlen("COMPOUNDFORBIDFLAG"),
FF_COMPOUNDFORBIDFLAG);
else if (STRNCMP(recoded, "FLAG") == 0)
@@ -1258,9 +1479,9 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
if (*s)
{
if (STRNCMP(s, "long") == 0)
- Conf->flagMode = FM_LONG;
+ ConfBuild->dict->flagMode = FM_LONG;
else if (STRNCMP(s, "num") == 0)
- Conf->flagMode = FM_NUM;
+ ConfBuild->dict->flagMode = FM_NUM;
else if (STRNCMP(s, "default") != 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
@@ -1274,8 +1495,8 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
}
tsearch_readline_end(&trst);
- if (Conf->nCompoundAffixFlag > 1)
- qsort((void *) Conf->CompoundAffixFlags, Conf->nCompoundAffixFlag,
+ if (ConfBuild->nCompoundAffixFlag > 1)
+ qsort((void *) ConfBuild->CompoundAffixFlags, ConfBuild->nCompoundAffixFlag,
sizeof(CompoundAffixFlag), cmpcmdflag);
if (!tsearch_readline_begin(&trst, filename))
@@ -1295,15 +1516,15 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
if (ptype)
pfree(ptype);
- ptype = lowerstr_ctx(Conf, type);
+ ptype = lowerstr_ctx(ConfBuild, type);
/* First try to parse AF parameter (alias compression) */
if (STRNCMP(ptype, "af") == 0)
{
/* First line is the number of aliases */
- if (!Conf->useFlagAliases)
+ if (!ConfBuild->dict->useFlagAliases)
{
- Conf->useFlagAliases = true;
+ ConfBuild->dict->useFlagAliases = true;
naffix = atoi(sflag);
if (naffix == 0)
ereport(ERROR,
@@ -1313,21 +1534,15 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
/* Also reserve place for empty flag set */
naffix++;
- Conf->AffixData = (char **) palloc0(naffix * sizeof(char *));
- Conf->lenAffixData = Conf->nAffixData = naffix;
+ InitAffixData(ConfBuild, naffix);
/* Add empty flag set into AffixData */
- Conf->AffixData[curaffix] = VoidString;
- curaffix++;
+ AddAffixSet(ConfBuild, VoidString, 0);
}
/* Other lines is aliases */
else
{
- if (curaffix < naffix)
- {
- Conf->AffixData[curaffix] = cpstrdup(Conf, sflag);
- curaffix++;
- }
+ AddAffixSet(ConfBuild, sflag, strlen(sflag));
}
goto nextline;
}
@@ -1338,8 +1553,8 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
sflaglen = strlen(sflag);
if (sflaglen == 0
- || (sflaglen > 1 && Conf->flagMode == FM_CHAR)
- || (sflaglen > 2 && Conf->flagMode == FM_LONG))
+ || (sflaglen > 1 && ConfBuild->dict->flagMode == FM_CHAR)
+ || (sflaglen > 2 && ConfBuild->dict->flagMode == FM_LONG))
goto nextline;
/*--------
@@ -1367,21 +1582,21 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
/* Get flags after '/' (flags are case sensitive) */
if ((ptr = strchr(repl, '/')) != NULL)
- aflg |= getCompoundAffixFlagValue(Conf,
- getAffixFlagSet(Conf,
+ aflg |= getCompoundAffixFlagValue(ConfBuild,
+ getAffixFlagSet(ConfBuild,
ptr + 1));
/* Get lowercased version of string before '/' */
- prepl = lowerstr_ctx(Conf, repl);
+ prepl = lowerstr_ctx(ConfBuild, repl);
if ((ptr = strchr(prepl, '/')) != NULL)
*ptr = '\0';
- pfind = lowerstr_ctx(Conf, find);
- pmask = lowerstr_ctx(Conf, mask);
+ pfind = lowerstr_ctx(ConfBuild, find);
+ pmask = lowerstr_ctx(ConfBuild, mask);
if (t_iseq(find, '0'))
*pfind = '\0';
if (t_iseq(repl, '0'))
*prepl = '\0';
- NIAddAffix(Conf, sflag, flagflags | aflg, pmask, pfind, prepl,
+ NIAddAffix(ConfBuild, sflag, flagflags | aflg, pmask, pfind, prepl,
isSuffix ? FF_SUFFIX : FF_PREFIX);
pfree(prepl);
pfree(pfind);
@@ -1407,7 +1622,7 @@ nextline:
* work to NIImportOOAffixes(), which will re-read the whole file.
*/
void
-NIImportAffixes(IspellDict *Conf, const char *filename)
+NIImportAffixes(IspellDictBuild *ConfBuild, const char *filename)
{
char *pstr = NULL;
char flag[BUFSIZ];
@@ -1428,9 +1643,9 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
errmsg("could not open affix file \"%s\": %m",
filename)));
- Conf->usecompound = false;
- Conf->useFlagAliases = false;
- Conf->flagMode = FM_CHAR;
+ ConfBuild->dict->usecompound = false;
+ ConfBuild->dict->useFlagAliases = false;
+ ConfBuild->dict->flagMode = FM_CHAR;
while ((recoded = tsearch_readline(&trst)) != NULL)
{
@@ -1452,10 +1667,8 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
s += pg_mblen(s);
if (*s && pg_mblen(s) == 1)
- {
- addCompoundAffixFlagValue(Conf, s, FF_COMPOUNDFLAG);
- Conf->usecompound = true;
- }
+ addCompoundAffixFlagValue(ConfBuild, s, FF_COMPOUNDFLAG);
+
oldformat = true;
goto nextline;
}
@@ -1528,7 +1741,8 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
if (!parse_affentry(pstr, mask, find, repl))
goto nextline;
- NIAddAffix(Conf, flag, flagflags, mask, find, repl, suffixes ? FF_SUFFIX : FF_PREFIX);
+ NIAddAffix(ConfBuild, flag, flagflags, mask, find, repl,
+ suffixes ? FF_SUFFIX : FF_PREFIX);
nextline:
pfree(recoded);
@@ -1547,53 +1761,48 @@ isnewformat:
errmsg("affix file contains both old-style and new-style commands")));
tsearch_readline_end(&trst);
- NIImportOOAffixes(Conf, filename);
+ NIImportOOAffixes(ConfBuild, filename);
}
/*
* Merges two affix flag sets and stores a new affix flag set into
- * Conf->AffixData.
+ * ConfBuild->AffixData.
*
* Returns index of a new affix flag set.
*/
static int
-MergeAffix(IspellDict *Conf, int a1, int a2)
+MergeAffix(IspellDictBuild *ConfBuild, int a1, int a2)
{
- char **ptr;
+ char *ptr;
+ uint32 len;
/* Do not merge affix flags if one of affix flags is empty */
- if (*Conf->AffixData[a1] == '\0')
+ if (*AffixDataGet(ConfBuild, a1) == '\0')
return a2;
- else if (*Conf->AffixData[a2] == '\0')
+ else if (*AffixDataGet(ConfBuild, a2) == '\0')
return a1;
- while (Conf->nAffixData + 1 >= Conf->lenAffixData)
- {
- Conf->lenAffixData *= 2;
- Conf->AffixData = (char **) repalloc(Conf->AffixData,
- sizeof(char *) * Conf->lenAffixData);
- }
-
- ptr = Conf->AffixData + Conf->nAffixData;
- if (Conf->flagMode == FM_NUM)
+ if (ConfBuild->dict->flagMode == FM_NUM)
{
- *ptr = cpalloc(strlen(Conf->AffixData[a1]) +
- strlen(Conf->AffixData[a2]) +
- 1 /* comma */ + 1 /* \0 */ );
- sprintf(*ptr, "%s,%s", Conf->AffixData[a1], Conf->AffixData[a2]);
+ len = strlen(AffixDataGet(ConfBuild, a1)) + 1 /* comma */ +
+ strlen(AffixDataGet(ConfBuild, a2));
+ ptr = tmpalloc(len + 1 /* \0 */);
+ sprintf(ptr, "%s,%s", AffixDataGet(ConfBuild, a1),
+ AffixDataGet(ConfBuild, a2));
}
else
{
- *ptr = cpalloc(strlen(Conf->AffixData[a1]) +
- strlen(Conf->AffixData[a2]) +
- 1 /* \0 */ );
- sprintf(*ptr, "%s%s", Conf->AffixData[a1], Conf->AffixData[a2]);
+ len = strlen(AffixDataGet(ConfBuild, a1)) +
+ strlen(AffixDataGet(ConfBuild, a2));
+ ptr = tmpalloc(len + 1 /* \0 */ );
+ sprintf(ptr, "%s%s", AffixDataGet(ConfBuild, a1),
+ AffixDataGet(ConfBuild, a2));
}
- ptr++;
- *ptr = NULL;
- Conf->nAffixData++;
- return Conf->nAffixData - 1;
+ AddAffixSet(ConfBuild, ptr, len);
+ pfree(ptr);
+
+ return ConfBuild->nAffixData - 1;
}
/*
@@ -1601,66 +1810,87 @@ MergeAffix(IspellDict *Conf, int a1, int a2)
* flags with the given index.
*/
static uint32
-makeCompoundFlags(IspellDict *Conf, int affix)
+makeCompoundFlags(IspellDictBuild *ConfBuild, int affix)
{
- char *str = Conf->AffixData[affix];
+ char *str = AffixDataGet(ConfBuild, affix);
- return (getCompoundAffixFlagValue(Conf, str) & FF_COMPOUNDFLAGMASK);
+ return (getCompoundAffixFlagValue(ConfBuild, str) & FF_COMPOUNDFLAGMASK);
}
/*
* Makes a prefix tree for the given level.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* low: lower index of the Conf->Spell array.
* high: upper index of the Conf->Spell array.
* level: current prefix tree level.
+ *
+ * Returns the SPNode's offset in DictNodes, or ISPELL_INVALID_OFFSET if none.
*/
-static SPNode *
-mkSPNode(IspellDict *Conf, int low, int high, int level)
+static uint32
+mkSPNode(IspellDictBuild *ConfBuild, int low, int high, int level)
{
int i;
int nchar = 0;
char lastchar = '\0';
+ uint32 rs_offset,
+ new_offset;
SPNode *rs;
SPNodeData *data;
+ int data_index = 0;
int lownew = low;
for (i = low; i < high; i++)
- if (Conf->Spell[i]->p.d.len > level && lastchar != Conf->Spell[i]->word[level])
+ if (ConfBuild->Spell[i]->p.d.len > level &&
+ lastchar != ConfBuild->Spell[i]->word[level])
{
nchar++;
- lastchar = Conf->Spell[i]->word[level];
+ lastchar = ConfBuild->Spell[i]->word[level];
}
if (!nchar)
- return NULL;
+ return ISPELL_INVALID_OFFSET;
- rs = (SPNode *) cpalloc0(SPNHDRSZ + nchar * sizeof(SPNodeData));
- rs->length = nchar;
+ rs_offset = AllocateSPNode(ConfBuild, nchar);
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
data = rs->data;
lastchar = '\0';
for (i = low; i < high; i++)
- if (Conf->Spell[i]->p.d.len > level)
+ if (ConfBuild->Spell[i]->p.d.len > level)
{
- if (lastchar != Conf->Spell[i]->word[level])
+ if (lastchar != ConfBuild->Spell[i]->word[level])
{
if (lastchar)
{
/* Next level of the prefix tree */
- data->node = mkSPNode(Conf, lownew, i, level + 1);
+ new_offset = mkSPNode(ConfBuild, lownew, i, level + 1);
+
+ /*
+ * ConfBuild->DictNodes can be repalloc'ed within
+ * mkSPNode(), so reinitialize pointers.
+ */
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
+
+ /* First save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ /* Work with next node */
+ data_index++;
+ Assert(data_index < nchar);
+ data = &(rs->data[data_index]);
+
lownew = i;
- data++;
}
- lastchar = Conf->Spell[i]->word[level];
+ lastchar = ConfBuild->Spell[i]->word[level];
}
- data->val = ((uint8 *) (Conf->Spell[i]->word))[level];
- if (Conf->Spell[i]->p.d.len == level + 1)
+ data->val = ((uint8 *) (ConfBuild->Spell[i]->word))[level];
+ if (ConfBuild->Spell[i]->p.d.len == level + 1)
{
bool clearCompoundOnly = false;
- if (data->isword && data->affix != Conf->Spell[i]->p.d.affix)
+ if (data->isword && data->affix != ConfBuild->Spell[i]->p.d.affix)
{
/*
* MergeAffix called a few times. If one of word is
@@ -1669,15 +1899,17 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
*/
clearCompoundOnly = (FF_COMPOUNDONLY & data->compoundflag
- & makeCompoundFlags(Conf, Conf->Spell[i]->p.d.affix))
+ & makeCompoundFlags(ConfBuild,
+ ConfBuild->Spell[i]->p.d.affix))
? false : true;
- data->affix = MergeAffix(Conf, data->affix, Conf->Spell[i]->p.d.affix);
+ data->affix = MergeAffix(ConfBuild, data->affix,
+ ConfBuild->Spell[i]->p.d.affix);
}
else
- data->affix = Conf->Spell[i]->p.d.affix;
+ data->affix = ConfBuild->Spell[i]->p.d.affix;
data->isword = 1;
- data->compoundflag = makeCompoundFlags(Conf, data->affix);
+ data->compoundflag = makeCompoundFlags(ConfBuild, data->affix);
if ((data->compoundflag & FF_COMPOUNDONLY) &&
(data->compoundflag & FF_COMPOUNDFLAG) == 0)
@@ -1689,9 +1921,19 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
}
/* Next level of the prefix tree */
- data->node = mkSPNode(Conf, lownew, high, level + 1);
+ new_offset = mkSPNode(ConfBuild, lownew, high, level + 1);
+
+ /*
+ * ConfBuild->DictNodes can be repalloc'ed within mkSPNode(), so
+ * reinitialize pointers.
+ */
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
- return rs;
+ /* Save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ return rs_offset;
}
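The reworked mkSPNode() above returns an offset rather than a pointer and re-reads `rs` after every recursive call, because the node buffer can move when it grows. A minimal standalone sketch of that pattern follows. It is not the patch's code: `Arena`, `Node`, `arena_alloc_node()`, `arena_get()` and `build_chain()` are hypothetical names, and plain realloc() stands in for repalloc().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define INVALID_OFFSET UINT32_MAX	/* plays the role of ISPELL_INVALID_OFFSET */

/* Hypothetical stand-in for NodeArray: one growable byte buffer. */
typedef struct
{
	char	   *buf;
	uint32_t	size;			/* allocated bytes */
	uint32_t	end;			/* used bytes */
} Arena;

/* A node that refers to its child by offset instead of by pointer. */
typedef struct
{
	int			value;
	uint32_t	child_offset;
} Node;

/* Reserve room for one node; this may move the whole buffer. */
static uint32_t
arena_alloc_node(Arena *a)
{
	uint32_t	off = a->end;

	if (a->end + sizeof(Node) > a->size)
	{
		a->size = a->size ? a->size * 2 : (uint32_t) sizeof(Node);
		a->buf = realloc(a->buf, a->size);
		assert(a->buf != NULL);
	}
	a->end += (uint32_t) sizeof(Node);
	return off;
}

/* Translate an offset into a pointer that is valid until the next alloc. */
static Node *
arena_get(Arena *a, uint32_t off)
{
	return (off == INVALID_OFFSET) ? NULL : (Node *) (a->buf + off);
}

/* Build a chain of `depth` nodes; returns the offset of the first node. */
static uint32_t
build_chain(Arena *a, int depth)
{
	uint32_t	self,
				child;
	Node	   *n;

	if (depth == 0)
		return INVALID_OFFSET;

	self = arena_alloc_node(a);
	n = arena_get(a, self);
	n->value = depth;

	child = build_chain(a, depth - 1);	/* may realloc a->buf */

	n = arena_get(a, self);		/* refresh the possibly stale pointer */
	n->child_offset = child;
	return self;
}
```

Starting from a zeroed `Arena`, build_chain(&a, 5) returns offset 0, and the chain can be walked by calling arena_get() on each `child_offset` until it returns NULL. Because only offsets are stored, the finished buffer can be copied to another address unchanged.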
/*
@@ -1699,7 +1941,7 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
* and affixes.
*/
void
-NISortDictionary(IspellDict *Conf)
+NISortDictionary(IspellDictBuild *ConfBuild)
{
int i;
int naffix = 0;
@@ -1708,81 +1950,81 @@ NISortDictionary(IspellDict *Conf)
/* compress affixes */
/*
- * If we use flag aliases then we need to use Conf->AffixData filled in
+ * If we use flag aliases then we need to use ConfBuild->AffixData filled in
* the NIImportOOAffixes().
*/
- if (Conf->useFlagAliases)
+ if (ConfBuild->dict->useFlagAliases)
{
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
char *end;
- if (*Conf->Spell[i]->p.flag != '\0')
+ if (*ConfBuild->Spell[i]->p.flag != '\0')
{
- curaffix = strtol(Conf->Spell[i]->p.flag, &end, 10);
- if (Conf->Spell[i]->p.flag == end || errno == ERANGE)
+ curaffix = strtol(ConfBuild->Spell[i]->p.flag, &end, 10);
+ if (ConfBuild->Spell[i]->p.flag == end || errno == ERANGE)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix alias \"%s\"",
- Conf->Spell[i]->p.flag)));
+ ConfBuild->Spell[i]->p.flag)));
}
else
{
/*
- * If Conf->Spell[i]->p.flag is empty, then get empty value of
- * Conf->AffixData (0 index).
+ * If ConfBuild->Spell[i]->p.flag is empty, then get empty
+ * value of ConfBuild->AffixData (0 index).
*/
curaffix = 0;
}
- Conf->Spell[i]->p.d.affix = curaffix;
- Conf->Spell[i]->p.d.len = strlen(Conf->Spell[i]->word);
+ ConfBuild->Spell[i]->p.d.affix = curaffix;
+ ConfBuild->Spell[i]->p.d.len = strlen(ConfBuild->Spell[i]->word);
}
}
- /* Otherwise fill Conf->AffixData here */
+ /* Otherwise fill ConfBuild->AffixData here */
else
{
/* Count the number of different flags used in the dictionary */
- qsort((void *) Conf->Spell, Conf->nspell, sizeof(SPELL *),
+ qsort((void *) ConfBuild->Spell, ConfBuild->nSpell, sizeof(SPELL *),
cmpspellaffix);
naffix = 0;
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
if (i == 0
- || strcmp(Conf->Spell[i]->p.flag, Conf->Spell[i - 1]->p.flag))
+ || strcmp(ConfBuild->Spell[i]->p.flag,
+ ConfBuild->Spell[i - 1]->p.flag))
naffix++;
}
/*
- * Fill in Conf->AffixData with the affixes that were used in the
- * dictionary. Replace textual flag-field of Conf->Spell entries with
- * indexes into Conf->AffixData array.
+ * Fill in AffixData with the affixes that were used in the
+ * dictionary. Replace textual flag-field of ConfBuild->Spell entries
+ * with indexes into ConfBuild->AffixData array.
*/
- Conf->AffixData = (char **) palloc0(naffix * sizeof(char *));
+ InitAffixData(ConfBuild, naffix);
curaffix = -1;
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
if (i == 0
- || strcmp(Conf->Spell[i]->p.flag, Conf->AffixData[curaffix]))
+ || strcmp(ConfBuild->Spell[i]->p.flag,
+ AffixDataGet(ConfBuild, curaffix)))
{
curaffix++;
Assert(curaffix < naffix);
- Conf->AffixData[curaffix] = cpstrdup(Conf,
- Conf->Spell[i]->p.flag);
+ AddAffixSet(ConfBuild, ConfBuild->Spell[i]->p.flag,
+ strlen(ConfBuild->Spell[i]->p.flag));
}
- Conf->Spell[i]->p.d.affix = curaffix;
- Conf->Spell[i]->p.d.len = strlen(Conf->Spell[i]->word);
+ ConfBuild->Spell[i]->p.d.affix = curaffix;
+ ConfBuild->Spell[i]->p.d.len = strlen(ConfBuild->Spell[i]->word);
}
-
- Conf->lenAffixData = Conf->nAffixData = naffix;
}
/* Start build a prefix tree */
- qsort((void *) Conf->Spell, Conf->nspell, sizeof(SPELL *), cmpspell);
- Conf->Dictionary = mkSPNode(Conf, 0, Conf->nspell, 0);
+ qsort((void *) ConfBuild->Spell, ConfBuild->nSpell, sizeof(SPELL *), cmpspell);
+ mkSPNode(ConfBuild, 0, ConfBuild->nSpell, 0);
}
/*
@@ -1790,83 +2032,104 @@ NISortDictionary(IspellDict *Conf)
* rule. Affixes with empty replace string do not include in the prefix tree.
* This affixes are included by mkVoidAffix().
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* low: lower index of the Conf->Affix array.
* high: upper index of the Conf->Affix array.
* level: current prefix tree level.
* type: FF_SUFFIX or FF_PREFIX.
+ *
+ * Returns an offset in nodes array.
*/
-static AffixNode *
-mkANode(IspellDict *Conf, int low, int high, int level, int type)
+static uint32
+mkANode(IspellDictBuild *ConfBuild, int low, int high, int level, int type)
{
int i;
int nchar = 0;
uint8 lastchar = '\0';
+ NodeArray *array;
+ uint32 rs_offset,
+ new_offset;
AffixNode *rs;
AffixNodeData *data;
+ int data_index = 0;
int lownew = low;
- int naff;
- AFFIX **aff;
for (i = low; i < high; i++)
- if (Conf->Affix[i].replen > level && lastchar != GETCHAR(Conf->Affix + i, level, type))
+ if (ConfBuild->Affix[i]->replen > level &&
+ lastchar != GETCHAR(ConfBuild->Affix[i], level, type))
{
nchar++;
- lastchar = GETCHAR(Conf->Affix + i, level, type);
+ lastchar = GETCHAR(ConfBuild->Affix[i], level, type);
}
if (!nchar)
- return NULL;
+ return ISPELL_INVALID_OFFSET;
- aff = (AFFIX **) tmpalloc(sizeof(AFFIX *) * (high - low + 1));
- naff = 0;
+ if (type == FF_SUFFIX)
+ array = &ConfBuild->SuffixNodes;
+ else
+ array = &ConfBuild->PrefixNodes;
- rs = (AffixNode *) cpalloc0(ANHRDSZ + nchar * sizeof(AffixNodeData));
- rs->length = nchar;
- data = rs->data;
+ rs_offset = AllocateAffixNode(ConfBuild, array, nchar);
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
+ data = (AffixNodeData *) rs->data;
lastchar = '\0';
for (i = low; i < high; i++)
- if (Conf->Affix[i].replen > level)
+ if (ConfBuild->Affix[i]->replen > level)
{
- if (lastchar != GETCHAR(Conf->Affix + i, level, type))
+ if (lastchar != GETCHAR(ConfBuild->Affix[i], level, type))
{
if (lastchar)
{
/* Next level of the prefix tree */
- data->node = mkANode(Conf, lownew, i, level + 1, type);
- if (naff)
- {
- data->naff = naff;
- data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * naff);
- memcpy(data->aff, aff, sizeof(AFFIX *) * naff);
- naff = 0;
- }
- data++;
+ new_offset = mkANode(ConfBuild, lownew, i, level + 1, type);
+
+ /*
+ * array can be repalloc'ed within mkANode(), so
+ * reinitialize pointers.
+ */
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
+
+ /* First save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ /* Handle next data node */
+ data_index++;
+ Assert(data_index < nchar);
+ data = &(rs->data[data_index]);
+
lownew = i;
}
- lastchar = GETCHAR(Conf->Affix + i, level, type);
+ lastchar = GETCHAR(ConfBuild->Affix[i], level, type);
}
- data->val = GETCHAR(Conf->Affix + i, level, type);
- if (Conf->Affix[i].replen == level + 1)
+ data->val = GETCHAR(ConfBuild->Affix[i], level, type);
+ if (ConfBuild->Affix[i]->replen == level + 1)
{ /* affix stopped */
- aff[naff++] = Conf->Affix + i;
+ if (data->affstart == ISPELL_INVALID_INDEX)
+ {
+ data->affstart = i;
+ data->affend = i;
+ }
+ else
+ data->affend = i;
}
}
/* Next level of the prefix tree */
- data->node = mkANode(Conf, lownew, high, level + 1, type);
- if (naff)
- {
- data->naff = naff;
- data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * naff);
- memcpy(data->aff, aff, sizeof(AFFIX *) * naff);
- naff = 0;
- }
+ new_offset = mkANode(ConfBuild, lownew, high, level + 1, type);
+
+ /*
+ * array can be repalloc'ed within mkANode(), so reinitialize pointers.
+ */
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
- pfree(aff);
+ /* Save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
- return rs;
+ return rs_offset;
}
/*
@@ -1874,137 +2137,154 @@ mkANode(IspellDict *Conf, int low, int high, int level, int type)
* for affixes which have empty replace string ("repl" field).
*/
static void
-mkVoidAffix(IspellDict *Conf, bool issuffix, int startsuffix)
+mkVoidAffix(IspellDictBuild *ConfBuild, bool issuffix, int startsuffix)
{
- int i,
- cnt = 0;
+ int i;
int start = (issuffix) ? startsuffix : 0;
- int end = (issuffix) ? Conf->naffixes : startsuffix;
- AffixNode *Affix = (AffixNode *) palloc0(ANHRDSZ + sizeof(AffixNodeData));
-
- Affix->length = 1;
- Affix->isvoid = 1;
+ int end = (issuffix) ? ConfBuild->nAffix : startsuffix;
+ uint32 node_offset;
+ NodeArray *array;
+ AffixNode *Affix;
+ AffixNodeData *AffixData;
if (issuffix)
- {
- Affix->data->node = Conf->Suffix;
- Conf->Suffix = Affix;
- }
+ array = &ConfBuild->SuffixNodes;
else
- {
- Affix->data->node = Conf->Prefix;
- Conf->Prefix = Affix;
- }
+ array = &ConfBuild->PrefixNodes;
- /* Count affixes with empty replace string */
- for (i = start; i < end; i++)
- if (Conf->Affix[i].replen == 0)
- cnt++;
-
- /* There is not affixes with empty replace string */
- if (cnt == 0)
- return;
+ node_offset = AllocateAffixNode(ConfBuild, array, 1);
+ Affix = (AffixNode *) NodeArrayGet(array, node_offset);
- Affix->data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * cnt);
- Affix->data->naff = (uint32) cnt;
+ Affix->isvoid = 1;
+ AffixData = (AffixNodeData *) Affix->data;
- cnt = 0;
for (i = start; i < end; i++)
- if (Conf->Affix[i].replen == 0)
+ if (ConfBuild->Affix[i]->replen == 0)
{
- Affix->data->aff[cnt] = Conf->Affix + i;
- cnt++;
+ if (AffixData->affstart == ISPELL_INVALID_INDEX)
+ {
+ AffixData->affstart = i;
+ AffixData->affend = i;
+ }
+ else
+ AffixData->affend = i;
}
}
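Both mkANode() and mkVoidAffix() now record an inclusive [affstart, affend] index range instead of a separately allocated array of AFFIX pointers. That only works if the matching affixes are adjacent in the sorted Affix array. The sketch below illustrates the idea in isolation; `IndexRange`, `range_init()`, `range_add()` and `range_count()` are hypothetical helpers, and the assertion on adjacency is an addition of this sketch, not something the patch checks.

```c
#include <assert.h>
#include <stdint.h>

#define INVALID_INDEX UINT32_MAX	/* hypothetical "empty range" sentinel */

/* Inclusive range of indexes into some sorted array. */
typedef struct
{
	uint32_t	start;
	uint32_t	end;
} IndexRange;

static void
range_init(IndexRange *r)
{
	r->start = r->end = INVALID_INDEX;
}

/* Add index i to the range; indexes must arrive in order and without gaps. */
static void
range_add(IndexRange *r, uint32_t i)
{
	if (r->start == INVALID_INDEX)
		r->start = i;
	else
		assert(i == r->end + 1);	/* a gap would silently include strangers */
	r->end = i;
}

/* Number of entries covered by the range. */
static uint32_t
range_count(const IndexRange *r)
{
	return (r->start == INVALID_INDEX) ? 0 : r->end - r->start + 1;
}
```

A reader then iterates `for (j = r.start; j <= r.end; j++)` after checking that `start` is not the sentinel, which matches the loops in NormalizeSubWord() later in the patch.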
/*
- * Checks if the affixflag is used by dictionary. Conf->AffixData does not
+ * Checks if the affixflag is used by dictionary. AffixData does not
* contain affixflag if this flag is not used actually by the .dict file.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* affixflag: affix flag.
*
- * Returns true if the Conf->AffixData array contains affixflag, otherwise
+ * Returns true if the ConfBuild->AffixData array contains affixflag, otherwise
* returns false.
*/
static bool
-isAffixInUse(IspellDict *Conf, char *affixflag)
+isAffixInUse(IspellDictBuild *ConfBuild, char *affixflag)
{
int i;
- for (i = 0; i < Conf->nAffixData; i++)
- if (IsAffixFlagInUse(Conf, i, affixflag))
+ for (i = 0; i < ConfBuild->nAffixData; i++)
+ if (IsAffixFlagInUse(ConfBuild->dict->flagMode,
+ AffixDataGet(ConfBuild, i), affixflag))
return true;
return false;
}
/*
- * Builds Conf->Prefix and Conf->Suffix trees from the imported affixes.
+ * Builds Prefix and Suffix trees from the imported affixes.
*/
void
-NISortAffixes(IspellDict *Conf)
+NISortAffixes(IspellDictBuild *ConfBuild)
{
AFFIX *Affix;
+ AffixNode *voidPrefix,
+ *voidSuffix;
size_t i;
CMPDAffix *ptr;
- int firstsuffix = Conf->naffixes;
+ int firstsuffix = ConfBuild->nAffix;
+ uint32 prefix_offset,
+ suffix_offset;
- if (Conf->naffixes == 0)
+ if (ConfBuild->nAffix == 0)
return;
/* Store compound affixes in the Conf->CompoundAffix array */
- if (Conf->naffixes > 1)
- qsort((void *) Conf->Affix, Conf->naffixes, sizeof(AFFIX), cmpaffix);
- Conf->CompoundAffix = ptr = (CMPDAffix *) palloc(sizeof(CMPDAffix) * Conf->naffixes);
- ptr->affix = NULL;
-
- for (i = 0; i < Conf->naffixes; i++)
+ if (ConfBuild->nAffix > 1)
+ qsort((void *) ConfBuild->Affix, ConfBuild->nAffix,
+ sizeof(AFFIX *), cmpaffix);
+ ConfBuild->nCompoundAffix = ConfBuild->nAffix;
+ ConfBuild->CompoundAffix = ptr =
+ (CMPDAffix *) tmpalloc(sizeof(CMPDAffix) * ConfBuild->nCompoundAffix);
+ ptr->affix = ISPELL_INVALID_INDEX;
+
+ for (i = 0; i < ConfBuild->nAffix; i++)
{
- Affix = &(((AFFIX *) Conf->Affix)[i]);
+ Affix = ConfBuild->Affix[i];
if (Affix->type == FF_SUFFIX && i < firstsuffix)
firstsuffix = i;
if ((Affix->flagflags & FF_COMPOUNDFLAG) && Affix->replen > 0 &&
- isAffixInUse(Conf, Affix->flag))
+ isAffixInUse(ConfBuild, AffixFieldFlag(Affix)))
{
- if (ptr == Conf->CompoundAffix ||
+ if (ptr == ConfBuild->CompoundAffix ||
ptr->issuffix != (ptr - 1)->issuffix ||
- strbncmp((const unsigned char *) (ptr - 1)->affix,
- (const unsigned char *) Affix->repl,
+ strbncmp((const unsigned char *) AffixFieldRepl(ConfBuild->Affix[(ptr - 1)->affix]),
+ (const unsigned char *) AffixFieldRepl(Affix),
(ptr - 1)->len))
{
/* leave only unique and minimals suffixes */
- ptr->affix = Affix->repl;
+ ptr->affix = i;
ptr->len = Affix->replen;
ptr->issuffix = (Affix->type == FF_SUFFIX);
ptr++;
}
}
}
- ptr->affix = NULL;
- Conf->CompoundAffix = (CMPDAffix *) repalloc(Conf->CompoundAffix, sizeof(CMPDAffix) * (ptr - Conf->CompoundAffix + 1));
+ ptr->affix = ISPELL_INVALID_INDEX;
+ ConfBuild->nCompoundAffix = ptr - ConfBuild->CompoundAffix + 1;
+ ConfBuild->CompoundAffix = (CMPDAffix *) repalloc(ConfBuild->CompoundAffix,
+ sizeof(CMPDAffix) * (ConfBuild->nCompoundAffix));
/* Start build a prefix tree */
- Conf->Prefix = mkANode(Conf, 0, firstsuffix, 0, FF_PREFIX);
- Conf->Suffix = mkANode(Conf, firstsuffix, Conf->naffixes, 0, FF_SUFFIX);
- mkVoidAffix(Conf, true, firstsuffix);
- mkVoidAffix(Conf, false, firstsuffix);
+ mkVoidAffix(ConfBuild, true, firstsuffix);
+ mkVoidAffix(ConfBuild, false, firstsuffix);
+
+ prefix_offset = mkANode(ConfBuild, 0, firstsuffix, 0, FF_PREFIX);
+ suffix_offset = mkANode(ConfBuild, firstsuffix, ConfBuild->nAffix, 0,
+ FF_SUFFIX);
+
+ /* Adjust offsets of new nodes for nodes of void affixes */
+ voidPrefix = (AffixNode *) NodeArrayGet(&ConfBuild->PrefixNodes, 0);
+ voidPrefix->data[0].node_offset = prefix_offset;
+
+ voidSuffix = (AffixNode *) NodeArrayGet(&ConfBuild->SuffixNodes, 0);
+ voidSuffix->data[0].node_offset = suffix_offset;
}
static AffixNodeData *
-FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
+FindAffixes(IspellDictData *dict, AffixNode *node, const char *word, int wrdlen,
+ int *level, int type)
{
+ AffixNode *node_start;
AffixNodeData *StopLow,
*StopHigh,
*StopMiddle;
uint8 symbol;
+ if (type == FF_PREFIX)
+ node_start = (AffixNode *) DictPrefixNodes(dict);
+ else
+ node_start = (AffixNode *) DictSuffixNodes(dict);
+
if (node->isvoid)
{ /* search void affixes */
- if (node->data->naff)
+ if (node->data->affstart != ISPELL_INVALID_INDEX)
return node->data;
- node = node->data->node;
+ node = (AffixNode *) DictNodeGet(node_start, node->data->node_offset);
}
while (node && *level < wrdlen)
@@ -2019,9 +2299,10 @@ FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
if (StopMiddle->val == symbol)
{
(*level)++;
- if (StopMiddle->naff)
+ if (StopMiddle->affstart != ISPELL_INVALID_INDEX)
return StopMiddle;
- node = StopMiddle->node;
+ node = (AffixNode *) DictNodeGet(node_start,
+ StopMiddle->node_offset);
break;
}
else if (StopMiddle->val < symbol)
@@ -2035,8 +2316,67 @@ FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
return NULL;
}
+/*
+ * Compile regular expression on first use and store it within reg.
+ */
+static void
+CompileAffixReg(AffixReg *reg, bool isregis, int type,
+ const char *mask, int masklen, MemoryContext dictCtx)
+{
+ MemoryContext oldcontext;
+
+ Assert(dictCtx);
+
+ /*
+ * Switch to memory context of the dictionary, so compiled expression can be
+ * used in other queries.
+ */
+ oldcontext = MemoryContextSwitchTo(dictCtx);
+
+ if (isregis)
+ RS_compile(&reg->r.regis, (type == FF_SUFFIX), mask);
+ else
+ {
+ int wmasklen;
+ int err;
+ pg_wchar *wmask;
+ char *tmask;
+
+ tmask = (char *) palloc(masklen + 3);
+ if (type == FF_SUFFIX)
+ sprintf(tmask, "%s$", mask);
+ else
+ sprintf(tmask, "^%s", mask);
+
+ masklen = strlen(tmask);
+ wmask = (pg_wchar *) palloc((masklen + 1) * sizeof(pg_wchar));
+ wmasklen = pg_mb2wchar_with_len(tmask, wmask, masklen);
+
+ err = pg_regcomp(&(reg->r.regex), wmask, wmasklen,
+ REG_ADVANCED | REG_NOSUB,
+ DEFAULT_COLLATION_OID);
+ if (err)
+ {
+ char errstr[100];
+
+ pg_regerror(err, &(reg->r.regex), errstr, sizeof(errstr));
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
+ errmsg("invalid regular expression: %s", errstr)));
+ }
+
+ pfree(wmask);
+ pfree(tmask);
+ }
+
+ reg->iscompiled = true;
+
+ MemoryContextSwitchTo(oldcontext);
+}
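CompileAffixReg() above keeps the compiled expression in backend-local memory and compiles it on first use, while the mask text itself can live in the shared dictionary. A rough standalone sketch of that split is given below. It uses POSIX regcomp()/regexec() instead of pg_regcomp()/pg_regexec(), and `RegSlot` and `match_affix()` are hypothetical names, so treat it as an illustration of the caching scheme only.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stdio.h>

/* Per-process cache slot, analogous to AffixReg. */
typedef struct
{
	bool		iscompiled;
	regex_t		regex;
} RegSlot;

/*
 * Match `word` against affix number `idx`.  `masks` stands for the read-only
 * (possibly shared) pattern strings; `cache` is private to this process and
 * must start zeroed.  Returns 1 on match, 0 on no match, -1 on a bad pattern.
 */
static int
match_affix(const char *const *masks, RegSlot *cache, int idx,
			bool is_suffix, const char *word)
{
	RegSlot    *slot;

	assert(idx >= 0);
	slot = &cache[idx];

	if (!slot->iscompiled)
	{
		char		pattern[256];

		/* Anchor the mask the same way the patch does: "mask$" or "^mask". */
		if (is_suffix)
			snprintf(pattern, sizeof(pattern), "%s$", masks[idx]);
		else
			snprintf(pattern, sizeof(pattern), "^%s", masks[idx]);

		if (regcomp(&slot->regex, pattern, REG_EXTENDED | REG_NOSUB) != 0)
			return -1;
		slot->iscompiled = true;
	}

	return regexec(&slot->regex, word, 0, NULL, 0) == 0 ? 1 : 0;
}
```

Only affixes that are actually consulted pay the compilation cost, and the cache array is indexed by the same affix number that the shared data uses.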
+
static char *
-CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *newword, int *baselen)
+CheckAffix(const char *word, size_t len, AFFIX *Affix, AffixReg *reg,
+ int flagflags, char *newword, int *baselen, MemoryContext dictCtx)
{
/*
* Check compound allow flags
@@ -2076,7 +2416,7 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
if (Affix->type == FF_SUFFIX)
{
strcpy(newword, word);
- strcpy(newword + len - Affix->replen, Affix->find);
+ strcpy(newword + len - Affix->replen, AffixFieldFind(Affix));
if (baselen) /* store length of non-changed part of word */
*baselen = len - Affix->replen;
}
@@ -2086,9 +2426,9 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
* if prefix is an all non-changed part's length then all word
* contains only prefix and suffix, so out
*/
- if (baselen && *baselen + strlen(Affix->find) <= Affix->replen)
+ if (baselen && *baselen + Affix->findlen <= Affix->replen)
return NULL;
- strcpy(newword, Affix->find);
+ strcpy(newword, AffixFieldFind(Affix));
strcat(newword, word + Affix->replen);
}
@@ -2099,7 +2439,12 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
return newword;
else if (Affix->isregis)
{
- if (RS_execute(&(Affix->reg.regis), newword))
+ /* Compile the regular expression on first demand */
+ if (!reg->iscompiled)
+ CompileAffixReg(reg, Affix->isregis, Affix->type,
+ AffixFieldMask(Affix), Affix->masklen, dictCtx);
+
+ if (RS_execute(&(reg->r.regis), newword))
return newword;
}
else
@@ -2109,12 +2454,17 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
size_t data_len;
int newword_len;
+ /* Compile the regular expression on first demand */
+ if (!reg->iscompiled)
+ CompileAffixReg(reg, Affix->isregis, Affix->type,
+ AffixFieldMask(Affix), Affix->masklen, dictCtx);
+
/* Convert data string to wide characters */
newword_len = strlen(newword);
data = (pg_wchar *) palloc((newword_len + 1) * sizeof(pg_wchar));
data_len = pg_mb2wchar_with_len(newword, data, newword_len);
- if (!(err = pg_regexec(&(Affix->reg.regex), data, data_len, 0, NULL, 0, NULL, 0)))
+ if (!(err = pg_regexec(&(reg->r.regex), data, data_len, 0, NULL, 0, NULL, 0)))
{
pfree(data);
return newword;
@@ -2153,7 +2503,7 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
char **cur;
char newword[2 * MAXNORMLEN] = "";
char pnewword[2 * MAXNORMLEN] = "";
- AffixNode *snode = Conf->Suffix,
+ AffixNode *snode = (AffixNode *) DictSuffixNodes(Conf->dict),
*pnode;
int i,
j;
@@ -2165,7 +2515,7 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
/* Check that the word itself is normal form */
- if (FindWord(Conf, word, VoidString, flag))
+ if (FindWord(Conf->dict, word, VoidString, flag))
{
*cur = pstrdup(word);
cur++;
@@ -2173,23 +2523,29 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
}
/* Find all other NORMAL forms of the 'word' (check only prefix) */
- pnode = Conf->Prefix;
+ pnode = (AffixNode *) DictPrefixNodes(Conf->dict);
plevel = 0;
while (pnode)
{
- prefix = FindAffixes(pnode, word, wrdlen, &plevel, FF_PREFIX);
+ prefix = FindAffixes(Conf->dict, pnode, word, wrdlen, &plevel, FF_PREFIX);
if (!prefix)
break;
- for (j = 0; j < prefix->naff; j++)
+ for (j = prefix->affstart; j <= prefix->affend; j++)
{
- if (CheckAffix(word, wrdlen, prefix->aff[j], flag, newword, NULL))
+ AFFIX *affix = (AFFIX *) DictAffixGet(Conf->dict, j);
+ AffixReg *reg = &(Conf->reg[j]);
+
+ if (affix &&
+ CheckAffix(word, wrdlen, affix, reg, flag, newword, NULL,
+ Conf->dictCtx))
{
/* prefix success */
- if (FindWord(Conf, newword, prefix->aff[j]->flag, flag))
+ if (FindWord(Conf->dict, newword, AffixFieldFlag(affix), flag))
cur += addToResult(forms, cur, newword);
}
}
- pnode = prefix->node;
+ pnode = (AffixNode *) DictNodeGet(DictPrefixNodes(Conf->dict),
+ prefix->node_offset);
}
/*
@@ -2201,45 +2557,59 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
int baselen = 0;
/* find possible suffix */
- suffix = FindAffixes(snode, word, wrdlen, &slevel, FF_SUFFIX);
+ suffix = FindAffixes(Conf->dict, snode, word, wrdlen, &slevel,
+ FF_SUFFIX);
if (!suffix)
break;
/* foreach suffix check affix */
- for (i = 0; i < suffix->naff; i++)
+ for (i = suffix->affstart; i <= suffix->affend; i++)
{
- if (CheckAffix(word, wrdlen, suffix->aff[i], flag, newword, &baselen))
+ AFFIX *sufentry = (AFFIX *) DictAffixGet(Conf->dict, i);
+ AffixReg *sufreg = &(Conf->reg[i]);
+
+ if (sufentry &&
+ CheckAffix(word, wrdlen, sufentry, sufreg, flag, newword, &baselen,
+ Conf->dictCtx))
{
/* suffix success */
- if (FindWord(Conf, newword, suffix->aff[i]->flag, flag))
+ if (FindWord(Conf->dict, newword, AffixFieldFlag(sufentry), flag))
cur += addToResult(forms, cur, newword);
/* now we will look changed word with prefixes */
- pnode = Conf->Prefix;
+ pnode = (AffixNode *) DictPrefixNodes(Conf->dict);
plevel = 0;
swrdlen = strlen(newword);
while (pnode)
{
- prefix = FindAffixes(pnode, newword, swrdlen, &plevel, FF_PREFIX);
+ prefix = FindAffixes(Conf->dict, pnode, newword, swrdlen,
+ &plevel, FF_PREFIX);
if (!prefix)
break;
- for (j = 0; j < prefix->naff; j++)
+ for (j = prefix->affstart; j <= prefix->affend; j++)
{
- if (CheckAffix(newword, swrdlen, prefix->aff[j], flag, pnewword, &baselen))
+ AFFIX *prefentry = (AFFIX *) DictAffixGet(Conf->dict, j);
+ AffixReg *prefreg = &(Conf->reg[j]);
+
+ if (prefentry &&
+ CheckAffix(newword, swrdlen, prefentry, prefreg,
+ flag, pnewword, &baselen, Conf->dictCtx))
{
/* prefix success */
- char *ff = (prefix->aff[j]->flagflags & suffix->aff[i]->flagflags & FF_CROSSPRODUCT) ?
- VoidString : prefix->aff[j]->flag;
+ char *ff = (prefentry->flagflags & sufentry->flagflags & FF_CROSSPRODUCT) ?
+ VoidString : AffixFieldFlag(prefentry);
- if (FindWord(Conf, pnewword, ff, flag))
+ if (FindWord(Conf->dict, pnewword, ff, flag))
cur += addToResult(forms, cur, pnewword);
}
}
- pnode = prefix->node;
+ pnode = (AffixNode *) DictNodeGet(DictPrefixNodes(Conf->dict),
+ prefix->node_offset);
}
}
}
- snode = suffix->node;
+ snode = (AffixNode *) DictNodeGet(DictSuffixNodes(Conf->dict),
+ suffix->node_offset);
}
if (cur == forms)
@@ -2259,7 +2629,8 @@ typedef struct SplitVar
} SplitVar;
static int
-CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
+CheckCompoundAffixes(IspellDictData *dict, CMPDAffix **ptr,
+ char *word, int len, bool CheckInPlace)
{
bool issuffix;
@@ -2269,9 +2640,12 @@ CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
if (CheckInPlace)
{
- while ((*ptr)->affix)
+ while ((*ptr)->affix != ISPELL_INVALID_INDEX)
{
- if (len > (*ptr)->len && strncmp((*ptr)->affix, word, (*ptr)->len) == 0)
+ AFFIX *affix = (AFFIX *) DictAffixGet(dict, (*ptr)->affix);
+
+ if (len > (*ptr)->len &&
+ strncmp(AffixFieldRepl(affix), word, (*ptr)->len) == 0)
{
len = (*ptr)->len;
issuffix = (*ptr)->issuffix;
@@ -2285,9 +2659,12 @@ CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
{
char *affbegin;
- while ((*ptr)->affix)
+ while ((*ptr)->affix != ISPELL_INVALID_INDEX)
{
- if (len > (*ptr)->len && (affbegin = strstr(word, (*ptr)->affix)) != NULL)
+ AFFIX *affix = (AFFIX *) DictAffixGet(dict, (*ptr)->affix);
+
+ if (len > (*ptr)->len &&
+ (affbegin = strstr(word, AffixFieldRepl(affix))) != NULL)
{
len = (*ptr)->len + (affbegin - word);
issuffix = (*ptr)->issuffix;
@@ -2339,13 +2716,14 @@ AddStem(SplitVar *v, char *word)
}
static SplitVar *
-SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int wordlen, int startpos, int minpos)
+SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig,
+ char *word, int wordlen, int startpos, int minpos)
{
SplitVar *var = NULL;
SPNodeData *StopLow,
*StopHigh,
*StopMiddle = NULL;
- SPNode *node = (snode) ? snode : Conf->Dictionary;
+ SPNode *node = (snode) ? snode : (SPNode *) DictDictNodes(Conf->dict);
int level = (snode) ? minpos : startpos; /* recursive
* minpos==level */
int lenaff;
@@ -2360,8 +2738,11 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
while (level < wordlen)
{
/* find word with epenthetic or/and compound affix */
- caff = Conf->CompoundAffix;
- while (level > startpos && (lenaff = CheckCompoundAffixes(&caff, word + level, wordlen - level, (node) ? true : false)) >= 0)
+ caff = (CMPDAffix *) DictCompoundAffix(Conf->dict);
+ while (level > startpos &&
+ (lenaff = CheckCompoundAffixes(Conf->dict, &caff,
+ word + level, wordlen - level,
+ (node) ? true : false)) >= 0)
{
/*
* there is one of compound affixes, so check word for existings
@@ -2408,7 +2789,8 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
while (ptr->next)
ptr = ptr->next;
- ptr->next = SplitToVariants(Conf, NULL, new, word, wordlen, startpos + lenaff, startpos + lenaff);
+ ptr->next = SplitToVariants(Conf, NULL, new, word, wordlen,
+ startpos + lenaff, startpos + lenaff);
pfree(new->stem);
pfree(new);
@@ -2467,13 +2849,14 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
/* we can find next word */
level++;
AddStem(var, pnstrdup(word + startpos, level - startpos));
- node = Conf->Dictionary;
+ node = (SPNode *) DictDictNodes(Conf->dict);
startpos = level;
continue;
}
}
}
- node = StopMiddle->node;
+ node = (SPNode *) DictNodeGet(DictDictNodes(Conf->dict),
+ StopMiddle->node_offset);
}
else
node = NULL;
@@ -2523,7 +2906,7 @@ NINormalizeWord(IspellDict *Conf, char *word)
pfree(res);
}
- if (Conf->usecompound)
+ if (Conf->dict->usecompound)
{
int wordlen = strlen(word);
SplitVar *ptr,
diff --git a/src/include/tsearch/dicts/spell.h b/src/include/tsearch/dicts/spell.h
index 210f97dda9..b40cf379eb 100644
--- a/src/include/tsearch/dicts/spell.h
+++ b/src/include/tsearch/dicts/spell.h
@@ -18,21 +18,23 @@
#include "tsearch/dicts/regis.h"
#include "tsearch/ts_public.h"
+#define ISPELL_INVALID_INDEX (0x7FFFF)
+#define ISPELL_INVALID_OFFSET (0xFFFFFFFF)
+
/*
* SPNode and SPNodeData are used to represent prefix tree (Trie) to store
* a words list.
*/
-struct SPNode;
-
typedef struct
{
uint32 val:8,
isword:1,
/* Stores compound flags listed below */
compoundflag:4,
- /* Reference to an entry of the AffixData field */
+ /* Index of an entry of the AffixData field */
affix:19;
- struct SPNode *node;
+ /* Offset to a node of the DictNodes field */
+ uint32 node_offset;
} SPNodeData;
/*
@@ -86,21 +88,55 @@ typedef struct spell_struct
*/
typedef struct aff_struct
{
- char *flag;
/* FF_SUFFIX or FF_PREFIX */
- uint32 type:1,
+ uint16 type:1,
flagflags:7,
issimple:1,
isregis:1,
- replen:14;
- char *find;
- char *repl;
+ flaglen:2;
+
+ /* 8 bits could be too much for repl, find and mask, but who knows */
+ uint8 replen;
+ uint8 findlen;
+ uint8 masklen;
+
+ /*
+ * fields stores the following data (each ends with \0):
+ * - repl
+ * - find
+ * - mask
+ * - flag - one character (if FM_CHAR),
+ * two characters (if FM_LONG),
+ * number, >= 0 and < 65536 (if FM_NUM).
+ */
+ char fields[FLEXIBLE_ARRAY_MEMBER];
+} AFFIX;
+
+#define AF_FLAG_MAXSIZE 5 /* strlen(65536) */
+#define AF_REPL_MAXSIZE 255 /* 8 bits */
+
+#define AFFIXHDRSZ (offsetof(AFFIX, fields))
+
+#define AffixFieldRepl(af) ((af)->fields)
+#define AffixFieldFind(af) ((af)->fields + (af)->replen + 1)
+#define AffixFieldMask(af) (AffixFieldFind(af) + (af)->findlen + 1)
+#define AffixFieldFlag(af) (AffixFieldMask(af) + (af)->masklen + 1)
+#define AffixGetSize(af) (AFFIXHDRSZ + (af)->replen + 1 + (af)->findlen + 1 \
+ + (af)->masklen + 1 + strlen(AffixFieldFlag(af)) + 1)
+
+/*
+ * Stores compiled regular expression of affix. AffixReg uses mask field of
+ * AFFIX as a regular expression.
+ */
+typedef struct AffixReg
+{
+ bool iscompiled;
union
{
regex_t regex;
Regis regis;
- } reg;
-} AFFIX;
+ } r;
+} AffixReg;
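The new AFFIX layout replaces the three char pointers with one byte array holding repl, find, mask and flag back to back, each terminated by \0, and the AffixField* macros walk it using the stored lengths. The following sketch shows how such a record could be packed and read back. `PackedAffix`, the `Field*` macros and `pack_affix()` are simplified, hypothetical stand-ins, and malloc() is used where the patch would use its own allocators.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for the patched AFFIX: lengths plus one byte array. */
typedef struct
{
	uint8_t		replen;
	uint8_t		findlen;
	uint8_t		masklen;
	char		fields[];		/* repl \0 find \0 mask \0 flag \0 */
} PackedAffix;

#define FieldRepl(af) ((af)->fields)
#define FieldFind(af) ((af)->fields + (af)->replen + 1)
#define FieldMask(af) (FieldFind(af) + (af)->findlen + 1)
#define FieldFlag(af) (FieldMask(af) + (af)->masklen + 1)

/* Returns a malloc'ed record, or NULL if a length does not fit in uint8_t. */
static PackedAffix *
pack_affix(const char *repl, const char *find,
		   const char *mask, const char *flag)
{
	size_t		replen = strlen(repl);
	size_t		findlen = strlen(find);
	size_t		masklen = strlen(mask);
	size_t		flaglen = strlen(flag);
	size_t		total;
	PackedAffix *af;

	if (replen > UINT8_MAX || findlen > UINT8_MAX || masklen > UINT8_MAX)
		return NULL;

	total = offsetof(PackedAffix, fields)
		+ replen + 1 + findlen + 1 + masklen + 1 + flaglen + 1;
	af = malloc(total);
	if (af == NULL)
		return NULL;

	/* Lengths must be set before the accessor macros are used. */
	af->replen = (uint8_t) replen;
	af->findlen = (uint8_t) findlen;
	af->masklen = (uint8_t) masklen;

	memcpy(FieldRepl(af), repl, replen + 1);
	memcpy(FieldFind(af), find, findlen + 1);
	memcpy(FieldMask(af), mask, masklen + 1);
	memcpy(FieldFlag(af), flag, flaglen + 1);

	/* The flag's terminator is the last byte of the record. */
	assert(FieldFlag(af) + flaglen + 1 == (char *) af + total);
	return af;
}
```

The record contains no pointers, so it can be copied byte for byte into a shared segment. The price is that the uint8 length fields cap repl, find and mask at 255 bytes each.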
/*
* affixes use dictionary flags too
@@ -120,14 +156,13 @@ typedef struct aff_struct
* AffixNode and AffixNodeData are used to represent prefix tree (Trie) to store
* an affix list.
*/
-struct AffixNode;
-
typedef struct
{
- uint32 val:8,
- naff:24;
- AFFIX **aff;
- struct AffixNode *node;
+ uint8 val;
+ uint32 affstart;
+ uint32 affend;
+ /* Offset to a node of the PrefixNodes or SuffixNodes field */
+ uint32 node_offset;
} AffixNodeData;
typedef struct AffixNode
@@ -139,9 +174,19 @@ typedef struct AffixNode
#define ANHRDSZ (offsetof(AffixNode, data))
+typedef struct NodeArray
+{
+ char *Nodes;
+ uint32 NodesSize; /* allocated size of Nodes */
+ uint32 NodesEnd; /* end of data in Nodes */
+} NodeArray;
+
+#define NodeArrayGet(na, of) (((of) == ISPELL_INVALID_OFFSET) ? NULL : (na)->Nodes + (of))
+
typedef struct
{
- char *affix;
+ /* Index of an affix of the Affix field */
+ uint32 affix;
int len;
bool issuffix;
} CMPDAffix;
@@ -176,30 +221,71 @@ typedef struct CompoundAffixFlag
#define FLAGNUM_MAXSIZE (1 << 16)
-typedef struct
+typedef struct IspellDictData
{
- int maffixes;
- int naffixes;
- AFFIX *Affix;
-
- AffixNode *Suffix;
- AffixNode *Prefix;
+ FlagMode flagMode;
+ bool usecompound;
- SPNode *Dictionary;
- /* Array of sets of affixes */
- char **AffixData;
- int lenAffixData;
- int nAffixData;
bool useFlagAliases;
- CMPDAffix *CompoundAffix;
+ uint32 nAffixData;
+ uint32 AffixDataStart;
- bool usecompound;
- FlagMode flagMode;
+ uint32 AffixOffsetStart;
+ uint32 AffixStart;
+ uint32 nAffix;
+
+ uint32 DictNodesStart;
+ uint32 PrefixNodesStart;
+ uint32 SuffixNodesStart;
+
+ uint32 CompoundAffixStart;
/*
- * All follow fields are actually needed only for initialization
+ * data stores:
+ * - AffixData - array of affix sets
+ * - Affix - sorted array of affixes
+ * - DictNodes - prefix tree of a word list
+ * - PrefixNodes - prefix tree of a prefix list
+ * - SuffixNodes - prefix tree of a suffix list
+ * - CompoundAffix - array of compound affixes
*/
+ char data[FLEXIBLE_ARRAY_MEMBER];
+} IspellDictData;
+
+#define IspellDictDataHdrSize (offsetof(IspellDictData, data))
+
+#define DictAffixDataOffset(d) ((d)->data)
+#define DictAffixData(d) ((d)->data + (d)->AffixDataStart)
+#define DictAffixDataGet(d, i) (((i) == ISPELL_INVALID_INDEX) ? NULL : \
+ DictAffixData(d) + ((uint32 *) DictAffixDataOffset(d))[i])
+
+#define DictAffixOffset(d) ((d)->data + (d)->AffixOffsetStart)
+#define DictAffix(d) ((d)->data + (d)->AffixStart)
+#define DictAffixGet(d, i) (((i) == ISPELL_INVALID_INDEX) ? NULL : \
+ DictAffix(d) + ((uint32 *) DictAffixOffset(d))[i])
+
+#define DictDictNodes(d) ((d)->data + (d)->DictNodesStart)
+#define DictPrefixNodes(d) ((d)->data + (d)->PrefixNodesStart)
+#define DictSuffixNodes(d) ((d)->data + (d)->SuffixNodesStart)
+#define DictNodeGet(node_start, of) (((of) == ISPELL_INVALID_OFFSET) ? NULL : \
+ (char *) (node_start) + (of))
+
+#define DictCompoundAffix(d) ((d)->data + (d)->CompoundAffixStart)
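The Dict* macros above resolve everything relative to the start of `data`, so an IspellDictData image stays valid wherever it is mapped. A small sketch of the same scheme, an offset table followed by packed items inside one flexible array, is shown here. `Blob`, `BlobOffsets()`, `BlobItem()` and `blob_build()` are hypothetical, and the layout is much simpler than the real structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Header with section starts, followed by all data in one array. */
typedef struct
{
	uint32_t	size;			/* total bytes, header included */
	uint32_t	nitems;
	uint32_t	offsets_start;	/* start of the uint32 offset table in data */
	uint32_t	items_start;	/* start of the packed strings in data */
	char		data[];
} Blob;

#define BlobOffsets(b) ((uint32_t *) ((b)->data + (b)->offsets_start))
#define BlobItem(b, i) ((b)->data + (b)->items_start + BlobOffsets(b)[i])

/* Pack n strings into one pointer-free allocation; NULL on failure. */
static Blob *
blob_build(const char *const *items, uint32_t n)
{
	size_t		total = 0;
	uint32_t	i;
	uint32_t	pos = 0;
	Blob	   *b;

	for (i = 0; i < n; i++)
		total += strlen(items[i]) + 1;

	total += offsetof(Blob, data) + n * sizeof(uint32_t);
	b = malloc(total);
	if (b == NULL)
		return NULL;

	b->size = (uint32_t) total;
	b->nitems = n;
	b->offsets_start = 0;
	b->items_start = n * (uint32_t) sizeof(uint32_t);

	for (i = 0; i < n; i++)
	{
		size_t		len = strlen(items[i]) + 1;

		/* Each offset is relative to items_start, not to an address. */
		BlobOffsets(b)[i] = pos;
		memcpy(b->data + b->items_start + pos, items[i], len);
		pos += (uint32_t) len;
	}
	return b;
}
```

Copying `b->size` bytes to a different address and reading through BlobItem() on the copy gives the same strings, which is the property a dsm segment needs since it may be mapped at different addresses in different backends.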
+
+/*
+ * IspellDictBuild is used to initialize IspellDictData struct. This is a
+ * temporary structure which is set up by NIStartBuild() and released by
+ * NIFinishBuild().
+ */
+typedef struct IspellDictBuild
+{
+ MemoryContext buildCxt; /* temp context for construction */
+
+ IspellDictData *dict;
+ uint32 dict_size;
+
+ /* Temporary data */
/* Array of Hunspell options in affix file */
CompoundAffixFlag *CompoundAffixFlags;
@@ -208,29 +294,73 @@ typedef struct
/* allocated length of CompoundAffixFlags array */
int mCompoundAffixFlag;
- /*
- * Remaining fields are only used during dictionary construction; they are
- * set up by NIStartBuild and cleared by NIFinishBuild.
- */
- MemoryContext buildCxt; /* temp context for construction */
-
- /* Temporary array of all words in the dict file */
+ /* Array of all words in the dict file */
SPELL **Spell;
- int nspell; /* number of valid entries in Spell array */
- int mspell; /* allocated length of Spell array */
+ int nSpell; /* number of valid entries in Spell array */
+ int mSpell; /* allocated length of Spell array */
+
+ /* Array of all affixes in the aff file */
+ AFFIX **Affix;
+ int nAffix; /* number of valid entries in Affix array */
+ int mAffix; /* allocated length of Affix array */
+ uint32 AffixSize;
+
+ /* Data for IspellDictData */
+
+ /* Array of sets of affixes */
+ uint32 *AffixDataOffset;
+ int nAffixData; /* number of affix sets */
+ int mAffixData; /* allocated number of affix sets */
+ char *AffixData;
+ uint32 AffixDataSize; /* allocated size of AffixData */
+ uint32 AffixDataEnd; /* end of data in AffixData */
+
+ /* Prefix tree which stores a word list */
+ NodeArray DictNodes;
+
+ /* Prefix tree which stores a prefix list */
+ NodeArray PrefixNodes;
+
+ /* Prefix tree which stores a suffix list */
+ NodeArray SuffixNodes;
- /* These are used to allocate "compact" data without palloc overhead */
- char *firstfree; /* first free address (always maxaligned) */
- size_t avail; /* free space remaining at firstfree */
+ /* Array of compound affixes */
+ CMPDAffix *CompoundAffix;
+ int nCompoundAffix; /* number of entries of CompoundAffix */
+} IspellDictBuild;
+
+#define AffixDataGet(d, i) ((d)->AffixData + (d)->AffixDataOffset[i])
+
+/*
+ * IspellDict is used within NINormalizeWord.
+ */
+typedef struct IspellDict
+{
+ /*
+ * Pointer to a DSM location of IspellDictData. Should be retrieved on
+ * every dispell_lexize() call.
+ */
+ IspellDictData *dict;
+ /*
+ * Array of regular expression of affixes. Each regular expression is
+ * compiled only on demand.
+ */
+ AffixReg *reg;
+ /*
+ * Memory context for compiling regular expressions.
+ */
+ MemoryContext dictCtx;
} IspellDict;
extern TSLexeme *NINormalizeWord(IspellDict *Conf, char *word);
-extern void NIStartBuild(IspellDict *Conf);
-extern void NIImportAffixes(IspellDict *Conf, const char *filename);
-extern void NIImportDictionary(IspellDict *Conf, const char *filename);
-extern void NISortDictionary(IspellDict *Conf);
-extern void NISortAffixes(IspellDict *Conf);
-extern void NIFinishBuild(IspellDict *Conf);
+extern void NIStartBuild(IspellDictBuild *ConfBuild);
+extern void NIImportAffixes(IspellDictBuild *ConfBuild, const char *filename);
+extern void NIImportDictionary(IspellDictBuild *ConfBuild,
+ const char *filename);
+extern void NISortDictionary(IspellDictBuild *ConfBuild);
+extern void NISortAffixes(IspellDictBuild *ConfBuild);
+extern void NICopyData(IspellDictBuild *ConfBuild);
+extern void NIFinishBuild(IspellDictBuild *ConfBuild);
#endif
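The offset-based accessors in the header above (DictAffixDataGet() and friends) replace stored pointers with an offset table plus a flat data area, so the same block stays valid at whatever address a backend maps it. Below is a minimal standalone sketch of that idea, not code from the patch: FlatStrings, flat_build(), flat_get() and FLAT_INVALID_INDEX are hypothetical names, and the layout is simplified to a single string array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for ISPELL_INVALID_INDEX */
#define FLAT_INVALID_INDEX UINT32_MAX

/*
 * Hypothetical flat container: a header followed by one data area holding
 * an offset table and then the NUL-terminated strings. No pointers are
 * stored, so the block can be copied or mapped at any address.
 */
typedef struct FlatStrings
{
	uint32_t	nitems;			/* number of strings */
	uint32_t	strings_start;	/* byte offset of string area within data[] */
	char		data[];			/* offset table, then string bytes */
} FlatStrings;

/* Resolve index i to a string, or NULL for the invalid index. */
static const char *
flat_get(const FlatStrings *f, uint32_t i)
{
	uint32_t	off;

	if (i == FLAT_INVALID_INDEX)
		return NULL;
	assert(i < f->nitems);
	/* memcpy avoids any alignment assumptions about data[] */
	memcpy(&off, f->data + i * sizeof(uint32_t), sizeof(off));
	return f->data + f->strings_start + off;
}

/* Pack n strings into one malloc'ed block; the caller frees it. */
static FlatStrings *
flat_build(const char *const *items, uint32_t n)
{
	size_t		total = 0;
	uint32_t	i;
	uint32_t	off = 0;
	FlatStrings *f;

	for (i = 0; i < n; i++)
		total += strlen(items[i]) + 1;

	f = malloc(offsetof(FlatStrings, data) + n * sizeof(uint32_t) + total);
	if (f == NULL)
		return NULL;

	f->nitems = n;
	f->strings_start = (uint32_t) (n * sizeof(uint32_t));
	for (i = 0; i < n; i++)
	{
		size_t		len = strlen(items[i]) + 1;

		memcpy(f->data + i * sizeof(uint32_t), &off, sizeof(off));
		memcpy(f->data + f->strings_start + off, items[i], len);
		off += (uint32_t) len;
	}
	return f;
}
```

Because lookups go through offsets only, a byte-for-byte copy of the block answers flat_get() the same way as the original, which is the property a DSM segment needs.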
0005-pg-ts-shared-dictinaries-view-v8.patch (text/plain; charset=us-ascii)
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 30e6741305..fe7d31c057 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -8216,6 +8216,11 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
<entry>time zone names</entry>
</row>
+ <row>
+ <entry><link linkend="view-pg-ts-shared-dictionaries"><structname>pg_ts_shared_dictionaries</structname></link></entry>
+ <entry>dictionaries currently in shared memory</entry>
+ </row>
+
<row>
<entry><link linkend="view-pg-user"><structname>pg_user</structname></link></entry>
<entry>database users</entry>
@@ -10971,6 +10976,63 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</sect1>
+ <sect1 id="view-pg-ts-shared-dictionaries">
+ <title><structname>pg_ts_shared_dictionaries</structname></title>
+
+ <indexterm zone="view-pg-ts-shared-dictionaries">
+ <primary>pg_ts_shared_dictionaries</primary>
+ </indexterm>
+
+ <para>
+ The <structname>pg_ts_shared_dictionaries</structname> view provides a
+ listing of all text search dictionaries that are currently allocated in
+ shared memory. The size of available space in shared memory is controlled by
+ <xref linkend="guc-shared-buffers"/>. A dictionary may have an option which
+ controls allocation in shared memory (see <xref linkend="textsearch-ispell-dictionary"/>).
+ </para>
+
+ <table>
+ <title><structname>pg_ts_shared_dictionaries</structname> Columns</title>
+
+ <tgroup cols="4">
+ <thead>
+ <row>
+ <entry>Name</entry>
+ <entry>Type</entry>
+ <entry>References</entry>
+ <entry>Description</entry>
+ </row>
+ </thead>
+ <tbody>
+ <row>
+ <entry><structfield>dictoid</structfield></entry>
+ <entry><type>oid</type></entry>
+ <entry><literal><link linkend="catalog-pg-ts-dict"><structname>pg_ts_dict</structname></link>.oid</literal></entry>
+ <entry>The OID of the text search dictionary located in shared memory</entry>
+ </row>
+ <row>
+ <entry><structfield>schemaname</structfield></entry>
+ <entry><type>name</type></entry>
+ <entry><literal><link linkend="catalog-pg-namespace"><structname>pg_namespace</structname></link>.nspname</literal></entry>
+ <entry>The name of schema containing the text search dictionary</entry>
+ </row>
+ <row>
+ <entry><structfield>dictname</structfield></entry>
+ <entry><type>name</type></entry>
+ <entry><literal><link linkend="catalog-pg-ts-dict"><structname>pg_ts_dict</structname></link>.dictname</literal></entry>
+ <entry>The text search dictionary name</entry>
+ </row>
+ <row>
+ <entry><structfield>size</structfield></entry>
+ <entry><type>bigint</type></entry>
+ <entry></entry>
+ <entry>Size of the text search dictionary in bytes</entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+ </sect1>
+
<sect1 id="view-pg-user">
<title><structname>pg_user</structname></title>
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index f3288fbb3f..02e8e8aa90 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -3047,6 +3047,12 @@ CREATE TEXT SEARCH DICTIONARY english_stem (
parameter to a value greater than zero or to a value <literal>-1</literal>.
</para>
+ <para>
+ The list of dictionaries currently located in shared memory can be retrieved from the
+ <link linkend="view-pg-ts-shared-dictionaries"><structname>pg_ts_shared_dictionaries</structname></link>
+ view.
+ </para>
+
</sect2>
</sect1>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 5e6e8a64f6..ab7ee973d9 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -506,6 +506,9 @@ CREATE VIEW pg_config AS
REVOKE ALL on pg_config FROM PUBLIC;
REVOKE EXECUTE ON FUNCTION pg_config() FROM PUBLIC;
+CREATE VIEW pg_ts_shared_dictionaries AS
+ SELECT * FROM pg_ts_shared_dictionaries();
+
-- Statistics views
CREATE VIEW pg_stat_all_tables AS
diff --git a/src/backend/tsearch/ts_shared.c b/src/backend/tsearch/ts_shared.c
index 872dd3f6f5..00f0f10f8a 100644
--- a/src/backend/tsearch/ts_shared.c
+++ b/src/backend/tsearch/ts_shared.c
@@ -13,11 +13,18 @@
*/
#include "postgres.h"
+#include "funcapi.h"
+#include "miscadmin.h"
+
+#include "access/htup_details.h"
+#include "catalog/pg_ts_dict.h"
#include "lib/dshash.h"
#include "storage/lwlock.h"
#include "storage/shmem.h"
#include "tsearch/ts_shared.h"
+#include "utils/builtins.h"
#include "utils/hashutils.h"
+#include "utils/lsyscache.h"
#include "utils/memutils.h"
@@ -394,3 +401,100 @@ recheck_table:
MemoryContextSwitchTo(old_context);
}
+
+/*
+ * pg_ts_shared_dictionaries - SQL SRF showing dictionaries currently in
+ * shared memory.
+ */
+Datum
+pg_ts_shared_dictionaries(PG_FUNCTION_ARGS)
+{
+ ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
+ MemoryContext oldcontext;
+ TupleDesc tupdesc;
+ Tuplestorestate *tupstore;
+ Relation rel;
+ HeapTuple tuple;
+ SysScanDesc scan;
+
+ /* check to see if caller supports us returning a tuplestore */
+ if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("set-valued function called in context that cannot accept a set")));
+ if (!(rsinfo->allowedModes & SFRM_Materialize))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("materialize mode required, but it is not " \
+ "allowed in this context")));
+
+ /* Build a tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ elog(ERROR, "return type must be a row type");
+
+ /* Build tuplestore to hold the result rows */
+ oldcontext = MemoryContextSwitchTo(rsinfo->econtext->ecxt_per_query_memory);
+
+ tupstore = tuplestore_begin_heap(true, false, work_mem);
+ rsinfo->returnMode = SFRM_Materialize;
+ rsinfo->setResult = tupstore;
+ rsinfo->setDesc = tupdesc;
+
+ MemoryContextSwitchTo(oldcontext);
+
+ init_dict_table();
+
+ /*
+ * If a hash table wasn't created return zero records.
+ */
+ if (!DsaPointerIsValid(tsearch_ctl->dict_table_handle))
+ {
+ tuplestore_donestoring(tupstore);
+
+ PG_RETURN_VOID();
+ }
+
+ /* Start to scan pg_ts_dict */
+ rel = heap_open(TSDictionaryRelationId, AccessShareLock);
+ scan = systable_beginscan(rel, InvalidOid, false, NULL, 0, NULL);
+
+ while (HeapTupleIsValid(tuple = systable_getnext(scan)))
+ {
+ Datum values[4];
+ bool nulls[4];
+ Form_pg_ts_dict dict = (Form_pg_ts_dict) GETSTRUCT(tuple);
+ Oid dictid = HeapTupleGetOid(tuple);
+ TsearchDictEntry *entry;
+ NameData dict_name;
+
+ /* Skip dictionaries that aren't located in shared memory */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, &dictid, false);
+ if (!entry)
+ continue;
+
+ namecpy(&dict_name, &dict->dictname);
+
+ memset(nulls, 0, sizeof(nulls));
+
+ values[0] = ObjectIdGetDatum(dictid);
+
+ if (OidIsValid(dict->dictnamespace))
+ values[1] = CStringGetDatum(get_namespace_name(dict->dictnamespace));
+ else
+ nulls[1] = true;
+
+ values[2] = NameGetDatum(&dict_name);
+ values[3] = Int64GetDatum(entry->dict_size);
+
+ dshash_release_lock(dict_table, entry);
+
+ tuplestore_putvalues(tupstore, tupdesc, values, nulls);
+ }
+
+ systable_endscan(scan);
+ heap_close(rel, AccessShareLock);
+
+ tuplestore_donestoring(tupstore);
+
+ PG_RETURN_VOID();
+}
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 0fdb42f639..31cd0c91b2 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -4973,6 +4973,9 @@ DESCR("trigger for automatic update of tsvector column");
DATA(insert OID = 3759 ( get_current_ts_config PGNSP PGUID 12 1 0 0 0 f f f t f s s 0 0 3734 "" _null_ _null_ _null_ _null_ _null_ get_current_ts_config _null_ _null_ _null_ ));
DESCR("get current tsearch configuration");
+DATA(insert OID = 4213 ( pg_ts_shared_dictionaries PGNSP PGUID 12 1 10 0 0 f f f f t s s 0 0 2249 "" "{26,19,19,20}" "{o,o,o,o}" "{dictoid,schemaname,dictname,size}" _null_ _null_ pg_ts_shared_dictionaries _null_ _null_ _null_ ));
+DESCR("information about text search dictionaries currently in shared memory");
+
DATA(insert OID = 3736 ( regconfigin PGNSP PGUID 12 1 0 0 0 f f f t f s s 1 0 3734 "2275" _null_ _null_ _null_ _null_ _null_ regconfigin _null_ _null_ _null_ ));
DESCR("I/O");
DATA(insert OID = 3737 ( regconfigout PGNSP PGUID 12 1 0 0 0 f f f t f s s 1 0 2275 "3734" _null_ _null_ _null_ _null_ _null_ regconfigout _null_ _null_ _null_ ));
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 5e0597e091..d25b5f5ed9 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2211,6 +2211,11 @@ pg_timezone_names| SELECT pg_timezone_names.name,
pg_timezone_names.utc_offset,
pg_timezone_names.is_dst
FROM pg_timezone_names() pg_timezone_names(name, abbrev, utc_offset, is_dst);
+pg_ts_shared_dictionaries| SELECT pg_ts_shared_dictionaries.dictoid,
+ pg_ts_shared_dictionaries.schemaname,
+ pg_ts_shared_dictionaries.dictname,
+ pg_ts_shared_dictionaries.size
+ FROM pg_ts_shared_dictionaries() pg_ts_shared_dictionaries(dictoid, schemaname, dictname, size);
pg_user| SELECT pg_shadow.usename,
pg_shadow.usesysid,
pg_shadow.usecreatedb,
0006-Shared-memory-ispell-option-v8.patch (text/plain; charset=us-ascii)
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index 02e8e8aa90..65c2d6daa3 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -2829,6 +2829,7 @@ iconv -f ISO_8859-1 -t UTF-8 -o nn_no.dict nn_NO.dic
<programlisting>
CREATE TEXT SEARCH DICTIONARY english_hunspell (
TEMPLATE = ispell,
+ Shareable = false,
DictFile = en_us,
AffFile = en_us,
Stopwords = english);
@@ -2843,6 +2844,9 @@ CREATE TEXT SEARCH DICTIONARY english_hunspell (
The stop-words file has the same format explained above for the
<literal>simple</literal> dictionary type. The format of the other files is
not specified here but is available from the above-mentioned web sites.
+ <literal>Shareable</literal> controls loading into shared memory. By
+ default it is <literal>true</literal> (see more in
+ <xref linkend="textsearch-shared-dictionaries"/>).
</para>
<para>
@@ -3040,6 +3044,8 @@ CREATE TEXT SEARCH DICTIONARY english_stem (
both CPU and time-consuming. Instead of doing this in each backend when
it needs a dictionary for the first time, the compiled dictionary may be
stored in shared memory so that it may be reused by other backends.
+ Currently only <application>Ispell</application> supports loading into
+ shared memory.
</para>
<para>
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index 6294a52af3..1475cdb908 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -38,7 +38,8 @@ typedef struct
} DictISpell;
static void parse_dictoptions(List *dictoptions,
- char **dictfile, char **afffile, char **stopfile);
+ char **dictfile, char **afffile, char **stopfile,
+ bool *isshared);
static void *dispell_build(List *dictoptions, Size *size);
Datum
@@ -48,15 +49,21 @@ dispell_init(PG_FUNCTION_ARGS)
DictISpell *d;
void *dict_location;
char *stopfile;
+ bool isshared;
d = (DictISpell *) palloc0(sizeof(DictISpell));
- parse_dictoptions(init_data->dictoptions, NULL, NULL, &stopfile);
+ parse_dictoptions(init_data->dictoptions, NULL, NULL, &stopfile, &isshared);
+ /* Make stop word list */
if (stopfile)
readstoplist(stopfile, &(d->stoplist), lowerstr);
- dict_location = ts_dict_shmem_location(init_data, dispell_build);
+ /* Build the dictionary itself, or get it from shared memory */
+ if (isshared)
+ dict_location = ts_dict_shmem_location(init_data, dispell_build);
+ else
+ dict_location = dispell_build(init_data->dictoptions, NULL);
Assert(dict_location);
d->obj.dict = (IspellDictData *) dict_location;
@@ -110,9 +117,10 @@ dispell_lexize(PG_FUNCTION_ARGS)
static void
parse_dictoptions(List *dictoptions, char **dictfile, char **afffile,
- char **stopfile)
+ char **stopfile, bool *isshared)
{
ListCell *l;
+ bool isshared_defined = false;
if (dictfile)
*dictfile = NULL;
@@ -120,6 +128,8 @@ parse_dictoptions(List *dictoptions, char **dictfile, char **afffile,
*afffile = NULL;
if (stopfile)
*stopfile = NULL;
+ if (isshared)
+ *isshared = true;
foreach(l, dictoptions)
{
@@ -158,6 +168,19 @@ parse_dictoptions(List *dictoptions, char **dictfile, char **afffile,
errmsg("multiple StopWords parameters")));
*stopfile = defGetString(defel);
}
+ else if (pg_strcasecmp(defel->defname, "Shareable") == 0)
+ {
+ if (!isshared)
+ continue;
+
+ if (isshared_defined)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("multiple Shareable parameters")));
+
+ *isshared = defGetBoolean(defel);
+ isshared_defined = true;
+ }
else
{
ereport(ERROR,
@@ -180,7 +203,7 @@ dispell_build(List *dictoptions, Size *size)
char *dictfile,
*afffile;
- parse_dictoptions(dictoptions, &dictfile, &afffile, NULL);
+ parse_dictoptions(dictoptions, &dictfile, &afffile, NULL, NULL);
if (!afffile)
{
@@ -212,6 +235,7 @@ dispell_build(List *dictoptions, Size *size)
NIFinishBuild(&build);
/* Return the buffer and its size */
- *size = build.dict_size;
+ if (size)
+ *size = build.dict_size;
return build.dict;
}
diff --git a/src/test/regress/expected/tsdicts.out b/src/test/regress/expected/tsdicts.out
index 0c1d7c7675..ea4dda8091 100644
--- a/src/test/regress/expected/tsdicts.out
+++ b/src/test/regress/expected/tsdicts.out
@@ -194,6 +194,7 @@ SELECT ts_lexize('hunspell', 'footballyklubber');
-- Test ISpell dictionary with hunspell affix file with FLAG long parameter
CREATE TEXT SEARCH DICTIONARY hunspell_long (
Template=ispell,
+ Shareable=false,
DictFile=hunspell_sample_long,
AffFile=hunspell_sample_long
);
@@ -290,6 +291,7 @@ SELECT ts_lexize('hunspell_long', 'footballyklubber');
-- Test ISpell dictionary with hunspell affix file with FLAG num parameter
CREATE TEXT SEARCH DICTIONARY hunspell_num (
Template=ispell,
+ Shareable=false,
DictFile=hunspell_sample_num,
AffFile=hunspell_sample_num
);
@@ -588,3 +590,77 @@ CREATE TEXT SEARCH DICTIONARY tsdict_case
"AffFile" = ispell_sample
);
ERROR: unrecognized Ispell parameter: "DictFile"
+-- Test shared dictionaries
+CREATE TEXT SEARCH DICTIONARY shared_ispell (
+ Template=ispell,
+ DictFile=ispell_sample,
+ AffFile=ispell_sample
+);
+SHOW max_shared_dictionaries_size;
+ max_shared_dictionaries_size
+------------------------------
+ 100MB
+(1 row)
+
+-- Make sure that dictionaries in shared memory
+SELECT ts_lexize('ispell', 'skies');
+ ts_lexize
+-----------
+ {sky}
+(1 row)
+
+-- We cannot disable shared dictionaries since there are loaded dictionaries
+ALTER SYSTEM SET max_shared_dictionaries_size to 0;
+ERROR: Cannot disable shared dictionaries, there are loaded dictionaries into shared memory.
+-- But we can decrease maximum allowed size of loaded dictionaries
+ALTER SYSTEM SET max_shared_dictionaries_size to '50MB';
+-- Or set no limit
+ALTER SYSTEM SET max_shared_dictionaries_size to -1;
+SELECT pg_reload_conf();
+ pg_reload_conf
+----------------
+ t
+(1 row)
+
+SELECT ts_lexize('hunspell', 'skies');
+ ts_lexize
+-----------
+ {sky}
+(1 row)
+
+SELECT ts_lexize('shared_ispell', 'skies');
+ ts_lexize
+-----------
+ {sky}
+(1 row)
+
+SELECT schemaname, dictname FROM pg_ts_shared_dictionaries;
+ schemaname | dictname
+------------+---------------
+ public | ispell
+ public | hunspell
+ public | shared_ispell
+(3 rows)
+
+-- shared_ispell space should be released in shared memory
+DROP TEXT SEARCH DICTIONARY shared_ispell;
+-- Make sure that dictionaries in shared memory, DROP invalidates cache
+SELECT ts_lexize('ispell', 'skies');
+ ts_lexize
+-----------
+ {sky}
+(1 row)
+
+SELECT ts_lexize('hunspell', 'skies');
+ ts_lexize
+-----------
+ {sky}
+(1 row)
+
+SELECT schemaname, dictname FROM pg_ts_shared_dictionaries;
+ schemaname | dictname
+------------+----------
+ public | ispell
+ public | hunspell
+(2 rows)
+
diff --git a/src/test/regress/sql/tsdicts.sql b/src/test/regress/sql/tsdicts.sql
index 1633c0d066..c4dbadb3d1 100644
--- a/src/test/regress/sql/tsdicts.sql
+++ b/src/test/regress/sql/tsdicts.sql
@@ -51,6 +51,7 @@ SELECT ts_lexize('hunspell', 'footballyklubber');
-- Test ISpell dictionary with hunspell affix file with FLAG long parameter
CREATE TEXT SEARCH DICTIONARY hunspell_long (
Template=ispell,
+ Shareable=false,
DictFile=hunspell_sample_long,
AffFile=hunspell_sample_long
);
@@ -75,6 +76,7 @@ SELECT ts_lexize('hunspell_long', 'footballyklubber');
-- Test ISpell dictionary with hunspell affix file with FLAG num parameter
CREATE TEXT SEARCH DICTIONARY hunspell_num (
Template=ispell,
+ Shareable=false,
DictFile=hunspell_sample_num,
AffFile=hunspell_sample_num
);
@@ -196,3 +198,38 @@ CREATE TEXT SEARCH DICTIONARY tsdict_case
"DictFile" = ispell_sample,
"AffFile" = ispell_sample
);
+
+-- Test shared dictionaries
+CREATE TEXT SEARCH DICTIONARY shared_ispell (
+ Template=ispell,
+ DictFile=ispell_sample,
+ AffFile=ispell_sample
+);
+
+SHOW max_shared_dictionaries_size;
+
+-- Make sure that dictionaries in shared memory
+SELECT ts_lexize('ispell', 'skies');
+
+-- We cannot disable shared dictionaries since there are loaded dictionaries
+ALTER SYSTEM SET max_shared_dictionaries_size to 0;
+-- But we can decrease maximum allowed size of loaded dictionaries
+ALTER SYSTEM SET max_shared_dictionaries_size to '50MB';
+-- Or set no limit
+ALTER SYSTEM SET max_shared_dictionaries_size to -1;
+SELECT pg_reload_conf();
+
+SELECT ts_lexize('hunspell', 'skies');
+
+SELECT ts_lexize('shared_ispell', 'skies');
+
+SELECT schemaname, dictname FROM pg_ts_shared_dictionaries;
+
+-- shared_ispell space should be released in shared memory
+DROP TEXT SEARCH DICTIONARY shared_ispell;
+
+-- Make sure that dictionaries in shared memory, DROP invalidates cache
+SELECT ts_lexize('ispell', 'skies');
+SELECT ts_lexize('hunspell', 'skies');
+
+SELECT schemaname, dictname FROM pg_ts_shared_dictionaries;
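The Shareable handling in parse_dictoptions() above follows a common pattern: default to true, match the option name case-insensitively, and reject a second occurrence. Here is a standalone sketch of that pattern, assuming options are simplified to name/boolean pairs; Option, name_equals_ci() and parse_shareable() are hypothetical names, and the real code uses DefElem, pg_strcasecmp() and ereport() instead.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified option: the patch works on DefElem nodes. */
typedef struct Option
{
	const char *name;
	bool		value;
} Option;

/* Case-insensitive string equality for ASCII option names. */
static bool
name_equals_ci(const char *a, const char *b)
{
	while (*a && *b)
	{
		if (tolower((unsigned char) *a) != tolower((unsigned char) *b))
			return false;
		a++;
		b++;
	}
	return *a == *b;			/* true only if both strings ended */
}

/*
 * Scan the options for "Shareable". Returns 0 on success and -1 if the
 * option is given more than once. *isshared defaults to true.
 */
static int
parse_shareable(const Option *opts, size_t nopts, bool *isshared)
{
	bool		defined = false;
	size_t		i;

	assert(isshared != NULL);
	*isshared = true;

	for (i = 0; i < nopts; i++)
	{
		if (!name_equals_ci(opts[i].name, "Shareable"))
			continue;			/* some other option, ignored here */
		if (defined)
			return -1;			/* "multiple Shareable parameters" */
		*isshared = opts[i].value;
		defined = true;
	}
	return 0;
}
```

Unlike the sketch, the patch raises an error for unrecognized option names; that part is left out to keep the example short.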
On 03/20/2018 02:11 PM, Arthur Zakirov wrote:
Hello,
On Mon, Mar 19, 2018 at 08:50:46PM +0100, Tomas Vondra wrote:
Hi Arthur,
I went through the patch - just skimming through the diffs, will do more
testing tomorrow. Here are a few initial comments.
Thank you for the review!
1) max_shared_dictionaries_size / PGC_POSTMASTER
I'm not quite sure why the GUC is defined as PGC_POSTMASTER, i.e. why it
can't be changed after server start. That seems like a fairly useful
thing to do (e.g. increase the limit while the server is running), and
after looking at the code I think it shouldn't be difficult to change.
max_shared_dictionaries_size is defined as PGC_SIGHUP now. I added a check
of the new value: it disallows setting zero if there are loaded dictionaries,
and disallows decreasing the maximum allowed size if the loaded size is
greater than the new value.
I wonder if these restrictions are needed? I mean, why not allow setting
max_shared_dictionaries_size below the size of loaded dictionaries?
Of course, on the one hand those restrictions seem sensible. On the other
hand, perhaps in some cases it would be useful to allow violating them?
I mean, why not simply disable loading of new dictionaries when
(max_shared_dictionaries_size < loaded_size)
Maybe I'm over-thinking this though. It's probably safer and less
surprising to enforce the restrictions.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
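The alternative raised in the message above (keep what is already loaded and simply refuse to load new dictionaries once the limit is below the loaded size) can be sketched as a small admission check. This is an illustration only, not code from the patch: can_share_dictionary() is a hypothetical name, and the meaning of -1 and 0 follows the max_shared_dictionaries_size documentation in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical admission check for loading one more shared dictionary.
 *   max_size  < 0 : no limit
 *   max_size == 0 : shared dictionaries are disabled
 *   max_size  > 0 : limit in bytes
 * Dictionaries that are already loaded are never evicted; a lowered limit
 * only stops new ones from being added.
 */
static bool
can_share_dictionary(int64_t max_size, int64_t loaded_size, int64_t dict_size)
{
	assert(loaded_size >= 0 && dict_size >= 0);

	if (max_size < 0)
		return true;
	if (max_size == 0)
		return false;
	if (loaded_size >= max_size)
		return false;			/* limit was lowered below what is loaded */
	return dict_size <= max_size - loaded_size;
}
```

With a check like this, changing the GUC never has to fail; a dictionary that does not fit would presumably fall back to being built in backend-local memory.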
On Tue, Mar 20, 2018 at 09:30:15PM +0100, Tomas Vondra wrote:
On 03/20/2018 02:11 PM, Arthur Zakirov wrote:
max_shared_dictionaries_size is defined as PGC_SIGHUP now. I added a check
of the new value: it disallows setting zero if there are loaded dictionaries,
and disallows decreasing the maximum allowed size if the loaded size is
greater than the new value.
I wonder if these restrictions are needed? I mean, why not allow setting
max_shared_dictionaries_size below the size of loaded dictionaries?
Of course, on the one hand those restrictions seem sensible. On the other
hand, perhaps in some cases it would be useful to allow violating them?
I mean, why not simply disable loading of new dictionaries when
(max_shared_dictionaries_size < loaded_size)
Maybe I'm over-thinking this though. It's probably safer and less
surprising to enforce the restrictions.
Hm, yes, in some cases this check may be over-engineering. I thought
it was reasonable and safer in the v7 patch. But there are similar GUCs,
wal_keep_segments and max_wal_size, which don't do additional checks,
and people are fine with them. So I removed that check from the variable.
Please find the attached new version of the patch.
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Attachments:
0001-Fix-ispell-memory-handling-v9.patch (text/plain; charset=us-ascii)
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index b9fdd77e19..e071994523 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -78,6 +78,8 @@
#define tmpalloc(sz) MemoryContextAlloc(Conf->buildCxt, (sz))
#define tmpalloc0(sz) MemoryContextAllocZero(Conf->buildCxt, (sz))
+#define tmpstrdup(str) MemoryContextStrdup(Conf->buildCxt, (str))
+
/*
* Prepare for constructing an ISpell dictionary.
*
@@ -498,7 +500,7 @@ NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
strcpy(Conf->Spell[Conf->nspell]->word, word);
Conf->Spell[Conf->nspell]->p.flag = (*flag != '\0')
- ? cpstrdup(Conf, flag) : VoidString;
+ ? tmpstrdup(flag) : VoidString;
Conf->nspell++;
}
@@ -1040,7 +1042,7 @@ setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
entry->flag.i = i;
}
else
- entry->flag.s = cpstrdup(Conf, s);
+ entry->flag.s = tmpstrdup(s);
entry->flagMode = Conf->flagMode;
entry->value = val;
@@ -1536,6 +1538,9 @@ nextline:
return;
isnewformat:
+ pfree(recoded);
+ pfree(pstr);
+
if (oldformat)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
0002-Change-tmplinit-argument-v9.patch (text/plain; charset=us-ascii)
diff --git a/contrib/dict_int/dict_int.c b/contrib/dict_int/dict_int.c
index 56ede37089..e11d1129e9 100644
--- a/contrib/dict_int/dict_int.c
+++ b/contrib/dict_int/dict_int.c
@@ -30,7 +30,7 @@ PG_FUNCTION_INFO_V1(dintdict_lexize);
Datum
dintdict_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictInt *d;
ListCell *l;
@@ -38,7 +38,7 @@ dintdict_init(PG_FUNCTION_ARGS)
d->maxlen = 6;
d->rejectlong = false;
- foreach(l, dictoptions)
+ foreach(l, init_data->dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/contrib/dict_xsyn/dict_xsyn.c b/contrib/dict_xsyn/dict_xsyn.c
index a79ece240c..c3146bae3c 100644
--- a/contrib/dict_xsyn/dict_xsyn.c
+++ b/contrib/dict_xsyn/dict_xsyn.c
@@ -140,7 +140,7 @@ read_dictionary(DictSyn *d, const char *filename)
Datum
dxsyn_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSyn *d;
ListCell *l;
char *filename = NULL;
@@ -153,7 +153,7 @@ dxsyn_init(PG_FUNCTION_ARGS)
d->matchsynonyms = false;
d->keepsynonyms = true;
- foreach(l, dictoptions)
+ foreach(l, init_data->dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/contrib/unaccent/unaccent.c b/contrib/unaccent/unaccent.c
index 247c202755..2e66331ed8 100644
--- a/contrib/unaccent/unaccent.c
+++ b/contrib/unaccent/unaccent.c
@@ -267,12 +267,12 @@ PG_FUNCTION_INFO_V1(unaccent_init);
Datum
unaccent_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
TrieChar *rootTrie = NULL;
bool fileloaded = false;
ListCell *l;
- foreach(l, dictoptions)
+ foreach(l, init_data->dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/commands/tsearchcmds.c b/src/backend/commands/tsearchcmds.c
index 3a843512d1..967fe5a6f4 100644
--- a/src/backend/commands/tsearchcmds.c
+++ b/src/backend/commands/tsearchcmds.c
@@ -386,17 +386,22 @@ verify_dictoptions(Oid tmplId, List *dictoptions)
}
else
{
+ DictInitData init_data;
+
/*
* Copy the options just in case init method thinks it can scribble on
* them ...
*/
dictoptions = copyObject(dictoptions);
+ init_data.dictoptions = dictoptions;
+ init_data.dictid = InvalidOid;
+
/*
* Call the init method and see if it complains. We don't worry about
* it leaking memory, since our command will soon be over anyway.
*/
- (void) OidFunctionCall1(initmethod, PointerGetDatum(dictoptions));
+ (void) OidFunctionCall1(initmethod, PointerGetDatum(&init_data));
}
ReleaseSysCache(tup);
diff --git a/src/backend/snowball/dict_snowball.c b/src/backend/snowball/dict_snowball.c
index 78c9f73ef0..db12606fdd 100644
--- a/src/backend/snowball/dict_snowball.c
+++ b/src/backend/snowball/dict_snowball.c
@@ -181,14 +181,14 @@ locate_stem_module(DictSnowball *d, const char *lang)
Datum
dsnowball_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSnowball *d;
bool stoploaded = false;
ListCell *l;
d = (DictSnowball *) palloc0(sizeof(DictSnowball));
- foreach(l, dictoptions)
+ foreach(l, init_data->dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index edc6547700..6d0dedbefb 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -29,7 +29,7 @@ typedef struct
Datum
dispell_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictISpell *d;
bool affloaded = false,
dictloaded = false,
@@ -40,7 +40,7 @@ dispell_init(PG_FUNCTION_ARGS)
NIStartBuild(&(d->obj));
- foreach(l, dictoptions)
+ foreach(l, init_data->dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_simple.c b/src/backend/tsearch/dict_simple.c
index ac6a24eba5..80f2d1535d 100644
--- a/src/backend/tsearch/dict_simple.c
+++ b/src/backend/tsearch/dict_simple.c
@@ -29,7 +29,7 @@ typedef struct
Datum
dsimple_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSimple *d = (DictSimple *) palloc0(sizeof(DictSimple));
bool stoploaded = false,
acceptloaded = false;
@@ -37,7 +37,7 @@ dsimple_init(PG_FUNCTION_ARGS)
d->accept = true; /* default */
- foreach(l, dictoptions)
+ foreach(l, init_data->dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_synonym.c b/src/backend/tsearch/dict_synonym.c
index c011886cb0..29f86472a4 100644
--- a/src/backend/tsearch/dict_synonym.c
+++ b/src/backend/tsearch/dict_synonym.c
@@ -91,7 +91,7 @@ compareSyn(const void *a, const void *b)
Datum
dsynonym_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSyn *d;
ListCell *l;
char *filename = NULL;
@@ -104,7 +104,7 @@ dsynonym_init(PG_FUNCTION_ARGS)
char *line = NULL;
uint16 flags = 0;
- foreach(l, dictoptions)
+ foreach(l, init_data->dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_thesaurus.c b/src/backend/tsearch/dict_thesaurus.c
index 24364e646d..7f87ed1c97 100644
--- a/src/backend/tsearch/dict_thesaurus.c
+++ b/src/backend/tsearch/dict_thesaurus.c
@@ -604,7 +604,7 @@ compileTheSubstitute(DictThesaurus *d)
Datum
thesaurus_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictThesaurus *d;
char *subdictname = NULL;
bool fileloaded = false;
@@ -612,7 +612,7 @@ thesaurus_init(PG_FUNCTION_ARGS)
d = (DictThesaurus *) palloc0(sizeof(DictThesaurus));
- foreach(l, dictoptions)
+ foreach(l, init_data->dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/utils/cache/ts_cache.c b/src/backend/utils/cache/ts_cache.c
index 3d5c194148..adb9c60b72 100644
--- a/src/backend/utils/cache/ts_cache.c
+++ b/src/backend/utils/cache/ts_cache.c
@@ -39,6 +39,7 @@
#include "catalog/pg_ts_template.h"
#include "commands/defrem.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_public.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
#include "utils/fmgroids.h"
@@ -314,6 +315,7 @@ lookup_ts_dictionary_cache(Oid dictId)
if (OidIsValid(template->tmplinit))
{
+ DictInitData init_data;
List *dictoptions;
Datum opt;
bool isnull;
@@ -333,9 +335,12 @@ lookup_ts_dictionary_cache(Oid dictId)
else
dictoptions = deserialize_deflist(opt);
+ init_data.dictoptions = dictoptions;
+ init_data.dictid = dictId;
+
entry->dictData =
DatumGetPointer(OidFunctionCall1(template->tmplinit,
- PointerGetDatum(dictoptions)));
+ PointerGetDatum(&init_data)));
MemoryContextSwitchTo(oldcontext);
}
diff --git a/src/include/tsearch/ts_public.h b/src/include/tsearch/ts_public.h
index 0b7a5aa68e..723862981d 100644
--- a/src/include/tsearch/ts_public.h
+++ b/src/include/tsearch/ts_public.h
@@ -13,6 +13,7 @@
#ifndef _PG_TS_PUBLIC_H_
#define _PG_TS_PUBLIC_H_
+#include "nodes/pg_list.h"
#include "tsearch/ts_type.h"
/*
@@ -84,6 +85,19 @@ extern bool searchstoplist(StopList *s, char *key);
* Interface with dictionaries
*/
+/*
+ * Argument which is passed to a template's init method.
+ */
+typedef struct
+{
+ /*
+ * A dictionary option list for a template's init method. Should go first
+ * for backward compatibility.
+ */
+ List *dictoptions;
+ Oid dictid;
+} DictInitData;
+
/* return struct for any lexize function */
typedef struct
{
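To illustrate the calling convention introduced by this patch: a template's init method now receives a DictInitData pointer instead of the bare option list. The following is a minimal standalone sketch, not code from the patch. Oid, List, InvalidOid and OidIsValid are reduced to stand-ins so that it compiles outside the PostgreSQL tree, and make_init_data() and init_data_is_shareable() are hypothetical helper names used only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the PostgreSQL definitions (assumptions for this sketch). */
typedef unsigned int Oid;
typedef struct List List;	/* opaque here; the real one is in nodes/pg_list.h */
#define InvalidOid ((Oid) 0)
#define OidIsValid(oid) ((oid) != InvalidOid)

/* Same members, in the same order, as the struct added to ts_public.h. */
typedef struct
{
	List	   *dictoptions;
	Oid			dictid;
} DictInitData;

/*
 * Hypothetical helper: fill the argument the way the patched
 * lookup_ts_dictionary_cache() does before calling the init method.
 */
DictInitData
make_init_data(List *dictoptions, Oid dictid)
{
	DictInitData init_data;

	init_data.dictoptions = dictoptions;
	init_data.dictid = dictid;
	return init_data;
}

/*
 * Hypothetical helper: an init method can only look the dictionary up by
 * OID when a valid dictid was passed; otherwise it has to build the
 * dictionary in backend-private memory.
 */
bool
init_data_is_shareable(const DictInitData *init_data)
{
	return OidIsValid(init_data->dictid);
}
```

An init method would read init_data->dictoptions exactly as it previously read the List argument, and consult dictid only when it wants to share the compiled dictionary.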
Attachment: 0003-Retreive-shared-location-for-dict-v9.patch (text/plain; charset=us-ascii)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index f18d2b3353..8520089f2e 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1425,6 +1425,42 @@ include_dir 'conf.d'
</listitem>
</varlistentry>
+ <varlistentry id="guc-max-shared-dictionaries-size" xreflabel="max_shared_dictionaries_size">
+ <term><varname>max_shared_dictionaries_size</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>max_shared_dictionaries_size</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Specifies the amount of shared memory to be used to store full-text
+ search dictionaries. A value of <literal>-1</literal> means no limit on
+ the size of loaded dictionaries. A value of <literal>0</literal>
+ disables the shared dictionaries feature. The default is 100 megabytes
+ (<literal>100MB</literal>).
+ </para>
+
+ <para>
+ Currently only <application>Ispell</application> dictionaries (see
+ <xref linkend="textsearch-ispell-dictionary"/>) may be loaded into
+ shared memory. The first backend requesting the dictionary will
+ build it and copy it into shared memory, so that other backends can
+ reuse it without having to build it again.
+ </para>
+
+ <para>
+ If the total size of simultaneously loaded dictionaries reaches the maximum
+ allowed size, additional dictionaries will be loaded into private backend
+ memory (effectively disabling the sharing).
+ </para>
+
+ <para>
+ This parameter can only be set in the <filename>postgresql.conf</filename>
+ file or on the server command line.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-huge-pages" xreflabel="huge_pages">
<term><varname>huge_pages</varname> (<type>enum</type>)
<indexterm>
diff --git a/src/backend/commands/tsearchcmds.c b/src/backend/commands/tsearchcmds.c
index 967fe5a6f4..742ff58c72 100644
--- a/src/backend/commands/tsearchcmds.c
+++ b/src/backend/commands/tsearchcmds.c
@@ -39,6 +39,7 @@
#include "nodes/makefuncs.h"
#include "parser/parse_func.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_shared.h"
#include "tsearch/ts_utils.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
@@ -518,6 +519,8 @@ RemoveTSDictionaryById(Oid dictId)
CatalogTupleDelete(relation, &tup->t_self);
+ ts_dict_shmem_release(dictId);
+
ReleaseSysCache(tup);
heap_close(relation, RowExclusiveLock);
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 0c86a581c0..c7dce8cac5 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -44,6 +44,7 @@
#include "storage/procsignal.h"
#include "storage/sinvaladt.h"
#include "storage/spin.h"
+#include "tsearch/ts_shared.h"
#include "utils/backend_random.h"
#include "utils/snapmgr.h"
@@ -150,6 +151,7 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
size = add_size(size, SyncScanShmemSize());
size = add_size(size, AsyncShmemSize());
size = add_size(size, BackendRandomShmemSize());
+ size = add_size(size, TsearchShmemSize());
#ifdef EXEC_BACKEND
size = add_size(size, ShmemBackendArraySize());
#endif
@@ -271,6 +273,11 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
AsyncShmemInit();
BackendRandomShmemInit();
+ /*
+ * Set up shared memory for tsearch
+ */
+ TsearchShmemInit();
+
#ifdef EXEC_BACKEND
/*
diff --git a/src/backend/tsearch/Makefile b/src/backend/tsearch/Makefile
index 227468ae9e..860cd196e9 100644
--- a/src/backend/tsearch/Makefile
+++ b/src/backend/tsearch/Makefile
@@ -26,7 +26,7 @@ DICTFILES_PATH=$(addprefix dicts/,$(DICTFILES))
OBJS = ts_locale.o ts_parse.o wparser.o wparser_def.o dict.o \
dict_simple.o dict_synonym.o dict_thesaurus.o \
dict_ispell.o regis.o spell.o \
- to_tsany.o ts_selfuncs.o ts_typanalyze.o ts_utils.o
+ to_tsany.o ts_selfuncs.o ts_shared.o ts_typanalyze.o ts_utils.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/tsearch/ts_shared.c b/src/backend/tsearch/ts_shared.c
new file mode 100644
index 0000000000..2a8b80bce8
--- /dev/null
+++ b/src/backend/tsearch/ts_shared.c
@@ -0,0 +1,379 @@
+/*-------------------------------------------------------------------------
+ *
+ * ts_shared.c
+ * tsearch shared memory management
+ *
+ * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/tsearch/ts_shared.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "lib/dshash.h"
+#include "storage/lwlock.h"
+#include "storage/shmem.h"
+#include "tsearch/ts_shared.h"
+#include "utils/hashutils.h"
+#include "utils/memutils.h"
+
+
+/*
+ * Hash table entries representing shared dictionaries.
+ */
+typedef struct
+{
+ Oid dict_id;
+ dsm_handle dict_dsm;
+ Size dict_size;
+
+ /* Number of backends that have the DSM segment mapped */
+ uint32 refcnt;
+} TsearchDictEntry;
+
+static dshash_table *dict_table = NULL;
+
+/*
+ * Information about the main shmem segment, used to coordinate
+ * access to the hash table and dictionaries.
+ */
+typedef struct
+{
+ dsa_handle area;
+ dshash_table_handle dict_table_handle;
+
+ /* Total size of dictionaries loaded into shared memory, in bytes */
+ Size loaded_size;
+
+ LWLock lock;
+} TsearchCtlData;
+
+static TsearchCtlData *tsearch_ctl;
+
+/*
+ * Maximum allowed amount of shared memory for shared dictionaries,
+ * in kilobytes. Default value is 100MB.
+ */
+int max_shared_dictionaries_size = 100 * 1024;
+
+static void init_dict_table(void);
+
+/* Parameters for dict_table */
+static const dshash_parameters dict_table_params = {
+ sizeof(Oid),
+ sizeof(TsearchDictEntry),
+ dshash_memcmp,
+ dshash_memhash,
+ LWTRANCHE_TSEARCH_TABLE
+};
+
+/*
+ * Build the dictionary using allocate_cb callback. If
+ * max_shared_dictionaries_size is TS_DICT_SHMEM_UNLIMITED, or if it is
+ * greater than 0 and there is enough space left in shared memory, copy the
+ * dictionary into DSM.
+ *
+ * If max_shared_dictionaries_size isn't 0 then try to find the dictionary in
+ * the shared hash table first. If another backend has already built it, just
+ * return its location in DSM.
+ *
+ * initoptions: an argument used within a template's init method.
+ * allocate_cb: function to build the dictionary, if it wasn't found in DSM.
+ *
+ * Returns address in the dynamic shared memory segment or in backend memory.
+ */
+void *
+ts_dict_shmem_location(DictInitData *initoptions,
+ ts_dict_build_callback allocate_cb)
+{
+ TsearchDictEntry *entry;
+ bool found;
+ dsm_segment *seg;
+ void *dict,
+ *dict_location;
+
+#define CHECK_SHARED_SPACE() \
+ if (max_shared_dictionaries_size != TS_DICT_SHMEM_UNLIMITED && \
+ entry->dict_size + tsearch_ctl->loaded_size > \
+ max_shared_dictionaries_size * 1024L) \
+ { \
+ LWLockRelease(&tsearch_ctl->lock); \
+ ereport(LOG, \
+ (errmsg("there is no space in shared memory for text search " \
+ "dictionary %u, it will be loaded into backend's memory", \
+ initoptions->dictid))); \
+ dshash_delete_entry(dict_table, entry); \
+ return dict; \
+ } \
+
+ init_dict_table();
+
+ /*
+ * Build the dictionary in backend's memory if a hash table wasn't created
+ * or dictid is invalid (it may happen if the dictionary's init method was
+ * called within verify_dictoptions()).
+ */
+ if (!DsaPointerIsValid(tsearch_ctl->dict_table_handle) ||
+ !OidIsValid(initoptions->dictid))
+ {
+ Size dict_size;
+
+ dict = allocate_cb(initoptions->dictoptions, &dict_size);
+
+ return dict;
+ }
+
+ /* Try to find an entry in the hash table */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, &initoptions->dictid,
+ false);
+
+ if (entry)
+ {
+ seg = dsm_find_mapping(entry->dict_dsm);
+ if (!seg)
+ {
+ seg = dsm_attach(entry->dict_dsm);
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+ }
+
+ entry->refcnt++;
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+ }
+
+ /* The dictionary hasn't been loaded into shared memory yet */
+ entry = (TsearchDictEntry *) dshash_find_or_insert(dict_table,
+ &initoptions->dictid,
+ &found);
+
+ if (found)
+ {
+ /*
+ * Someone concurrently inserted a dictionary entry since the first time
+ * we checked.
+ */
+ seg = dsm_attach(entry->dict_dsm);
+
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+
+ entry->refcnt++;
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+ }
+
+ /* Build the dictionary */
+ dict = allocate_cb(initoptions->dictoptions, &entry->dict_size);
+
+ LWLockAcquire(&tsearch_ctl->lock, LW_SHARED);
+
+ /* We can only get here when shared dictionaries are enabled */
+ Assert(max_shared_dictionaries_size > 0 ||
+ max_shared_dictionaries_size == TS_DICT_SHMEM_UNLIMITED);
+
+ /* Before allocating a DSM segment check remaining shared space */
+ CHECK_SHARED_SPACE();
+
+ LWLockRelease(&tsearch_ctl->lock);
+ /* If we come here, we need an exclusive lock */
+ while (!LWLockAcquireOrWait(&tsearch_ctl->lock, LW_EXCLUSIVE))
+ {
+ /*
+ * Check again in case there is no space left after waiting for the
+ * exclusive lock.
+ */
+ CHECK_SHARED_SPACE();
+ }
+
+ tsearch_ctl->loaded_size += entry->dict_size;
+
+ LWLockRelease(&tsearch_ctl->lock);
+
+ /* Finally, allocate a DSM segment for the compiled dictionary */
+ seg = dsm_create(entry->dict_size, 0);
+ dict_location = dsm_segment_address(seg);
+ memcpy(dict_location, dict, entry->dict_size);
+
+ pfree(dict);
+
+ entry->dict_id = initoptions->dictid;
+ entry->dict_dsm = dsm_segment_handle(seg);
+ entry->refcnt++;
+
+ /* Remain attached until end of postmaster */
+ dsm_pin_segment(seg);
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+}
+
+/*
+ * Release memory occupied by the dictionary: unpin the DSM mapping and, if no
+ * other backend still has the segment mapped, unpin the DSM segment as well.
+ *
+ * dictid: Oid of the dictionary.
+ */
+void
+ts_dict_shmem_release(Oid dictid)
+{
+ TsearchDictEntry *entry;
+
+ /*
+ * If we didn't attach to a hash table then do nothing.
+ */
+ if (!dict_table)
+ return;
+
+ /* Try to find an entry in the hash table */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, &dictid, true);
+
+ if (entry)
+ {
+ dsm_segment *seg;
+
+ seg = dsm_find_mapping(entry->dict_dsm);
+ /*
+ * If the current backend didn't pin a mapping, there is nothing to
+ * unpin.
+ */
+ if (!seg)
+ {
+ dshash_release_lock(dict_table, entry);
+ return;
+ }
+
+ dsm_unpin_mapping(seg);
+ dsm_detach(seg);
+
+ entry->refcnt--;
+
+ if (entry->refcnt == 0)
+ {
+ dsm_unpin_segment(entry->dict_dsm);
+
+ /* Decrease total loaded size */
+ LWLockAcquire(&tsearch_ctl->lock, LW_EXCLUSIVE);
+ tsearch_ctl->loaded_size -= entry->dict_size;
+ LWLockRelease(&tsearch_ctl->lock);
+
+ dshash_delete_entry(dict_table, entry);
+ }
+ else
+ dshash_release_lock(dict_table, entry);
+ }
+}
+
+/*
+ * Allocate and initialize tsearch-related shared memory.
+ */
+void
+TsearchShmemInit(void)
+{
+ bool found;
+
+ tsearch_ctl = (TsearchCtlData *)
+ ShmemInitStruct("Full Text Search Ctl", sizeof(TsearchCtlData), &found);
+
+ if (!found)
+ {
+ LWLockRegisterTranche(LWTRANCHE_TSEARCH_DSA, "tsearch_dsa");
+ LWLockRegisterTranche(LWTRANCHE_TSEARCH_TABLE, "tsearch_table");
+
+ LWLockInitialize(&tsearch_ctl->lock, LWTRANCHE_TSEARCH_DSA);
+
+ tsearch_ctl->area = DSM_HANDLE_INVALID;
+ tsearch_ctl->dict_table_handle = InvalidDsaPointer;
+ tsearch_ctl->loaded_size = 0;
+ }
+}
+
+/*
+ * Report shared memory space needed by TsearchShmemInit.
+ */
+Size
+TsearchShmemSize(void)
+{
+ Size size = 0;
+
+ /* size of service structure */
+ size = add_size(size, MAXALIGN(sizeof(TsearchCtlData)));
+
+ return size;
+}
+
+/*
+ * Initialize hash table located in DSM.
+ *
+ * The hash table should be created and initialized iff
+ * max_shared_dictionaries_size GUC isn't equal to zero and it doesn't exist
+ * yet.
+ */
+static void
+init_dict_table(void)
+{
+ MemoryContext old_context;
+ dsa_area *dsa;
+
+ /* Bail out if shared dictionaries not allowed */
+ if (max_shared_dictionaries_size == 0)
+ return;
+
+ if (dict_table)
+ return;
+
+ old_context = MemoryContextSwitchTo(TopMemoryContext);
+
+recheck_table:
+ LWLockAcquire(&tsearch_ctl->lock, LW_SHARED);
+
+ /* The hash table has already been created by another backend */
+ if (DsaPointerIsValid(tsearch_ctl->dict_table_handle))
+ {
+ Assert(tsearch_ctl->area != DSM_HANDLE_INVALID);
+
+ dsa = dsa_attach(tsearch_ctl->area);
+
+ dict_table = dshash_attach(dsa,
+ &dict_table_params,
+ tsearch_ctl->dict_table_handle,
+ NULL);
+ }
+ else
+ {
+ /* Try to get exclusive lock */
+ LWLockRelease(&tsearch_ctl->lock);
+ if (!LWLockAcquireOrWait(&tsearch_ctl->lock, LW_EXCLUSIVE))
+ {
+ /*
+ * We waited for the lock instead of acquiring it, so another
+ * backend may have created the hash table concurrently.
+ */
+ goto recheck_table;
+ }
+
+ dsa = dsa_create(LWTRANCHE_TSEARCH_DSA);
+ tsearch_ctl->area = dsa_get_handle(dsa);
+
+ dict_table = dshash_create(dsa, &dict_table_params, NULL);
+ tsearch_ctl->dict_table_handle = dshash_get_hash_table_handle(dict_table);
+
+ /* Remain attached until end of postmaster */
+ dsa_pin(dsa);
+ }
+
+ LWLockRelease(&tsearch_ctl->lock);
+
+ /* Remain attached until end of session */
+ dsa_pin_mapping(dsa);
+
+ MemoryContextSwitchTo(old_context);
+}
diff --git a/src/backend/utils/cache/ts_cache.c b/src/backend/utils/cache/ts_cache.c
index adb9c60b72..aed3395075 100644
--- a/src/backend/utils/cache/ts_cache.c
+++ b/src/backend/utils/cache/ts_cache.c
@@ -40,6 +40,7 @@
#include "commands/defrem.h"
#include "tsearch/ts_cache.h"
#include "tsearch/ts_public.h"
+#include "tsearch/ts_shared.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
#include "utils/fmgroids.h"
@@ -99,7 +100,16 @@ InvalidateTSCacheCallBack(Datum arg, int cacheid, uint32 hashvalue)
hash_seq_init(&status, hash);
while ((entry = (TSAnyCacheEntry *) hash_seq_search(&status)) != NULL)
+ {
+ if (entry->isvalid && hash == TSDictionaryCacheHash)
+ {
+ TSDictionaryCacheEntry *dict_entry = (TSDictionaryCacheEntry *) entry;
+
+ ts_dict_shmem_release(dict_entry->dictId);
+ }
+
entry->isvalid = false;
+ }
/* Also invalidate the current-config cache if it's pg_ts_config */
if (hash == TSConfigCacheHash)
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 7a7ac479c1..52007ace83 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -76,6 +76,7 @@
#include "storage/predicate.h"
#include "tcop/tcopprot.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_shared.h"
#include "utils/builtins.h"
#include "utils/bytea.h"
#include "utils/guc_tables.h"
@@ -2932,6 +2933,20 @@ static struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"max_shared_dictionaries_size", PGC_SIGHUP, RESOURCES_MEM,
+ gettext_noop("Sets the maximum size of all text search dictionaries loaded into shared memory."),
+ gettext_noop("Currently controls only loading of Ispell dictionaries. "
+ "If total size of simultaneously loaded dictionaries "
+ "reaches the maximum allowed size then a new dictionary "
+ "will be loaded into private backend memory."),
+ GUC_UNIT_KB,
+ },
+ &max_shared_dictionaries_size,
+ 100 * 1024, -1, MAX_KILOBYTES,
+ NULL, NULL, NULL
+ },
+
/* End-of-list marker */
{
{NULL, 0, 0, NULL, NULL}, NULL, 0, 0, 0, NULL, NULL, NULL
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 048bf4cccd..60a109246b 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -135,6 +135,7 @@
# mmap
# use none to disable dynamic shared memory
# (change requires restart)
+#max_shared_dictionaries_size = 100MB
# - Disk -
diff --git a/src/include/storage/lwlock.h b/src/include/storage/lwlock.h
index c21bfe2f66..16b0858eda 100644
--- a/src/include/storage/lwlock.h
+++ b/src/include/storage/lwlock.h
@@ -219,6 +219,8 @@ typedef enum BuiltinTrancheIds
LWTRANCHE_SHARED_TUPLESTORE,
LWTRANCHE_TBM,
LWTRANCHE_PARALLEL_APPEND,
+ LWTRANCHE_TSEARCH_DSA,
+ LWTRANCHE_TSEARCH_TABLE,
LWTRANCHE_FIRST_USER_DEFINED
} BuiltinTrancheIds;
diff --git a/src/include/tsearch/ts_shared.h b/src/include/tsearch/ts_shared.h
new file mode 100644
index 0000000000..5bac5f9eda
--- /dev/null
+++ b/src/include/tsearch/ts_shared.h
@@ -0,0 +1,37 @@
+/*-------------------------------------------------------------------------
+ *
+ * ts_shared.h
+ * tsearch shared memory management
+ *
+ * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
+ *
+ * src/include/tsearch/ts_shared.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef TS_SHARED_H
+#define TS_SHARED_H
+
+#include "tsearch/ts_public.h"
+
+/*
+ * Value of max_shared_dictionaries_size meaning that there is no limit on the
+ * shared memory used by shared dictionaries.
+ */
+#define TS_DICT_SHMEM_UNLIMITED (-1)
+
+/*
+ * GUC variable for the maximum total size of shared dictionaries, in kilobytes
+ */
+extern int max_shared_dictionaries_size;
+
+typedef void *(*ts_dict_build_callback) (List *dictoptions, Size *size);
+
+extern void *ts_dict_shmem_location(DictInitData *initoptions,
+ ts_dict_build_callback allocate_cb);
+extern void ts_dict_shmem_release(Oid dictid);
+
+extern void TsearchShmemInit(void);
+extern Size TsearchShmemSize(void);
+
+#endif /* TS_SHARED_H */
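The size check that ts_dict_shmem_location() performs before allocating a DSM segment can be sketched in isolation. This is a minimal standalone sketch under stated assumptions, not code from the patch: dict_fits_in_shared_memory() and SKETCH_SHMEM_UNLIMITED are hypothetical names, and the locking around loaded_size is left out entirely. It only mirrors the documented meaning of max_shared_dictionaries_size (a value in kilobytes, 0 disables sharing, -1 means no limit).

```c
#include <stdbool.h>
#include <stddef.h>

/* Same sentinel value as TS_DICT_SHMEM_UNLIMITED in the patch. */
#define SKETCH_SHMEM_UNLIMITED (-1)

/*
 * Hypothetical helper: decide whether a dictionary of dict_size bytes may be
 * copied into shared memory.
 *
 * max_size_kb mirrors max_shared_dictionaries_size (in kilobytes) and
 * loaded_size is the number of bytes already occupied by shared
 * dictionaries.
 */
bool
dict_fits_in_shared_memory(int max_size_kb, size_t loaded_size,
						   size_t dict_size)
{
	size_t		limit;

	if (max_size_kb == 0)
		return false;			/* shared dictionaries are disabled */
	if (max_size_kb == SKETCH_SHMEM_UNLIMITED)
		return true;			/* no limit configured */

	limit = (size_t) max_size_kb * 1024;

	/* Compare by subtraction so that the sum cannot overflow. */
	if (loaded_size > limit)
		return false;
	return dict_size <= limit - loaded_size;
}
```

When the function returns false, the behaviour described in the documentation applies: the dictionary stays in private backend memory.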
Attachment: 0004-Store-ispell-in-shared-location-v9.patch (text/plain; charset=us-ascii)
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index 610b7bf033..f3288fbb3f 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -3030,6 +3030,25 @@ CREATE TEXT SEARCH DICTIONARY english_stem (
</sect2>
+ <sect2 id="textsearch-shared-dictionaries">
+ <title>Dictionaries in Shared Memory</title>
+
+ <para>
+ Some dictionaries, especially <application>Ispell</application>, consume
+ a significant amount of memory, in some cases tens of megabytes. Most of
+ them store the data in text files, and building the in-memory structure is
+ both CPU-intensive and time-consuming. Instead of doing this in each backend
+ when it needs a dictionary for the first time, the compiled dictionary may be
+ stored in shared memory so that it may be reused by other backends.
+ </para>
+
+ <para>
+ To enable storing dictionaries in shared memory, set the <xref linkend="guc-max-shared-dictionaries-size"/>
+ parameter to a value greater than zero or to <literal>-1</literal>.
+ </para>
+
+ </sect2>
+
</sect1>
<sect1 id="textsearch-configuration">
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index 6d0dedbefb..6294a52af3 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -5,6 +5,15 @@
*
* Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
*
+ * By default all Ispell dictionaries are stored in DSM. But if the amount
+ * of memory exceeds max_shared_dictionaries_size, then the dictionary will be
+ * allocated in private backend memory (in dictCtx context).
+ *
+ * All necessary data are built within the dispell_build() function. But
+ * structures for regular expressions are compiled on first demand and
+ * stored in the AffixReg array, because regex_t and Regis cannot easily be
+ * stored in shared memory.
+ *
*
* IDENTIFICATION
* src/backend/tsearch/dict_ispell.c
@@ -14,8 +23,10 @@
#include "postgres.h"
#include "commands/defrem.h"
+#include "storage/dsm.h"
#include "tsearch/dicts/spell.h"
#include "tsearch/ts_locale.h"
+#include "tsearch/ts_shared.h"
#include "tsearch/ts_utils.h"
#include "utils/builtins.h"
@@ -26,54 +37,126 @@ typedef struct
IspellDict obj;
} DictISpell;
+static void parse_dictoptions(List *dictoptions,
+ char **dictfile, char **afffile, char **stopfile);
+static void *dispell_build(List *dictoptions, Size *size);
+
Datum
dispell_init(PG_FUNCTION_ARGS)
{
DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictISpell *d;
- bool affloaded = false,
- dictloaded = false,
- stoploaded = false;
- ListCell *l;
+ void *dict_location;
+ char *stopfile;
d = (DictISpell *) palloc0(sizeof(DictISpell));
- NIStartBuild(&(d->obj));
+ parse_dictoptions(init_data->dictoptions, NULL, NULL, &stopfile);
+
+ if (stopfile)
+ readstoplist(stopfile, &(d->stoplist), lowerstr);
+
+ dict_location = ts_dict_shmem_location(init_data, dispell_build);
+ Assert(dict_location);
+
+ d->obj.dict = (IspellDictData *) dict_location;
+ d->obj.reg = (AffixReg *) palloc0(d->obj.dict->nAffix *
+ sizeof(AffixReg));
+ /* Current memory context is dictionary's private memory context */
+ d->obj.dictCtx = CurrentMemoryContext;
+
+ PG_RETURN_POINTER(d);
+}
+
+Datum
+dispell_lexize(PG_FUNCTION_ARGS)
+{
+ DictISpell *d = (DictISpell *) PG_GETARG_POINTER(0);
+ char *in = (char *) PG_GETARG_POINTER(1);
+ int32 len = PG_GETARG_INT32(2);
+ char *txt;
+ TSLexeme *res;
+ TSLexeme *ptr,
+ *cptr;
+
+ if (len <= 0)
+ PG_RETURN_POINTER(NULL);
+
+ txt = lowerstr_with_len(in, len);
+ res = NINormalizeWord(&(d->obj), txt);
+
+ if (res == NULL)
+ PG_RETURN_POINTER(NULL);
+
+ cptr = res;
+ for (ptr = cptr; ptr->lexeme; ptr++)
+ {
+ if (searchstoplist(&(d->stoplist), ptr->lexeme))
+ {
+ pfree(ptr->lexeme);
+ ptr->lexeme = NULL;
+ }
+ else
+ {
+ if (cptr != ptr)
+ memcpy(cptr, ptr, sizeof(TSLexeme));
+ cptr++;
+ }
+ }
+ cptr->lexeme = NULL;
+
+ PG_RETURN_POINTER(res);
+}
+
+static void
+parse_dictoptions(List *dictoptions, char **dictfile, char **afffile,
+ char **stopfile)
+{
+ ListCell *l;
- foreach(l, init_data->dictoptions)
+ if (dictfile)
+ *dictfile = NULL;
+ if (afffile)
+ *afffile = NULL;
+ if (stopfile)
+ *stopfile = NULL;
+
+ foreach(l, dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
if (strcmp(defel->defname, "dictfile") == 0)
{
- if (dictloaded)
+ if (!dictfile)
+ continue;
+
+ if (*dictfile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple DictFile parameters")));
- NIImportDictionary(&(d->obj),
- get_tsearch_config_filename(defGetString(defel),
- "dict"));
- dictloaded = true;
+ *dictfile = get_tsearch_config_filename(defGetString(defel), "dict");
}
else if (strcmp(defel->defname, "afffile") == 0)
{
- if (affloaded)
+ if (!afffile)
+ continue;
+
+ if (*afffile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple AffFile parameters")));
- NIImportAffixes(&(d->obj),
- get_tsearch_config_filename(defGetString(defel),
- "affix"));
- affloaded = true;
+ *afffile = get_tsearch_config_filename(defGetString(defel), "affix");
}
else if (strcmp(defel->defname, "stopwords") == 0)
{
- if (stoploaded)
+ if (!stopfile)
+ continue;
+
+ if (*stopfile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple StopWords parameters")));
- readstoplist(defGetString(defel), &(d->stoplist), lowerstr);
- stoploaded = true;
+ *stopfile = defGetString(defel);
}
else
{
@@ -83,66 +166,52 @@ dispell_init(PG_FUNCTION_ARGS)
defel->defname)));
}
}
+}
- if (affloaded && dictloaded)
- {
- NISortDictionary(&(d->obj));
- NISortAffixes(&(d->obj));
- }
- else if (!affloaded)
+/*
+ * Build the dictionary.
+ *
+ * Result is palloc'ed.
+ */
+static void *
+dispell_build(List *dictoptions, Size *size)
+{
+ IspellDictBuild build;
+ char *dictfile,
+ *afffile;
+
+ parse_dictoptions(dictoptions, &dictfile, &afffile, NULL);
+
+ if (!afffile)
{
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("missing AffFile parameter")));
}
- else
+ else if (!dictfile)
{
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("missing DictFile parameter")));
}
- NIFinishBuild(&(d->obj));
+ MemSet(&build, 0, sizeof(build));
+ NIStartBuild(&build);
- PG_RETURN_POINTER(d);
-}
+ /* Read files */
+ NIImportDictionary(&build, dictfile);
+ NIImportAffixes(&build, afffile);
-Datum
-dispell_lexize(PG_FUNCTION_ARGS)
-{
- DictISpell *d = (DictISpell *) PG_GETARG_POINTER(0);
- char *in = (char *) PG_GETARG_POINTER(1);
- int32 len = PG_GETARG_INT32(2);
- char *txt;
- TSLexeme *res;
- TSLexeme *ptr,
- *cptr;
+ /* Build persistent data for use by backends */
+ NISortDictionary(&build);
+ NISortAffixes(&build);
- if (len <= 0)
- PG_RETURN_POINTER(NULL);
+ NICopyData(&build);
- txt = lowerstr_with_len(in, len);
- res = NINormalizeWord(&(d->obj), txt);
-
- if (res == NULL)
- PG_RETURN_POINTER(NULL);
+ /* Release temporary data */
+ NIFinishBuild(&build);
- cptr = res;
- for (ptr = cptr; ptr->lexeme; ptr++)
- {
- if (searchstoplist(&(d->stoplist), ptr->lexeme))
- {
- pfree(ptr->lexeme);
- ptr->lexeme = NULL;
- }
- else
- {
- if (cptr != ptr)
- memcpy(cptr, ptr, sizeof(TSLexeme));
- cptr++;
- }
- }
- cptr->lexeme = NULL;
-
- PG_RETURN_POINTER(res);
+ /* Return the buffer and its size */
+ *size = build.dict_size;
+ return build.dict;
}
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index e071994523..1c560ef56a 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -23,33 +23,35 @@
* Compilation of a dictionary
* ---------------------------
*
- * A compiled dictionary is stored in the IspellDict structure. Compilation of
- * a dictionary is divided into the several steps:
+ * A compiled dictionary is stored in the following structures:
+ * - IspellDictBuild - stores temporary data and IspellDictData
+ * - IspellDictData - stores permanent data used within NINormalizeWord()
+ * Compilation of the dictionary is divided into several steps:
* - NIImportDictionary() - stores each word of a .dict file in the
* temporary Spell field.
- * - NIImportAffixes() - stores affix rules of an .affix file in the
- * Affix field (not temporary) if an .affix file has the Ispell format.
+ * - NIImportAffixes() - stores affix rules of an .affix file in the temporary
+ * Affix field if an .affix file has the Ispell format.
* -> NIImportOOAffixes() - stores affix rules if an .affix file has the
* Hunspell format. The AffixData field is initialized if AF parameter
* is defined.
* - NISortDictionary() - builds a prefix tree (Trie) from the words list
- * and stores it in the Dictionary field. The words list is got from the
+ * and stores it in the DictNodes field. The word list is taken from the
* Spell field. The AffixData field is initialized if AF parameter is not
* defined.
* - NISortAffixes():
* - builds a list of compound affixes from the affix list and stores it
* in the CompoundAffix.
* - builds prefix trees (Trie) from the affix list for prefixes and suffixes
- * and stores them in Suffix and Prefix fields.
+ * and stores them in SuffixNodes and PrefixNodes fields.
* The affix list is got from the Affix field.
+ * Persistent data of the dictionary is copied within NICopyData().
*
* Memory management
* -----------------
*
- * The IspellDict structure has the Spell field which is used only in compile
- * time. The Spell field stores a words list. It can take a lot of memory.
- * Therefore when a dictionary is compiled this field is cleared by
- * NIFinishBuild().
+ * The IspellDictBuild structure holds temporary data which is used only at
+ * compile time. It can take a lot of memory. Therefore after compiling the
+ * dictionary this data is released by NIFinishBuild().
*
* All resources which should cleared by NIFinishBuild() is initialized using
* tmpalloc() and tmpalloc0().
@@ -73,112 +75,147 @@
* after the initialization is done. During initialization,
* CurrentMemoryContext is the long-lived memory context associated
* with the dictionary cache entry. We keep the short-lived stuff
- * in the Conf->buildCxt context.
+ * in the ConfBuild->buildCxt context.
*/
-#define tmpalloc(sz) MemoryContextAlloc(Conf->buildCxt, (sz))
-#define tmpalloc0(sz) MemoryContextAllocZero(Conf->buildCxt, (sz))
+#define tmpalloc(sz) MemoryContextAlloc(ConfBuild->buildCxt, (sz))
+#define tmpalloc0(sz) MemoryContextAllocZero(ConfBuild->buildCxt, (sz))
-#define tmpstrdup(str) MemoryContextStrdup(Conf->buildCxt, (str))
+#define tmpstrdup(str) MemoryContextStrdup(ConfBuild->buildCxt, (str))
/*
* Prepare for constructing an ISpell dictionary.
*
- * The IspellDict struct is assumed to be zeroed when allocated.
+ * The IspellDictBuild struct is assumed to be zeroed when allocated.
*/
void
-NIStartBuild(IspellDict *Conf)
+NIStartBuild(IspellDictBuild *ConfBuild)
{
+ uint32 dict_size;
+
/*
* The temp context is a child of CurTransactionContext, so that it will
* go away automatically on error.
*/
- Conf->buildCxt = AllocSetContextCreate(CurTransactionContext,
- "Ispell dictionary init context",
- ALLOCSET_DEFAULT_SIZES);
+ ConfBuild->buildCxt = AllocSetContextCreate(CurTransactionContext,
+ "Ispell dictionary init context",
+ ALLOCSET_DEFAULT_SIZES);
+
+ /*
+ * Allocate the buffer for the dictionary in the current context, not in buildCxt.
+ */
+ dict_size = MAXALIGN(IspellDictDataHdrSize);
+ ConfBuild->dict = palloc0(dict_size);
+ ConfBuild->dict_size = dict_size;
}
/*
- * Clean up when dictionary construction is complete.
+ * Copy compiled and persistent data into IspellDictData.
*/
void
-NIFinishBuild(IspellDict *Conf)
+NICopyData(IspellDictBuild *ConfBuild)
{
- /* Release no-longer-needed temp memory */
- MemoryContextDelete(Conf->buildCxt);
- /* Just for cleanliness, zero the now-dangling pointers */
- Conf->buildCxt = NULL;
- Conf->Spell = NULL;
- Conf->firstfree = NULL;
- Conf->CompoundAffixFlags = NULL;
-}
+ IspellDictData *dict;
+ uint32 size;
+ int i;
+ uint32 *offsets,
+ offset;
+ /*
+ * Calculate necessary space
+ */
+ size = ConfBuild->nAffixData * sizeof(uint32);
+ size += ConfBuild->AffixDataEnd;
-/*
- * "Compact" palloc: allocate without extra palloc overhead.
- *
- * Since we have no need to free the ispell data items individually, there's
- * not much value in the per-chunk overhead normally consumed by palloc.
- * Getting rid of it is helpful since ispell can allocate a lot of small nodes.
- *
- * We currently pre-zero all data allocated this way, even though some of it
- * doesn't need that. The cpalloc and cpalloc0 macros are just documentation
- * to indicate which allocations actually require zeroing.
- */
-#define COMPACT_ALLOC_CHUNK 8192 /* amount to get from palloc at once */
-#define COMPACT_MAX_REQ 1024 /* must be < COMPACT_ALLOC_CHUNK */
+ size += ConfBuild->nAffix * sizeof(uint32);
+ size += ConfBuild->AffixSize;
-static void *
-compact_palloc0(IspellDict *Conf, size_t size)
-{
- void *result;
+ size += ConfBuild->DictNodes.NodesEnd;
+ size += ConfBuild->PrefixNodes.NodesEnd;
+ size += ConfBuild->SuffixNodes.NodesEnd;
- /* Should only be called during init */
- Assert(Conf->buildCxt != NULL);
+ size += sizeof(CMPDAffix) * ConfBuild->nCompoundAffix;
- /* No point in this for large chunks */
- if (size > COMPACT_MAX_REQ)
- return palloc0(size);
+ /*
+ * Copy data itself
+ */
+ ConfBuild->dict_size = IspellDictDataHdrSize + size;
+ ConfBuild->dict = repalloc(ConfBuild->dict, ConfBuild->dict_size);
+
+ dict = ConfBuild->dict;
+
+ /* AffixData */
+ dict->nAffixData = ConfBuild->nAffixData;
+ dict->AffixDataStart = sizeof(uint32) * ConfBuild->nAffixData;
+ memcpy(DictAffixDataOffset(dict), ConfBuild->AffixDataOffset,
+ sizeof(uint32) * ConfBuild->nAffixData);
+ memcpy(DictAffixData(dict), ConfBuild->AffixData, ConfBuild->AffixDataEnd);
+
+ /* Affix array */
+ dict->nAffix = ConfBuild->nAffix;
+ dict->AffixOffsetStart = dict->AffixDataStart + ConfBuild->AffixDataEnd;
+ dict->AffixStart = dict->AffixOffsetStart + sizeof(uint32) * ConfBuild->nAffix;
+ offsets = (uint32 *) DictAffixOffset(dict);
+ offset = 0;
+ for (i = 0; i < ConfBuild->nAffix; i++)
+ {
+ AFFIX *affix;
+ uint32 size = AffixGetSize(ConfBuild->Affix[i]);
- /* Keep everything maxaligned */
- size = MAXALIGN(size);
+ offsets[i] = offset;
+ affix = (AFFIX *) DictAffixGet(dict, i);
+ Assert(affix);
- /* Need more space? */
- if (size > Conf->avail)
- {
- Conf->firstfree = palloc0(COMPACT_ALLOC_CHUNK);
- Conf->avail = COMPACT_ALLOC_CHUNK;
- }
+ memcpy(affix, ConfBuild->Affix[i], size);
- result = (void *) Conf->firstfree;
- Conf->firstfree += size;
- Conf->avail -= size;
+ offset += size;
+ }
- return result;
+ /* DictNodes prefix tree */
+ dict->DictNodesStart = dict->AffixStart + offset;
+ memcpy(DictDictNodes(dict), ConfBuild->DictNodes.Nodes,
+ ConfBuild->DictNodes.NodesEnd);
+
+ /* PrefixNodes prefix tree */
+ dict->PrefixNodesStart = dict->DictNodesStart + ConfBuild->DictNodes.NodesEnd;
+ memcpy(DictPrefixNodes(dict), ConfBuild->PrefixNodes.Nodes,
+ ConfBuild->PrefixNodes.NodesEnd);
+
+ /* SuffixNodes prefix tree */
+ dict->SuffixNodesStart = dict->PrefixNodesStart + ConfBuild->PrefixNodes.NodesEnd;
+ memcpy(DictSuffixNodes(dict), ConfBuild->SuffixNodes.Nodes,
+ ConfBuild->SuffixNodes.NodesEnd);
+
+ /* CompoundAffix array */
+ dict->CompoundAffixStart = dict->SuffixNodesStart +
+ ConfBuild->SuffixNodes.NodesEnd;
+ memcpy(DictCompoundAffix(dict), ConfBuild->CompoundAffix,
+ sizeof(CMPDAffix) * ConfBuild->nCompoundAffix);
}
-#define cpalloc(size) compact_palloc0(Conf, size)
-#define cpalloc0(size) compact_palloc0(Conf, size)
-
-static char *
-cpstrdup(IspellDict *Conf, const char *str)
+/*
+ * Clean up when dictionary construction is complete.
+ */
+void
+NIFinishBuild(IspellDictBuild *ConfBuild)
{
- char *res = cpalloc(strlen(str) + 1);
-
- strcpy(res, str);
- return res;
+ /* Release no-longer-needed temp memory */
+ MemoryContextDelete(ConfBuild->buildCxt);
+ /* Just for cleanliness, zero the now-dangling pointers */
+ ConfBuild->buildCxt = NULL;
+ ConfBuild->Spell = NULL;
+ ConfBuild->CompoundAffixFlags = NULL;
}
-
/*
* Apply lowerstr(), producing a temporary result (in the buildCxt).
*/
static char *
-lowerstr_ctx(IspellDict *Conf, const char *src)
+lowerstr_ctx(IspellDictBuild *ConfBuild, const char *src)
{
MemoryContext saveCtx;
char *dst;
- saveCtx = MemoryContextSwitchTo(Conf->buildCxt);
+ saveCtx = MemoryContextSwitchTo(ConfBuild->buildCxt);
dst = lowerstr(src);
MemoryContextSwitchTo(saveCtx);
@@ -190,7 +227,7 @@ lowerstr_ctx(IspellDict *Conf, const char *src)
#define STRNCMP(s,p) strncmp( (s), (p), strlen(p) )
#define GETWCHAR(W,L,N,T) ( ((const uint8*)(W))[ ((T)==FF_PREFIX) ? (N) : ( (L) - 1 - (N) ) ] )
-#define GETCHAR(A,N,T) GETWCHAR( (A)->repl, (A)->replen, N, T )
+#define GETCHAR(A,N,T) GETWCHAR( AffixFieldRepl(A), (A)->replen, N, T )
static char *VoidString = "";
@@ -311,18 +348,189 @@ strbncmp(const unsigned char *s1, const unsigned char *s2, size_t count)
static int
cmpaffix(const void *s1, const void *s2)
{
- const AFFIX *a1 = (const AFFIX *) s1;
- const AFFIX *a2 = (const AFFIX *) s2;
+ const AFFIX *a1 = *((AFFIX *const *) s1);
+ const AFFIX *a2 = *((AFFIX *const *) s2);
if (a1->type < a2->type)
return -1;
if (a1->type > a2->type)
return 1;
if (a1->type == FF_PREFIX)
- return strcmp(a1->repl, a2->repl);
+ return strcmp(AffixFieldRepl(a1), AffixFieldRepl(a2));
else
- return strbcmp((const unsigned char *) a1->repl,
- (const unsigned char *) a2->repl);
+ return strbcmp((const unsigned char *) AffixFieldRepl(a1),
+ (const unsigned char *) AffixFieldRepl(a2));
+}
+
+/*
+ * Allocate space for AffixData.
+ */
+static void
+InitAffixData(IspellDictBuild *ConfBuild, int numAffixData)
+{
+ uint32 size;
+
+ size = 8 * 1024; /* Reserve 8KB for data */
+
+ ConfBuild->AffixData = (char *) tmpalloc(size);
+ ConfBuild->AffixDataSize = size;
+ ConfBuild->AffixDataOffset = (uint32 *) tmpalloc(numAffixData * sizeof(uint32));
+ ConfBuild->nAffixData = 0;
+ ConfBuild->mAffixData = numAffixData;
+
+ /* Save offset of the end of data */
+ ConfBuild->AffixDataEnd = 0;
+}
+
+/*
+ * Add a set of affix flags to ConfBuild->AffixData, enlarging the buffer if
+ * the new set (AffixSetLen bytes plus the trailing \0) does not fit.
+ *
+ * ConfBuild: building structure for the current dictionary.
+ * AffixSet: set of affix flags.
+ */
+static void
+AddAffixSet(IspellDictBuild *ConfBuild, const char *AffixSet,
+ uint32 AffixSetLen)
+{
+ /*
+ * Check available space for AffixSet.
+ */
+ if (ConfBuild->AffixDataEnd + AffixSetLen + 1 /* \0 */ >=
+ ConfBuild->AffixDataSize)
+ {
+ uint32 newsize = Max(ConfBuild->AffixDataSize + 8 * 1024 /* 8KB */,
+ ConfBuild->AffixDataSize + AffixSetLen + 1);
+
+ ConfBuild->AffixData = (char *) repalloc(ConfBuild->AffixData, newsize);
+ ConfBuild->AffixDataSize = newsize;
+ }
+
+ /* Check available number of offsets */
+ if (ConfBuild->nAffixData >= ConfBuild->mAffixData)
+ {
+ ConfBuild->mAffixData *= 2;
+ ConfBuild->AffixDataOffset = (uint32 *) repalloc(ConfBuild->AffixDataOffset,
+ sizeof(uint32) * ConfBuild->mAffixData);
+ }
+
+ ConfBuild->AffixDataOffset[ConfBuild->nAffixData] = ConfBuild->AffixDataEnd;
+ StrNCpy(AffixDataGet(ConfBuild, ConfBuild->nAffixData),
+ AffixSet, AffixSetLen + 1);
+
+ /* Save offset of the end of data */
+ ConfBuild->AffixDataEnd += AffixSetLen + 1;
+ ConfBuild->nAffixData++;
+}
+
+/*
+ * Allocate space for prefix tree node.
+ *
+ * ConfBuild: building structure for the current dictionary.
+ * array: NodeArray where to allocate new node.
+ * length: number of allocated NodeData.
+ * sizeNodeData: minimum size of each NodeData.
+ * sizeNodeHeader: size of header of new node.
+ *
+ * Returns an offset of new node in NodeArray->Nodes.
+ */
+static uint32
+AllocateNode(IspellDictBuild *ConfBuild, NodeArray *array, uint32 length,
+ uint32 sizeNodeData, uint32 sizeNodeHeader)
+{
+ uint32 node_offset;
+ uint32 size;
+
+ size = sizeNodeHeader + length * sizeNodeData;
+ size = MAXALIGN(size);
+
+ if (array->NodesSize == 0)
+ {
+ array->NodesSize = size * 32; /* Reserve space for next levels of the
+ * prefix tree */
+ array->Nodes = (char *) tmpalloc(array->NodesSize);
+ array->NodesEnd = 0;
+ }
+ else if (array->NodesEnd + size >= array->NodesSize)
+ {
+ array->NodesSize = Max(array->NodesSize * 2, array->NodesSize + size);
+ array->Nodes = (char *) repalloc(array->Nodes, array->NodesSize);
+ }
+
+ node_offset = array->NodesEnd;
+ array->NodesEnd += size;
+
+ return node_offset;
+}
+
+/*
+ * Allocate space for SPNode.
+ *
+ * Returns an offset of new node in ConfBuild->DictNodes.Nodes.
+ */
+static uint32
+AllocateSPNode(IspellDictBuild *ConfBuild, uint32 length)
+{
+ uint32 offset;
+ SPNode *node;
+ SPNodeData *data;
+ uint32 i;
+
+ offset = AllocateNode(ConfBuild, &ConfBuild->DictNodes, length,
+ sizeof(SPNodeData), SPNHDRSZ);
+ node = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, offset);
+ node->length = length;
+
+ /*
+ * Initialize all SPNodeData with default values. We cannot use memset()
+ * here because not all fields have 0 as default value.
+ */
+ for (i = 0; i < length; i++)
+ {
+ data = &(node->data[i]);
+ data->val = 0;
+ data->affix = ISPELL_INVALID_INDEX;
+ data->compoundflag = 0;
+ data->isword = 0;
+ data->node_offset = ISPELL_INVALID_OFFSET;
+ }
+
+ return offset;
+}
+
+/*
+ * Allocate space for AffixNode.
+ *
+ * Returns an offset of new node in NodeArray->Nodes.
+ */
+static uint32
+AllocateAffixNode(IspellDictBuild *ConfBuild, NodeArray *array, uint32 length)
+{
+ uint32 offset;
+ AffixNode *node;
+ AffixNodeData *data;
+ uint32 i;
+
+ offset = AllocateNode(ConfBuild, array, length, sizeof(AffixNodeData),
+ ANHRDSZ);
+ node = (AffixNode *) NodeArrayGet(array, offset);
+ node->length = length;
+ node->isvoid = 0;
+
+ /*
+ * Initialize all AffixNodeData with default values. We cannot use memset()
+ * here because not all fields have 0 as default value.
+ */
+ for (i = 0; i < length; i++)
+ {
+ data = &(node->data[i]);
+ data->val = 0;
+ data->affstart = ISPELL_INVALID_INDEX;
+ data->affend = ISPELL_INVALID_INDEX;
+ data->node_offset = ISPELL_INVALID_OFFSET;
+ }
+
+ return offset;
}
/*
@@ -333,7 +541,7 @@ cmpaffix(const void *s1, const void *s2)
* - 2 characters (FM_LONG). A character may be Unicode.
* - numbers from 1 to 65000 (FM_NUM).
*
- * Depending on the flagMode an affix string can have the following format:
+ * Depending on the flagmode an affix string can have the following format:
* - FM_CHAR: ABCD
* Here we have 4 flags: A, B, C and D
* - FM_LONG: ABCDE*
@@ -341,13 +549,13 @@ cmpaffix(const void *s1, const void *s2)
* - FM_NUM: 200,205,50
* Here we have 3 flags: 200, 205 and 50
*
- * Conf: current dictionary.
+ * flagmode: flag mode of the dictionary.
* sflagset: the set of affix flags. Returns a reference to the start of a next
* affix flag.
* sflag: returns an affix flag from sflagset.
*/
static void
-getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
+getNextFlagFromString(FlagMode flagmode, char **sflagset, char *sflag)
{
int32 s;
char *next,
@@ -356,11 +564,11 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
bool stop = false;
bool met_comma = false;
- maxstep = (Conf->flagMode == FM_LONG) ? 2 : 1;
+ maxstep = (flagmode == FM_LONG) ? 2 : 1;
while (**sflagset)
{
- switch (Conf->flagMode)
+ switch (flagmode)
{
case FM_LONG:
case FM_CHAR:
@@ -422,15 +630,15 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
stop = true;
break;
default:
- elog(ERROR, "unrecognized type of Conf->flagMode: %d",
- Conf->flagMode);
+ elog(ERROR, "unrecognized type of flagmode: %d",
+ flagmode);
}
if (stop)
break;
}
- if (Conf->flagMode == FM_LONG && maxstep > 0)
+ if (flagmode == FM_LONG && maxstep > 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix flag \"%s\" with \"long\" flag value",
@@ -440,31 +648,28 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
}
/*
- * Checks if the affix set Conf->AffixData[affix] contains affixflag.
- * Conf->AffixData[affix] does not contain affixflag if this flag is not used
- * actually by the .dict file.
+ * Checks whether the given affix set from AffixData contains affixflag. The
+ * set does not contain affixflag if the flag is not used by the .dict file.
*
- * Conf: current dictionary.
- * affix: index of the Conf->AffixData array.
+ * flagmode: flag mode of the dictionary.
+ * sflagset: the set of affix flags.
* affixflag: the affix flag.
*
- * Returns true if the string Conf->AffixData[affix] contains affixflag,
- * otherwise returns false.
+ * Returns true if the affix set string contains affixflag, otherwise returns
+ * false.
*/
static bool
-IsAffixFlagInUse(IspellDict *Conf, int affix, const char *affixflag)
+IsAffixFlagInUse(FlagMode flagmode, char *sflagset, const char *affixflag)
{
- char *flagcur;
+ char *flagcur = sflagset;
char flag[BUFSIZ];
if (*affixflag == 0)
return true;
- flagcur = Conf->AffixData[affix];
-
while (*flagcur)
{
- getNextFlagFromString(Conf, &flagcur, flag);
+ getNextFlagFromString(flagmode, &flagcur, flag);
/* Compare first affix flag in flagcur with affixflag */
if (strcmp(flag, affixflag) == 0)
return true;
@@ -477,31 +682,33 @@ IsAffixFlagInUse(IspellDict *Conf, int affix, const char *affixflag)
/*
* Adds the new word into the temporary array Spell.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* word: new word.
* flag: set of affix flags. Single flag can be get by getNextFlagFromString().
*/
static void
-NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
+NIAddSpell(IspellDictBuild *ConfBuild, const char *word, const char *flag)
{
- if (Conf->nspell >= Conf->mspell)
+ if (ConfBuild->nSpell >= ConfBuild->mSpell)
{
- if (Conf->mspell)
+ if (ConfBuild->mSpell)
{
- Conf->mspell *= 2;
- Conf->Spell = (SPELL **) repalloc(Conf->Spell, Conf->mspell * sizeof(SPELL *));
+ ConfBuild->mSpell *= 2;
+ ConfBuild->Spell = (SPELL **) repalloc(ConfBuild->Spell,
+ ConfBuild->mSpell * sizeof(SPELL *));
}
else
{
- Conf->mspell = 1024 * 20;
- Conf->Spell = (SPELL **) tmpalloc(Conf->mspell * sizeof(SPELL *));
+ ConfBuild->mSpell = 1024 * 20;
+ ConfBuild->Spell = (SPELL **) tmpalloc(ConfBuild->mSpell * sizeof(SPELL *));
}
}
- Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
- strcpy(Conf->Spell[Conf->nspell]->word, word);
- Conf->Spell[Conf->nspell]->p.flag = (*flag != '\0')
+ ConfBuild->Spell[ConfBuild->nSpell] =
+ (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
+ strcpy(ConfBuild->Spell[ConfBuild->nSpell]->word, word);
+ ConfBuild->Spell[ConfBuild->nSpell]->p.flag = (*flag != '\0')
? tmpstrdup(flag) : VoidString;
- Conf->nspell++;
+ ConfBuild->nSpell++;
}
/*
@@ -509,11 +716,11 @@ NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
*
* Note caller must already have applied get_tsearch_config_filename.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* filename: path to the .dict file.
*/
void
-NIImportDictionary(IspellDict *Conf, const char *filename)
+NIImportDictionary(IspellDictBuild *ConfBuild, const char *filename)
{
tsearch_readline_state trst;
char *line;
@@ -564,9 +771,9 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
}
s += pg_mblen(s);
}
- pstr = lowerstr_ctx(Conf, line);
+ pstr = lowerstr_ctx(ConfBuild, line);
- NIAddSpell(Conf, pstr, flag);
+ NIAddSpell(ConfBuild, pstr, flag);
pfree(pstr);
pfree(line);
@@ -590,7 +797,7 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
* SFX M 0 's .
* is presented here.
*
- * Conf: current dictionary.
+ * dict: current dictionary.
* word: basic form of word.
* affixflag: affix flag, by which a basic form of word was generated.
* flag: compound flag used to compare with StopMiddle->compoundflag.
@@ -598,9 +805,9 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
* Returns 1 if the word was found in the prefix tree, else returns 0.
*/
static int
-FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
+FindWord(IspellDictData *dict, const char *word, const char *affixflag, int flag)
{
- SPNode *node = Conf->Dictionary;
+ SPNode *node = (SPNode *) DictDictNodes(dict);
SPNodeData *StopLow,
*StopHigh,
*StopMiddle;
@@ -636,10 +843,14 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
* Check if this affix rule is presented in the affix set
* with index StopMiddle->affix.
*/
- if (IsAffixFlagInUse(Conf, StopMiddle->affix, affixflag))
+ if (IsAffixFlagInUse(dict->flagMode,
+ DictAffixDataGet(dict, StopMiddle->affix),
+ affixflag))
return 1;
}
- node = StopMiddle->node;
+ /* Retrieve SPNode by the offset */
+ node = (SPNode *) DictNodeGet(DictDictNodes(dict),
+ StopMiddle->node_offset);
ptr++;
break;
}
@@ -657,7 +868,8 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
/*
* Adds a new affix rule to the Affix field.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary; it is used to
+ * allocate temporary data.
* flag: affix flag ('\' in the below example).
* flagflags: set of flags from the flagval field for this affix rule. This set
* is listed after '/' character in the added string (repl).
@@ -673,26 +885,54 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
* type: FF_SUFFIX or FF_PREFIX.
*/
static void
-NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
- const char *find, const char *repl, int type)
+NIAddAffix(IspellDictBuild *ConfBuild, const char *flag, char flagflags,
+ const char *mask, const char *find, const char *repl, int type)
{
AFFIX *Affix;
+ uint32 size;
+ uint32 flaglen = strlen(flag),
+ findlen = strlen(find),
+ repllen = strlen(repl),
+ masklen = strlen(mask);
+
+ /* Sanity checks */
+ if (flaglen > AF_FLAG_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix flag \"%s\" too long", flag)));
+ if (findlen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix find field \"%s\" too long", find)));
+ if (repllen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix repl field \"%s\" too long", repl)));
+ if (masklen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix mask field \"%s\" too long", mask)));
- if (Conf->naffixes >= Conf->maffixes)
+ if (ConfBuild->nAffix >= ConfBuild->mAffix)
{
- if (Conf->maffixes)
+ if (ConfBuild->mAffix)
{
- Conf->maffixes *= 2;
- Conf->Affix = (AFFIX *) repalloc((void *) Conf->Affix, Conf->maffixes * sizeof(AFFIX));
+ ConfBuild->mAffix *= 2;
+ ConfBuild->Affix = (AFFIX **) repalloc(ConfBuild->Affix,
+ ConfBuild->mAffix * sizeof(AFFIX *));
}
else
{
- Conf->maffixes = 16;
- Conf->Affix = (AFFIX *) palloc(Conf->maffixes * sizeof(AFFIX));
+ ConfBuild->mAffix = 255;
+ ConfBuild->Affix = (AFFIX **) tmpalloc(ConfBuild->mAffix * sizeof(AFFIX *));
}
}
- Affix = Conf->Affix + Conf->naffixes;
+ size = AFFIXHDRSZ + flaglen + 1 /* \0 */ + findlen + 1 /* \0 */ +
+ repllen + 1 /* \0 */ + masklen + 1 /* \0 */;
+
+ Affix = (AFFIX *) tmpalloc(size);
+ ConfBuild->Affix[ConfBuild->nAffix] = Affix;
/* This affix rule can be applied for words with any ending */
if (strcmp(mask, ".") == 0 || *mask == '\0')
@@ -705,42 +945,12 @@ NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
{
Affix->issimple = 0;
Affix->isregis = 1;
- RS_compile(&(Affix->reg.regis), (type == FF_SUFFIX),
- *mask ? mask : VoidString);
}
/* This affix rule will use regex_t to search word ending */
else
{
- int masklen;
- int wmasklen;
- int err;
- pg_wchar *wmask;
- char *tmask;
-
Affix->issimple = 0;
Affix->isregis = 0;
- tmask = (char *) tmpalloc(strlen(mask) + 3);
- if (type == FF_SUFFIX)
- sprintf(tmask, "%s$", mask);
- else
- sprintf(tmask, "^%s", mask);
-
- masklen = strlen(tmask);
- wmask = (pg_wchar *) tmpalloc((masklen + 1) * sizeof(pg_wchar));
- wmasklen = pg_mb2wchar_with_len(tmask, wmask, masklen);
-
- err = pg_regcomp(&(Affix->reg.regex), wmask, wmasklen,
- REG_ADVANCED | REG_NOSUB,
- DEFAULT_COLLATION_OID);
- if (err)
- {
- char errstr[100];
-
- pg_regerror(err, &(Affix->reg.regex), errstr, sizeof(errstr));
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
- errmsg("invalid regular expression: %s", errstr)));
- }
}
Affix->flagflags = flagflags;
@@ -749,15 +959,22 @@ NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
if ((Affix->flagflags & FF_COMPOUNDFLAG) == 0)
Affix->flagflags |= FF_COMPOUNDFLAG;
}
- Affix->flag = cpstrdup(Conf, flag);
+
Affix->type = type;
- Affix->find = (find && *find) ? cpstrdup(Conf, find) : VoidString;
- if ((Affix->replen = strlen(repl)) > 0)
- Affix->repl = cpstrdup(Conf, repl);
- else
- Affix->repl = VoidString;
- Conf->naffixes++;
+ Affix->replen = repllen;
+ StrNCpy(AffixFieldRepl(Affix), repl, repllen + 1);
+
+ Affix->findlen = findlen;
+ StrNCpy(AffixFieldFind(Affix), find, findlen + 1);
+
+ Affix->masklen = masklen;
+ StrNCpy(AffixFieldMask(Affix), mask, masklen + 1);
+
+ StrNCpy(AffixFieldFlag(Affix), flag, flaglen + 1);
+
+ ConfBuild->nAffix++;
+ ConfBuild->AffixSize += size;
}
/* Parsing states for parse_affentry() and friends */
@@ -1021,10 +1238,10 @@ parse_affentry(char *str, char *mask, char *find, char *repl)
* Sets a Hunspell options depending on flag type.
*/
static void
-setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
+setCompoundAffixFlagValue(IspellDictBuild *ConfBuild, CompoundAffixFlag *entry,
char *s, uint32 val)
{
- if (Conf->flagMode == FM_NUM)
+ if (ConfBuild->dict->flagMode == FM_NUM)
{
char *next;
int i;
@@ -1044,19 +1261,19 @@ setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
else
entry->flag.s = tmpstrdup(s);
- entry->flagMode = Conf->flagMode;
+ entry->flagMode = ConfBuild->dict->flagMode;
entry->value = val;
}
/*
* Sets up a correspondence for the affix parameter with the affix flag.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* s: affix flag in string.
* val: affix parameter.
*/
static void
-addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
+addCompoundAffixFlagValue(IspellDictBuild *ConfBuild, char *s, uint32 val)
{
CompoundAffixFlag *newValue;
char sbuf[BUFSIZ];
@@ -1083,29 +1300,29 @@ addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
*sflag = '\0';
/* Resize array or allocate memory for array CompoundAffixFlag */
- if (Conf->nCompoundAffixFlag >= Conf->mCompoundAffixFlag)
+ if (ConfBuild->nCompoundAffixFlag >= ConfBuild->mCompoundAffixFlag)
{
- if (Conf->mCompoundAffixFlag)
+ if (ConfBuild->mCompoundAffixFlag)
{
- Conf->mCompoundAffixFlag *= 2;
- Conf->CompoundAffixFlags = (CompoundAffixFlag *)
- repalloc((void *) Conf->CompoundAffixFlags,
- Conf->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
+ ConfBuild->mCompoundAffixFlag *= 2;
+ ConfBuild->CompoundAffixFlags = (CompoundAffixFlag *)
+ repalloc((void *) ConfBuild->CompoundAffixFlags,
+ ConfBuild->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
}
else
{
- Conf->mCompoundAffixFlag = 10;
- Conf->CompoundAffixFlags = (CompoundAffixFlag *)
- tmpalloc(Conf->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
+ ConfBuild->mCompoundAffixFlag = 10;
+ ConfBuild->CompoundAffixFlags = (CompoundAffixFlag *)
+ tmpalloc(ConfBuild->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
}
}
- newValue = Conf->CompoundAffixFlags + Conf->nCompoundAffixFlag;
+ newValue = ConfBuild->CompoundAffixFlags + ConfBuild->nCompoundAffixFlag;
- setCompoundAffixFlagValue(Conf, newValue, sbuf, val);
+ setCompoundAffixFlagValue(ConfBuild, newValue, sbuf, val);
- Conf->usecompound = true;
- Conf->nCompoundAffixFlag++;
+ ConfBuild->dict->usecompound = true;
+ ConfBuild->nCompoundAffixFlag++;
}
/*
@@ -1113,7 +1330,7 @@ addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
* flags s.
*/
static int
-getCompoundAffixFlagValue(IspellDict *Conf, char *s)
+getCompoundAffixFlagValue(IspellDictBuild *ConfBuild, char *s)
{
uint32 flag = 0;
CompoundAffixFlag *found,
@@ -1121,18 +1338,18 @@ getCompoundAffixFlagValue(IspellDict *Conf, char *s)
char sflag[BUFSIZ];
char *flagcur;
- if (Conf->nCompoundAffixFlag == 0)
+ if (ConfBuild->nCompoundAffixFlag == 0)
return 0;
flagcur = s;
while (*flagcur)
{
- getNextFlagFromString(Conf, &flagcur, sflag);
- setCompoundAffixFlagValue(Conf, &key, sflag, 0);
+ getNextFlagFromString(ConfBuild->dict->flagMode, &flagcur, sflag);
+ setCompoundAffixFlagValue(ConfBuild, &key, sflag, 0);
found = (CompoundAffixFlag *)
- bsearch(&key, (void *) Conf->CompoundAffixFlags,
- Conf->nCompoundAffixFlag, sizeof(CompoundAffixFlag),
+ bsearch(&key, (void *) ConfBuild->CompoundAffixFlags,
+ ConfBuild->nCompoundAffixFlag, sizeof(CompoundAffixFlag),
cmpcmdflag);
if (found != NULL)
flag |= found->value;
@@ -1144,14 +1361,13 @@ getCompoundAffixFlagValue(IspellDict *Conf, char *s)
/*
* Returns a flag set using the s parameter.
*
- * If Conf->useFlagAliases is true then the s parameter is index of the
- * Conf->AffixData array and function returns its entry.
- * Else function returns the s parameter.
+ * If useFlagAliases is true, s is an index into the AffixData array and the
+ * function returns that entry. Otherwise the function returns s itself.
*/
static char *
-getAffixFlagSet(IspellDict *Conf, char *s)
+getAffixFlagSet(IspellDictBuild *ConfBuild, char *s)
{
- if (Conf->useFlagAliases && *s != '\0')
+ if (ConfBuild->dict->useFlagAliases && *s != '\0')
{
int curaffix;
char *end;
@@ -1162,13 +1378,13 @@ getAffixFlagSet(IspellDict *Conf, char *s)
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix alias \"%s\"", s)));
- if (curaffix > 0 && curaffix <= Conf->nAffixData)
+ if (curaffix > 0 && curaffix <= ConfBuild->nAffixData)
/*
* Do not subtract 1 from curaffix because empty string was added
* in NIImportOOAffixes
*/
- return Conf->AffixData[curaffix];
+ return AffixDataGet(ConfBuild, curaffix);
else
return VoidString;
}
@@ -1179,11 +1395,11 @@ getAffixFlagSet(IspellDict *Conf, char *s)
/*
* Import an affix file that follows MySpell or Hunspell format.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* filename: path to the .affix file.
*/
static void
-NIImportOOAffixes(IspellDict *Conf, const char *filename)
+NIImportOOAffixes(IspellDictBuild *ConfBuild, const char *filename)
{
char type[BUFSIZ],
*ptype = NULL;
@@ -1195,17 +1411,16 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
char repl[BUFSIZ],
*prepl;
bool isSuffix = false;
- int naffix = 0,
- curaffix = 0;
+ int naffix = 0;
int sflaglen = 0;
char flagflags = 0;
tsearch_readline_state trst;
char *recoded;
/* read file to find any flag */
- Conf->usecompound = false;
- Conf->useFlagAliases = false;
- Conf->flagMode = FM_CHAR;
+ ConfBuild->dict->usecompound = false;
+ ConfBuild->dict->useFlagAliases = false;
+ ConfBuild->dict->flagMode = FM_CHAR;
if (!tsearch_readline_begin(&trst, filename))
ereport(ERROR,
@@ -1222,30 +1437,36 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
}
if (STRNCMP(recoded, "COMPOUNDFLAG") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDFLAG"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDFLAG"),
FF_COMPOUNDFLAG);
else if (STRNCMP(recoded, "COMPOUNDBEGIN") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDBEGIN"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDBEGIN"),
FF_COMPOUNDBEGIN);
else if (STRNCMP(recoded, "COMPOUNDLAST") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDLAST"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDLAST"),
FF_COMPOUNDLAST);
/* COMPOUNDLAST and COMPOUNDEND are synonyms */
else if (STRNCMP(recoded, "COMPOUNDEND") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDEND"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDEND"),
FF_COMPOUNDLAST);
else if (STRNCMP(recoded, "COMPOUNDMIDDLE") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDMIDDLE"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDMIDDLE"),
FF_COMPOUNDMIDDLE);
else if (STRNCMP(recoded, "ONLYINCOMPOUND") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("ONLYINCOMPOUND"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("ONLYINCOMPOUND"),
FF_COMPOUNDONLY);
else if (STRNCMP(recoded, "COMPOUNDPERMITFLAG") == 0)
- addCompoundAffixFlagValue(Conf,
+ addCompoundAffixFlagValue(ConfBuild,
recoded + strlen("COMPOUNDPERMITFLAG"),
FF_COMPOUNDPERMITFLAG);
else if (STRNCMP(recoded, "COMPOUNDFORBIDFLAG") == 0)
- addCompoundAffixFlagValue(Conf,
+ addCompoundAffixFlagValue(ConfBuild,
recoded + strlen("COMPOUNDFORBIDFLAG"),
FF_COMPOUNDFORBIDFLAG);
else if (STRNCMP(recoded, "FLAG") == 0)
@@ -1258,9 +1479,9 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
if (*s)
{
if (STRNCMP(s, "long") == 0)
- Conf->flagMode = FM_LONG;
+ ConfBuild->dict->flagMode = FM_LONG;
else if (STRNCMP(s, "num") == 0)
- Conf->flagMode = FM_NUM;
+ ConfBuild->dict->flagMode = FM_NUM;
else if (STRNCMP(s, "default") != 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
@@ -1274,8 +1495,8 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
}
tsearch_readline_end(&trst);
- if (Conf->nCompoundAffixFlag > 1)
- qsort((void *) Conf->CompoundAffixFlags, Conf->nCompoundAffixFlag,
+ if (ConfBuild->nCompoundAffixFlag > 1)
+ qsort((void *) ConfBuild->CompoundAffixFlags, ConfBuild->nCompoundAffixFlag,
sizeof(CompoundAffixFlag), cmpcmdflag);
if (!tsearch_readline_begin(&trst, filename))
@@ -1295,15 +1516,15 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
if (ptype)
pfree(ptype);
- ptype = lowerstr_ctx(Conf, type);
+ ptype = lowerstr_ctx(ConfBuild, type);
/* First try to parse AF parameter (alias compression) */
if (STRNCMP(ptype, "af") == 0)
{
/* First line is the number of aliases */
- if (!Conf->useFlagAliases)
+ if (!ConfBuild->dict->useFlagAliases)
{
- Conf->useFlagAliases = true;
+ ConfBuild->dict->useFlagAliases = true;
naffix = atoi(sflag);
if (naffix == 0)
ereport(ERROR,
@@ -1313,21 +1534,15 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
/* Also reserve place for empty flag set */
naffix++;
- Conf->AffixData = (char **) palloc0(naffix * sizeof(char *));
- Conf->lenAffixData = Conf->nAffixData = naffix;
+ InitAffixData(ConfBuild, naffix);
/* Add empty flag set into AffixData */
- Conf->AffixData[curaffix] = VoidString;
- curaffix++;
+ AddAffixSet(ConfBuild, VoidString, 0);
}
/* Other lines is aliases */
else
{
- if (curaffix < naffix)
- {
- Conf->AffixData[curaffix] = cpstrdup(Conf, sflag);
- curaffix++;
- }
+ AddAffixSet(ConfBuild, sflag, strlen(sflag));
}
goto nextline;
}
@@ -1338,8 +1553,8 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
sflaglen = strlen(sflag);
if (sflaglen == 0
- || (sflaglen > 1 && Conf->flagMode == FM_CHAR)
- || (sflaglen > 2 && Conf->flagMode == FM_LONG))
+ || (sflaglen > 1 && ConfBuild->dict->flagMode == FM_CHAR)
+ || (sflaglen > 2 && ConfBuild->dict->flagMode == FM_LONG))
goto nextline;
/*--------
@@ -1367,21 +1582,21 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
/* Get flags after '/' (flags are case sensitive) */
if ((ptr = strchr(repl, '/')) != NULL)
- aflg |= getCompoundAffixFlagValue(Conf,
- getAffixFlagSet(Conf,
+ aflg |= getCompoundAffixFlagValue(ConfBuild,
+ getAffixFlagSet(ConfBuild,
ptr + 1));
/* Get lowercased version of string before '/' */
- prepl = lowerstr_ctx(Conf, repl);
+ prepl = lowerstr_ctx(ConfBuild, repl);
if ((ptr = strchr(prepl, '/')) != NULL)
*ptr = '\0';
- pfind = lowerstr_ctx(Conf, find);
- pmask = lowerstr_ctx(Conf, mask);
+ pfind = lowerstr_ctx(ConfBuild, find);
+ pmask = lowerstr_ctx(ConfBuild, mask);
if (t_iseq(find, '0'))
*pfind = '\0';
if (t_iseq(repl, '0'))
*prepl = '\0';
- NIAddAffix(Conf, sflag, flagflags | aflg, pmask, pfind, prepl,
+ NIAddAffix(ConfBuild, sflag, flagflags | aflg, pmask, pfind, prepl,
isSuffix ? FF_SUFFIX : FF_PREFIX);
pfree(prepl);
pfree(pfind);
@@ -1407,7 +1622,7 @@ nextline:
* work to NIImportOOAffixes(), which will re-read the whole file.
*/
void
-NIImportAffixes(IspellDict *Conf, const char *filename)
+NIImportAffixes(IspellDictBuild *ConfBuild, const char *filename)
{
char *pstr = NULL;
char flag[BUFSIZ];
@@ -1428,9 +1643,9 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
errmsg("could not open affix file \"%s\": %m",
filename)));
- Conf->usecompound = false;
- Conf->useFlagAliases = false;
- Conf->flagMode = FM_CHAR;
+ ConfBuild->dict->usecompound = false;
+ ConfBuild->dict->useFlagAliases = false;
+ ConfBuild->dict->flagMode = FM_CHAR;
while ((recoded = tsearch_readline(&trst)) != NULL)
{
@@ -1452,10 +1667,8 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
s += pg_mblen(s);
if (*s && pg_mblen(s) == 1)
- {
- addCompoundAffixFlagValue(Conf, s, FF_COMPOUNDFLAG);
- Conf->usecompound = true;
- }
+ addCompoundAffixFlagValue(ConfBuild, s, FF_COMPOUNDFLAG);
+
oldformat = true;
goto nextline;
}
@@ -1528,7 +1741,8 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
if (!parse_affentry(pstr, mask, find, repl))
goto nextline;
- NIAddAffix(Conf, flag, flagflags, mask, find, repl, suffixes ? FF_SUFFIX : FF_PREFIX);
+ NIAddAffix(ConfBuild, flag, flagflags, mask, find, repl,
+ suffixes ? FF_SUFFIX : FF_PREFIX);
nextline:
pfree(recoded);
@@ -1547,53 +1761,48 @@ isnewformat:
errmsg("affix file contains both old-style and new-style commands")));
tsearch_readline_end(&trst);
- NIImportOOAffixes(Conf, filename);
+ NIImportOOAffixes(ConfBuild, filename);
}
/*
* Merges two affix flag sets and stores a new affix flag set into
- * Conf->AffixData.
+ * ConfBuild->AffixData.
*
* Returns index of a new affix flag set.
*/
static int
-MergeAffix(IspellDict *Conf, int a1, int a2)
+MergeAffix(IspellDictBuild *ConfBuild, int a1, int a2)
{
- char **ptr;
+ char *ptr;
+ uint32 len;
/* Do not merge affix flags if one of affix flags is empty */
- if (*Conf->AffixData[a1] == '\0')
+ if (*AffixDataGet(ConfBuild, a1) == '\0')
return a2;
- else if (*Conf->AffixData[a2] == '\0')
+ else if (*AffixDataGet(ConfBuild, a2) == '\0')
return a1;
- while (Conf->nAffixData + 1 >= Conf->lenAffixData)
- {
- Conf->lenAffixData *= 2;
- Conf->AffixData = (char **) repalloc(Conf->AffixData,
- sizeof(char *) * Conf->lenAffixData);
- }
-
- ptr = Conf->AffixData + Conf->nAffixData;
- if (Conf->flagMode == FM_NUM)
+ if (ConfBuild->dict->flagMode == FM_NUM)
{
- *ptr = cpalloc(strlen(Conf->AffixData[a1]) +
- strlen(Conf->AffixData[a2]) +
- 1 /* comma */ + 1 /* \0 */ );
- sprintf(*ptr, "%s,%s", Conf->AffixData[a1], Conf->AffixData[a2]);
+ len = strlen(AffixDataGet(ConfBuild, a1)) + 1 /* comma */ +
+ strlen(AffixDataGet(ConfBuild, a2));
+ ptr = tmpalloc(len + 1 /* \0 */);
+ sprintf(ptr, "%s,%s", AffixDataGet(ConfBuild, a1),
+ AffixDataGet(ConfBuild, a2));
}
else
{
- *ptr = cpalloc(strlen(Conf->AffixData[a1]) +
- strlen(Conf->AffixData[a2]) +
- 1 /* \0 */ );
- sprintf(*ptr, "%s%s", Conf->AffixData[a1], Conf->AffixData[a2]);
+ len = strlen(AffixDataGet(ConfBuild, a1)) +
+ strlen(AffixDataGet(ConfBuild, a2));
+ ptr = tmpalloc(len + 1 /* \0 */ );
+ sprintf(ptr, "%s%s", AffixDataGet(ConfBuild, a1),
+ AffixDataGet(ConfBuild, a2));
}
- ptr++;
- *ptr = NULL;
- Conf->nAffixData++;
- return Conf->nAffixData - 1;
+ AddAffixSet(ConfBuild, ptr, len);
+ pfree(ptr);
+
+ return ConfBuild->nAffixData - 1;
}
/*
@@ -1601,66 +1810,87 @@ MergeAffix(IspellDict *Conf, int a1, int a2)
* flags with the given index.
*/
static uint32
-makeCompoundFlags(IspellDict *Conf, int affix)
+makeCompoundFlags(IspellDictBuild *ConfBuild, int affix)
{
- char *str = Conf->AffixData[affix];
+ char *str = AffixDataGet(ConfBuild, affix);
- return (getCompoundAffixFlagValue(Conf, str) & FF_COMPOUNDFLAGMASK);
+ return (getCompoundAffixFlagValue(ConfBuild, str) & FF_COMPOUNDFLAGMASK);
}
/*
* Makes a prefix tree for the given level.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* low: lower index of the Conf->Spell array.
* high: upper index of the Conf->Spell array.
* level: current prefix tree level.
+ *
+ * Returns an offset of SPNode in DictNodes.
*/
-static SPNode *
-mkSPNode(IspellDict *Conf, int low, int high, int level)
+static uint32
+mkSPNode(IspellDictBuild *ConfBuild, int low, int high, int level)
{
int i;
int nchar = 0;
char lastchar = '\0';
+ uint32 rs_offset,
+ new_offset;
SPNode *rs;
SPNodeData *data;
+ int data_index = 0;
int lownew = low;
for (i = low; i < high; i++)
- if (Conf->Spell[i]->p.d.len > level && lastchar != Conf->Spell[i]->word[level])
+ if (ConfBuild->Spell[i]->p.d.len > level &&
+ lastchar != ConfBuild->Spell[i]->word[level])
{
nchar++;
- lastchar = Conf->Spell[i]->word[level];
+ lastchar = ConfBuild->Spell[i]->word[level];
}
if (!nchar)
- return NULL;
+ return ISPELL_INVALID_OFFSET;
- rs = (SPNode *) cpalloc0(SPNHDRSZ + nchar * sizeof(SPNodeData));
- rs->length = nchar;
+ rs_offset = AllocateSPNode(ConfBuild, nchar);
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
data = rs->data;
lastchar = '\0';
for (i = low; i < high; i++)
- if (Conf->Spell[i]->p.d.len > level)
+ if (ConfBuild->Spell[i]->p.d.len > level)
{
- if (lastchar != Conf->Spell[i]->word[level])
+ if (lastchar != ConfBuild->Spell[i]->word[level])
{
if (lastchar)
{
/* Next level of the prefix tree */
- data->node = mkSPNode(Conf, lownew, i, level + 1);
+ new_offset = mkSPNode(ConfBuild, lownew, i, level + 1);
+
+ /*
+ * ConfBuild->DictNodes can be repalloc'ed within
+ * mkSPNode(), so reinitialize pointers.
+ */
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
+
+ /* First save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ /* Work with next node */
+ data_index++;
+ Assert(data_index < nchar);
+ data = &(rs->data[data_index]);
+
lownew = i;
- data++;
}
- lastchar = Conf->Spell[i]->word[level];
+ lastchar = ConfBuild->Spell[i]->word[level];
}
- data->val = ((uint8 *) (Conf->Spell[i]->word))[level];
- if (Conf->Spell[i]->p.d.len == level + 1)
+ data->val = ((uint8 *) (ConfBuild->Spell[i]->word))[level];
+ if (ConfBuild->Spell[i]->p.d.len == level + 1)
{
bool clearCompoundOnly = false;
- if (data->isword && data->affix != Conf->Spell[i]->p.d.affix)
+ if (data->isword && data->affix != ConfBuild->Spell[i]->p.d.affix)
{
/*
* MergeAffix called a few times. If one of word is
@@ -1669,15 +1899,17 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
*/
clearCompoundOnly = (FF_COMPOUNDONLY & data->compoundflag
- & makeCompoundFlags(Conf, Conf->Spell[i]->p.d.affix))
+ & makeCompoundFlags(ConfBuild,
+ ConfBuild->Spell[i]->p.d.affix))
? false : true;
- data->affix = MergeAffix(Conf, data->affix, Conf->Spell[i]->p.d.affix);
+ data->affix = MergeAffix(ConfBuild, data->affix,
+ ConfBuild->Spell[i]->p.d.affix);
}
else
- data->affix = Conf->Spell[i]->p.d.affix;
+ data->affix = ConfBuild->Spell[i]->p.d.affix;
data->isword = 1;
- data->compoundflag = makeCompoundFlags(Conf, data->affix);
+ data->compoundflag = makeCompoundFlags(ConfBuild, data->affix);
if ((data->compoundflag & FF_COMPOUNDONLY) &&
(data->compoundflag & FF_COMPOUNDFLAG) == 0)
@@ -1689,9 +1921,19 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
}
/* Next level of the prefix tree */
- data->node = mkSPNode(Conf, lownew, high, level + 1);
+ new_offset = mkSPNode(ConfBuild, lownew, high, level + 1);
+
+ /*
+ * ConfBuild->DictNodes can be repalloc'ed within mkSPNode(), so
+ * reinitialize pointers.
+ */
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
- return rs;
+ /* Save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ return rs_offset;
}
/*
@@ -1699,7 +1941,7 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
* and affixes.
*/
void
-NISortDictionary(IspellDict *Conf)
+NISortDictionary(IspellDictBuild *ConfBuild)
{
int i;
int naffix = 0;
@@ -1708,81 +1950,81 @@ NISortDictionary(IspellDict *Conf)
/* compress affixes */
/*
- * If we use flag aliases then we need to use Conf->AffixData filled in
+ * If we use flag aliases then we need to use ConfBuild->AffixData filled in
* the NIImportOOAffixes().
*/
- if (Conf->useFlagAliases)
+ if (ConfBuild->dict->useFlagAliases)
{
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
char *end;
- if (*Conf->Spell[i]->p.flag != '\0')
+ if (*ConfBuild->Spell[i]->p.flag != '\0')
{
- curaffix = strtol(Conf->Spell[i]->p.flag, &end, 10);
- if (Conf->Spell[i]->p.flag == end || errno == ERANGE)
+ curaffix = strtol(ConfBuild->Spell[i]->p.flag, &end, 10);
+ if (ConfBuild->Spell[i]->p.flag == end || errno == ERANGE)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix alias \"%s\"",
- Conf->Spell[i]->p.flag)));
+ ConfBuild->Spell[i]->p.flag)));
}
else
{
/*
- * If Conf->Spell[i]->p.flag is empty, then get empty value of
- * Conf->AffixData (0 index).
+ * If ConfBuild->Spell[i]->p.flag is empty, then get empty
+ * value of ConfBuild->AffixData (0 index).
*/
curaffix = 0;
}
- Conf->Spell[i]->p.d.affix = curaffix;
- Conf->Spell[i]->p.d.len = strlen(Conf->Spell[i]->word);
+ ConfBuild->Spell[i]->p.d.affix = curaffix;
+ ConfBuild->Spell[i]->p.d.len = strlen(ConfBuild->Spell[i]->word);
}
}
- /* Otherwise fill Conf->AffixData here */
+ /* Otherwise fill ConfBuild->AffixData here */
else
{
/* Count the number of different flags used in the dictionary */
- qsort((void *) Conf->Spell, Conf->nspell, sizeof(SPELL *),
+ qsort((void *) ConfBuild->Spell, ConfBuild->nSpell, sizeof(SPELL *),
cmpspellaffix);
naffix = 0;
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
if (i == 0
- || strcmp(Conf->Spell[i]->p.flag, Conf->Spell[i - 1]->p.flag))
+ || strcmp(ConfBuild->Spell[i]->p.flag,
+ ConfBuild->Spell[i - 1]->p.flag))
naffix++;
}
/*
- * Fill in Conf->AffixData with the affixes that were used in the
- * dictionary. Replace textual flag-field of Conf->Spell entries with
- * indexes into Conf->AffixData array.
+ * Fill in AffixData with the affixes that were used in the
+ * dictionary. Replace textual flag-field of ConfBuild->Spell entries
+ * with indexes into ConfBuild->AffixData array.
*/
- Conf->AffixData = (char **) palloc0(naffix * sizeof(char *));
+ InitAffixData(ConfBuild, naffix);
curaffix = -1;
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
if (i == 0
- || strcmp(Conf->Spell[i]->p.flag, Conf->AffixData[curaffix]))
+ || strcmp(ConfBuild->Spell[i]->p.flag,
+ AffixDataGet(ConfBuild, curaffix)))
{
curaffix++;
Assert(curaffix < naffix);
- Conf->AffixData[curaffix] = cpstrdup(Conf,
- Conf->Spell[i]->p.flag);
+ AddAffixSet(ConfBuild, ConfBuild->Spell[i]->p.flag,
+ strlen(ConfBuild->Spell[i]->p.flag));
}
- Conf->Spell[i]->p.d.affix = curaffix;
- Conf->Spell[i]->p.d.len = strlen(Conf->Spell[i]->word);
+ ConfBuild->Spell[i]->p.d.affix = curaffix;
+ ConfBuild->Spell[i]->p.d.len = strlen(ConfBuild->Spell[i]->word);
}
-
- Conf->lenAffixData = Conf->nAffixData = naffix;
}
/* Start build a prefix tree */
- qsort((void *) Conf->Spell, Conf->nspell, sizeof(SPELL *), cmpspell);
- Conf->Dictionary = mkSPNode(Conf, 0, Conf->nspell, 0);
+ qsort((void *) ConfBuild->Spell, ConfBuild->nSpell, sizeof(SPELL *), cmpspell);
+ mkSPNode(ConfBuild, 0, ConfBuild->nSpell, 0);
}
/*
@@ -1790,83 +2032,104 @@ NISortDictionary(IspellDict *Conf)
* rule. Affixes with empty replace string do not include in the prefix tree.
* This affixes are included by mkVoidAffix().
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* low: lower index of the Conf->Affix array.
* high: upper index of the Conf->Affix array.
* level: current prefix tree level.
* type: FF_SUFFIX or FF_PREFIX.
+ *
+ * Returns an offset in nodes array.
*/
-static AffixNode *
-mkANode(IspellDict *Conf, int low, int high, int level, int type)
+static uint32
+mkANode(IspellDictBuild *ConfBuild, int low, int high, int level, int type)
{
int i;
int nchar = 0;
uint8 lastchar = '\0';
+ NodeArray *array;
+ uint32 rs_offset,
+ new_offset;
AffixNode *rs;
AffixNodeData *data;
+ int data_index = 0;
int lownew = low;
- int naff;
- AFFIX **aff;
for (i = low; i < high; i++)
- if (Conf->Affix[i].replen > level && lastchar != GETCHAR(Conf->Affix + i, level, type))
+ if (ConfBuild->Affix[i]->replen > level &&
+ lastchar != GETCHAR(ConfBuild->Affix[i], level, type))
{
nchar++;
- lastchar = GETCHAR(Conf->Affix + i, level, type);
+ lastchar = GETCHAR(ConfBuild->Affix[i], level, type);
}
if (!nchar)
- return NULL;
+ return ISPELL_INVALID_OFFSET;
- aff = (AFFIX **) tmpalloc(sizeof(AFFIX *) * (high - low + 1));
- naff = 0;
+ if (type == FF_SUFFIX)
+ array = &ConfBuild->SuffixNodes;
+ else
+ array = &ConfBuild->PrefixNodes;
- rs = (AffixNode *) cpalloc0(ANHRDSZ + nchar * sizeof(AffixNodeData));
- rs->length = nchar;
- data = rs->data;
+ rs_offset = AllocateAffixNode(ConfBuild, array, nchar);
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
+ data = (AffixNodeData *) rs->data;
lastchar = '\0';
for (i = low; i < high; i++)
- if (Conf->Affix[i].replen > level)
+ if (ConfBuild->Affix[i]->replen > level)
{
- if (lastchar != GETCHAR(Conf->Affix + i, level, type))
+ if (lastchar != GETCHAR(ConfBuild->Affix[i], level, type))
{
if (lastchar)
{
/* Next level of the prefix tree */
- data->node = mkANode(Conf, lownew, i, level + 1, type);
- if (naff)
- {
- data->naff = naff;
- data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * naff);
- memcpy(data->aff, aff, sizeof(AFFIX *) * naff);
- naff = 0;
- }
- data++;
+ new_offset = mkANode(ConfBuild, lownew, i, level + 1, type);
+
+ /*
+ * array can be repalloc'ed within mkANode(), so
+ * reinitialize pointers.
+ */
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
+
+ /* First save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ /* Handle next data node */
+ data_index++;
+ Assert(data_index < nchar);
+ data = &(rs->data[data_index]);
+
lownew = i;
}
- lastchar = GETCHAR(Conf->Affix + i, level, type);
+ lastchar = GETCHAR(ConfBuild->Affix[i], level, type);
}
- data->val = GETCHAR(Conf->Affix + i, level, type);
- if (Conf->Affix[i].replen == level + 1)
+ data->val = GETCHAR(ConfBuild->Affix[i], level, type);
+ if (ConfBuild->Affix[i]->replen == level + 1)
{ /* affix stopped */
- aff[naff++] = Conf->Affix + i;
+ if (data->affstart == ISPELL_INVALID_INDEX)
+ {
+ data->affstart = i;
+ data->affend = i;
+ }
+ else
+ data->affend = i;
}
}
/* Next level of the prefix tree */
- data->node = mkANode(Conf, lownew, high, level + 1, type);
- if (naff)
- {
- data->naff = naff;
- data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * naff);
- memcpy(data->aff, aff, sizeof(AFFIX *) * naff);
- naff = 0;
- }
+ new_offset = mkANode(ConfBuild, lownew, high, level + 1, type);
+
+ /*
+ * array can be repalloc'ed within mkANode(), so reinitialize pointers.
+ */
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
- pfree(aff);
+ /* Save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
- return rs;
+ return rs_offset;
}
/*
@@ -1874,137 +2137,154 @@ mkANode(IspellDict *Conf, int low, int high, int level, int type)
* for affixes which have empty replace string ("repl" field).
*/
static void
-mkVoidAffix(IspellDict *Conf, bool issuffix, int startsuffix)
+mkVoidAffix(IspellDictBuild *ConfBuild, bool issuffix, int startsuffix)
{
- int i,
- cnt = 0;
+ int i;
int start = (issuffix) ? startsuffix : 0;
- int end = (issuffix) ? Conf->naffixes : startsuffix;
- AffixNode *Affix = (AffixNode *) palloc0(ANHRDSZ + sizeof(AffixNodeData));
-
- Affix->length = 1;
- Affix->isvoid = 1;
+ int end = (issuffix) ? ConfBuild->nAffix : startsuffix;
+ uint32 node_offset;
+ NodeArray *array;
+ AffixNode *Affix;
+ AffixNodeData *AffixData;
if (issuffix)
- {
- Affix->data->node = Conf->Suffix;
- Conf->Suffix = Affix;
- }
+ array = &ConfBuild->SuffixNodes;
else
- {
- Affix->data->node = Conf->Prefix;
- Conf->Prefix = Affix;
- }
+ array = &ConfBuild->PrefixNodes;
- /* Count affixes with empty replace string */
- for (i = start; i < end; i++)
- if (Conf->Affix[i].replen == 0)
- cnt++;
-
- /* There is not affixes with empty replace string */
- if (cnt == 0)
- return;
+ node_offset = AllocateAffixNode(ConfBuild, array, 1);
+ Affix = (AffixNode *) NodeArrayGet(array, node_offset);
- Affix->data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * cnt);
- Affix->data->naff = (uint32) cnt;
+ Affix->isvoid = 1;
+ AffixData = (AffixNodeData *) Affix->data;
- cnt = 0;
for (i = start; i < end; i++)
- if (Conf->Affix[i].replen == 0)
+ if (ConfBuild->Affix[i]->replen == 0)
{
- Affix->data->aff[cnt] = Conf->Affix + i;
- cnt++;
+ if (AffixData->affstart == ISPELL_INVALID_INDEX)
+ {
+ AffixData->affstart = i;
+ AffixData->affend = i;
+ }
+ else
+ AffixData->affend = i;
}
}
/*
- * Checks if the affixflag is used by dictionary. Conf->AffixData does not
+ * Checks if the affixflag is used by dictionary. AffixData does not
* contain affixflag if this flag is not used actually by the .dict file.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* affixflag: affix flag.
*
- * Returns true if the Conf->AffixData array contains affixflag, otherwise
+ * Returns true if the ConfBuild->AffixData array contains affixflag, otherwise
* returns false.
*/
static bool
-isAffixInUse(IspellDict *Conf, char *affixflag)
+isAffixInUse(IspellDictBuild *ConfBuild, char *affixflag)
{
int i;
- for (i = 0; i < Conf->nAffixData; i++)
- if (IsAffixFlagInUse(Conf, i, affixflag))
+ for (i = 0; i < ConfBuild->nAffixData; i++)
+ if (IsAffixFlagInUse(ConfBuild->dict->flagMode,
+ AffixDataGet(ConfBuild, i), affixflag))
return true;
return false;
}
/*
- * Builds Conf->Prefix and Conf->Suffix trees from the imported affixes.
+ * Builds Prefix and Suffix trees from the imported affixes.
*/
void
-NISortAffixes(IspellDict *Conf)
+NISortAffixes(IspellDictBuild *ConfBuild)
{
AFFIX *Affix;
+ AffixNode *voidPrefix,
+ *voidSuffix;
size_t i;
CMPDAffix *ptr;
- int firstsuffix = Conf->naffixes;
+ int firstsuffix = ConfBuild->nAffix;
+ uint32 prefix_offset,
+ suffix_offset;
- if (Conf->naffixes == 0)
+ if (ConfBuild->nAffix == 0)
return;
/* Store compound affixes in the Conf->CompoundAffix array */
- if (Conf->naffixes > 1)
- qsort((void *) Conf->Affix, Conf->naffixes, sizeof(AFFIX), cmpaffix);
- Conf->CompoundAffix = ptr = (CMPDAffix *) palloc(sizeof(CMPDAffix) * Conf->naffixes);
- ptr->affix = NULL;
-
- for (i = 0; i < Conf->naffixes; i++)
+ if (ConfBuild->nAffix > 1)
+ qsort((void *) ConfBuild->Affix, ConfBuild->nAffix,
+ sizeof(AFFIX *), cmpaffix);
+ ConfBuild->nCompoundAffix = ConfBuild->nAffix;
+ ConfBuild->CompoundAffix = ptr =
+ (CMPDAffix *) tmpalloc(sizeof(CMPDAffix) * ConfBuild->nCompoundAffix);
+ ptr->affix = ISPELL_INVALID_INDEX;
+
+ for (i = 0; i < ConfBuild->nAffix; i++)
{
- Affix = &(((AFFIX *) Conf->Affix)[i]);
+ Affix = ConfBuild->Affix[i];
if (Affix->type == FF_SUFFIX && i < firstsuffix)
firstsuffix = i;
if ((Affix->flagflags & FF_COMPOUNDFLAG) && Affix->replen > 0 &&
- isAffixInUse(Conf, Affix->flag))
+ isAffixInUse(ConfBuild, AffixFieldFlag(Affix)))
{
- if (ptr == Conf->CompoundAffix ||
+ if (ptr == ConfBuild->CompoundAffix ||
ptr->issuffix != (ptr - 1)->issuffix ||
- strbncmp((const unsigned char *) (ptr - 1)->affix,
- (const unsigned char *) Affix->repl,
+ strbncmp((const unsigned char *) AffixFieldRepl(ConfBuild->Affix[(ptr - 1)->affix]),
+ (const unsigned char *) AffixFieldRepl(Affix),
(ptr - 1)->len))
{
/* leave only unique and minimals suffixes */
- ptr->affix = Affix->repl;
+ ptr->affix = i;
ptr->len = Affix->replen;
ptr->issuffix = (Affix->type == FF_SUFFIX);
ptr++;
}
}
}
- ptr->affix = NULL;
- Conf->CompoundAffix = (CMPDAffix *) repalloc(Conf->CompoundAffix, sizeof(CMPDAffix) * (ptr - Conf->CompoundAffix + 1));
+ ptr->affix = ISPELL_INVALID_INDEX;
+ ConfBuild->nCompoundAffix = ptr - ConfBuild->CompoundAffix + 1;
+ ConfBuild->CompoundAffix = (CMPDAffix *) repalloc(ConfBuild->CompoundAffix,
+ sizeof(CMPDAffix) * (ConfBuild->nCompoundAffix));
/* Start build a prefix tree */
- Conf->Prefix = mkANode(Conf, 0, firstsuffix, 0, FF_PREFIX);
- Conf->Suffix = mkANode(Conf, firstsuffix, Conf->naffixes, 0, FF_SUFFIX);
- mkVoidAffix(Conf, true, firstsuffix);
- mkVoidAffix(Conf, false, firstsuffix);
+ mkVoidAffix(ConfBuild, true, firstsuffix);
+ mkVoidAffix(ConfBuild, false, firstsuffix);
+
+ prefix_offset = mkANode(ConfBuild, 0, firstsuffix, 0, FF_PREFIX);
+ suffix_offset = mkANode(ConfBuild, firstsuffix, ConfBuild->nAffix, 0,
+ FF_SUFFIX);
+
+ /* Adjust offsets of new nodes for nodes of void affixes */
+ voidPrefix = (AffixNode *) NodeArrayGet(&ConfBuild->PrefixNodes, 0);
+ voidPrefix->data[0].node_offset = prefix_offset;
+
+ voidSuffix = (AffixNode *) NodeArrayGet(&ConfBuild->SuffixNodes, 0);
+ voidSuffix->data[0].node_offset = suffix_offset;
}
static AffixNodeData *
-FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
+FindAffixes(IspellDictData *dict, AffixNode *node, const char *word, int wrdlen,
+ int *level, int type)
{
+ AffixNode *node_start;
AffixNodeData *StopLow,
*StopHigh,
*StopMiddle;
uint8 symbol;
+ if (type == FF_PREFIX)
+ node_start = (AffixNode *) DictPrefixNodes(dict);
+ else
+ node_start = (AffixNode *) DictSuffixNodes(dict);
+
if (node->isvoid)
{ /* search void affixes */
- if (node->data->naff)
+ if (node->data->affstart != ISPELL_INVALID_INDEX)
return node->data;
- node = node->data->node;
+ node = (AffixNode *) DictNodeGet(node_start, node->data->node_offset);
}
while (node && *level < wrdlen)
@@ -2019,9 +2299,10 @@ FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
if (StopMiddle->val == symbol)
{
(*level)++;
- if (StopMiddle->naff)
+ if (StopMiddle->affstart != ISPELL_INVALID_INDEX)
return StopMiddle;
- node = StopMiddle->node;
+ node = (AffixNode *) DictNodeGet(node_start,
+ StopMiddle->node_offset);
break;
}
else if (StopMiddle->val < symbol)
@@ -2035,8 +2316,67 @@ FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
return NULL;
}
+/*
+ * Compile regular expression on first use and store it within reg.
+ */
+static void
+CompileAffixReg(AffixReg *reg, bool isregis, int type,
+ const char *mask, int masklen, MemoryContext dictCtx)
+{
+ MemoryContext oldcontext;
+
+ Assert(dictCtx);
+
+ /*
+ * Switch to memory context of the dictionary, so compiled expression can be
+ * used in other queries.
+ */
+ oldcontext = MemoryContextSwitchTo(dictCtx);
+
+ if (isregis)
+ RS_compile(&reg->r.regis, (type == FF_SUFFIX), mask);
+ else
+ {
+ int wmasklen;
+ int err;
+ pg_wchar *wmask;
+ char *tmask;
+
+ tmask = (char *) palloc(masklen + 3);
+ if (type == FF_SUFFIX)
+ sprintf(tmask, "%s$", mask);
+ else
+ sprintf(tmask, "^%s", mask);
+
+ masklen = strlen(tmask);
+ wmask = (pg_wchar *) palloc((masklen + 1) * sizeof(pg_wchar));
+ wmasklen = pg_mb2wchar_with_len(tmask, wmask, masklen);
+
+ err = pg_regcomp(&(reg->r.regex), wmask, wmasklen,
+ REG_ADVANCED | REG_NOSUB,
+ DEFAULT_COLLATION_OID);
+ if (err)
+ {
+ char errstr[100];
+
+ pg_regerror(err, &(reg->r.regex), errstr, sizeof(errstr));
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
+ errmsg("invalid regular expression: %s", errstr)));
+ }
+
+ pfree(wmask);
+ pfree(tmask);
+ }
+
+ reg->iscompiled = true;
+
+ MemoryContextSwitchTo(oldcontext);
+}
+
static char *
-CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *newword, int *baselen)
+CheckAffix(const char *word, size_t len, AFFIX *Affix, AffixReg *reg,
+ int flagflags, char *newword, int *baselen, MemoryContext dictCtx)
{
/*
* Check compound allow flags
@@ -2076,7 +2416,7 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
if (Affix->type == FF_SUFFIX)
{
strcpy(newword, word);
- strcpy(newword + len - Affix->replen, Affix->find);
+ strcpy(newword + len - Affix->replen, AffixFieldFind(Affix));
if (baselen) /* store length of non-changed part of word */
*baselen = len - Affix->replen;
}
@@ -2086,9 +2426,9 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
* if prefix is an all non-changed part's length then all word
* contains only prefix and suffix, so out
*/
- if (baselen && *baselen + strlen(Affix->find) <= Affix->replen)
+ if (baselen && *baselen + Affix->findlen <= Affix->replen)
return NULL;
- strcpy(newword, Affix->find);
+ strcpy(newword, AffixFieldFind(Affix));
strcat(newword, word + Affix->replen);
}
@@ -2099,7 +2439,12 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
return newword;
else if (Affix->isregis)
{
- if (RS_execute(&(Affix->reg.regis), newword))
+ /* Compile the regular expression on first demand */
+ if (!reg->iscompiled)
+ CompileAffixReg(reg, Affix->isregis, Affix->type,
+ AffixFieldMask(Affix), Affix->masklen, dictCtx);
+
+ if (RS_execute(&(reg->r.regis), newword))
return newword;
}
else
@@ -2109,12 +2454,17 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
size_t data_len;
int newword_len;
+ /* Compile the regular expression on first demand */
+ if (!reg->iscompiled)
+ CompileAffixReg(reg, Affix->isregis, Affix->type,
+ AffixFieldMask(Affix), Affix->masklen, dictCtx);
+
/* Convert data string to wide characters */
newword_len = strlen(newword);
data = (pg_wchar *) palloc((newword_len + 1) * sizeof(pg_wchar));
data_len = pg_mb2wchar_with_len(newword, data, newword_len);
- if (!(err = pg_regexec(&(Affix->reg.regex), data, data_len, 0, NULL, 0, NULL, 0)))
+ if (!(err = pg_regexec(&(reg->r.regex), data, data_len, 0, NULL, 0, NULL, 0)))
{
pfree(data);
return newword;
@@ -2153,7 +2503,7 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
char **cur;
char newword[2 * MAXNORMLEN] = "";
char pnewword[2 * MAXNORMLEN] = "";
- AffixNode *snode = Conf->Suffix,
+ AffixNode *snode = (AffixNode *) DictSuffixNodes(Conf->dict),
*pnode;
int i,
j;
@@ -2165,7 +2515,7 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
/* Check that the word itself is normal form */
- if (FindWord(Conf, word, VoidString, flag))
+ if (FindWord(Conf->dict, word, VoidString, flag))
{
*cur = pstrdup(word);
cur++;
@@ -2173,23 +2523,29 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
}
/* Find all other NORMAL forms of the 'word' (check only prefix) */
- pnode = Conf->Prefix;
+ pnode = (AffixNode *) DictPrefixNodes(Conf->dict);
plevel = 0;
while (pnode)
{
- prefix = FindAffixes(pnode, word, wrdlen, &plevel, FF_PREFIX);
+ prefix = FindAffixes(Conf->dict, pnode, word, wrdlen, &plevel, FF_PREFIX);
if (!prefix)
break;
- for (j = 0; j < prefix->naff; j++)
+ for (j = prefix->affstart; j <= prefix->affend; j++)
{
- if (CheckAffix(word, wrdlen, prefix->aff[j], flag, newword, NULL))
+ AFFIX *affix = (AFFIX *) DictAffixGet(Conf->dict, j);
+ AffixReg *reg = &(Conf->reg[j]);
+
+ if (affix &&
+ CheckAffix(word, wrdlen, affix, reg, flag, newword, NULL,
+ Conf->dictCtx))
{
/* prefix success */
- if (FindWord(Conf, newword, prefix->aff[j]->flag, flag))
+ if (FindWord(Conf->dict, newword, AffixFieldFlag(affix), flag))
cur += addToResult(forms, cur, newword);
}
}
- pnode = prefix->node;
+ pnode = (AffixNode *) DictNodeGet(DictPrefixNodes(Conf->dict),
+ prefix->node_offset);
}
/*
@@ -2201,45 +2557,59 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
int baselen = 0;
/* find possible suffix */
- suffix = FindAffixes(snode, word, wrdlen, &slevel, FF_SUFFIX);
+ suffix = FindAffixes(Conf->dict, snode, word, wrdlen, &slevel,
+ FF_SUFFIX);
if (!suffix)
break;
/* foreach suffix check affix */
- for (i = 0; i < suffix->naff; i++)
+ for (i = suffix->affstart; i <= suffix->affend; i++)
{
- if (CheckAffix(word, wrdlen, suffix->aff[i], flag, newword, &baselen))
+ AFFIX *sufentry = (AFFIX *) DictAffixGet(Conf->dict, i);
+ AffixReg *sufreg = &(Conf->reg[i]);
+
+ if (sufentry &&
+ CheckAffix(word, wrdlen, sufentry, sufreg, flag, newword, &baselen,
+ Conf->dictCtx))
{
/* suffix success */
- if (FindWord(Conf, newword, suffix->aff[i]->flag, flag))
+ if (FindWord(Conf->dict, newword, AffixFieldFlag(sufentry), flag))
cur += addToResult(forms, cur, newword);
/* now we will look changed word with prefixes */
- pnode = Conf->Prefix;
+ pnode = (AffixNode *) DictPrefixNodes(Conf->dict);
plevel = 0;
swrdlen = strlen(newword);
while (pnode)
{
- prefix = FindAffixes(pnode, newword, swrdlen, &plevel, FF_PREFIX);
+ prefix = FindAffixes(Conf->dict, pnode, newword, swrdlen,
+ &plevel, FF_PREFIX);
if (!prefix)
break;
- for (j = 0; j < prefix->naff; j++)
+ for (j = prefix->affstart; j <= prefix->affend; j++)
{
- if (CheckAffix(newword, swrdlen, prefix->aff[j], flag, pnewword, &baselen))
+ AFFIX *prefentry = (AFFIX *) DictAffixGet(Conf->dict, j);
+ AffixReg *prefreg = &(Conf->reg[j]);
+
+ if (prefentry &&
+ CheckAffix(newword, swrdlen, prefentry, prefreg,
+ flag, pnewword, &baselen, Conf->dictCtx))
{
/* prefix success */
- char *ff = (prefix->aff[j]->flagflags & suffix->aff[i]->flagflags & FF_CROSSPRODUCT) ?
- VoidString : prefix->aff[j]->flag;
+ char *ff = (prefentry->flagflags & sufentry->flagflags & FF_CROSSPRODUCT) ?
+ VoidString : AffixFieldFlag(prefentry);
- if (FindWord(Conf, pnewword, ff, flag))
+ if (FindWord(Conf->dict, pnewword, ff, flag))
cur += addToResult(forms, cur, pnewword);
}
}
- pnode = prefix->node;
+ pnode = (AffixNode *) DictNodeGet(DictPrefixNodes(Conf->dict),
+ prefix->node_offset);
}
}
}
- snode = suffix->node;
+ snode = (AffixNode *) DictNodeGet(DictSuffixNodes(Conf->dict),
+ suffix->node_offset);
}
if (cur == forms)
@@ -2259,7 +2629,8 @@ typedef struct SplitVar
} SplitVar;
static int
-CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
+CheckCompoundAffixes(IspellDictData *dict, CMPDAffix **ptr,
+ char *word, int len, bool CheckInPlace)
{
bool issuffix;
@@ -2269,9 +2640,12 @@ CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
if (CheckInPlace)
{
- while ((*ptr)->affix)
+ while ((*ptr)->affix != ISPELL_INVALID_INDEX)
{
- if (len > (*ptr)->len && strncmp((*ptr)->affix, word, (*ptr)->len) == 0)
+ AFFIX *affix = (AFFIX *) DictAffixGet(dict, (*ptr)->affix);
+
+ if (len > (*ptr)->len &&
+ strncmp(AffixFieldRepl(affix), word, (*ptr)->len) == 0)
{
len = (*ptr)->len;
issuffix = (*ptr)->issuffix;
@@ -2285,9 +2659,12 @@ CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
{
char *affbegin;
- while ((*ptr)->affix)
+ while ((*ptr)->affix != ISPELL_INVALID_INDEX)
{
- if (len > (*ptr)->len && (affbegin = strstr(word, (*ptr)->affix)) != NULL)
+ AFFIX *affix = (AFFIX *) DictAffixGet(dict, (*ptr)->affix);
+
+ if (len > (*ptr)->len &&
+ (affbegin = strstr(word, AffixFieldRepl(affix))) != NULL)
{
len = (*ptr)->len + (affbegin - word);
issuffix = (*ptr)->issuffix;
@@ -2339,13 +2716,14 @@ AddStem(SplitVar *v, char *word)
}
static SplitVar *
-SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int wordlen, int startpos, int minpos)
+SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig,
+ char *word, int wordlen, int startpos, int minpos)
{
SplitVar *var = NULL;
SPNodeData *StopLow,
*StopHigh,
*StopMiddle = NULL;
- SPNode *node = (snode) ? snode : Conf->Dictionary;
+ SPNode *node = (snode) ? snode : (SPNode *) DictDictNodes(Conf->dict);
int level = (snode) ? minpos : startpos; /* recursive
* minpos==level */
int lenaff;
@@ -2360,8 +2738,11 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
while (level < wordlen)
{
/* find word with epenthetic or/and compound affix */
- caff = Conf->CompoundAffix;
- while (level > startpos && (lenaff = CheckCompoundAffixes(&caff, word + level, wordlen - level, (node) ? true : false)) >= 0)
+ caff = (CMPDAffix *) DictCompoundAffix(Conf->dict);
+ while (level > startpos &&
+ (lenaff = CheckCompoundAffixes(Conf->dict, &caff,
+ word + level, wordlen - level,
+ (node) ? true : false)) >= 0)
{
/*
* there is one of compound affixes, so check word for existings
@@ -2408,7 +2789,8 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
while (ptr->next)
ptr = ptr->next;
- ptr->next = SplitToVariants(Conf, NULL, new, word, wordlen, startpos + lenaff, startpos + lenaff);
+ ptr->next = SplitToVariants(Conf, NULL, new, word, wordlen,
+ startpos + lenaff, startpos + lenaff);
pfree(new->stem);
pfree(new);
@@ -2467,13 +2849,14 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
/* we can find next word */
level++;
AddStem(var, pnstrdup(word + startpos, level - startpos));
- node = Conf->Dictionary;
+ node = (SPNode *) DictDictNodes(Conf->dict);
startpos = level;
continue;
}
}
}
- node = StopMiddle->node;
+ node = (SPNode *) DictNodeGet(DictDictNodes(Conf->dict),
+ StopMiddle->node_offset);
}
else
node = NULL;
@@ -2523,7 +2906,7 @@ NINormalizeWord(IspellDict *Conf, char *word)
pfree(res);
}
- if (Conf->usecompound)
+ if (Conf->dict->usecompound)
{
int wordlen = strlen(word);
SplitVar *ptr,
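Note on the offset scheme in the spell.c changes above: mkSPNode() and mkANode() now return a uint32 offset into a growable node array instead of a pointer, and re-fetch `rs` after every recursive call because the array may have been moved by repalloc. The sketch below illustrates that pattern in isolation. It is not code from the patch: `Arena`, `arena_alloc_node`, `node_at`, `build` and `contains` are hypothetical names, plain realloc() stands in for repalloc(), and the node layout is simplified.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define INVALID_OFFSET UINT32_MAX	/* stands in for ISPELL_INVALID_OFFSET */

typedef struct { uint8_t val; uint8_t isword; uint32_t node_offset; } NodeData;
typedef struct { uint32_t length; NodeData data[]; } Node;
typedef struct { char *buf; uint32_t used; uint32_t size; } Arena;

/* Resolve an offset to a pointer; the result is valid only until the next allocation. */
static Node *node_at(const Arena *a, uint32_t off)
{
	return (Node *) (a->buf + off);
}

/* Append a zeroed node with nchar children and return its offset. */
static uint32_t arena_alloc_node(Arena *a, uint32_t nchar)
{
	uint32_t need = (uint32_t) (sizeof(Node) + nchar * sizeof(NodeData));
	uint32_t off = a->used;
	uint32_t i;

	while (a->used + need > a->size)
	{
		a->size = a->size ? a->size * 2 : 64;
		a->buf = (char *) realloc(a->buf, a->size);	/* may move the buffer */
		assert(a->buf != NULL);
	}
	memset(a->buf + off, 0, need);
	node_at(a, off)->length = nchar;
	for (i = 0; i < nchar; i++)
		node_at(a, off)->data[i].node_offset = INVALID_OFFSET;
	a->used += need;
	return off;
}

/* Build a trie level from the sorted words w[low..high) and return its offset. */
static uint32_t build(Arena *a, const char **w, int low, int high, int level)
{
	int i, nchar = 0, idx = -1, lownew = low;
	char last = '\0';
	uint32_t self, child;

	for (i = low; i < high; i++)
		if (strlen(w[i]) > (size_t) level && w[i][level] != last)
		{
			nchar++;
			last = w[i][level];
		}
	if (nchar == 0)
		return INVALID_OFFSET;

	self = arena_alloc_node(a, (uint32_t) nchar);
	last = '\0';
	for (i = low; i < high; i++)
	{
		if (strlen(w[i]) <= (size_t) level)
			continue;
		if (w[i][level] != last)
		{
			if (idx >= 0)
			{
				/* Recurse first, then resolve the pointer: the buffer may have moved. */
				child = build(a, w, lownew, i, level + 1);
				node_at(a, self)->data[idx].node_offset = child;
			}
			idx++;
			lownew = i;
			last = w[i][level];
			node_at(a, self)->data[idx].val = (uint8_t) last;
		}
		if (strlen(w[i]) == (size_t) level + 1)
			node_at(a, self)->data[idx].isword = 1;
	}
	child = build(a, w, lownew, high, level + 1);
	node_at(a, self)->data[idx].node_offset = child;
	return self;
}

/* Look up a word by following offsets from the root node at offset 0. */
static int contains(const Arena *a, const char *word)
{
	uint32_t off = a->used ? 0 : INVALID_OFFSET;

	for (; *word; word++)
	{
		const Node *n;
		uint32_t i;

		if (off == INVALID_OFFSET)
			return 0;
		n = node_at(a, off);
		for (i = 0; i < n->length; i++)
			if (n->data[i].val == (uint8_t) *word)
				break;
		if (i == n->length)
			return 0;
		if (word[1] == '\0')
			return n->data[i].isword;
		off = n->data[i].node_offset;
	}
	return 0;
}
```

The child offset is stored in a local variable before `node_at()` is called. Writing `node_at(a, self)->data[idx].node_offset = build(...)` as one expression would leave the evaluation order unspecified in C, so the pointer could be computed before the recursive call reallocates the buffer. Because the finished buffer contains only offsets, it can be copied byte for byte into another memory region, which is what makes this layout suitable for a shared segment.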
diff --git a/src/include/tsearch/dicts/spell.h b/src/include/tsearch/dicts/spell.h
index 210f97dda9..b40cf379eb 100644
--- a/src/include/tsearch/dicts/spell.h
+++ b/src/include/tsearch/dicts/spell.h
@@ -18,21 +18,23 @@
#include "tsearch/dicts/regis.h"
#include "tsearch/ts_public.h"
+#define ISPELL_INVALID_INDEX (0x7FFFF)
+#define ISPELL_INVALID_OFFSET (0xFFFFFFFF)
+
/*
* SPNode and SPNodeData are used to represent prefix tree (Trie) to store
* a words list.
*/
-struct SPNode;
-
typedef struct
{
uint32 val:8,
isword:1,
/* Stores compound flags listed below */
compoundflag:4,
- /* Reference to an entry of the AffixData field */
+ /* Index of an entry of the AffixData field */
affix:19;
- struct SPNode *node;
+ /* Offset to a node of the DictNodes field */
+ uint32 node_offset;
} SPNodeData;
/*
@@ -86,21 +88,55 @@ typedef struct spell_struct
*/
typedef struct aff_struct
{
- char *flag;
/* FF_SUFFIX or FF_PREFIX */
- uint32 type:1,
+ uint16 type:1,
flagflags:7,
issimple:1,
isregis:1,
- replen:14;
- char *find;
- char *repl;
+ flaglen:2;
+
+ /* 8 bits could be too much for repl, find and mask, but who knows */
+ uint8 replen;
+ uint8 findlen;
+ uint8 masklen;
+
+ /*
+ * fields stores the following data (each ends with \0):
+ * - repl
+ * - find
+ * - mask
+ * - flag - one character (if FM_CHAR),
+ * two characters (if FM_LONG),
+ * number, >= 0 and < 65536 (if FM_NUM).
+ */
+ char fields[FLEXIBLE_ARRAY_MEMBER];
+} AFFIX;
+
+#define AF_FLAG_MAXSIZE 5 /* strlen(65536) */
+#define AF_REPL_MAXSIZE 255 /* 8 bits */
+
+#define AFFIXHDRSZ (offsetof(AFFIX, fields))
+
+#define AffixFieldRepl(af) ((af)->fields)
+#define AffixFieldFind(af) ((af)->fields + (af)->replen + 1)
+#define AffixFieldMask(af) (AffixFieldFind(af) + (af)->findlen + 1)
+#define AffixFieldFlag(af) (AffixFieldMask(af) + (af)->masklen + 1)
+#define AffixGetSize(af) (AFFIXHDRSZ + (af)->replen + 1 + (af)->findlen + 1 \
+ + (af)->masklen + 1 + strlen(AffixFieldFlag(af)) + 1)
+
+/*
+ * Stores compiled regular expression of affix. AffixReg uses mask field of
+ * AFFIX as a regular expression.
+ */
+typedef struct AffixReg
+{
+ bool iscompiled;
union
{
regex_t regex;
Regis regis;
- } reg;
-} AFFIX;
+ } r;
+} AffixReg;
/*
* affixes use dictionary flags too
@@ -120,14 +156,13 @@ typedef struct aff_struct
* AffixNode and AffixNodeData are used to represent prefix tree (Trie) to store
* an affix list.
*/
-struct AffixNode;
-
typedef struct
{
- uint32 val:8,
- naff:24;
- AFFIX **aff;
- struct AffixNode *node;
+ uint8 val;
+ uint32 affstart;
+ uint32 affend;
+ /* Offset to a node of the PrefixNodes or SuffixNodes field */
+ uint32 node_offset;
} AffixNodeData;
typedef struct AffixNode
@@ -139,9 +174,19 @@ typedef struct AffixNode
#define ANHRDSZ (offsetof(AffixNode, data))
+typedef struct NodeArray
+{
+ char *Nodes;
+ uint32 NodesSize; /* allocated size of Nodes */
+ uint32 NodesEnd; /* end of data in Nodes */
+} NodeArray;
+
+#define NodeArrayGet(na, of) (((of) == ISPELL_INVALID_OFFSET) ? NULL : (na)->Nodes + (of))
+
typedef struct
{
- char *affix;
+ /* Index of an affix of the Affix field */
+ uint32 affix;
int len;
bool issuffix;
} CMPDAffix;
@@ -176,30 +221,71 @@ typedef struct CompoundAffixFlag
#define FLAGNUM_MAXSIZE (1 << 16)
-typedef struct
+typedef struct IspellDictData
{
- int maffixes;
- int naffixes;
- AFFIX *Affix;
-
- AffixNode *Suffix;
- AffixNode *Prefix;
+ FlagMode flagMode;
+ bool usecompound;
- SPNode *Dictionary;
- /* Array of sets of affixes */
- char **AffixData;
- int lenAffixData;
- int nAffixData;
bool useFlagAliases;
- CMPDAffix *CompoundAffix;
+ uint32 nAffixData;
+ uint32 AffixDataStart;
- bool usecompound;
- FlagMode flagMode;
+ uint32 AffixOffsetStart;
+ uint32 AffixStart;
+ uint32 nAffix;
+
+ uint32 DictNodesStart;
+ uint32 PrefixNodesStart;
+ uint32 SuffixNodesStart;
+
+ uint32 CompoundAffixStart;
/*
- * All follow fields are actually needed only for initialization
+ * data stores:
+ * - AffixData - array of affix sets
+ * - Affix - sorted array of affixes
+ * - DictNodes - prefix tree of a word list
+ * - PrefixNodes - prefix tree of a prefix list
+ * - SuffixNodes - prefix tree of a suffix list
+ * - CompoundAffix - array of compound affixes
*/
+ char data[FLEXIBLE_ARRAY_MEMBER];
+} IspellDictData;
+
+#define IspellDictDataHdrSize (offsetof(IspellDictData, data))
+
+#define DictAffixDataOffset(d) ((d)->data)
+#define DictAffixData(d) ((d)->data + (d)->AffixDataStart)
+#define DictAffixDataGet(d, i) (((i) == ISPELL_INVALID_INDEX) ? NULL : \
+ DictAffixData(d) + ((uint32 *) DictAffixDataOffset(d))[i])
+
+#define DictAffixOffset(d) ((d)->data + (d)->AffixOffsetStart)
+#define DictAffix(d) ((d)->data + (d)->AffixStart)
+#define DictAffixGet(d, i) (((i) == ISPELL_INVALID_INDEX) ? NULL : \
+ DictAffix(d) + ((uint32 *) DictAffixOffset(d))[i])
+
+#define DictDictNodes(d) ((d)->data + (d)->DictNodesStart)
+#define DictPrefixNodes(d) ((d)->data + (d)->PrefixNodesStart)
+#define DictSuffixNodes(d) ((d)->data + (d)->SuffixNodesStart)
+#define DictNodeGet(node_start, of) (((of) == ISPELL_INVALID_OFFSET) ? NULL : \
+ (char *) (node_start) + (of))
+
+#define DictCompoundAffix(d) ((d)->data + (d)->CompoundAffixStart)
+
+/*
+ * IspellDictBuild is used to initialize IspellDictData struct. This is a
+ * temporary structure which is set up by NIStartBuild() and released by
+ * NIFinishBuild().
+ */
+typedef struct IspellDictBuild
+{
+ MemoryContext buildCxt; /* temp context for construction */
+
+ IspellDictData *dict;
+ uint32 dict_size;
+
+ /* Temporary data */
/* Array of Hunspell options in affix file */
CompoundAffixFlag *CompoundAffixFlags;
@@ -208,29 +294,73 @@ typedef struct
/* allocated length of CompoundAffixFlags array */
int mCompoundAffixFlag;
- /*
- * Remaining fields are only used during dictionary construction; they are
- * set up by NIStartBuild and cleared by NIFinishBuild.
- */
- MemoryContext buildCxt; /* temp context for construction */
-
- /* Temporary array of all words in the dict file */
+ /* Array of all words in the dict file */
SPELL **Spell;
- int nspell; /* number of valid entries in Spell array */
- int mspell; /* allocated length of Spell array */
+ int nSpell; /* number of valid entries in Spell array */
+ int mSpell; /* allocated length of Spell array */
+
+ /* Array of all affixes in the aff file */
+ AFFIX **Affix;
+ int nAffix; /* number of valid entries in Affix array */
+ int mAffix; /* allocated length of Affix array */
+ uint32 AffixSize;
+
+ /* Data for IspellDictData */
+
+ /* Array of sets of affixes */
+ uint32 *AffixDataOffset;
+ int nAffixData; /* number of affix sets */
+ int mAffixData; /* allocated number of affix sets */
+ char *AffixData;
+ uint32 AffixDataSize; /* allocated size of AffixData */
+ uint32 AffixDataEnd; /* end of data in AffixData */
+
+ /* Prefix tree which stores a word list */
+ NodeArray DictNodes;
+
+ /* Prefix tree which stores a prefix list */
+ NodeArray PrefixNodes;
+
+ /* Prefix tree which stores a suffix list */
+ NodeArray SuffixNodes;
- /* These are used to allocate "compact" data without palloc overhead */
- char *firstfree; /* first free address (always maxaligned) */
- size_t avail; /* free space remaining at firstfree */
+ /* Array of compound affixes */
+ CMPDAffix *CompoundAffix;
+ int nCompoundAffix; /* number of entries of CompoundAffix */
+} IspellDictBuild;
+
+#define AffixDataGet(d, i) ((d)->AffixData + (d)->AffixDataOffset[i])
+
+/*
+ * IspellDict is used within NINormalizeWord.
+ */
+typedef struct IspellDict
+{
+ /*
+ * Pointer to a DSM location of IspellDictData. Should be retrieved on
+ * every dispell_lexize() call.
+ */
+ IspellDictData *dict;
+ /*
+ * Array of regular expression of affixes. Each regular expression is
+ * compiled only on demand.
+ */
+ AffixReg *reg;
+ /*
+ * Memory context for compiling regular expressions.
+ */
+ MemoryContext dictCtx;
} IspellDict;
extern TSLexeme *NINormalizeWord(IspellDict *Conf, char *word);
-extern void NIStartBuild(IspellDict *Conf);
-extern void NIImportAffixes(IspellDict *Conf, const char *filename);
-extern void NIImportDictionary(IspellDict *Conf, const char *filename);
-extern void NISortDictionary(IspellDict *Conf);
-extern void NISortAffixes(IspellDict *Conf);
-extern void NIFinishBuild(IspellDict *Conf);
+extern void NIStartBuild(IspellDictBuild *ConfBuild);
+extern void NIImportAffixes(IspellDictBuild *ConfBuild, const char *filename);
+extern void NIImportDictionary(IspellDictBuild *ConfBuild,
+ const char *filename);
+extern void NISortDictionary(IspellDictBuild *ConfBuild);
+extern void NISortAffixes(IspellDictBuild *ConfBuild);
+extern void NICopyData(IspellDictBuild *ConfBuild);
+extern void NIFinishBuild(IspellDictBuild *ConfBuild);
#endif
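For readers skimming the spell.h changes above: the core idea is that tree nodes refer to each other by byte offset from the start of a buffer rather than by pointer, so the buffer stays valid when it is mapped at a different address in another backend. The following is only a minimal standalone sketch of that idea, not code from the patch. DemoNode, demo_node_get(), demo_node_add(), demo_chain_length() and demo_relocation_check() are hypothetical names, and the node layout is simplified compared to SPNodeData.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_INVALID_OFFSET 0xFFFFFFFFu	/* plays the role of ISPELL_INVALID_OFFSET */

/* Hypothetical, simplified node: one character plus the offset of a child. */
typedef struct
{
	uint8_t		val;
	uint32_t	child_offset;	/* byte offset from the start of the buffer */
} DemoNode;

/* Same idea as DictNodeGet(): turn an offset into a pointer, or NULL. */
static DemoNode *
demo_node_get(char *base, uint32_t offset)
{
	if (offset == DEMO_INVALID_OFFSET)
		return NULL;
	return (DemoNode *) (base + offset);
}

/* Append a childless node at *end and return the offset it was stored at. */
static uint32_t
demo_node_add(char *base, uint32_t *end, uint8_t val)
{
	uint32_t	offset;
	DemoNode	node = {val, DEMO_INVALID_OFFSET};

	assert(base != NULL && end != NULL);
	offset = *end;
	memcpy(base + offset, &node, sizeof(node));
	*end += (uint32_t) sizeof(node);
	return offset;
}

/* Follow child offsets starting at root and count the nodes visited. */
static int
demo_chain_length(char *base, uint32_t root)
{
	int			n = 0;
	DemoNode   *node;

	for (node = demo_node_get(base, root); node != NULL;
		 node = demo_node_get(base, node->child_offset))
		n++;
	return n;
}

/*
 * Build a two-node chain, copy the raw bytes to a different address and
 * walk the copy. Returns the chain length seen in the copy, or -1 if
 * allocation failed.
 */
static int
demo_relocation_check(void)
{
	size_t		size = 4 * sizeof(DemoNode);
	char	   *buf = malloc(size);
	char	   *copy = malloc(size);
	uint32_t	end = 0;
	int			length = -1;

	if (buf != NULL && copy != NULL)
	{
		uint32_t	root = demo_node_add(buf, &end, 's');
		uint32_t	child = demo_node_add(buf, &end, 'k');

		demo_node_get(buf, root)->child_offset = child;
		memcpy(copy, buf, end);	/* stands in for mapping the segment elsewhere */
		length = demo_chain_length(copy, root);
	}
	free(buf);
	free(copy);
	return length;
}
```

With pointer fields the walk over the copy would still chase addresses inside the original buffer; with offsets only the base address changes.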
0005-pg-ts-shared-dictinaries-view-v9.patch (text/plain; charset=us-ascii)
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 30e6741305..fe7d31c057 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -8216,6 +8216,11 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
<entry>time zone names</entry>
</row>
+ <row>
+ <entry><link linkend="view-pg-ts-shared-dictionaries"><structname>pg_ts_shared_dictionaries</structname></link></entry>
+ <entry>dictionaries currently in shared memory</entry>
+ </row>
+
<row>
<entry><link linkend="view-pg-user"><structname>pg_user</structname></link></entry>
<entry>database users</entry>
@@ -10971,6 +10976,63 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</sect1>
+ <sect1 id="view-pg-ts-shared-dictionaries">
+ <title><structname>pg_ts_shared_dictionaries</structname></title>
+
+ <indexterm zone="view-pg-ts-shared-dictionaries">
+ <primary>pg_ts_shared_dictionaries</primary>
+ </indexterm>
+
+ <para>
+ The <structname>pg_ts_shared_dictionaries</structname> view provides a
+ listing of all text search dictionaries that are currently allocated in
+ shared memory. The size of available space in shared memory is controlled by
+ <xref linkend="guc-shared-buffers"/>. A dictionary may have an option which
+ controls allocation in shared memory (see <xref linkend="textsearch-ispell-dictionary"/>).
+ </para>
+
+ <table>
+ <title><structname>pg_ts_shared_dictionaries</structname> Columns</title>
+
+ <tgroup cols="4">
+ <thead>
+ <row>
+ <entry>Name</entry>
+ <entry>Type</entry>
+ <entry>References</entry>
+ <entry>Description</entry>
+ </row>
+ </thead>
+ <tbody>
+ <row>
+ <entry><structfield>dictoid</structfield></entry>
+ <entry><type>oid</type></entry>
+ <entry><literal><link linkend="catalog-pg-ts-dict"><structname>pg_ts_dict</structname></link>.oid</literal></entry>
+ <entry>The OID of the text search dictionary located in shared memory</entry>
+ </row>
+ <row>
+ <entry><structfield>schemaname</structfield></entry>
+ <entry><type>name</type></entry>
+ <entry><literal><link linkend="catalog-pg-namespace"><structname>pg_namespace</structname></link>.nspname</literal></entry>
+ <entry>The name of the schema containing the text search dictionary</entry>
+ </row>
+ <row>
+ <entry><structfield>dictname</structfield></entry>
+ <entry><type>name</type></entry>
+ <entry><literal><link linkend="catalog-pg-ts-dict"><structname>pg_ts_dict</structname></link>.dictname</literal></entry>
+ <entry>The text search dictionary name</entry>
+ </row>
+ <row>
+ <entry><structfield>size</structfield></entry>
+ <entry><type>bigint</type></entry>
+ <entry></entry>
+ <entry>Size of the text search dictionary in bytes</entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+ </sect1>
+
<sect1 id="view-pg-user">
<title><structname>pg_user</structname></title>
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index f3288fbb3f..02e8e8aa90 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -3047,6 +3047,12 @@ CREATE TEXT SEARCH DICTIONARY english_stem (
parameter to a value greater than zero or to a value <literal>-1</literal>.
</para>
+ <para>
+ The list of dictionaries currently located in shared memory can be retrieved from the
+ <link linkend="view-pg-ts-shared-dictionaries"><structname>pg_ts_shared_dictionaries</structname></link>
+ view.
+ </para>
+
</sect2>
</sect1>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 5e6e8a64f6..ab7ee973d9 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -506,6 +506,9 @@ CREATE VIEW pg_config AS
REVOKE ALL on pg_config FROM PUBLIC;
REVOKE EXECUTE ON FUNCTION pg_config() FROM PUBLIC;
+CREATE VIEW pg_ts_shared_dictionaries AS
+ SELECT * FROM pg_ts_shared_dictionaries();
+
-- Statistics views
CREATE VIEW pg_stat_all_tables AS
diff --git a/src/backend/tsearch/ts_shared.c b/src/backend/tsearch/ts_shared.c
index 2a8b80bce8..fe632ecf6f 100644
--- a/src/backend/tsearch/ts_shared.c
+++ b/src/backend/tsearch/ts_shared.c
@@ -13,11 +13,18 @@
*/
#include "postgres.h"
+#include "funcapi.h"
+#include "miscadmin.h"
+
+#include "access/htup_details.h"
+#include "catalog/pg_ts_dict.h"
#include "lib/dshash.h"
#include "storage/lwlock.h"
#include "storage/shmem.h"
#include "tsearch/ts_shared.h"
+#include "utils/builtins.h"
#include "utils/hashutils.h"
+#include "utils/lsyscache.h"
#include "utils/memutils.h"
@@ -377,3 +384,100 @@ recheck_table:
MemoryContextSwitchTo(old_context);
}
+
+/*
+ * pg_ts_shared_dictionaries - SQL SRF showing dictionaries currently in
+ * shared memory.
+ */
+Datum
+pg_ts_shared_dictionaries(PG_FUNCTION_ARGS)
+{
+ ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
+ MemoryContext oldcontext;
+ TupleDesc tupdesc;
+ Tuplestorestate *tupstore;
+ Relation rel;
+ HeapTuple tuple;
+ SysScanDesc scan;
+
+ /* check to see if caller supports us returning a tuplestore */
+ if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("set-valued function called in context that cannot accept a set")));
+ if (!(rsinfo->allowedModes & SFRM_Materialize))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("materialize mode required, but it is not " \
+ "allowed in this context")));
+
+ /* Build a tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ elog(ERROR, "return type must be a row type");
+
+ /* Build tuplestore to hold the result rows */
+ oldcontext = MemoryContextSwitchTo(rsinfo->econtext->ecxt_per_query_memory);
+
+ tupstore = tuplestore_begin_heap(true, false, work_mem);
+ rsinfo->returnMode = SFRM_Materialize;
+ rsinfo->setResult = tupstore;
+ rsinfo->setDesc = tupdesc;
+
+ MemoryContextSwitchTo(oldcontext);
+
+ init_dict_table();
+
+ /*
+ * If a hash table wasn't created return zero records.
+ */
+ if (!DsaPointerIsValid(tsearch_ctl->dict_table_handle))
+ {
+ tuplestore_donestoring(tupstore);
+
+ PG_RETURN_VOID();
+ }
+
+ /* Start to scan pg_ts_dict */
+ rel = heap_open(TSDictionaryRelationId, AccessShareLock);
+ scan = systable_beginscan(rel, InvalidOid, false, NULL, 0, NULL);
+
+ while (HeapTupleIsValid(tuple = systable_getnext(scan)))
+ {
+ Datum values[4];
+ bool nulls[4];
+ Form_pg_ts_dict dict = (Form_pg_ts_dict) GETSTRUCT(tuple);
+ Oid dictid = HeapTupleGetOid(tuple);
+ TsearchDictEntry *entry;
+ NameData dict_name;
+
+ /* Skip the dictionary if it isn't located in shared memory */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, &dictid, false);
+ if (!entry)
+ continue;
+
+ namecpy(&dict_name, &dict->dictname);
+
+ memset(nulls, 0, sizeof(nulls));
+
+ values[0] = ObjectIdGetDatum(dictid);
+
+ if (OidIsValid(dict->dictnamespace))
+ values[1] = CStringGetDatum(get_namespace_name(dict->dictnamespace));
+ else
+ nulls[1] = true;
+
+ values[2] = NameGetDatum(&dict_name);
+ values[3] = Int64GetDatum(entry->dict_size);
+
+ dshash_release_lock(dict_table, entry);
+
+ tuplestore_putvalues(tupstore, tupdesc, values, nulls);
+ }
+
+ systable_endscan(scan);
+ heap_close(rel, AccessShareLock);
+
+ tuplestore_donestoring(tupstore);
+
+ PG_RETURN_VOID();
+}
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 0fdb42f639..31cd0c91b2 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -4973,6 +4973,9 @@ DESCR("trigger for automatic update of tsvector column");
DATA(insert OID = 3759 ( get_current_ts_config PGNSP PGUID 12 1 0 0 0 f f f t f s s 0 0 3734 "" _null_ _null_ _null_ _null_ _null_ get_current_ts_config _null_ _null_ _null_ ));
DESCR("get current tsearch configuration");
+DATA(insert OID = 4213 ( pg_ts_shared_dictionaries PGNSP PGUID 12 1 10 0 0 f f f f t s s 0 0 2249 "" "{26,19,19,20}" "{o,o,o,o}" "{dictoid,schemaname,dictname,size}" _null_ _null_ pg_ts_shared_dictionaries _null_ _null_ _null_ ));
+DESCR("information about text search dictionaries currently in shared memory");
+
DATA(insert OID = 3736 ( regconfigin PGNSP PGUID 12 1 0 0 0 f f f t f s s 1 0 3734 "2275" _null_ _null_ _null_ _null_ _null_ regconfigin _null_ _null_ _null_ ));
DESCR("I/O");
DATA(insert OID = 3737 ( regconfigout PGNSP PGUID 12 1 0 0 0 f f f t f s s 1 0 2275 "3734" _null_ _null_ _null_ _null_ _null_ regconfigout _null_ _null_ _null_ ));
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 5e0597e091..d25b5f5ed9 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2211,6 +2211,11 @@ pg_timezone_names| SELECT pg_timezone_names.name,
pg_timezone_names.utc_offset,
pg_timezone_names.is_dst
FROM pg_timezone_names() pg_timezone_names(name, abbrev, utc_offset, is_dst);
+pg_ts_shared_dictionaries| SELECT pg_ts_shared_dictionaries.dictoid,
+ pg_ts_shared_dictionaries.schemaname,
+ pg_ts_shared_dictionaries.dictname,
+ pg_ts_shared_dictionaries.size
+ FROM pg_ts_shared_dictionaries() pg_ts_shared_dictionaries(dictoid, schemaname, dictname, size);
pg_user| SELECT pg_shadow.usename,
pg_shadow.usesysid,
pg_shadow.usecreatedb,
0006-Shared-memory-ispell-option-v9.patch (text/plain; charset=us-ascii)
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index 02e8e8aa90..65c2d6daa3 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -2829,6 +2829,7 @@ iconv -f ISO_8859-1 -t UTF-8 -o nn_no.dict nn_NO.dic
<programlisting>
CREATE TEXT SEARCH DICTIONARY english_hunspell (
TEMPLATE = ispell,
+ Shareable = false,
DictFile = en_us,
AffFile = en_us,
Stopwords = english);
@@ -2843,6 +2844,9 @@ CREATE TEXT SEARCH DICTIONARY english_hunspell (
The stop-words file has the same format explained above for the
<literal>simple</literal> dictionary type. The format of the other files is
not specified here but is available from the above-mentioned web sites.
+ <literal>Shareable</literal> controls loading into shared memory. By
+ default it is <literal>true</literal> (see more in
+ <xref linkend="textsearch-shared-dictionaries"/>).
</para>
<para>
@@ -3040,6 +3044,8 @@ CREATE TEXT SEARCH DICTIONARY english_stem (
both CPU and time-consuming. Instead of doing this in each backend when
it needs a dictionary for the first time, the compiled dictionary may be
stored in shared memory so that it may be reused by other backends.
+ Currently only <application>Ispell</application> supports loading into
+ shared memory.
</para>
<para>
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index 6294a52af3..1475cdb908 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -38,7 +38,8 @@ typedef struct
} DictISpell;
static void parse_dictoptions(List *dictoptions,
- char **dictfile, char **afffile, char **stopfile);
+ char **dictfile, char **afffile, char **stopfile,
+ bool *isshared);
static void *dispell_build(List *dictoptions, Size *size);
Datum
@@ -48,15 +49,21 @@ dispell_init(PG_FUNCTION_ARGS)
DictISpell *d;
void *dict_location;
char *stopfile;
+ bool isshared;
d = (DictISpell *) palloc0(sizeof(DictISpell));
- parse_dictoptions(init_data->dictoptions, NULL, NULL, &stopfile);
+ parse_dictoptions(init_data->dictoptions, NULL, NULL, &stopfile, &isshared);
+ /* Make stop word list */
if (stopfile)
readstoplist(stopfile, &(d->stoplist), lowerstr);
- dict_location = ts_dict_shmem_location(init_data, dispell_build);
+ /* Build the dictionary itself or get it from shared memory */
+ if (isshared)
+ dict_location = ts_dict_shmem_location(init_data, dispell_build);
+ else
+ dict_location = dispell_build(init_data->dictoptions, NULL);
Assert(dict_location);
d->obj.dict = (IspellDictData *) dict_location;
@@ -110,9 +117,10 @@ dispell_lexize(PG_FUNCTION_ARGS)
static void
parse_dictoptions(List *dictoptions, char **dictfile, char **afffile,
- char **stopfile)
+ char **stopfile, bool *isshared)
{
ListCell *l;
+ bool isshared_defined = false;
if (dictfile)
*dictfile = NULL;
@@ -120,6 +128,8 @@ parse_dictoptions(List *dictoptions, char **dictfile, char **afffile,
*afffile = NULL;
if (stopfile)
*stopfile = NULL;
+ if (isshared)
+ *isshared = true;
foreach(l, dictoptions)
{
@@ -158,6 +168,19 @@ parse_dictoptions(List *dictoptions, char **dictfile, char **afffile,
errmsg("multiple StopWords parameters")));
*stopfile = defGetString(defel);
}
+ else if (pg_strcasecmp(defel->defname, "Shareable") == 0)
+ {
+ if (!isshared)
+ continue;
+
+ if (isshared_defined)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("multiple Shareable parameters")));
+
+ *isshared = defGetBoolean(defel);
+ isshared_defined = true;
+ }
else
{
ereport(ERROR,
@@ -180,7 +203,7 @@ dispell_build(List *dictoptions, Size *size)
char *dictfile,
*afffile;
- parse_dictoptions(dictoptions, &dictfile, &afffile, NULL);
+ parse_dictoptions(dictoptions, &dictfile, &afffile, NULL, NULL);
if (!afffile)
{
@@ -212,6 +235,7 @@ dispell_build(List *dictoptions, Size *size)
NIFinishBuild(&build);
/* Return the buffer and its size */
- *size = build.dict_size;
+ if (size)
+ *size = build.dict_size;
return build.dict;
}
diff --git a/src/test/regress/expected/tsdicts.out b/src/test/regress/expected/tsdicts.out
index 0c1d7c7675..ea4dda8091 100644
--- a/src/test/regress/expected/tsdicts.out
+++ b/src/test/regress/expected/tsdicts.out
@@ -194,6 +194,7 @@ SELECT ts_lexize('hunspell', 'footballyklubber');
-- Test ISpell dictionary with hunspell affix file with FLAG long parameter
CREATE TEXT SEARCH DICTIONARY hunspell_long (
Template=ispell,
+ Shareable=false,
DictFile=hunspell_sample_long,
AffFile=hunspell_sample_long
);
@@ -290,6 +291,7 @@ SELECT ts_lexize('hunspell_long', 'footballyklubber');
-- Test ISpell dictionary with hunspell affix file with FLAG num parameter
CREATE TEXT SEARCH DICTIONARY hunspell_num (
Template=ispell,
+ Shareable=false,
DictFile=hunspell_sample_num,
AffFile=hunspell_sample_num
);
@@ -588,3 +590,77 @@ CREATE TEXT SEARCH DICTIONARY tsdict_case
"AffFile" = ispell_sample
);
ERROR: unrecognized Ispell parameter: "DictFile"
+-- Test shared dictionaries
+CREATE TEXT SEARCH DICTIONARY shared_ispell (
+ Template=ispell,
+ DictFile=ispell_sample,
+ AffFile=ispell_sample
+);
+SHOW max_shared_dictionaries_size;
+ max_shared_dictionaries_size
+------------------------------
+ 100MB
+(1 row)
+
+-- Make sure that dictionaries in shared memory
+SELECT ts_lexize('ispell', 'skies');
+ ts_lexize
+-----------
+ {sky}
+(1 row)
+
+-- We cannot disable shared dictionaries since there are loaded dictionaries
+ALTER SYSTEM SET max_shared_dictionaries_size to 0;
+ERROR: Cannot disable shared dictionaries, there are loaded dictionaries into shared memory.
+-- But we can decrease maximum allowed size of loaded dictionaries
+ALTER SYSTEM SET max_shared_dictionaries_size to '50MB';
+-- Or set no limit
+ALTER SYSTEM SET max_shared_dictionaries_size to -1;
+SELECT pg_reload_conf();
+ pg_reload_conf
+----------------
+ t
+(1 row)
+
+SELECT ts_lexize('hunspell', 'skies');
+ ts_lexize
+-----------
+ {sky}
+(1 row)
+
+SELECT ts_lexize('shared_ispell', 'skies');
+ ts_lexize
+-----------
+ {sky}
+(1 row)
+
+SELECT schemaname, dictname FROM pg_ts_shared_dictionaries;
+ schemaname | dictname
+------------+---------------
+ public | ispell
+ public | hunspell
+ public | shared_ispell
+(3 rows)
+
+-- shared_ispell space should be released in shared memory
+DROP TEXT SEARCH DICTIONARY shared_ispell;
+-- Make sure that dictionaries in shared memory, DROP invalidates cache
+SELECT ts_lexize('ispell', 'skies');
+ ts_lexize
+-----------
+ {sky}
+(1 row)
+
+SELECT ts_lexize('hunspell', 'skies');
+ ts_lexize
+-----------
+ {sky}
+(1 row)
+
+SELECT schemaname, dictname FROM pg_ts_shared_dictionaries;
+ schemaname | dictname
+------------+----------
+ public | ispell
+ public | hunspell
+(2 rows)
+
diff --git a/src/test/regress/sql/tsdicts.sql b/src/test/regress/sql/tsdicts.sql
index 1633c0d066..c4dbadb3d1 100644
--- a/src/test/regress/sql/tsdicts.sql
+++ b/src/test/regress/sql/tsdicts.sql
@@ -51,6 +51,7 @@ SELECT ts_lexize('hunspell', 'footballyklubber');
-- Test ISpell dictionary with hunspell affix file with FLAG long parameter
CREATE TEXT SEARCH DICTIONARY hunspell_long (
Template=ispell,
+ Shareable=false,
DictFile=hunspell_sample_long,
AffFile=hunspell_sample_long
);
@@ -75,6 +76,7 @@ SELECT ts_lexize('hunspell_long', 'footballyklubber');
-- Test ISpell dictionary with hunspell affix file with FLAG num parameter
CREATE TEXT SEARCH DICTIONARY hunspell_num (
Template=ispell,
+ Shareable=false,
DictFile=hunspell_sample_num,
AffFile=hunspell_sample_num
);
@@ -196,3 +198,38 @@ CREATE TEXT SEARCH DICTIONARY tsdict_case
"DictFile" = ispell_sample,
"AffFile" = ispell_sample
);
+
+-- Test shared dictionaries
+CREATE TEXT SEARCH DICTIONARY shared_ispell (
+ Template=ispell,
+ DictFile=ispell_sample,
+ AffFile=ispell_sample
+);
+
+SHOW max_shared_dictionaries_size;
+
+-- Make sure that dictionaries in shared memory
+SELECT ts_lexize('ispell', 'skies');
+
+-- We cannot disable shared dictionaries since there are loaded dictionaries
+ALTER SYSTEM SET max_shared_dictionaries_size to 0;
+-- But we can decrease maximum allowed size of loaded dictionaries
+ALTER SYSTEM SET max_shared_dictionaries_size to '50MB';
+-- Or set no limit
+ALTER SYSTEM SET max_shared_dictionaries_size to -1;
+SELECT pg_reload_conf();
+
+SELECT ts_lexize('hunspell', 'skies');
+
+SELECT ts_lexize('shared_ispell', 'skies');
+
+SELECT schemaname, dictname FROM pg_ts_shared_dictionaries;
+
+-- shared_ispell space should be released in shared memory
+DROP TEXT SEARCH DICTIONARY shared_ispell;
+
+-- Make sure that dictionaries in shared memory, DROP invalidates cache
+SELECT ts_lexize('ispell', 'skies');
+SELECT ts_lexize('hunspell', 'skies');
+
+SELECT schemaname, dictname FROM pg_ts_shared_dictionaries;
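The Shareable handling in parse_dictoptions() above follows a common pattern: default to true, match the option name case-insensitively, and reject a second occurrence. A rough standalone sketch of that pattern, without PostgreSQL headers, might look like the following. DemoOption, DemoStatus and demo_parse_shareable() are hypothetical names, and the boolean parsing here accepts only "true" and "false", which is narrower than what defGetBoolean() accepts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <strings.h>	/* strcasecmp (POSIX) */

/* Hypothetical stand-in for a DefElem: an option name and its string value. */
typedef struct
{
	const char *name;
	const char *value;
} DemoOption;

typedef enum
{
	DEMO_OK,
	DEMO_DUPLICATE,			/* "multiple Shareable parameters" */
	DEMO_BAD_VALUE
} DemoStatus;

/*
 * Scan the option list for "Shareable" (case-insensitive). *isshared is
 * true unless the option says otherwise; a repeated option is an error.
 * Other option names are ignored here, as if handled by other branches.
 */
static DemoStatus
demo_parse_shareable(const DemoOption *opts, size_t nopts, bool *isshared)
{
	bool		seen = false;
	size_t		i;

	assert(isshared != NULL);
	*isshared = true;			/* default, as in the patch */

	for (i = 0; i < nopts; i++)
	{
		if (strcasecmp(opts[i].name, "Shareable") != 0)
			continue;

		if (seen)
			return DEMO_DUPLICATE;

		if (strcasecmp(opts[i].value, "true") == 0)
			*isshared = true;
		else if (strcasecmp(opts[i].value, "false") == 0)
			*isshared = false;
		else
			return DEMO_BAD_VALUE;

		seen = true;
	}
	return DEMO_OK;
}
```

The separate "seen" flag matters because the default is already true, so the value alone cannot tell a first occurrence from a repeated one.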
On Wed, Mar 21, 2018 at 12:00:52PM +0300, Arthur Zakirov wrote:
On Tue, Mar 20, 2018 at 09:30:15PM +0100, Tomas Vondra wrote:
I wonder if these restrictions are needed? I mean, why not allow setting
max_shared_dictionaries_size below the size of loaded dictionaries?
Of course, on the one hand those restrictions seem sensible. On the other
hand, perhaps in some cases it would be useful to allow violating them?
I mean, why not simply disable loading of new dictionaries when
(max_shared_dictionaries_size < loaded_size)?
Maybe I'm over-thinking this though. It's probably safer and less
surprising to enforce the restrictions.
Hm, yes, in some cases this check may be over-engineering. I thought that
it was reasonable and safer in the v7 patch. But there are similar GUCs,
wal_keep_segments and max_wal_size, which don't do additional checks.
And people are fine with them. So I removed that check from the variable.
Please find the attached new version of the patch.
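To make the alternative discussed above concrete (no check when the GUC is changed, only a check when a new dictionary is about to be loaded), here is a small sketch. It is not taken from the patch. demo_can_share() is a hypothetical helper, and it assumes the semantics used elsewhere in the thread: 0 disables sharing, -1 means no limit, and any other value is a byte limit on the total size of loaded dictionaries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide whether a dictionary of new_size bytes may be placed in shared
 * memory, given the configured limit and the bytes already loaded.
 * Assumed semantics: max_size == 0 disables sharing, max_size == -1 means
 * unlimited, otherwise max_size is a limit in bytes.
 */
static bool
demo_can_share(int64_t max_size, int64_t loaded_size, int64_t new_size)
{
	assert(loaded_size >= 0 && new_size >= 0);

	if (max_size == 0)
		return false;		/* sharing disabled */
	if (max_size < 0)
		return true;		/* no limit */

	/*
	 * If the limit was lowered below loaded_size, nothing is evicted;
	 * new dictionaries are simply not added to shared memory.
	 */
	if (loaded_size > max_size)
		return false;
	return new_size <= max_size - loaded_size;
}
```

Under this scheme a backend that gets false would presumably fall back to building the dictionary in its own memory, as happens with Shareable = false.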
I forgot to fix regression tests for max_shared_dictionaries_size. Also
I'm not confident about using pg_reload_conf() in regression tests.
I haven't found where pg_reload_conf() is used in tests. So I removed
max_shared_dictionaries_size tests for now.
Sorry for the noise.
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Attachments:
0001-Fix-ispell-memory-handling-v10.patch (text/plain; charset=us-ascii)
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index b9fdd77e19..e071994523 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -78,6 +78,8 @@
#define tmpalloc(sz) MemoryContextAlloc(Conf->buildCxt, (sz))
#define tmpalloc0(sz) MemoryContextAllocZero(Conf->buildCxt, (sz))
+#define tmpstrdup(str) MemoryContextStrdup(Conf->buildCxt, (str))
+
/*
* Prepare for constructing an ISpell dictionary.
*
@@ -498,7 +500,7 @@ NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
strcpy(Conf->Spell[Conf->nspell]->word, word);
Conf->Spell[Conf->nspell]->p.flag = (*flag != '\0')
- ? cpstrdup(Conf, flag) : VoidString;
+ ? tmpstrdup(flag) : VoidString;
Conf->nspell++;
}
@@ -1040,7 +1042,7 @@ setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
entry->flag.i = i;
}
else
- entry->flag.s = cpstrdup(Conf, s);
+ entry->flag.s = tmpstrdup(s);
entry->flagMode = Conf->flagMode;
entry->value = val;
@@ -1536,6 +1538,9 @@ nextline:
return;
isnewformat:
+ pfree(recoded);
+ pfree(pstr);
+
if (oldformat)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
0002-Change-tmplinit-argument-v10.patch (text/plain; charset=us-ascii)
diff --git a/contrib/dict_int/dict_int.c b/contrib/dict_int/dict_int.c
index 56ede37089..e11d1129e9 100644
--- a/contrib/dict_int/dict_int.c
+++ b/contrib/dict_int/dict_int.c
@@ -30,7 +30,7 @@ PG_FUNCTION_INFO_V1(dintdict_lexize);
Datum
dintdict_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictInt *d;
ListCell *l;
@@ -38,7 +38,7 @@ dintdict_init(PG_FUNCTION_ARGS)
d->maxlen = 6;
d->rejectlong = false;
- foreach(l, dictoptions)
+ foreach(l, init_data->dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/contrib/dict_xsyn/dict_xsyn.c b/contrib/dict_xsyn/dict_xsyn.c
index a79ece240c..c3146bae3c 100644
--- a/contrib/dict_xsyn/dict_xsyn.c
+++ b/contrib/dict_xsyn/dict_xsyn.c
@@ -140,7 +140,7 @@ read_dictionary(DictSyn *d, const char *filename)
Datum
dxsyn_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSyn *d;
ListCell *l;
char *filename = NULL;
@@ -153,7 +153,7 @@ dxsyn_init(PG_FUNCTION_ARGS)
d->matchsynonyms = false;
d->keepsynonyms = true;
- foreach(l, dictoptions)
+ foreach(l, init_data->dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/contrib/unaccent/unaccent.c b/contrib/unaccent/unaccent.c
index 247c202755..2e66331ed8 100644
--- a/contrib/unaccent/unaccent.c
+++ b/contrib/unaccent/unaccent.c
@@ -267,12 +267,12 @@ PG_FUNCTION_INFO_V1(unaccent_init);
Datum
unaccent_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
TrieChar *rootTrie = NULL;
bool fileloaded = false;
ListCell *l;
- foreach(l, dictoptions)
+ foreach(l, init_data->dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/commands/tsearchcmds.c b/src/backend/commands/tsearchcmds.c
index 3a843512d1..967fe5a6f4 100644
--- a/src/backend/commands/tsearchcmds.c
+++ b/src/backend/commands/tsearchcmds.c
@@ -386,17 +386,22 @@ verify_dictoptions(Oid tmplId, List *dictoptions)
}
else
{
+ DictInitData init_data;
+
/*
* Copy the options just in case init method thinks it can scribble on
* them ...
*/
dictoptions = copyObject(dictoptions);
+ init_data.dictoptions = dictoptions;
+ init_data.dictid = InvalidOid;
+
/*
* Call the init method and see if it complains. We don't worry about
* it leaking memory, since our command will soon be over anyway.
*/
- (void) OidFunctionCall1(initmethod, PointerGetDatum(dictoptions));
+ (void) OidFunctionCall1(initmethod, PointerGetDatum(&init_data));
}
ReleaseSysCache(tup);
diff --git a/src/backend/snowball/dict_snowball.c b/src/backend/snowball/dict_snowball.c
index 78c9f73ef0..db12606fdd 100644
--- a/src/backend/snowball/dict_snowball.c
+++ b/src/backend/snowball/dict_snowball.c
@@ -181,14 +181,14 @@ locate_stem_module(DictSnowball *d, const char *lang)
Datum
dsnowball_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSnowball *d;
bool stoploaded = false;
ListCell *l;
d = (DictSnowball *) palloc0(sizeof(DictSnowball));
- foreach(l, dictoptions)
+ foreach(l, init_data->dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index edc6547700..6d0dedbefb 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -29,7 +29,7 @@ typedef struct
Datum
dispell_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictISpell *d;
bool affloaded = false,
dictloaded = false,
@@ -40,7 +40,7 @@ dispell_init(PG_FUNCTION_ARGS)
NIStartBuild(&(d->obj));
- foreach(l, dictoptions)
+ foreach(l, init_data->dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_simple.c b/src/backend/tsearch/dict_simple.c
index ac6a24eba5..80f2d1535d 100644
--- a/src/backend/tsearch/dict_simple.c
+++ b/src/backend/tsearch/dict_simple.c
@@ -29,7 +29,7 @@ typedef struct
Datum
dsimple_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSimple *d = (DictSimple *) palloc0(sizeof(DictSimple));
bool stoploaded = false,
acceptloaded = false;
@@ -37,7 +37,7 @@ dsimple_init(PG_FUNCTION_ARGS)
d->accept = true; /* default */
- foreach(l, dictoptions)
+ foreach(l, init_data->dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_synonym.c b/src/backend/tsearch/dict_synonym.c
index c011886cb0..29f86472a4 100644
--- a/src/backend/tsearch/dict_synonym.c
+++ b/src/backend/tsearch/dict_synonym.c
@@ -91,7 +91,7 @@ compareSyn(const void *a, const void *b)
Datum
dsynonym_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSyn *d;
ListCell *l;
char *filename = NULL;
@@ -104,7 +104,7 @@ dsynonym_init(PG_FUNCTION_ARGS)
char *line = NULL;
uint16 flags = 0;
- foreach(l, dictoptions)
+ foreach(l, init_data->dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_thesaurus.c b/src/backend/tsearch/dict_thesaurus.c
index 24364e646d..7f87ed1c97 100644
--- a/src/backend/tsearch/dict_thesaurus.c
+++ b/src/backend/tsearch/dict_thesaurus.c
@@ -604,7 +604,7 @@ compileTheSubstitute(DictThesaurus *d)
Datum
thesaurus_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictThesaurus *d;
char *subdictname = NULL;
bool fileloaded = false;
@@ -612,7 +612,7 @@ thesaurus_init(PG_FUNCTION_ARGS)
d = (DictThesaurus *) palloc0(sizeof(DictThesaurus));
- foreach(l, dictoptions)
+ foreach(l, init_data->dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/utils/cache/ts_cache.c b/src/backend/utils/cache/ts_cache.c
index 3d5c194148..adb9c60b72 100644
--- a/src/backend/utils/cache/ts_cache.c
+++ b/src/backend/utils/cache/ts_cache.c
@@ -39,6 +39,7 @@
#include "catalog/pg_ts_template.h"
#include "commands/defrem.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_public.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
#include "utils/fmgroids.h"
@@ -314,6 +315,7 @@ lookup_ts_dictionary_cache(Oid dictId)
if (OidIsValid(template->tmplinit))
{
+ DictInitData init_data;
List *dictoptions;
Datum opt;
bool isnull;
@@ -333,9 +335,12 @@ lookup_ts_dictionary_cache(Oid dictId)
else
dictoptions = deserialize_deflist(opt);
+ init_data.dictoptions = dictoptions;
+ init_data.dictid = dictId;
+
entry->dictData =
DatumGetPointer(OidFunctionCall1(template->tmplinit,
- PointerGetDatum(dictoptions)));
+ PointerGetDatum(&init_data)));
MemoryContextSwitchTo(oldcontext);
}
diff --git a/src/include/tsearch/ts_public.h b/src/include/tsearch/ts_public.h
index 0b7a5aa68e..723862981d 100644
--- a/src/include/tsearch/ts_public.h
+++ b/src/include/tsearch/ts_public.h
@@ -13,6 +13,7 @@
#ifndef _PG_TS_PUBLIC_H_
#define _PG_TS_PUBLIC_H_
+#include "nodes/pg_list.h"
#include "tsearch/ts_type.h"
/*
@@ -84,6 +85,19 @@ extern bool searchstoplist(StopList *s, char *key);
* Interface with dictionaries
*/
+/*
+ * Argument which is passed to a template's init method.
+ */
+typedef struct
+{
+ /*
+ * A dictionary option list for a template's init method. Should go first
+ * for backward compatibility.
+ */
+ List *dictoptions;
+ Oid dictid;
+} DictInitData;
+
/* return struct for any lexize function */
typedef struct
{
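A note for readers following the 0002 patch above: every template init method now receives a single struct carrying both the option list and the dictionary OID, and verify_dictoptions() passes InvalidOid because no catalog entry exists yet. The following is only a minimal standalone sketch of that calling convention, not the patch's code. OptionList, InitData and mock_dict_init() are hypothetical stand-ins for List, DictInitData and a template's init method, so that the example compiles without PostgreSQL headers.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int Oid;               /* stand-in for PostgreSQL's Oid */
#define InvalidOid ((Oid) 0)

/* Hypothetical stand-in for the List of DefElem options */
typedef struct OptionList
{
    const char *name;
    const char *value;
    struct OptionList *next;
} OptionList;

/* Simplified model of DictInitData: options plus the dictionary's OID */
typedef struct
{
    OptionList *dictoptions;
    Oid         dictid;
} InitData;

/*
 * Model of an init method: it receives one opaque pointer, walks the
 * options, and uses dictid to decide whether the result could be shared.
 * Returns the number of options seen; *shareable is set to 1 when the
 * OID is valid and to 0 otherwise (as in the verify_dictoptions() case).
 */
int
mock_dict_init(void *arg, int *shareable)
{
    InitData   *init_data = (InitData *) arg;
    int         noptions = 0;

    assert(init_data != NULL);

    for (OptionList *opt = init_data->dictoptions; opt != NULL; opt = opt->next)
        noptions++;

    *shareable = (init_data->dictid != InvalidOid);
    return noptions;
}
```

The point of the sketch is that the validation path and the real load path can share one init function, with the OID acting as the switch between "build privately" and "may be placed in shared memory".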
Attachment: 0003-Retreive-shared-location-for-dict-v10.patch (text/plain; charset=us-ascii)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 0d61dcb179..940f6d3e2f 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1425,6 +1425,42 @@ include_dir 'conf.d'
</listitem>
</varlistentry>
+ <varlistentry id="guc-max-shared-dictionaries-size" xreflabel="max_shared_dictionaries_size">
+ <term><varname>max_shared_dictionaries_size</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>max_shared_dictionaries_size</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Specifies the maximum amount of shared memory used to store full-text
+ search dictionaries. A value of <literal>-1</literal> means no limit on
+ the size of loaded dictionaries. A value of <literal>0</literal>
+ disables the shared dictionaries feature. The default is 100 megabytes
+ (<literal>100MB</literal>).
+ </para>
+
+ <para>
+ Currently only <application>Ispell</application> dictionaries (see
+ <xref linkend="textsearch-ispell-dictionary"/>) may be loaded into
+ shared memory. The first backend requesting the dictionary will
+ build it and copy it into shared memory, so that other backends can
+ reuse it without having to build it again.
+ </para>
+
+ <para>
+ If the size of simultaneously loaded dictionaries reaches the maximum
+ allowed size, additional dictionaries will be loaded into private backend
+ memory (effectively disabling the sharing).
+ </para>
+
+ <para>
+ This parameter can only be set in the <filename>postgresql.conf</filename>
+ file or on the server command line.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-huge-pages" xreflabel="huge_pages">
<term><varname>huge_pages</varname> (<type>enum</type>)
<indexterm>
diff --git a/src/backend/commands/tsearchcmds.c b/src/backend/commands/tsearchcmds.c
index 967fe5a6f4..742ff58c72 100644
--- a/src/backend/commands/tsearchcmds.c
+++ b/src/backend/commands/tsearchcmds.c
@@ -39,6 +39,7 @@
#include "nodes/makefuncs.h"
#include "parser/parse_func.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_shared.h"
#include "tsearch/ts_utils.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
@@ -518,6 +519,8 @@ RemoveTSDictionaryById(Oid dictId)
CatalogTupleDelete(relation, &tup->t_self);
+ ts_dict_shmem_release(dictId);
+
ReleaseSysCache(tup);
heap_close(relation, RowExclusiveLock);
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 0c86a581c0..c7dce8cac5 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -44,6 +44,7 @@
#include "storage/procsignal.h"
#include "storage/sinvaladt.h"
#include "storage/spin.h"
+#include "tsearch/ts_shared.h"
#include "utils/backend_random.h"
#include "utils/snapmgr.h"
@@ -150,6 +151,7 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
size = add_size(size, SyncScanShmemSize());
size = add_size(size, AsyncShmemSize());
size = add_size(size, BackendRandomShmemSize());
+ size = add_size(size, TsearchShmemSize());
#ifdef EXEC_BACKEND
size = add_size(size, ShmemBackendArraySize());
#endif
@@ -271,6 +273,11 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
AsyncShmemInit();
BackendRandomShmemInit();
+ /*
+ * Set up shared memory for tsearch
+ */
+ TsearchShmemInit();
+
#ifdef EXEC_BACKEND
/*
diff --git a/src/backend/tsearch/Makefile b/src/backend/tsearch/Makefile
index 227468ae9e..860cd196e9 100644
--- a/src/backend/tsearch/Makefile
+++ b/src/backend/tsearch/Makefile
@@ -26,7 +26,7 @@ DICTFILES_PATH=$(addprefix dicts/,$(DICTFILES))
OBJS = ts_locale.o ts_parse.o wparser.o wparser_def.o dict.o \
dict_simple.o dict_synonym.o dict_thesaurus.o \
dict_ispell.o regis.o spell.o \
- to_tsany.o ts_selfuncs.o ts_typanalyze.o ts_utils.o
+ to_tsany.o ts_selfuncs.o ts_shared.o ts_typanalyze.o ts_utils.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/tsearch/ts_shared.c b/src/backend/tsearch/ts_shared.c
new file mode 100644
index 0000000000..2a8b80bce8
--- /dev/null
+++ b/src/backend/tsearch/ts_shared.c
@@ -0,0 +1,379 @@
+/*-------------------------------------------------------------------------
+ *
+ * ts_shared.c
+ * tsearch shared memory management
+ *
+ * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/tsearch/ts_shared.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "lib/dshash.h"
+#include "storage/lwlock.h"
+#include "storage/shmem.h"
+#include "tsearch/ts_shared.h"
+#include "utils/hashutils.h"
+#include "utils/memutils.h"
+
+
+/*
+ * Hash table entries representing shared dictionaries.
+ */
+typedef struct
+{
+ Oid dict_id;
+ dsm_handle dict_dsm;
+ Size dict_size;
+
+ /* How many backends have DSM mapping */
+ uint32 refcnt;
+} TsearchDictEntry;
+
+static dshash_table *dict_table = NULL;
+
+/*
+ * Information about the main shmem segment, used to coordinate
+ * access to the hash table and dictionaries.
+ */
+typedef struct
+{
+ dsa_handle area;
+ dshash_table_handle dict_table_handle;
+
+ /* Total size in bytes of dictionaries loaded into shared memory */
+ Size loaded_size;
+
+ LWLock lock;
+} TsearchCtlData;
+
+static TsearchCtlData *tsearch_ctl;
+
+/*
+ * Maximum allowed amount of shared memory for shared dictionaries,
+ * in kilobytes. Default value is 100MB.
+ */
+int max_shared_dictionaries_size = 100 * 1024;
+
+static void init_dict_table(void);
+
+/* Parameters for dict_table */
+static const dshash_parameters dict_table_params ={
+ sizeof(Oid),
+ sizeof(TsearchDictEntry),
+ dshash_memcmp,
+ dshash_memhash,
+ LWTRANCHE_TSEARCH_TABLE
+};
+
+/*
+ * Build the dictionary using the allocate_cb callback. If
+ * max_shared_dictionaries_size is TS_DICT_SHMEM_UNLIMITED, or if it is
+ * greater than 0 and there is enough space left in shared memory, copy the
+ * dictionary into DSM.
+ *
+ * If max_shared_dictionaries_size isn't 0 then try to find the dictionary in
+ * shared hash table first. If it was built by someone earlier just return its
+ * location in DSM.
+ *
+ * initoptions: an argument used within a template's init method.
+ * allocate_cb: function to build the dictionary, if it wasn't found in DSM.
+ *
+ * Returns address in the dynamic shared memory segment or in backend memory.
+ */
+void *
+ts_dict_shmem_location(DictInitData *initoptions,
+ ts_dict_build_callback allocate_cb)
+{
+ TsearchDictEntry *entry;
+ bool found;
+ dsm_segment *seg;
+ void *dict,
+ *dict_location;
+
+#define CHECK_SHARED_SPACE() \
+ if (max_shared_dictionaries_size != TS_DICT_SHMEM_UNLIMITED && \
+ entry->dict_size + tsearch_ctl->loaded_size > \
+ max_shared_dictionaries_size * 1024L) \
+ { \
+ LWLockRelease(&tsearch_ctl->lock); \
+ ereport(LOG, \
+ (errmsg("there is no space in shared memory for text search " \
+ "dictionary %u, it will be loaded into backend's memory", \
+ initoptions->dictid))); \
+ dshash_delete_entry(dict_table, entry); \
+ return dict; \
+ } \
+
+ init_dict_table();
+
+ /*
+ * Build the dictionary in the backend's memory if the hash table wasn't
+ * created or dictid is invalid (this may happen if the dictionary's init
+ * method was called within verify_dictoptions()).
+ */
+ if (!DsaPointerIsValid(tsearch_ctl->dict_table_handle) ||
+ !OidIsValid(initoptions->dictid))
+ {
+ Size dict_size;
+
+ dict = allocate_cb(initoptions->dictoptions, &dict_size);
+
+ return dict;
+ }
+
+ /* Try to find an entry in the hash table */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, &initoptions->dictid,
+ false);
+
+ if (entry)
+ {
+ seg = dsm_find_mapping(entry->dict_dsm);
+ if (!seg)
+ {
+ seg = dsm_attach(entry->dict_dsm);
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+ }
+
+ entry->refcnt++;
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+ }
+
+ /* The dictionary hasn't been loaded into memory yet */
+ entry = (TsearchDictEntry *) dshash_find_or_insert(dict_table,
+ &initoptions->dictid,
+ &found);
+
+ if (found)
+ {
+ /*
+ * Someone concurrently inserted a dictionary entry since the first time
+ * we checked.
+ */
+ seg = dsm_attach(entry->dict_dsm);
+
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+
+ entry->refcnt++;
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+ }
+
+ /* Build the dictionary */
+ dict = allocate_cb(initoptions->dictoptions, &entry->dict_size);
+
+ LWLockAcquire(&tsearch_ctl->lock, LW_SHARED);
+
+ /* We can only get here when shared dictionaries are enabled */
+ Assert(max_shared_dictionaries_size > 0 ||
+ max_shared_dictionaries_size == TS_DICT_SHMEM_UNLIMITED);
+
+ /* Before allocating a DSM segment check remaining shared space */
+ CHECK_SHARED_SPACE();
+
+ LWLockRelease(&tsearch_ctl->lock);
+ /* If we come here, we need an exclusive lock */
+ while (!LWLockAcquireOrWait(&tsearch_ctl->lock, LW_EXCLUSIVE))
+ {
+ /*
+ * Check again in case the space ran out while we were waiting for the
+ * exclusive lock.
+ */
+ CHECK_SHARED_SPACE();
+ }
+
+ tsearch_ctl->loaded_size += entry->dict_size;
+
+ LWLockRelease(&tsearch_ctl->lock);
+
+ /* At last, allocate a DSM segment for the compiled dictionary */
+ seg = dsm_create(entry->dict_size, 0);
+ dict_location = dsm_segment_address(seg);
+ memcpy(dict_location, dict, entry->dict_size);
+
+ pfree(dict);
+
+ entry->dict_id = initoptions->dictid;
+ entry->dict_dsm = dsm_segment_handle(seg);
+ entry->refcnt++;
+
+ /* Remain attached until end of postmaster */
+ dsm_pin_segment(seg);
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+}
+
+/*
+ * Release memory occupied by the dictionary. The function unpins the DSM
+ * mapping. If no other backend has a mapping to this DSM, unpin the segment.
+ *
+ * dictid: Oid of the dictionary.
+ */
+void
+ts_dict_shmem_release(Oid dictid)
+{
+ TsearchDictEntry *entry;
+
+ /*
+ * If we didn't attach to a hash table then do nothing.
+ */
+ if (!dict_table)
+ return;
+
+ /* Try to find an entry in the hash table */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, &dictid, true);
+
+ if (entry)
+ {
+ dsm_segment *seg;
+
+ seg = dsm_find_mapping(entry->dict_dsm);
+ /*
+ * If current backend didn't pin a mapping then we don't need to do
+ * unpinning.
+ */
+ if (!seg)
+ {
+ dshash_release_lock(dict_table, entry);
+ return;
+ }
+
+ dsm_unpin_mapping(seg);
+ dsm_detach(seg);
+
+ entry->refcnt--;
+
+ if (entry->refcnt == 0)
+ {
+ dsm_unpin_segment(entry->dict_dsm);
+
+ /* Decrease total loaded size */
+ LWLockAcquire(&tsearch_ctl->lock, LW_EXCLUSIVE);
+ tsearch_ctl->loaded_size -= entry->dict_size;
+ LWLockRelease(&tsearch_ctl->lock);
+
+ dshash_delete_entry(dict_table, entry);
+ }
+ else
+ dshash_release_lock(dict_table, entry);
+ }
+}
+
+/*
+ * Allocate and initialize tsearch-related shared memory.
+ */
+void
+TsearchShmemInit(void)
+{
+ bool found;
+
+ tsearch_ctl = (TsearchCtlData *)
+ ShmemInitStruct("Full Text Search Ctl", sizeof(TsearchCtlData), &found);
+
+ if (!found)
+ {
+ LWLockRegisterTranche(LWTRANCHE_TSEARCH_DSA, "tsearch_dsa");
+ LWLockRegisterTranche(LWTRANCHE_TSEARCH_TABLE, "tsearch_table");
+
+ LWLockInitialize(&tsearch_ctl->lock, LWTRANCHE_TSEARCH_DSA);
+
+ tsearch_ctl->area = DSM_HANDLE_INVALID;
+ tsearch_ctl->dict_table_handle = InvalidDsaPointer;
+ tsearch_ctl->loaded_size = 0;
+ }
+}
+
+/*
+ * Report shared memory space needed by TsearchShmemInit.
+ */
+Size
+TsearchShmemSize(void)
+{
+ Size size = 0;
+
+ /* size of service structure */
+ size = add_size(size, MAXALIGN(sizeof(TsearchCtlData)));
+
+ return size;
+}
+
+/*
+ * Initialize hash table located in DSM.
+ *
+ * The hash table should be created and initialized iff
+ * max_shared_dictionaries_size GUC isn't equal to zero and it doesn't exist
+ * yet.
+ */
+static void
+init_dict_table(void)
+{
+ MemoryContext old_context;
+ dsa_area *dsa;
+
+ /* Bail out if shared dictionaries not allowed */
+ if (max_shared_dictionaries_size == 0)
+ return;
+
+ if (dict_table)
+ return;
+
+ old_context = MemoryContextSwitchTo(TopMemoryContext);
+
+recheck_table:
+ LWLockAcquire(&tsearch_ctl->lock, LW_SHARED);
+
+ /* The hash table has already been created by another backend */
+ if (DsaPointerIsValid(tsearch_ctl->dict_table_handle))
+ {
+ Assert(tsearch_ctl->area != DSM_HANDLE_INVALID);
+
+ dsa = dsa_attach(tsearch_ctl->area);
+
+ dict_table = dshash_attach(dsa,
+ &dict_table_params,
+ tsearch_ctl->dict_table_handle,
+ NULL);
+ }
+ else
+ {
+ /* Try to get exclusive lock */
+ LWLockRelease(&tsearch_ctl->lock);
+ if (!LWLockAcquireOrWait(&tsearch_ctl->lock, LW_EXCLUSIVE))
+ {
+ /*
+ * The lock was released by another backend, and that backend
+ * has concurrently created the hash table already.
+ */
+ goto recheck_table;
+ }
+
+ dsa = dsa_create(LWTRANCHE_TSEARCH_DSA);
+ tsearch_ctl->area = dsa_get_handle(dsa);
+
+ dict_table = dshash_create(dsa, &dict_table_params, NULL);
+ tsearch_ctl->dict_table_handle = dshash_get_hash_table_handle(dict_table);
+
+ /* Remain attached until end of postmaster */
+ dsa_pin(dsa);
+ }
+
+ LWLockRelease(&tsearch_ctl->lock);
+
+ /* Remain attached until end of session */
+ dsa_pin_mapping(dsa);
+
+ MemoryContextSwitchTo(old_context);
+}
diff --git a/src/backend/utils/cache/ts_cache.c b/src/backend/utils/cache/ts_cache.c
index adb9c60b72..aed3395075 100644
--- a/src/backend/utils/cache/ts_cache.c
+++ b/src/backend/utils/cache/ts_cache.c
@@ -40,6 +40,7 @@
#include "commands/defrem.h"
#include "tsearch/ts_cache.h"
#include "tsearch/ts_public.h"
+#include "tsearch/ts_shared.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
#include "utils/fmgroids.h"
@@ -99,7 +100,16 @@ InvalidateTSCacheCallBack(Datum arg, int cacheid, uint32 hashvalue)
hash_seq_init(&status, hash);
while ((entry = (TSAnyCacheEntry *) hash_seq_search(&status)) != NULL)
+ {
+ if (entry->isvalid && hash == TSDictionaryCacheHash)
+ {
+ TSDictionaryCacheEntry *dict_entry = (TSDictionaryCacheEntry *) entry;
+
+ ts_dict_shmem_release(dict_entry->dictId);
+ }
+
entry->isvalid = false;
+ }
/* Also invalidate the current-config cache if it's pg_ts_config */
if (hash == TSConfigCacheHash)
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index afb1007842..fbca674e34 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -77,6 +77,7 @@
#include "storage/predicate.h"
#include "tcop/tcopprot.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_shared.h"
#include "utils/builtins.h"
#include "utils/bytea.h"
#include "utils/guc_tables.h"
@@ -2943,6 +2944,20 @@ static struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"max_shared_dictionaries_size", PGC_SIGHUP, RESOURCES_MEM,
+ gettext_noop("Sets the maximum size of all text search dictionaries loaded into shared memory."),
+ gettext_noop("Currently controls only loading of Ispell dictionaries. "
+ "If total size of simultaneously loaded dictionaries "
+ "reaches the maximum allowed size then a new dictionary "
+ "will be loaded into private backend memory."),
+ GUC_UNIT_KB,
+ },
+ &max_shared_dictionaries_size,
+ 100 * 1024, -1, MAX_KILOBYTES,
+ NULL, NULL, NULL
+ },
+
/* End-of-list marker */
{
{NULL, 0, 0, NULL, NULL}, NULL, 0, 0, 0, NULL, NULL, NULL
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 91eacacdc9..489aab9022 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -135,6 +135,7 @@
# mmap
# use none to disable dynamic shared memory
# (change requires restart)
+#max_shared_dictionaries_size = 100MB
# - Disk -
diff --git a/src/include/storage/lwlock.h b/src/include/storage/lwlock.h
index c21bfe2f66..16b0858eda 100644
--- a/src/include/storage/lwlock.h
+++ b/src/include/storage/lwlock.h
@@ -219,6 +219,8 @@ typedef enum BuiltinTrancheIds
LWTRANCHE_SHARED_TUPLESTORE,
LWTRANCHE_TBM,
LWTRANCHE_PARALLEL_APPEND,
+ LWTRANCHE_TSEARCH_DSA,
+ LWTRANCHE_TSEARCH_TABLE,
LWTRANCHE_FIRST_USER_DEFINED
} BuiltinTrancheIds;
diff --git a/src/include/tsearch/ts_shared.h b/src/include/tsearch/ts_shared.h
new file mode 100644
index 0000000000..5bac5f9eda
--- /dev/null
+++ b/src/include/tsearch/ts_shared.h
@@ -0,0 +1,37 @@
+/*-------------------------------------------------------------------------
+ *
+ * ts_shared.h
+ * tsearch shared memory management
+ *
+ * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
+ *
+ * src/include/tsearch/ts_shared.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef TS_SHARED_H
+#define TS_SHARED_H
+
+#include "tsearch/ts_public.h"
+
+/*
+ * Value of max_shared_dictionaries_size meaning that there is no limit on
+ * the shared memory used for shared dictionaries.
+ */
+#define TS_DICT_SHMEM_UNLIMITED (-1)
+
+/*
+ * GUC variable for the maximum total size of shared dictionaries, in kilobytes
+ */
+extern int max_shared_dictionaries_size;
+
+typedef void *(*ts_dict_build_callback) (List *dictoptions, Size *size);
+
+extern void *ts_dict_shmem_location(DictInitData *initoptions,
+ ts_dict_build_callback allocate_cb);
+extern void ts_dict_shmem_release(Oid dictid);
+
+extern void TsearchShmemInit(void);
+extern Size TsearchShmemSize(void);
+
+#endif /* TS_SHARED_H */
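For readers following the 0003 patch above, the decision made by CHECK_SHARED_SPACE() can be restated as a small pure function. This is only a standalone sketch under stated assumptions, not the patch's code: dict_fits_in_shared() and DICT_SHMEM_UNLIMITED are hypothetical names, locking is left out entirely, and the limit is converted to bytes in a wide unsigned type purely to keep the arithmetic simple. The semantics are taken from the patch: the limit is in kilobytes, 0 disables sharing, -1 means no limit, and a dictionary is rejected when the loaded size plus its own size would exceed the limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DICT_SHMEM_UNLIMITED (-1)   /* mirrors TS_DICT_SHMEM_UNLIMITED */

/*
 * Decide whether a dictionary of dict_size bytes may be placed in shared
 * memory, given loaded_size bytes already used by other dictionaries and
 * a limit expressed in kilobytes.
 */
bool
dict_fits_in_shared(int limit_kb, size_t loaded_size, size_t dict_size)
{
    unsigned long long limit_bytes;

    assert(limit_kb >= DICT_SHMEM_UNLIMITED);

    if (limit_kb == 0)
        return false;           /* sharing is disabled */
    if (limit_kb == DICT_SHMEM_UNLIMITED)
        return true;            /* no limit configured */

    limit_bytes = (unsigned long long) limit_kb * 1024ULL;

    /* Already at or over the limit: nothing more fits */
    if (loaded_size > limit_bytes)
        return false;

    /* Written as a subtraction so the comparison cannot wrap around */
    return dict_size <= limit_bytes - loaded_size;
}
```

A dictionary that exactly fills the remaining space is accepted, which matches the strict "greater than" comparison in the macro. When the function returns false, the patch keeps the already built dictionary in backend-private memory and logs a message.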
Attachment: 0004-Store-ispell-in-shared-location-v10.patch (text/plain; charset=us-ascii)
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index 610b7bf033..f3288fbb3f 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -3030,6 +3030,25 @@ CREATE TEXT SEARCH DICTIONARY english_stem (
</sect2>
+ <sect2 id="textsearch-shared-dictionaries">
+ <title>Dictionaries in Shared Memory</title>
+
+ <para>
+ Some dictionaries, especially <application>Ispell</application>, consume
+ a significant amount of memory, in some cases tens of megabytes. Most of
+ them store the data in text files, and building the in-memory structure is
+ both CPU-intensive and time-consuming. Instead of doing this in each backend when
+ it needs a dictionary for the first time, the compiled dictionary may be
+ stored in shared memory so that it may be reused by other backends.
+ </para>
+
+ <para>
+ To enable storing dictionaries in shared memory, set the <xref linkend="guc-max-shared-dictionaries-size"/>
+ parameter to a value greater than zero or to <literal>-1</literal>.
+ </para>
+
+ </sect2>
+
</sect1>
<sect1 id="textsearch-configuration">
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index 6d0dedbefb..6294a52af3 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -5,6 +5,15 @@
*
* Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
*
+ * By default all Ispell dictionaries are stored in DSM. But if the amount
+ * of memory exceeds max_shared_dictionaries_size, then the dictionary will be
+ * allocated in private backend memory (in dictCtx context).
+ *
+ * All necessary data is built within the dispell_build() function. But
+ * structures for regular expressions are compiled on first demand and
+ * stored in the AffixReg array, because regex_t and Regis cannot easily be
+ * stored in shared memory.
+ *
*
* IDENTIFICATION
* src/backend/tsearch/dict_ispell.c
@@ -14,8 +23,10 @@
#include "postgres.h"
#include "commands/defrem.h"
+#include "storage/dsm.h"
#include "tsearch/dicts/spell.h"
#include "tsearch/ts_locale.h"
+#include "tsearch/ts_shared.h"
#include "tsearch/ts_utils.h"
#include "utils/builtins.h"
@@ -26,54 +37,126 @@ typedef struct
IspellDict obj;
} DictISpell;
+static void parse_dictoptions(List *dictoptions,
+ char **dictfile, char **afffile, char **stopfile);
+static void *dispell_build(List *dictoptions, Size *size);
+
Datum
dispell_init(PG_FUNCTION_ARGS)
{
DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictISpell *d;
- bool affloaded = false,
- dictloaded = false,
- stoploaded = false;
- ListCell *l;
+ void *dict_location;
+ char *stopfile;
d = (DictISpell *) palloc0(sizeof(DictISpell));
- NIStartBuild(&(d->obj));
+ parse_dictoptions(init_data->dictoptions, NULL, NULL, &stopfile);
+
+ if (stopfile)
+ readstoplist(stopfile, &(d->stoplist), lowerstr);
+
+ dict_location = ts_dict_shmem_location(init_data, dispell_build);
+ Assert(dict_location);
+
+ d->obj.dict = (IspellDictData *) dict_location;
+ d->obj.reg = (AffixReg *) palloc0(d->obj.dict->nAffix *
+ sizeof(AffixReg));
+ /* Current memory context is dictionary's private memory context */
+ d->obj.dictCtx = CurrentMemoryContext;
+
+ PG_RETURN_POINTER(d);
+}
+
+Datum
+dispell_lexize(PG_FUNCTION_ARGS)
+{
+ DictISpell *d = (DictISpell *) PG_GETARG_POINTER(0);
+ char *in = (char *) PG_GETARG_POINTER(1);
+ int32 len = PG_GETARG_INT32(2);
+ char *txt;
+ TSLexeme *res;
+ TSLexeme *ptr,
+ *cptr;
+
+ if (len <= 0)
+ PG_RETURN_POINTER(NULL);
+
+ txt = lowerstr_with_len(in, len);
+ res = NINormalizeWord(&(d->obj), txt);
+
+ if (res == NULL)
+ PG_RETURN_POINTER(NULL);
+
+ cptr = res;
+ for (ptr = cptr; ptr->lexeme; ptr++)
+ {
+ if (searchstoplist(&(d->stoplist), ptr->lexeme))
+ {
+ pfree(ptr->lexeme);
+ ptr->lexeme = NULL;
+ }
+ else
+ {
+ if (cptr != ptr)
+ memcpy(cptr, ptr, sizeof(TSLexeme));
+ cptr++;
+ }
+ }
+ cptr->lexeme = NULL;
+
+ PG_RETURN_POINTER(res);
+}
+
+static void
+parse_dictoptions(List *dictoptions, char **dictfile, char **afffile,
+ char **stopfile)
+{
+ ListCell *l;
- foreach(l, init_data->dictoptions)
+ if (dictfile)
+ *dictfile = NULL;
+ if (afffile)
+ *afffile = NULL;
+ if (stopfile)
+ *stopfile = NULL;
+
+ foreach(l, dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
if (strcmp(defel->defname, "dictfile") == 0)
{
- if (dictloaded)
+ if (!dictfile)
+ continue;
+
+ if (*dictfile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple DictFile parameters")));
- NIImportDictionary(&(d->obj),
- get_tsearch_config_filename(defGetString(defel),
- "dict"));
- dictloaded = true;
+ *dictfile = get_tsearch_config_filename(defGetString(defel), "dict");
}
else if (strcmp(defel->defname, "afffile") == 0)
{
- if (affloaded)
+ if (!afffile)
+ continue;
+
+ if (*afffile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple AffFile parameters")));
- NIImportAffixes(&(d->obj),
- get_tsearch_config_filename(defGetString(defel),
- "affix"));
- affloaded = true;
+ *afffile = get_tsearch_config_filename(defGetString(defel), "affix");
}
else if (strcmp(defel->defname, "stopwords") == 0)
{
- if (stoploaded)
+ if (!stopfile)
+ continue;
+
+ if (*stopfile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple StopWords parameters")));
- readstoplist(defGetString(defel), &(d->stoplist), lowerstr);
- stoploaded = true;
+ *stopfile = defGetString(defel);
}
else
{
@@ -83,66 +166,52 @@ dispell_init(PG_FUNCTION_ARGS)
defel->defname)));
}
}
+}
- if (affloaded && dictloaded)
- {
- NISortDictionary(&(d->obj));
- NISortAffixes(&(d->obj));
- }
- else if (!affloaded)
+/*
+ * Build the dictionary.
+ *
+ * Result is palloc'ed.
+ */
+static void *
+dispell_build(List *dictoptions, Size *size)
+{
+ IspellDictBuild build;
+ char *dictfile,
+ *afffile;
+
+ parse_dictoptions(dictoptions, &dictfile, &afffile, NULL);
+
+ if (!afffile)
{
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("missing AffFile parameter")));
}
- else
+ else if (!dictfile)
{
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("missing DictFile parameter")));
}
- NIFinishBuild(&(d->obj));
+ MemSet(&build, 0, sizeof(build));
+ NIStartBuild(&build);
- PG_RETURN_POINTER(d);
-}
+ /* Read files */
+ NIImportDictionary(&build, dictfile);
+ NIImportAffixes(&build, afffile);
-Datum
-dispell_lexize(PG_FUNCTION_ARGS)
-{
- DictISpell *d = (DictISpell *) PG_GETARG_POINTER(0);
- char *in = (char *) PG_GETARG_POINTER(1);
- int32 len = PG_GETARG_INT32(2);
- char *txt;
- TSLexeme *res;
- TSLexeme *ptr,
- *cptr;
+ /* Build persistent data for use by backends */
+ NISortDictionary(&build);
+ NISortAffixes(&build);
- if (len <= 0)
- PG_RETURN_POINTER(NULL);
+ NICopyData(&build);
- txt = lowerstr_with_len(in, len);
- res = NINormalizeWord(&(d->obj), txt);
-
- if (res == NULL)
- PG_RETURN_POINTER(NULL);
+ /* Release temporary data */
+ NIFinishBuild(&build);
- cptr = res;
- for (ptr = cptr; ptr->lexeme; ptr++)
- {
- if (searchstoplist(&(d->stoplist), ptr->lexeme))
- {
- pfree(ptr->lexeme);
- ptr->lexeme = NULL;
- }
- else
- {
- if (cptr != ptr)
- memcpy(cptr, ptr, sizeof(TSLexeme));
- cptr++;
- }
- }
- cptr->lexeme = NULL;
-
- PG_RETURN_POINTER(res);
+ /* Return the buffer and its size */
+ *size = build.dict_size;
+ return build.dict;
}
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index e071994523..1c560ef56a 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -23,33 +23,35 @@
* Compilation of a dictionary
* ---------------------------
*
- * A compiled dictionary is stored in the IspellDict structure. Compilation of
- * a dictionary is divided into the several steps:
+ * A compiled dictionary is stored in the following structures:
+ * - IspellDictBuild - stores temporary data and IspellDictData
+ * - IspellDictData - stores permanent data used within NINormalizeWord()
+ * Compilation of the dictionary is divided into several steps:
* - NIImportDictionary() - stores each word of a .dict file in the
* temporary Spell field.
- * - NIImportAffixes() - stores affix rules of an .affix file in the
- * Affix field (not temporary) if an .affix file has the Ispell format.
+ * - NIImportAffixes() - stores affix rules of an .affix file in the temporary
+ * Affix field if an .affix file has the Ispell format.
* -> NIImportOOAffixes() - stores affix rules if an .affix file has the
* Hunspell format. The AffixData field is initialized if AF parameter
* is defined.
* - NISortDictionary() - builds a prefix tree (Trie) from the words list
- * and stores it in the Dictionary field. The words list is got from the
+ * and stores it in the DictNodes field. The words list is got from the
* Spell field. The AffixData field is initialized if AF parameter is not
* defined.
* - NISortAffixes():
* - builds a list of compound affixes from the affix list and stores it
* in the CompoundAffix.
* - builds prefix trees (Trie) from the affix list for prefixes and suffixes
- * and stores them in Suffix and Prefix fields.
+ * and stores them in SuffixNodes and PrefixNodes fields.
* The affix list is got from the Affix field.
+ * Persistent data of the dictionary is copied within NICopyData().
*
* Memory management
* -----------------
*
- * The IspellDict structure has the Spell field which is used only in compile
- * time. The Spell field stores a words list. It can take a lot of memory.
- * Therefore when a dictionary is compiled this field is cleared by
- * NIFinishBuild().
+ * The IspellDictBuild structure has the temporary data which is used only in
+ * compile time. It can take a lot of memory. Therefore after compiling the
+ * dictionary this data is cleared by NIFinishBuild().
*
* All resources which should cleared by NIFinishBuild() is initialized using
* tmpalloc() and tmpalloc0().
@@ -73,112 +75,147 @@
* after the initialization is done. During initialization,
* CurrentMemoryContext is the long-lived memory context associated
* with the dictionary cache entry. We keep the short-lived stuff
- * in the Conf->buildCxt context.
+ * in the ConfBuild->buildCxt context.
*/
-#define tmpalloc(sz) MemoryContextAlloc(Conf->buildCxt, (sz))
-#define tmpalloc0(sz) MemoryContextAllocZero(Conf->buildCxt, (sz))
+#define tmpalloc(sz) MemoryContextAlloc(ConfBuild->buildCxt, (sz))
+#define tmpalloc0(sz) MemoryContextAllocZero(ConfBuild->buildCxt, (sz))
-#define tmpstrdup(str) MemoryContextStrdup(Conf->buildCxt, (str))
+#define tmpstrdup(str) MemoryContextStrdup(ConfBuild->buildCxt, (str))
/*
* Prepare for constructing an ISpell dictionary.
*
- * The IspellDict struct is assumed to be zeroed when allocated.
+ * The IspellDictBuild struct is assumed to be zeroed when allocated.
*/
void
-NIStartBuild(IspellDict *Conf)
+NIStartBuild(IspellDictBuild *ConfBuild)
{
+ uint32 dict_size;
+
/*
* The temp context is a child of CurTransactionContext, so that it will
* go away automatically on error.
*/
- Conf->buildCxt = AllocSetContextCreate(CurTransactionContext,
- "Ispell dictionary init context",
- ALLOCSET_DEFAULT_SIZES);
+ ConfBuild->buildCxt = AllocSetContextCreate(CurTransactionContext,
+ "Ispell dictionary init context",
+ ALLOCSET_DEFAULT_SIZES);
+
+ /*
+ * Allocate the dictionary buffer in the current context, not in buildCxt.
+ */
+ dict_size = MAXALIGN(IspellDictDataHdrSize);
+ ConfBuild->dict = palloc0(dict_size);
+ ConfBuild->dict_size = dict_size;
}
/*
- * Clean up when dictionary construction is complete.
+ * Copy compiled and persistent data into IspellDictData.
*/
void
-NIFinishBuild(IspellDict *Conf)
+NICopyData(IspellDictBuild *ConfBuild)
{
- /* Release no-longer-needed temp memory */
- MemoryContextDelete(Conf->buildCxt);
- /* Just for cleanliness, zero the now-dangling pointers */
- Conf->buildCxt = NULL;
- Conf->Spell = NULL;
- Conf->firstfree = NULL;
- Conf->CompoundAffixFlags = NULL;
-}
+ IspellDictData *dict;
+ uint32 size;
+ int i;
+ uint32 *offsets,
+ offset;
+ /*
+ * Calculate necessary space
+ */
+ size = ConfBuild->nAffixData * sizeof(uint32);
+ size += ConfBuild->AffixDataEnd;
-/*
- * "Compact" palloc: allocate without extra palloc overhead.
- *
- * Since we have no need to free the ispell data items individually, there's
- * not much value in the per-chunk overhead normally consumed by palloc.
- * Getting rid of it is helpful since ispell can allocate a lot of small nodes.
- *
- * We currently pre-zero all data allocated this way, even though some of it
- * doesn't need that. The cpalloc and cpalloc0 macros are just documentation
- * to indicate which allocations actually require zeroing.
- */
-#define COMPACT_ALLOC_CHUNK 8192 /* amount to get from palloc at once */
-#define COMPACT_MAX_REQ 1024 /* must be < COMPACT_ALLOC_CHUNK */
+ size += ConfBuild->nAffix * sizeof(uint32);
+ size += ConfBuild->AffixSize;
-static void *
-compact_palloc0(IspellDict *Conf, size_t size)
-{
- void *result;
+ size += ConfBuild->DictNodes.NodesEnd;
+ size += ConfBuild->PrefixNodes.NodesEnd;
+ size += ConfBuild->SuffixNodes.NodesEnd;
- /* Should only be called during init */
- Assert(Conf->buildCxt != NULL);
+ size += sizeof(CMPDAffix) * ConfBuild->nCompoundAffix;
- /* No point in this for large chunks */
- if (size > COMPACT_MAX_REQ)
- return palloc0(size);
+ /*
+ * Copy data itself
+ */
+ ConfBuild->dict_size = IspellDictDataHdrSize + size;
+ ConfBuild->dict = repalloc(ConfBuild->dict, ConfBuild->dict_size);
+
+ dict = ConfBuild->dict;
+
+ /* AffixData */
+ dict->nAffixData = ConfBuild->nAffixData;
+ dict->AffixDataStart = sizeof(uint32) * ConfBuild->nAffixData;
+ memcpy(DictAffixDataOffset(dict), ConfBuild->AffixDataOffset,
+ sizeof(uint32) * ConfBuild->nAffixData);
+ memcpy(DictAffixData(dict), ConfBuild->AffixData, ConfBuild->AffixDataEnd);
+
+ /* Affix array */
+ dict->nAffix = ConfBuild->nAffix;
+ dict->AffixOffsetStart = dict->AffixDataStart + ConfBuild->AffixDataEnd;
+ dict->AffixStart = dict->AffixOffsetStart + sizeof(uint32) * ConfBuild->nAffix;
+ offsets = (uint32 *) DictAffixOffset(dict);
+ offset = 0;
+ for (i = 0; i < ConfBuild->nAffix; i++)
+ {
+ AFFIX *affix;
+ uint32 size = AffixGetSize(ConfBuild->Affix[i]);
- /* Keep everything maxaligned */
- size = MAXALIGN(size);
+ offsets[i] = offset;
+ affix = (AFFIX *) DictAffixGet(dict, i);
+ Assert(affix);
- /* Need more space? */
- if (size > Conf->avail)
- {
- Conf->firstfree = palloc0(COMPACT_ALLOC_CHUNK);
- Conf->avail = COMPACT_ALLOC_CHUNK;
- }
+ memcpy(affix, ConfBuild->Affix[i], size);
- result = (void *) Conf->firstfree;
- Conf->firstfree += size;
- Conf->avail -= size;
+ offset += size;
+ }
- return result;
+ /* DictNodes prefix tree */
+ dict->DictNodesStart = dict->AffixStart + offset;
+ memcpy(DictDictNodes(dict), ConfBuild->DictNodes.Nodes,
+ ConfBuild->DictNodes.NodesEnd);
+
+ /* PrefixNodes prefix tree */
+ dict->PrefixNodesStart = dict->DictNodesStart + ConfBuild->DictNodes.NodesEnd;
+ memcpy(DictPrefixNodes(dict), ConfBuild->PrefixNodes.Nodes,
+ ConfBuild->PrefixNodes.NodesEnd);
+
+ /* SuffixNodes prefix tree */
+ dict->SuffixNodesStart = dict->PrefixNodesStart + ConfBuild->PrefixNodes.NodesEnd;
+ memcpy(DictSuffixNodes(dict), ConfBuild->SuffixNodes.Nodes,
+ ConfBuild->SuffixNodes.NodesEnd);
+
+ /* CompoundAffix array */
+ dict->CompoundAffixStart = dict->SuffixNodesStart +
+ ConfBuild->SuffixNodes.NodesEnd;
+ memcpy(DictCompoundAffix(dict), ConfBuild->CompoundAffix,
+ sizeof(CMPDAffix) * ConfBuild->nCompoundAffix);
}
-#define cpalloc(size) compact_palloc0(Conf, size)
-#define cpalloc0(size) compact_palloc0(Conf, size)
-
-static char *
-cpstrdup(IspellDict *Conf, const char *str)
+/*
+ * Clean up when dictionary construction is complete.
+ */
+void
+NIFinishBuild(IspellDictBuild *ConfBuild)
{
- char *res = cpalloc(strlen(str) + 1);
-
- strcpy(res, str);
- return res;
+ /* Release no-longer-needed temp memory */
+ MemoryContextDelete(ConfBuild->buildCxt);
+ /* Just for cleanliness, zero the now-dangling pointers */
+ ConfBuild->buildCxt = NULL;
+ ConfBuild->Spell = NULL;
+ ConfBuild->CompoundAffixFlags = NULL;
}
-
/*
* Apply lowerstr(), producing a temporary result (in the buildCxt).
*/
static char *
-lowerstr_ctx(IspellDict *Conf, const char *src)
+lowerstr_ctx(IspellDictBuild *ConfBuild, const char *src)
{
MemoryContext saveCtx;
char *dst;
- saveCtx = MemoryContextSwitchTo(Conf->buildCxt);
+ saveCtx = MemoryContextSwitchTo(ConfBuild->buildCxt);
dst = lowerstr(src);
MemoryContextSwitchTo(saveCtx);
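
Reviewer's note (not part of the patch): NICopyData() in the hunk above lays the compiled data out as one contiguous buffer, with start offsets recorded in the header instead of pointers. Below is a minimal standalone sketch of that idea for the AffixData part only (an offsets array followed by the string bytes). The names FlatSets, flat_pack and flat_get are hypothetical and do not exist in the patch; malloc() stands in for palloc(), and the real layout also carries the affix array and the three prefix trees.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical flat layout: header, then n offsets, then the string bytes. */
typedef struct FlatSets
{
	uint32_t	n;			/* number of strings */
	uint32_t	data_start;	/* where string bytes begin inside payload[] */
	char		payload[];	/* uint32_t offsets[n], then NUL-terminated data */
} FlatSets;

/* Return the i-th string; only offsets are stored, never pointers. */
static const char *
flat_get(const FlatSets *fs, uint32_t i)
{
	uint32_t	off;

	assert(i < fs->n);
	memcpy(&off, fs->payload + i * sizeof(uint32_t), sizeof(off));
	return fs->payload + fs->data_start + off;
}

/* Pack n strings into one malloc'ed block; returns NULL if out of memory. */
static FlatSets *
flat_pack(const char **strs, uint32_t n)
{
	size_t		data_len = 0;
	uint32_t	off = 0;
	uint32_t	i;
	FlatSets   *fs;

	/* First pass: compute the space needed for the string bytes. */
	for (i = 0; i < n; i++)
		data_len += strlen(strs[i]) + 1;

	fs = malloc(sizeof(FlatSets) + n * sizeof(uint32_t) + data_len);
	if (fs == NULL)
		return NULL;

	fs->n = n;
	fs->data_start = n * (uint32_t) sizeof(uint32_t);

	/* Second pass: write each offset and copy each string. */
	for (i = 0; i < n; i++)
	{
		size_t		len = strlen(strs[i]) + 1;

		memcpy(fs->payload + i * sizeof(uint32_t), &off, sizeof(off));
		memcpy(fs->payload + fs->data_start + off, strs[i], len);
		off += (uint32_t) len;
	}
	return fs;
}
```

Because the block contains no pointers, it can be copied byte for byte to another address (for example into a shared segment) and flat_get() still works on the copy.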
@@ -190,7 +227,7 @@ lowerstr_ctx(IspellDict *Conf, const char *src)
#define STRNCMP(s,p) strncmp( (s), (p), strlen(p) )
#define GETWCHAR(W,L,N,T) ( ((const uint8*)(W))[ ((T)==FF_PREFIX) ? (N) : ( (L) - 1 - (N) ) ] )
-#define GETCHAR(A,N,T) GETWCHAR( (A)->repl, (A)->replen, N, T )
+#define GETCHAR(A,N,T) GETWCHAR( AffixFieldRepl(A), (A)->replen, N, T )
static char *VoidString = "";
@@ -311,18 +348,189 @@ strbncmp(const unsigned char *s1, const unsigned char *s2, size_t count)
static int
cmpaffix(const void *s1, const void *s2)
{
- const AFFIX *a1 = (const AFFIX *) s1;
- const AFFIX *a2 = (const AFFIX *) s2;
+ const AFFIX *a1 = *((AFFIX *const *) s1);
+ const AFFIX *a2 = *((AFFIX *const *) s2);
if (a1->type < a2->type)
return -1;
if (a1->type > a2->type)
return 1;
if (a1->type == FF_PREFIX)
- return strcmp(a1->repl, a2->repl);
+ return strcmp(AffixFieldRepl(a1), AffixFieldRepl(a2));
else
- return strbcmp((const unsigned char *) a1->repl,
- (const unsigned char *) a2->repl);
+ return strbcmp((const unsigned char *) AffixFieldRepl(a1),
+ (const unsigned char *) AffixFieldRepl(a2));
+}
+
+/*
+ * Allocate space for AffixData.
+ */
+static void
+InitAffixData(IspellDictBuild *ConfBuild, int numAffixData)
+{
+ uint32 size;
+
+ size = 8 * 1024; /* Reserve 8KB for data */
+
+ ConfBuild->AffixData = (char *) tmpalloc(size);
+ ConfBuild->AffixDataSize = size;
+ ConfBuild->AffixDataOffset = (uint32 *) tmpalloc(numAffixData * sizeof(uint32));
+ ConfBuild->nAffixData = 0;
+ ConfBuild->mAffixData = numAffixData;
+
+ /* Save offset of the end of data */
+ ConfBuild->AffixDataEnd = 0;
+}
+
+/*
+ * Add a set of affix flags to the AffixData buffer of IspellDictBuild. If the
+ * buffer cannot fit the new affix set, it is enlarged first.
+ *
+ * ConfBuild: building structure for the current dictionary.
+ * AffixSet: set of affix flags.
+ */
+static void
+AddAffixSet(IspellDictBuild *ConfBuild, const char *AffixSet,
+ uint32 AffixSetLen)
+{
+ /*
+ * Check available space for AffixSet.
+ */
+ if (ConfBuild->AffixDataEnd + AffixSetLen + 1 /* \0 */ >=
+ ConfBuild->AffixDataSize)
+ {
+ uint32 newsize = Max(ConfBuild->AffixDataSize + 8 * 1024 /* 8KB */,
+ ConfBuild->AffixDataSize + AffixSetLen + 1);
+
+ ConfBuild->AffixData = (char *) repalloc(ConfBuild->AffixData, newsize);
+ ConfBuild->AffixDataSize = newsize;
+ }
+
+ /* Check available number of offsets */
+ if (ConfBuild->nAffixData >= ConfBuild->mAffixData)
+ {
+ ConfBuild->mAffixData *= 2;
+ ConfBuild->AffixDataOffset = (uint32 *) repalloc(ConfBuild->AffixDataOffset,
+ sizeof(uint32) * ConfBuild->mAffixData);
+ }
+
+ ConfBuild->AffixDataOffset[ConfBuild->nAffixData] = ConfBuild->AffixDataEnd;
+ StrNCpy(AffixDataGet(ConfBuild, ConfBuild->nAffixData),
+ AffixSet, AffixSetLen + 1);
+
+ /* Save offset of the end of data */
+ ConfBuild->AffixDataEnd += AffixSetLen + 1;
+ ConfBuild->nAffixData++;
+}
+
+/*
+ * Allocate space for prefix tree node.
+ *
+ * ConfBuild: building structure for the current dictionary.
+ * array: NodeArray where to allocate new node.
+ * length: number of allocated NodeData.
+ * sizeNodeData: minimum size of each NodeData.
+ * sizeNodeHeader: size of header of new node.
+ *
+ * Returns an offset of new node in NodeArray->Nodes.
+ */
+static uint32
+AllocateNode(IspellDictBuild *ConfBuild, NodeArray *array, uint32 length,
+ uint32 sizeNodeData, uint32 sizeNodeHeader)
+{
+ uint32 node_offset;
+ uint32 size;
+
+ size = sizeNodeHeader + length * sizeNodeData;
+ size = MAXALIGN(size);
+
+ if (array->NodesSize == 0)
+ {
+ array->NodesSize = size * 32; /* Reserve space for next levels of the
+ * prefix tree */
+ array->Nodes = (char *) tmpalloc(array->NodesSize);
+ array->NodesEnd = 0;
+ }
+ else if (array->NodesEnd + size >= array->NodesSize)
+ {
+ array->NodesSize = Max(array->NodesSize * 2, array->NodesSize + size);
+ array->Nodes = (char *) repalloc(array->Nodes, array->NodesSize);
+ }
+
+ node_offset = array->NodesEnd;
+ array->NodesEnd += size;
+
+ return node_offset;
+}
+
+/*
+ * Allocate space for SPNode.
+ *
+ * Returns an offset of new node in ConfBuild->DictNodes.Nodes.
+ */
+static uint32
+AllocateSPNode(IspellDictBuild *ConfBuild, uint32 length)
+{
+ uint32 offset;
+ SPNode *node;
+ SPNodeData *data;
+ uint32 i;
+
+ offset = AllocateNode(ConfBuild, &ConfBuild->DictNodes, length,
+ sizeof(SPNodeData), SPNHDRSZ);
+ node = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, offset);
+ node->length = length;
+
+ /*
+ * Initialize all SPNodeData with default values. We cannot use memset()
+ * here because not all fields have 0 as default value.
+ */
+ for (i = 0; i < length; i++)
+ {
+ data = &(node->data[i]);
+ data->val = 0;
+ data->affix = ISPELL_INVALID_INDEX;
+ data->compoundflag = 0;
+ data->isword = 0;
+ data->node_offset = ISPELL_INVALID_OFFSET;
+ }
+
+ return offset;
+}
+
+/*
+ * Allocate space for AffixNode.
+ *
+ * Returns an offset of new node in NodeArray->Nodes.
+ */
+static uint32
+AllocateAffixNode(IspellDictBuild *ConfBuild, NodeArray *array, uint32 length)
+{
+ uint32 offset;
+ AffixNode *node;
+ AffixNodeData *data;
+ uint32 i;
+
+ offset = AllocateNode(ConfBuild, array, length, sizeof(AffixNodeData),
+ ANHRDSZ);
+ node = (AffixNode *) NodeArrayGet(array, offset);
+ node->length = length;
+ node->isvoid = 0;
+
+ /*
+ * Initialize all AffixNodeData with default values. We cannot use memset()
+ * here because not all fields have 0 as default value.
+ */
+ for (i = 0; i < length; i++)
+ {
+ data = &(node->data[i]);
+ data->val = 0;
+ data->affstart = ISPELL_INVALID_INDEX;
+ data->affend = ISPELL_INVALID_INDEX;
+ data->node_offset = ISPELL_INVALID_OFFSET;
+ }
+
+ return offset;
}
/*
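
Reviewer's note (not part of the patch): AllocateNode() and its two wrappers in the hunk above hand out offsets into a growable byte buffer rather than pointers. The sketch below shows why this matters, assuming a much simpler node type than SPNode or AffixNode. Arena, Node, INVALID_OFFSET and the function names are hypothetical; realloc() stands in for repalloc().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define INVALID_OFFSET UINT32_MAX	/* hypothetical "no child" marker */

typedef struct Arena
{
	char	   *buf;		/* growable storage */
	uint32_t	size;		/* allocated bytes */
	uint32_t	end;		/* used bytes */
} Arena;

typedef struct Node
{
	uint32_t	value;
	uint32_t	child;		/* offset of the child node, not a pointer */
} Node;

/* Reserve sz bytes (rounded up to 8); returns the offset of the new chunk. */
static uint32_t
arena_alloc(Arena *a, uint32_t sz)
{
	uint32_t	off;

	sz = (sz + 7u) & ~7u;
	if (a->end + sz > a->size)
	{
		uint32_t	newsize = a->size ? a->size * 2 : 64;
		char	   *p;

		while (newsize < a->end + sz)
			newsize *= 2;
		p = realloc(a->buf, newsize);
		if (p == NULL)
			return INVALID_OFFSET;
		a->buf = p;			/* the buffer may have moved */
		a->size = newsize;
	}
	off = a->end;
	a->end += sz;
	return off;
}

/* Turn an offset into a pointer that is valid only until the next alloc. */
static Node *
arena_node(Arena *a, uint32_t off)
{
	assert(off != INVALID_OFFSET && off + sizeof(Node) <= a->end);
	return (Node *) (a->buf + off);
}

/* Allocate a node and set defaults that are not all zero bytes. */
static uint32_t
node_new(Arena *a, uint32_t value)
{
	uint32_t	off = arena_alloc(a, sizeof(Node));

	if (off != INVALID_OFFSET)
	{
		Node	   *n = arena_node(a, off);

		n->value = value;
		n->child = INVALID_OFFSET;
	}
	return off;
}
```

A pointer obtained before a later allocation may dangle after the buffer is reallocated, while a stored offset stays valid. The same property is what lets the finished buffer be copied into memory mapped at a different address in each backend.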
@@ -333,7 +541,7 @@ cmpaffix(const void *s1, const void *s2)
* - 2 characters (FM_LONG). A character may be Unicode.
* - numbers from 1 to 65000 (FM_NUM).
*
- * Depending on the flagMode an affix string can have the following format:
+ * Depending on the flagmode an affix string can have the following format:
* - FM_CHAR: ABCD
* Here we have 4 flags: A, B, C and D
* - FM_LONG: ABCDE*
@@ -341,13 +549,13 @@ cmpaffix(const void *s1, const void *s2)
* - FM_NUM: 200,205,50
* Here we have 3 flags: 200, 205 and 50
*
- * Conf: current dictionary.
+ * flagmode: flag mode of the dictionary
* sflagset: the set of affix flags. Returns a reference to the start of a next
* affix flag.
* sflag: returns an affix flag from sflagset.
*/
static void
-getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
+getNextFlagFromString(FlagMode flagmode, char **sflagset, char *sflag)
{
int32 s;
char *next,
@@ -356,11 +564,11 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
bool stop = false;
bool met_comma = false;
- maxstep = (Conf->flagMode == FM_LONG) ? 2 : 1;
+ maxstep = (flagmode == FM_LONG) ? 2 : 1;
while (**sflagset)
{
- switch (Conf->flagMode)
+ switch (flagmode)
{
case FM_LONG:
case FM_CHAR:
@@ -422,15 +630,15 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
stop = true;
break;
default:
- elog(ERROR, "unrecognized type of Conf->flagMode: %d",
- Conf->flagMode);
+ elog(ERROR, "unrecognized type of flagmode: %d",
+ flagmode);
}
if (stop)
break;
}
- if (Conf->flagMode == FM_LONG && maxstep > 0)
+ if (flagmode == FM_LONG && maxstep > 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix flag \"%s\" with \"long\" flag value",
@@ -440,31 +648,28 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
}
/*
- * Checks if the affix set Conf->AffixData[affix] contains affixflag.
- * Conf->AffixData[affix] does not contain affixflag if this flag is not used
- * actually by the .dict file.
+ * Checks if the affix set from AffixData contains affixflag. Affix set does
+ * not contain affixflag if this flag is not used actually by the .dict file.
*
- * Conf: current dictionary.
- * affix: index of the Conf->AffixData array.
+ * flagmode: flag mode of the dictionary.
+ * sflagset: the set of affix flags.
* affixflag: the affix flag.
*
- * Returns true if the string Conf->AffixData[affix] contains affixflag,
- * otherwise returns false.
+ * Returns true if the affix set string contains affixflag, otherwise returns
+ * false.
*/
static bool
-IsAffixFlagInUse(IspellDict *Conf, int affix, const char *affixflag)
+IsAffixFlagInUse(FlagMode flagmode, char *sflagset, const char *affixflag)
{
- char *flagcur;
+ char *flagcur = sflagset;
char flag[BUFSIZ];
if (*affixflag == 0)
return true;
- flagcur = Conf->AffixData[affix];
-
while (*flagcur)
{
- getNextFlagFromString(Conf, &flagcur, flag);
+ getNextFlagFromString(flagmode, &flagcur, flag);
/* Compare first affix flag in flagcur with affixflag */
if (strcmp(flag, affixflag) == 0)
return true;
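
Reviewer's note (not part of the patch): the hunks above change getNextFlagFromString() and IsAffixFlagInUse() to take the flag mode and the flag set string directly, so they no longer need the whole dictionary struct. A simplified standalone sketch of such a membership test follows. It assumes single-byte characters and does no error reporting, unlike the real functions; Mode, next_flag and flag_in_set are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for FM_CHAR, FM_LONG and FM_NUM. */
typedef enum
{
	MODE_CHAR,
	MODE_LONG,
	MODE_NUM
} Mode;

/*
 * Copy the next flag from *set into out and advance *set past it.
 * Returns false when the set is exhausted.
 */
static bool
next_flag(Mode mode, const char **set, char *out, size_t outsz)
{
	const char *s = *set;
	size_t		n = 0;

	assert(outsz >= 3);
	if (*s == '\0')
		return false;

	if (mode == MODE_NUM)
	{
		/* Numeric flags are separated by commas: 200,205,50 */
		while (*s != '\0' && *s != ',')
		{
			if (n + 1 < outsz)
				out[n++] = *s;
			s++;
		}
		if (*s == ',')
			s++;
	}
	else
	{
		/* One character per flag (CHAR) or two (LONG). */
		size_t		want = (mode == MODE_LONG) ? 2 : 1;

		while (*s != '\0' && n < want)
			out[n++] = *s++;
	}
	out[n] = '\0';
	*set = s;
	return true;
}

/* True if flag occurs in set; an empty flag always matches. */
static bool
flag_in_set(Mode mode, const char *set, const char *flag)
{
	char		buf[16];

	if (*flag == '\0')
		return true;
	while (next_flag(mode, &set, buf, sizeof(buf)))
	{
		if (strcmp(buf, flag) == 0)
			return true;
	}
	return false;
}
```

Passing only the mode and the string keeps the lookup usable both during the build (with IspellDictBuild) and at search time (with IspellDictData).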
@@ -477,31 +682,33 @@ IsAffixFlagInUse(IspellDict *Conf, int affix, const char *affixflag)
/*
* Adds the new word into the temporary array Spell.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* word: new word.
* flag: set of affix flags. Single flag can be get by getNextFlagFromString().
*/
static void
-NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
+NIAddSpell(IspellDictBuild *ConfBuild, const char *word, const char *flag)
{
- if (Conf->nspell >= Conf->mspell)
+ if (ConfBuild->nSpell >= ConfBuild->mSpell)
{
- if (Conf->mspell)
+ if (ConfBuild->mSpell)
{
- Conf->mspell *= 2;
- Conf->Spell = (SPELL **) repalloc(Conf->Spell, Conf->mspell * sizeof(SPELL *));
+ ConfBuild->mSpell *= 2;
+ ConfBuild->Spell = (SPELL **) repalloc(ConfBuild->Spell,
+ ConfBuild->mSpell * sizeof(SPELL *));
}
else
{
- Conf->mspell = 1024 * 20;
- Conf->Spell = (SPELL **) tmpalloc(Conf->mspell * sizeof(SPELL *));
+ ConfBuild->mSpell = 1024 * 20;
+ ConfBuild->Spell = (SPELL **) tmpalloc(ConfBuild->mSpell * sizeof(SPELL *));
}
}
- Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
- strcpy(Conf->Spell[Conf->nspell]->word, word);
- Conf->Spell[Conf->nspell]->p.flag = (*flag != '\0')
+ ConfBuild->Spell[ConfBuild->nSpell] =
+ (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
+ strcpy(ConfBuild->Spell[ConfBuild->nSpell]->word, word);
+ ConfBuild->Spell[ConfBuild->nSpell]->p.flag = (*flag != '\0')
? tmpstrdup(flag) : VoidString;
- Conf->nspell++;
+ ConfBuild->nSpell++;
}
/*
@@ -509,11 +716,11 @@ NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
*
* Note caller must already have applied get_tsearch_config_filename.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* filename: path to the .dict file.
*/
void
-NIImportDictionary(IspellDict *Conf, const char *filename)
+NIImportDictionary(IspellDictBuild *ConfBuild, const char *filename)
{
tsearch_readline_state trst;
char *line;
@@ -564,9 +771,9 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
}
s += pg_mblen(s);
}
- pstr = lowerstr_ctx(Conf, line);
+ pstr = lowerstr_ctx(ConfBuild, line);
- NIAddSpell(Conf, pstr, flag);
+ NIAddSpell(ConfBuild, pstr, flag);
pfree(pstr);
pfree(line);
@@ -590,7 +797,7 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
* SFX M 0 's .
* is presented here.
*
- * Conf: current dictionary.
+ * dict: current dictionary.
* word: basic form of word.
* affixflag: affix flag, by which a basic form of word was generated.
* flag: compound flag used to compare with StopMiddle->compoundflag.
@@ -598,9 +805,9 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
* Returns 1 if the word was found in the prefix tree, else returns 0.
*/
static int
-FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
+FindWord(IspellDictData *dict, const char *word, const char *affixflag, int flag)
{
- SPNode *node = Conf->Dictionary;
+ SPNode *node = (SPNode *) DictDictNodes(dict);
SPNodeData *StopLow,
*StopHigh,
*StopMiddle;
@@ -636,10 +843,14 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
* Check if this affix rule is presented in the affix set
* with index StopMiddle->affix.
*/
- if (IsAffixFlagInUse(Conf, StopMiddle->affix, affixflag))
+ if (IsAffixFlagInUse(dict->flagMode,
+ DictAffixDataGet(dict, StopMiddle->affix),
+ affixflag))
return 1;
}
- node = StopMiddle->node;
+ /* Retrieve SPNode by the offset */
+ node = (SPNode *) DictNodeGet(DictDictNodes(dict),
+ StopMiddle->node_offset);
ptr++;
break;
}
@@ -657,7 +868,8 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
/*
* Adds a new affix rule to the Affix field.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary, is used to allocate
+ * temporary data.
* flag: affix flag ('\' in the below example).
* flagflags: set of flags from the flagval field for this affix rule. This set
* is listed after '/' character in the added string (repl).
@@ -673,26 +885,54 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
* type: FF_SUFFIX or FF_PREFIX.
*/
static void
-NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
- const char *find, const char *repl, int type)
+NIAddAffix(IspellDictBuild *ConfBuild, const char *flag, char flagflags,
+ const char *mask, const char *find, const char *repl, int type)
{
AFFIX *Affix;
+ uint32 size;
+ uint32 flaglen = strlen(flag),
+ findlen = strlen(find),
+ repllen = strlen(repl),
+ masklen = strlen(mask);
+
+ /* Sanity checks */
+ if (flaglen > AF_FLAG_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix flag \"%s\" too long", flag)));
+ if (findlen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix find field \"%s\" too long", find)));
+ if (repllen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix repl field \"%s\" too long", repl)));
+ if (masklen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix mask field \"%s\" too long", mask)));
- if (Conf->naffixes >= Conf->maffixes)
+ if (ConfBuild->nAffix >= ConfBuild->mAffix)
{
- if (Conf->maffixes)
+ if (ConfBuild->mAffix)
{
- Conf->maffixes *= 2;
- Conf->Affix = (AFFIX *) repalloc((void *) Conf->Affix, Conf->maffixes * sizeof(AFFIX));
+ ConfBuild->mAffix *= 2;
+ ConfBuild->Affix = (AFFIX **) repalloc(ConfBuild->Affix,
+ ConfBuild->mAffix * sizeof(AFFIX *));
}
else
{
- Conf->maffixes = 16;
- Conf->Affix = (AFFIX *) palloc(Conf->maffixes * sizeof(AFFIX));
+ ConfBuild->mAffix = 255;
+ ConfBuild->Affix = (AFFIX **) tmpalloc(ConfBuild->mAffix * sizeof(AFFIX *));
}
}
- Affix = Conf->Affix + Conf->naffixes;
+ size = AFFIXHDRSZ + flaglen + 1 /* \0 */ + findlen + 1 /* \0 */ +
+ repllen + 1 /* \0 */ + masklen + 1 /* \0 */;
+
+ Affix = (AFFIX *) tmpalloc(size);
+ ConfBuild->Affix[ConfBuild->nAffix] = Affix;
/* This affix rule can be applied for words with any ending */
if (strcmp(mask, ".") == 0 || *mask == '\0')
@@ -705,42 +945,12 @@ NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
{
Affix->issimple = 0;
Affix->isregis = 1;
- RS_compile(&(Affix->reg.regis), (type == FF_SUFFIX),
- *mask ? mask : VoidString);
}
/* This affix rule will use regex_t to search word ending */
else
{
- int masklen;
- int wmasklen;
- int err;
- pg_wchar *wmask;
- char *tmask;
-
Affix->issimple = 0;
Affix->isregis = 0;
- tmask = (char *) tmpalloc(strlen(mask) + 3);
- if (type == FF_SUFFIX)
- sprintf(tmask, "%s$", mask);
- else
- sprintf(tmask, "^%s", mask);
-
- masklen = strlen(tmask);
- wmask = (pg_wchar *) tmpalloc((masklen + 1) * sizeof(pg_wchar));
- wmasklen = pg_mb2wchar_with_len(tmask, wmask, masklen);
-
- err = pg_regcomp(&(Affix->reg.regex), wmask, wmasklen,
- REG_ADVANCED | REG_NOSUB,
- DEFAULT_COLLATION_OID);
- if (err)
- {
- char errstr[100];
-
- pg_regerror(err, &(Affix->reg.regex), errstr, sizeof(errstr));
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
- errmsg("invalid regular expression: %s", errstr)));
- }
}
Affix->flagflags = flagflags;
@@ -749,15 +959,22 @@ NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
if ((Affix->flagflags & FF_COMPOUNDFLAG) == 0)
Affix->flagflags |= FF_COMPOUNDFLAG;
}
- Affix->flag = cpstrdup(Conf, flag);
+
Affix->type = type;
- Affix->find = (find && *find) ? cpstrdup(Conf, find) : VoidString;
- if ((Affix->replen = strlen(repl)) > 0)
- Affix->repl = cpstrdup(Conf, repl);
- else
- Affix->repl = VoidString;
- Conf->naffixes++;
+ Affix->replen = repllen;
+ StrNCpy(AffixFieldRepl(Affix), repl, repllen + 1);
+
+ Affix->findlen = findlen;
+ StrNCpy(AffixFieldFind(Affix), find, findlen + 1);
+
+ Affix->masklen = masklen;
+ StrNCpy(AffixFieldMask(Affix), mask, masklen + 1);
+
+ StrNCpy(AffixFieldFlag(Affix), flag, flaglen + 1);
+
+ ConfBuild->nAffix++;
+ ConfBuild->AffixSize += size;
}
/* Parsing states for parse_affentry() and friends */
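
Reviewer's note (not part of the patch): NIAddAffix() above now allocates each AFFIX as one variable-length chunk, with the flag, find, repl and mask strings stored after the header and reached through the AffixField*() macros. The sketch below illustrates that technique under an assumed field order, since the real macro definitions are not shown in this part of the diff. PackedAffix, the PA_* macros, pa_size and pa_make are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header; the strings follow it as repl\0 find\0 mask\0 flag\0 */
typedef struct PackedAffix
{
	uint16_t	replen;
	uint16_t	findlen;
	uint16_t	masklen;
	uint16_t	flaglen;
	char		fields[];
} PackedAffix;

/* Each accessor skips the preceding strings and their terminators. */
#define PA_REPL(a) ((a)->fields)
#define PA_FIND(a) (PA_REPL(a) + (a)->replen + 1)
#define PA_MASK(a) (PA_FIND(a) + (a)->findlen + 1)
#define PA_FLAG(a) (PA_MASK(a) + (a)->masklen + 1)

/* Total size of the chunk, including the four NUL terminators. */
static size_t
pa_size(const PackedAffix *a)
{
	return offsetof(PackedAffix, fields) +
		a->replen + a->findlen + a->masklen + a->flaglen + 4;
}

/* Build one packed rule; returns NULL on overlong input or out of memory. */
static PackedAffix *
pa_make(const char *flag, const char *mask, const char *find,
		const char *repl)
{
	size_t		fl = strlen(flag);
	size_t		ml = strlen(mask);
	size_t		fn = strlen(find);
	size_t		rl = strlen(repl);
	PackedAffix *a;

	/* Sanity checks, since the lengths are stored in 16 bits here. */
	if (fl > UINT16_MAX || ml > UINT16_MAX ||
		fn > UINT16_MAX || rl > UINT16_MAX)
		return NULL;

	a = malloc(offsetof(PackedAffix, fields) + rl + fn + ml + fl + 4);
	if (a == NULL)
		return NULL;

	/* Lengths must be set before the accessors that depend on them. */
	a->replen = (uint16_t) rl;
	a->findlen = (uint16_t) fn;
	a->masklen = (uint16_t) ml;
	a->flaglen = (uint16_t) fl;
	memcpy(PA_REPL(a), repl, rl + 1);
	memcpy(PA_FIND(a), find, fn + 1);
	memcpy(PA_MASK(a), mask, ml + 1);
	memcpy(PA_FLAG(a), flag, fl + 1);
	return a;
}
```

Since every rule has a different size, an array of rules can no longer be indexed directly. That appears to be why the patch keeps an array of AFFIX pointers during the build and an offsets array in the final buffer.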
@@ -1021,10 +1238,10 @@ parse_affentry(char *str, char *mask, char *find, char *repl)
* Sets a Hunspell options depending on flag type.
*/
static void
-setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
+setCompoundAffixFlagValue(IspellDictBuild *ConfBuild, CompoundAffixFlag *entry,
char *s, uint32 val)
{
- if (Conf->flagMode == FM_NUM)
+ if (ConfBuild->dict->flagMode == FM_NUM)
{
char *next;
int i;
@@ -1044,19 +1261,19 @@ setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
else
entry->flag.s = tmpstrdup(s);
- entry->flagMode = Conf->flagMode;
+ entry->flagMode = ConfBuild->dict->flagMode;
entry->value = val;
}
/*
* Sets up a correspondence for the affix parameter with the affix flag.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* s: affix flag in string.
* val: affix parameter.
*/
static void
-addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
+addCompoundAffixFlagValue(IspellDictBuild *ConfBuild, char *s, uint32 val)
{
CompoundAffixFlag *newValue;
char sbuf[BUFSIZ];
@@ -1083,29 +1300,29 @@ addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
*sflag = '\0';
/* Resize array or allocate memory for array CompoundAffixFlag */
- if (Conf->nCompoundAffixFlag >= Conf->mCompoundAffixFlag)
+ if (ConfBuild->nCompoundAffixFlag >= ConfBuild->mCompoundAffixFlag)
{
- if (Conf->mCompoundAffixFlag)
+ if (ConfBuild->mCompoundAffixFlag)
{
- Conf->mCompoundAffixFlag *= 2;
- Conf->CompoundAffixFlags = (CompoundAffixFlag *)
- repalloc((void *) Conf->CompoundAffixFlags,
- Conf->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
+ ConfBuild->mCompoundAffixFlag *= 2;
+ ConfBuild->CompoundAffixFlags = (CompoundAffixFlag *)
+ repalloc((void *) ConfBuild->CompoundAffixFlags,
+ ConfBuild->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
}
else
{
- Conf->mCompoundAffixFlag = 10;
- Conf->CompoundAffixFlags = (CompoundAffixFlag *)
- tmpalloc(Conf->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
+ ConfBuild->mCompoundAffixFlag = 10;
+ ConfBuild->CompoundAffixFlags = (CompoundAffixFlag *)
+ tmpalloc(ConfBuild->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
}
}
- newValue = Conf->CompoundAffixFlags + Conf->nCompoundAffixFlag;
+ newValue = ConfBuild->CompoundAffixFlags + ConfBuild->nCompoundAffixFlag;
- setCompoundAffixFlagValue(Conf, newValue, sbuf, val);
+ setCompoundAffixFlagValue(ConfBuild, newValue, sbuf, val);
- Conf->usecompound = true;
- Conf->nCompoundAffixFlag++;
+ ConfBuild->dict->usecompound = true;
+ ConfBuild->nCompoundAffixFlag++;
}
/*
@@ -1113,7 +1330,7 @@ addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
* flags s.
*/
static int
-getCompoundAffixFlagValue(IspellDict *Conf, char *s)
+getCompoundAffixFlagValue(IspellDictBuild *ConfBuild, char *s)
{
uint32 flag = 0;
CompoundAffixFlag *found,
@@ -1121,18 +1338,18 @@ getCompoundAffixFlagValue(IspellDict *Conf, char *s)
char sflag[BUFSIZ];
char *flagcur;
- if (Conf->nCompoundAffixFlag == 0)
+ if (ConfBuild->nCompoundAffixFlag == 0)
return 0;
flagcur = s;
while (*flagcur)
{
- getNextFlagFromString(Conf, &flagcur, sflag);
- setCompoundAffixFlagValue(Conf, &key, sflag, 0);
+ getNextFlagFromString(ConfBuild->dict->flagMode, &flagcur, sflag);
+ setCompoundAffixFlagValue(ConfBuild, &key, sflag, 0);
found = (CompoundAffixFlag *)
- bsearch(&key, (void *) Conf->CompoundAffixFlags,
- Conf->nCompoundAffixFlag, sizeof(CompoundAffixFlag),
+ bsearch(&key, (void *) ConfBuild->CompoundAffixFlags,
+ ConfBuild->nCompoundAffixFlag, sizeof(CompoundAffixFlag),
cmpcmdflag);
if (found != NULL)
flag |= found->value;
@@ -1144,14 +1361,13 @@ getCompoundAffixFlagValue(IspellDict *Conf, char *s)
/*
* Returns a flag set using the s parameter.
*
- * If Conf->useFlagAliases is true then the s parameter is index of the
- * Conf->AffixData array and function returns its entry.
- * Else function returns the s parameter.
+ * If useFlagAliases is true then the s parameter is index of the AffixData
+ * array and function returns its entry. Else function returns the s parameter.
*/
static char *
-getAffixFlagSet(IspellDict *Conf, char *s)
+getAffixFlagSet(IspellDictBuild *ConfBuild, char *s)
{
- if (Conf->useFlagAliases && *s != '\0')
+ if (ConfBuild->dict->useFlagAliases && *s != '\0')
{
int curaffix;
char *end;
@@ -1162,13 +1378,13 @@ getAffixFlagSet(IspellDict *Conf, char *s)
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix alias \"%s\"", s)));
- if (curaffix > 0 && curaffix <= Conf->nAffixData)
+ if (curaffix > 0 && curaffix <= ConfBuild->nAffixData)
/*
* Do not subtract 1 from curaffix because empty string was added
* in NIImportOOAffixes
*/
- return Conf->AffixData[curaffix];
+ return AffixDataGet(ConfBuild, curaffix);
else
return VoidString;
}
@@ -1179,11 +1395,11 @@ getAffixFlagSet(IspellDict *Conf, char *s)
/*
* Import an affix file that follows MySpell or Hunspell format.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* filename: path to the .affix file.
*/
static void
-NIImportOOAffixes(IspellDict *Conf, const char *filename)
+NIImportOOAffixes(IspellDictBuild *ConfBuild, const char *filename)
{
char type[BUFSIZ],
*ptype = NULL;
@@ -1195,17 +1411,16 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
char repl[BUFSIZ],
*prepl;
bool isSuffix = false;
- int naffix = 0,
- curaffix = 0;
+ int naffix = 0;
int sflaglen = 0;
char flagflags = 0;
tsearch_readline_state trst;
char *recoded;
/* read file to find any flag */
- Conf->usecompound = false;
- Conf->useFlagAliases = false;
- Conf->flagMode = FM_CHAR;
+ ConfBuild->dict->usecompound = false;
+ ConfBuild->dict->useFlagAliases = false;
+ ConfBuild->dict->flagMode = FM_CHAR;
if (!tsearch_readline_begin(&trst, filename))
ereport(ERROR,
@@ -1222,30 +1437,36 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
}
if (STRNCMP(recoded, "COMPOUNDFLAG") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDFLAG"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDFLAG"),
FF_COMPOUNDFLAG);
else if (STRNCMP(recoded, "COMPOUNDBEGIN") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDBEGIN"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDBEGIN"),
FF_COMPOUNDBEGIN);
else if (STRNCMP(recoded, "COMPOUNDLAST") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDLAST"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDLAST"),
FF_COMPOUNDLAST);
/* COMPOUNDLAST and COMPOUNDEND are synonyms */
else if (STRNCMP(recoded, "COMPOUNDEND") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDEND"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDEND"),
FF_COMPOUNDLAST);
else if (STRNCMP(recoded, "COMPOUNDMIDDLE") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDMIDDLE"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDMIDDLE"),
FF_COMPOUNDMIDDLE);
else if (STRNCMP(recoded, "ONLYINCOMPOUND") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("ONLYINCOMPOUND"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("ONLYINCOMPOUND"),
FF_COMPOUNDONLY);
else if (STRNCMP(recoded, "COMPOUNDPERMITFLAG") == 0)
- addCompoundAffixFlagValue(Conf,
+ addCompoundAffixFlagValue(ConfBuild,
recoded + strlen("COMPOUNDPERMITFLAG"),
FF_COMPOUNDPERMITFLAG);
else if (STRNCMP(recoded, "COMPOUNDFORBIDFLAG") == 0)
- addCompoundAffixFlagValue(Conf,
+ addCompoundAffixFlagValue(ConfBuild,
recoded + strlen("COMPOUNDFORBIDFLAG"),
FF_COMPOUNDFORBIDFLAG);
else if (STRNCMP(recoded, "FLAG") == 0)
@@ -1258,9 +1479,9 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
if (*s)
{
if (STRNCMP(s, "long") == 0)
- Conf->flagMode = FM_LONG;
+ ConfBuild->dict->flagMode = FM_LONG;
else if (STRNCMP(s, "num") == 0)
- Conf->flagMode = FM_NUM;
+ ConfBuild->dict->flagMode = FM_NUM;
else if (STRNCMP(s, "default") != 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
@@ -1274,8 +1495,8 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
}
tsearch_readline_end(&trst);
- if (Conf->nCompoundAffixFlag > 1)
- qsort((void *) Conf->CompoundAffixFlags, Conf->nCompoundAffixFlag,
+ if (ConfBuild->nCompoundAffixFlag > 1)
+ qsort((void *) ConfBuild->CompoundAffixFlags, ConfBuild->nCompoundAffixFlag,
sizeof(CompoundAffixFlag), cmpcmdflag);
if (!tsearch_readline_begin(&trst, filename))
@@ -1295,15 +1516,15 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
if (ptype)
pfree(ptype);
- ptype = lowerstr_ctx(Conf, type);
+ ptype = lowerstr_ctx(ConfBuild, type);
/* First try to parse AF parameter (alias compression) */
if (STRNCMP(ptype, "af") == 0)
{
/* First line is the number of aliases */
- if (!Conf->useFlagAliases)
+ if (!ConfBuild->dict->useFlagAliases)
{
- Conf->useFlagAliases = true;
+ ConfBuild->dict->useFlagAliases = true;
naffix = atoi(sflag);
if (naffix == 0)
ereport(ERROR,
@@ -1313,21 +1534,15 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
/* Also reserve place for empty flag set */
naffix++;
- Conf->AffixData = (char **) palloc0(naffix * sizeof(char *));
- Conf->lenAffixData = Conf->nAffixData = naffix;
+ InitAffixData(ConfBuild, naffix);
/* Add empty flag set into AffixData */
- Conf->AffixData[curaffix] = VoidString;
- curaffix++;
+ AddAffixSet(ConfBuild, VoidString, 0);
}
/* Other lines is aliases */
else
{
- if (curaffix < naffix)
- {
- Conf->AffixData[curaffix] = cpstrdup(Conf, sflag);
- curaffix++;
- }
+ AddAffixSet(ConfBuild, sflag, strlen(sflag));
}
goto nextline;
}
@@ -1338,8 +1553,8 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
sflaglen = strlen(sflag);
if (sflaglen == 0
- || (sflaglen > 1 && Conf->flagMode == FM_CHAR)
- || (sflaglen > 2 && Conf->flagMode == FM_LONG))
+ || (sflaglen > 1 && ConfBuild->dict->flagMode == FM_CHAR)
+ || (sflaglen > 2 && ConfBuild->dict->flagMode == FM_LONG))
goto nextline;
/*--------
@@ -1367,21 +1582,21 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
/* Get flags after '/' (flags are case sensitive) */
if ((ptr = strchr(repl, '/')) != NULL)
- aflg |= getCompoundAffixFlagValue(Conf,
- getAffixFlagSet(Conf,
+ aflg |= getCompoundAffixFlagValue(ConfBuild,
+ getAffixFlagSet(ConfBuild,
ptr + 1));
/* Get lowercased version of string before '/' */
- prepl = lowerstr_ctx(Conf, repl);
+ prepl = lowerstr_ctx(ConfBuild, repl);
if ((ptr = strchr(prepl, '/')) != NULL)
*ptr = '\0';
- pfind = lowerstr_ctx(Conf, find);
- pmask = lowerstr_ctx(Conf, mask);
+ pfind = lowerstr_ctx(ConfBuild, find);
+ pmask = lowerstr_ctx(ConfBuild, mask);
if (t_iseq(find, '0'))
*pfind = '\0';
if (t_iseq(repl, '0'))
*prepl = '\0';
- NIAddAffix(Conf, sflag, flagflags | aflg, pmask, pfind, prepl,
+ NIAddAffix(ConfBuild, sflag, flagflags | aflg, pmask, pfind, prepl,
isSuffix ? FF_SUFFIX : FF_PREFIX);
pfree(prepl);
pfree(pfind);
@@ -1407,7 +1622,7 @@ nextline:
* work to NIImportOOAffixes(), which will re-read the whole file.
*/
void
-NIImportAffixes(IspellDict *Conf, const char *filename)
+NIImportAffixes(IspellDictBuild *ConfBuild, const char *filename)
{
char *pstr = NULL;
char flag[BUFSIZ];
@@ -1428,9 +1643,9 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
errmsg("could not open affix file \"%s\": %m",
filename)));
- Conf->usecompound = false;
- Conf->useFlagAliases = false;
- Conf->flagMode = FM_CHAR;
+ ConfBuild->dict->usecompound = false;
+ ConfBuild->dict->useFlagAliases = false;
+ ConfBuild->dict->flagMode = FM_CHAR;
while ((recoded = tsearch_readline(&trst)) != NULL)
{
@@ -1452,10 +1667,8 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
s += pg_mblen(s);
if (*s && pg_mblen(s) == 1)
- {
- addCompoundAffixFlagValue(Conf, s, FF_COMPOUNDFLAG);
- Conf->usecompound = true;
- }
+ addCompoundAffixFlagValue(ConfBuild, s, FF_COMPOUNDFLAG);
+
oldformat = true;
goto nextline;
}
@@ -1528,7 +1741,8 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
if (!parse_affentry(pstr, mask, find, repl))
goto nextline;
- NIAddAffix(Conf, flag, flagflags, mask, find, repl, suffixes ? FF_SUFFIX : FF_PREFIX);
+ NIAddAffix(ConfBuild, flag, flagflags, mask, find, repl,
+ suffixes ? FF_SUFFIX : FF_PREFIX);
nextline:
pfree(recoded);
@@ -1547,53 +1761,48 @@ isnewformat:
errmsg("affix file contains both old-style and new-style commands")));
tsearch_readline_end(&trst);
- NIImportOOAffixes(Conf, filename);
+ NIImportOOAffixes(ConfBuild, filename);
}
/*
* Merges two affix flag sets and stores a new affix flag set into
- * Conf->AffixData.
+ * ConfBuild->AffixData.
*
* Returns index of a new affix flag set.
*/
static int
-MergeAffix(IspellDict *Conf, int a1, int a2)
+MergeAffix(IspellDictBuild *ConfBuild, int a1, int a2)
{
- char **ptr;
+ char *ptr;
+ uint32 len;
/* Do not merge affix flags if one of affix flags is empty */
- if (*Conf->AffixData[a1] == '\0')
+ if (*AffixDataGet(ConfBuild, a1) == '\0')
return a2;
- else if (*Conf->AffixData[a2] == '\0')
+ else if (*AffixDataGet(ConfBuild, a2) == '\0')
return a1;
- while (Conf->nAffixData + 1 >= Conf->lenAffixData)
- {
- Conf->lenAffixData *= 2;
- Conf->AffixData = (char **) repalloc(Conf->AffixData,
- sizeof(char *) * Conf->lenAffixData);
- }
-
- ptr = Conf->AffixData + Conf->nAffixData;
- if (Conf->flagMode == FM_NUM)
+ if (ConfBuild->dict->flagMode == FM_NUM)
{
- *ptr = cpalloc(strlen(Conf->AffixData[a1]) +
- strlen(Conf->AffixData[a2]) +
- 1 /* comma */ + 1 /* \0 */ );
- sprintf(*ptr, "%s,%s", Conf->AffixData[a1], Conf->AffixData[a2]);
+ len = strlen(AffixDataGet(ConfBuild, a1)) + 1 /* comma */ +
+ strlen(AffixDataGet(ConfBuild, a2));
+ ptr = tmpalloc(len + 1 /* \0 */);
+ sprintf(ptr, "%s,%s", AffixDataGet(ConfBuild, a1),
+ AffixDataGet(ConfBuild, a2));
}
else
{
- *ptr = cpalloc(strlen(Conf->AffixData[a1]) +
- strlen(Conf->AffixData[a2]) +
- 1 /* \0 */ );
- sprintf(*ptr, "%s%s", Conf->AffixData[a1], Conf->AffixData[a2]);
+ len = strlen(AffixDataGet(ConfBuild, a1)) +
+ strlen(AffixDataGet(ConfBuild, a2));
+ ptr = tmpalloc(len + 1 /* \0 */ );
+ sprintf(ptr, "%s%s", AffixDataGet(ConfBuild, a1),
+ AffixDataGet(ConfBuild, a2));
}
- ptr++;
- *ptr = NULL;
- Conf->nAffixData++;
- return Conf->nAffixData - 1;
+ AddAffixSet(ConfBuild, ptr, len);
+ pfree(ptr);
+
+ return ConfBuild->nAffixData - 1;
}
/*
@@ -1601,66 +1810,87 @@ MergeAffix(IspellDict *Conf, int a1, int a2)
* flags with the given index.
*/
static uint32
-makeCompoundFlags(IspellDict *Conf, int affix)
+makeCompoundFlags(IspellDictBuild *ConfBuild, int affix)
{
- char *str = Conf->AffixData[affix];
+ char *str = AffixDataGet(ConfBuild, affix);
- return (getCompoundAffixFlagValue(Conf, str) & FF_COMPOUNDFLAGMASK);
+ return (getCompoundAffixFlagValue(ConfBuild, str) & FF_COMPOUNDFLAGMASK);
}
/*
* Makes a prefix tree for the given level.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* low: lower index of the Conf->Spell array.
* high: upper index of the Conf->Spell array.
* level: current prefix tree level.
+ *
+ * Returns an offset of SPNode in DictNodes.
*/
-static SPNode *
-mkSPNode(IspellDict *Conf, int low, int high, int level)
+static uint32
+mkSPNode(IspellDictBuild *ConfBuild, int low, int high, int level)
{
int i;
int nchar = 0;
char lastchar = '\0';
+ uint32 rs_offset,
+ new_offset;
SPNode *rs;
SPNodeData *data;
+ int data_index = 0;
int lownew = low;
for (i = low; i < high; i++)
- if (Conf->Spell[i]->p.d.len > level && lastchar != Conf->Spell[i]->word[level])
+ if (ConfBuild->Spell[i]->p.d.len > level &&
+ lastchar != ConfBuild->Spell[i]->word[level])
{
nchar++;
- lastchar = Conf->Spell[i]->word[level];
+ lastchar = ConfBuild->Spell[i]->word[level];
}
if (!nchar)
- return NULL;
+ return ISPELL_INVALID_OFFSET;
- rs = (SPNode *) cpalloc0(SPNHDRSZ + nchar * sizeof(SPNodeData));
- rs->length = nchar;
+ rs_offset = AllocateSPNode(ConfBuild, nchar);
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
data = rs->data;
lastchar = '\0';
for (i = low; i < high; i++)
- if (Conf->Spell[i]->p.d.len > level)
+ if (ConfBuild->Spell[i]->p.d.len > level)
{
- if (lastchar != Conf->Spell[i]->word[level])
+ if (lastchar != ConfBuild->Spell[i]->word[level])
{
if (lastchar)
{
/* Next level of the prefix tree */
- data->node = mkSPNode(Conf, lownew, i, level + 1);
+ new_offset = mkSPNode(ConfBuild, lownew, i, level + 1);
+
+ /*
+ * ConfBuild->DictNodes can be repalloc'ed within
+ * mkSPNode(), so reinitialize pointers.
+ */
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
+
+ /* First save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ /* Work with next node */
+ data_index++;
+ Assert(data_index < nchar);
+ data = &(rs->data[data_index]);
+
lownew = i;
- data++;
}
- lastchar = Conf->Spell[i]->word[level];
+ lastchar = ConfBuild->Spell[i]->word[level];
}
- data->val = ((uint8 *) (Conf->Spell[i]->word))[level];
- if (Conf->Spell[i]->p.d.len == level + 1)
+ data->val = ((uint8 *) (ConfBuild->Spell[i]->word))[level];
+ if (ConfBuild->Spell[i]->p.d.len == level + 1)
{
bool clearCompoundOnly = false;
- if (data->isword && data->affix != Conf->Spell[i]->p.d.affix)
+ if (data->isword && data->affix != ConfBuild->Spell[i]->p.d.affix)
{
/*
* MergeAffix called a few times. If one of word is
@@ -1669,15 +1899,17 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
*/
clearCompoundOnly = (FF_COMPOUNDONLY & data->compoundflag
- & makeCompoundFlags(Conf, Conf->Spell[i]->p.d.affix))
+ & makeCompoundFlags(ConfBuild,
+ ConfBuild->Spell[i]->p.d.affix))
? false : true;
- data->affix = MergeAffix(Conf, data->affix, Conf->Spell[i]->p.d.affix);
+ data->affix = MergeAffix(ConfBuild, data->affix,
+ ConfBuild->Spell[i]->p.d.affix);
}
else
- data->affix = Conf->Spell[i]->p.d.affix;
+ data->affix = ConfBuild->Spell[i]->p.d.affix;
data->isword = 1;
- data->compoundflag = makeCompoundFlags(Conf, data->affix);
+ data->compoundflag = makeCompoundFlags(ConfBuild, data->affix);
if ((data->compoundflag & FF_COMPOUNDONLY) &&
(data->compoundflag & FF_COMPOUNDFLAG) == 0)
@@ -1689,9 +1921,19 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
}
/* Next level of the prefix tree */
- data->node = mkSPNode(Conf, lownew, high, level + 1);
+ new_offset = mkSPNode(ConfBuild, lownew, high, level + 1);
+
+ /*
+ * ConfBuild->DictNodes can be repalloc'ed within mkSPNode(), so
+ * reinitialize pointers.
+ */
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
- return rs;
+ /* Save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ return rs_offset;
}
/*
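Note on the mkSPNode() hunks above: the function now returns an offset rather than a pointer, and it re-fetches rs after every recursive call because the node array may have been moved by repalloc. A minimal standalone sketch of that pattern follows. Arena, arena_get(), arena_alloc_node() and build_chain() are hypothetical names invented for this illustration (they stand in for NodeArray, NodeArrayGet(), AllocateSPNode() and mkSPNode()), and plain realloc() is used in place of the PostgreSQL allocator.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for ISPELL_INVALID_OFFSET. */
#define INVALID_OFFSET UINT32_MAX

/* Hypothetical growable byte array; nodes are addressed by offset. */
typedef struct Arena
{
	char	   *data;
	uint32_t	used;
	uint32_t	size;
} Arena;

typedef struct Node
{
	uint32_t	child_offset;	/* offset of the child, or INVALID_OFFSET */
	int			depth;
} Node;

/* Turn an offset into a pointer; valid only until the next allocation. */
static Node *
arena_get(Arena *a, uint32_t offset)
{
	return (Node *) (a->data + offset);
}

/* Reserve room for one node, growing (and possibly moving) the array. */
static uint32_t
arena_alloc_node(Arena *a)
{
	uint32_t	offset = a->used;

	if (a->used + sizeof(Node) > a->size)
	{
		a->size = a->size ? a->size * 2 : (uint32_t) sizeof(Node);
		a->data = realloc(a->data, a->size);
		if (a->data == NULL)
			abort();
	}
	a->used += (uint32_t) sizeof(Node);
	return offset;
}

/* Build a chain of nodes recursively and return the offset of its head. */
static uint32_t
build_chain(Arena *a, int depth, int levels)
{
	uint32_t	self;
	uint32_t	child;
	Node	   *node;

	if (depth >= levels)
		return INVALID_OFFSET;

	self = arena_alloc_node(a);
	node = arena_get(a, self);
	node->depth = depth;

	child = build_chain(a, depth + 1, levels);

	/* The array may have moved during the recursion: re-derive the pointer. */
	node = arena_get(a, self);
	node->child_offset = child;

	return self;
}

/* Walk the chain using offsets only. */
static int
chain_length(Arena *a, uint32_t head)
{
	int			n = 0;

	while (head != INVALID_OFFSET)
	{
		n++;
		head = arena_get(a, head)->child_offset;
	}
	return n;
}
```

Because only offsets are stored inside the nodes, the finished array can be copied into another address space byte for byte, which is presumably what makes the layout suitable for a DSM segment.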
@@ -1699,7 +1941,7 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
* and affixes.
*/
void
-NISortDictionary(IspellDict *Conf)
+NISortDictionary(IspellDictBuild *ConfBuild)
{
int i;
int naffix = 0;
@@ -1708,81 +1950,81 @@ NISortDictionary(IspellDict *Conf)
/* compress affixes */
/*
- * If we use flag aliases then we need to use Conf->AffixData filled in
+ * If we use flag aliases then we need to use ConfBuild->AffixData filled in
* the NIImportOOAffixes().
*/
- if (Conf->useFlagAliases)
+ if (ConfBuild->dict->useFlagAliases)
{
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
char *end;
- if (*Conf->Spell[i]->p.flag != '\0')
+ if (*ConfBuild->Spell[i]->p.flag != '\0')
{
- curaffix = strtol(Conf->Spell[i]->p.flag, &end, 10);
- if (Conf->Spell[i]->p.flag == end || errno == ERANGE)
+ curaffix = strtol(ConfBuild->Spell[i]->p.flag, &end, 10);
+ if (ConfBuild->Spell[i]->p.flag == end || errno == ERANGE)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix alias \"%s\"",
- Conf->Spell[i]->p.flag)));
+ ConfBuild->Spell[i]->p.flag)));
}
else
{
/*
- * If Conf->Spell[i]->p.flag is empty, then get empty value of
- * Conf->AffixData (0 index).
+ * If ConfBuild->Spell[i]->p.flag is empty, then get empty
+ * value of ConfBuild->AffixData (0 index).
*/
curaffix = 0;
}
- Conf->Spell[i]->p.d.affix = curaffix;
- Conf->Spell[i]->p.d.len = strlen(Conf->Spell[i]->word);
+ ConfBuild->Spell[i]->p.d.affix = curaffix;
+ ConfBuild->Spell[i]->p.d.len = strlen(ConfBuild->Spell[i]->word);
}
}
- /* Otherwise fill Conf->AffixData here */
+ /* Otherwise fill ConfBuild->AffixData here */
else
{
/* Count the number of different flags used in the dictionary */
- qsort((void *) Conf->Spell, Conf->nspell, sizeof(SPELL *),
+ qsort((void *) ConfBuild->Spell, ConfBuild->nSpell, sizeof(SPELL *),
cmpspellaffix);
naffix = 0;
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
if (i == 0
- || strcmp(Conf->Spell[i]->p.flag, Conf->Spell[i - 1]->p.flag))
+ || strcmp(ConfBuild->Spell[i]->p.flag,
+ ConfBuild->Spell[i - 1]->p.flag))
naffix++;
}
/*
- * Fill in Conf->AffixData with the affixes that were used in the
- * dictionary. Replace textual flag-field of Conf->Spell entries with
- * indexes into Conf->AffixData array.
+ * Fill in AffixData with the affixes that were used in the
+ * dictionary. Replace textual flag-field of ConfBuild->Spell entries
+ * with indexes into ConfBuild->AffixData array.
*/
- Conf->AffixData = (char **) palloc0(naffix * sizeof(char *));
+ InitAffixData(ConfBuild, naffix);
curaffix = -1;
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
if (i == 0
- || strcmp(Conf->Spell[i]->p.flag, Conf->AffixData[curaffix]))
+ || strcmp(ConfBuild->Spell[i]->p.flag,
+ AffixDataGet(ConfBuild, curaffix)))
{
curaffix++;
Assert(curaffix < naffix);
- Conf->AffixData[curaffix] = cpstrdup(Conf,
- Conf->Spell[i]->p.flag);
+ AddAffixSet(ConfBuild, ConfBuild->Spell[i]->p.flag,
+ strlen(ConfBuild->Spell[i]->p.flag));
}
- Conf->Spell[i]->p.d.affix = curaffix;
- Conf->Spell[i]->p.d.len = strlen(Conf->Spell[i]->word);
+ ConfBuild->Spell[i]->p.d.affix = curaffix;
+ ConfBuild->Spell[i]->p.d.len = strlen(ConfBuild->Spell[i]->word);
}
-
- Conf->lenAffixData = Conf->nAffixData = naffix;
}
/* Start build a prefix tree */
- qsort((void *) Conf->Spell, Conf->nspell, sizeof(SPELL *), cmpspell);
- Conf->Dictionary = mkSPNode(Conf, 0, Conf->nspell, 0);
+ qsort((void *) ConfBuild->Spell, ConfBuild->nSpell, sizeof(SPELL *), cmpspell);
+ mkSPNode(ConfBuild, 0, ConfBuild->nSpell, 0);
}
/*
@@ -1790,83 +2032,104 @@ NISortDictionary(IspellDict *Conf)
* rule. Affixes with empty replace string do not include in the prefix tree.
* This affixes are included by mkVoidAffix().
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* low: lower index of the Conf->Affix array.
* high: upper index of the Conf->Affix array.
* level: current prefix tree level.
* type: FF_SUFFIX or FF_PREFIX.
+ *
+ * Returns an offset in nodes array.
*/
-static AffixNode *
-mkANode(IspellDict *Conf, int low, int high, int level, int type)
+static uint32
+mkANode(IspellDictBuild *ConfBuild, int low, int high, int level, int type)
{
int i;
int nchar = 0;
uint8 lastchar = '\0';
+ NodeArray *array;
+ uint32 rs_offset,
+ new_offset;
AffixNode *rs;
AffixNodeData *data;
+ int data_index = 0;
int lownew = low;
- int naff;
- AFFIX **aff;
for (i = low; i < high; i++)
- if (Conf->Affix[i].replen > level && lastchar != GETCHAR(Conf->Affix + i, level, type))
+ if (ConfBuild->Affix[i]->replen > level &&
+ lastchar != GETCHAR(ConfBuild->Affix[i], level, type))
{
nchar++;
- lastchar = GETCHAR(Conf->Affix + i, level, type);
+ lastchar = GETCHAR(ConfBuild->Affix[i], level, type);
}
if (!nchar)
- return NULL;
+ return ISPELL_INVALID_OFFSET;
- aff = (AFFIX **) tmpalloc(sizeof(AFFIX *) * (high - low + 1));
- naff = 0;
+ if (type == FF_SUFFIX)
+ array = &ConfBuild->SuffixNodes;
+ else
+ array = &ConfBuild->PrefixNodes;
- rs = (AffixNode *) cpalloc0(ANHRDSZ + nchar * sizeof(AffixNodeData));
- rs->length = nchar;
- data = rs->data;
+ rs_offset = AllocateAffixNode(ConfBuild, array, nchar);
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
+ data = (AffixNodeData *) rs->data;
lastchar = '\0';
for (i = low; i < high; i++)
- if (Conf->Affix[i].replen > level)
+ if (ConfBuild->Affix[i]->replen > level)
{
- if (lastchar != GETCHAR(Conf->Affix + i, level, type))
+ if (lastchar != GETCHAR(ConfBuild->Affix[i], level, type))
{
if (lastchar)
{
/* Next level of the prefix tree */
- data->node = mkANode(Conf, lownew, i, level + 1, type);
- if (naff)
- {
- data->naff = naff;
- data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * naff);
- memcpy(data->aff, aff, sizeof(AFFIX *) * naff);
- naff = 0;
- }
- data++;
+ new_offset = mkANode(ConfBuild, lownew, i, level + 1, type);
+
+ /*
+ * array can be repalloc'ed within mkANode(), so
+ * reinitialize pointers.
+ */
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
+
+ /* First save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ /* Handle next data node */
+ data_index++;
+ Assert(data_index < nchar);
+ data = &(rs->data[data_index]);
+
lownew = i;
}
- lastchar = GETCHAR(Conf->Affix + i, level, type);
+ lastchar = GETCHAR(ConfBuild->Affix[i], level, type);
}
- data->val = GETCHAR(Conf->Affix + i, level, type);
- if (Conf->Affix[i].replen == level + 1)
+ data->val = GETCHAR(ConfBuild->Affix[i], level, type);
+ if (ConfBuild->Affix[i]->replen == level + 1)
{ /* affix stopped */
- aff[naff++] = Conf->Affix + i;
+ if (data->affstart == ISPELL_INVALID_INDEX)
+ {
+ data->affstart = i;
+ data->affend = i;
+ }
+ else
+ data->affend = i;
}
}
/* Next level of the prefix tree */
- data->node = mkANode(Conf, lownew, high, level + 1, type);
- if (naff)
- {
- data->naff = naff;
- data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * naff);
- memcpy(data->aff, aff, sizeof(AFFIX *) * naff);
- naff = 0;
- }
+ new_offset = mkANode(ConfBuild, lownew, high, level + 1, type);
+
+ /*
+ * array can be repalloc'ed within mkANode(), so reinitialize pointers.
+ */
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
- pfree(aff);
+ /* Save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
- return rs;
+ return rs_offset;
}
/*
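Note on the mkANode() hunk above: the per-node AFFIX pointer array (aff/naff) is replaced by an inclusive index range [affstart, affend] into the sorted affix array. The sketch below shows that idea in isolation. AffixRange and the range_* helpers are hypothetical names, INVALID_INDEX stands in for ISPELL_INVALID_INDEX, and the sketch assumes that the entries belonging to one node are contiguous in sort order, since everything between start and end is treated as part of the range.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for ISPELL_INVALID_INDEX. */
#define INVALID_INDEX UINT32_MAX

/* Inclusive range of indexes into a sorted array; empty when start is invalid. */
typedef struct AffixRange
{
	uint32_t	start;
	uint32_t	end;
} AffixRange;

/* Extend the range to cover index i (mirrors the affstart/affend updates). */
static void
range_add(AffixRange *r, uint32_t i)
{
	if (r->start == INVALID_INDEX)
		r->start = i;
	r->end = i;
}

/* Collect the range of entries whose key equals wanted. */
static AffixRange
range_collect(const int *keys, uint32_t nkeys, int wanted)
{
	AffixRange	r = {INVALID_INDEX, INVALID_INDEX};
	uint32_t	i;

	for (i = 0; i < nkeys; i++)
		if (keys[i] == wanted)
			range_add(&r, i);
	return r;
}

/*
 * Number of entries covered by the range.  The empty case must be checked
 * first: looping "for (j = start; j <= end; j++)" on an empty range would
 * start at INVALID_INDEX.
 */
static uint32_t
range_count(const AffixRange *r)
{
	if (r->start == INVALID_INDEX)
		return 0;
	return r->end - r->start + 1;
}
```

In the patch the lookups iterate the range only after checking affstart against ISPELL_INVALID_INDEX in FindAffixes(), which plays the role of the guard in range_count().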
@@ -1874,137 +2137,154 @@ mkANode(IspellDict *Conf, int low, int high, int level, int type)
* for affixes which have empty replace string ("repl" field).
*/
static void
-mkVoidAffix(IspellDict *Conf, bool issuffix, int startsuffix)
+mkVoidAffix(IspellDictBuild *ConfBuild, bool issuffix, int startsuffix)
{
- int i,
- cnt = 0;
+ int i;
int start = (issuffix) ? startsuffix : 0;
- int end = (issuffix) ? Conf->naffixes : startsuffix;
- AffixNode *Affix = (AffixNode *) palloc0(ANHRDSZ + sizeof(AffixNodeData));
-
- Affix->length = 1;
- Affix->isvoid = 1;
+ int end = (issuffix) ? ConfBuild->nAffix : startsuffix;
+ uint32 node_offset;
+ NodeArray *array;
+ AffixNode *Affix;
+ AffixNodeData *AffixData;
if (issuffix)
- {
- Affix->data->node = Conf->Suffix;
- Conf->Suffix = Affix;
- }
+ array = &ConfBuild->SuffixNodes;
else
- {
- Affix->data->node = Conf->Prefix;
- Conf->Prefix = Affix;
- }
+ array = &ConfBuild->PrefixNodes;
- /* Count affixes with empty replace string */
- for (i = start; i < end; i++)
- if (Conf->Affix[i].replen == 0)
- cnt++;
-
- /* There is not affixes with empty replace string */
- if (cnt == 0)
- return;
+ node_offset = AllocateAffixNode(ConfBuild, array, 1);
+ Affix = (AffixNode *) NodeArrayGet(array, node_offset);
- Affix->data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * cnt);
- Affix->data->naff = (uint32) cnt;
+ Affix->isvoid = 1;
+ AffixData = (AffixNodeData *) Affix->data;
- cnt = 0;
for (i = start; i < end; i++)
- if (Conf->Affix[i].replen == 0)
+ if (ConfBuild->Affix[i]->replen == 0)
{
- Affix->data->aff[cnt] = Conf->Affix + i;
- cnt++;
+ if (AffixData->affstart == ISPELL_INVALID_INDEX)
+ {
+ AffixData->affstart = i;
+ AffixData->affend = i;
+ }
+ else
+ AffixData->affend = i;
}
}
/*
- * Checks if the affixflag is used by dictionary. Conf->AffixData does not
+ * Checks if the affixflag is used by dictionary. AffixData does not
* contain affixflag if this flag is not used actually by the .dict file.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* affixflag: affix flag.
*
- * Returns true if the Conf->AffixData array contains affixflag, otherwise
+ * Returns true if the ConfBuild->AffixData array contains affixflag, otherwise
* returns false.
*/
static bool
-isAffixInUse(IspellDict *Conf, char *affixflag)
+isAffixInUse(IspellDictBuild *ConfBuild, char *affixflag)
{
int i;
- for (i = 0; i < Conf->nAffixData; i++)
- if (IsAffixFlagInUse(Conf, i, affixflag))
+ for (i = 0; i < ConfBuild->nAffixData; i++)
+ if (IsAffixFlagInUse(ConfBuild->dict->flagMode,
+ AffixDataGet(ConfBuild, i), affixflag))
return true;
return false;
}
/*
- * Builds Conf->Prefix and Conf->Suffix trees from the imported affixes.
+ * Builds Prefix and Suffix trees from the imported affixes.
*/
void
-NISortAffixes(IspellDict *Conf)
+NISortAffixes(IspellDictBuild *ConfBuild)
{
AFFIX *Affix;
+ AffixNode *voidPrefix,
+ *voidSuffix;
size_t i;
CMPDAffix *ptr;
- int firstsuffix = Conf->naffixes;
+ int firstsuffix = ConfBuild->nAffix;
+ uint32 prefix_offset,
+ suffix_offset;
- if (Conf->naffixes == 0)
+ if (ConfBuild->nAffix == 0)
return;
/* Store compound affixes in the Conf->CompoundAffix array */
- if (Conf->naffixes > 1)
- qsort((void *) Conf->Affix, Conf->naffixes, sizeof(AFFIX), cmpaffix);
- Conf->CompoundAffix = ptr = (CMPDAffix *) palloc(sizeof(CMPDAffix) * Conf->naffixes);
- ptr->affix = NULL;
-
- for (i = 0; i < Conf->naffixes; i++)
+ if (ConfBuild->nAffix > 1)
+ qsort((void *) ConfBuild->Affix, ConfBuild->nAffix,
+ sizeof(AFFIX *), cmpaffix);
+ ConfBuild->nCompoundAffix = ConfBuild->nAffix;
+ ConfBuild->CompoundAffix = ptr =
+ (CMPDAffix *) tmpalloc(sizeof(CMPDAffix) * ConfBuild->nCompoundAffix);
+ ptr->affix = ISPELL_INVALID_INDEX;
+
+ for (i = 0; i < ConfBuild->nAffix; i++)
{
- Affix = &(((AFFIX *) Conf->Affix)[i]);
+ Affix = ConfBuild->Affix[i];
if (Affix->type == FF_SUFFIX && i < firstsuffix)
firstsuffix = i;
if ((Affix->flagflags & FF_COMPOUNDFLAG) && Affix->replen > 0 &&
- isAffixInUse(Conf, Affix->flag))
+ isAffixInUse(ConfBuild, AffixFieldFlag(Affix)))
{
- if (ptr == Conf->CompoundAffix ||
+ if (ptr == ConfBuild->CompoundAffix ||
ptr->issuffix != (ptr - 1)->issuffix ||
- strbncmp((const unsigned char *) (ptr - 1)->affix,
- (const unsigned char *) Affix->repl,
+ strbncmp((const unsigned char *) AffixFieldRepl(ConfBuild->Affix[(ptr - 1)->affix]),
+ (const unsigned char *) AffixFieldRepl(Affix),
(ptr - 1)->len))
{
/* leave only unique and minimals suffixes */
- ptr->affix = Affix->repl;
+ ptr->affix = i;
ptr->len = Affix->replen;
ptr->issuffix = (Affix->type == FF_SUFFIX);
ptr++;
}
}
}
- ptr->affix = NULL;
- Conf->CompoundAffix = (CMPDAffix *) repalloc(Conf->CompoundAffix, sizeof(CMPDAffix) * (ptr - Conf->CompoundAffix + 1));
+ ptr->affix = ISPELL_INVALID_INDEX;
+ ConfBuild->nCompoundAffix = ptr - ConfBuild->CompoundAffix + 1;
+ ConfBuild->CompoundAffix = (CMPDAffix *) repalloc(ConfBuild->CompoundAffix,
+ sizeof(CMPDAffix) * (ConfBuild->nCompoundAffix));
/* Start build a prefix tree */
- Conf->Prefix = mkANode(Conf, 0, firstsuffix, 0, FF_PREFIX);
- Conf->Suffix = mkANode(Conf, firstsuffix, Conf->naffixes, 0, FF_SUFFIX);
- mkVoidAffix(Conf, true, firstsuffix);
- mkVoidAffix(Conf, false, firstsuffix);
+ mkVoidAffix(ConfBuild, true, firstsuffix);
+ mkVoidAffix(ConfBuild, false, firstsuffix);
+
+ prefix_offset = mkANode(ConfBuild, 0, firstsuffix, 0, FF_PREFIX);
+ suffix_offset = mkANode(ConfBuild, firstsuffix, ConfBuild->nAffix, 0,
+ FF_SUFFIX);
+
+ /* Adjust offsets of new nodes for nodes of void affixes */
+ voidPrefix = (AffixNode *) NodeArrayGet(&ConfBuild->PrefixNodes, 0);
+ voidPrefix->data[0].node_offset = prefix_offset;
+
+ voidSuffix = (AffixNode *) NodeArrayGet(&ConfBuild->SuffixNodes, 0);
+ voidSuffix->data[0].node_offset = suffix_offset;
}
static AffixNodeData *
-FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
+FindAffixes(IspellDictData *dict, AffixNode *node, const char *word, int wrdlen,
+ int *level, int type)
{
+ AffixNode *node_start;
AffixNodeData *StopLow,
*StopHigh,
*StopMiddle;
uint8 symbol;
+ if (type == FF_PREFIX)
+ node_start = (AffixNode *) DictPrefixNodes(dict);
+ else
+ node_start = (AffixNode *) DictSuffixNodes(dict);
+
if (node->isvoid)
{ /* search void affixes */
- if (node->data->naff)
+ if (node->data->affstart != ISPELL_INVALID_INDEX)
return node->data;
- node = node->data->node;
+ node = (AffixNode *) DictNodeGet(node_start, node->data->node_offset);
}
while (node && *level < wrdlen)
@@ -2019,9 +2299,10 @@ FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
if (StopMiddle->val == symbol)
{
(*level)++;
- if (StopMiddle->naff)
+ if (StopMiddle->affstart != ISPELL_INVALID_INDEX)
return StopMiddle;
- node = StopMiddle->node;
+ node = (AffixNode *) DictNodeGet(node_start,
+ StopMiddle->node_offset);
break;
}
else if (StopMiddle->val < symbol)
@@ -2035,8 +2316,67 @@ FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
return NULL;
}
+/*
+ * Compile regular expression on first use and store it within reg.
+ */
+static void
+CompileAffixReg(AffixReg *reg, bool isregis, int type,
+ const char *mask, int masklen, MemoryContext dictCtx)
+{
+ MemoryContext oldcontext;
+
+ Assert(dictCtx);
+
+ /*
+ * Switch to memory context of the dictionary, so compiled expression can be
+ * used in other queries.
+ */
+ oldcontext = MemoryContextSwitchTo(dictCtx);
+
+ if (isregis)
+ RS_compile(&(reg->r.regis), (type == FF_SUFFIX), mask);
+ else
+ {
+ int wmasklen;
+ int err;
+ pg_wchar *wmask;
+ char *tmask;
+
+ tmask = (char *) palloc(masklen + 3);
+ if (type == FF_SUFFIX)
+ sprintf(tmask, "%s$", mask);
+ else
+ sprintf(tmask, "^%s", mask);
+
+ masklen = strlen(tmask);
+ wmask = (pg_wchar *) palloc((masklen + 1) * sizeof(pg_wchar));
+ wmasklen = pg_mb2wchar_with_len(tmask, wmask, masklen);
+
+ err = pg_regcomp(&(reg->r.regex), wmask, wmasklen,
+ REG_ADVANCED | REG_NOSUB,
+ DEFAULT_COLLATION_OID);
+ if (err)
+ {
+ char errstr[100];
+
+ pg_regerror(err, &(reg->r.regex), errstr, sizeof(errstr));
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
+ errmsg("invalid regular expression: %s", errstr)));
+ }
+
+ pfree(wmask);
+ pfree(tmask);
+ }
+
+ reg->iscompiled = true;
+
+ MemoryContextSwitchTo(oldcontext);
+}
+
static char *
-CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *newword, int *baselen)
+CheckAffix(const char *word, size_t len, AFFIX *Affix, AffixReg *reg,
+ int flagflags, char *newword, int *baselen, MemoryContext dictCtx)
{
/*
* Check compound allow flags
@@ -2076,7 +2416,7 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
if (Affix->type == FF_SUFFIX)
{
strcpy(newword, word);
- strcpy(newword + len - Affix->replen, Affix->find);
+ strcpy(newword + len - Affix->replen, AffixFieldFind(Affix));
if (baselen) /* store length of non-changed part of word */
*baselen = len - Affix->replen;
}
@@ -2086,9 +2426,9 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
* if prefix is an all non-changed part's length then all word
* contains only prefix and suffix, so out
*/
- if (baselen && *baselen + strlen(Affix->find) <= Affix->replen)
+ if (baselen && *baselen + Affix->findlen <= Affix->replen)
return NULL;
- strcpy(newword, Affix->find);
+ strcpy(newword, AffixFieldFind(Affix));
strcat(newword, word + Affix->replen);
}
@@ -2099,7 +2439,12 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
return newword;
else if (Affix->isregis)
{
- if (RS_execute(&(Affix->reg.regis), newword))
+ /* Compile the regular expression on first demand */
+ if (!reg->iscompiled)
+ CompileAffixReg(reg, Affix->isregis, Affix->type,
+ AffixFieldMask(Affix), Affix->masklen, dictCtx);
+
+ if (RS_execute(&(reg->r.regis), newword))
return newword;
}
else
@@ -2109,12 +2454,17 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
size_t data_len;
int newword_len;
+ /* Compile the regular expression on first demand */
+ if (!reg->iscompiled)
+ CompileAffixReg(reg, Affix->isregis, Affix->type,
+ AffixFieldMask(Affix), Affix->masklen, dictCtx);
+
/* Convert data string to wide characters */
newword_len = strlen(newword);
data = (pg_wchar *) palloc((newword_len + 1) * sizeof(pg_wchar));
data_len = pg_mb2wchar_with_len(newword, data, newword_len);
- if (!(err = pg_regexec(&(Affix->reg.regex), data, data_len, 0, NULL, 0, NULL, 0)))
+ if (!(err = pg_regexec(&(reg->r.regex), data, data_len, 0, NULL, 0, NULL, 0)))
{
pfree(data);
return newword;
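Note on the CheckAffix() hunks above: the compiled expression no longer lives inside the AFFIX entry; each backend keeps its own AffixReg slot and compiles the mask on first use. The sketch below shows the same compile-on-first-demand idea in isolation. It deliberately swaps in POSIX regcomp()/regexec() for pg_regcomp()/pg_regexec(), and LazyRegex, lazy_match() and the ncompiles counter are hypothetical names added only for this illustration.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-process slot, playing the role of AffixReg. */
typedef struct LazyRegex
{
	bool		iscompiled;
	int			ncompiles;		/* instrumentation for this sketch only */
	regex_t		regex;
} LazyRegex;

/*
 * Match word against mask, anchored at the end for suffixes and at the
 * start for prefixes.  The expression is compiled on the first call and
 * reused afterwards.  Masks longer than the local buffer are truncated,
 * which is acceptable for a sketch but not for real code.
 */
static bool
lazy_match(LazyRegex *slot, const char *mask, bool issuffix, const char *word)
{
	if (!slot->iscompiled)
	{
		char		anchored[256];

		if (issuffix)
			snprintf(anchored, sizeof(anchored), "%s$", mask);
		else
			snprintf(anchored, sizeof(anchored), "^%s", mask);

		if (regcomp(&slot->regex, anchored, REG_EXTENDED | REG_NOSUB) != 0)
			return false;

		slot->iscompiled = true;
		slot->ncompiles++;
	}

	return regexec(&slot->regex, word, 0, NULL, 0) == 0;
}
```

The mask text itself can stay in read-only shared memory; only the compiled form, which contains process-local pointers, is kept per backend. A caller would release the slot with regfree() when the dictionary is dropped.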
@@ -2153,7 +2503,7 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
char **cur;
char newword[2 * MAXNORMLEN] = "";
char pnewword[2 * MAXNORMLEN] = "";
- AffixNode *snode = Conf->Suffix,
+ AffixNode *snode = (AffixNode *) DictSuffixNodes(Conf->dict),
*pnode;
int i,
j;
@@ -2165,7 +2515,7 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
/* Check that the word itself is normal form */
- if (FindWord(Conf, word, VoidString, flag))
+ if (FindWord(Conf->dict, word, VoidString, flag))
{
*cur = pstrdup(word);
cur++;
@@ -2173,23 +2523,29 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
}
/* Find all other NORMAL forms of the 'word' (check only prefix) */
- pnode = Conf->Prefix;
+ pnode = (AffixNode *) DictPrefixNodes(Conf->dict);
plevel = 0;
while (pnode)
{
- prefix = FindAffixes(pnode, word, wrdlen, &plevel, FF_PREFIX);
+ prefix = FindAffixes(Conf->dict, pnode, word, wrdlen, &plevel, FF_PREFIX);
if (!prefix)
break;
- for (j = 0; j < prefix->naff; j++)
+ for (j = prefix->affstart; j <= prefix->affend; j++)
{
- if (CheckAffix(word, wrdlen, prefix->aff[j], flag, newword, NULL))
+ AFFIX *affix = (AFFIX *) DictAffixGet(Conf->dict, j);
+ AffixReg *reg = &(Conf->reg[j]);
+
+ if (affix &&
+ CheckAffix(word, wrdlen, affix, reg, flag, newword, NULL,
+ Conf->dictCtx))
{
/* prefix success */
- if (FindWord(Conf, newword, prefix->aff[j]->flag, flag))
+ if (FindWord(Conf->dict, newword, AffixFieldFlag(affix), flag))
cur += addToResult(forms, cur, newword);
}
}
- pnode = prefix->node;
+ pnode = (AffixNode *) DictNodeGet(DictPrefixNodes(Conf->dict),
+ prefix->node_offset);
}
/*
@@ -2201,45 +2557,59 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
int baselen = 0;
/* find possible suffix */
- suffix = FindAffixes(snode, word, wrdlen, &slevel, FF_SUFFIX);
+ suffix = FindAffixes(Conf->dict, snode, word, wrdlen, &slevel,
+ FF_SUFFIX);
if (!suffix)
break;
/* foreach suffix check affix */
- for (i = 0; i < suffix->naff; i++)
+ for (i = suffix->affstart; i <= suffix->affend; i++)
{
- if (CheckAffix(word, wrdlen, suffix->aff[i], flag, newword, &baselen))
+ AFFIX *sufentry = (AFFIX *) DictAffixGet(Conf->dict, i);
+ AffixReg *sufreg = &(Conf->reg[i]);
+
+ if (sufentry &&
+ CheckAffix(word, wrdlen, sufentry, sufreg, flag, newword, &baselen,
+ Conf->dictCtx))
{
/* suffix success */
- if (FindWord(Conf, newword, suffix->aff[i]->flag, flag))
+ if (FindWord(Conf->dict, newword, AffixFieldFlag(sufentry), flag))
cur += addToResult(forms, cur, newword);
/* now we will look changed word with prefixes */
- pnode = Conf->Prefix;
+ pnode = (AffixNode *) DictPrefixNodes(Conf->dict);
plevel = 0;
swrdlen = strlen(newword);
while (pnode)
{
- prefix = FindAffixes(pnode, newword, swrdlen, &plevel, FF_PREFIX);
+ prefix = FindAffixes(Conf->dict, pnode, newword, swrdlen,
+ &plevel, FF_PREFIX);
if (!prefix)
break;
- for (j = 0; j < prefix->naff; j++)
+ for (j = prefix->affstart; j <= prefix->affend; j++)
{
- if (CheckAffix(newword, swrdlen, prefix->aff[j], flag, pnewword, &baselen))
+ AFFIX *prefentry = (AFFIX *) DictAffixGet(Conf->dict, j);
+ AffixReg *prefreg = &(Conf->reg[j]);
+
+ if (prefentry &&
+ CheckAffix(newword, swrdlen, prefentry, prefreg,
+ flag, pnewword, &baselen, Conf->dictCtx))
{
/* prefix success */
- char *ff = (prefix->aff[j]->flagflags & suffix->aff[i]->flagflags & FF_CROSSPRODUCT) ?
- VoidString : prefix->aff[j]->flag;
+ char *ff = (prefentry->flagflags & sufentry->flagflags & FF_CROSSPRODUCT) ?
+ VoidString : AffixFieldFlag(prefentry);
- if (FindWord(Conf, pnewword, ff, flag))
+ if (FindWord(Conf->dict, pnewword, ff, flag))
cur += addToResult(forms, cur, pnewword);
}
}
- pnode = prefix->node;
+ pnode = (AffixNode *) DictNodeGet(DictPrefixNodes(Conf->dict),
+ prefix->node_offset);
}
}
}
- snode = suffix->node;
+ snode = (AffixNode *) DictNodeGet(DictSuffixNodes(Conf->dict),
+ suffix->node_offset);
}
if (cur == forms)
@@ -2259,7 +2629,8 @@ typedef struct SplitVar
} SplitVar;
static int
-CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
+CheckCompoundAffixes(IspellDictData *dict, CMPDAffix **ptr,
+ char *word, int len, bool CheckInPlace)
{
bool issuffix;
@@ -2269,9 +2640,12 @@ CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
if (CheckInPlace)
{
- while ((*ptr)->affix)
+ while ((*ptr)->affix != ISPELL_INVALID_INDEX)
{
- if (len > (*ptr)->len && strncmp((*ptr)->affix, word, (*ptr)->len) == 0)
+ AFFIX *affix = (AFFIX *) DictAffixGet(dict, (*ptr)->affix);
+
+ if (len > (*ptr)->len &&
+ strncmp(AffixFieldRepl(affix), word, (*ptr)->len) == 0)
{
len = (*ptr)->len;
issuffix = (*ptr)->issuffix;
@@ -2285,9 +2659,12 @@ CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
{
char *affbegin;
- while ((*ptr)->affix)
+ while ((*ptr)->affix != ISPELL_INVALID_INDEX)
{
- if (len > (*ptr)->len && (affbegin = strstr(word, (*ptr)->affix)) != NULL)
+ AFFIX *affix = (AFFIX *) DictAffixGet(dict, (*ptr)->affix);
+
+ if (len > (*ptr)->len &&
+ (affbegin = strstr(word, AffixFieldRepl(affix))) != NULL)
{
len = (*ptr)->len + (affbegin - word);
issuffix = (*ptr)->issuffix;
@@ -2339,13 +2716,14 @@ AddStem(SplitVar *v, char *word)
}
static SplitVar *
-SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int wordlen, int startpos, int minpos)
+SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig,
+ char *word, int wordlen, int startpos, int minpos)
{
SplitVar *var = NULL;
SPNodeData *StopLow,
*StopHigh,
*StopMiddle = NULL;
- SPNode *node = (snode) ? snode : Conf->Dictionary;
+ SPNode *node = (snode) ? snode : (SPNode *) DictDictNodes(Conf->dict);
int level = (snode) ? minpos : startpos; /* recursive
* minpos==level */
int lenaff;
@@ -2360,8 +2738,11 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
while (level < wordlen)
{
/* find word with epenthetic or/and compound affix */
- caff = Conf->CompoundAffix;
- while (level > startpos && (lenaff = CheckCompoundAffixes(&caff, word + level, wordlen - level, (node) ? true : false)) >= 0)
+ caff = (CMPDAffix *) DictCompoundAffix(Conf->dict);
+ while (level > startpos &&
+ (lenaff = CheckCompoundAffixes(Conf->dict, &caff,
+ word + level, wordlen - level,
+ (node) ? true : false)) >= 0)
{
/*
* there is one of compound affixes, so check word for existings
@@ -2408,7 +2789,8 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
while (ptr->next)
ptr = ptr->next;
- ptr->next = SplitToVariants(Conf, NULL, new, word, wordlen, startpos + lenaff, startpos + lenaff);
+ ptr->next = SplitToVariants(Conf, NULL, new, word, wordlen,
+ startpos + lenaff, startpos + lenaff);
pfree(new->stem);
pfree(new);
@@ -2467,13 +2849,14 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
/* we can find next word */
level++;
AddStem(var, pnstrdup(word + startpos, level - startpos));
- node = Conf->Dictionary;
+ node = (SPNode *) DictDictNodes(Conf->dict);
startpos = level;
continue;
}
}
}
- node = StopMiddle->node;
+ node = (SPNode *) DictNodeGet(DictDictNodes(Conf->dict),
+ StopMiddle->node_offset);
}
else
node = NULL;
@@ -2523,7 +2906,7 @@ NINormalizeWord(IspellDict *Conf, char *word)
pfree(res);
}
- if (Conf->usecompound)
+ if (Conf->dict->usecompound)
{
int wordlen = strlen(word);
SplitVar *ptr,
diff --git a/src/include/tsearch/dicts/spell.h b/src/include/tsearch/dicts/spell.h
index 210f97dda9..b40cf379eb 100644
--- a/src/include/tsearch/dicts/spell.h
+++ b/src/include/tsearch/dicts/spell.h
@@ -18,21 +18,23 @@
#include "tsearch/dicts/regis.h"
#include "tsearch/ts_public.h"
+#define ISPELL_INVALID_INDEX (0x7FFFF)
+#define ISPELL_INVALID_OFFSET (0xFFFFFFFF)
+
/*
* SPNode and SPNodeData are used to represent prefix tree (Trie) to store
* a words list.
*/
-struct SPNode;
-
typedef struct
{
uint32 val:8,
isword:1,
/* Stores compound flags listed below */
compoundflag:4,
- /* Reference to an entry of the AffixData field */
+ /* Index of an entry of the AffixData field */
affix:19;
- struct SPNode *node;
+ /* Offset to a node of the DictNodes field */
+ uint32 node_offset;
} SPNodeData;
/*
@@ -86,21 +88,55 @@ typedef struct spell_struct
*/
typedef struct aff_struct
{
- char *flag;
/* FF_SUFFIX or FF_PREFIX */
- uint32 type:1,
+ uint16 type:1,
flagflags:7,
issimple:1,
isregis:1,
- replen:14;
- char *find;
- char *repl;
+ flaglen:2;
+
+ /* 8 bytes could be too much for repl, find and mask, but who knows */
+ uint8 replen;
+ uint8 findlen;
+ uint8 masklen;
+
+ /*
+ * fields stores the following data (each ends with \0):
+ * - repl
+ * - find
+ * - mask
+ * - flag - one character (if FM_CHAR),
+ * two characters (if FM_LONG),
+ * number, >= 0 and < 65536 (if FM_NUM).
+ */
+ char fields[FLEXIBLE_ARRAY_MEMBER];
+} AFFIX;
+
+#define AF_FLAG_MAXSIZE 5 /* strlen(65536) */
+#define AF_REPL_MAXSIZE 255 /* 8 bytes */
+
+#define AFFIXHDRSZ (offsetof(AFFIX, fields))
+
+#define AffixFieldRepl(af) ((af)->fields)
+#define AffixFieldFind(af) ((af)->fields + (af)->replen + 1)
+#define AffixFieldMask(af) (AffixFieldFind(af) + (af)->findlen + 1)
+#define AffixFieldFlag(af) (AffixFieldMask(af) + (af)->masklen + 1)
+#define AffixGetSize(af) (AFFIXHDRSZ + (af)->replen + 1 + (af)->findlen + 1 \
+ + (af)->masklen + 1 + strlen(AffixFieldFlag(af)) + 1)
+
+/*
+ * Stores compiled regular expression of affix. AffixReg uses mask field of
+ * AFFIX as a regular expression.
+ */
+typedef struct AffixReg
+{
+ bool iscompiled;
union
{
regex_t regex;
Regis regis;
- } reg;
-} AFFIX;
+ } r;
+} AffixReg;
/*
* affixes use dictionary flags too
@@ -120,14 +156,13 @@ typedef struct aff_struct
* AffixNode and AffixNodeData are used to represent prefix tree (Trie) to store
* an affix list.
*/
-struct AffixNode;
-
typedef struct
{
- uint32 val:8,
- naff:24;
- AFFIX **aff;
- struct AffixNode *node;
+ uint8 val;
+ uint32 affstart;
+ uint32 affend;
+ /* Offset to a node of the PrefixNodes or SuffixNodes field */
+ uint32 node_offset;
} AffixNodeData;
typedef struct AffixNode
@@ -139,9 +174,19 @@ typedef struct AffixNode
#define ANHRDSZ (offsetof(AffixNode, data))
+typedef struct NodeArray
+{
+ char *Nodes;
+ uint32 NodesSize; /* allocated size of Nodes */
+ uint32 NodesEnd; /* end of data in Nodes */
+} NodeArray;
+
+#define NodeArrayGet(na, of) (((of) == ISPELL_INVALID_OFFSET) ? NULL : (na)->Nodes + (of))
+
typedef struct
{
- char *affix;
+ /* Index of an affix of the Affix field */
+ uint32 affix;
int len;
bool issuffix;
} CMPDAffix;
@@ -176,30 +221,71 @@ typedef struct CompoundAffixFlag
#define FLAGNUM_MAXSIZE (1 << 16)
-typedef struct
+typedef struct IspellDictData
{
- int maffixes;
- int naffixes;
- AFFIX *Affix;
-
- AffixNode *Suffix;
- AffixNode *Prefix;
+ FlagMode flagMode;
+ bool usecompound;
- SPNode *Dictionary;
- /* Array of sets of affixes */
- char **AffixData;
- int lenAffixData;
- int nAffixData;
bool useFlagAliases;
- CMPDAffix *CompoundAffix;
+ uint32 nAffixData;
+ uint32 AffixDataStart;
- bool usecompound;
- FlagMode flagMode;
+ uint32 AffixOffsetStart;
+ uint32 AffixStart;
+ uint32 nAffix;
+
+ uint32 DictNodesStart;
+ uint32 PrefixNodesStart;
+ uint32 SuffixNodesStart;
+
+ uint32 CompoundAffixStart;
/*
- * All follow fields are actually needed only for initialization
+ * data stores:
+ * - AffixData - array of affix sets
+ * - Affix - sorted array of affixes
+ * - DictNodes - prefix tree of a word list
+ * - PrefixNodes - prefix tree of a prefix list
+ * - SuffixNodes - prefix tree of a suffix list
+ * - CompoundAffix - array of compound affixes
*/
+ char data[FLEXIBLE_ARRAY_MEMBER];
+} IspellDictData;
+
+#define IspellDictDataHdrSize (offsetof(IspellDictData, data))
+
+#define DictAffixDataOffset(d) ((d)->data)
+#define DictAffixData(d) ((d)->data + (d)->AffixDataStart)
+#define DictAffixDataGet(d, i) (((i) == ISPELL_INVALID_INDEX) ? NULL : \
+ DictAffixData(d) + ((uint32 *) DictAffixDataOffset(d))[i])
+
+#define DictAffixOffset(d) ((d)->data + (d)->AffixOffsetStart)
+#define DictAffix(d) ((d)->data + (d)->AffixStart)
+#define DictAffixGet(d, i) (((i) == ISPELL_INVALID_INDEX) ? NULL : \
+ DictAffix(d) + ((uint32 *) DictAffixOffset(d))[i])
+
+#define DictDictNodes(d) ((d)->data + (d)->DictNodesStart)
+#define DictPrefixNodes(d) ((d)->data + (d)->PrefixNodesStart)
+#define DictSuffixNodes(d) ((d)->data + (d)->SuffixNodesStart)
+#define DictNodeGet(node_start, of) (((of) == ISPELL_INVALID_OFFSET) ? NULL : \
+ (char *) (node_start) + (of))
+
+#define DictCompoundAffix(d) ((d)->data + (d)->CompoundAffixStart)
+
+/*
+ * IspellDictBuild is used to initialize IspellDictData struct. This is a
+ * temporary structure which is set up by NIStartBuild() and released by
+ * NIFinishBuild().
+ */
+typedef struct IspellDictBuild
+{
+ MemoryContext buildCxt; /* temp context for construction */
+
+ IspellDictData *dict;
+ uint32 dict_size;
+
+ /* Temporary data */
/* Array of Hunspell options in affix file */
CompoundAffixFlag *CompoundAffixFlags;
@@ -208,29 +294,73 @@ typedef struct
/* allocated length of CompoundAffixFlags array */
int mCompoundAffixFlag;
- /*
- * Remaining fields are only used during dictionary construction; they are
- * set up by NIStartBuild and cleared by NIFinishBuild.
- */
- MemoryContext buildCxt; /* temp context for construction */
-
- /* Temporary array of all words in the dict file */
+ /* Array of all words in the dict file */
SPELL **Spell;
- int nspell; /* number of valid entries in Spell array */
- int mspell; /* allocated length of Spell array */
+ int nSpell; /* number of valid entries in Spell array */
+ int mSpell; /* allocated length of Spell array */
+
+ /* Array of all affixes in the aff file */
+ AFFIX **Affix;
+ int nAffix; /* number of valid entries in Affix array */
+ int mAffix; /* allocated length of Affix array */
+ uint32 AffixSize;
+
+ /* Data for IspellDictData */
+
+ /* Array of sets of affixes */
+ uint32 *AffixDataOffset;
+ int nAffixData; /* number of affix sets */
+ int mAffixData; /* allocated number of affix sets */
+ char *AffixData;
+ uint32 AffixDataSize; /* allocated size of AffixData */
+ uint32 AffixDataEnd; /* end of data in AffixData */
+
+ /* Prefix tree which stores a word list */
+ NodeArray DictNodes;
+
+ /* Prefix tree which stores a prefix list */
+ NodeArray PrefixNodes;
+
+ /* Prefix tree which stores a suffix list */
+ NodeArray SuffixNodes;
- /* These are used to allocate "compact" data without palloc overhead */
- char *firstfree; /* first free address (always maxaligned) */
- size_t avail; /* free space remaining at firstfree */
+ /* Array of compound affixes */
+ CMPDAffix *CompoundAffix;
+ int nCompoundAffix; /* number of entries of CompoundAffix */
+} IspellDictBuild;
+
+#define AffixDataGet(d, i) ((d)->AffixData + (d)->AffixDataOffset[i])
+
+/*
+ * IspellDict is used within NINormalizeWord.
+ */
+typedef struct IspellDict
+{
+ /*
+ * Pointer to a DSM location of IspellDictData. Should be retrieved on
+ * every dispell_lexize() call.
+ */
+ IspellDictData *dict;
+ /*
+ * Array of regular expression of affixes. Each regular expression is
+ * compiled only on demand.
+ */
+ AffixReg *reg;
+ /*
+ * Memory context for compiling regular expressions.
+ */
+ MemoryContext dictCtx;
} IspellDict;
extern TSLexeme *NINormalizeWord(IspellDict *Conf, char *word);
-extern void NIStartBuild(IspellDict *Conf);
-extern void NIImportAffixes(IspellDict *Conf, const char *filename);
-extern void NIImportDictionary(IspellDict *Conf, const char *filename);
-extern void NISortDictionary(IspellDict *Conf);
-extern void NISortAffixes(IspellDict *Conf);
-extern void NIFinishBuild(IspellDict *Conf);
+extern void NIStartBuild(IspellDictBuild *ConfBuild);
+extern void NIImportAffixes(IspellDictBuild *ConfBuild, const char *filename);
+extern void NIImportDictionary(IspellDictBuild *ConfBuild,
+ const char *filename);
+extern void NISortDictionary(IspellDictBuild *ConfBuild);
+extern void NISortAffixes(IspellDictBuild *ConfBuild);
+extern void NICopyData(IspellDictBuild *ConfBuild);
+extern void NIFinishBuild(IspellDictBuild *ConfBuild);
#endif
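
The spell.h changes above replace in-struct pointers (for example SPNodeData.node) with uint32 offsets that are resolved through DictNodeGet(), so the same bytes stay valid at whatever address a backend maps them. The following is a minimal sketch of that idea, not the patch's code: DemoNode and all demo_* names are hypothetical, and the node layout is simplified to one character plus a single child offset.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Sentinel meaning "no child"; plays the role of ISPELL_INVALID_OFFSET. */
#define DEMO_INVALID_OFFSET 0xFFFFFFFFu

/* Hypothetical, simplified trie node: a character and a child offset. */
typedef struct DemoNode
{
	uint8_t		val;
	uint32_t	child_offset;	/* byte offset from the start of the arena */
} DemoNode;

/* Resolve an offset against the arena base; same idea as DictNodeGet(). */
DemoNode *
demo_node_get(char *arena, uint32_t offset)
{
	if (offset == DEMO_INVALID_OFFSET)
		return NULL;
	return (DemoNode *) (arena + offset);
}

/* Append a childless node to the arena and return its offset. */
uint32_t
demo_node_add(char *arena, uint32_t *end, uint32_t size, uint8_t val)
{
	uint32_t	offset = *end;
	DemoNode   *node;

	assert(offset + sizeof(DemoNode) <= size);
	node = (DemoNode *) (arena + offset);
	node->val = val;
	node->child_offset = DEMO_INVALID_OFFSET;
	*end += (uint32_t) sizeof(DemoNode);
	return offset;
}

/* Follow child offsets from the root and count the nodes on the chain. */
int
demo_chain_length(char *arena, uint32_t root)
{
	int			n = 0;
	DemoNode   *node = demo_node_get(arena, root);

	while (node)
	{
		n++;
		node = demo_node_get(arena, node->child_offset);
	}
	return n;
}

/* Copy the used part of the arena to a new address, as if another backend
 * had mapped the same segment somewhere else. */
char *
demo_arena_copy(const char *arena, uint32_t used)
{
	char	   *copy = malloc(used);

	assert(copy != NULL);
	memcpy(copy, arena, used);
	return copy;
}
```

Because nodes refer to each other only by offset, the copy returned by demo_arena_copy() can be traversed with the same root offset as the original; with raw pointers the copied nodes would still point into the old buffer.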
0005-pg-ts-shared-dictinaries-view-v10.patch
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index fc81133f07..1714fd6b08 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -8228,6 +8228,11 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
<entry>time zone names</entry>
</row>
+ <row>
+ <entry><link linkend="view-pg-ts-shared-dictionaries"><structname>pg_ts_shared_dictionaries</structname></link></entry>
+ <entry>dictionaries currently in shared memory</entry>
+ </row>
+
<row>
<entry><link linkend="view-pg-user"><structname>pg_user</structname></link></entry>
<entry>database users</entry>
@@ -10983,6 +10988,63 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</sect1>
+ <sect1 id="view-pg-ts-shared-dictionaries">
+ <title><structname>pg_ts_shared_dictionaries</structname></title>
+
+ <indexterm zone="view-pg-ts-shared-dictionaries">
+ <primary>pg_ts_shared_dictionaries</primary>
+ </indexterm>
+
+ <para>
+ The <structname>pg_ts_shared_dictionaries</structname> view provides a
+ listing of all text search dictionaries that are currently allocated in
+ shared memory. The size of available space in shared memory is controlled by
+ <xref linkend="guc-shared-buffers"/>. A dictionary may have an option which
+ controls allocation in shared memory (see <xref linkend="textsearch-ispell-dictionary"/>).
+ </para>
+
+ <table>
+ <title><structname>pg_ts_shared_dictionaries</structname> Columns</title>
+
+ <tgroup cols="4">
+ <thead>
+ <row>
+ <entry>Name</entry>
+ <entry>Type</entry>
+ <entry>References</entry>
+ <entry>Description</entry>
+ </row>
+ </thead>
+ <tbody>
+ <row>
+ <entry><structfield>dictoid</structfield></entry>
+ <entry><type>oid</type></entry>
+ <entry><literal><link linkend="catalog-pg-ts-dict"><structname>pg_ts_dict</structname></link>.oid</literal></entry>
+ <entry>The OID of the text search dictionary located in shared memory</entry>
+ </row>
+ <row>
+ <entry><structfield>schemaname</structfield></entry>
+ <entry><type>name</type></entry>
+ <entry><literal><link linkend="catalog-pg-namespace"><structname>pg_namespace</structname></link>.nspname</literal></entry>
+ <entry>The name of schema containing the text search dictionary</entry>
+ </row>
+ <row>
+ <entry><structfield>dictname</structfield></entry>
+ <entry><type>name</type></entry>
+ <entry><literal><link linkend="catalog-pg-ts-dict"><structname>pg_ts_dict</structname></link>.dictname</literal></entry>
+ <entry>The text search dictionary name</entry>
+ </row>
+ <row>
+ <entry><structfield>size</structfield></entry>
+ <entry><type>bigint</type></entry>
+ <entry></entry>
+ <entry>Size of the text search dictionary in bytes</entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+ </sect1>
+
<sect1 id="view-pg-user">
<title><structname>pg_user</structname></title>
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index f3288fbb3f..02e8e8aa90 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -3047,6 +3047,12 @@ CREATE TEXT SEARCH DICTIONARY english_stem (
parameter to a value greater than zero or to a value <literal>-1</literal>.
</para>
+ <para>
+ The list of dictionaries currently located in shared memory can be retrieved from the
+ <link linkend="view-pg-ts-shared-dictionaries"><structname>pg_ts_shared_dictionaries</structname></link>
+ view.
+ </para>
+
</sect2>
</sect1>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 5e6e8a64f6..ab7ee973d9 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -506,6 +506,9 @@ CREATE VIEW pg_config AS
REVOKE ALL on pg_config FROM PUBLIC;
REVOKE EXECUTE ON FUNCTION pg_config() FROM PUBLIC;
+CREATE VIEW pg_ts_shared_dictionaries AS
+ SELECT * FROM pg_ts_shared_dictionaries();
+
-- Statistics views
CREATE VIEW pg_stat_all_tables AS
diff --git a/src/backend/tsearch/ts_shared.c b/src/backend/tsearch/ts_shared.c
index 2a8b80bce8..fe632ecf6f 100644
--- a/src/backend/tsearch/ts_shared.c
+++ b/src/backend/tsearch/ts_shared.c
@@ -13,11 +13,18 @@
*/
#include "postgres.h"
+#include "funcapi.h"
+#include "miscadmin.h"
+
+#include "access/htup_details.h"
+#include "catalog/pg_ts_dict.h"
#include "lib/dshash.h"
#include "storage/lwlock.h"
#include "storage/shmem.h"
#include "tsearch/ts_shared.h"
+#include "utils/builtins.h"
#include "utils/hashutils.h"
+#include "utils/lsyscache.h"
#include "utils/memutils.h"
@@ -377,3 +384,100 @@ recheck_table:
MemoryContextSwitchTo(old_context);
}
+
+/*
+ * pg_ts_shared_dictionaries - SQL SRF showing dictionaries currently in
+ * shared memory.
+ */
+Datum
+pg_ts_shared_dictionaries(PG_FUNCTION_ARGS)
+{
+ ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
+ MemoryContext oldcontext;
+ TupleDesc tupdesc;
+ Tuplestorestate *tupstore;
+ Relation rel;
+ HeapTuple tuple;
+ SysScanDesc scan;
+
+ /* check to see if caller supports us returning a tuplestore */
+ if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("set-valued function called in context that cannot accept a set")));
+ if (!(rsinfo->allowedModes & SFRM_Materialize))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("materialize mode required, but it is not " \
+ "allowed in this context")));
+
+ /* Build a tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ elog(ERROR, "return type must be a row type");
+
+ /* Build tuplestore to hold the result rows */
+ oldcontext = MemoryContextSwitchTo(rsinfo->econtext->ecxt_per_query_memory);
+
+ tupstore = tuplestore_begin_heap(true, false, work_mem);
+ rsinfo->returnMode = SFRM_Materialize;
+ rsinfo->setResult = tupstore;
+ rsinfo->setDesc = tupdesc;
+
+ MemoryContextSwitchTo(oldcontext);
+
+ init_dict_table();
+
+ /*
+ * If a hash table wasn't created return zero records.
+ */
+ if (!DsaPointerIsValid(tsearch_ctl->dict_table_handle))
+ {
+ tuplestore_donestoring(tupstore);
+
+ PG_RETURN_VOID();
+ }
+
+ /* Start to scan pg_ts_dict */
+ rel = heap_open(TSDictionaryRelationId, AccessShareLock);
+ scan = systable_beginscan(rel, InvalidOid, false, NULL, 0, NULL);
+
+ while (HeapTupleIsValid(tuple = systable_getnext(scan)))
+ {
+ Datum values[4];
+ bool nulls[4];
+ Form_pg_ts_dict dict = (Form_pg_ts_dict) GETSTRUCT(tuple);
+ Oid dictid = HeapTupleGetOid(tuple);
+ TsearchDictEntry *entry;
+ NameData dict_name;
+
+ /* If the dictionary isn't located in shared memory, try the next one */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, &dictid, false);
+ if (!entry)
+ continue;
+
+ namecpy(&dict_name, &dict->dictname);
+
+ memset(nulls, 0, sizeof(nulls));
+
+ values[0] = ObjectIdGetDatum(dictid);
+
+ if (OidIsValid(dict->dictnamespace))
+ values[1] = CStringGetDatum(get_namespace_name(dict->dictnamespace));
+ else
+ nulls[1] = true;
+
+ values[2] = NameGetDatum(&dict_name);
+ values[3] = Int64GetDatum(entry->dict_size);
+
+ dshash_release_lock(dict_table, entry);
+
+ tuplestore_putvalues(tupstore, tupdesc, values, nulls);
+ }
+
+ systable_endscan(scan);
+ heap_close(rel, AccessShareLock);
+
+ tuplestore_donestoring(tupstore);
+
+ PG_RETURN_VOID();
+}
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index bfc90098f8..6cd60f7110 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -4975,6 +4975,9 @@ DESCR("trigger for automatic update of tsvector column");
DATA(insert OID = 3759 ( get_current_ts_config PGNSP PGUID 12 1 0 0 0 f f f t f s s 0 0 3734 "" _null_ _null_ _null_ _null_ _null_ get_current_ts_config _null_ _null_ _null_ ));
DESCR("get current tsearch configuration");
+DATA(insert OID = 4213 ( pg_ts_shared_dictionaries PGNSP PGUID 12 1 10 0 0 f f f f t s s 0 0 2249 "" "{26,19,19,20}" "{o,o,o,o}" "{dictoid,schemaname,dictname,size}" _null_ _null_ pg_ts_shared_dictionaries _null_ _null_ _null_ ));
+DESCR("information about text search dictionaries currently in shared memory");
+
DATA(insert OID = 3736 ( regconfigin PGNSP PGUID 12 1 0 0 0 f f f t f s s 1 0 3734 "2275" _null_ _null_ _null_ _null_ _null_ regconfigin _null_ _null_ _null_ ));
DESCR("I/O");
DATA(insert OID = 3737 ( regconfigout PGNSP PGUID 12 1 0 0 0 f f f t f s s 1 0 2275 "3734" _null_ _null_ _null_ _null_ _null_ regconfigout _null_ _null_ _null_ ));
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 5149b72fe9..7e61b46419 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2211,6 +2211,11 @@ pg_timezone_names| SELECT pg_timezone_names.name,
pg_timezone_names.utc_offset,
pg_timezone_names.is_dst
FROM pg_timezone_names() pg_timezone_names(name, abbrev, utc_offset, is_dst);
+pg_ts_shared_dictionaries| SELECT pg_ts_shared_dictionaries.dictoid,
+ pg_ts_shared_dictionaries.schemaname,
+ pg_ts_shared_dictionaries.dictname,
+ pg_ts_shared_dictionaries.size
+ FROM pg_ts_shared_dictionaries() pg_ts_shared_dictionaries(dictoid, schemaname, dictname, size);
pg_user| SELECT pg_shadow.usename,
pg_shadow.usesysid,
pg_shadow.usecreatedb,
0006-Shared-memory-ispell-option-v10.patch
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index 02e8e8aa90..65c2d6daa3 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -2829,6 +2829,7 @@ iconv -f ISO_8859-1 -t UTF-8 -o nn_no.dict nn_NO.dic
<programlisting>
CREATE TEXT SEARCH DICTIONARY english_hunspell (
TEMPLATE = ispell,
+ Shareable = false,
DictFile = en_us,
AffFile = en_us,
Stopwords = english);
@@ -2843,6 +2844,9 @@ CREATE TEXT SEARCH DICTIONARY english_hunspell (
The stop-words file has the same format explained above for the
<literal>simple</literal> dictionary type. The format of the other files is
not specified here but is available from the above-mentioned web sites.
+ <literal>Shareable</literal> controls loading into shared memory. By
+ default it is <literal>true</literal> (see more in
+ <xref linkend="textsearch-shared-dictionaries"/>).
</para>
<para>
@@ -3040,6 +3044,8 @@ CREATE TEXT SEARCH DICTIONARY english_stem (
both CPU and time-consuming. Instead of doing this in each backend when
it needs a dictionary for the first time, the compiled dictionary may be
stored in shared memory so that it may be reused by other backends.
+ Currently only <application>Ispell</application> supports loading into
+ shared memory.
</para>
<para>
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index 6294a52af3..1475cdb908 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -38,7 +38,8 @@ typedef struct
} DictISpell;
static void parse_dictoptions(List *dictoptions,
- char **dictfile, char **afffile, char **stopfile);
+ char **dictfile, char **afffile, char **stopfile,
+ bool *isshared);
static void *dispell_build(List *dictoptions, Size *size);
Datum
@@ -48,15 +49,21 @@ dispell_init(PG_FUNCTION_ARGS)
DictISpell *d;
void *dict_location;
char *stopfile;
+ bool isshared;
d = (DictISpell *) palloc0(sizeof(DictISpell));
- parse_dictoptions(init_data->dictoptions, NULL, NULL, &stopfile);
+ parse_dictoptions(init_data->dictoptions, NULL, NULL, &stopfile, &isshared);
+ /* Make stop word list */
if (stopfile)
readstoplist(stopfile, &(d->stoplist), lowerstr);
- dict_location = ts_dict_shmem_location(init_data, dispell_build);
+ /* Make or get from shared memory dictionary itself */
+ if (isshared)
+ dict_location = ts_dict_shmem_location(init_data, dispell_build);
+ else
+ dict_location = dispell_build(init_data->dictoptions, NULL);
Assert(dict_location);
d->obj.dict = (IspellDictData *) dict_location;
@@ -110,9 +117,10 @@ dispell_lexize(PG_FUNCTION_ARGS)
static void
parse_dictoptions(List *dictoptions, char **dictfile, char **afffile,
- char **stopfile)
+ char **stopfile, bool *isshared)
{
ListCell *l;
+ bool isshared_defined = false;
if (dictfile)
*dictfile = NULL;
@@ -120,6 +128,8 @@ parse_dictoptions(List *dictoptions, char **dictfile, char **afffile,
*afffile = NULL;
if (stopfile)
*stopfile = NULL;
+ if (isshared)
+ *isshared = true;
foreach(l, dictoptions)
{
@@ -158,6 +168,19 @@ parse_dictoptions(List *dictoptions, char **dictfile, char **afffile,
errmsg("multiple StopWords parameters")));
*stopfile = defGetString(defel);
}
+ else if (pg_strcasecmp(defel->defname, "Shareable") == 0)
+ {
+ if (!isshared)
+ continue;
+
+ if (isshared_defined)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("multiple Shareable parameters")));
+
+ *isshared = defGetBoolean(defel);
+ isshared_defined = true;
+ }
else
{
ereport(ERROR,
@@ -180,7 +203,7 @@ dispell_build(List *dictoptions, Size *size)
char *dictfile,
*afffile;
- parse_dictoptions(dictoptions, &dictfile, &afffile, NULL);
+ parse_dictoptions(dictoptions, &dictfile, &afffile, NULL, NULL);
if (!afffile)
{
@@ -212,6 +235,7 @@ dispell_build(List *dictoptions, Size *size)
NIFinishBuild(&build);
/* Return the buffer and its size */
- *size = build.dict_size;
+ if (size)
+ *size = build.dict_size;
return build.dict;
}
diff --git a/src/test/regress/expected/tsdicts.out b/src/test/regress/expected/tsdicts.out
index 0c1d7c7675..71a43b74e8 100644
--- a/src/test/regress/expected/tsdicts.out
+++ b/src/test/regress/expected/tsdicts.out
@@ -194,6 +194,7 @@ SELECT ts_lexize('hunspell', 'footballyklubber');
-- Test ISpell dictionary with hunspell affix file with FLAG long parameter
CREATE TEXT SEARCH DICTIONARY hunspell_long (
Template=ispell,
+ Shareable=false,
DictFile=hunspell_sample_long,
AffFile=hunspell_sample_long
);
@@ -290,6 +291,7 @@ SELECT ts_lexize('hunspell_long', 'footballyklubber');
-- Test ISpell dictionary with hunspell affix file with FLAG num parameter
CREATE TEXT SEARCH DICTIONARY hunspell_num (
Template=ispell,
+ Shareable=false,
DictFile=hunspell_sample_num,
AffFile=hunspell_sample_num
);
@@ -588,3 +590,64 @@ CREATE TEXT SEARCH DICTIONARY tsdict_case
"AffFile" = ispell_sample
);
ERROR: unrecognized Ispell parameter: "DictFile"
+-- Test shared dictionaries
+CREATE TEXT SEARCH DICTIONARY shared_ispell (
+ Template=ispell,
+ DictFile=ispell_sample,
+ AffFile=ispell_sample
+);
+SHOW max_shared_dictionaries_size;
+ max_shared_dictionaries_size
+------------------------------
+ 100MB
+(1 row)
+
+-- Make sure that dictionaries in shared memory
+SELECT ts_lexize('ispell', 'skies');
+ ts_lexize
+-----------
+ {sky}
+(1 row)
+
+SELECT ts_lexize('hunspell', 'skies');
+ ts_lexize
+-----------
+ {sky}
+(1 row)
+
+SELECT ts_lexize('shared_ispell', 'skies');
+ ts_lexize
+-----------
+ {sky}
+(1 row)
+
+SELECT schemaname, dictname FROM pg_ts_shared_dictionaries;
+ schemaname | dictname
+------------+---------------
+ public | ispell
+ public | hunspell
+ public | shared_ispell
+(3 rows)
+
+-- shared_ispell space should be released in shared memory
+DROP TEXT SEARCH DICTIONARY shared_ispell;
+-- Make sure that dictionaries in shared memory, DROP invalidates cache
+SELECT ts_lexize('ispell', 'skies');
+ ts_lexize
+-----------
+ {sky}
+(1 row)
+
+SELECT ts_lexize('hunspell', 'skies');
+ ts_lexize
+-----------
+ {sky}
+(1 row)
+
+SELECT schemaname, dictname FROM pg_ts_shared_dictionaries;
+ schemaname | dictname
+------------+----------
+ public | ispell
+ public | hunspell
+(2 rows)
+
diff --git a/src/test/regress/sql/tsdicts.sql b/src/test/regress/sql/tsdicts.sql
index 1633c0d066..d6e69d5511 100644
--- a/src/test/regress/sql/tsdicts.sql
+++ b/src/test/regress/sql/tsdicts.sql
@@ -51,6 +51,7 @@ SELECT ts_lexize('hunspell', 'footballyklubber');
-- Test ISpell dictionary with hunspell affix file with FLAG long parameter
CREATE TEXT SEARCH DICTIONARY hunspell_long (
Template=ispell,
+ Shareable=false,
DictFile=hunspell_sample_long,
AffFile=hunspell_sample_long
);
@@ -75,6 +76,7 @@ SELECT ts_lexize('hunspell_long', 'footballyklubber');
-- Test ISpell dictionary with hunspell affix file with FLAG num parameter
CREATE TEXT SEARCH DICTIONARY hunspell_num (
Template=ispell,
+ Shareable=false,
DictFile=hunspell_sample_num,
AffFile=hunspell_sample_num
);
@@ -196,3 +198,28 @@ CREATE TEXT SEARCH DICTIONARY tsdict_case
"DictFile" = ispell_sample,
"AffFile" = ispell_sample
);
+
+-- Test shared dictionaries
+CREATE TEXT SEARCH DICTIONARY shared_ispell (
+ Template=ispell,
+ DictFile=ispell_sample,
+ AffFile=ispell_sample
+);
+
+SHOW max_shared_dictionaries_size;
+
+-- Make sure that dictionaries in shared memory
+SELECT ts_lexize('ispell', 'skies');
+SELECT ts_lexize('hunspell', 'skies');
+SELECT ts_lexize('shared_ispell', 'skies');
+
+SELECT schemaname, dictname FROM pg_ts_shared_dictionaries;
+
+-- shared_ispell space should be released in shared memory
+DROP TEXT SEARCH DICTIONARY shared_ispell;
+
+-- Make sure that dictionaries in shared memory, DROP invalidates cache
+SELECT ts_lexize('ispell', 'skies');
+SELECT ts_lexize('hunspell', 'skies');
+
+SELECT schemaname, dictname FROM pg_ts_shared_dictionaries;
Arthur Zakirov <a.zakirov@postgrespro.ru> writes:
[ v10 patch versions ]
I took a quick look through this. I agree with the comments about
mmap-ability not being something we should insist on now, and maybe
not ever. However, in order to keep our options open, it seems like
we should minimize the amount of API we expose that's based on the
current implementation. That leads me to the following thoughts:
* I cannot imagine a use-case for setting max_shared_dictionaries_size
to anything except "unlimited". If it's not that, and you exceed it,
then subsequent backends load private copies of the dictionary, making
your memory situation rapidly worse not better. I think we should lose
that GUC altogether and just load dictionaries automatically.
* Similarly, I see no point in a "sharable" option on individual
dictionaries, especially when there's only one allowed setting for
most dictionary types. Let's lose that too.
* And that leads us to not particularly need a view telling which
dictionaries are loaded, either. It's just an implementation detail
that users don't need to worry about.
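
The memory argument in the first point can be made concrete with a small calculation. This is only a sketch under stated assumptions: every backend uses every dictionary, a dictionary that does not fit under the limit is loaded privately by each backend, and the function name, the DEMO_UNLIMITED constant and the sizes used below are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical marker for "no limit on shared dictionary memory". */
#define DEMO_UNLIMITED 0

/*
 * Estimate total dictionary memory in MB.  Dictionaries are placed in
 * shared memory in array order while they fit under limit_mb; each one
 * that does not fit is assumed to be loaded privately by every backend.
 */
uint64_t
demo_total_mb(const uint64_t *dict_mb, int ndicts,
			  uint64_t limit_mb, uint64_t nbackends)
{
	uint64_t	shared_used = 0;
	uint64_t	total = 0;
	int			i;

	assert(ndicts >= 0);

	for (i = 0; i < ndicts; i++)
	{
		if (limit_mb == DEMO_UNLIMITED ||
			shared_used + dict_mb[i] <= limit_mb)
		{
			/* one shared copy, regardless of the number of backends */
			shared_used += dict_mb[i];
			total += dict_mb[i];
		}
		else
		{
			/* over the limit: one private copy per backend */
			total += dict_mb[i] * nbackends;
		}
	}
	return total;
}
```

With two 60 MB dictionaries, a 100 MB limit and 50 backends, the second dictionary no longer fits and the estimate is 60 + 60 * 50 = 3060 MB, against 120 MB with no limit.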
This does beg the question of whether we need a way to flush dictionary
contents that's short of restarting the server (or short of dropping and
recreating the dictionary). I'm not sure, but even if we do, none of
the above is necessary for that.
I do think it's required that changing the dictionary's options with
ALTER TEXT SEARCH DICTIONARY automatically cause a reload; but if that's
happening with this patch, I don't see where. (It might work to use
the combination of dictionary OID and TID of the dictionary's pg_ts_dict
tuple as the lookup key for shared dictionaries. Oh, and have you
thought about the possibility of conflicting OIDs in different DBs?
Probably the database OID has to be part of the key, as well.)
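
A lookup key along the lines suggested above might be sketched as follows. This is an illustration only, not code from the patch: DemoOid, DemoTid, DemoDictKey and the demo_* functions are hypothetical stand-ins for Oid, ItemPointerData and whatever key type the shared hash table would actually use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's Oid. */
typedef uint32_t DemoOid;

/* Stand-in for a tuple identifier (block number plus line offset). */
typedef struct DemoTid
{
	uint32_t	block;
	uint16_t	offset;
} DemoTid;

/* Hypothetical composite key for a dictionary held in shared memory. */
typedef struct DemoDictKey
{
	DemoOid		db_oid;			/* dictionary OIDs can collide across DBs */
	DemoOid		dict_oid;		/* OID of the pg_ts_dict row */
	DemoTid		dict_tid;		/* changes when the row is replaced */
} DemoDictKey;

/* Fill in every field of the key. */
void
demo_key_init(DemoDictKey *key, DemoOid db_oid, DemoOid dict_oid,
			  uint32_t block, uint16_t offset)
{
	assert(key != NULL);
	key->db_oid = db_oid;
	key->dict_oid = dict_oid;
	key->dict_tid.block = block;
	key->dict_tid.offset = offset;
}

/* Compare field by field so that struct padding never matters. */
bool
demo_key_equal(const DemoDictKey *a, const DemoDictKey *b)
{
	return a->db_oid == b->db_oid &&
		a->dict_oid == b->dict_oid &&
		a->dict_tid.block == b->dict_tid.block &&
		a->dict_tid.offset == b->dict_tid.offset;
}
```

The intent of including the TID is that an update of the pg_ts_dict row, such as the one ALTER TEXT SEARCH DICTIONARY performs, writes a new tuple version at a different TID, so the old key stops matching and the dictionary is rebuilt on the next lookup. Including the database OID keeps equal dictionary OIDs in different databases apart.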
Also, the scheme for releasing the dictionary DSM during
RemoveTSDictionaryById is uncertain and full of race conditions:
the DROP might roll back later, or someone might come along and
start using the dictionary (causing a fresh DSM load) before the
DROP commits and makes the dictionary invisible to other sessions.
I don't think that either of those are necessarily fatal objections,
but there needs to be some commentary there explaining what happens.
BTW, I was going to complain that this patch alters the API for
dictionary template init functions without any documentation updates;
but then I realized that there isn't any documentation to update.
That pretty well sucks, but I suppose it's not the job of this patch
to improve that situation. Still, you could spend a bit more effort on
the commentary in ts_public.h in 0002, because that commentary is as
close to an API spec as we've got.
regards, tom lane
On 3/24/18 9:56 PM, Tom Lane wrote:
Arthur Zakirov <a.zakirov@postgrespro.ru> writes:
[ v10 patch versions ]
I took a quick look through this. I agree with the comments about
mmap-ability not being something we should insist on now, and maybe
not ever. However, in order to keep our options open, it seems like
we should minimize the amount of API we expose that's based on the
current implementation. That leads me to the following thoughts:
* I cannot imagine a use-case for setting max_shared_dictionaries_size
to anything except "unlimited". If it's not that, and you exceed it,
then subsequent backends load private copies of the dictionary, making
your memory situation rapidly worse not better. I think we should lose
that GUC altogether and just load dictionaries automatically.
Introduction of that limit is likely my fault. It came from an
extension I wrote a long time ago, but back then it was a necessity
because we did not have DSM. So in retrospect I agree with you - it's
not particularly useful and we should ditch it.
Arthur, let this be a lesson for you! You have to start fighting against
bogus feature requests from other people ;-)
* Similarly, I see no point in a "sharable" option on individual
dictionaries, especially when there's only one allowed setting for
most dictionary types. Let's lose that too.
I'm not so sure. Imagine you have a small number of dictionaries that
are used frequently, and then many that are used only once in a while.
A good example is a system handling documents in various languages,
serving "local" customers in a few local languages most of the time, but
then once in a while there's a request in another language.
In that case it makes sense to keep the frequently used ones in shared
memory all the time, and the rest load as needed (and throw it away when
the backend disconnects). Which is exactly what the 'shareable' option
is about ...
* And that leads us to not particularly need a view telling which
dictionaries are loaded, either. It's just an implementation detail
that users don't need to worry about.
Not so sure about this either. Of course, if we remove the memory limit,
it will be predictable which dictionaries are loaded in shared memory
and which are in the backends.
This does beg the question of whether we need a way to flush dictionary
contents that's short of restarting the server (or short of dropping and
recreating the dictionary). I'm not sure, but even if we do, none of
the above is necessary for that.
Ummm, I don't follow. Why would flushing a dictionary be similar to
restarting a server? It's certainly simpler to do a RELOAD on an
existing dictionary than having to do DROP+CREATE. I would not be
surprised if DROP+CREATE was significantly harder process-wise for many
admins (I mean, more approvals).
I do think it's required that changing the dictionary's options with
ALTER TEXT SEARCH DICTIONARY automatically cause a reload; but if that's
happening with this patch, I don't see where. (It might work to use
the combination of dictionary OID and TID of the dictionary's pg_ts_dict
tuple as the lookup key for shared dictionaries. Oh, and have you
thought about the possibility of conflicting OIDs in different DBs?
Probably the database OID has to be part of the key, as well.)
Not sure.
Also, the scheme for releasing the dictionary DSM during
RemoveTSDictionaryById is uncertain and full of race conditions:
the DROP might roll back later, or someone might come along and
start using the dictionary (causing a fresh DSM load) before the
DROP commits and makes the dictionary invisible to other sessions.
I don't think that either of those are necessarily fatal objections,
but there needs to be some commentary there explaining what happens.
Actually, I think that's an issue - such a race condition might easily
leak the shared memory forever (because the new dictionary will get a
different OID etc.). It probably is not happening very often, because
dictionaries are not dropped very often. But it needs fixing I think.
BTW, I was going to complain that this patch alters the API for
dictionary template init functions without any documentation updates;
but then I realized that there isn't any documentation to update.
That pretty well sucks, but I suppose it's not the job of this patch
to improve that situation. Still, you could spend a bit more effort on
the commentary in ts_public.h in 0002, because that commentary is as
close to an API spec as we've got.
Yeah :-(
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
On 3/24/18 9:56 PM, Tom Lane wrote:
Also, the scheme for releasing the dictionary DSM during
RemoveTSDictionaryById is uncertain and full of race conditions:
the DROP might roll back later, or someone might come along and
start using the dictionary (causing a fresh DSM load) before the
DROP commits and makes the dictionary invisible to other sessions.
I don't think that either of those are necessarily fatal objections,
but there needs to be some commentary there explaining what happens.
Actually, I think that's an issue - such a race condition might easily
leak the shared memory forever (because the new dictionary will get a
different OID etc.). It probably is not happening very often, because
dictionaries are not dropped very often. But it needs fixing I think.
My thought was (a) the ROLLBACK case is ok, because the next use of
the dictionary will reload it, and (b) the reload-concurrently-with-
DROP case is annoying, because indeed it leaks, but the window is small
and it probably won't be an issue in practice. We would need to be
sure that the DSM segment goes away at postmaster restart, but given
that I think it'd be tolerable. Of course it'd be better not to have
the race, but I see no easy way to prevent it -- do you?
regards, tom lane
On 03/25/2018 06:18 AM, Tom Lane wrote:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
On 3/24/18 9:56 PM, Tom Lane wrote:
Also, the scheme for releasing the dictionary DSM during
RemoveTSDictionaryById is uncertain and full of race conditions:
the DROP might roll back later, or someone might come along and
start using the dictionary (causing a fresh DSM load) before the
DROP commits and makes the dictionary invisible to other sessions.
I don't think that either of those are necessarily fatal objections,
but there needs to be some commentary there explaining what happens.
Actually, I think that's an issue - such a race condition might easily
leak the shared memory forever (because the new dictionary will get a
different OID etc.). It probably is not happening very often, because
dictionaries are not dropped very often. But it needs fixing I think.
My thought was (a) the ROLLBACK case is ok, because the next use of
the dictionary will reload it, and (b) the reload-concurrently-with-
DROP case is annoying, because indeed it leaks, but the window is small
and it probably won't be an issue in practice. We would need to be
sure that the DSM segment goes away at postmaster restart, but given
that I think it'd be tolerable. Of course it'd be better not to have
the race, but I see no easy way to prevent it -- do you?
Unfortunately no :( For a moment I thought that perhaps we could make it
a responsibility of the last user of the dictionary - set a flag in the
shared memory, and the last user would remove it. But the trouble is how
to decide who's the last one? Even with some simple reference counting,
we probably don't know if that's really the last one.
FWIW this is where the view listing dictionaries loaded into shared
memory would be helpful - you'd at least know there's a dictionary,
wasting memory.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
FWIW this is where the view listing dictionaries loaded into shared
memory would be helpful - you'd at least know there's a dictionary,
wasting memory.
Well, that's only because we failed to make the implementation transparent
:-(. But it's not unlikely that an mmap-based implementation would be
simply incapable of supporting such a view: the knowledge of whether a
particular file is mapped in would be pretty much process-local, I think.
So I'd really rather we don't add that.
Also, while these dictionaries are indeed kind of large relative to our
traditional view of shared memory, if they're in DSM segments that the
kernel can swap out then I really suspect that nobody would much care
if a few such segments had been leaked. I find it hard to imagine a
use-case where DROP race conditions would lead us to leak so many that
it becomes a serious problem. Maybe I lack imagination.
regards, tom lane
On Sat, Mar 24, 2018 at 04:56:36PM -0400, Tom Lane wrote:
Arthur Zakirov <a.zakirov@postgrespro.ru> writes:
[ v10 patch versions ]
Thank you for the review, Tom!
Tomas Vondra wrote:
Tom Lane wrote:
* I cannot imagine a use-case for setting max_shared_dictionaries_size
to anything except "unlimited". If it's not that, and you exceed it,
then subsequent backends load private copies of the dictionary, making
your memory situation rapidly worse not better. I think we should lose
that GUC altogether and just load dictionaries automatically.
Introduction of that limit is likely my fault. It came from an
extension I wrote a long time ago, but back then it was a necessity
because we did not have DSM. So in retrospect I agree with you - it's
not particularly useful and we should ditch it.
Arthur, let this be a lesson for you! You have to start fighting against
bogus feature requests from other people ;-)
Yeah, in this sense max_shared_dictionaries_size is pointless. I'll
remove it then :).
* Similarly, I see no point in a "sharable" option on individual
dictionaries, especially when there's only one allowed setting for
most dictionary types. Let's lose that too.
I think the "Shareable" option could be useful if building a shared
dictionary took much longer than building a non-shared one. It is
slightly longer because of an additional memcpy(), but I don't think
the difference is noticeable. So it is worth removing.
* And that leads us to not particularly need a view telling which
dictionaries are loaded, either. It's just an implementation detail
that users don't need to worry about.
If all dictionaries will be shareable then this view could be removed.
Unfortunately I think it can't help with leaked segments, I didn't find
a way to iterate dshash entries. That's why pg_ts_shared_dictionaries()
scans pg_ts_dict table instead of scanning dshash table.
I do think it's required that changing the dictionary's options with
ALTER TEXT SEARCH DICTIONARY automatically cause a reload; but if that's
happening with this patch, I don't see where. (It might work to use
the combination of dictionary OID and TID of the dictionary's pg_ts_dict
tuple as the lookup key for shared dictionaries. Oh, and have you
thought about the possibility of conflicting OIDs in different DBs?
Probably the database OID has to be part of the key, as well.)
Yes unfortunately ALTER TEXT SEARCH DICTIONARY doesn't reload a
dictionary. TID can help here. I thought about using XID too when I
started to work on RELOAD command. But I'm not sure that it is a good
idea, anyway XID isn't needed in current version.
Also, the scheme for releasing the dictionary DSM during
RemoveTSDictionaryById is uncertain and full of race conditions:
the DROP might roll back later, or someone might come along and
start using the dictionary (causing a fresh DSM load) before the
DROP commits and makes the dictionary invisible to other sessions.
I don't think that either of those are necessarily fatal objections,
but there needs to be some commentary there explaining what happens.
I missed this case. As you wrote below, the ROLLBACK case is ok. But I
don't have a solution for the second case for now. If I can't solve it, I'll
add additional comments in RemoveTSConfigurationById() and maybe in the
documentation if it's appropriate.
BTW, I was going to complain that this patch alters the API for
dictionary template init functions without any documentation updates;
but then I realized that there isn't any documentation to update.
That pretty well sucks, but I suppose it's not the job of this patch
to improve that situation. Still, you could spend a bit more effort on
the commentary in ts_public.h in 0002, because that commentary is as
close to an API spec as we've got.
I'll fix the comments.
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
On Sun, Mar 25, 2018 at 06:45:08AM +0200, Tomas Vondra wrote:
FWIW this is where the view listing dictionaries loaded into shared
memory would be helpful - you'd at least know there's a dictionary,
wasting memory.
Unfortunately, it seems that this view can't help in listing leaked
segments. I didn't find a way to list dshash entries. Currently
pg_ts_shared_dictionaries() scans the pg_ts_dict table and gets a dshash
item using dictId. In the case of leaked dictionaries we don't know their
identifiers.
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Arthur Zakirov <a.zakirov@postgrespro.ru> writes:
On Sat, Mar 24, 2018 at 04:56:36PM -0400, Tom Lane wrote:
* And that leads us to not particularly need a view telling which
dictionaries are loaded, either. It's just an implementation detail
that users don't need to worry about.
If all dictionaries will be shareable then this view could be removed.
Unfortunately I think it can't help with leaked segments, I didn't find
a way to iterate dshash entries. That's why pg_ts_shared_dictionaries()
scans pg_ts_dict table instead of scanning dshash table.
If you're scanning pg_ts_dict, what happens with dictionaries belonging
to other databases? They won't be visible in your local copy of
pg_ts_dict. Between that and the inability to find leaked segments,
I'm not seeing that this has much use-case.
(It might work to use
the combination of dictionary OID and TID of the dictionary's pg_ts_dict
tuple as the lookup key for shared dictionaries. Oh, and have you
thought about the possibility of conflicting OIDs in different DBs?
Probably the database OID has to be part of the key, as well.)
Yes unfortunately ALTER TEXT SEARCH DICTIONARY doesn't reload a
dictionary. TID can help here. I thought about using XID too when I
started to work on RELOAD command. But I'm not sure that it is a good
idea, anyway XID isn't needed in current version.
Actually, existing practice is to check both xmin and tid; see for example
where plpgsql checks if a cached function data structure still matches the
pg_proc row, pl_comp.c around line 175 in HEAD. The other PLs do it
similarly I think. I'm not sure offhand just how much that changes the
risks of a false match compared to testing only one of these fields, but
I'd recommend conforming to the way it's done elsewhere.
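A minimal standalone sketch of such an xmin-plus-TID check. The types are simplified stand-ins rather than PostgreSQL's own TransactionId and ItemPointerData, and the names CachedDictStamp and dict_stamp_matches are hypothetical; this is not the plpgsql code itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for PostgreSQL's TransactionId. */
typedef uint32_t XidStandIn;

/* Simplified stand-in for PostgreSQL's ItemPointerData. */
typedef struct
{
	uint32_t	block;		/* heap block number of the tuple */
	uint16_t	offset;		/* line pointer offset within the block */
} TidStandIn;

/* What a cache entry remembers about the catalog row it was built from. */
typedef struct
{
	XidStandIn	xmin;
	TidStandIn	tid;
} CachedDictStamp;

/*
 * Treat the cached data as current only if both the inserting transaction
 * (xmin) and the physical location (tid) of the row are unchanged.
 */
static bool
dict_stamp_matches(const CachedDictStamp *cached,
				   XidStandIn row_xmin, TidStandIn row_tid)
{
	assert(cached != NULL);

	return cached->xmin == row_xmin &&
		cached->tid.block == row_tid.block &&
		cached->tid.offset == row_tid.offset;
}
```

A mismatch in either field would then mean "rebuild the cached structure from the current row".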
regards, tom lane
On Sun, Mar 25, 2018 at 12:18:10AM -0400, Tom Lane wrote:
My thought was (a) the ROLLBACK case is ok, because the next use of
the dictionary will reload it, and (b) the reload-concurrently-with-
DROP case is annoying, because indeed it leaks, but the window is small
and it probably won't be an issue in practice. We would need to be
sure that the DSM segment goes away at postmaster restart, but given
that I think it'd be tolerable. Of course it'd be better not to have
the race, but I see no easy way to prevent it -- do you?
I'm not sure that I understood the second case correctly. Can cache
invalidation help in this case? I don't have confident knowledge of cache
invalidation. It seems to me that InvalidateTSCacheCallBack() should
release segment after commit.
But the cache isn't invalidated if a backend was terminated after
reloading a dictionary. on_shmem_exit() could help, but we need a list of
leaked dictionaries for that.
P.S. I think it isn't right to release all dictionaries' segments in
InvalidateTSCacheCallBack(). Otherwise any DROP can release all
segments. It would be better to release only the specific dictionary.
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
On Sun, Mar 25, 2018 at 02:28:29PM -0400, Tom Lane wrote:
Arthur Zakirov <a.zakirov@postgrespro.ru> writes:
If all dictionaries will be shareable then this view could be removed.
Unfortunately I think it can't help with leaked segments, I didn't find
a way to iterate dshash entries. That's why pg_ts_shared_dictionaries()
scans pg_ts_dict table instead of scanning dshash table.
If you're scanning pg_ts_dict, what happens with dictionaries belonging
to other databases? They won't be visible in your local copy of
pg_ts_dict. Between that and the inability to find leaked segments,
I'm not seeing that this has much use-case.
Indeed, scanning pg_ts_dict is the wrong way here. And
pg_ts_shared_dictionaries() is definitely broken.
Yes unfortunately ALTER TEXT SEARCH DICTIONARY doesn't reload a
dictionary. TID can help here. I thought about using XID too when I
started to work on RELOAD command. But I'm not sure that it is a good
idea, anyway XID isn't needed in current version.
Actually, existing practice is to check both xmin and tid; see for example
where plpgsql checks if a cached function data structure still matches the
pg_proc row, pl_comp.c around line 175 in HEAD. The other PLs do it
similarly I think. I'm not sure offhand just how much that changes the
risks of a false match compared to testing only one of these fields, but
I'd recommend conforming to the way it's done elsewhere.
Thank you for pointing to it! I think it shouldn't be hard to use both
xmin and tid.
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Arthur Zakirov <a.zakirov@postgrespro.ru> writes:
On Sun, Mar 25, 2018 at 12:18:10AM -0400, Tom Lane wrote:
My thought was (a) the ROLLBACK case is ok, because the next use of
the dictionary will reload it, and (b) the reload-concurrently-with-
DROP case is annoying, because indeed it leaks, but the window is small
and it probably won't be an issue in practice. We would need to be
sure that the DSM segment goes away at postmaster restart, but given
that I think it'd be tolerable. Of course it'd be better not to have
the race, but I see no easy way to prevent it -- do you?
I'm not sure that I understood the second case correctly. Can cache
invalidation help in this case? I don't have confident knowledge of cache
invalidation. It seems to me that InvalidateTSCacheCallBack() should
release segment after commit.
"Release after commit" sounds like a pretty dangerous design to me,
because a release necessarily implies some kernel calls, which could
fail. We can't afford to inject steps that might fail into post-commit
cleanup (because it's too late to recover by failing the transaction).
It'd be better to do cleanup while searching for a dictionary to use.
I assume the DSM infrastructure already has some solution for getting
rid of DSM segments when the last interested process disconnects,
so maybe you could piggyback on that somehow.
regards, tom lane
On Mon, Mar 26, 2018 at 11:27:48AM -0400, Tom Lane wrote:
Arthur Zakirov <a.zakirov@postgrespro.ru> writes:
I'm not sure that I understood the second case correctly. Can cache
invalidation help in this case? I don't have confident knowledge of cache
invalidation. It seems to me that InvalidateTSCacheCallBack() should
release segment after commit.
"Release after commit" sounds like a pretty dangerous design to me,
because a release necessarily implies some kernel calls, which could
fail. We can't afford to inject steps that might fail into post-commit
cleanup (because it's too late to recover by failing the transaction).
It'd be better to do cleanup while searching for a dictionary to use.
I assume the DSM infrastructure already has some solution for getting
rid of DSM segments when the last interested process disconnects,
so maybe you could piggyback on that somehow.
Yes, there is dsm_pin_mapping() for this. But it is necessary to keep a
segment even if there are no attached processes. From 0003:
+ /* Remain attached until end of postmaster */
+ dsm_pin_segment(seg);
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
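To make the lifetime rules concrete, here is a toy model in plain C. It does not call the real DSM API; SegmentModel and the seg_* functions are hypothetical names, and the model assumes only what is described above: a pinned segment survives with no attached backends, and an unpinned one goes away when the last mapping is dropped.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one shared segment's lifetime; not the real DSM API. */
typedef struct
{
	int			attached;	/* backends currently mapping the segment */
	bool		pinned;		/* models the effect of dsm_pin_segment() */
	bool		destroyed;	/* set once the segment has been released */
} SegmentModel;

/* Release the segment once nobody maps it and it is no longer pinned. */
static void
seg_maybe_destroy(SegmentModel *seg)
{
	if (seg->attached == 0 && !seg->pinned)
		seg->destroyed = true;
}

/* A backend maps the segment. */
static void
seg_attach(SegmentModel *seg)
{
	assert(!seg->destroyed);
	seg->attached++;
}

/* A backend drops its mapping, e.g. by disconnecting. */
static void
seg_detach(SegmentModel *seg)
{
	assert(seg->attached > 0);
	seg->attached--;
	seg_maybe_destroy(seg);
}

/* Keep the segment alive even with no attached backends. */
static void
seg_pin(SegmentModel *seg)
{
	seg->pinned = true;
}

/* Drop the pin, e.g. after the dictionary was dropped or altered. */
static void
seg_unpin(SegmentModel *seg)
{
	seg->pinned = false;
	seg_maybe_destroy(seg);
}
```

In this model, unpinning while a long-lived backend is still attached leaves the segment in place until that backend detaches.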
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Please find the attached new version of the patch.
I got rid of the 0005 and 0006 parts. The max_shared_dictionaries_size
variable, the Shareable option and the pg_ts_shared_dictionaries view
are gone.
On Sat, Mar 24, 2018 at 04:56:36PM -0400, Tom Lane wrote:
I do think it's required that changing the dictionary's options with
ALTER TEXT SEARCH DICTIONARY automatically cause a reload; but if that's
happening with this patch, I don't see where. (It might work to use
the combination of dictionary OID and TID of the dictionary's pg_ts_dict
tuple as the lookup key for shared dictionaries. Oh, and have you
thought about the possibility of conflicting OIDs in different DBs?
Probably the database OID has to be part of the key, as well.)
The database OID, the dictionary OID, TID and XMIN are used now as
lookup key.
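A rough sketch of how such a composite key could look. The struct layout, field types and the names DictKeySketch, dict_key_init and dict_key_equal are assumptions for illustration, not the patch's actual definitions. It also assumes keys are compared bytewise, which is why the struct is zeroed before the fields are filled in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical key layout; field types are simplified stand-ins. */
typedef struct
{
	uint32_t	db_oid;		/* database OID: dictionary OIDs may collide across DBs */
	uint32_t	dict_oid;	/* dictionary OID */
	uint32_t	xmin;		/* xmin of the pg_ts_dict tuple */
	uint32_t	tid_block;	/* TID: block number */
	uint16_t	tid_offset;	/* TID: offset within the block */
} DictKeySketch;

/* Zero the whole struct first so padding bytes cannot make equal keys differ. */
static void
dict_key_init(DictKeySketch *key, uint32_t db_oid, uint32_t dict_oid,
			  uint32_t xmin, uint32_t tid_block, uint16_t tid_offset)
{
	assert(key != NULL);
	memset(key, 0, sizeof(*key));
	key->db_oid = db_oid;
	key->dict_oid = dict_oid;
	key->xmin = xmin;
	key->tid_block = tid_block;
	key->tid_offset = tid_offset;
}

/* Bytewise comparison, the way a memcmp-based hash table would compare keys. */
static bool
dict_key_equal(const DictKeySketch *a, const DictKeySketch *b)
{
	return memcmp(a, b, sizeof(*a)) == 0;
}
```

With a key like this, the same dictionary OID in another database, or the same dictionary after an ALTER produced a new tuple, maps to a different entry.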
Also, the scheme for releasing the dictionary DSM during
RemoveTSDictionaryById is uncertain and full of race conditions:
the DROP might roll back later, or someone might come along and
start using the dictionary (causing a fresh DSM load) before the
DROP commits and makes the dictionary invisible to other sessions.
I don't think that either of those are necessarily fatal objections,
but there needs to be some commentary there explaining what happens.
The dictionary's DSM segment now stays alive until the postmaster terminates.
But when the dictionary is dropped or altered, the previous
(invalid) segment is unpinned. The segment itself is released once all
backends have either unpinned their mapping in lookup_ts_parser_cache() or
disconnected.
The problem arises when some process used the dictionary before it was
dropped or altered, doesn't use it afterwards, and lives for a very
long time. In this situation the mapping isn't unpinned and the segment
isn't released. The other problem is that TsearchDictEntry isn't removed
if ts_dict_shmem_release() wasn't called. It may happen after dropping
the dictionary.
BTW, I was going to complain that this patch alters the API for
dictionary template init functions without any documentation updates;
but then I realized that there isn't any documentation to update.
That pretty well sucks, but I suppose it's not the job of this patch
to improve that situation. Still, you could spend a bit more effort on
the commentary in ts_public.h in 0002, because that commentary is as
close to an API spec as we've got.
I improved the commentary in ts_public.h a little bit.
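For orientation, a standalone sketch of the argument an init function receives after 0002. The field names dict_options, dict.id, dict.xmin and dict.tid come from the patch; everything else (the *Sketch types, INVALID_OID_SKETCH, init_may_use_shared_memory) is a hypothetical stand-in. Using an invalid dictionary OID to detect the option-checking call is an assumption drawn from what verify_dictoptions() passes, not something the patch documents.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define INVALID_OID_SKETCH 0u	/* stand-in for InvalidOid */

/* Stand-in for the dictionary identity passed to the init function. */
typedef struct
{
	uint32_t	id;			/* dictionary OID, invalid during option checks */
	uint32_t	xmin;		/* xmin of the dictionary's tuple */
	uint32_t	tid_block;	/* TID of the dictionary's tuple: block */
	uint16_t	tid_offset;	/* TID of the dictionary's tuple: offset */
} DictPointerSketch;

/* Stand-in for the structure passed as the single init argument. */
typedef struct
{
	const char **dict_options;	/* stand-in for the List of DefElem */
	size_t		n_options;
	DictPointerSketch dict;
} DictInitSketch;

/*
 * Assumption: when only the options are being validated, no real dictionary
 * exists yet, so there is nothing to key a shared copy on.
 */
static bool
init_may_use_shared_memory(const DictInitSketch *init_data)
{
	assert(init_data != NULL);

	return init_data->dict.id != INVALID_OID_SKETCH;
}
```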
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Attachments:
0001-Fix-ispell-memory-handling-v11.patch (text/plain; charset=us-ascii)
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index b9fdd77e19..e071994523 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -78,6 +78,8 @@
#define tmpalloc(sz) MemoryContextAlloc(Conf->buildCxt, (sz))
#define tmpalloc0(sz) MemoryContextAllocZero(Conf->buildCxt, (sz))
+#define tmpstrdup(str) MemoryContextStrdup(Conf->buildCxt, (str))
+
/*
* Prepare for constructing an ISpell dictionary.
*
@@ -498,7 +500,7 @@ NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
strcpy(Conf->Spell[Conf->nspell]->word, word);
Conf->Spell[Conf->nspell]->p.flag = (*flag != '\0')
- ? cpstrdup(Conf, flag) : VoidString;
+ ? tmpstrdup(flag) : VoidString;
Conf->nspell++;
}
@@ -1040,7 +1042,7 @@ setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
entry->flag.i = i;
}
else
- entry->flag.s = cpstrdup(Conf, s);
+ entry->flag.s = tmpstrdup(s);
entry->flagMode = Conf->flagMode;
entry->value = val;
@@ -1536,6 +1538,9 @@ nextline:
return;
isnewformat:
+ pfree(recoded);
+ pfree(pstr);
+
if (oldformat)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
0002-Change-tmplinit-argument-v11.patch (text/plain; charset=us-ascii)
diff --git a/contrib/dict_int/dict_int.c b/contrib/dict_int/dict_int.c
index 56ede37089..8dd4959028 100644
--- a/contrib/dict_int/dict_int.c
+++ b/contrib/dict_int/dict_int.c
@@ -30,7 +30,7 @@ PG_FUNCTION_INFO_V1(dintdict_lexize);
Datum
dintdict_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictInt *d;
ListCell *l;
@@ -38,7 +38,7 @@ dintdict_init(PG_FUNCTION_ARGS)
d->maxlen = 6;
d->rejectlong = false;
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/contrib/dict_xsyn/dict_xsyn.c b/contrib/dict_xsyn/dict_xsyn.c
index a79ece240c..0b8a32d459 100644
--- a/contrib/dict_xsyn/dict_xsyn.c
+++ b/contrib/dict_xsyn/dict_xsyn.c
@@ -140,7 +140,7 @@ read_dictionary(DictSyn *d, const char *filename)
Datum
dxsyn_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSyn *d;
ListCell *l;
char *filename = NULL;
@@ -153,7 +153,7 @@ dxsyn_init(PG_FUNCTION_ARGS)
d->matchsynonyms = false;
d->keepsynonyms = true;
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/contrib/unaccent/unaccent.c b/contrib/unaccent/unaccent.c
index 247c202755..2a2fbee5fa 100644
--- a/contrib/unaccent/unaccent.c
+++ b/contrib/unaccent/unaccent.c
@@ -267,12 +267,12 @@ PG_FUNCTION_INFO_V1(unaccent_init);
Datum
unaccent_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
TrieChar *rootTrie = NULL;
bool fileloaded = false;
ListCell *l;
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/commands/tsearchcmds.c b/src/backend/commands/tsearchcmds.c
index 3a843512d1..83012b5b54 100644
--- a/src/backend/commands/tsearchcmds.c
+++ b/src/backend/commands/tsearchcmds.c
@@ -386,17 +386,24 @@ verify_dictoptions(Oid tmplId, List *dictoptions)
}
else
{
+ DictInitData init_data;
+
/*
* Copy the options just in case init method thinks it can scribble on
* them ...
*/
dictoptions = copyObject(dictoptions);
+ init_data.dict_options = dictoptions;
+ init_data.dict.id = InvalidOid;
+ init_data.dict.xmin = InvalidTransactionId;
+ ItemPointerSetInvalid(&init_data.dict.tid);
+
/*
* Call the init method and see if it complains. We don't worry about
* it leaking memory, since our command will soon be over anyway.
*/
- (void) OidFunctionCall1(initmethod, PointerGetDatum(dictoptions));
+ (void) OidFunctionCall1(initmethod, PointerGetDatum(&init_data));
}
ReleaseSysCache(tup);
diff --git a/src/backend/snowball/dict_snowball.c b/src/backend/snowball/dict_snowball.c
index 78c9f73ef0..15ebafd833 100644
--- a/src/backend/snowball/dict_snowball.c
+++ b/src/backend/snowball/dict_snowball.c
@@ -181,14 +181,14 @@ locate_stem_module(DictSnowball *d, const char *lang)
Datum
dsnowball_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSnowball *d;
bool stoploaded = false;
ListCell *l;
d = (DictSnowball *) palloc0(sizeof(DictSnowball));
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index edc6547700..39f1e6faeb 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -29,7 +29,7 @@ typedef struct
Datum
dispell_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictISpell *d;
bool affloaded = false,
dictloaded = false,
@@ -40,7 +40,7 @@ dispell_init(PG_FUNCTION_ARGS)
NIStartBuild(&(d->obj));
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_simple.c b/src/backend/tsearch/dict_simple.c
index ac6a24eba5..9605108334 100644
--- a/src/backend/tsearch/dict_simple.c
+++ b/src/backend/tsearch/dict_simple.c
@@ -29,7 +29,7 @@ typedef struct
Datum
dsimple_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSimple *d = (DictSimple *) palloc0(sizeof(DictSimple));
bool stoploaded = false,
acceptloaded = false;
@@ -37,7 +37,7 @@ dsimple_init(PG_FUNCTION_ARGS)
d->accept = true; /* default */
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_synonym.c b/src/backend/tsearch/dict_synonym.c
index c011886cb0..02989cd16b 100644
--- a/src/backend/tsearch/dict_synonym.c
+++ b/src/backend/tsearch/dict_synonym.c
@@ -91,7 +91,7 @@ compareSyn(const void *a, const void *b)
Datum
dsynonym_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSyn *d;
ListCell *l;
char *filename = NULL;
@@ -104,7 +104,7 @@ dsynonym_init(PG_FUNCTION_ARGS)
char *line = NULL;
uint16 flags = 0;
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_thesaurus.c b/src/backend/tsearch/dict_thesaurus.c
index 24364e646d..1604b5f60f 100644
--- a/src/backend/tsearch/dict_thesaurus.c
+++ b/src/backend/tsearch/dict_thesaurus.c
@@ -604,7 +604,7 @@ compileTheSubstitute(DictThesaurus *d)
Datum
thesaurus_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictThesaurus *d;
char *subdictname = NULL;
bool fileloaded = false;
@@ -612,7 +612,7 @@ thesaurus_init(PG_FUNCTION_ARGS)
d = (DictThesaurus *) palloc0(sizeof(DictThesaurus));
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/utils/cache/ts_cache.c b/src/backend/utils/cache/ts_cache.c
index 97347780d3..dea8c99c31 100644
--- a/src/backend/utils/cache/ts_cache.c
+++ b/src/backend/utils/cache/ts_cache.c
@@ -39,6 +39,7 @@
#include "catalog/pg_ts_template.h"
#include "commands/defrem.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_public.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
#include "utils/fmgroids.h"
@@ -312,11 +313,14 @@ lookup_ts_dictionary_cache(Oid dictId)
MemSet(entry, 0, sizeof(TSDictionaryCacheEntry));
entry->dictId = dictId;
entry->dictCtx = saveCtx;
+ entry->dict_xmin = HeapTupleHeaderGetRawXmin(tpdict->t_data);
+ entry->dict_tid = tpdict->t_self;
entry->lexizeOid = template->tmpllexize;
if (OidIsValid(template->tmplinit))
{
+ DictInitData init_data;
List *dictoptions;
Datum opt;
bool isnull;
@@ -336,9 +340,14 @@ lookup_ts_dictionary_cache(Oid dictId)
else
dictoptions = deserialize_deflist(opt);
+ init_data.dict_options = dictoptions;
+ init_data.dict.id = dictId;
+ init_data.dict.xmin = entry->dict_xmin;
+ init_data.dict.tid = entry->dict_tid;
+
entry->dictData =
DatumGetPointer(OidFunctionCall1(template->tmplinit,
- PointerGetDatum(dictoptions)));
+ PointerGetDatum(&init_data)));
MemoryContextSwitchTo(oldcontext);
}
diff --git a/src/include/tsearch/ts_cache.h b/src/include/tsearch/ts_cache.h
index 410f1d54af..f7d80a0853 100644
--- a/src/include/tsearch/ts_cache.h
+++ b/src/include/tsearch/ts_cache.h
@@ -54,6 +54,9 @@ typedef struct TSDictionaryCacheEntry
Oid dictId;
bool isvalid;
+ TransactionId dict_xmin; /* XID of the dictionary's tuple */
+ ItemPointerData dict_tid; /* TID of the dictionary's tuple */
+
/* most frequent fmgr call */
Oid lexizeOid;
FmgrInfo lexize;
diff --git a/src/include/tsearch/ts_public.h b/src/include/tsearch/ts_public.h
index 0b7a5aa68e..cb3a152d45 100644
--- a/src/include/tsearch/ts_public.h
+++ b/src/include/tsearch/ts_public.h
@@ -13,6 +13,8 @@
#ifndef _PG_TS_PUBLIC_H_
#define _PG_TS_PUBLIC_H_
+#include "nodes/pg_list.h"
+#include "storage/itemptr.h"
#include "tsearch/ts_type.h"
/*
@@ -81,10 +83,68 @@ extern void readstoplist(const char *fname, StopList *s,
extern bool searchstoplist(StopList *s, char *key);
/*
- * Interface with dictionaries
+ * API for text search dictionaries.
+ *
+ * API functions to handle a text search dictionary are defined by a text search
+ * template. Currently an existing template cannot be altered in order to
+ * define other functions. The API consists of the following functions:
+ * - init function - optional function which initializes internal structures of
+ * the dictionary. It accepts DictInitData structure as an argument and must
+ * return a custom palloc'd structure which stores content of the processed
+ * dictionary and is used in lexize function.
+ * - lexize function - normalizes a single word (token) using specific
+ * dictionary. It must return a palloc'd array of TSLexeme the last entry of
+ * which is the terminating entry and accepts the following arguments:
+ * - dictData - pointer to a custom structure returned by init function or
+ * NULL if init function wasn't defined by the template.
+ * - token - string which represents a token to normalize, isn't
+ * null-terminated.
+ * - length - length of token.
+ * - dictState - pointer to a DictSubState structure which stores the current
+ * state of processing a set of tokens and allows phrases to be normalized.
+ */
+
+/*
+ * A preprocessed dictionary can be stored in shared memory using DSM. The
+ * init function decides whether the dictionary wants it. A DSM segment is
+ * released if the dictionary was altered or dropped. There is still one case
+ * in which we cannot prevent a segment from leaking: if the dictionary was
+ * dropped after some backend had used it, that backend will hold its DSM
+ * segment until it disconnects or calls lookup_ts_dictionary_cache(), where
+ * the invalid segment is unpinned.
+ *
+ * DictPointerData is a structure to search a dictionary's DSM segment. We
+ * need xmin and tid to be sure the content in the DSM segment is still valid.
+ */
+typedef struct
+{
+ Oid id; /* OID of dictionary which is processed */
+ TransactionId xmin; /* XID of the dictionary's tuple */
+ ItemPointerData tid; /* TID of the dictionary's tuple */
+} DictPointerData;
+
+/*
+ * API structure for a dictionary initialization. It is passed as an argument
+ * to a template's init function.
*/
+typedef struct
+{
+ /*
+ * A dictionary option list for a template's init method. Should go first
+ * for backward compatibility.
+ */
+ List *dict_options;
+ /*
+ * Dictionary information used to allocate, search and release its DSM
+ * segment.
+ */
+ DictPointerData dict;
+} DictInitData;
-/* return struct for any lexize function */
+/*
+ * Return struct for any lexize function. They are combined into an array, the
+ * last entry is the terminating entry.
+ */
typedef struct
{
/*----------
@@ -108,7 +168,8 @@ typedef struct
uint16 flags; /* See flag bits below */
- char *lexeme; /* C string */
+ char *lexeme; /* C string or NULL if it is a terminating
+ * entry */
} TSLexeme;
/* Flag bits that can appear in TSLexeme.flags */
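
The updated comments state that a lexize function returns an array whose last entry is a terminating entry with a NULL lexeme. Here is a small standalone sketch of walking and compacting such an array in place. `DemoLexeme` is a hypothetical, cut-down stand-in for `TSLexeme` (only the lexeme pointer is modelled), and both helper names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for TSLexeme: only the lexeme pointer is kept. */
typedef struct
{
    const char *lexeme;     /* C string, or NULL in the terminating entry */
} DemoLexeme;

/* Count the entries that precede the terminating entry. */
static size_t
demo_lexeme_count(const DemoLexeme *arr)
{
    size_t n = 0;

    while (arr[n].lexeme != NULL)
        n++;
    return n;
}

/*
 * Drop every entry equal to 'stopword', shifting the survivors down and
 * writing a fresh terminating entry after the last survivor.
 */
static void
demo_drop_stopword(DemoLexeme *arr, const char *stopword)
{
    DemoLexeme *out = arr;
    DemoLexeme *in;

    for (in = arr; in->lexeme != NULL; in++)
    {
        if (strcmp(in->lexeme, stopword) == 0)
            continue;       /* skip the stop word */
        if (out != in)
            *out = *in;     /* compact towards the front */
        out++;
    }
    out->lexeme = NULL;     /* re-terminate the array */
}
```

Because the terminator is rewritten at the end, callers never need a separate length argument, which is why the convention has to be honoured by every lexize implementation.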
0003-Retreive-shared-location-for-dict-v11.patch
diff --git a/src/backend/commands/tsearchcmds.c b/src/backend/commands/tsearchcmds.c
index 83012b5b54..78ed0cee85 100644
--- a/src/backend/commands/tsearchcmds.c
+++ b/src/backend/commands/tsearchcmds.c
@@ -39,6 +39,7 @@
#include "nodes/makefuncs.h"
#include "parser/parse_func.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_shared.h"
#include "tsearch/ts_utils.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
@@ -509,6 +510,7 @@ RemoveTSDictionaryById(Oid dictId)
{
Relation relation;
HeapTuple tup;
+ DictPointerData dict;
relation = heap_open(TSDictionaryRelationId, RowExclusiveLock);
@@ -520,6 +522,17 @@ RemoveTSDictionaryById(Oid dictId)
CatalogTupleDelete(relation, &tup->t_self);
+ /*
+ * We need to release the dictionary's DSM segment. The segment may still
+ * leak: if some backend used the dictionary before it was dropped, that
+ * backend will hold its DSM segment until it disconnects or calls
+ * lookup_ts_dictionary_cache().
+ */
+ dict.id = dictId;
+ dict.xmin = HeapTupleHeaderGetRawXmin(tup->t_data);
+ dict.tid = tup->t_self;
+ ts_dict_shmem_release(&dict, true);
+
ReleaseSysCache(tup);
heap_close(relation, RowExclusiveLock);
@@ -543,6 +556,7 @@ AlterTSDictionary(AlterTSDictionaryStmt *stmt)
bool repl_null[Natts_pg_ts_dict];
bool repl_repl[Natts_pg_ts_dict];
ObjectAddress address;
+ DictPointerData dict;
dictId = get_ts_dict_oid(stmt->dictname, false);
@@ -629,6 +643,17 @@ AlterTSDictionary(AlterTSDictionaryStmt *stmt)
ObjectAddressSet(address, TSDictionaryRelationId, dictId);
+ /*
+ * We need to release the dictionary's DSM segment. The segment isn't valid
+ * anymore. The segment may still leak: if some backend used the dictionary
+ * before it was altered, that backend will hold its DSM segment until it
+ * disconnects or calls lookup_ts_dictionary_cache().
+ */
+ dict.id = dictId;
+ dict.xmin = HeapTupleHeaderGetRawXmin(tup->t_data);
+ dict.tid = tup->t_self;
+ ts_dict_shmem_release(&dict, true);
+
/*
* NOTE: because we only support altering the options, not the template,
* there is no need to update dependencies. This might have to change if
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 0c86a581c0..c7dce8cac5 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -44,6 +44,7 @@
#include "storage/procsignal.h"
#include "storage/sinvaladt.h"
#include "storage/spin.h"
+#include "tsearch/ts_shared.h"
#include "utils/backend_random.h"
#include "utils/snapmgr.h"
@@ -150,6 +151,7 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
size = add_size(size, SyncScanShmemSize());
size = add_size(size, AsyncShmemSize());
size = add_size(size, BackendRandomShmemSize());
+ size = add_size(size, TsearchShmemSize());
#ifdef EXEC_BACKEND
size = add_size(size, ShmemBackendArraySize());
#endif
@@ -271,6 +273,11 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
AsyncShmemInit();
BackendRandomShmemInit();
+ /*
+ * Set up shared memory for tsearch
+ */
+ TsearchShmemInit();
+
#ifdef EXEC_BACKEND
/*
diff --git a/src/backend/tsearch/Makefile b/src/backend/tsearch/Makefile
index 227468ae9e..860cd196e9 100644
--- a/src/backend/tsearch/Makefile
+++ b/src/backend/tsearch/Makefile
@@ -26,7 +26,7 @@ DICTFILES_PATH=$(addprefix dicts/,$(DICTFILES))
OBJS = ts_locale.o ts_parse.o wparser.o wparser_def.o dict.o \
dict_simple.o dict_synonym.o dict_thesaurus.o \
dict_ispell.o regis.o spell.o \
- to_tsany.o ts_selfuncs.o ts_typanalyze.o ts_utils.o
+ to_tsany.o ts_selfuncs.o ts_shared.o ts_typanalyze.o ts_utils.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/tsearch/ts_shared.c b/src/backend/tsearch/ts_shared.c
new file mode 100644
index 0000000000..f27de76cfe
--- /dev/null
+++ b/src/backend/tsearch/ts_shared.c
@@ -0,0 +1,351 @@
+/*-------------------------------------------------------------------------
+ *
+ * ts_shared.c
+ * tsearch shared memory management
+ *
+ * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/tsearch/ts_shared.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "lib/dshash.h"
+#include "miscadmin.h"
+#include "storage/lwlock.h"
+#include "storage/shmem.h"
+#include "tsearch/ts_shared.h"
+#include "utils/hashutils.h"
+#include "utils/memutils.h"
+
+
+/*
+ * Hash table entries key.
+ */
+typedef struct
+{
+ Oid db_id;
+ DictPointerData dict;
+} TsearchDictKey;
+
+/*
+ * Hash table entries representing shared dictionaries.
+ */
+typedef struct
+{
+ TsearchDictKey key;
+ dsm_handle dict_dsm;
+
+ /* How many backends have DSM mapping */
+ uint32 refcnt;
+ bool segment_pinned;
+} TsearchDictEntry;
+
+static dshash_table *dict_table = NULL;
+
+/*
+ * Information about the main shmem segment, used to coordinate
+ * access to the hash table and dictionaries.
+ */
+typedef struct
+{
+ dsa_handle area;
+ dshash_table_handle dict_table_handle;
+
+ LWLock lock;
+} TsearchCtlData;
+
+static TsearchCtlData *tsearch_ctl;
+
+static void init_dict_table(void);
+
+/* Parameters for dict_table */
+static const dshash_parameters dict_table_params = {
+ sizeof(TsearchDictKey),
+ sizeof(TsearchDictEntry),
+ dshash_memcmp,
+ dshash_memhash,
+ LWTRANCHE_TSEARCH_TABLE
+};
+
+/*
+ * Build the dictionary using allocate_cb callback.
+ *
+ * First try to find the dictionary in the shared hash table. If it was already
+ * built by someone else, just return its location in DSM.
+ *
+ * init_data: an argument used within a template's init method.
+ * allocate_cb: function to build the dictionary, if it wasn't found in DSM.
+ *
+ * Returns address in the dynamic shared memory segment or in backend memory.
+ */
+void *
+ts_dict_shmem_location(DictInitData *init_data,
+ ts_dict_build_callback allocate_cb)
+{
+ TsearchDictKey key;
+ TsearchDictEntry *entry;
+ bool found;
+ dsm_segment *seg;
+ void *dict,
+ *dict_location;
+ Size dict_size;
+
+ init_dict_table();
+
+ /*
+ * Build the dictionary in backend's memory if dictid is invalid (it may
+ * happen if the dictionary's init method was called within
+ * verify_dictoptions()).
+ */
+ if (!OidIsValid(init_data->dict.id))
+ {
+ dict = allocate_cb(init_data->dict_options, &dict_size);
+
+ return dict;
+ }
+
+ /* Set up key for hashtable search */
+ key.db_id = MyDatabaseId;
+ key.dict = init_data->dict;
+
+ /* Try to find an entry in the hash table */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, &key, false);
+
+ if (entry)
+ {
+ seg = dsm_find_mapping(entry->dict_dsm);
+ if (!seg)
+ {
+ seg = dsm_attach(entry->dict_dsm);
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+ }
+
+ entry->refcnt++;
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+ }
+
+ /* Dictionary hasn't been loaded into memory yet */
+ entry = (TsearchDictEntry *) dshash_find_or_insert(dict_table, &key,
+ &found);
+
+ if (found)
+ {
+ /*
+ * Someone concurrently inserted a dictionary entry since the first time
+ * we checked.
+ */
+ seg = dsm_attach(entry->dict_dsm);
+
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+
+ entry->refcnt++;
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+ }
+
+ /* Build the dictionary */
+ dict = allocate_cb(init_data->dict_options, &dict_size);
+
+ /* Finally, allocate a DSM segment for the compiled dictionary */
+ seg = dsm_create(dict_size, 0);
+ dict_location = dsm_segment_address(seg);
+ memcpy(dict_location, dict, dict_size);
+
+ pfree(dict);
+
+ entry->key = key;
+ entry->dict_dsm = dsm_segment_handle(seg);
+ entry->refcnt++;
+ entry->segment_pinned = true;
+
+ /* Remain attached until end of postmaster */
+ dsm_pin_segment(seg);
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+}
+
+/*
+ * Release memory occupied by the dictionary. The function unpins the DSM
+ * mapping. If nobody else has a mapping to this DSM segment, or the dictionary
+ * was dropped or altered, then it also unpins the DSM segment.
+ *
+ * dict: key to search the dictionary's DSM segment.
+ * unpin_segment: true if we need to unpin the segment because the dictionary
+ * was dropped or altered.
+ */
+void
+ts_dict_shmem_release(DictPointerData *dict, bool unpin_segment)
+{
+ TsearchDictKey key;
+ TsearchDictEntry *entry;
+
+ /*
+ * If we didn't attach to a hash table then do nothing.
+ */
+ if (!dict_table && !unpin_segment)
+ return;
+ /*
+ * But if we need to unpin the DSM segment to get rid of the segment when
+ * the last interested process disconnects, we need the hash table to find
+ * the dictionary's entry.
+ */
+ else if (unpin_segment)
+ init_dict_table();
+
+ /* Set up key for hashtable search */
+ key.db_id = MyDatabaseId;
+ key.dict = *dict;
+
+ /* Try to find an entry in the hash table */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, &key, true);
+
+ if (entry)
+ {
+ dsm_segment *seg;
+
+ seg = dsm_find_mapping(entry->dict_dsm);
+
+ if (seg)
+ {
+ dsm_unpin_mapping(seg);
+ dsm_detach(seg);
+
+ entry->refcnt--;
+ }
+
+ if ((entry->refcnt == 0 || unpin_segment) && entry->segment_pinned)
+ {
+ dsm_unpin_segment(entry->dict_dsm);
+ /*
+ * We don't want to unpin the segment twice. That could happen if we
+ * unpinned the segment after dropping or altering the dictionary and
+ * later all interested processes unpinned their mappings (refcnt is zero).
+ */
+ entry->segment_pinned = false;
+ }
+
+ if (entry->refcnt == 0)
+ {
+ /*
+ * Delete the entry iff there is no process which pinned mapping to
+ * the DSM segment.
+ */
+ dshash_delete_entry(dict_table, entry);
+ }
+ else
+ dshash_release_lock(dict_table, entry);
+ }
+}
+
+/*
+ * Allocate and initialize tsearch-related shared memory.
+ */
+void
+TsearchShmemInit(void)
+{
+ bool found;
+
+ tsearch_ctl = (TsearchCtlData *)
+ ShmemInitStruct("Full Text Search Ctl", sizeof(TsearchCtlData), &found);
+
+ if (!found)
+ {
+ LWLockRegisterTranche(LWTRANCHE_TSEARCH_DSA, "tsearch_dsa");
+ LWLockRegisterTranche(LWTRANCHE_TSEARCH_TABLE, "tsearch_table");
+
+ LWLockInitialize(&tsearch_ctl->lock, LWTRANCHE_TSEARCH_DSA);
+
+ tsearch_ctl->area = DSM_HANDLE_INVALID;
+ tsearch_ctl->dict_table_handle = InvalidDsaPointer;
+ }
+}
+
+/*
+ * Report shared memory space needed by TsearchShmemInit.
+ */
+Size
+TsearchShmemSize(void)
+{
+ Size size = 0;
+
+ /* size of service structure */
+ size = add_size(size, MAXALIGN(sizeof(TsearchCtlData)));
+
+ return size;
+}
+
+/*
+ * Initialize hash table located in DSM.
+ *
+ * The hash table should be created and initialized if it doesn't exist yet.
+ */
+static void
+init_dict_table(void)
+{
+ MemoryContext old_context;
+ dsa_area *dsa;
+
+ /* Exit if hash table was initialized already */
+ if (dict_table)
+ return;
+
+ old_context = MemoryContextSwitchTo(TopMemoryContext);
+
+recheck_table:
+ LWLockAcquire(&tsearch_ctl->lock, LW_SHARED);
+
+ /* Hash table has already been created by someone */
+ if (DsaPointerIsValid(tsearch_ctl->dict_table_handle))
+ {
+ Assert(tsearch_ctl->area != DSM_HANDLE_INVALID);
+
+ dsa = dsa_attach(tsearch_ctl->area);
+
+ dict_table = dshash_attach(dsa,
+ &dict_table_params,
+ tsearch_ctl->dict_table_handle,
+ NULL);
+ }
+ else
+ {
+ /* Try to get exclusive lock */
+ LWLockRelease(&tsearch_ctl->lock);
+ if (!LWLockAcquireOrWait(&tsearch_ctl->lock, LW_EXCLUSIVE))
+ {
+ /*
+ * The lock was released by another backend, which has concurrently
+ * created the hash table already.
+ */
+ goto recheck_table;
+ }
+
+ dsa = dsa_create(LWTRANCHE_TSEARCH_DSA);
+ tsearch_ctl->area = dsa_get_handle(dsa);
+
+ dict_table = dshash_create(dsa, &dict_table_params, NULL);
+ tsearch_ctl->dict_table_handle = dshash_get_hash_table_handle(dict_table);
+
+ /* Remain attached until end of postmaster */
+ dsa_pin(dsa);
+ }
+
+ LWLockRelease(&tsearch_ctl->lock);
+
+ /* Remain attached until end of session */
+ dsa_pin_mapping(dsa);
+
+ MemoryContextSwitchTo(old_context);
+}
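
The release logic in ts_dict_shmem_release() combines a reference count with a "segment pinned" flag, so that the segment is unpinned exactly once, either when the dictionary is dropped or altered or when the last mapping goes away. The standalone model below traces those transitions without any DSM calls. `DemoEntry`, `DemoResult` and `demo_release()` are hypothetical names, and the `refcnt > 0` guard is an addition of this sketch that the patch does not have.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a shared hash table entry. */
typedef struct
{
    unsigned refcnt;          /* backends holding a mapping */
    bool     segment_pinned;  /* is the segment itself still pinned? */
} DemoEntry;

/* What a single release call did, for inspection by the caller. */
typedef struct
{
    bool unpinned_now;        /* the segment pin was dropped by this call */
    bool entry_deleted;       /* the hash entry would be deleted */
} DemoResult;

/*
 * have_mapping: the calling backend holds a mapping to the segment.
 * unpin_segment: the dictionary was dropped or altered.
 */
static DemoResult
demo_release(DemoEntry *e, bool have_mapping, bool unpin_segment)
{
    DemoResult r = {false, false};

    /* Give up this backend's mapping, if it has one. */
    if (have_mapping && e->refcnt > 0)
        e->refcnt--;

    /* Unpin the segment at most once. */
    if ((e->refcnt == 0 || unpin_segment) && e->segment_pinned)
    {
        e->segment_pinned = false;
        r.unpinned_now = true;
    }

    /* The entry goes away only when nobody maps the segment anymore. */
    if (e->refcnt == 0)
        r.entry_deleted = true;

    return r;
}
```

With two attached backends, a DROP issued by one of them unpins the segment immediately but keeps the entry. The entry is deleted only when the second backend releases its mapping, and that second call does not unpin again.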
diff --git a/src/backend/utils/cache/ts_cache.c b/src/backend/utils/cache/ts_cache.c
index dea8c99c31..4b1ea1c343 100644
--- a/src/backend/utils/cache/ts_cache.c
+++ b/src/backend/utils/cache/ts_cache.c
@@ -40,6 +40,7 @@
#include "commands/defrem.h"
#include "tsearch/ts_cache.h"
#include "tsearch/ts_public.h"
+#include "tsearch/ts_shared.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
#include "utils/fmgroids.h"
@@ -302,6 +303,17 @@ lookup_ts_dictionary_cache(Oid dictId)
}
else
{
+ DictPointerData dict_ptr;
+
+ /*
+ * It is possible that we've pinned a DSM segment that isn't valid
+ * anymore; we need to unpin it to avoid leaking memory.
+ */
+ dict_ptr.id = entry->dictId;
+ dict_ptr.xmin = entry->dict_xmin;
+ dict_ptr.tid = entry->dict_tid;
+ ts_dict_shmem_release(&dict_ptr, false);
+
/* Clear the existing entry's private context */
saveCtx = entry->dictCtx;
/* Don't let context's ident pointer dangle while we reset it */
diff --git a/src/include/storage/lwlock.h b/src/include/storage/lwlock.h
index c21bfe2f66..16b0858eda 100644
--- a/src/include/storage/lwlock.h
+++ b/src/include/storage/lwlock.h
@@ -219,6 +219,8 @@ typedef enum BuiltinTrancheIds
LWTRANCHE_SHARED_TUPLESTORE,
LWTRANCHE_TBM,
LWTRANCHE_PARALLEL_APPEND,
+ LWTRANCHE_TSEARCH_DSA,
+ LWTRANCHE_TSEARCH_TABLE,
LWTRANCHE_FIRST_USER_DEFINED
} BuiltinTrancheIds;
diff --git a/src/include/tsearch/ts_shared.h b/src/include/tsearch/ts_shared.h
new file mode 100644
index 0000000000..b6d00bdc9e
--- /dev/null
+++ b/src/include/tsearch/ts_shared.h
@@ -0,0 +1,26 @@
+/*-------------------------------------------------------------------------
+ *
+ * ts_shared.h
+ * tsearch shared memory management
+ *
+ * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
+ *
+ * src/include/tsearch/ts_shared.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef TS_SHARED_H
+#define TS_SHARED_H
+
+#include "tsearch/ts_public.h"
+
+typedef void *(*ts_dict_build_callback) (List *dictoptions, Size *size);
+
+extern void *ts_dict_shmem_location(DictInitData *init_data,
+ ts_dict_build_callback allocate_cb);
+extern void ts_dict_shmem_release(DictPointerData *dict, bool unpin_segment);
+
+extern void TsearchShmemInit(void);
+extern Size TsearchShmemSize(void);
+
+#endif /* TS_SHARED_H */
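
The header declares a build callback that returns a private buffer together with its size, which ts_dict_shmem_location() then copies into a DSM segment and frees. Here is a rough standalone sketch of that contract. `malloc` stands in for palloc, a caller-supplied buffer stands in for the DSM segment, and every `demo_` name (including the string-based "options") is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical analogue of ts_dict_build_callback. */
typedef void *(*demo_build_callback) (const char *options, size_t *size);

/* Build a "dictionary" in private memory and report its size. */
static void *
demo_build(const char *options, size_t *size)
{
    size_t len = strlen(options) + 1;
    char  *buf = malloc(len);

    if (buf == NULL)
        return NULL;
    memcpy(buf, options, len);
    *size = len;
    return buf;
}

/*
 * Run the callback, copy the result into the "shared" area and release
 * the private copy. Returns the number of bytes published, or 0 if the
 * build failed or the result does not fit.
 */
static size_t
demo_publish(demo_build_callback cb, const char *options,
             void *shared, size_t shared_cap)
{
    size_t size = 0;
    void  *priv = cb(options, &size);

    if (priv == NULL)
        return 0;
    if (size > shared_cap)
    {
        free(priv);
        return 0;
    }
    memcpy(shared, priv, size);
    free(priv);
    return size;
}
```

The plain memcpy is only valid because the built buffer is self-contained. Any pointer stored inside it would dangle once the private copy is freed.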
0004-Store-ispell-in-shared-location-v11.patch
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index 610b7bf033..c6b2e6cf7e 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -3030,6 +3030,21 @@ CREATE TEXT SEARCH DICTIONARY english_stem (
</sect2>
+ <sect2 id="textsearch-shared-dictionaries">
+ <title>Dictionaries in Shared Memory</title>
+
+ <para>
+ Some dictionaries, especially <application>Ispell</application>, consume
+ a significant amount of memory, in some cases tens of megabytes. Most of
+ them store the data in text files, and building the in-memory structure
+ is both CPU-intensive and time-consuming. Instead of doing this in each
+ backend when it needs a dictionary for the first time, the compiled
+ dictionary may be stored in shared memory and reused by other backends.
+ Currently only <application>Ispell</application> supports loading into
+ shared memory.
+ </para>
+ </sect2>
+
</sect1>
<sect1 id="textsearch-configuration">
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index 39f1e6faeb..ced52d2790 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -5,6 +5,15 @@
*
* Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
*
+ * By default all Ispell dictionaries are stored in DSM. But if the amount
+ * of memory exceeds max_shared_dictionaries_size, then the dictionary will be
+ * allocated in private backend memory (in dictCtx context).
+ *
+ * All necessary data are built within dispell_build() function. But
+ * structures for regular expressions are compiled on first demand and
+ * stored using AffixReg array. It is because regex_t and Regis cannot be
+ * stored in shared memory easily.
+ *
*
* IDENTIFICATION
* src/backend/tsearch/dict_ispell.c
@@ -14,8 +23,10 @@
#include "postgres.h"
#include "commands/defrem.h"
+#include "storage/dsm.h"
#include "tsearch/dicts/spell.h"
#include "tsearch/ts_locale.h"
+#include "tsearch/ts_shared.h"
#include "tsearch/ts_utils.h"
#include "utils/builtins.h"
@@ -26,54 +37,126 @@ typedef struct
IspellDict obj;
} DictISpell;
+static void parse_dictoptions(List *dictoptions,
+ char **dictfile, char **afffile, char **stopfile);
+static void *dispell_build(List *dictoptions, Size *size);
+
Datum
dispell_init(PG_FUNCTION_ARGS)
{
DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictISpell *d;
- bool affloaded = false,
- dictloaded = false,
- stoploaded = false;
- ListCell *l;
+ void *dict_location;
+ char *stopfile;
d = (DictISpell *) palloc0(sizeof(DictISpell));
- NIStartBuild(&(d->obj));
+ parse_dictoptions(init_data->dict_options, NULL, NULL, &stopfile);
+
+ if (stopfile)
+ readstoplist(stopfile, &(d->stoplist), lowerstr);
+
+ dict_location = ts_dict_shmem_location(init_data, dispell_build);
+ Assert(dict_location);
+
+ d->obj.dict = (IspellDictData *) dict_location;
+ d->obj.reg = (AffixReg *) palloc0(d->obj.dict->nAffix *
+ sizeof(AffixReg));
+ /* Current memory context is dictionary's private memory context */
+ d->obj.dictCtx = CurrentMemoryContext;
+
+ PG_RETURN_POINTER(d);
+}
+
+Datum
+dispell_lexize(PG_FUNCTION_ARGS)
+{
+ DictISpell *d = (DictISpell *) PG_GETARG_POINTER(0);
+ char *in = (char *) PG_GETARG_POINTER(1);
+ int32 len = PG_GETARG_INT32(2);
+ char *txt;
+ TSLexeme *res;
+ TSLexeme *ptr,
+ *cptr;
+
+ if (len <= 0)
+ PG_RETURN_POINTER(NULL);
+
+ txt = lowerstr_with_len(in, len);
+ res = NINormalizeWord(&(d->obj), txt);
+
+ if (res == NULL)
+ PG_RETURN_POINTER(NULL);
+
+ cptr = res;
+ for (ptr = cptr; ptr->lexeme; ptr++)
+ {
+ if (searchstoplist(&(d->stoplist), ptr->lexeme))
+ {
+ pfree(ptr->lexeme);
+ ptr->lexeme = NULL;
+ }
+ else
+ {
+ if (cptr != ptr)
+ memcpy(cptr, ptr, sizeof(TSLexeme));
+ cptr++;
+ }
+ }
+ cptr->lexeme = NULL;
+
+ PG_RETURN_POINTER(res);
+}
+
+static void
+parse_dictoptions(List *dictoptions, char **dictfile, char **afffile,
+ char **stopfile)
+{
+ ListCell *l;
- foreach(l, init_data->dict_options)
+ if (dictfile)
+ *dictfile = NULL;
+ if (afffile)
+ *afffile = NULL;
+ if (stopfile)
+ *stopfile = NULL;
+
+ foreach(l, dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
if (strcmp(defel->defname, "dictfile") == 0)
{
- if (dictloaded)
+ if (!dictfile)
+ continue;
+
+ if (*dictfile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple DictFile parameters")));
- NIImportDictionary(&(d->obj),
- get_tsearch_config_filename(defGetString(defel),
- "dict"));
- dictloaded = true;
+ *dictfile = get_tsearch_config_filename(defGetString(defel), "dict");
}
else if (strcmp(defel->defname, "afffile") == 0)
{
- if (affloaded)
+ if (!afffile)
+ continue;
+
+ if (*afffile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple AffFile parameters")));
- NIImportAffixes(&(d->obj),
- get_tsearch_config_filename(defGetString(defel),
- "affix"));
- affloaded = true;
+ *afffile = get_tsearch_config_filename(defGetString(defel), "affix");
}
else if (strcmp(defel->defname, "stopwords") == 0)
{
- if (stoploaded)
+ if (!stopfile)
+ continue;
+
+ if (*stopfile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple StopWords parameters")));
- readstoplist(defGetString(defel), &(d->stoplist), lowerstr);
- stoploaded = true;
+ *stopfile = defGetString(defel);
}
else
{
@@ -83,66 +166,52 @@ dispell_init(PG_FUNCTION_ARGS)
defel->defname)));
}
}
+}
- if (affloaded && dictloaded)
- {
- NISortDictionary(&(d->obj));
- NISortAffixes(&(d->obj));
- }
- else if (!affloaded)
+/*
+ * Build the dictionary.
+ *
+ * Result is palloc'ed.
+ */
+static void *
+dispell_build(List *dictoptions, Size *size)
+{
+ IspellDictBuild build;
+ char *dictfile,
+ *afffile;
+
+ parse_dictoptions(dictoptions, &dictfile, &afffile, NULL);
+
+ if (!afffile)
{
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("missing AffFile parameter")));
}
- else
+ else if (!dictfile)
{
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("missing DictFile parameter")));
}
- NIFinishBuild(&(d->obj));
+ MemSet(&build, 0, sizeof(build));
+ NIStartBuild(&build);
- PG_RETURN_POINTER(d);
-}
+ /* Read files */
+ NIImportDictionary(&build, dictfile);
+ NIImportAffixes(&build, afffile);
-Datum
-dispell_lexize(PG_FUNCTION_ARGS)
-{
- DictISpell *d = (DictISpell *) PG_GETARG_POINTER(0);
- char *in = (char *) PG_GETARG_POINTER(1);
- int32 len = PG_GETARG_INT32(2);
- char *txt;
- TSLexeme *res;
- TSLexeme *ptr,
- *cptr;
+ /* Build persistent data to be used by backends */
+ NISortDictionary(&build);
+ NISortAffixes(&build);
- if (len <= 0)
- PG_RETURN_POINTER(NULL);
+ NICopyData(&build);
- txt = lowerstr_with_len(in, len);
- res = NINormalizeWord(&(d->obj), txt);
-
- if (res == NULL)
- PG_RETURN_POINTER(NULL);
+ /* Release temporary data */
+ NIFinishBuild(&build);
- cptr = res;
- for (ptr = cptr; ptr->lexeme; ptr++)
- {
- if (searchstoplist(&(d->stoplist), ptr->lexeme))
- {
- pfree(ptr->lexeme);
- ptr->lexeme = NULL;
- }
- else
- {
- if (cptr != ptr)
- memcpy(cptr, ptr, sizeof(TSLexeme));
- cptr++;
- }
- }
- cptr->lexeme = NULL;
-
- PG_RETURN_POINTER(res);
+ /* Return the buffer and its size */
+ *size = build.dict_size;
+ return build.dict;
}
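
dispell_build() hands back one flat buffer and its size, so everything inside it has to be addressed by offsets rather than pointers. The following standalone sketch shows that kind of layout: a header, an array of offsets, then the string bytes, all in one allocation that survives a plain memcpy. `DemoDict` and the `demo_` functions are hypothetical and much simpler than IspellDictData.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical flat dictionary: header, offsets array, string bytes. */
typedef struct
{
    uint32_t nwords;
    uint32_t data_start;    /* offset from data[] to the string area */
    char     data[];        /* uint32 offsets, then NUL-terminated words */
} DemoDict;

/* Build the flat buffer; *size receives its total length in bytes. */
static DemoDict *
demo_build_flat(const char *const *words, uint32_t n, size_t *size)
{
    size_t    strbytes = 0;
    size_t    total;
    uint32_t  off = 0;
    uint32_t  i;
    uint32_t *offsets;
    DemoDict *d;

    for (i = 0; i < n; i++)
        strbytes += strlen(words[i]) + 1;

    total = offsetof(DemoDict, data) + n * sizeof(uint32_t) + strbytes;
    d = calloc(1, total);
    if (d == NULL)
        return NULL;

    d->nwords = n;
    d->data_start = (uint32_t) (n * sizeof(uint32_t));
    offsets = (uint32_t *) d->data;

    for (i = 0; i < n; i++)
    {
        size_t len = strlen(words[i]) + 1;

        offsets[i] = off;   /* relative to the string area */
        memcpy(d->data + d->data_start + off, words[i], len);
        off += (uint32_t) len;
    }

    *size = total;
    return d;
}

/* Look up word i using offsets only; valid in any copy of the buffer. */
static const char *
demo_word(const DemoDict *d, uint32_t i)
{
    const uint32_t *offsets = (const uint32_t *) d->data;

    return d->data + d->data_start + offsets[i];
}
```

Since `demo_word()` derives every address from the buffer's own base, the same bytes can be mapped at different addresses in different processes and still be read correctly.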
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index e071994523..1c560ef56a 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -23,33 +23,35 @@
* Compilation of a dictionary
* ---------------------------
*
- * A compiled dictionary is stored in the IspellDict structure. Compilation of
- * a dictionary is divided into the several steps:
+ * A compiled dictionary is stored in the following structures:
+ * - IspellDictBuild - stores temporary data and IspellDictData
+ * - IspellDictData - stores permanent data used within NINormalizeWord()
+ * Compilation of the dictionary is divided into several steps:
* - NIImportDictionary() - stores each word of a .dict file in the
* temporary Spell field.
- * - NIImportAffixes() - stores affix rules of an .affix file in the
- * Affix field (not temporary) if an .affix file has the Ispell format.
+ * - NIImportAffixes() - stores affix rules of an .affix file in the temporary
+ * Affix field if an .affix file has the Ispell format.
* -> NIImportOOAffixes() - stores affix rules if an .affix file has the
* Hunspell format. The AffixData field is initialized if AF parameter
* is defined.
* - NISortDictionary() - builds a prefix tree (Trie) from the words list
- * and stores it in the Dictionary field. The words list is got from the
+ * and stores it in the DictNodes field. The words list is got from the
* Spell field. The AffixData field is initialized if AF parameter is not
* defined.
* - NISortAffixes():
* - builds a list of compound affixes from the affix list and stores it
* in the CompoundAffix.
* - builds prefix trees (Trie) from the affix list for prefixes and suffixes
- * and stores them in Suffix and Prefix fields.
+ * and stores them in SuffixNodes and PrefixNodes fields.
* The affix list is got from the Affix field.
+ * Persistent data of the dictionary is copied within NICopyData().
*
* Memory management
* -----------------
*
- * The IspellDict structure has the Spell field which is used only in compile
- * time. The Spell field stores a words list. It can take a lot of memory.
- * Therefore when a dictionary is compiled this field is cleared by
- * NIFinishBuild().
+ * The IspellDictBuild structure holds temporary data which is used only at
+ * compile time. It can take a lot of memory. Therefore, after the dictionary
+ * is compiled, this data is released by NIFinishBuild().
*
* All resources which should cleared by NIFinishBuild() is initialized using
* tmpalloc() and tmpalloc0().
@@ -73,112 +75,147 @@
* after the initialization is done. During initialization,
* CurrentMemoryContext is the long-lived memory context associated
* with the dictionary cache entry. We keep the short-lived stuff
- * in the Conf->buildCxt context.
+ * in the ConfBuild->buildCxt context.
*/
-#define tmpalloc(sz) MemoryContextAlloc(Conf->buildCxt, (sz))
-#define tmpalloc0(sz) MemoryContextAllocZero(Conf->buildCxt, (sz))
+#define tmpalloc(sz) MemoryContextAlloc(ConfBuild->buildCxt, (sz))
+#define tmpalloc0(sz) MemoryContextAllocZero(ConfBuild->buildCxt, (sz))
-#define tmpstrdup(str) MemoryContextStrdup(Conf->buildCxt, (str))
+#define tmpstrdup(str) MemoryContextStrdup(ConfBuild->buildCxt, (str))
/*
* Prepare for constructing an ISpell dictionary.
*
- * The IspellDict struct is assumed to be zeroed when allocated.
+ * The IspellDictBuild struct is assumed to be zeroed when allocated.
*/
void
-NIStartBuild(IspellDict *Conf)
+NIStartBuild(IspellDictBuild *ConfBuild)
{
+ uint32 dict_size;
+
/*
* The temp context is a child of CurTransactionContext, so that it will
* go away automatically on error.
*/
- Conf->buildCxt = AllocSetContextCreate(CurTransactionContext,
- "Ispell dictionary init context",
- ALLOCSET_DEFAULT_SIZES);
+ ConfBuild->buildCxt = AllocSetContextCreate(CurTransactionContext,
+ "Ispell dictionary init context",
+ ALLOCSET_DEFAULT_SIZES);
+
+ /*
+ * Allocate the dictionary buffer in the current context, not in buildCxt.
+ */
+ dict_size = MAXALIGN(IspellDictDataHdrSize);
+ ConfBuild->dict = palloc0(dict_size);
+ ConfBuild->dict_size = dict_size;
}
/*
- * Clean up when dictionary construction is complete.
+ * Copy compiled and persistent data into IspellDictData.
*/
void
-NIFinishBuild(IspellDict *Conf)
+NICopyData(IspellDictBuild *ConfBuild)
{
- /* Release no-longer-needed temp memory */
- MemoryContextDelete(Conf->buildCxt);
- /* Just for cleanliness, zero the now-dangling pointers */
- Conf->buildCxt = NULL;
- Conf->Spell = NULL;
- Conf->firstfree = NULL;
- Conf->CompoundAffixFlags = NULL;
-}
+ IspellDictData *dict;
+ uint32 size;
+ int i;
+ uint32 *offsets,
+ offset;
+ /*
+ * Calculate necessary space
+ */
+ size = ConfBuild->nAffixData * sizeof(uint32);
+ size += ConfBuild->AffixDataEnd;
-/*
- * "Compact" palloc: allocate without extra palloc overhead.
- *
- * Since we have no need to free the ispell data items individually, there's
- * not much value in the per-chunk overhead normally consumed by palloc.
- * Getting rid of it is helpful since ispell can allocate a lot of small nodes.
- *
- * We currently pre-zero all data allocated this way, even though some of it
- * doesn't need that. The cpalloc and cpalloc0 macros are just documentation
- * to indicate which allocations actually require zeroing.
- */
-#define COMPACT_ALLOC_CHUNK 8192 /* amount to get from palloc at once */
-#define COMPACT_MAX_REQ 1024 /* must be < COMPACT_ALLOC_CHUNK */
+ size += ConfBuild->nAffix * sizeof(uint32);
+ size += ConfBuild->AffixSize;
-static void *
-compact_palloc0(IspellDict *Conf, size_t size)
-{
- void *result;
+ size += ConfBuild->DictNodes.NodesEnd;
+ size += ConfBuild->PrefixNodes.NodesEnd;
+ size += ConfBuild->SuffixNodes.NodesEnd;
- /* Should only be called during init */
- Assert(Conf->buildCxt != NULL);
+ size += sizeof(CMPDAffix) * ConfBuild->nCompoundAffix;
- /* No point in this for large chunks */
- if (size > COMPACT_MAX_REQ)
- return palloc0(size);
+ /*
+ * Copy data itself
+ */
+ ConfBuild->dict_size = IspellDictDataHdrSize + size;
+ ConfBuild->dict = repalloc(ConfBuild->dict, ConfBuild->dict_size);
+
+ dict = ConfBuild->dict;
+
+ /* AffixData */
+ dict->nAffixData = ConfBuild->nAffixData;
+ dict->AffixDataStart = sizeof(uint32) * ConfBuild->nAffixData;
+ memcpy(DictAffixDataOffset(dict), ConfBuild->AffixDataOffset,
+ sizeof(uint32) * ConfBuild->nAffixData);
+ memcpy(DictAffixData(dict), ConfBuild->AffixData, ConfBuild->AffixDataEnd);
+
+ /* Affix array */
+ dict->nAffix = ConfBuild->nAffix;
+ dict->AffixOffsetStart = dict->AffixDataStart + ConfBuild->AffixDataEnd;
+ dict->AffixStart = dict->AffixOffsetStart + sizeof(uint32) * ConfBuild->nAffix;
+ offsets = (uint32 *) DictAffixOffset(dict);
+ offset = 0;
+ for (i = 0; i < ConfBuild->nAffix; i++)
+ {
+ AFFIX *affix;
+ uint32 size = AffixGetSize(ConfBuild->Affix[i]);
- /* Keep everything maxaligned */
- size = MAXALIGN(size);
+ offsets[i] = offset;
+ affix = (AFFIX *) DictAffixGet(dict, i);
+ Assert(affix);
- /* Need more space? */
- if (size > Conf->avail)
- {
- Conf->firstfree = palloc0(COMPACT_ALLOC_CHUNK);
- Conf->avail = COMPACT_ALLOC_CHUNK;
- }
+ memcpy(affix, ConfBuild->Affix[i], size);
- result = (void *) Conf->firstfree;
- Conf->firstfree += size;
- Conf->avail -= size;
+ offset += size;
+ }
- return result;
+ /* DictNodes prefix tree */
+ dict->DictNodesStart = dict->AffixStart + offset;
+ memcpy(DictDictNodes(dict), ConfBuild->DictNodes.Nodes,
+ ConfBuild->DictNodes.NodesEnd);
+
+ /* PrefixNodes prefix tree */
+ dict->PrefixNodesStart = dict->DictNodesStart + ConfBuild->DictNodes.NodesEnd;
+ memcpy(DictPrefixNodes(dict), ConfBuild->PrefixNodes.Nodes,
+ ConfBuild->PrefixNodes.NodesEnd);
+
+ /* SuffixNodes prefix tree */
+ dict->SuffixNodesStart = dict->PrefixNodesStart + ConfBuild->PrefixNodes.NodesEnd;
+ memcpy(DictSuffixNodes(dict), ConfBuild->SuffixNodes.Nodes,
+ ConfBuild->SuffixNodes.NodesEnd);
+
+ /* CompoundAffix array */
+ dict->CompoundAffixStart = dict->SuffixNodesStart +
+ ConfBuild->SuffixNodes.NodesEnd;
+ memcpy(DictCompoundAffix(dict), ConfBuild->CompoundAffix,
+ sizeof(CMPDAffix) * ConfBuild->nCompoundAffix);
}
-#define cpalloc(size) compact_palloc0(Conf, size)
-#define cpalloc0(size) compact_palloc0(Conf, size)
-
-static char *
-cpstrdup(IspellDict *Conf, const char *str)
+/*
+ * Clean up when dictionary construction is complete.
+ */
+void
+NIFinishBuild(IspellDictBuild *ConfBuild)
{
- char *res = cpalloc(strlen(str) + 1);
-
- strcpy(res, str);
- return res;
+ /* Release no-longer-needed temp memory */
+ MemoryContextDelete(ConfBuild->buildCxt);
+ /* Just for cleanliness, zero the now-dangling pointers */
+ ConfBuild->buildCxt = NULL;
+ ConfBuild->Spell = NULL;
+ ConfBuild->CompoundAffixFlags = NULL;
}
-
/*
* Apply lowerstr(), producing a temporary result (in the buildCxt).
*/
static char *
-lowerstr_ctx(IspellDict *Conf, const char *src)
+lowerstr_ctx(IspellDictBuild *ConfBuild, const char *src)
{
MemoryContext saveCtx;
char *dst;
- saveCtx = MemoryContextSwitchTo(Conf->buildCxt);
+ saveCtx = MemoryContextSwitchTo(ConfBuild->buildCxt);
dst = lowerstr(src);
MemoryContextSwitchTo(saveCtx);
@@ -190,7 +227,7 @@ lowerstr_ctx(IspellDict *Conf, const char *src)
#define STRNCMP(s,p) strncmp( (s), (p), strlen(p) )
#define GETWCHAR(W,L,N,T) ( ((const uint8*)(W))[ ((T)==FF_PREFIX) ? (N) : ( (L) - 1 - (N) ) ] )
-#define GETCHAR(A,N,T) GETWCHAR( (A)->repl, (A)->replen, N, T )
+#define GETCHAR(A,N,T) GETWCHAR( AffixFieldRepl(A), (A)->replen, N, T )
static char *VoidString = "";
@@ -311,18 +348,189 @@ strbncmp(const unsigned char *s1, const unsigned char *s2, size_t count)
static int
cmpaffix(const void *s1, const void *s2)
{
- const AFFIX *a1 = (const AFFIX *) s1;
- const AFFIX *a2 = (const AFFIX *) s2;
+ const AFFIX *a1 = *((AFFIX *const *) s1);
+ const AFFIX *a2 = *((AFFIX *const *) s2);
if (a1->type < a2->type)
return -1;
if (a1->type > a2->type)
return 1;
if (a1->type == FF_PREFIX)
- return strcmp(a1->repl, a2->repl);
+ return strcmp(AffixFieldRepl(a1), AffixFieldRepl(a2));
else
- return strbcmp((const unsigned char *) a1->repl,
- (const unsigned char *) a2->repl);
+ return strbcmp((const unsigned char *) AffixFieldRepl(a1),
+ (const unsigned char *) AffixFieldRepl(a2));
+}
+
+/*
+ * Allocate space for AffixData.
+ */
+static void
+InitAffixData(IspellDictBuild *ConfBuild, int numAffixData)
+{
+ uint32 size;
+
+ size = 8 * 1024;	/* Reserve 8KB for data */
+
+ ConfBuild->AffixData = (char *) tmpalloc(size);
+ ConfBuild->AffixDataSize = size;
+ ConfBuild->AffixDataOffset = (uint32 *) tmpalloc(numAffixData * sizeof(uint32));
+ ConfBuild->nAffixData = 0;
+ ConfBuild->mAffixData = numAffixData;
+
+ /* Save offset of the end of data */
+ ConfBuild->AffixDataEnd = 0;
+}
+
+/*
+ * Add a set of affix flags to ConfBuild->AffixData. If AffixData does not
+ * have room for the new affix set, enlarge it.
+ *
+ * ConfBuild: building structure for the current dictionary.
+ * AffixSet: set of affix flags.
+ * AffixSetLen: length of AffixSet, not counting the terminating \0.
+ */
+static void
+AddAffixSet(IspellDictBuild *ConfBuild, const char *AffixSet,
+ uint32 AffixSetLen)
+{
+ /*
+ * Check available space for AffixSet.
+ */
+ if (ConfBuild->AffixDataEnd + AffixSetLen + 1 /* \0 */ >=
+ ConfBuild->AffixDataSize)
+ {
+ uint32 newsize = Max(ConfBuild->AffixDataSize + 8 * 1024 /* 8KB */,
+ ConfBuild->AffixDataSize + AffixSetLen + 1);
+
+ ConfBuild->AffixData = (char *) repalloc(ConfBuild->AffixData, newsize);
+ ConfBuild->AffixDataSize = newsize;
+ }
+
+ /* Check available number of offsets */
+ if (ConfBuild->nAffixData >= ConfBuild->mAffixData)
+ {
+ ConfBuild->mAffixData *= 2;
+ ConfBuild->AffixDataOffset = (uint32 *) repalloc(ConfBuild->AffixDataOffset,
+ sizeof(uint32) * ConfBuild->mAffixData);
+ }
+
+ ConfBuild->AffixDataOffset[ConfBuild->nAffixData] = ConfBuild->AffixDataEnd;
+ StrNCpy(AffixDataGet(ConfBuild, ConfBuild->nAffixData),
+ AffixSet, AffixSetLen + 1);
+
+ /* Save offset of the end of data */
+ ConfBuild->AffixDataEnd += AffixSetLen + 1;
+ ConfBuild->nAffixData++;
+}
+
+/*
+ * Allocate space for prefix tree node.
+ *
+ * ConfBuild: building structure for the current dictionary.
+ * array: NodeArray where to allocate new node.
+ * length: number of allocated NodeData.
+ * sizeNodeData: minimum size of each NodeData.
+ * sizeNodeHeader: size of header of new node.
+ *
+ * Returns an offset of new node in NodeArray->Nodes.
+ */
+static uint32
+AllocateNode(IspellDictBuild *ConfBuild, NodeArray *array, uint32 length,
+ uint32 sizeNodeData, uint32 sizeNodeHeader)
+{
+ uint32 node_offset;
+ uint32 size;
+
+ size = sizeNodeHeader + length * sizeNodeData;
+ size = MAXALIGN(size);
+
+ if (array->NodesSize == 0)
+ {
+ array->NodesSize = size * 32; /* Reserve space for next levels of the
+ * prefix tree */
+ array->Nodes = (char *) tmpalloc(array->NodesSize);
+ array->NodesEnd = 0;
+ }
+ else if (array->NodesEnd + size >= array->NodesSize)
+ {
+ array->NodesSize = Max(array->NodesSize * 2, array->NodesSize + size);
+ array->Nodes = (char *) repalloc(array->Nodes, array->NodesSize);
+ }
+
+ node_offset = array->NodesEnd;
+ array->NodesEnd += size;
+
+ return node_offset;
+}
+
+/*
+ * Allocate space for SPNode.
+ *
+ * Returns the offset of the new node in ConfBuild->DictNodes.Nodes.
+ */
+static uint32
+AllocateSPNode(IspellDictBuild *ConfBuild, uint32 length)
+{
+ uint32 offset;
+ SPNode *node;
+ SPNodeData *data;
+ uint32 i;
+
+ offset = AllocateNode(ConfBuild, &ConfBuild->DictNodes, length,
+ sizeof(SPNodeData), SPNHDRSZ);
+ node = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, offset);
+ node->length = length;
+
+ /*
+ * Initialize all SPNodeData with default values. We cannot use memset()
+ * here because not all fields have 0 as default value.
+ */
+ for (i = 0; i < length; i++)
+ {
+ data = &(node->data[i]);
+ data->val = 0;
+ data->affix = ISPELL_INVALID_INDEX;
+ data->compoundflag = 0;
+ data->isword = 0;
+ data->node_offset = ISPELL_INVALID_OFFSET;
+ }
+
+ return offset;
+}
+
+/*
+ * Allocate space for AffixNode.
+ *
+ * Returns an offset of new node in NodeArray->Nodes.
+ */
+static uint32
+AllocateAffixNode(IspellDictBuild *ConfBuild, NodeArray *array, uint32 length)
+{
+ uint32 offset;
+ AffixNode *node;
+ AffixNodeData *data;
+ uint32 i;
+
+ offset = AllocateNode(ConfBuild, array, length, sizeof(AffixNodeData),
+ ANHRDSZ);
+ node = (AffixNode *) NodeArrayGet(array, offset);
+ node->length = length;
+ node->isvoid = 0;
+
+ /*
+ * Initialize all AffixNodeData with default values. We cannot use memset()
+ * here because not all fields have 0 as default value.
+ */
+ for (i = 0; i < length; i++)
+ {
+ data = &(node->data[i]);
+ data->val = 0;
+ data->affstart = ISPELL_INVALID_INDEX;
+ data->affend = ISPELL_INVALID_INDEX;
+ data->node_offset = ISPELL_INVALID_OFFSET;
+ }
+
+ return offset;
}
/*
@@ -333,7 +541,7 @@ cmpaffix(const void *s1, const void *s2)
* - 2 characters (FM_LONG). A character may be Unicode.
* - numbers from 1 to 65000 (FM_NUM).
*
- * Depending on the flagMode an affix string can have the following format:
+ * Depending on the flagmode an affix string can have the following format:
* - FM_CHAR: ABCD
* Here we have 4 flags: A, B, C and D
* - FM_LONG: ABCDE*
@@ -341,13 +549,13 @@ cmpaffix(const void *s1, const void *s2)
* - FM_NUM: 200,205,50
* Here we have 3 flags: 200, 205 and 50
*
- * Conf: current dictionary.
+ * flagmode: flag mode of the dictionary.
* sflagset: the set of affix flags. Returns a reference to the start of a next
* affix flag.
* sflag: returns an affix flag from sflagset.
*/
static void
-getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
+getNextFlagFromString(FlagMode flagmode, char **sflagset, char *sflag)
{
int32 s;
char *next,
@@ -356,11 +564,11 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
bool stop = false;
bool met_comma = false;
- maxstep = (Conf->flagMode == FM_LONG) ? 2 : 1;
+ maxstep = (flagmode == FM_LONG) ? 2 : 1;
while (**sflagset)
{
- switch (Conf->flagMode)
+ switch (flagmode)
{
case FM_LONG:
case FM_CHAR:
@@ -422,15 +630,15 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
stop = true;
break;
default:
- elog(ERROR, "unrecognized type of Conf->flagMode: %d",
- Conf->flagMode);
+ elog(ERROR, "unrecognized type of flagmode: %d",
+ flagmode);
}
if (stop)
break;
}
- if (Conf->flagMode == FM_LONG && maxstep > 0)
+ if (flagmode == FM_LONG && maxstep > 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix flag \"%s\" with \"long\" flag value",
@@ -440,31 +648,28 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
}
/*
- * Checks if the affix set Conf->AffixData[affix] contains affixflag.
- * Conf->AffixData[affix] does not contain affixflag if this flag is not used
- * actually by the .dict file.
+ * Checks whether the affix set from AffixData contains affixflag. The affix
+ * set does not contain affixflag if the flag is not actually used by the
+ * .dict file.
*
- * Conf: current dictionary.
- * affix: index of the Conf->AffixData array.
+ * flagmode: flag mode of the dictionary.
+ * sflagset: the set of affix flags.
* affixflag: the affix flag.
*
- * Returns true if the string Conf->AffixData[affix] contains affixflag,
- * otherwise returns false.
+ * Returns true if the affix set string contains affixflag, otherwise returns
+ * false.
*/
static bool
-IsAffixFlagInUse(IspellDict *Conf, int affix, const char *affixflag)
+IsAffixFlagInUse(FlagMode flagmode, char *sflagset, const char *affixflag)
{
- char *flagcur;
+ char *flagcur = sflagset;
char flag[BUFSIZ];
if (*affixflag == 0)
return true;
- flagcur = Conf->AffixData[affix];
-
while (*flagcur)
{
- getNextFlagFromString(Conf, &flagcur, flag);
+ getNextFlagFromString(flagmode, &flagcur, flag);
/* Compare first affix flag in flagcur with affixflag */
if (strcmp(flag, affixflag) == 0)
return true;
@@ -477,31 +682,33 @@ IsAffixFlagInUse(IspellDict *Conf, int affix, const char *affixflag)
/*
* Adds the new word into the temporary array Spell.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* word: new word.
* flag: set of affix flags. Single flag can be get by getNextFlagFromString().
*/
static void
-NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
+NIAddSpell(IspellDictBuild *ConfBuild, const char *word, const char *flag)
{
- if (Conf->nspell >= Conf->mspell)
+ if (ConfBuild->nSpell >= ConfBuild->mSpell)
{
- if (Conf->mspell)
+ if (ConfBuild->mSpell)
{
- Conf->mspell *= 2;
- Conf->Spell = (SPELL **) repalloc(Conf->Spell, Conf->mspell * sizeof(SPELL *));
+ ConfBuild->mSpell *= 2;
+ ConfBuild->Spell = (SPELL **) repalloc(ConfBuild->Spell,
+ ConfBuild->mSpell * sizeof(SPELL *));
}
else
{
- Conf->mspell = 1024 * 20;
- Conf->Spell = (SPELL **) tmpalloc(Conf->mspell * sizeof(SPELL *));
+ ConfBuild->mSpell = 1024 * 20;
+ ConfBuild->Spell = (SPELL **) tmpalloc(ConfBuild->mSpell * sizeof(SPELL *));
}
}
- Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
- strcpy(Conf->Spell[Conf->nspell]->word, word);
- Conf->Spell[Conf->nspell]->p.flag = (*flag != '\0')
+ ConfBuild->Spell[ConfBuild->nSpell] =
+ (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
+ strcpy(ConfBuild->Spell[ConfBuild->nSpell]->word, word);
+ ConfBuild->Spell[ConfBuild->nSpell]->p.flag = (*flag != '\0')
? tmpstrdup(flag) : VoidString;
- Conf->nspell++;
+ ConfBuild->nSpell++;
}
/*
@@ -509,11 +716,11 @@ NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
*
* Note caller must already have applied get_tsearch_config_filename.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* filename: path to the .dict file.
*/
void
-NIImportDictionary(IspellDict *Conf, const char *filename)
+NIImportDictionary(IspellDictBuild *ConfBuild, const char *filename)
{
tsearch_readline_state trst;
char *line;
@@ -564,9 +771,9 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
}
s += pg_mblen(s);
}
- pstr = lowerstr_ctx(Conf, line);
+ pstr = lowerstr_ctx(ConfBuild, line);
- NIAddSpell(Conf, pstr, flag);
+ NIAddSpell(ConfBuild, pstr, flag);
pfree(pstr);
pfree(line);
@@ -590,7 +797,7 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
* SFX M 0 's .
* is presented here.
*
- * Conf: current dictionary.
+ * dict: current dictionary.
* word: basic form of word.
* affixflag: affix flag, by which a basic form of word was generated.
* flag: compound flag used to compare with StopMiddle->compoundflag.
@@ -598,9 +805,9 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
* Returns 1 if the word was found in the prefix tree, else returns 0.
*/
static int
-FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
+FindWord(IspellDictData *dict, const char *word, const char *affixflag, int flag)
{
- SPNode *node = Conf->Dictionary;
+ SPNode *node = (SPNode *) DictDictNodes(dict);
SPNodeData *StopLow,
*StopHigh,
*StopMiddle;
@@ -636,10 +843,14 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
* Check if this affix rule is presented in the affix set
* with index StopMiddle->affix.
*/
- if (IsAffixFlagInUse(Conf, StopMiddle->affix, affixflag))
+ if (IsAffixFlagInUse(dict->flagMode,
+ DictAffixDataGet(dict, StopMiddle->affix),
+ affixflag))
return 1;
}
- node = StopMiddle->node;
+ /* Retrieve SPNode by the offset */
+ node = (SPNode *) DictNodeGet(DictDictNodes(dict),
+ StopMiddle->node_offset);
ptr++;
break;
}
@@ -657,7 +868,8 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
/*
* Adds a new affix rule to the Affix field.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary; it is used to
+ * allocate temporary data.
* flag: affix flag ('\' in the below example).
* flagflags: set of flags from the flagval field for this affix rule. This set
* is listed after '/' character in the added string (repl).
@@ -673,26 +885,54 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
* type: FF_SUFFIX or FF_PREFIX.
*/
static void
-NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
- const char *find, const char *repl, int type)
+NIAddAffix(IspellDictBuild *ConfBuild, const char *flag, char flagflags,
+ const char *mask, const char *find, const char *repl, int type)
{
AFFIX *Affix;
+ uint32 size;
+ uint32 flaglen = strlen(flag),
+ findlen = strlen(find),
+ repllen = strlen(repl),
+ masklen = strlen(mask);
+
+ /* Sanity checks */
+ if (flaglen > AF_FLAG_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix flag \"%s\" too long", flag)));
+ if (findlen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix find field \"%s\" too long", find)));
+ if (repllen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix repl field \"%s\" too long", repl)));
+ if (masklen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix mask field \"%s\" too long", mask)));
- if (Conf->naffixes >= Conf->maffixes)
+ if (ConfBuild->nAffix >= ConfBuild->mAffix)
{
- if (Conf->maffixes)
+ if (ConfBuild->mAffix)
{
- Conf->maffixes *= 2;
- Conf->Affix = (AFFIX *) repalloc((void *) Conf->Affix, Conf->maffixes * sizeof(AFFIX));
+ ConfBuild->mAffix *= 2;
+ ConfBuild->Affix = (AFFIX **) repalloc(ConfBuild->Affix,
+ ConfBuild->mAffix * sizeof(AFFIX *));
}
else
{
- Conf->maffixes = 16;
- Conf->Affix = (AFFIX *) palloc(Conf->maffixes * sizeof(AFFIX));
+ ConfBuild->mAffix = 255;
+ ConfBuild->Affix = (AFFIX **) tmpalloc(ConfBuild->mAffix * sizeof(AFFIX *));
}
}
- Affix = Conf->Affix + Conf->naffixes;
+ size = AFFIXHDRSZ + flaglen + 1 /* \0 */ + findlen + 1 /* \0 */ +
+ repllen + 1 /* \0 */ + masklen + 1 /* \0 */;
+
+ Affix = (AFFIX *) tmpalloc(size);
+ ConfBuild->Affix[ConfBuild->nAffix] = Affix;
/* This affix rule can be applied for words with any ending */
if (strcmp(mask, ".") == 0 || *mask == '\0')
@@ -705,42 +945,12 @@ NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
{
Affix->issimple = 0;
Affix->isregis = 1;
- RS_compile(&(Affix->reg.regis), (type == FF_SUFFIX),
- *mask ? mask : VoidString);
}
/* This affix rule will use regex_t to search word ending */
else
{
- int masklen;
- int wmasklen;
- int err;
- pg_wchar *wmask;
- char *tmask;
-
Affix->issimple = 0;
Affix->isregis = 0;
- tmask = (char *) tmpalloc(strlen(mask) + 3);
- if (type == FF_SUFFIX)
- sprintf(tmask, "%s$", mask);
- else
- sprintf(tmask, "^%s", mask);
-
- masklen = strlen(tmask);
- wmask = (pg_wchar *) tmpalloc((masklen + 1) * sizeof(pg_wchar));
- wmasklen = pg_mb2wchar_with_len(tmask, wmask, masklen);
-
- err = pg_regcomp(&(Affix->reg.regex), wmask, wmasklen,
- REG_ADVANCED | REG_NOSUB,
- DEFAULT_COLLATION_OID);
- if (err)
- {
- char errstr[100];
-
- pg_regerror(err, &(Affix->reg.regex), errstr, sizeof(errstr));
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
- errmsg("invalid regular expression: %s", errstr)));
- }
}
Affix->flagflags = flagflags;
@@ -749,15 +959,22 @@ NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
if ((Affix->flagflags & FF_COMPOUNDFLAG) == 0)
Affix->flagflags |= FF_COMPOUNDFLAG;
}
- Affix->flag = cpstrdup(Conf, flag);
+
Affix->type = type;
- Affix->find = (find && *find) ? cpstrdup(Conf, find) : VoidString;
- if ((Affix->replen = strlen(repl)) > 0)
- Affix->repl = cpstrdup(Conf, repl);
- else
- Affix->repl = VoidString;
- Conf->naffixes++;
+ Affix->replen = repllen;
+ StrNCpy(AffixFieldRepl(Affix), repl, repllen + 1);
+
+ Affix->findlen = findlen;
+ StrNCpy(AffixFieldFind(Affix), find, findlen + 1);
+
+ Affix->masklen = masklen;
+ StrNCpy(AffixFieldMask(Affix), mask, masklen + 1);
+
+ StrNCpy(AffixFieldFlag(Affix), flag, flaglen + 1);
+
+ ConfBuild->nAffix++;
+ ConfBuild->AffixSize += size;
}
/* Parsing states for parse_affentry() and friends */
@@ -1021,10 +1238,10 @@ parse_affentry(char *str, char *mask, char *find, char *repl)
* Sets a Hunspell options depending on flag type.
*/
static void
-setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
+setCompoundAffixFlagValue(IspellDictBuild *ConfBuild, CompoundAffixFlag *entry,
char *s, uint32 val)
{
- if (Conf->flagMode == FM_NUM)
+ if (ConfBuild->dict->flagMode == FM_NUM)
{
char *next;
int i;
@@ -1044,19 +1261,19 @@ setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
else
entry->flag.s = tmpstrdup(s);
- entry->flagMode = Conf->flagMode;
+ entry->flagMode = ConfBuild->dict->flagMode;
entry->value = val;
}
/*
* Sets up a correspondence for the affix parameter with the affix flag.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* s: affix flag in string.
* val: affix parameter.
*/
static void
-addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
+addCompoundAffixFlagValue(IspellDictBuild *ConfBuild, char *s, uint32 val)
{
CompoundAffixFlag *newValue;
char sbuf[BUFSIZ];
@@ -1083,29 +1300,29 @@ addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
*sflag = '\0';
/* Resize array or allocate memory for array CompoundAffixFlag */
- if (Conf->nCompoundAffixFlag >= Conf->mCompoundAffixFlag)
+ if (ConfBuild->nCompoundAffixFlag >= ConfBuild->mCompoundAffixFlag)
{
- if (Conf->mCompoundAffixFlag)
+ if (ConfBuild->mCompoundAffixFlag)
{
- Conf->mCompoundAffixFlag *= 2;
- Conf->CompoundAffixFlags = (CompoundAffixFlag *)
- repalloc((void *) Conf->CompoundAffixFlags,
- Conf->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
+ ConfBuild->mCompoundAffixFlag *= 2;
+ ConfBuild->CompoundAffixFlags = (CompoundAffixFlag *)
+ repalloc((void *) ConfBuild->CompoundAffixFlags,
+ ConfBuild->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
}
else
{
- Conf->mCompoundAffixFlag = 10;
- Conf->CompoundAffixFlags = (CompoundAffixFlag *)
- tmpalloc(Conf->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
+ ConfBuild->mCompoundAffixFlag = 10;
+ ConfBuild->CompoundAffixFlags = (CompoundAffixFlag *)
+ tmpalloc(ConfBuild->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
}
}
- newValue = Conf->CompoundAffixFlags + Conf->nCompoundAffixFlag;
+ newValue = ConfBuild->CompoundAffixFlags + ConfBuild->nCompoundAffixFlag;
- setCompoundAffixFlagValue(Conf, newValue, sbuf, val);
+ setCompoundAffixFlagValue(ConfBuild, newValue, sbuf, val);
- Conf->usecompound = true;
- Conf->nCompoundAffixFlag++;
+ ConfBuild->dict->usecompound = true;
+ ConfBuild->nCompoundAffixFlag++;
}
/*
@@ -1113,7 +1330,7 @@ addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
* flags s.
*/
static int
-getCompoundAffixFlagValue(IspellDict *Conf, char *s)
+getCompoundAffixFlagValue(IspellDictBuild *ConfBuild, char *s)
{
uint32 flag = 0;
CompoundAffixFlag *found,
@@ -1121,18 +1338,18 @@ getCompoundAffixFlagValue(IspellDict *Conf, char *s)
char sflag[BUFSIZ];
char *flagcur;
- if (Conf->nCompoundAffixFlag == 0)
+ if (ConfBuild->nCompoundAffixFlag == 0)
return 0;
flagcur = s;
while (*flagcur)
{
- getNextFlagFromString(Conf, &flagcur, sflag);
- setCompoundAffixFlagValue(Conf, &key, sflag, 0);
+ getNextFlagFromString(ConfBuild->dict->flagMode, &flagcur, sflag);
+ setCompoundAffixFlagValue(ConfBuild, &key, sflag, 0);
found = (CompoundAffixFlag *)
- bsearch(&key, (void *) Conf->CompoundAffixFlags,
- Conf->nCompoundAffixFlag, sizeof(CompoundAffixFlag),
+ bsearch(&key, (void *) ConfBuild->CompoundAffixFlags,
+ ConfBuild->nCompoundAffixFlag, sizeof(CompoundAffixFlag),
cmpcmdflag);
if (found != NULL)
flag |= found->value;
@@ -1144,14 +1361,13 @@ getCompoundAffixFlagValue(IspellDict *Conf, char *s)
/*
* Returns a flag set using the s parameter.
*
- * If Conf->useFlagAliases is true then the s parameter is index of the
- * Conf->AffixData array and function returns its entry.
- * Else function returns the s parameter.
+ * If useFlagAliases is true, the s parameter is an index into the AffixData
+ * array and the function returns that entry. Otherwise the function returns
+ * the s parameter itself.
*/
static char *
-getAffixFlagSet(IspellDict *Conf, char *s)
+getAffixFlagSet(IspellDictBuild *ConfBuild, char *s)
{
- if (Conf->useFlagAliases && *s != '\0')
+ if (ConfBuild->dict->useFlagAliases && *s != '\0')
{
int curaffix;
char *end;
@@ -1162,13 +1378,13 @@ getAffixFlagSet(IspellDict *Conf, char *s)
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix alias \"%s\"", s)));
- if (curaffix > 0 && curaffix <= Conf->nAffixData)
+ if (curaffix > 0 && curaffix <= ConfBuild->nAffixData)
/*
* Do not subtract 1 from curaffix because empty string was added
* in NIImportOOAffixes
*/
- return Conf->AffixData[curaffix];
+ return AffixDataGet(ConfBuild, curaffix);
else
return VoidString;
}
@@ -1179,11 +1395,11 @@ getAffixFlagSet(IspellDict *Conf, char *s)
/*
* Import an affix file that follows MySpell or Hunspell format.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* filename: path to the .affix file.
*/
static void
-NIImportOOAffixes(IspellDict *Conf, const char *filename)
+NIImportOOAffixes(IspellDictBuild *ConfBuild, const char *filename)
{
char type[BUFSIZ],
*ptype = NULL;
@@ -1195,17 +1411,16 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
char repl[BUFSIZ],
*prepl;
bool isSuffix = false;
- int naffix = 0,
- curaffix = 0;
+ int naffix = 0;
int sflaglen = 0;
char flagflags = 0;
tsearch_readline_state trst;
char *recoded;
/* read file to find any flag */
- Conf->usecompound = false;
- Conf->useFlagAliases = false;
- Conf->flagMode = FM_CHAR;
+ ConfBuild->dict->usecompound = false;
+ ConfBuild->dict->useFlagAliases = false;
+ ConfBuild->dict->flagMode = FM_CHAR;
if (!tsearch_readline_begin(&trst, filename))
ereport(ERROR,
@@ -1222,30 +1437,36 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
}
if (STRNCMP(recoded, "COMPOUNDFLAG") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDFLAG"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDFLAG"),
FF_COMPOUNDFLAG);
else if (STRNCMP(recoded, "COMPOUNDBEGIN") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDBEGIN"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDBEGIN"),
FF_COMPOUNDBEGIN);
else if (STRNCMP(recoded, "COMPOUNDLAST") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDLAST"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDLAST"),
FF_COMPOUNDLAST);
/* COMPOUNDLAST and COMPOUNDEND are synonyms */
else if (STRNCMP(recoded, "COMPOUNDEND") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDEND"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDEND"),
FF_COMPOUNDLAST);
else if (STRNCMP(recoded, "COMPOUNDMIDDLE") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDMIDDLE"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDMIDDLE"),
FF_COMPOUNDMIDDLE);
else if (STRNCMP(recoded, "ONLYINCOMPOUND") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("ONLYINCOMPOUND"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("ONLYINCOMPOUND"),
FF_COMPOUNDONLY);
else if (STRNCMP(recoded, "COMPOUNDPERMITFLAG") == 0)
- addCompoundAffixFlagValue(Conf,
+ addCompoundAffixFlagValue(ConfBuild,
recoded + strlen("COMPOUNDPERMITFLAG"),
FF_COMPOUNDPERMITFLAG);
else if (STRNCMP(recoded, "COMPOUNDFORBIDFLAG") == 0)
- addCompoundAffixFlagValue(Conf,
+ addCompoundAffixFlagValue(ConfBuild,
recoded + strlen("COMPOUNDFORBIDFLAG"),
FF_COMPOUNDFORBIDFLAG);
else if (STRNCMP(recoded, "FLAG") == 0)
@@ -1258,9 +1479,9 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
if (*s)
{
if (STRNCMP(s, "long") == 0)
- Conf->flagMode = FM_LONG;
+ ConfBuild->dict->flagMode = FM_LONG;
else if (STRNCMP(s, "num") == 0)
- Conf->flagMode = FM_NUM;
+ ConfBuild->dict->flagMode = FM_NUM;
else if (STRNCMP(s, "default") != 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
@@ -1274,8 +1495,8 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
}
tsearch_readline_end(&trst);
- if (Conf->nCompoundAffixFlag > 1)
- qsort((void *) Conf->CompoundAffixFlags, Conf->nCompoundAffixFlag,
+ if (ConfBuild->nCompoundAffixFlag > 1)
+ qsort((void *) ConfBuild->CompoundAffixFlags, ConfBuild->nCompoundAffixFlag,
sizeof(CompoundAffixFlag), cmpcmdflag);
if (!tsearch_readline_begin(&trst, filename))
@@ -1295,15 +1516,15 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
if (ptype)
pfree(ptype);
- ptype = lowerstr_ctx(Conf, type);
+ ptype = lowerstr_ctx(ConfBuild, type);
/* First try to parse AF parameter (alias compression) */
if (STRNCMP(ptype, "af") == 0)
{
/* First line is the number of aliases */
- if (!Conf->useFlagAliases)
+ if (!ConfBuild->dict->useFlagAliases)
{
- Conf->useFlagAliases = true;
+ ConfBuild->dict->useFlagAliases = true;
naffix = atoi(sflag);
if (naffix == 0)
ereport(ERROR,
@@ -1313,21 +1534,15 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
/* Also reserve place for empty flag set */
naffix++;
- Conf->AffixData = (char **) palloc0(naffix * sizeof(char *));
- Conf->lenAffixData = Conf->nAffixData = naffix;
+ InitAffixData(ConfBuild, naffix);
/* Add empty flag set into AffixData */
- Conf->AffixData[curaffix] = VoidString;
- curaffix++;
+ AddAffixSet(ConfBuild, VoidString, 0);
}
/* Other lines is aliases */
else
{
- if (curaffix < naffix)
- {
- Conf->AffixData[curaffix] = cpstrdup(Conf, sflag);
- curaffix++;
- }
+ AddAffixSet(ConfBuild, sflag, strlen(sflag));
}
goto nextline;
}
@@ -1338,8 +1553,8 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
sflaglen = strlen(sflag);
if (sflaglen == 0
- || (sflaglen > 1 && Conf->flagMode == FM_CHAR)
- || (sflaglen > 2 && Conf->flagMode == FM_LONG))
+ || (sflaglen > 1 && ConfBuild->dict->flagMode == FM_CHAR)
+ || (sflaglen > 2 && ConfBuild->dict->flagMode == FM_LONG))
goto nextline;
/*--------
@@ -1367,21 +1582,21 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
/* Get flags after '/' (flags are case sensitive) */
if ((ptr = strchr(repl, '/')) != NULL)
- aflg |= getCompoundAffixFlagValue(Conf,
- getAffixFlagSet(Conf,
+ aflg |= getCompoundAffixFlagValue(ConfBuild,
+ getAffixFlagSet(ConfBuild,
ptr + 1));
/* Get lowercased version of string before '/' */
- prepl = lowerstr_ctx(Conf, repl);
+ prepl = lowerstr_ctx(ConfBuild, repl);
if ((ptr = strchr(prepl, '/')) != NULL)
*ptr = '\0';
- pfind = lowerstr_ctx(Conf, find);
- pmask = lowerstr_ctx(Conf, mask);
+ pfind = lowerstr_ctx(ConfBuild, find);
+ pmask = lowerstr_ctx(ConfBuild, mask);
if (t_iseq(find, '0'))
*pfind = '\0';
if (t_iseq(repl, '0'))
*prepl = '\0';
- NIAddAffix(Conf, sflag, flagflags | aflg, pmask, pfind, prepl,
+ NIAddAffix(ConfBuild, sflag, flagflags | aflg, pmask, pfind, prepl,
isSuffix ? FF_SUFFIX : FF_PREFIX);
pfree(prepl);
pfree(pfind);
@@ -1407,7 +1622,7 @@ nextline:
* work to NIImportOOAffixes(), which will re-read the whole file.
*/
void
-NIImportAffixes(IspellDict *Conf, const char *filename)
+NIImportAffixes(IspellDictBuild *ConfBuild, const char *filename)
{
char *pstr = NULL;
char flag[BUFSIZ];
@@ -1428,9 +1643,9 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
errmsg("could not open affix file \"%s\": %m",
filename)));
- Conf->usecompound = false;
- Conf->useFlagAliases = false;
- Conf->flagMode = FM_CHAR;
+ ConfBuild->dict->usecompound = false;
+ ConfBuild->dict->useFlagAliases = false;
+ ConfBuild->dict->flagMode = FM_CHAR;
while ((recoded = tsearch_readline(&trst)) != NULL)
{
@@ -1452,10 +1667,8 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
s += pg_mblen(s);
if (*s && pg_mblen(s) == 1)
- {
- addCompoundAffixFlagValue(Conf, s, FF_COMPOUNDFLAG);
- Conf->usecompound = true;
- }
+ addCompoundAffixFlagValue(ConfBuild, s, FF_COMPOUNDFLAG);
+
oldformat = true;
goto nextline;
}
@@ -1528,7 +1741,8 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
if (!parse_affentry(pstr, mask, find, repl))
goto nextline;
- NIAddAffix(Conf, flag, flagflags, mask, find, repl, suffixes ? FF_SUFFIX : FF_PREFIX);
+ NIAddAffix(ConfBuild, flag, flagflags, mask, find, repl,
+ suffixes ? FF_SUFFIX : FF_PREFIX);
nextline:
pfree(recoded);
@@ -1547,53 +1761,48 @@ isnewformat:
errmsg("affix file contains both old-style and new-style commands")));
tsearch_readline_end(&trst);
- NIImportOOAffixes(Conf, filename);
+ NIImportOOAffixes(ConfBuild, filename);
}
/*
* Merges two affix flag sets and stores a new affix flag set into
- * Conf->AffixData.
+ * ConfBuild->AffixData.
*
* Returns index of a new affix flag set.
*/
static int
-MergeAffix(IspellDict *Conf, int a1, int a2)
+MergeAffix(IspellDictBuild *ConfBuild, int a1, int a2)
{
- char **ptr;
+ char *ptr;
+ uint32 len;
/* Do not merge affix flags if one of affix flags is empty */
- if (*Conf->AffixData[a1] == '\0')
+ if (*AffixDataGet(ConfBuild, a1) == '\0')
return a2;
- else if (*Conf->AffixData[a2] == '\0')
+ else if (*AffixDataGet(ConfBuild, a2) == '\0')
return a1;
- while (Conf->nAffixData + 1 >= Conf->lenAffixData)
- {
- Conf->lenAffixData *= 2;
- Conf->AffixData = (char **) repalloc(Conf->AffixData,
- sizeof(char *) * Conf->lenAffixData);
- }
-
- ptr = Conf->AffixData + Conf->nAffixData;
- if (Conf->flagMode == FM_NUM)
+ if (ConfBuild->dict->flagMode == FM_NUM)
{
- *ptr = cpalloc(strlen(Conf->AffixData[a1]) +
- strlen(Conf->AffixData[a2]) +
- 1 /* comma */ + 1 /* \0 */ );
- sprintf(*ptr, "%s,%s", Conf->AffixData[a1], Conf->AffixData[a2]);
+ len = strlen(AffixDataGet(ConfBuild, a1)) + 1 /* comma */ +
+ strlen(AffixDataGet(ConfBuild, a2));
+ ptr = tmpalloc(len + 1 /* \0 */);
+ sprintf(ptr, "%s,%s", AffixDataGet(ConfBuild, a1),
+ AffixDataGet(ConfBuild, a2));
}
else
{
- *ptr = cpalloc(strlen(Conf->AffixData[a1]) +
- strlen(Conf->AffixData[a2]) +
- 1 /* \0 */ );
- sprintf(*ptr, "%s%s", Conf->AffixData[a1], Conf->AffixData[a2]);
+ len = strlen(AffixDataGet(ConfBuild, a1)) +
+ strlen(AffixDataGet(ConfBuild, a2));
+ ptr = tmpalloc(len + 1 /* \0 */ );
+ sprintf(ptr, "%s%s", AffixDataGet(ConfBuild, a1),
+ AffixDataGet(ConfBuild, a2));
}
- ptr++;
- *ptr = NULL;
- Conf->nAffixData++;
- return Conf->nAffixData - 1;
+ AddAffixSet(ConfBuild, ptr, len);
+ pfree(ptr);
+
+ return ConfBuild->nAffixData - 1;
}
/*
@@ -1601,66 +1810,87 @@ MergeAffix(IspellDict *Conf, int a1, int a2)
* flags with the given index.
*/
static uint32
-makeCompoundFlags(IspellDict *Conf, int affix)
+makeCompoundFlags(IspellDictBuild *ConfBuild, int affix)
{
- char *str = Conf->AffixData[affix];
+ char *str = AffixDataGet(ConfBuild, affix);
- return (getCompoundAffixFlagValue(Conf, str) & FF_COMPOUNDFLAGMASK);
+ return (getCompoundAffixFlagValue(ConfBuild, str) & FF_COMPOUNDFLAGMASK);
}
/*
* Makes a prefix tree for the given level.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* low: lower index of the Conf->Spell array.
* high: upper index of the Conf->Spell array.
* level: current prefix tree level.
+ *
+ * Returns an offset of SPNode in DictNodes.
*/
-static SPNode *
-mkSPNode(IspellDict *Conf, int low, int high, int level)
+static uint32
+mkSPNode(IspellDictBuild *ConfBuild, int low, int high, int level)
{
int i;
int nchar = 0;
char lastchar = '\0';
+ uint32 rs_offset,
+ new_offset;
SPNode *rs;
SPNodeData *data;
+ int data_index = 0;
int lownew = low;
for (i = low; i < high; i++)
- if (Conf->Spell[i]->p.d.len > level && lastchar != Conf->Spell[i]->word[level])
+ if (ConfBuild->Spell[i]->p.d.len > level &&
+ lastchar != ConfBuild->Spell[i]->word[level])
{
nchar++;
- lastchar = Conf->Spell[i]->word[level];
+ lastchar = ConfBuild->Spell[i]->word[level];
}
if (!nchar)
- return NULL;
+ return ISPELL_INVALID_OFFSET;
- rs = (SPNode *) cpalloc0(SPNHDRSZ + nchar * sizeof(SPNodeData));
- rs->length = nchar;
+ rs_offset = AllocateSPNode(ConfBuild, nchar);
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
data = rs->data;
lastchar = '\0';
for (i = low; i < high; i++)
- if (Conf->Spell[i]->p.d.len > level)
+ if (ConfBuild->Spell[i]->p.d.len > level)
{
- if (lastchar != Conf->Spell[i]->word[level])
+ if (lastchar != ConfBuild->Spell[i]->word[level])
{
if (lastchar)
{
/* Next level of the prefix tree */
- data->node = mkSPNode(Conf, lownew, i, level + 1);
+ new_offset = mkSPNode(ConfBuild, lownew, i, level + 1);
+
+ /*
+ * ConfBuild->DictNodes can be repalloc'ed within
+ * mkSPNode(), so reinitialize pointers.
+ */
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
+
+ /* First save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ /* Work with next node */
+ data_index++;
+ Assert(data_index < nchar);
+ data = &(rs->data[data_index]);
+
lownew = i;
- data++;
}
- lastchar = Conf->Spell[i]->word[level];
+ lastchar = ConfBuild->Spell[i]->word[level];
}
- data->val = ((uint8 *) (Conf->Spell[i]->word))[level];
- if (Conf->Spell[i]->p.d.len == level + 1)
+ data->val = ((uint8 *) (ConfBuild->Spell[i]->word))[level];
+ if (ConfBuild->Spell[i]->p.d.len == level + 1)
{
bool clearCompoundOnly = false;
- if (data->isword && data->affix != Conf->Spell[i]->p.d.affix)
+ if (data->isword && data->affix != ConfBuild->Spell[i]->p.d.affix)
{
/*
* MergeAffix called a few times. If one of word is
@@ -1669,15 +1899,17 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
*/
clearCompoundOnly = (FF_COMPOUNDONLY & data->compoundflag
- & makeCompoundFlags(Conf, Conf->Spell[i]->p.d.affix))
+ & makeCompoundFlags(ConfBuild,
+ ConfBuild->Spell[i]->p.d.affix))
? false : true;
- data->affix = MergeAffix(Conf, data->affix, Conf->Spell[i]->p.d.affix);
+ data->affix = MergeAffix(ConfBuild, data->affix,
+ ConfBuild->Spell[i]->p.d.affix);
}
else
- data->affix = Conf->Spell[i]->p.d.affix;
+ data->affix = ConfBuild->Spell[i]->p.d.affix;
data->isword = 1;
- data->compoundflag = makeCompoundFlags(Conf, data->affix);
+ data->compoundflag = makeCompoundFlags(ConfBuild, data->affix);
if ((data->compoundflag & FF_COMPOUNDONLY) &&
(data->compoundflag & FF_COMPOUNDFLAG) == 0)
@@ -1689,9 +1921,19 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
}
/* Next level of the prefix tree */
- data->node = mkSPNode(Conf, lownew, high, level + 1);
+ new_offset = mkSPNode(ConfBuild, lownew, high, level + 1);
+
+ /*
+ * ConfBuild->DictNodes can be repalloc'ed within mkSPNode(), so
+ * reinitialize pointers.
+ */
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
- return rs;
+ /* Save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ return rs_offset;
}
/*
@@ -1699,7 +1941,7 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
* and affixes.
*/
void
-NISortDictionary(IspellDict *Conf)
+NISortDictionary(IspellDictBuild *ConfBuild)
{
int i;
int naffix = 0;
@@ -1708,81 +1950,81 @@ NISortDictionary(IspellDict *Conf)
/* compress affixes */
/*
- * If we use flag aliases then we need to use Conf->AffixData filled in
+ * If we use flag aliases then we need to use ConfBuild->AffixData filled in
* the NIImportOOAffixes().
*/
- if (Conf->useFlagAliases)
+ if (ConfBuild->dict->useFlagAliases)
{
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
char *end;
- if (*Conf->Spell[i]->p.flag != '\0')
+ if (*ConfBuild->Spell[i]->p.flag != '\0')
{
- curaffix = strtol(Conf->Spell[i]->p.flag, &end, 10);
- if (Conf->Spell[i]->p.flag == end || errno == ERANGE)
+ curaffix = strtol(ConfBuild->Spell[i]->p.flag, &end, 10);
+ if (ConfBuild->Spell[i]->p.flag == end || errno == ERANGE)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix alias \"%s\"",
- Conf->Spell[i]->p.flag)));
+ ConfBuild->Spell[i]->p.flag)));
}
else
{
/*
- * If Conf->Spell[i]->p.flag is empty, then get empty value of
- * Conf->AffixData (0 index).
+ * If ConfBuild->Spell[i]->p.flag is empty, then get empty
+ * value of ConfBuild->AffixData (0 index).
*/
curaffix = 0;
}
- Conf->Spell[i]->p.d.affix = curaffix;
- Conf->Spell[i]->p.d.len = strlen(Conf->Spell[i]->word);
+ ConfBuild->Spell[i]->p.d.affix = curaffix;
+ ConfBuild->Spell[i]->p.d.len = strlen(ConfBuild->Spell[i]->word);
}
}
- /* Otherwise fill Conf->AffixData here */
+ /* Otherwise fill ConfBuild->AffixData here */
else
{
/* Count the number of different flags used in the dictionary */
- qsort((void *) Conf->Spell, Conf->nspell, sizeof(SPELL *),
+ qsort((void *) ConfBuild->Spell, ConfBuild->nSpell, sizeof(SPELL *),
cmpspellaffix);
naffix = 0;
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
if (i == 0
- || strcmp(Conf->Spell[i]->p.flag, Conf->Spell[i - 1]->p.flag))
+ || strcmp(ConfBuild->Spell[i]->p.flag,
+ ConfBuild->Spell[i - 1]->p.flag))
naffix++;
}
/*
- * Fill in Conf->AffixData with the affixes that were used in the
- * dictionary. Replace textual flag-field of Conf->Spell entries with
- * indexes into Conf->AffixData array.
+ * Fill in AffixData with the affixes that were used in the
+ * dictionary. Replace textual flag-field of ConfBuild->Spell entries
+ * with indexes into ConfBuild->AffixData array.
*/
- Conf->AffixData = (char **) palloc0(naffix * sizeof(char *));
+ InitAffixData(ConfBuild, naffix);
curaffix = -1;
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
if (i == 0
- || strcmp(Conf->Spell[i]->p.flag, Conf->AffixData[curaffix]))
+ || strcmp(ConfBuild->Spell[i]->p.flag,
+ AffixDataGet(ConfBuild, curaffix)))
{
curaffix++;
Assert(curaffix < naffix);
- Conf->AffixData[curaffix] = cpstrdup(Conf,
- Conf->Spell[i]->p.flag);
+ AddAffixSet(ConfBuild, ConfBuild->Spell[i]->p.flag,
+ strlen(ConfBuild->Spell[i]->p.flag));
}
- Conf->Spell[i]->p.d.affix = curaffix;
- Conf->Spell[i]->p.d.len = strlen(Conf->Spell[i]->word);
+ ConfBuild->Spell[i]->p.d.affix = curaffix;
+ ConfBuild->Spell[i]->p.d.len = strlen(ConfBuild->Spell[i]->word);
}
-
- Conf->lenAffixData = Conf->nAffixData = naffix;
}
/* Start build a prefix tree */
- qsort((void *) Conf->Spell, Conf->nspell, sizeof(SPELL *), cmpspell);
- Conf->Dictionary = mkSPNode(Conf, 0, Conf->nspell, 0);
+ qsort((void *) ConfBuild->Spell, ConfBuild->nSpell, sizeof(SPELL *), cmpspell);
+ mkSPNode(ConfBuild, 0, ConfBuild->nSpell, 0);
}
/*
@@ -1790,83 +2032,104 @@ NISortDictionary(IspellDict *Conf)
* rule. Affixes with empty replace string do not include in the prefix tree.
* This affixes are included by mkVoidAffix().
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* low: lower index of the Conf->Affix array.
* high: upper index of the Conf->Affix array.
* level: current prefix tree level.
* type: FF_SUFFIX or FF_PREFIX.
+ *
+ * Returns an offset in nodes array.
*/
-static AffixNode *
-mkANode(IspellDict *Conf, int low, int high, int level, int type)
+static uint32
+mkANode(IspellDictBuild *ConfBuild, int low, int high, int level, int type)
{
int i;
int nchar = 0;
uint8 lastchar = '\0';
+ NodeArray *array;
+ uint32 rs_offset,
+ new_offset;
AffixNode *rs;
AffixNodeData *data;
+ int data_index = 0;
int lownew = low;
- int naff;
- AFFIX **aff;
for (i = low; i < high; i++)
- if (Conf->Affix[i].replen > level && lastchar != GETCHAR(Conf->Affix + i, level, type))
+ if (ConfBuild->Affix[i]->replen > level &&
+ lastchar != GETCHAR(ConfBuild->Affix[i], level, type))
{
nchar++;
- lastchar = GETCHAR(Conf->Affix + i, level, type);
+ lastchar = GETCHAR(ConfBuild->Affix[i], level, type);
}
if (!nchar)
- return NULL;
+ return ISPELL_INVALID_OFFSET;
- aff = (AFFIX **) tmpalloc(sizeof(AFFIX *) * (high - low + 1));
- naff = 0;
+ if (type == FF_SUFFIX)
+ array = &ConfBuild->SuffixNodes;
+ else
+ array = &ConfBuild->PrefixNodes;
- rs = (AffixNode *) cpalloc0(ANHRDSZ + nchar * sizeof(AffixNodeData));
- rs->length = nchar;
- data = rs->data;
+ rs_offset = AllocateAffixNode(ConfBuild, array, nchar);
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
+ data = (AffixNodeData *) rs->data;
lastchar = '\0';
for (i = low; i < high; i++)
- if (Conf->Affix[i].replen > level)
+ if (ConfBuild->Affix[i]->replen > level)
{
- if (lastchar != GETCHAR(Conf->Affix + i, level, type))
+ if (lastchar != GETCHAR(ConfBuild->Affix[i], level, type))
{
if (lastchar)
{
/* Next level of the prefix tree */
- data->node = mkANode(Conf, lownew, i, level + 1, type);
- if (naff)
- {
- data->naff = naff;
- data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * naff);
- memcpy(data->aff, aff, sizeof(AFFIX *) * naff);
- naff = 0;
- }
- data++;
+ new_offset = mkANode(ConfBuild, lownew, i, level + 1, type);
+
+ /*
+ * array can be repalloc'ed within mkANode(), so
+ * reinitialize pointers.
+ */
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
+
+ /* First save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ /* Handle next data node */
+ data_index++;
+ Assert(data_index < nchar);
+ data = &(rs->data[data_index]);
+
lownew = i;
}
- lastchar = GETCHAR(Conf->Affix + i, level, type);
+ lastchar = GETCHAR(ConfBuild->Affix[i], level, type);
}
- data->val = GETCHAR(Conf->Affix + i, level, type);
- if (Conf->Affix[i].replen == level + 1)
+ data->val = GETCHAR(ConfBuild->Affix[i], level, type);
+ if (ConfBuild->Affix[i]->replen == level + 1)
{ /* affix stopped */
- aff[naff++] = Conf->Affix + i;
+ if (data->affstart == ISPELL_INVALID_INDEX)
+ {
+ data->affstart = i;
+ data->affend = i;
+ }
+ else
+ data->affend = i;
}
}
/* Next level of the prefix tree */
- data->node = mkANode(Conf, lownew, high, level + 1, type);
- if (naff)
- {
- data->naff = naff;
- data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * naff);
- memcpy(data->aff, aff, sizeof(AFFIX *) * naff);
- naff = 0;
- }
+ new_offset = mkANode(ConfBuild, lownew, high, level + 1, type);
+
+ /*
+ * array can be repalloc'ed within mkANode(), so reinitialize pointers.
+ */
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
- pfree(aff);
+ /* Save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
- return rs;
+ return rs_offset;
}
/*
@@ -1874,137 +2137,154 @@ mkANode(IspellDict *Conf, int low, int high, int level, int type)
* for affixes which have empty replace string ("repl" field).
*/
static void
-mkVoidAffix(IspellDict *Conf, bool issuffix, int startsuffix)
+mkVoidAffix(IspellDictBuild *ConfBuild, bool issuffix, int startsuffix)
{
- int i,
- cnt = 0;
+ int i;
int start = (issuffix) ? startsuffix : 0;
- int end = (issuffix) ? Conf->naffixes : startsuffix;
- AffixNode *Affix = (AffixNode *) palloc0(ANHRDSZ + sizeof(AffixNodeData));
-
- Affix->length = 1;
- Affix->isvoid = 1;
+ int end = (issuffix) ? ConfBuild->nAffix : startsuffix;
+ uint32 node_offset;
+ NodeArray *array;
+ AffixNode *Affix;
+ AffixNodeData *AffixData;
if (issuffix)
- {
- Affix->data->node = Conf->Suffix;
- Conf->Suffix = Affix;
- }
+ array = &ConfBuild->SuffixNodes;
else
- {
- Affix->data->node = Conf->Prefix;
- Conf->Prefix = Affix;
- }
+ array = &ConfBuild->PrefixNodes;
- /* Count affixes with empty replace string */
- for (i = start; i < end; i++)
- if (Conf->Affix[i].replen == 0)
- cnt++;
-
- /* There is not affixes with empty replace string */
- if (cnt == 0)
- return;
+ node_offset = AllocateAffixNode(ConfBuild, array, 1);
+ Affix = (AffixNode *) NodeArrayGet(array, node_offset);
- Affix->data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * cnt);
- Affix->data->naff = (uint32) cnt;
+ Affix->isvoid = 1;
+ AffixData = (AffixNodeData *) Affix->data;
- cnt = 0;
for (i = start; i < end; i++)
- if (Conf->Affix[i].replen == 0)
+ if (ConfBuild->Affix[i]->replen == 0)
{
- Affix->data->aff[cnt] = Conf->Affix + i;
- cnt++;
+ if (AffixData->affstart == ISPELL_INVALID_INDEX)
+ {
+ AffixData->affstart = i;
+ AffixData->affend = i;
+ }
+ else
+ AffixData->affend = i;
}
}
/*
- * Checks if the affixflag is used by dictionary. Conf->AffixData does not
+ * Checks if the affixflag is used by dictionary. AffixData does not
* contain affixflag if this flag is not used actually by the .dict file.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* affixflag: affix flag.
*
- * Returns true if the Conf->AffixData array contains affixflag, otherwise
+ * Returns true if the ConfBuild->AffixData array contains affixflag, otherwise
* returns false.
*/
static bool
-isAffixInUse(IspellDict *Conf, char *affixflag)
+isAffixInUse(IspellDictBuild *ConfBuild, char *affixflag)
{
int i;
- for (i = 0; i < Conf->nAffixData; i++)
- if (IsAffixFlagInUse(Conf, i, affixflag))
+ for (i = 0; i < ConfBuild->nAffixData; i++)
+ if (IsAffixFlagInUse(ConfBuild->dict->flagMode,
+ AffixDataGet(ConfBuild, i), affixflag))
return true;
return false;
}
/*
- * Builds Conf->Prefix and Conf->Suffix trees from the imported affixes.
+ * Builds Prefix and Suffix trees from the imported affixes.
*/
void
-NISortAffixes(IspellDict *Conf)
+NISortAffixes(IspellDictBuild *ConfBuild)
{
AFFIX *Affix;
+ AffixNode *voidPrefix,
+ *voidSuffix;
size_t i;
CMPDAffix *ptr;
- int firstsuffix = Conf->naffixes;
+ int firstsuffix = ConfBuild->nAffix;
+ uint32 prefix_offset,
+ suffix_offset;
- if (Conf->naffixes == 0)
+ if (ConfBuild->nAffix == 0)
return;
/* Store compound affixes in the Conf->CompoundAffix array */
- if (Conf->naffixes > 1)
- qsort((void *) Conf->Affix, Conf->naffixes, sizeof(AFFIX), cmpaffix);
- Conf->CompoundAffix = ptr = (CMPDAffix *) palloc(sizeof(CMPDAffix) * Conf->naffixes);
- ptr->affix = NULL;
-
- for (i = 0; i < Conf->naffixes; i++)
+ if (ConfBuild->nAffix > 1)
+ qsort((void *) ConfBuild->Affix, ConfBuild->nAffix,
+ sizeof(AFFIX *), cmpaffix);
+ ConfBuild->nCompoundAffix = ConfBuild->nAffix;
+ ConfBuild->CompoundAffix = ptr =
+ (CMPDAffix *) tmpalloc(sizeof(CMPDAffix) * ConfBuild->nCompoundAffix);
+ ptr->affix = ISPELL_INVALID_INDEX;
+
+ for (i = 0; i < ConfBuild->nAffix; i++)
{
- Affix = &(((AFFIX *) Conf->Affix)[i]);
+ Affix = ConfBuild->Affix[i];
if (Affix->type == FF_SUFFIX && i < firstsuffix)
firstsuffix = i;
if ((Affix->flagflags & FF_COMPOUNDFLAG) && Affix->replen > 0 &&
- isAffixInUse(Conf, Affix->flag))
+ isAffixInUse(ConfBuild, AffixFieldFlag(Affix)))
{
- if (ptr == Conf->CompoundAffix ||
+ if (ptr == ConfBuild->CompoundAffix ||
ptr->issuffix != (ptr - 1)->issuffix ||
- strbncmp((const unsigned char *) (ptr - 1)->affix,
- (const unsigned char *) Affix->repl,
+ strbncmp((const unsigned char *) AffixFieldRepl(ConfBuild->Affix[(ptr - 1)->affix]),
+ (const unsigned char *) AffixFieldRepl(Affix),
(ptr - 1)->len))
{
/* leave only unique and minimals suffixes */
- ptr->affix = Affix->repl;
+ ptr->affix = i;
ptr->len = Affix->replen;
ptr->issuffix = (Affix->type == FF_SUFFIX);
ptr++;
}
}
}
- ptr->affix = NULL;
- Conf->CompoundAffix = (CMPDAffix *) repalloc(Conf->CompoundAffix, sizeof(CMPDAffix) * (ptr - Conf->CompoundAffix + 1));
+ ptr->affix = ISPELL_INVALID_INDEX;
+ ConfBuild->nCompoundAffix = ptr - ConfBuild->CompoundAffix + 1;
+ ConfBuild->CompoundAffix = (CMPDAffix *) repalloc(ConfBuild->CompoundAffix,
+ sizeof(CMPDAffix) * (ConfBuild->nCompoundAffix));
/* Start build a prefix tree */
- Conf->Prefix = mkANode(Conf, 0, firstsuffix, 0, FF_PREFIX);
- Conf->Suffix = mkANode(Conf, firstsuffix, Conf->naffixes, 0, FF_SUFFIX);
- mkVoidAffix(Conf, true, firstsuffix);
- mkVoidAffix(Conf, false, firstsuffix);
+ mkVoidAffix(ConfBuild, true, firstsuffix);
+ mkVoidAffix(ConfBuild, false, firstsuffix);
+
+ prefix_offset = mkANode(ConfBuild, 0, firstsuffix, 0, FF_PREFIX);
+ suffix_offset = mkANode(ConfBuild, firstsuffix, ConfBuild->nAffix, 0,
+ FF_SUFFIX);
+
+ /* Adjust offsets of new nodes for nodes of void affixes */
+ voidPrefix = (AffixNode *) NodeArrayGet(&ConfBuild->PrefixNodes, 0);
+ voidPrefix->data[0].node_offset = prefix_offset;
+
+ voidSuffix = (AffixNode *) NodeArrayGet(&ConfBuild->SuffixNodes, 0);
+ voidSuffix->data[0].node_offset = suffix_offset;
}
static AffixNodeData *
-FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
+FindAffixes(IspellDictData *dict, AffixNode *node, const char *word, int wrdlen,
+ int *level, int type)
{
+ AffixNode *node_start;
AffixNodeData *StopLow,
*StopHigh,
*StopMiddle;
uint8 symbol;
+ if (type == FF_PREFIX)
+ node_start = (AffixNode *) DictPrefixNodes(dict);
+ else
+ node_start = (AffixNode *) DictSuffixNodes(dict);
+
if (node->isvoid)
{ /* search void affixes */
- if (node->data->naff)
+ if (node->data->affstart != ISPELL_INVALID_INDEX)
return node->data;
- node = node->data->node;
+ node = (AffixNode *) DictNodeGet(node_start, node->data->node_offset);
}
while (node && *level < wrdlen)
@@ -2019,9 +2299,10 @@ FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
if (StopMiddle->val == symbol)
{
(*level)++;
- if (StopMiddle->naff)
+ if (StopMiddle->affstart != ISPELL_INVALID_INDEX)
return StopMiddle;
- node = StopMiddle->node;
+ node = (AffixNode *) DictNodeGet(node_start,
+ StopMiddle->node_offset);
break;
}
else if (StopMiddle->val < symbol)
@@ -2035,8 +2316,67 @@ FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
return NULL;
}
+/*
+ * Compile regular expression on first use and store it within reg.
+ */
+static void
+CompileAffixReg(AffixReg *reg, bool isregis, int type,
+ const char *mask, int masklen, MemoryContext dictCtx)
+{
+ MemoryContext oldcontext;
+
+ Assert(dictCtx);
+
+ /*
+ * Switch to memory context of the dictionary, so compiled expression can be
+ * used in other queries.
+ */
+ oldcontext = MemoryContextSwitchTo(dictCtx);
+
+ if (isregis)
+ RS_compile(&reg->r.regis, (type == FF_SUFFIX), mask);
+ else
+ {
+ int wmasklen;
+ int err;
+ pg_wchar *wmask;
+ char *tmask;
+
+ tmask = (char *) palloc(masklen + 3);
+ if (type == FF_SUFFIX)
+ sprintf(tmask, "%s$", mask);
+ else
+ sprintf(tmask, "^%s", mask);
+
+ masklen = strlen(tmask);
+ wmask = (pg_wchar *) palloc((masklen + 1) * sizeof(pg_wchar));
+ wmasklen = pg_mb2wchar_with_len(tmask, wmask, masklen);
+
+ err = pg_regcomp(&(reg->r.regex), wmask, wmasklen,
+ REG_ADVANCED | REG_NOSUB,
+ DEFAULT_COLLATION_OID);
+ if (err)
+ {
+ char errstr[100];
+
+ pg_regerror(err, &(reg->r.regex), errstr, sizeof(errstr));
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
+ errmsg("invalid regular expression: %s", errstr)));
+ }
+
+ pfree(wmask);
+ pfree(tmask);
+ }
+
+ reg->iscompiled = true;
+
+ MemoryContextSwitchTo(oldcontext);
+}
+
static char *
-CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *newword, int *baselen)
+CheckAffix(const char *word, size_t len, AFFIX *Affix, AffixReg *reg,
+ int flagflags, char *newword, int *baselen, MemoryContext dictCtx)
{
/*
* Check compound allow flags
@@ -2076,7 +2416,7 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
if (Affix->type == FF_SUFFIX)
{
strcpy(newword, word);
- strcpy(newword + len - Affix->replen, Affix->find);
+ strcpy(newword + len - Affix->replen, AffixFieldFind(Affix));
if (baselen) /* store length of non-changed part of word */
*baselen = len - Affix->replen;
}
@@ -2086,9 +2426,9 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
* if prefix is an all non-changed part's length then all word
* contains only prefix and suffix, so out
*/
- if (baselen && *baselen + strlen(Affix->find) <= Affix->replen)
+ if (baselen && *baselen + Affix->findlen <= Affix->replen)
return NULL;
- strcpy(newword, Affix->find);
+ strcpy(newword, AffixFieldFind(Affix));
strcat(newword, word + Affix->replen);
}
@@ -2099,7 +2439,12 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
return newword;
else if (Affix->isregis)
{
- if (RS_execute(&(Affix->reg.regis), newword))
+ /* Compile the regular expression on first demand */
+ if (!reg->iscompiled)
+ CompileAffixReg(reg, Affix->isregis, Affix->type,
+ AffixFieldMask(Affix), Affix->masklen, dictCtx);
+
+ if (RS_execute(&(reg->r.regis), newword))
return newword;
}
else
@@ -2109,12 +2454,17 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
size_t data_len;
int newword_len;
+ /* Compile the regular expression on first demand */
+ if (!reg->iscompiled)
+ CompileAffixReg(reg, Affix->isregis, Affix->type,
+ AffixFieldMask(Affix), Affix->masklen, dictCtx);
+
/* Convert data string to wide characters */
newword_len = strlen(newword);
data = (pg_wchar *) palloc((newword_len + 1) * sizeof(pg_wchar));
data_len = pg_mb2wchar_with_len(newword, data, newword_len);
- if (!(err = pg_regexec(&(Affix->reg.regex), data, data_len, 0, NULL, 0, NULL, 0)))
+ if (!(err = pg_regexec(&(reg->r.regex), data, data_len, 0, NULL, 0, NULL, 0)))
{
pfree(data);
return newword;
@@ -2153,7 +2503,7 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
char **cur;
char newword[2 * MAXNORMLEN] = "";
char pnewword[2 * MAXNORMLEN] = "";
- AffixNode *snode = Conf->Suffix,
+ AffixNode *snode = (AffixNode *) DictSuffixNodes(Conf->dict),
*pnode;
int i,
j;
@@ -2165,7 +2515,7 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
/* Check that the word itself is normal form */
- if (FindWord(Conf, word, VoidString, flag))
+ if (FindWord(Conf->dict, word, VoidString, flag))
{
*cur = pstrdup(word);
cur++;
@@ -2173,23 +2523,29 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
}
/* Find all other NORMAL forms of the 'word' (check only prefix) */
- pnode = Conf->Prefix;
+ pnode = (AffixNode *) DictPrefixNodes(Conf->dict);
plevel = 0;
while (pnode)
{
- prefix = FindAffixes(pnode, word, wrdlen, &plevel, FF_PREFIX);
+ prefix = FindAffixes(Conf->dict, pnode, word, wrdlen, &plevel, FF_PREFIX);
if (!prefix)
break;
- for (j = 0; j < prefix->naff; j++)
+ for (j = prefix->affstart; j <= prefix->affend; j++)
{
- if (CheckAffix(word, wrdlen, prefix->aff[j], flag, newword, NULL))
+ AFFIX *affix = (AFFIX *) DictAffixGet(Conf->dict, j);
+ AffixReg *reg = &(Conf->reg[j]);
+
+ if (affix &&
+ CheckAffix(word, wrdlen, affix, reg, flag, newword, NULL,
+ Conf->dictCtx))
{
/* prefix success */
- if (FindWord(Conf, newword, prefix->aff[j]->flag, flag))
+ if (FindWord(Conf->dict, newword, AffixFieldFlag(affix), flag))
cur += addToResult(forms, cur, newword);
}
}
- pnode = prefix->node;
+ pnode = (AffixNode *) DictNodeGet(DictPrefixNodes(Conf->dict),
+ prefix->node_offset);
}
/*
@@ -2201,45 +2557,59 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
int baselen = 0;
/* find possible suffix */
- suffix = FindAffixes(snode, word, wrdlen, &slevel, FF_SUFFIX);
+ suffix = FindAffixes(Conf->dict, snode, word, wrdlen, &slevel,
+ FF_SUFFIX);
if (!suffix)
break;
/* foreach suffix check affix */
- for (i = 0; i < suffix->naff; i++)
+ for (i = suffix->affstart; i <= suffix->affend; i++)
{
- if (CheckAffix(word, wrdlen, suffix->aff[i], flag, newword, &baselen))
+ AFFIX *sufentry = (AFFIX *) DictAffixGet(Conf->dict, i);
+ AffixReg *sufreg = &(Conf->reg[i]);
+
+ if (sufentry &&
+ CheckAffix(word, wrdlen, sufentry, sufreg, flag, newword, &baselen,
+ Conf->dictCtx))
{
/* suffix success */
- if (FindWord(Conf, newword, suffix->aff[i]->flag, flag))
+ if (FindWord(Conf->dict, newword, AffixFieldFlag(sufentry), flag))
cur += addToResult(forms, cur, newword);
/* now we will look changed word with prefixes */
- pnode = Conf->Prefix;
+ pnode = (AffixNode *) DictPrefixNodes(Conf->dict);
plevel = 0;
swrdlen = strlen(newword);
while (pnode)
{
- prefix = FindAffixes(pnode, newword, swrdlen, &plevel, FF_PREFIX);
+ prefix = FindAffixes(Conf->dict, pnode, newword, swrdlen,
+ &plevel, FF_PREFIX);
if (!prefix)
break;
- for (j = 0; j < prefix->naff; j++)
+ for (j = prefix->affstart; j <= prefix->affend; j++)
{
- if (CheckAffix(newword, swrdlen, prefix->aff[j], flag, pnewword, &baselen))
+ AFFIX *prefentry = (AFFIX *) DictAffixGet(Conf->dict, j);
+ AffixReg *prefreg = &(Conf->reg[j]);
+
+ if (prefentry &&
+ CheckAffix(newword, swrdlen, prefentry, prefreg,
+ flag, pnewword, &baselen, Conf->dictCtx))
{
/* prefix success */
- char *ff = (prefix->aff[j]->flagflags & suffix->aff[i]->flagflags & FF_CROSSPRODUCT) ?
- VoidString : prefix->aff[j]->flag;
+ char *ff = (prefentry->flagflags & sufentry->flagflags & FF_CROSSPRODUCT) ?
+ VoidString : AffixFieldFlag(prefentry);
- if (FindWord(Conf, pnewword, ff, flag))
+ if (FindWord(Conf->dict, pnewword, ff, flag))
cur += addToResult(forms, cur, pnewword);
}
}
- pnode = prefix->node;
+ pnode = (AffixNode *) DictNodeGet(DictPrefixNodes(Conf->dict),
+ prefix->node_offset);
}
}
}
- snode = suffix->node;
+ snode = (AffixNode *) DictNodeGet(DictSuffixNodes(Conf->dict),
+ suffix->node_offset);
}
if (cur == forms)
@@ -2259,7 +2629,8 @@ typedef struct SplitVar
} SplitVar;
static int
-CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
+CheckCompoundAffixes(IspellDictData *dict, CMPDAffix **ptr,
+ char *word, int len, bool CheckInPlace)
{
bool issuffix;
@@ -2269,9 +2640,12 @@ CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
if (CheckInPlace)
{
- while ((*ptr)->affix)
+ while ((*ptr)->affix != ISPELL_INVALID_INDEX)
{
- if (len > (*ptr)->len && strncmp((*ptr)->affix, word, (*ptr)->len) == 0)
+ AFFIX *affix = (AFFIX *) DictAffixGet(dict, (*ptr)->affix);
+
+ if (len > (*ptr)->len &&
+ strncmp(AffixFieldRepl(affix), word, (*ptr)->len) == 0)
{
len = (*ptr)->len;
issuffix = (*ptr)->issuffix;
@@ -2285,9 +2659,12 @@ CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
{
char *affbegin;
- while ((*ptr)->affix)
+ while ((*ptr)->affix != ISPELL_INVALID_INDEX)
{
- if (len > (*ptr)->len && (affbegin = strstr(word, (*ptr)->affix)) != NULL)
+ AFFIX *affix = (AFFIX *) DictAffixGet(dict, (*ptr)->affix);
+
+ if (len > (*ptr)->len &&
+ (affbegin = strstr(word, AffixFieldRepl(affix))) != NULL)
{
len = (*ptr)->len + (affbegin - word);
issuffix = (*ptr)->issuffix;
@@ -2339,13 +2716,14 @@ AddStem(SplitVar *v, char *word)
}
static SplitVar *
-SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int wordlen, int startpos, int minpos)
+SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig,
+ char *word, int wordlen, int startpos, int minpos)
{
SplitVar *var = NULL;
SPNodeData *StopLow,
*StopHigh,
*StopMiddle = NULL;
- SPNode *node = (snode) ? snode : Conf->Dictionary;
+ SPNode *node = (snode) ? snode : (SPNode *) DictDictNodes(Conf->dict);
int level = (snode) ? minpos : startpos; /* recursive
* minpos==level */
int lenaff;
@@ -2360,8 +2738,11 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
while (level < wordlen)
{
/* find word with epenthetic or/and compound affix */
- caff = Conf->CompoundAffix;
- while (level > startpos && (lenaff = CheckCompoundAffixes(&caff, word + level, wordlen - level, (node) ? true : false)) >= 0)
+ caff = (CMPDAffix *) DictCompoundAffix(Conf->dict);
+ while (level > startpos &&
+ (lenaff = CheckCompoundAffixes(Conf->dict, &caff,
+ word + level, wordlen - level,
+ (node) ? true : false)) >= 0)
{
/*
* there is one of compound affixes, so check word for existings
@@ -2408,7 +2789,8 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
while (ptr->next)
ptr = ptr->next;
- ptr->next = SplitToVariants(Conf, NULL, new, word, wordlen, startpos + lenaff, startpos + lenaff);
+ ptr->next = SplitToVariants(Conf, NULL, new, word, wordlen,
+ startpos + lenaff, startpos + lenaff);
pfree(new->stem);
pfree(new);
@@ -2467,13 +2849,14 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
/* we can find next word */
level++;
AddStem(var, pnstrdup(word + startpos, level - startpos));
- node = Conf->Dictionary;
+ node = (SPNode *) DictDictNodes(Conf->dict);
startpos = level;
continue;
}
}
}
- node = StopMiddle->node;
+ node = (SPNode *) DictNodeGet(DictDictNodes(Conf->dict),
+ StopMiddle->node_offset);
}
else
node = NULL;
@@ -2523,7 +2906,7 @@ NINormalizeWord(IspellDict *Conf, char *word)
pfree(res);
}
- if (Conf->usecompound)
+ if (Conf->dict->usecompound)
{
int wordlen = strlen(word);
SplitVar *ptr,
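The spell.c changes above replace node pointers with offsets (node_offset) into a growable byte array, and re-fetch every pointer after a recursive call because the array may have moved. The following is a minimal standalone sketch of that idea and is not code from the patch. DemoArray, DemoNode, demo_alloc(), demo_get(), demo_build() and demo_contains() are hypothetical names, plain realloc() stands in for the palloc-based allocation, and each node holds a single character instead of a sorted array of them.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Plays the role of ISPELL_INVALID_OFFSET: "no child node". */
#define INVALID_OFFSET 0xFFFFFFFFu

/* Hypothetical growable buffer, loosely modelled on NodeArray. */
typedef struct
{
	char	   *nodes;
	uint32_t	size;			/* allocated bytes */
	uint32_t	end;			/* used bytes */
} DemoArray;

/* Hypothetical node: refers to its child by offset, never by pointer. */
typedef struct
{
	uint8_t		val;
	uint32_t	child_offset;
} DemoNode;

/* Turn an offset into a pointer that is valid only until the next alloc. */
static DemoNode *
demo_get(DemoArray *a, uint32_t off)
{
	return (off == INVALID_OFFSET) ? NULL : (DemoNode *) (a->nodes + off);
}

/* Append one node and return its offset; the buffer may move here. */
static uint32_t
demo_alloc(DemoArray *a, uint8_t val)
{
	uint32_t	off = a->end;
	DemoNode   *node;

	if (a->end + sizeof(DemoNode) > a->size)
	{
		a->size = a->size ? a->size * 2 : (uint32_t) sizeof(DemoNode);
		a->nodes = realloc(a->nodes, a->size);
		assert(a->nodes != NULL);
	}
	a->end += (uint32_t) sizeof(DemoNode);

	node = (DemoNode *) (a->nodes + off);
	node->val = val;
	node->child_offset = INVALID_OFFSET;
	return off;
}

/* Build a chain of nodes for one word; returns the offset of the first. */
static uint32_t
demo_build(DemoArray *a, const char *word)
{
	uint32_t	self;
	uint32_t	child;

	if (*word == '\0')
		return INVALID_OFFSET;

	self = demo_alloc(a, (uint8_t) *word);
	child = demo_build(a, word + 1);	/* may realloc a->nodes */

	/* Re-resolve the pointer from the offset after the recursive call. */
	demo_get(a, self)->child_offset = child;
	return self;
}

/* Return 1 if the chain starting at root spells exactly word. */
static int
demo_contains(DemoArray *a, uint32_t root, const char *word)
{
	DemoNode   *node = demo_get(a, root);

	while (node != NULL && *word != '\0')
	{
		if (node->val != (uint8_t) *word)
			return 0;
		word++;
		node = demo_get(a, node->child_offset);
	}
	return *word == '\0' && node == NULL;
}
```

Because the nodes refer to each other only by offset, the buffer can be copied with memcpy() to a different address and traversed there unchanged. A structure placed in a DSM segment needs this property, since each backend may map the segment at a different address.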
diff --git a/src/include/tsearch/dicts/spell.h b/src/include/tsearch/dicts/spell.h
index 210f97dda9..b40cf379eb 100644
--- a/src/include/tsearch/dicts/spell.h
+++ b/src/include/tsearch/dicts/spell.h
@@ -18,21 +18,23 @@
#include "tsearch/dicts/regis.h"
#include "tsearch/ts_public.h"
+#define ISPELL_INVALID_INDEX (0x7FFFF)
+#define ISPELL_INVALID_OFFSET (0xFFFFFFFF)
+
/*
* SPNode and SPNodeData are used to represent prefix tree (Trie) to store
* a words list.
*/
-struct SPNode;
-
typedef struct
{
uint32 val:8,
isword:1,
/* Stores compound flags listed below */
compoundflag:4,
- /* Reference to an entry of the AffixData field */
+ /* Index of an entry of the AffixData field */
affix:19;
- struct SPNode *node;
+ /* Offset to a node of the DictNodes field */
+ uint32 node_offset;
} SPNodeData;
/*
@@ -86,21 +88,55 @@ typedef struct spell_struct
*/
typedef struct aff_struct
{
- char *flag;
/* FF_SUFFIX or FF_PREFIX */
- uint32 type:1,
+ uint16 type:1,
flagflags:7,
issimple:1,
isregis:1,
- replen:14;
- char *find;
- char *repl;
+ flaglen:2;
+
+ /* 8 bits could be too much for repl, find and mask, but who knows */
+ uint8 replen;
+ uint8 findlen;
+ uint8 masklen;
+
+ /*
+ * fields stores the following data (each ends with \0):
+ * - repl
+ * - find
+ * - mask
+ * - flag - one character (if FM_CHAR),
+ * two characters (if FM_LONG),
+ * number, >= 0 and < 65536 (if FM_NUM).
+ */
+ char fields[FLEXIBLE_ARRAY_MEMBER];
+} AFFIX;
+
+#define AF_FLAG_MAXSIZE 5 /* strlen(65536) */
+#define AF_REPL_MAXSIZE 255 /* 8 bits */
+
+#define AFFIXHDRSZ (offsetof(AFFIX, fields))
+
+#define AffixFieldRepl(af) ((af)->fields)
+#define AffixFieldFind(af) ((af)->fields + (af)->replen + 1)
+#define AffixFieldMask(af) (AffixFieldFind(af) + (af)->findlen + 1)
+#define AffixFieldFlag(af) (AffixFieldMask(af) + (af)->masklen + 1)
+#define AffixGetSize(af) (AFFIXHDRSZ + (af)->replen + 1 + (af)->findlen + 1 \
+ + (af)->masklen + 1 + strlen(AffixFieldFlag(af)) + 1)
+
+/*
+ * Stores compiled regular expression of affix. AffixReg uses mask field of
+ * AFFIX as a regular expression.
+ */
+typedef struct AffixReg
+{
+ bool iscompiled;
union
{
regex_t regex;
Regis regis;
- } reg;
-} AFFIX;
+ } r;
+} AffixReg;
/*
* affixes use dictionary flags too
@@ -120,14 +156,13 @@ typedef struct aff_struct
* AffixNode and AffixNodeData are used to represent prefix tree (Trie) to store
* an affix list.
*/
-struct AffixNode;
-
typedef struct
{
- uint32 val:8,
- naff:24;
- AFFIX **aff;
- struct AffixNode *node;
+ uint8 val;
+ uint32 affstart;
+ uint32 affend;
+ /* Offset to a node of the PrefixNodes or SuffixNodes field */
+ uint32 node_offset;
} AffixNodeData;
typedef struct AffixNode
@@ -139,9 +174,19 @@ typedef struct AffixNode
#define ANHRDSZ (offsetof(AffixNode, data))
+typedef struct NodeArray
+{
+ char *Nodes;
+ uint32 NodesSize; /* allocated size of Nodes */
+ uint32 NodesEnd; /* end of data in Nodes */
+} NodeArray;
+
+#define NodeArrayGet(na, of) (((of) == ISPELL_INVALID_OFFSET) ? NULL : (na)->Nodes + (of))
+
typedef struct
{
- char *affix;
+ /* Index of an affix of the Affix field */
+ uint32 affix;
int len;
bool issuffix;
} CMPDAffix;
@@ -176,30 +221,71 @@ typedef struct CompoundAffixFlag
#define FLAGNUM_MAXSIZE (1 << 16)
-typedef struct
+typedef struct IspellDictData
{
- int maffixes;
- int naffixes;
- AFFIX *Affix;
-
- AffixNode *Suffix;
- AffixNode *Prefix;
+ FlagMode flagMode;
+ bool usecompound;
- SPNode *Dictionary;
- /* Array of sets of affixes */
- char **AffixData;
- int lenAffixData;
- int nAffixData;
bool useFlagAliases;
- CMPDAffix *CompoundAffix;
+ uint32 nAffixData;
+ uint32 AffixDataStart;
- bool usecompound;
- FlagMode flagMode;
+ uint32 AffixOffsetStart;
+ uint32 AffixStart;
+ uint32 nAffix;
+
+ uint32 DictNodesStart;
+ uint32 PrefixNodesStart;
+ uint32 SuffixNodesStart;
+
+ uint32 CompoundAffixStart;
/*
- * All follow fields are actually needed only for initialization
+ * data stores:
+ * - AffixData - array of affix sets
+ * - Affix - sorted array of affixes
+ * - DictNodes - prefix tree of a word list
+ * - PrefixNodes - prefix tree of a prefix list
+ * - SuffixNodes - prefix tree of a suffix list
+ * - CompoundAffix - array of compound affixes
*/
+ char data[FLEXIBLE_ARRAY_MEMBER];
+} IspellDictData;
+
+#define IspellDictDataHdrSize (offsetof(IspellDictData, data))
+
+#define DictAffixDataOffset(d) ((d)->data)
+#define DictAffixData(d) ((d)->data + (d)->AffixDataStart)
+#define DictAffixDataGet(d, i) (((i) == ISPELL_INVALID_INDEX) ? NULL : \
+ DictAffixData(d) + ((uint32 *) DictAffixDataOffset(d))[i])
+
+#define DictAffixOffset(d) ((d)->data + (d)->AffixOffsetStart)
+#define DictAffix(d) ((d)->data + (d)->AffixStart)
+#define DictAffixGet(d, i) (((i) == ISPELL_INVALID_INDEX) ? NULL : \
+ DictAffix(d) + ((uint32 *) DictAffixOffset(d))[i])
+
+#define DictDictNodes(d) ((d)->data + (d)->DictNodesStart)
+#define DictPrefixNodes(d) ((d)->data + (d)->PrefixNodesStart)
+#define DictSuffixNodes(d) ((d)->data + (d)->SuffixNodesStart)
+#define DictNodeGet(node_start, of) (((of) == ISPELL_INVALID_OFFSET) ? NULL : \
+ (char *) (node_start) + (of))
+
+#define DictCompoundAffix(d) ((d)->data + (d)->CompoundAffixStart)
+
+/*
+ * IspellDictBuild is used to initialize IspellDictData struct. This is a
+ * temporary structure which is set up by NIStartBuild() and released by
+ * NIFinishBuild().
+ */
+typedef struct IspellDictBuild
+{
+ MemoryContext buildCxt; /* temp context for construction */
+
+ IspellDictData *dict;
+ uint32 dict_size;
+
+ /* Temporary data */
/* Array of Hunspell options in affix file */
CompoundAffixFlag *CompoundAffixFlags;
@@ -208,29 +294,73 @@ typedef struct
/* allocated length of CompoundAffixFlags array */
int mCompoundAffixFlag;
- /*
- * Remaining fields are only used during dictionary construction; they are
- * set up by NIStartBuild and cleared by NIFinishBuild.
- */
- MemoryContext buildCxt; /* temp context for construction */
-
- /* Temporary array of all words in the dict file */
+ /* Array of all words in the dict file */
SPELL **Spell;
- int nspell; /* number of valid entries in Spell array */
- int mspell; /* allocated length of Spell array */
+ int nSpell; /* number of valid entries in Spell array */
+ int mSpell; /* allocated length of Spell array */
+
+ /* Array of all affixes in the aff file */
+ AFFIX **Affix;
+ int nAffix; /* number of valid entries in Affix array */
+ int mAffix; /* allocated length of Affix array */
+ uint32 AffixSize;
+
+ /* Data for IspellDictData */
+
+ /* Array of sets of affixes */
+ uint32 *AffixDataOffset;
+ int nAffixData; /* number of affix sets */
+ int mAffixData; /* allocated number of affix sets */
+ char *AffixData;
+ uint32 AffixDataSize; /* allocated size of AffixData */
+ uint32 AffixDataEnd; /* end of data in AffixData */
+
+ /* Prefix tree which stores a word list */
+ NodeArray DictNodes;
+
+ /* Prefix tree which stores a prefix list */
+ NodeArray PrefixNodes;
+
+ /* Prefix tree which stores a suffix list */
+ NodeArray SuffixNodes;
- /* These are used to allocate "compact" data without palloc overhead */
- char *firstfree; /* first free address (always maxaligned) */
- size_t avail; /* free space remaining at firstfree */
+ /* Array of compound affixes */
+ CMPDAffix *CompoundAffix;
+ int nCompoundAffix; /* number of entries of CompoundAffix */
+} IspellDictBuild;
+
+#define AffixDataGet(d, i) ((d)->AffixData + (d)->AffixDataOffset[i])
+
+/*
+ * IspellDict is used within NINormalizeWord.
+ */
+typedef struct IspellDict
+{
+ /*
+ * Pointer to a DSM location of IspellDictData. Should be retrieved on
+ * every dispell_lexize() call.
+ */
+ IspellDictData *dict;
+ /*
+ * Array of regular expressions of affixes. Each regular expression is
+ * compiled only on demand.
+ */
+ AffixReg *reg;
+ /*
+ * Memory context for compiling regular expressions.
+ */
+ MemoryContext dictCtx;
} IspellDict;
extern TSLexeme *NINormalizeWord(IspellDict *Conf, char *word);
-extern void NIStartBuild(IspellDict *Conf);
-extern void NIImportAffixes(IspellDict *Conf, const char *filename);
-extern void NIImportDictionary(IspellDict *Conf, const char *filename);
-extern void NISortDictionary(IspellDict *Conf);
-extern void NISortAffixes(IspellDict *Conf);
-extern void NIFinishBuild(IspellDict *Conf);
+extern void NIStartBuild(IspellDictBuild *ConfBuild);
+extern void NIImportAffixes(IspellDictBuild *ConfBuild, const char *filename);
+extern void NIImportDictionary(IspellDictBuild *ConfBuild,
+ const char *filename);
+extern void NISortDictionary(IspellDictBuild *ConfBuild);
+extern void NISortAffixes(IspellDictBuild *ConfBuild);
+extern void NICopyData(IspellDictBuild *ConfBuild);
+extern void NIFinishBuild(IspellDictBuild *ConfBuild);
#endif
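For readers following the header changes above: the patch replaces pointers such as SPNodeData.node with uint32 offsets into one flat buffer, so the data stays usable at whatever address a backend maps the DSM segment. The sketch below is not code from the patch. DemoNode, demo_build_chain(), demo_node_read() and demo_lookup() are hypothetical names, loosely modelled on DictNodeGet() and ISPELL_INVALID_OFFSET, and the "trie" is reduced to a single chain of nodes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Plays the role of ISPELL_INVALID_OFFSET in this sketch. */
#define DEMO_INVALID_OFFSET 0xFFFFFFFFu

/* Hypothetical, simplified node: one character plus the offset of its child. */
typedef struct
{
    char     val;
    uint8_t  isword;
    uint32_t child_offset;   /* byte offset from the buffer start, not a pointer */
} DemoNode;

/* Store one word in 'buf' as a chain of nodes; returns the number of bytes used. */
static size_t
demo_build_chain(char *buf, size_t bufsize, const char *word)
{
    size_t len = strlen(word);
    size_t i;

    assert(len > 0 && len * sizeof(DemoNode) <= bufsize);

    for (i = 0; i < len; i++)
    {
        DemoNode node;

        memset(&node, 0, sizeof(node));
        node.val = word[i];
        node.isword = (i == len - 1);
        node.child_offset = (i == len - 1)
            ? DEMO_INVALID_OFFSET
            : (uint32_t) ((i + 1) * sizeof(DemoNode));
        memcpy(buf + i * sizeof(DemoNode), &node, sizeof(node));
    }
    return len * sizeof(DemoNode);
}

/* Same idea as DictNodeGet(): resolve an offset against the local base address. */
static int
demo_node_read(const char *base, uint32_t off, DemoNode *out)
{
    if (off == DEMO_INVALID_OFFSET)
        return 0;
    memcpy(out, base + off, sizeof(DemoNode));
    return 1;
}

/* Walk the chain from offset 0; returns 1 if 'word' is stored, 0 otherwise. */
static int
demo_lookup(const char *base, const char *word)
{
    DemoNode node;
    uint32_t off = 0;

    if (*word == '\0')
        return 0;

    while (demo_node_read(base, off, &node))
    {
        if (node.val != *word)
            return 0;
        word++;
        if (*word == '\0')
            return node.isword;
        off = node.child_offset;
    }
    return 0;
}
```

Because every link is an offset, a byte-for-byte copy of the buffer at a different address can be searched without fixing anything up, which is the property a structure needs when each backend may map the DSM segment at a different address.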
Here is the new version of the patch.
RemoveTSDictionaryById() and AlterTSDictionary() now unpin the
dictionary's DSM segment, so the allocated DSM segment is released once
all attached backends have disconnected.
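The lifetime rule behind this can be illustrated with a small stand-alone model. It is only a sketch under simplifying assumptions: DemoSegment and the demo_* functions are hypothetical and do not use the real dsm.c API. They only mimic the idea that a pinned segment carries one extra reference, so it survives until it is unpinned and the last backend has detached.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a DSM segment's reference counting. */
typedef struct
{
    int  refcnt;      /* attached backends, plus one while the segment is pinned */
    bool pinned;
    bool destroyed;
} DemoSegment;

/* Drop one reference; the segment goes away with the last one. */
static void
demo_drop_ref(DemoSegment *seg)
{
    assert(seg->refcnt > 0);
    if (--seg->refcnt == 0)
        seg->destroyed = true;
}

/* The creating backend is attached from the start. */
static void
demo_create(DemoSegment *seg)
{
    seg->refcnt = 1;
    seg->pinned = false;
    seg->destroyed = false;
}

/* Pinning keeps the segment alive even with no backend attached. */
static void
demo_pin(DemoSegment *seg)
{
    assert(!seg->pinned && !seg->destroyed);
    seg->pinned = true;
    seg->refcnt++;
}

/* What DROP / ALTER would do in this model: give up the extra reference. */
static void
demo_unpin(DemoSegment *seg)
{
    assert(seg->pinned);
    seg->pinned = false;
    demo_drop_ref(seg);
}

static void
demo_attach(DemoSegment *seg)
{
    assert(!seg->destroyed);
    seg->refcnt++;
}

static void
demo_detach(DemoSegment *seg)
{
    demo_drop_ref(seg);
}
```

In this model a segment that is never unpinned stays allocated after every backend has detached, while an unpinned one disappears with the last detach.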
lookup_ts_dictionary_cache() may now unpin the DSM mapping for all
invalid dictionary cache entries.
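A rough sketch of that sweep, assuming a flag like has_invalid_dictionary is set by the invalidation callback and checked on the next lookup. DemoDictEntry, DemoDictCache and both functions are hypothetical. The real code works on the backend's dictionary hash table and releases mappings through ts_dict_shmem_release().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_ENTRIES 4

/* Hypothetical per-backend cache entry for one dictionary. */
typedef struct
{
    unsigned dict_id;
    bool     isvalid;   /* cleared by the invalidation callback */
    bool     mapped;    /* backend holds a pinned DSM mapping */
} DemoDictEntry;

typedef struct
{
    DemoDictEntry entries[DEMO_MAX_ENTRIES];
    size_t        nentries;
    bool          has_invalid;   /* set when any entry was invalidated */
} DemoDictCache;

/* Stand-in for the syscache invalidation callback. */
static void
demo_invalidate_all(DemoDictCache *cache)
{
    size_t i;

    for (i = 0; i < cache->nentries; i++)
    {
        cache->entries[i].isvalid = false;
        cache->has_invalid = true;
    }
}

/*
 * Stand-in for the sweep done on the next cache lookup: drop the mapping of
 * every invalid entry. Returns the number of mappings released.
 */
static size_t
demo_release_invalid(DemoDictCache *cache)
{
    size_t i;
    size_t released = 0;

    assert(cache->nentries <= DEMO_MAX_ENTRIES);

    if (!cache->has_invalid)
        return 0;               /* cheap exit in the common case */

    for (i = 0; i < cache->nentries; i++)
    {
        DemoDictEntry *entry = &cache->entries[i];

        if (!entry->isvalid && entry->mapped)
        {
            entry->mapped = false;
            released++;
        }
    }
    cache->has_invalid = false;
    return released;
}
```

The flag keeps the common path cheap: the entries are scanned only when an invalidation has happened since the last sweep.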
I added xmax to DictPointerData. It is now used as part of the lookup
key too. It helps to reload a dictionary after a DROP command is rolled back.
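To illustrate why xmax matters in the key, here is a stand-alone sketch. It assumes that a rolled-back DROP leaves the aborted transaction's id in the tuple's raw xmax, so the same tuple then yields a different key than before the DROP. DemoDictKey, demo_key_cmp() and demo_key_hash() are hypothetical and only mirror the shape of tsearch_dict_cmp() and tsearch_dict_hash(). The mixing step is written from memory of hash_combine() and should be treated as illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flat version of the lookup key (database + DictPointerData). */
typedef struct
{
    uint32_t db_id;
    uint32_t dict_id;
    uint32_t xmin;
    uint32_t xmax;
    uint32_t tid_block;
    uint16_t tid_pos;
} DemoDictKey;

/* Follows the dshash comparator convention: 0 means "equal". */
static int
demo_key_cmp(const DemoDictKey *a, const DemoDictKey *b)
{
    if (a->db_id == b->db_id && a->dict_id == b->dict_id &&
        a->xmin == b->xmin && a->xmax == b->xmax &&
        a->tid_block == b->tid_block && a->tid_pos == b->tid_pos)
        return 0;
    return 1;
}

/* Illustrative mixing step, not guaranteed to match the server's. */
static uint32_t
demo_hash_combine(uint32_t a, uint32_t b)
{
    a ^= b + 0x9e3779b9u + (a << 6) + (a >> 2);
    return a;
}

/* Hash every field that takes part in the comparison, including xmax. */
static uint32_t
demo_key_hash(const DemoDictKey *k)
{
    uint32_t s = 0;

    s = demo_hash_combine(s, k->db_id);
    s = demo_hash_combine(s, k->dict_id);
    s = demo_hash_combine(s, k->xmin);
    s = demo_hash_combine(s, k->xmax);
    s = demo_hash_combine(s, k->tid_block);
    s = demo_hash_combine(s, k->tid_pos);
    return s;
}
```

With xmax in the key, a lookup made after the rollback does not match the entry that the DROP released, so the dictionary is built and stored again.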
There was a bug in ts_dict_shmem_location(); I fixed it.
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Attachments:
0001-Fix-ispell-memory-handling-v12.patch (text/plain; charset=us-ascii)
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index b9fdd77e19..e071994523 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -78,6 +78,8 @@
#define tmpalloc(sz) MemoryContextAlloc(Conf->buildCxt, (sz))
#define tmpalloc0(sz) MemoryContextAllocZero(Conf->buildCxt, (sz))
+#define tmpstrdup(str) MemoryContextStrdup(Conf->buildCxt, (str))
+
/*
* Prepare for constructing an ISpell dictionary.
*
@@ -498,7 +500,7 @@ NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
strcpy(Conf->Spell[Conf->nspell]->word, word);
Conf->Spell[Conf->nspell]->p.flag = (*flag != '\0')
- ? cpstrdup(Conf, flag) : VoidString;
+ ? tmpstrdup(flag) : VoidString;
Conf->nspell++;
}
@@ -1040,7 +1042,7 @@ setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
entry->flag.i = i;
}
else
- entry->flag.s = cpstrdup(Conf, s);
+ entry->flag.s = tmpstrdup(s);
entry->flagMode = Conf->flagMode;
entry->value = val;
@@ -1536,6 +1538,9 @@ nextline:
return;
isnewformat:
+ pfree(recoded);
+ pfree(pstr);
+
if (oldformat)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
0002-Change-tmplinit-argument-v12.patch (text/plain; charset=us-ascii)
diff --git a/contrib/dict_int/dict_int.c b/contrib/dict_int/dict_int.c
index 56ede37089..8dd4959028 100644
--- a/contrib/dict_int/dict_int.c
+++ b/contrib/dict_int/dict_int.c
@@ -30,7 +30,7 @@ PG_FUNCTION_INFO_V1(dintdict_lexize);
Datum
dintdict_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictInt *d;
ListCell *l;
@@ -38,7 +38,7 @@ dintdict_init(PG_FUNCTION_ARGS)
d->maxlen = 6;
d->rejectlong = false;
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/contrib/dict_xsyn/dict_xsyn.c b/contrib/dict_xsyn/dict_xsyn.c
index a79ece240c..0b8a32d459 100644
--- a/contrib/dict_xsyn/dict_xsyn.c
+++ b/contrib/dict_xsyn/dict_xsyn.c
@@ -140,7 +140,7 @@ read_dictionary(DictSyn *d, const char *filename)
Datum
dxsyn_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSyn *d;
ListCell *l;
char *filename = NULL;
@@ -153,7 +153,7 @@ dxsyn_init(PG_FUNCTION_ARGS)
d->matchsynonyms = false;
d->keepsynonyms = true;
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/contrib/unaccent/unaccent.c b/contrib/unaccent/unaccent.c
index 247c202755..2a2fbee5fa 100644
--- a/contrib/unaccent/unaccent.c
+++ b/contrib/unaccent/unaccent.c
@@ -267,12 +267,12 @@ PG_FUNCTION_INFO_V1(unaccent_init);
Datum
unaccent_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
TrieChar *rootTrie = NULL;
bool fileloaded = false;
ListCell *l;
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/commands/tsearchcmds.c b/src/backend/commands/tsearchcmds.c
index 3a843512d1..3753e32b2c 100644
--- a/src/backend/commands/tsearchcmds.c
+++ b/src/backend/commands/tsearchcmds.c
@@ -386,17 +386,25 @@ verify_dictoptions(Oid tmplId, List *dictoptions)
}
else
{
+ DictInitData init_data;
+
/*
* Copy the options just in case init method thinks it can scribble on
* them ...
*/
dictoptions = copyObject(dictoptions);
+ init_data.dict_options = dictoptions;
+ init_data.dict.id = InvalidOid;
+ init_data.dict.xmin = InvalidTransactionId;
+ init_data.dict.xmax = InvalidTransactionId;
+ ItemPointerSetInvalid(&init_data.dict.tid);
+
/*
* Call the init method and see if it complains. We don't worry about
* it leaking memory, since our command will soon be over anyway.
*/
- (void) OidFunctionCall1(initmethod, PointerGetDatum(dictoptions));
+ (void) OidFunctionCall1(initmethod, PointerGetDatum(&init_data));
}
ReleaseSysCache(tup);
diff --git a/src/backend/snowball/dict_snowball.c b/src/backend/snowball/dict_snowball.c
index 78c9f73ef0..15ebafd833 100644
--- a/src/backend/snowball/dict_snowball.c
+++ b/src/backend/snowball/dict_snowball.c
@@ -181,14 +181,14 @@ locate_stem_module(DictSnowball *d, const char *lang)
Datum
dsnowball_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSnowball *d;
bool stoploaded = false;
ListCell *l;
d = (DictSnowball *) palloc0(sizeof(DictSnowball));
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index edc6547700..39f1e6faeb 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -29,7 +29,7 @@ typedef struct
Datum
dispell_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictISpell *d;
bool affloaded = false,
dictloaded = false,
@@ -40,7 +40,7 @@ dispell_init(PG_FUNCTION_ARGS)
NIStartBuild(&(d->obj));
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_simple.c b/src/backend/tsearch/dict_simple.c
index ac6a24eba5..9605108334 100644
--- a/src/backend/tsearch/dict_simple.c
+++ b/src/backend/tsearch/dict_simple.c
@@ -29,7 +29,7 @@ typedef struct
Datum
dsimple_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSimple *d = (DictSimple *) palloc0(sizeof(DictSimple));
bool stoploaded = false,
acceptloaded = false;
@@ -37,7 +37,7 @@ dsimple_init(PG_FUNCTION_ARGS)
d->accept = true; /* default */
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_synonym.c b/src/backend/tsearch/dict_synonym.c
index c011886cb0..02989cd16b 100644
--- a/src/backend/tsearch/dict_synonym.c
+++ b/src/backend/tsearch/dict_synonym.c
@@ -91,7 +91,7 @@ compareSyn(const void *a, const void *b)
Datum
dsynonym_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSyn *d;
ListCell *l;
char *filename = NULL;
@@ -104,7 +104,7 @@ dsynonym_init(PG_FUNCTION_ARGS)
char *line = NULL;
uint16 flags = 0;
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_thesaurus.c b/src/backend/tsearch/dict_thesaurus.c
index 24364e646d..1604b5f60f 100644
--- a/src/backend/tsearch/dict_thesaurus.c
+++ b/src/backend/tsearch/dict_thesaurus.c
@@ -604,7 +604,7 @@ compileTheSubstitute(DictThesaurus *d)
Datum
thesaurus_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictThesaurus *d;
char *subdictname = NULL;
bool fileloaded = false;
@@ -612,7 +612,7 @@ thesaurus_init(PG_FUNCTION_ARGS)
d = (DictThesaurus *) palloc0(sizeof(DictThesaurus));
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/utils/cache/ts_cache.c b/src/backend/utils/cache/ts_cache.c
index 97347780d3..b2a0105ee8 100644
--- a/src/backend/utils/cache/ts_cache.c
+++ b/src/backend/utils/cache/ts_cache.c
@@ -39,6 +39,7 @@
#include "catalog/pg_ts_template.h"
#include "commands/defrem.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_public.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
#include "utils/fmgroids.h"
@@ -75,6 +76,7 @@ static TSConfigCacheEntry *lastUsedConfig = NULL;
char *TSCurrentConfig = NULL;
static Oid TSCurrentConfigCache = InvalidOid;
+static bool has_invalid_dictionary = false;
/*
@@ -98,7 +100,12 @@ InvalidateTSCacheCallBack(Datum arg, int cacheid, uint32 hashvalue)
hash_seq_init(&status, hash);
while ((entry = (TSAnyCacheEntry *) hash_seq_search(&status)) != NULL)
+ {
+ if (cacheid == TSDICTOID)
+ has_invalid_dictionary = true;
+
entry->isvalid = false;
+ }
/* Also invalidate the current-config cache if it's pg_ts_config */
if (hash == TSConfigCacheHash)
@@ -312,11 +319,15 @@ lookup_ts_dictionary_cache(Oid dictId)
MemSet(entry, 0, sizeof(TSDictionaryCacheEntry));
entry->dictId = dictId;
entry->dictCtx = saveCtx;
+ entry->dict_xmin = HeapTupleHeaderGetRawXmin(tpdict->t_data);
+ entry->dict_xmax = HeapTupleHeaderGetRawXmax(tpdict->t_data);
+ entry->dict_tid = tpdict->t_self;
entry->lexizeOid = template->tmpllexize;
if (OidIsValid(template->tmplinit))
{
+ DictInitData init_data;
List *dictoptions;
Datum opt;
bool isnull;
@@ -336,9 +347,15 @@ lookup_ts_dictionary_cache(Oid dictId)
else
dictoptions = deserialize_deflist(opt);
+ init_data.dict_options = dictoptions;
+ init_data.dict.id = dictId;
+ init_data.dict.xmin = entry->dict_xmin;
+ init_data.dict.xmax = entry->dict_xmax;
+ init_data.dict.tid = entry->dict_tid;
+
entry->dictData =
DatumGetPointer(OidFunctionCall1(template->tmplinit,
- PointerGetDatum(dictoptions)));
+ PointerGetDatum(&init_data)));
MemoryContextSwitchTo(oldcontext);
}
diff --git a/src/include/tsearch/ts_cache.h b/src/include/tsearch/ts_cache.h
index 410f1d54af..45ed570864 100644
--- a/src/include/tsearch/ts_cache.h
+++ b/src/include/tsearch/ts_cache.h
@@ -54,6 +54,10 @@ typedef struct TSDictionaryCacheEntry
Oid dictId;
bool isvalid;
+ TransactionId dict_xmin; /* XMIN of the dictionary's tuple */
+ TransactionId dict_xmax; /* XMAX of the dictionary's tuple */
+ ItemPointerData dict_tid; /* TID of the dictionary's tuple */
+
/* most frequent fmgr call */
Oid lexizeOid;
FmgrInfo lexize;
diff --git a/src/include/tsearch/ts_public.h b/src/include/tsearch/ts_public.h
index 0b7a5aa68e..363226c936 100644
--- a/src/include/tsearch/ts_public.h
+++ b/src/include/tsearch/ts_public.h
@@ -13,6 +13,8 @@
#ifndef _PG_TS_PUBLIC_H_
#define _PG_TS_PUBLIC_H_
+#include "nodes/pg_list.h"
+#include "storage/itemptr.h"
#include "tsearch/ts_type.h"
/*
@@ -81,10 +83,70 @@ extern void readstoplist(const char *fname, StopList *s,
extern bool searchstoplist(StopList *s, char *key);
/*
- * Interface with dictionaries
+ * API for text search dictionaries.
+ *
+ * API functions to handle a text search dictionary are defined by a text search
+ * template. Currently an existing template cannot be altered to define
+ * other functions. The API consists of the following functions:
+ * - init function - optional function which initializes internal structures of
+ * the dictionary. It accepts DictInitData structure as an argument and must
+ * return a custom palloc'd structure which stores content of the processed
+ * dictionary and is used in lexize function.
+ * - lexize function - normalizes a single word (token) using specific
+ * dictionary. It must return a palloc'd array of TSLexeme the last entry of
+ * which is the terminating entry and accepts the following arguments:
+ * - dictData - pointer to a custom structure returned by init function or
+ * NULL if init function wasn't defined by the template.
+ * - token - string which represents a token to normalize, isn't
+ * null-terminated.
+ * - length - length of token.
+ * - dictState - pointer to a DictSubState structure which stores the current
+ * state of processing a set of tokens and allows phrases to be normalized.
+ */
+
+/*
+ * A preprocessed dictionary can be stored in shared memory using DSM. The
+ * dictionary's init function decides whether to do so. A DSM segment is
+ * released if the dictionary was altered or dropped. There is still one case
+ * in which we cannot prevent a segment from leaking: if some backend used the
+ * dictionary before it was dropped, that backend holds its DSM segment until
+ * it disconnects or calls lookup_ts_dictionary_cache(), where the invalid
+ * segment is unpinned.
+ *
+ * DictPointerData is a structure to search a dictionary's DSM segment. We
+ * need xmin, xmax and tid to be sure that the content in the DSM segment is
+ * still valid.
+ */
+typedef struct
+{
+ Oid id; /* OID of dictionary which is processed */
+ TransactionId xmin; /* XMIN of the dictionary's tuple */
+ TransactionId xmax; /* XMAX of the dictionary's tuple */
+ ItemPointerData tid; /* TID of the dictionary's tuple */
+} DictPointerData;
+
+/*
+ * API structure for a dictionary initialization. It is passed as an argument
+ * to a template's init function.
*/
+typedef struct
+{
+ /*
+ * A dictionary option list for a template's init method. Should go first
+ * for backward compatibility.
+ */
+ List *dict_options;
+ /*
+ * Dictionary information used to allocate, search for and release its DSM
+ * segment.
+ */
+ DictPointerData dict;
+} DictInitData;
-/* return struct for any lexize function */
+/*
+ * Return struct for any lexize function. They are combined into an array, the
+ * last entry is the terminating entry.
+ */
typedef struct
{
/*----------
@@ -108,7 +170,8 @@ typedef struct
uint16 flags; /* See flag bits below */
- char *lexeme; /* C string */
+ char *lexeme; /* C string or NULL if it is a terminating
+ * entry */
} TSLexeme;
/* Flag bits that can appear in TSLexeme.flags */
0003-Retreive-shared-location-for-dict-v12.patch (text/plain; charset=us-ascii)
diff --git a/src/backend/commands/tsearchcmds.c b/src/backend/commands/tsearchcmds.c
index 3753e32b2c..ef6cabcc1e 100644
--- a/src/backend/commands/tsearchcmds.c
+++ b/src/backend/commands/tsearchcmds.c
@@ -39,6 +39,7 @@
#include "nodes/makefuncs.h"
#include "parser/parse_func.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_shared.h"
#include "tsearch/ts_utils.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
@@ -510,6 +511,7 @@ RemoveTSDictionaryById(Oid dictId)
{
Relation relation;
HeapTuple tup;
+ DictPointerData dict;
relation = heap_open(TSDictionaryRelationId, RowExclusiveLock);
@@ -521,6 +523,18 @@ RemoveTSDictionaryById(Oid dictId)
CatalogTupleDelete(relation, &tup->t_self);
+ /*
+ * We need to release the dictionary's DSM segment. The segment may still
+ * leak: if some backend used the dictionary before it was dropped, that
+ * backend holds its DSM segment until it disconnects or calls
+ * lookup_ts_dictionary_cache().
+ */
+ dict.id = dictId;
+ dict.xmin = HeapTupleHeaderGetRawXmin(tup->t_data);
+ dict.xmax = HeapTupleHeaderGetRawXmax(tup->t_data);
+ dict.tid = tup->t_self;
+ ts_dict_shmem_release(&dict, true);
+
ReleaseSysCache(tup);
heap_close(relation, RowExclusiveLock);
@@ -544,6 +558,7 @@ AlterTSDictionary(AlterTSDictionaryStmt *stmt)
bool repl_null[Natts_pg_ts_dict];
bool repl_repl[Natts_pg_ts_dict];
ObjectAddress address;
+ DictPointerData dict;
dictId = get_ts_dict_oid(stmt->dictname, false);
@@ -630,6 +645,18 @@ AlterTSDictionary(AlterTSDictionaryStmt *stmt)
ObjectAddressSet(address, TSDictionaryRelationId, dictId);
+ /*
+ * We need to release the dictionary's DSM segment, which isn't valid
+ * anymore. The segment may still leak: if some backend used the dictionary
+ * before it was altered, that backend holds its DSM segment until it
+ * disconnects or calls lookup_ts_dictionary_cache().
+ */
+ dict.id = dictId;
+ dict.xmin = HeapTupleHeaderGetRawXmin(tup->t_data);
+ dict.xmax = HeapTupleHeaderGetRawXmax(tup->t_data);
+ dict.tid = tup->t_self;
+ ts_dict_shmem_release(&dict, true);
+
/*
* NOTE: because we only support altering the options, not the template,
* there is no need to update dependencies. This might have to change if
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 0c86a581c0..c7dce8cac5 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -44,6 +44,7 @@
#include "storage/procsignal.h"
#include "storage/sinvaladt.h"
#include "storage/spin.h"
+#include "tsearch/ts_shared.h"
#include "utils/backend_random.h"
#include "utils/snapmgr.h"
@@ -150,6 +151,7 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
size = add_size(size, SyncScanShmemSize());
size = add_size(size, AsyncShmemSize());
size = add_size(size, BackendRandomShmemSize());
+ size = add_size(size, TsearchShmemSize());
#ifdef EXEC_BACKEND
size = add_size(size, ShmemBackendArraySize());
#endif
@@ -271,6 +273,11 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
AsyncShmemInit();
BackendRandomShmemInit();
+ /*
+ * Set up shared memory for tsearch
+ */
+ TsearchShmemInit();
+
#ifdef EXEC_BACKEND
/*
diff --git a/src/backend/tsearch/Makefile b/src/backend/tsearch/Makefile
index 227468ae9e..860cd196e9 100644
--- a/src/backend/tsearch/Makefile
+++ b/src/backend/tsearch/Makefile
@@ -26,7 +26,7 @@ DICTFILES_PATH=$(addprefix dicts/,$(DICTFILES))
OBJS = ts_locale.o ts_parse.o wparser.o wparser_def.o dict.o \
dict_simple.o dict_synonym.o dict_thesaurus.o \
dict_ispell.o regis.o spell.o \
- to_tsany.o ts_selfuncs.o ts_typanalyze.o ts_utils.o
+ to_tsany.o ts_selfuncs.o ts_shared.o ts_typanalyze.o ts_utils.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/tsearch/ts_shared.c b/src/backend/tsearch/ts_shared.c
new file mode 100644
index 0000000000..590e93df8e
--- /dev/null
+++ b/src/backend/tsearch/ts_shared.c
@@ -0,0 +1,384 @@
+/*-------------------------------------------------------------------------
+ *
+ * ts_shared.c
+ * tsearch shared memory management
+ *
+ * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/tsearch/ts_shared.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/hash.h"
+#include "lib/dshash.h"
+#include "miscadmin.h"
+#include "storage/lwlock.h"
+#include "storage/shmem.h"
+#include "tsearch/ts_shared.h"
+#include "utils/hashutils.h"
+#include "utils/memutils.h"
+
+
+/*
+ * Hash table entries key.
+ */
+typedef struct
+{
+ Oid db_id;
+ DictPointerData dict;
+} TsearchDictKey;
+
+/*
+ * Hash table entries representing shared dictionaries.
+ */
+typedef struct
+{
+ TsearchDictKey key;
+ dsm_handle dict_dsm;
+
+ /* How many backends have DSM mapping */
+ uint32 refcnt;
+} TsearchDictEntry;
+
+static dshash_table *dict_table = NULL;
+
+/*
+ * Information about the main shmem segment, used to coordinate
+ * access to the hash table and dictionaries.
+ */
+typedef struct
+{
+ dsa_handle area;
+ dshash_table_handle dict_table_handle;
+
+ LWLock lock;
+} TsearchCtlData;
+
+static TsearchCtlData *tsearch_ctl;
+
+static int tsearch_dict_cmp(const void *a, const void *b, size_t size,
+ void *arg);
+static uint32 tsearch_dict_hash(const void *a, size_t size, void *arg);
+static void init_dict_table(void);
+
+/* Parameters for dict_table */
+static const dshash_parameters dict_table_params ={
+ sizeof(TsearchDictKey),
+ sizeof(TsearchDictEntry),
+ tsearch_dict_cmp,
+ tsearch_dict_hash,
+ LWTRANCHE_TSEARCH_TABLE
+};
+
+/*
+ * Build the dictionary using allocate_cb callback.
+ *
+ * First try to find the dictionary in the shared hash table. If it was already
+ * built by someone else, just return its location in DSM.
+ *
+ * init_data: an argument used within a template's init method.
+ * allocate_cb: function to build the dictionary, if it wasn't found in DSM.
+ *
+ * Returns address in the dynamic shared memory segment or in backend memory.
+ */
+void *
+ts_dict_shmem_location(DictInitData *init_data,
+ ts_dict_build_callback allocate_cb)
+{
+ TsearchDictKey key;
+ TsearchDictEntry *entry;
+ bool found;
+ dsm_segment *seg;
+ void *dict,
+ *dict_location;
+ Size dict_size;
+
+ init_dict_table();
+
+ /*
+ * Build the dictionary in backend's memory if dictid is invalid (it may
+ * happen if the dictionary's init method was called within
+ * verify_dictoptions()).
+ */
+ if (!OidIsValid(init_data->dict.id))
+ {
+ dict = allocate_cb(init_data->dict_options, &dict_size);
+
+ return dict;
+ }
+
+ /* Set up key for hashtable search */
+ key.db_id = MyDatabaseId;
+ key.dict = init_data->dict;
+
+ /* Try to find an entry in the hash table */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, &key, false);
+
+ if (entry)
+ {
+ seg = dsm_find_mapping(entry->dict_dsm);
+ if (!seg)
+ {
+ seg = dsm_attach(entry->dict_dsm);
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+ }
+
+ entry->refcnt++;
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+ }
+
+ /* Dictionary hasn't been loaded into memory yet */
+ entry = (TsearchDictEntry *) dshash_find_or_insert(dict_table, &key,
+ &found);
+
+ if (found)
+ {
+ /*
+ * Someone concurrently inserted a dictionary entry since the first time
+ * we checked.
+ */
+ seg = dsm_attach(entry->dict_dsm);
+
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+
+ entry->refcnt++;
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+ }
+
+ /* Build the dictionary */
+ dict = allocate_cb(init_data->dict_options, &dict_size);
+
+ /* Finally, allocate a DSM segment for the compiled dictionary */
+ seg = dsm_create(dict_size, 0);
+ dict_location = dsm_segment_address(seg);
+ memcpy(dict_location, dict, dict_size);
+
+ pfree(dict);
+
+ entry->key = key;
+ entry->dict_dsm = dsm_segment_handle(seg);
+ entry->refcnt = 1;
+
+ /* Remain attached until end of postmaster */
+ dsm_pin_segment(seg);
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+}
+
+/*
+ * Release memory occupied by the dictionary. Function just unpins DSM mapping.
+ * If nobody else has a mapping to this DSM segment, or the dictionary was
+ * dropped or altered, then unpin the DSM segment.
+ *
+ * dict: key to search the dictionary's DSM segment.
+ * unpin_segment: true if we need to unpin the segment because the dictionary
+ * was dropped or altered.
+ */
+void
+ts_dict_shmem_release(DictPointerData *dict, bool unpin_segment)
+{
+ TsearchDictKey key;
+ TsearchDictEntry *entry;
+
+ /*
+ * If we didn't attach to a hash table then do nothing.
+ */
+ if (!dict_table && !unpin_segment)
+ return;
+ /*
+ * But if we need to unpin the DSM segment to get rid of the segment when
+ * the last interested process disconnects we need the hash table to find
+ * the dictionary's entry.
+ */
+ else if (unpin_segment)
+ init_dict_table();
+
+ /* Set up key for hashtable search */
+ key.db_id = MyDatabaseId;
+ key.dict = *dict;
+
+ /* Try to find an entry in the hash table */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, &key, true);
+
+ if (entry)
+ {
+ dsm_segment *seg;
+
+ seg = dsm_find_mapping(entry->dict_dsm);
+
+ if (seg)
+ {
+ dsm_unpin_mapping(seg);
+ dsm_detach(seg);
+
+ entry->refcnt--;
+ }
+
+ if (unpin_segment)
+ dsm_unpin_segment(entry->dict_dsm);
+
+ if (entry->refcnt == 0)
+ {
+ /*
+ * Delete the entry iff there is no process which pinned mapping to
+ * the DSM segment.
+ */
+ dshash_delete_entry(dict_table, entry);
+ }
+ else
+ dshash_release_lock(dict_table, entry);
+ }
+}
+
+/*
+ * Allocate and initialize tsearch-related shared memory.
+ */
+void
+TsearchShmemInit(void)
+{
+ bool found;
+
+ tsearch_ctl = (TsearchCtlData *)
+ ShmemInitStruct("Full Text Search Ctl", sizeof(TsearchCtlData), &found);
+
+ if (!found)
+ {
+ LWLockRegisterTranche(LWTRANCHE_TSEARCH_DSA, "tsearch_dsa");
+ LWLockRegisterTranche(LWTRANCHE_TSEARCH_TABLE, "tsearch_table");
+
+ LWLockInitialize(&tsearch_ctl->lock, LWTRANCHE_TSEARCH_DSA);
+
+ tsearch_ctl->area = DSM_HANDLE_INVALID;
+ tsearch_ctl->dict_table_handle = InvalidDsaPointer;
+ }
+}
+
+/*
+ * Report shared memory space needed by TsearchShmemInit.
+ */
+Size
+TsearchShmemSize(void)
+{
+ Size size = 0;
+
+ /* size of service structure */
+ size = add_size(size, MAXALIGN(sizeof(TsearchCtlData)));
+
+ return size;
+}
+
+/*
+ * A comparator function for TsearchDictKey.
+ *
+ * Returns 0 if keys are equal, 1 otherwise.
+ */
+static int
+tsearch_dict_cmp(const void *a, const void *b, size_t size, void *arg)
+{
+ TsearchDictKey *k1 = (TsearchDictKey *) a;
+ TsearchDictKey *k2 = (TsearchDictKey *) b;
+
+ if (k1->db_id == k2->db_id && k1->dict.id == k2->dict.id &&
+ k1->dict.xmin == k2->dict.xmin && k1->dict.xmax == k2->dict.xmax &&
+ ItemPointerEquals(&k1->dict.tid, &k2->dict.tid))
+ return 0;
+ else
+ return 1;
+}
+
+/*
+ * A hash function for TsearchDictKey.
+ */
+static uint32
+tsearch_dict_hash(const void *a, size_t size, void *arg)
+{
+ TsearchDictKey *k = (TsearchDictKey *) a;
+ uint32 s;
+
+ s = hash_combine(0, hash_uint32(k->db_id));
+ s = hash_combine(s, hash_uint32(k->dict.id));
+ s = hash_combine(s, hash_uint32(k->dict.xmin));
+ s = hash_combine(s, hash_uint32(k->dict.xmax));
+ s = hash_combine(s,
+ hash_uint32(BlockIdGetBlockNumber(&k->dict.tid.ip_blkid)));
+ s = hash_combine(s, hash_uint32(k->dict.tid.ip_posid));
+
+ return s;
+}
+
+/*
+ * Initialize hash table located in DSM.
+ *
+ * The hash table should be created and initialized if it doesn't exist yet.
+ */
+static void
+init_dict_table(void)
+{
+ MemoryContext old_context;
+ dsa_area *dsa;
+
+ /* Exit if the hash table was already initialized */
+ if (dict_table)
+ return;
+
+ old_context = MemoryContextSwitchTo(TopMemoryContext);
+
+recheck_table:
+ LWLockAcquire(&tsearch_ctl->lock, LW_SHARED);
+
+ /* The hash table has already been created by another backend */
+ if (DsaPointerIsValid(tsearch_ctl->dict_table_handle))
+ {
+ Assert(tsearch_ctl->area != DSM_HANDLE_INVALID);
+
+ dsa = dsa_attach(tsearch_ctl->area);
+
+ dict_table = dshash_attach(dsa,
+ &dict_table_params,
+ tsearch_ctl->dict_table_handle,
+ NULL);
+ }
+ else
+ {
+ /* Try to get exclusive lock */
+ LWLockRelease(&tsearch_ctl->lock);
+ if (!LWLockAcquireOrWait(&tsearch_ctl->lock, LW_EXCLUSIVE))
+ {
+ /*
+ * We waited until another backend released the lock, and that
+ * backend has concurrently created the hash table already.
+ */
+ goto recheck_table;
+ }
+
+ dsa = dsa_create(LWTRANCHE_TSEARCH_DSA);
+ tsearch_ctl->area = dsa_get_handle(dsa);
+
+ dict_table = dshash_create(dsa, &dict_table_params, NULL);
+ tsearch_ctl->dict_table_handle = dshash_get_hash_table_handle(dict_table);
+
+ /* Remain attached until end of postmaster */
+ dsa_pin(dsa);
+ }
+
+ LWLockRelease(&tsearch_ctl->lock);
+
+ /* Remain attached until end of session */
+ dsa_pin_mapping(dsa);
+
+ MemoryContextSwitchTo(old_context);
+}
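The comparator and hash function above follow a memcmp-style contract: the comparator returns 0 for equal keys, and equal keys must hash to the same value. A minimal standalone sketch of that contract follows. `DemoDictKey`, `demo_key_cmp`, `demo_key_hash` and `demo_hash_combine` are hypothetical names used only for illustration; the combiner is modeled on the shape of PostgreSQL's `hash_combine()`, and unlike the patch the sketch does not pre-hash each field with `hash_uint32()`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for TsearchDictKey; field names are illustrative. */
typedef struct DemoDictKey
{
	uint32_t	db_id;
	uint32_t	dict_id;
	uint32_t	xmin;
	uint32_t	xmax;
} DemoDictKey;

/* Mix one more value into a running hash (assumed Boost-style combiner). */
static uint32_t
demo_hash_combine(uint32_t a, uint32_t b)
{
	a ^= b + 0x9e3779b9 + (a << 6) + (a >> 2);
	return a;
}

/* memcmp-style contract: 0 means "equal", nonzero means "different". */
static int
demo_key_cmp(const DemoDictKey *k1, const DemoDictKey *k2)
{
	if (k1->db_id == k2->db_id && k1->dict_id == k2->dict_id &&
		k1->xmin == k2->xmin && k1->xmax == k2->xmax)
		return 0;
	return 1;
}

/* Hash every field that the comparator looks at, and nothing else. */
static uint32_t
demo_key_hash(const DemoDictKey *k)
{
	uint32_t	s = 0;

	s = demo_hash_combine(s, k->db_id);
	s = demo_hash_combine(s, k->dict_id);
	s = demo_hash_combine(s, k->xmin);
	s = demo_hash_combine(s, k->xmax);
	return s;
}
```

Hashing exactly the fields that the comparator inspects keeps the two functions consistent: two keys that compare equal can never land in different buckets.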
diff --git a/src/backend/utils/cache/ts_cache.c b/src/backend/utils/cache/ts_cache.c
index b2a0105ee8..57bc2bea5b 100644
--- a/src/backend/utils/cache/ts_cache.c
+++ b/src/backend/utils/cache/ts_cache.c
@@ -40,6 +40,7 @@
#include "commands/defrem.h"
#include "tsearch/ts_cache.h"
#include "tsearch/ts_public.h"
+#include "tsearch/ts_shared.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
#include "utils/fmgroids.h"
@@ -112,6 +113,36 @@ InvalidateTSCacheCallBack(Datum arg, int cacheid, uint32 hashvalue)
TSCurrentConfigCache = InvalidOid;
}
+/*
+ * Unpin shared segments of all invalid dictionary entries.
+ */
+static void
+flush_ts_dictionary_content(void)
+{
+ HASH_SEQ_STATUS status;
+ TSDictionaryCacheEntry *entry;
+
+ if (!has_invalid_dictionary)
+ return;
+
+ hash_seq_init(&status, TSDictionaryCacheHash);
+ while ((entry = (TSDictionaryCacheEntry *) hash_seq_search(&status)) != NULL)
+ {
+ if (!entry->isvalid)
+ {
+ DictPointerData dict_ptr;
+
+ dict_ptr.id = entry->dictId;
+ dict_ptr.xmin = entry->dict_xmin;
+ dict_ptr.xmax = entry->dict_xmax;
+ dict_ptr.tid = entry->dict_tid;
+ ts_dict_shmem_release(&dict_ptr, false);
+ }
+ }
+
+ has_invalid_dictionary = false;
+}
+
/*
* Fetch parser cache entry
*/
@@ -260,6 +291,13 @@ lookup_ts_dictionary_cache(Oid dictId)
Form_pg_ts_template template;
MemoryContext saveCtx;
+ /*
+ * Some invalid entries may still hold a DSM mapping, which must be
+ * unpinned to avoid leaking memory. Unpin the segments of all invalid
+ * dictionaries before doing the lookup.
+ */
+ flush_ts_dictionary_content();
+
tpdict = SearchSysCache1(TSDICTOID, ObjectIdGetDatum(dictId));
if (!HeapTupleIsValid(tpdict))
elog(ERROR, "cache lookup failed for text search dictionary %u",
diff --git a/src/include/storage/lwlock.h b/src/include/storage/lwlock.h
index c21bfe2f66..16b0858eda 100644
--- a/src/include/storage/lwlock.h
+++ b/src/include/storage/lwlock.h
@@ -219,6 +219,8 @@ typedef enum BuiltinTrancheIds
LWTRANCHE_SHARED_TUPLESTORE,
LWTRANCHE_TBM,
LWTRANCHE_PARALLEL_APPEND,
+ LWTRANCHE_TSEARCH_DSA,
+ LWTRANCHE_TSEARCH_TABLE,
LWTRANCHE_FIRST_USER_DEFINED
} BuiltinTrancheIds;
diff --git a/src/include/tsearch/ts_shared.h b/src/include/tsearch/ts_shared.h
new file mode 100644
index 0000000000..b6d00bdc9e
--- /dev/null
+++ b/src/include/tsearch/ts_shared.h
@@ -0,0 +1,26 @@
+/*-------------------------------------------------------------------------
+ *
+ * ts_shared.h
+ * tsearch shared memory management
+ *
+ * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
+ *
+ * src/include/tsearch/ts_shared.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef TS_SHARED_H
+#define TS_SHARED_H
+
+#include "tsearch/ts_public.h"
+
+typedef void *(*ts_dict_build_callback) (List *dictoptions, Size *size);
+
+extern void *ts_dict_shmem_location(DictInitData *init_data,
+ ts_dict_build_callback allocate_cb);
+extern void ts_dict_shmem_release(DictPointerData *dict, bool unpin_segment);
+
+extern void TsearchShmemInit(void);
+extern Size TsearchShmemSize(void);
+
+#endif /* TS_SHARED_H */
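The header above pairs a build callback with a location/release API. A minimal single-process sketch of that contract follows, under the assumption that the callback is invoked only when no cached copy exists and that the copy is dropped once the last user releases it. `DemoSlot`, `demo_location`, `demo_release` and `demo_build` are hypothetical names; the real functions work on DSM segments and a dshash table rather than on `malloc()`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical analogue of ts_dict_build_callback. */
typedef void *(*demo_build_callback) (const char *options, size_t *size);

/* One cached dictionary image plus the number of current users. */
typedef struct DemoSlot
{
	void	   *data;		/* NULL until the first build */
	size_t		size;
	int			refcnt;
} DemoSlot;

static int	demo_build_calls = 0;

/* Expensive build step; here it just copies the options string. */
static void *
demo_build(const char *options, size_t *size)
{
	size_t		len = strlen(options) + 1;
	char	   *buf = malloc(len);

	if (buf == NULL)
		abort();
	memcpy(buf, options, len);
	*size = len;
	demo_build_calls++;
	return buf;
}

/* Return the cached image, building it only on first use. */
static void *
demo_location(DemoSlot *slot, const char *options, demo_build_callback cb)
{
	if (slot->data == NULL)
		slot->data = cb(options, &slot->size);
	slot->refcnt++;
	return slot->data;
}

/* Drop one reference; free the image when nobody uses it any more. */
static void
demo_release(DemoSlot *slot)
{
	assert(slot->refcnt > 0);
	if (--slot->refcnt == 0)
	{
		free(slot->data);
		slot->data = NULL;
		slot->size = 0;
	}
}
```

Calling `demo_location()` twice on the same slot runs `demo_build()` once and returns the same address both times; the image is freed only after the second `demo_release()`.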
0004-Store-ispell-in-shared-location-v12.patch
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index 610b7bf033..c6b2e6cf7e 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -3030,6 +3030,21 @@ CREATE TEXT SEARCH DICTIONARY english_stem (
</sect2>
+ <sect2 id="textsearch-shared-dictionaries">
+ <title>Dictionaries in Shared Memory</title>
+
+ <para>
+ Some dictionaries, especially <application>Ispell</application>, consume
+ a significant amount of memory, in some cases tens of megabytes. Most of
+ them store the data in text files, and building the in-memory structure is
+ both CPU-intensive and time-consuming. Instead of doing this in each backend when
+ it needs a dictionary for the first time, the compiled dictionary may be
+ stored in shared memory so that it may be reused by other backends.
+ Currently only <application>Ispell</application> supports loading into
+ shared memory.
+ </para>
+ </sect2>
+
</sect1>
<sect1 id="textsearch-configuration">
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index 39f1e6faeb..ced52d2790 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -5,6 +5,15 @@
*
* Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
*
+ * By default all Ispell dictionaries are stored in DSM. But if the amount
+ * of memory exceeds max_shared_dictionaries_size, then the dictionary will be
+ * allocated in private backend memory (in dictCtx context).
+ *
+ * All necessary data are built within the dispell_build() function. But
+ * structures for regular expressions are compiled on first demand and
+ * stored in the AffixReg array, because regex_t and Regis cannot easily be
+ * stored in shared memory.
+ *
*
* IDENTIFICATION
* src/backend/tsearch/dict_ispell.c
@@ -14,8 +23,10 @@
#include "postgres.h"
#include "commands/defrem.h"
+#include "storage/dsm.h"
#include "tsearch/dicts/spell.h"
#include "tsearch/ts_locale.h"
+#include "tsearch/ts_shared.h"
#include "tsearch/ts_utils.h"
#include "utils/builtins.h"
@@ -26,54 +37,126 @@ typedef struct
IspellDict obj;
} DictISpell;
+static void parse_dictoptions(List *dictoptions,
+ char **dictfile, char **afffile, char **stopfile);
+static void *dispell_build(List *dictoptions, Size *size);
+
Datum
dispell_init(PG_FUNCTION_ARGS)
{
DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictISpell *d;
- bool affloaded = false,
- dictloaded = false,
- stoploaded = false;
- ListCell *l;
+ void *dict_location;
+ char *stopfile;
d = (DictISpell *) palloc0(sizeof(DictISpell));
- NIStartBuild(&(d->obj));
+ parse_dictoptions(init_data->dict_options, NULL, NULL, &stopfile);
+
+ if (stopfile)
+ readstoplist(stopfile, &(d->stoplist), lowerstr);
+
+ dict_location = ts_dict_shmem_location(init_data, dispell_build);
+ Assert(dict_location);
+
+ d->obj.dict = (IspellDictData *) dict_location;
+ d->obj.reg = (AffixReg *) palloc0(d->obj.dict->nAffix *
+ sizeof(AffixReg));
+ /* Current memory context is dictionary's private memory context */
+ d->obj.dictCtx = CurrentMemoryContext;
+
+ PG_RETURN_POINTER(d);
+}
+
+Datum
+dispell_lexize(PG_FUNCTION_ARGS)
+{
+ DictISpell *d = (DictISpell *) PG_GETARG_POINTER(0);
+ char *in = (char *) PG_GETARG_POINTER(1);
+ int32 len = PG_GETARG_INT32(2);
+ char *txt;
+ TSLexeme *res;
+ TSLexeme *ptr,
+ *cptr;
+
+ if (len <= 0)
+ PG_RETURN_POINTER(NULL);
+
+ txt = lowerstr_with_len(in, len);
+ res = NINormalizeWord(&(d->obj), txt);
+
+ if (res == NULL)
+ PG_RETURN_POINTER(NULL);
+
+ cptr = res;
+ for (ptr = cptr; ptr->lexeme; ptr++)
+ {
+ if (searchstoplist(&(d->stoplist), ptr->lexeme))
+ {
+ pfree(ptr->lexeme);
+ ptr->lexeme = NULL;
+ }
+ else
+ {
+ if (cptr != ptr)
+ memcpy(cptr, ptr, sizeof(TSLexeme));
+ cptr++;
+ }
+ }
+ cptr->lexeme = NULL;
+
+ PG_RETURN_POINTER(res);
+}
+
+static void
+parse_dictoptions(List *dictoptions, char **dictfile, char **afffile,
+ char **stopfile)
+{
+ ListCell *l;
- foreach(l, init_data->dict_options)
+ if (dictfile)
+ *dictfile = NULL;
+ if (afffile)
+ *afffile = NULL;
+ if (stopfile)
+ *stopfile = NULL;
+
+ foreach(l, dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
if (strcmp(defel->defname, "dictfile") == 0)
{
- if (dictloaded)
+ if (!dictfile)
+ continue;
+
+ if (*dictfile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple DictFile parameters")));
- NIImportDictionary(&(d->obj),
- get_tsearch_config_filename(defGetString(defel),
- "dict"));
- dictloaded = true;
+ *dictfile = get_tsearch_config_filename(defGetString(defel), "dict");
}
else if (strcmp(defel->defname, "afffile") == 0)
{
- if (affloaded)
+ if (!afffile)
+ continue;
+
+ if (*afffile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple AffFile parameters")));
- NIImportAffixes(&(d->obj),
- get_tsearch_config_filename(defGetString(defel),
- "affix"));
- affloaded = true;
+ *afffile = get_tsearch_config_filename(defGetString(defel), "affix");
}
else if (strcmp(defel->defname, "stopwords") == 0)
{
- if (stoploaded)
+ if (!stopfile)
+ continue;
+
+ if (*stopfile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple StopWords parameters")));
- readstoplist(defGetString(defel), &(d->stoplist), lowerstr);
- stoploaded = true;
+ *stopfile = defGetString(defel);
}
else
{
@@ -83,66 +166,52 @@ dispell_init(PG_FUNCTION_ARGS)
defel->defname)));
}
}
+}
- if (affloaded && dictloaded)
- {
- NISortDictionary(&(d->obj));
- NISortAffixes(&(d->obj));
- }
- else if (!affloaded)
+/*
+ * Build the dictionary.
+ *
+ * Result is palloc'ed.
+ */
+static void *
+dispell_build(List *dictoptions, Size *size)
+{
+ IspellDictBuild build;
+ char *dictfile,
+ *afffile;
+
+ parse_dictoptions(dictoptions, &dictfile, &afffile, NULL);
+
+ if (!afffile)
{
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("missing AffFile parameter")));
}
- else
+ else if (!dictfile)
{
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("missing DictFile parameter")));
}
- NIFinishBuild(&(d->obj));
+ MemSet(&build, 0, sizeof(build));
+ NIStartBuild(&build);
- PG_RETURN_POINTER(d);
-}
+ /* Read files */
+ NIImportDictionary(&build, dictfile);
+ NIImportAffixes(&build, afffile);
-Datum
-dispell_lexize(PG_FUNCTION_ARGS)
-{
- DictISpell *d = (DictISpell *) PG_GETARG_POINTER(0);
- char *in = (char *) PG_GETARG_POINTER(1);
- int32 len = PG_GETARG_INT32(2);
- char *txt;
- TSLexeme *res;
- TSLexeme *ptr,
- *cptr;
+ /* Build persistent data to be used by backends */
+ NISortDictionary(&build);
+ NISortAffixes(&build);
- if (len <= 0)
- PG_RETURN_POINTER(NULL);
+ NICopyData(&build);
- txt = lowerstr_with_len(in, len);
- res = NINormalizeWord(&(d->obj), txt);
-
- if (res == NULL)
- PG_RETURN_POINTER(NULL);
+ /* Release temporary data */
+ NIFinishBuild(&build);
- cptr = res;
- for (ptr = cptr; ptr->lexeme; ptr++)
- {
- if (searchstoplist(&(d->stoplist), ptr->lexeme))
- {
- pfree(ptr->lexeme);
- ptr->lexeme = NULL;
- }
- else
- {
- if (cptr != ptr)
- memcpy(cptr, ptr, sizeof(TSLexeme));
- cptr++;
- }
- }
- cptr->lexeme = NULL;
-
- PG_RETURN_POINTER(res);
+ /* Return the buffer and its size */
+ *size = build.dict_size;
+ return build.dict;
}
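dispell_build() returns a single buffer together with its size, so the dictionary image can be copied into a DSM segment as one block. That only works if the image contains offsets instead of pointers. A minimal sketch of such a layout follows, assuming an offsets array followed by the string bytes, similar in spirit to AffixDataOffset and AffixData. `DemoDict`, `demo_pack`, `demo_offsets` and `demo_get` are hypothetical names, and the patch's IspellDictData has more sections than shown here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical pointer-free image: header, offsets array, string bytes. */
typedef struct DemoDict
{
	uint32_t	n;				/* number of strings */
	uint32_t	data_start;		/* where string bytes begin in payload */
	char		payload[];		/* uint32 offsets[n], then the strings */
} DemoDict;

static uint32_t *
demo_offsets(DemoDict *d)
{
	return (uint32_t *) d->payload;
}

/* Resolve string i through its offset; valid at any mapping address. */
static char *
demo_get(DemoDict *d, uint32_t i)
{
	return d->payload + d->data_start + demo_offsets(d)[i];
}

/* Pack n strings into one contiguous buffer and report its size. */
static DemoDict *
demo_pack(const char **strs, uint32_t n, size_t *size)
{
	size_t		bytes = 0;
	uint32_t	i;
	uint32_t	off = 0;
	DemoDict   *d;

	for (i = 0; i < n; i++)
		bytes += strlen(strs[i]) + 1;

	*size = sizeof(DemoDict) + n * sizeof(uint32_t) + bytes;
	d = malloc(*size);
	if (d == NULL)
		abort();

	d->n = n;
	d->data_start = n * (uint32_t) sizeof(uint32_t);
	for (i = 0; i < n; i++)
	{
		size_t		len = strlen(strs[i]) + 1;

		demo_offsets(d)[i] = off;
		memcpy(d->payload + d->data_start + off, strs[i], len);
		off += (uint32_t) len;
	}
	return d;
}
```

Because every reference is relative to the start of the buffer, a plain `memcpy()` of `size` bytes to another address yields a fully usable copy, which is the property a backend needs when it maps the segment at an address of its own.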
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index e071994523..1c560ef56a 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -23,33 +23,35 @@
* Compilation of a dictionary
* ---------------------------
*
- * A compiled dictionary is stored in the IspellDict structure. Compilation of
- * a dictionary is divided into the several steps:
+ * A compiled dictionary is stored in the following structures:
+ * - IspellDictBuild - stores temporary data and IspellDictData
+ * - IspellDictData - stores permanent data used within NINormalizeWord()
+ * Compilation of the dictionary is divided into several steps:
* - NIImportDictionary() - stores each word of a .dict file in the
* temporary Spell field.
- * - NIImportAffixes() - stores affix rules of an .affix file in the
- * Affix field (not temporary) if an .affix file has the Ispell format.
+ * - NIImportAffixes() - stores affix rules of an .affix file in the temporary
+ * Affix field if an .affix file has the Ispell format.
* -> NIImportOOAffixes() - stores affix rules if an .affix file has the
* Hunspell format. The AffixData field is initialized if AF parameter
* is defined.
* - NISortDictionary() - builds a prefix tree (Trie) from the words list
- * and stores it in the Dictionary field. The words list is got from the
+ * and stores it in the DictNodes field. The words list is got from the
* Spell field. The AffixData field is initialized if AF parameter is not
* defined.
* - NISortAffixes():
* - builds a list of compound affixes from the affix list and stores it
* in the CompoundAffix.
* - builds prefix trees (Trie) from the affix list for prefixes and suffixes
- * and stores them in Suffix and Prefix fields.
+ * and stores them in SuffixNodes and PrefixNodes fields.
* The affix list is got from the Affix field.
+ * Persistent data of the dictionary is copied within NICopyData().
*
* Memory management
* -----------------
*
- * The IspellDict structure has the Spell field which is used only in compile
- * time. The Spell field stores a words list. It can take a lot of memory.
- * Therefore when a dictionary is compiled this field is cleared by
- * NIFinishBuild().
+ * The IspellDictBuild structure holds temporary data which is used only at
+ * compile time. It can take a lot of memory. Therefore after compiling the
+ * dictionary this data is cleared by NIFinishBuild().
*
* All resources which should cleared by NIFinishBuild() is initialized using
* tmpalloc() and tmpalloc0().
@@ -73,112 +75,147 @@
* after the initialization is done. During initialization,
* CurrentMemoryContext is the long-lived memory context associated
* with the dictionary cache entry. We keep the short-lived stuff
- * in the Conf->buildCxt context.
+ * in the ConfBuild->buildCxt context.
*/
-#define tmpalloc(sz) MemoryContextAlloc(Conf->buildCxt, (sz))
-#define tmpalloc0(sz) MemoryContextAllocZero(Conf->buildCxt, (sz))
+#define tmpalloc(sz) MemoryContextAlloc(ConfBuild->buildCxt, (sz))
+#define tmpalloc0(sz) MemoryContextAllocZero(ConfBuild->buildCxt, (sz))
-#define tmpstrdup(str) MemoryContextStrdup(Conf->buildCxt, (str))
+#define tmpstrdup(str) MemoryContextStrdup(ConfBuild->buildCxt, (str))
/*
* Prepare for constructing an ISpell dictionary.
*
- * The IspellDict struct is assumed to be zeroed when allocated.
+ * The IspellDictBuild struct is assumed to be zeroed when allocated.
*/
void
-NIStartBuild(IspellDict *Conf)
+NIStartBuild(IspellDictBuild *ConfBuild)
{
+ uint32 dict_size;
+
/*
* The temp context is a child of CurTransactionContext, so that it will
* go away automatically on error.
*/
- Conf->buildCxt = AllocSetContextCreate(CurTransactionContext,
- "Ispell dictionary init context",
- ALLOCSET_DEFAULT_SIZES);
+ ConfBuild->buildCxt = AllocSetContextCreate(CurTransactionContext,
+ "Ispell dictionary init context",
+ ALLOCSET_DEFAULT_SIZES);
+
+ /*
+ * Allocate the buffer for the dictionary in the current context, not in buildCxt.
+ */
+ dict_size = MAXALIGN(IspellDictDataHdrSize);
+ ConfBuild->dict = palloc0(dict_size);
+ ConfBuild->dict_size = dict_size;
}
/*
- * Clean up when dictionary construction is complete.
+ * Copy compiled and persistent data into IspellDictData.
*/
void
-NIFinishBuild(IspellDict *Conf)
+NICopyData(IspellDictBuild *ConfBuild)
{
- /* Release no-longer-needed temp memory */
- MemoryContextDelete(Conf->buildCxt);
- /* Just for cleanliness, zero the now-dangling pointers */
- Conf->buildCxt = NULL;
- Conf->Spell = NULL;
- Conf->firstfree = NULL;
- Conf->CompoundAffixFlags = NULL;
-}
+ IspellDictData *dict;
+ uint32 size;
+ int i;
+ uint32 *offsets,
+ offset;
+ /*
+ * Calculate necessary space
+ */
+ size = ConfBuild->nAffixData * sizeof(uint32);
+ size += ConfBuild->AffixDataEnd;
-/*
- * "Compact" palloc: allocate without extra palloc overhead.
- *
- * Since we have no need to free the ispell data items individually, there's
- * not much value in the per-chunk overhead normally consumed by palloc.
- * Getting rid of it is helpful since ispell can allocate a lot of small nodes.
- *
- * We currently pre-zero all data allocated this way, even though some of it
- * doesn't need that. The cpalloc and cpalloc0 macros are just documentation
- * to indicate which allocations actually require zeroing.
- */
-#define COMPACT_ALLOC_CHUNK 8192 /* amount to get from palloc at once */
-#define COMPACT_MAX_REQ 1024 /* must be < COMPACT_ALLOC_CHUNK */
+ size += ConfBuild->nAffix * sizeof(uint32);
+ size += ConfBuild->AffixSize;
-static void *
-compact_palloc0(IspellDict *Conf, size_t size)
-{
- void *result;
+ size += ConfBuild->DictNodes.NodesEnd;
+ size += ConfBuild->PrefixNodes.NodesEnd;
+ size += ConfBuild->SuffixNodes.NodesEnd;
- /* Should only be called during init */
- Assert(Conf->buildCxt != NULL);
+ size += sizeof(CMPDAffix) * ConfBuild->nCompoundAffix;
- /* No point in this for large chunks */
- if (size > COMPACT_MAX_REQ)
- return palloc0(size);
+ /*
+ * Copy data itself
+ */
+ ConfBuild->dict_size = IspellDictDataHdrSize + size;
+ ConfBuild->dict = repalloc(ConfBuild->dict, ConfBuild->dict_size);
+
+ dict = ConfBuild->dict;
+
+ /* AffixData */
+ dict->nAffixData = ConfBuild->nAffixData;
+ dict->AffixDataStart = sizeof(uint32) * ConfBuild->nAffixData;
+ memcpy(DictAffixDataOffset(dict), ConfBuild->AffixDataOffset,
+ sizeof(uint32) * ConfBuild->nAffixData);
+ memcpy(DictAffixData(dict), ConfBuild->AffixData, ConfBuild->AffixDataEnd);
+
+ /* Affix array */
+ dict->nAffix = ConfBuild->nAffix;
+ dict->AffixOffsetStart = dict->AffixDataStart + ConfBuild->AffixDataEnd;
+ dict->AffixStart = dict->AffixOffsetStart + sizeof(uint32) * ConfBuild->nAffix;
+ offsets = (uint32 *) DictAffixOffset(dict);
+ offset = 0;
+ for (i = 0; i < ConfBuild->nAffix; i++)
+ {
+ AFFIX *affix;
+ uint32 size = AffixGetSize(ConfBuild->Affix[i]);
- /* Keep everything maxaligned */
- size = MAXALIGN(size);
+ offsets[i] = offset;
+ affix = (AFFIX *) DictAffixGet(dict, i);
+ Assert(affix);
- /* Need more space? */
- if (size > Conf->avail)
- {
- Conf->firstfree = palloc0(COMPACT_ALLOC_CHUNK);
- Conf->avail = COMPACT_ALLOC_CHUNK;
- }
+ memcpy(affix, ConfBuild->Affix[i], size);
- result = (void *) Conf->firstfree;
- Conf->firstfree += size;
- Conf->avail -= size;
+ offset += size;
+ }
- return result;
+ /* DictNodes prefix tree */
+ dict->DictNodesStart = dict->AffixStart + offset;
+ memcpy(DictDictNodes(dict), ConfBuild->DictNodes.Nodes,
+ ConfBuild->DictNodes.NodesEnd);
+
+ /* PrefixNodes prefix tree */
+ dict->PrefixNodesStart = dict->DictNodesStart + ConfBuild->DictNodes.NodesEnd;
+ memcpy(DictPrefixNodes(dict), ConfBuild->PrefixNodes.Nodes,
+ ConfBuild->PrefixNodes.NodesEnd);
+
+ /* SuffixNodes prefix tree */
+ dict->SuffixNodesStart = dict->PrefixNodesStart + ConfBuild->PrefixNodes.NodesEnd;
+ memcpy(DictSuffixNodes(dict), ConfBuild->SuffixNodes.Nodes,
+ ConfBuild->SuffixNodes.NodesEnd);
+
+ /* CompoundAffix array */
+ dict->CompoundAffixStart = dict->SuffixNodesStart +
+ ConfBuild->SuffixNodes.NodesEnd;
+ memcpy(DictCompoundAffix(dict), ConfBuild->CompoundAffix,
+ sizeof(CMPDAffix) * ConfBuild->nCompoundAffix);
}
-#define cpalloc(size) compact_palloc0(Conf, size)
-#define cpalloc0(size) compact_palloc0(Conf, size)
-
-static char *
-cpstrdup(IspellDict *Conf, const char *str)
+/*
+ * Clean up when dictionary construction is complete.
+ */
+void
+NIFinishBuild(IspellDictBuild *ConfBuild)
{
- char *res = cpalloc(strlen(str) + 1);
-
- strcpy(res, str);
- return res;
+ /* Release no-longer-needed temp memory */
+ MemoryContextDelete(ConfBuild->buildCxt);
+ /* Just for cleanliness, zero the now-dangling pointers */
+ ConfBuild->buildCxt = NULL;
+ ConfBuild->Spell = NULL;
+ ConfBuild->CompoundAffixFlags = NULL;
}
-
/*
* Apply lowerstr(), producing a temporary result (in the buildCxt).
*/
static char *
-lowerstr_ctx(IspellDict *Conf, const char *src)
+lowerstr_ctx(IspellDictBuild *ConfBuild, const char *src)
{
MemoryContext saveCtx;
char *dst;
- saveCtx = MemoryContextSwitchTo(Conf->buildCxt);
+ saveCtx = MemoryContextSwitchTo(ConfBuild->buildCxt);
dst = lowerstr(src);
MemoryContextSwitchTo(saveCtx);
@@ -190,7 +227,7 @@ lowerstr_ctx(IspellDict *Conf, const char *src)
#define STRNCMP(s,p) strncmp( (s), (p), strlen(p) )
#define GETWCHAR(W,L,N,T) ( ((const uint8*)(W))[ ((T)==FF_PREFIX) ? (N) : ( (L) - 1 - (N) ) ] )
-#define GETCHAR(A,N,T) GETWCHAR( (A)->repl, (A)->replen, N, T )
+#define GETCHAR(A,N,T) GETWCHAR( AffixFieldRepl(A), (A)->replen, N, T )
static char *VoidString = "";
@@ -311,18 +348,189 @@ strbncmp(const unsigned char *s1, const unsigned char *s2, size_t count)
static int
cmpaffix(const void *s1, const void *s2)
{
- const AFFIX *a1 = (const AFFIX *) s1;
- const AFFIX *a2 = (const AFFIX *) s2;
+ const AFFIX *a1 = *((AFFIX *const *) s1);
+ const AFFIX *a2 = *((AFFIX *const *) s2);
if (a1->type < a2->type)
return -1;
if (a1->type > a2->type)
return 1;
if (a1->type == FF_PREFIX)
- return strcmp(a1->repl, a2->repl);
+ return strcmp(AffixFieldRepl(a1), AffixFieldRepl(a2));
else
- return strbcmp((const unsigned char *) a1->repl,
- (const unsigned char *) a2->repl);
+ return strbcmp((const unsigned char *) AffixFieldRepl(a1),
+ (const unsigned char *) AffixFieldRepl(a2));
+}
+
+/*
+ * Allocate space for AffixData.
+ */
+static void
+InitAffixData(IspellDictBuild *ConfBuild, int numAffixData)
+{
+ uint32 size;
+
+ size = 8 * 1024 /* Reserve 8KB for data */;
+
+ ConfBuild->AffixData = (char *) tmpalloc(size);
+ ConfBuild->AffixDataSize = size;
+ ConfBuild->AffixDataOffset = (uint32 *) tmpalloc(numAffixData * sizeof(uint32));
+ ConfBuild->nAffixData = 0;
+ ConfBuild->mAffixData = numAffixData;
+
+ /* Save offset of the end of data */
+ ConfBuild->AffixDataEnd = 0;
+}
+
+/*
+ * Add a set of affix flags to the IspellDictBuild struct. If the AffixData
+ * buffer cannot fit the new affix set, enlarge it.
+ *
+ * ConfBuild: building structure for the current dictionary.
+ * AffixSet: set of affix flags.
+ */
+static void
+AddAffixSet(IspellDictBuild *ConfBuild, const char *AffixSet,
+ uint32 AffixSetLen)
+{
+ /*
+ * Check available space for AffixSet.
+ */
+ if (ConfBuild->AffixDataEnd + AffixSetLen + 1 /* \0 */ >=
+ ConfBuild->AffixDataSize)
+ {
+ uint32 newsize = Max(ConfBuild->AffixDataSize + 8 * 1024 /* 8KB */,
+ ConfBuild->AffixDataSize + AffixSetLen + 1);
+
+ ConfBuild->AffixData = (char *) repalloc(ConfBuild->AffixData, newsize);
+ ConfBuild->AffixDataSize = newsize;
+ }
+
+ /* Check available number of offsets */
+ if (ConfBuild->nAffixData >= ConfBuild->mAffixData)
+ {
+ ConfBuild->mAffixData *= 2;
+ ConfBuild->AffixDataOffset = (uint32 *) repalloc(ConfBuild->AffixDataOffset,
+ sizeof(uint32) * ConfBuild->mAffixData);
+ }
+
+ ConfBuild->AffixDataOffset[ConfBuild->nAffixData] = ConfBuild->AffixDataEnd;
+ StrNCpy(AffixDataGet(ConfBuild, ConfBuild->nAffixData),
+ AffixSet, AffixSetLen + 1);
+
+ /* Save offset of the end of data */
+ ConfBuild->AffixDataEnd += AffixSetLen + 1;
+ ConfBuild->nAffixData++;
+}
+
+/*
+ * Allocate space for prefix tree node.
+ *
+ * ConfBuild: building structure for the current dictionary.
+ * array: NodeArray where to allocate new node.
+ * length: number of allocated NodeData.
+ * sizeNodeData: minimum size of each NodeData.
+ * sizeNodeHeader: size of header of new node.
+ *
+ * Returns an offset of new node in NodeArray->Nodes.
+ */
+static uint32
+AllocateNode(IspellDictBuild *ConfBuild, NodeArray *array, uint32 length,
+ uint32 sizeNodeData, uint32 sizeNodeHeader)
+{
+ uint32 node_offset;
+ uint32 size;
+
+ size = sizeNodeHeader + length * sizeNodeData;
+ size = MAXALIGN(size);
+
+ if (array->NodesSize == 0)
+ {
+ array->NodesSize = size * 32; /* Reserve space for next levels of the
+ * prefix tree */
+ array->Nodes = (char *) tmpalloc(array->NodesSize);
+ array->NodesEnd = 0;
+ }
+ else if (array->NodesEnd + size >= array->NodesSize)
+ {
+ array->NodesSize = Max(array->NodesSize * 2, array->NodesSize + size);
+ array->Nodes = (char *) repalloc(array->Nodes, array->NodesSize);
+ }
+
+ node_offset = array->NodesEnd;
+ array->NodesEnd += size;
+
+ return node_offset;
+}
+
+/*
+ * Allocate space for SPNode.
+ *
+ * Returns an offset of new node in ConfBuild->DictNodes->Nodes.
+ */
+static uint32
+AllocateSPNode(IspellDictBuild *ConfBuild, uint32 length)
+{
+ uint32 offset;
+ SPNode *node;
+ SPNodeData *data;
+ uint32 i;
+
+ offset = AllocateNode(ConfBuild, &ConfBuild->DictNodes, length,
+ sizeof(SPNodeData), SPNHDRSZ);
+ node = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, offset);
+ node->length = length;
+
+ /*
+ * Initialize all SPNodeData with default values. We cannot use memset()
+ * here because not all fields have 0 as default value.
+ */
+ for (i = 0; i < length; i++)
+ {
+ data = &(node->data[i]);
+ data->val = 0;
+ data->affix = ISPELL_INVALID_INDEX;
+ data->compoundflag = 0;
+ data->isword = 0;
+ data->node_offset = ISPELL_INVALID_OFFSET;
+ }
+
+ return offset;
+}
+
+/*
+ * Allocate space for AffixNode.
+ *
+ * Returns an offset of new node in NodeArray->Nodes.
+ */
+static uint32
+AllocateAffixNode(IspellDictBuild *ConfBuild, NodeArray *array, uint32 length)
+{
+ uint32 offset;
+ AffixNode *node;
+ AffixNodeData *data;
+ uint32 i;
+
+ offset = AllocateNode(ConfBuild, array, length, sizeof(AffixNodeData),
+ ANHRDSZ);
+ node = (AffixNode *) NodeArrayGet(array, offset);
+ node->length = length;
+ node->isvoid = 0;
+
+ /*
+ * Initialize all AffixNodeData with default values. We cannot use memset()
+ * here because not all fields have 0 as default value.
+ */
+ for (i = 0; i < length; i++)
+ {
+ data = &(node->data[i]);
+ data->val = 0;
+ data->affstart = ISPELL_INVALID_INDEX;
+ data->affend = ISPELL_INVALID_INDEX;
+ data->node_offset = ISPELL_INVALID_OFFSET;
+ }
+
+ return offset;
}
/*
@@ -333,7 +541,7 @@ cmpaffix(const void *s1, const void *s2)
* - 2 characters (FM_LONG). A character may be Unicode.
* - numbers from 1 to 65000 (FM_NUM).
*
- * Depending on the flagMode an affix string can have the following format:
+ * Depending on the flagmode an affix string can have the following format:
* - FM_CHAR: ABCD
* Here we have 4 flags: A, B, C and D
* - FM_LONG: ABCDE*
@@ -341,13 +549,13 @@ cmpaffix(const void *s1, const void *s2)
* - FM_NUM: 200,205,50
* Here we have 3 flags: 200, 205 and 50
*
- * Conf: current dictionary.
+ * flagmode: flag mode of the dictionary
* sflagset: the set of affix flags. Returns a reference to the start of a next
* affix flag.
* sflag: returns an affix flag from sflagset.
*/
static void
-getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
+getNextFlagFromString(FlagMode flagmode, char **sflagset, char *sflag)
{
int32 s;
char *next,
@@ -356,11 +564,11 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
bool stop = false;
bool met_comma = false;
- maxstep = (Conf->flagMode == FM_LONG) ? 2 : 1;
+ maxstep = (flagmode == FM_LONG) ? 2 : 1;
while (**sflagset)
{
- switch (Conf->flagMode)
+ switch (flagmode)
{
case FM_LONG:
case FM_CHAR:
@@ -422,15 +630,15 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
stop = true;
break;
default:
- elog(ERROR, "unrecognized type of Conf->flagMode: %d",
- Conf->flagMode);
+ elog(ERROR, "unrecognized type of flagmode: %d",
+ flagmode);
}
if (stop)
break;
}
- if (Conf->flagMode == FM_LONG && maxstep > 0)
+ if (flagmode == FM_LONG && maxstep > 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix flag \"%s\" with \"long\" flag value",
@@ -440,31 +648,28 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
}
/*
- * Checks if the affix set Conf->AffixData[affix] contains affixflag.
- * Conf->AffixData[affix] does not contain affixflag if this flag is not used
- * actually by the .dict file.
+ * Checks if the affix set from AffixData contains affixflag. Affix set does
+ * not contain affixflag if this flag is not used actually by the .dict file.
*
- * Conf: current dictionary.
- * affix: index of the Conf->AffixData array.
+ * flagmode: flag mode of the dictionary.
+ * sflagset: the set of affix flags.
* affixflag: the affix flag.
*
- * Returns true if the string Conf->AffixData[affix] contains affixflag,
- * otherwise returns false.
+ * Returns true if the affix set string contains affixflag, otherwise returns
+ * false.
*/
static bool
-IsAffixFlagInUse(IspellDict *Conf, int affix, const char *affixflag)
+IsAffixFlagInUse(FlagMode flagmode, char *sflagset, const char *affixflag)
{
- char *flagcur;
+ char *flagcur = sflagset;
char flag[BUFSIZ];
if (*affixflag == 0)
return true;
- flagcur = Conf->AffixData[affix];
-
while (*flagcur)
{
- getNextFlagFromString(Conf, &flagcur, flag);
+ getNextFlagFromString(flagmode, &flagcur, flag);
/* Compare first affix flag in flagcur with affixflag */
if (strcmp(flag, affixflag) == 0)
return true;
@@ -477,31 +682,33 @@ IsAffixFlagInUse(IspellDict *Conf, int affix, const char *affixflag)
/*
* Adds the new word into the temporary array Spell.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* word: new word.
* flag: set of affix flags. Single flag can be get by getNextFlagFromString().
*/
static void
-NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
+NIAddSpell(IspellDictBuild *ConfBuild, const char *word, const char *flag)
{
- if (Conf->nspell >= Conf->mspell)
+ if (ConfBuild->nSpell >= ConfBuild->mSpell)
{
- if (Conf->mspell)
+ if (ConfBuild->mSpell)
{
- Conf->mspell *= 2;
- Conf->Spell = (SPELL **) repalloc(Conf->Spell, Conf->mspell * sizeof(SPELL *));
+ ConfBuild->mSpell *= 2;
+ ConfBuild->Spell = (SPELL **) repalloc(ConfBuild->Spell,
+ ConfBuild->mSpell * sizeof(SPELL *));
}
else
{
- Conf->mspell = 1024 * 20;
- Conf->Spell = (SPELL **) tmpalloc(Conf->mspell * sizeof(SPELL *));
+ ConfBuild->mSpell = 1024 * 20;
+ ConfBuild->Spell = (SPELL **) tmpalloc(ConfBuild->mSpell * sizeof(SPELL *));
}
}
- Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
- strcpy(Conf->Spell[Conf->nspell]->word, word);
- Conf->Spell[Conf->nspell]->p.flag = (*flag != '\0')
+ ConfBuild->Spell[ConfBuild->nSpell] =
+ (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
+ strcpy(ConfBuild->Spell[ConfBuild->nSpell]->word, word);
+ ConfBuild->Spell[ConfBuild->nSpell]->p.flag = (*flag != '\0')
? tmpstrdup(flag) : VoidString;
- Conf->nspell++;
+ ConfBuild->nSpell++;
}
/*
@@ -509,11 +716,11 @@ NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
*
* Note caller must already have applied get_tsearch_config_filename.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* filename: path to the .dict file.
*/
void
-NIImportDictionary(IspellDict *Conf, const char *filename)
+NIImportDictionary(IspellDictBuild *ConfBuild, const char *filename)
{
tsearch_readline_state trst;
char *line;
@@ -564,9 +771,9 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
}
s += pg_mblen(s);
}
- pstr = lowerstr_ctx(Conf, line);
+ pstr = lowerstr_ctx(ConfBuild, line);
- NIAddSpell(Conf, pstr, flag);
+ NIAddSpell(ConfBuild, pstr, flag);
pfree(pstr);
pfree(line);
@@ -590,7 +797,7 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
* SFX M 0 's .
* is presented here.
*
- * Conf: current dictionary.
+ * dict: current dictionary.
* word: basic form of word.
* affixflag: affix flag, by which a basic form of word was generated.
* flag: compound flag used to compare with StopMiddle->compoundflag.
@@ -598,9 +805,9 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
* Returns 1 if the word was found in the prefix tree, else returns 0.
*/
static int
-FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
+FindWord(IspellDictData *dict, const char *word, const char *affixflag, int flag)
{
- SPNode *node = Conf->Dictionary;
+ SPNode *node = (SPNode *) DictDictNodes(dict);
SPNodeData *StopLow,
*StopHigh,
*StopMiddle;
@@ -636,10 +843,14 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
* Check if this affix rule is presented in the affix set
* with index StopMiddle->affix.
*/
- if (IsAffixFlagInUse(Conf, StopMiddle->affix, affixflag))
+ if (IsAffixFlagInUse(dict->flagMode,
+ DictAffixDataGet(dict, StopMiddle->affix),
+ affixflag))
return 1;
}
- node = StopMiddle->node;
+ /* Retrieve the SPNode by its offset */
+ node = (SPNode *) DictNodeGet(DictDictNodes(dict),
+ StopMiddle->node_offset);
ptr++;
break;
}
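
The FindWord() hunk above stops following a stored pointer (StopMiddle->node) and instead resolves StopMiddle->node_offset against the start of the node array. A minimal sketch of why that makes the tree position-independent, assuming a hypothetical Node layout and node_get() helper (these are illustrations, not the patch's SPNode or DictNodeGet definitions):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sentinel, playing the role of ISPELL_INVALID_OFFSET. */
#define INVALID_OFFSET UINT32_MAX

/* Hypothetical node: refers to its child by byte offset, not by pointer. */
typedef struct Node
{
	uint32_t	child_offset;
	char		val;
} Node;

/* Resolve an offset against the base address of the node buffer. */
static Node *
node_get(char *base, uint32_t offset)
{
	return (offset == INVALID_OFFSET) ? NULL : (Node *) (base + offset);
}

/* Walk the chain starting at offset 0 and collect the stored characters. */
static size_t
collect(char *base, char *out, size_t outlen)
{
	size_t		n = 0;
	Node	   *node = node_get(base, 0);

	assert(outlen > 0);
	while (node != NULL && n + 1 < outlen)
	{
		out[n++] = node->val;
		node = node_get(base, node->child_offset);
	}
	out[n] = '\0';
	return n;
}
```

Because no absolute address is stored, the same bytes can be copied (or mapped) at a different address and walked without any fix-up, which is what a segment attached by several backends needs.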
@@ -657,7 +868,8 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
/*
* Adds a new affix rule to the Affix field.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary; used to allocate
+ * temporary data.
* flag: affix flag ('\' in the below example).
* flagflags: set of flags from the flagval field for this affix rule. This set
* is listed after '/' character in the added string (repl).
@@ -673,26 +885,54 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
* type: FF_SUFFIX or FF_PREFIX.
*/
static void
-NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
- const char *find, const char *repl, int type)
+NIAddAffix(IspellDictBuild *ConfBuild, const char *flag, char flagflags,
+ const char *mask, const char *find, const char *repl, int type)
{
AFFIX *Affix;
+ uint32 size;
+ uint32 flaglen = strlen(flag),
+ findlen = strlen(find),
+ repllen = strlen(repl),
+ masklen = strlen(mask);
+
+ /* Sanity checks */
+ if (flaglen > AF_FLAG_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix flag \"%s\" too long", flag)));
+ if (findlen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix find field \"%s\" too long", find)));
+ if (repllen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix repl field \"%s\" too long", repl)));
+ if (masklen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix mask field \"%s\" too long", mask)));
- if (Conf->naffixes >= Conf->maffixes)
+ if (ConfBuild->nAffix >= ConfBuild->mAffix)
{
- if (Conf->maffixes)
+ if (ConfBuild->mAffix)
{
- Conf->maffixes *= 2;
- Conf->Affix = (AFFIX *) repalloc((void *) Conf->Affix, Conf->maffixes * sizeof(AFFIX));
+ ConfBuild->mAffix *= 2;
+ ConfBuild->Affix = (AFFIX **) repalloc(ConfBuild->Affix,
+ ConfBuild->mAffix * sizeof(AFFIX *));
}
else
{
- Conf->maffixes = 16;
- Conf->Affix = (AFFIX *) palloc(Conf->maffixes * sizeof(AFFIX));
+ ConfBuild->mAffix = 255;
+ ConfBuild->Affix = (AFFIX **) tmpalloc(ConfBuild->mAffix * sizeof(AFFIX *));
}
}
- Affix = Conf->Affix + Conf->naffixes;
+ size = AFFIXHDRSZ + flaglen + 1 /* \0 */ + findlen + 1 /* \0 */ +
+ repllen + 1 /* \0 */ + masklen + 1 /* \0 */;
+
+ Affix = (AFFIX *) tmpalloc(size);
+ ConfBuild->Affix[ConfBuild->nAffix] = Affix;
/* This affix rule can be applied for words with any ending */
if (strcmp(mask, ".") == 0 || *mask == '\0')
@@ -705,42 +945,12 @@ NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
{
Affix->issimple = 0;
Affix->isregis = 1;
- RS_compile(&(Affix->reg.regis), (type == FF_SUFFIX),
- *mask ? mask : VoidString);
}
/* This affix rule will use regex_t to search word ending */
else
{
- int masklen;
- int wmasklen;
- int err;
- pg_wchar *wmask;
- char *tmask;
-
Affix->issimple = 0;
Affix->isregis = 0;
- tmask = (char *) tmpalloc(strlen(mask) + 3);
- if (type == FF_SUFFIX)
- sprintf(tmask, "%s$", mask);
- else
- sprintf(tmask, "^%s", mask);
-
- masklen = strlen(tmask);
- wmask = (pg_wchar *) tmpalloc((masklen + 1) * sizeof(pg_wchar));
- wmasklen = pg_mb2wchar_with_len(tmask, wmask, masklen);
-
- err = pg_regcomp(&(Affix->reg.regex), wmask, wmasklen,
- REG_ADVANCED | REG_NOSUB,
- DEFAULT_COLLATION_OID);
- if (err)
- {
- char errstr[100];
-
- pg_regerror(err, &(Affix->reg.regex), errstr, sizeof(errstr));
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
- errmsg("invalid regular expression: %s", errstr)));
- }
}
Affix->flagflags = flagflags;
@@ -749,15 +959,22 @@ NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
if ((Affix->flagflags & FF_COMPOUNDFLAG) == 0)
Affix->flagflags |= FF_COMPOUNDFLAG;
}
- Affix->flag = cpstrdup(Conf, flag);
+
Affix->type = type;
- Affix->find = (find && *find) ? cpstrdup(Conf, find) : VoidString;
- if ((Affix->replen = strlen(repl)) > 0)
- Affix->repl = cpstrdup(Conf, repl);
- else
- Affix->repl = VoidString;
- Conf->naffixes++;
+ Affix->replen = repllen;
+ StrNCpy(AffixFieldRepl(Affix), repl, repllen + 1);
+
+ Affix->findlen = findlen;
+ StrNCpy(AffixFieldFind(Affix), find, findlen + 1);
+
+ Affix->masklen = masklen;
+ StrNCpy(AffixFieldMask(Affix), mask, masklen + 1);
+
+ StrNCpy(AffixFieldFlag(Affix), flag, flaglen + 1);
+
+ ConfBuild->nAffix++;
+ ConfBuild->AffixSize += size;
}
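
NIAddAffix() above now allocates one variable-length AFFIX record (header plus NUL-terminated strings) instead of a fixed struct holding char pointers. A rough sketch of that packing, assuming a hypothetical two-field Rule record and accessor macros; the real AFFIX layout and the AffixField* macros are defined elsewhere in the patch and may differ:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record: fixed header followed by "find\0repl\0". */
typedef struct Rule
{
	unsigned	findlen;
	unsigned	repllen;
	char		fields[];		/* flexible array member (C99) */
} Rule;

#define RULE_HDRSZ		offsetof(Rule, fields)
#define rule_find(r)	((r)->fields)
#define rule_repl(r)	((r)->fields + (r)->findlen + 1)

/* Allocate and fill one record; reports its total size via size_out. */
static Rule *
rule_make(const char *find, const char *repl, size_t *size_out)
{
	size_t		findlen,
				repllen,
				size;
	Rule	   *r;

	assert(find != NULL && repl != NULL);
	findlen = strlen(find);
	repllen = strlen(repl);
	size = RULE_HDRSZ + findlen + 1 /* \0 */ + repllen + 1 /* \0 */;

	r = malloc(size);			/* plain malloc here, not tmpalloc */
	if (r == NULL)
		return NULL;

	/* Lengths must be set first: rule_repl() depends on findlen. */
	r->findlen = (unsigned) findlen;
	r->repllen = (unsigned) repllen;
	memcpy(rule_find(r), find, findlen + 1);
	memcpy(rule_repl(r), repl, repllen + 1);

	if (size_out != NULL)
		*size_out = size;
	return r;
}
```

Summing the per-record sizes, as the patch does with ConfBuild->AffixSize, gives the amount of contiguous memory needed to copy all records out later.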
/* Parsing states for parse_affentry() and friends */
@@ -1021,10 +1238,10 @@ parse_affentry(char *str, char *mask, char *find, char *repl)
* Sets a Hunspell options depending on flag type.
*/
static void
-setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
+setCompoundAffixFlagValue(IspellDictBuild *ConfBuild, CompoundAffixFlag *entry,
char *s, uint32 val)
{
- if (Conf->flagMode == FM_NUM)
+ if (ConfBuild->dict->flagMode == FM_NUM)
{
char *next;
int i;
@@ -1044,19 +1261,19 @@ setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
else
entry->flag.s = tmpstrdup(s);
- entry->flagMode = Conf->flagMode;
+ entry->flagMode = ConfBuild->dict->flagMode;
entry->value = val;
}
/*
* Sets up a correspondence for the affix parameter with the affix flag.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* s: affix flag in string.
* val: affix parameter.
*/
static void
-addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
+addCompoundAffixFlagValue(IspellDictBuild *ConfBuild, char *s, uint32 val)
{
CompoundAffixFlag *newValue;
char sbuf[BUFSIZ];
@@ -1083,29 +1300,29 @@ addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
*sflag = '\0';
/* Resize array or allocate memory for array CompoundAffixFlag */
- if (Conf->nCompoundAffixFlag >= Conf->mCompoundAffixFlag)
+ if (ConfBuild->nCompoundAffixFlag >= ConfBuild->mCompoundAffixFlag)
{
- if (Conf->mCompoundAffixFlag)
+ if (ConfBuild->mCompoundAffixFlag)
{
- Conf->mCompoundAffixFlag *= 2;
- Conf->CompoundAffixFlags = (CompoundAffixFlag *)
- repalloc((void *) Conf->CompoundAffixFlags,
- Conf->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
+ ConfBuild->mCompoundAffixFlag *= 2;
+ ConfBuild->CompoundAffixFlags = (CompoundAffixFlag *)
+ repalloc((void *) ConfBuild->CompoundAffixFlags,
+ ConfBuild->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
}
else
{
- Conf->mCompoundAffixFlag = 10;
- Conf->CompoundAffixFlags = (CompoundAffixFlag *)
- tmpalloc(Conf->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
+ ConfBuild->mCompoundAffixFlag = 10;
+ ConfBuild->CompoundAffixFlags = (CompoundAffixFlag *)
+ tmpalloc(ConfBuild->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
}
}
- newValue = Conf->CompoundAffixFlags + Conf->nCompoundAffixFlag;
+ newValue = ConfBuild->CompoundAffixFlags + ConfBuild->nCompoundAffixFlag;
- setCompoundAffixFlagValue(Conf, newValue, sbuf, val);
+ setCompoundAffixFlagValue(ConfBuild, newValue, sbuf, val);
- Conf->usecompound = true;
- Conf->nCompoundAffixFlag++;
+ ConfBuild->dict->usecompound = true;
+ ConfBuild->nCompoundAffixFlag++;
}
/*
@@ -1113,7 +1330,7 @@ addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
* flags s.
*/
static int
-getCompoundAffixFlagValue(IspellDict *Conf, char *s)
+getCompoundAffixFlagValue(IspellDictBuild *ConfBuild, char *s)
{
uint32 flag = 0;
CompoundAffixFlag *found,
@@ -1121,18 +1338,18 @@ getCompoundAffixFlagValue(IspellDict *Conf, char *s)
char sflag[BUFSIZ];
char *flagcur;
- if (Conf->nCompoundAffixFlag == 0)
+ if (ConfBuild->nCompoundAffixFlag == 0)
return 0;
flagcur = s;
while (*flagcur)
{
- getNextFlagFromString(Conf, &flagcur, sflag);
- setCompoundAffixFlagValue(Conf, &key, sflag, 0);
+ getNextFlagFromString(ConfBuild->dict->flagMode, &flagcur, sflag);
+ setCompoundAffixFlagValue(ConfBuild, &key, sflag, 0);
found = (CompoundAffixFlag *)
- bsearch(&key, (void *) Conf->CompoundAffixFlags,
- Conf->nCompoundAffixFlag, sizeof(CompoundAffixFlag),
+ bsearch(&key, (void *) ConfBuild->CompoundAffixFlags,
+ ConfBuild->nCompoundAffixFlag, sizeof(CompoundAffixFlag),
cmpcmdflag);
if (found != NULL)
flag |= found->value;
@@ -1144,14 +1361,13 @@ getCompoundAffixFlagValue(IspellDict *Conf, char *s)
/*
* Returns a flag set using the s parameter.
*
- * If Conf->useFlagAliases is true then the s parameter is index of the
- * Conf->AffixData array and function returns its entry.
- * Else function returns the s parameter.
+ * If useFlagAliases is true, the s parameter is an index into the AffixData
+ * array and the function returns that entry. Otherwise it returns s itself.
*/
static char *
-getAffixFlagSet(IspellDict *Conf, char *s)
+getAffixFlagSet(IspellDictBuild *ConfBuild, char *s)
{
- if (Conf->useFlagAliases && *s != '\0')
+ if (ConfBuild->dict->useFlagAliases && *s != '\0')
{
int curaffix;
char *end;
@@ -1162,13 +1378,13 @@ getAffixFlagSet(IspellDict *Conf, char *s)
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix alias \"%s\"", s)));
- if (curaffix > 0 && curaffix <= Conf->nAffixData)
+ if (curaffix > 0 && curaffix <= ConfBuild->nAffixData)
/*
* Do not subtract 1 from curaffix because empty string was added
* in NIImportOOAffixes
*/
- return Conf->AffixData[curaffix];
+ return AffixDataGet(ConfBuild, curaffix);
else
return VoidString;
}
@@ -1179,11 +1395,11 @@ getAffixFlagSet(IspellDict *Conf, char *s)
/*
* Import an affix file that follows MySpell or Hunspell format.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* filename: path to the .affix file.
*/
static void
-NIImportOOAffixes(IspellDict *Conf, const char *filename)
+NIImportOOAffixes(IspellDictBuild *ConfBuild, const char *filename)
{
char type[BUFSIZ],
*ptype = NULL;
@@ -1195,17 +1411,16 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
char repl[BUFSIZ],
*prepl;
bool isSuffix = false;
- int naffix = 0,
- curaffix = 0;
+ int naffix = 0;
int sflaglen = 0;
char flagflags = 0;
tsearch_readline_state trst;
char *recoded;
/* read file to find any flag */
- Conf->usecompound = false;
- Conf->useFlagAliases = false;
- Conf->flagMode = FM_CHAR;
+ ConfBuild->dict->usecompound = false;
+ ConfBuild->dict->useFlagAliases = false;
+ ConfBuild->dict->flagMode = FM_CHAR;
if (!tsearch_readline_begin(&trst, filename))
ereport(ERROR,
@@ -1222,30 +1437,36 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
}
if (STRNCMP(recoded, "COMPOUNDFLAG") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDFLAG"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDFLAG"),
FF_COMPOUNDFLAG);
else if (STRNCMP(recoded, "COMPOUNDBEGIN") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDBEGIN"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDBEGIN"),
FF_COMPOUNDBEGIN);
else if (STRNCMP(recoded, "COMPOUNDLAST") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDLAST"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDLAST"),
FF_COMPOUNDLAST);
/* COMPOUNDLAST and COMPOUNDEND are synonyms */
else if (STRNCMP(recoded, "COMPOUNDEND") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDEND"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDEND"),
FF_COMPOUNDLAST);
else if (STRNCMP(recoded, "COMPOUNDMIDDLE") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDMIDDLE"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDMIDDLE"),
FF_COMPOUNDMIDDLE);
else if (STRNCMP(recoded, "ONLYINCOMPOUND") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("ONLYINCOMPOUND"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("ONLYINCOMPOUND"),
FF_COMPOUNDONLY);
else if (STRNCMP(recoded, "COMPOUNDPERMITFLAG") == 0)
- addCompoundAffixFlagValue(Conf,
+ addCompoundAffixFlagValue(ConfBuild,
recoded + strlen("COMPOUNDPERMITFLAG"),
FF_COMPOUNDPERMITFLAG);
else if (STRNCMP(recoded, "COMPOUNDFORBIDFLAG") == 0)
- addCompoundAffixFlagValue(Conf,
+ addCompoundAffixFlagValue(ConfBuild,
recoded + strlen("COMPOUNDFORBIDFLAG"),
FF_COMPOUNDFORBIDFLAG);
else if (STRNCMP(recoded, "FLAG") == 0)
@@ -1258,9 +1479,9 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
if (*s)
{
if (STRNCMP(s, "long") == 0)
- Conf->flagMode = FM_LONG;
+ ConfBuild->dict->flagMode = FM_LONG;
else if (STRNCMP(s, "num") == 0)
- Conf->flagMode = FM_NUM;
+ ConfBuild->dict->flagMode = FM_NUM;
else if (STRNCMP(s, "default") != 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
@@ -1274,8 +1495,8 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
}
tsearch_readline_end(&trst);
- if (Conf->nCompoundAffixFlag > 1)
- qsort((void *) Conf->CompoundAffixFlags, Conf->nCompoundAffixFlag,
+ if (ConfBuild->nCompoundAffixFlag > 1)
+ qsort((void *) ConfBuild->CompoundAffixFlags, ConfBuild->nCompoundAffixFlag,
sizeof(CompoundAffixFlag), cmpcmdflag);
if (!tsearch_readline_begin(&trst, filename))
@@ -1295,15 +1516,15 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
if (ptype)
pfree(ptype);
- ptype = lowerstr_ctx(Conf, type);
+ ptype = lowerstr_ctx(ConfBuild, type);
/* First try to parse AF parameter (alias compression) */
if (STRNCMP(ptype, "af") == 0)
{
/* First line is the number of aliases */
- if (!Conf->useFlagAliases)
+ if (!ConfBuild->dict->useFlagAliases)
{
- Conf->useFlagAliases = true;
+ ConfBuild->dict->useFlagAliases = true;
naffix = atoi(sflag);
if (naffix == 0)
ereport(ERROR,
@@ -1313,21 +1534,15 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
/* Also reserve place for empty flag set */
naffix++;
- Conf->AffixData = (char **) palloc0(naffix * sizeof(char *));
- Conf->lenAffixData = Conf->nAffixData = naffix;
+ InitAffixData(ConfBuild, naffix);
/* Add empty flag set into AffixData */
- Conf->AffixData[curaffix] = VoidString;
- curaffix++;
+ AddAffixSet(ConfBuild, VoidString, 0);
}
/* Other lines is aliases */
else
{
- if (curaffix < naffix)
- {
- Conf->AffixData[curaffix] = cpstrdup(Conf, sflag);
- curaffix++;
- }
+ AddAffixSet(ConfBuild, sflag, strlen(sflag));
}
goto nextline;
}
@@ -1338,8 +1553,8 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
sflaglen = strlen(sflag);
if (sflaglen == 0
- || (sflaglen > 1 && Conf->flagMode == FM_CHAR)
- || (sflaglen > 2 && Conf->flagMode == FM_LONG))
+ || (sflaglen > 1 && ConfBuild->dict->flagMode == FM_CHAR)
+ || (sflaglen > 2 && ConfBuild->dict->flagMode == FM_LONG))
goto nextline;
/*--------
@@ -1367,21 +1582,21 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
/* Get flags after '/' (flags are case sensitive) */
if ((ptr = strchr(repl, '/')) != NULL)
- aflg |= getCompoundAffixFlagValue(Conf,
- getAffixFlagSet(Conf,
+ aflg |= getCompoundAffixFlagValue(ConfBuild,
+ getAffixFlagSet(ConfBuild,
ptr + 1));
/* Get lowercased version of string before '/' */
- prepl = lowerstr_ctx(Conf, repl);
+ prepl = lowerstr_ctx(ConfBuild, repl);
if ((ptr = strchr(prepl, '/')) != NULL)
*ptr = '\0';
- pfind = lowerstr_ctx(Conf, find);
- pmask = lowerstr_ctx(Conf, mask);
+ pfind = lowerstr_ctx(ConfBuild, find);
+ pmask = lowerstr_ctx(ConfBuild, mask);
if (t_iseq(find, '0'))
*pfind = '\0';
if (t_iseq(repl, '0'))
*prepl = '\0';
- NIAddAffix(Conf, sflag, flagflags | aflg, pmask, pfind, prepl,
+ NIAddAffix(ConfBuild, sflag, flagflags | aflg, pmask, pfind, prepl,
isSuffix ? FF_SUFFIX : FF_PREFIX);
pfree(prepl);
pfree(pfind);
@@ -1407,7 +1622,7 @@ nextline:
* work to NIImportOOAffixes(), which will re-read the whole file.
*/
void
-NIImportAffixes(IspellDict *Conf, const char *filename)
+NIImportAffixes(IspellDictBuild *ConfBuild, const char *filename)
{
char *pstr = NULL;
char flag[BUFSIZ];
@@ -1428,9 +1643,9 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
errmsg("could not open affix file \"%s\": %m",
filename)));
- Conf->usecompound = false;
- Conf->useFlagAliases = false;
- Conf->flagMode = FM_CHAR;
+ ConfBuild->dict->usecompound = false;
+ ConfBuild->dict->useFlagAliases = false;
+ ConfBuild->dict->flagMode = FM_CHAR;
while ((recoded = tsearch_readline(&trst)) != NULL)
{
@@ -1452,10 +1667,8 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
s += pg_mblen(s);
if (*s && pg_mblen(s) == 1)
- {
- addCompoundAffixFlagValue(Conf, s, FF_COMPOUNDFLAG);
- Conf->usecompound = true;
- }
+ addCompoundAffixFlagValue(ConfBuild, s, FF_COMPOUNDFLAG);
+
oldformat = true;
goto nextline;
}
@@ -1528,7 +1741,8 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
if (!parse_affentry(pstr, mask, find, repl))
goto nextline;
- NIAddAffix(Conf, flag, flagflags, mask, find, repl, suffixes ? FF_SUFFIX : FF_PREFIX);
+ NIAddAffix(ConfBuild, flag, flagflags, mask, find, repl,
+ suffixes ? FF_SUFFIX : FF_PREFIX);
nextline:
pfree(recoded);
@@ -1547,53 +1761,48 @@ isnewformat:
errmsg("affix file contains both old-style and new-style commands")));
tsearch_readline_end(&trst);
- NIImportOOAffixes(Conf, filename);
+ NIImportOOAffixes(ConfBuild, filename);
}
/*
* Merges two affix flag sets and stores a new affix flag set into
- * Conf->AffixData.
+ * ConfBuild->AffixData.
*
* Returns index of a new affix flag set.
*/
static int
-MergeAffix(IspellDict *Conf, int a1, int a2)
+MergeAffix(IspellDictBuild *ConfBuild, int a1, int a2)
{
- char **ptr;
+ char *ptr;
+ uint32 len;
/* Do not merge affix flags if one of affix flags is empty */
- if (*Conf->AffixData[a1] == '\0')
+ if (*AffixDataGet(ConfBuild, a1) == '\0')
return a2;
- else if (*Conf->AffixData[a2] == '\0')
+ else if (*AffixDataGet(ConfBuild, a2) == '\0')
return a1;
- while (Conf->nAffixData + 1 >= Conf->lenAffixData)
- {
- Conf->lenAffixData *= 2;
- Conf->AffixData = (char **) repalloc(Conf->AffixData,
- sizeof(char *) * Conf->lenAffixData);
- }
-
- ptr = Conf->AffixData + Conf->nAffixData;
- if (Conf->flagMode == FM_NUM)
+ if (ConfBuild->dict->flagMode == FM_NUM)
{
- *ptr = cpalloc(strlen(Conf->AffixData[a1]) +
- strlen(Conf->AffixData[a2]) +
- 1 /* comma */ + 1 /* \0 */ );
- sprintf(*ptr, "%s,%s", Conf->AffixData[a1], Conf->AffixData[a2]);
+ len = strlen(AffixDataGet(ConfBuild, a1)) + 1 /* comma */ +
+ strlen(AffixDataGet(ConfBuild, a2));
+ ptr = tmpalloc(len + 1 /* \0 */);
+ sprintf(ptr, "%s,%s", AffixDataGet(ConfBuild, a1),
+ AffixDataGet(ConfBuild, a2));
}
else
{
- *ptr = cpalloc(strlen(Conf->AffixData[a1]) +
- strlen(Conf->AffixData[a2]) +
- 1 /* \0 */ );
- sprintf(*ptr, "%s%s", Conf->AffixData[a1], Conf->AffixData[a2]);
+ len = strlen(AffixDataGet(ConfBuild, a1)) +
+ strlen(AffixDataGet(ConfBuild, a2));
+ ptr = tmpalloc(len + 1 /* \0 */ );
+ sprintf(ptr, "%s%s", AffixDataGet(ConfBuild, a1),
+ AffixDataGet(ConfBuild, a2));
}
- ptr++;
- *ptr = NULL;
- Conf->nAffixData++;
- return Conf->nAffixData - 1;
+ AddAffixSet(ConfBuild, ptr, len);
+ pfree(ptr);
+
+ return ConfBuild->nAffixData - 1;
}
/*
@@ -1601,66 +1810,87 @@ MergeAffix(IspellDict *Conf, int a1, int a2)
* flags with the given index.
*/
static uint32
-makeCompoundFlags(IspellDict *Conf, int affix)
+makeCompoundFlags(IspellDictBuild *ConfBuild, int affix)
{
- char *str = Conf->AffixData[affix];
+ char *str = AffixDataGet(ConfBuild, affix);
- return (getCompoundAffixFlagValue(Conf, str) & FF_COMPOUNDFLAGMASK);
+ return (getCompoundAffixFlagValue(ConfBuild, str) & FF_COMPOUNDFLAGMASK);
}
/*
* Makes a prefix tree for the given level.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* low: lower index of the Conf->Spell array.
* high: upper index of the Conf->Spell array.
* level: current prefix tree level.
+ *
+ * Returns the offset of the created SPNode within DictNodes.
*/
-static SPNode *
-mkSPNode(IspellDict *Conf, int low, int high, int level)
+static uint32
+mkSPNode(IspellDictBuild *ConfBuild, int low, int high, int level)
{
int i;
int nchar = 0;
char lastchar = '\0';
+ uint32 rs_offset,
+ new_offset;
SPNode *rs;
SPNodeData *data;
+ int data_index = 0;
int lownew = low;
for (i = low; i < high; i++)
- if (Conf->Spell[i]->p.d.len > level && lastchar != Conf->Spell[i]->word[level])
+ if (ConfBuild->Spell[i]->p.d.len > level &&
+ lastchar != ConfBuild->Spell[i]->word[level])
{
nchar++;
- lastchar = Conf->Spell[i]->word[level];
+ lastchar = ConfBuild->Spell[i]->word[level];
}
if (!nchar)
- return NULL;
+ return ISPELL_INVALID_OFFSET;
- rs = (SPNode *) cpalloc0(SPNHDRSZ + nchar * sizeof(SPNodeData));
- rs->length = nchar;
+ rs_offset = AllocateSPNode(ConfBuild, nchar);
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
data = rs->data;
lastchar = '\0';
for (i = low; i < high; i++)
- if (Conf->Spell[i]->p.d.len > level)
+ if (ConfBuild->Spell[i]->p.d.len > level)
{
- if (lastchar != Conf->Spell[i]->word[level])
+ if (lastchar != ConfBuild->Spell[i]->word[level])
{
if (lastchar)
{
/* Next level of the prefix tree */
- data->node = mkSPNode(Conf, lownew, i, level + 1);
+ new_offset = mkSPNode(ConfBuild, lownew, i, level + 1);
+
+ /*
+ * ConfBuild->DictNodes can be repalloc'ed within
+ * mkSPNode(), so reinitialize pointers.
+ */
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
+
+ /* First save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ /* Work with next node */
+ data_index++;
+ Assert(data_index < nchar);
+ data = &(rs->data[data_index]);
+
lownew = i;
- data++;
}
- lastchar = Conf->Spell[i]->word[level];
+ lastchar = ConfBuild->Spell[i]->word[level];
}
- data->val = ((uint8 *) (Conf->Spell[i]->word))[level];
- if (Conf->Spell[i]->p.d.len == level + 1)
+ data->val = ((uint8 *) (ConfBuild->Spell[i]->word))[level];
+ if (ConfBuild->Spell[i]->p.d.len == level + 1)
{
bool clearCompoundOnly = false;
- if (data->isword && data->affix != Conf->Spell[i]->p.d.affix)
+ if (data->isword && data->affix != ConfBuild->Spell[i]->p.d.affix)
{
/*
* MergeAffix called a few times. If one of word is
@@ -1669,15 +1899,17 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
*/
clearCompoundOnly = (FF_COMPOUNDONLY & data->compoundflag
- & makeCompoundFlags(Conf, Conf->Spell[i]->p.d.affix))
+ & makeCompoundFlags(ConfBuild,
+ ConfBuild->Spell[i]->p.d.affix))
? false : true;
- data->affix = MergeAffix(Conf, data->affix, Conf->Spell[i]->p.d.affix);
+ data->affix = MergeAffix(ConfBuild, data->affix,
+ ConfBuild->Spell[i]->p.d.affix);
}
else
- data->affix = Conf->Spell[i]->p.d.affix;
+ data->affix = ConfBuild->Spell[i]->p.d.affix;
data->isword = 1;
- data->compoundflag = makeCompoundFlags(Conf, data->affix);
+ data->compoundflag = makeCompoundFlags(ConfBuild, data->affix);
if ((data->compoundflag & FF_COMPOUNDONLY) &&
(data->compoundflag & FF_COMPOUNDFLAG) == 0)
@@ -1689,9 +1921,19 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
}
/* Next level of the prefix tree */
- data->node = mkSPNode(Conf, lownew, high, level + 1);
+ new_offset = mkSPNode(ConfBuild, lownew, high, level + 1);
+
+ /*
+ * ConfBuild->DictNodes can be repalloc'ed within mkSPNode(), so
+ * reinitialize pointers.
+ */
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
- return rs;
+ /* Save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ return rs_offset;
}
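
mkSPNode() above re-fetches rs after every recursive call because the node array may be moved when it grows. A small sketch of that pattern, assuming a hypothetical Arena type with arena_alloc()/arena_get() standing in for NodeArray, AllocateSPNode() and NodeArrayGet() (whose definitions are not part of this hunk), and using realloc rather than repalloc:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define NO_CHILD UINT32_MAX		/* hypothetical "no node" marker */

/* Hypothetical growable byte buffer addressed by offsets. */
typedef struct Arena
{
	char	   *data;
	uint32_t	used;
	uint32_t	cap;
} Arena;

typedef struct Link
{
	uint32_t	child;			/* offset of the child node */
	int			depth;
} Link;

/* Reserve size bytes and return their offset; may move a->data. */
static uint32_t
arena_alloc(Arena *a, uint32_t size)
{
	uint32_t	off = a->used;

	if (a->used + size > a->cap)
	{
		uint32_t	newcap = a->cap ? a->cap : 64;
		char	   *newdata;

		while (newcap < a->used + size)
			newcap *= 2;
		newdata = realloc(a->data, newcap);
		if (newdata == NULL)
			abort();
		a->data = newdata;
		a->cap = newcap;
	}
	a->used += size;
	return off;
}

static void *
arena_get(Arena *a, uint32_t off)
{
	assert(off < a->used);
	return a->data + off;
}

/* Build a chain of the given depth; each node stores its child's offset. */
static uint32_t
build_chain(Arena *a, int depth)
{
	uint32_t	self,
				child;
	Link	   *node;

	if (depth == 0)
		return NO_CHILD;

	self = arena_alloc(a, sizeof(Link));
	node = arena_get(a, self);
	node->depth = depth;

	child = build_chain(a, depth - 1);	/* may realloc the arena */

	node = arena_get(a, self);	/* old pointer may be stale; re-derive it */
	node->child = child;
	return self;
}
```

Writing through the old node pointer after the recursive call would be a use-after-free whenever the buffer moved, which is the hazard the comment in the patch guards against.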
/*
@@ -1699,7 +1941,7 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
* and affixes.
*/
void
-NISortDictionary(IspellDict *Conf)
+NISortDictionary(IspellDictBuild *ConfBuild)
{
int i;
int naffix = 0;
@@ -1708,81 +1950,81 @@ NISortDictionary(IspellDict *Conf)
/* compress affixes */
/*
- * If we use flag aliases then we need to use Conf->AffixData filled in
+ * If we use flag aliases then we need to use ConfBuild->AffixData filled in
* the NIImportOOAffixes().
*/
- if (Conf->useFlagAliases)
+ if (ConfBuild->dict->useFlagAliases)
{
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
char *end;
- if (*Conf->Spell[i]->p.flag != '\0')
+ if (*ConfBuild->Spell[i]->p.flag != '\0')
{
- curaffix = strtol(Conf->Spell[i]->p.flag, &end, 10);
- if (Conf->Spell[i]->p.flag == end || errno == ERANGE)
+ curaffix = strtol(ConfBuild->Spell[i]->p.flag, &end, 10);
+ if (ConfBuild->Spell[i]->p.flag == end || errno == ERANGE)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix alias \"%s\"",
- Conf->Spell[i]->p.flag)));
+ ConfBuild->Spell[i]->p.flag)));
}
else
{
/*
- * If Conf->Spell[i]->p.flag is empty, then get empty value of
- * Conf->AffixData (0 index).
+ * If ConfBuild->Spell[i]->p.flag is empty, then get empty
+ * value of ConfBuild->AffixData (0 index).
*/
curaffix = 0;
}
- Conf->Spell[i]->p.d.affix = curaffix;
- Conf->Spell[i]->p.d.len = strlen(Conf->Spell[i]->word);
+ ConfBuild->Spell[i]->p.d.affix = curaffix;
+ ConfBuild->Spell[i]->p.d.len = strlen(ConfBuild->Spell[i]->word);
}
}
- /* Otherwise fill Conf->AffixData here */
+ /* Otherwise fill ConfBuild->AffixData here */
else
{
/* Count the number of different flags used in the dictionary */
- qsort((void *) Conf->Spell, Conf->nspell, sizeof(SPELL *),
+ qsort((void *) ConfBuild->Spell, ConfBuild->nSpell, sizeof(SPELL *),
cmpspellaffix);
naffix = 0;
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
if (i == 0
- || strcmp(Conf->Spell[i]->p.flag, Conf->Spell[i - 1]->p.flag))
+ || strcmp(ConfBuild->Spell[i]->p.flag,
+ ConfBuild->Spell[i - 1]->p.flag))
naffix++;
}
/*
- * Fill in Conf->AffixData with the affixes that were used in the
- * dictionary. Replace textual flag-field of Conf->Spell entries with
- * indexes into Conf->AffixData array.
+ * Fill in AffixData with the affixes that were used in the
+ * dictionary. Replace textual flag-field of ConfBuild->Spell entries
+ * with indexes into ConfBuild->AffixData array.
*/
- Conf->AffixData = (char **) palloc0(naffix * sizeof(char *));
+ InitAffixData(ConfBuild, naffix);
curaffix = -1;
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
if (i == 0
- || strcmp(Conf->Spell[i]->p.flag, Conf->AffixData[curaffix]))
+ || strcmp(ConfBuild->Spell[i]->p.flag,
+ AffixDataGet(ConfBuild, curaffix)))
{
curaffix++;
Assert(curaffix < naffix);
- Conf->AffixData[curaffix] = cpstrdup(Conf,
- Conf->Spell[i]->p.flag);
+ AddAffixSet(ConfBuild, ConfBuild->Spell[i]->p.flag,
+ strlen(ConfBuild->Spell[i]->p.flag));
}
- Conf->Spell[i]->p.d.affix = curaffix;
- Conf->Spell[i]->p.d.len = strlen(Conf->Spell[i]->word);
+ ConfBuild->Spell[i]->p.d.affix = curaffix;
+ ConfBuild->Spell[i]->p.d.len = strlen(ConfBuild->Spell[i]->word);
}
-
- Conf->lenAffixData = Conf->nAffixData = naffix;
}
/* Start build a prefix tree */
- qsort((void *) Conf->Spell, Conf->nspell, sizeof(SPELL *), cmpspell);
- Conf->Dictionary = mkSPNode(Conf, 0, Conf->nspell, 0);
+ qsort((void *) ConfBuild->Spell, ConfBuild->nSpell, sizeof(SPELL *), cmpspell);
+ mkSPNode(ConfBuild, 0, ConfBuild->nSpell, 0);
}
/*
@@ -1790,83 +2032,104 @@ NISortDictionary(IspellDict *Conf)
* rule. Affixes with empty replace string do not include in the prefix tree.
* This affixes are included by mkVoidAffix().
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* low: lower index of the Conf->Affix array.
* high: upper index of the Conf->Affix array.
* level: current prefix tree level.
* type: FF_SUFFIX or FF_PREFIX.
+ *
+ * Returns the offset of the created AffixNode within its node array.
*/
-static AffixNode *
-mkANode(IspellDict *Conf, int low, int high, int level, int type)
+static uint32
+mkANode(IspellDictBuild *ConfBuild, int low, int high, int level, int type)
{
int i;
int nchar = 0;
uint8 lastchar = '\0';
+ NodeArray *array;
+ uint32 rs_offset,
+ new_offset;
AffixNode *rs;
AffixNodeData *data;
+ int data_index = 0;
int lownew = low;
- int naff;
- AFFIX **aff;
for (i = low; i < high; i++)
- if (Conf->Affix[i].replen > level && lastchar != GETCHAR(Conf->Affix + i, level, type))
+ if (ConfBuild->Affix[i]->replen > level &&
+ lastchar != GETCHAR(ConfBuild->Affix[i], level, type))
{
nchar++;
- lastchar = GETCHAR(Conf->Affix + i, level, type);
+ lastchar = GETCHAR(ConfBuild->Affix[i], level, type);
}
if (!nchar)
- return NULL;
+ return ISPELL_INVALID_OFFSET;
- aff = (AFFIX **) tmpalloc(sizeof(AFFIX *) * (high - low + 1));
- naff = 0;
+ if (type == FF_SUFFIX)
+ array = &ConfBuild->SuffixNodes;
+ else
+ array = &ConfBuild->PrefixNodes;
- rs = (AffixNode *) cpalloc0(ANHRDSZ + nchar * sizeof(AffixNodeData));
- rs->length = nchar;
- data = rs->data;
+ rs_offset = AllocateAffixNode(ConfBuild, array, nchar);
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
+ data = (AffixNodeData *) rs->data;
lastchar = '\0';
for (i = low; i < high; i++)
- if (Conf->Affix[i].replen > level)
+ if (ConfBuild->Affix[i]->replen > level)
{
- if (lastchar != GETCHAR(Conf->Affix + i, level, type))
+ if (lastchar != GETCHAR(ConfBuild->Affix[i], level, type))
{
if (lastchar)
{
/* Next level of the prefix tree */
- data->node = mkANode(Conf, lownew, i, level + 1, type);
- if (naff)
- {
- data->naff = naff;
- data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * naff);
- memcpy(data->aff, aff, sizeof(AFFIX *) * naff);
- naff = 0;
- }
- data++;
+ new_offset = mkANode(ConfBuild, lownew, i, level + 1, type);
+
+ /*
+ * array can be repalloc'ed within mkANode(), so
+ * reinitialize pointers.
+ */
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
+
+ /* First save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ /* Handle next data node */
+ data_index++;
+ Assert(data_index < nchar);
+ data = &(rs->data[data_index]);
+
lownew = i;
}
- lastchar = GETCHAR(Conf->Affix + i, level, type);
+ lastchar = GETCHAR(ConfBuild->Affix[i], level, type);
}
- data->val = GETCHAR(Conf->Affix + i, level, type);
- if (Conf->Affix[i].replen == level + 1)
+ data->val = GETCHAR(ConfBuild->Affix[i], level, type);
+ if (ConfBuild->Affix[i]->replen == level + 1)
{ /* affix stopped */
- aff[naff++] = Conf->Affix + i;
+ if (data->affstart == ISPELL_INVALID_INDEX)
+ {
+ data->affstart = i;
+ data->affend = i;
+ }
+ else
+ data->affend = i;
}
}
/* Next level of the prefix tree */
- data->node = mkANode(Conf, lownew, high, level + 1, type);
- if (naff)
- {
- data->naff = naff;
- data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * naff);
- memcpy(data->aff, aff, sizeof(AFFIX *) * naff);
- naff = 0;
- }
+ new_offset = mkANode(ConfBuild, lownew, high, level + 1, type);
+
+ /*
+ * array can be repalloc'ed within mkANode(), so reinitialize pointers.
+ */
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
- pfree(aff);
+ /* Save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
- return rs;
+ return rs_offset;
}
/*
@@ -1874,137 +2137,154 @@ mkANode(IspellDict *Conf, int low, int high, int level, int type)
* for affixes which have empty replace string ("repl" field).
*/
static void
-mkVoidAffix(IspellDict *Conf, bool issuffix, int startsuffix)
+mkVoidAffix(IspellDictBuild *ConfBuild, bool issuffix, int startsuffix)
{
- int i,
- cnt = 0;
+ int i;
int start = (issuffix) ? startsuffix : 0;
- int end = (issuffix) ? Conf->naffixes : startsuffix;
- AffixNode *Affix = (AffixNode *) palloc0(ANHRDSZ + sizeof(AffixNodeData));
-
- Affix->length = 1;
- Affix->isvoid = 1;
+ int end = (issuffix) ? ConfBuild->nAffix : startsuffix;
+ uint32 node_offset;
+ NodeArray *array;
+ AffixNode *Affix;
+ AffixNodeData *AffixData;
if (issuffix)
- {
- Affix->data->node = Conf->Suffix;
- Conf->Suffix = Affix;
- }
+ array = &ConfBuild->SuffixNodes;
else
- {
- Affix->data->node = Conf->Prefix;
- Conf->Prefix = Affix;
- }
+ array = &ConfBuild->PrefixNodes;
- /* Count affixes with empty replace string */
- for (i = start; i < end; i++)
- if (Conf->Affix[i].replen == 0)
- cnt++;
-
- /* There is not affixes with empty replace string */
- if (cnt == 0)
- return;
+ node_offset = AllocateAffixNode(ConfBuild, array, 1);
+ Affix = (AffixNode *) NodeArrayGet(array, node_offset);
- Affix->data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * cnt);
- Affix->data->naff = (uint32) cnt;
+ Affix->isvoid = 1;
+ AffixData = (AffixNodeData *) Affix->data;
- cnt = 0;
for (i = start; i < end; i++)
- if (Conf->Affix[i].replen == 0)
+ if (ConfBuild->Affix[i]->replen == 0)
{
- Affix->data->aff[cnt] = Conf->Affix + i;
- cnt++;
+ if (AffixData->affstart == ISPELL_INVALID_INDEX)
+ {
+ AffixData->affstart = i;
+ AffixData->affend = i;
+ }
+ else
+ AffixData->affend = i;
}
}
/*
- * Checks if the affixflag is used by dictionary. Conf->AffixData does not
+ * Checks if the affixflag is used by dictionary. AffixData does not
* contain affixflag if this flag is not used actually by the .dict file.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* affixflag: affix flag.
*
- * Returns true if the Conf->AffixData array contains affixflag, otherwise
+ * Returns true if the ConfBuild->AffixData array contains affixflag, otherwise
* returns false.
*/
static bool
-isAffixInUse(IspellDict *Conf, char *affixflag)
+isAffixInUse(IspellDictBuild *ConfBuild, char *affixflag)
{
int i;
- for (i = 0; i < Conf->nAffixData; i++)
- if (IsAffixFlagInUse(Conf, i, affixflag))
+ for (i = 0; i < ConfBuild->nAffixData; i++)
+ if (IsAffixFlagInUse(ConfBuild->dict->flagMode,
+ AffixDataGet(ConfBuild, i), affixflag))
return true;
return false;
}
/*
- * Builds Conf->Prefix and Conf->Suffix trees from the imported affixes.
+ * Builds Prefix and Suffix trees from the imported affixes.
*/
void
-NISortAffixes(IspellDict *Conf)
+NISortAffixes(IspellDictBuild *ConfBuild)
{
AFFIX *Affix;
+ AffixNode *voidPrefix,
+ *voidSuffix;
size_t i;
CMPDAffix *ptr;
- int firstsuffix = Conf->naffixes;
+ int firstsuffix = ConfBuild->nAffix;
+ uint32 prefix_offset,
+ suffix_offset;
- if (Conf->naffixes == 0)
+ if (ConfBuild->nAffix == 0)
return;
/* Store compound affixes in the Conf->CompoundAffix array */
- if (Conf->naffixes > 1)
- qsort((void *) Conf->Affix, Conf->naffixes, sizeof(AFFIX), cmpaffix);
- Conf->CompoundAffix = ptr = (CMPDAffix *) palloc(sizeof(CMPDAffix) * Conf->naffixes);
- ptr->affix = NULL;
-
- for (i = 0; i < Conf->naffixes; i++)
+ if (ConfBuild->nAffix > 1)
+ qsort((void *) ConfBuild->Affix, ConfBuild->nAffix,
+ sizeof(AFFIX *), cmpaffix);
+ ConfBuild->nCompoundAffix = ConfBuild->nAffix;
+ ConfBuild->CompoundAffix = ptr =
+ (CMPDAffix *) tmpalloc(sizeof(CMPDAffix) * ConfBuild->nCompoundAffix);
+ ptr->affix = ISPELL_INVALID_INDEX;
+
+ for (i = 0; i < ConfBuild->nAffix; i++)
{
- Affix = &(((AFFIX *) Conf->Affix)[i]);
+ Affix = ConfBuild->Affix[i];
if (Affix->type == FF_SUFFIX && i < firstsuffix)
firstsuffix = i;
if ((Affix->flagflags & FF_COMPOUNDFLAG) && Affix->replen > 0 &&
- isAffixInUse(Conf, Affix->flag))
+ isAffixInUse(ConfBuild, AffixFieldFlag(Affix)))
{
- if (ptr == Conf->CompoundAffix ||
+ if (ptr == ConfBuild->CompoundAffix ||
ptr->issuffix != (ptr - 1)->issuffix ||
- strbncmp((const unsigned char *) (ptr - 1)->affix,
- (const unsigned char *) Affix->repl,
+ strbncmp((const unsigned char *) AffixFieldRepl(ConfBuild->Affix[(ptr - 1)->affix]),
+ (const unsigned char *) AffixFieldRepl(Affix),
(ptr - 1)->len))
{
/* leave only unique and minimals suffixes */
- ptr->affix = Affix->repl;
+ ptr->affix = i;
ptr->len = Affix->replen;
ptr->issuffix = (Affix->type == FF_SUFFIX);
ptr++;
}
}
}
- ptr->affix = NULL;
- Conf->CompoundAffix = (CMPDAffix *) repalloc(Conf->CompoundAffix, sizeof(CMPDAffix) * (ptr - Conf->CompoundAffix + 1));
+ ptr->affix = ISPELL_INVALID_INDEX;
+ ConfBuild->nCompoundAffix = ptr - ConfBuild->CompoundAffix + 1;
+ ConfBuild->CompoundAffix = (CMPDAffix *) repalloc(ConfBuild->CompoundAffix,
+ sizeof(CMPDAffix) * (ConfBuild->nCompoundAffix));
/* Start build a prefix tree */
- Conf->Prefix = mkANode(Conf, 0, firstsuffix, 0, FF_PREFIX);
- Conf->Suffix = mkANode(Conf, firstsuffix, Conf->naffixes, 0, FF_SUFFIX);
- mkVoidAffix(Conf, true, firstsuffix);
- mkVoidAffix(Conf, false, firstsuffix);
+ mkVoidAffix(ConfBuild, true, firstsuffix);
+ mkVoidAffix(ConfBuild, false, firstsuffix);
+
+ prefix_offset = mkANode(ConfBuild, 0, firstsuffix, 0, FF_PREFIX);
+ suffix_offset = mkANode(ConfBuild, firstsuffix, ConfBuild->nAffix, 0,
+ FF_SUFFIX);
+
+ /* Adjust offsets of new nodes for nodes of void affixes */
+ voidPrefix = (AffixNode *) NodeArrayGet(&ConfBuild->PrefixNodes, 0);
+ voidPrefix->data[0].node_offset = prefix_offset;
+
+ voidSuffix = (AffixNode *) NodeArrayGet(&ConfBuild->SuffixNodes, 0);
+ voidSuffix->data[0].node_offset = suffix_offset;
}
static AffixNodeData *
-FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
+FindAffixes(IspellDictData *dict, AffixNode *node, const char *word, int wrdlen,
+ int *level, int type)
{
+ AffixNode *node_start;
AffixNodeData *StopLow,
*StopHigh,
*StopMiddle;
uint8 symbol;
+ if (type == FF_PREFIX)
+ node_start = (AffixNode *) DictPrefixNodes(dict);
+ else
+ node_start = (AffixNode *) DictSuffixNodes(dict);
+
if (node->isvoid)
{ /* search void affixes */
- if (node->data->naff)
+ if (node->data->affstart != ISPELL_INVALID_INDEX)
return node->data;
- node = node->data->node;
+ node = (AffixNode *) DictNodeGet(node_start, node->data->node_offset);
}
while (node && *level < wrdlen)
@@ -2019,9 +2299,10 @@ FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
if (StopMiddle->val == symbol)
{
(*level)++;
- if (StopMiddle->naff)
+ if (StopMiddle->affstart != ISPELL_INVALID_INDEX)
return StopMiddle;
- node = StopMiddle->node;
+ node = (AffixNode *) DictNodeGet(node_start,
+ StopMiddle->node_offset);
break;
}
else if (StopMiddle->val < symbol)
@@ -2035,8 +2316,67 @@ FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
return NULL;
}
+/*
+ * Compile regular expression on first use and store it within reg.
+ */
+static void
+CompileAffixReg(AffixReg *reg, bool isregis, int type,
+ const char *mask, int masklen, MemoryContext dictCtx)
+{
+ MemoryContext oldcontext;
+
+ Assert(dictCtx);
+
+ /*
+ * Switch to memory context of the dictionary, so compiled expression can be
+ * used in other queries.
+ */
+ oldcontext = MemoryContextSwitchTo(dictCtx);
+
+ if (isregis)
+ RS_compile(&reg->r.regis, (type == FF_SUFFIX), mask);
+ else
+ {
+ int wmasklen;
+ int err;
+ pg_wchar *wmask;
+ char *tmask;
+
+ tmask = (char *) palloc(masklen + 3);
+ if (type == FF_SUFFIX)
+ sprintf(tmask, "%s$", mask);
+ else
+ sprintf(tmask, "^%s", mask);
+
+ masklen = strlen(tmask);
+ wmask = (pg_wchar *) palloc((masklen + 1) * sizeof(pg_wchar));
+ wmasklen = pg_mb2wchar_with_len(tmask, wmask, masklen);
+
+ err = pg_regcomp(&(reg->r.regex), wmask, wmasklen,
+ REG_ADVANCED | REG_NOSUB,
+ DEFAULT_COLLATION_OID);
+ if (err)
+ {
+ char errstr[100];
+
+ pg_regerror(err, &(reg->r.regex), errstr, sizeof(errstr));
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
+ errmsg("invalid regular expression: %s", errstr)));
+ }
+
+ pfree(wmask);
+ pfree(tmask);
+ }
+
+ reg->iscompiled = true;
+
+ MemoryContextSwitchTo(oldcontext);
+}
+
static char *
-CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *newword, int *baselen)
+CheckAffix(const char *word, size_t len, AFFIX *Affix, AffixReg *reg,
+ int flagflags, char *newword, int *baselen, MemoryContext dictCtx)
{
/*
* Check compound allow flags
@@ -2076,7 +2416,7 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
if (Affix->type == FF_SUFFIX)
{
strcpy(newword, word);
- strcpy(newword + len - Affix->replen, Affix->find);
+ strcpy(newword + len - Affix->replen, AffixFieldFind(Affix));
if (baselen) /* store length of non-changed part of word */
*baselen = len - Affix->replen;
}
@@ -2086,9 +2426,9 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
* if prefix is an all non-changed part's length then all word
* contains only prefix and suffix, so out
*/
- if (baselen && *baselen + strlen(Affix->find) <= Affix->replen)
+ if (baselen && *baselen + Affix->findlen <= Affix->replen)
return NULL;
- strcpy(newword, Affix->find);
+ strcpy(newword, AffixFieldFind(Affix));
strcat(newword, word + Affix->replen);
}
@@ -2099,7 +2439,12 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
return newword;
else if (Affix->isregis)
{
- if (RS_execute(&(Affix->reg.regis), newword))
+ /* Compile the regular expression on first demand */
+ if (!reg->iscompiled)
+ CompileAffixReg(reg, Affix->isregis, Affix->type,
+ AffixFieldMask(Affix), Affix->masklen, dictCtx);
+
+ if (RS_execute(&(reg->r.regis), newword))
return newword;
}
else
@@ -2109,12 +2454,17 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
size_t data_len;
int newword_len;
+ /* Compile the regular expression on first demand */
+ if (!reg->iscompiled)
+ CompileAffixReg(reg, Affix->isregis, Affix->type,
+ AffixFieldMask(Affix), Affix->masklen, dictCtx);
+
/* Convert data string to wide characters */
newword_len = strlen(newword);
data = (pg_wchar *) palloc((newword_len + 1) * sizeof(pg_wchar));
data_len = pg_mb2wchar_with_len(newword, data, newword_len);
- if (!(err = pg_regexec(&(Affix->reg.regex), data, data_len, 0, NULL, 0, NULL, 0)))
+ if (!(err = pg_regexec(&(reg->r.regex), data, data_len, 0, NULL, 0, NULL, 0)))
{
pfree(data);
return newword;
@@ -2153,7 +2503,7 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
char **cur;
char newword[2 * MAXNORMLEN] = "";
char pnewword[2 * MAXNORMLEN] = "";
- AffixNode *snode = Conf->Suffix,
+ AffixNode *snode = (AffixNode *) DictSuffixNodes(Conf->dict),
*pnode;
int i,
j;
@@ -2165,7 +2515,7 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
/* Check that the word itself is normal form */
- if (FindWord(Conf, word, VoidString, flag))
+ if (FindWord(Conf->dict, word, VoidString, flag))
{
*cur = pstrdup(word);
cur++;
@@ -2173,23 +2523,29 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
}
/* Find all other NORMAL forms of the 'word' (check only prefix) */
- pnode = Conf->Prefix;
+ pnode = (AffixNode *) DictPrefixNodes(Conf->dict);
plevel = 0;
while (pnode)
{
- prefix = FindAffixes(pnode, word, wrdlen, &plevel, FF_PREFIX);
+ prefix = FindAffixes(Conf->dict, pnode, word, wrdlen, &plevel, FF_PREFIX);
if (!prefix)
break;
- for (j = 0; j < prefix->naff; j++)
+ for (j = prefix->affstart; j <= prefix->affend; j++)
{
- if (CheckAffix(word, wrdlen, prefix->aff[j], flag, newword, NULL))
+ AFFIX *affix = (AFFIX *) DictAffixGet(Conf->dict, j);
+ AffixReg *reg = &(Conf->reg[j]);
+
+ if (affix &&
+ CheckAffix(word, wrdlen, affix, reg, flag, newword, NULL,
+ Conf->dictCtx))
{
/* prefix success */
- if (FindWord(Conf, newword, prefix->aff[j]->flag, flag))
+ if (FindWord(Conf->dict, newword, AffixFieldFlag(affix), flag))
cur += addToResult(forms, cur, newword);
}
}
- pnode = prefix->node;
+ pnode = (AffixNode *) DictNodeGet(DictPrefixNodes(Conf->dict),
+ prefix->node_offset);
}
/*
@@ -2201,45 +2557,59 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
int baselen = 0;
/* find possible suffix */
- suffix = FindAffixes(snode, word, wrdlen, &slevel, FF_SUFFIX);
+ suffix = FindAffixes(Conf->dict, snode, word, wrdlen, &slevel,
+ FF_SUFFIX);
if (!suffix)
break;
/* foreach suffix check affix */
- for (i = 0; i < suffix->naff; i++)
+ for (i = suffix->affstart; i <= suffix->affend; i++)
{
- if (CheckAffix(word, wrdlen, suffix->aff[i], flag, newword, &baselen))
+ AFFIX *sufentry = (AFFIX *) DictAffixGet(Conf->dict, i);
+ AffixReg *sufreg = &(Conf->reg[i]);
+
+ if (sufentry &&
+ CheckAffix(word, wrdlen, sufentry, sufreg, flag, newword, &baselen,
+ Conf->dictCtx))
{
/* suffix success */
- if (FindWord(Conf, newword, suffix->aff[i]->flag, flag))
+ if (FindWord(Conf->dict, newword, AffixFieldFlag(sufentry), flag))
cur += addToResult(forms, cur, newword);
/* now we will look changed word with prefixes */
- pnode = Conf->Prefix;
+ pnode = (AffixNode *) DictPrefixNodes(Conf->dict);
plevel = 0;
swrdlen = strlen(newword);
while (pnode)
{
- prefix = FindAffixes(pnode, newword, swrdlen, &plevel, FF_PREFIX);
+ prefix = FindAffixes(Conf->dict, pnode, newword, swrdlen,
+ &plevel, FF_PREFIX);
if (!prefix)
break;
- for (j = 0; j < prefix->naff; j++)
+ for (j = prefix->affstart; j <= prefix->affend; j++)
{
- if (CheckAffix(newword, swrdlen, prefix->aff[j], flag, pnewword, &baselen))
+ AFFIX *prefentry = (AFFIX *) DictAffixGet(Conf->dict, j);
+ AffixReg *prefreg = &(Conf->reg[j]);
+
+ if (prefentry &&
+ CheckAffix(newword, swrdlen, prefentry, prefreg,
+ flag, pnewword, &baselen, Conf->dictCtx))
{
/* prefix success */
- char *ff = (prefix->aff[j]->flagflags & suffix->aff[i]->flagflags & FF_CROSSPRODUCT) ?
- VoidString : prefix->aff[j]->flag;
+ char *ff = (prefentry->flagflags & sufentry->flagflags & FF_CROSSPRODUCT) ?
+ VoidString : AffixFieldFlag(prefentry);
- if (FindWord(Conf, pnewword, ff, flag))
+ if (FindWord(Conf->dict, pnewword, ff, flag))
cur += addToResult(forms, cur, pnewword);
}
}
- pnode = prefix->node;
+ pnode = (AffixNode *) DictNodeGet(DictPrefixNodes(Conf->dict),
+ prefix->node_offset);
}
}
}
- snode = suffix->node;
+ snode = (AffixNode *) DictNodeGet(DictSuffixNodes(Conf->dict),
+ suffix->node_offset);
}
if (cur == forms)
@@ -2259,7 +2629,8 @@ typedef struct SplitVar
} SplitVar;
static int
-CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
+CheckCompoundAffixes(IspellDictData *dict, CMPDAffix **ptr,
+ char *word, int len, bool CheckInPlace)
{
bool issuffix;
@@ -2269,9 +2640,12 @@ CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
if (CheckInPlace)
{
- while ((*ptr)->affix)
+ while ((*ptr)->affix != ISPELL_INVALID_INDEX)
{
- if (len > (*ptr)->len && strncmp((*ptr)->affix, word, (*ptr)->len) == 0)
+ AFFIX *affix = (AFFIX *) DictAffixGet(dict, (*ptr)->affix);
+
+ if (len > (*ptr)->len &&
+ strncmp(AffixFieldRepl(affix), word, (*ptr)->len) == 0)
{
len = (*ptr)->len;
issuffix = (*ptr)->issuffix;
@@ -2285,9 +2659,12 @@ CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
{
char *affbegin;
- while ((*ptr)->affix)
+ while ((*ptr)->affix != ISPELL_INVALID_INDEX)
{
- if (len > (*ptr)->len && (affbegin = strstr(word, (*ptr)->affix)) != NULL)
+ AFFIX *affix = (AFFIX *) DictAffixGet(dict, (*ptr)->affix);
+
+ if (len > (*ptr)->len &&
+ (affbegin = strstr(word, AffixFieldRepl(affix))) != NULL)
{
len = (*ptr)->len + (affbegin - word);
issuffix = (*ptr)->issuffix;
@@ -2339,13 +2716,14 @@ AddStem(SplitVar *v, char *word)
}
static SplitVar *
-SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int wordlen, int startpos, int minpos)
+SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig,
+ char *word, int wordlen, int startpos, int minpos)
{
SplitVar *var = NULL;
SPNodeData *StopLow,
*StopHigh,
*StopMiddle = NULL;
- SPNode *node = (snode) ? snode : Conf->Dictionary;
+ SPNode *node = (snode) ? snode : (SPNode *) DictDictNodes(Conf->dict);
int level = (snode) ? minpos : startpos; /* recursive
* minpos==level */
int lenaff;
@@ -2360,8 +2738,11 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
while (level < wordlen)
{
/* find word with epenthetic or/and compound affix */
- caff = Conf->CompoundAffix;
- while (level > startpos && (lenaff = CheckCompoundAffixes(&caff, word + level, wordlen - level, (node) ? true : false)) >= 0)
+ caff = (CMPDAffix *) DictCompoundAffix(Conf->dict);
+ while (level > startpos &&
+ (lenaff = CheckCompoundAffixes(Conf->dict, &caff,
+ word + level, wordlen - level,
+ (node) ? true : false)) >= 0)
{
/*
* there is one of compound affixes, so check word for existings
@@ -2408,7 +2789,8 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
while (ptr->next)
ptr = ptr->next;
- ptr->next = SplitToVariants(Conf, NULL, new, word, wordlen, startpos + lenaff, startpos + lenaff);
+ ptr->next = SplitToVariants(Conf, NULL, new, word, wordlen,
+ startpos + lenaff, startpos + lenaff);
pfree(new->stem);
pfree(new);
@@ -2467,13 +2849,14 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
/* we can find next word */
level++;
AddStem(var, pnstrdup(word + startpos, level - startpos));
- node = Conf->Dictionary;
+ node = (SPNode *) DictDictNodes(Conf->dict);
startpos = level;
continue;
}
}
}
- node = StopMiddle->node;
+ node = (SPNode *) DictNodeGet(DictDictNodes(Conf->dict),
+ StopMiddle->node_offset);
}
else
node = NULL;
@@ -2523,7 +2906,7 @@ NINormalizeWord(IspellDict *Conf, char *word)
pfree(res);
}
- if (Conf->usecompound)
+ if (Conf->dict->usecompound)
{
int wordlen = strlen(word);
SplitVar *ptr,
diff --git a/src/include/tsearch/dicts/spell.h b/src/include/tsearch/dicts/spell.h
index 210f97dda9..b40cf379eb 100644
--- a/src/include/tsearch/dicts/spell.h
+++ b/src/include/tsearch/dicts/spell.h
@@ -18,21 +18,23 @@
#include "tsearch/dicts/regis.h"
#include "tsearch/ts_public.h"
+#define ISPELL_INVALID_INDEX (0x7FFFF)
+#define ISPELL_INVALID_OFFSET (0xFFFFFFFF)
+
/*
* SPNode and SPNodeData are used to represent prefix tree (Trie) to store
* a words list.
*/
-struct SPNode;
-
typedef struct
{
uint32 val:8,
isword:1,
/* Stores compound flags listed below */
compoundflag:4,
- /* Reference to an entry of the AffixData field */
+ /* Index of an entry of the AffixData field */
affix:19;
- struct SPNode *node;
+ /* Offset to a node of the DictNodes field */
+ uint32 node_offset;
} SPNodeData;
/*
@@ -86,21 +88,55 @@ typedef struct spell_struct
*/
typedef struct aff_struct
{
- char *flag;
/* FF_SUFFIX or FF_PREFIX */
- uint32 type:1,
+ uint16 type:1,
flagflags:7,
issimple:1,
isregis:1,
- replen:14;
- char *find;
- char *repl;
+ flaglen:2;
+
+ /* 8 bits could be too much for repl, find and mask, but who knows */
+ uint8 replen;
+ uint8 findlen;
+ uint8 masklen;
+
+ /*
+ * fields stores the following data (each ends with \0):
+ * - repl
+ * - find
+ * - mask
+ * - flag - one character (if FM_CHAR),
+ * two characters (if FM_LONG),
+ * number, >= 0 and < 65536 (if FM_NUM).
+ */
+ char fields[FLEXIBLE_ARRAY_MEMBER];
+} AFFIX;
+
+#define AF_FLAG_MAXSIZE 5 /* strlen(65536) */
+#define AF_REPL_MAXSIZE 255 /* 8 bytes */
+
+#define AFFIXHDRSZ (offsetof(AFFIX, fields))
+
+#define AffixFieldRepl(af) ((af)->fields)
+#define AffixFieldFind(af) ((af)->fields + (af)->replen + 1)
+#define AffixFieldMask(af) (AffixFieldFind(af) + (af)->findlen + 1)
+#define AffixFieldFlag(af) (AffixFieldMask(af) + (af)->masklen + 1)
+#define AffixGetSize(af) (AFFIXHDRSZ + (af)->replen + 1 + (af)->findlen + 1 \
+ + (af)->masklen + 1 + strlen(AffixFieldFlag(af)) + 1)
+
+/*
+ * Stores compiled regular expression of affix. AffixReg uses mask field of
+ * AFFIX as a regular expression.
+ */
+typedef struct AffixReg
+{
+ bool iscompiled;
union
{
regex_t regex;
Regis regis;
- } reg;
-} AFFIX;
+ } r;
+} AffixReg;
/*
* affixes use dictionary flags too
@@ -120,14 +156,13 @@ typedef struct aff_struct
* AffixNode and AffixNodeData are used to represent prefix tree (Trie) to store
* an affix list.
*/
-struct AffixNode;
-
typedef struct
{
- uint32 val:8,
- naff:24;
- AFFIX **aff;
- struct AffixNode *node;
+ uint8 val;
+ uint32 affstart;
+ uint32 affend;
+ /* Offset to a node of the PrefixNodes or SuffixNodes field */
+ uint32 node_offset;
} AffixNodeData;
typedef struct AffixNode
@@ -139,9 +174,19 @@ typedef struct AffixNode
#define ANHRDSZ (offsetof(AffixNode, data))
+typedef struct NodeArray
+{
+ char *Nodes;
+ uint32 NodesSize; /* allocated size of Nodes */
+ uint32 NodesEnd; /* end of data in Nodes */
+} NodeArray;
+
+#define NodeArrayGet(na, of) (((of) == ISPELL_INVALID_OFFSET) ? NULL : (na)->Nodes + (of))
+
typedef struct
{
- char *affix;
+ /* Index of an affix of the Affix field */
+ uint32 affix;
int len;
bool issuffix;
} CMPDAffix;
@@ -176,30 +221,71 @@ typedef struct CompoundAffixFlag
#define FLAGNUM_MAXSIZE (1 << 16)
-typedef struct
+typedef struct IspellDictData
{
- int maffixes;
- int naffixes;
- AFFIX *Affix;
-
- AffixNode *Suffix;
- AffixNode *Prefix;
+ FlagMode flagMode;
+ bool usecompound;
- SPNode *Dictionary;
- /* Array of sets of affixes */
- char **AffixData;
- int lenAffixData;
- int nAffixData;
bool useFlagAliases;
- CMPDAffix *CompoundAffix;
+ uint32 nAffixData;
+ uint32 AffixDataStart;
- bool usecompound;
- FlagMode flagMode;
+ uint32 AffixOffsetStart;
+ uint32 AffixStart;
+ uint32 nAffix;
+
+ uint32 DictNodesStart;
+ uint32 PrefixNodesStart;
+ uint32 SuffixNodesStart;
+
+ uint32 CompoundAffixStart;
/*
- * All follow fields are actually needed only for initialization
+ * data stores:
+ * - AffixData - array of affix sets
+ * - Affix - sorted array of affixes
+ * - DictNodes - prefix tree of a word list
+ * - PrefixNodes - prefix tree of a prefix list
+ * - SuffixNodes - prefix tree of a suffix list
+ * - CompoundAffix - array of compound affixes
*/
+ char data[FLEXIBLE_ARRAY_MEMBER];
+} IspellDictData;
+
+#define IspellDictDataHdrSize (offsetof(IspellDictData, data))
+
+#define DictAffixDataOffset(d) ((d)->data)
+#define DictAffixData(d) ((d)->data + (d)->AffixDataStart)
+#define DictAffixDataGet(d, i) (((i) == ISPELL_INVALID_INDEX) ? NULL : \
+ DictAffixData(d) + ((uint32 *) DictAffixDataOffset(d))[i])
+
+#define DictAffixOffset(d) ((d)->data + (d)->AffixOffsetStart)
+#define DictAffix(d) ((d)->data + (d)->AffixStart)
+#define DictAffixGet(d, i) (((i) == ISPELL_INVALID_INDEX) ? NULL : \
+ DictAffix(d) + ((uint32 *) DictAffixOffset(d))[i])
+
+#define DictDictNodes(d) ((d)->data + (d)->DictNodesStart)
+#define DictPrefixNodes(d) ((d)->data + (d)->PrefixNodesStart)
+#define DictSuffixNodes(d) ((d)->data + (d)->SuffixNodesStart)
+#define DictNodeGet(node_start, of) (((of) == ISPELL_INVALID_OFFSET) ? NULL : \
+ (char *) (node_start) + (of))
+
+#define DictCompoundAffix(d) ((d)->data + (d)->CompoundAffixStart)
+
+/*
+ * IspellDictBuild is used to initialize IspellDictData struct. This is a
+ * temporary structure which is set up by NIStartBuild() and released by
+ * NIFinishBuild().
+ */
+typedef struct IspellDictBuild
+{
+ MemoryContext buildCxt; /* temp context for construction */
+
+ IspellDictData *dict;
+ uint32 dict_size;
+
+ /* Temporary data */
/* Array of Hunspell options in affix file */
CompoundAffixFlag *CompoundAffixFlags;
@@ -208,29 +294,73 @@ typedef struct
/* allocated length of CompoundAffixFlags array */
int mCompoundAffixFlag;
- /*
- * Remaining fields are only used during dictionary construction; they are
- * set up by NIStartBuild and cleared by NIFinishBuild.
- */
- MemoryContext buildCxt; /* temp context for construction */
-
- /* Temporary array of all words in the dict file */
+ /* Array of all words in the dict file */
SPELL **Spell;
- int nspell; /* number of valid entries in Spell array */
- int mspell; /* allocated length of Spell array */
+ int nSpell; /* number of valid entries in Spell array */
+ int mSpell; /* allocated length of Spell array */
+
+ /* Array of all affixes in the aff file */
+ AFFIX **Affix;
+ int nAffix; /* number of valid entries in Affix array */
+ int mAffix; /* allocated length of Affix array */
+ uint32 AffixSize;
+
+ /* Data for IspellDictData */
+
+ /* Array of sets of affixes */
+ uint32 *AffixDataOffset;
+ int nAffixData; /* number of affix sets */
+ int mAffixData; /* allocated number of affix sets */
+ char *AffixData;
+ uint32 AffixDataSize; /* allocated size of AffixData */
+ uint32 AffixDataEnd; /* end of data in AffixData */
+
+ /* Prefix tree which stores a word list */
+ NodeArray DictNodes;
+
+ /* Prefix tree which stores a prefix list */
+ NodeArray PrefixNodes;
+
+ /* Prefix tree which stores a suffix list */
+ NodeArray SuffixNodes;
- /* These are used to allocate "compact" data without palloc overhead */
- char *firstfree; /* first free address (always maxaligned) */
- size_t avail; /* free space remaining at firstfree */
+ /* Array of compound affixes */
+ CMPDAffix *CompoundAffix;
+ int nCompoundAffix; /* number of entries of CompoundAffix */
+} IspellDictBuild;
+
+#define AffixDataGet(d, i) ((d)->AffixData + (d)->AffixDataOffset[i])
+
+/*
+ * IspellDict is used within NINormalizeWord.
+ */
+typedef struct IspellDict
+{
+ /*
+ * Pointer to a DSM location of IspellDictData. Should be retrieved per
+ * every dispell_lexize() call.
+ */
+ IspellDictData *dict;
+ /*
+ * Array of regular expression of affixes. Each regular expression is
+ * compiled only on demand.
+ */
+ AffixReg *reg;
+ /*
+ * Memory context for compiling regular expressions.
+ */
+ MemoryContext dictCtx;
} IspellDict;
extern TSLexeme *NINormalizeWord(IspellDict *Conf, char *word);
-extern void NIStartBuild(IspellDict *Conf);
-extern void NIImportAffixes(IspellDict *Conf, const char *filename);
-extern void NIImportDictionary(IspellDict *Conf, const char *filename);
-extern void NISortDictionary(IspellDict *Conf);
-extern void NISortAffixes(IspellDict *Conf);
-extern void NIFinishBuild(IspellDict *Conf);
+extern void NIStartBuild(IspellDictBuild *ConfBuild);
+extern void NIImportAffixes(IspellDictBuild *ConfBuild, const char *filename);
+extern void NIImportDictionary(IspellDictBuild *ConfBuild,
+ const char *filename);
+extern void NISortDictionary(IspellDictBuild *ConfBuild);
+extern void NISortAffixes(IspellDictBuild *ConfBuild);
+extern void NICopyData(IspellDictBuild *ConfBuild);
+extern void NIFinishBuild(IspellDictBuild *ConfBuild);
#endif
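The header changes above replace pointers with offsets, so that the structures can live in a DSM segment which may be mapped at a different address in each backend. Below is a minimal, self-contained sketch of that idea. It is not the patch's code: NodeBuf, Node, node_alloc(), node_get() and build_chain() are hypothetical names used only for illustration, and plain realloc() stands in for repalloc().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define INVALID_OFFSET 0xFFFFFFFFu

/* Growable flat buffer of nodes (hypothetical analogue of NodeArray) */
typedef struct
{
	char	   *nodes;
	uint32_t	size;		/* allocated size in bytes */
	uint32_t	end;		/* end of used data in bytes */
} NodeBuf;

/* A node refers to its child by offset, never by pointer */
typedef struct
{
	uint8_t		val;
	uint32_t	child_offset;
} Node;

static Node *
node_get(NodeBuf *buf, uint32_t off)
{
	return (off == INVALID_OFFSET) ? NULL : (Node *) (buf->nodes + off);
}

/* Append one node, growing the buffer if needed; returns its offset */
static uint32_t
node_alloc(NodeBuf *buf, uint8_t val)
{
	uint32_t	off = buf->end;
	Node	   *n;

	if (buf->end + sizeof(Node) > buf->size)
	{
		uint32_t	newsize = buf->size ? buf->size * 2 : (uint32_t) sizeof(Node);
		char	   *p = realloc(buf->nodes, newsize);

		if (p == NULL)
			abort();
		buf->nodes = p;			/* the buffer may have moved */
		buf->size = newsize;
	}

	n = (Node *) (buf->nodes + off);
	n->val = val;
	n->child_offset = INVALID_OFFSET;
	buf->end += (uint32_t) sizeof(Node);
	return off;
}

/* Build a chain of nodes, one per character; returns the root offset */
static uint32_t
build_chain(NodeBuf *buf, const char *s)
{
	uint32_t	root = INVALID_OFFSET;
	uint32_t	prev = INVALID_OFFSET;

	for (; *s; s++)
	{
		uint32_t	cur = node_alloc(buf, (uint8_t) *s);

		if (prev == INVALID_OFFSET)
			root = cur;
		else
			/* re-derive the pointer: node_alloc() may have moved the buffer */
			node_get(buf, prev)->child_offset = cur;
		prev = cur;
	}
	return root;
}
```

Because only offsets are stored, the whole buffer can be copied with memcpy() to another location and traversed there with the same offsets, which is the property a shared segment needs.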
Hello all,
I'd like to add a new optional function, fini, to the text search template, in
addition to init() and lexize(). It will be called by
RemoveTSDictionaryById() and AlterTSDictionary(). dispell_fini() will call
ts_dict_shmem_release().
It doesn't change the segment-leaking situation, but I think it makes the text
search API more transparent.
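A rough sketch of how an optional fini callback could sit next to init() and lexize(). All names here (TSTemplateSketch, demo_init(), demo_lexize(), demo_fini(), drop_dictionary()) are hypothetical and only illustrate the calling convention; they are not the actual catalog or function-manager interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical template descriptor: fini is optional and may be NULL */
typedef struct
{
	void	   *(*init) (const char *options);
	const char *(*lexize) (void *dict, const char *token);
	void		(*fini) (void *dict);
} TSTemplateSketch;

static int	released = 0;	/* counts how many times fini ran */
static int	demo_state;		/* dummy private dictionary state */

static void *
demo_init(const char *options)
{
	(void) options;
	return &demo_state;
}

static const char *
demo_lexize(void *dict, const char *token)
{
	(void) dict;
	return token;			/* identity "normalization" */
}

static void
demo_fini(void *dict)
{
	(void) dict;
	released++;				/* a real fini would release the shared segment */
}

/* Stand-in for the DROP/ALTER path: call fini only if the template has one */
static void
drop_dictionary(const TSTemplateSketch *tmpl, void *dict)
{
	if (tmpl->fini != NULL)
		tmpl->fini(dict);
}
```

Templates that do not define fini keep working unchanged, since the drop path simply skips the call.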
I'll update the existing documentation. I think I can also add text search
API documentation in the 2018-09 commitfest, since Tom noted that it doesn't
exist.
Any thoughts?
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
On 03/31/2018 12:42 PM, Arthur Zakirov wrote:
Hello all,
I'd like to add new optional function to text search template named fini
in addition to init() and lexize(). It will be called by
RemoveTSDictionaryById() and AlterTSDictionary(). dispell_fini() will
call ts_dict_shmem_release().
It doesn't change segments leaking situation. I think it makes text
search API more transparent.
If it doesn't actually solve the problem, why add it? I don't see a
point in adding functions for the sake of transparency, when it does not
in fact serve any use cases.
Can't we handle the segment-leaking by adding some sort of tombstone?
For example, imagine that instead of removing the hash table entry we
mark it as 'dropped'. And after that, after the lookup we would know the
dictionary was removed, and the backends would load the dictionary into
their private memory.
Of course, this could mean we end up having many tombstones in the hash
table. But those tombstones would be tiny, making it less painful than
potentially leaking much more memory for the dictionaries.
Also, I wonder if we might actually remove the dictionaries after a
while, e.g. based on XID. Imagine that we note the XID of the
transaction removing the dictionary, or perhaps XID of the most recent
running transaction. Then we could use this to decide if all running
transactions actually see the DROP, and we could remove the tombstone.
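To make the tombstone idea concrete, here is a small sketch under stated assumptions: a fixed-size table instead of a real shared hash table, and hypothetical names throughout (DictTable, table_add(), table_drop(), table_lookup(), table_prune(), oldest_running_xid). The plain "<" comparison ignores XID wraparound, which a real implementation would have to handle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_ENTRIES 8

typedef enum
{
	LOOKUP_MISSING,			/* never loaded, or tombstone already pruned */
	LOOKUP_SHARED,			/* use the shared copy */
	LOOKUP_DROPPED			/* tombstone: fall back to private memory */
} LookupResult;

typedef struct
{
	bool		used;
	uint32_t	dict_id;
	bool		dropped;	/* tombstone marker */
	uint32_t	drop_xid;	/* XID of the dropping transaction */
} DictEntry;

typedef struct
{
	DictEntry	entries[MAX_ENTRIES];
} DictTable;

static DictEntry *
find_entry(DictTable *t, uint32_t dict_id)
{
	for (size_t i = 0; i < MAX_ENTRIES; i++)
		if (t->entries[i].used && t->entries[i].dict_id == dict_id)
			return &t->entries[i];
	return NULL;
}

static bool
table_add(DictTable *t, uint32_t dict_id)
{
	if (find_entry(t, dict_id) != NULL)
		return false;
	for (size_t i = 0; i < MAX_ENTRIES; i++)
	{
		if (!t->entries[i].used)
		{
			t->entries[i].used = true;
			t->entries[i].dict_id = dict_id;
			t->entries[i].dropped = false;
			t->entries[i].drop_xid = 0;
			return true;
		}
	}
	return false;			/* table is full */
}

/* Instead of removing the entry, mark it as a tombstone */
static void
table_drop(DictTable *t, uint32_t dict_id, uint32_t xid)
{
	DictEntry  *e = find_entry(t, dict_id);

	if (e != NULL)
	{
		e->dropped = true;
		e->drop_xid = xid;
	}
}

static LookupResult
table_lookup(DictTable *t, uint32_t dict_id)
{
	DictEntry  *e = find_entry(t, dict_id);

	if (e == NULL)
		return LOOKUP_MISSING;
	return e->dropped ? LOOKUP_DROPPED : LOOKUP_SHARED;
}

/* Remove tombstones whose drop is older than every running transaction */
static int
table_prune(DictTable *t, uint32_t oldest_running_xid)
{
	int			removed = 0;

	for (size_t i = 0; i < MAX_ENTRIES; i++)
	{
		DictEntry  *e = &t->entries[i];

		if (e->used && e->dropped && e->drop_xid < oldest_running_xid)
		{
			e->used = false;
			removed++;
		}
	}
	return removed;
}
```

A tombstone here costs only a few bytes per dropped dictionary, which is the trade-off described above.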
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Tomas Vondra wrote:
On 03/31/2018 12:42 PM, Arthur Zakirov wrote:
Hello all,
I'd like to add new optional function to text search template named fini
in addition to init() and lexize(). It will be called by
RemoveTSDictionaryById() and AlterTSDictionary(). dispell_fini() will
call ts_dict_shmem_release().
It doesn't change segments leaking situation. I think it makes text
search API more transparent.
If it doesn't actually solve the problem, why add it? I don't see a
point in adding functions for the sake of transparency, when it does not
in fact serve any use cases.
It doesn't solve the problem, but it makes things clearer: if a dictionary
requested a shared location, then it should release/unpin it. There is no
such scenario yet, but someone might want to release not only the shared
segment but also other private data.
Can't we handle the segment-leaking by adding some sort of tombstone?
Interestingly, such tombstones already exist, even without the patch.
TSDictionaryCacheEntry entries aren't deleted after DROP; they are
just marked isvalid = false.
For example, imagine that instead of removing the hash table entry we
mark it as 'dropped'. And after that, after the lookup we would know the
dictionary was removed, and the backends would load the dictionary into
their private memory.
Of course, this could mean we end up having many tombstones in the hash
table. But those tombstones would be tiny, making it less painful than
potentially leaking much more memory for the dictionaries.
Actually, right now it isn't guaranteed that the hash table entry will be
removed, even if refcnt is 0. So I think I should remove refcnt, and entries
won't be removed.
There are no big problems with leaking now. Memory may leak only if a
dictionary was dropped or altered, there is no more text search workload,
and the backend is still alive. That is because the next use of text search
functions will unpin segments previously used for invalid dictionaries
(isvalid == false). The segment is also unpinned if the backend terminates.
The segment is destroyed when all interested processes unpin it (as Tom
noted), and the hash table entry becomes a tombstone.
I hope I described it clearly.
Also, I wonder if we might actually remove the dictionaries after a
while, e.g. based on XID. Imagine that we note the XID of the
transaction removing the dictionary, or perhaps XID of the most recent
running transaction. Then we could use this to decide if all running
transactions actually see the DROP, and we could remove the tombstone.
Maybe autovacuum should work here too :) That is a joke, of course. I'm not
very familiar with how dead tuples are removed, but I think this is a similar case.
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
On Thu, Mar 29, 2018 at 02:03:07AM +0300, Arthur Zakirov wrote:
Here is the new version of the patch.
Please find the attached new version of the patch.
I removed refcnt because it is useless: it doesn't guarantee that a hash
table entry will be removed.
I fixed a bug: dsm_unpin_segment() could be called twice if the transaction
that called it was aborted and another transaction then called
ts_dict_shmem_release(). I added segment_ispinned to fix it.
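The guard can be illustrated with a small self-contained mock (plain C, not the patch itself). MockSegment, mock_unpin() and entry_release() are hypothetical stand-ins; only the segment_ispinned flag mirrors the patch. The point is that the flag lives in the shared entry, which is not rolled back on abort, so a second release attempt becomes a no-op instead of a second unpin.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the pin state of a DSM segment. */
typedef struct
{
	int			pin_count;
} MockSegment;

/* Stand-in for a shared hash table entry. */
typedef struct
{
	MockSegment *seg;
	bool		segment_ispinned;	/* same role as in TsearchDictEntry */
} MockDictEntry;

/* Unpinning an unpinned segment is modelled as an assertion failure. */
static void
mock_unpin(MockSegment *seg)
{
	assert(seg->pin_count > 0);
	seg->pin_count--;
}

/*
 * Release the segment pin at most once.  Returns true if this call
 * actually unpinned the segment.
 */
static bool
entry_release(MockDictEntry *entry)
{
	if (!entry->segment_ispinned)
		return false;			/* already released by an earlier caller */

	mock_unpin(entry->seg);
	entry->segment_ispinned = false;
	return true;
}
```

Without the flag check, the second entry_release() call would reach mock_unpin() with pin_count already at 0, which is the double-unpin case described above.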
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Attachments:
0001-Fix-ispell-memory-handling-v13.patch (text/plain; charset=us-ascii)
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index b9fdd77e19..e071994523 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -78,6 +78,8 @@
#define tmpalloc(sz) MemoryContextAlloc(Conf->buildCxt, (sz))
#define tmpalloc0(sz) MemoryContextAllocZero(Conf->buildCxt, (sz))
+#define tmpstrdup(str) MemoryContextStrdup(Conf->buildCxt, (str))
+
/*
* Prepare for constructing an ISpell dictionary.
*
@@ -498,7 +500,7 @@ NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
strcpy(Conf->Spell[Conf->nspell]->word, word);
Conf->Spell[Conf->nspell]->p.flag = (*flag != '\0')
- ? cpstrdup(Conf, flag) : VoidString;
+ ? tmpstrdup(flag) : VoidString;
Conf->nspell++;
}
@@ -1040,7 +1042,7 @@ setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
entry->flag.i = i;
}
else
- entry->flag.s = cpstrdup(Conf, s);
+ entry->flag.s = tmpstrdup(s);
entry->flagMode = Conf->flagMode;
entry->value = val;
@@ -1536,6 +1538,9 @@ nextline:
return;
isnewformat:
+ pfree(recoded);
+ pfree(pstr);
+
if (oldformat)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
0002-Change-tmplinit-argument-v13.patch (text/plain; charset=us-ascii)
diff --git a/contrib/dict_int/dict_int.c b/contrib/dict_int/dict_int.c
index 56ede37089..8dd4959028 100644
--- a/contrib/dict_int/dict_int.c
+++ b/contrib/dict_int/dict_int.c
@@ -30,7 +30,7 @@ PG_FUNCTION_INFO_V1(dintdict_lexize);
Datum
dintdict_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictInt *d;
ListCell *l;
@@ -38,7 +38,7 @@ dintdict_init(PG_FUNCTION_ARGS)
d->maxlen = 6;
d->rejectlong = false;
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/contrib/dict_xsyn/dict_xsyn.c b/contrib/dict_xsyn/dict_xsyn.c
index a79ece240c..0b8a32d459 100644
--- a/contrib/dict_xsyn/dict_xsyn.c
+++ b/contrib/dict_xsyn/dict_xsyn.c
@@ -140,7 +140,7 @@ read_dictionary(DictSyn *d, const char *filename)
Datum
dxsyn_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSyn *d;
ListCell *l;
char *filename = NULL;
@@ -153,7 +153,7 @@ dxsyn_init(PG_FUNCTION_ARGS)
d->matchsynonyms = false;
d->keepsynonyms = true;
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/contrib/unaccent/unaccent.c b/contrib/unaccent/unaccent.c
index 247c202755..2a2fbee5fa 100644
--- a/contrib/unaccent/unaccent.c
+++ b/contrib/unaccent/unaccent.c
@@ -267,12 +267,12 @@ PG_FUNCTION_INFO_V1(unaccent_init);
Datum
unaccent_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
TrieChar *rootTrie = NULL;
bool fileloaded = false;
ListCell *l;
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/commands/tsearchcmds.c b/src/backend/commands/tsearchcmds.c
index 3a843512d1..3753e32b2c 100644
--- a/src/backend/commands/tsearchcmds.c
+++ b/src/backend/commands/tsearchcmds.c
@@ -386,17 +386,25 @@ verify_dictoptions(Oid tmplId, List *dictoptions)
}
else
{
+ DictInitData init_data;
+
/*
* Copy the options just in case init method thinks it can scribble on
* them ...
*/
dictoptions = copyObject(dictoptions);
+ init_data.dict_options = dictoptions;
+ init_data.dict.id = InvalidOid;
+ init_data.dict.xmin = InvalidTransactionId;
+ init_data.dict.xmax = InvalidTransactionId;
+ ItemPointerSetInvalid(&init_data.dict.tid);
+
/*
* Call the init method and see if it complains. We don't worry about
* it leaking memory, since our command will soon be over anyway.
*/
- (void) OidFunctionCall1(initmethod, PointerGetDatum(dictoptions));
+ (void) OidFunctionCall1(initmethod, PointerGetDatum(&init_data));
}
ReleaseSysCache(tup);
diff --git a/src/backend/snowball/dict_snowball.c b/src/backend/snowball/dict_snowball.c
index 78c9f73ef0..15ebafd833 100644
--- a/src/backend/snowball/dict_snowball.c
+++ b/src/backend/snowball/dict_snowball.c
@@ -181,14 +181,14 @@ locate_stem_module(DictSnowball *d, const char *lang)
Datum
dsnowball_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSnowball *d;
bool stoploaded = false;
ListCell *l;
d = (DictSnowball *) palloc0(sizeof(DictSnowball));
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index edc6547700..39f1e6faeb 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -29,7 +29,7 @@ typedef struct
Datum
dispell_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictISpell *d;
bool affloaded = false,
dictloaded = false,
@@ -40,7 +40,7 @@ dispell_init(PG_FUNCTION_ARGS)
NIStartBuild(&(d->obj));
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_simple.c b/src/backend/tsearch/dict_simple.c
index ac6a24eba5..9605108334 100644
--- a/src/backend/tsearch/dict_simple.c
+++ b/src/backend/tsearch/dict_simple.c
@@ -29,7 +29,7 @@ typedef struct
Datum
dsimple_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSimple *d = (DictSimple *) palloc0(sizeof(DictSimple));
bool stoploaded = false,
acceptloaded = false;
@@ -37,7 +37,7 @@ dsimple_init(PG_FUNCTION_ARGS)
d->accept = true; /* default */
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_synonym.c b/src/backend/tsearch/dict_synonym.c
index c011886cb0..02989cd16b 100644
--- a/src/backend/tsearch/dict_synonym.c
+++ b/src/backend/tsearch/dict_synonym.c
@@ -91,7 +91,7 @@ compareSyn(const void *a, const void *b)
Datum
dsynonym_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSyn *d;
ListCell *l;
char *filename = NULL;
@@ -104,7 +104,7 @@ dsynonym_init(PG_FUNCTION_ARGS)
char *line = NULL;
uint16 flags = 0;
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_thesaurus.c b/src/backend/tsearch/dict_thesaurus.c
index 24364e646d..1604b5f60f 100644
--- a/src/backend/tsearch/dict_thesaurus.c
+++ b/src/backend/tsearch/dict_thesaurus.c
@@ -604,7 +604,7 @@ compileTheSubstitute(DictThesaurus *d)
Datum
thesaurus_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictThesaurus *d;
char *subdictname = NULL;
bool fileloaded = false;
@@ -612,7 +612,7 @@ thesaurus_init(PG_FUNCTION_ARGS)
d = (DictThesaurus *) palloc0(sizeof(DictThesaurus));
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/utils/cache/ts_cache.c b/src/backend/utils/cache/ts_cache.c
index 97347780d3..b2a0105ee8 100644
--- a/src/backend/utils/cache/ts_cache.c
+++ b/src/backend/utils/cache/ts_cache.c
@@ -39,6 +39,7 @@
#include "catalog/pg_ts_template.h"
#include "commands/defrem.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_public.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
#include "utils/fmgroids.h"
@@ -75,6 +76,7 @@ static TSConfigCacheEntry *lastUsedConfig = NULL;
char *TSCurrentConfig = NULL;
static Oid TSCurrentConfigCache = InvalidOid;
+static bool has_invalid_dictionary = false;
/*
@@ -98,7 +100,12 @@ InvalidateTSCacheCallBack(Datum arg, int cacheid, uint32 hashvalue)
hash_seq_init(&status, hash);
while ((entry = (TSAnyCacheEntry *) hash_seq_search(&status)) != NULL)
+ {
+ if (cacheid == TSDICTOID)
+ has_invalid_dictionary = true;
+
entry->isvalid = false;
+ }
/* Also invalidate the current-config cache if it's pg_ts_config */
if (hash == TSConfigCacheHash)
@@ -312,11 +319,15 @@ lookup_ts_dictionary_cache(Oid dictId)
MemSet(entry, 0, sizeof(TSDictionaryCacheEntry));
entry->dictId = dictId;
entry->dictCtx = saveCtx;
+ entry->dict_xmin = HeapTupleHeaderGetRawXmin(tpdict->t_data);
+ entry->dict_xmax = HeapTupleHeaderGetRawXmax(tpdict->t_data);
+ entry->dict_tid = tpdict->t_self;
entry->lexizeOid = template->tmpllexize;
if (OidIsValid(template->tmplinit))
{
+ DictInitData init_data;
List *dictoptions;
Datum opt;
bool isnull;
@@ -336,9 +347,15 @@ lookup_ts_dictionary_cache(Oid dictId)
else
dictoptions = deserialize_deflist(opt);
+ init_data.dict_options = dictoptions;
+ init_data.dict.id = dictId;
+ init_data.dict.xmin = entry->dict_xmin;
+ init_data.dict.xmax = entry->dict_xmax;
+ init_data.dict.tid = entry->dict_tid;
+
entry->dictData =
DatumGetPointer(OidFunctionCall1(template->tmplinit,
- PointerGetDatum(dictoptions)));
+ PointerGetDatum(&init_data)));
MemoryContextSwitchTo(oldcontext);
}
diff --git a/src/include/tsearch/ts_cache.h b/src/include/tsearch/ts_cache.h
index 410f1d54af..45ed570864 100644
--- a/src/include/tsearch/ts_cache.h
+++ b/src/include/tsearch/ts_cache.h
@@ -54,6 +54,10 @@ typedef struct TSDictionaryCacheEntry
Oid dictId;
bool isvalid;
+ TransactionId dict_xmin; /* XMIN of the dictionary's tuple */
+ TransactionId dict_xmax; /* XMAX of the dictionary's tuple */
+ ItemPointerData dict_tid; /* TID of the dictionary's tuple */
+
/* most frequent fmgr call */
Oid lexizeOid;
FmgrInfo lexize;
diff --git a/src/include/tsearch/ts_public.h b/src/include/tsearch/ts_public.h
index 0b7a5aa68e..363226c936 100644
--- a/src/include/tsearch/ts_public.h
+++ b/src/include/tsearch/ts_public.h
@@ -13,6 +13,8 @@
#ifndef _PG_TS_PUBLIC_H_
#define _PG_TS_PUBLIC_H_
+#include "nodes/pg_list.h"
+#include "storage/itemptr.h"
#include "tsearch/ts_type.h"
/*
@@ -81,10 +83,70 @@ extern void readstoplist(const char *fname, StopList *s,
extern bool searchstoplist(StopList *s, char *key);
/*
- * Interface with dictionaries
+ * API for text search dictionaries.
+ *
+ * API functions to handle a text search dictionary are defined by a text search
+ * template. Currently an existing template cannot be altered in order to
+ * define other functions. The API consists of the following functions:
+ * - init function - optional function which initializes internal structures of
+ * the dictionary. It accepts DictInitData structure as an argument and must
+ * return a custom palloc'd structure which stores content of the processed
+ * dictionary and is used in lexize function.
+ * - lexize function - normalizes a single word (token) using specific
+ * dictionary. It must return a palloc'd array of TSLexeme the last entry of
+ * which is the terminating entry and accepts the following arguments:
+ * - dictData - pointer to a custom structure returned by init function or
+ * NULL if init function wasn't defined by the template.
+ * - token - string which represents a token to normalize, isn't
+ * null-terminated.
+ * - length - length of token.
+ * - dictState - pointer to a DictSubState structure which stores current
+ * state of a set of tokens processing and allows to normalize phrases.
+ */
+
+/*
+ * A preprocessed dictionary can be stored in shared memory using DSM. Does
+ * the dictionary want it decides init function. A DSM segment is released if
+ * the dictionary was altered or droppped. But still there is a situation when
+ * we haven't a way to prevent a segment leaking. It may happen if the
+ * dictionary was dropped, some backend used the dictionary before dropping, the
+ * backend will hold its DSM segment till disconnecting or calling
+ * lookup_ts_dictionary_cache(), where invalid segment is unpinned.
+ *
+ * DictPointerData is a structure to search a dictionary's DSM segment. We
+ * need xmin, xmax and tid to be sure that the content in the DSM segment still
+ * valid.
+ */
+typedef struct
+{
+ Oid id; /* OID of dictionary which is processed */
+ TransactionId xmin; /* XMIN of the dictionary's tuple */
+ TransactionId xmax; /* XMAX of the dictionary's tuple */
+ ItemPointerData tid; /* TID of the dictionary's tuple */
+} DictPointerData;
+
+/*
+ * API structure for a dictionary initialization. It is passed as an argument
+ * to a template's init function.
*/
+typedef struct
+{
+ /*
+ * A dictionary option list for a template's init method. Should go first
+ * for backward compatibility.
+ */
+ List *dict_options;
+ /*
+ * A dictionary information used to allocate, search and release its DSM
+ * segment.
+ */
+ DictPointerData dict;
+} DictInitData;
-/* return struct for any lexize function */
+/*
+ * Return struct for any lexize function. They are combined into an array, the
+ * last entry is the terminating entry.
+ */
typedef struct
{
/*----------
@@ -108,7 +170,8 @@ typedef struct
uint16 flags; /* See flag bits below */
- char *lexeme; /* C string */
+ char *lexeme; /* C string or NULL if it is a terminating
+ * entry */
} TSLexeme;
/* Flag bits that can appear in TSLexeme.flags */
0003-Retreive-shared-location-for-dict-v13.patch (text/plain; charset=us-ascii)
diff --git a/src/backend/commands/tsearchcmds.c b/src/backend/commands/tsearchcmds.c
index 3753e32b2c..ef6cabcc1e 100644
--- a/src/backend/commands/tsearchcmds.c
+++ b/src/backend/commands/tsearchcmds.c
@@ -39,6 +39,7 @@
#include "nodes/makefuncs.h"
#include "parser/parse_func.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_shared.h"
#include "tsearch/ts_utils.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
@@ -510,6 +511,7 @@ RemoveTSDictionaryById(Oid dictId)
{
Relation relation;
HeapTuple tup;
+ DictPointerData dict;
relation = heap_open(TSDictionaryRelationId, RowExclusiveLock);
@@ -521,6 +523,18 @@ RemoveTSDictionaryById(Oid dictId)
CatalogTupleDelete(relation, &tup->t_self);
+ /*
+ * We need to release the dictionary's DSM segment. The segment may still
+ * leak: if some backend used the dictionary before the drop, that backend
+ * will hold its DSM segment until it disconnects or calls
+ * lookup_ts_dictionary_cache().
+ */
+ dict.id = dictId;
+ dict.xmin = HeapTupleHeaderGetRawXmin(tup->t_data);
+ dict.xmax = HeapTupleHeaderGetRawXmax(tup->t_data);
+ dict.tid = tup->t_self;
+ ts_dict_shmem_release(&dict, true);
+
ReleaseSysCache(tup);
heap_close(relation, RowExclusiveLock);
@@ -544,6 +558,7 @@ AlterTSDictionary(AlterTSDictionaryStmt *stmt)
bool repl_null[Natts_pg_ts_dict];
bool repl_repl[Natts_pg_ts_dict];
ObjectAddress address;
+ DictPointerData dict;
dictId = get_ts_dict_oid(stmt->dictname, false);
@@ -630,6 +645,18 @@ AlterTSDictionary(AlterTSDictionaryStmt *stmt)
ObjectAddressSet(address, TSDictionaryRelationId, dictId);
+ /*
+ * We need to release the dictionary's DSM segment. The segment isn't valid
+ * anymore. The segment may still leak: if some backend used the dictionary
+ * before it was altered, that backend will hold its DSM segment until it
+ * disconnects or calls lookup_ts_dictionary_cache().
+ */
+ dict.id = dictId;
+ dict.xmin = HeapTupleHeaderGetRawXmin(tup->t_data);
+ dict.xmax = HeapTupleHeaderGetRawXmax(tup->t_data);
+ dict.tid = tup->t_self;
+ ts_dict_shmem_release(&dict, true);
+
/*
* NOTE: because we only support altering the options, not the template,
* there is no need to update dependencies. This might have to change if
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 0c86a581c0..c7dce8cac5 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -44,6 +44,7 @@
#include "storage/procsignal.h"
#include "storage/sinvaladt.h"
#include "storage/spin.h"
+#include "tsearch/ts_shared.h"
#include "utils/backend_random.h"
#include "utils/snapmgr.h"
@@ -150,6 +151,7 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
size = add_size(size, SyncScanShmemSize());
size = add_size(size, AsyncShmemSize());
size = add_size(size, BackendRandomShmemSize());
+ size = add_size(size, TsearchShmemSize());
#ifdef EXEC_BACKEND
size = add_size(size, ShmemBackendArraySize());
#endif
@@ -271,6 +273,11 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
AsyncShmemInit();
BackendRandomShmemInit();
+ /*
+ * Set up shared memory for tsearch
+ */
+ TsearchShmemInit();
+
#ifdef EXEC_BACKEND
/*
diff --git a/src/backend/tsearch/Makefile b/src/backend/tsearch/Makefile
index 227468ae9e..860cd196e9 100644
--- a/src/backend/tsearch/Makefile
+++ b/src/backend/tsearch/Makefile
@@ -26,7 +26,7 @@ DICTFILES_PATH=$(addprefix dicts/,$(DICTFILES))
OBJS = ts_locale.o ts_parse.o wparser.o wparser_def.o dict.o \
dict_simple.o dict_synonym.o dict_thesaurus.o \
dict_ispell.o regis.o spell.o \
- to_tsany.o ts_selfuncs.o ts_typanalyze.o ts_utils.o
+ to_tsany.o ts_selfuncs.o ts_shared.o ts_typanalyze.o ts_utils.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/tsearch/ts_shared.c b/src/backend/tsearch/ts_shared.c
new file mode 100644
index 0000000000..4b8933628c
--- /dev/null
+++ b/src/backend/tsearch/ts_shared.c
@@ -0,0 +1,377 @@
+/*-------------------------------------------------------------------------
+ *
+ * ts_shared.c
+ * tsearch shared memory management
+ *
+ * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/tsearch/ts_shared.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/hash.h"
+#include "lib/dshash.h"
+#include "miscadmin.h"
+#include "storage/lwlock.h"
+#include "storage/shmem.h"
+#include "tsearch/ts_shared.h"
+#include "utils/hashutils.h"
+#include "utils/memutils.h"
+
+
+/*
+ * Hash table entries key.
+ */
+typedef struct
+{
+ Oid db_id;
+ DictPointerData dict;
+} TsearchDictKey;
+
+/*
+ * Hash table entries representing shared dictionaries.
+ */
+typedef struct
+{
+ TsearchDictKey key;
+ dsm_handle dict_dsm;
+
+ /*
+ * We need a flag that the DSM segment is pinned/unpinned. Otherwise we can
+ * face double dsm_unpin_segment().
+ */
+ bool segment_ispinned;
+} TsearchDictEntry;
+
+static dshash_table *dict_table = NULL;
+
+/*
+ * Information about the main shmem segment, used to coordinate
+ * access to the hash table and dictionaries.
+ */
+typedef struct
+{
+ dsa_handle area;
+ dshash_table_handle dict_table_handle;
+
+ LWLock lock;
+} TsearchCtlData;
+
+static TsearchCtlData *tsearch_ctl;
+
+static int tsearch_dict_cmp(const void *a, const void *b, size_t size,
+ void *arg);
+static uint32 tsearch_dict_hash(const void *a, size_t size, void *arg);
+static void init_dict_table(void);
+
+/* Parameters for dict_table */
+static const dshash_parameters dict_table_params ={
+ sizeof(TsearchDictKey),
+ sizeof(TsearchDictEntry),
+ tsearch_dict_cmp,
+ tsearch_dict_hash,
+ LWTRANCHE_TSEARCH_TABLE
+};
+
+/*
+ * Build the dictionary using allocate_cb callback.
+ *
+ * Firstly try to find the dictionary in shared hash table. If it was built by
+ * someone earlier just return its location in DSM.
+ *
+ * init_data: an argument used within a template's init method.
+ * allocate_cb: function to build the dictionary, if it wasn't found in DSM.
+ *
+ * Returns address in the dynamic shared memory segment or in backend memory.
+ */
+void *
+ts_dict_shmem_location(DictInitData *init_data,
+ ts_dict_build_callback allocate_cb)
+{
+ TsearchDictKey key;
+ TsearchDictEntry *entry;
+ bool found;
+ dsm_segment *seg;
+ void *dict,
+ *dict_location;
+ Size dict_size;
+
+ init_dict_table();
+
+ /*
+ * Build the dictionary in backend's memory if dictid is invalid (it may
+ * happen if the dictionary's init method was called within
+ * verify_dictoptions()).
+ */
+ if (!OidIsValid(init_data->dict.id))
+ {
+ dict = allocate_cb(init_data->dict_options, &dict_size);
+
+ return dict;
+ }
+
+ /* Set up key for hashtable search */
+ key.db_id = MyDatabaseId;
+ key.dict = init_data->dict;
+
+ /* Try to find an entry in the hash table */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, &key, false);
+
+ if (entry)
+ {
+ seg = dsm_find_mapping(entry->dict_dsm);
+ if (!seg)
+ {
+ seg = dsm_attach(entry->dict_dsm);
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+ }
+
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+ }
+
+ /* Dictionary hasn't been loaded into memory yet */
+ entry = (TsearchDictEntry *) dshash_find_or_insert(dict_table, &key,
+ &found);
+
+ if (found)
+ {
+ /*
+ * Someone concurrently inserted a dictionary entry since the first time
+ * we checked.
+ */
+ seg = dsm_attach(entry->dict_dsm);
+
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+ }
+
+ /* Build the dictionary */
+ dict = allocate_cb(init_data->dict_options, &dict_size);
+
+ /* Finally, allocate a DSM segment for the compiled dictionary */
+ seg = dsm_create(dict_size, 0);
+ dict_location = dsm_segment_address(seg);
+ memcpy(dict_location, dict, dict_size);
+
+ pfree(dict);
+
+ entry->key = key;
+ entry->dict_dsm = dsm_segment_handle(seg);
+ entry->segment_ispinned = true;
+
+ /* Remain attached until end of postmaster */
+ dsm_pin_segment(seg);
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+}
+
+/*
+ * Release memory occupied by the dictionary. The function just unpins the DSM
+ * mapping. If nobody else has a mapping to this DSM segment, or the dictionary
+ * was dropped or altered, then unpin the DSM segment.
+ *
+ * dict: key to search the dictionary's DSM segment.
+ * unpin_segment: true if we need to unpin the segment because the dictionary
+ * was dropped or altered.
+ */
+void
+ts_dict_shmem_release(DictPointerData *dict, bool unpin_segment)
+{
+ TsearchDictKey key;
+ TsearchDictEntry *entry;
+
+ /*
+ * If we didn't attach to a hash table then do nothing.
+ */
+ if (!dict_table && !unpin_segment)
+ return;
+ /*
+ * But if we need to unpin the DSM segment to get rid of the segment when
+ * the last interested process disconnects we need the hash table to find
+ * the dictionary's entry.
+ */
+ else if (unpin_segment)
+ init_dict_table();
+
+ /* Set up key for hashtable search */
+ key.db_id = MyDatabaseId;
+ key.dict = *dict;
+
+ /* Try to find an entry in the hash table */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, &key, true);
+
+ if (entry)
+ {
+ dsm_segment *seg;
+
+ seg = dsm_find_mapping(entry->dict_dsm);
+
+ if (seg)
+ {
+ dsm_unpin_mapping(seg);
+ dsm_detach(seg);
+ }
+
+ if (unpin_segment && entry->segment_ispinned)
+ {
+ dsm_unpin_segment(entry->dict_dsm);
+ entry->segment_ispinned = false;
+ }
+
+ dshash_release_lock(dict_table, entry);
+ }
+}
+
+/*
+ * Allocate and initialize tsearch-related shared memory.
+ */
+void
+TsearchShmemInit(void)
+{
+ bool found;
+
+ tsearch_ctl = (TsearchCtlData *)
+ ShmemInitStruct("Full Text Search Ctl", sizeof(TsearchCtlData), &found);
+
+ if (!found)
+ {
+ LWLockRegisterTranche(LWTRANCHE_TSEARCH_DSA, "tsearch_dsa");
+ LWLockRegisterTranche(LWTRANCHE_TSEARCH_TABLE, "tsearch_table");
+
+ LWLockInitialize(&tsearch_ctl->lock, LWTRANCHE_TSEARCH_DSA);
+
+ tsearch_ctl->area = DSM_HANDLE_INVALID;
+ tsearch_ctl->dict_table_handle = InvalidDsaPointer;
+ }
+}
+
+/*
+ * Report shared memory space needed by TsearchShmemInit.
+ */
+Size
+TsearchShmemSize(void)
+{
+ Size size = 0;
+
+ /* size of service structure */
+ size = add_size(size, MAXALIGN(sizeof(TsearchCtlData)));
+
+ return size;
+}
+
+/*
+ * A comparator function for TsearchDictKey.
+ *
+ * Returns 0 if keys are equal, 1 otherwise.
+ */
+static int
+tsearch_dict_cmp(const void *a, const void *b, size_t size, void *arg)
+{
+ TsearchDictKey *k1 = (TsearchDictKey *) a;
+ TsearchDictKey *k2 = (TsearchDictKey *) b;
+
+ if (k1->db_id == k2->db_id && k1->dict.id == k2->dict.id &&
+ k1->dict.xmin == k2->dict.xmin && k1->dict.xmax == k2->dict.xmax &&
+ ItemPointerEquals(&k1->dict.tid, &k2->dict.tid))
+ return 0;
+ else
+ return 1;
+}
+
+/*
+ * A hash function for TsearchDictKey.
+ */
+static uint32
+tsearch_dict_hash(const void *a, size_t size, void *arg)
+{
+ TsearchDictKey *k = (TsearchDictKey *) a;
+ uint32 s;
+
+ s = hash_combine(0, hash_uint32(k->db_id));
+ s = hash_combine(s, hash_uint32(k->dict.id));
+ s = hash_combine(s, hash_uint32(k->dict.xmin));
+ s = hash_combine(s, hash_uint32(k->dict.xmax));
+ s = hash_combine(s,
+ hash_uint32(BlockIdGetBlockNumber(&k->dict.tid.ip_blkid)));
+ s = hash_combine(s, hash_uint32(k->dict.tid.ip_posid));
+
+ return s;
+}
+
+/*
+ * Initialize hash table located in DSM.
+ *
+ * The hash table should be created and initialized if it doesn't exist yet.
+ */
+static void
+init_dict_table(void)
+{
+ MemoryContext old_context;
+ dsa_area *dsa;
+
+ /* Exit if hash table was initialized already */
+ if (dict_table)
+ return;
+
+ old_context = MemoryContextSwitchTo(TopMemoryContext);
+
+recheck_table:
+ LWLockAcquire(&tsearch_ctl->lock, LW_SHARED);
+
+ /* Hash table has been created already by someone */
+ if (DsaPointerIsValid(tsearch_ctl->dict_table_handle))
+ {
+ Assert(tsearch_ctl->area != DSM_HANDLE_INVALID);
+
+ dsa = dsa_attach(tsearch_ctl->area);
+
+ dict_table = dshash_attach(dsa,
+ &dict_table_params,
+ tsearch_ctl->dict_table_handle,
+ NULL);
+ }
+ else
+ {
+ /* Try to get exclusive lock */
+ LWLockRelease(&tsearch_ctl->lock);
+ if (!LWLockAcquireOrWait(&tsearch_ctl->lock, LW_EXCLUSIVE))
+ {
+ /*
+ * The lock was released by another backend, and that backend
+ * has concurrently created the hash table already.
+ */
+ goto recheck_table;
+ }
+
+ dsa = dsa_create(LWTRANCHE_TSEARCH_DSA);
+ tsearch_ctl->area = dsa_get_handle(dsa);
+
+ dict_table = dshash_create(dsa, &dict_table_params, NULL);
+ tsearch_ctl->dict_table_handle = dshash_get_hash_table_handle(dict_table);
+
+ /* Remain attached until end of postmaster */
+ dsa_pin(dsa);
+ }
+
+ LWLockRelease(&tsearch_ctl->lock);
+
+ /* Remain attached until end of session */
+ dsa_pin_mapping(dsa);
+
+ MemoryContextSwitchTo(old_context);
+}
diff --git a/src/backend/utils/cache/ts_cache.c b/src/backend/utils/cache/ts_cache.c
index b2a0105ee8..eecdf419ce 100644
--- a/src/backend/utils/cache/ts_cache.c
+++ b/src/backend/utils/cache/ts_cache.c
@@ -40,6 +40,7 @@
#include "commands/defrem.h"
#include "tsearch/ts_cache.h"
#include "tsearch/ts_public.h"
+#include "tsearch/ts_shared.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
#include "utils/fmgroids.h"
@@ -88,6 +89,10 @@ static bool has_invalid_dictionary = false;
* doesn't seem worth the trouble to determine that; we just flush all the
* entries of the related hash table.
*
+ * We set has_invalid_dictionary to true to unpin all used segments later, on
+ * the first use of a text search function. It isn't safe to call
+ * ts_dict_shmem_release() here since it may call kernel functions.
+ *
* We can use the same function for all TS caches by passing the hash
* table address as the "arg".
*/
@@ -112,6 +117,33 @@ InvalidateTSCacheCallBack(Datum arg, int cacheid, uint32 hashvalue)
TSCurrentConfigCache = InvalidOid;
}
+/*
+ * Unpin shared segments of all dictionary entries.
+ */
+static void
+do_ts_dict_shmem_release(void)
+{
+ HASH_SEQ_STATUS status;
+ TSDictionaryCacheEntry *entry;
+
+ if (!has_invalid_dictionary)
+ return;
+
+ hash_seq_init(&status, TSDictionaryCacheHash);
+ while ((entry = (TSDictionaryCacheEntry *) hash_seq_search(&status)) != NULL)
+ {
+ DictPointerData dict_ptr;
+
+ dict_ptr.id = entry->dictId;
+ dict_ptr.xmin = entry->dict_xmin;
+ dict_ptr.xmax = entry->dict_xmax;
+ dict_ptr.tid = entry->dict_tid;
+ ts_dict_shmem_release(&dict_ptr, false);
+ }
+
+ has_invalid_dictionary = false;
+}
+
/*
* Fetch parser cache entry
*/
@@ -260,6 +292,13 @@ lookup_ts_dictionary_cache(Oid dictId)
Form_pg_ts_template template;
MemoryContext saveCtx;
+ /*
+ * It is possible that some invalid entries hold a DSM mapping and we
+ * need to unpin it to avoid memory leaking. We will unpin segments of
+ * all other invalid dictionaries.
+ */
+ do_ts_dict_shmem_release();
+
tpdict = SearchSysCache1(TSDICTOID, ObjectIdGetDatum(dictId));
if (!HeapTupleIsValid(tpdict))
elog(ERROR, "cache lookup failed for text search dictionary %u",
diff --git a/src/include/storage/lwlock.h b/src/include/storage/lwlock.h
index c21bfe2f66..16b0858eda 100644
--- a/src/include/storage/lwlock.h
+++ b/src/include/storage/lwlock.h
@@ -219,6 +219,8 @@ typedef enum BuiltinTrancheIds
LWTRANCHE_SHARED_TUPLESTORE,
LWTRANCHE_TBM,
LWTRANCHE_PARALLEL_APPEND,
+ LWTRANCHE_TSEARCH_DSA,
+ LWTRANCHE_TSEARCH_TABLE,
LWTRANCHE_FIRST_USER_DEFINED
} BuiltinTrancheIds;
diff --git a/src/include/tsearch/ts_shared.h b/src/include/tsearch/ts_shared.h
new file mode 100644
index 0000000000..b6d00bdc9e
--- /dev/null
+++ b/src/include/tsearch/ts_shared.h
@@ -0,0 +1,26 @@
+/*-------------------------------------------------------------------------
+ *
+ * ts_shared.h
+ * tsearch shared memory management
+ *
+ * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
+ *
+ * src/include/tsearch/ts_shared.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef TS_SHARED_H
+#define TS_SHARED_H
+
+#include "tsearch/ts_public.h"
+
+typedef void *(*ts_dict_build_callback) (List *dictoptions, Size *size);
+
+extern void *ts_dict_shmem_location(DictInitData *init_data,
+ ts_dict_build_callback allocate_cb);
+extern void ts_dict_shmem_release(DictPointerData *dict, bool unpin_segment);
+
+extern void TsearchShmemInit(void);
+extern Size TsearchShmemSize(void);
+
+#endif /* TS_SHARED_H */
0004-Store-ispell-in-shared-location-v13.patch (text/plain; charset=us-ascii)
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index 610b7bf033..b0cfbbf4d0 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -3030,6 +3030,21 @@ CREATE TEXT SEARCH DICTIONARY english_stem (
</sect2>
+ <sect2 id="textsearch-shared-dictionaries">
+ <title>Dictionaries in Shared Memory</title>
+
+ <para>
+ Some dictionaries, especially <application>Ispell</application>, consume
+ a significant amount of memory, in some cases tens of megabytes. Most of
+ them store the data in text files, and building the in-memory structure is
+ both CPU and time-consuming. Instead of doing this in each backend when
+ it needs a dictionary for the first time, the compiled dictionary may be
+ stored in dynamic shared memory so that it may be reused by other backends.
+ Currently only <application>Ispell</application> supports loading into
+ dynamic shared memory.
+ </para>
+ </sect2>
+
</sect1>
<sect1 id="textsearch-configuration">
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index 39f1e6faeb..ced52d2790 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -5,6 +5,15 @@
*
* Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
*
+ * By default all Ispell dictionaries are stored in DSM. But if the amount
+ * of memory exceeds max_shared_dictionaries_size, then the dictionary will be
+ * allocated in private backend memory (in dictCtx context).
+ *
+ * All necessary data is built within the dispell_build() function, but
+ * structures for regular expressions are compiled on first demand and
+ * stored in the AffixReg array. This is because regex_t and Regis cannot
+ * be stored in shared memory easily.
+ *
*
* IDENTIFICATION
* src/backend/tsearch/dict_ispell.c
@@ -14,8 +23,10 @@
#include "postgres.h"
#include "commands/defrem.h"
+#include "storage/dsm.h"
#include "tsearch/dicts/spell.h"
#include "tsearch/ts_locale.h"
+#include "tsearch/ts_shared.h"
#include "tsearch/ts_utils.h"
#include "utils/builtins.h"
@@ -26,54 +37,126 @@ typedef struct
IspellDict obj;
} DictISpell;
+static void parse_dictoptions(List *dictoptions,
+ char **dictfile, char **afffile, char **stopfile);
+static void *dispell_build(List *dictoptions, Size *size);
+
Datum
dispell_init(PG_FUNCTION_ARGS)
{
DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictISpell *d;
- bool affloaded = false,
- dictloaded = false,
- stoploaded = false;
- ListCell *l;
+ void *dict_location;
+ char *stopfile;
d = (DictISpell *) palloc0(sizeof(DictISpell));
- NIStartBuild(&(d->obj));
+ parse_dictoptions(init_data->dict_options, NULL, NULL, &stopfile);
+
+ if (stopfile)
+ readstoplist(stopfile, &(d->stoplist), lowerstr);
+
+ dict_location = ts_dict_shmem_location(init_data, dispell_build);
+ Assert(dict_location);
+
+ d->obj.dict = (IspellDictData *) dict_location;
+ d->obj.reg = (AffixReg *) palloc0(d->obj.dict->nAffix *
+ sizeof(AffixReg));
+ /* Current memory context is dictionary's private memory context */
+ d->obj.dictCtx = CurrentMemoryContext;
+
+ PG_RETURN_POINTER(d);
+}
+
+Datum
+dispell_lexize(PG_FUNCTION_ARGS)
+{
+ DictISpell *d = (DictISpell *) PG_GETARG_POINTER(0);
+ char *in = (char *) PG_GETARG_POINTER(1);
+ int32 len = PG_GETARG_INT32(2);
+ char *txt;
+ TSLexeme *res;
+ TSLexeme *ptr,
+ *cptr;
+
+ if (len <= 0)
+ PG_RETURN_POINTER(NULL);
+
+ txt = lowerstr_with_len(in, len);
+ res = NINormalizeWord(&(d->obj), txt);
+
+ if (res == NULL)
+ PG_RETURN_POINTER(NULL);
+
+ cptr = res;
+ for (ptr = cptr; ptr->lexeme; ptr++)
+ {
+ if (searchstoplist(&(d->stoplist), ptr->lexeme))
+ {
+ pfree(ptr->lexeme);
+ ptr->lexeme = NULL;
+ }
+ else
+ {
+ if (cptr != ptr)
+ memcpy(cptr, ptr, sizeof(TSLexeme));
+ cptr++;
+ }
+ }
+ cptr->lexeme = NULL;
+
+ PG_RETURN_POINTER(res);
+}
+
+static void
+parse_dictoptions(List *dictoptions, char **dictfile, char **afffile,
+ char **stopfile)
+{
+ ListCell *l;
- foreach(l, init_data->dict_options)
+ if (dictfile)
+ *dictfile = NULL;
+ if (afffile)
+ *afffile = NULL;
+ if (stopfile)
+ *stopfile = NULL;
+
+ foreach(l, dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
if (strcmp(defel->defname, "dictfile") == 0)
{
- if (dictloaded)
+ if (!dictfile)
+ continue;
+
+ if (*dictfile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple DictFile parameters")));
- NIImportDictionary(&(d->obj),
- get_tsearch_config_filename(defGetString(defel),
- "dict"));
- dictloaded = true;
+ *dictfile = get_tsearch_config_filename(defGetString(defel), "dict");
}
else if (strcmp(defel->defname, "afffile") == 0)
{
- if (affloaded)
+ if (!afffile)
+ continue;
+
+ if (*afffile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple AffFile parameters")));
- NIImportAffixes(&(d->obj),
- get_tsearch_config_filename(defGetString(defel),
- "affix"));
- affloaded = true;
+ *afffile = get_tsearch_config_filename(defGetString(defel), "affix");
}
else if (strcmp(defel->defname, "stopwords") == 0)
{
- if (stoploaded)
+ if (!stopfile)
+ continue;
+
+ if (*stopfile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple StopWords parameters")));
- readstoplist(defGetString(defel), &(d->stoplist), lowerstr);
- stoploaded = true;
+ *stopfile = defGetString(defel);
}
else
{
@@ -83,66 +166,52 @@ dispell_init(PG_FUNCTION_ARGS)
defel->defname)));
}
}
+}
- if (affloaded && dictloaded)
- {
- NISortDictionary(&(d->obj));
- NISortAffixes(&(d->obj));
- }
- else if (!affloaded)
+/*
+ * Build the dictionary.
+ *
+ * Result is palloc'ed.
+ */
+static void *
+dispell_build(List *dictoptions, Size *size)
+{
+ IspellDictBuild build;
+ char *dictfile,
+ *afffile;
+
+ parse_dictoptions(dictoptions, &dictfile, &afffile, NULL);
+
+ if (!afffile)
{
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("missing AffFile parameter")));
}
- else
+ else if (!dictfile)
{
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("missing DictFile parameter")));
}
- NIFinishBuild(&(d->obj));
+ MemSet(&build, 0, sizeof(build));
+ NIStartBuild(&build);
- PG_RETURN_POINTER(d);
-}
+ /* Read files */
+ NIImportDictionary(&build, dictfile);
+ NIImportAffixes(&build, afffile);
-Datum
-dispell_lexize(PG_FUNCTION_ARGS)
-{
- DictISpell *d = (DictISpell *) PG_GETARG_POINTER(0);
- char *in = (char *) PG_GETARG_POINTER(1);
- int32 len = PG_GETARG_INT32(2);
- char *txt;
- TSLexeme *res;
- TSLexeme *ptr,
- *cptr;
+ /* Build persistent data for use by backends */
+ NISortDictionary(&build);
+ NISortAffixes(&build);
- if (len <= 0)
- PG_RETURN_POINTER(NULL);
+ NICopyData(&build);
- txt = lowerstr_with_len(in, len);
- res = NINormalizeWord(&(d->obj), txt);
-
- if (res == NULL)
- PG_RETURN_POINTER(NULL);
+ /* Release temporary data */
+ NIFinishBuild(&build);
- cptr = res;
- for (ptr = cptr; ptr->lexeme; ptr++)
- {
- if (searchstoplist(&(d->stoplist), ptr->lexeme))
- {
- pfree(ptr->lexeme);
- ptr->lexeme = NULL;
- }
- else
- {
- if (cptr != ptr)
- memcpy(cptr, ptr, sizeof(TSLexeme));
- cptr++;
- }
- }
- cptr->lexeme = NULL;
-
- PG_RETURN_POINTER(res);
+ /* Return the buffer and its size */
+ *size = build.dict_size;
+ return build.dict;
}
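
For readers following along: dispell_build() above hands back one contiguous buffer plus its size, which only works if nothing inside the buffer is a raw pointer. Below is a minimal sketch of that offset-addressing idea, assuming a much simpler node than SPNode. Arena, Node, arena_alloc_node(), node_at() and INVALID_OFFSET are hypothetical names for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define INVALID_OFFSET UINT32_MAX	/* plays the role of a NULL child */

/* Growable byte buffer; nodes are addressed by offset from buf. */
typedef struct
{
	char	   *buf;
	uint32_t	used;
	uint32_t	cap;
} Arena;

/* A node refers to its child by offset, never by pointer. */
typedef struct
{
	char		val;
	uint32_t	child_offset;
} Node;

/* Reserve room for one node and return its offset within the arena. */
static uint32_t
arena_alloc_node(Arena *a, char val)
{
	uint32_t	off;
	Node	   *n;

	while (a->used + sizeof(Node) > a->cap)
	{
		uint32_t	newcap = a->cap ? a->cap * 2 : 64;
		char	   *p = realloc(a->buf, newcap);

		if (p == NULL)
			abort();
		/* buf may move here; stored offsets stay valid, pointers would not */
		a->buf = p;
		a->cap = newcap;
	}

	off = a->used;
	n = (Node *) (a->buf + off);
	n->val = val;
	n->child_offset = INVALID_OFFSET;
	a->used += sizeof(Node);
	return off;
}

/* Resolve an offset against whatever base address the buffer has now. */
static Node *
node_at(char *base, uint32_t off)
{
	return (Node *) (base + off);
}
```

Because links are offsets, the used part of the buffer can be copied with a single memcpy() to a different address, for example a segment mapped at a different location in each process, and traversed there unchanged.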
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index e071994523..1c560ef56a 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -23,33 +23,35 @@
* Compilation of a dictionary
* ---------------------------
*
- * A compiled dictionary is stored in the IspellDict structure. Compilation of
- * a dictionary is divided into the several steps:
+ * A compiled dictionary is stored in the following structures:
+ * - IspellDictBuild - stores temporary data and IspellDictData
+ * - IspellDictData - stores permanent data used within NINormalizeWord()
+ * Compilation of the dictionary is divided into several steps:
* - NIImportDictionary() - stores each word of a .dict file in the
* temporary Spell field.
- * - NIImportAffixes() - stores affix rules of an .affix file in the
- * Affix field (not temporary) if an .affix file has the Ispell format.
+ * - NIImportAffixes() - stores affix rules of an .affix file in the temporary
+ * Affix field if an .affix file has the Ispell format.
* -> NIImportOOAffixes() - stores affix rules if an .affix file has the
* Hunspell format. The AffixData field is initialized if AF parameter
* is defined.
* - NISortDictionary() - builds a prefix tree (Trie) from the words list
- * and stores it in the Dictionary field. The words list is got from the
+ * and stores it in the DictNodes field. The words list is got from the
* Spell field. The AffixData field is initialized if AF parameter is not
* defined.
* - NISortAffixes():
* - builds a list of compound affixes from the affix list and stores it
* in the CompoundAffix.
* - builds prefix trees (Trie) from the affix list for prefixes and suffixes
- * and stores them in Suffix and Prefix fields.
+ * and stores them in SuffixNodes and PrefixNodes fields.
* The affix list is got from the Affix field.
+ * Persistent data of the dictionary is copied within NICopyData().
*
* Memory management
* -----------------
*
- * The IspellDict structure has the Spell field which is used only in compile
- * time. The Spell field stores a words list. It can take a lot of memory.
- * Therefore when a dictionary is compiled this field is cleared by
- * NIFinishBuild().
+ * The IspellDictBuild structure holds temporary data which is used only at
+ * compile time. It can take a lot of memory. Therefore, after compiling the
+ * dictionary, this data is released by NIFinishBuild().
*
* All resources which should cleared by NIFinishBuild() is initialized using
* tmpalloc() and tmpalloc0().
@@ -73,112 +75,147 @@
* after the initialization is done. During initialization,
* CurrentMemoryContext is the long-lived memory context associated
* with the dictionary cache entry. We keep the short-lived stuff
- * in the Conf->buildCxt context.
+ * in the ConfBuild->buildCxt context.
*/
-#define tmpalloc(sz) MemoryContextAlloc(Conf->buildCxt, (sz))
-#define tmpalloc0(sz) MemoryContextAllocZero(Conf->buildCxt, (sz))
+#define tmpalloc(sz) MemoryContextAlloc(ConfBuild->buildCxt, (sz))
+#define tmpalloc0(sz) MemoryContextAllocZero(ConfBuild->buildCxt, (sz))
-#define tmpstrdup(str) MemoryContextStrdup(Conf->buildCxt, (str))
+#define tmpstrdup(str) MemoryContextStrdup(ConfBuild->buildCxt, (str))
/*
* Prepare for constructing an ISpell dictionary.
*
- * The IspellDict struct is assumed to be zeroed when allocated.
+ * The IspellDictBuild struct is assumed to be zeroed when allocated.
*/
void
-NIStartBuild(IspellDict *Conf)
+NIStartBuild(IspellDictBuild *ConfBuild)
{
+ uint32 dict_size;
+
/*
* The temp context is a child of CurTransactionContext, so that it will
* go away automatically on error.
*/
- Conf->buildCxt = AllocSetContextCreate(CurTransactionContext,
- "Ispell dictionary init context",
- ALLOCSET_DEFAULT_SIZES);
+ ConfBuild->buildCxt = AllocSetContextCreate(CurTransactionContext,
+ "Ispell dictionary init context",
+ ALLOCSET_DEFAULT_SIZES);
+
+ /*
+ * Allocate buffer for the dictionary in the current context, not in buildCxt.
+ */
+ dict_size = MAXALIGN(IspellDictDataHdrSize);
+ ConfBuild->dict = palloc0(dict_size);
+ ConfBuild->dict_size = dict_size;
}
/*
- * Clean up when dictionary construction is complete.
+ * Copy compiled and persistent data into IspellDictData.
*/
void
-NIFinishBuild(IspellDict *Conf)
+NICopyData(IspellDictBuild *ConfBuild)
{
- /* Release no-longer-needed temp memory */
- MemoryContextDelete(Conf->buildCxt);
- /* Just for cleanliness, zero the now-dangling pointers */
- Conf->buildCxt = NULL;
- Conf->Spell = NULL;
- Conf->firstfree = NULL;
- Conf->CompoundAffixFlags = NULL;
-}
+ IspellDictData *dict;
+ uint32 size;
+ int i;
+ uint32 *offsets,
+ offset;
+ /*
+ * Calculate necessary space
+ */
+ size = ConfBuild->nAffixData * sizeof(uint32);
+ size += ConfBuild->AffixDataEnd;
-/*
- * "Compact" palloc: allocate without extra palloc overhead.
- *
- * Since we have no need to free the ispell data items individually, there's
- * not much value in the per-chunk overhead normally consumed by palloc.
- * Getting rid of it is helpful since ispell can allocate a lot of small nodes.
- *
- * We currently pre-zero all data allocated this way, even though some of it
- * doesn't need that. The cpalloc and cpalloc0 macros are just documentation
- * to indicate which allocations actually require zeroing.
- */
-#define COMPACT_ALLOC_CHUNK 8192 /* amount to get from palloc at once */
-#define COMPACT_MAX_REQ 1024 /* must be < COMPACT_ALLOC_CHUNK */
+ size += ConfBuild->nAffix * sizeof(uint32);
+ size += ConfBuild->AffixSize;
-static void *
-compact_palloc0(IspellDict *Conf, size_t size)
-{
- void *result;
+ size += ConfBuild->DictNodes.NodesEnd;
+ size += ConfBuild->PrefixNodes.NodesEnd;
+ size += ConfBuild->SuffixNodes.NodesEnd;
- /* Should only be called during init */
- Assert(Conf->buildCxt != NULL);
+ size += sizeof(CMPDAffix) * ConfBuild->nCompoundAffix;
- /* No point in this for large chunks */
- if (size > COMPACT_MAX_REQ)
- return palloc0(size);
+ /*
+ * Copy data itself
+ */
+ ConfBuild->dict_size = IspellDictDataHdrSize + size;
+ ConfBuild->dict = repalloc(ConfBuild->dict, ConfBuild->dict_size);
+
+ dict = ConfBuild->dict;
+
+ /* AffixData */
+ dict->nAffixData = ConfBuild->nAffixData;
+ dict->AffixDataStart = sizeof(uint32) * ConfBuild->nAffixData;
+ memcpy(DictAffixDataOffset(dict), ConfBuild->AffixDataOffset,
+ sizeof(uint32) * ConfBuild->nAffixData);
+ memcpy(DictAffixData(dict), ConfBuild->AffixData, ConfBuild->AffixDataEnd);
+
+ /* Affix array */
+ dict->nAffix = ConfBuild->nAffix;
+ dict->AffixOffsetStart = dict->AffixDataStart + ConfBuild->AffixDataEnd;
+ dict->AffixStart = dict->AffixOffsetStart + sizeof(uint32) * ConfBuild->nAffix;
+ offsets = (uint32 *) DictAffixOffset(dict);
+ offset = 0;
+ for (i = 0; i < ConfBuild->nAffix; i++)
+ {
+ AFFIX *affix;
+ uint32 size = AffixGetSize(ConfBuild->Affix[i]);
- /* Keep everything maxaligned */
- size = MAXALIGN(size);
+ offsets[i] = offset;
+ affix = (AFFIX *) DictAffixGet(dict, i);
+ Assert(affix);
- /* Need more space? */
- if (size > Conf->avail)
- {
- Conf->firstfree = palloc0(COMPACT_ALLOC_CHUNK);
- Conf->avail = COMPACT_ALLOC_CHUNK;
- }
+ memcpy(affix, ConfBuild->Affix[i], size);
- result = (void *) Conf->firstfree;
- Conf->firstfree += size;
- Conf->avail -= size;
+ offset += size;
+ }
- return result;
+ /* DictNodes prefix tree */
+ dict->DictNodesStart = dict->AffixStart + offset;
+ memcpy(DictDictNodes(dict), ConfBuild->DictNodes.Nodes,
+ ConfBuild->DictNodes.NodesEnd);
+
+ /* PrefixNodes prefix tree */
+ dict->PrefixNodesStart = dict->DictNodesStart + ConfBuild->DictNodes.NodesEnd;
+ memcpy(DictPrefixNodes(dict), ConfBuild->PrefixNodes.Nodes,
+ ConfBuild->PrefixNodes.NodesEnd);
+
+ /* SuffixNodes prefix tree */
+ dict->SuffixNodesStart = dict->PrefixNodesStart + ConfBuild->PrefixNodes.NodesEnd;
+ memcpy(DictSuffixNodes(dict), ConfBuild->SuffixNodes.Nodes,
+ ConfBuild->SuffixNodes.NodesEnd);
+
+ /* CompoundAffix array */
+ dict->CompoundAffixStart = dict->SuffixNodesStart +
+ ConfBuild->SuffixNodes.NodesEnd;
+ memcpy(DictCompoundAffix(dict), ConfBuild->CompoundAffix,
+ sizeof(CMPDAffix) * ConfBuild->nCompoundAffix);
}
-#define cpalloc(size) compact_palloc0(Conf, size)
-#define cpalloc0(size) compact_palloc0(Conf, size)
-
-static char *
-cpstrdup(IspellDict *Conf, const char *str)
+/*
+ * Clean up when dictionary construction is complete.
+ */
+void
+NIFinishBuild(IspellDictBuild *ConfBuild)
{
- char *res = cpalloc(strlen(str) + 1);
-
- strcpy(res, str);
- return res;
+ /* Release no-longer-needed temp memory */
+ MemoryContextDelete(ConfBuild->buildCxt);
+ /* Just for cleanliness, zero the now-dangling pointers */
+ ConfBuild->buildCxt = NULL;
+ ConfBuild->Spell = NULL;
+ ConfBuild->CompoundAffixFlags = NULL;
}
-
/*
* Apply lowerstr(), producing a temporary result (in the buildCxt).
*/
static char *
-lowerstr_ctx(IspellDict *Conf, const char *src)
+lowerstr_ctx(IspellDictBuild *ConfBuild, const char *src)
{
MemoryContext saveCtx;
char *dst;
- saveCtx = MemoryContextSwitchTo(Conf->buildCxt);
+ saveCtx = MemoryContextSwitchTo(ConfBuild->buildCxt);
dst = lowerstr(src);
MemoryContextSwitchTo(saveCtx);
@@ -190,7 +227,7 @@ lowerstr_ctx(IspellDict *Conf, const char *src)
#define STRNCMP(s,p) strncmp( (s), (p), strlen(p) )
#define GETWCHAR(W,L,N,T) ( ((const uint8*)(W))[ ((T)==FF_PREFIX) ? (N) : ( (L) - 1 - (N) ) ] )
-#define GETCHAR(A,N,T) GETWCHAR( (A)->repl, (A)->replen, N, T )
+#define GETCHAR(A,N,T) GETWCHAR( AffixFieldRepl(A), (A)->replen, N, T )
static char *VoidString = "";
@@ -311,18 +348,189 @@ strbncmp(const unsigned char *s1, const unsigned char *s2, size_t count)
static int
cmpaffix(const void *s1, const void *s2)
{
- const AFFIX *a1 = (const AFFIX *) s1;
- const AFFIX *a2 = (const AFFIX *) s2;
+ const AFFIX *a1 = *((AFFIX *const *) s1);
+ const AFFIX *a2 = *((AFFIX *const *) s2);
if (a1->type < a2->type)
return -1;
if (a1->type > a2->type)
return 1;
if (a1->type == FF_PREFIX)
- return strcmp(a1->repl, a2->repl);
+ return strcmp(AffixFieldRepl(a1), AffixFieldRepl(a2));
else
- return strbcmp((const unsigned char *) a1->repl,
- (const unsigned char *) a2->repl);
+ return strbcmp((const unsigned char *) AffixFieldRepl(a1),
+ (const unsigned char *) AffixFieldRepl(a2));
+}
+
+/*
+ * Allocate space for AffixData.
+ */
+static void
+InitAffixData(IspellDictBuild *ConfBuild, int numAffixData)
+{
+ uint32 size;
+
+ size = 8 * 1024 /* Reserve 8KB for data */;
+
+ ConfBuild->AffixData = (char *) tmpalloc(size);
+ ConfBuild->AffixDataSize = size;
+ ConfBuild->AffixDataOffset = (uint32 *) tmpalloc(numAffixData * sizeof(uint32));
+ ConfBuild->nAffixData = 0;
+ ConfBuild->mAffixData = numAffixData;
+
+ /* Save offset of the end of data */
+ ConfBuild->AffixDataEnd = 0;
+}
+
+/*
+ * Add a set of affix flags to the IspellDictBuild struct. If the AffixData
+ * buffer cannot fit the new affix set, it is resized.
+ *
+ * ConfBuild: building structure for the current dictionary.
+ * AffixSet: set of affix flags.
+ */
+static void
+AddAffixSet(IspellDictBuild *ConfBuild, const char *AffixSet,
+ uint32 AffixSetLen)
+{
+ /*
+ * Check available space for AffixSet.
+ */
+ if (ConfBuild->AffixDataEnd + AffixSetLen + 1 /* \0 */ >=
+ ConfBuild->AffixDataSize)
+ {
+ uint32 newsize = Max(ConfBuild->AffixDataSize + 8 * 1024 /* 8KB */,
+ ConfBuild->AffixDataSize + AffixSetLen + 1);
+
+ ConfBuild->AffixData = (char *) repalloc(ConfBuild->AffixData, newsize);
+ ConfBuild->AffixDataSize = newsize;
+ }
+
+ /* Check available number of offsets */
+ if (ConfBuild->nAffixData >= ConfBuild->mAffixData)
+ {
+ ConfBuild->mAffixData *= 2;
+ ConfBuild->AffixDataOffset = (uint32 *) repalloc(ConfBuild->AffixDataOffset,
+ sizeof(uint32) * ConfBuild->mAffixData);
+ }
+
+ ConfBuild->AffixDataOffset[ConfBuild->nAffixData] = ConfBuild->AffixDataEnd;
+ StrNCpy(AffixDataGet(ConfBuild, ConfBuild->nAffixData),
+ AffixSet, AffixSetLen + 1);
+
+ /* Save offset of the end of data */
+ ConfBuild->AffixDataEnd += AffixSetLen + 1;
+ ConfBuild->nAffixData++;
+}
+
+/*
+ * Allocate space for prefix tree node.
+ *
+ * ConfBuild: building structure for the current dictionary.
+ * array: NodeArray where to allocate new node.
+ * length: number of allocated NodeData.
+ * sizeNodeData: minimum size of each NodeData.
+ * sizeNodeHeader: size of header of new node.
+ *
+ * Returns an offset of new node in NodeArray->Nodes.
+ */
+static uint32
+AllocateNode(IspellDictBuild *ConfBuild, NodeArray *array, uint32 length,
+ uint32 sizeNodeData, uint32 sizeNodeHeader)
+{
+ uint32 node_offset;
+ uint32 size;
+
+ size = sizeNodeHeader + length * sizeNodeData;
+ size = MAXALIGN(size);
+
+ if (array->NodesSize == 0)
+ {
+ array->NodesSize = size * 32; /* Reserve space for next levels of the
+ * prefix tree */
+ array->Nodes = (char *) tmpalloc(array->NodesSize);
+ array->NodesEnd = 0;
+ }
+ else if (array->NodesEnd + size >= array->NodesSize)
+ {
+ array->NodesSize = Max(array->NodesSize * 2, array->NodesSize + size);
+ array->Nodes = (char *) repalloc(array->Nodes, array->NodesSize);
+ }
+
+ node_offset = array->NodesEnd;
+ array->NodesEnd += size;
+
+ return node_offset;
+}
+
+/*
+ * Allocate space for SPNode.
+ *
+ * Returns an offset of new node in ConfBuild->DictNodes.Nodes.
+ */
+static uint32
+AllocateSPNode(IspellDictBuild *ConfBuild, uint32 length)
+{
+ uint32 offset;
+ SPNode *node;
+ SPNodeData *data;
+ uint32 i;
+
+ offset = AllocateNode(ConfBuild, &ConfBuild->DictNodes, length,
+ sizeof(SPNodeData), SPNHDRSZ);
+ node = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, offset);
+ node->length = length;
+
+ /*
+ * Initialize all SPNodeData with default values. We cannot use memset()
+ * here because not all fields have 0 as default value.
+ */
+ for (i = 0; i < length; i++)
+ {
+ data = &(node->data[i]);
+ data->val = 0;
+ data->affix = ISPELL_INVALID_INDEX;
+ data->compoundflag = 0;
+ data->isword = 0;
+ data->node_offset = ISPELL_INVALID_OFFSET;
+ }
+
+ return offset;
+}
+
+/*
+ * Allocate space for AffixNode.
+ *
+ * Returns an offset of new node in NodeArray->Nodes.
+ */
+static uint32
+AllocateAffixNode(IspellDictBuild *ConfBuild, NodeArray *array, uint32 length)
+{
+ uint32 offset;
+ AffixNode *node;
+ AffixNodeData *data;
+ uint32 i;
+
+ offset = AllocateNode(ConfBuild, array, length, sizeof(AffixNodeData),
+ ANHRDSZ);
+ node = (AffixNode *) NodeArrayGet(array, offset);
+ node->length = length;
+ node->isvoid = 0;
+
+ /*
+ * Initialize all AffixNodeData with default values. We cannot use memset()
+ * here because not all fields have 0 as default value.
+ */
+ for (i = 0; i < length; i++)
+ {
+ data = &(node->data[i]);
+ data->val = 0;
+ data->affstart = ISPELL_INVALID_INDEX;
+ data->affend = ISPELL_INVALID_INDEX;
+ data->node_offset = ISPELL_INVALID_OFFSET;
+ }
+
+ return offset;
}
/*
@@ -333,7 +541,7 @@ cmpaffix(const void *s1, const void *s2)
* - 2 characters (FM_LONG). A character may be Unicode.
* - numbers from 1 to 65000 (FM_NUM).
*
- * Depending on the flagMode an affix string can have the following format:
+ * Depending on the flagmode an affix string can have the following format:
* - FM_CHAR: ABCD
* Here we have 4 flags: A, B, C and D
* - FM_LONG: ABCDE*
@@ -341,13 +549,13 @@ cmpaffix(const void *s1, const void *s2)
* - FM_NUM: 200,205,50
* Here we have 3 flags: 200, 205 and 50
*
- * Conf: current dictionary.
+ * flagmode: flag mode of the dictionary
* sflagset: the set of affix flags. Returns a reference to the start of a next
* affix flag.
* sflag: returns an affix flag from sflagset.
*/
static void
-getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
+getNextFlagFromString(FlagMode flagmode, char **sflagset, char *sflag)
{
int32 s;
char *next,
@@ -356,11 +564,11 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
bool stop = false;
bool met_comma = false;
- maxstep = (Conf->flagMode == FM_LONG) ? 2 : 1;
+ maxstep = (flagmode == FM_LONG) ? 2 : 1;
while (**sflagset)
{
- switch (Conf->flagMode)
+ switch (flagmode)
{
case FM_LONG:
case FM_CHAR:
@@ -422,15 +630,15 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
stop = true;
break;
default:
- elog(ERROR, "unrecognized type of Conf->flagMode: %d",
- Conf->flagMode);
+ elog(ERROR, "unrecognized type of flagmode: %d",
+ flagmode);
}
if (stop)
break;
}
- if (Conf->flagMode == FM_LONG && maxstep > 0)
+ if (flagmode == FM_LONG && maxstep > 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix flag \"%s\" with \"long\" flag value",
@@ -440,31 +648,28 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
}
/*
- * Checks if the affix set Conf->AffixData[affix] contains affixflag.
- * Conf->AffixData[affix] does not contain affixflag if this flag is not used
- * actually by the .dict file.
+ * Checks if the affix set from AffixData contains affixflag. Affix set does
+ * not contain affixflag if this flag is not used actually by the .dict file.
*
- * Conf: current dictionary.
- * affix: index of the Conf->AffixData array.
+ * flagmode: flag mode of the dictionary.
+ * sflagset: the set of affix flags.
* affixflag: the affix flag.
*
- * Returns true if the string Conf->AffixData[affix] contains affixflag,
- * otherwise returns false.
+ * Returns true if the affix set string contains affixflag, otherwise returns
+ * false.
*/
static bool
-IsAffixFlagInUse(IspellDict *Conf, int affix, const char *affixflag)
+IsAffixFlagInUse(FlagMode flagmode, char *sflagset, const char *affixflag)
{
- char *flagcur;
+ char *flagcur = sflagset;
char flag[BUFSIZ];
if (*affixflag == 0)
return true;
- flagcur = Conf->AffixData[affix];
-
while (*flagcur)
{
- getNextFlagFromString(Conf, &flagcur, flag);
+ getNextFlagFromString(flagmode, &flagcur, flag);
/* Compare first affix flag in flagcur with affixflag */
if (strcmp(flag, affixflag) == 0)
return true;
@@ -477,31 +682,33 @@ IsAffixFlagInUse(IspellDict *Conf, int affix, const char *affixflag)
/*
* Adds the new word into the temporary array Spell.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* word: new word.
* flag: set of affix flags. Single flag can be get by getNextFlagFromString().
*/
static void
-NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
+NIAddSpell(IspellDictBuild *ConfBuild, const char *word, const char *flag)
{
- if (Conf->nspell >= Conf->mspell)
+ if (ConfBuild->nSpell >= ConfBuild->mSpell)
{
- if (Conf->mspell)
+ if (ConfBuild->mSpell)
{
- Conf->mspell *= 2;
- Conf->Spell = (SPELL **) repalloc(Conf->Spell, Conf->mspell * sizeof(SPELL *));
+ ConfBuild->mSpell *= 2;
+ ConfBuild->Spell = (SPELL **) repalloc(ConfBuild->Spell,
+ ConfBuild->mSpell * sizeof(SPELL *));
}
else
{
- Conf->mspell = 1024 * 20;
- Conf->Spell = (SPELL **) tmpalloc(Conf->mspell * sizeof(SPELL *));
+ ConfBuild->mSpell = 1024 * 20;
+ ConfBuild->Spell = (SPELL **) tmpalloc(ConfBuild->mSpell * sizeof(SPELL *));
}
}
- Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
- strcpy(Conf->Spell[Conf->nspell]->word, word);
- Conf->Spell[Conf->nspell]->p.flag = (*flag != '\0')
+ ConfBuild->Spell[ConfBuild->nSpell] =
+ (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
+ strcpy(ConfBuild->Spell[ConfBuild->nSpell]->word, word);
+ ConfBuild->Spell[ConfBuild->nSpell]->p.flag = (*flag != '\0')
? tmpstrdup(flag) : VoidString;
- Conf->nspell++;
+ ConfBuild->nSpell++;
}
/*
@@ -509,11 +716,11 @@ NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
*
* Note caller must already have applied get_tsearch_config_filename.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* filename: path to the .dict file.
*/
void
-NIImportDictionary(IspellDict *Conf, const char *filename)
+NIImportDictionary(IspellDictBuild *ConfBuild, const char *filename)
{
tsearch_readline_state trst;
char *line;
@@ -564,9 +771,9 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
}
s += pg_mblen(s);
}
- pstr = lowerstr_ctx(Conf, line);
+ pstr = lowerstr_ctx(ConfBuild, line);
- NIAddSpell(Conf, pstr, flag);
+ NIAddSpell(ConfBuild, pstr, flag);
pfree(pstr);
pfree(line);
@@ -590,7 +797,7 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
* SFX M 0 's .
* is presented here.
*
- * Conf: current dictionary.
+ * dict: current dictionary.
* word: basic form of word.
* affixflag: affix flag, by which a basic form of word was generated.
* flag: compound flag used to compare with StopMiddle->compoundflag.
@@ -598,9 +805,9 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
* Returns 1 if the word was found in the prefix tree, else returns 0.
*/
static int
-FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
+FindWord(IspellDictData *dict, const char *word, const char *affixflag, int flag)
{
- SPNode *node = Conf->Dictionary;
+ SPNode *node = (SPNode *) DictDictNodes(dict);
SPNodeData *StopLow,
*StopHigh,
*StopMiddle;
@@ -636,10 +843,14 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
* Check if this affix rule is presented in the affix set
* with index StopMiddle->affix.
*/
- if (IsAffixFlagInUse(Conf, StopMiddle->affix, affixflag))
+ if (IsAffixFlagInUse(dict->flagMode,
+ DictAffixDataGet(dict, StopMiddle->affix),
+ affixflag))
return 1;
}
- node = StopMiddle->node;
+ /* Retrieve SPNode by the offset */
+ node = (SPNode *) DictNodeGet(DictDictNodes(dict),
+ StopMiddle->node_offset);
ptr++;
break;
}
@@ -657,7 +868,8 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
/*
* Adds a new affix rule to the Affix field.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary, is used to allocate
+ * temporary data.
* flag: affix flag ('\' in the below example).
* flagflags: set of flags from the flagval field for this affix rule. This set
* is listed after '/' character in the added string (repl).
@@ -673,26 +885,54 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
* type: FF_SUFFIX or FF_PREFIX.
*/
static void
-NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
- const char *find, const char *repl, int type)
+NIAddAffix(IspellDictBuild *ConfBuild, const char *flag, char flagflags,
+ const char *mask, const char *find, const char *repl, int type)
{
AFFIX *Affix;
+ uint32 size;
+ uint32 flaglen = strlen(flag),
+ findlen = strlen(find),
+ repllen = strlen(repl),
+ masklen = strlen(mask);
+
+ /* Sanity checks */
+ if (flaglen > AF_FLAG_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix flag \"%s\" too long", flag)));
+ if (findlen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix find field \"%s\" too long", find)));
+ if (repllen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix repl field \"%s\" too long", repl)));
+ if (masklen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix mask field \"%s\" too long", mask)));
- if (Conf->naffixes >= Conf->maffixes)
+ if (ConfBuild->nAffix >= ConfBuild->mAffix)
{
- if (Conf->maffixes)
+ if (ConfBuild->mAffix)
{
- Conf->maffixes *= 2;
- Conf->Affix = (AFFIX *) repalloc((void *) Conf->Affix, Conf->maffixes * sizeof(AFFIX));
+ ConfBuild->mAffix *= 2;
+ ConfBuild->Affix = (AFFIX **) repalloc(ConfBuild->Affix,
+ ConfBuild->mAffix * sizeof(AFFIX *));
}
else
{
- Conf->maffixes = 16;
- Conf->Affix = (AFFIX *) palloc(Conf->maffixes * sizeof(AFFIX));
+ ConfBuild->mAffix = 255;
+ ConfBuild->Affix = (AFFIX **) tmpalloc(ConfBuild->mAffix * sizeof(AFFIX *));
}
}
- Affix = Conf->Affix + Conf->naffixes;
+ size = AFFIXHDRSZ + flaglen + 1 /* \0 */ + findlen + 1 /* \0 */ +
+ repllen + 1 /* \0 */ + masklen + 1 /* \0 */;
+
+ Affix = (AFFIX *) tmpalloc(size);
+ ConfBuild->Affix[ConfBuild->nAffix] = Affix;
/* This affix rule can be applied for words with any ending */
if (strcmp(mask, ".") == 0 || *mask == '\0')
@@ -705,42 +945,12 @@ NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
{
Affix->issimple = 0;
Affix->isregis = 1;
- RS_compile(&(Affix->reg.regis), (type == FF_SUFFIX),
- *mask ? mask : VoidString);
}
/* This affix rule will use regex_t to search word ending */
else
{
- int masklen;
- int wmasklen;
- int err;
- pg_wchar *wmask;
- char *tmask;
-
Affix->issimple = 0;
Affix->isregis = 0;
- tmask = (char *) tmpalloc(strlen(mask) + 3);
- if (type == FF_SUFFIX)
- sprintf(tmask, "%s$", mask);
- else
- sprintf(tmask, "^%s", mask);
-
- masklen = strlen(tmask);
- wmask = (pg_wchar *) tmpalloc((masklen + 1) * sizeof(pg_wchar));
- wmasklen = pg_mb2wchar_with_len(tmask, wmask, masklen);
-
- err = pg_regcomp(&(Affix->reg.regex), wmask, wmasklen,
- REG_ADVANCED | REG_NOSUB,
- DEFAULT_COLLATION_OID);
- if (err)
- {
- char errstr[100];
-
- pg_regerror(err, &(Affix->reg.regex), errstr, sizeof(errstr));
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
- errmsg("invalid regular expression: %s", errstr)));
- }
}
Affix->flagflags = flagflags;
@@ -749,15 +959,22 @@ NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
if ((Affix->flagflags & FF_COMPOUNDFLAG) == 0)
Affix->flagflags |= FF_COMPOUNDFLAG;
}
- Affix->flag = cpstrdup(Conf, flag);
+
Affix->type = type;
- Affix->find = (find && *find) ? cpstrdup(Conf, find) : VoidString;
- if ((Affix->replen = strlen(repl)) > 0)
- Affix->repl = cpstrdup(Conf, repl);
- else
- Affix->repl = VoidString;
- Conf->naffixes++;
+ Affix->replen = repllen;
+ StrNCpy(AffixFieldRepl(Affix), repl, repllen + 1);
+
+ Affix->findlen = findlen;
+ StrNCpy(AffixFieldFind(Affix), find, findlen + 1);
+
+ Affix->masklen = masklen;
+ StrNCpy(AffixFieldMask(Affix), mask, masklen + 1);
+
+ StrNCpy(AffixFieldFlag(Affix), flag, flaglen + 1);
+
+ ConfBuild->nAffix++;
+ ConfBuild->AffixSize += size;
}
/* Parsing states for parse_affentry() and friends */
@@ -1021,10 +1238,10 @@ parse_affentry(char *str, char *mask, char *find, char *repl)
* Sets a Hunspell options depending on flag type.
*/
static void
-setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
+setCompoundAffixFlagValue(IspellDictBuild *ConfBuild, CompoundAffixFlag *entry,
char *s, uint32 val)
{
- if (Conf->flagMode == FM_NUM)
+ if (ConfBuild->dict->flagMode == FM_NUM)
{
char *next;
int i;
@@ -1044,19 +1261,19 @@ setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
else
entry->flag.s = tmpstrdup(s);
- entry->flagMode = Conf->flagMode;
+ entry->flagMode = ConfBuild->dict->flagMode;
entry->value = val;
}
/*
* Sets up a correspondence for the affix parameter with the affix flag.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* s: affix flag in string.
* val: affix parameter.
*/
static void
-addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
+addCompoundAffixFlagValue(IspellDictBuild *ConfBuild, char *s, uint32 val)
{
CompoundAffixFlag *newValue;
char sbuf[BUFSIZ];
@@ -1083,29 +1300,29 @@ addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
*sflag = '\0';
/* Resize array or allocate memory for array CompoundAffixFlag */
- if (Conf->nCompoundAffixFlag >= Conf->mCompoundAffixFlag)
+ if (ConfBuild->nCompoundAffixFlag >= ConfBuild->mCompoundAffixFlag)
{
- if (Conf->mCompoundAffixFlag)
+ if (ConfBuild->mCompoundAffixFlag)
{
- Conf->mCompoundAffixFlag *= 2;
- Conf->CompoundAffixFlags = (CompoundAffixFlag *)
- repalloc((void *) Conf->CompoundAffixFlags,
- Conf->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
+ ConfBuild->mCompoundAffixFlag *= 2;
+ ConfBuild->CompoundAffixFlags = (CompoundAffixFlag *)
+ repalloc((void *) ConfBuild->CompoundAffixFlags,
+ ConfBuild->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
}
else
{
- Conf->mCompoundAffixFlag = 10;
- Conf->CompoundAffixFlags = (CompoundAffixFlag *)
- tmpalloc(Conf->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
+ ConfBuild->mCompoundAffixFlag = 10;
+ ConfBuild->CompoundAffixFlags = (CompoundAffixFlag *)
+ tmpalloc(ConfBuild->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
}
}
- newValue = Conf->CompoundAffixFlags + Conf->nCompoundAffixFlag;
+ newValue = ConfBuild->CompoundAffixFlags + ConfBuild->nCompoundAffixFlag;
- setCompoundAffixFlagValue(Conf, newValue, sbuf, val);
+ setCompoundAffixFlagValue(ConfBuild, newValue, sbuf, val);
- Conf->usecompound = true;
- Conf->nCompoundAffixFlag++;
+ ConfBuild->dict->usecompound = true;
+ ConfBuild->nCompoundAffixFlag++;
}
/*
@@ -1113,7 +1330,7 @@ addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
* flags s.
*/
static int
-getCompoundAffixFlagValue(IspellDict *Conf, char *s)
+getCompoundAffixFlagValue(IspellDictBuild *ConfBuild, char *s)
{
uint32 flag = 0;
CompoundAffixFlag *found,
@@ -1121,18 +1338,18 @@ getCompoundAffixFlagValue(IspellDict *Conf, char *s)
char sflag[BUFSIZ];
char *flagcur;
- if (Conf->nCompoundAffixFlag == 0)
+ if (ConfBuild->nCompoundAffixFlag == 0)
return 0;
flagcur = s;
while (*flagcur)
{
- getNextFlagFromString(Conf, &flagcur, sflag);
- setCompoundAffixFlagValue(Conf, &key, sflag, 0);
+ getNextFlagFromString(ConfBuild->dict->flagMode, &flagcur, sflag);
+ setCompoundAffixFlagValue(ConfBuild, &key, sflag, 0);
found = (CompoundAffixFlag *)
- bsearch(&key, (void *) Conf->CompoundAffixFlags,
- Conf->nCompoundAffixFlag, sizeof(CompoundAffixFlag),
+ bsearch(&key, (void *) ConfBuild->CompoundAffixFlags,
+ ConfBuild->nCompoundAffixFlag, sizeof(CompoundAffixFlag),
cmpcmdflag);
if (found != NULL)
flag |= found->value;
@@ -1144,14 +1361,13 @@ getCompoundAffixFlagValue(IspellDict *Conf, char *s)
/*
* Returns a flag set using the s parameter.
*
- * If Conf->useFlagAliases is true then the s parameter is index of the
- * Conf->AffixData array and function returns its entry.
- * Else function returns the s parameter.
+ * If useFlagAliases is true then the s parameter is index of the AffixData
+ * array and function returns its entry. Else function returns the s parameter.
*/
static char *
-getAffixFlagSet(IspellDict *Conf, char *s)
+getAffixFlagSet(IspellDictBuild *ConfBuild, char *s)
{
- if (Conf->useFlagAliases && *s != '\0')
+ if (ConfBuild->dict->useFlagAliases && *s != '\0')
{
int curaffix;
char *end;
@@ -1162,13 +1378,13 @@ getAffixFlagSet(IspellDict *Conf, char *s)
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix alias \"%s\"", s)));
- if (curaffix > 0 && curaffix <= Conf->nAffixData)
+ if (curaffix > 0 && curaffix <= ConfBuild->nAffixData)
/*
* Do not subtract 1 from curaffix because empty string was added
* in NIImportOOAffixes
*/
- return Conf->AffixData[curaffix];
+ return AffixDataGet(ConfBuild, curaffix);
else
return VoidString;
}
@@ -1179,11 +1395,11 @@ getAffixFlagSet(IspellDict *Conf, char *s)
/*
* Import an affix file that follows MySpell or Hunspell format.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* filename: path to the .affix file.
*/
static void
-NIImportOOAffixes(IspellDict *Conf, const char *filename)
+NIImportOOAffixes(IspellDictBuild *ConfBuild, const char *filename)
{
char type[BUFSIZ],
*ptype = NULL;
@@ -1195,17 +1411,16 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
char repl[BUFSIZ],
*prepl;
bool isSuffix = false;
- int naffix = 0,
- curaffix = 0;
+ int naffix = 0;
int sflaglen = 0;
char flagflags = 0;
tsearch_readline_state trst;
char *recoded;
/* read file to find any flag */
- Conf->usecompound = false;
- Conf->useFlagAliases = false;
- Conf->flagMode = FM_CHAR;
+ ConfBuild->dict->usecompound = false;
+ ConfBuild->dict->useFlagAliases = false;
+ ConfBuild->dict->flagMode = FM_CHAR;
if (!tsearch_readline_begin(&trst, filename))
ereport(ERROR,
@@ -1222,30 +1437,36 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
}
if (STRNCMP(recoded, "COMPOUNDFLAG") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDFLAG"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDFLAG"),
FF_COMPOUNDFLAG);
else if (STRNCMP(recoded, "COMPOUNDBEGIN") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDBEGIN"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDBEGIN"),
FF_COMPOUNDBEGIN);
else if (STRNCMP(recoded, "COMPOUNDLAST") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDLAST"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDLAST"),
FF_COMPOUNDLAST);
/* COMPOUNDLAST and COMPOUNDEND are synonyms */
else if (STRNCMP(recoded, "COMPOUNDEND") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDEND"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDEND"),
FF_COMPOUNDLAST);
else if (STRNCMP(recoded, "COMPOUNDMIDDLE") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDMIDDLE"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDMIDDLE"),
FF_COMPOUNDMIDDLE);
else if (STRNCMP(recoded, "ONLYINCOMPOUND") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("ONLYINCOMPOUND"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("ONLYINCOMPOUND"),
FF_COMPOUNDONLY);
else if (STRNCMP(recoded, "COMPOUNDPERMITFLAG") == 0)
- addCompoundAffixFlagValue(Conf,
+ addCompoundAffixFlagValue(ConfBuild,
recoded + strlen("COMPOUNDPERMITFLAG"),
FF_COMPOUNDPERMITFLAG);
else if (STRNCMP(recoded, "COMPOUNDFORBIDFLAG") == 0)
- addCompoundAffixFlagValue(Conf,
+ addCompoundAffixFlagValue(ConfBuild,
recoded + strlen("COMPOUNDFORBIDFLAG"),
FF_COMPOUNDFORBIDFLAG);
else if (STRNCMP(recoded, "FLAG") == 0)
@@ -1258,9 +1479,9 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
if (*s)
{
if (STRNCMP(s, "long") == 0)
- Conf->flagMode = FM_LONG;
+ ConfBuild->dict->flagMode = FM_LONG;
else if (STRNCMP(s, "num") == 0)
- Conf->flagMode = FM_NUM;
+ ConfBuild->dict->flagMode = FM_NUM;
else if (STRNCMP(s, "default") != 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
@@ -1274,8 +1495,8 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
}
tsearch_readline_end(&trst);
- if (Conf->nCompoundAffixFlag > 1)
- qsort((void *) Conf->CompoundAffixFlags, Conf->nCompoundAffixFlag,
+ if (ConfBuild->nCompoundAffixFlag > 1)
+ qsort((void *) ConfBuild->CompoundAffixFlags, ConfBuild->nCompoundAffixFlag,
sizeof(CompoundAffixFlag), cmpcmdflag);
if (!tsearch_readline_begin(&trst, filename))
@@ -1295,15 +1516,15 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
if (ptype)
pfree(ptype);
- ptype = lowerstr_ctx(Conf, type);
+ ptype = lowerstr_ctx(ConfBuild, type);
/* First try to parse AF parameter (alias compression) */
if (STRNCMP(ptype, "af") == 0)
{
/* First line is the number of aliases */
- if (!Conf->useFlagAliases)
+ if (!ConfBuild->dict->useFlagAliases)
{
- Conf->useFlagAliases = true;
+ ConfBuild->dict->useFlagAliases = true;
naffix = atoi(sflag);
if (naffix == 0)
ereport(ERROR,
@@ -1313,21 +1534,15 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
/* Also reserve place for empty flag set */
naffix++;
- Conf->AffixData = (char **) palloc0(naffix * sizeof(char *));
- Conf->lenAffixData = Conf->nAffixData = naffix;
+ InitAffixData(ConfBuild, naffix);
/* Add empty flag set into AffixData */
- Conf->AffixData[curaffix] = VoidString;
- curaffix++;
+ AddAffixSet(ConfBuild, VoidString, 0);
}
/* Other lines is aliases */
else
{
- if (curaffix < naffix)
- {
- Conf->AffixData[curaffix] = cpstrdup(Conf, sflag);
- curaffix++;
- }
+ AddAffixSet(ConfBuild, sflag, strlen(sflag));
}
goto nextline;
}
@@ -1338,8 +1553,8 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
sflaglen = strlen(sflag);
if (sflaglen == 0
- || (sflaglen > 1 && Conf->flagMode == FM_CHAR)
- || (sflaglen > 2 && Conf->flagMode == FM_LONG))
+ || (sflaglen > 1 && ConfBuild->dict->flagMode == FM_CHAR)
+ || (sflaglen > 2 && ConfBuild->dict->flagMode == FM_LONG))
goto nextline;
/*--------
@@ -1367,21 +1582,21 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
/* Get flags after '/' (flags are case sensitive) */
if ((ptr = strchr(repl, '/')) != NULL)
- aflg |= getCompoundAffixFlagValue(Conf,
- getAffixFlagSet(Conf,
+ aflg |= getCompoundAffixFlagValue(ConfBuild,
+ getAffixFlagSet(ConfBuild,
ptr + 1));
/* Get lowercased version of string before '/' */
- prepl = lowerstr_ctx(Conf, repl);
+ prepl = lowerstr_ctx(ConfBuild, repl);
if ((ptr = strchr(prepl, '/')) != NULL)
*ptr = '\0';
- pfind = lowerstr_ctx(Conf, find);
- pmask = lowerstr_ctx(Conf, mask);
+ pfind = lowerstr_ctx(ConfBuild, find);
+ pmask = lowerstr_ctx(ConfBuild, mask);
if (t_iseq(find, '0'))
*pfind = '\0';
if (t_iseq(repl, '0'))
*prepl = '\0';
- NIAddAffix(Conf, sflag, flagflags | aflg, pmask, pfind, prepl,
+ NIAddAffix(ConfBuild, sflag, flagflags | aflg, pmask, pfind, prepl,
isSuffix ? FF_SUFFIX : FF_PREFIX);
pfree(prepl);
pfree(pfind);
@@ -1407,7 +1622,7 @@ nextline:
* work to NIImportOOAffixes(), which will re-read the whole file.
*/
void
-NIImportAffixes(IspellDict *Conf, const char *filename)
+NIImportAffixes(IspellDictBuild *ConfBuild, const char *filename)
{
char *pstr = NULL;
char flag[BUFSIZ];
@@ -1428,9 +1643,9 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
errmsg("could not open affix file \"%s\": %m",
filename)));
- Conf->usecompound = false;
- Conf->useFlagAliases = false;
- Conf->flagMode = FM_CHAR;
+ ConfBuild->dict->usecompound = false;
+ ConfBuild->dict->useFlagAliases = false;
+ ConfBuild->dict->flagMode = FM_CHAR;
while ((recoded = tsearch_readline(&trst)) != NULL)
{
@@ -1452,10 +1667,8 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
s += pg_mblen(s);
if (*s && pg_mblen(s) == 1)
- {
- addCompoundAffixFlagValue(Conf, s, FF_COMPOUNDFLAG);
- Conf->usecompound = true;
- }
+ addCompoundAffixFlagValue(ConfBuild, s, FF_COMPOUNDFLAG);
+
oldformat = true;
goto nextline;
}
@@ -1528,7 +1741,8 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
if (!parse_affentry(pstr, mask, find, repl))
goto nextline;
- NIAddAffix(Conf, flag, flagflags, mask, find, repl, suffixes ? FF_SUFFIX : FF_PREFIX);
+ NIAddAffix(ConfBuild, flag, flagflags, mask, find, repl,
+ suffixes ? FF_SUFFIX : FF_PREFIX);
nextline:
pfree(recoded);
@@ -1547,53 +1761,48 @@ isnewformat:
errmsg("affix file contains both old-style and new-style commands")));
tsearch_readline_end(&trst);
- NIImportOOAffixes(Conf, filename);
+ NIImportOOAffixes(ConfBuild, filename);
}
/*
* Merges two affix flag sets and stores a new affix flag set into
- * Conf->AffixData.
+ * ConfBuild->AffixData.
*
* Returns index of a new affix flag set.
*/
static int
-MergeAffix(IspellDict *Conf, int a1, int a2)
+MergeAffix(IspellDictBuild *ConfBuild, int a1, int a2)
{
- char **ptr;
+ char *ptr;
+ uint32 len;
/* Do not merge affix flags if one of affix flags is empty */
- if (*Conf->AffixData[a1] == '\0')
+ if (*AffixDataGet(ConfBuild, a1) == '\0')
return a2;
- else if (*Conf->AffixData[a2] == '\0')
+ else if (*AffixDataGet(ConfBuild, a2) == '\0')
return a1;
- while (Conf->nAffixData + 1 >= Conf->lenAffixData)
- {
- Conf->lenAffixData *= 2;
- Conf->AffixData = (char **) repalloc(Conf->AffixData,
- sizeof(char *) * Conf->lenAffixData);
- }
-
- ptr = Conf->AffixData + Conf->nAffixData;
- if (Conf->flagMode == FM_NUM)
+ if (ConfBuild->dict->flagMode == FM_NUM)
{
- *ptr = cpalloc(strlen(Conf->AffixData[a1]) +
- strlen(Conf->AffixData[a2]) +
- 1 /* comma */ + 1 /* \0 */ );
- sprintf(*ptr, "%s,%s", Conf->AffixData[a1], Conf->AffixData[a2]);
+ len = strlen(AffixDataGet(ConfBuild, a1)) + 1 /* comma */ +
+ strlen(AffixDataGet(ConfBuild, a2));
+ ptr = tmpalloc(len + 1 /* \0 */);
+ sprintf(ptr, "%s,%s", AffixDataGet(ConfBuild, a1),
+ AffixDataGet(ConfBuild, a2));
}
else
{
- *ptr = cpalloc(strlen(Conf->AffixData[a1]) +
- strlen(Conf->AffixData[a2]) +
- 1 /* \0 */ );
- sprintf(*ptr, "%s%s", Conf->AffixData[a1], Conf->AffixData[a2]);
+ len = strlen(AffixDataGet(ConfBuild, a1)) +
+ strlen(AffixDataGet(ConfBuild, a2));
+ ptr = tmpalloc(len + 1 /* \0 */ );
+ sprintf(ptr, "%s%s", AffixDataGet(ConfBuild, a1),
+ AffixDataGet(ConfBuild, a2));
}
- ptr++;
- *ptr = NULL;
- Conf->nAffixData++;
- return Conf->nAffixData - 1;
+ AddAffixSet(ConfBuild, ptr, len);
+ pfree(ptr);
+
+ return ConfBuild->nAffixData - 1;
}
/*
@@ -1601,66 +1810,87 @@ MergeAffix(IspellDict *Conf, int a1, int a2)
* flags with the given index.
*/
static uint32
-makeCompoundFlags(IspellDict *Conf, int affix)
+makeCompoundFlags(IspellDictBuild *ConfBuild, int affix)
{
- char *str = Conf->AffixData[affix];
+ char *str = AffixDataGet(ConfBuild, affix);
- return (getCompoundAffixFlagValue(Conf, str) & FF_COMPOUNDFLAGMASK);
+ return (getCompoundAffixFlagValue(ConfBuild, str) & FF_COMPOUNDFLAGMASK);
}
/*
* Makes a prefix tree for the given level.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* low: lower index of the Conf->Spell array.
* high: upper index of the Conf->Spell array.
* level: current prefix tree level.
+ *
+ * Returns an offset of SPNode in DictNodes.
*/
-static SPNode *
-mkSPNode(IspellDict *Conf, int low, int high, int level)
+static uint32
+mkSPNode(IspellDictBuild *ConfBuild, int low, int high, int level)
{
int i;
int nchar = 0;
char lastchar = '\0';
+ uint32 rs_offset,
+ new_offset;
SPNode *rs;
SPNodeData *data;
+ int data_index = 0;
int lownew = low;
for (i = low; i < high; i++)
- if (Conf->Spell[i]->p.d.len > level && lastchar != Conf->Spell[i]->word[level])
+ if (ConfBuild->Spell[i]->p.d.len > level &&
+ lastchar != ConfBuild->Spell[i]->word[level])
{
nchar++;
- lastchar = Conf->Spell[i]->word[level];
+ lastchar = ConfBuild->Spell[i]->word[level];
}
if (!nchar)
- return NULL;
+ return ISPELL_INVALID_OFFSET;
- rs = (SPNode *) cpalloc0(SPNHDRSZ + nchar * sizeof(SPNodeData));
- rs->length = nchar;
+ rs_offset = AllocateSPNode(ConfBuild, nchar);
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
data = rs->data;
lastchar = '\0';
for (i = low; i < high; i++)
- if (Conf->Spell[i]->p.d.len > level)
+ if (ConfBuild->Spell[i]->p.d.len > level)
{
- if (lastchar != Conf->Spell[i]->word[level])
+ if (lastchar != ConfBuild->Spell[i]->word[level])
{
if (lastchar)
{
/* Next level of the prefix tree */
- data->node = mkSPNode(Conf, lownew, i, level + 1);
+ new_offset = mkSPNode(ConfBuild, lownew, i, level + 1);
+
+ /*
+ * ConfBuild->DictNodes can be repalloc'ed within
+ * mkSPNode(), so reinitialize pointers.
+ */
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
+
+ /* First save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ /* Work with next node */
+ data_index++;
+ Assert(data_index < nchar);
+ data = &(rs->data[data_index]);
+
lownew = i;
- data++;
}
- lastchar = Conf->Spell[i]->word[level];
+ lastchar = ConfBuild->Spell[i]->word[level];
}
- data->val = ((uint8 *) (Conf->Spell[i]->word))[level];
- if (Conf->Spell[i]->p.d.len == level + 1)
+ data->val = ((uint8 *) (ConfBuild->Spell[i]->word))[level];
+ if (ConfBuild->Spell[i]->p.d.len == level + 1)
{
bool clearCompoundOnly = false;
- if (data->isword && data->affix != Conf->Spell[i]->p.d.affix)
+ if (data->isword && data->affix != ConfBuild->Spell[i]->p.d.affix)
{
/*
* MergeAffix called a few times. If one of word is
@@ -1669,15 +1899,17 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
*/
clearCompoundOnly = (FF_COMPOUNDONLY & data->compoundflag
- & makeCompoundFlags(Conf, Conf->Spell[i]->p.d.affix))
+ & makeCompoundFlags(ConfBuild,
+ ConfBuild->Spell[i]->p.d.affix))
? false : true;
- data->affix = MergeAffix(Conf, data->affix, Conf->Spell[i]->p.d.affix);
+ data->affix = MergeAffix(ConfBuild, data->affix,
+ ConfBuild->Spell[i]->p.d.affix);
}
else
- data->affix = Conf->Spell[i]->p.d.affix;
+ data->affix = ConfBuild->Spell[i]->p.d.affix;
data->isword = 1;
- data->compoundflag = makeCompoundFlags(Conf, data->affix);
+ data->compoundflag = makeCompoundFlags(ConfBuild, data->affix);
if ((data->compoundflag & FF_COMPOUNDONLY) &&
(data->compoundflag & FF_COMPOUNDFLAG) == 0)
@@ -1689,9 +1921,19 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
}
/* Next level of the prefix tree */
- data->node = mkSPNode(Conf, lownew, high, level + 1);
+ new_offset = mkSPNode(ConfBuild, lownew, high, level + 1);
+
+ /*
+ * ConfBuild->DictNodes can be repalloc'ed within mkSPNode(), so
+ * reinitialize pointers.
+ */
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
- return rs;
+ /* Save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ return rs_offset;
}
/*
@@ -1699,7 +1941,7 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
* and affixes.
*/
void
-NISortDictionary(IspellDict *Conf)
+NISortDictionary(IspellDictBuild *ConfBuild)
{
int i;
int naffix = 0;
@@ -1708,81 +1950,81 @@ NISortDictionary(IspellDict *Conf)
/* compress affixes */
/*
- * If we use flag aliases then we need to use Conf->AffixData filled in
+ * If we use flag aliases then we need to use ConfBuild->AffixData filled in
* the NIImportOOAffixes().
*/
- if (Conf->useFlagAliases)
+ if (ConfBuild->dict->useFlagAliases)
{
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
char *end;
- if (*Conf->Spell[i]->p.flag != '\0')
+ if (*ConfBuild->Spell[i]->p.flag != '\0')
{
- curaffix = strtol(Conf->Spell[i]->p.flag, &end, 10);
- if (Conf->Spell[i]->p.flag == end || errno == ERANGE)
+ curaffix = strtol(ConfBuild->Spell[i]->p.flag, &end, 10);
+ if (ConfBuild->Spell[i]->p.flag == end || errno == ERANGE)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix alias \"%s\"",
- Conf->Spell[i]->p.flag)));
+ ConfBuild->Spell[i]->p.flag)));
}
else
{
/*
- * If Conf->Spell[i]->p.flag is empty, then get empty value of
- * Conf->AffixData (0 index).
+ * If ConfBuild->Spell[i]->p.flag is empty, then get empty
+ * value of ConfBuild->AffixData (0 index).
*/
curaffix = 0;
}
- Conf->Spell[i]->p.d.affix = curaffix;
- Conf->Spell[i]->p.d.len = strlen(Conf->Spell[i]->word);
+ ConfBuild->Spell[i]->p.d.affix = curaffix;
+ ConfBuild->Spell[i]->p.d.len = strlen(ConfBuild->Spell[i]->word);
}
}
- /* Otherwise fill Conf->AffixData here */
+ /* Otherwise fill ConfBuild->AffixData here */
else
{
/* Count the number of different flags used in the dictionary */
- qsort((void *) Conf->Spell, Conf->nspell, sizeof(SPELL *),
+ qsort((void *) ConfBuild->Spell, ConfBuild->nSpell, sizeof(SPELL *),
cmpspellaffix);
naffix = 0;
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
if (i == 0
- || strcmp(Conf->Spell[i]->p.flag, Conf->Spell[i - 1]->p.flag))
+ || strcmp(ConfBuild->Spell[i]->p.flag,
+ ConfBuild->Spell[i - 1]->p.flag))
naffix++;
}
/*
- * Fill in Conf->AffixData with the affixes that were used in the
- * dictionary. Replace textual flag-field of Conf->Spell entries with
- * indexes into Conf->AffixData array.
+ * Fill in AffixData with the affixes that were used in the
+ * dictionary. Replace textual flag-field of ConfBuild->Spell entries
+ * with indexes into ConfBuild->AffixData array.
*/
- Conf->AffixData = (char **) palloc0(naffix * sizeof(char *));
+ InitAffixData(ConfBuild, naffix);
curaffix = -1;
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
if (i == 0
- || strcmp(Conf->Spell[i]->p.flag, Conf->AffixData[curaffix]))
+ || strcmp(ConfBuild->Spell[i]->p.flag,
+ AffixDataGet(ConfBuild, curaffix)))
{
curaffix++;
Assert(curaffix < naffix);
- Conf->AffixData[curaffix] = cpstrdup(Conf,
- Conf->Spell[i]->p.flag);
+ AddAffixSet(ConfBuild, ConfBuild->Spell[i]->p.flag,
+ strlen(ConfBuild->Spell[i]->p.flag));
}
- Conf->Spell[i]->p.d.affix = curaffix;
- Conf->Spell[i]->p.d.len = strlen(Conf->Spell[i]->word);
+ ConfBuild->Spell[i]->p.d.affix = curaffix;
+ ConfBuild->Spell[i]->p.d.len = strlen(ConfBuild->Spell[i]->word);
}
-
- Conf->lenAffixData = Conf->nAffixData = naffix;
}
/* Start build a prefix tree */
- qsort((void *) Conf->Spell, Conf->nspell, sizeof(SPELL *), cmpspell);
- Conf->Dictionary = mkSPNode(Conf, 0, Conf->nspell, 0);
+ qsort((void *) ConfBuild->Spell, ConfBuild->nSpell, sizeof(SPELL *), cmpspell);
+ mkSPNode(ConfBuild, 0, ConfBuild->nSpell, 0);
}
/*
@@ -1790,83 +2032,104 @@ NISortDictionary(IspellDict *Conf)
* rule. Affixes with empty replace string do not include in the prefix tree.
* This affixes are included by mkVoidAffix().
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* low: lower index of the Conf->Affix array.
* high: upper index of the Conf->Affix array.
* level: current prefix tree level.
* type: FF_SUFFIX or FF_PREFIX.
+ *
+ * Returns an offset in nodes array.
*/
-static AffixNode *
-mkANode(IspellDict *Conf, int low, int high, int level, int type)
+static uint32
+mkANode(IspellDictBuild *ConfBuild, int low, int high, int level, int type)
{
int i;
int nchar = 0;
uint8 lastchar = '\0';
+ NodeArray *array;
+ uint32 rs_offset,
+ new_offset;
AffixNode *rs;
AffixNodeData *data;
+ int data_index = 0;
int lownew = low;
- int naff;
- AFFIX **aff;
for (i = low; i < high; i++)
- if (Conf->Affix[i].replen > level && lastchar != GETCHAR(Conf->Affix + i, level, type))
+ if (ConfBuild->Affix[i]->replen > level &&
+ lastchar != GETCHAR(ConfBuild->Affix[i], level, type))
{
nchar++;
- lastchar = GETCHAR(Conf->Affix + i, level, type);
+ lastchar = GETCHAR(ConfBuild->Affix[i], level, type);
}
if (!nchar)
- return NULL;
+ return ISPELL_INVALID_OFFSET;
- aff = (AFFIX **) tmpalloc(sizeof(AFFIX *) * (high - low + 1));
- naff = 0;
+ if (type == FF_SUFFIX)
+ array = &ConfBuild->SuffixNodes;
+ else
+ array = &ConfBuild->PrefixNodes;
- rs = (AffixNode *) cpalloc0(ANHRDSZ + nchar * sizeof(AffixNodeData));
- rs->length = nchar;
- data = rs->data;
+ rs_offset = AllocateAffixNode(ConfBuild, array, nchar);
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
+ data = (AffixNodeData *) rs->data;
lastchar = '\0';
for (i = low; i < high; i++)
- if (Conf->Affix[i].replen > level)
+ if (ConfBuild->Affix[i]->replen > level)
{
- if (lastchar != GETCHAR(Conf->Affix + i, level, type))
+ if (lastchar != GETCHAR(ConfBuild->Affix[i], level, type))
{
if (lastchar)
{
/* Next level of the prefix tree */
- data->node = mkANode(Conf, lownew, i, level + 1, type);
- if (naff)
- {
- data->naff = naff;
- data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * naff);
- memcpy(data->aff, aff, sizeof(AFFIX *) * naff);
- naff = 0;
- }
- data++;
+ new_offset = mkANode(ConfBuild, lownew, i, level + 1, type);
+
+ /*
+ * array can be repalloc'ed within mkANode(), so
+ * reinitialize pointers.
+ */
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
+
+ /* First save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ /* Handle next data node */
+ data_index++;
+ Assert(data_index < nchar);
+ data = &(rs->data[data_index]);
+
lownew = i;
}
- lastchar = GETCHAR(Conf->Affix + i, level, type);
+ lastchar = GETCHAR(ConfBuild->Affix[i], level, type);
}
- data->val = GETCHAR(Conf->Affix + i, level, type);
- if (Conf->Affix[i].replen == level + 1)
+ data->val = GETCHAR(ConfBuild->Affix[i], level, type);
+ if (ConfBuild->Affix[i]->replen == level + 1)
{ /* affix stopped */
- aff[naff++] = Conf->Affix + i;
+ if (data->affstart == ISPELL_INVALID_INDEX)
+ {
+ data->affstart = i;
+ data->affend = i;
+ }
+ else
+ data->affend = i;
}
}
/* Next level of the prefix tree */
- data->node = mkANode(Conf, lownew, high, level + 1, type);
- if (naff)
- {
- data->naff = naff;
- data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * naff);
- memcpy(data->aff, aff, sizeof(AFFIX *) * naff);
- naff = 0;
- }
+ new_offset = mkANode(ConfBuild, lownew, high, level + 1, type);
+
+ /*
+ * array can be repalloc'ed within mkANode(), so reinitialize pointers.
+ */
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
- pfree(aff);
+ /* Save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
- return rs;
+ return rs_offset;
}
/*
@@ -1874,137 +2137,154 @@ mkANode(IspellDict *Conf, int low, int high, int level, int type)
* for affixes which have empty replace string ("repl" field).
*/
static void
-mkVoidAffix(IspellDict *Conf, bool issuffix, int startsuffix)
+mkVoidAffix(IspellDictBuild *ConfBuild, bool issuffix, int startsuffix)
{
- int i,
- cnt = 0;
+ int i;
int start = (issuffix) ? startsuffix : 0;
- int end = (issuffix) ? Conf->naffixes : startsuffix;
- AffixNode *Affix = (AffixNode *) palloc0(ANHRDSZ + sizeof(AffixNodeData));
-
- Affix->length = 1;
- Affix->isvoid = 1;
+ int end = (issuffix) ? ConfBuild->nAffix : startsuffix;
+ uint32 node_offset;
+ NodeArray *array;
+ AffixNode *Affix;
+ AffixNodeData *AffixData;
if (issuffix)
- {
- Affix->data->node = Conf->Suffix;
- Conf->Suffix = Affix;
- }
+ array = &ConfBuild->SuffixNodes;
else
- {
- Affix->data->node = Conf->Prefix;
- Conf->Prefix = Affix;
- }
+ array = &ConfBuild->PrefixNodes;
- /* Count affixes with empty replace string */
- for (i = start; i < end; i++)
- if (Conf->Affix[i].replen == 0)
- cnt++;
-
- /* There is not affixes with empty replace string */
- if (cnt == 0)
- return;
+ node_offset = AllocateAffixNode(ConfBuild, array, 1);
+ Affix = (AffixNode *) NodeArrayGet(array, node_offset);
- Affix->data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * cnt);
- Affix->data->naff = (uint32) cnt;
+ Affix->isvoid = 1;
+ AffixData = (AffixNodeData *) Affix->data;
- cnt = 0;
for (i = start; i < end; i++)
- if (Conf->Affix[i].replen == 0)
+ if (ConfBuild->Affix[i]->replen == 0)
{
- Affix->data->aff[cnt] = Conf->Affix + i;
- cnt++;
+ if (AffixData->affstart == ISPELL_INVALID_INDEX)
+ {
+ AffixData->affstart = i;
+ AffixData->affend = i;
+ }
+ else
+ AffixData->affend = i;
}
}
/*
- * Checks if the affixflag is used by dictionary. Conf->AffixData does not
+ * Checks if the affixflag is used by dictionary. AffixData does not
* contain affixflag if this flag is not used actually by the .dict file.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* affixflag: affix flag.
*
- * Returns true if the Conf->AffixData array contains affixflag, otherwise
+ * Returns true if the ConfBuild->AffixData array contains affixflag, otherwise
* returns false.
*/
static bool
-isAffixInUse(IspellDict *Conf, char *affixflag)
+isAffixInUse(IspellDictBuild *ConfBuild, char *affixflag)
{
int i;
- for (i = 0; i < Conf->nAffixData; i++)
- if (IsAffixFlagInUse(Conf, i, affixflag))
+ for (i = 0; i < ConfBuild->nAffixData; i++)
+ if (IsAffixFlagInUse(ConfBuild->dict->flagMode,
+ AffixDataGet(ConfBuild, i), affixflag))
return true;
return false;
}
/*
- * Builds Conf->Prefix and Conf->Suffix trees from the imported affixes.
+ * Builds Prefix and Suffix trees from the imported affixes.
*/
void
-NISortAffixes(IspellDict *Conf)
+NISortAffixes(IspellDictBuild *ConfBuild)
{
AFFIX *Affix;
+ AffixNode *voidPrefix,
+ *voidSuffix;
size_t i;
CMPDAffix *ptr;
- int firstsuffix = Conf->naffixes;
+ int firstsuffix = ConfBuild->nAffix;
+ uint32 prefix_offset,
+ suffix_offset;
- if (Conf->naffixes == 0)
+ if (ConfBuild->nAffix == 0)
return;
/* Store compound affixes in the Conf->CompoundAffix array */
- if (Conf->naffixes > 1)
- qsort((void *) Conf->Affix, Conf->naffixes, sizeof(AFFIX), cmpaffix);
- Conf->CompoundAffix = ptr = (CMPDAffix *) palloc(sizeof(CMPDAffix) * Conf->naffixes);
- ptr->affix = NULL;
-
- for (i = 0; i < Conf->naffixes; i++)
+ if (ConfBuild->nAffix > 1)
+ qsort((void *) ConfBuild->Affix, ConfBuild->nAffix,
+ sizeof(AFFIX *), cmpaffix);
+ ConfBuild->nCompoundAffix = ConfBuild->nAffix;
+ ConfBuild->CompoundAffix = ptr =
+ (CMPDAffix *) tmpalloc(sizeof(CMPDAffix) * ConfBuild->nCompoundAffix);
+ ptr->affix = ISPELL_INVALID_INDEX;
+
+ for (i = 0; i < ConfBuild->nAffix; i++)
{
- Affix = &(((AFFIX *) Conf->Affix)[i]);
+ Affix = ConfBuild->Affix[i];
if (Affix->type == FF_SUFFIX && i < firstsuffix)
firstsuffix = i;
if ((Affix->flagflags & FF_COMPOUNDFLAG) && Affix->replen > 0 &&
- isAffixInUse(Conf, Affix->flag))
+ isAffixInUse(ConfBuild, AffixFieldFlag(Affix)))
{
- if (ptr == Conf->CompoundAffix ||
+ if (ptr == ConfBuild->CompoundAffix ||
ptr->issuffix != (ptr - 1)->issuffix ||
- strbncmp((const unsigned char *) (ptr - 1)->affix,
- (const unsigned char *) Affix->repl,
+ strbncmp((const unsigned char *) AffixFieldRepl(ConfBuild->Affix[(ptr - 1)->affix]),
+ (const unsigned char *) AffixFieldRepl(Affix),
(ptr - 1)->len))
{
/* leave only unique and minimals suffixes */
- ptr->affix = Affix->repl;
+ ptr->affix = i;
ptr->len = Affix->replen;
ptr->issuffix = (Affix->type == FF_SUFFIX);
ptr++;
}
}
}
- ptr->affix = NULL;
- Conf->CompoundAffix = (CMPDAffix *) repalloc(Conf->CompoundAffix, sizeof(CMPDAffix) * (ptr - Conf->CompoundAffix + 1));
+ ptr->affix = ISPELL_INVALID_INDEX;
+ ConfBuild->nCompoundAffix = ptr - ConfBuild->CompoundAffix + 1;
+ ConfBuild->CompoundAffix = (CMPDAffix *) repalloc(ConfBuild->CompoundAffix,
+ sizeof(CMPDAffix) * (ConfBuild->nCompoundAffix));
/* Start build a prefix tree */
- Conf->Prefix = mkANode(Conf, 0, firstsuffix, 0, FF_PREFIX);
- Conf->Suffix = mkANode(Conf, firstsuffix, Conf->naffixes, 0, FF_SUFFIX);
- mkVoidAffix(Conf, true, firstsuffix);
- mkVoidAffix(Conf, false, firstsuffix);
+ mkVoidAffix(ConfBuild, true, firstsuffix);
+ mkVoidAffix(ConfBuild, false, firstsuffix);
+
+ prefix_offset = mkANode(ConfBuild, 0, firstsuffix, 0, FF_PREFIX);
+ suffix_offset = mkANode(ConfBuild, firstsuffix, ConfBuild->nAffix, 0,
+ FF_SUFFIX);
+
+ /* Adjust offsets of new nodes for nodes of void affixes */
+ voidPrefix = (AffixNode *) NodeArrayGet(&ConfBuild->PrefixNodes, 0);
+ voidPrefix->data[0].node_offset = prefix_offset;
+
+ voidSuffix = (AffixNode *) NodeArrayGet(&ConfBuild->SuffixNodes, 0);
+ voidSuffix->data[0].node_offset = suffix_offset;
}
static AffixNodeData *
-FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
+FindAffixes(IspellDictData *dict, AffixNode *node, const char *word, int wrdlen,
+ int *level, int type)
{
+ AffixNode *node_start;
AffixNodeData *StopLow,
*StopHigh,
*StopMiddle;
uint8 symbol;
+ if (type == FF_PREFIX)
+ node_start = (AffixNode *) DictPrefixNodes(dict);
+ else
+ node_start = (AffixNode *) DictSuffixNodes(dict);
+
if (node->isvoid)
{ /* search void affixes */
- if (node->data->naff)
+ if (node->data->affstart != ISPELL_INVALID_INDEX)
return node->data;
- node = node->data->node;
+ node = (AffixNode *) DictNodeGet(node_start, node->data->node_offset);
}
while (node && *level < wrdlen)
@@ -2019,9 +2299,10 @@ FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
if (StopMiddle->val == symbol)
{
(*level)++;
- if (StopMiddle->naff)
+ if (StopMiddle->affstart != ISPELL_INVALID_INDEX)
return StopMiddle;
- node = StopMiddle->node;
+ node = (AffixNode *) DictNodeGet(node_start,
+ StopMiddle->node_offset);
break;
}
else if (StopMiddle->val < symbol)
@@ -2035,8 +2316,67 @@ FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
return NULL;
}
+/*
+ * Compile regular expression on first use and store it within reg.
+ */
+static void
+CompileAffixReg(AffixReg *reg, bool isregis, int type,
+ const char *mask, int masklen, MemoryContext dictCtx)
+{
+ MemoryContext oldcontext;
+
+ Assert(dictCtx);
+
+ /*
+ * Switch to memory context of the dictionary, so compiled expression can be
+ * used in other queries.
+ */
+ oldcontext = MemoryContextSwitchTo(dictCtx);
+
+ if (isregis)
+ RS_compile(&reg->r.regis, (type == FF_SUFFIX), mask);
+ else
+ {
+ int wmasklen;
+ int err;
+ pg_wchar *wmask;
+ char *tmask;
+
+ tmask = (char *) palloc(masklen + 3);
+ if (type == FF_SUFFIX)
+ sprintf(tmask, "%s$", mask);
+ else
+ sprintf(tmask, "^%s", mask);
+
+ masklen = strlen(tmask);
+ wmask = (pg_wchar *) palloc((masklen + 1) * sizeof(pg_wchar));
+ wmasklen = pg_mb2wchar_with_len(tmask, wmask, masklen);
+
+ err = pg_regcomp(&(reg->r.regex), wmask, wmasklen,
+ REG_ADVANCED | REG_NOSUB,
+ DEFAULT_COLLATION_OID);
+ if (err)
+ {
+ char errstr[100];
+
+ pg_regerror(err, &(reg->r.regex), errstr, sizeof(errstr));
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
+ errmsg("invalid regular expression: %s", errstr)));
+ }
+
+ pfree(wmask);
+ pfree(tmask);
+ }
+
+ reg->iscompiled = true;
+
+ MemoryContextSwitchTo(oldcontext);
+}
+
static char *
-CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *newword, int *baselen)
+CheckAffix(const char *word, size_t len, AFFIX *Affix, AffixReg *reg,
+ int flagflags, char *newword, int *baselen, MemoryContext dictCtx)
{
/*
* Check compound allow flags
@@ -2076,7 +2416,7 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
if (Affix->type == FF_SUFFIX)
{
strcpy(newword, word);
- strcpy(newword + len - Affix->replen, Affix->find);
+ strcpy(newword + len - Affix->replen, AffixFieldFind(Affix));
if (baselen) /* store length of non-changed part of word */
*baselen = len - Affix->replen;
}
@@ -2086,9 +2426,9 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
* if prefix is an all non-changed part's length then all word
* contains only prefix and suffix, so out
*/
- if (baselen && *baselen + strlen(Affix->find) <= Affix->replen)
+ if (baselen && *baselen + Affix->findlen <= Affix->replen)
return NULL;
- strcpy(newword, Affix->find);
+ strcpy(newword, AffixFieldFind(Affix));
strcat(newword, word + Affix->replen);
}
@@ -2099,7 +2439,12 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
return newword;
else if (Affix->isregis)
{
- if (RS_execute(&(Affix->reg.regis), newword))
+ /* Compile the regular expression on first demand */
+ if (!reg->iscompiled)
+ CompileAffixReg(reg, Affix->isregis, Affix->type,
+ AffixFieldMask(Affix), Affix->masklen, dictCtx);
+
+ if (RS_execute(&(reg->r.regis), newword))
return newword;
}
else
@@ -2109,12 +2454,17 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
size_t data_len;
int newword_len;
+ /* Compile the regular expression on first demand */
+ if (!reg->iscompiled)
+ CompileAffixReg(reg, Affix->isregis, Affix->type,
+ AffixFieldMask(Affix), Affix->masklen, dictCtx);
+
/* Convert data string to wide characters */
newword_len = strlen(newword);
data = (pg_wchar *) palloc((newword_len + 1) * sizeof(pg_wchar));
data_len = pg_mb2wchar_with_len(newword, data, newword_len);
- if (!(err = pg_regexec(&(Affix->reg.regex), data, data_len, 0, NULL, 0, NULL, 0)))
+ if (!(err = pg_regexec(&(reg->r.regex), data, data_len, 0, NULL, 0, NULL, 0)))
{
pfree(data);
return newword;
@@ -2153,7 +2503,7 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
char **cur;
char newword[2 * MAXNORMLEN] = "";
char pnewword[2 * MAXNORMLEN] = "";
- AffixNode *snode = Conf->Suffix,
+ AffixNode *snode = (AffixNode *) DictSuffixNodes(Conf->dict),
*pnode;
int i,
j;
@@ -2165,7 +2515,7 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
/* Check that the word itself is normal form */
- if (FindWord(Conf, word, VoidString, flag))
+ if (FindWord(Conf->dict, word, VoidString, flag))
{
*cur = pstrdup(word);
cur++;
@@ -2173,23 +2523,29 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
}
/* Find all other NORMAL forms of the 'word' (check only prefix) */
- pnode = Conf->Prefix;
+ pnode = (AffixNode *) DictPrefixNodes(Conf->dict);
plevel = 0;
while (pnode)
{
- prefix = FindAffixes(pnode, word, wrdlen, &plevel, FF_PREFIX);
+ prefix = FindAffixes(Conf->dict, pnode, word, wrdlen, &plevel, FF_PREFIX);
if (!prefix)
break;
- for (j = 0; j < prefix->naff; j++)
+ for (j = prefix->affstart; j <= prefix->affend; j++)
{
- if (CheckAffix(word, wrdlen, prefix->aff[j], flag, newword, NULL))
+ AFFIX *affix = (AFFIX *) DictAffixGet(Conf->dict, j);
+ AffixReg *reg = &(Conf->reg[j]);
+
+ if (affix &&
+ CheckAffix(word, wrdlen, affix, reg, flag, newword, NULL,
+ Conf->dictCtx))
{
/* prefix success */
- if (FindWord(Conf, newword, prefix->aff[j]->flag, flag))
+ if (FindWord(Conf->dict, newword, AffixFieldFlag(affix), flag))
cur += addToResult(forms, cur, newword);
}
}
- pnode = prefix->node;
+ pnode = (AffixNode *) DictNodeGet(DictPrefixNodes(Conf->dict),
+ prefix->node_offset);
}
/*
@@ -2201,45 +2557,59 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
int baselen = 0;
/* find possible suffix */
- suffix = FindAffixes(snode, word, wrdlen, &slevel, FF_SUFFIX);
+ suffix = FindAffixes(Conf->dict, snode, word, wrdlen, &slevel,
+ FF_SUFFIX);
if (!suffix)
break;
/* foreach suffix check affix */
- for (i = 0; i < suffix->naff; i++)
+ for (i = suffix->affstart; i <= suffix->affend; i++)
{
- if (CheckAffix(word, wrdlen, suffix->aff[i], flag, newword, &baselen))
+ AFFIX *sufentry = (AFFIX *) DictAffixGet(Conf->dict, i);
+ AffixReg *sufreg = &(Conf->reg[i]);
+
+ if (sufentry &&
+ CheckAffix(word, wrdlen, sufentry, sufreg, flag, newword, &baselen,
+ Conf->dictCtx))
{
/* suffix success */
- if (FindWord(Conf, newword, suffix->aff[i]->flag, flag))
+ if (FindWord(Conf->dict, newword, AffixFieldFlag(sufentry), flag))
cur += addToResult(forms, cur, newword);
/* now we will look changed word with prefixes */
- pnode = Conf->Prefix;
+ pnode = (AffixNode *) DictPrefixNodes(Conf->dict);
plevel = 0;
swrdlen = strlen(newword);
while (pnode)
{
- prefix = FindAffixes(pnode, newword, swrdlen, &plevel, FF_PREFIX);
+ prefix = FindAffixes(Conf->dict, pnode, newword, swrdlen,
+ &plevel, FF_PREFIX);
if (!prefix)
break;
- for (j = 0; j < prefix->naff; j++)
+ for (j = prefix->affstart; j <= prefix->affend; j++)
{
- if (CheckAffix(newword, swrdlen, prefix->aff[j], flag, pnewword, &baselen))
+ AFFIX *prefentry = (AFFIX *) DictAffixGet(Conf->dict, j);
+ AffixReg *prefreg = &(Conf->reg[j]);
+
+ if (prefentry &&
+ CheckAffix(newword, swrdlen, prefentry, prefreg,
+ flag, pnewword, &baselen, Conf->dictCtx))
{
/* prefix success */
- char *ff = (prefix->aff[j]->flagflags & suffix->aff[i]->flagflags & FF_CROSSPRODUCT) ?
- VoidString : prefix->aff[j]->flag;
+ char *ff = (prefentry->flagflags & sufentry->flagflags & FF_CROSSPRODUCT) ?
+ VoidString : AffixFieldFlag(prefentry);
- if (FindWord(Conf, pnewword, ff, flag))
+ if (FindWord(Conf->dict, pnewword, ff, flag))
cur += addToResult(forms, cur, pnewword);
}
}
- pnode = prefix->node;
+ pnode = (AffixNode *) DictNodeGet(DictPrefixNodes(Conf->dict),
+ prefix->node_offset);
}
}
}
- snode = suffix->node;
+ snode = (AffixNode *) DictNodeGet(DictSuffixNodes(Conf->dict),
+ suffix->node_offset);
}
if (cur == forms)
@@ -2259,7 +2629,8 @@ typedef struct SplitVar
} SplitVar;
static int
-CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
+CheckCompoundAffixes(IspellDictData *dict, CMPDAffix **ptr,
+ char *word, int len, bool CheckInPlace)
{
bool issuffix;
@@ -2269,9 +2640,12 @@ CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
if (CheckInPlace)
{
- while ((*ptr)->affix)
+ while ((*ptr)->affix != ISPELL_INVALID_INDEX)
{
- if (len > (*ptr)->len && strncmp((*ptr)->affix, word, (*ptr)->len) == 0)
+ AFFIX *affix = (AFFIX *) DictAffixGet(dict, (*ptr)->affix);
+
+ if (len > (*ptr)->len &&
+ strncmp(AffixFieldRepl(affix), word, (*ptr)->len) == 0)
{
len = (*ptr)->len;
issuffix = (*ptr)->issuffix;
@@ -2285,9 +2659,12 @@ CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
{
char *affbegin;
- while ((*ptr)->affix)
+ while ((*ptr)->affix != ISPELL_INVALID_INDEX)
{
- if (len > (*ptr)->len && (affbegin = strstr(word, (*ptr)->affix)) != NULL)
+ AFFIX *affix = (AFFIX *) DictAffixGet(dict, (*ptr)->affix);
+
+ if (len > (*ptr)->len &&
+ (affbegin = strstr(word, AffixFieldRepl(affix))) != NULL)
{
len = (*ptr)->len + (affbegin - word);
issuffix = (*ptr)->issuffix;
@@ -2339,13 +2716,14 @@ AddStem(SplitVar *v, char *word)
}
static SplitVar *
-SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int wordlen, int startpos, int minpos)
+SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig,
+ char *word, int wordlen, int startpos, int minpos)
{
SplitVar *var = NULL;
SPNodeData *StopLow,
*StopHigh,
*StopMiddle = NULL;
- SPNode *node = (snode) ? snode : Conf->Dictionary;
+ SPNode *node = (snode) ? snode : (SPNode *) DictDictNodes(Conf->dict);
int level = (snode) ? minpos : startpos; /* recursive
* minpos==level */
int lenaff;
@@ -2360,8 +2738,11 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
while (level < wordlen)
{
/* find word with epenthetic or/and compound affix */
- caff = Conf->CompoundAffix;
- while (level > startpos && (lenaff = CheckCompoundAffixes(&caff, word + level, wordlen - level, (node) ? true : false)) >= 0)
+ caff = (CMPDAffix *) DictCompoundAffix(Conf->dict);
+ while (level > startpos &&
+ (lenaff = CheckCompoundAffixes(Conf->dict, &caff,
+ word + level, wordlen - level,
+ (node) ? true : false)) >= 0)
{
/*
* there is one of compound affixes, so check word for existings
@@ -2408,7 +2789,8 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
while (ptr->next)
ptr = ptr->next;
- ptr->next = SplitToVariants(Conf, NULL, new, word, wordlen, startpos + lenaff, startpos + lenaff);
+ ptr->next = SplitToVariants(Conf, NULL, new, word, wordlen,
+ startpos + lenaff, startpos + lenaff);
pfree(new->stem);
pfree(new);
@@ -2467,13 +2849,14 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
/* we can find next word */
level++;
AddStem(var, pnstrdup(word + startpos, level - startpos));
- node = Conf->Dictionary;
+ node = (SPNode *) DictDictNodes(Conf->dict);
startpos = level;
continue;
}
}
}
- node = StopMiddle->node;
+ node = (SPNode *) DictNodeGet(DictDictNodes(Conf->dict),
+ StopMiddle->node_offset);
}
else
node = NULL;
@@ -2523,7 +2906,7 @@ NINormalizeWord(IspellDict *Conf, char *word)
pfree(res);
}
- if (Conf->usecompound)
+ if (Conf->dict->usecompound)
{
int wordlen = strlen(word);
SplitVar *ptr,
diff --git a/src/include/tsearch/dicts/spell.h b/src/include/tsearch/dicts/spell.h
index 210f97dda9..b40cf379eb 100644
--- a/src/include/tsearch/dicts/spell.h
+++ b/src/include/tsearch/dicts/spell.h
@@ -18,21 +18,23 @@
#include "tsearch/dicts/regis.h"
#include "tsearch/ts_public.h"
+#define ISPELL_INVALID_INDEX (0x7FFFF)
+#define ISPELL_INVALID_OFFSET (0xFFFFFFFF)
+
/*
* SPNode and SPNodeData are used to represent prefix tree (Trie) to store
* a words list.
*/
-struct SPNode;
-
typedef struct
{
uint32 val:8,
isword:1,
/* Stores compound flags listed below */
compoundflag:4,
- /* Reference to an entry of the AffixData field */
+ /* Index of an entry of the AffixData field */
affix:19;
- struct SPNode *node;
+ /* Offset to a node of the DictNodes field */
+ uint32 node_offset;
} SPNodeData;
/*
@@ -86,21 +88,55 @@ typedef struct spell_struct
*/
typedef struct aff_struct
{
- char *flag;
/* FF_SUFFIX or FF_PREFIX */
- uint32 type:1,
+ uint16 type:1,
flagflags:7,
issimple:1,
isregis:1,
- replen:14;
- char *find;
- char *repl;
+ flaglen:2;
+
+ /* 8 bits could be too much for repl, find and mask, but who knows */
+ uint8 replen;
+ uint8 findlen;
+ uint8 masklen;
+
+ /*
+ * fields stores the following data (each ends with \0):
+ * - repl
+ * - find
+ * - mask
+ * - flag - one character (if FM_CHAR),
+ * two characters (if FM_LONG),
+ * number, >= 0 and < 65536 (if FM_NUM).
+ */
+ char fields[FLEXIBLE_ARRAY_MEMBER];
+} AFFIX;
+
+#define AF_FLAG_MAXSIZE 5 /* strlen(65536) */
+#define AF_REPL_MAXSIZE 255 /* 8 bits */
+
+#define AFFIXHDRSZ (offsetof(AFFIX, fields))
+
+#define AffixFieldRepl(af) ((af)->fields)
+#define AffixFieldFind(af) ((af)->fields + (af)->replen + 1)
+#define AffixFieldMask(af) (AffixFieldFind(af) + (af)->findlen + 1)
+#define AffixFieldFlag(af) (AffixFieldMask(af) + (af)->masklen + 1)
+#define AffixGetSize(af) (AFFIXHDRSZ + (af)->replen + 1 + (af)->findlen + 1 \
+ + (af)->masklen + 1 + strlen(AffixFieldFlag(af)) + 1)
+
+/*
+ * Stores compiled regular expression of affix. AffixReg uses mask field of
+ * AFFIX as a regular expression.
+ */
+typedef struct AffixReg
+{
+ bool iscompiled;
union
{
regex_t regex;
Regis regis;
- } reg;
-} AFFIX;
+ } r;
+} AffixReg;
/*
* affixes use dictionary flags too
@@ -120,14 +156,13 @@ typedef struct aff_struct
* AffixNode and AffixNodeData are used to represent prefix tree (Trie) to store
* an affix list.
*/
-struct AffixNode;
-
typedef struct
{
- uint32 val:8,
- naff:24;
- AFFIX **aff;
- struct AffixNode *node;
+ uint8 val;
+ uint32 affstart;
+ uint32 affend;
+ /* Offset to a node of the PrefixNodes or SuffixNodes field */
+ uint32 node_offset;
} AffixNodeData;
typedef struct AffixNode
@@ -139,9 +174,19 @@ typedef struct AffixNode
#define ANHRDSZ (offsetof(AffixNode, data))
+typedef struct NodeArray
+{
+ char *Nodes;
+ uint32 NodesSize; /* allocated size of Nodes */
+ uint32 NodesEnd; /* end of data in Nodes */
+} NodeArray;
+
+#define NodeArrayGet(na, of) (((of) == ISPELL_INVALID_OFFSET) ? NULL : (na)->Nodes + (of))
+
typedef struct
{
- char *affix;
+ /* Index of an affix of the Affix field */
+ uint32 affix;
int len;
bool issuffix;
} CMPDAffix;
@@ -176,30 +221,71 @@ typedef struct CompoundAffixFlag
#define FLAGNUM_MAXSIZE (1 << 16)
-typedef struct
+typedef struct IspellDictData
{
- int maffixes;
- int naffixes;
- AFFIX *Affix;
-
- AffixNode *Suffix;
- AffixNode *Prefix;
+ FlagMode flagMode;
+ bool usecompound;
- SPNode *Dictionary;
- /* Array of sets of affixes */
- char **AffixData;
- int lenAffixData;
- int nAffixData;
bool useFlagAliases;
- CMPDAffix *CompoundAffix;
+ uint32 nAffixData;
+ uint32 AffixDataStart;
- bool usecompound;
- FlagMode flagMode;
+ uint32 AffixOffsetStart;
+ uint32 AffixStart;
+ uint32 nAffix;
+
+ uint32 DictNodesStart;
+ uint32 PrefixNodesStart;
+ uint32 SuffixNodesStart;
+
+ uint32 CompoundAffixStart;
/*
- * All follow fields are actually needed only for initialization
+ * data stores:
+ * - AffixData - array of affix sets
+ * - Affix - sorted array of affixes
+ * - DictNodes - prefix tree of a word list
+ * - PrefixNodes - prefix tree of a prefix list
+ * - SuffixNodes - prefix tree of a suffix list
+ * - CompoundAffix - array of compound affixes
*/
+ char data[FLEXIBLE_ARRAY_MEMBER];
+} IspellDictData;
+
+#define IspellDictDataHdrSize (offsetof(IspellDictData, data))
+
+#define DictAffixDataOffset(d) ((d)->data)
+#define DictAffixData(d) ((d)->data + (d)->AffixDataStart)
+#define DictAffixDataGet(d, i) (((i) == ISPELL_INVALID_INDEX) ? NULL : \
+ DictAffixData(d) + ((uint32 *) DictAffixDataOffset(d))[i])
+
+#define DictAffixOffset(d) ((d)->data + (d)->AffixOffsetStart)
+#define DictAffix(d) ((d)->data + (d)->AffixStart)
+#define DictAffixGet(d, i) (((i) == ISPELL_INVALID_INDEX) ? NULL : \
+ DictAffix(d) + ((uint32 *) DictAffixOffset(d))[i])
+
+#define DictDictNodes(d) ((d)->data + (d)->DictNodesStart)
+#define DictPrefixNodes(d) ((d)->data + (d)->PrefixNodesStart)
+#define DictSuffixNodes(d) ((d)->data + (d)->SuffixNodesStart)
+#define DictNodeGet(node_start, of) (((of) == ISPELL_INVALID_OFFSET) ? NULL : \
+ (char *) (node_start) + (of))
+
+#define DictCompoundAffix(d) ((d)->data + (d)->CompoundAffixStart)
+
+/*
+ * IspellDictBuild is used to initialize IspellDictData struct. This is a
+ * temporary structure which is set up by NIStartBuild() and released by
+ * NIFinishBuild().
+ */
+typedef struct IspellDictBuild
+{
+ MemoryContext buildCxt; /* temp context for construction */
+
+ IspellDictData *dict;
+ uint32 dict_size;
+
+ /* Temporary data */
/* Array of Hunspell options in affix file */
CompoundAffixFlag *CompoundAffixFlags;
@@ -208,29 +294,73 @@ typedef struct
/* allocated length of CompoundAffixFlags array */
int mCompoundAffixFlag;
- /*
- * Remaining fields are only used during dictionary construction; they are
- * set up by NIStartBuild and cleared by NIFinishBuild.
- */
- MemoryContext buildCxt; /* temp context for construction */
-
- /* Temporary array of all words in the dict file */
+ /* Array of all words in the dict file */
SPELL **Spell;
- int nspell; /* number of valid entries in Spell array */
- int mspell; /* allocated length of Spell array */
+ int nSpell; /* number of valid entries in Spell array */
+ int mSpell; /* allocated length of Spell array */
+
+ /* Array of all affixes in the aff file */
+ AFFIX **Affix;
+ int nAffix; /* number of valid entries in Affix array */
+ int mAffix; /* allocated length of Affix array */
+ uint32 AffixSize;
+
+ /* Data for IspellDictData */
+
+ /* Array of sets of affixes */
+ uint32 *AffixDataOffset;
+ int nAffixData; /* number of affix sets */
+ int mAffixData; /* allocated number of affix sets */
+ char *AffixData;
+ uint32 AffixDataSize; /* allocated size of AffixData */
+ uint32 AffixDataEnd; /* end of data in AffixData */
+
+ /* Prefix tree which stores a word list */
+ NodeArray DictNodes;
+
+ /* Prefix tree which stores a prefix list */
+ NodeArray PrefixNodes;
+
+ /* Prefix tree which stores a suffix list */
+ NodeArray SuffixNodes;
- /* These are used to allocate "compact" data without palloc overhead */
- char *firstfree; /* first free address (always maxaligned) */
- size_t avail; /* free space remaining at firstfree */
+ /* Array of compound affixes */
+ CMPDAffix *CompoundAffix;
+ int nCompoundAffix; /* number of entries of CompoundAffix */
+} IspellDictBuild;
+
+#define AffixDataGet(d, i) ((d)->AffixData + (d)->AffixDataOffset[i])
+
+/*
+ * IspellDict is used within NINormalizeWord.
+ */
+typedef struct IspellDict
+{
+ /*
+ * Pointer to a DSM location of IspellDictData. Should be retrieved on
+ * every dispell_lexize() call.
+ */
+ IspellDictData *dict;
+ /*
+ * Array of regular expression of affixes. Each regular expression is
+ * compiled only on demand.
+ */
+ AffixReg *reg;
+ /*
+ * Memory context for compiling regular expressions.
+ */
+ MemoryContext dictCtx;
} IspellDict;
extern TSLexeme *NINormalizeWord(IspellDict *Conf, char *word);
-extern void NIStartBuild(IspellDict *Conf);
-extern void NIImportAffixes(IspellDict *Conf, const char *filename);
-extern void NIImportDictionary(IspellDict *Conf, const char *filename);
-extern void NISortDictionary(IspellDict *Conf);
-extern void NISortAffixes(IspellDict *Conf);
-extern void NIFinishBuild(IspellDict *Conf);
+extern void NIStartBuild(IspellDictBuild *ConfBuild);
+extern void NIImportAffixes(IspellDictBuild *ConfBuild, const char *filename);
+extern void NIImportDictionary(IspellDictBuild *ConfBuild,
+ const char *filename);
+extern void NISortDictionary(IspellDictBuild *ConfBuild);
+extern void NISortAffixes(IspellDictBuild *ConfBuild);
+extern void NICopyData(IspellDictBuild *ConfBuild);
+extern void NIFinishBuild(IspellDictBuild *ConfBuild);
#endif
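The node_offset fields in spell.h above replace pointers with offsets from the start of a flat buffer, so the data stays valid at whatever address the buffer is mapped. Below is a minimal standalone sketch of that idea, assuming a much simpler node layout than the patch uses. ToyNode, ToyArena, TOY_INVALID_OFFSET and the toy_* functions are hypothetical names for illustration only; none of them come from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TOY_INVALID_OFFSET 0xFFFFFFFFu

/* Hypothetical trie node: links are byte offsets, never pointers. */
typedef struct ToyNode
{
    char     val;            /* character stored in this node */
    uint8_t  isword;         /* 1 if a word ends here */
    uint32_t child_offset;   /* first child, or TOY_INVALID_OFFSET */
    uint32_t sibling_offset; /* next sibling, or TOY_INVALID_OFFSET */
} ToyNode;

/* Hypothetical growable buffer holding all nodes back to back. */
typedef struct ToyArena
{
    char    *data;
    uint32_t size; /* allocated bytes */
    uint32_t end;  /* used bytes */
} ToyArena;

/* Translate an offset into a pointer relative to the given base. */
ToyNode *toy_get(char *base, uint32_t off)
{
    return (off == TOY_INVALID_OFFSET) ? NULL : (ToyNode *) (base + off);
}

/* Append a node and return its offset; the buffer may move on growth. */
uint32_t toy_alloc_node(ToyArena *a, char val)
{
    uint32_t off = a->end;
    ToyNode *n;

    if (a->end + sizeof(ToyNode) > a->size)
    {
        a->size = a->size ? a->size * 2 : (uint32_t) (4 * sizeof(ToyNode));
        a->data = realloc(a->data, a->size);
        assert(a->data != NULL);
    }
    n = (ToyNode *) (a->data + off);
    n->val = val;
    n->isword = 0;
    n->child_offset = TOY_INVALID_OFFSET;
    n->sibling_offset = TOY_INVALID_OFFSET;
    a->end += (uint32_t) sizeof(ToyNode);
    return off;
}

/* Create the arena with an empty root node at offset 0. */
void toy_init(ToyArena *a)
{
    a->data = NULL;
    a->size = 0;
    a->end = 0;
    (void) toy_alloc_node(a, '\0');
}

void toy_insert(ToyArena *a, const char *word)
{
    uint32_t cur = 0;

    for (; *word; word++)
    {
        uint32_t child = toy_get(a->data, cur)->child_offset;

        while (child != TOY_INVALID_OFFSET &&
               toy_get(a->data, child)->val != *word)
            child = toy_get(a->data, child)->sibling_offset;

        if (child == TOY_INVALID_OFFSET)
        {
            /* Offsets stay valid even if realloc() moved the buffer. */
            child = toy_alloc_node(a, *word);
            toy_get(a->data, child)->sibling_offset =
                toy_get(a->data, cur)->child_offset;
            toy_get(a->data, cur)->child_offset = child;
        }
        cur = child;
    }
    toy_get(a->data, cur)->isword = 1;
}

/* Look up a word given only the base address of the buffer. */
int toy_find(char *base, const char *word)
{
    uint32_t cur = 0;

    for (; *word; word++)
    {
        uint32_t child = toy_get(base, cur)->child_offset;

        while (child != TOY_INVALID_OFFSET &&
               toy_get(base, child)->val != *word)
            child = toy_get(base, child)->sibling_offset;
        if (child == TOY_INVALID_OFFSET)
            return 0;
        cur = child;
    }
    return toy_get(base, cur)->isword;
}
```

Because toy_find() needs only a base address, the same bytes can be copied with memcpy() to another location (standing in here for another backend's mapping of the segment) and searched there without any fix-up.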
On Tue, Mar 27, 2018 at 8:19 AM, Arthur Zakirov
<a.zakirov@postgrespro.ru> wrote:
I assume the DSM infrastructure already has some solution for getting
rid of DSM segments when the last interested process disconnects,
so maybe you could piggyback on that somehow.
Yes, there is dsm_pin_mapping() for this. But it is necessary to keep a
segment even if there are no attached processes. From 0003:
+ /* Remain attached until end of postmaster */
+ dsm_pin_segment(seg);
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
I don't quite understand the problem you're trying to solve here, but:
1. Unless dsm_pin_segment() is called, a DSM segment will
automatically be removed when there are no remaining mappings.
2. Unless dsm_pin_mapping() is called, a DSM segment will be unmapped
when the currently-in-scope resource owner is cleaned up, like at the
end of the query. If it is called, then the mapping will stick around
until the backend exits.
If you pin the mapping or the segment and later no longer want it
pinned, there are dsm_unpin_mapping() and dsm_unpin_segment()
functions available, too. So it seems like what you might want to do
is pin the segment when it's created, and then unpin it if it's
stale/obsolete. The latter won't remove it immediately, but will once
all the mappings are gone.
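The pinning rules above can be illustrated with a toy model. This is only a sketch: ToySegment, ToyMapping and the toy_* functions are hypothetical and are not PostgreSQL's dsm.c API. They just mimic the lifetime rules from points 1 and 2 and the effect of unpinning the segment.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; not PostgreSQL code. */
typedef struct ToySegment
{
    int  nmappings;      /* backends that currently map the segment */
    bool segment_pinned; /* models the effect of dsm_pin_segment() */
    bool destroyed;
} ToySegment;

typedef struct ToyMapping
{
    ToySegment *seg;
    bool        mapped;
    bool        mapping_pinned; /* models the effect of dsm_pin_mapping() */
} ToyMapping;

/* Rule 1: an unpinned segment goes away with its last mapping. */
void toy_maybe_destroy(ToySegment *seg)
{
    if (seg->nmappings == 0 && !seg->segment_pinned)
        seg->destroyed = true;
}

void toy_attach(ToyMapping *m, ToySegment *seg)
{
    assert(!seg->destroyed);
    m->seg = seg;
    m->mapped = true;
    m->mapping_pinned = false;
    seg->nmappings++;
}

void toy_detach(ToyMapping *m)
{
    if (!m->mapped)
        return;
    m->mapped = false;
    m->seg->nmappings--;
    toy_maybe_destroy(m->seg);
}

/* Rule 2: resource-owner cleanup drops mappings that are not pinned. */
void toy_end_of_query(ToyMapping *m)
{
    if (m->mapped && !m->mapping_pinned)
        toy_detach(m);
}

/* Backend exit drops the mapping whether or not it was pinned. */
void toy_backend_exit(ToyMapping *m)
{
    toy_detach(m);
}

/* Unpinning does not remove the segment until all mappings are gone. */
void toy_unpin_segment(ToySegment *seg)
{
    seg->segment_pinned = false;
    toy_maybe_destroy(seg);
}
```

In this model a segment that is pinned at creation and unpinned once stale survives until the last backend that still maps it detaches or exits.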
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Hello,
On Tue, May 15, 2018 at 05:02:43PM -0400, Robert Haas wrote:
On Tue, Mar 27, 2018 at 8:19 AM, Arthur Zakirov
<a.zakirov@postgrespro.ru> wrote:
Yes, there is dsm_pin_mapping() for this. But it is necessary to keep a
segment even if there are no attached processes. From 0003:
+ /* Remain attached until end of postmaster */
+ dsm_pin_segment(seg);
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
I don't quite understand the problem you're trying to solve here, but:
1. Unless dsm_pin_segment() is called, a DSM segment will
automatically be removed when there are no remaining mappings.
2. Unless dsm_pin_mapping() is called, a DSM segment will be unmapped
when the currently-in-scope resource owner is cleaned up, like at the
end of the query. If it is called, then the mapping will stick around
until the backend exits.
I tried to solve the case where a DSM segment remains mapped even after the
dictionary was dropped. It may happen in the following situation:
Backend 1:
=# select ts_lexize('english_shared', 'test');
-- The dictionary is loaded into DSM; the segment and the mapping are
pinned
...
-- Call ts_lexize() from backend 2 below
=# drop text search dictionary english_shared;
-- The segment and the mapping are unpinned, see ts_dict_shmem_release()
Backend 2:
=# select ts_lexize('english_shared', 'test');
-- The dictionary is fetched from DSM; the mapping is pinned
...
-- The dictionary was dropped by backend 1, but the mapping is still
pinned
As you can see, the DSM is still pinned by backend 2. Later I fixed it by
checking whether segments need to be unpinned. In the current version of the
patch do_ts_dict_shmem_release() is called in
lookup_ts_dictionary_cache(). It unpins segments if the text search cache
was invalidated. It unpins all segments, but I think that is OK since
text search changes should be infrequent.
If you pin the mapping or the segment and later no longer want it
pinned, there are dsm_unpin_mapping() and dsm_unpin_segment()
functions available, too. So it seems like what you might want to do
is pin the segment when it's created, and then unpin it if it's
stale/obsolete. The latter won't remove it immediately, but will once
all the mappings are gone.
Yes, dsm_unpin_mapping() and dsm_unpin_segment() are called when the
dictionary is dropped or altered in the current version of the patch. I
described the approach above.
In sum, I think the problem is mostly solved. Backend 2 unpins the
segment on its next ts_lexize() call. But if backend 2 never calls
ts_lexize() (or another TS function) again, the segment will remain mapped.
It is the only problem I see for now.
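The remaining gap can be shown with a small model of the lazy release. This is a sketch under simplifying assumptions, not code from the patch: ToyBackend and the toy_* functions are hypothetical, and a single boolean stands in for a pinned mapping of the dictionary segment.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-backend state; names are illustrative only. */
typedef struct ToyBackend
{
    bool has_mapping;   /* backend keeps a pinned mapping of the segment */
    bool cache_invalid; /* text search cache was invalidated */
} ToyBackend;

/* Number of backends that still keep the segment mapped. */
int toy_count_mappings(const ToyBackend *backends, int n)
{
    int i;
    int count = 0;

    for (i = 0; i < n; i++)
        if (backends[i].has_mapping)
            count++;
    return count;
}

/*
 * Models DROP TEXT SEARCH DICTIONARY issued by backend "dropper": it
 * releases its own mapping at once, while the other backends only
 * learn that their cache is stale.
 */
void toy_drop_dictionary(ToyBackend *backends, int n, int dropper)
{
    int i;

    assert(dropper >= 0 && dropper < n);
    for (i = 0; i < n; i++)
        backends[i].cache_invalid = true;
    backends[dropper].has_mapping = false;
    backends[dropper].cache_invalid = false;
}

/*
 * Models the check done at the start of a dictionary lookup: a stale
 * cache causes the mapping to be released. Returns true if a mapping
 * was released by this call.
 */
bool toy_lookup(ToyBackend *b)
{
    bool released = false;

    if (b->cache_invalid && b->has_mapping)
    {
        b->has_mapping = false;
        released = true;
    }
    b->cache_invalid = false;
    return released;
}
```

With two backends, toy_count_mappings() stays at 1 after backend 0 drops the dictionary and reaches 0 only once backend 1 calls toy_lookup(). A backend that never performs another lookup keeps its mapping until it exits.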
I hope the description is clear. I attached the rebased patch.
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Attachments:
0004-Store-ispell-in-shared-location-v14.patch (text/plain; charset=us-ascii)
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index 19f58511c8..fbadc25d18 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -3114,6 +3114,21 @@ CREATE TEXT SEARCH DICTIONARY english_stem (
</sect2>
+ <sect2 id="textsearch-shared-dictionaries">
+ <title>Dictionaries in Shared Memory</title>
+
+ <para>
+ Some dictionaries, especially <application>Ispell</application>, consume
+ a significant amount of memory, in some cases tens of megabytes. Most of
+ them store the data in text files, and building the in-memory structure is
+ both CPU-intensive and time-consuming. Instead of doing this in each backend when
+ it needs a dictionary for the first time, the compiled dictionary may be
+ stored in dynamic shared memory so that it may be reused by other backends.
+ Currently only <application>Ispell</application> supports loading into
+ dynamic shared memory.
+ </para>
+ </sect2>
+
</sect1>
<sect1 id="textsearch-configuration">
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index 39f1e6faeb..ced52d2790 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -5,6 +5,15 @@
*
* Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
*
+ * By default all Ispell dictionaries are stored in DSM. But if the amount
+ * of memory exceeds max_shared_dictionaries_size, then the dictionary will be
+ * allocated in private backend memory (in dictCtx context).
+ *
+ * All necessary data are built within dispell_build() function. But
+ * structures for regular expressions are compiled on first demand and
+ * stored using AffixReg array. It is because regex_t and Regis cannot be
+ * stored in shared memory easily.
+ *
*
* IDENTIFICATION
* src/backend/tsearch/dict_ispell.c
@@ -14,8 +23,10 @@
#include "postgres.h"
#include "commands/defrem.h"
+#include "storage/dsm.h"
#include "tsearch/dicts/spell.h"
#include "tsearch/ts_locale.h"
+#include "tsearch/ts_shared.h"
#include "tsearch/ts_utils.h"
#include "utils/builtins.h"
@@ -26,54 +37,126 @@ typedef struct
IspellDict obj;
} DictISpell;
+static void parse_dictoptions(List *dictoptions,
+ char **dictfile, char **afffile, char **stopfile);
+static void *dispell_build(List *dictoptions, Size *size);
+
Datum
dispell_init(PG_FUNCTION_ARGS)
{
DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictISpell *d;
- bool affloaded = false,
- dictloaded = false,
- stoploaded = false;
- ListCell *l;
+ void *dict_location;
+ char *stopfile;
d = (DictISpell *) palloc0(sizeof(DictISpell));
- NIStartBuild(&(d->obj));
+ parse_dictoptions(init_data->dict_options, NULL, NULL, &stopfile);
+
+ if (stopfile)
+ readstoplist(stopfile, &(d->stoplist), lowerstr);
+
+ dict_location = ts_dict_shmem_location(init_data, dispell_build);
+ Assert(dict_location);
+
+ d->obj.dict = (IspellDictData *) dict_location;
+ d->obj.reg = (AffixReg *) palloc0(d->obj.dict->nAffix *
+ sizeof(AffixReg));
+ /* Current memory context is dictionary's private memory context */
+ d->obj.dictCtx = CurrentMemoryContext;
+
+ PG_RETURN_POINTER(d);
+}
+
+Datum
+dispell_lexize(PG_FUNCTION_ARGS)
+{
+ DictISpell *d = (DictISpell *) PG_GETARG_POINTER(0);
+ char *in = (char *) PG_GETARG_POINTER(1);
+ int32 len = PG_GETARG_INT32(2);
+ char *txt;
+ TSLexeme *res;
+ TSLexeme *ptr,
+ *cptr;
+
+ if (len <= 0)
+ PG_RETURN_POINTER(NULL);
+
+ txt = lowerstr_with_len(in, len);
+ res = NINormalizeWord(&(d->obj), txt);
+
+ if (res == NULL)
+ PG_RETURN_POINTER(NULL);
+
+ cptr = res;
+ for (ptr = cptr; ptr->lexeme; ptr++)
+ {
+ if (searchstoplist(&(d->stoplist), ptr->lexeme))
+ {
+ pfree(ptr->lexeme);
+ ptr->lexeme = NULL;
+ }
+ else
+ {
+ if (cptr != ptr)
+ memcpy(cptr, ptr, sizeof(TSLexeme));
+ cptr++;
+ }
+ }
+ cptr->lexeme = NULL;
+
+ PG_RETURN_POINTER(res);
+}
+
+static void
+parse_dictoptions(List *dictoptions, char **dictfile, char **afffile,
+ char **stopfile)
+{
+ ListCell *l;
- foreach(l, init_data->dict_options)
+ if (dictfile)
+ *dictfile = NULL;
+ if (afffile)
+ *afffile = NULL;
+ if (stopfile)
+ *stopfile = NULL;
+
+ foreach(l, dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
if (strcmp(defel->defname, "dictfile") == 0)
{
- if (dictloaded)
+ if (!dictfile)
+ continue;
+
+ if (*dictfile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple DictFile parameters")));
- NIImportDictionary(&(d->obj),
- get_tsearch_config_filename(defGetString(defel),
- "dict"));
- dictloaded = true;
+ *dictfile = get_tsearch_config_filename(defGetString(defel), "dict");
}
else if (strcmp(defel->defname, "afffile") == 0)
{
- if (affloaded)
+ if (!afffile)
+ continue;
+
+ if (*afffile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple AffFile parameters")));
- NIImportAffixes(&(d->obj),
- get_tsearch_config_filename(defGetString(defel),
- "affix"));
- affloaded = true;
+ *afffile = get_tsearch_config_filename(defGetString(defel), "affix");
}
else if (strcmp(defel->defname, "stopwords") == 0)
{
- if (stoploaded)
+ if (!stopfile)
+ continue;
+
+ if (*stopfile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple StopWords parameters")));
- readstoplist(defGetString(defel), &(d->stoplist), lowerstr);
- stoploaded = true;
+ *stopfile = defGetString(defel);
}
else
{
@@ -83,66 +166,52 @@ dispell_init(PG_FUNCTION_ARGS)
defel->defname)));
}
}
+}
- if (affloaded && dictloaded)
- {
- NISortDictionary(&(d->obj));
- NISortAffixes(&(d->obj));
- }
- else if (!affloaded)
+/*
+ * Build the dictionary.
+ *
+ * Result is palloc'ed.
+ */
+static void *
+dispell_build(List *dictoptions, Size *size)
+{
+ IspellDictBuild build;
+ char *dictfile,
+ *afffile;
+
+ parse_dictoptions(dictoptions, &dictfile, &afffile, NULL);
+
+ if (!afffile)
{
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("missing AffFile parameter")));
}
- else
+ else if (!dictfile)
{
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("missing DictFile parameter")));
}
- NIFinishBuild(&(d->obj));
+ MemSet(&build, 0, sizeof(build));
+ NIStartBuild(&build);
- PG_RETURN_POINTER(d);
-}
+ /* Read files */
+ NIImportDictionary(&build, dictfile);
+ NIImportAffixes(&build, afffile);
-Datum
-dispell_lexize(PG_FUNCTION_ARGS)
-{
- DictISpell *d = (DictISpell *) PG_GETARG_POINTER(0);
- char *in = (char *) PG_GETARG_POINTER(1);
- int32 len = PG_GETARG_INT32(2);
- char *txt;
- TSLexeme *res;
- TSLexeme *ptr,
- *cptr;
+ /* Build persistent data to use by backends */
+ NISortDictionary(&build);
+ NISortAffixes(&build);
- if (len <= 0)
- PG_RETURN_POINTER(NULL);
+ NICopyData(&build);
- txt = lowerstr_with_len(in, len);
- res = NINormalizeWord(&(d->obj), txt);
-
- if (res == NULL)
- PG_RETURN_POINTER(NULL);
+ /* Release temporary data */
+ NIFinishBuild(&build);
- cptr = res;
- for (ptr = cptr; ptr->lexeme; ptr++)
- {
- if (searchstoplist(&(d->stoplist), ptr->lexeme))
- {
- pfree(ptr->lexeme);
- ptr->lexeme = NULL;
- }
- else
- {
- if (cptr != ptr)
- memcpy(cptr, ptr, sizeof(TSLexeme));
- cptr++;
- }
- }
- cptr->lexeme = NULL;
-
- PG_RETURN_POINTER(res);
+ /* Return the buffer and its size */
+ *size = build.dict_size;
+ return build.dict;
}
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index 09297e384c..db4ef90228 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -23,33 +23,35 @@
* Compilation of a dictionary
* ---------------------------
*
- * A compiled dictionary is stored in the IspellDict structure. Compilation of
- * a dictionary is divided into the several steps:
+ * A compiled dictionary is stored in the following structures:
+ * - IspellDictBuild - stores temporary data and IspellDictData
+ * - IspellDictData - stores permanent data used within NINormalizeWord()
+ * Compilation of the dictionary is divided into several steps:
* - NIImportDictionary() - stores each word of a .dict file in the
* temporary Spell field.
- * - NIImportAffixes() - stores affix rules of an .affix file in the
- * Affix field (not temporary) if an .affix file has the Ispell format.
+ * - NIImportAffixes() - stores affix rules of an .affix file in the temporary
+ * Affix field if an .affix file has the Ispell format.
* -> NIImportOOAffixes() - stores affix rules if an .affix file has the
* Hunspell format. The AffixData field is initialized if AF parameter
* is defined.
* - NISortDictionary() - builds a prefix tree (Trie) from the words list
- * and stores it in the Dictionary field. The words list is got from the
+ * and stores it in the DictNodes field. The words list is got from the
* Spell field. The AffixData field is initialized if AF parameter is not
* defined.
* - NISortAffixes():
* - builds a list of compound affixes from the affix list and stores it
* in the CompoundAffix.
* - builds prefix trees (Trie) from the affix list for prefixes and suffixes
- * and stores them in Suffix and Prefix fields.
+ * and stores them in SuffixNodes and PrefixNodes fields.
* The affix list is got from the Affix field.
+ * Persistent data of the dictionary is copied within NICopyData().
*
* Memory management
* -----------------
*
- * The IspellDict structure has the Spell field which is used only in compile
- * time. The Spell field stores a words list. It can take a lot of memory.
- * Therefore when a dictionary is compiled this field is cleared by
- * NIFinishBuild().
+ * The IspellDictBuild structure holds temporary data which is used only at
+ * compile time. It can take a lot of memory. Therefore, once the dictionary
+ * is compiled, this data is released by NIFinishBuild().
*
* All resources which should cleared by NIFinishBuild() is initialized using
* tmpalloc() and tmpalloc0().
@@ -73,112 +75,147 @@
* after the initialization is done. During initialization,
* CurrentMemoryContext is the long-lived memory context associated
* with the dictionary cache entry. We keep the short-lived stuff
- * in the Conf->buildCxt context.
+ * in the ConfBuild->buildCxt context.
*/
-#define tmpalloc(sz) MemoryContextAlloc(Conf->buildCxt, (sz))
-#define tmpalloc0(sz) MemoryContextAllocZero(Conf->buildCxt, (sz))
+#define tmpalloc(sz) MemoryContextAlloc(ConfBuild->buildCxt, (sz))
+#define tmpalloc0(sz) MemoryContextAllocZero(ConfBuild->buildCxt, (sz))
-#define tmpstrdup(str) MemoryContextStrdup(Conf->buildCxt, (str))
+#define tmpstrdup(str) MemoryContextStrdup(ConfBuild->buildCxt, (str))
/*
* Prepare for constructing an ISpell dictionary.
*
- * The IspellDict struct is assumed to be zeroed when allocated.
+ * The IspellDictBuild struct is assumed to be zeroed when allocated.
*/
void
-NIStartBuild(IspellDict *Conf)
+NIStartBuild(IspellDictBuild *ConfBuild)
{
+ uint32 dict_size;
+
/*
* The temp context is a child of CurTransactionContext, so that it will
* go away automatically on error.
*/
- Conf->buildCxt = AllocSetContextCreate(CurTransactionContext,
- "Ispell dictionary init context",
- ALLOCSET_DEFAULT_SIZES);
+ ConfBuild->buildCxt = AllocSetContextCreate(CurTransactionContext,
+ "Ispell dictionary init context",
+ ALLOCSET_DEFAULT_SIZES);
+
+ /*
+ * Allocate the dictionary buffer in the current context, not in buildCxt.
+ */
+ dict_size = MAXALIGN(IspellDictDataHdrSize);
+ ConfBuild->dict = palloc0(dict_size);
+ ConfBuild->dict_size = dict_size;
}
/*
- * Clean up when dictionary construction is complete.
+ * Copy compiled and persistent data into IspellDictData.
*/
void
-NIFinishBuild(IspellDict *Conf)
+NICopyData(IspellDictBuild *ConfBuild)
{
- /* Release no-longer-needed temp memory */
- MemoryContextDelete(Conf->buildCxt);
- /* Just for cleanliness, zero the now-dangling pointers */
- Conf->buildCxt = NULL;
- Conf->Spell = NULL;
- Conf->firstfree = NULL;
- Conf->CompoundAffixFlags = NULL;
-}
+ IspellDictData *dict;
+ uint32 size;
+ int i;
+ uint32 *offsets,
+ offset;
+ /*
+ * Calculate necessary space
+ */
+ size = ConfBuild->nAffixData * sizeof(uint32);
+ size += ConfBuild->AffixDataEnd;
-/*
- * "Compact" palloc: allocate without extra palloc overhead.
- *
- * Since we have no need to free the ispell data items individually, there's
- * not much value in the per-chunk overhead normally consumed by palloc.
- * Getting rid of it is helpful since ispell can allocate a lot of small nodes.
- *
- * We currently pre-zero all data allocated this way, even though some of it
- * doesn't need that. The cpalloc and cpalloc0 macros are just documentation
- * to indicate which allocations actually require zeroing.
- */
-#define COMPACT_ALLOC_CHUNK 8192 /* amount to get from palloc at once */
-#define COMPACT_MAX_REQ 1024 /* must be < COMPACT_ALLOC_CHUNK */
+ size += ConfBuild->nAffix * sizeof(uint32);
+ size += ConfBuild->AffixSize;
-static void *
-compact_palloc0(IspellDict *Conf, size_t size)
-{
- void *result;
+ size += ConfBuild->DictNodes.NodesEnd;
+ size += ConfBuild->PrefixNodes.NodesEnd;
+ size += ConfBuild->SuffixNodes.NodesEnd;
- /* Should only be called during init */
- Assert(Conf->buildCxt != NULL);
+ size += sizeof(CMPDAffix) * ConfBuild->nCompoundAffix;
- /* No point in this for large chunks */
- if (size > COMPACT_MAX_REQ)
- return palloc0(size);
+ /*
+ * Copy data itself
+ */
+ ConfBuild->dict_size = IspellDictDataHdrSize + size;
+ ConfBuild->dict = repalloc(ConfBuild->dict, ConfBuild->dict_size);
+
+ dict = ConfBuild->dict;
+
+ /* AffixData */
+ dict->nAffixData = ConfBuild->nAffixData;
+ dict->AffixDataStart = sizeof(uint32) * ConfBuild->nAffixData;
+ memcpy(DictAffixDataOffset(dict), ConfBuild->AffixDataOffset,
+ sizeof(uint32) * ConfBuild->nAffixData);
+ memcpy(DictAffixData(dict), ConfBuild->AffixData, ConfBuild->AffixDataEnd);
+
+ /* Affix array */
+ dict->nAffix = ConfBuild->nAffix;
+ dict->AffixOffsetStart = dict->AffixDataStart + ConfBuild->AffixDataEnd;
+ dict->AffixStart = dict->AffixOffsetStart + sizeof(uint32) * ConfBuild->nAffix;
+ offsets = (uint32 *) DictAffixOffset(dict);
+ offset = 0;
+ for (i = 0; i < ConfBuild->nAffix; i++)
+ {
+ AFFIX *affix;
+ uint32 size = AffixGetSize(ConfBuild->Affix[i]);
- /* Keep everything maxaligned */
- size = MAXALIGN(size);
+ offsets[i] = offset;
+ affix = (AFFIX *) DictAffixGet(dict, i);
+ Assert(affix);
- /* Need more space? */
- if (size > Conf->avail)
- {
- Conf->firstfree = palloc0(COMPACT_ALLOC_CHUNK);
- Conf->avail = COMPACT_ALLOC_CHUNK;
- }
+ memcpy(affix, ConfBuild->Affix[i], size);
- result = (void *) Conf->firstfree;
- Conf->firstfree += size;
- Conf->avail -= size;
+ offset += size;
+ }
- return result;
+ /* DictNodes prefix tree */
+ dict->DictNodesStart = dict->AffixStart + offset;
+ memcpy(DictDictNodes(dict), ConfBuild->DictNodes.Nodes,
+ ConfBuild->DictNodes.NodesEnd);
+
+ /* PrefixNodes prefix tree */
+ dict->PrefixNodesStart = dict->DictNodesStart + ConfBuild->DictNodes.NodesEnd;
+ memcpy(DictPrefixNodes(dict), ConfBuild->PrefixNodes.Nodes,
+ ConfBuild->PrefixNodes.NodesEnd);
+
+ /* SuffixNodes prefix tree */
+ dict->SuffixNodesStart = dict->PrefixNodesStart + ConfBuild->PrefixNodes.NodesEnd;
+ memcpy(DictSuffixNodes(dict), ConfBuild->SuffixNodes.Nodes,
+ ConfBuild->SuffixNodes.NodesEnd);
+
+ /* CompoundAffix array */
+ dict->CompoundAffixStart = dict->SuffixNodesStart +
+ ConfBuild->SuffixNodes.NodesEnd;
+ memcpy(DictCompoundAffix(dict), ConfBuild->CompoundAffix,
+ sizeof(CMPDAffix) * ConfBuild->nCompoundAffix);
}
-#define cpalloc(size) compact_palloc0(Conf, size)
-#define cpalloc0(size) compact_palloc0(Conf, size)
-
-static char *
-cpstrdup(IspellDict *Conf, const char *str)
+/*
+ * Clean up when dictionary construction is complete.
+ */
+void
+NIFinishBuild(IspellDictBuild *ConfBuild)
{
- char *res = cpalloc(strlen(str) + 1);
-
- strcpy(res, str);
- return res;
+ /* Release no-longer-needed temp memory */
+ MemoryContextDelete(ConfBuild->buildCxt);
+ /* Just for cleanliness, zero the now-dangling pointers */
+ ConfBuild->buildCxt = NULL;
+ ConfBuild->Spell = NULL;
+ ConfBuild->CompoundAffixFlags = NULL;
}
-
/*
* Apply lowerstr(), producing a temporary result (in the buildCxt).
*/
static char *
-lowerstr_ctx(IspellDict *Conf, const char *src)
+lowerstr_ctx(IspellDictBuild *ConfBuild, const char *src)
{
MemoryContext saveCtx;
char *dst;
- saveCtx = MemoryContextSwitchTo(Conf->buildCxt);
+ saveCtx = MemoryContextSwitchTo(ConfBuild->buildCxt);
dst = lowerstr(src);
MemoryContextSwitchTo(saveCtx);
@@ -190,7 +227,7 @@ lowerstr_ctx(IspellDict *Conf, const char *src)
#define STRNCMP(s,p) strncmp( (s), (p), strlen(p) )
#define GETWCHAR(W,L,N,T) ( ((const uint8*)(W))[ ((T)==FF_PREFIX) ? (N) : ( (L) - 1 - (N) ) ] )
-#define GETCHAR(A,N,T) GETWCHAR( (A)->repl, (A)->replen, N, T )
+#define GETCHAR(A,N,T) GETWCHAR( AffixFieldRepl(A), (A)->replen, N, T )
static char *VoidString = "";
@@ -311,18 +348,189 @@ strbncmp(const unsigned char *s1, const unsigned char *s2, size_t count)
static int
cmpaffix(const void *s1, const void *s2)
{
- const AFFIX *a1 = (const AFFIX *) s1;
- const AFFIX *a2 = (const AFFIX *) s2;
+ const AFFIX *a1 = *((AFFIX *const *) s1);
+ const AFFIX *a2 = *((AFFIX *const *) s2);
if (a1->type < a2->type)
return -1;
if (a1->type > a2->type)
return 1;
if (a1->type == FF_PREFIX)
- return strcmp(a1->repl, a2->repl);
+ return strcmp(AffixFieldRepl(a1), AffixFieldRepl(a2));
else
- return strbcmp((const unsigned char *) a1->repl,
- (const unsigned char *) a2->repl);
+ return strbcmp((const unsigned char *) AffixFieldRepl(a1),
+ (const unsigned char *) AffixFieldRepl(a2));
+}
+
+/*
+ * Allocate space for AffixData.
+ */
+static void
+InitAffixData(IspellDictBuild *ConfBuild, int numAffixData)
+{
+ uint32 size;
+
+ size = 8 * 1024; /* Reserve 8KB for data */
+
+ ConfBuild->AffixData = (char *) tmpalloc(size);
+ ConfBuild->AffixDataSize = size;
+ ConfBuild->AffixDataOffset = (uint32 *) tmpalloc(numAffixData * sizeof(uint32));
+ ConfBuild->nAffixData = 0;
+ ConfBuild->mAffixData = numAffixData;
+
+ /* Save offset of the end of data */
+ ConfBuild->AffixDataEnd = 0;
+}
+
+/*
+ * Add a set of affix flags to the AffixData buffer of IspellDictBuild. If the
+ * buffer cannot fit the new affix set, enlarge it.
+ *
+ * ConfBuild: building structure for the current dictionary.
+ * AffixSet: set of affix flags.
+ */
+static void
+AddAffixSet(IspellDictBuild *ConfBuild, const char *AffixSet,
+ uint32 AffixSetLen)
+{
+ /*
+ * Check available space for AffixSet.
+ */
+ if (ConfBuild->AffixDataEnd + AffixSetLen + 1 /* \0 */ >=
+ ConfBuild->AffixDataSize)
+ {
+ uint32 newsize = Max(ConfBuild->AffixDataSize + 8 * 1024 /* 8KB */,
+ ConfBuild->AffixDataSize + AffixSetLen + 1);
+
+ ConfBuild->AffixData = (char *) repalloc(ConfBuild->AffixData, newsize);
+ ConfBuild->AffixDataSize = newsize;
+ }
+
+ /* Check available number of offsets */
+ if (ConfBuild->nAffixData >= ConfBuild->mAffixData)
+ {
+ ConfBuild->mAffixData *= 2;
+ ConfBuild->AffixDataOffset = (uint32 *) repalloc(ConfBuild->AffixDataOffset,
+ sizeof(uint32) * ConfBuild->mAffixData);
+ }
+
+ ConfBuild->AffixDataOffset[ConfBuild->nAffixData] = ConfBuild->AffixDataEnd;
+ StrNCpy(AffixDataGet(ConfBuild, ConfBuild->nAffixData),
+ AffixSet, AffixSetLen + 1);
+
+ /* Save offset of the end of data */
+ ConfBuild->AffixDataEnd += AffixSetLen + 1;
+ ConfBuild->nAffixData++;
+}
+
+/*
+ * Allocate space for prefix tree node.
+ *
+ * ConfBuild: building structure for the current dictionary.
+ * array: NodeArray where to allocate new node.
+ * length: number of allocated NodeData.
+ * sizeNodeData: minimum size of each NodeData.
+ * sizeNodeHeader: size of header of new node.
+ *
+ * Returns an offset of new node in NodeArray->Nodes.
+ */
+static uint32
+AllocateNode(IspellDictBuild *ConfBuild, NodeArray *array, uint32 length,
+ uint32 sizeNodeData, uint32 sizeNodeHeader)
+{
+ uint32 node_offset;
+ uint32 size;
+
+ size = sizeNodeHeader + length * sizeNodeData;
+ size = MAXALIGN(size);
+
+ if (array->NodesSize == 0)
+ {
+ array->NodesSize = size * 32; /* Reserve space for next levels of the
+ * prefix tree */
+ array->Nodes = (char *) tmpalloc(array->NodesSize);
+ array->NodesEnd = 0;
+ }
+ else if (array->NodesEnd + size >= array->NodesSize)
+ {
+ array->NodesSize = Max(array->NodesSize * 2, array->NodesSize + size);
+ array->Nodes = (char *) repalloc(array->Nodes, array->NodesSize);
+ }
+
+ node_offset = array->NodesEnd;
+ array->NodesEnd += size;
+
+ return node_offset;
+}
+
+/*
+ * Allocate space for SPNode.
+ *
+ * Returns an offset of new node in ConfBuild->DictNodes.Nodes.
+ */
+static uint32
+AllocateSPNode(IspellDictBuild *ConfBuild, uint32 length)
+{
+ uint32 offset;
+ SPNode *node;
+ SPNodeData *data;
+ uint32 i;
+
+ offset = AllocateNode(ConfBuild, &ConfBuild->DictNodes, length,
+ sizeof(SPNodeData), SPNHDRSZ);
+ node = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, offset);
+ node->length = length;
+
+ /*
+ * Initialize all SPNodeData with default values. We cannot use memset()
+ * here because not all fields have 0 as default value.
+ */
+ for (i = 0; i < length; i++)
+ {
+ data = &(node->data[i]);
+ data->val = 0;
+ data->affix = ISPELL_INVALID_INDEX;
+ data->compoundflag = 0;
+ data->isword = 0;
+ data->node_offset = ISPELL_INVALID_OFFSET;
+ }
+
+ return offset;
+}
+
+/*
+ * Allocate space for AffixNode.
+ *
+ * Returns an offset of new node in NodeArray->Nodes.
+ */
+static uint32
+AllocateAffixNode(IspellDictBuild *ConfBuild, NodeArray *array, uint32 length)
+{
+ uint32 offset;
+ AffixNode *node;
+ AffixNodeData *data;
+ uint32 i;
+
+ offset = AllocateNode(ConfBuild, array, length, sizeof(AffixNodeData),
+ ANHRDSZ);
+ node = (AffixNode *) NodeArrayGet(array, offset);
+ node->length = length;
+ node->isvoid = 0;
+
+ /*
+ * Initialize all AffixNodeData with default values. We cannot use memset()
+ * here because not all fields have 0 as default value.
+ */
+ for (i = 0; i < length; i++)
+ {
+ data = &(node->data[i]);
+ data->val = 0;
+ data->affstart = ISPELL_INVALID_INDEX;
+ data->affend = ISPELL_INVALID_INDEX;
+ data->node_offset = ISPELL_INVALID_OFFSET;
+ }
+
+ return offset;
}
/*
@@ -333,7 +541,7 @@ cmpaffix(const void *s1, const void *s2)
* - 2 characters (FM_LONG). A character may be Unicode.
* - numbers from 1 to 65000 (FM_NUM).
*
- * Depending on the flagMode an affix string can have the following format:
+ * Depending on the flagmode an affix string can have the following format:
* - FM_CHAR: ABCD
* Here we have 4 flags: A, B, C and D
* - FM_LONG: ABCDE*
@@ -341,13 +549,13 @@ cmpaffix(const void *s1, const void *s2)
* - FM_NUM: 200,205,50
* Here we have 3 flags: 200, 205 and 50
*
- * Conf: current dictionary.
+ * flagmode: flag mode of the dictionary
* sflagset: the set of affix flags. Returns a reference to the start of a next
* affix flag.
* sflag: returns an affix flag from sflagset.
*/
static void
-getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
+getNextFlagFromString(FlagMode flagmode, char **sflagset, char *sflag)
{
int32 s;
char *next,
@@ -356,11 +564,11 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
bool stop = false;
bool met_comma = false;
- maxstep = (Conf->flagMode == FM_LONG) ? 2 : 1;
+ maxstep = (flagmode == FM_LONG) ? 2 : 1;
while (**sflagset)
{
- switch (Conf->flagMode)
+ switch (flagmode)
{
case FM_LONG:
case FM_CHAR:
@@ -422,15 +630,15 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
stop = true;
break;
default:
- elog(ERROR, "unrecognized type of Conf->flagMode: %d",
- Conf->flagMode);
+ elog(ERROR, "unrecognized type of flagmode: %d",
+ flagmode);
}
if (stop)
break;
}
- if (Conf->flagMode == FM_LONG && maxstep > 0)
+ if (flagmode == FM_LONG && maxstep > 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix flag \"%s\" with \"long\" flag value",
@@ -440,31 +648,28 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
}
/*
- * Checks if the affix set Conf->AffixData[affix] contains affixflag.
- * Conf->AffixData[affix] does not contain affixflag if this flag is not used
- * actually by the .dict file.
+ * Checks if the affix set from AffixData contains affixflag. Affix set does
+ * not contain affixflag if this flag is not used actually by the .dict file.
*
- * Conf: current dictionary.
- * affix: index of the Conf->AffixData array.
+ * flagmode: flag mode of the dictionary.
+ * sflagset: the set of affix flags.
* affixflag: the affix flag.
*
- * Returns true if the string Conf->AffixData[affix] contains affixflag,
- * otherwise returns false.
+ * Returns true if the affix set string contains affixflag, otherwise returns
+ * false.
*/
static bool
-IsAffixFlagInUse(IspellDict *Conf, int affix, const char *affixflag)
+IsAffixFlagInUse(FlagMode flagmode, char *sflagset, const char *affixflag)
{
- char *flagcur;
+ char *flagcur = sflagset;
char flag[BUFSIZ];
if (*affixflag == 0)
return true;
- flagcur = Conf->AffixData[affix];
-
while (*flagcur)
{
- getNextFlagFromString(Conf, &flagcur, flag);
+ getNextFlagFromString(flagmode, &flagcur, flag);
/* Compare first affix flag in flagcur with affixflag */
if (strcmp(flag, affixflag) == 0)
return true;
@@ -477,31 +682,33 @@ IsAffixFlagInUse(IspellDict *Conf, int affix, const char *affixflag)
/*
* Adds the new word into the temporary array Spell.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* word: new word.
* flag: set of affix flags. Single flag can be get by getNextFlagFromString().
*/
static void
-NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
+NIAddSpell(IspellDictBuild *ConfBuild, const char *word, const char *flag)
{
- if (Conf->nspell >= Conf->mspell)
+ if (ConfBuild->nSpell >= ConfBuild->mSpell)
{
- if (Conf->mspell)
+ if (ConfBuild->mSpell)
{
- Conf->mspell *= 2;
- Conf->Spell = (SPELL **) repalloc(Conf->Spell, Conf->mspell * sizeof(SPELL *));
+ ConfBuild->mSpell *= 2;
+ ConfBuild->Spell = (SPELL **) repalloc(ConfBuild->Spell,
+ ConfBuild->mSpell * sizeof(SPELL *));
}
else
{
- Conf->mspell = 1024 * 20;
- Conf->Spell = (SPELL **) tmpalloc(Conf->mspell * sizeof(SPELL *));
+ ConfBuild->mSpell = 1024 * 20;
+ ConfBuild->Spell = (SPELL **) tmpalloc(ConfBuild->mSpell * sizeof(SPELL *));
}
}
- Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
- strcpy(Conf->Spell[Conf->nspell]->word, word);
- Conf->Spell[Conf->nspell]->p.flag = (*flag != '\0')
+ ConfBuild->Spell[ConfBuild->nSpell] =
+ (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
+ strcpy(ConfBuild->Spell[ConfBuild->nSpell]->word, word);
+ ConfBuild->Spell[ConfBuild->nSpell]->p.flag = (*flag != '\0')
? tmpstrdup(flag) : VoidString;
- Conf->nspell++;
+ ConfBuild->nSpell++;
}
/*
@@ -509,11 +716,11 @@ NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
*
* Note caller must already have applied get_tsearch_config_filename.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* filename: path to the .dict file.
*/
void
-NIImportDictionary(IspellDict *Conf, const char *filename)
+NIImportDictionary(IspellDictBuild *ConfBuild, const char *filename)
{
tsearch_readline_state trst;
char *line;
@@ -564,9 +771,9 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
}
s += pg_mblen(s);
}
- pstr = lowerstr_ctx(Conf, line);
+ pstr = lowerstr_ctx(ConfBuild, line);
- NIAddSpell(Conf, pstr, flag);
+ NIAddSpell(ConfBuild, pstr, flag);
pfree(pstr);
pfree(line);
@@ -590,7 +797,7 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
* SFX M 0 's .
* is presented here.
*
- * Conf: current dictionary.
+ * dict: current dictionary.
* word: basic form of word.
* affixflag: affix flag, by which a basic form of word was generated.
* flag: compound flag used to compare with StopMiddle->compoundflag.
@@ -598,9 +805,9 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
* Returns 1 if the word was found in the prefix tree, else returns 0.
*/
static int
-FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
+FindWord(IspellDictData *dict, const char *word, const char *affixflag, int flag)
{
- SPNode *node = Conf->Dictionary;
+ SPNode *node = (SPNode *) DictDictNodes(dict);
SPNodeData *StopLow,
*StopHigh,
*StopMiddle;
@@ -636,10 +843,14 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
* Check if this affix rule is presented in the affix set
* with index StopMiddle->affix.
*/
- if (IsAffixFlagInUse(Conf, StopMiddle->affix, affixflag))
+ if (IsAffixFlagInUse(dict->flagMode,
+ DictAffixDataGet(dict, StopMiddle->affix),
+ affixflag))
return 1;
}
- node = StopMiddle->node;
+ /* Retrieve SPNode by the offset */
+ node = (SPNode *) DictNodeGet(DictDictNodes(dict),
+ StopMiddle->node_offset);
ptr++;
break;
}
@@ -657,7 +868,8 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
/*
* Adds a new affix rule to the Affix field.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary; it is used to
+ * allocate temporary data.
* flag: affix flag ('\' in the below example).
* flagflags: set of flags from the flagval field for this affix rule. This set
* is listed after '/' character in the added string (repl).
@@ -673,26 +885,54 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
* type: FF_SUFFIX or FF_PREFIX.
*/
static void
-NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
- const char *find, const char *repl, int type)
+NIAddAffix(IspellDictBuild *ConfBuild, const char *flag, char flagflags,
+ const char *mask, const char *find, const char *repl, int type)
{
AFFIX *Affix;
+ uint32 size;
+ uint32 flaglen = strlen(flag),
+ findlen = strlen(find),
+ repllen = strlen(repl),
+ masklen = strlen(mask);
+
+ /* Sanity checks */
+ if (flaglen > AF_FLAG_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix flag \"%s\" too long", flag)));
+ if (findlen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix find field \"%s\" too long", find)));
+ if (repllen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix repl field \"%s\" too long", repl)));
+ if (masklen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix mask field \"%s\" too long", mask)));
- if (Conf->naffixes >= Conf->maffixes)
+ if (ConfBuild->nAffix >= ConfBuild->mAffix)
{
- if (Conf->maffixes)
+ if (ConfBuild->mAffix)
{
- Conf->maffixes *= 2;
- Conf->Affix = (AFFIX *) repalloc((void *) Conf->Affix, Conf->maffixes * sizeof(AFFIX));
+ ConfBuild->mAffix *= 2;
+ ConfBuild->Affix = (AFFIX **) repalloc(ConfBuild->Affix,
+ ConfBuild->mAffix * sizeof(AFFIX *));
}
else
{
- Conf->maffixes = 16;
- Conf->Affix = (AFFIX *) palloc(Conf->maffixes * sizeof(AFFIX));
+ ConfBuild->mAffix = 255;
+ ConfBuild->Affix = (AFFIX **) tmpalloc(ConfBuild->mAffix * sizeof(AFFIX *));
}
}
- Affix = Conf->Affix + Conf->naffixes;
+ size = AFFIXHDRSZ + flaglen + 1 /* \0 */ + findlen + 1 /* \0 */ +
+ repllen + 1 /* \0 */ + masklen + 1 /* \0 */;
+
+ Affix = (AFFIX *) tmpalloc(size);
+ ConfBuild->Affix[ConfBuild->nAffix] = Affix;
/* This affix rule can be applied for words with any ending */
if (strcmp(mask, ".") == 0 || *mask == '\0')
@@ -705,42 +945,12 @@ NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
{
Affix->issimple = 0;
Affix->isregis = 1;
- RS_compile(&(Affix->reg.regis), (type == FF_SUFFIX),
- *mask ? mask : VoidString);
}
/* This affix rule will use regex_t to search word ending */
else
{
- int masklen;
- int wmasklen;
- int err;
- pg_wchar *wmask;
- char *tmask;
-
Affix->issimple = 0;
Affix->isregis = 0;
- tmask = (char *) tmpalloc(strlen(mask) + 3);
- if (type == FF_SUFFIX)
- sprintf(tmask, "%s$", mask);
- else
- sprintf(tmask, "^%s", mask);
-
- masklen = strlen(tmask);
- wmask = (pg_wchar *) tmpalloc((masklen + 1) * sizeof(pg_wchar));
- wmasklen = pg_mb2wchar_with_len(tmask, wmask, masklen);
-
- err = pg_regcomp(&(Affix->reg.regex), wmask, wmasklen,
- REG_ADVANCED | REG_NOSUB,
- DEFAULT_COLLATION_OID);
- if (err)
- {
- char errstr[100];
-
- pg_regerror(err, &(Affix->reg.regex), errstr, sizeof(errstr));
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
- errmsg("invalid regular expression: %s", errstr)));
- }
}
Affix->flagflags = flagflags;
@@ -749,15 +959,22 @@ NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
if ((Affix->flagflags & FF_COMPOUNDFLAG) == 0)
Affix->flagflags |= FF_COMPOUNDFLAG;
}
- Affix->flag = cpstrdup(Conf, flag);
+
Affix->type = type;
- Affix->find = (find && *find) ? cpstrdup(Conf, find) : VoidString;
- if ((Affix->replen = strlen(repl)) > 0)
- Affix->repl = cpstrdup(Conf, repl);
- else
- Affix->repl = VoidString;
- Conf->naffixes++;
+ Affix->replen = repllen;
+ StrNCpy(AffixFieldRepl(Affix), repl, repllen + 1);
+
+ Affix->findlen = findlen;
+ StrNCpy(AffixFieldFind(Affix), find, findlen + 1);
+
+ Affix->masklen = masklen;
+ StrNCpy(AffixFieldMask(Affix), mask, masklen + 1);
+
+ StrNCpy(AffixFieldFlag(Affix), flag, flaglen + 1);
+
+ ConfBuild->nAffix++;
+ ConfBuild->AffixSize += size;
}
/* Parsing states for parse_affentry() and friends */
@@ -1021,10 +1238,10 @@ parse_affentry(char *str, char *mask, char *find, char *repl)
* Sets a Hunspell options depending on flag type.
*/
static void
-setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
+setCompoundAffixFlagValue(IspellDictBuild *ConfBuild, CompoundAffixFlag *entry,
char *s, uint32 val)
{
- if (Conf->flagMode == FM_NUM)
+ if (ConfBuild->dict->flagMode == FM_NUM)
{
char *next;
int i;
@@ -1044,19 +1261,19 @@ setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
else
entry->flag.s = tmpstrdup(s);
- entry->flagMode = Conf->flagMode;
+ entry->flagMode = ConfBuild->dict->flagMode;
entry->value = val;
}
/*
* Sets up a correspondence for the affix parameter with the affix flag.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* s: affix flag in string.
* val: affix parameter.
*/
static void
-addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
+addCompoundAffixFlagValue(IspellDictBuild *ConfBuild, char *s, uint32 val)
{
CompoundAffixFlag *newValue;
char sbuf[BUFSIZ];
@@ -1083,29 +1300,29 @@ addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
*sflag = '\0';
/* Resize array or allocate memory for array CompoundAffixFlag */
- if (Conf->nCompoundAffixFlag >= Conf->mCompoundAffixFlag)
+ if (ConfBuild->nCompoundAffixFlag >= ConfBuild->mCompoundAffixFlag)
{
- if (Conf->mCompoundAffixFlag)
+ if (ConfBuild->mCompoundAffixFlag)
{
- Conf->mCompoundAffixFlag *= 2;
- Conf->CompoundAffixFlags = (CompoundAffixFlag *)
- repalloc((void *) Conf->CompoundAffixFlags,
- Conf->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
+ ConfBuild->mCompoundAffixFlag *= 2;
+ ConfBuild->CompoundAffixFlags = (CompoundAffixFlag *)
+ repalloc((void *) ConfBuild->CompoundAffixFlags,
+ ConfBuild->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
}
else
{
- Conf->mCompoundAffixFlag = 10;
- Conf->CompoundAffixFlags = (CompoundAffixFlag *)
- tmpalloc(Conf->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
+ ConfBuild->mCompoundAffixFlag = 10;
+ ConfBuild->CompoundAffixFlags = (CompoundAffixFlag *)
+ tmpalloc(ConfBuild->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
}
}
- newValue = Conf->CompoundAffixFlags + Conf->nCompoundAffixFlag;
+ newValue = ConfBuild->CompoundAffixFlags + ConfBuild->nCompoundAffixFlag;
- setCompoundAffixFlagValue(Conf, newValue, sbuf, val);
+ setCompoundAffixFlagValue(ConfBuild, newValue, sbuf, val);
- Conf->usecompound = true;
- Conf->nCompoundAffixFlag++;
+ ConfBuild->dict->usecompound = true;
+ ConfBuild->nCompoundAffixFlag++;
}
/*
@@ -1113,7 +1330,7 @@ addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
* flags s.
*/
static int
-getCompoundAffixFlagValue(IspellDict *Conf, char *s)
+getCompoundAffixFlagValue(IspellDictBuild *ConfBuild, char *s)
{
uint32 flag = 0;
CompoundAffixFlag *found,
@@ -1121,18 +1338,18 @@ getCompoundAffixFlagValue(IspellDict *Conf, char *s)
char sflag[BUFSIZ];
char *flagcur;
- if (Conf->nCompoundAffixFlag == 0)
+ if (ConfBuild->nCompoundAffixFlag == 0)
return 0;
flagcur = s;
while (*flagcur)
{
- getNextFlagFromString(Conf, &flagcur, sflag);
- setCompoundAffixFlagValue(Conf, &key, sflag, 0);
+ getNextFlagFromString(ConfBuild->dict->flagMode, &flagcur, sflag);
+ setCompoundAffixFlagValue(ConfBuild, &key, sflag, 0);
found = (CompoundAffixFlag *)
- bsearch(&key, (void *) Conf->CompoundAffixFlags,
- Conf->nCompoundAffixFlag, sizeof(CompoundAffixFlag),
+ bsearch(&key, (void *) ConfBuild->CompoundAffixFlags,
+ ConfBuild->nCompoundAffixFlag, sizeof(CompoundAffixFlag),
cmpcmdflag);
if (found != NULL)
flag |= found->value;
@@ -1144,14 +1361,13 @@ getCompoundAffixFlagValue(IspellDict *Conf, char *s)
/*
* Returns a flag set using the s parameter.
*
- * If Conf->useFlagAliases is true then the s parameter is index of the
- * Conf->AffixData array and function returns its entry.
- * Else function returns the s parameter.
+ * If useFlagAliases is true then the s parameter is an index into the AffixData
+ * array and the function returns that entry. Otherwise it returns s itself.
*/
static char *
-getAffixFlagSet(IspellDict *Conf, char *s)
+getAffixFlagSet(IspellDictBuild *ConfBuild, char *s)
{
- if (Conf->useFlagAliases && *s != '\0')
+ if (ConfBuild->dict->useFlagAliases && *s != '\0')
{
int curaffix;
char *end;
@@ -1162,13 +1378,13 @@ getAffixFlagSet(IspellDict *Conf, char *s)
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix alias \"%s\"", s)));
- if (curaffix > 0 && curaffix <= Conf->nAffixData)
+ if (curaffix > 0 && curaffix <= ConfBuild->nAffixData)
/*
* Do not subtract 1 from curaffix because empty string was added
* in NIImportOOAffixes
*/
- return Conf->AffixData[curaffix];
+ return AffixDataGet(ConfBuild, curaffix);
else
return VoidString;
}
@@ -1179,11 +1395,11 @@ getAffixFlagSet(IspellDict *Conf, char *s)
/*
* Import an affix file that follows MySpell or Hunspell format.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* filename: path to the .affix file.
*/
static void
-NIImportOOAffixes(IspellDict *Conf, const char *filename)
+NIImportOOAffixes(IspellDictBuild *ConfBuild, const char *filename)
{
char type[BUFSIZ],
*ptype = NULL;
@@ -1203,9 +1419,9 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
char *recoded;
/* read file to find any flag */
- Conf->usecompound = false;
- Conf->useFlagAliases = false;
- Conf->flagMode = FM_CHAR;
+ ConfBuild->dict->usecompound = false;
+ ConfBuild->dict->useFlagAliases = false;
+ ConfBuild->dict->flagMode = FM_CHAR;
if (!tsearch_readline_begin(&trst, filename))
ereport(ERROR,
@@ -1222,30 +1438,36 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
}
if (STRNCMP(recoded, "COMPOUNDFLAG") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDFLAG"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDFLAG"),
FF_COMPOUNDFLAG);
else if (STRNCMP(recoded, "COMPOUNDBEGIN") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDBEGIN"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDBEGIN"),
FF_COMPOUNDBEGIN);
else if (STRNCMP(recoded, "COMPOUNDLAST") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDLAST"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDLAST"),
FF_COMPOUNDLAST);
/* COMPOUNDLAST and COMPOUNDEND are synonyms */
else if (STRNCMP(recoded, "COMPOUNDEND") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDEND"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDEND"),
FF_COMPOUNDLAST);
else if (STRNCMP(recoded, "COMPOUNDMIDDLE") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDMIDDLE"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDMIDDLE"),
FF_COMPOUNDMIDDLE);
else if (STRNCMP(recoded, "ONLYINCOMPOUND") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("ONLYINCOMPOUND"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("ONLYINCOMPOUND"),
FF_COMPOUNDONLY);
else if (STRNCMP(recoded, "COMPOUNDPERMITFLAG") == 0)
- addCompoundAffixFlagValue(Conf,
+ addCompoundAffixFlagValue(ConfBuild,
recoded + strlen("COMPOUNDPERMITFLAG"),
FF_COMPOUNDPERMITFLAG);
else if (STRNCMP(recoded, "COMPOUNDFORBIDFLAG") == 0)
- addCompoundAffixFlagValue(Conf,
+ addCompoundAffixFlagValue(ConfBuild,
recoded + strlen("COMPOUNDFORBIDFLAG"),
FF_COMPOUNDFORBIDFLAG);
else if (STRNCMP(recoded, "FLAG") == 0)
@@ -1258,9 +1480,9 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
if (*s)
{
if (STRNCMP(s, "long") == 0)
- Conf->flagMode = FM_LONG;
+ ConfBuild->dict->flagMode = FM_LONG;
else if (STRNCMP(s, "num") == 0)
- Conf->flagMode = FM_NUM;
+ ConfBuild->dict->flagMode = FM_NUM;
else if (STRNCMP(s, "default") != 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
@@ -1274,8 +1496,8 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
}
tsearch_readline_end(&trst);
- if (Conf->nCompoundAffixFlag > 1)
- qsort((void *) Conf->CompoundAffixFlags, Conf->nCompoundAffixFlag,
+ if (ConfBuild->nCompoundAffixFlag > 1)
+ qsort((void *) ConfBuild->CompoundAffixFlags, ConfBuild->nCompoundAffixFlag,
sizeof(CompoundAffixFlag), cmpcmdflag);
if (!tsearch_readline_begin(&trst, filename))
@@ -1295,15 +1517,15 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
if (ptype)
pfree(ptype);
- ptype = lowerstr_ctx(Conf, type);
+ ptype = lowerstr_ctx(ConfBuild, type);
/* First try to parse AF parameter (alias compression) */
if (STRNCMP(ptype, "af") == 0)
{
/* First line is the number of aliases */
- if (!Conf->useFlagAliases)
+ if (!ConfBuild->dict->useFlagAliases)
{
- Conf->useFlagAliases = true;
+ ConfBuild->dict->useFlagAliases = true;
naffix = atoi(sflag);
if (naffix <= 0)
ereport(ERROR,
@@ -1313,11 +1535,10 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
/* Also reserve place for empty flag set */
naffix++;
- Conf->AffixData = (char **) palloc0(naffix * sizeof(char *));
- Conf->lenAffixData = Conf->nAffixData = naffix;
+ InitAffixData(ConfBuild, naffix);
/* Add empty flag set into AffixData */
- Conf->AffixData[curaffix] = VoidString;
+ AddAffixSet(ConfBuild, VoidString, 0);
curaffix++;
}
/* Other lines are aliases */
@@ -1325,7 +1546,7 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
{
if (curaffix < naffix)
{
- Conf->AffixData[curaffix] = cpstrdup(Conf, sflag);
+ AddAffixSet(ConfBuild, sflag, strlen(sflag));
curaffix++;
}
else
@@ -1343,8 +1564,8 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
sflaglen = strlen(sflag);
if (sflaglen == 0
- || (sflaglen > 1 && Conf->flagMode == FM_CHAR)
- || (sflaglen > 2 && Conf->flagMode == FM_LONG))
+ || (sflaglen > 1 && ConfBuild->dict->flagMode == FM_CHAR)
+ || (sflaglen > 2 && ConfBuild->dict->flagMode == FM_LONG))
goto nextline;
/*--------
@@ -1372,21 +1593,21 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
/* Get flags after '/' (flags are case sensitive) */
if ((ptr = strchr(repl, '/')) != NULL)
- aflg |= getCompoundAffixFlagValue(Conf,
- getAffixFlagSet(Conf,
+ aflg |= getCompoundAffixFlagValue(ConfBuild,
+ getAffixFlagSet(ConfBuild,
ptr + 1));
/* Get lowercased version of string before '/' */
- prepl = lowerstr_ctx(Conf, repl);
+ prepl = lowerstr_ctx(ConfBuild, repl);
if ((ptr = strchr(prepl, '/')) != NULL)
*ptr = '\0';
- pfind = lowerstr_ctx(Conf, find);
- pmask = lowerstr_ctx(Conf, mask);
+ pfind = lowerstr_ctx(ConfBuild, find);
+ pmask = lowerstr_ctx(ConfBuild, mask);
if (t_iseq(find, '0'))
*pfind = '\0';
if (t_iseq(repl, '0'))
*prepl = '\0';
- NIAddAffix(Conf, sflag, flagflags | aflg, pmask, pfind, prepl,
+ NIAddAffix(ConfBuild, sflag, flagflags | aflg, pmask, pfind, prepl,
isSuffix ? FF_SUFFIX : FF_PREFIX);
pfree(prepl);
pfree(pfind);
@@ -1412,7 +1633,7 @@ nextline:
* work to NIImportOOAffixes(), which will re-read the whole file.
*/
void
-NIImportAffixes(IspellDict *Conf, const char *filename)
+NIImportAffixes(IspellDictBuild *ConfBuild, const char *filename)
{
char *pstr = NULL;
char flag[BUFSIZ];
@@ -1433,9 +1654,9 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
errmsg("could not open affix file \"%s\": %m",
filename)));
- Conf->usecompound = false;
- Conf->useFlagAliases = false;
- Conf->flagMode = FM_CHAR;
+ ConfBuild->dict->usecompound = false;
+ ConfBuild->dict->useFlagAliases = false;
+ ConfBuild->dict->flagMode = FM_CHAR;
while ((recoded = tsearch_readline(&trst)) != NULL)
{
@@ -1457,10 +1678,8 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
s += pg_mblen(s);
if (*s && pg_mblen(s) == 1)
- {
- addCompoundAffixFlagValue(Conf, s, FF_COMPOUNDFLAG);
- Conf->usecompound = true;
- }
+ addCompoundAffixFlagValue(ConfBuild, s, FF_COMPOUNDFLAG);
+
oldformat = true;
goto nextline;
}
@@ -1533,7 +1752,8 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
if (!parse_affentry(pstr, mask, find, repl))
goto nextline;
- NIAddAffix(Conf, flag, flagflags, mask, find, repl, suffixes ? FF_SUFFIX : FF_PREFIX);
+ NIAddAffix(ConfBuild, flag, flagflags, mask, find, repl,
+ suffixes ? FF_SUFFIX : FF_PREFIX);
nextline:
pfree(recoded);
@@ -1552,53 +1772,48 @@ isnewformat:
errmsg("affix file contains both old-style and new-style commands")));
tsearch_readline_end(&trst);
- NIImportOOAffixes(Conf, filename);
+ NIImportOOAffixes(ConfBuild, filename);
}
/*
* Merges two affix flag sets and stores a new affix flag set into
- * Conf->AffixData.
+ * ConfBuild->AffixData.
*
* Returns index of a new affix flag set.
*/
static int
-MergeAffix(IspellDict *Conf, int a1, int a2)
+MergeAffix(IspellDictBuild *ConfBuild, int a1, int a2)
{
- char **ptr;
+ char *ptr;
+ uint32 len;
/* Do not merge affix flags if one of affix flags is empty */
- if (*Conf->AffixData[a1] == '\0')
+ if (*AffixDataGet(ConfBuild, a1) == '\0')
return a2;
- else if (*Conf->AffixData[a2] == '\0')
+ else if (*AffixDataGet(ConfBuild, a2) == '\0')
return a1;
- while (Conf->nAffixData + 1 >= Conf->lenAffixData)
- {
- Conf->lenAffixData *= 2;
- Conf->AffixData = (char **) repalloc(Conf->AffixData,
- sizeof(char *) * Conf->lenAffixData);
- }
-
- ptr = Conf->AffixData + Conf->nAffixData;
- if (Conf->flagMode == FM_NUM)
+ if (ConfBuild->dict->flagMode == FM_NUM)
{
- *ptr = cpalloc(strlen(Conf->AffixData[a1]) +
- strlen(Conf->AffixData[a2]) +
- 1 /* comma */ + 1 /* \0 */ );
- sprintf(*ptr, "%s,%s", Conf->AffixData[a1], Conf->AffixData[a2]);
+ len = strlen(AffixDataGet(ConfBuild, a1)) + 1 /* comma */ +
+ strlen(AffixDataGet(ConfBuild, a2));
+ ptr = tmpalloc(len + 1 /* \0 */);
+ sprintf(ptr, "%s,%s", AffixDataGet(ConfBuild, a1),
+ AffixDataGet(ConfBuild, a2));
}
else
{
- *ptr = cpalloc(strlen(Conf->AffixData[a1]) +
- strlen(Conf->AffixData[a2]) +
- 1 /* \0 */ );
- sprintf(*ptr, "%s%s", Conf->AffixData[a1], Conf->AffixData[a2]);
+ len = strlen(AffixDataGet(ConfBuild, a1)) +
+ strlen(AffixDataGet(ConfBuild, a2));
+ ptr = tmpalloc(len + 1 /* \0 */ );
+ sprintf(ptr, "%s%s", AffixDataGet(ConfBuild, a1),
+ AffixDataGet(ConfBuild, a2));
}
- ptr++;
- *ptr = NULL;
- Conf->nAffixData++;
- return Conf->nAffixData - 1;
+ AddAffixSet(ConfBuild, ptr, len);
+ pfree(ptr);
+
+ return ConfBuild->nAffixData - 1;
}
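
For readers following the MergeAffix() change: the new code builds the merged flag set in a temporary buffer and then hands it to AddAffixSet(), instead of growing an array of char pointers. The standalone sketch below only illustrates the merge rule visible in the hunk (an empty set is skipped, and numeric flags are joined with a comma). The name merge_flag_sets and the use of malloc are assumptions made for illustration, and it returns a string where the patch returns an index into AffixData.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: returns a malloc'ed string
 * holding the merge of flag sets a and b.  If one set is empty, the other
 * is returned unchanged.  Numeric flag sets are joined with a comma.
 */
static char *
merge_flag_sets(const char *a, const char *b, bool numeric)
{
    size_t  len;
    char   *out;

    assert(a != NULL && b != NULL);

    if (*a == '\0' || *b == '\0')
    {
        /* Nothing to merge: copy whichever side is non-empty */
        const char *src = (*a == '\0') ? b : a;

        out = malloc(strlen(src) + 1);
        if (out != NULL)
            strcpy(out, src);
        return out;
    }

    len = strlen(a) + strlen(b) + (numeric ? 1 /* comma */ : 0);
    out = malloc(len + 1 /* \0 */);
    if (out == NULL)
        return NULL;

    if (numeric)
        snprintf(out, len + 1, "%s,%s", a, b);
    else
        snprintf(out, len + 1, "%s%s", a, b);

    return out;
}
```

With numeric flags, merging "12" and "34" gives "12,34"; with character flags, merging "AB" and "C" gives "ABC".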
/*
@@ -1606,66 +1821,87 @@ MergeAffix(IspellDict *Conf, int a1, int a2)
* flags with the given index.
*/
static uint32
-makeCompoundFlags(IspellDict *Conf, int affix)
+makeCompoundFlags(IspellDictBuild *ConfBuild, int affix)
{
- char *str = Conf->AffixData[affix];
+ char *str = AffixDataGet(ConfBuild, affix);
- return (getCompoundAffixFlagValue(Conf, str) & FF_COMPOUNDFLAGMASK);
+ return (getCompoundAffixFlagValue(ConfBuild, str) & FF_COMPOUNDFLAGMASK);
}
/*
* Makes a prefix tree for the given level.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* low: lower index of the Conf->Spell array.
* high: upper index of the Conf->Spell array.
* level: current prefix tree level.
+ *
+ * Returns the offset of the new SPNode in DictNodes, or
+ * ISPELL_INVALID_OFFSET if there is nothing to store at this level.
*/
-static SPNode *
-mkSPNode(IspellDict *Conf, int low, int high, int level)
+static uint32
+mkSPNode(IspellDictBuild *ConfBuild, int low, int high, int level)
{
int i;
int nchar = 0;
char lastchar = '\0';
+ uint32 rs_offset,
+ new_offset;
SPNode *rs;
SPNodeData *data;
+ int data_index = 0;
int lownew = low;
for (i = low; i < high; i++)
- if (Conf->Spell[i]->p.d.len > level && lastchar != Conf->Spell[i]->word[level])
+ if (ConfBuild->Spell[i]->p.d.len > level &&
+ lastchar != ConfBuild->Spell[i]->word[level])
{
nchar++;
- lastchar = Conf->Spell[i]->word[level];
+ lastchar = ConfBuild->Spell[i]->word[level];
}
if (!nchar)
- return NULL;
+ return ISPELL_INVALID_OFFSET;
- rs = (SPNode *) cpalloc0(SPNHDRSZ + nchar * sizeof(SPNodeData));
- rs->length = nchar;
+ rs_offset = AllocateSPNode(ConfBuild, nchar);
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
data = rs->data;
lastchar = '\0';
for (i = low; i < high; i++)
- if (Conf->Spell[i]->p.d.len > level)
+ if (ConfBuild->Spell[i]->p.d.len > level)
{
- if (lastchar != Conf->Spell[i]->word[level])
+ if (lastchar != ConfBuild->Spell[i]->word[level])
{
if (lastchar)
{
/* Next level of the prefix tree */
- data->node = mkSPNode(Conf, lownew, i, level + 1);
+ new_offset = mkSPNode(ConfBuild, lownew, i, level + 1);
+
+ /*
+ * ConfBuild->DictNodes can be repalloc'ed within
+ * mkSPNode(), so reinitialize pointers.
+ */
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
+
+ /* First save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ /* Work with next node */
+ data_index++;
+ Assert(data_index < nchar);
+ data = &(rs->data[data_index]);
+
lownew = i;
- data++;
}
- lastchar = Conf->Spell[i]->word[level];
+ lastchar = ConfBuild->Spell[i]->word[level];
}
- data->val = ((uint8 *) (Conf->Spell[i]->word))[level];
- if (Conf->Spell[i]->p.d.len == level + 1)
+ data->val = ((uint8 *) (ConfBuild->Spell[i]->word))[level];
+ if (ConfBuild->Spell[i]->p.d.len == level + 1)
{
bool clearCompoundOnly = false;
- if (data->isword && data->affix != Conf->Spell[i]->p.d.affix)
+ if (data->isword && data->affix != ConfBuild->Spell[i]->p.d.affix)
{
/*
* MergeAffix called a few times. If one of word is
@@ -1674,15 +1910,17 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
*/
clearCompoundOnly = (FF_COMPOUNDONLY & data->compoundflag
- & makeCompoundFlags(Conf, Conf->Spell[i]->p.d.affix))
+ & makeCompoundFlags(ConfBuild,
+ ConfBuild->Spell[i]->p.d.affix))
? false : true;
- data->affix = MergeAffix(Conf, data->affix, Conf->Spell[i]->p.d.affix);
+ data->affix = MergeAffix(ConfBuild, data->affix,
+ ConfBuild->Spell[i]->p.d.affix);
}
else
- data->affix = Conf->Spell[i]->p.d.affix;
+ data->affix = ConfBuild->Spell[i]->p.d.affix;
data->isword = 1;
- data->compoundflag = makeCompoundFlags(Conf, data->affix);
+ data->compoundflag = makeCompoundFlags(ConfBuild, data->affix);
if ((data->compoundflag & FF_COMPOUNDONLY) &&
(data->compoundflag & FF_COMPOUNDFLAG) == 0)
@@ -1694,9 +1932,19 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
}
/* Next level of the prefix tree */
- data->node = mkSPNode(Conf, lownew, high, level + 1);
+ new_offset = mkSPNode(ConfBuild, lownew, high, level + 1);
+
+ /*
+ * ConfBuild->DictNodes can be repalloc'ed within mkSPNode(), so
+ * reinitialize pointers.
+ */
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
+
+ /* Save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
- return rs;
+ return rs_offset;
}
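
The key point of the mkSPNode() rewrite is that nodes are addressed by offset into a growable array, so any pointer taken before a recursive call must be fetched again afterwards. Here is a minimal standalone sketch of that pattern. Arena, Node, arena_alloc_node, arena_get and build_chain are hypothetical names, and realloc stands in for whatever NodeArray does internally, which is not shown in this part of the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for ISPELL_INVALID_OFFSET */
#define INVALID_OFFSET UINT32_MAX

/* Growable byte arena; nodes are addressed by offset, never by pointer */
typedef struct
{
    char       *data;
    uint32_t    used;
    uint32_t    size;
} Arena;

typedef struct
{
    uint32_t    child;      /* offset of the child node or INVALID_OFFSET */
    uint32_t    value;
} Node;

/* Reserve room for one node and return its offset; may move arena->data */
static uint32_t
arena_alloc_node(Arena *arena)
{
    uint32_t    offset = arena->used;

    if (arena->used + sizeof(Node) > arena->size)
    {
        uint32_t    newsize = arena->size ? arena->size * 2
                                          : (uint32_t) sizeof(Node);
        char       *p = realloc(arena->data, newsize);

        if (p == NULL)
            abort();
        arena->data = p;
        arena->size = newsize;
    }
    arena->used += (uint32_t) sizeof(Node);
    return offset;
}

/* Translate an offset into a pointer that is valid until the next alloc */
static Node *
arena_get(Arena *arena, uint32_t offset)
{
    assert(offset < arena->used);
    return (Node *) (arena->data + offset);
}

/* Build a chain of depth nodes; returns the offset of the first one */
static uint32_t
build_chain(Arena *arena, uint32_t depth)
{
    uint32_t    self;
    uint32_t    child;
    Node       *node;

    if (depth == 0)
        return INVALID_OFFSET;

    self = arena_alloc_node(arena);
    node = arena_get(arena, self);
    node->value = depth;

    /* The recursive call may realloc the arena and move every node */
    child = build_chain(arena, depth - 1);

    /* Re-fetch the pointer: the one taken above may now be dangling */
    node = arena_get(arena, self);
    node->child = child;

    return self;
}
```

Because only offsets are stored inside the nodes, the finished arena can be copied byte for byte into another address space, which is presumably what makes this layout suitable for a shared segment.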
/*
@@ -1704,7 +1952,7 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
* and affixes.
*/
void
-NISortDictionary(IspellDict *Conf)
+NISortDictionary(IspellDictBuild *ConfBuild)
{
int i;
int naffix = 0;
@@ -1713,81 +1961,81 @@ NISortDictionary(IspellDict *Conf)
/* compress affixes */
/*
- * If we use flag aliases then we need to use Conf->AffixData filled in
+ * If we use flag aliases then we need to use ConfBuild->AffixData filled in
* the NIImportOOAffixes().
*/
- if (Conf->useFlagAliases)
+ if (ConfBuild->dict->useFlagAliases)
{
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
char *end;
- if (*Conf->Spell[i]->p.flag != '\0')
+ if (*ConfBuild->Spell[i]->p.flag != '\0')
{
- curaffix = strtol(Conf->Spell[i]->p.flag, &end, 10);
- if (Conf->Spell[i]->p.flag == end || errno == ERANGE)
+ curaffix = strtol(ConfBuild->Spell[i]->p.flag, &end, 10);
+ if (ConfBuild->Spell[i]->p.flag == end || errno == ERANGE)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix alias \"%s\"",
- Conf->Spell[i]->p.flag)));
+ ConfBuild->Spell[i]->p.flag)));
}
else
{
/*
- * If Conf->Spell[i]->p.flag is empty, then get empty value of
- * Conf->AffixData (0 index).
+ * If ConfBuild->Spell[i]->p.flag is empty, then get empty
+ * value of ConfBuild->AffixData (0 index).
*/
curaffix = 0;
}
- Conf->Spell[i]->p.d.affix = curaffix;
- Conf->Spell[i]->p.d.len = strlen(Conf->Spell[i]->word);
+ ConfBuild->Spell[i]->p.d.affix = curaffix;
+ ConfBuild->Spell[i]->p.d.len = strlen(ConfBuild->Spell[i]->word);
}
}
- /* Otherwise fill Conf->AffixData here */
+ /* Otherwise fill ConfBuild->AffixData here */
else
{
/* Count the number of different flags used in the dictionary */
- qsort((void *) Conf->Spell, Conf->nspell, sizeof(SPELL *),
+ qsort((void *) ConfBuild->Spell, ConfBuild->nSpell, sizeof(SPELL *),
cmpspellaffix);
naffix = 0;
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
if (i == 0
- || strcmp(Conf->Spell[i]->p.flag, Conf->Spell[i - 1]->p.flag))
+ || strcmp(ConfBuild->Spell[i]->p.flag,
+ ConfBuild->Spell[i - 1]->p.flag))
naffix++;
}
/*
- * Fill in Conf->AffixData with the affixes that were used in the
- * dictionary. Replace textual flag-field of Conf->Spell entries with
- * indexes into Conf->AffixData array.
+ * Fill in AffixData with the affixes that were used in the
+ * dictionary. Replace textual flag-field of ConfBuild->Spell entries
+ * with indexes into ConfBuild->AffixData array.
*/
- Conf->AffixData = (char **) palloc0(naffix * sizeof(char *));
+ InitAffixData(ConfBuild, naffix);
curaffix = -1;
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
if (i == 0
- || strcmp(Conf->Spell[i]->p.flag, Conf->AffixData[curaffix]))
+ || strcmp(ConfBuild->Spell[i]->p.flag,
+ AffixDataGet(ConfBuild, curaffix)))
{
curaffix++;
Assert(curaffix < naffix);
- Conf->AffixData[curaffix] = cpstrdup(Conf,
- Conf->Spell[i]->p.flag);
+ AddAffixSet(ConfBuild, ConfBuild->Spell[i]->p.flag,
+ strlen(ConfBuild->Spell[i]->p.flag));
}
- Conf->Spell[i]->p.d.affix = curaffix;
- Conf->Spell[i]->p.d.len = strlen(Conf->Spell[i]->word);
+ ConfBuild->Spell[i]->p.d.affix = curaffix;
+ ConfBuild->Spell[i]->p.d.len = strlen(ConfBuild->Spell[i]->word);
}
-
- Conf->lenAffixData = Conf->nAffixData = naffix;
}
/* Start build a prefix tree */
- qsort((void *) Conf->Spell, Conf->nspell, sizeof(SPELL *), cmpspell);
- Conf->Dictionary = mkSPNode(Conf, 0, Conf->nspell, 0);
+ qsort((void *) ConfBuild->Spell, ConfBuild->nSpell, sizeof(SPELL *), cmpspell);
+ mkSPNode(ConfBuild, 0, ConfBuild->nSpell, 0);
}
/*
@@ -1795,83 +2043,104 @@ NISortDictionary(IspellDict *Conf)
* rule. Affixes with empty replace string do not include in the prefix tree.
* This affixes are included by mkVoidAffix().
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* low: lower index of the Conf->Affix array.
* high: upper index of the Conf->Affix array.
* level: current prefix tree level.
* type: FF_SUFFIX or FF_PREFIX.
+ *
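+ * Returns the offset of the new node in the prefix or suffix nodes array,
+ * or ISPELL_INVALID_OFFSET if there is nothing to store at this level.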
*/
-static AffixNode *
-mkANode(IspellDict *Conf, int low, int high, int level, int type)
+static uint32
+mkANode(IspellDictBuild *ConfBuild, int low, int high, int level, int type)
{
int i;
int nchar = 0;
uint8 lastchar = '\0';
+ NodeArray *array;
+ uint32 rs_offset,
+ new_offset;
AffixNode *rs;
AffixNodeData *data;
+ int data_index = 0;
int lownew = low;
- int naff;
- AFFIX **aff;
for (i = low; i < high; i++)
- if (Conf->Affix[i].replen > level && lastchar != GETCHAR(Conf->Affix + i, level, type))
+ if (ConfBuild->Affix[i]->replen > level &&
+ lastchar != GETCHAR(ConfBuild->Affix[i], level, type))
{
nchar++;
- lastchar = GETCHAR(Conf->Affix + i, level, type);
+ lastchar = GETCHAR(ConfBuild->Affix[i], level, type);
}
if (!nchar)
- return NULL;
+ return ISPELL_INVALID_OFFSET;
- aff = (AFFIX **) tmpalloc(sizeof(AFFIX *) * (high - low + 1));
- naff = 0;
+ if (type == FF_SUFFIX)
+ array = &ConfBuild->SuffixNodes;
+ else
+ array = &ConfBuild->PrefixNodes;
- rs = (AffixNode *) cpalloc0(ANHRDSZ + nchar * sizeof(AffixNodeData));
- rs->length = nchar;
- data = rs->data;
+ rs_offset = AllocateAffixNode(ConfBuild, array, nchar);
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
+ data = (AffixNodeData *) rs->data;
lastchar = '\0';
for (i = low; i < high; i++)
- if (Conf->Affix[i].replen > level)
+ if (ConfBuild->Affix[i]->replen > level)
{
- if (lastchar != GETCHAR(Conf->Affix + i, level, type))
+ if (lastchar != GETCHAR(ConfBuild->Affix[i], level, type))
{
if (lastchar)
{
/* Next level of the prefix tree */
- data->node = mkANode(Conf, lownew, i, level + 1, type);
- if (naff)
- {
- data->naff = naff;
- data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * naff);
- memcpy(data->aff, aff, sizeof(AFFIX *) * naff);
- naff = 0;
- }
- data++;
+ new_offset = mkANode(ConfBuild, lownew, i, level + 1, type);
+
+ /*
+ * array can be repalloc'ed within mkANode(), so
+ * reinitialize pointers.
+ */
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
+
+ /* First save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ /* Handle next data node */
+ data_index++;
+ Assert(data_index < nchar);
+ data = &(rs->data[data_index]);
+
lownew = i;
}
- lastchar = GETCHAR(Conf->Affix + i, level, type);
+ lastchar = GETCHAR(ConfBuild->Affix[i], level, type);
}
- data->val = GETCHAR(Conf->Affix + i, level, type);
- if (Conf->Affix[i].replen == level + 1)
+ data->val = GETCHAR(ConfBuild->Affix[i], level, type);
+ if (ConfBuild->Affix[i]->replen == level + 1)
{ /* affix stopped */
- aff[naff++] = Conf->Affix + i;
+ if (data->affstart == ISPELL_INVALID_INDEX)
+ {
+ data->affstart = i;
+ data->affend = i;
+ }
+ else
+ data->affend = i;
}
}
/* Next level of the prefix tree */
- data->node = mkANode(Conf, lownew, high, level + 1, type);
- if (naff)
- {
- data->naff = naff;
- data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * naff);
- memcpy(data->aff, aff, sizeof(AFFIX *) * naff);
- naff = 0;
- }
+ new_offset = mkANode(ConfBuild, lownew, high, level + 1, type);
+
+ /*
+ * array can be repalloc'ed within mkANode(), so reinitialize pointers.
+ */
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
- pfree(aff);
+ /* Save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
- return rs;
+ return rs_offset;
}
/*
@@ -1879,139 +2148,156 @@ mkANode(IspellDict *Conf, int low, int high, int level, int type)
* for affixes which have empty replace string ("repl" field).
*/
static void
-mkVoidAffix(IspellDict *Conf, bool issuffix, int startsuffix)
+mkVoidAffix(IspellDictBuild *ConfBuild, bool issuffix, int startsuffix)
{
- int i,
- cnt = 0;
+ int i;
int start = (issuffix) ? startsuffix : 0;
- int end = (issuffix) ? Conf->naffixes : startsuffix;
- AffixNode *Affix = (AffixNode *) palloc0(ANHRDSZ + sizeof(AffixNodeData));
-
- Affix->length = 1;
- Affix->isvoid = 1;
+ int end = (issuffix) ? ConfBuild->nAffix : startsuffix;
+ uint32 node_offset;
+ NodeArray *array;
+ AffixNode *Affix;
+ AffixNodeData *AffixData;
if (issuffix)
- {
- Affix->data->node = Conf->Suffix;
- Conf->Suffix = Affix;
- }
+ array = &ConfBuild->SuffixNodes;
else
- {
- Affix->data->node = Conf->Prefix;
- Conf->Prefix = Affix;
- }
+ array = &ConfBuild->PrefixNodes;
- /* Count affixes with empty replace string */
- for (i = start; i < end; i++)
- if (Conf->Affix[i].replen == 0)
- cnt++;
-
- /* There is not affixes with empty replace string */
- if (cnt == 0)
- return;
+ node_offset = AllocateAffixNode(ConfBuild, array, 1);
+ Affix = (AffixNode *) NodeArrayGet(array, node_offset);
- Affix->data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * cnt);
- Affix->data->naff = (uint32) cnt;
+ Affix->isvoid = 1;
+ AffixData = (AffixNodeData *) Affix->data;
- cnt = 0;
for (i = start; i < end; i++)
- if (Conf->Affix[i].replen == 0)
+ if (ConfBuild->Affix[i]->replen == 0)
{
- Affix->data->aff[cnt] = Conf->Affix + i;
- cnt++;
+ if (AffixData->affstart == ISPELL_INVALID_INDEX)
+ {
+ AffixData->affstart = i;
+ AffixData->affend = i;
+ }
+ else
+ AffixData->affend = i;
}
}
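
In mkANode() and mkVoidAffix() the per-node array of AFFIX pointers is replaced by a pair of indexes, affstart and affend, into the sorted Affix array. A minimal sketch of that bookkeeping follows. IndexRange, range_add and range_count are hypothetical names, and the sketch assumes, as the patch appears to, that indexes are added in ascending order and that every entry between the first and the last belongs to the node.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for ISPELL_INVALID_INDEX */
#define INVALID_INDEX UINT32_MAX

typedef struct
{
    uint32_t    start;  /* index of the first entry, or INVALID_INDEX */
    uint32_t    end;    /* index of the last entry, inclusive */
} IndexRange;

/* Record index i as a member of the range, widening the range if needed */
static void
range_add(IndexRange *r, uint32_t i)
{
    /* Indexes are expected in ascending order */
    assert(r->start == INVALID_INDEX || i >= r->end);

    if (r->start == INVALID_INDEX)
        r->start = i;
    r->end = i;
}

/* Number of entries covered by the range; zero for an empty range */
static uint32_t
range_count(const IndexRange *r)
{
    if (r->start == INVALID_INDEX)
        return 0;
    return r->end - r->start + 1;
}
```

A consumer then walks the range with an inclusive loop, for (j = r.start; j <= r.end; j++), after first checking that start is not the invalid marker.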
/*
- * Checks if the affixflag is used by dictionary. Conf->AffixData does not
+ * Checks if the affixflag is used by dictionary. AffixData does not
* contain affixflag if this flag is not used actually by the .dict file.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* affixflag: affix flag.
*
- * Returns true if the Conf->AffixData array contains affixflag, otherwise
+ * Returns true if the ConfBuild->AffixData array contains affixflag, otherwise
* returns false.
*/
static bool
-isAffixInUse(IspellDict *Conf, char *affixflag)
+isAffixInUse(IspellDictBuild *ConfBuild, char *affixflag)
{
int i;
- for (i = 0; i < Conf->nAffixData; i++)
- if (IsAffixFlagInUse(Conf, i, affixflag))
+ for (i = 0; i < ConfBuild->nAffixData; i++)
+ if (IsAffixFlagInUse(ConfBuild->dict->flagMode,
+ AffixDataGet(ConfBuild, i), affixflag))
return true;
return false;
}
/*
- * Builds Conf->Prefix and Conf->Suffix trees from the imported affixes.
+ * Builds Prefix and Suffix trees from the imported affixes.
*/
void
-NISortAffixes(IspellDict *Conf)
+NISortAffixes(IspellDictBuild *ConfBuild)
{
AFFIX *Affix;
+ AffixNode *voidPrefix,
+ *voidSuffix;
size_t i;
CMPDAffix *ptr;
- int firstsuffix = Conf->naffixes;
+ int firstsuffix = ConfBuild->nAffix;
+ uint32 prefix_offset,
+ suffix_offset;
- if (Conf->naffixes == 0)
+ if (ConfBuild->nAffix == 0)
return;
/* Store compound affixes in the Conf->CompoundAffix array */
- if (Conf->naffixes > 1)
- qsort((void *) Conf->Affix, Conf->naffixes, sizeof(AFFIX), cmpaffix);
- Conf->CompoundAffix = ptr = (CMPDAffix *) palloc(sizeof(CMPDAffix) * Conf->naffixes);
- ptr->affix = NULL;
-
- for (i = 0; i < Conf->naffixes; i++)
+ if (ConfBuild->nAffix > 1)
+ qsort((void *) ConfBuild->Affix, ConfBuild->nAffix,
+ sizeof(AFFIX *), cmpaffix);
+ ConfBuild->nCompoundAffix = ConfBuild->nAffix;
+ ConfBuild->CompoundAffix = ptr =
+ (CMPDAffix *) tmpalloc(sizeof(CMPDAffix) * ConfBuild->nCompoundAffix);
+ ptr->affix = ISPELL_INVALID_INDEX;
+
+ for (i = 0; i < ConfBuild->nAffix; i++)
{
- Affix = &(((AFFIX *) Conf->Affix)[i]);
+ Affix = ConfBuild->Affix[i];
if (Affix->type == FF_SUFFIX && i < firstsuffix)
firstsuffix = i;
if ((Affix->flagflags & FF_COMPOUNDFLAG) && Affix->replen > 0 &&
- isAffixInUse(Conf, Affix->flag))
+ isAffixInUse(ConfBuild, AffixFieldFlag(Affix)))
{
bool issuffix = (Affix->type == FF_SUFFIX);
- if (ptr == Conf->CompoundAffix ||
+ if (ptr == ConfBuild->CompoundAffix ||
issuffix != (ptr - 1)->issuffix ||
- strbncmp((const unsigned char *) (ptr - 1)->affix,
- (const unsigned char *) Affix->repl,
+ strbncmp((const unsigned char *) AffixFieldRepl(ConfBuild->Affix[(ptr - 1)->affix]),
+ (const unsigned char *) AffixFieldRepl(Affix),
(ptr - 1)->len))
{
/* leave only unique and minimals suffixes */
- ptr->affix = Affix->repl;
+ ptr->affix = i;
ptr->len = Affix->replen;
ptr->issuffix = issuffix;
ptr++;
}
}
}
- ptr->affix = NULL;
- Conf->CompoundAffix = (CMPDAffix *) repalloc(Conf->CompoundAffix, sizeof(CMPDAffix) * (ptr - Conf->CompoundAffix + 1));
+ ptr->affix = ISPELL_INVALID_INDEX;
+ ConfBuild->nCompoundAffix = ptr - ConfBuild->CompoundAffix + 1;
+ ConfBuild->CompoundAffix = (CMPDAffix *) repalloc(ConfBuild->CompoundAffix,
+ sizeof(CMPDAffix) * (ConfBuild->nCompoundAffix));
/* Start build a prefix tree */
- Conf->Prefix = mkANode(Conf, 0, firstsuffix, 0, FF_PREFIX);
- Conf->Suffix = mkANode(Conf, firstsuffix, Conf->naffixes, 0, FF_SUFFIX);
- mkVoidAffix(Conf, true, firstsuffix);
- mkVoidAffix(Conf, false, firstsuffix);
+ mkVoidAffix(ConfBuild, true, firstsuffix);
+ mkVoidAffix(ConfBuild, false, firstsuffix);
+
+ prefix_offset = mkANode(ConfBuild, 0, firstsuffix, 0, FF_PREFIX);
+ suffix_offset = mkANode(ConfBuild, firstsuffix, ConfBuild->nAffix, 0,
+ FF_SUFFIX);
+
+ /* Point the void-affix nodes (stored at offset 0) at the new tree roots */
+ voidPrefix = (AffixNode *) NodeArrayGet(&ConfBuild->PrefixNodes, 0);
+ voidPrefix->data[0].node_offset = prefix_offset;
+
+ voidSuffix = (AffixNode *) NodeArrayGet(&ConfBuild->SuffixNodes, 0);
+ voidSuffix->data[0].node_offset = suffix_offset;
}
static AffixNodeData *
-FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
+FindAffixes(IspellDictData *dict, AffixNode *node, const char *word, int wrdlen,
+ int *level, int type)
{
+ AffixNode *node_start;
AffixNodeData *StopLow,
*StopHigh,
*StopMiddle;
uint8 symbol;
+ if (type == FF_PREFIX)
+ node_start = (AffixNode *) DictPrefixNodes(dict);
+ else
+ node_start = (AffixNode *) DictSuffixNodes(dict);
+
if (node->isvoid)
{ /* search void affixes */
- if (node->data->naff)
+ if (node->data->affstart != ISPELL_INVALID_INDEX)
return node->data;
- node = node->data->node;
+ node = (AffixNode *) DictNodeGet(node_start, node->data->node_offset);
}
while (node && *level < wrdlen)
@@ -2026,9 +2312,10 @@ FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
if (StopMiddle->val == symbol)
{
(*level)++;
- if (StopMiddle->naff)
+ if (StopMiddle->affstart != ISPELL_INVALID_INDEX)
return StopMiddle;
- node = StopMiddle->node;
+ node = (AffixNode *) DictNodeGet(node_start,
+ StopMiddle->node_offset);
break;
}
else if (StopMiddle->val < symbol)
@@ -2042,8 +2329,67 @@ FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
return NULL;
}
+/*
+ * Compile the affix's regular expression on first use and cache it in reg.
+ */
+static void
+CompileAffixReg(AffixReg *reg, bool isregis, int type,
+ const char *mask, int masklen, MemoryContext dictCtx)
+{
+ MemoryContext oldcontext;
+
+ Assert(dictCtx);
+
+ /*
+ * Switch to the dictionary's memory context so that the compiled
+ * expression outlives the current query and can be reused by later ones.
+ */
+ oldcontext = MemoryContextSwitchTo(dictCtx);
+
+ if (isregis)
+ RS_compile(&reg->r.regis, (type == FF_SUFFIX), mask);
+ else
+ {
+ int wmasklen;
+ int err;
+ pg_wchar *wmask;
+ char *tmask;
+
+ tmask = (char *) palloc(masklen + 3);
+ if (type == FF_SUFFIX)
+ sprintf(tmask, "%s$", mask);
+ else
+ sprintf(tmask, "^%s", mask);
+
+ masklen = strlen(tmask);
+ wmask = (pg_wchar *) palloc((masklen + 1) * sizeof(pg_wchar));
+ wmasklen = pg_mb2wchar_with_len(tmask, wmask, masklen);
+
+ err = pg_regcomp(&(reg->r.regex), wmask, wmasklen,
+ REG_ADVANCED | REG_NOSUB,
+ DEFAULT_COLLATION_OID);
+ if (err)
+ {
+ char errstr[100];
+
+ pg_regerror(err, &(reg->r.regex), errstr, sizeof(errstr));
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
+ errmsg("invalid regular expression: %s", errstr)));
+ }
+
+ pfree(wmask);
+ pfree(tmask);
+ }
+
+ reg->iscompiled = true;
+
+ MemoryContextSwitchTo(oldcontext);
+}
+
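
CompileAffixReg() implements the part of the proposal that says regular expressions are not kept in shared memory: each backend compiles an affix mask the first time it needs it and keeps the result in its own AffixReg slot. The sketch below shows the same compile-on-first-use idea with the POSIX regex API rather than PostgreSQL's pg_regcomp(). LazyRegex and lazy_match are hypothetical names, and the fixed 256-byte pattern buffer is an arbitrary choice for the example.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Process-local cache slot for one affix mask (hypothetical type) */
typedef struct
{
    bool        iscompiled;
    regex_t     regex;
} LazyRegex;

/*
 * Match word against mask, anchored at the end for suffixes and at the
 * start for prefixes.  The pattern is compiled on the first call only.
 * Returns 1 on match, 0 on no match, -1 if the mask cannot be compiled.
 */
static int
lazy_match(LazyRegex *reg, const char *mask, bool is_suffix, const char *word)
{
    assert(reg != NULL && mask != NULL && word != NULL);

    if (!reg->iscompiled)
    {
        char        pattern[256];

        /* Room is needed for the mask, one anchor character and the \0 */
        if (strlen(mask) + 2 > sizeof(pattern))
            return -1;

        if (is_suffix)
            snprintf(pattern, sizeof(pattern), "%s$", mask);
        else
            snprintf(pattern, sizeof(pattern), "^%s", mask);

        if (regcomp(&reg->regex, pattern, REG_EXTENDED | REG_NOSUB) != 0)
            return -1;

        reg->iscompiled = true;
    }

    return regexec(&reg->regex, word, 0, NULL, 0) == 0 ? 1 : 0;
}
```

The caller owns the slot and should release it with regfree() once the dictionary is dropped. In the patch the equivalent lifetime is presumably tied to dictCtx.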
static char *
-CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *newword, int *baselen)
+CheckAffix(const char *word, size_t len, AFFIX *Affix, AffixReg *reg,
+ int flagflags, char *newword, int *baselen, MemoryContext dictCtx)
{
/*
* Check compound allow flags
@@ -2083,7 +2429,7 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
if (Affix->type == FF_SUFFIX)
{
strcpy(newword, word);
- strcpy(newword + len - Affix->replen, Affix->find);
+ strcpy(newword + len - Affix->replen, AffixFieldFind(Affix));
if (baselen) /* store length of non-changed part of word */
*baselen = len - Affix->replen;
}
@@ -2093,9 +2439,9 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
* if prefix is an all non-changed part's length then all word
* contains only prefix and suffix, so out
*/
- if (baselen && *baselen + strlen(Affix->find) <= Affix->replen)
+ if (baselen && *baselen + Affix->findlen <= Affix->replen)
return NULL;
- strcpy(newword, Affix->find);
+ strcpy(newword, AffixFieldFind(Affix));
strcat(newword, word + Affix->replen);
}
@@ -2106,7 +2452,12 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
return newword;
else if (Affix->isregis)
{
- if (RS_execute(&(Affix->reg.regis), newword))
+ /* Compile the regular expression on first demand */
+ if (!reg->iscompiled)
+ CompileAffixReg(reg, Affix->isregis, Affix->type,
+ AffixFieldMask(Affix), Affix->masklen, dictCtx);
+
+ if (RS_execute(&(reg->r.regis), newword))
return newword;
}
else
@@ -2116,12 +2467,17 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
size_t data_len;
int newword_len;
+ /* Compile the regular expression on first demand */
+ if (!reg->iscompiled)
+ CompileAffixReg(reg, Affix->isregis, Affix->type,
+ AffixFieldMask(Affix), Affix->masklen, dictCtx);
+
/* Convert data string to wide characters */
newword_len = strlen(newword);
data = (pg_wchar *) palloc((newword_len + 1) * sizeof(pg_wchar));
data_len = pg_mb2wchar_with_len(newword, data, newword_len);
- if (!(err = pg_regexec(&(Affix->reg.regex), data, data_len, 0, NULL, 0, NULL, 0)))
+ if (!(err = pg_regexec(&(reg->r.regex), data, data_len, 0, NULL, 0, NULL, 0)))
{
pfree(data);
return newword;
@@ -2160,7 +2516,7 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
char **cur;
char newword[2 * MAXNORMLEN] = "";
char pnewword[2 * MAXNORMLEN] = "";
- AffixNode *snode = Conf->Suffix,
+ AffixNode *snode = (AffixNode *) DictSuffixNodes(Conf->dict),
*pnode;
int i,
j;
@@ -2172,7 +2528,7 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
/* Check that the word itself is normal form */
- if (FindWord(Conf, word, VoidString, flag))
+ if (FindWord(Conf->dict, word, VoidString, flag))
{
*cur = pstrdup(word);
cur++;
@@ -2180,23 +2536,29 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
}
/* Find all other NORMAL forms of the 'word' (check only prefix) */
- pnode = Conf->Prefix;
+ pnode = (AffixNode *) DictPrefixNodes(Conf->dict);
plevel = 0;
while (pnode)
{
- prefix = FindAffixes(pnode, word, wrdlen, &plevel, FF_PREFIX);
+ prefix = FindAffixes(Conf->dict, pnode, word, wrdlen, &plevel, FF_PREFIX);
if (!prefix)
break;
- for (j = 0; j < prefix->naff; j++)
+ for (j = prefix->affstart; j <= prefix->affend; j++)
{
- if (CheckAffix(word, wrdlen, prefix->aff[j], flag, newword, NULL))
+ AFFIX *affix = (AFFIX *) DictAffixGet(Conf->dict, j);
+ AffixReg *reg = &(Conf->reg[j]);
+
+ if (affix &&
+ CheckAffix(word, wrdlen, affix, reg, flag, newword, NULL,
+ Conf->dictCtx))
{
/* prefix success */
- if (FindWord(Conf, newword, prefix->aff[j]->flag, flag))
+ if (FindWord(Conf->dict, newword, AffixFieldFlag(affix), flag))
cur += addToResult(forms, cur, newword);
}
}
- pnode = prefix->node;
+ pnode = (AffixNode *) DictNodeGet(DictPrefixNodes(Conf->dict),
+ prefix->node_offset);
}
/*
@@ -2208,45 +2570,59 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
int baselen = 0;
/* find possible suffix */
- suffix = FindAffixes(snode, word, wrdlen, &slevel, FF_SUFFIX);
+ suffix = FindAffixes(Conf->dict, snode, word, wrdlen, &slevel,
+ FF_SUFFIX);
if (!suffix)
break;
/* foreach suffix check affix */
- for (i = 0; i < suffix->naff; i++)
+ for (i = suffix->affstart; i <= suffix->affend; i++)
{
- if (CheckAffix(word, wrdlen, suffix->aff[i], flag, newword, &baselen))
+ AFFIX *sufentry = (AFFIX *) DictAffixGet(Conf->dict, i);
+ AffixReg *sufreg = &(Conf->reg[i]);
+
+ if (sufentry &&
+ CheckAffix(word, wrdlen, sufentry, sufreg, flag, newword, &baselen,
+ Conf->dictCtx))
{
/* suffix success */
- if (FindWord(Conf, newword, suffix->aff[i]->flag, flag))
+ if (FindWord(Conf->dict, newword, AffixFieldFlag(sufentry), flag))
cur += addToResult(forms, cur, newword);
/* now we will look changed word with prefixes */
- pnode = Conf->Prefix;
+ pnode = (AffixNode *) DictPrefixNodes(Conf->dict);
plevel = 0;
swrdlen = strlen(newword);
while (pnode)
{
- prefix = FindAffixes(pnode, newword, swrdlen, &plevel, FF_PREFIX);
+ prefix = FindAffixes(Conf->dict, pnode, newword, swrdlen,
+ &plevel, FF_PREFIX);
if (!prefix)
break;
- for (j = 0; j < prefix->naff; j++)
+ for (j = prefix->affstart; j <= prefix->affend; j++)
{
- if (CheckAffix(newword, swrdlen, prefix->aff[j], flag, pnewword, &baselen))
+ AFFIX *prefentry = (AFFIX *) DictAffixGet(Conf->dict, j);
+ AffixReg *prefreg = &(Conf->reg[j]);
+
+ if (prefentry &&
+ CheckAffix(newword, swrdlen, prefentry, prefreg,
+ flag, pnewword, &baselen, Conf->dictCtx))
{
/* prefix success */
- char *ff = (prefix->aff[j]->flagflags & suffix->aff[i]->flagflags & FF_CROSSPRODUCT) ?
- VoidString : prefix->aff[j]->flag;
+ char *ff = (prefentry->flagflags & sufentry->flagflags & FF_CROSSPRODUCT) ?
+ VoidString : AffixFieldFlag(prefentry);
- if (FindWord(Conf, pnewword, ff, flag))
+ if (FindWord(Conf->dict, pnewword, ff, flag))
cur += addToResult(forms, cur, pnewword);
}
}
- pnode = prefix->node;
+ pnode = (AffixNode *) DictNodeGet(DictPrefixNodes(Conf->dict),
+ prefix->node_offset);
}
}
}
- snode = suffix->node;
+ snode = (AffixNode *) DictNodeGet(DictSuffixNodes(Conf->dict),
+ suffix->node_offset);
}
if (cur == forms)
@@ -2266,7 +2642,8 @@ typedef struct SplitVar
} SplitVar;
static int
-CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
+CheckCompoundAffixes(IspellDictData *dict, CMPDAffix **ptr,
+ char *word, int len, bool CheckInPlace)
{
bool issuffix;
@@ -2276,9 +2653,12 @@ CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
if (CheckInPlace)
{
- while ((*ptr)->affix)
+ while ((*ptr)->affix != ISPELL_INVALID_INDEX)
{
- if (len > (*ptr)->len && strncmp((*ptr)->affix, word, (*ptr)->len) == 0)
+ AFFIX *affix = (AFFIX *) DictAffixGet(dict, (*ptr)->affix);
+
+ if (len > (*ptr)->len &&
+ strncmp(AffixFieldRepl(affix), word, (*ptr)->len) == 0)
{
len = (*ptr)->len;
issuffix = (*ptr)->issuffix;
@@ -2292,9 +2672,12 @@ CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
{
char *affbegin;
- while ((*ptr)->affix)
+ while ((*ptr)->affix != ISPELL_INVALID_INDEX)
{
- if (len > (*ptr)->len && (affbegin = strstr(word, (*ptr)->affix)) != NULL)
+ AFFIX *affix = (AFFIX *) DictAffixGet(dict, (*ptr)->affix);
+
+ if (len > (*ptr)->len &&
+ (affbegin = strstr(word, AffixFieldRepl(affix))) != NULL)
{
len = (*ptr)->len + (affbegin - word);
issuffix = (*ptr)->issuffix;
@@ -2346,13 +2729,14 @@ AddStem(SplitVar *v, char *word)
}
static SplitVar *
-SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int wordlen, int startpos, int minpos)
+SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig,
+ char *word, int wordlen, int startpos, int minpos)
{
SplitVar *var = NULL;
SPNodeData *StopLow,
*StopHigh,
*StopMiddle = NULL;
- SPNode *node = (snode) ? snode : Conf->Dictionary;
+ SPNode *node = (snode) ? snode : (SPNode *) DictDictNodes(Conf->dict);
int level = (snode) ? minpos : startpos; /* recursive
* minpos==level */
int lenaff;
@@ -2367,8 +2751,11 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
while (level < wordlen)
{
/* find word with epenthetic or/and compound affix */
- caff = Conf->CompoundAffix;
- while (level > startpos && (lenaff = CheckCompoundAffixes(&caff, word + level, wordlen - level, (node) ? true : false)) >= 0)
+ caff = (CMPDAffix *) DictCompoundAffix(Conf->dict);
+ while (level > startpos &&
+ (lenaff = CheckCompoundAffixes(Conf->dict, &caff,
+ word + level, wordlen - level,
+ (node) ? true : false)) >= 0)
{
/*
* there is one of compound affixes, so check word for existings
@@ -2415,7 +2802,8 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
while (ptr->next)
ptr = ptr->next;
- ptr->next = SplitToVariants(Conf, NULL, new, word, wordlen, startpos + lenaff, startpos + lenaff);
+ ptr->next = SplitToVariants(Conf, NULL, new, word, wordlen,
+ startpos + lenaff, startpos + lenaff);
pfree(new->stem);
pfree(new);
@@ -2474,13 +2862,14 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
/* we can find next word */
level++;
AddStem(var, pnstrdup(word + startpos, level - startpos));
- node = Conf->Dictionary;
+ node = (SPNode *) DictDictNodes(Conf->dict);
startpos = level;
continue;
}
}
}
- node = StopMiddle->node;
+ node = (SPNode *) DictNodeGet(DictDictNodes(Conf->dict),
+ StopMiddle->node_offset);
}
else
node = NULL;
@@ -2530,7 +2919,7 @@ NINormalizeWord(IspellDict *Conf, char *word)
pfree(res);
}
- if (Conf->usecompound)
+ if (Conf->dict->usecompound)
{
int wordlen = strlen(word);
SplitVar *ptr,
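The hunks above replace raw child pointers with node_offset values that are resolved through DictNodeGet(). A minimal sketch of why this matters, assuming simplified stand-in types (DemoNode, demo_node_get and the other demo_* names are hypothetical and are not part of the patch): links stored as byte offsets stay valid when the whole buffer is copied or mapped at a different address, which is what a DSM segment needs.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for ISPELL_INVALID_OFFSET. */
#define DEMO_INVALID_OFFSET 0xFFFFFFFFu

/* Hypothetical, reduced node: one character plus an offset to a child. */
typedef struct
{
    uint8_t  val;
    uint32_t node_offset;   /* byte offset from the buffer start */
} DemoNode;

/* Resolve an offset against the buffer base, like DictNodeGet() does. */
static DemoNode *
demo_node_get(char *base, uint32_t off)
{
    if (off == DEMO_INVALID_OFFSET)
        return NULL;
    return (DemoNode *) (base + off);
}

/* Append a childless node to the flat buffer and return its offset. */
static uint32_t
demo_node_add(char *base, uint32_t *end, uint8_t val)
{
    DemoNode node = {val, DEMO_INVALID_OFFSET};
    uint32_t off = *end;

    memcpy(base + off, &node, sizeof(DemoNode));
    *end += (uint32_t) sizeof(DemoNode);
    return off;
}

/* Count the nodes reachable from 'off' by following node_offset links. */
static int
demo_chain_length(char *base, uint32_t off)
{
    int       n = 0;
    DemoNode *node;

    for (node = demo_node_get(base, off); node != NULL;
         node = demo_node_get(base, node->node_offset))
        n++;
    return n;
}

/* Build a two-node chain, copy the buffer elsewhere, walk the copy. */
static int
demo_relocated_chain_length(void)
{
    char    *orig = malloc(64);
    char    *copy = malloc(64);
    uint32_t end = 0, first, second;
    int      n;

    assert(orig != NULL && copy != NULL);
    first = demo_node_add(orig, &end, 'c');
    second = demo_node_add(orig, &end, 'a');
    demo_node_get(orig, first)->node_offset = second;

    memcpy(copy, orig, end);    /* "relocate" the whole structure */
    free(orig);                 /* the original addresses are gone */
    n = demo_chain_length(copy, first);
    free(copy);
    return n;
}
```

With pointer links the walk over the copy would follow dangling addresses; with offsets only the base changes.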
diff --git a/src/include/tsearch/dicts/spell.h b/src/include/tsearch/dicts/spell.h
index 210f97dda9..b40cf379eb 100644
--- a/src/include/tsearch/dicts/spell.h
+++ b/src/include/tsearch/dicts/spell.h
@@ -18,21 +18,23 @@
#include "tsearch/dicts/regis.h"
#include "tsearch/ts_public.h"
+#define ISPELL_INVALID_INDEX (0x7FFFF)
+#define ISPELL_INVALID_OFFSET (0xFFFFFFFF)
+
/*
* SPNode and SPNodeData are used to represent prefix tree (Trie) to store
* a words list.
*/
-struct SPNode;
-
typedef struct
{
uint32 val:8,
isword:1,
/* Stores compound flags listed below */
compoundflag:4,
- /* Reference to an entry of the AffixData field */
+ /* Index of an entry of the AffixData field */
affix:19;
- struct SPNode *node;
+ /* Offset to a node of the DictNodes field */
+ uint32 node_offset;
} SPNodeData;
/*
@@ -86,21 +88,55 @@ typedef struct spell_struct
*/
typedef struct aff_struct
{
- char *flag;
/* FF_SUFFIX or FF_PREFIX */
- uint32 type:1,
+ uint16 type:1,
flagflags:7,
issimple:1,
isregis:1,
- replen:14;
- char *find;
- char *repl;
+ flaglen:2;
+
+ /* 8 bits could be too much for repl, find and mask, but who knows */
+ uint8 replen;
+ uint8 findlen;
+ uint8 masklen;
+
+ /*
+ * fields stores the following data (each ends with \0):
+ * - repl
+ * - find
+ * - mask
+ * - flag - one character (if FM_CHAR),
+ * two characters (if FM_LONG),
+ * number, >= 0 and < 65536 (if FM_NUM).
+ */
+ char fields[FLEXIBLE_ARRAY_MEMBER];
+} AFFIX;
+
+#define AF_FLAG_MAXSIZE 5 /* strlen(65536) */
+#define AF_REPL_MAXSIZE 255 /* fits in 8 bits */
+
+#define AFFIXHDRSZ (offsetof(AFFIX, fields))
+
+#define AffixFieldRepl(af) ((af)->fields)
+#define AffixFieldFind(af) ((af)->fields + (af)->replen + 1)
+#define AffixFieldMask(af) (AffixFieldFind(af) + (af)->findlen + 1)
+#define AffixFieldFlag(af) (AffixFieldMask(af) + (af)->masklen + 1)
+#define AffixGetSize(af) (AFFIXHDRSZ + (af)->replen + 1 + (af)->findlen + 1 \
+ + (af)->masklen + 1 + strlen(AffixFieldFlag(af)) + 1)
+
+/*
+ * Stores compiled regular expression of affix. AffixReg uses mask field of
+ * AFFIX as a regular expression.
+ */
+typedef struct AffixReg
+{
+ bool iscompiled;
union
{
regex_t regex;
Regis regis;
- } reg;
-} AFFIX;
+ } r;
+} AffixReg;
/*
* affixes use dictionary flags too
@@ -120,14 +156,13 @@ typedef struct aff_struct
* AffixNode and AffixNodeData are used to represent prefix tree (Trie) to store
* an affix list.
*/
-struct AffixNode;
-
typedef struct
{
- uint32 val:8,
- naff:24;
- AFFIX **aff;
- struct AffixNode *node;
+ uint8 val;
+ uint32 affstart;
+ uint32 affend;
+ /* Offset to a node of the PrefixNodes or SuffixNodes field */
+ uint32 node_offset;
} AffixNodeData;
typedef struct AffixNode
@@ -139,9 +174,19 @@ typedef struct AffixNode
#define ANHRDSZ (offsetof(AffixNode, data))
+typedef struct NodeArray
+{
+ char *Nodes;
+ uint32 NodesSize; /* allocated size of Nodes */
+ uint32 NodesEnd; /* end of data in Nodes */
+} NodeArray;
+
+#define NodeArrayGet(na, of) (((of) == ISPELL_INVALID_OFFSET) ? NULL : (na)->Nodes + (of))
+
typedef struct
{
- char *affix;
+ /* Index of an affix of the Affix field */
+ uint32 affix;
int len;
bool issuffix;
} CMPDAffix;
@@ -176,30 +221,71 @@ typedef struct CompoundAffixFlag
#define FLAGNUM_MAXSIZE (1 << 16)
-typedef struct
+typedef struct IspellDictData
{
- int maffixes;
- int naffixes;
- AFFIX *Affix;
-
- AffixNode *Suffix;
- AffixNode *Prefix;
+ FlagMode flagMode;
+ bool usecompound;
- SPNode *Dictionary;
- /* Array of sets of affixes */
- char **AffixData;
- int lenAffixData;
- int nAffixData;
bool useFlagAliases;
- CMPDAffix *CompoundAffix;
+ uint32 nAffixData;
+ uint32 AffixDataStart;
- bool usecompound;
- FlagMode flagMode;
+ uint32 AffixOffsetStart;
+ uint32 AffixStart;
+ uint32 nAffix;
+
+ uint32 DictNodesStart;
+ uint32 PrefixNodesStart;
+ uint32 SuffixNodesStart;
+
+ uint32 CompoundAffixStart;
/*
- * All follow fields are actually needed only for initialization
+ * data stores:
+ * - AffixData - array of affix sets
+ * - Affix - sorted array of affixes
+ * - DictNodes - prefix tree of a word list
+ * - PrefixNodes - prefix tree of a prefix list
+ * - SuffixNodes - prefix tree of a suffix list
+ * - CompoundAffix - array of compound affixes
*/
+ char data[FLEXIBLE_ARRAY_MEMBER];
+} IspellDictData;
+
+#define IspellDictDataHdrSize (offsetof(IspellDictData, data))
+
+#define DictAffixDataOffset(d) ((d)->data)
+#define DictAffixData(d) ((d)->data + (d)->AffixDataStart)
+#define DictAffixDataGet(d, i) (((i) == ISPELL_INVALID_INDEX) ? NULL : \
+ DictAffixData(d) + ((uint32 *) DictAffixDataOffset(d))[i])
+
+#define DictAffixOffset(d) ((d)->data + (d)->AffixOffsetStart)
+#define DictAffix(d) ((d)->data + (d)->AffixStart)
+#define DictAffixGet(d, i) (((i) == ISPELL_INVALID_INDEX) ? NULL : \
+ DictAffix(d) + ((uint32 *) DictAffixOffset(d))[i])
+
+#define DictDictNodes(d) ((d)->data + (d)->DictNodesStart)
+#define DictPrefixNodes(d) ((d)->data + (d)->PrefixNodesStart)
+#define DictSuffixNodes(d) ((d)->data + (d)->SuffixNodesStart)
+#define DictNodeGet(node_start, of) (((of) == ISPELL_INVALID_OFFSET) ? NULL : \
+ (char *) (node_start) + (of))
+
+#define DictCompoundAffix(d) ((d)->data + (d)->CompoundAffixStart)
+
+/*
+ * IspellDictBuild is used to initialize the IspellDictData struct. This is a
+ * temporary structure which is set up by NIStartBuild() and released by
+ * NIFinishBuild().
+ */
+typedef struct IspellDictBuild
+{
+ MemoryContext buildCxt; /* temp context for construction */
+
+ IspellDictData *dict;
+ uint32 dict_size;
+
+ /* Temporary data */
/* Array of Hunspell options in affix file */
CompoundAffixFlag *CompoundAffixFlags;
@@ -208,29 +294,73 @@ typedef struct
/* allocated length of CompoundAffixFlags array */
int mCompoundAffixFlag;
- /*
- * Remaining fields are only used during dictionary construction; they are
- * set up by NIStartBuild and cleared by NIFinishBuild.
- */
- MemoryContext buildCxt; /* temp context for construction */
-
- /* Temporary array of all words in the dict file */
+ /* Array of all words in the dict file */
SPELL **Spell;
- int nspell; /* number of valid entries in Spell array */
- int mspell; /* allocated length of Spell array */
+ int nSpell; /* number of valid entries in Spell array */
+ int mSpell; /* allocated length of Spell array */
+
+ /* Array of all affixes in the aff file */
+ AFFIX **Affix;
+ int nAffix; /* number of valid entries in Affix array */
+ int mAffix; /* allocated length of Affix array */
+ uint32 AffixSize;
+
+ /* Data for IspellDictData */
+
+ /* Array of sets of affixes */
+ uint32 *AffixDataOffset;
+ int nAffixData; /* number of affix sets */
+ int mAffixData; /* allocated number of affix sets */
+ char *AffixData;
+ uint32 AffixDataSize; /* allocated size of AffixData */
+ uint32 AffixDataEnd; /* end of data in AffixData */
+
+ /* Prefix tree which stores a word list */
+ NodeArray DictNodes;
+
+ /* Prefix tree which stores a prefix list */
+ NodeArray PrefixNodes;
+
+ /* Prefix tree which stores a suffix list */
+ NodeArray SuffixNodes;
- /* These are used to allocate "compact" data without palloc overhead */
- char *firstfree; /* first free address (always maxaligned) */
- size_t avail; /* free space remaining at firstfree */
+ /* Array of compound affixes */
+ CMPDAffix *CompoundAffix;
+ int nCompoundAffix; /* number of entries of CompoundAffix */
+} IspellDictBuild;
+
+#define AffixDataGet(d, i) ((d)->AffixData + (d)->AffixDataOffset[i])
+
+/*
+ * IspellDict is used within NINormalizeWord.
+ */
+typedef struct IspellDict
+{
+ /*
+ * Pointer to the DSM location of IspellDictData. Should be retrieved on
+ * every dispell_lexize() call.
+ */
+ IspellDictData *dict;
+ /*
+ * Array of regular expressions for affixes. Each regular expression is
+ * compiled only on demand.
+ */
+ AffixReg *reg;
+ /*
+ * Memory context for compiling regular expressions.
+ */
+ MemoryContext dictCtx;
} IspellDict;
extern TSLexeme *NINormalizeWord(IspellDict *Conf, char *word);
-extern void NIStartBuild(IspellDict *Conf);
-extern void NIImportAffixes(IspellDict *Conf, const char *filename);
-extern void NIImportDictionary(IspellDict *Conf, const char *filename);
-extern void NISortDictionary(IspellDict *Conf);
-extern void NISortAffixes(IspellDict *Conf);
-extern void NIFinishBuild(IspellDict *Conf);
+extern void NIStartBuild(IspellDictBuild *ConfBuild);
+extern void NIImportAffixes(IspellDictBuild *ConfBuild, const char *filename);
+extern void NIImportDictionary(IspellDictBuild *ConfBuild,
+ const char *filename);
+extern void NISortDictionary(IspellDictBuild *ConfBuild);
+extern void NISortAffixes(IspellDictBuild *ConfBuild);
+extern void NICopyData(IspellDictBuild *ConfBuild);
+extern void NIFinishBuild(IspellDictBuild *ConfBuild);
#endif
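The new AFFIX layout in spell.h packs repl, find, mask and flag as consecutive NUL-terminated strings in one flexible array, located through the stored lengths. A minimal sketch of that layout, assuming a reduced struct (DemoAffix, the Demo* macros and demo_affix_make() are hypothetical names, not the patch's code, and the bit-fields are omitted):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, reduced version of the packed AFFIX struct. */
typedef struct
{
    uint8_t replen;
    uint8_t findlen;
    uint8_t masklen;
    char    fields[];       /* repl\0 find\0 mask\0 flag\0 */
} DemoAffix;

/* Accessors mirroring the AffixField* macros: skip the earlier strings. */
#define DemoRepl(af) ((af)->fields)
#define DemoFind(af) ((af)->fields + (af)->replen + 1)
#define DemoMask(af) (DemoFind(af) + (af)->findlen + 1)
#define DemoFlag(af) (DemoMask(af) + (af)->masklen + 1)

/*
 * Allocate one contiguous chunk holding the header and all four strings.
 * Returns NULL if a length does not fit into uint8_t or malloc fails.
 */
static DemoAffix *
demo_affix_make(const char *repl, const char *find,
                const char *mask, const char *flag)
{
    size_t     rl, fl, ml, gl;
    DemoAffix *af;

    assert(repl && find && mask && flag);
    rl = strlen(repl);
    fl = strlen(find);
    ml = strlen(mask);
    gl = strlen(flag);
    if (rl > 255 || fl > 255 || ml > 255)
        return NULL;

    af = malloc(offsetof(DemoAffix, fields) +
                rl + 1 + fl + 1 + ml + 1 + gl + 1);
    if (af == NULL)
        return NULL;

    /* Lengths must be set first: the accessors depend on them. */
    af->replen = (uint8_t) rl;
    af->findlen = (uint8_t) fl;
    af->masklen = (uint8_t) ml;

    memcpy(DemoRepl(af), repl, rl + 1);
    memcpy(DemoFind(af), find, fl + 1);
    memcpy(DemoMask(af), mask, ml + 1);
    memcpy(DemoFlag(af), flag, gl + 1);
    return af;
}
```

Because the struct holds no pointers, an entry built this way can be copied byte for byte into another memory area and read back with the same accessors.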
0001-Fix-ispell-memory-handling-v14.patch (text/plain; charset=us-ascii)
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index 6f5b635413..09297e384c 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -78,6 +78,8 @@
#define tmpalloc(sz) MemoryContextAlloc(Conf->buildCxt, (sz))
#define tmpalloc0(sz) MemoryContextAllocZero(Conf->buildCxt, (sz))
+#define tmpstrdup(str) MemoryContextStrdup(Conf->buildCxt, (str))
+
/*
* Prepare for constructing an ISpell dictionary.
*
@@ -498,7 +500,7 @@ NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
strcpy(Conf->Spell[Conf->nspell]->word, word);
Conf->Spell[Conf->nspell]->p.flag = (*flag != '\0')
- ? cpstrdup(Conf, flag) : VoidString;
+ ? tmpstrdup(flag) : VoidString;
Conf->nspell++;
}
@@ -1040,7 +1042,7 @@ setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
entry->flag.i = i;
}
else
- entry->flag.s = cpstrdup(Conf, s);
+ entry->flag.s = tmpstrdup(s);
entry->flagMode = Conf->flagMode;
entry->value = val;
@@ -1541,6 +1543,9 @@ nextline:
return;
isnewformat:
+ pfree(recoded);
+ pfree(pstr);
+
if (oldformat)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
0002-Change-tmplinit-argument-v14.patch (text/plain; charset=us-ascii)
diff --git a/contrib/dict_int/dict_int.c b/contrib/dict_int/dict_int.c
index 56ede37089..8dd4959028 100644
--- a/contrib/dict_int/dict_int.c
+++ b/contrib/dict_int/dict_int.c
@@ -30,7 +30,7 @@ PG_FUNCTION_INFO_V1(dintdict_lexize);
Datum
dintdict_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictInt *d;
ListCell *l;
@@ -38,7 +38,7 @@ dintdict_init(PG_FUNCTION_ARGS)
d->maxlen = 6;
d->rejectlong = false;
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/contrib/dict_xsyn/dict_xsyn.c b/contrib/dict_xsyn/dict_xsyn.c
index a79ece240c..0b8a32d459 100644
--- a/contrib/dict_xsyn/dict_xsyn.c
+++ b/contrib/dict_xsyn/dict_xsyn.c
@@ -140,7 +140,7 @@ read_dictionary(DictSyn *d, const char *filename)
Datum
dxsyn_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSyn *d;
ListCell *l;
char *filename = NULL;
@@ -153,7 +153,7 @@ dxsyn_init(PG_FUNCTION_ARGS)
d->matchsynonyms = false;
d->keepsynonyms = true;
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/contrib/unaccent/unaccent.c b/contrib/unaccent/unaccent.c
index 247c202755..2a2fbee5fa 100644
--- a/contrib/unaccent/unaccent.c
+++ b/contrib/unaccent/unaccent.c
@@ -267,12 +267,12 @@ PG_FUNCTION_INFO_V1(unaccent_init);
Datum
unaccent_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
TrieChar *rootTrie = NULL;
bool fileloaded = false;
ListCell *l;
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/commands/tsearchcmds.c b/src/backend/commands/tsearchcmds.c
index 3a843512d1..3753e32b2c 100644
--- a/src/backend/commands/tsearchcmds.c
+++ b/src/backend/commands/tsearchcmds.c
@@ -386,17 +386,25 @@ verify_dictoptions(Oid tmplId, List *dictoptions)
}
else
{
+ DictInitData init_data;
+
/*
* Copy the options just in case init method thinks it can scribble on
* them ...
*/
dictoptions = copyObject(dictoptions);
+ init_data.dict_options = dictoptions;
+ init_data.dict.id = InvalidOid;
+ init_data.dict.xmin = InvalidTransactionId;
+ init_data.dict.xmax = InvalidTransactionId;
+ ItemPointerSetInvalid(&init_data.dict.tid);
+
/*
* Call the init method and see if it complains. We don't worry about
* it leaking memory, since our command will soon be over anyway.
*/
- (void) OidFunctionCall1(initmethod, PointerGetDatum(dictoptions));
+ (void) OidFunctionCall1(initmethod, PointerGetDatum(&init_data));
}
ReleaseSysCache(tup);
diff --git a/src/backend/snowball/dict_snowball.c b/src/backend/snowball/dict_snowball.c
index 78c9f73ef0..15ebafd833 100644
--- a/src/backend/snowball/dict_snowball.c
+++ b/src/backend/snowball/dict_snowball.c
@@ -181,14 +181,14 @@ locate_stem_module(DictSnowball *d, const char *lang)
Datum
dsnowball_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSnowball *d;
bool stoploaded = false;
ListCell *l;
d = (DictSnowball *) palloc0(sizeof(DictSnowball));
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index edc6547700..39f1e6faeb 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -29,7 +29,7 @@ typedef struct
Datum
dispell_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictISpell *d;
bool affloaded = false,
dictloaded = false,
@@ -40,7 +40,7 @@ dispell_init(PG_FUNCTION_ARGS)
NIStartBuild(&(d->obj));
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_simple.c b/src/backend/tsearch/dict_simple.c
index ac6a24eba5..9605108334 100644
--- a/src/backend/tsearch/dict_simple.c
+++ b/src/backend/tsearch/dict_simple.c
@@ -29,7 +29,7 @@ typedef struct
Datum
dsimple_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSimple *d = (DictSimple *) palloc0(sizeof(DictSimple));
bool stoploaded = false,
acceptloaded = false;
@@ -37,7 +37,7 @@ dsimple_init(PG_FUNCTION_ARGS)
d->accept = true; /* default */
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_synonym.c b/src/backend/tsearch/dict_synonym.c
index c011886cb0..02989cd16b 100644
--- a/src/backend/tsearch/dict_synonym.c
+++ b/src/backend/tsearch/dict_synonym.c
@@ -91,7 +91,7 @@ compareSyn(const void *a, const void *b)
Datum
dsynonym_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSyn *d;
ListCell *l;
char *filename = NULL;
@@ -104,7 +104,7 @@ dsynonym_init(PG_FUNCTION_ARGS)
char *line = NULL;
uint16 flags = 0;
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_thesaurus.c b/src/backend/tsearch/dict_thesaurus.c
index 24364e646d..1604b5f60f 100644
--- a/src/backend/tsearch/dict_thesaurus.c
+++ b/src/backend/tsearch/dict_thesaurus.c
@@ -604,7 +604,7 @@ compileTheSubstitute(DictThesaurus *d)
Datum
thesaurus_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictThesaurus *d;
char *subdictname = NULL;
bool fileloaded = false;
@@ -612,7 +612,7 @@ thesaurus_init(PG_FUNCTION_ARGS)
d = (DictThesaurus *) palloc0(sizeof(DictThesaurus));
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/utils/cache/ts_cache.c b/src/backend/utils/cache/ts_cache.c
index f11cba4cce..780517723b 100644
--- a/src/backend/utils/cache/ts_cache.c
+++ b/src/backend/utils/cache/ts_cache.c
@@ -39,6 +39,7 @@
#include "catalog/pg_ts_template.h"
#include "commands/defrem.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_public.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
#include "utils/fmgroids.h"
@@ -75,6 +76,7 @@ static TSConfigCacheEntry *lastUsedConfig = NULL;
char *TSCurrentConfig = NULL;
static Oid TSCurrentConfigCache = InvalidOid;
+static bool has_invalid_dictionary = false;
/*
@@ -98,7 +100,12 @@ InvalidateTSCacheCallBack(Datum arg, int cacheid, uint32 hashvalue)
hash_seq_init(&status, hash);
while ((entry = (TSAnyCacheEntry *) hash_seq_search(&status)) != NULL)
+ {
+ if (cacheid == TSDICTOID)
+ has_invalid_dictionary = true;
+
entry->isvalid = false;
+ }
/* Also invalidate the current-config cache if it's pg_ts_config */
if (hash == TSConfigCacheHash)
@@ -312,11 +319,15 @@ lookup_ts_dictionary_cache(Oid dictId)
MemSet(entry, 0, sizeof(TSDictionaryCacheEntry));
entry->dictId = dictId;
entry->dictCtx = saveCtx;
+ entry->dict_xmin = HeapTupleHeaderGetRawXmin(tpdict->t_data);
+ entry->dict_xmax = HeapTupleHeaderGetRawXmax(tpdict->t_data);
+ entry->dict_tid = tpdict->t_self;
entry->lexizeOid = template->tmpllexize;
if (OidIsValid(template->tmplinit))
{
+ DictInitData init_data;
List *dictoptions;
Datum opt;
bool isnull;
@@ -336,9 +347,15 @@ lookup_ts_dictionary_cache(Oid dictId)
else
dictoptions = deserialize_deflist(opt);
+ init_data.dict_options = dictoptions;
+ init_data.dict.id = dictId;
+ init_data.dict.xmin = entry->dict_xmin;
+ init_data.dict.xmax = entry->dict_xmax;
+ init_data.dict.tid = entry->dict_tid;
+
entry->dictData =
DatumGetPointer(OidFunctionCall1(template->tmplinit,
- PointerGetDatum(dictoptions)));
+ PointerGetDatum(&init_data)));
MemoryContextSwitchTo(oldcontext);
}
diff --git a/src/include/tsearch/ts_cache.h b/src/include/tsearch/ts_cache.h
index 410f1d54af..45ed570864 100644
--- a/src/include/tsearch/ts_cache.h
+++ b/src/include/tsearch/ts_cache.h
@@ -54,6 +54,10 @@ typedef struct TSDictionaryCacheEntry
Oid dictId;
bool isvalid;
+ TransactionId dict_xmin; /* XMIN of the dictionary's tuple */
+ TransactionId dict_xmax; /* XMAX of the dictionary's tuple */
+ ItemPointerData dict_tid; /* TID of the dictionary's tuple */
+
/* most frequent fmgr call */
Oid lexizeOid;
FmgrInfo lexize;
diff --git a/src/include/tsearch/ts_public.h b/src/include/tsearch/ts_public.h
index 0b7a5aa68e..363226c936 100644
--- a/src/include/tsearch/ts_public.h
+++ b/src/include/tsearch/ts_public.h
@@ -13,6 +13,8 @@
#ifndef _PG_TS_PUBLIC_H_
#define _PG_TS_PUBLIC_H_
+#include "nodes/pg_list.h"
+#include "storage/itemptr.h"
#include "tsearch/ts_type.h"
/*
@@ -81,10 +83,70 @@ extern void readstoplist(const char *fname, StopList *s,
extern bool searchstoplist(StopList *s, char *key);
/*
- * Interface with dictionaries
+ * API for text search dictionaries.
+ *
+ * API functions to handle a text search dictionary are defined by a text search
+ * template. Currently an existing template cannot be altered to use different
+ * functions. The API consists of the following functions:
+ * - init function - optional function which initializes internal structures of
+ * the dictionary. It accepts a DictInitData structure as an argument and must
+ * return a custom palloc'd structure which stores the content of the processed
+ * dictionary and is used by the lexize function.
+ * - lexize function - normalizes a single word (token) using a specific
+ * dictionary. It must return a palloc'd array of TSLexeme whose last entry
+ * is the terminating entry. It accepts the following arguments:
+ * - dictData - pointer to the custom structure returned by the init function,
+ * or NULL if the template doesn't define an init function.
+ * - token - string which represents the token to normalize; it isn't
+ * null-terminated.
+ * - length - length of token.
+ * - dictState - pointer to a DictSubState structure which stores the current
+ * state of processing a set of tokens and allows phrases to be normalized.
+ */
+
+/*
+ * A preprocessed dictionary can be stored in shared memory using DSM. The
+ * init function decides whether the dictionary is stored there. A DSM segment
+ * is released if the dictionary is altered or dropped. There is still one case
+ * in which we cannot prevent a segment from leaking: if a backend used the
+ * dictionary before it was dropped, that backend holds the DSM segment until
+ * it disconnects or calls lookup_ts_dictionary_cache(), where the invalid
+ * segment is unpinned.
+ *
+ * DictPointerData is a structure used to look up a dictionary's DSM segment. We
+ * need xmin, xmax and tid to be sure that the content of the DSM segment is
+ * still valid.
+ */
+typedef struct
+{
+ Oid id; /* OID of dictionary which is processed */
+ TransactionId xmin; /* XMIN of the dictionary's tuple */
+ TransactionId xmax; /* XMAX of the dictionary's tuple */
+ ItemPointerData tid; /* TID of the dictionary's tuple */
+} DictPointerData;
+
+/*
+ * API structure for a dictionary initialization. It is passed as an argument
+ * to a template's init function.
*/
+typedef struct
+{
+ /*
+ * A dictionary option list for a template's init method. Should go first
+ * for backward compatibility.
+ */
+ List *dict_options;
+ /*
+ * Dictionary information used to allocate, look up and release its DSM
+ * segment.
+ */
+ DictPointerData dict;
+} DictInitData;
-/* return struct for any lexize function */
+/*
+ * Return struct for any lexize function. They are combined into an array, the
+ * last entry is the terminating entry.
+ */
typedef struct
{
/*----------
@@ -108,7 +170,8 @@ typedef struct
uint16 flags; /* See flag bits below */
- char *lexeme; /* C string */
+ char *lexeme; /* C string or NULL if it is a terminating
+ * entry */
} TSLexeme;
/* Flag bits that can appear in TSLexeme.flags */
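The ts_public.h comments above state that a lexize function returns an array of TSLexeme whose last entry is a terminating entry with a NULL lexeme. A minimal sketch of how a caller could walk such an array, assuming a reduced stand-in struct (DemoLexeme and demo_count_lexemes() are hypothetical and only mirror the fields visible in the hunk):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for TSLexeme. */
typedef struct
{
    uint16_t nvariant;
    uint16_t flags;
    char    *lexeme;        /* NULL marks the terminating entry */
} DemoLexeme;

/*
 * Count the entries that precede the terminating entry. A NULL array is
 * treated as "no lexemes"; that check is a defensive assumption of this
 * sketch, not something the hunk above specifies.
 */
static size_t
demo_count_lexemes(const DemoLexeme *res)
{
    size_t n = 0;

    if (res == NULL)
        return 0;
    while (res[n].lexeme != NULL)
        n++;
    return n;
}
```

The array carries no separate length, so every consumer has to stop at the first entry whose lexeme is NULL.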
0003-Retreive-shared-location-for-dict-v14.patch (text/plain; charset=us-ascii)
diff --git a/src/backend/commands/tsearchcmds.c b/src/backend/commands/tsearchcmds.c
index 3753e32b2c..ef6cabcc1e 100644
--- a/src/backend/commands/tsearchcmds.c
+++ b/src/backend/commands/tsearchcmds.c
@@ -39,6 +39,7 @@
#include "nodes/makefuncs.h"
#include "parser/parse_func.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_shared.h"
#include "tsearch/ts_utils.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
@@ -510,6 +511,7 @@ RemoveTSDictionaryById(Oid dictId)
{
Relation relation;
HeapTuple tup;
+ DictPointerData dict;
relation = heap_open(TSDictionaryRelationId, RowExclusiveLock);
@@ -521,6 +523,18 @@ RemoveTSDictionaryById(Oid dictId)
CatalogTupleDelete(relation, &tup->t_self);
+ /*
+ * We need to release the dictionary's DSM segment. The segment may still
+ * leak: if some backend used the dictionary before it was dropped, that
+ * backend will hold the DSM segment until it disconnects or calls
+ * lookup_ts_dictionary_cache().
+ */
+ dict.id = dictId;
+ dict.xmin = HeapTupleHeaderGetRawXmin(tup->t_data);
+ dict.xmax = HeapTupleHeaderGetRawXmax(tup->t_data);
+ dict.tid = tup->t_self;
+ ts_dict_shmem_release(&dict, true);
+
ReleaseSysCache(tup);
heap_close(relation, RowExclusiveLock);
@@ -544,6 +558,7 @@ AlterTSDictionary(AlterTSDictionaryStmt *stmt)
bool repl_null[Natts_pg_ts_dict];
bool repl_repl[Natts_pg_ts_dict];
ObjectAddress address;
+ DictPointerData dict;
dictId = get_ts_dict_oid(stmt->dictname, false);
@@ -630,6 +645,18 @@ AlterTSDictionary(AlterTSDictionaryStmt *stmt)
ObjectAddressSet(address, TSDictionaryRelationId, dictId);
+ /*
+ * We need to release the dictionary's DSM segment because it isn't valid
+ * anymore. The segment may still leak: if some backend used the dictionary
+ * before it was altered, that backend will hold the DSM segment until it
+ * disconnects or calls lookup_ts_dictionary_cache().
+ */
+ dict.id = dictId;
+ dict.xmin = HeapTupleHeaderGetRawXmin(tup->t_data);
+ dict.xmax = HeapTupleHeaderGetRawXmax(tup->t_data);
+ dict.tid = tup->t_self;
+ ts_dict_shmem_release(&dict, true);
+
/*
* NOTE: because we only support altering the options, not the template,
* there is no need to update dependencies. This might have to change if
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 0c86a581c0..c7dce8cac5 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -44,6 +44,7 @@
#include "storage/procsignal.h"
#include "storage/sinvaladt.h"
#include "storage/spin.h"
+#include "tsearch/ts_shared.h"
#include "utils/backend_random.h"
#include "utils/snapmgr.h"
@@ -150,6 +151,7 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
size = add_size(size, SyncScanShmemSize());
size = add_size(size, AsyncShmemSize());
size = add_size(size, BackendRandomShmemSize());
+ size = add_size(size, TsearchShmemSize());
#ifdef EXEC_BACKEND
size = add_size(size, ShmemBackendArraySize());
#endif
@@ -271,6 +273,11 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
AsyncShmemInit();
BackendRandomShmemInit();
+ /*
+ * Set up shared memory for tsearch
+ */
+ TsearchShmemInit();
+
#ifdef EXEC_BACKEND
/*
diff --git a/src/backend/tsearch/Makefile b/src/backend/tsearch/Makefile
index 227468ae9e..860cd196e9 100644
--- a/src/backend/tsearch/Makefile
+++ b/src/backend/tsearch/Makefile
@@ -26,7 +26,7 @@ DICTFILES_PATH=$(addprefix dicts/,$(DICTFILES))
OBJS = ts_locale.o ts_parse.o wparser.o wparser_def.o dict.o \
dict_simple.o dict_synonym.o dict_thesaurus.o \
dict_ispell.o regis.o spell.o \
- to_tsany.o ts_selfuncs.o ts_typanalyze.o ts_utils.o
+ to_tsany.o ts_selfuncs.o ts_shared.o ts_typanalyze.o ts_utils.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/tsearch/ts_shared.c b/src/backend/tsearch/ts_shared.c
new file mode 100644
index 0000000000..4b8933628c
--- /dev/null
+++ b/src/backend/tsearch/ts_shared.c
@@ -0,0 +1,377 @@
+/*-------------------------------------------------------------------------
+ *
+ * ts_shared.c
+ * tsearch shared memory management
+ *
+ * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/tsearch/ts_shared.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/hash.h"
+#include "lib/dshash.h"
+#include "miscadmin.h"
+#include "storage/lwlock.h"
+#include "storage/shmem.h"
+#include "tsearch/ts_shared.h"
+#include "utils/hashutils.h"
+#include "utils/memutils.h"
+
+
+/*
+ * Hash table entries key.
+ */
+typedef struct
+{
+ Oid db_id;
+ DictPointerData dict;
+} TsearchDictKey;
+
+/*
+ * Hash table entries representing shared dictionaries.
+ */
+typedef struct
+{
+ TsearchDictKey key;
+ dsm_handle dict_dsm;
+
+ /*
+ * We need a flag that the DSM segment is pinned/unpinned. Otherwise we can
+ * face double dsm_unpin_segment().
+ */
+ bool segment_ispinned;
+} TsearchDictEntry;
+
+static dshash_table *dict_table = NULL;
+
+/*
+ * Information about the main shmem segment, used to coordinate
+ * access to the hash table and dictionaries.
+ */
+typedef struct
+{
+ dsa_handle area;
+ dshash_table_handle dict_table_handle;
+
+ LWLock lock;
+} TsearchCtlData;
+
+static TsearchCtlData *tsearch_ctl;
+
+static int tsearch_dict_cmp(const void *a, const void *b, size_t size,
+ void *arg);
+static uint32 tsearch_dict_hash(const void *a, size_t size, void *arg);
+static void init_dict_table(void);
+
+/* Parameters for dict_table */
+static const dshash_parameters dict_table_params ={
+ sizeof(TsearchDictKey),
+ sizeof(TsearchDictEntry),
+ tsearch_dict_cmp,
+ tsearch_dict_hash,
+ LWTRANCHE_TSEARCH_TABLE
+};
+
+/*
+ * Build the dictionary using allocate_cb callback.
+ *
+ * First, try to find the dictionary in the shared hash table. If it was built
+ * by someone earlier, just return its location in DSM.
+ *
+ * init_data: an argument used within a template's init method.
+ * allocate_cb: function to build the dictionary, if it wasn't found in DSM.
+ *
+ * Returns address in the dynamic shared memory segment or in backend memory.
+ */
+void *
+ts_dict_shmem_location(DictInitData *init_data,
+ ts_dict_build_callback allocate_cb)
+{
+ TsearchDictKey key;
+ TsearchDictEntry *entry;
+ bool found;
+ dsm_segment *seg;
+ void *dict,
+ *dict_location;
+ Size dict_size;
+
+ init_dict_table();
+
+ /*
+ * Build the dictionary in backend's memory if dictid is invalid (it may
+ * happen if the dictionary's init method was called within
+ * verify_dictoptions()).
+ */
+ if (!OidIsValid(init_data->dict.id))
+ {
+ dict = allocate_cb(init_data->dict_options, &dict_size);
+
+ return dict;
+ }
+
+ /* Set up key for hashtable search */
+ key.db_id = MyDatabaseId;
+ key.dict = init_data->dict;
+
+ /* Try to find an entry in the hash table */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, &key, false);
+
+ if (entry)
+ {
+ seg = dsm_find_mapping(entry->dict_dsm);
+ if (!seg)
+ {
+ seg = dsm_attach(entry->dict_dsm);
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+ }
+
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+ }
+
+ /* The dictionary hasn't been loaded into memory yet */
+ entry = (TsearchDictEntry *) dshash_find_or_insert(dict_table, &key,
+ &found);
+
+ if (found)
+ {
+ /*
+ * Someone concurrently inserted a dictionary entry since the first time
+ * we checked.
+ */
+ seg = dsm_attach(entry->dict_dsm);
+
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+ }
+
+ /* Build the dictionary */
+ dict = allocate_cb(init_data->dict_options, &dict_size);
+
+ /* Finally, allocate a DSM segment for the compiled dictionary */
+ seg = dsm_create(dict_size, 0);
+ dict_location = dsm_segment_address(seg);
+ memcpy(dict_location, dict, dict_size);
+
+ pfree(dict);
+
+ entry->key = key;
+ entry->dict_dsm = dsm_segment_handle(seg);
+ entry->segment_ispinned = true;
+
+ /* Remain attached until end of postmaster */
+ dsm_pin_segment(seg);
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+}
+
+/*
+ * Release memory occupied by the dictionary. The function just unpins the DSM
+ * mapping. If nobody else has a mapping to this DSM segment, or the dictionary
+ * was dropped or altered, then unpin the DSM segment.
+ *
+ * dict: key to search the dictionary's DSM segment.
+ * unpin_segment: true if we need to unpin the segment because the dictionary
+ * was dropped or altered.
+ */
+void
+ts_dict_shmem_release(DictPointerData *dict, bool unpin_segment)
+{
+ TsearchDictKey key;
+ TsearchDictEntry *entry;
+
+ /*
+ * If we didn't attach to a hash table then do nothing.
+ */
+ if (!dict_table && !unpin_segment)
+ return;
+ /*
+ * But if we need to unpin the DSM segment to get rid of the segment when
+ * the last interested process disconnects we need the hash table to find
+ * the dictionary's entry.
+ */
+ else if (unpin_segment)
+ init_dict_table();
+
+ /* Set up key for hashtable search */
+ key.db_id = MyDatabaseId;
+ key.dict = *dict;
+
+ /* Try to find an entry in the hash table */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, &key, true);
+
+ if (entry)
+ {
+ dsm_segment *seg;
+
+ seg = dsm_find_mapping(entry->dict_dsm);
+
+ if (seg)
+ {
+ dsm_unpin_mapping(seg);
+ dsm_detach(seg);
+ }
+
+ if (unpin_segment && entry->segment_ispinned)
+ {
+ dsm_unpin_segment(entry->dict_dsm);
+ entry->segment_ispinned = false;
+ }
+
+ dshash_release_lock(dict_table, entry);
+ }
+}
+
+/*
+ * Allocate and initialize tsearch-related shared memory.
+ */
+void
+TsearchShmemInit(void)
+{
+ bool found;
+
+ tsearch_ctl = (TsearchCtlData *)
+ ShmemInitStruct("Full Text Search Ctl", sizeof(TsearchCtlData), &found);
+
+ if (!found)
+ {
+ LWLockRegisterTranche(LWTRANCHE_TSEARCH_DSA, "tsearch_dsa");
+ LWLockRegisterTranche(LWTRANCHE_TSEARCH_TABLE, "tsearch_table");
+
+ LWLockInitialize(&tsearch_ctl->lock, LWTRANCHE_TSEARCH_DSA);
+
+ tsearch_ctl->area = DSM_HANDLE_INVALID;
+ tsearch_ctl->dict_table_handle = InvalidDsaPointer;
+ }
+}
+
+/*
+ * Report shared memory space needed by TsearchShmemInit.
+ */
+Size
+TsearchShmemSize(void)
+{
+ Size size = 0;
+
+ /* size of service structure */
+ size = add_size(size, MAXALIGN(sizeof(TsearchCtlData)));
+
+ return size;
+}
+
+/*
+ * A comparator function for TsearchDictKey.
+ *
+ * Returns 0 if keys are equal, 1 otherwise.
+ */
+static int
+tsearch_dict_cmp(const void *a, const void *b, size_t size, void *arg)
+{
+ TsearchDictKey *k1 = (TsearchDictKey *) a;
+ TsearchDictKey *k2 = (TsearchDictKey *) b;
+
+ if (k1->db_id == k2->db_id && k1->dict.id == k2->dict.id &&
+ k1->dict.xmin == k2->dict.xmin && k1->dict.xmax == k2->dict.xmax &&
+ ItemPointerEquals(&k1->dict.tid, &k2->dict.tid))
+ return 0;
+ else
+ return 1;
+}
+
+/*
+ * A hash function for TsearchDictKey.
+ */
+static uint32
+tsearch_dict_hash(const void *a, size_t size, void *arg)
+{
+ TsearchDictKey *k = (TsearchDictKey *) a;
+ uint32 s;
+
+ s = hash_combine(0, hash_uint32(k->db_id));
+ s = hash_combine(s, hash_uint32(k->dict.id));
+ s = hash_combine(s, hash_uint32(k->dict.xmin));
+ s = hash_combine(s, hash_uint32(k->dict.xmax));
+ s = hash_combine(s,
+ hash_uint32(BlockIdGetBlockNumber(&k->dict.tid.ip_blkid)));
+ s = hash_combine(s, hash_uint32(k->dict.tid.ip_posid));
+
+ return s;
+}
+
+/*
+ * Initialize hash table located in DSM.
+ *
+ * The hash table should be created and initialized if it doesn't exist yet.
+ */
+static void
+init_dict_table(void)
+{
+ MemoryContext old_context;
+ dsa_area *dsa;
+
+ /* Exit if hash table was initialized already */
+ if (dict_table)
+ return;
+
+ old_context = MemoryContextSwitchTo(TopMemoryContext);
+
+recheck_table:
+ LWLockAcquire(&tsearch_ctl->lock, LW_SHARED);
+
+ /* The hash table has already been created by someone */
+ if (DsaPointerIsValid(tsearch_ctl->dict_table_handle))
+ {
+ Assert(tsearch_ctl->area != DSM_HANDLE_INVALID);
+
+ dsa = dsa_attach(tsearch_ctl->area);
+
+ dict_table = dshash_attach(dsa,
+ &dict_table_params,
+ tsearch_ctl->dict_table_handle,
+ NULL);
+ }
+ else
+ {
+ /* Try to get exclusive lock */
+ LWLockRelease(&tsearch_ctl->lock);
+ if (!LWLockAcquireOrWait(&tsearch_ctl->lock, LW_EXCLUSIVE))
+ {
+ /*
+ * The lock was released by another backend, which has already
+ * created the hash table concurrently.
+ */
+ goto recheck_table;
+ }
+
+ dsa = dsa_create(LWTRANCHE_TSEARCH_DSA);
+ tsearch_ctl->area = dsa_get_handle(dsa);
+
+ dict_table = dshash_create(dsa, &dict_table_params, NULL);
+ tsearch_ctl->dict_table_handle = dshash_get_hash_table_handle(dict_table);
+
+ /* Remain attached until end of postmaster */
+ dsa_pin(dsa);
+ }
+
+ LWLockRelease(&tsearch_ctl->lock);
+
+ /* Remain attached until end of session */
+ dsa_pin_mapping(dsa);
+
+ MemoryContextSwitchTo(old_context);
+}
diff --git a/src/backend/utils/cache/ts_cache.c b/src/backend/utils/cache/ts_cache.c
index 780517723b..1401920412 100644
--- a/src/backend/utils/cache/ts_cache.c
+++ b/src/backend/utils/cache/ts_cache.c
@@ -40,6 +40,7 @@
#include "commands/defrem.h"
#include "tsearch/ts_cache.h"
#include "tsearch/ts_public.h"
+#include "tsearch/ts_shared.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
#include "utils/fmgroids.h"
@@ -88,6 +89,10 @@ static bool has_invalid_dictionary = false;
* doesn't seem worth the trouble to determine that; we just flush all the
* entries of the related hash table.
*
+ * We set has_invalid_dictionary to true so that all used segments are unpinned
+ * later, on the next use of a text search function. It isn't safe to call
+ * ts_dict_shmem_release() here since it may call kernel functions.
+ *
* We can use the same function for all TS caches by passing the hash
* table address as the "arg".
*/
@@ -112,6 +117,33 @@ InvalidateTSCacheCallBack(Datum arg, int cacheid, uint32 hashvalue)
TSCurrentConfigCache = InvalidOid;
}
+/*
+ * Unpin shared segments of all dictionary entries.
+ */
+static void
+do_ts_dict_shmem_release(void)
+{
+ HASH_SEQ_STATUS status;
+ TSDictionaryCacheEntry *entry;
+
+ if (!has_invalid_dictionary)
+ return;
+
+ hash_seq_init(&status, TSDictionaryCacheHash);
+ while ((entry = (TSDictionaryCacheEntry *) hash_seq_search(&status)) != NULL)
+ {
+ DictPointerData dict_ptr;
+
+ dict_ptr.id = entry->dictId;
+ dict_ptr.xmin = entry->dict_xmin;
+ dict_ptr.xmax = entry->dict_xmax;
+ dict_ptr.tid = entry->dict_tid;
+ ts_dict_shmem_release(&dict_ptr, false);
+ }
+
+ has_invalid_dictionary = false;
+}
+
/*
* Fetch parser cache entry
*/
@@ -260,6 +292,13 @@ lookup_ts_dictionary_cache(Oid dictId)
Form_pg_ts_template template;
MemoryContext saveCtx;
+ /*
+ * It is possible that some invalid entries hold a DSM mapping and we
+ * need to unpin it to avoid leaking memory. We will unpin segments of
+ * all other invalid dictionaries.
+ */
+ do_ts_dict_shmem_release();
+
tpdict = SearchSysCache1(TSDICTOID, ObjectIdGetDatum(dictId));
if (!HeapTupleIsValid(tpdict))
elog(ERROR, "cache lookup failed for text search dictionary %u",
diff --git a/src/include/storage/lwlock.h b/src/include/storage/lwlock.h
index c21bfe2f66..16b0858eda 100644
--- a/src/include/storage/lwlock.h
+++ b/src/include/storage/lwlock.h
@@ -219,6 +219,8 @@ typedef enum BuiltinTrancheIds
LWTRANCHE_SHARED_TUPLESTORE,
LWTRANCHE_TBM,
LWTRANCHE_PARALLEL_APPEND,
+ LWTRANCHE_TSEARCH_DSA,
+ LWTRANCHE_TSEARCH_TABLE,
LWTRANCHE_FIRST_USER_DEFINED
} BuiltinTrancheIds;
diff --git a/src/include/tsearch/ts_shared.h b/src/include/tsearch/ts_shared.h
new file mode 100644
index 0000000000..b6d00bdc9e
--- /dev/null
+++ b/src/include/tsearch/ts_shared.h
@@ -0,0 +1,26 @@
+/*-------------------------------------------------------------------------
+ *
+ * ts_shared.h
+ * tsearch shared memory management
+ *
+ * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
+ *
+ * src/include/tsearch/ts_shared.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef TS_SHARED_H
+#define TS_SHARED_H
+
+#include "tsearch/ts_public.h"
+
+typedef void *(*ts_dict_build_callback) (List *dictoptions, Size *size);
+
+extern void *ts_dict_shmem_location(DictInitData *init_data,
+ ts_dict_build_callback allocate_cb);
+extern void ts_dict_shmem_release(DictPointerData *dict, bool unpin_segment);
+
+extern void TsearchShmemInit(void);
+extern Size TsearchShmemSize(void);
+
+#endif /* TS_SHARED_H */
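The key comparison and hashing in ts_shared.c above follow the dshash convention that the compare callback returns 0 for equal keys. The following is only a standalone sketch of that idea, not code from the patch: DemoKey, demo_key_cmp, demo_hash_uint32, demo_hash_combine and demo_key_hash are hypothetical names, the struct is a simplified stand-in for TsearchDictKey, and the per-field mixer is a placeholder rather than PostgreSQL's hash_uint32().

```c
#include <stdint.h>

/* Hypothetical stand-in for TsearchDictKey: database OID plus tuple identity. */
typedef struct
{
	uint32_t	db_id;
	uint32_t	dict_id;
	uint32_t	xmin;
	uint32_t	xmax;
	uint32_t	block;
	uint16_t	offset;
} DemoKey;

/* memcmp-style contract: 0 means "equal", non-zero means "different". */
int
demo_key_cmp(const DemoKey *a, const DemoKey *b)
{
	if (a->db_id == b->db_id && a->dict_id == b->dict_id &&
		a->xmin == b->xmin && a->xmax == b->xmax &&
		a->block == b->block && a->offset == b->offset)
		return 0;
	return 1;
}

/* Placeholder per-field mixer (an assumption, not PostgreSQL's hash_uint32). */
uint32_t
demo_hash_uint32(uint32_t v)
{
	v ^= v >> 16;
	v *= 0x85ebca6bu;
	v ^= v >> 13;
	v *= 0xc2b2ae35u;
	v ^= v >> 16;
	return v;
}

/* Combine two hash values into one, in the style of hash_combine(). */
uint32_t
demo_hash_combine(uint32_t a, uint32_t b)
{
	a ^= b + 0x9e3779b9u + (a << 6) + (a >> 2);
	return a;
}

/* Hash every field that the comparator looks at, and nothing else. */
uint32_t
demo_key_hash(const DemoKey *k)
{
	uint32_t	s = 0;

	s = demo_hash_combine(s, demo_hash_uint32(k->db_id));
	s = demo_hash_combine(s, demo_hash_uint32(k->dict_id));
	s = demo_hash_combine(s, demo_hash_uint32(k->xmin));
	s = demo_hash_combine(s, demo_hash_uint32(k->xmax));
	s = demo_hash_combine(s, demo_hash_uint32(k->block));
	s = demo_hash_combine(s, demo_hash_uint32(k->offset));
	return s;
}
```

Comparing and hashing field by field, rather than running memcmp() over the whole struct, keeps padding bytes (here after the 16-bit offset) out of the result.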
On Wed, May 16, 2018 at 7:36 AM, Arthur Zakirov
<a.zakirov@postgrespro.ru> wrote:
I don't quite understand the problem you're trying to solve here, but:
1. Unless dsm_pin_segment() is called, a DSM segment will
automatically be removed when there are no remaining mappings.
2. Unless dsm_pin_mapping() is called, a DSM segment will be unmapped
when the currently-in-scope resource owner is cleaned up, like at the
end of the query. If it is called, then the mapping will stick around
until the backend exits.
I tried to solve the case when a DSM segment remains mapped even after
the dictionary was dropped. It may happen in the following situation:
Backend 1:
=# select ts_lexize('english_shared', 'test');
-- The dictionary is loaded into DSM, the segment and the mapping are
pinned
...
-- Call ts_lexize() from backend 2 below
=# drop text search dictionary english_shared;
-- The segment and the mapping are unpinned, see ts_dict_shmem_release()
Backend 2:
=# select ts_lexize('english_shared', 'test');
-- The dictionary is fetched from DSM, the mapping is pinned
...
-- The dictionary was dropped by backend 1, but the mapping is still
pinned
Yeah, there's really nothing we can do about that (except switch from
processes to threads). There's no way for one process to force
another process to unmap something. As you've observed, you can get
it to be dropped eventually, but not immediately.
In sum, I think the problem is mostly solved. Backend 2 unpins the
segment in the next ts_lexize() call. But if backend 2 doesn't call
ts_lexize() (or another TS function) anymore, the segment will remain mapped.
It is the only problem I see for now.
Maybe you could use CacheRegisterSyscacheCallback to get a callback
when the backend notices that a DROP has occurred.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
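The lifetime rules discussed in this message can be illustrated with a toy model. This is a hedged sketch only: DemoSegment and the demo_* functions are hypothetical and do not reproduce dsm.c; they assume a segment disappears once it is neither pinned nor mapped by any backend.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one DSM segment's lifetime; not the real dsm.c. */
typedef struct
{
	bool		segment_pinned;	/* set by a dsm_pin_segment()-like call */
	int			nmappings;		/* number of backends currently attached */
	bool		destroyed;		/* segment memory has been given back */
} DemoSegment;

/* Destroy the segment once it is neither pinned nor mapped anywhere. */
void
demo_maybe_destroy(DemoSegment *seg)
{
	if (!seg->segment_pinned && seg->nmappings == 0)
		seg->destroyed = true;
}

void
demo_attach(DemoSegment *seg)
{
	assert(!seg->destroyed);
	seg->nmappings++;
}

void
demo_detach(DemoSegment *seg)
{
	assert(seg->nmappings > 0);
	seg->nmappings--;
	demo_maybe_destroy(seg);
}

void
demo_unpin_segment(DemoSegment *seg)
{
	seg->segment_pinned = false;
	demo_maybe_destroy(seg);
}

/*
 * Replay the two-backend scenario. Returns true if the segment survives the
 * DROP in backend 1 and goes away only when backend 2 detaches.
 */
bool
demo_drop_scenario(void)
{
	DemoSegment seg = {true, 0, false};
	bool		alive_after_drop;

	demo_attach(&seg);			/* backend 1 loads the dictionary */
	demo_attach(&seg);			/* backend 2 attaches to it */

	demo_detach(&seg);			/* backend 1: DROP, release own mapping */
	demo_unpin_segment(&seg);	/* backend 1: DROP, unpin the segment */
	alive_after_drop = !seg.destroyed;

	demo_detach(&seg);			/* backend 2 detaches at some later time */

	return alive_after_drop && seg.destroyed;
}
```

In this model backend 1 cannot make the segment go away on its own; it has to wait for backend 2 to detach, which is the limitation described above.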
On Wed, May 16, 2018 at 09:33:46AM -0400, Robert Haas wrote:
In sum, I think the problem is mostly solved. Backend 2 unpins the
segment in the next ts_lexize() call. But if backend 2 doesn't call
ts_lexize() (or another TS function) anymore, the segment will remain mapped.
It is the only problem I see for now.
Maybe you could use CacheRegisterSyscacheCallback to get a callback
when the backend notices that a DROP has occurred.
Yes, that was the first approach. In that approach DSM segments were
unpinned in InvalidateTSCacheCallBack(), which is registered using
CacheRegisterSyscacheCallback().
I don't have deep knowledge of the guts of invalidation callbacks. It seems
that there is a problem with it. Tom pointed out above:
I'm not sure that I understood the second case correctly. Can cache
invalidation help in this case? I'm not very familiar with cache
invalidation. It seems to me that InvalidateTSCacheCallBack() should
release the segment after commit.
"Release after commit" sounds like a pretty dangerous design to me,
because a release necessarily implies some kernel calls, which could
fail. We can't afford to inject steps that might fail into post-commit
cleanup (because it's too late to recover by failing the transaction).
It'd be better to do cleanup while searching for a dictionary to use.
But it is possible that I misunderstood his note.
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
On Wed, May 16, 2018 at 4:42 PM, Arthur Zakirov
<a.zakirov@postgrespro.ru> wrote:
I don't have deep knowledge of the guts of invalidation callbacks. It seems
that there is a problem with it. Tom pointed out above:
I'm not sure that I understood the second case correctly. Can cache
invalidation help in this case? I'm not very familiar with cache
invalidation. It seems to me that InvalidateTSCacheCallBack() should
release the segment after commit.
"Release after commit" sounds like a pretty dangerous design to me,
because a release necessarily implies some kernel calls, which could
fail. We can't afford to inject steps that might fail into post-commit
cleanup (because it's too late to recover by failing the transaction).
It'd be better to do cleanup while searching for a dictionary to use.
But it is possible that I misunderstood his note.
I think you and Tom have misunderstood each other somehow. If you
look at CommitTransaction(), you will see a comment that says:
* This is all post-commit cleanup. Note that if an error is raised here,
* it's too late to abort the transaction. This should be just
* noncritical resource releasing.
Between that point and the end of that function, we shouldn't do
anything that throws an error, because the transaction is already
committed and it's too late to change our mind. But if session A
drops an object, session B is not going to get a callback to
InvalidateTSCacheCallBack at that point. It's going to happen
sometime in the middle of the transaction, like when it next tries to
lock a relation or something. So Tom's complaint is irrelevant in
that scenario.
Also, there is no absolute prohibition on kernel calls in post-commit
cleanup, or in no-fail code in general. For example, the
RESOURCE_RELEASE_AFTER_LOCKS phase of resowner cleanup calls
FileClose(). That's actually completely alarming when you really
think about it, because one of the documented return values for
close() is EIO, which certainly represents a very dangerous kind of
failure -- see nearby threads about fsync-safety. Transaction abort
acquires and releases numerous LWLocks, which can result in kernel
calls that could fail. We're OK with that because, in practice, it
never happens. Unmapping a DSM segment is probably about as safe as
acquiring and releasing an LWLock, maybe safer. On my MacBook, the
only documented return value for munmap is EINVAL, and any such error
would indicate a PostgreSQL bug (or a kernel bug, or a cosmic ray
hit). I checked a Linux system; things there are less clear, because
mmap and munmap share a single man page, and mmap can fail for all
kinds of reasons. But very few of the listed error codes look like
things that could legitimately happen during munmap. Also, if munmap
did fail (or shmdt/shmctl if using System V shared memory), it would
be reported as a WARNING, not an ERROR, so we'd still be sorta OK.
I think the only real question here is whether it's safe, at a high
level, to drop the object at time T0 and have various backends drop
the mapping at unpredictable later times T1, T2, ... all greater than
T0. Generally, one wants to remove all references to an object before
the object itself, which in this case we can't. Assuming that we can
convince ourselves that that much is OK, I don't see why using a
syscache callback to help ensure that the mappings are blown away in
an at-least-somewhat-timely fashion is worse than any other approach.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
... Assuming that we can
convince ourselves that that much is OK, I don't see why using a
syscache callback to help ensure that the mappings are blown away in
an at-least-somewhat-timely fashion is worse than any other approach.
I think the point you've not addressed is that "syscache callback
occurred" does not equate to "object was dropped". Can the code
survive having this occur at any invalidation point?
(CLOBBER_CACHE_ALWAYS testing would soon expose any fallacy there.)
regards, tom lane
On Thu, May 17, 2018 at 09:57:59AM -0400, Robert Haas wrote:
I think you and Tom have misunderstood each other somehow. If you
look at CommitTransaction(), you will see a comment that says:
Oh, I understand now. You are right.
Also, there is no absolute prohibition on kernel calls in post-commit
cleanup, or in no-fail code in general.
Thank you for the explanation!
The current approach depends on syscache callbacks anyway. Backend 2
(from the example above) knows whether it is necessary to unpin segments
after a syscache callback was called. Tom pointed out below that
callbacks can occur on various events. So I think I should check the
current approach with CLOBBER_CACHE_ALWAYS too. It could reveal some
problems in the current patch.
Then, if everything is OK, I'll check the other approach (unmapping
in the TS syscache callback) using CLOBBER_CACHE_ALWAYS.
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
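The "current approach" described in this message (the callback only sets a flag, and the actual release happens at the next dictionary lookup) can be sketched in a self-contained form. This is an illustrative model under stated assumptions, not the patch itself: DemoDictEntry, demo_cache and the demo_* functions are hypothetical, and "mapped" merely stands for a pinned DSM mapping.

```c
#include <stdbool.h>

#define DEMO_NDICTS 4

/* Hypothetical per-backend cache entry; not the real TSDictionaryCacheEntry. */
typedef struct
{
	bool		valid;			/* entry may still be used */
	bool		mapped;			/* this backend holds a mapping for it */
} DemoDictEntry;

DemoDictEntry demo_cache[DEMO_NDICTS];
bool		demo_has_invalid = false;

/* Invalidation callback: only record that something changed, release nothing. */
void
demo_invalidate_callback(void)
{
	for (int i = 0; i < DEMO_NDICTS; i++)
		demo_cache[i].valid = false;
	demo_has_invalid = true;
}

/* Deferred cleanup: drop the mappings held by invalidated entries. */
void
demo_release_invalid(void)
{
	if (!demo_has_invalid)
		return;

	for (int i = 0; i < DEMO_NDICTS; i++)
	{
		if (!demo_cache[i].valid)
			demo_cache[i].mapped = false;
	}
	demo_has_invalid = false;
}

/* Lookup path: clean up first, then (re)load the requested entry. */
DemoDictEntry *
demo_lookup(int id)
{
	demo_release_invalid();

	if (!demo_cache[id].valid)
	{
		demo_cache[id].valid = true;
		demo_cache[id].mapped = true;
	}
	return &demo_cache[id];
}

/* Number of entries that currently hold a mapping. */
int
demo_count_mapped(void)
{
	int			n = 0;

	for (int i = 0; i < DEMO_NDICTS; i++)
	{
		if (demo_cache[i].mapped)
			n++;
	}
	return n;
}
```

The sketch also shows the remaining weakness: if demo_lookup() is never called again after an invalidation, the stale mappings stay in place until the backend exits.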
On Thu, May 17, 2018 at 10:18 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
... Assuming that we can
convince ourselves that that much is OK, I don't see why using a
syscache callback to help ensure that the mappings are blown away in
an at-least-somewhat-timely fashion is worse than any other approach.
I think the point you've not addressed is that "syscache callback
occurred" does not equate to "object was dropped". Can the code
survive having this occur at any invalidation point?
(CLOBBER_CACHE_ALWAYS testing would soon expose any fallacy there.)
Well, I'm not advocating for a lack of testing, and
CLOBBER_CACHE_ALWAYS testing is a good idea. However, I suspect that
calling dsm_detach() from a syscache callback should be fine.
Obviously there will be trouble if the surrounding code is still using
that mapping, but that would be a bug at some higher level, like using
an object without locking it. And there will be trouble if you
register an on_dsm_detach callback that does something strange, but
the ones that the core code installs (when you use shm_mq, for
example) should be safe. And there will be trouble if you're not
careful about memory contexts, because someplace you probably need to
remember that you detached from that DSM so you don't try to do it
again, and you'd better be sure you have the right context selected
when updating your data structures. But it all seems pretty solvable.
I think.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
On Thu, May 17, 2018 at 10:18 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I think the point you've not addressed is that "syscache callback
occurred" does not equate to "object was dropped". Can the code
survive having this occur at any invalidation point?
(CLOBBER_CACHE_ALWAYS testing would soon expose any fallacy there.)
Well, I'm not advocating for a lack of testing, and
CLOBBER_CACHE_ALWAYS testing is a good idea. However, I suspect that
calling dsm_detach() from a syscache callback should be fine.
Obviously there will be trouble if the surrounding code is still using
that mapping, but that would be a bug at some higher level, like using
an object without locking it.
No, you're clearly not getting the point. You could have an absolutely
airtight exclusive lock of any description whatsoever, and that would
provide no guarantee at all that you don't get a cache flush callback.
It's only a cache, not a catalog, and it can get flushed for any reason
or no reason. (That's why we have pin counts on catcache and relcache
entries, rather than assuming that locking the corresponding object is
enough.) So I think it's highly likely that unmapping in a syscache
callback is going to lead quickly to SIGSEGV. The only way it would not
is if we keep the shared dictionary mapped only in short straight-line
code segments that never do any other catalog accesses ... which seems
awkward, inefficient, and error-prone.
Do we actually need to worry about unmapping promptly on DROP TEXT
DICTIONARY? It seems like the only downside of not doing that is that
we'd leak some address space until process exit. If you were thrashing
dictionaries at some unreasonable rate on a 32-bit host, you might
eventually run some sessions out of address space; but that doesn't seem
like a situation that's so common that we need fragile coding to avoid it.
regards, tom lane
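The pin-count idea mentioned above (an invalidation can arrive at any moment, so the resource is released only when nobody is using it) can be sketched as follows. This is a toy model, not catcache or relcache code: DemoEntry and the demo_* functions are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical cache entry guarded by a pin count. */
typedef struct
{
	int			refcount;		/* active users of the entry */
	bool		dead;			/* invalidated while still in use */
	bool		freed;			/* underlying resource has been released */
} DemoEntry;

void
demo_pin(DemoEntry *e)
{
	assert(!e->freed);
	e->refcount++;
}

/* Invalidation may arrive at any time, even while the entry is pinned. */
void
demo_invalidate(DemoEntry *e)
{
	if (e->refcount > 0)
		e->dead = true;			/* defer the release to the last unpin */
	else
		e->freed = true;		/* nobody is using it, release right away */
}

void
demo_unpin(DemoEntry *e)
{
	assert(e->refcount > 0);
	e->refcount--;
	if (e->refcount == 0 && e->dead)
		e->freed = true;
}

/*
 * Returns true if an entry flushed while pinned stays usable until it is
 * unpinned, and is released afterwards.
 */
bool
demo_flush_while_pinned(void)
{
	DemoEntry	e = {0, false, false};
	bool		usable_during_use;

	demo_pin(&e);
	demo_invalidate(&e);		/* flush arrives in the middle of use */
	usable_during_use = !e.freed;
	demo_unpin(&e);

	return usable_during_use && e.freed;
}
```

Unmapping directly inside the callback corresponds to setting freed unconditionally in demo_invalidate(), which is the case that would leave a pinned user with a dangling pointer.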
On Thu, May 17, 2018 at 1:52 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
On Thu, May 17, 2018 at 10:18 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I think the point you've not addressed is that "syscache callback
occurred" does not equate to "object was dropped". Can the code
survive having this occur at any invalidation point?
(CLOBBER_CACHE_ALWAYS testing would soon expose any fallacy there.)
Well, I'm not advocating for a lack of testing, and
CLOBBER_CACHE_ALWAYS testing is a good idea. However, I suspect that
calling dsm_detach() from a syscache callback should be fine.
Obviously there will be trouble if the surrounding code is still using
that mapping, but that would be a bug at some higher level, like using
an object without locking it.
No, you're clearly not getting the point. You could have an absolutely
airtight exclusive lock of any description whatsoever, and that would
provide no guarantee at all that you don't get a cache flush callback.
It's only a cache, not a catalog, and it can get flushed for any reason
or no reason. (That's why we have pin counts on catcache and relcache
entries, rather than assuming that locking the corresponding object is
enough.) So I think it's highly likely that unmapping in a syscache
callback is going to lead quickly to SIGSEGV. The only way it would not
is if we keep the shared dictionary mapped only in short straight-line
code segments that never do any other catalog accesses ... which seems
awkward, inefficient, and error-prone.
Yeah, that's true, but again, you can work around that problem. A DSM
mapping is fundamentally not that different from a backend-private
memory allocation. If you can avoid freeing memory while you're
referencing it -- as the catcache and the syscache clearly do -- you
can avoid it here, too.
Do we actually need to worry about unmapping promptly on DROP TEXT
DICTIONARY? It seems like the only downside of not doing that is that
we'd leak some address space until process exit. If you were thrashing
dictionaries at some unreasonable rate on a 32-bit host, you might
eventually run some sessions out of address space; but that doesn't seem
like a situation that's so common that we need fragile coding to avoid it.
I'm not sure what the situation is here.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Thu, May 17, 2018 at 10:18:56AM -0400, Tom Lane wrote:
I think the point you've not addressed is that "syscache callback
occurred" does not equate to "object was dropped". Can the code
survive having this occur at any invalidation point?
(CLOBBER_CACHE_ALWAYS testing would soon expose any fallacy there.)
Thank you for the idea of testing with CLOBBER_CACHE_ALWAYS. I built
postgres with it and ran the regression tests. I tested both approaches. At
first glance they pass the tests.
There are no concurrent tests for the text search feature with two or more
connections. Maybe it would be useful to add such tests. I did it
manually, but it would be better to have a script.
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
On Thu, May 17, 2018 at 02:14:07PM -0400, Robert Haas wrote:
On Thu, May 17, 2018 at 1:52 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Do we actually need to worry about unmapping promptly on DROP TEXT
DICTIONARY? It seems like the only downside of not doing that is that
we'd leak some address space until process exit. If you were thrashing
dictionaries at some unreasonable rate on a 32-bit host, you might
eventually run some sessions out of address space; but that doesn't seem
like a situation that's so common that we need fragile coding to avoid it.
I'm not sure what the situation is here.
I think this case may occur when you continuously create and drop a
lot of dictionaries, different connections work with them concurrently,
and some of the connections stop using text search at some point, so
their pinned segments are never unpinned.
But I'm not sure this is a realistic case. Text search configuration changes
should be very infrequent (as noted in the
InvalidateTSCacheCallBack comment).
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
On Wed, May 16, 2018 at 02:36:33PM +0300, Arthur Zakirov wrote:
... I attached the rebased patch.
I attached a new version of the patch.
I found a bug that occurs when the CompoundAffix,
SuffixNodes, PrefixNodes, or DictNodes arrays of the IspellDictData
structure are empty. Now they have a terminating entry and therefore
always contain at least one node entry.
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Attachments:
0001-Fix-ispell-memory-handling-v15.patch (text/plain; charset=us-ascii)
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index 6f5b635413..09297e384c 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -78,6 +78,8 @@
#define tmpalloc(sz) MemoryContextAlloc(Conf->buildCxt, (sz))
#define tmpalloc0(sz) MemoryContextAllocZero(Conf->buildCxt, (sz))
+#define tmpstrdup(str) MemoryContextStrdup(Conf->buildCxt, (str))
+
/*
* Prepare for constructing an ISpell dictionary.
*
@@ -498,7 +500,7 @@ NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
strcpy(Conf->Spell[Conf->nspell]->word, word);
Conf->Spell[Conf->nspell]->p.flag = (*flag != '\0')
- ? cpstrdup(Conf, flag) : VoidString;
+ ? tmpstrdup(flag) : VoidString;
Conf->nspell++;
}
@@ -1040,7 +1042,7 @@ setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
entry->flag.i = i;
}
else
- entry->flag.s = cpstrdup(Conf, s);
+ entry->flag.s = tmpstrdup(s);
entry->flagMode = Conf->flagMode;
entry->value = val;
@@ -1541,6 +1543,9 @@ nextline:
return;
isnewformat:
+ pfree(recoded);
+ pfree(pstr);
+
if (oldformat)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
0002-Change-tmplinit-argument-v15.patch (text/plain; charset=us-ascii)
diff --git a/contrib/dict_int/dict_int.c b/contrib/dict_int/dict_int.c
index 56ede37089..8dd4959028 100644
--- a/contrib/dict_int/dict_int.c
+++ b/contrib/dict_int/dict_int.c
@@ -30,7 +30,7 @@ PG_FUNCTION_INFO_V1(dintdict_lexize);
Datum
dintdict_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictInt *d;
ListCell *l;
@@ -38,7 +38,7 @@ dintdict_init(PG_FUNCTION_ARGS)
d->maxlen = 6;
d->rejectlong = false;
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/contrib/dict_xsyn/dict_xsyn.c b/contrib/dict_xsyn/dict_xsyn.c
index a79ece240c..0b8a32d459 100644
--- a/contrib/dict_xsyn/dict_xsyn.c
+++ b/contrib/dict_xsyn/dict_xsyn.c
@@ -140,7 +140,7 @@ read_dictionary(DictSyn *d, const char *filename)
Datum
dxsyn_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSyn *d;
ListCell *l;
char *filename = NULL;
@@ -153,7 +153,7 @@ dxsyn_init(PG_FUNCTION_ARGS)
d->matchsynonyms = false;
d->keepsynonyms = true;
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/contrib/unaccent/unaccent.c b/contrib/unaccent/unaccent.c
index 247c202755..2a2fbee5fa 100644
--- a/contrib/unaccent/unaccent.c
+++ b/contrib/unaccent/unaccent.c
@@ -267,12 +267,12 @@ PG_FUNCTION_INFO_V1(unaccent_init);
Datum
unaccent_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
TrieChar *rootTrie = NULL;
bool fileloaded = false;
ListCell *l;
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/commands/tsearchcmds.c b/src/backend/commands/tsearchcmds.c
index 3a843512d1..3753e32b2c 100644
--- a/src/backend/commands/tsearchcmds.c
+++ b/src/backend/commands/tsearchcmds.c
@@ -386,17 +386,25 @@ verify_dictoptions(Oid tmplId, List *dictoptions)
}
else
{
+ DictInitData init_data;
+
/*
* Copy the options just in case init method thinks it can scribble on
* them ...
*/
dictoptions = copyObject(dictoptions);
+ init_data.dict_options = dictoptions;
+ init_data.dict.id = InvalidOid;
+ init_data.dict.xmin = InvalidTransactionId;
+ init_data.dict.xmax = InvalidTransactionId;
+ ItemPointerSetInvalid(&init_data.dict.tid);
+
/*
* Call the init method and see if it complains. We don't worry about
* it leaking memory, since our command will soon be over anyway.
*/
- (void) OidFunctionCall1(initmethod, PointerGetDatum(dictoptions));
+ (void) OidFunctionCall1(initmethod, PointerGetDatum(&init_data));
}
ReleaseSysCache(tup);
diff --git a/src/backend/snowball/dict_snowball.c b/src/backend/snowball/dict_snowball.c
index 78c9f73ef0..15ebafd833 100644
--- a/src/backend/snowball/dict_snowball.c
+++ b/src/backend/snowball/dict_snowball.c
@@ -181,14 +181,14 @@ locate_stem_module(DictSnowball *d, const char *lang)
Datum
dsnowball_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSnowball *d;
bool stoploaded = false;
ListCell *l;
d = (DictSnowball *) palloc0(sizeof(DictSnowball));
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index edc6547700..39f1e6faeb 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -29,7 +29,7 @@ typedef struct
Datum
dispell_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictISpell *d;
bool affloaded = false,
dictloaded = false,
@@ -40,7 +40,7 @@ dispell_init(PG_FUNCTION_ARGS)
NIStartBuild(&(d->obj));
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_simple.c b/src/backend/tsearch/dict_simple.c
index ac6a24eba5..9605108334 100644
--- a/src/backend/tsearch/dict_simple.c
+++ b/src/backend/tsearch/dict_simple.c
@@ -29,7 +29,7 @@ typedef struct
Datum
dsimple_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSimple *d = (DictSimple *) palloc0(sizeof(DictSimple));
bool stoploaded = false,
acceptloaded = false;
@@ -37,7 +37,7 @@ dsimple_init(PG_FUNCTION_ARGS)
d->accept = true; /* default */
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_synonym.c b/src/backend/tsearch/dict_synonym.c
index c011886cb0..02989cd16b 100644
--- a/src/backend/tsearch/dict_synonym.c
+++ b/src/backend/tsearch/dict_synonym.c
@@ -91,7 +91,7 @@ compareSyn(const void *a, const void *b)
Datum
dsynonym_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSyn *d;
ListCell *l;
char *filename = NULL;
@@ -104,7 +104,7 @@ dsynonym_init(PG_FUNCTION_ARGS)
char *line = NULL;
uint16 flags = 0;
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_thesaurus.c b/src/backend/tsearch/dict_thesaurus.c
index 24364e646d..1604b5f60f 100644
--- a/src/backend/tsearch/dict_thesaurus.c
+++ b/src/backend/tsearch/dict_thesaurus.c
@@ -604,7 +604,7 @@ compileTheSubstitute(DictThesaurus *d)
Datum
thesaurus_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictThesaurus *d;
char *subdictname = NULL;
bool fileloaded = false;
@@ -612,7 +612,7 @@ thesaurus_init(PG_FUNCTION_ARGS)
d = (DictThesaurus *) palloc0(sizeof(DictThesaurus));
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/utils/cache/ts_cache.c b/src/backend/utils/cache/ts_cache.c
index f11cba4cce..780517723b 100644
--- a/src/backend/utils/cache/ts_cache.c
+++ b/src/backend/utils/cache/ts_cache.c
@@ -39,6 +39,7 @@
#include "catalog/pg_ts_template.h"
#include "commands/defrem.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_public.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
#include "utils/fmgroids.h"
@@ -75,6 +76,7 @@ static TSConfigCacheEntry *lastUsedConfig = NULL;
char *TSCurrentConfig = NULL;
static Oid TSCurrentConfigCache = InvalidOid;
+static bool has_invalid_dictionary = false;
/*
@@ -98,7 +100,12 @@ InvalidateTSCacheCallBack(Datum arg, int cacheid, uint32 hashvalue)
hash_seq_init(&status, hash);
while ((entry = (TSAnyCacheEntry *) hash_seq_search(&status)) != NULL)
+ {
+ if (cacheid == TSDICTOID)
+ has_invalid_dictionary = true;
+
entry->isvalid = false;
+ }
/* Also invalidate the current-config cache if it's pg_ts_config */
if (hash == TSConfigCacheHash)
@@ -312,11 +319,15 @@ lookup_ts_dictionary_cache(Oid dictId)
MemSet(entry, 0, sizeof(TSDictionaryCacheEntry));
entry->dictId = dictId;
entry->dictCtx = saveCtx;
+ entry->dict_xmin = HeapTupleHeaderGetRawXmin(tpdict->t_data);
+ entry->dict_xmax = HeapTupleHeaderGetRawXmax(tpdict->t_data);
+ entry->dict_tid = tpdict->t_self;
entry->lexizeOid = template->tmpllexize;
if (OidIsValid(template->tmplinit))
{
+ DictInitData init_data;
List *dictoptions;
Datum opt;
bool isnull;
@@ -336,9 +347,15 @@ lookup_ts_dictionary_cache(Oid dictId)
else
dictoptions = deserialize_deflist(opt);
+ init_data.dict_options = dictoptions;
+ init_data.dict.id = dictId;
+ init_data.dict.xmin = entry->dict_xmin;
+ init_data.dict.xmax = entry->dict_xmax;
+ init_data.dict.tid = entry->dict_tid;
+
entry->dictData =
DatumGetPointer(OidFunctionCall1(template->tmplinit,
- PointerGetDatum(dictoptions)));
+ PointerGetDatum(&init_data)));
MemoryContextSwitchTo(oldcontext);
}
diff --git a/src/include/tsearch/ts_cache.h b/src/include/tsearch/ts_cache.h
index 410f1d54af..45ed570864 100644
--- a/src/include/tsearch/ts_cache.h
+++ b/src/include/tsearch/ts_cache.h
@@ -54,6 +54,10 @@ typedef struct TSDictionaryCacheEntry
Oid dictId;
bool isvalid;
+ TransactionId dict_xmin; /* XMIN of the dictionary's tuple */
+ TransactionId dict_xmax; /* XMAX of the dictionary's tuple */
+ ItemPointerData dict_tid; /* TID of the dictionary's tuple */
+
/* most frequent fmgr call */
Oid lexizeOid;
FmgrInfo lexize;
diff --git a/src/include/tsearch/ts_public.h b/src/include/tsearch/ts_public.h
index 0b7a5aa68e..363226c936 100644
--- a/src/include/tsearch/ts_public.h
+++ b/src/include/tsearch/ts_public.h
@@ -13,6 +13,8 @@
#ifndef _PG_TS_PUBLIC_H_
#define _PG_TS_PUBLIC_H_
+#include "nodes/pg_list.h"
+#include "storage/itemptr.h"
#include "tsearch/ts_type.h"
/*
@@ -81,10 +83,70 @@ extern void readstoplist(const char *fname, StopList *s,
extern bool searchstoplist(StopList *s, char *key);
/*
- * Interface with dictionaries
+ * API for text search dictionaries.
+ *
+ * API functions to handle a text search dictionary are defined by a text search
+ * template. Currently an existing template cannot be altered in order to
+ * define other functions. The API consists of the following functions:
+ * - init function - optional function which initializes internal structures of
+ * the dictionary. It accepts DictInitData structure as an argument and must
+ * return a custom palloc'd structure which stores content of the processed
+ * dictionary and is used in lexize function.
+ * - lexize function - normalizes a single word (token) using specific
+ * dictionary. It must return a palloc'd array of TSLexeme the last entry of
+ * which is the terminating entry and accepts the following arguments:
+ * - dictData - pointer to a custom structure returned by init function or
+ * NULL if init function wasn't defined by the template.
+ * - token - string which represents a token to normalize, isn't
+ * null-terminated.
+ * - length - length of token.
+ * - dictState - pointer to a DictSubState structure which stores the current
+ * state of processing a set of tokens and allows phrases to be normalized.
+ */
+
+/*
+ * A preprocessed dictionary can be stored in shared memory using DSM. The
+ * init function decides whether the dictionary wants it. A DSM segment is
+ * released if the dictionary is altered or dropped. There is still one case
+ * where we cannot prevent a segment from leaking: if the dictionary was
+ * dropped after some backend had used it, that backend holds its DSM segment
+ * until it disconnects or calls lookup_ts_dictionary_cache(), where invalid
+ * segments are unpinned.
+ *
+ * DictPointerData is a structure to search a dictionary's DSM segment. We
+ * need xmin, xmax and tid to be sure that the content in the DSM segment is
+ * still valid.
+ */
+typedef struct
+{
+ Oid id; /* OID of dictionary which is processed */
+ TransactionId xmin; /* XMIN of the dictionary's tuple */
+ TransactionId xmax; /* XMAX of the dictionary's tuple */
+ ItemPointerData tid; /* TID of the dictionary's tuple */
+} DictPointerData;
+
+/*
+ * API structure for a dictionary initialization. It is passed as an argument
+ * to a template's init function.
*/
+typedef struct
+{
+ /*
+ * A dictionary option list for a template's init method. Should go first
+ * for backward compatibility.
+ */
+ List *dict_options;
+ /*
+ * Dictionary information used to allocate, search and release its DSM
+ * segment.
+ */
+ DictPointerData dict;
+} DictInitData;
-/* return struct for any lexize function */
+/*
+ * Return struct for any lexize function. They are combined into an array, the
+ * last entry is the terminating entry.
+ */
typedef struct
{
/*----------
@@ -108,7 +170,8 @@ typedef struct
uint16 flags; /* See flag bits below */
- char *lexeme; /* C string */
+ char *lexeme; /* C string or NULL if it is a terminating
+ * entry */
} TSLexeme;
/* Flag bits that can appear in TSLexeme.flags */
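To illustrate the terminating-entry convention documented above for lexize results, here is a minimal standalone sketch. It does not use the PostgreSQL headers: DemoLexeme, demo_lexeme_terminate() and demo_lexeme_count() are hypothetical names introduced only for this example, and the struct keeps just the two fields that matter here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for TSLexeme; only the fields needed here. */
typedef struct
{
	unsigned short flags;
	char	   *lexeme;		/* C string, or NULL in the terminating entry */
} DemoLexeme;

/* Write the terminating entry at index n of an array with room for n + 1. */
static void
demo_lexeme_terminate(DemoLexeme *arr, size_t n)
{
	assert(arr != NULL);
	arr[n].flags = 0;
	arr[n].lexeme = NULL;
}

/*
 * Count the entries that precede the terminating one.  A NULL array is
 * treated as "no result", which is also what a lexize function may return.
 */
static size_t
demo_lexeme_count(const DemoLexeme *res)
{
	size_t		n = 0;

	if (res == NULL)
		return 0;
	while (res[n].lexeme != NULL)
		n++;
	return n;
}
```

A caller that receives such an array can walk it without knowing its length in advance, which is how the loop over ptr->lexeme in dispell_lexize() works.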
0003-Retreive-shared-location-for-dict-v15.patch
diff --git a/src/backend/commands/tsearchcmds.c b/src/backend/commands/tsearchcmds.c
index 3753e32b2c..ef6cabcc1e 100644
--- a/src/backend/commands/tsearchcmds.c
+++ b/src/backend/commands/tsearchcmds.c
@@ -39,6 +39,7 @@
#include "nodes/makefuncs.h"
#include "parser/parse_func.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_shared.h"
#include "tsearch/ts_utils.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
@@ -510,6 +511,7 @@ RemoveTSDictionaryById(Oid dictId)
{
Relation relation;
HeapTuple tup;
+ DictPointerData dict;
relation = heap_open(TSDictionaryRelationId, RowExclusiveLock);
@@ -521,6 +523,18 @@ RemoveTSDictionaryById(Oid dictId)
CatalogTupleDelete(relation, &tup->t_self);
+ /*
+ * We need to release the dictionary's DSM segment. The segment may still
+ * leak: if some backend used the dictionary before it was dropped, that
+ * backend will hold its DSM segment until it disconnects or calls
+ * lookup_ts_dictionary_cache().
+ */
+ dict.id = dictId;
+ dict.xmin = HeapTupleHeaderGetRawXmin(tup->t_data);
+ dict.xmax = HeapTupleHeaderGetRawXmax(tup->t_data);
+ dict.tid = tup->t_self;
+ ts_dict_shmem_release(&dict, true);
+
ReleaseSysCache(tup);
heap_close(relation, RowExclusiveLock);
@@ -544,6 +558,7 @@ AlterTSDictionary(AlterTSDictionaryStmt *stmt)
bool repl_null[Natts_pg_ts_dict];
bool repl_repl[Natts_pg_ts_dict];
ObjectAddress address;
+ DictPointerData dict;
dictId = get_ts_dict_oid(stmt->dictname, false);
@@ -630,6 +645,18 @@ AlterTSDictionary(AlterTSDictionaryStmt *stmt)
ObjectAddressSet(address, TSDictionaryRelationId, dictId);
+ /*
+ * We need to release the dictionary's DSM segment, which isn't valid
+ * anymore. The segment may still leak: if some backend used the dictionary
+ * before it was altered, that backend will hold its DSM segment until it
+ * disconnects or calls lookup_ts_dictionary_cache().
+ */
+ dict.id = dictId;
+ dict.xmin = HeapTupleHeaderGetRawXmin(tup->t_data);
+ dict.xmax = HeapTupleHeaderGetRawXmax(tup->t_data);
+ dict.tid = tup->t_self;
+ ts_dict_shmem_release(&dict, true);
+
/*
* NOTE: because we only support altering the options, not the template,
* there is no need to update dependencies. This might have to change if
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 0c86a581c0..c7dce8cac5 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -44,6 +44,7 @@
#include "storage/procsignal.h"
#include "storage/sinvaladt.h"
#include "storage/spin.h"
+#include "tsearch/ts_shared.h"
#include "utils/backend_random.h"
#include "utils/snapmgr.h"
@@ -150,6 +151,7 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
size = add_size(size, SyncScanShmemSize());
size = add_size(size, AsyncShmemSize());
size = add_size(size, BackendRandomShmemSize());
+ size = add_size(size, TsearchShmemSize());
#ifdef EXEC_BACKEND
size = add_size(size, ShmemBackendArraySize());
#endif
@@ -271,6 +273,11 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
AsyncShmemInit();
BackendRandomShmemInit();
+ /*
+ * Set up shared memory for tsearch
+ */
+ TsearchShmemInit();
+
#ifdef EXEC_BACKEND
/*
diff --git a/src/backend/tsearch/Makefile b/src/backend/tsearch/Makefile
index 227468ae9e..860cd196e9 100644
--- a/src/backend/tsearch/Makefile
+++ b/src/backend/tsearch/Makefile
@@ -26,7 +26,7 @@ DICTFILES_PATH=$(addprefix dicts/,$(DICTFILES))
OBJS = ts_locale.o ts_parse.o wparser.o wparser_def.o dict.o \
dict_simple.o dict_synonym.o dict_thesaurus.o \
dict_ispell.o regis.o spell.o \
- to_tsany.o ts_selfuncs.o ts_typanalyze.o ts_utils.o
+ to_tsany.o ts_selfuncs.o ts_shared.o ts_typanalyze.o ts_utils.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/tsearch/ts_shared.c b/src/backend/tsearch/ts_shared.c
new file mode 100644
index 0000000000..4b8933628c
--- /dev/null
+++ b/src/backend/tsearch/ts_shared.c
@@ -0,0 +1,377 @@
+/*-------------------------------------------------------------------------
+ *
+ * ts_shared.c
+ * tsearch shared memory management
+ *
+ * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/tsearch/ts_shared.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/hash.h"
+#include "lib/dshash.h"
+#include "miscadmin.h"
+#include "storage/lwlock.h"
+#include "storage/shmem.h"
+#include "tsearch/ts_shared.h"
+#include "utils/hashutils.h"
+#include "utils/memutils.h"
+
+
+/*
+ * Hash table entries key.
+ */
+typedef struct
+{
+ Oid db_id;
+ DictPointerData dict;
+} TsearchDictKey;
+
+/*
+ * Hash table entries representing shared dictionaries.
+ */
+typedef struct
+{
+ TsearchDictKey key;
+ dsm_handle dict_dsm;
+
+ /*
+ * We need a flag that the DSM segment is pinned/unpinned. Otherwise we can
+ * face double dsm_unpin_segment().
+ */
+ bool segment_ispinned;
+} TsearchDictEntry;
+
+static dshash_table *dict_table = NULL;
+
+/*
+ * Information about the main shmem segment, used to coordinate
+ * access to the hash table and dictionaries.
+ */
+typedef struct
+{
+ dsa_handle area;
+ dshash_table_handle dict_table_handle;
+
+ LWLock lock;
+} TsearchCtlData;
+
+static TsearchCtlData *tsearch_ctl;
+
+static int tsearch_dict_cmp(const void *a, const void *b, size_t size,
+ void *arg);
+static uint32 tsearch_dict_hash(const void *a, size_t size, void *arg);
+static void init_dict_table(void);
+
+/* Parameters for dict_table */
+static const dshash_parameters dict_table_params ={
+ sizeof(TsearchDictKey),
+ sizeof(TsearchDictEntry),
+ tsearch_dict_cmp,
+ tsearch_dict_hash,
+ LWTRANCHE_TSEARCH_TABLE
+};
+
+/*
+ * Build the dictionary using allocate_cb callback.
+ *
+ * First try to find the dictionary in the shared hash table. If someone
+ * built it earlier, just return its location in DSM.
+ *
+ * init_data: an argument used within a template's init method.
+ * allocate_cb: function to build the dictionary, if it wasn't found in DSM.
+ *
+ * Returns address in the dynamic shared memory segment or in backend memory.
+ */
+void *
+ts_dict_shmem_location(DictInitData *init_data,
+ ts_dict_build_callback allocate_cb)
+{
+ TsearchDictKey key;
+ TsearchDictEntry *entry;
+ bool found;
+ dsm_segment *seg;
+ void *dict,
+ *dict_location;
+ Size dict_size;
+
+ init_dict_table();
+
+ /*
+ * Build the dictionary in backend's memory if dictid is invalid (it may
+ * happen if the dictionary's init method was called within
+ * verify_dictoptions()).
+ */
+ if (!OidIsValid(init_data->dict.id))
+ {
+ dict = allocate_cb(init_data->dict_options, &dict_size);
+
+ return dict;
+ }
+
+ /* Set up key for hashtable search */
+ key.db_id = MyDatabaseId;
+ key.dict = init_data->dict;
+
+ /* Try to find an entry in the hash table */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, &key, false);
+
+ if (entry)
+ {
+ seg = dsm_find_mapping(entry->dict_dsm);
+ if (!seg)
+ {
+ seg = dsm_attach(entry->dict_dsm);
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+ }
+
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+ }
+
+ /* Dictionary hasn't been loaded into memory yet */
+ entry = (TsearchDictEntry *) dshash_find_or_insert(dict_table, &key,
+ &found);
+
+ if (found)
+ {
+ /*
+ * Someone concurrently inserted a dictionary entry since the first time
+ * we checked.
+ */
+ seg = dsm_attach(entry->dict_dsm);
+
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+ }
+
+ /* Build the dictionary */
+ dict = allocate_cb(init_data->dict_options, &dict_size);
+
+ /* Finally, allocate a DSM segment for the compiled dictionary */
+ seg = dsm_create(dict_size, 0);
+ dict_location = dsm_segment_address(seg);
+ memcpy(dict_location, dict, dict_size);
+
+ pfree(dict);
+
+ entry->key = key;
+ entry->dict_dsm = dsm_segment_handle(seg);
+ entry->segment_ispinned = true;
+
+ /* Keep the segment until end of postmaster */
+ dsm_pin_segment(seg);
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+}
+
+/*
+ * Release memory occupied by the dictionary. Function just unpins DSM mapping.
+ * If nobody else has a mapping to this DSM or the dictionary was dropped or
+ * altered then unpin the DSM segment.
+ *
+ * dict: key to search the dictionary's DSM segment.
+ * unpin_segment: true if we need to unpin the segment because the dictionary
+ * was dropped or altered.
+ */
+void
+ts_dict_shmem_release(DictPointerData *dict, bool unpin_segment)
+{
+ TsearchDictKey key;
+ TsearchDictEntry *entry;
+
+ /*
+ * If we didn't attach to a hash table then do nothing.
+ */
+ if (!dict_table && !unpin_segment)
+ return;
+ /*
+ * But if we need to unpin the DSM segment to get rid of the segment when
+ * the last interested process disconnects we need the hash table to find
+ * the dictionary's entry.
+ */
+ else if (unpin_segment)
+ init_dict_table();
+
+ /* Set up key for hashtable search */
+ key.db_id = MyDatabaseId;
+ key.dict = *dict;
+
+ /* Try to find an entry in the hash table */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, &key, true);
+
+ if (entry)
+ {
+ dsm_segment *seg;
+
+ seg = dsm_find_mapping(entry->dict_dsm);
+
+ if (seg)
+ {
+ dsm_unpin_mapping(seg);
+ dsm_detach(seg);
+ }
+
+ if (unpin_segment && entry->segment_ispinned)
+ {
+ dsm_unpin_segment(entry->dict_dsm);
+ entry->segment_ispinned = false;
+ }
+
+ dshash_release_lock(dict_table, entry);
+ }
+}
+
+/*
+ * Allocate and initialize tsearch-related shared memory.
+ */
+void
+TsearchShmemInit(void)
+{
+ bool found;
+
+ tsearch_ctl = (TsearchCtlData *)
+ ShmemInitStruct("Full Text Search Ctl", sizeof(TsearchCtlData), &found);
+
+ if (!found)
+ {
+ LWLockRegisterTranche(LWTRANCHE_TSEARCH_DSA, "tsearch_dsa");
+ LWLockRegisterTranche(LWTRANCHE_TSEARCH_TABLE, "tsearch_table");
+
+ LWLockInitialize(&tsearch_ctl->lock, LWTRANCHE_TSEARCH_DSA);
+
+ tsearch_ctl->area = DSM_HANDLE_INVALID;
+ tsearch_ctl->dict_table_handle = InvalidDsaPointer;
+ }
+}
+
+/*
+ * Report shared memory space needed by TsearchShmemInit.
+ */
+Size
+TsearchShmemSize(void)
+{
+ Size size = 0;
+
+ /* size of service structure */
+ size = add_size(size, MAXALIGN(sizeof(TsearchCtlData)));
+
+ return size;
+}
+
+/*
+ * A comparator function for TsearchDictKey.
+ *
+ * Returns 0 if keys are equal, 1 otherwise.
+ */
+static int
+tsearch_dict_cmp(const void *a, const void *b, size_t size, void *arg)
+{
+ TsearchDictKey *k1 = (TsearchDictKey *) a;
+ TsearchDictKey *k2 = (TsearchDictKey *) b;
+
+ if (k1->db_id == k2->db_id && k1->dict.id == k2->dict.id &&
+ k1->dict.xmin == k2->dict.xmin && k1->dict.xmax == k2->dict.xmax &&
+ ItemPointerEquals(&k1->dict.tid, &k2->dict.tid))
+ return 0;
+ else
+ return 1;
+}
+
+/*
+ * A hash function for TsearchDictKey.
+ */
+static uint32
+tsearch_dict_hash(const void *a, size_t size, void *arg)
+{
+ TsearchDictKey *k = (TsearchDictKey *) a;
+ uint32 s;
+
+ s = hash_combine(0, hash_uint32(k->db_id));
+ s = hash_combine(s, hash_uint32(k->dict.id));
+ s = hash_combine(s, hash_uint32(k->dict.xmin));
+ s = hash_combine(s, hash_uint32(k->dict.xmax));
+ s = hash_combine(s,
+ hash_uint32(BlockIdGetBlockNumber(&k->dict.tid.ip_blkid)));
+ s = hash_combine(s, hash_uint32(k->dict.tid.ip_posid));
+
+ return s;
+}
+
+/*
+ * Initialize hash table located in DSM.
+ *
+ * The hash table should be created and initialized if it doesn't exist yet.
+ */
+static void
+init_dict_table(void)
+{
+ MemoryContext old_context;
+ dsa_area *dsa;
+
+ /* Exit if hash table was initialized already */
+ if (dict_table)
+ return;
+
+ old_context = MemoryContextSwitchTo(TopMemoryContext);
+
+recheck_table:
+ LWLockAcquire(&tsearch_ctl->lock, LW_SHARED);
+
+ /* Hash table has already been created by someone */
+ if (DsaPointerIsValid(tsearch_ctl->dict_table_handle))
+ {
+ Assert(tsearch_ctl->area != DSM_HANDLE_INVALID);
+
+ dsa = dsa_attach(tsearch_ctl->area);
+
+ dict_table = dshash_attach(dsa,
+ &dict_table_params,
+ tsearch_ctl->dict_table_handle,
+ NULL);
+ }
+ else
+ {
+ /* Try to get exclusive lock */
+ LWLockRelease(&tsearch_ctl->lock);
+ if (!LWLockAcquireOrWait(&tsearch_ctl->lock, LW_EXCLUSIVE))
+ {
+ /*
+ * The lock was released by another backend, which has concurrently
+ * created the hash table already.
+ */
+ goto recheck_table;
+ }
+
+ dsa = dsa_create(LWTRANCHE_TSEARCH_DSA);
+ tsearch_ctl->area = dsa_get_handle(dsa);
+
+ dict_table = dshash_create(dsa, &dict_table_params, NULL);
+ tsearch_ctl->dict_table_handle = dshash_get_hash_table_handle(dict_table);
+
+ /* Keep the area until end of postmaster */
+ dsa_pin(dsa);
+ }
+
+ LWLockRelease(&tsearch_ctl->lock);
+
+ /* Remain attached until end of session */
+ dsa_pin_mapping(dsa);
+
+ MemoryContextSwitchTo(old_context);
+}
diff --git a/src/backend/utils/cache/ts_cache.c b/src/backend/utils/cache/ts_cache.c
index 780517723b..1401920412 100644
--- a/src/backend/utils/cache/ts_cache.c
+++ b/src/backend/utils/cache/ts_cache.c
@@ -40,6 +40,7 @@
#include "commands/defrem.h"
#include "tsearch/ts_cache.h"
#include "tsearch/ts_public.h"
+#include "tsearch/ts_shared.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
#include "utils/fmgroids.h"
@@ -88,6 +89,10 @@ static bool has_invalid_dictionary = false;
* doesn't seem worth the trouble to determine that; we just flush all the
* entries of the related hash table.
*
+ * We set has_invalid_dictionary to true so that all used segments are unpinned
+ * later, on the first use of a text search function. It isn't safe to call
+ * ts_dict_shmem_release() here since it may call kernel functions.
+ *
* We can use the same function for all TS caches by passing the hash
* table address as the "arg".
*/
@@ -112,6 +117,33 @@ InvalidateTSCacheCallBack(Datum arg, int cacheid, uint32 hashvalue)
TSCurrentConfigCache = InvalidOid;
}
+/*
+ * Unpin shared segments of all dictionary entries.
+ */
+static void
+do_ts_dict_shmem_release(void)
+{
+ HASH_SEQ_STATUS status;
+ TSDictionaryCacheEntry *entry;
+
+ if (!has_invalid_dictionary)
+ return;
+
+ hash_seq_init(&status, TSDictionaryCacheHash);
+ while ((entry = (TSDictionaryCacheEntry *) hash_seq_search(&status)) != NULL)
+ {
+ DictPointerData dict_ptr;
+
+ dict_ptr.id = entry->dictId;
+ dict_ptr.xmin = entry->dict_xmin;
+ dict_ptr.xmax = entry->dict_xmax;
+ dict_ptr.tid = entry->dict_tid;
+ ts_dict_shmem_release(&dict_ptr, false);
+ }
+
+ has_invalid_dictionary = false;
+}
+
/*
* Fetch parser cache entry
*/
@@ -260,6 +292,13 @@ lookup_ts_dictionary_cache(Oid dictId)
Form_pg_ts_template template;
MemoryContext saveCtx;
+ /*
+ * It is possible that some invalid entries hold a DSM mapping and we
+ * need to unpin it to avoid leaking memory. We will unpin segments of
+ * all other invalid dictionaries.
+ */
+ do_ts_dict_shmem_release();
+
tpdict = SearchSysCache1(TSDICTOID, ObjectIdGetDatum(dictId));
if (!HeapTupleIsValid(tpdict))
elog(ERROR, "cache lookup failed for text search dictionary %u",
diff --git a/src/include/storage/lwlock.h b/src/include/storage/lwlock.h
index c21bfe2f66..16b0858eda 100644
--- a/src/include/storage/lwlock.h
+++ b/src/include/storage/lwlock.h
@@ -219,6 +219,8 @@ typedef enum BuiltinTrancheIds
LWTRANCHE_SHARED_TUPLESTORE,
LWTRANCHE_TBM,
LWTRANCHE_PARALLEL_APPEND,
+ LWTRANCHE_TSEARCH_DSA,
+ LWTRANCHE_TSEARCH_TABLE,
LWTRANCHE_FIRST_USER_DEFINED
} BuiltinTrancheIds;
diff --git a/src/include/tsearch/ts_shared.h b/src/include/tsearch/ts_shared.h
new file mode 100644
index 0000000000..b6d00bdc9e
--- /dev/null
+++ b/src/include/tsearch/ts_shared.h
@@ -0,0 +1,26 @@
+/*-------------------------------------------------------------------------
+ *
+ * ts_shared.h
+ * tsearch shared memory management
+ *
+ * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
+ *
+ * src/include/tsearch/ts_shared.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef TS_SHARED_H
+#define TS_SHARED_H
+
+#include "tsearch/ts_public.h"
+
+typedef void *(*ts_dict_build_callback) (List *dictoptions, Size *size);
+
+extern void *ts_dict_shmem_location(DictInitData *init_data,
+ ts_dict_build_callback allocate_cb);
+extern void ts_dict_shmem_release(DictPointerData *dict, bool unpin_segment);
+
+extern void TsearchShmemInit(void);
+extern Size TsearchShmemSize(void);
+
+#endif /* TS_SHARED_H */
0004-Store-ispell-in-shared-location-v15.patch
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index 8075ea94e7..e469558f4d 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -3110,6 +3110,21 @@ CREATE TEXT SEARCH DICTIONARY english_stem (
</sect2>
+ <sect2 id="textsearch-shared-dictionaries">
+ <title>Dictionaries in Shared Memory</title>
+
+ <para>
+ Some dictionaries, especially <application>Ispell</application>, consume
+ a significant amount of memory, in some cases tens of megabytes. Most of
+ them store the data in text files, and building the in-memory structure is
+ both CPU and time-consuming. Instead of doing this in each backend when
+ it needs a dictionary for the first time, the compiled dictionary may be
+ stored in dynamic shared memory so that it may be reused by other backends.
+ Currently only <application>Ispell</application> supports loading into
+ dynamic shared memory.
+ </para>
+ </sect2>
+
</sect1>
<sect1 id="textsearch-configuration">
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index 39f1e6faeb..ced52d2790 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -5,6 +5,15 @@
*
* Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
*
+ * By default all Ispell dictionaries are stored in DSM. But if the amount
+ * of memory exceeds max_shared_dictionaries_size, then the dictionary will be
+ * allocated in private backend memory (in dictCtx context).
+ *
+ * All necessary data are built within dispell_build() function. But
+ * structures for regular expressions are compiled on first demand and
+ * stored using AffixReg array. It is because regex_t and Regis cannot be
+ * stored in shared memory easily.
+ *
*
* IDENTIFICATION
* src/backend/tsearch/dict_ispell.c
@@ -14,8 +23,10 @@
#include "postgres.h"
#include "commands/defrem.h"
+#include "storage/dsm.h"
#include "tsearch/dicts/spell.h"
#include "tsearch/ts_locale.h"
+#include "tsearch/ts_shared.h"
#include "tsearch/ts_utils.h"
#include "utils/builtins.h"
@@ -26,54 +37,126 @@ typedef struct
IspellDict obj;
} DictISpell;
+static void parse_dictoptions(List *dictoptions,
+ char **dictfile, char **afffile, char **stopfile);
+static void *dispell_build(List *dictoptions, Size *size);
+
Datum
dispell_init(PG_FUNCTION_ARGS)
{
DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictISpell *d;
- bool affloaded = false,
- dictloaded = false,
- stoploaded = false;
- ListCell *l;
+ void *dict_location;
+ char *stopfile;
d = (DictISpell *) palloc0(sizeof(DictISpell));
- NIStartBuild(&(d->obj));
+ parse_dictoptions(init_data->dict_options, NULL, NULL, &stopfile);
+
+ if (stopfile)
+ readstoplist(stopfile, &(d->stoplist), lowerstr);
+
+ dict_location = ts_dict_shmem_location(init_data, dispell_build);
+ Assert(dict_location);
+
+ d->obj.dict = (IspellDictData *) dict_location;
+ d->obj.reg = (AffixReg *) palloc0(d->obj.dict->nAffix *
+ sizeof(AffixReg));
+ /* Current memory context is dictionary's private memory context */
+ d->obj.dictCtx = CurrentMemoryContext;
+
+ PG_RETURN_POINTER(d);
+}
+
+Datum
+dispell_lexize(PG_FUNCTION_ARGS)
+{
+ DictISpell *d = (DictISpell *) PG_GETARG_POINTER(0);
+ char *in = (char *) PG_GETARG_POINTER(1);
+ int32 len = PG_GETARG_INT32(2);
+ char *txt;
+ TSLexeme *res;
+ TSLexeme *ptr,
+ *cptr;
+
+ if (len <= 0)
+ PG_RETURN_POINTER(NULL);
+
+ txt = lowerstr_with_len(in, len);
+ res = NINormalizeWord(&(d->obj), txt);
+
+ if (res == NULL)
+ PG_RETURN_POINTER(NULL);
+
+ cptr = res;
+ for (ptr = cptr; ptr->lexeme; ptr++)
+ {
+ if (searchstoplist(&(d->stoplist), ptr->lexeme))
+ {
+ pfree(ptr->lexeme);
+ ptr->lexeme = NULL;
+ }
+ else
+ {
+ if (cptr != ptr)
+ memcpy(cptr, ptr, sizeof(TSLexeme));
+ cptr++;
+ }
+ }
+ cptr->lexeme = NULL;
+
+ PG_RETURN_POINTER(res);
+}
+
+static void
+parse_dictoptions(List *dictoptions, char **dictfile, char **afffile,
+ char **stopfile)
+{
+ ListCell *l;
- foreach(l, init_data->dict_options)
+ if (dictfile)
+ *dictfile = NULL;
+ if (afffile)
+ *afffile = NULL;
+ if (stopfile)
+ *stopfile = NULL;
+
+ foreach(l, dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
if (strcmp(defel->defname, "dictfile") == 0)
{
- if (dictloaded)
+ if (!dictfile)
+ continue;
+
+ if (*dictfile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple DictFile parameters")));
- NIImportDictionary(&(d->obj),
- get_tsearch_config_filename(defGetString(defel),
- "dict"));
- dictloaded = true;
+ *dictfile = get_tsearch_config_filename(defGetString(defel), "dict");
}
else if (strcmp(defel->defname, "afffile") == 0)
{
- if (affloaded)
+ if (!afffile)
+ continue;
+
+ if (*afffile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple AffFile parameters")));
- NIImportAffixes(&(d->obj),
- get_tsearch_config_filename(defGetString(defel),
- "affix"));
- affloaded = true;
+ *afffile = get_tsearch_config_filename(defGetString(defel), "affix");
}
else if (strcmp(defel->defname, "stopwords") == 0)
{
- if (stoploaded)
+ if (!stopfile)
+ continue;
+
+ if (*stopfile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple StopWords parameters")));
- readstoplist(defGetString(defel), &(d->stoplist), lowerstr);
- stoploaded = true;
+ *stopfile = defGetString(defel);
}
else
{
@@ -83,66 +166,52 @@ dispell_init(PG_FUNCTION_ARGS)
defel->defname)));
}
}
+}
- if (affloaded && dictloaded)
- {
- NISortDictionary(&(d->obj));
- NISortAffixes(&(d->obj));
- }
- else if (!affloaded)
+/*
+ * Build the dictionary.
+ *
+ * Result is palloc'ed.
+ */
+static void *
+dispell_build(List *dictoptions, Size *size)
+{
+ IspellDictBuild build;
+ char *dictfile,
+ *afffile;
+
+ parse_dictoptions(dictoptions, &dictfile, &afffile, NULL);
+
+ if (!afffile)
{
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("missing AffFile parameter")));
}
- else
+ else if (!dictfile)
{
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("missing DictFile parameter")));
}
- NIFinishBuild(&(d->obj));
+ MemSet(&build, 0, sizeof(build));
+ NIStartBuild(&build);
- PG_RETURN_POINTER(d);
-}
+ /* Read files */
+ NIImportDictionary(&build, dictfile);
+ NIImportAffixes(&build, afffile);
-Datum
-dispell_lexize(PG_FUNCTION_ARGS)
-{
- DictISpell *d = (DictISpell *) PG_GETARG_POINTER(0);
- char *in = (char *) PG_GETARG_POINTER(1);
- int32 len = PG_GETARG_INT32(2);
- char *txt;
- TSLexeme *res;
- TSLexeme *ptr,
- *cptr;
+ /* Build persistent data to be used by backends */
+ NISortDictionary(&build);
+ NISortAffixes(&build);
- if (len <= 0)
- PG_RETURN_POINTER(NULL);
+ NICopyData(&build);
- txt = lowerstr_with_len(in, len);
- res = NINormalizeWord(&(d->obj), txt);
-
- if (res == NULL)
- PG_RETURN_POINTER(NULL);
+ /* Release temporary data */
+ NIFinishBuild(&build);
- cptr = res;
- for (ptr = cptr; ptr->lexeme; ptr++)
- {
- if (searchstoplist(&(d->stoplist), ptr->lexeme))
- {
- pfree(ptr->lexeme);
- ptr->lexeme = NULL;
- }
- else
- {
- if (cptr != ptr)
- memcpy(cptr, ptr, sizeof(TSLexeme));
- cptr++;
- }
- }
- cptr->lexeme = NULL;
-
- PG_RETURN_POINTER(res);
+ /* Return the buffer and its size */
+ *size = build.dict_size;
+ return build.dict;
}
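A note on the layout, since dispell_build() above hands back one palloc'ed buffer plus its size: that only works if the compiled dictionary holds no pointers, so records inside it have to be located by offset. Below is a minimal standalone sketch of that idea in plain C. It is not code from the patch. FlatStrings, flat_build() and flat_get() are hypothetical names used only for illustration, and the layout (count, offsets array, then NUL-terminated strings) is an assumption modelled loosely on how AffixData is described in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical flat layout: header, then n offsets, then the string bytes. */
typedef struct FlatStrings
{
	uint32_t	n;				/* number of strings */
	uint32_t	data_start;		/* where string bytes begin inside payload */
	char		payload[];		/* uint32_t offsets[n], then the strings */
} FlatStrings;

/* Look up string i using offsets only; no pointers are stored in the buffer. */
static const char *
flat_get(const FlatStrings *f, uint32_t i)
{
	const uint32_t *offsets = (const uint32_t *) f->payload;

	assert(i < f->n);
	return f->payload + f->data_start + offsets[i];
}

/* Pack n strings into one contiguous allocation and report its size. */
static FlatStrings *
flat_build(const char *const *strs, uint32_t n, size_t *size)
{
	FlatStrings *f;
	uint32_t   *offsets;
	uint32_t	end = 0;		/* offset of the end of the used string data */
	size_t		data_len = 0;
	uint32_t	i;

	/* First pass: calculate the necessary space */
	for (i = 0; i < n; i++)
		data_len += strlen(strs[i]) + 1;

	*size = offsetof(FlatStrings, payload) + n * sizeof(uint32_t) + data_len;
	f = calloc(1, *size);
	if (f == NULL)
		return NULL;

	f->n = n;
	f->data_start = (uint32_t) (n * sizeof(uint32_t));
	offsets = (uint32_t *) f->payload;

	/* Second pass: record each offset and copy the string after it */
	for (i = 0; i < n; i++)
	{
		size_t		len = strlen(strs[i]) + 1;

		offsets[i] = end;
		memcpy(f->payload + f->data_start + end, strs[i], len);
		end += (uint32_t) len;
	}

	return f;
}
```

Because every reference is an offset relative to the start of the buffer, the buffer can be copied byte for byte into another allocation (a shared memory segment, for instance) and flat_get() still returns the right strings from the copy.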
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index 09297e384c..7ac0dceae0 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -23,33 +23,35 @@
* Compilation of a dictionary
* ---------------------------
*
- * A compiled dictionary is stored in the IspellDict structure. Compilation of
- * a dictionary is divided into the several steps:
+ * A compiled dictionary is stored in the following structures:
+ * - IspellDictBuild - stores temporary data and IspellDictData
+ * - IspellDictData - stores permanent data used within NINormalizeWord()
+ * Compilation of the dictionary is divided into several steps:
* - NIImportDictionary() - stores each word of a .dict file in the
* temporary Spell field.
- * - NIImportAffixes() - stores affix rules of an .affix file in the
- * Affix field (not temporary) if an .affix file has the Ispell format.
+ * - NIImportAffixes() - stores affix rules of an .affix file in the temporary
+ * Affix field if an .affix file has the Ispell format.
* -> NIImportOOAffixes() - stores affix rules if an .affix file has the
* Hunspell format. The AffixData field is initialized if AF parameter
* is defined.
* - NISortDictionary() - builds a prefix tree (Trie) from the words list
- * and stores it in the Dictionary field. The words list is got from the
+ * and stores it in the DictNodes field. The words list is got from the
* Spell field. The AffixData field is initialized if AF parameter is not
* defined.
* - NISortAffixes():
* - builds a list of compound affixes from the affix list and stores it
* in the CompoundAffix.
* - builds prefix trees (Trie) from the affix list for prefixes and suffixes
- * and stores them in Suffix and Prefix fields.
+ * and stores them in SuffixNodes and PrefixNodes fields.
* The affix list is got from the Affix field.
+ * Persistent data of the dictionary is copied within NICopyData().
*
* Memory management
* -----------------
*
- * The IspellDict structure has the Spell field which is used only in compile
- * time. The Spell field stores a words list. It can take a lot of memory.
- * Therefore when a dictionary is compiled this field is cleared by
- * NIFinishBuild().
+ * The IspellDictBuild structure holds temporary data which is used only at
+ * compile time. It can take a lot of memory. Therefore, after the dictionary
+ * is compiled, this data is released by NIFinishBuild().
*
* All resources which should cleared by NIFinishBuild() is initialized using
* tmpalloc() and tmpalloc0().
@@ -73,112 +75,166 @@
* after the initialization is done. During initialization,
* CurrentMemoryContext is the long-lived memory context associated
* with the dictionary cache entry. We keep the short-lived stuff
- * in the Conf->buildCxt context.
+ * in the ConfBuild->buildCxt context.
*/
-#define tmpalloc(sz) MemoryContextAlloc(Conf->buildCxt, (sz))
-#define tmpalloc0(sz) MemoryContextAllocZero(Conf->buildCxt, (sz))
+#define tmpalloc(sz) MemoryContextAlloc(ConfBuild->buildCxt, (sz))
+#define tmpalloc0(sz) MemoryContextAllocZero(ConfBuild->buildCxt, (sz))
-#define tmpstrdup(str) MemoryContextStrdup(Conf->buildCxt, (str))
+#define tmpstrdup(str) MemoryContextStrdup(ConfBuild->buildCxt, (str))
/*
* Prepare for constructing an ISpell dictionary.
*
- * The IspellDict struct is assumed to be zeroed when allocated.
+ * The IspellDictBuild struct is assumed to be zeroed when allocated.
*/
void
-NIStartBuild(IspellDict *Conf)
+NIStartBuild(IspellDictBuild *ConfBuild)
{
+ uint32 dict_size;
+
/*
* The temp context is a child of CurTransactionContext, so that it will
* go away automatically on error.
*/
- Conf->buildCxt = AllocSetContextCreate(CurTransactionContext,
- "Ispell dictionary init context",
- ALLOCSET_DEFAULT_SIZES);
+ ConfBuild->buildCxt = AllocSetContextCreate(CurTransactionContext,
+ "Ispell dictionary init context",
+ ALLOCSET_DEFAULT_SIZES);
+
+ /*
+ * Allocate the dictionary buffer in the current context, not in buildCxt.
+ */
+ dict_size = MAXALIGN(IspellDictDataHdrSize);
+ ConfBuild->dict = palloc0(dict_size);
+ ConfBuild->dict_size = dict_size;
}
/*
- * Clean up when dictionary construction is complete.
+ * Copy compiled and persistent data into IspellDictData.
*/
void
-NIFinishBuild(IspellDict *Conf)
+NICopyData(IspellDictBuild *ConfBuild)
{
- /* Release no-longer-needed temp memory */
- MemoryContextDelete(Conf->buildCxt);
- /* Just for cleanliness, zero the now-dangling pointers */
- Conf->buildCxt = NULL;
- Conf->Spell = NULL;
- Conf->firstfree = NULL;
- Conf->CompoundAffixFlags = NULL;
-}
+ IspellDictData *dict;
+ uint32 size;
+ int i;
+ uint32 *offsets,
+ offset = 0;
+ SPNode *dict_node PG_USED_FOR_ASSERTS_ONLY;
+ AffixNode *aff_node PG_USED_FOR_ASSERTS_ONLY;
+ /*
+ * Calculate necessary space
+ */
+ size = ConfBuild->nAffixData * sizeof(uint32);
+ size += ConfBuild->AffixDataEnd;
-/*
- * "Compact" palloc: allocate without extra palloc overhead.
- *
- * Since we have no need to free the ispell data items individually, there's
- * not much value in the per-chunk overhead normally consumed by palloc.
- * Getting rid of it is helpful since ispell can allocate a lot of small nodes.
- *
- * We currently pre-zero all data allocated this way, even though some of it
- * doesn't need that. The cpalloc and cpalloc0 macros are just documentation
- * to indicate which allocations actually require zeroing.
- */
-#define COMPACT_ALLOC_CHUNK 8192 /* amount to get from palloc at once */
-#define COMPACT_MAX_REQ 1024 /* must be < COMPACT_ALLOC_CHUNK */
+ size += ConfBuild->nAffix * sizeof(uint32);
+ size += ConfBuild->AffixSize;
-static void *
-compact_palloc0(IspellDict *Conf, size_t size)
-{
- void *result;
+ size += ConfBuild->DictNodes.NodesEnd;
+ size += ConfBuild->PrefixNodes.NodesEnd;
+ size += ConfBuild->SuffixNodes.NodesEnd;
- /* Should only be called during init */
- Assert(Conf->buildCxt != NULL);
+ size += sizeof(CMPDAffix) * ConfBuild->nCompoundAffix;
- /* No point in this for large chunks */
- if (size > COMPACT_MAX_REQ)
- return palloc0(size);
+ /*
+ * Copy data itself
+ */
+ ConfBuild->dict_size = IspellDictDataHdrSize + size;
+ ConfBuild->dict = repalloc(ConfBuild->dict, ConfBuild->dict_size);
+
+ dict = ConfBuild->dict;
+
+ /* AffixData */
+ dict->nAffixData = ConfBuild->nAffixData;
+ dict->AffixDataStart = sizeof(uint32) * ConfBuild->nAffixData;
+ memcpy(DictAffixDataOffset(dict), ConfBuild->AffixDataOffset,
+ sizeof(uint32) * ConfBuild->nAffixData);
+ memcpy(DictAffixData(dict), ConfBuild->AffixData, ConfBuild->AffixDataEnd);
+
+ /* Affix array */
+ dict->nAffix = ConfBuild->nAffix;
+ dict->AffixOffsetStart = dict->AffixDataStart + ConfBuild->AffixDataEnd;
+ dict->AffixStart = dict->AffixOffsetStart + sizeof(uint32) * ConfBuild->nAffix;
+ if (ConfBuild->nAffix > 0)
+ {
+ offsets = (uint32 *) DictAffixOffset(dict);
+ for (i = 0; i < ConfBuild->nAffix; i++)
+ {
+ AFFIX *affix;
+ uint32 size = AffixGetSize(ConfBuild->Affix[i]);
- /* Keep everything maxaligned */
- size = MAXALIGN(size);
+ offsets[i] = offset;
+ affix = (AFFIX *) DictAffixGet(dict, i);
+ Assert(affix);
- /* Need more space? */
- if (size > Conf->avail)
- {
- Conf->firstfree = palloc0(COMPACT_ALLOC_CHUNK);
- Conf->avail = COMPACT_ALLOC_CHUNK;
- }
+ memcpy(affix, ConfBuild->Affix[i], size);
- result = (void *) Conf->firstfree;
- Conf->firstfree += size;
- Conf->avail -= size;
+ offset += size;
+ }
+ }
- return result;
+ /* DictNodes prefix tree */
+ dict->DictNodesStart = dict->AffixStart + offset;
+ /* We have at least one root node even if dictionary list is empty */
+ dict_node = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, 0);
+ Assert(dict_node && dict_node->length > 0);
+ /* Copy dictionary nodes into persistent structure */
+ memcpy(DictDictNodes(dict), ConfBuild->DictNodes.Nodes,
+ ConfBuild->DictNodes.NodesEnd);
+
+ /* PrefixNodes prefix tree */
+ dict->PrefixNodesStart = dict->DictNodesStart + ConfBuild->DictNodes.NodesEnd;
+ /* We have at least one root node even if prefix list is empty */
+ aff_node = (AffixNode *) NodeArrayGet(&ConfBuild->PrefixNodes, 0);
+ Assert(aff_node && aff_node->length > 0);
+ /* Copy prefix nodes into persistent structure */
+ memcpy(DictPrefixNodes(dict), ConfBuild->PrefixNodes.Nodes,
+ ConfBuild->PrefixNodes.NodesEnd);
+
+ /* SuffixNodes prefix tree */
+ dict->SuffixNodesStart = dict->PrefixNodesStart + ConfBuild->PrefixNodes.NodesEnd;
+ /* We have at least one root node even if suffix list is empty */
+ aff_node = (AffixNode *) NodeArrayGet(&ConfBuild->SuffixNodes, 0);
+ Assert(aff_node && aff_node->length > 0);
+ /* Copy suffix nodes into persistent structure */
+ memcpy(DictSuffixNodes(dict), ConfBuild->SuffixNodes.Nodes,
+ ConfBuild->SuffixNodes.NodesEnd);
+
+ /* CompoundAffix array */
+ dict->CompoundAffixStart = dict->SuffixNodesStart +
+ ConfBuild->SuffixNodes.NodesEnd;
+ /* We have at least one CompoundAffix terminating entry */
+ Assert(ConfBuild->nCompoundAffix > 0);
+ /* Copy array of compound affixes into persistent structure */
+ memcpy(DictCompoundAffix(dict), ConfBuild->CompoundAffix,
+ sizeof(CMPDAffix) * ConfBuild->nCompoundAffix);
}
-#define cpalloc(size) compact_palloc0(Conf, size)
-#define cpalloc0(size) compact_palloc0(Conf, size)
-
-static char *
-cpstrdup(IspellDict *Conf, const char *str)
+/*
+ * Clean up when dictionary construction is complete.
+ */
+void
+NIFinishBuild(IspellDictBuild *ConfBuild)
{
- char *res = cpalloc(strlen(str) + 1);
-
- strcpy(res, str);
- return res;
+ /* Release no-longer-needed temp memory */
+ MemoryContextDelete(ConfBuild->buildCxt);
+ /* Just for cleanliness, zero the now-dangling pointers */
+ ConfBuild->buildCxt = NULL;
+ ConfBuild->Spell = NULL;
+ ConfBuild->CompoundAffixFlags = NULL;
}
-
/*
* Apply lowerstr(), producing a temporary result (in the buildCxt).
*/
static char *
-lowerstr_ctx(IspellDict *Conf, const char *src)
+lowerstr_ctx(IspellDictBuild *ConfBuild, const char *src)
{
MemoryContext saveCtx;
char *dst;
- saveCtx = MemoryContextSwitchTo(Conf->buildCxt);
+ saveCtx = MemoryContextSwitchTo(ConfBuild->buildCxt);
dst = lowerstr(src);
MemoryContextSwitchTo(saveCtx);
@@ -190,7 +246,7 @@ lowerstr_ctx(IspellDict *Conf, const char *src)
#define STRNCMP(s,p) strncmp( (s), (p), strlen(p) )
#define GETWCHAR(W,L,N,T) ( ((const uint8*)(W))[ ((T)==FF_PREFIX) ? (N) : ( (L) - 1 - (N) ) ] )
-#define GETCHAR(A,N,T) GETWCHAR( (A)->repl, (A)->replen, N, T )
+#define GETCHAR(A,N,T) GETWCHAR( AffixFieldRepl(A), (A)->replen, N, T )
static char *VoidString = "";
@@ -311,18 +367,189 @@ strbncmp(const unsigned char *s1, const unsigned char *s2, size_t count)
static int
cmpaffix(const void *s1, const void *s2)
{
- const AFFIX *a1 = (const AFFIX *) s1;
- const AFFIX *a2 = (const AFFIX *) s2;
+ const AFFIX *a1 = *((AFFIX *const *) s1);
+ const AFFIX *a2 = *((AFFIX *const *) s2);
if (a1->type < a2->type)
return -1;
if (a1->type > a2->type)
return 1;
if (a1->type == FF_PREFIX)
- return strcmp(a1->repl, a2->repl);
+ return strcmp(AffixFieldRepl(a1), AffixFieldRepl(a2));
else
- return strbcmp((const unsigned char *) a1->repl,
- (const unsigned char *) a2->repl);
+ return strbcmp((const unsigned char *) AffixFieldRepl(a1),
+ (const unsigned char *) AffixFieldRepl(a2));
+}
+
+/*
+ * Allocate space for AffixData.
+ */
+static void
+InitAffixData(IspellDictBuild *ConfBuild, int numAffixData)
+{
+ uint32 size;
+
+ size = 8 * 1024; /* Reserve 8KB for data */
+
+ ConfBuild->AffixData = (char *) tmpalloc(size);
+ ConfBuild->AffixDataSize = size;
+ ConfBuild->AffixDataOffset = (uint32 *) tmpalloc(numAffixData * sizeof(uint32));
+ ConfBuild->nAffixData = 0;
+ ConfBuild->mAffixData = numAffixData;
+
+ /* Save offset of the end of data */
+ ConfBuild->AffixDataEnd = 0;
+}
+
+/*
+ * Add a set of affix flags to the AffixData buffer of the IspellDictBuild
+ * struct. If the buffer cannot fit the new affix set, it is enlarged.
+ *
+ * ConfBuild: building structure for the current dictionary.
+ * AffixSet: set of affix flags.
+ */
+static void
+AddAffixSet(IspellDictBuild *ConfBuild, const char *AffixSet,
+ uint32 AffixSetLen)
+{
+ /*
+ * Check available space for AffixSet.
+ */
+ if (ConfBuild->AffixDataEnd + AffixSetLen + 1 /* \0 */ >=
+ ConfBuild->AffixDataSize)
+ {
+ uint32 newsize = Max(ConfBuild->AffixDataSize + 8 * 1024 /* 8KB */,
+ ConfBuild->AffixDataSize + AffixSetLen + 1);
+
+ ConfBuild->AffixData = (char *) repalloc(ConfBuild->AffixData, newsize);
+ ConfBuild->AffixDataSize = newsize;
+ }
+
+ /* Check available number of offsets */
+ if (ConfBuild->nAffixData >= ConfBuild->mAffixData)
+ {
+ ConfBuild->mAffixData *= 2;
+ ConfBuild->AffixDataOffset = (uint32 *) repalloc(ConfBuild->AffixDataOffset,
+ sizeof(uint32) * ConfBuild->mAffixData);
+ }
+
+ ConfBuild->AffixDataOffset[ConfBuild->nAffixData] = ConfBuild->AffixDataEnd;
+ StrNCpy(AffixDataGet(ConfBuild, ConfBuild->nAffixData),
+ AffixSet, AffixSetLen + 1);
+
+ /* Save offset of the end of data */
+ ConfBuild->AffixDataEnd += AffixSetLen + 1;
+ ConfBuild->nAffixData++;
+}
+
+/*
+ * Allocate space for prefix tree node.
+ *
+ * ConfBuild: building structure for the current dictionary.
+ * array: NodeArray where to allocate new node.
+ * length: number of allocated NodeData.
+ * sizeNodeData: minimum size of each NodeData.
+ * sizeNodeHeader: size of header of new node.
+ *
+ * Returns an offset of new node in NodeArray->Nodes.
+ */
+static uint32
+AllocateNode(IspellDictBuild *ConfBuild, NodeArray *array, uint32 length,
+ uint32 sizeNodeData, uint32 sizeNodeHeader)
+{
+ uint32 node_offset;
+ uint32 size;
+
+ size = sizeNodeHeader + length * sizeNodeData;
+ size = MAXALIGN(size);
+
+ if (array->NodesSize == 0)
+ {
+ array->NodesSize = size * 32; /* Reserve space for next levels of the
+ * prefix tree */
+ array->Nodes = (char *) tmpalloc(array->NodesSize);
+ array->NodesEnd = 0;
+ }
+ else if (array->NodesEnd + size >= array->NodesSize)
+ {
+ array->NodesSize = Max(array->NodesSize * 2, array->NodesSize + size);
+ array->Nodes = (char *) repalloc(array->Nodes, array->NodesSize);
+ }
+
+ node_offset = array->NodesEnd;
+ array->NodesEnd += size;
+
+ return node_offset;
+}
+
+/*
+ * Allocate space for SPNode.
+ *
+ * Returns the offset of the new node in ConfBuild->DictNodes.Nodes.
+ */
+static uint32
+AllocateSPNode(IspellDictBuild *ConfBuild, uint32 length)
+{
+ uint32 offset;
+ SPNode *node;
+ SPNodeData *data;
+ uint32 i;
+
+ offset = AllocateNode(ConfBuild, &ConfBuild->DictNodes, length,
+ sizeof(SPNodeData), SPNHDRSZ);
+ node = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, offset);
+ node->length = length;
+
+ /*
+ * Initialize all SPNodeData with default values. We cannot use memset()
+ * here because not all fields have 0 as default value.
+ */
+ for (i = 0; i < length; i++)
+ {
+ data = &(node->data[i]);
+ data->val = 0;
+ data->affix = ISPELL_INVALID_INDEX;
+ data->compoundflag = 0;
+ data->isword = 0;
+ data->node_offset = ISPELL_INVALID_OFFSET;
+ }
+
+ return offset;
+}
+
+/*
+ * Allocate space for AffixNode.
+ *
+ * Returns an offset of new node in NodeArray->Nodes.
+ */
+static uint32
+AllocateAffixNode(IspellDictBuild *ConfBuild, NodeArray *array, uint32 length)
+{
+ uint32 offset;
+ AffixNode *node;
+ AffixNodeData *data;
+ uint32 i;
+
+ offset = AllocateNode(ConfBuild, array, length, sizeof(AffixNodeData),
+ ANHRDSZ);
+ node = (AffixNode *) NodeArrayGet(array, offset);
+ node->length = length;
+ node->isvoid = 0;
+
+ /*
+ * Initialize all AffixNodeData with default values. We cannot use memset()
+ * here because not all fields have 0 as default value.
+ */
+ for (i = 0; i < length; i++)
+ {
+ data = &(node->data[i]);
+ data->val = 0;
+ data->affstart = ISPELL_INVALID_INDEX;
+ data->affend = ISPELL_INVALID_INDEX;
+ data->node_offset = ISPELL_INVALID_OFFSET;
+ }
+
+ return offset;
}
/*
@@ -333,7 +560,7 @@ cmpaffix(const void *s1, const void *s2)
* - 2 characters (FM_LONG). A character may be Unicode.
* - numbers from 1 to 65000 (FM_NUM).
*
- * Depending on the flagMode an affix string can have the following format:
+ * Depending on the flagmode an affix string can have the following format:
* - FM_CHAR: ABCD
* Here we have 4 flags: A, B, C and D
* - FM_LONG: ABCDE*
@@ -341,13 +568,13 @@ cmpaffix(const void *s1, const void *s2)
* - FM_NUM: 200,205,50
* Here we have 3 flags: 200, 205 and 50
*
- * Conf: current dictionary.
+ * flagmode: flag mode of the dictionary
* sflagset: the set of affix flags. Returns a reference to the start of a next
* affix flag.
* sflag: returns an affix flag from sflagset.
*/
static void
-getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
+getNextFlagFromString(FlagMode flagmode, char **sflagset, char *sflag)
{
int32 s;
char *next,
@@ -356,11 +583,11 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
bool stop = false;
bool met_comma = false;
- maxstep = (Conf->flagMode == FM_LONG) ? 2 : 1;
+ maxstep = (flagmode == FM_LONG) ? 2 : 1;
while (**sflagset)
{
- switch (Conf->flagMode)
+ switch (flagmode)
{
case FM_LONG:
case FM_CHAR:
@@ -422,15 +649,15 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
stop = true;
break;
default:
- elog(ERROR, "unrecognized type of Conf->flagMode: %d",
- Conf->flagMode);
+ elog(ERROR, "unrecognized type of flagmode: %d",
+ flagmode);
}
if (stop)
break;
}
- if (Conf->flagMode == FM_LONG && maxstep > 0)
+ if (flagmode == FM_LONG && maxstep > 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix flag \"%s\" with \"long\" flag value",
@@ -440,31 +667,28 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
}
/*
- * Checks if the affix set Conf->AffixData[affix] contains affixflag.
- * Conf->AffixData[affix] does not contain affixflag if this flag is not used
- * actually by the .dict file.
+ * Checks if the affix set from AffixData contains affixflag. Affix set does
+ * not contain affixflag if this flag is not used actually by the .dict file.
*
- * Conf: current dictionary.
- * affix: index of the Conf->AffixData array.
+ * flagmode: flag mode of the dictionary.
+ * sflagset: the set of affix flags.
* affixflag: the affix flag.
*
- * Returns true if the string Conf->AffixData[affix] contains affixflag,
- * otherwise returns false.
+ * Returns true if the affix set string contains affixflag, otherwise returns
+ * false.
*/
static bool
-IsAffixFlagInUse(IspellDict *Conf, int affix, const char *affixflag)
+IsAffixFlagInUse(FlagMode flagmode, char *sflagset, const char *affixflag)
{
- char *flagcur;
+ char *flagcur = sflagset;
char flag[BUFSIZ];
if (*affixflag == 0)
return true;
- flagcur = Conf->AffixData[affix];
-
while (*flagcur)
{
- getNextFlagFromString(Conf, &flagcur, flag);
+ getNextFlagFromString(flagmode, &flagcur, flag);
/* Compare first affix flag in flagcur with affixflag */
if (strcmp(flag, affixflag) == 0)
return true;
@@ -477,31 +701,33 @@ IsAffixFlagInUse(IspellDict *Conf, int affix, const char *affixflag)
/*
* Adds the new word into the temporary array Spell.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* word: new word.
* flag: set of affix flags. Single flag can be get by getNextFlagFromString().
*/
static void
-NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
+NIAddSpell(IspellDictBuild *ConfBuild, const char *word, const char *flag)
{
- if (Conf->nspell >= Conf->mspell)
+ if (ConfBuild->nSpell >= ConfBuild->mSpell)
{
- if (Conf->mspell)
+ if (ConfBuild->mSpell)
{
- Conf->mspell *= 2;
- Conf->Spell = (SPELL **) repalloc(Conf->Spell, Conf->mspell * sizeof(SPELL *));
+ ConfBuild->mSpell *= 2;
+ ConfBuild->Spell = (SPELL **) repalloc(ConfBuild->Spell,
+ ConfBuild->mSpell * sizeof(SPELL *));
}
else
{
- Conf->mspell = 1024 * 20;
- Conf->Spell = (SPELL **) tmpalloc(Conf->mspell * sizeof(SPELL *));
+ ConfBuild->mSpell = 1024 * 20;
+ ConfBuild->Spell = (SPELL **) tmpalloc(ConfBuild->mSpell * sizeof(SPELL *));
}
}
- Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
- strcpy(Conf->Spell[Conf->nspell]->word, word);
- Conf->Spell[Conf->nspell]->p.flag = (*flag != '\0')
+ ConfBuild->Spell[ConfBuild->nSpell] =
+ (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
+ strcpy(ConfBuild->Spell[ConfBuild->nSpell]->word, word);
+ ConfBuild->Spell[ConfBuild->nSpell]->p.flag = (*flag != '\0')
? tmpstrdup(flag) : VoidString;
- Conf->nspell++;
+ ConfBuild->nSpell++;
}
/*
@@ -509,11 +735,11 @@ NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
*
* Note caller must already have applied get_tsearch_config_filename.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* filename: path to the .dict file.
*/
void
-NIImportDictionary(IspellDict *Conf, const char *filename)
+NIImportDictionary(IspellDictBuild *ConfBuild, const char *filename)
{
tsearch_readline_state trst;
char *line;
@@ -564,9 +790,9 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
}
s += pg_mblen(s);
}
- pstr = lowerstr_ctx(Conf, line);
+ pstr = lowerstr_ctx(ConfBuild, line);
- NIAddSpell(Conf, pstr, flag);
+ NIAddSpell(ConfBuild, pstr, flag);
pfree(pstr);
pfree(line);
@@ -590,7 +816,7 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
* SFX M 0 's .
* is presented here.
*
- * Conf: current dictionary.
+ * dict: current dictionary.
* word: basic form of word.
* affixflag: affix flag, by which a basic form of word was generated.
* flag: compound flag used to compare with StopMiddle->compoundflag.
@@ -598,9 +824,9 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
* Returns 1 if the word was found in the prefix tree, else returns 0.
*/
static int
-FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
+FindWord(IspellDictData *dict, const char *word, const char *affixflag, int flag)
{
- SPNode *node = Conf->Dictionary;
+ SPNode *node = (SPNode *) DictDictNodes(dict);
SPNodeData *StopLow,
*StopHigh,
*StopMiddle;
@@ -636,10 +862,14 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
* Check if this affix rule is presented in the affix set
* with index StopMiddle->affix.
*/
- if (IsAffixFlagInUse(Conf, StopMiddle->affix, affixflag))
+ if (IsAffixFlagInUse(dict->flagMode,
+ DictAffixDataGet(dict, StopMiddle->affix),
+ affixflag))
return 1;
}
- node = StopMiddle->node;
+ /* Retrieve SPNode by the offset */
+ node = (SPNode *) DictNodeGet(DictDictNodes(dict),
+ StopMiddle->node_offset);
ptr++;
break;
}
@@ -657,7 +887,8 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
/*
* Adds a new affix rule to the Affix field.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary, is used to allocate
+ * temporary data.
* flag: affix flag ('\' in the below example).
* flagflags: set of flags from the flagval field for this affix rule. This set
* is listed after '/' character in the added string (repl).
@@ -673,26 +904,54 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
* type: FF_SUFFIX or FF_PREFIX.
*/
static void
-NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
- const char *find, const char *repl, int type)
+NIAddAffix(IspellDictBuild *ConfBuild, const char *flag, char flagflags,
+ const char *mask, const char *find, const char *repl, int type)
{
AFFIX *Affix;
+ uint32 size;
+ uint32 flaglen = strlen(flag),
+ findlen = strlen(find),
+ repllen = strlen(repl),
+ masklen = strlen(mask);
+
+ /* Sanity checks */
+ if (flaglen > AF_FLAG_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix flag \"%s\" too long", flag)));
+ if (findlen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix find field \"%s\" too long", find)));
+ if (repllen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix repl field \"%s\" too long", repl)));
+ if (masklen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix mask field \"%s\" too long", mask)));
- if (Conf->naffixes >= Conf->maffixes)
+ if (ConfBuild->nAffix >= ConfBuild->mAffix)
{
- if (Conf->maffixes)
+ if (ConfBuild->mAffix)
{
- Conf->maffixes *= 2;
- Conf->Affix = (AFFIX *) repalloc((void *) Conf->Affix, Conf->maffixes * sizeof(AFFIX));
+ ConfBuild->mAffix *= 2;
+ ConfBuild->Affix = (AFFIX **) repalloc(ConfBuild->Affix,
+ ConfBuild->mAffix * sizeof(AFFIX *));
}
else
{
- Conf->maffixes = 16;
- Conf->Affix = (AFFIX *) palloc(Conf->maffixes * sizeof(AFFIX));
+ ConfBuild->mAffix = 255;
+ ConfBuild->Affix = (AFFIX **) tmpalloc(ConfBuild->mAffix * sizeof(AFFIX *));
}
}
- Affix = Conf->Affix + Conf->naffixes;
+ size = AFFIXHDRSZ + flaglen + 1 /* \0 */ + findlen + 1 /* \0 */ +
+ repllen + 1 /* \0 */ + masklen + 1 /* \0 */;
+
+ Affix = (AFFIX *) tmpalloc(size);
+ ConfBuild->Affix[ConfBuild->nAffix] = Affix;
/* This affix rule can be applied for words with any ending */
if (strcmp(mask, ".") == 0 || *mask == '\0')
@@ -705,42 +964,12 @@ NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
{
Affix->issimple = 0;
Affix->isregis = 1;
- RS_compile(&(Affix->reg.regis), (type == FF_SUFFIX),
- *mask ? mask : VoidString);
}
/* This affix rule will use regex_t to search word ending */
else
{
- int masklen;
- int wmasklen;
- int err;
- pg_wchar *wmask;
- char *tmask;
-
Affix->issimple = 0;
Affix->isregis = 0;
- tmask = (char *) tmpalloc(strlen(mask) + 3);
- if (type == FF_SUFFIX)
- sprintf(tmask, "%s$", mask);
- else
- sprintf(tmask, "^%s", mask);
-
- masklen = strlen(tmask);
- wmask = (pg_wchar *) tmpalloc((masklen + 1) * sizeof(pg_wchar));
- wmasklen = pg_mb2wchar_with_len(tmask, wmask, masklen);
-
- err = pg_regcomp(&(Affix->reg.regex), wmask, wmasklen,
- REG_ADVANCED | REG_NOSUB,
- DEFAULT_COLLATION_OID);
- if (err)
- {
- char errstr[100];
-
- pg_regerror(err, &(Affix->reg.regex), errstr, sizeof(errstr));
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
- errmsg("invalid regular expression: %s", errstr)));
- }
}
Affix->flagflags = flagflags;
@@ -749,15 +978,22 @@ NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
if ((Affix->flagflags & FF_COMPOUNDFLAG) == 0)
Affix->flagflags |= FF_COMPOUNDFLAG;
}
- Affix->flag = cpstrdup(Conf, flag);
+
Affix->type = type;
- Affix->find = (find && *find) ? cpstrdup(Conf, find) : VoidString;
- if ((Affix->replen = strlen(repl)) > 0)
- Affix->repl = cpstrdup(Conf, repl);
- else
- Affix->repl = VoidString;
- Conf->naffixes++;
+ Affix->replen = repllen;
+ StrNCpy(AffixFieldRepl(Affix), repl, repllen + 1);
+
+ Affix->findlen = findlen;
+ StrNCpy(AffixFieldFind(Affix), find, findlen + 1);
+
+ Affix->masklen = masklen;
+ StrNCpy(AffixFieldMask(Affix), mask, masklen + 1);
+
+ StrNCpy(AffixFieldFlag(Affix), flag, flaglen + 1);
+
+ ConfBuild->nAffix++;
+ ConfBuild->AffixSize += size;
}
/* Parsing states for parse_affentry() and friends */
@@ -1021,10 +1257,10 @@ parse_affentry(char *str, char *mask, char *find, char *repl)
* Sets a Hunspell options depending on flag type.
*/
static void
-setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
+setCompoundAffixFlagValue(IspellDictBuild *ConfBuild, CompoundAffixFlag *entry,
char *s, uint32 val)
{
- if (Conf->flagMode == FM_NUM)
+ if (ConfBuild->dict->flagMode == FM_NUM)
{
char *next;
int i;
@@ -1044,19 +1280,19 @@ setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
else
entry->flag.s = tmpstrdup(s);
- entry->flagMode = Conf->flagMode;
+ entry->flagMode = ConfBuild->dict->flagMode;
entry->value = val;
}
/*
* Sets up a correspondence for the affix parameter with the affix flag.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* s: affix flag in string.
* val: affix parameter.
*/
static void
-addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
+addCompoundAffixFlagValue(IspellDictBuild *ConfBuild, char *s, uint32 val)
{
CompoundAffixFlag *newValue;
char sbuf[BUFSIZ];
@@ -1083,29 +1319,29 @@ addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
*sflag = '\0';
/* Resize array or allocate memory for array CompoundAffixFlag */
- if (Conf->nCompoundAffixFlag >= Conf->mCompoundAffixFlag)
+ if (ConfBuild->nCompoundAffixFlag >= ConfBuild->mCompoundAffixFlag)
{
- if (Conf->mCompoundAffixFlag)
+ if (ConfBuild->mCompoundAffixFlag)
{
- Conf->mCompoundAffixFlag *= 2;
- Conf->CompoundAffixFlags = (CompoundAffixFlag *)
- repalloc((void *) Conf->CompoundAffixFlags,
- Conf->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
+ ConfBuild->mCompoundAffixFlag *= 2;
+ ConfBuild->CompoundAffixFlags = (CompoundAffixFlag *)
+ repalloc((void *) ConfBuild->CompoundAffixFlags,
+ ConfBuild->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
}
else
{
- Conf->mCompoundAffixFlag = 10;
- Conf->CompoundAffixFlags = (CompoundAffixFlag *)
- tmpalloc(Conf->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
+ ConfBuild->mCompoundAffixFlag = 10;
+ ConfBuild->CompoundAffixFlags = (CompoundAffixFlag *)
+ tmpalloc(ConfBuild->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
}
}
- newValue = Conf->CompoundAffixFlags + Conf->nCompoundAffixFlag;
+ newValue = ConfBuild->CompoundAffixFlags + ConfBuild->nCompoundAffixFlag;
- setCompoundAffixFlagValue(Conf, newValue, sbuf, val);
+ setCompoundAffixFlagValue(ConfBuild, newValue, sbuf, val);
- Conf->usecompound = true;
- Conf->nCompoundAffixFlag++;
+ ConfBuild->dict->usecompound = true;
+ ConfBuild->nCompoundAffixFlag++;
}
/*
@@ -1113,7 +1349,7 @@ addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
* flags s.
*/
static int
-getCompoundAffixFlagValue(IspellDict *Conf, char *s)
+getCompoundAffixFlagValue(IspellDictBuild *ConfBuild, char *s)
{
uint32 flag = 0;
CompoundAffixFlag *found,
@@ -1121,18 +1357,18 @@ getCompoundAffixFlagValue(IspellDict *Conf, char *s)
char sflag[BUFSIZ];
char *flagcur;
- if (Conf->nCompoundAffixFlag == 0)
+ if (ConfBuild->nCompoundAffixFlag == 0)
return 0;
flagcur = s;
while (*flagcur)
{
- getNextFlagFromString(Conf, &flagcur, sflag);
- setCompoundAffixFlagValue(Conf, &key, sflag, 0);
+ getNextFlagFromString(ConfBuild->dict->flagMode, &flagcur, sflag);
+ setCompoundAffixFlagValue(ConfBuild, &key, sflag, 0);
found = (CompoundAffixFlag *)
- bsearch(&key, (void *) Conf->CompoundAffixFlags,
- Conf->nCompoundAffixFlag, sizeof(CompoundAffixFlag),
+ bsearch(&key, (void *) ConfBuild->CompoundAffixFlags,
+ ConfBuild->nCompoundAffixFlag, sizeof(CompoundAffixFlag),
cmpcmdflag);
if (found != NULL)
flag |= found->value;
@@ -1144,14 +1380,13 @@ getCompoundAffixFlagValue(IspellDict *Conf, char *s)
/*
* Returns a flag set using the s parameter.
*
- * If Conf->useFlagAliases is true then the s parameter is index of the
- * Conf->AffixData array and function returns its entry.
- * Else function returns the s parameter.
+ * If useFlagAliases is true then s is an index into the AffixData array and
+ * the function returns that entry. Otherwise the function returns s itself.
*/
static char *
-getAffixFlagSet(IspellDict *Conf, char *s)
+getAffixFlagSet(IspellDictBuild *ConfBuild, char *s)
{
- if (Conf->useFlagAliases && *s != '\0')
+ if (ConfBuild->dict->useFlagAliases && *s != '\0')
{
int curaffix;
char *end;
@@ -1162,13 +1397,13 @@ getAffixFlagSet(IspellDict *Conf, char *s)
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix alias \"%s\"", s)));
- if (curaffix > 0 && curaffix <= Conf->nAffixData)
+ if (curaffix > 0 && curaffix <= ConfBuild->nAffixData)
/*
* Do not subtract 1 from curaffix because empty string was added
* in NIImportOOAffixes
*/
- return Conf->AffixData[curaffix];
+ return AffixDataGet(ConfBuild, curaffix);
else
return VoidString;
}
@@ -1179,11 +1414,11 @@ getAffixFlagSet(IspellDict *Conf, char *s)
/*
* Import an affix file that follows MySpell or Hunspell format.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* filename: path to the .affix file.
*/
static void
-NIImportOOAffixes(IspellDict *Conf, const char *filename)
+NIImportOOAffixes(IspellDictBuild *ConfBuild, const char *filename)
{
char type[BUFSIZ],
*ptype = NULL;
@@ -1203,9 +1438,9 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
char *recoded;
/* read file to find any flag */
- Conf->usecompound = false;
- Conf->useFlagAliases = false;
- Conf->flagMode = FM_CHAR;
+ ConfBuild->dict->usecompound = false;
+ ConfBuild->dict->useFlagAliases = false;
+ ConfBuild->dict->flagMode = FM_CHAR;
if (!tsearch_readline_begin(&trst, filename))
ereport(ERROR,
@@ -1222,30 +1457,36 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
}
if (STRNCMP(recoded, "COMPOUNDFLAG") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDFLAG"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDFLAG"),
FF_COMPOUNDFLAG);
else if (STRNCMP(recoded, "COMPOUNDBEGIN") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDBEGIN"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDBEGIN"),
FF_COMPOUNDBEGIN);
else if (STRNCMP(recoded, "COMPOUNDLAST") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDLAST"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDLAST"),
FF_COMPOUNDLAST);
/* COMPOUNDLAST and COMPOUNDEND are synonyms */
else if (STRNCMP(recoded, "COMPOUNDEND") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDEND"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDEND"),
FF_COMPOUNDLAST);
else if (STRNCMP(recoded, "COMPOUNDMIDDLE") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDMIDDLE"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDMIDDLE"),
FF_COMPOUNDMIDDLE);
else if (STRNCMP(recoded, "ONLYINCOMPOUND") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("ONLYINCOMPOUND"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("ONLYINCOMPOUND"),
FF_COMPOUNDONLY);
else if (STRNCMP(recoded, "COMPOUNDPERMITFLAG") == 0)
- addCompoundAffixFlagValue(Conf,
+ addCompoundAffixFlagValue(ConfBuild,
recoded + strlen("COMPOUNDPERMITFLAG"),
FF_COMPOUNDPERMITFLAG);
else if (STRNCMP(recoded, "COMPOUNDFORBIDFLAG") == 0)
- addCompoundAffixFlagValue(Conf,
+ addCompoundAffixFlagValue(ConfBuild,
recoded + strlen("COMPOUNDFORBIDFLAG"),
FF_COMPOUNDFORBIDFLAG);
else if (STRNCMP(recoded, "FLAG") == 0)
@@ -1258,9 +1499,9 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
if (*s)
{
if (STRNCMP(s, "long") == 0)
- Conf->flagMode = FM_LONG;
+ ConfBuild->dict->flagMode = FM_LONG;
else if (STRNCMP(s, "num") == 0)
- Conf->flagMode = FM_NUM;
+ ConfBuild->dict->flagMode = FM_NUM;
else if (STRNCMP(s, "default") != 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
@@ -1274,8 +1515,8 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
}
tsearch_readline_end(&trst);
- if (Conf->nCompoundAffixFlag > 1)
- qsort((void *) Conf->CompoundAffixFlags, Conf->nCompoundAffixFlag,
+ if (ConfBuild->nCompoundAffixFlag > 1)
+ qsort((void *) ConfBuild->CompoundAffixFlags, ConfBuild->nCompoundAffixFlag,
sizeof(CompoundAffixFlag), cmpcmdflag);
if (!tsearch_readline_begin(&trst, filename))
@@ -1295,15 +1536,15 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
if (ptype)
pfree(ptype);
- ptype = lowerstr_ctx(Conf, type);
+ ptype = lowerstr_ctx(ConfBuild, type);
/* First try to parse AF parameter (alias compression) */
if (STRNCMP(ptype, "af") == 0)
{
/* First line is the number of aliases */
- if (!Conf->useFlagAliases)
+ if (!ConfBuild->dict->useFlagAliases)
{
- Conf->useFlagAliases = true;
+ ConfBuild->dict->useFlagAliases = true;
naffix = atoi(sflag);
if (naffix <= 0)
ereport(ERROR,
@@ -1313,11 +1554,10 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
/* Also reserve place for empty flag set */
naffix++;
- Conf->AffixData = (char **) palloc0(naffix * sizeof(char *));
- Conf->lenAffixData = Conf->nAffixData = naffix;
+ InitAffixData(ConfBuild, naffix);
/* Add empty flag set into AffixData */
- Conf->AffixData[curaffix] = VoidString;
+ AddAffixSet(ConfBuild, VoidString, 0);
curaffix++;
}
/* Other lines are aliases */
@@ -1325,7 +1565,7 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
{
if (curaffix < naffix)
{
- Conf->AffixData[curaffix] = cpstrdup(Conf, sflag);
+ AddAffixSet(ConfBuild, sflag, strlen(sflag));
curaffix++;
}
else
@@ -1343,8 +1583,8 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
sflaglen = strlen(sflag);
if (sflaglen == 0
- || (sflaglen > 1 && Conf->flagMode == FM_CHAR)
- || (sflaglen > 2 && Conf->flagMode == FM_LONG))
+ || (sflaglen > 1 && ConfBuild->dict->flagMode == FM_CHAR)
+ || (sflaglen > 2 && ConfBuild->dict->flagMode == FM_LONG))
goto nextline;
/*--------
@@ -1372,21 +1612,21 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
/* Get flags after '/' (flags are case sensitive) */
if ((ptr = strchr(repl, '/')) != NULL)
- aflg |= getCompoundAffixFlagValue(Conf,
- getAffixFlagSet(Conf,
+ aflg |= getCompoundAffixFlagValue(ConfBuild,
+ getAffixFlagSet(ConfBuild,
ptr + 1));
/* Get lowercased version of string before '/' */
- prepl = lowerstr_ctx(Conf, repl);
+ prepl = lowerstr_ctx(ConfBuild, repl);
if ((ptr = strchr(prepl, '/')) != NULL)
*ptr = '\0';
- pfind = lowerstr_ctx(Conf, find);
- pmask = lowerstr_ctx(Conf, mask);
+ pfind = lowerstr_ctx(ConfBuild, find);
+ pmask = lowerstr_ctx(ConfBuild, mask);
if (t_iseq(find, '0'))
*pfind = '\0';
if (t_iseq(repl, '0'))
*prepl = '\0';
- NIAddAffix(Conf, sflag, flagflags | aflg, pmask, pfind, prepl,
+ NIAddAffix(ConfBuild, sflag, flagflags | aflg, pmask, pfind, prepl,
isSuffix ? FF_SUFFIX : FF_PREFIX);
pfree(prepl);
pfree(pfind);
@@ -1412,7 +1652,7 @@ nextline:
* work to NIImportOOAffixes(), which will re-read the whole file.
*/
void
-NIImportAffixes(IspellDict *Conf, const char *filename)
+NIImportAffixes(IspellDictBuild *ConfBuild, const char *filename)
{
char *pstr = NULL;
char flag[BUFSIZ];
@@ -1433,9 +1673,9 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
errmsg("could not open affix file \"%s\": %m",
filename)));
- Conf->usecompound = false;
- Conf->useFlagAliases = false;
- Conf->flagMode = FM_CHAR;
+ ConfBuild->dict->usecompound = false;
+ ConfBuild->dict->useFlagAliases = false;
+ ConfBuild->dict->flagMode = FM_CHAR;
while ((recoded = tsearch_readline(&trst)) != NULL)
{
@@ -1457,10 +1697,8 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
s += pg_mblen(s);
if (*s && pg_mblen(s) == 1)
- {
- addCompoundAffixFlagValue(Conf, s, FF_COMPOUNDFLAG);
- Conf->usecompound = true;
- }
+ addCompoundAffixFlagValue(ConfBuild, s, FF_COMPOUNDFLAG);
+
oldformat = true;
goto nextline;
}
@@ -1533,7 +1771,8 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
if (!parse_affentry(pstr, mask, find, repl))
goto nextline;
- NIAddAffix(Conf, flag, flagflags, mask, find, repl, suffixes ? FF_SUFFIX : FF_PREFIX);
+ NIAddAffix(ConfBuild, flag, flagflags, mask, find, repl,
+ suffixes ? FF_SUFFIX : FF_PREFIX);
nextline:
pfree(recoded);
@@ -1552,53 +1791,48 @@ isnewformat:
errmsg("affix file contains both old-style and new-style commands")));
tsearch_readline_end(&trst);
- NIImportOOAffixes(Conf, filename);
+ NIImportOOAffixes(ConfBuild, filename);
}
/*
* Merges two affix flag sets and stores a new affix flag set into
- * Conf->AffixData.
+ * ConfBuild->AffixData.
*
* Returns index of a new affix flag set.
*/
static int
-MergeAffix(IspellDict *Conf, int a1, int a2)
+MergeAffix(IspellDictBuild *ConfBuild, int a1, int a2)
{
- char **ptr;
+ char *ptr;
+ uint32 len;
/* Do not merge affix flags if one of affix flags is empty */
- if (*Conf->AffixData[a1] == '\0')
+ if (*AffixDataGet(ConfBuild, a1) == '\0')
return a2;
- else if (*Conf->AffixData[a2] == '\0')
+ else if (*AffixDataGet(ConfBuild, a2) == '\0')
return a1;
- while (Conf->nAffixData + 1 >= Conf->lenAffixData)
- {
- Conf->lenAffixData *= 2;
- Conf->AffixData = (char **) repalloc(Conf->AffixData,
- sizeof(char *) * Conf->lenAffixData);
- }
-
- ptr = Conf->AffixData + Conf->nAffixData;
- if (Conf->flagMode == FM_NUM)
+ if (ConfBuild->dict->flagMode == FM_NUM)
{
- *ptr = cpalloc(strlen(Conf->AffixData[a1]) +
- strlen(Conf->AffixData[a2]) +
- 1 /* comma */ + 1 /* \0 */ );
- sprintf(*ptr, "%s,%s", Conf->AffixData[a1], Conf->AffixData[a2]);
+ len = strlen(AffixDataGet(ConfBuild, a1)) + 1 /* comma */ +
+ strlen(AffixDataGet(ConfBuild, a2));
+ ptr = tmpalloc(len + 1 /* \0 */);
+ sprintf(ptr, "%s,%s", AffixDataGet(ConfBuild, a1),
+ AffixDataGet(ConfBuild, a2));
}
else
{
- *ptr = cpalloc(strlen(Conf->AffixData[a1]) +
- strlen(Conf->AffixData[a2]) +
- 1 /* \0 */ );
- sprintf(*ptr, "%s%s", Conf->AffixData[a1], Conf->AffixData[a2]);
+ len = strlen(AffixDataGet(ConfBuild, a1)) +
+ strlen(AffixDataGet(ConfBuild, a2));
+ ptr = tmpalloc(len + 1 /* \0 */ );
+ sprintf(ptr, "%s%s", AffixDataGet(ConfBuild, a1),
+ AffixDataGet(ConfBuild, a2));
}
- ptr++;
- *ptr = NULL;
- Conf->nAffixData++;
- return Conf->nAffixData - 1;
+ AddAffixSet(ConfBuild, ptr, len);
+ pfree(ptr);
+
+ return ConfBuild->nAffixData - 1;
}
/*
@@ -1606,66 +1840,87 @@ MergeAffix(IspellDict *Conf, int a1, int a2)
* flags with the given index.
*/
static uint32
-makeCompoundFlags(IspellDict *Conf, int affix)
+makeCompoundFlags(IspellDictBuild *ConfBuild, int affix)
{
- char *str = Conf->AffixData[affix];
+ char *str = AffixDataGet(ConfBuild, affix);
- return (getCompoundAffixFlagValue(Conf, str) & FF_COMPOUNDFLAGMASK);
+ return (getCompoundAffixFlagValue(ConfBuild, str) & FF_COMPOUNDFLAGMASK);
}
/*
* Makes a prefix tree for the given level.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* low: lower index of the Conf->Spell array.
* high: upper index of the Conf->Spell array.
* level: current prefix tree level.
+ *
+ * Returns the offset of the new SPNode in DictNodes.
*/
-static SPNode *
-mkSPNode(IspellDict *Conf, int low, int high, int level)
+static uint32
+mkSPNode(IspellDictBuild *ConfBuild, int low, int high, int level)
{
int i;
int nchar = 0;
char lastchar = '\0';
+ uint32 rs_offset,
+ new_offset;
SPNode *rs;
SPNodeData *data;
+ int data_index = 0;
int lownew = low;
for (i = low; i < high; i++)
- if (Conf->Spell[i]->p.d.len > level && lastchar != Conf->Spell[i]->word[level])
+ if (ConfBuild->Spell[i]->p.d.len > level &&
+ lastchar != ConfBuild->Spell[i]->word[level])
{
nchar++;
- lastchar = Conf->Spell[i]->word[level];
+ lastchar = ConfBuild->Spell[i]->word[level];
}
if (!nchar)
- return NULL;
+ return ISPELL_INVALID_OFFSET;
- rs = (SPNode *) cpalloc0(SPNHDRSZ + nchar * sizeof(SPNodeData));
- rs->length = nchar;
+ rs_offset = AllocateSPNode(ConfBuild, nchar);
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
data = rs->data;
lastchar = '\0';
for (i = low; i < high; i++)
- if (Conf->Spell[i]->p.d.len > level)
+ if (ConfBuild->Spell[i]->p.d.len > level)
{
- if (lastchar != Conf->Spell[i]->word[level])
+ if (lastchar != ConfBuild->Spell[i]->word[level])
{
if (lastchar)
{
/* Next level of the prefix tree */
- data->node = mkSPNode(Conf, lownew, i, level + 1);
+ new_offset = mkSPNode(ConfBuild, lownew, i, level + 1);
+
+ /*
+ * ConfBuild->DictNodes can be repalloc'ed within
+ * mkSPNode(), so reinitialize pointers.
+ */
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
+
+ /* First save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ /* Work with next node */
+ data_index++;
+ Assert(data_index < nchar);
+ data = &(rs->data[data_index]);
+
lownew = i;
- data++;
}
- lastchar = Conf->Spell[i]->word[level];
+ lastchar = ConfBuild->Spell[i]->word[level];
}
- data->val = ((uint8 *) (Conf->Spell[i]->word))[level];
- if (Conf->Spell[i]->p.d.len == level + 1)
+ data->val = ((uint8 *) (ConfBuild->Spell[i]->word))[level];
+ if (ConfBuild->Spell[i]->p.d.len == level + 1)
{
bool clearCompoundOnly = false;
- if (data->isword && data->affix != Conf->Spell[i]->p.d.affix)
+ if (data->isword && data->affix != ConfBuild->Spell[i]->p.d.affix)
{
/*
* MergeAffix called a few times. If one of word is
@@ -1674,15 +1929,17 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
*/
clearCompoundOnly = (FF_COMPOUNDONLY & data->compoundflag
- & makeCompoundFlags(Conf, Conf->Spell[i]->p.d.affix))
+ & makeCompoundFlags(ConfBuild,
+ ConfBuild->Spell[i]->p.d.affix))
? false : true;
- data->affix = MergeAffix(Conf, data->affix, Conf->Spell[i]->p.d.affix);
+ data->affix = MergeAffix(ConfBuild, data->affix,
+ ConfBuild->Spell[i]->p.d.affix);
}
else
- data->affix = Conf->Spell[i]->p.d.affix;
+ data->affix = ConfBuild->Spell[i]->p.d.affix;
data->isword = 1;
- data->compoundflag = makeCompoundFlags(Conf, data->affix);
+ data->compoundflag = makeCompoundFlags(ConfBuild, data->affix);
if ((data->compoundflag & FF_COMPOUNDONLY) &&
(data->compoundflag & FF_COMPOUNDFLAG) == 0)
@@ -1694,9 +1951,19 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
}
/* Next level of the prefix tree */
- data->node = mkSPNode(Conf, lownew, high, level + 1);
+ new_offset = mkSPNode(ConfBuild, lownew, high, level + 1);
- return rs;
+ /*
+ * ConfBuild->DictNodes can be repalloc'ed within mkSPNode(), so
+ * reinitialize pointers.
+ */
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
+
+ /* Save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ return rs_offset;
}
/*
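Reviewer note on the mkSPNode() hunk above: the function now returns an offset rather than a pointer, and it re-fetches rs after every recursive call because the node array may have been moved by repalloc. The following is a minimal standalone sketch of that pattern, not the patch's code. Arena, Node, arena_alloc_node(), arena_get(), build_chain() and INVALID_OFFSET are hypothetical names made up for illustration, and plain realloc() stands in for repalloc().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for ISPELL_INVALID_OFFSET */
#define INVALID_OFFSET UINT32_MAX

/* Hypothetical growable byte arena; nodes are addressed by byte offset */
typedef struct
{
	char	   *buf;
	uint32_t	used;
	uint32_t	cap;
} Arena;

typedef struct
{
	uint32_t	child;		/* offset of the child node, or INVALID_OFFSET */
	char		val;
} Node;

/* Reserve one zeroed Node and return its offset; may move a->buf */
static uint32_t
arena_alloc_node(Arena *a)
{
	uint32_t	off = a->used;

	if (a->used + sizeof(Node) > a->cap)
	{
		a->cap = a->cap ? a->cap * 2 : (uint32_t) sizeof(Node);
		a->buf = realloc(a->buf, a->cap);
		assert(a->buf != NULL);
	}
	memset(a->buf + off, 0, sizeof(Node));
	a->used += (uint32_t) sizeof(Node);
	return off;
}

/* Translate an offset into a pointer that is valid until the next allocation */
static Node *
arena_get(Arena *a, uint32_t off)
{
	return (Node *) (a->buf + off);
}

/* Build a chain with one node per character of word; returns the root offset */
static uint32_t
build_chain(Arena *a, const char *word)
{
	uint32_t	self;
	uint32_t	child;
	Node	   *n;

	if (*word == '\0')
		return INVALID_OFFSET;

	self = arena_alloc_node(a);
	n = arena_get(a, self);
	n->val = *word;

	/* The recursive call may grow the arena and move a->buf */
	child = build_chain(a, word + 1);

	/* Re-fetch the pointer: the one obtained before the call may be stale */
	n = arena_get(a, self);
	n->child = child;

	return self;
}
```

Calling build_chain() on an empty Arena with the word "abc" should yield a root at offset 0 whose child offsets lead to 'b' and then 'c'. Because the links are offsets, the finished buffer could in principle be copied to another address (such as a shared memory segment) without fixing up any links, which seems to be the motivation for the change.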
@@ -1704,90 +1971,98 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
* and affixes.
*/
void
-NISortDictionary(IspellDict *Conf)
+NISortDictionary(IspellDictBuild *ConfBuild)
{
int i;
int naffix = 0;
int curaffix;
+ uint32 node_offset;
/* compress affixes */
/*
- * If we use flag aliases then we need to use Conf->AffixData filled in
+ * If we use flag aliases then we need to use ConfBuild->AffixData filled in
* the NIImportOOAffixes().
*/
- if (Conf->useFlagAliases)
+ if (ConfBuild->dict->useFlagAliases)
{
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
char *end;
- if (*Conf->Spell[i]->p.flag != '\0')
+ if (*ConfBuild->Spell[i]->p.flag != '\0')
{
- curaffix = strtol(Conf->Spell[i]->p.flag, &end, 10);
- if (Conf->Spell[i]->p.flag == end || errno == ERANGE)
+ curaffix = strtol(ConfBuild->Spell[i]->p.flag, &end, 10);
+ if (ConfBuild->Spell[i]->p.flag == end || errno == ERANGE)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix alias \"%s\"",
- Conf->Spell[i]->p.flag)));
+ ConfBuild->Spell[i]->p.flag)));
}
else
{
/*
- * If Conf->Spell[i]->p.flag is empty, then get empty value of
- * Conf->AffixData (0 index).
+ * If ConfBuild->Spell[i]->p.flag is empty, then get empty
+ * value of ConfBuild->AffixData (0 index).
*/
curaffix = 0;
}
- Conf->Spell[i]->p.d.affix = curaffix;
- Conf->Spell[i]->p.d.len = strlen(Conf->Spell[i]->word);
+ ConfBuild->Spell[i]->p.d.affix = curaffix;
+ ConfBuild->Spell[i]->p.d.len = strlen(ConfBuild->Spell[i]->word);
}
}
- /* Otherwise fill Conf->AffixData here */
+ /* Otherwise fill ConfBuild->AffixData here */
else
{
/* Count the number of different flags used in the dictionary */
- qsort((void *) Conf->Spell, Conf->nspell, sizeof(SPELL *),
+ qsort((void *) ConfBuild->Spell, ConfBuild->nSpell, sizeof(SPELL *),
cmpspellaffix);
naffix = 0;
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
if (i == 0
- || strcmp(Conf->Spell[i]->p.flag, Conf->Spell[i - 1]->p.flag))
+ || strcmp(ConfBuild->Spell[i]->p.flag,
+ ConfBuild->Spell[i - 1]->p.flag))
naffix++;
}
/*
- * Fill in Conf->AffixData with the affixes that were used in the
- * dictionary. Replace textual flag-field of Conf->Spell entries with
- * indexes into Conf->AffixData array.
+ * Fill in AffixData with the affixes that were used in the
+ * dictionary. Replace textual flag-field of ConfBuild->Spell entries
+ * with indexes into ConfBuild->AffixData array.
*/
- Conf->AffixData = (char **) palloc0(naffix * sizeof(char *));
+ InitAffixData(ConfBuild, naffix);
curaffix = -1;
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
if (i == 0
- || strcmp(Conf->Spell[i]->p.flag, Conf->AffixData[curaffix]))
+ || strcmp(ConfBuild->Spell[i]->p.flag,
+ AffixDataGet(ConfBuild, curaffix)))
{
curaffix++;
Assert(curaffix < naffix);
- Conf->AffixData[curaffix] = cpstrdup(Conf,
- Conf->Spell[i]->p.flag);
+ AddAffixSet(ConfBuild, ConfBuild->Spell[i]->p.flag,
+ strlen(ConfBuild->Spell[i]->p.flag));
}
- Conf->Spell[i]->p.d.affix = curaffix;
- Conf->Spell[i]->p.d.len = strlen(Conf->Spell[i]->word);
+ ConfBuild->Spell[i]->p.d.affix = curaffix;
+ ConfBuild->Spell[i]->p.d.len = strlen(ConfBuild->Spell[i]->word);
}
-
- Conf->lenAffixData = Conf->nAffixData = naffix;
}
/* Start build a prefix tree */
- qsort((void *) Conf->Spell, Conf->nspell, sizeof(SPELL *), cmpspell);
- Conf->Dictionary = mkSPNode(Conf, 0, Conf->nspell, 0);
+ qsort((void *) ConfBuild->Spell, ConfBuild->nSpell, sizeof(SPELL *), cmpspell);
+ node_offset = mkSPNode(ConfBuild, 0, ConfBuild->nSpell, 0);
+
+ /* Make a void node only if DictNodes is empty */
+ if (node_offset == ISPELL_INVALID_OFFSET)
+ {
+ /* AllocateSPNode() initializes root node data */
+ AllocateSPNode(ConfBuild, 1);
+ }
}
/*
@@ -1795,83 +2070,104 @@ NISortDictionary(IspellDict *Conf)
* rule. Affixes with empty replace string do not include in the prefix tree.
* This affixes are included by mkVoidAffix().
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* low: lower index of the Conf->Affix array.
* high: upper index of the Conf->Affix array.
* level: current prefix tree level.
* type: FF_SUFFIX or FF_PREFIX.
+ *
+ * Returns the offset of the new node in the nodes array.
*/
-static AffixNode *
-mkANode(IspellDict *Conf, int low, int high, int level, int type)
+static uint32
+mkANode(IspellDictBuild *ConfBuild, int low, int high, int level, int type)
{
int i;
int nchar = 0;
uint8 lastchar = '\0';
+ NodeArray *array;
+ uint32 rs_offset,
+ new_offset;
AffixNode *rs;
AffixNodeData *data;
+ int data_index = 0;
int lownew = low;
- int naff;
- AFFIX **aff;
for (i = low; i < high; i++)
- if (Conf->Affix[i].replen > level && lastchar != GETCHAR(Conf->Affix + i, level, type))
+ if (ConfBuild->Affix[i]->replen > level &&
+ lastchar != GETCHAR(ConfBuild->Affix[i], level, type))
{
nchar++;
- lastchar = GETCHAR(Conf->Affix + i, level, type);
+ lastchar = GETCHAR(ConfBuild->Affix[i], level, type);
}
if (!nchar)
- return NULL;
+ return ISPELL_INVALID_OFFSET;
- aff = (AFFIX **) tmpalloc(sizeof(AFFIX *) * (high - low + 1));
- naff = 0;
+ if (type == FF_SUFFIX)
+ array = &ConfBuild->SuffixNodes;
+ else
+ array = &ConfBuild->PrefixNodes;
- rs = (AffixNode *) cpalloc0(ANHRDSZ + nchar * sizeof(AffixNodeData));
- rs->length = nchar;
- data = rs->data;
+ rs_offset = AllocateAffixNode(ConfBuild, array, nchar);
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
+ data = (AffixNodeData *) rs->data;
lastchar = '\0';
for (i = low; i < high; i++)
- if (Conf->Affix[i].replen > level)
+ if (ConfBuild->Affix[i]->replen > level)
{
- if (lastchar != GETCHAR(Conf->Affix + i, level, type))
+ if (lastchar != GETCHAR(ConfBuild->Affix[i], level, type))
{
if (lastchar)
{
/* Next level of the prefix tree */
- data->node = mkANode(Conf, lownew, i, level + 1, type);
- if (naff)
- {
- data->naff = naff;
- data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * naff);
- memcpy(data->aff, aff, sizeof(AFFIX *) * naff);
- naff = 0;
- }
- data++;
+ new_offset = mkANode(ConfBuild, lownew, i, level + 1, type);
+
+ /*
+ * array can be repalloc'ed within mkANode(), so
+ * reinitialize pointers.
+ */
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
+
+ /* First save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ /* Handle next data node */
+ data_index++;
+ Assert(data_index < nchar);
+ data = &(rs->data[data_index]);
+
lownew = i;
}
- lastchar = GETCHAR(Conf->Affix + i, level, type);
+ lastchar = GETCHAR(ConfBuild->Affix[i], level, type);
}
- data->val = GETCHAR(Conf->Affix + i, level, type);
- if (Conf->Affix[i].replen == level + 1)
+ data->val = GETCHAR(ConfBuild->Affix[i], level, type);
+ if (ConfBuild->Affix[i]->replen == level + 1)
{ /* affix stopped */
- aff[naff++] = Conf->Affix + i;
+ if (data->affstart == ISPELL_INVALID_INDEX)
+ {
+ data->affstart = i;
+ data->affend = i;
+ }
+ else
+ data->affend = i;
}
}
/* Next level of the prefix tree */
- data->node = mkANode(Conf, lownew, high, level + 1, type);
- if (naff)
- {
- data->naff = naff;
- data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * naff);
- memcpy(data->aff, aff, sizeof(AFFIX *) * naff);
- naff = 0;
- }
+ new_offset = mkANode(ConfBuild, lownew, high, level + 1, type);
+
+ /*
+ * array can be repalloc'ed within mkANode(), so reinitialize pointers.
+ */
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
- pfree(aff);
+ /* Save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
- return rs;
+ return rs_offset;
}
/*
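Reviewer note on the mkANode() hunk above: the per-node AFFIX ** array is replaced by an inclusive index range (affstart, affend) into the sorted Affix array. The sketch below illustrates the idea under the assumption that all matching entries are adjacent after sorting, which is what the hunk appears to rely on but which is not confirmed here. IndexRange, find_range() and INVALID_INDEX are hypothetical names, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for ISPELL_INVALID_INDEX */
#define INVALID_INDEX UINT32_MAX

/* Inclusive range of indexes into a sorted array */
typedef struct
{
	uint32_t	start;
	uint32_t	end;
} IndexRange;

/*
 * Record the first and last index of the entries equal to key.  Both fields
 * stay INVALID_INDEX when nothing matches.  The range describes exactly the
 * matching entries only if the array is sorted, so that equal entries are
 * adjacent.
 */
static IndexRange
find_range(const char *const *sorted, uint32_t n, const char *key)
{
	IndexRange	r = {INVALID_INDEX, INVALID_INDEX};
	uint32_t	i;

	for (i = 0; i < n; i++)
	{
		if (strcmp(sorted[i], key) == 0)
		{
			if (r.start == INVALID_INDEX)
				r.start = i;
			r.end = i;
		}
	}

	return r;
}
```

Two integers per node contain no addresses, so they remain meaningful in any process that maps the same array, whereas a copied pointer array would not.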
@@ -1879,139 +2175,151 @@ mkANode(IspellDict *Conf, int low, int high, int level, int type)
* for affixes which have empty replace string ("repl" field).
*/
static void
-mkVoidAffix(IspellDict *Conf, bool issuffix, int startsuffix)
+mkVoidAffix(IspellDictBuild *ConfBuild, bool issuffix, int startsuffix)
{
- int i,
- cnt = 0;
+ int i;
int start = (issuffix) ? startsuffix : 0;
- int end = (issuffix) ? Conf->naffixes : startsuffix;
- AffixNode *Affix = (AffixNode *) palloc0(ANHRDSZ + sizeof(AffixNodeData));
-
- Affix->length = 1;
- Affix->isvoid = 1;
+ int end = (issuffix) ? ConfBuild->nAffix : startsuffix;
+ uint32 node_offset;
+ NodeArray *array;
+ AffixNode *Affix;
+ AffixNodeData *AffixData;
if (issuffix)
- {
- Affix->data->node = Conf->Suffix;
- Conf->Suffix = Affix;
- }
+ array = &ConfBuild->SuffixNodes;
else
- {
- Affix->data->node = Conf->Prefix;
- Conf->Prefix = Affix;
- }
+ array = &ConfBuild->PrefixNodes;
- /* Count affixes with empty replace string */
- for (i = start; i < end; i++)
- if (Conf->Affix[i].replen == 0)
- cnt++;
-
- /* There is not affixes with empty replace string */
- if (cnt == 0)
- return;
+ node_offset = AllocateAffixNode(ConfBuild, array, 1);
+ Affix = (AffixNode *) NodeArrayGet(array, node_offset);
- Affix->data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * cnt);
- Affix->data->naff = (uint32) cnt;
+ Affix->isvoid = 1;
+ AffixData = (AffixNodeData *) Affix->data;
- cnt = 0;
for (i = start; i < end; i++)
- if (Conf->Affix[i].replen == 0)
+ if (ConfBuild->Affix[i]->replen == 0)
{
- Affix->data->aff[cnt] = Conf->Affix + i;
- cnt++;
+ if (AffixData->affstart == ISPELL_INVALID_INDEX)
+ {
+ AffixData->affstart = i;
+ AffixData->affend = i;
+ }
+ else
+ AffixData->affend = i;
}
}
/*
- * Checks if the affixflag is used by dictionary. Conf->AffixData does not
+ * Checks if the affixflag is used by dictionary. AffixData does not
* contain affixflag if this flag is not used actually by the .dict file.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* affixflag: affix flag.
*
- * Returns true if the Conf->AffixData array contains affixflag, otherwise
+ * Returns true if the ConfBuild->AffixData array contains affixflag, otherwise
* returns false.
*/
static bool
-isAffixInUse(IspellDict *Conf, char *affixflag)
+isAffixInUse(IspellDictBuild *ConfBuild, char *affixflag)
{
int i;
- for (i = 0; i < Conf->nAffixData; i++)
- if (IsAffixFlagInUse(Conf, i, affixflag))
+ for (i = 0; i < ConfBuild->nAffixData; i++)
+ if (IsAffixFlagInUse(ConfBuild->dict->flagMode,
+ AffixDataGet(ConfBuild, i), affixflag))
return true;
return false;
}
/*
- * Builds Conf->Prefix and Conf->Suffix trees from the imported affixes.
+ * Builds Prefix and Suffix trees from the imported affixes.
*/
void
-NISortAffixes(IspellDict *Conf)
+NISortAffixes(IspellDictBuild *ConfBuild)
{
AFFIX *Affix;
+ AffixNode *voidPrefix,
+ *voidSuffix;
size_t i;
CMPDAffix *ptr;
- int firstsuffix = Conf->naffixes;
-
- if (Conf->naffixes == 0)
- return;
+ int firstsuffix = ConfBuild->nAffix;
+ uint32 prefix_offset,
+ suffix_offset;
/* Store compound affixes in the Conf->CompoundAffix array */
- if (Conf->naffixes > 1)
- qsort((void *) Conf->Affix, Conf->naffixes, sizeof(AFFIX), cmpaffix);
- Conf->CompoundAffix = ptr = (CMPDAffix *) palloc(sizeof(CMPDAffix) * Conf->naffixes);
- ptr->affix = NULL;
-
- for (i = 0; i < Conf->naffixes; i++)
+ if (ConfBuild->nAffix > 1)
+ qsort((void *) ConfBuild->Affix, ConfBuild->nAffix,
+ sizeof(AFFIX *), cmpaffix);
+ ConfBuild->nCompoundAffix = ConfBuild->nAffix + 1 /* terminating entry */;
+ ConfBuild->CompoundAffix = ptr =
+ (CMPDAffix *) tmpalloc(sizeof(CMPDAffix) * ConfBuild->nCompoundAffix);
+ ptr->affix = ISPELL_INVALID_INDEX;
+
+ for (i = 0; i < ConfBuild->nAffix; i++)
{
- Affix = &(((AFFIX *) Conf->Affix)[i]);
+ Affix = ConfBuild->Affix[i];
if (Affix->type == FF_SUFFIX && i < firstsuffix)
firstsuffix = i;
if ((Affix->flagflags & FF_COMPOUNDFLAG) && Affix->replen > 0 &&
- isAffixInUse(Conf, Affix->flag))
+ isAffixInUse(ConfBuild, AffixFieldFlag(Affix)))
{
bool issuffix = (Affix->type == FF_SUFFIX);
- if (ptr == Conf->CompoundAffix ||
+ if (ptr == ConfBuild->CompoundAffix ||
issuffix != (ptr - 1)->issuffix ||
- strbncmp((const unsigned char *) (ptr - 1)->affix,
- (const unsigned char *) Affix->repl,
+ strbncmp((const unsigned char *) AffixFieldRepl(ConfBuild->Affix[(ptr - 1)->affix]),
+ (const unsigned char *) AffixFieldRepl(Affix),
(ptr - 1)->len))
{
/* leave only unique and minimals suffixes */
- ptr->affix = Affix->repl;
+ ptr->affix = i;
ptr->len = Affix->replen;
ptr->issuffix = issuffix;
ptr++;
}
}
}
- ptr->affix = NULL;
- Conf->CompoundAffix = (CMPDAffix *) repalloc(Conf->CompoundAffix, sizeof(CMPDAffix) * (ptr - Conf->CompoundAffix + 1));
+ ptr->affix = ISPELL_INVALID_INDEX;
+ ConfBuild->nCompoundAffix = ptr - ConfBuild->CompoundAffix + 1;
/* Start build a prefix tree */
- Conf->Prefix = mkANode(Conf, 0, firstsuffix, 0, FF_PREFIX);
- Conf->Suffix = mkANode(Conf, firstsuffix, Conf->naffixes, 0, FF_SUFFIX);
- mkVoidAffix(Conf, true, firstsuffix);
- mkVoidAffix(Conf, false, firstsuffix);
+ mkVoidAffix(ConfBuild, true, firstsuffix);
+ mkVoidAffix(ConfBuild, false, firstsuffix);
+
+ prefix_offset = mkANode(ConfBuild, 0, firstsuffix, 0, FF_PREFIX);
+ suffix_offset = mkANode(ConfBuild, firstsuffix, ConfBuild->nAffix, 0,
+ FF_SUFFIX);
+
+ /* Adjust offsets of new nodes for nodes of void affixes */
+ voidPrefix = (AffixNode *) NodeArrayGet(&ConfBuild->PrefixNodes, 0);
+ voidPrefix->data[0].node_offset = prefix_offset;
+
+ voidSuffix = (AffixNode *) NodeArrayGet(&ConfBuild->SuffixNodes, 0);
+ voidSuffix->data[0].node_offset = suffix_offset;
}
static AffixNodeData *
-FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
+FindAffixes(IspellDictData *dict, AffixNode *node, const char *word, int wrdlen,
+ int *level, int type)
{
+ AffixNode *nodes;
AffixNodeData *StopLow,
*StopHigh,
*StopMiddle;
uint8 symbol;
+ if (type == FF_PREFIX)
+ nodes = (AffixNode *) DictPrefixNodes(dict);
+ else
+ nodes = (AffixNode *) DictSuffixNodes(dict);
+
if (node->isvoid)
{ /* search void affixes */
- if (node->data->naff)
+ if (node->data->affstart != ISPELL_INVALID_INDEX)
return node->data;
- node = node->data->node;
+ node = (AffixNode *) DictNodeGet(nodes, node->data->node_offset);
}
while (node && *level < wrdlen)
@@ -2026,9 +2334,10 @@ FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
if (StopMiddle->val == symbol)
{
(*level)++;
- if (StopMiddle->naff)
+ if (StopMiddle->affstart != ISPELL_INVALID_INDEX)
return StopMiddle;
- node = StopMiddle->node;
+ node = (AffixNode *) DictNodeGet(nodes,
+ StopMiddle->node_offset);
break;
}
else if (StopMiddle->val < symbol)
@@ -2042,8 +2351,67 @@ FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
return NULL;
}
+/*
+ * Compile regular expression on first use and store it within reg.
+ */
+static void
+CompileAffixReg(AffixReg *reg, bool isregis, int type,
+ const char *mask, int masklen, MemoryContext dictCtx)
+{
+ MemoryContext oldcontext;
+
+ Assert(dictCtx);
+
+ /*
+ * Switch to the dictionary's memory context so that the compiled expression
+ * can be reused by later queries.
+ */
+ oldcontext = MemoryContextSwitchTo(dictCtx);
+
+ if (isregis)
+ RS_compile(&reg->r.regis, (type == FF_SUFFIX), mask);
+ else
+ {
+ int wmasklen;
+ int err;
+ pg_wchar *wmask;
+ char *tmask;
+
+ tmask = (char *) palloc(masklen + 3);
+ if (type == FF_SUFFIX)
+ sprintf(tmask, "%s$", mask);
+ else
+ sprintf(tmask, "^%s", mask);
+
+ masklen = strlen(tmask);
+ wmask = (pg_wchar *) palloc((masklen + 1) * sizeof(pg_wchar));
+ wmasklen = pg_mb2wchar_with_len(tmask, wmask, masklen);
+
+ err = pg_regcomp(&(reg->r.regex), wmask, wmasklen,
+ REG_ADVANCED | REG_NOSUB,
+ DEFAULT_COLLATION_OID);
+ if (err)
+ {
+ char errstr[100];
+
+ pg_regerror(err, &(reg->r.regex), errstr, sizeof(errstr));
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
+ errmsg("invalid regular expression: %s", errstr)));
+ }
+
+ pfree(wmask);
+ pfree(tmask);
+ }
+
+ reg->iscompiled = true;
+
+ MemoryContextSwitchTo(oldcontext);
+}
+
static char *
-CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *newword, int *baselen)
+CheckAffix(const char *word, size_t len, AFFIX *Affix, AffixReg *reg,
+ int flagflags, char *newword, int *baselen, MemoryContext dictCtx)
{
/*
* Check compound allow flags
@@ -2083,7 +2451,7 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
if (Affix->type == FF_SUFFIX)
{
strcpy(newword, word);
- strcpy(newword + len - Affix->replen, Affix->find);
+ strcpy(newword + len - Affix->replen, AffixFieldFind(Affix));
if (baselen) /* store length of non-changed part of word */
*baselen = len - Affix->replen;
}
@@ -2093,9 +2461,9 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
* if prefix is an all non-changed part's length then all word
* contains only prefix and suffix, so out
*/
- if (baselen && *baselen + strlen(Affix->find) <= Affix->replen)
+ if (baselen && *baselen + Affix->findlen <= Affix->replen)
return NULL;
- strcpy(newword, Affix->find);
+ strcpy(newword, AffixFieldFind(Affix));
strcat(newword, word + Affix->replen);
}
@@ -2106,7 +2474,12 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
return newword;
else if (Affix->isregis)
{
- if (RS_execute(&(Affix->reg.regis), newword))
+ /* Compile the regular expression on first demand */
+ if (!reg->iscompiled)
+ CompileAffixReg(reg, Affix->isregis, Affix->type,
+ AffixFieldMask(Affix), Affix->masklen, dictCtx);
+
+ if (RS_execute(&(reg->r.regis), newword))
return newword;
}
else
@@ -2116,12 +2489,17 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
size_t data_len;
int newword_len;
+ /* Compile the regular expression on first demand */
+ if (!reg->iscompiled)
+ CompileAffixReg(reg, Affix->isregis, Affix->type,
+ AffixFieldMask(Affix), Affix->masklen, dictCtx);
+
/* Convert data string to wide characters */
newword_len = strlen(newword);
data = (pg_wchar *) palloc((newword_len + 1) * sizeof(pg_wchar));
data_len = pg_mb2wchar_with_len(newword, data, newword_len);
- if (!(err = pg_regexec(&(Affix->reg.regex), data, data_len, 0, NULL, 0, NULL, 0)))
+ if (!(err = pg_regexec(&(reg->r.regex), data, data_len, 0, NULL, 0, NULL, 0)))
{
pfree(data);
return newword;
@@ -2160,7 +2538,7 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
char **cur;
char newword[2 * MAXNORMLEN] = "";
char pnewword[2 * MAXNORMLEN] = "";
- AffixNode *snode = Conf->Suffix,
+ AffixNode *snode = (AffixNode *) DictSuffixNodes(Conf->dict),
*pnode;
int i,
j;
@@ -2172,7 +2550,7 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
/* Check that the word itself is normal form */
- if (FindWord(Conf, word, VoidString, flag))
+ if (FindWord(Conf->dict, word, VoidString, flag))
{
*cur = pstrdup(word);
cur++;
@@ -2180,23 +2558,29 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
}
/* Find all other NORMAL forms of the 'word' (check only prefix) */
- pnode = Conf->Prefix;
+ pnode = (AffixNode *) DictPrefixNodes(Conf->dict);
plevel = 0;
while (pnode)
{
- prefix = FindAffixes(pnode, word, wrdlen, &plevel, FF_PREFIX);
+ prefix = FindAffixes(Conf->dict, pnode, word, wrdlen, &plevel, FF_PREFIX);
if (!prefix)
break;
- for (j = 0; j < prefix->naff; j++)
+ for (j = prefix->affstart; j <= prefix->affend; j++)
{
- if (CheckAffix(word, wrdlen, prefix->aff[j], flag, newword, NULL))
+ AFFIX *affix = (AFFIX *) DictAffixGet(Conf->dict, j);
+ AffixReg *reg = &(Conf->reg[j]);
+
+ if (affix &&
+ CheckAffix(word, wrdlen, affix, reg, flag, newword, NULL,
+ Conf->dictCtx))
{
/* prefix success */
- if (FindWord(Conf, newword, prefix->aff[j]->flag, flag))
+ if (FindWord(Conf->dict, newword, AffixFieldFlag(affix), flag))
cur += addToResult(forms, cur, newword);
}
}
- pnode = prefix->node;
+ pnode = (AffixNode *) DictNodeGet(DictPrefixNodes(Conf->dict),
+ prefix->node_offset);
}
/*
@@ -2208,45 +2592,59 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
int baselen = 0;
/* find possible suffix */
- suffix = FindAffixes(snode, word, wrdlen, &slevel, FF_SUFFIX);
+ suffix = FindAffixes(Conf->dict, snode, word, wrdlen, &slevel,
+ FF_SUFFIX);
if (!suffix)
break;
/* foreach suffix check affix */
- for (i = 0; i < suffix->naff; i++)
+ for (i = suffix->affstart; i <= suffix->affend; i++)
{
- if (CheckAffix(word, wrdlen, suffix->aff[i], flag, newword, &baselen))
+ AFFIX *sufentry = (AFFIX *) DictAffixGet(Conf->dict, i);
+ AffixReg *sufreg = &(Conf->reg[i]);
+
+ if (sufentry &&
+ CheckAffix(word, wrdlen, sufentry, sufreg, flag, newword, &baselen,
+ Conf->dictCtx))
{
/* suffix success */
- if (FindWord(Conf, newword, suffix->aff[i]->flag, flag))
+ if (FindWord(Conf->dict, newword, AffixFieldFlag(sufentry), flag))
cur += addToResult(forms, cur, newword);
/* now we will look changed word with prefixes */
- pnode = Conf->Prefix;
+ pnode = (AffixNode *) DictPrefixNodes(Conf->dict);
plevel = 0;
swrdlen = strlen(newword);
while (pnode)
{
- prefix = FindAffixes(pnode, newword, swrdlen, &plevel, FF_PREFIX);
+ prefix = FindAffixes(Conf->dict, pnode, newword, swrdlen,
+ &plevel, FF_PREFIX);
if (!prefix)
break;
- for (j = 0; j < prefix->naff; j++)
+ for (j = prefix->affstart; j <= prefix->affend; j++)
{
- if (CheckAffix(newword, swrdlen, prefix->aff[j], flag, pnewword, &baselen))
+ AFFIX *prefentry = (AFFIX *) DictAffixGet(Conf->dict, j);
+ AffixReg *prefreg = &(Conf->reg[j]);
+
+ if (prefentry &&
+ CheckAffix(newword, swrdlen, prefentry, prefreg,
+ flag, pnewword, &baselen, Conf->dictCtx))
{
/* prefix success */
- char *ff = (prefix->aff[j]->flagflags & suffix->aff[i]->flagflags & FF_CROSSPRODUCT) ?
- VoidString : prefix->aff[j]->flag;
+ char *ff = (prefentry->flagflags & sufentry->flagflags & FF_CROSSPRODUCT) ?
+ VoidString : AffixFieldFlag(prefentry);
- if (FindWord(Conf, pnewword, ff, flag))
+ if (FindWord(Conf->dict, pnewword, ff, flag))
cur += addToResult(forms, cur, pnewword);
}
}
- pnode = prefix->node;
+ pnode = (AffixNode *) DictNodeGet(DictPrefixNodes(Conf->dict),
+ prefix->node_offset);
}
}
}
- snode = suffix->node;
+ snode = (AffixNode *) DictNodeGet(DictSuffixNodes(Conf->dict),
+ suffix->node_offset);
}
if (cur == forms)
@@ -2266,7 +2664,8 @@ typedef struct SplitVar
} SplitVar;
static int
-CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
+CheckCompoundAffixes(IspellDictData *dict, CMPDAffix **ptr,
+ char *word, int len, bool CheckInPlace)
{
bool issuffix;
@@ -2276,9 +2675,12 @@ CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
if (CheckInPlace)
{
- while ((*ptr)->affix)
+ while ((*ptr)->affix != ISPELL_INVALID_INDEX)
{
- if (len > (*ptr)->len && strncmp((*ptr)->affix, word, (*ptr)->len) == 0)
+ AFFIX *affix = (AFFIX *) DictAffixGet(dict, (*ptr)->affix);
+
+ if (len > (*ptr)->len &&
+ strncmp(AffixFieldRepl(affix), word, (*ptr)->len) == 0)
{
len = (*ptr)->len;
issuffix = (*ptr)->issuffix;
@@ -2292,9 +2694,12 @@ CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
{
char *affbegin;
- while ((*ptr)->affix)
+ while ((*ptr)->affix != ISPELL_INVALID_INDEX)
{
- if (len > (*ptr)->len && (affbegin = strstr(word, (*ptr)->affix)) != NULL)
+ AFFIX *affix = (AFFIX *) DictAffixGet(dict, (*ptr)->affix);
+
+ if (len > (*ptr)->len &&
+ (affbegin = strstr(word, AffixFieldRepl(affix))) != NULL)
{
len = (*ptr)->len + (affbegin - word);
issuffix = (*ptr)->issuffix;
@@ -2346,13 +2751,14 @@ AddStem(SplitVar *v, char *word)
}
static SplitVar *
-SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int wordlen, int startpos, int minpos)
+SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig,
+ char *word, int wordlen, int startpos, int minpos)
{
SplitVar *var = NULL;
SPNodeData *StopLow,
*StopHigh,
*StopMiddle = NULL;
- SPNode *node = (snode) ? snode : Conf->Dictionary;
+ SPNode *node = (snode) ? snode : (SPNode *) DictDictNodes(Conf->dict);
int level = (snode) ? minpos : startpos; /* recursive
* minpos==level */
int lenaff;
@@ -2367,8 +2773,11 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
while (level < wordlen)
{
/* find word with epenthetic or/and compound affix */
- caff = Conf->CompoundAffix;
- while (level > startpos && (lenaff = CheckCompoundAffixes(&caff, word + level, wordlen - level, (node) ? true : false)) >= 0)
+ caff = (CMPDAffix *) DictCompoundAffix(Conf->dict);
+ while (level > startpos &&
+ (lenaff = CheckCompoundAffixes(Conf->dict, &caff,
+ word + level, wordlen - level,
+ (node) ? true : false)) >= 0)
{
/*
* there is one of compound affixes, so check word for existings
@@ -2415,7 +2824,8 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
while (ptr->next)
ptr = ptr->next;
- ptr->next = SplitToVariants(Conf, NULL, new, word, wordlen, startpos + lenaff, startpos + lenaff);
+ ptr->next = SplitToVariants(Conf, NULL, new, word, wordlen,
+ startpos + lenaff, startpos + lenaff);
pfree(new->stem);
pfree(new);
@@ -2474,13 +2884,14 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
/* we can find next word */
level++;
AddStem(var, pnstrdup(word + startpos, level - startpos));
- node = Conf->Dictionary;
+ node = (SPNode *) DictDictNodes(Conf->dict);
startpos = level;
continue;
}
}
}
- node = StopMiddle->node;
+ node = (SPNode *) DictNodeGet(DictDictNodes(Conf->dict),
+ StopMiddle->node_offset);
}
else
node = NULL;
@@ -2530,7 +2941,7 @@ NINormalizeWord(IspellDict *Conf, char *word)
pfree(res);
}
- if (Conf->usecompound)
+ if (Conf->dict->usecompound)
{
int wordlen = strlen(word);
SplitVar *ptr,
diff --git a/src/include/tsearch/dicts/spell.h b/src/include/tsearch/dicts/spell.h
index 210f97dda9..5f7cd6fbf4 100644
--- a/src/include/tsearch/dicts/spell.h
+++ b/src/include/tsearch/dicts/spell.h
@@ -18,21 +18,23 @@
#include "tsearch/dicts/regis.h"
#include "tsearch/ts_public.h"
+#define ISPELL_INVALID_INDEX (0x7FFFF)
+#define ISPELL_INVALID_OFFSET (0xFFFFFFFF)
+
/*
* SPNode and SPNodeData are used to represent prefix tree (Trie) to store
* a words list.
*/
-struct SPNode;
-
typedef struct
{
uint32 val:8,
isword:1,
/* Stores compound flags listed below */
compoundflag:4,
- /* Reference to an entry of the AffixData field */
+ /* Index of an entry of the AffixData field */
affix:19;
- struct SPNode *node;
+ /* Offset to a node of the DictNodes field */
+ uint32 node_offset;
} SPNodeData;
/*
@@ -86,21 +88,55 @@ typedef struct spell_struct
*/
typedef struct aff_struct
{
- char *flag;
/* FF_SUFFIX or FF_PREFIX */
- uint32 type:1,
+ uint16 type:1,
flagflags:7,
issimple:1,
isregis:1,
- replen:14;
- char *find;
- char *repl;
+ flaglen:2;
+
+ /* 8 bits could be too much for repl, find and mask, but who knows */
+ uint8 replen;
+ uint8 findlen;
+ uint8 masklen;
+
+ /*
+ * fields stores the following data (each ends with \0):
+ * - repl
+ * - find
+ * - mask
+ * - flag - one character (if FM_CHAR),
+ * two characters (if FM_LONG),
+ * number, >= 0 and < 65536 (if FM_NUM).
+ */
+ char fields[FLEXIBLE_ARRAY_MEMBER];
+} AFFIX;
+
+#define AF_FLAG_MAXSIZE 5 /* strlen(65536) */
+#define AF_REPL_MAXSIZE 255 /* 8 bits */
+
+#define AFFIXHDRSZ (offsetof(AFFIX, fields))
+
+#define AffixFieldRepl(af) ((af)->fields)
+#define AffixFieldFind(af) ((af)->fields + (af)->replen + 1)
+#define AffixFieldMask(af) (AffixFieldFind(af) + (af)->findlen + 1)
+#define AffixFieldFlag(af) (AffixFieldMask(af) + (af)->masklen + 1)
+#define AffixGetSize(af) (AFFIXHDRSZ + (af)->replen + 1 + (af)->findlen + 1 \
+ + (af)->masklen + 1 + strlen(AffixFieldFlag(af)) + 1)
+
+/*
+ * Stores compiled regular expression of affix. AffixReg uses mask field of
+ * AFFIX as a regular expression.
+ */
+typedef struct AffixReg
+{
+ bool iscompiled;
union
{
regex_t regex;
Regis regis;
- } reg;
-} AFFIX;
+ } r;
+} AffixReg;
/*
* affixes use dictionary flags too
@@ -120,14 +156,13 @@ typedef struct aff_struct
* AffixNode and AffixNodeData are used to represent prefix tree (Trie) to store
* an affix list.
*/
-struct AffixNode;
-
typedef struct
{
- uint32 val:8,
- naff:24;
- AFFIX **aff;
- struct AffixNode *node;
+ uint8 val;
+ uint32 affstart;
+ uint32 affend;
+ /* Offset to a node of the PrefixNodes or SuffixNodes field */
+ uint32 node_offset;
} AffixNodeData;
typedef struct AffixNode
@@ -139,9 +174,20 @@ typedef struct AffixNode
#define ANHRDSZ (offsetof(AffixNode, data))
+typedef struct NodeArray
+{
+ char *Nodes;
+ uint32 NodesSize; /* allocated size of Nodes */
+ uint32 NodesEnd; /* end of data in Nodes */
+} NodeArray;
+
+#define NodeArrayGet(na, of) \
+ (((of) == ISPELL_INVALID_OFFSET) ? NULL : (na)->Nodes + (of))
+
typedef struct
{
- char *affix;
+ /* Index of an affix of the Affix field */
+ uint32 affix;
int len;
bool issuffix;
} CMPDAffix;
@@ -176,30 +222,75 @@ typedef struct CompoundAffixFlag
#define FLAGNUM_MAXSIZE (1 << 16)
-typedef struct
+typedef struct IspellDictData
{
- int maffixes;
- int naffixes;
- AFFIX *Affix;
-
- AffixNode *Suffix;
- AffixNode *Prefix;
+ FlagMode flagMode;
+ bool usecompound;
- SPNode *Dictionary;
- /* Array of sets of affixes */
- char **AffixData;
- int lenAffixData;
- int nAffixData;
bool useFlagAliases;
- CMPDAffix *CompoundAffix;
+ uint32 nAffixData;
+ uint32 AffixDataStart;
- bool usecompound;
- FlagMode flagMode;
+ uint32 AffixOffsetStart;
+ uint32 AffixStart;
+ uint32 nAffix;
+
+ uint32 DictNodesStart;
+ uint32 PrefixNodesStart;
+ uint32 SuffixNodesStart;
+
+ uint32 CompoundAffixStart;
/*
- * All follow fields are actually needed only for initialization
+ * data stores:
+ * - AffixData - array of affix sets
+ * - Affix - sorted array of affixes
+ * - DictNodes - prefix tree of a word list
+ * - PrefixNodes - prefix tree of a prefix list
+ * - SuffixNodes - prefix tree of a suffix list
+ * - CompoundAffix - array of compound affixes
*/
+ char data[FLEXIBLE_ARRAY_MEMBER];
+} IspellDictData;
+
+#define IspellDictDataHdrSize (offsetof(IspellDictData, data))
+
+#define DictAffixDataOffset(d) ((d)->data)
+#define DictAffixData(d) ((d)->data + (d)->AffixDataStart)
+#define DictAffixDataGet(d, i) \
+ (((i) == ISPELL_INVALID_INDEX) ? NULL : \
+ (AssertMacro(i < (d)->nAffixData), \
+ DictAffixData(d) + ((uint32 *) DictAffixDataOffset(d))[i]))
+
+#define DictAffixOffset(d) ((d)->data + (d)->AffixOffsetStart)
+#define DictAffix(d) ((d)->data + (d)->AffixStart)
+#define DictAffixGet(d, i) \
+ (((i) == ISPELL_INVALID_INDEX) ? NULL : \
+ (AssertMacro(i < (d)->nAffix), \
+ DictAffix(d) + ((uint32 *) DictAffixOffset(d))[i]))
+
+#define DictDictNodes(d) ((d)->data + (d)->DictNodesStart)
+#define DictPrefixNodes(d) ((d)->data + (d)->PrefixNodesStart)
+#define DictSuffixNodes(d) ((d)->data + (d)->SuffixNodesStart)
+#define DictNodeGet(node_start, of) \
+ (((of) == ISPELL_INVALID_OFFSET) ? NULL : (char *) (node_start) + (of))
+
+#define DictCompoundAffix(d) ((d)->data + (d)->CompoundAffixStart)
+
+/*
+ * IspellDictBuild is used to initialize IspellDictData struct. This is a
+ * temporary structure which is set up by NIStartBuild() and released by
+ * NIFinishBuild().
+ */
+typedef struct IspellDictBuild
+{
+ MemoryContext buildCxt; /* temp context for construction */
+
+ IspellDictData *dict;
+ uint32 dict_size;
+
+ /* Temporary data */
/* Array of Hunspell options in affix file */
CompoundAffixFlag *CompoundAffixFlags;
@@ -208,29 +299,73 @@ typedef struct
/* allocated length of CompoundAffixFlags array */
int mCompoundAffixFlag;
- /*
- * Remaining fields are only used during dictionary construction; they are
- * set up by NIStartBuild and cleared by NIFinishBuild.
- */
- MemoryContext buildCxt; /* temp context for construction */
-
- /* Temporary array of all words in the dict file */
+ /* Array of all words in the dict file */
SPELL **Spell;
- int nspell; /* number of valid entries in Spell array */
- int mspell; /* allocated length of Spell array */
+ int nSpell; /* number of valid entries in Spell array */
+ int mSpell; /* allocated length of Spell array */
+
+ /* Data for IspellDictData */
+
+ /* Array of all affixes in the aff file */
+ AFFIX **Affix;
+ int nAffix; /* number of valid entries in Affix array */
+ int mAffix; /* allocated length of Affix array */
+ uint32 AffixSize;
+
+ /* Array of sets of affixes */
+ uint32 *AffixDataOffset;
+ int nAffixData; /* number of affix sets */
+ int mAffixData; /* allocated number of affix sets */
+ char *AffixData;
+ uint32 AffixDataSize; /* allocated size of AffixData */
+ uint32 AffixDataEnd; /* end of data in AffixData */
+
+ /* Prefix tree which stores a word list */
+ NodeArray DictNodes;
+
+ /* Prefix tree which stores a prefix list */
+ NodeArray PrefixNodes;
+
+ /* Prefix tree which stores a suffix list */
+ NodeArray SuffixNodes;
- /* These are used to allocate "compact" data without palloc overhead */
- char *firstfree; /* first free address (always maxaligned) */
- size_t avail; /* free space remaining at firstfree */
+ /* Array of compound affixes */
+ CMPDAffix *CompoundAffix;
+ int nCompoundAffix; /* number of entries of CompoundAffix */
+} IspellDictBuild;
+
+#define AffixDataGet(d, i) ((d)->AffixData + (d)->AffixDataOffset[i])
+
+/*
+ * IspellDict is used within NINormalizeWord.
+ */
+typedef struct IspellDict
+{
+ /*
+ * Pointer to a DSM location of IspellDictData. Should be retrieved on
+ * every dispell_lexize() call.
+ */
+ IspellDictData *dict;
+ /*
+ * Array of regular expression of affixes. Each regular expression is
+ * compiled only on demand.
+ */
+ AffixReg *reg;
+ /*
+ * Memory context for compiling regular expressions.
+ */
+ MemoryContext dictCtx;
} IspellDict;
extern TSLexeme *NINormalizeWord(IspellDict *Conf, char *word);
-extern void NIStartBuild(IspellDict *Conf);
-extern void NIImportAffixes(IspellDict *Conf, const char *filename);
-extern void NIImportDictionary(IspellDict *Conf, const char *filename);
-extern void NISortDictionary(IspellDict *Conf);
-extern void NISortAffixes(IspellDict *Conf);
-extern void NIFinishBuild(IspellDict *Conf);
+extern void NIStartBuild(IspellDictBuild *ConfBuild);
+extern void NIImportAffixes(IspellDictBuild *ConfBuild, const char *filename);
+extern void NIImportDictionary(IspellDictBuild *ConfBuild,
+ const char *filename);
+extern void NISortDictionary(IspellDictBuild *ConfBuild);
+extern void NISortAffixes(IspellDictBuild *ConfBuild);
+extern void NICopyData(IspellDictBuild *ConfBuild);
+extern void NIFinishBuild(IspellDictBuild *ConfBuild);
#endif
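For readers following the header changes above: the common idea behind node_offset, ISPELL_INVALID_OFFSET and DictNodeGet() is that a node refers to its child by a byte offset into one flat buffer instead of by pointer, so the buffer stays valid at whatever address a backend maps it. Below is a minimal standalone sketch of that idea, not code from the patch. DemoNode, INVALID_OFFSET and the demo_* functions are hypothetical names made up for the illustration, and the node layout is far simpler than SPNode or AffixNode.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sentinel, playing the role of ISPELL_INVALID_OFFSET. */
#define INVALID_OFFSET 0xFFFFFFFFu

/* A node names its child by byte offset into the buffer, not by pointer. */
typedef struct
{
	uint8_t		val;			/* character stored in this node */
	uint32_t	child_offset;	/* offset of the child, or INVALID_OFFSET */
} DemoNode;

/* Resolve an offset against the base address used by the current process. */
static DemoNode *
demo_node_get(char *base, uint32_t offset)
{
	if (offset == INVALID_OFFSET)
		return NULL;
	return (DemoNode *) (base + offset);
}

/*
 * Build a chain of nodes spelling "word" in a freshly allocated buffer.
 * The first node is at offset 0; the buffer size is returned in *size.
 */
static char *
demo_build_chain(const char *word, size_t *size)
{
	size_t		len = strlen(word);
	char	   *base;
	size_t		i;

	assert(len > 0);
	*size = len * sizeof(DemoNode);
	base = malloc(*size);
	assert(base != NULL);

	for (i = 0; i < len; i++)
	{
		DemoNode   *node = (DemoNode *) (base + i * sizeof(DemoNode));

		node->val = (uint8_t) word[i];
		node->child_offset = (i + 1 < len)
			? (uint32_t) ((i + 1) * sizeof(DemoNode))
			: INVALID_OFFSET;
	}
	return base;
}

/* Follow child offsets from "start" and count the nodes visited. */
static int
demo_chain_length(char *base, uint32_t start)
{
	int			count = 0;
	DemoNode   *node = demo_node_get(base, start);

	while (node != NULL)
	{
		count++;
		node = demo_node_get(base, node->child_offset);
	}
	return count;
}
```

Because the buffer holds no absolute addresses, copying it byte for byte to another location (a stand-in here for another backend attaching the segment at a different address) leaves every link usable.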
On Thu, Jun 14, 2018 at 11:40:17AM +0300, Arthur Zakirov wrote:
I attached new version of the patch.
The patch still applies to HEAD. I moved it to the next commitfest.
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
On 01.10.2018 12:22, Arthur Zakirov wrote:
On Thu, Jun 14, 2018 at 11:40:17AM +0300, Arthur Zakirov wrote:
I attached new version of the patch.
The patch still applies to HEAD. I moved it to the next commitfest.
Here is the rebased patch. I also updated copyright in ts_shared.h and
ts_shared.c.
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Attachments:
0001-Fix-ispell-memory-handling-v16.patch (text/x-patch)
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index eb39466b22..eb8416ce7f 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -78,6 +78,8 @@
#define tmpalloc(sz) MemoryContextAlloc(Conf->buildCxt, (sz))
#define tmpalloc0(sz) MemoryContextAllocZero(Conf->buildCxt, (sz))
+#define tmpstrdup(str) MemoryContextStrdup(Conf->buildCxt, (str))
+
/*
* Prepare for constructing an ISpell dictionary.
*
@@ -498,7 +500,7 @@ NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
strcpy(Conf->Spell[Conf->nspell]->word, word);
Conf->Spell[Conf->nspell]->p.flag = (*flag != '\0')
- ? cpstrdup(Conf, flag) : VoidString;
+ ? tmpstrdup(flag) : VoidString;
Conf->nspell++;
}
@@ -1040,7 +1042,7 @@ setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
entry->flag.i = i;
}
else
- entry->flag.s = cpstrdup(Conf, s);
+ entry->flag.s = tmpstrdup(s);
entry->flagMode = Conf->flagMode;
entry->value = val;
@@ -1541,6 +1543,9 @@ nextline:
return;
isnewformat:
+ pfree(recoded);
+ pfree(pstr);
+
if (oldformat)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
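A note on the 0001 patch above: switching from cpstrdup() to tmpstrdup() puts those flag strings into buildCxt, the temporary construction context, so they go away together with everything else allocated there. The sketch below illustrates only that general "allocate into a region, free the region at once" idea. It is a deliberately tiny fixed-size bump allocator, not how PostgreSQL memory contexts are implemented, and DemoArena and the demo_arena_* functions are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* A fixed-capacity region; every allocation is carved from one buffer. */
typedef struct
{
	char	   *buf;
	size_t		used;
	size_t		cap;
} DemoArena;

static void
demo_arena_init(DemoArena *arena, size_t cap)
{
	arena->buf = malloc(cap);
	assert(arena->buf != NULL);
	arena->used = 0;
	arena->cap = cap;
}

/* Copy a string into the arena; there is no per-string free. */
static char *
demo_arena_strdup(DemoArena *arena, const char *str)
{
	size_t		len = strlen(str) + 1;
	char	   *copy;

	assert(arena->used + len <= arena->cap);
	copy = arena->buf + arena->used;
	memcpy(copy, str, len);
	arena->used += len;
	return copy;
}

/* Release every string handed out so far in a single step. */
static void
demo_arena_reset(DemoArena *arena)
{
	free(arena->buf);
	arena->buf = NULL;
	arena->used = 0;
	arena->cap = 0;
}
```

The point of the pattern is that callers never track individual strings: whatever is only needed while building is dropped in one call when the build finishes.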
0002-Change-tmplinit-argument-v16.patch (text/x-patch)
diff --git a/contrib/dict_int/dict_int.c b/contrib/dict_int/dict_int.c
index 628b9769c3..ddde55eee4 100644
--- a/contrib/dict_int/dict_int.c
+++ b/contrib/dict_int/dict_int.c
@@ -30,7 +30,7 @@ PG_FUNCTION_INFO_V1(dintdict_lexize);
Datum
dintdict_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictInt *d;
ListCell *l;
@@ -38,7 +38,7 @@ dintdict_init(PG_FUNCTION_ARGS)
d->maxlen = 6;
d->rejectlong = false;
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/contrib/dict_xsyn/dict_xsyn.c b/contrib/dict_xsyn/dict_xsyn.c
index 509e14aee0..15b1a0033a 100644
--- a/contrib/dict_xsyn/dict_xsyn.c
+++ b/contrib/dict_xsyn/dict_xsyn.c
@@ -140,7 +140,7 @@ read_dictionary(DictSyn *d, const char *filename)
Datum
dxsyn_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSyn *d;
ListCell *l;
char *filename = NULL;
@@ -153,7 +153,7 @@ dxsyn_init(PG_FUNCTION_ARGS)
d->matchsynonyms = false;
d->keepsynonyms = true;
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/contrib/unaccent/unaccent.c b/contrib/unaccent/unaccent.c
index fc5176e338..f3663cefd0 100644
--- a/contrib/unaccent/unaccent.c
+++ b/contrib/unaccent/unaccent.c
@@ -270,12 +270,12 @@ PG_FUNCTION_INFO_V1(unaccent_init);
Datum
unaccent_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
TrieChar *rootTrie = NULL;
bool fileloaded = false;
ListCell *l;
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/commands/tsearchcmds.c b/src/backend/commands/tsearchcmds.c
index cda21675f0..93a71adc5d 100644
--- a/src/backend/commands/tsearchcmds.c
+++ b/src/backend/commands/tsearchcmds.c
@@ -390,17 +390,25 @@ verify_dictoptions(Oid tmplId, List *dictoptions)
}
else
{
+ DictInitData init_data;
+
/*
* Copy the options just in case init method thinks it can scribble on
* them ...
*/
dictoptions = copyObject(dictoptions);
+ init_data.dict_options = dictoptions;
+ init_data.dict.id = InvalidOid;
+ init_data.dict.xmin = InvalidTransactionId;
+ init_data.dict.xmax = InvalidTransactionId;
+ ItemPointerSetInvalid(&init_data.dict.tid);
+
/*
* Call the init method and see if it complains. We don't worry about
* it leaking memory, since our command will soon be over anyway.
*/
- (void) OidFunctionCall1(initmethod, PointerGetDatum(dictoptions));
+ (void) OidFunctionCall1(initmethod, PointerGetDatum(&init_data));
}
ReleaseSysCache(tup);
diff --git a/src/backend/snowball/dict_snowball.c b/src/backend/snowball/dict_snowball.c
index 5166738310..f30f29865c 100644
--- a/src/backend/snowball/dict_snowball.c
+++ b/src/backend/snowball/dict_snowball.c
@@ -201,14 +201,14 @@ locate_stem_module(DictSnowball *d, const char *lang)
Datum
dsnowball_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSnowball *d;
bool stoploaded = false;
ListCell *l;
d = (DictSnowball *) palloc0(sizeof(DictSnowball));
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index 8b05a477f1..fc9a96abca 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -29,7 +29,7 @@ typedef struct
Datum
dispell_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictISpell *d;
bool affloaded = false,
dictloaded = false,
@@ -40,7 +40,7 @@ dispell_init(PG_FUNCTION_ARGS)
NIStartBuild(&(d->obj));
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_simple.c b/src/backend/tsearch/dict_simple.c
index 2f62ef00c8..c92744641b 100644
--- a/src/backend/tsearch/dict_simple.c
+++ b/src/backend/tsearch/dict_simple.c
@@ -29,7 +29,7 @@ typedef struct
Datum
dsimple_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSimple *d = (DictSimple *) palloc0(sizeof(DictSimple));
bool stoploaded = false,
acceptloaded = false;
@@ -37,7 +37,7 @@ dsimple_init(PG_FUNCTION_ARGS)
d->accept = true; /* default */
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_synonym.c b/src/backend/tsearch/dict_synonym.c
index b6226df940..d3f5f0da3f 100644
--- a/src/backend/tsearch/dict_synonym.c
+++ b/src/backend/tsearch/dict_synonym.c
@@ -91,7 +91,7 @@ compareSyn(const void *a, const void *b)
Datum
dsynonym_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSyn *d;
ListCell *l;
char *filename = NULL;
@@ -104,7 +104,7 @@ dsynonym_init(PG_FUNCTION_ARGS)
char *line = NULL;
uint16 flags = 0;
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_thesaurus.c b/src/backend/tsearch/dict_thesaurus.c
index 75f8deef6a..8962e252e0 100644
--- a/src/backend/tsearch/dict_thesaurus.c
+++ b/src/backend/tsearch/dict_thesaurus.c
@@ -604,7 +604,7 @@ compileTheSubstitute(DictThesaurus *d)
Datum
thesaurus_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictThesaurus *d;
char *subdictname = NULL;
bool fileloaded = false;
@@ -612,7 +612,7 @@ thesaurus_init(PG_FUNCTION_ARGS)
d = (DictThesaurus *) palloc0(sizeof(DictThesaurus));
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/utils/cache/ts_cache.c b/src/backend/utils/cache/ts_cache.c
index f7fc6c1558..a90ee5c115 100644
--- a/src/backend/utils/cache/ts_cache.c
+++ b/src/backend/utils/cache/ts_cache.c
@@ -39,6 +39,7 @@
#include "catalog/pg_ts_template.h"
#include "commands/defrem.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_public.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
#include "utils/fmgroids.h"
@@ -75,6 +76,7 @@ static TSConfigCacheEntry *lastUsedConfig = NULL;
char *TSCurrentConfig = NULL;
static Oid TSCurrentConfigCache = InvalidOid;
+static bool has_invalid_dictionary = false;
/*
@@ -98,7 +100,12 @@ InvalidateTSCacheCallBack(Datum arg, int cacheid, uint32 hashvalue)
hash_seq_init(&status, hash);
while ((entry = (TSAnyCacheEntry *) hash_seq_search(&status)) != NULL)
+ {
+ if (cacheid == TSDICTOID)
+ has_invalid_dictionary = true;
+
entry->isvalid = false;
+ }
/* Also invalidate the current-config cache if it's pg_ts_config */
if (hash == TSConfigCacheHash)
@@ -312,11 +319,15 @@ lookup_ts_dictionary_cache(Oid dictId)
MemSet(entry, 0, sizeof(TSDictionaryCacheEntry));
entry->dictId = dictId;
entry->dictCtx = saveCtx;
+ entry->dict_xmin = HeapTupleHeaderGetRawXmin(tpdict->t_data);
+ entry->dict_xmax = HeapTupleHeaderGetRawXmax(tpdict->t_data);
+ entry->dict_tid = tpdict->t_self;
entry->lexizeOid = template->tmpllexize;
if (OidIsValid(template->tmplinit))
{
+ DictInitData init_data;
List *dictoptions;
Datum opt;
bool isnull;
@@ -336,9 +347,15 @@ lookup_ts_dictionary_cache(Oid dictId)
else
dictoptions = deserialize_deflist(opt);
+ init_data.dict_options = dictoptions;
+ init_data.dict.id = dictId;
+ init_data.dict.xmin = entry->dict_xmin;
+ init_data.dict.xmax = entry->dict_xmax;
+ init_data.dict.tid = entry->dict_tid;
+
entry->dictData =
DatumGetPointer(OidFunctionCall1(template->tmplinit,
- PointerGetDatum(dictoptions)));
+ PointerGetDatum(&init_data)));
MemoryContextSwitchTo(oldcontext);
}
diff --git a/src/include/tsearch/ts_cache.h b/src/include/tsearch/ts_cache.h
index 77e325d101..cf3d870537 100644
--- a/src/include/tsearch/ts_cache.h
+++ b/src/include/tsearch/ts_cache.h
@@ -54,6 +54,10 @@ typedef struct TSDictionaryCacheEntry
Oid dictId;
bool isvalid;
+ TransactionId dict_xmin; /* XMIN of the dictionary's tuple */
+ TransactionId dict_xmax; /* XMAX of the dictionary's tuple */
+ ItemPointerData dict_tid; /* TID of the dictionary's tuple */
+
/* most frequent fmgr call */
Oid lexizeOid;
FmgrInfo lexize;
diff --git a/src/include/tsearch/ts_public.h b/src/include/tsearch/ts_public.h
index b325fa122c..e169450de3 100644
--- a/src/include/tsearch/ts_public.h
+++ b/src/include/tsearch/ts_public.h
@@ -13,6 +13,8 @@
#ifndef _PG_TS_PUBLIC_H_
#define _PG_TS_PUBLIC_H_
+#include "nodes/pg_list.h"
+#include "storage/itemptr.h"
#include "tsearch/ts_type.h"
/*
@@ -81,10 +83,70 @@ extern void readstoplist(const char *fname, StopList *s,
extern bool searchstoplist(StopList *s, char *key);
/*
- * Interface with dictionaries
+ * API for text search dictionaries.
+ *
+ * API functions to handle a text search dictionary are defined by a text search
+ * template. Currently an existing template cannot be altered in order to
+ * define other functions. API consists of the following functions:
+ * - init function - optional function which initializes internal structures of
+ * the dictionary. It accepts DictInitData structure as an argument and must
+ * return a custom palloc'd structure which stores content of the processed
+ * dictionary and is used in lexize function.
+ * - lexize function - normalizes a single word (token) using specific
+ * dictionary. It must return a palloc'd array of TSLexeme the last entry of
+ * which is the terminating entry and accepts the following arguments:
+ * - dictData - pointer to a custom structure returned by init function or
+ * NULL if init function wasn't defined by the template.
+ * - token - string which represents a token to normalize, isn't
+ * null-terminated.
+ * - length - length of token.
+ * - dictState - pointer to a DictSubState structure which stores the current
+ * state of processing a set of tokens and allows phrases to be normalized.
+ */
+
+/*
+ * A preprocessed dictionary can be stored in shared memory using DSM. The
+ * init function decides whether the dictionary wants that. A DSM segment is
+ * released if the dictionary is altered or dropped. There is still a situation
+ * in which we have no way to prevent a segment from leaking: if the dictionary
+ * was dropped and some backend used it before the drop, that backend will hold
+ * the DSM segment until it disconnects or calls lookup_ts_dictionary_cache(),
+ * where the invalid segment is unpinned.
+ *
+ * DictPointerData is a structure to search a dictionary's DSM segment. We
+ * need xmin, xmax and tid to be sure that the content in the DSM segment is
+ * still valid.
+ */
+typedef struct
+{
+ Oid id; /* OID of dictionary which is processed */
+ TransactionId xmin; /* XMIN of the dictionary's tuple */
+ TransactionId xmax; /* XMAX of the dictionary's tuple */
+ ItemPointerData tid; /* TID of the dictionary's tuple */
+} DictPointerData;
+
+/*
+ * API structure for a dictionary initialization. It is passed as an argument
+ * to a template's init function.
*/
+typedef struct
+{
+ /*
+ * A dictionary option list for a template's init method. Should go first
+ * for backward compatibility.
+ */
+ List *dict_options;
+ /*
+ * A dictionary information used to allocate, search and release its DSM
+ * segment.
+ */
+ DictPointerData dict;
+} DictInitData;
-/* return struct for any lexize function */
+/*
+ * Return struct for any lexize function. They are combined into an array, the
+ * last entry is the terminating entry.
+ */
typedef struct
{
/*----------
@@ -108,7 +170,8 @@ typedef struct
uint16 flags; /* See flag bits below */
- char *lexeme; /* C string */
+ char *lexeme; /* C string or NULL if it is a terminating
+ * entry */
} TSLexeme;
/* Flag bits that can appear in TSLexeme.flags */
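To make the purpose of DictPointerData in the 0002 patch above more concrete: the comment says xmin, xmax and tid are needed to be sure the content of a DSM segment is still valid. One way to read that is as an identity check, where a cached segment is reused only if all four fields match the dictionary's current catalog tuple. The sketch below shows such a check under that assumption. It is not taken from ts_shared.c, which is not part of this excerpt, and DemoTid, DemoDictPointer and demo_dict_pointer_matches() are hypothetical stand-ins for the PostgreSQL types.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for ItemPointerData: block number plus offset within the block. */
typedef struct
{
	uint32_t	block;
	uint16_t	offset;
} DemoTid;

/* Stand-in for DictPointerData, using plain integers for Oid and XIDs. */
typedef struct
{
	uint32_t	id;				/* dictionary OID */
	uint32_t	xmin;			/* xmin of the dictionary's tuple */
	uint32_t	xmax;			/* xmax of the dictionary's tuple */
	DemoTid		tid;			/* location of the dictionary's tuple */
} DemoDictPointer;

/*
 * A cached entry is treated as current only if it describes exactly the same
 * tuple version: same dictionary, same xmin/xmax, same tuple location.
 */
static bool
demo_dict_pointer_matches(const DemoDictPointer *cached,
						  const DemoDictPointer *current)
{
	return cached->id == current->id &&
		cached->xmin == current->xmin &&
		cached->xmax == current->xmax &&
		cached->tid.block == current->tid.block &&
		cached->tid.offset == current->tid.offset;
}
```

Under this reading, an ALTER TEXT SEARCH DICTIONARY that writes a new tuple version changes at least xmin and tid, so a segment built from the old version would no longer match even though the OID is the same.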
0003-Retreive-shared-location-for-dict-v16.patch (text/x-patch)
diff --git a/src/backend/commands/tsearchcmds.c b/src/backend/commands/tsearchcmds.c
index 93a71adc5d..6d2868ccb5 100644
--- a/src/backend/commands/tsearchcmds.c
+++ b/src/backend/commands/tsearchcmds.c
@@ -40,6 +40,7 @@
#include "nodes/makefuncs.h"
#include "parser/parse_func.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_shared.h"
#include "tsearch/ts_utils.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
@@ -518,6 +519,7 @@ RemoveTSDictionaryById(Oid dictId)
{
Relation relation;
HeapTuple tup;
+ DictPointerData dict;
relation = heap_open(TSDictionaryRelationId, RowExclusiveLock);
@@ -529,6 +531,18 @@ RemoveTSDictionaryById(Oid dictId)
CatalogTupleDelete(relation, &tup->t_self);
+ /*
+ * We need to release the dictionary's DSM segment. The segment may still
+ * leak: if some backend used the dictionary before it was dropped, that
+ * backend holds its DSM segment until it disconnects or calls
+ * lookup_ts_dictionary_cache().
+ */
+ dict.id = dictId;
+ dict.xmin = HeapTupleHeaderGetRawXmin(tup->t_data);
+ dict.xmax = HeapTupleHeaderGetRawXmax(tup->t_data);
+ dict.tid = tup->t_self;
+ ts_dict_shmem_release(&dict, true);
+
ReleaseSysCache(tup);
heap_close(relation, RowExclusiveLock);
@@ -552,6 +566,7 @@ AlterTSDictionary(AlterTSDictionaryStmt *stmt)
bool repl_null[Natts_pg_ts_dict];
bool repl_repl[Natts_pg_ts_dict];
ObjectAddress address;
+ DictPointerData dict;
dictId = get_ts_dict_oid(stmt->dictname, false);
@@ -638,6 +653,18 @@ AlterTSDictionary(AlterTSDictionaryStmt *stmt)
ObjectAddressSet(address, TSDictionaryRelationId, dictId);
+ /*
+ * We need to release the dictionary's DSM segment because it isn't valid
+ * anymore. The segment may still leak: if some backend used the dictionary
+ * before it was altered, that backend holds its DSM segment until it
+ * disconnects or calls lookup_ts_dictionary_cache().
+ */
+ dict.id = dictId;
+ dict.xmin = HeapTupleHeaderGetRawXmin(tup->t_data);
+ dict.xmax = HeapTupleHeaderGetRawXmax(tup->t_data);
+ dict.tid = tup->t_self;
+ ts_dict_shmem_release(&dict, true);
+
/*
* NOTE: because we only support altering the options, not the template,
* there is no need to update dependencies. This might have to change if
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 2849e47d99..a1af2b2692 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -44,6 +44,7 @@
#include "storage/procsignal.h"
#include "storage/sinvaladt.h"
#include "storage/spin.h"
+#include "tsearch/ts_shared.h"
#include "utils/snapmgr.h"
@@ -148,6 +149,7 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
size = add_size(size, BTreeShmemSize());
size = add_size(size, SyncScanShmemSize());
size = add_size(size, AsyncShmemSize());
+ size = add_size(size, TsearchShmemSize());
#ifdef EXEC_BACKEND
size = add_size(size, ShmemBackendArraySize());
#endif
@@ -268,6 +270,11 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
SyncScanShmemInit();
AsyncShmemInit();
+ /*
+ * Set up shared memory for tsearch
+ */
+ TsearchShmemInit();
+
#ifdef EXEC_BACKEND
/*
diff --git a/src/backend/tsearch/Makefile b/src/backend/tsearch/Makefile
index 62d8bb3254..0b25c20fb0 100644
--- a/src/backend/tsearch/Makefile
+++ b/src/backend/tsearch/Makefile
@@ -26,7 +26,7 @@ DICTFILES_PATH=$(addprefix dicts/,$(DICTFILES))
OBJS = ts_locale.o ts_parse.o wparser.o wparser_def.o dict.o \
dict_simple.o dict_synonym.o dict_thesaurus.o \
dict_ispell.o regis.o spell.o \
- to_tsany.o ts_selfuncs.o ts_typanalyze.o ts_utils.o
+ to_tsany.o ts_selfuncs.o ts_shared.o ts_typanalyze.o ts_utils.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/tsearch/ts_shared.c b/src/backend/tsearch/ts_shared.c
new file mode 100644
index 0000000000..748ab5a782
--- /dev/null
+++ b/src/backend/tsearch/ts_shared.c
@@ -0,0 +1,377 @@
+/*-------------------------------------------------------------------------
+ *
+ * ts_shared.c
+ * tsearch shared memory management
+ *
+ * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/tsearch/ts_shared.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/hash.h"
+#include "lib/dshash.h"
+#include "miscadmin.h"
+#include "storage/lwlock.h"
+#include "storage/shmem.h"
+#include "tsearch/ts_shared.h"
+#include "utils/hashutils.h"
+#include "utils/memutils.h"
+
+
+/*
+ * Hash table entries key.
+ */
+typedef struct
+{
+ Oid db_id;
+ DictPointerData dict;
+} TsearchDictKey;
+
+/*
+ * Hash table entries representing shared dictionaries.
+ */
+typedef struct
+{
+ TsearchDictKey key;
+ dsm_handle dict_dsm;
+
+ /*
+ * We need a flag that the DSM segment is pinned/unpinned. Otherwise we can
+ * face double dsm_unpin_segment().
+ */
+ bool segment_ispinned;
+} TsearchDictEntry;
+
+static dshash_table *dict_table = NULL;
+
+/*
+ * Information about the main shmem segment, used to coordinate
+ * access to the hash table and dictionaries.
+ */
+typedef struct
+{
+ dsa_handle area;
+ dshash_table_handle dict_table_handle;
+
+ LWLock lock;
+} TsearchCtlData;
+
+static TsearchCtlData *tsearch_ctl;
+
+static int tsearch_dict_cmp(const void *a, const void *b, size_t size,
+ void *arg);
+static uint32 tsearch_dict_hash(const void *a, size_t size, void *arg);
+static void init_dict_table(void);
+
+/* Parameters for dict_table */
+static const dshash_parameters dict_table_params ={
+ sizeof(TsearchDictKey),
+ sizeof(TsearchDictEntry),
+ tsearch_dict_cmp,
+ tsearch_dict_hash,
+ LWTRANCHE_TSEARCH_TABLE
+};
+
+/*
+ * Build the dictionary using allocate_cb callback.
+ *
+ * Firstly try to find the dictionary in shared hash table. If it was built by
+ * someone earlier just return its location in DSM.
+ *
+ * init_data: an argument used within a template's init method.
+ * allocate_cb: function to build the dictionary, if it wasn't found in DSM.
+ *
+ * Returns address in the dynamic shared memory segment or in backend memory.
+ */
+void *
+ts_dict_shmem_location(DictInitData *init_data,
+ ts_dict_build_callback allocate_cb)
+{
+ TsearchDictKey key;
+ TsearchDictEntry *entry;
+ bool found;
+ dsm_segment *seg;
+ void *dict,
+ *dict_location;
+ Size dict_size;
+
+ init_dict_table();
+
+ /*
+ * Build the dictionary in backend's memory if dictid is invalid (it may
+ * happen if the dictionary's init method was called within
+ * verify_dictoptions()).
+ */
+ if (!OidIsValid(init_data->dict.id))
+ {
+ dict = allocate_cb(init_data->dict_options, &dict_size);
+
+ return dict;
+ }
+
+ /* Set up key for hashtable search */
+ key.db_id = MyDatabaseId;
+ key.dict = init_data->dict;
+
+ /* Try to find an entry in the hash table */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, &key, false);
+
+ if (entry)
+ {
+ seg = dsm_find_mapping(entry->dict_dsm);
+ if (!seg)
+ {
+ seg = dsm_attach(entry->dict_dsm);
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+ }
+
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+ }
+
+ /* Dictionary hasn't been loaded into memory yet */
+ entry = (TsearchDictEntry *) dshash_find_or_insert(dict_table, &key,
+ &found);
+
+ if (found)
+ {
+ /*
+ * Someone concurrently inserted a dictionary entry since the first time
+ * we checked.
+ */
+ seg = dsm_attach(entry->dict_dsm);
+
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+ }
+
+ /* Build the dictionary */
+ dict = allocate_cb(init_data->dict_options, &dict_size);
+
+ /* Finally, allocate a DSM segment for the compiled dictionary */
+ seg = dsm_create(dict_size, 0);
+ dict_location = dsm_segment_address(seg);
+ memcpy(dict_location, dict, dict_size);
+
+ pfree(dict);
+
+ entry->key = key;
+ entry->dict_dsm = dsm_segment_handle(seg);
+ entry->segment_ispinned = true;
+
+ /* Remain attached until end of postmaster */
+ dsm_pin_segment(seg);
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+}
+
+/*
+ * Release memory occupied by the dictionary. Function just unpins DSM mapping.
+ * If nobody else has a mapping to this DSM or the dictionary was dropped or
+ * altered then unpin the DSM segment.
+ *
+ * dict: key to search the dictionary's DSM segment.
+ * unpin_segment: true if we need to unpin the segment in case if the dictionary
+ * was dropped or altered.
+ */
+void
+ts_dict_shmem_release(DictPointerData *dict, bool unpin_segment)
+{
+ TsearchDictKey key;
+ TsearchDictEntry *entry;
+
+ /*
+ * If we didn't attach to a hash table then do nothing.
+ */
+ if (!dict_table && !unpin_segment)
+ return;
+ /*
+ * But if we need to unpin the DSM segment, to get rid of the segment when
+ * the last interested process disconnects, we need the hash table to find
+ * the dictionary's entry.
+ */
+ else if (unpin_segment)
+ init_dict_table();
+
+ /* Set up key for hashtable search */
+ key.db_id = MyDatabaseId;
+ key.dict = *dict;
+
+ /* Try to find an entry in the hash table */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, &key, true);
+
+ if (entry)
+ {
+ dsm_segment *seg;
+
+ seg = dsm_find_mapping(entry->dict_dsm);
+
+ if (seg)
+ {
+ dsm_unpin_mapping(seg);
+ dsm_detach(seg);
+ }
+
+ if (unpin_segment && entry->segment_ispinned)
+ {
+ dsm_unpin_segment(entry->dict_dsm);
+ entry->segment_ispinned = false;
+ }
+
+ dshash_release_lock(dict_table, entry);
+ }
+}
+
+/*
+ * Allocate and initialize tsearch-related shared memory.
+ */
+void
+TsearchShmemInit(void)
+{
+ bool found;
+
+ tsearch_ctl = (TsearchCtlData *)
+ ShmemInitStruct("Full Text Search Ctl", sizeof(TsearchCtlData), &found);
+
+ if (!found)
+ {
+ LWLockRegisterTranche(LWTRANCHE_TSEARCH_DSA, "tsearch_dsa");
+ LWLockRegisterTranche(LWTRANCHE_TSEARCH_TABLE, "tsearch_table");
+
+ LWLockInitialize(&tsearch_ctl->lock, LWTRANCHE_TSEARCH_DSA);
+
+ tsearch_ctl->area = DSM_HANDLE_INVALID;
+ tsearch_ctl->dict_table_handle = InvalidDsaPointer;
+ }
+}
+
+/*
+ * Report shared memory space needed by TsearchShmemInit.
+ */
+Size
+TsearchShmemSize(void)
+{
+ Size size = 0;
+
+ /* size of service structure */
+ size = add_size(size, MAXALIGN(sizeof(TsearchCtlData)));
+
+ return size;
+}
+
+/*
+ * A comparator function for TsearchDictKey.
+ *
+ * Returns 0 if keys are equal, 1 otherwise.
+ */
+static int
+tsearch_dict_cmp(const void *a, const void *b, size_t size, void *arg)
+{
+ TsearchDictKey *k1 = (TsearchDictKey *) a;
+ TsearchDictKey *k2 = (TsearchDictKey *) b;
+
+ if (k1->db_id == k2->db_id && k1->dict.id == k2->dict.id &&
+ k1->dict.xmin == k2->dict.xmin && k1->dict.xmax == k2->dict.xmax &&
+ ItemPointerEquals(&k1->dict.tid, &k2->dict.tid))
+ return 0;
+ else
+ return 1;
+}
+
+/*
+ * A hash function for TsearchDictKey.
+ */
+static uint32
+tsearch_dict_hash(const void *a, size_t size, void *arg)
+{
+ TsearchDictKey *k = (TsearchDictKey *) a;
+ uint32 s;
+
+ s = hash_combine(0, hash_uint32(k->db_id));
+ s = hash_combine(s, hash_uint32(k->dict.id));
+ s = hash_combine(s, hash_uint32(k->dict.xmin));
+ s = hash_combine(s, hash_uint32(k->dict.xmax));
+ s = hash_combine(s,
+ hash_uint32(BlockIdGetBlockNumber(&k->dict.tid.ip_blkid)));
+ s = hash_combine(s, hash_uint32(k->dict.tid.ip_posid));
+
+ return s;
+}
+
+/*
+ * Initialize hash table located in DSM.
+ *
+ * The hash table should be created and initialized if it doesn't exist yet.
+ */
+static void
+init_dict_table(void)
+{
+ MemoryContext old_context;
+ dsa_area *dsa;
+
+ /* Exit if hash table was initialized already */
+ if (dict_table)
+ return;
+
+ old_context = MemoryContextSwitchTo(TopMemoryContext);
+
+recheck_table:
+ LWLockAcquire(&tsearch_ctl->lock, LW_SHARED);
+
+ /* Hash table has already been created by someone */
+ if (DsaPointerIsValid(tsearch_ctl->dict_table_handle))
+ {
+ Assert(tsearch_ctl->area != DSM_HANDLE_INVALID);
+
+ dsa = dsa_attach(tsearch_ctl->area);
+
+ dict_table = dshash_attach(dsa,
+ &dict_table_params,
+ tsearch_ctl->dict_table_handle,
+ NULL);
+ }
+ else
+ {
+ /* Try to get exclusive lock */
+ LWLockRelease(&tsearch_ctl->lock);
+ if (!LWLockAcquireOrWait(&tsearch_ctl->lock, LW_EXCLUSIVE))
+ {
+ /*
+ * The lock was released by another backend, which has
+ * concurrently created the hash table already.
+ */
+ goto recheck_table;
+ }
+
+ dsa = dsa_create(LWTRANCHE_TSEARCH_DSA);
+ tsearch_ctl->area = dsa_get_handle(dsa);
+
+ dict_table = dshash_create(dsa, &dict_table_params, NULL);
+ tsearch_ctl->dict_table_handle = dshash_get_hash_table_handle(dict_table);
+
+ /* Remain attached until end of postmaster */
+ dsa_pin(dsa);
+ }
+
+ LWLockRelease(&tsearch_ctl->lock);
+
+ /* Remain attached until end of session */
+ dsa_pin_mapping(dsa);
+
+ MemoryContextSwitchTo(old_context);
+}
diff --git a/src/backend/utils/cache/ts_cache.c b/src/backend/utils/cache/ts_cache.c
index a90ee5c115..7935667eba 100644
--- a/src/backend/utils/cache/ts_cache.c
+++ b/src/backend/utils/cache/ts_cache.c
@@ -40,6 +40,7 @@
#include "commands/defrem.h"
#include "tsearch/ts_cache.h"
#include "tsearch/ts_public.h"
+#include "tsearch/ts_shared.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
#include "utils/fmgroids.h"
@@ -88,6 +89,10 @@ static bool has_invalid_dictionary = false;
* doesn't seem worth the trouble to determine that; we just flush all the
* entries of the related hash table.
*
+ * We set has_invalid_dictionary to true to unpin all used segments later, on
+ * the first use of a text search function. It isn't safe to call
+ * ts_dict_shmem_release() here since it may call kernel functions.
+ *
* We can use the same function for all TS caches by passing the hash
* table address as the "arg".
*/
@@ -112,6 +117,33 @@ InvalidateTSCacheCallBack(Datum arg, int cacheid, uint32 hashvalue)
TSCurrentConfigCache = InvalidOid;
}
+/*
+ * Unpin shared segments of all dictionary entries.
+ */
+static void
+do_ts_dict_shmem_release(void)
+{
+ HASH_SEQ_STATUS status;
+ TSDictionaryCacheEntry *entry;
+
+ if (!has_invalid_dictionary)
+ return;
+
+ hash_seq_init(&status, TSDictionaryCacheHash);
+ while ((entry = (TSDictionaryCacheEntry *) hash_seq_search(&status)) != NULL)
+ {
+ DictPointerData dict_ptr;
+
+ dict_ptr.id = entry->dictId;
+ dict_ptr.xmin = entry->dict_xmin;
+ dict_ptr.xmax = entry->dict_xmax;
+ dict_ptr.tid = entry->dict_tid;
+ ts_dict_shmem_release(&dict_ptr, false);
+ }
+
+ has_invalid_dictionary = false;
+}
+
/*
* Fetch parser cache entry
*/
@@ -260,6 +292,13 @@ lookup_ts_dictionary_cache(Oid dictId)
Form_pg_ts_template template;
MemoryContext saveCtx;
+ /*
+ * It is possible that some invalid entries hold a DSM mapping and we
+ * need to unpin it to avoid leaking memory. We will unpin segments of
+ * all other invalid dictionaries.
+ */
+ do_ts_dict_shmem_release();
+
tpdict = SearchSysCache1(TSDICTOID, ObjectIdGetDatum(dictId));
if (!HeapTupleIsValid(tpdict))
elog(ERROR, "cache lookup failed for text search dictionary %u",
diff --git a/src/include/storage/lwlock.h b/src/include/storage/lwlock.h
index 96c7732006..49a3319a11 100644
--- a/src/include/storage/lwlock.h
+++ b/src/include/storage/lwlock.h
@@ -219,6 +219,8 @@ typedef enum BuiltinTrancheIds
LWTRANCHE_SHARED_TUPLESTORE,
LWTRANCHE_TBM,
LWTRANCHE_PARALLEL_APPEND,
+ LWTRANCHE_TSEARCH_DSA,
+ LWTRANCHE_TSEARCH_TABLE,
LWTRANCHE_FIRST_USER_DEFINED
} BuiltinTrancheIds;
diff --git a/src/include/tsearch/ts_shared.h b/src/include/tsearch/ts_shared.h
new file mode 100644
index 0000000000..4339dc6e1f
--- /dev/null
+++ b/src/include/tsearch/ts_shared.h
@@ -0,0 +1,26 @@
+/*-------------------------------------------------------------------------
+ *
+ * ts_shared.h
+ * tsearch shared memory management
+ *
+ * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
+ *
+ * src/include/tsearch/ts_shared.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef TS_SHARED_H
+#define TS_SHARED_H
+
+#include "tsearch/ts_public.h"
+
+typedef void *(*ts_dict_build_callback) (List *dictoptions, Size *size);
+
+extern void *ts_dict_shmem_location(DictInitData *init_data,
+ ts_dict_build_callback allocate_cb);
+extern void ts_dict_shmem_release(DictPointerData *dict, bool unpin_segment);
+
+extern void TsearchShmemInit(void);
+extern Size TsearchShmemSize(void);
+
+#endif /* TS_SHARED_H */
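
A note on the hash key used in ts_shared.c above: the key combines the database OID with the dictionary's OID, xmin, xmax and tid, so an altered or re-created dictionary never matches a stale segment. The sketch below illustrates that idea in standalone C. All Demo*/demo_* names are hypothetical; demo_mix() is a generic integer mixer standing in for hash_uint32() and is not PostgreSQL's function, and demo_combine() only follows the usual shift-and-xor combining pattern.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flattened version of TsearchDictKey, plain integers only. */
typedef struct
{
	uint32_t	db_id;
	uint32_t	dict_id;
	uint32_t	xmin;
	uint32_t	xmax;
	uint32_t	tid_block;
	uint16_t	tid_offset;
} DemoDictKey;

/* Comparator with memcmp-like semantics: 0 if equal, 1 otherwise. */
static int
demo_key_cmp(const DemoDictKey *a, const DemoDictKey *b)
{
	if (a->db_id == b->db_id && a->dict_id == b->dict_id &&
		a->xmin == b->xmin && a->xmax == b->xmax &&
		a->tid_block == b->tid_block && a->tid_offset == b->tid_offset)
		return 0;
	return 1;
}

/* Stand-in integer mixer (assumption: any reasonable mixer works here). */
static uint32_t
demo_mix(uint32_t v)
{
	v ^= v >> 16;
	v *= 0x85ebca6bu;
	v ^= v >> 13;
	v *= 0xc2b2ae35u;
	v ^= v >> 16;
	return v;
}

/* Fold the hash of one more field into the running value. */
static uint32_t
demo_combine(uint32_t a, uint32_t b)
{
	a ^= b + 0x9e3779b9u + (a << 6) + (a >> 2);
	return a;
}

/* Hash every field that takes part in the comparison. */
static uint32_t
demo_key_hash(const DemoDictKey *k)
{
	uint32_t	s = 0;

	s = demo_combine(s, demo_mix(k->db_id));
	s = demo_combine(s, demo_mix(k->dict_id));
	s = demo_combine(s, demo_mix(k->xmin));
	s = demo_combine(s, demo_mix(k->xmax));
	s = demo_combine(s, demo_mix(k->tid_block));
	s = demo_combine(s, demo_mix(k->tid_offset));
	return s;
}
```

Comparing and hashing field by field, as the patch does, avoids depending on the contents of padding bytes, which a plain memcmp() over the struct would include.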
0004-Store-ispell-in-shared-location-v16.patch (text/x-patch)
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index ecebade767..0f172eda04 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -3110,6 +3110,21 @@ CREATE TEXT SEARCH DICTIONARY english_stem (
</sect2>
+ <sect2 id="textsearch-shared-dictionaries">
+ <title>Dictionaries in Shared Memory</title>
+
+ <para>
+ Some dictionaries, especially <application>Ispell</application>, consume
+ a significant amount of memory, in some cases tens of megabytes. Most of
+ them store the data in text files, and building the in-memory structure is
+ both CPU and time-consuming. Instead of doing this in each backend when
+ it needs a dictionary for the first time, the compiled dictionary may be
+ stored in dynamic shared memory so that it may be reused by other backends.
+ Currently only <application>Ispell</application> supports loading into
+ dynamic shared memory.
+ </para>
+ </sect2>
+
</sect1>
<sect1 id="textsearch-configuration">
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index fc9a96abca..3c9dd78c56 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -5,6 +5,15 @@
*
* Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
*
+ * By default all Ispell dictionaries are stored in DSM. But if the amount
+ * of memory exceeds max_shared_dictionaries_size, then the dictionary will be
+ * allocated in private backend memory (in dictCtx context).
+ *
+ * All necessary data are built within dispell_build() function. But
+ * structures for regular expressions are compiled on first demand and
+ * stored using AffixReg array. It is because regex_t and Regis cannot be
+ * stored in shared memory easily.
+ *
*
* IDENTIFICATION
* src/backend/tsearch/dict_ispell.c
@@ -14,8 +23,10 @@
#include "postgres.h"
#include "commands/defrem.h"
+#include "storage/dsm.h"
#include "tsearch/dicts/spell.h"
#include "tsearch/ts_locale.h"
+#include "tsearch/ts_shared.h"
#include "tsearch/ts_utils.h"
#include "utils/builtins.h"
@@ -26,54 +37,126 @@ typedef struct
IspellDict obj;
} DictISpell;
+static void parse_dictoptions(List *dictoptions,
+ char **dictfile, char **afffile, char **stopfile);
+static void *dispell_build(List *dictoptions, Size *size);
+
Datum
dispell_init(PG_FUNCTION_ARGS)
{
DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictISpell *d;
- bool affloaded = false,
- dictloaded = false,
- stoploaded = false;
- ListCell *l;
+ void *dict_location;
+ char *stopfile;
d = (DictISpell *) palloc0(sizeof(DictISpell));
- NIStartBuild(&(d->obj));
+ parse_dictoptions(init_data->dict_options, NULL, NULL, &stopfile);
+
+ if (stopfile)
+ readstoplist(stopfile, &(d->stoplist), lowerstr);
+
+ dict_location = ts_dict_shmem_location(init_data, dispell_build);
+ Assert(dict_location);
+
+ d->obj.dict = (IspellDictData *) dict_location;
+ d->obj.reg = (AffixReg *) palloc0(d->obj.dict->nAffix *
+ sizeof(AffixReg));
+ /* Current memory context is dictionary's private memory context */
+ d->obj.dictCtx = CurrentMemoryContext;
+
+ PG_RETURN_POINTER(d);
+}
+
+Datum
+dispell_lexize(PG_FUNCTION_ARGS)
+{
+ DictISpell *d = (DictISpell *) PG_GETARG_POINTER(0);
+ char *in = (char *) PG_GETARG_POINTER(1);
+ int32 len = PG_GETARG_INT32(2);
+ char *txt;
+ TSLexeme *res;
+ TSLexeme *ptr,
+ *cptr;
+
+ if (len <= 0)
+ PG_RETURN_POINTER(NULL);
+
+ txt = lowerstr_with_len(in, len);
+ res = NINormalizeWord(&(d->obj), txt);
+
+ if (res == NULL)
+ PG_RETURN_POINTER(NULL);
+
+ cptr = res;
+ for (ptr = cptr; ptr->lexeme; ptr++)
+ {
+ if (searchstoplist(&(d->stoplist), ptr->lexeme))
+ {
+ pfree(ptr->lexeme);
+ ptr->lexeme = NULL;
+ }
+ else
+ {
+ if (cptr != ptr)
+ memcpy(cptr, ptr, sizeof(TSLexeme));
+ cptr++;
+ }
+ }
+ cptr->lexeme = NULL;
+
+ PG_RETURN_POINTER(res);
+}
+
+static void
+parse_dictoptions(List *dictoptions, char **dictfile, char **afffile,
+ char **stopfile)
+{
+ ListCell *l;
- foreach(l, init_data->dict_options)
+ if (dictfile)
+ *dictfile = NULL;
+ if (afffile)
+ *afffile = NULL;
+ if (stopfile)
+ *stopfile = NULL;
+
+ foreach(l, dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
if (strcmp(defel->defname, "dictfile") == 0)
{
- if (dictloaded)
+ if (!dictfile)
+ continue;
+
+ if (*dictfile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple DictFile parameters")));
- NIImportDictionary(&(d->obj),
- get_tsearch_config_filename(defGetString(defel),
- "dict"));
- dictloaded = true;
+ *dictfile = get_tsearch_config_filename(defGetString(defel), "dict");
}
else if (strcmp(defel->defname, "afffile") == 0)
{
- if (affloaded)
+ if (!afffile)
+ continue;
+
+ if (*afffile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple AffFile parameters")));
- NIImportAffixes(&(d->obj),
- get_tsearch_config_filename(defGetString(defel),
- "affix"));
- affloaded = true;
+ *afffile = get_tsearch_config_filename(defGetString(defel), "affix");
}
else if (strcmp(defel->defname, "stopwords") == 0)
{
- if (stoploaded)
+ if (!stopfile)
+ continue;
+
+ if (*stopfile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple StopWords parameters")));
- readstoplist(defGetString(defel), &(d->stoplist), lowerstr);
- stoploaded = true;
+ *stopfile = defGetString(defel);
}
else
{
@@ -83,66 +166,52 @@ dispell_init(PG_FUNCTION_ARGS)
defel->defname)));
}
}
+}
- if (affloaded && dictloaded)
- {
- NISortDictionary(&(d->obj));
- NISortAffixes(&(d->obj));
- }
- else if (!affloaded)
+/*
+ * Build the dictionary.
+ *
+ * Result is palloc'ed.
+ */
+static void *
+dispell_build(List *dictoptions, Size *size)
+{
+ IspellDictBuild build;
+ char *dictfile,
+ *afffile;
+
+ parse_dictoptions(dictoptions, &dictfile, &afffile, NULL);
+
+ if (!afffile)
{
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("missing AffFile parameter")));
}
- else
+ else if (!dictfile)
{
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("missing DictFile parameter")));
}
- NIFinishBuild(&(d->obj));
+ MemSet(&build, 0, sizeof(build));
+ NIStartBuild(&build);
- PG_RETURN_POINTER(d);
-}
+ /* Read files */
+ NIImportDictionary(&build, dictfile);
+ NIImportAffixes(&build, afffile);
-Datum
-dispell_lexize(PG_FUNCTION_ARGS)
-{
- DictISpell *d = (DictISpell *) PG_GETARG_POINTER(0);
- char *in = (char *) PG_GETARG_POINTER(1);
- int32 len = PG_GETARG_INT32(2);
- char *txt;
- TSLexeme *res;
- TSLexeme *ptr,
- *cptr;
+ /* Build persistent data to use by backends */
+ NISortDictionary(&build);
+ NISortAffixes(&build);
- if (len <= 0)
- PG_RETURN_POINTER(NULL);
+ NICopyData(&build);
- txt = lowerstr_with_len(in, len);
- res = NINormalizeWord(&(d->obj), txt);
-
- if (res == NULL)
- PG_RETURN_POINTER(NULL);
+ /* Release temporary data */
+ NIFinishBuild(&build);
- cptr = res;
- for (ptr = cptr; ptr->lexeme; ptr++)
- {
- if (searchstoplist(&(d->stoplist), ptr->lexeme))
- {
- pfree(ptr->lexeme);
- ptr->lexeme = NULL;
- }
- else
- {
- if (cptr != ptr)
- memcpy(cptr, ptr, sizeof(TSLexeme));
- cptr++;
- }
- }
- cptr->lexeme = NULL;
-
- PG_RETURN_POINTER(res);
+ /* Return the buffer and its size */
+ *size = build.dict_size;
+ return build.dict;
}
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index eb8416ce7f..123fba7a11 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -23,33 +23,35 @@
* Compilation of a dictionary
* ---------------------------
*
- * A compiled dictionary is stored in the IspellDict structure. Compilation of
- * a dictionary is divided into the several steps:
+ * A compiled dictionary is stored in the following structures:
+ * - IspellDictBuild - stores temporary data and IspellDictData
+ * - IspellDictData - stores permanent data used within NINormalizeWord()
+ * Compilation of the dictionary is divided into several steps:
* - NIImportDictionary() - stores each word of a .dict file in the
* temporary Spell field.
- * - NIImportAffixes() - stores affix rules of an .affix file in the
- * Affix field (not temporary) if an .affix file has the Ispell format.
+ * - NIImportAffixes() - stores affix rules of an .affix file in the temporary
+ * Affix field if an .affix file has the Ispell format.
* -> NIImportOOAffixes() - stores affix rules if an .affix file has the
* Hunspell format. The AffixData field is initialized if AF parameter
* is defined.
* - NISortDictionary() - builds a prefix tree (Trie) from the words list
- * and stores it in the Dictionary field. The words list is got from the
+ * and stores it in the DictNodes field. The words list is got from the
* Spell field. The AffixData field is initialized if AF parameter is not
* defined.
* - NISortAffixes():
* - builds a list of compound affixes from the affix list and stores it
* in the CompoundAffix.
* - builds prefix trees (Trie) from the affix list for prefixes and suffixes
- * and stores them in Suffix and Prefix fields.
+ * and stores them in SuffixNodes and PrefixNodes fields.
* The affix list is got from the Affix field.
+ * Persistent data of the dictionary is copied within NICopyData().
*
* Memory management
* -----------------
*
- * The IspellDict structure has the Spell field which is used only in compile
- * time. The Spell field stores a words list. It can take a lot of memory.
- * Therefore when a dictionary is compiled this field is cleared by
- * NIFinishBuild().
+ * The IspellDictBuild structure has the temporary data which is used only at
+ * compile time. It can take a lot of memory. Therefore after compiling the
+ * dictionary this data is cleared by NIFinishBuild().
*
* All resources which should cleared by NIFinishBuild() is initialized using
* tmpalloc() and tmpalloc0().
@@ -73,112 +75,166 @@
* after the initialization is done. During initialization,
* CurrentMemoryContext is the long-lived memory context associated
* with the dictionary cache entry. We keep the short-lived stuff
- * in the Conf->buildCxt context.
+ * in the ConfBuild->buildCxt context.
*/
-#define tmpalloc(sz) MemoryContextAlloc(Conf->buildCxt, (sz))
-#define tmpalloc0(sz) MemoryContextAllocZero(Conf->buildCxt, (sz))
+#define tmpalloc(sz) MemoryContextAlloc(ConfBuild->buildCxt, (sz))
+#define tmpalloc0(sz) MemoryContextAllocZero(ConfBuild->buildCxt, (sz))
-#define tmpstrdup(str) MemoryContextStrdup(Conf->buildCxt, (str))
+#define tmpstrdup(str) MemoryContextStrdup(ConfBuild->buildCxt, (str))
/*
* Prepare for constructing an ISpell dictionary.
*
- * The IspellDict struct is assumed to be zeroed when allocated.
+ * The IspellDictBuild struct is assumed to be zeroed when allocated.
*/
void
-NIStartBuild(IspellDict *Conf)
+NIStartBuild(IspellDictBuild *ConfBuild)
{
+ uint32 dict_size;
+
/*
* The temp context is a child of CurTransactionContext, so that it will
* go away automatically on error.
*/
- Conf->buildCxt = AllocSetContextCreate(CurTransactionContext,
- "Ispell dictionary init context",
- ALLOCSET_DEFAULT_SIZES);
+ ConfBuild->buildCxt = AllocSetContextCreate(CurTransactionContext,
+ "Ispell dictionary init context",
+ ALLOCSET_DEFAULT_SIZES);
+
+ /*
+ * Allocate buffer for the dictionary in current context, not in buildCxt.
+ */
+ dict_size = MAXALIGN(IspellDictDataHdrSize);
+ ConfBuild->dict = palloc0(dict_size);
+ ConfBuild->dict_size = dict_size;
}
/*
- * Clean up when dictionary construction is complete.
+ * Copy compiled and persistent data into IspellDictData.
*/
void
-NIFinishBuild(IspellDict *Conf)
+NICopyData(IspellDictBuild *ConfBuild)
{
- /* Release no-longer-needed temp memory */
- MemoryContextDelete(Conf->buildCxt);
- /* Just for cleanliness, zero the now-dangling pointers */
- Conf->buildCxt = NULL;
- Conf->Spell = NULL;
- Conf->firstfree = NULL;
- Conf->CompoundAffixFlags = NULL;
-}
+ IspellDictData *dict;
+ uint32 size;
+ int i;
+ uint32 *offsets,
+ offset = 0;
+ SPNode *dict_node PG_USED_FOR_ASSERTS_ONLY;
+ AffixNode *aff_node PG_USED_FOR_ASSERTS_ONLY;
+ /*
+ * Calculate necessary space
+ */
+ size = ConfBuild->nAffixData * sizeof(uint32);
+ size += ConfBuild->AffixDataEnd;
-/*
- * "Compact" palloc: allocate without extra palloc overhead.
- *
- * Since we have no need to free the ispell data items individually, there's
- * not much value in the per-chunk overhead normally consumed by palloc.
- * Getting rid of it is helpful since ispell can allocate a lot of small nodes.
- *
- * We currently pre-zero all data allocated this way, even though some of it
- * doesn't need that. The cpalloc and cpalloc0 macros are just documentation
- * to indicate which allocations actually require zeroing.
- */
-#define COMPACT_ALLOC_CHUNK 8192 /* amount to get from palloc at once */
-#define COMPACT_MAX_REQ 1024 /* must be < COMPACT_ALLOC_CHUNK */
+ size += ConfBuild->nAffix * sizeof(uint32);
+ size += ConfBuild->AffixSize;
-static void *
-compact_palloc0(IspellDict *Conf, size_t size)
-{
- void *result;
+ size += ConfBuild->DictNodes.NodesEnd;
+ size += ConfBuild->PrefixNodes.NodesEnd;
+ size += ConfBuild->SuffixNodes.NodesEnd;
- /* Should only be called during init */
- Assert(Conf->buildCxt != NULL);
+ size += sizeof(CMPDAffix) * ConfBuild->nCompoundAffix;
- /* No point in this for large chunks */
- if (size > COMPACT_MAX_REQ)
- return palloc0(size);
+ /*
+ * Copy data itself
+ */
+ ConfBuild->dict_size = IspellDictDataHdrSize + size;
+ ConfBuild->dict = repalloc(ConfBuild->dict, ConfBuild->dict_size);
+
+ dict = ConfBuild->dict;
+
+ /* AffixData */
+ dict->nAffixData = ConfBuild->nAffixData;
+ dict->AffixDataStart = sizeof(uint32) * ConfBuild->nAffixData;
+ memcpy(DictAffixDataOffset(dict), ConfBuild->AffixDataOffset,
+ sizeof(uint32) * ConfBuild->nAffixData);
+ memcpy(DictAffixData(dict), ConfBuild->AffixData, ConfBuild->AffixDataEnd);
+
+ /* Affix array */
+ dict->nAffix = ConfBuild->nAffix;
+ dict->AffixOffsetStart = dict->AffixDataStart + ConfBuild->AffixDataEnd;
+ dict->AffixStart = dict->AffixOffsetStart + sizeof(uint32) * ConfBuild->nAffix;
+ if (ConfBuild->nAffix > 0)
+ {
+ offsets = (uint32 *) DictAffixOffset(dict);
+ for (i = 0; i < ConfBuild->nAffix; i++)
+ {
+ AFFIX *affix;
+ uint32 size = AffixGetSize(ConfBuild->Affix[i]);
- /* Keep everything maxaligned */
- size = MAXALIGN(size);
+ offsets[i] = offset;
+ affix = (AFFIX *) DictAffixGet(dict, i);
+ Assert(affix);
- /* Need more space? */
- if (size > Conf->avail)
- {
- Conf->firstfree = palloc0(COMPACT_ALLOC_CHUNK);
- Conf->avail = COMPACT_ALLOC_CHUNK;
- }
+ memcpy(affix, ConfBuild->Affix[i], size);
- result = (void *) Conf->firstfree;
- Conf->firstfree += size;
- Conf->avail -= size;
+ offset += size;
+ }
+ }
- return result;
+ /* DictNodes prefix tree */
+ dict->DictNodesStart = dict->AffixStart + offset;
+ /* We have at least one root node even if dictionary list is empty */
+ dict_node = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, 0);
+ Assert(dict_node && dict_node->length > 0);
+ /* Copy dictionary nodes into persistent structure */
+ memcpy(DictDictNodes(dict), ConfBuild->DictNodes.Nodes,
+ ConfBuild->DictNodes.NodesEnd);
+
+ /* PrefixNodes prefix tree */
+ dict->PrefixNodesStart = dict->DictNodesStart + ConfBuild->DictNodes.NodesEnd;
+ /* We have at least one root node even if prefix list is empty */
+ aff_node = (AffixNode *) NodeArrayGet(&ConfBuild->PrefixNodes, 0);
+ Assert(aff_node && aff_node->length > 0);
+ /* Copy prefix nodes into persistent structure */
+ memcpy(DictPrefixNodes(dict), ConfBuild->PrefixNodes.Nodes,
+ ConfBuild->PrefixNodes.NodesEnd);
+
+ /* SuffixNodes prefix tree */
+ dict->SuffixNodesStart = dict->PrefixNodesStart + ConfBuild->PrefixNodes.NodesEnd;
+ /* We have at least one root node even if suffix list is empty */
+ aff_node = (AffixNode *) NodeArrayGet(&ConfBuild->SuffixNodes, 0);
+ Assert(aff_node && aff_node->length > 0);
+ /* Copy suffix nodes into persistent structure */
+ memcpy(DictSuffixNodes(dict), ConfBuild->SuffixNodes.Nodes,
+ ConfBuild->SuffixNodes.NodesEnd);
+
+ /* CompoundAffix array */
+ dict->CompoundAffixStart = dict->SuffixNodesStart +
+ ConfBuild->SuffixNodes.NodesEnd;
+ /* We have at least one CompoundAffix terminating entry */
+ Assert(ConfBuild->nCompoundAffix > 0);
+ /* Copy array of compound affixes into persistent structure */
+ memcpy(DictCompoundAffix(dict), ConfBuild->CompoundAffix,
+ sizeof(CMPDAffix) * ConfBuild->nCompoundAffix);
}
-#define cpalloc(size) compact_palloc0(Conf, size)
-#define cpalloc0(size) compact_palloc0(Conf, size)
-
-static char *
-cpstrdup(IspellDict *Conf, const char *str)
+/*
+ * Clean up when dictionary construction is complete.
+ */
+void
+NIFinishBuild(IspellDictBuild *ConfBuild)
{
- char *res = cpalloc(strlen(str) + 1);
-
- strcpy(res, str);
- return res;
+ /* Release no-longer-needed temp memory */
+ MemoryContextDelete(ConfBuild->buildCxt);
+ /* Just for cleanliness, zero the now-dangling pointers */
+ ConfBuild->buildCxt = NULL;
+ ConfBuild->Spell = NULL;
+ ConfBuild->CompoundAffixFlags = NULL;
}
-
/*
* Apply lowerstr(), producing a temporary result (in the buildCxt).
*/
static char *
-lowerstr_ctx(IspellDict *Conf, const char *src)
+lowerstr_ctx(IspellDictBuild *ConfBuild, const char *src)
{
MemoryContext saveCtx;
char *dst;
- saveCtx = MemoryContextSwitchTo(Conf->buildCxt);
+ saveCtx = MemoryContextSwitchTo(ConfBuild->buildCxt);
dst = lowerstr(src);
MemoryContextSwitchTo(saveCtx);
@@ -190,7 +246,7 @@ lowerstr_ctx(IspellDict *Conf, const char *src)
#define STRNCMP(s,p) strncmp( (s), (p), strlen(p) )
#define GETWCHAR(W,L,N,T) ( ((const uint8*)(W))[ ((T)==FF_PREFIX) ? (N) : ( (L) - 1 - (N) ) ] )
-#define GETCHAR(A,N,T) GETWCHAR( (A)->repl, (A)->replen, N, T )
+#define GETCHAR(A,N,T) GETWCHAR( AffixFieldRepl(A), (A)->replen, N, T )
static char *VoidString = "";
@@ -311,18 +367,189 @@ strbncmp(const unsigned char *s1, const unsigned char *s2, size_t count)
static int
cmpaffix(const void *s1, const void *s2)
{
- const AFFIX *a1 = (const AFFIX *) s1;
- const AFFIX *a2 = (const AFFIX *) s2;
+ const AFFIX *a1 = *((AFFIX *const *) s1);
+ const AFFIX *a2 = *((AFFIX *const *) s2);
if (a1->type < a2->type)
return -1;
if (a1->type > a2->type)
return 1;
if (a1->type == FF_PREFIX)
- return strcmp(a1->repl, a2->repl);
+ return strcmp(AffixFieldRepl(a1), AffixFieldRepl(a2));
else
- return strbcmp((const unsigned char *) a1->repl,
- (const unsigned char *) a2->repl);
+ return strbcmp((const unsigned char *) AffixFieldRepl(a1),
+ (const unsigned char *) AffixFieldRepl(a2));
+}
+
+/*
+ * Allocate space for AffixData.
+ */
+static void
+InitAffixData(IspellDictBuild *ConfBuild, int numAffixData)
+{
+ uint32 size;
+
+ size = 8 * 1024 /* Reserve 8KB for data */;
+
+ ConfBuild->AffixData = (char *) tmpalloc(size);
+ ConfBuild->AffixDataSize = size;
+ ConfBuild->AffixDataOffset = (uint32 *) tmpalloc(numAffixData * sizeof(uint32));
+ ConfBuild->nAffixData = 0;
+ ConfBuild->mAffixData = numAffixData;
+
+ /* Save offset of the end of data */
+ ConfBuild->AffixDataEnd = 0;
+}
+
+/*
+ * Add a set of affix flags to ConfBuild->AffixData. If the buffer cannot fit
+ * the new affix set, enlarge it.
+ *
+ * ConfBuild: building structure for the current dictionary.
+ * AffixSet: set of affix flags.
+ */
+static void
+AddAffixSet(IspellDictBuild *ConfBuild, const char *AffixSet,
+ uint32 AffixSetLen)
+{
+ /*
+ * Check available space for AffixSet.
+ */
+ if (ConfBuild->AffixDataEnd + AffixSetLen + 1 /* \0 */ >=
+ ConfBuild->AffixDataSize)
+ {
+ uint32 newsize = Max(ConfBuild->AffixDataSize + 8 * 1024 /* 8KB */,
+ ConfBuild->AffixDataSize + AffixSetLen + 1);
+
+ ConfBuild->AffixData = (char *) repalloc(ConfBuild->AffixData, newsize);
+ ConfBuild->AffixDataSize = newsize;
+ }
+
+ /* Check available number of offsets */
+ if (ConfBuild->nAffixData >= ConfBuild->mAffixData)
+ {
+ ConfBuild->mAffixData *= 2;
+ ConfBuild->AffixDataOffset = (uint32 *) repalloc(ConfBuild->AffixDataOffset,
+ sizeof(uint32) * ConfBuild->mAffixData);
+ }
+
+ ConfBuild->AffixDataOffset[ConfBuild->nAffixData] = ConfBuild->AffixDataEnd;
+ StrNCpy(AffixDataGet(ConfBuild, ConfBuild->nAffixData),
+ AffixSet, AffixSetLen + 1);
+
+ /* Save offset of the end of data */
+ ConfBuild->AffixDataEnd += AffixSetLen + 1;
+ ConfBuild->nAffixData++;
+}
+
+/*
+ * Allocate space for prefix tree node.
+ *
+ * ConfBuild: building structure for the current dictionary.
+ * array: NodeArray where to allocate new node.
+ * length: number of allocated NodeData.
+ * sizeNodeData: minimum size of each NodeData.
+ * sizeNodeHeader: size of header of new node.
+ *
+ * Returns an offset of new node in NodeArray->Nodes.
+ */
+static uint32
+AllocateNode(IspellDictBuild *ConfBuild, NodeArray *array, uint32 length,
+ uint32 sizeNodeData, uint32 sizeNodeHeader)
+{
+ uint32 node_offset;
+ uint32 size;
+
+ size = sizeNodeHeader + length * sizeNodeData;
+ size = MAXALIGN(size);
+
+ if (array->NodesSize == 0)
+ {
+ array->NodesSize = size * 32; /* Reserve space for next levels of the
+ * prefix tree */
+ array->Nodes = (char *) tmpalloc(array->NodesSize);
+ array->NodesEnd = 0;
+ }
+ else if (array->NodesEnd + size >= array->NodesSize)
+ {
+ array->NodesSize = Max(array->NodesSize * 2, array->NodesSize + size);
+ array->Nodes = (char *) repalloc(array->Nodes, array->NodesSize);
+ }
+
+ node_offset = array->NodesEnd;
+ array->NodesEnd += size;
+
+ return node_offset;
+}
+
+/*
+ * Allocate space for SPNode.
+ *
+ * Returns an offset of new node in ConfBuild->DictNodes->Nodes.
+ */
+static uint32
+AllocateSPNode(IspellDictBuild *ConfBuild, uint32 length)
+{
+ uint32 offset;
+ SPNode *node;
+ SPNodeData *data;
+ uint32 i;
+
+ offset = AllocateNode(ConfBuild, &ConfBuild->DictNodes, length,
+ sizeof(SPNodeData), SPNHDRSZ);
+ node = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, offset);
+ node->length = length;
+
+ /*
+ * Initialize all SPNodeData with default values. We cannot use memset()
+ * here because not all fields have 0 as default value.
+ */
+ for (i = 0; i < length; i++)
+ {
+ data = &(node->data[i]);
+ data->val = 0;
+ data->affix = ISPELL_INVALID_INDEX;
+ data->compoundflag = 0;
+ data->isword = 0;
+ data->node_offset = ISPELL_INVALID_OFFSET;
+ }
+
+ return offset;
+}
+
+/*
+ * Allocate space for AffixNode.
+ *
+ * Returns an offset of new node in NodeArray->Nodes.
+ */
+static uint32
+AllocateAffixNode(IspellDictBuild *ConfBuild, NodeArray *array, uint32 length)
+{
+ uint32 offset;
+ AffixNode *node;
+ AffixNodeData *data;
+ uint32 i;
+
+ offset = AllocateNode(ConfBuild, array, length, sizeof(AffixNodeData),
+ ANHRDSZ);
+ node = (AffixNode *) NodeArrayGet(array, offset);
+ node->length = length;
+ node->isvoid = 0;
+
+ /*
+ * Initialize all AffixNodeData with default values. We cannot use memset()
+ * here because not all fields have 0 as default value.
+ */
+ for (i = 0; i < length; i++)
+ {
+ data = &(node->data[i]);
+ data->val = 0;
+ data->affstart = ISPELL_INVALID_INDEX;
+ data->affend = ISPELL_INVALID_INDEX;
+ data->node_offset = ISPELL_INVALID_OFFSET;
+ }
+
+ return offset;
}
/*
@@ -333,7 +560,7 @@ cmpaffix(const void *s1, const void *s2)
* - 2 characters (FM_LONG). A character may be Unicode.
* - numbers from 1 to 65000 (FM_NUM).
*
- * Depending on the flagMode an affix string can have the following format:
+ * Depending on the flagmode an affix string can have the following format:
* - FM_CHAR: ABCD
* Here we have 4 flags: A, B, C and D
* - FM_LONG: ABCDE*
@@ -341,13 +568,13 @@ cmpaffix(const void *s1, const void *s2)
* - FM_NUM: 200,205,50
* Here we have 3 flags: 200, 205 and 50
*
- * Conf: current dictionary.
+ * flagmode: flag mode of the dictionary
* sflagset: the set of affix flags. Returns a reference to the start of a next
* affix flag.
* sflag: returns an affix flag from sflagset.
*/
static void
-getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
+getNextFlagFromString(FlagMode flagmode, char **sflagset, char *sflag)
{
int32 s;
char *next,
@@ -356,11 +583,11 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
bool stop = false;
bool met_comma = false;
- maxstep = (Conf->flagMode == FM_LONG) ? 2 : 1;
+ maxstep = (flagmode == FM_LONG) ? 2 : 1;
while (**sflagset)
{
- switch (Conf->flagMode)
+ switch (flagmode)
{
case FM_LONG:
case FM_CHAR:
@@ -422,15 +649,15 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
stop = true;
break;
default:
- elog(ERROR, "unrecognized type of Conf->flagMode: %d",
- Conf->flagMode);
+ elog(ERROR, "unrecognized type of flagmode: %d",
+ flagmode);
}
if (stop)
break;
}
- if (Conf->flagMode == FM_LONG && maxstep > 0)
+ if (flagmode == FM_LONG && maxstep > 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix flag \"%s\" with \"long\" flag value",
@@ -440,31 +667,28 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
}
/*
- * Checks if the affix set Conf->AffixData[affix] contains affixflag.
- * Conf->AffixData[affix] does not contain affixflag if this flag is not used
- * actually by the .dict file.
+ * Checks if the affix set from AffixData contains affixflag. The affix set
+ * does not contain affixflag if the flag is not actually used by the .dict file.
*
- * Conf: current dictionary.
- * affix: index of the Conf->AffixData array.
+ * flagmode: flag mode of the dictionary.
+ * sflagset: the set of affix flags.
* affixflag: the affix flag.
*
- * Returns true if the string Conf->AffixData[affix] contains affixflag,
- * otherwise returns false.
+ * Returns true if the affix set string contains affixflag, otherwise returns
+ * false.
*/
static bool
-IsAffixFlagInUse(IspellDict *Conf, int affix, const char *affixflag)
+IsAffixFlagInUse(FlagMode flagmode, char *sflagset, const char *affixflag)
{
- char *flagcur;
+ char *flagcur = sflagset;
char flag[BUFSIZ];
if (*affixflag == 0)
return true;
- flagcur = Conf->AffixData[affix];
-
while (*flagcur)
{
- getNextFlagFromString(Conf, &flagcur, flag);
+ getNextFlagFromString(flagmode, &flagcur, flag);
/* Compare first affix flag in flagcur with affixflag */
if (strcmp(flag, affixflag) == 0)
return true;
@@ -477,31 +701,33 @@ IsAffixFlagInUse(IspellDict *Conf, int affix, const char *affixflag)
/*
* Adds the new word into the temporary array Spell.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* word: new word.
* flag: set of affix flags. Single flag can be get by getNextFlagFromString().
*/
static void
-NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
+NIAddSpell(IspellDictBuild *ConfBuild, const char *word, const char *flag)
{
- if (Conf->nspell >= Conf->mspell)
+ if (ConfBuild->nSpell >= ConfBuild->mSpell)
{
- if (Conf->mspell)
+ if (ConfBuild->mSpell)
{
- Conf->mspell *= 2;
- Conf->Spell = (SPELL **) repalloc(Conf->Spell, Conf->mspell * sizeof(SPELL *));
+ ConfBuild->mSpell *= 2;
+ ConfBuild->Spell = (SPELL **) repalloc(ConfBuild->Spell,
+ ConfBuild->mSpell * sizeof(SPELL *));
}
else
{
- Conf->mspell = 1024 * 20;
- Conf->Spell = (SPELL **) tmpalloc(Conf->mspell * sizeof(SPELL *));
+ ConfBuild->mSpell = 1024 * 20;
+ ConfBuild->Spell = (SPELL **) tmpalloc(ConfBuild->mSpell * sizeof(SPELL *));
}
}
- Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
- strcpy(Conf->Spell[Conf->nspell]->word, word);
- Conf->Spell[Conf->nspell]->p.flag = (*flag != '\0')
+ ConfBuild->Spell[ConfBuild->nSpell] =
+ (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
+ strcpy(ConfBuild->Spell[ConfBuild->nSpell]->word, word);
+ ConfBuild->Spell[ConfBuild->nSpell]->p.flag = (*flag != '\0')
? tmpstrdup(flag) : VoidString;
- Conf->nspell++;
+ ConfBuild->nSpell++;
}
/*
@@ -509,11 +735,11 @@ NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
*
* Note caller must already have applied get_tsearch_config_filename.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* filename: path to the .dict file.
*/
void
-NIImportDictionary(IspellDict *Conf, const char *filename)
+NIImportDictionary(IspellDictBuild *ConfBuild, const char *filename)
{
tsearch_readline_state trst;
char *line;
@@ -564,9 +790,9 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
}
s += pg_mblen(s);
}
- pstr = lowerstr_ctx(Conf, line);
+ pstr = lowerstr_ctx(ConfBuild, line);
- NIAddSpell(Conf, pstr, flag);
+ NIAddSpell(ConfBuild, pstr, flag);
pfree(pstr);
pfree(line);
@@ -590,7 +816,7 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
* SFX M 0 's .
* is presented here.
*
- * Conf: current dictionary.
+ * dict: current dictionary.
* word: basic form of word.
* affixflag: affix flag, by which a basic form of word was generated.
* flag: compound flag used to compare with StopMiddle->compoundflag.
@@ -598,9 +824,9 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
* Returns 1 if the word was found in the prefix tree, else returns 0.
*/
static int
-FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
+FindWord(IspellDictData *dict, const char *word, const char *affixflag, int flag)
{
- SPNode *node = Conf->Dictionary;
+ SPNode *node = (SPNode *) DictDictNodes(dict);
SPNodeData *StopLow,
*StopHigh,
*StopMiddle;
@@ -636,10 +862,14 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
* Check if this affix rule is presented in the affix set
* with index StopMiddle->affix.
*/
- if (IsAffixFlagInUse(Conf, StopMiddle->affix, affixflag))
+ if (IsAffixFlagInUse(dict->flagMode,
+ DictAffixDataGet(dict, StopMiddle->affix),
+ affixflag))
return 1;
}
- node = StopMiddle->node;
+ /* Retrieve SPNode by the offset */
+ node = (SPNode *) DictNodeGet(DictDictNodes(dict),
+ StopMiddle->node_offset);
ptr++;
break;
}
@@ -657,7 +887,8 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
/*
* Adds a new affix rule to the Affix field.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary; it is used to
+ * allocate temporary data.
* flag: affix flag ('\' in the below example).
* flagflags: set of flags from the flagval field for this affix rule. This set
* is listed after '/' character in the added string (repl).
@@ -673,26 +904,54 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
* type: FF_SUFFIX or FF_PREFIX.
*/
static void
-NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
- const char *find, const char *repl, int type)
+NIAddAffix(IspellDictBuild *ConfBuild, const char *flag, char flagflags,
+ const char *mask, const char *find, const char *repl, int type)
{
AFFIX *Affix;
+ uint32 size;
+ uint32 flaglen = strlen(flag),
+ findlen = strlen(find),
+ repllen = strlen(repl),
+ masklen = strlen(mask);
+
+ /* Sanity checks */
+ if (flaglen > AF_FLAG_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix flag \"%s\" too long", flag)));
+ if (findlen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix find field \"%s\" too long", find)));
+ if (repllen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix repl field \"%s\" too long", repl)));
+ if (masklen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix mask field \"%s\" too long", mask)));
- if (Conf->naffixes >= Conf->maffixes)
+ if (ConfBuild->nAffix >= ConfBuild->mAffix)
{
- if (Conf->maffixes)
+ if (ConfBuild->mAffix)
{
- Conf->maffixes *= 2;
- Conf->Affix = (AFFIX *) repalloc((void *) Conf->Affix, Conf->maffixes * sizeof(AFFIX));
+ ConfBuild->mAffix *= 2;
+ ConfBuild->Affix = (AFFIX **) repalloc(ConfBuild->Affix,
+ ConfBuild->mAffix * sizeof(AFFIX *));
}
else
{
- Conf->maffixes = 16;
- Conf->Affix = (AFFIX *) palloc(Conf->maffixes * sizeof(AFFIX));
+ ConfBuild->mAffix = 255;
+ ConfBuild->Affix = (AFFIX **) tmpalloc(ConfBuild->mAffix * sizeof(AFFIX *));
}
}
- Affix = Conf->Affix + Conf->naffixes;
+ size = AFFIXHDRSZ + flaglen + 1 /* \0 */ + findlen + 1 /* \0 */ +
+ repllen + 1 /* \0 */ + masklen + 1 /* \0 */;
+
+ Affix = (AFFIX *) tmpalloc(size);
+ ConfBuild->Affix[ConfBuild->nAffix] = Affix;
/* This affix rule can be applied for words with any ending */
if (strcmp(mask, ".") == 0 || *mask == '\0')
@@ -705,42 +964,12 @@ NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
{
Affix->issimple = 0;
Affix->isregis = 1;
- RS_compile(&(Affix->reg.regis), (type == FF_SUFFIX),
- *mask ? mask : VoidString);
}
/* This affix rule will use regex_t to search word ending */
else
{
- int masklen;
- int wmasklen;
- int err;
- pg_wchar *wmask;
- char *tmask;
-
Affix->issimple = 0;
Affix->isregis = 0;
- tmask = (char *) tmpalloc(strlen(mask) + 3);
- if (type == FF_SUFFIX)
- sprintf(tmask, "%s$", mask);
- else
- sprintf(tmask, "^%s", mask);
-
- masklen = strlen(tmask);
- wmask = (pg_wchar *) tmpalloc((masklen + 1) * sizeof(pg_wchar));
- wmasklen = pg_mb2wchar_with_len(tmask, wmask, masklen);
-
- err = pg_regcomp(&(Affix->reg.regex), wmask, wmasklen,
- REG_ADVANCED | REG_NOSUB,
- DEFAULT_COLLATION_OID);
- if (err)
- {
- char errstr[100];
-
- pg_regerror(err, &(Affix->reg.regex), errstr, sizeof(errstr));
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
- errmsg("invalid regular expression: %s", errstr)));
- }
}
Affix->flagflags = flagflags;
@@ -749,15 +978,22 @@ NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
if ((Affix->flagflags & FF_COMPOUNDFLAG) == 0)
Affix->flagflags |= FF_COMPOUNDFLAG;
}
- Affix->flag = cpstrdup(Conf, flag);
+
Affix->type = type;
- Affix->find = (find && *find) ? cpstrdup(Conf, find) : VoidString;
- if ((Affix->replen = strlen(repl)) > 0)
- Affix->repl = cpstrdup(Conf, repl);
- else
- Affix->repl = VoidString;
- Conf->naffixes++;
+ Affix->replen = repllen;
+ StrNCpy(AffixFieldRepl(Affix), repl, repllen + 1);
+
+ Affix->findlen = findlen;
+ StrNCpy(AffixFieldFind(Affix), find, findlen + 1);
+
+ Affix->masklen = masklen;
+ StrNCpy(AffixFieldMask(Affix), mask, masklen + 1);
+
+ StrNCpy(AffixFieldFlag(Affix), flag, flaglen + 1);
+
+ ConfBuild->nAffix++;
+ ConfBuild->AffixSize += size;
}
/* Parsing states for parse_affentry() and friends */
@@ -1021,10 +1257,10 @@ parse_affentry(char *str, char *mask, char *find, char *repl)
* Sets a Hunspell options depending on flag type.
*/
static void
-setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
+setCompoundAffixFlagValue(IspellDictBuild *ConfBuild, CompoundAffixFlag *entry,
char *s, uint32 val)
{
- if (Conf->flagMode == FM_NUM)
+ if (ConfBuild->dict->flagMode == FM_NUM)
{
char *next;
int i;
@@ -1044,19 +1280,19 @@ setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
else
entry->flag.s = tmpstrdup(s);
- entry->flagMode = Conf->flagMode;
+ entry->flagMode = ConfBuild->dict->flagMode;
entry->value = val;
}
/*
* Sets up a correspondence for the affix parameter with the affix flag.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* s: affix flag in string.
* val: affix parameter.
*/
static void
-addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
+addCompoundAffixFlagValue(IspellDictBuild *ConfBuild, char *s, uint32 val)
{
CompoundAffixFlag *newValue;
char sbuf[BUFSIZ];
@@ -1083,29 +1319,29 @@ addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
*sflag = '\0';
/* Resize array or allocate memory for array CompoundAffixFlag */
- if (Conf->nCompoundAffixFlag >= Conf->mCompoundAffixFlag)
+ if (ConfBuild->nCompoundAffixFlag >= ConfBuild->mCompoundAffixFlag)
{
- if (Conf->mCompoundAffixFlag)
+ if (ConfBuild->mCompoundAffixFlag)
{
- Conf->mCompoundAffixFlag *= 2;
- Conf->CompoundAffixFlags = (CompoundAffixFlag *)
- repalloc((void *) Conf->CompoundAffixFlags,
- Conf->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
+ ConfBuild->mCompoundAffixFlag *= 2;
+ ConfBuild->CompoundAffixFlags = (CompoundAffixFlag *)
+ repalloc((void *) ConfBuild->CompoundAffixFlags,
+ ConfBuild->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
}
else
{
- Conf->mCompoundAffixFlag = 10;
- Conf->CompoundAffixFlags = (CompoundAffixFlag *)
- tmpalloc(Conf->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
+ ConfBuild->mCompoundAffixFlag = 10;
+ ConfBuild->CompoundAffixFlags = (CompoundAffixFlag *)
+ tmpalloc(ConfBuild->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
}
}
- newValue = Conf->CompoundAffixFlags + Conf->nCompoundAffixFlag;
+ newValue = ConfBuild->CompoundAffixFlags + ConfBuild->nCompoundAffixFlag;
- setCompoundAffixFlagValue(Conf, newValue, sbuf, val);
+ setCompoundAffixFlagValue(ConfBuild, newValue, sbuf, val);
- Conf->usecompound = true;
- Conf->nCompoundAffixFlag++;
+ ConfBuild->dict->usecompound = true;
+ ConfBuild->nCompoundAffixFlag++;
}
/*
@@ -1113,7 +1349,7 @@ addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
* flags s.
*/
static int
-getCompoundAffixFlagValue(IspellDict *Conf, char *s)
+getCompoundAffixFlagValue(IspellDictBuild *ConfBuild, char *s)
{
uint32 flag = 0;
CompoundAffixFlag *found,
@@ -1121,18 +1357,18 @@ getCompoundAffixFlagValue(IspellDict *Conf, char *s)
char sflag[BUFSIZ];
char *flagcur;
- if (Conf->nCompoundAffixFlag == 0)
+ if (ConfBuild->nCompoundAffixFlag == 0)
return 0;
flagcur = s;
while (*flagcur)
{
- getNextFlagFromString(Conf, &flagcur, sflag);
- setCompoundAffixFlagValue(Conf, &key, sflag, 0);
+ getNextFlagFromString(ConfBuild->dict->flagMode, &flagcur, sflag);
+ setCompoundAffixFlagValue(ConfBuild, &key, sflag, 0);
found = (CompoundAffixFlag *)
- bsearch(&key, (void *) Conf->CompoundAffixFlags,
- Conf->nCompoundAffixFlag, sizeof(CompoundAffixFlag),
+ bsearch(&key, (void *) ConfBuild->CompoundAffixFlags,
+ ConfBuild->nCompoundAffixFlag, sizeof(CompoundAffixFlag),
cmpcmdflag);
if (found != NULL)
flag |= found->value;
@@ -1144,14 +1380,13 @@ getCompoundAffixFlagValue(IspellDict *Conf, char *s)
/*
* Returns a flag set using the s parameter.
*
- * If Conf->useFlagAliases is true then the s parameter is index of the
- * Conf->AffixData array and function returns its entry.
- * Else function returns the s parameter.
+ * If useFlagAliases is true then the s parameter is index of the AffixData
+ * array and function returns its entry. Else function returns the s parameter.
*/
static char *
-getAffixFlagSet(IspellDict *Conf, char *s)
+getAffixFlagSet(IspellDictBuild *ConfBuild, char *s)
{
- if (Conf->useFlagAliases && *s != '\0')
+ if (ConfBuild->dict->useFlagAliases && *s != '\0')
{
int curaffix;
char *end;
@@ -1162,13 +1397,13 @@ getAffixFlagSet(IspellDict *Conf, char *s)
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix alias \"%s\"", s)));
- if (curaffix > 0 && curaffix <= Conf->nAffixData)
+ if (curaffix > 0 && curaffix <= ConfBuild->nAffixData)
/*
* Do not subtract 1 from curaffix because empty string was added
* in NIImportOOAffixes
*/
- return Conf->AffixData[curaffix];
+ return AffixDataGet(ConfBuild, curaffix);
else
return VoidString;
}
@@ -1179,11 +1414,11 @@ getAffixFlagSet(IspellDict *Conf, char *s)
/*
* Import an affix file that follows MySpell or Hunspell format.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* filename: path to the .affix file.
*/
static void
-NIImportOOAffixes(IspellDict *Conf, const char *filename)
+NIImportOOAffixes(IspellDictBuild *ConfBuild, const char *filename)
{
char type[BUFSIZ],
*ptype = NULL;
@@ -1203,9 +1438,9 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
char *recoded;
/* read file to find any flag */
- Conf->usecompound = false;
- Conf->useFlagAliases = false;
- Conf->flagMode = FM_CHAR;
+ ConfBuild->dict->usecompound = false;
+ ConfBuild->dict->useFlagAliases = false;
+ ConfBuild->dict->flagMode = FM_CHAR;
if (!tsearch_readline_begin(&trst, filename))
ereport(ERROR,
@@ -1222,30 +1457,36 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
}
if (STRNCMP(recoded, "COMPOUNDFLAG") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDFLAG"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDFLAG"),
FF_COMPOUNDFLAG);
else if (STRNCMP(recoded, "COMPOUNDBEGIN") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDBEGIN"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDBEGIN"),
FF_COMPOUNDBEGIN);
else if (STRNCMP(recoded, "COMPOUNDLAST") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDLAST"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDLAST"),
FF_COMPOUNDLAST);
/* COMPOUNDLAST and COMPOUNDEND are synonyms */
else if (STRNCMP(recoded, "COMPOUNDEND") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDEND"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDEND"),
FF_COMPOUNDLAST);
else if (STRNCMP(recoded, "COMPOUNDMIDDLE") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDMIDDLE"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDMIDDLE"),
FF_COMPOUNDMIDDLE);
else if (STRNCMP(recoded, "ONLYINCOMPOUND") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("ONLYINCOMPOUND"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("ONLYINCOMPOUND"),
FF_COMPOUNDONLY);
else if (STRNCMP(recoded, "COMPOUNDPERMITFLAG") == 0)
- addCompoundAffixFlagValue(Conf,
+ addCompoundAffixFlagValue(ConfBuild,
recoded + strlen("COMPOUNDPERMITFLAG"),
FF_COMPOUNDPERMITFLAG);
else if (STRNCMP(recoded, "COMPOUNDFORBIDFLAG") == 0)
- addCompoundAffixFlagValue(Conf,
+ addCompoundAffixFlagValue(ConfBuild,
recoded + strlen("COMPOUNDFORBIDFLAG"),
FF_COMPOUNDFORBIDFLAG);
else if (STRNCMP(recoded, "FLAG") == 0)
@@ -1258,9 +1499,9 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
if (*s)
{
if (STRNCMP(s, "long") == 0)
- Conf->flagMode = FM_LONG;
+ ConfBuild->dict->flagMode = FM_LONG;
else if (STRNCMP(s, "num") == 0)
- Conf->flagMode = FM_NUM;
+ ConfBuild->dict->flagMode = FM_NUM;
else if (STRNCMP(s, "default") != 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
@@ -1274,8 +1515,8 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
}
tsearch_readline_end(&trst);
- if (Conf->nCompoundAffixFlag > 1)
- qsort((void *) Conf->CompoundAffixFlags, Conf->nCompoundAffixFlag,
+ if (ConfBuild->nCompoundAffixFlag > 1)
+ qsort((void *) ConfBuild->CompoundAffixFlags, ConfBuild->nCompoundAffixFlag,
sizeof(CompoundAffixFlag), cmpcmdflag);
if (!tsearch_readline_begin(&trst, filename))
@@ -1295,15 +1536,15 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
if (ptype)
pfree(ptype);
- ptype = lowerstr_ctx(Conf, type);
+ ptype = lowerstr_ctx(ConfBuild, type);
/* First try to parse AF parameter (alias compression) */
if (STRNCMP(ptype, "af") == 0)
{
/* First line is the number of aliases */
- if (!Conf->useFlagAliases)
+ if (!ConfBuild->dict->useFlagAliases)
{
- Conf->useFlagAliases = true;
+ ConfBuild->dict->useFlagAliases = true;
naffix = atoi(sflag);
if (naffix <= 0)
ereport(ERROR,
@@ -1313,11 +1554,10 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
/* Also reserve place for empty flag set */
naffix++;
- Conf->AffixData = (char **) palloc0(naffix * sizeof(char *));
- Conf->lenAffixData = Conf->nAffixData = naffix;
+ InitAffixData(ConfBuild, naffix);
/* Add empty flag set into AffixData */
- Conf->AffixData[curaffix] = VoidString;
+ AddAffixSet(ConfBuild, VoidString, 0);
curaffix++;
}
/* Other lines are aliases */
@@ -1325,7 +1565,7 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
{
if (curaffix < naffix)
{
- Conf->AffixData[curaffix] = cpstrdup(Conf, sflag);
+ AddAffixSet(ConfBuild, sflag, strlen(sflag));
curaffix++;
}
else
@@ -1343,8 +1583,8 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
sflaglen = strlen(sflag);
if (sflaglen == 0
- || (sflaglen > 1 && Conf->flagMode == FM_CHAR)
- || (sflaglen > 2 && Conf->flagMode == FM_LONG))
+ || (sflaglen > 1 && ConfBuild->dict->flagMode == FM_CHAR)
+ || (sflaglen > 2 && ConfBuild->dict->flagMode == FM_LONG))
goto nextline;
/*--------
@@ -1372,21 +1612,21 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
/* Get flags after '/' (flags are case sensitive) */
if ((ptr = strchr(repl, '/')) != NULL)
- aflg |= getCompoundAffixFlagValue(Conf,
- getAffixFlagSet(Conf,
+ aflg |= getCompoundAffixFlagValue(ConfBuild,
+ getAffixFlagSet(ConfBuild,
ptr + 1));
/* Get lowercased version of string before '/' */
- prepl = lowerstr_ctx(Conf, repl);
+ prepl = lowerstr_ctx(ConfBuild, repl);
if ((ptr = strchr(prepl, '/')) != NULL)
*ptr = '\0';
- pfind = lowerstr_ctx(Conf, find);
- pmask = lowerstr_ctx(Conf, mask);
+ pfind = lowerstr_ctx(ConfBuild, find);
+ pmask = lowerstr_ctx(ConfBuild, mask);
if (t_iseq(find, '0'))
*pfind = '\0';
if (t_iseq(repl, '0'))
*prepl = '\0';
- NIAddAffix(Conf, sflag, flagflags | aflg, pmask, pfind, prepl,
+ NIAddAffix(ConfBuild, sflag, flagflags | aflg, pmask, pfind, prepl,
isSuffix ? FF_SUFFIX : FF_PREFIX);
pfree(prepl);
pfree(pfind);
@@ -1412,7 +1652,7 @@ nextline:
* work to NIImportOOAffixes(), which will re-read the whole file.
*/
void
-NIImportAffixes(IspellDict *Conf, const char *filename)
+NIImportAffixes(IspellDictBuild *ConfBuild, const char *filename)
{
char *pstr = NULL;
char flag[BUFSIZ];
@@ -1433,9 +1673,9 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
errmsg("could not open affix file \"%s\": %m",
filename)));
- Conf->usecompound = false;
- Conf->useFlagAliases = false;
- Conf->flagMode = FM_CHAR;
+ ConfBuild->dict->usecompound = false;
+ ConfBuild->dict->useFlagAliases = false;
+ ConfBuild->dict->flagMode = FM_CHAR;
while ((recoded = tsearch_readline(&trst)) != NULL)
{
@@ -1457,10 +1697,8 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
s += pg_mblen(s);
if (*s && pg_mblen(s) == 1)
- {
- addCompoundAffixFlagValue(Conf, s, FF_COMPOUNDFLAG);
- Conf->usecompound = true;
- }
+ addCompoundAffixFlagValue(ConfBuild, s, FF_COMPOUNDFLAG);
+
oldformat = true;
goto nextline;
}
@@ -1533,7 +1771,8 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
if (!parse_affentry(pstr, mask, find, repl))
goto nextline;
- NIAddAffix(Conf, flag, flagflags, mask, find, repl, suffixes ? FF_SUFFIX : FF_PREFIX);
+ NIAddAffix(ConfBuild, flag, flagflags, mask, find, repl,
+ suffixes ? FF_SUFFIX : FF_PREFIX);
nextline:
pfree(recoded);
@@ -1552,53 +1791,48 @@ isnewformat:
errmsg("affix file contains both old-style and new-style commands")));
tsearch_readline_end(&trst);
- NIImportOOAffixes(Conf, filename);
+ NIImportOOAffixes(ConfBuild, filename);
}
/*
* Merges two affix flag sets and stores a new affix flag set into
- * Conf->AffixData.
+ * ConfBuild->AffixData.
*
* Returns index of a new affix flag set.
*/
static int
-MergeAffix(IspellDict *Conf, int a1, int a2)
+MergeAffix(IspellDictBuild *ConfBuild, int a1, int a2)
{
- char **ptr;
+ char *ptr;
+ uint32 len;
/* Do not merge affix flags if one of affix flags is empty */
- if (*Conf->AffixData[a1] == '\0')
+ if (*AffixDataGet(ConfBuild, a1) == '\0')
return a2;
- else if (*Conf->AffixData[a2] == '\0')
+ else if (*AffixDataGet(ConfBuild, a2) == '\0')
return a1;
- while (Conf->nAffixData + 1 >= Conf->lenAffixData)
- {
- Conf->lenAffixData *= 2;
- Conf->AffixData = (char **) repalloc(Conf->AffixData,
- sizeof(char *) * Conf->lenAffixData);
- }
-
- ptr = Conf->AffixData + Conf->nAffixData;
- if (Conf->flagMode == FM_NUM)
+ if (ConfBuild->dict->flagMode == FM_NUM)
{
- *ptr = cpalloc(strlen(Conf->AffixData[a1]) +
- strlen(Conf->AffixData[a2]) +
- 1 /* comma */ + 1 /* \0 */ );
- sprintf(*ptr, "%s,%s", Conf->AffixData[a1], Conf->AffixData[a2]);
+ len = strlen(AffixDataGet(ConfBuild, a1)) + 1 /* comma */ +
+ strlen(AffixDataGet(ConfBuild, a2));
+ ptr = tmpalloc(len + 1 /* \0 */);
+ sprintf(ptr, "%s,%s", AffixDataGet(ConfBuild, a1),
+ AffixDataGet(ConfBuild, a2));
}
else
{
- *ptr = cpalloc(strlen(Conf->AffixData[a1]) +
- strlen(Conf->AffixData[a2]) +
- 1 /* \0 */ );
- sprintf(*ptr, "%s%s", Conf->AffixData[a1], Conf->AffixData[a2]);
+ len = strlen(AffixDataGet(ConfBuild, a1)) +
+ strlen(AffixDataGet(ConfBuild, a2));
+ ptr = tmpalloc(len + 1 /* \0 */ );
+ sprintf(ptr, "%s%s", AffixDataGet(ConfBuild, a1),
+ AffixDataGet(ConfBuild, a2));
}
- ptr++;
- *ptr = NULL;
- Conf->nAffixData++;
- return Conf->nAffixData - 1;
+ AddAffixSet(ConfBuild, ptr, len);
+ pfree(ptr);
+
+ return ConfBuild->nAffixData - 1;
}
/*
@@ -1606,66 +1840,87 @@ MergeAffix(IspellDict *Conf, int a1, int a2)
* flags with the given index.
*/
static uint32
-makeCompoundFlags(IspellDict *Conf, int affix)
+makeCompoundFlags(IspellDictBuild *ConfBuild, int affix)
{
- char *str = Conf->AffixData[affix];
+ char *str = AffixDataGet(ConfBuild, affix);
- return (getCompoundAffixFlagValue(Conf, str) & FF_COMPOUNDFLAGMASK);
+ return (getCompoundAffixFlagValue(ConfBuild, str) & FF_COMPOUNDFLAGMASK);
}
/*
* Makes a prefix tree for the given level.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* low: lower index of the Conf->Spell array.
* high: upper index of the Conf->Spell array.
* level: current prefix tree level.
+ *
+ * Returns an offset of SPNode in DictNodes.
*/
-static SPNode *
-mkSPNode(IspellDict *Conf, int low, int high, int level)
+static uint32
+mkSPNode(IspellDictBuild *ConfBuild, int low, int high, int level)
{
int i;
int nchar = 0;
char lastchar = '\0';
+ uint32 rs_offset,
+ new_offset;
SPNode *rs;
SPNodeData *data;
+ int data_index = 0;
int lownew = low;
for (i = low; i < high; i++)
- if (Conf->Spell[i]->p.d.len > level && lastchar != Conf->Spell[i]->word[level])
+ if (ConfBuild->Spell[i]->p.d.len > level &&
+ lastchar != ConfBuild->Spell[i]->word[level])
{
nchar++;
- lastchar = Conf->Spell[i]->word[level];
+ lastchar = ConfBuild->Spell[i]->word[level];
}
if (!nchar)
- return NULL;
+ return ISPELL_INVALID_OFFSET;
- rs = (SPNode *) cpalloc0(SPNHDRSZ + nchar * sizeof(SPNodeData));
- rs->length = nchar;
+ rs_offset = AllocateSPNode(ConfBuild, nchar);
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
data = rs->data;
lastchar = '\0';
for (i = low; i < high; i++)
- if (Conf->Spell[i]->p.d.len > level)
+ if (ConfBuild->Spell[i]->p.d.len > level)
{
- if (lastchar != Conf->Spell[i]->word[level])
+ if (lastchar != ConfBuild->Spell[i]->word[level])
{
if (lastchar)
{
/* Next level of the prefix tree */
- data->node = mkSPNode(Conf, lownew, i, level + 1);
+ new_offset = mkSPNode(ConfBuild, lownew, i, level + 1);
+
+ /*
+ * ConfBuild->DictNodes can be repalloc'ed within
+ * mkSPNode(), so reinitialize pointers.
+ */
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
+
+ /* First save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ /* Work with next node */
+ data_index++;
+ Assert(data_index < nchar);
+ data = &(rs->data[data_index]);
+
lownew = i;
- data++;
}
- lastchar = Conf->Spell[i]->word[level];
+ lastchar = ConfBuild->Spell[i]->word[level];
}
- data->val = ((uint8 *) (Conf->Spell[i]->word))[level];
- if (Conf->Spell[i]->p.d.len == level + 1)
+ data->val = ((uint8 *) (ConfBuild->Spell[i]->word))[level];
+ if (ConfBuild->Spell[i]->p.d.len == level + 1)
{
bool clearCompoundOnly = false;
- if (data->isword && data->affix != Conf->Spell[i]->p.d.affix)
+ if (data->isword && data->affix != ConfBuild->Spell[i]->p.d.affix)
{
/*
* MergeAffix called a few times. If one of word is
@@ -1674,15 +1929,17 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
*/
clearCompoundOnly = (FF_COMPOUNDONLY & data->compoundflag
- & makeCompoundFlags(Conf, Conf->Spell[i]->p.d.affix))
+ & makeCompoundFlags(ConfBuild,
+ ConfBuild->Spell[i]->p.d.affix))
? false : true;
- data->affix = MergeAffix(Conf, data->affix, Conf->Spell[i]->p.d.affix);
+ data->affix = MergeAffix(ConfBuild, data->affix,
+ ConfBuild->Spell[i]->p.d.affix);
}
else
- data->affix = Conf->Spell[i]->p.d.affix;
+ data->affix = ConfBuild->Spell[i]->p.d.affix;
data->isword = 1;
- data->compoundflag = makeCompoundFlags(Conf, data->affix);
+ data->compoundflag = makeCompoundFlags(ConfBuild, data->affix);
if ((data->compoundflag & FF_COMPOUNDONLY) &&
(data->compoundflag & FF_COMPOUNDFLAG) == 0)
@@ -1694,9 +1951,19 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
}
/* Next level of the prefix tree */
- data->node = mkSPNode(Conf, lownew, high, level + 1);
+ new_offset = mkSPNode(ConfBuild, lownew, high, level + 1);
- return rs;
+ /*
+ * ConfBuild->DictNodes can be repalloc'ed within mkSPNode(), so
+ * reinitialize pointers.
+ */
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
+
+ /* Save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ return rs_offset;
}
/*
@@ -1704,90 +1971,98 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
* and affixes.
*/
void
-NISortDictionary(IspellDict *Conf)
+NISortDictionary(IspellDictBuild *ConfBuild)
{
int i;
int naffix = 0;
int curaffix;
+ uint32 node_offset;
/* compress affixes */
/*
- * If we use flag aliases then we need to use Conf->AffixData filled in
+ * If we use flag aliases then we need to use ConfBuild->AffixData filled in
* the NIImportOOAffixes().
*/
- if (Conf->useFlagAliases)
+ if (ConfBuild->dict->useFlagAliases)
{
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
char *end;
- if (*Conf->Spell[i]->p.flag != '\0')
+ if (*ConfBuild->Spell[i]->p.flag != '\0')
{
- curaffix = strtol(Conf->Spell[i]->p.flag, &end, 10);
- if (Conf->Spell[i]->p.flag == end || errno == ERANGE)
+ curaffix = strtol(ConfBuild->Spell[i]->p.flag, &end, 10);
+ if (ConfBuild->Spell[i]->p.flag == end || errno == ERANGE)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix alias \"%s\"",
- Conf->Spell[i]->p.flag)));
+ ConfBuild->Spell[i]->p.flag)));
}
else
{
/*
- * If Conf->Spell[i]->p.flag is empty, then get empty value of
- * Conf->AffixData (0 index).
+ * If ConfBuild->Spell[i]->p.flag is empty, then get empty
+ * value of ConfBuild->AffixData (0 index).
*/
curaffix = 0;
}
- Conf->Spell[i]->p.d.affix = curaffix;
- Conf->Spell[i]->p.d.len = strlen(Conf->Spell[i]->word);
+ ConfBuild->Spell[i]->p.d.affix = curaffix;
+ ConfBuild->Spell[i]->p.d.len = strlen(ConfBuild->Spell[i]->word);
}
}
- /* Otherwise fill Conf->AffixData here */
+ /* Otherwise fill ConfBuild->AffixData here */
else
{
/* Count the number of different flags used in the dictionary */
- qsort((void *) Conf->Spell, Conf->nspell, sizeof(SPELL *),
+ qsort((void *) ConfBuild->Spell, ConfBuild->nSpell, sizeof(SPELL *),
cmpspellaffix);
naffix = 0;
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
if (i == 0
- || strcmp(Conf->Spell[i]->p.flag, Conf->Spell[i - 1]->p.flag))
+ || strcmp(ConfBuild->Spell[i]->p.flag,
+ ConfBuild->Spell[i - 1]->p.flag))
naffix++;
}
/*
- * Fill in Conf->AffixData with the affixes that were used in the
- * dictionary. Replace textual flag-field of Conf->Spell entries with
- * indexes into Conf->AffixData array.
+ * Fill in AffixData with the affixes that were used in the
+ * dictionary. Replace textual flag-field of ConfBuild->Spell entries
+ * with indexes into ConfBuild->AffixData array.
*/
- Conf->AffixData = (char **) palloc0(naffix * sizeof(char *));
+ InitAffixData(ConfBuild, naffix);
curaffix = -1;
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
if (i == 0
- || strcmp(Conf->Spell[i]->p.flag, Conf->AffixData[curaffix]))
+ || strcmp(ConfBuild->Spell[i]->p.flag,
+ AffixDataGet(ConfBuild, curaffix)))
{
curaffix++;
Assert(curaffix < naffix);
- Conf->AffixData[curaffix] = cpstrdup(Conf,
- Conf->Spell[i]->p.flag);
+ AddAffixSet(ConfBuild, ConfBuild->Spell[i]->p.flag,
+ strlen(ConfBuild->Spell[i]->p.flag));
}
- Conf->Spell[i]->p.d.affix = curaffix;
- Conf->Spell[i]->p.d.len = strlen(Conf->Spell[i]->word);
+ ConfBuild->Spell[i]->p.d.affix = curaffix;
+ ConfBuild->Spell[i]->p.d.len = strlen(ConfBuild->Spell[i]->word);
}
-
- Conf->lenAffixData = Conf->nAffixData = naffix;
}
/* Start build a prefix tree */
- qsort((void *) Conf->Spell, Conf->nspell, sizeof(SPELL *), cmpspell);
- Conf->Dictionary = mkSPNode(Conf, 0, Conf->nspell, 0);
+ qsort((void *) ConfBuild->Spell, ConfBuild->nSpell, sizeof(SPELL *), cmpspell);
+ node_offset = mkSPNode(ConfBuild, 0, ConfBuild->nSpell, 0);
+
+ /* Create a void root node only if DictNodes is empty */
+ if (node_offset == ISPELL_INVALID_OFFSET)
+ {
+ /* AllocateSPNode() initializes root node data */
+ AllocateSPNode(ConfBuild, 1);
+ }
}
/*
@@ -1795,83 +2070,104 @@ NISortDictionary(IspellDict *Conf)
* rule. Affixes with empty replace string do not include in the prefix tree.
* This affixes are included by mkVoidAffix().
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* low: lower index of the Conf->Affix array.
* high: upper index of the Conf->Affix array.
* level: current prefix tree level.
* type: FF_SUFFIX or FF_PREFIX.
+ *
+ * Returns an offset in nodes array.
*/
-static AffixNode *
-mkANode(IspellDict *Conf, int low, int high, int level, int type)
+static uint32
+mkANode(IspellDictBuild *ConfBuild, int low, int high, int level, int type)
{
int i;
int nchar = 0;
uint8 lastchar = '\0';
+ NodeArray *array;
+ uint32 rs_offset,
+ new_offset;
AffixNode *rs;
AffixNodeData *data;
+ int data_index = 0;
int lownew = low;
- int naff;
- AFFIX **aff;
for (i = low; i < high; i++)
- if (Conf->Affix[i].replen > level && lastchar != GETCHAR(Conf->Affix + i, level, type))
+ if (ConfBuild->Affix[i]->replen > level &&
+ lastchar != GETCHAR(ConfBuild->Affix[i], level, type))
{
nchar++;
- lastchar = GETCHAR(Conf->Affix + i, level, type);
+ lastchar = GETCHAR(ConfBuild->Affix[i], level, type);
}
if (!nchar)
- return NULL;
+ return ISPELL_INVALID_OFFSET;
- aff = (AFFIX **) tmpalloc(sizeof(AFFIX *) * (high - low + 1));
- naff = 0;
+ if (type == FF_SUFFIX)
+ array = &ConfBuild->SuffixNodes;
+ else
+ array = &ConfBuild->PrefixNodes;
- rs = (AffixNode *) cpalloc0(ANHRDSZ + nchar * sizeof(AffixNodeData));
- rs->length = nchar;
- data = rs->data;
+ rs_offset = AllocateAffixNode(ConfBuild, array, nchar);
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
+ data = (AffixNodeData *) rs->data;
lastchar = '\0';
for (i = low; i < high; i++)
- if (Conf->Affix[i].replen > level)
+ if (ConfBuild->Affix[i]->replen > level)
{
- if (lastchar != GETCHAR(Conf->Affix + i, level, type))
+ if (lastchar != GETCHAR(ConfBuild->Affix[i], level, type))
{
if (lastchar)
{
/* Next level of the prefix tree */
- data->node = mkANode(Conf, lownew, i, level + 1, type);
- if (naff)
- {
- data->naff = naff;
- data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * naff);
- memcpy(data->aff, aff, sizeof(AFFIX *) * naff);
- naff = 0;
- }
- data++;
+ new_offset = mkANode(ConfBuild, lownew, i, level + 1, type);
+
+ /*
+ * array can be repalloc'ed within mkANode(), so
+ * reinitialize pointers.
+ */
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
+
+ /* First save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ /* Handle next data node */
+ data_index++;
+ Assert(data_index < nchar);
+ data = &(rs->data[data_index]);
+
lownew = i;
}
- lastchar = GETCHAR(Conf->Affix + i, level, type);
+ lastchar = GETCHAR(ConfBuild->Affix[i], level, type);
}
- data->val = GETCHAR(Conf->Affix + i, level, type);
- if (Conf->Affix[i].replen == level + 1)
+ data->val = GETCHAR(ConfBuild->Affix[i], level, type);
+ if (ConfBuild->Affix[i]->replen == level + 1)
{ /* affix stopped */
- aff[naff++] = Conf->Affix + i;
+ if (data->affstart == ISPELL_INVALID_INDEX)
+ {
+ data->affstart = i;
+ data->affend = i;
+ }
+ else
+ data->affend = i;
}
}
/* Next level of the prefix tree */
- data->node = mkANode(Conf, lownew, high, level + 1, type);
- if (naff)
- {
- data->naff = naff;
- data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * naff);
- memcpy(data->aff, aff, sizeof(AFFIX *) * naff);
- naff = 0;
- }
+ new_offset = mkANode(ConfBuild, lownew, high, level + 1, type);
+
+ /*
+ * array can be repalloc'ed within mkANode(), so reinitialize pointers.
+ */
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
- pfree(aff);
+ /* Save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
- return rs;
+ return rs_offset;
}
/*
@@ -1879,139 +2175,151 @@ mkANode(IspellDict *Conf, int low, int high, int level, int type)
* for affixes which have empty replace string ("repl" field).
*/
static void
-mkVoidAffix(IspellDict *Conf, bool issuffix, int startsuffix)
+mkVoidAffix(IspellDictBuild *ConfBuild, bool issuffix, int startsuffix)
{
- int i,
- cnt = 0;
+ int i;
int start = (issuffix) ? startsuffix : 0;
- int end = (issuffix) ? Conf->naffixes : startsuffix;
- AffixNode *Affix = (AffixNode *) palloc0(ANHRDSZ + sizeof(AffixNodeData));
-
- Affix->length = 1;
- Affix->isvoid = 1;
+ int end = (issuffix) ? ConfBuild->nAffix : startsuffix;
+ uint32 node_offset;
+ NodeArray *array;
+ AffixNode *Affix;
+ AffixNodeData *AffixData;
if (issuffix)
- {
- Affix->data->node = Conf->Suffix;
- Conf->Suffix = Affix;
- }
+ array = &ConfBuild->SuffixNodes;
else
- {
- Affix->data->node = Conf->Prefix;
- Conf->Prefix = Affix;
- }
+ array = &ConfBuild->PrefixNodes;
- /* Count affixes with empty replace string */
- for (i = start; i < end; i++)
- if (Conf->Affix[i].replen == 0)
- cnt++;
-
- /* There is not affixes with empty replace string */
- if (cnt == 0)
- return;
+ node_offset = AllocateAffixNode(ConfBuild, array, 1);
+ Affix = (AffixNode *) NodeArrayGet(array, node_offset);
- Affix->data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * cnt);
- Affix->data->naff = (uint32) cnt;
+ Affix->isvoid = 1;
+ AffixData = (AffixNodeData *) Affix->data;
- cnt = 0;
for (i = start; i < end; i++)
- if (Conf->Affix[i].replen == 0)
+ if (ConfBuild->Affix[i]->replen == 0)
{
- Affix->data->aff[cnt] = Conf->Affix + i;
- cnt++;
+ if (AffixData->affstart == ISPELL_INVALID_INDEX)
+ {
+ AffixData->affstart = i;
+ AffixData->affend = i;
+ }
+ else
+ AffixData->affend = i;
}
}
/*
- * Checks if the affixflag is used by dictionary. Conf->AffixData does not
+ * Checks if the affixflag is used by dictionary. AffixData does not
* contain affixflag if this flag is not used actually by the .dict file.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* affixflag: affix flag.
*
- * Returns true if the Conf->AffixData array contains affixflag, otherwise
+ * Returns true if the ConfBuild->AffixData array contains affixflag, otherwise
* returns false.
*/
static bool
-isAffixInUse(IspellDict *Conf, char *affixflag)
+isAffixInUse(IspellDictBuild *ConfBuild, char *affixflag)
{
int i;
- for (i = 0; i < Conf->nAffixData; i++)
- if (IsAffixFlagInUse(Conf, i, affixflag))
+ for (i = 0; i < ConfBuild->nAffixData; i++)
+ if (IsAffixFlagInUse(ConfBuild->dict->flagMode,
+ AffixDataGet(ConfBuild, i), affixflag))
return true;
return false;
}
/*
- * Builds Conf->Prefix and Conf->Suffix trees from the imported affixes.
+ * Builds Prefix and Suffix trees from the imported affixes.
*/
void
-NISortAffixes(IspellDict *Conf)
+NISortAffixes(IspellDictBuild *ConfBuild)
{
AFFIX *Affix;
+ AffixNode *voidPrefix,
+ *voidSuffix;
size_t i;
CMPDAffix *ptr;
- int firstsuffix = Conf->naffixes;
-
- if (Conf->naffixes == 0)
- return;
+ int firstsuffix = ConfBuild->nAffix;
+ uint32 prefix_offset,
+ suffix_offset;
/* Store compound affixes in the Conf->CompoundAffix array */
- if (Conf->naffixes > 1)
- qsort((void *) Conf->Affix, Conf->naffixes, sizeof(AFFIX), cmpaffix);
- Conf->CompoundAffix = ptr = (CMPDAffix *) palloc(sizeof(CMPDAffix) * Conf->naffixes);
- ptr->affix = NULL;
-
- for (i = 0; i < Conf->naffixes; i++)
+ if (ConfBuild->nAffix > 1)
+ qsort((void *) ConfBuild->Affix, ConfBuild->nAffix,
+ sizeof(AFFIX *), cmpaffix);
+ ConfBuild->nCompoundAffix = ConfBuild->nAffix + 1 /* terminating entry */;
+ ConfBuild->CompoundAffix = ptr =
+ (CMPDAffix *) tmpalloc(sizeof(CMPDAffix) * ConfBuild->nCompoundAffix);
+ ptr->affix = ISPELL_INVALID_INDEX;
+
+ for (i = 0; i < ConfBuild->nAffix; i++)
{
- Affix = &(((AFFIX *) Conf->Affix)[i]);
+ Affix = ConfBuild->Affix[i];
if (Affix->type == FF_SUFFIX && i < firstsuffix)
firstsuffix = i;
if ((Affix->flagflags & FF_COMPOUNDFLAG) && Affix->replen > 0 &&
- isAffixInUse(Conf, Affix->flag))
+ isAffixInUse(ConfBuild, AffixFieldFlag(Affix)))
{
bool issuffix = (Affix->type == FF_SUFFIX);
- if (ptr == Conf->CompoundAffix ||
+ if (ptr == ConfBuild->CompoundAffix ||
issuffix != (ptr - 1)->issuffix ||
- strbncmp((const unsigned char *) (ptr - 1)->affix,
- (const unsigned char *) Affix->repl,
+ strbncmp((const unsigned char *) AffixFieldRepl(ConfBuild->Affix[(ptr - 1)->affix]),
+ (const unsigned char *) AffixFieldRepl(Affix),
(ptr - 1)->len))
{
/* leave only unique and minimals suffixes */
- ptr->affix = Affix->repl;
+ ptr->affix = i;
ptr->len = Affix->replen;
ptr->issuffix = issuffix;
ptr++;
}
}
}
- ptr->affix = NULL;
- Conf->CompoundAffix = (CMPDAffix *) repalloc(Conf->CompoundAffix, sizeof(CMPDAffix) * (ptr - Conf->CompoundAffix + 1));
+ ptr->affix = ISPELL_INVALID_INDEX;
+ ConfBuild->nCompoundAffix = ptr - ConfBuild->CompoundAffix + 1;
/* Start build a prefix tree */
- Conf->Prefix = mkANode(Conf, 0, firstsuffix, 0, FF_PREFIX);
- Conf->Suffix = mkANode(Conf, firstsuffix, Conf->naffixes, 0, FF_SUFFIX);
- mkVoidAffix(Conf, true, firstsuffix);
- mkVoidAffix(Conf, false, firstsuffix);
+ mkVoidAffix(ConfBuild, true, firstsuffix);
+ mkVoidAffix(ConfBuild, false, firstsuffix);
+
+ prefix_offset = mkANode(ConfBuild, 0, firstsuffix, 0, FF_PREFIX);
+ suffix_offset = mkANode(ConfBuild, firstsuffix, ConfBuild->nAffix, 0,
+ FF_SUFFIX);
+
+ /* Adjust offsets of new nodes for nodes of void affixes */
+ voidPrefix = (AffixNode *) NodeArrayGet(&ConfBuild->PrefixNodes, 0);
+ voidPrefix->data[0].node_offset = prefix_offset;
+
+ voidSuffix = (AffixNode *) NodeArrayGet(&ConfBuild->SuffixNodes, 0);
+ voidSuffix->data[0].node_offset = suffix_offset;
}
static AffixNodeData *
-FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
+FindAffixes(IspellDictData *dict, AffixNode *node, const char *word, int wrdlen,
+ int *level, int type)
{
+ AffixNode *nodes;
AffixNodeData *StopLow,
*StopHigh,
*StopMiddle;
uint8 symbol;
+ if (type == FF_PREFIX)
+ nodes = (AffixNode *) DictPrefixNodes(dict);
+ else
+ nodes = (AffixNode *) DictSuffixNodes(dict);
+
if (node->isvoid)
{ /* search void affixes */
- if (node->data->naff)
+ if (node->data->affstart != ISPELL_INVALID_INDEX)
return node->data;
- node = node->data->node;
+ node = (AffixNode *) DictNodeGet(nodes, node->data->node_offset);
}
while (node && *level < wrdlen)
@@ -2026,9 +2334,10 @@ FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
if (StopMiddle->val == symbol)
{
(*level)++;
- if (StopMiddle->naff)
+ if (StopMiddle->affstart != ISPELL_INVALID_INDEX)
return StopMiddle;
- node = StopMiddle->node;
+ node = (AffixNode *) DictNodeGet(nodes,
+ StopMiddle->node_offset);
break;
}
else if (StopMiddle->val < symbol)
@@ -2042,8 +2351,67 @@ FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
return NULL;
}
+/*
+ * Compile regular expression on first use and store it within reg.
+ */
+static void
+CompileAffixReg(AffixReg *reg, bool isregis, int type,
+ const char *mask, int masklen, MemoryContext dictCtx)
+{
+ MemoryContext oldcontext;
+
+ Assert(dictCtx);
+
+ /*
+ * Switch to memory context of the dictionary, so compiled expression can be
+ * used in other queries.
+ */
+ oldcontext = MemoryContextSwitchTo(dictCtx);
+
+ if (isregis)
+ RS_compile(&reg->r.regis, (type == FF_SUFFIX), mask);
+ else
+ {
+ int wmasklen;
+ int err;
+ pg_wchar *wmask;
+ char *tmask;
+
+ tmask = (char *) palloc(masklen + 3);
+ if (type == FF_SUFFIX)
+ sprintf(tmask, "%s$", mask);
+ else
+ sprintf(tmask, "^%s", mask);
+
+ masklen = strlen(tmask);
+ wmask = (pg_wchar *) palloc((masklen + 1) * sizeof(pg_wchar));
+ wmasklen = pg_mb2wchar_with_len(tmask, wmask, masklen);
+
+ err = pg_regcomp(&(reg->r.regex), wmask, wmasklen,
+ REG_ADVANCED | REG_NOSUB,
+ DEFAULT_COLLATION_OID);
+ if (err)
+ {
+ char errstr[100];
+
+ pg_regerror(err, &(reg->r.regex), errstr, sizeof(errstr));
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
+ errmsg("invalid regular expression: %s", errstr)));
+ }
+
+ pfree(wmask);
+ pfree(tmask);
+ }
+
+ reg->iscompiled = true;
+
+ MemoryContextSwitchTo(oldcontext);
+}
+
static char *
-CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *newword, int *baselen)
+CheckAffix(const char *word, size_t len, AFFIX *Affix, AffixReg *reg,
+ int flagflags, char *newword, int *baselen, MemoryContext dictCtx)
{
/*
* Check compound allow flags
@@ -2083,7 +2451,7 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
if (Affix->type == FF_SUFFIX)
{
strcpy(newword, word);
- strcpy(newword + len - Affix->replen, Affix->find);
+ strcpy(newword + len - Affix->replen, AffixFieldFind(Affix));
if (baselen) /* store length of non-changed part of word */
*baselen = len - Affix->replen;
}
@@ -2093,9 +2461,9 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
* if prefix is an all non-changed part's length then all word
* contains only prefix and suffix, so out
*/
- if (baselen && *baselen + strlen(Affix->find) <= Affix->replen)
+ if (baselen && *baselen + Affix->findlen <= Affix->replen)
return NULL;
- strcpy(newword, Affix->find);
+ strcpy(newword, AffixFieldFind(Affix));
strcat(newword, word + Affix->replen);
}
@@ -2106,7 +2474,12 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
return newword;
else if (Affix->isregis)
{
- if (RS_execute(&(Affix->reg.regis), newword))
+ /* Compile the regular expression on first demand */
+ if (!reg->iscompiled)
+ CompileAffixReg(reg, Affix->isregis, Affix->type,
+ AffixFieldMask(Affix), Affix->masklen, dictCtx);
+
+ if (RS_execute(&(reg->r.regis), newword))
return newword;
}
else
@@ -2116,12 +2489,17 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
size_t data_len;
int newword_len;
+ /* Compile the regular expression on first demand */
+ if (!reg->iscompiled)
+ CompileAffixReg(reg, Affix->isregis, Affix->type,
+ AffixFieldMask(Affix), Affix->masklen, dictCtx);
+
/* Convert data string to wide characters */
newword_len = strlen(newword);
data = (pg_wchar *) palloc((newword_len + 1) * sizeof(pg_wchar));
data_len = pg_mb2wchar_with_len(newword, data, newword_len);
- if (!(err = pg_regexec(&(Affix->reg.regex), data, data_len, 0, NULL, 0, NULL, 0)))
+ if (!(err = pg_regexec(&(reg->r.regex), data, data_len, 0, NULL, 0, NULL, 0)))
{
pfree(data);
return newword;
@@ -2160,7 +2538,7 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
char **cur;
char newword[2 * MAXNORMLEN] = "";
char pnewword[2 * MAXNORMLEN] = "";
- AffixNode *snode = Conf->Suffix,
+ AffixNode *snode = (AffixNode *) DictSuffixNodes(Conf->dict),
*pnode;
int i,
j;
@@ -2172,7 +2550,7 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
/* Check that the word itself is normal form */
- if (FindWord(Conf, word, VoidString, flag))
+ if (FindWord(Conf->dict, word, VoidString, flag))
{
*cur = pstrdup(word);
cur++;
@@ -2180,23 +2558,29 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
}
/* Find all other NORMAL forms of the 'word' (check only prefix) */
- pnode = Conf->Prefix;
+ pnode = (AffixNode *) DictPrefixNodes(Conf->dict);
plevel = 0;
while (pnode)
{
- prefix = FindAffixes(pnode, word, wrdlen, &plevel, FF_PREFIX);
+ prefix = FindAffixes(Conf->dict, pnode, word, wrdlen, &plevel, FF_PREFIX);
if (!prefix)
break;
- for (j = 0; j < prefix->naff; j++)
+ for (j = prefix->affstart; j <= prefix->affend; j++)
{
- if (CheckAffix(word, wrdlen, prefix->aff[j], flag, newword, NULL))
+ AFFIX *affix = (AFFIX *) DictAffixGet(Conf->dict, j);
+ AffixReg *reg = &(Conf->reg[j]);
+
+ if (affix &&
+ CheckAffix(word, wrdlen, affix, reg, flag, newword, NULL,
+ Conf->dictCtx))
{
/* prefix success */
- if (FindWord(Conf, newword, prefix->aff[j]->flag, flag))
+ if (FindWord(Conf->dict, newword, AffixFieldFlag(affix), flag))
cur += addToResult(forms, cur, newword);
}
}
- pnode = prefix->node;
+ pnode = (AffixNode *) DictNodeGet(DictPrefixNodes(Conf->dict),
+ prefix->node_offset);
}
/*
@@ -2208,45 +2592,59 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
int baselen = 0;
/* find possible suffix */
- suffix = FindAffixes(snode, word, wrdlen, &slevel, FF_SUFFIX);
+ suffix = FindAffixes(Conf->dict, snode, word, wrdlen, &slevel,
+ FF_SUFFIX);
if (!suffix)
break;
/* foreach suffix check affix */
- for (i = 0; i < suffix->naff; i++)
+ for (i = suffix->affstart; i <= suffix->affend; i++)
{
- if (CheckAffix(word, wrdlen, suffix->aff[i], flag, newword, &baselen))
+ AFFIX *sufentry = (AFFIX *) DictAffixGet(Conf->dict, i);
+ AffixReg *sufreg = &(Conf->reg[i]);
+
+ if (sufentry &&
+ CheckAffix(word, wrdlen, sufentry, sufreg, flag, newword, &baselen,
+ Conf->dictCtx))
{
/* suffix success */
- if (FindWord(Conf, newword, suffix->aff[i]->flag, flag))
+ if (FindWord(Conf->dict, newword, AffixFieldFlag(sufentry), flag))
cur += addToResult(forms, cur, newword);
/* now we will look changed word with prefixes */
- pnode = Conf->Prefix;
+ pnode = (AffixNode *) DictPrefixNodes(Conf->dict);
plevel = 0;
swrdlen = strlen(newword);
while (pnode)
{
- prefix = FindAffixes(pnode, newword, swrdlen, &plevel, FF_PREFIX);
+ prefix = FindAffixes(Conf->dict, pnode, newword, swrdlen,
+ &plevel, FF_PREFIX);
if (!prefix)
break;
- for (j = 0; j < prefix->naff; j++)
+ for (j = prefix->affstart; j <= prefix->affend; j++)
{
- if (CheckAffix(newword, swrdlen, prefix->aff[j], flag, pnewword, &baselen))
+ AFFIX *prefentry = (AFFIX *) DictAffixGet(Conf->dict, j);
+ AffixReg *prefreg = &(Conf->reg[j]);
+
+ if (prefentry &&
+ CheckAffix(newword, swrdlen, prefentry, prefreg,
+ flag, pnewword, &baselen, Conf->dictCtx))
{
/* prefix success */
- char *ff = (prefix->aff[j]->flagflags & suffix->aff[i]->flagflags & FF_CROSSPRODUCT) ?
- VoidString : prefix->aff[j]->flag;
+ char *ff = (prefentry->flagflags & sufentry->flagflags & FF_CROSSPRODUCT) ?
+ VoidString : AffixFieldFlag(prefentry);
- if (FindWord(Conf, pnewword, ff, flag))
+ if (FindWord(Conf->dict, pnewword, ff, flag))
cur += addToResult(forms, cur, pnewword);
}
}
- pnode = prefix->node;
+ pnode = (AffixNode *) DictNodeGet(DictPrefixNodes(Conf->dict),
+ prefix->node_offset);
}
}
}
- snode = suffix->node;
+ snode = (AffixNode *) DictNodeGet(DictSuffixNodes(Conf->dict),
+ suffix->node_offset);
}
if (cur == forms)
@@ -2266,7 +2664,8 @@ typedef struct SplitVar
} SplitVar;
static int
-CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
+CheckCompoundAffixes(IspellDictData *dict, CMPDAffix **ptr,
+ char *word, int len, bool CheckInPlace)
{
bool issuffix;
@@ -2276,9 +2675,12 @@ CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
if (CheckInPlace)
{
- while ((*ptr)->affix)
+ while ((*ptr)->affix != ISPELL_INVALID_INDEX)
{
- if (len > (*ptr)->len && strncmp((*ptr)->affix, word, (*ptr)->len) == 0)
+ AFFIX *affix = (AFFIX *) DictAffixGet(dict, (*ptr)->affix);
+
+ if (len > (*ptr)->len &&
+ strncmp(AffixFieldRepl(affix), word, (*ptr)->len) == 0)
{
len = (*ptr)->len;
issuffix = (*ptr)->issuffix;
@@ -2292,9 +2694,12 @@ CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
{
char *affbegin;
- while ((*ptr)->affix)
+ while ((*ptr)->affix != ISPELL_INVALID_INDEX)
{
- if (len > (*ptr)->len && (affbegin = strstr(word, (*ptr)->affix)) != NULL)
+ AFFIX *affix = (AFFIX *) DictAffixGet(dict, (*ptr)->affix);
+
+ if (len > (*ptr)->len &&
+ (affbegin = strstr(word, AffixFieldRepl(affix))) != NULL)
{
len = (*ptr)->len + (affbegin - word);
issuffix = (*ptr)->issuffix;
@@ -2346,13 +2751,14 @@ AddStem(SplitVar *v, char *word)
}
static SplitVar *
-SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int wordlen, int startpos, int minpos)
+SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig,
+ char *word, int wordlen, int startpos, int minpos)
{
SplitVar *var = NULL;
SPNodeData *StopLow,
*StopHigh,
*StopMiddle = NULL;
- SPNode *node = (snode) ? snode : Conf->Dictionary;
+ SPNode *node = (snode) ? snode : (SPNode *) DictDictNodes(Conf->dict);
int level = (snode) ? minpos : startpos; /* recursive
* minpos==level */
int lenaff;
@@ -2367,8 +2773,11 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
while (level < wordlen)
{
/* find word with epenthetic or/and compound affix */
- caff = Conf->CompoundAffix;
- while (level > startpos && (lenaff = CheckCompoundAffixes(&caff, word + level, wordlen - level, (node) ? true : false)) >= 0)
+ caff = (CMPDAffix *) DictCompoundAffix(Conf->dict);
+ while (level > startpos &&
+ (lenaff = CheckCompoundAffixes(Conf->dict, &caff,
+ word + level, wordlen - level,
+ (node) ? true : false)) >= 0)
{
/*
* there is one of compound affixes, so check word for existings
@@ -2415,7 +2824,8 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
while (ptr->next)
ptr = ptr->next;
- ptr->next = SplitToVariants(Conf, NULL, new, word, wordlen, startpos + lenaff, startpos + lenaff);
+ ptr->next = SplitToVariants(Conf, NULL, new, word, wordlen,
+ startpos + lenaff, startpos + lenaff);
pfree(new->stem);
pfree(new);
@@ -2474,13 +2884,14 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
/* we can find next word */
level++;
AddStem(var, pnstrdup(word + startpos, level - startpos));
- node = Conf->Dictionary;
+ node = (SPNode *) DictDictNodes(Conf->dict);
startpos = level;
continue;
}
}
}
- node = StopMiddle->node;
+ node = (SPNode *) DictNodeGet(DictDictNodes(Conf->dict),
+ StopMiddle->node_offset);
}
else
node = NULL;
@@ -2530,7 +2941,7 @@ NINormalizeWord(IspellDict *Conf, char *word)
pfree(res);
}
- if (Conf->usecompound)
+ if (Conf->dict->usecompound)
{
int wordlen = strlen(word);
SplitVar *ptr,
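
Note on the spell.c changes above: mkSPNode() and mkANode() now return offsets into a growable node array rather than pointers, and they re-read the parent node after every recursive call because the array may have been repalloc'ed. Below is a minimal standalone sketch of that pattern, not the patch's code. All names (Arena, Node, arena_alloc, arena_get, build_chain, INVALID_OFFSET) are hypothetical stand-ins, and plain realloc() is used in place of PostgreSQL's memory contexts.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical sentinel, playing the role of ISPELL_INVALID_OFFSET. */
#define INVALID_OFFSET UINT32_MAX

/* Hypothetical growable byte buffer, loosely modelled on NodeArray. */
typedef struct
{
	char	   *bytes;
	uint32_t	size;			/* allocated size of bytes */
	uint32_t	end;			/* end of used data in bytes */
} Arena;

/* Hypothetical trie node: the child is an offset, not a pointer. */
typedef struct
{
	uint8_t		val;
	uint32_t	child_offset;
} Node;

/* Turn an offset into a pointer; valid only until the next arena_alloc(). */
static Node *
arena_get(Arena *a, uint32_t off)
{
	if (off == INVALID_OFFSET)
		return NULL;
	assert(off + sizeof(Node) <= a->end);
	return (Node *) (a->bytes + off);
}

/* Reserve one node and return its offset; may move the whole buffer. */
static uint32_t
arena_alloc(Arena *a, uint8_t val)
{
	uint32_t	off = a->end;
	Node	   *n;

	if (a->end + sizeof(Node) > a->size)
	{
		uint32_t	newsize = a->size ? a->size * 2 : (uint32_t) sizeof(Node);
		char	   *p = realloc(a->bytes, newsize);

		if (p == NULL)
			abort();
		a->bytes = p;
		a->size = newsize;
	}
	a->end += (uint32_t) sizeof(Node);

	n = arena_get(a, off);
	n->val = val;
	n->child_offset = INVALID_OFFSET;
	return off;
}

/* Build a chain of `depth` nodes recursively and return the root offset. */
static uint32_t
build_chain(Arena *a, int depth)
{
	uint32_t	self;
	uint32_t	child;

	if (depth == 0)
		return INVALID_OFFSET;

	self = arena_alloc(a, (uint8_t) depth);
	child = build_chain(a, depth - 1);	/* may realloc the buffer */

	/* Re-derive the pointer from the offset before writing through it. */
	arena_get(a, self)->child_offset = child;
	return self;
}
```

A pointer obtained before the recursive call could dangle once the buffer moves, while an offset stays valid. Offsets also remain meaningful if the finished buffer is copied elsewhere as a whole, which seems to be what makes this layout suitable for a shared memory segment.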
diff --git a/src/include/tsearch/dicts/spell.h b/src/include/tsearch/dicts/spell.h
index 4cba578436..df0abd38ae 100644
--- a/src/include/tsearch/dicts/spell.h
+++ b/src/include/tsearch/dicts/spell.h
@@ -18,21 +18,23 @@
#include "tsearch/dicts/regis.h"
#include "tsearch/ts_public.h"
+#define ISPELL_INVALID_INDEX (0x7FFFF)
+#define ISPELL_INVALID_OFFSET (0xFFFFFFFF)
+
/*
* SPNode and SPNodeData are used to represent prefix tree (Trie) to store
* a words list.
*/
-struct SPNode;
-
typedef struct
{
uint32 val:8,
isword:1,
/* Stores compound flags listed below */
compoundflag:4,
- /* Reference to an entry of the AffixData field */
+ /* Index of an entry of the AffixData field */
affix:19;
- struct SPNode *node;
+ /* Offset to a node of the DictNodes field */
+ uint32 node_offset;
} SPNodeData;
/*
@@ -86,21 +88,55 @@ typedef struct spell_struct
*/
typedef struct aff_struct
{
- char *flag;
/* FF_SUFFIX or FF_PREFIX */
- uint32 type:1,
+ uint16 type:1,
flagflags:7,
issimple:1,
isregis:1,
- replen:14;
- char *find;
- char *repl;
+ flaglen:2;
+
+ /* 8 bits could be too much for repl, find and mask, but who knows */
+ uint8 replen;
+ uint8 findlen;
+ uint8 masklen;
+
+ /*
+ * fields stores the following data (each ends with \0):
+ * - repl
+ * - find
+ * - mask
+ * - flag - one character (if FM_CHAR),
+ * two characters (if FM_LONG),
+ * number, >= 0 and < 65536 (if FM_NUM).
+ */
+ char fields[FLEXIBLE_ARRAY_MEMBER];
+} AFFIX;
+
+#define AF_FLAG_MAXSIZE 5 /* strlen("65535") */
+#define AF_REPL_MAXSIZE 255 /* 8 bits */
+
+#define AFFIXHDRSZ (offsetof(AFFIX, fields))
+
+#define AffixFieldRepl(af) ((af)->fields)
+#define AffixFieldFind(af) ((af)->fields + (af)->replen + 1)
+#define AffixFieldMask(af) (AffixFieldFind(af) + (af)->findlen + 1)
+#define AffixFieldFlag(af) (AffixFieldMask(af) + (af)->masklen + 1)
+#define AffixGetSize(af) (AFFIXHDRSZ + (af)->replen + 1 + (af)->findlen + 1 \
+ + (af)->masklen + 1 + strlen(AffixFieldFlag(af)) + 1)
+
+/*
+ * Stores compiled regular expression of affix. AffixReg uses mask field of
+ * AFFIX as a regular expression.
+ */
+typedef struct AffixReg
+{
+ bool iscompiled;
union
{
regex_t regex;
Regis regis;
- } reg;
-} AFFIX;
+ } r;
+} AffixReg;
/*
* affixes use dictionary flags too
@@ -120,14 +156,13 @@ typedef struct aff_struct
* AffixNode and AffixNodeData are used to represent prefix tree (Trie) to store
* an affix list.
*/
-struct AffixNode;
-
typedef struct
{
- uint32 val:8,
- naff:24;
- AFFIX **aff;
- struct AffixNode *node;
+ uint8 val;
+ uint32 affstart;
+ uint32 affend;
+ /* Offset to a node of the PrefixNodes or SuffixNodes field */
+ uint32 node_offset;
} AffixNodeData;
typedef struct AffixNode
@@ -139,9 +174,20 @@ typedef struct AffixNode
#define ANHRDSZ (offsetof(AffixNode, data))
+typedef struct NodeArray
+{
+ char *Nodes;
+ uint32 NodesSize; /* allocated size of Nodes */
+ uint32 NodesEnd; /* end of data in Nodes */
+} NodeArray;
+
+#define NodeArrayGet(na, of) \
+ (((of) == ISPELL_INVALID_OFFSET) ? NULL : (na)->Nodes + (of))
+
typedef struct
{
- char *affix;
+ /* Index of an affix of the Affix field */
+ uint32 affix;
int len;
bool issuffix;
} CMPDAffix;
@@ -176,30 +222,75 @@ typedef struct CompoundAffixFlag
#define FLAGNUM_MAXSIZE (1 << 16)
-typedef struct
+typedef struct IspellDictData
{
- int maffixes;
- int naffixes;
- AFFIX *Affix;
-
- AffixNode *Suffix;
- AffixNode *Prefix;
+ FlagMode flagMode;
+ bool usecompound;
- SPNode *Dictionary;
- /* Array of sets of affixes */
- char **AffixData;
- int lenAffixData;
- int nAffixData;
bool useFlagAliases;
- CMPDAffix *CompoundAffix;
+ uint32 nAffixData;
+ uint32 AffixDataStart;
- bool usecompound;
- FlagMode flagMode;
+ uint32 AffixOffsetStart;
+ uint32 AffixStart;
+ uint32 nAffix;
+
+ uint32 DictNodesStart;
+ uint32 PrefixNodesStart;
+ uint32 SuffixNodesStart;
+
+ uint32 CompoundAffixStart;
/*
- * All follow fields are actually needed only for initialization
+ * data stores:
+ * - AffixData - array of affix sets
+ * - Affix - sorted array of affixes
+ * - DictNodes - prefix tree of a word list
+ * - PrefixNodes - prefix tree of a prefix list
+ * - SuffixNodes - prefix tree of a suffix list
+ * - CompoundAffix - array of compound affixes
*/
+ char data[FLEXIBLE_ARRAY_MEMBER];
+} IspellDictData;
+
+#define IspellDictDataHdrSize (offsetof(IspellDictData, data))
+
+#define DictAffixDataOffset(d) ((d)->data)
+#define DictAffixData(d) ((d)->data + (d)->AffixDataStart)
+#define DictAffixDataGet(d, i) \
+ (((i) == ISPELL_INVALID_INDEX) ? NULL : \
+ (AssertMacro(i < (d)->nAffixData), \
+ DictAffixData(d) + ((uint32 *) DictAffixDataOffset(d))[i]))
+
+#define DictAffixOffset(d) ((d)->data + (d)->AffixOffsetStart)
+#define DictAffix(d) ((d)->data + (d)->AffixStart)
+#define DictAffixGet(d, i) \
+ (((i) == ISPELL_INVALID_INDEX) ? NULL : \
+ (AssertMacro(i < (d)->nAffix), \
+ DictAffix(d) + ((uint32 *) DictAffixOffset(d))[i]))
+
+#define DictDictNodes(d) ((d)->data + (d)->DictNodesStart)
+#define DictPrefixNodes(d) ((d)->data + (d)->PrefixNodesStart)
+#define DictSuffixNodes(d) ((d)->data + (d)->SuffixNodesStart)
+#define DictNodeGet(node_start, of) \
+ (((of) == ISPELL_INVALID_OFFSET) ? NULL : (char *) (node_start) + (of))
+
+#define DictCompoundAffix(d) ((d)->data + (d)->CompoundAffixStart)
+
+/*
+ * IspellDictBuild is used to initialize IspellDictData struct. This is a
+ * temporary structure which is set up by NIStartBuild() and released by
+ * NIFinishBuild().
+ */
+typedef struct IspellDictBuild
+{
+ MemoryContext buildCxt; /* temp context for construction */
+
+ IspellDictData *dict;
+ uint32 dict_size;
+
+ /* Temporary data */
/* Array of Hunspell options in affix file */
CompoundAffixFlag *CompoundAffixFlags;
@@ -208,29 +299,73 @@ typedef struct
/* allocated length of CompoundAffixFlags array */
int mCompoundAffixFlag;
- /*
- * Remaining fields are only used during dictionary construction; they are
- * set up by NIStartBuild and cleared by NIFinishBuild.
- */
- MemoryContext buildCxt; /* temp context for construction */
-
- /* Temporary array of all words in the dict file */
+ /* Array of all words in the dict file */
SPELL **Spell;
- int nspell; /* number of valid entries in Spell array */
- int mspell; /* allocated length of Spell array */
+ int nSpell; /* number of valid entries in Spell array */
+ int mSpell; /* allocated length of Spell array */
+
+ /* Data for IspellDictData */
+
+ /* Array of all affixes in the aff file */
+ AFFIX **Affix;
+ int nAffix; /* number of valid entries in Affix array */
+ int mAffix; /* allocated length of Affix array */
+ uint32 AffixSize;
+
+ /* Array of sets of affixes */
+ uint32 *AffixDataOffset;
+ int nAffixData; /* number of affix sets */
+ int mAffixData; /* allocated number of affix sets */
+ char *AffixData;
+ uint32 AffixDataSize; /* allocated size of AffixData */
+ uint32 AffixDataEnd; /* end of data in AffixData */
+
+ /* Prefix tree which stores a word list */
+ NodeArray DictNodes;
+
+ /* Prefix tree which stores a prefix list */
+ NodeArray PrefixNodes;
+
+ /* Prefix tree which stores a suffix list */
+ NodeArray SuffixNodes;
- /* These are used to allocate "compact" data without palloc overhead */
- char *firstfree; /* first free address (always maxaligned) */
- size_t avail; /* free space remaining at firstfree */
+ /* Array of compound affixes */
+ CMPDAffix *CompoundAffix;
+ int nCompoundAffix; /* number of entries of CompoundAffix */
+} IspellDictBuild;
+
+#define AffixDataGet(d, i) ((d)->AffixData + (d)->AffixDataOffset[i])
+
+/*
+ * IspellDict is used within NINormalizeWord.
+ */
+typedef struct IspellDict
+{
+ /*
+ * Pointer to a DSM location of IspellDictData. Should be retrieved per
+ * every dispell_lexize() call.
+ */
+ IspellDictData *dict;
+ /*
+ * Array of regular expression of affixes. Each regular expression is
+ * compiled only on demand.
+ */
+ AffixReg *reg;
+ /*
+ * Memory context for compiling regular expressions.
+ */
+ MemoryContext dictCtx;
} IspellDict;
extern TSLexeme *NINormalizeWord(IspellDict *Conf, char *word);
-extern void NIStartBuild(IspellDict *Conf);
-extern void NIImportAffixes(IspellDict *Conf, const char *filename);
-extern void NIImportDictionary(IspellDict *Conf, const char *filename);
-extern void NISortDictionary(IspellDict *Conf);
-extern void NISortAffixes(IspellDict *Conf);
-extern void NIFinishBuild(IspellDict *Conf);
+extern void NIStartBuild(IspellDictBuild *ConfBuild);
+extern void NIImportAffixes(IspellDictBuild *ConfBuild, const char *filename);
+extern void NIImportDictionary(IspellDictBuild *ConfBuild,
+ const char *filename);
+extern void NISortDictionary(IspellDictBuild *ConfBuild);
+extern void NISortAffixes(IspellDictBuild *ConfBuild);
+extern void NICopyData(IspellDictBuild *ConfBuild);
+extern void NIFinishBuild(IspellDictBuild *ConfBuild);
#endif
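To make the flattened AFFIX layout from the spell.h changes above easier to follow, here is a minimal standalone sketch of the same idea: the repl, find, mask and flag strings are packed after a small header and located by stored lengths rather than by pointers, so a byte-for-byte copy (as into a DSM segment) stays usable. It does not use the PostgreSQL headers; DemoAffix and all Demo*/demo_* names are stand-ins invented for this example, the bitfields are left out, and malloc() stands in for whatever allocator the real code uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for the patch's AFFIX struct: a fixed header
 * followed by repl\0 find\0 mask\0 flag\0 in one flexible array.
 */
typedef struct
{
	uint8_t		replen;
	uint8_t		findlen;
	uint8_t		masklen;
	char		fields[];
} DemoAffix;

#define DEMO_HDRSZ			(offsetof(DemoAffix, fields))
#define DemoFieldRepl(af)	((af)->fields)
#define DemoFieldFind(af)	((af)->fields + (af)->replen + 1)
#define DemoFieldMask(af)	(DemoFieldFind(af) + (af)->findlen + 1)
#define DemoFieldFlag(af)	(DemoFieldMask(af) + (af)->masklen + 1)
#define DemoGetSize(af)		(DEMO_HDRSZ + (af)->replen + 1 + (af)->findlen + 1 \
							 + (af)->masklen + 1 + strlen(DemoFieldFlag(af)) + 1)

/* Build one packed affix; returns NULL if a field does not fit in uint8. */
static DemoAffix *
demo_affix_make(const char *repl, const char *find,
				const char *mask, const char *flag)
{
	size_t		replen = strlen(repl);
	size_t		findlen = strlen(find);
	size_t		masklen = strlen(mask);
	DemoAffix  *af;

	if (replen > 255 || findlen > 255 || masklen > 255)
		return NULL;

	af = malloc(DEMO_HDRSZ + replen + 1 + findlen + 1 + masklen + 1 +
				strlen(flag) + 1);
	if (af == NULL)
		return NULL;

	/* The lengths must be set first: the accessor macros depend on them. */
	af->replen = (uint8_t) replen;
	af->findlen = (uint8_t) findlen;
	af->masklen = (uint8_t) masklen;

	strcpy(DemoFieldRepl(af), repl);
	strcpy(DemoFieldFind(af), find);
	strcpy(DemoFieldMask(af), mask);
	strcpy(DemoFieldFlag(af), flag);
	return af;
}

/* Copy the packed affix byte-for-byte; no pointers need fixing up. */
static DemoAffix *
demo_affix_copy(const DemoAffix *src)
{
	size_t		size = DemoGetSize(src);
	DemoAffix  *dst = malloc(size);

	if (dst != NULL)
		memcpy(dst, src, size);
	return dst;
}
```

Because the copy contains only lengths and character data, the accessor macros work on it at any address, which is the property the shared segment relies on. The compiled regular expression is the one piece that cannot be packed this way, which is presumably why the patch keeps it in the per-backend AffixReg array.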
Hello Arthur,
I've looked at the patch today, and in general it seems quite solid to
me. I do have a couple of minor points:
1) I think the comments need more work. Instead of describing all the
individual changes here, I've outlined those improvements in attached
patches (see the attached "tweaks" patches). Some of it is formatting,
minor rewording or larger changes. Some comments are rather redundant
(e.g. the one before calls to release the DSM segment).
2) It's not quite clear to me why we need DictInitData, which simply
combines DictPointerData and the list of options. It seems as if the only
point is to pass a single parameter to the init function, but is it
worth it? Why not get rid of DictInitData entirely and pass two
parameters instead?
3) I find it a bit cumbersome that before each ts_dict_shmem_release
call we construct a dummy DictPointerData value. Why not pass
individual parameters and construct the struct in the function?
4) The reference to max_shared_dictionaries_size is obsolete, because
there's no such limit anymore.
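To make points 2) and 3) a bit more concrete, here is a rough standalone sketch of the two alternatives, not a proposal for the actual signatures. It does not include any PostgreSQL headers: every Demo*/demo_* name is a stand-in invented for this example, DemoOid, DemoXid and DemoTid merely mimic Oid, TransactionId and ItemPointerData, and the meaning of the boolean argument to the release call is an assumption, since it is not shown in this part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for Oid, TransactionId, ItemPointerData and List. */
typedef uint32_t DemoOid;
typedef uint32_t DemoXid;
typedef struct
{
	uint32_t	block;
	uint16_t	offset;
} DemoTid;
typedef struct DemoList DemoList;	/* opaque option list */

/* Mirrors the fields of DictPointerData from the patch. */
typedef struct
{
	DemoOid		id;
	DemoXid		xmin;
	DemoXid		xmax;
	DemoTid		tid;
} DemoDictPointer;

/*
 * Point 2: the init function takes the option list and the dictionary
 * pointer as two separate arguments, so no wrapper struct is needed.
 * A real init function would build the dictionary; this one just echoes
 * the dictionary id so the call can be checked.
 */
static DemoOid
demo_dict_init(DemoList *options, const DemoDictPointer *dict)
{
	(void) options;
	return dict->id;
}

/* Records the last release request, purely so the sketch can be tested. */
static DemoDictPointer demo_last_released;
static bool demo_last_flag;

/*
 * Point 3: the release function takes individual values and assembles the
 * lookup key itself, so callers do not build a dummy struct first.
 */
static void
demo_dict_release(DemoOid id, DemoXid xmin, DemoXid xmax, DemoTid tid,
				  bool flag)
{
	DemoDictPointer dict;

	dict.id = id;
	dict.xmin = xmin;
	dict.xmax = xmax;
	dict.tid = tid;

	demo_last_released = dict;
	demo_last_flag = flag;
}
```

With something along these lines, the callers in RemoveTSDictionaryById() and AlterTSDictionary() would shrink to a single call each, and the four assignments that currently precede every release would live in one place.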
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
0001-Fix-ispell-memory-handling.patch
From e76d34bcb1a84127f9b4402f0147642d77505cc2 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@2ndquadrant.com>
Date: Tue, 15 Jan 2019 22:16:35 +0100
Subject: [PATCH 1/7] Fix ispell memory handling
---
src/backend/tsearch/spell.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index eb39466b22..eb8416ce7f 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -78,6 +78,8 @@
#define tmpalloc(sz) MemoryContextAlloc(Conf->buildCxt, (sz))
#define tmpalloc0(sz) MemoryContextAllocZero(Conf->buildCxt, (sz))
+#define tmpstrdup(str) MemoryContextStrdup(Conf->buildCxt, (str))
+
/*
* Prepare for constructing an ISpell dictionary.
*
@@ -498,7 +500,7 @@ NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
strcpy(Conf->Spell[Conf->nspell]->word, word);
Conf->Spell[Conf->nspell]->p.flag = (*flag != '\0')
- ? cpstrdup(Conf, flag) : VoidString;
+ ? tmpstrdup(flag) : VoidString;
Conf->nspell++;
}
@@ -1040,7 +1042,7 @@ setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
entry->flag.i = i;
}
else
- entry->flag.s = cpstrdup(Conf, s);
+ entry->flag.s = tmpstrdup(s);
entry->flagMode = Conf->flagMode;
entry->value = val;
@@ -1541,6 +1543,9 @@ nextline:
return;
isnewformat:
+ pfree(recoded);
+ pfree(pstr);
+
if (oldformat)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
--
2.17.2
0002-Change-tmplinit-argument.patch
From dbb3cc3b7e7c560472cfa5efa77598f4992e0b70 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@2ndquadrant.com>
Date: Tue, 15 Jan 2019 22:17:32 +0100
Subject: [PATCH 2/7] Change tmplinit argument
---
contrib/dict_int/dict_int.c | 4 +-
contrib/dict_xsyn/dict_xsyn.c | 4 +-
contrib/unaccent/unaccent.c | 4 +-
src/backend/commands/tsearchcmds.c | 10 +++-
src/backend/snowball/dict_snowball.c | 4 +-
src/backend/tsearch/dict_ispell.c | 4 +-
src/backend/tsearch/dict_simple.c | 4 +-
src/backend/tsearch/dict_synonym.c | 4 +-
src/backend/tsearch/dict_thesaurus.c | 4 +-
src/backend/utils/cache/ts_cache.c | 19 +++++++-
src/include/tsearch/ts_cache.h | 4 ++
src/include/tsearch/ts_public.h | 69 ++++++++++++++++++++++++++--
12 files changed, 113 insertions(+), 21 deletions(-)
diff --git a/contrib/dict_int/dict_int.c b/contrib/dict_int/dict_int.c
index 628b9769c3..ddde55eee4 100644
--- a/contrib/dict_int/dict_int.c
+++ b/contrib/dict_int/dict_int.c
@@ -30,7 +30,7 @@ PG_FUNCTION_INFO_V1(dintdict_lexize);
Datum
dintdict_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictInt *d;
ListCell *l;
@@ -38,7 +38,7 @@ dintdict_init(PG_FUNCTION_ARGS)
d->maxlen = 6;
d->rejectlong = false;
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/contrib/dict_xsyn/dict_xsyn.c b/contrib/dict_xsyn/dict_xsyn.c
index 509e14aee0..15b1a0033a 100644
--- a/contrib/dict_xsyn/dict_xsyn.c
+++ b/contrib/dict_xsyn/dict_xsyn.c
@@ -140,7 +140,7 @@ read_dictionary(DictSyn *d, const char *filename)
Datum
dxsyn_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSyn *d;
ListCell *l;
char *filename = NULL;
@@ -153,7 +153,7 @@ dxsyn_init(PG_FUNCTION_ARGS)
d->matchsynonyms = false;
d->keepsynonyms = true;
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/contrib/unaccent/unaccent.c b/contrib/unaccent/unaccent.c
index fc5176e338..f3663cefd0 100644
--- a/contrib/unaccent/unaccent.c
+++ b/contrib/unaccent/unaccent.c
@@ -270,12 +270,12 @@ PG_FUNCTION_INFO_V1(unaccent_init);
Datum
unaccent_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
TrieChar *rootTrie = NULL;
bool fileloaded = false;
ListCell *l;
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/commands/tsearchcmds.c b/src/backend/commands/tsearchcmds.c
index cda21675f0..93a71adc5d 100644
--- a/src/backend/commands/tsearchcmds.c
+++ b/src/backend/commands/tsearchcmds.c
@@ -390,17 +390,25 @@ verify_dictoptions(Oid tmplId, List *dictoptions)
}
else
{
+ DictInitData init_data;
+
/*
* Copy the options just in case init method thinks it can scribble on
* them ...
*/
dictoptions = copyObject(dictoptions);
+ init_data.dict_options = dictoptions;
+ init_data.dict.id = InvalidOid;
+ init_data.dict.xmin = InvalidTransactionId;
+ init_data.dict.xmax = InvalidTransactionId;
+ ItemPointerSetInvalid(&init_data.dict.tid);
+
/*
* Call the init method and see if it complains. We don't worry about
* it leaking memory, since our command will soon be over anyway.
*/
- (void) OidFunctionCall1(initmethod, PointerGetDatum(dictoptions));
+ (void) OidFunctionCall1(initmethod, PointerGetDatum(&init_data));
}
ReleaseSysCache(tup);
diff --git a/src/backend/snowball/dict_snowball.c b/src/backend/snowball/dict_snowball.c
index 5166738310..f30f29865c 100644
--- a/src/backend/snowball/dict_snowball.c
+++ b/src/backend/snowball/dict_snowball.c
@@ -201,14 +201,14 @@ locate_stem_module(DictSnowball *d, const char *lang)
Datum
dsnowball_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSnowball *d;
bool stoploaded = false;
ListCell *l;
d = (DictSnowball *) palloc0(sizeof(DictSnowball));
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index 8b05a477f1..fc9a96abca 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -29,7 +29,7 @@ typedef struct
Datum
dispell_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictISpell *d;
bool affloaded = false,
dictloaded = false,
@@ -40,7 +40,7 @@ dispell_init(PG_FUNCTION_ARGS)
NIStartBuild(&(d->obj));
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_simple.c b/src/backend/tsearch/dict_simple.c
index 2f62ef00c8..c92744641b 100644
--- a/src/backend/tsearch/dict_simple.c
+++ b/src/backend/tsearch/dict_simple.c
@@ -29,7 +29,7 @@ typedef struct
Datum
dsimple_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSimple *d = (DictSimple *) palloc0(sizeof(DictSimple));
bool stoploaded = false,
acceptloaded = false;
@@ -37,7 +37,7 @@ dsimple_init(PG_FUNCTION_ARGS)
d->accept = true; /* default */
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_synonym.c b/src/backend/tsearch/dict_synonym.c
index b6226df940..d3f5f0da3f 100644
--- a/src/backend/tsearch/dict_synonym.c
+++ b/src/backend/tsearch/dict_synonym.c
@@ -91,7 +91,7 @@ compareSyn(const void *a, const void *b)
Datum
dsynonym_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSyn *d;
ListCell *l;
char *filename = NULL;
@@ -104,7 +104,7 @@ dsynonym_init(PG_FUNCTION_ARGS)
char *line = NULL;
uint16 flags = 0;
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_thesaurus.c b/src/backend/tsearch/dict_thesaurus.c
index 75f8deef6a..8962e252e0 100644
--- a/src/backend/tsearch/dict_thesaurus.c
+++ b/src/backend/tsearch/dict_thesaurus.c
@@ -604,7 +604,7 @@ compileTheSubstitute(DictThesaurus *d)
Datum
thesaurus_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictThesaurus *d;
char *subdictname = NULL;
bool fileloaded = false;
@@ -612,7 +612,7 @@ thesaurus_init(PG_FUNCTION_ARGS)
d = (DictThesaurus *) palloc0(sizeof(DictThesaurus));
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/utils/cache/ts_cache.c b/src/backend/utils/cache/ts_cache.c
index f7fc6c1558..a90ee5c115 100644
--- a/src/backend/utils/cache/ts_cache.c
+++ b/src/backend/utils/cache/ts_cache.c
@@ -39,6 +39,7 @@
#include "catalog/pg_ts_template.h"
#include "commands/defrem.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_public.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
#include "utils/fmgroids.h"
@@ -75,6 +76,7 @@ static TSConfigCacheEntry *lastUsedConfig = NULL;
char *TSCurrentConfig = NULL;
static Oid TSCurrentConfigCache = InvalidOid;
+static bool has_invalid_dictionary = false;
/*
@@ -98,7 +100,12 @@ InvalidateTSCacheCallBack(Datum arg, int cacheid, uint32 hashvalue)
hash_seq_init(&status, hash);
while ((entry = (TSAnyCacheEntry *) hash_seq_search(&status)) != NULL)
+ {
+ if (cacheid == TSDICTOID)
+ has_invalid_dictionary = true;
+
entry->isvalid = false;
+ }
/* Also invalidate the current-config cache if it's pg_ts_config */
if (hash == TSConfigCacheHash)
@@ -312,11 +319,15 @@ lookup_ts_dictionary_cache(Oid dictId)
MemSet(entry, 0, sizeof(TSDictionaryCacheEntry));
entry->dictId = dictId;
entry->dictCtx = saveCtx;
+ entry->dict_xmin = HeapTupleHeaderGetRawXmin(tpdict->t_data);
+ entry->dict_xmax = HeapTupleHeaderGetRawXmax(tpdict->t_data);
+ entry->dict_tid = tpdict->t_self;
entry->lexizeOid = template->tmpllexize;
if (OidIsValid(template->tmplinit))
{
+ DictInitData init_data;
List *dictoptions;
Datum opt;
bool isnull;
@@ -336,9 +347,15 @@ lookup_ts_dictionary_cache(Oid dictId)
else
dictoptions = deserialize_deflist(opt);
+ init_data.dict_options = dictoptions;
+ init_data.dict.id = dictId;
+ init_data.dict.xmin = entry->dict_xmin;
+ init_data.dict.xmax = entry->dict_xmax;
+ init_data.dict.tid = entry->dict_tid;
+
entry->dictData =
DatumGetPointer(OidFunctionCall1(template->tmplinit,
- PointerGetDatum(dictoptions)));
+ PointerGetDatum(&init_data)));
MemoryContextSwitchTo(oldcontext);
}
diff --git a/src/include/tsearch/ts_cache.h b/src/include/tsearch/ts_cache.h
index 77e325d101..cf3d870537 100644
--- a/src/include/tsearch/ts_cache.h
+++ b/src/include/tsearch/ts_cache.h
@@ -54,6 +54,10 @@ typedef struct TSDictionaryCacheEntry
Oid dictId;
bool isvalid;
+ TransactionId dict_xmin; /* XMIN of the dictionary's tuple */
+ TransactionId dict_xmax; /* XMAX of the dictionary's tuple */
+ ItemPointerData dict_tid; /* TID of the dictionary's tuple */
+
/* most frequent fmgr call */
Oid lexizeOid;
FmgrInfo lexize;
diff --git a/src/include/tsearch/ts_public.h b/src/include/tsearch/ts_public.h
index b325fa122c..e169450de3 100644
--- a/src/include/tsearch/ts_public.h
+++ b/src/include/tsearch/ts_public.h
@@ -13,6 +13,8 @@
#ifndef _PG_TS_PUBLIC_H_
#define _PG_TS_PUBLIC_H_
+#include "nodes/pg_list.h"
+#include "storage/itemptr.h"
#include "tsearch/ts_type.h"
/*
@@ -81,10 +83,70 @@ extern void readstoplist(const char *fname, StopList *s,
extern bool searchstoplist(StopList *s, char *key);
/*
- * Interface with dictionaries
+ * API for text search dictionaries.
+ *
+ * API functions to handle a text search dictionary are defined by a text search
+ * template. Currently an existing template cannot be altered in order to
+ * define another functions. API consists of the following functions:
+ * - init function - optional function which initializes internal structures of
+ * the dictionary. It accepts DictInitData structure as an argument and must
+ * return a custom palloc'd structure which stores content of the processed
+ * dictionary and is used in lexize function.
+ * - lexize function - normalizes a single word (token) using specific
+ * dictionary. It must return a palloc'd array of TSLexeme the last entry of
+ * which is the terminating entry and accepts the following arguments:
+ * - dictData - pointer to a custom structure returned by init function or
+ * NULL if init function wasn't defined by the template.
+ * - token - string which represents a token to normalize, isn't
+ * null-terminated.
+ * - length - length of token.
+ * - dictState - pointer to a DictSubState structure which stores current
+ * state of a set of tokens processing and allows to normalize phrases.
+ */
+
+/*
+ * A preprocessed dictionary can be stored in shared memory using DSM. Does
+ * the dictionary want it decides init function. A DSM segment is released if
+ * the dictionary was altered or droppped. But still there is a situation when
+ * we haven't a way to prevent a segment leaking. It may happen if the
+ * dictionary was dropped, some backend used the dictionary before dropping, the
+ * backend will hold its DSM segment till disconnecting or calling
+ * lookup_ts_dictionary_cache(), where invalid segment is unpinned.
+ *
+ * DictPointerData is a structure to search a dictionary's DSM segment. We
+ * need xmin, xmax and tid to be sure that the content in the DSM segment still
+ * valid.
+ */
+typedef struct
+{
+ Oid id; /* OID of dictionary which is processed */
+ TransactionId xmin; /* XMIN of the dictionary's tuple */
+ TransactionId xmax; /* XMAX of the dictionary's tuple */
+ ItemPointerData tid; /* TID of the dictionary's tuple */
+} DictPointerData;
+
+/*
+ * API structure for a dictionary initialization. It is passed as an argument
+ * to a template's init function.
*/
+typedef struct
+{
+ /*
+ * A dictionary option list for a template's init method. Should go first
+ * for backward compatibility.
+ */
+ List *dict_options;
+ /*
+ * A dictionary information used to allocate, search and release its DSM
+ * segment.
+ */
+ DictPointerData dict;
+} DictInitData;
-/* return struct for any lexize function */
+/*
+ * Return struct for any lexize function. They are combined into an array, the
+ * last entry is the terminating entry.
+ */
typedef struct
{
/*----------
@@ -108,7 +170,8 @@ typedef struct
uint16 flags; /* See flag bits below */
- char *lexeme; /* C string */
+ char *lexeme; /* C string or NULL if it is a terminating
+ * entry */
} TSLexeme;
/* Flag bits that can appear in TSLexeme.flags */
--
2.17.2
0003-Change-tmplinit-argument-tweaks.patch
From 89eb18b7ca386aef6852e2f889987670f106454b Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@2ndquadrant.com>
Date: Tue, 15 Jan 2019 23:01:27 +0100
Subject: [PATCH 3/7] Change tmplinit argument tweaks
---
src/include/tsearch/ts_cache.h | 6 +--
src/include/tsearch/ts_public.h | 87 ++++++++++++++++++---------------
2 files changed, 51 insertions(+), 42 deletions(-)
diff --git a/src/include/tsearch/ts_cache.h b/src/include/tsearch/ts_cache.h
index cf3d870537..2298e0a275 100644
--- a/src/include/tsearch/ts_cache.h
+++ b/src/include/tsearch/ts_cache.h
@@ -54,9 +54,9 @@ typedef struct TSDictionaryCacheEntry
Oid dictId;
bool isvalid;
- TransactionId dict_xmin; /* XMIN of the dictionary's tuple */
- TransactionId dict_xmax; /* XMAX of the dictionary's tuple */
- ItemPointerData dict_tid; /* TID of the dictionary's tuple */
+ TransactionId dict_xmin; /* XMIN of the dictionary's tuple */
+ TransactionId dict_xmax; /* XMAX of the dictionary's tuple */
+ ItemPointerData dict_tid; /* TID of the dictionary's tuple */
/* most frequent fmgr call */
Oid lexizeOid;
diff --git a/src/include/tsearch/ts_public.h b/src/include/tsearch/ts_public.h
index e169450de3..232cefc672 100644
--- a/src/include/tsearch/ts_public.h
+++ b/src/include/tsearch/ts_public.h
@@ -85,61 +85,71 @@ extern bool searchstoplist(StopList *s, char *key);
/*
* API for text search dictionaries.
*
- * API functions to handle a text search dictionary are defined by a text search
- * template. Currently an existing template cannot be altered in order to
- * define another functions. API consists of the following functions:
- * - init function - optional function which initializes internal structures of
- * the dictionary. It accepts DictInitData structure as an argument and must
- * return a custom palloc'd structure which stores content of the processed
- * dictionary and is used in lexize function.
- * - lexize function - normalizes a single word (token) using specific
- * dictionary. It must return a palloc'd array of TSLexeme the last entry of
- * which is the terminating entry and accepts the following arguments:
- * - dictData - pointer to a custom structure returned by init function or
- * NULL if init function wasn't defined by the template.
- * - token - string which represents a token to normalize, isn't
- * null-terminated.
- * - length - length of token.
- * - dictState - pointer to a DictSubState structure which stores current
- * state of a set of tokens processing and allows to normalize phrases.
+ * API functions to manage a text search dictionary are defined by a text search
+ * template. Currently an existing template cannot be altered in order to use
+ * different functions. API consists of the following functions:
+ *
+ * init function
+ * -------------
+ * - optional function which initializes internal structures of the dictionary
+ * - accepts DictInitData structure as an argument and must return a custom
+ * palloc'd structure which stores content of the processed dictionary and
+ * is used by lexize function
+ *
+ * lexize function
+ * ---------------
+ * - normalizes a single word (token) using specific dictionary
+ * - returns a palloc'd array of TSLexeme, with a terminating NULL entry
+ * - accepts the following arguments:
+ *
+ * - dictData - pointer to a structure returned by init function or NULL if
+ * init function wasn't defined by the template
+ * - token - string to normalize (not null-terminated)
+ * - length - length of the token
+ * - dictState - pointer to a DictSubState structure storing current
+ * state of a set of tokens processing and allows to normalize phrases
*/
/*
- * A preprocessed dictionary can be stored in shared memory using DSM. Does
- * the dictionary want it decides init function. A DSM segment is released if
- * the dictionary was altered or droppped. But still there is a situation when
- * we haven't a way to prevent a segment leaking. It may happen if the
- * dictionary was dropped, some backend used the dictionary before dropping, the
- * backend will hold its DSM segment till disconnecting or calling
- * lookup_ts_dictionary_cache(), where invalid segment is unpinned.
+ * A preprocessed dictionary can be stored in shared memory using DSM - this is
+ * decided in the init function. A DSM segment is released after altering or
+ * dropping the dictionary. The segment may still leak, when a backend uses the
+ * dictionary right before dropping - in that case the backend will hold the DSM
+ * until it disconnects or calls lookup_ts_dictionary_cache().
*
- * DictPointerData is a structure to search a dictionary's DSM segment. We
- * need xmin, xmax and tid to be sure that the content in the DSM segment still
- * valid.
+ * DictPointerData represents DSM segment with a preprocessed dictionary. We
+ * need to ensure the content of the DSM segment is still valid, which is what
+ * xmin, xmax and tid are for.
*/
typedef struct
{
- Oid id; /* OID of dictionary which is processed */
- TransactionId xmin; /* XMIN of the dictionary's tuple */
- TransactionId xmax; /* XMAX of the dictionary's tuple */
- ItemPointerData tid; /* TID of the dictionary's tuple */
+ Oid id; /* OID of the dictionary */
+ TransactionId xmin; /* XMIN of the dictionary's tuple */
+ TransactionId xmax; /* XMAX of the dictionary's tuple */
+ ItemPointerData tid; /* TID of the dictionary's tuple */
} DictPointerData;
/*
* API structure for a dictionary initialization. It is passed as an argument
* to a template's init function.
+ *
+ * XXX Why do we even do it this way, instead of simply passing two parameters
+ * to the tmplinit function?
*/
typedef struct
{
/*
- * A dictionary option list for a template's init method. Should go first
- * for backward compatibility.
+ * List of options for a template's init method (should go first for
+ * backwards compatibility).
+ *
+ * XXX The backwards compatibility argument seems a bit bogus, because this
+ * struct did not exist before anyway, and the functions were accepting List
+ * instead. So the extensions will have to be updated anyway, to prevent
+ * warnings about incorrect signatures / types.
*/
List *dict_options;
- /*
- * A dictionary information used to allocate, search and release its DSM
- * segment.
- */
+
+ /* pointer used to allocate, search and release the DSM segment */
DictPointerData dict;
} DictInitData;
@@ -170,8 +180,7 @@ typedef struct
uint16 flags; /* See flag bits below */
- char *lexeme; /* C string or NULL if it is a terminating
- * entry */
+ char *lexeme; /* C string (NULL for terminating entry) */
} TSLexeme;
/* Flag bits that can appear in TSLexeme.flags */
--
2.17.2
0004-Retrieve-shared-location-for-dict.patch
From 8f864ce99fc809fe299fe941a9b4b1d706bfd559 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@2ndquadrant.com>
Date: Tue, 15 Jan 2019 22:18:03 +0100
Subject: [PATCH 4/7] Retrieve shared location for dict
---
src/backend/commands/tsearchcmds.c | 27 +++
src/backend/storage/ipc/ipci.c | 7 +
src/backend/tsearch/Makefile | 2 +-
src/backend/tsearch/ts_shared.c | 377 +++++++++++++++++++++++++++++
src/backend/utils/cache/ts_cache.c | 41 ++++
src/include/storage/lwlock.h | 2 +
src/include/tsearch/ts_shared.h | 26 ++
7 files changed, 481 insertions(+), 1 deletion(-)
create mode 100644 src/backend/tsearch/ts_shared.c
create mode 100644 src/include/tsearch/ts_shared.h
diff --git a/src/backend/commands/tsearchcmds.c b/src/backend/commands/tsearchcmds.c
index 93a71adc5d..6d2868ccb5 100644
--- a/src/backend/commands/tsearchcmds.c
+++ b/src/backend/commands/tsearchcmds.c
@@ -40,6 +40,7 @@
#include "nodes/makefuncs.h"
#include "parser/parse_func.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_shared.h"
#include "tsearch/ts_utils.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
@@ -518,6 +519,7 @@ RemoveTSDictionaryById(Oid dictId)
{
Relation relation;
HeapTuple tup;
+ DictPointerData dict;
relation = heap_open(TSDictionaryRelationId, RowExclusiveLock);
@@ -529,6 +531,18 @@ RemoveTSDictionaryById(Oid dictId)
CatalogTupleDelete(relation, &tup->t_self);
+ /*
+ * We need to release the dictionary's DSM segment. The segment still may
+ * leak. It may happen if some backend used the dictionary before dropping,
+ * the backend will hold its DSM segment till disconnecting or calling
+ * lookup_ts_dictionary_cache().
+ */
+ dict.id = dictId;
+ dict.xmin = HeapTupleHeaderGetRawXmin(tup->t_data);
+ dict.xmax = HeapTupleHeaderGetRawXmax(tup->t_data);
+ dict.tid = tup->t_self;
+ ts_dict_shmem_release(&dict, true);
+
ReleaseSysCache(tup);
heap_close(relation, RowExclusiveLock);
@@ -552,6 +566,7 @@ AlterTSDictionary(AlterTSDictionaryStmt *stmt)
bool repl_null[Natts_pg_ts_dict];
bool repl_repl[Natts_pg_ts_dict];
ObjectAddress address;
+ DictPointerData dict;
dictId = get_ts_dict_oid(stmt->dictname, false);
@@ -638,6 +653,18 @@ AlterTSDictionary(AlterTSDictionaryStmt *stmt)
ObjectAddressSet(address, TSDictionaryRelationId, dictId);
+ /*
+ * We need to release the dictionary's DSM segment. The segment isn't valid
+ * anymore. The segment still may leak: if some backend used the dictionary
+ * before it was altered, that backend will hold its DSM segment until it
+ * disconnects or calls lookup_ts_dictionary_cache().
+ */
+ dict.id = dictId;
+ dict.xmin = HeapTupleHeaderGetRawXmin(tup->t_data);
+ dict.xmax = HeapTupleHeaderGetRawXmax(tup->t_data);
+ dict.tid = tup->t_self;
+ ts_dict_shmem_release(&dict, true);
+
/*
* NOTE: because we only support altering the options, not the template,
* there is no need to update dependencies. This might have to change if
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 2849e47d99..a1af2b2692 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -44,6 +44,7 @@
#include "storage/procsignal.h"
#include "storage/sinvaladt.h"
#include "storage/spin.h"
+#include "tsearch/ts_shared.h"
#include "utils/snapmgr.h"
@@ -148,6 +149,7 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
size = add_size(size, BTreeShmemSize());
size = add_size(size, SyncScanShmemSize());
size = add_size(size, AsyncShmemSize());
+ size = add_size(size, TsearchShmemSize());
#ifdef EXEC_BACKEND
size = add_size(size, ShmemBackendArraySize());
#endif
@@ -268,6 +270,11 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
SyncScanShmemInit();
AsyncShmemInit();
+ /*
+ * Set up shared memory for tsearch
+ */
+ TsearchShmemInit();
+
#ifdef EXEC_BACKEND
/*
diff --git a/src/backend/tsearch/Makefile b/src/backend/tsearch/Makefile
index 62d8bb3254..0b25c20fb0 100644
--- a/src/backend/tsearch/Makefile
+++ b/src/backend/tsearch/Makefile
@@ -26,7 +26,7 @@ DICTFILES_PATH=$(addprefix dicts/,$(DICTFILES))
OBJS = ts_locale.o ts_parse.o wparser.o wparser_def.o dict.o \
dict_simple.o dict_synonym.o dict_thesaurus.o \
dict_ispell.o regis.o spell.o \
- to_tsany.o ts_selfuncs.o ts_typanalyze.o ts_utils.o
+ to_tsany.o ts_selfuncs.o ts_shared.o ts_typanalyze.o ts_utils.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/tsearch/ts_shared.c b/src/backend/tsearch/ts_shared.c
new file mode 100644
index 0000000000..748ab5a782
--- /dev/null
+++ b/src/backend/tsearch/ts_shared.c
@@ -0,0 +1,377 @@
+/*-------------------------------------------------------------------------
+ *
+ * ts_shared.c
+ * tsearch shared memory management
+ *
+ * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/tsearch/ts_shared.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/hash.h"
+#include "lib/dshash.h"
+#include "miscadmin.h"
+#include "storage/lwlock.h"
+#include "storage/shmem.h"
+#include "tsearch/ts_shared.h"
+#include "utils/hashutils.h"
+#include "utils/memutils.h"
+
+
+/*
+ * Hash table entries key.
+ */
+typedef struct
+{
+ Oid db_id;
+ DictPointerData dict;
+} TsearchDictKey;
+
+/*
+ * Hash table entries representing shared dictionaries.
+ */
+typedef struct
+{
+ TsearchDictKey key;
+ dsm_handle dict_dsm;
+
+ /*
+ * Flag tracking whether the DSM segment is still pinned. Without it we
+ * could end up calling dsm_unpin_segment() twice.
+ */
+ bool segment_ispinned;
+} TsearchDictEntry;
+
+static dshash_table *dict_table = NULL;
+
+/*
+ * Information about the main shmem segment, used to coordinate
+ * access to the hash table and dictionaries.
+ */
+typedef struct
+{
+ dsa_handle area;
+ dshash_table_handle dict_table_handle;
+
+ LWLock lock;
+} TsearchCtlData;
+
+static TsearchCtlData *tsearch_ctl;
+
+static int tsearch_dict_cmp(const void *a, const void *b, size_t size,
+ void *arg);
+static uint32 tsearch_dict_hash(const void *a, size_t size, void *arg);
+static void init_dict_table(void);
+
+/* Parameters for dict_table */
+static const dshash_parameters dict_table_params ={
+ sizeof(TsearchDictKey),
+ sizeof(TsearchDictEntry),
+ tsearch_dict_cmp,
+ tsearch_dict_hash,
+ LWTRANCHE_TSEARCH_TABLE
+};
+
+/*
+ * Build the dictionary using allocate_cb callback.
+ *
+ * First try to find the dictionary in the shared hash table. If it was built by
+ * someone earlier just return its location in DSM.
+ *
+ * init_data: an argument used within a template's init method.
+ * allocate_cb: function to build the dictionary, if it wasn't found in DSM.
+ *
+ * Returns address in the dynamic shared memory segment or in backend memory.
+ */
+void *
+ts_dict_shmem_location(DictInitData *init_data,
+ ts_dict_build_callback allocate_cb)
+{
+ TsearchDictKey key;
+ TsearchDictEntry *entry;
+ bool found;
+ dsm_segment *seg;
+ void *dict,
+ *dict_location;
+ Size dict_size;
+
+ init_dict_table();
+
+ /*
+ * Build the dictionary in backend's memory if dictid is invalid (it may
+ * happen if the dictionary's init method was called within
+ * verify_dictoptions()).
+ */
+ if (!OidIsValid(init_data->dict.id))
+ {
+ dict = allocate_cb(init_data->dict_options, &dict_size);
+
+ return dict;
+ }
+
+ /* Set up key for hashtable search */
+ key.db_id = MyDatabaseId;
+ key.dict = init_data->dict;
+
+ /* Try to find an entry in the hash table */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, &key, false);
+
+ if (entry)
+ {
+ seg = dsm_find_mapping(entry->dict_dsm);
+ if (!seg)
+ {
+ seg = dsm_attach(entry->dict_dsm);
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+ }
+
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+ }
+
+ /* The dictionary hasn't been loaded into memory yet */
+ entry = (TsearchDictEntry *) dshash_find_or_insert(dict_table, &key,
+ &found);
+
+ if (found)
+ {
+ /*
+ * Someone concurrently inserted a dictionary entry since the first time
+ * we checked.
+ */
+ seg = dsm_attach(entry->dict_dsm);
+
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+ }
+
+ /* Build the dictionary */
+ dict = allocate_cb(init_data->dict_options, &dict_size);
+
+ /* Finally, allocate a DSM segment for the compiled dictionary */
+ seg = dsm_create(dict_size, 0);
+ dict_location = dsm_segment_address(seg);
+ memcpy(dict_location, dict, dict_size);
+
+ pfree(dict);
+
+ entry->key = key;
+ entry->dict_dsm = dsm_segment_handle(seg);
+ entry->segment_ispinned = true;
+
+ /* Keep the segment alive until postmaster shutdown */
+ dsm_pin_segment(seg);
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+}
+
+/*
+ * Release memory occupied by the dictionary. The function unpins the DSM
+ * mapping. If the dictionary was dropped or altered, it also unpins the DSM
+ * segment, so the segment goes away once no backend maps it anymore.
+ *
+ * dict: key to search the dictionary's DSM segment.
+ * unpin_segment: true if the segment must be unpinned because the dictionary
+ * was dropped or altered.
+ */
+void
+ts_dict_shmem_release(DictPointerData *dict, bool unpin_segment)
+{
+ TsearchDictKey key;
+ TsearchDictEntry *entry;
+
+ /*
+ * If we didn't attach to a hash table then do nothing.
+ */
+ if (!dict_table && !unpin_segment)
+ return;
+ /*
+ * But if we need to unpin the DSM segment to get rid of the segment when
+ * the last interested process disconnects, we need the hash table to find
+ * the dictionary's entry.
+ */
+ else if (unpin_segment)
+ init_dict_table();
+
+ /* Set up key for hashtable search */
+ key.db_id = MyDatabaseId;
+ key.dict = *dict;
+
+ /* Try to find an entry in the hash table */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, &key, true);
+
+ if (entry)
+ {
+ dsm_segment *seg;
+
+ seg = dsm_find_mapping(entry->dict_dsm);
+
+ if (seg)
+ {
+ dsm_unpin_mapping(seg);
+ dsm_detach(seg);
+ }
+
+ if (unpin_segment && entry->segment_ispinned)
+ {
+ dsm_unpin_segment(entry->dict_dsm);
+ entry->segment_ispinned = false;
+ }
+
+ dshash_release_lock(dict_table, entry);
+ }
+}
+
+/*
+ * Allocate and initialize tsearch-related shared memory.
+ */
+void
+TsearchShmemInit(void)
+{
+ bool found;
+
+ tsearch_ctl = (TsearchCtlData *)
+ ShmemInitStruct("Full Text Search Ctl", sizeof(TsearchCtlData), &found);
+
+ if (!found)
+ {
+ LWLockRegisterTranche(LWTRANCHE_TSEARCH_DSA, "tsearch_dsa");
+ LWLockRegisterTranche(LWTRANCHE_TSEARCH_TABLE, "tsearch_table");
+
+ LWLockInitialize(&tsearch_ctl->lock, LWTRANCHE_TSEARCH_DSA);
+
+ tsearch_ctl->area = DSM_HANDLE_INVALID;
+ tsearch_ctl->dict_table_handle = InvalidDsaPointer;
+ }
+}
+
+/*
+ * Report shared memory space needed by TsearchShmemInit.
+ */
+Size
+TsearchShmemSize(void)
+{
+ Size size = 0;
+
+ /* size of service structure */
+ size = add_size(size, MAXALIGN(sizeof(TsearchCtlData)));
+
+ return size;
+}
+
+/*
+ * A comparator function for TsearchDictKey.
+ *
+ * Returns 0 if keys are equal, 1 otherwise.
+ */
+static int
+tsearch_dict_cmp(const void *a, const void *b, size_t size, void *arg)
+{
+ TsearchDictKey *k1 = (TsearchDictKey *) a;
+ TsearchDictKey *k2 = (TsearchDictKey *) b;
+
+ if (k1->db_id == k2->db_id && k1->dict.id == k2->dict.id &&
+ k1->dict.xmin == k2->dict.xmin && k1->dict.xmax == k2->dict.xmax &&
+ ItemPointerEquals(&k1->dict.tid, &k2->dict.tid))
+ return 0;
+ else
+ return 1;
+}
+
+/*
+ * A hash function for TsearchDictKey.
+ */
+static uint32
+tsearch_dict_hash(const void *a, size_t size, void *arg)
+{
+ TsearchDictKey *k = (TsearchDictKey *) a;
+ uint32 s;
+
+ s = hash_combine(0, hash_uint32(k->db_id));
+ s = hash_combine(s, hash_uint32(k->dict.id));
+ s = hash_combine(s, hash_uint32(k->dict.xmin));
+ s = hash_combine(s, hash_uint32(k->dict.xmax));
+ s = hash_combine(s,
+ hash_uint32(BlockIdGetBlockNumber(&k->dict.tid.ip_blkid)));
+ s = hash_combine(s, hash_uint32(k->dict.tid.ip_posid));
+
+ return s;
+}
+
+/*
+ * Initialize hash table located in DSM.
+ *
+ * The hash table should be created and initialized if it doesn't exist yet.
+ */
+static void
+init_dict_table(void)
+{
+ MemoryContext old_context;
+ dsa_area *dsa;
+
+ /* Exit if the hash table was initialized already */
+ if (dict_table)
+ return;
+
+ old_context = MemoryContextSwitchTo(TopMemoryContext);
+
+recheck_table:
+ LWLockAcquire(&tsearch_ctl->lock, LW_SHARED);
+
+ /* The hash table has already been created by someone */
+ if (DsaPointerIsValid(tsearch_ctl->dict_table_handle))
+ {
+ Assert(tsearch_ctl->area != DSM_HANDLE_INVALID);
+
+ dsa = dsa_attach(tsearch_ctl->area);
+
+ dict_table = dshash_attach(dsa,
+ &dict_table_params,
+ tsearch_ctl->dict_table_handle,
+ NULL);
+ }
+ else
+ {
+ /* Try to get exclusive lock */
+ LWLockRelease(&tsearch_ctl->lock);
+ if (!LWLockAcquireOrWait(&tsearch_ctl->lock, LW_EXCLUSIVE))
+ {
+ /*
+ * Another backend held the lock and has released it. It has probably
+ * created the hash table concurrently, so check again.
+ */
+ goto recheck_table;
+ }
+
+ dsa = dsa_create(LWTRANCHE_TSEARCH_DSA);
+ tsearch_ctl->area = dsa_get_handle(dsa);
+
+ dict_table = dshash_create(dsa, &dict_table_params, NULL);
+ tsearch_ctl->dict_table_handle = dshash_get_hash_table_handle(dict_table);
+
+ /* Keep the area alive until postmaster shutdown */
+ dsa_pin(dsa);
+ }
+
+ LWLockRelease(&tsearch_ctl->lock);
+
+ /* Remain attached until end of session */
+ dsa_pin_mapping(dsa);
+
+ MemoryContextSwitchTo(old_context);
+}
diff --git a/src/backend/utils/cache/ts_cache.c b/src/backend/utils/cache/ts_cache.c
index a90ee5c115..a002a01d05 100644
--- a/src/backend/utils/cache/ts_cache.c
+++ b/src/backend/utils/cache/ts_cache.c
@@ -40,6 +40,7 @@
#include "commands/defrem.h"
#include "tsearch/ts_cache.h"
#include "tsearch/ts_public.h"
+#include "tsearch/ts_shared.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
#include "utils/fmgroids.h"
@@ -88,6 +89,10 @@ static bool has_invalid_dictionary = false;
* doesn't seem worth the trouble to determine that; we just flush all the
* entries of the related hash table.
*
+ * We set has_invalid_dictionary to true so that all used segments are unpinned
+ * later, on the first use of a text search function. It isn't safe to call
+ * ts_dict_shmem_release() here since it may call kernel functions.
+ *
* We can use the same function for all TS caches by passing the hash
* table address as the "arg".
*/
@@ -112,6 +117,35 @@ InvalidateTSCacheCallBack(Datum arg, int cacheid, uint32 hashvalue)
TSCurrentConfigCache = InvalidOid;
}
+/*
+ * Unpin shared segments of all dictionary entries.
+ *
+ * XXX All dictionaries, but only when there's invalid dictionary?
+ */
+static void
+do_ts_dict_shmem_release(void)
+{
+ HASH_SEQ_STATUS status;
+ TSDictionaryCacheEntry *entry;
+
+ if (!has_invalid_dictionary)
+ return;
+
+ hash_seq_init(&status, TSDictionaryCacheHash);
+ while ((entry = (TSDictionaryCacheEntry *) hash_seq_search(&status)) != NULL)
+ {
+ DictPointerData dict_ptr;
+
+ dict_ptr.id = entry->dictId;
+ dict_ptr.xmin = entry->dict_xmin;
+ dict_ptr.xmax = entry->dict_xmax;
+ dict_ptr.tid = entry->dict_tid;
+ ts_dict_shmem_release(&dict_ptr, false);
+ }
+
+ has_invalid_dictionary = false;
+}
+
/*
* Fetch parser cache entry
*/
@@ -260,6 +294,13 @@ lookup_ts_dictionary_cache(Oid dictId)
Form_pg_ts_template template;
MemoryContext saveCtx;
+ /*
+ * It is possible that some invalid entries hold a DSM mapping, and we
+ * need to unpin it to avoid leaking memory. We will also unpin the segments
+ * of all other invalid dictionaries.
+ */
+ do_ts_dict_shmem_release();
+
tpdict = SearchSysCache1(TSDICTOID, ObjectIdGetDatum(dictId));
if (!HeapTupleIsValid(tpdict))
elog(ERROR, "cache lookup failed for text search dictionary %u",
diff --git a/src/include/storage/lwlock.h b/src/include/storage/lwlock.h
index 96c7732006..49a3319a11 100644
--- a/src/include/storage/lwlock.h
+++ b/src/include/storage/lwlock.h
@@ -219,6 +219,8 @@ typedef enum BuiltinTrancheIds
LWTRANCHE_SHARED_TUPLESTORE,
LWTRANCHE_TBM,
LWTRANCHE_PARALLEL_APPEND,
+ LWTRANCHE_TSEARCH_DSA,
+ LWTRANCHE_TSEARCH_TABLE,
LWTRANCHE_FIRST_USER_DEFINED
} BuiltinTrancheIds;
diff --git a/src/include/tsearch/ts_shared.h b/src/include/tsearch/ts_shared.h
new file mode 100644
index 0000000000..4339dc6e1f
--- /dev/null
+++ b/src/include/tsearch/ts_shared.h
@@ -0,0 +1,26 @@
+/*-------------------------------------------------------------------------
+ *
+ * ts_shared.h
+ * tsearch shared memory management
+ *
+ * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
+ *
+ * src/include/tsearch/ts_shared.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef TS_SHARED_H
+#define TS_SHARED_H
+
+#include "tsearch/ts_public.h"
+
+typedef void *(*ts_dict_build_callback) (List *dictoptions, Size *size);
+
+extern void *ts_dict_shmem_location(DictInitData *init_data,
+ ts_dict_build_callback allocate_cb);
+extern void ts_dict_shmem_release(DictPointerData *dict, bool unpin_segment);
+
+extern void TsearchShmemInit(void);
+extern Size TsearchShmemSize(void);
+
+#endif /* TS_SHARED_H */
--
2.17.2
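
The comparator and hash function in ts_shared.c follow the usual memcmp-style contract: the comparator returns 0 for equal keys, and equal keys must hash to the same value. Below is a minimal standalone sketch of that contract. DemoDictKey, demo_key_cmp(), demo_mix() and demo_key_hash() are hypothetical stand-ins written for this illustration; they are not the PostgreSQL definitions and the mixing step is only assumed to be "good enough" for a demo.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for TsearchDictKey; not the PostgreSQL definition. */
typedef struct DemoDictKey
{
	uint32_t	db_id;
	uint32_t	dict_id;
	uint32_t	xmin;
	uint32_t	xmax;
	uint32_t	block;		/* block number part of the tuple id */
	uint16_t	posid;		/* offset part of the tuple id */
} DemoDictKey;

/* memcmp-style contract: 0 means "equal", non-zero means "different". */
static int
demo_key_cmp(const DemoDictKey *k1, const DemoDictKey *k2)
{
	if (k1->db_id == k2->db_id && k1->dict_id == k2->dict_id &&
		k1->xmin == k2->xmin && k1->xmax == k2->xmax &&
		k1->block == k2->block && k1->posid == k2->posid)
		return 0;
	return 1;
}

/* Toy mixing step; an assumption for this sketch, not hash_combine(). */
static uint32_t
demo_mix(uint32_t h, uint32_t v)
{
	return h ^ (v + 0x9e3779b9u + (h << 6) + (h >> 2));
}

/* Hash every field that the comparator looks at, and nothing else. */
static uint32_t
demo_key_hash(const DemoDictKey *k)
{
	uint32_t	h = 0;

	assert(k != NULL);
	h = demo_mix(h, k->db_id);
	h = demo_mix(h, k->dict_id);
	h = demo_mix(h, k->xmin);
	h = demo_mix(h, k->xmax);
	h = demo_mix(h, k->block);
	h = demo_mix(h, k->posid);
	return h;
}
```

Comparing and hashing field by field, rather than over the raw bytes of the struct, keeps padding bytes (here after posid) out of the result.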
Attachment: 0005-Retrieve-shared-location-for-dict-tweak.patch (text/x-patch)
From 5387d9515849c328cfeb3c513aae74215324e343 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@2ndquadrant.com>
Date: Tue, 15 Jan 2019 23:48:49 +0100
Subject: [PATCH 5/7] Retrieve shared location for dict tweak
---
src/backend/commands/tsearchcmds.c | 5 +++++
src/backend/tsearch/ts_shared.c | 4 ++++
src/backend/utils/cache/ts_cache.c | 1 +
3 files changed, 10 insertions(+)
diff --git a/src/backend/commands/tsearchcmds.c b/src/backend/commands/tsearchcmds.c
index 6d2868ccb5..71b6d5a3c9 100644
--- a/src/backend/commands/tsearchcmds.c
+++ b/src/backend/commands/tsearchcmds.c
@@ -536,6 +536,9 @@ RemoveTSDictionaryById(Oid dictId)
* leak: if some backend used the dictionary before it was dropped,
* that backend will hold its DSM segment until it disconnects or calls
* lookup_ts_dictionary_cache().
+ *
+ * XXX This comment is already elsewhere, so maybe move all of that to the
+ * ts_dict_shmem_release function comment.
*/
dict.id = dictId;
dict.xmin = HeapTupleHeaderGetRawXmin(tup->t_data);
@@ -658,6 +661,8 @@ AlterTSDictionary(AlterTSDictionaryStmt *stmt)
* anymore. The segment still may leak: if some backend used the dictionary
* before it was altered, that backend will hold its DSM segment until it
* disconnects or calls lookup_ts_dictionary_cache().
+ *
+ * XXX Move comment to ts_dict_shmem_release?
*/
dict.id = dictId;
dict.xmin = HeapTupleHeaderGetRawXmin(tup->t_data);
diff --git a/src/backend/tsearch/ts_shared.c b/src/backend/tsearch/ts_shared.c
index 748ab5a782..9a2bedc43a 100644
--- a/src/backend/tsearch/ts_shared.c
+++ b/src/backend/tsearch/ts_shared.c
@@ -188,6 +188,9 @@ ts_dict_shmem_location(DictInitData *init_data,
* dict: key to search the dictionary's DSM segment.
* unpin_segment: true if the segment must be unpinned because the dictionary
* was dropped or altered.
+ *
+ * XXX Maybe change this to accept individual fields instead of DictPointerData,
+ * so that we don't have to build the struct elsewhere, just to call this function.
*/
void
ts_dict_shmem_release(DictPointerData *dict, bool unpin_segment)
@@ -375,3 +378,4 @@ recheck_table:
MemoryContextSwitchTo(old_context);
}
+
diff --git a/src/backend/utils/cache/ts_cache.c b/src/backend/utils/cache/ts_cache.c
index a002a01d05..4ad103b87e 100644
--- a/src/backend/utils/cache/ts_cache.c
+++ b/src/backend/utils/cache/ts_cache.c
@@ -134,6 +134,7 @@ do_ts_dict_shmem_release(void)
hash_seq_init(&status, TSDictionaryCacheHash);
while ((entry = (TSDictionaryCacheEntry *) hash_seq_search(&status)) != NULL)
{
+ /* XXX not really a pointer, so the name is misleading */
DictPointerData dict_ptr;
dict_ptr.id = entry->dictId;
--
2.17.2
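
The comments touched above describe two things: the segment can outlive DROP/ALTER while some backend still maps it, and the segment_ispinned flag guards against unpinning twice. The sketch below models only that bookkeeping with plain counters. DemoSegment, DemoEntry and demo_release() are hypothetical names for this illustration; the real code relies on dsm_unpin_mapping(), dsm_detach() and dsm_unpin_segment(), which are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a DSM segment's lifetime bookkeeping. */
typedef struct DemoSegment
{
	int			pin_count;	/* stand-in for the segment-level pin */
	int			mappings;	/* number of backends still attached */
	bool		destroyed;	/* set once nothing keeps the segment alive */
} DemoSegment;

/* Hypothetical model of a shared hash table entry. */
typedef struct DemoEntry
{
	DemoSegment *seg;
	bool		segment_ispinned;
} DemoEntry;

/*
 * Model of the release logic: drop our own mapping if we have one, and
 * drop the segment-level pin at most once, guarded by the flag.
 */
static void
demo_release(DemoEntry *entry, bool attached, bool unpin_segment)
{
	DemoSegment *seg = entry->seg;

	if (attached)
	{
		assert(seg->mappings > 0);
		seg->mappings--;
	}

	if (unpin_segment && entry->segment_ispinned)
	{
		seg->pin_count--;
		entry->segment_ispinned = false;
	}

	/* The segment goes away only when it is unpinned and unmapped. */
	if (seg->pin_count == 0 && seg->mappings == 0)
		seg->destroyed = true;
}
```

In this model, a DROP issued by a backend that never attached clears the pin, but the segment survives until the remaining backends release their mappings. That is the "leak" the comments refer to.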
Attachment: 0006-Store-ispell-in-shared-location.patch (text/x-patch)
From b694aad2b85c7dba8014a7b8b9eb31031a5a82f4 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@2ndquadrant.com>
Date: Tue, 15 Jan 2019 22:18:35 +0100
Subject: [PATCH 6/7] Store ispell in shared location
---
doc/src/sgml/textsearch.sgml | 15 +
src/backend/tsearch/dict_ispell.c | 197 +++--
src/backend/tsearch/spell.c | 1343 +++++++++++++++++++----------
src/include/tsearch/dicts/spell.h | 239 +++--
4 files changed, 1212 insertions(+), 582 deletions(-)
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index ecebade767..0f172eda04 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -3110,6 +3110,21 @@ CREATE TEXT SEARCH DICTIONARY english_stem (
</sect2>
+ <sect2 id="textsearch-shared-dictionaries">
+ <title>Dictionaries in Shared Memory</title>
+
+ <para>
+ Some dictionaries, especially <application>Ispell</application>, consume
+ a significant amount of memory, in some cases tens of megabytes. Most of
+ them store the data in text files, and building the in-memory structure is
+ both CPU-intensive and time-consuming. Instead of doing this in each backend when
+ it needs a dictionary for the first time, the compiled dictionary may be
+ stored in dynamic shared memory so that it may be reused by other backends.
+ Currently only <application>Ispell</application> supports loading into
+ dynamic shared memory.
+ </para>
+ </sect2>
+
</sect1>
<sect1 id="textsearch-configuration">
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index fc9a96abca..3c9dd78c56 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -5,6 +5,15 @@
*
* Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
*
+ * By default all Ispell dictionaries are stored in DSM. But if the amount
+ * of memory exceeds max_shared_dictionaries_size, then the dictionary will be
+ * allocated in private backend memory (in dictCtx context).
+ *
+ * All necessary data is built within the dispell_build() function, but
+ * structures for regular expressions are compiled on first demand and
+ * stored in the AffixReg array. This is because regex_t and Regis cannot
+ * easily be stored in shared memory.
+ *
*
* IDENTIFICATION
* src/backend/tsearch/dict_ispell.c
@@ -14,8 +23,10 @@
#include "postgres.h"
#include "commands/defrem.h"
+#include "storage/dsm.h"
#include "tsearch/dicts/spell.h"
#include "tsearch/ts_locale.h"
+#include "tsearch/ts_shared.h"
#include "tsearch/ts_utils.h"
#include "utils/builtins.h"
@@ -26,54 +37,126 @@ typedef struct
IspellDict obj;
} DictISpell;
+static void parse_dictoptions(List *dictoptions,
+ char **dictfile, char **afffile, char **stopfile);
+static void *dispell_build(List *dictoptions, Size *size);
+
Datum
dispell_init(PG_FUNCTION_ARGS)
{
DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictISpell *d;
- bool affloaded = false,
- dictloaded = false,
- stoploaded = false;
- ListCell *l;
+ void *dict_location;
+ char *stopfile;
d = (DictISpell *) palloc0(sizeof(DictISpell));
- NIStartBuild(&(d->obj));
+ parse_dictoptions(init_data->dict_options, NULL, NULL, &stopfile);
+
+ if (stopfile)
+ readstoplist(stopfile, &(d->stoplist), lowerstr);
+
+ dict_location = ts_dict_shmem_location(init_data, dispell_build);
+ Assert(dict_location);
+
+ d->obj.dict = (IspellDictData *) dict_location;
+ d->obj.reg = (AffixReg *) palloc0(d->obj.dict->nAffix *
+ sizeof(AffixReg));
+ /* Current memory context is dictionary's private memory context */
+ d->obj.dictCtx = CurrentMemoryContext;
+
+ PG_RETURN_POINTER(d);
+}
+
+Datum
+dispell_lexize(PG_FUNCTION_ARGS)
+{
+ DictISpell *d = (DictISpell *) PG_GETARG_POINTER(0);
+ char *in = (char *) PG_GETARG_POINTER(1);
+ int32 len = PG_GETARG_INT32(2);
+ char *txt;
+ TSLexeme *res;
+ TSLexeme *ptr,
+ *cptr;
+
+ if (len <= 0)
+ PG_RETURN_POINTER(NULL);
+
+ txt = lowerstr_with_len(in, len);
+ res = NINormalizeWord(&(d->obj), txt);
+
+ if (res == NULL)
+ PG_RETURN_POINTER(NULL);
+
+ cptr = res;
+ for (ptr = cptr; ptr->lexeme; ptr++)
+ {
+ if (searchstoplist(&(d->stoplist), ptr->lexeme))
+ {
+ pfree(ptr->lexeme);
+ ptr->lexeme = NULL;
+ }
+ else
+ {
+ if (cptr != ptr)
+ memcpy(cptr, ptr, sizeof(TSLexeme));
+ cptr++;
+ }
+ }
+ cptr->lexeme = NULL;
+
+ PG_RETURN_POINTER(res);
+}
+
+static void
+parse_dictoptions(List *dictoptions, char **dictfile, char **afffile,
+ char **stopfile)
+{
+ ListCell *l;
- foreach(l, init_data->dict_options)
+ if (dictfile)
+ *dictfile = NULL;
+ if (afffile)
+ *afffile = NULL;
+ if (stopfile)
+ *stopfile = NULL;
+
+ foreach(l, dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
if (strcmp(defel->defname, "dictfile") == 0)
{
- if (dictloaded)
+ if (!dictfile)
+ continue;
+
+ if (*dictfile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple DictFile parameters")));
- NIImportDictionary(&(d->obj),
- get_tsearch_config_filename(defGetString(defel),
- "dict"));
- dictloaded = true;
+ *dictfile = get_tsearch_config_filename(defGetString(defel), "dict");
}
else if (strcmp(defel->defname, "afffile") == 0)
{
- if (affloaded)
+ if (!afffile)
+ continue;
+
+ if (*afffile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple AffFile parameters")));
- NIImportAffixes(&(d->obj),
- get_tsearch_config_filename(defGetString(defel),
- "affix"));
- affloaded = true;
+ *afffile = get_tsearch_config_filename(defGetString(defel), "affix");
}
else if (strcmp(defel->defname, "stopwords") == 0)
{
- if (stoploaded)
+ if (!stopfile)
+ continue;
+
+ if (*stopfile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple StopWords parameters")));
- readstoplist(defGetString(defel), &(d->stoplist), lowerstr);
- stoploaded = true;
+ *stopfile = defGetString(defel);
}
else
{
@@ -83,66 +166,52 @@ dispell_init(PG_FUNCTION_ARGS)
defel->defname)));
}
}
+}
- if (affloaded && dictloaded)
- {
- NISortDictionary(&(d->obj));
- NISortAffixes(&(d->obj));
- }
- else if (!affloaded)
+/*
+ * Build the dictionary.
+ *
+ * Result is palloc'ed.
+ */
+static void *
+dispell_build(List *dictoptions, Size *size)
+{
+ IspellDictBuild build;
+ char *dictfile,
+ *afffile;
+
+ parse_dictoptions(dictoptions, &dictfile, &afffile, NULL);
+
+ if (!afffile)
{
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("missing AffFile parameter")));
}
- else
+ else if (!dictfile)
{
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("missing DictFile parameter")));
}
- NIFinishBuild(&(d->obj));
+ MemSet(&build, 0, sizeof(build));
+ NIStartBuild(&build);
- PG_RETURN_POINTER(d);
-}
+ /* Read files */
+ NIImportDictionary(&build, dictfile);
+ NIImportAffixes(&build, afffile);
-Datum
-dispell_lexize(PG_FUNCTION_ARGS)
-{
- DictISpell *d = (DictISpell *) PG_GETARG_POINTER(0);
- char *in = (char *) PG_GETARG_POINTER(1);
- int32 len = PG_GETARG_INT32(2);
- char *txt;
- TSLexeme *res;
- TSLexeme *ptr,
- *cptr;
+ /* Build persistent data to be used by backends */
+ NISortDictionary(&build);
+ NISortAffixes(&build);
- if (len <= 0)
- PG_RETURN_POINTER(NULL);
+ NICopyData(&build);
- txt = lowerstr_with_len(in, len);
- res = NINormalizeWord(&(d->obj), txt);
-
- if (res == NULL)
- PG_RETURN_POINTER(NULL);
+ /* Release temporary data */
+ NIFinishBuild(&build);
- cptr = res;
- for (ptr = cptr; ptr->lexeme; ptr++)
- {
- if (searchstoplist(&(d->stoplist), ptr->lexeme))
- {
- pfree(ptr->lexeme);
- ptr->lexeme = NULL;
- }
- else
- {
- if (cptr != ptr)
- memcpy(cptr, ptr, sizeof(TSLexeme));
- cptr++;
- }
- }
- cptr->lexeme = NULL;
-
- PG_RETURN_POINTER(res);
+ /* Return the buffer and its size */
+ *size = build.dict_size;
+ return build.dict;
}
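
The reason dispell_build() can hand back a single buffer plus its size is that the persistent data uses offsets instead of pointers, so a plain memcpy() into another address range keeps it valid. Here is a small standalone sketch of that idea, assuming a much simpler payload (a list of strings). DemoDictData, demo_build() and demo_get() are hypothetical and do not mirror IspellDictData's real layout.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical pointer-free container: header, offsets array, strings. */
typedef struct DemoDictData
{
	uint32_t	nwords;
	uint32_t	strings_start;	/* byte offset of the string area in data[] */
	char		data[];			/* uint32 offsets first, then the strings */
} DemoDictData;

/* Build the flat buffer; the caller owns the result and frees it. */
static DemoDictData *
demo_build(const char *const *words, uint32_t nwords, size_t *size)
{
	size_t		strbytes = 0;
	size_t		offbytes = (size_t) nwords * sizeof(uint32_t);
	uint32_t	off = 0;
	uint32_t   *offsets;
	DemoDictData *d;

	for (uint32_t i = 0; i < nwords; i++)
		strbytes += strlen(words[i]) + 1;

	*size = sizeof(DemoDictData) + offbytes + strbytes;
	d = calloc(1, *size);
	if (d == NULL)
		return NULL;

	d->nwords = nwords;
	d->strings_start = (uint32_t) offbytes;
	offsets = (uint32_t *) d->data;

	for (uint32_t i = 0; i < nwords; i++)
	{
		size_t		len = strlen(words[i]) + 1;

		/* Store where the word starts, relative to the string area. */
		offsets[i] = off;
		memcpy(d->data + d->strings_start + off, words[i], len);
		off += (uint32_t) len;
	}

	return d;
}

/* Resolve word i using offsets only, so any copy of the buffer works. */
static const char *
demo_get(const DemoDictData *d, uint32_t i)
{
	const uint32_t *offsets = (const uint32_t *) d->data;

	assert(i < d->nwords);
	return d->data + d->strings_start + offsets[i];
}
```

Copying the buffer elsewhere and freeing the original leaves demo_get() working on the copy, which is the property a dictionary needs when each backend may map the DSM segment at a different address.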
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index eb8416ce7f..123fba7a11 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -23,33 +23,35 @@
* Compilation of a dictionary
* ---------------------------
*
- * A compiled dictionary is stored in the IspellDict structure. Compilation of
- * a dictionary is divided into the several steps:
+ * A compiled dictionary is stored in the following structures:
+ * - IspellDictBuild - stores temporary data and IspellDictData
+ * - IspellDictData - stores permanent data used within NINormalizeWord()
+ * Compilation of the dictionary is divided into several steps:
* - NIImportDictionary() - stores each word of a .dict file in the
* temporary Spell field.
- * - NIImportAffixes() - stores affix rules of an .affix file in the
- * Affix field (not temporary) if an .affix file has the Ispell format.
+ * - NIImportAffixes() - stores affix rules of an .affix file in the temporary
+ * Affix field if an .affix file has the Ispell format.
* -> NIImportOOAffixes() - stores affix rules if an .affix file has the
* Hunspell format. The AffixData field is initialized if AF parameter
* is defined.
* - NISortDictionary() - builds a prefix tree (Trie) from the words list
- * and stores it in the Dictionary field. The words list is got from the
+ * and stores it in the DictNodes field. The words list is got from the
* Spell field. The AffixData field is initialized if AF parameter is not
* defined.
* - NISortAffixes():
* - builds a list of compound affixes from the affix list and stores it
* in the CompoundAffix.
* - builds prefix trees (Trie) from the affix list for prefixes and suffixes
- * and stores them in Suffix and Prefix fields.
+ * and stores them in SuffixNodes and PrefixNodes fields.
* The affix list is got from the Affix field.
+ * Persistent data of the dictionary is copied within NICopyData().
*
* Memory management
* -----------------
*
- * The IspellDict structure has the Spell field which is used only in compile
- * time. The Spell field stores a words list. It can take a lot of memory.
- * Therefore when a dictionary is compiled this field is cleared by
- * NIFinishBuild().
+ * The IspellDictBuild structure holds temporary data that is used only at
+ * compile time. It can take a lot of memory. Therefore, after the dictionary
+ * is compiled, this data is released by NIFinishBuild().
*
* All resources which should cleared by NIFinishBuild() is initialized using
* tmpalloc() and tmpalloc0().
@@ -73,112 +75,166 @@
* after the initialization is done. During initialization,
* CurrentMemoryContext is the long-lived memory context associated
* with the dictionary cache entry. We keep the short-lived stuff
- * in the Conf->buildCxt context.
+ * in the ConfBuild->buildCxt context.
*/
-#define tmpalloc(sz) MemoryContextAlloc(Conf->buildCxt, (sz))
-#define tmpalloc0(sz) MemoryContextAllocZero(Conf->buildCxt, (sz))
+#define tmpalloc(sz) MemoryContextAlloc(ConfBuild->buildCxt, (sz))
+#define tmpalloc0(sz) MemoryContextAllocZero(ConfBuild->buildCxt, (sz))
-#define tmpstrdup(str) MemoryContextStrdup(Conf->buildCxt, (str))
+#define tmpstrdup(str) MemoryContextStrdup(ConfBuild->buildCxt, (str))
/*
* Prepare for constructing an ISpell dictionary.
*
- * The IspellDict struct is assumed to be zeroed when allocated.
+ * The IspellDictBuild struct is assumed to be zeroed when allocated.
*/
void
-NIStartBuild(IspellDict *Conf)
+NIStartBuild(IspellDictBuild *ConfBuild)
{
+ uint32 dict_size;
+
/*
* The temp context is a child of CurTransactionContext, so that it will
* go away automatically on error.
*/
- Conf->buildCxt = AllocSetContextCreate(CurTransactionContext,
- "Ispell dictionary init context",
- ALLOCSET_DEFAULT_SIZES);
+ ConfBuild->buildCxt = AllocSetContextCreate(CurTransactionContext,
+ "Ispell dictionary init context",
+ ALLOCSET_DEFAULT_SIZES);
+
+ /*
+ * Allocate the buffer for the dictionary in the current context, not in buildCxt.
+ */
+ dict_size = MAXALIGN(IspellDictDataHdrSize);
+ ConfBuild->dict = palloc0(dict_size);
+ ConfBuild->dict_size = dict_size;
}
/*
- * Clean up when dictionary construction is complete.
+ * Copy compiled and persistent data into IspellDictData.
*/
void
-NIFinishBuild(IspellDict *Conf)
+NICopyData(IspellDictBuild *ConfBuild)
{
- /* Release no-longer-needed temp memory */
- MemoryContextDelete(Conf->buildCxt);
- /* Just for cleanliness, zero the now-dangling pointers */
- Conf->buildCxt = NULL;
- Conf->Spell = NULL;
- Conf->firstfree = NULL;
- Conf->CompoundAffixFlags = NULL;
-}
+ IspellDictData *dict;
+ uint32 size;
+ int i;
+ uint32 *offsets,
+ offset = 0;
+ SPNode *dict_node PG_USED_FOR_ASSERTS_ONLY;
+ AffixNode *aff_node PG_USED_FOR_ASSERTS_ONLY;
+ /*
+ * Calculate necessary space
+ */
+ size = ConfBuild->nAffixData * sizeof(uint32);
+ size += ConfBuild->AffixDataEnd;
-/*
- * "Compact" palloc: allocate without extra palloc overhead.
- *
- * Since we have no need to free the ispell data items individually, there's
- * not much value in the per-chunk overhead normally consumed by palloc.
- * Getting rid of it is helpful since ispell can allocate a lot of small nodes.
- *
- * We currently pre-zero all data allocated this way, even though some of it
- * doesn't need that. The cpalloc and cpalloc0 macros are just documentation
- * to indicate which allocations actually require zeroing.
- */
-#define COMPACT_ALLOC_CHUNK 8192 /* amount to get from palloc at once */
-#define COMPACT_MAX_REQ 1024 /* must be < COMPACT_ALLOC_CHUNK */
+ size += ConfBuild->nAffix * sizeof(uint32);
+ size += ConfBuild->AffixSize;
-static void *
-compact_palloc0(IspellDict *Conf, size_t size)
-{
- void *result;
+ size += ConfBuild->DictNodes.NodesEnd;
+ size += ConfBuild->PrefixNodes.NodesEnd;
+ size += ConfBuild->SuffixNodes.NodesEnd;
- /* Should only be called during init */
- Assert(Conf->buildCxt != NULL);
+ size += sizeof(CMPDAffix) * ConfBuild->nCompoundAffix;
- /* No point in this for large chunks */
- if (size > COMPACT_MAX_REQ)
- return palloc0(size);
+ /*
+ * Copy data itself
+ */
+ ConfBuild->dict_size = IspellDictDataHdrSize + size;
+ ConfBuild->dict = repalloc(ConfBuild->dict, ConfBuild->dict_size);
+
+ dict = ConfBuild->dict;
+
+ /* AffixData */
+ dict->nAffixData = ConfBuild->nAffixData;
+ dict->AffixDataStart = sizeof(uint32) * ConfBuild->nAffixData;
+ memcpy(DictAffixDataOffset(dict), ConfBuild->AffixDataOffset,
+ sizeof(uint32) * ConfBuild->nAffixData);
+ memcpy(DictAffixData(dict), ConfBuild->AffixData, ConfBuild->AffixDataEnd);
+
+ /* Affix array */
+ dict->nAffix = ConfBuild->nAffix;
+ dict->AffixOffsetStart = dict->AffixDataStart + ConfBuild->AffixDataEnd;
+ dict->AffixStart = dict->AffixOffsetStart + sizeof(uint32) * ConfBuild->nAffix;
+ if (ConfBuild->nAffix > 0)
+ {
+ offsets = (uint32 *) DictAffixOffset(dict);
+ for (i = 0; i < ConfBuild->nAffix; i++)
+ {
+ AFFIX *affix;
+ uint32 size = AffixGetSize(ConfBuild->Affix[i]);
- /* Keep everything maxaligned */
- size = MAXALIGN(size);
+ offsets[i] = offset;
+ affix = (AFFIX *) DictAffixGet(dict, i);
+ Assert(affix);
- /* Need more space? */
- if (size > Conf->avail)
- {
- Conf->firstfree = palloc0(COMPACT_ALLOC_CHUNK);
- Conf->avail = COMPACT_ALLOC_CHUNK;
- }
+ memcpy(affix, ConfBuild->Affix[i], size);
- result = (void *) Conf->firstfree;
- Conf->firstfree += size;
- Conf->avail -= size;
+ offset += size;
+ }
+ }
- return result;
+ /* DictNodes prefix tree */
+ dict->DictNodesStart = dict->AffixStart + offset;
+ /* We have at least one root node even if dictionary list is empty */
+ dict_node = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, 0);
+ Assert(dict_node && dict_node->length > 0);
+ /* Copy dictionary nodes into persistent structure */
+ memcpy(DictDictNodes(dict), ConfBuild->DictNodes.Nodes,
+ ConfBuild->DictNodes.NodesEnd);
+
+ /* PrefixNodes prefix tree */
+ dict->PrefixNodesStart = dict->DictNodesStart + ConfBuild->DictNodes.NodesEnd;
+ /* We have at least one root node even if prefix list is empty */
+ aff_node = (AffixNode *) NodeArrayGet(&ConfBuild->PrefixNodes, 0);
+ Assert(aff_node && aff_node->length > 0);
+ /* Copy prefix nodes into persistent structure */
+ memcpy(DictPrefixNodes(dict), ConfBuild->PrefixNodes.Nodes,
+ ConfBuild->PrefixNodes.NodesEnd);
+
+ /* SuffixNodes prefix tree */
+ dict->SuffixNodesStart = dict->PrefixNodesStart + ConfBuild->PrefixNodes.NodesEnd;
+ /* We have at least one root node even if suffix list is empty */
+ aff_node = (AffixNode *) NodeArrayGet(&ConfBuild->SuffixNodes, 0);
+ Assert(aff_node && aff_node->length > 0);
+ /* Copy suffix nodes into persistent structure */
+ memcpy(DictSuffixNodes(dict), ConfBuild->SuffixNodes.Nodes,
+ ConfBuild->SuffixNodes.NodesEnd);
+
+ /* CompoundAffix array */
+ dict->CompoundAffixStart = dict->SuffixNodesStart +
+ ConfBuild->SuffixNodes.NodesEnd;
+ /* We have at least one CompoundAffix terminating entry */
+ Assert(ConfBuild->nCompoundAffix > 0);
+ /* Copy array of compound affixes into persistent structure */
+ memcpy(DictCompoundAffix(dict), ConfBuild->CompoundAffix,
+ sizeof(CMPDAffix) * ConfBuild->nCompoundAffix);
}
-#define cpalloc(size) compact_palloc0(Conf, size)
-#define cpalloc0(size) compact_palloc0(Conf, size)
-
-static char *
-cpstrdup(IspellDict *Conf, const char *str)
+/*
+ * Clean up when dictionary construction is complete.
+ */
+void
+NIFinishBuild(IspellDictBuild *ConfBuild)
{
- char *res = cpalloc(strlen(str) + 1);
-
- strcpy(res, str);
- return res;
+ /* Release no-longer-needed temp memory */
+ MemoryContextDelete(ConfBuild->buildCxt);
+ /* Just for cleanliness, zero the now-dangling pointers */
+ ConfBuild->buildCxt = NULL;
+ ConfBuild->Spell = NULL;
+ ConfBuild->CompoundAffixFlags = NULL;
}
-
/*
* Apply lowerstr(), producing a temporary result (in the buildCxt).
*/
static char *
-lowerstr_ctx(IspellDict *Conf, const char *src)
+lowerstr_ctx(IspellDictBuild *ConfBuild, const char *src)
{
MemoryContext saveCtx;
char *dst;
- saveCtx = MemoryContextSwitchTo(Conf->buildCxt);
+ saveCtx = MemoryContextSwitchTo(ConfBuild->buildCxt);
dst = lowerstr(src);
MemoryContextSwitchTo(saveCtx);
@@ -190,7 +246,7 @@ lowerstr_ctx(IspellDict *Conf, const char *src)
#define STRNCMP(s,p) strncmp( (s), (p), strlen(p) )
#define GETWCHAR(W,L,N,T) ( ((const uint8*)(W))[ ((T)==FF_PREFIX) ? (N) : ( (L) - 1 - (N) ) ] )
-#define GETCHAR(A,N,T) GETWCHAR( (A)->repl, (A)->replen, N, T )
+#define GETCHAR(A,N,T) GETWCHAR( AffixFieldRepl(A), (A)->replen, N, T )
static char *VoidString = "";
@@ -311,18 +367,189 @@ strbncmp(const unsigned char *s1, const unsigned char *s2, size_t count)
static int
cmpaffix(const void *s1, const void *s2)
{
- const AFFIX *a1 = (const AFFIX *) s1;
- const AFFIX *a2 = (const AFFIX *) s2;
+ const AFFIX *a1 = *((AFFIX *const *) s1);
+ const AFFIX *a2 = *((AFFIX *const *) s2);
if (a1->type < a2->type)
return -1;
if (a1->type > a2->type)
return 1;
if (a1->type == FF_PREFIX)
- return strcmp(a1->repl, a2->repl);
+ return strcmp(AffixFieldRepl(a1), AffixFieldRepl(a2));
else
- return strbcmp((const unsigned char *) a1->repl,
- (const unsigned char *) a2->repl);
+ return strbcmp((const unsigned char *) AffixFieldRepl(a1),
+ (const unsigned char *) AffixFieldRepl(a2));
+}
+
+/*
+ * Allocate space for AffixData.
+ */
+static void
+InitAffixData(IspellDictBuild *ConfBuild, int numAffixData)
+{
+ uint32 size;
+
+ size = 8 * 1024 /* Reserve 8KB for data */;
+
+ ConfBuild->AffixData = (char *) tmpalloc(size);
+ ConfBuild->AffixDataSize = size;
+ ConfBuild->AffixDataOffset = (uint32 *) tmpalloc(numAffixData * sizeof(uint32));
+ ConfBuild->nAffixData = 0;
+ ConfBuild->mAffixData = numAffixData;
+
+ /* Save offset of the end of data */
+ ConfBuild->AffixDataEnd = 0;
+}
+
+/*
+ * Append a set of affix flags to ConfBuild->AffixData, enlarging the data
+ * buffer and the offsets array when the new set does not fit.
+ *
+ * ConfBuild: building structure for the current dictionary.
+ * AffixSet: set of affix flags; AffixSetLen: its length without the '\0'.
+ */
+static void
+AddAffixSet(IspellDictBuild *ConfBuild, const char *AffixSet,
+ uint32 AffixSetLen)
+{
+ /*
+ * Check available space for AffixSet.
+ */
+ if (ConfBuild->AffixDataEnd + AffixSetLen + 1 /* \0 */ >=
+ ConfBuild->AffixDataSize)
+ {
+ uint32 newsize = Max(ConfBuild->AffixDataSize + 8 * 1024 /* 8KB */,
+ ConfBuild->AffixDataSize + AffixSetLen + 1);
+
+ ConfBuild->AffixData = (char *) repalloc(ConfBuild->AffixData, newsize);
+ ConfBuild->AffixDataSize = newsize;
+ }
+
+ /* Check available number of offsets */
+ if (ConfBuild->nAffixData >= ConfBuild->mAffixData)
+ {
+ ConfBuild->mAffixData *= 2;
+ ConfBuild->AffixDataOffset = (uint32 *) repalloc(ConfBuild->AffixDataOffset,
+ sizeof(uint32) * ConfBuild->mAffixData);
+ }
+
+ ConfBuild->AffixDataOffset[ConfBuild->nAffixData] = ConfBuild->AffixDataEnd;
+ StrNCpy(AffixDataGet(ConfBuild, ConfBuild->nAffixData),
+ AffixSet, AffixSetLen + 1);
+
+ /* Save offset of the end of data */
+ ConfBuild->AffixDataEnd += AffixSetLen + 1;
+ ConfBuild->nAffixData++;
+}
+
+/*
+ * Allocate space for prefix tree node.
+ *
+ * ConfBuild: building structure for the current dictionary.
+ * array: NodeArray where to allocate new node.
+ * length: number of allocated NodeData.
+ * sizeNodeData: minimum size of each NodeData.
+ * sizeNodeHeader: size of header of new node.
+ *
+ * Returns an offset of new node in NodeArray->Nodes.
+ */
+static uint32
+AllocateNode(IspellDictBuild *ConfBuild, NodeArray *array, uint32 length,
+ uint32 sizeNodeData, uint32 sizeNodeHeader)
+{
+ uint32 node_offset;
+ uint32 size;
+
+ size = sizeNodeHeader + length * sizeNodeData;
+ size = MAXALIGN(size);
+
+ if (array->NodesSize == 0)
+ {
+ array->NodesSize = size * 32; /* Reserve space for next levels of the
+ * prefix tree */
+ array->Nodes = (char *) tmpalloc(array->NodesSize);
+ array->NodesEnd = 0;
+ }
+ else if (array->NodesEnd + size >= array->NodesSize)
+ {
+ array->NodesSize = Max(array->NodesSize * 2, array->NodesSize + size);
+ array->Nodes = (char *) repalloc(array->Nodes, array->NodesSize);
+ }
+
+ node_offset = array->NodesEnd;
+ array->NodesEnd += size;
+
+ return node_offset;
+}
+
+/*
+ * Allocate space for SPNode.
+ *
+ * Returns an offset of new node in ConfBuild->DictNodes.Nodes.
+ */
+static uint32
+AllocateSPNode(IspellDictBuild *ConfBuild, uint32 length)
+{
+ uint32 offset;
+ SPNode *node;
+ SPNodeData *data;
+ uint32 i;
+
+ offset = AllocateNode(ConfBuild, &ConfBuild->DictNodes, length,
+ sizeof(SPNodeData), SPNHDRSZ);
+ node = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, offset);
+ node->length = length;
+
+ /*
+ * Initialize all SPNodeData with default values. We cannot use memset()
+ * here because not all fields have 0 as default value.
+ */
+ for (i = 0; i < length; i++)
+ {
+ data = &(node->data[i]);
+ data->val = 0;
+ data->affix = ISPELL_INVALID_INDEX;
+ data->compoundflag = 0;
+ data->isword = 0;
+ data->node_offset = ISPELL_INVALID_OFFSET;
+ }
+
+ return offset;
+}
+
+/*
+ * Allocate space for AffixNode.
+ *
+ * Returns an offset of new node in NodeArray->Nodes.
+ */
+static uint32
+AllocateAffixNode(IspellDictBuild *ConfBuild, NodeArray *array, uint32 length)
+{
+ uint32 offset;
+ AffixNode *node;
+ AffixNodeData *data;
+ uint32 i;
+
+ offset = AllocateNode(ConfBuild, array, length, sizeof(AffixNodeData),
+ ANHRDSZ);
+ node = (AffixNode *) NodeArrayGet(array, offset);
+ node->length = length;
+ node->isvoid = 0;
+
+ /*
+ * Initialize all AffixNodeData with default values. We cannot use memset()
+ * here because not all fields have 0 as default value.
+ */
+ for (i = 0; i < length; i++)
+ {
+ data = &(node->data[i]);
+ data->val = 0;
+ data->affstart = ISPELL_INVALID_INDEX;
+ data->affend = ISPELL_INVALID_INDEX;
+ data->node_offset = ISPELL_INVALID_OFFSET;
+ }
+
+ return offset;
}
/*
@@ -333,7 +560,7 @@ cmpaffix(const void *s1, const void *s2)
* - 2 characters (FM_LONG). A character may be Unicode.
* - numbers from 1 to 65000 (FM_NUM).
*
- * Depending on the flagMode an affix string can have the following format:
+ * Depending on the flagmode an affix string can have the following format:
* - FM_CHAR: ABCD
* Here we have 4 flags: A, B, C and D
* - FM_LONG: ABCDE*
@@ -341,13 +568,13 @@ cmpaffix(const void *s1, const void *s2)
* - FM_NUM: 200,205,50
* Here we have 3 flags: 200, 205 and 50
*
- * Conf: current dictionary.
+ * flagmode: flag mode of the dictionary.
* sflagset: the set of affix flags. Returns a reference to the start of a next
* affix flag.
* sflag: returns an affix flag from sflagset.
*/
static void
-getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
+getNextFlagFromString(FlagMode flagmode, char **sflagset, char *sflag)
{
int32 s;
char *next,
@@ -356,11 +583,11 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
bool stop = false;
bool met_comma = false;
- maxstep = (Conf->flagMode == FM_LONG) ? 2 : 1;
+ maxstep = (flagmode == FM_LONG) ? 2 : 1;
while (**sflagset)
{
- switch (Conf->flagMode)
+ switch (flagmode)
{
case FM_LONG:
case FM_CHAR:
@@ -422,15 +649,15 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
stop = true;
break;
default:
- elog(ERROR, "unrecognized type of Conf->flagMode: %d",
- Conf->flagMode);
+ elog(ERROR, "unrecognized type of flagmode: %d",
+ flagmode);
}
if (stop)
break;
}
- if (Conf->flagMode == FM_LONG && maxstep > 0)
+ if (flagmode == FM_LONG && maxstep > 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix flag \"%s\" with \"long\" flag value",
@@ -440,31 +667,28 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
}
/*
- * Checks if the affix set Conf->AffixData[affix] contains affixflag.
- * Conf->AffixData[affix] does not contain affixflag if this flag is not used
- * actually by the .dict file.
+ * Checks if the affix set from AffixData contains affixflag. The affix set
+ * does not contain affixflag if the .dict file never actually uses that flag.
*
- * Conf: current dictionary.
- * affix: index of the Conf->AffixData array.
+ * flagmode: flag mode of the dictionary.
+ * sflagset: the set of affix flags.
* affixflag: the affix flag.
*
- * Returns true if the string Conf->AffixData[affix] contains affixflag,
- * otherwise returns false.
+ * Returns true if the affix set string contains affixflag, otherwise returns
+ * false.
*/
static bool
-IsAffixFlagInUse(IspellDict *Conf, int affix, const char *affixflag)
+IsAffixFlagInUse(FlagMode flagmode, char *sflagset, const char *affixflag)
{
- char *flagcur;
+ char *flagcur = sflagset;
char flag[BUFSIZ];
if (*affixflag == 0)
return true;
- flagcur = Conf->AffixData[affix];
-
while (*flagcur)
{
- getNextFlagFromString(Conf, &flagcur, flag);
+ getNextFlagFromString(flagmode, &flagcur, flag);
/* Compare first affix flag in flagcur with affixflag */
if (strcmp(flag, affixflag) == 0)
return true;
@@ -477,31 +701,33 @@ IsAffixFlagInUse(IspellDict *Conf, int affix, const char *affixflag)
/*
* Adds the new word into the temporary array Spell.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* word: new word.
* flag: set of affix flags. Single flag can be get by getNextFlagFromString().
*/
static void
-NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
+NIAddSpell(IspellDictBuild *ConfBuild, const char *word, const char *flag)
{
- if (Conf->nspell >= Conf->mspell)
+ if (ConfBuild->nSpell >= ConfBuild->mSpell)
{
- if (Conf->mspell)
+ if (ConfBuild->mSpell)
{
- Conf->mspell *= 2;
- Conf->Spell = (SPELL **) repalloc(Conf->Spell, Conf->mspell * sizeof(SPELL *));
+ ConfBuild->mSpell *= 2;
+ ConfBuild->Spell = (SPELL **) repalloc(ConfBuild->Spell,
+ ConfBuild->mSpell * sizeof(SPELL *));
}
else
{
- Conf->mspell = 1024 * 20;
- Conf->Spell = (SPELL **) tmpalloc(Conf->mspell * sizeof(SPELL *));
+ ConfBuild->mSpell = 1024 * 20;
+ ConfBuild->Spell = (SPELL **) tmpalloc(ConfBuild->mSpell * sizeof(SPELL *));
}
}
- Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
- strcpy(Conf->Spell[Conf->nspell]->word, word);
- Conf->Spell[Conf->nspell]->p.flag = (*flag != '\0')
+ ConfBuild->Spell[ConfBuild->nSpell] =
+ (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
+ strcpy(ConfBuild->Spell[ConfBuild->nSpell]->word, word);
+ ConfBuild->Spell[ConfBuild->nSpell]->p.flag = (*flag != '\0')
? tmpstrdup(flag) : VoidString;
- Conf->nspell++;
+ ConfBuild->nSpell++;
}
/*
@@ -509,11 +735,11 @@ NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
*
* Note caller must already have applied get_tsearch_config_filename.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* filename: path to the .dict file.
*/
void
-NIImportDictionary(IspellDict *Conf, const char *filename)
+NIImportDictionary(IspellDictBuild *ConfBuild, const char *filename)
{
tsearch_readline_state trst;
char *line;
@@ -564,9 +790,9 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
}
s += pg_mblen(s);
}
- pstr = lowerstr_ctx(Conf, line);
+ pstr = lowerstr_ctx(ConfBuild, line);
- NIAddSpell(Conf, pstr, flag);
+ NIAddSpell(ConfBuild, pstr, flag);
pfree(pstr);
pfree(line);
@@ -590,7 +816,7 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
* SFX M 0 's .
* is presented here.
*
- * Conf: current dictionary.
+ * dict: current dictionary.
* word: basic form of word.
* affixflag: affix flag, by which a basic form of word was generated.
* flag: compound flag used to compare with StopMiddle->compoundflag.
@@ -598,9 +824,9 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
* Returns 1 if the word was found in the prefix tree, else returns 0.
*/
static int
-FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
+FindWord(IspellDictData *dict, const char *word, const char *affixflag, int flag)
{
- SPNode *node = Conf->Dictionary;
+ SPNode *node = (SPNode *) DictDictNodes(dict);
SPNodeData *StopLow,
*StopHigh,
*StopMiddle;
@@ -636,10 +862,14 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
* Check if this affix rule is presented in the affix set
* with index StopMiddle->affix.
*/
- if (IsAffixFlagInUse(Conf, StopMiddle->affix, affixflag))
+ if (IsAffixFlagInUse(dict->flagMode,
+ DictAffixDataGet(dict, StopMiddle->affix),
+ affixflag))
return 1;
}
- node = StopMiddle->node;
+ /* Retrieve SPNode by the offset */
+ node = (SPNode *) DictNodeGet(DictDictNodes(dict),
+ StopMiddle->node_offset);
ptr++;
break;
}
@@ -657,7 +887,8 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
/*
* Adds a new affix rule to the Affix field.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary; also used to
+ * allocate temporary data.
* flag: affix flag ('\' in the below example).
* flagflags: set of flags from the flagval field for this affix rule. This set
* is listed after '/' character in the added string (repl).
@@ -673,26 +904,54 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
* type: FF_SUFFIX or FF_PREFIX.
*/
static void
-NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
- const char *find, const char *repl, int type)
+NIAddAffix(IspellDictBuild *ConfBuild, const char *flag, char flagflags,
+ const char *mask, const char *find, const char *repl, int type)
{
AFFIX *Affix;
+ uint32 size;
+ uint32 flaglen = strlen(flag),
+ findlen = strlen(find),
+ repllen = strlen(repl),
+ masklen = strlen(mask);
+
+ /* Sanity checks */
+ if (flaglen > AF_FLAG_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix flag \"%s\" too long", flag)));
+ if (findlen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix find field \"%s\" too long", find)));
+ if (repllen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix repl field \"%s\" too long", repl)));
+ if (masklen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix mask field \"%s\" too long", mask)));
- if (Conf->naffixes >= Conf->maffixes)
+ if (ConfBuild->nAffix >= ConfBuild->mAffix)
{
- if (Conf->maffixes)
+ if (ConfBuild->mAffix)
{
- Conf->maffixes *= 2;
- Conf->Affix = (AFFIX *) repalloc((void *) Conf->Affix, Conf->maffixes * sizeof(AFFIX));
+ ConfBuild->mAffix *= 2;
+ ConfBuild->Affix = (AFFIX **) repalloc(ConfBuild->Affix,
+ ConfBuild->mAffix * sizeof(AFFIX *));
}
else
{
- Conf->maffixes = 16;
- Conf->Affix = (AFFIX *) palloc(Conf->maffixes * sizeof(AFFIX));
+ ConfBuild->mAffix = 255;
+ ConfBuild->Affix = (AFFIX **) tmpalloc(ConfBuild->mAffix * sizeof(AFFIX *));
}
}
- Affix = Conf->Affix + Conf->naffixes;
+ size = AFFIXHDRSZ + flaglen + 1 /* \0 */ + findlen + 1 /* \0 */ +
+ repllen + 1 /* \0 */ + masklen + 1 /* \0 */;
+
+ Affix = (AFFIX *) tmpalloc(size);
+ ConfBuild->Affix[ConfBuild->nAffix] = Affix;
/* This affix rule can be applied for words with any ending */
if (strcmp(mask, ".") == 0 || *mask == '\0')
@@ -705,42 +964,12 @@ NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
{
Affix->issimple = 0;
Affix->isregis = 1;
- RS_compile(&(Affix->reg.regis), (type == FF_SUFFIX),
- *mask ? mask : VoidString);
}
/* This affix rule will use regex_t to search word ending */
else
{
- int masklen;
- int wmasklen;
- int err;
- pg_wchar *wmask;
- char *tmask;
-
Affix->issimple = 0;
Affix->isregis = 0;
- tmask = (char *) tmpalloc(strlen(mask) + 3);
- if (type == FF_SUFFIX)
- sprintf(tmask, "%s$", mask);
- else
- sprintf(tmask, "^%s", mask);
-
- masklen = strlen(tmask);
- wmask = (pg_wchar *) tmpalloc((masklen + 1) * sizeof(pg_wchar));
- wmasklen = pg_mb2wchar_with_len(tmask, wmask, masklen);
-
- err = pg_regcomp(&(Affix->reg.regex), wmask, wmasklen,
- REG_ADVANCED | REG_NOSUB,
- DEFAULT_COLLATION_OID);
- if (err)
- {
- char errstr[100];
-
- pg_regerror(err, &(Affix->reg.regex), errstr, sizeof(errstr));
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
- errmsg("invalid regular expression: %s", errstr)));
- }
}
Affix->flagflags = flagflags;
@@ -749,15 +978,22 @@ NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
if ((Affix->flagflags & FF_COMPOUNDFLAG) == 0)
Affix->flagflags |= FF_COMPOUNDFLAG;
}
- Affix->flag = cpstrdup(Conf, flag);
+
Affix->type = type;
- Affix->find = (find && *find) ? cpstrdup(Conf, find) : VoidString;
- if ((Affix->replen = strlen(repl)) > 0)
- Affix->repl = cpstrdup(Conf, repl);
- else
- Affix->repl = VoidString;
- Conf->naffixes++;
+ Affix->replen = repllen;
+ StrNCpy(AffixFieldRepl(Affix), repl, repllen + 1);
+
+ Affix->findlen = findlen;
+ StrNCpy(AffixFieldFind(Affix), find, findlen + 1);
+
+ Affix->masklen = masklen;
+ StrNCpy(AffixFieldMask(Affix), mask, masklen + 1);
+
+ StrNCpy(AffixFieldFlag(Affix), flag, flaglen + 1);
+
+ ConfBuild->nAffix++;
+ ConfBuild->AffixSize += size;
}
/* Parsing states for parse_affentry() and friends */
@@ -1021,10 +1257,10 @@ parse_affentry(char *str, char *mask, char *find, char *repl)
* Sets a Hunspell options depending on flag type.
*/
static void
-setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
+setCompoundAffixFlagValue(IspellDictBuild *ConfBuild, CompoundAffixFlag *entry,
char *s, uint32 val)
{
- if (Conf->flagMode == FM_NUM)
+ if (ConfBuild->dict->flagMode == FM_NUM)
{
char *next;
int i;
@@ -1044,19 +1280,19 @@ setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
else
entry->flag.s = tmpstrdup(s);
- entry->flagMode = Conf->flagMode;
+ entry->flagMode = ConfBuild->dict->flagMode;
entry->value = val;
}
/*
* Sets up a correspondence for the affix parameter with the affix flag.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* s: affix flag in string.
* val: affix parameter.
*/
static void
-addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
+addCompoundAffixFlagValue(IspellDictBuild *ConfBuild, char *s, uint32 val)
{
CompoundAffixFlag *newValue;
char sbuf[BUFSIZ];
@@ -1083,29 +1319,29 @@ addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
*sflag = '\0';
/* Resize array or allocate memory for array CompoundAffixFlag */
- if (Conf->nCompoundAffixFlag >= Conf->mCompoundAffixFlag)
+ if (ConfBuild->nCompoundAffixFlag >= ConfBuild->mCompoundAffixFlag)
{
- if (Conf->mCompoundAffixFlag)
+ if (ConfBuild->mCompoundAffixFlag)
{
- Conf->mCompoundAffixFlag *= 2;
- Conf->CompoundAffixFlags = (CompoundAffixFlag *)
- repalloc((void *) Conf->CompoundAffixFlags,
- Conf->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
+ ConfBuild->mCompoundAffixFlag *= 2;
+ ConfBuild->CompoundAffixFlags = (CompoundAffixFlag *)
+ repalloc((void *) ConfBuild->CompoundAffixFlags,
+ ConfBuild->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
}
else
{
- Conf->mCompoundAffixFlag = 10;
- Conf->CompoundAffixFlags = (CompoundAffixFlag *)
- tmpalloc(Conf->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
+ ConfBuild->mCompoundAffixFlag = 10;
+ ConfBuild->CompoundAffixFlags = (CompoundAffixFlag *)
+ tmpalloc(ConfBuild->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
}
}
- newValue = Conf->CompoundAffixFlags + Conf->nCompoundAffixFlag;
+ newValue = ConfBuild->CompoundAffixFlags + ConfBuild->nCompoundAffixFlag;
- setCompoundAffixFlagValue(Conf, newValue, sbuf, val);
+ setCompoundAffixFlagValue(ConfBuild, newValue, sbuf, val);
- Conf->usecompound = true;
- Conf->nCompoundAffixFlag++;
+ ConfBuild->dict->usecompound = true;
+ ConfBuild->nCompoundAffixFlag++;
}
/*
@@ -1113,7 +1349,7 @@ addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
* flags s.
*/
static int
-getCompoundAffixFlagValue(IspellDict *Conf, char *s)
+getCompoundAffixFlagValue(IspellDictBuild *ConfBuild, char *s)
{
uint32 flag = 0;
CompoundAffixFlag *found,
@@ -1121,18 +1357,18 @@ getCompoundAffixFlagValue(IspellDict *Conf, char *s)
char sflag[BUFSIZ];
char *flagcur;
- if (Conf->nCompoundAffixFlag == 0)
+ if (ConfBuild->nCompoundAffixFlag == 0)
return 0;
flagcur = s;
while (*flagcur)
{
- getNextFlagFromString(Conf, &flagcur, sflag);
- setCompoundAffixFlagValue(Conf, &key, sflag, 0);
+ getNextFlagFromString(ConfBuild->dict->flagMode, &flagcur, sflag);
+ setCompoundAffixFlagValue(ConfBuild, &key, sflag, 0);
found = (CompoundAffixFlag *)
- bsearch(&key, (void *) Conf->CompoundAffixFlags,
- Conf->nCompoundAffixFlag, sizeof(CompoundAffixFlag),
+ bsearch(&key, (void *) ConfBuild->CompoundAffixFlags,
+ ConfBuild->nCompoundAffixFlag, sizeof(CompoundAffixFlag),
cmpcmdflag);
if (found != NULL)
flag |= found->value;
@@ -1144,14 +1380,13 @@ getCompoundAffixFlagValue(IspellDict *Conf, char *s)
/*
* Returns a flag set using the s parameter.
*
- * If Conf->useFlagAliases is true then the s parameter is index of the
- * Conf->AffixData array and function returns its entry.
- * Else function returns the s parameter.
+ * If useFlagAliases is true, the s parameter is an index into the AffixData
+ * array and the function returns that entry. Otherwise it returns s itself.
*/
static char *
-getAffixFlagSet(IspellDict *Conf, char *s)
+getAffixFlagSet(IspellDictBuild *ConfBuild, char *s)
{
- if (Conf->useFlagAliases && *s != '\0')
+ if (ConfBuild->dict->useFlagAliases && *s != '\0')
{
int curaffix;
char *end;
@@ -1162,13 +1397,13 @@ getAffixFlagSet(IspellDict *Conf, char *s)
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix alias \"%s\"", s)));
- if (curaffix > 0 && curaffix <= Conf->nAffixData)
+ if (curaffix > 0 && curaffix <= ConfBuild->nAffixData)
/*
* Do not subtract 1 from curaffix because empty string was added
* in NIImportOOAffixes
*/
- return Conf->AffixData[curaffix];
+ return AffixDataGet(ConfBuild, curaffix);
else
return VoidString;
}
@@ -1179,11 +1414,11 @@ getAffixFlagSet(IspellDict *Conf, char *s)
/*
* Import an affix file that follows MySpell or Hunspell format.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* filename: path to the .affix file.
*/
static void
-NIImportOOAffixes(IspellDict *Conf, const char *filename)
+NIImportOOAffixes(IspellDictBuild *ConfBuild, const char *filename)
{
char type[BUFSIZ],
*ptype = NULL;
@@ -1203,9 +1438,9 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
char *recoded;
/* read file to find any flag */
- Conf->usecompound = false;
- Conf->useFlagAliases = false;
- Conf->flagMode = FM_CHAR;
+ ConfBuild->dict->usecompound = false;
+ ConfBuild->dict->useFlagAliases = false;
+ ConfBuild->dict->flagMode = FM_CHAR;
if (!tsearch_readline_begin(&trst, filename))
ereport(ERROR,
@@ -1222,30 +1457,36 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
}
if (STRNCMP(recoded, "COMPOUNDFLAG") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDFLAG"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDFLAG"),
FF_COMPOUNDFLAG);
else if (STRNCMP(recoded, "COMPOUNDBEGIN") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDBEGIN"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDBEGIN"),
FF_COMPOUNDBEGIN);
else if (STRNCMP(recoded, "COMPOUNDLAST") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDLAST"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDLAST"),
FF_COMPOUNDLAST);
/* COMPOUNDLAST and COMPOUNDEND are synonyms */
else if (STRNCMP(recoded, "COMPOUNDEND") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDEND"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDEND"),
FF_COMPOUNDLAST);
else if (STRNCMP(recoded, "COMPOUNDMIDDLE") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDMIDDLE"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDMIDDLE"),
FF_COMPOUNDMIDDLE);
else if (STRNCMP(recoded, "ONLYINCOMPOUND") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("ONLYINCOMPOUND"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("ONLYINCOMPOUND"),
FF_COMPOUNDONLY);
else if (STRNCMP(recoded, "COMPOUNDPERMITFLAG") == 0)
- addCompoundAffixFlagValue(Conf,
+ addCompoundAffixFlagValue(ConfBuild,
recoded + strlen("COMPOUNDPERMITFLAG"),
FF_COMPOUNDPERMITFLAG);
else if (STRNCMP(recoded, "COMPOUNDFORBIDFLAG") == 0)
- addCompoundAffixFlagValue(Conf,
+ addCompoundAffixFlagValue(ConfBuild,
recoded + strlen("COMPOUNDFORBIDFLAG"),
FF_COMPOUNDFORBIDFLAG);
else if (STRNCMP(recoded, "FLAG") == 0)
@@ -1258,9 +1499,9 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
if (*s)
{
if (STRNCMP(s, "long") == 0)
- Conf->flagMode = FM_LONG;
+ ConfBuild->dict->flagMode = FM_LONG;
else if (STRNCMP(s, "num") == 0)
- Conf->flagMode = FM_NUM;
+ ConfBuild->dict->flagMode = FM_NUM;
else if (STRNCMP(s, "default") != 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
@@ -1274,8 +1515,8 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
}
tsearch_readline_end(&trst);
- if (Conf->nCompoundAffixFlag > 1)
- qsort((void *) Conf->CompoundAffixFlags, Conf->nCompoundAffixFlag,
+ if (ConfBuild->nCompoundAffixFlag > 1)
+ qsort((void *) ConfBuild->CompoundAffixFlags, ConfBuild->nCompoundAffixFlag,
sizeof(CompoundAffixFlag), cmpcmdflag);
if (!tsearch_readline_begin(&trst, filename))
@@ -1295,15 +1536,15 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
if (ptype)
pfree(ptype);
- ptype = lowerstr_ctx(Conf, type);
+ ptype = lowerstr_ctx(ConfBuild, type);
/* First try to parse AF parameter (alias compression) */
if (STRNCMP(ptype, "af") == 0)
{
/* First line is the number of aliases */
- if (!Conf->useFlagAliases)
+ if (!ConfBuild->dict->useFlagAliases)
{
- Conf->useFlagAliases = true;
+ ConfBuild->dict->useFlagAliases = true;
naffix = atoi(sflag);
if (naffix <= 0)
ereport(ERROR,
@@ -1313,11 +1554,10 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
/* Also reserve place for empty flag set */
naffix++;
- Conf->AffixData = (char **) palloc0(naffix * sizeof(char *));
- Conf->lenAffixData = Conf->nAffixData = naffix;
+ InitAffixData(ConfBuild, naffix);
/* Add empty flag set into AffixData */
- Conf->AffixData[curaffix] = VoidString;
+ AddAffixSet(ConfBuild, VoidString, 0);
curaffix++;
}
/* Other lines are aliases */
@@ -1325,7 +1565,7 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
{
if (curaffix < naffix)
{
- Conf->AffixData[curaffix] = cpstrdup(Conf, sflag);
+ AddAffixSet(ConfBuild, sflag, strlen(sflag));
curaffix++;
}
else
@@ -1343,8 +1583,8 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
sflaglen = strlen(sflag);
if (sflaglen == 0
- || (sflaglen > 1 && Conf->flagMode == FM_CHAR)
- || (sflaglen > 2 && Conf->flagMode == FM_LONG))
+ || (sflaglen > 1 && ConfBuild->dict->flagMode == FM_CHAR)
+ || (sflaglen > 2 && ConfBuild->dict->flagMode == FM_LONG))
goto nextline;
/*--------
@@ -1372,21 +1612,21 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
/* Get flags after '/' (flags are case sensitive) */
if ((ptr = strchr(repl, '/')) != NULL)
- aflg |= getCompoundAffixFlagValue(Conf,
- getAffixFlagSet(Conf,
+ aflg |= getCompoundAffixFlagValue(ConfBuild,
+ getAffixFlagSet(ConfBuild,
ptr + 1));
/* Get lowercased version of string before '/' */
- prepl = lowerstr_ctx(Conf, repl);
+ prepl = lowerstr_ctx(ConfBuild, repl);
if ((ptr = strchr(prepl, '/')) != NULL)
*ptr = '\0';
- pfind = lowerstr_ctx(Conf, find);
- pmask = lowerstr_ctx(Conf, mask);
+ pfind = lowerstr_ctx(ConfBuild, find);
+ pmask = lowerstr_ctx(ConfBuild, mask);
if (t_iseq(find, '0'))
*pfind = '\0';
if (t_iseq(repl, '0'))
*prepl = '\0';
- NIAddAffix(Conf, sflag, flagflags | aflg, pmask, pfind, prepl,
+ NIAddAffix(ConfBuild, sflag, flagflags | aflg, pmask, pfind, prepl,
isSuffix ? FF_SUFFIX : FF_PREFIX);
pfree(prepl);
pfree(pfind);
@@ -1412,7 +1652,7 @@ nextline:
* work to NIImportOOAffixes(), which will re-read the whole file.
*/
void
-NIImportAffixes(IspellDict *Conf, const char *filename)
+NIImportAffixes(IspellDictBuild *ConfBuild, const char *filename)
{
char *pstr = NULL;
char flag[BUFSIZ];
@@ -1433,9 +1673,9 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
errmsg("could not open affix file \"%s\": %m",
filename)));
- Conf->usecompound = false;
- Conf->useFlagAliases = false;
- Conf->flagMode = FM_CHAR;
+ ConfBuild->dict->usecompound = false;
+ ConfBuild->dict->useFlagAliases = false;
+ ConfBuild->dict->flagMode = FM_CHAR;
while ((recoded = tsearch_readline(&trst)) != NULL)
{
@@ -1457,10 +1697,8 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
s += pg_mblen(s);
if (*s && pg_mblen(s) == 1)
- {
- addCompoundAffixFlagValue(Conf, s, FF_COMPOUNDFLAG);
- Conf->usecompound = true;
- }
+ addCompoundAffixFlagValue(ConfBuild, s, FF_COMPOUNDFLAG);
+
oldformat = true;
goto nextline;
}
@@ -1533,7 +1771,8 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
if (!parse_affentry(pstr, mask, find, repl))
goto nextline;
- NIAddAffix(Conf, flag, flagflags, mask, find, repl, suffixes ? FF_SUFFIX : FF_PREFIX);
+ NIAddAffix(ConfBuild, flag, flagflags, mask, find, repl,
+ suffixes ? FF_SUFFIX : FF_PREFIX);
nextline:
pfree(recoded);
@@ -1552,53 +1791,48 @@ isnewformat:
errmsg("affix file contains both old-style and new-style commands")));
tsearch_readline_end(&trst);
- NIImportOOAffixes(Conf, filename);
+ NIImportOOAffixes(ConfBuild, filename);
}
/*
* Merges two affix flag sets and stores a new affix flag set into
- * Conf->AffixData.
+ * ConfBuild->AffixData.
*
* Returns index of a new affix flag set.
*/
static int
-MergeAffix(IspellDict *Conf, int a1, int a2)
+MergeAffix(IspellDictBuild *ConfBuild, int a1, int a2)
{
- char **ptr;
+ char *ptr;
+ uint32 len;
/* Do not merge affix flags if one of affix flags is empty */
- if (*Conf->AffixData[a1] == '\0')
+ if (*AffixDataGet(ConfBuild, a1) == '\0')
return a2;
- else if (*Conf->AffixData[a2] == '\0')
+ else if (*AffixDataGet(ConfBuild, a2) == '\0')
return a1;
- while (Conf->nAffixData + 1 >= Conf->lenAffixData)
- {
- Conf->lenAffixData *= 2;
- Conf->AffixData = (char **) repalloc(Conf->AffixData,
- sizeof(char *) * Conf->lenAffixData);
- }
-
- ptr = Conf->AffixData + Conf->nAffixData;
- if (Conf->flagMode == FM_NUM)
+ if (ConfBuild->dict->flagMode == FM_NUM)
{
- *ptr = cpalloc(strlen(Conf->AffixData[a1]) +
- strlen(Conf->AffixData[a2]) +
- 1 /* comma */ + 1 /* \0 */ );
- sprintf(*ptr, "%s,%s", Conf->AffixData[a1], Conf->AffixData[a2]);
+ len = strlen(AffixDataGet(ConfBuild, a1)) + 1 /* comma */ +
+ strlen(AffixDataGet(ConfBuild, a2));
+ ptr = tmpalloc(len + 1 /* \0 */);
+ sprintf(ptr, "%s,%s", AffixDataGet(ConfBuild, a1),
+ AffixDataGet(ConfBuild, a2));
}
else
{
- *ptr = cpalloc(strlen(Conf->AffixData[a1]) +
- strlen(Conf->AffixData[a2]) +
- 1 /* \0 */ );
- sprintf(*ptr, "%s%s", Conf->AffixData[a1], Conf->AffixData[a2]);
+ len = strlen(AffixDataGet(ConfBuild, a1)) +
+ strlen(AffixDataGet(ConfBuild, a2));
+ ptr = tmpalloc(len + 1 /* \0 */ );
+ sprintf(ptr, "%s%s", AffixDataGet(ConfBuild, a1),
+ AffixDataGet(ConfBuild, a2));
}
- ptr++;
- *ptr = NULL;
- Conf->nAffixData++;
- return Conf->nAffixData - 1;
+ AddAffixSet(ConfBuild, ptr, len);
+ pfree(ptr);
+
+ return ConfBuild->nAffixData - 1;
}
/*
@@ -1606,66 +1840,87 @@ MergeAffix(IspellDict *Conf, int a1, int a2)
* flags with the given index.
*/
static uint32
-makeCompoundFlags(IspellDict *Conf, int affix)
+makeCompoundFlags(IspellDictBuild *ConfBuild, int affix)
{
- char *str = Conf->AffixData[affix];
+ char *str = AffixDataGet(ConfBuild, affix);
- return (getCompoundAffixFlagValue(Conf, str) & FF_COMPOUNDFLAGMASK);
+ return (getCompoundAffixFlagValue(ConfBuild, str) & FF_COMPOUNDFLAGMASK);
}
/*
* Makes a prefix tree for the given level.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* low: lower index of the Conf->Spell array.
* high: upper index of the Conf->Spell array.
* level: current prefix tree level.
+ *
+ * Returns an offset of SPNode in DictNodes.
*/
-static SPNode *
-mkSPNode(IspellDict *Conf, int low, int high, int level)
+static uint32
+mkSPNode(IspellDictBuild *ConfBuild, int low, int high, int level)
{
int i;
int nchar = 0;
char lastchar = '\0';
+ uint32 rs_offset,
+ new_offset;
SPNode *rs;
SPNodeData *data;
+ int data_index = 0;
int lownew = low;
for (i = low; i < high; i++)
- if (Conf->Spell[i]->p.d.len > level && lastchar != Conf->Spell[i]->word[level])
+ if (ConfBuild->Spell[i]->p.d.len > level &&
+ lastchar != ConfBuild->Spell[i]->word[level])
{
nchar++;
- lastchar = Conf->Spell[i]->word[level];
+ lastchar = ConfBuild->Spell[i]->word[level];
}
if (!nchar)
- return NULL;
+ return ISPELL_INVALID_OFFSET;
- rs = (SPNode *) cpalloc0(SPNHDRSZ + nchar * sizeof(SPNodeData));
- rs->length = nchar;
+ rs_offset = AllocateSPNode(ConfBuild, nchar);
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
data = rs->data;
lastchar = '\0';
for (i = low; i < high; i++)
- if (Conf->Spell[i]->p.d.len > level)
+ if (ConfBuild->Spell[i]->p.d.len > level)
{
- if (lastchar != Conf->Spell[i]->word[level])
+ if (lastchar != ConfBuild->Spell[i]->word[level])
{
if (lastchar)
{
/* Next level of the prefix tree */
- data->node = mkSPNode(Conf, lownew, i, level + 1);
+ new_offset = mkSPNode(ConfBuild, lownew, i, level + 1);
+
+ /*
+ * ConfBuild->DictNodes can be repalloc'ed within
+ * mkSPNode(), so reinitialize pointers.
+ */
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
+
+ /* First save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ /* Work with next node */
+ data_index++;
+ Assert(data_index < nchar);
+ data = &(rs->data[data_index]);
+
lownew = i;
- data++;
}
- lastchar = Conf->Spell[i]->word[level];
+ lastchar = ConfBuild->Spell[i]->word[level];
}
- data->val = ((uint8 *) (Conf->Spell[i]->word))[level];
- if (Conf->Spell[i]->p.d.len == level + 1)
+ data->val = ((uint8 *) (ConfBuild->Spell[i]->word))[level];
+ if (ConfBuild->Spell[i]->p.d.len == level + 1)
{
bool clearCompoundOnly = false;
- if (data->isword && data->affix != Conf->Spell[i]->p.d.affix)
+ if (data->isword && data->affix != ConfBuild->Spell[i]->p.d.affix)
{
/*
* MergeAffix called a few times. If one of word is
@@ -1674,15 +1929,17 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
*/
clearCompoundOnly = (FF_COMPOUNDONLY & data->compoundflag
- & makeCompoundFlags(Conf, Conf->Spell[i]->p.d.affix))
+ & makeCompoundFlags(ConfBuild,
+ ConfBuild->Spell[i]->p.d.affix))
? false : true;
- data->affix = MergeAffix(Conf, data->affix, Conf->Spell[i]->p.d.affix);
+ data->affix = MergeAffix(ConfBuild, data->affix,
+ ConfBuild->Spell[i]->p.d.affix);
}
else
- data->affix = Conf->Spell[i]->p.d.affix;
+ data->affix = ConfBuild->Spell[i]->p.d.affix;
data->isword = 1;
- data->compoundflag = makeCompoundFlags(Conf, data->affix);
+ data->compoundflag = makeCompoundFlags(ConfBuild, data->affix);
if ((data->compoundflag & FF_COMPOUNDONLY) &&
(data->compoundflag & FF_COMPOUNDFLAG) == 0)
@@ -1694,9 +1951,19 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
}
/* Next level of the prefix tree */
- data->node = mkSPNode(Conf, lownew, high, level + 1);
+ new_offset = mkSPNode(ConfBuild, lownew, high, level + 1);
- return rs;
+ /*
+ * ConfBuild->DictNodes can be repalloc'ed within mkSPNode(), so
+ * reinitialize pointers.
+ */
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
+
+ /* Save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ return rs_offset;
}
/*
@@ -1704,90 +1971,98 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
* and affixes.
*/
void
-NISortDictionary(IspellDict *Conf)
+NISortDictionary(IspellDictBuild *ConfBuild)
{
int i;
int naffix = 0;
int curaffix;
+ uint32 node_offset;
/* compress affixes */
/*
- * If we use flag aliases then we need to use Conf->AffixData filled in
+ * If we use flag aliases then we need to use ConfBuild->AffixData filled in
* the NIImportOOAffixes().
*/
- if (Conf->useFlagAliases)
+ if (ConfBuild->dict->useFlagAliases)
{
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
char *end;
- if (*Conf->Spell[i]->p.flag != '\0')
+ if (*ConfBuild->Spell[i]->p.flag != '\0')
{
- curaffix = strtol(Conf->Spell[i]->p.flag, &end, 10);
- if (Conf->Spell[i]->p.flag == end || errno == ERANGE)
+ curaffix = strtol(ConfBuild->Spell[i]->p.flag, &end, 10);
+ if (ConfBuild->Spell[i]->p.flag == end || errno == ERANGE)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix alias \"%s\"",
- Conf->Spell[i]->p.flag)));
+ ConfBuild->Spell[i]->p.flag)));
}
else
{
/*
- * If Conf->Spell[i]->p.flag is empty, then get empty value of
- * Conf->AffixData (0 index).
+ * If ConfBuild->Spell[i]->p.flag is empty, then get empty
+ * value of ConfBuild->AffixData (0 index).
*/
curaffix = 0;
}
- Conf->Spell[i]->p.d.affix = curaffix;
- Conf->Spell[i]->p.d.len = strlen(Conf->Spell[i]->word);
+ ConfBuild->Spell[i]->p.d.affix = curaffix;
+ ConfBuild->Spell[i]->p.d.len = strlen(ConfBuild->Spell[i]->word);
}
}
- /* Otherwise fill Conf->AffixData here */
+ /* Otherwise fill ConfBuild->AffixData here */
else
{
/* Count the number of different flags used in the dictionary */
- qsort((void *) Conf->Spell, Conf->nspell, sizeof(SPELL *),
+ qsort((void *) ConfBuild->Spell, ConfBuild->nSpell, sizeof(SPELL *),
cmpspellaffix);
naffix = 0;
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
if (i == 0
- || strcmp(Conf->Spell[i]->p.flag, Conf->Spell[i - 1]->p.flag))
+ || strcmp(ConfBuild->Spell[i]->p.flag,
+ ConfBuild->Spell[i - 1]->p.flag))
naffix++;
}
/*
- * Fill in Conf->AffixData with the affixes that were used in the
- * dictionary. Replace textual flag-field of Conf->Spell entries with
- * indexes into Conf->AffixData array.
+ * Fill in AffixData with the affixes that were used in the
+ * dictionary. Replace textual flag-field of ConfBuild->Spell entries
+ * with indexes into ConfBuild->AffixData array.
*/
- Conf->AffixData = (char **) palloc0(naffix * sizeof(char *));
+ InitAffixData(ConfBuild, naffix);
curaffix = -1;
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
if (i == 0
- || strcmp(Conf->Spell[i]->p.flag, Conf->AffixData[curaffix]))
+ || strcmp(ConfBuild->Spell[i]->p.flag,
+ AffixDataGet(ConfBuild, curaffix)))
{
curaffix++;
Assert(curaffix < naffix);
- Conf->AffixData[curaffix] = cpstrdup(Conf,
- Conf->Spell[i]->p.flag);
+ AddAffixSet(ConfBuild, ConfBuild->Spell[i]->p.flag,
+ strlen(ConfBuild->Spell[i]->p.flag));
}
- Conf->Spell[i]->p.d.affix = curaffix;
- Conf->Spell[i]->p.d.len = strlen(Conf->Spell[i]->word);
+ ConfBuild->Spell[i]->p.d.affix = curaffix;
+ ConfBuild->Spell[i]->p.d.len = strlen(ConfBuild->Spell[i]->word);
}
-
- Conf->lenAffixData = Conf->nAffixData = naffix;
}
/* Start build a prefix tree */
- qsort((void *) Conf->Spell, Conf->nspell, sizeof(SPELL *), cmpspell);
- Conf->Dictionary = mkSPNode(Conf, 0, Conf->nspell, 0);
+ qsort((void *) ConfBuild->Spell, ConfBuild->nSpell, sizeof(SPELL *), cmpspell);
+ node_offset = mkSPNode(ConfBuild, 0, ConfBuild->nSpell, 0);
+
+ /* Create a void root node only if DictNodes is empty */
+ if (node_offset == ISPELL_INVALID_OFFSET)
+ {
+ /* AllocateSPNode() initializes root node data */
+ AllocateSPNode(ConfBuild, 1);
+ }
}
/*
@@ -1795,83 +2070,104 @@ NISortDictionary(IspellDict *Conf)
* rule. Affixes with empty replace string do not include in the prefix tree.
* This affixes are included by mkVoidAffix().
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* low: lower index of the Conf->Affix array.
* high: upper index of the Conf->Affix array.
* level: current prefix tree level.
* type: FF_SUFFIX or FF_PREFIX.
+ *
+ * Returns an offset in nodes array.
*/
-static AffixNode *
-mkANode(IspellDict *Conf, int low, int high, int level, int type)
+static uint32
+mkANode(IspellDictBuild *ConfBuild, int low, int high, int level, int type)
{
int i;
int nchar = 0;
uint8 lastchar = '\0';
+ NodeArray *array;
+ uint32 rs_offset,
+ new_offset;
AffixNode *rs;
AffixNodeData *data;
+ int data_index = 0;
int lownew = low;
- int naff;
- AFFIX **aff;
for (i = low; i < high; i++)
- if (Conf->Affix[i].replen > level && lastchar != GETCHAR(Conf->Affix + i, level, type))
+ if (ConfBuild->Affix[i]->replen > level &&
+ lastchar != GETCHAR(ConfBuild->Affix[i], level, type))
{
nchar++;
- lastchar = GETCHAR(Conf->Affix + i, level, type);
+ lastchar = GETCHAR(ConfBuild->Affix[i], level, type);
}
if (!nchar)
- return NULL;
+ return ISPELL_INVALID_OFFSET;
- aff = (AFFIX **) tmpalloc(sizeof(AFFIX *) * (high - low + 1));
- naff = 0;
+ if (type == FF_SUFFIX)
+ array = &ConfBuild->SuffixNodes;
+ else
+ array = &ConfBuild->PrefixNodes;
- rs = (AffixNode *) cpalloc0(ANHRDSZ + nchar * sizeof(AffixNodeData));
- rs->length = nchar;
- data = rs->data;
+ rs_offset = AllocateAffixNode(ConfBuild, array, nchar);
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
+ data = (AffixNodeData *) rs->data;
lastchar = '\0';
for (i = low; i < high; i++)
- if (Conf->Affix[i].replen > level)
+ if (ConfBuild->Affix[i]->replen > level)
{
- if (lastchar != GETCHAR(Conf->Affix + i, level, type))
+ if (lastchar != GETCHAR(ConfBuild->Affix[i], level, type))
{
if (lastchar)
{
/* Next level of the prefix tree */
- data->node = mkANode(Conf, lownew, i, level + 1, type);
- if (naff)
- {
- data->naff = naff;
- data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * naff);
- memcpy(data->aff, aff, sizeof(AFFIX *) * naff);
- naff = 0;
- }
- data++;
+ new_offset = mkANode(ConfBuild, lownew, i, level + 1, type);
+
+ /*
+ * array can be repalloc'ed within mkANode(), so
+ * reinitialize pointers.
+ */
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
+
+ /* First save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ /* Handle next data node */
+ data_index++;
+ Assert(data_index < nchar);
+ data = &(rs->data[data_index]);
+
lownew = i;
}
- lastchar = GETCHAR(Conf->Affix + i, level, type);
+ lastchar = GETCHAR(ConfBuild->Affix[i], level, type);
}
- data->val = GETCHAR(Conf->Affix + i, level, type);
- if (Conf->Affix[i].replen == level + 1)
+ data->val = GETCHAR(ConfBuild->Affix[i], level, type);
+ if (ConfBuild->Affix[i]->replen == level + 1)
{ /* affix stopped */
- aff[naff++] = Conf->Affix + i;
+ if (data->affstart == ISPELL_INVALID_INDEX)
+ {
+ data->affstart = i;
+ data->affend = i;
+ }
+ else
+ data->affend = i;
}
}
/* Next level of the prefix tree */
- data->node = mkANode(Conf, lownew, high, level + 1, type);
- if (naff)
- {
- data->naff = naff;
- data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * naff);
- memcpy(data->aff, aff, sizeof(AFFIX *) * naff);
- naff = 0;
- }
+ new_offset = mkANode(ConfBuild, lownew, high, level + 1, type);
+
+ /*
+ * array can be repalloc'ed within mkANode(), so reinitialize pointers.
+ */
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
- pfree(aff);
+ /* Save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
- return rs;
+ return rs_offset;
}
/*
@@ -1879,139 +2175,151 @@ mkANode(IspellDict *Conf, int low, int high, int level, int type)
* for affixes which have empty replace string ("repl" field).
*/
static void
-mkVoidAffix(IspellDict *Conf, bool issuffix, int startsuffix)
+mkVoidAffix(IspellDictBuild *ConfBuild, bool issuffix, int startsuffix)
{
- int i,
- cnt = 0;
+ int i;
int start = (issuffix) ? startsuffix : 0;
- int end = (issuffix) ? Conf->naffixes : startsuffix;
- AffixNode *Affix = (AffixNode *) palloc0(ANHRDSZ + sizeof(AffixNodeData));
-
- Affix->length = 1;
- Affix->isvoid = 1;
+ int end = (issuffix) ? ConfBuild->nAffix : startsuffix;
+ uint32 node_offset;
+ NodeArray *array;
+ AffixNode *Affix;
+ AffixNodeData *AffixData;
if (issuffix)
- {
- Affix->data->node = Conf->Suffix;
- Conf->Suffix = Affix;
- }
+ array = &ConfBuild->SuffixNodes;
else
- {
- Affix->data->node = Conf->Prefix;
- Conf->Prefix = Affix;
- }
+ array = &ConfBuild->PrefixNodes;
- /* Count affixes with empty replace string */
- for (i = start; i < end; i++)
- if (Conf->Affix[i].replen == 0)
- cnt++;
-
- /* There is not affixes with empty replace string */
- if (cnt == 0)
- return;
+ node_offset = AllocateAffixNode(ConfBuild, array, 1);
+ Affix = (AffixNode *) NodeArrayGet(array, node_offset);
- Affix->data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * cnt);
- Affix->data->naff = (uint32) cnt;
+ Affix->isvoid = 1;
+ AffixData = (AffixNodeData *) Affix->data;
- cnt = 0;
for (i = start; i < end; i++)
- if (Conf->Affix[i].replen == 0)
+ if (ConfBuild->Affix[i]->replen == 0)
{
- Affix->data->aff[cnt] = Conf->Affix + i;
- cnt++;
+ if (AffixData->affstart == ISPELL_INVALID_INDEX)
+ {
+ AffixData->affstart = i;
+ AffixData->affend = i;
+ }
+ else
+ AffixData->affend = i;
}
}
/*
- * Checks if the affixflag is used by dictionary. Conf->AffixData does not
+ * Checks if the affixflag is used by dictionary. AffixData does not
* contain affixflag if this flag is not used actually by the .dict file.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* affixflag: affix flag.
*
- * Returns true if the Conf->AffixData array contains affixflag, otherwise
+ * Returns true if the ConfBuild->AffixData array contains affixflag, otherwise
* returns false.
*/
static bool
-isAffixInUse(IspellDict *Conf, char *affixflag)
+isAffixInUse(IspellDictBuild *ConfBuild, char *affixflag)
{
int i;
- for (i = 0; i < Conf->nAffixData; i++)
- if (IsAffixFlagInUse(Conf, i, affixflag))
+ for (i = 0; i < ConfBuild->nAffixData; i++)
+ if (IsAffixFlagInUse(ConfBuild->dict->flagMode,
+ AffixDataGet(ConfBuild, i), affixflag))
return true;
return false;
}
/*
- * Builds Conf->Prefix and Conf->Suffix trees from the imported affixes.
+ * Builds Prefix and Suffix trees from the imported affixes.
*/
void
-NISortAffixes(IspellDict *Conf)
+NISortAffixes(IspellDictBuild *ConfBuild)
{
AFFIX *Affix;
+ AffixNode *voidPrefix,
+ *voidSuffix;
size_t i;
CMPDAffix *ptr;
- int firstsuffix = Conf->naffixes;
-
- if (Conf->naffixes == 0)
- return;
+ int firstsuffix = ConfBuild->nAffix;
+ uint32 prefix_offset,
+ suffix_offset;
/* Store compound affixes in the Conf->CompoundAffix array */
- if (Conf->naffixes > 1)
- qsort((void *) Conf->Affix, Conf->naffixes, sizeof(AFFIX), cmpaffix);
- Conf->CompoundAffix = ptr = (CMPDAffix *) palloc(sizeof(CMPDAffix) * Conf->naffixes);
- ptr->affix = NULL;
-
- for (i = 0; i < Conf->naffixes; i++)
+ if (ConfBuild->nAffix > 1)
+ qsort((void *) ConfBuild->Affix, ConfBuild->nAffix,
+ sizeof(AFFIX *), cmpaffix);
+ ConfBuild->nCompoundAffix = ConfBuild->nAffix + 1 /* terminating entry */;
+ ConfBuild->CompoundAffix = ptr =
+ (CMPDAffix *) tmpalloc(sizeof(CMPDAffix) * ConfBuild->nCompoundAffix);
+ ptr->affix = ISPELL_INVALID_INDEX;
+
+ for (i = 0; i < ConfBuild->nAffix; i++)
{
- Affix = &(((AFFIX *) Conf->Affix)[i]);
+ Affix = ConfBuild->Affix[i];
if (Affix->type == FF_SUFFIX && i < firstsuffix)
firstsuffix = i;
if ((Affix->flagflags & FF_COMPOUNDFLAG) && Affix->replen > 0 &&
- isAffixInUse(Conf, Affix->flag))
+ isAffixInUse(ConfBuild, AffixFieldFlag(Affix)))
{
bool issuffix = (Affix->type == FF_SUFFIX);
- if (ptr == Conf->CompoundAffix ||
+ if (ptr == ConfBuild->CompoundAffix ||
issuffix != (ptr - 1)->issuffix ||
- strbncmp((const unsigned char *) (ptr - 1)->affix,
- (const unsigned char *) Affix->repl,
+ strbncmp((const unsigned char *) AffixFieldRepl(ConfBuild->Affix[(ptr - 1)->affix]),
+ (const unsigned char *) AffixFieldRepl(Affix),
(ptr - 1)->len))
{
/* leave only unique and minimals suffixes */
- ptr->affix = Affix->repl;
+ ptr->affix = i;
ptr->len = Affix->replen;
ptr->issuffix = issuffix;
ptr++;
}
}
}
- ptr->affix = NULL;
- Conf->CompoundAffix = (CMPDAffix *) repalloc(Conf->CompoundAffix, sizeof(CMPDAffix) * (ptr - Conf->CompoundAffix + 1));
+ ptr->affix = ISPELL_INVALID_INDEX;
+ ConfBuild->nCompoundAffix = ptr - ConfBuild->CompoundAffix + 1;
/* Start build a prefix tree */
- Conf->Prefix = mkANode(Conf, 0, firstsuffix, 0, FF_PREFIX);
- Conf->Suffix = mkANode(Conf, firstsuffix, Conf->naffixes, 0, FF_SUFFIX);
- mkVoidAffix(Conf, true, firstsuffix);
- mkVoidAffix(Conf, false, firstsuffix);
+ mkVoidAffix(ConfBuild, true, firstsuffix);
+ mkVoidAffix(ConfBuild, false, firstsuffix);
+
+ prefix_offset = mkANode(ConfBuild, 0, firstsuffix, 0, FF_PREFIX);
+ suffix_offset = mkANode(ConfBuild, firstsuffix, ConfBuild->nAffix, 0,
+ FF_SUFFIX);
+
+ /* Store the offsets of the new root nodes in the void affix nodes */
+ voidPrefix = (AffixNode *) NodeArrayGet(&ConfBuild->PrefixNodes, 0);
+ voidPrefix->data[0].node_offset = prefix_offset;
+
+ voidSuffix = (AffixNode *) NodeArrayGet(&ConfBuild->SuffixNodes, 0);
+ voidSuffix->data[0].node_offset = suffix_offset;
}
static AffixNodeData *
-FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
+FindAffixes(IspellDictData *dict, AffixNode *node, const char *word, int wrdlen,
+ int *level, int type)
{
+ AffixNode *nodes;
AffixNodeData *StopLow,
*StopHigh,
*StopMiddle;
uint8 symbol;
+ if (type == FF_PREFIX)
+ nodes = (AffixNode *) DictPrefixNodes(dict);
+ else
+ nodes = (AffixNode *) DictSuffixNodes(dict);
+
if (node->isvoid)
{ /* search void affixes */
- if (node->data->naff)
+ if (node->data->affstart != ISPELL_INVALID_INDEX)
return node->data;
- node = node->data->node;
+ node = (AffixNode *) DictNodeGet(nodes, node->data->node_offset);
}
while (node && *level < wrdlen)
@@ -2026,9 +2334,10 @@ FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
if (StopMiddle->val == symbol)
{
(*level)++;
- if (StopMiddle->naff)
+ if (StopMiddle->affstart != ISPELL_INVALID_INDEX)
return StopMiddle;
- node = StopMiddle->node;
+ node = (AffixNode *) DictNodeGet(nodes,
+ StopMiddle->node_offset);
break;
}
else if (StopMiddle->val < symbol)
@@ -2042,8 +2351,67 @@ FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
return NULL;
}
+/*
+ * Compile regular expression on first use and store it within reg.
+ */
+static void
+CompileAffixReg(AffixReg *reg, bool isregis, int type,
+ const char *mask, int masklen, MemoryContext dictCtx)
+{
+ MemoryContext oldcontext;
+
+ Assert(dictCtx);
+
+ /*
+ * Switch to memory context of the dictionary, so compiled expression can be
+ * used in other queries.
+ */
+ oldcontext = MemoryContextSwitchTo(dictCtx);
+
+ if (isregis)
+ RS_compile(&(reg->r.regis), (type == FF_SUFFIX), mask);
+ else
+ {
+ int wmasklen;
+ int err;
+ pg_wchar *wmask;
+ char *tmask;
+
+ tmask = (char *) palloc(masklen + 3);
+ if (type == FF_SUFFIX)
+ sprintf(tmask, "%s$", mask);
+ else
+ sprintf(tmask, "^%s", mask);
+
+ masklen = strlen(tmask);
+ wmask = (pg_wchar *) palloc((masklen + 1) * sizeof(pg_wchar));
+ wmasklen = pg_mb2wchar_with_len(tmask, wmask, masklen);
+
+ err = pg_regcomp(&(reg->r.regex), wmask, wmasklen,
+ REG_ADVANCED | REG_NOSUB,
+ DEFAULT_COLLATION_OID);
+ if (err)
+ {
+ char errstr[100];
+
+ pg_regerror(err, &(reg->r.regex), errstr, sizeof(errstr));
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
+ errmsg("invalid regular expression: %s", errstr)));
+ }
+
+ pfree(wmask);
+ pfree(tmask);
+ }
+
+ reg->iscompiled = true;
+
+ MemoryContextSwitchTo(oldcontext);
+}
+
static char *
-CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *newword, int *baselen)
+CheckAffix(const char *word, size_t len, AFFIX *Affix, AffixReg *reg,
+ int flagflags, char *newword, int *baselen, MemoryContext dictCtx)
{
/*
* Check compound allow flags
@@ -2083,7 +2451,7 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
if (Affix->type == FF_SUFFIX)
{
strcpy(newword, word);
- strcpy(newword + len - Affix->replen, Affix->find);
+ strcpy(newword + len - Affix->replen, AffixFieldFind(Affix));
if (baselen) /* store length of non-changed part of word */
*baselen = len - Affix->replen;
}
@@ -2093,9 +2461,9 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
* if prefix is an all non-changed part's length then all word
* contains only prefix and suffix, so out
*/
- if (baselen && *baselen + strlen(Affix->find) <= Affix->replen)
+ if (baselen && *baselen + Affix->findlen <= Affix->replen)
return NULL;
- strcpy(newword, Affix->find);
+ strcpy(newword, AffixFieldFind(Affix));
strcat(newword, word + Affix->replen);
}
@@ -2106,7 +2474,12 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
return newword;
else if (Affix->isregis)
{
- if (RS_execute(&(Affix->reg.regis), newword))
+ /* Compile the regular expression on first demand */
+ if (!reg->iscompiled)
+ CompileAffixReg(reg, Affix->isregis, Affix->type,
+ AffixFieldMask(Affix), Affix->masklen, dictCtx);
+
+ if (RS_execute(&(reg->r.regis), newword))
return newword;
}
else
@@ -2116,12 +2489,17 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
size_t data_len;
int newword_len;
+ /* Compile the regular expression on first demand */
+ if (!reg->iscompiled)
+ CompileAffixReg(reg, Affix->isregis, Affix->type,
+ AffixFieldMask(Affix), Affix->masklen, dictCtx);
+
/* Convert data string to wide characters */
newword_len = strlen(newword);
data = (pg_wchar *) palloc((newword_len + 1) * sizeof(pg_wchar));
data_len = pg_mb2wchar_with_len(newword, data, newword_len);
- if (!(err = pg_regexec(&(Affix->reg.regex), data, data_len, 0, NULL, 0, NULL, 0)))
+ if (!(err = pg_regexec(&(reg->r.regex), data, data_len, 0, NULL, 0, NULL, 0)))
{
pfree(data);
return newword;
@@ -2160,7 +2538,7 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
char **cur;
char newword[2 * MAXNORMLEN] = "";
char pnewword[2 * MAXNORMLEN] = "";
- AffixNode *snode = Conf->Suffix,
+ AffixNode *snode = (AffixNode *) DictSuffixNodes(Conf->dict),
*pnode;
int i,
j;
@@ -2172,7 +2550,7 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
/* Check that the word itself is normal form */
- if (FindWord(Conf, word, VoidString, flag))
+ if (FindWord(Conf->dict, word, VoidString, flag))
{
*cur = pstrdup(word);
cur++;
@@ -2180,23 +2558,29 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
}
/* Find all other NORMAL forms of the 'word' (check only prefix) */
- pnode = Conf->Prefix;
+ pnode = (AffixNode *) DictPrefixNodes(Conf->dict);
plevel = 0;
while (pnode)
{
- prefix = FindAffixes(pnode, word, wrdlen, &plevel, FF_PREFIX);
+ prefix = FindAffixes(Conf->dict, pnode, word, wrdlen, &plevel, FF_PREFIX);
if (!prefix)
break;
- for (j = 0; j < prefix->naff; j++)
+ for (j = prefix->affstart; j <= prefix->affend; j++)
{
- if (CheckAffix(word, wrdlen, prefix->aff[j], flag, newword, NULL))
+ AFFIX *affix = (AFFIX *) DictAffixGet(Conf->dict, j);
+ AffixReg *reg = &(Conf->reg[j]);
+
+ if (affix &&
+ CheckAffix(word, wrdlen, affix, reg, flag, newword, NULL,
+ Conf->dictCtx))
{
/* prefix success */
- if (FindWord(Conf, newword, prefix->aff[j]->flag, flag))
+ if (FindWord(Conf->dict, newword, AffixFieldFlag(affix), flag))
cur += addToResult(forms, cur, newword);
}
}
- pnode = prefix->node;
+ pnode = (AffixNode *) DictNodeGet(DictPrefixNodes(Conf->dict),
+ prefix->node_offset);
}
/*
@@ -2208,45 +2592,59 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
int baselen = 0;
/* find possible suffix */
- suffix = FindAffixes(snode, word, wrdlen, &slevel, FF_SUFFIX);
+ suffix = FindAffixes(Conf->dict, snode, word, wrdlen, &slevel,
+ FF_SUFFIX);
if (!suffix)
break;
/* foreach suffix check affix */
- for (i = 0; i < suffix->naff; i++)
+ for (i = suffix->affstart; i <= suffix->affend; i++)
{
- if (CheckAffix(word, wrdlen, suffix->aff[i], flag, newword, &baselen))
+ AFFIX *sufentry = (AFFIX *) DictAffixGet(Conf->dict, i);
+ AffixReg *sufreg = &(Conf->reg[i]);
+
+ if (sufentry &&
+ CheckAffix(word, wrdlen, sufentry, sufreg, flag, newword, &baselen,
+ Conf->dictCtx))
{
/* suffix success */
- if (FindWord(Conf, newword, suffix->aff[i]->flag, flag))
+ if (FindWord(Conf->dict, newword, AffixFieldFlag(sufentry), flag))
cur += addToResult(forms, cur, newword);
/* now we will look changed word with prefixes */
- pnode = Conf->Prefix;
+ pnode = (AffixNode *) DictPrefixNodes(Conf->dict);
plevel = 0;
swrdlen = strlen(newword);
while (pnode)
{
- prefix = FindAffixes(pnode, newword, swrdlen, &plevel, FF_PREFIX);
+ prefix = FindAffixes(Conf->dict, pnode, newword, swrdlen,
+ &plevel, FF_PREFIX);
if (!prefix)
break;
- for (j = 0; j < prefix->naff; j++)
+ for (j = prefix->affstart; j <= prefix->affend; j++)
{
- if (CheckAffix(newword, swrdlen, prefix->aff[j], flag, pnewword, &baselen))
+ AFFIX *prefentry = (AFFIX *) DictAffixGet(Conf->dict, j);
+ AffixReg *prefreg = &(Conf->reg[j]);
+
+ if (prefentry &&
+ CheckAffix(newword, swrdlen, prefentry, prefreg,
+ flag, pnewword, &baselen, Conf->dictCtx))
{
/* prefix success */
- char *ff = (prefix->aff[j]->flagflags & suffix->aff[i]->flagflags & FF_CROSSPRODUCT) ?
- VoidString : prefix->aff[j]->flag;
+ char *ff = (prefentry->flagflags & sufentry->flagflags & FF_CROSSPRODUCT) ?
+ VoidString : AffixFieldFlag(prefentry);
- if (FindWord(Conf, pnewword, ff, flag))
+ if (FindWord(Conf->dict, pnewword, ff, flag))
cur += addToResult(forms, cur, pnewword);
}
}
- pnode = prefix->node;
+ pnode = (AffixNode *) DictNodeGet(DictPrefixNodes(Conf->dict),
+ prefix->node_offset);
}
}
}
- snode = suffix->node;
+ snode = (AffixNode *) DictNodeGet(DictSuffixNodes(Conf->dict),
+ suffix->node_offset);
}
if (cur == forms)
@@ -2266,7 +2664,8 @@ typedef struct SplitVar
} SplitVar;
static int
-CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
+CheckCompoundAffixes(IspellDictData *dict, CMPDAffix **ptr,
+ char *word, int len, bool CheckInPlace)
{
bool issuffix;
@@ -2276,9 +2675,12 @@ CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
if (CheckInPlace)
{
- while ((*ptr)->affix)
+ while ((*ptr)->affix != ISPELL_INVALID_INDEX)
{
- if (len > (*ptr)->len && strncmp((*ptr)->affix, word, (*ptr)->len) == 0)
+ AFFIX *affix = (AFFIX *) DictAffixGet(dict, (*ptr)->affix);
+
+ if (len > (*ptr)->len &&
+ strncmp(AffixFieldRepl(affix), word, (*ptr)->len) == 0)
{
len = (*ptr)->len;
issuffix = (*ptr)->issuffix;
@@ -2292,9 +2694,12 @@ CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
{
char *affbegin;
- while ((*ptr)->affix)
+ while ((*ptr)->affix != ISPELL_INVALID_INDEX)
{
- if (len > (*ptr)->len && (affbegin = strstr(word, (*ptr)->affix)) != NULL)
+ AFFIX *affix = (AFFIX *) DictAffixGet(dict, (*ptr)->affix);
+
+ if (len > (*ptr)->len &&
+ (affbegin = strstr(word, AffixFieldRepl(affix))) != NULL)
{
len = (*ptr)->len + (affbegin - word);
issuffix = (*ptr)->issuffix;
@@ -2346,13 +2751,14 @@ AddStem(SplitVar *v, char *word)
}
static SplitVar *
-SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int wordlen, int startpos, int minpos)
+SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig,
+ char *word, int wordlen, int startpos, int minpos)
{
SplitVar *var = NULL;
SPNodeData *StopLow,
*StopHigh,
*StopMiddle = NULL;
- SPNode *node = (snode) ? snode : Conf->Dictionary;
+ SPNode *node = (snode) ? snode : (SPNode *) DictDictNodes(Conf->dict);
int level = (snode) ? minpos : startpos; /* recursive
* minpos==level */
int lenaff;
@@ -2367,8 +2773,11 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
while (level < wordlen)
{
/* find word with epenthetic or/and compound affix */
- caff = Conf->CompoundAffix;
- while (level > startpos && (lenaff = CheckCompoundAffixes(&caff, word + level, wordlen - level, (node) ? true : false)) >= 0)
+ caff = (CMPDAffix *) DictCompoundAffix(Conf->dict);
+ while (level > startpos &&
+ (lenaff = CheckCompoundAffixes(Conf->dict, &caff,
+ word + level, wordlen - level,
+ (node) ? true : false)) >= 0)
{
/*
* there is one of compound affixes, so check word for existings
@@ -2415,7 +2824,8 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
while (ptr->next)
ptr = ptr->next;
- ptr->next = SplitToVariants(Conf, NULL, new, word, wordlen, startpos + lenaff, startpos + lenaff);
+ ptr->next = SplitToVariants(Conf, NULL, new, word, wordlen,
+ startpos + lenaff, startpos + lenaff);
pfree(new->stem);
pfree(new);
@@ -2474,13 +2884,14 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
/* we can find next word */
level++;
AddStem(var, pnstrdup(word + startpos, level - startpos));
- node = Conf->Dictionary;
+ node = (SPNode *) DictDictNodes(Conf->dict);
startpos = level;
continue;
}
}
}
- node = StopMiddle->node;
+ node = (SPNode *) DictNodeGet(DictDictNodes(Conf->dict),
+ StopMiddle->node_offset);
}
else
node = NULL;
@@ -2530,7 +2941,7 @@ NINormalizeWord(IspellDict *Conf, char *word)
pfree(res);
}
- if (Conf->usecompound)
+ if (Conf->dict->usecompound)
{
int wordlen = strlen(word);
SplitVar *ptr,
diff --git a/src/include/tsearch/dicts/spell.h b/src/include/tsearch/dicts/spell.h
index 4cba578436..df0abd38ae 100644
--- a/src/include/tsearch/dicts/spell.h
+++ b/src/include/tsearch/dicts/spell.h
@@ -18,21 +18,23 @@
#include "tsearch/dicts/regis.h"
#include "tsearch/ts_public.h"
+#define ISPELL_INVALID_INDEX (0x7FFFF)
+#define ISPELL_INVALID_OFFSET (0xFFFFFFFF)
+
/*
* SPNode and SPNodeData are used to represent prefix tree (Trie) to store
* a words list.
*/
-struct SPNode;
-
typedef struct
{
uint32 val:8,
isword:1,
/* Stores compound flags listed below */
compoundflag:4,
- /* Reference to an entry of the AffixData field */
+ /* Index of an entry of the AffixData field */
affix:19;
- struct SPNode *node;
+ /* Offset to a node of the DictNodes field */
+ uint32 node_offset;
} SPNodeData;
/*
@@ -86,21 +88,55 @@ typedef struct spell_struct
*/
typedef struct aff_struct
{
- char *flag;
/* FF_SUFFIX or FF_PREFIX */
- uint32 type:1,
+ uint16 type:1,
flagflags:7,
issimple:1,
isregis:1,
- replen:14;
- char *find;
- char *repl;
+ flaglen:2;
+
+ /* 8 bits could be too much for repl, find and mask, but who knows */
+ uint8 replen;
+ uint8 findlen;
+ uint8 masklen;
+
+ /*
+ * fields stores the following data (each ends with \0):
+ * - repl
+ * - find
+ * - mask
+ * - flag - one character (if FM_CHAR),
+ * two characters (if FM_LONG),
+ * number, >= 0 and < 65536 (if FM_NUM).
+ */
+ char fields[FLEXIBLE_ARRAY_MEMBER];
+} AFFIX;
+
+#define AF_FLAG_MAXSIZE 5 /* strlen(65536) */
+#define AF_REPL_MAXSIZE 255 /* 8 bits */
+
+#define AFFIXHDRSZ (offsetof(AFFIX, fields))
+
+#define AffixFieldRepl(af) ((af)->fields)
+#define AffixFieldFind(af) ((af)->fields + (af)->replen + 1)
+#define AffixFieldMask(af) (AffixFieldFind(af) + (af)->findlen + 1)
+#define AffixFieldFlag(af) (AffixFieldMask(af) + (af)->masklen + 1)
+#define AffixGetSize(af) (AFFIXHDRSZ + (af)->replen + 1 + (af)->findlen + 1 \
+ + (af)->masklen + 1 + strlen(AffixFieldFlag(af)) + 1)
+
+/*
+ * Stores compiled regular expression of affix. AffixReg uses mask field of
+ * AFFIX as a regular expression.
+ */
+typedef struct AffixReg
+{
+ bool iscompiled;
union
{
regex_t regex;
Regis regis;
- } reg;
-} AFFIX;
+ } r;
+} AffixReg;
/*
* affixes use dictionary flags too
@@ -120,14 +156,13 @@ typedef struct aff_struct
* AffixNode and AffixNodeData are used to represent prefix tree (Trie) to store
* an affix list.
*/
-struct AffixNode;
-
typedef struct
{
- uint32 val:8,
- naff:24;
- AFFIX **aff;
- struct AffixNode *node;
+ uint8 val;
+ uint32 affstart;
+ uint32 affend;
+ /* Offset to a node of the PrefixNodes or SuffixNodes field */
+ uint32 node_offset;
} AffixNodeData;
typedef struct AffixNode
@@ -139,9 +174,20 @@ typedef struct AffixNode
#define ANHRDSZ (offsetof(AffixNode, data))
+typedef struct NodeArray
+{
+ char *Nodes;
+ uint32 NodesSize; /* allocated size of Nodes */
+ uint32 NodesEnd; /* end of data in Nodes */
+} NodeArray;
+
+#define NodeArrayGet(na, of) \
+ (((of) == ISPELL_INVALID_OFFSET) ? NULL : (na)->Nodes + (of))
+
typedef struct
{
- char *affix;
+ /* Index of an affix of the Affix field */
+ uint32 affix;
int len;
bool issuffix;
} CMPDAffix;
@@ -176,30 +222,75 @@ typedef struct CompoundAffixFlag
#define FLAGNUM_MAXSIZE (1 << 16)
-typedef struct
+typedef struct IspellDictData
{
- int maffixes;
- int naffixes;
- AFFIX *Affix;
-
- AffixNode *Suffix;
- AffixNode *Prefix;
+ FlagMode flagMode;
+ bool usecompound;
- SPNode *Dictionary;
- /* Array of sets of affixes */
- char **AffixData;
- int lenAffixData;
- int nAffixData;
bool useFlagAliases;
- CMPDAffix *CompoundAffix;
+ uint32 nAffixData;
+ uint32 AffixDataStart;
- bool usecompound;
- FlagMode flagMode;
+ uint32 AffixOffsetStart;
+ uint32 AffixStart;
+ uint32 nAffix;
+
+ uint32 DictNodesStart;
+ uint32 PrefixNodesStart;
+ uint32 SuffixNodesStart;
+
+ uint32 CompoundAffixStart;
/*
- * All follow fields are actually needed only for initialization
+ * data stores:
+ * - AffixData - array of affix sets
+ * - Affix - sorted array of affixes
+ * - DictNodes - prefix tree of a word list
+ * - PrefixNodes - prefix tree of a prefix list
+ * - SuffixNodes - prefix tree of a suffix list
+ * - CompoundAffix - array of compound affixes
*/
+ char data[FLEXIBLE_ARRAY_MEMBER];
+} IspellDictData;
+
+#define IspellDictDataHdrSize (offsetof(IspellDictData, data))
+
+#define DictAffixDataOffset(d) ((d)->data)
+#define DictAffixData(d) ((d)->data + (d)->AffixDataStart)
+#define DictAffixDataGet(d, i) \
+ (((i) == ISPELL_INVALID_INDEX) ? NULL : \
+ (AssertMacro(i < (d)->nAffixData), \
+ DictAffixData(d) + ((uint32 *) DictAffixDataOffset(d))[i]))
+
+#define DictAffixOffset(d) ((d)->data + (d)->AffixOffsetStart)
+#define DictAffix(d) ((d)->data + (d)->AffixStart)
+#define DictAffixGet(d, i) \
+ (((i) == ISPELL_INVALID_INDEX) ? NULL : \
+ (AssertMacro(i < (d)->nAffix), \
+ DictAffix(d) + ((uint32 *) DictAffixOffset(d))[i]))
+
+#define DictDictNodes(d) ((d)->data + (d)->DictNodesStart)
+#define DictPrefixNodes(d) ((d)->data + (d)->PrefixNodesStart)
+#define DictSuffixNodes(d) ((d)->data + (d)->SuffixNodesStart)
+#define DictNodeGet(node_start, of) \
+ (((of) == ISPELL_INVALID_OFFSET) ? NULL : (char *) (node_start) + (of))
+
+#define DictCompoundAffix(d) ((d)->data + (d)->CompoundAffixStart)
+
+/*
+ * IspellDictBuild is used to initialize IspellDictData struct. This is a
+ * temporary structure which is set up by NIStartBuild() and released by
+ * NIFinishBuild().
+ */
+typedef struct IspellDictBuild
+{
+ MemoryContext buildCxt; /* temp context for construction */
+
+ IspellDictData *dict;
+ uint32 dict_size;
+
+ /* Temporary data */
/* Array of Hunspell options in affix file */
CompoundAffixFlag *CompoundAffixFlags;
@@ -208,29 +299,73 @@ typedef struct
/* allocated length of CompoundAffixFlags array */
int mCompoundAffixFlag;
- /*
- * Remaining fields are only used during dictionary construction; they are
- * set up by NIStartBuild and cleared by NIFinishBuild.
- */
- MemoryContext buildCxt; /* temp context for construction */
-
- /* Temporary array of all words in the dict file */
+ /* Array of all words in the dict file */
SPELL **Spell;
- int nspell; /* number of valid entries in Spell array */
- int mspell; /* allocated length of Spell array */
+ int nSpell; /* number of valid entries in Spell array */
+ int mSpell; /* allocated length of Spell array */
+
+ /* Data for IspellDictData */
+
+ /* Array of all affixes in the aff file */
+ AFFIX **Affix;
+ int nAffix; /* number of valid entries in Affix array */
+ int mAffix; /* allocated length of Affix array */
+ uint32 AffixSize;
+
+ /* Array of sets of affixes */
+ uint32 *AffixDataOffset;
+ int nAffixData; /* number of affix sets */
+ int mAffixData; /* allocated number of affix sets */
+ char *AffixData;
+ uint32 AffixDataSize; /* allocated size of AffixData */
+ uint32 AffixDataEnd; /* end of data in AffixData */
+
+ /* Prefix tree which stores a word list */
+ NodeArray DictNodes;
+
+ /* Prefix tree which stores a prefix list */
+ NodeArray PrefixNodes;
+
+ /* Prefix tree which stores a suffix list */
+ NodeArray SuffixNodes;
- /* These are used to allocate "compact" data without palloc overhead */
- char *firstfree; /* first free address (always maxaligned) */
- size_t avail; /* free space remaining at firstfree */
+ /* Array of compound affixes */
+ CMPDAffix *CompoundAffix;
+ int nCompoundAffix; /* number of entries of CompoundAffix */
+} IspellDictBuild;
+
+#define AffixDataGet(d, i) ((d)->AffixData + (d)->AffixDataOffset[i])
+
+/*
+ * IspellDict is used within NINormalizeWord.
+ */
+typedef struct IspellDict
+{
+ /*
+ * Pointer to a DSM location of IspellDictData. Should be retrieved per
+ * every dispell_lexize() call.
+ */
+ IspellDictData *dict;
+ /*
+ * Array of regular expression of affixes. Each regular expression is
+ * compiled only on demand.
+ */
+ AffixReg *reg;
+ /*
+ * Memory context for compiling regular expressions.
+ */
+ MemoryContext dictCtx;
} IspellDict;
extern TSLexeme *NINormalizeWord(IspellDict *Conf, char *word);
-extern void NIStartBuild(IspellDict *Conf);
-extern void NIImportAffixes(IspellDict *Conf, const char *filename);
-extern void NIImportDictionary(IspellDict *Conf, const char *filename);
-extern void NISortDictionary(IspellDict *Conf);
-extern void NISortAffixes(IspellDict *Conf);
-extern void NIFinishBuild(IspellDict *Conf);
+extern void NIStartBuild(IspellDictBuild *ConfBuild);
+extern void NIImportAffixes(IspellDictBuild *ConfBuild, const char *filename);
+extern void NIImportDictionary(IspellDictBuild *ConfBuild,
+ const char *filename);
+extern void NISortDictionary(IspellDictBuild *ConfBuild);
+extern void NISortAffixes(IspellDictBuild *ConfBuild);
+extern void NICopyData(IspellDictBuild *ConfBuild);
+extern void NIFinishBuild(IspellDictBuild *ConfBuild);
#endif
--
2.17.2
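The header changes above replace pointers with offsets into one flat
buffer, so the same dictionary image stays valid when a DSM segment is
mapped at a different address in each backend. A minimal standalone
sketch of that idea follows. The names FlatNode, flat_node_get,
flat_build and SKETCH_INVALID_OFFSET are made up for illustration and
are not part of the patch; only the "offset or invalid marker" pattern
mirrors DictNodeGet and ISPELL_INVALID_OFFSET.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Plays the role of ISPELL_INVALID_OFFSET: "no child node". */
#define SKETCH_INVALID_OFFSET 0xFFFFFFFFu

/* Hypothetical trie node: the child is a byte offset, not a pointer. */
typedef struct
{
	uint8_t		val;
	uint32_t	child_offset;
} FlatNode;

/* Resolve an offset against the base address of this process's mapping. */
static FlatNode *
flat_node_get(char *base, uint32_t off)
{
	return (off == SKETCH_INVALID_OFFSET) ? NULL : (FlatNode *) (base + off);
}

/*
 * Write a two-node chain 'a' -> 'b' into buf and return the bytes used.
 * buf must be suitably aligned for FlatNode (malloc'd memory is).
 */
static size_t
flat_build(char *buf)
{
	FlatNode	a = {'a', (uint32_t) sizeof(FlatNode)};
	FlatNode	b = {'b', SKETCH_INVALID_OFFSET};

	memcpy(buf, &a, sizeof(a));
	memcpy(buf + sizeof(a), &b, sizeof(b));
	return 2 * sizeof(FlatNode);
}
```

Copying the buffer byte for byte to another address and walking it
from offset 0 gives the same chain, which a pointer-based node could
not guarantee.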
0007-Store-ispell-in-shared-location-tweaks.patch (text/x-patch)
From 18f7f3920983d5bd1172daf4cd888e147aef190c Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@2ndquadrant.com>
Date: Wed, 16 Jan 2019 00:03:13 +0100
Subject: [PATCH 7/7] Store ispell in shared location tweaks
---
doc/src/sgml/textsearch.sgml | 18 ++++++++++--------
src/backend/tsearch/dict_ispell.c | 2 ++
2 files changed, 12 insertions(+), 8 deletions(-)
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index 0f172eda04..8cceef3aa7 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -3114,14 +3114,16 @@ CREATE TEXT SEARCH DICTIONARY english_stem (
<title>Dictionaries in Shared Memory</title>
<para>
- Some dictionaries, especially <application>Ispell</application>, consumes
- a significant amount of memory, in some cases tens of megabytes. Most of
- them store the data in text files, and building the in-memory structure is
- both CPU and time-consuming. Instead of doing this in each backend when
- it needs a dictionary for the first time, the compiled dictionary may be
- stored in dynamic shared memory so that it may be reused by other backends.
- Currently only <application>Ispell</application> supports loading into
- dynamic shared memory.
+ Dictionaries, especially <application>Ispell</application>, may be quite
+ expensive both in terms of memory and CPU usage. For large dictionaries
+ it may take multiple seconds to read and process input text files on first
+ access, and the in-memory representation may require tens of megabytes.
+ When each backend processes the dictionaries independently and stores them
+ in private memory, this cost is significant. To amortize it, the compiled
+ dictionary may be stored in shared memory for reuse by other backends.
+ Currently only <application>Ispell</application> supports such sharing.
+
+ XXX "supported" is not the same as "all ispell dicts behave like that".
</para>
</sect2>
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index 3c9dd78c56..ae2463e2dc 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -9,6 +9,8 @@
* of memory exceeds max_shared_dictionaries_size, then the dictionary will be
* allocated in private backend memory (in dictCtx context).
*
+ * XXX There's no max_shared_dictionaries_size anymore.
+ *
* All necessary data are built within dispell_build() function. But
* structures for regular expressions are compiled on first demand and
* stored using AffixReg array. It is because regex_t and Regis cannot be
--
2.17.2
Hello Tomas,
On 16.01.2019 03:23, Tomas Vondra wrote:
I've looked at the patch today, and in general it seems quite solid to
me. I do have a couple of minor points:
1) I think the comments need more work. Instead of describing all the
individual changes here, I've outlined those improvements in attached
patches (see the attached "tweaks" patches). Some of it is formatting,
minor rewording or larger changes. Some comments are rather redundant
(e.g. the one before calls to release the DSM segment).
Thank you!
2) It's not quite clear to me why we need DictInitData, which simply
combines DictPointerData and list of options. It seems as if the only
point is to pass a single parameter to the init function, but is it
worth it? Why not get rid of DictInitData entirely and pass two
parameters instead?
Originally the init method did have two parameters. In the v7 patch I
replaced them with a single DictInitData struct (list of options and
DictPointerData):
/messages/by-id/20180319110648.GA32319@zakirov.localdomain
I have no way to change a template's init method from
init_method(internal) to init_method(internal,internal) in the upgrade
script of an extension. If I'm not mistaken, we would need new syntax
for that, such as ALTER TEXT SEARCH TEMPLATE. Thoughts?
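To illustrate why bundling helps here: the init method keeps its
single opaque argument, and only the thing it points to changes. A
rough sketch with hypothetical stand-in types (OptionSketch,
InitDataSketch and sketch_init are invented for this example; the real
code uses List, DefElem and DictInitData):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one dictionary option (DefElem in the tree). */
typedef struct
{
	const char *name;
	const char *value;
} OptionSketch;

/* Hypothetical stand-in for DictInitData. */
typedef struct
{
	const OptionSketch *options;	/* what used to be the only argument */
	size_t		noptions;
	uint32_t	dict_id;			/* extra identity data for the shared copy */
} InitDataSketch;

/* Same shape as init_method(internal): exactly one opaque argument. */
typedef const char *(*init_method_sketch) (void *arg);

/* Example init method: returns the value of the "dictfile" option, or NULL. */
static const char *
sketch_init(void *arg)
{
	InitDataSketch *init_data = (InitDataSketch *) arg;

	for (size_t i = 0; i < init_data->noptions; i++)
	{
		if (strcmp(init_data->options[i].name, "dictfile") == 0)
			return init_data->options[i].value;
	}
	return NULL;
}
```

Existing templates declared with one internal argument can therefore
stay as they are in the catalog; only the C side has to be recompiled
against the new struct.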
3) I find it a bit cumbersome that before each ts_dict_shmem_release
call we construct a dummy DictPointerData value. Why not pass
individual parameters and construct the struct in the function?
Agree, it may look too verbose. I'll change it.
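Roughly, the call sites would pass plain values and the key would be
assembled inside the function. The sketch below is only an
illustration with stand-in types; release_sketch, EntrySketch and the
other *Sketch names are hypothetical, and the real function works on
the shared hash table and DSM segments rather than a local array.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for Oid, TransactionId and ItemPointerData. */
typedef uint32_t OidSketch;
typedef uint32_t XidSketch;
typedef struct
{
	uint32_t	block;
	uint16_t	offset;
} TidSketch;

typedef struct
{
	OidSketch	id;
	XidSketch	xmin;
	XidSketch	xmax;
	TidSketch	tid;
} DictKeySketch;

typedef struct
{
	DictKeySketch key;
	bool		pinned;			/* guards against a double unpin */
} EntrySketch;

static bool
dict_key_equal(const DictKeySketch *a, const DictKeySketch *b)
{
	return a->id == b->id && a->xmin == b->xmin && a->xmax == b->xmax &&
		a->tid.block == b->tid.block && a->tid.offset == b->tid.offset;
}

/* Takes individual values; the key struct is built here, not by the caller. */
static bool
release_sketch(EntrySketch *table, size_t n,
			   OidSketch id, XidSketch xmin, XidSketch xmax, TidSketch tid)
{
	DictKeySketch key;

	key.id = id;
	key.xmin = xmin;
	key.xmax = xmax;
	key.tid = tid;

	for (size_t i = 0; i < n; i++)
	{
		if (table[i].pinned && dict_key_equal(&table[i].key, &key))
		{
			table[i].pinned = false;
			return true;
		}
	}
	return false;
}
```

A second call with the same values finds nothing left to release,
which is the situation the pinned flag is there to handle.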
4) The reference to max_shared_dictionaries_size is obsolete, because
there's no such limit anymore.
Yeah, I'll fix it.
/* XXX not really a pointer, so the name is misleading */
I think we don't need the DictPointerData struct anymore, because only
the ts_dict_shmem_release function needs it (see comments above) and we
only need it for the hash search. I'll move all fields of
DictPointerData to the TsearchDictKey struct.
XXX "supported" is not the same as "all ispell dicts behave like that".
I'll reword the sentence.
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
I've attached a new version of the patch with your tweaks applied.
XXX All dictionaries, but only when there's invalid dictionary?
I've made a small optimization: I added a hashvalue field to
TSDictionaryCacheEntry. Now only the DSM segments of altered or dropped
dictionaries are released.
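One plausible shape for such a check is sketched below. It is not the
code from the patch: CacheEntrySketch and invalidate_matching are
hypothetical, and the sketch assumes the usual syscache callback
convention in which a hashvalue of 0 means "invalidate everything".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced cache entry. */
typedef struct
{
	uint32_t	dict_id;
	uint32_t	hashvalue;		/* hash of the dictionary's syscache entry */
	bool		isvalid;
	bool		shmem_attached; /* does this backend hold the DSM segment? */
} CacheEntrySketch;

/*
 * Invalidate only entries whose hash matches (or all of them when
 * hashvalue is 0). Returns how many shared segments were released.
 */
static int
invalidate_matching(CacheEntrySketch *entries, size_t n, uint32_t hashvalue)
{
	int			released = 0;

	for (size_t i = 0; i < n; i++)
	{
		if (hashvalue != 0 && entries[i].hashvalue != hashvalue)
			continue;			/* unrelated dictionary: keep its segment */

		entries[i].isvalid = false;
		if (entries[i].shmem_attached)
		{
			entries[i].shmem_attached = false;
			released++;
		}
	}
	return released;
}
```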
/* XXX not really a pointer, so the name is misleading */
I think we don't need the DictPointerData struct anymore, because only
the ts_dict_shmem_release function needs it (see comments above) and we
only need it for the hash search. I'll move all fields of
DictPointerData to the TsearchDictKey struct.
I was wrong: DictInitData also needs DictPointerData. So I didn't
remove DictPointerData; I renamed it to DictEntryData, which I hope is
a more appropriate name.
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Attachments:
0001-Fix-ispell-memory-handling-v17.patch (text/x-patch)
From c20c171c2107efc6f87b688af0feecf2f98fcd69 Mon Sep 17 00:00:00 2001
From: Arthur Zakirov <z-arthur@yandex.ru>
Date: Thu, 17 Jan 2019 14:27:32 +0300
Subject: [PATCH 1/4] Fix-ispell-memory-handling
---
src/backend/tsearch/spell.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index eb39466b22..eb8416ce7f 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -78,6 +78,8 @@
#define tmpalloc(sz) MemoryContextAlloc(Conf->buildCxt, (sz))
#define tmpalloc0(sz) MemoryContextAllocZero(Conf->buildCxt, (sz))
+#define tmpstrdup(str) MemoryContextStrdup(Conf->buildCxt, (str))
+
/*
* Prepare for constructing an ISpell dictionary.
*
@@ -498,7 +500,7 @@ NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
strcpy(Conf->Spell[Conf->nspell]->word, word);
Conf->Spell[Conf->nspell]->p.flag = (*flag != '\0')
- ? cpstrdup(Conf, flag) : VoidString;
+ ? tmpstrdup(flag) : VoidString;
Conf->nspell++;
}
@@ -1040,7 +1042,7 @@ setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
entry->flag.i = i;
}
else
- entry->flag.s = cpstrdup(Conf, s);
+ entry->flag.s = tmpstrdup(s);
entry->flagMode = Conf->flagMode;
entry->value = val;
@@ -1541,6 +1543,9 @@ nextline:
return;
isnewformat:
+ pfree(recoded);
+ pfree(pstr);
+
if (oldformat)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
--
2.20.1
0002-Change-tmplinit-argument-v17.patch (text/x-patch)
From ca45e4ca314bdf8bed1a47796afb3e86c6fcd684 Mon Sep 17 00:00:00 2001
From: Arthur Zakirov <z-arthur@yandex.ru>
Date: Thu, 17 Jan 2019 15:05:44 +0300
Subject: [PATCH 2/4] Change-tmplinit-argument
---
contrib/dict_int/dict_int.c | 4 +-
contrib/dict_xsyn/dict_xsyn.c | 4 +-
contrib/unaccent/unaccent.c | 4 +-
src/backend/commands/tsearchcmds.c | 10 ++++-
src/backend/snowball/dict_snowball.c | 4 +-
src/backend/tsearch/dict_ispell.c | 4 +-
src/backend/tsearch/dict_simple.c | 4 +-
src/backend/tsearch/dict_synonym.c | 4 +-
src/backend/tsearch/dict_thesaurus.c | 4 +-
src/backend/utils/cache/ts_cache.c | 13 +++++-
src/include/tsearch/ts_cache.h | 4 ++
src/include/tsearch/ts_public.h | 67 ++++++++++++++++++++++++++--
12 files changed, 105 insertions(+), 21 deletions(-)
diff --git a/contrib/dict_int/dict_int.c b/contrib/dict_int/dict_int.c
index 628b9769c3..ddde55eee4 100644
--- a/contrib/dict_int/dict_int.c
+++ b/contrib/dict_int/dict_int.c
@@ -30,7 +30,7 @@ PG_FUNCTION_INFO_V1(dintdict_lexize);
Datum
dintdict_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictInt *d;
ListCell *l;
@@ -38,7 +38,7 @@ dintdict_init(PG_FUNCTION_ARGS)
d->maxlen = 6;
d->rejectlong = false;
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/contrib/dict_xsyn/dict_xsyn.c b/contrib/dict_xsyn/dict_xsyn.c
index 509e14aee0..15b1a0033a 100644
--- a/contrib/dict_xsyn/dict_xsyn.c
+++ b/contrib/dict_xsyn/dict_xsyn.c
@@ -140,7 +140,7 @@ read_dictionary(DictSyn *d, const char *filename)
Datum
dxsyn_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSyn *d;
ListCell *l;
char *filename = NULL;
@@ -153,7 +153,7 @@ dxsyn_init(PG_FUNCTION_ARGS)
d->matchsynonyms = false;
d->keepsynonyms = true;
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/contrib/unaccent/unaccent.c b/contrib/unaccent/unaccent.c
index fc5176e338..f3663cefd0 100644
--- a/contrib/unaccent/unaccent.c
+++ b/contrib/unaccent/unaccent.c
@@ -270,12 +270,12 @@ PG_FUNCTION_INFO_V1(unaccent_init);
Datum
unaccent_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
TrieChar *rootTrie = NULL;
bool fileloaded = false;
ListCell *l;
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/commands/tsearchcmds.c b/src/backend/commands/tsearchcmds.c
index cda21675f0..93a71adc5d 100644
--- a/src/backend/commands/tsearchcmds.c
+++ b/src/backend/commands/tsearchcmds.c
@@ -390,17 +390,25 @@ verify_dictoptions(Oid tmplId, List *dictoptions)
}
else
{
+ DictInitData init_data;
+
/*
* Copy the options just in case init method thinks it can scribble on
* them ...
*/
dictoptions = copyObject(dictoptions);
+ init_data.dict_options = dictoptions;
+ init_data.dict.id = InvalidOid;
+ init_data.dict.xmin = InvalidTransactionId;
+ init_data.dict.xmax = InvalidTransactionId;
+ ItemPointerSetInvalid(&init_data.dict.tid);
+
/*
* Call the init method and see if it complains. We don't worry about
* it leaking memory, since our command will soon be over anyway.
*/
- (void) OidFunctionCall1(initmethod, PointerGetDatum(dictoptions));
+ (void) OidFunctionCall1(initmethod, PointerGetDatum(&init_data));
}
ReleaseSysCache(tup);
diff --git a/src/backend/snowball/dict_snowball.c b/src/backend/snowball/dict_snowball.c
index 5166738310..f30f29865c 100644
--- a/src/backend/snowball/dict_snowball.c
+++ b/src/backend/snowball/dict_snowball.c
@@ -201,14 +201,14 @@ locate_stem_module(DictSnowball *d, const char *lang)
Datum
dsnowball_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSnowball *d;
bool stoploaded = false;
ListCell *l;
d = (DictSnowball *) palloc0(sizeof(DictSnowball));
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index 8b05a477f1..fc9a96abca 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -29,7 +29,7 @@ typedef struct
Datum
dispell_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictISpell *d;
bool affloaded = false,
dictloaded = false,
@@ -40,7 +40,7 @@ dispell_init(PG_FUNCTION_ARGS)
NIStartBuild(&(d->obj));
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_simple.c b/src/backend/tsearch/dict_simple.c
index 2f62ef00c8..c92744641b 100644
--- a/src/backend/tsearch/dict_simple.c
+++ b/src/backend/tsearch/dict_simple.c
@@ -29,7 +29,7 @@ typedef struct
Datum
dsimple_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSimple *d = (DictSimple *) palloc0(sizeof(DictSimple));
bool stoploaded = false,
acceptloaded = false;
@@ -37,7 +37,7 @@ dsimple_init(PG_FUNCTION_ARGS)
d->accept = true; /* default */
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_synonym.c b/src/backend/tsearch/dict_synonym.c
index b6226df940..d3f5f0da3f 100644
--- a/src/backend/tsearch/dict_synonym.c
+++ b/src/backend/tsearch/dict_synonym.c
@@ -91,7 +91,7 @@ compareSyn(const void *a, const void *b)
Datum
dsynonym_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSyn *d;
ListCell *l;
char *filename = NULL;
@@ -104,7 +104,7 @@ dsynonym_init(PG_FUNCTION_ARGS)
char *line = NULL;
uint16 flags = 0;
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_thesaurus.c b/src/backend/tsearch/dict_thesaurus.c
index 75f8deef6a..8962e252e0 100644
--- a/src/backend/tsearch/dict_thesaurus.c
+++ b/src/backend/tsearch/dict_thesaurus.c
@@ -604,7 +604,7 @@ compileTheSubstitute(DictThesaurus *d)
Datum
thesaurus_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictThesaurus *d;
char *subdictname = NULL;
bool fileloaded = false;
@@ -612,7 +612,7 @@ thesaurus_init(PG_FUNCTION_ARGS)
d = (DictThesaurus *) palloc0(sizeof(DictThesaurus));
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/utils/cache/ts_cache.c b/src/backend/utils/cache/ts_cache.c
index f7fc6c1558..5bd1c4288f 100644
--- a/src/backend/utils/cache/ts_cache.c
+++ b/src/backend/utils/cache/ts_cache.c
@@ -39,6 +39,7 @@
#include "catalog/pg_ts_template.h"
#include "commands/defrem.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_public.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
#include "utils/fmgroids.h"
@@ -312,11 +313,15 @@ lookup_ts_dictionary_cache(Oid dictId)
MemSet(entry, 0, sizeof(TSDictionaryCacheEntry));
entry->dictId = dictId;
entry->dictCtx = saveCtx;
+ entry->dict_xmin = HeapTupleHeaderGetRawXmin(tpdict->t_data);
+ entry->dict_xmax = HeapTupleHeaderGetRawXmax(tpdict->t_data);
+ entry->dict_tid = tpdict->t_self;
entry->lexizeOid = template->tmpllexize;
if (OidIsValid(template->tmplinit))
{
+ DictInitData init_data;
List *dictoptions;
Datum opt;
bool isnull;
@@ -336,9 +341,15 @@ lookup_ts_dictionary_cache(Oid dictId)
else
dictoptions = deserialize_deflist(opt);
+ init_data.dict_options = dictoptions;
+ init_data.dict.id = dictId;
+ init_data.dict.xmin = entry->dict_xmin;
+ init_data.dict.xmax = entry->dict_xmax;
+ init_data.dict.tid = entry->dict_tid;
+
entry->dictData =
DatumGetPointer(OidFunctionCall1(template->tmplinit,
- PointerGetDatum(dictoptions)));
+ PointerGetDatum(&init_data)));
MemoryContextSwitchTo(oldcontext);
}
diff --git a/src/include/tsearch/ts_cache.h b/src/include/tsearch/ts_cache.h
index 77e325d101..2298e0a275 100644
--- a/src/include/tsearch/ts_cache.h
+++ b/src/include/tsearch/ts_cache.h
@@ -54,6 +54,10 @@ typedef struct TSDictionaryCacheEntry
Oid dictId;
bool isvalid;
+ TransactionId dict_xmin; /* XMIN of the dictionary's tuple */
+ TransactionId dict_xmax; /* XMAX of the dictionary's tuple */
+ ItemPointerData dict_tid; /* TID of the dictionary's tuple */
+
/* most frequent fmgr call */
Oid lexizeOid;
FmgrInfo lexize;
diff --git a/src/include/tsearch/ts_public.h b/src/include/tsearch/ts_public.h
index b325fa122c..db028ed6ad 100644
--- a/src/include/tsearch/ts_public.h
+++ b/src/include/tsearch/ts_public.h
@@ -13,6 +13,8 @@
#ifndef _PG_TS_PUBLIC_H_
#define _PG_TS_PUBLIC_H_
+#include "nodes/pg_list.h"
+#include "storage/itemptr.h"
#include "tsearch/ts_type.h"
/*
@@ -81,10 +83,69 @@ extern void readstoplist(const char *fname, StopList *s,
extern bool searchstoplist(StopList *s, char *key);
/*
- * Interface with dictionaries
+ * API for text search dictionaries.
+ *
+ * API functions to manage a text search dictionary are defined by a text search
+ * template. Currently an existing template cannot be altered in order to use
+ * different functions. API consists of the following functions:
+ *
+ * init function
+ * -------------
+ * - optional function which initializes internal structures of the dictionary
+ * - accepts DictInitData structure as an argument and must return a custom
+ * palloc'd structure which stores content of the processed dictionary and
+ * is used by lexize function
+ *
+ * lexize function
+ * ---------------
+ * - normalizes a single word (token) using specific dictionary
+ * - returns a palloc'd array of TSLexeme, with a terminating NULL entry
+ * - accepts the following arguments:
+ *
+ * - dictData - pointer to a structure returned by init function or NULL if
+ * init function wasn't defined by the template
+ * - token - string to normalize (not null-terminated)
+ * - length - length of the token
+ * - dictState - pointer to a DictSubState structure storing current
+ * state of a set of tokens processing and allows to normalize phrases
+ */
+
+/*
+ * A preprocessed dictionary can be stored in shared memory using DSM - this is
+ * decided in the init function. A DSM segment is released after altering or
+ * dropping the dictionary. The segment may still leak, when a backend uses the
+ * dictionary right before dropping - in that case the backend will hold the DSM
+ * until it disconnects or calls lookup_ts_dictionary_cache().
+ *
+ * DictEntryData represents DSM segment with a preprocessed dictionary. We need
+ * to ensure the content of the DSM segment is still valid, which is what xmin,
+ * xmax and tid are for.
+ */
+typedef struct
+{
+ Oid id; /* OID of the dictionary */
+ TransactionId xmin; /* XMIN of the dictionary's tuple */
+ TransactionId xmax; /* XMAX of the dictionary's tuple */
+ ItemPointerData tid; /* TID of the dictionary's tuple */
+} DictEntryData;
+
+/*
+ * API structure for a dictionary initialization. It is passed as an argument
+ * to a template's init function.
*/
+typedef struct
+{
+ /* List of options for a template's init method */
+ List *dict_options;
+
+ /* Data used to allocate, search and release the DSM segment */
+ DictEntryData dict;
+} DictInitData;
-/* return struct for any lexize function */
+/*
+ * Return struct for any lexize function. They are combined into an array, the
+ * last entry is the terminating entry.
+ */
typedef struct
{
/*----------
@@ -108,7 +169,7 @@ typedef struct
uint16 flags; /* See flag bits below */
- char *lexeme; /* C string */
+ char *lexeme; /* C string (NULL for terminating entry) */
} TSLexeme;
/* Flag bits that can appear in TSLexeme.flags */
--
2.20.1
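The API comment added to ts_public.h above says that a lexize function
returns an array of TSLexeme whose last entry is a terminator with a
NULL lexeme. A small sketch of how a caller might walk such an array,
using a reduced stand-in struct (LexemeSketch and lexeme_count are
hypothetical names, not part of the patch):

```c
#include <stddef.h>
#include <stdint.h>

/* Reduced stand-in for TSLexeme with the same three fields. */
typedef struct
{
	uint16_t	nvariant;
	uint16_t	flags;
	char	   *lexeme;			/* NULL marks the terminating entry */
} LexemeSketch;

/*
 * Count entries before the terminator. A NULL result (token not
 * recognized by the dictionary) is treated as zero lexemes here.
 */
static size_t
lexeme_count(const LexemeSketch *res)
{
	size_t		n = 0;

	if (res == NULL)
		return 0;
	while (res[n].lexeme != NULL)
		n++;
	return n;
}
```

An array that consists of the terminator alone yields zero as well; a
real caller would still need to distinguish that case (a stop word)
from a NULL result.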
0003-Retrieve-shared-location-for-dict-v17.patch (text/x-patch)
From c50f90a9932b35f05d5e5f25c66184ae2776ff6a Mon Sep 17 00:00:00 2001
From: Arthur Zakirov <z-arthur@yandex.ru>
Date: Thu, 17 Jan 2019 15:38:09 +0300
Subject: [PATCH 3/4] Retrieve-shared-location-for-dict
---
src/backend/commands/tsearchcmds.c | 11 +
src/backend/storage/ipc/ipci.c | 7 +
src/backend/tsearch/Makefile | 2 +-
src/backend/tsearch/ts_shared.c | 385 +++++++++++++++++++++++++++++
src/backend/utils/cache/ts_cache.c | 51 ++++
src/include/storage/lwlock.h | 2 +
src/include/tsearch/ts_cache.h | 3 +
src/include/tsearch/ts_shared.h | 28 +++
8 files changed, 488 insertions(+), 1 deletion(-)
create mode 100644 src/backend/tsearch/ts_shared.c
create mode 100644 src/include/tsearch/ts_shared.h
diff --git a/src/backend/commands/tsearchcmds.c b/src/backend/commands/tsearchcmds.c
index 93a71adc5d..1debe75e4d 100644
--- a/src/backend/commands/tsearchcmds.c
+++ b/src/backend/commands/tsearchcmds.c
@@ -40,6 +40,7 @@
#include "nodes/makefuncs.h"
#include "parser/parse_func.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_shared.h"
#include "tsearch/ts_utils.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
@@ -529,6 +530,11 @@ RemoveTSDictionaryById(Oid dictId)
CatalogTupleDelete(relation, &tup->t_self);
+ ts_dict_shmem_release(dictId,
+ HeapTupleHeaderGetRawXmin(tup->t_data),
+ HeapTupleHeaderGetRawXmax(tup->t_data),
+ tup->t_self, true);
+
ReleaseSysCache(tup);
heap_close(relation, RowExclusiveLock);
@@ -638,6 +644,11 @@ AlterTSDictionary(AlterTSDictionaryStmt *stmt)
ObjectAddressSet(address, TSDictionaryRelationId, dictId);
+ ts_dict_shmem_release(dictId,
+ HeapTupleHeaderGetRawXmin(tup->t_data),
+ HeapTupleHeaderGetRawXmax(tup->t_data),
+ tup->t_self, true);
+
/*
* NOTE: because we only support altering the options, not the template,
* there is no need to update dependencies. This might have to change if
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 2849e47d99..a1af2b2692 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -44,6 +44,7 @@
#include "storage/procsignal.h"
#include "storage/sinvaladt.h"
#include "storage/spin.h"
+#include "tsearch/ts_shared.h"
#include "utils/snapmgr.h"
@@ -148,6 +149,7 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
size = add_size(size, BTreeShmemSize());
size = add_size(size, SyncScanShmemSize());
size = add_size(size, AsyncShmemSize());
+ size = add_size(size, TsearchShmemSize());
#ifdef EXEC_BACKEND
size = add_size(size, ShmemBackendArraySize());
#endif
@@ -268,6 +270,11 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
SyncScanShmemInit();
AsyncShmemInit();
+ /*
+ * Set up shared memory for tsearch
+ */
+ TsearchShmemInit();
+
#ifdef EXEC_BACKEND
/*
diff --git a/src/backend/tsearch/Makefile b/src/backend/tsearch/Makefile
index 62d8bb3254..0b25c20fb0 100644
--- a/src/backend/tsearch/Makefile
+++ b/src/backend/tsearch/Makefile
@@ -26,7 +26,7 @@ DICTFILES_PATH=$(addprefix dicts/,$(DICTFILES))
OBJS = ts_locale.o ts_parse.o wparser.o wparser_def.o dict.o \
dict_simple.o dict_synonym.o dict_thesaurus.o \
dict_ispell.o regis.o spell.o \
- to_tsany.o ts_selfuncs.o ts_typanalyze.o ts_utils.o
+ to_tsany.o ts_selfuncs.o ts_shared.o ts_typanalyze.o ts_utils.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/tsearch/ts_shared.c b/src/backend/tsearch/ts_shared.c
new file mode 100644
index 0000000000..0f8454f746
--- /dev/null
+++ b/src/backend/tsearch/ts_shared.c
@@ -0,0 +1,385 @@
+/*-------------------------------------------------------------------------
+ *
+ * ts_shared.c
+ * tsearch shared memory management
+ *
+ * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/tsearch/ts_shared.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/hash.h"
+#include "lib/dshash.h"
+#include "miscadmin.h"
+#include "storage/lwlock.h"
+#include "storage/shmem.h"
+#include "tsearch/ts_shared.h"
+#include "utils/hashutils.h"
+#include "utils/memutils.h"
+
+
+/*
+ * Hash table entry key.
+ */
+typedef struct
+{
+ Oid db_id;
+ DictEntryData dict;
+} TsearchDictKey;
+
+/*
+ * Hash table entries representing shared dictionaries.
+ */
+typedef struct
+{
+ TsearchDictKey key;
+ dsm_handle dict_dsm;
+
+ /*
+ * Flag recording whether the DSM segment is currently pinned. Without it we
+ * could end up calling dsm_unpin_segment() twice.
+ */
+ bool segment_ispinned;
+} TsearchDictEntry;
+
+static dshash_table *dict_table = NULL;
+
+/*
+ * Information about the main shmem segment, used to coordinate
+ * access to the hash table and dictionaries.
+ */
+typedef struct
+{
+ dsa_handle area;
+ dshash_table_handle dict_table_handle;
+
+ LWLock lock;
+} TsearchCtlData;
+
+static TsearchCtlData *tsearch_ctl;
+
+static int tsearch_dict_cmp(const void *a, const void *b, size_t size,
+ void *arg);
+static uint32 tsearch_dict_hash(const void *a, size_t size, void *arg);
+static void init_dict_table(void);
+
+/* Parameters for dict_table */
+static const dshash_parameters dict_table_params = {
+ sizeof(TsearchDictKey),
+ sizeof(TsearchDictEntry),
+ tsearch_dict_cmp,
+ tsearch_dict_hash,
+ LWTRANCHE_TSEARCH_TABLE
+};
+
+/*
+ * Build the dictionary using allocate_cb callback.
+ *
+ * First try to find the dictionary in the shared hash table. If another backend
+ * has already built it, just return its location in DSM.
+ *
+ * init_data: an argument used within a template's init method.
+ * allocate_cb: function to build the dictionary, if it wasn't found in DSM.
+ *
+ * Returns address in the dynamic shared memory segment or in backend memory.
+ */
+void *
+ts_dict_shmem_location(DictInitData *init_data,
+ ts_dict_build_callback allocate_cb)
+{
+ TsearchDictKey key;
+ TsearchDictEntry *entry;
+ bool found;
+ dsm_segment *seg;
+ void *dict,
+ *dict_location;
+ Size dict_size;
+
+ init_dict_table();
+
+ /*
+ * Build the dictionary in backend's memory if dictid is invalid (it may
+ * happen if the dictionary's init method was called within
+ * verify_dictoptions()).
+ */
+ if (!OidIsValid(init_data->dict.id))
+ {
+ dict = allocate_cb(init_data->dict_options, &dict_size);
+
+ return dict;
+ }
+
+ /* Set up key for hashtable search */
+ key.db_id = MyDatabaseId;
+ key.dict = init_data->dict;
+
+ /* Try to find an entry in the hash table */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, &key, false);
+
+ if (entry)
+ {
+ seg = dsm_find_mapping(entry->dict_dsm);
+ if (!seg)
+ {
+ seg = dsm_attach(entry->dict_dsm);
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+ }
+
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+ }
+
+ /* The dictionary hasn't been loaded into memory yet */
+ entry = (TsearchDictEntry *) dshash_find_or_insert(dict_table, &key,
+ &found);
+
+ if (found)
+ {
+ /*
+ * Someone concurrently inserted a dictionary entry since the first time
+ * we checked.
+ */
+ seg = dsm_attach(entry->dict_dsm);
+
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+ }
+
+ /* Build the dictionary */
+ dict = allocate_cb(init_data->dict_options, &dict_size);
+
+ /* Finally, allocate a DSM segment for the compiled dictionary */
+ seg = dsm_create(dict_size, 0);
+ dict_location = dsm_segment_address(seg);
+ memcpy(dict_location, dict, dict_size);
+
+ pfree(dict);
+
+ entry->key = key;
+ entry->dict_dsm = dsm_segment_handle(seg);
+ entry->segment_ispinned = true;
+
+ /* Keep the segment alive until postmaster shutdown or explicit unpin */
+ dsm_pin_segment(seg);
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+
+ dshash_release_lock(dict_table, entry);
+
+ return dsm_segment_address(seg);
+}
+
+/*
+ * Release memory occupied by the dictionary. The function unpins and detaches
+ * this backend's DSM mapping. If the dictionary was dropped or altered
+ * (unpin_segment is true), it also unpins the DSM segment itself.
+ *
+ * The segment may still outlive the dictionary: a backend that used the
+ * dictionary before it was dropped holds its DSM mapping until it disconnects
+ * or calls lookup_ts_dictionary_cache().
+ *
+ * id, xmin, xmax, tid: information used to find the dictionary's DSM segment.
+ * unpin_segment: true if the segment must be unpinned because the dictionary
+ * was dropped or altered.
+ */
+void
+ts_dict_shmem_release(Oid id, TransactionId xmin, TransactionId xmax,
+ ItemPointerData tid, bool unpin_segment)
+{
+ TsearchDictKey key;
+ TsearchDictEntry *entry;
+
+ /*
+ * If we didn't attach to a hash table then do nothing.
+ */
+ if (!dict_table && !unpin_segment)
+ return;
+ /*
+ * But if we need to unpin the DSM segment, so that it goes away when the
+ * last interested process disconnects, we need the hash table to find
+ * the dictionary's entry.
+ */
+ else if (unpin_segment)
+ init_dict_table();
+
+ /* Set up key for hashtable search */
+ key.db_id = MyDatabaseId;
+ key.dict.id = id;
+ key.dict.xmin = xmin;
+ key.dict.xmax = xmax;
+ key.dict.tid = tid;
+
+ /* Try to find an entry in the hash table */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, &key, true);
+
+ if (entry)
+ {
+ dsm_segment *seg;
+
+ seg = dsm_find_mapping(entry->dict_dsm);
+
+ if (seg)
+ {
+ dsm_unpin_mapping(seg);
+ dsm_detach(seg);
+ }
+
+ if (unpin_segment && entry->segment_ispinned)
+ {
+ dsm_unpin_segment(entry->dict_dsm);
+ entry->segment_ispinned = false;
+ }
+
+ dshash_release_lock(dict_table, entry);
+ }
+}
+
+/*
+ * Allocate and initialize tsearch-related shared memory.
+ */
+void
+TsearchShmemInit(void)
+{
+ bool found;
+
+ tsearch_ctl = (TsearchCtlData *)
+ ShmemInitStruct("Full Text Search Ctl", sizeof(TsearchCtlData), &found);
+
+ if (!found)
+ {
+ LWLockRegisterTranche(LWTRANCHE_TSEARCH_DSA, "tsearch_dsa");
+ LWLockRegisterTranche(LWTRANCHE_TSEARCH_TABLE, "tsearch_table");
+
+ LWLockInitialize(&tsearch_ctl->lock, LWTRANCHE_TSEARCH_DSA);
+
+ tsearch_ctl->area = DSM_HANDLE_INVALID;
+ tsearch_ctl->dict_table_handle = InvalidDsaPointer;
+ }
+}
+
+/*
+ * Report shared memory space needed by TsearchShmemInit.
+ */
+Size
+TsearchShmemSize(void)
+{
+ Size size = 0;
+
+ /* size of service structure */
+ size = add_size(size, MAXALIGN(sizeof(TsearchCtlData)));
+
+ return size;
+}
+
+/*
+ * A comparator function for TsearchDictKey.
+ *
+ * Returns 0 if keys are equal, 1 otherwise.
+ */
+static int
+tsearch_dict_cmp(const void *a, const void *b, size_t size, void *arg)
+{
+ TsearchDictKey *k1 = (TsearchDictKey *) a;
+ TsearchDictKey *k2 = (TsearchDictKey *) b;
+
+ if (k1->db_id == k2->db_id && k1->dict.id == k2->dict.id &&
+ k1->dict.xmin == k2->dict.xmin && k1->dict.xmax == k2->dict.xmax &&
+ ItemPointerEquals(&k1->dict.tid, &k2->dict.tid))
+ return 0;
+ else
+ return 1;
+}
+
+/*
+ * A hash function for TsearchDictKey.
+ */
+static uint32
+tsearch_dict_hash(const void *a, size_t size, void *arg)
+{
+ TsearchDictKey *k = (TsearchDictKey *) a;
+ uint32 s;
+
+ s = hash_combine(0, hash_uint32(k->db_id));
+ s = hash_combine(s, hash_uint32(k->dict.id));
+ s = hash_combine(s, hash_uint32(k->dict.xmin));
+ s = hash_combine(s, hash_uint32(k->dict.xmax));
+ s = hash_combine(s,
+ hash_uint32(BlockIdGetBlockNumber(&k->dict.tid.ip_blkid)));
+ s = hash_combine(s, hash_uint32(k->dict.tid.ip_posid));
+
+ return s;
+}
+
+/*
+ * Initialize hash table located in DSM.
+ *
+ * The hash table should be created and initialized if it doesn't exist yet.
+ */
+static void
+init_dict_table(void)
+{
+ MemoryContext old_context;
+ dsa_area *dsa;
+
+ /* Exit if the hash table was initialized already */
+ if (dict_table)
+ return;
+
+ old_context = MemoryContextSwitchTo(TopMemoryContext);
+
+recheck_table:
+ LWLockAcquire(&tsearch_ctl->lock, LW_SHARED);
+
+ /* The hash table has already been created by another backend */
+ if (DsaPointerIsValid(tsearch_ctl->dict_table_handle))
+ {
+ Assert(tsearch_ctl->area != DSM_HANDLE_INVALID);
+
+ dsa = dsa_attach(tsearch_ctl->area);
+
+ dict_table = dshash_attach(dsa,
+ &dict_table_params,
+ tsearch_ctl->dict_table_handle,
+ NULL);
+ }
+ else
+ {
+ /* Try to get exclusive lock */
+ LWLockRelease(&tsearch_ctl->lock);
+ if (!LWLockAcquireOrWait(&tsearch_ctl->lock, LW_EXCLUSIVE))
+ {
+ /*
+ * The lock was released by another backend, which may have
+ * concurrently created the hash table already; recheck.
+ */
+ goto recheck_table;
+ }
+
+ dsa = dsa_create(LWTRANCHE_TSEARCH_DSA);
+ tsearch_ctl->area = dsa_get_handle(dsa);
+
+ dict_table = dshash_create(dsa, &dict_table_params, NULL);
+ tsearch_ctl->dict_table_handle = dshash_get_hash_table_handle(dict_table);
+
+ /* Keep the DSA area alive until postmaster shutdown */
+ dsa_pin(dsa);
+ }
+
+ LWLockRelease(&tsearch_ctl->lock);
+
+ /* Remain attached until end of session */
+ dsa_pin_mapping(dsa);
+
+ MemoryContextSwitchTo(old_context);
+}
diff --git a/src/backend/utils/cache/ts_cache.c b/src/backend/utils/cache/ts_cache.c
index 5bd1c4288f..2bd5e6787e 100644
--- a/src/backend/utils/cache/ts_cache.c
+++ b/src/backend/utils/cache/ts_cache.c
@@ -40,6 +40,7 @@
#include "commands/defrem.h"
#include "tsearch/ts_cache.h"
#include "tsearch/ts_public.h"
+#include "tsearch/ts_shared.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
#include "utils/fmgroids.h"
@@ -76,6 +77,7 @@ static TSConfigCacheEntry *lastUsedConfig = NULL;
char *TSCurrentConfig = NULL;
static Oid TSCurrentConfigCache = InvalidOid;
+static bool has_invalid_dictionary = false;
/*
@@ -87,6 +89,10 @@ static Oid TSCurrentConfigCache = InvalidOid;
* doesn't seem worth the trouble to determine that; we just flush all the
* entries of the related hash table.
*
+ * We set has_invalid_dictionary to true so that the used segments are unpinned
+ * later, on the next use of a text search function. It isn't safe to call
+ * ts_dict_shmem_release() here since it may call kernel functions.
+ *
* We can use the same function for all TS caches by passing the hash
* table address as the "arg".
*/
@@ -99,13 +105,48 @@ InvalidateTSCacheCallBack(Datum arg, int cacheid, uint32 hashvalue)
hash_seq_init(&status, hash);
while ((entry = (TSAnyCacheEntry *) hash_seq_search(&status)) != NULL)
+ {
+ if (cacheid == TSDICTOID)
+ {
+ TSDictionaryCacheEntry *dict_entry;
+
+ dict_entry = (TSDictionaryCacheEntry *) entry;
+ if (dict_entry->hashvalue == hashvalue)
+ {
+ dict_entry->shmem_valid = false;
+ has_invalid_dictionary = true;
+ }
+ }
+
entry->isvalid = false;
+ }
/* Also invalidate the current-config cache if it's pg_ts_config */
if (hash == TSConfigCacheHash)
TSCurrentConfigCache = InvalidOid;
}
+/*
+ * Unpin shared segments of all invalid dictionary entries.
+ */
+static void
+do_ts_dict_shmem_release(void)
+{
+ HASH_SEQ_STATUS status;
+ TSDictionaryCacheEntry *entry;
+
+ if (!has_invalid_dictionary)
+ return;
+
+ hash_seq_init(&status, TSDictionaryCacheHash);
+ while ((entry = (TSDictionaryCacheEntry *) hash_seq_search(&status)) != NULL)
+ if (!entry->shmem_valid)
+ ts_dict_shmem_release(entry->dictId, entry->dict_xmin,
+ entry->dict_xmax, entry->dict_tid, false);
+
+ has_invalid_dictionary = false;
+}
+
/*
* Fetch parser cache entry
*/
@@ -254,6 +295,13 @@ lookup_ts_dictionary_cache(Oid dictId)
Form_pg_ts_template template;
MemoryContext saveCtx;
+ /*
+ * It is possible that some invalid entries still hold a DSM mapping, and
+ * we need to unpin it to avoid leaking memory. This also unpins the
+ * mappings of all other invalid dictionaries.
+ */
+ do_ts_dict_shmem_release();
+
tpdict = SearchSysCache1(TSDICTOID, ObjectIdGetDatum(dictId));
if (!HeapTupleIsValid(tpdict))
elog(ERROR, "cache lookup failed for text search dictionary %u",
@@ -360,6 +408,9 @@ lookup_ts_dictionary_cache(Oid dictId)
fmgr_info_cxt(entry->lexizeOid, &entry->lexize, entry->dictCtx);
entry->isvalid = true;
+ entry->hashvalue =
+ GetSysCacheHashValue1(TSDICTOID, ObjectIdGetDatum(entry->dictId));
+ entry->shmem_valid = true;
}
lastUsedDictionary = entry;
diff --git a/src/include/storage/lwlock.h b/src/include/storage/lwlock.h
index 96c7732006..49a3319a11 100644
--- a/src/include/storage/lwlock.h
+++ b/src/include/storage/lwlock.h
@@ -219,6 +219,8 @@ typedef enum BuiltinTrancheIds
LWTRANCHE_SHARED_TUPLESTORE,
LWTRANCHE_TBM,
LWTRANCHE_PARALLEL_APPEND,
+ LWTRANCHE_TSEARCH_DSA,
+ LWTRANCHE_TSEARCH_TABLE,
LWTRANCHE_FIRST_USER_DEFINED
} BuiltinTrancheIds;
diff --git a/src/include/tsearch/ts_cache.h b/src/include/tsearch/ts_cache.h
index 2298e0a275..14e13bf252 100644
--- a/src/include/tsearch/ts_cache.h
+++ b/src/include/tsearch/ts_cache.h
@@ -54,6 +54,9 @@ typedef struct TSDictionaryCacheEntry
Oid dictId;
bool isvalid;
+ uint32 hashvalue; /* hash value of the dictionary's OID */
+ bool shmem_valid;
+
TransactionId dict_xmin; /* XMIN of the dictionary's tuple */
TransactionId dict_xmax; /* XMAX of the dictionary's tuple */
ItemPointerData dict_tid; /* TID of the dictionary's tuple */
diff --git a/src/include/tsearch/ts_shared.h b/src/include/tsearch/ts_shared.h
new file mode 100644
index 0000000000..1e506ef737
--- /dev/null
+++ b/src/include/tsearch/ts_shared.h
@@ -0,0 +1,28 @@
+/*-------------------------------------------------------------------------
+ *
+ * ts_shared.h
+ * tsearch shared memory management
+ *
+ * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
+ *
+ * src/include/tsearch/ts_shared.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef TS_SHARED_H
+#define TS_SHARED_H
+
+#include "tsearch/ts_public.h"
+
+typedef void *(*ts_dict_build_callback) (List *dictoptions, Size *size);
+
+extern void *ts_dict_shmem_location(DictInitData *init_data,
+ ts_dict_build_callback allocate_cb);
+extern void ts_dict_shmem_release(Oid id, TransactionId xmin,
+ TransactionId xmax, ItemPointerData tid,
+ bool unpin_segment);
+
+extern void TsearchShmemInit(void);
+extern Size TsearchShmemSize(void);
+
+#endif /* TS_SHARED_H */
--
2.20.1
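
For readers following the thread, the key handling in ts_shared.c above can be sketched outside PostgreSQL. The comparator follows the memcmp-style convention (0 means the keys are equal), and the hash folds every key field into one value. This is only a minimal standalone sketch under stated assumptions: DemoDictKey, DemoEntry, DemoTable, demo_key_cmp(), demo_hash_combine(), demo_key_hash() and demo_find_or_insert() are hypothetical names that are not part of the patch, the small probing table merely stands in for dshash, and raw field values are combined directly, whereas the patch passes each field through hash_uint32() first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the patch's TsearchDictKey. */
typedef struct
{
	uint32_t	db_id;		/* database OID */
	uint32_t	dict_id;	/* dictionary OID */
	uint32_t	xmin;		/* xmin of the dictionary's tuple */
	uint32_t	xmax;		/* xmax of the dictionary's tuple */
	uint32_t	block;		/* block number part of the TID */
	uint16_t	posid;		/* offset part of the TID */
} DemoDictKey;

/* memcmp-style comparator: returns 0 if the keys are equal, 1 otherwise. */
static int
demo_key_cmp(const DemoDictKey *a, const DemoDictKey *b)
{
	if (a->db_id == b->db_id && a->dict_id == b->dict_id &&
		a->xmin == b->xmin && a->xmax == b->xmax &&
		a->block == b->block && a->posid == b->posid)
		return 0;
	return 1;
}

/* Boost-style combine step (an assumption, chosen for illustration). */
static uint32_t
demo_hash_combine(uint32_t a, uint32_t b)
{
	a ^= b + 0x9e3779b9 + (a << 6) + (a >> 2);
	return a;
}

/* Fold every key field into a single hash value. */
static uint32_t
demo_key_hash(const DemoDictKey *k)
{
	uint32_t	s = 0;

	s = demo_hash_combine(s, k->db_id);
	s = demo_hash_combine(s, k->dict_id);
	s = demo_hash_combine(s, k->xmin);
	s = demo_hash_combine(s, k->xmax);
	s = demo_hash_combine(s, k->block);
	s = demo_hash_combine(s, k->posid);
	return s;
}

#define DEMO_SLOTS 8

typedef struct
{
	DemoDictKey key;
	int			used;		/* slot occupied? */
	int			handle;		/* stand-in for a dsm_handle */
} DemoEntry;

typedef struct
{
	DemoEntry	slots[DEMO_SLOTS];
	int			count;
} DemoTable;

/*
 * Find the entry for key, or claim an empty slot for it.
 * *found is set to 1 if the key was already present, 0 otherwise.
 */
static DemoEntry *
demo_find_or_insert(DemoTable *t, const DemoDictKey *key, int *found)
{
	uint32_t	start = demo_key_hash(key) % DEMO_SLOTS;
	int			n;

	/* Keep at least one slot free so that probing always terminates. */
	assert(t->count < DEMO_SLOTS);

	for (n = 0; n < DEMO_SLOTS; n++)
	{
		DemoEntry  *e = &t->slots[(start + n) % DEMO_SLOTS];

		if (!e->used)
		{
			e->used = 1;
			e->key = *key;
			e->handle = 0;
			t->count++;
			*found = 0;
			return e;
		}
		if (demo_key_cmp(&e->key, key) == 0)
		{
			*found = 1;
			return e;
		}
	}
	return NULL;				/* not reached while a free slot exists */
}
```

A caller would treat found == 0 as "build the dictionary and fill in the entry" and found != 0 as "attach to the existing segment", which roughly mirrors the dshash_find_or_insert() branch in ts_dict_shmem_location(). Because xmin, xmax and the TID are part of the key, an altered dictionary (a new tuple version) maps to a different entry than the old one.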
Attachment: 0004-Store-ispell-in-shared-location-v17.patch (text/x-patch)
From ae745ed3076537d74fbd8520c1ec03d76f1ad488 Mon Sep 17 00:00:00 2001
From: Arthur Zakirov <z-arthur@yandex.ru>
Date: Thu, 17 Jan 2019 15:50:44 +0300
Subject: [PATCH 4/4] Store-ispell-in-shared-location
---
doc/src/sgml/textsearch.sgml | 15 +
src/backend/tsearch/dict_ispell.c | 193 +++--
src/backend/tsearch/spell.c | 1343 +++++++++++++++++++----------
src/include/tsearch/dicts/spell.h | 239 +++--
4 files changed, 1208 insertions(+), 582 deletions(-)
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index ecebade767..308758942b 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -3110,6 +3110,21 @@ CREATE TEXT SEARCH DICTIONARY english_stem (
</sect2>
+ <sect2 id="textsearch-shared-dictionaries">
+ <title>Dictionaries in Shared Memory</title>
+
+ <para>
+ Dictionaries, especially <application>Ispell</application>, may be quite
+ expensive both in terms of memory and CPU usage. For large dictionaries
+ it may take multiple seconds to read and process input text files on first
+ access, and the in-memory representation may require tens of megabytes.
+ When each backend processes the dictionaries independently and stores them
+ in private memory, this cost is significant. To amortize it, the compiled
+ dictionary may be stored in shared memory for reuse by other backends.
+ Currently only <application>Ispell</application> is stored in shared memory.
+ </para>
+ </sect2>
+
</sect1>
<sect1 id="textsearch-configuration">
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index fc9a96abca..b9c30bbeb4 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -5,6 +5,11 @@
*
* Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
*
+ * Compiled Ispell dictionaries are stored in DSM. All necessary data is built
+ * within the dispell_build() function. Structures for regular expressions,
+ * however, are compiled on first demand and kept in the AffixReg array,
+ * because regex_t and Regis cannot easily be stored in shared memory.
+ *
*
* IDENTIFICATION
* src/backend/tsearch/dict_ispell.c
@@ -14,8 +19,10 @@
#include "postgres.h"
#include "commands/defrem.h"
+#include "storage/dsm.h"
#include "tsearch/dicts/spell.h"
#include "tsearch/ts_locale.h"
+#include "tsearch/ts_shared.h"
#include "tsearch/ts_utils.h"
#include "utils/builtins.h"
@@ -26,54 +33,126 @@ typedef struct
IspellDict obj;
} DictISpell;
+static void parse_dictoptions(List *dictoptions,
+ char **dictfile, char **afffile, char **stopfile);
+static void *dispell_build(List *dictoptions, Size *size);
+
Datum
dispell_init(PG_FUNCTION_ARGS)
{
DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictISpell *d;
- bool affloaded = false,
- dictloaded = false,
- stoploaded = false;
- ListCell *l;
+ void *dict_location;
+ char *stopfile;
d = (DictISpell *) palloc0(sizeof(DictISpell));
- NIStartBuild(&(d->obj));
+ parse_dictoptions(init_data->dict_options, NULL, NULL, &stopfile);
+
+ if (stopfile)
+ readstoplist(stopfile, &(d->stoplist), lowerstr);
+
+ dict_location = ts_dict_shmem_location(init_data, dispell_build);
+ Assert(dict_location);
+
+ d->obj.dict = (IspellDictData *) dict_location;
+ d->obj.reg = (AffixReg *) palloc0(d->obj.dict->nAffix *
+ sizeof(AffixReg));
+ /* Current memory context is dictionary's private memory context */
+ d->obj.dictCtx = CurrentMemoryContext;
+
+ PG_RETURN_POINTER(d);
+}
+
+Datum
+dispell_lexize(PG_FUNCTION_ARGS)
+{
+ DictISpell *d = (DictISpell *) PG_GETARG_POINTER(0);
+ char *in = (char *) PG_GETARG_POINTER(1);
+ int32 len = PG_GETARG_INT32(2);
+ char *txt;
+ TSLexeme *res;
+ TSLexeme *ptr,
+ *cptr;
+
+ if (len <= 0)
+ PG_RETURN_POINTER(NULL);
+
+ txt = lowerstr_with_len(in, len);
+ res = NINormalizeWord(&(d->obj), txt);
- foreach(l, init_data->dict_options)
+ if (res == NULL)
+ PG_RETURN_POINTER(NULL);
+
+ cptr = res;
+ for (ptr = cptr; ptr->lexeme; ptr++)
+ {
+ if (searchstoplist(&(d->stoplist), ptr->lexeme))
+ {
+ pfree(ptr->lexeme);
+ ptr->lexeme = NULL;
+ }
+ else
+ {
+ if (cptr != ptr)
+ memcpy(cptr, ptr, sizeof(TSLexeme));
+ cptr++;
+ }
+ }
+ cptr->lexeme = NULL;
+
+ PG_RETURN_POINTER(res);
+}
+
+static void
+parse_dictoptions(List *dictoptions, char **dictfile, char **afffile,
+ char **stopfile)
+{
+ ListCell *l;
+
+ if (dictfile)
+ *dictfile = NULL;
+ if (afffile)
+ *afffile = NULL;
+ if (stopfile)
+ *stopfile = NULL;
+
+ foreach(l, dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
if (strcmp(defel->defname, "dictfile") == 0)
{
- if (dictloaded)
+ if (!dictfile)
+ continue;
+
+ if (*dictfile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple DictFile parameters")));
- NIImportDictionary(&(d->obj),
- get_tsearch_config_filename(defGetString(defel),
- "dict"));
- dictloaded = true;
+ *dictfile = get_tsearch_config_filename(defGetString(defel), "dict");
}
else if (strcmp(defel->defname, "afffile") == 0)
{
- if (affloaded)
+ if (!afffile)
+ continue;
+
+ if (*afffile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple AffFile parameters")));
- NIImportAffixes(&(d->obj),
- get_tsearch_config_filename(defGetString(defel),
- "affix"));
- affloaded = true;
+ *afffile = get_tsearch_config_filename(defGetString(defel), "affix");
}
else if (strcmp(defel->defname, "stopwords") == 0)
{
- if (stoploaded)
+ if (!stopfile)
+ continue;
+
+ if (*stopfile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple StopWords parameters")));
- readstoplist(defGetString(defel), &(d->stoplist), lowerstr);
- stoploaded = true;
+ *stopfile = defGetString(defel);
}
else
{
@@ -83,66 +162,52 @@ dispell_init(PG_FUNCTION_ARGS)
defel->defname)));
}
}
+}
- if (affloaded && dictloaded)
- {
- NISortDictionary(&(d->obj));
- NISortAffixes(&(d->obj));
- }
- else if (!affloaded)
+/*
+ * Build the dictionary.
+ *
+ * Result is palloc'ed.
+ */
+static void *
+dispell_build(List *dictoptions, Size *size)
+{
+ IspellDictBuild build;
+ char *dictfile,
+ *afffile;
+
+ parse_dictoptions(dictoptions, &dictfile, &afffile, NULL);
+
+ if (!afffile)
{
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("missing AffFile parameter")));
}
- else
+ else if (!dictfile)
{
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("missing DictFile parameter")));
}
- NIFinishBuild(&(d->obj));
-
- PG_RETURN_POINTER(d);
-}
-
-Datum
-dispell_lexize(PG_FUNCTION_ARGS)
-{
- DictISpell *d = (DictISpell *) PG_GETARG_POINTER(0);
- char *in = (char *) PG_GETARG_POINTER(1);
- int32 len = PG_GETARG_INT32(2);
- char *txt;
- TSLexeme *res;
- TSLexeme *ptr,
- *cptr;
+ MemSet(&build, 0, sizeof(build));
+ NIStartBuild(&build);
- if (len <= 0)
- PG_RETURN_POINTER(NULL);
+ /* Read files */
+ NIImportDictionary(&build, dictfile);
+ NIImportAffixes(&build, afffile);
- txt = lowerstr_with_len(in, len);
- res = NINormalizeWord(&(d->obj), txt);
+ /* Build persistent data to be used by backends */
+ NISortDictionary(&build);
+ NISortAffixes(&build);
- if (res == NULL)
- PG_RETURN_POINTER(NULL);
+ NICopyData(&build);
- cptr = res;
- for (ptr = cptr; ptr->lexeme; ptr++)
- {
- if (searchstoplist(&(d->stoplist), ptr->lexeme))
- {
- pfree(ptr->lexeme);
- ptr->lexeme = NULL;
- }
- else
- {
- if (cptr != ptr)
- memcpy(cptr, ptr, sizeof(TSLexeme));
- cptr++;
- }
- }
- cptr->lexeme = NULL;
+ /* Release temporary data */
+ NIFinishBuild(&build);
- PG_RETURN_POINTER(res);
+ /* Return the buffer and its size */
+ *size = build.dict_size;
+ return build.dict;
}
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index eb8416ce7f..123fba7a11 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -23,33 +23,35 @@
* Compilation of a dictionary
* ---------------------------
*
- * A compiled dictionary is stored in the IspellDict structure. Compilation of
- * a dictionary is divided into the several steps:
+ * A compiled dictionary is stored in the following structures:
+ * - IspellDictBuild - stores temporary data and IspellDictData
+ * - IspellDictData - stores permanent data used within NINormalizeWord()
+ * Compilation of the dictionary is divided into several steps:
* - NIImportDictionary() - stores each word of a .dict file in the
* temporary Spell field.
- * - NIImportAffixes() - stores affix rules of an .affix file in the
- * Affix field (not temporary) if an .affix file has the Ispell format.
+ * - NIImportAffixes() - stores affix rules of an .affix file in the temporary
+ * Affix field if an .affix file has the Ispell format.
* -> NIImportOOAffixes() - stores affix rules if an .affix file has the
* Hunspell format. The AffixData field is initialized if AF parameter
* is defined.
* - NISortDictionary() - builds a prefix tree (Trie) from the words list
- * and stores it in the Dictionary field. The words list is got from the
+ * and stores it in the DictNodes field. The words list is got from the
* Spell field. The AffixData field is initialized if AF parameter is not
* defined.
* - NISortAffixes():
* - builds a list of compound affixes from the affix list and stores it
* in the CompoundAffix.
* - builds prefix trees (Trie) from the affix list for prefixes and suffixes
- * and stores them in Suffix and Prefix fields.
+ * and stores them in SuffixNodes and PrefixNodes fields.
* The affix list is got from the Affix field.
+ * Persistent data of the dictionary is copied within NICopyData().
*
* Memory management
* -----------------
*
- * The IspellDict structure has the Spell field which is used only in compile
- * time. The Spell field stores a words list. It can take a lot of memory.
- * Therefore when a dictionary is compiled this field is cleared by
- * NIFinishBuild().
+ * The IspellDictBuild structure holds temporary data which is used only at
+ * compile time. It can take a lot of memory. Therefore, after the dictionary
+ * is compiled, this data is released by NIFinishBuild().
*
* All resources which should cleared by NIFinishBuild() is initialized using
* tmpalloc() and tmpalloc0().
@@ -73,112 +75,166 @@
* after the initialization is done. During initialization,
* CurrentMemoryContext is the long-lived memory context associated
* with the dictionary cache entry. We keep the short-lived stuff
- * in the Conf->buildCxt context.
+ * in the ConfBuild->buildCxt context.
*/
-#define tmpalloc(sz) MemoryContextAlloc(Conf->buildCxt, (sz))
-#define tmpalloc0(sz) MemoryContextAllocZero(Conf->buildCxt, (sz))
+#define tmpalloc(sz) MemoryContextAlloc(ConfBuild->buildCxt, (sz))
+#define tmpalloc0(sz) MemoryContextAllocZero(ConfBuild->buildCxt, (sz))
-#define tmpstrdup(str) MemoryContextStrdup(Conf->buildCxt, (str))
+#define tmpstrdup(str) MemoryContextStrdup(ConfBuild->buildCxt, (str))
/*
* Prepare for constructing an ISpell dictionary.
*
- * The IspellDict struct is assumed to be zeroed when allocated.
+ * The IspellDictBuild struct is assumed to be zeroed when allocated.
*/
void
-NIStartBuild(IspellDict *Conf)
+NIStartBuild(IspellDictBuild *ConfBuild)
{
+ uint32 dict_size;
+
/*
* The temp context is a child of CurTransactionContext, so that it will
* go away automatically on error.
*/
- Conf->buildCxt = AllocSetContextCreate(CurTransactionContext,
- "Ispell dictionary init context",
- ALLOCSET_DEFAULT_SIZES);
+ ConfBuild->buildCxt = AllocSetContextCreate(CurTransactionContext,
+ "Ispell dictionary init context",
+ ALLOCSET_DEFAULT_SIZES);
+
+ /*
+ * Allocate the buffer for the dictionary in the current context, not in buildCxt.
+ */
+ dict_size = MAXALIGN(IspellDictDataHdrSize);
+ ConfBuild->dict = palloc0(dict_size);
+ ConfBuild->dict_size = dict_size;
}
/*
- * Clean up when dictionary construction is complete.
+ * Copy compiled and persistent data into IspellDictData.
*/
void
-NIFinishBuild(IspellDict *Conf)
+NICopyData(IspellDictBuild *ConfBuild)
{
- /* Release no-longer-needed temp memory */
- MemoryContextDelete(Conf->buildCxt);
- /* Just for cleanliness, zero the now-dangling pointers */
- Conf->buildCxt = NULL;
- Conf->Spell = NULL;
- Conf->firstfree = NULL;
- Conf->CompoundAffixFlags = NULL;
-}
+ IspellDictData *dict;
+ uint32 size;
+ int i;
+ uint32 *offsets,
+ offset = 0;
+ SPNode *dict_node PG_USED_FOR_ASSERTS_ONLY;
+ AffixNode *aff_node PG_USED_FOR_ASSERTS_ONLY;
+ /*
+ * Calculate necessary space
+ */
+ size = ConfBuild->nAffixData * sizeof(uint32);
+ size += ConfBuild->AffixDataEnd;
-/*
- * "Compact" palloc: allocate without extra palloc overhead.
- *
- * Since we have no need to free the ispell data items individually, there's
- * not much value in the per-chunk overhead normally consumed by palloc.
- * Getting rid of it is helpful since ispell can allocate a lot of small nodes.
- *
- * We currently pre-zero all data allocated this way, even though some of it
- * doesn't need that. The cpalloc and cpalloc0 macros are just documentation
- * to indicate which allocations actually require zeroing.
- */
-#define COMPACT_ALLOC_CHUNK 8192 /* amount to get from palloc at once */
-#define COMPACT_MAX_REQ 1024 /* must be < COMPACT_ALLOC_CHUNK */
+ size += ConfBuild->nAffix * sizeof(uint32);
+ size += ConfBuild->AffixSize;
-static void *
-compact_palloc0(IspellDict *Conf, size_t size)
-{
- void *result;
+ size += ConfBuild->DictNodes.NodesEnd;
+ size += ConfBuild->PrefixNodes.NodesEnd;
+ size += ConfBuild->SuffixNodes.NodesEnd;
- /* Should only be called during init */
- Assert(Conf->buildCxt != NULL);
+ size += sizeof(CMPDAffix) * ConfBuild->nCompoundAffix;
- /* No point in this for large chunks */
- if (size > COMPACT_MAX_REQ)
- return palloc0(size);
+ /*
+ * Copy data itself
+ */
+ ConfBuild->dict_size = IspellDictDataHdrSize + size;
+ ConfBuild->dict = repalloc(ConfBuild->dict, ConfBuild->dict_size);
+
+ dict = ConfBuild->dict;
+
+ /* AffixData */
+ dict->nAffixData = ConfBuild->nAffixData;
+ dict->AffixDataStart = sizeof(uint32) * ConfBuild->nAffixData;
+ memcpy(DictAffixDataOffset(dict), ConfBuild->AffixDataOffset,
+ sizeof(uint32) * ConfBuild->nAffixData);
+ memcpy(DictAffixData(dict), ConfBuild->AffixData, ConfBuild->AffixDataEnd);
+
+ /* Affix array */
+ dict->nAffix = ConfBuild->nAffix;
+ dict->AffixOffsetStart = dict->AffixDataStart + ConfBuild->AffixDataEnd;
+ dict->AffixStart = dict->AffixOffsetStart + sizeof(uint32) * ConfBuild->nAffix;
+ if (ConfBuild->nAffix > 0)
+ {
+ offsets = (uint32 *) DictAffixOffset(dict);
+ for (i = 0; i < ConfBuild->nAffix; i++)
+ {
+ AFFIX *affix;
+ uint32 size = AffixGetSize(ConfBuild->Affix[i]);
- /* Keep everything maxaligned */
- size = MAXALIGN(size);
+ offsets[i] = offset;
+ affix = (AFFIX *) DictAffixGet(dict, i);
+ Assert(affix);
- /* Need more space? */
- if (size > Conf->avail)
- {
- Conf->firstfree = palloc0(COMPACT_ALLOC_CHUNK);
- Conf->avail = COMPACT_ALLOC_CHUNK;
- }
+ memcpy(affix, ConfBuild->Affix[i], size);
- result = (void *) Conf->firstfree;
- Conf->firstfree += size;
- Conf->avail -= size;
+ offset += size;
+ }
+ }
- return result;
+ /* DictNodes prefix tree */
+ dict->DictNodesStart = dict->AffixStart + offset;
+ /* We have at least one root node even if dictionary list is empty */
+ dict_node = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, 0);
+ Assert(dict_node && dict_node->length > 0);
+ /* Copy dictionary nodes into persistent structure */
+ memcpy(DictDictNodes(dict), ConfBuild->DictNodes.Nodes,
+ ConfBuild->DictNodes.NodesEnd);
+
+ /* PrefixNodes prefix tree */
+ dict->PrefixNodesStart = dict->DictNodesStart + ConfBuild->DictNodes.NodesEnd;
+ /* We have at least one root node even if prefix list is empty */
+ aff_node = (AffixNode *) NodeArrayGet(&ConfBuild->PrefixNodes, 0);
+ Assert(aff_node && aff_node->length > 0);
+ /* Copy prefix nodes into persistent structure */
+ memcpy(DictPrefixNodes(dict), ConfBuild->PrefixNodes.Nodes,
+ ConfBuild->PrefixNodes.NodesEnd);
+
+ /* SuffixNodes prefix tree */
+ dict->SuffixNodesStart = dict->PrefixNodesStart + ConfBuild->PrefixNodes.NodesEnd;
+ /* We have at least one root node even if suffix list is empty */
+ aff_node = (AffixNode *) NodeArrayGet(&ConfBuild->SuffixNodes, 0);
+ Assert(aff_node && aff_node->length > 0);
+ /* Copy suffix nodes into persistent structure */
+ memcpy(DictSuffixNodes(dict), ConfBuild->SuffixNodes.Nodes,
+ ConfBuild->SuffixNodes.NodesEnd);
+
+ /* CompoundAffix array */
+ dict->CompoundAffixStart = dict->SuffixNodesStart +
+ ConfBuild->SuffixNodes.NodesEnd;
+ /* We have at least one CompoundAffix terminating entry */
+ Assert(ConfBuild->nCompoundAffix > 0);
+ /* Copy array of compound affixes into persistent structure */
+ memcpy(DictCompoundAffix(dict), ConfBuild->CompoundAffix,
+ sizeof(CMPDAffix) * ConfBuild->nCompoundAffix);
}
-#define cpalloc(size) compact_palloc0(Conf, size)
-#define cpalloc0(size) compact_palloc0(Conf, size)
-
-static char *
-cpstrdup(IspellDict *Conf, const char *str)
+/*
+ * Clean up when dictionary construction is complete.
+ */
+void
+NIFinishBuild(IspellDictBuild *ConfBuild)
{
- char *res = cpalloc(strlen(str) + 1);
-
- strcpy(res, str);
- return res;
+ /* Release no-longer-needed temp memory */
+ MemoryContextDelete(ConfBuild->buildCxt);
+ /* Just for cleanliness, zero the now-dangling pointers */
+ ConfBuild->buildCxt = NULL;
+ ConfBuild->Spell = NULL;
+ ConfBuild->CompoundAffixFlags = NULL;
}
-
/*
* Apply lowerstr(), producing a temporary result (in the buildCxt).
*/
static char *
-lowerstr_ctx(IspellDict *Conf, const char *src)
+lowerstr_ctx(IspellDictBuild *ConfBuild, const char *src)
{
MemoryContext saveCtx;
char *dst;
- saveCtx = MemoryContextSwitchTo(Conf->buildCxt);
+ saveCtx = MemoryContextSwitchTo(ConfBuild->buildCxt);
dst = lowerstr(src);
MemoryContextSwitchTo(saveCtx);
@@ -190,7 +246,7 @@ lowerstr_ctx(IspellDict *Conf, const char *src)
#define STRNCMP(s,p) strncmp( (s), (p), strlen(p) )
#define GETWCHAR(W,L,N,T) ( ((const uint8*)(W))[ ((T)==FF_PREFIX) ? (N) : ( (L) - 1 - (N) ) ] )
-#define GETCHAR(A,N,T) GETWCHAR( (A)->repl, (A)->replen, N, T )
+#define GETCHAR(A,N,T) GETWCHAR( AffixFieldRepl(A), (A)->replen, N, T )
static char *VoidString = "";
@@ -311,18 +367,189 @@ strbncmp(const unsigned char *s1, const unsigned char *s2, size_t count)
static int
cmpaffix(const void *s1, const void *s2)
{
- const AFFIX *a1 = (const AFFIX *) s1;
- const AFFIX *a2 = (const AFFIX *) s2;
+ const AFFIX *a1 = *((AFFIX *const *) s1);
+ const AFFIX *a2 = *((AFFIX *const *) s2);
if (a1->type < a2->type)
return -1;
if (a1->type > a2->type)
return 1;
if (a1->type == FF_PREFIX)
- return strcmp(a1->repl, a2->repl);
+ return strcmp(AffixFieldRepl(a1), AffixFieldRepl(a2));
else
- return strbcmp((const unsigned char *) a1->repl,
- (const unsigned char *) a2->repl);
+ return strbcmp((const unsigned char *) AffixFieldRepl(a1),
+ (const unsigned char *) AffixFieldRepl(a2));
+}
+
+/*
+ * Allocate space for AffixData.
+ */
+static void
+InitAffixData(IspellDictBuild *ConfBuild, int numAffixData)
+{
+ uint32 size;
+
+ size = 8 * 1024; /* Reserve 8KB for data */
+
+ ConfBuild->AffixData = (char *) tmpalloc(size);
+ ConfBuild->AffixDataSize = size;
+ ConfBuild->AffixDataOffset = (uint32 *) tmpalloc(numAffixData * sizeof(uint32));
+ ConfBuild->nAffixData = 0;
+ ConfBuild->mAffixData = numAffixData;
+
+ /* Save offset of the end of data */
+ ConfBuild->AffixDataEnd = 0;
+}
+
+/*
+ * Add a set of affix flags to ConfBuild->AffixData, enlarging the buffer if
+ * the new set does not fit.
+ *
+ * ConfBuild: building structure for the current dictionary.
+ * AffixSet: set of affix flags (AffixSetLen bytes, not counting the \0).
+ */
+static void
+AddAffixSet(IspellDictBuild *ConfBuild, const char *AffixSet,
+ uint32 AffixSetLen)
+{
+ /*
+ * Check available space for AffixSet.
+ */
+ if (ConfBuild->AffixDataEnd + AffixSetLen + 1 /* \0 */ >=
+ ConfBuild->AffixDataSize)
+ {
+ uint32 newsize = Max(ConfBuild->AffixDataSize + 8 * 1024 /* 8KB */,
+ ConfBuild->AffixDataSize + AffixSetLen + 1);
+
+ ConfBuild->AffixData = (char *) repalloc(ConfBuild->AffixData, newsize);
+ ConfBuild->AffixDataSize = newsize;
+ }
+
+ /* Check available number of offsets */
+ if (ConfBuild->nAffixData >= ConfBuild->mAffixData)
+ {
+ ConfBuild->mAffixData *= 2;
+ ConfBuild->AffixDataOffset = (uint32 *) repalloc(ConfBuild->AffixDataOffset,
+ sizeof(uint32) * ConfBuild->mAffixData);
+ }
+
+ ConfBuild->AffixDataOffset[ConfBuild->nAffixData] = ConfBuild->AffixDataEnd;
+ StrNCpy(AffixDataGet(ConfBuild, ConfBuild->nAffixData),
+ AffixSet, AffixSetLen + 1);
+
+ /* Save offset of the end of data */
+ ConfBuild->AffixDataEnd += AffixSetLen + 1;
+ ConfBuild->nAffixData++;
+}
+
+/*
+ * Allocate space for prefix tree node.
+ *
+ * ConfBuild: building structure for the current dictionary.
+ * array: NodeArray where to allocate new node.
+ * length: number of allocated NodeData.
+ * sizeNodeData: minimum size of each NodeData.
+ * sizeNodeHeader: size of header of new node.
+ *
+ * Returns an offset of new node in NodeArray->Nodes.
+ */
+static uint32
+AllocateNode(IspellDictBuild *ConfBuild, NodeArray *array, uint32 length,
+ uint32 sizeNodeData, uint32 sizeNodeHeader)
+{
+ uint32 node_offset;
+ uint32 size;
+
+ size = sizeNodeHeader + length * sizeNodeData;
+ size = MAXALIGN(size);
+
+ if (array->NodesSize == 0)
+ {
+ array->NodesSize = size * 32; /* Reserve space for next levels of the
+ * prefix tree */
+ array->Nodes = (char *) tmpalloc(array->NodesSize);
+ array->NodesEnd = 0;
+ }
+ else if (array->NodesEnd + size >= array->NodesSize)
+ {
+ array->NodesSize = Max(array->NodesSize * 2, array->NodesSize + size);
+ array->Nodes = (char *) repalloc(array->Nodes, array->NodesSize);
+ }
+
+ node_offset = array->NodesEnd;
+ array->NodesEnd += size;
+
+ return node_offset;
+}
+
+/*
+ * Allocate space for SPNode.
+ *
+ * Returns an offset of new node in ConfBuild->DictNodes->Nodes.
+ */
+static uint32
+AllocateSPNode(IspellDictBuild *ConfBuild, uint32 length)
+{
+ uint32 offset;
+ SPNode *node;
+ SPNodeData *data;
+ uint32 i;
+
+ offset = AllocateNode(ConfBuild, &ConfBuild->DictNodes, length,
+ sizeof(SPNodeData), SPNHDRSZ);
+ node = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, offset);
+ node->length = length;
+
+ /*
+ * Initialize all SPNodeData with default values. We cannot use memset()
+ * here because not all fields have 0 as default value.
+ */
+ for (i = 0; i < length; i++)
+ {
+ data = &(node->data[i]);
+ data->val = 0;
+ data->affix = ISPELL_INVALID_INDEX;
+ data->compoundflag = 0;
+ data->isword = 0;
+ data->node_offset = ISPELL_INVALID_OFFSET;
+ }
+
+ return offset;
+}
+
+/*
+ * Allocate space for AffixNode.
+ *
+ * Returns an offset of new node in NodeArray->Nodes.
+ */
+static uint32
+AllocateAffixNode(IspellDictBuild *ConfBuild, NodeArray *array, uint32 length)
+{
+ uint32 offset;
+ AffixNode *node;
+ AffixNodeData *data;
+ uint32 i;
+
+ offset = AllocateNode(ConfBuild, array, length, sizeof(AffixNodeData),
+ ANHRDSZ);
+ node = (AffixNode *) NodeArrayGet(array, offset);
+ node->length = length;
+ node->isvoid = 0;
+
+ /*
+ * Initialize all AffixNodeData with default values. We cannot use memset()
+ * here because not all fields have 0 as default value.
+ */
+ for (i = 0; i < length; i++)
+ {
+ data = &(node->data[i]);
+ data->val = 0;
+ data->affstart = ISPELL_INVALID_INDEX;
+ data->affend = ISPELL_INVALID_INDEX;
+ data->node_offset = ISPELL_INVALID_OFFSET;
+ }
+
+ return offset;
}
/*
@@ -333,7 +560,7 @@ cmpaffix(const void *s1, const void *s2)
* - 2 characters (FM_LONG). A character may be Unicode.
* - numbers from 1 to 65000 (FM_NUM).
*
- * Depending on the flagMode an affix string can have the following format:
+ * Depending on the flagmode an affix string can have the following format:
* - FM_CHAR: ABCD
* Here we have 4 flags: A, B, C and D
* - FM_LONG: ABCDE*
@@ -341,13 +568,13 @@ cmpaffix(const void *s1, const void *s2)
* - FM_NUM: 200,205,50
* Here we have 3 flags: 200, 205 and 50
*
- * Conf: current dictionary.
+ * flagmode: flag mode of the dictionary.
* sflagset: the set of affix flags. Returns a reference to the start of a next
* affix flag.
* sflag: returns an affix flag from sflagset.
*/
static void
-getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
+getNextFlagFromString(FlagMode flagmode, char **sflagset, char *sflag)
{
int32 s;
char *next,
@@ -356,11 +583,11 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
bool stop = false;
bool met_comma = false;
- maxstep = (Conf->flagMode == FM_LONG) ? 2 : 1;
+ maxstep = (flagmode == FM_LONG) ? 2 : 1;
while (**sflagset)
{
- switch (Conf->flagMode)
+ switch (flagmode)
{
case FM_LONG:
case FM_CHAR:
@@ -422,15 +649,15 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
stop = true;
break;
default:
- elog(ERROR, "unrecognized type of Conf->flagMode: %d",
- Conf->flagMode);
+ elog(ERROR, "unrecognized type of flagmode: %d",
+ flagmode);
}
if (stop)
break;
}
- if (Conf->flagMode == FM_LONG && maxstep > 0)
+ if (flagmode == FM_LONG && maxstep > 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix flag \"%s\" with \"long\" flag value",
@@ -440,31 +667,28 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
}
/*
- * Checks if the affix set Conf->AffixData[affix] contains affixflag.
- * Conf->AffixData[affix] does not contain affixflag if this flag is not used
- * actually by the .dict file.
+ * Checks if the affix set from AffixData contains affixflag. The affix set
+ * does not contain affixflag if the flag is not actually used by the .dict file.
*
- * Conf: current dictionary.
- * affix: index of the Conf->AffixData array.
+ * flagmode: flag mode of the dictionary.
+ * sflagset: the set of affix flags.
* affixflag: the affix flag.
*
- * Returns true if the string Conf->AffixData[affix] contains affixflag,
- * otherwise returns false.
+ * Returns true if the affix set string contains affixflag, otherwise returns
+ * false.
*/
static bool
-IsAffixFlagInUse(IspellDict *Conf, int affix, const char *affixflag)
+IsAffixFlagInUse(FlagMode flagmode, char *sflagset, const char *affixflag)
{
- char *flagcur;
+ char *flagcur = sflagset;
char flag[BUFSIZ];
if (*affixflag == 0)
return true;
- flagcur = Conf->AffixData[affix];
-
while (*flagcur)
{
- getNextFlagFromString(Conf, &flagcur, flag);
+ getNextFlagFromString(flagmode, &flagcur, flag);
/* Compare first affix flag in flagcur with affixflag */
if (strcmp(flag, affixflag) == 0)
return true;
@@ -477,31 +701,33 @@ IsAffixFlagInUse(IspellDict *Conf, int affix, const char *affixflag)
/*
* Adds the new word into the temporary array Spell.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* word: new word.
* flag: set of affix flags. Single flag can be get by getNextFlagFromString().
*/
static void
-NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
+NIAddSpell(IspellDictBuild *ConfBuild, const char *word, const char *flag)
{
- if (Conf->nspell >= Conf->mspell)
+ if (ConfBuild->nSpell >= ConfBuild->mSpell)
{
- if (Conf->mspell)
+ if (ConfBuild->mSpell)
{
- Conf->mspell *= 2;
- Conf->Spell = (SPELL **) repalloc(Conf->Spell, Conf->mspell * sizeof(SPELL *));
+ ConfBuild->mSpell *= 2;
+ ConfBuild->Spell = (SPELL **) repalloc(ConfBuild->Spell,
+ ConfBuild->mSpell * sizeof(SPELL *));
}
else
{
- Conf->mspell = 1024 * 20;
- Conf->Spell = (SPELL **) tmpalloc(Conf->mspell * sizeof(SPELL *));
+ ConfBuild->mSpell = 1024 * 20;
+ ConfBuild->Spell = (SPELL **) tmpalloc(ConfBuild->mSpell * sizeof(SPELL *));
}
}
- Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
- strcpy(Conf->Spell[Conf->nspell]->word, word);
- Conf->Spell[Conf->nspell]->p.flag = (*flag != '\0')
+ ConfBuild->Spell[ConfBuild->nSpell] =
+ (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
+ strcpy(ConfBuild->Spell[ConfBuild->nSpell]->word, word);
+ ConfBuild->Spell[ConfBuild->nSpell]->p.flag = (*flag != '\0')
? tmpstrdup(flag) : VoidString;
- Conf->nspell++;
+ ConfBuild->nSpell++;
}
/*
@@ -509,11 +735,11 @@ NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
*
* Note caller must already have applied get_tsearch_config_filename.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* filename: path to the .dict file.
*/
void
-NIImportDictionary(IspellDict *Conf, const char *filename)
+NIImportDictionary(IspellDictBuild *ConfBuild, const char *filename)
{
tsearch_readline_state trst;
char *line;
@@ -564,9 +790,9 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
}
s += pg_mblen(s);
}
- pstr = lowerstr_ctx(Conf, line);
+ pstr = lowerstr_ctx(ConfBuild, line);
- NIAddSpell(Conf, pstr, flag);
+ NIAddSpell(ConfBuild, pstr, flag);
pfree(pstr);
pfree(line);
@@ -590,7 +816,7 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
* SFX M 0 's .
* is presented here.
*
- * Conf: current dictionary.
+ * dict: current dictionary.
* word: basic form of word.
* affixflag: affix flag, by which a basic form of word was generated.
* flag: compound flag used to compare with StopMiddle->compoundflag.
@@ -598,9 +824,9 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
* Returns 1 if the word was found in the prefix tree, else returns 0.
*/
static int
-FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
+FindWord(IspellDictData *dict, const char *word, const char *affixflag, int flag)
{
- SPNode *node = Conf->Dictionary;
+ SPNode *node = (SPNode *) DictDictNodes(dict);
SPNodeData *StopLow,
*StopHigh,
*StopMiddle;
@@ -636,10 +862,14 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
* Check if this affix rule is presented in the affix set
* with index StopMiddle->affix.
*/
- if (IsAffixFlagInUse(Conf, StopMiddle->affix, affixflag))
+ if (IsAffixFlagInUse(dict->flagMode,
+ DictAffixDataGet(dict, StopMiddle->affix),
+ affixflag))
return 1;
}
- node = StopMiddle->node;
+ /* Retrieve SPNode by the offset */
+ node = (SPNode *) DictNodeGet(DictDictNodes(dict),
+ StopMiddle->node_offset);
ptr++;
break;
}
@@ -657,7 +887,8 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
/*
* Adds a new affix rule to the Affix field.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary, is used to allocate
+ * temporary data.
* flag: affix flag ('\' in the below example).
* flagflags: set of flags from the flagval field for this affix rule. This set
* is listed after '/' character in the added string (repl).
@@ -673,26 +904,54 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
* type: FF_SUFFIX or FF_PREFIX.
*/
static void
-NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
- const char *find, const char *repl, int type)
+NIAddAffix(IspellDictBuild *ConfBuild, const char *flag, char flagflags,
+ const char *mask, const char *find, const char *repl, int type)
{
AFFIX *Affix;
+ uint32 size;
+ uint32 flaglen = strlen(flag),
+ findlen = strlen(find),
+ repllen = strlen(repl),
+ masklen = strlen(mask);
+
+ /* Sanity checks */
+ if (flaglen > AF_FLAG_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix flag \"%s\" too long", flag)));
+ if (findlen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix find field \"%s\" too long", find)));
+ if (repllen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix repl field \"%s\" too long", repl)));
+ if (masklen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix mask field \"%s\" too long", mask)));
- if (Conf->naffixes >= Conf->maffixes)
+ if (ConfBuild->nAffix >= ConfBuild->mAffix)
{
- if (Conf->maffixes)
+ if (ConfBuild->mAffix)
{
- Conf->maffixes *= 2;
- Conf->Affix = (AFFIX *) repalloc((void *) Conf->Affix, Conf->maffixes * sizeof(AFFIX));
+ ConfBuild->mAffix *= 2;
+ ConfBuild->Affix = (AFFIX **) repalloc(ConfBuild->Affix,
+ ConfBuild->mAffix * sizeof(AFFIX *));
}
else
{
- Conf->maffixes = 16;
- Conf->Affix = (AFFIX *) palloc(Conf->maffixes * sizeof(AFFIX));
+ ConfBuild->mAffix = 255;
+ ConfBuild->Affix = (AFFIX **) tmpalloc(ConfBuild->mAffix * sizeof(AFFIX *));
}
}
- Affix = Conf->Affix + Conf->naffixes;
+ size = AFFIXHDRSZ + flaglen + 1 /* \0 */ + findlen + 1 /* \0 */ +
+ repllen + 1 /* \0 */ + masklen + 1 /* \0 */;
+
+ Affix = (AFFIX *) tmpalloc(size);
+ ConfBuild->Affix[ConfBuild->nAffix] = Affix;
/* This affix rule can be applied for words with any ending */
if (strcmp(mask, ".") == 0 || *mask == '\0')
@@ -705,42 +964,12 @@ NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
{
Affix->issimple = 0;
Affix->isregis = 1;
- RS_compile(&(Affix->reg.regis), (type == FF_SUFFIX),
- *mask ? mask : VoidString);
}
/* This affix rule will use regex_t to search word ending */
else
{
- int masklen;
- int wmasklen;
- int err;
- pg_wchar *wmask;
- char *tmask;
-
Affix->issimple = 0;
Affix->isregis = 0;
- tmask = (char *) tmpalloc(strlen(mask) + 3);
- if (type == FF_SUFFIX)
- sprintf(tmask, "%s$", mask);
- else
- sprintf(tmask, "^%s", mask);
-
- masklen = strlen(tmask);
- wmask = (pg_wchar *) tmpalloc((masklen + 1) * sizeof(pg_wchar));
- wmasklen = pg_mb2wchar_with_len(tmask, wmask, masklen);
-
- err = pg_regcomp(&(Affix->reg.regex), wmask, wmasklen,
- REG_ADVANCED | REG_NOSUB,
- DEFAULT_COLLATION_OID);
- if (err)
- {
- char errstr[100];
-
- pg_regerror(err, &(Affix->reg.regex), errstr, sizeof(errstr));
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
- errmsg("invalid regular expression: %s", errstr)));
- }
}
Affix->flagflags = flagflags;
@@ -749,15 +978,22 @@ NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
if ((Affix->flagflags & FF_COMPOUNDFLAG) == 0)
Affix->flagflags |= FF_COMPOUNDFLAG;
}
- Affix->flag = cpstrdup(Conf, flag);
+
Affix->type = type;
- Affix->find = (find && *find) ? cpstrdup(Conf, find) : VoidString;
- if ((Affix->replen = strlen(repl)) > 0)
- Affix->repl = cpstrdup(Conf, repl);
- else
- Affix->repl = VoidString;
- Conf->naffixes++;
+ Affix->replen = repllen;
+ StrNCpy(AffixFieldRepl(Affix), repl, repllen + 1);
+
+ Affix->findlen = findlen;
+ StrNCpy(AffixFieldFind(Affix), find, findlen + 1);
+
+ Affix->masklen = masklen;
+ StrNCpy(AffixFieldMask(Affix), mask, masklen + 1);
+
+ StrNCpy(AffixFieldFlag(Affix), flag, flaglen + 1);
+
+ ConfBuild->nAffix++;
+ ConfBuild->AffixSize += size;
}
/* Parsing states for parse_affentry() and friends */
@@ -1021,10 +1257,10 @@ parse_affentry(char *str, char *mask, char *find, char *repl)
* Sets a Hunspell options depending on flag type.
*/
static void
-setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
+setCompoundAffixFlagValue(IspellDictBuild *ConfBuild, CompoundAffixFlag *entry,
char *s, uint32 val)
{
- if (Conf->flagMode == FM_NUM)
+ if (ConfBuild->dict->flagMode == FM_NUM)
{
char *next;
int i;
@@ -1044,19 +1280,19 @@ setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
else
entry->flag.s = tmpstrdup(s);
- entry->flagMode = Conf->flagMode;
+ entry->flagMode = ConfBuild->dict->flagMode;
entry->value = val;
}
/*
* Sets up a correspondence for the affix parameter with the affix flag.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* s: affix flag in string.
* val: affix parameter.
*/
static void
-addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
+addCompoundAffixFlagValue(IspellDictBuild *ConfBuild, char *s, uint32 val)
{
CompoundAffixFlag *newValue;
char sbuf[BUFSIZ];
@@ -1083,29 +1319,29 @@ addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
*sflag = '\0';
/* Resize array or allocate memory for array CompoundAffixFlag */
- if (Conf->nCompoundAffixFlag >= Conf->mCompoundAffixFlag)
+ if (ConfBuild->nCompoundAffixFlag >= ConfBuild->mCompoundAffixFlag)
{
- if (Conf->mCompoundAffixFlag)
+ if (ConfBuild->mCompoundAffixFlag)
{
- Conf->mCompoundAffixFlag *= 2;
- Conf->CompoundAffixFlags = (CompoundAffixFlag *)
- repalloc((void *) Conf->CompoundAffixFlags,
- Conf->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
+ ConfBuild->mCompoundAffixFlag *= 2;
+ ConfBuild->CompoundAffixFlags = (CompoundAffixFlag *)
+ repalloc((void *) ConfBuild->CompoundAffixFlags,
+ ConfBuild->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
}
else
{
- Conf->mCompoundAffixFlag = 10;
- Conf->CompoundAffixFlags = (CompoundAffixFlag *)
- tmpalloc(Conf->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
+ ConfBuild->mCompoundAffixFlag = 10;
+ ConfBuild->CompoundAffixFlags = (CompoundAffixFlag *)
+ tmpalloc(ConfBuild->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
}
}
- newValue = Conf->CompoundAffixFlags + Conf->nCompoundAffixFlag;
+ newValue = ConfBuild->CompoundAffixFlags + ConfBuild->nCompoundAffixFlag;
- setCompoundAffixFlagValue(Conf, newValue, sbuf, val);
+ setCompoundAffixFlagValue(ConfBuild, newValue, sbuf, val);
- Conf->usecompound = true;
- Conf->nCompoundAffixFlag++;
+ ConfBuild->dict->usecompound = true;
+ ConfBuild->nCompoundAffixFlag++;
}
/*
@@ -1113,7 +1349,7 @@ addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
* flags s.
*/
static int
-getCompoundAffixFlagValue(IspellDict *Conf, char *s)
+getCompoundAffixFlagValue(IspellDictBuild *ConfBuild, char *s)
{
uint32 flag = 0;
CompoundAffixFlag *found,
@@ -1121,18 +1357,18 @@ getCompoundAffixFlagValue(IspellDict *Conf, char *s)
char sflag[BUFSIZ];
char *flagcur;
- if (Conf->nCompoundAffixFlag == 0)
+ if (ConfBuild->nCompoundAffixFlag == 0)
return 0;
flagcur = s;
while (*flagcur)
{
- getNextFlagFromString(Conf, &flagcur, sflag);
- setCompoundAffixFlagValue(Conf, &key, sflag, 0);
+ getNextFlagFromString(ConfBuild->dict->flagMode, &flagcur, sflag);
+ setCompoundAffixFlagValue(ConfBuild, &key, sflag, 0);
found = (CompoundAffixFlag *)
- bsearch(&key, (void *) Conf->CompoundAffixFlags,
- Conf->nCompoundAffixFlag, sizeof(CompoundAffixFlag),
+ bsearch(&key, (void *) ConfBuild->CompoundAffixFlags,
+ ConfBuild->nCompoundAffixFlag, sizeof(CompoundAffixFlag),
cmpcmdflag);
if (found != NULL)
flag |= found->value;
@@ -1144,14 +1380,13 @@ getCompoundAffixFlagValue(IspellDict *Conf, char *s)
/*
* Returns a flag set using the s parameter.
*
- * If Conf->useFlagAliases is true then the s parameter is index of the
- * Conf->AffixData array and function returns its entry.
- * Else function returns the s parameter.
+ * If useFlagAliases is true, the s parameter is an index into the AffixData
+ * array and the function returns that entry. Otherwise it returns s itself.
*/
static char *
-getAffixFlagSet(IspellDict *Conf, char *s)
+getAffixFlagSet(IspellDictBuild *ConfBuild, char *s)
{
- if (Conf->useFlagAliases && *s != '\0')
+ if (ConfBuild->dict->useFlagAliases && *s != '\0')
{
int curaffix;
char *end;
@@ -1162,13 +1397,13 @@ getAffixFlagSet(IspellDict *Conf, char *s)
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix alias \"%s\"", s)));
- if (curaffix > 0 && curaffix <= Conf->nAffixData)
+ if (curaffix > 0 && curaffix <= ConfBuild->nAffixData)
/*
* Do not subtract 1 from curaffix because empty string was added
* in NIImportOOAffixes
*/
- return Conf->AffixData[curaffix];
+ return AffixDataGet(ConfBuild, curaffix);
else
return VoidString;
}
@@ -1179,11 +1414,11 @@ getAffixFlagSet(IspellDict *Conf, char *s)
/*
* Import an affix file that follows MySpell or Hunspell format.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* filename: path to the .affix file.
*/
static void
-NIImportOOAffixes(IspellDict *Conf, const char *filename)
+NIImportOOAffixes(IspellDictBuild *ConfBuild, const char *filename)
{
char type[BUFSIZ],
*ptype = NULL;
@@ -1203,9 +1438,9 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
char *recoded;
/* read file to find any flag */
- Conf->usecompound = false;
- Conf->useFlagAliases = false;
- Conf->flagMode = FM_CHAR;
+ ConfBuild->dict->usecompound = false;
+ ConfBuild->dict->useFlagAliases = false;
+ ConfBuild->dict->flagMode = FM_CHAR;
if (!tsearch_readline_begin(&trst, filename))
ereport(ERROR,
@@ -1222,30 +1457,36 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
}
if (STRNCMP(recoded, "COMPOUNDFLAG") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDFLAG"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDFLAG"),
FF_COMPOUNDFLAG);
else if (STRNCMP(recoded, "COMPOUNDBEGIN") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDBEGIN"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDBEGIN"),
FF_COMPOUNDBEGIN);
else if (STRNCMP(recoded, "COMPOUNDLAST") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDLAST"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDLAST"),
FF_COMPOUNDLAST);
/* COMPOUNDLAST and COMPOUNDEND are synonyms */
else if (STRNCMP(recoded, "COMPOUNDEND") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDEND"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDEND"),
FF_COMPOUNDLAST);
else if (STRNCMP(recoded, "COMPOUNDMIDDLE") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDMIDDLE"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDMIDDLE"),
FF_COMPOUNDMIDDLE);
else if (STRNCMP(recoded, "ONLYINCOMPOUND") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("ONLYINCOMPOUND"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("ONLYINCOMPOUND"),
FF_COMPOUNDONLY);
else if (STRNCMP(recoded, "COMPOUNDPERMITFLAG") == 0)
- addCompoundAffixFlagValue(Conf,
+ addCompoundAffixFlagValue(ConfBuild,
recoded + strlen("COMPOUNDPERMITFLAG"),
FF_COMPOUNDPERMITFLAG);
else if (STRNCMP(recoded, "COMPOUNDFORBIDFLAG") == 0)
- addCompoundAffixFlagValue(Conf,
+ addCompoundAffixFlagValue(ConfBuild,
recoded + strlen("COMPOUNDFORBIDFLAG"),
FF_COMPOUNDFORBIDFLAG);
else if (STRNCMP(recoded, "FLAG") == 0)
@@ -1258,9 +1499,9 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
if (*s)
{
if (STRNCMP(s, "long") == 0)
- Conf->flagMode = FM_LONG;
+ ConfBuild->dict->flagMode = FM_LONG;
else if (STRNCMP(s, "num") == 0)
- Conf->flagMode = FM_NUM;
+ ConfBuild->dict->flagMode = FM_NUM;
else if (STRNCMP(s, "default") != 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
@@ -1274,8 +1515,8 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
}
tsearch_readline_end(&trst);
- if (Conf->nCompoundAffixFlag > 1)
- qsort((void *) Conf->CompoundAffixFlags, Conf->nCompoundAffixFlag,
+ if (ConfBuild->nCompoundAffixFlag > 1)
+ qsort((void *) ConfBuild->CompoundAffixFlags, ConfBuild->nCompoundAffixFlag,
sizeof(CompoundAffixFlag), cmpcmdflag);
if (!tsearch_readline_begin(&trst, filename))
@@ -1295,15 +1536,15 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
if (ptype)
pfree(ptype);
- ptype = lowerstr_ctx(Conf, type);
+ ptype = lowerstr_ctx(ConfBuild, type);
/* First try to parse AF parameter (alias compression) */
if (STRNCMP(ptype, "af") == 0)
{
/* First line is the number of aliases */
- if (!Conf->useFlagAliases)
+ if (!ConfBuild->dict->useFlagAliases)
{
- Conf->useFlagAliases = true;
+ ConfBuild->dict->useFlagAliases = true;
naffix = atoi(sflag);
if (naffix <= 0)
ereport(ERROR,
@@ -1313,11 +1554,10 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
/* Also reserve place for empty flag set */
naffix++;
- Conf->AffixData = (char **) palloc0(naffix * sizeof(char *));
- Conf->lenAffixData = Conf->nAffixData = naffix;
+ InitAffixData(ConfBuild, naffix);
/* Add empty flag set into AffixData */
- Conf->AffixData[curaffix] = VoidString;
+ AddAffixSet(ConfBuild, VoidString, 0);
curaffix++;
}
/* Other lines are aliases */
@@ -1325,7 +1565,7 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
{
if (curaffix < naffix)
{
- Conf->AffixData[curaffix] = cpstrdup(Conf, sflag);
+ AddAffixSet(ConfBuild, sflag, strlen(sflag));
curaffix++;
}
else
@@ -1343,8 +1583,8 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
sflaglen = strlen(sflag);
if (sflaglen == 0
- || (sflaglen > 1 && Conf->flagMode == FM_CHAR)
- || (sflaglen > 2 && Conf->flagMode == FM_LONG))
+ || (sflaglen > 1 && ConfBuild->dict->flagMode == FM_CHAR)
+ || (sflaglen > 2 && ConfBuild->dict->flagMode == FM_LONG))
goto nextline;
/*--------
@@ -1372,21 +1612,21 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
/* Get flags after '/' (flags are case sensitive) */
if ((ptr = strchr(repl, '/')) != NULL)
- aflg |= getCompoundAffixFlagValue(Conf,
- getAffixFlagSet(Conf,
+ aflg |= getCompoundAffixFlagValue(ConfBuild,
+ getAffixFlagSet(ConfBuild,
ptr + 1));
/* Get lowercased version of string before '/' */
- prepl = lowerstr_ctx(Conf, repl);
+ prepl = lowerstr_ctx(ConfBuild, repl);
if ((ptr = strchr(prepl, '/')) != NULL)
*ptr = '\0';
- pfind = lowerstr_ctx(Conf, find);
- pmask = lowerstr_ctx(Conf, mask);
+ pfind = lowerstr_ctx(ConfBuild, find);
+ pmask = lowerstr_ctx(ConfBuild, mask);
if (t_iseq(find, '0'))
*pfind = '\0';
if (t_iseq(repl, '0'))
*prepl = '\0';
- NIAddAffix(Conf, sflag, flagflags | aflg, pmask, pfind, prepl,
+ NIAddAffix(ConfBuild, sflag, flagflags | aflg, pmask, pfind, prepl,
isSuffix ? FF_SUFFIX : FF_PREFIX);
pfree(prepl);
pfree(pfind);
@@ -1412,7 +1652,7 @@ nextline:
* work to NIImportOOAffixes(), which will re-read the whole file.
*/
void
-NIImportAffixes(IspellDict *Conf, const char *filename)
+NIImportAffixes(IspellDictBuild *ConfBuild, const char *filename)
{
char *pstr = NULL;
char flag[BUFSIZ];
@@ -1433,9 +1673,9 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
errmsg("could not open affix file \"%s\": %m",
filename)));
- Conf->usecompound = false;
- Conf->useFlagAliases = false;
- Conf->flagMode = FM_CHAR;
+ ConfBuild->dict->usecompound = false;
+ ConfBuild->dict->useFlagAliases = false;
+ ConfBuild->dict->flagMode = FM_CHAR;
while ((recoded = tsearch_readline(&trst)) != NULL)
{
@@ -1457,10 +1697,8 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
s += pg_mblen(s);
if (*s && pg_mblen(s) == 1)
- {
- addCompoundAffixFlagValue(Conf, s, FF_COMPOUNDFLAG);
- Conf->usecompound = true;
- }
+ addCompoundAffixFlagValue(ConfBuild, s, FF_COMPOUNDFLAG);
+
oldformat = true;
goto nextline;
}
@@ -1533,7 +1771,8 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
if (!parse_affentry(pstr, mask, find, repl))
goto nextline;
- NIAddAffix(Conf, flag, flagflags, mask, find, repl, suffixes ? FF_SUFFIX : FF_PREFIX);
+ NIAddAffix(ConfBuild, flag, flagflags, mask, find, repl,
+ suffixes ? FF_SUFFIX : FF_PREFIX);
nextline:
pfree(recoded);
@@ -1552,53 +1791,48 @@ isnewformat:
errmsg("affix file contains both old-style and new-style commands")));
tsearch_readline_end(&trst);
- NIImportOOAffixes(Conf, filename);
+ NIImportOOAffixes(ConfBuild, filename);
}
/*
* Merges two affix flag sets and stores a new affix flag set into
- * Conf->AffixData.
+ * ConfBuild->AffixData.
*
* Returns index of a new affix flag set.
*/
static int
-MergeAffix(IspellDict *Conf, int a1, int a2)
+MergeAffix(IspellDictBuild *ConfBuild, int a1, int a2)
{
- char **ptr;
+ char *ptr;
+ uint32 len;
/* Do not merge affix flags if one of affix flags is empty */
- if (*Conf->AffixData[a1] == '\0')
+ if (*AffixDataGet(ConfBuild, a1) == '\0')
return a2;
- else if (*Conf->AffixData[a2] == '\0')
+ else if (*AffixDataGet(ConfBuild, a2) == '\0')
return a1;
- while (Conf->nAffixData + 1 >= Conf->lenAffixData)
- {
- Conf->lenAffixData *= 2;
- Conf->AffixData = (char **) repalloc(Conf->AffixData,
- sizeof(char *) * Conf->lenAffixData);
- }
-
- ptr = Conf->AffixData + Conf->nAffixData;
- if (Conf->flagMode == FM_NUM)
+ if (ConfBuild->dict->flagMode == FM_NUM)
{
- *ptr = cpalloc(strlen(Conf->AffixData[a1]) +
- strlen(Conf->AffixData[a2]) +
- 1 /* comma */ + 1 /* \0 */ );
- sprintf(*ptr, "%s,%s", Conf->AffixData[a1], Conf->AffixData[a2]);
+ len = strlen(AffixDataGet(ConfBuild, a1)) + 1 /* comma */ +
+ strlen(AffixDataGet(ConfBuild, a2));
+ ptr = tmpalloc(len + 1 /* \0 */);
+ sprintf(ptr, "%s,%s", AffixDataGet(ConfBuild, a1),
+ AffixDataGet(ConfBuild, a2));
}
else
{
- *ptr = cpalloc(strlen(Conf->AffixData[a1]) +
- strlen(Conf->AffixData[a2]) +
- 1 /* \0 */ );
- sprintf(*ptr, "%s%s", Conf->AffixData[a1], Conf->AffixData[a2]);
+ len = strlen(AffixDataGet(ConfBuild, a1)) +
+ strlen(AffixDataGet(ConfBuild, a2));
+ ptr = tmpalloc(len + 1 /* \0 */ );
+ sprintf(ptr, "%s%s", AffixDataGet(ConfBuild, a1),
+ AffixDataGet(ConfBuild, a2));
}
- ptr++;
- *ptr = NULL;
- Conf->nAffixData++;
- return Conf->nAffixData - 1;
+ AddAffixSet(ConfBuild, ptr, len);
+ pfree(ptr);
+
+ return ConfBuild->nAffixData - 1;
}
/*
@@ -1606,66 +1840,87 @@ MergeAffix(IspellDict *Conf, int a1, int a2)
* flags with the given index.
*/
static uint32
-makeCompoundFlags(IspellDict *Conf, int affix)
+makeCompoundFlags(IspellDictBuild *ConfBuild, int affix)
{
- char *str = Conf->AffixData[affix];
+ char *str = AffixDataGet(ConfBuild, affix);
- return (getCompoundAffixFlagValue(Conf, str) & FF_COMPOUNDFLAGMASK);
+ return (getCompoundAffixFlagValue(ConfBuild, str) & FF_COMPOUNDFLAGMASK);
}
/*
* Makes a prefix tree for the given level.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* low: lower index of the Conf->Spell array.
* high: upper index of the Conf->Spell array.
* level: current prefix tree level.
+ *
+ * Returns an offset of SPNode in DictNodes.
*/
-static SPNode *
-mkSPNode(IspellDict *Conf, int low, int high, int level)
+static uint32
+mkSPNode(IspellDictBuild *ConfBuild, int low, int high, int level)
{
int i;
int nchar = 0;
char lastchar = '\0';
+ uint32 rs_offset,
+ new_offset;
SPNode *rs;
SPNodeData *data;
+ int data_index = 0;
int lownew = low;
for (i = low; i < high; i++)
- if (Conf->Spell[i]->p.d.len > level && lastchar != Conf->Spell[i]->word[level])
+ if (ConfBuild->Spell[i]->p.d.len > level &&
+ lastchar != ConfBuild->Spell[i]->word[level])
{
nchar++;
- lastchar = Conf->Spell[i]->word[level];
+ lastchar = ConfBuild->Spell[i]->word[level];
}
if (!nchar)
- return NULL;
+ return ISPELL_INVALID_OFFSET;
- rs = (SPNode *) cpalloc0(SPNHDRSZ + nchar * sizeof(SPNodeData));
- rs->length = nchar;
+ rs_offset = AllocateSPNode(ConfBuild, nchar);
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
data = rs->data;
lastchar = '\0';
for (i = low; i < high; i++)
- if (Conf->Spell[i]->p.d.len > level)
+ if (ConfBuild->Spell[i]->p.d.len > level)
{
- if (lastchar != Conf->Spell[i]->word[level])
+ if (lastchar != ConfBuild->Spell[i]->word[level])
{
if (lastchar)
{
/* Next level of the prefix tree */
- data->node = mkSPNode(Conf, lownew, i, level + 1);
+ new_offset = mkSPNode(ConfBuild, lownew, i, level + 1);
+
+ /*
+ * ConfBuild->DictNodes can be repalloc'ed within
+ * mkSPNode(), so reinitialize pointers.
+ */
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
+
+ /* First save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ /* Work with next node */
+ data_index++;
+ Assert(data_index < nchar);
+ data = &(rs->data[data_index]);
+
lownew = i;
- data++;
}
- lastchar = Conf->Spell[i]->word[level];
+ lastchar = ConfBuild->Spell[i]->word[level];
}
- data->val = ((uint8 *) (Conf->Spell[i]->word))[level];
- if (Conf->Spell[i]->p.d.len == level + 1)
+ data->val = ((uint8 *) (ConfBuild->Spell[i]->word))[level];
+ if (ConfBuild->Spell[i]->p.d.len == level + 1)
{
bool clearCompoundOnly = false;
- if (data->isword && data->affix != Conf->Spell[i]->p.d.affix)
+ if (data->isword && data->affix != ConfBuild->Spell[i]->p.d.affix)
{
/*
* MergeAffix called a few times. If one of word is
@@ -1674,15 +1929,17 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
*/
clearCompoundOnly = (FF_COMPOUNDONLY & data->compoundflag
- & makeCompoundFlags(Conf, Conf->Spell[i]->p.d.affix))
+ & makeCompoundFlags(ConfBuild,
+ ConfBuild->Spell[i]->p.d.affix))
? false : true;
- data->affix = MergeAffix(Conf, data->affix, Conf->Spell[i]->p.d.affix);
+ data->affix = MergeAffix(ConfBuild, data->affix,
+ ConfBuild->Spell[i]->p.d.affix);
}
else
- data->affix = Conf->Spell[i]->p.d.affix;
+ data->affix = ConfBuild->Spell[i]->p.d.affix;
data->isword = 1;
- data->compoundflag = makeCompoundFlags(Conf, data->affix);
+ data->compoundflag = makeCompoundFlags(ConfBuild, data->affix);
if ((data->compoundflag & FF_COMPOUNDONLY) &&
(data->compoundflag & FF_COMPOUNDFLAG) == 0)
@@ -1694,9 +1951,19 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
}
/* Next level of the prefix tree */
- data->node = mkSPNode(Conf, lownew, high, level + 1);
+ new_offset = mkSPNode(ConfBuild, lownew, high, level + 1);
- return rs;
+ /*
+ * ConfBuild->DictNodes can be repalloc'ed within mkSPNode(), so
+ * reinitialize pointers.
+ */
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
+
+ /* Save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ return rs_offset;
}
/*
@@ -1704,90 +1971,98 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
* and affixes.
*/
void
-NISortDictionary(IspellDict *Conf)
+NISortDictionary(IspellDictBuild *ConfBuild)
{
int i;
int naffix = 0;
int curaffix;
+ uint32 node_offset;
/* compress affixes */
/*
- * If we use flag aliases then we need to use Conf->AffixData filled in
+ * If we use flag aliases then we need to use ConfBuild->AffixData filled in
* the NIImportOOAffixes().
*/
- if (Conf->useFlagAliases)
+ if (ConfBuild->dict->useFlagAliases)
{
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
char *end;
- if (*Conf->Spell[i]->p.flag != '\0')
+ if (*ConfBuild->Spell[i]->p.flag != '\0')
{
- curaffix = strtol(Conf->Spell[i]->p.flag, &end, 10);
- if (Conf->Spell[i]->p.flag == end || errno == ERANGE)
+ curaffix = strtol(ConfBuild->Spell[i]->p.flag, &end, 10);
+ if (ConfBuild->Spell[i]->p.flag == end || errno == ERANGE)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix alias \"%s\"",
- Conf->Spell[i]->p.flag)));
+ ConfBuild->Spell[i]->p.flag)));
}
else
{
/*
- * If Conf->Spell[i]->p.flag is empty, then get empty value of
- * Conf->AffixData (0 index).
+ * If ConfBuild->Spell[i]->p.flag is empty, then get empty
+ * value of ConfBuild->AffixData (0 index).
*/
curaffix = 0;
}
- Conf->Spell[i]->p.d.affix = curaffix;
- Conf->Spell[i]->p.d.len = strlen(Conf->Spell[i]->word);
+ ConfBuild->Spell[i]->p.d.affix = curaffix;
+ ConfBuild->Spell[i]->p.d.len = strlen(ConfBuild->Spell[i]->word);
}
}
- /* Otherwise fill Conf->AffixData here */
+ /* Otherwise fill ConfBuild->AffixData here */
else
{
/* Count the number of different flags used in the dictionary */
- qsort((void *) Conf->Spell, Conf->nspell, sizeof(SPELL *),
+ qsort((void *) ConfBuild->Spell, ConfBuild->nSpell, sizeof(SPELL *),
cmpspellaffix);
naffix = 0;
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
if (i == 0
- || strcmp(Conf->Spell[i]->p.flag, Conf->Spell[i - 1]->p.flag))
+ || strcmp(ConfBuild->Spell[i]->p.flag,
+ ConfBuild->Spell[i - 1]->p.flag))
naffix++;
}
/*
- * Fill in Conf->AffixData with the affixes that were used in the
- * dictionary. Replace textual flag-field of Conf->Spell entries with
- * indexes into Conf->AffixData array.
+ * Fill in AffixData with the affixes that were used in the
+ * dictionary. Replace textual flag-field of ConfBuild->Spell entries
+ * with indexes into ConfBuild->AffixData array.
*/
- Conf->AffixData = (char **) palloc0(naffix * sizeof(char *));
+ InitAffixData(ConfBuild, naffix);
curaffix = -1;
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
if (i == 0
- || strcmp(Conf->Spell[i]->p.flag, Conf->AffixData[curaffix]))
+ || strcmp(ConfBuild->Spell[i]->p.flag,
+ AffixDataGet(ConfBuild, curaffix)))
{
curaffix++;
Assert(curaffix < naffix);
- Conf->AffixData[curaffix] = cpstrdup(Conf,
- Conf->Spell[i]->p.flag);
+ AddAffixSet(ConfBuild, ConfBuild->Spell[i]->p.flag,
+ strlen(ConfBuild->Spell[i]->p.flag));
}
- Conf->Spell[i]->p.d.affix = curaffix;
- Conf->Spell[i]->p.d.len = strlen(Conf->Spell[i]->word);
+ ConfBuild->Spell[i]->p.d.affix = curaffix;
+ ConfBuild->Spell[i]->p.d.len = strlen(ConfBuild->Spell[i]->word);
}
-
- Conf->lenAffixData = Conf->nAffixData = naffix;
}
/* Start build a prefix tree */
- qsort((void *) Conf->Spell, Conf->nspell, sizeof(SPELL *), cmpspell);
- Conf->Dictionary = mkSPNode(Conf, 0, Conf->nspell, 0);
+ qsort((void *) ConfBuild->Spell, ConfBuild->nSpell, sizeof(SPELL *), cmpspell);
+ node_offset = mkSPNode(ConfBuild, 0, ConfBuild->nSpell, 0);
+
+ /* Make void node only if the DictNodes is empty */
+ if (node_offset == ISPELL_INVALID_OFFSET)
+ {
+ /* AllocateSPNode() initializes root node data */
+ AllocateSPNode(ConfBuild, 1);
+ }
}
/*
@@ -1795,83 +2070,104 @@ NISortDictionary(IspellDict *Conf)
* rule. Affixes with empty replace string do not include in the prefix tree.
* This affixes are included by mkVoidAffix().
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* low: lower index of the Conf->Affix array.
* high: upper index of the Conf->Affix array.
* level: current prefix tree level.
* type: FF_SUFFIX or FF_PREFIX.
+ *
+ * Returns an offset in nodes array.
*/
-static AffixNode *
-mkANode(IspellDict *Conf, int low, int high, int level, int type)
+static uint32
+mkANode(IspellDictBuild *ConfBuild, int low, int high, int level, int type)
{
int i;
int nchar = 0;
uint8 lastchar = '\0';
+ NodeArray *array;
+ uint32 rs_offset,
+ new_offset;
AffixNode *rs;
AffixNodeData *data;
+ int data_index = 0;
int lownew = low;
- int naff;
- AFFIX **aff;
for (i = low; i < high; i++)
- if (Conf->Affix[i].replen > level && lastchar != GETCHAR(Conf->Affix + i, level, type))
+ if (ConfBuild->Affix[i]->replen > level &&
+ lastchar != GETCHAR(ConfBuild->Affix[i], level, type))
{
nchar++;
- lastchar = GETCHAR(Conf->Affix + i, level, type);
+ lastchar = GETCHAR(ConfBuild->Affix[i], level, type);
}
if (!nchar)
- return NULL;
+ return ISPELL_INVALID_OFFSET;
- aff = (AFFIX **) tmpalloc(sizeof(AFFIX *) * (high - low + 1));
- naff = 0;
+ if (type == FF_SUFFIX)
+ array = &ConfBuild->SuffixNodes;
+ else
+ array = &ConfBuild->PrefixNodes;
- rs = (AffixNode *) cpalloc0(ANHRDSZ + nchar * sizeof(AffixNodeData));
- rs->length = nchar;
- data = rs->data;
+ rs_offset = AllocateAffixNode(ConfBuild, array, nchar);
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
+ data = (AffixNodeData *) rs->data;
lastchar = '\0';
for (i = low; i < high; i++)
- if (Conf->Affix[i].replen > level)
+ if (ConfBuild->Affix[i]->replen > level)
{
- if (lastchar != GETCHAR(Conf->Affix + i, level, type))
+ if (lastchar != GETCHAR(ConfBuild->Affix[i], level, type))
{
if (lastchar)
{
/* Next level of the prefix tree */
- data->node = mkANode(Conf, lownew, i, level + 1, type);
- if (naff)
- {
- data->naff = naff;
- data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * naff);
- memcpy(data->aff, aff, sizeof(AFFIX *) * naff);
- naff = 0;
- }
- data++;
+ new_offset = mkANode(ConfBuild, lownew, i, level + 1, type);
+
+ /*
+ * array can be repalloc'ed within mkANode(), so
+ * reinitialize pointers.
+ */
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
+
+ /* First save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ /* Handle next data node */
+ data_index++;
+ Assert(data_index < nchar);
+ data = &(rs->data[data_index]);
+
lownew = i;
}
- lastchar = GETCHAR(Conf->Affix + i, level, type);
+ lastchar = GETCHAR(ConfBuild->Affix[i], level, type);
}
- data->val = GETCHAR(Conf->Affix + i, level, type);
- if (Conf->Affix[i].replen == level + 1)
+ data->val = GETCHAR(ConfBuild->Affix[i], level, type);
+ if (ConfBuild->Affix[i]->replen == level + 1)
{ /* affix stopped */
- aff[naff++] = Conf->Affix + i;
+ if (data->affstart == ISPELL_INVALID_INDEX)
+ {
+ data->affstart = i;
+ data->affend = i;
+ }
+ else
+ data->affend = i;
}
}
/* Next level of the prefix tree */
- data->node = mkANode(Conf, lownew, high, level + 1, type);
- if (naff)
- {
- data->naff = naff;
- data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * naff);
- memcpy(data->aff, aff, sizeof(AFFIX *) * naff);
- naff = 0;
- }
+ new_offset = mkANode(ConfBuild, lownew, high, level + 1, type);
+
+ /*
+ * array can be repalloc'ed within mkANode(), so reinitialize pointers.
+ */
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
- pfree(aff);
+ /* Save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
- return rs;
+ return rs_offset;
}
/*
@@ -1879,139 +2175,151 @@ mkANode(IspellDict *Conf, int low, int high, int level, int type)
* for affixes which have empty replace string ("repl" field).
*/
static void
-mkVoidAffix(IspellDict *Conf, bool issuffix, int startsuffix)
+mkVoidAffix(IspellDictBuild *ConfBuild, bool issuffix, int startsuffix)
{
- int i,
- cnt = 0;
+ int i;
int start = (issuffix) ? startsuffix : 0;
- int end = (issuffix) ? Conf->naffixes : startsuffix;
- AffixNode *Affix = (AffixNode *) palloc0(ANHRDSZ + sizeof(AffixNodeData));
-
- Affix->length = 1;
- Affix->isvoid = 1;
+ int end = (issuffix) ? ConfBuild->nAffix : startsuffix;
+ uint32 node_offset;
+ NodeArray *array;
+ AffixNode *Affix;
+ AffixNodeData *AffixData;
if (issuffix)
- {
- Affix->data->node = Conf->Suffix;
- Conf->Suffix = Affix;
- }
+ array = &ConfBuild->SuffixNodes;
else
- {
- Affix->data->node = Conf->Prefix;
- Conf->Prefix = Affix;
- }
+ array = &ConfBuild->PrefixNodes;
- /* Count affixes with empty replace string */
- for (i = start; i < end; i++)
- if (Conf->Affix[i].replen == 0)
- cnt++;
-
- /* There is not affixes with empty replace string */
- if (cnt == 0)
- return;
+ node_offset = AllocateAffixNode(ConfBuild, array, 1);
+ Affix = (AffixNode *) NodeArrayGet(array, node_offset);
- Affix->data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * cnt);
- Affix->data->naff = (uint32) cnt;
+ Affix->isvoid = 1;
+ AffixData = (AffixNodeData *) Affix->data;
- cnt = 0;
for (i = start; i < end; i++)
- if (Conf->Affix[i].replen == 0)
+ if (ConfBuild->Affix[i]->replen == 0)
{
- Affix->data->aff[cnt] = Conf->Affix + i;
- cnt++;
+ if (AffixData->affstart == ISPELL_INVALID_INDEX)
+ {
+ AffixData->affstart = i;
+ AffixData->affend = i;
+ }
+ else
+ AffixData->affend = i;
}
}
/*
- * Checks if the affixflag is used by dictionary. Conf->AffixData does not
+ * Checks if the affixflag is used by dictionary. AffixData does not
* contain affixflag if this flag is not used actually by the .dict file.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* affixflag: affix flag.
*
- * Returns true if the Conf->AffixData array contains affixflag, otherwise
+ * Returns true if the ConfBuild->AffixData array contains affixflag, otherwise
* returns false.
*/
static bool
-isAffixInUse(IspellDict *Conf, char *affixflag)
+isAffixInUse(IspellDictBuild *ConfBuild, char *affixflag)
{
int i;
- for (i = 0; i < Conf->nAffixData; i++)
- if (IsAffixFlagInUse(Conf, i, affixflag))
+ for (i = 0; i < ConfBuild->nAffixData; i++)
+ if (IsAffixFlagInUse(ConfBuild->dict->flagMode,
+ AffixDataGet(ConfBuild, i), affixflag))
return true;
return false;
}
/*
- * Builds Conf->Prefix and Conf->Suffix trees from the imported affixes.
+ * Builds Prefix and Suffix trees from the imported affixes.
*/
void
-NISortAffixes(IspellDict *Conf)
+NISortAffixes(IspellDictBuild *ConfBuild)
{
AFFIX *Affix;
+ AffixNode *voidPrefix,
+ *voidSuffix;
size_t i;
CMPDAffix *ptr;
- int firstsuffix = Conf->naffixes;
-
- if (Conf->naffixes == 0)
- return;
+ int firstsuffix = ConfBuild->nAffix;
+ uint32 prefix_offset,
+ suffix_offset;
/* Store compound affixes in the Conf->CompoundAffix array */
- if (Conf->naffixes > 1)
- qsort((void *) Conf->Affix, Conf->naffixes, sizeof(AFFIX), cmpaffix);
- Conf->CompoundAffix = ptr = (CMPDAffix *) palloc(sizeof(CMPDAffix) * Conf->naffixes);
- ptr->affix = NULL;
-
- for (i = 0; i < Conf->naffixes; i++)
+ if (ConfBuild->nAffix > 1)
+ qsort((void *) ConfBuild->Affix, ConfBuild->nAffix,
+ sizeof(AFFIX *), cmpaffix);
+ ConfBuild->nCompoundAffix = ConfBuild->nAffix + 1 /* terminating entry */;
+ ConfBuild->CompoundAffix = ptr =
+ (CMPDAffix *) tmpalloc(sizeof(CMPDAffix) * ConfBuild->nCompoundAffix);
+ ptr->affix = ISPELL_INVALID_INDEX;
+
+ for (i = 0; i < ConfBuild->nAffix; i++)
{
- Affix = &(((AFFIX *) Conf->Affix)[i]);
+ Affix = ConfBuild->Affix[i];
if (Affix->type == FF_SUFFIX && i < firstsuffix)
firstsuffix = i;
if ((Affix->flagflags & FF_COMPOUNDFLAG) && Affix->replen > 0 &&
- isAffixInUse(Conf, Affix->flag))
+ isAffixInUse(ConfBuild, AffixFieldFlag(Affix)))
{
bool issuffix = (Affix->type == FF_SUFFIX);
- if (ptr == Conf->CompoundAffix ||
+ if (ptr == ConfBuild->CompoundAffix ||
issuffix != (ptr - 1)->issuffix ||
- strbncmp((const unsigned char *) (ptr - 1)->affix,
- (const unsigned char *) Affix->repl,
+ strbncmp((const unsigned char *) AffixFieldRepl(ConfBuild->Affix[(ptr - 1)->affix]),
+ (const unsigned char *) AffixFieldRepl(Affix),
(ptr - 1)->len))
{
/* leave only unique and minimals suffixes */
- ptr->affix = Affix->repl;
+ ptr->affix = i;
ptr->len = Affix->replen;
ptr->issuffix = issuffix;
ptr++;
}
}
}
- ptr->affix = NULL;
- Conf->CompoundAffix = (CMPDAffix *) repalloc(Conf->CompoundAffix, sizeof(CMPDAffix) * (ptr - Conf->CompoundAffix + 1));
+ ptr->affix = ISPELL_INVALID_INDEX;
+ ConfBuild->nCompoundAffix = ptr - ConfBuild->CompoundAffix + 1;
/* Start build a prefix tree */
- Conf->Prefix = mkANode(Conf, 0, firstsuffix, 0, FF_PREFIX);
- Conf->Suffix = mkANode(Conf, firstsuffix, Conf->naffixes, 0, FF_SUFFIX);
- mkVoidAffix(Conf, true, firstsuffix);
- mkVoidAffix(Conf, false, firstsuffix);
+ mkVoidAffix(ConfBuild, true, firstsuffix);
+ mkVoidAffix(ConfBuild, false, firstsuffix);
+
+ prefix_offset = mkANode(ConfBuild, 0, firstsuffix, 0, FF_PREFIX);
+ suffix_offset = mkANode(ConfBuild, firstsuffix, ConfBuild->nAffix, 0,
+ FF_SUFFIX);
+
+ /* Adjust offsets of new nodes for nodes of void affixes */
+ voidPrefix = (AffixNode *) NodeArrayGet(&ConfBuild->PrefixNodes, 0);
+ voidPrefix->data[0].node_offset = prefix_offset;
+
+ voidSuffix = (AffixNode *) NodeArrayGet(&ConfBuild->SuffixNodes, 0);
+ voidSuffix->data[0].node_offset = suffix_offset;
}
static AffixNodeData *
-FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
+FindAffixes(IspellDictData *dict, AffixNode *node, const char *word, int wrdlen,
+ int *level, int type)
{
+ AffixNode *nodes;
AffixNodeData *StopLow,
*StopHigh,
*StopMiddle;
uint8 symbol;
+ if (type == FF_PREFIX)
+ nodes = (AffixNode *) DictPrefixNodes(dict);
+ else
+ nodes = (AffixNode *) DictSuffixNodes(dict);
+
if (node->isvoid)
{ /* search void affixes */
- if (node->data->naff)
+ if (node->data->affstart != ISPELL_INVALID_INDEX)
return node->data;
- node = node->data->node;
+ node = (AffixNode *) DictNodeGet(nodes, node->data->node_offset);
}
while (node && *level < wrdlen)
@@ -2026,9 +2334,10 @@ FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
if (StopMiddle->val == symbol)
{
(*level)++;
- if (StopMiddle->naff)
+ if (StopMiddle->affstart != ISPELL_INVALID_INDEX)
return StopMiddle;
- node = StopMiddle->node;
+ node = (AffixNode *) DictNodeGet(nodes,
+ StopMiddle->node_offset);
break;
}
else if (StopMiddle->val < symbol)
@@ -2042,8 +2351,67 @@ FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
return NULL;
}
+/*
+ * Compile regular expression on first use and store it within reg.
+ */
+static void
+CompileAffixReg(AffixReg *reg, bool isregis, int type,
+ const char *mask, int masklen, MemoryContext dictCtx)
+{
+ MemoryContext oldcontext;
+
+ Assert(dictCtx);
+
+ /*
+ * Switch to memory context of the dictionary, so compiled expression can be
+ * used in other queries.
+ */
+ oldcontext = MemoryContextSwitchTo(dictCtx);
+
+ if (isregis)
+ RS_compile(&reg->r.regis, (type == FF_SUFFIX), mask);
+ else
+ {
+ int wmasklen;
+ int err;
+ pg_wchar *wmask;
+ char *tmask;
+
+ tmask = (char *) palloc(masklen + 3);
+ if (type == FF_SUFFIX)
+ sprintf(tmask, "%s$", mask);
+ else
+ sprintf(tmask, "^%s", mask);
+
+ masklen = strlen(tmask);
+ wmask = (pg_wchar *) palloc((masklen + 1) * sizeof(pg_wchar));
+ wmasklen = pg_mb2wchar_with_len(tmask, wmask, masklen);
+
+ err = pg_regcomp(&(reg->r.regex), wmask, wmasklen,
+ REG_ADVANCED | REG_NOSUB,
+ DEFAULT_COLLATION_OID);
+ if (err)
+ {
+ char errstr[100];
+
+ pg_regerror(err, &(reg->r.regex), errstr, sizeof(errstr));
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
+ errmsg("invalid regular expression: %s", errstr)));
+ }
+
+ pfree(wmask);
+ pfree(tmask);
+ }
+
+ reg->iscompiled = true;
+
+ MemoryContextSwitchTo(oldcontext);
+}
+
static char *
-CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *newword, int *baselen)
+CheckAffix(const char *word, size_t len, AFFIX *Affix, AffixReg *reg,
+ int flagflags, char *newword, int *baselen, MemoryContext dictCtx)
{
/*
* Check compound allow flags
@@ -2083,7 +2451,7 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
if (Affix->type == FF_SUFFIX)
{
strcpy(newword, word);
- strcpy(newword + len - Affix->replen, Affix->find);
+ strcpy(newword + len - Affix->replen, AffixFieldFind(Affix));
if (baselen) /* store length of non-changed part of word */
*baselen = len - Affix->replen;
}
@@ -2093,9 +2461,9 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
* if prefix is an all non-changed part's length then all word
* contains only prefix and suffix, so out
*/
- if (baselen && *baselen + strlen(Affix->find) <= Affix->replen)
+ if (baselen && *baselen + Affix->findlen <= Affix->replen)
return NULL;
- strcpy(newword, Affix->find);
+ strcpy(newword, AffixFieldFind(Affix));
strcat(newword, word + Affix->replen);
}
@@ -2106,7 +2474,12 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
return newword;
else if (Affix->isregis)
{
- if (RS_execute(&(Affix->reg.regis), newword))
+ /* Compile the regular expression on first demand */
+ if (!reg->iscompiled)
+ CompileAffixReg(reg, Affix->isregis, Affix->type,
+ AffixFieldMask(Affix), Affix->masklen, dictCtx);
+
+ if (RS_execute(&(reg->r.regis), newword))
return newword;
}
else
@@ -2116,12 +2489,17 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
size_t data_len;
int newword_len;
+ /* Compile the regular expression on first demand */
+ if (!reg->iscompiled)
+ CompileAffixReg(reg, Affix->isregis, Affix->type,
+ AffixFieldMask(Affix), Affix->masklen, dictCtx);
+
/* Convert data string to wide characters */
newword_len = strlen(newword);
data = (pg_wchar *) palloc((newword_len + 1) * sizeof(pg_wchar));
data_len = pg_mb2wchar_with_len(newword, data, newword_len);
- if (!(err = pg_regexec(&(Affix->reg.regex), data, data_len, 0, NULL, 0, NULL, 0)))
+ if (!(err = pg_regexec(&(reg->r.regex), data, data_len, 0, NULL, 0, NULL, 0)))
{
pfree(data);
return newword;
@@ -2160,7 +2538,7 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
char **cur;
char newword[2 * MAXNORMLEN] = "";
char pnewword[2 * MAXNORMLEN] = "";
- AffixNode *snode = Conf->Suffix,
+ AffixNode *snode = (AffixNode *) DictSuffixNodes(Conf->dict),
*pnode;
int i,
j;
@@ -2172,7 +2550,7 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
/* Check that the word itself is normal form */
- if (FindWord(Conf, word, VoidString, flag))
+ if (FindWord(Conf->dict, word, VoidString, flag))
{
*cur = pstrdup(word);
cur++;
@@ -2180,23 +2558,29 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
}
/* Find all other NORMAL forms of the 'word' (check only prefix) */
- pnode = Conf->Prefix;
+ pnode = (AffixNode *) DictPrefixNodes(Conf->dict);
plevel = 0;
while (pnode)
{
- prefix = FindAffixes(pnode, word, wrdlen, &plevel, FF_PREFIX);
+ prefix = FindAffixes(Conf->dict, pnode, word, wrdlen, &plevel, FF_PREFIX);
if (!prefix)
break;
- for (j = 0; j < prefix->naff; j++)
+ for (j = prefix->affstart; j <= prefix->affend; j++)
{
- if (CheckAffix(word, wrdlen, prefix->aff[j], flag, newword, NULL))
+ AFFIX *affix = (AFFIX *) DictAffixGet(Conf->dict, j);
+ AffixReg *reg = &(Conf->reg[j]);
+
+ if (affix &&
+ CheckAffix(word, wrdlen, affix, reg, flag, newword, NULL,
+ Conf->dictCtx))
{
/* prefix success */
- if (FindWord(Conf, newword, prefix->aff[j]->flag, flag))
+ if (FindWord(Conf->dict, newword, AffixFieldFlag(affix), flag))
cur += addToResult(forms, cur, newword);
}
}
- pnode = prefix->node;
+ pnode = (AffixNode *) DictNodeGet(DictPrefixNodes(Conf->dict),
+ prefix->node_offset);
}
/*
@@ -2208,45 +2592,59 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
int baselen = 0;
/* find possible suffix */
- suffix = FindAffixes(snode, word, wrdlen, &slevel, FF_SUFFIX);
+ suffix = FindAffixes(Conf->dict, snode, word, wrdlen, &slevel,
+ FF_SUFFIX);
if (!suffix)
break;
/* foreach suffix check affix */
- for (i = 0; i < suffix->naff; i++)
+ for (i = suffix->affstart; i <= suffix->affend; i++)
{
- if (CheckAffix(word, wrdlen, suffix->aff[i], flag, newword, &baselen))
+ AFFIX *sufentry = (AFFIX *) DictAffixGet(Conf->dict, i);
+ AffixReg *sufreg = &(Conf->reg[i]);
+
+ if (sufentry &&
+ CheckAffix(word, wrdlen, sufentry, sufreg, flag, newword, &baselen,
+ Conf->dictCtx))
{
/* suffix success */
- if (FindWord(Conf, newword, suffix->aff[i]->flag, flag))
+ if (FindWord(Conf->dict, newword, AffixFieldFlag(sufentry), flag))
cur += addToResult(forms, cur, newword);
/* now we will look changed word with prefixes */
- pnode = Conf->Prefix;
+ pnode = (AffixNode *) DictPrefixNodes(Conf->dict);
plevel = 0;
swrdlen = strlen(newword);
while (pnode)
{
- prefix = FindAffixes(pnode, newword, swrdlen, &plevel, FF_PREFIX);
+ prefix = FindAffixes(Conf->dict, pnode, newword, swrdlen,
+ &plevel, FF_PREFIX);
if (!prefix)
break;
- for (j = 0; j < prefix->naff; j++)
+ for (j = prefix->affstart; j <= prefix->affend; j++)
{
- if (CheckAffix(newword, swrdlen, prefix->aff[j], flag, pnewword, &baselen))
+ AFFIX *prefentry = (AFFIX *) DictAffixGet(Conf->dict, j);
+ AffixReg *prefreg = &(Conf->reg[j]);
+
+ if (prefentry &&
+ CheckAffix(newword, swrdlen, prefentry, prefreg,
+ flag, pnewword, &baselen, Conf->dictCtx))
{
/* prefix success */
- char *ff = (prefix->aff[j]->flagflags & suffix->aff[i]->flagflags & FF_CROSSPRODUCT) ?
- VoidString : prefix->aff[j]->flag;
+ char *ff = (prefentry->flagflags & sufentry->flagflags & FF_CROSSPRODUCT) ?
+ VoidString : AffixFieldFlag(prefentry);
- if (FindWord(Conf, pnewword, ff, flag))
+ if (FindWord(Conf->dict, pnewword, ff, flag))
cur += addToResult(forms, cur, pnewword);
}
}
- pnode = prefix->node;
+ pnode = (AffixNode *) DictNodeGet(DictPrefixNodes(Conf->dict),
+ prefix->node_offset);
}
}
}
- snode = suffix->node;
+ snode = (AffixNode *) DictNodeGet(DictSuffixNodes(Conf->dict),
+ suffix->node_offset);
}
if (cur == forms)
@@ -2266,7 +2664,8 @@ typedef struct SplitVar
} SplitVar;
static int
-CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
+CheckCompoundAffixes(IspellDictData *dict, CMPDAffix **ptr,
+ char *word, int len, bool CheckInPlace)
{
bool issuffix;
@@ -2276,9 +2675,12 @@ CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
if (CheckInPlace)
{
- while ((*ptr)->affix)
+ while ((*ptr)->affix != ISPELL_INVALID_INDEX)
{
- if (len > (*ptr)->len && strncmp((*ptr)->affix, word, (*ptr)->len) == 0)
+ AFFIX *affix = (AFFIX *) DictAffixGet(dict, (*ptr)->affix);
+
+ if (len > (*ptr)->len &&
+ strncmp(AffixFieldRepl(affix), word, (*ptr)->len) == 0)
{
len = (*ptr)->len;
issuffix = (*ptr)->issuffix;
@@ -2292,9 +2694,12 @@ CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
{
char *affbegin;
- while ((*ptr)->affix)
+ while ((*ptr)->affix != ISPELL_INVALID_INDEX)
{
- if (len > (*ptr)->len && (affbegin = strstr(word, (*ptr)->affix)) != NULL)
+ AFFIX *affix = (AFFIX *) DictAffixGet(dict, (*ptr)->affix);
+
+ if (len > (*ptr)->len &&
+ (affbegin = strstr(word, AffixFieldRepl(affix))) != NULL)
{
len = (*ptr)->len + (affbegin - word);
issuffix = (*ptr)->issuffix;
@@ -2346,13 +2751,14 @@ AddStem(SplitVar *v, char *word)
}
static SplitVar *
-SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int wordlen, int startpos, int minpos)
+SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig,
+ char *word, int wordlen, int startpos, int minpos)
{
SplitVar *var = NULL;
SPNodeData *StopLow,
*StopHigh,
*StopMiddle = NULL;
- SPNode *node = (snode) ? snode : Conf->Dictionary;
+ SPNode *node = (snode) ? snode : (SPNode *) DictDictNodes(Conf->dict);
int level = (snode) ? minpos : startpos; /* recursive
* minpos==level */
int lenaff;
@@ -2367,8 +2773,11 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
while (level < wordlen)
{
/* find word with epenthetic or/and compound affix */
- caff = Conf->CompoundAffix;
- while (level > startpos && (lenaff = CheckCompoundAffixes(&caff, word + level, wordlen - level, (node) ? true : false)) >= 0)
+ caff = (CMPDAffix *) DictCompoundAffix(Conf->dict);
+ while (level > startpos &&
+ (lenaff = CheckCompoundAffixes(Conf->dict, &caff,
+ word + level, wordlen - level,
+ (node) ? true : false)) >= 0)
{
/*
* there is one of compound affixes, so check word for existings
@@ -2415,7 +2824,8 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
while (ptr->next)
ptr = ptr->next;
- ptr->next = SplitToVariants(Conf, NULL, new, word, wordlen, startpos + lenaff, startpos + lenaff);
+ ptr->next = SplitToVariants(Conf, NULL, new, word, wordlen,
+ startpos + lenaff, startpos + lenaff);
pfree(new->stem);
pfree(new);
@@ -2474,13 +2884,14 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
/* we can find next word */
level++;
AddStem(var, pnstrdup(word + startpos, level - startpos));
- node = Conf->Dictionary;
+ node = (SPNode *) DictDictNodes(Conf->dict);
startpos = level;
continue;
}
}
}
- node = StopMiddle->node;
+ node = (SPNode *) DictNodeGet(DictDictNodes(Conf->dict),
+ StopMiddle->node_offset);
}
else
node = NULL;
@@ -2530,7 +2941,7 @@ NINormalizeWord(IspellDict *Conf, char *word)
pfree(res);
}
- if (Conf->usecompound)
+ if (Conf->dict->usecompound)
{
int wordlen = strlen(word);
SplitVar *ptr,
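
A note on the offset scheme used by mkSPNode() and mkANode() in the spell.c changes: nodes live in one growable byte array and refer to each other by uint32 offsets, so a parent pointer has to be re-fetched after any call that may grow the array. The following is only a minimal standalone sketch of that idea, not code from the patch. Arena, Node, arena_get(), arena_alloc_node(), build_chain() and read_chain() are hypothetical names, and plain realloc() stands in for repalloc().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define INVALID_OFFSET 0xFFFFFFFFu

/* Hypothetical growable byte array; plays the role of the patch's NodeArray. */
typedef struct Arena
{
	char	   *bytes;
	uint32_t	size;			/* allocated size of bytes */
	uint32_t	end;			/* end of used data in bytes */
} Arena;

/* Hypothetical node: one character plus the offset of the next node. */
typedef struct Node
{
	char		val;
	uint32_t	child;			/* offset of the child, or INVALID_OFFSET */
} Node;

/* Translate an offset into a pointer that is valid only until the next grow. */
static Node *
arena_get(Arena *a, uint32_t off)
{
	return (off == INVALID_OFFSET) ? NULL : (Node *) (a->bytes + off);
}

/* Reserve space for one node and return its offset; may move a->bytes. */
static uint32_t
arena_alloc_node(Arena *a)
{
	uint32_t	off = a->end;
	Node	   *n;

	if (a->end + sizeof(Node) > a->size)
	{
		uint32_t	newsize = a->size ? a->size * 2 : (uint32_t) sizeof(Node);
		char	   *p = realloc(a->bytes, newsize);

		if (p == NULL)
			abort();
		a->bytes = p;
		a->size = newsize;
	}
	a->end += sizeof(Node);

	n = arena_get(a, off);
	n->val = '\0';
	n->child = INVALID_OFFSET;
	return off;
}

/* Build a chain of nodes spelling word; returns the offset of the first node. */
static uint32_t
build_chain(Arena *a, const char *word)
{
	uint32_t	self;
	uint32_t	child;

	if (*word == '\0')
		return INVALID_OFFSET;

	self = arena_alloc_node(a);
	arena_get(a, self)->val = *word;

	/* The recursive call may realloc the arena, so no pointer is kept here. */
	child = build_chain(a, word + 1);

	/* Re-fetch the parent through its offset before storing the child offset. */
	arena_get(a, self)->child = child;
	return self;
}

/* Walk a chain given only a base address and an offset. */
static size_t
read_chain(const char *base, uint32_t off, char *out, size_t outlen)
{
	size_t		n = 0;

	while (off != INVALID_OFFSET && n + 1 < outlen)
	{
		const Node *node = (const Node *) (base + off);

		out[n++] = node->val;
		off = node->child;
	}
	out[n] = '\0';
	return n;
}
```

Because read_chain() needs only a base address and an offset, the same bytes can be copied or mapped at a different address and still be traversed. That is the property that makes the layout suitable for a DSM segment.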
diff --git a/src/include/tsearch/dicts/spell.h b/src/include/tsearch/dicts/spell.h
index 4cba578436..df0abd38ae 100644
--- a/src/include/tsearch/dicts/spell.h
+++ b/src/include/tsearch/dicts/spell.h
@@ -18,21 +18,23 @@
#include "tsearch/dicts/regis.h"
#include "tsearch/ts_public.h"
+#define ISPELL_INVALID_INDEX (0x7FFFF)
+#define ISPELL_INVALID_OFFSET (0xFFFFFFFF)
+
/*
* SPNode and SPNodeData are used to represent prefix tree (Trie) to store
* a words list.
*/
-struct SPNode;
-
typedef struct
{
uint32 val:8,
isword:1,
/* Stores compound flags listed below */
compoundflag:4,
- /* Reference to an entry of the AffixData field */
+ /* Index of an entry of the AffixData field */
affix:19;
- struct SPNode *node;
+ /* Offset to a node of the DictNodes field */
+ uint32 node_offset;
} SPNodeData;
/*
@@ -86,21 +88,55 @@ typedef struct spell_struct
*/
typedef struct aff_struct
{
- char *flag;
/* FF_SUFFIX or FF_PREFIX */
- uint32 type:1,
+ uint16 type:1,
flagflags:7,
issimple:1,
isregis:1,
- replen:14;
- char *find;
- char *repl;
+ flaglen:2;
+
+ /* 8 bits could be too much for repl, find and mask, but who knows */
+ uint8 replen;
+ uint8 findlen;
+ uint8 masklen;
+
+ /*
+ * fields stores the following data (each ends with \0):
+ * - repl
+ * - find
+ * - mask
+ * - flag - one character (if FM_CHAR),
+ * two characters (if FM_LONG),
+ * number, >= 0 and < 65536 (if FM_NUM).
+ */
+ char fields[FLEXIBLE_ARRAY_MEMBER];
+} AFFIX;
+
+#define AF_FLAG_MAXSIZE 5 /* strlen(65536) */
+#define AF_REPL_MAXSIZE 255 /* 8 bits */
+
+#define AFFIXHDRSZ (offsetof(AFFIX, fields))
+
+#define AffixFieldRepl(af) ((af)->fields)
+#define AffixFieldFind(af) ((af)->fields + (af)->replen + 1)
+#define AffixFieldMask(af) (AffixFieldFind(af) + (af)->findlen + 1)
+#define AffixFieldFlag(af) (AffixFieldMask(af) + (af)->masklen + 1)
+#define AffixGetSize(af) (AFFIXHDRSZ + (af)->replen + 1 + (af)->findlen + 1 \
+ + (af)->masklen + 1 + strlen(AffixFieldFlag(af)) + 1)
+
+/*
+ * Stores compiled regular expression of affix. AffixReg uses mask field of
+ * AFFIX as a regular expression.
+ */
+typedef struct AffixReg
+{
+ bool iscompiled;
union
{
regex_t regex;
Regis regis;
- } reg;
-} AFFIX;
+ } r;
+} AffixReg;
/*
* affixes use dictionary flags too
@@ -120,14 +156,13 @@ typedef struct aff_struct
* AffixNode and AffixNodeData are used to represent prefix tree (Trie) to store
* an affix list.
*/
-struct AffixNode;
-
typedef struct
{
- uint32 val:8,
- naff:24;
- AFFIX **aff;
- struct AffixNode *node;
+ uint8 val;
+ uint32 affstart;
+ uint32 affend;
+ /* Offset to a node of the PrefixNodes or SuffixNodes field */
+ uint32 node_offset;
} AffixNodeData;
typedef struct AffixNode
@@ -139,9 +174,20 @@ typedef struct AffixNode
#define ANHRDSZ (offsetof(AffixNode, data))
+typedef struct NodeArray
+{
+ char *Nodes;
+ uint32 NodesSize; /* allocated size of Nodes */
+ uint32 NodesEnd; /* end of data in Nodes */
+} NodeArray;
+
+#define NodeArrayGet(na, of) \
+ (((of) == ISPELL_INVALID_OFFSET) ? NULL : (na)->Nodes + (of))
+
typedef struct
{
- char *affix;
+ /* Index of an affix of the Affix field */
+ uint32 affix;
int len;
bool issuffix;
} CMPDAffix;
@@ -176,30 +222,75 @@ typedef struct CompoundAffixFlag
#define FLAGNUM_MAXSIZE (1 << 16)
-typedef struct
+typedef struct IspellDictData
{
- int maffixes;
- int naffixes;
- AFFIX *Affix;
-
- AffixNode *Suffix;
- AffixNode *Prefix;
+ FlagMode flagMode;
+ bool usecompound;
- SPNode *Dictionary;
- /* Array of sets of affixes */
- char **AffixData;
- int lenAffixData;
- int nAffixData;
bool useFlagAliases;
- CMPDAffix *CompoundAffix;
+ uint32 nAffixData;
+ uint32 AffixDataStart;
- bool usecompound;
- FlagMode flagMode;
+ uint32 AffixOffsetStart;
+ uint32 AffixStart;
+ uint32 nAffix;
+
+ uint32 DictNodesStart;
+ uint32 PrefixNodesStart;
+ uint32 SuffixNodesStart;
+
+ uint32 CompoundAffixStart;
/*
- * All follow fields are actually needed only for initialization
+ * data stores:
+ * - AffixData - array of affix sets
+ * - Affix - sorted array of affixes
+ * - DictNodes - prefix tree of a word list
+ * - PrefixNodes - prefix tree of a prefix list
+ * - SuffixNodes - prefix tree of a suffix list
+ * - CompoundAffix - array of compound affixes
*/
+ char data[FLEXIBLE_ARRAY_MEMBER];
+} IspellDictData;
+
+#define IspellDictDataHdrSize (offsetof(IspellDictData, data))
+
+#define DictAffixDataOffset(d) ((d)->data)
+#define DictAffixData(d) ((d)->data + (d)->AffixDataStart)
+#define DictAffixDataGet(d, i) \
+ (((i) == ISPELL_INVALID_INDEX) ? NULL : \
+ (AssertMacro(i < (d)->nAffixData), \
+ DictAffixData(d) + ((uint32 *) DictAffixDataOffset(d))[i]))
+
+#define DictAffixOffset(d) ((d)->data + (d)->AffixOffsetStart)
+#define DictAffix(d) ((d)->data + (d)->AffixStart)
+#define DictAffixGet(d, i) \
+ (((i) == ISPELL_INVALID_INDEX) ? NULL : \
+ (AssertMacro(i < (d)->nAffix), \
+ DictAffix(d) + ((uint32 *) DictAffixOffset(d))[i]))
+
+#define DictDictNodes(d) ((d)->data + (d)->DictNodesStart)
+#define DictPrefixNodes(d) ((d)->data + (d)->PrefixNodesStart)
+#define DictSuffixNodes(d) ((d)->data + (d)->SuffixNodesStart)
+#define DictNodeGet(node_start, of) \
+ (((of) == ISPELL_INVALID_OFFSET) ? NULL : (char *) (node_start) + (of))
+
+#define DictCompoundAffix(d) ((d)->data + (d)->CompoundAffixStart)
+
+/*
+ * IspellDictBuild is used to initialize IspellDictData struct. This is a
+ * temporary structure which is set up by NIStartBuild() and released by
+ * NIFinishBuild().
+ */
+typedef struct IspellDictBuild
+{
+ MemoryContext buildCxt; /* temp context for construction */
+
+ IspellDictData *dict;
+ uint32 dict_size;
+
+ /* Temporary data */
/* Array of Hunspell options in affix file */
CompoundAffixFlag *CompoundAffixFlags;
@@ -208,29 +299,73 @@ typedef struct
/* allocated length of CompoundAffixFlags array */
int mCompoundAffixFlag;
- /*
- * Remaining fields are only used during dictionary construction; they are
- * set up by NIStartBuild and cleared by NIFinishBuild.
- */
- MemoryContext buildCxt; /* temp context for construction */
-
- /* Temporary array of all words in the dict file */
+ /* Array of all words in the dict file */
SPELL **Spell;
- int nspell; /* number of valid entries in Spell array */
- int mspell; /* allocated length of Spell array */
+ int nSpell; /* number of valid entries in Spell array */
+ int mSpell; /* allocated length of Spell array */
+
+ /* Data for IspellDictData */
+
+ /* Array of all affixes in the aff file */
+ AFFIX **Affix;
+ int nAffix; /* number of valid entries in Affix array */
+ int mAffix; /* allocated length of Affix array */
+ uint32 AffixSize;
+
+ /* Array of sets of affixes */
+ uint32 *AffixDataOffset;
+ int nAffixData; /* number of affix sets */
+ int mAffixData; /* allocated number of affix sets */
+ char *AffixData;
+ uint32 AffixDataSize; /* allocated size of AffixData */
+ uint32 AffixDataEnd; /* end of data in AffixData */
+
+ /* Prefix tree which stores a word list */
+ NodeArray DictNodes;
+
+ /* Prefix tree which stores a prefix list */
+ NodeArray PrefixNodes;
+
+ /* Prefix tree which stores a suffix list */
+ NodeArray SuffixNodes;
- /* These are used to allocate "compact" data without palloc overhead */
- char *firstfree; /* first free address (always maxaligned) */
- size_t avail; /* free space remaining at firstfree */
+ /* Array of compound affixes */
+ CMPDAffix *CompoundAffix;
+ int nCompoundAffix; /* number of entries of CompoundAffix */
+} IspellDictBuild;
+
+#define AffixDataGet(d, i) ((d)->AffixData + (d)->AffixDataOffset[i])
+
+/*
+ * IspellDict is used within NINormalizeWord.
+ */
+typedef struct IspellDict
+{
+ /*
+ * Pointer to a DSM location of IspellDictData. Should be retrieved on
+ * every dispell_lexize() call.
+ */
+ IspellDictData *dict;
+ /*
+ * Array of regular expression of affixes. Each regular expression is
+ * compiled only on demand.
+ */
+ AffixReg *reg;
+ /*
+ * Memory context for compiling regular expressions.
+ */
+ MemoryContext dictCtx;
} IspellDict;
extern TSLexeme *NINormalizeWord(IspellDict *Conf, char *word);
-extern void NIStartBuild(IspellDict *Conf);
-extern void NIImportAffixes(IspellDict *Conf, const char *filename);
-extern void NIImportDictionary(IspellDict *Conf, const char *filename);
-extern void NISortDictionary(IspellDict *Conf);
-extern void NISortAffixes(IspellDict *Conf);
-extern void NIFinishBuild(IspellDict *Conf);
+extern void NIStartBuild(IspellDictBuild *ConfBuild);
+extern void NIImportAffixes(IspellDictBuild *ConfBuild, const char *filename);
+extern void NIImportDictionary(IspellDictBuild *ConfBuild,
+ const char *filename);
+extern void NISortDictionary(IspellDictBuild *ConfBuild);
+extern void NISortAffixes(IspellDictBuild *ConfBuild);
+extern void NICopyData(IspellDictBuild *ConfBuild);
+extern void NIFinishBuild(IspellDictBuild *ConfBuild);
#endif
--
2.20.1
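The header above replaces pointers with offsets into one flat buffer, so the same bytes stay valid wherever a backend maps the DSM segment. A minimal standalone sketch of that idea follows. It is not the patch's code: FlatDict, flat_dict_build() and flat_dict_get() are hypothetical names used only for illustration, and the layout (an offset array followed by the string bytes) only loosely mirrors how DictAffixDataGet() looks up an entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical miniature of IspellDictData: a header plus one flat buffer. */
typedef struct FlatDict
{
    uint32_t nitems;      /* number of strings stored */
    uint32_t data_start;  /* byte offset in data[] where string bytes begin */
    char     data[];      /* uint32_t offsets[nitems], then the strings */
} FlatDict;

/* Return the i-th string. Only offsets are stored, never absolute pointers. */
static const char *
flat_dict_get(const FlatDict *d, uint32_t i)
{
    uint32_t off;

    assert(i < d->nitems);
    memcpy(&off, d->data + (size_t) i * sizeof(uint32_t), sizeof(off));
    return d->data + d->data_start + off;
}

/* Build a FlatDict from C strings; the caller releases it with free(). */
static FlatDict *
flat_dict_build(const char *const *items, uint32_t nitems, size_t *size_out)
{
    size_t    offsets_size = (size_t) nitems * sizeof(uint32_t);
    size_t    strings_size = 0;
    size_t    total;
    uint32_t  off = 0;
    uint32_t  i;
    FlatDict *d;

    for (i = 0; i < nitems; i++)
        strings_size += strlen(items[i]) + 1;

    total = offsetof(FlatDict, data) + offsets_size + strings_size;
    d = malloc(total);
    if (d == NULL)
        return NULL;

    d->nitems = nitems;
    d->data_start = (uint32_t) offsets_size;

    for (i = 0; i < nitems; i++)
    {
        size_t len = strlen(items[i]) + 1;

        /* record where this string starts, relative to data_start */
        memcpy(d->data + (size_t) i * sizeof(uint32_t), &off, sizeof(off));
        memcpy(d->data + d->data_start + off, items[i], len);
        off += (uint32_t) len;
    }

    if (size_out != NULL)
        *size_out = total;
    return d;
}
```

Because nothing inside the buffer is an address, a plain memcpy() of the whole structure to another location yields a copy on which flat_dict_get() works unchanged, which is the property a structure shared between backends relies on.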
On 1/17/19 3:15 PM, Arthur Zakirov wrote:
I attached files of new version of the patch, I applied your tweaks.
XXX All dictionaries, but only when there's invalid dictionary?
I've made a little optimization. I introduced hashvalue into
TSDictionaryCacheEntry. Now released only DSM of altered or dropped
dictionaries.
/* XXX not really a pointer, so the name is misleading */
I think we don't need DictPointerData struct anymore, because only
ts_dict_shmem_release function needs it (see comments above) and we only
need it to hash search. I'll move all fields of DictPointerData to
TsearchDictKey struct.
I was wrong, DictInitData also needs DictPointerData. I didn't remove
DictPointerData, I renamed it to DictEntryData. Hope that it is a more
appropriate name.
Thanks. I've reviewed v17 today and I haven't discovered any new issues
so far. If everything goes fine and no one protests, I plan to get it
committed over the next week or so.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2019-01-20 23:15:35 +0100, Tomas Vondra wrote:
On 1/17/19 3:15 PM, Arthur Zakirov wrote:
I attached files of new version of the patch, I applied your tweaks.
XXX All dictionaries, but only when there's invalid dictionary?
I've made a little optimization. I introduced hashvalue into
TSDictionaryCacheEntry. Now released only DSM of altered or dropped
dictionaries.
/* XXX not really a pointer, so the name is misleading */
I think we don't need DictPointerData struct anymore, because only
ts_dict_shmem_release function needs it (see comments above) and we only
need it to hash search. I'll move all fields of DictPointerData to
TsearchDictKey struct.
I was wrong, DictInitData also needs DictPointerData. I didn't remove
DictPointerData, I renamed it to DictEntryData. Hope that it is a more
appropriate name.
Thanks. I've reviewed v17 today and I haven't discovered any new issues
so far. If everything goes fine and no one protests, I plan to get it
committed over the next week or so.
There doesn't seem to be any docs about what's needed to be able to take
advantage of shared dicts, and how to prevent them from permanently
taking up a significant share of memory.
Greetings,
Andres Freund
On 1/20/19 11:21 PM, Andres Freund wrote:
On 2019-01-20 23:15:35 +0100, Tomas Vondra wrote:
On 1/17/19 3:15 PM, Arthur Zakirov wrote:
I attached files of new version of the patch, I applied your tweaks.
XXX All dictionaries, but only when there's invalid dictionary?
I've made a little optimization. I introduced hashvalue into
TSDictionaryCacheEntry. Now released only DSM of altered or dropped
dictionaries.
/* XXX not really a pointer, so the name is misleading */
I think we don't need DictPointerData struct anymore, because only
ts_dict_shmem_release function needs it (see comments above) and we only
need it to hash search. I'll move all fields of DictPointerData to
TsearchDictKey struct.
I was wrong, DictInitData also needs DictPointerData. I didn't remove
DictPointerData, I renamed it to DictEntryData. Hope that it is a more
appropriate name.
Thanks. I've reviewed v17 today and I haven't discovered any new issues
so far. If everything goes fine and no one protests, I plan to get it
committed over the next week or so.
There doesn't seem to be any docs about what's needed to be able to take
advantage of shared dicts, and how to prevent them from permanently
taking up a significant share of memory.
Yeah, those are good points. I agree the comments might be clearer, but
essentially ispell dictionaries are shared and everything else is not.
As for the memory consumption / unloading dicts - I agree that's
something we need to address. There used to be a way to specify memory
limit and ability to unload dictionaries explicitly, but both features
have been ditched. The assumption was that UNLOAD would be introduced
later, but that does not seem to have happened.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 21.01.2019 02:43, Tomas Vondra wrote:
On 1/20/19 11:21 PM, Andres Freund wrote:
On 2019-01-20 23:15:35 +0100, Tomas Vondra wrote:
Thanks. I've reviewed v17 today and I haven't discovered any new issues
so far. If everything goes fine and no one protests, I plan to get it
committed over the next week or so.
There doesn't seem to be any docs about what's needed to be able to take
advantage of shared dicts, and how to prevent them from permanently
taking up a significant share of memory.
Yeah, those are good points. I agree the comments might be clearer, but
essentially ispell dictionaries are shared and everything else is not.
As for the memory consumption / unloading dicts - I agree that's
something we need to address. There used to be a way to specify memory
limit and ability to unload dictionaries explicitly, but both features
have been ditched. The assumption was that UNLOAD would be introduced
later, but that does not seem to have happened.
I'll try to implement the syntax you suggested earlier:
ALTER TEXT SEARCH DICTIONARY x UNLOAD/RELOAD
The main point here is that UNLOAD/RELOAD can't release the memory
immediately, because some other backend may pin a DSM.
The second point we should consider (I think) is how we know which
dictionary should be unloaded. There was such a function earlier, which
was removed. But what about adding this information to the output of
psql's "\dFd" command? It could be a column which shows whether a
dictionary is loaded.
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
On 1/21/19 12:51 PM, Arthur Zakirov wrote:
On 21.01.2019 02:43, Tomas Vondra wrote:
On 1/20/19 11:21 PM, Andres Freund wrote:
On 2019-01-20 23:15:35 +0100, Tomas Vondra wrote:
Thanks. I've reviewed v17 today and I haven't discovered any new issues
so far. If everything goes fine and no one protests, I plan to get it
committed over the next week or so.
There doesn't seem to be any docs about what's needed to be able to take
advantage of shared dicts, and how to prevent them from permanently
taking up a significant share of memory.
Yeah, those are good points. I agree the comments might be clearer, but
essentially ispell dictionaries are shared and everything else is not.
As for the memory consumption / unloading dicts - I agree that's
something we need to address. There used to be a way to specify memory
limit and ability to unload dictionaries explicitly, but both features
have been ditched. The assumption was that UNLOAD would be introduced
later, but that does not seem to have happened.
I'll try to implement the syntax, you suggested earlier:
ALTER TEXT SEARCH DICTIONARY x UNLOAD/RELOAD
The main point here is that UNLOAD/RELOAD can't release the memory
immediately, because some other backend may pin a DSM.
The second point we should consider (I think) - how do we know which
dictionary should be unloaded. There was such function earlier, which
was removed. But what about adding an information in the "\dFd" psql's
command output? It could be a column which shows is a dictionary loaded.
The UNLOAD capability is probably a good start, but it's entirely manual
and I wonder if it's putting too much burden on the user. I mean, the
user has to realize the dictionaries are using a lot of shared memory,
has to decide which to unload, and then has to do UNLOAD on it. That's
not quite straightforward, especially if there's no way to determine
which dictionaries are currently loaded and how much memory they use :-(
Of course, the problem is not exactly new - we don't show dictionaries
already loaded into private memory. The only thing we have is "unload"
capability by closing the connection. OTOH the memory consumption should
be much lower thanks to using shared memory. So I think the patch is an
improvement even in this regard.
I wonder if we could devise some simple cache eviction policy. We don't
have any memory limit GUC anymore, but maybe we could unload
dictionaries that were unused for a sufficient amount of time (a couple of
minutes or so). Of course, the question is when exactly would it happen
(it seems far too expensive to invoke on each dict access, and it should
happen even when the dicts are not accessed at all).
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 21.01.2019 17:56, Tomas Vondra wrote:
On 1/21/19 12:51 PM, Arthur Zakirov wrote:
I'll try to implement the syntax, you suggested earlier:
ALTER TEXT SEARCH DICTIONARY x UNLOAD/RELOAD
The main point here is that UNLOAD/RELOAD can't release the memory
immediately, because some other backend may pin a DSM.
The second point we should consider (I think) - how do we know which
dictionary should be unloaded. There was such function earlier, which
was removed. But what about adding an information in the "\dFd" psql's
command output? It could be a column which shows is a dictionary loaded.
...The only thing we have is "unload" capability by closing the
connection...
BTW, even if the connection is closed and there are no other
connections, a dictionary still remains "loaded". That is because
dsm_pin_segment() is called while loading the dictionary into DSM.
...
I wonder if we could devise some simple cache eviction policy. We don't
have any memory limit GUC anymore, but maybe we could use unload
dictionaries that were unused for sufficient amount of time (a couple of
minutes or so). Of course, the question is when exactly would it happen
(it seems far too expensive to invoke on each dict access, and it should
happen even when the dicts are not accessed at all).
Yes, I thought about such a feature too. Agreed, it could be expensive
since we need to scan the pg_ts_dict table to get the list of dictionaries
(we can't scan dshash_table).
I don't have a good solution yet. I just had a thought to bring back
max_shared_dictionaries_size. Then, once we reach the size limit, we can
scan the pg_ts_dict table and unload dictionaries that were last accessed
a long time ago.
We can't enforce an exact size limit since we can't release the memory
immediately. So max_shared_dictionaries_size can be renamed to
shared_dictionaries_threshold. If it is equal to "0" then PostgreSQL has
unlimited space for dictionaries.
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
пн, 21 янв. 2019 г. в 19:42, Arthur Zakirov <a.zakirov@postgrespro.ru>:
On 21.01.2019 17:56, Tomas Vondra wrote:
I wonder if we could devise some simple cache eviction policy. We don't
have any memory limit GUC anymore, but maybe we could use unload
dictionaries that were unused for sufficient amount of time (a couple of
minutes or so). Of course, the question is when exactly would it happen
(it seems far too expensive to invoke on each dict access, and it should
happen even when the dicts are not accessed at all).
Yes, I thought about such feature too. Agree, it could be expensive
since we need to scan pg_ts_dict table to get list of dictionaries (we
can't scan dshash_table).
I haven't a good solution yet. I just had a thought to return
max_shared_dictionaries_size. Then we can unload dictionaries (and scan
the pg_ts_dict table) that were accessed a lot time ago if we reached
the size limit.
We can't set exact size limit since we can't release the memory
immediately. So max_shared_dictionaries_size can be renamed to
shared_dictionaries_threshold. If it is equal to "0" then PostgreSQL has
unlimited space for dictionaries.
I want to propose cleaning up segments during vacuum/autovacuum. I'm not
aware of the policy on cleaning up objects other than relations during
vacuum/autovacuum. Could it be a good idea?
Vacuum might unload dictionaries when the total size of loaded dictionaries
exceeds a threshold. When that happens, vacuum scans the loaded dictionaries
and unloads (unpins segments and removes hash table entries) those
dictionaries which aren't mapped to any backend process anymore (they
stay loaded until then because dsm_pin_segment() is called).
max_shared_dictionaries_size can be renamed to
shared_dictionaries_cleanup_threshold.
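The threshold-triggered cleanup described above can be sketched in isolation as follows. This is only an illustration under stated assumptions: LoadedDict and cleanup_unmapped() are hypothetical names, a plain array stands in for the shared hash table, and "unloading" is reduced to clearing a flag rather than unpinning a DSM segment. A threshold of 0 is treated as "unlimited", as suggested earlier in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping for one dictionary held in shared memory. */
typedef struct LoadedDict
{
    size_t size;     /* bytes occupied by the dictionary */
    int    nmapped;  /* number of backends that currently map the segment */
    bool   loaded;   /* still present in shared memory? */
} LoadedDict;

/*
 * If the total size of loaded dictionaries exceeds the threshold, unload
 * every dictionary that no backend maps anymore. Returns the number of
 * dictionaries unloaded. A threshold of 0 disables the cleanup.
 */
static int
cleanup_unmapped(LoadedDict *dicts, int ndicts, size_t threshold)
{
    size_t total = 0;
    int    unloaded = 0;
    int    i;

    for (i = 0; i < ndicts; i++)
        if (dicts[i].loaded)
            total += dicts[i].size;

    if (threshold == 0 || total <= threshold)
        return 0;

    for (i = 0; i < ndicts; i++)
    {
        if (dicts[i].loaded && dicts[i].nmapped == 0)
        {
            /* stands in for unpinning the segment and removing the entry */
            dicts[i].loaded = false;
            unloaded++;
        }
    }
    return unloaded;
}
```

Note that the threshold here only triggers the cleanup and does not cap memory use: dictionaries that are still mapped stay loaded even if the total remains above the threshold.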
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
On 1/22/19 7:36 PM, Arthur Zakirov wrote:
пн, 21 янв. 2019 г. в 19:42, Arthur Zakirov <a.zakirov@postgrespro.ru>:
On 21.01.2019 17:56, Tomas Vondra wrote:
I wonder if we could devise some simple cache eviction policy. We don't
have any memory limit GUC anymore, but maybe we could use unload
dictionaries that were unused for sufficient amount of time (a couple of
minutes or so). Of course, the question is when exactly would it happen
(it seems far too expensive to invoke on each dict access, and it should
happen even when the dicts are not accessed at all).
Yes, I thought about such feature too. Agree, it could be expensive
since we need to scan pg_ts_dict table to get list of dictionaries (we
can't scan dshash_table).
I haven't a good solution yet. I just had a thought to return
max_shared_dictionaries_size. Then we can unload dictionaries (and scan
the pg_ts_dict table) that were accessed a lot time ago if we reached
the size limit.
We can't set exact size limit since we can't release the memory
immediately. So max_shared_dictionaries_size can be renamed to
shared_dictionaries_threshold. If it is equal to "0" then PostgreSQL has
unlimited space for dictionaries.
I want to propose to clean up segments during vacuum/autovacuum. I'm not
aware of the politics of cleaning up objects besides relations during
vacuum/autovacuum. Could be it a good idea?
I doubt that's a good idea, for a couple of reasons. For example, would
it be bound to autovacuum on a particular object or would it happen as
part of each vacuum run? If the dict cleanup happens only when vacuuming
a particular object, then which one? If it happens on each autovacuum
run, then that may easily be far too frequent (it essentially makes the
cases with too frequent autovacuum runs even worse).
But also what happens when there is only minimal write activity and thus no
regular autovacuum runs? Surely we should still do the dict cleanup.
Vacuum might unload dictionaries when total size of loaded dictionaries
exceeds a threshold. When it happens vacuum scans loaded dictionaries and
unloads (unpins segments and removes hash table entries) those dictionaries
which isn't mapped to any backend process (it happens because
dsm_pin_segment() is called) anymore.
Then why bind that to autovacuum at all? Why not just make it part
of loading the dictionary?
max_shared_dictionaries_size can be renamed to
shared_dictionaries_cleanup_threshold.
That really depends on what exactly the threshold does. If it only
triggers cleanup but does not enforce maximum amount of memory used by
dictionaries, then this name seems OK. If it ensures max amount of
memory, the max_..._size name would be better.
I think there are essentially two ways:
(a) Define max amount of memory available for shared dictionaries, and
come up with an eviction algorithm. This will be tricky, because when
the frequently-used dictionaries need a bit more memory than the limit,
this will result in thrashing (evict+load over and over).
(b) Define what "unused" means for dictionaries, and unload dictionaries
that become unused. For example, we could track timestamp of the last
time each dict was used, and decide that dictionaries unused for 5 or
more minutes are unused. And evict those.
The advantage of (b) is that it adapts automatically, more or less. When
you have a bunch of frequently used dictionaries, the amount of shared
memory increases. If you stop using them, it decreases after a while.
And rarely used dicts won't force eviction of the frequently used ones.
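Approach (b) can be sketched as follows. This is a minimal illustration, not code from any version of the patch: DictUsage, dict_touch() and evict_idle() are hypothetical names, the 5-minute timeout is taken from the example above, and the question of when the eviction pass should run is left open, as it is in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Dictionaries idle for this long are considered unused (assumed value). */
#define DICT_IDLE_TIMEOUT_SECS (5 * 60)

/* Hypothetical per-dictionary usage record. */
typedef struct DictUsage
{
    time_t last_used;  /* time of the most recent access */
    bool   loaded;     /* still present in shared memory? */
} DictUsage;

/* Record an access; would be called whenever the dictionary is used. */
static void
dict_touch(DictUsage *d, time_t now)
{
    d->last_used = now;
    d->loaded = true;
}

/* Unload every dictionary idle for the timeout or longer; returns the count. */
static int
evict_idle(DictUsage *dicts, int ndicts, time_t now)
{
    int evicted = 0;
    int i;

    for (i = 0; i < ndicts; i++)
    {
        if (dicts[i].loaded &&
            difftime(now, dicts[i].last_used) >= DICT_IDLE_TIMEOUT_SECS)
        {
            dicts[i].loaded = false;
            evicted++;
        }
    }
    return evicted;
}
```

With this scheme a frequently used dictionary keeps refreshing last_used and is never evicted, while a rarely used one drops out on its own without pushing out the others.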
cheers
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 22.01.2019 22:17, Tomas Vondra wrote:
On 1/22/19 7:36 PM, Arthur Zakirov wrote:
max_shared_dictionaries_size can be renamed to
shared_dictionaries_cleanup_threshold.
That really depends on what exactly the threshold does. If it only
triggers cleanup but does not enforce maximum amount of memory used by
dictionaries, then this name seems OK. If it ensures max amount of
memory, the max_..._size name would be better.
Yep, I thought about the first approach.
I think there are essentially two ways:
(a) Define max amount of memory available for shared dictionaries, and
come up with an eviction algorithm. This will be tricky, because when
the frequently-used dictionaries need a bit more memory than the limit,
this will result in thrashing (evict+load over and over).
(b) Define what "unused" means for dictionaries, and unload dictionaries
that become unused. For example, we could track timestamp of the last
time each dict was used, and decide that dictionaries unused for 5 or
more minutes are unused. And evict those.
The advantage of (b) is that it adapts automatically, more or less. When
you have a bunch of frequently used dictionaries, the amount of shared
memory increases. If you stop using them, it decreases after a while.
And rarely used dicts won't force eviction of the frequently used ones.
Thanks for sharing your ideas, Tomas. Unfortunately I won't manage to
develop new version of the patch till the end of the commitfest due to
lack of time. I'll think about the second approach. Tracking timestamp
of the last time a dict was used may be difficult though and may slow
down FTS...
I'll move the patch to the next commitfest.
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
On 01.02.2019 12:09, Arthur Zakirov wrote:
Thanks for sharing your ideas, Tomas. Unfortunately I won't manage to
develop new version of the patch till the end of the commitfest due to
lack of time. I'll think about the second approach. Tracking timestamp
of the last time a dict was used may be difficult though and may slow
down FTS...
I'll move the patch to the next commitfest.
Oh, it seems it can't be moved to the next commitfest from status
"Waiting on Author".
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
On Tue, Jan 22, 2019 at 2:17 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
I think there are essentially two ways:
(a) Define max amount of memory available for shared dictionaries, and
come up with an eviction algorithm. This will be tricky, because when
the frequently-used dictionaries need a bit more memory than the limit,
this will result in thrashing (evict+load over and over).
(b) Define what "unused" means for dictionaries, and unload dictionaries
that become unused. For example, we could track timestamp of the last
time each dict was used, and decide that dictionaries unused for 5 or
more minutes are unused. And evict those.
The advantage of (b) is that it adapts automatically, more or less. When
you have a bunch of frequently used dictionaries, the amount of shared
memory increases. If you stop using them, it decreases after a while.
And rarely used dicts won't force eviction of the frequently used ones.
+1 for (b).
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Hi,
On 2019-02-01 09:40:44 -0500, Robert Haas wrote:
On Tue, Jan 22, 2019 at 2:17 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
I think there are essentially two ways:
(a) Define max amount of memory available for shared dictionaries, and
come up with an eviction algorithm. This will be tricky, because when
the frequently-used dictionaries need a bit more memory than the limit,
this will result in thrashing (evict+load over and over).
(b) Define what "unused" means for dictionaries, and unload dictionaries
that become unused. For example, we could track timestamp of the last
time each dict was used, and decide that dictionaries unused for 5 or
more minutes are unused. And evict those.
The advantage of (b) is that it adapts automatically, more or less. When
you have a bunch of frequently used dictionaries, the amount of shared
memory increases. If you stop using them, it decreases after a while.
And rarely used dicts won't force eviction of the frequently used ones.
+1 for (b).
This patch has been waiting on author for two weeks, the commitfest has
ended, and there's substantial work needed. Therefore I'm marking the
patch as returned with feedback. Please resubmit a new version, once the
feedback has been addressed.
Greetings,
Andres Freund
Hello,
I've created the new commitfest entry since the previous entry was
closed with status "Returned with feedback":
https://commitfest.postgresql.org/22/2007/
I attached new version of the patch. There are changes only in
0003-Retrieve-shared-location-for-dict-v18.patch.
I added a reference counter to the dictionary entries of the shared hash
table. It is needed to avoid memory bloat: shared hash table entries have
to be deleted if there are a lot of ALTER and DROP TEXT SEARCH DICTIONARY
commands.
The previous version of the patch released unused DSM segments but left
shared hash table entries untouched.
There was a refcnt before:
/messages/by-id/20180403115720.GA7450@zakirov.localdomain
But at that time I didn't fully understand how on_dsm_detach() works.
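The reference counting on hash table entries can be sketched as follows. This is a simplified, single-process illustration under stated assumptions and not the v18 patch code: DictTable, dict_attach() and dict_detach() are hypothetical names, a small fixed array stands in for the dshash table, and locking is omitted entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_DICT_ENTRIES 8

/* Hypothetical entry of the shared table, keyed by dictionary id. */
typedef struct DictEntry
{
    unsigned int dict_id;
    int          refcnt;  /* number of users currently attached */
    bool         in_use;
} DictEntry;

typedef struct DictTable
{
    DictEntry entries[MAX_DICT_ENTRIES];
} DictTable;

static void
dict_table_init(DictTable *t)
{
    memset(t, 0, sizeof(*t));
}

static DictEntry *
dict_find(DictTable *t, unsigned int dict_id)
{
    int i;

    for (i = 0; i < MAX_DICT_ENTRIES; i++)
        if (t->entries[i].in_use && t->entries[i].dict_id == dict_id)
            return &t->entries[i];
    return NULL;
}

/* Find or create the entry and take a reference; NULL if the table is full. */
static DictEntry *
dict_attach(DictTable *t, unsigned int dict_id)
{
    DictEntry *e = dict_find(t, dict_id);
    int        i;

    if (e == NULL)
    {
        for (i = 0; i < MAX_DICT_ENTRIES; i++)
        {
            if (!t->entries[i].in_use)
            {
                e = &t->entries[i];
                break;
            }
        }
        if (e == NULL)
            return NULL;
        e->dict_id = dict_id;
        e->refcnt = 0;
        e->in_use = true;
    }
    e->refcnt++;
    return e;
}

/* Drop a reference; the entry is removed once nobody uses it anymore. */
static bool
dict_detach(DictTable *t, unsigned int dict_id)
{
    DictEntry *e = dict_find(t, dict_id);

    assert(e != NULL && e->refcnt > 0);
    if (--e->refcnt > 0)
        return false;
    e->in_use = false;  /* slot can be reused, so the table does not bloat */
    return true;
}
```

The point of the counter is that an entry disappears together with its last user, so repeated ALTER or DROP of dictionaries does not leave stale entries behind.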
On 22.01.2019 22:17, Tomas Vondra wrote:
I think there are essentially two ways:
(a) Define max amount of memory available for shared dictionaries, and
come up with an eviction algorithm. This will be tricky, because when
the frequently-used dictionaries need a bit more memory than the limit,
this will result in thrashing (evict+load over and over).
(b) Define what "unused" means for dictionaries, and unload dictionaries
that become unused. For example, we could track timestamp of the last
time each dict was used, and decide that dictionaries unused for 5 or
more minutes are unused. And evict those.
The advantage of (b) is that it adapts automatically, more or less. When
you have a bunch of frequently used dictionaries, the amount of shared
memory increases. If you stop using them, it decreases after a while.
And rarely used dicts won't force eviction of the frequently used ones.
I'm working on the (b) approach. I thought about a priority queue
structure. There is no such ready-made structure within the PostgreSQL
sources except binaryheap.c, but it isn't designed for concurrent use.
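For illustration, a priority queue ordered by last-use time could look like the following binary min-heap. This is a standalone sketch and does not use the binaryheap.c API: MinHeap, heap_push() and heap_pop() are hypothetical names, the capacity is fixed, and there is no locking, so a shared-memory version would need synchronization around every operation.

```c
#include <assert.h>
#include <stdint.h>

#define HEAP_CAPACITY 64

/* One queue element: a dictionary and the time it was last used. */
typedef struct HeapItem
{
    uint32_t dict_id;
    int64_t  last_used;
} HeapItem;

/* Array-based binary min-heap; the least recently used item is at index 0. */
typedef struct MinHeap
{
    HeapItem items[HEAP_CAPACITY];
    int      size;
} MinHeap;

static void
heap_swap(HeapItem *a, HeapItem *b)
{
    HeapItem tmp = *a;

    *a = *b;
    *b = tmp;
}

/* Insert an item and sift it up until its parent is not newer than it. */
static void
heap_push(MinHeap *h, HeapItem item)
{
    int i;

    assert(h->size < HEAP_CAPACITY);
    i = h->size++;
    h->items[i] = item;
    while (i > 0)
    {
        int parent = (i - 1) / 2;

        if (h->items[parent].last_used <= h->items[i].last_used)
            break;
        heap_swap(&h->items[parent], &h->items[i]);
        i = parent;
    }
}

/* Remove and return the item with the oldest last_used value. */
static HeapItem
heap_pop(MinHeap *h)
{
    HeapItem top;
    int      i = 0;

    assert(h->size > 0);
    top = h->items[0];
    h->items[0] = h->items[--h->size];
    for (;;)
    {
        int left = 2 * i + 1;
        int right = 2 * i + 2;
        int smallest = i;

        if (left < h->size &&
            h->items[left].last_used < h->items[smallest].last_used)
            smallest = left;
        if (right < h->size &&
            h->items[right].last_used < h->items[smallest].last_used)
            smallest = right;
        if (smallest == i)
            break;
        heap_swap(&h->items[i], &h->items[smallest]);
        i = smallest;
    }
    return top;
}
```

An eviction pass would pop items while the oldest one is past the idle timeout. Updating the key of an item on every dictionary access is not covered here and is likely the harder part in a concurrent setting.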
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Attachments:
0001-Fix-ispell-memory-handling-v18.patch (text/x-patch)
From 3e220e259eebc6b9730c9500176015b04e588cae Mon Sep 17 00:00:00 2001
From: Arthur Zakirov <z-arthur@yandex.ru>
Date: Thu, 17 Jan 2019 14:27:32 +0300
Subject: [PATCH 1/4] Fix-ispell-memory-handling
---
src/backend/tsearch/spell.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index eb39466b22..eb8416ce7f 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -78,6 +78,8 @@
#define tmpalloc(sz) MemoryContextAlloc(Conf->buildCxt, (sz))
#define tmpalloc0(sz) MemoryContextAllocZero(Conf->buildCxt, (sz))
+#define tmpstrdup(str) MemoryContextStrdup(Conf->buildCxt, (str))
+
/*
* Prepare for constructing an ISpell dictionary.
*
@@ -498,7 +500,7 @@ NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
strcpy(Conf->Spell[Conf->nspell]->word, word);
Conf->Spell[Conf->nspell]->p.flag = (*flag != '\0')
- ? cpstrdup(Conf, flag) : VoidString;
+ ? tmpstrdup(flag) : VoidString;
Conf->nspell++;
}
@@ -1040,7 +1042,7 @@ setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
entry->flag.i = i;
}
else
- entry->flag.s = cpstrdup(Conf, s);
+ entry->flag.s = tmpstrdup(s);
entry->flagMode = Conf->flagMode;
entry->value = val;
@@ -1541,6 +1543,9 @@ nextline:
return;
isnewformat:
+ pfree(recoded);
+ pfree(pstr);
+
if (oldformat)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
--
2.20.1
0002-Change-tmplinit-argument-v18.patch (text/x-patch)
From 291667b579641176ca74eaa343521dd5c258a744 Mon Sep 17 00:00:00 2001
From: Arthur Zakirov <z-arthur@yandex.ru>
Date: Thu, 17 Jan 2019 15:05:44 +0300
Subject: [PATCH 2/4] Change-tmplinit-argument
---
contrib/dict_int/dict_int.c | 4 +-
contrib/dict_xsyn/dict_xsyn.c | 4 +-
contrib/unaccent/unaccent.c | 4 +-
src/backend/commands/tsearchcmds.c | 10 ++++-
src/backend/snowball/dict_snowball.c | 4 +-
src/backend/tsearch/dict_ispell.c | 4 +-
src/backend/tsearch/dict_simple.c | 4 +-
src/backend/tsearch/dict_synonym.c | 4 +-
src/backend/tsearch/dict_thesaurus.c | 4 +-
src/backend/utils/cache/ts_cache.c | 13 +++++-
src/include/tsearch/ts_cache.h | 4 ++
src/include/tsearch/ts_public.h | 67 ++++++++++++++++++++++++++--
12 files changed, 105 insertions(+), 21 deletions(-)
diff --git a/contrib/dict_int/dict_int.c b/contrib/dict_int/dict_int.c
index 628b9769c3..ddde55eee4 100644
--- a/contrib/dict_int/dict_int.c
+++ b/contrib/dict_int/dict_int.c
@@ -30,7 +30,7 @@ PG_FUNCTION_INFO_V1(dintdict_lexize);
Datum
dintdict_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictInt *d;
ListCell *l;
@@ -38,7 +38,7 @@ dintdict_init(PG_FUNCTION_ARGS)
d->maxlen = 6;
d->rejectlong = false;
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/contrib/dict_xsyn/dict_xsyn.c b/contrib/dict_xsyn/dict_xsyn.c
index 509e14aee0..15b1a0033a 100644
--- a/contrib/dict_xsyn/dict_xsyn.c
+++ b/contrib/dict_xsyn/dict_xsyn.c
@@ -140,7 +140,7 @@ read_dictionary(DictSyn *d, const char *filename)
Datum
dxsyn_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSyn *d;
ListCell *l;
char *filename = NULL;
@@ -153,7 +153,7 @@ dxsyn_init(PG_FUNCTION_ARGS)
d->matchsynonyms = false;
d->keepsynonyms = true;
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/contrib/unaccent/unaccent.c b/contrib/unaccent/unaccent.c
index fc5176e338..f3663cefd0 100644
--- a/contrib/unaccent/unaccent.c
+++ b/contrib/unaccent/unaccent.c
@@ -270,12 +270,12 @@ PG_FUNCTION_INFO_V1(unaccent_init);
Datum
unaccent_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
TrieChar *rootTrie = NULL;
bool fileloaded = false;
ListCell *l;
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/commands/tsearchcmds.c b/src/backend/commands/tsearchcmds.c
index 8e5eec22b5..30c5eb72a2 100644
--- a/src/backend/commands/tsearchcmds.c
+++ b/src/backend/commands/tsearchcmds.c
@@ -389,17 +389,25 @@ verify_dictoptions(Oid tmplId, List *dictoptions)
}
else
{
+ DictInitData init_data;
+
/*
* Copy the options just in case init method thinks it can scribble on
* them ...
*/
dictoptions = copyObject(dictoptions);
+ init_data.dict_options = dictoptions;
+ init_data.dict.id = InvalidOid;
+ init_data.dict.xmin = InvalidTransactionId;
+ init_data.dict.xmax = InvalidTransactionId;
+ ItemPointerSetInvalid(&init_data.dict.tid);
+
/*
* Call the init method and see if it complains. We don't worry about
* it leaking memory, since our command will soon be over anyway.
*/
- (void) OidFunctionCall1(initmethod, PointerGetDatum(dictoptions));
+ (void) OidFunctionCall1(initmethod, PointerGetDatum(&init_data));
}
ReleaseSysCache(tup);
diff --git a/src/backend/snowball/dict_snowball.c b/src/backend/snowball/dict_snowball.c
index 5166738310..f30f29865c 100644
--- a/src/backend/snowball/dict_snowball.c
+++ b/src/backend/snowball/dict_snowball.c
@@ -201,14 +201,14 @@ locate_stem_module(DictSnowball *d, const char *lang)
Datum
dsnowball_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSnowball *d;
bool stoploaded = false;
ListCell *l;
d = (DictSnowball *) palloc0(sizeof(DictSnowball));
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index 8b05a477f1..fc9a96abca 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -29,7 +29,7 @@ typedef struct
Datum
dispell_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictISpell *d;
bool affloaded = false,
dictloaded = false,
@@ -40,7 +40,7 @@ dispell_init(PG_FUNCTION_ARGS)
NIStartBuild(&(d->obj));
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_simple.c b/src/backend/tsearch/dict_simple.c
index 2f62ef00c8..c92744641b 100644
--- a/src/backend/tsearch/dict_simple.c
+++ b/src/backend/tsearch/dict_simple.c
@@ -29,7 +29,7 @@ typedef struct
Datum
dsimple_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSimple *d = (DictSimple *) palloc0(sizeof(DictSimple));
bool stoploaded = false,
acceptloaded = false;
@@ -37,7 +37,7 @@ dsimple_init(PG_FUNCTION_ARGS)
d->accept = true; /* default */
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_synonym.c b/src/backend/tsearch/dict_synonym.c
index b6226df940..d3f5f0da3f 100644
--- a/src/backend/tsearch/dict_synonym.c
+++ b/src/backend/tsearch/dict_synonym.c
@@ -91,7 +91,7 @@ compareSyn(const void *a, const void *b)
Datum
dsynonym_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSyn *d;
ListCell *l;
char *filename = NULL;
@@ -104,7 +104,7 @@ dsynonym_init(PG_FUNCTION_ARGS)
char *line = NULL;
uint16 flags = 0;
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_thesaurus.c b/src/backend/tsearch/dict_thesaurus.c
index 75f8deef6a..8962e252e0 100644
--- a/src/backend/tsearch/dict_thesaurus.c
+++ b/src/backend/tsearch/dict_thesaurus.c
@@ -604,7 +604,7 @@ compileTheSubstitute(DictThesaurus *d)
Datum
thesaurus_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictThesaurus *d;
char *subdictname = NULL;
bool fileloaded = false;
@@ -612,7 +612,7 @@ thesaurus_init(PG_FUNCTION_ARGS)
d = (DictThesaurus *) palloc0(sizeof(DictThesaurus));
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/utils/cache/ts_cache.c b/src/backend/utils/cache/ts_cache.c
index 0545efc75b..8bc8d82c76 100644
--- a/src/backend/utils/cache/ts_cache.c
+++ b/src/backend/utils/cache/ts_cache.c
@@ -39,6 +39,7 @@
#include "catalog/pg_ts_template.h"
#include "commands/defrem.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_public.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
#include "utils/fmgroids.h"
@@ -311,11 +312,15 @@ lookup_ts_dictionary_cache(Oid dictId)
MemSet(entry, 0, sizeof(TSDictionaryCacheEntry));
entry->dictId = dictId;
entry->dictCtx = saveCtx;
+ entry->dict_xmin = HeapTupleHeaderGetRawXmin(tpdict->t_data);
+ entry->dict_xmax = HeapTupleHeaderGetRawXmax(tpdict->t_data);
+ entry->dict_tid = tpdict->t_self;
entry->lexizeOid = template->tmpllexize;
if (OidIsValid(template->tmplinit))
{
+ DictInitData init_data;
List *dictoptions;
Datum opt;
bool isnull;
@@ -335,9 +340,15 @@ lookup_ts_dictionary_cache(Oid dictId)
else
dictoptions = deserialize_deflist(opt);
+ init_data.dict_options = dictoptions;
+ init_data.dict.id = dictId;
+ init_data.dict.xmin = entry->dict_xmin;
+ init_data.dict.xmax = entry->dict_xmax;
+ init_data.dict.tid = entry->dict_tid;
+
entry->dictData =
DatumGetPointer(OidFunctionCall1(template->tmplinit,
- PointerGetDatum(dictoptions)));
+ PointerGetDatum(&init_data)));
MemoryContextSwitchTo(oldcontext);
}
diff --git a/src/include/tsearch/ts_cache.h b/src/include/tsearch/ts_cache.h
index 77e325d101..2298e0a275 100644
--- a/src/include/tsearch/ts_cache.h
+++ b/src/include/tsearch/ts_cache.h
@@ -54,6 +54,10 @@ typedef struct TSDictionaryCacheEntry
Oid dictId;
bool isvalid;
+ TransactionId dict_xmin; /* XMIN of the dictionary's tuple */
+ TransactionId dict_xmax; /* XMAX of the dictionary's tuple */
+ ItemPointerData dict_tid; /* TID of the dictionary's tuple */
+
/* most frequent fmgr call */
Oid lexizeOid;
FmgrInfo lexize;
diff --git a/src/include/tsearch/ts_public.h b/src/include/tsearch/ts_public.h
index b325fa122c..db028ed6ad 100644
--- a/src/include/tsearch/ts_public.h
+++ b/src/include/tsearch/ts_public.h
@@ -13,6 +13,8 @@
#ifndef _PG_TS_PUBLIC_H_
#define _PG_TS_PUBLIC_H_
+#include "nodes/pg_list.h"
+#include "storage/itemptr.h"
#include "tsearch/ts_type.h"
/*
@@ -81,10 +83,69 @@ extern void readstoplist(const char *fname, StopList *s,
extern bool searchstoplist(StopList *s, char *key);
/*
- * Interface with dictionaries
+ * API for text search dictionaries.
+ *
+ * API functions to manage a text search dictionary are defined by a text search
+ * template. Currently an existing template cannot be altered in order to use
+ * different functions. The API consists of the following functions:
+ *
+ * init function
+ * -------------
+ * - optional function which initializes internal structures of the dictionary
+ * - accepts DictInitData structure as an argument and must return a custom
+ * palloc'd structure which stores content of the processed dictionary and
+ * is used by lexize function
+ *
+ * lexize function
+ * ---------------
+ * - normalizes a single word (token) using specific dictionary
+ * - returns a palloc'd array of TSLexeme, with a terminating NULL entry
+ * - accepts the following arguments:
+ *
+ * - dictData - pointer to a structure returned by init function or NULL if
+ * init function wasn't defined by the template
+ * - token - string to normalize (not null-terminated)
+ * - length - length of the token
+ * - dictState - pointer to a DictSubState structure which stores the current
+ * state of processing a set of tokens and allows phrases to be normalized
+ */
+
+/*
+ * A preprocessed dictionary can be stored in shared memory using DSM - this is
+ * decided in the init function. A DSM segment is released after altering or
+ * dropping the dictionary. The segment may still leak, when a backend uses the
+ * dictionary right before dropping - in that case the backend will hold the DSM
+ * until it disconnects or calls lookup_ts_dictionary_cache().
+ *
+ * DictEntryData represents DSM segment with a preprocessed dictionary. We need
+ * to ensure the content of the DSM segment is still valid, which is what xmin,
+ * xmax and tid are for.
+ */
+typedef struct
+{
+ Oid id; /* OID of the dictionary */
+ TransactionId xmin; /* XMIN of the dictionary's tuple */
+ TransactionId xmax; /* XMAX of the dictionary's tuple */
+ ItemPointerData tid; /* TID of the dictionary's tuple */
+} DictEntryData;
+
+/*
+ * API structure for a dictionary initialization. It is passed as an argument
+ * to a template's init function.
*/
+typedef struct
+{
+ /* List of options for a template's init method */
+ List *dict_options;
+
+ /* Data used to allocate, search and release the DSM segment */
+ DictEntryData dict;
+} DictInitData;
-/* return struct for any lexize function */
+/*
+ * Return struct for any lexize function. They are combined into an array, the
+ * last entry is the terminating entry.
+ */
typedef struct
{
/*----------
@@ -108,7 +169,7 @@ typedef struct
uint16 flags; /* See flag bits below */
- char *lexeme; /* C string */
+ char *lexeme; /* C string (NULL for terminating entry) */
} TSLexeme;
/* Flag bits that can appear in TSLexeme.flags */
--
2.20.1
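To make the new init-method contract from the patch above easier to picture, here is a minimal standalone sketch, not the patch's code. It assumes simplified stand-ins for the server types: MockOption replaces List/DefElem, MockItemPointer replaces ItemPointerData, and mock_dict_init and the shareable field are hypothetical names invented for illustration. It shows an init function reading its options through init_data->dict_options, and treating an invalid dict.id (as set by verify_dictoptions()) as "build privately, do not share".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins for PostgreSQL types; the real ones come from the server headers. */
typedef uint32_t Oid;
typedef uint32_t TransactionId;
#define InvalidOid ((Oid) 0)

typedef struct
{
	uint32_t	block;
	uint16_t	offset;
} MockItemPointer;				/* hypothetical replacement for ItemPointerData */

/* Hypothetical flattened option list replacing List of DefElem. */
typedef struct MockOption
{
	const char *name;
	const char *value;
	const struct MockOption *next;
} MockOption;

/* Same fields as DictEntryData in the patch. */
typedef struct
{
	Oid			id;
	TransactionId xmin;
	TransactionId xmax;
	MockItemPointer tid;
} DictEntryData;

/* Same shape as DictInitData in the patch. */
typedef struct
{
	const MockOption *dict_options;
	DictEntryData dict;
} DictInitData;

/* Result of the hypothetical init function. */
typedef struct
{
	int			maxlen;
	bool		rejectlong;
	bool		shareable;		/* illustration only: could this go to DSM? */
} MockDictInt;

/*
 * Hypothetical init function: options are now reached through
 * init_data->dict_options instead of being the argument itself.
 */
MockDictInt *
mock_dict_init(const DictInitData *init_data)
{
	MockDictInt *d;
	const MockOption *opt;

	assert(init_data != NULL);

	d = calloc(1, sizeof(MockDictInt));
	if (d == NULL)
		return NULL;

	/* defaults, as in dintdict_init() */
	d->maxlen = 6;
	d->rejectlong = false;

	for (opt = init_data->dict_options; opt != NULL; opt = opt->next)
	{
		if (strcmp(opt->name, "maxlen") == 0)
			d->maxlen = atoi(opt->value);
		else if (strcmp(opt->name, "rejectlong") == 0)
			d->rejectlong = (strcmp(opt->value, "true") == 0);
	}

	/* An invalid OID means the call came from option verification. */
	d->shareable = (init_data->dict.id != InvalidOid);

	return d;
}
```

A caller would fill dict.id, dict.xmin, dict.xmax and dict.tid from the dictionary's catalog tuple, as lookup_ts_dictionary_cache() does in the patch, or leave them invalid when only validating options.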
0003-Retrieve-shared-location-for-dict-v18.patch (text/x-patch)
From e5f3745578ad3cf65e00714024781b5ae82de1a6 Mon Sep 17 00:00:00 2001
From: Arthur Zakirov <z-arthur@yandex.ru>
Date: Thu, 17 Jan 2019 15:38:09 +0300
Subject: [PATCH 3/4] Retrieve-shared-location-for-dict
---
src/backend/commands/tsearchcmds.c | 11 +
src/backend/storage/ipc/ipci.c | 7 +
src/backend/tsearch/Makefile | 2 +-
src/backend/tsearch/ts_shared.c | 494 +++++++++++++++++++++++++++++
src/backend/utils/cache/ts_cache.c | 51 +++
src/include/storage/lwlock.h | 2 +
src/include/tsearch/ts_cache.h | 3 +
src/include/tsearch/ts_shared.h | 28 ++
8 files changed, 597 insertions(+), 1 deletion(-)
create mode 100644 src/backend/tsearch/ts_shared.c
create mode 100644 src/include/tsearch/ts_shared.h
diff --git a/src/backend/commands/tsearchcmds.c b/src/backend/commands/tsearchcmds.c
index 30c5eb72a2..90ad24c019 100644
--- a/src/backend/commands/tsearchcmds.c
+++ b/src/backend/commands/tsearchcmds.c
@@ -40,6 +40,7 @@
#include "nodes/makefuncs.h"
#include "parser/parse_func.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_shared.h"
#include "tsearch/ts_utils.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
@@ -528,6 +529,11 @@ RemoveTSDictionaryById(Oid dictId)
CatalogTupleDelete(relation, &tup->t_self);
+ ts_dict_shmem_release(dictId,
+ HeapTupleHeaderGetRawXmin(tup->t_data),
+ HeapTupleHeaderGetRawXmax(tup->t_data),
+ tup->t_self, true);
+
ReleaseSysCache(tup);
table_close(relation, RowExclusiveLock);
@@ -637,6 +643,11 @@ AlterTSDictionary(AlterTSDictionaryStmt *stmt)
ObjectAddressSet(address, TSDictionaryRelationId, dictId);
+ ts_dict_shmem_release(dictId,
+ HeapTupleHeaderGetRawXmin(tup->t_data),
+ HeapTupleHeaderGetRawXmax(tup->t_data),
+ tup->t_self, true);
+
/*
* NOTE: because we only support altering the options, not the template,
* there is no need to update dependencies. This might have to change if
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 5965d3620f..029354b16c 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -44,6 +44,7 @@
#include "storage/procsignal.h"
#include "storage/sinvaladt.h"
#include "storage/spin.h"
+#include "tsearch/ts_shared.h"
#include "utils/snapmgr.h"
/* GUCs */
@@ -150,6 +151,7 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
size = add_size(size, BTreeShmemSize());
size = add_size(size, SyncScanShmemSize());
size = add_size(size, AsyncShmemSize());
+ size = add_size(size, TsearchShmemSize());
#ifdef EXEC_BACKEND
size = add_size(size, ShmemBackendArraySize());
#endif
@@ -270,6 +272,11 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
SyncScanShmemInit();
AsyncShmemInit();
+ /*
+ * Set up shared memory for tsearch
+ */
+ TsearchShmemInit();
+
#ifdef EXEC_BACKEND
/*
diff --git a/src/backend/tsearch/Makefile b/src/backend/tsearch/Makefile
index 62d8bb3254..0b25c20fb0 100644
--- a/src/backend/tsearch/Makefile
+++ b/src/backend/tsearch/Makefile
@@ -26,7 +26,7 @@ DICTFILES_PATH=$(addprefix dicts/,$(DICTFILES))
OBJS = ts_locale.o ts_parse.o wparser.o wparser_def.o dict.o \
dict_simple.o dict_synonym.o dict_thesaurus.o \
dict_ispell.o regis.o spell.o \
- to_tsany.o ts_selfuncs.o ts_typanalyze.o ts_utils.o
+ to_tsany.o ts_selfuncs.o ts_shared.o ts_typanalyze.o ts_utils.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/tsearch/ts_shared.c b/src/backend/tsearch/ts_shared.c
new file mode 100644
index 0000000000..d84fcb5eff
--- /dev/null
+++ b/src/backend/tsearch/ts_shared.c
@@ -0,0 +1,494 @@
+/*-------------------------------------------------------------------------
+ *
+ * ts_shared.c
+ * tsearch shared memory management
+ *
+ * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/tsearch/ts_shared.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/hash.h"
+#include "lib/dshash.h"
+#include "miscadmin.h"
+#include "storage/lwlock.h"
+#include "storage/shmem.h"
+#include "storage/spin.h"
+#include "tsearch/ts_shared.h"
+#include "utils/hashutils.h"
+#include "utils/memutils.h"
+
+
+/*
+ * Hash table entries key.
+ */
+typedef struct
+{
+ Oid db_id;
+ DictEntryData dict;
+} TsearchDictKey;
+
+/*
+ * Hash table entries representing shared dictionaries.
+ */
+typedef struct
+{
+ TsearchDictKey key;
+ dsm_handle dict_dsm;
+
+ /*
+ * We need a flag that the DSM segment is pinned/unpinned. Otherwise we can
+ * face double dsm_unpin_segment().
+ */
+ bool segment_ispinned;
+
+ slock_t mutex; /* protects the reference count */
+ uint32 refcnt; /* number of mapped backends */
+} TsearchDictEntry;
+
+/*
+ * Compiled dictionary data stored within the hash table.
+ */
+typedef struct
+{
+ TsearchDictKey dict_key; /* entry's key used to release the entry */
+ char dict[FLEXIBLE_ARRAY_MEMBER];
+} TsearchDictData;
+
+#define TsearchDictDataHdrSize MAXALIGN(offsetof(TsearchDictData, dict))
+
+static dshash_table *dict_table = NULL;
+
+/*
+ * Information about the main shmem segment, used to coordinate
+ * access to the hash table and dictionaries.
+ */
+typedef struct
+{
+ dsa_handle area;
+ dshash_table_handle dict_table_handle;
+
+ LWLock lock;
+} TsearchCtlData;
+
+static TsearchCtlData *tsearch_ctl;
+
+static int tsearch_dict_cmp(const void *a, const void *b, size_t size,
+ void *arg);
+static uint32 tsearch_dict_hash(const void *a, size_t size, void *arg);
+
+static void init_dict_table(void);
+static dsm_segment *dict_entry_init(TsearchDictKey *key,
+ TsearchDictEntry *entry, void *dict,
+ Size dict_size);
+static dsm_segment *dict_entry_attach(TsearchDictEntry *entry);
+static void dict_entry_on_detach(dsm_segment *segment, Datum datum);
+
+/* Parameters for dict_table */
+static const dshash_parameters dict_table_params ={
+ sizeof(TsearchDictKey),
+ sizeof(TsearchDictEntry),
+ tsearch_dict_cmp,
+ tsearch_dict_hash,
+ LWTRANCHE_TSEARCH_TABLE
+};
+
+/*
+ * Build the dictionary using allocate_cb callback.
+ *
+ * First try to find the dictionary in the shared hash table. If someone built
+ * it earlier, just return its location in DSM.
+ *
+ * init_data: an argument used within a template's init method.
+ * allocate_cb: function to build the dictionary, if it wasn't found in DSM.
+ *
+ * Returns address in the dynamic shared memory segment or in backend memory.
+ */
+void *
+ts_dict_shmem_location(DictInitData *init_data,
+ ts_dict_build_callback allocate_cb)
+{
+ TsearchDictKey key;
+ TsearchDictEntry *entry;
+ TsearchDictData *dict_data;
+ bool found;
+ dsm_segment *seg;
+ void *dict;
+ Size dict_size;
+
+ init_dict_table();
+
+ /*
+ * Build the dictionary in the backend's memory if the dictionary OID is
+ * invalid (this may happen if the dictionary's init method was called
+ * within verify_dictoptions()).
+ */
+ if (!OidIsValid(init_data->dict.id))
+ {
+ dict = allocate_cb(init_data->dict_options, &dict_size);
+
+ return dict;
+ }
+
+ /* Set up key for hashtable search */
+ key.db_id = MyDatabaseId;
+ key.dict = init_data->dict;
+
+ /* Try to find an entry in the hash table */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, &key, false);
+
+ if (entry)
+ {
+ seg = dsm_find_mapping(entry->dict_dsm);
+ if (!seg)
+ seg = dict_entry_attach(entry);
+ dshash_release_lock(dict_table, entry);
+
+ dict_data = (TsearchDictData *) dsm_segment_address(seg);
+ return dict_data->dict;
+ }
+
+ /* Dictionary hasn't been loaded into memory yet */
+ entry = (TsearchDictEntry *) dshash_find_or_insert(dict_table, &key,
+ &found);
+
+ if (found)
+ {
+ /*
+ * Someone concurrently inserted a dictionary entry since the first time
+ * we checked.
+ */
+ seg = dict_entry_attach(entry);
+ dshash_release_lock(dict_table, entry);
+
+ dict_data = (TsearchDictData *) dsm_segment_address(seg);
+ return dict_data->dict;
+ }
+
+ /* Build the dictionary */
+ dict = allocate_cb(init_data->dict_options, &dict_size);
+
+ /* Finally, initialize the dictionary entry */
+ seg = dict_entry_init(&key, entry, dict, dict_size);
+ dshash_release_lock(dict_table, entry);
+
+ pfree(dict);
+
+ dict_data = (TsearchDictData *) dsm_segment_address(seg);
+ return dict_data->dict;
+}
+
+/*
+ * Release memory occupied by the dictionary. Function unpins DSM mapping and
+ * if the dictionary is being dropped or altered unpins the DSM segment.
+ *
+ * The segment may still leak. This happens if some backend used the
+ * dictionary before it was dropped: that backend will hold its DSM segment
+ * until it disconnects or calls lookup_ts_dictionary_cache().
+ *
+ * id, xmin, xmax, tid: information to search the dictionary's DSM segment.
+ * unpin_segment: true if we need to unpin the segment because the dictionary
+ * was dropped or altered.
+ */
+void
+ts_dict_shmem_release(Oid id, TransactionId xmin, TransactionId xmax,
+ ItemPointerData tid, bool unpin_segment)
+{
+ TsearchDictKey key;
+ TsearchDictEntry *entry;
+
+ /*
+ * If we didn't attach to a hash table then do nothing.
+ */
+ if (!dict_table && !unpin_segment)
+ return;
+ /*
+ * But if we need to unpin the DSM segment, so that the segment goes away
+ * when the last interested process disconnects, we need the hash table to
+ * find the dictionary's entry.
+ */
+ else if (unpin_segment)
+ init_dict_table();
+
+ /* Set up key for hashtable search */
+ key.db_id = MyDatabaseId;
+ key.dict.id = id;
+ key.dict.xmin = xmin;
+ key.dict.xmax = xmax;
+ key.dict.tid = tid;
+
+ /* Try to find an entry in the hash table */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, &key, true);
+
+ if (entry)
+ {
+ dsm_segment *seg;
+
+ seg = dsm_find_mapping(entry->dict_dsm);
+
+ if (seg)
+ {
+ TsearchDictData *dict_data;
+
+ dsm_unpin_mapping(seg);
+ /*
+ * Cancel cleanup callback to avoid a deadlock. Cleanup is done
+ * below.
+ */
+ dict_data = (TsearchDictData *) dsm_segment_address(seg);
+ cancel_on_dsm_detach(seg, dict_entry_on_detach,
+ PointerGetDatum(&dict_data->dict_key));
+ dsm_detach(seg);
+
+ entry->refcnt--;
+ }
+
+ if (unpin_segment && entry->segment_ispinned)
+ {
+ dsm_unpin_segment(entry->dict_dsm);
+ entry->segment_ispinned = false;
+
+ Assert(entry->refcnt > 0);
+ entry->refcnt--;
+ }
+
+ if (entry->refcnt == 0)
+ dshash_delete_entry(dict_table, entry);
+ else
+ dshash_release_lock(dict_table, entry);
+ }
+}
+
+/*
+ * Allocate and initialize tsearch-related shared memory.
+ */
+void
+TsearchShmemInit(void)
+{
+ bool found;
+
+ tsearch_ctl = (TsearchCtlData *)
+ ShmemInitStruct("Full Text Search Ctl", sizeof(TsearchCtlData), &found);
+
+ if (!found)
+ {
+ LWLockRegisterTranche(LWTRANCHE_TSEARCH_DSA, "tsearch_dsa");
+ LWLockRegisterTranche(LWTRANCHE_TSEARCH_TABLE, "tsearch_table");
+
+ LWLockInitialize(&tsearch_ctl->lock, LWTRANCHE_TSEARCH_DSA);
+
+ tsearch_ctl->area = DSM_HANDLE_INVALID;
+ tsearch_ctl->dict_table_handle = InvalidDsaPointer;
+ }
+}
+
+/*
+ * Report shared memory space needed by TsearchShmemInit.
+ */
+Size
+TsearchShmemSize(void)
+{
+ Size size = 0;
+
+ /* size of service structure */
+ size = add_size(size, MAXALIGN(sizeof(TsearchCtlData)));
+
+ return size;
+}
+
+/*
+ * A comparator function for TsearchDictKey.
+ *
+ * Returns 0 if keys are equal, 1 otherwise.
+ */
+static int
+tsearch_dict_cmp(const void *a, const void *b, size_t size, void *arg)
+{
+ TsearchDictKey *k1 = (TsearchDictKey *) a;
+ TsearchDictKey *k2 = (TsearchDictKey *) b;
+
+ if (k1->db_id == k2->db_id && k1->dict.id == k2->dict.id &&
+ k1->dict.xmin == k2->dict.xmin && k1->dict.xmax == k2->dict.xmax &&
+ ItemPointerEquals(&k1->dict.tid, &k2->dict.tid))
+ return 0;
+ else
+ return 1;
+}
+
+/*
+ * A hash function for TsearchDictKey.
+ */
+static uint32
+tsearch_dict_hash(const void *a, size_t size, void *arg)
+{
+ TsearchDictKey *k = (TsearchDictKey *) a;
+ uint32 s;
+
+ s = hash_combine(0, hash_uint32(k->db_id));
+ s = hash_combine(s, hash_uint32(k->dict.id));
+ s = hash_combine(s, hash_uint32(k->dict.xmin));
+ s = hash_combine(s, hash_uint32(k->dict.xmax));
+ s = hash_combine(s,
+ hash_uint32(BlockIdGetBlockNumber(&k->dict.tid.ip_blkid)));
+ s = hash_combine(s, hash_uint32(k->dict.tid.ip_posid));
+
+ return s;
+}
+
+/*
+ * Initialize hash table located in DSM.
+ *
+ * The hash table should be created and initialized if it doesn't exist yet.
+ */
+static void
+init_dict_table(void)
+{
+ MemoryContext old_context;
+ dsa_area *dsa;
+
+ /* Exit if hash table was initialized already */
+ if (dict_table)
+ return;
+
+ old_context = MemoryContextSwitchTo(TopMemoryContext);
+
+recheck_table:
+ LWLockAcquire(&tsearch_ctl->lock, LW_SHARED);
+
+ /* Hash table has been created already by someone */
+ if (DsaPointerIsValid(tsearch_ctl->dict_table_handle))
+ {
+ Assert(tsearch_ctl->area != DSM_HANDLE_INVALID);
+
+ dsa = dsa_attach(tsearch_ctl->area);
+
+ dict_table = dshash_attach(dsa,
+ &dict_table_params,
+ tsearch_ctl->dict_table_handle,
+ NULL);
+ }
+ else
+ {
+ /* Try to get exclusive lock */
+ LWLockRelease(&tsearch_ctl->lock);
+ if (!LWLockAcquireOrWait(&tsearch_ctl->lock, LW_EXCLUSIVE))
+ {
+ /*
+ * We had to wait and the lock was released by another backend,
+ * which has probably created the hash table already. Recheck.
+ */
+ goto recheck_table;
+ }
+
+ dsa = dsa_create(LWTRANCHE_TSEARCH_DSA);
+ tsearch_ctl->area = dsa_get_handle(dsa);
+
+ dict_table = dshash_create(dsa, &dict_table_params, NULL);
+ tsearch_ctl->dict_table_handle = dshash_get_hash_table_handle(dict_table);
+
+ /* Remain attached until end of postmaster */
+ dsa_pin(dsa);
+ }
+
+ LWLockRelease(&tsearch_ctl->lock);
+
+ /* Remain attached until end of session */
+ dsa_pin_mapping(dsa);
+
+ MemoryContextSwitchTo(old_context);
+}
+
+/*
+ * Initialize a dictionary's DSM segment entry within shared hash table.
+ */
+static dsm_segment *
+dict_entry_init(TsearchDictKey *key, TsearchDictEntry *entry, void *dict,
+ Size dict_size)
+{
+ TsearchDictData *dict_data;
+ dsm_segment *seg;
+
+ /* Allocate a DSM segment for the compiled dictionary */
+ seg = dsm_create(TsearchDictDataHdrSize + dict_size, 0);
+ dict_data = (TsearchDictData *) dsm_segment_address(seg);
+ dict_data->dict_key = *key;
+ memcpy(dict_data->dict, dict, dict_size);
+
+ entry->key = *key;
+ entry->dict_dsm = dsm_segment_handle(seg);
+ entry->segment_ispinned = true;
+ SpinLockInit(&entry->mutex);
+ entry->refcnt = 2; /* 1 for session + 1 for postmaster */
+
+ /* Remain attached until end of postmaster */
+ dsm_pin_segment(seg);
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+
+ /* Register the shared hash table cleanup callback */
+ on_dsm_detach(seg, dict_entry_on_detach,
+ PointerGetDatum(&dict_data->dict_key));
+
+ return seg;
+}
+
+/*
+ * Attach a dictionary's DSM segment and pin mapping until end of session.
+ *
+ * The entry's reference counter is incremented so that the on-dsm-detach
+ * callback can release the entry once no one else has a mapping to this DSM.
+ */
+static dsm_segment *
+dict_entry_attach(TsearchDictEntry *entry)
+{
+ dsm_segment *seg;
+ TsearchDictData *dict_data;
+
+ seg = dsm_attach(entry->dict_dsm);
+ if (seg == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("could not map dynamic shared memory segment")));
+ /* Remain attached until end of session */
+ dsm_pin_mapping(seg);
+
+ /* We need a mutex here since the entry might be locked non-exclusively */
+ SpinLockAcquire(&entry->mutex);
+ entry->refcnt++;
+ SpinLockRelease(&entry->mutex);
+
+ dict_data = (TsearchDictData *) dsm_segment_address(seg);
+ /* Register the shared hash table cleanup callback */
+ on_dsm_detach(seg, dict_entry_on_detach,
+ PointerGetDatum(&dict_data->dict_key));
+
+ return seg;
+}
+
+/*
+ * When a session detaches from a DSM segment we need to check whether someone
+ * else is still attached to the segment. If not, delete the related shared
+ * hash table entry.
+ */
+static void
+dict_entry_on_detach(dsm_segment *segment, Datum datum)
+{
+ TsearchDictKey *key = (TsearchDictKey *) DatumGetPointer(datum);
+ TsearchDictEntry *entry;
+
+ /* Find the entry and lock it to decrement the refcnt */
+ entry = (TsearchDictEntry *) dshash_find(dict_table, key, true);
+ if (entry)
+ {
+ Assert(entry->refcnt > 0);
+ if (--entry->refcnt == 0)
+ dshash_delete_entry(dict_table, entry);
+ else
+ dshash_release_lock(dict_table, entry);
+ }
+}
diff --git a/src/backend/utils/cache/ts_cache.c b/src/backend/utils/cache/ts_cache.c
index 8bc8d82c76..ea8fb9a039 100644
--- a/src/backend/utils/cache/ts_cache.c
+++ b/src/backend/utils/cache/ts_cache.c
@@ -40,6 +40,7 @@
#include "commands/defrem.h"
#include "tsearch/ts_cache.h"
#include "tsearch/ts_public.h"
+#include "tsearch/ts_shared.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
#include "utils/fmgroids.h"
@@ -75,6 +76,7 @@ static TSConfigCacheEntry *lastUsedConfig = NULL;
char *TSCurrentConfig = NULL;
static Oid TSCurrentConfigCache = InvalidOid;
+static bool has_invalid_dictionary = false;
/*
@@ -86,6 +88,10 @@ static Oid TSCurrentConfigCache = InvalidOid;
* doesn't seem worth the trouble to determine that; we just flush all the
* entries of the related hash table.
*
+ * We set has_invalid_dictionary to true so that all used segments are unpinned
+ * later, on the first use of a text search function. It isn't safe to call
+ * ts_dict_shmem_release() here since it may call kernel functions.
+ *
* We can use the same function for all TS caches by passing the hash
* table address as the "arg".
*/
@@ -98,13 +104,48 @@ InvalidateTSCacheCallBack(Datum arg, int cacheid, uint32 hashvalue)
hash_seq_init(&status, hash);
while ((entry = (TSAnyCacheEntry *) hash_seq_search(&status)) != NULL)
+ {
+ if (cacheid == TSDICTOID)
+ {
+ TSDictionaryCacheEntry *dict_entry;
+
+ dict_entry = (TSDictionaryCacheEntry *) entry;
+ if (dict_entry->hashvalue == hashvalue)
+ {
+ dict_entry->shmem_valid = false;
+ has_invalid_dictionary = true;
+ }
+ }
+
entry->isvalid = false;
+ }
/* Also invalidate the current-config cache if it's pg_ts_config */
if (hash == TSConfigCacheHash)
TSCurrentConfigCache = InvalidOid;
}
+/*
+ * Unpin shared segments of all invalid dictionary entries.
+ */
+static void
+do_ts_dict_shmem_release(void)
+{
+ HASH_SEQ_STATUS status;
+ TSDictionaryCacheEntry *entry;
+
+ if (!has_invalid_dictionary)
+ return;
+
+ hash_seq_init(&status, TSDictionaryCacheHash);
+ while ((entry = (TSDictionaryCacheEntry *) hash_seq_search(&status)) != NULL)
+ if (!entry->shmem_valid)
+ ts_dict_shmem_release(entry->dictId, entry->dict_xmin,
+ entry->dict_xmax, entry->dict_tid, false);
+
+ has_invalid_dictionary = false;
+}
+
/*
* Fetch parser cache entry
*/
@@ -253,6 +294,13 @@ lookup_ts_dictionary_cache(Oid dictId)
Form_pg_ts_template template;
MemoryContext saveCtx;
+ /*
+ * It is possible that some invalid entries hold a DSM mapping and we
+ * need to unpin it to avoid leaking memory. We also unpin the segments of
+ * all other invalid dictionaries.
+ */
+ do_ts_dict_shmem_release();
+
tpdict = SearchSysCache1(TSDICTOID, ObjectIdGetDatum(dictId));
if (!HeapTupleIsValid(tpdict))
elog(ERROR, "cache lookup failed for text search dictionary %u",
@@ -359,6 +407,9 @@ lookup_ts_dictionary_cache(Oid dictId)
fmgr_info_cxt(entry->lexizeOid, &entry->lexize, entry->dictCtx);
entry->isvalid = true;
+ entry->hashvalue =
+ GetSysCacheHashValue1(TSDICTOID, ObjectIdGetDatum(entry->dictId));
+ entry->shmem_valid = true;
}
lastUsedDictionary = entry;
diff --git a/src/include/storage/lwlock.h b/src/include/storage/lwlock.h
index 96c7732006..49a3319a11 100644
--- a/src/include/storage/lwlock.h
+++ b/src/include/storage/lwlock.h
@@ -219,6 +219,8 @@ typedef enum BuiltinTrancheIds
LWTRANCHE_SHARED_TUPLESTORE,
LWTRANCHE_TBM,
LWTRANCHE_PARALLEL_APPEND,
+ LWTRANCHE_TSEARCH_DSA,
+ LWTRANCHE_TSEARCH_TABLE,
LWTRANCHE_FIRST_USER_DEFINED
} BuiltinTrancheIds;
diff --git a/src/include/tsearch/ts_cache.h b/src/include/tsearch/ts_cache.h
index 2298e0a275..14e13bf252 100644
--- a/src/include/tsearch/ts_cache.h
+++ b/src/include/tsearch/ts_cache.h
@@ -54,6 +54,9 @@ typedef struct TSDictionaryCacheEntry
Oid dictId;
bool isvalid;
+ uint32 hashvalue; /* hash value of the dictionary's OID */
+ bool shmem_valid;
+
TransactionId dict_xmin; /* XMIN of the dictionary's tuple */
TransactionId dict_xmax; /* XMAX of the dictionary's tuple */
ItemPointerData dict_tid; /* TID of the dictionary's tuple */
diff --git a/src/include/tsearch/ts_shared.h b/src/include/tsearch/ts_shared.h
new file mode 100644
index 0000000000..1e506ef737
--- /dev/null
+++ b/src/include/tsearch/ts_shared.h
@@ -0,0 +1,28 @@
+/*-------------------------------------------------------------------------
+ *
+ * ts_shared.h
+ * tsearch shared memory management
+ *
+ * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
+ *
+ * src/include/tsearch/ts_shared.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef TS_SHARED_H
+#define TS_SHARED_H
+
+#include "tsearch/ts_public.h"
+
+typedef void *(*ts_dict_build_callback) (List *dictoptions, Size *size);
+
+extern void *ts_dict_shmem_location(DictInitData *init_data,
+ ts_dict_build_callback allocate_cb);
+extern void ts_dict_shmem_release(Oid id, TransactionId xmin,
+ TransactionId xmax, ItemPointerData tid,
+ bool unpin_segment);
+
+extern void TsearchShmemInit(void);
+extern Size TsearchShmemSize(void);
+
+#endif /* TS_SHARED_H */
--
2.20.1
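The reference counting in ts_shared.c above is easy to lose track of across dict_entry_init(), dict_entry_attach() and ts_dict_shmem_release(), so here is a small standalone model of it. This is only a sketch under simplifying assumptions: a fixed-size array stands in for the dshash table, no DSM or locking is involved, every mock_location() call is treated as coming from a different backend, and every mock_release() caller is assumed to hold a mapping. All Mock* and mock_* names are hypothetical. The counting rules are taken from the patch: a new entry starts at 2 (one for the session, one for the pinned segment), each attach adds one, and the entry is removed when the count reaches zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flattened form of TsearchDictKey. */
typedef struct
{
	uint32_t	db_id;
	uint32_t	dict_id;
	uint32_t	xmin;
	uint32_t	xmax;
	uint32_t	tid_block;
	uint16_t	tid_offset;
} MockDictKey;

/* Hypothetical model of TsearchDictEntry without the DSM handle and mutex. */
typedef struct
{
	MockDictKey key;
	bool		in_use;
	bool		segment_ispinned;
	uint32_t	refcnt;
} MockDictEntry;

#define MOCK_TABLE_SIZE 8
static MockDictEntry mock_table[MOCK_TABLE_SIZE];

/* Returns 0 when the keys are equal and 1 otherwise, like tsearch_dict_cmp(). */
int
mock_key_cmp(const MockDictKey *a, const MockDictKey *b)
{
	if (a->db_id == b->db_id && a->dict_id == b->dict_id &&
		a->xmin == b->xmin && a->xmax == b->xmax &&
		a->tid_block == b->tid_block && a->tid_offset == b->tid_offset)
		return 0;
	return 1;
}

/* Linear search standing in for dshash_find(). */
MockDictEntry *
mock_find(const MockDictKey *key)
{
	for (size_t i = 0; i < MOCK_TABLE_SIZE; i++)
		if (mock_table[i].in_use && mock_key_cmp(&mock_table[i].key, key) == 0)
			return &mock_table[i];
	return NULL;
}

/*
 * Find the entry or create it. *built tells the caller whether it would
 * have had to build the dictionary. Returns NULL if the table is full.
 */
MockDictEntry *
mock_location(const MockDictKey *key, bool *built)
{
	MockDictEntry *entry = mock_find(key);

	*built = false;
	if (entry != NULL)
	{
		entry->refcnt++;		/* one more attached backend */
		return entry;
	}

	for (size_t i = 0; i < MOCK_TABLE_SIZE; i++)
	{
		if (!mock_table[i].in_use)
		{
			entry = &mock_table[i];
			entry->key = *key;
			entry->in_use = true;
			entry->segment_ispinned = true;
			entry->refcnt = 2;	/* 1 for session + 1 for pinned segment */
			*built = true;
			return entry;
		}
	}
	return NULL;
}

/*
 * Drop the caller's mapping and, for DROP/ALTER, the segment pin as well.
 * Returns true if the entry was removed from the table.
 */
bool
mock_release(const MockDictKey *key, bool unpin_segment)
{
	MockDictEntry *entry = mock_find(key);

	if (entry == NULL)
		return false;

	assert(entry->refcnt > 0);
	entry->refcnt--;			/* the caller's mapping */

	if (unpin_segment && entry->segment_ispinned)
	{
		entry->segment_ispinned = false;
		assert(entry->refcnt > 0);
		entry->refcnt--;		/* the segment pin */
	}

	if (entry->refcnt == 0)
	{
		entry->in_use = false;
		return true;
	}
	return false;
}
```

With two backends attached, a plain release leaves the entry in place, and only the release that also unpins the segment (the DROP or ALTER path) brings the count to zero. Because xmin, xmax and tid are part of the key, a key from an altered dictionary tuple never matches the old entry.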
0004-Store-ispell-in-shared-location-v18.patch (text/x-patch)
From f4908f86df68e52a32f541905a7ee2d5d8a051f1 Mon Sep 17 00:00:00 2001
From: Arthur Zakirov <z-arthur@yandex.ru>
Date: Thu, 17 Jan 2019 15:50:44 +0300
Subject: [PATCH 4/4] Store-ispell-in-shared-location
---
doc/src/sgml/textsearch.sgml | 15 +
src/backend/tsearch/dict_ispell.c | 193 +++--
src/backend/tsearch/spell.c | 1343 +++++++++++++++++++----------
src/include/tsearch/dicts/spell.h | 239 +++--
4 files changed, 1208 insertions(+), 582 deletions(-)
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index ecebade767..308758942b 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -3110,6 +3110,21 @@ CREATE TEXT SEARCH DICTIONARY english_stem (
</sect2>
+ <sect2 id="textsearch-shared-dictionaries">
+ <title>Dictionaries in Shared Memory</title>
+
+ <para>
+ Dictionaries, especially <application>Ispell</application>, may be quite
+ expensive both in terms of memory and CPU usage. For large dictionaries
+ it may take multiple seconds to read and process input text files on first
+ access, and the in-memory representation may require tens of megabytes.
+ When each backend processes the dictionaries independently and stores them
+ in private memory, this cost is significant. To amortize it, the compiled
+ dictionary may be stored in shared memory for reuse by other backends.
+ Currently only <application>Ispell</application> is stored in shared memory.
+ </para>
+ </sect2>
+
</sect1>
<sect1 id="textsearch-configuration">
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index fc9a96abca..b9c30bbeb4 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -5,6 +5,11 @@
*
* Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
*
+ * Compiled Ispell dictionaries are stored in DSM. All necessary data is built
+ * within the dispell_build() function. Structures for regular expressions are
+ * compiled on first demand and stored in the AffixReg array, because regex_t
+ * and Regis cannot easily be stored in shared memory.
+ *
*
* IDENTIFICATION
* src/backend/tsearch/dict_ispell.c
@@ -14,8 +19,10 @@
#include "postgres.h"
#include "commands/defrem.h"
+#include "storage/dsm.h"
#include "tsearch/dicts/spell.h"
#include "tsearch/ts_locale.h"
+#include "tsearch/ts_shared.h"
#include "tsearch/ts_utils.h"
#include "utils/builtins.h"
@@ -26,54 +33,126 @@ typedef struct
IspellDict obj;
} DictISpell;
+static void parse_dictoptions(List *dictoptions,
+ char **dictfile, char **afffile, char **stopfile);
+static void *dispell_build(List *dictoptions, Size *size);
+
Datum
dispell_init(PG_FUNCTION_ARGS)
{
DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictISpell *d;
- bool affloaded = false,
- dictloaded = false,
- stoploaded = false;
- ListCell *l;
+ void *dict_location;
+ char *stopfile;
d = (DictISpell *) palloc0(sizeof(DictISpell));
- NIStartBuild(&(d->obj));
+ parse_dictoptions(init_data->dict_options, NULL, NULL, &stopfile);
+
+ if (stopfile)
+ readstoplist(stopfile, &(d->stoplist), lowerstr);
+
+ dict_location = ts_dict_shmem_location(init_data, dispell_build);
+ Assert(dict_location);
+
+ d->obj.dict = (IspellDictData *) dict_location;
+ d->obj.reg = (AffixReg *) palloc0(d->obj.dict->nAffix *
+ sizeof(AffixReg));
+ /* Current memory context is dictionary's private memory context */
+ d->obj.dictCtx = CurrentMemoryContext;
+
+ PG_RETURN_POINTER(d);
+}
+
+Datum
+dispell_lexize(PG_FUNCTION_ARGS)
+{
+ DictISpell *d = (DictISpell *) PG_GETARG_POINTER(0);
+ char *in = (char *) PG_GETARG_POINTER(1);
+ int32 len = PG_GETARG_INT32(2);
+ char *txt;
+ TSLexeme *res;
+ TSLexeme *ptr,
+ *cptr;
+
+ if (len <= 0)
+ PG_RETURN_POINTER(NULL);
+
+ txt = lowerstr_with_len(in, len);
+ res = NINormalizeWord(&(d->obj), txt);
- foreach(l, init_data->dict_options)
+ if (res == NULL)
+ PG_RETURN_POINTER(NULL);
+
+ cptr = res;
+ for (ptr = cptr; ptr->lexeme; ptr++)
+ {
+ if (searchstoplist(&(d->stoplist), ptr->lexeme))
+ {
+ pfree(ptr->lexeme);
+ ptr->lexeme = NULL;
+ }
+ else
+ {
+ if (cptr != ptr)
+ memcpy(cptr, ptr, sizeof(TSLexeme));
+ cptr++;
+ }
+ }
+ cptr->lexeme = NULL;
+
+ PG_RETURN_POINTER(res);
+}
+
+static void
+parse_dictoptions(List *dictoptions, char **dictfile, char **afffile,
+ char **stopfile)
+{
+ ListCell *l;
+
+ if (dictfile)
+ *dictfile = NULL;
+ if (afffile)
+ *afffile = NULL;
+ if (stopfile)
+ *stopfile = NULL;
+
+ foreach(l, dictoptions)
{
DefElem *defel = (DefElem *) lfirst(l);
if (strcmp(defel->defname, "dictfile") == 0)
{
- if (dictloaded)
+ if (!dictfile)
+ continue;
+
+ if (*dictfile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple DictFile parameters")));
- NIImportDictionary(&(d->obj),
- get_tsearch_config_filename(defGetString(defel),
- "dict"));
- dictloaded = true;
+ *dictfile = get_tsearch_config_filename(defGetString(defel), "dict");
}
else if (strcmp(defel->defname, "afffile") == 0)
{
- if (affloaded)
+ if (!afffile)
+ continue;
+
+ if (*afffile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple AffFile parameters")));
- NIImportAffixes(&(d->obj),
- get_tsearch_config_filename(defGetString(defel),
- "affix"));
- affloaded = true;
+ *afffile = get_tsearch_config_filename(defGetString(defel), "affix");
}
else if (strcmp(defel->defname, "stopwords") == 0)
{
- if (stoploaded)
+ if (!stopfile)
+ continue;
+
+ if (*stopfile)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("multiple StopWords parameters")));
- readstoplist(defGetString(defel), &(d->stoplist), lowerstr);
- stoploaded = true;
+ *stopfile = defGetString(defel);
}
else
{
@@ -83,66 +162,52 @@ dispell_init(PG_FUNCTION_ARGS)
defel->defname)));
}
}
+}
- if (affloaded && dictloaded)
- {
- NISortDictionary(&(d->obj));
- NISortAffixes(&(d->obj));
- }
- else if (!affloaded)
+/*
+ * Build the dictionary.
+ *
+ * Result is palloc'ed.
+ */
+static void *
+dispell_build(List *dictoptions, Size *size)
+{
+ IspellDictBuild build;
+ char *dictfile,
+ *afffile;
+
+ parse_dictoptions(dictoptions, &dictfile, &afffile, NULL);
+
+ if (!afffile)
{
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("missing AffFile parameter")));
}
- else
+ else if (!dictfile)
{
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("missing DictFile parameter")));
}
- NIFinishBuild(&(d->obj));
-
- PG_RETURN_POINTER(d);
-}
-
-Datum
-dispell_lexize(PG_FUNCTION_ARGS)
-{
- DictISpell *d = (DictISpell *) PG_GETARG_POINTER(0);
- char *in = (char *) PG_GETARG_POINTER(1);
- int32 len = PG_GETARG_INT32(2);
- char *txt;
- TSLexeme *res;
- TSLexeme *ptr,
- *cptr;
+ MemSet(&build, 0, sizeof(build));
+ NIStartBuild(&build);
- if (len <= 0)
- PG_RETURN_POINTER(NULL);
+ /* Read files */
+ NIImportDictionary(&build, dictfile);
+ NIImportAffixes(&build, afffile);
- txt = lowerstr_with_len(in, len);
- res = NINormalizeWord(&(d->obj), txt);
+ /* Build persistent data for use by backends */
+ NISortDictionary(&build);
+ NISortAffixes(&build);
- if (res == NULL)
- PG_RETURN_POINTER(NULL);
+ NICopyData(&build);
- cptr = res;
- for (ptr = cptr; ptr->lexeme; ptr++)
- {
- if (searchstoplist(&(d->stoplist), ptr->lexeme))
- {
- pfree(ptr->lexeme);
- ptr->lexeme = NULL;
- }
- else
- {
- if (cptr != ptr)
- memcpy(cptr, ptr, sizeof(TSLexeme));
- cptr++;
- }
- }
- cptr->lexeme = NULL;
+ /* Release temporary data */
+ NIFinishBuild(&build);
- PG_RETURN_POINTER(res);
+ /* Return the buffer and its size */
+ *size = build.dict_size;
+ return build.dict;
}
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index eb8416ce7f..123fba7a11 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -23,33 +23,35 @@
* Compilation of a dictionary
* ---------------------------
*
- * A compiled dictionary is stored in the IspellDict structure. Compilation of
- * a dictionary is divided into the several steps:
+ * A compiled dictionary is stored in the following structures:
+ * - IspellDictBuild - stores temporary data and IspellDictData
+ * - IspellDictData - stores permanent data used within NINormalizeWord()
+ * Compilation of the dictionary is divided into several steps:
* - NIImportDictionary() - stores each word of a .dict file in the
* temporary Spell field.
- * - NIImportAffixes() - stores affix rules of an .affix file in the
- * Affix field (not temporary) if an .affix file has the Ispell format.
+ * - NIImportAffixes() - stores affix rules of an .affix file in the temporary
+ * Affix field if an .affix file has the Ispell format.
* -> NIImportOOAffixes() - stores affix rules if an .affix file has the
* Hunspell format. The AffixData field is initialized if AF parameter
* is defined.
* - NISortDictionary() - builds a prefix tree (Trie) from the words list
- * and stores it in the Dictionary field. The words list is got from the
+ * and stores it in the DictNodes field. The words list is got from the
* Spell field. The AffixData field is initialized if AF parameter is not
* defined.
* - NISortAffixes():
* - builds a list of compound affixes from the affix list and stores it
* in the CompoundAffix.
* - builds prefix trees (Trie) from the affix list for prefixes and suffixes
- * and stores them in Suffix and Prefix fields.
+ * and stores them in SuffixNodes and PrefixNodes fields.
* The affix list is got from the Affix field.
+ * Persistent data of the dictionary is copied within NICopyData().
*
* Memory management
* -----------------
*
- * The IspellDict structure has the Spell field which is used only in compile
- * time. The Spell field stores a words list. It can take a lot of memory.
- * Therefore when a dictionary is compiled this field is cleared by
- * NIFinishBuild().
+ * The IspellDictBuild structure holds temporary data which is used only at
+ * compile time. It can take a lot of memory. Therefore, after the dictionary
+ * is compiled, this data is released by NIFinishBuild().
*
* All resources which should cleared by NIFinishBuild() is initialized using
* tmpalloc() and tmpalloc0().
@@ -73,112 +75,166 @@
* after the initialization is done. During initialization,
* CurrentMemoryContext is the long-lived memory context associated
* with the dictionary cache entry. We keep the short-lived stuff
- * in the Conf->buildCxt context.
+ * in the ConfBuild->buildCxt context.
*/
-#define tmpalloc(sz) MemoryContextAlloc(Conf->buildCxt, (sz))
-#define tmpalloc0(sz) MemoryContextAllocZero(Conf->buildCxt, (sz))
+#define tmpalloc(sz) MemoryContextAlloc(ConfBuild->buildCxt, (sz))
+#define tmpalloc0(sz) MemoryContextAllocZero(ConfBuild->buildCxt, (sz))
-#define tmpstrdup(str) MemoryContextStrdup(Conf->buildCxt, (str))
+#define tmpstrdup(str) MemoryContextStrdup(ConfBuild->buildCxt, (str))
/*
* Prepare for constructing an ISpell dictionary.
*
- * The IspellDict struct is assumed to be zeroed when allocated.
+ * The IspellDictBuild struct is assumed to be zeroed when allocated.
*/
void
-NIStartBuild(IspellDict *Conf)
+NIStartBuild(IspellDictBuild *ConfBuild)
{
+ uint32 dict_size;
+
/*
* The temp context is a child of CurTransactionContext, so that it will
* go away automatically on error.
*/
- Conf->buildCxt = AllocSetContextCreate(CurTransactionContext,
- "Ispell dictionary init context",
- ALLOCSET_DEFAULT_SIZES);
+ ConfBuild->buildCxt = AllocSetContextCreate(CurTransactionContext,
+ "Ispell dictionary init context",
+ ALLOCSET_DEFAULT_SIZES);
+
+ /*
+ * Allocate the dictionary buffer in the current context, not in buildCxt.
+ */
+ dict_size = MAXALIGN(IspellDictDataHdrSize);
+ ConfBuild->dict = palloc0(dict_size);
+ ConfBuild->dict_size = dict_size;
}
/*
- * Clean up when dictionary construction is complete.
+ * Copy compiled and persistent data into IspellDictData.
*/
void
-NIFinishBuild(IspellDict *Conf)
+NICopyData(IspellDictBuild *ConfBuild)
{
- /* Release no-longer-needed temp memory */
- MemoryContextDelete(Conf->buildCxt);
- /* Just for cleanliness, zero the now-dangling pointers */
- Conf->buildCxt = NULL;
- Conf->Spell = NULL;
- Conf->firstfree = NULL;
- Conf->CompoundAffixFlags = NULL;
-}
+ IspellDictData *dict;
+ uint32 size;
+ int i;
+ uint32 *offsets,
+ offset = 0;
+ SPNode *dict_node PG_USED_FOR_ASSERTS_ONLY;
+ AffixNode *aff_node PG_USED_FOR_ASSERTS_ONLY;
+ /*
+ * Calculate necessary space
+ */
+ size = ConfBuild->nAffixData * sizeof(uint32);
+ size += ConfBuild->AffixDataEnd;
-/*
- * "Compact" palloc: allocate without extra palloc overhead.
- *
- * Since we have no need to free the ispell data items individually, there's
- * not much value in the per-chunk overhead normally consumed by palloc.
- * Getting rid of it is helpful since ispell can allocate a lot of small nodes.
- *
- * We currently pre-zero all data allocated this way, even though some of it
- * doesn't need that. The cpalloc and cpalloc0 macros are just documentation
- * to indicate which allocations actually require zeroing.
- */
-#define COMPACT_ALLOC_CHUNK 8192 /* amount to get from palloc at once */
-#define COMPACT_MAX_REQ 1024 /* must be < COMPACT_ALLOC_CHUNK */
+ size += ConfBuild->nAffix * sizeof(uint32);
+ size += ConfBuild->AffixSize;
-static void *
-compact_palloc0(IspellDict *Conf, size_t size)
-{
- void *result;
+ size += ConfBuild->DictNodes.NodesEnd;
+ size += ConfBuild->PrefixNodes.NodesEnd;
+ size += ConfBuild->SuffixNodes.NodesEnd;
- /* Should only be called during init */
- Assert(Conf->buildCxt != NULL);
+ size += sizeof(CMPDAffix) * ConfBuild->nCompoundAffix;
- /* No point in this for large chunks */
- if (size > COMPACT_MAX_REQ)
- return palloc0(size);
+ /*
+ * Copy data itself
+ */
+ ConfBuild->dict_size = IspellDictDataHdrSize + size;
+ ConfBuild->dict = repalloc(ConfBuild->dict, ConfBuild->dict_size);
+
+ dict = ConfBuild->dict;
+
+ /* AffixData */
+ dict->nAffixData = ConfBuild->nAffixData;
+ dict->AffixDataStart = sizeof(uint32) * ConfBuild->nAffixData;
+ memcpy(DictAffixDataOffset(dict), ConfBuild->AffixDataOffset,
+ sizeof(uint32) * ConfBuild->nAffixData);
+ memcpy(DictAffixData(dict), ConfBuild->AffixData, ConfBuild->AffixDataEnd);
+
+ /* Affix array */
+ dict->nAffix = ConfBuild->nAffix;
+ dict->AffixOffsetStart = dict->AffixDataStart + ConfBuild->AffixDataEnd;
+ dict->AffixStart = dict->AffixOffsetStart + sizeof(uint32) * ConfBuild->nAffix;
+ if (ConfBuild->nAffix > 0)
+ {
+ offsets = (uint32 *) DictAffixOffset(dict);
+ for (i = 0; i < ConfBuild->nAffix; i++)
+ {
+ AFFIX *affix;
+ uint32 size = AffixGetSize(ConfBuild->Affix[i]);
- /* Keep everything maxaligned */
- size = MAXALIGN(size);
+ offsets[i] = offset;
+ affix = (AFFIX *) DictAffixGet(dict, i);
+ Assert(affix);
- /* Need more space? */
- if (size > Conf->avail)
- {
- Conf->firstfree = palloc0(COMPACT_ALLOC_CHUNK);
- Conf->avail = COMPACT_ALLOC_CHUNK;
- }
+ memcpy(affix, ConfBuild->Affix[i], size);
- result = (void *) Conf->firstfree;
- Conf->firstfree += size;
- Conf->avail -= size;
+ offset += size;
+ }
+ }
- return result;
+ /* DictNodes prefix tree */
+ dict->DictNodesStart = dict->AffixStart + offset;
+ /* We have at least one root node even if dictionary list is empty */
+ dict_node = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, 0);
+ Assert(dict_node && dict_node->length > 0);
+ /* Copy dictionary nodes into persistent structure */
+ memcpy(DictDictNodes(dict), ConfBuild->DictNodes.Nodes,
+ ConfBuild->DictNodes.NodesEnd);
+
+ /* PrefixNodes prefix tree */
+ dict->PrefixNodesStart = dict->DictNodesStart + ConfBuild->DictNodes.NodesEnd;
+ /* We have at least one root node even if prefix list is empty */
+ aff_node = (AffixNode *) NodeArrayGet(&ConfBuild->PrefixNodes, 0);
+ Assert(aff_node && aff_node->length > 0);
+ /* Copy prefix nodes into persistent structure */
+ memcpy(DictPrefixNodes(dict), ConfBuild->PrefixNodes.Nodes,
+ ConfBuild->PrefixNodes.NodesEnd);
+
+ /* SuffixNodes prefix tree */
+ dict->SuffixNodesStart = dict->PrefixNodesStart + ConfBuild->PrefixNodes.NodesEnd;
+ /* We have at least one root node even if suffix list is empty */
+ aff_node = (AffixNode *) NodeArrayGet(&ConfBuild->SuffixNodes, 0);
+ Assert(aff_node && aff_node->length > 0);
+ /* Copy suffix nodes into persistent structure */
+ memcpy(DictSuffixNodes(dict), ConfBuild->SuffixNodes.Nodes,
+ ConfBuild->SuffixNodes.NodesEnd);
+
+ /* CompoundAffix array */
+ dict->CompoundAffixStart = dict->SuffixNodesStart +
+ ConfBuild->SuffixNodes.NodesEnd;
+ /* We have at least one CompoundAffix terminating entry */
+ Assert(ConfBuild->nCompoundAffix > 0);
+ /* Copy array of compound affixes into persistent structure */
+ memcpy(DictCompoundAffix(dict), ConfBuild->CompoundAffix,
+ sizeof(CMPDAffix) * ConfBuild->nCompoundAffix);
}
-#define cpalloc(size) compact_palloc0(Conf, size)
-#define cpalloc0(size) compact_palloc0(Conf, size)
-
-static char *
-cpstrdup(IspellDict *Conf, const char *str)
+/*
+ * Clean up when dictionary construction is complete.
+ */
+void
+NIFinishBuild(IspellDictBuild *ConfBuild)
{
- char *res = cpalloc(strlen(str) + 1);
-
- strcpy(res, str);
- return res;
+ /* Release no-longer-needed temp memory */
+ MemoryContextDelete(ConfBuild->buildCxt);
+ /* Just for cleanliness, zero the now-dangling pointers */
+ ConfBuild->buildCxt = NULL;
+ ConfBuild->Spell = NULL;
+ ConfBuild->CompoundAffixFlags = NULL;
}
-
/*
* Apply lowerstr(), producing a temporary result (in the buildCxt).
*/
static char *
-lowerstr_ctx(IspellDict *Conf, const char *src)
+lowerstr_ctx(IspellDictBuild *ConfBuild, const char *src)
{
MemoryContext saveCtx;
char *dst;
- saveCtx = MemoryContextSwitchTo(Conf->buildCxt);
+ saveCtx = MemoryContextSwitchTo(ConfBuild->buildCxt);
dst = lowerstr(src);
MemoryContextSwitchTo(saveCtx);
@@ -190,7 +246,7 @@ lowerstr_ctx(IspellDict *Conf, const char *src)
#define STRNCMP(s,p) strncmp( (s), (p), strlen(p) )
#define GETWCHAR(W,L,N,T) ( ((const uint8*)(W))[ ((T)==FF_PREFIX) ? (N) : ( (L) - 1 - (N) ) ] )
-#define GETCHAR(A,N,T) GETWCHAR( (A)->repl, (A)->replen, N, T )
+#define GETCHAR(A,N,T) GETWCHAR( AffixFieldRepl(A), (A)->replen, N, T )
static char *VoidString = "";
@@ -311,18 +367,189 @@ strbncmp(const unsigned char *s1, const unsigned char *s2, size_t count)
static int
cmpaffix(const void *s1, const void *s2)
{
- const AFFIX *a1 = (const AFFIX *) s1;
- const AFFIX *a2 = (const AFFIX *) s2;
+ const AFFIX *a1 = *((AFFIX *const *) s1);
+ const AFFIX *a2 = *((AFFIX *const *) s2);
if (a1->type < a2->type)
return -1;
if (a1->type > a2->type)
return 1;
if (a1->type == FF_PREFIX)
- return strcmp(a1->repl, a2->repl);
+ return strcmp(AffixFieldRepl(a1), AffixFieldRepl(a2));
else
- return strbcmp((const unsigned char *) a1->repl,
- (const unsigned char *) a2->repl);
+ return strbcmp((const unsigned char *) AffixFieldRepl(a1),
+ (const unsigned char *) AffixFieldRepl(a2));
+}
+
+/*
+ * Allocate space for AffixData.
+ */
+static void
+InitAffixData(IspellDictBuild *ConfBuild, int numAffixData)
+{
+ uint32 size;
+
+ size = 8 * 1024 /* Reserve 8KB for data */;
+
+ ConfBuild->AffixData = (char *) tmpalloc(size);
+ ConfBuild->AffixDataSize = size;
+ ConfBuild->AffixDataOffset = (uint32 *) tmpalloc(numAffixData * sizeof(uint32));
+ ConfBuild->nAffixData = 0;
+ ConfBuild->mAffixData = numAffixData;
+
+ /* Save offset of the end of data */
+ ConfBuild->AffixDataEnd = 0;
+}
+
+/*
+ * Add a set of affix flags to the IspellDictBuild struct. If the AffixData
+ * buffer cannot fit the new affix set, it is enlarged.
+ *
+ * ConfBuild: building structure for the current dictionary.
+ * AffixSet: set of affix flags, AffixSetLen bytes long (excluding the \0).
+ */
+static void
+AddAffixSet(IspellDictBuild *ConfBuild, const char *AffixSet,
+ uint32 AffixSetLen)
+{
+ /*
+ * Check available space for AffixSet.
+ */
+ if (ConfBuild->AffixDataEnd + AffixSetLen + 1 /* \0 */ >=
+ ConfBuild->AffixDataSize)
+ {
+ uint32 newsize = Max(ConfBuild->AffixDataSize + 8 * 1024 /* 8KB */,
+ ConfBuild->AffixDataSize + AffixSetLen + 1);
+
+ ConfBuild->AffixData = (char *) repalloc(ConfBuild->AffixData, newsize);
+ ConfBuild->AffixDataSize = newsize;
+ }
+
+ /* Check available number of offsets */
+ if (ConfBuild->nAffixData >= ConfBuild->mAffixData)
+ {
+ ConfBuild->mAffixData *= 2;
+ ConfBuild->AffixDataOffset = (uint32 *) repalloc(ConfBuild->AffixDataOffset,
+ sizeof(uint32) * ConfBuild->mAffixData);
+ }
+
+ ConfBuild->AffixDataOffset[ConfBuild->nAffixData] = ConfBuild->AffixDataEnd;
+ StrNCpy(AffixDataGet(ConfBuild, ConfBuild->nAffixData),
+ AffixSet, AffixSetLen + 1);
+
+ /* Save offset of the end of data */
+ ConfBuild->AffixDataEnd += AffixSetLen + 1;
+ ConfBuild->nAffixData++;
+}
+
+/*
+ * Allocate space for prefix tree node.
+ *
+ * ConfBuild: building structure for the current dictionary.
+ * array: NodeArray where to allocate new node.
+ * length: number of allocated NodeData.
+ * sizeNodeData: minimum size of each NodeData.
+ * sizeNodeHeader: size of header of new node.
+ *
+ * Returns an offset of new node in NodeArray->Nodes.
+ */
+static uint32
+AllocateNode(IspellDictBuild *ConfBuild, NodeArray *array, uint32 length,
+ uint32 sizeNodeData, uint32 sizeNodeHeader)
+{
+ uint32 node_offset;
+ uint32 size;
+
+ size = sizeNodeHeader + length * sizeNodeData;
+ size = MAXALIGN(size);
+
+ if (array->NodesSize == 0)
+ {
+ array->NodesSize = size * 32; /* Reserve space for next levels of the
+ * prefix tree */
+ array->Nodes = (char *) tmpalloc(array->NodesSize);
+ array->NodesEnd = 0;
+ }
+ else if (array->NodesEnd + size >= array->NodesSize)
+ {
+ array->NodesSize = Max(array->NodesSize * 2, array->NodesSize + size);
+ array->Nodes = (char *) repalloc(array->Nodes, array->NodesSize);
+ }
+
+ node_offset = array->NodesEnd;
+ array->NodesEnd += size;
+
+ return node_offset;
+}
+
+/*
+ * Allocate space for SPNode.
+ *
+ * Returns an offset of new node in ConfBuild->DictNodes->Nodes.
+ */
+static uint32
+AllocateSPNode(IspellDictBuild *ConfBuild, uint32 length)
+{
+ uint32 offset;
+ SPNode *node;
+ SPNodeData *data;
+ uint32 i;
+
+ offset = AllocateNode(ConfBuild, &ConfBuild->DictNodes, length,
+ sizeof(SPNodeData), SPNHDRSZ);
+ node = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, offset);
+ node->length = length;
+
+ /*
+ * Initialize all SPNodeData with default values. We cannot use memset()
+ * here because not all fields have 0 as default value.
+ */
+ for (i = 0; i < length; i++)
+ {
+ data = &(node->data[i]);
+ data->val = 0;
+ data->affix = ISPELL_INVALID_INDEX;
+ data->compoundflag = 0;
+ data->isword = 0;
+ data->node_offset = ISPELL_INVALID_OFFSET;
+ }
+
+ return offset;
+}
+
+/*
+ * Allocate space for AffixNode.
+ *
+ * Returns an offset of new node in NodeArray->Nodes.
+ */
+static uint32
+AllocateAffixNode(IspellDictBuild *ConfBuild, NodeArray *array, uint32 length)
+{
+ uint32 offset;
+ AffixNode *node;
+ AffixNodeData *data;
+ uint32 i;
+
+ offset = AllocateNode(ConfBuild, array, length, sizeof(AffixNodeData),
+ ANHRDSZ);
+ node = (AffixNode *) NodeArrayGet(array, offset);
+ node->length = length;
+ node->isvoid = 0;
+
+ /*
+ * Initialize all AffixNodeData with default values. We cannot use memset()
+ * here because not all fields have 0 as default value.
+ */
+ for (i = 0; i < length; i++)
+ {
+ data = &(node->data[i]);
+ data->val = 0;
+ data->affstart = ISPELL_INVALID_INDEX;
+ data->affend = ISPELL_INVALID_INDEX;
+ data->node_offset = ISPELL_INVALID_OFFSET;
+ }
+
+ return offset;
}
/*
@@ -333,7 +560,7 @@ cmpaffix(const void *s1, const void *s2)
* - 2 characters (FM_LONG). A character may be Unicode.
* - numbers from 1 to 65000 (FM_NUM).
*
- * Depending on the flagMode an affix string can have the following format:
+ * Depending on the flagmode an affix string can have the following format:
* - FM_CHAR: ABCD
* Here we have 4 flags: A, B, C and D
* - FM_LONG: ABCDE*
@@ -341,13 +568,13 @@ cmpaffix(const void *s1, const void *s2)
* - FM_NUM: 200,205,50
* Here we have 3 flags: 200, 205 and 50
*
- * Conf: current dictionary.
+ * flagmode: flag mode of the dictionary
* sflagset: the set of affix flags. Returns a reference to the start of a next
* affix flag.
* sflag: returns an affix flag from sflagset.
*/
static void
-getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
+getNextFlagFromString(FlagMode flagmode, char **sflagset, char *sflag)
{
int32 s;
char *next,
@@ -356,11 +583,11 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
bool stop = false;
bool met_comma = false;
- maxstep = (Conf->flagMode == FM_LONG) ? 2 : 1;
+ maxstep = (flagmode == FM_LONG) ? 2 : 1;
while (**sflagset)
{
- switch (Conf->flagMode)
+ switch (flagmode)
{
case FM_LONG:
case FM_CHAR:
@@ -422,15 +649,15 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
stop = true;
break;
default:
- elog(ERROR, "unrecognized type of Conf->flagMode: %d",
- Conf->flagMode);
+ elog(ERROR, "unrecognized type of flagmode: %d",
+ flagmode);
}
if (stop)
break;
}
- if (Conf->flagMode == FM_LONG && maxstep > 0)
+ if (flagmode == FM_LONG && maxstep > 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix flag \"%s\" with \"long\" flag value",
@@ -440,31 +667,28 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
}
/*
- * Checks if the affix set Conf->AffixData[affix] contains affixflag.
- * Conf->AffixData[affix] does not contain affixflag if this flag is not used
- * actually by the .dict file.
+ * Checks if the affix set from AffixData contains affixflag. Affix set does
+ * not contain affixflag if this flag is not used actually by the .dict file.
*
- * Conf: current dictionary.
- * affix: index of the Conf->AffixData array.
+ * flagmode: flag mode of the dictionary.
+ * sflagset: the set of affix flags.
* affixflag: the affix flag.
*
- * Returns true if the string Conf->AffixData[affix] contains affixflag,
- * otherwise returns false.
+ * Returns true if the affix set string contains affixflag, otherwise returns
+ * false.
*/
static bool
-IsAffixFlagInUse(IspellDict *Conf, int affix, const char *affixflag)
+IsAffixFlagInUse(FlagMode flagmode, char *sflagset, const char *affixflag)
{
- char *flagcur;
+ char *flagcur = sflagset;
char flag[BUFSIZ];
if (*affixflag == 0)
return true;
- flagcur = Conf->AffixData[affix];
-
while (*flagcur)
{
- getNextFlagFromString(Conf, &flagcur, flag);
+ getNextFlagFromString(flagmode, &flagcur, flag);
/* Compare first affix flag in flagcur with affixflag */
if (strcmp(flag, affixflag) == 0)
return true;
@@ -477,31 +701,33 @@ IsAffixFlagInUse(IspellDict *Conf, int affix, const char *affixflag)
/*
* Adds the new word into the temporary array Spell.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* word: new word.
* flag: set of affix flags. Single flag can be get by getNextFlagFromString().
*/
static void
-NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
+NIAddSpell(IspellDictBuild *ConfBuild, const char *word, const char *flag)
{
- if (Conf->nspell >= Conf->mspell)
+ if (ConfBuild->nSpell >= ConfBuild->mSpell)
{
- if (Conf->mspell)
+ if (ConfBuild->mSpell)
{
- Conf->mspell *= 2;
- Conf->Spell = (SPELL **) repalloc(Conf->Spell, Conf->mspell * sizeof(SPELL *));
+ ConfBuild->mSpell *= 2;
+ ConfBuild->Spell = (SPELL **) repalloc(ConfBuild->Spell,
+ ConfBuild->mSpell * sizeof(SPELL *));
}
else
{
- Conf->mspell = 1024 * 20;
- Conf->Spell = (SPELL **) tmpalloc(Conf->mspell * sizeof(SPELL *));
+ ConfBuild->mSpell = 1024 * 20;
+ ConfBuild->Spell = (SPELL **) tmpalloc(ConfBuild->mSpell * sizeof(SPELL *));
}
}
- Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
- strcpy(Conf->Spell[Conf->nspell]->word, word);
- Conf->Spell[Conf->nspell]->p.flag = (*flag != '\0')
+ ConfBuild->Spell[ConfBuild->nSpell] =
+ (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
+ strcpy(ConfBuild->Spell[ConfBuild->nSpell]->word, word);
+ ConfBuild->Spell[ConfBuild->nSpell]->p.flag = (*flag != '\0')
? tmpstrdup(flag) : VoidString;
- Conf->nspell++;
+ ConfBuild->nSpell++;
}
/*
@@ -509,11 +735,11 @@ NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
*
* Note caller must already have applied get_tsearch_config_filename.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* filename: path to the .dict file.
*/
void
-NIImportDictionary(IspellDict *Conf, const char *filename)
+NIImportDictionary(IspellDictBuild *ConfBuild, const char *filename)
{
tsearch_readline_state trst;
char *line;
@@ -564,9 +790,9 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
}
s += pg_mblen(s);
}
- pstr = lowerstr_ctx(Conf, line);
+ pstr = lowerstr_ctx(ConfBuild, line);
- NIAddSpell(Conf, pstr, flag);
+ NIAddSpell(ConfBuild, pstr, flag);
pfree(pstr);
pfree(line);
@@ -590,7 +816,7 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
* SFX M 0 's .
* is presented here.
*
- * Conf: current dictionary.
+ * dict: current dictionary.
* word: basic form of word.
* affixflag: affix flag, by which a basic form of word was generated.
* flag: compound flag used to compare with StopMiddle->compoundflag.
@@ -598,9 +824,9 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
* Returns 1 if the word was found in the prefix tree, else returns 0.
*/
static int
-FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
+FindWord(IspellDictData *dict, const char *word, const char *affixflag, int flag)
{
- SPNode *node = Conf->Dictionary;
+ SPNode *node = (SPNode *) DictDictNodes(dict);
SPNodeData *StopLow,
*StopHigh,
*StopMiddle;
@@ -636,10 +862,14 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
* Check if this affix rule is presented in the affix set
* with index StopMiddle->affix.
*/
- if (IsAffixFlagInUse(Conf, StopMiddle->affix, affixflag))
+ if (IsAffixFlagInUse(dict->flagMode,
+ DictAffixDataGet(dict, StopMiddle->affix),
+ affixflag))
return 1;
}
- node = StopMiddle->node;
+ /* Retrieve SPNode by the offset */
+ node = (SPNode *) DictNodeGet(DictDictNodes(dict),
+ StopMiddle->node_offset);
ptr++;
break;
}
@@ -657,7 +887,8 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
/*
* Adds a new affix rule to the Affix field.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary; it is used to
+ * allocate temporary data.
* flag: affix flag ('\' in the below example).
* flagflags: set of flags from the flagval field for this affix rule. This set
* is listed after '/' character in the added string (repl).
@@ -673,26 +904,54 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
* type: FF_SUFFIX or FF_PREFIX.
*/
static void
-NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
- const char *find, const char *repl, int type)
+NIAddAffix(IspellDictBuild *ConfBuild, const char *flag, char flagflags,
+ const char *mask, const char *find, const char *repl, int type)
{
AFFIX *Affix;
+ uint32 size;
+ uint32 flaglen = strlen(flag),
+ findlen = strlen(find),
+ repllen = strlen(repl),
+ masklen = strlen(mask);
+
+ /* Sanity checks */
+ if (flaglen > AF_FLAG_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix flag \"%s\" too long", flag)));
+ if (findlen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix find field \"%s\" too long", find)));
+ if (repllen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix repl field \"%s\" too long", repl)));
+ if (masklen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix mask field \"%s\" too long", mask)));
- if (Conf->naffixes >= Conf->maffixes)
+ if (ConfBuild->nAffix >= ConfBuild->mAffix)
{
- if (Conf->maffixes)
+ if (ConfBuild->mAffix)
{
- Conf->maffixes *= 2;
- Conf->Affix = (AFFIX *) repalloc((void *) Conf->Affix, Conf->maffixes * sizeof(AFFIX));
+ ConfBuild->mAffix *= 2;
+ ConfBuild->Affix = (AFFIX **) repalloc(ConfBuild->Affix,
+ ConfBuild->mAffix * sizeof(AFFIX *));
}
else
{
- Conf->maffixes = 16;
- Conf->Affix = (AFFIX *) palloc(Conf->maffixes * sizeof(AFFIX));
+ ConfBuild->mAffix = 255;
+ ConfBuild->Affix = (AFFIX **) tmpalloc(ConfBuild->mAffix * sizeof(AFFIX *));
}
}
- Affix = Conf->Affix + Conf->naffixes;
+ size = AFFIXHDRSZ + flaglen + 1 /* \0 */ + findlen + 1 /* \0 */ +
+ repllen + 1 /* \0 */ + masklen + 1 /* \0 */;
+
+ Affix = (AFFIX *) tmpalloc(size);
+ ConfBuild->Affix[ConfBuild->nAffix] = Affix;
/* This affix rule can be applied for words with any ending */
if (strcmp(mask, ".") == 0 || *mask == '\0')
@@ -705,42 +964,12 @@ NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
{
Affix->issimple = 0;
Affix->isregis = 1;
- RS_compile(&(Affix->reg.regis), (type == FF_SUFFIX),
- *mask ? mask : VoidString);
}
/* This affix rule will use regex_t to search word ending */
else
{
- int masklen;
- int wmasklen;
- int err;
- pg_wchar *wmask;
- char *tmask;
-
Affix->issimple = 0;
Affix->isregis = 0;
- tmask = (char *) tmpalloc(strlen(mask) + 3);
- if (type == FF_SUFFIX)
- sprintf(tmask, "%s$", mask);
- else
- sprintf(tmask, "^%s", mask);
-
- masklen = strlen(tmask);
- wmask = (pg_wchar *) tmpalloc((masklen + 1) * sizeof(pg_wchar));
- wmasklen = pg_mb2wchar_with_len(tmask, wmask, masklen);
-
- err = pg_regcomp(&(Affix->reg.regex), wmask, wmasklen,
- REG_ADVANCED | REG_NOSUB,
- DEFAULT_COLLATION_OID);
- if (err)
- {
- char errstr[100];
-
- pg_regerror(err, &(Affix->reg.regex), errstr, sizeof(errstr));
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
- errmsg("invalid regular expression: %s", errstr)));
- }
}
Affix->flagflags = flagflags;
@@ -749,15 +978,22 @@ NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
if ((Affix->flagflags & FF_COMPOUNDFLAG) == 0)
Affix->flagflags |= FF_COMPOUNDFLAG;
}
- Affix->flag = cpstrdup(Conf, flag);
+
Affix->type = type;
- Affix->find = (find && *find) ? cpstrdup(Conf, find) : VoidString;
- if ((Affix->replen = strlen(repl)) > 0)
- Affix->repl = cpstrdup(Conf, repl);
- else
- Affix->repl = VoidString;
- Conf->naffixes++;
+ Affix->replen = repllen;
+ StrNCpy(AffixFieldRepl(Affix), repl, repllen + 1);
+
+ Affix->findlen = findlen;
+ StrNCpy(AffixFieldFind(Affix), find, findlen + 1);
+
+ Affix->masklen = masklen;
+ StrNCpy(AffixFieldMask(Affix), mask, masklen + 1);
+
+ StrNCpy(AffixFieldFlag(Affix), flag, flaglen + 1);
+
+ ConfBuild->nAffix++;
+ ConfBuild->AffixSize += size;
}
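A note on the hunk above: NIAddAffix() now allocates one contiguous record per affix (header plus four NUL-terminated strings) instead of an AFFIX holding separate string pointers, which is what lets the record be copied into shared memory as-is. The AffixField*() accessors are not visible in this part of the patch, so the following is only a minimal standalone sketch of the idea. The names PackedAffix and packed_*() and the field order are assumptions made for illustration, not the patch's actual definitions, and malloc() stands in for tmpalloc().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical packed record: a fixed header followed by four
 * NUL-terminated strings stored back to back.  No pointers are kept,
 * so the record stays valid wherever it is copied or mapped.
 */
typedef struct PackedAffix
{
    unsigned int replen;
    unsigned int findlen;
    unsigned int masklen;
    char        fields[];       /* repl \0 find \0 mask \0 flag \0 */
} PackedAffix;

#define PACKED_HDRSZ offsetof(PackedAffix, fields)

/* Each accessor derives its address from the lengths in the header. */
static char *packed_repl(PackedAffix *a) { return a->fields; }
static char *packed_find(PackedAffix *a) { return packed_repl(a) + a->replen + 1; }
static char *packed_mask(PackedAffix *a) { return packed_find(a) + a->findlen + 1; }
static char *packed_flag(PackedAffix *a) { return packed_mask(a) + a->masklen + 1; }

/* Builds one record; *size_out receives the total number of bytes used. */
static PackedAffix *
packed_affix_make(const char *flag, const char *mask,
                  const char *find, const char *repl, size_t *size_out)
{
    size_t      flaglen, masklen, findlen, repllen, size;
    PackedAffix *a;

    assert(flag && mask && find && repl);

    flaglen = strlen(flag);
    masklen = strlen(mask);
    findlen = strlen(find);
    repllen = strlen(repl);

    /* Same arithmetic as in the hunk: header plus each string and its \0 */
    size = PACKED_HDRSZ + flaglen + 1 + findlen + 1 + repllen + 1 + masklen + 1;

    a = malloc(size);
    if (a == NULL)
        return NULL;

    a->replen = (unsigned int) repllen;
    a->findlen = (unsigned int) findlen;
    a->masklen = (unsigned int) masklen;

    /* The lengths must be set before the accessors are used. */
    memcpy(packed_repl(a), repl, repllen + 1);
    memcpy(packed_find(a), find, findlen + 1);
    memcpy(packed_mask(a), mask, masklen + 1);
    memcpy(packed_flag(a), flag, flaglen + 1);

    if (size_out)
        *size_out = size;
    return a;
}
```

Summing the returned sizes, as the hunk does with AffixSize, gives the number of bytes needed to lay all records out in one block later.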
/* Parsing states for parse_affentry() and friends */
@@ -1021,10 +1257,10 @@ parse_affentry(char *str, char *mask, char *find, char *repl)
* Sets a Hunspell options depending on flag type.
*/
static void
-setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
+setCompoundAffixFlagValue(IspellDictBuild *ConfBuild, CompoundAffixFlag *entry,
char *s, uint32 val)
{
- if (Conf->flagMode == FM_NUM)
+ if (ConfBuild->dict->flagMode == FM_NUM)
{
char *next;
int i;
@@ -1044,19 +1280,19 @@ setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
else
entry->flag.s = tmpstrdup(s);
- entry->flagMode = Conf->flagMode;
+ entry->flagMode = ConfBuild->dict->flagMode;
entry->value = val;
}
/*
* Sets up a correspondence for the affix parameter with the affix flag.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* s: affix flag in string.
* val: affix parameter.
*/
static void
-addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
+addCompoundAffixFlagValue(IspellDictBuild *ConfBuild, char *s, uint32 val)
{
CompoundAffixFlag *newValue;
char sbuf[BUFSIZ];
@@ -1083,29 +1319,29 @@ addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
*sflag = '\0';
/* Resize array or allocate memory for array CompoundAffixFlag */
- if (Conf->nCompoundAffixFlag >= Conf->mCompoundAffixFlag)
+ if (ConfBuild->nCompoundAffixFlag >= ConfBuild->mCompoundAffixFlag)
{
- if (Conf->mCompoundAffixFlag)
+ if (ConfBuild->mCompoundAffixFlag)
{
- Conf->mCompoundAffixFlag *= 2;
- Conf->CompoundAffixFlags = (CompoundAffixFlag *)
- repalloc((void *) Conf->CompoundAffixFlags,
- Conf->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
+ ConfBuild->mCompoundAffixFlag *= 2;
+ ConfBuild->CompoundAffixFlags = (CompoundAffixFlag *)
+ repalloc((void *) ConfBuild->CompoundAffixFlags,
+ ConfBuild->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
}
else
{
- Conf->mCompoundAffixFlag = 10;
- Conf->CompoundAffixFlags = (CompoundAffixFlag *)
- tmpalloc(Conf->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
+ ConfBuild->mCompoundAffixFlag = 10;
+ ConfBuild->CompoundAffixFlags = (CompoundAffixFlag *)
+ tmpalloc(ConfBuild->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
}
}
- newValue = Conf->CompoundAffixFlags + Conf->nCompoundAffixFlag;
+ newValue = ConfBuild->CompoundAffixFlags + ConfBuild->nCompoundAffixFlag;
- setCompoundAffixFlagValue(Conf, newValue, sbuf, val);
+ setCompoundAffixFlagValue(ConfBuild, newValue, sbuf, val);
- Conf->usecompound = true;
- Conf->nCompoundAffixFlag++;
+ ConfBuild->dict->usecompound = true;
+ ConfBuild->nCompoundAffixFlag++;
}
/*
@@ -1113,7 +1349,7 @@ addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
* flags s.
*/
static int
-getCompoundAffixFlagValue(IspellDict *Conf, char *s)
+getCompoundAffixFlagValue(IspellDictBuild *ConfBuild, char *s)
{
uint32 flag = 0;
CompoundAffixFlag *found,
@@ -1121,18 +1357,18 @@ getCompoundAffixFlagValue(IspellDict *Conf, char *s)
char sflag[BUFSIZ];
char *flagcur;
- if (Conf->nCompoundAffixFlag == 0)
+ if (ConfBuild->nCompoundAffixFlag == 0)
return 0;
flagcur = s;
while (*flagcur)
{
- getNextFlagFromString(Conf, &flagcur, sflag);
- setCompoundAffixFlagValue(Conf, &key, sflag, 0);
+ getNextFlagFromString(ConfBuild->dict->flagMode, &flagcur, sflag);
+ setCompoundAffixFlagValue(ConfBuild, &key, sflag, 0);
found = (CompoundAffixFlag *)
- bsearch(&key, (void *) Conf->CompoundAffixFlags,
- Conf->nCompoundAffixFlag, sizeof(CompoundAffixFlag),
+ bsearch(&key, (void *) ConfBuild->CompoundAffixFlags,
+ ConfBuild->nCompoundAffixFlag, sizeof(CompoundAffixFlag),
cmpcmdflag);
if (found != NULL)
flag |= found->value;
@@ -1144,14 +1380,13 @@ getCompoundAffixFlagValue(IspellDict *Conf, char *s)
/*
* Returns a flag set using the s parameter.
*
- * If Conf->useFlagAliases is true then the s parameter is index of the
- * Conf->AffixData array and function returns its entry.
- * Else function returns the s parameter.
+ * If useFlagAliases is true then the s parameter is an index into the
+ * AffixData array and the function returns that entry. Otherwise the
+ * function returns the s parameter itself.
*/
static char *
-getAffixFlagSet(IspellDict *Conf, char *s)
+getAffixFlagSet(IspellDictBuild *ConfBuild, char *s)
{
- if (Conf->useFlagAliases && *s != '\0')
+ if (ConfBuild->dict->useFlagAliases && *s != '\0')
{
int curaffix;
char *end;
@@ -1162,13 +1397,13 @@ getAffixFlagSet(IspellDict *Conf, char *s)
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix alias \"%s\"", s)));
- if (curaffix > 0 && curaffix <= Conf->nAffixData)
+ if (curaffix > 0 && curaffix <= ConfBuild->nAffixData)
/*
* Do not subtract 1 from curaffix because empty string was added
* in NIImportOOAffixes
*/
- return Conf->AffixData[curaffix];
+ return AffixDataGet(ConfBuild, curaffix);
else
return VoidString;
}
@@ -1179,11 +1414,11 @@ getAffixFlagSet(IspellDict *Conf, char *s)
/*
* Import an affix file that follows MySpell or Hunspell format.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* filename: path to the .affix file.
*/
static void
-NIImportOOAffixes(IspellDict *Conf, const char *filename)
+NIImportOOAffixes(IspellDictBuild *ConfBuild, const char *filename)
{
char type[BUFSIZ],
*ptype = NULL;
@@ -1203,9 +1438,9 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
char *recoded;
/* read file to find any flag */
- Conf->usecompound = false;
- Conf->useFlagAliases = false;
- Conf->flagMode = FM_CHAR;
+ ConfBuild->dict->usecompound = false;
+ ConfBuild->dict->useFlagAliases = false;
+ ConfBuild->dict->flagMode = FM_CHAR;
if (!tsearch_readline_begin(&trst, filename))
ereport(ERROR,
@@ -1222,30 +1457,36 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
}
if (STRNCMP(recoded, "COMPOUNDFLAG") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDFLAG"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDFLAG"),
FF_COMPOUNDFLAG);
else if (STRNCMP(recoded, "COMPOUNDBEGIN") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDBEGIN"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDBEGIN"),
FF_COMPOUNDBEGIN);
else if (STRNCMP(recoded, "COMPOUNDLAST") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDLAST"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDLAST"),
FF_COMPOUNDLAST);
/* COMPOUNDLAST and COMPOUNDEND are synonyms */
else if (STRNCMP(recoded, "COMPOUNDEND") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDEND"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDEND"),
FF_COMPOUNDLAST);
else if (STRNCMP(recoded, "COMPOUNDMIDDLE") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDMIDDLE"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDMIDDLE"),
FF_COMPOUNDMIDDLE);
else if (STRNCMP(recoded, "ONLYINCOMPOUND") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("ONLYINCOMPOUND"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("ONLYINCOMPOUND"),
FF_COMPOUNDONLY);
else if (STRNCMP(recoded, "COMPOUNDPERMITFLAG") == 0)
- addCompoundAffixFlagValue(Conf,
+ addCompoundAffixFlagValue(ConfBuild,
recoded + strlen("COMPOUNDPERMITFLAG"),
FF_COMPOUNDPERMITFLAG);
else if (STRNCMP(recoded, "COMPOUNDFORBIDFLAG") == 0)
- addCompoundAffixFlagValue(Conf,
+ addCompoundAffixFlagValue(ConfBuild,
recoded + strlen("COMPOUNDFORBIDFLAG"),
FF_COMPOUNDFORBIDFLAG);
else if (STRNCMP(recoded, "FLAG") == 0)
@@ -1258,9 +1499,9 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
if (*s)
{
if (STRNCMP(s, "long") == 0)
- Conf->flagMode = FM_LONG;
+ ConfBuild->dict->flagMode = FM_LONG;
else if (STRNCMP(s, "num") == 0)
- Conf->flagMode = FM_NUM;
+ ConfBuild->dict->flagMode = FM_NUM;
else if (STRNCMP(s, "default") != 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
@@ -1274,8 +1515,8 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
}
tsearch_readline_end(&trst);
- if (Conf->nCompoundAffixFlag > 1)
- qsort((void *) Conf->CompoundAffixFlags, Conf->nCompoundAffixFlag,
+ if (ConfBuild->nCompoundAffixFlag > 1)
+ qsort((void *) ConfBuild->CompoundAffixFlags, ConfBuild->nCompoundAffixFlag,
sizeof(CompoundAffixFlag), cmpcmdflag);
if (!tsearch_readline_begin(&trst, filename))
@@ -1295,15 +1536,15 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
if (ptype)
pfree(ptype);
- ptype = lowerstr_ctx(Conf, type);
+ ptype = lowerstr_ctx(ConfBuild, type);
/* First try to parse AF parameter (alias compression) */
if (STRNCMP(ptype, "af") == 0)
{
/* First line is the number of aliases */
- if (!Conf->useFlagAliases)
+ if (!ConfBuild->dict->useFlagAliases)
{
- Conf->useFlagAliases = true;
+ ConfBuild->dict->useFlagAliases = true;
naffix = atoi(sflag);
if (naffix <= 0)
ereport(ERROR,
@@ -1313,11 +1554,10 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
/* Also reserve place for empty flag set */
naffix++;
- Conf->AffixData = (char **) palloc0(naffix * sizeof(char *));
- Conf->lenAffixData = Conf->nAffixData = naffix;
+ InitAffixData(ConfBuild, naffix);
/* Add empty flag set into AffixData */
- Conf->AffixData[curaffix] = VoidString;
+ AddAffixSet(ConfBuild, VoidString, 0);
curaffix++;
}
/* Other lines are aliases */
@@ -1325,7 +1565,7 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
{
if (curaffix < naffix)
{
- Conf->AffixData[curaffix] = cpstrdup(Conf, sflag);
+ AddAffixSet(ConfBuild, sflag, strlen(sflag));
curaffix++;
}
else
@@ -1343,8 +1583,8 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
sflaglen = strlen(sflag);
if (sflaglen == 0
- || (sflaglen > 1 && Conf->flagMode == FM_CHAR)
- || (sflaglen > 2 && Conf->flagMode == FM_LONG))
+ || (sflaglen > 1 && ConfBuild->dict->flagMode == FM_CHAR)
+ || (sflaglen > 2 && ConfBuild->dict->flagMode == FM_LONG))
goto nextline;
/*--------
@@ -1372,21 +1612,21 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
/* Get flags after '/' (flags are case sensitive) */
if ((ptr = strchr(repl, '/')) != NULL)
- aflg |= getCompoundAffixFlagValue(Conf,
- getAffixFlagSet(Conf,
+ aflg |= getCompoundAffixFlagValue(ConfBuild,
+ getAffixFlagSet(ConfBuild,
ptr + 1));
/* Get lowercased version of string before '/' */
- prepl = lowerstr_ctx(Conf, repl);
+ prepl = lowerstr_ctx(ConfBuild, repl);
if ((ptr = strchr(prepl, '/')) != NULL)
*ptr = '\0';
- pfind = lowerstr_ctx(Conf, find);
- pmask = lowerstr_ctx(Conf, mask);
+ pfind = lowerstr_ctx(ConfBuild, find);
+ pmask = lowerstr_ctx(ConfBuild, mask);
if (t_iseq(find, '0'))
*pfind = '\0';
if (t_iseq(repl, '0'))
*prepl = '\0';
- NIAddAffix(Conf, sflag, flagflags | aflg, pmask, pfind, prepl,
+ NIAddAffix(ConfBuild, sflag, flagflags | aflg, pmask, pfind, prepl,
isSuffix ? FF_SUFFIX : FF_PREFIX);
pfree(prepl);
pfree(pfind);
@@ -1412,7 +1652,7 @@ nextline:
* work to NIImportOOAffixes(), which will re-read the whole file.
*/
void
-NIImportAffixes(IspellDict *Conf, const char *filename)
+NIImportAffixes(IspellDictBuild *ConfBuild, const char *filename)
{
char *pstr = NULL;
char flag[BUFSIZ];
@@ -1433,9 +1673,9 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
errmsg("could not open affix file \"%s\": %m",
filename)));
- Conf->usecompound = false;
- Conf->useFlagAliases = false;
- Conf->flagMode = FM_CHAR;
+ ConfBuild->dict->usecompound = false;
+ ConfBuild->dict->useFlagAliases = false;
+ ConfBuild->dict->flagMode = FM_CHAR;
while ((recoded = tsearch_readline(&trst)) != NULL)
{
@@ -1457,10 +1697,8 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
s += pg_mblen(s);
if (*s && pg_mblen(s) == 1)
- {
- addCompoundAffixFlagValue(Conf, s, FF_COMPOUNDFLAG);
- Conf->usecompound = true;
- }
+ addCompoundAffixFlagValue(ConfBuild, s, FF_COMPOUNDFLAG);
+
oldformat = true;
goto nextline;
}
@@ -1533,7 +1771,8 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
if (!parse_affentry(pstr, mask, find, repl))
goto nextline;
- NIAddAffix(Conf, flag, flagflags, mask, find, repl, suffixes ? FF_SUFFIX : FF_PREFIX);
+ NIAddAffix(ConfBuild, flag, flagflags, mask, find, repl,
+ suffixes ? FF_SUFFIX : FF_PREFIX);
nextline:
pfree(recoded);
@@ -1552,53 +1791,48 @@ isnewformat:
errmsg("affix file contains both old-style and new-style commands")));
tsearch_readline_end(&trst);
- NIImportOOAffixes(Conf, filename);
+ NIImportOOAffixes(ConfBuild, filename);
}
/*
* Merges two affix flag sets and stores a new affix flag set into
- * Conf->AffixData.
+ * ConfBuild->AffixData.
*
* Returns index of a new affix flag set.
*/
static int
-MergeAffix(IspellDict *Conf, int a1, int a2)
+MergeAffix(IspellDictBuild *ConfBuild, int a1, int a2)
{
- char **ptr;
+ char *ptr;
+ uint32 len;
/* Do not merge affix flags if one of affix flags is empty */
- if (*Conf->AffixData[a1] == '\0')
+ if (*AffixDataGet(ConfBuild, a1) == '\0')
return a2;
- else if (*Conf->AffixData[a2] == '\0')
+ else if (*AffixDataGet(ConfBuild, a2) == '\0')
return a1;
- while (Conf->nAffixData + 1 >= Conf->lenAffixData)
- {
- Conf->lenAffixData *= 2;
- Conf->AffixData = (char **) repalloc(Conf->AffixData,
- sizeof(char *) * Conf->lenAffixData);
- }
-
- ptr = Conf->AffixData + Conf->nAffixData;
- if (Conf->flagMode == FM_NUM)
+ if (ConfBuild->dict->flagMode == FM_NUM)
{
- *ptr = cpalloc(strlen(Conf->AffixData[a1]) +
- strlen(Conf->AffixData[a2]) +
- 1 /* comma */ + 1 /* \0 */ );
- sprintf(*ptr, "%s,%s", Conf->AffixData[a1], Conf->AffixData[a2]);
+ len = strlen(AffixDataGet(ConfBuild, a1)) + 1 /* comma */ +
+ strlen(AffixDataGet(ConfBuild, a2));
+ ptr = tmpalloc(len + 1 /* \0 */);
+ sprintf(ptr, "%s,%s", AffixDataGet(ConfBuild, a1),
+ AffixDataGet(ConfBuild, a2));
}
else
{
- *ptr = cpalloc(strlen(Conf->AffixData[a1]) +
- strlen(Conf->AffixData[a2]) +
- 1 /* \0 */ );
- sprintf(*ptr, "%s%s", Conf->AffixData[a1], Conf->AffixData[a2]);
+ len = strlen(AffixDataGet(ConfBuild, a1)) +
+ strlen(AffixDataGet(ConfBuild, a2));
+ ptr = tmpalloc(len + 1 /* \0 */ );
+ sprintf(ptr, "%s%s", AffixDataGet(ConfBuild, a1),
+ AffixDataGet(ConfBuild, a2));
}
- ptr++;
- *ptr = NULL;
- Conf->nAffixData++;
- return Conf->nAffixData - 1;
+ AddAffixSet(ConfBuild, ptr, len);
+ pfree(ptr);
+
+ return ConfBuild->nAffixData - 1;
}
/*
@@ -1606,66 +1840,87 @@ MergeAffix(IspellDict *Conf, int a1, int a2)
* flags with the given index.
*/
static uint32
-makeCompoundFlags(IspellDict *Conf, int affix)
+makeCompoundFlags(IspellDictBuild *ConfBuild, int affix)
{
- char *str = Conf->AffixData[affix];
+ char *str = AffixDataGet(ConfBuild, affix);
- return (getCompoundAffixFlagValue(Conf, str) & FF_COMPOUNDFLAGMASK);
+ return (getCompoundAffixFlagValue(ConfBuild, str) & FF_COMPOUNDFLAGMASK);
}
/*
* Makes a prefix tree for the given level.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
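- * low: lower index of the Conf->Spell array.
- * high: upper index of the Conf->Spell array.
+ * low: lower index of the ConfBuild->Spell array.
+ * high: upper index of the ConfBuild->Spell array.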
* level: current prefix tree level.
+ *
+ * Returns the offset of the new SPNode in DictNodes, or
+ * ISPELL_INVALID_OFFSET if no node was created for this level.
*/
-static SPNode *
-mkSPNode(IspellDict *Conf, int low, int high, int level)
+static uint32
+mkSPNode(IspellDictBuild *ConfBuild, int low, int high, int level)
{
int i;
int nchar = 0;
char lastchar = '\0';
+ uint32 rs_offset,
+ new_offset;
SPNode *rs;
SPNodeData *data;
+ int data_index = 0;
int lownew = low;
for (i = low; i < high; i++)
- if (Conf->Spell[i]->p.d.len > level && lastchar != Conf->Spell[i]->word[level])
+ if (ConfBuild->Spell[i]->p.d.len > level &&
+ lastchar != ConfBuild->Spell[i]->word[level])
{
nchar++;
- lastchar = Conf->Spell[i]->word[level];
+ lastchar = ConfBuild->Spell[i]->word[level];
}
if (!nchar)
- return NULL;
+ return ISPELL_INVALID_OFFSET;
- rs = (SPNode *) cpalloc0(SPNHDRSZ + nchar * sizeof(SPNodeData));
- rs->length = nchar;
+ rs_offset = AllocateSPNode(ConfBuild, nchar);
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
data = rs->data;
lastchar = '\0';
for (i = low; i < high; i++)
- if (Conf->Spell[i]->p.d.len > level)
+ if (ConfBuild->Spell[i]->p.d.len > level)
{
- if (lastchar != Conf->Spell[i]->word[level])
+ if (lastchar != ConfBuild->Spell[i]->word[level])
{
if (lastchar)
{
/* Next level of the prefix tree */
- data->node = mkSPNode(Conf, lownew, i, level + 1);
+ new_offset = mkSPNode(ConfBuild, lownew, i, level + 1);
+
+ /*
+ * ConfBuild->DictNodes can be repalloc'ed within
+ * mkSPNode(), so reinitialize pointers.
+ */
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
+
+ /* First save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ /* Work with next node */
+ data_index++;
+ Assert(data_index < nchar);
+ data = &(rs->data[data_index]);
+
lownew = i;
- data++;
}
- lastchar = Conf->Spell[i]->word[level];
+ lastchar = ConfBuild->Spell[i]->word[level];
}
- data->val = ((uint8 *) (Conf->Spell[i]->word))[level];
- if (Conf->Spell[i]->p.d.len == level + 1)
+ data->val = ((uint8 *) (ConfBuild->Spell[i]->word))[level];
+ if (ConfBuild->Spell[i]->p.d.len == level + 1)
{
bool clearCompoundOnly = false;
- if (data->isword && data->affix != Conf->Spell[i]->p.d.affix)
+ if (data->isword && data->affix != ConfBuild->Spell[i]->p.d.affix)
{
/*
* MergeAffix called a few times. If one of word is
@@ -1674,15 +1929,17 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
*/
clearCompoundOnly = (FF_COMPOUNDONLY & data->compoundflag
- & makeCompoundFlags(Conf, Conf->Spell[i]->p.d.affix))
+ & makeCompoundFlags(ConfBuild,
+ ConfBuild->Spell[i]->p.d.affix))
? false : true;
- data->affix = MergeAffix(Conf, data->affix, Conf->Spell[i]->p.d.affix);
+ data->affix = MergeAffix(ConfBuild, data->affix,
+ ConfBuild->Spell[i]->p.d.affix);
}
else
- data->affix = Conf->Spell[i]->p.d.affix;
+ data->affix = ConfBuild->Spell[i]->p.d.affix;
data->isword = 1;
- data->compoundflag = makeCompoundFlags(Conf, data->affix);
+ data->compoundflag = makeCompoundFlags(ConfBuild, data->affix);
if ((data->compoundflag & FF_COMPOUNDONLY) &&
(data->compoundflag & FF_COMPOUNDFLAG) == 0)
@@ -1694,9 +1951,19 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
}
/* Next level of the prefix tree */
- data->node = mkSPNode(Conf, lownew, high, level + 1);
+ new_offset = mkSPNode(ConfBuild, lownew, high, level + 1);
- return rs;
+ /*
+ * ConfBuild->DictNodes can be repalloc'ed within mkSPNode(), so
+ * reinitialize pointers.
+ */
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
+
+ /* Save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ return rs_offset;
}
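A note on the mkSPNode() hunk above: child links are now stored as offsets into a growable array, and the parent pointer is looked up again after every recursive call because the array may have been repalloc'ed in the meantime. The following is a minimal standalone sketch of that pattern, under stated assumptions. Arena, arena_alloc(), arena_get() and build_chain() are hypothetical names invented here, realloc() stands in for repalloc(), and INVALID_OFFSET stands in for ISPELL_INVALID_OFFSET. It does not reproduce NodeArray or AllocateSPNode() from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define INVALID_OFFSET UINT32_MAX   /* hypothetical "no node" marker */

typedef struct Node
{
    uint32_t    child_offset;   /* offset of the child, not a pointer */
    uint32_t    value;
} Node;

typedef struct Arena
{
    char       *data;
    uint32_t    used;
    uint32_t    cap;
} Arena;

/* Reserves size zeroed bytes and returns their offset; may move a->data. */
static uint32_t
arena_alloc(Arena *a, uint32_t size)
{
    uint32_t    off = a->used;

    if (a->used + size > a->cap)
    {
        uint32_t    ncap = a->cap ? a->cap : 16;
        char       *p;

        while (ncap < a->used + size)
            ncap *= 2;
        p = realloc(a->data, ncap);
        if (p == NULL)
            abort();
        a->data = p;
        a->cap = ncap;
    }
    memset(a->data + off, 0, size);
    a->used += size;
    return off;
}

/* Converts an offset into a pointer that is valid only until the next alloc. */
static Node *
arena_get(Arena *a, uint32_t off)
{
    return (Node *) (a->data + off);
}

/* Builds a chain of depth nodes, each one linking to the next by offset. */
static uint32_t
build_chain(Arena *a, uint32_t depth)
{
    uint32_t    self, child;
    Node       *n;

    if (depth == 0)
        return INVALID_OFFSET;

    self = arena_alloc(a, sizeof(Node));
    arena_get(a, self)->value = depth;

    /* The recursion may grow the arena and move every node. */
    child = build_chain(a, depth - 1);

    /* Look the parent up again by offset before writing to it. */
    n = arena_get(a, self);
    n->child_offset = child;
    return self;
}
```

Holding a Node pointer across the recursive call instead of re-resolving it by offset would risk a write into freed memory whenever the array grows, which is the failure the comments in the hunk guard against.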
/*
@@ -1704,90 +1971,98 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
* and affixes.
*/
void
-NISortDictionary(IspellDict *Conf)
+NISortDictionary(IspellDictBuild *ConfBuild)
{
int i;
int naffix = 0;
int curaffix;
+ uint32 node_offset;
/* compress affixes */
/*
- * If we use flag aliases then we need to use Conf->AffixData filled in
+ * If we use flag aliases then we need to use ConfBuild->AffixData filled in
* the NIImportOOAffixes().
*/
- if (Conf->useFlagAliases)
+ if (ConfBuild->dict->useFlagAliases)
{
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
char *end;
- if (*Conf->Spell[i]->p.flag != '\0')
+ if (*ConfBuild->Spell[i]->p.flag != '\0')
{
- curaffix = strtol(Conf->Spell[i]->p.flag, &end, 10);
- if (Conf->Spell[i]->p.flag == end || errno == ERANGE)
+ curaffix = strtol(ConfBuild->Spell[i]->p.flag, &end, 10);
+ if (ConfBuild->Spell[i]->p.flag == end || errno == ERANGE)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix alias \"%s\"",
- Conf->Spell[i]->p.flag)));
+ ConfBuild->Spell[i]->p.flag)));
}
else
{
/*
- * If Conf->Spell[i]->p.flag is empty, then get empty value of
- * Conf->AffixData (0 index).
+ * If ConfBuild->Spell[i]->p.flag is empty, then get empty
+ * value of ConfBuild->AffixData (0 index).
*/
curaffix = 0;
}
- Conf->Spell[i]->p.d.affix = curaffix;
- Conf->Spell[i]->p.d.len = strlen(Conf->Spell[i]->word);
+ ConfBuild->Spell[i]->p.d.affix = curaffix;
+ ConfBuild->Spell[i]->p.d.len = strlen(ConfBuild->Spell[i]->word);
}
}
- /* Otherwise fill Conf->AffixData here */
+ /* Otherwise fill ConfBuild->AffixData here */
else
{
/* Count the number of different flags used in the dictionary */
- qsort((void *) Conf->Spell, Conf->nspell, sizeof(SPELL *),
+ qsort((void *) ConfBuild->Spell, ConfBuild->nSpell, sizeof(SPELL *),
cmpspellaffix);
naffix = 0;
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
if (i == 0
- || strcmp(Conf->Spell[i]->p.flag, Conf->Spell[i - 1]->p.flag))
+ || strcmp(ConfBuild->Spell[i]->p.flag,
+ ConfBuild->Spell[i - 1]->p.flag))
naffix++;
}
/*
- * Fill in Conf->AffixData with the affixes that were used in the
- * dictionary. Replace textual flag-field of Conf->Spell entries with
- * indexes into Conf->AffixData array.
+ * Fill in AffixData with the affixes that were used in the
+ * dictionary. Replace textual flag-field of ConfBuild->Spell entries
+ * with indexes into ConfBuild->AffixData array.
*/
- Conf->AffixData = (char **) palloc0(naffix * sizeof(char *));
+ InitAffixData(ConfBuild, naffix);
curaffix = -1;
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
if (i == 0
- || strcmp(Conf->Spell[i]->p.flag, Conf->AffixData[curaffix]))
+ || strcmp(ConfBuild->Spell[i]->p.flag,
+ AffixDataGet(ConfBuild, curaffix)))
{
curaffix++;
Assert(curaffix < naffix);
- Conf->AffixData[curaffix] = cpstrdup(Conf,
- Conf->Spell[i]->p.flag);
+ AddAffixSet(ConfBuild, ConfBuild->Spell[i]->p.flag,
+ strlen(ConfBuild->Spell[i]->p.flag));
}
- Conf->Spell[i]->p.d.affix = curaffix;
- Conf->Spell[i]->p.d.len = strlen(Conf->Spell[i]->word);
+ ConfBuild->Spell[i]->p.d.affix = curaffix;
+ ConfBuild->Spell[i]->p.d.len = strlen(ConfBuild->Spell[i]->word);
}
-
- Conf->lenAffixData = Conf->nAffixData = naffix;
}
/* Start build a prefix tree */
- qsort((void *) Conf->Spell, Conf->nspell, sizeof(SPELL *), cmpspell);
- Conf->Dictionary = mkSPNode(Conf, 0, Conf->nspell, 0);
+ qsort((void *) ConfBuild->Spell, ConfBuild->nSpell, sizeof(SPELL *), cmpspell);
+ node_offset = mkSPNode(ConfBuild, 0, ConfBuild->nSpell, 0);
+
+ /* Make void node only if the DictNodes is empty */
+ if (node_offset == ISPELL_INVALID_OFFSET)
+ {
+ /* AllocateSPNode() initializes root node data */
+ AllocateSPNode(ConfBuild, 1);
+ }
}
/*
@@ -1795,83 +2070,104 @@ NISortDictionary(IspellDict *Conf)
* rule. Affixes with empty replace string do not include in the prefix tree.
* This affixes are included by mkVoidAffix().
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
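- * low: lower index of the Conf->Affix array.
- * high: upper index of the Conf->Affix array.
+ * low: lower index of the ConfBuild->Affix array.
+ * high: upper index of the ConfBuild->Affix array.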
* level: current prefix tree level.
* type: FF_SUFFIX or FF_PREFIX.
+ *
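+ * Returns the offset of the new AffixNode in the prefix or suffix nodes
+ * array, or ISPELL_INVALID_OFFSET if no node was created for this level.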
*/
-static AffixNode *
-mkANode(IspellDict *Conf, int low, int high, int level, int type)
+static uint32
+mkANode(IspellDictBuild *ConfBuild, int low, int high, int level, int type)
{
int i;
int nchar = 0;
uint8 lastchar = '\0';
+ NodeArray *array;
+ uint32 rs_offset,
+ new_offset;
AffixNode *rs;
AffixNodeData *data;
+ int data_index = 0;
int lownew = low;
- int naff;
- AFFIX **aff;
for (i = low; i < high; i++)
- if (Conf->Affix[i].replen > level && lastchar != GETCHAR(Conf->Affix + i, level, type))
+ if (ConfBuild->Affix[i]->replen > level &&
+ lastchar != GETCHAR(ConfBuild->Affix[i], level, type))
{
nchar++;
- lastchar = GETCHAR(Conf->Affix + i, level, type);
+ lastchar = GETCHAR(ConfBuild->Affix[i], level, type);
}
if (!nchar)
- return NULL;
+ return ISPELL_INVALID_OFFSET;
- aff = (AFFIX **) tmpalloc(sizeof(AFFIX *) * (high - low + 1));
- naff = 0;
+ if (type == FF_SUFFIX)
+ array = &ConfBuild->SuffixNodes;
+ else
+ array = &ConfBuild->PrefixNodes;
- rs = (AffixNode *) cpalloc0(ANHRDSZ + nchar * sizeof(AffixNodeData));
- rs->length = nchar;
- data = rs->data;
+ rs_offset = AllocateAffixNode(ConfBuild, array, nchar);
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
+ data = (AffixNodeData *) rs->data;
lastchar = '\0';
for (i = low; i < high; i++)
- if (Conf->Affix[i].replen > level)
+ if (ConfBuild->Affix[i]->replen > level)
{
- if (lastchar != GETCHAR(Conf->Affix + i, level, type))
+ if (lastchar != GETCHAR(ConfBuild->Affix[i], level, type))
{
if (lastchar)
{
/* Next level of the prefix tree */
- data->node = mkANode(Conf, lownew, i, level + 1, type);
- if (naff)
- {
- data->naff = naff;
- data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * naff);
- memcpy(data->aff, aff, sizeof(AFFIX *) * naff);
- naff = 0;
- }
- data++;
+ new_offset = mkANode(ConfBuild, lownew, i, level + 1, type);
+
+ /*
+ * array can be repalloc'ed within mkANode(), so
+ * reinitialize pointers.
+ */
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
+
+ /* First save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ /* Handle next data node */
+ data_index++;
+ Assert(data_index < nchar);
+ data = &(rs->data[data_index]);
+
lownew = i;
}
- lastchar = GETCHAR(Conf->Affix + i, level, type);
+ lastchar = GETCHAR(ConfBuild->Affix[i], level, type);
}
- data->val = GETCHAR(Conf->Affix + i, level, type);
- if (Conf->Affix[i].replen == level + 1)
+ data->val = GETCHAR(ConfBuild->Affix[i], level, type);
+ if (ConfBuild->Affix[i]->replen == level + 1)
{ /* affix stopped */
- aff[naff++] = Conf->Affix + i;
+ if (data->affstart == ISPELL_INVALID_INDEX)
+ {
+ data->affstart = i;
+ data->affend = i;
+ }
+ else
+ data->affend = i;
}
}
/* Next level of the prefix tree */
- data->node = mkANode(Conf, lownew, high, level + 1, type);
- if (naff)
- {
- data->naff = naff;
- data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * naff);
- memcpy(data->aff, aff, sizeof(AFFIX *) * naff);
- naff = 0;
- }
+ new_offset = mkANode(ConfBuild, lownew, high, level + 1, type);
+
+ /*
+ * array can be repalloc'ed within mkANode(), so reinitialize pointers.
+ */
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
- pfree(aff);
+ /* Save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
- return rs;
+ return rs_offset;
}
/*
@@ -1879,139 +2175,151 @@ mkANode(IspellDict *Conf, int low, int high, int level, int type)
* for affixes which have empty replace string ("repl" field).
*/
static void
-mkVoidAffix(IspellDict *Conf, bool issuffix, int startsuffix)
+mkVoidAffix(IspellDictBuild *ConfBuild, bool issuffix, int startsuffix)
{
- int i,
- cnt = 0;
+ int i;
int start = (issuffix) ? startsuffix : 0;
- int end = (issuffix) ? Conf->naffixes : startsuffix;
- AffixNode *Affix = (AffixNode *) palloc0(ANHRDSZ + sizeof(AffixNodeData));
-
- Affix->length = 1;
- Affix->isvoid = 1;
+ int end = (issuffix) ? ConfBuild->nAffix : startsuffix;
+ uint32 node_offset;
+ NodeArray *array;
+ AffixNode *Affix;
+ AffixNodeData *AffixData;
if (issuffix)
- {
- Affix->data->node = Conf->Suffix;
- Conf->Suffix = Affix;
- }
+ array = &ConfBuild->SuffixNodes;
else
- {
- Affix->data->node = Conf->Prefix;
- Conf->Prefix = Affix;
- }
+ array = &ConfBuild->PrefixNodes;
- /* Count affixes with empty replace string */
- for (i = start; i < end; i++)
- if (Conf->Affix[i].replen == 0)
- cnt++;
-
- /* There is not affixes with empty replace string */
- if (cnt == 0)
- return;
+ node_offset = AllocateAffixNode(ConfBuild, array, 1);
+ Affix = (AffixNode *) NodeArrayGet(array, node_offset);
- Affix->data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * cnt);
- Affix->data->naff = (uint32) cnt;
+ Affix->isvoid = 1;
+ AffixData = (AffixNodeData *) Affix->data;
- cnt = 0;
for (i = start; i < end; i++)
- if (Conf->Affix[i].replen == 0)
+ if (ConfBuild->Affix[i]->replen == 0)
{
- Affix->data->aff[cnt] = Conf->Affix + i;
- cnt++;
+ if (AffixData->affstart == ISPELL_INVALID_INDEX)
+ {
+ AffixData->affstart = i;
+ AffixData->affend = i;
+ }
+ else
+ AffixData->affend = i;
}
}
/*
- * Checks if the affixflag is used by dictionary. Conf->AffixData does not
+ * Checks if the affixflag is used by dictionary. AffixData does not
* contain affixflag if this flag is not used actually by the .dict file.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* affixflag: affix flag.
*
- * Returns true if the Conf->AffixData array contains affixflag, otherwise
+ * Returns true if the ConfBuild->AffixData array contains affixflag, otherwise
* returns false.
*/
static bool
-isAffixInUse(IspellDict *Conf, char *affixflag)
+isAffixInUse(IspellDictBuild *ConfBuild, char *affixflag)
{
int i;
- for (i = 0; i < Conf->nAffixData; i++)
- if (IsAffixFlagInUse(Conf, i, affixflag))
+ for (i = 0; i < ConfBuild->nAffixData; i++)
+ if (IsAffixFlagInUse(ConfBuild->dict->flagMode,
+ AffixDataGet(ConfBuild, i), affixflag))
return true;
return false;
}
/*
- * Builds Conf->Prefix and Conf->Suffix trees from the imported affixes.
+ * Builds Prefix and Suffix trees from the imported affixes.
*/
void
-NISortAffixes(IspellDict *Conf)
+NISortAffixes(IspellDictBuild *ConfBuild)
{
AFFIX *Affix;
+ AffixNode *voidPrefix,
+ *voidSuffix;
size_t i;
CMPDAffix *ptr;
- int firstsuffix = Conf->naffixes;
-
- if (Conf->naffixes == 0)
- return;
+ int firstsuffix = ConfBuild->nAffix;
+ uint32 prefix_offset,
+ suffix_offset;
- /* Store compound affixes in the Conf->CompoundAffix array */
+ /* Store compound affixes in the ConfBuild->CompoundAffix array */
- if (Conf->naffixes > 1)
- qsort((void *) Conf->Affix, Conf->naffixes, sizeof(AFFIX), cmpaffix);
- Conf->CompoundAffix = ptr = (CMPDAffix *) palloc(sizeof(CMPDAffix) * Conf->naffixes);
- ptr->affix = NULL;
-
- for (i = 0; i < Conf->naffixes; i++)
+ if (ConfBuild->nAffix > 1)
+ qsort((void *) ConfBuild->Affix, ConfBuild->nAffix,
+ sizeof(AFFIX *), cmpaffix);
+ ConfBuild->nCompoundAffix = ConfBuild->nAffix + 1 /* terminating entry */;
+ ConfBuild->CompoundAffix = ptr =
+ (CMPDAffix *) tmpalloc(sizeof(CMPDAffix) * ConfBuild->nCompoundAffix);
+ ptr->affix = ISPELL_INVALID_INDEX;
+
+ for (i = 0; i < ConfBuild->nAffix; i++)
{
- Affix = &(((AFFIX *) Conf->Affix)[i]);
+ Affix = ConfBuild->Affix[i];
if (Affix->type == FF_SUFFIX && i < firstsuffix)
firstsuffix = i;
if ((Affix->flagflags & FF_COMPOUNDFLAG) && Affix->replen > 0 &&
- isAffixInUse(Conf, Affix->flag))
+ isAffixInUse(ConfBuild, AffixFieldFlag(Affix)))
{
bool issuffix = (Affix->type == FF_SUFFIX);
- if (ptr == Conf->CompoundAffix ||
+ if (ptr == ConfBuild->CompoundAffix ||
issuffix != (ptr - 1)->issuffix ||
- strbncmp((const unsigned char *) (ptr - 1)->affix,
- (const unsigned char *) Affix->repl,
+ strbncmp((const unsigned char *) AffixFieldRepl(ConfBuild->Affix[(ptr - 1)->affix]),
+ (const unsigned char *) AffixFieldRepl(Affix),
(ptr - 1)->len))
{
/* leave only unique and minimals suffixes */
- ptr->affix = Affix->repl;
+ ptr->affix = i;
ptr->len = Affix->replen;
ptr->issuffix = issuffix;
ptr++;
}
}
}
- ptr->affix = NULL;
- Conf->CompoundAffix = (CMPDAffix *) repalloc(Conf->CompoundAffix, sizeof(CMPDAffix) * (ptr - Conf->CompoundAffix + 1));
+ ptr->affix = ISPELL_INVALID_INDEX;
+ ConfBuild->nCompoundAffix = ptr - ConfBuild->CompoundAffix + 1;
/* Start build a prefix tree */
- Conf->Prefix = mkANode(Conf, 0, firstsuffix, 0, FF_PREFIX);
- Conf->Suffix = mkANode(Conf, firstsuffix, Conf->naffixes, 0, FF_SUFFIX);
- mkVoidAffix(Conf, true, firstsuffix);
- mkVoidAffix(Conf, false, firstsuffix);
+ mkVoidAffix(ConfBuild, true, firstsuffix);
+ mkVoidAffix(ConfBuild, false, firstsuffix);
+
+ prefix_offset = mkANode(ConfBuild, 0, firstsuffix, 0, FF_PREFIX);
+ suffix_offset = mkANode(ConfBuild, firstsuffix, ConfBuild->nAffix, 0,
+ FF_SUFFIX);
+
+ /* Adjust offsets of new nodes for nodes of void affixes */
+ voidPrefix = (AffixNode *) NodeArrayGet(&ConfBuild->PrefixNodes, 0);
+ voidPrefix->data[0].node_offset = prefix_offset;
+
+ voidSuffix = (AffixNode *) NodeArrayGet(&ConfBuild->SuffixNodes, 0);
+ voidSuffix->data[0].node_offset = suffix_offset;
}
static AffixNodeData *
-FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
+FindAffixes(IspellDictData *dict, AffixNode *node, const char *word, int wrdlen,
+ int *level, int type)
{
+ AffixNode *nodes;
AffixNodeData *StopLow,
*StopHigh,
*StopMiddle;
uint8 symbol;
+ if (type == FF_PREFIX)
+ nodes = (AffixNode *) DictPrefixNodes(dict);
+ else
+ nodes = (AffixNode *) DictSuffixNodes(dict);
+
if (node->isvoid)
{ /* search void affixes */
- if (node->data->naff)
+ if (node->data->affstart != ISPELL_INVALID_INDEX)
return node->data;
- node = node->data->node;
+ node = (AffixNode *) DictNodeGet(nodes, node->data->node_offset);
}
while (node && *level < wrdlen)
@@ -2026,9 +2334,10 @@ FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
if (StopMiddle->val == symbol)
{
(*level)++;
- if (StopMiddle->naff)
+ if (StopMiddle->affstart != ISPELL_INVALID_INDEX)
return StopMiddle;
- node = StopMiddle->node;
+ node = (AffixNode *) DictNodeGet(nodes,
+ StopMiddle->node_offset);
break;
}
else if (StopMiddle->val < symbol)
@@ -2042,8 +2351,67 @@ FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
return NULL;
}
+/*
+ * Compile regular expression on first use and store it within reg.
+ */
+static void
+CompileAffixReg(AffixReg *reg, bool isregis, int type,
+ const char *mask, int masklen, MemoryContext dictCtx)
+{
+ MemoryContext oldcontext;
+
+ Assert(dictCtx);
+
+ /*
+ * Switch to memory context of the dictionary, so compiled expression can be
+ * used in other queries.
+ */
+ oldcontext = MemoryContextSwitchTo(dictCtx);
+
+ if (isregis)
+ RS_compile(&reg->r.regis, (type == FF_SUFFIX), mask);
+ else
+ {
+ int wmasklen;
+ int err;
+ pg_wchar *wmask;
+ char *tmask;
+
+ tmask = (char *) palloc(masklen + 3);
+ if (type == FF_SUFFIX)
+ sprintf(tmask, "%s$", mask);
+ else
+ sprintf(tmask, "^%s", mask);
+
+ masklen = strlen(tmask);
+ wmask = (pg_wchar *) palloc((masklen + 1) * sizeof(pg_wchar));
+ wmasklen = pg_mb2wchar_with_len(tmask, wmask, masklen);
+
+ err = pg_regcomp(&(reg->r.regex), wmask, wmasklen,
+ REG_ADVANCED | REG_NOSUB,
+ DEFAULT_COLLATION_OID);
+ if (err)
+ {
+ char errstr[100];
+
+ pg_regerror(err, &(reg->r.regex), errstr, sizeof(errstr));
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
+ errmsg("invalid regular expression: %s", errstr)));
+ }
+
+ pfree(wmask);
+ pfree(tmask);
+ }
+
+ reg->iscompiled = true;
+
+ MemoryContextSwitchTo(oldcontext);
+}
+
static char *
-CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *newword, int *baselen)
+CheckAffix(const char *word, size_t len, AFFIX *Affix, AffixReg *reg,
+ int flagflags, char *newword, int *baselen, MemoryContext dictCtx)
{
/*
* Check compound allow flags
@@ -2083,7 +2451,7 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
if (Affix->type == FF_SUFFIX)
{
strcpy(newword, word);
- strcpy(newword + len - Affix->replen, Affix->find);
+ strcpy(newword + len - Affix->replen, AffixFieldFind(Affix));
if (baselen) /* store length of non-changed part of word */
*baselen = len - Affix->replen;
}
@@ -2093,9 +2461,9 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
* if prefix is an all non-changed part's length then all word
* contains only prefix and suffix, so out
*/
- if (baselen && *baselen + strlen(Affix->find) <= Affix->replen)
+ if (baselen && *baselen + Affix->findlen <= Affix->replen)
return NULL;
- strcpy(newword, Affix->find);
+ strcpy(newword, AffixFieldFind(Affix));
strcat(newword, word + Affix->replen);
}
@@ -2106,7 +2474,12 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
return newword;
else if (Affix->isregis)
{
- if (RS_execute(&(Affix->reg.regis), newword))
+ /* Compile the regular expression on first demand */
+ if (!reg->iscompiled)
+ CompileAffixReg(reg, Affix->isregis, Affix->type,
+ AffixFieldMask(Affix), Affix->masklen, dictCtx);
+
+ if (RS_execute(&(reg->r.regis), newword))
return newword;
}
else
@@ -2116,12 +2489,17 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
size_t data_len;
int newword_len;
+ /* Compile the regular expression on first demand */
+ if (!reg->iscompiled)
+ CompileAffixReg(reg, Affix->isregis, Affix->type,
+ AffixFieldMask(Affix), Affix->masklen, dictCtx);
+
/* Convert data string to wide characters */
newword_len = strlen(newword);
data = (pg_wchar *) palloc((newword_len + 1) * sizeof(pg_wchar));
data_len = pg_mb2wchar_with_len(newword, data, newword_len);
- if (!(err = pg_regexec(&(Affix->reg.regex), data, data_len, 0, NULL, 0, NULL, 0)))
+ if (!(err = pg_regexec(&(reg->r.regex), data, data_len, 0, NULL, 0, NULL, 0)))
{
pfree(data);
return newword;
@@ -2160,7 +2538,7 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
char **cur;
char newword[2 * MAXNORMLEN] = "";
char pnewword[2 * MAXNORMLEN] = "";
- AffixNode *snode = Conf->Suffix,
+ AffixNode *snode = (AffixNode *) DictSuffixNodes(Conf->dict),
*pnode;
int i,
j;
@@ -2172,7 +2550,7 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
/* Check that the word itself is normal form */
- if (FindWord(Conf, word, VoidString, flag))
+ if (FindWord(Conf->dict, word, VoidString, flag))
{
*cur = pstrdup(word);
cur++;
@@ -2180,23 +2558,29 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
}
/* Find all other NORMAL forms of the 'word' (check only prefix) */
- pnode = Conf->Prefix;
+ pnode = (AffixNode *) DictPrefixNodes(Conf->dict);
plevel = 0;
while (pnode)
{
- prefix = FindAffixes(pnode, word, wrdlen, &plevel, FF_PREFIX);
+ prefix = FindAffixes(Conf->dict, pnode, word, wrdlen, &plevel, FF_PREFIX);
if (!prefix)
break;
- for (j = 0; j < prefix->naff; j++)
+ for (j = prefix->affstart; j <= prefix->affend; j++)
{
- if (CheckAffix(word, wrdlen, prefix->aff[j], flag, newword, NULL))
+ AFFIX *affix = (AFFIX *) DictAffixGet(Conf->dict, j);
+ AffixReg *reg = &(Conf->reg[j]);
+
+ if (affix &&
+ CheckAffix(word, wrdlen, affix, reg, flag, newword, NULL,
+ Conf->dictCtx))
{
/* prefix success */
- if (FindWord(Conf, newword, prefix->aff[j]->flag, flag))
+ if (FindWord(Conf->dict, newword, AffixFieldFlag(affix), flag))
cur += addToResult(forms, cur, newword);
}
}
- pnode = prefix->node;
+ pnode = (AffixNode *) DictNodeGet(DictPrefixNodes(Conf->dict),
+ prefix->node_offset);
}
/*
@@ -2208,45 +2592,59 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
int baselen = 0;
/* find possible suffix */
- suffix = FindAffixes(snode, word, wrdlen, &slevel, FF_SUFFIX);
+ suffix = FindAffixes(Conf->dict, snode, word, wrdlen, &slevel,
+ FF_SUFFIX);
if (!suffix)
break;
/* foreach suffix check affix */
- for (i = 0; i < suffix->naff; i++)
+ for (i = suffix->affstart; i <= suffix->affend; i++)
{
- if (CheckAffix(word, wrdlen, suffix->aff[i], flag, newword, &baselen))
+ AFFIX *sufentry = (AFFIX *) DictAffixGet(Conf->dict, i);
+ AffixReg *sufreg = &(Conf->reg[i]);
+
+ if (sufentry &&
+ CheckAffix(word, wrdlen, sufentry, sufreg, flag, newword, &baselen,
+ Conf->dictCtx))
{
/* suffix success */
- if (FindWord(Conf, newword, suffix->aff[i]->flag, flag))
+ if (FindWord(Conf->dict, newword, AffixFieldFlag(sufentry), flag))
cur += addToResult(forms, cur, newword);
/* now we will look changed word with prefixes */
- pnode = Conf->Prefix;
+ pnode = (AffixNode *) DictPrefixNodes(Conf->dict);
plevel = 0;
swrdlen = strlen(newword);
while (pnode)
{
- prefix = FindAffixes(pnode, newword, swrdlen, &plevel, FF_PREFIX);
+ prefix = FindAffixes(Conf->dict, pnode, newword, swrdlen,
+ &plevel, FF_PREFIX);
if (!prefix)
break;
- for (j = 0; j < prefix->naff; j++)
+ for (j = prefix->affstart; j <= prefix->affend; j++)
{
- if (CheckAffix(newword, swrdlen, prefix->aff[j], flag, pnewword, &baselen))
+ AFFIX *prefentry = (AFFIX *) DictAffixGet(Conf->dict, j);
+ AffixReg *prefreg = &(Conf->reg[j]);
+
+ if (prefentry &&
+ CheckAffix(newword, swrdlen, prefentry, prefreg,
+ flag, pnewword, &baselen, Conf->dictCtx))
{
/* prefix success */
- char *ff = (prefix->aff[j]->flagflags & suffix->aff[i]->flagflags & FF_CROSSPRODUCT) ?
- VoidString : prefix->aff[j]->flag;
+ char *ff = (prefentry->flagflags & sufentry->flagflags & FF_CROSSPRODUCT) ?
+ VoidString : AffixFieldFlag(prefentry);
- if (FindWord(Conf, pnewword, ff, flag))
+ if (FindWord(Conf->dict, pnewword, ff, flag))
cur += addToResult(forms, cur, pnewword);
}
}
- pnode = prefix->node;
+ pnode = (AffixNode *) DictNodeGet(DictPrefixNodes(Conf->dict),
+ prefix->node_offset);
}
}
}
- snode = suffix->node;
+ snode = (AffixNode *) DictNodeGet(DictSuffixNodes(Conf->dict),
+ suffix->node_offset);
}
if (cur == forms)
@@ -2266,7 +2664,8 @@ typedef struct SplitVar
} SplitVar;
static int
-CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
+CheckCompoundAffixes(IspellDictData *dict, CMPDAffix **ptr,
+ char *word, int len, bool CheckInPlace)
{
bool issuffix;
@@ -2276,9 +2675,12 @@ CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
if (CheckInPlace)
{
- while ((*ptr)->affix)
+ while ((*ptr)->affix != ISPELL_INVALID_INDEX)
{
- if (len > (*ptr)->len && strncmp((*ptr)->affix, word, (*ptr)->len) == 0)
+ AFFIX *affix = (AFFIX *) DictAffixGet(dict, (*ptr)->affix);
+
+ if (len > (*ptr)->len &&
+ strncmp(AffixFieldRepl(affix), word, (*ptr)->len) == 0)
{
len = (*ptr)->len;
issuffix = (*ptr)->issuffix;
@@ -2292,9 +2694,12 @@ CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
{
char *affbegin;
- while ((*ptr)->affix)
+ while ((*ptr)->affix != ISPELL_INVALID_INDEX)
{
- if (len > (*ptr)->len && (affbegin = strstr(word, (*ptr)->affix)) != NULL)
+ AFFIX *affix = (AFFIX *) DictAffixGet(dict, (*ptr)->affix);
+
+ if (len > (*ptr)->len &&
+ (affbegin = strstr(word, AffixFieldRepl(affix))) != NULL)
{
len = (*ptr)->len + (affbegin - word);
issuffix = (*ptr)->issuffix;
@@ -2346,13 +2751,14 @@ AddStem(SplitVar *v, char *word)
}
static SplitVar *
-SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int wordlen, int startpos, int minpos)
+SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig,
+ char *word, int wordlen, int startpos, int minpos)
{
SplitVar *var = NULL;
SPNodeData *StopLow,
*StopHigh,
*StopMiddle = NULL;
- SPNode *node = (snode) ? snode : Conf->Dictionary;
+ SPNode *node = (snode) ? snode : (SPNode *) DictDictNodes(Conf->dict);
int level = (snode) ? minpos : startpos; /* recursive
* minpos==level */
int lenaff;
@@ -2367,8 +2773,11 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
while (level < wordlen)
{
/* find word with epenthetic or/and compound affix */
- caff = Conf->CompoundAffix;
- while (level > startpos && (lenaff = CheckCompoundAffixes(&caff, word + level, wordlen - level, (node) ? true : false)) >= 0)
+ caff = (CMPDAffix *) DictCompoundAffix(Conf->dict);
+ while (level > startpos &&
+ (lenaff = CheckCompoundAffixes(Conf->dict, &caff,
+ word + level, wordlen - level,
+ (node) ? true : false)) >= 0)
{
/*
* there is one of compound affixes, so check word for existings
@@ -2415,7 +2824,8 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
while (ptr->next)
ptr = ptr->next;
- ptr->next = SplitToVariants(Conf, NULL, new, word, wordlen, startpos + lenaff, startpos + lenaff);
+ ptr->next = SplitToVariants(Conf, NULL, new, word, wordlen,
+ startpos + lenaff, startpos + lenaff);
pfree(new->stem);
pfree(new);
@@ -2474,13 +2884,14 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
/* we can find next word */
level++;
AddStem(var, pnstrdup(word + startpos, level - startpos));
- node = Conf->Dictionary;
+ node = (SPNode *) DictDictNodes(Conf->dict);
startpos = level;
continue;
}
}
}
- node = StopMiddle->node;
+ node = (SPNode *) DictNodeGet(DictDictNodes(Conf->dict),
+ StopMiddle->node_offset);
}
else
node = NULL;
@@ -2530,7 +2941,7 @@ NINormalizeWord(IspellDict *Conf, char *word)
pfree(res);
}
- if (Conf->usecompound)
+ if (Conf->dict->usecompound)
{
int wordlen = strlen(word);
SplitVar *ptr,
diff --git a/src/include/tsearch/dicts/spell.h b/src/include/tsearch/dicts/spell.h
index 4cba578436..df0abd38ae 100644
--- a/src/include/tsearch/dicts/spell.h
+++ b/src/include/tsearch/dicts/spell.h
@@ -18,21 +18,23 @@
#include "tsearch/dicts/regis.h"
#include "tsearch/ts_public.h"
+#define ISPELL_INVALID_INDEX (0x7FFFF)
+#define ISPELL_INVALID_OFFSET (0xFFFFFFFF)
+
/*
* SPNode and SPNodeData are used to represent prefix tree (Trie) to store
* a words list.
*/
-struct SPNode;
-
typedef struct
{
uint32 val:8,
isword:1,
/* Stores compound flags listed below */
compoundflag:4,
- /* Reference to an entry of the AffixData field */
+ /* Index of an entry of the AffixData field */
affix:19;
- struct SPNode *node;
+ /* Offset to a node of the DictNodes field */
+ uint32 node_offset;
} SPNodeData;
/*
@@ -86,21 +88,55 @@ typedef struct spell_struct
*/
typedef struct aff_struct
{
- char *flag;
/* FF_SUFFIX or FF_PREFIX */
- uint32 type:1,
+ uint16 type:1,
flagflags:7,
issimple:1,
isregis:1,
- replen:14;
- char *find;
- char *repl;
+ flaglen:2;
+
+ /* 8 bits could be too much for repl, find and mask, but who knows */
+ uint8 replen;
+ uint8 findlen;
+ uint8 masklen;
+
+ /*
+ * fields stores the following data (each ends with \0):
+ * - repl
+ * - find
+ * - mask
+ * - flag - one character (if FM_CHAR),
+ * two characters (if FM_LONG),
+ * number, >= 0 and < 65536 (if FM_NUM).
+ */
+ char fields[FLEXIBLE_ARRAY_MEMBER];
+} AFFIX;
+
+#define AF_FLAG_MAXSIZE 5 /* strlen(65536) */
+#define AF_REPL_MAXSIZE 255 /* 8 bits */
+
+#define AFFIXHDRSZ (offsetof(AFFIX, fields))
+
+#define AffixFieldRepl(af) ((af)->fields)
+#define AffixFieldFind(af) ((af)->fields + (af)->replen + 1)
+#define AffixFieldMask(af) (AffixFieldFind(af) + (af)->findlen + 1)
+#define AffixFieldFlag(af) (AffixFieldMask(af) + (af)->masklen + 1)
+#define AffixGetSize(af) (AFFIXHDRSZ + (af)->replen + 1 + (af)->findlen + 1 \
+ + (af)->masklen + 1 + strlen(AffixFieldFlag(af)) + 1)
+
+/*
+ * Stores compiled regular expression of affix. AffixReg uses mask field of
+ * AFFIX as a regular expression.
+ */
+typedef struct AffixReg
+{
+ bool iscompiled;
union
{
regex_t regex;
Regis regis;
- } reg;
-} AFFIX;
+ } r;
+} AffixReg;
/*
* affixes use dictionary flags too
@@ -120,14 +156,13 @@ typedef struct aff_struct
* AffixNode and AffixNodeData are used to represent prefix tree (Trie) to store
* an affix list.
*/
-struct AffixNode;
-
typedef struct
{
- uint32 val:8,
- naff:24;
- AFFIX **aff;
- struct AffixNode *node;
+ uint8 val;
+ uint32 affstart;
+ uint32 affend;
+ /* Offset to a node of the PrefixNodes or SuffixNodes field */
+ uint32 node_offset;
} AffixNodeData;
typedef struct AffixNode
@@ -139,9 +174,20 @@ typedef struct AffixNode
#define ANHRDSZ (offsetof(AffixNode, data))
+typedef struct NodeArray
+{
+ char *Nodes;
+ uint32 NodesSize; /* allocated size of Nodes */
+ uint32 NodesEnd; /* end of data in Nodes */
+} NodeArray;
+
+#define NodeArrayGet(na, of) \
+ (((of) == ISPELL_INVALID_OFFSET) ? NULL : (na)->Nodes + (of))
+
typedef struct
{
- char *affix;
+ /* Index of an affix of the Affix field */
+ uint32 affix;
int len;
bool issuffix;
} CMPDAffix;
@@ -176,30 +222,75 @@ typedef struct CompoundAffixFlag
#define FLAGNUM_MAXSIZE (1 << 16)
-typedef struct
+typedef struct IspellDictData
{
- int maffixes;
- int naffixes;
- AFFIX *Affix;
-
- AffixNode *Suffix;
- AffixNode *Prefix;
+ FlagMode flagMode;
+ bool usecompound;
- SPNode *Dictionary;
- /* Array of sets of affixes */
- char **AffixData;
- int lenAffixData;
- int nAffixData;
bool useFlagAliases;
- CMPDAffix *CompoundAffix;
+ uint32 nAffixData;
+ uint32 AffixDataStart;
- bool usecompound;
- FlagMode flagMode;
+ uint32 AffixOffsetStart;
+ uint32 AffixStart;
+ uint32 nAffix;
+
+ uint32 DictNodesStart;
+ uint32 PrefixNodesStart;
+ uint32 SuffixNodesStart;
+
+ uint32 CompoundAffixStart;
/*
- * All follow fields are actually needed only for initialization
+ * data stores:
+ * - AffixData - array of affix sets
+ * - Affix - sorted array of affixes
+ * - DictNodes - prefix tree of a word list
+ * - PrefixNodes - prefix tree of a prefix list
+ * - SuffixNodes - prefix tree of a suffix list
+ * - CompoundAffix - array of compound affixes
*/
+ char data[FLEXIBLE_ARRAY_MEMBER];
+} IspellDictData;
+
+#define IspellDictDataHdrSize (offsetof(IspellDictData, data))
+
+#define DictAffixDataOffset(d) ((d)->data)
+#define DictAffixData(d) ((d)->data + (d)->AffixDataStart)
+#define DictAffixDataGet(d, i) \
+ (((i) == ISPELL_INVALID_INDEX) ? NULL : \
+ (AssertMacro(i < (d)->nAffixData), \
+ DictAffixData(d) + ((uint32 *) DictAffixDataOffset(d))[i]))
+
+#define DictAffixOffset(d) ((d)->data + (d)->AffixOffsetStart)
+#define DictAffix(d) ((d)->data + (d)->AffixStart)
+#define DictAffixGet(d, i) \
+ (((i) == ISPELL_INVALID_INDEX) ? NULL : \
+ (AssertMacro(i < (d)->nAffix), \
+ DictAffix(d) + ((uint32 *) DictAffixOffset(d))[i]))
+
+#define DictDictNodes(d) ((d)->data + (d)->DictNodesStart)
+#define DictPrefixNodes(d) ((d)->data + (d)->PrefixNodesStart)
+#define DictSuffixNodes(d) ((d)->data + (d)->SuffixNodesStart)
+#define DictNodeGet(node_start, of) \
+ (((of) == ISPELL_INVALID_OFFSET) ? NULL : (char *) (node_start) + (of))
+
+#define DictCompoundAffix(d) ((d)->data + (d)->CompoundAffixStart)
+
+/*
+ * IspellDictBuild is used to initialize IspellDictData struct. This is a
+ * temporary structure which is set up by NIStartBuild() and released by
+ * NIFinishBuild().
+ */
+typedef struct IspellDictBuild
+{
+ MemoryContext buildCxt; /* temp context for construction */
+
+ IspellDictData *dict;
+ uint32 dict_size;
+
+ /* Temporary data */
/* Array of Hunspell options in affix file */
CompoundAffixFlag *CompoundAffixFlags;
@@ -208,29 +299,73 @@ typedef struct
/* allocated length of CompoundAffixFlags array */
int mCompoundAffixFlag;
- /*
- * Remaining fields are only used during dictionary construction; they are
- * set up by NIStartBuild and cleared by NIFinishBuild.
- */
- MemoryContext buildCxt; /* temp context for construction */
-
- /* Temporary array of all words in the dict file */
+ /* Array of all words in the dict file */
SPELL **Spell;
- int nspell; /* number of valid entries in Spell array */
- int mspell; /* allocated length of Spell array */
+ int nSpell; /* number of valid entries in Spell array */
+ int mSpell; /* allocated length of Spell array */
+
+ /* Data for IspellDictData */
+
+ /* Array of all affixes in the aff file */
+ AFFIX **Affix;
+ int nAffix; /* number of valid entries in Affix array */
+ int mAffix; /* allocated length of Affix array */
+ uint32 AffixSize;
+
+ /* Array of sets of affixes */
+ uint32 *AffixDataOffset;
+ int nAffixData; /* number of affix sets */
+ int mAffixData; /* allocated number of affix sets */
+ char *AffixData;
+ uint32 AffixDataSize; /* allocated size of AffixData */
+ uint32 AffixDataEnd; /* end of data in AffixData */
+
+ /* Prefix tree which stores a word list */
+ NodeArray DictNodes;
+
+ /* Prefix tree which stores a prefix list */
+ NodeArray PrefixNodes;
+
+ /* Prefix tree which stores a suffix list */
+ NodeArray SuffixNodes;
- /* These are used to allocate "compact" data without palloc overhead */
- char *firstfree; /* first free address (always maxaligned) */
- size_t avail; /* free space remaining at firstfree */
+ /* Array of compound affixes */
+ CMPDAffix *CompoundAffix;
+ int nCompoundAffix; /* number of entries of CompoundAffix */
+} IspellDictBuild;
+
+#define AffixDataGet(d, i) ((d)->AffixData + (d)->AffixDataOffset[i])
+
+/*
+ * IspellDict is used within NINormalizeWord.
+ */
+typedef struct IspellDict
+{
+ /*
+ * Pointer to a DSM location of IspellDictData. Should be retrieved on
+ * every dispell_lexize() call.
+ */
+ IspellDictData *dict;
+ /*
+ * Array of regular expression of affixes. Each regular expression is
+ * compiled only on demand.
+ */
+ AffixReg *reg;
+ /*
+ * Memory context for compiling regular expressions.
+ */
+ MemoryContext dictCtx;
} IspellDict;
extern TSLexeme *NINormalizeWord(IspellDict *Conf, char *word);
-extern void NIStartBuild(IspellDict *Conf);
-extern void NIImportAffixes(IspellDict *Conf, const char *filename);
-extern void NIImportDictionary(IspellDict *Conf, const char *filename);
-extern void NISortDictionary(IspellDict *Conf);
-extern void NISortAffixes(IspellDict *Conf);
-extern void NIFinishBuild(IspellDict *Conf);
+extern void NIStartBuild(IspellDictBuild *ConfBuild);
+extern void NIImportAffixes(IspellDictBuild *ConfBuild, const char *filename);
+extern void NIImportDictionary(IspellDictBuild *ConfBuild,
+ const char *filename);
+extern void NISortDictionary(IspellDictBuild *ConfBuild);
+extern void NISortAffixes(IspellDictBuild *ConfBuild);
+extern void NICopyData(IspellDictBuild *ConfBuild);
+extern void NIFinishBuild(IspellDictBuild *ConfBuild);
#endif
--
2.20.1
On Wed, Feb 20, 2019 at 9:33 AM Arthur Zakirov <a.zakirov@postgrespro.ru> wrote:
I'm working on the (b) approach. I thought about a priority queue
structure. There no such ready structure within PostgreSQL sources
except binaryheap.c, but it isn't for concurrent algorithms.
I don't see why you need a priority queue or, really, any other fancy
data structure. It seems like all you need to do is somehow set it up
so that a backend which doesn't use a dictionary for a while will
dsm_detach() the segment. Eventually an unused dictionary will have
no remaining references and will go away.
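To illustrate the lifetime rule described above, here is a minimal sketch in plain C. It is not PostgreSQL code: `SharedDict`, `dict_create`, `dict_attach` and `dict_detach` are hypothetical names, and a malloc'd buffer stands in for a DSM segment. The only point is that an unpinned segment disappears when its last user detaches, while a pinned one survives.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for one dictionary's shared segment. */
typedef struct SharedDict
{
    int   refcount;  /* number of currently attached "backends" */
    bool  pinned;    /* a pinned segment survives with zero references */
    char *data;      /* NULL once the segment has gone away */
} SharedDict;

static void
dict_create(SharedDict *d, size_t size, bool pinned)
{
    d->refcount = 0;
    d->pinned = pinned;
    d->data = malloc(size);
    assert(d->data != NULL);
}

static void
dict_attach(SharedDict *d)
{
    assert(d->data != NULL);
    d->refcount++;
}

/* Drop one reference; free the segment when nobody uses it anymore. */
static void
dict_detach(SharedDict *d)
{
    assert(d->refcount > 0);
    if (--d->refcount == 0 && !d->pinned)
    {
        free(d->data);
        d->data = NULL;
    }
}
```

With two attached users, the first `dict_detach()` leaves the data in place and the second one releases it. No central eviction policy is involved.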
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 21.02.2019 15:45, Robert Haas wrote:
On Wed, Feb 20, 2019 at 9:33 AM Arthur Zakirov <a.zakirov@postgrespro.ru> wrote:
I'm working on the (b) approach. I thought about a priority queue
structure. There no such ready structure within PostgreSQL sources
except binaryheap.c, but it isn't for concurrent algorithms.
I don't see why you need a priority queue or, really, any other fancy
data structure. It seems like all you need to do is somehow set it up
so that a backend which doesn't use a dictionary for a while will
dsm_detach() the segment. Eventually an unused dictionary will have
no remaining references and will go away.
Hm, I hadn't thought of it that way. I agree that a new data structure
would be overengineering.
In the current patch all DSM segments are pinned (that is,
dsm_pin_segment() is called), so a dictionary stays in shared memory
even if nobody holds a reference to it.
I thought about periodically scanning the shared hash table and
unpinning old and unused dictionaries. But this approach needs a
sequential scan facility for dshash. Fortunately, there is a patch from
Kyotaro-san (the v16-0001-sequential-scan-for-dshash.patch part):
/messages/by-id/20190221.160555.191280262.horiguchi.kyotaro@lab.ntt.co.jp
Your approach looks simpler. It is only necessary to periodically scan
the dictionaries' cache hash table and to not call dsm_pin_segment() when
a DSM segment is initialized. It also means that a dictionary stays loaded
in DSM only while some backend is attached to the dictionary's DSM segment.
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
On Thu, Feb 21, 2019 at 8:28 AM Arthur Zakirov <a.zakirov@postgrespro.ru> wrote:
Your approach looks simpler. It is necessary just to periodically scan
dictionaries' cache hash table and not call dsm_pin_segment() when a DSM
segment initialized. It also means that a dictionary is loaded into DSM
only while there is a backend which attached the dictionary's DSM.
Right. I think that having a central facility that tries to decide
whether a dictionary should be kept in shared memory, e.g. based on a
cache size parameter, isn't likely to work well. The
problem is that if we make a decision that a dictionary should be
evicted because it's causing us to exceed the cache size threshold,
then we have no way to implement that decision. We can't force other
backends to remove the mapping immediately, nor can we really bound
the time before they respond to a request to unmap it. They might be
in the middle of using it.
So I think it's better to have each backend locally make a decision
about when that particular backend no longer needs the dictionary, and
then let the system automatically clean up the ones that are needed by
nobody.
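A minimal sketch of such a backend-local decision, assuming a simple idle timeout. All names here (`LocalDictEntry`, `local_dict_touch`, `local_dict_release_idle`) are hypothetical, and the place where real code would detach from the segment is only marked by a comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical per-backend bookkeeping for one dictionary. */
typedef struct LocalDictEntry
{
    unsigned int dict_id;   /* identifies the dictionary */
    bool         attached;  /* does this backend hold a mapping? */
    time_t       last_used; /* time of the last lookup */
} LocalDictEntry;

/* Record that the dictionary was just used by this backend. */
static void
local_dict_touch(LocalDictEntry *e, time_t now)
{
    e->attached = true;
    e->last_used = now;
}

/*
 * Release every dictionary this backend has not used for more than
 * max_idle_secs. Returns the number of entries released.
 */
static int
local_dict_release_idle(LocalDictEntry *entries, size_t n,
                        time_t now, double max_idle_secs)
{
    int released = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (entries[i].attached &&
            difftime(now, entries[i].last_used) > max_idle_secs)
        {
            /* real code would detach from the shared segment here */
            entries[i].attached = false;
            released++;
        }
    }
    return released;
}
```

Each backend only inspects its own table, so no cross-backend signalling is needed. A dictionary goes away once every backend has independently released it.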
Perhaps a better approach still would be to do what Andres proposed
back in March:
#> Is there any chance we can instead can convert dictionaries into a form
#> we can just mmap() into memory? That'd scale a lot higher and more
#> dynamicallly?
The current approach inherently involves double-buffering: you've got
the filesystem cache containing the data read from disk, and then the
DSM containing the converted form of the data. Having something that
you could just mmap() would avoid that, plus it would become a lot
less critical to keep the mappings around. You could probably just
have individual queries mmap() it for as long as they need it and then
tear out the mapping when they finish executing; keeping the mappings
across queries likely wouldn't be too important in this case.
The downside is that you'd probably need to teach resowner.c about
mappings created via mmap() so that you don't leak mappings on an
abort, but that's probably not a crazy difficult problem.
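The map-while-needed pattern could look roughly like the following POSIX sketch. The `MappedFile` struct and both function names are hypothetical, and the resowner.c integration mentioned above is not shown. A caller would have to make sure `unmap_file()` also runs on the error path.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical handle for one read-only mapping. */
typedef struct MappedFile
{
    void  *addr;
    size_t size;
} MappedFile;

/* Map a whole file read-only. Returns 0 on success, -1 on failure. */
static int
map_file_readonly(const char *path, MappedFile *mf)
{
    struct stat st;
    int         fd;

    assert(path != NULL && mf != NULL);
    mf->addr = NULL;
    mf->size = 0;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) != 0 || st.st_size <= 0)
    {
        close(fd);
        return -1;
    }
    mf->addr = mmap(NULL, (size_t) st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                  /* the mapping stays valid after close() */
    if (mf->addr == MAP_FAILED)
    {
        mf->addr = NULL;
        return -1;
    }
    mf->size = (size_t) st.st_size;
    return 0;
}

/* Tear the mapping down again, e.g. at the end of a query. */
static void
unmap_file(MappedFile *mf)
{
    if (mf->addr != NULL)
    {
        munmap(mf->addr, mf->size);
        mf->addr = NULL;
        mf->size = 0;
    }
}
```

Because the pages come straight from the filesystem cache, there is no second copy of the data in the process, which is the double-buffering saving described above.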
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
Perhaps a better approach still would be to do what Andres proposed
back in March:
#> Is there any chance we can instead can convert dictionaries into a form
#> we can just mmap() into memory? That'd scale a lot higher and more
#> dynamicallly?
That seems awfully attractive. I was about to question whether we could
assume that mmap() works everywhere, but it's required by SUSv2 ... and
if anybody has anything sufficiently lame for it not to work, we could
fall back on malloc-a-hunk-of-memory-and-read-in-the-file.
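The fallback could be as simple as the following sketch, which reads the whole file into a malloc'd buffer using only standard C I/O. The function name `read_file_into_memory` is hypothetical. Unlike mmap(), this gives each process its own private copy.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Read a whole file into a malloc'd buffer. Returns NULL on failure;
 * on success the caller owns the buffer and must free() it.
 */
static char *
read_file_into_memory(const char *path, size_t *size_out)
{
    FILE *fp;
    char *buf = NULL;
    long  len;

    assert(path != NULL && size_out != NULL);

    fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;

    if (fseek(fp, 0, SEEK_END) == 0 &&
        (len = ftell(fp)) > 0 &&
        fseek(fp, 0, SEEK_SET) == 0)
    {
        buf = malloc((size_t) len);
        if (buf != NULL && fread(buf, 1, (size_t) len, fp) != (size_t) len)
        {
            /* short read: treat as failure */
            free(buf);
            buf = NULL;
        }
        if (buf != NULL)
            *size_out = (size_t) len;
    }
    fclose(fp);
    return buf;
}
```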
We'd need a bunch of work to design a position-independent binary
representation for dictionaries, and then some tool to produce disk files
containing that, so this isn't exactly a quick route to a solution.
On the other hand, it isn't sounding like the current patch is getting
close to committable either.
(Actually, I guess you need a PI representation of a dictionary to
put it in a DSM either, so presumably that part of the work is
done already; although we might also wish for architecture independence
of the disk files, which we probably don't have right now.)
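A toy illustration of what "position-independent" means here, in the spirit of the node_offset fields and the DictNodeGet() macro in the patch. The `FlatNode` layout and the `flat_*` helpers are hypothetical and far simpler than the real tries: each node links to the next by a byte offset from the start of the buffer, so the buffer can be copied or mapped at any address. Copying raw structs like this is not architecture-independent (endianness and padding differ), which is exactly the caveat raised above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define INVALID_OFFSET 0xFFFFFFFFu  /* plays the role of ISPELL_INVALID_OFFSET */

/* Hypothetical node: one character plus an offset instead of a pointer. */
typedef struct FlatNode
{
    uint8_t  val;
    uint8_t  isword;
    uint32_t next_offset;   /* byte offset from buffer start */
} FlatNode;

/* Copy the node at 'off' into *out. Returns 0 for an invalid offset. */
static int
flat_node_read(const char *base, uint32_t off, FlatNode *out)
{
    if (off == INVALID_OFFSET)
        return 0;
    memcpy(out, base + off, sizeof(*out));
    return 1;
}

/*
 * Store 'word' as a chain of nodes. Returns the offset of the first node.
 * The caller must provide a buffer that is large enough.
 */
static uint32_t
flat_build_word(char *base, uint32_t *end, const char *word)
{
    uint32_t first = INVALID_OFFSET;
    uint32_t prev = INVALID_OFFSET;

    for (const char *p = word; *p; p++)
    {
        FlatNode node = {(uint8_t) *p, (uint8_t) (p[1] == '\0'), INVALID_OFFSET};
        uint32_t off = *end;

        memcpy(base + off, &node, sizeof(node));
        *end += (uint32_t) sizeof(node);

        if (prev == INVALID_OFFSET)
            first = off;
        else
        {
            /* link the previous node to the new one by offset */
            FlatNode pn;

            memcpy(&pn, base + prev, sizeof(pn));
            pn.next_offset = off;
            memcpy(base + prev, &pn, sizeof(pn));
        }
        prev = off;
    }
    return first;
}

/* Returns 1 if 'word' is stored in the chain starting at 'first'. */
static int
flat_contains(const char *base, uint32_t first, const char *word)
{
    FlatNode n;
    uint32_t off = first;

    for (const char *p = word; *p; p++)
    {
        if (!flat_node_read(base, off, &n) || n.val != (uint8_t) *p)
            return 0;
        if (p[1] == '\0')
            return n.isword;
        off = n.next_offset;
    }
    return 0;
}
```

After building the chain in one buffer, a memcpy() of the used bytes into a second buffer yields a structure that `flat_contains()` can search without any fix-up, because no absolute address is stored anywhere.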
regards, tom lane
On February 21, 2019 10:08:00 AM PST, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
Perhaps a better approach still would be to do what Andres proposed
back in March:
#> Is there any chance we can instead can convert dictionaries into a form
#> we can just mmap() into memory? That'd scale a lot higher and more
#> dynamicallly?
That seems awfully attractive. I was about to question whether we could
assume that mmap() works everywhere, but it's required by SUSv2 ... and
if anybody has anything sufficiently lame for it not to work, we could
fall back on malloc-a-hunk-of-memory-and-read-in-the-file.
We'd need a bunch of work to design a position-independent binary
representation for dictionaries, and then some tool to produce disk files
containing that, so this isn't exactly a quick route to a solution.
On the other hand, it isn't sounding like the current patch is getting
close to committable either.
(Actually, I guess you need a PI representation of a dictionary to
put it in a DSM either, so presumably that part of the work is
done already; although we might also wish for architecture independence
of the disk files, which we probably don't have right now.)
That's what I was pushing for ages ago...
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
On 21.02.2019 19:13, Robert Haas wrote:
So I think it's better to have each backend locally make a decision
about when that particular backend no longer needs the dictionary, and
then let the system automatically clean up the ones that are needed by
nobody.
Yep, it wouldn't be hard to implement.
Perhaps a better approach still would be to do what Andres proposed
back in March:
#> Is there any chance we can instead can convert dictionaries into a form
#> we can just mmap() into memory? That'd scale a lot higher and more
#> dynamicallly?
The current approach inherently involves double-buffering: you've got
the filesystem cache containing the data read from disk, and then the
DSM containing the converted form of the data. Having something that
you could just mmap() would avoid that, plus it would become a lot
less critical to keep the mappings around. You could probably just
have individual queries mmap() it for as long as they need it and then
tear out the mapping when they finish executing; keeping the mappings
across queries likely wouldn't be too important in this case.
The downside is that you'd probably need to teach resowner.c about
mappings created via mmap() so that you don't leak mappings on an
abort, but that's probably not a crazy difficult problem.
It seems that Tom and Andres also favor the mmap() approach. I think
I need to take a closer look at mmap().
I've labeled the patch as 'v13'.
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
On 25.02.2019 14:33, Arthur Zakirov wrote:
It seems to me Tom and Andres also vote for the mmap() approach. I think
I need to look closely at the mmap().
I've labeled the patch as 'v13'.
Unfortunately I haven't come up with a new patch yet, so I have marked
the entry as "Returned with feedback" for now.
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Hello hackers,
On 25.02.2019 14:33, Arthur Zakirov wrote:
The current approach inherently involves double-buffering: you've got
the filesystem cache containing the data read from disk, and then the
DSM containing the converted form of the data. Having something that
you could just mmap() would avoid that, plus it would become a lot
less critical to keep the mappings around. You could probably just
have individual queries mmap() it for as long as they need it and then
tear out the mapping when they finish executing; keeping the mappings
across queries likely wouldn't be too important in this case.
The downside is that you'd probably need to teach resowner.c about
mappings created via mmap() so that you don't leak mappings on an
abort, but that's probably not a crazy difficult problem.
It seems to me Tom and Andres also vote for the mmap() approach. I think
I need to look closely at the mmap().
I've labeled the patch as 'v13'.
I've attached a new version of the patch. Note that it is in a WIP state
for now and there are unresolved issues, which are listed at the end of
this email.
The patch implements a simple approach using mmap(). I also want to be
sure that I'm going in the right direction, so feel free to send feedback.
On every dispell_init() call Postgres checks is there a shared
dictionary file in the pg_shdict directory, if it is then calls mmap().
If there is no such file then it compiles the dictionary, write it to
the file and calls mmap().
dispell_lexize() works with already mmap'ed dictionary. So it doesn't
mmap() for each individual query as Robert proposed above. It's because
such approach reduces performance twice (I tested with ts_lexize() calls
by pgbench).
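For readers who want to see the control flow in isolation, here is a minimal standalone sketch of the "create the file if it is missing, then mmap() it" sequence described above. It is not the patch code: it uses plain POSIX calls instead of OpenTransientFile() and ereport(), and attach_shared_dict() and build_dict_image() are hypothetical names, the latter standing in for the real dictionary compilation.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical stand-in for compiling an Ispell dictionary into a flat image. */
static size_t
build_dict_image(char *buf, size_t cap)
{
	static const char image[] = "compiled-dictionary";

	assert(cap >= sizeof(image));
	memcpy(buf, image, sizeof(image));
	return sizeof(image);
}

/*
 * Return a read-only mapping of the file at "path", creating and filling the
 * file first if it does not exist yet.  Sets *built to 1 when this call
 * created the file.  Returns NULL on any failure.
 */
static const void *
attach_shared_dict(const char *path, size_t *size, int *built)
{
	char		buf[64];
	struct stat st;
	void	   *addr;
	int			fd;

	*built = 0;

	/* O_EXCL makes creation atomic: only one process gets to build. */
	fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
	if (fd >= 0)
	{
		size_t		len = build_dict_image(buf, sizeof(buf));

		if (write(fd, buf, len) != (ssize_t) len)
		{
			close(fd);
			unlink(path);
			return NULL;
		}
		close(fd);
		*built = 1;
	}
	else if (errno != EEXIST)
		return NULL;

	/* The file exists now; map it read-only and drop the descriptor. */
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return NULL;
	if (fstat(fd, &st) != 0 || st.st_size == 0)
	{
		close(fd);
		return NULL;
	}
	addr = mmap(NULL, (size_t) st.st_size, PROT_READ, MAP_SHARED, fd, 0);
	close(fd);
	if (addr == MAP_FAILED)
		return NULL;

	*size = (size_t) st.st_size;
	return addr;
}
```

The mapping stays valid after the descriptor is closed, which is why the file can be closed right after mmap(). The sketch deliberately ignores one case: a second process that finds the file already present while the first one is still writing it would map an incomplete image.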
Tests
-----
I performed the same tests as in:
/messages/by-id/20180124172039.GA11210@zakirov.localdomain
There are no big differences in the numbers, except that files are now
created in the pg_shdict directory:
czech_hunspell - 9.2 MB file
english_hunspell - 1.9 MB file
french_hunspell - 4.6 MB file
TODO
----
- Improve the documentation and comments.
- Eliminate shared dictionary files after DROP/ALTER calls. It is necessary
to come up with a more elaborate file name. For now it is just the OID of
the dictionary. It would be possible to add the database OID, xmin or xmax
to the file name.
- We can't remove the file right away after DROP/ALTER. Is it a good idea
to use autovacuum here?
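To make the file naming item more concrete, here is a small sketch of one possible scheme. It is only an illustration: the "<database OID>.<dictionary OID>.<xmin>" layout and the shdict_file_name() helper are assumptions, not something the patch implements. The idea is that an ALTER produces a new tuple version with a different xmin, so the altered dictionary would get a new file name and the old file could be treated as stale.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/*
 * Build a file name for a compiled dictionary.  The layout is hypothetical:
 * pg_shdict/<database OID>.<dictionary OID>.<xmin of the dictionary's tuple>.
 * Returns 0 on success, -1 if the name does not fit into dst.
 */
static int
shdict_file_name(char *dst, size_t dstlen,
				 uint32_t db_oid, uint32_t dict_oid, uint32_t xmin)
{
	int			n;

	assert(dst != NULL && dstlen > 0);
	n = snprintf(dst, dstlen, "pg_shdict/%" PRIu32 ".%" PRIu32 ".%" PRIu32,
				 db_oid, dict_oid, xmin);

	/* snprintf reports the length it wanted; reject truncated names. */
	return (n > 0 && (size_t) n < dstlen) ? 0 : -1;
}
```

With such a name, two databases that happen to use the same dictionary OID would not collide. The scheme does not by itself answer when stale files get removed, which remains the open question above.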
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Attachments:
0001-Fix-ispell-memory-handling-v19.patch (text/x-patch)
From 02ce308b0fb55dc60fe589bd01d2883bb6258354 Mon Sep 17 00:00:00 2001
From: Arthur Zakirov <z-arthur@yandex.ru>
Date: Thu, 17 Jan 2019 14:27:32 +0300
Subject: [PATCH 1/4] Fix-ispell-memory-handling
Reviewed-by: Tomas Vondra, Ildus Kurbangaliev
---
src/backend/tsearch/spell.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index eb39466b22..eb8416ce7f 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -78,6 +78,8 @@
#define tmpalloc(sz) MemoryContextAlloc(Conf->buildCxt, (sz))
#define tmpalloc0(sz) MemoryContextAllocZero(Conf->buildCxt, (sz))
+#define tmpstrdup(str) MemoryContextStrdup(Conf->buildCxt, (str))
+
/*
* Prepare for constructing an ISpell dictionary.
*
@@ -498,7 +500,7 @@ NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
strcpy(Conf->Spell[Conf->nspell]->word, word);
Conf->Spell[Conf->nspell]->p.flag = (*flag != '\0')
- ? cpstrdup(Conf, flag) : VoidString;
+ ? tmpstrdup(flag) : VoidString;
Conf->nspell++;
}
@@ -1040,7 +1042,7 @@ setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
entry->flag.i = i;
}
else
- entry->flag.s = cpstrdup(Conf, s);
+ entry->flag.s = tmpstrdup(s);
entry->flagMode = Conf->flagMode;
entry->value = val;
@@ -1541,6 +1543,9 @@ nextline:
return;
isnewformat:
+ pfree(recoded);
+ pfree(pstr);
+
if (oldformat)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
--
2.21.0
0002-Change-tmplinit-argument-v19.patch (text/x-patch)
From 8b95f8078cc3af8d0de920e82bedf69f749c4960 Mon Sep 17 00:00:00 2001
From: Arthur Zakirov <z-arthur@yandex.ru>
Date: Thu, 17 Jan 2019 15:05:44 +0300
Subject: [PATCH 2/4] Change-tmplinit-argument
Reviewed-by: Tomas Vondra, Ildus Kurbangaliev
---
contrib/dict_int/dict_int.c | 4 +-
contrib/dict_xsyn/dict_xsyn.c | 4 +-
contrib/unaccent/unaccent.c | 4 +-
src/backend/commands/tsearchcmds.c | 10 ++++-
src/backend/snowball/dict_snowball.c | 4 +-
src/backend/tsearch/dict_ispell.c | 4 +-
src/backend/tsearch/dict_simple.c | 4 +-
src/backend/tsearch/dict_synonym.c | 4 +-
src/backend/tsearch/dict_thesaurus.c | 4 +-
src/backend/utils/cache/ts_cache.c | 13 +++++-
src/include/tsearch/ts_cache.h | 4 ++
src/include/tsearch/ts_public.h | 67 ++++++++++++++++++++++++++--
12 files changed, 105 insertions(+), 21 deletions(-)
diff --git a/contrib/dict_int/dict_int.c b/contrib/dict_int/dict_int.c
index 628b9769c3..ddde55eee4 100644
--- a/contrib/dict_int/dict_int.c
+++ b/contrib/dict_int/dict_int.c
@@ -30,7 +30,7 @@ PG_FUNCTION_INFO_V1(dintdict_lexize);
Datum
dintdict_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictInt *d;
ListCell *l;
@@ -38,7 +38,7 @@ dintdict_init(PG_FUNCTION_ARGS)
d->maxlen = 6;
d->rejectlong = false;
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/contrib/dict_xsyn/dict_xsyn.c b/contrib/dict_xsyn/dict_xsyn.c
index 509e14aee0..15b1a0033a 100644
--- a/contrib/dict_xsyn/dict_xsyn.c
+++ b/contrib/dict_xsyn/dict_xsyn.c
@@ -140,7 +140,7 @@ read_dictionary(DictSyn *d, const char *filename)
Datum
dxsyn_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSyn *d;
ListCell *l;
char *filename = NULL;
@@ -153,7 +153,7 @@ dxsyn_init(PG_FUNCTION_ARGS)
d->matchsynonyms = false;
d->keepsynonyms = true;
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/contrib/unaccent/unaccent.c b/contrib/unaccent/unaccent.c
index fc5176e338..f3663cefd0 100644
--- a/contrib/unaccent/unaccent.c
+++ b/contrib/unaccent/unaccent.c
@@ -270,12 +270,12 @@ PG_FUNCTION_INFO_V1(unaccent_init);
Datum
unaccent_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
TrieChar *rootTrie = NULL;
bool fileloaded = false;
ListCell *l;
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/commands/tsearchcmds.c b/src/backend/commands/tsearchcmds.c
index 8e5eec22b5..30c5eb72a2 100644
--- a/src/backend/commands/tsearchcmds.c
+++ b/src/backend/commands/tsearchcmds.c
@@ -389,17 +389,25 @@ verify_dictoptions(Oid tmplId, List *dictoptions)
}
else
{
+ DictInitData init_data;
+
/*
* Copy the options just in case init method thinks it can scribble on
* them ...
*/
dictoptions = copyObject(dictoptions);
+ init_data.dict_options = dictoptions;
+ init_data.dict.id = InvalidOid;
+ init_data.dict.xmin = InvalidTransactionId;
+ init_data.dict.xmax = InvalidTransactionId;
+ ItemPointerSetInvalid(&init_data.dict.tid);
+
/*
* Call the init method and see if it complains. We don't worry about
* it leaking memory, since our command will soon be over anyway.
*/
- (void) OidFunctionCall1(initmethod, PointerGetDatum(dictoptions));
+ (void) OidFunctionCall1(initmethod, PointerGetDatum(&init_data));
}
ReleaseSysCache(tup);
diff --git a/src/backend/snowball/dict_snowball.c b/src/backend/snowball/dict_snowball.c
index 5166738310..f30f29865c 100644
--- a/src/backend/snowball/dict_snowball.c
+++ b/src/backend/snowball/dict_snowball.c
@@ -201,14 +201,14 @@ locate_stem_module(DictSnowball *d, const char *lang)
Datum
dsnowball_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSnowball *d;
bool stoploaded = false;
ListCell *l;
d = (DictSnowball *) palloc0(sizeof(DictSnowball));
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index 8b05a477f1..fc9a96abca 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -29,7 +29,7 @@ typedef struct
Datum
dispell_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictISpell *d;
bool affloaded = false,
dictloaded = false,
@@ -40,7 +40,7 @@ dispell_init(PG_FUNCTION_ARGS)
NIStartBuild(&(d->obj));
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_simple.c b/src/backend/tsearch/dict_simple.c
index 2f62ef00c8..c92744641b 100644
--- a/src/backend/tsearch/dict_simple.c
+++ b/src/backend/tsearch/dict_simple.c
@@ -29,7 +29,7 @@ typedef struct
Datum
dsimple_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSimple *d = (DictSimple *) palloc0(sizeof(DictSimple));
bool stoploaded = false,
acceptloaded = false;
@@ -37,7 +37,7 @@ dsimple_init(PG_FUNCTION_ARGS)
d->accept = true; /* default */
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_synonym.c b/src/backend/tsearch/dict_synonym.c
index b6226df940..d3f5f0da3f 100644
--- a/src/backend/tsearch/dict_synonym.c
+++ b/src/backend/tsearch/dict_synonym.c
@@ -91,7 +91,7 @@ compareSyn(const void *a, const void *b)
Datum
dsynonym_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSyn *d;
ListCell *l;
char *filename = NULL;
@@ -104,7 +104,7 @@ dsynonym_init(PG_FUNCTION_ARGS)
char *line = NULL;
uint16 flags = 0;
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/tsearch/dict_thesaurus.c b/src/backend/tsearch/dict_thesaurus.c
index 75f8deef6a..8962e252e0 100644
--- a/src/backend/tsearch/dict_thesaurus.c
+++ b/src/backend/tsearch/dict_thesaurus.c
@@ -604,7 +604,7 @@ compileTheSubstitute(DictThesaurus *d)
Datum
thesaurus_init(PG_FUNCTION_ARGS)
{
- List *dictoptions = (List *) PG_GETARG_POINTER(0);
+ DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictThesaurus *d;
char *subdictname = NULL;
bool fileloaded = false;
@@ -612,7 +612,7 @@ thesaurus_init(PG_FUNCTION_ARGS)
d = (DictThesaurus *) palloc0(sizeof(DictThesaurus));
- foreach(l, dictoptions)
+ foreach(l, init_data->dict_options)
{
DefElem *defel = (DefElem *) lfirst(l);
diff --git a/src/backend/utils/cache/ts_cache.c b/src/backend/utils/cache/ts_cache.c
index 0545efc75b..8bc8d82c76 100644
--- a/src/backend/utils/cache/ts_cache.c
+++ b/src/backend/utils/cache/ts_cache.c
@@ -39,6 +39,7 @@
#include "catalog/pg_ts_template.h"
#include "commands/defrem.h"
#include "tsearch/ts_cache.h"
+#include "tsearch/ts_public.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
#include "utils/fmgroids.h"
@@ -311,11 +312,15 @@ lookup_ts_dictionary_cache(Oid dictId)
MemSet(entry, 0, sizeof(TSDictionaryCacheEntry));
entry->dictId = dictId;
entry->dictCtx = saveCtx;
+ entry->dict_xmin = HeapTupleHeaderGetRawXmin(tpdict->t_data);
+ entry->dict_xmax = HeapTupleHeaderGetRawXmax(tpdict->t_data);
+ entry->dict_tid = tpdict->t_self;
entry->lexizeOid = template->tmpllexize;
if (OidIsValid(template->tmplinit))
{
+ DictInitData init_data;
List *dictoptions;
Datum opt;
bool isnull;
@@ -335,9 +340,15 @@ lookup_ts_dictionary_cache(Oid dictId)
else
dictoptions = deserialize_deflist(opt);
+ init_data.dict_options = dictoptions;
+ init_data.dict.id = dictId;
+ init_data.dict.xmin = entry->dict_xmin;
+ init_data.dict.xmax = entry->dict_xmax;
+ init_data.dict.tid = entry->dict_tid;
+
entry->dictData =
DatumGetPointer(OidFunctionCall1(template->tmplinit,
- PointerGetDatum(dictoptions)));
+ PointerGetDatum(&init_data)));
MemoryContextSwitchTo(oldcontext);
}
diff --git a/src/include/tsearch/ts_cache.h b/src/include/tsearch/ts_cache.h
index 77e325d101..2298e0a275 100644
--- a/src/include/tsearch/ts_cache.h
+++ b/src/include/tsearch/ts_cache.h
@@ -54,6 +54,10 @@ typedef struct TSDictionaryCacheEntry
Oid dictId;
bool isvalid;
+ TransactionId dict_xmin; /* XMIN of the dictionary's tuple */
+ TransactionId dict_xmax; /* XMAX of the dictionary's tuple */
+ ItemPointerData dict_tid; /* TID of the dictionary's tuple */
+
/* most frequent fmgr call */
Oid lexizeOid;
FmgrInfo lexize;
diff --git a/src/include/tsearch/ts_public.h b/src/include/tsearch/ts_public.h
index b325fa122c..db028ed6ad 100644
--- a/src/include/tsearch/ts_public.h
+++ b/src/include/tsearch/ts_public.h
@@ -13,6 +13,8 @@
#ifndef _PG_TS_PUBLIC_H_
#define _PG_TS_PUBLIC_H_
+#include "nodes/pg_list.h"
+#include "storage/itemptr.h"
#include "tsearch/ts_type.h"
/*
@@ -81,10 +83,69 @@ extern void readstoplist(const char *fname, StopList *s,
extern bool searchstoplist(StopList *s, char *key);
/*
- * Interface with dictionaries
+ * API for text search dictionaries.
+ *
+ * API functions to manage a text search dictionary are defined by a text search
+ * template. Currently an existing template cannot be altered in order to use
+ * different functions. API consists of the following functions:
+ *
+ * init function
+ * -------------
+ * - optional function which initializes internal structures of the dictionary
+ * - accepts DictInitData structure as an argument and must return a custom
+ * palloc'd structure which stores content of the processed dictionary and
+ * is used by lexize function
+ *
+ * lexize function
+ * ---------------
+ * - normalizes a single word (token) using specific dictionary
+ * - returns a palloc'd array of TSLexeme, with a terminating NULL entry
+ * - accepts the following arguments:
+ *
+ * - dictData - pointer to a structure returned by init function or NULL if
+ * init function wasn't defined by the template
+ * - token - string to normalize (not null-terminated)
+ * - length - length of the token
+ * - dictState - pointer to a DictSubState structure storing the current
+ * state of processing a set of tokens, which allows normalizing phrases
+ */
+
+/*
+ * A preprocessed dictionary can be stored in shared memory using DSM - this is
+ * decided in the init function. A DSM segment is released after altering or
+ * dropping the dictionary. The segment may still leak, when a backend uses the
+ * dictionary right before dropping - in that case the backend will hold the DSM
+ * until it disconnects or calls lookup_ts_dictionary_cache().
+ *
+ * DictEntryData represents DSM segment with a preprocessed dictionary. We need
+ * to ensure the content of the DSM segment is still valid, which is what xmin,
+ * xmax and tid are for.
+ */
+typedef struct
+{
+ Oid id; /* OID of the dictionary */
+ TransactionId xmin; /* XMIN of the dictionary's tuple */
+ TransactionId xmax; /* XMAX of the dictionary's tuple */
+ ItemPointerData tid; /* TID of the dictionary's tuple */
+} DictEntryData;
+
+/*
+ * API structure for a dictionary initialization. It is passed as an argument
+ * to a template's init function.
*/
+typedef struct
+{
+ /* List of options for a template's init method */
+ List *dict_options;
+
+ /* Data used to allocate, search and release the DSM segment */
+ DictEntryData dict;
+} DictInitData;
-/* return struct for any lexize function */
+/*
+ * Return struct for any lexize function. They are combined into an array, the
+ * last entry is the terminating entry.
+ */
typedef struct
{
/*----------
@@ -108,7 +169,7 @@ typedef struct
uint16 flags; /* See flag bits below */
- char *lexeme; /* C string */
+ char *lexeme; /* C string (NULL for terminating entry) */
} TSLexeme;
/* Flag bits that can appear in TSLexeme.flags */
--
2.21.0
0003-Retrieve-shared-location-for-dict-v19.patch (text/x-patch)
From 48520622bf7f357cd5e61e732e400166a3d6d0b0 Mon Sep 17 00:00:00 2001
From: Arthur Zakirov <z-arthur@yandex.ru>
Date: Thu, 17 Jan 2019 15:38:09 +0300
Subject: [PATCH 3/4] Retrieve-shared-location-for-dict
Reviewed-by: Tomas Vondra, Ildus Kurbangaliev
---
src/backend/postmaster/pgstat.c | 3 +
src/backend/tsearch/Makefile | 2 +-
src/backend/tsearch/ts_shared.c | 159 ++++++++++++++++++
src/bin/initdb/initdb.c | 1 +
src/bin/pg_rewind/filemap.c | 6 +
.../pg_verify_checksums/pg_verify_checksums | Bin 0 -> 337616 bytes
src/include/pgstat.h | 1 +
src/include/tsearch/ts_shared.h | 27 +++
8 files changed, 198 insertions(+), 1 deletion(-)
create mode 100644 src/backend/tsearch/ts_shared.c
create mode 100755 src/bin/pg_verify_checksums/pg_verify_checksums
create mode 100644 src/include/tsearch/ts_shared.h
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 0355fa65fb..fdcf15575a 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -3977,6 +3977,9 @@ pgstat_get_wait_io(WaitEventIO w)
case WAIT_EVENT_TIMELINE_HISTORY_WRITE:
event_name = "TimelineHistoryWrite";
break;
+ case WAIT_EVENT_TS_SHARED_DICT_WRITE:
+ event_name = "TSSharedDictWrite";
+ break;
case WAIT_EVENT_TWOPHASE_FILE_READ:
event_name = "TwophaseFileRead";
break;
diff --git a/src/backend/tsearch/Makefile b/src/backend/tsearch/Makefile
index 62d8bb3254..0b25c20fb0 100644
--- a/src/backend/tsearch/Makefile
+++ b/src/backend/tsearch/Makefile
@@ -26,7 +26,7 @@ DICTFILES_PATH=$(addprefix dicts/,$(DICTFILES))
OBJS = ts_locale.o ts_parse.o wparser.o wparser_def.o dict.o \
dict_simple.o dict_synonym.o dict_thesaurus.o \
dict_ispell.o regis.o spell.o \
- to_tsany.o ts_selfuncs.o ts_typanalyze.o ts_utils.o
+ to_tsany.o ts_selfuncs.o ts_shared.o ts_typanalyze.o ts_utils.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/tsearch/ts_shared.c b/src/backend/tsearch/ts_shared.c
new file mode 100644
index 0000000000..c1249f9a75
--- /dev/null
+++ b/src/backend/tsearch/ts_shared.c
@@ -0,0 +1,159 @@
+/*-------------------------------------------------------------------------
+ *
+ * ts_shared.c
+ * Text search shared dictionary management
+ *
+ * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/tsearch/ts_shared.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include <unistd.h>
+#ifndef WIN32
+#include <sys/mman.h>
+#endif
+#include <sys/stat.h>
+
+#include "pgstat.h"
+#include "storage/fd.h"
+#include "tsearch/ts_shared.h"
+
+
+char *
+ts_dict_shared_init(DictInitData *init_data, ts_dict_build_callback allocate_cb)
+{
+ char *name;
+ int flags;
+ int fd;
+ void *dict;
+ Size dict_size;
+
+ /*
+ * Build the dictionary in backend's memory if dictid is invalid (it may
+ * happen if the dictionary's init method was called within
+ * verify_dictoptions()).
+ */
+ if (!OidIsValid(init_data->dict.id))
+ {
+ dict = allocate_cb(init_data->dict_options, &dict_size);
+
+ return dict;
+ }
+
+ name = psprintf(PG_SHDICT_DIR "/%u", init_data->dict.id);
+
+ /* Try to create a new file */
+ flags = O_RDWR | O_CREAT | O_EXCL | PG_BINARY;
+ if ((fd = OpenTransientFile(name, flags)) == -1)
+ {
+ if (errno != EEXIST)
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not open shared dictionary file \"%s\": %m",
+ name)));
+ /* The file was created before */
+ return name;
+ }
+
+ /* Build the dictionary */
+ dict = allocate_cb(init_data->dict_options, &dict_size);
+
+ /* And write it to the shared file */
+ pgstat_report_wait_start(WAIT_EVENT_TS_SHARED_DICT_WRITE);
+ if (write(fd, dict, dict_size) != dict_size)
+ {
+ pgstat_report_wait_end();
+ /* if write didn't set errno, assume problem is no disk space */
+ if (errno == 0)
+ errno = ENOSPC;
+ CloseTransientFile(fd);
+
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not write to shared dictionary file \"%s\": %m",
+ name)));
+ }
+ pgstat_report_wait_end();
+
+ pfree(dict);
+
+ if (CloseTransientFile(fd))
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not close shared dictionary file \"%s\": %m",
+ name)));
+
+ return name;
+}
+
+void *
+ts_dict_shared_attach(const char *dict_name, Size *dict_size)
+{
+ int flags;
+ int fd;
+ void *address;
+ struct stat st;
+
+ /* Open an existing file for attach */
+ flags = O_RDONLY | PG_BINARY;
+ if ((fd = OpenTransientFile(dict_name, flags)) == -1)
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not open shared dictionary file \"%s\": %m",
+ dict_name)));
+
+ if (fstat(fd, &st) != 0)
+ {
+ int save_errno;
+
+ save_errno = errno;
+ CloseTransientFile(fd);
+ errno = save_errno;
+
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not stat shared dictionary file \"%s\": %m",
+ dict_name)));
+ }
+ *dict_size = st.st_size;
+
+ /* Map the shared file. We need only read access */
+ address = mmap(NULL, *dict_size, PROT_READ, MAP_SHARED, fd, 0);
+ if (address == MAP_FAILED)
+ {
+ int save_errno;
+
+ save_errno = errno;
+ CloseTransientFile(fd);
+ errno = save_errno;
+
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not map shared dictionary file \"%s\": %m",
+ dict_name)));
+ return NULL;
+ }
+
+ if (CloseTransientFile(fd))
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not close shared dictionary file \"%s\": %m",
+ dict_name)));
+
+ return address;
+}
+
+void
+ts_dict_shared_detach(const char *dict_name, void *dict_address, Size dict_size)
+{
+ if (munmap(dict_address, dict_size) != 0)
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not unmap shared memory segment \"%s\": %m",
+ dict_name)));
+}
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 09b59c8324..29c7afbc27 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -207,6 +207,7 @@ static const char *const subdirs[] = {
"pg_dynshmem",
"pg_notify",
"pg_serial",
+ "pg_shdict",
"pg_snapshots",
"pg_subtrans",
"pg_twophase",
diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c
index 63d0baee74..ff76086328 100644
--- a/src/bin/pg_rewind/filemap.c
+++ b/src/bin/pg_rewind/filemap.c
@@ -58,6 +58,12 @@ static const char *excludeDirContents[] =
*/
"pg_replslot",
+ /*
+ * Skip compiled dictionary files. A dictionary will be compiled on first
+ * demand.
+ */
+ "pg_shdict",
+
/* Contents removed on startup, see dsm_cleanup_for_mmap(). */
"pg_dynshmem", /* defined as PG_DYNSHMEM_DIR */
diff --git a/src/bin/pg_verify_checksums/pg_verify_checksums b/src/bin/pg_verify_checksums/pg_verify_checksums
new file mode 100755
index 0000000000000000000000000000000000000000..06a474673896c8f16f239cad62250aea269ad7a0
GIT binary patch
literal 337616
zgND?vf8((Mld0tK#7u+<MngvkYGU7NlW-}k<dv+K$88<d7x=Ix-Kc}|{Pvd}D7MHz
z?flp$m#yp&6RB`wQOJkkISs>p3_}Kn0VBnRVO2#o3>2ea7@n6N2BMi;G7N<@45c45
zk1mmNK>_iBZz(+tQ`l(y$DaN0VQ~1n!;spA*qh9y^v^oJ{;eU{s|K#rA;llF4FMb{
zDay!hWU7sOcVS_Hjcr9zb~waaoAoUp9i^wYcq}8^56Z<>=BUi<UILyOWe^EGGh~zu
zCn*u?<UD{i<ZukS5(xZUB-!lo9UD$|XAmRGWIzv@L}q~;>){c4AR<DtE%i{uB_q0h
zSTf3$a6gH0DLt%nA-Uiw{x2I=uYWzPUt#xdbAt{gv6TE5JZ-`hDI_Borbltf8E#oc
zj3di3{->X9{`a0vn&d4>X0vGKeCn~$S3&a9F6J%!UXdYDgwnP`nH(_f5#hR4;Zq+z
z9V(&oBiGjahWZ#R4%jezb3AoJ8$Ohm&Q=MmM&n>ZXi96F#pJ4F>s^r@{5b)PK1w9C
z<`rtI5FpF7H9O2woAA5@Wwp^kT;diKtR+h}rp-i`F!y3uk{X7j%-ploJ3<=|qhS}J
z;0hrUK3Nt}&KiQ_HcUApK#RDecD7KWBfxY=pi*$!97HhR@_>bEmULikjG35jObAYc
zd(fFmMwH;BA=<J!b3W8mh*6rCKtP)HK%d6M)6q>W-X$ilTF**Q21`lKIE8^8Oi5~}
z@3g_{JfjQns06)X*jN?{$e^VdFEJBZ9rG|~l;w@p@^QQ%;A42{;F{N{jf8~m8*+&l
zp8g}@H_sp9u=(IYj4JjPJ^E78zZVkA)^w&Hh7YYZLQ$$9f^!9pZT^;m>f)fpnp>UL
z(v8Fje<qeQ8kVDMV&6u?T9&78qu~pdr*ETSCBCx_Dfs3sDQ7e!<J-i(jfN!*BguwA
zIiq1KzOxK^g2EG0nrv8zZ$a0HWAI2Ll?DT9$FYmi@ICS-_N}+H))$RYEb0lVp<i=7
z1Myay<?ekTd_{XYJFDLdd8ocPFPrs%rb7OD-ba>Sp-*En>FVm!nE1MC*{qjz3Bi=g
zq|C|8v$xB!%b*(z#NIAPZ|Pz*P@Q!&d`@B-iHyf5Sxk1^$T6`>sal9xHiIG~vo>MP
z*c{N-;0my_9zc!FV*bOXM00RIq!fCo?SM@JihLCHX*7zmS$(tFppt|ogPHFrqS7Gi
z)970^%b`S{FV+x<_uWp?$6j#!7*y&ccT(Np77J0r+Q=sepI+Q1x6`37y=|i~(65$c
zv37QeY0NgXGq^_8D@{o!h=kE_n8@-Wqs~zAZ+3>G{)^5qwj~2eb}H%GWX7CKTBcQG
zGYh8ywvvm_W~R=HJxdqcq6meMv2ca75<5^!G#DC(60Rg;Vj8!SP<<LrCz-a1iOm{V
zx1bBSh?A(XE(A|~8W*Zuq{%=M35e}iiWFQK1<oe)f65ZMQ)3e6Kxd-FSU3b<`5(6Y
z5`OtX|5H5Rr$r@l7okWmbjMY142!#oIY*P3xzW^(06_}sKD`rOymuyYCT7QDO>7#D
zr4>((c$%I<S$kqr_Wwhhfxi{3?jM`^Hy3vn6mTW}XKdy?MKokH_ShKx+ctxrEZIuN
zDSBG=y3lq&{wHUTjR}?pHX%~U^_lka(1W;RPE5cJW@{UqJpJ2!Yb5qf;yQDwYk~h$
zj-i|guAK`axb<u|=n_jucIN)^NG$ii$9+s({HM+{pI|c6ieQqc1Co-+p8U5Ypq3x#
zajpWi#FgTIwU@SVoP-n}!K9Ofkyu;)?SXq=9Jpr+2Oa=CQ`9!(fJOdPap+tt_NSYJ
zGw@pWTk*GR#Zm$IsEX!ch|PHfKCI$@6@r7IOyng_(+s$F8Wy;A77v?OqF_p8#=4gr
zJQ^|KN_lq-Rb@Uy9Wa&f?gqjm6WbRZ<fmoyBPy40Jr17L0wU-JO91uWg9#wn>JXeq
zO=i3+MCWtNWay3yZJ8`l_1Lo)2IM;+dY?|*y-<ji{0ypVLr=0SvAYqourm>X3~l^M
z|K339+r>}Oz;+Z-?m=IfjbJvrybIxA>%#9?v6m3D*$rX_=f9hjNGNzW2LZBhy>BLp
zbY>+K!}aWBPNtipUci3kiOyzbj27BJc|O3!;PQmEWXGWF$E5Nb0gu(NTYwx%ya>oE
z`ZC1W%lY_~0(O?buL}$Gx!S#;0&7QhP{(cv^Sx^);IRF^bnkj+7tI3!&Q9X+ww;`s
z>KPCba~B9q8Zfb(ksW4cvr|Y8rBg|M6*|k%fpA4We0M}F#of^e<neY6zQx@U{ls9Y
z-O)nllHE}}@@#ixN@Yx>I~^h->~yFue{qRO);RzqrZHKx!W#mO5U6nR&?9O<vRg+X
zqfcYSc~yC)`8=V;13CFDP9#J)b;YW~tOJ)Lnw0_3+w?-q9cU@mX7dK{E&#cLU2n!+
z;mHzyXIxMl1=b`A*QG?t?U;)CG`f1u!g(4GqGzS6nfz>-6%D0}^lp<rtrxXYB&`O%
zK8=l}wnnzW9Z)D<mfu{YD*)lAW%<}*5(1Z*2+(RP_PYgMPm+RqGhDIjdtf;>gjAv;
z{1BShd1{N!Q^GrMrGILUKO)wlu^_QF&_>0^85%FeS)%d261MNq_}ffhqo159P0#-p
zQ@PmJEND;^NnnbBw+-{#Oc&Xhn$T2a>MznL(ec|ottc`b`8HE@pnRtJ^?5fNp&idD
zkV_E_Td>$aX|3np0aKMc@KGG{!u{e0cQ8`SyfT1eVnv%d=!@ivMe6`BHG}C_BE<6z
z2jtld=Da_Z&?xyq4}N{h+`}FsIDlQrK^2vRgDPGju9K`J98^(WIH=<Iwu35G#6cBe
z*-F-z=!Gc%|3Tg?#HwVaA13<W<Ng;j5gJN5brOF_<t@ESV-&7Rc1B29Q%Q`%ieijH
z1)~t7@Fs6*Ux-mE<<I_KQviObjM*_4pf7X8ijSQUrw9+0OaHpz&aZ7DgmC2vziw#s
z%QJdjbH_!H5qP17T}a0357+qh9vnZj>paiV68D1g>Dez}*_Q4cVDh_*o3?l`$$pE@
z9)T7Py6LIwJe;rk;kR;%5F9BnN|r1-<P`6}qQ;Y5+~e!|07R++I{1P56A%(lz4Zk#
zIFZLbUzbuQxIq()Iz4uES=kO0#LGF^+ep~$`h@MWS=AtL3Jb01HG>qB=XPqr*=FOw
zD+TqXew%IaR-29%6zqcJbly(JP>2IjO3e}YcIQ8p>274dq?v5c0tQM-gdF3LEi0&5
z{;DleQf<vvC4nY1kwS?gowd|k#_5Zcibc7YOr>0o6|F><bDb#Akt&{Nu7tusV+lF1
zd@lmv$_xSUEz9pO!C-)n1i;aT4dZ>aSw2gUs7xL1mo1TLUEs2)*Yvxs^m>!oFUe$P
z-KTK*2G}E1pmTPH^dc8sTbqx^^1O@!PR^-HrmL?{Xo%%qpWq<VRTa9OuZFJH9wgb+
zFLXFjQ*A}h^r{p)43RqM!U0#Ral&g^r{T7f-RtivjnC*Otd;5Zn^X=RR4I%SZk4N#
z8db|_lvT^kHSA5PG*Z}=;G3Ce5X3gBl1;iK?00FH(5+<QXHBy$@u<}&ESJs0jqp^e
z1r-=?7l>l13QK_<saGE4lHo-rGP@eIo8hP_c%{sE&Z4bMh{*z&8*@uoK9(GGp_HMu
z*laU2Oh(*YFzPHuKfEd6iFW}kxbe##AVJ%7L@V6MW3vh`61)~?RSd>GTScorO(4`1
zaGrM{2M6FDJ<X^!=$$yD78Jx_ZKdrMJyA6l@u#@P)a_I<)1zFN_ePeO*CI(ae2;I)
z-w@`#64^55oo)CLK$0PwJ)EVSK8;#p{$(3j2B7^P!dw?HIOP;xtcL%z4sG2?7^G%O
zLln-LbDF66bKq>5SAZ92UqS^Id6{)DnwMZc$p}$&&YaO#q!FTca*Eo|?jq)NZ9s|0
z9;9gkcj1Wr8hNpVP3%H&<pXRY$>~V|s#B>u*dyZGn1w-o0%gM7n}`g4Q_QPb0Te<M
zV>3l9O6y*<raIqD2qFIydButhe(+rCCN7>mw-q|5KH?FDfAle2J4<csm4zRQ<$Wb#
zt%vbNlSdnkZ7(G1z<w==KLyWXzlGzNWw4^cIm_@6Nqg+}X}FRN`S{K<6yTeb#rkhb
zrJowYE|`9EM`X^NAzLpjX6VDn^qfCbAZ{Z^JU9y*IjE`>AzCUpA$J=geHsm$Op*LG
zbZknc0*+Gf&1NW>oXJ2Y$7X0Ib~8ryxZkvw1_ASQB_-xmx&nYFV}zJ(IRi92nL`1$
zR2ze?(3NxzjUN!?8_<aVgc2N49wF-B^Vp4pbd`XVxBe|8Mbcub6s{kbQmLhI?Ssw%
zxGancUqK8oM?sKz*y1t--aFBAoWo3eCp|pWB3(G;uoDJGnO{NRlskZO{uGg<nNS5X
zyuP}GGY%Qv!PyU-*=Cw(o8G2yip8w_Y6b5Za}-+WHopeE6`Se8)96x;D{P(3aoJW#
z^09a#P3Hy1)JgcJdEtZfEmy%-1yVBqt!VyebxjK#%)Mf&a{09>X{>)NKo(E1f+U<e
zerwxAzqd{9m7g#0pN?U*?8a3Jt!1DMm$;UaEY(H~4~Zk2kBuN@U@WqHJanYIavr=B
z%K%7Igak$kvVsR{M1Ed$KXU<P?xzuml%Pr74{s>AA8(ucQDNw)N_0QzN^}us`erS2
zKW(zvve`)@1-Y2Dz!F`IAQ_`Vb<0^DcQGrGFW1GipdP3Xl=zn_+`mv~$iL)!XQ{S2
zW~sh*&Qfh_pQZXHG)uMJrVJAsOSPF@Slh*~g766*gpn~Ch>AXqzGbs2LECouF2cHx
zpLsX?C5rI?tw27|8iSb$pTy#p>;@@lV6)JRA5<GluH;a_Mnj%la`9^IJMO=IX|u+X
zn3mpxq8D@&hO?YHz1oB4*r13|g`_zOxb+aP%*7QC2i)7#(kllj(B^<rgdptV9pc_f
zSdWD@czzdN>)Md6dO}FNyXT`sJ>6kf<>Ep|eh9+=m#?p5x9NrTx;Nha^I{}y7vdQM
zy7bP~u;(i>bz+Mb1?UA7P7c>UgrdrVj1znDNTyh-f(}dW?uf6;ndpKF?~5ylp+H^X
zu-->w@k@rRz$B-)atm=Eg-)}OBe`3#ddU6s8WRQ$F=Ti*lI{W-tC{>>Cp*}f;vGE-
zS)l4XT$P8aX{7>Yo-GLZ5Hea>2u*kEB|1id=@@Bg9b2S2rY1HW=P?~ym{Rgc)`^fc
zk0y}{vgQwlZhGTpj+cr^kjCh(8R9B4c@}UDJFH=y0!&mX4H#%p(Me@NFQ8bOX}pj8
zrE&DHkY0yiH)0F3Wi(6K6v#(pi`Qsrg8^^{0^=j&Z?D<HIVht8lZj}|sw~86uuYk^
z!i&iysqJ+)AqFIcQV>Y8nGko(n~dKSm1)P4c!(HSlt<e#bp0=E^f(&7>6>SC5biqK
z8Y@|@3U3BQpjAH{pb<2i#6@g@UZ-ZzqabLF@$9X7ozRkJI}pq-R}44pgr=|rqXQF2
zRZ5@;n?R78t`MUKHqq!1n0ILwZ6jF~nI5DTBU62WgJ8-`UGy6Mi(FbDVYbGj#<P4Y
zIR{)yw*VeL%{RNOia-dFZNcIuvzaPs7W+r2m0pgdk0-ZMQU{M2a0QDEr}#Q3+>q^-
z@jzWIey=zfM6v#GRGkxE*~M*KHc;%ngZJL2>TG5PmZ_HnYb$%-So&@>0)FLdXv{
z0AWU~mRKKj5T{tQiPmY<mt3@2-2@QfqRmp|`L56`-`yBKZ4S_}yZeUn(LBGI@<BYm
ziSlE4ehcM?^ZZuI59Ij`l<&^-YboEB=T}kweV$)Id3yW>UMZBX!1E@`v)`~Y%a>3-
zpSgXxnesPzp208jd@{kad42)q5Apn5%74rAGbq1-=ciNtGoFv7you){C_i74kK_3e
zf`{^a0Oi@QB%9^qDeukUev}`~^FEaSfaix%z5~yzDBqmtT`6Ce=lfH>GS3g7{GVha
zW_fSQKjC>7%HQGnE|kB_^BpLEhUeQ-{s_;vq&)rV80a*m{1%>XNcq(~U!U?zdA>H~
z^*paYK7TeEvdna3Y7rz1nHrSwN2V%e#voITGM>m(ri?o>6)Dpfnd+3m%^{BLgiIxZ
zD3Pf^nfH)k$n+XEgOq)vPF$1l3OC$Y9$sF@Ou<X~>|I7OdmWj*2ezNtmfu&DnEj8K
zsTr85`!Va0k`HHZR|&W3E&1)pXOLs}11c>{>}n;}D!-y<YDKT7I0X0+uOH;W=>d;^
z!5S+=+-adykxH?{sl-Sa{VqW=yln-)@J!OWVm^GWDv#oZA2PrSzfqN}x~TZU?;;Mz
zfLJK9VJ?hbONc#ZQvE2pfA-Vf5bq?2<Uqv2{C>z4@|$(+#{xjMyXqSG-3;m6hH^2t
zO@7xhA6CJ?$by47edI~{Ap-pF0KE-?U*D)efyZt}4t=+Vll9RZQZrt=H{$25WT0!L
zi>p!?ew!LjEYs3386BsLL$c}?%T!5L-9{!^bq8mA4B;-z0O20X0O39|$*Knk(DK+x
zs{z(V=2_4P;?}bV{1)JH_E^$fL##bZwH57*=euSC<2J)!Zo+nFa67}WXZ3ejJtAz0
zI+jT4=}l{XBfP{FMo8-Kv3hx__9G<q_gOu*Hxl|CCG`(jJsq5s(61rXpA-h73JpYM
z#iAvAAuRfDjBIAZ@<&#e2BX5iW0XSwUXj~a594LGF&@T0-2B{(f4SjbXmZp}k;(kj
z=1!E39^rJ4E-{5#&Mw0mllfN#ZL_XdL)gEgG{X35gz*n|;}f^yda#^hcIw{O@h&|6
z{j`ab-6pwBcKguHU;1w(5kJX4pTzjnk0BY~9D!BgpH~|lh)sy?pI3Uj{(aQrkLZw0
z#OWA={PXpX2LAgRu#>?*{1<=Lp3Nst89!?OHlJ_o+ehASI(ta*tPMp6R$rJquSLiH
zxAfjaFBQcc`0`GJomqokHR$&F<i1ts^eeuyvVBU!7gxV%Qhj**(tC${#y8yEch2M+
z5Bfitbt-+?o{Y^u&Z;`B%ca5F42K-M&RqX+Yed$=%6A96o-{mu$&s(pm8a`p{Oa_x
zF_%}T;-?Jw|8cQFox`JoB6Pu;&T-o4*nm(?=b$i6(6l&Rq<?s1OoVe7b)!YQcI|vn
zr`7}nYhs<0?b-yzpx9|ZOjvNNlM~Wle_c$BCRW+Mzp{&y%rQ<GuZay03D?9ag9Eey
zN|9!qGCWEdp!A83(}u=sMvol9ssjV!G)f2&r`5zdSF0xTiv#gK)nrN~2u<|%@%8Zb
zn(XZCteoiX=IiF}F=}!(SvBt%ZFqE49MEJk<-|_z9hIFr#e@>K{*whXC1)NS9;*pL
z=VlQq`%@}A#S=b+42%YE+h4R!8>7>b@C+WQi6c?s#D-`Ws~n{oF~nnp%FEkbWoyU!
zr;Lt@n5B%>L<Epz${FF>FlAhfCW!P9Oz@EK2u)OUutp?@4t5$N$Pq_PY=0$kFnD-W
zNOWu@Bim7_4bwy^H8aDtq@Yd%sePCxBF5I1GI3$iGk_Q#r=+S<ZAbytJVKOF(Pgxw
zY$vGANr`qg!C7OaCMrHWHaaR&69w%A#D)h1Mrf2iL#ahIVC&*E!8CHh-zZH?bgWhx
zs0)o#YNM6?VnXAlM|7g1YeaaQHqJSQ8vvlr(XpWeoU6%VLj8lHqqMQn5tJ1*<s2Ch
zL*?)&m_}4Ugfks4%Y26Vk5P>p4XY6NImY!-21V;4g3&9jGCBqdA_~fuj&Uvfz=%Y_
zO5?@Cnw5bO(LvLcjyf_-NgGz@7_5vAQ91^T&0*KzBTNs$!h`r=hXjCMa48}kZM4xS
zAR<Tyy8w!<z2i722uflCP~QW>qvDi~<G@YI$2OHpQ?CZ3&W>?h0j2xJ`foFOi4vUg
zOC-=HwC=Bsj6u`V?c%fnTB#07>n%JgJ|H4ISQ)1YWmXWSiwwkshmJ(eMFm81m9x=g
z4q%qQ*xk5JUAxJ+rOG-DL=N?2+p<phkwJ`bS#)K)5+~uL?Ak3@6R*LZxTL?LJd$}O
z8Z94_j><@Q6lI`B2~pvVf=VpGH+B}OMVL7)m6%=1$mm!Nx?-DXuu1nmv3=rUZDi$)
zG3o2=d1IR@rR+o-9j%NEh?)h%j6@pTDIz>dqYQ`*g)mXtIAyz#@K}h4$wOA^#M%=5
zl8K-kJZdl|TZ9gBk>d=CjwMGX+6VWSwob-{!I*-hXGGDA<Jz>@nm7T7ei?rpfL_z2
z(L~XdW`iks;%c(ci0Ht82pPF&8X|3AL|jaeY_K>1;(`LAsCO|D8W=XzCH3?q0*x<}
zjw;89VA&AN5cm}|iqZu4A>Uq2Hi(R00%S1EN}JUPc;dhdDd|XDrD!37bkr>wc#_|i
z37&*t{yJ@VM4WSwEGRa}KYE&f49y$fvCbu`7U-6q52gJGGZJzO;ZdQYC6&f!K11R|
zO3$AX|5O%F>T71}$PmW5jq>vF8VX~Gj*AN?R~tA>t5HVBV)X@kWn_39R)e6hYQAA*
ztH=|R1<-1)p|wJ4P)xYU@X#<V8*@g3;Y3Gis<D=Vv-q+R6B`{G8xZNNRC>Um!r&N!
zXpts*v6?tt1PnM(6M;o$26MZZWH$TksD!mwBg~Q+j9Pe?2ytbUwuy{^2c<5u^^&bw
z%tUPt3==U_qZL;N>H@}B^!4Gvm`vdzm}NRl4lGW#W*DPLndl?}!=nQDoNp=far0G6
zC@}3y>!JS=`goohjDUrvC}~wQQxl}q(pek(E5Qp23y2EUlyE0xVfBfW(qr7>W<>@@
zM}!9{;S#1v(BRME(HSpz#aLMkw?Hf{x>&|j^gS~I;xG(aT`Xo2OhYNS7)Gj^EClL`
zQ__k=&0>NgD8%6l6jKU#i0dyoWuKv=$B!QD?KK2}RjERMae0x&$3?}!@U$T%Wa9XE
zad!!8V|GcRSCb)lp>;D-1DDEUE>S+_XqXDg*+9+AGcsEYiPlBY!6esfSUA>l*kBL>
zFc#~epUgfe?m>5hHE=7?YbiX^3c`d)j5RH^&`Vr+D1ru{(u4)&()dJ6q^pa9YmE+#
z3ZD%KErdEO#Ws1$_K18Kd}gpN2BQdOSYaX}Fj-lcqHHHdD@sjlY;-JKduLi|Awg%X
z!1!Y8F#VhuMd7IFqb$RqOi^Nw7+RaD&xnqlM&3Yb^6;rVF6bN<9jWOY5DO#eOp`ed
z-VO_Yx6UwiS-F*QVOS5!uhwXT*!Yy&Uf4_UQIwTW$hPOu#KLEnj~5ZnHs<B>WAb*!
zA`$>|`Y-r#yx`~{jKsgj!*VSvZ4k_kSX1^TP6sS2uzbEmx9G2I*DgL95mI|6Y=LBh
z@zW(==M)#;!e`D2o>JZ)zZUYVQ2q&NJCq524|zM}|J8?Z|L$ibeqxPq#q{sCdlz0Z
zegf%T@OQxu|5#k?dy?}d-t-|I5KP2xhWtdKFGTxW{LY$Kw-P_FM0qJsF|R_q{PV@d
zH}PcRXMFl2PbqzlA}^LH{Z6P`iuCV%L^^*b4H9^Y>ECHn9!~a!l0KA&Bf@9lhZYH+
z_=+FutCYqks2BA^@W*H)wiokKd@;WtxW5Cp7@s+y{}P`W_(juK_&CPMx^(T<y@yNB
z4|??u2n-6=gh&8xg9f{+hS21Q$Sp3u@$29Jw0j5syI@4K_d9S2K1>3ih&<M1gaQMU
zU1Y)0F|j4{76BjTYe;N#q|LRLRN{RCu>cLw-~7_$C-bo$$0AyR&yAaf!-`1m;Ax>y
zbsOX{RHkwpJXkdviM#tK4==ybDxUT7_H`TM<}m`LAzt32{d|19NBN>K)Xl>S$%s*C
zFkYq_<c8l5qi@gAAT`KuG+=|(9wXe5czb!d4MHoGP#C2e0`j)PXcg$WgUpEGNWDE}
zD%CiT(SQwc^BX}4lTPhD+Sktu+y<*<Di8NjK=bhO8Rb26lxj4j_VDr@Mc-rGAgPBp
zJ`@%}LXXiD8mLBid;7>#o^<$uWQ2zoctO<B9urXZa`ROAdV7!X_JZtQ9LLLhklzpx
zrlXQEl&HKQ3zBiF!IXNcAORHO!%B!ksN~CeBEy&>ReAdOLKVo3_8ttLl<|WkNEv(_
z$?A<0fgvNj$I4VbD(aYzhYzGe=X|{-Dj((MH5A}c-XzxOQKSc74^L>!4}$rQ^BqJw
zA4k&4itZH`Z^5TAK9}&Bcel8B1wN_B$K&5?X@m4M(#QDh!KWL_3y}^24z<0Fd{5+`
zAYTigKzw$hycnN9@X5kw31Ii}@j`w8QfhYwp9?7OLfRhbQKZd~KE#LWhv5^2&q1L~
zun<X}%Ks-nzo4IA{u?}J3vF&lWIrIl^g_BBv_HmY6+Y5Fy+XMT+SvndJ<>4(KH*%)
zr;<Q(0@8Hg#v&ExvY1xTPkZ(vrthuizxmW{{@JSRCiR}WSRSlvlOKHg(z$WJF8=ZA
zo|(Pc-JaI>8{_9W=LUOT9MSGb+aoi#x9(T>!|3b}9U|OaZm-;@dAhnstqL(d$1lfS
z{MK=ivbbl`hULfCezj=Th9(7JW1k$J;~R0=c%(|x$%jqx->PS}IXwJNPp=yvrzx-3
z{N;kvn~yTvCp_w&*{JP<qpL>L?-|~tlFP#x&ujk@yseI(V%+`B{=@G%_Bc?>?M-CO
zqp4rL?0RC!t35mC?e@6SWzXs4qeFV<o>v|GuGP*PvtI0&Tt_~B=b#1?Iyv2)SG&)_
zgk>Yu#sIsCi*HZ;I^d(V34!u6OIQ4{`ioRuVbtNKbKH)6>G0j~NY}OM#~ns(SbXcF
zwN)2ueP3M(9Xo4xYMg5QnV76oy^~x%Z@=h=XSop<dew*?Ir`1&^%WkiY!TM{;uXh{
zIpe)HwBFWiS6e6DpdUSttT=b}QgGk?Ni%w-KQS~}`1)#dyU(2RpJn&4Oy7R8Lj8~?
zvK2KSKi5qD>&5+^ioj2^0s<U%e42Xauf!EoR>US$oV0n~q7Qr*YeR<`U)0#U<4p7X
zooa2P35Qx&8{f;jXwM`=$K5v{-%xc-|7pm#KM$<wyK;K1%fIPgwce2Oa?8oUI>+pP
zsO=i=e($^9WAAqvwdrULw?8wNwy)|kcXs2R#kDQhS6}USqv5Wz?ep`lJB67#&+Plh
z!a4ib#4H{DQ_zx=n?DKN9-#{ETzOy0wXqvN_3t-kUr^^N-{-l%a;;m>Iyhzi`&*yy
zJ@?Iw8~y#0K6ai|`QEknUT<AiZ~n0<c}qTxyQ3&tP-R*-zv`CpTZ-SW`u*#bfkSdP
zPH%X%ZR3ZXFAaOyacqm~+1>MR9^aMNsmtJ;0o^kW{kVDW@b#N|Dt$(th#5R;$osnw
zv^=jlxbEO*UZWl+d7fVoAO2`ryQxtxuY9@Y>r-o*9qk)GyUG*Y%b=zUs&%im&^Ye(
zso45^PCuRJI>6p}bpHjab-&zOs{1+Q?xGf{-!E_3;ATjr_CNNLf4$}O(+MwrefH}Y
z9U7&LDQ-NVf6d1~MK*inRjJ^38}qU3P4~NYAM-=YE<az-xw`wkin-n9{?ReB<Gs#<
zhQ3#Qp07uhzXl$9T`PKPvAN=ede*Zazc+Q$+&iED9-Y_t%QfHJx_EJG;QkhqKCyK3
zucOcSKHj6=zTaEkA6zN+cUMEB%E1S!jSuY;@Xe>Ey6#UIHeuKt=aesJ?)Woe$%o&}
zUwS>m+<0r3JB>O_to<mry8YvcQwrq+x3#WxIP;fh-k+R!`s1q)elE0z{}Q)wb<4yH
z$t@}zn0-1lZs)1H4JY+~PwhQmd8;&CbkuM01#-KE&Hd^u_+mib)S~A-!smRIwI=!8
z&tF#G*!JKD`(_>Z{+h{i=#rIQ#~)1Hw6;_F=2k;SjLK=?KXP=!4)OsrR<5u6xQnXo
z=UX52xE8S3>zkMDCU+ZoE?jjga`ygX_12u5_`}Czej9nP#kAkP4qpG|#kzg^Wh`EH
z_4!}bW<6N&^u)kF!b3YoXC^jmIKE}|$tm8`+gEuM_}$uSV~+RQs_NbC{Ccl#al0?u
zFE^}R)pNzY(Gv@noq1VYJ*e54xqlwcwZCJUubbl(QE>6>xYffWhTm9d*Y)eh8(MbR
z<KAFs`}^{P%CBz!w7>sTUxzCvS|!i?CB&`%_S(PhJCZ!z$5d>6wo+fm%k}x<g3*yJ
ztA{o|__fwQ?s%<s`5{er4_@Gub+uKWzUL=gAK_7S_UBv8U#*{&W__?L=hKvSv58d^
zQ@@|r@PjQCv)(I?`*3f;!#_slu8Ar<le=Zj?r#SUT;KDv+!bz$*@naN3V~<VteZZ{
zWr)gg$K5egf{h*Dd);B+$G$&SfB9onL&FCbGM;@Kb8%Yb&mJ`S%H{d^G2=dVKKp9T
zzL2J0XS_Lbu>bR{$-^Roy)GWt&)ztxMfk1M5p^;BWr|Hzd(7<rP1{P2UnEXX8}O;6
z{;ylUGSB(`ihF~zJFb0yy@z(Rb#TGcSLyeYo1U82zLw7~KlN$r73~zb$bNMF=4)M-
z>>H~6v-7>sMVH1ODID?V_VBw+RyLcPZko9F;?s_urr+LcSn<ozraev!e9yC`dg+s{
z=i641Z$7fE!JYW*h~<wp^J@-SIDGaE<+zuxj{W&@V1VVl(;r=%GOG1C$JHKtM^(t*
zeCC@P%MN$nxxlUW<&d81bh>Y=E_2KMq|2x^<0iLzm{Hs}F07%VV&TCiGpkoRuK0B0
z<rRGg-?_EnR^i5}OBP>z=KT4v38!}SKRPWv<7A!j`+Rl{b1oV?Z~T<sBb$7^^1W$Y
z`&4}P`SmUayZNpDY@|+Fvf$xo^_^;N2;BYcuC_Cep7zO`^6M$}qDSdDj&BBEnbfGq
zpf}ZaJZv=JO6;LdUmf~otfiIKeY)w>m`ZiK2F<pZ57mt6ymfi*$uI7>PF=d$So!z!
zwKv^Y9~g9DV2ixWqa8kPv-i8cE*;;$W;ankuTHaJb;H6h?4DCNdyMy@+s`Nc)$dZp
z#T(<_t6SrfR>8}=cWE*C!(n4iydHlp^8LvDzaN;VD$e-4TWH|oZ_XLguZ?ZH<MS_v
zU4C52a(u>4MWL+cjyrRIs@=D`-*@en{pH)_p5_n5+U=`#mwWUM@%rM%sNNOlx*e}2
z-~8Q*EqmJ!a9uEY^X+8~qnBEuUv&EO5C7BW7G#HAE57MJCwb^!VYQz$o)j`;zjv$j
zpIv<0%xLq|$0>CVHt1$Iw8`-MU6QQ7f8DWG=8f;SUs#=xxwYV4&>@{?(o*}M+k7$4
zd6j$BA9{^VX?8Q{!AA)(>%ASS{d{>(qY?A{I`w_<VnEiTkAK~*>v+1|^)qk$RpE~s
zFRA!hWWCKv&kwup(zm{|f#(0bWp0~lj*9k<%W!_!c;k&PgL{6}_q{)MpAQc%)SehH
zve%*Q4r@+E44Co5oB0=Ru3EZzU6r#HE}YD>ZlADj#NveYG0&ge`FT<2p9a26Te2$U
z*SU`$5AQ#Gjj8UsJB>zF^d9`1y6gEpj|SELqH;g|_4R%aI{nf?7ZEpZ@1W1yw5`yu
z-k<KbriRPi8-3kjXK3H4)ob~$njiDi)hiWh{@x|wtM60KpQ!)qw_|<hUi)nJiuLlo
z-OmJ^8M^nk#kaG%Z>+L^-Mt-Oesuqnyn0)De0s@v<I#|lbCO#2QoPzVYnw4<tm}*0
z4em|czwc|u_J7nE^Dy^*yP2O4c54#;!N8B(EgRugX=b-;L$CjusL9?^r*+_^3)Pah
zxeo7ito5PsX+OMreK!75gI#x1ziX1-qd2gmZbI#XiMyibHLhQ&@xy?b$=@8Wt~NO9
zt{gXw{wsT5r|6IBeE99wB4vN;WRLNUoIW@+Zi##K_>exIe{gKYu@y@#BQLjK{mILW
zdlmL;mV~NnKl{Dv_X}LweptLE<j<exv}p64b&5X2$tyhlkmAS7H;eaw?U~;6iaF)Q
z#^&coRZ+z>i@)6Uko>*oBj*@Cd|v0)&cukgCyvd(%RCp*B5~@_O8POqmmmFYm~TO&
z9$7mMPhLOyWUDSsB7F+GEpiTwYq_x3g{artK0UK^YFdk@MX!@qeo>?Fel1VG)U!cH
z@+Y-yJa|T1Q~!G*&R&k{<6ZVu4H$p%W&O5p`*-;+STUh}-OI%vdBlHLboGM<ruQ4Y
z{@plk{1=b5cDg;^n*7O;?7%0TKFU5-&G6^GmB&9kKJMZ72Ty-f`{K@tnjQTbXWV$*
zqy5?_-O?I8E**>AoEl;ma%jo?=dYX^q*R+6W<R{>{iB*X8lTqt#x4n(>eQ^|xd}IF
zIJO?)ndsRr?RocuuSdGhzHn%O<<n-%_H-Lnpp1L*Ro>*Lw_=;@Pi|1~;~z;o!VT_v
z{hz-aZFBu;X5$S(iAQoa?DX5&^+{p2U-P5`*&ntzT{-5=_o-c@8lFpO88Z3&qWWu!
zBd00SPufrOIz0EKqLHF^i{GYyGiA)G7lTIJ)l{&4wy9GE>jx2=6Gwk>;Z5fC8D_iR
z*8KMAo?iZ^>S|j|9Cd!w>bZS8R_*_#D)Og?eT$cPs|O-%*>OkausE+@RKs<jH_crd
z7c}hbFV&m}tk)i%U%B$!k&k9Qc2NKP{qBn2v?*Mdd;iAp&DyUXpL2*_wR@rSz@vj?
zKOfF<i}|=lr5%M|4y~g9F#PskKeijXd~0s^hkZU@@%gKLb038L@$2wzV^dw$Hs~^B
zO0O%up3Ry+a)q&a=ZYtHPC9<yz2%rKw<=EGczyPrdo!!HI<suX?!_N8`+4@@DMS0-
zue+#zjmY}GUuSy%e(%Wj?_Gavx@y~=rB@YK^E=GhG;8XOUk<qU*xEt<3C;ig+f>&V
zv<f}a+t@Pjx2`Sc3|}3(Ikt}K`04$!M)!l<|M>8fU58NDRvtGWhAgi?D#W~_sFlz1
z>2}>ZUUh%ce|NCe@P3OPhnH0tQ?b_&)7iu!%i`Nt@b=!&qSK42!G%{ZD4Pe()Eu4g
zbm-dK6YUe)-LVf&@fdpG>kQ4++EbO4O!XWucTb;Et=b2J!uG7GymzK!<?R_u2Tgxj
zX-ZwU?T%+7X9TqP`FiWu1-_NO^nEjU@RvgdPiRw6X;E`c>w8<y1@!u1liPq=ii&qW
z_^8c-r5%GNsg?(+{u<C`<?($LU(Idm*48IWe$o=*TL1jLz!~A*ZMSWB(e}IdCkIBg
zyyRMA)jaw62mYJC@&El!e8}9)dN+PaI#n+!DQElf%$lj5ccMEz*u2NC-|fi<R%}U3
zY;L&SYj4KDXEih04BKM(GtPO-=Vy{`yI*;HbWouCqOT_YwCS%qp?ONDn|ZxI(QUfB
z_4T5U9L5}-*w6gY_epJ!g{SSjRb5_j<1gn9Z#*}n!L2%8Umc72!{wtL5ySO!FWt&o
z_bBY{F~6hhnte5UUFEhpcds-XpS6GUZ+gw*cG-)&EnGOZdiAv~->ti6nbqjt)6DPI
zb}Z_%(91O?`_qxH9;8}*)~j=VO>7?)cjCqRGbbK54oK|Y%Cdgs>Ol{d?E8E}kfQtj
z^qvQgU;Ef$ft~x-1r1-@A3t%zbbm`iiw&~}o?1I<QP91m=?}9XRdQX}<;S9hPTEn~
z&pof+Tk-q!wZnhic6VcdY~8GJM*aJFt+KM(ug<!;tgGJsoBivi)U9(j&3@+bLkBM`
z2+>d6P_ggR`@cT*xS!#1XG7DNsSD<x%s9UO?)!<JlXjm-(e!%o?E13xo!@J}zUQ}z
z5As)3Jz;d-km&pUb%UYCjEr$EZ*KQ7`)+wKv|!7d;vu(NWu41#o2M~+F)BSX?qR*8
zofp?^{&Cs|cWSR&dZ$Vw$IU-0w<qmeSt)a>G2&K4*Q|}Zt9XBOao4?LwPsH~_TJGK
zAD#Yl-o~wiFWu5bD>kjVyeV(!<Bmri$9%M?=!*f1Zna*1v|rPvc^_`tw@IJ7qSDJL
z{RX!lzc*mWt>&#)ci7T;;iokN{I?|y?qA2bQr>{y2Pr=CuYBRiFOIt^&z%za)A1=<
zU6kY3BOX*f^W1Y#gL`cQ9VVOG{uEqM{_VA$uCWaoxK7)7SpLIjE^RM${wlC)1@|`h
z4?Pa*yD-#ke4n2xCKc)`u2J7~bDcgWsO72GZHh-PY~`HaF66fh>)bz?THS8M+VAZA
z8a8rIZjlo5@bf;cKG=IJG`HsPmhv7)WuZ&LR0oITs5ZQrEt{A9TgxW%$A@M{bsw7j
z%jfoupExLIjy<T^w8ma@*m0{eph*k+3L8yBZzTIv=-TDgkb<gl!5{DbwZ-T;0WF3d
zI}`lbt3gA4A7rV}j^_XN=-R!;-E0~2d{=k%u5?vilk>j0mVnR{_ge8=KdByiqJMDw
zU#Bjq_G%9Knp^B`>1e*vz3Hr}GQXeYwIeSEv|KvBR`-3TbXjrR^R=^sZiYsm+Z8{y
zuT0havuNM8gQlv6J(l~%Hr^ZB=Z7ot_RlWKwtaZ0_Ofl&TVAOY-2L|0y<6Y^^U8r%
zim8jIZIG`X{h<1zKZgguK6d2NIaA$3H@?eCKX+s5`5X7Y40yD&TdmhC;$@2mT!~&?
z$@%71uZ6n~#3x-^yu96^)!VOC-+DJD_`tOuQ_pEv%5RK_*!!qr#+BFo>fd}c^1!aw
z;hwT{EpA2MXtXI{>+Y(x4jBJRUz|De{OXrGr;hx<Uhb;+V{hEh30DH=H@j5t@}5KT
zCqt_D-10EE?N<W>dOrKKR@>8|>Gc+#IWIrB^k!V0R=Wb*UzUvwoD=OjA)$KQ{`0|s
ziH?^>o-`hEE&OS3&v5})+D=KGTCZJix!j>ZRxiOdTCV#3X3v@pceQnyls<Az;d$3h
z6$9cPZLJmf^UIqrJNoUi)LSRJI;m##>0pn5!$01uwc~m1^e+#monM+1cIiv&@k2|`
zcd35(^DlyTtjnExwW?Nrx@+Rzm$N!vvADHQzq;Y0^QTwG2E4p@qn0IqmF)jx?>(Ta
zD!#tqb8bEN<ksAqUT#7Op@-h3Ludj@2~rFYAhgg)XrUt=0!T9=Dgrio?^2YefK&ki
zX`-NXPz3V*_MUTZ0+#3he9!m2@4MFPT`M!Y%-%C+X3ms7`%D(qF5hRyu-zB8l}TIm
zdBE27+q#CF{k+4K9*csT8fTOU2>5-%vXmP;UO#c}QO?$b-?TgVLE4=yYg-xAQPIo0
zWX0X;`Q!8&hDRILt=soJ^xTOrzrL{V(~hf8%zj(`@m!^}M^{E9k7{(xku{*;&JU_g
zI=WUrW8J79@}JB4?sU%&vW6Mfeq3GMC(rJ3qEfe$KlVGc<xzv^oj(>>bM#SUz2to@
zes-MbwL<=2uP$xv(?=IZt$Sm2*3^Jxb<i?LM3dKd#v5~vR;sr3LesZHR(JUPiro8J
z)3o8=FYMWSRe8hkiM!V|-IaE(!*4}S#y6X=rBZ5tRgJmOB_i~U9RFlsTBY5iE~t}>
zt&Uiey0drq-lN0wB_ucPk?rX4S_8Fd^=n-^m|Z7(H{HHvc>3?_)Qc_8Mf~`F&-j%#
zL#285cHVE(;ppX#pClhEY;x>!)si=C{xxmc=<o}34s2LG{%+@WbFz<|8^5GV&keWc
z8J3+mt{xjcvI}mmI(gq(Xv^iYu@{b=nznk^uS4YfOHZa<UX+zQr+hWXc>95!8|rmA
zx-4_m$qnwvEz64DQ0H{6(`9^HsptLspBXM6N?Lbp?1OW=wtl3xt!3%baOks>LrOQ_
zlBx!+i~eBqImf~#J*$5EgQ57(W#y`G%}*=7v)6^_$GcWLj*Qzmq+9&a)Hly3w+-p*
zXjpP^&mmu*HKaZ)v99g*G3OeN|MFzj;ytz$4_c%~w=dnr(dEbFXr+<Ek+5@T)eQI1
z;x$@dXnW(2)eVn2<RSAvOiNu%?|)d&gxh--Pd#_`(6+Ma6&|ihYqzicx_LvofBv+=
z1aGJ**)*X1jPsi&Z{2r%ON-Q!_rL8|XHT_HXN<J0x!UylD+$(Jh1ymv`E2H>h^v3B
zdeFf9d;ey=4!k>g*`SYSAFf+3HeOv^v3BLAmnTjb_Q#bCb6R<G0-Fvz^6c%M0}f0*
zQ9Ev0@PqMlM(pkJ>V}LCt-nuP{UlbdaHV0XGnGb9yA|{EcKs~-y=zmt>{gqv$b2Pn
z->b*2RWi4J`dWoeaSNtQt$uH7=e2Ra)%~Q&{!jW&7`UkWFLnCub=e+&KeSq?dG+XG
z(Pvlh`RtYF1=qLy=3t*i{htqhSj%<J5-@Lfrz#ntAHUUD-o5d|n{5sDdkJNKi2q~a
z$x<^nRnG2pciVXL7q7p0ctp2)%bHAlP+OPwOML0k6|PKQUt;2pH^vTpe14(#t1>lC
z3^TXc+33c}p2|0erHZk^hsFgt7O#8qLAz(ew^sOicE_jvmSxPVnRsx3eZ}$6S87_0
zj~V3Lv}WP<I)6kiDS7CA=GFACy3ME`-xc2qjf~rs^3gBDn{AuWc&BTqv6A8{ck0`-
zlP(`xz1cRT<)fR)uY76P@^EOvhl?6sw0F|C39*#f@b&H)V@`a$vv<;nr;Ta`A71j#
z_kCxM*-&F-$%dBS%g0s<zV-9?d=E#j|GIvcCq+Bh5AGeeZ}6RWSJd++yyjR?xKj1O
z_wS9Z_49(YpP$;OL>@U`@qSRqxUa0k*Tug->e-4-H@~jgDW>0`VU21oJh0L}{OI0S
zW~+`d%iJkz5_io1<C|so55<nCm>!h!di^Ii-ZGrOd#h}jqJ<`!OE=zh<GodP(<Yqz
zVsty_i1l@cHF@Lld%A7EJ}CWq+3M4~6>qU);;7}1?F;sN^+RoC*vU6KH>%wB{ih3R
z9XvdEzTCIg=pHpbFE-awwA`~wdvBK?f9FyA_1-$0pZwnb+k?Sxg$>TPsDHh$%S?AP
zDsrv*k~<~F&itwF+L2wd?z9>HWY>nJrG}5$`9|a3flG}y1HLY2IA)rZu)p`_i$f<p
zs?_=Nm!~!t9(s9kt%f$+4*e9*$ucS5M9*k;deF|D)oxh>3%phJ+4IJu4*XbXRb0b!
zzb}4$)q$pyh7B4v(^+>;z6WamD(_W(_lFh}KK`xxhI%dc<m_s^{7A{Uod-nBP-;KW
zpO?N3-<VKs*YK&9??Sd+A9(X&a_>?5W^8OUWb2BU1}!4@n${RsmT!MCWpei~9a|>g
z^_Tk3-*bEU>m$#tuGqBfswcnA`#5WV#(*~lHttcX&WB%pZ7X<TX|;of-lLyKnO3`2
zUVi>~!=!IQd)4Y+Y*67^d(NB;sQbl^DxX9QdTWs`dSk!o2OqSa_x|2ZO?qTZZ6POi
z{{5MJ?b6&*YUTFRUddm6`?0KV?^T$%dH0#u`e)v16!xnA`Ze?Toaq&M6s$X?L)8*n
zSJV5yznDICW*gJdq(R0_$M#+NDD0;zU)RjOK4{XaYf-O1EB@*`$4bQQSyFUSz{L1R
zV=8`qv!Umkv4I^ARXN*a|LaSiw9QUCmGq$ehwU3bpV6T}diAZ9cDgq;|FP@#pGr2|
zG5Vo-@VC{94%*VCLtowJ{Ri|YuzPO$tzX~vu6t+Zvf&$NAK5;(ZM8uwlUf!(+y0kp
zk2;L+Rd4J3B|SFXdFtJsShmcLm|rso4~TnhP()^-z8`J~7;vci*14x1&wG2v%F8oN
z-M*Qf7BNJ6|9DINfhjSjezmQ}u!Gw!8FVwR^m^mk^%JuWT-%m7uK4Xy%S*g<=KG@6
zop<9~FR1x-!{)ue$!7^YwD60v`|k}kZydU&=(@n4JFNaBU;j0`Ki%E>V&`95Y^?H5
zbI1LSuXQW@WMSDKPAzD+`NQn1{bsy3^|$Invl84xeoyT>yyoJDBMYB@cI&$5*zM_!
zm;79Q+{B+c*eY6U@AEi{zS~iq-Li@0j~UVz%j*~0z02F?tBliw_BuQ4i?z-DDPNaw
zvI~80T2)~ESL1HgcPzhMvi0{r&wKCgPdULgt(PzKc1)fSs+NAWtmTb4L%y)IYq`6l
zA?B;^4qNxGuDdPvvuQK)?LB#|(D0}O1w0)yH=g_WwRLOn7Fr!pDPT?KRa;wUX3T35
z9CNw(iW}d&_GRg@3*UaSa>26?&Sux`@#wwcSLzK-DB5F4(g5%9OIyo~ocO?=_ONQc
z-tVTgPA&M^u%1l{9-7|lmmjB9Tt92#s1ZA++*o$M_SQZZYaFleN!^|~6YA*)m;W$&
zNBH8w*ZX|*O^Mn|m%jF(T=#`%%FDYyFSA)4op5Yysgs8rMV(sWPCXuZuFlbcvL&?8
zfr1fDvzkReZZkaUy`$J*gkJjN=NfN6e{JICXO$aFE4P32`{fsZccM(F>$ily^Ak_@
zZkBS&sJM=A-&O7C$?(3R6TWR1@$2Nm(XDD3qh2|*{<r-41%EVZd+YiA*}I;-b2~k)
zMWMF6tBx#@I&Pz`=dD-POuzeR(X<6CZcdEt`qh*M+a}izAGowe|6<4M9=w`U?}Oe=
zKiv6I=f$U=1%K41T;-+a>6Y(i)GK)Q$LMAk-#j(^e1$g;S1K~^Qsr7l-md<{yt(SE
z?;kw)BIwNHkDGjc&ozGZ-67Ueo|L^&A=d2Fz;!L^xX#DlALG7sarCfFpN!kncfwnr
z{$9S*hEm}jFLdvd()In?-R#5cE8DNl_gT4{tv{ZYf7mD2J~`C%w@yDco>_TTgZO(p
z8rCQ?dCZzWmX1FA;qh^udgi>PUT)ecq)6wEBl`!Ze6^)=x0T81E1xcJ``Mg3B|a`z
zQTIum)oYrz8M3Hxso0whCjInP!yn%6`(26l?atOS7QXoWYvcL3b=Fs!GIc@aD<^JM
zZyUL*YMC#mJxI2{|5#~t;@+x7zui3(lIYoYF(u^lw_JfiO{=*^(EDF#MAZhbKFOHX
z{jECZdXI^UZI~0Anm?n<T7$98le2~@wVn+z2h4e1`{>%SBNuF(5Z8I;AK^uQAKu~a
z1-a9vb(X>7W)E!m^^^Pd;P0oGe)GF2JwJN>sP(OR{oEB+SxaW7*_wRWpl`n!F^|i&
zFi*Bb>1Q<ivB14vE1C?7AJjwfR9(<Tvh4`1TIck88SmY>T<6q-AESyLUl`kEOH`K~
z?<Ti-T)ShHrYVtTlkJ__-xs+)a!!dWacP!C;f0dJIy|kP(rJD1UV|s8u7<lh&$UO?
zxmo(P8o%~j-fed4AM74?<-*-d4tSi}<j~Qeeq%PzD%WPwZHsf^&&__`KCjn@C!Fya
z_j`Lvl|Cee2W_bO`tVX26Wcbf^TnGDqN-$+jqUGR+vVf(-?X`O<4BdHBWujArR8cL
z9Gy6F?D!^eLq13dPdK!(!_8X<JAE)=_uxhE?`(K_e?`0XcHPo-Z`SC!^KiA+k3QPs
zE<Wa~k}2)JY;u0s)_!jXRxS5xxtbPpsXEP`1y|~|Zsd;mK1q8#ai6Y`LZ%+C+TmJ@
zjPbh**E#%Uv8eJtG>;uQ@l=;3D?e@XOTSO5v}|?S99g|t?XN<MkDMMEA2+Z{%kXNi
zecR#spp~6gdY28JUGZ$gv(4Mv<-swf6Z;qJxu#`O>p!YqbjL(|P_p~J51Smje4*d`
zDQ(Kt{;YsST^`%)&Wv`wHeWv%-|35`o}wvp<+8PJRvS^I*N~+)SKZqdPh?Pu?lJA^
z|C;hv^6aE<cFd|gc)@K`^*gDxuAUCc{QS7HQ0oW1Bck8^`OSS>=5?NW^u6>|k1sd+
zeNh)jzRjU!dTx*Ek+m?n)tUQ02I@~NEYVPXr}4ew*ZXd*)3I!F=SZb+x3K2NYozpE
zSolien)XFO%Cq}&rL)tk^<8Tj^68v`buZ8C7a6+I8q@s3l$6Q8KT6tD=yBy?(qvQV
z?zUP#z1lZ(al=7{-sr7FxD7Mj+@Jqm=g|$Gq;F4qtI-3q<R}?4qD<EU89lP~##SeL
zj0v<i$th8P_VdO$>&Euo@V%jI?<Yf)0?&sxzwn*hd)}%Eg&WWNBgojQy<DPSOto&I
z%Z6;LezxwzDi<Riksrj=8&ojm-Pe+mPE2i6x!0!!Oa*s;SnK(<3z-{#I9I62mrElI
zD`VfhII&&lxtCj{f3mM|qbr}Sa8#UfyiC6tr+Q3T{%Nb-Ula=ryWYG+vx?0c-|!ai
z`~Kih$_{OQS}D`=+vW%Quk1Z4BEE3jsx5;8k5rWN->6%y$LQTdRxjOI_qPwWMC!(W
z71QWejg*;(swI6FRJC%)(lt%-ZNIE_fB4qSEg3rs)p&DHgr$6)H@|gN>io{IQt8Xu
zHE#54xkHX}fg8#U4_@12LaA?B9Z6~s=owkIMEm#4HNG|>vF~TMkCaW{zeaiG@W$qQ
zZyxMDZA_EGtv*T!^2!nVD(&==)C1$ds5kH#KHy#Lb8<2eZ^nHOn-E{eEih*gPwNQZ
z1H_#><>Z7KBxyb3iijI`$;nAVT&Fwa5Z6n~$(fIMS6|2@Ufd7zh*u1PJmL)*kjH18
z^@ir;R7IRPEGMTO;-`p*BhEh}CucF@jLe*zU5Fcx%*nZgc<m_oF5q)d%V=Eng18gn
z8i@Oj$;nAZJOlA4#48XlMtl_UF2vQxLLTvE#1cLSJ%>03@e{-~5QmL}JmM;dM<H&7
zcroI>h<71=8}TK?3F9G;J65_NjzN3@aUI0T6L2jK;x&jTBK`vLa>Rci-jBH4Tlg=K
zh%*o?xch4?;&{Z<5ho(<Ix#1w7vhY`@Pk1dG$kkJGsMjiA4a?f@h!yVr{?5X;S_%i
zaWTXnO~)<Qh~Ju-lhYURR>U(A-$uL&asI5FoTG?aBff)p3SuYjF58JX0r7pr%@7xw
z1$o465zj<C1Mw=vdk`N*+;cYM5l=wuv`SLtIgm&EHsWT84<qi2c>ddvM?3@{f3HI9
zod<cuPZ8fi{AfPpZIX0+0pt<qAZ~{E{6feh&cIy>GZBx*RfMY$?_CXf#A$0FkN5^+
zrycdO9`cBHA#R3P-2!>U^S41B@r><|M{L*udBism-$7hpC*&RY?iO(Z;x~3d9`RJf
zeG#ujJQH!%FCdS&E8?Sw^Y4W`;`WH0PT>0>k2v`&$Rj?Am~P{{gm@KV34WRTDaH>s
zNb^q~`kEwXIOPbdS(%A1afyzb9;6z0=SotFI%bwA0I)3nZnQ$*#JeWho%QUY4IKgL
z$|$LNc$Fe$3giQoW!oYC9rDHL6U&-*=LB7Cn@Jz9Ll(>*4zdaF@-oDJzU~8JGd~;T
zY2crFfq%@;-v)YZi~+B`z;EK`p9g&q`14-Kf0jT0r{Ld87Gs7#fBk*EuY8fHhrKEA
zUBbJa%YSC@@vDNr6l2QJ7xWME>u(3Xgew6rA@a-X&iTtX9Q<A2+cD<&`Fe|Ae>V6l
zFh(g<&;I(;bNOOG+6MmI0q7HykB^@?R<9o``}4R2o(|)hE^iv$Ge4g`f4vI+I;imL
z(5+;C3`jBHXAI8CDU?@UpHE}}ekA_tAkDD!oSc>~q-n-!h@S%fDvYTo0O#g$#9xMS
z;HM19$)Q^_{Cs^Uzx)#LKOG9+u@~f<`{nn7{{#4KUf{Rz^RI#*j<MVRf_{r%zXA2D
z;EpJ7UimjzoS&}%_~w7$*8$%Ie!jf&?Q`W*H2r<@@^wA@<r@cnYse3Mfj`L4UjqK>
zf6%`d{I&m3zN_G``3JrM{qLXj7XY91<F=XH_Gy*-e09KI0X}Y$$mO@p<)?tZ9Q=ZL
z`N_Hbao~RrKHW6pZ-3p~+~)^B9&;}KnL;kV=2(3nq=*4&FZk8KACi|}cdTAV2|%Mh
zcmru>A<dY)X>=nvK@4aL-th<E|J7JRBBY}j(&Wp;wcAASy>AV3^Jomd3;fD?`Jx$F
zz8CmK|A9Xh{JP-R&MTjptN%0bQ^Chnmz#gRT>fG3U;i6^3`A~${|fkL^2*nrsvm36
z9VRXGj83dW?jcPe-WmTh>TUipEdl(iqjGZaKcRE^y7^oPl5YlnN346Q=H;j7ma#AR
zBfw`@Z(kkidi&*Pf`0^j{#@42PxJFvfu9fSroZbeN5Ov={1Y#{>qq!qk3pI4fZrVJ
zrOSEw=qvg&oP_JZgSC_cYa^QH{B@A0FO>y9|3C0sfM4Ms_yfUj^bh<w;J^6~{B__D
z_y_(8@Zb6e{sZve`3JrSFU`t-;Fkq|&p+^6fPeZQ_yfVe3I6iD^{e~P-^O#mAAq&z
z-;Foxz)u7JujVZ3gD1e(Va?k11^r^cBK`yLtASr3FTa0oAN61@+XDQ<dHK4p{N*bP
zzBCEny}ZCb@8`Dw{|oS!zQBLq&mRbWeXOOwe}R9-&z}Rn0c+|fFYq7w`Rl;H0RG>}
zp8)?IT7&16@5bWXzaN0V4E(>7_h2oaGA$=3JFmQMo<IMx;9mmYf;FrE{m>iilz{5D
z1^5TBR<4bD&gH+N`1k|Czx82G&L=PE|H!X@4)}wY{QZ2m4*a9wcYk4wO7V|TC%_L|
zmXq^$^X&uhQ^9YDIU%=vb^Yru4+hv-pTeL01^tu!`peP)ydo#(@eAd@=P!Q?@K1w3
z=mmaXKYt+j#a8CzTz!Fm(a)a){vGhE=H=%ZL)U?CU6qrA|0I-K{$4E3^?w5VDDaJW
z$2h&@U++8sUx&R*>Adn{2o~cX_Cg-;^XKJv$kksK{1EV`<>l)p`pe$}{BhtH$;<DO
zD?bqY>ELgFp?){{>vs<LMc4iPJh~42y5N7ASHJEHfBq-HzX^Vo7xb5B{dFMo0DNU*
zPEHq+uV;78(ABkvPS@A9$4oQSwZ~62CfXAwm=f(3$C~TgYoysL*0d+ow8z)7$JDZi
z*0MWm*%jU&RmGk!0eUd4`hOlkEkKV4J;+bRyb8FNmOd{%Pm+JZe{1018u+&c{;h$3
zYvBLd8faEPdvq->Gh5)VXz{CB{H7NBDvFjm^m{pOFUV4q7FVV<0e)Blh+m9|Fg5dU
z(=-v|8b<!j{6)mP7Xe61TKciA1Mz%~PwQU#`S=f?i{&TV?m+jg5y-AiKWsz!$2L?z
zSd#Lut5!ZN(fEfu9QpV1ngJ#q@w+1;O-$Nv5IC(v=!dNy|3b7liPr4+`RK`7Mfl47
zn^u8ZF(qIAe`w2Th;q{B8t&BO@eMXFw|(Wfqv6xF2l-vYeb32-_&*@nueIOHMJXR6
zDD||sl@@o^;=x)xL5t^V@lq|`pv7Nm@hL67uEmeE*c2=AFIbD?w79$$*VE!wTHIBO
z2W#;JEuO2zOSO1|7JsS5r?mLG7C+KrQ$8(!EsoRT@>*O^i(6@NS1lf_#S^r6t`;xV
z;tg8-r52yk;_F)cNQ+Ip$&jRAEsoRT@>*O^i(6^2DD9ua{dfONBTBz{f-b+rrhNeY
zGBm!xa>K#7;6Fo+1-pLj+LdGC>%ZEva7@J#rAw5GDV<QNVnW3-rKG>)lqyxaf>aEm
zIu;jiUmI=Ao5lvlKW8h_AH=II(qAtl((w7ipOb6)^=N64KDoR|)LDCHUltMiFBbQu
z-&0GZC@c#3a&k?79JC7kx3o$P{>SvvE57u)`a)$H{hw4bL6ql>L|^)H4MjS({r!cA
zNWZwLFa1$1J^jP<zxs*v!;*aIUu!H<U9tUDYV6rT)=BqMed)`+A<|X&>*QK`U0+}N
zURwHHMgLkr=zn*jFMY8IBHg@yte+(_ed+g166q`dwR){QBj)?kKhx6lKKd_YM0pCm
z?@PaDvPi`H>c2?;C3>cqeiXU#F3hokMUd)EN&35vt!vsnMRIwAt09@*U=Hn!NHI>q
z1ptOa0|1{1X-yo(JgqloP(?2*1I7c_nT_d)WYe}0NO`$Bf`TE*lu~CL3kk!V9ta9W
zb_2;|Fyf9U$uMdLNQF5kL!&CPWS9{~x($t1Aeb3OB{2N<2ZAgCPwNPHmPj+<4WyqF
z&S}yPgZ_2|%XTaQ`rI(;^@i&%!1E#)E`JY%Md1wpJQwh1Q4AN%2Y7WP!x4><es?Uv
zW~CZBC?01I9*1E|Tq{{RV=Sc{#havV#M;AH#ykf}CG~gAvc~e33Xo9LA36Z3z=%_w
zgkEB-$VjLf@H>!7jKrvAMgghJNW6Lxm29lSNP>C+ec4!*k&3E)9*}B`)KEW722z8O
zMD=t_AT=3jroMX=NG(Q^)S)+k)Cv9x<xEz4HjpJ_J<n|*scOBSz)9p%_f-p0rVSVw
zs2({8q+!5qJWht%A4V!;lR#>!QR-<tmhn|b8jvySn@f<asqzro#;O%(0cq|q;u$BZ
z$4NmO8(K@6saDPg(l*2i&V2Pm6p(gd5kMBJrCUH-doIp$^?XMl9W1CYX_e|UAlaL0
zBOsg9dx_vAbJ=#O4>0%{J1MbHuvao%nT!r(4#Q(f0oyacDQdSS52O~OBSo`93>!5b
zmCH_jf~e=gvsOT{%nJR}B<nF0%B%zf<g}%HY2BdKp>`yS!`%d_CfO4Ko4MT`I}ZVz
z%oN8{w7z+YfJ<C~Ot_W$gRKIRkHDW(Qu-runA?IE5l$pW@hgC%Ii15qbT8BKhEWAu
zn$&`r#iE=<TD5H=C;?V8iX*Ac(Lk(0^p6M>di$(}WKr~6NXn1tMH%k_ap|3gw2#n}
zui_<CL*0WQZkUU=PYv@-BzO#7J2l+%4q(SK*c$_i+o|h1-m%G2KnaEei?x*{YZp0`
z`aob2w3D@~%rB}^8|@Srjis_RLk^@@B`}JRVKTpedZ)1i-b}sI+~NnkKL*3zdB7Ay
zwFR;iSRojC{*axl!ZZ?6MqS7h(zh3itY#f^+^G$rHOA^OIDP`pa}WBg!3Kqq%iWPM
z#Gumgr04s!Ktc@``X^VSGU&fY1yP&h+&GZ*Si=l@`nSD?+EtJwl7}D5M664k_LWK%
zYyi|uMV9QVjb_>msJD@qeT`ZJh)z=6h`4(+1Hg4`Lmd+}m?+l7CuoeefXo~QcOfF3
zW-5J|NMZ0$B6iWsk}{r&6b8>A;^)(lav>8b3|@l3QX2nK!s+-$xeG+CR#=uCx3#jZ
z!0_R?<Cteg+0;e{fZX*A2Vzi5;ze{ka92{OZzwBSBZa|_NaF=`*+|Dz<#jZZ8de<|
zpKBWPqdgosLSx7ZbU=qB+XJb#Lagu`aL96yOgcl-Fd#d0f)^?&GBlYv3{Il9|EL`j
zb!H-k!M%y-I00e?6DbTHL&PKJK%B-z3WMhnv0EG}=L05E82lL#+f)T{BNHhMrhf`&
z?TRughnYxW@F@nV8H0adFqyzx3=SgjF@uu{)S;@(%6kOb7^E5w4rP$4COAKXsRWi_
za0G#s7@S359R`;X*p$I-1ir!GF#@{)G>l$_x{3~@YP2*)D;XSKh1yKr5D!MIqc31k
z9|QtR4>TcJsMB%McnsvQSTw%lw94=`s`E1eKM=h?`l#c(fV(7WdxCeIsN5kPl8sal
zhjFuHqnxSP7}rY{8=BJTIIcv3rgo-m&sshtO=w7v9QiRoIIct!+p!3p*Kw6$TQX8K
z@aZw5?r4w*&yR5@RCWd{?~%Gj7$BWtHuNwczaxP&+(vVVqV`@7j+e&|r~2^`ATjP_
zZ~|3ReIT(a$-BktwbGYMj-;bos{%(xZUxEOx*Y1T!t1g$(_KQzvLK6qSO_K9mO+uE
zc9mtxU5+zX)aP}<sphx>hEu(R>BIes{WOqJ)r<MXUBia4LW)$op(O5_ww*v?)cdGO
zcWny|+6C13`#|cl>Uec~JdlR`C<$sb#zpt5F0_qQQMK#^r<s%f2}}*OUIdV~L7ziU
zqIx78NIOQ7)LqAbv}Yt$Ej1lTj}Ur-49P}A`1=64eRNX9_F)*xygCQfb;M3WVg=&(
z;5bIJ0%C9;-$s^#N{fkOKh`@z6$ET(54Z}$hB0`!;2>uwbO1w#Gk}ZpfNFTY6lX&r
z4jLv6t&vMed~k$H^$XZ>c^?Q3BFHz;tT+%lW|x+wkcJMT7|eK%&^T5TN&ldi{(EW^
z0Z}yXMA2+5Ir1$;qC`(0kj+X%gqEaWcpZf0b0!C%opxE#g$vkj;+ZA&4U{V^&h{9H
zqE?RsQiu_&8l=inSYd@koa#^$kRmG06dtwsQzR?uq?saAz3c!|OeqCrk!l*o`mo~m
z`QXH;#VwF25%>|10%|vO#;}qs6R%dpgN7xrOfmIK(pHLP64bruVqvA3Q&vsw3z;%j
zJ)WSV`fV0CWm!QDbpuvJVdWXAqt3q$PIX2Su|(9B!=PrMCl0Hjr~8Ma6VX;?Lp78z
ztfiftBE9*jUwVV5;?49f!Vs+Ut}Kpu(fb&Kq`~`bPpnqFr_p~*l7WQ7I_L*LEd2e)
zcrH0C#ktsqsyl*?9M(gn=9SdXsZxh<1u1F~%oAZl8L_H*2!)LdpnRR;kw)<&g{spg
zgEN{NB1Zka0A$A6x*=J-+6Oa1*m&k-sJpN>3R}v^DD`O)ILo++CaMFeww5zzrn(uu
zFzjPS=Br+G@vu+0@fNH1@s@_IQs})}qVB?0G;B3<maC&a0J4TTpQ+WNfvja@mD+a{
zkadi#Q+L-zrt2Bmqz<R{+Q7&*_1h^xHnO%|D(%6-_VHuyRYOMu`H~j~`_<H1Q1CT#
zj;b?Gf%6R`C)CcwIm*ZxRW1eO7$fJ^9a!>&9cSc{Y7GK%f|09g8@v!<rx>}RJ{kaR
zry04W=BNI5mWy*o9sDVfbBx?oj}8TLfs;K@%V5Y0yQZ8+d4G$+P1;h}4J*Awp;)C+
z8;AX*--dRu7~9EGOn7PKS8DF*pUP5rSqCjdCG{}|=I{!<AXe1=%fP9~shqL6K2-{@
z#GFvPepKm(QHdBzy!_T$Hlxi9k)h64>~PW<4<s@?fV!r^;ga%A#-1iJ!pOMJA?0h7
z5Bx}h<2{oMBpn&0mmq9tu?cVd>(27%fBBWDH!MRzRO@{XVmp3LNp1cnkoJriBts^K
zqo@x0R!C;~Vhf%&YPQn?t?CjydemGO$wMX^$b6NwD(ZR+s8I_TQPtw;1yKu~4Ux>M
zR$C0tJB--X=X-!;GvZV?ZvgTx%LJ<310b`A-%XGD76y^1_iU7BsAQuG2nARGZHA|z
z5=AZ6PXN{M4Hg?wAA}`<YgQ;tLA-W-ENuXVm46*CZ9f-L!)Pfg_g9H7*=TI)1Bl1*
z>Ku9{Luo9!<pKKgU?{}3HAAwQzGPlW83j1bj`u(w5=3;1J_g1D`Z*K~V9}*QsR<qS
z3=AZt!)P>g94!O781LgcVzHzPx6@c|(a*fcf>NAyFBXLn8g;&pdNBoQBQ@$`A9YkH
z(nf_9a*~GS8r7`O5EJ)tMew{l<5={od=e>zK#j3cSuFZZK5Et!XwT=N{%O(g(x|-o
z841-~9|qI`;5b7SD5xAxbc=rRFQU+eCLRR!x=fFIwS=ItS1&+0yQmA}QX34ry6Y;X
z&}B7_PeVQEt_RaZsmrPncda7O8zGd^qQ7)kXo%ZT6VzV=XtAQpN-o@4(%t5lLYLJp
zZp~H@?`UXJ+&xlomtR<2R%^-j8LFTPv~k&WQjs<I57oL!@~t95HGRfFKRMeQh{ElH
z>#A;2Bt5@AYkt@G3GJYKQYnGY>>8Ig9sDT@rPXI$>Q)%{OiQ|{0SSz+>RLFZi=><8
zD$nSqu0_`t725fq76L4c#jb9FMHBB+c)&nOx6rW*&#uop+O=2$lEj(!Bg^%wmL@$e
z7O#(PgN>eDpS8bh+z&W0(QQ-(zN>3oJFJ*=o172uocgTKx<UnSZ>UIO*Q7`Pr`4EQ
zX^OB&d^FO(F1L&qMVqG+v7CJs62+;2uZPg{8nQO+92w8rUoS-34O#1Ujx5}1P*+LU
z(nXsxL)N98Bje&uqCBl^1<>scS!a*}{MKT~$~p@EBfN~dB)iaXc<0DslwaH-dWx<A
zG(s4%4nhj`*o{|hhO84iv5hQIjh?%{OdV9*P9aIJiK3#h^0~FL)(@t#>avn3+Xg{I
z*JY*RS!u&wGNLBXf^u>6b^2k>;GdjbXiya1`mHkMKLHCC`tbr3v7EgJTH;4@icrz#
zlH!YDsVHwPLXV-#Y7zG_G|SsVC?)cXOTk0RJ9xa(Wp#>cf$>k?$z!H2D<$FL*OK%d
zQd;!M^EfMN3?4|xEI}OZov<w&09@kkBS_DZX)N*?^w_hX%eKhlE8DCR1naXlWyV#+
z7%k6HmoR=QGcFZ?{5I#W&pMo0xVt3J)v%+Pg?mWyJR@n>XPw9_MTOgo^rc@bD9YAK
zE8;K^;}Y@a$sZU*>NA;A(W_dj3R)_WVwslWd?r2GF!Ti;O4mV!Hi+8jQ7IqBU6MjP
zsT@{32I(;Rpk~P|)J2wl!Qaa!_kBqCDN4qcOwZq1GIjs4B~$-rOQ!zMmQ1~7$s`M;
zK92=bFD#H|1q*GzMUh}(Q8biNlJHDyQ8eVWC>qLJb|RIqC>km-A}orAii`-0qM;Ha
z!lG!X%!sfk8mcfNEQ*Gzj0lUOp&BE?qG+hWh_EOcYBC}$iiTQ@2#cbjPH;bzQ&<!Y
z^*rf7ghkPi*P>|1Yf&^b44^kfSQHI;EsBP`7DYo-WdyVdi=v^qV;T@)Q8cvSeVec-
z8rp`;2S->G4ei1{1R^YohW1<>VNo>XwI~|$S`-a=EsBOt%EwTkSrm=tuwy7GTNDj`
zi()S%Fe}8cQG-#rxJwc>1&C%*oK2vxC@uoXX*c`Ql0^|KWl13lTNF)`>|{eWbGx%e
z(KMMUY*92#5inU4&6-7VCz6vn6L0+!(n1zRb3{0i*rI68V^K7FnXXwBeU?Hheq$up
z-Y~QB8ju$(g-RYvp_0c^sN}H}T8i7LOR}ZVQi5SN4q_|L#zD3Vs(I~#>VIk%RACpK
zi!ue0T~LKx@H;}tE~vsTNE|ld1>`m1Syw7$F*LCW&$^obE2J>tS=XqQz<>$Qiiq2z
z0RVn@GT{-`O%!3mYXQitQ0VSJL^k1dXCj5}0YqdIUM3SMbWbEAoA72ckwW*o2!sjG
z_KWf}5H{i2Zfj*D6Q1pkgZ$}*3D0)dGYE(<;n^OzOPkSABxN~kq|kkvG_ncL_Ec$X
zghn#q*`8|}$%JRi5gI*Y!n47IM^<NphfH`jnDCw;1I0roJR3}S#DfWs=43O6?&eS`
zOnCG?rCFiS-Ia)J!t2LG3f;qq$R@lAOr+2~lZb4>dxwb>x<4W!oA6dKkwW(tBC-i@
z9}_8bA7PN1(S4S|HUwT{Fpa=F42~u64+d$o?>6C~%v{567lTwa?nnkZ5?F}A!336O
za0-Fd7+g$XB7^G*B=etHIY?j!fNa9EMF&z<2@{^pt5BP<3C|Ym=nfdk$b@H05A2F$
zK~CFA<3W&v$%JP+tujm|JlmOovqUEop6$DUpGnmA21JO;9TFxyQb8QXVZx)Fso5C!
znec4Kl^}EvHsRUMTCzzK8WItk@N8G2iOnWF+f{~b$w*NPL`|<*p|>}L3{G|!oQM-b
z1}8fVPK2<*$x%SO<79Ah<S{ro@)(>Pc??d@Am=I+mkmzN;^IvugOe-7@gdR4$mEI-
z-VOb^1}9g8h~prbl?{|F8Jt`V9YkS+Q(zpc3B3k_3h^)k!f5ykYlH_72B*M8PcV39
zg>0X~;N;HdG{dgM1}Ap`+bl4I!O0zGTLeTHoZN*N5e6rBVTDA5!O2}jrGZ8moZLm7
zGz1BQlRK}$$z9xj9UNhBa+e4sGomm!xl6K)FgUsM8l2pvSVkC}+<6U7?lRU-P&Q$3
za+hTV!r<gC&xkNMxvMksqQS{sLr-Sg@LYqFyQRG~o+pA0P7!QyieQ6N1RI<p*x(ew
z2B(NT1}Aq1eK^D-g~7?4;;fIQG#i}UJydF5VQ_K};R+H4C-+cBgu%%@GJx_G1}FC@
zek5UVa*yVQ5C$jrSeqTmgu%%@o;kwc<X*~%FgUrFaT5uHlY2RHgu%)EF(bm@<o<*k
zPZ*rsc@0kP)yxqFCwE?hlRK}$$(`5W<j!kwa_2QTxi_*lVQ_Nq<Hr^TC-;}?x2PRq
zaB_dm9AR*Bf5V6{IJu89A`DLMV~hxcllwR$!r<gS!H6(8xlb`73{LLTj0l61`z#kn
z7@XYa7!d|1_XSQS3{LKAN>!BYw<t0=xo=qMCGr`Z+&}3%qaC7;bj6arm~6s|`Q=r#
zoJ~|OZKvg8QM7o$_95L%yMDP?X@Rdw_lB{CT2`dhXDv^UA?t!%!cDUJtTXAcT~j2v
zByaJsyGW1u9_w{EK?!8`rSw=Tn_P;^h8;$FY=22Et))Jio{#n$WsuaOZ?=G5eEiG`
zbxkurG*9gL!6!4El%tnyhY^#2Hn1whl+GY9NJ}woXd8#~u=CQ!;bq^OT0yYrbiv##
z+#Yno?Lkj&57;cw(4!M>4+cdi+#Yno?LjBp9`xk)fXzY+N-5kP^yKz{%>ou|l2N!l
z=*jH?n}t#!>B;RuKZA!gJ-I#TXNEDI+#d8<0-n|p@GOyr+#d9E!Z{7OJ?P(#pmOQ8
zO$W?3lAhciu<2l!+#d9c!WkyF2mNPJ43pb~esv_n<o2N79ZRrDfkjlp<B;2fnP7$8
z9<WK-iymMhw+C!eCL*b1A-4x?Qi@V43%NaDlR}8oLT(S(q!1EnA-4x?QV5B$lo<u2
zG9&R8a(logg*XWoa(logg^-FCa(logg^(JSPm_VvU?kB(ZV%X`5T}`i+#aw=AtcE{
zZV%X`e2;P_TgdGJo0Ka+QZ3~6fK3XOy03-Y9<WIvWT1uI9<WKd0wlviZV%X`P*aVv
zklO<`DV>0fv5?yXHYwy$G1fwE57?yq4rHQ*+#aw=xs7BqE#&rqO^N|eKi@)b57?wQ
zfGoC<+XFT!RGj4&a(logg)9`SEadipO$s`Ww8=tl57?wo*>+jT?E#yVASl=?>B;Q@
zn-p?vP|589o0I@@doU?ypwz@JDpqPRDwmD=1W{?95kPJaCMAbJ+N78i6)HJxK>(cA
z4JwRbl9?!Wa(logg+>n(x4WI(9<WIvik;jZut{N9+oVu|twq7?hd+m8ruB?n+oTZ5
zPHqp_q)<9LxjkT$LUg=gX&@==_Mljl1E5*R?Li5!;spRgZV!qTSB@Z}w~yQ&bi%(u
zPyP*-P<MBv)02OLCCoFNVDfLUgnJN5cJgmfi`%J(*vY>^Ey1u(xGCs_n}V}K8ayAq
zm7UZII^naR6Fv)i@*A*5^82f>-+*nU(id%JA-@6JYGYNvl7;*RY-`j(Ky;Eh5fRt#
zcmQ0#HdG{0T|@~WzX9`1KqiGk*FqxFro&8US0;r**GeMNro+62i4?l%tLgySbeNAY
zkwVvZ2o&}ku>YdOVHIm3zXAJgt!%XEu-|defw|5?egpQqp4C7M7V;agKX9kf(nvC2
zWsMZNib9FPn-2R^Wg=;$O^5xtrja%s_8g%xkTxB5Y&veE?F@mm>9AwdQJ!=L(x$_X
zO$YICI8h#&ObUgrX$S(yZ@@gCi4?j%AR=u#%%3xnLf1wj(x$__n~4;<4ik|!9p=+a
zq|o&P5oyz5zQsfeUE~50K${LTc`caOYrz%3AT^^an8CLRjAif>0*f)YlfViLo*=Lm
zgH*$=CJa*5xX1;<q`XTYc|VwxH3X(H_!WUe7(7qlI0o+$ID<jj$G8?E(39VQJvxx8
zN@2eNyO&)ZEaW#}k9B;6L`X)P4tsjw6(n;z?I(>URJ4aS9rn{I!?fwJp9u&iI&C`a
z-vz`0maL~CLR9XM4#`R?h{L#9vQo~}Y>ewAh5ZKX$Cc)wS;%j|e%5k^3t^^YcJdpr
zUx_BRo%{yuR~fd_?3d}&V@7?@%g3VweM8OKYe?pCfP*HbV~-InhZ7@cG9g}pD17Sg
zETN23kU5?DI}>d0Pz6}%)ZbZ-Pw*8Bo%%bgIZlJ&w9u(Pj=3;YVb-Bje`gIo&X2Uv
zslT%(o04KIv|M-Aw$NBqz(S|~&bq8R-a@DT&W5%t;3Qb+)Zh84%Z$QTv{?3n)69u#
z#(YGl{?4{R??X?bg--pQ?HEb2(5b((JtL_WI`wz<2%#s)kgPO<%>l^Iuakmk6XwDu
zjA|-~HeoJo!U(a`sefPrF=WvuEU>f~H|f+ru!4Z;)IYEa!@+;_!;-3;PR_20a^r^)
zl7Dh`3YK^wpY>#R0UFt{g6kIAv2R~WUqKL7bqk||3b}$#q2xGQ=oV(f`be(mC-e!k
z4+Wu;P8lRpq=y~p0YVjB05vU68oG($j@aQe^<XK+<KF}y!Df&m+EO{;MMM>UqXrjI
zS|=jd(a$1jNV{@aJUVn9np<F}51X6;5tmLO5g5nAP7eYeD6%EfedLV(AoP}HD#vvi
zy(yvp`T2;ob8oCbd&|L;1Uf?&2$5KUHw!7$hgB36c{C3Dl=A2+2Z%DHN)g%JL8v2h
zxm&#=^5END0Lt0>RbL&~k)uSF?(Y+#qk`9w69i#bpOEU5OI_KewL+heP6L^cU)12L
zK2coG`m(PT`h;;o6a9pAtkwrW<^~t!)hePOzE4@?>^=clt#eb}2o*|~r{O=Oca`Lu
z_O<9&XxX&bl9cN$EsslmVyjTY{$dm@kL9?7+VxkhS$C!7`$EfRyjG`SPkSPPQMQU4
z_NRq?MAc=I`E}A~ZNosD{}xr)Xf9@r3Zj^Nn-164a4g2X+$Tqei^`ba$IF#5x{y#h
zvrkOtbhO#n++w~n7#hdr7Ly-yys!85DR4>425$xbLXb)9R4UmVgHXDuPwa=(&F2_m
zF_P=EcJ-0+nJS6YkF|%3Z9%QJLVLF5ePSPCSs<_E<)uDrRiAuQGyyEX_$RX1re$$d
z%VL|B#hE@xRPZr%w>n%IzU{^(A&t%>+N(~4)+;had)4!`S7z)oYOng-|M8k-#(qY4
zqJ|&F0ZMHQDF*2-hGlJUlQT^0ZC?IPpmPlFo9TyF;J@12=-tDy9f*&>_$e8x1MUXE
zn2vz{O!RtVGQEpzLJj=>D&>&KAkmgYQi}RvgYps5Xi1cfe%L8l{$clIEEuu@dh+g`
zjD;e%gQVu!JsGv#6YYA8bHjKSL%SZM*!2V`5%jW)U5u4r8}DLFrIZy|1>0#CV=80*
znM!P@U5u%`B?ze$JMCgj6&P{aX%}Ov$VjN2b}^<(jKtW>i~>@bk$C&X&OoX#l3=G@
zjHxOk745W(F;!!vhW*oIAT=0Cw9_ueRFjcrcG|_5YB7>zr(KMxPViSKXR@7kF{XN+
zuYsi6X%}Nk<Wl#w(=Nu;fRTZA+Qpa}20X*#WY}pJV`>seRW`~_yBO1}4$NNC7(4A^
zOih*J&^FdiyBJe*2kl5E+G!VKYGbpY4rbbE7h`H0@+&y=?X-(AwF{$7#bP_{VodG1
zILqy{i!pVu&<MWD?!-%F(snU++QpcXxoo@aw2Luy;@yX~i!qzSXo_;sF2+>U?xbBz
zfU+Ao1SrHXQiD;sw7wRgYysk=T}%LP!JM><2~Zw@%4r?cPSh{mpyq?Hl_>hz^iq@!
zeGLp3?P!!q_R^pSa365djz*cx6c_DilqmwH9gQX2N`1pfTd!{T3zV$w5V>eaV~GeS
zl8bgUmS|4rq8*LJ%XDo=^A|f6cw!?<u~UI3_6349Pwcx?M9mYM`k_nn#3uN~or;@w
zD%QW>sldaT8s6sfaDD~tV%I#JRlwL4V5b5PXF8<Skz2J^hpz+RI<%pJi0UDVlXfar
zIui*{DD<Qfk#;K90ZgROGlGb;Q?X8DB88q=M5LXHmCi&06be1d5ZHVk&JTdtH4kS|
zHqFDCmcu%`=HWaNh{3LTI9J6yrbyN`tdT;`@1)V^;oJ)AYP;s)EHu(i1s=|<G05lP
zydCXl2=aM28%Sr6&%>E`p_0`AO#uppp0)^_v{SKmWg>;1zC@&*igg$hDfEmdBJEVH
zGnq)ChdjHTv{SKu#6$``<ecrKor;ytM9{&BxR;2uQ?b&SNPt42=Q{?e89i4S>`dTo
z1~UkJ!r(Llji{;s<pTnp3{nkyA{eBq@x(FMlfY68jv}xsgKra9pTSQEe2u}K1h!}J
z1cBro%{vu%I8#;Gd>+nJ!*<%Kz{7a}U?kH#oL3`RFnKti069eSaAufxD)4Z=N_5((
zz{B|gi5fqG2vNC1IwT{hAP(bZ$;b_eW@8*yKc9!Q7c{%(;k=YIp&>zXX&%nRc4;2Y
z1RH6lX$(Nq8+sj+l-xt6b!2EA634YgBEnp~1qs7=J=G(SUrek|dv!^JKs1{TvYLf;
z?L81^Ng920S&89?f5ZZ@M^zGpyHZ3xOpS6cg+>-#Rxh*^&8LwnQA@zl9@jV35}CCG
zOg}kWXe<%T<5|{kBcdXjQ}P5>qR$$a8O<dq#pWY@*2GM2GUOuIrK23VFk+=MGc$q<
zQr;Sla_O@cXGScd=d2Jw#fIk+RuR%El+hq%D|i*eS7MWp51B|`{(Nv^A=DcR{$DO5
z{=YMk(j*Um3jZXpnKn(zW)$3KswCKNs%(sz;V(^<Yblk_R7r^6R7r^6R7r^6R7r^6
zR7r^6R7r^6R7r^6R7r^6R7r^6R7r^6RM`dP^qVSs0{IV2mG3~oe_*O4R}H_Zax@UX
zsd5<<_)V2l!SS0a$-T>Os-)ulcTJV=LMGQ#xepn=WU5R?0u%eeSgFCNT)ZD9DmlSu
zrpmVn6sF2q06FdFzO-bj+y!AXQGBLKGESQmYImQhk|;h?CBu15m21JIB@IlKw9N6D
zDv9)0rb?p!{lpH(iS8&5n;_vhF`QtU*x@*VP?#X$I6*DXCP+9=5d7lA{ufq9HLuk%
zV5PDLuP<9216CVn02WrqfHmrV8gC@?JBYZd=-OqjR<b%0)kzd#btI#YNukjB3lZ7s
zNREvrg+iweQ--iQV#^L9h0YKnvehvk6Df2SLm;e<)?bt%K-lVNy{(mv=5p&DhZn=J
zusT}rdJI5>)zSLEO&|2Y>R5p_Qs~SijXalIpDO1_BhBU3=bA>E%dI&=ql@NpE3A%9
zQD}pU=5i~nj+02Ii{^4Gtd7Kk)o~K%K%w&z0%3K$$wUgB4~fWDM+sA$NukiGA`n)`
zASP1i^b(P+jzySAp|dOz+3HxGi4-~;5Rt8pEtp85vjc<FjLz;1-XL%wgU<*Y$sm2d
z<DAT3Ap&V;HgOF*7c)pz<NTDtI|OcE(1_6a1%u%Pl8Mx$lq68tNb3-InW=3E`~?Be
z<<{sxsw!c1w0ae4GqyTfV;#RC5t7kdZcPuYjbwpN>q%o9ByrPRZau9sOmn&QOaOUC
zxM?o8eiuMyPk2Z{gs9vhVRa-G#9<s(N6ML+jd7pV(Ry520vcN#t!FJo0po?IWNdY`
zUWq0)TOF-e8D^{FF%W%v=@qAM!Wf6Wk$4`mH`-utB!-9VjW*aD2?_d)=G8uOP0WKI
z*Q4CvPbnfy!$b}4%qhGRLB;eP+=tOp#3D<SPJ9LuG?&QjSkLR49(W@1PtKm?*V8V-
zr>7kY6ccEw6!InEF*1!1zPY!Xe?4%R6tGb&CX^)fS?7m^4j70-(|ii?;AcEc3ZtP_
znBgGUN~M`yLFh9OAOo7&U;fbF2#?}MBk(l;m4V>jnca5?NsaXH%#Q85N%_BVX4khK
zAxoAw_yFE_(CDA#anShZS^mF!(0Cq$qwk<`Ibi=m;}1YE<*!4;Q~q55JkrsWPgECC
z#FYOOkV&D?Wx{kJrhFG5lR}{@f{1+37{^2kT?s_wDZdI6DRk9E;6G@j4ZiQ7QIw6Q
zd>k}Z#CYmIXryn${RfR_fndsS#u_Pf%_EIG<>R1{wyVB_MxjwVXk?B4gGQe6{RfTo
z#c=LHqc-K!9@xY?VHX()#FS6Fc@yvHU3P$C$`4~Ag|7TW<b%c%Or+3Nk%&Cy*JdJx
zt|mm}DL;vc6uLSQk*ECLOr+42&LB0TYYc-jio=JWCbqJ;<}<Y@Q9op`27xOWq#AZ@
zW{|4JMFs+sVnOI4+ki=lCh#nSWeB{+U?PF!HEB}X6G&c@CZ#`tblT`UXw04RanMLL
z>^o?50oJB`95m8=%?FMBLG~RqGOQgmjv=~s&^U`kdCDg$cSvo@CysAwrkttS823&2
zIB47sn(v^o4q#0Al+1U~NNnFhBf&i7lh>rCH>e|CM|=xG20v#*5~uID#1G#S(bU=V
zJ4)fC6qvMwRrpKDv!bV=Jor>d3K~lart;kTsDTi~=^p>&>~>HP(jEK#sX?^!ph-Wp
z2MJ7&&uNJx{D~*~bDLn(a+89pQ|eJ%zRj9MKEK2vlHdZ33Kj)Q4c-KW@~9ZGHb{|z
zhLM7ytYD$0!0T7=%&%amT~i<hbtZ{{EHUg)5+xO1mIG}*iO{>W{^%~Z`BMSgBXvk$
zT5EK7_#de#>2Jw3b}>;DWvIVI?~(-7Sj!MmV@Y8Hd*ke}bpVfysK6i_$=Bra_0{C*
z)P#O=_AmY%YX*uOlSB>G^jln5U+n;Yc|u8F71sBvrZ2aOOIUo*TO~r6Y_uw_Qg9+N
z!~vre)Rz>LV+AEO1-Vt6>{n3EttpU#%8^7#mT>+_VwPW`WS~zX{5z`R5SH*Q!tDMf
z-s@PYiKh(VLgL7Fr057CTuOb`x{;z9!?ee47Li$>YfPFdA}O*dm8pnctF*{llwlFQ
zc*r`Dd=OSduXX!~?V5CXpR{WlC#RCYj1~F2kQQhv62Vo7l9I=;LHZ~F<FBw8ZAY$C
z@b~hSP?IXa$S?-V{#{vGsOSG{Gs@%M_OH$8Uz^dtHX|&~{<RtX@3t9v*3-0EoX>w=
zp2OtqpcH(8Ef;r)wM$~qfd24YFCHUSHZ_6<bRP<|7)|m#Ws}>xw$4ck9w5mHc9AkU
z7z-kCU_(nt>9j(#=r?d!(Pq&vMvGbW<%<$)zXBI07?l5F7R8->bo~7vIH%)Iz7~`+
zk8?Wi<ZBC3Ugvb&$w%HCdOCT*U3&hL7u=;s@ZX)h;BGws$qVkrqveKh635+mRBPho
z1$W~SB2Hd#Hy$D4<OO%*5h6}ra5o+y;^YN);}IfGUT`-aAz~KA-FSqElNa2LM~FCi
z!QFU-2>T4~#v??Wyx?xU+>;mFjhB1!g1hmk)WS&|cjFNvoWyZAUhc^Y?#81k6Hel|
z8?O@(;Utc`@p4aIa5rA=$qVkrTZUxf<OO%*<(|CYZoJ%+7u=1Pd-8(2@n{JroWyZA
z9(69^B#yiBsBFSX9Czd8nMH9o-afGS<OO%*X-?w*|JY}6H{MGpFSr}efAWI6@%$$*
zxEqfOZ2hZ~7u=2KKY793c$AKv#Bn#?pHE)kKaA~~^Z$hpICTJ@4>-Xty4EfLU#?PL
z|1VEtUiJYeZ(5&e=0m`K(+q9)VVYTqh)pwePZIA?$uvXMK%xlK%ws?%g~C8G3<%Q<
zeVl6IPgDcR5kr_}$ZO4{P#8$Y0%4jd&qN9X$!y>^&5$>M&om>-rkQ5SV7Kfy&4dE+
zn`TY|foY}@YosuctPg(E%x%)B`G5<JnrVhLx_v(2t?(oUx6cRsUDE0H`G6A-r!nLQ
z0&i)`fs7i$G?T+b3IoZ!AxtwKKqiI4Kr(a)(@bF|QW!`k4`G^lg^3gflJP^BW?p9^
zg@I%S5vG|`CQ=wk1`(4&%@{a>L9&nqPGpemVu7<6Bx_jUA_mC@7WgrPRKtPm7^JER
z+{qwWm;w(nNEW5QlMIp_Dew}5WIYPJ$spN`0?9euq>!a3P)23>OfytfejjkEVV`Nn
z3OLs^(-6r#<O5Fk?FDH*;0$Y~nGr<SOfz)do@9LmB1Gj53DXRzAdY52u~N>|Y>ekI
z&1?eAXPT)F7^WFY<}=L@+h>{~m`yX+0BCykbd8uJ&N&q7gNd$U?-)WrNCHS9G_Xl_
zx<<@_uR6gqDR;pPrDVDqHjBQ-JxT!wj8zkGa#3C;QK>PL^4#=6?$hD)MVYRAETzCX
zO$1%pqbpyUvVn~!octwp<*SRmcoJ^58Vhzlc@-oxX%58?J5~P4*+h-(l93_l3OF`W
zQL>u(4tlIKXnX~lR#Rb>$6@V|!+xGdP0<j)qP7aWGM;2oiU=NUvRXu?O#~{*LBzv0
zJL(;Q7ZG@ksP_rSnI_@1qKMie2%G@!s3W7@qmV)`JB7yN=wXw98+>>QDv>zefh4zI
zCfK*8fSWb{<m?DmvfrzfD&qU`l(?)`s2DEfK`n=*7%t<%5RtfD3>WcWv_O+%xPS)>
z3N!^Xr<0a8lkMot!a%TivRK-@e5Z5Ax|pZuqZ<C#83_LFTf4t?HaUh0^EsOk>~}Uv
z(c#tjtMB3tQ7WIa2_b%G6GHsXCWQE%O$hNjn-Jo6HX+1sI3UFDY(j|N*@O`PkrW|*
zXOj^qr{CFRERg@e*@ULa|G?Q~614f9O%?$0JDco;0>87#5^(&^CM$vXolU4X|6OO3
z^^nPRHu)YIz2t0?fdnRn7*=X9Di^=3MC}Vib2fR8K;dli2|!M}&zF{*P0m2rOcbB9
z3Ed@Z;`={*&L%|hIh!z?*V*I{m@W)Ia5ga^@;RFj>93qki2grzHlddHIhzptPs<yA
z;!Via*$63daV~%J_Oka0En0luCzROleR7w|E1Bv4ow6oc$Z$Q=dWWbkUfS?CZ@Ehw
z7cXtt`y_X1<Km?aA4TOZZCtdp5$m1Y)rpJlKl6K^&~cB?`$UwD);sV%p*tS@-Y3O@
z_`Of)+yLvHWJu!h5}}KHQ~lm2G?V+hPlQIz`-C;}Q51QfkSDPC=8e2h){svA=8e2h
zh==vg8qR@27cXu2o44Ghjf<By?0u5Ev~ls$hL56hmo_e5+VD|S?$XA^OB+6l%3a#H
zXlWzXJ7Q%+q3bOKCWV^OHH$$${Ck%{KJr`6ARqLtWspzkb}&da>^i_8RgLQegM4gu
zkwHEz`<X#LB74XnAB;(;7O}|DzPqE7F#hh&=Y2v|<@Y|J8uodg!~)jVJMcbfi)3!{
zKH(!L&HIF5&HIE7nfNFQmIOLE65b{LqbIC)NCk1Ub%~X7re@P-aq-O?yidLc&F6j6
zf;6Ea5&67Ni0$(}A(+=YPXK6oO$xnqiZVl%5^SZE?pP;S-h@@%R>ndMT=WQztt=yo
zML!)#dEQ?-Ew2(%fv-*SSe}qfMNSrKDM&~qMq(@lNI_*r;>E{gbfX6qfj=gr8$Af&
zkI8VO2jyzlZuH>&6@5&G8$B4NkI8VO2f@MRk-a=tFQ08g<;1^t7ZFZl=$efX{_tk3
z-b?!MF;yr|bH?if1ePiVXOMJ?UXIhG*ZHOCtVu|x=!5e}cg>X!8;a$D+)6J8XwngW
z={Nn-t@M$;+`_j^LHa^^`4G6BlSJa>ei@>MyiElvqz@t)L)QGABP04u1*M2yS%#M9
zOX(u&({V>py;H&i8nRaH92wd-9oCy-de^s5VaQsJE9Nx!9yo^~QP}5{akySCsbyN$
zpJ^!0dqVj0xo~|r%^9%qh(d(xONfFeOJRv5@6yYEv=)N7@=1Pqmp<YJd6&MVUp|yd
zjQ_mw)t}_&`{k9F@>lZYznp3X|9yP+pXAT@<#TjVFXW%2E9K9>0_FcmC%^e8`B0o0
ziRXXxlKi7Q@?9zaUvzTOKglQg<$uw=Sbx9ha_jE{YQw8K`LX6&m0N%F{qk3J1zsrs
zRb54Y`JYioJExQP{7L?dU;dmf_67NKx^jN`mQ?<4b@Gfq$%h8`p8s22^b7Le>Pq|N
z50m@>o!s_M@=1RA11~-QfjrOuCiS0PI=RrF<mdb4cj+Qu$bXkE!Iyt{ZwwxCvQ8#K
z-4>z`UiP0rmVzx(ysF8%7}7zzkI=L(>5|+@r@iZmk^MEf&YE1PaE;elMJ`DZsnke4
zb?jx1jZ@?;O+b6OCE?2cmM&l-3ZToXgE07KYM2O}+>gv&_?_?<>Jk---{uHixRymo
zH|mEGx&-l{%|c?SUL$oLfhUTN7^y2F@EQ@L=((ePpOYp>*ht>%!+S-zs9s}ycxouF
zLX7ocDN%2c_J%syt38MmU5w;!iZ4S?HRlpbWF@2@?wzz@^EIFpr3e)zNc9dVO=!G8
zlcn%!l;b{GCPHUT!vLWG8U{my_<nbvti26M5u2zOU&(e&M+9HUvS0QO!r>|<*dX(;
zc(<8Q*59x5AnDvsZL>kv-jJk_1R+vE=xi3Ti11Ca_T4S+7v4|hd|T*Tuj#Dn*ZG`u
z^3?FQTukUlmBQ0VCPT=CYBKNoWolttC}c8xGD#78DBq#E`Nk}x4)I(zy(;ARAD{!2
zvrj@ima}{wM%{t?C}tzcR}tud)Y`JlQ!%#w5k*M3&c{!Uuuvu7-`CJYq04F-v6*m#
zFY{*kW<qT!e_v5YeBO$?Jh4@CklM3yi6){*v^$`moE_^xk+_SV)#j+A*qM~)Ik#4?
zu>%+_D9}VHzeF8qvz%Yk6(CeWS5wl(kGI()dV8vr|5Zx8R1W=9>JgNBJ##~SsS7Nm
z>YvQ-_({!DQroZpkY8dlzv7t73!?U>`g&Et{#|g`F-`F`W`O~uJ%hEqqiL(`*QPp=
ze=Bb8I3dB>hHBbUr2;!hyc>&m{F8VC66fM~)5csD@9r0mn?c>6HTP(r?EnwJKh!(d
zQ)}OFi={73<kD}@s@WYrM3b>&!W%-Sf+kCjK$(ls`v2?f4u5N3*o-py><a|@?F;@R
zm6z|`q0PD9zCeiIzCeiIzCeiIzCeiIzCeiIzCeiIzCeiIzCeiIzCeiIzCeiIzR(3j
znBTsTd!+J@_wLYO`5)L9a*tGS?@sQK3hv!m1_gfmLhg|Y?%l~fQo+4DxkoDh`Mo=N
zj#O~(&OT)Hl6@f=5B%@PEPv(Z`tQdq|9;H!@5d~@<CFjS$1I-JwB{%%lLx`e|KAXf
z3H%93!9cN&ZWqj7e-_f*4N@g(UMNk6v4Q^Su>Kk`9lrdp53{hH-SpvqFdhDTMZ~{X
zMErY2#Lrj~{Lin5;2SJfz!;@{gr3*2;lH>tg1<ir5Z|Bh^%2(McKYU<uaB^nVE9GD
zKVKhV`^)PitX`OSxiI{}M<YC3b#W44&wHnYW2Mpu3HeS5$7&;`6?aNF)~Ex4;7*D9
zK1K9D`uM5nP6?uVh$8Nkm;oq2q0qB{h<vBS5++jUSwTd;Q(`j{DfH|iBHt<T4HGH!
zoIxP&lyLr{6ohG+@04)f*2+dF0nR&)LokMlJ0+ZVJ*$9-J0+YC+`X||z?~9TSR;iV
z`m9sjDdBvoklCE?lyE-RG}1|cGe>CTJ0+Yr3HX}KiF~Jo6DI-XNGIPZ;lxP*@o=X^
zIcN${DD+H4Anufy$3zM}?-P;ll=zH^6nf~}PI0FMecBnIQ0O^CM7~qv6cZ`*kcWY|
zQ-T}}0u%~8WQG@aN{|&kK%vm1GDywn31V;#fiVnzOdx%48KCSSusnmu39QK=)v%{A
zgH$!1)(mD7*onc_1omd|O9C?(JV)SI2I<oj&vXWj2t5lB@JWC(I*_VL+$rJoD%56t
zr-U=sLEn37Cjrj%z%P(2*y%iJG@_#UP6_8}m0>ywaGnXEo3Qy#3FmhK1p!OO^$;N{
zcSzhRK`Mx&y>>>*nVOAp-<=Z9<4QBo_)ZDuS<7iIgq4!<of6I~(ZuFEC7f3o<~t=u
z`1F{i(EO`$x3y8v7sZ6d%Ofn8N#XKUn9xTC(Az56TV??n#fV}bycNi3Mx1uO4{dA!
zU0V}spRx^{aeVbdjD7kRAmbHMR={VS*6*NY!UR2id=hRpl!9(~h8z#`M#K>$mm(^W
zhKRx+!Q~-h`zN?nA>zn8@bHUxhCGdu{^F-#&(%MKc(5YRw+w(}By3;{InT&mlKc*P
z{zvu#m~EYjQbod0{;o<xN65tWkX|GnNgqted-}4_5Ot9r??<EjWIPIlUuh9R#ZIuJ
zPpn{eAPuK**ofw<(|$C@6O+Em!JR)SP}D>$TI4-O`QbPrZuFZ5%fV(H{S|51V{{2E
z?V`s~{p>MD2s}yTvd37I62lrPa@k|7DRN2l)>(k``;2*R=OD?L!=!e-Q6^-smNV5W
z{p9TTDCfzaaT1R&=UVSF9^=b-@*>8I3A}}upLb(!PBH7|BEd3|$3!iUT-(}x%H!7^
zoJ4!NUeR}!FrH87OchVN%vf4HU9uGAnui2qjPjDPqO4a%gfi1V@Qa1#>KJ2?kZu<}
zgi1KZ7%A{1Q8#0Z38HSA<vXP*S*j`V+DKrU(INzTp>=db!Oc@8wWPQjl7Djc`&xee
zjZQ7O(BIz}>ucetP1IYO80Cs%gw$idv>PU|SpLAdiP7yqhl8O?l(&hmye)*jCcg3}
z=IgJiS*@uN<xR>{-u&07UGEyWT|XHq(oM(entpP2l+~B(U4xt6QQTym8NG)Z`mVuC
zzC*Z7FEcuAI}?h~bJ4%fiC%;^3n{A7`+zIvGQHfF@8kd2q6h1<?qqtoTRt#o_qW{0
z^m1=`=p)?9EWmfjJTfSY@UXa?Wu!L-FRJ|5pwQVkZf_Y`kS@h8juKk*^WUfP;ug=|
z^f+~m^5fBhkniJB0tnOI5^qs@eS(c4bkd9e`lShCT%`N5NE?rP0g{!TJt~CqY;Ta0
zhY2+eAw@qqdxH(#k)N}@!6lw^d3yAxN5E@uh>CzReb$n6(LviA3W=&&oF2_@L3=}a
zLC6MyYO{my!RM!oH@Jg=-y?n2oODsW9SqSzWM+DFpZ8d#I1zMNl4$og4T02;aQRKT
zsNIe}o!ipA%V`wo<RctT7uDU_N7$P#syoG3-DlE8b$2l+^q}~hKHYoledy@w6WO0G
zUf^y%!me~t`&SI|B&{PAx`P7Fg1o5psRpgqXZmVAUj@2z2fuLC7yZ1^m?h*d4ce7G
zN7J>X*jG>zPfIaes2LScjnc><-_^db>%*edC>3A~7md=$U=y{|PSkiKL#V)$M1?mp
z#0gw`YZ@7<`bIz=mzs#jqWg)W>Tf`P5A^cH43SZH|NHS4jIScU2l_yo2J~4sGDRnN
zppO=v;A*Dm1P}D_lq0G#QxxfezJef}2Z1`lLp={9{EqQ3{!q_D6aB|2(Fq>vy`t2T
z=mQV+B?PXBKJZBI7JXoUrgs*KBmbrkn#aYOIMVz5CgkzEk1%tjw>#DK6TS8hKFuu9
zc`$6<f9UNrzVIu@Z|PHCiU*mZQ#{lA-$Bujp8Haa8YwzOj*swLrnl8v<RbZc$+(di
zlRpRLx?ZOL>_<PX8PR?>{q=cWZ(j!%dh<x}XxH`m#iJn)9xHD6^2pYPhM#haB}LVw
z4u3!|r=SPX?-LOrk0~&Ri#9u;_xMIqek%|7YPW@Gj{|z`4Qm$t?Lt;pNz`^@QTqo)
zd%<T>6!D-wpXft1L@gW^CB(_MDDl^Nqmb_<6)r*wZ|Y=Px5R24xHev2`pMaaVWWuN
zOmF#3U8HD@iJ8%Sd(}-{G2-g8MrB4jb|S?sT}VkL{5GOM=Tux;s?!>DTc$UIrpwzp
zULEPP-pTaVpjYRPFGY5yx5qLN?)nIeGreg<xaT7*$@I2d3Br9JVR>c&EQB8T(9bfx
zE*ktF`UtBsy|<~{k95v*Tq^uywc51W^62Gv`-<l)@`m=n`voo;k<!bbYCh5FM6;s<
z^T(PXP@UmaE7QxLYl0BK6Ts2Te5*A*+IgYm<qtNIqAq8Rd51sQ^b_!}*ra@A`-@!X
zYMFaWY)0|k)O%=sh>uE<M%+jipwPO{NMzE-^MV(=iV%!iI|o8I(+lk=)L1a`&`-`j
z;X&K+e3v16M3=oWR5WIW94{L4(ooTu8FEF@m}iE1`9d=s`H9C}Jj|O+Eip{i-o{nK
zxOIp75-b_!eVb~2gsiph-J#xdH2Y=x2=|7H_8;jZ+#f31f0R$hgQ24RNBaoB4HfM_
z#z%NM)XS62SRY~3FfYF=<9vj1!@LivpyPdniNn0S9+==G{6Fly33Oc7c_vs3E>Ix2
zfZ#4s;u9&6fP}TMGNNP^fC5lNp#Y%(5~S=rfj|K)2|&T90!dl6<Vf-oyOY-9B%X{c
z%8rwh>BzD>J?Tj#FFi?|$aXr@Gv~DJWcqaXbR;{O&Y5#Yb~4l5<7B??zxTd+uK;SZ
zwOA%?q29gkzWeU--+%xA_fPIWB-rd@w&KzI4+-D)aa-}J`!$sPqOJJ!{f9n77UC1O
z;>r6D{qQ$X@yW1o+W?3A4}F*W_D{mXbc4yb|IpcALB%hHh4V&|*!vI3`S|7VPHqV^
z!U)IkLlkPGpD-lo=qP~`hUW+8^adXNJ%aClh=l)UMOPgASNOdX4;|V=eEC$kUhm<*
zxj%ICrFq?kb}Ih6#AbIBG=3NsK?h#~L9<75`nF{TFMG}QM@+M4?hhSy%^TCYb?DGN
zoZ;6aTmK8%WBeaFGz<A8{3nqHJ%aCoDN%&c7jfT!r?4@cqLiybTX|2k8MS*%<v)h5
zdw^W(&EfC|@7Aw+(!4SJ<UMGmq4(zSwHlkv+_(ZZxj8IRHxNV)|4-(rar4xn8lsu5
zu<>v>9||3LFVXKcVPPvTySg16Q28(X<R9WCHwb@sO?aP0Z@1hiCgy|QmOpSv5Z-m+
zI-9Q#mft_e^SgDwMbmGy?iT%4@3sBz4)15bK)w2)?GE3>gGMpx2o?-ZL%NS&{dxUx
zd29vralMF}v2X+BO=c6+gN=oc>cM{a0W;Vwm4mJG4Yv7XpTvt^mNj@1o@M;^U-b{h
z`k(Ng7J1LhyYy>*`(mrC$;-#|Tb{nyA`9?xm&OsFzSt&c<%fog{?f&^hq;e`=doy^
z=Kt1-SAI`k`L*}zca^6$v@aBD6&d1tlDLsG)35yA5pQDcvI~E0(9n|)w8<F$ro0Eu
z|6dp{{kFX0fO&@%h{8|)w)c*28-d~XAJp&o_Ko@-UwTka>)RcATEG3Ep4N9|TF4I>
zYW`3E4zu{CyrJ(dJ&XUtTeyS!?JT}&-th4UT0|uIrjbZKY-aV%xPIgR`=ML(tp2QG
zR{y#@tB$|_GT!;DywjQk{>*!)UbSa;YZ2(TAJp%A)_D(FP!!QKe6~}+@6%?6&)%VD
z_^6rTm+Zzp*w#i&{m&%>apbW@bONA6aw?n)W2yfnNINa{%^%f+AQ#7f|LmSn=tut}
zAo;4LUIH1fTIxgPFXUXc)L%^@a5aShwCYz=2)ykn1ePhAzF!g|Rewy=uCI7{lPIL#
zJ-;n2_I?~Usb<;xt1RmjO8?GKy_@xeV3)Gq6Vv<56Z^!`cKRkdUxBgPBYw8i9i+)n
zj)wMs6rz8)RYovw`uioXzu)!xYu%;$d#~=V^#<MFgzm4kMfW#yKp3ZXFJsgG#3#`G
z5$WC%=_6kE|JCdM$WAldYaZbIZraXUwbB3hvT1uQWruB|+6smKE$&r?J_Xj$umOL7
zXUGQp(J$)eKWCo$;`{N%i|5`_HsETY;MG7u{Q<pqaG+px>{*<h@tyRi+~m^}ukj;!
z-@?;2K?1LXz5|;$_2FaOlvDI~@$fP3?}J;H7(slD`(KcAm+#`?W8927LyX|p@FOvT
z47HcT&lo}0*0V<EM5uw9t2)NN^^8!{IS~#@CI8mbf}wLFbVw!t*3%C`#|REdCI8kl
zeoe;+;$z&*I3O{C_!zey8$AQ?G46qCj7`3ahmUbLqM&00@iA_OkR#ydF6ohf>lp^8
zV+8Rr?tf&D%!z=HaqAxG8Gw&*OOKMFnjhmXVrDu+H9yAv1@<W!s`)W)-b0Mw7vv56
zThHh^ofBaPuBv1FThBN<ofF|kspQ{!M%?L~2*;(8f9n~0r*k4?q>_K@8HJ~FBFsr8
z|JF0<fQ}KoB$fPIuX9v#8|!~bY8kLu|7p2om}dQN$R&d_>z|cNhGf?Nfn0Kh>;G6T
zxis}UM<qiY>vfJw1~}I19F+`ftk*dz8O&I(b5t^fv7R|9t78mVtY;=l@eIJnxVcn1
zMi3w4<_b%UAU?*;Z+@eh`4~3?`WpB#Ze}@<4AuA;w_MXR03YN2CDt=10zSsg6b3pb
zkhKDkh!JELydz~~sOFowZBn*QefSvn3#gMAL41sR9M^~uWHX5o#K*XKTVe$9F>YQ<
zjNsql!t|@71Gjg?ex8(((Sh6j(Sh6j(Sh6j(Sh6j(Sh6j(Sh6j(Sh4-bl~p)LJF)u
zK~+gm!%s48^q%!R?D&HX3Ct8c$=)x@14nra9L8Qzw+!|QN_cE$uP_|RZJI%t(C&uG
zoBlUxJc6y`9IGWAb?CQZB<x><awHc1%kLs9_{aZT6ZRh&!u}^*HvBsbo5=t0r+D2)
z`Th#Yjnot#+vM}Rn`bz>8;z@dACP*L3`aGwud;Xj0V&06?T(KOM|XmolW>1zI1;`}
zK7)%4N59Wj67G)-M=W?5j#%(A9I@bKIAX!eaKwU_;fMt<!x0N!h9efd3`Z<@8ID-+
zG92B5acXybWH@5DyAJn9h9i#J%W%Ykm*MDX6tp`&G92BFf|ucF5d|;9(FOFT!~K!r
z=%Xm;aDQYtdIrtB3`dXQjt=)nhNCA@(Bb~baKz#0aQ|1#aKy2B8IC@OXH2+1G8}yq
zw*ncCYGRB(sEP5$R&E%_B`1xw??XYt{cB<j5|D8JnixX_r0wt8w!6^w>v&kPM#BA(
z;piK<u91D0aDQYtVvU6RBg2tgGu$5;jyT}0zmJ>$fdAlQ^aHlzN28G8h*c8qj|@j_
zXEPkJJ{ay_6T21Xurk9DmE`s-x4*z^^Of6Au`lzLTLOS3;JglGINEs~&Sl;GVM$m0
zs-zw1j_yVe-Y0MW21Byxj_69|r91k+P>5)EDp|{aKv~v(8y1aDRt>BX(ynSj7HVSr
z+pr6NgtV*crILRe8d*uFKh-If{M&FFE9vy7Mx>H|8z%9`OLue`1)J_jkBy`qnW&SP
zv6t@XW)!@1N8d*QA9{G7^vJ&r^g8s?9fd)CY`P=eBS||xc`H47>5dppUQT!PMfT~X
zJL0|lA@xOh1OGNuqf;U6YA3F0V*J~1Ei38tr;bV`|2DL<lHvdAR;lFQh9oQL^rt4I
zl7Ab{vXbHdDkqiv+fZO7Gf}G#NG1O^JR+Ce#)f|?m)kj}Uzf`xy!=hM?BnH^<Z_Ic
z&&wrOxPjh$H8Czt!~ZImdwI#v+SbHw;pO+`@(y0UESJ-~{E=MV!^;>}s3!IxFL&Y6
zraR(NdFhV0!ZzK}^|&^q9hs=d(M-}E@snUS-H}|IbVvN6u1$AznUBiHCRi%~3DV9k
zc*l@nNe94fld>i4$VB~3)Y)`Lcj6jL$!0d)5pUacN4yr&{{P^@^lQ@{J&4!sW&J@A
z$9=rm*Nv)Rx}!$itBFxHu>ZQRgHZ1cvtcEH3_t0BmL6boqwrpbE%33;2d`mvBP1|m
zuS^M~KS;IrcG70p;2yjdF@5{@@kvJD7XA{A518W2bw^R%3u&JJw(Flz_@;($U?kTk
z<&pE(_sK6D%Cfk)WE$7m#?7^V&&F%gXyp2w{0@c=eHa%{o5ovh<M-5F|KCf4f$M)+
ze$$*c<p<~Y{uCaGnP|9`)8LzKaNPmc9AOO#r&zd^g^=^Ne}cxtQORc%<(qWl;plZ8
zxUCxxN84Bk9sYIj8<AU0W8Lu9=)rQsTcd5{BS0?DQB~-v1As)GtLV!w8Wn_@$F0EA
z{PzbBhC+Y$T`cHEP9?H=8pbPeD$$L(7^BAo>-ssBs4o}eCSSq25~ot-<5UnhNQ$F#
zF;@9Gl`21{Qq?DHix)UZ3gQJ0vfu>{vfu>{vf!B`SnvV|S?~e}S?~e}S?~e}%bZG;
zpHo5LAV;k^m8z5!G^bK^vhHcz@d5|$MnQ8bRhb=&C}>WlYCKlvR1i2==2Q?kSmso!
z?%G%8R1i2==2Q?kSmsn-F>sIr)tpMzxtNbrLEzvwaVrow_-1h`)f+gKY9FUk?c-Ex
zUV&4o@o_3OK2D|PkZ>@s%BkRESRSVWtAxj?;A2?iR5tr$27w{KwcLZFS2rJ)Or`3R
zIP?KNMtue~ubY2?1D(gee2aKQR1@2NcS{vW8~7LI0TljaH_C#>$-hW;K-Osf<q)oF
zV*K0C#7YJ&s4l7G--g>+Df~-HD*3k|%Sz#2eqJj1x8Xegc>K%np<wwJJvPI?+yyl6
z@h^9v;PEd%LIM0sQF`RxhUeI${gTE$kXXyV=pGIKB0XM#fB8E5tmI$5E^px9hJ)x-
z^DoD6RTJaih7MK=|8hbq`L|(+mBPQ=DV6-&a2G3ue_4=9{%u%grSLEJOC|p{e1w(4
zzdS0H{A)NBZlmE;ZseFWr*fQ^np4T}QgbSEywscuSJ-eWTpGivH1kq(DhXa{PURFY
zHK%efFUhIY#6HYR@*_2|U*TmFE;r?0xWbly>B2RdQ8nS`Uw#GEmVc3J!@v9{>rJ}G
z-{Yggzpz#S((o_5WB3f=U$|{j_VF+O5p|Y-S;RHKl+7&v!rPXA;kEED+d#@qzm|XD
zN!!c%IR4wmi%s~K+fiK;JAl7i{~6k!#f9!h{{CY~4WefMGdR=X1ABP4Hv05$a!6oj
zPP6t%m_7ed1)U8)`5};G$>tb7ay?r_9!pC$$MBIh@;h*ukZg|OBPTSw^ZO4T1iN$7
zK0XJv#lh9CQ21#5!%{kbUEe45JM&@vdvG=s?yD1A^4QG5bD{8YQ+V&expp{H@A$O5
zW8&Z(GI$J`(wztAkicWulqL_(b%w$trj$K6hZm+yY3kryH?oD8()7W(9{BE>(y4=U
zy`ga0lumQ`$OhHKxPC|s1Z&Bzy6E<6x9<!$)YR|$@A{!XiR^{5#Lv~;T6erIwF@wH
zW2mm~b~FM`z@6>oJ8B2+K-pm$VD>ev;X|>bA^z0v3P-os)#*pVsMxa2*Fzs3t}Azm
zUEPJdC%_m%eVN+D_S<hjKW%8Yqs#Z;dk(Ns{SBDH&ZGF#U^-EH#Jl$KtqN~N1$)_T
z-fHgekxp^6_V&ss_xW#gQ2Y0V{ewQ>A2a|(CdKZtLhSxpGdAg7kL}=|@Ua7W3fC!l
z$@Ntj)a`X#p7N828ZoPbc%y??cx&E=q1_OuKntXJDX)E9-L|@|bsXxEU4fzA$f+UP
z2ovxP@Th<BBKt$3W3A!$9I)MSpxpjD_Wx@C0nVop0Xp6`vw0Jq_eRHxn_Dp%J&j`r
z0?*!32cT-e5o)f(w^-nSg}wrhw)!8fcyU{+_hMhSU;sJ()UK`re0PT}09WXjam?>p
zSk@pKfMD+Xo*;<s^+N}6r`d2HMD8y`By{`)l4~}D$ZIV`9;}4&>jDr75WW5tf#^^r
z5IJT{f>yYZH=rT_8i(!9^6n9PNUyZBWwaD6;}FiPN~0xJMoZNeE!AkW6str_wf^Pf
z@@^@22^?k75;8C9Gri5CrS1Mn>7&1cXsJGkk#_1`Y%s5pZQ*Nxsr-n>?y|wWdrarP
zUEJ%fIVyc;ghv4Eh8#+1&L!M$&Yo9sz~3QTTw@pc+F-vG=klOAv+~Z0d)J{)?xOb+
zfJGTMR5accC;~7*pa|fe!xo_&@rH=y*?_ZdETe*(t}uWrBZ(`661R~srR>gE+nui|
z!w?rgW+5oGiyK?3VQ@<s2J0*gZq+cjt^B;EBHL{bBx3AihaU#(Ee!7TFOY_~214Af
zAjIv~5Vxli;`Rn1&XPU@ar<9!je$3U2lkrEGj1uNcg6uZS0FU-@CN{ft~cQ0J#k3W
zji1g_uNQr1<)SwP7k!t1r@@_=9hDQLcXZF@%l2!UK0_HAJR|!yZ*Iu>{>mjjKw$PA
z{-4gZ4#N8z#IUcpa2waT@&HtvmjEY3WE((j*IWTF5q(^1vD`re4BmP8>D-(*b`Un-
z4M08koyw)(Te-YHoeM?rWoHnS8}#B?EFoxt^k8=oi&)fdQ1{+4e+Nc@=NspK*vJ~B
z^#@)7{D{Z{uzGDJiXmJET0pD`avQ|tq}(1OaA>m^%4Tk01%EpC&N5GL5cl}kOQU^o
zcXkP1mA5nd;jK2d*HBtiRXh#M<G)wIlv$$fb#>4pqwBVKTe=gv49nTc3v4HbCXurv
ze0ZHW-a7%H72~;~61u?TQCO^C@iaO$FSs$-U&ZThsszuYS8V=9*uZy!MDdeEso)GW
zN)52uM5%#_*MBUD$$pmj<&9_fKu{qAA-huPQx>to2YNUa{O}b||Miu)_K^R*MD93k
zWz11=Hh9_sM{^UADsZF?-*}@rQ{Iv0A@kHQM}xTG7*8-cnMR8hn>HJg>H6lfv~~-n
zfac9lt7%OgQUESn&5S%Ism-j1S3!-Dx3pZKjw`<5XAK|Nr7Vlk2E&rY3M{iFM}ojD
zP?<fUvLUd*6|8VY^Nq}K<&Iseg_75ga^8iwndfLD!&|vMZ+NM+0%B9OeWg<3C$AWL
zGa25^&$t2a*QUJ)!1M-T?7Hj19il6rsXJ45T9p65%^kalVGafF2~G}jT~1w>=+^@X
z8Jc}>*ai!;-H(Bl-nH9Xi89J8ZvaD?{eB}djb(g8|1*4vBB8&BNh=)wZ_vykxjPu(
z5dI6dwzld2<EyM~`IWh)_4!;|u~b-_UCf2r){ARxOZjtci*s{rm->3#p01YFxt68;
z%KD|2#g+B8zSekahkZ;pw|$(;mx>!}+DiAW=7RO1&4v(sawS*F<G)Q?w$0AX<%-44
z@2wWruJF{{Y-x6>uo!$#@xI{D+Azmp3z>SbYPPt%p`v9mw~|}S&joL+t<G&e%(=qy
zasl2SbLVq&_ZHWe-Td<EQY&(^z>8<my?~w;?sM(UV(T3CE>tYd&*v7<2<}y~sJMg`
z6j5jEb0v2ul}@4p*8B1mp2!za31bH)>gLwgmh#0?;IUGEIagfG&FM$^z)GoQ9e-AI
zGd#0mx?mevz>9^sduQk8*GxAk<xH1(J_FO)e10vrV(L~4#nR$h4llpN8a(ga<)})t
zrrr7qMmCQYs9nsJ3ace|uCSKVFS&$$!_<o?0istIIcVJ(77KoUopFv0+22?!td!OY
zT+1z9Px{?ht8+^jE^c9&^QGK!o-f2T!UR_V{kpJVDlwe7mesYKUULk~o63Uk<%N}n
z{9<`>dH|?GUzn>`beCsW)@M1YKy86jLJdybT+8{~?0jym1#7Zq7G)tfTUy8Bp%LCc
zXO;k}l#7M6d3;6LEYDJ5WwB+xuzqgIR1{0==lBqifLylpblKAU>>9^rK?r@YA(qB<
z7cfDWZw3sx?(#WqPOe<R+dW<7av99#3Wa;~IX7Q$7uHwiO8LTy*N;G;yOdwnFF_Gg
z#WkjLaW-FCSejil?`M&3TsP%~rNXQQLVew$4~^~#8#%k=8(pw$j|Y}&j$3X=ErtBN
zEfS9W*K4-GVdEV5j={=&ewN^kf#>Ws0qVX|u6S{FRS!zeO-p_r(4Q~er|XKjrQ94>
z6=U(Pd2I8G#azk0&~<}lbErAM3j#LSlhxAsHGqhn8~V@{EyeS*0Mi!V4S&;?*|oXz
zdMvX`Tqay&7J4gMWNFPvOKbTR0*Bn?HMc@H#Y*5BP@`8y*I%5&i*bpOSTy0g+*YoI
z>xFHw!(6tUn=KMrb-CcJyxvFMzjw{$D$SoO$g##CW{bs~oDmF;2p`RHh~~He`bm(E
zc>`udSAw9p{u)*03l}_uCwE=*B;RA3W6a()2FdHZ99MHg7kEc^T3`!<7qMA_3(=n)
z2~vt-hjSMI8i75>?%J|#il^{ZEj^KioV%P`UN7aBxEDJ)-D3VyNBKm_8u^O(WgH_k
z#p)a6D)XK#T|y;$=JPJlt1jkNF67q=E6W&`Id{OH3oEmNU(wv4VtGnGu`st%A|Aui
z+e@^QOU)*jW&&%t#z24t&zVC49BmNvJjk)@o0xeMaT%V{SW?&NqIZoK6L+H)n7gn*
zf`|)CD|HF>&QNXS%#9?Q@Bn%opSE$-mhY*^Z>ieIzYOnDk*Cy_HWhtJRlz$s+NKWj
zCvpneXK@iOseb%K{;7)SQtiLK!UMhD1ABuH{0bk~i8}PXg+DK-Yx$G&x_m-~7o*!_
zGiu*UD)KSh3s=9O>Z@b_US0FN+ICr8^OCCjirTfL4n`UqRQR^4Nc4YspjAaiW0zyQ
z<iB0`)&JYA!m(Y^JttJ{S5)K~Rnw-b(Q)0Vl2NV6K*BTLtKX}_XX3Gv+V(wF{g+Bb
zZrr5~KB1zgBYnG7&1cn)*kQHpv#K`sjH*8WO;z>f$jEM0`?RWw9Z}WKsN+wlEzhVu
zv6oob^|-2eLEU^-sozoCBB|Y~Dtak)LLH1292HqtRqs>P(S43mB^4c2kx0`X75+x-
zp4dYw@;53L>Dr}K?2NkpFU-s9e^b@qr7x;&T#tukbHdNS5dx*ZQur(*@1pyiAGcR6
zA5&3GIMT8UHy%>gjj1X<ji))f`hTu=J*_a0nupZhXzfF)>KWA#Tf|H*tDR4)-2jT6
zx2YX3D%4d!9*OT(J3c2jx8pu;#%@<NC4mjB-T{sR@2JYCt+9`&x@Xi8)|8aPpJ&vy
zm(YFms}IIb^eYwJcOrW5gxV3k1HE7`BGGS!zOBM9#oqg(s{7B_aJ4OVR@J?z)CJXm
zrF{s)cT^X9I;ooY^N>2qAJkT*&-mK(>Nd@~jWlJON1j*F&!f}#D7CJ(VL!g0YR{_J
zi>f*uY1*asKdtuSW$W#c!@Jb>=+aYI{i6*S=l-YE_GeY~f;xbg#2VDr7uEL1RrTMg
zstah!*4utVsRgy`E)_-G*iE}5hj**`&#CRPM^*K@Y;2mXc79eVbj4AfQ8)AFGb(zQ
zz4L;KN~_4{Vr}?eJK*-ENZT&_U)4w2cC(u;XH@M499wo!{bHmBBFlaBM+`9`>;@s`
zt&~%Ia#E1H2g3FVDZXYw3B`ZIC!YAL-(J}yFcK@6Xg1o1rRRuBL54Ly43um64dDv>
zG6?8_VGeOchJgYEXkt(X@t6ru+(3<G6VRG0J(-U(S>+R!JmXqOEC_`sI^ZEgMU3cx
zMn2I2aO8;&WE^~=gGd!V(Sf2wP;lW$0s;YE<~0$4z;x*s2>exgHk$|p{zk5F7^vAy
zAi!*V0s&XZ2m}yxd;$TQ1qA|BSb;!z0muegAdr#~2<(0ufdJ3=1OjYhP#};BpFkkx
z@}6HwAh6(I_!SwL5eP6Hzd(Rn0f7KzPayEF%K`zaG_xqYl#C`=wy=t&DhmW?=NAZ2
z^aKKPt=Uo(2*qWA0I&0e01Vu_M!ioUKxu<OfI3eIz*xL%)|UkWyf7dT$N=Pol?8%e
zoe|+t>1X``bLe<;DFHY9O|6uGYEMeQ)G%*=q9-MAm{v-VyH-ljO|_H&P}0k#Pra2A
z@T@)EzDtOxwssSN!Ac2QfG|=5Ch18Ac&~RYgR@cs2H{BwXk?@WoGdFPpwcqj9K6LV
z+?1e{z=f8(R!ZP|Y;&^N_BtRKm>Xr0VWX6Qjs=CGKRXhN6MiWHO#~3Elz_5ril?-c
zfYAq~1c(}bE*}jmxO~+3xO|ih*0fwcYK-h4t05{HQ%5NRmzMMAyNx_xYds-=6;=qq
zF8hQ4VB5`v0E}Dn?ceJpsR7^K90mgu`OoADs-Gfr{9P4^?NVE#wUL{{WLz6mlVx11
zKc}j|=_zn>KU6n=TJwR?x~It0Mt<whE=P{+QQOahb(>K;pHN4iQtE_eS4LH*VOQ=0
z&%0lR*AYMv4qpfC7zy#K4DW2_LpUL;f_=->nVYc;xl53$=eTh={y$MJ!9#szG-nX9
z)l>r5EE)k!zGl_|hi%2GR}9Fbr&nYs@M51@5HH)v1#I9P%2Z!#BK;))P2>x}@VrL8
z08s8z6iVr5i!Z=Zy<_+SjOEAU3;cBbr+@_jvu)2V=#9Mu`jtz|0_gHv!2)poUk3|7
z8fjSoa@4riRNvgt1>Sj0EC6-(KQ;?M?gOZ8omw&=G<|JNR9>AsTPiH)=Pb{RR{52M
zLPgQX7@7uPyH|>$F4SG~AyXG@?BiPSsOUZ`pPdzRfj1(u1sX-N>(Z>z+=J~|FW=(+
zfLG>S^I*}nh6H&`$c3*79!E*D!R85Vcaojfs!=>hfdjWu`D3$Bnvc1=hr|$k%pZrS
z$El9)hYAK7q%EVW=4qv#QCrsUSJy$GRWqZuJZ7()Us2bPA$b&kqScQos+XeGGg4XA
zD#fZ9)J3XWwYsDl`jYDCt`d5#`5V08go>V3Tc}p4{=8Bzsv78Xpy(WBr`3@o4b_pK
z+YM#TOR5I?Kd7TdQMers^z4RaXge6ASgX2$jlZp~=g(7W>m#c63##TzYU|}6sye6#
zYoN8;b6IVn<|(#YcYH^KidBmWv82>!<S;daRk81>t>05s(CGagG+cK?diFG_t;eW%
zjeG>ESiI6|e5#}MC3II4-Ni-n=I~p1?2Ot1E#75SgF9!{Rz9`mX=u7?zAH*U#rHp~
z$ND=e`XyBh-RB~BxffLRqFGC=0$qSsOV#mts0E`xfL8AA*hA49XE2{{`d_LdPgA}5
zJ+7XtC7!6g<QwXi5|y&me~pJ^JENafhrp@crsCgMkr&mTXTYao!z1l`*ah}Gasf)p
zo1y5!prGlhzF?p9>Ab3=+t5d>^QG@8FsVBh(PTkIFQ_fgtH@_rh1sasNmX-JRX>OQ
zjQvnWpXc4`i%LDmx3IJ5k3uJV)Q+zz^*yx(4`8BstXaoC75z@66Z7~ol*!Q@tr&Rq
z-=e=6sOA8g+o1d7WB^DHvA5{sv5!P<hORJ1W%3zSby-EDIW!7KuXBV!z5xd%68cpr
zV6HOAmj5o>`!{2dSBtBVmrDy9nPf1P+Aa&R^~AaL{1R=Cw-a|*#YGvhmDMceD{{kc
zu9EF%gYA{oewVAjaBSZ_tghjys1_^BhFAGtF_gbi6_{{E!b4d@IT<KnaW_+ey+iq5
z&4T?=KanU{u-0(oOarWzH<03Oc^Drf-uevqp1TyRg3*UTAZbC2+_Fd2{Xm63fIG)k
z4Ggo<dtkMT?w^Ssg1Izy5u^}yu7}k2pQpvL102@pltO=b(9JrqJ>OPQ7mP~7=T!LN
zNCbpYo3OruC4HNSWI#Q1MT1q)uJyKRk;sM9Q?0PJkxahH#E>v<JUB>htx$(+(bGa}
z12Mv<FM35q5FJt=P0*1cs1IU|xT(Eaj$;__=5``L;Z=Q5&@SY<!Lo(0t3=YS1fa=J
zPkj>pTWTkIYn4Piyj%brU}$))$px1lQh0X|N#e7sN+QuCZB!DK^+B4K_}f9lOE_W3
z!PVba>QPnqaTR?|sjtLdx(gbE*!z_FWhfYq?NeL83fvh<?uU}2aW5QyqB|+TU#8aQ
zF!dED)gk^Yh&BLzGz+5GLG>N1epFp&?n0<vhr3pETdS&mUUWvI;&%hhz^Hl;bwuIn
z1^kVa)OFRT)t0nwFU-e>0i==8CjrcF6Z5el&LGHVSau}9qr9!mhu@Go$K8h5`xCWf
z1>Eb&WQbpZMl)9P4K(xG4OmyLQJ&j6PVl?LOXVu}@s@BOuk4xf)^Z%L?wRuL;5alX
zjSL%7x({v~f30Hc4=eQpwfzDeQLECBprP=G3o@u;^kE@!Fk9d^`W*f!lDnFx@TV^J
z5dPN2T3LY6qP99Vs=}jjRr@?0qJ6(F>9YEJd`B7-v!ve$uixkeybQk0;n8kTaTZ}|
zfNkanzB_TH;9LtT{QTe*dye3Z&)*@ppT~QlLx{Br4gQy)zL8M+?W4f~;%X`Gs}Sb5
zfChu|tu4*1-uF%v@+)cTRXJWr>rbP<=T%DjmQYeZ8~aY8s_#-cUK7#olZJ@4gsHXO
z{xJ303TAbLl(eh5=T!Z3YCB?rT2&4HjwtmPH_`B1vrZ9lE9oF&qWJTis^!lKwKby}
z9s(^C%5@FSP9*f}Zy)7)6(V&-TGl2(76H2VRuHtgReTx@k{rw~)bC1EvWq~)t~dg@
z3k5_wz9kan8<8Tt@nZPGW?o6Nz0rNqgg(DT9*yu!e2p4xaPK~kMk$p|?1B^D<z`+i
z)jP(_ONUY$22G{M&t#+n?4<T=THi(-Hyi(8-5hoKt#ES`FW=X0<~6R35=u=UMbFJK
z;Pc2kugT3(dnlIUp@qW)VEB9nt=gRy2#JS=Vhotdh7to&X*&3CEFF9y9EOup<8!dm
zwyUk{YRltvb*ej|>SJF}*L?xiht)n(x&^iMVezbreFwT#MAm-#fzm^22kgDEFCvV!
z`GDGaLR~+Dd$GImR2yx)4WmC)wcmwYcSfk=?_j2p&|VDUZPKj<=-U-(VNl`v7LdUW
zM7zzVI$519oqx+o;H#T-J<$4by#C2&;xOaqs<GG05+X;s?4$P14GP2|e#ekXvR^;*
zr1Gs*y}rAIM865kp9X}0rM=$8?Bsd9|Jc6nWoH&*cxDmdKR8R*Mkdp89g2P@6jrq#
zRpGPIEl|^bL`C4L4!`0OozI!(ATqB#&dWI}#%T0p<BtX!(=#58_mGNjY#f81`XlA`
z&<0EAAKs+%(dh2hPk=XoyK{8M3AjE(7knA18a~*jww+c;j=^*tFvsra53){-F-Jko
zxQec^qWUEMMjln&_^HA)m^Gy8Ibkl4VnRwUs&J%nAN}cLUxgpp!IP@$YpUksDgqfj
zuC|<3wIwB<^Cwj`!V;nKE~)K{3SMN}qxT_Q#f%opy-CKRwI`!nOKRuCYR5%Y3nu4w
zogo6b891buRXu-Rgkv~@9CyJ@A9DW~JFZ$+b!U|%j6j+P-Ij?#(6$;sup{5jpID2k
z8dLZ-SlxMTqF&G@>XNCBrI`#vJ&egLK8idaTV{|xVvnP0msM38%-n6rg#tx7vS>uN
zo>1GqMh!$YQej{mT#)dn_Rx<N>2U7ETF$8EMYZQqWSxKmo%UTn6Rp!`Zlr@4RdsNE
ze{M9ovt&SF_YYJ|rcFO}=50ZjGOPbvHI1qpUY1^HEZ6SuKTzA&X;Rqsaih-vj;iIL
z!=q8CL}JgYsyJ7>dQH{dsj5tx4|vzJV=Ob?50m-{wXdX*SOsCFb=Z;U`V3)1R{EsQ
zQi2|$+fGUb3vHQ~v1_CK^{r}KbPFCp5eE92c2)hdx`{71sh9X=wPRH6`+_?9vZ{Mo
zwc<`>NN>?eovr0%)riXIN%04WzO1T#Ln&lud0JinoT__NMfzjw&#OIOQ`_K#00?H+
zm-Xquc4D3ParH~8@pn|+nA(pt9Jf6md+GUT9-$-AYfh-h4Kdm^a2AyMGlZi^TnZ-e
zv5%;lA1HM{API+pIZ-eZgkE6(@%MQZi>}~kMEA9FRn%F)%tNXk&JU;INO=rV4$&2a
z>oW1n^)IS}P%oWO)jvS&#lHenB97o9TqSD%QmXEkn#czvEMfG6DniT24+eh1zc}Xz
z$JqJTs`gpLxill|4CZ)7?Zd7-gE)~z9oZ4xIm1Aj-6bW~976Gyud!*vN7W%r0$cKo
zN??f*g(E3xytBywV(by3G>-AARjbv!+dOj-xlM4zZAmLMBm0CjK{QFBXEmk|GFA>1
z06`pNbp`^>+N0R49S*VKwy!dlhPhp^j0jsnL!3RPTdBga;$u2b4dQEf!m;z=bR-H9
z<wGEHk<bMGUKLmT!5R=KA+H*@&bL}W_v*sQhNz3HQiezsS4~mOMF@S!8%#ZNRm#Bf
zU6nFG5<KTtYVO`Z+_5o5oi&X9O)y3M3#Gn8UZOU-W8n+dnzRQWc8!Gg0Px->0<=Lf
z6;zE`4O&30_7>>NLdBJLMo^&NglqNGug4Q{nd&{A=Bv_wqCzL;w-vp$<|9DgYcwB;
za@<Rlh0@Q0$q37HRSzclkZE%8W!wG920gr!93<YQZihrhElBc<toFc3*i9rcTW%@l
zm*-pNb8}0`v8@wi<IVzNt(jxFWjR+mU!Vb>m8&gCInE5iy09=SY12_)&&WKzR7AS&
zrB$Ts_EKK!#@_vkV#~@bV%je1+I(?tcBQbAM@PCsQ_&?$MW02!T8P@wElA#QLDfF2
z4nLx<c}O*!RM(HH`WHtZhTuAi><f!(*GaYGHg%BdC@7^ww-A9oih3zXoGto?P9Bj(
zEoZh$I@8FlD%>7>FGC_Du^W*K6w!~!sfBC|i?NR)`NHAvoA=d^KKv%V?5KGe#M!^5
z;s{9_kgnkH)9RXUtA>&~u&(M~SpUvtWRXP<37F;%mh>Ds7l^e|ScJTOsbUhQU+shW
zSbWpFpHO?AS4T_gdL(aHP}?R{HF6%D5Rn<0nQfOi`E6KlC=jcmAlY?U9e70TglEip
z^tKa7Nwx=}9y4&biC+7Riov7iE+*Kxb}aT6s_NUwnD7|0(;`J|%~PuG5mo)~71GJp
z{)MXf2%BwrL~TQ6+sHZ5+M!~1HLQ@3>dkXikEp)8RP$5nhUe5ZaNv1P9eZ9~iyR!!
zQ8%~cDRltpflF$~H=(RViU}mjcuZ}NbRp5iWwoQ7e`6h5v54rD$J8#QKvVk>uL2E?
zdQ2hdZ2e<u=i_PzbT)`aL2K0fn%ep`?El@nc?St48Xmv=HU6T6x$n3~)geL!bGS>z
zFr_g--iy@fG(5uDZ+!wPl=P-g*5}o>CsZBcR^&aq9#cDzQ{%GQ{$2og?Nh4OWrwwR
z_%YS+T^0MXs(qZ}+YWmzevZVR1+-ID#Jb2LlEtE3WF84<BWIDW;_*lZ`7GY|ihaa4
z>*I#XK9DIlIF9zm6q0cqdKT7TzUeV)tTr^-@fZ}&DsoOIKzR!L80p%r5L0Y#?20|C
zYJa}6?`@CK!#Hx!u1IRHIuOT<keUvOSXAus_SkLE7}jB5p_ht&?L#N|Y$W=1=*v~~
z5)``8uRH*)UG&f4x&;W_iTobXKm5=O??Jj@Xk<SVIibRh(Yn|S_Gfz(seB`s<(x#~
z;ywK$3W^~^H80>#9a5O!r#AM`%lLtUFNU9RHT0D)7>M3B2E`w=rvJOT77@}=R0;fT
zdtU9h-~+y0(H)Vlun)%SG!&jy+pyo44eS8SWSJgiEtG%_kE_}j)egRLROe{<E4BTs
z+KW4|x7TCgUJ-I)QL1DAjGb-Ad3fy0ssZrnbtuYvbohwc#}1)Li+xn|`kxKH*tR>X
z+riLp+I{Xdw)@T3(ym2q{k&37LQ!1n@J!Z5|HA_?3n0@D(ve3$i@k&vaMxKCeG%$Y
zgrILnvS!@fEuhyX%Ob_Q0ExG;xDUlW5wEBlpTpfzwY^;(z}<c3F7x<AqDy!KR~m}w
z=zd@gV$TS(2Hl4V|DoEth!cHMZAI1~Xq9hAU0OxYqpnRwF5()#^mjl(&D=uYQ4P?R
zendrEP&*o1ioB;mZHEK@7cQvCC7_V*y{Zn_4Z<Hh5U+3n=L}>M(svbsUG$y6!yIDl
zWe~Q~<HQfo=%hVQiQOQANV!EN$*+G^wXr4Aj_Wx|Al*rwrh7r(dq&?oq3YNK367*|
zWRAQ@gkSefV5}EZgfm5Y98euVXXFK7imK=oj#W6?;>6wy%`}WYfu#V+itXx;3?b9e
zOHjl|cRxh{tho#-@vKrW%c6ZlUBf&+$mmVrjv#9he|`aPydAsx8Es!kp)OiYiU!XT
z8e!1G4BaocFdNh(s6eKsqoNpweF4QPB!H?)V@T`YS3Az*_lEWL^>5I^Q5WgjtFNjf
z&AU}yboUF`2V};KoX6;YsCIn|eLqcPUyWn8A0&#e0F44QID}({V&v#P)O>?A(b(s~
z5qt{^k%p0FXS5H%@&zOZnu#34q>(`X^D?%l)L~MsYmiP|^CYp8WJ=VXTn`Wq{AKPb
zy&{*@w*Qk9R`%kuR&xIW#&$bS=4aG)u~EsqKk^Fm9{uVl{;6$9okayc^3JQ#Xg!P@
zk72yXUWMEYKNc?p;|1UnCofup@obyR0XLjcM;=nw!d9?wmdTg4=ee!Bkz5@oqpJOR
z0(tZsYA2#pU?g}6Yj{S*9*gD|06_<yimR<JF;@glkWkC_@dp+q{(!px%K+x%ELk0<
zR%%yUfWk^@?=!Gp#qol};E7)EK<f2Cz5(QXCVCG?vHz(~Dr1oMbj#%nG%vz#5I)nX
zYQM$JW_l=iNul;OI7)Q_K<;A54Z#j<5Ui3_=P{>Us%k-P2bi1a9N2`;sJ-|;03Z_`
zEU5j!TsTy8lX{#-fR7-Kr)HE7hCijY-lFQf;^PUWUgDVx#~u^QL>N+MBs6x_GJy|o
z1uPS=OF$3v&P;<*so}Y5mbhw`@Jta`%@TBi!76wvRfGh1*QkFdDS+Nx&QG2*r&g#m
zf7F;Ij==rsiy%2?y5YJ&&ls?m;BUl@2&7w+W;$j-X~Ogj&@j|OX!tIjgll3ypv#hW
z8>z)}^qr`SA;}^>a8nm)+4JSEeDy2*|5v_@|Gx5N{Ih@cFKh4*_rCh6Cm#Q`Cmw(N
z@h6^m;#0q-YVi#q?M+<3Hbg?B?+_xm!R%p;Bd=mZ`N@Af(R{n$&2SsGadFUq>a%%;
z&yc)|kse#-0Ui1YrUoZj1bb6q|6ol;0w{l#dB$tueL2wINPj2<HLvSKA(-nMwSQo4
zUWUqdvn}KuV*nvsUUdX6$INUJK4W8MbU4kAA2WmF^eS1O1X;ZHm>G<j=)?Q`pcgY^
zZoH|f!8Hl|qhB;6@auHJg(4sv`$Gv}c>=SIgm&WZJKG??f#$qT;hkQLq>kw{Bx!jW
z34M{&7f4$(d2edApf_Td_r|U1$_5%$L7V{LmBh(kg--m%@@H)WwQuMvRxq-^M&cyV
zoSt!BN<SOK$?_~%s&@>XqNl{qJe}HvIN_wgrAkh$rBk~6<I^dewO652+?3bx35$1P
zZ0AZ4wU&a#)nG9fEM5v0D@(vyOGx{varSCq6<@IOV{uJ5e`E>g-Y`DV`Y_~1SmPVo
zt0HPk^mp*F&Z;o1asT$dZ_^}GH;s^lWh#4X&r5^PRKJM(wb8%24}RUZ;o15bbuG*u
zCAs?#>N&al&<BvlA9tr!?dPF@6|#J{kmZlQeUjh?db_!w+=gavKUqH4_h!h68-+t%
z1L&cmYi(`CB;F@H;bguBdC_KvzysLrs}z}WY5F_tKebxas}$MtCh(DZyIkk4QenEQ
z*|Z5i!3+z3^og<_6`w1Q{UR-&HSj7?)v$O@TRrL#Z8+p-&yh?BDpmx`J**nesvU4E
zXr<=>LIU?jx8mbA3u-Suylwv8ruHs+1$_fm_oz08VZu<qcT{!7{#3QK=~CO~rJC3u
z!__7_3qyJICiq{$^(HXL+7DP7yP!R;9SAz+GUQH-XHzroWa6}&9UmA?PGy~VdwXB|
z@pLFPF)=wlHSP|Kk4-rKLTF-YWHOl;bVrkCvhG-7#_4E}ccJ~M9(OW1a4I=D>BPI`
zW4sQ($m@W~2)P4U^R839ZhE4p%N-vY!ryc%Gc}Y>3}>BJZ4q*_1MYyEPGpAN?DWLM
z_~ev3>1<Yse5mLaBje~?*J;1hyAb@Bwll|=(-ZxvVGajBPD@9~?N3c*-KjGZ$*k+T
z>G6R?I_XZiBTh#My}Cn*>{OpSmdZGdEgkW`rcib&F*TiaPbX4Ssm!p`7;g%tO1b6K
ze5yEI%*{9ErO@Ow+N+k^msa(cyPO+tbJ7#;bY?o69Bjg{CzFFItl?DS%ItEkd3CK&
zDm2e7<!6gc4r?5#aa?z{xQux<x^5ynmPw2yoyH@&89Ko#nGKqlOpUwTMf4~OwMji0
z=K1y2*+#qq{UOs)Zl#zntg!q09r@&$Gk1P=&AHii3rq9PZ4O%A>J*Vi$Sq+cOSu))
zp%Y%91GH&!ZgDzp^`2VEUF1``lO{YOANt9!xH^$KAHuWf(`&MvTSkgR-oUfXcmkdB
zM>qRW6Ibfu?Ai)_TJ=gT7Z;lxj(u&#xlqW@>(y%HCEqw(g0kY=I;gGNh+g?@er1tU
z<b&*Q$~`?Wl9+Tt$;|lFbS8DD)7xWVWGp!bC>X^l$qXmG^WnNf=_vuQj&?UWGlAYZ
z`!KNKfdO|UaVqI%&tw3vI6aBUq?^bLy4ll-3AYbm<YHUaa<lX9x!Gb4uC<pEOGrpw
zc}B`m>81xaJtG@i;QfHn&bT|!IM?LNp}0Cb-;SOM&H}>8bgI8UG3fN6mN&cclbsqI
za8t<vx4Zkej5wX-vFQjUrrgZ9o0y&&9~+;}OyMB)wYP_o6UoUj+#PZggP2t^k<J1I
zbarYKfV#<{acQ9OO=D|oYZH-<DZ1{dgqxTg#_6&rWjI$#=T`8kI(tMa4huIvlmx!P
zI^%o{Ivw%e&`f%K*v%worp8l)7+Gg~C^3ZDC#ReV{i_$lO($_;fofg>7r0`YbW50Y
zXkyTvNDL?4q3LuQ-L|*4$GJqQv&r!x4I}^-*XaszYu$m#fzCJ<du^_n-;8hONHKVn
zn*j8pQd(X$S<!TLv(w0yP55PF0AJQk4W%=q4&I`#Dge&|Iov)yJUQVs@?ttW>D+$1
z(}^{l5WG7y2{ejf68UBa#@wmYSQ5Ai*gHF3E&xAF4fc<ZrvVKiEYv_EHRN>m$|t*f
zx<UqWvuK6QC(Z*39B|Q77K@T~&~-97>JBER5>BV!nwjy5WCjCBr1?TYGr=mrNj5n(
zkpci_&k|7B=SnU&5A1qj7T-iKYVt7CpY3*2=?OqDkTL!OQZ-rF^bTH*$xdO%^qB)D
z#EDD*lTS}gOap@xfQN>73Y~FRe(kzWPGy1dNx67n2b0)BB6M?#8(WTdX*3#lC;GE)
zVqk#4Pux!2T3P9Jnj8jn%7OtgbjB_5>1gde-rIA$v!k=CuP5Hw(cRnL*U{A-@9m0r
z_I7skws&=Q_vUWtZ12OIFJU~>IKco{eS)1MN1SO;D<XBEjZ-q}KuPG&44&d40JQc^
zhuri~7Cm62Q{xyAc6oAUa$>-dBdV!7skKMdS<t&QewwS`0Jm|^bSVVvn=6)P^7DiJ
zcCuL@xIs5Hh=WhPY3T!@BuLm~*6HgFrQMmS%%CH1lFSSWS_7Fqof@1P0i0x~CQ}1b
zcsiLmrMLG=Hw`5oN{=U|G6Yx7J3E1mB09#ifnum1x{Z5Dbx(m5&NrUZMV`iUg~C#k
zGf@B|l3T01-xNBX8p=+MPfR2S8!3{lgA%$JuB`;Y)3xQpp_v1oj-f;vg8|@y8%j<H
zNOBuGOk3}$b%e4wa(Wx$y>XNPcT<Uh5x0M0hzHb^kT+;5Ice7nOXQF*!V2M1_c@%L
zN(~Xoa_v3DG%*gSNlpUj`$GBR;@a%$c~St^Ma%_76yP3U95<DfTO^>GTDtj_rTj|H
z(oVG3<p9mZ61py|+~y|mb`x>!OuU<@7C-2Y>y;YLfLAK7Vq;??L(FsY!V7RG>{2>M
zu}ZxQ!YMQnjdkF!)7}(Xop+atE3#OF0|OIIyyrMN-!zAk3!4LmKF?(UB&3t4l4%e=
ztSJx;@!+PlJPq08!1QEl>WtH2*w&HpiF9f#HRW{2LleX9U<znN))Vt_@0?Cfo^g)7
z{}=|svuQa%!&)vZgi<RAq{+{ZFDw*urOf*Bx!fA&>Mu5NZuq4AdNBu<YQ6|kV@Gd~
zl%F?CWk!-1pojQ1ZbH>qGBcG(H_pmhHj;;$pI>uJO{i&-HO&D7@E@qpZCGNjXs|?L
zYHAXoG%$H)0!NULN`h{s05J6cq0@LXe)Ji$(tywfK^j<XvN)$F!ORfDU_p2^4H`{O
zWK-$!43TfLrDt%+_T8VDoJ<ix1Ue)x05{Krgk?{Sr~8Khqx>OO;iqrc3QO*V95NLY
zRzhpJ`9dg}NVzz3gX5VbnDzEfZ@);Agv$|r_}DzRi!@<4nZd#VaR6te#<LtOj@#tG
zss8C99F~4z?NA9SI+z?vOp~d@pVR=j26t*Ijj;`kWg&7*BnFZ~vYke$NTdPpnuvqp
z^;eNOE-Vml&}Qn)fyq2SzeZ9o%V^~%VbASW`_Hlgg|X?W<c#dt6xfYahKJtmpBzsN
za%_OG^8RBdt$;%~TL){rKpsoTg(tA-enn<OBOKSH&=Y-HK>RgJn>+=eA4^V+j1OWP
z+cj55!ERy{L~v-_={g=78OP#qq2&d9lrkNbsyKTghrF-3xqFN2%NVmekWPSgcm;?B
z0LniNv6}E=txsnRe#aFCcCJutZ(J>{`Q?YgxqA@m=5n<i-fQL|lNIJModBdyVMIa)
zRth0(-zfRI{^{Y#B!x%kFp^-rf2rl%`eN}8av{Z5YZQF{Qfpyt@vz{PxC@Ckn;O^S
z@k-_%83e-|$#^w)U}`XKP(@eRq0q$aVs2!%IJB0_rGd5#H7Yi)qF#=(G=Nl)!z6WW
zQ_JnZhw_bQ9BweCBZ;g#G?`3NY))pILWIj9NTTK!I*?1H@%%i_#9$f-O=jr{qm#M$
z{2CVQRYX$Y4Tz*x5G8l1IfhB_Rpfyx%CZjGGuf%+m~e&tiLA!F6?JlOhZ19{^ckn6
zhvegQB15K-B*a`|z0!mI06w5rPpIlT-WlgYmNBoKa@Sdhoazx5&Azykr`-`WB$wF}
z8c7Y0K$dqJDYodvngH`EtHlfSN`{+=!D;X4A{2&_gFu4mQKyqMjq(*Zm{cO;balpY
zLbbMJ8sZy<2yV>jfRG6K9Ar@Wm|tSM;$7Pz5drBD2-Zk)a2lIC?BPkEyUg?$F&NJr
zF?M@<R|pKUATl>KV+DelvOwSs$*<aQ2g)*}jA)aalz#E{QKzSucy)BL-_1^*!a2uT
zF<<k;p_qU?fE`L@z-Bvv64org(&MfRP?%YP5`$A5m(vlf3TRq@hRb>ukvJ0B>B%Gs
z9^RH6bs9UI6DJ%ajd=Hn?FboG$TP6v>79TgnHJlHh@oN2N1$5U`?})o@s8ejS6_E`
zdskN{V4xSQIZ#qx&++439lag7mJaB`e7Y&>i@;mIg7#|j7U0T6{&klc=bFKdk~1?E
zz~p4k8keYuaECL~15nx|CzE$JF3;Z63{MN92Xn*aB6ReSaWXX!T*k@w-3}cKA80DK
zXymi(bCK?H?%eU?Kqk3Mh{$zkFU;nbX3s6<fFW@Rpd%U)v2v2&q{T|tEfr>qbNRd-
zawA4aHH@i-zD7Fd`4Q3tlFa1?oi45~l`L13T|YOqHoH>PpS(}aW0aLunmf{m%h0~E
zZZY0RdYw(Cp~P~bTO>a=opidoL%>HMK$-E==(1yp)9wV{)8ce2HId5`h{E(B$7aur
z5v<*zad7LWafq<dfa3%N5{&uu_#~u>!)>|JTpP5=@W!~ecvx7_DQGt0ec;whYng%7
z`x@tqrDlBLT1)x!1qTW;QH?n+?oi2vA1Jc2q)0|HCLF`9P9E%l)R4=?b6^ctpyO)f
zgV=g>%B7`3;8yAEE?&2EybqlodvE)(Tb&Q#e;3c^v0;tCWnfk&#-^qRQ($}n{>jN9
zFehLVG_7%G2Btl#Wo7!Q$*J)nEQcVytcZ(5d6ls3(<%87hTfYBWvRg^p#hS6p=V`e
z7&s(;F}~sk_ZBP;1))ibkfPNcpiI;az)S%tK#w~K{CuhJcsp@2CHDmMQQ+vzS*Lp0
zFh^hvi6yh{FsK#uBtZ_zFvA_16i{D__CgKJb!eE_cAO<F)Kadar&KuKc!`H~9dGIA
zar5&z7|l!2O<rmW@%_RC_H;GMFBqrhbTT8<oDcML;ZiDY!B}h)0a0PsA82g!npNJ0
zzH52_SS{mpbo6?RR6jAV)6?Az@+J!RGOG$T4jO`OmPFAVJ$l?$*IKR&0l7}$G);_<
zg%*736}i3?ouF&r<Zw5O!!alY%~^qcf`lr39dtP!X9;Z)6r@y&cl31j8IXfw1tRZM
zdQi?N*l5kq4*`hrqM<Bg-2tHP^1IGtK?Bn6Gz5rY{Z0$d6T@UKf$DwgAp@R=Tb4WH
zph`Rk<oUFo!F9ltqBxiWrPRv$<MDWBXKx&wL0@-QZ*O;BdvCf0z;K%MLz=X&%gNFr
zL3Vt45@!lJUE-4T07OwiC6F>^fJkspG`xUJh%{8DV!EPbL~nxtQb<6;=A`<`;iRBC
zafb%M4o^1DH$zM$kZAQZQ2WqeR;&&{1h{jTv`nVKxy(1c&pCRV^MKT7GB{5;VvgLk
zu)0tOOioibhxIAu7MB5^%tsm;6Zz#N)B=!vI1_Uwy1=oNDJB$OcD=x5m6aLOnJl#7
zNu%;6z9h#6Tsb*D#_8y*5CrZK3=>DyY?N8FUMz@@iFC-ofyyLjk^|tha5G;FP!<wH
zQ-NhE6GO08T9xIg0?S1hF5G2&06jF=kNMEdF*rRi1wK7Qo_i3Y%`_DDgBULX$73I{
zMFwvOkKi$p#3Bp8{IYmS(zL0}xG^(1$Gbwsd-JOcgj~FKmap|>8&?=JFqu{Y1F!x-
z`7V~KNf3*mm3Bgt9K9*>m`oGxiBVnc$FVt92-QnTg@_0Mm;+EVRIpPJe<64ZV<a$U
zxKdHW5<A4XysPUgb0w^#!=(*sfaI2FowLw7H!c_EA%DpJWVte6B3MI0Pwp5^z(D}Y
z#wVs=8vsm*QuvDIJcH{E!KkqH^=Cta87fqvPR7B(x#2(IK!#F~EcGjKC3^@lE$wDA
zU>ac~lwA~v7kyx7oE20=*h4<$gup0^eo#dEY;r7tg@(cr)`Gs!Ab3PDw)!XJx(l6g
zV#rHSu;LOG!~(SK$r-+q)leJuvboYA1hulU!m*b=)1HV?lqNvljR{ch78Vvj)Y~g+
zLG4-VkXzF-5Fkifm0)HBP6eU@H#Z6O5!4~~8EOkb<RVCtOM`-@2O_6~uK{!5^t6Wv
z_V$cBGR+~`OGww$s{v(UZG@ZG@9KeOS{qh|CQiGsh|%Ij@rhO{mz+^<4R)NuTn=A&
zDS9&XRAN%JE}oR|{-whf2rc(03@@S5UGkrCD^4b}Wh*nTA<qHkd`1Io5qO=rJ591=
zub|a}0tzB9PSH426}$l^#YZKiZLlh2PuRpTk2y>Tu(^W$D=n`cHo`LG^el4dKr**m
zhavknk=q5^EtPtB<=CCBcls*Iq0FS*(>8JsjRUcxM?m2`obrnIqR=s&CI{QdJ2yG5
zn;9CU9h%6VR-82OJLm~?zhlWkNa84w=0M06l1VBpI!{u)VQL))Nu!_ugH|5=(0B%7
zx7Ltr{bWE3uB$jAttoG86kWPkEh-dS!nBRP)87b$Q|qS^nKN7~PcvCQ=3Fe7xW%vh
z%#H(P4UA{NIN*>{kAcEa3g$!KwJ?c58WU5}5|%zH(}7q;R*m;kM`g?!gJmd64Nj2P
zG14=Zv|Jz(pU+(gO-+Jr27QDwh(v<;j<5g;n|KK|K=e$K<SA>#n3Bp4W}Hqi+ZxG1
zFo)HoY}hbx=r`7wYret?ORye}jb|E*vp7!BV*?f8a$$BkzjPn$>RvU`lT><BgsvK>
zIP2Ico3cLsuv`n>M<vi+Pv|1XzJ#xhEQKIXj=>SY=^_26sT*j)5W)ZrgsN@nUU-Iz
z#BA_`Pi9WGbaoD=LvUcg5kOBwFyM<#PsQ_)wabbb(o`7F$0yHlSE0-V1sO>~4GaAS
zo)!DKAI-(P2_izOH<RZB;{>C-0`6S%xdE=1n?3+RFEK^!WKeTp>YGB4I$-dDnX27z
zZzLsFt`H(igFSEqxGO3|4SaB(DV(XX#&heiUf{ssr{#9CA^>d2p1}&i=00_SG2??}
z?dc-RnoUA~XY4rUBxgP5*TH=j${ctyG34V(deH%imMm4D$H&2>so@f&dT@F|1crDw
zMJHedYy((LkK>*;KtQqKe5tUq*=mX%5B3sSfwCh7EEzE1VvOY~xKjY7=^LUMAS0%2
z!Pt5%HVyX%!@}Zk^^q2_5^ZQ#8A~V0jRSWJ%w(s>9L^Z(5gLXk696hO`5|{~Gy_F7
zbhJ2kS<rjnO0Y*-?BaImqWpycUs{0nQYsT*vqOxJ(iVJieT|R_r-xJq$R$ZGo;pEb
zU?|_Al5zTuhm!q6fM%zg6x8U3(e7|xYrM4s+*e0m2iy}5hlV_wKrJCDlb;*XgvnoH
z_pPj3<F-*NjK!kt1lGQ+F1GG8Mt^*ZgU9q#seKq<C~K1k>1}|yYN!D0x*7b@o?G#5
z$dOh_PmBt5y|6&!2lSVmNkMfoZMe12#9Hpc6!0On*Qr4~=7FA;uf!P&T<RH|JCGwW
zM!M!Ioq{DLX_Tq1OL$v>IaW-M0T60zwDlM=SmkUc;lyE-Gs^qv$+UAARGL0h8q-1S
z?QYki5Wqs1{Z#f`zSM~S<bd+$gwyA|2k<2-M7bkcvc4wg=PGKU`s%}*lc&e0$%X-2
zJIMsp0vWAfhW5b$;v#0iCH5a8om|7*C|+=2&S0vbeCtCf;#!1LCwwXO_YjDL?Vo}l
z22QvGX&zRpsX-7Dbfd{zYH|=$7j!+0nHfC~4Uz!HNvh)A5pA%8qCYK)(gejHM=yp5
z->|Qxp+UjPu(Be#c<t2z)@eY%o1j^#b=#!Q10Xn5j7*MatX|g|^pfx@8Uk0Dp$cW_
zPLN2bsfj+}e?gfK<P<`KIlk5e=9yNmcq>#C8z(6?4L#7xbjAZ(8KAI$Rt9tljzoop
zMyNSw?*j<%F<RqHxlMf6SYc}JYJVSg93WNJrBE*fD;LjDs+d?>I+ZWxp(}k|wi;?O
zPNlM`{&W(;77YBr8ML7!;X?x!O*;-zi7#7&oq}!?9Gl=-eN+?lUExxX!@Uq1bi2}E
zH>pS$%-$LA?E&KQkRUBX(=*s`?OZ35r*voaK)PBl3assmG_K|s?(<A`MZ^;1vE=89
zg(dpSQ5<fh+)MXEET)snl3fPSPc)Y6xr7j1#3PH05XD6CNB}wkcc#55Q`xaQHQ6Mm
zr75YY5pi!iLm2|zSK^ZD7vFH2n7Ptj%#~nDC4vqT#<H6!`NF9bTwUV59U}Z_>gX4N
zvID@&kbK1852|tOCR&P#?MKUm(el-PRg_jn1S~tyM%c2(@P6ID#_;}{kd}PuK9|+n
z_R}-dAH?2Ho}L*tzvHAzh7m~&+&K+*H((1g!{Bgd&Ins9S0PK%XgLQH204yvO0Ewz
zmM#IVoWm2^c9XpmY)l+{Z*IvctT>nYdfc9_mesiypyBmPEpYH?hQsIbDt3F`Id`A4
zI5+1s0*y5}LDJR?-C7IKD>xW!9TSXV`B8D(@E2rd#`>`n839?5kh~&hOTHXCFV0wE
z{SegTBf!EcQZDy1i!j_YLE^_r)rvId<|dIa`qQJEY8A_jmG#(#Lmzy2;jZhs(KgO$
zF!w6rbV3Ci19yl-e=<$G%n<ygT`ZKe#G?yE91mfbJt+aT6vn`A2s0;t=$PX|S)pC0
z#k8;K0RT~~%iuGzlVjj3=<!0KQZvq!Yhd}X_U8??z_FR}jHXl~w%{$~yAo5BibHS(
zAVWq4ha90|3Ce2?tK>b(+I|hK!8&QT!FnkzH`JQfV*kh}YOc?yuy`oN%`o2EM++ky
zFTnGPHlI;Lz%71>5UxaU!IfnK_8@o8e?B{b7zZevS)o^)8v!2TYYTM;1|XP7_(nlP
zVEd*ub*0jYk5ZFGdnjH(PgtYhqJ$#OC{)-2hv-RFZoW8>K(Pp}!099tsNF3ryGMK4
zjJ2nUc~;xgpg;vthJlF+RDGrUhga74!0~Rb8)c%Nt{E}(!`C8}p;cIT&@pbO13&wz
zv!XB@PLKCPEGd0}a2Vn~8cVg`Y)z^L%kuD`=3l8AM2*K!$2&Fg0Z@&NpP~kt*4$ID
zDkVU*9DhM1?gU;K@1ZCmI&+`q*f@0ZIEtYky-h+-cN#7_@sRe|!Ae8b*Iu@ocovU<
z%>;fN@Mn50W)s`e>&F;YJXY77HFEEhPz{5)(y>XHq%3V^a6JXyljS8dTUP$@(4&SR
z)wqe(fXuXeezpi(_uSI@e6A7wlg9e0;AhTCs}zAceFE;)uVj@1?#dUJ3KyYY^I3>Z
z1yG!)6t}1~G2)aj=~7PTDT0u|_Tco`gy(GMcf0e!ep5&=lh6?(DQjM-k*aaMnm*|O
z0){AQaDpdBhhc;Y!K*!+q7xWBjYjB+t7*3wR@t251x`bD7|y9A`f>%V0{3FzN9iqK
zTK6gL{O$_6*Ps^98XvJwpzc}%iFb7McJ+1kboJ7%BYf2a6ajLip|phS<1+T_LSYUH
zd_*|N$#J!_K42a!H7v}cUB(dv3pFt~Q9e}Vg6Wt&W4$%?^@ad`XGs6!5VSFkD<Lb5
z3!Fd9!?WGOjtN0h@GRsBF!x!(uI3t>X~7Vh{5kppBQD2?YyGcicj%`u*A&t&1`7@t
zJ##6d7Z_;RETMBky;ugN6w1O268bQhcPvFA-7uugk}l)P=lP|1IE&O7F%YetJ#jjD
zcXdJxZa?1H-QE+&|9X4-I(p$m)7u^IY0tILs|WEvmn_JTbs4lYZ@m8`NM2TgaX<;h
z6t&?dY2>DNsNW^B92*}*TnxD@c*c;sqRYH+S@3a?viYU+gOu^hqfG@A&OV2P32ihJ
zX@=crT!LQ}?D@Ib<q#Dt8Bi8bVCXtJ;C(hbcklXYwlup|(mH!+&R_w+TQ)d4!2=x7
zl*&Ncl!8<#s%gd;0gH%QgmS#`#U6uyMV{W+<eez2^l)2OFRnp>;^-f<apbr#iL!CZ
z8mZ`1PUx~|&p%u0?F_%L0QJu7+<D=-4bPq(dCl4{uB2y^b%1(|npyzfa|z{6+CoUE
zD1A94d`3t#WG5E1GP3|p6~<Q-mKfE-5aU<6400vl1lV<DB`uj+5Ca{S3ruzaA|mKG
zea4~JI#-y7YsSJV6$tX%a=Yh-2TbddwS!&^mVRvnfOrQEf_4ZH7^m#gpexWlY}IgL
z=YR&lV8aK;afg-)vn7~h?0E(goE=FGO@U9;p2JfG<0II}Frj8Zjn8ZhZWlB`oKKn@
z-iJewX#u^-uw`iI?8<lz=qP~DxEcuzU!+G!KvM~iA>UXn8i{EDGki0^t&y@&rJ;cz
z+o9Pl2zp*<Q-_cra5fdJdS@SaZ1Ws2|HzmiWGbQn>eJeVw6To4n?i*ZD8biqOCond
zuZmY&e?&I}pB-vuHi?i6wsLd}9WvMK=<qnKHaHU+GIdxMgjM+ZHeQ&e&!zsM8<mt#
z&{O9<P6u>EEmDJnXv_@LgTR!K<SWV;C-^sOQLFG^<l^x#YH|x}rFJ%^54s_nhF_Tk
zgA6OLY?9&ri3wrGg*!CV`DOnxQ{m?h;3Vo(JP6MPcR`**qag-(C8slqDQqe<9M%-p
z5B(bWM+|>DJK`pW2*%`-Ez+MnExIcstH3Y`%L#3m_w!HX|D6woP9-Ks5TS4uL&IV1
zjKjJiS{PFzmIY5xmbX1cw3z*<(H4DLH)cJ%w6eauxg}d}aaTZ6@eE?_V0{wbXxNgZ
z)*4H?yFy85P`bs8Z(z~-t{b*ncy7qsILDcUIIxGXU8ynHl#Qf~rSsWnbT3#Iaa#oF
zYX}E)6NQm87!t!IxqWG+8_bB*jUAtWlzJymjD(wVhcXDLfx9DFehMMlLIhBy-Y89!
z=)&VIEgtLuv_bc5auSYlQ{(VmE0zky_a6?OcE?AdEs(v+YBk@G@!-+Q0Z{V|b_UVd
z`CLr79x@k?6+#U#hjD6#+Z@;$C@nXs!;>PrTkV5Ob$D+_ds?tH6kOVOkJ=1dz#}5_
z$ns<Qh#VA-+?c_QCk<rhU{IM|Zw4ZZqJ9Jn91Tq(zQKD4AgV>Q#`(hXtoTss3+Hgl
zVFws;EOByt(hL-egcKGZt_k!Hgl?Hr9L|7-Ti?m$*}1iX4IFhE4_~Q$zUa*7V9^7M
znVUb{L@y*UiENDez>0!J14R^b7#Fi##53^fEFTZMX+Cun0Y+#ySx+#SjU9bmT|K>B
zUG2S{z3s=lyE}S1x~J1TX?lIho_Q##v8A)Um-kNNET>>r9ZWW^<>1_y2jAbwORXSx
zK1fv|LS%Ie1c9)T;S6jfzaWe*FT@3b9h$JB*`<rK_fbL;6~Bqhg{D7H9nk!v#|=S7
z1~K}S?54((wQUPB2%drz1Q_uhG?i0n3Y>2&6?Uoow8n3+Z~%0ELj?$__%_1XYMH*W
z)CXgVU?qp3mxY;Fv=bJ6GsJ|BR%l`LcOx))SRJ1B*#o;a2IOWOoB6s>&s-5bz{
z05Y)ue*bzC_P(JOkJF#9Pup&dBR#Eh^M!Q`zv;DyB5JtNrj21xfAma78i&oHlF8r$
z35zy(K(5JJB}`ZtfAN^MT>ufln^RO`j6EgV(?XrLz$0{v(5%@f&^m1}5&eSpTB(ph
zG&wRN2yjEt8>y_++C+#9YPhTvxn9&;s|lqEp2lRf;7$~-uR^f145p^Xw5Ao>7_5t)
zmZg-mup;iH3tr3ODM-;*Yd)rtwDjKG(tSb4%-p3_#AGzGlDuRyBH==Fs~in9HN@>5
zp~=%DIw%X!IYftN_-ztN1Fp$XlmkW6<oxs$1SPaz8X9UMjdv%8B^IQ+3k;{=b~@Lb
zh7+*Z8)SN^nKGq<qP>JO3maWwl_SG{cpue@R8@75(W7TX5kuFhfSg982au#dE?8(0
zuboC<4MYVJzxZMTfpTmeP>Ttyq*4Md30CBy{DCvf7!5kwJ6!-CyqO}=tGTPOO;ZTA
z>~rwmH7i}bIJ@e0^vEnQRmx~!hdYkLTs(~w5(rh*znCI?AIPW#&Y#_3A%|q{#OtoY
zH{SD#ukdclz#9o6#EjfZkB?7~&yrZED`Z+Q5n~LZU}PVYF3i%e9OMHI<{R!}XJ+Xz
z?#>bqOXx)_aDaZeGY+$igaJB7k6yVAeaz{3_DXHyuc1w+w4oOP{JT0ZD&}bbJ)eTQ
z5~46pDG*sX@;Xoo4Xm!h!VSP_ywnVv9li4IlLs<nas<W5RY)Y?8N?E38OdM@s(&mD
z;Tjt$b_A_+-{gS$louOGD|!+-TlTxKSPTa)2}ieAv%uQ;>q`3_&#Wjdb&86cQS47b
z;HNSeiXQ|{Q%tk+TG^N0(~1$Xf!XmTObN=if%|lE5vpjY6bz_BTMS;3CGcl<5EHE@
zrA;&7TC`gd3`<<5V{wSK#Ts#Jf_5{~)_@*ZxLK{dKo4jEZy7KsklLumrlSlo2S8aa
zT1yvPgv6rRypCk`*-#9D02TLzjnYgDoCl3{KdMJkaB~8Mhs8j9*CO)T*bB`0)PjR4
zFk&S%Z}omeXYvklB^7-P<Ld%y_hsk<+}ZWgdAj{uwIKkNYyj9Q*B>hhH@l9$o}Lc3
zlL{B0Zwha4B=LEqO&HN$`_g5AVjH#4Zk_8YqlV$JfY5<J8)~0%k_Hfbg{6_BVXi50
z0JHTcr&CUcPR@{lITV~Qsey2qUeTihlREHOv-lE24XyVkG1UHAZX<BK-bP-Ow-Ic!
z<$kzN{@go$_rp*IR*P(DO=NU3lVVM4mBbk0X@NrqZ8<}Au%Ga0G?_g-JJ!N{&giAE
z77|9t`SBv8LuO%BW8jz#mH;AxOE{4}+Zjna3!<19WNdJfl7KAH=C|Z2EG8ru-&*Pt
zS7Xku1~2StEaiH-0<H?siv(O1%Kr8#xa3TN78!qAfHY=Irxy|`n4y>z2q#UBCulbC
z1w>48TdC=RQ>5Gc)qARsjqVzU7Ke^AHf~s%5jWM*r$bqTRV29m*+FeXlfZa!>?3A{
zumCRu(Wx7*O)OAc2r4hMd@*B#9;n2Wd^WjxaS%m=vNQ=QteHpsayap5z$_O-untQ8
z6(SFB;OXRWV&IHykZ!6gxB_>=+&>8`Hx|KM;*CU0wC8~1x7Z1i7-|WW9QdnRc*MhW
z2r>oCgLX*=Pl5T8FeL}75ph#MCI+uk#w$mt!$pWT%xv1s!D(!)))uv?F<8{ZXV=L2
zf;2&ti~?A&u*YQjEXm>|PJ9(qSs1&o5k<K9R97Nuj_OJ|O@OTmF$Z=Sc(Zz_pNcu+
zei|Qr`E5kxTZ0_}%mVmI{ZHefN<zm8SJcVc#I+&>{&YjKmA*2;Z(w|UGzIq<nq57w
zexDMaY@thwj3t<&ECR;CZrIC^iLqIjZ(f*(z7#54+?2Q*rxBHosl}n4&#!<C;V(Ju
z!O3x?GthxDW#Zx>?nPuYLA7zv4y}@O=D}<gutGY#9a>Sw=3XHLL<D;*6cvDm%@w++
z<p#i{$5WQ^!34afM$jv)u)u)8dK%go#|SP`AQ%jljuZg*kQxKN)T$DODP#Gm5!+zh
z57vp(jWD%caN09IuHrrfCIfaUQ8Xc5f*#TON5jZ~sFxr>$f8K_h9n~r5mAyyAVU;M
zp9o3EF&_M*LB^x)tkT{H`9-K2){(ReHDhC>7rbX>DWi3i!^|9U#@-`s{Ki&>y>DiO
zWxch;+7%=>DC;LU4KG9QG*N7@e_#>IS#G$bUdj4oQXP<mwBWggvA|=}$U7sOgz{6a
z6ZyiVv_PbgL;y0Nmxrpo9n@V8A}}K~X>tyw%r)S8pEbfC#xs#hxxhg<HZYZOfc^^I
z3;YHdr6(wwhsFmd+Pf_=KTqS5%%GXxx6lB6z}gx#Q_{?U6R|8c!!hi9ZWc$1^D~bo
znJ{i1IR!YkW=HxEWpA^wN0?r)EM7-ct53iyfL;OuFi^2W*A87%qxTBk0<sgF94F*H
zMNpc%AWJcP2$URRzc9eQm{MQN>gKof!jzGBikfEhOm0v}OMG-D3;qdQNOC3(_fJwd
zES>0%unGE<0YzcKb+QL4cR^<>D~#1{Y<BJ5*AXlPFFHt}BC(3hHw#aEZK?%WQJ!Vu
z9P6R0S>5*b^(6$9!^xx@V8lhV?b_ReEp59lv)JW!!IrjNysNT5zOme=ZP&M1vvNae
zhS-)XG_wtDv(4YC=gN26cAL$@t_b_I?XW6h62lRaP$%N)Ug$;On+$RecMx#Vp4b57
zQ?~kdcENjP6{%a$7Ti8_dX+;V;;r#Gat8FYcO8#+!`=<^E%FKU#(O$DJA2>*-qQuc
zH_`|6NLsIyPZ0nCDk&C$pS&gjY|WKb0Y@DpofCrQWT_B_4B@|Aoio*1dI#PH&85sb
zpI=yjQUrEjkm>Ze1afly%r)~TT8d3nPHbgJo$xdIxC!CF*%faM&#?=dbTDM(z_qH6
z0?8|CzLT(l&=DU86K62USV3<;J|IVNlX{;>77)r(6QMDvjhN>U`2fZ<aDgCu4OJNx
zukHZY71Bar37_egbeZl<VrD9P4(q@ht&-eKIQNIbSIG4Ujznw;sV!(Y6(^LUoj%fe
z$(Cnw6f#p0s5=uM;$=jG7?pN2(D@`yUW07*G!*+Iv>y%G<Oap!#e8Y*d@E4P0=zf%
zw(tPJFS(qWNqa!hqQ*?8hA?`;H#9n9NJ5o@UNbl!Vl)(>5i%zbL-#9@(E57B5UD#x
z(2(%q)Qjt|AB%vo&}6=K$TenfC>vl3$`?lK4NpR&w0X5YsF1VX$T<ZczCfdVSO0xT
zA&0Ik#q5WklqoHdNrDVYa(sfjgA}yR@$S${!NnJ%)){z>*$Z!k40>_N#Ut7(bd)tA
zQjQLg(u}Nd@W$dr;|zUgdb+whjvw#t?C9%@!<VV2J&tT7UG3e+iBph}<ai%YO4sr3
z4tj6AiaDNYjU=?M;7lPx4$((NE6p@T2A%L1o2Yo^qS6z9!ZUujK^v<==lpW5rd2&%
zT1Q+~#0R1<t&=p|AG|ccn>GkRYLkR|^d=Jp4V-C`G*)ZvL27=WfUFs%f{2zWF!^hR
zWpfSA%Sj+Cb*(f;V_l>1p0JUQ-ZM>zZGMGz{vOe?{4T_Koy)`dg!)*2BU8Px=~7|S
zrl`hBmlvYjh^Ta&-;f&d3O6+uGnh7}W+bR()*fWmgeW5}EeQQgL!&_7GEr1*&}<o-
zHY75ba*1QK6D#}SzQc=?Q7=&3;#R>X_YZLJMDmoe%i(`s)*YIGCJ>Z@SgcVB=3XHL
zV`A<C><T3Y9V{>$7@l|zLOWQ!zS3&0O6Qp$x7BNbL1=e%Eg@*cMWlj!6igKYRP7GU
zWSIx2gQ4D&$zjSY_zAh>a*RO(mmwFQ`J<^q@{Uq&1Ioulkm8fkX?W=L$P}N&4;BH;
zCITwJBOZs}gy%)v0aq;pb;<Xy&n^{hw#km>b2{xLJRT4nsINQ9kOR>18dz|G;@wu7
zmIRy)Etd*pS%eJ{M|i(2S4_EL<`wM*SE03|?>N$n0#_itDAJDh_H@Hz9^6S^cV~NN
z`|)0sa<^2v;f<tDBV5Q32D`qn-1(Jy$Z+lXS#n5#12D&iOlZ_*hY~b7<7L1x#r1RW
zm!qXhYnHXl4Y*4>$I{2V??`}%%&x7Xd)R@XJ@z;>tv(XYTX2(Hh4gubn_VeKTgiN$
zVAbTz3OexBgbJ%A{NH=K^=V-C%~aYsHb|Ye=3E5%m9G)BrcaU<02oD!SyFVjR6^fC
zC>OoS_yyzF;ZJ51G}fO?%g+vCSp2kDo3F9CWpBRe5M|EFXg{klMi5k45~cfKv%~zF
z`IS{Fk)fGq21mpfF<D6`)tKwZ9fzDA8L=B66M)?$O?#nZ;K9?{|J7qu+0Z(Jc$zcJ
zHJ~kaNFasOJ4h6KoPOW0%ctT>?JIpMz<Du94(AvGtI?<~<a6LkK(i#THY8G748=+>
zY6)J2+Rhi2b8Ya%TwiM|7S`A1a&4=N_<l>V1T93dCEkWCOl{}#E2!NN7DdEb=JJ>p
zva>UE34R{X%@Y)m;tYn4sd8=?*1b|v7e^lQQo7h|0`6(1qQ!x;wo$Y@O%p0DFw<FE
ztMUIN;ws?C20V-K5MXW4NoKp?SgG!2LTCn04ndhjoGwYXCBBeE9nq+M22>7HMV(2p
z4UH2(PU{V=*Q!nzp_BbUQ33u@R6NMS0GQS2Iq2Z0)h!GXoV;@Yk}BXV1#9Sl(+fK+
zwu%^AZ<Qcz?4+cUlnm(-&Cw5G91dlD9mnI;5krrouR&}2CsTvSQvpJwuVAIm>Qvar
z+mQ$dK^=yvn3il7;>#%ftMST&9y^#ZJCk()&18(R>|+y|EVsCv)Y(j)Up)YyF2Ycs
zu(n)e;thylG{b`r9oW<_M~)vfh;-B_GXg%2C}7FE0@oC1g8|}&AjQUafjcZBP^Yox
z5W!OSCHuVT9o>ax`#BYDy+M72LPA^CCD<GI9xr9UhE_XB{H#$@1hoPSh?f^704HyS
zLjWnpH3%<Ns^zr92LRr>3F9C#$X};wvA*h3(g#3??GWTlfS^P-0S<tZ8E7MX!P3%v
z{({e<FG!VakD*$@d<ELim;6LI^;J2(Q0s}b6NCrn6;>pELc+$m>M(Vb2}@8i3YoO#
zUjacf@eng{LJb{FNIn3&eb(%@ar-Lo2=)^_olEq$Vp*PI^J=qX5X2vF6&(zOMuuwy
zJ_{AVi(uD600-fgP~(;C{PfUyn-a0e_Ce7Q%$v>eY(i#-pnLJOqCOe~Frk{TX+vUD
zFXKKEOC#o7JZBg%D=P!!Bbn-Cy^L;H_FN<=4XCe!-$4k4&euGXE-WIU&O~WV#8&Ys
z%OsOP+YB9m)pcoB9Np~IQ3nbS_&`l*GVOejx;E45s6$R=YHWgvZdqu58J2>wAIKQi
z82y(uK!KY!^)Piq<P7mo6oHQM?1JMroj#iGg5QSd0_A1xmpXW?uzhh#st}T4><kvM
zbs+D)bASq~vw=5q8Nq>R^AZn9kR@n50`{171%YkMGAekQXb(?mJ$|mVR9J*PMe8Oe
zXRP0?-wBs_ISjwW<O9LP9QH=p$C}iONv<4gNVgSuDX|4y@U5LdM^nP-%EY;A_d87q
zp(+UoM5Kr0_)|kj|J{z9ZX&T~`l06qB>^$bl8*2f?*SSC4kv`^6#?^%T1yFn+^un_
z)ay|!-HWd)0MwBx3dw}tXwu&g-nLoRU(1zc;Wj5}%#US{g@!Vx=rRsnW(Q(zdZ8`w
zoV&Gmoi=XygMlTFLYb-IcOXQLVhEy4+7mGtngZ14dBl`arYG`gU>ooOo?MrgE6658
zBjHU0T+3KOhX5Zdf&m8<BQ*{TNJ#NO4I{K*o7ACa0F+7S!f#Q4Wk5C-qXR2bRUexJ
zgbzIGbDz<Sil6K8H!<glDV`-{Qb)Awi3Te%7W6A?7Vvlwj76zF1LKwW(*PufbEPaC
z=kl{l<lQ|q$@5J+kf2Mn5$Gv5Kx!F3BK6}@y5XTWAOIw=o6Et;76{p$*CbMViCC9P
zLGopx2IDTLZvkVRU&ym35Pts+o!7;G+u#~;-juzOD-iMVf$Rh`$V$3S#<qxUjc?s7
zkW#!%^}N9e;n>l26beoJ15IJxC4ZWHfW;JC<2~O4ZiK%A&utPxDY@?uG$8?$VA3U`
z61o~mwu>ILI{|`HWFB5g_8z1#BP)h>=Gr=^WJj>6OiKziIu4E3(agkf+}9I}`Yu0i
zQnpj+Cc!#sM^3<>U4uPW)M1_yxC#=eQy(TK2S!o|et=sDEox_shnK{^=>bjyFKch4
zseuU(q2d~Sd5#1rf4s3QuFj5SKJvAph8rgCC%_ua%8wH}kbv?WUw~j{e#HIS?~b6M
zeE9-zGniY_@Bqw+Z;D1$?0pcys-~n7n*9IFy$N?5S#qTrKgFAMS_G>COvHk%OIdTU
zk`S>KHZrBs!~uZ-iAfNEurQfYw|@J4_u6~z&jSHwO0AwVDHY*c`0^Hewp_bbL*3Cd
zJ(Qk@x+m7OYG$eWBeZ+hD5ko**ghJvHFS*d3lPan9)epfo+*zt1u%G0c4{<2u7I*A
zP1|@3j2~r7V|F0|tQqC^w9O++yZ(Zta`NR*(+a_9d|0F85Rs9LAlybTLqqo6ElY5L
z??e%Uo)iBgdL>mOMY4EUaK4RBD2s&dFI<9oMm@Yzh}gp~Z9B#0_Vej;DK?w}$18LS
zjxf45Jn}J77~VtQtJ!TFINyZ-X80;!#Qj$8NctDtZwf|fyu4d8y(oY16as?9AOFkc
z^~H_xzZizU1^?v|<x!hUG=GqH4E=KxCI$t^a}2UEI5#ER*g~oqW9pbk?Y2ko$t#_j
z*sFq4VPx|HZY9NAqvpdVs=n_u^=QatA{kgd7zAtk5nGa4G0m|jJ{{OG;GR<3D_XA_
zH|>yei#W@4p;Gl>U={x)12FqVh~#ChKH~@HEYXg(-@V6&FXT;~X(z$(-7LH58FtC~
z-wi*IJ6_5|vCdrK7B!g<*o@m#lCoWJr@0`lH^pUDx=eHF5w4okyV?^St+^jC0$E;J
zDGVeVqi-FGNV6koVp~1CI{OS=o6BEW;UinjLnF+|aES{50Damf=zV^c2-L4msTxOi
z&)xW68g6!-zr4S{J!R=Flk-$|r&xG=L0))@=CF1@b;V6*7;|JMA<d<-TApjRNOC4-
zS3gxx>k4~Mo(HNz+H!i1U*zLIg*aqZ$Z(jMGPWyShyq^NDyU2E8bjs36tw=o9jabZ
zGt4ERIkK1FzQKz4WF^fh%49cYy(N^e2u+!-+3xXbGKJy6ipllO^>x^>&kfOZ1)W!n
z&v5-Vic3liY}xs84B5`lN?s09$vT!+s4i@_nakn*()f2QIF?NIiZ9_cR8y(_A5@Ju
zDXD?0ot&2o{RP$vsl(e$0TR~f9?Q#G+Z1zVDiz3U^azfbVcK)uV&)5Q7kOewr(ezy
zD4LM+?umML4mtq-Q%*>|nq+@wh0~`v++>C*RS*Kn4X1f)gPxVENEU8#9ZjUiIG{D`
zyr*QF^p|z(80IUPjAScTft&Aecikayh22?mj6*-Y{*^bkq35<~B_?V$XCZI$ZjMtD
zC(SDzfd2)0o}0JIdIh6QQi0-?=rnK)EFR$kBGsYU<_#PI70@)RSXv6I*+{1PzB1G*
z0VoM5>Nj~-@Xhl`5wRtcFcmnm+BE|kj|N}-{Zp+y#efS=5Q&ex|L^r;nU7F^LFPfW
zlAn&OJ(8KsM~fe}R*ahLWY1Tj1tY!9>V?sRvCo|~l~RVCn%f3^V%0;Pg(R~QdG*!!
z^uj?KWfq~4rQV-HQEI&eFx`ivEk8|)5!(e=f!?;L81|Qk%L|MB<@sZMqTW7qJ<mPO
zJSl+-L&o>vk>ZXs%49*Bh#tZrp*vG?)O+D!7o-(;M%Le%#`vwX?MN(uloL%smHuJ}
zCb{>M{FZSnE6!ZWLX#;jJ;Ng|SbFV+wI>#OeMZRS;f~=hiD?q<v@w}?mep~o`jIYd
z&<-n-G6I8dp4P_@u{cnLj)H_f9W&oS&V)k$!9}|$#c!W_zD_{%25oCmb$qlrqU|lI
z&|wr#b#7xTjnr>$2V&@4RyiR`M@%^F_~_s#M!>oyNKL{gz6^V{16<b-<TVG=qz)fJ
z=1t^Pd220;aucOZ4F$Bqpm9Q^fF;Txsd6VdM|a1+)JA`4(zkr0kGHoJkwfsC)2zPx
zGRmz_m6}F6z^tuFf?$`#CaLorR=`l;?&=6m8AXwlY6O3Trh2UjNUwV3gMU}0%gEqb
z-y^(pR4ooVq#z)Durx%>u6iTEr^y@Q2_vxMQeVcjAwL=hXn8BcRVne$f5~H+Xh!{k
zj)gQjoeP7W3F$$tgp$6HdZG2<jz@5=HC9)NQ$Xe98Q6p)DbJmx7o9bR7U6<A(y*G<
zv1J`6u4`vvmY>SZ&C72Rfdvo=7xglPIQ=D(v*%#L0X6vazwzwirse}>PxepNG{$0!
zcNt%Qy}Y9go+j!Ps)$@j?)#^yIIu_8i)hq7n&95wIjUlv-Z|nyNf%CK^IUjG^mBq`
z*}fq!Me7MnT9ll@j~8yv{kzBO(@c`t{iedpu8@;#?N$w-t^3`ZyDz6#Gr9~7PStV#
z+r$+>u&CnMa(cC9$SuzskNg}Q@vR@jRC_k2W~pwEAbfU`HMh__(05D)ZRZCG&7<e3
z7*A7*M~eVj*!LG>t}#|dj#VqEb{I+xoaPsMCDI>XQN!=b7>jziPk5^EwZ;y@!R0+m
z;gAlL!hx<X@=Fr%2iF~nOGmYtojO&kSJacs+G!Nr+cI~3W{Pi)HN^1z#4124PTrrv
zWP;+nTeB;MOO)<R)aO1CIws9S@W2&!C3B)!w`@a~>SmZ0cA_GXPiuKIXHE>Abn?cN
z;9s3Vx^)hcttjymK~6H1j1nqKN0L7S8$)C>TKO!JQflcbZIX#=qC;Lme`5{D@tw-x
zB+foyd9E@i$9AeW=Shu9nRz>_=;L}nop)#lM%{Ee?{-!xaGN_x9Wy?)5?61sj$D9J
z8arfF9;HAf7M~twO-xEoxl_ax<vL&#%fIn^N3csq;KaRGg23H<_B^1_m<Gf%MiGZt
zVZS5Nup5@<D8u$%?ym`KL^{-13_0#;?JG^A(k)k;Nu39PXi7dc=yOv7iEuoVD%Heb
z$|gupN{-_5r9_4&-^S0u5iB>5E1HlWvTo1-3~hm7n7u`*qFfCy)ha^wc)SLv!h$P-
zi5K!Cy!kVx368>~^sH(RQQg!Fz#&Tk_zPeUxR!pW5*QcPdZ2|mkc&*1BGar1l01PP
zw9B|mIn83gs9t#w#~+p0(#VJaJ~&$WgExh!nkDNlxt_gRmOC_(nY6VSQMEXkURQ7q
z2+pjE1(+*f7zw&%zTldBOv{xz1Jq5TBjcId`Z)9)Q!2fYQKw0(Od$}ujbrlEo=K7&
zYZ!vOk*h4~)cIA{VYZQPDaJ_l1E>x(8+%f=8gnIK%ON1g1E`SF1hS$xnm7ezJmlEx
z)uJ41gbv!Fq7Z!j{&L^r3CyO!x-f-(IlVr2#k_jkrSi;I@E)Qu@fN&4NTx#6xKqFK
zibMpcyOrm2y~pE!UgDBOUn9K|MSFlla}A^C;CLJSQ1*5`LwPgxH<J)8d@wvJxIe7C
z;u>9;o1b4E35;}zr~3&1nnQb~&rZd?;g^PGCLyJ9*xg=U-~YMZ9Ru40+LxVkwVn;*
zAU%ieEZQ7%j+Z-b61;nF;MWfSL2jAJgS@q!`u?X#9aWp+?WiX;;2TTIiDUd+bD>_g
z>(#-i`7r5(^{_!3*}nRIpJ;>WX+dOhoXN}+AGp+aPaq%GY61j@_=olH%xJFH9nJOm
z-+=!O`QJSMTi}0-{BMc>Eq9>+;(uJ%<GLQ#^|-Fbbv>@@ab1t=dR*7%x<1$SxvtN3
zeXi?sU7zdvT-WEiKGzMnZoqW|t{ZUOfa?ZaH{iMf*A2LCz;#2e8*<%{>xNu6<hmi(
z4Y_W}bwjQja@{=F&2!y6*UfX?JlD;0-8|RLbKN}G&2!xX*DY|}0@p1Jy1U)iuV16Z
z?aKZC?fmi+6#%AIzlw8O7qa4(90N}SGP3o^07eA!M?a9~-?Yk_WU*@1GYdJ$JbudT
z>Bofspmq|-M!;vpkfJxeoEF<{vD}0Vr(@`!p0{9@{v+eMDCy=%`97bJlgx?fdl8f1
zLFEb_09B8-JvqJgd`gXnVI=IU)PZNf#!K^yq%BauppT;#v~dC(OEUPerK<qXTTB6|
zO01yUYd}gRiK8~3AJUjsG@bimhw7JnV#FRTG-l_MEL3?0!MW)|*HNGk^*7GeH4Ik|
z->>YW@Y%N_)m(Av?`!5-qcQJOYJRbew}1sD`6j`$8LlZhHkIo{FIWK@sVXc<*>Pw!
zn`}aiHEZz%U?J6v`K=2|oTE8Y<d>nL7-5snFZ(_5RuE|Zt#}%jG|sEibrpSB^Vm|r
zS))ym4)N4EFh<3eTAh`Ui@wBFfS>xfW4k?-F*9GV<I@b%T!|IJ-iMpBn=4AOhVIIE
zNa6W_jI)N|INMlHqoIzKrXgU?XJ?CIx*L|NK0zyDfT;q?nrXbHByfCZzv?e@BL~AB
z0V^V3uPU)q1d*g?EyXy(f~gnS){&0o^^)SAkykA(yYuzJ<$le@o40*8qiR&AT0$W2
z@Gf@r*SU9ugmW;Mk(!vs9_~<-@Q8f*a!J7^F)}@zez{d!<9`BBFLLwvk7JC4LJs9X
zfiMI)ufc7Z(C_Z2nOWUR%uF%S<=f>>)0ZpyUtb{J%E@BZ&m7OQ;jc9ZNiA3W`2Pr(
zaA+LSE%Lnn;T!d9O!*;A#MaG}WR-e-)|}vK*Asa012qzUQ0#{sIlWQ(`oZhr9Myma
z!v$qg^p}?AmnrqLI5+Gq4Tei(ZY_`rI{wpOX_JO%#u^0)?3j!$s7;ZqTu`qe{5?|g
zYjwZTR>x>NgisDoj-(HyEiE;eVZxy!I~!ZX<8n<?QR<J}lrcn$o%9;1G4flWvqui$
z#`XEZ_+K8W6o1Z@5%&ZgKfe3Q0bcKLXN_X$?&+}DP#Ed;w5|8jxYzW8@(DwyG1ps8
z+bWm=B{3it+P6aF$|6)6|8)N^Ab(ViZv#jQ;0Ji>Q{?vSb40hbk5(_BlY*Ah@MwIp
z0`ic0A54kJw-2|3ck+CYfCj!SOm=CiD#4fIsscRE_7o*7#JNaXlX9Fn?o28aMr04!
zZ+1TB;6rDlF9kBP$dgc`0L%(yLDZP7BVOO+X@{D-n$5Iwt1E}%t#LR9NzZIPj#$Fx
zK9(2$%eRaU=Kl&(lci-c7oL$(z~G)eTP{n5^2;Y+N%F4EqV@DFpKEt2!hP+nEhcFS
zDsn+KpE{JB2`yBxa@+&Nv9FL>SV08^3m*UWOl)@lg`eRMzPUPmpfCwLGcV4r&i`E9
zjc@;4z2H>34~+p^-v6jx{CM_4tN8gG*?TX3T#uce5YI18+412yHWaWDFo+tACRT0=
z0~JsqZqeh~ztdxKI)`7_C2zRpm+_aIyI)^SI0c!eP7C0#5VQ3-iRvGdyhBEeUqX*?
zR=dDMBS_V53l*SsB(Z_ZIJ|r~y8l4-PWB8ZX7t8i(SV(|?i2ob#-zvmgL~)+B${O(
z7yiGF?`}GdyJ+rjZES2T0STTTs$x~LMPg-8<XLJC@mkUKv65nnMU;q6$8FEbgp1>A
z7p8(H*wQ-AkMQl}$X*sqhNNm*c~bP>rB%iQDnsBAIAzv%>#*)R)!S)REf}{$fGRfR
zl$izIzZ}`6j{RiVPn-<pL%^P`P2+mJR<0fVl<RJVz$hz*y1;Z)`_i<im#Sj1&OwFz
z7xMNf<m7dLQm^J=@-H(0Em<|w!ky}yU>e@UPvLYf5+(eiB4jcs3I&>$o{S+^WmXI^
zfb_Z=0!1`@TXJ8V0@Ly~N|VEGGkjb?!Rqo?O6XMfy-G*KD<wZp60V0EL;R!%@Nu9l
zZIQ5#vF|IabLySpu{ziz8YSr40+CmIk9Sr4;e8DNg0mJH_S1*a$McJ;$9rfE2)B5p
z><^X(SSX4GWQVG77*L*2uI^!c*E&dr2vL|TI~;nNsNjt-#@xvhUSC_&(y}(Pwyc2g
zFH*@A3)Ps;Lb~=aSjaw+(o;|KS#GNAC*XcNr{Bu#ZdR^x7ehIvQ)Qty6TmVh39965
zjVlq(g18<kv6sXB6^d8UP;KzOk4iDj*em;S%nL3`1EPk6d1cNa4jp0^a{~W`7GYB;
zFzSvk$_leeG35mJkSy>dQlk8vjJLE1n{f)Ln~pR@o@}GRrMc7q+EEG<UR=pCdYp#f
z%1)?Dt=;AhR!RxuEzS>@hZMG0>i7Fgiv;tQ7nT+W3-g0T4)llf3rpi!v_M2xXl5Ib
z5j!{<)mL*XMcRV0iw~d(73ca0E@%PHm=P?{8Ad9)5)rw=9{KoiHknm=_PHCC3bM9F
z6$A6k(&st>1dK0@#2~DR>MumGCXdqw;^ZOI8?GN=E`15U<61v<Y?z~jjGa3)dE$>x
z?#Z+M^e2VP{2NN3M+tE|)*MQSZfVkK3s^_MgG*q?<^4y%cR!brt#qIj*-FxNB75K&
z9jI7VpJ+gFjwUb7p9WsVm;-}irUI`c3W+>D#LuCx#av5yUrd0%9<M+TjN`hc`C>Gw
zu2@U=-IfZB;VH+zY72d(7+<8tTc@T-a4&LK<n$KLxW}5othyWCbFkQl)uKvGZZNjR
zEh<!Y;Abo^*59ujY$8oJw{QS6M<ur<7)NJ!XM;YhoF?;(77$+6KT<Vzwl`D#++0Z4
zB6E$%)7^h4`Rm8(-w~ok`WGT1u$<bE>Qv{086l;nL#<tq^k9}W9(cQ65Hn0{W-p;>
z6)(-Aq}k>3MM!EyNt;yBcMk84!Wwq%u#xlpN#w~{`jO*bQ+9rjOd=)wi*h3Wx*-Ed
zhw_a`ox}+{ab0jo_%`{vV1S~Eyd(m47c7~rsHo<>7#2R?-TY$RS&%5lV;frx<RXxy
zOVF&-Qay^Z`W@oqj4B*wUj>)jCbj`IVWFDGygRS!Mowx>N={|$w{le5_CiTdCrdT#
zk!O573^>+xQ!kpx+SXK9=5?6PY}`LDMc<ARjZjL8_O2!(H{yGCDV0SY7%()skaWyn
z=pxR<D?uhG2S%<%_#en?SUFKFYgrzmpvg#8-8;~6gD0=cGRPkL5Va&3q>RzwF+ZUo
z><rSdRD2_9ZV)t4@P|7YJX+Wk(FNuk5vGa(;mQOh+)5MJR|+CW3p1^m`^YH~{UnHf
zxJo&=c^K6YOA&P3h#+({ak<D{!dsejED^bE;#k{W`H6(b*vM<FH9cIJ($6C77b|Nu
zGXae|3#?XY&+s;AN=XzZD(ip44<M6|T%UjH|1-b}+|PF2mx4YWXbJlE+!aaq_eNkL
z>0^Z;Y||cpN!~lte18Rkp$#p^xW0CsGOma&62JNU)?Nrk%#7_lql+_~jS-2}L{f~M
zi}B>5u4xWyr}{=TL#}k>*9iV5#t}FM#y;lBHcoM}Fk!j4JPTOO#<$zMhTfu!ri%|y
zT=oLAIPIgF`M0Y4#o@yjA|sh1qgxd}`Ew;lWl(=*C}B!&@m#m_|B23qhlZwlqOrx{
zxH_!fX?1O^=RJEm+ST|8w6k5A1x!^bXC;JS@<0<T&Sl!AcuKrCmD`yc(nUo&kf`6X
z;MvF&xh~+ca>%-uvo1)=3ne<IThPYhI|L9@R4m{160DJ8r_uL{{*e-2J8kIaRShzM
zeY1O)msF8vk<?HeO#+D}FXGgr_tlGEUTAnZyR~AY0VBrbiXdaosv;Ibkv=Vmy+Lmd
ztsn#ERw*78pLHK91ds;2V-Q?WNH*6!KqvL$Ux!X0A+2jXMxV5^C_Nt9hh&qifvKn5
z0TJAL>ep^Jm2eNKMW7mNDi!FzxI6Wi%D7W1<nmsL^FaaiP$V11uip){vx1&jy5A52
z$l!!6c9QawAl)*p1JgYkz1#RHOEEeK`Db-*!UUsU#P-%ZzyT!DpaMVVGAF}31s!&X
zpd`ErK&oDE@Jb0@wG>VT`5G;&1-H{||6TaC<mPFi<p@t5P%Q6A1(JI9CXe>t3DUwk
zDVj|hn@_G-ay6)<>|Ipus=`i*B>-Zj&!AXH!R^S5QyfSWR*ETKX14^(kOtsyhS<1m
zhG4J)1r~#T)_vr*&S6LUoDiC>HQy@SqPZeYV`W%5ZQ%g<p|=^T+<k3WrTjd@C=S*$
zph8<bt+q4S7VAlisjV)lGJL%Z&*MD2ZA*|v4j1N!Ls2bZ4+8XDz(DL_6k-kbhj@$m
zuHZaMiQ9k`b}b|Tm91AVPsn|CC@gt<b0kd8_sg>}fv`<h{B_lKgI0^_c}`uLUqNr1
zP$HO*gN;NA&3opT7Y|w5EF0lwM~|WS44M?E=oxB{;My0Sf&ug-xhH2mRwEGN_f#DG
zvz)YfN0wc_f?pT6`f~<HHkbY=Z@K6USZ%INZe=(PPSbo%tU4bQFb#3yyl+s-+ePu?
ziDRm^xU*AHU>t{*q%O3u46wFuA{}?Fb4AFMXpG%}u0&BSFX)G$qr-YIkC>L1LmFt`
zXD!Ey{f>BRGb?T2pB)lIPFM~&hYFeNWhz;fjV|oFJvkbpRlpegKMawf75?R3ps>^K
z$L?7d^Bmi(eFSM8vg_dRq>)|Pq4$xA%(m8;{ry7~lQrH+rlo>N>J!~kA%7)wl<6>y
zr&OhNcxafs8S++mQm38gzM5#sxyzoJ;NVg$Qor@^&`zBZCraG#EF$!zyOxx=k0!!>
zN+BDhwQ+W-O-!TNVP4t9PcuQG8>jgt<Tcz^MwfG&!p>l{SVqeQ85m%xBke<o9B@iF
zsUbNChck*H&cF|w)H<V#?A996HDW;#MvK~i868mQNUjN(j@vm)ZRfXmYUQVm&1Ql4
z-=LlNyRgdD)5@y}gc~2e3HJ%oBLI*ahQ6pa6zlq`)cu-Qve*k=Ym>!qaEDNDQixbm
z4`ZzMkAh7b|626B#zwt%G(z8mTg{1BROpQcAX2!=NoKS@mQzkxM2NKioBwaF`@R3`
zf%4}69DXbF+U0fGePjLDDK?ds|Jcn+vo64pXTf2(7+ty)l%knY52li30w~cMsi3!3
zMye&#lSIfsJ%;NLPD6HRiG9L-uA-I7a7j0kdt^iR4B`$HWi7?wrQDcZo?qWYhlngl
z(%itESt^kPbbzv$%42ABwT^_1FAINYeUl!b?G0L&BQl`rMG|7obFy-EbpkjqrvQ9c
zCy6L-(3`MMjZU@>w@6q)6@d~=9A*PUCMgDBd$%41N!y|&Y6qku_tT+<>KH=}Exz7-
zfB<cEtAV7<@euhllylx+vV=?Ka-rU(YKcYRy>DW=IqzVTr%)BN^zfJLza=f^Q0h*2
z$wG2!rOgEjc?%L~>`CaU4oR>`WnEvcgJb5Kc_^u$6xnA|clc%p!z$MFOpGO`Sl~U;
zAvUQ*?NTXX#Y>|9gO|42-z}a;cRs@QAWRb0I4?)7BZ)d<P2!9RCEWEf<6T9HYTvvf
zGl@$`5kL5V7HG&Aj=ZC`_~yvpzb9K~3`Jkb_?5#6!fge~=<Gm%R3?DSDBz?dRvyVz
zxpmT1VQDt@rxqFaQR)f|m#C^2MvqPTD{vKPt}EMbs=-nsd^kD1yA<Mj>wfRz2-twC
zx(eFU6^6hSXr7wNEIp5xR9n6t>7x4-IMOH|-J&z;AeDnh`Bpgk>YH^d%37`5AKKAY
zUqH<I{APS#;i0SK)%EopEU&F^k&2s23S;*HAQqU#VfB=!QKb|&lm{oc8T-nx2BUVc
zzEkB5WspbumU~re8iE@|2)+<lTj<?m$7nd74Pm$)2A(+CcHa5x49dNxwf@Zh1-?kK
z=*JbzG_3=#P*YJ+6UOy@AQ8alE&}m(RWqgVNy{RDhz|=Lh8ch8vDiK|F$nxJ<{z|#
zt5t9ju@i#<lC#7g3XT%On_Hgiquo<+9L<~j#06CFxh1D{?~7Mzh1Lk1hoHzB+P-@o
zU9;aZd5sciU$1Pkq}Cg^js%P(r(_?2AVF}kKH=5F4_UqV?=Px2;AH+Sye2QsYZ`6}
z$uQ=rU~0Hp_6_N&bzO&Tmioy9R1G_O#6q|{7jsU!nx~j6yS_RfVDwChq)c+;II_Y5
zCXSql3*!AgzUDFK+2H}#a^AG>Bau#sbK+PLt#y7r;TyqX0NBwwO0uzz+LIm7Nmi3&
zBqc^L=|VdK=D*x|Y2}7jamE?wjg{;36B0|XUz|`GwJTC-4aRB9F+XFxM<|Ofv?0_(
z9AnOgH!XOhah%0X<KnqFVsE}UAC@!!d21Wzn^U3F)v}YcZ=EyW!5?g07vxE)@!&SS
zAB$}nqt4qX2~PG3VNdtBb|rsRqk_3rQa@d?3MsP!!q$kT*cUC_EuFLKuv(I%{7^<c
zZq9Y4BJ`R8qpLYUsJ6X3jRd3tL?U=kjnP3ErhKnz<_{pVPVb2nrU1q6cC3m<V%Ve$
z&Xb9-8*|W$r!HeL4VY&Twa7>iH~|`FQ39+5#=qHQU^}vZ(ICW<i#SIBV%MAmhk%_S
z`A1Aey>*&keQO8pYXrQ8uSgtBd#uL}C&LCUK`K^g{lx}cd$;mNsS{%MDk;@yKd^&i
zfP?24&1Z~n8XaeIsp=0BNI5(ymODJ>P}+*&xZN+O_di>El6ci13}@vIyn`CEK22Pp
zAXtgwvUS~A$|GQQiTs<O2~{)?1jmCMT;d_4IV~FUjz$Jxxf9$1#$<TcoBj>lSA-d)
zpz03wp3kkE?7yYE6u}GOR34yLlGWI<L>FEU7w3thqXrEJ0E*ZkFLucG@;ok+#RdE*
z^WwQ*SlHBnB+eudMmepxvxY>CiFGp?2hzk^pH{l2Au&2gwRaRceA}|)q|l@@^+Bz)
zyu9_OlsC_-pm@nyc0v_c3IIcji}Rr@8C>DcEBDm{2Os{5vvhM?{Dib@!?<Vc2X3l(
zkV?@BP2bOLzD@9<X!Nwp)vaYJRlM;kRn4w9O_Fsbjyc^Vf<=z${<ts{w3SsMGq_6%
zOy>`=xHx!E={#O6g}-w7WiTPB+Upi6e&6yi5-b%&QI_OsBFn)M+|dU(x}KDRPNkxs
zhB8~Y(b0%C$wET>KF9Klzkw+*OAfw>0i(fS+ltt>%gT~ASgVWz+oBYPZpeGIYkf9$
zMRsm+3By3OU&6Bb{pyAbzbFIYaD7+L(_csxHrvH~ktsyxrMTn8GLptLOiip-(MSlB
zrUh#ADS0KBnFhu2%`4N>(k9RF$V8(dz6!v4b-P9Ru<Xmx;iu`aAW)rNrw3ClJyq3e
z?RoJM2v;pPg%ztjilvI_7o2r-a{%F@kSL5QDn}6&>9p=ghPH4@nZ}iDW7cEk90ys#
zci?_@4%$``N#RU3T8O;kZkIixG(A-s6PDL4DsuFU&9~5Crz6QNxd8T>A{|SE1!ch}
zwdKgf3)U!KJ}}PP?y<%d$Vasiv=fD?by{roC^uL(np|}c*VP4a%{|FHOoa#YsTfw!
zX+m(15{0Y@aB(WAm?uhxjk+o<rYKa~^k_Y2a*Towt4+!o<nzG3&oqdrbdGF9%AxfF
z2ToOwVY_YDJ8GT$&$8XlGvdu`IohLA9o^r%l7DUhx5RAKzk1dAdVgJ{TjZS%m6vu1
zLW4gF-;2vPzTev3-i7x!P{}n|O!ZiPFW?{RI;|h3-P<RhRfBcwe&fsS!>_+h_hDAb
z>hFK74p9{+>HL}8bf4Mc=m6<MC6+HE5^pXpP$n}-1S5?+aA^_cak0#&5KwswbPOnc
zS{L#v6A(hd0+CSm3|Llat79cZ>x;tWYJ!>T$1m{c^JFd8!dItI|EIf~$6M7;1VyP{
znmApT7`cDAxufY`RTRFhZ}!LEyj&u%_K(O7qxL6aJMk6&n_U4&ZcfzM-wC%xSI$oF
z$?6%r<pzjJ+<4?5$^QjbSlXbHI|r2-20ap^d6zgupkrAlDDMPtMHKY%VKF{j_P4*V
zSbLk-e?Jw=SPlXHQyadM<>)NDtu+#*FpUp8cpR#=-eR@%y8pU|RG_>Wygk+Imj^gF
ze*84P#{N8G_w?@U^N$p2nMKW>{lxzds@c7{YWCv#W|m=adG;`C%K+dJ2Y<Ocz5UAH
z@i&57_p`*J&Yax)BUth4^V7TY_~8xU_Sp+K*v7NB0m4nId+h?VC{}XasHplZHXMfr
z7KySqIo*}min5=~t2&xbE!0GGEEM~Oydh=2{d2C~i7fkAGg|Jlr-#BMP}*gt4v-3%
z6qcj6NY3PyDdaEa-9*Wp5ydUXcX(8;emSM0%@p+s!2M+=^mQiWz8RLb_CHK6OL?US
z3h32uV%3^B&C6b(WhwL`dMj-!R-OHjYU{kirP*T>I`oW~VIyKT6_t-pYOjg)QhBpk
zWv5Z3;L0f%lH=XOeac&IZf(S<e-g}e<Ome_-Q(E<n4@qTR)0j>l!5-tS%lcBW0cR=
z)WaZ~ll0_9;#!>WK@R$&Y5&b@`oH6vL0l7Y(uuq(^Y3f6x0L6Z8RtjgvoWL_BPtO^
zgNX+pjaa$QLNc5@;Y2By0j(c#j;+6Y-sz!|GXK<CK+F{P&lbuJ3&1*cZ8lUa(=qHr
zlwQU<h7=m`ej=k1B^B-U@`snO^agec&6)TMfK%D!-~s%XfkpBNv<_H?q(CAdq0wl3
zwG-zmxHW8tYtj|q2C9(YcB>W=|L58&ahFIMZZch@xYv6;*2chuRfzIu2rG~|RYJ+r
zH(7Sf4@XVgm6~N3uAShx$1@1TXj<KuR0>8*=;_1bJz@;+k6pS<3nWi9vERXWByX5g
zLI3sUdYs((a;B)lIneStA@ta|4niM9d5Cvd-Ro?~0WW$v04U&wZJ%Pa7MWU2MP(-$
zD|t(YR5$B*_ua03+HBHVa8I7Kg1L$o&V1|ss-cAd`Aap!CM7A@W?lVNBZiM$d&`~l
z+NmO%zjX2-w-0P`kt>8lXK(!|r8TokXBWyGd74__R3^38&-FkNY+7iW`W{JX*go7%
zih>R_sZ;0^-&^_buQkjBP;FY=OY~>^y{q^Gd-weV>0_#obTTAzwC|4L?Ql7i=0bc{
z(+HHlKEcPrB>=0Yo|I{`DbGZBjM$P%4i6^cksJ;Ix0gFC#@AQBiiQ&y9n^w|%rw6E
zS#h)(b}JvD(#jnsftUMhmcw)y7NEH{n;6m4Lv&VMAA=;kU{#r$BiP=o&3M1HR5t<x
zBS`-FB9EdbDwv1;oUl5fCRoVq-&Fr<#wj$M2)XRi$kE{Zm*lCy(O`E&d#vz~UKjHj
zf6GW^9qGX8=^<_%HRuo#%8e6IXt*K>%aOS8MsadMC4knXz?NQov<2w`mVJuzFLc7g
z^xXh*T!Nc9Ldf#srKDQCMYo?2(ZK)xZ|#e;#ra}Cidh~0bzA<O3z8-#r=Q-@tEz|>
z^^v6?abmwH>niSK+rLPjXB<V@o1|6S;o@n;x(XkEId#qT0IwW1(3K;AE0RNqqbBJ{
zE?#Bky<fr3D|b4ho2&CttX7^kDK(PUXq1?VW5^ccD-#E}-XOt?>Uc|1!wk!cLiN)$
zGe6MY>uJ6?P0{C0#yX@@v*0X+o?wL3;XN>RUZxsST~t7&u-!a9D20tboz<U?0DcSe
zR*k6Jo_CsradeQ*Xq|bp68bx1XhxK6C_jp~5H}-ucRq5wGsWugO*##af?V%JMw%8d
zq%SEcsXiKoDeA%Xig+b!US8*3{O$UM-SQiI2ks;OGG53-B4g))*@lk|_jx7>orpmU
zhaI9ua4u5Bp4r1-SEv&)zTuQ-g28oWdhkWy6#++PS!cI=2s?Y`HGmU_CS|01tWQ5&
z&z<@^YVW<mWT#h{vLz0EGEC+Rn4}7mkp!@Pw1sq1!c7@XBDFaqu}^c=f#kttsg;0>
z07qckzVEz722j(}_%J(C+gdIr1pa#}WMoGx?6<;=gKM1AVm<%_DZa)JcJfGGxNQXF
zuaRUVz7*T^p|$r0nmRw|d@cMcu^DMd09f&$$VY5!j*ts1uUTAV;hX;=gr!&spQ%Ye
z7cxSO)RC)7p%BHusooG8yLP0l14OhLSs{b|@CP6L!yiVYKNmbV(o1TN8{hi!{#J!X
zzt@$%Aa$j@f}oCth}0$;;{WuPLus+8-0q}NPI&WKU(v1w5UGPqiWV@g+nPjfrkt)@
zwiB@<eOdDUL7-RH7j{hzizfM#y&*06VBJNF_@hSs>()MyA-Kx=ozsV#8%Q+0cH2Qp
z>pk2&QVr!BK}?ZdbZLEtZzQf*31lrr7#?@9rE4M}ahyVVFhdW-BuM9!8IErKs^FFn
zbs+FEe3aMGdd~|^9X!yi%sa};iHd*9$3!M(Y8j0VK8itcu%{56<zuPQEY2%!t#Ow>
zMx6SIF%-x7RwcQm{I(LX*u+s39MKpi|1y9##E1OY^#fq<tR1UF)ZV&;xbjFE%47eE
zK_9=u2Zg_E&7Pb|RH~0gQz=4WbYNS1dU{v`3+Vh8hEA$*kw-MN6m(~OShX&Q52DJz
zC&7$N6;!2X30~at9_|opL>)72EyG|U|6ae&bXxggdQ6aLjjj-FHRW(2V{>hZr-yiV
z#Cn6TDw2_uJ$yio1HvO)>*RHh<ix92(ss5^cAY<g(%P%eL*)D%DVt?v?jUB+xJX4(
ziEBeC!D+%)q+#kLOUIR*L=7N@qQL4pP$EoTfqqdp6>>ys|31FF@BL<yFE1vYX1tgR
z>xh#On7~1VS^S9x!BQ&*`uSKMGb-MmbE-TM^)t%T3<usuIwW<1LR&_M#RDK6c650Q
zSB|wY&|FFVT^<-z4TRjjUNCwIjkBD>idudTmCXHH6jEFwQK_5)K4ni#k57wbr0N;1
z@kH^L@wZcQ_y6giqebfFr1yGpd2xa01GGB})Ha@9oLlM*=lhGp{(zdtF!E6I7$Tmz
zB_WxkNYdH?xjdL<sYKF&kmigpRGeefB>A%CL86I2Q+RrOr!&NQ(F~;3AFD!WKojEn
zdbPqt2T#PP0J^ubvqgqhuX8Z|w10O41M%ZsLn831LI7u9JqEMphQ?u*k!c$eBAf_%
zoj?;85vF=n6saPt#}UT$IuiaeMnfx6*&1(ID9a_bifn6%8gYbaL3w9olo)ivBSL+P
zPKVddmM!`+Rf^s;Dm&Yf&MPY1HAs3-VoF#MInNi6+|F<OcztfR5$B|7V!rC!jbWty
z<QzyoLIQUJ<D242#FSP($VA5DXAObD$oh8pz09gPO#GP5%0sc^NRtQOjmkDD99g>~
z;c}8clI~dBen&Z;C^nb2eY_P`We$~GBRQt(!#qK=LQGB0Dq~*(%|hJKLlP?SD%meI
zF~g_FFOdVdmN3J3QP@K)zVg^9_Vo&wP(^V>YkOcpFbcx>Ns4jdT%N)yKdI|l709|S
zvP2z`2K?-D=jik0{n6!@@%H8QcxzoM{QAb~v2q~=5Z#Bow&mKGBnv7nws^;qR+Bbk
z3a0d6ufth4uwkgs`PSYh8NF<jq^>eqpJ_|CWU<(Vz1qyR06C}8QYMzoEe-qluNV8n
zB}m*stPo6CMDjqhwzROkOzux_Jlk7F?0kzpMrL8lf?y+oh{Y8do-)=Fn&9`dNx0`3
zZnlmp4THR+*t4k&71vCid;*)OBGJ{$zs<cY^(pd#oWJkMz`M!rs^4Wujh5x5;ldEq
z<Pvgp5m*lSOQ3QfMOj|%E%XM!a^^Q}?CtHX+Suc-S~?}b=w%q3^cUa)ymzZ$G^$V6
zkCaV9hrE!q;0+?$HLo=$$EJ~l?aZBgg0-uNiDBX#AM7e^ILNksIy`~DP6Q>=%*5Es
zM5l2`a-w`wEnb`5B*%$9Y~YjGw)tgIVTljQKCCQmBPF65t@1D5#}}pKGb?l>0+LPD
z+oONfLV}?kx`aYexB!TnNgt_6ysK#k5ww`{2ZrcC5eaH$lhueRBR*RvsY1@5yqBGg
zI_DCtw^_gM;b<xyhb^B(J$L;0G`$Lr$ySQK?=wZKDiWxlJ&SlwJS){qGnGQLZs7Yq
zi6of1OD2v*zqm|NQ{08=6NQpufk@($Y1D%M=kUIAr1G{JuhC{D@llJQGb|6<l{Fij
zhD`P+sZDeZ>fh+CfxHEzp+qg<502>3WPRSb2-u)m2bSN98!a<G&w#ckVCC*plZjoI
zBIIu8mx6sXEAH`nv*NNHEmf!Ddg>4Q8ux|kR8^i9%($%a8jQ<Kc8EZ%0a-SX3EXse
zJ!oj;A`IB}W7X$MK)>rS(MFnfhi^BeGv+P%XhRItK3X(wy2<%mk;T<SZJL8wj3r-=
zQ}nW1<FHw|vY&0jIw;qZDZ{m1Oxh;7ocr()$X}zrTo|&Y{nwH4VSGtn$zgF2-5cj4
z!@EpiTl#Rgr2u^G6;M=;eH(ZRO-8=AxvE@Fc0>j!UJzPGt_qqBabaXtUfYX(pHD5K
z^?w@6=JTg%QsK+6tbFTYhffHxU^I-JtwzJ39p_>YOpPkuocW;iugFfx&WqYgw&%x3
zODi~`hYTw26}+n5Knb_lRkd15FNz7<+n8PG4F~HOA2f|ojM7mBx}7}GTv0DO#7gq7
znq!s=Ed}Y|z4czJNRhd4&?GH1zOmJfjiddyRd$cXE_Y+7@WQn~()|rO%vP-F-Q=Mm
zwHi?NTj8|<8HgH1^}ul7Wbj6~IfUcjCN@KZ4mJ{;W#dgvbx!UPh7rLBAz8sw$baOP
zepLlbz&sm9ZHFr-0kfMdqf$Jn*)m%|u^n>9uq%<nr9m@cu-F6WNP+($UG`w+?N)bt
z?^U(E*QKtIv_n~<gl+(^=zWWNosy<PZT!0|7)OG^q<D3XEIA(QJ@QMWf!d(ddcXCC
zV9sdQ`B9W50u}?(#zkN=%&1;4GOkm2CMiy){U;ad#oY^Y;i}}o@jlt?fGWVXq!7<2
zQ+mcs!7GTn7Y`~=CWO|AB)U_{@^mGCJSOc#NV>^rZ}lGtIK;r5cEWBjybM^K_Lw_d
z>(St)wX|FMtRv|8VVj|m6C>Z-bri_autey$#hhNpRthI`impgjQ#ioGG{iRTx&M9&
zH-76N?@`+p;i)-dQAM+lVxAA|$l;vA^!nolh#YRu#n@y7U}vtct?gI!mh%;kD3L(O
zEYk(B*i2W<jr4~L?aTeAD#n9CcNz@-PKouc`xZe$msk<5<8NOsmFnI~_h@k;)P;$P
z>v643Tyuip`kkeTfQGbwyw9@X(nQaLDe9_3-KuyCrHn8w|6y-6C$GCji~R`)4O4}|
z$ZC&x(Rhq4oQS8e$B?r@&d>v0N5EH9Sw#>SzxR_uD}u1_0x|g08k4<zFw&?$8GTMI
zi9XGHm>v=5+dpn}7`FKh>(GH|4B{(S@NIq_o5Pg2HL+^cF>1-J%Pop#&8yaNg&@Xb
zxJ^y@<@M3VRYPcf0}$dTE$IFOKy-=q+oDB6lgRrvas(~~S8fqO0#(umq)|-+!Bqwb
zCmRD46JLgviaK?|Li`3KmnIs^0|$!(sm+>lP0@z90ScU#8Y6r!!KFh)O!_=gej@%>
z+hda#+n>QfNJ%vP+c|*cJ~&;Fy13XfXNUY+K^xQdDGS|EA4xv{9f>h3XG1yt<FT=J
z{fCeTZJIXxmU1BZz5?4tS<rhs4e8zA@)>f8hP%Pvl3aBX+QIl7X6mztlttE6gi#Z^
zl~q>cqUA&;JQq*1(PsSXiBk%Z=9Dx$UKJ%nX%hRAMqEb_*75aCp#im)PLEPFtUz#1
ztMk2a^x<gr@TaQ3fY3X#oF{4)Q)`0ZqS17->uQpHL8|)RiqaDpCs6m{GOb3NS2w2*
zgFg8fwb3e_Z-$UR%_>h4;y>p_v4^rI+>__q!8Y=Rs>lG2Nec~<KW9-8O(x`Qp`x2H
zfB2&nbZ6xs_5c?-j}*4#{tF5PZXa#{r4WQXu?;2X<%BZ?av6Ikt$3ygaHzqdL+L>p
z%JrTHw(=n*a*%5}+3fOdQ!1cMy0TmmExEF2AU-};8Qb_WIH%zaQ%Sa4L}yg3XK=!z
z%f#)?k*=hJGSU~&RC%GA=8P3H*{mE-MDUF}5bqSU;$Us9DoGy*p_0i$ldCA+P>=ZC
znXdACBQVpmYvt5#lRC}Bo_%MW#AA1kj&V{Q+CB`<b&O&u<D~}-iu>h0o-UHL7W)HI
zZU-_l7Z;a@eJF62`-=n7-3&MR9m89Bz{*2JUz}5-6Wt|wga0k*Kxs!35Yj2Y{J>~f
zSgPT2=kRcQ^l41Y>+$YRgdk$3OmsDhXwEgdD-9GTh`u^TRLXCZZYVkqAqkYr*V^ka
z4_r2cdR0ezd$`qqf;dyJswLn8AD*6Lh^zd0X3MKhR<q?{b#sa6t&zu5$gT5HeI9PI
zt+4f2JGj>swdZvIV1;K%yFBMBFCTu5`g2ZyvFXk%O8Q%Tnj!@%(U*b;$PPZyKI*#e
zlxN=b!}7=y3{bh6DG8{wSJc3GB4<Kdlmoh*c;|#lFJ7qxJN^D~Jm^2%e2%ZPQ4e8@
zKMQdYQ_h2(*-?`bHW)mMp&Xal86EA6{!z6o@Fk1VQ&q*_Akm_bGz)l6R~p^Idol9R
zCiOjqMVX!c7?r{RW=dy=D!|Dmc)95Sq&3(N06Qae|M}dj>c4}eNtq3@vM?hI@C5>y
zxSdzuxFvAuZ66bIGO=&X3t=X3zgF@tw>+?e*APjG*LcudTI!Qqv_NuXe`z>JC7r<n
z*^<5GC8pPKdA^ri%peWQ{9!nN#}7miH6Fh*V_^BL{BHq}P$@&`zqLT)AI4aas0dw4
zqlzTGFh1$B#QJ8Zob-8ho?Jk#UGAS?bsm2hZEUUMMV7~(TcMK>;@yYm*27S2q;Uvo
z9*1%g1dB4T0zrj`T9;s>ygC1M=B#_sJ?%=tp%2;+7q|@hMP(dFR1M8tfz>FsOn6;X
zOSEw*`QybHbnB?N&ie*uhJ9xODt16NRDVy}7p6SrM78%9aa@`e`ApE%(^vn*QE^mO
z3S7lY@WTu1Vowh1`h=P|d&RHPJO3}N&g{7L#Ld;<ESmC;<VE)IC=r1<hm6ylX+(Lp
zV>v1JI=al`H`WM^_~0l_7Q9kcgH;8<wG&p8hz#VZy(*3U%Jf&tTXtS}8A-HRld5gC
zI80a~^|Q2rD(4|H<#^x2<JDNxlQ(DXwIG_!e|UGGYFJ-Swx2S`HrD@oyoWB**Ecgm
z7;LQx(jz15_k}3Xh<T&^H}ZHP(aDps%*|}4;o)n`TuL49sOBO#i#xKo>4a_9rQYrX
zQe<JX-x9Pr2$j(t87enl>~MadDDS6q0Q*<Q3l&%zR4ZR{c^@8`vs$#&NScpRFf=$o
z8s_a_@Yh5!Q-~(YQ&QJPj!NhIF1D2OC_e!#Egi-a;Cu~1yk?}5V&mYEFsgZDb$`>8
zJOGN(!IFDCTnOT$fy>_GKub@t?}3ix<YfXdGL)*iqIE=zu`DQ2fuwbrFHNfs<GO9Q
z^82<O8$lbJ{|RO%Mmh=cU_s4t!bEifocEa1Dh2XXz#u>3w|~wPJ?;pU<njt`1D(|G
z?2a61OCH>jx6*55SGB5wFyTFvnGkg)7<nk*vG|WLWd~PMNtC^@@ecb?9uwWp)Z71Y
z3Z_K$)#6~nVW6T=$6{hql7@S8gJNDQ+s84_pB(Wuky%>1#==bydfxO?{#8=WQvMEs
zJNmX(*_xv?BCd*><G6;bW7|h^gjQjh{`Dn+^!hv|&jlQ(6J9)s#!QhSn3HS#$WuSl
zI#Y&ulpy(RsD~UMbI^q*@C}`yRzr!mq@n`ylL<{Zp;<}D*gL$q&{?~=Mwnb}VqiMK
zSq%Y2En17)BsL|58nhdEVU(@4`uQ~fr*)E-vR)#LuO=zP5WDMAx2>i?-e6uQS?xl-
z@W2-Sg_jrj$r>mM3Isduyy#a`1?gp*UFwVrov$(-GobA>5&h@?#<u=uIyjd!@R|az
zhl+KOx=_&w$Q|(V#}00w%&BsU35qt31ha>Nk5`GZRA*EKpwP(ay;I??%>#qCTc&{%
zV-i7@7N&`PyIG?o!bj+DE$Laa<SFYf7=nBVAxfm?@S!!!M8ZC4k>zVii`jXj(&^21
z7*cWy0R=Xdh7dZOZ!NZy2lyhSvihW0ca>srOV(dUXc?ZE$>k#B&`4YEo9%AhGHtU4
zQO!issnQ~pw*3rS`k_bApSA7{6elWvX&gNYv%~X>&^-4k_G<lwVHk`lb^BQRuwH1d
zgl&Uxlc`X0H7|oUtK)p_s9Ju#{9FHRpsWvx0Ido9FDyIW7lxB}-6oAbp1{Iq_=P_M
zH*yNGu9gJ8olul%ujI#6Y)GR$<-DsUr=AJ_h)SDi9a99v8b!7UrBR@+w5&KBu#V;9
zEYV+HBiuDK%!Ic_iWU#X?5V4OrEPwMy@xVKg$U{Ek)OK0Wx_5KuZcUkLdnJ*!4@-b
z53^gDSY&%Gi%LOwWN`RcekSt8@sxcX8gnIFEW@+6JJ-x4Ib&f5TYE$6?Ce7Z>>H9{
zY|rP1oW1JZv}5xU5ljXdRdATgR$2$-e0*~msr%9G_-^#=^z0$sMFRt!bMH`}W*fun
zs<ni~Uep)I6_Ecg=C}%Pv8%Xqi_s1s#NSRnzq}oNxqmkIJSFpZyyW!tCmHAYB9OIS
zADOXaN|UmXs3Lapb8EreBie$qFlw|<9;D$J{|u97bYoOv(>8{*1whh7L(m_7ub-+3
z!!9Mq2m!~{V|<Rp+epuI4W9J3_WGfD$yq~K4j4<g?i4`CvRDqkysd;lE++Y}oAUn{
zesN6Y)Vfv93e5T?i3tZAe}%c||LH~q1`92+yhaME_ZHOj`4C@8{koA0)#xW?UX4p;
z^lR2J+TMC*QFY)k#R+L%C$ArY?XfsgMjO!pki4~~z(SsthTx7(3%4@Gwd{Qut!A${
zz^-FDP3}f%r}S)@oe&20@6A!*Hra}+jki_h;clISJ@8i0H`Jv<CPG!|;*27-N(1>k
zTI@{mYpu2Q9cKUX@pAO}GPg451v@p=ob79_ZjhI+Iy;1yjstp1ABF2wPPi{0<%h$M
z|4n#*#|gq4RU-8SB`TmJ-@AY|jYT|a6&VLpCh($U<K~yv033!71fsbY*?xvr@^d>E
zaV(Ik%3jPWiL>zWEH8>ua{UdJ95m<)a>c&fQpLU;crBGm3B@>ro+J>&aXy||0o$jP
z1?EI-k$L+`>n3pltjA0K;OR2gx71Nm7^|1>A%pR}0-LGvJ2Lz@+}9yXH2p09tx?FW
zJ#m~H1Oi}xeP62o){0O~gCQOQgg;3uZ-UcC*(<?m^Yy;S)=AsYz1svc@l+G3e|9O@
z#__gMy`UPWygV-cX<bkZogY59^`Vyfu2-9N;N8Z?KFc-mb2(QZ1)JC%J<RNOg^&%?
z7TWB_y&*S^uOiQ9m;aIQ_!Q%=>|R=6#`mCSSmEr@MTetnAUY5kjTWmKF^Z60g)UcO
zRRVR7gah+1!q_HFd=(1^S1>LEz~sCqYL8PjSew|)_NkAjoy_~<CLkD|ojap*P=m=~
zn+h+O^w-)^<4|)99)z+)Ra_kis;q=6%X_i$bk`{<SUL;^iD}+ae+tC!>dL_ZA4N<@
zU<?T~?!sT}jExhF^>|Lwak85U$^Uj^{TNdgSc-UGmr0YgnD<(c<$F9a?ocx{V)d3k
zOQnllOP9BsM6%G2McSer?V6Iw&(04T#k3g?wZLQwc(CR06?+n@YQO15CTB<|-1uOd
z2mQw>F|--x40DsBo}mBbSmV&Na7;6F)`iaW#|(vEDM~w|HJ{<cqT~bE9<&wJ`J*Nm
znwMlIHLM@MT!)j3SaT|C_&ySYjLX=6FG`|%1QaoS4qlIc0l7eIudeZF$*27I4Gf!*
zgR!T!!$@;G^xp}K#_#3+Q()WJFElt+-eW57V$D~$**3O!59snC?2rhhjOYqY!}Sr2
zg^ms<3SA=Yr1`{f-Z)ZddG#kJ6)H&R=-68tfv}*z!OG;u?akR|vH9VwytTd?Y2zxJ
zW3}5A|JaL3oAt~2<<8Fd3;q(V+IJ6K@ONDs0j?P<{|>2vUwuwq-8jJd<|DXZBuZ3a
z$R`2x{n^|+l}ICWNVf`SGV7+a`0)7ilehu4y``j$4Mx%7yHFQQn&g{&=Yn{y0?Hx@
z(G$h2w6P;y%kHu$ea*?I2O(>Mu;AY!GV;f$pz0UK-&p;T^(kKP&&>rlx8->j&(EiK
zzA_pI9V<Tem(REuaFfU3zpvQ1AyG9w!)5#CYL7M*ZsB;CNl(MyrS{Kv`d_)&1MrYU
z#!TEmsJIpxGlJ;;6bIUuOPB}T)lu{(>C6ei9fM%n8H6)l3wSL>y~)Dc-j$t!XOj6q
zIWN%}A8I_;RJ8I3h%H3C3#=BlMHIvjTZB6Eq92Npyz_2dFpxB4ZDj13SBpCY7wW|s
z4u+>I5p2$%U2hF7ic3<$7}E_-0`i_kS?Yhbqd>>x0qGQx46Ahu)j}p(1~2Hz--S);
zeSVXJl^Z(b#0X}hhbt22#06!}<v)CX2S7p<fC8?QioCUVb8#+V^cttOQvk6&U8V~K
z8U}k|`#|t)Deyb7AMK2o!iZM<Ny^)lnqQMad*3_-?IZA{FoowOa?5uK)kr#Mtr`sn
zfG+FxQEL~{UXV2w^OOMH&N<LQc?;~R_JrRNYX(OgjQG&Fa7&f5@ndW(FF?94zS29G
zu}0r8pk<6n)xZmdMxFN7W_nA*;lko@2yr!87V{L3?}^k-bHh4;ByhQdUU_$Y{&;&j
zbKjj>t|e@dnDG?yygHXCxJR<d^=<7b<el!LEnbF;Xi_YyKUev8tTy^xEjViSbCXb>
z>R0;lpHRpZ)8cZXVjKbN?o^XMH@`)0oZS-5_EmC|`*FUS-`Xz@+9((r3ag<!(9y~q
za#yT0Gu~w3>MDF8Gp7*O5z-KLw0*e#ktvCU`n2!})1cXu-VR%a8q3pFY2<5r`#(9w
zCGRBb5O29=HqH{+hS{mxruQjD#4U+{47>7B<Y>kTUI`%tEaeMg^J^khjcQ6}X+U*2
zf(b_y4S<C=New}Trs?WMxrXjLm!rtl`0f?9?sr#g*zc~?egL^&JU0p$P>yG4o-shB
zDQG+w+q45SL|~CZyimo8;{_Mfa~w_LXIeix+S}Tk0pI1YI=x$O%0)oDUane>#hdCK
zaIlg(lU_;|JIaRMDK_hXe1WzvL-G+JEJl|H)x?P7Gq-kB>R}t8@7^ZUMOR5gV2&nX
zp2=)DsJBWr8=keNbXfrq${?N4dBEU=i+jS`ZV8v~p1eAiWcprU1_w=#)O$S0Wf)tY
zPhb>vHItq|AO@OPt;JzaJta_I;QPH=ZEissfq!o1i`=R7NQe@)d|>u3I9dfVP|!kB
zfzrY`&NktS6=P8QkkpbXw^5}1>4hj4Eb)<xDzmI%2x`(?RyVPUF|=(t$&3+NO%*;A
zH&U=t^)<|PRgY9c6gBf-O=n7bf^ev5gVnypeh~~ns5dC1WDymKnG7#RKC$ANuUaj0
z6^Z&-w!>PWOqbP+ERDC$kE&k$Fn&1u;WN}L_pjxp|B9h1Bu42aT>%%fx<QqJSWZbx
z;oV&4s#ttt6O-rgh<yJXg2{h1qLUfvv9L79=)!TbSratm!6)$Tcj<kkP2kTGW-}P&
z67q7wnWf9JC+n7Nxv?TIDrori`>k`T7##*7Xa;b~+of}_5dUeJ3a&2s5aUNYJ+!v^
zH+Z%f2$BnSWy@NH`aUVF^v;C33WVwEXZdpK6#45kQ9^66wPrOW;gqzF*gu%^!5M{e
zm)Z+26D>9^JgF(1(2F6giv5mw+f&S}*FEcN%AEmfA+kBcp87~)Au)7@+g*UQh<7E0
z2C+<TPu^+eP7*}uS7~<TPpkgOT{A~xF&hSR#)=x6h`-t<GPY4IRHH;`MtRQk7XB!Z
zh5vcgNda?Q{dYHaGb;M__=S)8Z}sN?MVqHsxe0R>|AU4?9B|TQg-utWfM2Jmd;X>L
z2oL?=2)u*tGC;KwmfC!HptuSO2vaIUMfYy}>wOK{Q1++<dq6nYW6)6V_;~FEcMz2l
z(Befy#Y7Vzt-Y)9mHp!*(K|{1h3gP&L}oT;6PhBz6+-_;Z`Pq5LO@yF-lrNtvF%z&
zhudT*DS*}le+!BSN#fBA|4uVbitErjbP;2zOsZvI^o$K9=(rG3$BnhO_6rO<t^9@h
z6%W@6<}U5j%d)uFB!;O&fAi_n_#865EbAI`UqO?pDhy*y)YrtvrYSh-tbG+)XCo8f
zIfhVYyuqt0)?CkuTz@MWA-haVA?#%%>?eO1#hmdKd2SQ6h;vPg|3bka;8l+(BB?0i
zRK0H6Wi)x*^-Yr)E!L~ZXXX{Wgs{}$JOcEecUyS^Ig@ZT7m`Clqi0w-v0_pbBYReJ
z@q(}AFnms|Z@IwCj<u(Hk5<7S6nWgdP}2&byi)6(!bg@$kJp#qXhqzwvX*c<S`o_5
zB*_k%sRlAx45sE@7sVqC97KFbLOJk!NlIM!0j6HS10)D{04u8q<nFG6SA-my?D2V7
z9?Pv(&ddT)5F|q4`s)v{Y)`LlKjR|+en<+aY|?Z<GFWSlSiV9T*k8qIh^s1?BKo6x
zKM(U9l+S)t!jysp9PQx?lSTrpEZi7<DrXgna}Hkg;H2P7q}hcnm~`!WFBX@i12oBs
z)J6#)1?78>*FzalJGH8cKR@pZBkZZ+z|4#Dj_@KO@e=dy#2q$)qfwW7LBKx5!w7{j
z>IdiMfP0c)bZ!iPFLx6Trfo8n;G0a0X;jV=j8-e|CXpo`oBw~Y`GO6&zN(p(G7jv5
ztc=>(tqhF-MAtf%D(r2##vX&#o|v^gxcP(~YINSripkJ+sRzN&riuV=-qeDydRN;o
zS$R3yg9OY1upaQu448Dnct)hD;2YFA=xaxm4&o*Ew=~mnlS|{#{AUmfdr2C&Q=$*w
zg6?ccXv?vvitMPZ;m?0M6sMeKhN$q2qae5$*m0<0wqjW3`JXJBbRt_PAm8sH&W6Du
zAl#kLBPl62W;jdRf-9UV58naL_U9fWa6m>z5<Q=-Xx8W#X`?6^pUD){61KKmQB{!8
zVy?99A~x#DHIlQG+h#>xW;XN{Bh{2Re7~|k8b)_q-$;qCRTk+pD`0s})^J}Tl$%Hu
zC+eN-pb(^GYlpSWuP84hNe4}zyUU{<-01yV1g{8DqM9vs_WjxEwX>Y!7fmP2c98Q>
z{?S&*^!kP>#E*Bu!Cb5p5csDv2+tT35Uxsz0m}y1oFEm8?54}!NOWW{N0@1)J;JC$
zB9=Nq8Ho@Avta$KrPuUpELq#-)+B3NM>?dwqZJ*ZNb~Q^>({SeRjorK{95?{y6D27
zyU_10^tubwIGbPY&M$T67rXNd-TC?M{IEMe=+5`M^S$mor-#ek;Zk?F*c~o(hx6Ux
zusa-dhyCub*Bx?tu-qLibq9;x!9sU1-yICQgF$!D?+$w10jK-R-TqRyzu4_Bbo=w&
z{;=C0bo>2ozt`<^y0_fzEp>Z~-QGgCH{b0IyS+iT*YEav-5#fxyVX*+TI7X!O<sk^
zaZ7&IRf1>MIm!PztjZnEq-4iU=&n4QP>6jnQEEFCTUm&TwkE2td6gJk-`*x9uVt2Q
zS2^_XYxjPSR~sc^?|tWKWW+Kh#rqE565-d3P<b#pv-DTNe+b($N)DB#QWx{ra}(XD
zS^^;Ms8Y`Ua4=*dxa6?ol+eq6BDQ3|>Hk-eqW`0TplGP)<3Py)a0r6q^7JZY*(5&g
zcJ8&VbTTz8cGE7G-j!X^wY6WRTd6*XZV(lLFfYT4F#DH=>0XEy^}Dm925B+$=$(2r
z<ed>jn+0ze!BUpuH-@J7py=J$-z<povK?okquiajh0Jg_Jx*e-JNYBFU5Ki`xa@EQ
z^_LD{ADU!AqA;h9GdB~9$eL9moF560rsO0D{*6Gc>v%XN>DyV#wRqJzy16}k{MfSO
zXuZipuYw9k>y!Ly<n^UoQ1}s<yxBzZip45`!vjYOTKIY$AhA~0;xnr=zBK2M(hI}v
zf|90F{%1M{ry1aXWC_Q&Kp`e+6;^`L>sbW6_8kZ<$V}H?PIK3i?|U(W7}MYFju&8$
z!?l~+Uu$Si#xK0(2Gz^AqRA?uB98Y@&5UxnS+Az{l?hY_O%S;zNI5}xsu%M_o?zgf
z9D2AsO7gfoB7K+-2}IPF)3dvqkxWR+q|AIFHYx8|HT86_nxVI<ba9n0=3aK5oR810
zB64WYK%xS1fIPTBA~~XE+bLfwJTHx}V6?fmdlXbz^rt+DDDdjg!jpJRscQvQ8>iXz
z{q%3@6(6-vt)DN1mHtq_dg-SG7BVAT$T8(U<ZfZwnQz0sm%LLy1c$@r&*LjH$b=cA
z!LrYF5=_QQXTk1&YGM$1x0GGZANOqzE{+Ui1`ETmWSKdUZmH4d4S6}@p-$y&KhYHq
z6vg-As=99ih%llV7e&qJ_HL==6hwK?gm$K1fv&X|;2lge9Gz+5EkZLmo5DpJh(l;8
zzRut`CeR-T<a#VGV1Kq(A#z(6q1B<aP152H0P!0|X{Vu-&$*rgWzFJ0vG%_zD1hj;
zSoQa}SLA8i!5y->LMldR%R7RatXI~YV;kUUqIl#H8a#|kjZK8bRC<Vm_ixJrd^=Qf
z1WHZ4ab-Rmx4h6n@ju0RHA?kbM{)%tZWDgNPQiV=uEwE8=CP1gSRfmk7ACKi+^S<Z
z^ttRgDscwu4XTs9+cn+DrFYfnX{#)%`(st}qbDCy@;V3K12@O<0(fcCPHbW$ir>|h
z&>4Km`9d!#7bHcRClB%g8{3;Sqszb>%t&1U^(?G{Ye-9w5}Ff@1A`?hXV}VxAzDrS
zcL9Zb%a&=lRCkG{sc}4948#Y)cMJd{0Vj|)8AZuAuh6^pY*#O@g{EYe`zH5NjS#ET
zdaYSBiMuEeH`nCDO6L8sNa$sP3aZvLtvsK{VHA5P?rv8t2}v&z({2DChCGz9g+>7u
zsw&98aLI#P$s-a^$>qH(g-zEHd`it{sv9Qf&Cg@%B3TJHV%{G;#~3V{PV2Iilw3|^
z*g3xjyZeCHpE>W|cW(_GAl@(Spy!aBi>}<-hgZx_X+ViR3Xc4Pqb+CPXijDP+y;db
z$>)&ws`g9FP$s$96+wFiQ>1n5?VFq1(3z(y*a9G(-aWLyOFjZnAXM;xD77xAHN*5{
zgwr9>dnOUFNMRa&FUl!h$Eq+*)O=44@%gRbYPNA%F{BnI9>cekR)!KIydeYfB9IkK
zM}Q6!c5Ca1$*A5QyBrT{mhBet&R0v|0MH1Usx(5zfm!QbQGbT9$!5)2N8l*(Qgp)S
zg$wRWua3)t#ZK*7U#H77^n^P4EwmFIE*EhEJpu7hWs-(Bj{2~ul7<3p=3ofFm^_v-
zv|4e|@XE<{O^!Zj!Rhdc=A>RYtbTGwqc*hNQAwtSQ(xj3<B))i{FDvOsVaAPx})hi
z6AR@|>ofiyu#B;A1UXlDsizzd?Li<e4FHKZd?hefoSs+ymnU40McSP)Y50pIJ*ow*
zAtWA`ULU#N58q1N;E%o3gwonxY9F9`YcI9k%~8nNOTXM|lHwe%anNG<qZu@ih?eTl
z_UW&b7da{FscXzUYhlUQG8`b+IaelyYo+I&P^OfDR84j{U34TnN`d-VI^+*%=taPM
z+&q4~8ov_L3G})Pb1;g`&q4GJAbDxHh`(Gs2ur<%;oJg52)zZwwg=P^^UF|t{Ea24
z43NR&K%!y7h{$nNb53h9afl-E_m~}r3X9uO`*a>ueI6JYi8`_Gkuj0KT4}UQ921aw
z3%UF>W5!1(a%N&2r;#gowjIIBDQ{fp_(x&=5m~`R9+_xYT3p{KCTIso^3h|u^367t
zz8k;?TP~^%tt`iflG>_0rm>JWBWWg8(^`7fQPo9KL`d;E`}|1B7mbU~q4Ja8?g8<m
zL}jzS2cwY33IPOli0B1etc;oWTungylQsY}Y!BFp=xe=41XmiNYC8(vJQwhl-48W*
zrK_AkEl>J8y8iY6pe5iZ-O8sJML}~=J6t6{a=8)mBJcQd-ep{1D7NZ4Ihi5}7>d+0
z+r+@3DW}MfQA$>$P%-&{?VPI#u8FtR?*erJGG4gstULyy-LTg6jdc@{(M0H7K(Bko
zvEH9vP~+en;3hfEb|WiGi9x@zJ#*UqNVol$22Y+1ynNIW^2}LY;-k@m&y+|Erd-Um
z=@k??E$sC73oJhSYvpEt=hsns@!4N<`g^~QqKwb}dLXtCmZqZO7nJ4T7^fl462goN
zj^XXD12v)ZF@~}XnrDG*iHMLChqlL+Mp}c=5c_!*^KpO;F-u~O6hT_7`J_e52*$gW
zsQgsOGd3At#<vKwt_*U-LF0d7HZNY-*M!QiGTMd145mDkRLK4)ysYvMaAULnC)W#W
zs56UES}CG%D6>^%qO#7s`W;tS?b`}h*ODKPNYb-RSfSTN=9igXk(+yQsRG1*Xm%{d
z+XPAzj+&ZSDS2v=_Etoxq`IZ}Xe6jAcb0WOi;Plw<opee<xL3+RlUk7MzlMfjL%%@
z?jZ&2-nJcCn;PKU2O#*qKAf^<)}_$m@*IPv>>>4p(980uaS{hZ!+|zQ8ZiNDl@UEw
zvmIJY;(vpykO;k~)hVPTBkJ<vR~SmDlX3McPAZox>IP}X%%O=n0?D-jRR#hmdqos%
zyEl&)fi=DIEhlyG<{~IJ5XLsqCT@+o`r;R{tW;+#G%uvRPF|5XJHoLuin{6d9ViDz
zH$Rsil?meWq6cG6(K;<OZR%5;Z@}<CekffN-OWTr7`B$L;kL_;AZCV6Aj@b|X?-I!
zZ6~u%90nRFA`+6}E-?2H)VkgOsQ%bcztpbpiC=QVKmAFHk9?w<xKvRkina>}zVy6+
zK=^xlSQs+WwV-U0>m+c*3{D;`=lV1I7uJr^$IAy57p*bwC=7K`fZ^+9bgg-Ksg}qy
z94^ccDRRL6(&BP|X<@0~?@=HC#<%`(Q}Jy3vKYB5>-e%=bo9U)(P8^oty%3fjLJK2
zd4*#_;rgN`^Q(Z3_>+W?(Nw9XLuB*<Vo9mJ{CUT(mOaUSr@ZsjNI{mi^8(s7jM~b2
zdbF8JieNGqMIOjHz*Qnwa1!U(s@*Yh)!XstBk860SEu(Bx>ZQS#`w#>y?iO!sp7!j
zUTWo;dGaHc6J#<9G!A;HRa<Ky`j}2<R`V#|Q*G?+qwuMak6usd@i=egCy9oojT_<z
z=v4u;?-?)1qU1V(dWlFI(k@nY)C<H?lwA#3*WVQt1gT@%5Lz)m^5Ba_DaK}s!Fv>j
zDb?mqPWDynAR687-kON&RIl4f6dCI!k}0^4WwS)+Cu--aimExEfIiBrR3jiaAXx-}
zL}w^k!3ubU<ND=>97s~UQns9k+lBqhm8(;oa4;^svLtRKMYs0Kco9|iwpj$Mh1QrX
z*rwV`JqK{3OhuHFk1r)X{_rXxRm5&m2h?Y@B`vmf6lyXZmTup&WTv5ngN#l}g55nV
z$w-XB>%l@FenyzqDf>IOJRpSC=l>QLsR2R;&EkB2VNR@bgZ_3l)iu39`HAE4U|nAA
zIeM&yGV+SWUhV7*hg|k`PWb84;b>DC0{aJRDdv#PE=<R&*ybw~s!CoKhB4ji#2wYs
zWGo19K;0GnV|!x`yv+9AD%IOu<ftwfBH!*pcJBI?>FXlfQ%-1+;)wkSeSI*TUs!yN
zPD7qki7ZF&D0bsBYUqS(G&qic`PCFTqn+wBUQsS3cMo!$j=^wHxsRXtO#2HGzN9lc
z5Wd$O6d~9nV)8woH}D&1+5tO8J4jbDq^Jq>(Og^CU&U*!OjF3sw$7`Wf1jy$UZGi~
z1%d}*IY$B+>Pa{{P_~$0ns+!%lCN7cu)-b{)OQCe&|L>O6+Xc=g5x+=+;-7{B-BZw
zSU+)jAC)5T0{)~B#deYC^4!Ue(SZB(xKfVoTgrs2#=APu<A87TGFt*INp};3EBNBr
zF^#47_@0L^w-mkHePf;dw^e#wS}XLiI)`Koktm}#Qq$y3S6Y3q*Q6GR32L3J7fBvf
zhVas*4c@~62QB&jj-;i9{|hg@@qu3YVN~^!mRA6#mx*ViFr(fjjaTutaxoEIGzJO2
z0AWtl-+1;+cvwDOvBLeUgW4JaR`E3!6nIT^`zzIp(-&<FQ4kf(!2Ku*SIaz5%`?Gc
zHN($QR9lsWpc?5`E^jZ@)rG#Y9-DWhTDO>OaS-VP5~pwqe$;zDCq$j*liND)=!0rE
zlbbU^rrHiD=1J01vI_o;qGHs2TW_E^R0?Yct3@H+tgMvMtuWu(hF&|0O?z!nscTJc
z7-$H|)dAwF7S5c6_8T(le2SNUQZPkn-dm1O;b`6!-`G`(QgtyaA`|<02JT>hWb~JT
zEq*6|0e_^+muHkl#2ITEP=1gorEZXf(*{G{IjjO`PohWNHpgJ@CPXhZTx)yBcsudG
zG-T6IV2JHePSJ1#?-X!$ww<MZ^9zLl@xUJLtjPn*ACC@7vaF+3=kd|(!czqD^`Nrr
z3mO$(7l&)S&!!IUrD+AIa~mhy)|<CSVYh2QP9E%tJ(&<+vi2n_<YL2zd`mgYbCW3D
zCje;^j?btHda^;wr9!4GDbj)<@}@j=T}%F%lPPj@1o?m}-qigZee)e4Q(W*LOLMob
z%TrQjz1A9LhSCY2AxcW-{<J?$(U_tHFPZ~x$hVl(zf)%W+D2&8KVeq~fv@_j*ynX|
z?2+v;^R4@9P+3&Ja_|{{X7*;PAhTc%{ME%w!Z%1>rxRvtJqh53qb%GSiJ&9k-Z-%Z
zCfz$efaLlO?$Ldfd6pkWYpt!R*~J)zTM%`qo~abyW;STW>o`l2lCg&TU;+ehqQZ3)
zYphJQ5Fw(<>!1C7>7rNF|Lv+sT<HAY_$hsvF8yP*xeFuk0a~=KjNcQ8>Me48$sq7t
zi+OpJO=fR}@_%$hYr9<*<pEVT@#SI+9f=@DO)IkYV@lFtL)b#5`)GoLXf1Bi-j@O+
zx-6Oqr0kFgkn1KXQ}3`$$P;IbXbFO}r`U%0<Fks~t8<m6q9i)^bl!>H2zFB|0nr&O
zHRfhSi0!NqB1yKH$isJ}y5Y+w0(-EbA~LF;(_gkginJPk(8dYTj#@NLJCSk(Ha<j-
z09+u%a(PkF`NrK(po!DX!CFPut2~N=-)7-kD1de>j-ZE<5lWhQ;Xb!c!Y^fd((@Rr
z#P!2Hh(1ap2E;$CEptRALh~#*tgwWOJs2J9ghfOHofQeMdF^)g>(*|kbUhtDe0eYs
z$hFkEAeu75D8_P|K(@HDDlV-@C#h-DBba=0Waj$Ma=s;sT7xK>ZA!0~*$g`rx;}Wr
z3MH>Rc7N;Kf0kx+NkLEAqU4xE3T=8cPaf3UD-ywaa_HDJ(k2MoC2sf(${n4N<39qG
zAcvl&z}oHygQC$Li1gQv=mSMnR8Mqwb3LXYN$K8-QSC|7>o{>#_AsMn<xLe48Xi>G
z+zOLuima|2t-ba9@D`qo4>%(iv|dNJaC0M~g@KhHZSG=va;tzI(DdeuW4zXm;{t4<
zB?7BeWtxoWHXRIzl!s6FW(ie&x{eC#A|TEtm#BN1)pNon_?`4LgS3GD^KJ@kq9G(e
zi~-9btz9H1tM@}<`6OwippfKP<S;DtmZJpk2j<?Mrao?TFjN})pl(Q9+i4Vs){*F^
zlLgf)agw1Wi=1Y^p^kEEG#S|ijp|z-;F}Vbr!~uY{d{9;f8Z4xL!~8mWMH<uC`s!n
zO?6kkL;AMWS*(u2N0mYn^c?(?`)>{e34+ZMDMaZ6DQ41Qr_`nQxhb65%$1O_x#+T8
z>Cn82Pj!!#Gh@Z4uefUYsb}yJ5G*BP!{9nt`6-D9jZRd*nP9QkzZ|#>0=6=GKbc`;
zBM7ZS$!2(At(r1rD@Fra{JSkBZV=JUSyxsXlz+wDz33J3p76k*oZek3I<R%WcX5QL
zJ#wqR0xmuMGBz8zHfYIGcK&|1J=R&e7mB<*`Mz7X#jlqys9qp>$M@Cs%|rG1^y}Ey
zbFhri)VJXKGu(J;1|aKx)V2P8mSFM?W%LcxnvPU%zoupopK?g)i`EOIg3+44Ri21T
z=pLy9DG>;3jaz%QUa_|wQ;|KMA~!NSLH6a{|5>r0P~CBXk7$W3c4Z96gl~S5%i)%X
zQrxUK<Z>>>e)59y8h(njM}AaXFk#X9svV$8)YuU9O>t%02OcK~w>h+Lp(cJ<d22&X
zX}$1LqzxK5RV8Hf$B0W{WCD(X{#3DqE@&@SA@m4YsLVIAKQeYe%A*h_eeNHUw8<7i
z@~7pkRf?-sh->HaI<#}6a{s=XK~aa50@yB;6TfP_C9Gjmq$P~2mzyt9BgJzpds)Ww
z_hzpzNVC`4&oOLNZxohEzDXREQl$*o6dVxkDQOkyp`~&Wx}8^3Ngr*5aQ}xr);WsR
zIBA{|l2bA2I2&2^SmZ@!IHL{8&|rnk)dH0vqb6OF0)V|W+Q7+)1QW!T`Ov4<)!lfk
zWm42xx<=!t5z^U}#+ghl)mreT!AM%*5_7Z(U4m=kBW&J3Pj^lDRI&-%Q5O14*w!@4
z*A56_jxLB%qJLRp|6@86o}8#5$;kZ>E?;Gh#&ss1+uD7Lhw{jSHZ<pS#Nmg)3Zcx(
z?#G5nUe{-toc!_V(L)U^U7X_=q08oMZ~@4?K1R`w+5(6k;c*Q*3d6QHPq~cJJC-Qk
zA$7R;a&wJ1Ht3_gB9gV!b?gXJnl8q5)~!B@3;=rGowc{v0C;N&TWai2dkhO1k1*1`
zaL};~)9N)5JtK!pxGAFoZABayXhB9F?T?O*U>)pS-Hu$OjY|#)P|8BcuE{buWnHnA
zyv&$ZznTfZ);pdY>Ffj6$3dhxfFPET+iRHP|7T5IJKU;BOmo`!$|?>4**a#)#vN6t
zHd5d{+?i?2TE@k{`Typ+-~B)AS^iI6qm<)~<0!V06AifC=GEz^`|S3T5)<g}2J@}9
zt4B;)r-<M;-fB$FCUY0h1KNj7wM14%3$;N2YN6JV4qoNQFRPcIHYhfIdVO@WO9=7n
zSY4Za-R3o)v|=<o|BKBS@tO!x|DivRzN)AVZm;ck$gg?NI|fHYV5%4Y{Y4e_WW(F~
zh_V@)K&uFRU1&8!<{*wfpkOm##a|tFI@;b6N1r-aHWM_$v9^GB-Wn<F62?*F#A5u_
z-Vt2%>#I_h#?c*3c`7DWFs;#fuH#qgv$d!cG~xOjW#K5aeJjWvh92Q~U=Z@wbYm6J
z5}1(_`Fs@@h%6E#lzIC>VaIA7wYcu$4{OJ*sSzDt9OeByk}`(y1GCBp^l&Q5q8Xb_
zcrUXFze(#x8bg$O=yMTbO=A=von1_<kgkL*%aI`J8eB{0M2^2G_Ttp$h*z1l_rdP;
zTIq>~4o^A<U+HWx@A{ZG<IZG*Z4hrX?=a{LV35S$s|gsS>a{^Qc~0zvuT9dIN))fE
zGxnd9H*G^3MFTEZ7m=)#BaO4<tvQq9HJ#YN3l@6PRDZo}n3j=)ET~hpD-~bYx=HOu
zORH4qXcDQ54PeG%X-$d+ZI7>`6t`el$L4`%>^jvp=SplJ@9d^LiL;j|s_!wl4uwRf
zxKTi396iUK4>ya`CIdUJka`<d{yhE{lsMWNP;us;1Tq3<6O+zj2<3e8P-i0T227(A
zOy+nv=SW<@W=u#GhE--!)6hZy*2x_j=KF6I9TvvJOLemw>$H4qzkivI5eh&P^IyZW
zu8?lF=BD}qZH)P;)K1%I(-jq;N-KLE`_(4^4pW-{kqt$q&ofEtdWX9!`^r|9OT}IX
zRjkSz7LEJ6BCdghh}P091OHPlRJq{SF7)(rG^$CXic6x3(yCd<=9B9oX1pBK6sGW(
zZ)^|V!Z9RwrwEYfgTgbZssQLj>KODb<%2|OX2{-Lzf*;x)TP3k(FKh+(BkPB%>w0V
zYwyJU*6ez+hrv_DbdQ~2B2Z4=86050cFvn@78WWNE1pz@$W)@cHQiW*x3?X-;ZI&C
zpch7cuJyv@Xb9aVBMBIO%v`i<`bqr|X&H%=DDS8z?!TFq^m`^jT#;-7E7=;Fya3ZY
zmp?-a5RchD3iJ(_n;&R}<Wdu?qbXK&Lh`JJk)4SDrNG{2;k5zI4f^8mLvX);O#p3u
z%{8?lucU}Rg9je`vy1r!c6I~xaRcIVX~_ei{ffOaB5=dALB)3B)o%}kn&D6G;^e47
za;hLlCKt}C04@MU+!G7|;~(C6lwR~gAT5V)w>FO~{RNyy7ho_P$V<%%n@U_Lmnsl+
zkv+?YWN01s3};|OjSdxb>Ypnrclkg#11Q&^VT-jDGKzulqzKLi^grnN%;C-TwfF0K
zyezaj)5_V|3#z+1O00*CrS|a=q^Fh@EzAMZK<Y<K7{MZbN!anZ!qLT~n#(@Fy;PD7
zHeWy{q76fKtsJbqO`=)N7FzN4wog_xXFisSW6%nTH%}_e^JfY2|JDWHsB-`Pdz_oC
z={pY@-vC<s?rW$f{BK-9xs2bq;`9#m=J&6;wc+%gSG~Ny`Hg#AlZf%%hdln)x2Q!s
zayF@`Dnn{-6?9acf{~POWVlBRJa&7iSns!2RyVe*MbUX^LWBq-BiP&M&!&{jk>{M#
zAAGx=mWY0W(Zc^u!V@k*5XBAX<DPUspLo1}owlIY221<8xke_IeKaUlwpTx%p1UFn
zuGYqPFWMk^<v}}V=X|WK_8yYO@#J&|2(k!Gr8|xuA1<%%U!TDS!^#qPBiXVP!zhm&
zZs#N9;=lNGb{3}tJ4|)w>c#BE{?=OSM;v>9)t#L^y`L4jYkXdL|L*qm;dA#Tgr8r=
zKadjnc=v<2f$x8~{dE5?SN$KZE<e8HSNK_R>hjB9X3rlmCSR1b<Mz2lr@X`c&EwtK
zm|Ogi;~um9AMWqYexRf>UV{Xa9^Lw*!W>eG-4`$G2Z6g5C*#YiIdne$`1q+j@NjxN
z3x)W@JqP2{f1Ld^sAl)(s@aR{o7o!*7|dS4ay*{Bjmo=xGRMPnFiV-U>mXe6O@TRF
zTz>np{EV_8UcTs{3}o_I5f8-6E7nAecJ@87EN=hh@*Z&Cr&)gba3kmPJqOd%XK{@`
zd}d!P38!}$z&F>krxz5Onw86M>^f+(@6Cnq<JYTGt^~4uc`<u+`f+@v8|AyFFQ{E2
zoJxo8$5en&O=q_Nj+e``hgnc|r+O<6{&IJE`<1^qN8x$kHK9yV9lIZSgX{CtJB^BI
zO%#W9zw1voG<)54)hxIDPqlg)v-|tG3D-x|pGw+d8oDB8d@*}<`zZ!arA^hS(?(2C
zw*Hm>X|5`5Q=2+_cK7S8{+`n*S6`|Z_}>6RU!QB9laFzAe(!mH3y<*YbgFZ>f*pxb
zuyKIyja%9O>4)l_M~jH!r)Qt9&h^tcsem6LME<qg6fTjIkq9qgi!Qc1V)K+IRpbV*
zF!q3A^yaJdSC4W?)Z}vQ7bRNyx;e?y0((ZLc(bVwKoQ4QWy922Yw!Gwz3p}D0(wr`
zhB||yvZR!c?{Jil&pN;lMmuW|24GShRINQEDj#|FbD>`M;@wyG<K0*H<9=U$jhhiR
z0)0X3cdcqRI~TGj^^p#lP4vXY>EqSIYw{dWkN8wI!RX^P^4AxudC{&mF-hXd_|)8L
z0#RFkQ2UeaqtGUx3BDm|5bo5q?RB+0$_AGmh_FH0AfoN8F7}T2zk2}gWb!qAKKbS=
z6jf_BJCNbSsD`L(Ke2ukxA<~;|1+f3Wd22@aHjXDN?0PY=GZKBH*!90y&>0eKgOFo
zrWtuUNik~>_ZwBSsUEoe6rF(b$T&a8RCvDznWtmj0tKXPV%Ri3K!x<TL+lo36Q9eU
z_;hjUyA{_C`ff%3I(u*~+IBs)?fO;+Wi}0@s@d#dPH?lsqj!h@NQx+xFH(cx0F^Yf
z=0?TpTYbz3FQNUle86WW;1vT7EsQtdBnIBTE%Jz#<LcVP`bL)ogjF@0936d-iK+0^
zo2@mf*`YdD@_ymLnmh^d8#h5ySl>GFO=6QaWUAOtMle4yl?!*EsY<oXdaA~@#wnI>
zIZE#yJl`)r0a0R$e|3ev<M@E8VBizLi&I!;<iDvcKe&Yt)<5Vv>;Pac`4_N&!psa-
zN5GUAA#`MY1g(}qql+1XaM-c2_I4Yqgv}LwA8&k2sMa;MLsl`)9owfyI!@-+UVQ#U
zk#t$d(Uwyyq|*dbL{ZGtbS}QKH;k|B4b^N??(V3OQ%Eo$QtLgq!lQ<GvYBv0@L}bC
z=P<H)>6ZO>$91>Fw(c*dD(lbG`=@g>R(3_yTWY|SZ3>F<mLYNH(pwv@L2Kzv)WjBs
z%zr&Q&3`uadnS-FnWxH*%OGxK$@akCf_PF-Ts=7QfMirt1IYt2Hg&(_!vljXoxmI~
z)k`nc>!d_?7TuWC;!O0c7M!n`&v_F-l27Zk^F}AhUk%J=Q;+@t>{{7=H0VXEuNZZC
zS-d#+@A=^ou{`}gHu<3Tgw%q$Fh57i*8H)*mf^W~vgxml%}!byLV~8boiCw5*wnBl
zSWGk!tgPH$ttFU2RrWY<*{r~v>qwTEC3mB0HuZG|6qY|N=W!NYbd;S;$?D|pDE=sr
zXRSq#W;>!6E7GmBXriabD<8pG!V5dim6vXA%4(!ye7<WM48V-|5j)OKb^AZwLOou-
zA&-A{J@mz$TM8}QkAvw`y{TFSQ^c%mX-V#_itSVObK}|MS9-nXWy|$bo;Z)kt+d2%
zv0gD~HYM`swW%en>MgMus))6pyh(XN_|!k8;5w$n`0nfFnHT@Zwq*GEGBhv_tFsa+
zY@#u&{iIWV_J*dhUjs&86)NF{n5p`60^pXj(4G7JSCar<{m{9+Bpc2NlF}6QH2j+7
zlgfdS1Uy&xK5K_>4k5nREn|}jc=qsZ=Hd$X8UN_XnscIdX24xy%#A}AqO<c7O2j9P
zY0nWxJ;U8|BsR$w!Y>CCLK5Dw>7(K(N=IWezHTjj-I|vdq)9*K!Ro;~RtC=lZ|e0V
zYeZHNn*>yFy=WYUOcTh5Pk4V&;q*$=^O$MaY*MuO{^siIm_GVK1OANJ8!Mf@nZ~rr
zc6I-;Y|(lk7w0wnQE$x58`CFdeCOtlS@(;Z(!ir--iVOL;hVo!c|*S?qPp$ebcQG_
zu-3?Ub&p3F9@MSVhZu+b#ro(#i{V(>uJEvr?)1;MXwzOOMF+7^Ywe}cB>Sk&gM-gv
zFgB$gt#2Kg^0wL3TOauY@<tY-<Vj{B%4WPvy#fpV%8P|o6>vQ5?RnlwNm0ay3kc}o
z{*s4XI^pH@#Z5~@X$Yy`^Jq!iozeOILo?OW;;UJHr%Ebh#<1FWr0LDmNQ=#AmBS<l
z4)<yF(|e3~-UU7~aFoJ9#?l~;re-sKxKw!_F2x?2w|7BAs27k&`Ky@%PUVt1wo3g(
z<e62o*yd^jsH^l>zh+@Uqqj-5^HU)9bXZCjp)Nf}H%Wg92;hYE#rkUb4Pq(chPsz*
zx7dh&DZ<;jC;ec>OAEjr>n7#pF6FFwc2V*<y^g0Hb1Kg@zFbWN2AZyRYsDtlLQlRy
z%3t1yS3WM+5P!_(c6`+4c9;MX2sKqoG6l?OmJfC$RyA4@s)h&9G9oUbXavLB%<a7D
z2#3>@MTBTHCu<dr_iJ`^Mx-n2Q+@}tGbCGhO|m@63OQZPrhbaztgtDsMuBcoUwcr;
zHc6*87haba2po0ya2tK(7&3RUTCLkzKNd}VHVJ{fdF>s4BjemwJnzy{U3#_ZGCw%(
z*jBQ|WF0UY#Cui)>peXxggua{BtjY3fU7t`JMpjTbeB6Gj+%>rYt@Od8TuM|3ZbvD
zp*y@KpMpH0m{Qq>W|z(8%WUXpy+VRtW1b(=7I9Z0?5zc|;g8u{{RhnlDYUW4yR(FO
z(rbKYiIKLkO|<^lHbKmG=_9#<o1@i?4p%y^2g7s~lF6^3K^xkEl?+&-JLBuGG2`_8
z2fcyYfxr4OU#D~#npJT}j%k4bjBpX(q89PDu;G?GSdL$7i6FCt)ap(o|-&*NXG
zE<v?#L07~8a>^b-LT4!_F+i*$kDkbF;8WyLkknw}I5WG-8ydW6n9?@&FwVy`bkk|C
zkauUFCtl5(%_+ytOk~%eD$kseMsji8H4FI6MaopXe}MnuP6cUd9&kXg7q^cOXEQJV
zcKwpR@D<@=2$97A&0nLrkaJ@7G5B&N`DHPQ4+6>|bY5gT@*UDD@UpSDbNX%N3Pny<
z8IxRplpZliWVe)(`+9#JZ%Cu&s+ahPJGZI+Giq|)<cd$%k7xO^PW*6u-nqEVbRwVX
ztp<&f?n-O(R<wcXF}cwf)bPuP?1H<zKD&CPKC^s<hz>dVL5bl{>ilxbRc>6~#XIOS
z_?w8Rs?z}EPaZwJ{&m`!hZ`Ptr#w=U5@{4U0LDO@DZPx=r{e-JB5x^g+JPvEl7!i#
z&p8@bKPrAMOe*0(Gq-SKJWS2W&TTxRz9tn!b+FJ8d3q|Y>D>CU^=r+g-|{!zwRCGg
z>z$Wp6#;v9MH$8Xmj-hS5f~VcPPSG?<MVHHKW?Aia2WsrR30SclWWKd&~w#y__=g&
zM{6y9HXrOS$H9~RwQcG${B`}W`DzxI=KgYOY>&nt^@PRwg)!tG6wOix`SH78F~(;)
zIULLnhGT1c`DFU7-)^`JVaRx4*)KxQ@({vbzsEwn#>&YD*w#*VcenCu;E!sabF|Y-
zpSg%tyKufZ@QvPV*4Oim&E;+tzgFEUDmr<3;a|nGn1n0axZ$FSSQ*ts$`nORrY0K1
z!NL0O-uwM@aL{U=1M=q5i!IDA4i=W@`@YYIL-O&2+K)T&8{PeAbtm7NuhWrum8R_}
zxK(_Gkcl7a_FcG`AJD$~*}?h&hCtlA{X(hD{p|~9ywh+&c2(|@W73eKGijK?@Ej^8
zV+e{%j&}N!{q2@P=n5m3+A9sI)RgawcVP*;#dMsxQXWVbENb_xi$zALKkC!I@aex@
z|Lx%gu<I8;p1y#+0;&1p$B%4%^wt?$;xc^k<2hUWnhWRH@;woz@j@E6`F-Np&fW<?
zsnmO^OILUDkfJ-AmRQ+Y&n-I~o)00bjSgAQ6Qq|Wh<l%V+HE|b-_i#n)E=)N(r0@6
z|H#ME2z|AN9>$j=P+&9~^v@_?G9LDEH{9TLKqRP1Ha+a}_Yb2@Hg{Hrxs}pnLxE?-
zKTJo{C}&dUUlJbH?bfzkE3+BK{dDQ}%3EMHc`T>F-FkL<(Av&Eu(>zAh1L89tv>K*
z%)%>1qn>-PzK;*G-HS2MHuG)L{poLTO1Xdg%pU#TZ}Rl(`uUdm$NP0x(^SpK@LOf!
z(Y)2#wo1UJd!%pXt@2${o5gz$CCKumU>r6#KioP@!>&I!FL@Ee?(nTlkN9N1VzQS{
zzn)$aawNj3?~7-w)lHP=Vl7=-?{d;ifFyJd-{cQZQ-rP`v_EVU=Zcvo;+6PZ(_h8Z
zE<R`Nrvoat#0s!5T<A&Q_#NMpIT62YwXFrS7n52rH*Mo7>EW=jEV;EX?^|vEV<VE&
zm+c>HuD#9e+mifVDBOAT^Ye%}fG$mworHLA#BG`)v{3HG+?FSG6XhfLh0>7BFY)d;
z9ZAyylx;fDOD(;#Q{SdF@MGC<JJicY(++Fk>{gBs(z-HSS_JjScTBm!AIh?G<_oOu
z7MHm#;vwUO-eMn!aC487w79sv2T*k{E~WNHI;<Vyy7a=BTIl`iE*U2$dB`Qv2;Mrk
zNjefk^F(5~mX~V(n5YZGEC>3jRq=F9G}+c_UNrlRI!%O|FSl9dg|u^z(zLj7^W&e+
zu_RW~9BaPY%bQQmOOwbdNt(9^%l+KXS=i+_%o<br#$wPQP?`tSy%#!@$1yNywZh3t
zJ&1A(7E_Ny&0RbfqVJX#hl}&W7{{Bvg|+<6=@#3yYC`q!;tsik(`!JSdx3J;Ji(I4
zCu>R6oZQ{`fMJp+HZyT?0o|=QFTKyLxVXFkPj4DJYxHyOPhJMq;`Cc7IYbcamPiZ3
zQXUTs2QL1oixDQiWbbhQT_Hm_aSoun)@1S<`U{5eRH2!cnodly(1S|NOuFYN&vp&5
zMm+52<)z>1(EOzM+&+}Vl1$B;BpDc|2MgojGAT5=D}yOF2IG?q`O|M!V0_YxRPIZ}
zMRPyv)Y0|Bg$pa(%{y*2H|@86mEIzLw~WYGPJ?c3`+aWZewxiHxV1sP_4^}=hj^OP
z4$TXAin!%ME)36$MgoeF#9&%&-JtnCS!vokY^eiWDyWV~OQ@38pm{}Nc78|Ji6~9L
z`d}nId`6j}`q<6&CCCd*+wf5Azr(p`x&qZ5d{G)tB_`;HWO2TLC*@~_7OtGs^8eEI
z72s7I-`km$dxN`M@B~7L3mzbNf(4f#AtVqIh@dUST?!P6yF10DSaB#?poJO~Ez(k?
zv=sQ>cXn^?O$apoJ>Q>)oHH{!J3Djc*qqrrYqTM|&>I0-V^J!@)JZU`{4A4GgYeKM
zA8tAkN8<IU92wuevhm_kbW)#KqhYor19^OctY$Bl?<e6Ph(r|0(PN<aJzX5OUSxmp
zW|w>k1IGbM%HVKnRK!kKg1nApv%T|tBg~;?M#*gh<mwu)Bq3<f9r1cN8S*9e*7%ug
z%aCUBj6wUsf>p}9VR4OhHr@xQ+q6!`M~z4pR{<Zs0f%IkazdO)#;^xOGGc_oB0hYN
zL0XK}psYDV+@iv}338tBv*Jh|Gm5ovOy5MwCtf&zi$7J*NYlo)S1Bi)U)VLy_R@%)
zdl|iyq{hse&l*4$<%eMSxb8UU%j6uqo?@$k*nNV*V^|4G2IgxheFsFy$uxqjM%}2G
z1T(OWfCO4hnO?9o&Jp-<@A1S^@|!xcS)(c0FMBKXc@pLaWujXQ=!&lO#IK(~x&aZ~
z*&|gMY%v56p&p~RsB0wfi7E~Vz}pEl?-P-f08x=skn(J{iKa-MR6@mq6k3tWMhkYi
zkBf&gOhSk)8ZU`)%g7`vp5`T`QMO?G4p{ahE}o;7M35aPbFmIjWR(?~)i-|p^+7^N
zHj||2i<cQ8C%y0zqP$;v8S!z@PLe>0(LG{g@!Gvj;h3kG{;S%lY_M!89EXlqDWwYP
z+6`ND$=G_7-88Aq?kDz+>1!xI{QM?~M}bNNmj|pRm2oUIAsnETtWz_*>z5Q2Zbb0q
z3*<Q3upt(*m{<JRf+r5(rE;?j=c-4sgdd~fNpEdk`Kwrl^xYGaqo8cv(N3bf@x~(8
z=d8%v;j%`A8i(*lxTn3xMc(9M-i7Azn`oP|j>3i+e?I8yn(|gW-?xm9l0OC|z()r0
z6{m<MVP2l-??60MB3Ew@A5U+;KtInQ?;xBti{ljoag2LFP>`1|ytF{L@xI=k@DZ?6
zfB;7XKB*9bgU(xq*K5%<RIbW=?3daEx2P-2@H@h0*qX%2jf`aTsotp`>gnMhgaa=F
zyaM?eeUPu8k7p1Lw)Mtwm)>|T4!-0Wx9o4r<}Jd#;52x81b7B`dHUl}OfUKyI2F^^
z7ZI7{C_{@Ffn5kSZsD{TlM&tu9}8x9*FG2Ww<qS0;5l*lObU6Mm|aB8l>1X=jZ*~e
zixZ~!wu^n95pOo<6`MOY8SM+PGDeI!A(@F>Gd^PcLdH8J6}7xPe0{yKUd6Bb0|Ntt
z{C#{q1AIU>oCh3;w=BE@gLn)u-5h(@a2zy8hcX)zri{js2OXL3UwM!7(X3%CPAv8E
z_3{hy_Qy%Efk6Q{m=wR5z=@^4K0bjy@P7O}y#oV%n>a;;GK!94C-_l`RK26n_;}lT
zCwuMXN``%h>B;kMF6L<ktO1I0M0lezME>wR?XcXWn4lJRb#Z=yef%Gqcy2d4QV;J3
zvx_Jmm|C(y)>s!9RdfK`NBCi42m+0Kb-+et{Pwd>+Ac|HC#6$3i)@r?6!-{*O+&aC
zDgqyMvA6o}W&C)YlmU&|7qU3zfMNHRVZmWI44SGW<p->BUE@gbfiAA4N+FWvI8-u<
zF%_wq2VdE9ij41@AlpmZmf4OWp5DA4d=BUeR`v1)H~RT|`vgF!!LxyZo?iZeSV=>V
z1o;OAdIn(iWGW*2jTu!#j$NmkWjCRv<!eo;#)XTVDO2^Ze-MWZi(x9|nlZ&{immNQ
z85P<nyu1X9$Rpk54^(WwIkuO(@TE>ns=JHS!8mCg@5IQA?Q}<!rySHLFHi})y9^|q
z@p>p!XtRb=4cWAjmj_teCm_hnKLB%t=s%WeP%K^$XB<TA?d9j|2W{-;)HSkKGCU}n
zL~xpEm|P<2#m09{;(6l^&4aoojWgOfxhn!6z3Cp^2fI%(n7vdYftOGcle^&y#ys(o
zrp=Ke$m1(oj^id*r@G(x@*)mwHq|@%vvlbMvl`7Zh}r@=LY_+%k;X>YFF?9eMvkpP
zoJInsYFgjRGsJ=KJv|W8_YU&)#k(q=KHh#_f!;7-IL|pS2<q9(A3qoK3<N=?aPk+^
zX@!#}*$__r$R|vmmBkmcv5g%X;RUUm${=`p_;>|Ex&j07H6BbO%%Yc{Pe6b_&x#K4
zrT&Gu*(hSDSCA8*hfKt=TX-2O!8nV>t{~E0bQxqQdxs%rOzYbo>Ofzq+T3wwW=w*6
zoT1O{n$0!`rn6ww#gv)K&Bk)34Z$+rG_SS^kdBPhY#E&b+OP<^BX)lEgKq(|3fcwv
zLJa(I{1-lY<n85;nezz>@biNGB|S`MB=$Yt2)>a7cL`^C`v&@g3xb;97Dp!W_JG6V
z9mqdlKOCmz=Z%d!e@+%WAs<YgzqfA?ydL>$1W+KEzk%V4Wp>TnH-3Q4DuvQ_H)Rno
zTkuh-{kn`j%w#W;Q0j27VMf^tvoRnh5=#{aN7PVaQV_$jV}&fY=p8D!OOG_wpkQsK
z9>;HQ;-XS(L0KgAkNm;0q0a4ABja8b=G-?Gec(_A`uPOmV^%(xVhAqe2?`LBiphZ#
z2L|{BFwEmfebX%<nep>b{x;tDTnxQkypdABwkLkdf**Gx$J5X$!2Fei$vJq*FkS12
zN043)5@Wi~hA@~SW4b^Q#<bI00A17GC%>6)xU>);R7l4OlA6sBx71G#4Y9e!WQNp!
zu@mf18l8oug>)tid5Xp0G9pHl39bfG^IOJN12FPjy%`(}8B9kBQVuAk-t13CToKtH
zX}axPl~L0zNA4@NK?Yc<iFBt7gXbi7mC#pj=}?1>eW-N;umXY(5A^c!_3;eghZe9=
z1y$mStsUP$IMg4uaIuDcORX3XxFyk|Vv`f2v06d^%qg}<R1)nvr$P=LW(vF_Jp_7i
z|9P<it(4kLHU$MF#WSSnrJnEqRrKJ|{R<5wW6{jZpBeI)Ba<Vrh;-2T=`SbKDsV)o
z@c|j*$AtW<ggoyA%O79pY<!6zphrcYz<TJBT|e7b4bb267mukf)7gG^QGuICrxZ`k
zkyaUXrcAwHbInpcm7Q{Y07T<m_b-SqW!eWqRm^^*Ei?xxJ!``TI~tvg+F)Z6<<q!c
z`oPNXe_S-NX`|VcCAw|dYg;%uDo;n%mrC=21fYO4XKXT9e#FF7`h$BU#KRjgRFdft
z+P#X5lOh$-C5PKfq`k)y$LND;A{=|0T2W(KVT?^5uSdURdd@r&+tJZWbJg(lsE1OC
zM@AG944b(S#(SLfpY8ny`#6X`o=-;-j>Z2f_|iXce1HQV%D&sRcgB>dvFU=%oz(sY
zS2`$^&PE1l4*oZ%M~>fM5veJ{obXa!x+_!ZiQq2dObNUJ?0EJcsFkJ!=jo+*t<p`S
z%t2FU#NgT-a(lpp>mhfCb&+8;(lWU*0!%BHEZ9`Gvk-#`v8M5dt78WI947~A&cz98
z6ThnF%28T;n#qiq`6i}kMGs0w5#e}HHAz-W3HqcR1LRJw?NU~1^2j1v<jIIb*>oeP
z%g(7_B*<KM+cc#s>Y##Qs?&Io850u6r%|mM1|TNUzO=Q|)>dlwcGD-D<D%3UmjfGy
z*LMxC&sD8EPYZIVQwSAMzegN?h}a!Ixp5JWxm?axabYcLy(kw}rN1tzfNJ$9wXzX;
z2?x>lkUzyT?=h-7klNl>?7MBrWv^pnM#n)8`)$$(ksapTrvx?ln<gWym6j2uUXgyN
zx!Kb_2`>-F;m1~SJ*D+WGYxI7IbzEJaP<9s0}Q`BMOD66OX7l|Ak8&6XhSub;U$v%
ziM(x*Wb7<pALo6Gq?8KgB1w+<oe`(_-DWJ`)~Aez&^W!6fDap9TdfE)H2WS)TGiwi
zDN?qows<+zMoLK6EsX`{%wR{MRWoiIK*miKm)iTupo&Y8`xKsXauhpS3+c@mYX|D0
zbQTZqL{MxVx8u@`%o%|_cQd#h5h0^&h98}}>lyQ-qxvKnQ6(IFf-To1Clv4taOgiL
zAjRZ6b?x3AuXv{_l}kAs-PW*0Ix&{i3q~CE$$I#50lzG6#FK53{?70W7YHjzcNZD9
zadXhgQZJ{aE$vb2t~vTBMsq29EQqJc(1kJVRKWqc`u~?bER$_gHlpOG5$yUrW$E$$
z9@+W>j$?Sn9{9*;fW~q73_?RAm>Yz_2|dHHNnz3=f+`eh&rqRKt)&EP2HKULsVkit
zw5lJ%51mo%riu_oC->~ZSL$VFU1b2@fj8`-O1w#EXG_ekY4;)qIiv@>IpH);I<1bQ
zFx-99Wn-WsJ-Nf#Ja`i}^*q}j4oi{pKxoaBG$M(PPm)1<xvv67=MPGa(Br!rn9<7A
z>etc7%ZT{$T;pU|S^Q9>N94N?aJ)n%@7PPzPTL3~&60hEoE~vL5HC!1cScv9*b(nG
z2*;cnbi{@>-WkA;HslXzU_xB*B|~Eq3cmM8F1C!ZBZUwmcmB<#Hw>(ct?wE4z-5aR
zE5|htwDe#GVEJO8v9T_fKRiz!d*x7C#-4dpLVO?oy3{U}n01#_{OxGf9N1ePIFd@o
z*%b+ujAK-aiyiV1UaX*piGA(VvYfGfZR26=Y1{Xk!*sUl|HUeT;wc9b;b1rUk;oY1
zJKUtKq|ARXmhWlKfuw0R6;fy_mkKE=<K0z2ilsJPYHq0;V%eh~{1KbjNOoXE*PxfN
zUa1!v4yl(e5^YILqPzvpiwyhnSa-%gxjv&8`)-IU5I?&bh!;@cv-RLK(*Nn%wGZww
zBkgJY?ety&csFJf8P9ek_rGVWIu&c0A$5wn?WvA@lu>z=X35x_<679}kQja|MKH|@
zN-70&JJvB`hi5xQGPHnBqD{K4@O<%Q1fE-omTf;T!wV|BY+g@4?(8Um#y&5MB3{Fi
zuX9n3(^g0w?%^%cvYX=Ns7U^svzg+>*dol1f=opv*YJ+*r^_$zghbt(%pi<8d8y;d
z8H)hFtN3Zo4Id>HfygK@K`AnaS61x{w-kAE&|uR1;~;OxRaiY^>(p-i%|%zLC(-&y
z!W*{nJ%Jn=NZTdDeo1~YHy-#=_|*T*P&YO%A&u#6SK2ww)6Q?^+@uzXbYwZ5xk;tR
z=_rt6>GIdbsiM;9$AQ1{y(+j(t@))h+y1B{W#zjITyq((E*L(I!<sV{<>+0ot0+uk
zz|44tr&D)oH+X(7>3Ms;roFYhX^7hMeM6gX=GZgaoXaVvvBjjhm<z;=?v|P%c)4AX
zfw+4UT0EY!lxlo9X5WZO%0Pu&s-~8ujCo6DU$}GfPFIW#M#omuBVT&EnKsHr)Z^Xh
ziHt1&fQh1Ig2Ip*lTjXXizk(F$f&H_dqb8=M}Ao>&6eF5a84Nr$*3tvYa$IBDnERJ
z?|CB}TM+{2iDS+wTYL;S$L`yv7o1P;-3&Qf5N0lB)05`3FCTC*Yj?<~%&6tCO<~(a
z#x%TBMaq(mW~ripHs(hp5addBjGs~4JC2R2Op1jJTH-ModGmyO0u)mRnaU`~!A<6?
z&yK@2SB02dtO;l{7=m%-moxYrhu0*HPhi@DrtjD-Op~Jvijg@*8ACUnZMz1@xT~Fd
zR@ldMx{iQqADr62xPZOBF)SCl-F>AZ1;ywIL${^l{qjVBp8VJ!6fGAB_@#A<pOW%Y
zj}uS9G|gXnukINd*0%r6qxta6H1DTmT>0TfO22J+lF=I-X+Tbv#uC%sCpAaowFoc*
zUSF48#0H?Flx1uY=lCR=W%3W+&%ifBQg0mCO3j@E`xK#jh%Xy&3-H@l_yRemkjCkq
zu!~QsxiuHcwtHbWrMKKYnuiq(M2slC7;x&QQ`SzAj$9k-6vqG)Uo4R}@&iG2F1WZt
zhYe0UTojo>YHBdYz157WV&i8QICE46yX+K`TRYhEvMIP!9Fj3zhhMMR&T_@8<kBn0
zDWi66kBP+V3(#9~<r-=0gTt((kd8d#DDu>jHUs3yx)~ky031k*V-7ev2x@HllKRg3
z*VL!dO4x%{8I=`zrmkF^L!0t>8F??rQGe4n$%xXnMN2u8;F*ZH9zEsnGg5K2xdZOv
z1j8dWk&O)<r9p-&rL$;X-lUD)8~PTuFP$w#Z{r6FJQ{=Nb9IZ4b&}^@Kz+(%RDfk3
zYbczX5tA6r7yV@k&b|WXSsKQf7_=ki+1F5t=0Q>X9y`}~DPLt^#`qM2?W=kCC?vkt
z(6A0aV$io=eH_d+fWQ2R2xYhN(OMi@7}2byYg_j)?9BA(gWquA=L=W~CxT0n#es1&
zQZkQy8rQabiwgIKHirL<UqpmAYZ(#7pUb<3;fg2z5A^Io52{%xR*Cq?bXY_K)HHrW
z0ds;T@mm_4@|2K>XEtf;-p;*AJNHnUKsb}B_J_OT;|qPw$}x%La9YX!UjF@J+>q#l
z5+v9md8=y^nS8fKJ9`a#GE^SF&))RFZ@&=VtLxs{AV30-CP4|FR+o3S2H=SCGXosF
z7EvdpVG|6m4=56y95ui-*hV;J@p~tna)vj7QS1_iYAqY$OR@Mab4z}VxPDl}dJXGf
zHb^AwB-&{a^Pgz(e?p3P=hU^OdtGB@T(N71<8uAo5Z@zJnDLbvkWsRdK^v~7FgAQ(
z0i#2MH~<JPg7Jn|OSwIZ-vHxGRet!#AU4>mdtw477R11Bd1S`TrJT!9-k<hf93~{6
zD(^QQ+_C{a+u0-pujKl;NA>l_pBMf-rAvh}$A06oL(g%Cm&~FNqvAbAi{?CImDaQc
zWWYZ2;rN~rj!1$Q#|;kUErNOe2jZQ$AESUpOu$K5k`D~-NT0R)_vp^L5&eCmV+_tl
zLNd0$ubXRd3e$$!I2AvH>=}n+q5H-Jf?e8??<f$49Jn}`6e&1H?G#q=j(kUGAStO!
z?b3m!E4Ucn6($3>YSXkuU3^nUem+fV7#@(4o_N1IY53d_xgEcdiD;R=N{}+bgQD<#
zMCz{ahWLs(j+H{VH@ux>C&^5#0XG5XN#R7#D1NFC#5RPrZ=`E$gHsK1$dg8Fudm0D
z9I{^DNQ2Jakx@1U66q%8+fX1<C#3nYxjbOQDWYxqT}KZharktkoDZqQ-q(T1zUe>R
zrn6G_$vzW?o(g&I<K=;QcpEW-S~}4+)TYH)J{Yg78VnF>C^9gpBz8;3`Gl{BVPFd^
z9>1~#Asw|K@@e&-lnYx!_yu=FNb^?MC<+P2iGw^y(^0Jr{!eOca0;{j|5oulD<vWs
zM`=o>WM~X;DgXynVJHK@)XJbqtCkH6%_SL`=Y-luh)-cA^hrDOG3lBxS@na%`;ghP
z+bUxqSWIxrQY<xWS4&52@J<(untkmy)Hg9j0^AHMhv%m%Y?(&?*wrA;CxouE&vlC3
zi-fkT?bwe<w=_zLRnr#(sqUmgO6eFxvv!(JV9=&N(EUGgR`(B*o_@#w;yjQf4(w5{
z9%&-^!w@ol?arn&!FsX~1_U4fO0C~;j47wHPv1m*xMTqId$OV59rVe&TxHh&;J`A(
zQ>Il(YgAx^-eXb>9qMJ;6WS`94s$HCYcRVlvGWj}sI;~fsz1ZF6*kbzKb3KHO^#zo
z0SE9WM&r1&cRHV9Z3AuA773dbKq4J`Ych6p3{9^6aW;#ahPS0V7(KwCPDa~~jn8=7
zu^DPR{JoZfzdDA4746#0y`dv{Y~|>AGs>H5L%3f!$_?M%s^Q|z5~IL}{!s*vD&aTF
zq$)M`cyZQlYvbKJ$^V9OGUYV>KgntQ2if_5l~c%8ijGQ@I*ML|-Fb2(hIAIBnxg-o
zRw>%FR4PqrS5mTBQVCDQ?@Y0Z$cSekYo@OiZG!b-=YUU3jE%)1-L&>Lsdz7&#>PhW
zGu;APomAX!^Zd|kigSTi01|DU0Sj^PnAwC}9ib}cyC0buK?=iXKz-ArHrWKLCw#je
zUy+h}5?YV0mcub-SOb%CvM3@Mj)Yxbgty>}foYs^`MT$OJ#mMdo2GE5;p_b05X<KG
zoBp}<&gGKUu9_I^dWYLmG!wlZgX!Mu{}^il`_Ss(fGFc#Dtw5MWaX|=%2G9DVIm1`
zTA)^rX34NTHtCSF_pkf|8*cj1zKI{wpwiyaLB!W0`H4=WcP6XAb?W+`R72PQE7f2s
zDmrqhq7gRPc*hFbZumdx5GGHqa~xGFmeYn{$Vf%xJCO>`#)l#mQaI^>lH&goYhksR
z57RnBUWMtK8XT3zJ$=`ip$PEH1aLuZOW{<kB}XUKe_FQ`VHnDtLm0*;hNj7uQJ55}
z%2)(B(fE*}acYra&cK-VP6wuSsXx%?_loWNdo4u+M~~W;vu3zN&U`8@iRFf)K!;(;
zomLB)hhV^U3V(lC!bSjm0=+Fhr~wM_n;4R-%!OZydN)_pG=FI#U$&K>A;RPoTK9}F
z)Q9c*zcW;dDlyiY2u9#L_52ixr)e_!^ksLrD9A{6KD-?G_wIn2I43-;B|KIfBZlB<
zLi_+vcK|N}Zvq6KBIEIyc*BDLO7L{4j+41j5grw?US`(I$a<Su`dhf_V`hEa_)ard
zW<W`7B8}IX9M`vd?^qeE#UqTCQ{BVoeCQr;<9Vs>VQ132C;9`=VUIs_kNr6Y5#Fud
zC%%8I`Gh1nk|ZzminXyQf2VQOjyA`g<i#;0d78Bm&r7d%O53t0Nu(^>#zdu8JEd(O
z*|s-of9x_K8atKPd^bEi27VaBHaX6)7w=ei#C_89ByFBuDo@hBW-?a%;4_<touarr
zJ+YL81g5Atc7eVIj>hAm)%<86-|`#ck{ri_`-iwdZ@R=(HGW~?62)<VSbh4Ev4A|H
z-W`96M?$`P*9(??m)aQah^-ks;bU`w;J89C;=(v49t#RX;~*{?mnao>UG|$}G4-xZ
zW7jo8=XRZi7m;xkETkQuX_4yCXf?bk-`uhJcK&Q0A1ZXI;}zf%=;7_*g>RsF;fx+{
z{KTRU!+0LvNF*i4^{C#{`~ga(K0M;R`oKW{O4#d8`^7|>!vFl0#`HgMDW}ZbSda!h
z1JTOFbfcBhs6Zv8UJ_7KJH#;Q1LD9VoQ#%w^998Y5eu3dP}K-&Nfp})=Xuy{5(BoV
zU;jNJd~zJ7&IsANGGfo#=>_bacxs!2sre6kEjEjj{+d>HIkf3JTPtSI4JI+oqvRgb
zn3^)xw6()(gmFvr-0u{3pj%=>h8I-{8DCVvf26T;P#toWkVbiO<4ZbY_MlyQtAY#}
zD|1JKq^R94tqYNQ36W|!WkmW6ZD+U%i5d4k?Q>^SVkW<Ndivmv7`xcRN0h-8sSe)1
zp^@Sv{%Zio?sugumkeGoJ0LZMAtKmOaAX6hD9Q9VQ?>=%BnV4Wd}G~oUx&DG=j$C~
z@g8L<9aX6xKXx4KJ7Y~J&XzLLf7uvHJ5G@F|CD37NtqG%>VpfTe`i>Rm~c3&T?~y#
zii5no$J82ul>gb<f_z5M#$d>lvA%a9C~FT$8ExvApUuFhE=|Y6o5CqK7v*tw(vzpL
zldPCVqvG?`DXJ_ieYHp>Z}2-d-%~FaJ|OEUu>wQfO<N_m#T{+cJN}KVC^yRuKh+je
zeb2T+T0>>%a2O)&{iPQh=Lee7c>nxN@B7G;qGD(RFrl`^OzQ2y^aqiypO&`2Ag*aX
za1eHiTjGguD4u4jZ0>qxL{2`u-}Uc;VxMS-ooP_Uh|Svh$`)A1CBt`=uROfF)%Wg-
z&wSUxanQE)Bm7W<#71KoSU#GK$=OVAo0kkv73T07g+nYV9X)BhfMGhQAKp5*dCNvT
z!f=?=ec!}Eo2MIzNagz)qGFF4`1{&yMFz|mw%(1dr{05?N?pEFPT+<&FKnKjVaz}w
zhr!^9=3wl%z^_lW^_?=L_dGG3A<4LJpO}FWjQggyvm7_r9sy3-TTZ)k-Yrmm5+~59
zPK!G7NKkvcH;JDy=-}C@gDaxFE}i-M9$iS!f%WQjVwR=YeZUD+Y3^~zCw2=&#Q2cr
zB>#C>3|z&;RZVP^+TlWHBmFd!foo(oaBUOYYZ|GMW<fz=B-~Bh(8Ou$H&l$0Myi3E
znz)&X)7H;sHA=>scxGNBKhMNz>+dgVlq8lmaM7{`E@tAi^))BiA*(;ySDh5qsf4v|
z4ksr~&Fo_Bf`nQqt6q7!8xmf*^nxQig_Byir0$&cIO<sYvZlp%<YQd+FJ=8h-WX8Y
zs^?=1fn}`gWumIpx(^ApNLfpXpb#WVmb0E=!Fsj4HKHIAdsJ8J`9i#Duj-~*@5tm4
zwSt<tqSdo7$~Db8j@Y^nP%Wui&oHN#9#<J1D&ubLUW7N7)AWL4f>Bb$Y|fg9a&?lZ
zWc`j6ELTK_Ypht_Lsx2sp`u<{SK2fZc!28asVj*UkW}5gbY)-|>s7Gm%3ztQs9TR0
zWs!%so@W=BL9@O>UL7rbtnG?1I~h=&VR1%N)qM4WJIf<msH}Cby!f0KOQ_Z?#Zgr+
zJ@sp3%llcEGON1!Tfb+*>RJND0Rh(5!~+8LeED*sj|D-IQ78coGP~fyr%I5XcUo?w
z@>kLGl?S8YkrULaR{Y|}NgY~ES9&dEv%^GnU3rp&BN{Gh=t?mkCP#>xy7DqFlGahO
zHKAG;Afc)AL@hmUhh!wJn|S4r2(}(#qD*bwny)6ye_*y+9jgv5)70v9^@6jivb?0s
z9}yvX{^8qDrFcCxzikTo$ZI9*>q<`YiPpe6l_i>5wxO<g$&2MP>1v+KY-kbl*4u#U
zGto%T+oKu?P@u7%_trENg*MUil<S6)e745*Qmj9OP_MaX4v?nq&_b=#QKA+!M`m3r
zuP?V8Y=V3Lu!f;h^`_P@Sf{XUh7L0a#=6M5)<2lrq&Bl^kR!E#*%0olp<45zNK<RN
zTbIa0-R5eh7S>0sXKPF?+S2-3UMyzg>#~{kydKNgbe=MRdbS2x*-HH^dV#{7(5s^1
zdft3Y6mF%8)~aZutL~qmNxiN0i*mee9QX-hpcntCD&|fv*Sa>YdbZQ^Mkext+PJ;-
zM~Z1t)%qA%Q;T)5maf1%he%ktA}(%HI$FO(LbLwF>>a(6HMkP9{Q%WRnL6v)my|&#
zaz<F!c(9RNUGxH{Yoeq`4n5DFFHxyLS3U2V#z++IW-aW=0`q~IUZmlA<Wo9OGP|_`
zD-<y+=%uEiO|@KiYZVl0)+AnBFS1#C`!G8eP#vR2s-lOk4y=V|>QSobX`SMWtD5z&
zf!g}BTV<kkC7?a;S}OBpW2{%0u=cKkMuw|(z~^dR&04V<YrFKaPCx?c#2^js@2xA#
zR-;I>&5tEk&%3%8o0u!_aqXj7cgR%kIBU%ssOn-H#*l*tSg$DxLH8o@)?AbgP0gBH
zFW}Z0{jQ$VT1kpaD?s(W=&KjH7epG`xONrGmxzAWPii4iF2OnvShGGQ)YpgvNsl4H
zxb_^<DW_9YvnA>UTw%4;9(k;jaZR&!APekQldOa4$PQ%BYkh^3dP*c)d(=gddRFwe
z3*vLAUd<X@4>fMunzcqVdqW#wU4Vq9-nI<X^KA}cXY=X#suX3r)<a>ul(UeYa~>x?
zcYZxv&7YCTF-R{^r5x&(9&EjbI%@SHR;?-UHRU6#TMOKz-&OKMAAoILi}k>TusoWz
z6E9dpTe8hO1+2+3k++~;Y697}%*WQzEU})E@SePyYp9-O6BUh``x8C$(AlV$b(qle
z+z&vi@Ni*m70z1=TYCUQNYL;wp{aR`={Z(5!j;@btiyRl%`-yiIhsl(pLZmZr*q+I
zuA<hxylSo5iZF&y%{NNu*{ijq#>_9QAGKy#f#TK~OlVGUDs-o;u7useB)Zef1_E&1
zK&H1WA^a5+pgXw?A&mc&utizIjaLYN-cGn0LZ~}EX~uh7?_)aG9>U-UgnQQ!7KOgk
zow^n!tbCs(9kUZYhvn0qh6}>om=@is7i3s>8WBslu{g_%?qph#@XyRF*$L^^o%X{R
z>Q0xJ6aFB}56gBU%duqL4Z=&A2y1^rxV8!5T(Fz&^j8w0TOYzm*`MeAnRbT8)SbS{
zPq^`8!l&a2TXiHn2b!b)Xr^1pn)|+Ix||<j-AKY-@S$|49R7q?VBT~mS5RAb`f&!~
zPUsfhX@$JE>UE|=p#XHJRUKJBr$5tsa}xeFiEyCo;opUs&I_j0owAK3Z0tpN8tke&
z{Uh7dnzMz`BU#cRhOovd!rThWePC2{r!un5E#sK}cmm<YM8e0i|51`+<>ZXCfE=Oz
zAB1J)crP4dx*@ck?zHeC;rC71@~=)zpM@yvPB&ax((*H==l(_5tpVX<Nu|ZpnQkiK
z`ZlcjdLm1*{6jeJio802FiDQ_3N$u&uNzCgGWt-O>F2`;f9Og$Y%<}M%Y?U7LT5><
zCz7|n@L|c>I&5dlTBb|Neoj8e^q=inzO56}^^yrs$vP{nOz)R`bsz`p<k-a$>lDJS
zvgCdvri&IMyyM3@!C9Eze}{0noa?BwOka__`a?mcZ=WE{0jmI>k>w?&6ol4idC_o|
z&yjq7c0NmL$-YgI+_E5^C7$wXm>lU}k{^5}1;%~HtMO7!vt47l$Pa`s+7TA2O6V>7
z({C!%Nrwpw!hzDAdWRBDk~}a)(&`~JyY6&&GvRbN%eqtG3Z{qsNmx|&wpBKk{|(1Y
zciK>m<trsMs_bJrxC`M@$;k)hxXy>N<eZd|EH!x7o7^m!+=j61W5T?#gr}t*s3oa1
z8CnvW;TOWg9SG|lC44c9@FOsv?o|3q!obf7uPi0p_y=KSIj0>ZoyQdB6W?Yc9Q2%5
zce@k5s7}~k_ONU;)A5p$d%t0N>jT1DQhtA3#q=yWQZGs8JW_{r%FZ_LcO`5i^@BAR
z)3XW@b_?QNVVwvYNx54pWzh2>OA>bwR$omx@v1D5qdRb!>8e@SLa`{O>-8k8CV4*F
zA*PdMZ+lvq4%kgtwIyMRD!glY2f{U%SQ6WxuvR0Mw3dA9B%c-Ck>$?^5Z>NE*b+_|
zBumn{wJ%FNBsC`e#`IpPCzn=c`m4IE8Q7fZe6kPMoLKVXI+ldWo_Ea3^r+W_oeB{~
z1`>M8u?&x3{Rxt88(T3QEoZIFaMpP)`5`CP6S~tfSZTELx4c)%!uophUTGbCWPg^*
z*>o+-IyL1}&bDECOflAsl=5<^Gt-+ThiAFMt6e0OV#_nVKyunKNvr;mta(9lX_3;r
z`Vp2{x|3xt)7loo?41cGNv^J<Gd)`JTktKWdrRJ^zlP~rCkgkpCoG<u?fBPWI$BCu
zd?l8AYhj5tity%q!XgU@7laY&^9auev*r)7ouqm!iEv}dr?Q9ZC7)k?!txWn33GT6
zo|5Ud(kc~@Hln>udrDrdEUA<gqn6%4o>{a*<($}ij{+KCmBx?(<JuOiJM{cH(Jz@V
zbAAu5m;ZvCP~nerNC%wY%mxl$0T@(;?N?d(1=3ZskpHVWmqkf+ANIV4H>p>%To%Au
zeFVybr;?v*ciIbB=O@lf-7DmukgaT|UWt}~^(T{K8}!{}_CU{`AEnOE^8jX9aRAq{
z@4g0@*L^Wy0dI1A;cTRr%ejhxrB}xSy6^7-==~=xfnNvqspf-0NH=Ird)BzF4%jqi
zB1+nM1R@=AixSwoC?&IBZO%v1N%TM;CO)B9j1kGy_7g;@(x^N|w5SfWQsm)etrZFM
zJT{8n1%bX0_sD5G!~!aUJ>m=Ya=-Zb4bWk6h!*jvIMWWfv*I`M=g(qb2nv4@b5<dD
zUlgFoJQ9n(NA7nqT1D=u@ckXR7vj_IK!1yYi9oN#v711{lyD2Gj8qO~LT-%mbvK~#
z%0KMHB&BFSpqYxbHqdM(TLYkZ3ZobLTID)BzghY17v#1o(Up<gr$mt02b8Lm&%??I
zs=X7+&ZfwnR;qET&MI%}qws=qg(C2aa=jeTedRT8c%?jI%{NNZszAfl3bY%e)DxM3
zK2`J5tC*pRc%V6Iacb%L>Nw^WsXa*Db!s;<-$wN}Z{)V9mgYd))Ef<f_NWbffWB0x
zQi~i@4@M$)Mje?A=$v|`FVIDG!giqR>bYt_KdL>c?QW=*Rs-EqgXaR>Q(b6G9;jy`
zfF7&GIld=qeooG7^)_d0m=?j#kJM_t0ve+=@B^Bv^&xeqYnga$rZ(|wpxK&BMch75
z``it=g<3)k&=PGjskBy`NRHm1ZKUGeti8+!v{iGWGTW_XrH6b#yEg%aN42Ar@6%dS
zGWc1o=4BLK(^gS6{-Av~66j}b_-LSCwDyaD?r7HQK=-xm)N~KEl~iEQwdLDT_)`0r
z=#^Hy3D6tu<3ONMdKnAQ1U;Bz_*5@V*_@&8?*sI?-VEyoeXd@VZqfpM7g=hJemgtR
zCVdrW^9#NBXF%KaECo<ym)?g$x>p~=N!_pi&Osm2gL&T(eG6+|(I+*;weR(D6tdg;
zFw*Rv9zZ^OpsVA79_uH60D7YLsfjAj^qQ2MKXq@C=Wjif<aw>vSc<}7mU~n^BP}rq
zIOt<6)u`&HT8h;Mnr<mfO3brlrI0MKc=18YExTNhTV;8}F|4&*J_)qJ(vYmN+0vpo
z&{oS{GVKmaE>7caOH+=0pM`Nk{h-AoH_%s>a5Bm<OB7YjNlPnQhBKC2n~^(bxlcWF
z%M!yXcP*dP0{Yc5*8^1^Sr89_+*(37D^D$*zr?i{mfjT6(bjro_9@nTg@LA9b5hI9
zvd;0rwY}DZB;66~c|PyBbzTLaJJ#J)VsEYe0#Id?a}tnL%6UWR(iMwYj;j@4l9he`
z(ohoQ=METrqbN!mhLZhT{Z4b+WpW%~)U#T+8n^#1q=$r)ETcsht_438o7*BcSLCSz
zG*3kQ1~gy%PT^Q4#7pFM2<H+&d&TcPfsTn_KKg+;KNo1cqH-3eE6JshJE-I?>11>>
zdsAwy{2Q@eklil0=^pZh8EusLJhj$3IavnOfl{*!+y(7y8PuL|a6_zSEJNI3VJshQ
zNe2A5Iyzw)+7`;!@<~Dtq=$+1pqpj5*aL>Qj1V&+<(82mC-kyql*nBcxzVEaPe5bD
zw-bQIicj<5+Bk810dnKTY1l{01aY?#audb6*FclRQ@C@M$)e0b<fe$+&(Pf&!tE~5
zLa{s#(0UO>jk;NcgVL4*V*N(s4vH<{3(Fx<^Bd$2i=S^J_m!B!2{<BFK+Rf?ie*cX
zJ0@;F#I@t1ZB67(h`qSoa#Hm2K<<?2SQxp};t;y4pAjqOBloq~PaAbsguvajoD(aG
zAa`E$0y|nRh)I;)i{e&e)coGK_Jh0zYf_^&YTg%>l7M~_Gb*FX7)5qstRg!xPLZ7$
zudoyPdL^D}XtR>7K62ZXbzA`LQjW|*ZlAJiI&S|`;q2)971`JUg)@ffmQOgMjDqU6
z993kOjw$SteoC3&8t9A?Uk2!FC4w_{R{4jLd`=1c3xyZueflNkCUwPS`5^s@QUsdc
za@Ba=x5o3nGoE+Nc-}4L>m9iLo>K2I(C^AAs_th>jj=#4mCmH+Saox2v^iBxK7z&;
zs=dHwuryz>FU!=#T0qOyb2K<B)dUy-gC)INIXNk%@cxXGrE~)Aa)n>|qh>`fx>Erc
z=OG>FMYSJPiSk~hI?Y(;6O@tehsh|Bedu7sbfA&!bz>4r`iuwxOrA#A{|FWHkpAF7
z{i8kE0Y5pd01g-1h60Tc56a@csbV8J=~MBNJhxFSfdJ{-#YR4Br*Ped+#XQ~oC|@B
z=Y8kJy4^tEiXk+FH$@*V+wO}A<kde!Fo*n9tY(K_iLWjpH(EIgX@JaDCXMDRCumhy
zD3ua%ZG#d(O}|4aNh#W|oEV701InabKqr(uy@0MMSu+D&SGv~#x~1$n1oW$tOpWnC
z`5FY*N2;DwY}3_m%K^<$Cp!VnRo6p6^$qG<5K<Cc`*}YqXFbaq%YFemMwVVs2mXwL
zi)u-7|0Q)eRr)vTP(~gut3|2huBv4?)^F84fk3~giG0j0brI?JLM``^*-~b8CN5{$
zHw7?jC4azdKahUek2V0zv8MuH&g<O(b19@&?&G%r^L)W6&f78y>3kEz0P}b630UAY
zh|V5b%C0<tyz9OqVCl;HyP~$&32cB^0^>)ayvi*IlO_1Z4@lR!w%+VZ3C5gtmuRSJ
zIm*4FEg!Z|I1zm*+^9zPi&^x`4v0Hc%?CxHW4Lxm%>M%Du;@nX`IYEQbVQVx3UpK)
zql_ODMVLD-mYo7RA)+XHCq-Gdc}kqKAa_wjKw(%OizAq3%QUr{Ckkh&m)wz?uXaFA
z4z1J#RxW>RliBUqfm}ZHS;{46oW{<@|MmbSeNU2j{qmCYKAH9yaF|$0Gc;UOqO6S&
zHKKt=3U6AgQ6duyM~h)^fyRhEPk_dW_N&aN6kPuZ1*PVd1}vS2vQlo;MZoe4pzCzk
zNi<Dv>#74*ctg!mG2tp;rNR{fJ)Y9?`?O^c%r{{U(tc6wq5tx5z<?R#@xWUcrXF<s
zDPWZnRLxa?rxaG3R1mQGA5~GO#yD!dn)AL!Nv)r#V1nPUKee+^JJqSh07Bi6Whe=W
zq3W$Sm_w>Rl+S7~bRl5F(rm5KOH!%XQ4*u2<yVxnegJK*cWhDK9BrPvGzU>tgHWT_
z!OxM74~FK^`xXuWOnBZHFfqClVDj=`%zOOGP}J)bZ-aa4$I$r;n;DCePTzeC*!g=e
zzz8eH)csT!!03y(Q}6SFEynqm0gONU9I#(K&Tm4TH(D5Yq>IUnc{c1pX`zjzUg4q?
znj$l3Y>Mys2hiop7BEQ}UoLFR#zQYi2A(Xkh2r8AvFkL@RPp>XpoOB=V4y|9hwHn=
zA_ts0$yahe^?wqaqEc{*?IMMW+r-D@V|}7hNJsf3Wy>?5$;t@XB_%6mYN}F11^QH}
zO7@(l+##i=D>sQ|D4{<BeWpyH^v_iKkXL3YZRucsu0%68TN%p6<|rR+#OPNl*+v3g
zS9U?B-YY(OAtx%)!jF;wi+@)Qt(AzQc6F&5hmw-l>H(Hn=>b^o;R?X=_a~a<Eb7WB
zDPbkuN-c}PwX%8c0J?U_4Cwys4WQTl7l3}ZIoW}^YXAn_Wt}S5Iq_99!OqL6so0Oq
z>V9Y#N-Hfm3h0qYkDzjP>MhTLnE<`VqK~q6-Yw3^yIkY6yVjq?8gA79{cHCD3~qQB
zF!ZFC$ttA=G(x`g!J>d=di)JoHcMB)a+&Bz`}$G^`xX8jCH~*GL^_}`bx~kGAHbm5
zqXBDnS_W9J#3jH^S&9M1%%rgN8xDP{f9(DUC7+8y<lJRq#4Dh6;?jMfo#H&6A^Tka
zDaE1rbIxAthI;`!9ONu_^ehM1r3~zitXCj}V|6`3YvD78tQoM5b6jm1SG>U&e?v*V
z5iWp@m(>Jp_O=ILw^tljbW6r<lIC&F1}|P~QfsUjP#pym#jq?uQw6z`kFsf2yDo+8
z9#5sTPn?IO%kJc9K|NF=Qxia!%DDhbmT74=?3PR-RBE#mr5@KYm3n1Q63z1%<;yqv
zrdh9GA_-kQkVALhvl_7Ks^;dkTEEstK6oNYSUdM8fORSt01T<gl6uWa%=$4TRD+|g
zfQ@o31Z@14>x3o^+5(35ZD_Vt^mJ9^OHTe2uu4rDxoV#jLb^sfFsB@iGI=ZV+IMEp
zEWT~&Q2F18MQOn72&4l?lt4Q8H8ovBFOIKSGuCO<i0s_@HtEo2Fjz}(8+6QU^|UBi
z2gMgf0!s8N;<uAP*F<<F^Ln8ZQ79;U`zl}&nJ#L%fpoE}Tzt7N%7%2M@L)iX=XC)q
z@2&;t>E{dRQ=<&(_}(Hr_zg#kdcc^^0Rw-hdsM@XLCu<1sAF4rRY7^%Rbl2+yUrek
zd}Nndfa63F_(;ZtMljx1AcYBveLVveN<VTlZ!0h@EAoX>Ry}FF02|Z1&m8O9M1BPg
z>=*mdHRFl>V^EOa_C&pW(`U%Ldj0@dp%K&L#bpZT1koWJXrk~w1vE+IB-tj5DClx1
z?QTR<g%_)QDlYJbX`&)VXIUaPKquLH_5p>bFHz^w5Ux}|4l;*ZqIe%P<kH+54H%w8
z<-yZY;@J-?7~QMgBcy%TQw;sr62`=k?y+a!ZRiPJKLaMeayAEAARGD5PP)8abCEAq
zt`lH|@Wy7Xwnw<2=q0y-ddqE{r0)~hjtA9L|I6r!J|NEnz(GIJ`y6apf%K3r-vfSB
zWIW&}gGst!VlTDzaM7F+F+%hwD~u6g<&YaIN?4E^Csxs{j2GKq15Fe;$hVV37Y=!{
zcv%BziWs;YXqpJS4>Vnjr0$y`JO==MCaTa9%oerLb*#5*(41nu&DmKjhI|aPRQ$pF
zmW$K*aczZIa~Qd`qKFsJ29XzXrf(9vC|;Yz*VTcxi<=!$bB|DIJobu&UMSorTC=gE
zB6t|kaUmK5of4h90$mVK3IbgenG=D26gj>F`bh+J1o}msNe22&d_%GMUA!ivJre~e
z^v}gfO6*HfCK%{1@i%qQ-{Ovn+&^L)_0$NZG_Q?R{^GS!N&_}FMp<PAnyj4XwJA!g
z&wyqrufQ;{vwsuKRw`WpnxnWA%~hTi1DdBiC7Q2vXW;^+8_^==*VjOcmHR|Xlu$Cv
zQl$yeGUZ7HpykT%L@Si;EL^E{BU-I|)*on%GJ|NXvX_ctowA2$y|QTo3O6Vli8d-1
z9s+Gr&J%4>68(X`P!fo?DIG=tZCBb8?Nmn7x!$Fex`^CvWqvoHJ<8fzKzo(uSOG|B
zRjcknf!;s^v~=Y%(mI+7)w%jez)Wwd0A^l8m6&DXRHU<pcLmJWnaVkP*jm6GmD&U5
zEPn_v*Yfp%xu=qBd0vwJ^1dc1@;z+>nE&@lfCc_(09deoVO%Y=lulFOEacuIUra%|
z=sp^jViuZ!;$z8dB_`bfbV;IDRC3+{z*2$K6s51V11xiS9bj3Pr-0?0Va?>ItYvZ`
z?>rd}t)A)9R>0iXFk(G#(Yr|J^E?Kazap;|=t}V~Sc`VBP$hD8;c}f(Qltc3*rKn$
zM!Hx&_OG}zX;DI@@^xuZ80nH5IoeW<x!@~(iJGR&;PEIaJF*L4xjkINx=x^xtI(h<
zV8w330o`x%S(WC}zI(LXh?2?^>4$jMrz-GzT@ui{bZ$VOyhj0j7gqxGyRZ__e{Cng
zfR5}{px<D?paR@ssB-u^(pAfn+p7)X%vCSKWkroC>`6_Xvs<ed{fb~qNx<6s={eLn
zPUTv66WJwXKx@Eyoty#duji~Z2xOld{>l~_^@5qv8*gIpph?khkPfX-1Td@t=e21A
zM85Q9he{*e+`9u{3-1MhEf<i&;gtgcTfLkE*g6}j*X9cOtnFwIl(Z|(d1>F{AEZ0f
z;=LV<JOb?W6%BLe?tXv~r8r+*!f465{>0vP3*-~K?_!<EO$7jZBvL0tjo1L#GkF$Z
zbfM*dG0Xk{?0x(dU~JuBz&=*2g!MQlJ~93PDbu$UY1r>bL%^i%SaRse;zyJZ*w66}
z`V$_n<f;O7@*rO%#l`=xhDbJ2um&3`>_yow(wodz)ISuk*tuxH;?MU2mhiU#x?JSo
zOD^0ESZbC#VCh560LxtEc6iw)bi~V9p|51Sr8iKkm3iud8f9zm2P`+2EL`5>Z=_wv
z`6KOCnNwH6`Vg>UJk-4IUW5IpWTnpa7+DjrauO}xXCi?LZKjw@LeCOY+4$$;UMUpL
z7Bj8^%@M<*faZ!?r1m_Kg$8N9xJEu*AZpb?ZlPF9K3*gWRYh*G=*<~CAiSVRWS`@{
zJCDnQ?s6K3iQ|6)4Hwhcukqs4SIA8ep;({6lob|0lf-biHLyMjQ*dpQm{<<zkQh&?
zIwOX2a=#X>xU2J%c+T<uEMCFS)^CV;WR54IQVHb#78gb%H(t500_aoa5|$#eH^nTZ
zdP&dgfTec+1z39b1VG>W7tQ`|R=QIMzE*k+0D7aWEQj_!RTH-$H%<MbEpiLgA9o@5
zh3Z4yuvPu11<*G27j|yD`WY7xJJhwGBezo>NOQYOZAIg>TfH2D+#WTAZSGa;L;~$o
zlj#nAss5e|=!_aqJ{_*b=fw>pv{#&(iP~C<`y{PIE##(Yg|;KNPUG8q`VDOoRqaFV
z+9yDNXm=~1%1bThZ^*sU)&u~((fo%34b!791I^O!xB{)z$9Mv*(QlRkTC3loa$To4
zpr&1~k39*rLBGPO*ra<MLT<Bui39sW-<lO@yDk`B+@T*Jm3Hc3!N~2>2l)Z*)^q0o
z+M^#Dj%$bYW(9zb>Am3j$T=u={TT{eZ}$O=zRPq%FUDtv9{3s_+eD$?LD3|!A6I1k
zyLx6;y{E_Ex_)1u@B`4V`gOVs5A?2AfOcALkZijx#}^^@z2y~E?oXD-l<jAhF~^Ym
z+w!Iba>J}aCy<+Nt&$IDi8Z%3&<5*X3$C5D=Az<!V69va=!rF%N`H*A7KAFxoqx{>
zw8B}TM&97uhupr|`A_<5Tb*NR61F*4Vqdm9XQ~9W!+F65pq<Xph7btnJHvnuIv=9E
z9&&!v9l68Ke^f;7sPl{V$enYpP#frm^ZjZ-&z$Q9;@XT%ML6}JWg1S2nVBi~FUZZx
z^wkoe<C%WV0dy);k_)a~&Gd>E{C=hr=$WKjfxlH;u6UOc)OH}}xBIM{fRQ&20!H1r
z0XR=A{2k~^v5m~MUlfP~Iv_+{G<8ta&J1)&Okpn%i+h=H?JE(|3ArQU*CNOr6}`xy
z$3(et$Q>7V#{iua^SScBA?~rUo8nPDpj+Y?qiDB<2i3t7ajqZGGw~7UWPwtVGrLeJ
zS`@iON;R6w#meV=`V!^mUbwbYx%n&5GG!>Ibh+~4EOIN9c5G#(Qh<KLDkY4oy)}yV
z1#;_^6I5;+l*$yO&C21*xVA+pS_tS1Wk08MtFmGt&^BeU7tnU)=Ep!gl%aWmb}I43
zfp#h5z5&{;932F-NAZW2z-Gk0*+2)BiG_g<Dxq~uM$J2a8Vc;IqP*QX^1>;rq@pcp
zeM&epG+wgIJX|aN>n6Z*`>@>5UB}--+PxlG$)_iu=ily**|MAWQsgT<<X9>eBAd9!
zR|TxJm%{JShJ07~5q%ENZ1jG-c9Y89r+Amoe)<Hyi%3Slsid3#3M%e^d)<%@yip%8
z$dU!HO2Bf!s^5|^s$C+BRo~DMu!bkoH8qS=ueEJBVDLcJslBBfV4XQL0P99m@<K*m
z1FRQV2(W&I%76`Waf%uy@oJ-$RJ)BwbimV_G>b(#^raOrETkl0(=p_zW;Lb*HqUJV
zZ1H3xV9Ud#SNL7(uvR&a0k*z%4zNw_DS&MsQ+0Ik3`V-+5teuQp5oB?9F=Rt4Gkq-
zy4M2i`uY%Hw_cRB?lm}dkyp9^_E<9nFe)@NV9)uPa98v(YUG$t*#BNLZvgi8eghaA
zN$KlTmva;6Ow}463+dMTZX<K`dvY5k33Da`CSGArl5%E8dGb@rM*lej0S8Q?5*he}
zN?=eTsWjN7C*Y9HM*u%ccm??JDbCl>3#9xfeMzNZB6&B^aM3{E$_U|l6uFV28#VqY
zVd1sW!gDOp7*U6&ajaMzhuk=EsS(h4;ZL(OLDZuNeJa9608JBx{{osW=F+0g5MwDA
zpNXlBQ8-gn_XL_H)|LnQTwFZ>G+V4~4KzorBcbMs(=I^s#4i<q=8K`!4GV-Pwah|M
zLPcYX#CB?m#bV~4C|n|Z=qfK213i#iCZ<qtEEkpdycHr3Rpm<IL7lute1@i_1}k`*
zPGqSOB~e^9S3IEWIOuD6ZI?L1`QMGO0qUF(jmdeZ#9I>hj5y$p!t>%OiGE2O-v)G9
z?5zlNRb1Hw^qts#3h27%d<N(Tk(oy7Cvh<c&<%0cALtjc6di%enoLu7SA0ahd|&LR
z@qQq-B_a1n)Z(aq6EmrMp9nXy>Qj+GIe#ezlbIGP)oY;Ua^(TLv_f&GXS!M`+#iK&
zm7r`u>y`a~1D#RMtpPfxL{ag*P;T!*?l0v91>zrN(qW*t%8IK%Bh)9*c#;h2?^#h`
z(LVyTcI^e|e3x95X*<<z<}LY=&f>#~$m;ixd5`m5@=E3yw4rCYLnkV00<P)VzGUCB
z4{<{}$DPjrb8h5J=E}rg=FUmgn`h~8r1K7@uE{s#5z_e&(QhoU?g`Qb@3DnK#Ty}A
zc>jFBqGN9Y7W>!@<;ACvaZ9+7@m%hcDkbwK1D5SfB9?o}$u7V0B}&{@a#q|wVtJ*0
ztmDz1^5z*+3ngB;CL-<qKnL{Q!#VPQx*afJ_esFOqXDQBRIDq~RVPD(>eb)mMoA5C
zj;Pj#B1i`}s*H4<<eh*a&)MgCVKo8kZ>WuuMm;&ACUfs19kx3!V6zIG(H6m^a`-iB
z>(=uhBHi{&a&x;$<o*sbN$ZXt&j34DB0qF_%$|1}ObSO%;v7URXo8aHb~LEHB03`-
z+YFF1X<4N%ZGfo$YIP^kI@Q`0XuVpLXp`zZ6je5>#fY}5i@{5lZR&QSooZk3r)8Hq
z8c6bldoF5fkCP(-JqxY?tXaDoo)|XoEbeaJiX&<f^cLw>Ov`)R4s*#;`A3e~GqNz!
zUYmlE_8#6EX`k1P0sV^XK#Bi7YP*2k(MShYj{poBMLJfg$vRbAxdK)b+fY)yY$3oJ
zGilCi779hW)*`z4wG$|ob%uo^U3Ua$I;0az>OErp`WGo~4NkQHY*>uE*yt4POcN_z
zywG2$sG8kshmz()z5#4eo>r^nT3WpDKj|j5O7sD2-E9|2+6-d-w%^c3wVQPsuzeD_
zq{HMjfE^>LLpnV_4%k^ftLsy8Mz`<z^vE-lk?wJyV~o<N5_`@kZ$*Dg&g<>uiFEAW
z-yz+{M@KqtEiGHZ>3k?j>Pi{wFR3&@QfZ*1(jZBt!IDZtB$YmrRQgy_X{e;qCz49T
zgzV68AxA$-G@uBL5ydHsW5uwUK;y)bdO)j1SDLvsqB$+!S~2=F<kpGthk(|LIb?$k
z;teg#MzOUF&?b?aHff7kz7A-sIME7dn;1j#Y!^S41ll3a6#&{PTJhR0(d-=1Zn1(@
z_K5MkeXn>L1+-5Tp(uPQ7UjUx_Y3kfH;*~H2gO&M{6pf~DkwZG8gXL268*O$cSMw0
z3UpiqWCuDSBH8(qV($s$PKkV3fKH3Zzktq&$GrV(5jq^`tf)cZIw$&5X`UC`{gAsL
z)>(iqil2(1@RE21snNd?g*ok4MdIf`--_iAfUb#3^bCFwKg~hzXR#O(1t+!UA3(o|
zrsV%y;wYKvwwOU(bVpQU&+dxtn~-}TswDwE6xA*OJr<E9!Ed4=_2(br$hSaGMd$>e
zXQE_T-1l6(pssu&?$!bNQ;hb(wZBFA^+2yg9qPz8Vk&QcE50MY4Ob39JxHOf7*`(!
zm9kRcD(78>w3py?`25ow(68ndz<}gCfI(ZhD5`p$%CP!mTDF><-2sC&XTUn=R{@6X
zu7P&y&n%8~!((26jq}kl2=%{#bklQ>0h=$S%5CYvIcyb@AL%x=gOF}liZtj@_$<;<
z`^g~D{<O5cR&xBYWoWVD28=+uueTS{36rV-t`rHJ+f|}KIiS^I5M^$S2rB}#R`|h`
zgFn|$J#G*;IjI}P;Q~OL#4%E7v)D@>{6g%fv#?bpQ*CY&vjR}KT{Ndc*(vVQIoTzm
ze*)Sg_K<D%iuQwnz=|ZtL9w4A1Lhk7bX1I?syQL5^RZ_{BI)q8xK7=DRvaQl&xsyO
zfUb*Nc~IpCacLuRKZ-6?`?p19dVF`pv$ZI^CtBA+?!NegD(D;KBB$}b622egqeRYC
z&m9I@swPq)uT{$@0Buq~Y6-MWT}&;zSKSc-^p*MpMdJ^(8t3tiI;02C2+fzy*%&Qm
zA9544I4amFTBbkE=F1+Z&2mdwUpQwYwKMnl9k9}<_JAIrf!7UITFvl+dllNTN;&fJ
z+Fi0z<=Ny_&t<d{fko(f*38oZ=~~g$lC?`n9i8hD(hVBC25jj10<h7-XMjz690v@y
zR0QlbnCWxc5YFv+trREj8}0Qapv&6zUx2P?+j9Y3)jnm-Z?%C`sNZQh+u@ntYq6Pt
ze$p~=<@mGK7fM9Gsa56ex3&75v3pvHy+HT1{mDSTYEwBY547Ioxrf?NGUFrdF-^x~
zZRb?ve$#5*1^Qh(M9MzVdh&TswX0NJ&$I@V)aP0hx!{H7zYe)SwGOdBFSVmtf&S9G
zs6+nNM$_!P(dP4cZ?#$BKqK^m)qqCoI~D+q(xY5~M(aD`fyU^+IpKzJ`a??0c)e(K
z6i(E$j4~&!z-*cdxdkUDrsOg|R0yAO1hBPh1;F?cX8@Cfs5XYUP$f(e>x%*%5D&=+
z2gP^a1051i1!^1?Pbn*3iJ=pLj$pchj*5C`fsToOWW%pTASd~P7%~9pqL@wla8qQa
z9{fcdtBl+(W!YAsBgz7f{gQH=v%f$cM9s5I{efLPq;7f$^g!LT3TUJj{L~y#=9Z*F
z&KF$(b8jGn<@s^Ed98A7NU6nZHFcEto`!%vrxybH=9mW<_;{OHuhoOo$hXPEa7tvy
zV@O9?-T+2_(*-5He8Z5AU05CIxO-&mzOPA;grMz!NoB~e18?!JK?||ul5}-BNTx0Q
zmTqvlesp6igpNgtXVaU2zEdd<0rS|~DsQ<Qt6q)$3SL|mu<ocCfb~1DPNO=FkPeIL
z3)t!^xx3RBH2}MIqxy<G$=*cYD-GCNu>!`uVGI4DNsXjo!vF`oX72~bLTs_PI|w*j
z>;+k{cJqt`8Y`Y(2AU{7B^OQ?(<cGV5=A=#%@M&AngwDI_)d<i(6@9Xi_UKaSfchW
zTr71jn|Vzw%_r%}B>=5wNgL<4RArg;Q-GQ4aD-V_v42^!kw3DPWn0-t@VPmxEXmoD
zb}-l727q}kKz{g4ONGXmVN1n2gq0PTWQ%7}jN9Vtmm8j^UmI|cB`64VwN$zL1b5b0
zm=~}X8iDP(Pq_=()C^B*a1yrK(&#VrR}af{39xy~f`H*GNPsr?XgS&kZbZ6M<LrQ4
zPPhSfpAu+}?~%Ag26>k9zJp%C!bL;b`y5BPN^HgL8t-v0{SvTqBaS0tWIezxZOMJ>
zMeoT#8^m*R#3u2Ug0WR>xel~Llqd<bQ`D;ov`e(0e%>vzl7f50uvtKRMRF3*K9P&F
za!~kf2Etpnoc_ZwDL}`BB|p${v4Fbhgy?n^xs#$=Eud2(3$5H~@lj#q&WiIS(>ZYt
zQZ45;I{!x~i23C>V6VrW0ehcniW_3*bU`|9E_uG+Y#lK9+!??jvpL5fkGuysG%JVl
zNvY<jKTPb4LuP~s`vz#V$j%$5h<&wzW{W`;faZvMi-G2fy;Pu!#pPE(OGJM%^$M|v
zL|QF6aw^t{Heo<(MQ{|*dQtuy&?a$|OtV?6m<zN;^qvN^O*rob+Aa>Ui#x<c5^0zC
zo@#23D9FL?6@RouZl8#z*4ZyAv`6kMF`^=JN5s^nKu5(rU!dcn4B26fdbx~Ao$9^0
zw5V~vH?G#~Mx9gZDkUyBJ{Yj}4!Z1ha*!5v%dq1in`yl26*-S|{Z_o%;Mi288`jB&
zbfdCwk#79-C}8MidKO`~iUBqq><!p#ZUw;Rue0E43oH0X&QF1LO_49vjY|r@`G=6M
z6HTwK-g7>w(ZMQ!VGHTOv`B{R=&j3~25f(ai-yQw@*&+bm;;V3PrixSO5Ts%$*X;f
zlgkoy8z<!T9E5y<CA1ZV>pCM{Y&xmwQqvW%^f%N%<(w(2Zfi+wcP|yNvdZUqH!BI~
zfBQSYpdUTZLbWfeAzkw)4e8qFXqQ5qzC}p`KWe^)>wiSLiIrEI=0gwk78}UBtv=`I
z+pXj@bozW4(p~1z7mOUb2(V{e8uealC~dvJrEv5qMQ)5Au@*2PCIm1!{BOYi>-qo=
zxX-y7e3c%}$CkcG4->(6fJTZoyfQ`<$_g}IETUxMO_A2fO%>5xWK9?S3Ioj)8!5@N
zMKWv76Ajtph2kbn#1e6HHVT)E`S*cViO;tHtrf8|fi{Rdm_RV=!(BjIMYHZe+r$v^
z>`u{=Jh4kmt_8GPj2aKLM+~9s`=yu(PLwkrScS`rATSQ>McKMYSAWZSt5IeTN@`Bq
z1{iUO^VubKIAD+CI{~9y*o){~7XV|Dpd@5l7saLGD7Ylrb^-cE%%bx7S?V-|o@_gh
zco!k-Wn7r6+^h;TPg%MWXuh(P9?M~6G3oo2(v1%M5v3<MM7CYz5hb@+4bD)B`nds1
zUZ)pZ`t46hmwm!vm0xuXX}24r04tXE1gtbC8L;vXTs(W_4+Qi%Ou6to(g5jzM^lgv
z>KG4LHG$);-likUYc`pKbnt_hfOWQS1#IL78=yD2Hxe+cC>39`+@}CrjHC>P7ycZu
z^<%ntZ4cf?x_u?eQ^%21>YcmM>~*;Y`>l6t)dcCt-t2$W;@*JK4-)`;4g3o*ws9@M
zxWUf=`>w47m{67coix5Y?(M&g+GAj;wt#~tlY$?KK)|6b*8mO^tx5om5J%a&QQ|xs
z93yU1a>t5q2O>9Kd{P%^qBu;Uo-6`iAU9PUTMRT!_*X~aEU|>-oGt2-{BuPyIvMjt
z$V3z_6k}MpSPb|cxurr~542pAqn=nP?q5T0wU|zhSSP%50BsNlz607MIz<C*5jT1O
zZ58`I0opE3d<?Wxd`yDu5icma`$WM3$n6(Leg!%xZj#9ki?@`rBVr5n>oJjd3%L{G
zBMSH_(Tf`4jF?VMd{*qBXq^|QN&Jgq8*l$c<f9?IBKA@iuZbp{^6y1ua>S3~6<OtH
zQGjjU6x|O2-4gqI0o@VfdEY%zfO_dyVWHZ5D1JGE++$IS#_M;H^%Zh|h(n~=GqH&C
z@Iu_7D7_TpxT<<3E^%q|TC{(S!nfiG*=xAcd<SwPmAi9+Mk|SNK;x8qZE$UZ@&lI*
zlaz6dP&h@YN`3XI^0Fav)0JH0n3+o1TqyipnNLE_Q4Ub%<}2-aZK2YXOuJZVz&4jD
zZ#n%d6v5}MQkG{yZjBN`PF$x<9g9M2Bkcv+q|D$HZ&3!i18r3{lTo%Sttk^bmDQY%
z-AZQ8^j>A!6Xd>Bj&t>JK-ol-aY!jN4254QJt+=HmCHF$ctXh=iQFk=%}$^*N;Qt*
ztWt^nJg@k%OBa>j>H}R?e)I>rs(5<<eWwg$mFr5MRX{%~TS$qYl@=DDn@YzQK)01l
zCGdp1%0%9GU)djw+ykZXN#q_W2c3X^Q`S(eJW)>a2~U;I7lEED&mRK)skET#|4W(9
z#$G8+C@Zg(;uMm%%H;+?!_^;u1{$gEBO0x)><=_nT}4hFuht-EO;jgt2AZr+t`0O+
z4djwyntI3?xf$w~KY?bd$6BNCbF~W%(HwQI570byWp<ziY9uFTk!m4bmZ+_B0xeT_
z)C5|gj$l7msR_RTtx+@6n_s6oZA5N^db}ghCbg0kXp0)!9B8W=Mv2?5*5-8VRNGU9
z>{b_%KliF$F+ls(n`D%OYPHcoht<v%fsUy6LV=E}5AFh;R71)7r`39kfxcGr(-l9b
zPL%;D_1i}%{6;NA>u^O4p(^}V?b;aVntCA|=zH}5dFe;>T4tc1Ro|;XH`O0@1Km<z
zl)`;?)P7}w?x_#EqVQL>IBES*Juw~VvAV81&>w0qdNj|}5mfJgs_p4I{iSB^0`yA#
zs4UQHH8UAym{yh?GE$p93uv^~_5{#a%{>-qympy1o21Q{iriG~_z<9JS`r5~TU$o2
zdY(3bQ#D`vjcRR?7JD7JCE74opyk?~!9XjtN$r4EYo+f2t<jbp2U@Q+BndWZHAunD
znlHl~UuY960d3Q!Wd+)yZJdT{yR<EZf%a(EPXT?Y<s%my(Du|ql|$M*GW1v4d>Znj
zT9ea2$F<hEfKF<$t$<Ezn?iuT*3LHrI;Xvw4Rk^4o*(Fv_9q8+S-a00u4-kdlfTop
zlFP4a7x<tbwVB>PKWp9E0^QVnJ_Wj^4gD7Aj@H--ow%p1qGkD2%Sm7Pp*EAV@>rWu
z6otQQtHvVthgNY0&@-(HTnz*i^7H90wVxIO{jJ@PM9qJ+PV<p_t6AyG4A<w6M{cA(
zl_q(#{vaB;vHD*$zT@>Xa5NA!s3a|r{yj-PRnKx5Xqw(?HP8&baZ#X|-1mXx=-<@>
znxijPf#&H+dw~|{hl-<>#rnET$Su{^gLkEmu*G@c8I*kCJarV%R_E#WfwnpC<}*(_
zzgb~cE4cF&%8RXRi{j$lIc+5h^fj+lSV)yuaql8P_pucEN>9HAtbBm<@?Jz!<hOxl
zJurccQZ;L7z#3O70S0eg23Yqx-RyeL)*;>CH%?&F@*@FTl%#=e)r6|C?W=<*>5!u(
zVCQWe0lU7z0um1B38Z66ld8Rj>VUmJ;ZR43!{|KLI}bQrV?^eY$W0ItHISPk3grWu
zDms#ipNbHQ;%6fADbP%DjvbyQUS0sg8&H1%Efljkx}_rW7SJkDxgXGK@!&Dg8nJ+d
zYeha1a-Ep-2hc__jqI^Ws8k?ZMH~fln;1ta*d;t_p>Vf2L;CC!eW~xh6j{0f9TGLl
z<Jw_yp$X7eqPR2A5pja!Iw5+K$xe#pRZw_JIQIlPEe291&WH_s(AT1w26R^RdkA!1
zl)eFULHs~Hc~Qh{2D&6pQ{P<?XE@>4#ZRQtPofp+^0WAa#JV9m^LclLM{``eCq}T9
zN8)}}AOyMF0R1NN^XX5-k92>Zi*{#`dm%2)0s2$4CtY3&|6M?@#b`3;8?lEpdn?X!
zx`ru#Gg>)Z`DYVyBa|cx-$-Q$$2U*;oqd_Fg!D$?0%dgrpoPk#tw76^(Yt|`E4zY$
zRw$kWfmSLOH{7sFxtfIBYUSgO$Zb?Y=OVXBc}ntZR`wqO+M>An<JuQWMT+7MrO?kn
zJCzolQMgz6>{sOWDIx^9FO~aTYwTCDeTl*Y%BQuEJE#OTM(&t$g3iQAW$8c2olzc>
zNiQgI=a9Rol=KC<tURfQ!YfLq?WT~Z%c)WHaEhO0NhqN8^>2X1o{a^p_&w=YZCXRX
zu#!-KQZ}2(2tacg0cas304-$%Abd_2)N3sx0PSQ1preccbd~G&ZgSZkC0Fe+a?##P
zuGxFbC3_#aVvm<=^S*Lv-cPQ~6Xn7@S+2|b%Vqfhxhfwd7v<xH{Mpt7A%C|uQOF-|
z%@muvVvw`N=?&=498q^C&^+;HUZ4fSn~b<fydWnn7CSf@OGKx8KubmKLO{#J%1EFU
z;?F1Mc#CH0fr4VMrvcVnvmCH-9781GEBuk}R0?WG(k1&t%9eZk>*js=zWWaOlD7%~
z28<^wHO)2%aA1)#fP?;Dd*1;kMV0*DJv}`EB!^vKfhFe!mIam^mbB!YVVK>S-5uEE
z%q+=-N)`bH1BePLqA1Zr#E4?RaC+yN6#?bWcq)1(6#dn=->aGF-kFu(o%jFW@AupO
z?ACi-U9Vof(C<}Mch}6{OmJx@=sTDGu{q^0JD5Z8atB802K6FZ(+sxlI^&*ee+O>W
ztZ_Q!hZg-Ht4O}G8(|wY!XA_LSsKA1D^7wUeICJKub{ERcVmqh@dBRvWw$_nq_;i6
zQEi(L9KCrF!7-<>1&zI72f^pmble1IRo5;$JEz`A3%nsneWfiSQa_~mVVu`%*qP4z
zPsNQlpy4@!1NXlouhpnmjn12Gfo9oa`fkb(Q&S0!Iq4!exyLO8J&$AVEf_nCU}*EN
zTvEf-ZuKQ}?$%phqI{cTnt6KL0t{L2&1llV-Vk)yGN{>OR-!kjEWyn=Yd(pp=d8nh
z7x=yg<x8I~A-JezQ-W7?KS=NzwGY;Yo79If=y$4FjVU>(mQ0ZKH=I$G5RJ~GRVkli
zM{n%HR^0Xl$kl!MaDqK+w3U4_uhvOA&v^k`&wlk&ESU$?))=IN>X}}Y98xEO$1!#0
zGn727=B%aU6}1D9uc@!kq2wL47_3jLS7LU2q}F+wlCRZ6ttk0L-4vo^wJm!bB^zvY
zaQknuJ%yIsW4k&)$y2rh=wpAf<=3X<wC#(>3Hhn5#dDN=Zo3Gp&X=~!(HUoLrTLV6
zWBcp9lzeCN`6&6pw)R^}&e`VTUN~>-+K!T6ZPV|k<Vx*rTwAVve1?*>+BDk9NiTjH
zo8uO(70jEh+E7fBZCZ`-vIFOT@n=F5o<LQr)Fxz|8?&WtKdj-`s+-pne!aSWi$QL~
za}4MAv~OyPs!{kP5vsOseZ3As>}U<DaID(89IG}dZmim5j#b-$W7Rg~ShWrrtF{rx
zs!icowT(GeZIei>+NRN1wW(T_W^rTH+RU+Pn{%vMjbqhHb{wm=g*8@fOO93BDjKV{
zbtG188=Yg->XBHrc8*os))K3>ojF#mlOiSdZO^f48<}F&+RU+PJ8-Ppj;2_(on)-q
zG>%nEJz|bk+nHn4c8Q5q+trz#N`rWLdNfvTw`i=|jA*Rd?v_}!wn(hn9+p_OJ)KX2
zWy?(66sxwEQ*TC7O6%9#8mqPs$ExkC)#o87GH9Do+*}c>w!d=&Fchnng64KLF18sM
zH&*Q+j#WFDW7TGHtXgIsKUVD!8LL*d%o?k9C{^BXn2c4+BP3$g+RU+Phs#*Ck`a~b
zZ{t|C%{f->2+f@>W7SG#TFOX{Rm=0bl`DR%S~th4rFgs|R_$m_9iypZIaV#j)NM`O
zl^|B__-L%!2{Kl#jbqhLj2EkRk~voGWOJ<ADIBYID#xmw#<6Ovaje?u9IJMQj8!Wu
zv~#Rl7NFThGFGibaID&y=2*2=&9Q1{Y3gi_RXayh=f;m!>#@YDoo9|!+Y7O3dq-l`
zdS$FymQ-t*!?9|4d>px!Shc=LtXjW}Rm;t=#H!8XShY#jIaF+ZBvfoc^gb|!ij}-L
zRBRv;Dt11HimhP@6`Q<(L)8{KuN{OLAw$&`aj4n`#T+`eM23zn<<PMgCkP!|C_~3;
zjp}e9+M3Qme%#!3YH=9Z+8jnU7$=NuN=Sy0tqkB=8NjtNfNNy{*UA8{l>uBU1GxU5
z4&aKQtQ5)}4qv5t9@(6V#}AlV0C>{C6dIVT4A{qjV+`1jXbfxB2#2p<nCHhfxL&Lc
zTZe7fx@^N%WgE60+pzWH+OSD%!zQy0+rVnWHe?&N5!<jSY{NE=*sx8aHf&Rke)T2Z
z@2yyD*k<KySjmoU*ydInwgua;WX$sKt&lk@e{Y4XS^0Y_actN&78|y092>SB+prBQ
zvSHh^4cozF!*-N5Y$vv1sYlE<Y#Q6Jonvg+F4BgjAB*ANTOo6ne{Y4%S^m8hGG|S{
zw-U8syIXA79!_Dy_B7kDnQX)Mvf8k{*@o?-)#o9QHZ1)NjId$*MQm954H#*|y5icf
z1K5Tg$TsXCwqcohd>eLf#WrjfY}g^vhUF28Yr_t$z=mzcHtaCXJzUzbl9`q=f^Ary
z*R9;~ZP+So!)CJ$J5p0eY3gXUVaHTx!;X#Gu;Z*Y?D%*#>;$t7JJD>zPGTE>X3I
z*oLjnHtbZkVW<6;4O=a~4LjXz!(L>zVP|OSOtxWR&&Ic5=U8mmxn>(S6E<wGhz;v0
zXT#288<xk%K|dpN&wQAM^7mG9A~tMp#Wt)jV#C%TE3=T?eog#(-6e}D(=gB3rw}|l
z)6*~d-Isjb_E0H)!lM`cUd-GwqOkKsK1LK;&48}GMP6XL^Djn$-t(QSiV6Ryx<J?R
zaDkjCg}QAw1l6*NoUf37*zP<_kbXj?*s0OeL%VjTt;AV`9DVkNH`ld2Hfb``@9mU2
z*U%*`DPqnav|a4nVw8K-7IY4~7#tq6g`BgE+;g_DUU%dI$o!G5jFue|+U^V**OC{q
zRbA6;RWFKI)r+0W4bq0S*t}m7G4Gc~mxzkY`%5C`{iST)*EXB?ddE(ekW|LbS9b+v
z^^}5(2{zudh+q?yhJ>DDAm3&7LV{h}EG3v;9W>pV0FyDCnx%K28YGyRv4mhRFY>*c
zplqL>VS;_11ngJ6h+zLN;4olAF~Pxq15MU*;4|bM@Eoe)>ab#(0s8Pi(|GG6Ud7ey
z&%tx#))IoFUIFFkATVP_(a7p!ZIEzW9%`TTKIEDDIrjW%=TOV^-GK9Uga~?PgM)7c
zjf&CcrftAIbst@GG+zLYErybmj+RY9*XkPJTNj~Zn;VgDdjP!JjRj5n+aO1WAE;JG
z#{sCd(^Bd|M_Lkb*E`RrHaogBMN7ISgFgLh$kgp^!Z<Q6hOFH~RJ)^RHAtU%hVX_=
zuBPBs<ytCZoYy`K8SAY?ZOQA=V~uV!MrRgyH`|21T}H3|QgWqQ8*R8s{Rb7McNZr^
z+U4pqsCb2X0(@4gBham@)T@!KRv!byHR^3Bw^luha@VMzp~dUedg$D1)mEr^y&A;O
zY*5#M*+z9II_P?}OMnsz--P0u)oRp!L*}}7qfC=+DAIi<XnK4P0ekL*44Hq$)n0>u
z>HTNW^f54f)6lMdqe0XE639NhANncVjUE~GDd3o=QSrDR!E?fL<R`rdK2wrWcJ^X4
zXYTjF&vT=HavGwYz9VQ+UL(K)H?Gd_2Fi<f0*20^O=TIVcTr!|d-)MGeVKX$Cs(O!
zA;og_epI?reGGgzsAXVwomvF`Th*>O+om3XjJK)7akgCz<LnOgZWO;uy%?Q)pIV0L
zu}2*VDfX$0aqWN_#MuLCbDSMghftr0emsocY4SB{OwB;S<}aX~t&>PFgJPHZB}6@>
z9wZu3<}OIo=0lWh+ZrV^C!vRWEeB!mWb}PM9Rd#g5R`-Vq4il^kRLV!GG;eLO(T;~
zYE&s2J#HdeGX7JDH^Il!)%g)+8r^_8Qr<xe8fQ~~3Mr1GTr}K3oq>?M@d-?nwx>{g
z`>)XV-4*nD\M*H(0QpU&Xa?*M8akOcgoGmt7P6&!~iglxm7qAl6)0W<0f@Er3C
zxQ(k0ZWHE!|D@+2^OPsh-f0t2`;6y6IqNde%()0{@hk@i?>6G;SZ>=+C?WhLTx|3f
zs%p}e*a++=#`#l#si#Ri1N*h@6-fP!Z4-%PV3Mvy1%o}H%gRB;Lmmgeq0P~Z*|kyS
z96!+zwJn#PbVMJ6)s!YggIa7tM1HcdHk%N2;+hb3*@URaCPaNUAzbn?SQ48M$!tP2
zU=yNY#Dr)RH6c>8DvdA1gwWW8(D^Z#WXC2%6RQc)lud}#s0q<5VnQ^Jm=Mm$V=xCl
z25Vt4AzGSE2sdr>eOs{!kyMch(V9(&HYO9Itu!Ipu?ayvVm2Y#vkB25#)Rl7O^8lW
z6M}x2l1&KuX-YOB=%*<Ym=IkpCPcbZm=N8}CPW6C5Z$dNL=QG0dTRA~2#T7~G}D@8
zvI)^EVnXzem=Jy9nh<^2gy_d6M1M9Rn0b5?VnD?v#6Xx3gQN+;BNW$!7+iq~p|J^(
zrMZVl6GAf6Qiie#!Snhzf3k5Hn-Ie_b%ds7vk5V>LKA`>p84o0JvC_5XsZb^CY}i~
z)@(wIGn)|O*@T$DCd5QGAtFE7IEhV&$-iYn@J}}K^CS6_jZ@4fgykn2#m`lWpKP42
zsWal65Hl?%#4NK3kpUB;d&Gp8UCx9c6GZ%EBae?`uEm7#L`;Zz6`K%VX+r2J&qz(y
zwz^I1l3Ja;GSr3Q&4QZjm5KZcZtb{BY8_ru>++IXkC#+_v!FgNsY$$~Ci9ZoAhM)3
zj4r8-v??hVvZOZVB{lM9fn@hDFR5|7GQY8;8gCXvmel&c$C64t@;`7%<staDEUC==
zUtdyrg#O!0YU4Ods$>?kr1HG}%{L4F)0fnX*ei44OKR2lUKz`hTE*;@`G35m^7#Dg
zORAPsjlD8C?3Kwidu4nPuZ*9)GCYxmS0+z4du8(3D^n2l$^;@_nfYd~%z}tlrZBNr
zrl`DEra0o2DT#PxN+VvGi=$qdpv5Z_5<VCBLu?4C5>?D=@Ger5caey1zj7CWhM88m
zi(Hsu&iBXvg<Zrg74yh0GK3X#UZoQ4A`##IP*%)`X)1JcR?M?2RLtq;=2<ZxWmU{a
z$5YJ5m=*J}X2pCQE9T=_F`vMSdBnFrkrnevzonQb@a<1FE9Ms8K7LA`ef!fib$Wco
ze1=6apIM<|KC7H!KARPD9v{aXi()=EqL_OsR?O!~#oSm-*Yu>t^wm}5J;%$w)#PH@
zqB?8hHQ2dPGp;6Hi#74utclmLYT|WS6R*dbczxEylOmdUa#Ry<pjByjA)0uja+<hg
z$C`MG)w$7_of}P}&JA*S$mgTv@kso9w3)@Z(cJ9ZFf{QN?A)kZktW`fog1x8&W+a6
zxzUE58`L9aO}s5TH^~1%?`^}jU8}X1&W#RH=SIh<bAwzS?A#!iM*`;ty*oUsjMzE%
zk;6-SQtcw08(q!LjdXTybhA1)GT6D%U8~PSAlJ|y?A+)Xac+>mqpRWhXwxgMbE7vq
zH~O%1qc1x*n0b6nykEu6js9?M43N$Z9-+9J_`nLB8;w{KAEdbl%QaLo(^9f{4dr>A
zXbp|L<2!`c(4m@&Wt7*@5f!eX^i-AC(2>?PbX2@GbhLR59b;ZY$MPCFj@QufyoN?J
z@d><!PW-JkG=U~Q$-IVIH1WxrI)&FzXyx(O(CL;nbcT5iO^0)%Tg16Bv)mdwi`P&d
zAIEIV8agMkhR&^M4Xu2&wer>0%2!(}Uu~^?wYBoq*2-5~D_?CjUTuYIDyx6QnzCsM
zKQpfPJ9wm=@=@{c_FO|Ve=!+(u3_g##q16VH6pWMo%&BdDt1I373&xHsMtlSdh}5-
zd(FRXOmZ~U%wJ4i=%eC)^K%U?^2Oxu`ly(P;QymXO18}U#iZ#`F^`b2rZn>xll-Vy
zGNPgFukja?|IOzbzxB51@Ah0n<1Z$E-$%s}k5swm8nVL37n3YNGxkVH1b;DUd9Gpc
zNc}sXYp|qR%gCeRh)3$T9~E;m%wJ3<tIk0;z*XI}27g)Tjs3Dvm|G2MFQ?aY8a}s-
zV59xWr>t8=uyJ4w!6pl?BG`1?T7s#+tR~p(T~Ibpxq@Je9xDj8oCca!KG3vo2~3+u
z!J+Lj&~#XiQXOXjc3Oa|Y42P~uyYbhb@>)BV=!R%x2`7G<EqOE_B;kjGG~Ck*DK)G
z`%9GSvlKl0{&*R|0ne@^IIuIW4(ba0;Le~K+7o<+W#H<_mf$l=1swe)Bp>5Pe(WgF
zjH?a)<JW<5f(w|5hakzMb>KGn3~HIu7(AyQMsG}OhpW@SM1Dpcz?qq7^{gjRcJ@f{
znKK%-%$<$A*M_S(pWtfl2-M}<g?{i4gv@!HfzN*t?JZaeOyCCO=T`?@phAYie?Z!z
z9jLeXMZl6UI0xsTY^WYeg>M6<tOx2{WCP9ORA4Ugfy1RMA?IbMfxl9{9R(<hHeIc5
z0`nE>3?Nskk3zsz>McmtsDDB+`jrxpT%(S}*|lmLNVZ;m6O1>gxj4H{{SXDNSKkEX
zCUq~6H>eFD&5dd!bihsO50Gbzx)ofus>48Xi`pMuajSY1$!%&AB)6;WL9#==3_R~p
zPXT$SIt=ygR8^eatu8{-cd74!=e_E8xVBqu2;_b0ax`;~x)8GLRZpOzed==<M+)VK
zW*$`UM1coX1;r1kxsdZ=bqI9`G!5_2h^9e)Th=t{MBYxT%?{i;?7*!X*MVD)9k}(`
zftzG?;3l&Jw*fnF8?pnpQN)3p5_RA<)~YnQ5C?A4at>U{jvctERtGM<i8f#?y<o1>
zi)j3T3w`t=9k}$>OJWCZD~kiSbsPt78+PC(SLDEL%MRRjCI@bN>A<B|&+64A8I!3;
z%nsa+?7*dOVA`+)H%&Tl={+?5z=hsJ<G0i3vln*Y(q}IT9Jt*q4%`gs!0m2!;Pzk#
zZcnQNH<KN>y|nr~1XAUoSJA`=E`1^nT>9okI&k~Nb>Q}A2QGc-a^^bH*~ScH2QD*@
z@4y{Yu>*H79JpE1fy*Nl*MU2v0taqWcHj=x+{2^;S2EL5hO+~g=k;$oa7VBMH(OKj
zjSD+)M_1^;9TRonj<q^)$HjBtjyF4SCzu_$6WM_~i5<9;%Q<kTumgAMKk2}oR-pqI
zZ=|sUcc!M!itoUshgdyl;!(pL6sC#deataCaC^Xk+cV<8om<X<OF<sQ+i5&Lj(HXb
zt~cVqtyYl(S8J3bAA)H~x%|P0&-}rMpFj9$5eegv7x~U3KYFKB^qogR<U5Z5f9GKh
z<IsA(GyOK|4ZZzG*U$<-_L>d+*@yb-wS=&}y`G@fe;q;lqu8bNBY=)hM5sGAfX1~B
z(ET!Ks?-9k+7QexT1a{97Bwfv?iT!%uU_R-K1{<7kw`w3Px&gJ^3jACQ2CVapWGz=
z*FEL4T}J`&qVGr4mhVS2ulBF2*ZlFvze~L?|M3T_U(%!3)T&<78xTUhCOw<gYtpo(
zdQFP9RIlTF{L#{)Ubix<*T%;mtuIKuZo}$zTa$X-PO8`K6ROu8SiSBTqh5EC>NUOo
z!0I(=+N@rarp@X#Y1#?Y>vW5H-A$_38D{mmJFC|{tm<`7R<ARoN<&dI>^#kSv3lJ*
zqF(ojsMmets@L?*hrch`7fGaN3-x-yg{jvAD^{-uLA@S)A?kHj1?qL=<BuVlyQ)>a
zj(q&V>RY0ZKQ7|$$A34E9)By``1keb(L1LPQ=mvKshmfTKSeGV1Fgmdy{TON-|(h#
zcc|ArBI<PoAAj)p{L^nLx2)y#Q-o2i5t~+Ry;_4DXBpw_cIQVFy;)nU=IV`V--VB`
zA6I?O8T3uJ-5J1_-Jhs_yK^IQJ8XG&?G9VM-C1=HogT#L12`Ry(-Sy-AE#?^x<aGe
zv&h|uoNX`V?$U^47f$=*^daPyBKJIU4<h%yR$zCw-$%I>cFIjhZVPfNk^30AKOpxt
zavvb~GjcWeQ*NbBx&Fv)MXnIJyO7(8++O64Bli??-y!!BaxD)~?i6wpkvr}TI4`2;
zp>}6mK)qwzo9Uy+{2w2sA4V$p>w5$Pzp8|p|4TQ@FIamG!9t3NsyK=!yiBmz^DvtX
z$x}Cz+TZXneP=^zKY2grUPtFGt8XXRsu-~KH9HBmY5W+$w#p`g?Ouc-(SGoq1Ur=8
zOR(chy9suB`*wnzPv1eX%Y%;)?3#ppdi7ffcDwF=f*D)MKc#oyv4>!fJHWqZ(?1Z*
z{1LVE9zlZXeL4d6O$WDr9Z*aEQ@0TuaPbj>0}rFrpjr139DEp@v$jL3p=b9J9Ci}2
z4bQ!S;D}*>*^hwR$QMCBs^y~uN6$D!aLjpd9{Urd8h?_el|EtfL4p&HfZL?o(Jzy)
z0i2R^J;ABx9w#_01e)o01J3vX63%)BvdzBgA%b&0L#yYuL#sW9(9(IysNK8n0KuG{
zpvfH#8egxihM`40^*E&I^E{~gJ`9RswtWP%R~{xf`iBPzj%yAnCr;T<aMogsM9$;e
z2o`s{hhWJ=kh1h4v|-VJT?Ci50?*67N3F}$0$jdYrD)1xjFayC10j;TQX7mhe^On9
zLeHvQpP=Lw)r-ros_&t1-&7|u*;i^UjP*CF9n8N|-`PmXkLnXhepW}Ie^=UeVfw7L
zy@x(rXS-+zB^zxEuczdC+ZqUagY7Iv=qB4uXyjH~T~OX?>$ZiG+iX{$g}2-8!erQP
zE2fGdq2o^5->E5%yKK#FqGYG-2=R5?ZF>(RxYzbDAt@U7610E6Z4aiwfoR!3h_Z)7
z*#|}0!=mh4HXmww+m?kfd`Fah*C?yMXB$ghZ&ZI$us9`HoE9uT5G>ByZomltV)H_Z
zUj>V0+HE9^QFfJPh`w4gL|>yBqOa8q(bsE}FdsK){UF6g%@F;%XxW=Z*)5{%R#EmA
zQTBFiHb!H+)*RDhhbVhTwCug2>~2x^K2i35QFgy}^hPRlKpTz;d{C5qAX@feQT7p0
z_EAyxF;VtOZQR{d=qc@bh<;3zeL7n9MN#%8QTC6b?8`=3#|K)7=Ah$4?L)ei9A~sM
znC%~FhtQCZwdXO%KhZ`b`BZC&`Te;z3=7Mb+Q(FmeuJV4CfT>z*O*V=X(2i@`uGQJ
zFWo(kAGMw|_KtH}7ZmzgP@mU^VY>dJeJAI!W2N1oUS&6^SKAHhjrKhl#_Q|`_4W2V
zYMsG*tDwF`P;V2|f3?rWBJ?-=QViTj_6;;24eCz>^{0aRGeP~Op#I99hdckQ{WOVZ
zP=6z+{~@Tq71Tcp>T~v8kozb5URv}F>hps77eW22puSo+Bw4O+MqjSbchKTsP_NRB
zR<72KR<6;FR^FsHq+ZZ()(uIv=$omN4eDD2^)^9$tDxQ?sPE7XN$%7S(CubW?-bN`
z3+i2h`aVH@zmQ~)-j5bRgL<E!-Y=*R2<j*FakK#HM+NUE^~-48FsP3S>Zb+uGlKeg
zLH&Z@{i42*77c^?kAnJTLH&xLKA{)jc79(_pVZsaN@Gx;7StaI>JJTS$Itp|EGp;q
z9Y}uBtJ26je$_9bu5m1LG@vEXai!yHEQxnIKBsA>-|P50rr~}^E^dUk92uAc?>bIm
zv3=k1I#!|6jxJbH&p5uo;`ps23AfvM$40DI8=S{64Q_PSK=Iq04^w^mK4){RwuhXr
zVe5F%`65=ur$`48lIsnUP-VkCHxNu252mf(<qhN|r-Fs`6K7YfKwmjmW0$<mbrkpZ
zU9M{}i*~y1z&3KX>uR)Pmn$7r-{Z=|l)u+i3dwi7cH+jn&ovI4!2Pc5TPfM&`W_4N
zUe_{A_kFIPAmn~m1`U>e*mV`%#ri|8`e1#;H69c2Vb?jd{}ERWTzk~@H15>LTyIfl
z>W{l#1<xm3D{#Lbb=AQjJ?Yv9%BNiUxOU9-B=&(fUH?Gxmg{C9-*&x#hP~q&g4*A8
zRYNDd=Q@BT;e_ilbmsf6uhDNOU45WeoN`S=N1t}pfk+>?{<4{p4_$i?Q}U531Ksnn
z>jb!b;(8A|)Tgd8oPFjRkB<D@brOB{ch_nN@rCOo3Vi8$6eIeT>lRdW*7X`j^lR6f
z7`$&>OVHQfyN;qKesFC<!+vyieT<TGuFug~Ke-;kjrX&w4#Yk0dJuB{;tFD~`qh<>
za?9L}FuSgFmx1+F?iW$~YWG=Cu5xd}_^)=qi>_Pa9tFwPx?Sj=YuvN3ldf|QKu2Ed
zZU&O|?(ZSQ2KP<4%Qm_XW8ANEH^nes@1Bc_Ho13T+;4JwKytIYHHK=7`xTIEb)Ul=
zyT!d6$u{>pm?*cp`(tcxb6<*axZRzJ0od+-9MgY?`%Coo9qy6n<U8FR!1{=L6Z-67
zcLDnB5qC10@~ArlvOMNqhT@OA!$2Ob`YFmCuey(>8{Mi#Xs4(Xh276JD&5c2JC*L|
z0n`i56RBhg^{sS2SGu1m!d&G?Cgdh@)ElyvjHUW(=+|J3Pxbau)H&y6Ptbj$&?_JO
z-E)n~pc&>>^gs1;XP0zE)AxDoil*=L*cD9<YIa4FgF1mLx{JjX-PQRdmd%#wW><7K
zc134cUD4gy72QLtZ+W@2C%d9EBd%!jPIqmF&TZ2>t}B|p&=am`3P>be(f!yJ&CKJ!
z>CwMpSM&h5q6bP>wE5*y%g=EPs=yT;dAW43=Fa-<&z*nk<<kH3&z&Pb$8q7GJ16)#
z4$J4xmY?JJT|Rg223K@O#1;LoeePVnqBlJ%Kh>-JRIl<=y~<DZDnHe${8X><Q@#Jw
zpX&V?u24n5r;Pmj28{kX1$<S_7n7A0{-d<A_{I7fm7nk#R#xRFd=X#O|Cpdsm7nm@
z)OB9c9<Xx<y0Q1!{^@A1m7nm%|Aenr{4e*kIyrjEvnT0wwWc;bWzo}=NzK($-hB}m
zpPu47ii}@RIq(cJd3s8^k;&In_P>KygPIrU4NkvJIp^K4)5x81`|16m;2Rjqrwn+;
zfZq|+?sEk61_P-MI~{mJNIyA*pGQ`5`vU$j0?Yq>w7l!efgOOK4gpNP1h8Zy;6(hu
zud<i8>B^}?fcMaZ(Ur5~QFbc5k)tcM9tIrr8Q>%L0jA{Osy79&^C!T}t`7KZXJD>W
z0gsFVW_~BYs|o@4HUU26F!DA7eqRNc!z}?Hr?0Mc<@LJ&PZ;>83_d>k?N4gOX~4g@
z0S8|JxO+6<*1>>3h5_3Z0s0JizFmmC3qNVC9IFqw_cFk5)&NeK4fslX)azS;{6vHD
z(f5#V-5qd<4{&}mXlnETe4{onZKndhzX|X#y|bk&JB_k^-bH>4eeIzuyJmyFW)I{a
zsR{VwI>5z-gy$L}Uw0f}H3~$gE2FvszBm@}XM?AG0ywO=8kkuD!2T})*0urPjefXT
zS6Ui853EN1GV+}2m1bY1*;i@yRhoU3X5ar9v#<9{SkGs@1?c<>;H~ci)_)i9gt421
zhhc5^v;$^i3*_%NcFZ@=B7b}VU=3pjdBMm}H}<R~V+WaO<hvL<UFT__G#P*N3IErm
zn&@MXKSgk2;d=xp^*&8-@~$@tP8kaMrv5~#i9W6HWrEW?e?V~NLDD+(Svw*3?5|!W
zIA`mh3C^{jAn4hQmdrZ~nDYgA=6(PUzHz9<pN~>`j{z3^K&p-&X!9Jw`NN(k_;>va
zRJqDE^B5&-T`z$7HdiV5Zg*XRLRY%CK!)Y+tS2dX!#&`2mDD3@Ti<if65=}N$m5jU
z=Wc;M+~a=vkCg0lzX1*h+&`lcuU2{R1xjA4ay5k8ShWTE;QFdpAlY2?qCZJ~ZJ#?u
zh;}p4*7o<1Qm1$HqXlVgsWnE8&#DhV%okKQ@iQ>()_|`4v*^PPrvW>jL6x1Jc!gl+
z6M$X1LBRCA=%0*+V9;X`<jLFx%HAiyq3`FAe#UXq4)s~BfS=p=48e<^gEZj~`faWH
zHwbvG+7!tq^=DMKS$zN_aidxjopF=e1x?(dW~0lts%~`IHr4$aCAX?!jLIIh9x8lR
z-Hn-dQk{qXT4o!J@n3Gsh5)B+$1p-4+3KSC-`bAi+Lc;kbm>aXew31(T3s~fUd@96
z+pCTIkdl|QiD>W3+83z(RqYWVpRiw#fqvEgJ+A%5-T@7I%{~vv*X=){zPIdq(W1BQ
zei{n>9s3Y8<z4%JNd3P3QZPGdFGGP-_Vp-!+P)SA&e-RpeSft-i-vt{Uj~srvDd-b
zdVL_;aGl-}&AdT(qnEbmt5Dz@{qH#YPX8ln_)+hV<Y#>;hTvCyGP?C@$0d+umE$3d
z-8GJRn57#Wm!R#N9CJxP8h+amQca`p{je|P_Xvkn`7;k99FpxnG8|I$$2b4Cg+r1p
z`)`CpvOKZoa7dC7!Xeq>g+r<g&`oB1xsRtf9Fpaib1YA+E#Z)UmnYViU(Wg0o>*5L
zpgSqzOPLoN1xasON;@8H>=hsMa*%S5tvrpU*p#Q^;x*+=T)bUzY4P~z%D}jIhtifD
z64pagoXYyDRxI4>E~Vy0@$hbCUR-<?WjeW4tcRu$^HeJq{4Y{c<Kn9+^GI#99-2~J
zdFWaz7JO<b3OR?Y2L;_xwm)OVg1(lbJRc8VTiN*cc=$TX&2jN{mF;oy^_1gr@%5GC
zU&iB~q$pp-!zU|8NoBSkn$kd-_q7#^c9EYZ&Y>yfsfu%GN=h6o_)r|bIESV*iGu}w
z(**d`1o&p<@T85<jdS)_>jhT=d>z8ulxl`~5nHC7#2>ZHAM!F9^h(QEYDJ-FkaRaT
zP*%(Sgd8}=(7&pbA9wQG8GOcD_$)K%%g>+N2_LUr<hdmNHI!62Z&4lxrX|`nL-Ow=
z?|bsq8T9jR0+oUpY#jd}2>CZk`O#lE&eK2eH(T)MB!0UEZ>NTX{vHdyro`{J;G0SO
zLlV#ZR+rWV@HuLsZ$|`ZS0{Y;NXHu@Q92uZqW2NT2}g#c*Cb-1Kr8(ueu~83B>QK$
z#D7lD5ODk;*T1O-|77KqypQG?R~4mz@bSjwQiHyRGM*fkILH@j9Q3(DT>5Q<N4w^c
z>lDW+11idHi67fXAb6dAMB@K8K;Uh3jzd^u<>$Y!6wBkuqKGTcSCYSNxZqn|^4F-J
z;<dM00(@)2qg{U*Dfp0w**JPgygo{tFOm4c5`VeGUnTLgC0-dV5Il}S!pD<miKKsL
zhM=D%`CKRQttI{s62D#I-<Ti}w@Cbb5`VF*knAJlc%1O@>U}8z{!Pgz*(=Js<^J}b
z#9um3;J1lH`B>sJuM&vwCH@P-<9=VeLm+gCchWpXKQ}88h%9*n*O&O!^e_ZRfXd@&
zEb(WTigQ^^X(jQ#MFPKDBuYn#uP*t>VoEoOpLdBM94WPyz7pU0YJo44e1=Q>)5`^-
ztHe*0_`4+i8Hvx6_;d8I0!O+?l#s-)mVEk3{1p=at;FkcysjZU`fa1^H`zVPjS_#e
z#IriGPvY;f&_7K0c;o(zLEk_*^|cg4*83*mF)r_|6nMF;D4$5YgC2w6cubDVxdi+*
zd;EIqCBU~LJleZ`r{J%oiWKdRCx3T?zLs(+FQ~E@l6dqV5Ix$p>{>yH{SwDoQqh3l
zZoy{}y`t1Er~gbg>MgJ5%#rj~J*w2em)FOxHu#j+f3^_5Hpz3l2lXq;36UswN_?r5
zlkcMkBz`45e8aIzB+AnUUY^7xZ%O*C*#g;AUONT6t$hD{lz`8VlFuY^dE%HV5~T(S
zR$Jy{ljaHV0~6q<B*4!nd}=IXnO{lxTD1SNe#rOBR)bG9W$Xw+*j&<gBzquLDX(Yl
zk@Uyb2*Tcy{viV|PhygnB>jzZ1(L_>%>?wHNct8~wCJGeXB^*2e5sU^*EiK6=EJ?R
zpXD~7G&1nAa7>b(06&88wWz%V<^9h6GmY>y2|t3@0m*Bw!Ka#XY_LE~k?TXg#BZA;
z@X)Jpgbe--lmf|zriXE?GU(+=OtOvn3=+sTlK%q|-$qWT3|VH6F+a=OK~E*%|3(7*
z=Lzt3r|9Q_Jw=7pWdmv%_!z-uwxy&$DaVWVm$U@*BNO0rC7-HtU>iyPixbeVlk|l%
z1>sAQeoF%SJqhqnNj^g-3kJ}uaJ-&?{wqm8QR*`_1i7NpM2dI6*GqtJp8!7~0e))&
zc_t;G4;c7r$_C5)S(<?U+64I96W||DfPdA%S5wZ*b>mstKOZHaKPTzEvOjs9sZQ$>
z*8dsVf&s5r^$oma8k4jqJnmnsz1%Ya{V2j?oh-Gi10I9En)14oCrj25Ou%P3;ZtMe
z;Or*ww<RmbrQN~!?UMg3mi8V2J?;DK$KZA2xTGJqQgD#lp7Mr)XRguovjljhis;Yo
z@_v!Gw^B{w|3-mpa4ePjOk;`fDC=z@E-UQ}ygZ3X21$BF3h0;DCM2NuC%`X}d@`lt
ze_HZ6NE-y^!zI}Q!Ta?JN$;RnQE=3j@{gkqtV!d0>KlP)c$1|6PV#9eFLW{bE$Vmb
zEb+HX`l)af(6Ldj5Bns(^K^md_4!eW?><uC*&caX;=i}tkMBzSF$?}niT6x1^Z(Vr
z3tTL$FZ<{7Ap)B(<!oo*V<|<xzohRxLY#AdW=nj7*#dvJEINho=uhkZ=uN<XvE*|m
zO)#+0zK-KEi4RG=uA@LJtAQuI%c_5FOu&Dq<dZDh`<mplU*c<y6o^(5e?;PES^D`|
ziT}yc&%YRWSvV$nEdl;x$tO+fGjGW<=M&JwuZsDcEA7^hq;G8CV=9!{u9E)PT7l>z
z=lKwcf6gNRgamwY6W|voz^_e!zm4!%->matw?SV`Ngglcyj9BoJmFJiJ~sJK((k@P
zVAsobeJSy|^8UI<7CLX>V>yU?t!kn_J6QUmiNtTR;L{}jtduh-Sq>$9^YZMKQMm7H
zG?=O8i(bhmd5pjlg>eKW{^Nn-{0|~gu8{b5<@&>V&I$uBiDHs1lKzmClkMe)B>srx
z!!jS2_$TE3%X;`<B)+EQem_BY?1RfJ`=Ami4usQkmE2%B6fVolqpZ&#^yddcVSmsQ
zF7o6SmK6I#ipS$C@#Ghl<ai4`zHmt}<nfj*R&q;<N(=pAzb~z4dS=h~M4mjV!s87F
zy-Pg);&5<@k{9$A`8~d}qM{{KB<9QmVN9f@`JRRTU?6XaC%3?#yC76n6oRlaKiC`c
zkITyo`NJWFC>ia^DJ;odK&;1496EXO49~DBW3wh_j~napr1j|9wX2dBD5MtUczp(4
zNXaT8u}TVu5n3rJE?nXX`U^o*T;lU9RDDiK$gh-zy!n15v>;Gw09;o>x!z)G9_M_4
zpg))TW{E<5mhUMk4F^h!Luoy_rDsx48Fl!)VXvnw94JIH;tAM2vqy&FU+mAN>kGYw
z0V?44;9Lp%LnVa^{hrX0qQXG&0%~!f)K8*%eI7dZE2W{*V4ygR5nSY@ZuAF(B|$pL
z^LvW?MO-I!Q<1lj`iJ_OM!48psD#47!ram&M2bEP=5|ZZ^@KwHjCAVI5^6eK@Obk5
zVNY&pS-?ld7ssBLE97114HSBFs8hg>CJ6Hs+>}ypxPVxOgMlJXI7qYs>Z_2y)Eo3t
z-wAxQCn@J40dtE=l_Ku~w71Y34p29M4b?!`LV<j02C)nkM|!|lQW{k93d=$TIW(#L
zLB&@>lyQ4JGrd<jO^IS3O}zZFBATKhC9fz<-Gd~zz)OsZ=;knzQjz2pmUzR&voz=_
zDI*%=Mku3^D)stk-o#ufDk}^Js8O<=bY1cK=9h)S9*C_J((s}E!26c^gC%L18JXSU
z%bC%uXN0C#CWeiN*huNzLyb+#=+Uc}l3!dF2o<2`jBB2tf&rv_aS0|+PD!B>F2LL=
zKk8;oPNwNRYz(Z&Gi>6ZF+)8=#}4s$NLbTt>r*^KW{e#)COgY~B^34s!-N_>c8X`{
z2+3i@kco<C_~>zi2aWcO8#Zjx(8-?3g9eWt>cQj@qe(>y139^#+)$aZlo-=WE&$YT
zbW>8({KX3g4jeu@dvKN~JuN-0JIyjo@-9@E>ZOqk`u%1C;~t@VipIwi2pOY6_kuU4
zBp5amdStf|JDPbV#pbKj!^L#-#6cF7nvtID$zwdxwRQ5CEUc`PjSdZkeKe2E^wB%j
zznI$R^ZS(2i--HerHkk;#ORl+mP+KwJ}WH?hs^A9X{r^MmGXUqg#$e#){1C9c#D^C
zU!aPDU~yTYnGt3U79ca$e6K4WKUTWPnuIZ1$g*KwEcTg;my}Wq13@#A)^$%Q?K!#L
zLb|i^L;m2xK&~;|o)F!Zat$Cc>6SB>q$#r)<8H>%JTW@YW2wiu`HN_G!ZMp%P)-gK
zo+|N`m5SMvyMPwP1)e;*rK9{aFbk<=9xVM)G^Qe5pk5Es!s3apYoc?_gy<w*!<16m
zcte&sONB5b=2FJ8=*^+cK`cPDEP6a%TEK&0i;NJaz#H=Tg2n<G^ydW@E3`LIKbu*S
z#D1D&W+YW!T&|;uGIx=M&=9;_C!!Yu#kqxLJ~?iAG)HsIRbXnFG4kFw)(+Zv0+w+m
z;b~_u7ctL7v6CrzRK4Yf$O{B%6(%`jHXkaBl|muW%B?7*0b%nN%NT~wS}~12?I31h
z3ccm($n%%vnTcq3Eh;G%oQ?I=7b@`-c#CN(p^XVrVzcus^u~(n$sR`~X!G!tg|IXj
zbCef*k0;LbNAH#!7Ew-Yp~g&=ii)At#4wF)H`bdcW_89kK)W=S6>|ggjGfqw18=!s
zE2Ku16bI-w4=k0Fqm()rH=J>6mV^sP7d7rq%%UPvx{TeQXrQ<lV#}$8HG*i3%b+b?
zkZ+oF5=R?KgoS9kS-9BEAxt|a?fP=PG}BQdM0>j!E1cB8=>E$I(UxLtcyx1#70g@`
zbJ-{p-Q=)f;dU%4S!lWMOQXv|kQ60j-9}63=8HD9AY4=^&b+y~{*Yz1&o7~wVoW2b
zWVG=>%_+sQYnIFCpkN7UR7G?f81u<YKuwNS>Owi0M*sUrqcnP*l(>MogzObF(!V%h
zS#<Ii1p{GA!#y6_CZ(Dog)a_M&xNQiV?m`gjI5Pn+7*oLl?@lu<|<}!Ua+LdXNX!*
zLe_^NtC(bF!O#Jb>0zA6{m0NZaUYVlP-?85o{~Jl$K%Oe?1ge-%wRJ=V|gx))me@8
zQ5Y*qsj=LU9Q0={XEc|piYUCa9(Z#6g(ZtTwBkzrhHeU04d`~_y`Lu~-Ls@`SokVw
zr2ml`M`<`H(gMyICPAr^R$LPHrwtmM-6`zNmuI9<rsb3c3VoeO`AEYV?IZ<Cnr}%l
z@e?WCCB_xH14E=JnNCOpO+!;qK;}vd!%CX5W~7Cw9ZDK?V1TZb7^*{>zd)LX1wJav
zXQG6#5k(2kqZFzj28KA^qChT*P(q9eq#3md&UAAsX*7wU>l3Lzr!1eCc#HGNDwL;z
z;yiE>mvVA~{)K#I*jn-m^&6wj#f!YTB*pJ>z;6H-pLCZ*-Y4Yu3R5hVNOxca3x6&>
zQizo2_YOHtjZlgV`=c3+=l`+g`TayrTQZ?ct>xji`|j6YOIUt>KZaBGr%JaWy@_Z(
z;O>R5h0715=kGW;{a6-s5DsnU^2s2g18zVR;qv@mB&T_N0mO7z`wx#fe~po!-=F04
za-$SJUl9kh<8%R`e~Bs2?^$vx-8~BAM;Vr1a3eoGWzeV0kKf1S^am+0*U#m-{mW(f
z!IB}rx5?>oIbbNCME}<E>j{S{AfIuhnp5^`Nvt6&=aH_V|Jd@WslY0c_d$6a1s1+%
zad5hsPGigS`=Xpqna?H7<rurwQl8%{<&@lFrbBi|1h<`XsGs|vpWAS{onM%eB~oR|
z$vu>bEzj?%aynbiXQsE-|DdIO<bBrXMTupW2JrYXzoV3?PXAmDzpt@TIt;D!e9q}h
zR2I)Zxjergd-DaBhkqRH=koj>?RHtdEnSN}Qe#gj|5y6Q+~e~6zU^^#c*=p{`N?wg
z{P}{)#@5g8<sLst7wE_oNd$6vPVrbNwtVFM-BY5-eq4+>xE!ZH)3w<0{M_>BX;El(
zOj%@8E$4V|5WKkk{JzJC4+P;dFpoJ>WAey4={iWdSnyn)*I&NO>^K$gh4R&Cda11c
zPmvo#T;p;~*_04g9IlPVz|wKxUXe1i(soEt{``7D+0e28S<ClIP(BGSZqm_s71u0M
zYx$wFytV%_x{8LsL>~gg9(=~>cv(J{#>m$fcX}!lN&?HNGB-i_)Fe@U%=>1}*77*F
z43SZw^jlH>sN5)85Oy*B^SqU4#Z5S#dqzPCx>OfC&RNU)W39hdg7VE$MEM1l0+v(d
FzW_Zh=;8nX
literal 0
HcmV?d00001
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 5888242f75..42ad0a1889 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -927,6 +927,7 @@ typedef enum
WAIT_EVENT_TIMELINE_HISTORY_READ,
WAIT_EVENT_TIMELINE_HISTORY_SYNC,
WAIT_EVENT_TIMELINE_HISTORY_WRITE,
+ WAIT_EVENT_TS_SHARED_DICT_WRITE,
WAIT_EVENT_TWOPHASE_FILE_READ,
WAIT_EVENT_TWOPHASE_FILE_SYNC,
WAIT_EVENT_TWOPHASE_FILE_WRITE,
diff --git a/src/include/tsearch/ts_shared.h b/src/include/tsearch/ts_shared.h
new file mode 100644
index 0000000000..f8cff7d8ad
--- /dev/null
+++ b/src/include/tsearch/ts_shared.h
@@ -0,0 +1,27 @@
+/*-------------------------------------------------------------------------
+ *
+ * ts_shared.h
+ * Text search shared dictionary management
+ *
+ * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
+ *
+ * src/include/tsearch/ts_shared.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef TS_SHARED_H
+#define TS_SHARED_H
+
+#include "tsearch/ts_public.h"
+
+#define PG_SHDICT_DIR "pg_shdict"
+
+typedef void *(*ts_dict_build_callback) (List *dictoptions, Size *size);
+
+extern char *ts_dict_shared_init(DictInitData *init_data,
+ ts_dict_build_callback allocate_cb);
+extern void *ts_dict_shared_attach(const char *dict_name, Size *dict_size);
+extern void ts_dict_shared_detach(const char *dict_name, void *dict_address,
+ Size dict_size);
+
+#endif /* TS_SHARED_H */
--
2.21.0
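
A note on the 0004 patch that follows: it replaces in-memory pointers with uint32 offsets (node_offset, AffixDataOffset and so on) so that the compiled dictionary is one flat buffer that can be mapped at a different address in every backend. Below is a minimal standalone sketch of that idea, not code from the patch. All names in it (Arena, Node, arena_alloc, arena_add_word, chain_read, ARENA_INVALID_OFFSET) are hypothetical, and it uses plain malloc/realloc rather than PostgreSQL memory contexts or DSM.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout; none of them come from the patch. */
#define ARENA_INVALID_OFFSET UINT32_MAX

typedef struct Arena
{
	char	   *base;		/* start of the buffer; may move when it grows */
	uint32_t	size;		/* bytes allocated */
	uint32_t	end;		/* bytes in use */
} Arena;

typedef struct Node
{
	char		val;			/* character stored in this node */
	uint32_t	next_offset;	/* offset of the next node, or INVALID */
} Node;

/* Reserve nbytes in the arena and return the offset of the reserved space. */
static uint32_t
arena_alloc(Arena *a, uint32_t nbytes)
{
	uint32_t	off;

	if (a->end + nbytes > a->size)
	{
		uint32_t	newsize = a->size ? a->size * 2 : 16;
		char	   *p;

		while (newsize < a->end + nbytes)
			newsize *= 2;
		p = realloc(a->base, newsize);
		assert(p != NULL);
		a->base = p;
		a->size = newsize;
	}

	off = a->end;
	a->end += nbytes;
	return off;
}

/* Translate an offset into a pointer valid only until the next growth. */
static Node *
arena_get(const Arena *a, uint32_t off)
{
	return (Node *) (a->base + off);
}

/* Store word as a chain of nodes; returns the offset of the first node. */
static uint32_t
arena_add_word(Arena *a, const char *word)
{
	uint32_t	head = ARENA_INVALID_OFFSET;
	uint32_t	prev = ARENA_INVALID_OFFSET;

	for (; *word; word++)
	{
		uint32_t	off = arena_alloc(a, sizeof(Node));
		Node	   *n = arena_get(a, off);	/* fetched after any realloc */

		n->val = *word;
		n->next_offset = ARENA_INVALID_OFFSET;
		if (prev == ARENA_INVALID_OFFSET)
			head = off;
		else
			arena_get(a, prev)->next_offset = off;
		prev = off;
	}
	return head;
}

/* Walk a chain starting at head in any copy of the buffer. */
static void
chain_read(const char *base, uint32_t head, char *out, size_t outlen)
{
	size_t		i = 0;

	while (head != ARENA_INVALID_OFFSET && i + 1 < outlen)
	{
		const Node *n = (const Node *) (base + head);

		out[i++] = n->val;
		head = n->next_offset;
	}
	out[i] = '\0';
}
```

Because nodes refer to each other only by offset, the buffer can be grown with realloc or copied byte for byte to another address and chain_read() still works on the copy. Stored pointers would dangle in both cases, which is the same reason the patch can copy its node arrays with memcpy in NICopyData().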
Attachment: 0004-Store-ispell-in-shared-location-v19.patch (text/x-patch)
From ed02fa5be0f88d472f8d61dc8c90cdf93eeb5c0b Mon Sep 17 00:00:00 2001
From: Arthur Zakirov <z-arthur@yandex.ru>
Date: Fri, 5 Apr 2019 18:48:04 +0300
Subject: [PATCH 4/4] Store-ispell-in-shared-location
Reviewed-by: Tomas Vondra, Ildus Kurbangaliev
---
doc/src/sgml/textsearch.sgml | 15 +
src/backend/tsearch/dict_ispell.c | 201 +++--
src/backend/tsearch/spell.c | 1343 +++++++++++++++++++----------
src/include/tsearch/dicts/spell.h | 239 +++--
4 files changed, 1217 insertions(+), 581 deletions(-)
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index 40888a4d20..87c53397f4 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -3112,6 +3112,21 @@ CREATE TEXT SEARCH DICTIONARY english_stem (
</sect2>
+ <sect2 id="textsearch-shared-dictionaries">
+ <title>Dictionaries in Shared Memory</title>
+
+ <para>
+ Dictionaries, especially <application>Ispell</application>, may be quite
+ expensive both in terms of memory and CPU usage. For large dictionaries
+ it may take multiple seconds to read and process input text files on first
+ access, and the in-memory representation may require tens of megabytes.
+ When each backend processes the dictionaries independently and stores them
+ in private memory, this cost is significant. To amortize it, the compiled
+ dictionary may be stored in shared memory for reuse by other backends.
+ Currently only <application>Ispell</application> is stored in shared memory.
+ </para>
+ </sect2>
+
</sect1>
<sect1 id="textsearch-configuration">
diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c
index fc9a96abca..821d66f4e2 100644
--- a/src/backend/tsearch/dict_ispell.c
+++ b/src/backend/tsearch/dict_ispell.c
@@ -5,6 +5,11 @@
*
* Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
*
+ * Compiled Ispell dictionaries are stored in DSM. All necessary data is built
+ * within the dispell_build() function. Structures for regular expressions,
+ * however, are compiled on first demand and stored in the AffixReg array,
+ * because regex_t and Regis cannot easily be stored in shared memory.
+ *
*
* IDENTIFICATION
* src/backend/tsearch/dict_ispell.c
@@ -14,95 +19,57 @@
#include "postgres.h"
#include "commands/defrem.h"
+#include "storage/dsm.h"
#include "tsearch/dicts/spell.h"
#include "tsearch/ts_locale.h"
+#include "tsearch/ts_shared.h"
#include "tsearch/ts_utils.h"
#include "utils/builtins.h"
typedef struct
{
+ char *dict_name;
StopList stoplist;
IspellDict obj;
} DictISpell;
+static void parse_dictoptions(List *dictoptions,
+ char **dictfile, char **afffile, char **stopfile);
+static void *dispell_build(List *dictoptions, Size *size);
+
Datum
dispell_init(PG_FUNCTION_ARGS)
{
DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictISpell *d;
- bool affloaded = false,
- dictloaded = false,
- stoploaded = false;
- ListCell *l;
+ Size dict_size;
+ char *stopfile;
d = (DictISpell *) palloc0(sizeof(DictISpell));
- NIStartBuild(&(d->obj));
+ parse_dictoptions(init_data->dict_options, NULL, NULL, &stopfile);
- foreach(l, init_data->dict_options)
- {
- DefElem *defel = (DefElem *) lfirst(l);
-
- if (strcmp(defel->defname, "dictfile") == 0)
- {
- if (dictloaded)
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("multiple DictFile parameters")));
- NIImportDictionary(&(d->obj),
- get_tsearch_config_filename(defGetString(defel),
- "dict"));
- dictloaded = true;
- }
- else if (strcmp(defel->defname, "afffile") == 0)
- {
- if (affloaded)
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("multiple AffFile parameters")));
- NIImportAffixes(&(d->obj),
- get_tsearch_config_filename(defGetString(defel),
- "affix"));
- affloaded = true;
- }
- else if (strcmp(defel->defname, "stopwords") == 0)
- {
- if (stoploaded)
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("multiple StopWords parameters")));
- readstoplist(defGetString(defel), &(d->stoplist), lowerstr);
- stoploaded = true;
- }
- else
- {
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("unrecognized Ispell parameter: \"%s\"",
- defel->defname)));
- }
- }
+ if (stopfile)
+ readstoplist(stopfile, &(d->stoplist), lowerstr);
- if (affloaded && dictloaded)
- {
- NISortDictionary(&(d->obj));
- NISortAffixes(&(d->obj));
- }
- else if (!affloaded)
- {
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("missing AffFile parameter")));
- }
+ /*
+ * Build the dictionary in backend's memory if dictid is invalid (it may
+ * happen if the dictionary's init method was called within
+ * verify_dictoptions()).
+ */
+ if (!OidIsValid(init_data->dict.id))
+ d->obj.dict = dispell_build(init_data->dict_options, &dict_size);
else
{
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("missing DictFile parameter")));
+ d->dict_name = ts_dict_shared_init(init_data, dispell_build);
+ d->obj.dict = (IspellDictData *) ts_dict_shared_attach(d->dict_name,
+ &dict_size);
}
+ d->obj.reg = (AffixReg *) palloc0(d->obj.dict->nAffix * sizeof(AffixReg));
- NIFinishBuild(&(d->obj));
+ /* Current memory context is dictionary's private memory context */
+ d->obj.dictCtx = CurrentMemoryContext;
PG_RETURN_POINTER(d);
}
@@ -146,3 +113,111 @@ dispell_lexize(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(res);
}
+
+static void
+parse_dictoptions(List *dictoptions, char **dictfile, char **afffile,
+ char **stopfile)
+{
+ ListCell *l;
+
+ if (dictfile)
+ *dictfile = NULL;
+ if (afffile)
+ *afffile = NULL;
+ if (stopfile)
+ *stopfile = NULL;
+
+ foreach(l, dictoptions)
+ {
+ DefElem *defel = (DefElem *) lfirst(l);
+
+ if (strcmp(defel->defname, "dictfile") == 0)
+ {
+ if (!dictfile)
+ continue;
+
+ if (*dictfile)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("multiple DictFile parameters")));
+ *dictfile = get_tsearch_config_filename(defGetString(defel), "dict");
+ }
+ else if (strcmp(defel->defname, "afffile") == 0)
+ {
+ if (!afffile)
+ continue;
+
+ if (*afffile)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("multiple AffFile parameters")));
+ *afffile = get_tsearch_config_filename(defGetString(defel), "affix");
+ }
+ else if (strcmp(defel->defname, "stopwords") == 0)
+ {
+ if (!stopfile)
+ continue;
+
+ if (*stopfile)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("multiple StopWords parameters")));
+ *stopfile = defGetString(defel);
+ }
+ else
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("unrecognized Ispell parameter: \"%s\"",
+ defel->defname)));
+ }
+ }
+}
+
+/*
+ * Build the dictionary.
+ *
+ * Result is palloc'ed.
+ */
+static void *
+dispell_build(List *dictoptions, Size *size)
+{
+ IspellDictBuild build;
+ char *dictfile,
+ *afffile;
+
+ parse_dictoptions(dictoptions, &dictfile, &afffile, NULL);
+
+ if (!afffile)
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("missing AffFile parameter")));
+ }
+ else if (!dictfile)
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("missing DictFile parameter")));
+ }
+
+ MemSet(&build, 0, sizeof(build));
+ NIStartBuild(&build);
+
+ /* Read files */
+ NIImportDictionary(&build, dictfile);
+ NIImportAffixes(&build, afffile);
+
+ /* Build persistent data to use by backends */
+ NISortDictionary(&build);
+ NISortAffixes(&build);
+
+ NICopyData(&build);
+
+ /* Release temporary data */
+ NIFinishBuild(&build);
+
+ /* Return the buffer and its size */
+ *size = build.dict_size;
+ return build.dict;
+}
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index eb8416ce7f..123fba7a11 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -23,33 +23,35 @@
* Compilation of a dictionary
* ---------------------------
*
- * A compiled dictionary is stored in the IspellDict structure. Compilation of
- * a dictionary is divided into the several steps:
+ * A compiled dictionary is stored in the following structures:
+ * - IspellDictBuild - stores temporary data and IspellDictData
+ * - IspellDictData - stores permanent data used within NINormalizeWord()
+ * Compilation of the dictionary is divided into several steps:
* - NIImportDictionary() - stores each word of a .dict file in the
* temporary Spell field.
- * - NIImportAffixes() - stores affix rules of an .affix file in the
- * Affix field (not temporary) if an .affix file has the Ispell format.
+ * - NIImportAffixes() - stores affix rules of an .affix file in the temporary
+ * Affix field if an .affix file has the Ispell format.
* -> NIImportOOAffixes() - stores affix rules if an .affix file has the
* Hunspell format. The AffixData field is initialized if AF parameter
* is defined.
* - NISortDictionary() - builds a prefix tree (Trie) from the words list
- * and stores it in the Dictionary field. The words list is got from the
+ * and stores it in the DictNodes field. The words list is got from the
* Spell field. The AffixData field is initialized if AF parameter is not
* defined.
* - NISortAffixes():
* - builds a list of compound affixes from the affix list and stores it
* in the CompoundAffix.
* - builds prefix trees (Trie) from the affix list for prefixes and suffixes
- * and stores them in Suffix and Prefix fields.
+ * and stores them in SuffixNodes and PrefixNodes fields.
* The affix list is got from the Affix field.
+ * Persistent data of the dictionary is copied within NICopyData().
*
* Memory management
* -----------------
*
- * The IspellDict structure has the Spell field which is used only in compile
- * time. The Spell field stores a words list. It can take a lot of memory.
- * Therefore when a dictionary is compiled this field is cleared by
- * NIFinishBuild().
+ * The IspellDictBuild structure holds temporary data that is used only at
+ * compile time. It can take a lot of memory, so once the dictionary is
+ * compiled this data is released by NIFinishBuild().
*
* All resources which should cleared by NIFinishBuild() is initialized using
* tmpalloc() and tmpalloc0().
@@ -73,112 +75,166 @@
* after the initialization is done. During initialization,
* CurrentMemoryContext is the long-lived memory context associated
* with the dictionary cache entry. We keep the short-lived stuff
- * in the Conf->buildCxt context.
+ * in the ConfBuild->buildCxt context.
*/
-#define tmpalloc(sz) MemoryContextAlloc(Conf->buildCxt, (sz))
-#define tmpalloc0(sz) MemoryContextAllocZero(Conf->buildCxt, (sz))
+#define tmpalloc(sz) MemoryContextAlloc(ConfBuild->buildCxt, (sz))
+#define tmpalloc0(sz) MemoryContextAllocZero(ConfBuild->buildCxt, (sz))
-#define tmpstrdup(str) MemoryContextStrdup(Conf->buildCxt, (str))
+#define tmpstrdup(str) MemoryContextStrdup(ConfBuild->buildCxt, (str))
/*
* Prepare for constructing an ISpell dictionary.
*
- * The IspellDict struct is assumed to be zeroed when allocated.
+ * The IspellDictBuild struct is assumed to be zeroed when allocated.
*/
void
-NIStartBuild(IspellDict *Conf)
+NIStartBuild(IspellDictBuild *ConfBuild)
{
+ uint32 dict_size;
+
/*
* The temp context is a child of CurTransactionContext, so that it will
* go away automatically on error.
*/
- Conf->buildCxt = AllocSetContextCreate(CurTransactionContext,
- "Ispell dictionary init context",
- ALLOCSET_DEFAULT_SIZES);
+ ConfBuild->buildCxt = AllocSetContextCreate(CurTransactionContext,
+ "Ispell dictionary init context",
+ ALLOCSET_DEFAULT_SIZES);
+
+ /*
+ * Allocate the dictionary buffer in the current context, not in buildCxt.
+ */
+ dict_size = MAXALIGN(IspellDictDataHdrSize);
+ ConfBuild->dict = palloc0(dict_size);
+ ConfBuild->dict_size = dict_size;
}
/*
- * Clean up when dictionary construction is complete.
+ * Copy compiled and persistent data into IspellDictData.
*/
void
-NIFinishBuild(IspellDict *Conf)
+NICopyData(IspellDictBuild *ConfBuild)
{
- /* Release no-longer-needed temp memory */
- MemoryContextDelete(Conf->buildCxt);
- /* Just for cleanliness, zero the now-dangling pointers */
- Conf->buildCxt = NULL;
- Conf->Spell = NULL;
- Conf->firstfree = NULL;
- Conf->CompoundAffixFlags = NULL;
-}
+ IspellDictData *dict;
+ uint32 size;
+ int i;
+ uint32 *offsets,
+ offset = 0;
+ SPNode *dict_node PG_USED_FOR_ASSERTS_ONLY;
+ AffixNode *aff_node PG_USED_FOR_ASSERTS_ONLY;
+ /*
+ * Calculate necessary space
+ */
+ size = ConfBuild->nAffixData * sizeof(uint32);
+ size += ConfBuild->AffixDataEnd;
-/*
- * "Compact" palloc: allocate without extra palloc overhead.
- *
- * Since we have no need to free the ispell data items individually, there's
- * not much value in the per-chunk overhead normally consumed by palloc.
- * Getting rid of it is helpful since ispell can allocate a lot of small nodes.
- *
- * We currently pre-zero all data allocated this way, even though some of it
- * doesn't need that. The cpalloc and cpalloc0 macros are just documentation
- * to indicate which allocations actually require zeroing.
- */
-#define COMPACT_ALLOC_CHUNK 8192 /* amount to get from palloc at once */
-#define COMPACT_MAX_REQ 1024 /* must be < COMPACT_ALLOC_CHUNK */
+ size += ConfBuild->nAffix * sizeof(uint32);
+ size += ConfBuild->AffixSize;
-static void *
-compact_palloc0(IspellDict *Conf, size_t size)
-{
- void *result;
+ size += ConfBuild->DictNodes.NodesEnd;
+ size += ConfBuild->PrefixNodes.NodesEnd;
+ size += ConfBuild->SuffixNodes.NodesEnd;
- /* Should only be called during init */
- Assert(Conf->buildCxt != NULL);
+ size += sizeof(CMPDAffix) * ConfBuild->nCompoundAffix;
- /* No point in this for large chunks */
- if (size > COMPACT_MAX_REQ)
- return palloc0(size);
+ /*
+ * Copy data itself
+ */
+ ConfBuild->dict_size = IspellDictDataHdrSize + size;
+ ConfBuild->dict = repalloc(ConfBuild->dict, ConfBuild->dict_size);
+
+ dict = ConfBuild->dict;
+
+ /* AffixData */
+ dict->nAffixData = ConfBuild->nAffixData;
+ dict->AffixDataStart = sizeof(uint32) * ConfBuild->nAffixData;
+ memcpy(DictAffixDataOffset(dict), ConfBuild->AffixDataOffset,
+ sizeof(uint32) * ConfBuild->nAffixData);
+ memcpy(DictAffixData(dict), ConfBuild->AffixData, ConfBuild->AffixDataEnd);
+
+ /* Affix array */
+ dict->nAffix = ConfBuild->nAffix;
+ dict->AffixOffsetStart = dict->AffixDataStart + ConfBuild->AffixDataEnd;
+ dict->AffixStart = dict->AffixOffsetStart + sizeof(uint32) * ConfBuild->nAffix;
+ if (ConfBuild->nAffix > 0)
+ {
+ offsets = (uint32 *) DictAffixOffset(dict);
+ for (i = 0; i < ConfBuild->nAffix; i++)
+ {
+ AFFIX *affix;
+ uint32 size = AffixGetSize(ConfBuild->Affix[i]);
- /* Keep everything maxaligned */
- size = MAXALIGN(size);
+ offsets[i] = offset;
+ affix = (AFFIX *) DictAffixGet(dict, i);
+ Assert(affix);
- /* Need more space? */
- if (size > Conf->avail)
- {
- Conf->firstfree = palloc0(COMPACT_ALLOC_CHUNK);
- Conf->avail = COMPACT_ALLOC_CHUNK;
- }
+ memcpy(affix, ConfBuild->Affix[i], size);
- result = (void *) Conf->firstfree;
- Conf->firstfree += size;
- Conf->avail -= size;
+ offset += size;
+ }
+ }
- return result;
+ /* DictNodes prefix tree */
+ dict->DictNodesStart = dict->AffixStart + offset;
+ /* We have at least one root node even if dictionary list is empty */
+ dict_node = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, 0);
+ Assert(dict_node && dict_node->length > 0);
+ /* Copy dictionary nodes into persistent structure */
+ memcpy(DictDictNodes(dict), ConfBuild->DictNodes.Nodes,
+ ConfBuild->DictNodes.NodesEnd);
+
+ /* PrefixNodes prefix tree */
+ dict->PrefixNodesStart = dict->DictNodesStart + ConfBuild->DictNodes.NodesEnd;
+ /* We have at least one root node even if prefix list is empty */
+ aff_node = (AffixNode *) NodeArrayGet(&ConfBuild->PrefixNodes, 0);
+ Assert(aff_node && aff_node->length > 0);
+ /* Copy prefix nodes into persistent structure */
+ memcpy(DictPrefixNodes(dict), ConfBuild->PrefixNodes.Nodes,
+ ConfBuild->PrefixNodes.NodesEnd);
+
+ /* SuffixNodes prefix tree */
+ dict->SuffixNodesStart = dict->PrefixNodesStart + ConfBuild->PrefixNodes.NodesEnd;
+ /* We have at least one root node even if suffix list is empty */
+ aff_node = (AffixNode *) NodeArrayGet(&ConfBuild->SuffixNodes, 0);
+ Assert(aff_node && aff_node->length > 0);
+ /* Copy suffix nodes into persistent structure */
+ memcpy(DictSuffixNodes(dict), ConfBuild->SuffixNodes.Nodes,
+ ConfBuild->SuffixNodes.NodesEnd);
+
+ /* CompoundAffix array */
+ dict->CompoundAffixStart = dict->SuffixNodesStart +
+ ConfBuild->SuffixNodes.NodesEnd;
+ /* We have at least one CompoundAffix terminating entry */
+ Assert(ConfBuild->nCompoundAffix > 0);
+ /* Copy array of compound affixes into persistent structure */
+ memcpy(DictCompoundAffix(dict), ConfBuild->CompoundAffix,
+ sizeof(CMPDAffix) * ConfBuild->nCompoundAffix);
}
-#define cpalloc(size) compact_palloc0(Conf, size)
-#define cpalloc0(size) compact_palloc0(Conf, size)
-
-static char *
-cpstrdup(IspellDict *Conf, const char *str)
+/*
+ * Clean up when dictionary construction is complete.
+ */
+void
+NIFinishBuild(IspellDictBuild *ConfBuild)
{
- char *res = cpalloc(strlen(str) + 1);
-
- strcpy(res, str);
- return res;
+ /* Release no-longer-needed temp memory */
+ MemoryContextDelete(ConfBuild->buildCxt);
+ /* Just for cleanliness, zero the now-dangling pointers */
+ ConfBuild->buildCxt = NULL;
+ ConfBuild->Spell = NULL;
+ ConfBuild->CompoundAffixFlags = NULL;
}
-
/*
* Apply lowerstr(), producing a temporary result (in the buildCxt).
*/
static char *
-lowerstr_ctx(IspellDict *Conf, const char *src)
+lowerstr_ctx(IspellDictBuild *ConfBuild, const char *src)
{
MemoryContext saveCtx;
char *dst;
- saveCtx = MemoryContextSwitchTo(Conf->buildCxt);
+ saveCtx = MemoryContextSwitchTo(ConfBuild->buildCxt);
dst = lowerstr(src);
MemoryContextSwitchTo(saveCtx);
@@ -190,7 +246,7 @@ lowerstr_ctx(IspellDict *Conf, const char *src)
#define STRNCMP(s,p) strncmp( (s), (p), strlen(p) )
#define GETWCHAR(W,L,N,T) ( ((const uint8*)(W))[ ((T)==FF_PREFIX) ? (N) : ( (L) - 1 - (N) ) ] )
-#define GETCHAR(A,N,T) GETWCHAR( (A)->repl, (A)->replen, N, T )
+#define GETCHAR(A,N,T) GETWCHAR( AffixFieldRepl(A), (A)->replen, N, T )
static char *VoidString = "";
@@ -311,18 +367,189 @@ strbncmp(const unsigned char *s1, const unsigned char *s2, size_t count)
static int
cmpaffix(const void *s1, const void *s2)
{
- const AFFIX *a1 = (const AFFIX *) s1;
- const AFFIX *a2 = (const AFFIX *) s2;
+ const AFFIX *a1 = *((AFFIX *const *) s1);
+ const AFFIX *a2 = *((AFFIX *const *) s2);
if (a1->type < a2->type)
return -1;
if (a1->type > a2->type)
return 1;
if (a1->type == FF_PREFIX)
- return strcmp(a1->repl, a2->repl);
+ return strcmp(AffixFieldRepl(a1), AffixFieldRepl(a2));
else
- return strbcmp((const unsigned char *) a1->repl,
- (const unsigned char *) a2->repl);
+ return strbcmp((const unsigned char *) AffixFieldRepl(a1),
+ (const unsigned char *) AffixFieldRepl(a2));
+}
+
+/*
+ * Allocate space for AffixData.
+ */
+static void
+InitAffixData(IspellDictBuild *ConfBuild, int numAffixData)
+{
+ uint32 size;
+
+ size = 8 * 1024 /* Reserve 8KB for data */;
+
+ ConfBuild->AffixData = (char *) tmpalloc(size);
+ ConfBuild->AffixDataSize = size;
+ ConfBuild->AffixDataOffset = (uint32 *) tmpalloc(numAffixData * sizeof(uint32));
+ ConfBuild->nAffixData = 0;
+ ConfBuild->mAffixData = numAffixData;
+
+ /* Save offset of the end of data */
+ ConfBuild->AffixDataEnd = 0;
+}
+
+/*
+ * Add a set of affix flags to the IspellDictBuild struct. If the AffixData
+ * buffer cannot fit the new affix set, enlarge it.
+ *
+ * ConfBuild: building structure for the current dictionary.
+ * AffixSet: set of affix flags.
+ */
+static void
+AddAffixSet(IspellDictBuild *ConfBuild, const char *AffixSet,
+ uint32 AffixSetLen)
+{
+ /*
+ * Check available space for AffixSet.
+ */
+ if (ConfBuild->AffixDataEnd + AffixSetLen + 1 /* \0 */ >=
+ ConfBuild->AffixDataSize)
+ {
+ uint32 newsize = Max(ConfBuild->AffixDataSize + 8 * 1024 /* 8KB */,
+ ConfBuild->AffixDataSize + AffixSetLen + 1);
+
+ ConfBuild->AffixData = (char *) repalloc(ConfBuild->AffixData, newsize);
+ ConfBuild->AffixDataSize = newsize;
+ }
+
+ /* Check available number of offsets */
+ if (ConfBuild->nAffixData >= ConfBuild->mAffixData)
+ {
+ ConfBuild->mAffixData *= 2;
+ ConfBuild->AffixDataOffset = (uint32 *) repalloc(ConfBuild->AffixDataOffset,
+ sizeof(uint32) * ConfBuild->mAffixData);
+ }
+
+ ConfBuild->AffixDataOffset[ConfBuild->nAffixData] = ConfBuild->AffixDataEnd;
+ StrNCpy(AffixDataGet(ConfBuild, ConfBuild->nAffixData),
+ AffixSet, AffixSetLen + 1);
+
+ /* Save offset of the end of data */
+ ConfBuild->AffixDataEnd += AffixSetLen + 1;
+ ConfBuild->nAffixData++;
+}
+
+/*
+ * Allocate space for prefix tree node.
+ *
+ * ConfBuild: building structure for the current dictionary.
+ * array: NodeArray where to allocate new node.
+ * length: number of allocated NodeData.
+ * sizeNodeData: minimum size of each NodeData.
+ * sizeNodeHeader: size of header of new node.
+ *
+ * Returns an offset of new node in NodeArray->Nodes.
+ */
+static uint32
+AllocateNode(IspellDictBuild *ConfBuild, NodeArray *array, uint32 length,
+ uint32 sizeNodeData, uint32 sizeNodeHeader)
+{
+ uint32 node_offset;
+ uint32 size;
+
+ size = sizeNodeHeader + length * sizeNodeData;
+ size = MAXALIGN(size);
+
+ if (array->NodesSize == 0)
+ {
+ array->NodesSize = size * 32; /* Reserve space for next levels of the
+ * prefix tree */
+ array->Nodes = (char *) tmpalloc(array->NodesSize);
+ array->NodesEnd = 0;
+ }
+ else if (array->NodesEnd + size >= array->NodesSize)
+ {
+ array->NodesSize = Max(array->NodesSize * 2, array->NodesSize + size);
+ array->Nodes = (char *) repalloc(array->Nodes, array->NodesSize);
+ }
+
+ node_offset = array->NodesEnd;
+ array->NodesEnd += size;
+
+ return node_offset;
+}
+
+/*
+ * Allocate space for SPNode.
+ *
+ * Returns an offset of new node in ConfBuild->DictNodes->Nodes.
+ */
+static uint32
+AllocateSPNode(IspellDictBuild *ConfBuild, uint32 length)
+{
+ uint32 offset;
+ SPNode *node;
+ SPNodeData *data;
+ uint32 i;
+
+ offset = AllocateNode(ConfBuild, &ConfBuild->DictNodes, length,
+ sizeof(SPNodeData), SPNHDRSZ);
+ node = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, offset);
+ node->length = length;
+
+ /*
+ * Initialize all SPNodeData with default values. We cannot use memset()
+ * here because not all fields have 0 as default value.
+ */
+ for (i = 0; i < length; i++)
+ {
+ data = &(node->data[i]);
+ data->val = 0;
+ data->affix = ISPELL_INVALID_INDEX;
+ data->compoundflag = 0;
+ data->isword = 0;
+ data->node_offset = ISPELL_INVALID_OFFSET;
+ }
+
+ return offset;
+}
+
+/*
+ * Allocate space for AffixNode.
+ *
+ * Returns an offset of new node in NodeArray->Nodes.
+ */
+static uint32
+AllocateAffixNode(IspellDictBuild *ConfBuild, NodeArray *array, uint32 length)
+{
+ uint32 offset;
+ AffixNode *node;
+ AffixNodeData *data;
+ uint32 i;
+
+ offset = AllocateNode(ConfBuild, array, length, sizeof(AffixNodeData),
+ ANHRDSZ);
+ node = (AffixNode *) NodeArrayGet(array, offset);
+ node->length = length;
+ node->isvoid = 0;
+
+ /*
+ * Initialize all AffixNodeData with default values. We cannot use memset()
+ * here because not all fields have 0 as default value.
+ */
+ for (i = 0; i < length; i++)
+ {
+ data = &(node->data[i]);
+ data->val = 0;
+ data->affstart = ISPELL_INVALID_INDEX;
+ data->affend = ISPELL_INVALID_INDEX;
+ data->node_offset = ISPELL_INVALID_OFFSET;
+ }
+
+ return offset;
}
/*
@@ -333,7 +560,7 @@ cmpaffix(const void *s1, const void *s2)
* - 2 characters (FM_LONG). A character may be Unicode.
* - numbers from 1 to 65000 (FM_NUM).
*
- * Depending on the flagMode an affix string can have the following format:
+ * Depending on the flagmode an affix string can have the following format:
* - FM_CHAR: ABCD
* Here we have 4 flags: A, B, C and D
* - FM_LONG: ABCDE*
@@ -341,13 +568,13 @@ cmpaffix(const void *s1, const void *s2)
* - FM_NUM: 200,205,50
* Here we have 3 flags: 200, 205 and 50
*
- * Conf: current dictionary.
+ * flagmode: flag mode of the dictionary
* sflagset: the set of affix flags. Returns a reference to the start of a next
* affix flag.
* sflag: returns an affix flag from sflagset.
*/
static void
-getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
+getNextFlagFromString(FlagMode flagmode, char **sflagset, char *sflag)
{
int32 s;
char *next,
@@ -356,11 +583,11 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
bool stop = false;
bool met_comma = false;
- maxstep = (Conf->flagMode == FM_LONG) ? 2 : 1;
+ maxstep = (flagmode == FM_LONG) ? 2 : 1;
while (**sflagset)
{
- switch (Conf->flagMode)
+ switch (flagmode)
{
case FM_LONG:
case FM_CHAR:
@@ -422,15 +649,15 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
stop = true;
break;
default:
- elog(ERROR, "unrecognized type of Conf->flagMode: %d",
- Conf->flagMode);
+ elog(ERROR, "unrecognized type of flagmode: %d",
+ flagmode);
}
if (stop)
break;
}
- if (Conf->flagMode == FM_LONG && maxstep > 0)
+ if (flagmode == FM_LONG && maxstep > 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix flag \"%s\" with \"long\" flag value",
@@ -440,31 +667,28 @@ getNextFlagFromString(IspellDict *Conf, char **sflagset, char *sflag)
}
/*
- * Checks if the affix set Conf->AffixData[affix] contains affixflag.
- * Conf->AffixData[affix] does not contain affixflag if this flag is not used
- * actually by the .dict file.
+ * Checks if the affix set from AffixData contains affixflag. The affix set does
+ * not contain affixflag if the flag is not actually used by the .dict file.
*
- * Conf: current dictionary.
- * affix: index of the Conf->AffixData array.
+ * flagmode: flag mode of the dictionary.
+ * sflagset: the set of affix flags.
* affixflag: the affix flag.
*
- * Returns true if the string Conf->AffixData[affix] contains affixflag,
- * otherwise returns false.
+ * Returns true if the affix set string contains affixflag, otherwise returns
+ * false.
*/
static bool
-IsAffixFlagInUse(IspellDict *Conf, int affix, const char *affixflag)
+IsAffixFlagInUse(FlagMode flagmode, char *sflagset, const char *affixflag)
{
- char *flagcur;
+ char *flagcur = sflagset;
char flag[BUFSIZ];
if (*affixflag == 0)
return true;
- flagcur = Conf->AffixData[affix];
-
while (*flagcur)
{
- getNextFlagFromString(Conf, &flagcur, flag);
+ getNextFlagFromString(flagmode, &flagcur, flag);
/* Compare first affix flag in flagcur with affixflag */
if (strcmp(flag, affixflag) == 0)
return true;
@@ -477,31 +701,33 @@ IsAffixFlagInUse(IspellDict *Conf, int affix, const char *affixflag)
/*
* Adds the new word into the temporary array Spell.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* word: new word.
* flag: set of affix flags. Single flag can be get by getNextFlagFromString().
*/
static void
-NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
+NIAddSpell(IspellDictBuild *ConfBuild, const char *word, const char *flag)
{
- if (Conf->nspell >= Conf->mspell)
+ if (ConfBuild->nSpell >= ConfBuild->mSpell)
{
- if (Conf->mspell)
+ if (ConfBuild->mSpell)
{
- Conf->mspell *= 2;
- Conf->Spell = (SPELL **) repalloc(Conf->Spell, Conf->mspell * sizeof(SPELL *));
+ ConfBuild->mSpell *= 2;
+ ConfBuild->Spell = (SPELL **) repalloc(ConfBuild->Spell,
+ ConfBuild->mSpell * sizeof(SPELL *));
}
else
{
- Conf->mspell = 1024 * 20;
- Conf->Spell = (SPELL **) tmpalloc(Conf->mspell * sizeof(SPELL *));
+ ConfBuild->mSpell = 1024 * 20;
+ ConfBuild->Spell = (SPELL **) tmpalloc(ConfBuild->mSpell * sizeof(SPELL *));
}
}
- Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
- strcpy(Conf->Spell[Conf->nspell]->word, word);
- Conf->Spell[Conf->nspell]->p.flag = (*flag != '\0')
+ ConfBuild->Spell[ConfBuild->nSpell] =
+ (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
+ strcpy(ConfBuild->Spell[ConfBuild->nSpell]->word, word);
+ ConfBuild->Spell[ConfBuild->nSpell]->p.flag = (*flag != '\0')
? tmpstrdup(flag) : VoidString;
- Conf->nspell++;
+ ConfBuild->nSpell++;
}
/*
@@ -509,11 +735,11 @@ NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
*
* Note caller must already have applied get_tsearch_config_filename.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* filename: path to the .dict file.
*/
void
-NIImportDictionary(IspellDict *Conf, const char *filename)
+NIImportDictionary(IspellDictBuild *ConfBuild, const char *filename)
{
tsearch_readline_state trst;
char *line;
@@ -564,9 +790,9 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
}
s += pg_mblen(s);
}
- pstr = lowerstr_ctx(Conf, line);
+ pstr = lowerstr_ctx(ConfBuild, line);
- NIAddSpell(Conf, pstr, flag);
+ NIAddSpell(ConfBuild, pstr, flag);
pfree(pstr);
pfree(line);
@@ -590,7 +816,7 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
* SFX M 0 's .
* is presented here.
*
- * Conf: current dictionary.
+ * dict: current dictionary.
* word: basic form of word.
* affixflag: affix flag, by which a basic form of word was generated.
* flag: compound flag used to compare with StopMiddle->compoundflag.
@@ -598,9 +824,9 @@ NIImportDictionary(IspellDict *Conf, const char *filename)
* Returns 1 if the word was found in the prefix tree, else returns 0.
*/
static int
-FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
+FindWord(IspellDictData *dict, const char *word, const char *affixflag, int flag)
{
- SPNode *node = Conf->Dictionary;
+ SPNode *node = (SPNode *) DictDictNodes(dict);
SPNodeData *StopLow,
*StopHigh,
*StopMiddle;
@@ -636,10 +862,14 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
* Check if this affix rule is presented in the affix set
* with index StopMiddle->affix.
*/
- if (IsAffixFlagInUse(Conf, StopMiddle->affix, affixflag))
+ if (IsAffixFlagInUse(dict->flagMode,
+ DictAffixDataGet(dict, StopMiddle->affix),
+ affixflag))
return 1;
}
- node = StopMiddle->node;
+ /* Retrieve SPNode by the offset */
+ node = (SPNode *) DictNodeGet(DictDictNodes(dict),
+ StopMiddle->node_offset);
ptr++;
break;
}
@@ -657,7 +887,8 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
/*
* Adds a new affix rule to the Affix field.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary; it is used to
+ * allocate temporary data.
* flag: affix flag ('\' in the below example).
* flagflags: set of flags from the flagval field for this affix rule. This set
* is listed after '/' character in the added string (repl).
@@ -673,26 +904,54 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
* type: FF_SUFFIX or FF_PREFIX.
*/
static void
-NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
- const char *find, const char *repl, int type)
+NIAddAffix(IspellDictBuild *ConfBuild, const char *flag, char flagflags,
+ const char *mask, const char *find, const char *repl, int type)
{
AFFIX *Affix;
+ uint32 size;
+ uint32 flaglen = strlen(flag),
+ findlen = strlen(find),
+ repllen = strlen(repl),
+ masklen = strlen(mask);
+
+ /* Sanity checks */
+ if (flaglen > AF_FLAG_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix flag \"%s\" too long", flag)));
+ if (findlen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix find field \"%s\" too long", find)));
+ if (repllen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix repl field \"%s\" too long", repl)));
+ if (masklen > AF_REPL_MAXSIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("affix mask field \"%s\" too long", mask)));
- if (Conf->naffixes >= Conf->maffixes)
+ if (ConfBuild->nAffix >= ConfBuild->mAffix)
{
- if (Conf->maffixes)
+ if (ConfBuild->mAffix)
{
- Conf->maffixes *= 2;
- Conf->Affix = (AFFIX *) repalloc((void *) Conf->Affix, Conf->maffixes * sizeof(AFFIX));
+ ConfBuild->mAffix *= 2;
+ ConfBuild->Affix = (AFFIX **) repalloc(ConfBuild->Affix,
+ ConfBuild->mAffix * sizeof(AFFIX *));
}
else
{
- Conf->maffixes = 16;
- Conf->Affix = (AFFIX *) palloc(Conf->maffixes * sizeof(AFFIX));
+ ConfBuild->mAffix = 255;
+ ConfBuild->Affix = (AFFIX **) tmpalloc(ConfBuild->mAffix * sizeof(AFFIX *));
}
}
- Affix = Conf->Affix + Conf->naffixes;
+ size = AFFIXHDRSZ + flaglen + 1 /* \0 */ + findlen + 1 /* \0 */ +
+ repllen + 1 /* \0 */ + masklen + 1 /* \0 */;
+
+ Affix = (AFFIX *) tmpalloc(size);
+ ConfBuild->Affix[ConfBuild->nAffix] = Affix;
/* This affix rule can be applied for words with any ending */
if (strcmp(mask, ".") == 0 || *mask == '\0')
@@ -705,42 +964,12 @@ NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
{
Affix->issimple = 0;
Affix->isregis = 1;
- RS_compile(&(Affix->reg.regis), (type == FF_SUFFIX),
- *mask ? mask : VoidString);
}
/* This affix rule will use regex_t to search word ending */
else
{
- int masklen;
- int wmasklen;
- int err;
- pg_wchar *wmask;
- char *tmask;
-
Affix->issimple = 0;
Affix->isregis = 0;
- tmask = (char *) tmpalloc(strlen(mask) + 3);
- if (type == FF_SUFFIX)
- sprintf(tmask, "%s$", mask);
- else
- sprintf(tmask, "^%s", mask);
-
- masklen = strlen(tmask);
- wmask = (pg_wchar *) tmpalloc((masklen + 1) * sizeof(pg_wchar));
- wmasklen = pg_mb2wchar_with_len(tmask, wmask, masklen);
-
- err = pg_regcomp(&(Affix->reg.regex), wmask, wmasklen,
- REG_ADVANCED | REG_NOSUB,
- DEFAULT_COLLATION_OID);
- if (err)
- {
- char errstr[100];
-
- pg_regerror(err, &(Affix->reg.regex), errstr, sizeof(errstr));
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
- errmsg("invalid regular expression: %s", errstr)));
- }
}
Affix->flagflags = flagflags;
@@ -749,15 +978,22 @@ NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
if ((Affix->flagflags & FF_COMPOUNDFLAG) == 0)
Affix->flagflags |= FF_COMPOUNDFLAG;
}
- Affix->flag = cpstrdup(Conf, flag);
+
Affix->type = type;
- Affix->find = (find && *find) ? cpstrdup(Conf, find) : VoidString;
- if ((Affix->replen = strlen(repl)) > 0)
- Affix->repl = cpstrdup(Conf, repl);
- else
- Affix->repl = VoidString;
- Conf->naffixes++;
+ Affix->replen = repllen;
+ StrNCpy(AffixFieldRepl(Affix), repl, repllen + 1);
+
+ Affix->findlen = findlen;
+ StrNCpy(AffixFieldFind(Affix), find, findlen + 1);
+
+ Affix->masklen = masklen;
+ StrNCpy(AffixFieldMask(Affix), mask, masklen + 1);
+
+ StrNCpy(AffixFieldFlag(Affix), flag, flaglen + 1);
+
+ ConfBuild->nAffix++;
+ ConfBuild->AffixSize += size;
}
/* Parsing states for parse_affentry() and friends */
@@ -1021,10 +1257,10 @@ parse_affentry(char *str, char *mask, char *find, char *repl)
* Sets a Hunspell options depending on flag type.
*/
static void
-setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
+setCompoundAffixFlagValue(IspellDictBuild *ConfBuild, CompoundAffixFlag *entry,
char *s, uint32 val)
{
- if (Conf->flagMode == FM_NUM)
+ if (ConfBuild->dict->flagMode == FM_NUM)
{
char *next;
int i;
@@ -1044,19 +1280,19 @@ setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
else
entry->flag.s = tmpstrdup(s);
- entry->flagMode = Conf->flagMode;
+ entry->flagMode = ConfBuild->dict->flagMode;
entry->value = val;
}
/*
* Sets up a correspondence for the affix parameter with the affix flag.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* s: affix flag in string.
* val: affix parameter.
*/
static void
-addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
+addCompoundAffixFlagValue(IspellDictBuild *ConfBuild, char *s, uint32 val)
{
CompoundAffixFlag *newValue;
char sbuf[BUFSIZ];
@@ -1083,29 +1319,29 @@ addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
*sflag = '\0';
/* Resize array or allocate memory for array CompoundAffixFlag */
- if (Conf->nCompoundAffixFlag >= Conf->mCompoundAffixFlag)
+ if (ConfBuild->nCompoundAffixFlag >= ConfBuild->mCompoundAffixFlag)
{
- if (Conf->mCompoundAffixFlag)
+ if (ConfBuild->mCompoundAffixFlag)
{
- Conf->mCompoundAffixFlag *= 2;
- Conf->CompoundAffixFlags = (CompoundAffixFlag *)
- repalloc((void *) Conf->CompoundAffixFlags,
- Conf->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
+ ConfBuild->mCompoundAffixFlag *= 2;
+ ConfBuild->CompoundAffixFlags = (CompoundAffixFlag *)
+ repalloc((void *) ConfBuild->CompoundAffixFlags,
+ ConfBuild->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
}
else
{
- Conf->mCompoundAffixFlag = 10;
- Conf->CompoundAffixFlags = (CompoundAffixFlag *)
- tmpalloc(Conf->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
+ ConfBuild->mCompoundAffixFlag = 10;
+ ConfBuild->CompoundAffixFlags = (CompoundAffixFlag *)
+ tmpalloc(ConfBuild->mCompoundAffixFlag * sizeof(CompoundAffixFlag));
}
}
- newValue = Conf->CompoundAffixFlags + Conf->nCompoundAffixFlag;
+ newValue = ConfBuild->CompoundAffixFlags + ConfBuild->nCompoundAffixFlag;
- setCompoundAffixFlagValue(Conf, newValue, sbuf, val);
+ setCompoundAffixFlagValue(ConfBuild, newValue, sbuf, val);
- Conf->usecompound = true;
- Conf->nCompoundAffixFlag++;
+ ConfBuild->dict->usecompound = true;
+ ConfBuild->nCompoundAffixFlag++;
}
/*
@@ -1113,7 +1349,7 @@ addCompoundAffixFlagValue(IspellDict *Conf, char *s, uint32 val)
* flags s.
*/
static int
-getCompoundAffixFlagValue(IspellDict *Conf, char *s)
+getCompoundAffixFlagValue(IspellDictBuild *ConfBuild, char *s)
{
uint32 flag = 0;
CompoundAffixFlag *found,
@@ -1121,18 +1357,18 @@ getCompoundAffixFlagValue(IspellDict *Conf, char *s)
char sflag[BUFSIZ];
char *flagcur;
- if (Conf->nCompoundAffixFlag == 0)
+ if (ConfBuild->nCompoundAffixFlag == 0)
return 0;
flagcur = s;
while (*flagcur)
{
- getNextFlagFromString(Conf, &flagcur, sflag);
- setCompoundAffixFlagValue(Conf, &key, sflag, 0);
+ getNextFlagFromString(ConfBuild->dict->flagMode, &flagcur, sflag);
+ setCompoundAffixFlagValue(ConfBuild, &key, sflag, 0);
found = (CompoundAffixFlag *)
- bsearch(&key, (void *) Conf->CompoundAffixFlags,
- Conf->nCompoundAffixFlag, sizeof(CompoundAffixFlag),
+ bsearch(&key, (void *) ConfBuild->CompoundAffixFlags,
+ ConfBuild->nCompoundAffixFlag, sizeof(CompoundAffixFlag),
cmpcmdflag);
if (found != NULL)
flag |= found->value;
@@ -1144,14 +1380,13 @@ getCompoundAffixFlagValue(IspellDict *Conf, char *s)
/*
* Returns a flag set using the s parameter.
*
- * If Conf->useFlagAliases is true then the s parameter is index of the
- * Conf->AffixData array and function returns its entry.
- * Else function returns the s parameter.
+ * If useFlagAliases is true, the s parameter is an index into the AffixData
+ * array and the function returns that entry. Otherwise it returns s itself.
*/
static char *
-getAffixFlagSet(IspellDict *Conf, char *s)
+getAffixFlagSet(IspellDictBuild *ConfBuild, char *s)
{
- if (Conf->useFlagAliases && *s != '\0')
+ if (ConfBuild->dict->useFlagAliases && *s != '\0')
{
int curaffix;
char *end;
@@ -1162,13 +1397,13 @@ getAffixFlagSet(IspellDict *Conf, char *s)
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix alias \"%s\"", s)));
- if (curaffix > 0 && curaffix <= Conf->nAffixData)
+ if (curaffix > 0 && curaffix <= ConfBuild->nAffixData)
/*
* Do not subtract 1 from curaffix because empty string was added
* in NIImportOOAffixes
*/
- return Conf->AffixData[curaffix];
+ return AffixDataGet(ConfBuild, curaffix);
else
return VoidString;
}
@@ -1179,11 +1414,11 @@ getAffixFlagSet(IspellDict *Conf, char *s)
/*
* Import an affix file that follows MySpell or Hunspell format.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* filename: path to the .affix file.
*/
static void
-NIImportOOAffixes(IspellDict *Conf, const char *filename)
+NIImportOOAffixes(IspellDictBuild *ConfBuild, const char *filename)
{
char type[BUFSIZ],
*ptype = NULL;
@@ -1203,9 +1438,9 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
char *recoded;
/* read file to find any flag */
- Conf->usecompound = false;
- Conf->useFlagAliases = false;
- Conf->flagMode = FM_CHAR;
+ ConfBuild->dict->usecompound = false;
+ ConfBuild->dict->useFlagAliases = false;
+ ConfBuild->dict->flagMode = FM_CHAR;
if (!tsearch_readline_begin(&trst, filename))
ereport(ERROR,
@@ -1222,30 +1457,36 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
}
if (STRNCMP(recoded, "COMPOUNDFLAG") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDFLAG"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDFLAG"),
FF_COMPOUNDFLAG);
else if (STRNCMP(recoded, "COMPOUNDBEGIN") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDBEGIN"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDBEGIN"),
FF_COMPOUNDBEGIN);
else if (STRNCMP(recoded, "COMPOUNDLAST") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDLAST"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDLAST"),
FF_COMPOUNDLAST);
/* COMPOUNDLAST and COMPOUNDEND are synonyms */
else if (STRNCMP(recoded, "COMPOUNDEND") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDEND"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDEND"),
FF_COMPOUNDLAST);
else if (STRNCMP(recoded, "COMPOUNDMIDDLE") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("COMPOUNDMIDDLE"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("COMPOUNDMIDDLE"),
FF_COMPOUNDMIDDLE);
else if (STRNCMP(recoded, "ONLYINCOMPOUND") == 0)
- addCompoundAffixFlagValue(Conf, recoded + strlen("ONLYINCOMPOUND"),
+ addCompoundAffixFlagValue(ConfBuild,
+ recoded + strlen("ONLYINCOMPOUND"),
FF_COMPOUNDONLY);
else if (STRNCMP(recoded, "COMPOUNDPERMITFLAG") == 0)
- addCompoundAffixFlagValue(Conf,
+ addCompoundAffixFlagValue(ConfBuild,
recoded + strlen("COMPOUNDPERMITFLAG"),
FF_COMPOUNDPERMITFLAG);
else if (STRNCMP(recoded, "COMPOUNDFORBIDFLAG") == 0)
- addCompoundAffixFlagValue(Conf,
+ addCompoundAffixFlagValue(ConfBuild,
recoded + strlen("COMPOUNDFORBIDFLAG"),
FF_COMPOUNDFORBIDFLAG);
else if (STRNCMP(recoded, "FLAG") == 0)
@@ -1258,9 +1499,9 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
if (*s)
{
if (STRNCMP(s, "long") == 0)
- Conf->flagMode = FM_LONG;
+ ConfBuild->dict->flagMode = FM_LONG;
else if (STRNCMP(s, "num") == 0)
- Conf->flagMode = FM_NUM;
+ ConfBuild->dict->flagMode = FM_NUM;
else if (STRNCMP(s, "default") != 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
@@ -1274,8 +1515,8 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
}
tsearch_readline_end(&trst);
- if (Conf->nCompoundAffixFlag > 1)
- qsort((void *) Conf->CompoundAffixFlags, Conf->nCompoundAffixFlag,
+ if (ConfBuild->nCompoundAffixFlag > 1)
+ qsort((void *) ConfBuild->CompoundAffixFlags, ConfBuild->nCompoundAffixFlag,
sizeof(CompoundAffixFlag), cmpcmdflag);
if (!tsearch_readline_begin(&trst, filename))
@@ -1295,15 +1536,15 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
if (ptype)
pfree(ptype);
- ptype = lowerstr_ctx(Conf, type);
+ ptype = lowerstr_ctx(ConfBuild, type);
/* First try to parse AF parameter (alias compression) */
if (STRNCMP(ptype, "af") == 0)
{
/* First line is the number of aliases */
- if (!Conf->useFlagAliases)
+ if (!ConfBuild->dict->useFlagAliases)
{
- Conf->useFlagAliases = true;
+ ConfBuild->dict->useFlagAliases = true;
naffix = atoi(sflag);
if (naffix <= 0)
ereport(ERROR,
@@ -1313,11 +1554,10 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
/* Also reserve place for empty flag set */
naffix++;
- Conf->AffixData = (char **) palloc0(naffix * sizeof(char *));
- Conf->lenAffixData = Conf->nAffixData = naffix;
+ InitAffixData(ConfBuild, naffix);
/* Add empty flag set into AffixData */
- Conf->AffixData[curaffix] = VoidString;
+ AddAffixSet(ConfBuild, VoidString, 0);
curaffix++;
}
/* Other lines are aliases */
@@ -1325,7 +1565,7 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
{
if (curaffix < naffix)
{
- Conf->AffixData[curaffix] = cpstrdup(Conf, sflag);
+ AddAffixSet(ConfBuild, sflag, strlen(sflag));
curaffix++;
}
else
@@ -1343,8 +1583,8 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
sflaglen = strlen(sflag);
if (sflaglen == 0
- || (sflaglen > 1 && Conf->flagMode == FM_CHAR)
- || (sflaglen > 2 && Conf->flagMode == FM_LONG))
+ || (sflaglen > 1 && ConfBuild->dict->flagMode == FM_CHAR)
+ || (sflaglen > 2 && ConfBuild->dict->flagMode == FM_LONG))
goto nextline;
/*--------
@@ -1372,21 +1612,21 @@ NIImportOOAffixes(IspellDict *Conf, const char *filename)
/* Get flags after '/' (flags are case sensitive) */
if ((ptr = strchr(repl, '/')) != NULL)
- aflg |= getCompoundAffixFlagValue(Conf,
- getAffixFlagSet(Conf,
+ aflg |= getCompoundAffixFlagValue(ConfBuild,
+ getAffixFlagSet(ConfBuild,
ptr + 1));
/* Get lowercased version of string before '/' */
- prepl = lowerstr_ctx(Conf, repl);
+ prepl = lowerstr_ctx(ConfBuild, repl);
if ((ptr = strchr(prepl, '/')) != NULL)
*ptr = '\0';
- pfind = lowerstr_ctx(Conf, find);
- pmask = lowerstr_ctx(Conf, mask);
+ pfind = lowerstr_ctx(ConfBuild, find);
+ pmask = lowerstr_ctx(ConfBuild, mask);
if (t_iseq(find, '0'))
*pfind = '\0';
if (t_iseq(repl, '0'))
*prepl = '\0';
- NIAddAffix(Conf, sflag, flagflags | aflg, pmask, pfind, prepl,
+ NIAddAffix(ConfBuild, sflag, flagflags | aflg, pmask, pfind, prepl,
isSuffix ? FF_SUFFIX : FF_PREFIX);
pfree(prepl);
pfree(pfind);
@@ -1412,7 +1652,7 @@ nextline:
* work to NIImportOOAffixes(), which will re-read the whole file.
*/
void
-NIImportAffixes(IspellDict *Conf, const char *filename)
+NIImportAffixes(IspellDictBuild *ConfBuild, const char *filename)
{
char *pstr = NULL;
char flag[BUFSIZ];
@@ -1433,9 +1673,9 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
errmsg("could not open affix file \"%s\": %m",
filename)));
- Conf->usecompound = false;
- Conf->useFlagAliases = false;
- Conf->flagMode = FM_CHAR;
+ ConfBuild->dict->usecompound = false;
+ ConfBuild->dict->useFlagAliases = false;
+ ConfBuild->dict->flagMode = FM_CHAR;
while ((recoded = tsearch_readline(&trst)) != NULL)
{
@@ -1457,10 +1697,8 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
s += pg_mblen(s);
if (*s && pg_mblen(s) == 1)
- {
- addCompoundAffixFlagValue(Conf, s, FF_COMPOUNDFLAG);
- Conf->usecompound = true;
- }
+ addCompoundAffixFlagValue(ConfBuild, s, FF_COMPOUNDFLAG);
+
oldformat = true;
goto nextline;
}
@@ -1533,7 +1771,8 @@ NIImportAffixes(IspellDict *Conf, const char *filename)
if (!parse_affentry(pstr, mask, find, repl))
goto nextline;
- NIAddAffix(Conf, flag, flagflags, mask, find, repl, suffixes ? FF_SUFFIX : FF_PREFIX);
+ NIAddAffix(ConfBuild, flag, flagflags, mask, find, repl,
+ suffixes ? FF_SUFFIX : FF_PREFIX);
nextline:
pfree(recoded);
@@ -1552,53 +1791,48 @@ isnewformat:
errmsg("affix file contains both old-style and new-style commands")));
tsearch_readline_end(&trst);
- NIImportOOAffixes(Conf, filename);
+ NIImportOOAffixes(ConfBuild, filename);
}
/*
* Merges two affix flag sets and stores a new affix flag set into
- * Conf->AffixData.
+ * ConfBuild->AffixData.
*
* Returns index of a new affix flag set.
*/
static int
-MergeAffix(IspellDict *Conf, int a1, int a2)
+MergeAffix(IspellDictBuild *ConfBuild, int a1, int a2)
{
- char **ptr;
+ char *ptr;
+ uint32 len;
/* Do not merge affix flags if one of affix flags is empty */
- if (*Conf->AffixData[a1] == '\0')
+ if (*AffixDataGet(ConfBuild, a1) == '\0')
return a2;
- else if (*Conf->AffixData[a2] == '\0')
+ else if (*AffixDataGet(ConfBuild, a2) == '\0')
return a1;
- while (Conf->nAffixData + 1 >= Conf->lenAffixData)
- {
- Conf->lenAffixData *= 2;
- Conf->AffixData = (char **) repalloc(Conf->AffixData,
- sizeof(char *) * Conf->lenAffixData);
- }
-
- ptr = Conf->AffixData + Conf->nAffixData;
- if (Conf->flagMode == FM_NUM)
+ if (ConfBuild->dict->flagMode == FM_NUM)
{
- *ptr = cpalloc(strlen(Conf->AffixData[a1]) +
- strlen(Conf->AffixData[a2]) +
- 1 /* comma */ + 1 /* \0 */ );
- sprintf(*ptr, "%s,%s", Conf->AffixData[a1], Conf->AffixData[a2]);
+ len = strlen(AffixDataGet(ConfBuild, a1)) + 1 /* comma */ +
+ strlen(AffixDataGet(ConfBuild, a2));
+ ptr = tmpalloc(len + 1 /* \0 */);
+ sprintf(ptr, "%s,%s", AffixDataGet(ConfBuild, a1),
+ AffixDataGet(ConfBuild, a2));
}
else
{
- *ptr = cpalloc(strlen(Conf->AffixData[a1]) +
- strlen(Conf->AffixData[a2]) +
- 1 /* \0 */ );
- sprintf(*ptr, "%s%s", Conf->AffixData[a1], Conf->AffixData[a2]);
+ len = strlen(AffixDataGet(ConfBuild, a1)) +
+ strlen(AffixDataGet(ConfBuild, a2));
+ ptr = tmpalloc(len + 1 /* \0 */ );
+ sprintf(ptr, "%s%s", AffixDataGet(ConfBuild, a1),
+ AffixDataGet(ConfBuild, a2));
}
- ptr++;
- *ptr = NULL;
- Conf->nAffixData++;
- return Conf->nAffixData - 1;
+ AddAffixSet(ConfBuild, ptr, len);
+ pfree(ptr);
+
+ return ConfBuild->nAffixData - 1;
}
/*
@@ -1606,66 +1840,87 @@ MergeAffix(IspellDict *Conf, int a1, int a2)
* flags with the given index.
*/
static uint32
-makeCompoundFlags(IspellDict *Conf, int affix)
+makeCompoundFlags(IspellDictBuild *ConfBuild, int affix)
{
- char *str = Conf->AffixData[affix];
+ char *str = AffixDataGet(ConfBuild, affix);
- return (getCompoundAffixFlagValue(Conf, str) & FF_COMPOUNDFLAGMASK);
+ return (getCompoundAffixFlagValue(ConfBuild, str) & FF_COMPOUNDFLAGMASK);
}
/*
* Makes a prefix tree for the given level.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* low: lower index of the Conf->Spell array.
* high: upper index of the Conf->Spell array.
* level: current prefix tree level.
+ *
+ * Returns an offset of SPNode in DictNodes.
*/
-static SPNode *
-mkSPNode(IspellDict *Conf, int low, int high, int level)
+static uint32
+mkSPNode(IspellDictBuild *ConfBuild, int low, int high, int level)
{
int i;
int nchar = 0;
char lastchar = '\0';
+ uint32 rs_offset,
+ new_offset;
SPNode *rs;
SPNodeData *data;
+ int data_index = 0;
int lownew = low;
for (i = low; i < high; i++)
- if (Conf->Spell[i]->p.d.len > level && lastchar != Conf->Spell[i]->word[level])
+ if (ConfBuild->Spell[i]->p.d.len > level &&
+ lastchar != ConfBuild->Spell[i]->word[level])
{
nchar++;
- lastchar = Conf->Spell[i]->word[level];
+ lastchar = ConfBuild->Spell[i]->word[level];
}
if (!nchar)
- return NULL;
+ return ISPELL_INVALID_OFFSET;
- rs = (SPNode *) cpalloc0(SPNHDRSZ + nchar * sizeof(SPNodeData));
- rs->length = nchar;
+ rs_offset = AllocateSPNode(ConfBuild, nchar);
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
data = rs->data;
lastchar = '\0';
for (i = low; i < high; i++)
- if (Conf->Spell[i]->p.d.len > level)
+ if (ConfBuild->Spell[i]->p.d.len > level)
{
- if (lastchar != Conf->Spell[i]->word[level])
+ if (lastchar != ConfBuild->Spell[i]->word[level])
{
if (lastchar)
{
/* Next level of the prefix tree */
- data->node = mkSPNode(Conf, lownew, i, level + 1);
+ new_offset = mkSPNode(ConfBuild, lownew, i, level + 1);
+
+ /*
+ * ConfBuild->DictNodes can be repalloc'ed within
+ * mkSPNode(), so reinitialize pointers.
+ */
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
+
+ /* First save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ /* Work with next node */
+ data_index++;
+ Assert(data_index < nchar);
+ data = &(rs->data[data_index]);
+
lownew = i;
- data++;
}
- lastchar = Conf->Spell[i]->word[level];
+ lastchar = ConfBuild->Spell[i]->word[level];
}
- data->val = ((uint8 *) (Conf->Spell[i]->word))[level];
- if (Conf->Spell[i]->p.d.len == level + 1)
+ data->val = ((uint8 *) (ConfBuild->Spell[i]->word))[level];
+ if (ConfBuild->Spell[i]->p.d.len == level + 1)
{
bool clearCompoundOnly = false;
- if (data->isword && data->affix != Conf->Spell[i]->p.d.affix)
+ if (data->isword && data->affix != ConfBuild->Spell[i]->p.d.affix)
{
/*
* MergeAffix called a few times. If one of word is
@@ -1674,15 +1929,17 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
*/
clearCompoundOnly = (FF_COMPOUNDONLY & data->compoundflag
- & makeCompoundFlags(Conf, Conf->Spell[i]->p.d.affix))
+ & makeCompoundFlags(ConfBuild,
+ ConfBuild->Spell[i]->p.d.affix))
? false : true;
- data->affix = MergeAffix(Conf, data->affix, Conf->Spell[i]->p.d.affix);
+ data->affix = MergeAffix(ConfBuild, data->affix,
+ ConfBuild->Spell[i]->p.d.affix);
}
else
- data->affix = Conf->Spell[i]->p.d.affix;
+ data->affix = ConfBuild->Spell[i]->p.d.affix;
data->isword = 1;
- data->compoundflag = makeCompoundFlags(Conf, data->affix);
+ data->compoundflag = makeCompoundFlags(ConfBuild, data->affix);
if ((data->compoundflag & FF_COMPOUNDONLY) &&
(data->compoundflag & FF_COMPOUNDFLAG) == 0)
@@ -1694,9 +1951,19 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
}
/* Next level of the prefix tree */
- data->node = mkSPNode(Conf, lownew, high, level + 1);
+ new_offset = mkSPNode(ConfBuild, lownew, high, level + 1);
- return rs;
+ /*
+ * ConfBuild->DictNodes can be repalloc'ed within mkSPNode(), so
+ * reinitialize pointers.
+ */
+ rs = (SPNode *) NodeArrayGet(&ConfBuild->DictNodes, rs_offset);
+
+ /* Save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ return rs_offset;
}
/*
@@ -1704,90 +1971,98 @@ mkSPNode(IspellDict *Conf, int low, int high, int level)
* and affixes.
*/
void
-NISortDictionary(IspellDict *Conf)
+NISortDictionary(IspellDictBuild *ConfBuild)
{
int i;
int naffix = 0;
int curaffix;
+ uint32 node_offset;
/* compress affixes */
/*
- * If we use flag aliases then we need to use Conf->AffixData filled in
+ * If we use flag aliases then we need to use ConfBuild->AffixData filled in
* the NIImportOOAffixes().
*/
- if (Conf->useFlagAliases)
+ if (ConfBuild->dict->useFlagAliases)
{
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
char *end;
- if (*Conf->Spell[i]->p.flag != '\0')
+ if (*ConfBuild->Spell[i]->p.flag != '\0')
{
- curaffix = strtol(Conf->Spell[i]->p.flag, &end, 10);
- if (Conf->Spell[i]->p.flag == end || errno == ERANGE)
+ curaffix = strtol(ConfBuild->Spell[i]->p.flag, &end, 10);
+ if (ConfBuild->Spell[i]->p.flag == end || errno == ERANGE)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("invalid affix alias \"%s\"",
- Conf->Spell[i]->p.flag)));
+ ConfBuild->Spell[i]->p.flag)));
}
else
{
/*
- * If Conf->Spell[i]->p.flag is empty, then get empty value of
- * Conf->AffixData (0 index).
+ * If ConfBuild->Spell[i]->p.flag is empty, then get empty
+ * value of ConfBuild->AffixData (0 index).
*/
curaffix = 0;
}
- Conf->Spell[i]->p.d.affix = curaffix;
- Conf->Spell[i]->p.d.len = strlen(Conf->Spell[i]->word);
+ ConfBuild->Spell[i]->p.d.affix = curaffix;
+ ConfBuild->Spell[i]->p.d.len = strlen(ConfBuild->Spell[i]->word);
}
}
- /* Otherwise fill Conf->AffixData here */
+ /* Otherwise fill ConfBuild->AffixData here */
else
{
/* Count the number of different flags used in the dictionary */
- qsort((void *) Conf->Spell, Conf->nspell, sizeof(SPELL *),
+ qsort((void *) ConfBuild->Spell, ConfBuild->nSpell, sizeof(SPELL *),
cmpspellaffix);
naffix = 0;
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
if (i == 0
- || strcmp(Conf->Spell[i]->p.flag, Conf->Spell[i - 1]->p.flag))
+ || strcmp(ConfBuild->Spell[i]->p.flag,
+ ConfBuild->Spell[i - 1]->p.flag))
naffix++;
}
/*
- * Fill in Conf->AffixData with the affixes that were used in the
- * dictionary. Replace textual flag-field of Conf->Spell entries with
- * indexes into Conf->AffixData array.
+ * Fill in AffixData with the affixes that were used in the
+ * dictionary. Replace textual flag-field of ConfBuild->Spell entries
+ * with indexes into ConfBuild->AffixData array.
*/
- Conf->AffixData = (char **) palloc0(naffix * sizeof(char *));
+ InitAffixData(ConfBuild, naffix);
curaffix = -1;
- for (i = 0; i < Conf->nspell; i++)
+ for (i = 0; i < ConfBuild->nSpell; i++)
{
if (i == 0
- || strcmp(Conf->Spell[i]->p.flag, Conf->AffixData[curaffix]))
+ || strcmp(ConfBuild->Spell[i]->p.flag,
+ AffixDataGet(ConfBuild, curaffix)))
{
curaffix++;
Assert(curaffix < naffix);
- Conf->AffixData[curaffix] = cpstrdup(Conf,
- Conf->Spell[i]->p.flag);
+ AddAffixSet(ConfBuild, ConfBuild->Spell[i]->p.flag,
+ strlen(ConfBuild->Spell[i]->p.flag));
}
- Conf->Spell[i]->p.d.affix = curaffix;
- Conf->Spell[i]->p.d.len = strlen(Conf->Spell[i]->word);
+ ConfBuild->Spell[i]->p.d.affix = curaffix;
+ ConfBuild->Spell[i]->p.d.len = strlen(ConfBuild->Spell[i]->word);
}
-
- Conf->lenAffixData = Conf->nAffixData = naffix;
}
/* Start build a prefix tree */
- qsort((void *) Conf->Spell, Conf->nspell, sizeof(SPELL *), cmpspell);
- Conf->Dictionary = mkSPNode(Conf, 0, Conf->nspell, 0);
+ qsort((void *) ConfBuild->Spell, ConfBuild->nSpell, sizeof(SPELL *), cmpspell);
+ node_offset = mkSPNode(ConfBuild, 0, ConfBuild->nSpell, 0);
+
+ /* Make void node only if the DictNodes is empty */
+ if (node_offset == ISPELL_INVALID_OFFSET)
+ {
+ /* AllocateSPNode() initializes root node data */
+ AllocateSPNode(ConfBuild, 1);
+ }
}
/*
@@ -1795,83 +2070,104 @@ NISortDictionary(IspellDict *Conf)
* rule. Affixes with empty replace string do not include in the prefix tree.
* This affixes are included by mkVoidAffix().
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* low: lower index of the Conf->Affix array.
* high: upper index of the Conf->Affix array.
* level: current prefix tree level.
* type: FF_SUFFIX or FF_PREFIX.
+ *
+ * Returns an offset in nodes array.
*/
-static AffixNode *
-mkANode(IspellDict *Conf, int low, int high, int level, int type)
+static uint32
+mkANode(IspellDictBuild *ConfBuild, int low, int high, int level, int type)
{
int i;
int nchar = 0;
uint8 lastchar = '\0';
+ NodeArray *array;
+ uint32 rs_offset,
+ new_offset;
AffixNode *rs;
AffixNodeData *data;
+ int data_index = 0;
int lownew = low;
- int naff;
- AFFIX **aff;
for (i = low; i < high; i++)
- if (Conf->Affix[i].replen > level && lastchar != GETCHAR(Conf->Affix + i, level, type))
+ if (ConfBuild->Affix[i]->replen > level &&
+ lastchar != GETCHAR(ConfBuild->Affix[i], level, type))
{
nchar++;
- lastchar = GETCHAR(Conf->Affix + i, level, type);
+ lastchar = GETCHAR(ConfBuild->Affix[i], level, type);
}
if (!nchar)
- return NULL;
+ return ISPELL_INVALID_OFFSET;
- aff = (AFFIX **) tmpalloc(sizeof(AFFIX *) * (high - low + 1));
- naff = 0;
+ if (type == FF_SUFFIX)
+ array = &ConfBuild->SuffixNodes;
+ else
+ array = &ConfBuild->PrefixNodes;
- rs = (AffixNode *) cpalloc0(ANHRDSZ + nchar * sizeof(AffixNodeData));
- rs->length = nchar;
- data = rs->data;
+ rs_offset = AllocateAffixNode(ConfBuild, array, nchar);
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
+ data = (AffixNodeData *) rs->data;
lastchar = '\0';
for (i = low; i < high; i++)
- if (Conf->Affix[i].replen > level)
+ if (ConfBuild->Affix[i]->replen > level)
{
- if (lastchar != GETCHAR(Conf->Affix + i, level, type))
+ if (lastchar != GETCHAR(ConfBuild->Affix[i], level, type))
{
if (lastchar)
{
/* Next level of the prefix tree */
- data->node = mkANode(Conf, lownew, i, level + 1, type);
- if (naff)
- {
- data->naff = naff;
- data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * naff);
- memcpy(data->aff, aff, sizeof(AFFIX *) * naff);
- naff = 0;
- }
- data++;
+ new_offset = mkANode(ConfBuild, lownew, i, level + 1, type);
+
+ /*
+ * array can be repalloc'ed within mkANode(), so
+ * reinitialize pointers.
+ */
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
+
+ /* First save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
+
+ /* Handle next data node */
+ data_index++;
+ Assert(data_index < nchar);
+ data = &(rs->data[data_index]);
+
lownew = i;
}
- lastchar = GETCHAR(Conf->Affix + i, level, type);
+ lastchar = GETCHAR(ConfBuild->Affix[i], level, type);
}
- data->val = GETCHAR(Conf->Affix + i, level, type);
- if (Conf->Affix[i].replen == level + 1)
+ data->val = GETCHAR(ConfBuild->Affix[i], level, type);
+ if (ConfBuild->Affix[i]->replen == level + 1)
{ /* affix stopped */
- aff[naff++] = Conf->Affix + i;
+ if (data->affstart == ISPELL_INVALID_INDEX)
+ {
+ data->affstart = i;
+ data->affend = i;
+ }
+ else
+ data->affend = i;
}
}
/* Next level of the prefix tree */
- data->node = mkANode(Conf, lownew, high, level + 1, type);
- if (naff)
- {
- data->naff = naff;
- data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * naff);
- memcpy(data->aff, aff, sizeof(AFFIX *) * naff);
- naff = 0;
- }
+ new_offset = mkANode(ConfBuild, lownew, high, level + 1, type);
+
+ /*
+ * array can be repalloc'ed within mkANode(), so reinitialize pointers.
+ */
+ rs = (AffixNode *) NodeArrayGet(array, rs_offset);
- pfree(aff);
+ /* Save offset of the new node */
+ data = &(rs->data[data_index]);
+ data->node_offset = new_offset;
- return rs;
+ return rs_offset;
}
/*
@@ -1879,139 +2175,151 @@ mkANode(IspellDict *Conf, int low, int high, int level, int type)
* for affixes which have empty replace string ("repl" field).
*/
static void
-mkVoidAffix(IspellDict *Conf, bool issuffix, int startsuffix)
+mkVoidAffix(IspellDictBuild *ConfBuild, bool issuffix, int startsuffix)
{
- int i,
- cnt = 0;
+ int i;
int start = (issuffix) ? startsuffix : 0;
- int end = (issuffix) ? Conf->naffixes : startsuffix;
- AffixNode *Affix = (AffixNode *) palloc0(ANHRDSZ + sizeof(AffixNodeData));
-
- Affix->length = 1;
- Affix->isvoid = 1;
+ int end = (issuffix) ? ConfBuild->nAffix : startsuffix;
+ uint32 node_offset;
+ NodeArray *array;
+ AffixNode *Affix;
+ AffixNodeData *AffixData;
if (issuffix)
- {
- Affix->data->node = Conf->Suffix;
- Conf->Suffix = Affix;
- }
+ array = &ConfBuild->SuffixNodes;
else
- {
- Affix->data->node = Conf->Prefix;
- Conf->Prefix = Affix;
- }
+ array = &ConfBuild->PrefixNodes;
- /* Count affixes with empty replace string */
- for (i = start; i < end; i++)
- if (Conf->Affix[i].replen == 0)
- cnt++;
-
- /* There is not affixes with empty replace string */
- if (cnt == 0)
- return;
+ node_offset = AllocateAffixNode(ConfBuild, array, 1);
+ Affix = (AffixNode *) NodeArrayGet(array, node_offset);
- Affix->data->aff = (AFFIX **) cpalloc(sizeof(AFFIX *) * cnt);
- Affix->data->naff = (uint32) cnt;
+ Affix->isvoid = 1;
+ AffixData = (AffixNodeData *) Affix->data;
- cnt = 0;
for (i = start; i < end; i++)
- if (Conf->Affix[i].replen == 0)
+ if (ConfBuild->Affix[i]->replen == 0)
{
- Affix->data->aff[cnt] = Conf->Affix + i;
- cnt++;
+ if (AffixData->affstart == ISPELL_INVALID_INDEX)
+ {
+ AffixData->affstart = i;
+ AffixData->affend = i;
+ }
+ else
+ AffixData->affend = i;
}
}
/*
- * Checks if the affixflag is used by dictionary. Conf->AffixData does not
+ * Checks if the affixflag is used by dictionary. AffixData does not
* contain affixflag if this flag is not used actually by the .dict file.
*
- * Conf: current dictionary.
+ * ConfBuild: building structure for the current dictionary.
* affixflag: affix flag.
*
- * Returns true if the Conf->AffixData array contains affixflag, otherwise
+ * Returns true if the ConfBuild->AffixData array contains affixflag, otherwise
* returns false.
*/
static bool
-isAffixInUse(IspellDict *Conf, char *affixflag)
+isAffixInUse(IspellDictBuild *ConfBuild, char *affixflag)
{
int i;
- for (i = 0; i < Conf->nAffixData; i++)
- if (IsAffixFlagInUse(Conf, i, affixflag))
+ for (i = 0; i < ConfBuild->nAffixData; i++)
+ if (IsAffixFlagInUse(ConfBuild->dict->flagMode,
+ AffixDataGet(ConfBuild, i), affixflag))
return true;
return false;
}
/*
- * Builds Conf->Prefix and Conf->Suffix trees from the imported affixes.
+ * Builds Prefix and Suffix trees from the imported affixes.
*/
void
-NISortAffixes(IspellDict *Conf)
+NISortAffixes(IspellDictBuild *ConfBuild)
{
AFFIX *Affix;
+ AffixNode *voidPrefix,
+ *voidSuffix;
size_t i;
CMPDAffix *ptr;
- int firstsuffix = Conf->naffixes;
-
- if (Conf->naffixes == 0)
- return;
+ int firstsuffix = ConfBuild->nAffix;
+ uint32 prefix_offset,
+ suffix_offset;
/* Store compound affixes in the Conf->CompoundAffix array */
- if (Conf->naffixes > 1)
- qsort((void *) Conf->Affix, Conf->naffixes, sizeof(AFFIX), cmpaffix);
- Conf->CompoundAffix = ptr = (CMPDAffix *) palloc(sizeof(CMPDAffix) * Conf->naffixes);
- ptr->affix = NULL;
-
- for (i = 0; i < Conf->naffixes; i++)
+ if (ConfBuild->nAffix > 1)
+ qsort((void *) ConfBuild->Affix, ConfBuild->nAffix,
+ sizeof(AFFIX *), cmpaffix);
+ ConfBuild->nCompoundAffix = ConfBuild->nAffix + 1 /* terminating entry */;
+ ConfBuild->CompoundAffix = ptr =
+ (CMPDAffix *) tmpalloc(sizeof(CMPDAffix) * ConfBuild->nCompoundAffix);
+ ptr->affix = ISPELL_INVALID_INDEX;
+
+ for (i = 0; i < ConfBuild->nAffix; i++)
{
- Affix = &(((AFFIX *) Conf->Affix)[i]);
+ Affix = ConfBuild->Affix[i];
if (Affix->type == FF_SUFFIX && i < firstsuffix)
firstsuffix = i;
if ((Affix->flagflags & FF_COMPOUNDFLAG) && Affix->replen > 0 &&
- isAffixInUse(Conf, Affix->flag))
+ isAffixInUse(ConfBuild, AffixFieldFlag(Affix)))
{
bool issuffix = (Affix->type == FF_SUFFIX);
- if (ptr == Conf->CompoundAffix ||
+ if (ptr == ConfBuild->CompoundAffix ||
issuffix != (ptr - 1)->issuffix ||
- strbncmp((const unsigned char *) (ptr - 1)->affix,
- (const unsigned char *) Affix->repl,
+ strbncmp((const unsigned char *) AffixFieldRepl(ConfBuild->Affix[(ptr - 1)->affix]),
+ (const unsigned char *) AffixFieldRepl(Affix),
(ptr - 1)->len))
{
/* leave only unique and minimals suffixes */
- ptr->affix = Affix->repl;
+ ptr->affix = i;
ptr->len = Affix->replen;
ptr->issuffix = issuffix;
ptr++;
}
}
}
- ptr->affix = NULL;
- Conf->CompoundAffix = (CMPDAffix *) repalloc(Conf->CompoundAffix, sizeof(CMPDAffix) * (ptr - Conf->CompoundAffix + 1));
+ ptr->affix = ISPELL_INVALID_INDEX;
+ ConfBuild->nCompoundAffix = ptr - ConfBuild->CompoundAffix + 1;
/* Start build a prefix tree */
- Conf->Prefix = mkANode(Conf, 0, firstsuffix, 0, FF_PREFIX);
- Conf->Suffix = mkANode(Conf, firstsuffix, Conf->naffixes, 0, FF_SUFFIX);
- mkVoidAffix(Conf, true, firstsuffix);
- mkVoidAffix(Conf, false, firstsuffix);
+ mkVoidAffix(ConfBuild, true, firstsuffix);
+ mkVoidAffix(ConfBuild, false, firstsuffix);
+
+ prefix_offset = mkANode(ConfBuild, 0, firstsuffix, 0, FF_PREFIX);
+ suffix_offset = mkANode(ConfBuild, firstsuffix, ConfBuild->nAffix, 0,
+ FF_SUFFIX);
+
+ /* Adjust offsets of new nodes for nodes of void affixes */
+ voidPrefix = (AffixNode *) NodeArrayGet(&ConfBuild->PrefixNodes, 0);
+ voidPrefix->data[0].node_offset = prefix_offset;
+
+ voidSuffix = (AffixNode *) NodeArrayGet(&ConfBuild->SuffixNodes, 0);
+ voidSuffix->data[0].node_offset = suffix_offset;
}
static AffixNodeData *
-FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
+FindAffixes(IspellDictData *dict, AffixNode *node, const char *word, int wrdlen,
+ int *level, int type)
{
+ AffixNode *nodes;
AffixNodeData *StopLow,
*StopHigh,
*StopMiddle;
uint8 symbol;
+ if (type == FF_PREFIX)
+ nodes = (AffixNode *) DictPrefixNodes(dict);
+ else
+ nodes = (AffixNode *) DictSuffixNodes(dict);
+
if (node->isvoid)
{ /* search void affixes */
- if (node->data->naff)
+ if (node->data->affstart != ISPELL_INVALID_INDEX)
return node->data;
- node = node->data->node;
+ node = (AffixNode *) DictNodeGet(nodes, node->data->node_offset);
}
while (node && *level < wrdlen)
@@ -2026,9 +2334,10 @@ FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
if (StopMiddle->val == symbol)
{
(*level)++;
- if (StopMiddle->naff)
+ if (StopMiddle->affstart != ISPELL_INVALID_INDEX)
return StopMiddle;
- node = StopMiddle->node;
+ node = (AffixNode *) DictNodeGet(nodes,
+ StopMiddle->node_offset);
break;
}
else if (StopMiddle->val < symbol)
@@ -2042,8 +2351,67 @@ FindAffixes(AffixNode *node, const char *word, int wrdlen, int *level, int type)
return NULL;
}
+/*
+ * Compile regular expression on first use and store it within reg.
+ */
+static void
+CompileAffixReg(AffixReg *reg, bool isregis, int type,
+ const char *mask, int masklen, MemoryContext dictCtx)
+{
+ MemoryContext oldcontext;
+
+ Assert(dictCtx);
+
+ /*
+ * Switch to memory context of the dictionary, so compiled expression can be
+ * used in other queries.
+ */
+ oldcontext = MemoryContextSwitchTo(dictCtx);
+
+ if (isregis)
+ RS_compile(&reg->r.regis, (type == FF_SUFFIX), mask);
+ else
+ {
+ int wmasklen;
+ int err;
+ pg_wchar *wmask;
+ char *tmask;
+
+ tmask = (char *) palloc(masklen + 3);
+ if (type == FF_SUFFIX)
+ sprintf(tmask, "%s$", mask);
+ else
+ sprintf(tmask, "^%s", mask);
+
+ masklen = strlen(tmask);
+ wmask = (pg_wchar *) palloc((masklen + 1) * sizeof(pg_wchar));
+ wmasklen = pg_mb2wchar_with_len(tmask, wmask, masklen);
+
+ err = pg_regcomp(&(reg->r.regex), wmask, wmasklen,
+ REG_ADVANCED | REG_NOSUB,
+ DEFAULT_COLLATION_OID);
+ if (err)
+ {
+ char errstr[100];
+
+ pg_regerror(err, &(reg->r.regex), errstr, sizeof(errstr));
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
+ errmsg("invalid regular expression: %s", errstr)));
+ }
+
+ pfree(wmask);
+ pfree(tmask);
+ }
+
+ reg->iscompiled = true;
+
+ MemoryContextSwitchTo(oldcontext);
+}
+
static char *
-CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *newword, int *baselen)
+CheckAffix(const char *word, size_t len, AFFIX *Affix, AffixReg *reg,
+ int flagflags, char *newword, int *baselen, MemoryContext dictCtx)
{
/*
* Check compound allow flags
@@ -2083,7 +2451,7 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
if (Affix->type == FF_SUFFIX)
{
strcpy(newword, word);
- strcpy(newword + len - Affix->replen, Affix->find);
+ strcpy(newword + len - Affix->replen, AffixFieldFind(Affix));
if (baselen) /* store length of non-changed part of word */
*baselen = len - Affix->replen;
}
@@ -2093,9 +2461,9 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
* if prefix is an all non-changed part's length then all word
* contains only prefix and suffix, so out
*/
- if (baselen && *baselen + strlen(Affix->find) <= Affix->replen)
+ if (baselen && *baselen + Affix->findlen <= Affix->replen)
return NULL;
- strcpy(newword, Affix->find);
+ strcpy(newword, AffixFieldFind(Affix));
strcat(newword, word + Affix->replen);
}
@@ -2106,7 +2474,12 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
return newword;
else if (Affix->isregis)
{
- if (RS_execute(&(Affix->reg.regis), newword))
+ /* Compile the regular expression on first demand */
+ if (!reg->iscompiled)
+ CompileAffixReg(reg, Affix->isregis, Affix->type,
+ AffixFieldMask(Affix), Affix->masklen, dictCtx);
+
+ if (RS_execute(&(reg->r.regis), newword))
return newword;
}
else
@@ -2116,12 +2489,17 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
size_t data_len;
int newword_len;
+ /* Compile the regular expression on first demand */
+ if (!reg->iscompiled)
+ CompileAffixReg(reg, Affix->isregis, Affix->type,
+ AffixFieldMask(Affix), Affix->masklen, dictCtx);
+
/* Convert data string to wide characters */
newword_len = strlen(newword);
data = (pg_wchar *) palloc((newword_len + 1) * sizeof(pg_wchar));
data_len = pg_mb2wchar_with_len(newword, data, newword_len);
- if (!(err = pg_regexec(&(Affix->reg.regex), data, data_len, 0, NULL, 0, NULL, 0)))
+ if (!(err = pg_regexec(&(reg->r.regex), data, data_len, 0, NULL, 0, NULL, 0)))
{
pfree(data);
return newword;
@@ -2160,7 +2538,7 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
char **cur;
char newword[2 * MAXNORMLEN] = "";
char pnewword[2 * MAXNORMLEN] = "";
- AffixNode *snode = Conf->Suffix,
+ AffixNode *snode = (AffixNode *) DictSuffixNodes(Conf->dict),
*pnode;
int i,
j;
@@ -2172,7 +2550,7 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
/* Check that the word itself is normal form */
- if (FindWord(Conf, word, VoidString, flag))
+ if (FindWord(Conf->dict, word, VoidString, flag))
{
*cur = pstrdup(word);
cur++;
@@ -2180,23 +2558,29 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
}
/* Find all other NORMAL forms of the 'word' (check only prefix) */
- pnode = Conf->Prefix;
+ pnode = (AffixNode *) DictPrefixNodes(Conf->dict);
plevel = 0;
while (pnode)
{
- prefix = FindAffixes(pnode, word, wrdlen, &plevel, FF_PREFIX);
+ prefix = FindAffixes(Conf->dict, pnode, word, wrdlen, &plevel, FF_PREFIX);
if (!prefix)
break;
- for (j = 0; j < prefix->naff; j++)
+ for (j = prefix->affstart; j <= prefix->affend; j++)
{
- if (CheckAffix(word, wrdlen, prefix->aff[j], flag, newword, NULL))
+ AFFIX *affix = (AFFIX *) DictAffixGet(Conf->dict, j);
+ AffixReg *reg = &(Conf->reg[j]);
+
+ if (affix &&
+ CheckAffix(word, wrdlen, affix, reg, flag, newword, NULL,
+ Conf->dictCtx))
{
/* prefix success */
- if (FindWord(Conf, newword, prefix->aff[j]->flag, flag))
+ if (FindWord(Conf->dict, newword, AffixFieldFlag(affix), flag))
cur += addToResult(forms, cur, newword);
}
}
- pnode = prefix->node;
+ pnode = (AffixNode *) DictNodeGet(DictPrefixNodes(Conf->dict),
+ prefix->node_offset);
}
/*
@@ -2208,45 +2592,59 @@ NormalizeSubWord(IspellDict *Conf, char *word, int flag)
int baselen = 0;
/* find possible suffix */
- suffix = FindAffixes(snode, word, wrdlen, &slevel, FF_SUFFIX);
+ suffix = FindAffixes(Conf->dict, snode, word, wrdlen, &slevel,
+ FF_SUFFIX);
if (!suffix)
break;
/* foreach suffix check affix */
- for (i = 0; i < suffix->naff; i++)
+ for (i = suffix->affstart; i <= suffix->affend; i++)
{
- if (CheckAffix(word, wrdlen, suffix->aff[i], flag, newword, &baselen))
+ AFFIX *sufentry = (AFFIX *) DictAffixGet(Conf->dict, i);
+ AffixReg *sufreg = &(Conf->reg[i]);
+
+ if (sufentry &&
+ CheckAffix(word, wrdlen, sufentry, sufreg, flag, newword, &baselen,
+ Conf->dictCtx))
{
/* suffix success */
- if (FindWord(Conf, newword, suffix->aff[i]->flag, flag))
+ if (FindWord(Conf->dict, newword, AffixFieldFlag(sufentry), flag))
cur += addToResult(forms, cur, newword);
/* now we will look changed word with prefixes */
- pnode = Conf->Prefix;
+ pnode = (AffixNode *) DictPrefixNodes(Conf->dict);
plevel = 0;
swrdlen = strlen(newword);
while (pnode)
{
- prefix = FindAffixes(pnode, newword, swrdlen, &plevel, FF_PREFIX);
+ prefix = FindAffixes(Conf->dict, pnode, newword, swrdlen,
+ &plevel, FF_PREFIX);
if (!prefix)
break;
- for (j = 0; j < prefix->naff; j++)
+ for (j = prefix->affstart; j <= prefix->affend; j++)
{
- if (CheckAffix(newword, swrdlen, prefix->aff[j], flag, pnewword, &baselen))
+ AFFIX *prefentry = (AFFIX *) DictAffixGet(Conf->dict, j);
+ AffixReg *prefreg = &(Conf->reg[j]);
+
+ if (prefentry &&
+ CheckAffix(newword, swrdlen, prefentry, prefreg,
+ flag, pnewword, &baselen, Conf->dictCtx))
{
/* prefix success */
- char *ff = (prefix->aff[j]->flagflags & suffix->aff[i]->flagflags & FF_CROSSPRODUCT) ?
- VoidString : prefix->aff[j]->flag;
+ char *ff = (prefentry->flagflags & sufentry->flagflags & FF_CROSSPRODUCT) ?
+ VoidString : AffixFieldFlag(prefentry);
- if (FindWord(Conf, pnewword, ff, flag))
+ if (FindWord(Conf->dict, pnewword, ff, flag))
cur += addToResult(forms, cur, pnewword);
}
}
- pnode = prefix->node;
+ pnode = (AffixNode *) DictNodeGet(DictPrefixNodes(Conf->dict),
+ prefix->node_offset);
}
}
}
- snode = suffix->node;
+ snode = (AffixNode *) DictNodeGet(DictSuffixNodes(Conf->dict),
+ suffix->node_offset);
}
if (cur == forms)
@@ -2266,7 +2664,8 @@ typedef struct SplitVar
} SplitVar;
static int
-CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
+CheckCompoundAffixes(IspellDictData *dict, CMPDAffix **ptr,
+ char *word, int len, bool CheckInPlace)
{
bool issuffix;
@@ -2276,9 +2675,12 @@ CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
if (CheckInPlace)
{
- while ((*ptr)->affix)
+ while ((*ptr)->affix != ISPELL_INVALID_INDEX)
{
- if (len > (*ptr)->len && strncmp((*ptr)->affix, word, (*ptr)->len) == 0)
+ AFFIX *affix = (AFFIX *) DictAffixGet(dict, (*ptr)->affix);
+
+ if (len > (*ptr)->len &&
+ strncmp(AffixFieldRepl(affix), word, (*ptr)->len) == 0)
{
len = (*ptr)->len;
issuffix = (*ptr)->issuffix;
@@ -2292,9 +2694,12 @@ CheckCompoundAffixes(CMPDAffix **ptr, char *word, int len, bool CheckInPlace)
{
char *affbegin;
- while ((*ptr)->affix)
+ while ((*ptr)->affix != ISPELL_INVALID_INDEX)
{
- if (len > (*ptr)->len && (affbegin = strstr(word, (*ptr)->affix)) != NULL)
+ AFFIX *affix = (AFFIX *) DictAffixGet(dict, (*ptr)->affix);
+
+ if (len > (*ptr)->len &&
+ (affbegin = strstr(word, AffixFieldRepl(affix))) != NULL)
{
len = (*ptr)->len + (affbegin - word);
issuffix = (*ptr)->issuffix;
@@ -2346,13 +2751,14 @@ AddStem(SplitVar *v, char *word)
}
static SplitVar *
-SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int wordlen, int startpos, int minpos)
+SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig,
+ char *word, int wordlen, int startpos, int minpos)
{
SplitVar *var = NULL;
SPNodeData *StopLow,
*StopHigh,
*StopMiddle = NULL;
- SPNode *node = (snode) ? snode : Conf->Dictionary;
+ SPNode *node = (snode) ? snode : (SPNode *) DictDictNodes(Conf->dict);
int level = (snode) ? minpos : startpos; /* recursive
* minpos==level */
int lenaff;
@@ -2367,8 +2773,11 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
while (level < wordlen)
{
/* find word with epenthetic or/and compound affix */
- caff = Conf->CompoundAffix;
- while (level > startpos && (lenaff = CheckCompoundAffixes(&caff, word + level, wordlen - level, (node) ? true : false)) >= 0)
+ caff = (CMPDAffix *) DictCompoundAffix(Conf->dict);
+ while (level > startpos &&
+ (lenaff = CheckCompoundAffixes(Conf->dict, &caff,
+ word + level, wordlen - level,
+ (node) ? true : false)) >= 0)
{
/*
* there is one of compound affixes, so check word for existings
@@ -2415,7 +2824,8 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
while (ptr->next)
ptr = ptr->next;
- ptr->next = SplitToVariants(Conf, NULL, new, word, wordlen, startpos + lenaff, startpos + lenaff);
+ ptr->next = SplitToVariants(Conf, NULL, new, word, wordlen,
+ startpos + lenaff, startpos + lenaff);
pfree(new->stem);
pfree(new);
@@ -2474,13 +2884,14 @@ SplitToVariants(IspellDict *Conf, SPNode *snode, SplitVar *orig, char *word, int
/* we can find next word */
level++;
AddStem(var, pnstrdup(word + startpos, level - startpos));
- node = Conf->Dictionary;
+ node = (SPNode *) DictDictNodes(Conf->dict);
startpos = level;
continue;
}
}
}
- node = StopMiddle->node;
+ node = (SPNode *) DictNodeGet(DictDictNodes(Conf->dict),
+ StopMiddle->node_offset);
}
else
node = NULL;
@@ -2530,7 +2941,7 @@ NINormalizeWord(IspellDict *Conf, char *word)
pfree(res);
}
- if (Conf->usecompound)
+ if (Conf->dict->usecompound)
{
int wordlen = strlen(word);
SplitVar *ptr,
diff --git a/src/include/tsearch/dicts/spell.h b/src/include/tsearch/dicts/spell.h
index 4cba578436..df0abd38ae 100644
--- a/src/include/tsearch/dicts/spell.h
+++ b/src/include/tsearch/dicts/spell.h
@@ -18,21 +18,23 @@
#include "tsearch/dicts/regis.h"
#include "tsearch/ts_public.h"
+#define ISPELL_INVALID_INDEX (0x7FFFF)
+#define ISPELL_INVALID_OFFSET (0xFFFFFFFF)
+
/*
* SPNode and SPNodeData are used to represent prefix tree (Trie) to store
* a words list.
*/
-struct SPNode;
-
typedef struct
{
uint32 val:8,
isword:1,
/* Stores compound flags listed below */
compoundflag:4,
- /* Reference to an entry of the AffixData field */
+ /* Index of an entry of the AffixData field */
affix:19;
- struct SPNode *node;
+ /* Offset to a node of the DictNodes field */
+ uint32 node_offset;
} SPNodeData;
/*
@@ -86,21 +88,55 @@ typedef struct spell_struct
*/
typedef struct aff_struct
{
- char *flag;
/* FF_SUFFIX or FF_PREFIX */
- uint32 type:1,
+ uint16 type:1,
flagflags:7,
issimple:1,
isregis:1,
- replen:14;
- char *find;
- char *repl;
+ flaglen:2;
+
+ /* 8 bits could be too much for repl, find and mask, but who knows */
+ uint8 replen;
+ uint8 findlen;
+ uint8 masklen;
+
+ /*
+ * fields stores the following data (each ends with \0):
+ * - repl
+ * - find
+ * - mask
+ * - flag - one character (if FM_CHAR),
+ * two characters (if FM_LONG),
+ * number, >= 0 and < 65536 (if FM_NUM).
+ */
+ char fields[FLEXIBLE_ARRAY_MEMBER];
+} AFFIX;
+
+#define AF_FLAG_MAXSIZE 5 /* strlen(65536) */
+#define AF_REPL_MAXSIZE 255 /* 8 bits */
+
+#define AFFIXHDRSZ (offsetof(AFFIX, fields))
+
+#define AffixFieldRepl(af) ((af)->fields)
+#define AffixFieldFind(af) ((af)->fields + (af)->replen + 1)
+#define AffixFieldMask(af) (AffixFieldFind(af) + (af)->findlen + 1)
+#define AffixFieldFlag(af) (AffixFieldMask(af) + (af)->masklen + 1)
+#define AffixGetSize(af) (AFFIXHDRSZ + (af)->replen + 1 + (af)->findlen + 1 \
+ + (af)->masklen + 1 + strlen(AffixFieldFlag(af)) + 1)
+
+/*
+ * Stores compiled regular expression of affix. AffixReg uses mask field of
+ * AFFIX as a regular expression.
+ */
+typedef struct AffixReg
+{
+ bool iscompiled;
union
{
regex_t regex;
Regis regis;
- } reg;
-} AFFIX;
+ } r;
+} AffixReg;
/*
* affixes use dictionary flags too
@@ -120,14 +156,13 @@ typedef struct aff_struct
* AffixNode and AffixNodeData are used to represent prefix tree (Trie) to store
* an affix list.
*/
-struct AffixNode;
-
typedef struct
{
- uint32 val:8,
- naff:24;
- AFFIX **aff;
- struct AffixNode *node;
+ uint8 val;
+ uint32 affstart;
+ uint32 affend;
+ /* Offset to a node of the PrefixNodes or SuffixNodes field */
+ uint32 node_offset;
} AffixNodeData;
typedef struct AffixNode
@@ -139,9 +174,20 @@ typedef struct AffixNode
#define ANHRDSZ (offsetof(AffixNode, data))
+typedef struct NodeArray
+{
+ char *Nodes;
+ uint32 NodesSize; /* allocated size of Nodes */
+ uint32 NodesEnd; /* end of data in Nodes */
+} NodeArray;
+
+#define NodeArrayGet(na, of) \
+ (((of) == ISPELL_INVALID_OFFSET) ? NULL : (na)->Nodes + (of))
+
typedef struct
{
- char *affix;
+ /* Index of an affix of the Affix field */
+ uint32 affix;
int len;
bool issuffix;
} CMPDAffix;
@@ -176,30 +222,75 @@ typedef struct CompoundAffixFlag
#define FLAGNUM_MAXSIZE (1 << 16)
-typedef struct
+typedef struct IspellDictData
{
- int maffixes;
- int naffixes;
- AFFIX *Affix;
-
- AffixNode *Suffix;
- AffixNode *Prefix;
+ FlagMode flagMode;
+ bool usecompound;
- SPNode *Dictionary;
- /* Array of sets of affixes */
- char **AffixData;
- int lenAffixData;
- int nAffixData;
bool useFlagAliases;
- CMPDAffix *CompoundAffix;
+ uint32 nAffixData;
+ uint32 AffixDataStart;
- bool usecompound;
- FlagMode flagMode;
+ uint32 AffixOffsetStart;
+ uint32 AffixStart;
+ uint32 nAffix;
+
+ uint32 DictNodesStart;
+ uint32 PrefixNodesStart;
+ uint32 SuffixNodesStart;
+
+ uint32 CompoundAffixStart;
/*
- * All follow fields are actually needed only for initialization
+ * data stores:
+ * - AffixData - array of affix sets
+ * - Affix - sorted array of affixes
+ * - DictNodes - prefix tree of a word list
+ * - PrefixNodes - prefix tree of a prefix list
+ * - SuffixNodes - prefix tree of a suffix list
+ * - CompoundAffix - array of compound affixes
*/
+ char data[FLEXIBLE_ARRAY_MEMBER];
+} IspellDictData;
+
+#define IspellDictDataHdrSize (offsetof(IspellDictData, data))
+
+#define DictAffixDataOffset(d) ((d)->data)
+#define DictAffixData(d) ((d)->data + (d)->AffixDataStart)
+#define DictAffixDataGet(d, i) \
+ (((i) == ISPELL_INVALID_INDEX) ? NULL : \
+ (AssertMacro(i < (d)->nAffixData), \
+ DictAffixData(d) + ((uint32 *) DictAffixDataOffset(d))[i]))
+
+#define DictAffixOffset(d) ((d)->data + (d)->AffixOffsetStart)
+#define DictAffix(d) ((d)->data + (d)->AffixStart)
+#define DictAffixGet(d, i) \
+ (((i) == ISPELL_INVALID_INDEX) ? NULL : \
+ (AssertMacro(i < (d)->nAffix), \
+ DictAffix(d) + ((uint32 *) DictAffixOffset(d))[i]))
+
+#define DictDictNodes(d) ((d)->data + (d)->DictNodesStart)
+#define DictPrefixNodes(d) ((d)->data + (d)->PrefixNodesStart)
+#define DictSuffixNodes(d) ((d)->data + (d)->SuffixNodesStart)
+#define DictNodeGet(node_start, of) \
+ (((of) == ISPELL_INVALID_OFFSET) ? NULL : (char *) (node_start) + (of))
+
+#define DictCompoundAffix(d) ((d)->data + (d)->CompoundAffixStart)
+
+/*
+ * IspellDictBuild is used to initialize IspellDictData struct. This is a
+ * temporary structure which is set up by NIStartBuild() and released by
+ * NIFinishBuild().
+ */
+typedef struct IspellDictBuild
+{
+ MemoryContext buildCxt; /* temp context for construction */
+
+ IspellDictData *dict;
+ uint32 dict_size;
+
+ /* Temporary data */
/* Array of Hunspell options in affix file */
CompoundAffixFlag *CompoundAffixFlags;
@@ -208,29 +299,73 @@ typedef struct
/* allocated length of CompoundAffixFlags array */
int mCompoundAffixFlag;
- /*
- * Remaining fields are only used during dictionary construction; they are
- * set up by NIStartBuild and cleared by NIFinishBuild.
- */
- MemoryContext buildCxt; /* temp context for construction */
-
- /* Temporary array of all words in the dict file */
+ /* Array of all words in the dict file */
SPELL **Spell;
- int nspell; /* number of valid entries in Spell array */
- int mspell; /* allocated length of Spell array */
+ int nSpell; /* number of valid entries in Spell array */
+ int mSpell; /* allocated length of Spell array */
+
+ /* Data for IspellDictData */
+
+ /* Array of all affixes in the aff file */
+ AFFIX **Affix;
+ int nAffix; /* number of valid entries in Affix array */
+ int mAffix; /* allocated length of Affix array */
+ uint32 AffixSize;
+
+ /* Array of sets of affixes */
+ uint32 *AffixDataOffset;
+ int nAffixData; /* number of affix sets */
+ int mAffixData; /* allocated number of affix sets */
+ char *AffixData;
+ uint32 AffixDataSize; /* allocated size of AffixData */
+ uint32 AffixDataEnd; /* end of data in AffixData */
+
+ /* Prefix tree which stores a word list */
+ NodeArray DictNodes;
+
+ /* Prefix tree which stores a prefix list */
+ NodeArray PrefixNodes;
+
+ /* Prefix tree which stores a suffix list */
+ NodeArray SuffixNodes;
- /* These are used to allocate "compact" data without palloc overhead */
- char *firstfree; /* first free address (always maxaligned) */
- size_t avail; /* free space remaining at firstfree */
+ /* Array of compound affixes */
+ CMPDAffix *CompoundAffix;
+ int nCompoundAffix; /* number of entries of CompoundAffix */
+} IspellDictBuild;
+
+#define AffixDataGet(d, i) ((d)->AffixData + (d)->AffixDataOffset[i])
+
+/*
+ * IspellDict is used within NINormalizeWord.
+ */
+typedef struct IspellDict
+{
+ /*
+ * Pointer to a DSM location of IspellDictData. Should be retrieved on
+ * every dispell_lexize() call.
+ */
+ IspellDictData *dict;
+ /*
+ * Array of regular expression of affixes. Each regular expression is
+ * compiled only on demand.
+ */
+ AffixReg *reg;
+ /*
+ * Memory context for compiling regular expressions.
+ */
+ MemoryContext dictCtx;
} IspellDict;
extern TSLexeme *NINormalizeWord(IspellDict *Conf, char *word);
-extern void NIStartBuild(IspellDict *Conf);
-extern void NIImportAffixes(IspellDict *Conf, const char *filename);
-extern void NIImportDictionary(IspellDict *Conf, const char *filename);
-extern void NISortDictionary(IspellDict *Conf);
-extern void NISortAffixes(IspellDict *Conf);
-extern void NIFinishBuild(IspellDict *Conf);
+extern void NIStartBuild(IspellDictBuild *ConfBuild);
+extern void NIImportAffixes(IspellDictBuild *ConfBuild, const char *filename);
+extern void NIImportDictionary(IspellDictBuild *ConfBuild,
+ const char *filename);
+extern void NISortDictionary(IspellDictBuild *ConfBuild);
+extern void NISortAffixes(IspellDictBuild *ConfBuild);
+extern void NICopyData(IspellDictBuild *ConfBuild);
+extern void NIFinishBuild(IspellDictBuild *ConfBuild);
#endif
--
2.21.0
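
For readers following the patch, the central idea (nodes refer to each other by byte offset into one growable buffer, and a pointer must be re-derived from its offset after any call that can reallocate the buffer) can be sketched in standalone C. This is only an illustration under simplifying assumptions: DemoNode, DemoArray and the demo_* functions are hypothetical names that are not part of the patch, and plain realloc() stands in for repalloc().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Plays the role of ISPELL_INVALID_OFFSET in this sketch. */
#define INVALID_OFFSET 0xFFFFFFFFu

/* Hypothetical simplified node: one character plus the offset of a child. */
typedef struct DemoNode
{
	uint8_t		val;
	uint32_t	child_offset;
} DemoNode;

/* Growable byte buffer, loosely mirroring Nodes/NodesSize/NodesEnd. */
typedef struct DemoArray
{
	char	   *nodes;
	uint32_t	size;			/* allocated bytes */
	uint32_t	end;			/* used bytes */
} DemoArray;

/* Turn an offset into a pointer; valid only until the next allocation. */
static DemoNode *
demo_get(DemoArray *a, uint32_t off)
{
	return (off == INVALID_OFFSET) ? NULL : (DemoNode *) (a->nodes + off);
}

/* Append one node, growing the buffer if needed; returns its offset. */
static uint32_t
demo_alloc(DemoArray *a, uint8_t val)
{
	uint32_t	off = a->end;
	DemoNode   *n;

	if (a->end + sizeof(DemoNode) > a->size)
	{
		uint32_t	newsize = a->size ? a->size * 2 : (uint32_t) sizeof(DemoNode);
		char	   *p = realloc(a->nodes, newsize);

		if (p == NULL)
			abort();
		a->nodes = p;			/* the buffer may have moved */
		a->size = newsize;
	}
	n = (DemoNode *) (a->nodes + off);
	n->val = val;
	n->child_offset = INVALID_OFFSET;
	a->end += sizeof(DemoNode);
	return off;
}

/* Build a chain of nodes spelling word; returns the offset of the root. */
static uint32_t
demo_build_chain(DemoArray *a, const char *word)
{
	uint32_t	root;
	uint32_t	child;

	if (*word == '\0')
		return INVALID_OFFSET;
	root = demo_alloc(a, (uint8_t) *word);
	child = demo_build_chain(a, word + 1);	/* may reallocate the buffer */
	/* Re-derive the pointer from the offset instead of reusing an old one. */
	demo_get(a, root)->child_offset = child;
	return root;
}

/* Return 1 if the chain starting at root spells exactly word, else 0. */
static int
demo_chain_matches(DemoArray *a, uint32_t root, const char *word)
{
	DemoNode   *n = demo_get(a, root);

	while (n != NULL && *word != '\0' && n->val == (uint8_t) *word)
	{
		n = demo_get(a, n->child_offset);
		word++;
	}
	return n == NULL && *word == '\0';
}
```

Because only offsets are stored inside the buffer, the whole buffer can be copied to another address (for example into a shared segment) and remains usable as long as readers add the offset to their own base pointer.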
Is 0001 a bugfix?
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Apr 5, 2019 at 8:41 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Is 0001 a bugfix?
Yes, it is essentially a bugfix and can be applied independently.
The fix allocates temporary strings in the temporary context
Conf->buildCxt.
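
As a rough analogy only (this is not PostgreSQL's memory context API), the benefit of putting temporary strings into a dedicated build-time context can be sketched in standalone C: every temporary allocation is tied to one owner, so the build step can release all of them at once instead of leaking them into a longer-lived context. TempContext, temp_strdup() and temp_reset() are hypothetical names invented for this sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Header placed in front of every allocation so the context can find it. */
typedef struct TempChunk
{
	struct TempChunk *next;
} TempChunk;

/* Hypothetical stand-in for a temporary build context. */
typedef struct TempContext
{
	TempChunk  *head;			/* linked list of live allocations */
	int			nchunks;		/* number of live allocations */
} TempContext;

/* Copy a string into memory owned by the context. */
static char *
temp_strdup(TempContext *cxt, const char *s)
{
	size_t		len = strlen(s) + 1;
	TempChunk  *c = malloc(sizeof(TempChunk) + len);
	char	   *dst;

	if (c == NULL)
		abort();
	c->next = cxt->head;
	cxt->head = c;
	cxt->nchunks++;
	dst = (char *) (c + 1);		/* payload follows the header */
	memcpy(dst, s, len);
	return dst;
}

/* Release everything allocated in the context in one step. */
static void
temp_reset(TempContext *cxt)
{
	while (cxt->head != NULL)
	{
		TempChunk  *next = cxt->head->next;

		free(cxt->head);
		cxt->head = next;
	}
	cxt->nchunks = 0;
}
```

Callers never free individual strings; they call temp_reset() once the build is finished, which is the property the fix relies on.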
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company