PG15 beta1 sort performance regression due to Generation context change
Hackers,
Over the past few days I've been gathering some benchmark results
together to show the sort performance improvements in PG15 [1]https://techcommunity.microsoft.com/t5/azure-database-for-postgresql/speeding-up-sort-performance-in-postgres-15/ba-p/3396953#change4.
One of the test cases I did was to demonstrate Heikki's change to use
a k-way merge (65014000b).
The test I did to try this out was along the lines of:
set max_parallel_workers_per_gather = 0;
create table t (a bigint not null, b bigint not null, c bigint not
null, d bigint not null, e bigint not null, f bigint not null);
insert into t select x,x,x,x,x,x from generate_Series(1,140247142) x; -- 10GB!
vacuum freeze t;
The query I ran was:
select * from t order by a offset 140247142;
I tested various sizes of work_mem starting at 4MB and doubled that
all the way to 16GB. For many of the smaller values of work_mem the
performance is vastly improved by Heikki's change, however for
work_mem = 64MB I detected quite a large slowdown. PG14 took 20.9
seconds and PG15 beta 1 took 29 seconds!
I've been trying to get to the bottom of this today and finally have
discovered this is due to the tuple size allocations in the sort being
exactly 64 bytes. Prior to 40af10b57 (Use Generation memory contexts
to store tuples in sorts) the tuple for the sort would be stored in an
aset context. After 40af10b57 we'll use a generation context. The
idea with that change is that the generation context does no
power-of-2 round ups for allocations, so we save memory in most cases.
However, due to this particular test having a tuple size of 64-bytes,
there was no power-of-2 wastage with aset.
The problem is that generation chunks have a larger chunk header than
aset do due to having to store the block pointer that the chunk
belongs to so that GenerationFree() can increment the nfree chunks in
the block. aset.c does not require this as freed chunks just go onto a
freelist that's global to the entire context.
Basically, for my test query, the slowdown is because instead of being
able to store 620702 tuples per tape over 226 tapes with an aset
context, we can now only store 576845 tuples per tape resulting in
requiring 244 tapes when using the generation context.
If I had added column "g" to make the tuple size 72 bytes causing
aset's code to round allocations up to 128 bytes and generation.c to
maintain the 72 bytes then the sort would have stored 385805 tuples
over 364 batches for aset and 538761 tuples over 261 batches using the
generation context. That would have been a huge win.
So it basically looks like I discovered a very bad case that causes a
significant slowdown. Yet other cases that are not an exact power of
2 stand to gain significantly from this change.
One thing 40af10b57 does is stops those terrible performance jumps
when the tuple size crosses a power-of-2 boundary. The performance
should be more aligned to the size of the data being sorted now...
Unfortunately, that seems to mean regressions for large sorts with
power-of-2 sized tuples.
I'm unsure exactly what I should do about this right now.
David
On 20/05/2022 08:56, David Rowley wrote:
The problem is that generation chunks have a larger chunk header than
aset do due to having to store the block pointer that the chunk
belongs to so that GenerationFree() can increment the nfree chunks in
the block. aset.c does not require this as freed chunks just go onto a
freelist that's global to the entire context.
Could the 'context' field be moved from GenerationChunk to GenerationBlock?
- Heikki
Hi David,
Over the past few days I've been gathering some benchmark results
together to show the sort performance improvements in PG15 [1].
One of the test cases I did was to demonstrate Heikki's change to use
a k-way merge (65014000b).
The test I did to try this out was along the lines of:
set max_parallel_workers_per_gather = 0;
create table t (a bigint not null, b bigint not null, c bigint not
null, d bigint not null, e bigint not null, f bigint not null);
insert into t select x,x,x,x,x,x from generate_Series(1,140247142) x; --
10GB!
vacuum freeze t;
The query I ran was:
select * from t order by a offset 140247142;
I redid this test here:
Windows 10 64 bits
msvc 2019 64 bits
RAM 8GB
SSD 256 GB
HEAD (default configuration)
Time: 229396,551 ms (03:49,397)
PATCHED:
Time: 220887,346 ms (03:40,887)
I tested various sizes of work_mem starting at 4MB and doubled that
all the way to 16GB. For many of the smaller values of work_mem the
performance is vastly improved by Heikki's change, however for
work_mem = 64MB I detected quite a large slowdown. PG14 took 20.9
seconds and PG15 beta 1 took 29 seconds!
I've been trying to get to the bottom of this today and finally have
discovered this is due to the tuple size allocations in the sort being
exactly 64 bytes. Prior to 40af10b57 (Use Generation memory contexts
to store tuples in sorts) the tuple for the sort would be stored in an
aset context. After 40af10b57 we'll use a generation context. The
idea with that change is that the generation context does no
power-of-2 round ups for allocations, so we save memory in most cases.
However, due to this particular test having a tuple size of 64-bytes,
there was no power-of-2 wastage with aset.
The problem is that generation chunks have a larger chunk header than
aset do due to having to store the block pointer that the chunk
belongs to so that GenerationFree() can increment the nfree chunks in
the block. aset.c does not require this as freed chunks just go onto a
freelist that's global to the entire context.
Basically, for my test query, the slowdown is because instead of being
able to store 620702 tuples per tape over 226 tapes with an aset
context, we can now only store 576845 tuples per tape resulting in
requiring 244 tapes when using the generation context.
If I had added column "g" to make the tuple size 72 bytes causing
aset's code to round allocations up to 128 bytes and generation.c to
maintain the 72 bytes then the sort would have stored 385805 tuples
over 364 batches for aset and 538761 tuples over 261 batches using the
generation context. That would have been a huge win.
So it basically looks like I discovered a very bad case that causes a
significant slowdown. Yet other cases that are not an exact power of
2 stand to gain significantly from this change.
One thing 40af10b57 does is stops those terrible performance jumps
when the tuple size crosses a power-of-2 boundary. The performance
should be more aligned to the size of the data being sorted now...
Unfortunately, that seems to mean regressions for large sorts with
power-of-2 sized tuples.
It seems to me that the solution would be to use aset allocations
when the size of the tuples is power-of-2?
if (state->sortopt & TUPLESORT_ALLOWBOUNDED ||
(state->memtupsize & (state->memtupsize - 1)) == 0)
state->tuplecontext = AllocSetContextCreate(state->sortcontext,
"Caller tuples", ALLOCSET_DEFAULT_SIZES);
else
state->tuplecontext = GenerationContextCreate(state->sortcontext,
"Caller tuples", ALLOCSET_DEFAULT_SIZES);
I took a look and tried some improvements to see if I had a better result.
Would you mind taking a look and testing?
regards,
Ranier Vilela
Attachments:
002-improve-sort.patchapplication/octet-stream; name=002-improve-sort.patchDownload
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 8340a66052..cc7c996718 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -182,8 +182,8 @@ typedef struct
{
void *tuple; /* the tuple itself */
Datum datum1; /* value of first key column */
- bool isnull1; /* is first key column NULL? */
int srctape; /* source tape number */
+ bool isnull1; /* is first key column NULL? */
} SortTuple;
/*
@@ -695,7 +695,7 @@ static void tuplesort_updatemax(Tuplesortstate *state);
/* Used if first key's comparator is ssup_datum_unsigned_compare */
static pg_attribute_always_inline int
-qsort_tuple_unsigned_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
+qsort_tuple_unsigned_compare(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
{
int compare;
@@ -718,7 +718,7 @@ qsort_tuple_unsigned_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
#if SIZEOF_DATUM >= 8
/* Used if first key's comparator is ssup_datum_signed_compare */
static pg_attribute_always_inline int
-qsort_tuple_signed_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
+qsort_tuple_signed_compare(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
{
int compare;
@@ -742,7 +742,7 @@ qsort_tuple_signed_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
/* Used if first key's comparator is ssup_datum_int32_compare */
static pg_attribute_always_inline int
-qsort_tuple_int32_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
+qsort_tuple_int32_compare(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
{
int compare;
@@ -972,10 +972,13 @@ tuplesort_begin_batch(Tuplesortstate *state)
* generation.c context as this keeps allocations more compact with less
* wastage. Allocations are also slightly more CPU efficient.
*/
- if (state->sortopt & TUPLESORT_ALLOWBOUNDED)
+ if (state->sortopt & TUPLESORT_ALLOWBOUNDED ||
+ (state->memtupsize & (state->memtupsize - 1)) == 0)
+ {
state->tuplecontext = AllocSetContextCreate(state->sortcontext,
"Caller tuples",
ALLOCSET_DEFAULT_SIZES);
+ }
else
state->tuplecontext = GenerationContextCreate(state->sortcontext,
"Caller tuples",
@@ -4360,7 +4363,6 @@ comparetup_index_btree(const SortTuple *a, const SortTuple *b,
/* Compare additional sort keys */
tuple1 = (IndexTuple) a->tuple;
tuple2 = (IndexTuple) b->tuple;
- keysz = state->nKeys;
tupDes = RelationGetDescr(state->indexRel);
if (sortKey->abbrev_converter)
@@ -4380,6 +4382,7 @@ comparetup_index_btree(const SortTuple *a, const SortTuple *b,
equal_hasnull = true;
sortKey++;
+ keysz = state->nKeys;
for (nkey = 2; nkey <= keysz; nkey++, sortKey++)
{
datum1 = index_getattr(tuple1, nkey, tupDes, &isnull1);
@@ -4444,15 +4447,19 @@ comparetup_index_btree(const SortTuple *a, const SortTuple *b,
BlockNumber blk1 = ItemPointerGetBlockNumber(&tuple1->t_tid);
BlockNumber blk2 = ItemPointerGetBlockNumber(&tuple2->t_tid);
- if (blk1 != blk2)
- return (blk1 < blk2) ? -1 : 1;
+ if (blk1 < blk2)
+ return -1;
+ else if (blk1 > blk2)
+ return 1;
}
{
OffsetNumber pos1 = ItemPointerGetOffsetNumber(&tuple1->t_tid);
OffsetNumber pos2 = ItemPointerGetOffsetNumber(&tuple2->t_tid);
- if (pos1 != pos2)
- return (pos1 < pos2) ? -1 : 1;
+ if (pos1 < pos2)
+ return -1;
+ else if (pos1 > pos2)
+ return 1;
}
/* ItemPointer values should never be equal */
@@ -4499,15 +4506,19 @@ comparetup_index_hash(const SortTuple *a, const SortTuple *b,
BlockNumber blk1 = ItemPointerGetBlockNumber(&tuple1->t_tid);
BlockNumber blk2 = ItemPointerGetBlockNumber(&tuple2->t_tid);
- if (blk1 != blk2)
- return (blk1 < blk2) ? -1 : 1;
+ if (blk1 < blk2)
+ return -1;
+ else if (blk1 > blk2)
+ return 1;
}
{
OffsetNumber pos1 = ItemPointerGetOffsetNumber(&tuple1->t_tid);
OffsetNumber pos2 = ItemPointerGetOffsetNumber(&tuple2->t_tid);
- if (pos1 != pos2)
- return (pos1 < pos2) ? -1 : 1;
+ if (pos1 < pos2)
+ return -1;
+ else if (pos1 > pos2)
+ return 1;
}
/* ItemPointer values should never be equal */
@@ -4642,25 +4653,25 @@ readtup_datum(Tuplesortstate *state, SortTuple *stup,
if (tuplen == 0)
{
/* it's NULL */
+ stup->tuple = NULL;
stup->datum1 = (Datum) 0;
stup->isnull1 = true;
- stup->tuple = NULL;
}
else if (!state->tuples)
{
Assert(tuplen == sizeof(Datum));
+ stup->tuple = NULL;
LogicalTapeReadExact(tape, &stup->datum1, tuplen);
stup->isnull1 = false;
- stup->tuple = NULL;
}
else
{
void *raddr = readtup_alloc(state, tuplen);
LogicalTapeReadExact(tape, raddr, tuplen);
+ stup->tuple = raddr;
stup->datum1 = PointerGetDatum(raddr);
stup->isnull1 = false;
- stup->tuple = raddr;
}
if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
diff --git a/src/include/lib/sort_template.h b/src/include/lib/sort_template.h
index 3122a93009..86ac216172 100644
--- a/src/include/lib/sort_template.h
+++ b/src/include/lib/sort_template.h
@@ -309,12 +309,12 @@ loop:
for (pm = a + ST_POINTER_STEP; pm < a + n * ST_POINTER_STEP;
pm += ST_POINTER_STEP)
{
- DO_CHECK_FOR_INTERRUPTS();
if (DO_COMPARE(pm - ST_POINTER_STEP, pm) > 0)
{
presorted = 0;
break;
}
+ DO_CHECK_FOR_INTERRUPTS();
}
if (presorted)
return;
Import Notes
Resolved by subject fallback
On 5/20/22 12:01, Heikki Linnakangas wrote:
On 20/05/2022 08:56, David Rowley wrote:
The problem is that generation chunks have a larger chunk header than
aset do due to having to store the block pointer that the chunk
belongs to so that GenerationFree() can increment the nfree chunks in
the block. aset.c does not require this as freed chunks just go onto a
freelist that's global to the entire context.Could the 'context' field be moved from GenerationChunk to GenerationBlock?
Not easily, because GetMemoryChunkContext() expects the context to be
stored right before the chunk. In principle we could add "get context"
callback to MemoryContextMethods, so that different implementations can
override that.
I wonder how expensive the extra redirect would be, but probably not
much because the places touching chunk->context deal with the block too
(e.g. GenerationFree has to tweak block->nfree).
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
On 5/20/22 12:01, Heikki Linnakangas wrote:
Could the 'context' field be moved from GenerationChunk to GenerationBlock?
Not easily, because GetMemoryChunkContext() expects the context to be
stored right before the chunk. In principle we could add "get context"
callback to MemoryContextMethods, so that different implementations can
override that.
How would you know which context type to consult for that?
regards, tom lane
On Tue, 24 May 2022 at 08:32, Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
On 5/20/22 12:01, Heikki Linnakangas wrote:
Could the 'context' field be moved from GenerationChunk to GenerationBlock?
Not easily, because GetMemoryChunkContext() expects the context to be
stored right before the chunk. In principle we could add "get context"
callback to MemoryContextMethods, so that different implementations can
override that.
hmm, but we need to know the context first before we can know which
callback to call.
David
On 5/23/22 22:47, Tom Lane wrote:
Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
On 5/20/22 12:01, Heikki Linnakangas wrote:
Could the 'context' field be moved from GenerationChunk to GenerationBlock?
Not easily, because GetMemoryChunkContext() expects the context to be
stored right before the chunk. In principle we could add "get context"
callback to MemoryContextMethods, so that different implementations can
override that.How would you know which context type to consult for that?
D'oh! I knew there has to be some flaw in that idea, but I forgot about
this chicken-or-egg issue.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
FYI not sure why, but your responses seem to break threading quite
often, due to missing headers identifying the message you're responding
to (In-Reply-To, References). Not sure why or how to fix it, but this
makes it much harder to follow the discussion.
On 5/22/22 21:11, Ranier Vilela wrote:
Hi David,
Over the past few days I've been gathering some benchmark results
together to show the sort performance improvements in PG15 [1].One of the test cases I did was to demonstrate Heikki's change to use
a k-way merge (65014000b).The test I did to try this out was along the lines of:
set max_parallel_workers_per_gather = 0;
create table t (a bigint not null, b bigint not null, c bigint not
null, d bigint not null, e bigint not null, f bigint not null);insert into t select x,x,x,x,x,x from generate_Series(1,140247142) x;
-- 10GB!
vacuum freeze t;
The query I ran was:
select * from t order by a offset 140247142;
I redid this test here:
Windows 10 64 bits
msvc 2019 64 bits
RAM 8GB
SSD 256 GBHEAD (default configuration)
Time: 229396,551 ms (03:49,397)
PATCHED:
Time: 220887,346 ms (03:40,887)
This is 10x longer than reported by David. Presumably David used a
machine a lot of RAM, while your system has 8GB and so is I/O bound.
Also, what exactly does "patched" mean? The patch you attached?
I tested various sizes of work_mem starting at 4MB and doubled that
all the way to 16GB. For many of the smaller values of work_mem the
performance is vastly improved by Heikki's change, however for
work_mem = 64MB I detected quite a large slowdown. PG14 took 20.9
seconds and PG15 beta 1 took 29 seconds!I've been trying to get to the bottom of this today and finally have
discovered this is due to the tuple size allocations in the sort being
exactly 64 bytes. Prior to 40af10b57 (Use Generation memory contexts
to store tuples in sorts) the tuple for the sort would be stored in an
aset context. After 40af10b57 we'll use a generation context. The
idea with that change is that the generation context does no
power-of-2 round ups for allocations, so we save memory in most cases.
However, due to this particular test having a tuple size of 64-bytes,
there was no power-of-2 wastage with aset.The problem is that generation chunks have a larger chunk header than
aset do due to having to store the block pointer that the chunk
belongs to so that GenerationFree() can increment the nfree chunks in
the block. aset.c does not require this as freed chunks just go onto a
freelist that's global to the entire context.Basically, for my test query, the slowdown is because instead of being
able to store 620702 tuples per tape over 226 tapes with an aset
context, we can now only store 576845 tuples per tape resulting in
requiring 244 tapes when using the generation context.If I had added column "g" to make the tuple size 72 bytes causing
aset's code to round allocations up to 128 bytes and generation.c to
maintain the 72 bytes then the sort would have stored 385805 tuples
over 364 batches for aset and 538761 tuples over 261 batches using the
generation context. That would have been a huge win.So it basically looks like I discovered a very bad case that causes a
significant slowdown. Yet other cases that are not an exact power of
2 stand to gain significantly from this change.One thing 40af10b57 does is stops those terrible performance jumps
when the tuple size crosses a power-of-2 boundary. The performance
should be more aligned to the size of the data being sorted now...
Unfortunately, that seems to mean regressions for large sorts with
power-of-2 sized tuples.It seems to me that the solution would be to use aset allocations
when the size of the tuples is power-of-2?
if (state->sortopt & TUPLESORT_ALLOWBOUNDED ||
(state->memtupsize & (state->memtupsize - 1)) == 0)
state->tuplecontext = AllocSetContextCreate(state->sortcontext,
"Caller tuples", ALLOCSET_DEFAULT_SIZES);
else
state->tuplecontext = GenerationContextCreate(state->sortcontext,
"Caller tuples", ALLOCSET_DEFAULT_SIZES);
I'm pretty sure this is pointless, because memtupsize is the size of the
memtuples array. But the issue is about size of the tuples. After all,
David was talking about 64B chunks, but the array is always at least
1024 elements, so it obviously can't be the same thing.
How would we even know how large the tuples will be at this point,
before we even see the first of them?
I took a look and tried some improvements to see if I had a better result.
IMHO special-casing should be the last resort, because it makes the
behavior much harder to follow. Also, we're talking about sort, but
don't other places using Generation context have the same issue?
Treating prefferrable to find a fix addressing all those places,
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Tue, 24 May 2022 at 08:52, Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
On 5/23/22 22:47, Tom Lane wrote:
How would you know which context type to consult for that?
D'oh! I knew there has to be some flaw in that idea, but I forgot about
this chicken-or-egg issue.
Handy wavy idea: It's probably too complex for now, and it also might
be too much overhead, but having GenerationPointerGetChunk() do a
binary search on a sorted-by-memory-address array of block pointers
might be a fast enough way to find the block that the pointer belongs
to. There should be far fewer blocks now since generation.c now grows
the block sizes. N in O(log2 N) the search complexity should never be
excessively high.
However, it would mean a binary search for every pfree, which feels
pretty horrible. My feeling is that it seems unlikely that saving 8
bytes by not storing the GenerationBlock would be a net win here.
There may be something to claw back as for the tuplesort.c case as I
think the heap_free_minimal_tuple() call in writetup_heap() is not
needed in many cases as we reset the tuple context directly afterward
writing the tuples out.
David
David Rowley <dgrowleyml@gmail.com> writes:
Handy wavy idea: It's probably too complex for now, and it also might
be too much overhead, but having GenerationPointerGetChunk() do a
binary search on a sorted-by-memory-address array of block pointers
might be a fast enough way to find the block that the pointer belongs
to. There should be far fewer blocks now since generation.c now grows
the block sizes. N in O(log2 N) the search complexity should never be
excessively high.
However, it would mean a binary search for every pfree, which feels
pretty horrible. My feeling is that it seems unlikely that saving 8
bytes by not storing the GenerationBlock would be a net win here.
I think probably that could be made to work, since a Generation
context should not contain all that many live blocks at any one time.
However, here's a different idea: how badly do we need the "size"
field in GenerationChunk? We're not going to ever recycle the
chunk, IIUC, so it doesn't matter exactly how big it is. When
doing MEMORY_CONTEXT_CHECKING we'll still want requested_size,
but that's not relevant to performance-critical cases.
regards, tom lane
On Tue, 24 May 2022 at 09:36, Tom Lane <tgl@sss.pgh.pa.us> wrote:
However, here's a different idea: how badly do we need the "size"
field in GenerationChunk? We're not going to ever recycle the
chunk, IIUC, so it doesn't matter exactly how big it is. When
doing MEMORY_CONTEXT_CHECKING we'll still want requested_size,
but that's not relevant to performance-critical cases.
Interesting idea. However, I do see a couple of usages of the "size"
field away from MEMORY_CONTEXT_CHECKING builds:
GenerationRealloc: uses "size" to figure out if the new size is
smaller than the old size. Maybe we could just always move to a new
chunk regardless of if the new size is smaller or larger than the old
size.
GenerationGetChunkSpace: Uses "size" to figure out the amount of
memory is used by the chunk. I'm not sure how we'd work around the
fact that USEMEM() uses that extensively in tuplesort.c to figure out
how much memory we're using.
David
Em seg., 23 de mai. de 2022 às 18:01, Tomas Vondra <
tomas.vondra@enterprisedb.com> escreveu:
FYI not sure why, but your responses seem to break threading quite
often, due to missing headers identifying the message you're responding
to (In-Reply-To, References). Not sure why or how to fix it, but this
makes it much harder to follow the discussion.On 5/22/22 21:11, Ranier Vilela wrote:
Hi David,
Over the past few days I've been gathering some benchmark results
together to show the sort performance improvements in PG15 [1].One of the test cases I did was to demonstrate Heikki's change to use
a k-way merge (65014000b).The test I did to try this out was along the lines of:
set max_parallel_workers_per_gather = 0;
create table t (a bigint not null, b bigint not null, c bigint not
null, d bigint not null, e bigint not null, f bigint not null);insert into t select x,x,x,x,x,x from generate_Series(1,140247142) x;
-- 10GB!
vacuum freeze t;
The query I ran was:
select * from t order by a offset 140247142;
I redid this test here:
Windows 10 64 bits
msvc 2019 64 bits
RAM 8GB
SSD 256 GBHEAD (default configuration)
Time: 229396,551 ms (03:49,397)
PATCHED:
Time: 220887,346 ms (03:40,887)This is 10x longer than reported by David. Presumably David used a
machine a lot of RAM, while your system has 8GB and so is I/O bound.
Probably, but Windows is slower than Linux, certainly.
Also, what exactly does "patched" mean? The patch you attached?
It means the results of the benchmark with the patch applied.
I tested various sizes of work_mem starting at 4MB and doubled that
all the way to 16GB. For many of the smaller values of work_mem the
performance is vastly improved by Heikki's change, however for
work_mem = 64MB I detected quite a large slowdown. PG14 took 20.9
seconds and PG15 beta 1 took 29 seconds!I've been trying to get to the bottom of this today and finally have
discovered this is due to the tuple size allocations in the sort being
exactly 64 bytes. Prior to 40af10b57 (Use Generation memory contexts
to store tuples in sorts) the tuple for the sort would be stored in an
aset context. After 40af10b57 we'll use a generation context. The
idea with that change is that the generation context does no
power-of-2 round ups for allocations, so we save memory in most cases.
However, due to this particular test having a tuple size of 64-bytes,
there was no power-of-2 wastage with aset.The problem is that generation chunks have a larger chunk header than
aset do due to having to store the block pointer that the chunk
belongs to so that GenerationFree() can increment the nfree chunks in
the block. aset.c does not require this as freed chunks just go onto a
freelist that's global to the entire context.Basically, for my test query, the slowdown is because instead of being
able to store 620702 tuples per tape over 226 tapes with an aset
context, we can now only store 576845 tuples per tape resulting in
requiring 244 tapes when using the generation context.If I had added column "g" to make the tuple size 72 bytes causing
aset's code to round allocations up to 128 bytes and generation.c to
maintain the 72 bytes then the sort would have stored 385805 tuples
over 364 batches for aset and 538761 tuples over 261 batches using the
generation context. That would have been a huge win.So it basically looks like I discovered a very bad case that causes a
significant slowdown. Yet other cases that are not an exact power of
2 stand to gain significantly from this change.One thing 40af10b57 does is stops those terrible performance jumps
when the tuple size crosses a power-of-2 boundary. The performance
should be more aligned to the size of the data being sorted now...
Unfortunately, that seems to mean regressions for large sorts with
power-of-2 sized tuples.It seems to me that the solution would be to use aset allocations
when the size of the tuples is power-of-2?
if (state->sortopt & TUPLESORT_ALLOWBOUNDED ||
(state->memtupsize & (state->memtupsize - 1)) == 0)
state->tuplecontext = AllocSetContextCreate(state->sortcontext,
"Caller tuples", ALLOCSET_DEFAULT_SIZES);
else
state->tuplecontext = GenerationContextCreate(state->sortcontext,
"Caller tuples", ALLOCSET_DEFAULT_SIZES);I'm pretty sure this is pointless, because memtupsize is the size of the
memtuples array. But the issue is about size of the tuples. After all,
David was talking about 64B chunks, but the array is always at least
1024 elements, so it obviously can't be the same thing.
It was more of a guessing attempt.
How would we even know how large the tuples will be at this point,
before we even see the first of them?
I don't know how.
I took a look and tried some improvements to see if I had a better
result.
IMHO special-casing should be the last resort, because it makes the
behavior much harder to follow.
Probably, but in my tests, there has been some gain.
Also, we're talking about sort, but
don't other places using Generation context have the same issue?
Well, for PG15 is what was addressed, just sort.
Treating prefferrable to find a fix addressing all those places,
For now, I'm just focused on sorts.
regards,
Ranier Vilela
I wrote:
However, here's a different idea: how badly do we need the "size"
field in GenerationChunk? We're not going to ever recycle the
chunk, IIUC, so it doesn't matter exactly how big it is. When
doing MEMORY_CONTEXT_CHECKING we'll still want requested_size,
but that's not relevant to performance-critical cases.
Refining that a bit: we could provide the size field only when
MEMORY_CONTEXT_CHECKING and/or CLOBBER_FREED_MEMORY are defined.
That would leave us with GenerationRealloc and GenerationGetChunkSpace
not being supportable operations, but I wonder how much we need either.
BTW, shouldn't GenerationCheck be ifdef'd out if MEMORY_CONTEXT_CHECKING
isn't set? aset.c does things that way.
regards, tom lane
David Rowley <dgrowleyml@gmail.com> writes:
GenerationRealloc: uses "size" to figure out if the new size is
smaller than the old size. Maybe we could just always move to a new
chunk regardless of if the new size is smaller or larger than the old
size.
I had the same idea ... but we need to know the old size to know how much
to copy.
GenerationGetChunkSpace: Uses "size" to figure out the amount of
memory is used by the chunk. I'm not sure how we'd work around the
fact that USEMEM() uses that extensively in tuplesort.c to figure out
how much memory we're using.
Ugh, that seems like a killer.
regards, tom lane
On Tue, 24 May 2022 at 10:02, Tom Lane <tgl@sss.pgh.pa.us> wrote:
BTW, shouldn't GenerationCheck be ifdef'd out if MEMORY_CONTEXT_CHECKING
isn't set? aset.c does things that way.
Isn't it done in generation.c:954?
David
David Rowley <dgrowleyml@gmail.com> writes:
On Tue, 24 May 2022 at 10:02, Tom Lane <tgl@sss.pgh.pa.us> wrote:
BTW, shouldn't GenerationCheck be ifdef'd out if MEMORY_CONTEXT_CHECKING
isn't set? aset.c does things that way.
Isn't it done in generation.c:954?
Ah, sorry, didn't look that far up ...
regards, tom lane
I had another, possibly-crazy idea. I think that the API requirement
that the word before a chunk's start point to a MemoryContext is
overly strong. What we need is that it point to something in which
a MemoryContextMethods pointer can be found (at a predefined offset).
Thus, if generation.c is willing to add the overhead of a
MemoryContextMethods pointer in GenerationBlock, it could dispense
with the per-chunk context field and just have the GenerationBlock
link there. I guess most likely we'd also need a back-link to
the GenerationContext from GenerationBlock. Still, two more
pointers per GenerationBlock is an easy tradeoff to make to save
one pointer per GenerationChunk.
I've not trawled the code to make sure that *only* the methods
pointer is touched by context-type-independent code, but it
seems like we could get to that even if we're not there today.
Whether this idea is something we could implement post-beta,
I'm not sure. I'm inclined to think that we can't change the layout
of MemoryContextData post-beta, as people may already be building
extensions for production use. We could bloat GenerationBlock even
more so that the methods pointer is at the right offset for today's
layout of MemoryContextData, though obviously going forward it'd be
better if there wasn't extra overhead needed to make this happen.
regards, tom lane
On 5/20/22 1:56 AM, David Rowley wrote:
Hackers,
Over the past few days I've been gathering some benchmark results
together to show the sort performance improvements in PG15 [1].
So it basically looks like I discovered a very bad case that causes a
significant slowdown. Yet other cases that are not an exact power of
2 stand to gain significantly from this change.I'm unsure exactly what I should do about this right now.
While this is being actively investigated, I added this into open items.
Thanks,
Jonathan
On Tue, 24 May 2022 at 09:36, Tom Lane <tgl@sss.pgh.pa.us> wrote:
David Rowley <dgrowleyml@gmail.com> writes:
Handy wavy idea: It's probably too complex for now, and it also might
be too much overhead, but having GenerationPointerGetChunk() do a
binary search on a sorted-by-memory-address array of block pointers
might be a fast enough way to find the block that the pointer belongs
to. There should be far fewer blocks now since generation.c now grows
the block sizes. N in O(log2 N) the search complexity should never be
excessively high.However, it would mean a binary search for every pfree, which feels
pretty horrible. My feeling is that it seems unlikely that saving 8
bytes by not storing the GenerationBlock would be a net win here.I think probably that could be made to work, since a Generation
context should not contain all that many live blocks at any one time.
I've done a rough cut implementation of this and attached it here.
I've not done that much testing yet, but it does seem to fix the
performance regression that I mentioned in the blog post that I linked
in the initial post on this thread.
There are a couple of things to note about the patch:
1. I quickly realised that there's no good place to store the
sorted-by-memory-address array of GenerationBlocks. In the patch, I've
had to malloc() this array and also had to use a special case so that
I didn't try to do another malloc() inside GenerationContextCreate().
It's possible that the malloc() / repalloc of this array fails. In
which case, I think I've coded things in such a way that there will be
no memory leaks of the newly added block.
2. I did see GenerationFree() pop up in perf top a little more than it
used to. I considered that we might want to have
GenerationGetBlockFromChunk() cache the last found block for the set
and then check that one first. We expect generation contexts to
pfree() in an order that would likely make us hit this case most of
the time. I added a few lines to the attached v2 patch to add a
last_pfree_block field to the context struct. However, this seems to
hinder performance more than it helps. It can easily be removed from
the v2 patch.
In the results you can see the PG14 + PG15 results the same as I
reported on the blog post I linked earlier. It seems that for
PG15_patched virtually all work_mem sizes produce results that are
faster than PG14. The exception here is the 16GB test where
PG15_patched is 0.8% slower, which seems within the noise threshold.
David
Attachments:
generation_bsearch_block_v2.patchtext/plain; charset=US-ASCII; name=generation_bsearch_block_v2.patchDownload
diff --git a/src/backend/utils/mmgr/generation.c b/src/backend/utils/mmgr/generation.c
index e530e272e0..b11c0d77a5 100644
--- a/src/backend/utils/mmgr/generation.c
+++ b/src/backend/utils/mmgr/generation.c
@@ -71,7 +71,11 @@ typedef struct GenerationContext
GenerationBlock *freeblock; /* pointer to a block that's being recycled,
* or NULL if there's no such block. */
GenerationBlock *keeper; /* keep this block over resets */
- dlist_head blocks; /* list of blocks */
+ uint32 nblocks; /* Number of items in 'blocks' array */
+ uint32 blocks_size; /* Allocation size of 'blocks' array */
+ GenerationBlock **blocks; /* Array of blocks sorted by memory address */
+ GenerationBlock *last_pfree_block; /* Pointer to last block pfree was
+ * called on */
} GenerationContext;
/*
@@ -88,7 +92,6 @@ typedef struct GenerationContext
*/
struct GenerationBlock
{
- dlist_node node; /* doubly-linked list of blocks */
Size blksize; /* allocated size of this block */
int nchunks; /* number of chunks in the block */
int nfree; /* number of free chunks */
@@ -127,7 +130,6 @@ struct GenerationChunk
char padding[MAXIMUM_ALIGNOF - GENERATIONCHUNK_RAWSIZE % MAXIMUM_ALIGNOF];
#endif
- GenerationBlock *block; /* block owning this chunk */
GenerationContext *context; /* owning context, or NULL if freed chunk */
/* there must not be any padding to reach a MAXALIGN boundary here! */
};
@@ -151,11 +153,17 @@ struct GenerationChunk
#define GenerationChunkGetPointer(chk) \
((GenerationPointer *)(((char *)(chk)) + Generation_CHUNKHDRSZ))
-/* Inlined helper functions */
+/* Helper functions */
static inline void GenerationBlockInit(GenerationBlock *block, Size blksize);
static inline bool GenerationBlockIsEmpty(GenerationBlock *block);
static inline void GenerationBlockMarkEmpty(GenerationBlock *block);
static inline Size GenerationBlockFreeBytes(GenerationBlock *block);
+static GenerationBlock *GenerationGetBlockFromChunk(GenerationContext *set,
+ GenerationChunk *chunk);
+static void GenerationTrackBlock(GenerationContext *set,
+ GenerationBlock *block);
+static void GenerationUnTrackBlock(GenerationContext *set,
+ GenerationBlock *block);
static inline void GenerationBlockFree(GenerationContext *set,
GenerationBlock *block);
@@ -272,7 +280,10 @@ GenerationContextCreate(MemoryContext parent,
* Avoid writing code that can fail between here and MemoryContextCreate;
* we'd leak the header if we ereport in this stretch.
*/
- dlist_init(&set->blocks);
+ set->blocks = NULL;
+ set->nblocks = 0;
+ set->blocks_size = 0;
+ set->last_pfree_block = NULL;
/* Fill in the initial block's block header */
block = (GenerationBlock *) (((char *) set) + MAXALIGN(sizeof(GenerationContext)));
@@ -280,8 +291,8 @@ GenerationContextCreate(MemoryContext parent,
firstBlockSize = allocSize - MAXALIGN(sizeof(GenerationContext));
GenerationBlockInit(block, firstBlockSize);
- /* add it to the doubly-linked list of blocks */
- dlist_push_head(&set->blocks, &block->node);
+ /* add it to the array of blocks */
+ GenerationTrackBlock(set, block);
/* use it as the current allocation block */
set->block = block;
@@ -330,7 +341,8 @@ static void
GenerationReset(MemoryContext context)
{
GenerationContext *set = (GenerationContext *) context;
- dlist_mutable_iter miter;
+ uint32 i;
+ uint32 nblocks;
AssertArg(GenerationIsValid(set));
@@ -346,14 +358,20 @@ GenerationReset(MemoryContext context)
*/
set->freeblock = NULL;
- dlist_foreach_modify(miter, &set->blocks)
+ for (i = 0, nblocks = set->nblocks; i < nblocks;)
{
- GenerationBlock *block = dlist_container(GenerationBlock, node, miter.cur);
+ GenerationBlock *block = set->blocks[i];
if (block == set->keeper)
+ {
GenerationBlockMarkEmpty(block);
+ i++;
+ }
else
+ {
GenerationBlockFree(set, block);
+ nblocks--;
+ }
}
/* set it so new allocations to make use of the keeper block */
@@ -362,9 +380,8 @@ GenerationReset(MemoryContext context)
/* Reset block size allocation sequence, too */
set->nextBlockSize = set->initBlockSize;
- /* Ensure there is only 1 item in the dlist */
- Assert(!dlist_is_empty(&set->blocks));
- Assert(!dlist_has_next(&set->blocks, dlist_head_node(&set->blocks)));
+ /* Ensure there is only 1 item in the blocks array */
+ Assert(set->nblocks == 1);
}
/*
@@ -374,8 +391,14 @@ GenerationReset(MemoryContext context)
static void
GenerationDelete(MemoryContext context)
{
+ GenerationContext *set = (GenerationContext *)context;
+
/* Reset to release all releasable GenerationBlocks */
GenerationReset(context);
+
+ if (set->blocks != &set->keeper && set->blocks != NULL)
+ free(set->blocks);
+
/* And free the context header and keeper block */
free(context);
}
@@ -422,7 +445,6 @@ GenerationAlloc(MemoryContext context, Size size)
block->freeptr = block->endptr = ((char *) block) + blksize;
chunk = (GenerationChunk *) (((char *) block) + Generation_BLOCKHDRSZ);
- chunk->block = block;
chunk->context = set;
chunk->size = chunk_size;
@@ -437,8 +459,8 @@ GenerationAlloc(MemoryContext context, Size size)
randomize_mem((char *) GenerationChunkGetPointer(chunk), size);
#endif
- /* add the block to the list of allocated blocks */
- dlist_push_head(&set->blocks, &block->node);
+ /* add the block to the array of allocated blocks */
+ GenerationTrackBlock(set, block);
/* Ensure any padding bytes are marked NOACCESS. */
VALGRIND_MAKE_MEM_NOACCESS((char *) GenerationChunkGetPointer(chunk) + size,
@@ -518,8 +540,8 @@ GenerationAlloc(MemoryContext context, Size size)
/* initialize the new block */
GenerationBlockInit(block, blksize);
- /* add it to the doubly-linked list of blocks */
- dlist_push_head(&set->blocks, &block->node);
+ /* add it to the array of blocks */
+ GenerationTrackBlock(set, block);
/* Zero out the freeblock in case it's become full */
set->freeblock = NULL;
@@ -543,7 +565,6 @@ GenerationAlloc(MemoryContext context, Size size)
Assert(block->freeptr <= block->endptr);
- chunk->block = block;
chunk->context = set;
chunk->size = chunk_size;
@@ -632,6 +653,113 @@ GenerationBlockFreeBytes(GenerationBlock *block)
return (block->endptr - block->freeptr);
}
+#define INIT_BLOCK_ARRAY_SIZE 8
+
+/*
+ * GenerationTrackBlock
+ * Add 'block' to the 'set' blocks array maintaining it in memory address
+ * order.
+ */
+static void
+GenerationTrackBlock(GenerationContext *set, GenerationBlock *block)
+{
+ uint32 i;
+
+ /*
+ * We must special case the storage of the first block as we're unable to
+ * malloc() memory during the creation of a memory context. See comments
+ * in GenerationContextCreate().
+ */
+ if (set->nblocks == 0)
+ {
+ set->blocks = &set->keeper;
+ set->nblocks = 1;
+ set->blocks_size = 1;
+ return;
+ }
+
+ /* Check for special-case from above */
+ if (set->blocks == &set->keeper)
+ {
+ set->blocks = malloc(sizeof(GenerationBlock *) * INIT_BLOCK_ARRAY_SIZE);
+ set->blocks_size = INIT_BLOCK_ARRAY_SIZE;
+ set->blocks[0] = set->keeper;
+ set->nblocks = 1;
+ }
+ else if (set->nblocks == set->blocks_size)
+ {
+ uint32 newsize = set->blocks_size * 2;
+
+ set->blocks = realloc(set->blocks, sizeof(GenerationBlock *) * newsize);
+ set->blocks_size = newsize;
+ }
+
+ /* check for malloc() / realloc() failures */
+ if (set->blocks == NULL)
+ {
+ /* Free this block so that we don't leak it in GenerationDelete() */
+ free(block);
+
+ MemoryContextStats(TopMemoryContext);
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("out of memory"),
+ errdetail("Failed while expanding memory context \"%s\".",
+ ((MemoryContext) set)->name)));
+ }
+
+ /*
+ * Search for the slot in the array where we must add the new block so
+ * that it remains sorted in block address order
+ */
+ for (i = 0; i < set->nblocks; i++)
+ {
+ if ((char *) block < (char *) set->blocks[i])
+ break;
+ }
+
+ /* make a gap in the blocks array to store the new pointer */
+ for (uint32 j = set->nblocks; j > i; j--)
+ set->blocks[j] = set->blocks[j - 1];
+
+ set->blocks[i] = block;
+ set->nblocks++;
+}
+
+static void
+GenerationUnTrackBlock(GenerationContext *set, GenerationBlock *block)
+{
+ uint32 low = 0;
+ uint32 high = set->nblocks;
+ uint32 mid;
+
+ /* binary search the memory address sorted set->blocks array */
+ while (low < high)
+ {
+ GenerationBlock *curblock;
+
+ mid = (uint32) (((uint64) low + high) / 2);
+
+ curblock = set->blocks[mid];
+
+ if (block == curblock)
+ break;
+ else if ((char *) block < (char *) curblock)
+ high = mid;
+ else
+ low = mid + 1;
+ }
+
+ /* fill the hole in the array from the removed block */
+ while (mid < set->nblocks)
+ {
+ set->blocks[mid] = set->blocks[mid + 1];
+ mid++;
+ }
+
+ set->nblocks--;
+}
+
/*
* GenerationBlockFree
* Remove 'block' from 'set' and release the memory consumed by it.
@@ -644,8 +772,8 @@ GenerationBlockFree(GenerationContext *set, GenerationBlock *block)
/* We shouldn't be freeing the freeblock either */
Assert(block != set->freeblock);
- /* release the block from the list of blocks */
- dlist_delete(&block->node);
+ /* release the block from the array of blocks */
+ GenerationUnTrackBlock(set, block);
((MemoryContext) set)->mem_allocated -= block->blksize;
@@ -656,6 +784,55 @@ GenerationBlockFree(GenerationContext *set, GenerationBlock *block)
free(block);
}
+/*
+ * GenerationGetBlockFromChunk
+ * Find the GenerationBlock that 'chunk' belongs to and return it.
+ */
+static GenerationBlock *
+GenerationGetBlockFromChunk(GenerationContext *set, GenerationChunk *chunk)
+{
+ GenerationBlock *block;
+ GenerationBlock *lblock = set->last_pfree_block;
+ uint32 low;
+ uint32 high;
+
+ /*
+ * We expect GenerationChunks to be pfree'd in roughly palloc() order, so
+ * it's likely that the last pfree'd chunk is on the same block as this
+ * chunk.
+ */
+ if (lblock != NULL &&
+ (char *) chunk >= (char *) lblock &&
+ (char *) chunk <= lblock->endptr)
+ return lblock;
+
+ low = 0;
+ high = set->nblocks - 1;
+
+ while (low <= high)
+ {
+ uint32 mid = (uint32) (((uint64) low + high) / 2);
+
+ block = set->blocks[mid];
+
+ if (((char *) block) <= ((char *) chunk))
+ {
+ /* Check if the chunk falls within the used space of this block */
+ if (((char *) chunk) <= block->freeptr)
+ {
+ low = mid;
+ break;
+ }
+ low = mid + 1;
+ }
+ else
+ high = mid - 1;
+ }
+
+ set->last_pfree_block = set->blocks[low];
+
+ return set->blocks[low];
+}
/*
* GenerationFree
* Update number of chunks in the block, and if all chunks in the block
@@ -671,7 +848,7 @@ GenerationFree(MemoryContext context, void *pointer)
/* Allow access to private part of chunk header. */
VALGRIND_MAKE_MEM_DEFINED(chunk, GENERATIONCHUNK_PRIVATE_LEN);
- block = chunk->block;
+ block = GenerationGetBlockFromChunk(set, chunk);
#ifdef MEMORY_CONTEXT_CHECKING
/* Test for someone scribbling on unused space in chunk */
@@ -728,10 +905,11 @@ GenerationFree(MemoryContext context, void *pointer)
/*
* The block is empty, so let's get rid of it. First remove it from the
- * list of blocks, then return it to malloc().
+ * array of blocks, then return it to malloc().
*/
- dlist_delete(&block->node);
+ GenerationUnTrackBlock(set, block);
+ set->last_pfree_block = NULL;
context->mem_allocated -= block->blksize;
free(block);
}
@@ -878,11 +1056,11 @@ static bool
GenerationIsEmpty(MemoryContext context)
{
GenerationContext *set = (GenerationContext *) context;
- dlist_iter iter;
+ uint32 i;
- dlist_foreach(iter, &set->blocks)
+ for (i = 0; i < set->nblocks; i++)
{
- GenerationBlock *block = dlist_container(GenerationBlock, node, iter.cur);
+ GenerationBlock *block = set->blocks[i];
if (block->nchunks > 0)
return false;
@@ -914,14 +1092,14 @@ GenerationStats(MemoryContext context,
Size nfreechunks = 0;
Size totalspace;
Size freespace = 0;
- dlist_iter iter;
+ uint32 i;
/* Include context header in totalspace */
totalspace = MAXALIGN(sizeof(GenerationContext));
- dlist_foreach(iter, &set->blocks)
+ for (i = 0; i < set->nblocks; i++)
{
- GenerationBlock *block = dlist_container(GenerationBlock, node, iter.cur);
+ GenerationBlock *block = set->blocks[i];
nblocks++;
nchunks += block->nchunks;
@@ -966,13 +1144,13 @@ GenerationCheck(MemoryContext context)
{
GenerationContext *gen = (GenerationContext *) context;
const char *name = context->name;
- dlist_iter iter;
Size total_allocated = 0;
+ uint32 i;
/* walk all blocks in this context */
- dlist_foreach(iter, &gen->blocks)
+ for (i = 0; i < gen->nblocks; i++)
{
- GenerationBlock *block = dlist_container(GenerationBlock, node, iter.cur);
+ GenerationBlock *block = gen->blocks[i];
int nfree,
nchunks;
char *ptr;
@@ -1004,8 +1182,8 @@ GenerationCheck(MemoryContext context)
nchunks += 1;
- /* chunks have both block and context pointers, so check both */
- if (chunk->block != block)
+ /* ensure the found GenerationBlock matches block */
+ if (GenerationGetBlockFromChunk(gen, chunk) != block)
elog(WARNING, "problem in Generation %s: bogus block link in block %p, chunk %p",
name, block, chunk);
David Rowley <dgrowleyml@gmail.com> writes:
On Tue, 24 May 2022 at 09:36, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I think probably that could be made to work, since a Generation
context should not contain all that many live blocks at any one time.
I've done a rough cut implementation of this and attached it here.
I've not done that much testing yet, but it does seem to fix the
performance regression that I mentioned in the blog post that I linked
in the initial post on this thread.
Here's a draft patch for the other way of doing it. I'd first tried
to make the side-effects completely local to generation.c, but that
fails in view of code like
MemoryContextAlloc(GetMemoryChunkContext(x), ...)
Thus we pretty much have to have some explicit awareness of this scheme
in GetMemoryChunkContext(). There's more than one way it could be
done, but I thought a clean way is to invent a separate NodeTag type
to identify the indirection case.
So this imposes a distributed overhead of one additional test-and-branch
per pfree or repalloc. I'm inclined to think that that's negligible,
but I've not done performance testing to try to prove it.
For back-patching into v14, we could put the new NodeTag type at the
end of that enum list. The change in the inline GetMemoryChunkContext
is probably an acceptable hazard.
regards, tom lane
Attachments:
invent-MemoryContextLink-1.patchtext/x-diff; charset=us-ascii; name=invent-MemoryContextLink-1.patchDownload
diff --git a/src/backend/utils/mmgr/README b/src/backend/utils/mmgr/README
index 221b4bd343..c932f88639 100644
--- a/src/backend/utils/mmgr/README
+++ b/src/backend/utils/mmgr/README
@@ -397,12 +397,14 @@ precede the MemoryContext. This means the only overhead implied by
the memory context mechanism is a pointer to its context, so we're not
constraining context-type designers very much.
-Given this, routines like pfree determine their corresponding context
-with an operation like (although that is usually encapsulated in
-GetMemoryChunkContext())
-
- MemoryContext context = *(MemoryContext*) (((char *) pointer) - sizeof(void *));
-
+Furthermore, context allocators can make the pointer preceding an
+allocated chunk be a pointer to a "MemoryContextLink" rather than
+directly to the owning context. This allows chunks to be associated
+with a subsidiary data structure, such as a sub-group of allocations,
+at relatively low cost.
+
+Given this, routines like pfree determine their target context
+with GetMemoryChunkContext() or equivalent code,
and then invoke the corresponding method for the context
context->methods->free_p(pointer);
diff --git a/src/backend/utils/mmgr/generation.c b/src/backend/utils/mmgr/generation.c
index e530e272e0..bf1acc13c7 100644
--- a/src/backend/utils/mmgr/generation.c
+++ b/src/backend/utils/mmgr/generation.c
@@ -85,9 +85,15 @@ typedef struct GenerationContext
*
* GenerationBlock is the header data for a block --- the usable space
* within the block begins at the next alignment boundary.
+ *
+ * To make it cheap to identify the GenerationBlock containing a given
+ * chunk, a GenerationBlock starts with a MemoryContextLink. This
+ * allows us to make GenerationChunk's required overhead pointer point to
+ * the containing GenerationBlock rather than directly to the context.
*/
struct GenerationBlock
{
+ MemoryContextLink link; /* link to owning context */
dlist_node node; /* doubly-linked list of blocks */
Size blksize; /* allocated size of this block */
int nchunks; /* number of chunks in the block */
@@ -101,7 +107,7 @@ struct GenerationBlock
* The prefix of each piece of memory in a GenerationBlock
*
* Note: to meet the memory context APIs, the payload area of the chunk must
- * be maxaligned, and the "context" link must be immediately adjacent to the
+ * be maxaligned, and the "block" link must be immediately adjacent to the
* payload area (cf. GetMemoryChunkContext). We simplify matters for this
* module by requiring sizeof(GenerationChunk) to be maxaligned, and then
* we can ensure things work by adding any required alignment padding before
@@ -128,17 +134,16 @@ struct GenerationChunk
#endif
GenerationBlock *block; /* block owning this chunk */
- GenerationContext *context; /* owning context, or NULL if freed chunk */
/* there must not be any padding to reach a MAXALIGN boundary here! */
};
/*
- * Only the "context" field should be accessed outside this module.
+ * Only the "block" field should be accessed outside this module.
* We keep the rest of an allocated chunk's header marked NOACCESS when using
* valgrind. But note that freed chunk headers are kept accessible, for
* simplicity.
*/
-#define GENERATIONCHUNK_PRIVATE_LEN offsetof(GenerationChunk, context)
+#define GENERATIONCHUNK_PRIVATE_LEN offsetof(GenerationChunk, block)
/*
* GenerationIsValid
@@ -152,7 +157,8 @@ struct GenerationChunk
((GenerationPointer *)(((char *)(chk)) + Generation_CHUNKHDRSZ))
/* Inlined helper functions */
-static inline void GenerationBlockInit(GenerationBlock *block, Size blksize);
+static inline void GenerationBlockInit(GenerationBlock *block,
+ MemoryContext context, Size blksize);
static inline bool GenerationBlockIsEmpty(GenerationBlock *block);
static inline void GenerationBlockMarkEmpty(GenerationBlock *block);
static inline Size GenerationBlockFreeBytes(GenerationBlock *block);
@@ -226,7 +232,7 @@ GenerationContextCreate(MemoryContext parent,
/* Assert we padded GenerationChunk properly */
StaticAssertStmt(Generation_CHUNKHDRSZ == MAXALIGN(Generation_CHUNKHDRSZ),
"sizeof(GenerationChunk) is not maxaligned");
- StaticAssertStmt(offsetof(GenerationChunk, context) + sizeof(MemoryContext) ==
+ StaticAssertStmt(offsetof(GenerationChunk, block) + sizeof(void *) ==
Generation_CHUNKHDRSZ,
"padding calculation in GenerationChunk is wrong");
@@ -278,7 +284,7 @@ GenerationContextCreate(MemoryContext parent,
block = (GenerationBlock *) (((char *) set) + MAXALIGN(sizeof(GenerationContext)));
/* determine the block size and initialize it */
firstBlockSize = allocSize - MAXALIGN(sizeof(GenerationContext));
- GenerationBlockInit(block, firstBlockSize);
+ GenerationBlockInit(block, (MemoryContext) set, firstBlockSize);
/* add it to the doubly-linked list of blocks */
dlist_push_head(&set->blocks, &block->node);
@@ -413,6 +419,10 @@ GenerationAlloc(MemoryContext context, Size size)
context->mem_allocated += blksize;
+ /* set up standard GenerationBlock header */
+ block->link.type = T_MemoryContextLink;
+ block->link.link = context;
+
/* block with a single (used) chunk */
block->blksize = blksize;
block->nchunks = 1;
@@ -423,7 +433,6 @@ GenerationAlloc(MemoryContext context, Size size)
chunk = (GenerationChunk *) (((char *) block) + Generation_BLOCKHDRSZ);
chunk->block = block;
- chunk->context = set;
chunk->size = chunk_size;
#ifdef MEMORY_CONTEXT_CHECKING
@@ -516,7 +525,7 @@ GenerationAlloc(MemoryContext context, Size size)
context->mem_allocated += blksize;
/* initialize the new block */
- GenerationBlockInit(block, blksize);
+ GenerationBlockInit(block, context, blksize);
/* add it to the doubly-linked list of blocks */
dlist_push_head(&set->blocks, &block->node);
@@ -544,7 +553,6 @@ GenerationAlloc(MemoryContext context, Size size)
Assert(block->freeptr <= block->endptr);
chunk->block = block;
- chunk->context = set;
chunk->size = chunk_size;
#ifdef MEMORY_CONTEXT_CHECKING
@@ -574,8 +582,11 @@ GenerationAlloc(MemoryContext context, Size size)
* mem_allocated field.
*/
static inline void
-GenerationBlockInit(GenerationBlock *block, Size blksize)
+GenerationBlockInit(GenerationBlock *block, MemoryContext context, Size blksize)
{
+ block->link.type = T_MemoryContextLink;
+ block->link.link = context;
+
block->blksize = blksize;
block->nchunks = 0;
block->nfree = 0;
@@ -685,9 +696,6 @@ GenerationFree(MemoryContext context, void *pointer)
wipe_mem(pointer, chunk->size);
#endif
- /* Reset context to NULL in freed chunks */
- chunk->context = NULL;
-
#ifdef MEMORY_CONTEXT_CHECKING
/* Reset requested_size to 0 in freed chunks */
chunk->requested_size = 0;
@@ -979,6 +987,12 @@ GenerationCheck(MemoryContext context)
total_allocated += block->blksize;
+ /* Sanity-check the block header. */
+ if (block->link.type != T_MemoryContextLink ||
+ block->link.link != context)
+ elog(WARNING, "problem in Generation %s: invalid link in block %p",
+ name, block);
+
/*
* nfree > nchunks is surely wrong. Equality is allowed as the block
* might completely empty if it's the freeblock.
@@ -1004,21 +1018,11 @@ GenerationCheck(MemoryContext context)
nchunks += 1;
- /* chunks have both block and context pointers, so check both */
+ /* verify the block pointer */
if (chunk->block != block)
elog(WARNING, "problem in Generation %s: bogus block link in block %p, chunk %p",
name, block, chunk);
- /*
- * Check for valid context pointer. Note this is an incomplete
- * test, since palloc(0) produces an allocated chunk with
- * requested_size == 0.
- */
- if ((chunk->requested_size > 0 && chunk->context != gen) ||
- (chunk->context != gen && chunk->context != NULL))
- elog(WARNING, "problem in Generation %s: bogus context link in block %p, chunk %p",
- name, block, chunk);
-
/* now make sure the chunk size is correct */
if (chunk->size < chunk->requested_size ||
chunk->size != MAXALIGN(chunk->size))
@@ -1026,7 +1030,7 @@ GenerationCheck(MemoryContext context)
name, block, chunk);
/* is chunk allocated? */
- if (chunk->context != NULL)
+ if (chunk->requested_size > 0)
{
/* check sentinel, but only in allocated blocks */
if (chunk->requested_size < chunk->size &&
@@ -1041,7 +1045,7 @@ GenerationCheck(MemoryContext context)
* If chunk is allocated, disallow external access to private part
* of chunk header.
*/
- if (chunk->context != NULL)
+ if (chunk->requested_size > 0)
VALGRIND_MAKE_MEM_NOACCESS(chunk, GENERATIONCHUNK_PRIVATE_LEN);
}
diff --git a/src/include/nodes/memnodes.h b/src/include/nodes/memnodes.h
index bbbe151e39..e0f87a612e 100644
--- a/src/include/nodes/memnodes.h
+++ b/src/include/nodes/memnodes.h
@@ -95,6 +95,20 @@ typedef struct MemoryContextData
/* utils/palloc.h contains typedef struct MemoryContextData *MemoryContext */
+/*
+ * MemoryContextLink
+ * An indirect linkage to a MemoryContext.
+ *
+ * Some context types prefer to make the pointer preceding each chunk
+ * point to one of these, rather than directly to the owning MemoryContext.
+ */
+typedef struct MemoryContextLink
+{
+ NodeTag type; /* == T_MemoryContextLink */
+ MemoryContext link; /* owning context */
+} MemoryContextLink;
+
+
/*
* MemoryContextIsValid
* True iff memory context is valid.
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index b3b407579b..f01ad58a20 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -301,6 +301,7 @@ typedef enum NodeTag
T_AllocSetContext,
T_SlabContext,
T_GenerationContext,
+ T_MemoryContextLink,
/*
* TAGS FOR VALUE NODES (value.h)
diff --git a/src/include/utils/memutils.h b/src/include/utils/memutils.h
index 495d1af201..1a51fce193 100644
--- a/src/include/utils/memutils.h
+++ b/src/include/utils/memutils.h
@@ -105,9 +105,10 @@ extern bool MemoryContextContains(MemoryContext context, void *pointer);
*
* All chunks allocated by any memory context manager are required to be
* preceded by the corresponding MemoryContext stored, without padding, in the
- * preceding sizeof(void*) bytes. A currently-allocated chunk must contain a
- * backpointer to its owning context. The backpointer is used by pfree() and
- * repalloc() to find the context to call.
+ * preceding sizeof(void*) bytes. This pointer is used by pfree() and
+ * repalloc() to find the context to call. The only exception is that the
+ * pointer might point to a MemoryContextLink rather than directly to the
+ * owning context.
*/
#ifndef FRONTEND
static inline MemoryContext
@@ -128,6 +129,12 @@ GetMemoryChunkContext(void *pointer)
*/
context = *(MemoryContext *) (((char *) pointer) - sizeof(void *));
+ /*
+ * It might be a link rather than the context itself.
+ */
+ if (IsA(context, MemoryContextLink))
+ context = ((MemoryContextLink *) context)->link;
+
AssertArg(MemoryContextIsValid(context));
return context;
Hi,
On 2022-05-24 12:01:58 -0400, Tom Lane wrote:
David Rowley <dgrowleyml@gmail.com> writes:
On Tue, 24 May 2022 at 09:36, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I think probably that could be made to work, since a Generation
context should not contain all that many live blocks at any one time.I've done a rough cut implementation of this and attached it here.
I've not done that much testing yet, but it does seem to fix the
performance regression that I mentioned in the blog post that I linked
in the initial post on this thread.Here's a draft patch for the other way of doing it. I'd first tried
to make the side-effects completely local to generation.c, but that
fails in view of code likeMemoryContextAlloc(GetMemoryChunkContext(x), ...)
Thus we pretty much have to have some explicit awareness of this scheme
in GetMemoryChunkContext(). There's more than one way it could be
done, but I thought a clean way is to invent a separate NodeTag type
to identify the indirection case.
That's interesting - I actually needed something vaguely similar recently. For
direct IO support we need to allocate memory with pagesize alignment
(otherwise DMA doesn't work). Several places allocating such buffers also
pfree them.
The easiest way I could see to deal with that was to invent a different memory
context node type that handles allocation / freeing by over-allocating
sufficiently to ensure alignment, backed by an underlying memory context.
A variation on your patch would be to only store the offset to the block
header - that should always fit into 32bit (huge allocations being their own
block, which is why this wouldn't work for storing an offset to the
context). With a bit of care that'd allow aset.c to half it's overhead, by
using 4 bytes of space for all non-huge allocations. Of course, it'd increase
the cost of pfree() of small allocations, because AllocSetFree() currently
doesn't need to access the block for those. But I'd guess that'd be outweighed
by the reduced memory usage.
Sorry for the somewhat off-topic musing - I was trying to see if the
MemoryContextLink approach could be generalized or has disadvantages aside
from the branch in GetMemoryChunkContext().
For back-patching into v14, we could put the new NodeTag type at the
end of that enum list. The change in the inline GetMemoryChunkContext
is probably an acceptable hazard.
Why would we backpatch this to 14? I don't think we have any regressions
there?
Greetings,
Andres Freund
Andres Freund <andres@anarazel.de> writes:
On 2022-05-24 12:01:58 -0400, Tom Lane wrote:
For back-patching into v14, we could put the new NodeTag type at the
end of that enum list. The change in the inline GetMemoryChunkContext
is probably an acceptable hazard.
Why would we backpatch this to 14? I don't think we have any regressions
there?
Oh, sorry, I meant v15. I'm operating on the assumption that we have
at least a weak ABI freeze in v15 already.
regards, tom lane
On Wed, 25 May 2022 at 04:02, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Here's a draft patch for the other way of doing it. I'd first tried
to make the side-effects completely local to generation.c, but that
fails in view of code likeMemoryContextAlloc(GetMemoryChunkContext(x), ...)
Thus we pretty much have to have some explicit awareness of this scheme
in GetMemoryChunkContext(). There's more than one way it could be
done, but I thought a clean way is to invent a separate NodeTag type
to identify the indirection case.
Thanks for coding that up. This seems like a much better idea than mine.
I ran the same benchmark as I did in the blog on your patch and got
the attached sort_bench.png. You can see the patch fixes the 64MB
work_mem performance regression, as we'd expect.
To get an idea of the overhead of this I came up with the attached
allocate_performance_function.patch which basically just adds a
function named pg_allocate_generation_memory() which you can pass a
chunk_size for the number of bytes to allocate at once, and also a
keep_memory to tell the function how much memory to keep around before
starting to pfree previous allocations. The final parameter defines
the total amount of memory to allocate.
The attached script calls the function with varying numbers of chunk
sizes from 8 up to 2048, multiplying by 2 each step. It keeps 1MB of
memory and does a total of 1GB of allocations.
I ran the script against today's master and master + the
invent-MemoryContextLink-1.patch and got the attached tps_results.txt.
The worst-case is the 8-byte allocation size where performance drops
around 7%. For the larger chunk sizes, the drop is much less, mostly
just around <1% to ~6%. For the 2048 byte size chunks, the performance
seems to improve (?). Obviously, the test is pretty demanding as far
as palloc and pfree go. I imagine we don't come close to anything like
that in the actual code. This test was just aimed to give us an idea
of the overhead. It might not be enough information to judge if we
should be concerned about more realistic palloc/pfree workloads.
I didn't test the performance of an aset.c context. I imagine it's
likely to be less overhead due to aset.c being generally slower from
having to jump through a few more hoops during palloc/pfree.
David
Attachments:
allocate_performance_function.patchtext/plain; charset=US-ASCII; name=allocate_performance_function.patchDownload
diff --git a/src/backend/utils/adt/mcxtfuncs.c b/src/backend/utils/adt/mcxtfuncs.c
index bb7cc94024..d11495b716 100644
--- a/src/backend/utils/adt/mcxtfuncs.c
+++ b/src/backend/utils/adt/mcxtfuncs.c
@@ -193,3 +193,122 @@ pg_log_backend_memory_contexts(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(true);
}
+
+typedef struct AllocateTestNext
+{
+ struct AllocateTestNext *next; /* ptr to the next allocation */
+} AllocateTestNext;
+
+/* #define ALLOCATE_TEST_DEBUG */
+/*
+ * pg_allocate_generation_memory
+ * Used to test the performance of a generation memory context
+ */
+Datum
+pg_allocate_generation_memory(PG_FUNCTION_ARGS)
+{
+ int32 chunk_size = PG_GETARG_INT32(0);
+ int64 keep_memory = PG_GETARG_INT64(1);
+ int64 total_alloc = PG_GETARG_INT64(2);
+ int64 curr_memory_use = 0;
+ int64 remaining_alloc_bytes = total_alloc;
+ MemoryContext context;
+ MemoryContext oldContext;
+ AllocateTestNext *next_free_ptr = NULL;
+ AllocateTestNext *last_alloc = NULL;
+
+ if (chunk_size < sizeof(AllocateTestNext))
+ elog(ERROR, "chunk_size (%d) must be at least %ld bytes", chunk_size,
+ sizeof(AllocateTestNext));
+ if (keep_memory > total_alloc)
+ elog(ERROR, "keep_memory (" INT64_FORMAT ") must be less than total_alloc (" INT64_FORMAT ")",
+ keep_memory, total_alloc);
+
+ context = GenerationContextCreate(CurrentMemoryContext,
+ "pg_allocate_generation_memory",
+ ALLOCSET_DEFAULT_SIZES);
+
+ oldContext = MemoryContextSwitchTo(context);
+
+ while (remaining_alloc_bytes > 0)
+ {
+ AllocateTestNext *curr_alloc;
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* Allocate the memory and update the counters */
+ curr_alloc = (AllocateTestNext *) palloc(chunk_size);
+ remaining_alloc_bytes -= chunk_size;
+ curr_memory_use += chunk_size;
+
+#ifdef ALLOCATE_TEST_DEBUG
+ elog(NOTICE, "alloc %p (curr_memory_use " INT64_FORMAT " bytes, remaining_alloc_bytes " INT64_FORMAT ")", curr_alloc, curr_memory_use, remaining_alloc_bytes);
+#endif
+
+ /*
+ * This might be the last allocation, so point this to NULL so know
+ * when to stop looping in the cleanup loop.
+ */
+ curr_alloc->next = NULL;
+
+ /*
+ * If the currently allocated memory has reached or exceeded the amount
+ * of memory we want to keep allocated at once then we'd better free
+ * some. Since all allocations are the same size we only need to free
+ * one allocation per loop.
+ */
+ if (curr_memory_use >= keep_memory)
+ {
+ AllocateTestNext *next = next_free_ptr->next;
+
+ /* free the memory and update the current memory usage */
+ pfree(next_free_ptr);
+ curr_memory_use -= chunk_size;
+
+#ifdef ALLOCATE_TEST_DEBUG
+ elog(NOTICE, "free %p (curr_memory_use " INT64_FORMAT " bytes, remaining_alloc_bytes " INT64_FORMAT ")", next_free_ptr, curr_memory_use, remaining_alloc_bytes);
+#endif
+ /* get the next chunk to free */
+ next_free_ptr = next;
+ }
+ else if (next_free_ptr == NULL)
+ {
+ /*
+ * Remember the first chunk to free. We will follow the ->next
+ * pointers to find the next chunk to free when freeing memory
+ */
+ next_free_ptr = curr_alloc;
+ }
+
+ /*
+ * Store a pointer to curr_alloc in the memory we allocated in the
+ * the last iteration. This allows us to use the memory we're
+ * allocating to store a pointer to the next allocation.
+ */
+ if (last_alloc != NULL)
+ last_alloc->next = curr_alloc;
+
+ /* remember the this allocation so we have it in the next loop */
+ last_alloc = curr_alloc;
+ }
+
+ /* cleanup loop -- pfree remaining memory */
+ while (next_free_ptr != NULL)
+ {
+ AllocateTestNext *next = next_free_ptr->next;
+
+ /* free the memory and update the current memory usage */
+ pfree(next_free_ptr);
+ curr_memory_use -= chunk_size;
+
+#ifdef ALLOCATE_TEST_DEBUG
+ elog(NOTICE, "free %p (curr_memory_use " INT64_FORMAT " bytes, remaining_alloc_bytes " INT64_FORMAT ")", next_free_ptr, curr_memory_use, remaining_alloc_bytes);
+#endif
+
+ next_free_ptr = next;
+ }
+
+ MemoryContextSwitchTo(oldContext);
+
+ PG_RETURN_VOID();
+}
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 87aa571a33..12ed69408b 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -8093,6 +8093,12 @@
prorettype => 'bool', proargtypes => 'int4',
prosrc => 'pg_log_backend_memory_contexts' },
+# just for testing memory context allocation speed
+{ oid => '9319', descr => 'for testing performance of generation allocation and freeing',
+ proname => 'pg_allocate_generation_memory', provolatile => 'v',
+ prorettype => 'void', proargtypes => 'int4 int8 int8',
+ prosrc => 'pg_allocate_generation_memory' },
+
# non-persistent series generator
{ oid => '1066', descr => 'non-persistent series generator',
proname => 'generate_series', prorows => '1000',
On Wed, 25 May 2022 at 15:09, David Rowley <dgrowleyml@gmail.com> wrote:
I didn't test the performance of an aset.c context. I imagine it's
likely to be less overhead due to aset.c being generally slower from
having to jump through a few more hoops during palloc/pfree.
I've attached the results from doing the same test with a standard
allocset context.
With the exception of the 8 byte chunk size test, there just seems to
be a 3-4% slowdown on my machine.
David
Attachments:
В Вт, 24/05/2022 в 17:39 -0700, Andres Freund пишет:
A variation on your patch would be to only store the offset to the block
header - that should always fit into 32bit (huge allocations being their own
block, which is why this wouldn't work for storing an offset to the
context). With a bit of care that'd allow aset.c to half it's overhead, by
using 4 bytes of space for all non-huge allocations. Of course, it'd increase
the cost of pfree() of small allocations, because AllocSetFree() currently
doesn't need to access the block for those. But I'd guess that'd be outweighed
by the reduced memory usage.
I'm +1 for this.
And with this change every memory context kind can have same header:
typedef struct MemoryChunk {
#ifdef MEMORY_CONTEXT_CHECKING
Size requested_size;
#endif
uint32 encoded_size; /* encoded allocation size */
uint32 offset_to_block; /* backward offset to block header */
}
Allocated size always could be encoded into uint32 since it is rounded
for large allocations (I believe, large allocations certainly rounded
to at least 4096 bytes):
encoded_size = size < (1u<<31) ? size : (1u<<31)|(size>>12);
/* and reverse */
size = (encoded_size >> 31) ? ((Size)(encoded_size<<1)<<12) :
(Size)encoded_size;
There is a glitch with Aset since it currently reuses `aset` pointer
for freelist link. With such change this link had to be encoded in
chunk-body itself instead of header. I was confused with this, since
there are valgrind hooks, and I was not sure how to change it (I'm
not good at valgrind hooks). But after thinking more about I believe
it is doable.
regards
-------
Yura Sokolov
Yura Sokolov <y.sokolov@postgrespro.ru> writes:
В Вт, 24/05/2022 в 17:39 -0700, Andres Freund пишет:
A variation on your patch would be to only store the offset to the block
header - that should always fit into 32bit (huge allocations being their own
block, which is why this wouldn't work for storing an offset to the
context).
I'm +1 for this.
Given David's results in the preceding message, I don't think I am.
A scheme like this would add more arithmetic and at least one more
indirection to GetMemoryChunkContext(), and we already know that
adding even a test-and-branch there has measurable cost. (I wonder
if using unlikely() on the test would help? But it's not unlikely
in a generation-context-heavy use case.) There would also be a good
deal of complication and ensuing slowdown created by the need for
oversize chunks to be a completely different kind of animal with a
different header.
I'm also not very happy about this:
And with this change every memory context kind can have same header:
IMO that's a bug not a feature. It puts significant constraints on how
context types can be designed.
regards, tom lane
On Sat, 28 May 2022 at 02:51, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Given David's results in the preceding message, I don't think I am.
A scheme like this would add more arithmetic and at least one more
indirection to GetMemoryChunkContext(), and we already know that
adding even a test-and-branch there has measurable cost.
I also ran the same tests on my patch to binary search for the
generation block and the performance is worse than with the
MemoryContextLink patch, albeit, limited to generation contexts only.
Also disheartening. See attached bsearch_gen_blocks.txt
I decided to run some more extensive benchmarking with the 10GB table
with varying numbers of BIGINT columns from 6 up to 14. 6 columns
means 64-byte pallocs in the generation context and 14 means 128
bytes. Once again, I tested work_mem values starting at 4MB and
doubled that each test until I got to 16GB. The results are attached
both in chart format and I've also included the complete results in a
spreadsheet along with the script I ran to get the results.
The results compare PG14 @ 0adff38d against master @ b3fb16e8b. In
the chart, anything below 100% is a performance improvement over PG14
and anything above 100% means PG15 is slower. You can see there's
only the 64-byte / 64MB work_mem test that gets significantly slower
and that there are only a small amount of other tests that are
slightly slower. Most are faster and on average PG15 takes 90% of the
time that PG14 took.
Likely it would have been more relevant to have tested this against
master with 40af10b57 reverted. I'm running those now.
David
Attachments:
Sort benchmark PG14 vs PG15.odsapplication/vnd.oasis.opendocument.spreadsheet; name="Sort benchmark PG14 vs PG15.ods"Download
PK y��T�l9�. . mimetypeapplication/vnd.oasis.opendocument.spreadsheetPK y��T5w�V� � Thumbnails/thumbnail.png�PNG
IHDR � � ��� `PLTE$$$+++333;;;CCCKKKSSS[[[ccckkksss{{{������������������������������������������������ ����J�: IDATx��]�r�8��KE�q�����`�$wlx\e�cv��h���@<�|��_����_��;p�x�o�y���o�V��l���pf�6GP��S����_|�4�f�3�{�_���sf��u��0�J�Z{��l���p^��n
����:D�A�7���e��h������WU�=��.��!�rU������-��:o;3����{�<���9���D�8�����6Px�B6`E����
�O�� "����
�IHG��������)������ z��#��i����Sn ��>����q2�/����Pe�b�`� ��������(_���'���a�N6�*�d�l~KH���pP�~U����A��Um!���p��!)�f����?���n|����F�RW��(3�{uTJ������w.8�
��$1�w��}�}�qH�s��z�y�]\����={�wV`D����M�p6�C��hT��/pB�W�)b�id�sx�o8�yK���y��M�� j �^��H w-A��6*���@� �wL)H{fF]�1&��8V$�9'%~2�["&$qp]D�b:��+!QA8H|<.���\��!@�*s��lu � ����\4�cp]�����!�rn�:W.Js�1j���Q��S��i��P���:����pu�,��#�x��A�o$��Kxp�J�������l� t9�t�}�%���C�&��h���4k���F/8��v8�5����'|��8F�Y�a���v��}SO�_Y>���u��}��y�F�u��04]{r��J�?�$��u�s�*P��a��Cv�����H���@&��b���l��Y�����8!��[���v S���(5�a�w�s����I���7)� F�D��p6�&�}��v��f<}�Y7!���m��I\�"�h��#�e���� 9��i�V�4q��t�an]f���"g�%��BV&|T��@����"���6�5ej���d8����#$z��S^�)<�T`��B6�|���6����\p.8pj"�]H�����r�-��Mt.�>�:+���?'�u��$���u��:L%����;A��{�'C�#��p�@}����0��o����uOT��.��������d���aj��k�j�f��h�F�-� ����$$����Y�-+�l3:�]�a/�������&y5�]$���I��&=����Lf��w%�����u�����G�����Ir�&��w��~�3�j��j�p�m/��%;�EM�8�X'P�Lj�C����I��z�C����^
zeoQ�>4t<X�w�47l��p4d.2H���1�p\�h/�s���\p�I8���#�w�:������yQ�Y����n���O���lK� !�����$*W�P���F�"z3���V:M!��`S��~��0�H�>MrT#kM�����;�$9��A���o��X��g[���l8}U��C����F9|���!S6��R3"N�p\��q�>��#��Md�rUpu�x���-�aj��m�}m�>�t\��g,f�la�=��WG$�O�gQwQ�P�FqE���h�H��d�M��<�9y�����-9��;c���R8�4>I��W�'{���V����w9RDe�"^���@!0��g�����s����+I�FZ�b���E�,�7 )4f}��8�4K��J/r�����T��S�a�����u#��?f���ma�iO�� w�PHq�������DLU���"j�����r������������b��>��m�E:@#��g �����ErJ9�0Ud�����fW��d��|��}1L5����I��H��G�)YC0�1bj
�w���P��������x�L�B�)\c�6�"�?8s���=uy�y�"�F��� �'r6dY��5v�U.l@n���S(I�Pb�d��9F���9������������4�O�}���������'k�8�6��_�h%$)#$Kb�t��[��f<�#+`^�r)��
2�_�%�WT���J�z���{�STD>��DLQ1�p#x��nx&P��gru���.)d�o�������M<,9�S����HaN���Fu==$S��W58���3�{���jm�n&1�����G�s�������p�$�n6���l#����a��oy�
]�.��
S���h[�TN������;n�1�l7|b{���3{����O���o����o�9wF�^��]��F$�g[u�#�Q1���Ar
VL�����:�������V�XY ��u�����s�����R�K8�p�%������$�P��C���l�%ys9��P���SI�0a+b*�mg���!#���CI�@��9lY?MY"=�-y(��h�VdP��G�T���!i����vs�`���8�7����&5����W��7)�Z�ruj4s�=�}��Q��^$g��
Sc�z��������F"����@Hq������)��Ml�(�`����EL����M�B��#$�����7#Go�H�����
S�!����W���S�b�2�g�����7;Z<H2)$��iY`��B�Y��s���������s��g���}c���'����������Q��d1Xq�^� �H��tpC)�$
j�4RK$
A4��$v�����i��8�5Z�@�L�F#UD��!9�8��M�C5J,L�h�\x���atG�S�b����Y2?�����������u{� � ���k$��Nd����y����7Z���P�����aj�g�F�~P���{�T������|T|��`.�m���]D�D��1�K�8[rOg\����ah�"kP�:h�g��3�1C,PN?���P����[�=�x��o���[}��:B�y
{�������S��f�E�����e��T�Y�����e��#g2��Y�XA,A������a���RKP��;����}m�$5=���l?��[�9�|������T��-�|~#K��� ��^�s����)��v�B�vP ���l��oM�k����4Je.���h_"�l���*����R5Q��X*��b�_o��t��U��������C 6�6�x8�c���Q*�~���{*kC(��/��q���|8�N��(q��)���;������i��%@����ZQ�\�T�h(�t��%�Q�N�d�R�hwG*�'���3�xs��g��`A|��#�O}K*_]F ��s�Y_�F�:�S
A�1�E�f&��?S�:����
�� ������A��mx����"������k��b��l4-��������[S�hw���`h�{{��QM�%�n����W6d�uz���{*_8?�/S0��mF����bL�����Wp�����bz%9_D��9R�������������o�|��^���E��*���w��Ee���/�s���\p>�$�v[�L�����a�>'�%2Q%�����<�D��4_6���!�X��N
���g���e��!o+����$R{ip(�W�
�t���>jL��n]S�l;��h��K����������������������&���H��g���v�9�!r�'��8�,���}���}�b��aj�^�#�oi�����C�1���j�m���:iT��~��E�����E<S��~[I��`)�w�d;{�@�&)Yd�u�����~�����������+�7�W8�����(G�o�Ze�I���q�_d��/�<����Z�9;}^|�TI��N>��������C�UM����a�K�F��'�~�����s��d0r^������6���!��(������b �[p�[=����:�Q�T�V����&6�w|�I�a�*T��/�% ������~DL9�g��0N��:�S��o��]y��z������l��>N����qPn����_
���ex�=��L���
�����+���b�}��
��y���;���NY��3��Q��xz��J�#�s�i���W�a�i�,c�A���C�1��H��z�l�XS���T����S��=�����tm�O��g�^���.�=E>xd���H�[�����|�:e��\u����y�!�,~NEBW��������.E�"�f��~~*__�|! ��->�]��~���9��v�3'?I��T��
S_��S\;/���`1�yg����\��P�{�W�#��|g*�[�zAJ]o��ex�``�����#C�L!���������P�T�}���]c�(Fj�r��,J��|�]PG��u�d�%��+�T�>i����'�!��"u��o���Z�%��Q���jaT
��R�;����d��DF���gN/��zd#:�.8�_2M~-~6���D6���c}/��|����M��c��MS�a�����/������T�
�C8R1~�1�0L��;�>|8;b���x��;��lzi���U�#��;/��8�=Qm���o�c�P�pu����g�-G�_i�4����2M���cQ)�>�(.�s���\p8�
��;�����c��h+gp�b�kaL����UZ���V_8P,��������������R7�UJ�R���v5��}��"�!v;����Mq�p�gbQnEL����s��"���
�Y����`����
\i��9:�y��a���S�{,9��B�T��Hr��H}�l��&p�V�����1E�"v���gs�y��A���F��8�.��Z� �w���{~��9��yk��w��qHu��:��<e���ig��T�A5u���k�q��~r����|����r���\p����� ����_F_>}}����w|���2�y�
�4j���lVS��Ai���_(���.O�P�K�.��R{p���2?�}����N����6��!�Xo�MC{���{�2���fMU����Yz�>��&C���p�%'/�%�;��.l�����/����"15��H���J�7�]��Uly�������z��&Q*U*8�]�c��1LIk���2�=�4}Y,�)��y(�X��<{����#��iW��4U�d|��1��l��zW���N�<��wd�BY����������W�G#��.5z���\p>�8m&�ZS� x}��MS\x���s�H�I���tZC�0���z��CVfX�W���m$����R����V\��$]L���W���j��U{������|�Dzx^R�s����m %%���~�_��E6�p����Z[6����%�
1�8U�����n6���T��}�a��$
$5Y�?���[���� ����5x]p�����:��������l�.����d�+��N�k�����ES]�m������iWc��2��vvl9�f�y�*������zE�EAF�b��4�w"��K�\p.8��w�@�
�K^�T>�R����"�����s��^0���������G��}Cu��;x)fH��4L�|��(�OGZS��)�p,��
S�T>UyW-*�L�u��}��T�r����S����m�'5�]�?k=�������X��`���V*_�6���g��p��2L���{�>u�#8���2X�\���k�z}2��~��)}�V�D�����a**C�aX�����0�N#I��-p��T��)Uh��~L;���0�x�>Rz>j�}��M���SN}���t������g�x��-�S����)�)���RL�Q%H�\ o�S������#i]��i��Um�c
{������R�S������@����Ht@p���L�p�^M��(;���l�>!�!�|��ISj��,��x�gi��>�
,p���E�Lx��^�������TU��0L)=�0���0e����eYw�)>�%����9�����*��>p�/�4��#2L�qz��1 E�SYR�z�WjX����&7 �'��aoj������1��c��W�0��C��OG�y�����M+���
��
�[��p�;�fq� �[�^z��s���<��9M
X`P���M��q�yD@j�n1���<~��A�$�����?
i�@i�j����Y����g4�:a;P��T��7+�I�.b��I#�1����
���c����a�
�*Pb�W?_V��:��g�1�������[G��)�X-�/B=V��>�v�Z8O&S�C"9)�n�9��6�Y���
����p�Su��DL�L�!R��O��,�|l�����&����������fr>�z��:fQ�r���T��=2��������rW���ly�`��k���F#���+�}���zw������ ����mU(|��C�,���P���_��)��8�'�m�%�Ga���!/5z���\p�����)�����o��7I��$QH���E�������F���x��TO&�HY��y�7�[z'k<��Z����Q%�3�\S�
�'[Az&,x�
PqN�s������bS�A,dg�i t����eNW%C���z�l�EKj����_HN���%'P�����y�l� a���0u���������iO�(~��� p����G�5��)�l��a�����)�<��h��g��.dr�1�������~/��~���M��-S�8�==bj��`*�S�����}�F�q]
�� 2X�Q��w�i����\p.8����{�'�{� IEND�B`�PK y��T
styles.xml�]�o�8�~���-p�d+���������{��`$��V�����7$E�%��_��-�9�����r_�y��=f1���x:��#:�%��f����������^S�#^��Ir#��>�G08����f��pIQL�e�/�����e�z)�R-�Y����8���w��-�Ew�g����.C���-`Z����b����� B�T�x�I��f��<Z��f��l� e+s�X,L�� �dtQ�|I�:&���,6����i�Q_�mQ�0 �0�
��f���H@]�������~����W-0;k�z��$.����w�-�
_��wn��N�����_���\����H�[ME]O)�DT�Kqg�ua�����|���@�t�;�w2�i��MM�0��pyM�����/M�#�x&��?�:�,T�<��CU�j�s�FR�6!l!h�{�7/J����)����s��2M o�l�*[<���
;~�0#��r�����1>�2]C
J��`_G[�R#J� 6HNF�eat)����M����)���@�L�),����^9=
���l������*�e�#�,����O�d� �?�v�{���'w�4�=[�%��%���(���"�X�S��Q���7V8�!t�
��ED����f�3�����'��@�G�����)2�%a�0�`p����U(�&���ei�*��.�P��E���J*��`�k�1�b(ZD��*?����F�Kb�BQ�W�!�Q&�lq6�.}���^��g ���U�`V]C�h>���Z��u����lE���n���TYg]�~/-� }{H���#}.a
�sF�y|
u����]�G"{CB����z��O��V�T�RC<�C�5r�� ��17@��������;�>�"�%���B\L)��5�FI��Dz��
��43�!z�;����9q����F@]`��J�%���"-�0���Cnu<��8CY�u�[�I�n/Obl@�/����&�,��-)����
P����b�@���%{���9����B ��f�j5�>��)�V��hgj��N��)�Jq��q�h�>$���
��dEx"��xf<8�/M�Lg�|"`�pp��1�b�4\ �����2�M>��;4t��5Yz�u��a��c�B���b#�c0%Z�����=Gm��C���O�����G
����tv��;�%�*�(�n}��*��S7tv�|�*t�]�V��Y��#���6�v��1Y�i��ON���F,,��������,�W�0�P���m!�6���Q�lIU��.p����M�<�W<oT�<����
GN��9��W����$t�*��e���+��M�S(��F��u���0���@���
PVS���<>U���y��Ows��n>�2����>�}����^|���� ������������k]p� v����f�E\������>-[�\�^�]
��.�����
=���4Y����w�n���m����~vX�[����"��H5A�(0�`�+kj[�����>��g�6yo��y������C���b)���������NC�S?�p�g��]p���C������=9���[�J�:�(�Gs�����r���l7Y�����W�pV�����S��~�`�r8f9M��lR��������}Vi�I���{c'�I�M��oc[J=9$#�C�%(���m�Je���>����mM��u�ef�����=V]��X��r��W� ������)y�i����l�����AW�����r��>s���r�����].��])����o�>���I��k)|�5���`o������*������{K��'���o�v�u�����'\������b�
������>������E��������:��j?��7�Oo1�.0�/��4�N>�J�.��������c��K~j���3��sa�(p�\��>��Ki��A���N�����YU��U]�t�
E~�j�����������D!b��%F�Mv0Z� ��"=H���&,(�q��j������z}�Y���%1��G�D1��]�6X��|���!������H�����<���v��)��xK���&�}��/�� ]#�r��'�"Z���opVN�Y���Op����X�k��QL}��^�-��W����fD��F��R��v�R��E8,U�B��PW
P1s�~p��`��~RF�i����l�+n�g�[(��
�G�+��P�;�� ��XCb�qd��+����!C�=N8+�G%M|}@Y,��?���D���i\u ����`/���b�2F��3������x��$Hv��r��4[�����cP7)hC|!E�H��kx���l5R���4� ���Ia�;a��
�+?uE��MX������c�-�
�
MxI�wQ075�qEX��D�����y$�/�)_V�3����^�&5�������~��/<��5g���*�~��
�|�A�Und�����f���'�B��d����r19�d,�c�����~~�+n�g��E��.�Pr�����r�5������S� Us��������=x����!W�RX��WY�F CSF�y�u�������%�]�J��d~�h���(�T �����qc�J��X��w�����G����K����;��3j��2�3��o-s���v��8c�-�i����5Q1*�z���kQ�-��g��XQ��y���YmL[��*h��5��
�,�*����%}���v��+��D��J�w�`�p���m���j��'�W�K7!��`�����/N�_$U��j��f,5I��H!<j������=l2"� T����.
�2�_�J����i�A�Z����MnT��t��7��W���Eq�j;�/
}>��������]Z6z9a���}���n�xWk�"�)�9O���w���PK����� b PK y��T meta.xml��1o�0���
�v��$��Ru�����E��P��F� ��(�2t�����;9�]*��X�UF�(&*������>����R}: �����P��B���UY:�2�E5��R�*��q�kTc���
�E
����j
��m�.#mJH��-���|���HOPb�`!��>�C��<��z2��!��[��
��H��(��:v ]B�Xx������������c��3�:��u�^w�(1��Q�k&��Q�I�7�WY?����KTh��&����o��gXGq�-�B5���f}X��s���GB
��k��]�oPKf=��6 l PK y��T manifest.rdf���n�0��<�e��@/r(��j��5�X/������VQ�������F3�����a�����T4c)%�Hh��+:�.���:���+��j���*�wn*9_��-7l���(x��<O�"��8qH��� �Bi��|9�� fWQt���y� =��:���
a�R��� ��@� L��t��NK�3��Q9�����`����<`�+�������^����\��|�hz�czu����#�`�2�O��;y���.�����vDl@��g�����UG�PK��h� � PK y��T Configurations2/toolpanel/PK y��T Configurations2/statusbar/PK y��T Configurations2/progressbar/PK y��T Configurations2/toolbar/PK y��T Configurations2/popupmenu/PK y��T Configurations2/floater/PK y��T Configurations2/menubar/PK y��T Object 1/content.xml�][o���~�_a(8@��R�/����!��s�,����(�;�(������jR�����:;m�3����u������*�>wz^�7_��)��� ���M���y�Y~���?���������Z,�,��W��:��IVmZ��fo���0y�7�U���&]��m��V�|��u�������4�K9zz7��������n,��>�_���������n,�����N~n�dQ�����-)��b���d��������i���U��Qk��C��a���.�Q�l���[���)�����6+����y\?��h��m�jW�����uy�4�*�G�F7o/���^>����vubO�����O�P�����"Seu��f?��_U� ����N\F������Og�?�E�������,-�������`���$���tp|g���6��ap3?y�~��9[���0������4m�9X�v�pRS9��mU��a�&�d[����qw�~�����q��>��k�?�����`g� t����d�
����&�1�
� �_/�,��7�d�����m^J�n�-����e�n��\�P8/�r/������u�����z�qv����n��F5_�w�D�4�=�
��u���%G���<R�&�{�>���"K�������~���7}4w�~��H&�����u��D^��y�,Z8�t2�����p�<s�/�c�������X�e�-�,���.�:1���r�;���0��}>�������<[����"]�����~��N�+0�68��"on�!�[�����������-��/�}������d[5E��F[?v�^ biQ������
X���}n����uQ�����}�u���n(!?L��0�d���EZ.7��A<J��`����s��x���GOMv����ON����M�Wl��q�'�b>�7I��%��"-���[�C����m��m��������M��b�%���.v���g���������KR�y�\d_�uZi���T5��#���U���Ae�L����E u�BhQ�����;��B�������p��Gm���K�8P0$��n�pQ�Q��u�Yv����)`[`&�=��x�s�W=�Y��s���������=���wg>y*���)�&[O|h����u�I����a��-X��SWm�rp W ��������D�{l���W <@o*Gl���b��L���K%y��v:V���{�z��E�V��huv����e��A��K$��=~~��|�(A�`F.�Y�:�
gw������s����\�K�������y�^w�;�nL�Ej$��;Q���-jN��4���7O�a��wi���1���l��K�}���;X���<����NP��c�y.��r�Z����{���*�������s� '��e�{h��eg'����j�2|�������U�����!CI������0ad��vW�nWu��0�N����m���n�����L�f���CZ��������E)�e�����J>�"�p2�b�l��F��sg����y�n��k+P�������e�&i��G���Lq���k '�|^�N���<o��o�v���.��zP�1�NW�J��s�v5�D��*O#2D�?������*�j������QU�6��y�I�<��%X`��*�VQD��(p �jX�S�:X��|�.�:��y�uo�m]���%y>n�����]���#?�h�������T�N�N�\���%��k�m���������n��p7��?0�������f��a���bv0�y#�7��q%�qS��H�[�O��,�����T�� ���G������
\��?��S^���y�������.�GW4A���'����!7����m���c�$g6������p��g8��W��a�����;�����C�v������=�zUF(F�x�>�}��Gz��x�>
�}��'z��x�~4�q�f?���+h��k����O�4���_A31^���~����0�
�����2h��(�~���W�L����A�_Gi�+=���fz�f��6yF������SZ���2f�(�~e�;t�I����;���(�,-��EH4*Y��<�]e���4�/��.�q��^������`�^�o�l�z]= \ ����s�W������*����nZ��@�� ���u�>�U�K(A�A��&�?B�����w����$�f�s��Br�P>�*��@(�I�����I@L�(����U*z����7�T�_�g\�.qv?���
]���Lq4V���c����m�P��{�2t�N�
Z*bR+��I��� ~)��s%)��X���Ht�H4~��6V� �e����>4"����n��9����"��������oau����(b������r�M$7Th�Q+*)A#��_�����3��`�*F��e�Y�/i�Bvi�6V����ehD.������p����
! �;�Fd\�,��)���Q����}�(��if�&8f�&��Zi�$�`m}�i���c>��hHa
!�B\c:`$�'7�HF��$ ��RJM�%4�`(����1��eHE��%d�!�$�bI`�T�@j�T�A�Y "�A�"�A�@)���z��hP�-�� 7����I`3\pfa-,����#%L��W[ C�b��P�k���0o,W�h +P���x� ��U�R���\�����0�@�D���"Fb�4%*;�y�}�q��xq�J��F9�E��h �1�{�%�`(��ey�]Dkk��&��pG8c�[(��#�(� Q�RA�WI�H<�(�V�L`E��p��@�����D��6$� D~���>����2\��r��D�H�\�PLQf5��F@4.������d��.��I ��qN��A�-��I��4�R
��(���%����
��C��0�J��\�S8#(��'n������#$"a0
�&�C�$�������,�]�f����WP�A���_�3>��1���p�m�#�, ���'�=��@PD� PK�k�$���I`��RC�U��gMf�B8�kqs6��I���<
��-� �G��P�X��$Sq���PwQ�gbNt�04<f5L���A����7C%�l.`�e�q��l��HB�����ZH�v���q�C���)������"�����8���'"(^20�(}��$�c(�
�����!#��x�(��%�
S2��
%��$@������Gbn�{�&��&p?�� j!4c
a��P����w[Nl��W �cC
�1����B��x�X
a�YW.��P����A
��C���h(^RNIM���pJ�P<? ��%�7F���*��_#�JH(�8��g6��xA@�����E,�s]8&����4��x@hA��BA��� �b>����e�[C�H/����onP�!�2e�k)�I����R���2j4NF��
,s�CB��N�Ji1�VPk("�{-
!E,������q��b�w���B�-�t�~����xp/���A�P��V��;AY��x���f�5���X�����q��O�Z0@~���
��UA�u�'�z��F��rO����k��C� �c�R��
��@P��'�K��TP�G1��v�/���s�%������n5��D,����X��
B��
��*?.��(�2�,�
W�������G1�_����5�������������&�2�d���F`(^1� �-\���A�� ����.���c(b9��-#�
������!(��a��ZE�������2��hpu���8��x~@��`n.�r]�;�bEB�� �!]�����-�ey��9 �[>��I` �3(��#��h�#�P�+M�rP��\R����P}/��P�(��� ���P ��E!���JH�����">��F���Yb(!$��&��P��gU�0�W|�~���v�w��s������
�w���PK��1�� �y PK y��T Object 1/styles.xml��K�� ��9�KYc,+��)���LM� $SA�
�#� KA��PYJ����/�/c#w=�F��d�����bP
U_�����[�r�t����X�pe��?%7;'V�L�K�iE�a��
7�2-W���4 ��?�Y�<�������=����������C���.����T�h$� 1hZj���P?.����`<�~(��k��N'���������*���`���������G-��2��������nT'�7��ez��ru|�E�z�c= ,�����C����<}G����\G8{�3*��h��q9v��/�Lk����b�[�v9H�>�.;��mo�����[g��e�������k>�>|^�����p�Vm�T�p������3KW���J��2��S�
n�����k�MTYy�{L���o���j- .��Y��� 5 ��Z�W�kLa������
�����=N�����L�
����Ct�PK�P�� � PK y��T Object 1/meta.xml��AO�0��~
R���\F�b<-���7��7V���2��B��b<����������J����Z%$��P �KU$������}z��I
d�m���Z�
��aS)!�QL�F6L�
f�5���-i�������3!gkk�u��T��(��Ug4W�nM��\ �8&4�fv4����.�������]�*60�g�0y^�����a0�������!���?C� D��[�hE�y�x;�n�I�h��Ox�-
�#]=�j��}���o�d��(,����~���O�~PK�-�� B PK y��T settings.xml�W�R�@}�����A\R������jt/o��!SN�Ss!�_���*��\���9�����k,@H��o��6�p��}�yz{���|��lF}p�u\�HP�,�
��K'��[Zp����$�(���v��~��&�F�����*;��$�ir~�bn�u�]{=�]�#����T���T��F�n��Y����;{�� �I��[��?�m��� J�il������t�7���}��y��z��ck;�V���\Y�f����=�T=�?i��"����t��i�;����������DC|�,�Wd�*a�O`��[�(L���T������r�"����0 �(�� r��~�&OF�
��>�2%tE��8
nFS�bf~�c,���CI ��Wzm��y�"W���t9��8!R��� ��D�S�~����e�,���}\�;"�@M�W��!��.�4Y�r���VWgrCs���M:}��MR�����*�#h�����?�CY�%�����C>O��9o���C�?��ZYC��D
��<�����Y�>.���t��"�t!_��d�? ��0]�W�MO�U��h����Nd�St����)�r3���`rg�������y��,\�a�
G����2�q�1������������>F�1[=K7D��G:2�M�#�b2zj���XR�J�b{���]�/p�PK���D� G PK y��T ObjectReplacements/Object 1��{pT��OB"(P^���yAx�C��1�@h�V-�AD|���T[�#�������|tJ;8�v���5�m�Xg*�a�#�1���L_N�����~s.���-�������w�������9�����rQn������#�23��oe�m��#�4w3n�1�lS�m������zZ�5����5��{������<iu77���9��~��"��@����v4����F&����c���Z;�-���LZ*e���_�ww��b�~���/vs��^/�;h���k������'�|����m��d�~����=����cCw�;7,��\��=�+�^����L�sy8��4�k���Z�b���9��������_�
S�8��Ah_J���#^J��s1�5�\"�:�MG�dxO��6�tuq:�&�{:�#�:�MG�dxO��:��)m:�&�{:z��R���H����=Q*uJ�������c�R���H�����\*uJ��� {:v��u���##)�t$%��p��?��:4��^��������M<G��WK5�^�u�}�w���8
��P�����_�����l�?�p/Bz��CJ;�E�P7>���I�7{�Y7��R�3 i��z�b�b�%^��n���vg �����n�������I��<@jw �y\@�_ Ut�������Cy���@Z���n�I����![�k�� �����q]y�T���[lu���vg ������RE���+�:��R�3 i��Z�,U������&i�� �����q-9C�(���d��k�� �����q��&U�3�{M��R�3 i���1Q��KJw7����vg ���4y�T����l���R �;���
hQ����.����I�<@jw �y\@H
[/�:�V�C(���Hk���K��Jr�R�3 i��BRx��^�g%9�� �����q!)�����$�P �;��<. $��M���$�P �;��<. $������$�P �;��<. $�R���$�P �;��<. $��S:�I�<@jw �y\@H
���ZI�<@jw �y\@H
�V{}�I�C(���Hk������4Ir�R�3 i��BR���^�n����vg ���U<�f@H
+�y��E�C(���Hk��7R��F�C(���Hk��WS���$�P �;��<. $��O��I�<@jw �y\@H
�O�CYI�<@jw �y\@H
��ucV�C(���Hk��o�����$�P �;��<. $��-Ji6� f}���vg �������s�6�����vg �����pW��'2�By���@Z�������zG�$�P �;��<. $��'{��Q�C(���Hk��M�x�C�$�P �;���
��k� !)���[$9�� �����q!)|�
��I�<@jw �y\@H
����}YI�<@jw �y\@H
�|����$�P �;��<. $�������$�P �;��<. $�w�����$�P �;��<. $�X��OI�<@jw �y\@H
�n�zwV�C(���Hk��K�z}m�$�P �;��<. $�]
^��Jr�R�3 i��BR�<���FI�<@jw �y\@H
[�x}�A�C(���Hk���S���N�C(���Hk�T����� ���{����$�P �;��<. $��]�u_�$�P �;��<. $��o��Z�$�P �;��<. $�W��::+�!�H�� �5�I��.��$��Cy���@Z����������$�P �;��<. $�g����5�By���@Z�����;��W�%9�� �����q!)�>��Y5�By���@Z����Nj���i�By���@Z����V�x� �!�H�� �5�I�)=� �!�H�� �5�I��)}�A�C(���Hk���3� ���{��Q5�By���@Z����v\���I�<@jw �y\@H
�����I�<@jw �y\@H
g����jI�<@jw �y\@H
��z��Jr�R�3 i��BR8n��{j$9�� �����q!),[��}��k�� �����q!)�{J7�Hr�R�3 i��BR�����Jr�R�3 i��BR�����Jr�R�3 i��BR�/�l����vg ��������^�5Jr�R�3 i��BR�g��K$9�� �����l@�P ������xF�C(���Hk����x]P#�!�H�� �5�I�����Jr�R�3 i��BR����*#�!�H�� �5�I�K)]\-�!�H�� �5�I�)=5#�!�H�� �5�I�ORzE�$�P �;��<. $�O�{�)#�!�H�� �5�I���{m�����vg �������&��I�<@jw �y\@H
����\�$�P �;��<. $������zI�<@jw �y\@H
������By���@Zs6����������I�<@jw �y\@H
���g$9�� �����q!)��F��e$9�� �����q!)�����+#�!�H�� �5�I����z~�$�P �;��<. $��Y���I�<@jw �y\@?Y*U��������Cy���@Z����~x��;�Ir�R�3 i��BR�m��9YI�<@jw �y\@H
7���j�$�P �;��<. $�����V/�!�H�� �5�Ia��^�����vg ������}��L�$�P �;���
���o@H
����S��By���@Z����~��^3�By���@Z��������}�k�� �����q!)��=^�3�By���@Z����ny�W������vg ���������n�Hr�R�3 i��BR�f���I�<@jw �y\@H
�������By���@Z����.l�z^V�C(���Hk���Y^�S%�!�H�� �5�I��^�����vg �������*�������vg ������`JG�����vg ��. 7������c}����V���^�lm��(w���?3��I��vl�x�tm�{���{�{U�+�f�wDr����*3w�_}��YBec��/��S��J�dL�Ym�'���^��h�����|������������#���]���U��Mw��3��iLM������~��Lv{i��m����\-�������������~����p�}#���qB�x���XhZ��7��������G[�{�G��}��<�X��*<��v��bd�"��������p=�qQ��*�]�3�g�����E}2������
=�}�zd����������
-2w������y��<���G���r���t�|�����d�����d�d����wx�[K�}������'���n/M��M�;��e��'����\|��kK���M� xr�`��$c��0��������w,�]�%��4���]��$�w�m��v>;e��;���g�`Nv�����{��g�����Z�@���+�
ph+q�s��G>a�%����T�\5���IzP4;uP4�x8(Zw�/U�$�j��}��P����c��u@TC��_B������p TCv`����c��~.�iK���|�&{De�����c�����c���{L�x�M�mI~.�?����������`_q�(�� ��fH���K�P��q<pEU�WLlpO�9 ��:��y�T u��WT,p��������m�*����J�k�8���`�+�6�����{�Q��67J�P��q<pEU�WLq�s�4��gJ���L���1���*X��)"�
��h�*����J�k�8���`�+��8_z���T -�#UB];���U\1E���4�g�I�����*���x������"��?����J��m�}������<��l2������
�F���>�����������v�aw���:�O����t��+j�t���F��*�]����r'��F��Nvo1��K�m�,�3M��-���JK��ts�6���u�:XK7����--��\�U
X����t��Wi����`.��M�t�#��Z��}��ni����g���}k�]�F����Dml���'n,��*d�6[��S���Q�Y�=��%\`���%��:�W���X��rG��{,������}`VwK�t��Ww`���]Z�C��������3�cY���K��]���V�EV������t�y����0W�
f�]�[�ku�����vB�|���&�����g���J7��gF]�C�D����U��x�����sZ�:��tZ�E�����r��l��d������{Xm~d�!#��5v��l�����������7$m������[k�)��_��B��#���[�U������"������e�}���_d���_�[��b��_�����;��U��\�7�{vk�`������*,����q�1���Kv��d;MS�Y6���v�|�1���9��;���������i�<g�;�������q��>S��wk���d(���n�?rc�]���N
w�w�wM��pW�m�����Ia�pWnr�kJ�kj���p���
�]��p��pWu�+���+w�j�]����r��]3�]g����]��
�]��pWc��)����w�
w5��f����������_W�k��P�f���n��ETn�%?��2��5~���=�?PK�z��h w! PK y��T META-INF/manifest.xml�U�n� ��),��l��*�������R��+��
C��}���T�H�*�3�~f�0�[w�x�55���HF�F��&�O����-.�7JB@6,��� ��&�fyP��A`(�u`+b��|�3��=3�����o� $m4
���%���r�kf�T�B���(v��P�t�� �Q��q\����+4���qP��V�D�MS�5���W�y�MX �S�<�b�j�������iHQo���$a
f��h�1����$��
��� ?=�����<��)��d�{k�j��!���C�h��*�J�#L#x}��5=W�G�3�}�?K�G����b���� 1
�s�����@�G\����(-�F��g����~�i}�1����o�x����?)i�����9���X|PK(DG� p PK y��T content.xml�}�sI����+����*'�/�o��������������� �D���DI����PERd�Ig����'{m2���%���7�����/���fq��54�������������������^��w��7����������������rq���b������w�?.��.f�������|�v}�v�0��r���~��Vwe���6��%��^�?�S����w�s�/����W���Tf�e���_/R�?�n/������������7�y���z�����O�>5�t�X�!�7���_��>.o[���7������
4�fK{7_�R�'��K��x��|����z���>,�+&aq��L��}�������O�/�#j��0[&?g-������}��{7[��_z�����?>W�����U].o��������n������rQ)����������7��r���$����r����1�1�a���/���^"Q�*��o��w����W��������n�H|�e�����zv����������r��X�w��N_�[�[����m:��-�����QR^�~�0�/��/7�Os�������%������������������z������N�����|4�m��|��|����=m���f~�}�w"�����n�?����{������s��������c���\�������?���.d���a�K{���nktX�z��p�6����r~q5��]}���w�_u�u�{������5��-�������oN3����������'_���ox�jo���f��#_��g�����u_�:�j��x?����`��t�ZP<��/���d��ai�8����<��=��%��Z���fM������U���~��w'��<������������7�����{v�_���\.`�c�����v������[����6�L����������_.~����_(?���
���+���j�=����-'��elm������������������z�m����\�/������>�z�q��n����g�����g���x���|y�0{?��8~�?�$���^�����{�I��W7������n�YLv.�W�������O���o��.�yOE�=��;��_�6�W�U���������������6�Tv��,�����5
�����=|���y������k=�"�c/���v���l)h���-D^�>���/*3=}�dm�)���X->.�k�ev�q~���aw+����vN������r��>f���/.��[��������v��}���G.f����TO?���������d��b����y����_D�[�|?/��t����??��7��n^����f�y��U����j�D����=����5�D�{x�y���W��:�H�{���^�����5;�����Gr�b�a������&L����Dri��g(��������� x��mfy'& _ne�av%����g�p�l�p����}� ����%�~�&��~����'|����{O��.����s��w���O������&P�}������"o���7��������.�r��z�����
�?���������}�M����^�}����f��U�A'���g����� �Op��~�$��B������}�I�c&����/�w����+�pn��-�������� �����������r�}�%&g��t�
c�x���_���{�0��T
������5�Gl{��v�~�m�Q�������O��n3mW��['����7�MP0�������6=�/�6�/{��2�W���I��]�X��M�xa�����_����^�$(��o����8~��0m5_���
_�V�v��W�e��S���?x}suq;�y~���H|9�_����l9�;F�����r��Zmb����i{w����n������������A^,1�0�vw:.�����|Yw�7!����=Ab�{���;�%��}��W{��|��p����|%�bg��B�{K�������g��6b�!]]t�����fKt��l�~ul���0��G~Uvz��;�< �o^w6�<�l|�g/os�}�3�~������-���������-�u���:>-�����n
kY?��Na!�9;SWY�h�La![�Lb1mn�r~����%�l�������]����a
��8�����rhR� nR�5�g`Z3���f��{�O������;�sr����;�m�G�q}���?y��_WG��'Lag7?����a7�O����,+�.���^�������������H_��
X$���A����3��op��T}��%���9����4�;�$)�L_/�>*�����`�~��1��M��)2�X�����=�G�����F�Q��!"�y�����������b������ I���@����~��i�M1�vPCd�������8Gd-d<_�JJb����� 9��l�����HBly~�y�����(����(��Eq�f���!X�.��%E z�(���`I9$����s<`QQ�� i���c���-4H���}��=�#���dGN��:r89�z��*0�#��b�}����t�J��E�}���%g=�����>SAGN����
ILm�c0�9f�D<���'=9J���JQ������+7�|��n@�!X�)G�suqO.��]����F���E�R3(R`)�{�sD�E]�!��I�E TV���(nH��Xk��A�n[��/7 T�=�����C��:*�0�!��+ ��a�<{�4Q@
��2�#q��|��r��/n|gN�@!��h��:s<9�z�A@��:����,�;rV%�n��:r9���!��(���L{�z<�8��qC�"��g��[v��a������|�������K�aqGn�IX!�s:������'
T����LE3$P��7_v�Q���/;$P�����C�������;rC"������Fg��X��)`�!������X$*��
�xRX���� zv���
�zfu��Q+.Ku��eGN��8r�3����=C�� G.�'QG��9�g_[D���>���FiN����3��o-��X���6�4}��
�s�u��H�z��!'�#��&&r*���;'`3&,����v���� ��S~7���e�E���
��)��!G�s�������`��c�E����8.
e����(���9`T�Hco��(~@� �h^Y�s���e����@��A���Nv({.�/���g ��D��
V��:��eX��F[���{��;�-�s���4?���r-qQw.�'1wN��i���OK]�3��o��������;��o��#��_�XrH_�=����k��;�v��co3vnz��D����'�9�\.wT���_���;s��� V*�%�!=���6�:hE�
���37 Z@^��kM��iYQ��h�� �4 C�9P�4^?$Z`�v�o
o�^g�����;s��v�H���L�Ma+��77 � X��}�<i2���D��� X����'-�y���R��o��|��9C6�8�^qQw.(���X�o;gF�X��{�c��s!�f�%.��e�b�Z�]�l��;w�T����;7dK2
�0�����������|:�����Lq�����x�� #BY���
�
M� H9r0�iB��t��(f@�b`��m*������b��2%k=���BN���ztB���94
9��gqN����} ~���w%G���\Bv�@��8O�yosBR���`H���!������q���e��m����b�c�.�%'1���;�T�����Br3����?��oh�v��({��A:r<��T�������@�`{q�s��sog��"2���Z�������eqn����@5l�� wY�nP�b@m�!�5�q����^YbHL��$��!�����e$�����C"��uMN�?9�l����A �[!��
�������}0�������gtCb�j+�D�!/�3�������Yx�
:C���AY�.'1��-:��v�R�����������ztY���\ o�b�Gw�T���G7hCR�d��r�L����|��k�fTT��v/�����!C���
�c������Ft`����mNQPYnP�@��A��78��(nH��;E�<vNs���R�����C�:8�L��nH��4X����lNk��t�E�������r�8E��X�-c1����K���e~7�7'������by�1�s��e�&���FW���u�����y_�bN'����
zt,��t����3�h�f�r4���������.��tyd��������;b4b�(`���3��
�TdwN�����{��O7$f ��-��������{AF�C�!Q���!A���a���F�>����P�l�����'>�5$j����RF�@>'jP��6 �V,)�Q1[��)�����}:|���M������"x��o���26Z�����4�O.��N���tY
��^��]�'��K���O�O7(��y�{+?�2����� }:�2��;2�}�w��>�#��v��y�L�-��
�UO&8k����M���ef��3Jg�Ae]�!1�� s����.������}�<*i\�#�9J�.����!�/�����w��������k��������.�7���Q�UQl�n=��.]�n�3�t&������CA��ae��Ud���8`]���g���e�,�������hR���J:t�9t���e���H�����9���':���������aq�n� aT@��o��X�g�/zH��WT �5�9o,� rP���^��tFq��l�F< b����1������gk
��#g�����\EE���Z��9t�� 5(W ��7��cW��,��T������?���:t_�z�N��{L\����I��0^;�����Gt�<��\��(��u�2��`�%�R���J�s�s�mH���L�I�9eY�(j8�;��;G���YP)����w����k��P��;���!�Z�@�(��WGW�I�/`����XRY��EEqC���X�7���������x�k'J {t��9(OQ2��v$����� jH��y/2�Hg�.(��
�oTD�3�(6B�i���e����+�x�b��e��]>`�V���������61�������K���vx������>SI��t.���@����$m0#�{���s�t�O����:�� S��{q�,��
�"L��Y�R�;��"LQ��8x0D�"�����E��|W��s����9�(nH��Z��r������N����w����A2�F>��;u���F[OA��'��Aa�P�s� �UU�F/0$nT�^��V2@�!�����r�3���!�r�V����<��%.��E�����n$]�M:tH1����g[��U���S���e���[j�e��+�m����T����W7`Sb����*���f+��<��;�����������?��?5��o���_����;L��K;�0_���f����w�g������Z��;Z�;�|mvT�0X�wk����4��%�.Y�B\vG�Q�����*h������Sr?s�~6����t2�HeU��a��'P\E����S���*�[��0�5������K��|�C��A<q�iY �Q���Jy�F��iOL��$�wC��$x�: BF���0>s��w��Z�>��^�01��W�G����|�z�������;Q��3o��A�������1
j��~�]��_��Xm.tz�����l��k��r{U�����G�d�����n��������������k�T�i��^�F�Y�����o��hd���/���-�7��^,�/>>\�&v���X���������O`^m��/�v���5|��#�~G����{|��#�aG����|��#�����;������v��������#����uG��;���k;�?���#�#>^���#�iG�>^c��|{s������w��7�nl�o���������������s6R���f���|{�M��~1��G�v��3b��=�G�������������r~����\����})�e���K���3LC�V�N��YbL\���J��
�z/��RG���FW8���-qY{,G�c��S�l���}���X7�v�Y����5$�Yg�����O�-#�{��>�^���E��Qk����k�1�$��4������Y�-��Y��t?9V��&���$�w��b2��q2F�dGdIe�����*��||�3.�l,���qQ �� +��>�R.8']�S���it '���L��By���!e�<�����>SA(�n�ML�����J���AQN�a�q*��H������mrp�%.��9}���V�K��J���3>�G[U�q*\S��n�di��;�M�QW��$�wm�c2}��x�^��+�@^�<I7g���k�������,���SeP�B!��3�L�C�'J�9[��P�����k�=� ��*1{��T���/az{��o��c\��qM�!.�Y��T����w�������@�tz�@!.�Y
��<�`uH���L%��k]�F�V ����9f�NoW�����)������%mS�{L�+��X��#
��@��ph�����4��i�Q��$�w
k^������2�?�` gC��W&.��9����F9�J����Z�;��Bg:���x���!��2�h���+��T����0������1�����+B\�s
�"��#�����H�it�3V��x��@��p�h�^��mc�}���JyW���W ����9:$u2iY��d+��)O&-X�g[�:D��)��(�U oYwN[-��i(�c*��]y�K��ZQ<��I3�`%���!=�%.�9e��2I�y/]^�����V8��ds�%.�9
�V�8�f�I<��3����3&��>�^�vP���}6j�J�)t�:$��-qY ��3�F��fH�*��_/�c]��*��p!.�9
7� �F�S@%��3����3&�I 7�����x�3
���k���������yK\�uV��k|0V����i�~�it$w��e�%.��Y
���k�:vg���?���SA$�]]gL�����}5�bg���|W��� ���g��z���,�g����
��ic����L�+�x��p!.�Y
��;@�dx��$�c* �]agL��s�$JNJ)�C(?��"������3��������,�g�u�4f��q �=�zL��{bl���tO&.��Yu��8��� &�r}��H��u�d:�5�Yi���g`3��"yE�it��:�$rK\�s
���������@�>��
W6=�����,�K�1��84
c���=�JByW���t"���}�*�������
�n�����mY �)4dLv�����N��}��a�b3���8���,uV��@�x:�G�>OI��:c���� �����5�+�O�e����-qY��3d�P�K#i�>
��4>�d�
f��@���`1H�R�Z���g* �]agL����l�-�X��
����2�w��+B\�s�"�"���������mzC!.�Ye��A�m�P�t��}��T�����L_� ��is�:+�W�@�,�%'���ea<��S��:e�W�.�<�g� |�Q�����N)� ��0J��V�L%������t:i<����ds�:k�
�����Cb��-qY ���)����3�\bc�>��9+6}�GK\�����W�1���$�>SI �*;c2�r�����kMY�+�O�c������+Y$"HJ��?A�t\�g]�Rs������y��N��R��xb�g* �]egL��@��Y cZ����
����1�e4qj��"���� �5� h�p��VzL�#9bb4K\�����h���EH���g*�������t�k)����Q�_�f������gx�_f�,�X���-�9e���QYBG�����?zLc��M�E�����[h�i6����|b��SI��:c2��R�$w�$@����*�W��Y �����,��jl �;��}?zL�o����h���x��
6�:��%H���J�xW���t�C �(x�)}��J�Y�@>��Y�u:�qY$��iI� `�`�H���c��%��m��"y���i�Cdh�s!M�}��H�u�d:����B�����T�V$�n�,o�����e�<k|�k����MIl���]��&�����,uSZ2��P���>SI��:c2��"gWQ�M�Y����1���j !.��YU�^�h@��213���c��3������,�g��T�lZ�d}b��!OI��:c���{e���:3���/���D�e9��B�����N��m-C}�3�����FW8�����-qY �R85���#��Nl��g* �]agL�/��^�. ���&y�� �����C������N/�W��/��A}������^�L\��N���q.x�XJ{�S�����D�;��2���TN�~-��@>��Y6�F����79���`e��vml��S`9`]����
��@����(cIYpR����l��$�w��1�N�����14���j�W(�l�,M�#���e�<���1�Zv�=���!���FW8��qY �Rxhd��"`����<zL%��+���t�&7��#g��X�j�W �l�,��`�c"r�U�@����,o�
���� �"��@�����5I��A/�X��T�mW����E�K��On����\����fY��5A-qY �)5� ����c;1��c=o�'�1���������(%9(���qu}��0�Uv�d�B�6������X�0>��Y����ea<k~�j��NT�g4U�g]��b��G.�e�<K����C���T �1����3&���Y�^�
l���aS{fU �l�� :q@���,�g��l��xB���JLl�3���:���%.��Y��m�����\HlR�g*��]igL��=�,C�c/�����J^{f����J|������5���|�-qY(�)6�^�9���&�}�_4�c]�
��[��P��pg�U��r�����T�����L'��o� ����I@�5F^�|��Cz���E��bC�� K� �� �!y�it$7:�bK\�s.�#����NL\�3�D���3&���YZ�;����X;gU(?`y��Y��������A��!$c���0�'o�it�+�c�����k� y�!���M��>SI,�
<c2��ro����XS��XN�+�O�{������e�<k��k[{�j F���:`z��39�"�e�<K�����
!��3�D���3&S��q J9`Tl��1 ��+�O��4�L�C��H�Uq�v�A��>���YL�+��X~��qY,�������rhb�����D���3&��W�g� d�b�CM_�H>�Z&q����,�g���1�9o��>q�t�i|���s���,�g*�l[ ����>SI0��<c2����Jo/2:��(���
�S��E&��pK\�]V�!�����'����v�L�+\Sb����(��)\�>�nh1-i��T�]W���$�#�q\�d�$��K������.��"���bO6�����i���=���,!�+���,�S�������Rw�SI0��=c2�s��i��=�9�P�>+�O���sF�r<S2v�s�[Tv���CH���c�.�
��P��pj�����Lo]v�T�����L_�d<�W����f$V(�F/-C�"�e�<����=���F���d���it����8!�e�<S�2�����>SI(�*>c2�n��:�D@Y!�Z'T�|���K��k��ByN"��X���Q���c]��%�����L�{�JS����Vy��$�w�1�NB9��C����4�z�Y�|"
���h���e�<s�$)����&�W�L�+�������,�g)��W��X�%v0�1�D���3&�i$���xG�2�
����RC�y���,�gM�l�g�D4���V2��p�� j[��X��ph�y�_��{�
�1�����3&��f��r"z���{_,������g8X�-�|F����,��T j��YZ������3��pE&���������Hk�5��i�}��P��|�d:�R��x���w��9Z��+�W(�DO-��'���e�<� C�\ "�nqH�1�g�-����r!.�9
����R�MX�R;���JByW���$��*�tP`��Z��6�2>�t���-qY,��@D�H��.���}��7O��#,-qY,�Q���Ie��l�����SI,�J>c2��r����t�����k�P��i��2�.9;�%.��>� �a��@�L[�O*�0���R(���B\��n��� �S.���3�r�U|�d:�Yn��������G���by��It�r�������,�g�:I-t���+Klp�g=����(�e�<�������iHL�?�)��]�gL����Z;v��6g�<� ����S����w�n��ByV��k�eA)�R[U�xF������,�gV��SZ#*��|�<%a�+��It�K��{5;AY���9b�����rg�����,�g�2$[�J����X�������{
qY(�R8[�������#0����3&�i��;G�9����j�\�
�����fJ�!.�Y��� :8��}bLl9�g]��%&�l���y�����`P�
�����J�yW����en����^cl
u��+�O�����Z�-qY(�)?��[;�J�X��#f�>����M���,�g*<x�� � A�]~�T��z��L_h��t�qdM4�]^�|���,��l��ByV�!5`4"� �Z�=���r���\��By��5�s�X��!�SI(��=c2�.�w�Z�-�J'�Z�_���������0��,��TJ�T���4�LF���9V9�����"vf��M��c* �]�gL��V����t��.V }�k�g��I4��2(�S�=SD"
�Y�&C�]P�������Lc+�P���-qY(���U�1���L ��*��T��r��L���+c��&t9��+�O�����&w[��X�5pR56�kJ�Sj0��p��BB[�3�M�:���[j��J"yW�����f���4c#zN;�Z 4�'����\�E���2�+��P��9e�R���
�9w��6b�il�#*JN`i��By���m���*�X��g*���z�d:�N�c GFt�*w5��B�4�i�>j�%.�9�����#x����|�
�L�C��C1��e�<G�Z�\����i�\��$�w��1�N6`�$���V�2lr_'{V �H3-mc������E���C���J���H�Fy�i��S���;[��H��p��q��[�!$6F�1�����3&�I,���d���[~�k*b���4���1FH��byN�!���8�^'7�:`]�l��7/�e�Wr��h�F9V���zL%��+���t:Q�mk�N-?(�Fy��t�2��~���yf����j�� �x��g_��d�.c��P��pCR �6�X<����JByW��)��(%��;j9��@����'�K�J�������C��%��� �B�O�����<�B�%.��
7��W������) �]�gL��V9�grN�������R;��Os�r,m�g���)6��r�C�1~
-L�+ }���P��pi���X���) �]�gL��PH��B2�HU��b�Wj�Y�i�)y����e�<���5�9���K���Lc+��>��%.���
'����'N��3�D���3&���`!hE@6g�P��W0?cy�vZ���-qY(����
c�e\�!1���4����P8���\��iM�Bt0���3�����3&��p9D�ei��+�e���� 1�cK\���$���]H�p�4��d4���P��pb��*��Mt�z<%��+��It�l��Ol����6�/b
�W �F3-��5[��@rj����[�fS
S��{L�+Cz�~K\�sN2
5�>�1qR�� ����3&�Z�o�^km���k��
�G������MO&�e�<k�$5��0F���I�,Lc+�J?�l��ByVu-/��t4
Ul�N?���T��r��L�[�h��#iW^G�U(��H��������D���,��T��I�^��tR��g]�J�T?�#.��y
�-���%�h>d)��]�gL��&y��h~,Bp���5���8��;��2[i���%[��0�UzH
"j�:L���>��
�
$1�cK\�sn�!ge7��Y�=��@�Uz�$:�H��f���� rWc+���H���,���,���o<�`I�/���4:�&�tK\�s�m�[��%L���L%������t��
��P&v�Q[iU(�r+-��GcC[�s�es��Q*-��g������,�g)��{f�7�d���$�w��1���Q�"] L��sE���Sn�el�������C�e�PC06��i���4��i��WZ��X��ph��
l����x��c*��]�gL�� ��=y����9 �X��b�4zi��4�#e�L[����sD;���=�������-�e�<��S�3����
�1�D���3&��3O��"��M�TC����ILz���,�g��=�7�;���0���+���,��&x����.q�g��$�wU�1�N��>�����*��m5�+����t��V�.m�g����t%-Vv*�0��sbz�gK\�������Bm�'���1�����3&��rg��=x�A�*�P>�FZ0���(����<�
e@@�fvO���i|4W�=���E�<S��������S���L�\�<�Ju:���r���[�_F��Z^�X�e�jO>�]����rO���* [�>$��}��5n���V B\��4�vl�X]j�m��(���U|��3�'v�be\������j��`����DFf��� �1d��L�������s&.�Ygdf+; ��{j0�s�����Ol�7Z����Z5�R�|m�%�G����BD� �;0q�}�it�[�8,cK\��4N��Yr�bHmv�c*
�z�C}z��RF�C��f
�T8�pk-�\:�qa8�<I
�H/�����r�4����������<S��D��~Mnf��S�����Q��`��Y ��+�O����d��#.��9%���H�XDRZ'��L�k��D�F�<�r�8�=6����T������$~C�2D1���zV<Oll}���e6�2>$<R`.���<�*�oA�t�V*�D��4��Q{���E��y��
6��g��iON��(��
�����y���w�XG����*�W8�H�-C�����0���&�o�����TM���:`��N�o��y���adV��A�:L���T������2�i�`]F����*�O��CZr�\K\�s���8c��k|�4��i�4=�\h�y���Y�`5��:N�?���9m�<�"�m�H�����;O�-�*�O���Mm��%.��9%�����,[U�����L�k����sK��y���!�������OQ44�.
�Ff�h���!+�>^�o*�W4�F�-g���%.���Y��^�����F��v�f ��p��,��j����"