zheap: a new storage format for PostgreSQL

Started by Amit Kapila almost 8 years ago, 59 messages

amit.kapila16@gmail.com

almost 8 years ago

Some time back, Robert proposed a solution to reduce bloat in
PostgreSQL [1], which has some other advantages of its own as well. To
recap, in the existing heap, we always create a new version of a tuple on
an update, which must eventually be removed by periodic vacuuming or by
HOT-pruning, but in many cases the space is still never reclaimed completely.
A similar problem occurs for tuples that are deleted. This leads to bloat
in the database.

At EnterpriseDB, we (me and some of my colleagues) have been working for more
than a year on a new storage format in which only the latest version of
the data is kept in main storage and the old versions are moved to an undo
log. We call this new storage format "zheap". To be clear, this proposal
is for PG-12. The purpose of posting this at this stage is that it can
serve as an example to be integrated with the pluggable storage API patch and to
get some early feedback on the design. The purpose of this email is to
introduce the overall project; however, I think going forward, we need to
discuss some of the subsystems (like Indexing, Tuple locking, Vacuum for
non-delete-marked indexes, Undo Log Storage, Undo Workers, etc.) in
separate threads.

The three main advantages of this new format are:
1. Provide better control over bloat (a) by allowing in-place updates in
common cases and (b) by reusing space as soon as a transaction that has
performed a delete or non-in-place-update has committed. In short, with
this new storage, whenever possible, we’ll avoid creating bloat in the
first place.

2. Reduce write amplification both by avoiding rewrites of heap pages (for
setting hint-bits, freezing, etc.) and by making it possible to do an
update that touches indexed columns without updating every index.

3. Reduce the tuple size by (a) shrinking the tuple header and (b)
eliminating most alignment padding.

You can check README.md in the project folder [1] to understand how to use
it and to see the open issues. The detailed design of the project is
described in src/backend/access/zheap/README. The code for this project is
being developed in the GitHub repository [1]. You can also read about this
project in Robert's recent blog post [2]. I have also added a few notes on
integration with the pluggable API on the zheap wiki page [3].

Preliminary performance results
-------------------------------------------

We’ve shown the performance improvement of zheap over heap in a few
different pgbench scenarios. All of these tests were run with data that
fits in shared_buffers (32GB), and 16 transaction slots per zheap page.
Scenario 1 and Scenario 2 used synchronous_commit = off, and Scenario 3
and Scenario 4 used synchronous_commit = on.

Scenario 1: A 15-minute simple-update pgbench test with scale factor 100
shows a 5.13% TPS improvement with 64 clients. The performance improvement
increases as we increase the scale factor; at scale factor 1000, it
reaches 11.5% with 64 clients.

Scale Factor   HEAP       ZHEAP (tables)*   Improvement
Before test
100            1281 MB    1149 MB           -10.3%
1000           13 GB      11 GB             -15.38%
After test
100            4.08 GB    3 GB              -26.47%
1000           15 GB      12.6 GB           -16%

* The size of zheap tables increases because of the insertions in the
pgbench_history table.

Scenario 2: To show the effect of bloat, we’ve performed another test
similar to the previous scenario, but a transaction is kept open for the
first 15 minutes of a 30-minute test. This restricts HOT-pruning for the
heap and undo-discarding for zheap for the first half of the test.
Scale factor 1000 - 75.86% TPS improvement for zheap at 64 client count.
Scale factor 3000 - 98.18% TPS improvement for zheap at 64 client count.

Scale Factor   HEAP       ZHEAP (tables)*   Improvement
After test
1000           19 GB      14 GB             -26.3%
3000           45 GB      37 GB             -17.7%

* The size of zheap tables increases because of the insertions in the
pgbench_history table.

The reason for this huge performance improvement is that when the
long-running transaction gets committed after 900 seconds, autovacuum
workers start working and degrade the performance of heap for a long time.
In addition, the heap tables are also bloated by a significant amount. On
the other hand, the undo worker discards the undo very quickly, and we
don't have any bloat in the zheap relations. In brief, zheap clusters the
bloat in undo segments. We just need to determine how much undo can be
discarded and remove it, which is cheap.

Scenario 3: A 15-minute simple-update pgbench test with scale factor 100
shows a 6% TPS improvement with 64 clients. The performance improvement
increases as we increase the scale factor to 1000, achieving 11.8% with 64
clients.

Scale Factor   HEAP       ZHEAP (tables)*   Improvement
Before test
100            1281 MB    1149 MB           -10.3%
1000           13 GB      11 GB             -15.38%
After test
100            2.88 GB    2.20 GB           -23.61%
1000           13.9 GB    11.7 GB           -15.8%

* The size of zheap tables increases because of the insertions in the
pgbench_history table.

Scenario 4: To amplify the effect of bloat in scenario 3, we’ve performed
another similar test, but a transaction is kept open for the first 15
minutes of a 30-minute test. This restricts HOT-pruning for heap and
undo-discarding for zheap for the first half of the test.

Scale Factor   HEAP       ZHEAP (tables)*   Improvement
After test
1000           15.5 GB    12.4 GB           -20%
3000           40.2 GB    35 GB             -12.9%

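
The Improvement figures reported above are consistent with the relative size change of zheap against heap, that is (zheap - heap) / heap. A minimal C sketch of that arithmetic follows, assuming both sizes are given in the same unit. The function name is made up for illustration and is not part of zheap or pgbench.

```c
#include <assert.h>

/*
 * Relative size change of zheap versus heap, in percent.
 * A negative result means the zheap tables are smaller than the heap ones.
 * Both arguments must use the same unit (for example MB or GB).
 * Hypothetical helper, for illustration only.
 */
static inline double
size_change_percent(double heap_size, double zheap_size)
{
    assert(heap_size > 0.0);
    return (zheap_size - heap_size) / heap_size * 100.0;
}
```

For example, size_change_percent(1281, 1149) gives about -10.3, and size_change_percent(45, 37) gives about -17.8, which the report lists as -17.7%.
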
Pros
--------
1. Zheap has better performance characteristics as it is smaller in size
and has an efficient mechanism to discard undo in the background, which
is cheaper than HOT-pruning.
2. The performance improvement is huge in cases where heap bloats and
zheap bloats the undo.
3. We will also see a good performance boost for the cases where an UPDATE
statement updates few indexed columns.
4. The system slowdowns due to vacuum (or autovacuum) would be reduced to a
great extent.
5. Due to fewer rewrites of the heap (there is no freezing, HOT-pruning,
hint-bit setting, etc.), the overall writes and the WAL volume will be lower.

Cons
-----------
1. Deletes can be somewhat expensive.
2. Transaction aborts will be expensive.
3. Updates that update most of the indexed columns can be somewhat
expensive.

Credits
------------
Robert did much of the basic design work. The design and development of
various subsystems of zheap have been done by a team comprising of me,
Dilip Kumar, Kuntal Ghosh, Mithun CY, Ashutosh Sharma, Rafia Sabih, Beena
Emerson, and Amit Khandekar. Thomas Munro wrote the undo storage system.
Marc Linster has provided unfailing management support, and Andres Freund
has provided some design input (and criticism). Neha Sharma and Tushar
Ahuja are helping with the testing of this project.

[1]: https://github.com/EnterpriseDB/zheap
[2]: http://rhaas.blogspot.in/2018/01/do-or-undo-there-is-no-vacuum.html
[3]: https://wiki.postgresql.org/wiki/Zheap#Integration_with_Pluggable_Storage_API

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Amit Kapila

amit.kapila16@gmail.com

almost 8 years ago

In reply to: Amit Kapila (#1)

1 attachment(s)

Re: zheap: a new storage format for PostgreSQL

On Thu, Mar 1, 2018 at 7:39 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:

Preliminary performance results
-------------------------------------------

I did not use plain text mode in my previous email, so the
performance results might not be clear in some email clients. I am
attaching them again as a PDF.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Satyanarayana Narlapuram

Satyanarayana.Narlapuram@microsoft.com

almost 8 years ago

In reply to: Amit Kapila (#2)

Re: zheap: a new storage format for PostgreSQL

Cons

-----------
1. Deletes can be somewhat expensive.
2. Transaction aborts will be expensive.
3. Updates that update most of the indexed columns can be somewhat expensive.

Given transaction aborts are expensive, is there any impact on the crash recovery? Did you perform any tests on the recovery duration?

Thanks,
Satya

________________________________
From: Amit Kapila <amit.kapila16@gmail.com>
Sent: Thursday, March 1, 2018 7:05:12 AM
To: PostgreSQL Hackers
Subject: Re: zheap: a new storage format for PostgreSQL

On Thu, Mar 1, 2018 at 7:39 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:

Preliminary performance results
-------------------------------------------

I have not used plain text mode in my previous email due to which
performance results might not be clear in some email clients, so
attaching it again in the form of pdf.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Hartmut Holzgraefe

hartmut.holzgraefe@gmail.com

almost 8 years ago

In reply to: Satyanarayana Narlapuram (#3)

Re: zheap: a new storage format for PostgreSQL

On 01.03.2018 16:30, Satyanarayana Narlapuram wrote:

Given transaction aborts are expensive, is there any impact on the crash
recovery?

In InnoDB/XtraDB, which has used the "move old row versions to UNDO log"
approach since the very beginning, rollbacks are indeed costly, and especially
so on recovery when the UNDO log pages are not yet cached in RAM.

There is a cost trade-off between this kind of "optimistic MVCC" and
rollback/recovery that one has to be aware of.

We get support issues about this at MariaDB every once in a while, but
it is not happening that often.

I can dig up some more info on this from the InnoDB side if you are
interested ...

--
hartmut

Amit Kapila

amit.kapila16@gmail.com

almost 8 years ago

In reply to: Satyanarayana Narlapuram (#3)

Re: zheap: a new storage format for PostgreSQL

On Thu, Mar 1, 2018 at 9:00 PM, Satyanarayana Narlapuram
<Satyanarayana.Narlapuram@microsoft.com> wrote:

Cons

-----------
1. Deletes can be somewhat expensive.
2. Transaction aborts will be expensive.
3. Updates that update most of the indexed columns can be somewhat
expensive.

Given transaction aborts are expensive, is there any impact on the crash
recovery?

I don't think there should be any direct impact of aborts on recovery
time as we start processing the undo records after recovery is done.
Basically, we invoke undo worker after recovery which performs the
aborts in the background.

Did you perform any tests on the recovery duration?

Not yet, but I think we will do it after making some more progress.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Alexander Korotkov

a.korotkov@postgrespro.ru

almost 8 years ago

In reply to: Amit Kapila (#1)

Re: zheap: a new storage format for PostgreSQL

On Thu, Mar 1, 2018 at 5:09 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:

Preliminary performance results
-------------------------------------------

*We’ve shown the performance improvement of zheap over heap in a few
different pgbench scenarios. All of these tests were run with data that
fits in shared_buffers (32GB), and 16 transaction slots per zheap page.
Scenario-1 and Scenario-2 has used synchronous_commit = off and Scenario-3
and Scenario-4 has used synchronous_commit = on*

What hardware did you use for benchmarks?
Also, I note that you have 4 transaction slots per zheap page in github
code while you use 16 in benchmarks.

#define MAX_PAGE_TRANS_INFO_SLOTS 4

I would also note that in the code you reserve only 3 bits for the transaction
slot number. So, one has to redefine 3 macros to change the number of
transaction slots to the value you used in the benchmarks.

#define ZHEAP_XACT_SLOT 0x3800 /* 3 bits (12, 13 and 14) for transaction slot */
#define ZHEAP_XACT_SLOT_MASK 0x000B /* 11 - mask to retrieve transaction slot */

I'm only starting to review this, but it makes me think that we need
the number of transaction slots to be tunable (or even auto-tunable).

BTW, the last two macros don't look properly named to me. I would rather
rename them in the following way:
ZHEAP_XACT_SLOT_MASK => ZHEAP_XACT_SLOT_OFFSET
ZHEAP_XACT_SLOT => ZHEAP_XACT_SLOT_MASK

Scenario 1: A 15 minutes simple-update pgbench test with scale factor 100
shows 5.13% TPS improvement with 64 clients. The performance improvement
increases as we increase the scale factor; at scale factor 1000, it
reaches 11.5% with 64 clients.

Scale Factor   HEAP       ZHEAP (tables)*   Improvement
Before test
100            1281 MB    1149 MB           -10.3%
1000           13 GB      11 GB             -15.38%
After test
100            4.08 GB    3 GB              -26.47%
1000           15 GB      12.6 GB           -16%

* The size of zheap tables increase because of the insertions in
pgbench_history table.

I think the presentation of the results should be improved. You show the
total size of the database, but it's hard to understand how much bloat was
really reduced, given that there are both updated and append-only tables. So,
I propose showing the results per table.

What is the total number of transactions processed in both cases? It would
also be fairer to compare sizes for the same number of processed
transactions.

Also, what are the index sizes? What are the undo log sizes for zheap?

I also suggest using a Zipfian distribution in testing. It's closer to
real-world workloads. And it would be a good stress test for both HOT and
transaction slots.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Amit Kapila

amit.kapila16@gmail.com

almost 8 years ago

In reply to: Alexander Korotkov (#6)

1 attachment(s)

Re: zheap: a new storage format for PostgreSQL

On Fri, Mar 2, 2018 at 2:42 AM, Alexander Korotkov <
a.korotkov@postgrespro.ru> wrote:

On Thu, Mar 1, 2018 at 5:09 PM, Amit Kapila <amit.kapila16@gmail.com>
wrote:

Preliminary performance results
-------------------------------------------

*We’ve shown the performance improvement of zheap over heap in a few
different pgbench scenarios. All of these tests were run with data that
fits in shared_buffers (32GB), and 16 transaction slots per zheap page.
Scenario-1 and Scenario-2 has used synchronous_commit = off and Scenario-3
and Scenario-4 has used synchronous_commit = on*

What hardware did you use for benchmarks?

Also, I note that you have 4 transaction slots per zheap page in github
code while you use 16 in benchmarks.

#define MAX_PAGE_TRANS_INFO_SLOTS 4

I would also note that in the code you reserve only 3 bits for the
transaction slot number. So, one has to redefine 3 macros to change the
number of transaction slots to the value you used in the benchmarks.

#define ZHEAP_XACT_SLOT 0x3800 /* 3 bits (12, 13 and 14) for transaction slot */
#define ZHEAP_XACT_SLOT_MASK 0x000B /* 11 - mask to retrieve transaction slot */

I'm only starting reviewing this, but it makes me think that we need
transaction slots number to be tunable (or even auto-tunable).

Yeah, that is the plan. So, the idea is that for now we will provide a
compile-time option to configure the number of slots (the patch for this is
ready; currently we are testing it), and later we can even give the option
to the user at relation level or whatever we decide. The reason I think it
makes sense to give an option at relation level is that for larger relations,
we can do with very few transaction slots, considering that the chances of
many transactions operating on the same page are lower; it is only for
smaller relations that we need more slots. OTOH, there could be
workloads where we can expect many concurrent transactions on the same
page. However, for now, if you want to test, the patch to increase
transaction slots is attached; you need to change the values of a few macros
according to the number of slots you want.

BTW, last two macros don't look properly named for me. I would rather
rename them in a following way:
ZHEAP_XACT_SLOT_MASK => ZHEAP_XACT_SLOT_OFFSET

How about ZHEAP_XACT_SLOT_SHIFT? I see similar things named with a *_SHIFT
suffix in the code.

ZHEAP_XACT_SLOT => ZHEAP_XACT_SLOT_MASK

makes sense. I will change it.
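
To make the naming discussion concrete, here is a minimal C sketch of how a 3-bit slot field could be read from and written to t_infomask2 with the usual mask-and-shift idiom. The constants 0x3800 and 11 come from the macros quoted in this thread and carry the names proposed here (ZHEAP_XACT_SLOT_MASK, ZHEAP_XACT_SLOT_SHIFT). The two helper functions are hypothetical and are not taken from the zheap source, which may access the field differently.

```c
#include <assert.h>
#include <stdint.h>

/* Values quoted in this thread, using the proposed names. */
#define ZHEAP_XACT_SLOT_MASK   0x3800  /* 3 bits for the transaction slot */
#define ZHEAP_XACT_SLOT_SHIFT  11      /* position of the lowest slot bit */

/* Hypothetical helper: extract the transaction slot from t_infomask2. */
static inline int
zheap_get_xact_slot(uint16_t infomask2)
{
    return (infomask2 & ZHEAP_XACT_SLOT_MASK) >> ZHEAP_XACT_SLOT_SHIFT;
}

/* Hypothetical helper: return infomask2 with the slot field replaced. */
static inline uint16_t
zheap_set_xact_slot(uint16_t infomask2, int slot)
{
    /* With 3 bits, only values 0..7 fit in the field. */
    assert(slot >= 0 &&
           slot <= (ZHEAP_XACT_SLOT_MASK >> ZHEAP_XACT_SLOT_SHIFT));
    return (uint16_t) ((infomask2 & ~ZHEAP_XACT_SLOT_MASK) |
                       ((slot << ZHEAP_XACT_SLOT_SHIFT) & ZHEAP_XACT_SLOT_MASK));
}
```

With these definitions, the value 0x000B reads naturally as a shift count and 0x3800 as a mask, which is the point of the proposed renaming.
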

Scenario 1: A 15 minutes simple-update pgbench test with scale factor
100 shows 5.13% TPS improvement with 64 clients. The performance
improvement increases as we increase the scale factor; at scale factor
1000, it reaches 11.5% with 64 clients.

Scale Factor   HEAP       ZHEAP (tables)*   Improvement
Before test
100            1281 MB    1149 MB           -10.3%
1000           13 GB      11 GB             -15.38%
After test
100            4.08 GB    3 GB              -26.47%
1000           15 GB      12.6 GB           -16%

* The size of zheap tables increase because of the insertions in
pgbench_history table.

I think results representation should be improved. You show total size of
the database, but it's hard to understand how bloat degree was really
decreased, assuming that there are both update and append-only tables. So,
I propose to show the results in per table manner.

Fair enough, Kuntal has done this testing. He will share the results as
you have requested.

What is total number of transactions processed in both cases? It would be
also more fair to compare sizes for the same number of processed
transactions.

Also, what are index sizes? What are undo log sizes for zheap?

There shouldn't be any change in the index sizes, and by the end of the tests
undo is completely discarded. I think to see the impact of undo size, we
need some different tests wherein we keep the transaction open till
the end of the test or some such.

I also suggest to use Zipfian distribution in testing. It's more close to
real world workloads. And it would be a good stress test for both HOT and
transaction slots.

Yeah, we can do such tests, but keep in mind this code is still a work in
progress and a lot of things are going to change, and till now we have not
done much optimization in the code to improve the performance numbers.

Thanks a lot for showing interest in this work!

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

increase_slots_15.patch (application/octet-stream)

diff --git a/src/include/access/zheap.h b/src/include/access/zheap.h
index f463543..e9149bf 100644
--- a/src/include/access/zheap.h
+++ b/src/include/access/zheap.h
@@ -22,7 +22,7 @@
 #include "utils/rel.h"
 #include "utils/snapshot.h"
 
-#define MAX_PAGE_TRANS_INFO_SLOTS	4
+#define MAX_PAGE_TRANS_INFO_SLOTS	14
 
 /*
  * We need tansactionid and undo pointer to retrieve the undo information
diff --git a/src/include/access/zhtup.h b/src/include/access/zhtup.h
index 4dec7d7..2bb63f3 100644
--- a/src/include/access/zhtup.h
+++ b/src/include/access/zhtup.h
@@ -25,7 +25,7 @@
 /* valid values for transaction slot is between 0 and MAX_PAGE_TRANS_INFO_SLOTS */
 #define InvalidXactSlotId	(-1)
 /* we use frozen slot to indicate that the tuple is all visible now */
-#define	ZHTUP_SLOT_FROZEN	0x007
+#define	ZHTUP_SLOT_FROZEN	0x00F
 
 /*
  * Heap tuple header.  To avoid wasting space, the fields should be
@@ -87,7 +87,7 @@ typedef ZHeapTupleData *ZHeapTuple;
  * information stored in t_infomask2:
  */
 #define ZHEAP_NATTS_MASK			0x07FF	/* 11 bits for number of attributes */
-#define ZHEAP_XACT_SLOT				0x3800	/* 3 bits (12, 13 and 14) for transaction slot */
+#define ZHEAP_XACT_SLOT				0x7800	/* 4 bits (12, 13, 14 and 15) for transaction slot */
 #define	ZHEAP_XACT_SLOT_MASK		0x000B	/* 11 - mask to retrieve transaction slot */
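
The three values changed by the patch above have to stay consistent with each other. A minimal C sketch of compile-time checks that would catch a mismatch is shown here. The macro values are copied from the patch, ZHEAP_XACT_SLOT_FIELD_MAX is a hypothetical name introduced only for this example, and the rule that the frozen marker must lie above every usable slot number is an assumption drawn from the comments in the patch, not something the patch itself enforces.

```c
#include <assert.h>

/* Values after applying the patch above. */
#define MAX_PAGE_TRANS_INFO_SLOTS  14
#define ZHTUP_SLOT_FROZEN          0x00F
#define ZHEAP_NATTS_MASK           0x07FF  /* 11 bits for number of attributes */
#define ZHEAP_XACT_SLOT            0x7800  /* 4 bits for transaction slot */
#define ZHEAP_XACT_SLOT_MASK       0x000B  /* despite the name, a shift of 11 */

/* Hypothetical name: largest value the slot bit field can hold. */
#define ZHEAP_XACT_SLOT_FIELD_MAX  (ZHEAP_XACT_SLOT >> ZHEAP_XACT_SLOT_MASK)

/* The attribute count and the slot number must not share bits. */
_Static_assert((ZHEAP_NATTS_MASK & ZHEAP_XACT_SLOT) == 0,
               "slot bits overlap the attribute-count bits");

/* The frozen marker must fit in the slot bit field. */
_Static_assert(ZHTUP_SLOT_FROZEN <= ZHEAP_XACT_SLOT_FIELD_MAX,
               "frozen marker does not fit in the slot field");

/* Assumption: the frozen marker is distinct from every usable slot number. */
_Static_assert(MAX_PAGE_TRANS_INFO_SLOTS < ZHTUP_SLOT_FROZEN,
               "slot count collides with the frozen marker");
```

With the original 3-bit values (0x3800, frozen marker 0x007, 4 slots) the same checks would also pass, so the sketch only illustrates the relationship between the macros rather than a defect in either setting.
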

Alvaro Herrera

alvherre@alvh.no-ip.org

almost 8 years ago

In reply to: Amit Kapila (#1)

Re: zheap: a new storage format for PostgreSQL

I think it was impolite to post this on the very same day the commitfest
started. We have enough patches as it is ...

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Mark Kirkwood

mark.kirkwood@catalyst.net.nz

almost 8 years ago

In reply to: Alvaro Herrera (#8)

Re: zheap: a new storage format for PostgreSQL

On 02/03/18 16:53, Alvaro Herrera wrote:

I think it was impolite to post this on the very same day the commitfest
started. We have enough patches as it is ...

To be fair - he did say things like "wanting feedback..." and "shows an
example of using pluggable storage.." and for PG 12. If he held onto the
patches and waited - he'd get criticism of the form "you should have
given a heads up earlier...".

This is earlier :-)

Best wishes

Mark

P.s: awesome work.

#10

Fabien COELHO

coelho@cri.ensmp.fr

almost 8 years ago

In reply to: Amit Kapila (#1)

Re: zheap: a new storage format for PostgreSQL

Hello Amit,

At EnterpriseDB, we (me and some of my colleagues) are working from more
than a year on the new storage format in which only the latest version of
the data is kept in main storage and the old versions are moved to an undo
log. [...]

This looks more than great!

*We’ve shown the performance improvement of zheap over heap in a few
different pgbench scenarios. [...]

2. Transaction aborts will be expensive.

ISTM that some scenarii should also test the performance impact when the
zheap storage is expected to be worse than the heap storage, i.e. with
some rollback which will exercise the undo stuff. There does not seem to
be any in your report, I apologise if I misread it.

I would suggest that you can use pgbench scripts such as:

-- commit.sql
\set aid random(1, 100000 * :scale)
BEGIN;
UPDATE pgbench_accounts
SET abalance = abalance + 1
WHERE aid = :aid;
COMMIT;

and

-- rollback.sql
\set aid random(1, 100000 * :scale)
BEGIN;
UPDATE pgbench_accounts
SET abalance = abalance + 1
WHERE aid = :aid;
ROLLBACK;

that can run with various weights to change how much rollback is injected,
eg 1% rollback rate is achieved with:

pgbench -T 10 -P 1 -M prepared -r \
-f SQL/commit.sql@99 -f SQL/rollback.sql@1
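
pgbench picks each script with probability equal to its weight divided by the sum of all weights, so weights of 99 and 1 give a 1% rollback rate. A minimal C sketch of that arithmetic follows; the function name is hypothetical and is not part of pgbench.

```c
#include <assert.h>

/*
 * Expected share of transactions, in percent, that run a script with
 * weight script_weight when the weights of all scripts sum to
 * total_weight. Hypothetical helper, for illustration only.
 */
static inline double
script_share_percent(int script_weight, int total_weight)
{
    assert(script_weight >= 0 && total_weight > 0);
    return 100.0 * script_weight / total_weight;
}
```

For the command above, script_share_percent(1, 99 + 1) is 1.0, and changing the weights to @90 and @10 would raise the rollback share to 10%.
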

Also, I would be wary of doing only max speed test, and consider more
realistic --rate tests where the tps is fixed.

--
Fabien.

#11

Tsunakawa, Takayuki

tsunakawa.takay@jp.fujitsu.com

almost 8 years ago

In reply to: Amit Kapila (#1)

RE: zheap: a new storage format for PostgreSQL

From: Amit Kapila [mailto:amit.kapila16@gmail.com]

At EnterpriseDB, we (me and some of my colleagues) are working from more
than a year on the new storage format in which only the latest version of
the data is kept in main storage and the old versions are moved to an undo
log. We call this new storage format "zheap". To be clear, this proposal
is for PG-12.

Wonderful! BTW, what does "z" stand for? Ultimate?

Credits
------------
Robert did much of the basic design work. The design and development of
various subsystems of zheap have been done by a team comprising of me, Dilip
Kumar, Kuntal Ghosh, Mithun CY, Ashutosh Sharma, Rafia Sabih, Beena Emerson,
and Amit Khandekar. Thomas Munro wrote the undo storage system. Marc
Linster has provided unfailing management support, and Andres Freund has
provided some design input (and criticism). Neha Sharma and Tushar Ahuja
are helping with the testing of this project.

What a gorgeous star team!

Below are my first questions and comments.

(1)
This is a pure simple question from the user's perspective. What kind of workloads would you recommend zheap and heap respectively? Are you going to recommend zheap for all use cases, and will heap be deprecated? I think we need to be clear on this in the manual, at least before the final release.

I felt zheap would be better for update-intensive workloads. Then, how about insert-and-read-mostly databases like a data warehouse? zheap seems better for that, since the database size is reduced. Although data loading may generate more transaction logs for undo, that increase is offset by the reduction of the tuple header in WAL.

zheap allows us to run long-running analytics and reporting queries simultaneously with updates without the concern on database bloat, so zheap is a way toward HTAP, right?

(2)
Can zheap be used for system catalogs? If yes, we won't be bothered with system catalog bloat, e.g. as a result of repeated creation and deletion of temporary tables.

(3)

Scenario 1: A 15 minutes simple-update pgbench test with scale factor 100
shows 5.13% TPS improvement with 64 clients. The performance improvement
increases as we increase the scale factor; at scale factor 1000, it
reaches 11.5% with 64 clients.

What was the fillfactor? What would be the comparison when HOT works effectively for heap?

(4)
"Undo logs are not yet crash-safe. Fsync and some recovery details are yet to be implemented."

"We also want to make FSM crash-safe, since we can’t count on
VACUUM to recover free space that we neglect to record."

Would these directly affect the response time of each transaction? Do you predict that the performance difference will get smaller when these are implemented?

(5)
"The tuple header is reduced from 24 bytes to 5 bytes (8 bytes with alignment):
2 bytes each for infomask and infomask2, and one byte for t_hoff. I think we
might be able to squeeze some space from t_infomask, but for now, I have kept
it as two bytes. All transactional information is stored in undo, so fields
that store such information are not needed here."

"To check the visibility of a
tuple, we fetch the transaction slot number stored in the tuple header, and
then get the transaction id and undo record pointer from transaction slot."

Where in the tuple header is the transaction slot number stored?

(6)
"As of now, we have four transaction slots per
page, but this can be changed. Currently, this is a compile-time option; we
can decide later whether such an option is desirable in general for users."

"The one known problem with the fixed number of slots is that
it can lead to deadlock, so we are planning to add a mechanism to allow the
array of transactions slots to be continued on a separate overflow page. We
also need such a mechanism to support cases where a large number of
transactions acquire SHARE or KEY SHARE locks on a single page."

I wish for this. I was bothered with deadlocks with Oracle and had to tune INITRANS with CREATE TABLE. The fixed number of slots introduces a new configuration parameter, which adds something the DBA has to be worried about and monitor a statistics figure for tuning.

(7)
What index AMs does "indexes which lack delete-marking support" apply to?

Can we be freed from vacuum in a typical use case where only zheap and B-tree indexes are used?

(8)
How does rollback after subtransaction rollback work? Does the undo of a whole transaction skip the undo of the subtransaction?

(9)
Will the prepare of 2pc transactions be slower, as they have to safely save undo log?

Regards
Takayuki Tsunakawa

#12

Amit Kapila

amit.kapila16@gmail.com

almost 8 years ago

In reply to: Alvaro Herrera (#8)

Re: zheap: a new storage format for PostgreSQL

On Fri, Mar 2, 2018 at 9:23 AM, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

I think it was impolite to post this on the very same day the commitfest
started. We have enough patches as it is ...

I can understand your concern, but honestly, I have no intention to
hinder the current commitfest work. We had been preparing to post this
for more than a month, but it took some time to finish the
documentation and to fix some other issues. I could have posted this
after the CF as well, but I was not sure if there was any benefit in
delaying, because, at this stage, we are not expecting much code
review, but rather some feedback on the high-level design, and I think it
can certainly help the pluggable API project. I think the chances of getting
the pluggable API into this release are remote, but maybe we can get some
small portion of it.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#13

Amit Kapila

amit.kapila16@gmail.com

almost 8 years ago

In reply to: Mark Kirkwood (#9)

Re: zheap: a new storage format for PostgreSQL

On Fri, Mar 2, 2018 at 9:29 AM, Mark Kirkwood
<mark.kirkwood@catalyst.net.nz> wrote:

On 02/03/18 16:53, Alvaro Herrera wrote:

I think it was impolite to post this on the very same day the commitfest
started. We have enough patches as it is ...

To be fair - he did say things like "wanting feedback..." and "shows an
example of using pluggable storage.." and for PG 12. If he held onto the
patches and waited - he'd get criticism of the form "you should have given a
heads up earlier...".

P.s: awesome work.

Thanks.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#14

Amit Kapila

amit.kapila16@gmail.com

almost 8 years ago

In reply to: Fabien COELHO (#10)

Re: zheap: a new storage format for PostgreSQL

On Fri, Mar 2, 2018 at 1:35 PM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:

Hello Amit,

At EnterpriseDB, we (me and some of my colleagues) are working from more
than a year on the new storage format in which only the latest version of
the data is kept in main storage and the old versions are moved to an undo
log. [...]

This looks more than great!

Thanks.

*We’ve shown the performance improvement of zheap over heap in a few
different pgbench scenarios. [...]

2. Transaction aborts will be expensive.

ISTM that some scenarii should also test the performance impact when the
zheap storage is expected to be worse than the heap storage, i.e. with some
rollback which will exercise the undo stuff. There does not seem to be any
in your report, I apologise if I misread it.

No, there isn't any. One idea we have to mitigate this cost is to
allow rollbacks to happen in the background. Currently, the patch for
this is being worked on.

I would suggest that you can use pgbench scripts such as:

-- commit.sql
\set aid random(1, 100000 * :scale)
BEGIN;
UPDATE pgbench_accounts
SET abalance = abalance + 1
WHERE aid = :aid;
COMMIT;

and

-- rollback.sql
\set aid random(1, 100000 * :scale)
BEGIN;
UPDATE pgbench_accounts
SET abalance = abalance + 1
WHERE aid = :aid;
ROLLBACK;

that can run with various weights to change how much rollback is injected,
eg 1% rollback rate is achieved with:

pgbench -T 10 -P 1 -M prepared -r \
-f SQL/commit.sql@99 -f SQL/rollback.sql@1

Also, I would be wary of doing only max speed test, and consider more
realistic --rate tests where the tps is fixed.

Your suggestions are good, we will try to do some tests based on these
ideas after making some more progress in the Rollbacks (there is some
pending work in Rollbacks as mentioned in README.md).

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#15

Kuntal Ghosh

kuntalghosh.2007@gmail.com

almost 8 years ago

In reply to: Alexander Korotkov (#6)

1 attachment(s)

Re: zheap: a new storage format for PostgreSQL

On Fri, Mar 2, 2018 at 2:42 AM, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:

I think results representation should be improved. You show total size of the database, but it's hard to understand how bloat degree was really decreased, assuming that there are both update and append-only tables. So, I propose to show the results in per table manner.

What is total number of transactions processed in both cases? It would be also more fair to compare sizes for the same number of processed transactions.

Also, what are index sizes? What are undo log sizes for zheap?

I've added the table sizes and TPS to the performance results. As of
now, we've just performed stress testing using pgbench. We plan
to perform other tests, including:
1. Introduce random delays in the transactions instead of keeping a
transaction open for 15 minutes.
2. A combination of ROLLBACK and COMMIT (as suggested by Fabien).
3. pgbench tests for a fixed number of transactions.
4. Modify the distribution (as suggested by Alexander Korotkov).

Do let me know if any other tests are required.

--
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com

Attachments:

zheap_perf_data_1.pdf (application/pdf)

��(���8k������q""ND���8��>;"�DD���8��#"N���������8����S��,'"�DD���q""N�]����8��#"ND�����.���U�N�5_O����B-m�6K�J5*��/�Y%B4�{����0�����D%@{\�E%@k~�_:�w@�+���9/y%B~��+��:��+k�?����m�{�B���U"�]�����=��p�,���c^Y�L\�]��M�{vB�����J����+��?(�q/GKW��XW�Ce)���B�p5������*���:�J�~)�F��Q�#��!qTr�F�q�Du���!N�J�,��������%��L�������o~N?;�����������w�M[8Ju��.�2O��U��o��&H^$�I�T��u�*�H�J�U��*��.��Ur�~�J�H�U�	�~������~�>���������}�Zl�W�d;�1���T�����\�v
�j��L^&����Z�@������N���O�����r_��o��V}6�����o`���[�}���4���P{��~���!}�&�����<WS�5E���J�xZ��m]�k�\?���s���A��~��&;������A����F���S_�/���D!��s��kUB-\�9�3{���TH�B��������kw`=�
��s���L�q���_.�"�&e#������C����U]����Z���P{�Osj���s<�������Q�HL�yOwC���yaJ�Hu=E8����i������L��N�^��N��������I.�t/���]�h\YS	]g�&Q<=�G����W��_��{��C���Fu7��6����V����!����K��W����kS���������(mD[���Lso{��������j���~�x�|�@x~��l�6U�Jo���A?g�-�T\>4����K�_x�1-�U\^�����'��m��G,�\�R��������&��A8��1���I��Z��j1+Z���D+��tm�{<���Q����

o��G�r��q����-����+t��Ky���8��YQ�y!4�ir�W^�q��s/�ju:�?�\'�4Xo����o�4J���]xFi��~b�8bR�=U�'�`�W�����>�O!�����]��.� =8���>��x��.�t��?�Rp���$l1V�Ug�����
�eVm���y��3����04��������������EQzF_��1������u�|��1W/��^c�^�<y�����&�U�z!����5a�sRM�����.�������f�Q���#��e����t�e]���%�L�f�c"^�	L��i"-lGh����Z0�1��yxy7�����0�<��=a������'�_M���`�S����;�c���`o�n��zy�W�gW��[
(f�p`)����<_��p���a��z�u��@������7W?�e_'��"b���f's�	���yf��ZK����6�(K
(e��`>����������R�,�fM�wM�~�`U��Y�i
j}`	���h���!�����hY��N}9�����y��J��]t=�+����d��2M&!1�~lr��d���g����\$�N�V����1iZ��A��:^p�s����^����
#�S_�]����s��g�����ZHU7b2��Va�����2�jU����O�B����4�fy�yL{�EIt���\���&l��^A"�4�7��Y��	�
%����:Lj��h�{�-���=>f��Y�
�E[����1��G����A+K�L�Tl[K��O�����3��r������2�~��2�~��������)�S+�,�����p?D���U����2�M�����r�	���j����;?o\�.���c��Z0f9�����;Y�nZ��yx
q7�<�����e��<������^�yL�c�j������~���8�<��=]��s�fZSj{���w�<A�
�>��L/�Nt�iH��x�*.R�q���k�2A:�?�>�������n�������rr�X� �������{l���;p�2��+e�]�����x
���CP:\q�X��o�un�O��x`y���v�V�b+���w�S�����4�/����������r�����W}��f�-v��L4����e�&o��1A�����D�w����M��YC��;�tYN/�dIN/{l�sm���.��Yqb���_����;�����`�x����@��?H�y{�K�����%oj��]���`LrJ5�M��v��v'���F�?	�����X��6�2�*�O�L�cS��Y��v���R��WmQ�3�I������U�^��V�}uT9�"����w��
>M�
�oG��|L@+���d;�h��:�����`�;�K_������V��6�y-5>�h�c���	d��0����A�u@@��K���{��_���'���#�hw8A"3���/Kd�]��D�^|"�mjO"_�=6)K�����M������$��gu�uo������������r]������H�uE�����wM�S�Xs����uJ>�
endstream
endobj
9
0
obj
4527
endobj
10
0
obj
[
]
endobj
15
0
obj
<<
/Type
/Page
/Parent
1
0
R
/MediaBox
[
0
0
792
612
]
/Contents
16
0
R
/Resources
17
0
R
/Annots
19
0
R
/Group
<<
/S
/Transparency
/CS
/DeviceRGB
>>
>>
endobj
16
0
obj
<<
/Filter
/FlateDecode
/Length
18
0
R
>>
stream
x��W�O1n����@�����nw�G���`x���`4p!w(h�������]��5���1����|��+.��gb�L����_q��
M��
cc2Q��D!8x��[��(S#sn��L��,�3������rp�J�p��Q!��;��	�Dj��>��9��C4w�(�������-���S��3���C���o��k������e.]�a�t��	�T�&:���B�@8��-�"��/�){
>i$I�3D��Z���?��}������"�^�g��,����A8��22�>����fZ���lSD
�J�L��f���8��&'�p>8�(?GX�R5��>3�*�L;��YAf���LW���V�[��ULr���b	E�L��s��-���Ifa�\�.�h"H�a�|��(�������@I*��Y�]��M
��Z���$�MlB{X��#l���>^������$�W�������.DM���f��e�i�E��hjC�m*-��"o+r:�"o+sCly�Vty�Qt5g��`�*
��5������R#B�7�(jDH�B���J�h!�ljD��H�;(#{y���'S��^���o�?3.�����//����L�\I>]�i��?`;`O����a�{0<`'�������l�o���N�:;@|��,��}n���#v������i'v��6��e��DZ�����H���eD�Z6#F�R���{����Z{a�IZ�e���j�mn.�Rz�Y�$u�n��lQm	�A��.�����Z�t�vB���!��l���h�x
hEGZ���d;�(�2�)X�����S��8d
���U�]�4l�U��-v������MT����*��[�U#@1�����`��*�N�S��) ����`�]e�z��OAwQ(����K���
vM�I����>�h�B�	�F��,�N�G�l���#6������Vk��l�GF���dd��������N
%ug_���J�i����n�D21e���Zh��K&F�)��U �d��cv����L<#������1�!$��Md��
%_�]����F����A�����?��7jq�
endstream
endobj
18
0
obj
1083
endobj
19
0
obj
[
]
endobj
11
0
obj
<<
/CA
0.14901961
/ca
0.14901961
>>
endobj
8
0
obj
<<
/Font
<<
/Font1
12
0
R
/Font2
13
0
R
/Font3
14
0
R
>>
/Pattern
<<
>>
/XObject
<<
>>
/ExtGState
<<
/Alpha0
11
0
R
>>
/ProcSet
[
/PDF
/Text
/ImageB
/ImageC
/ImageI
]
>>
endobj
17
0
obj
<<
/Font
<<
/Font1
12
0
R
/Font2
13
0
R
/Font3
14
0
R
>>
/Pattern
<<
>>
/XObject
<<
>>
/ExtGState
<<
/Alpha0
11
0
R
>>
/ProcSet
[
/PDF
/Text
/ImageB
/ImageC
/ImageI
]
>>
endobj
12
0
obj
<<
/Type
/Font
/Subtype
/Type0
/BaseFont
/MUFUZY+ArialMT
/Encoding
/Identity-H
/DescendantFonts
[
20
0
R
]
/ToUnicode
21
0
R
>>
endobj
13
0
obj
<<
/Type
/Font
/Subtype
/Type0
/BaseFont
/MUFUZY+Arial-ItalicMT
/Encoding
/Identity-H
/DescendantFonts
[
24
0
R
]
/ToUnicode
25
0
R
>>
endobj
14
0
obj
<<
/Type
/Font
/Subtype
/Type0
/BaseFont
/MUFUZY+Arial-BoldMT
/Encoding
/Identity-H
/DescendantFonts
[
28
0
R
]
/ToUnicode
29
0
R
>>
endobj
21
0
obj
<<
/Filter
/FlateDecode
/Length
32
0
R
>>
stream
x�}RMo�0��+r�I��Ji�4��>4��!�z��/���:�������E8�W��&��A�0�����8��������tz����{��8��y���l;DE���P'?��{34p�/������|����G���;1�%3��'��U,F��2��M�&p~;�gLb.���Ni�� *x8%+�)#�����������S���<�$A�s�K��(�J"a�������&uk�
RC�����r�jk�'��&�t5��T�$�l�L8	����%0��%�=%����=��)��n�S
�3���n.	$^f�]��,7�\��=-|Y��:���a�p{q����,��
na��
7��
endstream
endobj
23
0
obj
<<
/Filter
/FlateDecode
/Length
33
0
R
>>
stream
x��}	`���{o�������l6����@�$@0����	���SE�CDE[��U�j�iM-j=(�Z[mU��Q+��Rj�����f'���7�����x��w�� �R��!c����'���+�<��8a��9�Y�����w���V��}g�`��V���sgM�y���W]3��?*���R��
�es.�nZ��I����W-�1]�0�����?w���{��5��'/��8���W`�
!�O�}(�����F(�9���������n��pvG���6�"��^D��W�q8�%���_�
��
�m@"�5��q���8�oG5�	�'�A8�rt���8��Z��q��Y���J��hZ������AS�'�-h�]��5���;����B?B{�_���E���-�.����=�>��(��wYG>��������OAJ����Dq'���g��q����U�����Qq������>\�/"%�����A�{\W}�@����~�>��p<�T�8��jt	<O;�%��r]ksM0b�R%j��E���
t����"A�	�p}�=�G}�D��3p�g��&���^���#7���t��k��8�k�h<�T�E�1n)���}�;���~��1���D#��'����bQ�H�
3�A�G���O������}�g2�L#�?q?���-M���Z��@��`/���+�\�
o�w��A|A.$��
7�[������2�a�p��Enr�@�W��������k���������C��~�����n�&q	��o��M��C�
?���.����_�����4A�I���R���Rr-�y���a��'�J�,W�5r��"��n|wq���!>��O�Ox\�&</�*5�f���y������m�����k��`�0
��z?��a���{	��5�(����`d���x	�F�V������~���o��:��>�&�d0
���Yd	�B�!��}r��8��\w�����s+���6��#�O�I�|�����|������k�������������*.����H���1�X�E�K�-�'�v��B{P�>����q�������_�_>OC3��0�l������	������(t���X�N''�y�H<�G�I_�j��6����1~?<�/������|#jhF������ro��O��?�~��8���g�1�?�/&���cn	��"�RO���G���/L���w\qd`�������:����3�9�NT�W����@���b��o�y�&��������5�2�	~t+n��!�k�!^Es/@��s#���8<(�F�-��E+�������I(��������v5p����vu�>p!7j��9�^L�|>�����/Q�8�t�9��A�;7M�?���AW��A��l���+nC����6�.wZ�@9������0<��l"����s�F;���K��v.^F������)�9���
���+���(<��p���NT�E���s��y?Ac���������B��~�#I@��,�q�5<�
h�_�����q�F������m���.��.8���A
��������w��lUeEy&]�*-I'���h$
�>�ix���RY�#UK
oM�eZ��L���{���t������-	U��=�-��K�{�G���#-�H��Hl$Qc����T����T�O;�;����m�<��[�\R'$���M�������+�n�:.���I
����F�U�.��B���q��6h;A��j���k�����q�a�g��;y��XIIs��6<dF��6�����C�v�6qH��n��G����^��is���l�j3S3�O���Mo��0�p��m���������C&o���6
�K��M�6$�����������p
8����n���8b|�F�5On����I�$������FkZ�'������M�[aj������%;�Qko��Kn�09U��K5O��G�������s[zUo7L{`��=@�{�������q�#�i�R�B�%g$�'�S�Li1k �4c �fg������i�d����6!m�����R��:�fz�FLG�x��j���m�l[UEi�)���_��zEI�I����10���������	���BW�N������$�2�Y5��6�J[:���D���i�>�5�����h�3�#�6wP�/�����S#�N�����0�#&��g��n+@m�!��)@$��V@������Z��?�!��I�d589��h��.�������:���Yls��B7�e��?���s��m��� *GL��i�zN��}�K
�x4arIrH������|�@�k��Y0dC��vUa��c�>;{UF�i��Tr���M�;�k�L%�����U�����Z�����=�6|s3��\<�����Sx����8~����
'L�A0�:�y{�M��D�b����J���;h���Adv|l������
�?�#V';u�� v������:����1C&L��=�$�{6�l��.!Tb��i(0�3I���%��(�w�����%>.�VC�P9h��n�l���r4� Y�p�G���'����'�#�F�\\4�p,NW��b��M��v����Pz��<����z��`��T�����]>%�5N�;q�8	j:�u�h4��g�MoC�m����8��Ox�Jk��Ld�Qm�o1���� �x�Q�cS�'TUTQ���@Q]���IA��Z1
P.o1�KP�l��`-n������XL�fp��k�����p����v�c�m��g����aH���C����s��������
�0�Vk��Oy��������k��c����\vj�������~O���?�y9m��z.�1�,4�����>/v�1Vs��`7"�p���������.]l+��V�� �h9q�8�
c���G�o���Zx�������_C���%�gFF�_��w��d!`R�YLsd$	�L!�~���(�h���y�o���$p!���v��� ���J[aB;�hw�%�o���<�����v��<xpM���B'2P�X�G���=��5x�����2�����A��uQ�>
�OXA]<1.����R2h��P2��h��	])pr��C&R���:@&�h
j��C5�$�8��=�m��J'�J"��g��������%�K��FKZ�}���l�#����������'����F�F��nA�\a2�wY������������i�k�����Hkl%�V\����A|@��x3�!y_|��{O�����t$����Y~:H�e�U����`�
aF�`C�G([�M6/��6�2�4�!`�VNM"���H�G v��x�v:���-�cPR�e	��[��4�~0������/%��7`�������h�T�$N\���;�������{���V=��M�.m!�b���������\��/>�?�����`���z�z������S���P��9+U��������,��[��&w�e�+H�X#�-�
�JGa��:�G�
�#��e2��3�r3��9�"ya�� �:�^�M�%��`	D����F�����G�����]@�M�l6��D-�+I��(�������;��?�,�o�`U��/zk��l1	F!A�{��AG@�`�/QZ��aD�����M
�i�a�O��S�1M|m)�P�/$(���mM�������&:���F�P(Yl��$��_��w��Q
E�l-���G�o�y����R<&q�s�ry}db�O���w��� ��������Q����wc7���'�'�,�"�,�!��.���	��L����}�y�{?�~;�^q����7���!�4F��a��G�!��[��?���(���sz�� O�mb��w�'@��;1�T�PU�PUJ�����B��Rb�/��(�<���]MdYDV��������k;#�*x���	���A�5�W�����u���%rhp j�-K���d���/U�$�-E	�x��J?��7����G�^�w�z���������_��qF�M>�����l����'?������R
�����eq�n.��|G)���vE���paJY
vpeJi�I�je ��q�RO�Z8��7iPjKJvM8<SC����E���&���`���n��m0d����
A����:?���\���2��g��W�������~��B�cH��t7/a�/�sJu��1�}��&�����P���������{�)�v�b��i�EI�SI���C%�P��PiY���F#C�IO��ge��
w��;"��}x B0`�n����S��,S�~|"������R�%�"��+�Y;���%����J�����A$Z"P�&���`�K?���2����c��y���%_������y�����,�o	?����UO�#��8��[oM�zc������������F��(p,C�|[�{�"-�z���.2�a,\d�b�sV
����Tl��JIf��\�l���Y)�Rd��J[,x'ks���g�75�2�2�<��F��I���$�'����9��tD4������H
~��"��C�[*�Af��**�S���Z�f����X�d����5%�������u��$�=����������7�����`*7NP�n4>3��0N4�l4pA��x<�n0$�"=��op��,WmW�������%����Y�m����4h�q��j`���[���AY�U�q�YH����}]��G������z<�G��3��(���{�P>A���i��g,��M���.�4�����Y&b�(�c����\�Qh�Q���E/+]a6�� 6��r���h�txq��&����0���q���a��K0�E��8�������L�������#�+�6��{��s�}��1Y���UQN�#�q����1|j��������>�����|�#�{����+���������x�L�=s���Xm�e����T����1
av�JxK���#�(o�����\��e.���j�Ca��b�8��
r��(��b�/i��W���o���J%0�K�J�����W������7W<{'��k��[�����;N���16�~����$��Bn����_�q��#����
�*BU����q66fGH:�%;@��F�n4*��V�2������j�$�Y�V���x��"a�X�A��	�	��&0J�,f���Aq*�$Z��'��-q����kn�3�P��� ��JV`hdhr�wBr7S�)���L.�����������R�Nl��3D�0 �(T�$�P�L%Kh�I{9F'��~���T�9}�T��]ig��f��1+��`�e`d��<������j�l-_�Qu�1�c���`�6����V��3�BlCAzn(H{� e;����-�z��1[�1a�T�3���K����L93/D�J2/�tR��4P���=��;�3\}��IN��\�N{���o�c����}��G]F�9j�S?��������}F����?�h��������G�U���m{��G-�5w<��K/��N��A:Zl����?"�
�h��+������L��H�#/S��F2M#\l��`E�be�hY�8���Q�Ij-0Tl0l����"�D)���0���9wlD�^������zq��9o�t��_�_�7�~��@5��J�>���(���Q���p�v
��l����=���wg��,�RFI���(��t��.�jXC����qJ'�w�G�~�����L��a
��Y6���X�7�`����{{K�P���	�Q0�����o��PAi��`��0G�$����n���=<��`48�8��@-�c

�>�=�#�����Q���8��pZ���V���3L����}>�fp/\S����le@}W/	^��L�KA��oD7�U�ry�k�v�~}�v�	o���k]�j��;B�������>v��Q�I&k��W2CyJ�2��Di����q��v�H�a#	�FM,{E��>2ee�����&��c%��x0����w[�~a�`�����	/k�0�A�X�2�X��Xc���c�'V���}��mL�1���n���={Z� R���k�\kY�-i��JQ�=���;�GGyFP����9=�����^��r��
w�N~�A���W�_0w�m��lt���k��x��g�X���~�����WV��?�0���gx��[o�6c��g�#��~z���ms�q�� ��E ��@�.�de��t6p%��4�N��(`�Y���-&�JV�M�A�q���m�7�E��W1('iT������e�q��"x�dn2��6������]����n?�13$u�'b��R�N��lK?&%������hPI��k����8��YUL�3Q����{��UM�Yu�������S�����sR�GoLl���x(�lt���g��I�����/�A�3ERN���V�$)&+����j�������v;�D�Cc��;^��}��
�.:����4���6�U�TS
�
�
��!��(-�����#�2��%{��.�P����DGR��%����T�n��R_WN�4l���d��fDb{��\5}��c���//�}K��u������/|H�����v<���'�x���/[���Zx�,��l<��s������?~��{x��G6��*a�z>����Y5�\����z�u�% �����$u��K��*�T��s�XjE����W�o?%4�X�U�!A�yu�k�P�7�$�c�	O������Q0��S��k]v�z/(� �M*���%�����������<��Y�?"�.�e3X�r�"j�|��������O�
�����Z�"o����N��,�����c��Q:��2W�����������:�T��������������z�f&
�XXYc�1��s�Vc#��xSx]�4�.Yh���c������7�on��x�ws.Ux^���(I���I!����MJ�������������� �	�;�bKA���`B�a0Y����h�����?��-0�[�1Z����m��F�
�tH"��5���y����G�a��h�8v������6�H��
B�,��m�f[;�6�q�����
�����?�-1v��v������q0�������Rk��R����W�e�E�#��"�?z���'>������x����p�?7�L��������vxh��2�U���E<�T%����������2�VE��FX������������"����uf���Q�������J/J�H'&�����a��Q�`��RgD��a F>���W�^��E���(�^�sD?�*qB��ZO+����WKg�8����kh���h5������IH"'��`@!
p��!�
�v\@e����yS��~���.(G��������u���f#�
*��&��b�sGc�fI����pI?���I�@��E)|�l{{��9���|��6����_;���o~,�{��{�!SM�ws�Ea_��Y�{��|��9�[�r������,�<�U�.>n%��.�.��Ei<�����i-���}�����o���#:����MP�8S�.s�����2:�,Kr\�$�[���(\�%
.���2.QK�M��)��El���,�Y*�$U�������F23�`��<��F�mZ���f�Z���9�<����jQ
Yi
�S�x4�s��1��LqFN��T:�����W�}I	�J�t	��B%�oB�PJJP\��H9~7IV�en���)���2�5oYm?>�zw�����`r���w�o�]����x������K%W�^���kKn�����_@�^�]G�.���������9?��x����������wk��&���@j�U��F��V������nU�Dq�� ;S&;S&�)S���II�C��!�SE ���QP��3zlY�c�������������[��B�7��:)aAIF]�p��;+h����qR����u���p�eIwp�!����v���A��i����_
��^%�^}�K�u=M��Nvv���>�����3����U.x�J��(����n�d8�u;N�������G������HV�����������M�r�z����3*g��	mS�����8|��:��Y�0�L�Pw��ds�J��<+V���R�R�����c���#�i�*f
kI�[�a��q�]H�1a�nz5���L&!/2�$KG�FHt��]Qc5{����M������O�xu��V#���$&�+���/`(�K�K<\%������+��u�
��"����w�&#���%�������}�}�6�I����G ~A 2}@Y��-PdY]G��n��S�w��x��mH�}wI������2��V����va�>x`7v�Q�6��������6:��=I�UX#pB������ -��.��Ls��h���-�I�@=�Q�/�F��F�P��otVq�	���K���}�7�h����(/�n�[�����{�K��%,n�{@��������@�4���*�0?@B8�?��6�e��>�H=����s�^�M�������<��95��t=�t���G�/u)x!����.)p9LBkA���f)@2hsII�N&��
O�"�<q�q/��^���J����������4��XF-IN���Z]�]k\�K�cN���&m��C�������g^A��}6 ��&���V^����g3l�^V���ur
����}���9l���
�����7�V?�� �F�5}�;`?��)Z�T�����G�O��Xd�E(���@����A�6
�b�F�������L�Z~5L���kh��	����AkD������b<o�~W���
�v���B�p�$�,s�ot��N&+��i�)�i�I��c��$F$�q��K	8+��y�R�5�LR����R"��IvT����!QW�)��GHt
���)�E&�<<����;c��H���"�SZ����rD�S�3���e��@%�xo�O�j0�s�u�`4o������o��~�Pn��_�����V���_�~K������~��Or[w�O�i��C��vb��s���?Z}��������e.�����~��%�#�P�j�H�:CzN�{7�e�>`�(	S�e� PNX�����d�_4��!����l���S�.P�V����#=5����{h����L���E��P������S�����H�=#��������������\����m�>�w���/���^����So	��WM�����>������Q�8�-	1�$��EXe��C�� �,5���O�J���~������\C|w�tY����e���s�����+n��St����9@�2P�)6���6�v�����N�
=SF���tO]jhzX�������U����l���J������k����s�\���<w��nI�����/�((&�J2�X&�d*q������7�f���V�n��X:��J��qZ
�l�@���H9F2Y��[l��nZX�C�1��z����K(#3&K"���J�N�^Q���]������^�0p���x1��E���,w/zKzk�����tS�P��JU�J�D�nBc'�bz��h���������`p�KE=������g��@�$��`���<JmE��!'�Q��]-���8AG���\j:�D@4�����;�s�$�65��36��I��
�!fw���2S���~q������z^�����������za���g��h�?������}#���o����|������B������g3����}��k�][����]+�9�l�_(e������>�""�+E�qn������J��� �e++1�:k>m�Hf��:JY�3i�p��ILjh��]�`�|a�_�L�[�������8�(g+�����{Z�0[ ���|f�|��������%���rE��\L�_|����l#$���U�o��UE��m3��4B�Uh���D��}4�'����1��0��r�$Z�4;���S����?����3)����S�2f�7n@��o��%�"��E���rs��������������H6P'�iC�����a�vE`�0S[X.,�nx�5�2�4���i���-����(�A���*�����>�7�� �w
(��[�k��5% +IV,hh~�QX��D ��^MQ/�^S�x���7��hr8 xLCC�$pa��QY&����k�H��BQ�B�EI�A���<vw2�1�D:�����T���M�+�
�6k�g����	��KS�
I���odO���
`��q���gj��x��^�:�l�>
�UL�Gt�E�pC�N�,��8�������O\�c�|��_-���w��c����,:P��/=:����sW��{�\
�so
��4��_��������v{;�cP�Z6'g]t�I�s�4��/@h�fU+�R��U�zUk��AU�T��-U��yU�}6��+
>}VT��e;|a�h?y�bw����C�|T!
bw��jq��{��\O��)T*g������K���'�������
m����O��Ys@��FMY]�_�?<�rQ%������w�w���������9w����r����(��te���Y7a�E�rg(���m�v��Py������y�?�Pw���r�_�sUN7���m?>eA4����%�K��)���l����|��:�h���VE
���8�X���V�A�����DLf�d^�
 �tU�}8�4�L�f%Ru}:���@����!T���5L��0�S�QM�+�!��M"��}���<��������b���X�]d���f�]���<���Y��4�f�x��t�c���U����RQy4�t�+{���>w��H8���y�YZ��r�I��[_Wng0]@���`(��D�M�P�5��;���-��~��sp����W���>|����J�t<t��ES�-�7����[&~�����n=Z�V��u~�����GX�/�}������?��#k.n�b���RjZ�D�f�7���y��za� 4�����xm|p|q��bq��1��,xY�En�'{Z��������z�^�,�@�0�a�O��B_E�\t�8_I
5���c	�y�������24#��E�bq�j �v��7�H������������^���Y��QM�#vt����Fg��,�p��%:���O�����-]`�b�"/#��o�L,�����S���tbP���6|���	���B��
�4UD13�������0C=L��6;4H������B8��h@�����R����@{c�|l����Oqt��hII
LhP��R1P�����l�@�g��n���%V����_@�&����]��a_���}�[�r���������}���T����;\7������*e�*6�
+=�4�1ee!������=.l�1h1�����p�waw@���Il�$1��{�d�u���m��@K?�����
���������Z}������C�S�SQM�#�|2��/\�-���Ok����.Mj��?�]:������y0���>�v���mEG�q� �����1]w�C�����-3�\c�	��~m�e�T������rgA��(a��1�3�ba1t��!A�!�%�C�C��w��C.��$"��oA��IL:I�:'v��7Vw�[����{ia!!�a��KO���R'�a6�-G��)��JN�
��oul!N�ROq�k�^���?��c�_n{��/EVO���S�����9����&k_z"��������7������X�T��i�N������d�:�?>>��f)3���������>�|����M���Oo	g��!��R�$�&ez�� R�� ����K�����9�����S������2<�s\����p�0���0�SK���=l���9�\$(gH�6��&6L�l5����(�����R�`2qL��)R<7�2��������h:Nn�qfS�je���k�����es�k�^�I�Hy���1Z���&���MWg�"!E�Duczpj�1K�����Ff(�i<j3�F�;�j�����
6�l6f��=���:��7������jvv%_�f����p���|���1�i���}j8�����^���T"���N�x�
�x�L�Z�e�k�@X��r������5�BEq����@8�?��zE��/���^����]��N�.p��(�=z(4&�\��q�c�A���U	�#�Q*q8{�������l!�������5!K��Yz����S����N/��W���X����t��n�P5��l����jmfj�Y7e�UVU���M�6H���8�w�W����8��s%LKvd,���������
v
Nc��FfQy�82�l�:l��/�����KX���}o_��{��������g�Pw�����C2V8��U��I�'�q1�P
W�>���H��o.�w���OS��tX#��B��J����D�D������#����+��Hg���a���uE��i���$������������1�Y|�!6�!�V1y��{0����;��������-G,J�K�E~f�W0����3�.���������K&�Y��-�=�e9�'���x*A'�������G��'��cTw,{�����l�V�i����)*�(��k(�2EO��U�v-�=.�5S���u���4e���b��������ljl`�qC���dA�������[��|f6P���X�K�����Y�.���v������E�(RT����S
���K]��������S�������R�_V��zk����eS�p���a�J&T��f���l�^S�a�%_��)7CA1�A��W�}�F�a�x
�D�X2�F��q�:�4���@m���b�o�\���r�'a#d�ZCkB|5L	�X��q�q�P771nL��Y��67�G�4�7��R�rh��5�a�rN��b�L���2���9��������&�h�4����W{�z�W<�,A8N�l���0���d���P��u�L�uFO�X4#��tU��B�����B(`I(��D��G�t���d~�L����/��
Y~����h����u��������?����o\������69:6�o��m�����x�k�����u�sU��|�����sJ��������($��Er��J���0n����@(R�M��sF�� �]���\��n�`S��Ji���_�Wp���L�-:�J+�tbj��,9�i�J����)[����|�l���p��������T���k������`>���M��M��M�?m����q���$`����C!:q�
1n�;I=�y�l;�3;F.���-�:K�'�E'����W��	����nQ�a]��f-��5���i����^��o�\����,sG#������#]��n��]/w��M�m�+B����BB�Z���q�
g�eV6�3����Z�Y)�N�rT�.'���I=��S�	3�8V���X�m��$9�Lv�A�����he��UiS:�O�����be��Fy�PuD�+j����Ni�����zF� ��(��?�o���N�/v��y��$�x�6��sa�y6��J��3A�;��w���*E~���H���-����v�2������s?,�{c{{;��C�N���)��lr��,��n�]�At��|����V_;W������3�B�$NQ8��7���)N��	f���8WXq!N��]�������������)T���W`%���jD�xq�r/��^�d�Z��C������SbFJ�
�@�I�7���d�Y��_)<��.��_<*�E���O9�UU��x"�������rZ��$r<�T� �*L7/c�J���].���c)����2��d��a�u��2W�Lb�e�LL#�oNX��1'i�A�hB���`����5���<�a�I+`&b�&�h�K.��S0�=FA]�!�%'iH��n�1�����AWt�6��zK��(7r�,���
.Vn���i��OKU��������E
�yoG�m���x�$�%(�e�b�sG	�����w
��a{�lw9I �����~�c���������N�������-��o�l ��0�;[RXt���%7��q�����3�q[nE�LR|}���� ��������������PONuw�4��E��{����
}����v����>}�mi�~�C��G(>��P�ba��F�<��p���Wb"'z��w���J=��wg�NQ�c�������	?��N@��~�(�\�C�u�R������m���N3�z��m2E���
,���`�9�w��D��IM�;@���99����>G4����5�p���;I����5�������Q���O��o��I��)%K*�J��U�$,��C=��[�[�$
E��-&6y�>3���3���-)),7�0'����o:�=\	��J��k����[b8�n��A�� F�X&�A�i)1�m�Qf����Fos��1z�
DjS��)��R�����0���F�eycyE����:��z:a��rd�����e�|�������mO)S�{�O[z�H��.O[�Q�
d�H����L�uk~_���1��������M��u�,�b��f����:����������z�����,�A������g�5����^��UN�j��{��������1���AA��()��[>�}d��a����w�;�y*'K�V��p�H8�����zA��bPWu��v���P���]PY�i�a�Q��.�b���[�v16�*eG�v�p�+��3�p���I��.S�]�\���mUS�:|<L������a>���@��M��P�aO0m�<�f!7�?������fu�/p�N������[o����L�>�!k'�� <`���:vP4UV%���)�c��zC���P!�����~x�G�O�1���/{�������#�������z��������������a�u����������l���,
EX�WR#�E���$�Y�#���:c�wP�><���*LU�-�����Ba�2�X�]��Q���&�+���Y�,�*M
�y�F�w0��xA��x�1c�,�l�C��EG�u��NX���8{Yj!q����,]�G�H2���Ig�����=b!u��f��������������`�3�`���`��u����Q�Ds�^]�V$��Q�*+���'&K�-�f���N��RMF/�W��Tx����|l�8*,$�iz}���~��7���Or�����~��uv.�sE��]�z3N`������ko�������VxQ�g-��^����oJ�%Iq�RK��+\�8�%)

�]�4�,_�M
M���h�����������E�M�MI����5��z~�1����b|��kQ�p�n.��:1w��;� D�A�H!����*6TKmU��|��E�*���3�E�C
�O9�Bw���S)n������WKjnp�n;����x��3z���s�p'�5�B���Y���a|N�;��������1��3�s�A0�g��M���oxj�=s7��'7L��������f����y�O6��9������_6��4���o����~K5�u�^�y7����6x����!�x~6��SVdE����8���!U��"c�4��>Rz�������wF�&��e�`�"#�s����{X+���ON��F���Gi�Y���d������[��G��L	���^0����]0x�y��'��K.�L�EM�K���>7��������@B�.^[������;;l ��'��E6�r�R(q�k5�F��K)�*C�&��*]����Z�����W9]	E��>#��	12��V�S���Tu�k�6U�/�W���]���z{���CS��*��MQ�]333+�����)����vO�����yJ}V{�������2�
G�,u���9@�yE�D��D�1E�Jly
S������d&��zE�3�4R�BY�������K�C�)�,�|��#wEH�'��G%���p�.y0�a�����j���`��n���S��*"E�����$�?������Q4���]�Q-�X�p]?zz=�����Ru���1��gF�����#,�A[a���+���v7�`�U��v�W�*zOz~���*{��H�/�V�,z��(�AIyU]k��~����~�
�����2|O��O�7��~1l)�}K2,L�y���{���KU�{y[2S������OGA�o!n��dd�%������r�U��%4z�C�=Fc�Y�R�%,Y��m4��n���l}�*��H	���ix
����z2��
)��^P$��[�N�PiJ��J5�+�U��1TlQ��^��
�*We��]����,�)�����B�<S������o����9��g���s�
���O��G_8����7�d���-��j~0X����'�{��C�����z~*��w��Q��(�^|��������E>����US�<~������%U��(�����r����5l9�]I6 :�J�<����Nk<k"aMW1������ 39��(E�X?G����p^��)�Z���i��#P~�JmR�tX���Z�Y���X��������S��ZEE-u��+[i����(��o��/�={s��e>z��E{��P3kk��rG�I���/
4������(�������o�u��]�lE����f�������rwl�����(���>B��
�e/��ph TG��`��������p��j�t�3a�Pm��_�����_��p�Qf������ Ow6[�	�P����=�(@���!{�0Y>�;C84*J'��-��Q�8�5��Gy�lBcl*5�����-�I��rD�G�*����PY4����O�Y-
(�"�8`
��7Ol����
��b���[��4�U�L^�!]6c�(UUkAU�3��r��ZH-�����V��{O�6\�.���c�<�������_F���yG�����k#i8�!�h�F|`FU���a�����]�����2��1Gh��$�~��q��1zd
����E	�,`"���V5����:}���z�R�A�"M7�{�dZ��N���V��%QR�*�`J�R��CA(`�C����u(	�G�DJFm@����"u�D����l<����)��k��d�|�r��o ������&�Q��r������G����^S?D�Q�BVO�j5<�FA���C�F���7X'���9/T������>ly�k��t,h�T���Z"��f#~��������,����
��=�$�iE�+��8B�vf�����4eQRa�F�Z�lY�����vY��YJ�X�����)Z�F�Z�Z��cG[����n���p�B�~��	y����[N���������tq8�����|���9�&� ������0�+L�9�L��?$,�lO;u�1?�`��X���A�����`�V�1N9��Svr�-��N��� �W�����ld7X����)g�zYz�����"���W��N2�������p2�^�I�� �>vBd�<KB�-v'l��dg����=�.T�~Y�HJjvCg���YuZ52-���F(�X���{i�fb�������,2����4�cK�	��~��U|�J.5�0�493i����:����,������!"kOqY/j�O�)��#^t).��5���Kq9�*r���T%g�u�^$���]$Z�Hy�k��"�R��q��Ly�w�x��\�+�����]<�T��
T���+<���@4�{��^~��_{o#�\Ok��nq����������'�SJ�%�k�4D;m����w�@�1������%9-y�nj��%N�ZZ���o
�2AeK����>Qu�5kN���S���U�&S5U�N�=1����&{��^�c�_[��������IPTUtV
�&���) /(��X�U�;�sS�����f�/��9������M�'��~8�.�(�D���e���u�=/�QY�$�_��.P��4tL���9�?c���*^��V��A&Z�h/2W���{.C��,�zf>�;9�����'ZZ�����D-���H���LV�����F#�Q��F�����'�$��?����nG}<I��#�o�j�V7���?�]��+����#�jY���?�]J����Z���B��
�k'8�C�C��
$��;u_���;����&�$XX��Bn��{��
�~4.����f�y���i�u��j���CY2���[���sxD��}�6����}����/��_~�����>j�E��z���d����]g�.����^%�

��vf+��X����V���caVj�������fiN��bQ;[+�
E�����x��]���z�.KKz�����d�E���n��1��NO�&�~l-�7T�'y^R9K�<����Sg�B�oP{�]�Z��_����4]�
���fo��90�;�7/�R\��4��_X�o27{7�n�?�ns�7^6���T?��]�2����S�\������y"����$���L<��!��|i����$�v�`��>�d�%���'5�W�$�A�vy`,,�`�����L���%�<x���a1�6����Zm���������54	�4������`����@�����G#�}���a��P����" ��<�7��uH�2��p:8�2X\_ W������whPK4��B�@�Yj�o�
0u��\�EY_���4��<*(��������/���[��G������sW]X�g�����g����O_���5kW� N��������Q�{��n������v��A������h.�/-|A�e6�j]
@%�Pj���^�����%�hc*�@&�S�1�Ux�!�Wn������:�6����E����J��#��X2h���#@�
]T���RT"�j����W��B�X��#��L�d�*���v����2�!$Q�����[��-w�{���[`JKmr/G�M���w&�QUg�2w&@�HH �$$�%B �&�LbX�B���DQQ�K���,
n(ET��b��!Q����"���`�V����K�
�����{�d���>��|�~����s���y�23�VR&�\2I�������v�K�&�u���7�e�C��&^y���<��83,�Bo���w����s}�L�Y��������s.rVZ_�s}�r������)�x���(O|�sYC��:��<m�p��o����#�?;���J���n�*m��U6�8�W)��<��77]���G�����T�u���4���;��nu�{�[=�QVx�{�*�u�g&th��i��g����-E��a�kO�
�Z.��yz�O�ux��R�L|��5��Xa�M7\�?�z�un]��w����/��������~b��I}��W��	�8q����_�����G���O{�e�����om��K��Y�C�s�l��������o����E����PT��K�*����t���J��4M���-����n�a���g�p?|rA	�&�.��b6����w\�����y|4�Ub6!���=4����S�d�k`�
:�Pv?+�jZ���|��@������7@�G��S"K`���s�Lo/�E�W�BZ����`��9�Mg��X�CK�=�<���1��B)��=w�y�e����2����B�� ����V����i�	��q:+�E� ����p�dTX��,t0ND�9�;���A������z�u�45���#p�^I�� ���gi5���*�I��i_�0<��X�tT"��o)[�+�3z���B��v���B��J����C��:�,CXG�|����v��E�{n�_��\�j����<�rWJ����gpS����i�:�~�?��S��
-�6��r��AH�qlD=���+xO"0@2��
`8����&�+��MQ?P7\{������V������z�.'��Bs$i�Nn/\g�m���Mq�����sD��;���TH����i�A�A�-[r�C��=���s	�j���\g9~��|��&�mB����mR#J�u}�-�������n\�>e����X��\��
�>��5vH���t�g
BYN���\����\������Fzyz������k���������|q�t�����d���X���z��}�f�/�F�Dz�6�>����}
L��U�9J�{
y
���\�O���4T����C?���)�?Q��������J�j#-r��e�J�iK=@~?�ua���:��K����Sr�/���@�{S���[�
�Q�d"�
�?��}4Xd�W�x�~��' �������q���z��blA�n�S�c��~����>��9�gl�N��n�����7h�l�=$��Od�G?���j��h�)�Ys�kn2rp�{�2�B�o	���f�O��c�eO��q�5�����F�����8Z"�el��(w��"��dD~"�s�������H��=��s��� J�q��D�a�3�E�i����;�:��"��"�{��T�l��J����OA_��*��8.{����C?�H���&����:�~zJ��;*��=�����
�o����X�E^��E�s^��F]$�����5���
�wh=�fB���w<S8.��M����h_K�7-A�C��O3Oh���[��-�<�L	��p�H{�n�����h���u�x�0���z�2��r�-w��D����N��h�K��'�mB�Ka�~�X�a����O����H�����}I�����8��0��.��*�Jc
h!��%�,����J����b{u����<���o�+�����v�~��h���Q-�*�}��^G���w��Z��%����l���*������V��*m����N����?���'���(��4�mk���[���Ys*���~�_"�6�8�EH�x�)���'�q
���c+���������m3?=-�<I]N��:�t�v�B�d�#_G;n�+����`z���� �Zf�n��p��"d�!��1�)�!,a�������pZ�����?����|m�3N���!o�~�Y��.�g���}u�z�>�fWw���(C#�_m��L�o@X>���h�d���0��R�
?9n?��p����S�U���r���D9@^�F�As?�;��i��W{G�������iv�k[f��*���A�><H#=�����K#�e<{�T��TL���j��`�S������k7��6B���#���i�m�Q��|
���QLX��|�V[������Y>��_��AfB�,�o��6�l�N;�/i���m8�;�?��������Rux��!��<����O.���4�/�><�~h2�{�����D�����|���7������w�uR�L��v��#�Wl�?�*����V���jp��0���#������/�w'�o��M0�^��0�W�R��y�b�%���)���]�>���:����k^��9�?X����t�5��oK��8���3}�/>�9���(��p�)ft��G�.������7��"\�.�d���W��Y�\/�\">Sx�/�%����U���/�.��9��2�D����
�F�;�s?���u
�w��|�d�o�1��[O�c��~n��#���#�pp:{�a�q�s,���5v����4ct�8����q�&j$�0n�Y�8��S��6�m��?���;~�����f'�<w�=[��F�B8������s-��g;�7iF�
�@o9�n@��L��a7��=�x����7��Y�� �*�y}�w������m�����������y&���9��
F�X�
\c�5�!��*F]�����o�7�ClS���30�������	������x�v����OjY�3��[���bmy�E?���k_�KbM��b��>�B��>{��8^r��z�Y'���G1N�x�c�-{Bst^�=Ji��P�!w���y}��+�?y�:F�:��4P/�B��[�TSx�E;$�j���v!� �����iM�Z�����b�i��-��c�{�1#K��L��U[Y����n�5M�f�N �WN��zLx��?�h��G�:�����n�1�nPi�W�����i�.�=g�1�����\�/�I�|���5�'!���" ��;,�K��t![7��T��g���T��}��"���5����6��o�������W)Q?��G�?�k�`��>��A�muP_!���������w��of�<�k�U���
�����>���u5s����y����c��{���2�,�4[���b���>���/`�i�R���b0��4��\"<�h#��t�~�F�����i���2���_v��KB��G���B�J����U2�h�M�0$4uF����|f��}�.�����i:!�]��\f&�V+ir��L�'[��N��;���0����PB�T��!�u����h��rIo�}*�����~Yf;�=��N�}7'�gY�����t�N���g:�}�����7�	����"'�/��8]>g8�}��q��_���'�c�_��t���x�%����}�/��s�4�^�����|`�lH���<^�W�y�9�����I�9;s�J���o�N+l���������s<��
O��}o=d:X-��D�������}s��F�/�������Y�B�s�n���.�G���Sd~p���w��t\_�>c:��.�jK����E������:�~J�Dg����#z��T�z������=�J�M��D�����K�]��B��
�������Eb_��~�um)]�^�����[hq�g�����.���G�s�%�RR��p���~�����/�V�5�`|��6�,�W>�������������\�H�YE3�g���R��))j1���h"�l�vh��M]`����"��>�/���B�1��b�p]�<���P�9=M:���:��{�vGA���%Q	����0��f�}�Y2�����
�ZL������<���������n���B_��Zr= $�w�~[���Y	�^c�Q!�B�����@��J��a������ie�>�XqJ'���uI����� ��Y�"*v]�|�L����C	����B����h�1������9O�����1�����{`���}��h��kZ-����+����\`�7��/��f�oB;4yN
[���B�CR��Ty�j�)�e�����6�\C�6�g�Z��w���m3����F��T`�z�S�������!�'��5�5����y~�t�Y���Z���'�{���Rf����!����H����?t�N��u�e+��5�i�2
��LH�z,����>?��9�3\w��L	��	Z��H���>����>�����|������d�)�K�'�?���+4G�������B��gy@n�`O����gZ�>"���4;��|��i������|��[��n��\�>S���g�l�g��V�����B�������l�$+��n|VI���4S�W�9�,����fy��0����v�g,����q�������O����{B:���Y�����H��9Y�-w���_�Y'�|f�5�67`�t�}'�<��f��u+����W]��u�d�d����[�9HC�������6�a7�
<!�m�co�h��e�/����C:NP�8_��������_I��a<����M�e��c�6����E������ -��/�Ma������ ��p9��1���"�����b}i���'����8��w�Y'��4�(\P/&��$�mm�X�w�Z����T��~�,��������\�'@��2���
��'��h:��C�0�w���d�?���T	�����5
?U*�aUjx���)i�(�8yo��]��`��y]���g���7����v�h��,5�]�f��%�����4FY�)��G��-s
��T��{�\J�3t��\��j�~�!��A�u}K�����1����4��n�'g��7��o����^�1(C���T7��9!�S�x$��+��m�42q~�jkB�u�B���`�<�=������[�T{�OP������er{(F�Z{e�g��nI]^�-�;<�5��^�:��k	�[�R����f�X���J��zKu�C�n����m��8��#�T�g7�:��������~�*���1�4�f#�����ndt2M�>j�kK��h��0GrM��IS�������"�}�}S��6o~/?� �Sym��R-A|���\��\������MS���>U�|��4�7����{i����)���H�K �B�{���S
(J�����b���O�h�Y�}u��a����d�����Z}����|
��{5<�sj>����[���y���(��)�� �-}I���r�-�����������
���bL�C}�loh�E4H���\�W��9� -z���.�	ed��+�y��p��szk����0Ya!�i��?C/k�q�|�8��Se~-����D��{Bs?{.g�5�F�kh�vt�|&I��/��o72��^zB�e���p7�7��2x�
�X�TM��g�8_B���|~�y��C��+�9��zK_�t��3��F|v������*^�}�����Q�G����-\���Q��D��b9����4��(��n��IBMN�>���P����#Q��(
<�S�B�.D��[����B�!��
�#:��=H4y���Up-Qa���aDc1�}At�'�#���Eq;��Hw�v�i���]��b�UDWTF�O3�!B�"D�!B�"D�!B�"D�!B�"D�!B�"D�!B�"D�!�B�_����K���T�R6�����o�E������L������U�S��������kF��uZzml\NL~?-o��T\�����T������[AxD���T0�����%�������D�MDc��t�@#��`�������f.X�W��_�Z�� ��k�=B�^uu�0^f�/���e��`�%�Y����
<���_`�^gY2�gN�e���]��Z<��_����D1�B>Z��Q��!m�ZlmFf��M'ES5�*�g�����N9��TS=B��S����z����c������Oh+h��	��I�-Pr�������z�>�����(F�#e�<P��p��?��U?��"�|�T�C\��H�������? j��:<g�����7����kwy�S��Ss�jT&J5j���F� �GM���:-�&�J_��imj�o]��]
1y!�K�`"�������`�2\� U�^h������U�`���5���x�MuuE����*���+B���,�^��}�+5)>�o��?^H/d6���k3b}f~'�y��5��	���A�QS���Kv�>4O�ZC_
�$m���*�?�<T�T�d�s.�pY��6S�g�|F�d�� ���y�2��%���q����o�_2+��_2�U��/�&��:u���|C'�QR�c���K7#�nF.�L�z3��c:�����}�c��Y}���J�%p���f*��J�v%��.UYJ I	�(���CV��'����>%���R�J���P��P��V3n������Fy�H�>1jr4
u>
}B���)L~8J�a9NLa���o�e�N�����nx��b�M��h7^�/��5T�]�0��=���5�l�*�p":G�Jse���e�HO`���z�_���O�&y��c�����eB�����x"����T�Do�W�w�����(�^�>JFA�/�}5��}u�������8�W����)�)S�	9���y0%yX�MI�f����x���<�W�td_�}��>�}�T��������R�t������}�&-���������:�>U8��4���>��v<X]���b�������$�3��V����]�9�7�+L�����;����.��Z������B�u����$MO/�2�N��?���]�����q��Ns������.�X������������z����<���_x�b�z��t+��{��W������S��V�(E�]3��������uJ�I����%[DE��������E��YEA���K�)��e�
�K��\Z��luW�`�y�;HQ:���;��w-/+������bGv>����ty
��������+��K���e��1����+�S�Kw(G��F�P����t�6R9:�"��F���)%��*��;��w���R=)�������w,�.*�z
w=���;]aw��2Fn��n��R�pS�55����p���p�}�����	�N���$%I8Q�Q�p��tNJZ�dK'KCN���4��M��&���&� �����O��Y�����(�Q>jf�����f���{n��\���mF?H
j��/�1��e3�e�3�3�S��(o�q9?�^���GM.�V��YX3�?bT�e�e�c&�=������:{b+/��/;��3���C��k(�5���#�"Q�'�n�PA�y���U��C}��=�� �{�HQyG�%��^me��*vH/F~�/�_>?B��Ga#%����^�$ya�)�����Xu#%�����_�?X���3��fU���F��V�#*
�-.
�M�V�����tNR���}�Qu�.��?,�aKM9d�\����O-��?�Pw�*�eU�i����*�������i����xx�*C��,��~����+�,8�6�n�w2/�Ii���*;KB�YY�������2
endstream
endobj
20
0
obj
<<
/Type
/Font
/Subtype
/CIDFontType2
/BaseFont
/MUFUZY+ArialMT
/CIDSystemInfo
<<
/Registry
(Adobe)
/Ordering
(UCS)
/Supplement
0
>>
/FontDescriptor
22
0
R
/CIDToGIDMap
/Identity
/DW
556
/W
[
0
[
750
0
0
277
]
4
7
0
8
[
889
]
9
14
0
15
[
277
333
277
0
]
19
28
556
29
[
0
277
]
31
35
0
36
37
666
38
41
0
42
[
777
722
]
44
47
0
48
[
833
722
777
666
0
0
666
610
0
0
943
]
59
67
0
68
69
556
70
[
500
556
556
277
556
556
222
0
500
222
833
]
81
83
556
84
[
0
333
500
277
556
500
722
0
500
500
]
94
181
0
182
[
222
]
]
>>
endobj
22
0
obj
<<
/Type
/FontDescriptor
/FontName
/MUFUZY+ArialMT
/Flags
4
/FontBBox
[
-664
-324
2000
1005
]
/Ascent
728
/Descent
-210
/ItalicAngle
0
/CapHeight
716
/StemV
80
/FontFile2
23
0
R
>>
endobj
����6YY$'�FP\�$
��+'��s���(T%l���9[,~d��F��?�/��o����������K�����D���)������@S-���6 ��(2m��>�#���t�-��P	+�MN]��o0�8*,i��._�v���q���Ry|������������qU����r�?�W�U��5��#�6m��U�7��P�#u�F$6����t�BKJ����&[���h����l��()i*Y,�Sy�)��?����	v�U�bY������[�`&b�/��-��d�R��$;�8����1Z�

#16

Amit Kapila

amit.kapila16@gmail.com

almost 8 years ago

In reply to: Tsunakawa, Takayuki (#11)

Re: zheap: a new storage format for PostgreSQL

On Fri, Mar 2, 2018 at 1:50 PM, Tsunakawa, Takayuki
<tsunakawa.takay@jp.fujitsu.com> wrote:

From: Amit Kapila [mailto:amit.kapila16@gmail.com]

At EnterpriseDB, we (me and some of my colleagues) are working from more
than a year on the new storage format in which only the latest version of
the data is kept in main storage and the old versions are moved to an undo
log. We call this new storage format "zheap". To be clear, this proposal
is for PG-12.

Wonderful! BTW, what "z" stand for? Ultimate?

There is no special meaning to 'z'. We have discussed quite a few
names (like newheap, nheap, zheap and some more on those lines), but
zheap sounds better. IIRC, one among Robert or Thomas has come up
with this name.

Below are my first questions and comments.

(1)
This is a pure simple question from the user's perspective. What kind of workloads would you recommend zheap and heap respectively?

I think you have already mentioned some of the important use cases for
zheap, namely, update-intensive workloads and probably the cases where
users have long-running queries with updates.

Are you going to recommend zheap for all use cases, and will heap be deprecated?

Oh, no. I don't think so. We have not yet measured zheap's
performance in very many scenarios, so it is difficult to say about
all the cases, but I think eventually Deletes, Updates that update
most of the indexed columns, and Rollbacks will be somewhat costlier in
zheap. Now, I think at this stage we can't measure everything because
(a) a few things are not implemented yet and (b) we have not done much
performance optimization of the code.

I felt zheap would be better for update-intensive workloads. Then, how about insert-and-read-mostly databases like a data warehouse? zheap seems better for that, since the database size is reduced. Although data loading may generate more transaction logs for undo, that increase is offset by the reduction of the tuple header in WAL.

We have done an optimization where we don't need to WAL-log the complete
undo data, as it can be regenerated from the page during recovery if
full_page_writes is enabled.

zheap allows us to run long-running analytics and reporting queries simultaneously with updates without the concern on database bloat, so zheap is a way toward HTAP, right?

I think so.

(2)
Can zheap be used for system catalogs?

As of now, we are not planning to support it for system catalogs, as
it involves much more work, but I think if we want we can do it.

(3)

Scenario 1: A 15 minutes simple-update pgbench test with scale factor 100
shows 5.13% TPS improvement with 64 clients. The performance improvement
increases as we increase the scale factor; at scale factor 1000, it
reaches 11.5% with 64 clients.

What was the fillfactor?

Default.

What would be the comparison when HOT works effectively for heap?

I guess this is the case where HOT works effectively.

(4)
"Undo logs are not yet crash-safe. Fsync and some recovery details are yet to be implemented."

"We also want to make FSM crash-safe, since we can’t count on
VACUUM to recover free space that we neglect to record."

Would these directly affect the response time of each transaction?

Not the first one, but the second one might, depending on the actual
implementation; I think it is difficult to predict much at this
stage.

(5)
"The tuple header is reduced from 24 bytes to 5 bytes (8 bytes with alignment):
2 bytes each for infomask and infomask2, and one byte for t_hoff. I think we
might be able to squeeze some space from t_infomask, but for now, I have kept
it as two bytes. All transactional information is stored in undo, so fields
that store such information are not needed here."

"To check the visibility of a
tuple, we fetch the transaction slot number stored in the tuple header, and
then get the transaction id and undo record pointer from transaction slot."

Where in the tuple header is the transaction slot number stored?

In t_infomask2; refer to zhtup.h.
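
A minimal sketch of the header layout and slot lookup described above. The field sizes (2 + 2 + 1 bytes) and the four slots per page come from the thread; the position of the slot number inside t_infomask2 (top 3 bits here) and all helper names are assumptions made for illustration. The real layout is in zhtup.h.

```python
import struct

# Sizes taken from the thread: two 2-byte masks plus a 1-byte t_hoff.
ZHEAP_HEADER = struct.Struct("<HHB")   # infomask, infomask2, t_hoff

# ASSUMPTION: the slot number occupies the top 3 bits of t_infomask2.
# The real bit layout is defined in zhtup.h and may differ.
SLOT_SHIFT = 13
SLOT_MASK = 0x7 << SLOT_SHIFT


def align_up(size, boundary=8):
    """Round size up to the next multiple of boundary."""
    return (size + boundary - 1) // boundary * boundary


def pack_header(infomask, natts, slot_no, t_hoff):
    """Build the 5-byte header, storing slot_no in the assumed bit field."""
    infomask2 = (natts & ~SLOT_MASK & 0xFFFF) | (slot_no << SLOT_SHIFT)
    return ZHEAP_HEADER.pack(infomask, infomask2, t_hoff)


def slot_of(header_bytes):
    """Extract the transaction slot number from a packed header."""
    _, infomask2, _ = ZHEAP_HEADER.unpack(header_bytes)
    return (infomask2 & SLOT_MASK) >> SLOT_SHIFT


def visibility_info(header_bytes, page_slots):
    """Return the (xid, undo_ptr) pair held in the tuple's transaction slot."""
    return page_slots[slot_of(header_bytes)]


# Four transaction slots per page, each holding (xid, undo record pointer).
page_slots = [(0, 0), (1001, 0x2000), (1002, 0x2040), (0, 0)]
hdr = pack_header(infomask=0, natts=3, slot_no=2, t_hoff=5)
print(len(hdr), align_up(len(hdr)), visibility_info(hdr, page_slots))
```

The point of the sketch is only that the tuple carries a small slot number rather than transaction ids; the xid and undo pointer live once per slot on the page.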

(6)
"As of now, we have four transaction slots per
page, but this can be changed. Currently, this is a compile-time option; we
can decide later whether such an option is desirable in general for users."

"The one known problem with the fixed number of slots is that
it can lead to deadlock, so we are planning to add a mechanism to allow the
array of transactions slots to be continued on a separate overflow page. We
also need such a mechanism to support cases where a large number of
transactions acquire SHARE or KEY SHARE locks on a single page."

I wish for this. I was bothered with deadlocks with Oracle and had to tune INITRANS with CREATE TABLE. The fixed number of slots introduces a new configuration parameter, which adds something the DBA has to be worried about and monitor a statistics figure for tuning.

Yeah.

(7)
What index AMs does "indexes which lack delete-marking support" apply to?

Currently, delete-marking is not supported for any of the indexes, but
we are planning to do it for B-tree.

Can we be freed from vacuum in a typical use case where only zheap and B-tree indexes are used?

Depends on what you mean by typical workloads. I think for some
workloads, such as inserting monotonically increasing values and then
deleting the initial values from the index (say someone inserts
11111111111111...2222222222....333333... and then deletes all the 1's),
we might not immediately reclaim space in the index. I don't think we
need vacuum per se for such cases, but we will eventually need some way
to clear the bloat. However, I think we are still far from there.

(8)
How does rollback after subtransaction rollback work? Does the undo of a whole transaction skip the undo of the subtransaction?

We rewind the undo pointer after rolling back a subtransaction, so we
just need to roll back the remaining part.
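
A toy sketch of that idea, assuming a purely in-memory undo log; the class and method names are hypothetical and not zheap's API. Rolling back to a savepoint applies undo records down to the remembered position and rewinds the pointer, so a later full rollback only sees what is left.

```python
class Transaction:
    """Toy model: a table of key -> value plus a linear undo log."""

    def __init__(self, table):
        self.table = table
        self.undo = []       # list of (key, old_value) records
        self.applied = 0     # number of undo records applied so far

    def update(self, key, value):
        # Record the old value (None means the key did not exist).
        self.undo.append((key, self.table.get(key)))
        self.table[key] = value

    def savepoint(self):
        return len(self.undo)    # remember the current undo pointer

    def _apply_undo_down_to(self, position):
        while len(self.undo) > position:
            key, old = self.undo.pop()   # popping rewinds the undo pointer
            if old is None:
                self.table.pop(key, None)
            else:
                self.table[key] = old
            self.applied += 1

    def rollback_to(self, savepoint):
        self._apply_undo_down_to(savepoint)

    def rollback(self):
        self._apply_undo_down_to(0)


table = {"a": 1}
txn = Transaction(table)
txn.update("a", 2)
sp = txn.savepoint()
txn.update("a", 3)
txn.update("b", 9)
txn.rollback_to(sp)   # undoes the two changes made after the savepoint
txn.rollback()        # only the one remaining record is left to undo
```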

(9)
Will the prepare of 2pc transactions be slower, as they have to safely save undo log?

I don't think so. For prepared transactions, we just need to save the
'from' and 'to' undo record pointers. OTOH, we have not yet measured the
performance of this case.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#17

Alexander Korotkov

a.korotkov@postgrespro.ru

almost 8 years ago

In reply to: Amit Kapila (#16)

Re: zheap: a new storage format for PostgreSQL

On Fri, Mar 2, 2018 at 1:31 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Mar 2, 2018 at 1:50 PM, Tsunakawa, Takayuki
<tsunakawa.takay@jp.fujitsu.com> wrote:

From: Amit Kapila [mailto:amit.kapila16@gmail.com]

At EnterpriseDB, we (me and some of my colleagues) are working from more
than a year on the new storage format in which only the latest version of
the data is kept in main storage and the old versions are moved to an undo
log. We call this new storage format "zheap". To be clear, this proposal
is for PG-12.

Wonderful! BTW, what "z" stand for? Ultimate?

There is no special meaning to 'z'. We have discussed quite a few
names (like newheap, nheap, zheap and some more on those lines), but
zheap sounds better. IIRC, one among Robert or Thomas has come up
with this name.

I would propose "zero-bloat heap" disambiguation of zheap. Seems like fair
enough explanation for me without need to rename :)

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#18

Thomas Munro

thomas.munro@enterprisedb.com

almost 8 years ago

In reply to: Alexander Korotkov (#17)

Re: zheap: a new storage format for PostgreSQL

On Fri, Mar 2, 2018 at 11:35 PM, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:

On Fri, Mar 2, 2018 at 1:31 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Mar 2, 2018 at 1:50 PM, Tsunakawa, Takayuki
<tsunakawa.takay@jp.fujitsu.com> wrote:

Wonderful! BTW, what "z" stand for? Ultimate?

There is no special meaning to 'z'. We have discussed quite a few
names (like newheap, nheap, zheap and some more on those lines), but
zheap sounds better. IIRC, one among Robert or Thomas has come up
with this name.

I would propose "zero-bloat heap" disambiguation of zheap. Seems like fair
enough explanation for me without need to rename :)

Nice.

A weird idea I had is that it adds a Z dimension to your tables.
That's a bit... far fetched, I admit.

--
Thomas Munro
http://www.enterprisedb.com

#19

Aleksander Alekseev

a.alekseev@postgrespro.ru

almost 8 years ago

In reply to: Amit Kapila (#1)

Re: zheap: a new storage format for PostgreSQL

Hello Amit,

Sometime back Robert has proposed a solution to reduce the bloat in
PostgreSQL [1] which has some other advantages of its own as well. To
recap, in the existing heap, we always create a new version of a tuple on
an update which must eventually be removed by periodic vacuuming or by
HOT-pruning, but still in many cases space is never reclaimed completely.
A similar problem occurs for tuples that are deleted. This leads to bloat
in the database.

This is impressive work!

Personally I would like to note that performance is probably not a
priority at this stage. The most important parts, in my humble opinion at
least, are correctness, maintainability (tests, documentation, how
readable the code is), extendability (e.g. an ability to add point in
time recovery in the future), interfaces and heap format. There is a
saying about premature optimization... I don't remember the exact words
or who said it.

--
Best regards,
Aleksander Alekseev

#20

Robert Haas

robertmhaas@gmail.com

almost 8 years ago

In reply to: Alexander Korotkov (#17)

Re: zheap: a new storage format for PostgreSQL

On Fri, Mar 2, 2018 at 5:35 AM, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:

I would propose "zero-bloat heap" disambiguation of zheap. Seems like fair
enough explanation for me without need to rename :)

It will be possible to bloat a zheap table in certain usage patterns.
For example, if you bulk-load the table with a ton of data, commit the
transaction, delete every other row, and then never insert any more
rows ever again, the table is bloated: it's twice as large as it
really needs to be, and we have no provision for shrinking it. In
general, I think it's very hard to keep bulk deletes from leaving
bloat in the table, and to the extent that it *is* possible, we're not
doing it. One could imagine, for example, an index-organized table
that automatically combines adjacent pages when they're empty enough,
and that also relocates data to physically lower-numbered pages
whenever possible. Such a storage engine might automatically shrink
the on-disk footprint after a large delete, but we have no plans to go
in that direction.
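
The bulk-delete case can be put into numbers with a small sketch. The page capacity and page count are made-up figures, and the model assumes that space can only be given back by dropping empty pages at the end of the file.

```python
ROWS_PER_PAGE = 100   # ASSUMPTION: illustrative page capacity


def pages_in_file(page_row_counts):
    """Count pages up to the last non-empty one (only trailing empty pages can go)."""
    last = 0
    for i, rows in enumerate(page_row_counts, start=1):
        if rows > 0:
            last = i
    return last


def min_pages_needed(page_row_counts):
    """Pages required if the surviving rows were packed densely."""
    total = sum(page_row_counts)
    return -(-total // ROWS_PER_PAGE)   # ceiling division


pages = [ROWS_PER_PAGE] * 1000           # bulk load: 1000 full pages
pages = [rows // 2 for rows in pages]    # delete every other row
print(pages_in_file(pages), min_pages_needed(pages))
```

No page becomes empty, so the file keeps all of its pages even though half of them would be enough.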

Rather, our assumption is that the bloat most people care about comes
from updates. By performing updates in-place as often as possible, we
hope to avoid bloating both the heap (because we're not adding new row
versions to it which then have to be removed) and the indexes (because
if we don't add new row versions at some other TID, then we don't need
to add index pointers to that new TID either, or remove the old index
pointers to the old TID). Without delete-marking, we can basically
optimize the case that is currently handled via HOT updates: no
indexed columns have changed. However, the in-place update has a
major advantage that it still works even when the page is completely
full, provided that the row does not expand. As Amit's results show,
that can hugely reduce bloat and increase performance in the face of
long-running concurrent transactions. With delete-marking, we can
also optimize the case where indexed columns have been changed. We
don't know exactly how well this will work yet because the code isn't
written and therefore can't be benchmarked, but I am hopeful that
in-place updates will be a big win here too.
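
A toy sketch of an in-place update backed by undo, for the case where no indexed column changes. The class is hypothetical, and visibility is simplified to "a version is visible if its xid is not newer than the snapshot", which ignores commit status and is not zheap's actual visibility rule.

```python
class ToyZheap:
    """Toy model: one current version per TID, older versions in an undo log."""

    def __init__(self):
        self.rows = {}           # tid -> (xid, row dict)
        self.undo = {}           # tid -> list of (xid, row dict), oldest first
        self.index_inserts = 0   # how many index pointers were created

    def insert(self, tid, xid, row):
        self.rows[tid] = (xid, dict(row))
        self.index_inserts += 1          # one index pointer to this TID

    def update_in_place(self, tid, xid, changes, indexed_columns):
        if indexed_columns.intersection(changes):
            # Without delete-marking this case is not done in place.
            raise NotImplementedError("indexed column changed")
        old_xid, old_row = self.rows[tid]
        self.undo.setdefault(tid, []).append((old_xid, old_row))
        self.rows[tid] = (xid, {**old_row, **changes})
        # The TID is unchanged, so no new index entries are needed.

    def read(self, tid, snapshot_xid):
        """Return the newest version whose xid is <= snapshot_xid."""
        xid, row = self.rows[tid]
        if xid <= snapshot_xid:
            return row
        for old_xid, old_row in reversed(self.undo.get(tid, [])):
            if old_xid <= snapshot_xid:
                return old_row
        return None


heap = ToyZheap()
heap.insert(tid=(0, 1), xid=10, row={"id": 1, "balance": 100})
heap.update_in_place((0, 1), xid=20, changes={"balance": 80},
                     indexed_columns={"id"})
new_view = heap.read((0, 1), snapshot_xid=25)   # sees the updated row
old_view = heap.read((0, 1), snapshot_xid=15)   # reconstructed from undo
```

The row never moves, so the page does not grow and the index keeps its single pointer, while an older snapshot still finds its version in undo.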

So, I would not describe a zheap table as zero-bloat, but it should
involve a lot less bloat than our standard heap.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#21

Amit Kapila

amit.kapila16@gmail.com

almost 8 years ago

In reply to: Aleksander Alekseev (#19)

Re: zheap: a new storage format for PostgreSQL

On Fri, Mar 2, 2018 at 7:06 PM, Aleksander Alekseev
<a.alekseev@postgrespro.ru> wrote:

Hello Amit,

Sometime back Robert has proposed a solution to reduce the bloat in
PostgreSQL [1] which has some other advantages of its own as well. To
recap, in the existing heap, we always create a new version of a tuple on
an update which must eventually be removed by periodic vacuuming or by
HOT-pruning, but still in many cases space is never reclaimed completely.
A similar problem occurs for tuples that are deleted. This leads to bloat
in the database.

This is impressive work!

Thanks.

Personally I would like to note that performance is probably not a
priority at this stage.

Right, but we are also trying to see that we just don't fall off the
cliff for some more common workloads.

The most important parts, in my humble opinion at
least, are correctness, maintainability (tests, documentation, how
readable the code is), extendability (e.g. an ability to add point in
time recovery in the future), interfaces and heap format.

+1.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#22

Mark Kirkwood

mark.kirkwood@catalyst.net.nz

almost 8 years ago

In reply to: Robert Haas (#20)

Re: zheap: a new storage format for PostgreSQL

On 03/03/18 05:03, Robert Haas wrote:

On Fri, Mar 2, 2018 at 5:35 AM, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:

I would propose "zero-bloat heap" disambiguation of zheap. Seems like fair
enough explanation for me without need to rename :)

It will be possible to bloat a zheap table in certain usage patterns.
For example, if you bulk-load the table with a ton of data, commit the
transaction, delete every other row, and then never insert any more
rows ever again, the table is bloated: it's twice as large as it
really needs to be, and we have no provision for shrinking it. In
general, I think it's very hard to keep bulk deletes from leaving
bloat in the table, and to the extent that it *is* possible, we're not
doing it. One could imagine, for example, an index-organized table
that automatically combines adjacent pages when they're empty enough,
and that also relocates data to physically lower-numbered pages
whenever possible. Such a storage engine might automatically shrink
the on-disk footprint after a large delete, but we have no plans to go
in that direction.

Rather, our assumption is that the bloat most people care about comes
from updates. By performing updates in-place as often as possible, we
hope to avoid bloating both the heap (because we're not adding new row
versions to it which then have to be removed) and the indexes (because
if we don't add new row versions at some other TID, then we don't need
to add index pointers to that new TID either, or remove the old index
pointers to the old TID). Without delete-marking, we can basically
optimize the case that is currently handled via HOT updates: no
indexed columns have changed. However, the in-place update has a
major advantage that it still works even when the page is completely
full, provided that the row does not expand. As Amit's results show,
that can hugely reduce bloat and increase performance in the face of
long-running concurrent transactions. With delete-marking, we can
also optimize the case where indexed columns have been changed. We
don't know exactly how well this will work yet because the code isn't
written and therefore can't be benchmarked, but I am hopeful that
in-place updates will be a big win here too.

So, I would not describe a zheap table as zero-bloat, but it should
involve a lot less bloat than our standard heap.

For folk doing ETL-type data warehousing this should be great, as the
typical workload tends to be: COPY (or similar) from a foreign data
source, then do several sets of UPDATEs to fix/check/scrub the
data... which tends to result in huge bloat with the current heap design
(despite telling people 'you can do it another way' to avoid bloat -
I guess it seems more intuitive to just do it as described).

regards
Mark

#23

Amit Kapila

amit.kapila16@gmail.com

over 7 years ago

In reply to: Alexander Korotkov (#17)

Re: zheap: a new storage format for PostgreSQL

On Fri, Mar 2, 2018 at 4:05 PM, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:

On Fri, Mar 2, 2018 at 1:31 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Mar 2, 2018 at 1:50 PM, Tsunakawa, Takayuki
<tsunakawa.takay@jp.fujitsu.com> wrote:

From: Amit Kapila [mailto:amit.kapila16@gmail.com]

At EnterpriseDB, we (me and some of my colleagues) are working from more
than a year on the new storage format in which only the latest version of
the data is kept in main storage and the old versions are moved to an undo
log. We call this new storage format "zheap". To be clear, this proposal
is for PG-12.

Wonderful! BTW, what "z" stand for? Ultimate?

There is no special meaning to 'z'. We have discussed quite a few
names (like newheap, nheap, zheap and some more on those lines), but
zheap sounds better. IIRC, one among Robert or Thomas has come up
with this name.

I would propose "zero-bloat heap" disambiguation of zheap. Seems like fair
enough explanation for me without need to rename :)

It's been a while since we have updated the progress on this project,
so here is an update. This is based on the features that were not
working (as mentioned in Readme.md) when the branch was published.
1. TID Scans are working now.
2. Insert .. On Conflict is working now.
3. Tuple locking is working with a restriction that if there are more
concurrent lockers on a page than the number of transaction slots on a
page, then some of the lockers will wait till others get committed.
We are working on a solution to extend the number of transaction slots
on a separate set of pages which exist in heap, but will contain only
transaction data. There are also some corner cases where it doesn't
work for Rollbacks.
4. Foreign keys are working.
5. Vacuum/Autovacuum is working.
6. Rollback of prepared transactions is working.

Apart from this, we have fixed some other open issues. I think to
discuss some of the designs, we need to start separate threads (like
Thomas has already started a thread on undo logs [1]), but it is also
okay to discuss them on this thread. One specific thing where we
need some input is the testing of this new heap. As of now, the
idea we are using to test it is a guc parameter (storage_engine)
which, if set to zheap, makes all the regression tests create tables
in zheap so that the operations are zheap-specific. This
basically works okay, but the results are different than expected in
some cases: (a) in-place updates cause rows to be printed in a
different order, (b) ctid-based tests give different results because
zheap has a metapage and TPD pages, (c) \d+ shows storage_engine as an
option, etc. We work around this by either creating a separate .out file
for zheap or sometimes by masking the expected different output (for
example, we don't compare the additional storage_engine option in the
output of \d+). I know this is not the best way to test a new storage
engine, but for now it has helped us a lot. I think we need some generic
way to test new storage engines. I am not sure if it is good to discuss
it here or whether it belongs to the Pluggable API thread.

Any thoughts?

[1]: /messages/by-id/CAEepm=2EqROYJ_xYz4v5kfr4b0qw_Lq_6Pe8RTEC8rx3upWsSQ@mail.gmail.com

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#24

Amit Kapila

amit.kapila16@gmail.com

about 7 years ago

In reply to: Amit Kapila (#23)

Re: zheap: a new storage format for PostgreSQL

On Sat, May 26, 2018 at 6:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Mar 2, 2018 at 4:05 PM, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:

It's been a while since we have updated the progress on this project,
so here is an update.

Yet another update.

This is based on the features that were not
working (as mentioned in Readme.md) when the branch was published.
1. TID Scans are working now.
2. Insert .. On Conflict is working now.
3. Tuple locking is working with a restriction that if there are more
concurrent lockers on a page than the number of transaction slots on a
page, then some of the lockers will wait till others get committed.
We are working on a solution to extend the number of transaction slots
on a separate set of pages which exist in heap, but will contain only
transaction data.

Now, we have a working solution for this problem. The extended
transaction slots are stored in TPD pages (which contain only
transaction slot arrays) that are interleaved with regular pages.
For the detailed idea, you can see the top of src/backend/access/zheap/tpd.c.
We still have a caveat here: once the TPD pages are pruned
(a TPD page can be pruned if all its transaction slots are old
enough to no longer matter), they are not added to the FSM for reuse. We are
working on a patch for this which we expect to finish in a week or so.
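
A toy sketch of the slot overflow behaviour, assuming four slots per page as stated earlier in the thread. The class, the list standing in for a TPD page, and the allow_tpd switch are all hypothetical; the real mechanism is described in tpd.c.

```python
SLOTS_PER_PAGE = 4   # from the thread: four slots per page, compile-time option


class ToyPage:
    def __init__(self):
        self.slots = [None] * SLOTS_PER_PAGE   # xid per slot, None means free
        self.tpd_slots = []                    # overflow slots (stand-in for a TPD page)

    def reserve_slot(self, xid, allow_tpd=True):
        """Return ('page', n) or ('tpd', n); None means the caller has to wait."""
        if xid in self.slots:
            return ("page", self.slots.index(xid))
        if xid in self.tpd_slots:
            return ("tpd", self.tpd_slots.index(xid))
        for n, owner in enumerate(self.slots):
            if owner is None:
                self.slots[n] = xid
                return ("page", n)
        if not allow_tpd:
            return None                        # all slots busy, no overflow
        self.tpd_slots.append(xid)
        return ("tpd", len(self.tpd_slots) - 1)


page = ToyPage()
for xid in (101, 102, 103, 104):
    page.reserve_slot(xid)

before = page.reserve_slot(105, allow_tpd=False)   # earlier behaviour: must wait
after = page.reserve_slot(105)                     # with TPD: gets an overflow slot
print(before, after)
```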

Toast tables are working now; the toast data is stored in zheap.
Apart from the consistency of storing toast data in the same
storage engine as the main data, it has the advantage of early cleanup,
which means the space for deleted rows can be reclaimed as soon as the
transaction commits. This is good for toast tables as each update in
a toast table is a DELETE+INSERT.

Alignment of tuples is changed such that we don’t have alignment padding
between the tuple header and the tuple data, as we always make a copy
of the tuple to support in-place updates. Likewise, we ideally don't
need any alignment padding between tuples. However, there are places
in zheap code where we access the tuple header directly from the page (e.g.
zheap_delete, zheap_update, etc.), for which we want tuples to be aligned
at a two-byte boundary. We omit all alignment padding for
pass-by-value types. Even in the current heap, we never point directly
to such values, so the alignment padding doesn’t help much; it lets us
fetch the value using a single instruction, but that is all.
Pass-by-reference types will work as they do in the heap. We can't
directly access unaligned values; instead, we need to use memcpy. We
believe that the space savings will more than pay for the additional
CPU costs.
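
The trade-off can be illustrated with Python's struct module as a stand-in for the C layout: a one-byte column followed by a four-byte integer, with and without padding. The column types are an arbitrary example, and copying a slice before decoding plays the role of memcpy here.

```python
import struct

# A 1-byte column followed by a 4-byte integer column.
PACKED = struct.Struct("=bi")    # "=": standard sizes, no alignment padding
ALIGNED = struct.Struct("@bi")   # "@": native alignment, padding before the int

row = PACKED.pack(7, 123456)


def read_int_at(buf, offset):
    """Copy 4 bytes out of the buffer and decode them; offset need not be aligned."""
    return struct.unpack("=i", bytes(buf[offset:offset + 4]))[0]


print(PACKED.size, ALIGNED.size, read_int_at(row, 1))
```

The packed form is smaller, at the price of a copy on every read of the integer; the aligned size depends on the platform.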

Vacuum full is implemented in such a way that we don't copy the
information required for MVCC-aware scans. We copy only LIVE tuples
into the new heap and freeze them before storing them there. This is not a
good idea as we lose all the visibility information of tuples, but
OTOH, the same can't be copied from the original tuple as that is
maintained in undo and we don't have the facility to modify
undo records. We could either allow modifying undo records or write
a special kind of undo record which captures the required
visibility information. I think it will be tricky to do this, and I am not
sure it is worth a whole lot of effort before the
basic things work; besides, with zheap the need for
vacuum will anyway be reduced to a good extent.

Serializable isolation is also supported; we didn't need to make any
major changes except for making it understand ZheapTuple (used TID in
the required APIs). I think this part needs some changes after
integration with the pluggable storage API. We have special handling
for tuples which are in-place updated or for which the latest transaction
that modified the tuple got aborted. In that case, we check whether
the latest committed transaction that modified that tuple is a
concurrent transaction. Based on that, we decide whether
we have any serialization conflict.

In zheap, for sub-transactions we don't need to generate a new xid, as
the visibility information for a particular tuple is present in undo
and, on Rollback To Savepoint, we apply the required undo to make the
state of the tuples as they were before the particular transaction.
This gives us a performance/scalability boost when sub-transactions
are involved as we don't need to acquire XidGenLock for
a subtransaction. Apart from the above benefits, we need this for zheap
as otherwise the undo chain for each transaction won't be linear, and
we save allocating additional slots for each transaction id at the
page level.

Undo workers and transaction rollbacks are working now. My colleague
Dilip has posted a separate patch [1] for this as this can have some
use cases without zheap as well, and Thomas has just posted a patch [2]
using that facility.

Some of the other features like row movement for an update of
partition key are also handled.

In short, now most of the user-visible features are working. The make
installcheck for zheap has 12 failures, and they are mostly due to
plan or stats changes, as zheap has additional meta pages (meta
page and TPD pages) and/or we have in-place updates. So in most cases
either an additional ORDER BY needs to be added or some minor tweak in
the query is required. The isolation test has one failure, which again
is due to in-place updates and seems to be a valid case, but needs a
bit more investigation. We have yet to support JIT for zheap, so the
corresponding tests would also fail.

Some of the main things that are not working:
Logical decoding - I am not sure at this stage whether it is a must
for the first version of zheap. Surely, we can have a basic design
ready.
Snapshot too old - This feature allows the data in heap pages to be
removed in the presence of old transactions. This is going to work
differently for zheap as we want the undo for older snapshots to
go away, rather than working on heap pages as we do for the current heap.
One can argue that we should make it similar to the current heap, but
I see a lot less value in that as this new heap works entirely
differently and we can have a better implementation for that.
Delete marking in indexes - This will allow in-place updates even when
index columns are updated, and additionally with this we can avoid the
need for a dedicated vacuum process to perform retail deletes. This
is the feature we definitely want to do separately from the main heap
because current indexes work with zheap without any major changes.

You can find the latest code at https://github.com/EnterpriseDB/zheap

I would again like to highlight that this is not my work alone.
Dilip Kumar, Kuntal Ghosh, Rafia Sabih, Mithun C Y and Amit Khandekar
have worked along with me to make this progress.

Feedback is welcome.

[1]: /messages/by-id/CAFiTN-sYQ8r8ANjWFYkXVfNxgXyLRfvbX9Ee4SxO9ns-OBBgVA@mail.gmail.com
[2]: /messages/by-id/CAEepm=0ULqYgM2aFeOnrx6YrtBg3xUdxALoyCG+XpssKqmezug@mail.gmail.com

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#25

Tomas Vondra

tomas.vondra@2ndquadrant.com

about 7 years ago

In reply to: Amit Kapila (#24)

1 attachment(s)

Re: zheap: a new storage format for PostgreSQL

On 11/01/2018 07:43 AM, Amit Kapila wrote:

You can find the latest code at https://github.com/EnterpriseDB/zheap

Seems valgrind complains about a couple of places in the code - nothing
major, might be noise, but probably worth a look.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#26

Amit Kapila

amit.kapila16@gmail.com

about 7 years ago

In reply to: Tomas Vondra (#25)

Re: zheap: a new storage format for PostgreSQL

On Thu, Nov 1, 2018 at 7:26 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

On 11/01/2018 07:43 AM, Amit Kapila wrote:

You can find the latest code at https://github.com/EnterpriseDB/zheap

Seems valgrind complains about a couple of places in the code - nothing
major, might be noise, but probably worth a look.

I have looked at the report and one of those seems to be problematic,
so I have pushed the fix for the same. The other one for below stack
seems to be bogus:
==7569== Uninitialised value was created by a stack allocation
==7569== at 0x59043D: znocachegetattr (zheapam.c:6206)
==7569==
{
<insert_a_suppression_name_here>
Memcheck:Cond
fun:ZHeapDetermineModifiedColumns
fun:zheap_update

I have checked in the function znocachegetattr that initializing
ret_datum makes the reported error go away, but actually
there is no need to do so, as the code always assigns a valid
value to this variable. I have left it as is for now as I am not sure
whether there is any value in doing such an initialization.

Thanks for the report.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#27

Tomas Vondra

tomas.vondra@2ndquadrant.com

about 7 years ago

In reply to: Amit Kapila (#26)

Re: zheap: a new storage format for PostgreSQL

On 11/02/2018 12:12 PM, Amit Kapila wrote:

On Thu, Nov 1, 2018 at 7:26 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

On 11/01/2018 07:43 AM, Amit Kapila wrote:

You can find the latest code at https://github.com/EnterpriseDB/zheap

Seems valgrind complains about a couple of places in the code - nothing
major, might be noise, but probably worth a look.

I have looked at the report and one of those seems to be problematic,
so I have pushed the fix for the same. The other one for below stack
seems to be bogus:
==7569== Uninitialised value was created by a stack allocation
==7569== at 0x59043D: znocachegetattr (zheapam.c:6206)
==7569==
{
<insert_a_suppression_name_here>
Memcheck:Cond
fun:ZHeapDetermineModifiedColumns
fun:zheap_update

I have checked in the function znocachegetattr that if we initialize
the value of ret_datum, it fixes the reported error, but actually
there is no need for doing it as the code always assign the valid
value to this variable. I have left it as is for now as I am not sure
whether there is any value in doing such an initialization.

Well, the problem is the ret_datum is modified like this:

thisatt = TupleDescAttr(tupleDesc, attnum);
if (thisatt->attbyval)
memcpy(&ret_datum, tp + off, thisatt->attlen);
else
ret_datum = PointerGetDatum((char *) (tp + off));

which means that for cases with attlen < sizeof(Datum), this ends up
leaving part of the value undefined. So it's a valid issue. I'm sure
it's not the only place where we do something like this, and the other
places don't trigger the valgrind warning, so how do those places do
this? heapam seems to call fetch_att in the end, which essentially calls
Int32GetDatum/Int16GetDatum/CharGetDatum, so why not to use the same
trick here?
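
To make the point above concrete, here is a minimal sketch (not the zheap code) of copying a pass-by-value attribute shorter than sizeof(Datum) from an unaligned position. The typedef, the function names and the helper that inspects the trailing bytes are made up for illustration. The only idea taken from the thread is that memcpy writes just attlen bytes, so the rest of the Datum is defined only if the variable was initialized first. Byte order is ignored here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for PostgreSQL's Datum; assumed here to be pointer-sized. */
typedef uintptr_t Datum;

/*
 * Hypothetical helper: copy attlen bytes of an unaligned pass-by-value
 * attribute into a Datum.  memcpy() writes only attlen bytes, so the
 * Datum is zeroed first to keep the remaining bytes defined.
 */
static Datum
fetch_unaligned_byval(const char *src, size_t attlen)
{
    Datum ret_datum = 0;

    assert(attlen <= sizeof(Datum));
    memcpy(&ret_datum, src, attlen);
    return ret_datum;
}

/* Hypothetical check: returns 1 if every byte past attlen is zero. */
static int
trailing_bytes_are_zero(Datum d, size_t attlen)
{
    const unsigned char *p = (const unsigned char *) &d;
    size_t i;

    for (i = attlen; i < sizeof(Datum); i++)
        if (p[i] != 0)
            return 0;
    return 1;
}
```

Without the "= 0" initializer, the bytes past attlen would keep whatever was on the stack, which is the kind of partially undefined value a tool like valgrind can flag once the Datum is used in a comparison.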

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#28

Amit Kapila

amit.kapila16@gmail.com

about 7 years ago

In reply to: Tomas Vondra (#27)

Re: zheap: a new storage format for PostgreSQL

On Fri, Nov 2, 2018 at 6:41 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

On 11/02/2018 12:12 PM, Amit Kapila wrote:

On Thu, Nov 1, 2018 at 7:26 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

On 11/01/2018 07:43 AM, Amit Kapila wrote:

You can find the latest code at https://github.com/EnterpriseDB/zheap

Seems valgrind complains about a couple of places in the code - nothing
major, might be noise, but probably worth a look.

I have looked at the report and one of those seems to be problematic,
so I have pushed the fix for the same. The other one for below stack
seems to be bogus:
==7569== Uninitialised value was created by a stack allocation
==7569== at 0x59043D: znocachegetattr (zheapam.c:6206)
==7569==
{
<insert_a_suppression_name_here>
Memcheck:Cond
fun:ZHeapDetermineModifiedColumns
fun:zheap_update

I have checked in the function znocachegetattr that if we initialize
the value of ret_datum, it fixes the reported error, but actually
there is no need for doing it as the code always assign the valid
value to this variable. I have left it as is for now as I am not sure
whether there is any value in doing such an initialization.

Well, the problem is the ret_datum is modified like this:

thisatt = TupleDescAttr(tupleDesc, attnum);
if (thisatt->attbyval)
memcpy(&ret_datum, tp + off, thisatt->attlen);
else
ret_datum = PointerGetDatum((char *) (tp + off));

which means that for cases with attlen < sizeof(Datum), this ends up
leaving part of the value undefined. So it's a valid issue.

Agreed.

I'm sure
it's not the only place where we do something like this, and the other
places don't trigger the valgrind warning, so how do those places do
this? heapam seems to call fetch_att in the end, which essentially calls
Int32GetDatum/Int16GetDatum/CharGetDatum, so why not to use the same
trick here?

This is because, in zheap, we have omitted all alignment padding for
pass-by-value types. See the description in my previous email [1]. I
think here we need to initialize ret_datum at the beginning of the
function unless you have some better idea.

One thing unrelated to the above problem: I forgot to
mention in my previous email that Daniel Westermann, whom I have cc'ed
on this email, has reported a few bugs in this branch, which seem to have
been fixed. He seems to be interested in doing more tests. Daniel, I
encourage you to share your findings here.

Thanks, Tomas and Daniel for looking into the branch and reporting
problems, it is really helpful.

[1]: /messages/by-id/CAA4eK1Lwb+rGeB_z+jUbnSndvgnsDUK+9tjfng4sy1AZyrHqRg@mail.gmail.com

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#29

Amit Kapila

amit.kapila16@gmail.com

about 7 years ago

In reply to: Amit Kapila (#28)

Re: zheap: a new storage format for PostgreSQL

On Sat, Nov 3, 2018 at 9:30 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Nov 2, 2018 at 6:41 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

I'm sure
it's not the only place where we do something like this, and the other
places don't trigger the valgrind warning, so how do those places do
this? heapam seems to call fetch_att in the end, which essentially calls
Int32GetDatum/Int16GetDatum/CharGetDatum, so why not to use the same
trick here?

This is because, in zheap, we have omitted all alignment padding for
pass-by-value types. See the description in my previous email [1]. I
think here we need to initialize ret_datum at the beginning of the
function unless you have some better idea.

I have pushed a fix on the above lines in zheap-branch, but I am open
to change it if you have better ideas for the same.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#30

Tomas Vondra

tomas.vondra@2ndquadrant.com

about 7 years ago

In reply to: Amit Kapila (#29)

Re: zheap: a new storage format for PostgreSQL

On 11/5/18 4:00 AM, Amit Kapila wrote:

On Sat, Nov 3, 2018 at 9:30 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Nov 2, 2018 at 6:41 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

I'm sure
it's not the only place where we do something like this, and the other
places don't trigger the valgrind warning, so how do those places do
this? heapam seems to call fetch_att in the end, which essentially calls
Int32GetDatum/Int16GetDatum/CharGetDatum, so why not to use the same
trick here?

This is because, in zheap, we have omitted all alignment padding for
pass-by-value types. See the description in my previous email [1]. I
think here we need to initialize ret_datum at the beginning of the
function unless you have some better idea.

I have pushed a fix on the above lines in zheap-branch, but I am open
to change it if you have better ideas for the same.

Thanks. Initializing the variable seems like the right fix here.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#31

Daniel Westermann

daniel.westermann@dbi-services.com

about 7 years ago

In reply to: Tomas Vondra (#30)

Re: zheap: a new storage format for PostgreSQL

Thanks. Initializing the variable seems like the right fix here.

... just had a warning when recompiling from the latest sources on CentOS 7:

labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -I../../../../src/include -D_GNU_SOURCE -I/usr/include/libxml2 -c -o tpd.o tpd.c
tpd.c: In function ‘TPDFreePage’:
tpd.c:1003:15: warning: variable ‘curblkno’ set but not used [-Wunused-but-set-variable]
BlockNumber curblkno = InvalidBlockNumber;
^

Not sure if this is important but as I could not find anything on this thread related to this I thought I'd report it

Regards
Daniel

#32

Kuntal Ghosh

kuntalghosh.2007@gmail.com

about 7 years ago

In reply to: Daniel Westermann (#31)

Re: zheap: a new storage format for PostgreSQL

On Sat, Nov 10, 2018 at 8:51 PM Daniel Westermann
<daniel.westermann@dbi-services.com> wrote:

Thanks. Initializing the variable seems like the right fix here.

... just had a warning when recompiling from the latest sources on CentOS 7:

labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -I../../../../src/include -D_GNU_SOURCE -I/usr/include/libxml2 -c -o tpd.o tpd.c
tpd.c: In function ‘TPDFreePage’:
tpd.c:1003:15: warning: variable ‘curblkno’ set but not used [-Wunused-but-set-variable]
BlockNumber curblkno = InvalidBlockNumber;
^

Thanks Daniel for testing zheap and reporting the issue. We'll push a
fix for the same.

--
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com

#33

Amit Kapila

amit.kapila16@gmail.com

about 7 years ago

In reply to: Kuntal Ghosh (#32)

Re: zheap: a new storage format for PostgreSQL

On Sun, Nov 11, 2018 at 11:55 PM Kuntal Ghosh
<kuntalghosh.2007@gmail.com> wrote:

On Sat, Nov 10, 2018 at 8:51 PM Daniel Westermann
<daniel.westermann@dbi-services.com> wrote:

Thanks. Initializing the variable seems like the right fix here.

... just had a warning when recompiling from the latest sources on CentOS 7:

labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -I../../../../src/include -D_GNU_SOURCE -I/usr/include/libxml2 -c -o tpd.o tpd.c
tpd.c: In function ‘TPDFreePage’:
tpd.c:1003:15: warning: variable ‘curblkno’ set but not used [-Wunused-but-set-variable]
BlockNumber curblkno = InvalidBlockNumber;
^

This variable is used only for Asserts, so we need to use
PG_USED_FOR_ASSERTS_ONLY while declaring it.

Thanks Daniel for testing zheap and reporting the issue. We'll push a
fix for the same.

Pushed the fix now.
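
For readers unfamiliar with the marker mentioned above, a rough sketch of the pattern follows. It does not use PostgreSQL headers: USED_FOR_ASSERTS_ONLY is a local stand-in that assumes a GCC- or Clang-style "unused" attribute, the BlockNumber definitions are simplified copies, and free_page_sketch is a made-up function, not TPDFreePage.

```c
#include <assert.h>

/*
 * Local stand-in for PostgreSQL's PG_USED_FOR_ASSERTS_ONLY (assumption:
 * the compiler understands __attribute__((unused))).
 */
#if defined(__GNUC__) || defined(__clang__)
#define USED_FOR_ASSERTS_ONLY __attribute__((unused))
#else
#define USED_FOR_ASSERTS_ONLY
#endif

/* Simplified stand-ins for the PostgreSQL definitions. */
typedef unsigned int BlockNumber;
#define InvalidBlockNumber ((BlockNumber) 0xFFFFFFFF)

/*
 * Hypothetical function: curblkno is only read inside assert().  When
 * assertions are compiled out (NDEBUG), the variable is set but never
 * read, and the attribute keeps -Wunused-but-set-variable quiet.
 */
static int
free_page_sketch(BlockNumber blkno)
{
    BlockNumber curblkno USED_FOR_ASSERTS_ONLY = InvalidBlockNumber;

    curblkno = blkno;
    assert(curblkno != InvalidBlockNumber);
    return 1;
}
```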

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#34

Amit Kapila

amit.kapila16@gmail.com

about 7 years ago

In reply to: Amit Kapila (#24)

Re: zheap: a new storage format for PostgreSQL

On Thu, Nov 1, 2018 at 12:13 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

Now, we have a working solution for this problem. The extended
transaction slots are stored in TPD pages (which contain only
transaction slot arrays) that are interleaved with regular pages.
For a detailed idea, you can see the comments atop src/backend/access/zheap/tpd.c.
We still have a caveat here, which is that once the TPD pages are pruned
(a TPD page can be pruned if all the transaction slots are old
enough not to matter), they are not added to the FSM for reuse. We are
working on a patch for this which we expect to finish in a week or so.

Now, this work is also committed to zheap-branch. The basic idea is
that if all the TPD entries are old enough that they can be pruned,
then we clean such a page and record it in the FSM. The empty
pages from the FSM can be used either by zheap or TPD when required. We
have one optimization where, without going through each TPD
entry, we can decide whether the entire page can be pruned. We
use tpd_latest_xid_epoch stored in the page header to prune the
entire TPD page. Basically, if tpd_latest_xid_epoch precedes the
oldest xid having undo, then we can assume all the entries in the page
can be pruned.
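
The page-level shortcut described above can be sketched roughly as follows. This is only an illustration under stated assumptions: both values are treated as 64-bit epoch-extended transaction ids so that a plain integer comparison expresses "precedes", and the function names are hypothetical, not taken from tpd.c.

```c
#include <stdint.h>

/* Hypothetical helper: pack an epoch and a 32-bit xid into one value. */
static uint64_t
make_xid_epoch(uint32_t epoch, uint32_t xid)
{
    return ((uint64_t) epoch << 32) | xid;
}

/*
 * Hypothetical check: if the newest transaction recorded in the TPD page
 * header is older than the oldest transaction that still has undo, every
 * entry in the page is prunable and no per-entry scan is needed.
 */
static int
tpd_page_can_be_pruned(uint64_t tpd_latest_xid_epoch,
                       uint64_t oldest_xid_having_undo)
{
    return tpd_latest_xid_epoch < oldest_xid_having_undo;
}
```

The design point is that one comparison against the page header replaces a walk over every transaction slot in the page.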

Another interesting feature which is now working in zheap is ALTER
TABLE .. SET TABLESPACE. The basic idea is the same as for heap (copy the
relation page-by-page), except that in zheap we can have some pending
aborts (as rollback requests are sometimes pushed to the undo worker), so
we finish those aborts before copying the page to the new tablespace. I
think we could do without it if we wanted, but as we are already
dirtying the page and writing it, it seems wise to complete the
aborts.

Now, single-user-mode is also working. In single-user-mode, we always
perform the rollback requests in the foreground as there is no undo
worker/s present. Also we discard the undo at commit as we won't need
it later.

Other than that we have made miscellaneous code-improvements and
bug-fixes in the branch.

The next big step now is to port it over to pluggable storage, for which
Andres has done the legwork and which we will take forward. The other
thing we are going to focus on next is performance optimization of the code
in various scenarios.

I don't know how much what I write on this thread is read by others or
how useful this is for others who are following this work, but I am
trying to be precise here, so feel free to ask for more information.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#35

Adam Brusselback

adambrusselback@gmail.com

about 7 years ago

In reply to: Amit Kapila (#34)

Re: zheap: a new storage format for PostgreSQL

I don't know how much what I write on this thread is read by others or

how useful this is for others who are following this work

I've been following this thread and many others like it, silently soaking
it up, because I don't feel like i'd have anything useful to add in most
cases. It is very interesting seeing the development take place though, so
just know it's appreciated at least from my perspective.

#36

Amit Kapila

amit.kapila16@gmail.com

about 7 years ago

In reply to: Adam Brusselback (#35)

Re: zheap: a new storage format for PostgreSQL

On Sat, Nov 17, 2018 at 11:21 AM Adam Brusselback
<adambrusselback@gmail.com> wrote:

I don't know how much what I write on this thread is read by others or

how useful this is for others who are following this work

I've been following this thread and many others like it, silently soaking it up, because I don't feel like i'd have anything useful to add in most cases. It is very interesting seeing the development take place though, so just know it's appreciated at least from my perspective.

Thanks, it makes a difference and keeps us motivated to make progress.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#37

Hakan Kocaman

hkocam@gmail.com

about 7 years ago

In reply to: Adam Brusselback (#35)

Re: zheap: a new storage format for PostgreSQL

On Sat, Nov 17, 2018 at 06:51, Adam Brusselback <adambrusselback@gmail.com>
wrote:

I don't know how much what I write on this thread is read by others or

how useful this is for others who are following this work

I've been following this thread and many others like it, silently soaking
it up, because I don't feel like i'd have anything useful to add in most
cases. It is very interesting seeing the development take place though, so
just know it's appreciated at least from my perspective.

+1
count me in

kind regards
hakan kocaman

#38

Darafei "Komяpa" Praliaskouski

me@komzpa.net

about 7 years ago

In reply to: Adam Brusselback (#35)

Re: zheap: a new storage format for PostgreSQL

On Sat, Nov 17, 2018 at 8:51 AM Adam Brusselback <adambrusselback@gmail.com>
wrote:

I don't know how much what I write on this thread is read by others or

how useful this is for others who are following this work

I've been following this thread and many others like it, silently soaking
it up, because I don't feel like i'd have anything useful to add in most
cases. It is very interesting seeing the development take place though, so
just know it's appreciated at least from my perspective.

I'm also following the development and have hopes about it going forward.
Not much low-level details I can comment on though :)

In PostGIS workloads, UPDATE table SET geom = ST_CostyFunction(geom,
magicnumber); is one of the biggest time-eaters during the initial load
and cleanup of your data. It is commonly followed by CLUSTER table using
table_geom_idx; to make sure you're back at full speed, no VACUUM is
needed, and your table (usually static after that) is more-or-less
spatially ordered. I see that zheap can remove the need for VACUUM, which
is a big win already. If you could do something that lets tuples be
reordered according to an index during an UPDATE that rewrites most of the
table, that would be a game changer :)

Another story is Visibility Map and Index-Only Scans. Right now there is a
huge gap between the insert of rows and the moment they are available for
index only scan, as VACUUM is required. Do I understand correctly that for
zheap this all can be inverted, and UNDO can become "invisibility map" that
may be quite small and discarded quickly?

--
Darafei Praliaskouski
Support me: http://patreon.com/komzpa

#39

Amit Kapila

amit.kapila16@gmail.com

about 7 years ago

In reply to: Darafei "Komяpa" Praliaskouski (#38)

Re: zheap: a new storage format for PostgreSQL

On Sun, Nov 18, 2018 at 3:42 PM Darafei "Komяpa" Praliaskouski
<me@komzpa.net> wrote:

On Sat, Nov 17, 2018 at 8:51 AM Adam Brusselback <adambrusselback@gmail.com> wrote:

I don't know how much what I write on this thread is read by others or

how useful this is for others who are following this work

I've been following this thread and many others like it, silently soaking it up, because I don't feel like i'd have anything useful to add in most cases. It is very interesting seeing the development take place though, so just know it's appreciated at least from my perspective.

I'm also following the development and have hopes about it going forward. Not much low-level details I can comment on though :)

In PostGIS workloads, UPDATE table SET geom = ST_CostyFunction(geom, magicnumber); is one of biggest time-eaters that happen upon initial load and clean up of your data. It is commonly followed by CLUSTER table using table_geom_idx; to make sure you're back at full speed and no VACUUM is needed, and your table (usually static after that) is more-or-less spatially ordered. I see that zheap can remove the need for VACUUM, which is a big win already. If you can do something that will allow reorder of tuples according to index happen during an UPDATE that rewrites most of table, that would be a game changer :)

If the tuples are already in the order of the index, then we would
retain the order; otherwise, we might not want to do anything special for
ordering w.r.t. the index. I think this is important as we are not sure of
the user's intention, and I guess it won't be easy to do such a
rearrangement during an UPDATE statement.

Another story is Visibility Map and Index-Only Scans. Right now there is a huge gap between the insert of rows and the moment they are available for index only scan, as VACUUM is required. Do I understand correctly that for zheap this all can be inverted, and UNDO can become "invisibility map" that may be quite small and discarded quickly?

Yeah, eventually that is our goal with the help of delete-marking in
indexes, however, for the first version, we still need to rely on
visibility maps for index-only-scans.

Thank you for showing interest in this work.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#40

Daniel Westermann

daniel.westermann@dbi-services.com

about 7 years ago

In reply to: Amit Kapila (#36)

Re: zheap: a new storage format for PostgreSQL

Thanks, it makes difference and keep us motivated for making progress.

Is it intended behavior that a database can not be dropped when undo apply is running in the background?

zheap=# update pgbench_accounts set filler = 'bbb' where mod(aid,10) = 0;
UPDATE 1000000
zheap=# rollback;
ROLLBACK
zheap=# drop database zheap;
ERROR: cannot drop the currently open database
zheap=# \c postgres
You are now connected to database "postgres" as user "postgres".
postgres=# drop database zheap;
ERROR: database "zheap" is being accessed by other users
DETAIL: There is 1 other session using the database.
postgres=# drop database zheap;
ERROR: database "zheap" is being accessed by other users
DETAIL: There is 1 other session using the database.
postgres=#

Regards
Daniel

#41

Amit Kapila

amit.kapila16@gmail.com

about 7 years ago

In reply to: Daniel Westermann (#40)

Re: zheap: a new storage format for PostgreSQL

On Mon, Nov 19, 2018 at 3:59 PM Daniel Westermann
<daniel.westermann@dbi-services.com> wrote:

Thanks, it makes difference and keep us motivated for making progress.

+1

Is it intended behavior that a database can not be dropped when undo apply is running in the background?

Yes, we need to connect to the database for performing rollback
actions. Once the rollback for that database is over, undo apply
worker will exit and you should be able to drop the database.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#42

Daniel Westermann

daniel.westermann@dbi-services.com

about 7 years ago

In reply to: Amit Kapila (#41)

Re: zheap: a new storage format for PostgreSQL

Yes, we need to connect to the database for performing rollback
actions. Once the rollback for that database is over, undo apply
worker will exit and you should be able to drop the database.

Thank you, Amit.
Can you have a look at this one?

create table t1 ( a text ) partition by list (a);
create table t1_1 PARTITION of t1 (a) for values in ('a');
create table t1_2 PARTITION of t1 (a) for values in ('b');
create table t1_3 PARTITION of t1 (a) for values in ('c');
create table t1_4 PARTITION of t1 (a) default;

insert into t1 select 'a' from generate_series ( 1, 1000000 );
insert into t1 select 'b' from generate_series ( 1, 1000000 );
insert into t1 select 'c' from generate_series ( 1, 1000000 );

postgres=# begin;
BEGIN
postgres=# update t1 set a = 'd' where a = 'a';
UPDATE 1000000
postgres=# rollback;
ROLLBACK
postgres=# select * from t1 where a = 'd';
postgres=# select * from t1 where a = 'd';
postgres=# select * from t1 where a = 'd';

The selects at the end take seconds and a lot of checkpoints are happening.

Regards
Daniel

#43

Amit Kapila

amit.kapila16@gmail.com

about 7 years ago

In reply to: Daniel Westermann (#42)

Re: zheap: a new storage format for PostgreSQL

On Mon, Nov 19, 2018 at 6:36 PM Daniel Westermann
<daniel.westermann@dbi-services.com> wrote:

Yes, we need to connect to the database for performing rollback
actions. Once the rollback for that database is over, undo apply
worker will exit and you should be able to drop the database.

Thank you, Amit.
Can you have a look at this one?

create table t1 ( a text ) partition by list (a);
create table t1_1 PARTITION of t1 (a) for values in ('a');
create table t1_2 PARTITION of t1 (a) for values in ('b');
create table t1_3 PARTITION of t1 (a) for values in ('c');
create table t1_4 PARTITION of t1 (a) default;

postgres=# \d+ t1
Table "public.t1"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
--------+------+-----------+----------+---------+----------+--------------+-------------
a | text | | | | extended | |
Partition key: LIST (a)
Partitions: t1_1 FOR VALUES IN ('a'),
t1_2 FOR VALUES IN ('b'),
t1_3 FOR VALUES IN ('c'),
t1_4 DEFAULT
Options: storage_engine=zheap

insert into t1 select 'a' from generate_series ( 1, 1000000 );
insert into t1 select 'b' from generate_series ( 1, 1000000 );
insert into t1 select 'c' from generate_series ( 1, 1000000 );

postgres=# begin;
BEGIN
postgres=# update t1 set a = 'd' where a = 'a';
UPDATE 1000000
postgres=# rollback;
ROLLBACK

Here, you are doing a big rollback, so I guess it will be pushed to
background unless you increase the value of 'rollback_overflow_size'.
You can confirm that by checking if any undo apply worker is active
and rollback finishes immediately.
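
As a rough illustration of the behaviour described above (not the zheap implementation), the choice between foreground and background rollback might be sketched as below. Only the rollback_overflow_size setting and the single-user-mode rule come from this thread. The enum, the function name, the byte units and the strict ">" comparison are assumptions.

```c
#include <stddef.h>

/* Hypothetical result type for the sketch. */
typedef enum
{
    ROLLBACK_FOREGROUND,
    ROLLBACK_BACKGROUND
} RollbackMode;

/*
 * Hypothetical decision function.  Large rollbacks are handed to an undo
 * apply worker; in single-user mode there is no worker, so the rollback
 * always runs in the foreground.
 */
static RollbackMode
choose_rollback_mode(size_t undo_size, size_t rollback_overflow_size,
                     int single_user_mode)
{
    if (single_user_mode)
        return ROLLBACK_FOREGROUND;
    if (undo_size > rollback_overflow_size)
        return ROLLBACK_BACKGROUND;
    return ROLLBACK_FOREGROUND;
}
```

Under these assumptions, raising the threshold above the transaction's undo size keeps the rollback in the foreground, so ROLLBACK returns only once the undo actions have been applied.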

postgres=# select * from t1 where a = 'd';
postgres=# select * from t1 where a = 'd';
postgres=# select * from t1 where a = 'd';

The selects at the end take seconds

I think what is happening is as rollback is still in progress, the
scan needs to fetch the data from undo and it will be slow.

and a lot of checkpoints are happening.

It is because Rollbacks also write WAL and you are doing a big
Rollback which will lead to re-write of the entire table.

I guess if you allow rollback to complete before issuing a select, you
will see better results.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#44

Darafei "Komяpa" Praliaskouski

me@komzpa.net

about 7 years ago

In reply to: Amit Kapila (#39)

Re: zheap: a new storage format for PostgreSQL

In PostGIS workloads, UPDATE table SET geom = ST_CostyFunction(geom,

magicnumber); is one of biggest time-eaters that happen upon initial load
and clean up of your data. It is commonly followed by CLUSTER table using
table_geom_idx; to make sure you're back at full speed and no VACUUM is
needed, and your table (usually static after that) is more-or-less
spatially ordered. I see that zheap can remove the need for VACUUM, which
is a big win already. If you can do something that will allow reorder of
tuples according to index happen during an UPDATE that rewrites most of
table, that would be a game changer :)

If the tuples are already in the order of the index, then we would
retain the order, otherwise, we might not want to anything special for
ordering w.r.t index. I think this is important as we are not sure of
the user's intention and I guess it won't be easy to do such
rearrangement during Update statement.

The user's clustering intention is recorded by the existence of a CLUSTER
index on the table. Nothing other than the CLUSTER command uses it right
now, though.

When I was looking into the current heap implementation, it seemed
possible to hook in a lookup for a couple of blocks holding values adjacent
to the new value, and to prefer them over the FSM lookup and the "current
page" for a clustered table. Because of dead tuples, free space runs out
very quickly in the usual heap, so it probably doesn't make sense there:
you're consuming space with the old version in the old page and the new
version in the new page.

If I understand correctly, in zheap an update would not result in a dead
tuple in the old page, so space would not run out immediately, and this may
unblock the path for such further development. That is, if there is a spot
in the code where such or similar logic can be plugged in :)

I've described the business case in [1].

1:
/messages/by-id/CAC8Q8tLBeAxR+BXWuKK+HP5m8tEVYn270CVrDvKXt=0PkJTY9g@mail.gmail.com

--
Darafei Praliaskouski
Support me: http://patreon.com/komzpa

#45

Amit Kapila

amit.kapila16@gmail.com

about 7 years ago

In reply to: Darafei "Komяpa" Praliaskouski (#44)

Re: zheap: a new storage format for PostgreSQL

On Tue, Nov 20, 2018 at 12:53 PM Darafei "Komяpa" Praliaskouski
<me@komzpa.net> wrote:

In PostGIS workloads, UPDATE table SET geom = ST_CostyFunction(geom, magicnumber); is one of biggest time-eaters that happen upon initial load and clean up of your data. It is commonly followed by CLUSTER table using table_geom_idx; to make sure you're back at full speed and no VACUUM is needed, and your table (usually static after that) is more-or-less spatially ordered. I see that zheap can remove the need for VACUUM, which is a big win already. If you can do something that will allow reorder of tuples according to index happen during an UPDATE that rewrites most of table, that would be a game changer :)

If the tuples are already in the order of the index, then we would
retain the order, otherwise, we might not want to anything special for
ordering w.r.t index. I think this is important as we are not sure of
the user's intention and I guess it won't be easy to do such
rearrangement during Update statement.

User's clustering intention is recorded in existence of CLUSTER index over table. That's not used by anything other than CLUSTER command now though.

When I was looking into current heap implementation it seemed that it's possible to hook in a lookup for a couple blocks with values adjacent to the new value, and prefer them to FSM lookup and "current page", for clustered table. Due to dead tuples, free space is going to end very very soon in usual heap, so it probably doesn't make sense there - you're consuming space with old one in old page and new one in new page.

If I understand correctly, in zheap an update would not result in a dead tuple in old page, so space is not going to end immediately, and this may unblock path for such further developments. That is, if there is a spot where to plug in such or similar logic in code :)

Yeah, in zheap there will be fewer dead tuples, or none at all in
many cases, but I am not sure how much it can help for your use case.

I've described the business case in [1].

I am not sure but maybe you need something like Clustered Index where
heap pages are linked via leaf pages of btree.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#46

Mithun Cy

mithun.cy@enterprisedb.com

about 7 years ago

In reply to: Amit Kapila (#1)

Re: zheap: a new storage format for PostgreSQL

On Thu, Mar 1, 2018 at 7:39 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

I did some performance testing of the COPY command for zheap against heap.
Here are my results.
Machine: cthulhu (an 8-node NUMA machine with 500GB of RAM)
Non-default server settings: shared_buffers = 32GB, max_wal_size = 20GB,
min_wal_size = 15GB

Test tables and data:
----------------------------
I have used the pgbench_accounts table of the pgbench tool as the data
source, with 3 different scale factors: 100, 1000 and 2000. Both the heap
and zheap tables are look-alikes of pgbench_accounts.

CREATE TABLE pgbench_zheap (LIKE pgbench_accounts) WITH (storage_engine='zheap');
CREATE TABLE pgbench_heap (LIKE pgbench_accounts) WITH (storage_engine='heap');

Test Commands:
Command to generate datafile:
COPY pgbench_accounts TO '/mnt/data-mag/mithun.cy/zheapperfbin/bin/pgbench.data';

Command to load from datafile:
COPY pgbench_heap FROM '/mnt/data-mag/mithun.cy/zheapperfbin/bin/pgbench.data'; -- heap table
COPY pgbench_zheap FROM '/mnt/data-mag/mithun.cy/zheapperfbin/bin/pgbench.data'; -- zheap table

Results
======

Scale factor : 100
------------------------
zheap table size : 1028 MB
heap table size: 1281 MB
-- table size reduction: 19% size reduction.
zheap wal size: 1007 MB
heap wal size: 1024 MB
-- wal size difference: 1.6% size reduction.
zheap COPY execution time: 24869.451 ms
heap COPY execution time: 25858.773 ms
-- % of improvement -- 3.8% reduction in execution time for zheap

Scale factor : 1000
-------------------------
zheap table size : 10 GB
heap table size: 13 GB
-- table size reduction: 23% size reduction.
zheap wal size: 10071 MB
heap wal size: 10243 MB
-- wal size difference: 1.67% size reduction.
zheap COPY execution time: 270790.235 ms
heap COPY execution time: 280325.632 ms
-- % of improvement -- 3.4% reduction in execution time for zheap

Scale factor : 2000
-------------------------
zheap table size : 20GB
heap table size: 25GB
-- table size reduction: 20% size reduction.
zheap wal size: 20142 MB
heap wal size: 20499 MB
-- wal size difference: 1.7% size reduction.
zheap COPY execution time: 523702.904 ms
heap COPY execution time: 537537.720 ms
-- % of improvement -- 2.5 % reduction in execution time for zheap

With zheap, the COPY command seems to have improved very slightly in both
WAL size and execution time. I also did some tests with INSERT statements,
where I could see some regression in execution time for zheap compared to
heap. I will reply here after further investigation.

--
Thanks and Regards
Mithun Chicklore Yogendra
EnterpriseDB: http://www.enterprisedb.com

#47

Pavel Stehule

pavel.stehule@gmail.com

about 7 years ago

In reply to: Mithun Cy (#46)

Re: zheap: a new storage format for PostgreSQL

A 20% size reduction looks like an effect of fill factor.

Regards

Pavel

#48

Amit Kapila

amit.kapila16@gmail.com

about 7 years ago

In reply to: Pavel Stehule (#47)

Re: zheap: a new storage format for PostgreSQL

On Thu, Dec 6, 2018 at 10:03 AM Pavel Stehule <pavel.stehule@gmail.com> wrote:

A 20% size reduction looks like an effect of fill factor.

I think it is because of smaller zheap tuple sizes. Mithun can tell us
more about the setup, such as whether he used a different fillfactor or
anything else that could lead to such a big difference.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#49

Mithun Cy

mithun.cy@enterprisedb.com

about 7 years ago

In reply to: Amit Kapila (#48)

Re: zheap: a new storage format for PostgreSQL

On Thu, Dec 6, 2018 at 11:13 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

I think it is because of smaller zheap tuple sizes. Mithun can tell us
more about the setup, such as whether he used a different fillfactor or
anything else that could lead to such a big difference.

Yes, the default fillfactor is unaltered. zheap tuples are smaller, and
each is aligned at 2 bytes.

Length of each item. (all Items are identical)
=====================================
postgres=# SELECT lp_len FROM
zheap_page_items(get_raw_page('pgbench_zheap', 9)) limit 1;
lp_len
--------
102
(1 row)

postgres=# SELECT lp_len FROM
heap_page_items(get_raw_page('pgbench_heap', 9)) limit 1;
lp_len
--------
121
(1 row)

Total tuples per page
=====================================
postgres=# SELECT count(*) FROM
zheap_page_items(get_raw_page('pgbench_zheap', 9));
count
-------
76
(1 row)

postgres=# SELECT count(*) FROM
heap_page_items(get_raw_page('pgbench_heap', 9));
count
-------
61
(1 row)

Because of this, zheap takes less space, as reported above.
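
A rough cross-check of these per-page counts, as a sketch under stated assumptions rather than a description of the actual zheap page layout: an 8192-byte page with a 24-byte page header, a 4-byte line pointer per tuple, heap tuples padded to 8-byte alignment, and zheap tuples padded to 2-byte alignment as noted above. The `reserved=64` bytes for zheap is purely an assumption, picked so the sketch has a knob for any per-page overhead beyond the standard header; it is not a figure from this thread. The function names are invented for illustration.

```python
PAGE_SIZE = 8192    # default PostgreSQL block size, in bytes
PAGE_HEADER = 24    # standard page header size, in bytes
LINE_POINTER = 4    # one line pointer per tuple, in bytes


def align_up(length: int, alignment: int) -> int:
    """Round length up to the next multiple of alignment."""
    return -(-length // alignment) * alignment


def tuples_per_page(lp_len: int, alignment: int, reserved: int = 0) -> int:
    """Estimate how many identical tuples of lp_len bytes fit on one page."""
    usable = PAGE_SIZE - PAGE_HEADER - reserved
    return usable // (align_up(lp_len, alignment) + LINE_POINTER)


heap_count = tuples_per_page(121, alignment=8)
# reserved=64 is a hypothetical allowance for extra zheap per-page data.
zheap_count = tuples_per_page(102, alignment=2, reserved=64)

print(heap_count, zheap_count)
```

Under these assumptions the estimate lands on 61 heap tuples and 76 zheap tuples per page, and 1 - 61/76 is roughly 20%, which is in line with the table size reductions reported earlier.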

--
Thanks and Regards
Mithun Chicklore Yogendra
EnterpriseDB: http://www.enterprisedb.com

#50

Pavel Stehule

pavel.stehule@gmail.com

about 7 years ago

In reply to: Mithun Cy (#49)

Re: zheap: a new storage format for PostgreSQL

On Thu, Dec 6, 2018 at 7:55 AM Mithun Cy <mithun.cy@enterprisedb.com> wrote:

Yes, the default fillfactor is unaltered. zheap tuples are smaller, and
each is aligned at 2 bytes.

I am sorry, I know nothing about zheap. Does zheap use fill factor? If yes,
why? I thought it made sense only for the current format.

Regards

Pavel

#51

Amit Kapila

amit.kapila16@gmail.com

about 7 years ago

In reply to: Pavel Stehule (#50)

Re: zheap: a new storage format for PostgreSQL

On Thu, Dec 6, 2018 at 12:30 PM Pavel Stehule <pavel.stehule@gmail.com> wrote:

I am sorry, I know nothing about zheap. Does zheap use fill factor? If yes, why?

Good question. It is required because tuples can expand (an update can
make a tuple longer). In such cases, we try to perform an in-place update
if there is space in the page. So, having a fillfactor can help.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#52

Pavel Stehule

pavel.stehule@gmail.com

about 7 years ago

In reply to: Amit Kapila (#51)

Re: zheap: a new storage format for PostgreSQL

On Thu, Dec 6, 2018 at 8:08 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

Good question. It is required because tuples can expand (an update can
make a tuple longer). In such cases, we try to perform an in-place update
if there is space in the page. So, having a fillfactor can help.

Thank you for the reply :)

Pavel

#53

Robert Haas

robertmhaas@gmail.com

about 7 years ago

In reply to: Pavel Stehule (#52)

Re: zheap: a new storage format for PostgreSQL

I suspect fillfactor is *more* likely to help with zheap than with the
current heap. With the current heap, you need to leave enough space
to store entire copies of the tuples to try to get HOT updates. But
with zheap you only need enough room for the anticipated growth in the
tuples.

For instance, let's say that you plan to update 30% of the tuples in a
table and make them 1 byte larger. With the heap, you'd need to leave
~ 3/13 = 23% of each page empty, plus a little bit more to allow for
the storage growth. So to make all of those updates HOT, you would
probably need a fillfactor of roughly 75%. Unfortunately, that will
make your table larger by one-third, which is terrible.

On the other hand, with zheap, you only need to leave enough room for
the increased amount of tuple data. If you've got 121 items per page,
as in Mithun's statistics, that means you need 121 bytes of free space
to do all the updates in place. That means you need a fillfactor of 1
- (121/8192) = ~98%. To be conservative you can set a fillfactor of
say 95%. Your table will only get slightly bigger, and all of your
updates will be in place, and everything will be great. At least with
respect to fillfactor -- zheap is not free of other problems.
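
The arithmetic in this example can be written out as a short Python sketch. The function names are invented for illustration, and the model is deliberately crude: it ignores alignment, line pointers, and page headers. The item count is left as a parameter because the page statistics posted earlier in the thread show 76 zheap tuples per page (121 there is the heap lp_len); either value gives a zheap fillfactor far above the heap figure.

```python
PAGE_SIZE = 8192  # bytes, the default PostgreSQL block size


def heap_free_fraction(update_fraction: float) -> float:
    """Fraction of a heap page that must stay free so that every updated
    tuple can get a second full copy on the same page (a HOT update)."""
    return update_fraction / (1.0 + update_fraction)


def zheap_fillfactor(items_per_page: int, growth_bytes: int) -> float:
    """Fillfactor that leaves growth_bytes of slack for every item on a page."""
    return 1.0 - (items_per_page * growth_bytes) / PAGE_SIZE


heap_free = heap_free_fraction(0.30)   # 3/13 of each page
growth_at_75 = 1.0 / 0.75 - 1.0        # table growth at fillfactor 75%

print(f"heap: keep {heap_free:.0%} of each page free; "
      f"fillfactor 75% grows the table by {growth_at_75:.0%}")
for items in (121, 76):
    print(f"zheap, {items} items/page: fillfactor {zheap_fillfactor(items, 1):.1%}")
```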

Of course, you don't really set fillfactor based on an expectation of
a single round of tuple updates, but imagine that the workload goes on
for a while, with tuples getting bigger and smaller again as the exact
values being stored change. In a heap table, you need LOTS of empty
space on each page to get HOT updates. In a zheap table, you need
very little, because the updates are in place.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#54

Pavel Stehule

pavel.stehule@gmail.com

about 7 years ago

In reply to: Robert Haas (#53)

Re: zheap: a new storage format for PostgreSQL

I have trouble imagining it. When the fill factor is low, there is a
high risk of heavy fragmentation, or something has to do
defragmentation.

#55

Robert Haas

robertmhaas@gmail.com

about 7 years ago

In reply to: Pavel Stehule (#54)

Re: zheap: a new storage format for PostgreSQL

On Thu, Dec 6, 2018 at 10:23 AM Pavel Stehule <pavel.stehule@gmail.com> wrote:

I have trouble imagining it. When the fill factor is low, there is a high risk of heavy fragmentation, or something has to do defragmentation.

I don't understand this.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#56

Pavel Stehule

pavel.stehule@gmail.com

about 7 years ago

In reply to: Robert Haas (#55)

Re: zheap: a new storage format for PostgreSQL

On Thu, Dec 6, 2018 at 4:26 PM Robert Haas <robertmhaas@gmail.com> wrote:

I don't understand this.

I don't know whether zheap has any tools for eliminating fragmentation of
the free space on a page. But I expect that after some set of updates, when
record size is mutable, the free space on a page will be fragmented.
Usually, the less memory you have, the faster fragmentation happens.

Pavel

#57

Robert Haas

robertmhaas@gmail.com

about 7 years ago

In reply to: Pavel Stehule (#56)

Re: zheap: a new storage format for PostgreSQL

On Thu, Dec 6, 2018 at 10:53 AM Pavel Stehule <pavel.stehule@gmail.com> wrote:

I don't know whether zheap has any tools for eliminating fragmentation of the free space on a page. But I expect that after some set of updates, when record size is mutable, the free space on a page will be fragmented. Usually, the less memory you have, the faster fragmentation happens.

Still not sure I completely understand, but it's true that zheap
sometimes needs to compact free space on a page. For example, if
you've got a page with a 100-byte hole, and somebody updates a tuple
to make it 2 bytes bigger, you've got to shift that tuple and any that
precede it backwards to reduce the size of the hole to 98 bytes, so
that you can fit the new version of the tuple. If, later, somebody
shrinks that tuple back to the original size, you've now got 100 bytes
of free space on the page, but they are fragmented: 98 bytes in the
"hole," and 2 bytes following the newly-shrunk tuple. If someone
tries to insert a 100-byte tuple in that page, we'll need to
reorganize the page a second time to bring all that free space back
together in a single chunk.

In my view, and I'm not sure if this is how the code currently works,
we should have just one routine to do a zheap page reorganization
which can cope with all possible scenarios. I imagine that you would
give it the page as it currently exists plus a "minimum tuple size"
for one or more tuples on the page (which must not be smaller than the
current size of that tuple, but could be bigger). It then reorganizes
the page so that every tuple for which a minimum size was given
consumes exactly that amount of space, every other tuple consumes the
minimum possible amount of space, and the remaining space goes into
the hole. So if you call this function with no minimum tuple sizes,
it does a straight defragmentation; if you give it minimum tuple
sizes, then it rearranges the page to make it suitable for a pending
in-place update of those tuples.

Actually, I think Amit and I discussed further refining this by
splitting the page reorganization function in half. One half would
make a plan for where to put each tuple on the page following the
reorg, but would not actually do anything. That would be executed
before entering a critical section, and might fail if the requested
minimum tuple sizes can't be satisfied. The other half would take the
previously-constructed plan as input and perform the reorganization.
That would be done in the critical section.
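
To make the plan-then-execute idea concrete, here is a minimal Python sketch. It is not the zheap code: the names `plan_reorg` and `apply_reorg`, the dictionary-based layout, and the flat byte array standing in for a page's tuple area are all assumptions made for illustration. Tuples are packed against the end of the area so that all free space collects in one hole at the start.

```python
from typing import Dict, Optional, Tuple

# offset number -> (start byte, size in bytes) within the tuple area
Layout = Dict[int, Tuple[int, int]]


def plan_reorg(tuples: Layout, area_size: int,
               min_sizes: Optional[Dict[int, int]] = None) -> Optional[Layout]:
    """Phase 1: compute a new layout without touching the page.

    Returns None when the requested minimum sizes do not fit, so the
    caller can give up before entering any critical section.
    """
    min_sizes = min_sizes or {}
    plan: Layout = {}
    end = area_size
    for off in sorted(tuples):
        _, length = tuples[off]
        size = min_sizes.get(off, length)
        if size < length:
            raise ValueError("minimum size is smaller than the current tuple")
        end -= size
        if end < 0:
            return None
        plan[off] = (end, size)
    return plan


def apply_reorg(page: bytearray, tuples: Layout, plan: Layout) -> int:
    """Phase 2: move the tuple bytes as planned; returns the hole size."""
    # Snapshot every tuple first so overlapping moves cannot clobber data.
    images = {off: bytes(page[start:start + length])
              for off, (start, length) in tuples.items()}
    for off, (new_start, _) in plan.items():
        data = images[off]
        page[new_start:new_start + len(data)] = data
    return min((start for start, _ in plan.values()), default=len(page))


# Three 100-byte tuples and a 100-byte hole; tuple 2 is about to grow by 2 bytes.
page = bytearray(400)
layout: Layout = {1: (300, 100), 2: (200, 100), 3: (100, 100)}
for off, (start, length) in layout.items():
    page[start:start + length] = bytes([off]) * length

plan = plan_reorg(layout, len(page), {2: 102})
hole = apply_reorg(page, layout, plan) if plan is not None else None
print(plan, hole)
```

Calling `plan_reorg` with no minimum sizes gives a plain defragmentation, and a `None` result corresponds to the planning step failing before any page modification is attempted.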

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#58

Pavel Stehule

pavel.stehule@gmail.com

about 7 years ago

In reply to: Robert Haas (#57)

Re: zheap: a new storage format for PostgreSQL

Thank you for the reply

Pavel

#59

Amit Kapila

amit.kapila16@gmail.com

about 7 years ago

In reply to: Robert Haas (#57)

Re: zheap: a new storage format for PostgreSQL

On Thu, Dec 6, 2018 at 9:32 PM Robert Haas <robertmhaas@gmail.com> wrote:

In my view, and I'm not sure if this is how the code currently works,
we should have just one routine to do a zheap page reorganization
which can cope with all possible scenarios. I imagine that you would
give it the page as it currently exists plus a "minimum tuple size"
for one or more tuples on the page (which must not be smaller than the
current size of that tuple, but could be bigger). It then reorganizes
the page so that every tuple for which a minimum size was given
consumes exactly that amount of space, every other tuple consumes the
minimum possible amount of space, and the remaining space goes into
the hole. So if you call this function with no minimum tuple sizes,
it does a straight defragmentation; if you give it minimum tuple
sizes, then it rearranges the page to make it suitable for a pending
in-place update of those tuples.

Yeah, the code is also along these lines. However, as of now, the API
takes input for one tuple (its offset number and delta space, that is,
the additional space required by an update that makes the tuple
bigger). As of now, we don't have a requirement for multiple tuples,
but if there is a case, I think the API can be adapted. One more
thing we do during repair-fragmentation is to arrange tuples in their
offset order so that future sequential scans can be faster.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com