Sync Rep: First Thoughts on Code
Breaking down of patch into sections works very well for review. Should
allow us to get different reviewers on different parts of the code -
review wranglers please take note: Dave, Josh.
Can you confirm that all the docs on the Wiki page are up to date? There
are a few minor discrepancies that make me think they aren't.
Examples: "For example, to make a single multi-statement transaction
replication asynchronously when the default is the opposite, issue SET
LOCAL synchronous_commit TO OFF within the transaction."
Do we mean synchronous_replication in this sentence? I think you've
copied the text and not changed all of the necessary parts - please
re-read the whole section (probably the whole Wiki, actually).
"wal_writer_delay" - do we mean wal_sender_delay? Is there some ability
to measure the amount of data to be sent and avoid the delay altogether,
when the server is sufficiently busy?
The reaction to replication_timeout may need to be configurable. I might
not want to keep on processing if the information didn't reach the
standby. I would prefer in many cases that the transactions that were
waiting for walsender would abort, but the walsender kept processing.
How can we restart the walsender if it shuts down? Do we want a maximum
wait for a transaction and a maximum wait for the server? Do we report
stats on how long the replication has been taking? If the average rep
time is close to rep timeout then we will be fragile, so we need some
way to notice this and produce warnings. Or at least provide info to an
external monitoring system.
How do we specify the user we use to connect to primary?
Definitely need more explanatory comments/README-style docs.
For example, 03_libpq seems simple and self-contained. I'm not sure why
we have a state called PGASYNC_REPLICATION; I was hoping that would be
dynamic, but I'm not sure where to look for that. It would be useful to
have a very long comment within the code to explain how the replication
messages work, and note on each function who the intended client and
server is.
02_pqcomm: What does HAVE_POLL mean? Do we need to worry about periodic
renegotiation of keys in be-secure.c? Not sure I understand why so many
new functions in there.
04_recovery_conf is a change I agree with, though I think it may not
work with EXEC_BACKEND for Windows.
05... I need some commentary to explain this better.
06 and 07 are large and will take substantial review time. So we must
get the overall architecture done first and then check the code that
implements that.
08 - I think I get this, but some docs will help to confirm.
09 pg_standby changes: so more changes are coming there? OK. Can we
refer to those two options as failover and switchover? There's no need
to change definitions that many Postgres people already use. This change
can be done without making any change to server behaviour, so this
change can have benefit to 8.2 and 8.3 people also.
01_signal_handling: I've looked at the LWlock acquires and releases in
the patch and am fairly happy, except for the ProcArrayLock acquire
during this sub-patch. Do we really need to do things this way? Is the
actual state important? Could we just do this with a counter which
cycles? So callers increment counter atomically and the reader just
polls to see if anybody has incremented? Or could we protect that part
of the proc with a different lock? Touching ProcArrayLock is bad news.
Anyway, feeling very positive about this. Hope we can get this reviewed
and committed in next 3-4 weeks.
I have many clues as to how to structure my own work also. Thanks.
--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Hi, Simon.
Thanks for taking many hours to review the code!!
On Mon, Dec 1, 2008 at 8:42 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
Can you confirm that all the docs on the Wiki page are up to date? There
are a few minor discrepancies that make me think it isn't.
Documentation is ongoing. Sorry for my slow progress.
BTW, I'm going to add and change the sgml files listed on wiki.
http://wiki.postgresql.org/wiki/NTT%27s_Development_Projects#Documentation_Plan
Examples: "For example, to make a single multi-statement transaction
replication asynchronously when the default is the opposite, issue SET
LOCAL synchronous_commit TO OFF within the transaction."
Do we mean synchronous_replication in this sentence? I think you've
copied the text and not changed all of the necessary parts - please
re-read the whole section (probably the whole Wiki, actually).
Oops! It's just a typo. Sorry for the confusion.
I will revise this section.
"wal_writer_delay" - do we mean wal_sender_delay?
Yes. I will fix it.
Is there some ability
to measure the amount of data to be sent and avoid the delay altogether,
when the server is sufficiently busy?
Why is the former ability required?
The latter is possible, I think. We can guarantee that the WAL is sent
(more precisely, that send(2) is called) at least once per wal_sender_delay.
Of course, that depends on the kernel's scheduler.
The reaction to replication_timeout may need to be configurable. I might
not want to keep on processing if the information didn't reach the
standby.
OK. I will add a new GUC variable (PGC_SIGHUP) to specify the reaction to
the timeout.
I would prefer in many cases that the transactions that were
waiting for walsender would abort, but the walsender kept processing.
Isn't it dangerous to abort the transaction while replication continues
after the timeout? I think that WAL consistency between the two servers
might be broken, because WAL writing and sending are done concurrently,
and the backend might already have written the WAL to disk on the primary
while waiting for walsender.
How can we restart the walsender if it shuts down?
Just restart the standby (with walreceiver). The standby connects to
the postmaster on the primary, and the postmaster then forks a new walsender.
Do we want a maximum
wait for a transaction and a maximum wait for the server?
ISTM that these features are too much.
Do we report
stats on how long the replication has been taking? If the average rep
time is close to rep timeout then we will be fragile, so we need some
way to notice this and produce warnings. Or at least provide info to an
external monitoring system.
Sounds good. How about log_min_duration_replication? If the rep time
is greater than it, we produce a warning (or log entry), like log_min_duration_xx.
How do we specify the user we use to connect to primary?
Yes, I need to add a new option to recovery.conf to specify the user
name. Thanks for reminding me!
Definitely need more explanatory comments/README-style docs.
Completely agreed ;-)
I will write README together with other documents.
For example, 03_libpq seems simple and self-contained. I'm not sure why
we have a state called PGASYNC_REPLICATION; I was hoping that would be
dynamic, but I'm not sure where to look for that. It would be useful to
have a very long comment within the code to explain how the replication
messages work, and note on each function who the intended client and
server is.
OK. I will reconsider whether PGASYNC_REPLICATION can be removed, and
write a comment about it.
02_pqcomm: What does HAVE_POLL mean?
It identifies whether poll(2) is available or not on the platform. We
use poll(2) if it's defined, otherwise select(2). There is similar code
in pqSocketPoll() in fe-misc.c.
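
A minimal sketch of the poll(2)/select(2) fallback described above, written in Python for brevity rather than the patch's C. The function name wait_readable is hypothetical, and this is only an illustration of the idea, not the patch's code:

```python
import select
import socket


def wait_readable(sock, timeout_s):
    """Return True if sock becomes readable within timeout_s seconds."""
    if hasattr(select, "poll"):
        # poll(2) is available on this platform (the HAVE_POLL case).
        poller = select.poll()
        poller.register(sock, select.POLLIN)
        # poll() takes its timeout in milliseconds.
        return bool(poller.poll(int(timeout_s * 1000)))
    # Otherwise fall back to select(2), which takes seconds.
    readable, _, _ = select.select([sock], [], [], timeout_s)
    return bool(readable)


if __name__ == "__main__":
    a, b = socket.socketpair()
    print(wait_readable(a, 0.05))  # nothing has been sent yet
    b.sendall(b"x")
    print(wait_readable(a, 0.05))  # one byte is now pending
    a.close()
    b.close()
```

The same wait can cover several descriptors at once, which is what lets one process wait for the standby's reply and a backend's request at the same time.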
Do we need to worry about periodic
renegotiation of keys in be-secure.c?
What do you mean by "keys"?
Not sure I understand why so many
new functions in there.
It's because walsender waits for the reply from the standby and the
request from the backend concurrently. So, we need poll(2) or select(2)
to make walsender wait for them, and some functions for non-blocking
receiving.
04_recovery_conf is a change I agree with, though I think it may not
work with EXEC_BACKEND for Windows.
OK. I will examine and fix it.
05... I need some commentary to explain this better.
06 and 07 are large and will take substantial review time. So we must
get the overall architecture done first and then check the code that
implements that.
08 - I think I get this, but some docs will help to confirm.
Yes. I need more documentation.
09 pg_standby changes: so more changes are coming there? OK. Can we
refer to those two options as failover and switchover?
You mean a failover trigger and a switchover trigger? ISTM that those
names might not suit the features.
Naming always bothers me, and the current name "commit/abort trigger"
may well cause confusion. Is there a more suitable name?
There's no need
to change definitions that many Postgres people already use. This change
can be done without making any change to server behaviour, so this
change can have benefit to 8.2 and 8.3 people also.
Agreed.
01_signal_handling: I've looked at the LWlock acquires and releases in
the patch and am fairly happy, except for the ProcArrayLock acquire
during this sub-patch. Do we really need to do things this way? Is the
actual state important? Could we just do this with a counter which
cycles? So callers increment counter atomically and the reader just
polls to see if anybody has incremented? Or could we protect that part
of the proc with a different lock? Touching ProcArrayLock is bad news.
Agreed. I will add a new lock for proc.signalFlags.
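
For comparison, the cycling-counter alternative raised in the question above could be sketched roughly as follows. This is Python, not the patch's C; the class names and the 32-bit modulus are assumptions, and the small private lock only stands in for what would be a single atomic increment in C:

```python
import threading

COUNTER_MODULUS = 2 ** 32  # assumed width: the counter cycles like a uint32


class SignalCounter:
    """Counter that callers bump to say 'something happened'."""

    def __init__(self):
        self._value = 0
        # Stand-in for an atomic increment; it protects only this counter.
        self._lock = threading.Lock()

    def notify(self):
        with self._lock:
            self._value = (self._value + 1) % COUNTER_MODULUS

    def read(self):
        return self._value


class Poller:
    """Reader that only checks whether the counter moved since last poll."""

    def __init__(self, counter):
        self._counter = counter
        self._last_seen = counter.read()

    def poll(self):
        current = self._counter.read()
        changed = current != self._last_seen
        self._last_seen = current
        return changed
```

The reader learns that at least one caller signalled, not how many or which kind, so this fits only if the actual state is unimportant. A wakeup is missed only if the counter wraps all the way around between two polls.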
Anyway, feeling very positive about this. Hope we can get this reviewed
and committed in next 3-4 weeks.
I have many clues as to how to structure my own work also. Thanks.
Thanks again!
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On Tue, 2008-12-02 at 21:37 +0900, Fujii Masao wrote:
Thanks for taking many hours to review the code!!
On Mon, Dec 1, 2008 at 8:42 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
Can you confirm that all the docs on the Wiki page are up to date? There
are a few minor discrepancies that make me think it isn't.
Documentation is ongoing. Sorry for my slow progress.
BTW, I'm going to add and change the sgml files listed on wiki.
http://wiki.postgresql.org/wiki/NTT%27s_Development_Projects#Documentation_Plan
I'm patient, I know it takes time. Happy to spend hours on the review,
but I want to do that knowing I agree with the higher level features and
architecture first.
This was just a first review, I expect to spend more time on it yet.
The reaction to replication_timeout may need to be configurable. I might
not want to keep on processing if the information didn't reach the
standby.
OK. I will add new GUC variable (PGC_SIGHUP) to specify the reaction for
the timeout.
I would prefer in many cases that the transactions that were
waiting for walsender would abort, but the walsender kept processing.
Is it dangerous to abort the transaction with replication continued when
the timeout occurs? I think that the WAL consistency between two servers
might be broken. Because the WAL writing and sending are done concurrently,
and the backend might already write the WAL to disk on the primary when
waiting for walsender.
The issue I see is that we might want to keep wal_sender_delay small so
that transaction times are not increased. But we also want
wal_sender_delay high so that replication never breaks. It seems better
to have the action on wal_sender_delay configurable if we have an
unsteady network (like the internet). Marcus made some comments on line
dropping that seem relevant here; we should listen to his experience.
Hmmm, dangerous? Well, assuming we're linking commits with replication
sends, then it sounds like it. We might end up committing to disk and then
deciding to abort instead. But remember we don't remove the xid from
procarray or mark the result in clog until the flush is over, so it is
possible. But I think we should discuss this in more detail when the
main patch is committed.
Do we report
stats on how long the replication has been taking? If the average rep
time is close to rep timeout then we will be fragile, so we need some
way to notice this and produce warnings. Or at least provide info to an
external monitoring system.
Sounds good. How about log_min_duration_replication? If the rep time
is greater than it, we produce warning (or log) like log_min_duration_xx.
Maybe. Let's put in something that logs if the wait is >50% (?) of the
timeout. Make that threshold a #define and see if we need it to be
configurable with a GUC later.
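
A rough sketch of that check, in Python for illustration only. WARN_FRACTION plays the role of the proposed #define, and the function name, logger name and millisecond units are assumptions, not anything in the patch:

```python
import logging

WARN_FRACTION = 0.5  # stand-in for the proposed #define (the ">50%")

logger = logging.getLogger("replication")


def check_replication_wait(wait_ms, timeout_ms, warn_fraction=WARN_FRACTION):
    """Log and return True if a replication wait used up more than
    warn_fraction of the configured timeout."""
    if timeout_ms <= 0:
        # Assumption: a non-positive timeout means "no timeout configured".
        return False
    if wait_ms > timeout_ms * warn_fraction:
        logger.warning(
            "replication wait of %d ms exceeded %.0f%% of the %d ms timeout",
            wait_ms, warn_fraction * 100, timeout_ms,
        )
        return True
    return False
```

Logging against a fraction of the timeout, rather than a separate absolute threshold, keeps the warning meaningful when the timeout itself is changed.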
Do we need to worry about periodic
renegotiation of keys in be-secure.c?
What is "keys" you mean?
See the notes in that file for explanation.
I wondered whether it might be a perf problem for us?
--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
On Tue, 2008-12-02 at 13:09 +0000, Simon Riggs wrote:
Is it dangerous to abort the transaction with replication continued when
the timeout occurs? I think that the WAL consistency between two servers
might be broken. Because the WAL writing and sending are done concurrently,
and the backend might already write the WAL to disk on the primary when
waiting for walsender.
The issue I see is that we might want to keep wal_sender_delay small so
that transaction times are not increased. But we also want
wal_sender_delay high so that replication never breaks. It seems better
to have the action on wal_sender_delay configurable if we have an
unsteady network (like the internet). Marcus made some comments on line
dropping that seem relevant here; we should listen to his experience.
Hmmm, dangerous? Well assuming we're linking commits with replication
sends then it sounds it. We might end up committing to disk and then
deciding to abort instead. But remember we don't remove the xid from
procarray or mark the result in clog until the flush is over, so it is
possible. But I think we should discuss this in more detail when the
main patch is committed.
What is the "it" in "it is possible"? It seems like there's still a
problem window in there.
Even if that could be made safe, in the event of a real network failure,
you'd just wait the full timeout every transaction, because it still
thinks it's replicating.
If the timeout is exceeded, it seems more reasonable to abandon the
slave until you could re-sync it and continue processing as normal. As
you pointed out, that's not necessarily an expensive operation because
you can use something like rsync. The process of re-syncing might be
made easier (or perhaps less costly), of course.
If we want to still allow processing to happen after a timeout, it seems
reasonable to have a configurable option to allow/disallow non-read-only
transactions when out of sync.
Regards,
Jeff Davis
On Tue, 2008-12-02 at 11:08 -0800, Jeff Davis wrote:
On Tue, 2008-12-02 at 13:09 +0000, Simon Riggs wrote:
Is it dangerous to abort the transaction with replication continued when
the timeout occurs? I think that the WAL consistency between two servers
might be broken. Because the WAL writing and sending are done concurrently,
and the backend might already write the WAL to disk on the primary when
waiting for walsender.
The issue I see is that we might want to keep wal_sender_delay small so
that transaction times are not increased. But we also want
wal_sender_delay high so that replication never breaks. It seems better
to have the action on wal_sender_delay configurable if we have an
unsteady network (like the internet). Marcus made some comments on line
dropping that seem relevant here; we should listen to his experience.
Hmmm, dangerous? Well assuming we're linking commits with replication
sends then it sounds it. We might end up committing to disk and then
deciding to abort instead. But remember we don't remove the xid from
procarray or mark the result in clog until the flush is over, so it is
possible. But I think we should discuss this in more detail when the
main patch is committed.
What is the "it" in "it is possible"? It seems like there's still a
problem window in there.
Marking a transaction aborted after we have written a commit record, but
before we have removed it from proc array and marked in clog. We'd need
a special kind of WAL record to do that.
Even if that could be made safe, in the event of a real network failure,
you'd just wait the full timeout every transaction, because it still
thinks it's replicating.
True, but I did suggest having two timeouts.
There is considerable reason to reduce the timeout as well as reason to
increase it - at the same time.
Anyway, let's wait for some user experience following commit.
--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Breaking down of patch into sections works very well for review. Should
allow us to get different reviewers on different parts of the code -
review wranglers please take note: Dave, Josh.
Fujii-san, could you break the patch up into several parts? We have quite
a few junior reviewers who are idle right now.
--
--Josh
Josh Berkus
PostgreSQL
San Francisco
Jeff,
Even if that could be made safe, in the event of a real network failure,
you'd just wait the full timeout every transaction, because it still
thinks it's replicating.
Hmmm. I'd suggest that if we get timeouts for more than 10x the timeout
value in a row, replication stops. Unfortunately, we should probably make
that *another* configuration setting.
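
One possible reading of this suggestion, sketched in Python: count consecutive timeouts and give up on replication once the run exceeds a configurable limit. The class name, the max_consecutive parameter and its default of 10 are assumptions made for illustration:

```python
class TimeoutTracker:
    """Stop replicating after too many consecutive replication timeouts."""

    def __init__(self, max_consecutive=10):
        # Hypothetical setting; this would be the extra configuration knob.
        self.max_consecutive = max_consecutive
        self.consecutive = 0
        self.replicating = True

    def record(self, timed_out):
        """Record one commit's outcome; return whether we still replicate."""
        if not self.replicating:
            return False
        if timed_out:
            self.consecutive += 1
            if self.consecutive > self.max_consecutive:
                self.replicating = False
        else:
            # Any successful reply ends the run of timeouts.
            self.consecutive = 0
        return self.replicating
```

Once record() returns False, later commits would no longer wait for the standby, which addresses the concern about paying the full timeout on every transaction during a real outage.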
--
--Josh
Josh Berkus
PostgreSQL
San Francisco
Hi,
On Wed, Dec 3, 2008 at 6:03 AM, Josh Berkus <josh@agliodbs.com> wrote:
Breaking down of patch into sections works very well for review. Should
allow us to get different reviewers on different parts of the code -
review wranglers please take note: Dave, Josh.
Fujii-san, could you break the patch up into several parts? We have quite
a few junior reviewers who are idle right now.
Yes, I divided the patch into 9 pieces. Do I need to divide it further?
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Fujii-san,
Yes, I divided the patch into 9 pieces. Do I need to divide it further?
That's plenty. Where do reviewers find the 9 pieces?
--
Josh Berkus
PostgreSQL
San Francisco
Hi,
On Wed, Dec 3, 2008 at 3:21 PM, Josh Berkus <josh@agliodbs.com> wrote:
Fujii-san,
Yes, I divided the patch into 9 pieces. Do I need to divide it further?
That's plenty. Where do reviewers find the 9 pieces?
The latest patch set (v4) is on wiki.
http://wiki.postgresql.org/wiki/NTT%27s_Development_Projects#Patch_set
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Hello,
On Tue, Dec 2, 2008 at 10:09 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
The reaction to replication_timeout may need to be configurable. I might
not want to keep on processing if the information didn't reach the
standby.
OK. I will add new GUC variable (PGC_SIGHUP) to specify the reaction for
the timeout.
I would prefer in many cases that the transactions that were
waiting for walsender would abort, but the walsender kept processing.
Is it dangerous to abort the transaction with replication continued when
the timeout occurs? I think that the WAL consistency between two servers
might be broken. Because the WAL writing and sending are done concurrently,
and the backend might already write the WAL to disk on the primary when
waiting for walsender.
The issue I see is that we might want to keep wal_sender_delay small so
that transaction times are not increased. But we also want
wal_sender_delay high so that replication never breaks.
Are you assuming only the async case? In the sync case, since walsender
is woken by a signal from the backend, we don't need to keep the delay
so small. Also, wal_sender_delay has nothing to do with replication
terminating by mistake.
It seems better
to have the action on wal_sender_delay configurable if we have an
unsteady network (like the internet). Marcus made some comments on line
dropping that seem relevant here; we should listen to his experience.
OK, I will look for his comments. Please let me know which thread they
are in, if you know.
Hmmm, dangerous? Well assuming we're linking commits with replication
sends then it sounds it. We might end up committing to disk and then
deciding to abort instead. But remember we don't remove the xid from
procarray or mark the result in clog until the flush is over, so it is
possible. But I think we should discuss this in more detail when the
main patch is committed.
If the transaction is aborted while the backend is waiting for replication,
the commit command returns a failure indication to the client.
But the commit record might already have been written on the primary and
the standby. As you say, it may not be dangerous as long as the primary is
alive. But when we recover the failed primary, the transaction is marked
as committed in clog because of the commit record. Is that safe?
And, in that case, the transaction is treated as committed on the standby
and is visible to read-only queries. On the other hand, it's invisible on
the primary. Isn't that dangerous?
Do we need to worry about periodic
renegotiation of keys in be-secure.c?
What is "keys" you mean?
See the notes in that file for explanation.
Thanks! I would check it.
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Hi,
On Wed, Dec 3, 2008 at 4:08 AM, Jeff Davis <pgsql@j-davis.com> wrote:
Even if that could be made safe, in the event of a real network failure,
you'd just wait the full timeout every transaction, because it still
thinks it's replicating.
If walsender detects a real network failure, the transaction doesn't need to
wait for the timeout. Configuring keepalive options would help walsender to
detect it. Of course, keepalive on Linux might not work as expected, though.
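
As an illustration of what "configuring keepalive options" means at the socket level, here is a small Python sketch. The function name and the default values are assumptions; the per-socket tuning options are platform-specific, so the sketch applies only the ones the local platform exposes:

```python
import socket


def enable_keepalive(sock, idle_s=60, interval_s=10, probes=5):
    """Turn on TCP keepalive and apply whichever tuning options exist here.

    Returns the names of the tuning options that were actually applied.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    applied = []
    tuning = (
        ("TCP_KEEPIDLE", idle_s),       # idle time before the first probe
        ("TCP_KEEPINTVL", interval_s),  # interval between probes
        ("TCP_KEEPCNT", probes),        # failed probes before giving up
    )
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is None:
            continue  # this platform does not expose the option
        try:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
            applied.append(name)
        except OSError:
            pass  # defined but not supported for this socket
    return applied
```

With the example defaults, a dead peer would be noticed after roughly idle_s + interval_s * probes seconds, so these values bound how long a sender can keep believing it is still replicating.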
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Hi,
On Tue, Dec 2, 2008 at 10:09 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
On Tue, 2008-12-02 at 21:37 +0900, Fujii Masao wrote:
Thanks for taking many hours to review the code!!
On Mon, Dec 1, 2008 at 8:42 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
Can you confirm that all the docs on the Wiki page are up to date? There
are a few minor discrepancies that make me think it isn't.
Documentation is ongoing. Sorry for my slow progress.
BTW, I'm going to add and change the sgml files listed on wiki.
http://wiki.postgresql.org/wiki/NTT%27s_Development_Projects#Documentation_Plan
I'm patient, I know it takes time. Happy to spend hours on the review,
but I want to do that knowing I agree with the higher level features and
architecture first.
Since I thought that figures would be more intelligible for some people
than my poor English, I illustrated the architecture first.
http://wiki.postgresql.org/wiki/NTT%27s_Development_Projects#Detailed_Design
Are there any other parts which should be illustrated for review?
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On Wed, 2008-12-03 at 21:37 +0900, Fujii Masao wrote:
Since I thought that the figure was more intelligible for some people
than my poor English, I illustrated the architecture first.
http://wiki.postgresql.org/wiki/NTT%27s_Development_Projects#Detailed_Design
Are there any other parts which should be illustrated for review?
Those are very useful, thanks.
Some questions to check my understanding (expected answers in brackets)
* Diagram on p.2 has two Archives. We have just one (yes)
* We send data continuously, whether or not we are in sync/async? (yes)
So the only difference between sync/async is whether we wait when we
flush the commit? (yes)
* If we have synchronous_commit = off do we ignore
synchronous_replication = on (yes)
* If two transactions commit almost simultaneously and one is sync and
the other async then only the sync backend will wait? (Yes)
Do we definitely need the archiver to move the files written by
walreceiver to archive and then move them back out again? Seems like we
can streamline that part in many (all?) cases.
--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Hi,
On Wed, Dec 3, 2008 at 11:33 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
I'm patient, I know it takes time. Happy to spend hours on the review,
but I want to do that knowing I agree with the higher level features and
architecture first.
I wrote up the features and restrictions of Synch Rep. Please check
them together with the architecture figures.
http://wiki.postgresql.org/wiki/NTT%27s_Development_Projects#User_Overview
Some questions to check my understanding (expected answers in brackets)
* Diagram on p.2 has two Archives. We have just one (yes)
No, we need an archive on both the primary and the standby. The primary
needs one because a base backup is required when starting the standby.
Meanwhile, the standby needs one in order to cooperate with pg_standby.
If the directory that pg_standby checks is the same as the directory
where walreceiver writes the WAL, a partially written WAL file might be
restored by pg_standby, and continuous recovery would fail. So we have
to separate the directories, and I assigned pg_xlog and archive to them.
Another idea: walreceiver writes the WAL to a file with a temporary name,
and renames it to the formal name when it fills. Then pg_standby never
restores a partially written WAL file. But failover becomes more difficult
because the unrenamed WAL file remains.
Do you have any other good idea?
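
The temporary-name idea above can be sketched as follows, in Python for illustration. The ".partial" suffix and the function name are hypothetical, not what the patch uses; the point is only that the file appears under its formal name in one atomic step:

```python
import os
import tempfile


def write_segment(directory, segment_name, data):
    """Write a WAL segment under a temporary name, then rename it."""
    final_path = os.path.join(directory, segment_name)
    temp_path = final_path + ".partial"  # hypothetical temporary suffix
    with open(temp_path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # make the contents durable before renaming
    # The rename is atomic within one filesystem, so a reader watching for
    # segment_name sees either no file or the complete file.
    os.replace(temp_path, final_path)
    return final_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(write_segment(d, "000000010000000000000001", b"wal bytes"))
```

The failover drawback mentioned above shows up here as a leftover ".partial" file that something has to find and handle when the standby is promoted.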
* We send data continuously, whether or not we are in sync/async? (yes)
Yes.
So the only difference between sync/async is whether we wait when we
flush the commit? (yes)
Yes.
And, in the async case, the backend basically doesn't send the wakeup
signal to walsender.
* If we have synchronous_commit = off do we ignore
synchronous_replication = on (yes)
No, we can configure them independently. synchronous_commit covers
only local writing of the WAL. If synchronous_commit should cover both
local writing and replication, I'd like to add a new GUC which covers only
local writing (synchronous_local_write?).
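
Under the reading given in this reply, where the two settings are independent, what a committing backend waits for can be tabulated with a tiny Python sketch. The function name and the wording of the results are illustrative assumptions only:

```python
def commit_waits(synchronous_commit, synchronous_replication):
    """List what a commit waits for, treating the settings as independent."""
    waits = []
    if synchronous_commit:
        waits.append("local WAL flush")
    if synchronous_replication:
        waits.append("standby reply")
    return waits


if __name__ == "__main__":
    for commit in (True, False):
        for replication in (True, False):
            print(commit, replication, commit_waits(commit, replication))
```

The combination with synchronous_commit off and synchronous_replication on is the one that waits for the standby but not for the local flush.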
* If two transactions commit almost simultaneously and one is sync and
the other async then only the sync backend will wait? (Yes)
Yes.
Do we definitely need the archiver to move the files written by
walreceiver to archive and then move them back out again?
Yes, it's because of cooperating with pg_standby.
Seems like we
can streamline that part in many (all?) cases.
Agreed. But I thought that such streamlining was a TODO for next time.
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Hi,
On Wed, Dec 3, 2008 at 3:38 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
Do we need to worry about periodic
renegotiation of keys in be-secure.c?
What is "keys" you mean?
See the notes in that file for explanation.
Thanks! I would check it.
The key is used only when we use SSL for the replication connection.
As far as I can tell, secure_write() renegotiates
the key if needed. Since walsender calls secure_write() when
sending the WAL to the standby, the key is renegotiated
periodically. So, I think that we don't need to worry about the
key becoming stale. Am I missing something?
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On Thu, 2008-12-04 at 17:57 +0900, Fujii Masao wrote:
On Wed, Dec 3, 2008 at 3:38 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
Do we need to worry about periodic
renegotiation of keys in be-secure.c?
What is "keys" you mean?
See the notes in that file for explanation.
Thanks! I would check it.
The key is used only when we use SSL for the connection of
replication. As far as I examined, secure_write() renegotiates
the key if needed. Since walsender calls secure_write() when
sending the WAL to the standby, the key is renegotiated
periodically. So, I think that we don't need to worry about the
obsolescence of the key.
Understood. Is the periodic renegotiation of keys something that would
interfere with the performance or robustness of replication? Is the
delay likely to affect sync rep? I'm just checking we've thought about
it.
--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
On Thu, 2008-12-04 at 16:10 +0900, Fujii Masao wrote:
* Diagram on p.2 has two Archives. We have just one (yes)
No, we need archive in both the primary and standby. The primary needs
archive because a base backup is required when starting the standby.
Meanwhile, the standby needs archive for cooperating with pg_standby.
If the directory where pg_standby checks is the same as the directory
where walreceiver writes the WAL, the halfway WAL file might be
restored by pg_standby, and continuous recovery would fail. So, we have
to separate the directories, and I assigned pg_xlog and archive to them.
Another idea; walreceiver writes the WAL to the file with temporary name,
and rename it to the formal name when it fills. So, pg_standby doesn't
restore a halfway WAL file. But it's more difficult to perform the failover
because the unrenamed WAL file remains.
WAL sending is either via archiver or via streaming. We must switch
cleanly from one mode to the other and not half-way through a WAL file.
When WAL sending is about to begin, issue an xlog switch. Then tell the
archiver to shut down once it has got to the last file. All files after
that point are streamed. So there need be no conflict in filename.
We must avoid having two archives, because people will configure this
incorrectly.
* If we have synchronous_commit = off do we ignore
synchronous_replication = on (yes)
No, we can configure them independently. synchronous_commit covers
only local writing of the WAL. If synch_*commit* should cover both local
writing and replication, I'd like to add new GUC which covers only local
writing (synchronous_local_write?).
The only sensible settings are
synchronous_commit = on, synchronous_replication = on
synchronous_commit = on, synchronous_replication = off
synchronous_commit = off, synchronous_replication = off
This doesn't make any sense: (does it??)
synchronous_commit = off, synchronous_replication = on
Do we definitely need the archiver to move the files written by
walreceiver to archive and then move them back out again?
Yes, it's because of cooperating with pg_standby.
It seems very easy to make this happen the way we want. We could make
pg_standby look into pg_xlog also, for example.
I was expecting you to have walreceiver and startup share an end of WAL
address via shared memory, so that startup never tries to read past end.
That way we would be able to begin reading a WAL file *before* it was
filled. Waiting until a file fills means we still have to have
archive_timeout set to ensure we switch regularly.
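
The shared end-of-WAL idea in the paragraph above might look roughly like this. It is a Python sketch in which a condition variable stands in for shared memory plus a wakeup mechanism; the class and method names are hypothetical:

```python
import threading


class WalEndPointer:
    """End-of-WAL position published by the receiver, read by startup."""

    def __init__(self):
        self._end = 0
        self._cond = threading.Condition()

    def advance(self, new_end):
        """Receiver side: publish that WAL is now available up to new_end."""
        with self._cond:
            if new_end > self._end:  # the position never moves backwards
                self._end = new_end
                self._cond.notify_all()

    def wait_for(self, position, timeout=None):
        """Startup side: wait until WAL beyond position exists (or the
        timeout passes) and return the current end position."""
        with self._cond:
            self._cond.wait_for(lambda: self._end > position, timeout)
            return self._end


def replay_available(wal_bytes, replayed_upto, end):
    """Return only the bytes that are safe to replay: never past end."""
    return bytes(wal_bytes[replayed_upto:end])
```

Because the reader is bounded by the published position rather than by file boundaries, it can start consuming a WAL file before that file is full.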
We need the existing mechanisms for the start of replication (base
backup etc..) but we don't need them after that point.
--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Simon Riggs wrote:
On Thu, 2008-12-04 at 17:57 +0900, Fujii Masao wrote:
On Wed, Dec 3, 2008 at 3:38 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
Do we need to worry about periodic
renegotiation of keys in be-secure.c?
What is "keys" you mean?
See the notes in that file for explanation.
Thanks! I would check it.
The key is used only when we use SSL for the connection of
replication. As far as I examined, secure_write() renegotiates
the key if needed. Since walsender calls secure_write() when
sending the WAL to the standby, the key is renegotiated
periodically. So, I think that we don't need to worry about the
obsolescence of the key.
Understood. Is the periodic renegotiation of keys something that would
interfere with the performance or robustness of replication? Is the
delay likely to affect sync rep? I'm just checking we've thought about
it.
It will certainly add an extra piece of delay. But if you are worried
about performance for it, you are likely not running SSL. Plus, if you
don't renegotiate the key, you gamble with security.
If it does have a negative effect on the robustness of the replication,
we should just recommend against using it - or refuse to use - not
disable renegotiation.
/Magnus
On Thu, 2008-12-04 at 12:41 +0100, Magnus Hagander wrote:
Understood. Is the periodic renegotiation of keys something that would
interfere with the performance or robustness of replication? Is the
delay likely to affect sync rep? I'm just checking we've thought about
it.
It will certainly add an extra piece of delay. But if you are worried
about performance for it, you are likely not running SSL. Plus, if you
don't renegotiate the key, you gamble with security.
If it does have a negative effect on the robustness of the replication,
we should just recommend against using it - or refuse to use - not
disable renegotiation.
I didn't mean to imply renegotiation might be optional. I just wanted to
check whether there is anything to worry about as a result of it; there
may not be. *If* it took a long time, I would not want sync commits to
wait for it.
--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Hi,
On Thu, Dec 4, 2008 at 6:29 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
The only sensible settings are
synchronous_commit = on, synchronous_replication = on
synchronous_commit = on, synchronous_replication = off
synchronous_commit = off, synchronous_replication = off
This doesn't make any sense: (does it??)
synchronous_commit = off, synchronous_replication = on
If the standby replies before writing the WAL, that strategy can improve
the performance with moderate reliability, and sounds sensible.
IIRC, MySQL Cluster might use that strategy.
I was expecting you to have walreceiver and startup share an end of WAL
address via shared memory, so that startup never tries to read past end.
That way we would be able to begin reading a WAL file *before* it was
filled. Waiting until a file fills means we still have to have
archive_timeout set to ensure we switch regularly.
You mean that the startup process, rather than pg_standby, waits for the
next available WAL? If so, I agree that is the right direction for the
future. I just think of it as the next TODO item, because there are many
problems we would have to resolve carefully to achieve it. But if it's
essential for 8.4, I will tackle it. What is your opinion? I'd like to be
clear about the goal for 8.4.
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Hello,
On Fri, Dec 5, 2008 at 12:09 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
I was expecting you to have walreceiver and startup share an end of WAL
address via shared memory, so that startup never tries to read past end.
That way we would be able to begin reading a WAL file *before* it was
filled. Waiting until a file fills means we still have to have
archive_timeout set to ensure we switch regularly.
You mean that not pg_standby but startup process waits for the next
WAL available? If so, I agree with you in the future. That is, I just think
that this is next TODO because there are many problems which we
should resolve carefully to achieve it. But, if it's essential for 8.4, I will
tackle it. What is your opinion? I'd like to clear up the goal for 8.4.
Umm.. on second thought, this feature (continuous recovery without
pg_standby) seems to be essential for 8.4. So, I will try it.
Development plan:
- Share the end of WAL address via shared memory <--- Done!
- Change ReadRecord() to wait for the next WAL *record* available.
- Change ReadRecord() to restore the WAL from archive by using
pg_standby before reaching the replication starting position, then
read the half-streaming WAL from pg_xlog.
- Add a new trigger for promoting the standby to primary: when a fast
shutdown (SIGINT) is requested during recovery, the standby would
recover the WAL up to the end and become the primary.
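The second item of the plan could look roughly like the sketch below. It is written under assumptions and is not the patch's ReadRecord(): WalStream, append and read_record are hypothetical names, a Python list stands in for the WAL, and a threading.Condition stands in for whatever wakeup mechanism the startup process would really use.

```python
import threading


class WalStream:
    """Hypothetical model of WAL records arriving one at a time."""

    def __init__(self):
        self._records = []
        self._cond = threading.Condition()

    def append(self, record):
        # Receiving side: store the record, then wake any waiting reader.
        with self._cond:
            self._records.append(record)
            self._cond.notify_all()

    def read_record(self, index, timeout=None):
        # Reading side: block until record `index` exists or the timeout
        # expires. Returns the record, or None if the wait timed out.
        with self._cond:
            arrived = self._cond.wait_for(
                lambda: len(self._records) > index, timeout
            )
            return self._records[index] if arrived else None
```

The point of waiting per record rather than per file is that replay can proceed as soon as a record arrives, without relying on archive_timeout to force a file switch.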
What system call does walreceiver have to issue on the WAL before the
startup process reads it? We probably need to call write(2) but don't
need fsync(2) on Linux. What about other platforms?
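As a small experiment on the write(2) question: on POSIX systems, data passed to write(2) is visible to a later read(2) through another descriptor on the same file, while fsync(2) governs durability across a crash rather than visibility. The sketch below only checks this on whatever platform runs it; the function name is hypothetical, and it says nothing about platforms where it has not been run.

```python
import os
import tempfile


def visible_without_fsync(payload):
    """Write payload with write(2) only, then read it back through a
    second, independently opened descriptor. No fsync(2) is issued."""
    fd_w, path = tempfile.mkstemp()
    try:
        os.write(fd_w, payload)            # write(2) only
        fd_r = os.open(path, os.O_RDONLY)  # separate descriptor, like a second reader
        try:
            return os.read(fd_r, len(payload))
        finally:
            os.close(fd_r)
    finally:
        os.close(fd_w)
        os.unlink(path)
```

If the function returns the same bytes that were written, the reader saw the data without any fsync(2) having been called.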
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Hi, sorry for my consecutive posting.
On Fri, Dec 5, 2008 at 4:00 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
Hello,
On Fri, Dec 5, 2008 at 12:09 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
I was expecting you to have walreceiver and startup share an end of WAL
address via shared memory, so that startup never tries to read past end.
That way we would be able to begin reading a WAL file *before* it was
filled. Waiting until a file fills means we still have to have
archive_timeout set to ensure we switch regularly.
You mean that not pg_standby but startup process waits for the next
WAL available? If so, I agree with you in the future. That is, I just think
that this is next TODO because there are many problems which we
should resolve carefully to achieve it. But, if it's essential for 8.4, I will
tackle it. What is your opinion? I'd like to clear up the goal for 8.4.
Umm.. on second thought, this feature (continuous recovery without
pg_standby) seems to be essential for 8.4. So, I will try it.
Development plan:
- Share the end of WAL address via shared memory <--- Done!
- Change ReadRecord() to wait for the next WAL *record* available.
- Change ReadRecord() to restore the WAL from archive by using
pg_standby before reaching the replication starting position, then
read the half-streaming WAL from pg_xlog.
- Add new trigger for promoting the standby to the primary. As the
trigger, when fast shutdown (SIGINT) is requested during recovery,
the standby would recover the WAL up to end and become the
primary.
What system call does walreceiver have to call against the WAL
before startup process reads it? Probably we need to call write(2),
and don't need fsync(2) in Linux. How about other platform?
I added figures of the latest architecture to the PDF file.
Please check p.6 and p.7. Is this architecture close to what you imagined?
http://wiki.postgresql.org/wiki/NTT%27s_Development_Projects#Detailed_Design
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On Fri, 2008-12-05 at 16:00 +0900, Fujii Masao wrote:
On Fri, Dec 5, 2008 at 12:09 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
I was expecting you to have walreceiver and startup share an end of WAL
address via shared memory, so that startup never tries to read past end.
That way we would be able to begin reading a WAL file *before* it was
filled. Waiting until a file fills means we still have to have
archive_timeout set to ensure we switch regularly.
You mean that not pg_standby but startup process waits for the next
WAL available? If so, I agree with you in the future. That is, I just think
that this is next TODO because there are many problems which we
should resolve carefully to achieve it. But, if it's essential for 8.4, I will
tackle it. What is your opinion? I'd like to clear up the goal for 8.4.
Umm.. on second thought, this feature (continuous recovery without
pg_standby) seems to be essential for 8.4. So, I will try it.
Sounds good. Perhaps you can share what changed your mind in those 4
hours...
Could we start with pictures and some descriptions first, so we know
we're on the right track? I foresee no coding issues.
My understanding is that we start with a normal log shipping
architecture, then we switch into continuous recovery mode. So we do use
pg_standby at beginning, but then it gets turned off.
Let's look at all of the corner cases also:
* standby keeps pace with primary (desired state)
* standby falls behind primary
* standby restarts to change shmmem settings
etc
--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
On Fri, 2008-12-05 at 12:09 +0900, Fujii Masao wrote:
On Thu, Dec 4, 2008 at 6:29 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
The only sensible settings are
synchronous_commit = on, synchronous_replication = on
synchronous_commit = on, synchronous_replication = off
synchronous_commit = off, synchronous_replication = off
This doesn't make any sense: (does it??)
synchronous_commit = off, synchronous_replication = on
If the standby replies before writing the WAL, that strategy can improve
the performance with moderate reliability, and sounds sensible.
Do you think it likely that your replication time is consistently and
noticeably less than your time-to-disk? If not, you'll wait just as long
but be less robust. I guess it's possible.
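To make the comparison concrete, here is a small sketch. It assumes, as an interpretation of this thread rather than of the patch code, that synchronous_commit = on means the commit waits for the local WAL flush and synchronous_replication = on means it waits for an acknowledgement from the standby. The function names and the millisecond figures used with them are hypothetical.

```python
def commit_waits(synchronous_commit, synchronous_replication):
    """List what a commit waits for under each combination of settings
    (assumed semantics, see the lead-in)."""
    waits = []
    if synchronous_commit:
        waits.append("local WAL flush")
    if synchronous_replication:
        waits.append("standby acknowledgement")
    return waits


def replication_only_is_faster(disk_ms, rep_ms):
    """True when waiting only for the standby (commit off, replication on)
    is shorter than waiting only for the local disk (commit on,
    replication off)."""
    return rep_ms < disk_ms
```

With a standby in the same rack and slow local storage the second function may well return True; with a standby on another continent it probably will not, and then the "off, on" combination waits at least as long as a local flush while giving up local durability.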
On a related thought: presumably we force a sync rep if forceSyncCommit
is set?
IIRC, MySQL Cluster might use that strategy.
Not the most convincing argument I've heard.
--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Hi,
On Fri, Dec 5, 2008 at 7:09 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
On Fri, 2008-12-05 at 12:09 +0900, Fujii Masao wrote:
On Thu, Dec 4, 2008 at 6:29 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
The only sensible settings are
synchronous_commit = on, synchronous_replication = on
synchronous_commit = on, synchronous_replication = off
synchronous_commit = off, synchronous_replication = off
This doesn't make any sense: (does it??)
synchronous_commit = off, synchronous_replication = on
If the standby replies before writing the WAL, that strategy can improve
the performance with moderate reliability, and sounds sensible.
Do you think it likely that your replication time is consistently and
noticeably less than your time-to-disk?
It depends on the system environment.
- How far apart are the two servers? Same rack? Separate continents?
- Does the system have high-end storage or cheap storage?
... etc
On a related thought: presumably we force a sync rep if forceSyncCommit
is set?
Yes!
Please see RecordTransactionCommit() in xact.c (in my patch).
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Greetings!
On Fri, Dec 5, 2008 at 6:59 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
On Fri, 2008-12-05 at 16:00 +0900, Fujii Masao wrote:
On Fri, Dec 5, 2008 at 12:09 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
I was expecting you to have walreceiver and startup share an end of WAL
address via shared memory, so that startup never tries to read past end.
That way we would be able to begin reading a WAL file *before* it was
filled. Waiting until a file fills means we still have to have
archive_timeout set to ensure we switch regularly.
You mean that not pg_standby but startup process waits for the next
WAL available? If so, I agree with you in the future. That is, I just think
that this is next TODO because there are many problems which we
should resolve carefully to achieve it. But, if it's essential for 8.4, I will
tackle it. What is your opinion? I'd like to clear up the goal for 8.4.
Umm.. on second thought, this feature (continuous recovery without
pg_standby) seems to be essential for 8.4. So, I will try it.
Sounds good. Perhaps you can share what changed your mind in those 4
hours...
Yeah, I was imagining the real situation after the 8.4 release; in
particular, I considered the future conjugal life of Synch Rep and
Hot Standby ;) Waiting to redo until the file fills might lead to marital
breakdown.
Could we start with pictures and some descriptions first, so we know
we're on the right track? I foresee no coding issues.
My understanding is that we start with a normal log shipping
architecture, then we switch into continuous recovery mode. So we do use
pg_standby at beginning, but then it gets turned off.
Yes, that is my understanding too. Updated sequence diagrams are on the
wiki as usual. Please see p.3 and p.4.
http://wiki.postgresql.org/wiki/NTT%27s_Development_Projects#Detailed_Design
Let's look at all of the corner cases also:
* standby keeps pace with primary (desired state)
* standby falls behind primary
* standby restarts to change shmmem settings
etc
Yes, I will examine such cases!
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On Sat, 2008-12-06 at 17:55 +0900, Fujii Masao wrote:
Yeah, it's my imagination about the real situation after 8.4 release,
especially I considered about the future conjugal life of Synch Rep and
Hot Standby ;) Waiting to redo until the file fills might lead to marital
breakdown.
You're obviously working with some comedians now. ;-)
Could we start with pictures and some descriptions first, so we know
we're on the right track? I foresee no coding issues.
My understanding is that we start with a normal log shipping
architecture, then we switch into continuous recovery mode. So we do use
pg_standby at beginning, but then it gets turned off.
Yes, I also understand so. Updated sequence pictures are on wiki
as per usual. Please see P3, 4.
http://wiki.postgresql.org/wiki/NTT%27s_Development_Projects#Detailed_Design
p.6 looks good.
But what is p.7? It's even more complex than the original. Forgive me,
but I don't understand that. Can you explain?
What is the procedure if the standby shuts down, for example if we wish
to restart server to change a parameter? Or to reboot the system it is
on. Does the primary switch back to writing files to archive?
--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Hi, thanks for the comment!
On Mon, Dec 8, 2008 at 11:04 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
Could we start with pictures and some descriptions first, so we know
we're on the right track? I foresee no coding issues.
My understanding is that we start with a normal log shipping
architecture, then we switch into continuous recovery mode. So we do use
pg_standby at beginning, but then it gets turned off.
Yes, I also understand so. Updated sequence pictures are on wiki
as per usual. Please see P3, 4.
http://wiki.postgresql.org/wiki/NTT%27s_Development_Projects#Detailed_Design
p.6 looks good.
But what is p.7? It's even more complex than the original. Forgive me,
but I don't understand that. Can you explain?
p.7 shows one example system configuration. People who don't want to
share an archive between the two servers would probably choose this
configuration, I think.
If the archive is not shared, some WAL files generated before replication
starts would not be copied automatically from the primary to the standby.
So we have to copy them by hand, with clusterware, etc. This is what p.7
shows. If the archive is shared, the archiver on the primary would copy
them automatically (p.6).
What is the procedure if the standby shuts down, for example if we wish
to restart server to change a parameter?
Stop postgres with an immediate shutdown, then start postgres from the
existing database cluster directory. When restarting postgres, if there are
one or more archives, we also need to copy the WAL files written after
replication stopped before restarting replication.
Or to reboot the system it is
on. Does the primary switch back to writing files to archive?
I assume that the primary always writes files to archive; that is, basically
the primary doesn't switch to non-archiving mode. Of course, if archiving
is disabled on the primary for any reason when the standby restarts, the
primary needs to switch back.
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On Tue, 2008-12-09 at 17:15 +0900, Fujii Masao wrote:
But what is p.7? It's even more complex than the original. Forgive me,
but I don't understand that. Can you explain?
p.7 shows one of the system configuration examples. Some people don't
want to share an archive between two servers would probably choose
this configuration, I think.
If archive is not shared, some WAL files before replication starts would not
be copied automatically from the primary to standby. So, we have to copy
them by hand or using clusterware ..etc. This is what p.7 shows. If archive
is shared, archiver on the primary would copy them automatically (p.6).
I agree that is the way to do it *if* the archive is not shared. But why
would you want to *not* share the archive??
What is the procedure if the standby shuts down, for example if we wish
to restart server to change a parameter?
Stop postgres by using immediate shutdown, and start postgres from an
existing database cluster directory. When restarting postgres, if there are
one or more archives, we also need to copy the WAL files after stopping
replication before restarting replication.
Or to reboot the system it is
on. Does the primary switch back to writing files to archive?
I assume that the primary always writes files to archive, that is, basically
the primary doesn't switch to non-archiving mode.
OK, I think that clears up what I was seeing in the code. i.e. I didn't
understand the modes of operation.
I really like most of what you've done, though you must forgive me for
saying I still don't like this. I really am with you on how tiresome
that sounds.
For clarity: I don't think it's acceptable to have the archiver send
files to the archive at the same time as we're streaming data. In normal
running we should not duplicate the data paths - it's just too much data
volume and/or bandwidth.
The cleanest way I can see is to have two modes of operation:
* First mode is file-based log shipping (FLS) (i.e. "warm standby")
* Second mode is streaming log shipping (SLS) (wal sender to wal
receiver)
When we start we are in FLS mode, then we catch up to the cross-over
point and we switch to SLS mode. If streaming stops, we just switch back
to FLS mode. If they reconnect, we follow same procedure again. So the
two modes are compatible, but are never simultaneously active except for
a short period when we switch modes.
If SLS mode is active then the archiver doesn't send files. If FLS mode
is active, we send files. All of the places in code that currently are
not optimised when XLogArchivingActive() must remain unoptimised for
either FLS or SLS mode, so we need a new name for that.
This makes the least number of changes to the existing architecture. People
currently use FLS mode and understand it (!), they just add
understanding of SLS mode. It's also a very straightforward
architecture, which means fewer code paths and fewer weird bugs. (There's
been enough already, as you know).
So just for clarity, let me rephrase it:
We set up FLS mode as we do currently. Then we initiate SLS mode. At the
end of the next WAL file on primary we archive it, then turn off
archiving on primary. (So for up to one WAL file we operate two modes
together).
If SLS mode ends, we send next WAL file via archiver. Some part of that
file has already been streamed across, but that doesn't matter. (If SLS
mode ends because primary is down, we obviously do nothing. If we have a
split brain situation then we rely on clusterware to kill us (STONITH).)
So AFAICS p.6 of the architecture is all we really need. Nice, simple.
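The two-mode proposal can be sketched as a small state machine. This is an illustration of the description above and not code from the patch; Mode, Primary and the method names are hypothetical. The primary starts in FLS mode, keeps archiving for the one WAL file that is in progress when SLS starts, then stops archiving until streaming ends.

```python
from enum import Enum


class Mode(Enum):
    FLS = "file-based log shipping"
    SLS = "streaming log shipping"


class Primary:
    """Hypothetical model of when the archiver sends a finished WAL file."""

    def __init__(self):
        self.mode = Mode.FLS
        # True from the moment SLS starts until the current WAL file ends.
        self.switching = False

    def start_streaming(self):
        self.mode = Mode.SLS
        self.switching = True

    def streaming_stopped(self):
        # Fall back to file shipping; the next finished file is archived.
        self.mode = Mode.FLS
        self.switching = False

    def wal_file_finished(self):
        """Return True if the archiver should send the file just finished."""
        send = self.mode is Mode.FLS or self.switching
        self.switching = False
        return send
```

Only during the cross-over file are both paths active, which matches the "for up to one WAL file we operate two modes together" wording. Whether archiving for backup purposes should remain possible in SLS mode is a separate question that this sketch does not model.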
--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Simon Riggs wrote:
For clarity: I don't think it's acceptable to have the archiver send
files to the archive at the same time as we're streaming data. In normal
running we should not duplicate the data paths - it's just too much data
volume and/or bandwidth.
What if you want to run archiving for backup purposes, and also have a
standby server?
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
On Tue, 2008-12-09 at 14:42 +0200, Heikki Linnakangas wrote:
Simon Riggs wrote:
For clarity: I don't think it's acceptable to have the archiver send
files to the archive at the same time as we're streaming data. In normal
running we should not duplicate the data paths - it's just too much data
volume and/or bandwidth.
What if you want to run archiving for backup purposes, and also have a
standby server?
If we want to include that as an option, yes. If it is "always on" then
no, not everybody wants that.
The best way to implement that is to archive from the standby, not to
send the data twice. By definition the archive is more closely
associated with the standby node than the primary.
Maybe I misunderstood the diagrams? The additional flows to the archive
are actually all optional?
Anyway, I enclose a slightly simplified version of p.6 to allow us to
see the progression of file mode through to streaming mode. This is an
in-my-understanding version.
--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Attachments:
SyncRepArchitectures.pdf (application/pdf)
hk�=�x �=�rk������KvH>H�����=���:���L�s��g�����?H�y���~U�������t^21��c���s�|8^�98��������9�?��']�m�A7����m��v=��A� x�����D4�,{�^P���R��X��9������iJ���2Xh�Je�v�3��A[nE��!t�a�)������{����'dp a����[92F?P��IE%���4�U{�<d�U���k�v�"����_1r�q�q����g���
���:���` ��4�<��q���Po�i��E��v�v���tZU*��FC��T� �-q"�W��l�� �f�����]��'
jdaO
���^�1`�A���z�6EF��[��Y���i�h8����:
�!1s�<
{��0�����&��er7�x���-���-W�R����"T3,������ 2&�"!�"�w�N����bF��������n���214%���^n�O/7
XVn..c����:Q�����&��������2�n�[!�����E�d�4���>o�<���.�/���j���N�O����%F�F4�t@��`K�T$���f�n����%�a��d4�����1�o����fy#o��x��7���V�3,@� &��vh o��b�.M�i�HwPD��� �;9�
c���(���[8���<�����A�|�a�q#�����.YtGu������oy�]���M=nH��V�3���W�v�/�
�(��V^n-/�"9b>rd�F)�D�c�^�� ���S%�9�qh��%���sI�\ [��H�[���'G�}�=���k3�%T���j��8����@7�@�Z�����
��&%>�cs�
P� �n�8������i��k�s �� B������6m��#-Q�t�V+j9Q��z�h+r2�vB��h�b� l"�����+b"��h�z����&�H�-�Gn%4c� �����/�7���;S 4�F������?]i�V�V�� O �j(})��RU��.�zz)=�h�&N#.q3��r��RK,�`iH�P���n*��c�:�
Dg�]��d������kf^���y�tp���+��n��|�g���n.O|��Q�+T����Z�>���|V�d��;����8��W�Y���k�-O�$��$s�)�I2���)�RuVM�2y�IV�a�i�y�=��h��t��*�?3�u=��p�i��fis��Y��vXv$�ee�MM�75-B@� �8+3���������?�@%,���:w��X��p,�51-��i�&�)�Eb M�<�����=��R���? �=�r��i�����!M'!��l����5��7�$����:�df�K�L)�k����d���;�-��gd�?kZ��U]O,��|Z�}Ad�����_^�9hy�����S�{
�_��/��v7~W���������~g(���p��Z#ol2�t��K�������o]��h��p�"��23/h��4 b��8D�{dE�����r�xf�=�����J���FN6���Z�_bn����t+I���[�)��O��oT
�����"��n�������Kj�t�%��8NyBD��s�V�i�
X6�74�
�A���#�0�9�i���GE08�!�5VUR�S��(f��\����L���\+`q�����wX��������_�3�!%����I �*`*�����1O�.������X����2G��U��&�0�g�U��bA���`@��Y�,�a��fA�3 �X��R%��J;��B>�\d���6�p):A+h^��n��E��Qo�s�d������)�3����/�Qzk:b����MXqs�����k&�,p��������'�]���{�mon������hRM���s����9��z����#��������2���G���4� ���i�s��M��<_�_�_��������_�-F
�j�5h�"���Ms��b�--7w������fE�+�]�0z�S�T���w��(����z� ��V���2�z���9�+��cB_�t��;H`���U�7�z�������/�4������� w\���b�����7_�hI���w7O�Z���tM8���8um������*lf:P�C$���N�8^�Mt��D6�1��z6==��{��������_%u���������OHld�������GV����e�k�v i�-��8�����h�Rf(�j0+�YjU��9���AvJ��4���%4<�z�e15�31u��L.���apZ ��:Q�\~�|����<���I��?�[�v/�V �N|�g��!~s���l:��8[{��T�(��F�T��U8&�j�:[����\�\\(5��]��U��u8Z]�q����.�k�/5�saMX�n�;����5��:������"p�t�N�q�I�Sp
mw��K� 6g� ��Pln���L12�RO^��&