New 8.4 hot standby feature
I have merged the last hot standby patch (v9g) to 8.4 devel and I am pleased
with the experience. This is promising stuff. Perhaps it is a bit too soon to
ask questions here but here it is:
1. Speed of recovery
With a archive_timeout of 60 seconds, it can take about 4 minutes before I see
the reflected changes in the replica. This is normal since, in addition to
the WAL log shipping, it takes more time to do the recovery itself. Still, is
there any way besides the archive_timeout config option to speed up the
recovery of WAL logs on the hot standby?
2. last modified since timestamp:
Is there a way to get the "last modified since" timestamp on the hot standby
replica? Since the replication is asynchronous, it is necessary to know how
up to date the replication is. In our case, the timestamp is used this way:
select * from resource where not_modified_since >=
to_timestamp('$not_modified_since', 'YYYY/MM/DD HH:MI:SS');
The $not_modified_since is set to now() for the next time this query will be
run on the master database. This way the application keeps a cache and it is
not necessary to fetch everything every time. With an asynchronous replica
however, "now()" cannot be used and so I am looking into other possibilities
to get an accurate "last modified since" on the hot standby itself. I
tried "select pg_last_recovered_xact_timestamp();"
(http://wiki.postgresql.org/wiki/Hot_Standby) but this is not 100% accurate.
It looks like it has more to do with recovery transactions then the original
database itself.
So the question is: is there any clean way to get the last_modified_since
without making modifications to the schema?
On 1/27/09, Gabi Julien <gabi.julien@broadsign.com> wrote:
I have merged the last hot standby patch (v9g) to 8.4 devel and I am pleased
with the experience. This is promising stuff. Perhaps it is a bit too soon to
ask questions here but here it is:1. Speed of recovery
With a archive_timeout of 60 seconds, it can take about 4 minutes before I see
the reflected changes in the replica. This is normal since, in addition to
the WAL log shipping, it takes more time to do the recovery itself. Still, is
there any way besides the archive_timeout config option to speed up the
recovery of WAL logs on the hot standby?
you can manually throw pg_switch_xlog(), In practice, this is more of
an issue on development boxes than anything if you server is at all
busy.
see: http://developer.postgresql.org/pgdocs/postgres/functions-admin.html
merlin
On Tue, 2009-01-27 at 12:58 -0500, Merlin Moncure wrote:
you can manually throw pg_switch_xlog(), In practice, this is more of
an issue on development boxes than anything if you server is at all
busy.
That won't speed up recovery, that will just force the WAL segment to be
archived. It's still up to the standby to find the log and replay it.
Regards,
Jeff Davis
On Tue, 2009-01-27 at 12:53 -0500, Gabi Julien wrote:
I have merged the last hot standby patch (v9g) to 8.4 devel and I am pleased
with the experience. This is promising stuff. Perhaps it is a bit too soon to
ask questions here but here it is:1. Speed of recovery
With a archive_timeout of 60 seconds, it can take about 4 minutes before I see
the reflected changes in the replica. This is normal since, in addition to
the WAL log shipping, it takes more time to do the recovery itself. Still, is
there any way besides the archive_timeout config option to speed up the
recovery of WAL logs on the hot standby?
Is the recovery itself the bottleneck?
There's a performance improvement submitted here:
http://archives.postgresql.org/message-id/a778a7260810280033n43f70d36x8c437eacf9a5461e@mail.gmail.com
But I haven't been following the development of it closely, so you'll
have to read the thread to see whether it will meet your needs or not.
Regards,
Jeff Davis
On Tuesday 27 January 2009 13:13:32 you wrote:
On Tue, 2009-01-27 at 12:53 -0500, Gabi Julien wrote:
I have merged the last hot standby patch (v9g) to 8.4 devel and I am
pleased with the experience. This is promising stuff. Perhaps it is a bit
too soon to ask questions here but here it is:1. Speed of recovery
With a archive_timeout of 60 seconds, it can take about 4 minutes before
I see the reflected changes in the replica. This is normal since, in
addition to the WAL log shipping, it takes more time to do the recovery
itself. Still, is there any way besides the archive_timeout config option
to speed up the recovery of WAL logs on the hot standby?Is the recovery itself the bottleneck?
Yes, the logs are shipped every minute but the recevory is 3 or 4 times
longer.
There's a performance improvement submitted here:
http://archives.postgresql.org/message-id/a778a7260810280033n43f70d36x8c437
eacf9a5461e@mail.gmail.comBut I haven't been following the development of it closely, so you'll
have to read the thread to see whether it will meet your needs or not.
Thanks I will take a look at it. Also, I came across the record log shipping
feature too in my research:
http://www.postgresql.org/docs/current/static/warm-standby.html#WARM-STANDBY-RECORD
Could this help? If the logs are smaller then I could potentially afford
shipping then at a higher frequency.
Regards,
Jeff Davis
-------------------------------------------------------
Import Notes
Resolved by subject fallback
On Tue, 2009-01-27 at 14:28 -0500, Gabi Julien wrote:
Could this help? If the logs are smaller then I could potentially afford
shipping then at a higher frequency.
See if there are times during which the recovery process isn't doing
anything (i.e. just waiting for WAL data). If so, something like this
might help. If it's constantly working as hard as it can, then probably
not.
An important question you should ask yourself is whether it can keep up
in the steady state at all. If the primary is producing segments faster
than the standby is recovering them, I don't think there's any way
around that.
Regards,
Jeff Davis
Hi,
On Wed, Jan 28, 2009 at 4:28 AM, Gabi Julien <gabi.julien@broadsign.com> wrote:
Yes, the logs are shipped every minute but the recevory is 3 or 4 times
longer.
Are you disabling full_page_writes? It may slow down recovery several times.
Thanks I will take a look at it. Also, I came across the record log shipping
feature too in my research:http://www.postgresql.org/docs/current/static/warm-standby.html#WARM-STANDBY-RECORD
Could this help? If the logs are smaller then I could potentially afford
shipping then at a higher frequency.
No. Even if the logs are shipped frequently, they cannot be applied until
the log file fills.
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Hi,
On Thu, Jan 29, 2009 at 12:23 AM, Jason Long
<mailing.list@supernovasoftware.com> wrote:
Is pg_clearxlogtail going to be in contrib or integrated in some other way?
I also hope so. The related topic was discussed before.
http://archives.postgresql.org/pgsql-hackers/2009-01/msg00639.php
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Import Notes
Reply to msg id not found: 4980785B.8040202@supernovasoftware.com
Fujii Masao wrote:
Hi,
On Wed, Jan 28, 2009 at 4:28 AM, Gabi Julien <gabi.julien@broadsign.com> wrote:
Yes, the logs are shipped every minute but the recevory is 3 or 4 times
longer.Are you disabling full_page_writes? It may slow down recovery several times.
Thanks I will take a look at it. Also, I came across the record log shipping
feature too in my research:http://www.postgresql.org/docs/current/static/warm-standby.html#WARM-STANDBY-RECORD
Could this help? If the logs are smaller then I could potentially afford
shipping then at a higher frequency.No. Even if the logs are shipped frequently, they cannot be applied until
the log file fills.Regards,
Is pg_clearxlogtail <http://www.2ndquadrant.com/code/pg_clearxlogtail.c>
going to be in contrib or integrated in some other way?
On Tuesday 27 January 2009 16:25:44 you wrote:
On Tue, 2009-01-27 at 14:28 -0500, Gabi Julien wrote:
Could this help? If the logs are smaller then I could potentially afford
shipping then at a higher frequency.See if there are times during which the recovery process isn't doing
anything (i.e. just waiting for WAL data). If so, something like this
might help. If it's constantly working as hard as it can, then probably
not.An important question you should ask yourself is whether it can keep up
in the steady state at all. If the primary is producing segments faster
than the standby is recovering them, I don't think there's any way
around that.
The load on the slave is close to 0 so it does not explain the speed of
recovery. Also the shipping of the 16MB WAL log takes only 1 second on the
LAN. I guess the problem is probably what Fujii Masao explained. The WAL log
shipped are not yet usable or something like that. I won't try to increase
the frequency of log shipping because of that. Also, my setting of 60 seconds
is the lowest frequency suggested by the documentation anyways.
However, I have found the v4 patch about the PITR performance improvement. I
will give it a try and report here.
I might try pg_clearxlogtail too if I have time.
Show quoted text
Regards,
Jeff Davis
On Tuesday 27 January 2009 21:47:36 you wrote:
Hi,
On Wed, Jan 28, 2009 at 4:28 AM, Gabi Julien <gabi.julien@broadsign.com>
wrote:
Yes, the logs are shipped every minute but the recevory is 3 or 4 times
longer.Are you disabling full_page_writes? It may slow down recovery several
times.
It looks like you found my problem. Everything I needed to know is described
here:
http://www.postgresql.org/docs/8.3/interactive/wal-configuration.html
Setting checkpoint_timeout to 55 seconds speeds up the recovery to the level I
want. Ironically, it makes the pg_last_recovered_xact_timestamp() function
more reliable too on how up to date the replica is. I am not sure that I can
take this for granted however.
I will disable full_page_writes to make sure this agressive checkpoint_timeout
setting won't slow down my master database too much. Can I be reassured on
the fact that, if the master database crashes and some data is lost, at least
the replica would keep its integrity (even though it is not in sync)?
My settings:
full_page_writes = off
checkpoint_timeout = 55s
checkpoint_completion_target = 0.7
archive_mode = on
archive_command = './archive_command.sh "%p" "%f"'
archive_timeout = 60
Also, would it be possible to recompile postgresql by using a different size
(smaller) then 16M for WAL logs and would that be a smart thing to try?
Thanks a lot to all of you.
Show quoted text
Thanks I will take a look at it. Also, I came across the record log
shipping feature too in my research:http://www.postgresql.org/docs/current/static/warm-standby.html#WARM-STAN
DBY-RECORDCould this help? If the logs are smaller then I could potentially afford
shipping then at a higher frequency.No. Even if the logs are shipped frequently, they cannot be applied until
the log file fills.Regards,
On Wednesday 28 January 2009 18:35:18 Gabi Julien wrote:
On Tuesday 27 January 2009 21:47:36 you wrote:
Hi,
On Wed, Jan 28, 2009 at 4:28 AM, Gabi Julien <gabi.julien@broadsign.com>
wrote:
Yes, the logs are shipped every minute but the recevory is 3 or 4 times
longer.Are you disabling full_page_writes? It may slow down recovery several
times.It looks like you found my problem. Everything I needed to know is
described here:http://www.postgresql.org/docs/8.3/interactive/wal-configuration.html
Setting checkpoint_timeout to 55 seconds speeds up the recovery to the
level I want. Ironically, it makes the pg_last_recovered_xact_timestamp()
function more reliable too on how up to date the replica is. I am not sure
that I can take this for granted however.
This is a good question actually. If I set the checkpoint_timeout to
something less then the archive_timeout, can I take this for granted the fact
that pg_last_recovered_xact_timestamp() will always accurately tell me how up
to date the replica is?
On Tue, 2009-01-27 at 12:53 -0500, Gabi Julien wrote:
I have merged the last hot standby patch (v9g) to 8.4 devel and I am pleased
with the experience. This is promising stuff.
Thanks,
Perhaps it is a bit too soon to
ask questions here but here it is:
Thanks very much for the bug report.
1. Speed of recovery
With a archive_timeout of 60 seconds, it can take about 4 minutes before I see
the reflected changes in the replica. This is normal since, in addition to
the WAL log shipping, it takes more time to do the recovery itself. Still, is
there any way besides the archive_timeout config option to speed up the
recovery of WAL logs on the hot standby?
There was a reported bug whose apparent symptoms were delay of WAL
files. The bug was not in fact anything to do with that at all, it was
just delayed *visibility*. So I doubt very much that you have a
performance problem.
The bug fix patch is attached, verified to solve the problem.
--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Attachments:
hs.v9h_dif_v9g.patchtext/x-patch; charset=utf-8; name=hs.v9h_dif_v9g.patchDownload+95-81
On Thursday 29 January 2009 02:43:18 you wrote:
On Tue, 2009-01-27 at 12:53 -0500, Gabi Julien wrote:
I have merged the last hot standby patch (v9g) to 8.4 devel and I am
pleased with the experience. This is promising stuff.Thanks,
Perhaps it is a bit too soon to
ask questions here but here it is:Thanks very much for the bug report.
1. Speed of recovery
With a archive_timeout of 60 seconds, it can take about 4 minutes before
I see the reflected changes in the replica. This is normal since, in
addition to the WAL log shipping, it takes more time to do the recovery
itself. Still, is there any way besides the archive_timeout config option
to speed up the recovery of WAL logs on the hot standby?There was a reported bug whose apparent symptoms were delay of WAL
files. The bug was not in fact anything to do with that at all, it was
just delayed *visibility*. So I doubt very much that you have a
performance problem.The bug fix patch is attached, verified to solve the problem.
Thanks. Please discard all my previous comments. This was the true source of
the issue that I was experiencing.
Hi,
There's a performance improvement submitted here:
http://archives.postgresql.org/message-id/a778a7260810280033n43f70d36x8c437eacf9a5461e@mail.gmail.comBut I haven't been following the development of it closely, so you'll
have to read the thread to see whether it will meet your needs or not.
I've posted the lastest set of the patch to speed-up the recovery
without FPW. You'll find this in the thread. I hope anyone of you
try this.
Regards,
Jeff Davis--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
--
------
Koichi Suzuki