Streaming replication: sequences on slave seemingly ahead of sequences on master
Hi list,
we have two PostgreSQL 9.1.2 servers on Debian squeeze, and are setting up
simple streaming replication between the two.
* wal_keep_segments is set high on the master
* the slave's recovery.conf contains just standby_mode=on and
primary_conninfo=foo
* we use a simple pg_start_backup()/rsync/pg_stop_backup() to create the base
copy before starting the slave.
It all seems to be working fine, except that when checking the data (selecting
latest primary key and sequence value for all tables) on master and slave,
some sequence ids are higher on the slave than on the master. I could
understand if they were lower, but this is weird.
* The slave's sequences can be anywhere between 1 and 50 ids ahead.
* The actual table data is properly in sync.
* We look at the slave before the master.
* We ignore readings where pg_current_xlog_location() !=
pg_last_xlog_replay_location().
* It only happens on frequently-updated sequences.
* During recovery, we have warnings of the form:
2012-05-04 10:32:08 CEST WARNING: xlog min recovery request 16A/2A03BDD0 is past current point 16A/1E72A880
2012-05-04 10:32:08 CEST CONTEXT: writing block 0 of relation base/35355/42224_vm
    xlog redo vacuum: rel 1663/1562168/1563037; blk 12122, lastBlockVacuumed 12070
2012-05-04 10:32:12 CEST WARNING: xlog min recovery request 16A/469F2120 is past current point 16A/1E9B6EB8
2012-05-04 10:32:12 CEST CONTEXT: writing block 0 of relation base/56308/57181_vm
    xlog redo vacuum: rel 1663/1562168/1563037; blk 21875, lastBlockVacuumed 21329
2012-05-04 10:32:17 CEST WARNING: xlog min recovery request 16A/22D497B8 is past current point 16A/1FF69258
* servers have near-identical hardware and software
* monitoring via Munin shows at most 1-2 KB of replication lag
* we retried the base backup twice
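To put a number on the gap between two xlog locations (for example the pairs in the warnings above, or pg_current_xlog_location() versus pg_last_xlog_replay_location()), the locations can be converted to byte positions. The following is only a minimal sketch in POSIX shell and not part of the setup described here: the helper names xlog_to_bytes and xlog_diff are made up for illustration, and it assumes the pre-9.3 layout in which each log id covers 0xFF000000 bytes (255 segments of 16 MB).

```sh
#!/bin/sh
# Hypothetical helpers (not from the original post).
# Assumption: PostgreSQL < 9.3, where each xlog "log id" spans
# 0xFF000000 bytes (255 usable 16 MB segments).

# Convert a location such as 16A/2A03BDD0 to an absolute byte position.
xlog_to_bytes() {
    logid=${1%%/*}      # hex digits before the slash
    offset=${1##*/}     # hex digits after the slash
    echo $(( 0x$logid * 0xFF000000 + 0x$offset ))
}

# Print how many bytes location $1 is ahead of location $2.
xlog_diff() {
    echo $(( $(xlog_to_bytes "$1") - $(xlog_to_bytes "$2") ))
}

# Gap between the two locations in the first warning of the log excerpt.
xlog_diff 16A/2A03BDD0 16A/1E72A880   # prints 194057552
```

Comparing byte positions rather than the raw strings also makes it possible to tolerate a small lag instead of requiring exact equality.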
So...
* any likely mistake on our side?
* can it be fixed?
* is this harmless and to be ignored?
Thank you.
--
Vincent de Phily
On Fri, May 4, 2012 at 8:55 AM, Vincent de Phily
<vincent.dephily@mobile-devices.fr> wrote:
> Hi list,
> we have two 9.1.2 servers on debian squeeze, and are setting up a simple
> streaming replication between the two.
> * wal_keep_segments is set high on the master
> * the slave's recovery.conf contains just standby_mode=on and
> primary_conninfo=foo
> * we use a simple start_backup/rsync/stop_backup to create the base copy
> before starting the slave.
> It all seems to be working fine, except that when checking the data (selecting
> latest primary key and sequence value for all tables) on master and slave,
> some sequence ids are higher on the slave than on the master. I could
> understand if they were lower, but this is weird.
> * The slave's sequences can be anywhere between 1 and 50 ids ahead.
how did you determine that exactly? how do you know the transactions
are committing in sequence order?
merlin
On Friday 04 May 2012 09:47:16 Merlin Moncure wrote:
> On Fri, May 4, 2012 at 8:55 AM, Vincent de Phily
> <vincent.dephily@mobile-devices.fr> wrote:
>> Hi list,
>> we have two 9.1.2 servers on debian squeeze, and are setting up a simple
>> streaming replication between the two.
>> * wal_keep_segments is set high on the master
>> * the slave's recovery.conf contains just standby_mode=on and
>> primary_conninfo=foo
>> * we use a simple start_backup/rsync/stop_backup to create the base copy
>> before starting the slave.
>> It all seems to be working fine, except that when checking the data
>> (selecting latest primary key and sequence value for all tables) on
>> master and slave, some sequence ids are higher on the slave than on the
>> master. I could understand if they were lower, but this is weird.
>> * The slave's sequences can be anywhere between 1 and 50 ids ahead.
> how did you determine that exactly?
Quick and dirty:
# Build one "select '<name>', ..." query per table (highest value of its first
# column) and per sequence (last_value), from the master's catalog.
SQL=$(psql -tA -h "$MASTER" "$DB" <<< "
select E'select \''||table_name||E'\', '||column_name||' from '||table_name||
       ' order by '||column_name||' desc limit 1;'
  from information_schema.columns
 where table_schema='public' and ordinal_position=1
 order by table_name;
select E'select \''||sequence_name||E'\', last_value from '||sequence_name||';'
  from information_schema.sequences
 where sequence_schema='public'
 order by sequence_name;")
# Query the slave first, then the master. The first output line of each file
# is the xlog location, so a difference there also makes the diff fail.
psql -tA -h "$SLAVE" "$DB" <<< "select pg_last_xlog_replay_location();$SQL" > "$SLAVE.check"
psql -tA -h "$MASTER" "$DB" <<< "select pg_current_xlog_location();$SQL" > "$MASTER.check"
if diff -u "$MASTER.check" "$SLAVE.check"; then
    cat "$MASTER.check"
    echo -e "\e[32msync ok\e[m"
else
    echo -e "\e[31msync bad\e[m"
fi
> how do you know the transactions
> are committing in sequence order?
I don't, actually. But whichever order the transactions eventually commit in,
I'd expect that order to be the same on the slave and the master? And I
wouldn't expect anything to finish on the slave before it finishes on the
master?
--
Vincent de Phily
Mobile Devices
+33 (0) 142 119 325
+353 (0) 85 710 6320
This is due to how sequences are pre-allocated in blocks to sessions
running on the master.
Since the slave is updated via the WALs, and not via 'nextval' function
calls in queries, the sequences that are actually used will remain in sync
with the master.
--
Mike Nolan
On 4 May 2012 14:55, Vincent de Phily <vincent.dephily@mobile-devices.fr> wrote:
> It all seems to be working fine, except that when checking the data (selecting
> latest primary key and sequence value for all tables) on master and slave,
> some sequence ids are higher on the slave than on the master. I could
> understand if they were lower, but this is weird.
> * The slave's sequences can be anywhere between 1 and 50 ids ahead.
This is normal. The sequences are advanced in chunks of 100, so the
master's value will be the nextval() while the value on standby will
be the start of the next chunk, so as you say, slightly ahead of the
master.
The same thing would also happen in case of a crash.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Sunday 06 May 2012 10:29:17 Simon Riggs wrote:
> On 4 May 2012 14:55, Vincent de Phily <vincent.dephily@mobile-devices.fr>
> wrote:
>> It all seems to be working fine, except that when checking the data
>> (selecting latest primary key and sequence value for all tables) on
>> master and slave, some sequence ids are higher on the slave than on the
>> master. I could understand if they were lower, but this is weird.
>> * The slave's sequences can be anywhere between 1 and 50 ids ahead.
> This is normal. The sequences are advanced in chunks of 100, so the
> master's value will be the nextval() while the value on standby will
> be the start of the next chunk, so as you say, slightly ahead of the
> master.
> The same thing would also happen in case of a crash.
Thanks for the explanation (Michael's too).
Would be nice to see it added to the documentation (unless I just didn't find
it?), as it is quite surprising, and might lead to problems if people expect
to be able to read sequence values from the slave.
As a bonus question, I guess it would be the same if using synchronous
replication?
--
Vincent de Phily
On 7 May 2012 09:01, Vincent de Phily <vincent.dephily@mobile-devices.fr> wrote:
> Would be nice to see it added to the documentation (unless I just didn't find
> it?), as it is quite surprising, and might lead to problems if people expect
> to be able to read sequence values from the slave.
If you think so, please submit a patch. That's how it works here.
> As a bonus question, I guess it would be the same if using synchronous
> replication?
Yes
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Mon, May 7, 2012 at 10:14 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 7 May 2012 09:01, Vincent de Phily <vincent.dephily@mobile-devices.fr> wrote:
>> Would be nice to see it added to the documentation (unless I just didn't find
>> it ?), as it is quite surprising, and might lead to problems if people expect
>> to be able to read sequence values from the slave.
> If you think so, please submit a patch. That's how it works here.
FWIW, I think this would be a reasonable thing to document, given that
it violates the principle of least surprise for people who are not
intimately familiar with how replication and/or sequences work with respect
to WAL logging in PostgreSQL. (but no, I'm not actually volunteering at this
point to write said patch due to my backlog already being too large
:P)
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On 7 May 2012 09:19, Magnus Hagander <magnus@hagander.net> wrote:
> On Mon, May 7, 2012 at 10:14 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> On 7 May 2012 09:01, Vincent de Phily <vincent.dephily@mobile-devices.fr> wrote:
>>> Would be nice to see it added to the documentation (unless I just didn't find
>>> it ?), as it is quite surprising, and might lead to problems if people expect
>>> to be able to read sequence values from the slave.
>> If you think so, please submit a patch. That's how it works here.
> FWIW, I think this would be a reasonable thing to document, given that
> it violates the principle of least surprise for people who are not
> intimately familiar with how replication and/or sequences work wrt wal
> logging in postgresql. (but no, I'm not actually volunteering at this
> point to write said patch due to my backlog already being too large
> :P)
Mine also. It's important that everybody understands that submitting a
patch is the way to get change, plus it's a good test of whether the
change is actually worth the effort to make it happen.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Simon Riggs <simon@2ndQuadrant.com> writes:
> On 7 May 2012 09:01, Vincent de Phily <vincent.dephily@mobile-devices.fr> wrote:
>> Would be nice to see it added to the documentation (unless I just didn't find
>> it ?), as it is quite surprising, and might lead to problems if people expect
>> to be able to read sequence values from the slave.
> If you think so, please submit a patch. That's how it works here.
Any documentation patch should be written by somebody who's actually
researched the behavior a bit; in particular I believe this can be
adjusted with the sequence CACHE setting.
regards, tom lane
On Mon, May 7, 2012 at 4:01 AM, Vincent de Phily <
vincent.dephily@mobile-devices.fr> wrote:
> On Sunday 06 May 2012 10:29:17 Simon Riggs wrote:
>> On 4 May 2012 14:55, Vincent de Phily <vincent.dephily@mobile-devices.fr> wrote:
> Would be nice to see it added to the documentation (unless I just didn't
> find it ?), as it is quite surprising, and might lead to problems if people
> expect to be able to read sequence values from the slave.
What people need to understand is that there is no way to 'read' a sequence
value from a slave. 'SELECT * from sequence_name' will not reliably give
you either the most recently assigned or the next sequence value. This is
currently covered in the documentation for sequences, but could probably be
improved upon and mentioned somewhere in the documentation on setting up
slave servers. (I will look at adding it to the binary replication
tutorial wiki page.)
Since 'nextval' cannot be called on a sequence on a slave (because a slave
can only support read-only transactions), 'currval' will by definition
return an error.
To cross-pollinate with another thread, if temporary tables (and
insert/delete/update transactions to them) are to be supported on a slave,
will the applications using those temporary tables expect to be able to use
'nextval' on inserts to temporary tables as well?
> As a bonus question, I guess it would be the same if using synchronous
> replication?
Yes.
--
Mike Nolan
On Mon, May 7, 2012 at 10:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Simon Riggs <simon@2ndQuadrant.com> writes:
>> On 7 May 2012 09:01, Vincent de Phily <vincent.dephily@mobile-devices.fr> wrote:
>>> Would be nice to see it added to the documentation (unless I just didn't find
>>> it ?), as it is quite surprising, and might lead to problems if people expect
>>> to be able to read sequence values from the slave.
>> If you think so, please submit a patch. That's how it works here.
> Any documentation patch should be written by somebody who's actually
> researched the behavior a bit; in particular I believe this can be
> adjusted with the sequence CACHE setting.
No. That behavior is caused by the hard-coded value SEQ_LOG_VALS
(= 32 in sequence.c) rather than CACHE setting.
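The effect can be illustrated with a toy model. The sketch below is a simplified simulation in POSIX shell, not the actual code in sequence.c: it assumes INCREMENT 1, CACHE 1 and no checkpoints in between, and every name other than SEQ_LOG_VALS is made up for illustration. Whenever the pre-logged values run out, nextval() writes a WAL record that is SEQ_LOG_VALS values ahead of what it hands out, and a standby only ever sees the value from that WAL record.

```sh
#!/bin/sh
# Toy model of sequence WAL logging; a simplification, not sequence.c itself.
# Assumptions: INCREMENT 1, CACHE 1, no checkpoints in between.
SEQ_LOG_VALS=32

# Reset the simulated sequence state.
reset_seq() {
    master_last=0   # last_value as read on the master
    wal_last=0      # last_value in the latest WAL record (what a standby replays)
    log_cnt=0       # values that can still be handed out without new WAL
}

# One nextval() call on the master.
sim_nextval() {
    if [ "$log_cnt" -lt 1 ]; then
        # Out of pre-logged values: log this value plus SEQ_LOG_VALS spare ones.
        wal_last=$(( master_last + 1 + SEQ_LOG_VALS ))
        log_cnt=$(( SEQ_LOG_VALS + 1 ))
    fi
    master_last=$(( master_last + 1 ))
    log_cnt=$(( log_cnt - 1 ))
}

# Run $1 nextval() calls from a fresh sequence and report both views.
simulate() {
    reset_seq
    i=0
    while [ "$i" -lt "$1" ]; do
        sim_nextval
        i=$(( i + 1 ))
    done
    echo "master=$master_last standby=$wal_last ahead=$(( wal_last - master_last ))"
}

simulate 1    # master=1 standby=33 ahead=32
simulate 33   # master=33 standby=33 ahead=0
simulate 34   # master=34 standby=66 ahead=32
```

In this model the value seen on the standby is always between 0 and SEQ_LOG_VALS ahead of the master's, and never behind it.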
Regards,
--
Fujii Masao
On Mon, May 7, 2012 at 8:52 AM, Michael Nolan <htfoot@gmail.com> wrote:
> To cross-pollinate with another thread, if temporary tables (and
> insert/delete/update transactions to them) are to be supported on a slave,
> will the applications using those temporary tables expect to be able to use
> 'nextval' on inserts to temporary tables as well?
That's a good question. I re-asked it for you on -hackers on the GTT
thread. Look for follow ups there.
merlin