Slony-I goes BETA

Started by Jan Wieck over 21 years ago, 15 messages
#1 Jan Wieck
JanWieck@Yahoo.com

Yes, Slonik's,

it's true. After nearly a year the Slony-I project is entering the BETA
phase for the 1.0 release. Please visit

http://gborg.postgresql.org/project/slony1/news/newsfull.php?news_id=174

for further details.

Jan Wieck

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #

#2 Karel Zak
zakkr@zf.jcu.cz
In reply to: Jan Wieck (#1)
Re: [HACKERS] Slony-I goes BETA

On Fri, Jun 04, 2004 at 01:01:19AM -0400, Jan Wieck wrote:

Yes, Slonik's,

it's true. After nearly a year the Slony-I project is entering the BETA
phase for the 1.0 release. Please visit

http://gborg.postgresql.org/project/slony1/news/newsfull.php?news_id=174

Jan, the link

http://postgresql.org/~wieck/slony1/Slony-I-concept.pdf

that is used on project pages doesn't work :-(

Karel

--
Karel Zak <zakkr@zf.jcu.cz>
http://home.zf.jcu.cz/~zakkr/

#3 Jan Wieck
JanWieck@Yahoo.com
In reply to: Karel Zak (#2)
Re: [HACKERS] Slony-I goes BETA

On 6/4/2004 4:47 AM, Karel Zak wrote:

On Fri, Jun 04, 2004 at 01:01:19AM -0400, Jan Wieck wrote:

Yes, Slonik's,

it's true. After nearly a year the Slony-I project is entering the BETA
phase for the 1.0 release. Please visit

http://gborg.postgresql.org/project/slony1/news/newsfull.php?news_id=174

Jan, the link

http://postgresql.org/~wieck/slony1/Slony-I-concept.pdf

that is used on project pages doesn't work :-(

Karel

Great ... and there is no way to modify anything on gborg ... this is
the first and last project I manage on any site where I don't have shell
access to the content.

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #

#4 Dave Page
dpage@vale-housing.co.uk
In reply to: Jan Wieck (#3)
Re: [HACKERS] Slony-I goes BETA

-----Original Message-----
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Jan Wieck
Sent: 04 June 2004 12:50
To: Karel Zak
Cc: Slony-I Mailing List; PostgreSQL-development; PostgreSQL
advocacy; PostgreSQL General
Subject: Re: [HACKERS] Slony-I goes BETA

Great ... and there is no way to modify anything on gborg ...
this is the first and last project I manage on any site where
I don't have shell access to the content.

In the project admin area, click 'Add News' under the News Manager
section, and then follow the 'Click here for a list of all news
bulletins available for update' link which will allow you to edit your
news items.

Not overly intuitive I grant you...

Regards, Dave

#5 Josh Berkus
josh@agliodbs.com
In reply to: Jan Wieck (#3)
Re: [pgsql-advocacy] Slony-I goes BETA

Jan,

Great ... and there is no way to modify anything on gborg ... this is
the first and last project I manage on any site where I don't have shell
access to the content.

Sorry -- we'd like to migrate you (and lots of other projects) but,
a) I'm still sick, and
b) we're still having performance (load time) issues with pgFoundry.

--
Josh Berkus
Aglio Database Solutions
San Francisco

#6 Marc G. Fournier
scrappy@postgresql.org
In reply to: Josh Berkus (#5)
Re: [pgsql-advocacy] Slony-I goes BETA

On Fri, 4 Jun 2004, Josh Berkus wrote:

Jan,

Great ... and there is no way to modify anything on gborg ... this is
the first and last project I manage on any site where I don't have shell
access to the content.

Sorry -- we'd like to migrate you (and lots of other projects) but,
a) I'm still sick, and
b) we're still having performance (load time) issues with pgFoundry.

b) they would be the same on gborg as on pgFoundry, so don't let that stop
things ...

----
Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org Yahoo!: yscrappy ICQ: 7615664

#7 Rick Gigger
rick@alpinenetworking.com
In reply to: Jan Wieck (#3)
Re: [GENERAL] [HACKERS] Slony-I goes BETA

The link you have down there is not the one on the site. All of the
links to that file work just fine for me on the live site.

#8 Jan Wieck
JanWieck@Yahoo.com
In reply to: Rick Gigger (#7)
Re: [GENERAL] [HACKERS] Slony-I goes BETA

On 6/4/2004 3:28 PM, Rick Gigger wrote:

The link you have down there is not the one on the site. All of the
links to that file work just fine for me on the live site.

After Dave told me how to, I fixed the page.

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #

#9 Jeff Davis
jdavis-pgsql@empires.org
In reply to: Jan Wieck (#1)
Re: [HACKERS] Slony-I goes BETA (possible bug)

I have two nodes, node 1 and node 2.

Both are working with node 1 as the master, and data from subscribed
tables is being properly replicated to node 2.

However, it looks like there's a possible bug with sequences. First let
me explain that I don't entirely understand how a replicated sequence is
expected to behave, but as far as this report is concerned, I assume
that if you do a nextval() on node 1, then "SELECT last_value FROM
test_seq" on node 2 will return the updated value.

It looks like the sequence value is not updated on node 2, until some
other event happens, like doing an UPDATE on a replicated table on node
1.

I already have a table "t2" which is properly replicating.

So, here's what I give to slonik to add the sequence to set 1:
slonik <<_EOF_
cluster name = $CLUSTER;

node 1 admin conninfo = 'dbname=$DBNAME1 host=$HOST1 user=$SLONY_USER';
node 2 admin conninfo = 'dbname=$DBNAME2 host=$HOST2 user=$SLONY_USER';

create set (id=34, origin=1, comment='set 34');
set add sequence (set id = 34, origin = 1, id = 35, full qualified
name='public.test_seq', comment = 'sequence test');

subscribe set (id=34,provider=1,receiver=2,forward=no);

merge set (id=1,add id = 34, origin=1);

subscribe set (id=1,provider=1,receiver=2,forward=no);
_EOF_

Note: results of the query are put after the "--" following the query
for easier readability.

node1=> SELECT last_value FROM test_seq; -- 1
node2=> SELECT last_value FROM test_seq; -- 1
node1=> SELECT nextval('test_seq'); -- 1
node1=> SELECT nextval('test_seq'); -- 2
node1=> SELECT nextval('test_seq'); -- 3
node1=> SELECT last_value FROM test_seq; -- 3
node2=> SELECT last_value FROM test_seq; -- 1
node2=> -- wait for a long time, still doesn't update
node2=> SELECT last_value FROM test_seq; -- 1
node1=> INSERT INTO t2(a) VALUES('string');
node2=> SELECT last_value FROM test_seq; -- 3
node2=> -- now it's updated!

So, that looks like a possible bug where a nextval() call doesn't
trigger the replication. But it does appear to replicate after an
unrelated event triggers the replication (in this case an update to t2,
an unrelated table).

If not, what is the expected behavior of replicated sequences anyway? It
seems you couldn't call nextval() from a slave node, and because of that
you also can't make use of currval(). It looks like the slaves can
really only get the "SELECT last_value FROM test_seq". So is there a
particular use case someone had in mind when implementing the "SET ADD
SEQUENCE" for slonik?

Regards,
Jeff Davis

#10 Jan Wieck
JanWieck@Yahoo.com
In reply to: Jeff Davis (#9)
Re: [HACKERS] Slony-I goes BETA (possible bug)

On 6/6/2004 5:21 AM, Jeff Davis wrote:

I have two nodes, node 1 and node 2.

Both are working with node 1 as the master, and data from subscribed
tables is being properly replicated to node 2.

However, it looks like there's a possible bug with sequences. First let
me explain that I don't entirely understand how a replicated sequence is
expected to behave, but as far as this report is concerned, I assume
that if you do a nextval() on node 1, then "SELECT last_value FROM
test_seq" on node 2 will return the updated value.

It looks like the sequence value is not updated on node 2, until some
other event happens, like doing an UPDATE on a replicated table on node
1.

You are right. The "local" slon node checks every "-s" milliseconds
(command-line switch) whether the sequence sl_action_seq has changed and,
if so, generates a SYNC event. Bumping a sequence alone does not cause
this; only operations that invoke the log trigger on replicated tables do.
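
The polling behavior described above can be illustrated with a toy model. This is a minimal sketch, not Slony-I code: the class and method names (Origin, SyncThread, poll, insert_replicated_row) are invented for illustration. The only facts taken from the thread are that the log trigger bumps sl_action_seq, that a bare nextval() on a user sequence does not, and that a SYNC is generated only when sl_action_seq is seen to change.

```python
class Origin:
    """Toy origin node; all names here are hypothetical."""

    def __init__(self):
        self.sl_action_seq = 0    # bumped only by the log trigger
        self.last_value = 1       # state of the user sequence test_seq
        self.is_called = False

    def nextval(self):
        # A bare nextval() advances test_seq and touches nothing else.
        if self.is_called:
            self.last_value += 1
        self.is_called = True
        return self.last_value

    def insert_replicated_row(self):
        # The log trigger on a replicated table bumps sl_action_seq.
        self.sl_action_seq += 1


class SyncThread:
    """Toy polling loop; poll() stands for one "-s" interval."""

    def __init__(self, origin):
        self.origin = origin
        self.seen = origin.sl_action_seq
        self.syncs = []           # test_seq value carried by each SYNC

    def poll(self):
        if self.origin.sl_action_seq == self.seen:
            return False          # no change, so no SYNC event
        self.seen = self.origin.sl_action_seq
        self.syncs.append(self.origin.last_value)
        return True


origin = Origin()
slon = SyncThread(origin)
print([origin.nextval() for _ in range(3)])  # [1, 2, 3]
print(slon.poll())                           # False
origin.insert_replicated_row()
print(slon.poll(), slon.syncs)               # True [3]
```

In this model the three nextval() calls leave the poller idle, and the sequence value only travels with the SYNC caused by the later insert, which matches the session shown in the report.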

Speaking of which, this also means that there is a gap between the
last operation that bumps sl_action_seq and the commit of that
transaction. If the local slon generates the SYNC right in that gap, the
changes made in that transaction will not be replicated until the next
transaction triggers another SYNC.
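
The gap can be made concrete with a small simulation. This is a sketch under stated assumptions, not the actual slon logic: GapOrigin, GapSyncThread and their methods are hypothetical names, and the model simply assumes that a sequence bump is visible to other sessions immediately while the rows of the transaction become visible only at commit.

```python
class GapOrigin:
    """Toy origin: sequence bumps are immediate, rows appear at commit."""

    def __init__(self):
        self.sl_action_seq = 0
        self.committed = []       # rows visible to other sessions
        self.pending = []         # rows of the still-open transaction

    def insert(self, row):
        self.pending.append(row)
        self.sl_action_seq += 1   # visible to other sessions at once

    def commit(self):
        self.committed += self.pending
        self.pending = []


class GapSyncThread:
    """Toy poller: a SYNC covers whatever is committed when it fires."""

    def __init__(self, origin):
        self.origin = origin
        self.seen = origin.sl_action_seq
        self.replicated = []      # rows covered by SYNC events so far

    def poll(self):
        if self.origin.sl_action_seq == self.seen:
            return False
        self.seen = self.origin.sl_action_seq
        self.replicated = list(self.origin.committed)
        return True


origin = GapOrigin()
slon = GapSyncThread(origin)
origin.insert("a")
print(slon.poll(), slon.replicated)   # True []   (SYNC lands in the gap)
origin.commit()
print(slon.poll(), slon.replicated)   # False []  (row "a" is stranded)
origin.insert("b")
origin.commit()
print(slon.poll(), slon.replicated)   # True ['a', 'b']
```

Row "a" is only picked up once an unrelated later transaction bumps sl_action_seq again, which is the delay described above.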

I am not sure how to avoid this problem effectively without blindly
creating SYNC events at some, perhaps less frequent, interval. Suggestions?

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #

#11 Jeff Davis
jdavis-pgsql@empires.org
In reply to: Jan Wieck (#10)
Re: [HACKERS] Slony-I goes BETA (possible bug)

On Sun, 2004-06-06 at 10:32, Jan Wieck wrote:

You are right. The "local" slon node checks every "-s" milliseconds
(command-line switch) whether the sequence sl_action_seq has changed and,
if so, generates a SYNC event. Bumping a sequence alone does not cause
this; only operations that invoke the log trigger on replicated tables do.

Speaking of which, this also means that there is a gap between the
last operation that bumps sl_action_seq and the commit of that
transaction. If the local slon generates the SYNC right in that gap, the
changes made in that transaction will not be replicated until the next
transaction triggers another SYNC.

I am not sure how to avoid this problem effectively without blindly
creating SYNC events at some, perhaps less frequent, interval. Suggestions?

A couple thoughts occur to me:

Spurious SYNCs might not be the end of the world, because if someone is
using replication, they probably don't mind the unneeded costs of a SYNC
when the database is not being used heavily. If it is being used
heavily, the SYNCs will have to happen anyway.

Also, it might be possible to make use of NOTIFY somehow, because
notifications only occur after a transaction commits. Perhaps you can
issue a notify for each transaction that modifies a replicated table and
slon could listen for that notification? That way, it wouldn't SYNC
before the transaction commits and miss the uncommitted data.
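
The idea can be sketched with a toy model of commit-time notification. Everything here is hypothetical: the Database and Transaction classes, the sync_on_notify function and the channel name "slony_sync" are invented for illustration, and the sketch only assumes the property stated above, namely that a notification reaches listeners after the sending transaction commits.

```python
class Database:
    """Toy database shared by the writer and the listening daemon."""

    def __init__(self):
        self.committed = []   # rows visible to other sessions
        self.delivered = []   # notifications that reached listeners


class Transaction:
    """Toy transaction: rows and the notification are held until commit."""

    def __init__(self, db):
        self.db = db
        self.rows = []
        self.pending_notify = False

    def insert(self, row):
        self.rows.append(row)
        self.pending_notify = True   # assumed: the log trigger issues NOTIFY

    def commit(self):
        self.db.committed += self.rows
        if self.pending_notify:
            self.db.delivered.append("slony_sync")  # hypothetical channel
        self.rows, self.pending_notify = [], False

    def rollback(self):
        # A rolled-back transaction publishes neither rows nor notification.
        self.rows, self.pending_notify = [], False


def sync_on_notify(db):
    """Return the rows a SYNC would cover, or None if nothing was notified."""
    if not db.delivered:
        return None
    db.delivered.clear()
    return list(db.committed)


db = Database()
tx = Transaction(db)
tx.insert("a")
print(sync_on_notify(db))   # None
tx.commit()
print(sync_on_notify(db))   # ['a']
```

Because the listener hears nothing until the commit, the SYNC in this model can never fall between the write and the commit.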

Regards,
Jeff Davis

#12 Jan Wieck
JanWieck@Yahoo.com
In reply to: Jeff Davis (#11)
Re: [HACKERS] Slony-I goes BETA (possible bug)

I tend to agree with you that spurious SYNCs aren't the end of the
world. The idea of using notify to tell the syncThread something happened
is probably the right way to do it, but at this time a little invasive.
We need more time to investigate how to avoid notice storms during high
update activity on the master.

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #

#13 Jeff Davis
jdavis-pgsql@empires.org
In reply to: Jan Wieck (#12)
Re: [HACKERS] Slony-I goes BETA (possible bug)

On Mon, 2004-06-07 at 06:20, Jan Wieck wrote:

I tend to agree with you that spurious SYNCs aren't the end of the
world. The idea of using notify to tell the syncThread something happened
is probably the right way to do it, but at this time a little invasive.
We need more time to investigate how to avoid notice storms during high
update activity on the master.

There was discussion a while back about improving notify, and one
suggestion was to make it use shared memory so no disk writes are
involved (I believe the current implementation uses a table somehow). If
that was implemented, then we would have no problem with a notice storm,
right? It wouldn't use much shared memory since the slon daemon can
retrieve the notices just as fast as the backend can send them, right?

Backtracking a little, I'm still wondering how exactly a replicated
sequence is supposed to behave, do you have some comments about that? I
don't understand exactly why it's useful.

Regards,
Jeff

#14 Chris Browne
cbbrowne@acm.org
In reply to: Jan Wieck (#1)
Re: [HACKERS] Slony-I goes BETA (possible bug)

jdavis-pgsql@empires.org (Jeff Davis) writes:

Backtracking a little, I'm still wondering how exactly a replicated
sequence is supposed to behave, do you have some comments about
that? I don't understand exactly why it's useful.

The reason why it is necessary to have some kind of handling of
sequences is that tables commonly self-populate ID columns
using sequences.

create table important_table (
    id serial unique not null,
    descr text,
    created_on timestamptz default now()
);

That "id" field is populated via a sequence.

If Slony-I is told to shift control of the system to a new master, it
won't be very nice if "important_table_id_seq" (the sequence that was
generated) is still at its starting value on the slaves, when the master
has populated hundreds or thousands of rows and has used thousands of
sequence values.

If "important_table_id_seq" _doesn't_ get set to a nice high value, then
if a slave gets promoted to master, new inserts into important_table will
use low ID values, and, more than likely, conflict sporadically. Bad
Thing.
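
The conflict can be shown with a toy model. This is a minimal sketch under stated assumptions: the Node class and the copy_to function are hypothetical, a dict keyed by id stands in for the unique "id" column, and replication is reduced to copying the rows and, optionally, the sequence value.

```python
class Node:
    """Toy database node holding one table and its ID sequence."""

    def __init__(self):
        self.rows = {}            # id -> descr, ids must be unique
        self.seq_last_value = 0

    def insert(self, descr):
        self.seq_last_value += 1
        new_id = self.seq_last_value
        if new_id in self.rows:
            raise ValueError(f"duplicate key: id={new_id}")
        self.rows[new_id] = descr
        return new_id


def copy_to(master, slave, with_sequence):
    """Copy the table, and the sequence value only if asked to."""
    slave.rows = dict(master.rows)
    if with_sequence:
        slave.seq_last_value = master.seq_last_value


master = Node()
for descr in ("a", "b", "c"):
    master.insert(descr)

stale = Node()
copy_to(master, stale, with_sequence=False)
try:
    stale.insert("d")
except ValueError as err:
    print(err)                 # duplicate key: id=1

synced = Node()
copy_to(master, synced, with_sequence=True)
print(synced.insert("d"))      # 4
```

The promoted node without the sequence value collides on its first insert, while the one that received the sequence value continues where the master left off.
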
--
select 'cbbrowne' || '@' || 'acm.org';
http://www3.sympatico.ca/cbbrowne/wp.html
Signs of a Klingon Programmer - 19. "My program has just dumped Stova
Core!"

#15 Jan Wieck
JanWieck@Yahoo.com
In reply to: Jeff Davis (#13)
Re: Slony-I goes BETA (possible bug)

On 6/7/2004 2:33 PM, Jeff Davis wrote:

On Mon, 2004-06-07 at 06:20, Jan Wieck wrote:

I tend to agree with you that spurious SYNCs aren't the end of the
world. The idea of using notify to tell the syncThread something happened
is probably the right way to do it, but at this time a little invasive.
We need more time to investigate how to avoid notice storms during high
update activity on the master.

There was discussion a while back about improving notify, and one
suggestion was to make it use shared memory so no disk writes are
involved (I believe the current implementation uses a table somehow). If
that was implemented, then we would have no problem with a notice storm,
right? It wouldn't use much shared memory since the slon daemon can
retrieve the notices just as fast as the backend can send them, right?

Keep in mind that for the time being, one of the important features of
Slony-I is the ability to replicate from a 7.3.x to anything >7.3.x. You
sure don't want to cripple that functionality by heavily depending on
features fixed or significantly improved for 7.5.

Backtracking a little, I'm still wondering how exactly a replicated
sequence is supposed to behave, do you have some comments about that? I
don't understand exactly why it's useful.

At the moment the origin of a set discovers that there has been update
activity, it generates the SYNC event and records the last_value of
every sequence. At that time, a sequence may already have been
incremented again by a not-yet-committed transaction, so it might be
recorded with a higher number than a max() query over the tables would
show. When a subscriber applies the SYNC event, it also calls setval()
with those recorded values. So on the replica the sequence number is
adjusted up just before the SYNC event occupying those numbers commits.

This means that in the case of a failover, the sequences might show a
gap. This is entirely in accordance with PostgreSQL's sequence handling,
which cannot guarantee gap-free sequences in the case of server crashes
or transaction rollbacks.
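
A toy model of this sequence handling, as a sketch only: SeqOrigin, Subscriber, make_sync and apply_sync are hypothetical names, and the model keeps just the points made above (sequence increments are not rolled back, a SYNC records the current last_value, and the subscriber applies it with the equivalent of setval()).

```python
class SeqOrigin:
    """Toy origin: the sequence advances even for uncommitted work."""

    def __init__(self):
        self.last_value = 0
        self.committed_ids = []

    def nextval(self):
        self.last_value += 1      # never undone by a rollback
        return self.last_value

    def insert_and_commit(self):
        self.committed_ids.append(self.nextval())

    def make_sync(self):
        # A SYNC carries the committed rows plus the current last_value.
        return {"ids": list(self.committed_ids), "seq": self.last_value}


class Subscriber:
    """Toy subscriber that can later take over as origin."""

    def __init__(self):
        self.ids = []
        self.last_value = 0

    def apply_sync(self, sync):
        self.ids = list(sync["ids"])
        self.last_value = sync["seq"]   # stands in for setval()

    def insert_and_commit(self):
        self.last_value += 1
        self.ids.append(self.last_value)
        return self.last_value


origin = SeqOrigin()
origin.insert_and_commit()
origin.insert_and_commit()
origin.nextval()                  # open transaction, not yet committed

replica = Subscriber()
replica.apply_sync(origin.make_sync())
print(max(replica.ids), replica.last_value)   # 2 3

# Failover: the replica takes over and keeps inserting.
print(replica.insert_and_commit())            # 4
print(replica.ids)                            # [1, 2, 4]
```

The value 3 is never used on the new origin, which is the kind of gap described above and is harmless for uniqueness.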

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #