Using postgres.log file for replication

Started by Ioana Danesover 17 years ago9 messagesgeneral
Jump to latest
#1Ioana Danes
ioanasoftware@yahoo.ca

Hi Everyone,

I've been wondering if anybody tried to use the postgresql csv log file to replicate sql statements.
I've been looking into it in the past days and after a brief testing it doesn't look bad at all...

Thanks,
Ioana

__________________________________________________________________
Instant Messaging, free SMS, sharing photos and more... Try the new Yahoo! Canada Messenger at http://ca.beta.messenger.yahoo.com/

#2Csaba Nagy
nagy@ecircle-ag.com
In reply to: Ioana Danes (#1)
Re: Using postgres.log file for replication

On Thu, 2008-11-27 at 09:20 -0800, Ioana Danes wrote:

I've been wondering if anybody tried to use the postgresql csv log file to replicate sql statements.
I've been looking into it in the past days and after a brief testing it doesn't look bad at all...

Try to execute something like:

UPDATE some_table SET some_timestamp_field = now();

The replica is now different than the master :-)

Cheers,
Csaba.

#3Ioana Danes
ioanasoftware@yahoo.ca
In reply to: Csaba Nagy (#2)
Re: Using postgres.log file for replication

I know there are some limitations abut it:
- copy statements cannot be executed,
- the use of now() function
- even database restore scripts I don't want to be replicated
but these are not a problem for me because I don't use them in the application...

--- On Thu, 11/27/08, Csaba Nagy <nagy@ecircle-ag.com> wrote:

From: Csaba Nagy <nagy@ecircle-ag.com>
Subject: Re: [GENERAL] Using postgres.log file for replication
To: ioanasoftware@yahoo.ca
Cc: "PostgreSQL General" <pgsql-general@postgresql.org>
Received: Thursday, November 27, 2008, 12:24 PM
On Thu, 2008-11-27 at 09:20 -0800, Ioana Danes wrote:

I've been wondering if anybody tried to use the

postgresql csv log file to replicate sql statements.

I've been looking into it in the past days and

after a brief testing it doesn't look bad at all...

Try to execute something like:

UPDATE some_table SET some_timestamp_field = now();

The replica is now different than the master :-)

Cheers,
Csaba.

--
Sent via pgsql-general mailing list
(pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

__________________________________________________________________
Yahoo! Canada Toolbar: Search from anywhere on the web, and bookmark your favourite sites. Download it now at
http://ca.toolbar.yahoo.com.

#4Scott Marlowe
scott.marlowe@gmail.com
In reply to: Ioana Danes (#3)
Re: Using postgres.log file for replication

If you want the same thing in real time look into pgpool II

On Thu, Nov 27, 2008 at 10:28 AM, Ioana Danes <ioanasoftware@yahoo.ca> wrote:

I know there are some limitations abut it:
- copy statements cannot be executed,
- the use of now() function
- even database restore scripts I don't want to be replicated
but these are not a problem for me because I don't use them in the application...

--- On Thu, 11/27/08, Csaba Nagy <nagy@ecircle-ag.com> wrote:

From: Csaba Nagy <nagy@ecircle-ag.com>
Subject: Re: [GENERAL] Using postgres.log file for replication
To: ioanasoftware@yahoo.ca
Cc: "PostgreSQL General" <pgsql-general@postgresql.org>
Received: Thursday, November 27, 2008, 12:24 PM
On Thu, 2008-11-27 at 09:20 -0800, Ioana Danes wrote:

I've been wondering if anybody tried to use the

postgresql csv log file to replicate sql statements.

I've been looking into it in the past days and

after a brief testing it doesn't look bad at all...

Try to execute something like:

UPDATE some_table SET some_timestamp_field = now();

The replica is now different than the master :-)

Cheers,
Csaba.

--
Sent via pgsql-general mailing list
(pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

__________________________________________________________________
Yahoo! Canada Toolbar: Search from anywhere on the web, and bookmark your favourite sites. Download it now at
http://ca.toolbar.yahoo.com.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

--
When fascism comes to America, it will be draped in a flag and
carrying a cross - Sinclair Lewis

#5Ioana Danes
ioanasoftware@yahoo.ca
In reply to: Scott Marlowe (#4)
Re: Using postgres.log file for replication

Thanks for the tip Scott but am looking for an asynchronous replication that does not interfere with the performance of the application. Also I don't necessary need the latest changes to be applied right away. Even a day difference is enough...

Thanks a lot,
Ioana
--- On Thu, 11/27/08, Scott Marlowe <scott.marlowe@gmail.com> wrote:

From: Scott Marlowe <scott.marlowe@gmail.com>
Subject: Re: [GENERAL] Using postgres.log file for replication
To: ioanasoftware@yahoo.ca
Cc: "Csaba Nagy" <nagy@ecircle-ag.com>, "PostgreSQL General" <pgsql-general@postgresql.org>
Received: Thursday, November 27, 2008, 12:34 PM
If you want the same thing in real time look into pgpool II

On Thu, Nov 27, 2008 at 10:28 AM, Ioana Danes
<ioanasoftware@yahoo.ca> wrote:

I know there are some limitations abut it:
- copy statements cannot be executed,
- the use of now() function
- even database restore scripts I don't want to be

replicated

but these are not a problem for me because I don't

use them in the application...

--- On Thu, 11/27/08, Csaba Nagy

<nagy@ecircle-ag.com> wrote:

From: Csaba Nagy <nagy@ecircle-ag.com>
Subject: Re: [GENERAL] Using postgres.log file for

replication

To: ioanasoftware@yahoo.ca
Cc: "PostgreSQL General"

<pgsql-general@postgresql.org>

Received: Thursday, November 27, 2008, 12:24 PM
On Thu, 2008-11-27 at 09:20 -0800, Ioana Danes

wrote:

I've been wondering if anybody tried to

use the

postgresql csv log file to replicate sql

statements.

I've been looking into it in the past

days and

after a brief testing it doesn't look bad at

all...

Try to execute something like:

UPDATE some_table SET some_timestamp_field =

now();

The replica is now different than the master :-)

Cheers,
Csaba.

--
Sent via pgsql-general mailing list
(pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

__________________________________________________________________

Yahoo! Canada Toolbar: Search from anywhere on the

web, and bookmark your favourite sites. Download it now at

http://ca.toolbar.yahoo.com.

--
Sent via pgsql-general mailing list

(pgsql-general@postgresql.org)

To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

--
When fascism comes to America, it will be draped in a flag
and
carrying a cross - Sinclair Lewis

--
Sent via pgsql-general mailing list
(pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

__________________________________________________________________
Instant Messaging, free SMS, sharing photos and more... Try the new Yahoo! Canada Messenger at http://ca.beta.messenger.yahoo.com/

#6Scott Marlowe
scott.marlowe@gmail.com
In reply to: Ioana Danes (#5)
Re: Using postgres.log file for replication

Then you might wanna look at slony. We use it and it's quite stable
and reliable, and if the slave can't keep up it's no big deal to the
master.

On Thu, Nov 27, 2008 at 11:33 AM, Ioana Danes <ioanasoftware@yahoo.ca> wrote:

Thanks for the tip Scott but am looking for an asynchronous replication that does not interfere with the performance of the application. Also I don't necessary need the latest changes to be applied right away. Even a day difference is enough...

Thanks a lot,
Ioana
--- On Thu, 11/27/08, Scott Marlowe <scott.marlowe@gmail.com> wrote:

From: Scott Marlowe <scott.marlowe@gmail.com>
Subject: Re: [GENERAL] Using postgres.log file for replication
To: ioanasoftware@yahoo.ca
Cc: "Csaba Nagy" <nagy@ecircle-ag.com>, "PostgreSQL General" <pgsql-general@postgresql.org>
Received: Thursday, November 27, 2008, 12:34 PM
If you want the same thing in real time look into pgpool II

On Thu, Nov 27, 2008 at 10:28 AM, Ioana Danes
<ioanasoftware@yahoo.ca> wrote:

I know there are some limitations abut it:
- copy statements cannot be executed,
- the use of now() function
- even database restore scripts I don't want to be

replicated

but these are not a problem for me because I don't

use them in the application...

--- On Thu, 11/27/08, Csaba Nagy

<nagy@ecircle-ag.com> wrote:

From: Csaba Nagy <nagy@ecircle-ag.com>
Subject: Re: [GENERAL] Using postgres.log file for

replication

To: ioanasoftware@yahoo.ca
Cc: "PostgreSQL General"

<pgsql-general@postgresql.org>

Received: Thursday, November 27, 2008, 12:24 PM
On Thu, 2008-11-27 at 09:20 -0800, Ioana Danes

wrote:

I've been wondering if anybody tried to

use the

postgresql csv log file to replicate sql

statements.

I've been looking into it in the past

days and

after a brief testing it doesn't look bad at

all...

Try to execute something like:

UPDATE some_table SET some_timestamp_field =

now();

The replica is now different than the master :-)

Cheers,
Csaba.

--
Sent via pgsql-general mailing list
(pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

__________________________________________________________________

Yahoo! Canada Toolbar: Search from anywhere on the

web, and bookmark your favourite sites. Download it now at

http://ca.toolbar.yahoo.com.

--
Sent via pgsql-general mailing list

(pgsql-general@postgresql.org)

To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

--
When fascism comes to America, it will be draped in a flag
and
carrying a cross - Sinclair Lewis

--
Sent via pgsql-general mailing list
(pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

__________________________________________________________________
Instant Messaging, free SMS, sharing photos and more... Try the new Yahoo! Canada Messenger at http://ca.beta.messenger.yahoo.com/

--
When fascism comes to America, it will be draped in a flag and
carrying a cross - Sinclair Lewis

#7Chris Browne
cbbrowne@acm.org
In reply to: Ioana Danes (#1)
Re: Using postgres.log file for replication

ioanasoftware@yahoo.ca (Ioana Danes) writes:

I've been wondering if anybody tried to use the postgresql csv log
file to replicate sql statements. I've been looking into it in the
past days and after a brief testing it doesn't look bad at all...

It's *plausible*, but you have to ensure that you *never* use any
operations that could break the assumption that it is OK to apply the
queries in "logged order."

1. As has been mentioned separately, references to NOW() will break;
analagous problems will occur for any function that doesn't provide
immutable results, including:

- NOW(), obviously, and any time-based function
- currval('any_sequence')
- random() :-)

Thus, any reasonable usage of DEFAULT values on tables will cause
discrepancies.

2. Further, if there can be multiple requests acting on the database
concurrently, this can make it nigh unto impossible to ensure that
they are applied in a compatible order.

It makes my head hurt a bit to think about the kinds of cases where
this breaks down, but I know it's not difficult for concurrency to
make this break badly.

3. If you ever have queries that create data in nondeterministic
ways, that will cause a discrepancy.

For instance:
insert into table_2 (a,b,c)
select a,b,c from table_1 limit 100;

will NOT be able to be replicated consistently unless there were only
100 tuples to be found.

That being said, the following alteration to that query *could* work
out:
insert into table_2 (a,b,c)
select a,b,c from table_1 order by a,b,c limit 100;

(assuming that the result of the select is, itself, a relation, which
implies that there are no repeated values...)

I'm afraid I don't have sufficiently draconian powers over *our*
developers to ensure that they NEVER do anything that would violate
the above set of requirements.

Nor, frankly, would I want to. I strongly *APPROVE* of them using
NOW() and currval(), and we build applications to be able to support
multiple concurrent users.

The only piece where I might *imagine* I'd want to be "Lord of
Draconia" would be on #3, and, if offered "Draconian Powers," I'd
rather apply those powers to more vital matters :-).
--
output = ("cbbrowne" "@" "cbbrowne.com")
http://www3.sympatico.ca/cbbrowne/slony.html
Signs of a Klingon Programmer #6: "Debugging? Klingons do not
debug. Our software does not coddle the weak."

#8Ioana Danes
ioanasoftware@yahoo.ca
In reply to: Scott Marlowe (#6)
Re: Using postgres.log file for replication

Hi Scott,

Thanks for the replay,

It is almost impossible to use any of the replication tools mostly because the 2 dbs are not on the same network (live and testing environments)... I don't intend to use this for high availability. I am using Sequoia and PITR for that...

I only intend to use it to replicate a live database for testing purposes on another network...

Thanks again,
Ioana

--- On Thu, 11/27/08, Scott Marlowe <scott.marlowe@gmail.com> wrote:

From: Scott Marlowe <scott.marlowe@gmail.com>
Subject: Re: [GENERAL] Using postgres.log file for replication
To: ioanasoftware@yahoo.ca
Cc: "Csaba Nagy" <nagy@ecircle-ag.com>, "PostgreSQL General" <pgsql-general@postgresql.org>
Received: Thursday, November 27, 2008, 1:53 PM
Then you might wanna look at slony. We use it and it's
quite stable
and reliable, and if the slave can't keep up it's
no big deal to the
master.

On Thu, Nov 27, 2008 at 11:33 AM, Ioana Danes
<ioanasoftware@yahoo.ca> wrote:

Thanks for the tip Scott but am looking for an

asynchronous replication that does not interfere with the
performance of the application. Also I don't necessary
need the latest changes to be applied right away. Even a day
difference is enough...

Thanks a lot,
Ioana
--- On Thu, 11/27/08, Scott Marlowe

<scott.marlowe@gmail.com> wrote:

From: Scott Marlowe

<scott.marlowe@gmail.com>

Subject: Re: [GENERAL] Using postgres.log file for

replication

To: ioanasoftware@yahoo.ca
Cc: "Csaba Nagy"

<nagy@ecircle-ag.com>, "PostgreSQL General"
<pgsql-general@postgresql.org>

Received: Thursday, November 27, 2008, 12:34 PM
If you want the same thing in real time look into

pgpool II

On Thu, Nov 27, 2008 at 10:28 AM, Ioana Danes
<ioanasoftware@yahoo.ca> wrote:

I know there are some limitations abut it:
- copy statements cannot be executed,
- the use of now() function
- even database restore scripts I don't

want to be

replicated

but these are not a problem for me because I

don't

use them in the application...

--- On Thu, 11/27/08, Csaba Nagy

<nagy@ecircle-ag.com> wrote:

From: Csaba Nagy

<nagy@ecircle-ag.com>

Subject: Re: [GENERAL] Using postgres.log

file for

replication

To: ioanasoftware@yahoo.ca
Cc: "PostgreSQL General"

<pgsql-general@postgresql.org>

Received: Thursday, November 27, 2008,

12:24 PM

On Thu, 2008-11-27 at 09:20 -0800, Ioana

Danes

wrote:

I've been wondering if anybody

tried to

use the

postgresql csv log file to replicate sql

statements.

I've been looking into it in the

past

days and

after a brief testing it doesn't look

bad at

all...

Try to execute something like:

UPDATE some_table SET

some_timestamp_field =

now();

The replica is now different than the

master :-)

Cheers,
Csaba.

--
Sent via pgsql-general mailing list
(pgsql-general@postgresql.org)
To make changes to your subscription:

http://www.postgresql.org/mailpref/pgsql-general

__________________________________________________________________

Yahoo! Canada Toolbar: Search from anywhere

on the

web, and bookmark your favourite sites. Download

it now at

http://ca.toolbar.yahoo.com.

--
Sent via pgsql-general mailing list

(pgsql-general@postgresql.org)

To make changes to your subscription:

http://www.postgresql.org/mailpref/pgsql-general

--
When fascism comes to America, it will be draped

in a flag

and
carrying a cross - Sinclair Lewis

--
Sent via pgsql-general mailing list
(pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

__________________________________________________________________

Instant Messaging, free SMS, sharing photos and

more... Try the new Yahoo! Canada Messenger at
http://ca.beta.messenger.yahoo.com/

--
When fascism comes to America, it will be draped in a flag
and
carrying a cross - Sinclair Lewis

--
Sent via pgsql-general mailing list
(pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

__________________________________________________________________
Connect with friends from any web browser - no download required. Try the new Yahoo! Canada Messenger for the Web BETA at http://ca.messenger.yahoo.com/webmessengerpromo.php

#9Ioana Danes
ioanasoftware@yahoo.ca
In reply to: Chris Browne (#7)
Re: Using postgres.log file for replication

Thanks for the reply Chris,

None of these are a problem for me because we are not using them at all...

Thanks again,
Ioana

--- On Thu, 11/27/08, Chris Browne <cbbrowne@acm.org> wrote:

From: Chris Browne <cbbrowne@acm.org>
Subject: Re: [GENERAL] Using postgres.log file for replication
To: pgsql-general@postgresql.org
Received: Thursday, November 27, 2008, 6:10 PM
ioanasoftware@yahoo.ca (Ioana Danes) writes:

I've been wondering if anybody tried to use the

postgresql csv log

file to replicate sql statements. I've been

looking into it in the

past days and after a brief testing it doesn't

look bad at all...

It's *plausible*, but you have to ensure that you
*never* use any
operations that could break the assumption that it is OK to
apply the
queries in "logged order."

1. As has been mentioned separately, references to NOW()
will break;
analagous problems will occur for any function that
doesn't provide
immutable results, including:

- NOW(), obviously, and any time-based function
- currval('any_sequence')
- random() :-)

Thus, any reasonable usage of DEFAULT values on tables will
cause
discrepancies.

2. Further, if there can be multiple requests acting on
the database
concurrently, this can make it nigh unto impossible to
ensure that
they are applied in a compatible order.

It makes my head hurt a bit to think about the kinds of
cases where
this breaks down, but I know it's not difficult for
concurrency to
make this break badly.

3. If you ever have queries that create data in
nondeterministic
ways, that will cause a discrepancy.

For instance:
insert into table_2 (a,b,c)
select a,b,c from table_1 limit 100;

will NOT be able to be replicated consistently unless there
were only
100 tuples to be found.

That being said, the following alteration to that query
*could* work
out:
insert into table_2 (a,b,c)
select a,b,c from table_1 order by a,b,c limit 100;

(assuming that the result of the select is, itself, a
relation, which
implies that there are no repeated values...)

I'm afraid I don't have sufficiently draconian
powers over *our*
developers to ensure that they NEVER do anything that would
violate
the above set of requirements.

Nor, frankly, would I want to. I strongly *APPROVE* of
them using
NOW() and currval(), and we build applications to be able
to support
multiple concurrent users.

The only piece where I might *imagine* I'd want to be
"Lord of
Draconia" would be on #3, and, if offered
"Draconian Powers," I'd
rather apply those powers to more vital matters :-).
--
output = ("cbbrowne" "@"
"cbbrowne.com")
http://www3.sympatico.ca/cbbrowne/slony.html
Signs of a Klingon Programmer #6: "Debugging?
Klingons do not
debug. Our software does not coddle the weak."

--
Sent via pgsql-general mailing list
(pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

__________________________________________________________________
Instant Messaging, free SMS, sharing photos and more... Try the new Yahoo! Canada Messenger at http://ca.beta.messenger.yahoo.com/