Reporting the commit LSN at commit time

Started by Craig Ringerover 11 years ago36 messageshackers
Jump to latest
#1Craig Ringer
craig@2ndquadrant.com

Hi all

To support transparent client-side failover in BDR, it's necessary to
know what the LSN of a node was at the time a transaction committed and
keep track of that in the client/proxy.

I'm thinking about adding a new message type in the protocol that gets
sent immediately before CommandComplete, containing the LSN of the
commit. Clients would need to enable the sending of this message with a
GUC that they set when they connect, so it doesn't confuse clients that
aren't expecting it or aware of it.

Is this something you can see being useful for other non-BDR purposes?
Are there any obvious issues with this?

Clients can always follow up with a second query to get the xlog
position, after commit, but that's potentially slow and has a race that
might cause a client to wait longer than it has to after fail-over to a
different node. That's why having the server report it automatically
seems useful.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2Ants Aasma
ants.aasma@cybertec.at
In reply to: Craig Ringer (#1)
Re: Reporting the commit LSN at commit time

On Thu, Aug 7, 2014 at 4:15 AM, Craig Ringer <craig@2ndquadrant.com> wrote:

I'm thinking about adding a new message type in the protocol that gets
sent immediately before CommandComplete, containing the LSN of the
commit. Clients would need to enable the sending of this message with a
GUC that they set when they connect, so it doesn't confuse clients that
aren't expecting it or aware of it.

Is this something you can see being useful for other non-BDR purposes?

I have been thinking about something similar.

For regular streaming replication you could keep track of the commit
LSN on a per client basis and automatically redirect read queries to a
standby if standby apply location is larger than the commit LSN of
this particular client. This can be done mostly transparently for the
application while not running into the issue that recent modifications
disappear for a while after commit.

Regards,
Ants Aasma
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Robert Haas
robertmhaas@gmail.com
In reply to: Craig Ringer (#1)
Re: Reporting the commit LSN at commit time

On Wed, Aug 6, 2014 at 9:15 PM, Craig Ringer <craig@2ndquadrant.com> wrote:

To support transparent client-side failover in BDR, it's necessary to
know what the LSN of a node was at the time a transaction committed and
keep track of that in the client/proxy.

I doubt whether it makes sense to do this without a broader
understanding of how the client-side failover mechanism would work.
If we're going to add something like this, it should include libpq
support for actually doing something useful with it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Craig Ringer (#1)
Re: Reporting the commit LSN at commit time

Craig Ringer <craig@2ndquadrant.com> writes:

Hi all
To support transparent client-side failover in BDR, it's necessary to
know what the LSN of a node was at the time a transaction committed and
keep track of that in the client/proxy.

I'm thinking about adding a new message type in the protocol that gets
sent immediately before CommandComplete, containing the LSN of the
commit. Clients would need to enable the sending of this message with a
GUC that they set when they connect, so it doesn't confuse clients that
aren't expecting it or aware of it.

FWIW, I think it's a seriously bad idea to expose LSNs in the protocol
at all. What happens five years from now when we switch to some other
implementation that doesn't have LSNs?

I don't mind if you expose some other way to inquire about LSNs, but
let's *not* embed it into the wire protocol. Not even as an option.

This position also obviates the need to complain about having a GUC
that changes the protocol-level behavior, which is also a seriously
bad idea.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Craig Ringer
craig@2ndquadrant.com
In reply to: Tom Lane (#4)
Re: Reporting the commit LSN at commit time

On 08/08/2014 03:54 AM, Tom Lane wrote:

Craig Ringer <craig@2ndquadrant.com> writes:

Hi all
To support transparent client-side failover in BDR, it's necessary to
know what the LSN of a node was at the time a transaction committed and
keep track of that in the client/proxy.

I'm thinking about adding a new message type in the protocol that gets
sent immediately before CommandComplete, containing the LSN of the
commit. Clients would need to enable the sending of this message with a
GUC that they set when they connect, so it doesn't confuse clients that
aren't expecting it or aware of it.

FWIW, I think it's a seriously bad idea to expose LSNs in the protocol
at all. What happens five years from now when we switch to some other
implementation that doesn't have LSNs?

Everyone who's relying on them already via pg_stat_replication, etc, breaks.

They're _already_ exposed to users. That ship has sailed.

That's not to say I'm stuck to LSNs as the way to do this, just that I
don't think that particular argument is relevant.

I don't mind if you expose some other way to inquire about LSNs, but
let's *not* embed it into the wire protocol. Not even as an option.

Well, the underlying need here is to have the client able to keep track
of what point in the server's time-line it must see on a replica before
it proceeds to use that replica.

So if the client does some work on server A, then for some reason needs
to / must use server B, it can ask server B "are you replayed up to the
last transaction I performed on server A yet?" and wait until it is.

That's useful for streaming replication (to permit consistent queries
against read replicas) but it's much more so for BDR, where it's
necessary to avoid a variety of multi-master replication anomalies and
conflicts.

I considered LSNs to be the logical mechanism for this as they're
already user-visible, exposed in pg_stat_replication, they can already
be used for just this purpose by hand (just with an extra round-trip), etc.

An obvious alternative is to merge the commit timestamp work, then
expose the timestamp of the last commit replayed in pg_stat_replication.
Then all the client needs to keep track of is the server time of the
last commit.

This position also obviates the need to complain about having a GUC
that changes the protocol-level behavior, which is also a seriously
bad idea.

Well, I'd prefer to be able to negotiate with the server and establish
requirements during the protocol handshake.

As far as I know there isn't an explicit protocol negotiation with
capabilities fields (just a plain version field), but we do have the
startup packet's 'options' field. So I was thinking that requesting the
setting of a PGC_BACKEND GUC in the startup packet would be a logical
way for the client to request use of a protocol extension.

Looking at ProcessStartupPacket(...) in postmaster.c I see that there's
room for special-casing options. Do you think it would be more
appropriate to add a new connection option that's sent by a client to
request reporting of commit timestamps / LSNs / whatever by the server
at commit time?

If not, do you have an alternative suggestion? I can't imagine that
extending the CommandComplete message is a desirable option.

It seems like it'd be useful to expose this as a read-only GUC anyway,
so I don't really see why a PGC_BACKEND GUC isn't exactly the right
thing to use for this, but I'm happy to listen to suggestions.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Craig Ringer (#5)
Re: Reporting the commit LSN at commit time

Craig Ringer <craig@2ndquadrant.com> writes:

On 08/08/2014 03:54 AM, Tom Lane wrote:

FWIW, I think it's a seriously bad idea to expose LSNs in the protocol
at all. What happens five years from now when we switch to some other
implementation that doesn't have LSNs?

Everyone who's relying on them already via pg_stat_replication, etc, breaks.
They're _already_ exposed to users. That ship has sailed.

They're exposed to replication tools, yeah, but embedding them in the
wire protocol would be moving the goalposts a long way past that. As an
example of something that doubtless seemed like a good idea at the time,
consider the business about how an INSERT command completion tag includes
the OID of the inserted row. We're stuck with that obsolete idea
*forever* because it's embedded in the protocol for all clients.

This position also obviates the need to complain about having a GUC
that changes the protocol-level behavior, which is also a seriously
bad idea.

Well, I'd prefer to be able to negotiate with the server and establish
requirements during the protocol handshake.

Sure, but a GUC is not that. The problem with a GUC for changing
wire-level behavior is that it might be set by code far removed from
the wire, possibly breaking lower code levels that expected different
behavior. The multitude of ways that we offer for setting GUCs is
an active evil in this particular context.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Craig Ringer
craig@2ndquadrant.com
In reply to: Robert Haas (#3)
Re: Reporting the commit LSN at commit time

On 08/07/2014 11:42 PM, Robert Haas wrote:

I doubt whether it makes sense to do this without a broader
understanding of how the client-side failover mechanism would work.
If we're going to add something like this, it should include libpq
support for actually doing something useful with it.

I'm currently interested in targeting PgBouncer and PgJDBC, not libpq,
though I can see that exposing helpers for it in libpq could be useful.

The goal here is to permit a client to safely switch from one server to
another - either in a multimaster async replication system like BDR, or
routing read-only queries to hot standbys with streaming replication -
and know for sure that its last commit is visible on the server it is
now connected to.

For hot standby that means it can avoid running queries that won't see
the latest work it did if the standby is lagging - deciding to run them
on the upstream instead, or wait, as appropriate.

For BDR it'll permit the client to safely perform transparent failover
to another node and resume write operations without risking conflicts
with its own prior transactions . (I wrote some explanations about this
on -general in the thread here:
/messages/by-id/84184AEF-887D-49DF-8F47-6377B1D6EE9F@gmail.com
).

Broadly, what I'm thinking of is:

* Whenever a client issues a transaction that gets a txid assigned, and
that tx commits, the server reports the LSN that includes the commit.

* The client keeps track of which server it is connected to using the
server's (sysid, timelineid, databaseoid) or a similar identifier -
probably specific to the replication protocol in use, unless something
generic proves practical.

* When the client has to switch to a new server or chooses to do so, it
checks pg_stat_replication or pg_replication_slots, finds the server it
was previously connected to, and checks to see if the new server has
replayed up to the last write transaction this client performed on the
previous server. If not, it can make a policy-driven decision: wait
until replay catchup, wait for a while then bail out, etc.

This is admittedly all a bit hand-wavey. I'm looking at ways to do it,
not a firm implementation plan.

Notably, the LSN (and timelineID) aren't the only way to keep track of
the replay progress of a server and check it from another server. If the
commit timestamps work is merged and the timestamp of the last replayed
commit record is exposed in pg_replication_slots, the client could use
the server-reported commit timestamp to the same effect.

In the above you'll note that the client has to make some choices. The
client might be picking different servers for failover, read load
spreading, or other things I haven't thought of. It might be retaining
the old connection and making new ones it wants to be consistent up to a
certain point on the old connection (read spreading), or it might be
dropping the old connection and making a new one (failover). If the new
server to connect to isn't caught up yet it might want to wait
indefinitely, wait a short while, or bail out immediately and try a
different server. There's a lot of client/application specific policy
going to be involved here, so I'm not sure it makes sense to try to make
it transparent in libpq. I can see it being useful to expose some tools
in libpq for it, without a doubt, so clients can do these sorts of
things usefully.

(There's also another whole new question: how do you pick which
alternative server to connect to? But that's not really within the scope
of this.)

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Craig Ringer
craig@2ndquadrant.com
In reply to: Tom Lane (#6)
Re: Reporting the commit LSN at commit time

On 08/08/2014 09:02 AM, Tom Lane wrote:

Craig Ringer <craig@2ndquadrant.com> writes:

On 08/08/2014 03:54 AM, Tom Lane wrote:

FWIW, I think it's a seriously bad idea to expose LSNs in the protocol
at all. What happens five years from now when we switch to some other
implementation that doesn't have LSNs?

Everyone who's relying on them already via pg_stat_replication, etc, breaks.
They're _already_ exposed to users. That ship has sailed.

They're exposed to replication tools, yeah, but embedding them in the
wire protocol would be moving the goalposts a long way past that. As an
example of something that doubtless seemed like a good idea at the time,
consider the business about how an INSERT command completion tag includes
the OID of the inserted row. We're stuck with that obsolete idea
*forever* because it's embedded in the protocol for all clients.

That makes sense.

Well, I'd prefer to be able to negotiate with the server and establish
requirements during the protocol handshake.

Sure, but a GUC is not that. The problem with a GUC for changing
wire-level behavior is that it might be set by code far removed from
the wire, possibly breaking lower code levels that expected different
behavior. The multitude of ways that we offer for setting GUCs is
an active evil in this particular context.

I'd value your input and suggestions then.

I thought this is what PGC_BACKEND GUCs were for - set them in the
startup packet and you can't mess with them afterwards. I can see that
the ability of someone to cause that to be set in (e.g.) PGOPTIONS could
be a problem though.

AFAIK we don't _have_ a fancy negotiation system in the protocol, with
back-and-forth exchanges of capabilities information.

So is the appropriate thing to do here to set a non-GUC 'options' field
in the startup packet?

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Craig Ringer (#8)
Re: Reporting the commit LSN at commit time

Craig Ringer <craig@2ndquadrant.com> writes:

On 08/08/2014 09:02 AM, Tom Lane wrote:

Craig Ringer <craig@2ndquadrant.com> writes:

Well, I'd prefer to be able to negotiate with the server and establish
requirements during the protocol handshake.

Sure, but a GUC is not that. The problem with a GUC for changing
wire-level behavior is that it might be set by code far removed from
the wire, possibly breaking lower code levels that expected different
behavior. The multitude of ways that we offer for setting GUCs is
an active evil in this particular context.

I thought this is what PGC_BACKEND GUCs were for - set them in the
startup packet and you can't mess with them afterwards. I can see that
the ability of someone to cause that to be set in (e.g.) PGOPTIONS could
be a problem though.

Right. AFAICS it's mostly irrelevant whether or not the setting can be
changed intrasession; what *is* important is that it not be turned on
if the lowest level(s) of the client software stack can't handle it.
It's not beyond the realm of credibility that there could be
security-grade bugs arising from client software receiving messages it
didn't understand. And what that means is that anything that is
accessible to higher levels of the stack is suspect. We've provided
enough pass-through behaviors that I'm not sure anything in the current
connection options API would be safe.

AFAIK we don't _have_ a fancy negotiation system in the protocol, with
back-and-forth exchanges of capabilities information.

Maybe it's time to invent that. It would be positively foolish to
create any such behavior without a protocol version bump anyway.

Although as I said, I don't think embedding knowledge of LSNs at the
protocol level is a good thing to begin with. Is it really necessary that
this information be pushed to the client on every commit, as opposed to
the client asking for it occasionally?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Fujii Masao
masao.fujii@gmail.com
In reply to: Craig Ringer (#5)
Re: Reporting the commit LSN at commit time

On Fri, Aug 8, 2014 at 9:50 AM, Craig Ringer <craig@2ndquadrant.com> wrote:

On 08/08/2014 03:54 AM, Tom Lane wrote:

Craig Ringer <craig@2ndquadrant.com> writes:

Hi all
To support transparent client-side failover in BDR, it's necessary to
know what the LSN of a node was at the time a transaction committed and
keep track of that in the client/proxy.

I'm thinking about adding a new message type in the protocol that gets
sent immediately before CommandComplete, containing the LSN of the
commit. Clients would need to enable the sending of this message with a
GUC that they set when they connect, so it doesn't confuse clients that
aren't expecting it or aware of it.

FWIW, I think it's a seriously bad idea to expose LSNs in the protocol
at all. What happens five years from now when we switch to some other
implementation that doesn't have LSNs?

Everyone who's relying on them already via pg_stat_replication, etc, breaks.

They're _already_ exposed to users. That ship has sailed.

That's not to say I'm stuck to LSNs as the way to do this, just that I
don't think that particular argument is relevant.

I don't mind if you expose some other way to inquire about LSNs, but
let's *not* embed it into the wire protocol. Not even as an option.

Well, the underlying need here is to have the client able to keep track
of what point in the server's time-line it must see on a replica before
it proceeds to use that replica.

So if the client does some work on server A, then for some reason needs
to / must use server B, it can ask server B "are you replayed up to the
last transaction I performed on server A yet?" and wait until it is.

ISTM that the proper solution to that problem is the introduction of
new synchronous replication mode which makes the transaction wait for
its WAL to be replayed by the standby. If this mode is used, a client
doesn't need to track the LSN of each transaction and check whether
the committed transaction has already replayed by the standby or not.

Regards,

--
Fujii Masao

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11Craig Ringer
craig@2ndquadrant.com
In reply to: Fujii Masao (#10)
Re: Reporting the commit LSN at commit time

On 08/08/2014 10:58 AM, Fujii Masao wrote:

ISTM that the proper solution to that problem is the introduction of
new synchronous replication mode which makes the transaction wait for
its WAL to be replayed by the standby. If this mode is used, a client
doesn't need to track the LSN of each transaction and check whether
the committed transaction has already replayed by the standby or not.

I'm not convinced of that.

That pushes the penalty onto the writer - which now has to wait until
replicas catch up. It has to pay this for every commit, even if actually
failing over to another node is unlikely.

It'd be better to just enable sync rep instead, or it would if we had
all-nodes sync rep.

IMO any waiting needs to be done on the other side, i.e. "Wait until I
am caught up before proceeding" rather than "wait for the other end to
catch up before returning".

Doing it the way you describe would make it nearly useless for enabling
client-side failover in BDR, where half the point is that it can deal
with high latency or intermittently available links to downstream replicas.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12Michael Paquier
michael@paquier.xyz
In reply to: Fujii Masao (#10)
Re: Reporting the commit LSN at commit time

On Fri, Aug 8, 2014 at 11:58 AM, Fujii Masao <masao.fujii@gmail.com> wrote:

On Fri, Aug 8, 2014 at 9:50 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
ISTM that the proper solution to that problem is the introduction of
new synchronous replication mode which makes the transaction wait for
its WAL to be replayed by the standby. If this mode is used, a client
doesn't need to track the LSN of each transaction and check whether
the committed transaction has already replayed by the standby or not.

Don't you need to combine that with the possibility to wait for N
targets instead of 1 in synchronous_standby_names? You may want to be
sure that the commit is done on a series of standbys before
considering any further operations after this transaction commit.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13Craig Ringer
craig@2ndquadrant.com
In reply to: Tom Lane (#9)
Re: Reporting the commit LSN at commit time

On 08/08/2014 09:51 AM, Tom Lane wrote:

AFAIK we don't _have_ a fancy negotiation system in the protocol, with
back-and-forth exchanges of capabilities information.

Maybe it's time to invent that. It would be positively foolish to
create any such behavior without a protocol version bump anyway.

I was hoping it'd be easier to sneak a new message type in without a
full protocol bump. As you can imagine that makes it a ... rather larger
job.

Still, if it's unsafe to do it that way...

Although as I said, I don't think embedding knowledge of LSNs at the
protocol level is a good thing to begin with.

As I said upthread, it need not necessarily be an LSN. A commit
timestamp would do the job too, if information about the last-replayed
commit timestamp was accessible on the downstream node.

It needs to be a sequence identifier that can be matched against
pg_replication_slots / pg_stat_replication or passed to a function on
the downstream end to say "wait until we're replayed to this point".

For streaming replication there's only one upstream, so there's no need
to identify it. For BDR you'd also have to identify the upstream node of
interest - probably by slot ID, or by (sysid, tlid, dboid) tuple.

In the end, it can be an opaque magic cookie. It really doesn't matter,
so long as what the client receives is a value it can pass to another Pg
instance and say "wait until you've replayed up to this, please" or
"have you replayed up to this yet?".

Is it really necessary that this information be pushed to the client

on every commit, as opposed to the client asking for it occasionally?

I think so, yes, though I'd be glad to be proved wrong.

For the purpose of transparent failover (BDR) at least, the server
currently being written to can go away at any moment, and you should
know exactly what you're up to in order to make it safe to continue on
another server.

Consider, for a multi-master configuration where two servers replicate
to each other:

On a connection to server1:

INSERT INTO bird(id, parrot)
VALUES (1, 'African Grey');

[client grabs magic cookie for server replay state]

INSERT INTO bird(id, parrot)
VALUES (2, 'Lorikkeet');

[server1 sends changes to server2, which is behind on replay
and still working on catching up]

[server1 dies abruptly]

[client drops connection to server1, connects to server2]

-- Correct spelling
UPDATE bird
SET parrot = 'Lorikeet'
WHERE id = 2;

If the INSERT from server1 hasn't replayed on server2 yet this will
fail. Other anomalies can be worse and cause lost updates, etc.

To protect against this the client needs a way to wait, after connecting
to server2, until it's caught up with the state of server1. That's what
I'm talking about here. In this case, if you used a periodic progress
indicator requested by the client, you'd still get the same error,
because you'd wait until the first INSERT but not the second.

So yes, the client needs this info at every commit.

That means that enabling client-side fail-over won't be free, especially
for lots of small transactions. It'll be cheaper if Pg can push the info
with the commit confirmation instead of the client having to request it
afterwards though.

(Note that the client risks waiting forever if server1 didn't send the
required commits before it died, but that's where application policy
decisions come in).

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#6)
Re: Reporting the commit LSN at commit time

On 2014-08-07 21:02:54 -0400, Tom Lane wrote:

Craig Ringer <craig@2ndquadrant.com> writes:

On 08/08/2014 03:54 AM, Tom Lane wrote:

FWIW, I think it's a seriously bad idea to expose LSNs in the protocol
at all. What happens five years from now when we switch to some other
implementation that doesn't have LSNs?

Everyone who's relying on them already via pg_stat_replication, etc, breaks.
They're _already_ exposed to users. That ship has sailed.

They're exposed to replication tools, yeah, but embedding them in the
wire protocol would be moving the goalposts a long way past that. As an
example of something that doubtless seemed like a good idea at the time,
consider the business about how an INSERT command completion tag includes
the OID of the inserted row. We're stuck with that obsolete idea
*forever* because it's embedded in the protocol for all clients.

I don't think we really need to embed it at that level. And it doesn't
have to be always on - only clients that ask for it need to get the
answer. Something like COMMIT WITH (report_commit_lsn ON); or similar
might do the trick?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15Craig Ringer
craig@2ndquadrant.com
In reply to: Andres Freund (#14)
Re: Reporting the commit LSN at commit time

On 08/10/2014 12:54 AM, Andres Freund wrote:

On 2014-08-07 21:02:54 -0400, Tom Lane wrote:

Craig Ringer <craig@2ndquadrant.com> writes:

On 08/08/2014 03:54 AM, Tom Lane wrote:

FWIW, I think it's a seriously bad idea to expose LSNs in the protocol
at all. What happens five years from now when we switch to some other
implementation that doesn't have LSNs?

Everyone who's relying on them already via pg_stat_replication, etc, breaks.
They're _already_ exposed to users. That ship has sailed.

They're exposed to replication tools, yeah, but embedding them in the
wire protocol would be moving the goalposts a long way past that. As an
example of something that doubtless seemed like a good idea at the time,
consider the business about how an INSERT command completion tag includes
the OID of the inserted row. We're stuck with that obsolete idea
*forever* because it's embedded in the protocol for all clients.

I don't think we really need to embed it at that level. And it doesn't
have to be always on - only clients that ask for it need to get the
answer. Something like COMMIT WITH (report_commit_lsn ON); or similar
might do the trick?

Wouldn't that force client drivers - libpq, psqlODBC, PgJDBC, etc - to
all watch for explicit "COMMIT"s sent by the application and rewrite them?

Applications could also then request the commit option via a driver that
couldn't cope with it - which I think was one of Tom's concerns re using
a GUC, too.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16Andres Freund
andres@anarazel.de
In reply to: Craig Ringer (#15)
Re: Reporting the commit LSN at commit time

On 2014-08-10 08:50:58 +0800, Craig Ringer wrote:

On 08/10/2014 12:54 AM, Andres Freund wrote:

On 2014-08-07 21:02:54 -0400, Tom Lane wrote:

Craig Ringer <craig@2ndquadrant.com> writes:

On 08/08/2014 03:54 AM, Tom Lane wrote:

FWIW, I think it's a seriously bad idea to expose LSNs in the protocol
at all. What happens five years from now when we switch to some other
implementation that doesn't have LSNs?

Everyone who's relying on them already via pg_stat_replication, etc, breaks.
They're _already_ exposed to users. That ship has sailed.

They're exposed to replication tools, yeah, but embedding them in the
wire protocol would be moving the goalposts a long way past that. As an
example of something that doubtless seemed like a good idea at the time,
consider the business about how an INSERT command completion tag includes
the OID of the inserted row. We're stuck with that obsolete idea
*forever* because it's embedded in the protocol for all clients.

I don't think we really need to embed it at that level. And it doesn't
have to be always on - only clients that ask for it need to get the
answer. Something like COMMIT WITH (report_commit_lsn ON); or similar
might do the trick?

Wouldn't that force client drivers - libpq, psqlODBC, PgJDBC, etc - to
all watch for explicit "COMMIT"s sent by the application and rewrite them?

Any application doing such "transparent" failover would need to have a
driver that's aware of all that anyway. They need to learn about the
transaction boundaries, the commit command and such. I personally think
this should mean that that feature requires an explicit API call for
transaction control.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#14)
Re: Reporting the commit LSN at commit time

On Sat, Aug 9, 2014 at 12:54 PM, Andres Freund <andres@2ndquadrant.com> wrote:

On 2014-08-07 21:02:54 -0400, Tom Lane wrote:

Craig Ringer <craig@2ndquadrant.com> writes:

On 08/08/2014 03:54 AM, Tom Lane wrote:

FWIW, I think it's a seriously bad idea to expose LSNs in the protocol
at all. What happens five years from now when we switch to some other
implementation that doesn't have LSNs?

Everyone who's relying on them already via pg_stat_replication, etc, breaks.
They're _already_ exposed to users. That ship has sailed.

They're exposed to replication tools, yeah, but embedding them in the
wire protocol would be moving the goalposts a long way past that. As an
example of something that doubtless seemed like a good idea at the time,
consider the business about how an INSERT command completion tag includes
the OID of the inserted row. We're stuck with that obsolete idea
*forever* because it's embedded in the protocol for all clients.

I don't think we really need to embed it at that level. And it doesn't
have to be always on - only clients that ask for it need to get the
answer. Something like COMMIT WITH (report_commit_lsn ON); or similar
might do the trick?

And what does that actually do? Send back a result-set, or a new
protocol message?

I don't have a very clear idea whether this is a good idea in any form
because I can't quite imagine how this is going to be used by the
client without adding an unwarranted amount of complexity there.
However, based on my experiences at EnterpriseDB, I would be extremely
wary of extending the wire protocol. As soon as we do that, it
requires updates to a really phenomenal amount of other software.
Software using libpq may be more or less able to ignore the
difference, as long as they have a new-enough version of libpq (which
is a significant proviso). But any driver that has its own
implementation of the wire protocol (and I think there is at least one
and maybe several important ones that do) needs updating, and anything
that acts as middleware (pgpool, pgbouncer) does too. And it's not
just a matter of the maintainers making the appropriate changes
(though that does need to happen); it's also about everyone who is
using the new server version getting new versions of that other
software also.

So, even accepting for the moment the premise that the basic idea here
is good, I think the benefits would have to be monumental to convince
me that a protocol change is a good idea. If we do anything like
that, we'll be hearing about the downstream damage for years.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#18Craig Ringer
craig@2ndquadrant.com
In reply to: Robert Haas (#17)
Re: Reporting the commit LSN at commit time

On 08/15/2014 12:21 AM, Robert Haas wrote:

On Sat, Aug 9, 2014 at 12:54 PM, Andres Freund <andres@2ndquadrant.com> wrote:

I don't have a very clear idea whether this is a good idea in any form
because I can't quite imagine how this is going to be used by the
client without adding an unwarranted amount of complexity there.

Full automatic transparent failover _will_ be complex on the client. No
denying that. The hard parts are picking which node to connect to when
one goes away, the decision making around what to do when the new node
fails to catch up to the last committed state on the old node, and
tracking session state.

There are some quite simple uses too though. The main one of interest to
me is an app that routes read-only queries to an async read-replica and
wants to guarantee that some of them see a state consistent with the
last commit on the master.

It's the first thing many people have asked me about BDR, though. "How
does client-side failover work". This is a priority for a lot of people.

As far as I can see, if you have client-side failover with asynchronous
replication of any form, the client _must_ have some way to reliably
connect to a new node and ask it "are you caught up to the state of the
last node I was connected to yet?". Or "Please wait until the last
transaction I committed elsewhere is visible here".

The client must keep track of some kind of information that indicates
the last node it talked to and identifies the last transaction it
committed. ("Client" could mean "proxy" in the case of a failover-aware
pgbouncer.)

So, even accepting for the moment the premise that the basic idea here
is good, I think the benefits would have to be monumental to convince
me that a protocol change is a good idea. If we do anything like
that, we'll be hearing about the downstream damage for years.

Yes, that's a real concern.

PgJDBC and psqlODBC both implement the wire protocol themselves. PgJDBC
does because it's a type 4 JDBC driver (pure Java, no native code, no
recompile required). I don't understand why psqlODBC goes its own way
instead of using libpq, but it does.

There are also numerous language-specific pure-language bindings, though
half of them seem pretty close to unmaintained.

That's why I proposed a new protocol message carrying extra info, that
clients can optionally request only if they understand it. Nobody else
needs to care or notice that anything's new.

The v2 to v3 protocol switch has only now reached the point where it's
realistic to to drop v2 support from clients. I'm hardly keen to do
another protocol rev, especially for something as minor as this.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#19Josh Berkus
josh@agliodbs.com
In reply to: Craig Ringer (#1)
Re: Reporting the commit LSN at commit time

On 08/14/2014 05:45 PM, Craig Ringer wrote:

Wouldn't that force client drivers - libpq, psqlODBC, PgJDBC, etc - to
all watch for explicit "COMMIT"s sent by the application and rewrite them?

Realistically, users are going to need new drivers to take advantage of
any automated connection failover anyway.

Full automatic transparent failover _will_ be complex on the client. No
denying that. The hard parts are picking which node to connect to when
one goes away, the decision making around what to do when the new node
fails to catch up to the last committed state on the old node, and
tracking session state.

Frankly, I'd love to see just the simplest version of this implemented
in libpq as a start: the ability for client drivers to take a list of
hosts instead of a singe hostaddr (this was discussed at the 2013
clustering meeting).

There are some quite simple uses too though. The main one of interest to
me is an app that routes read-only queries to an async read-replica and
wants to guarantee that some of them see a state consistent with the
last commit on the master.

It's the first thing many people have asked me about BDR, though. "How
does client-side failover work". This is a priority for a lot of people.

As far as I can see, if you have client-side failover with asynchronous
replication of any form, the client _must_ have some way to reliably
connect to a new node and ask it "are you caught up to the state of the
last node I was connected to yet?". Or "Please wait until the last
transaction I committed elsewhere is visible here".

There are quite a few use-cases where this information isn't required;
even for BDR, I'd love to see the ability to disable this check.

There's also cases where it's not adequate; the user may not have
committed anything on the master, but they still don't want to connect
to a replica which is hours behind the last node they queried.

There's also use-cases for which automated connection failover without a
managed proxy is a Seriously Bad Idea. For one thing, you're setting up
a strong risk of split-brain.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#20Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#17)
Re: Reporting the commit LSN at commit time

On 2014-08-14 12:21:38 -0400, Robert Haas wrote:

On Sat, Aug 9, 2014 at 12:54 PM, Andres Freund <andres@2ndquadrant.com> wrote:

I don't think we really need to embed it at that level. And it doesn't
have to be always on - only clients that ask for it need to get the
answer. Something like COMMIT WITH (report_commit_lsn ON); or similar
might do the trick?

And what does that actually do? Send back a result-set, or a new
protocol message?

What I was thinking of was to return "COMMIT X/X" instead of
"COMMIT". Since that's only sent when COMMIT WITH (report_commit_lsn ON)
was set it won't break clients/libraries that don't need it.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#21Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#20)
#22Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#21)
#23Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#21)
#24Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#23)
#25Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#24)
#26Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Robert Haas (#25)
#27Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#21)
#28Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#27)
#29Craig Ringer
craig@2ndquadrant.com
In reply to: Robert Haas (#25)
#30Bruce Momjian
bruce@momjian.us
In reply to: Craig Ringer (#29)
#31Andres Freund
andres@anarazel.de
In reply to: Craig Ringer (#29)
#32Fujii Masao
masao.fujii@gmail.com
In reply to: Andres Freund (#31)
#33Andres Freund
andres@anarazel.de
In reply to: Fujii Masao (#32)
#34Fujii Masao
masao.fujii@gmail.com
In reply to: Andres Freund (#33)
#35Craig Ringer
craig@2ndquadrant.com
In reply to: Andres Freund (#31)
#36Andres Freund
andres@anarazel.de
In reply to: Craig Ringer (#35)