synchronized snapshots
This is a patch to implement synchronized snapshots. It is based on
Alvaro's specifications in:
http://archives.postgresql.org/pgsql-hackers/2011-02/msg02074.php
In short, this is how it works:
SELECT pg_export_snapshot();
pg_export_snapshot
--------------------
000003A1-1
(1 row)
(and then in a different session)
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ (SNAPSHOT = '000003A1-1');
The one thing that it does not implement is leaving the transaction in
an aborted state if the BEGIN TRANSACTION command failed for an
invalid snapshot identifier. I can certainly see that this would be
useful but I am not sure if it justifies introducing this
inconsistency. We would have a BEGIN TRANSACTION command that left the
session in a different state depending on why it failed...
Also I was unsure if we really need to do further checking beyond the
existence of the file, why exactly is this necessary?
The patch is adding an extra "stemplate" parameter to the GetSnapshot
functions, the primary reason for this is to make it work with SSI,
which gets a snapshot and then does stuff with it. The alternative
would have been splitting up the SSI function so that we can smuggle
in our own snapshot but that didn't seem to be less ugly. The way it
works now is that the lowest function checks if a template is being
passed from higher up and if so, it doesn't get a fresh snapshot but
returns just a copy of the template.
I am wondering if pg_export_snapshot() is still the right name, since
the snapshot is no longer exported to the user. It is exported to a
file but that's an implementation detail.
Joachim
Attachments:
syncSnapshots.1.difftext/x-patch; charset=US-ASCII; name=syncSnapshots.1.diffDownload+745-145
On Mon, Aug 15, 2011 at 2:31 AM, Joachim Wieland <joe@mcknight.de> wrote:
In short, this is how it works:
SELECT pg_export_snapshot();
pg_export_snapshot
--------------------
000003A1-1
(1 row)(and then in a different session)
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ (SNAPSHOT = '000003A1-1');
I don't see the need to change the BEGIN command, which is SQL
Standard. We don't normally do that.
If we have pg_export_snapshot() why not pg_import_snapshot() as well?
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 15.08.2011 04:31, Joachim Wieland wrote:
The one thing that it does not implement is leaving the transaction in
an aborted state if the BEGIN TRANSACTION command failed for an
invalid snapshot identifier.
So what if the snapshot is invalid, the SNAPSHOT clause silently
ignored? That sounds really bad.
I can certainly see that this would be
useful but I am not sure if it justifies introducing this
inconsistency. We would have a BEGIN TRANSACTION command that left the
session in a different state depending on why it failed...
I don't understand what inconsistency you're talking about. What else
can cause BEGIN TRANSACTION to fail? Is there currently any failure mode
that doesn't leave the transaction in aborted state?
I am wondering if pg_export_snapshot() is still the right name, since
the snapshot is no longer exported to the user. It is exported to a
file but that's an implementation detail.
It's still exporting the snapshot to other sessions, that name still
seems appropriate to me.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
On 15.08.2011 10:40, Simon Riggs wrote:
On Mon, Aug 15, 2011 at 2:31 AM, Joachim Wieland<joe@mcknight.de> wrote:
In short, this is how it works:
SELECT pg_export_snapshot();
pg_export_snapshot
--------------------
000003A1-1
(1 row)(and then in a different session)
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ (SNAPSHOT = '000003A1-1');
I don't see the need to change the BEGIN command, which is SQL
Standard. We don't normally do that.If we have pg_export_snapshot() why not pg_import_snapshot() as well?
It would be nice a symmetry, but you'd need a limitation that
pg_import_snapshot() must be the first thing you do in the session. And
it might be hard to enforce that, as once you get control into the
function, you've already acquired another snapshot in the transaction to
run the "SELECT pg_import_snapshot()" query with. Specifying the
snapshot in the BEGIN command makes sense.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
On Monday, August 15, 2011 08:40:34 Simon Riggs wrote:
On Mon, Aug 15, 2011 at 2:31 AM, Joachim Wieland <joe@mcknight.de> wrote:
In short, this is how it works:
SELECT pg_export_snapshot();
pg_export_snapshot
--------------------
000003A1-1
(1 row)(and then in a different session)
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ (SNAPSHOT =
'000003A1-1');I don't see the need to change the BEGIN command, which is SQL
Standard. We don't normally do that.
Uhm. There already are several extensions to begin transaction. Like the just
added "DEFERRABLE".
If we have pg_export_snapshot() why not pg_import_snapshot() as well?
Using BEGIN has the advantage of making it explicit that it cannot be used
inside an existing transaction. Which I do find advantageous.
Andres
On Aug 15, 2011, at 9:40 AM, Simon Riggs wrote:
On Mon, Aug 15, 2011 at 2:31 AM, Joachim Wieland <joe@mcknight.de> wrote:
In short, this is how it works:
SELECT pg_export_snapshot();
pg_export_snapshot
--------------------
000003A1-1
(1 row)(and then in a different session)
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ (SNAPSHOT = '000003A1-1');
I don't see the need to change the BEGIN command, which is SQL
Standard. We don't normally do that.If we have pg_export_snapshot() why not pg_import_snapshot() as well?
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
i would definitely argue for a syntax like the one proposed by Joachim.. i could stay the same if this is turned into some sort of flashback implementation some day.
regards,
hans
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
* Simon Riggs:
I don't see the need to change the BEGIN command, which is SQL
Standard. We don't normally do that.
Some language bindings treat BEGIN specially, so it might be difficult
to use this feature.
--
Florian Weimer <fweimer@bfk.de>
BFK edv-consulting GmbH http://www.bfk.de/
Kriegsstraße 100 tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99
On Mon, Aug 15, 2011 at 3:47 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
On 15.08.2011 04:31, Joachim Wieland wrote:
The one thing that it does not implement is leaving the transaction in
an aborted state if the BEGIN TRANSACTION command failed for an
invalid snapshot identifier.So what if the snapshot is invalid, the SNAPSHOT clause silently ignored?
That sounds really bad.
No, the command would fail, but since it fails, it doesn't change the
transaction state.
What was proposed originally was to start a transaction but throw an
error that leaves the transaction in the aborted state. But then the
command had some effect because it started a transaction block, even
though it failed.
I can certainly see that this would be
useful but I am not sure if it justifies introducing this
inconsistency. We would have a BEGIN TRANSACTION command that left the
session in a different state depending on why it failed...I don't understand what inconsistency you're talking about. What else can
cause BEGIN TRANSACTION to fail? Is there currently any failure mode that
doesn't leave the transaction in aborted state?
Granted, it might only fail for parse errors so far, but that would
include for example sending BEGIN DEFERRABLE to a pre-9.1 server. It
wouldn't start a transaction and leave it in an aborted state, but it
would just fail.
I am wondering if pg_export_snapshot() is still the right name, since
the snapshot is no longer exported to the user. It is exported to a
file but that's an implementation detail.It's still exporting the snapshot to other sessions, that name still seems
appropriate to me.
ok.
Joachim
On Mon, Aug 15, 2011 at 6:41 AM, Florian Weimer <fweimer@bfk.de> wrote:
* Simon Riggs:
I don't see the need to change the BEGIN command, which is SQL
Standard. We don't normally do that.Some language bindings treat BEGIN specially, so it might be difficult
to use this feature.
It's true, the command might require explicit support from language
bindings. However I used some perl test scripts, where you can also
send a START TRANSACTION command in an $dbh->do(...).
The intended use case of this feature is still pg_dump btw...
Joachim
On Mon, Aug 15, 2011 at 3:51 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
It would be nice a symmetry, but you'd need a limitation that
pg_import_snapshot() must be the first thing you do in the session. And it
might be hard to enforce that, as once you get control into the function,
you've already acquired another snapshot in the transaction to run the
"SELECT pg_import_snapshot()" query with. Specifying the snapshot in the
BEGIN command makes sense.
+1. Also, I am pretty sure that there are drivers out there, and
connection poolers, that keep track of the transaction state by
watching commands go by. Right now you can tell by the first word of
the command whether it's something that might change the transaction
state; I wouldn't like to make that harder.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Simon Riggs <simon@2ndQuadrant.com> wrote:
Joachim Wieland <joe@mcknight.de> wrote:
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ (SNAPSHOT =
'000003A1-1');I don't see the need to change the BEGIN command, which is SQL
Standard.
No, it's not standard.
To quote from our docs at:
http://www.postgresql.org/docs/9.0/interactive/sql-begin.html#AEN58214
| BEGIN is a PostgreSQL language extension. It is equivalent to the
| SQL-standard command START TRANSACTION, whose reference page
| contains additional compatibility information.
|
| Incidentally, the BEGIN key word is used for a different purpose
| in embedded SQL. You are advised to be careful about the
| transaction semantics when porting database applications.
In checking the most recent standards draft I have available, it
appears that besides embedded SQL, this keyword is also used in the
standard trigger declaration syntax. Using BEGIN to start a
transaction is a PostgreSQL extension to the standard. That said,
if we support a feature on the nonstandard BEGIN statement, we
typically add it as an extension to the standard START TRANSACTION
and SET TRANSACTION statements. Through 9.0 that consisted of
having a non-standard default for isolation level and the ability to
omit commas required by the standard. In 9.1 we added another
optional transaction property which defaults to standard behavior:
DEFERRABLE.
If we're talking about a property of a transaction, like the
transaction snapshot, it seems to me to be best to support it using
the same statements we use for other transaction properties.
-Kevin
On Aug 15, 2011, at 6:23 AM, Joachim Wieland wrote:
On Mon, Aug 15, 2011 at 3:47 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:On 15.08.2011 04:31, Joachim Wieland wrote:
The one thing that it does not implement is leaving the transaction in
an aborted state if the BEGIN TRANSACTION command failed for an
invalid snapshot identifier.So what if the snapshot is invalid, the SNAPSHOT clause silently ignored?
That sounds really bad.No, the command would fail, but since it fails, it doesn't change the
transaction state.What was proposed originally was to start a transaction but throw an
error that leaves the transaction in the aborted state. But then the
command had some effect because it started a transaction block, even
though it failed.
It certainly seems safer to me to set the transaction to an aborted state; you were expecting a set of commands to run with one snapshot, but if we don't abort the transaction they'll end up running anyway and doing so with the *wrong* snapshot. That could certainly lead to data corruption.
I suspect that all the other cases of BEGIN failing would be syntax errors, so you would immediately know in testing that something was wrong. A missing file is definitely not a syntax error, so we can't really depend on user testing to ensure this is handled correctly. IMO, that makes it critical that that error puts us in an aborted transaction.
--
Jim C. Nasby, Database Architect jim@nasby.net
512.569.9461 (cell) http://jim.nasby.net
On Mon, Aug 15, 2011 at 6:09 PM, Jim Nasby <jim@nasby.net> wrote:
I suspect that all the other cases of BEGIN failing would be syntax errors, so
you would immediately know in testing that something was wrong. A missing file
is definitely not a syntax error, so we can't really depend on user testing to ensure
this is handled correctly. IMO, that makes it critical that that error puts us in an
aborted transaction.
Why can we not just require the user to verify if his BEGIN query
failed or succeeded?
Is that really too much to ask for?
Also see what Robert wrote about proxies in between that keep track of
the transaction
state. Consider they see a BEGIN query that fails. How would they know
if the session
is now in an aborted transaction or not in a transaction at all?
Joachim
On Mon, Aug 15, 2011 at 6:46 PM, Joachim Wieland <joe@mcknight.de> wrote:
On Mon, Aug 15, 2011 at 6:09 PM, Jim Nasby <jim@nasby.net> wrote:
I suspect that all the other cases of BEGIN failing would be syntax errors, so
you would immediately know in testing that something was wrong. A missing file
is definitely not a syntax error, so we can't really depend on user testing to ensure
this is handled correctly. IMO, that makes it critical that that error puts us in an
aborted transaction.Why can we not just require the user to verify if his BEGIN query
failed or succeeded?
Is that really too much to ask for?Also see what Robert wrote about proxies in between that keep track of
the transaction
state. Consider they see a BEGIN query that fails. How would they know
if the session
is now in an aborted transaction or not in a transaction at all?
I think the point here is that we should be consistent. Currently,
you can make BEGIN fail by doing it on the standby, and asking for
READ WRITE mode:
rhaas=# begin transaction read write;
ERROR: cannot set transaction read-write mode during recovery
After doing that, you are NOT in a transaction context:
rhaas=# select 1;
?column?
----------
1
(1 row)
So whatever this does should be consistent with that, at least IMHO.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Excerpts from Robert Haas's message of mar ago 16 09:59:04 -0400 2011:
On Mon, Aug 15, 2011 at 6:46 PM, Joachim Wieland <joe@mcknight.de> wrote:
Also see what Robert wrote about proxies in between that keep track
of the transaction state. Consider they see a BEGIN query that
fails. How would they know if the session is now in an aborted
transaction or not in a transaction at all?I think the point here is that we should be consistent. Currently,
you can make BEGIN fail by doing it on the standby, and asking for
READ WRITE mode:rhaas=# begin transaction read write;
ERROR: cannot set transaction read-write mode during recoveryAfter doing that, you are NOT in a transaction context:
rhaas=# select 1;
?column?
----------
1
(1 row)So whatever this does should be consistent with that, at least IMHO.
I think we argued about a very similar problem years ago and the outcome
was that you should be left in an aborted transaction block; otherwise
running a dumb SQL script (which has no way to "abort if it fails")
could wreak serious havoc (?). I think this failure to behave in that
fashion on the standby is something to be fixed, not imitated.
What this says is that a driver or app seeing BEGIN fail should issue
ROLLBACK before going further -- which seems the intuitive way to behave
to me. No?
--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Tue, Aug 16, 2011 at 10:43 AM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
Excerpts from Robert Haas's message of mar ago 16 09:59:04 -0400 2011:
On Mon, Aug 15, 2011 at 6:46 PM, Joachim Wieland <joe@mcknight.de> wrote:
Also see what Robert wrote about proxies in between that keep track
of the transaction state. Consider they see a BEGIN query that
fails. How would they know if the session is now in an aborted
transaction or not in a transaction at all?I think the point here is that we should be consistent. Currently,
you can make BEGIN fail by doing it on the standby, and asking for
READ WRITE mode:rhaas=# begin transaction read write;
ERROR: cannot set transaction read-write mode during recoveryAfter doing that, you are NOT in a transaction context:
rhaas=# select 1;
?column?
----------
1
(1 row)So whatever this does should be consistent with that, at least IMHO.
I think we argued about a very similar problem years ago and the outcome
was that you should be left in an aborted transaction block; otherwise
running a dumb SQL script (which has no way to "abort if it fails")
could wreak serious havoc (?). I think this failure to behave in that
fashion on the standby is something to be fixed, not imitated.What this says is that a driver or app seeing BEGIN fail should issue
ROLLBACK before going further -- which seems the intuitive way to behave
to me. No?
Maybe. But if we're going to change the behavior of BEGIN, then (1)
we need to think about backward compatibility and (2) we should change
it across the board. It's not for this patch to go invent something
that's inconsistent with what we're already doing elsewhere.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Aug 15, 2011, at 5:46 PM, Joachim Wieland wrote:
On Mon, Aug 15, 2011 at 6:09 PM, Jim Nasby <jim@nasby.net> wrote:
I suspect that all the other cases of BEGIN failing would be syntax errors, so
you would immediately know in testing that something was wrong. A missing file
is definitely not a syntax error, so we can't really depend on user testing to ensure
this is handled correctly. IMO, that makes it critical that that error puts us in an
aborted transaction.Why can we not just require the user to verify if his BEGIN query
failed or succeeded?
Is that really too much to ask for?
It's something else that you have to remember to get right. psql, for example, will blindly continue on unless you remembered to tell it to exit on an error.
Also, an invalid transaction seems to be the result of least surprise... if you cared enough to begin a transaction, you're going to expect that either everything between that and the COMMIT succeeds or fails, not something in-between.
Also see what Robert wrote about proxies in between that keep track of
the transaction
state. Consider they see a BEGIN query that fails. How would they know
if the session
is now in an aborted transaction or not in a transaction at all?
AFAIK a proxy can tell if a transaction is in progress or not via libpq. Worst-case, it just needs to send an extra ROLLBACK.
--
Jim C. Nasby, Database Architect jim@nasby.net
512.569.9461 (cell) http://jim.nasby.net
On Tue, 2011-08-16 at 11:01 -0500, Jim Nasby wrote:
Also, an invalid transaction seems to be the result of least
surprise... if you cared enough to begin a transaction, you're going
to expect that either everything between that and the COMMIT succeeds
or fails, not something in-between.
Agreed.
Perhaps we need a new utility command to set the snapshot to make the
error handling a little more obvious?
Regards,
Jeff Davis
On Aug 16, 2011, at 5:40 PM, Jeff Davis wrote:
On Tue, 2011-08-16 at 11:01 -0500, Jim Nasby wrote:
Also, an invalid transaction seems to be the result of least
surprise... if you cared enough to begin a transaction, you're going
to expect that either everything between that and the COMMIT succeeds
or fails, not something in-between.Agreed.
Perhaps we need a new utility command to set the snapshot to make the
error handling a little more obvious?
Well, it appears we have a larger problem, as Robert pointed out that trying to start a writable transaction on a hot standby leaves you not in a transaction (which I feel is a problem).
So IMHO the right thing to do here is make it so that runtime errors in BEGIN leave you in an invalid transaction. Then we can decide on the API for synchronized snapshots that makes sense instead of working around the behavior of BEGIN.
I guess the big question to answer now is: what's the backwards compatibility impact of changing how BEGIN deals with runtime errors?
--
Jim C. Nasby, Database Architect jim@nasby.net
512.569.9461 (cell) http://jim.nasby.net
Jim Nasby <jim@nasby.net> writes:
Well, it appears we have a larger problem, as Robert pointed out that trying to start a writable transaction on a hot standby leaves you not in a transaction (which I feel is a problem).
So IMHO the right thing to do here is make it so that runtime errors in BEGIN leave you in an invalid transaction. Then we can decide on the API for synchronized snapshots that makes sense instead of working around the behavior of BEGIN.
I'm not convinced by the above argument, because it requires that
you pretend there's a significant difference between syntax errors and
"run time" errors (whatever those are). Syntax errors in a BEGIN
command are not going to leave you in an aborted transaction, because
the backend is not going to recognize the command as a BEGIN at all.
This means that frontends *must* be capable of dealing with the case
that a failed BEGIN didn't start a transaction. (Either that, or
they just assume their commands are always syntactically perfect,
which seems like pretty fragile programming to me; and the more weird
nonstandard options we load onto BEGIN, the less tenable the position
becomes. For example, if you feed BEGIN option-foo to a server that's
a bit older than you thought it was, you will get a syntax error.)
If we have some failure cases that start a transaction and some that do
not, we just have a mess, IMO.
I think we'd be far better off to maintain the position that a failed
BEGIN does not start a transaction, under any circumstances. To do
that, we cannot have this new option attached to the BEGIN, which is a
good thing anyway IMO from a standards compatibility point of view.
It'd be better to make it a separate utility statement. There is no
logical problem in doing that, and we already have a precedent for
utility statements that do something special before the transaction
snapshot is taken: see LOCK.
In fact, now that I think about it, setting the transaction snapshot
from a utility statement would be functionally useful because then you
could take locks beforehand.
And as a bonus, we don't have a backwards compatibility problem to solve.
regards, tom lane