existence of a savepoint?

Started by Stuart McGrawabout 8 years ago12 messagesgeneral

smcg4191@mtneva.com

about 8 years ago

Is there some way to to test if a savepoint of a given name
exists? Or better yet, the number of stacked savepoints of
that name?

David G. Johnston

david.g.johnston@gmail.com

about 8 years ago

In reply to: Stuart McGraw (#1)

Re: existence of a savepoint?

On Sunday, May 27, 2018, Stuart McGraw <smcg4191@mtneva.com> wrote:

Is there some way to to test if a savepoint of a given name
exists? Or better yet, the number of stacked savepoints of
that name?

A scan of the documentation doesn't show any commands or functions that
would provide this information.

David J.

brian@omniti.com

about 8 years ago

In reply to: Stuart McGraw (#1)

Re: existence of a savepoint?

On Sun, May 27, 2018 at 6:04 PM, Stuart McGraw <smcg4191@mtneva.com> wrote:

Is there some way to to test if a savepoint of a given name
exists? Or better yet, the number of stacked savepoints of
that name?

This is kind of backwards, but you can determine if a savepoint by a
certain name exists by trying to release it. You can do so without
damaging the transaction by creating a savepoint immediately beforehand,
but this will cause you to lose the state of the named savepoint. Example
below.

=# begin;
BEGIN
=# insert into brian.test values (1);
INSERT 0 1
=# savepoint target_point;
SAVEPOINT
=# insert into brian.test values (2);
INSERT 0 1
=# savepoint buffer;
SAVEPOINT
=# release target_point;
RELEASE <------ savepoint existed
=# commit;
COMMIT

=# begin;
BEGIN
=# insert into brian.test values (3);
INSERT 0 1
=# savepoint buffer;
SAVEPOINT
=# release target_point;
ERROR: no such savepoint <----- savepoint did not exist
=# rollback to buffer;
ROLLBACK
=# select * from brian.test;
x
---
3

In theory you could do two savepoints "target_point" and
"target_point_test" and check for the _test version to maintain your
transaction states and be able to test for it once.

BEGIN
=# insert into brian.test values (1);
INSERT 0 1
=# savepoint target_point;
SAVEPOINT
=# savepoint target_point_test;
SAVEPOINT
=# insert into brian.test values (2);
INSERT 0 1
=# savepoint buffer;
SAVEPOINT
=# release target_point_test; <----- check if target_point exists
RELEASE
=# rollback to target_point; <----- rollback to it
ROLLBACK

Cheers,

-Brian Dunavant

alvherre@2ndquadrant.com

about 8 years ago

In reply to: Stuart McGraw (#1)

Re: existence of a savepoint?

On 2018-May-27, Stuart McGraw wrote:

Is there some way to to test if a savepoint of a given name
exists? Or better yet, the number of stacked savepoints of
that name?

What is the use case for this?

--
ï¿½lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

smcg4191@mtneva.com

about 8 years ago

In reply to: David G. Johnston (#2)

Re: existence of a savepoint?

On 05/27/2018 04:39 PM, David G. Johnston wrote:

On Sunday, May 27, 2018, Stuart McGraw <smcg4191@mtneva.com <mailto:smcg4191@mtneva.com>> wrote:

Is there some way to to test if a savepoint of a given name
exists? Or better yet, the number of stacked savepoints of
that name?

A scan of the documentation doesn't show any commands or functions that would provide this information.

I was hoping that I was overlooking something. :-(

smcg4191@mtneva.com

about 8 years ago

In reply to: Alvaro Herrera (#4)

Re: existence of a savepoint?

On 05/29/2018 08:38 AM, Alvaro Herrera wrote:

On 2018-May-27, Stuart McGraw wrote:

Is there some way to to test if a savepoint of a given name
exists? Or better yet, the number of stacked savepoints of
that name?

What is the use case for this?

I have a process that extracts data from a source and inserts
it into a Postgresql database with one big commit at the end.
The flow is not straight forward: the data is read in blocks
that contain sub-blocks and certain database errors (duplicate
keys, etc) may need to be rolled back to the previous insert,
sub-block or block depending on context. Trying to keep track
of which savepoints are active without "leaking" them (failing
to release when no longer needed) in the presence of many branches
and exception handlers is not impossible but would be much easier
if I could introspect the savepoint state rather than having to
try and track it myself.

Alternatively if there were a setting to tell Postgresql to
follow the SQL standard behavior of overwriting rather stacking
savepoints, that too would also solve my current problem I think.
Perhaps it is just my limited experience but the former behavior
has always seemed more useful in practice than the latter.

smcg4191@mtneva.com

about 8 years ago

In reply to: Brian Dunavant (#3)

Re: existence of a savepoint?

On 05/29/2018 08:26 AM, Brian Dunavant wrote:

On Sun, May 27, 2018 at 6:04 PM, Stuart McGraw <smcg4191@mtneva.com <mailto:smcg4191@mtneva.com>> wrote:
Is there some way to to test if a savepoint of a given name
exists? Or better yet, the number of stacked savepoints of
that name?

This is kind of backwards, but you can determine if a savepoint by a certain name exists by trying to release it. You can do so without damaging the transaction by creating a savepoint immediately beforehand, but this will cause you to lose the state of the named savepoint. Example below.
[...snipped for brevity...]

Thanks. That's interesting, protecting a questionable
release with another savepoint. I wouldn't have thought
of that. But I'm not sure it helps my ultimate goal of
simplifying my application code or that it doesn't shift
the problem of not leaking savepoints to the test
savepoints.

alvherre@2ndquadrant.com

about 8 years ago

In reply to: Stuart McGraw (#6)

Re: existence of a savepoint?

On 2018-May-29, Stuart McGraw wrote:

Alternatively if there were a setting to tell Postgresql to
follow the SQL standard behavior of overwriting rather stacking
savepoints, that too would also solve my current problem I think.
Perhaps it is just my limited experience but the former behavior
has always seemed more useful in practice than the latter.

I think if what we're doing breaks the semantics of the SQL spec, we're
definitely open to changing our behavior. But that wouldn't solve your
problem today. What I think could solve your problem today is a
C-language extension that uses xact.c callbacks in order to expose a
list that you can query from user space.

--
ï¿½lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

David G. Johnston

david.g.johnston@gmail.com

about 8 years ago

In reply to: Alvaro Herrera (#8)

Re: existence of a savepoint?

On Tue, May 29, 2018 at 4:01 PM, Alvaro Herrera <alvherre@2ndquadrant.com>
wrote:

On 2018-May-29, Stuart McGraw wrote:

Alternatively if there were a setting to tell Postgresql to
follow the SQL standard behavior of overwriting rather stacking
savepoints, that too would also solve my current problem I think.
Perhaps it is just my limited experience but the former behavior
has always seemed more useful in practice than the latter.

I think if what we're doing breaks the semantics of the SQL spec, we're
definitely open to changing our behavior. But that wouldn't solve your
problem today. What I think could solve your problem today is a
C-language extension that uses xact.c callbacks in order to expose a
list that you can query from user space.

Stuart:

That said, have you measured this "leaking" and can show that it is
non-trivial (given the large size of the overall transaction)?

Beyond that bulk ETL leveraging SAVEPOINT is not something I've encountered
or contemplated. Expecting and reacting to errors is expensive and itself
error-prone. I'd much rather try to design something that where failure is
simply bad - usually by bulk loading with fewer constraints and then
ensuring that future queries don't attempt to do something illegal like
insert duplicates.

David J.

smcg4191@mtneva.com

about 8 years ago

In reply to: David G. Johnston (#9)

Re: existence of a savepoint?

On 05/29/2018 05:18 PM, David G. Johnston wrote:

On Tue, May 29, 2018 at 4:01 PM, Alvaro Herrera <alvherre@2ndquadrant.com <mailto:alvherre@2ndquadrant.com>>wrote:

On 2018-May-29, Stuart McGraw wrote:

Alternatively if there were a setting to tell Postgresql to
follow the SQL standard behavior of overwriting rather stacking
savepoints, that too would also solve my current problem I think.
Perhaps it is just my limited experience but the former behavior
has always seemed more useful in practice than the latter.

I think if what we're doing breaks the semantics of the SQL spec, we're
definitely open to changing our behavior. But that wouldn't solve your
problem today. What I think could solve your problem today is a
C-language extension that uses xact.c callbacks in order to expose a
list that you can query from user space.

Stuart:

That said, have you measured this "leaking" and can show that it is non-trivial (given the large size of the overall transaction)?

No I haven't and am not sure how I would. Are you saying I shouldn't worry about it and just not bother releasing any of the savepoints? I would feel a little uneasy about that the same way I would feel about a program that never freed allocated memory or closed open files. If I know there are relatively small limits on how much data will be processed or how long the program will run, sure. But in my case I don't control the size of the input data and I don't understand the internals of savepoints so I think caution is prudent.

Also I'm not sure the warnings against premature optimization when talking about code performance tweaks apply to resource leaks. The former attempt to make a program run faster but don't (in theory) affect its correctness. Resource problems often show up unexpectedly and catastrophically. So being more preemptively concerned about the latter I think is justified.

Beyond that bulk ETL leveraging SAVEPOINT is not something I've encountered or contemplated. Expecting and reacting to errors is expensive and itself error-prone. I'd much rather try to design something that where failure is simply bad - usually by bulk loading with fewer constraints and then ensuring that future queries don't attempt to do something illegal like insert duplicates.

Funny you should say that :-) I am looking at rewriting these import programs (there are several) to do just that. But it is not a trivial job and in the meantime I need to keep what already exists, working.

smcg4191@mtneva.com

about 8 years ago

In reply to: Alvaro Herrera (#8)

Re: existence of a savepoint?

On 05/29/2018 05:01 PM, Alvaro Herrera wrote:

On 2018-May-29, Stuart McGraw wrote:

Alternatively if there were a setting to tell Postgresql to
follow the SQL standard behavior of overwriting rather stacking
savepoints, that too would also solve my current problem I think.
Perhaps it is just my limited experience but the former behavior
has always seemed more useful in practice than the latter.

I think if what we're doing breaks the semantics of the SQL spec, we're
definitely open to changing our behavior.

My comment was based on the statement in the Postqresql docs for SAVEPOINT:

"SQL requires a savepoint to be destroyed automatically when another
savepoint with the same name is established. In PostgreSQL, the old
savepoint is kept, though only the more recent one will be used when
rolling back or releasing. (Releasing the newer savepoint with RELEASE
SAVEPOINT will cause the older one to again become accessible to ROLLBACK
TO SAVEPOINT and RELEASE SAVEPOINT.) Otherwise, SAVEPOINT is fully SQL
conforming."

But that wouldn't solve your
problem today. What I think could solve your problem today is a
C-language extension that uses xact.c callbacks in order to expose a
list that you can query from user space.

Even that won't solve it unless I can find someone who has already
written it and is willing to share it and my Googling hasn't found
anything yet. :-(

David G. Johnston

david.g.johnston@gmail.com

about 8 years ago

In reply to: Stuart McGraw (#10)

Re: existence of a savepoint?

On Tuesday, May 29, 2018, Stuart McGraw <smcg4191@mtneva.com> wrote:

But in my case I don't control the size of the input data

Not in production but you have an idea of both size and complexity and
should be able to generate performance test scenarios, and related
monitoring queries (system and service) to obtain some idea. The specifics
are beyond my experience but this is not brand new technology and people
have done similar stuff with it before.

And, as an extension to what you said, given such lack of control you are
going to want to monitor performance in production anyway even with an
assumed bullet-resistant solution.

David J.