database-level lockdown

Started by Filipe Pina almost 11 years ago · 18 messages · pgsql-general
#1 Filipe Pina
filipe.pina@impactzero.pt

[ original question posted in
http://stackoverflow.com/questions/30789279/django-postgresql-retry-transaction-last-try-must-go-in
for easier read ]

I have a Django+PostgreSQL application.
For data integrity, PostgreSQL is set up for serializable transactions,
so I'm retrying the transaction (in a generic class) as:

def _execute_wrapper(pair):
    obj, item = pair

    from time import sleep
    last_error = None

    for tt in xrange(obj._max_retry):
        try:
            obj.execute(item)
            return (None, tt)
        except (utils.OperationalError) as exc:
            last_error = exc
            obj.handle_error(exc)
            sleep(obj._retry_wait)
        except Exception as exc:
            obj.handle_error(exc)
            raise exc

    return (last_error, tt)
Default _max_retry is 5 and _retry_wait is 0.1s.

It will try 5 times to execute each instruction (in case of
OperationalError) and on the last one it will raise the last error it
received, aborting.

Now my problem is that aborting on the last try (on a restartable
error - OperationalError code 40001) is not an option... It simply
needs to get through, blocking whatever other processes and queries it
needs to.

So I've thought about using a generic lock (at the Django level, not in
PG) that every transaction would check, but that would be a problem once
the Django application runs in multiple instances (behind a load
balancer)...

So, I've thought of pg_advisory_lock() to have the lock at the database
level, but then I'd need some way to make sure Django checks that lock
on every database request (a select, an update, a function call, etc)..

Each request would need to get the lock and release it immediately (not
the transaction-scoped "xact" variant of the lock, which is released
only at the end of the transaction). Then the retry wrapper, on the 5th
try, would take the lock without releasing it immediately..

I'm open to any option, it's just that a transaction going on its 5th
try after an OperationalError really has to go in, no matter if it has
to fully lock down the database against any other transaction..
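
Something like this is what I have in mind on the psycopg2 side (just a
sketch; the lock key and the helper names are made up, not an existing
API):

LOCK_KEY = 42  # arbitrary application-wide advisory lock key

def touch_gate(conn):
    # every ordinary request, conn being an open psycopg2 connection:
    # block until the gate is free, then release immediately so other
    # sessions can pass
    with conn.cursor() as cur:
        cur.execute("SELECT pg_advisory_lock(%s)", (LOCK_KEY,))
        cur.execute("SELECT pg_advisory_unlock(%s)", (LOCK_KEY,))

def hold_gate(conn):
    # the 5th retry: take the session-level lock and keep it,
    # stalling every other request at touch_gate()
    with conn.cursor() as cur:
        cur.execute("SELECT pg_advisory_lock(%s)", (LOCK_KEY,))

def release_gate(conn):
    with conn.cursor() as cur:
        cur.execute("SELECT pg_advisory_unlock(%s)", (LOCK_KEY,))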

Thanks,

Filipe

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Filipe Pina (#1)
Re: database-level lockdown

Filipe Pina <filipe.pina@impactzero.pt> writes:

It will try 5 times to execute each instruction (in case of
OperationalError) and on the last one it will raise the last error it
received, aborting.

Now my problem is that aborting on the last try (on a restartable
error - OperationalError code 40001) is not an option... It simply
needs to get through, blocking whatever other processes and queries it
needs to.

I think you need to reconsider your objectives. What if two or more
transactions are repeatedly failing and retrying, perhaps because they
conflict? They can't all forcibly win.

regards, tom lane


#3 Filipe Pina
filipe.pina@impactzero.pt
In reply to: Tom Lane (#2)
Re: database-level lockdown

Exactly, that’s why there’s a limit on the retry number. On the last try I wanted something like a full lockdown to make sure the transaction will not fail due to serialization failure (if no other processes are touching the database, it can’t happen).

So if two transactions were retrying over and over, the first one to reach max_retries would activate that “global lock” making the other one wait and then the second one would also be able to successfully commit...

On 11/06/2015, at 20:27, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Filipe Pina <filipe.pina@impactzero.pt> writes:

It will try 5 times to execute each instruction (in case of
OperationalError) and on the last one it will raise the last error it
received, aborting.

Now my problem is that aborting on the last try (on a restartable
error - OperationalError code 40001) is not an option... It simply
needs to get through, blocking whatever other processes and queries it
needs to.

I think you need to reconsider your objectives. What if two or more
transactions are repeatedly failing and retrying, perhaps because they
conflict? They can't all forcibly win.

regards, tom lane


#4 Filipe Pina
filipe.pina@impactzero.pt
In reply to: Filipe Pina (#3)
Re: database-level lockdown

So, as database-level locks do not exist (as per
https://wiki.postgresql.org/wiki/Lock_database), I've thought of having
a function that would lock *every* table in the database (far from the
same thing, but it would probably work for me).

Something like:

CREATE OR REPLACE FUNCTION lockdown()
RETURNS void AS $$
DECLARE
    t information_schema.tables.table_name%TYPE;
BEGIN
    FOR t IN SELECT table_name
               FROM information_schema.tables
              WHERE table_schema = 'public'
                AND table_type = 'BASE TABLE'
    LOOP
        -- quote_ident() guards against names that need quoting
        EXECUTE 'LOCK TABLE ' || quote_ident(t);
    END LOOP;
END
$$ LANGUAGE plpgsql;

But the problem is that calling the function starts its own transaction
and once it returns, locks are removed..

Basically the workflow is (pseudo code coming from
Django/python/psycopg2/external, not within pgsql):

function generic_function_restarter(developer_function) {
    # try 4 times to execute developer function and if all of them fail
    # (due to SQLSTATE 40001 serialization failures),
    # lock database and execute one last time
    for i in [1..4] {
        try {
            call developer_function()
            return 'success'
        }
        except SQLSTATE_40001 {
            continue
        }
        except other_error {
            return other_error
        }
    }

    # only reaches here if all tries failed with SQLSTATE_40001
    try {
        START TRANSACTION
        call lockdown()
        call developer_function()
        COMMIT TRANSACTION
        return 'success'
    }
    except any_error {
        # implicit ROLLBACK
        return any_error
    }
}

So, my problem here is that "call lockdown()" will place the locks and
remove them upon returning... Is it possible to execute a function
without creating a subtransaction?

I could place the locks from the adapter directly at the outer
transaction level but I have the feeling that performance would be
worse...
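
Something like this from the adapter side, I suppose (just a sketch;
the table list has to come from somewhere and the names are made up):

def final_attempt(conn, developer_function, tables):
    # take the locks directly on the outer transaction so they
    # survive until commit/rollback instead of a function return
    with conn:  # psycopg2: commits on success, rolls back on error
        with conn.cursor() as cur:
            for t in tables:
                # table names can't be passed as bound parameters
                cur.execute('LOCK TABLE %s IN ACCESS EXCLUSIVE MODE' % t)
            developer_function(cur)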

Thanks,
Filipe

On Sex, Jun 12, 2015 at 5:25 , Filipe Pina <filipe.pina@impactzero.pt>
wrote:


Exactly, that’s why there’s a limit on the retry number. On the
last try I wanted something like a full lockdown to make sure the
transaction will not fail due to serialization failure (if no other
processes are touching the database, it can’t happen).

So if two transactions were retrying over and over, the first one to
reach max_retries would activate that “global lock” making the
other one wait and then the second one would also be able to
successfully commit...

On 11/06/2015, at 20:27, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Filipe Pina <filipe.pina@impactzero.pt> writes:

It will try 5 times to execute each instruction (in case of
OperationalError) and on the last one it will raise the last error it
received, aborting.

Now my problem is that aborting on the last try (on a restartable
error - OperationalError code 40001) is not an option... It simply
needs to get through, blocking whatever other processes and queries it
needs to.

I think you need to reconsider your objectives. What if two or more
transactions are repeatedly failing and retrying, perhaps because they
conflict? They can't all forcibly win.

regards, tom lane

#5 Melvin Davidson
melvin6925@gmail.com
In reply to: Filipe Pina (#4)
Re: database-level lockdown

Wouldn't it be easier just to prevent connections to the database
while your transaction is executed?
EG:
<Connect to your_database>
UPDATE pg_database
   SET datallowconn = FALSE
 WHERE datname = 'your_database';
START TRANSACTION;
<Do your_transaction>
COMMIT;
UPDATE pg_database
   SET datallowconn = TRUE
 WHERE datname = 'your_database';
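
Note that this only stops *new* connections; sessions already connected
would have to be terminated separately first, e.g.:

SELECT pg_terminate_backend(pid)
  FROM pg_stat_activity
 WHERE datname = 'your_database'
   AND pid <> pg_backend_pid();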

On Fri, Jul 3, 2015 at 1:25 PM, Filipe Pina <filipe.pina@impactzero.pt>
wrote:

So, as database-level locks do not exist (as per
https://wiki.postgresql.org/wiki/Lock_database), I've thought of having a
function that would lock *every* table in the database (far from the same
thing, but it would probably work for me).

Something like:

CREATE OR REPLACE FUNCTION lockdown()
RETURNS void AS $$
DECLARE
    t information_schema.tables.table_name%TYPE;
BEGIN
    FOR t IN SELECT table_name
               FROM information_schema.tables
              WHERE table_schema = 'public'
                AND table_type = 'BASE TABLE'
    LOOP
        -- quote_ident() guards against names that need quoting
        EXECUTE 'LOCK TABLE ' || quote_ident(t);
    END LOOP;
END
$$ LANGUAGE plpgsql;

But the problem is that calling the function starts its own transaction
and once it returns, locks are removed..

Basically the workflow is (pseudo code coming from
Django/python/psycopg2/external, not within pgsql):

function generic_function_restarter(developer_function) {
    # try 4 times to execute developer function and if all of them fail
    # (due to SQLSTATE 40001 serialization failures),
    # lock database and execute one last time
    for i in [1..4] {
        try {
            call developer_function()
            return 'success'
        }
        except SQLSTATE_40001 {
            continue
        }
        except other_error {
            return other_error
        }
    }

    # only reaches here if all tries failed with SQLSTATE_40001
    try {
        START TRANSACTION
        call lockdown()
        call developer_function()
        COMMIT TRANSACTION
        return 'success'
    }
    except any_error {
        # implicit ROLLBACK
        return any_error
    }
}

So, my problem here is that "call lockdown()" will place the locks and
remove them upon returning... Is it possible to execute a function without
creating a subtransaction?

I could place the locks from the adapter directly at the outer
transaction level but I have the feeling that performance would be worse...

Thanks,
Filipe

On Sex, Jun 12, 2015 at 5:25 , Filipe Pina <filipe.pina@impactzero.pt>
wrote:

Exactly, that’s why there’s a limit on the retry number. On the last try I
wanted something like a full lockdown to make sure the transaction will not
fail due to serialization failure (if no other processes are touching the
database, it can’t happen). So if two transactions were retrying over and
over, the first one to reach max_retries would activate that “global lock”
making the other one wait and then the second one would also be able to
successfully commit...

On 11/06/2015, at 20:27, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Filipe Pina <filipe.pina@impactzero.pt> writes:

It will try 5 times to execute each instruction (in case of
OperationalError) and on the last one it will raise the last error it
received, aborting.

Now my problem is that aborting on the last try (on a restartable error
- OperationalError code 40001) is not an option... It simply needs to get
through, blocking whatever other processes and queries it needs to.

I think you need to reconsider your objectives. What if two or more
transactions are repeatedly failing and retrying, perhaps because they
conflict? They can't all forcibly win.

regards, tom lane

--
*Melvin Davidson*
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

#6 Filipe Pina
filipe.pina@impactzero.pt
In reply to: Melvin Davidson (#5)
Re: database-level lockdown

Thanks for the suggestion. I read that some people do use that strategy for
maintenance sometimes but it's not feasible in this scenario.

I would have to disallow new connections AND kill all existing connections
(as there would be an existing connection pool), but this won't have the
same impact as using LOCKs..

Terminating all sessions will break every other transaction (except for the
one doing it). Locking the database will put all the others on hold.
As we're talking about quick/instant operations, being on hold will have an
impact on performance but won't cause anything to abort..

I really can't find any other solution for what I need (in short: make sure
no transactions are left out due to serialization failures)

On 03/07/2015, at 19:00, Melvin Davidson <melvin6925@gmail.com> wrote:

Wouldn't it be easier just to prevent connections to the database
while your transaction is executed?
EG:
<Connect to your_database>
UPDATE pg_database
   SET datallowconn = FALSE
 WHERE datname = 'your_database';
START TRANSACTION;
<Do your_transaction>
COMMIT;
UPDATE pg_database
   SET datallowconn = TRUE
 WHERE datname = 'your_database';

On Fri, Jul 3, 2015 at 1:25 PM, Filipe Pina <filipe.pina@impactzero.pt>
wrote:

So, as database-level locks do not exist (as per
https://wiki.postgresql.org/wiki/Lock_database), I've thought of having a
function that would lock *every* table in the database (far from the same
thing, but it would probably work for me).

Something like:

CREATE OR REPLACE FUNCTION lockdown()
RETURNS void AS $$
DECLARE
    t information_schema.tables.table_name%TYPE;
BEGIN
    FOR t IN SELECT table_name
               FROM information_schema.tables
              WHERE table_schema = 'public'
                AND table_type = 'BASE TABLE'
    LOOP
        -- quote_ident() guards against names that need quoting
        EXECUTE 'LOCK TABLE ' || quote_ident(t);
    END LOOP;
END
$$ LANGUAGE plpgsql;

But the problem is that calling the function starts its own transaction
and once it returns, locks are removed..

Basically the workflow is (pseudo code coming from
Django/python/psycopg2/external, not within pgsql):

function generic_function_restarter(developer_function) {
    # try 4 times to execute developer function and if all of them fail
    # (due to SQLSTATE 40001 serialization failures),
    # lock database and execute one last time
    for i in [1..4] {
        try {
            call developer_function()
            return 'success'
        }
        except SQLSTATE_40001 {
            continue
        }
        except other_error {
            return other_error
        }
    }

    # only reaches here if all tries failed with SQLSTATE_40001
    try {
        START TRANSACTION
        call lockdown()
        call developer_function()
        COMMIT TRANSACTION
        return 'success'
    }
    except any_error {
        # implicit ROLLBACK
        return any_error
    }
}

So, my problem here is that "call lockdown()" will place the locks and
remove them upon returning... Is it possible to execute a function without
creating a subtransaction?

I could place the locks from the adapter directly at the outer
transaction level but I have the feeling that performance would be worse...

Thanks,
Filipe

On Sex, Jun 12, 2015 at 5:25 , Filipe Pina <filipe.pina@impactzero.pt>
wrote:

Exactly, that’s why there’s a limit on the retry number. On the last try I
wanted something like a full lockdown to make sure the transaction will not
fail due to serialization failure (if no other processes are touching the
database, it can’t happen). So if two transactions were retrying over and
over, the first one to reach max_retries would activate that “global lock”
making the other one wait and then the second one would also be able to
successfully commit...

On 11/06/2015, at 20:27, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Filipe Pina <filipe.pina@impactzero.pt> writes:

It will try 5 times to execute each instruction (in case of
OperationalError) and on the last one it will raise the last error it
received, aborting.

Now my problem is that aborting on the last try (on a restartable error
- OperationalError code 40001) is not an option... It simply needs to get
through, blocking whatever other processes and queries it needs to.

I think you need to reconsider your objectives. What if two or more
transactions are repeatedly failing and retrying, perhaps because they
conflict? They can't all forcibly win.

regards, tom lane

--
*Melvin Davidson*
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

#7 John R Pierce
pierce@hogranch.com
In reply to: Filipe Pina (#6)
Re: database-level lockdown

On 7/4/2015 10:49 AM, Filipe Pina wrote:

Terminating all sessions will break every other transaction (except
for the one doing it). Locking the database will put all the others on
hold. As we're talking about quick/instant operations, being on hold
will have an impact on performance but won't cause anything to abort..

you can't lock something that's already in use by a transaction, as
that transaction already has locks on it.

better would be to design your workflow so it all can be done
transactionally and get away from this batch processing model.

--
john r pierce, recycling bits in santa cruz


#8 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Filipe Pina (#6)
Re: database-level lockdown

On 07/04/2015 10:49 AM, Filipe Pina wrote:

Thanks for the suggestion. I read that some people do use that strategy
for maintenance sometimes but it's not feasible in this scenario.

I would have to disallow new connections AND kill all existing
connections (as there would be an existing connection pool), but this
won't have the same impact as using LOCKs..

Terminating all sessions will break every other transaction (except for
the one doing it). Locking the database will put all the others on hold.
As we're talking about quick/instant operations, being on hold will have
an impact on performance but won't cause anything to abort..

I really can't find any other solution for what I need (in short: make
sure no transactions are left out due to serialization failures)

Which would seem to indicate you have painted yourself into a corner.
The idea of locking an entire database to get one transaction to commit
seems a little extreme to me.

What is this transaction trying to do and why is it necessary that it
commit at all costs?

On 03/07/2015, at 19:00, Melvin Davidson <melvin6925@gmail.com> wrote:

--
Adrian Klaver
adrian.klaver@aklaver.com


#9 Greg Sabino Mullane
greg@turnstep.com
In reply to: Filipe Pina (#6)
Re: database-level lockdown


Filipe Pina said:

I really can't find any other solution for what I need (in short: make sure
no transactions are left out due to serialization failures)

I think you may have been too quick to rule out advisory locks as a solution.
Yes, you will need wrappers around all other calls, but extraordinary problems
call for extraordinary solutions.

I could place the locks from the adapter directly at the outer transaction
level but I have the feeling that performance would be worse...

Well, performance has really got to take a back seat, given your other
requirements. ;) Locking certainly *could* work - and is arguably the best
solution, as that's what locks are for. Just have your Very Important Transaction
retry, and upon reaching that critical number, exclusively lock just the tables
being used, then try again. If you don't know which tables are being used,
I suggest storing that somewhere your class can find it, or moving away
from such a generic class.

There are other solutions (e.g. forcing conflicting processes to quit
and sleep a second), but the locking one seems the easiest.

--
Greg Sabino Mullane greg@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201507050943
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8


#10 Filipe Pina
filipe.pina@impactzero.pt
In reply to: Adrian Klaver (#8)
Re: database-level lockdown

It's not necessary to commit at all costs, it can fail, just not due to
serialization..

And the transaction can be something as simple as updating a field or
inserting a record (with foreign keys, which is one of the serialization
checks).

On Sáb, Jul 4, 2015 at 7:23 , Adrian Klaver
<adrian.klaver@aklaver.com> wrote:


On 07/04/2015 10:49 AM, Filipe Pina wrote:

Thanks for the suggestion. I read that some people do use that strategy
for maintenance sometimes but it's not feasible in this scenario.

I would have to disallow new connections AND kill all existing
connections (as there would be an existing connection pool), but this
won't have the same impact as using LOCKs..

Terminating all sessions will break every other transaction (except for
the one doing it). Locking the database will put all the others on hold.
As we're talking about quick/instant operations, being on hold will have
an impact on performance but won't cause anything to abort..

I really can't find any other solution for what I need (in short: make
sure no transactions are left out due to serialization failures)

Which would seem to indicate you have painted yourself into a corner.
The idea of locking an entire database to get one transaction to
commit seems a little extreme to me.

What is this transaction trying to do and why is it necessary that it
commit at all costs?

On 03/07/2015, at 19:00, Melvin Davidson <melvin6925@gmail.com> wrote:

--
Adrian Klaver
adrian.klaver@aklaver.com

#11 Filipe Pina
filipe.pina@impactzero.pt
In reply to: Greg Sabino Mullane (#9)
Re: database-level lockdown

Yes, I've tried to come up with a guideline to enumerate the tables
used in each process, but it's not simple because it's a python
application calling pgsql functions that use other functions, so it's
tricky for a developer re-using existing functions to enumerate the
tables used by those. Even if everything is well documented and can be
re-used, it seems like a nasty task...

For now, I'm locking all tables to be able to close the gap, but I'm also
wondering if I could do it in a pgsql function as I mentioned in the
question:

FUNCTION A
-> FUNCTION B
----> lock TABLE
-> FUNCTION C
----> TABLE is not locked anymore because function B frees it as soon
as it returns

Is there some way to have a function that locks some tables on the
"outer" transaction instead of its own subtransaction?

On Seg, Jul 6, 2015 at 3:08 , Filipe Pina <filipe.pina@impactzero.pt>
wrote:


On Dom, Jul 5, 2015 at 2:50 , Greg Sabino Mullane <greg@turnstep.com>
wrote:


Filipe Pina said:

I really can't find any other solution for what I need (in short: make sure
no transactions are left out due to serialization failures)

I think you may have been too quick to rule out advisory locks as a solution.
Yes, you will need wrappers around all other calls, but extraordinary problems
call for extraordinary solutions.

I could place the locks from the adapter directly at the outer transaction
level but I have the feeling that performance would be worse...

Well, performance has really got to take a back seat, given your other
requirements. ;) Locking certainly *could* work - and is arguably the best
solution, as that's what locks are for. Just have your Very Important Transaction
retry, and upon reaching that critical number, exclusively lock just the tables
being used, then try again. If you don't know which tables are being used,
I suggest storing that somewhere your class can find it, or moving away
from such a generic class.

There are other solutions (e.g. forcing conflicting processes to quit
and sleep a second), but the locking one seems the easiest.

--
Greg Sabino Mullane greg@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201507050943
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8

#12 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Filipe Pina (#10)
Re: database-level lockdown

On 07/06/2015 07:10 AM, Filipe Pina wrote:

It's not necessary to commit at all costs, it can fail, just not due to
serialization..

And the transaction can be something as simple as updating a field or
inserting a record (with foreign keys, which is one of the serialization
checks).

Not following, why throw serialization at a FK?

On Sáb, Jul 4, 2015 at 7:23 , Adrian Klaver <adrian.klaver@aklaver.com>
wrote:

On 07/04/2015 10:49 AM, Filipe Pina wrote:

Thanks for the suggestion. I read that some people do use that
strategy for maintenance sometimes but it's not feasible in this
scenario. I would have to disallow new connections AND kill all
existing connections (as there would be an existing connection
pool), but this won't have the same impact as using LOCKs..
Terminating all sessions will break every other transaction
(except for the one doing it). Locking the database will put all
the others on hold. As we're talking about quick/instant
operations, being on hold will have an impact on performance but
won't cause anything to abort.. I really can't find any other
solution for what I need (in short: make sure no transactions are
left out due to serialization failures)

Which would seem to indicate you have painted yourself into a corner.
The idea of locking an entire database to get one transaction to
commit seems a little extreme to me. What is this transaction trying
to do and why is it necessary that it commit at all costs?

On 03/07/2015, at 19:00, Melvin Davidson <melvin6925@gmail.com> wrote:

--
Adrian Klaver adrian.klaver@aklaver.com

--
Adrian Klaver
adrian.klaver@aklaver.com


#13 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Filipe Pina (#11)
Re: database-level lockdown

On 07/06/2015 07:15 AM, Filipe Pina wrote:

Yes, I've tried to come up with a guideline to enumerate the tables
used in each process, but it's not simple because it's a python
application calling pgsql functions that use other functions, so it's
tricky for a developer re-using existing functions to enumerate the
tables used by those. Even if everything is well documented and can be
re-used, it seems like a nasty task...

Still not sure what it is you are trying to accomplish.

Is it really necessary that every transaction be serialized?

Or to put it another way, why are you running in serializable by default?

Or yet another way, what is the problem you are trying to solve with
serialized transactions?

For now, I'm locking all tables to be able to close the gap, but I'm also
wondering if I could do it in a pgsql function as I mentioned in the
question:

FUNCTION A
-> FUNCTION B
----> lock TABLE
-> FUNCTION C
----> TABLE is not locked anymore because function B frees it as soon as
it returns

Is there some way to have a function that locks some tables on the
"outer" transaction instead of its own subtransaction?

--
Adrian Klaver
adrian.klaver@aklaver.com


#14 Filipe Pina
filipe.pina@impactzero.pt
In reply to: Adrian Klaver (#13)
Re: database-level lockdown

Exactly, that's the twist I've decided on after some tests yesterday
(the "lock all tables on last try" degraded performance in an obscene
way): giving up on serializable by default.

I wanted to use serializable so developers wouldn't have to worry about
properly using locks, but implementing this "serialization failure"
handler is becoming a major headache...

Thanks guys!

On Ter, Jul 7, 2015 at 1:41 , Adrian Klaver <adrian.klaver@aklaver.com>
wrote:


On 07/06/2015 07:15 AM, Filipe Pina wrote:

Yes, I've tried to come up with a guideline to enumerate the tables
used in each process, but it's not simple because it's a python
application calling pgsql functions that use other functions, so it's
tricky for a developer re-using existing functions to enumerate the
tables used by those. Even if everything is well documented and can be
re-used, it seems like a nasty task...

Still not sure what it is you are trying to accomplish.

Is it really necessary that every transaction be serialized?

Or to put it another way, why are you running in serializable by
default?

Or yet another way, what is the problem you are trying to solve with
serialized transactions?

For now, I'm locking all tables to be able to close the gap, but I'm
also wondering if I could do it in a pgsql function as I mentioned in
the question:

FUNCTION A
-> FUNCTION B
----> lock TABLE
-> FUNCTION C
----> TABLE is not locked anymore because function B frees it as
soon as
it returns

Is there some way to have a function that locks some tables on the
"outer" transaction instead of its own subtransaction?

--
Adrian Klaver
adrian.klaver@aklaver.com

#15 Alban Hertroys
haramrae@gmail.com
In reply to: Filipe Pina (#14)
Re: database-level lockdown

On 7 July 2015 at 12:55, Filipe Pina <filipe.pina@impactzero.pt> wrote:

On Ter, Jul 7, 2015 at 1:41 , Adrian Klaver <adrian.klaver@aklaver.com>
wrote:

Still not sure what it is you are trying to accomplish. Is it really
necessary that every transaction be serialized? Or to put it another way,
why are you running in serializable by default? Or yet another way, what
is the problem you are trying to solve with serialized transactions?

Exactly, that's the twist I've decided on after some tests yesterday (the
"lock all tables on last try" degraded performance in an obscene way):
giving up on serializable by default.

I wanted to use serializable so developers wouldn't have to worry about
properly using locks, but implementing this "serialization failure" handler
is becoming a major headache...

What Adrian was trying to get out of you is why you think you need those locks.

You're working with an RDBMS; it does the job of keeping data
integrity in a multi-user environment already. You can trust it to do
that well.
So unless you're doing something really special, you shouldn't need to
lock anything.
Among the special cases where you do need locks are things like
needing a gapless sequence (invoice numbers come to mind); things like
that.

So once again, what do you need those locks for?

P.S. Please don't top-post on this list. I had to edit your message to
fix the flow of the conversation.

--
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.


#16 Filipe Pina
filipe.pina@impactzero.pt
In reply to: Alban Hertroys (#15)
Re: database-level lockdown

On Ter, Jul 7, 2015 at 2:31 , Alban Hertroys <haramrae@gmail.com> wrote:

On 7 July 2015 at 12:55, Filipe Pina <filipe.pina@impactzero.pt>
wrote:

On Ter, Jul 7, 2015 at 1:41 , Adrian Klaver
<adrian.klaver@aklaver.com>
wrote:

Still not sure what it is you are trying to accomplish. Is it really
necessary that every transaction be serialized? Or to put it another
way, why are you running in serializable by default? Or yet another
way, what is the problem you are trying to solve with serialized
transactions?

Exactly, that's the twist I've decided on after some tests yesterday
(the "lock all tables on last try" degraded performance in an obscene
way): giving up on serializable by default.

I wanted to use serializable so developers wouldn't have to worry about
properly using locks, but implementing this "serialization failure"
handler is becoming a major headache...

What Adrian was trying to get out of you is why you think you need
those locks.

You're working with an RDBMS; it does the job of keeping data
integrity in a multi-user environment already. You can trust it to do
that well.
So unless you're doing something really special, you shouldn't need to
lock anything.
Among the special cases where you do need locks are things like
needing a gapless sequence (invoice numbers come to mind); things like
that.

So once again, what do you need those locks for?

P.S. Please don't top-post on this list. I had to edit your message to
fix the flow of the conversation.

--
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.

It probably wasn't clear but the conversation did start with exposing
my issue, I'll try to rephrase.

Serializable keeps the data integrity of the database, that's true. But
it does that by "refusing" transactions that would compromise it (with
error 40001).

I need to make sure no transactions fail due to such errors, so I made
a handler to retry transactions if they fail due to 40001.
But I can't leave it retrying forever, so I have this (configurable)
limit of 5 retries.
5 retries might be enough in some cases, but it might not be enough in
others.

So how do I make sure the last try will not fail with a serialization
error?
I could only come up with the idea of using LOCK DATABASE (on that last
try) but as it is not available in PG, I went for "lock all tables".

Does it make sense now?
I was probably taking the wrong approach to the problem anyway; I've
gone with "read committed" mode now and normal locking (SELECT .. FOR
UPDATE on the required spots) and it works better..
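
On the SQL side the pattern is now simply something like (table and
column made up for the example):

BEGIN;  -- default "read committed" isolation
SELECT balance FROM accounts WHERE id = 42 FOR UPDATE;  -- row lock
UPDATE accounts SET balance = balance - 10 WHERE id = 42;
COMMIT;
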
I just come from a GTM background (as do all the other developers on
the project) and we're used to the TP RESTART concept where, on the
final retry, GTM locks the database region(s) that were used by that
transaction (not the whole DB, just the regions).

Thanks and sorry for the top-posting, not really used to mailing lists
and proper posting styles..

#17 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Filipe Pina (#16)
Re: database-level lockdown

On 07/07/2015 06:44 AM, Filipe Pina wrote:

On Ter, Jul 7, 2015 at 2:31 , Alban Hertroys <haramrae@gmail.com> wrote:

On 7 July 2015 at 12:55, Filipe Pina <filipe.pina@impactzero.pt> wrote:

On Ter, Jul 7, 2015 at 1:41 , Adrian Klaver
<adrian.klaver@aklaver.com> wrote:

Still not sure what it is you are trying to accomplish. Is it
really necessary that every transaction be serialized? Or to
put it another way, why are you running in serializable by
default? Or yet another way, what is the problem you are
trying to solve with serialized transactions?

Exactly, that's the twist I've decided on after some tests yesterday
(the "lock all tables on last try" degraded performance in an
obscene way): giving up on serializable by default. I wanted to
use serializable so developers wouldn't have to worry about
properly using locks, but implementing this "serialization
failure" handler is becoming a major headache...

What Adrian was trying to get out of you is why you think you need
those locks. You're working with an RDBMS; it does the job of keeping
data integrity in a multi-user environment already. You can trust it
to do that well. So unless you're doing something really special, you
shouldn't need to lock anything. Among the special cases where you do
need locks are things like needing a gapless sequence (invoice numbers
come to mind); things like that. So once again, what do you need those
locks for? P.S. Please don't top-post on this list. I had to edit your
message to fix the flow of the conversation.
--
If you can't see the forest for the trees, Cut the trees and you'll
see there is no forest.

It probably wasn't clear but the conversation did start with exposing my
issue, I'll try to rephrase.

Serializable keeps the data integrity of the database, that's true. But
it does that by "refusing" transactions that would compromise it (with
error 40001).

I need to make sure no transactions fail due to such errors, so I made
a handler to retry transactions if they fail due to 40001.
But I can't leave it retrying forever, so I have this (configurable)
limit of 5 retries.
5 retries might be enough in some cases, but it might not be enough in
others.

So how do I make sure the last try will not fail with a serialization error?
I could only come up with the idea of using LOCK DATABASE (on that last
try) but as it is not available in PG, I went for "lock all tables".

Does it make sense now?
I was probably taking the wrong approach to the problem anyway; I've
gone with "read committed" mode now and normal locking (SELECT .. FOR
UPDATE on the required spots) and it works better..

You also mentioned that this is for a Django application; have you looked
at its new (since 1.6) transaction management?:

https://docs.djangoproject.com/en/1.8/topics/db/transactions/

It allows you fairly fine-grained control over transactions, including
access to savepoints.
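
For example (a minimal sketch of the documented atomic() API; the
business helpers are made up):

from django.db import transaction

def transfer(from_id, to_id, amount):
    with transaction.atomic():        # outer transaction
        debit(from_id, amount)        # hypothetical helper
        with transaction.atomic():    # nested block becomes a savepoint
            credit(to_id, amount)     # hypothetical helper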

I just come from a GTM background (as do all the other developers on
the project) and we're used to the TP RESTART concept where, on the
final retry, GTM locks the database region(s) that were used by that
transaction (not the whole DB, just the regions).

Thanks and sorry for the top-posting, not really used to mailing lists
and proper posting styles..

--
Adrian Klaver
adrian.klaver@aklaver.com


#18 Filipe Pina
filipe.pina@impactzero.pt
In reply to: Adrian Klaver (#17)
Re: database-level lockdown

On Ter, Jul 7, 2015 at 2:51 , Adrian Klaver <adrian.klaver@aklaver.com>
wrote:

On 07/07/2015 06:44 AM, Filipe Pina wrote:

On Ter, Jul 7, 2015 at 2:31 , Alban Hertroys <haramrae@gmail.com>
wrote:

On 7 July 2015 at 12:55, Filipe Pina <filipe.pina@impactzero.pt>
wrote:

On Ter, Jul 7, 2015 at 1:41 , Adrian Klaver
<adrian.klaver@aklaver.com> wrote:

Still not sure what it is you are trying to accomplish. Is it
really necessary that every transaction be serialized? Or to
put it another way, why are you running in serializable by
default? Or yet another way, what is the problem you are
trying to solve with serialized transactions?

Exactly, that's the twist I've decided on after some tests yesterday
(the "lock all tables on last try" degraded performance in an
obscene way): giving up on serializable by default. I wanted to
use serializable so developers wouldn't have to worry about
properly using locks, but implementing this "serialization
failure" handler is becoming a major headache...

What Adrian was trying to get out of you is why you think you need
those locks. You're working with an RDBMS; it does the job of keeping
data integrity in a multi-user environment already. You can trust it
to do that well. So unless you're doing something really special, you
shouldn't need to lock anything. Among the special cases where you do
need locks are things like needing a gapless sequence (invoice numbers
come to mind); things like that. So once again, what do you need those
locks for? P.S. Please don't top-post on this list. I had to edit your
message to fix the flow of the conversation.
--
If you can't see the forest for the trees, Cut the trees and you'll
see there is no forest.

It probably wasn't clear but the conversation did start with exposing
my issue, I'll try to rephrase.

Serializable keeps the data integrity of the database, that's true. But
it does that by "refusing" transactions that would compromise it (with
error 40001).

I need to make sure no transactions fail due to such errors, so I made
a handler to retry transactions if they fail due to 40001.
But I can't leave it retrying forever, so I have this (configurable)
limit of 5 retries.
5 retries might be enough in some cases, but it might not be enough in
others.

So how do I make sure the last try will not fail with a serialization
error?
I could only come up with the idea of using LOCK DATABASE (on that last
try) but as it is not available in PG, I went for "lock all tables".

Does it make sense now?
I was probably taking the wrong approach to the problem anyway; I've
gone with "read committed" mode now and normal locking (SELECT .. FOR
UPDATE on the required spots) and it works better..

You also mentioned that this is for a Django application; have you
looked at its new (since 1.6) transaction management?:

https://docs.djangoproject.com/en/1.8/topics/db/transactions/

It allows you fairly fine-grained control over transactions,
including access to savepoints.

I just come from a GTM background (as do all the other developers on
the project) and we're used to the TP RESTART concept where, on the
final retry, GTM locks the database region(s) that were used by that
transaction (not the whole DB, just the regions).

Thanks and sorry for the top-posting, not really used to mailing
lists
and proper posting styles..

--
Adrian Klaver
adrian.klaver@aklaver.com

Yes, I was using transaction.atomic in some django-only operations, but
the problem is that some django operations call PG functions that call
other PG functions, and the purpose was to have the django code
"agnostic" of what the functions do..

But we've switched to "read committed" plus proper "FOR UPDATE"
statements in the few functions we have at the moment and the
performance is actually quite a bit better (at the small extra cost of
people having to be careful with locks).
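
For the record, the app-side pattern now looks roughly like this (model
and field names made up for the example):

from django.db import transaction

def withdraw(account_id, amount):
    with transaction.atomic():
        # select_for_update() issues SELECT ... FOR UPDATE, locking
        # just this row under "read committed"
        acc = Account.objects.select_for_update().get(pk=account_id)
        acc.balance -= amount
        acc.save()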