[PATCH] lock_timeout and common SIGALRM framework
Hi,
attached is a patch to implement a framework to simplify and
correctly nest multiplexing more than two timeout sources
into the same SIGALRM signal handler.
The framework is using a new internal API for timeouts:
bool enable_timeout(TimeoutName tn, int delayms);
bool disable_timeout(TimeoutName tn, bool keep_indicator);
bool disable_all_timeouts(bool keep_indicators);
A timeout source has these features to allow different initialization,
cleanup and check functions and rescheduling:
typedef void (*timeout_init)(TimestampTz, TimestampTz);
typedef void (*timeout_destroy)(bool);
typedef bool (*timeout_check)(void);
typedef TimestampTz (*timeout_start)(void);
typedef struct {
TimeoutName index;
bool resched_next;
timeout_init timeout_init;
timeout_destroy timeout_destroy;
timeout_check timeout_check;
timeout_start timeout_start;
TimestampTz fin_time;
} timeout_params;
This makes it possible to differentiate between the standby and
statement timeouts, regular deadlock and standby deadlock using
the same signal handler function.
And finally, this makes it possible to implement the lock_timeout
feature that we at Cybertec implemented more than 2 years ago.
The patch also adds two new tests into prepared_xacts.sql to trigger
the lock_timeout instead of statement_timeout.
Documentation and extensive comments are included.
Best regards,
Zoltán Böszörményi
--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig& Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/
Attachments:
pg92-common-lock-framework-lock_timeout.patchtext/x-patch; name=pg92-common-lock-framework-lock_timeout.patchDownload+1175-658
Hi,
2012-04-04 12:30 keltezéssel, Boszormenyi Zoltan írta:
Hi,
attached is a patch to implement a framework to simplify and
correctly nest multiplexing more than two timeout sources
into the same SIGALRM signal handler.The framework is using a new internal API for timeouts:
bool enable_timeout(TimeoutName tn, int delayms);
bool disable_timeout(TimeoutName tn, bool keep_indicator);
bool disable_all_timeouts(bool keep_indicators);A timeout source has these features to allow different initialization,
cleanup and check functions and rescheduling:typedef void (*timeout_init)(TimestampTz, TimestampTz);
typedef void (*timeout_destroy)(bool);
typedef bool (*timeout_check)(void);
typedef TimestampTz (*timeout_start)(void);typedef struct {
TimeoutName index;
bool resched_next;
timeout_init timeout_init;
timeout_destroy timeout_destroy;
timeout_check timeout_check;
timeout_start timeout_start;
TimestampTz fin_time;
} timeout_params;This makes it possible to differentiate between the standby and
statement timeouts, regular deadlock and standby deadlock using
the same signal handler function.And finally, this makes it possible to implement the lock_timeout
feature that we at Cybertec implemented more than 2 years ago.The patch also adds two new tests into prepared_xacts.sql to trigger
the lock_timeout instead of statement_timeout.Documentation and extensive comments are included.
Second version. Every timeout-related functions are now in a separate
source file, src/backend/storage/timeout.c with accessor functions.
There are no related global variables anymore, only the GUCs.
Best regards,
Zoltán Böszörményi
--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig& Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/
Attachments:
pg92-common-lock-framework-lock_timeout-v2.patchtext/x-patch; name=pg92-common-lock-framework-lock_timeout-v2.patchDownload+1307-838
2012-04-04 15:17 keltezéssel, Boszormenyi Zoltan írta:
Hi,
2012-04-04 12:30 keltezéssel, Boszormenyi Zoltan írta:
Hi,
attached is a patch to implement a framework to simplify and
correctly nest multiplexing more than two timeout sources
into the same SIGALRM signal handler.The framework is using a new internal API for timeouts:
bool enable_timeout(TimeoutName tn, int delayms);
bool disable_timeout(TimeoutName tn, bool keep_indicator);
bool disable_all_timeouts(bool keep_indicators);A timeout source has these features to allow different initialization,
cleanup and check functions and rescheduling:typedef void (*timeout_init)(TimestampTz, TimestampTz);
typedef void (*timeout_destroy)(bool);
typedef bool (*timeout_check)(void);
typedef TimestampTz (*timeout_start)(void);typedef struct {
TimeoutName index;
bool resched_next;
timeout_init timeout_init;
timeout_destroy timeout_destroy;
timeout_check timeout_check;
timeout_start timeout_start;
TimestampTz fin_time;
} timeout_params;This makes it possible to differentiate between the standby and
statement timeouts, regular deadlock and standby deadlock using
the same signal handler function.And finally, this makes it possible to implement the lock_timeout
feature that we at Cybertec implemented more than 2 years ago.The patch also adds two new tests into prepared_xacts.sql to trigger
the lock_timeout instead of statement_timeout.Documentation and extensive comments are included.
Second version. Every timeout-related functions are now in a separate
source file, src/backend/storage/timeout.c with accessor functions.
There are no related global variables anymore, only the GUCs.
3rd and (for now) final version. Tidied comments, the time checks in
Check*() functions and function order in timeout.c.
Best regards,
Zoltán Böszörményi
--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig& Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/
Attachments:
pg92-common-lock-framework-lock_timeout-v3.patchtext/x-patch; name=pg92-common-lock-framework-lock_timeout-v3.patchDownload+1302-837
2012-04-04 16:22 keltezéssel, Boszormenyi Zoltan írta:
2012-04-04 15:17 keltezéssel, Boszormenyi Zoltan írta:
Hi,
2012-04-04 12:30 keltezéssel, Boszormenyi Zoltan írta:
Hi,
attached is a patch to implement a framework to simplify and
correctly nest multiplexing more than two timeout sources
into the same SIGALRM signal handler.The framework is using a new internal API for timeouts:
bool enable_timeout(TimeoutName tn, int delayms);
bool disable_timeout(TimeoutName tn, bool keep_indicator);
bool disable_all_timeouts(bool keep_indicators);A timeout source has these features to allow different initialization,
cleanup and check functions and rescheduling:typedef void (*timeout_init)(TimestampTz, TimestampTz);
typedef void (*timeout_destroy)(bool);
typedef bool (*timeout_check)(void);
typedef TimestampTz (*timeout_start)(void);typedef struct {
TimeoutName index;
bool resched_next;
timeout_init timeout_init;
timeout_destroy timeout_destroy;
timeout_check timeout_check;
timeout_start timeout_start;
TimestampTz fin_time;
} timeout_params;This makes it possible to differentiate between the standby and
statement timeouts, regular deadlock and standby deadlock using
the same signal handler function.And finally, this makes it possible to implement the lock_timeout
feature that we at Cybertec implemented more than 2 years ago.The patch also adds two new tests into prepared_xacts.sql to trigger
the lock_timeout instead of statement_timeout.Documentation and extensive comments are included.
Second version. Every timeout-related functions are now in a separate
source file, src/backend/storage/timeout.c with accessor functions.
There are no related global variables anymore, only the GUCs.3rd and (for now) final version.
I lied. This is the final one. I fixed a typo in the documentation
and replaced timeout_start_time (previously static to proc.c)
with get_timeout_start(DEADLOCK_TIMEOUT). This one should
have happened in the second version.
Tidied comments, the time checks in
Check*() functions and function order in timeout.c.Best regards,
Zoltán Böszörményi--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig& Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web:http://www.postgresql-support.de
http://www.postgresql.at/
--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig& Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/
Attachments:
pg92-common-lock-framework-lock_timeout-v4.patchtext/x-patch; name=pg92-common-lock-framework-lock_timeout-v4.patchDownload+1313-843
2012-04-04 17:12 keltezéssel, Boszormenyi Zoltan írta:
2012-04-04 16:22 keltezéssel, Boszormenyi Zoltan írta:
2012-04-04 15:17 keltezéssel, Boszormenyi Zoltan írta:
Hi,
2012-04-04 12:30 keltezéssel, Boszormenyi Zoltan írta:
Hi,
attached is a patch to implement a framework to simplify and
correctly nest multiplexing more than two timeout sources
into the same SIGALRM signal handler.The framework is using a new internal API for timeouts:
bool enable_timeout(TimeoutName tn, int delayms);
bool disable_timeout(TimeoutName tn, bool keep_indicator);
bool disable_all_timeouts(bool keep_indicators);A timeout source has these features to allow different initialization,
cleanup and check functions and rescheduling:typedef void (*timeout_init)(TimestampTz, TimestampTz);
typedef void (*timeout_destroy)(bool);
typedef bool (*timeout_check)(void);
typedef TimestampTz (*timeout_start)(void);typedef struct {
TimeoutName index;
bool resched_next;
timeout_init timeout_init;
timeout_destroy timeout_destroy;
timeout_check timeout_check;
timeout_start timeout_start;
TimestampTz fin_time;
} timeout_params;This makes it possible to differentiate between the standby and
statement timeouts, regular deadlock and standby deadlock using
the same signal handler function.And finally, this makes it possible to implement the lock_timeout
feature that we at Cybertec implemented more than 2 years ago.The patch also adds two new tests into prepared_xacts.sql to trigger
the lock_timeout instead of statement_timeout.Documentation and extensive comments are included.
Second version. Every timeout-related functions are now in a separate
source file, src/backend/storage/timeout.c with accessor functions.
There are no related global variables anymore, only the GUCs.3rd and (for now) final version.
I lied. This is the final one. I fixed a typo in the documentation
and replaced timeout_start_time (previously static to proc.c)
with get_timeout_start(DEADLOCK_TIMEOUT). This one should
have happened in the second version.Tidied comments, the time checks in
Check*() functions and function order in timeout.c.Best regards,
Zoltán Böszörményi
One comment for testers: all the timeout GUC values are given in
milliseconds, the kernel interface (setitimer) and TimestampTz uses
microseconds.
The transaction that locks is inherently a read/write one and by the time
the code reaches ProcSleep(), at least a few tens of microseconds has
already passed since the start of the statement even on 3GHz+ CPUs.
Not to mention that computers nowadays have high precision timers
and OSs running on them utilitize those. So, the time returned by
GetCurrentStatementStartTimestamp() will certainly be different from
GetCurrentTimestamp(). This means that the timeouts' fin_time will also
be different.
Long story short, using the same value for statement_timeout and
lock_timeout (or deadlock_timeout for that matter) means that
statement_timeout will trigger first. The story is different only on
a combination of a fast CPU and an OS with greater-then-millisecond
timer resolution.
Best regards,
Zoltán Böszörményi
--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig& Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/
I think this patch is doing two things: first touching infrastructure
stuff and then adding lock_timeout on top of that. Would it work to
split the patch in two pieces?
--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
2012-04-04 21:30 keltezéssel, Alvaro Herrera írta:
I think this patch is doing two things: first touching infrastructure
stuff and then adding lock_timeout on top of that. Would it work to
split the patch in two pieces?
Sure. Attached is the split version.
Best regards,
Zoltán Böszörményi
--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig& Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/
On 05/04/12 08:02, Boszormenyi Zoltan wrote:
2012-04-04 21:30 keltezéssel, Alvaro Herrera írta:
I think this patch is doing two things: first touching infrastructure
stuff and then adding lock_timeout on top of that. Would it work to
split the patch in two pieces?Sure. Attached is the split version.
Best regards,
Zoltán Böszörményi
Hi,
I've started looking at and testing both patches.
Technically speaking, I think the source looks much better than the
first version of lock timeout, and may help adding other timeouts in the
future. I haven't tested it in depth though, because I encountered the
following problem:
While testing the patch, I found a way to crash PG. But what's weird is
that it crashes also with an unpatched git version.
Here is the way to reproduce it (I have done it with a pgbench schema):
- Set a small statement_timeout (just to save time during the tests)
Session1:
=#BEGIN;
=#lock TABLE pgbench_accounts ;
Session 2:
=#BEGIN;
=#lock TABLE pgbench_accounts ;
ERROR: canceling statement due to statement timeout
=# lock TABLE pgbench_accounts ;
I'm using \set ON_ERROR_ROLLBACK INTERACTIVE by the way. It can also be
done with a rollback to savepoint of course.
Session 2 crashes with this : TRAP : FailedAssertion(«
!(locallock->holdsStrongLockCount == 0) », fichier : « lock.c », ligne :
749).
It can also be done without a statement_timeout, and a control-C on the
second lock table.
I didn't touch anything but this. It occurs everytime, when asserts are
activated.
I tried it on 9.1.3, and I couldn't make it crash with the same sequence
of events. So maybe it's something introduced since ? Or is the assert
still valid ?
Cheers
2012-04-06 14:47 keltezéssel, Cousin Marc írta:
On 05/04/12 08:02, Boszormenyi Zoltan wrote:
2012-04-04 21:30 keltezéssel, Alvaro Herrera írta:
I think this patch is doing two things: first touching infrastructure
stuff and then adding lock_timeout on top of that. Would it work to
split the patch in two pieces?Sure. Attached is the split version.
Best regards,
Zoltán BöszörményiHi,
I've started looking at and testing both patches.
Technically speaking, I think the source looks much better than the
first version of lock timeout, and may help adding other timeouts in the
future.
Thanks.
I haven't tested it in depth though, because I encountered the
following problem:While testing the patch, I found a way to crash PG. But what's weird is
that it crashes also with an unpatched git version.Here is the way to reproduce it (I have done it with a pgbench schema):
- Set a small statement_timeout (just to save time during the tests)
Session1:
=#BEGIN;
=#lock TABLE pgbench_accounts ;Session 2:
=#BEGIN;
=#lock TABLE pgbench_accounts ;
ERROR: canceling statement due to statement timeout
=# lock TABLE pgbench_accounts ;I'm using \set ON_ERROR_ROLLBACK INTERACTIVE by the way. It can also be
done with a rollback to savepoint of course.Session 2 crashes with this : TRAP : FailedAssertion(«
!(locallock->holdsStrongLockCount == 0) », fichier : « lock.c », ligne :
749).It can also be done without a statement_timeout, and a control-C on the
second lock table.
Indeed, the unpatched GIT version crashes if you enter
=#lock TABLE pgbench_accounts ;
the second time in session 2 after the first one failed. Also,
manually spelling it out:
Session 1:
$ psql
psql (9.2devel)
Type "help" for help.
zozo=# begin;
BEGIN
zozo=# lock table pgbench_accounts;
LOCK TABLE
zozo=#
Session 2:
zozo=# begin;
BEGIN
zozo=# savepoint a;
SAVEPOINT
zozo=# lock table pgbench_accounts;
ERROR: canceling statement due to statement timeout
zozo=# rollback to a;
ROLLBACK
zozo=# savepoint b;
SAVEPOINT
zozo=# lock table pgbench_accounts;
The connection to the server was lost. Attempting reset: Failed.
!>
Server log after the second lock table:
TRAP: FailedAssertion("!(locallock->holdsStrongLockCount == 0)", File: "lock.c", Line: 749)
LOG: server process (PID 12978) was terminated by signal 6: Aborted
Best regards,
Zoltán Böszörményi
I didn't touch anything but this. It occurs everytime, when asserts are
activated.I tried it on 9.1.3, and I couldn't make it crash with the same sequence
of events. So maybe it's something introduced since ? Or is the assert
still valid ?Cheers
--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig& Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/
2012-04-08 11:24 keltezéssel, Boszormenyi Zoltan írta:
2012-04-06 14:47 keltezéssel, Cousin Marc írta:
On 05/04/12 08:02, Boszormenyi Zoltan wrote:
2012-04-04 21:30 keltezéssel, Alvaro Herrera írta:
I think this patch is doing two things: first touching infrastructure
stuff and then adding lock_timeout on top of that. Would it work to
split the patch in two pieces?Sure. Attached is the split version.
Best regards,
Zoltán BöszörményiHi,
I've started looking at and testing both patches.
Technically speaking, I think the source looks much better than the
first version of lock timeout, and may help adding other timeouts in the
future.Thanks.
I haven't tested it in depth though, because I encountered the
following problem:While testing the patch, I found a way to crash PG. But what's weird is
that it crashes also with an unpatched git version.Here is the way to reproduce it (I have done it with a pgbench schema):
- Set a small statement_timeout (just to save time during the tests)
Session1:
=#BEGIN;
=#lock TABLE pgbench_accounts ;Session 2:
=#BEGIN;
=#lock TABLE pgbench_accounts ;
ERROR: canceling statement due to statement timeout
=# lock TABLE pgbench_accounts ;I'm using \set ON_ERROR_ROLLBACK INTERACTIVE by the way. It can also be
done with a rollback to savepoint of course.Session 2 crashes with this : TRAP : FailedAssertion(«
!(locallock->holdsStrongLockCount == 0) », fichier : « lock.c », ligne :
749).It can also be done without a statement_timeout, and a control-C on the
second lock table.Indeed, the unpatched GIT version crashes if you enter
=#lock TABLE pgbench_accounts ;
the second time in session 2 after the first one failed. Also,
manually spelling it out:Session 1:
$ psql
psql (9.2devel)
Type "help" for help.zozo=# begin;
BEGIN
zozo=# lock table pgbench_accounts;
LOCK TABLE
zozo=#Session 2:
zozo=# begin;
BEGIN
zozo=# savepoint a;
SAVEPOINT
zozo=# lock table pgbench_accounts;
ERROR: canceling statement due to statement timeout
zozo=# rollback to a;
ROLLBACK
zozo=# savepoint b;
SAVEPOINT
zozo=# lock table pgbench_accounts;
The connection to the server was lost. Attempting reset: Failed.
!>Server log after the second lock table:
TRAP: FailedAssertion("!(locallock->holdsStrongLockCount == 0)", File: "lock.c", Line: 749)
LOG: server process (PID 12978) was terminated by signal 6: AbortedBest regards,
Zoltán Böszörményi
Robert, the Assert triggering with the above procedure
is in your "fast path" locking code with current GIT.
Best regards,
Zoltán Böszörményi
I didn't touch anything but this. It occurs everytime, when asserts are
activated.I tried it on 9.1.3, and I couldn't make it crash with the same sequence
of events. So maybe it's something introduced since ? Or is the assert
still valid ?Cheers
--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig& Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/
2012-04-06 14:47 keltezéssel, Cousin Marc írta:
On 05/04/12 08:02, Boszormenyi Zoltan wrote:
2012-04-04 21:30 keltezéssel, Alvaro Herrera írta:
I think this patch is doing two things: first touching infrastructure
stuff and then adding lock_timeout on top of that. Would it work to
split the patch in two pieces?Sure. Attached is the split version.
Best regards,
Zoltán BöszörményiHi,
I've started looking at and testing both patches.
Technically speaking, I think the source looks much better than the
first version of lock timeout, and may help adding other timeouts in the
future. I haven't tested it in depth though, because I encountered the
following problem:While testing the patch, I found a way to crash PG. But what's weird is
that it crashes also with an unpatched git version.Here is the way to reproduce it (I have done it with a pgbench schema):
- Set a small statement_timeout (just to save time during the tests)
Session1:
=#BEGIN;
=#lock TABLE pgbench_accounts ;Session 2:
=#BEGIN;
=#lock TABLE pgbench_accounts ;
ERROR: canceling statement due to statement timeout
=# lock TABLE pgbench_accounts ;I'm using \set ON_ERROR_ROLLBACK INTERACTIVE by the way. It can also be
done with a rollback to savepoint of course.Session 2 crashes with this : TRAP : FailedAssertion(«
!(locallock->holdsStrongLockCount == 0) », fichier : « lock.c », ligne :
749).It can also be done without a statement_timeout, and a control-C on the
second lock table.I didn't touch anything but this. It occurs everytime, when asserts are
activated.I tried it on 9.1.3, and I couldn't make it crash with the same sequence
of events. So maybe it's something introduced since ? Or is the assert
still valid ?Cheers
Attached are the new patches. I rebased them to current GIT and
they are expected to be applied after Robert Haas' patch in the
"bug in fast-path locking" thread.
Now it survives the above scenario.
Best regards,
Zoltán Böszörményi
--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig& Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/
2012-04-10 09:02 keltezéssel, Boszormenyi Zoltan írta:
2012-04-06 14:47 keltezéssel, Cousin Marc írta:
On 05/04/12 08:02, Boszormenyi Zoltan wrote:
2012-04-04 21:30 keltezéssel, Alvaro Herrera írta:
I think this patch is doing two things: first touching infrastructure
stuff and then adding lock_timeout on top of that. Would it work to
split the patch in two pieces?Sure. Attached is the split version.
Best regards,
Zoltán BöszörményiHi,
I've started looking at and testing both patches.
Technically speaking, I think the source looks much better than the
first version of lock timeout, and may help adding other timeouts in the
future. I haven't tested it in depth though, because I encountered the
following problem:While testing the patch, I found a way to crash PG. But what's weird is
that it crashes also with an unpatched git version.Here is the way to reproduce it (I have done it with a pgbench schema):
- Set a small statement_timeout (just to save time during the tests)
Session1:
=#BEGIN;
=#lock TABLE pgbench_accounts ;Session 2:
=#BEGIN;
=#lock TABLE pgbench_accounts ;
ERROR: canceling statement due to statement timeout
=# lock TABLE pgbench_accounts ;I'm using \set ON_ERROR_ROLLBACK INTERACTIVE by the way. It can also be
done with a rollback to savepoint of course.Session 2 crashes with this : TRAP : FailedAssertion(«
!(locallock->holdsStrongLockCount == 0) », fichier : « lock.c », ligne :
749).It can also be done without a statement_timeout, and a control-C on the
second lock table.I didn't touch anything but this. It occurs everytime, when asserts are
activated.I tried it on 9.1.3, and I couldn't make it crash with the same sequence
of events. So maybe it's something introduced since ? Or is the assert
still valid ?Cheers
Attached are the new patches. I rebased them to current GIT and
they are expected to be applied after Robert Haas' patch in the
"bug in fast-path locking" thread.Now it survives the above scenario.
Best regards,
Zoltán Böszörményi
New patch attached, rebased to today's GIT.
Best regards,
Zoltán Böszörményi
--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig& Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/
On Mon, 2012-04-23 at 10:53 +0200, Boszormenyi Zoltan wrote:
2012-04-10 09:02 keltezéssel, Boszormenyi Zoltan írta:
2012-04-06 14:47 keltezéssel, Cousin Marc írta:
On 05/04/12 08:02, Boszormenyi Zoltan wrote:
2012-04-04 21:30 keltezéssel, Alvaro Herrera írta:
I think this patch is doing two things: first touching infrastructure
stuff and then adding lock_timeout on top of that. Would it work to
split the patch in two pieces?Sure. Attached is the split version.
Best regards,
Zoltán BöszörményiHi,
I've started looking at and testing both patches.
Technically speaking, I think the source looks much better than the
first version of lock timeout, and may help adding other timeouts in the
future. I haven't tested it in depth though, because I encountered the
following problem:While testing the patch, I found a way to crash PG. But what's weird is
that it crashes also with an unpatched git version.Here is the way to reproduce it (I have done it with a pgbench schema):
- Set a small statement_timeout (just to save time during the tests)
Session1:
=#BEGIN;
=#lock TABLE pgbench_accounts ;Session 2:
=#BEGIN;
=#lock TABLE pgbench_accounts ;
ERROR: canceling statement due to statement timeout
=# lock TABLE pgbench_accounts ;I'm using \set ON_ERROR_ROLLBACK INTERACTIVE by the way. It can also be
done with a rollback to savepoint of course.Session 2 crashes with this : TRAP : FailedAssertion(«
!(locallock->holdsStrongLockCount == 0) », fichier : « lock.c », ligne :
749).It can also be done without a statement_timeout, and a control-C on the
second lock table.I didn't touch anything but this. It occurs everytime, when asserts are
activated.I tried it on 9.1.3, and I couldn't make it crash with the same sequence
of events. So maybe it's something introduced since ? Or is the assert
still valid ?Cheers
Attached are the new patches. I rebased them to current GIT and
they are expected to be applied after Robert Haas' patch in the
"bug in fast-path locking" thread.Now it survives the above scenario.
Best regards,
Zoltán BöszörményiNew patch attached, rebased to today's GIT.
Best regards,
Zoltán Böszörményi
Ok, I've done what was missing from the review (from when I had a bug in
locking the other day), so here is the full review. By the way, this
patch doesn't belong to current commitfest, but to the next one.
Is the patch in context diff format?
Yes
Does it apply cleanly to the current git master?
Yes
Does it include reasonable tests, necessary doc patches, etc?
The new lock_timeout GUC is documented. There are regression tests.
Read what the patch is supposed to do, and consider:
Does the patch actually implement that?
Yes
Do we want that?
I do. Mostly for administrative jobs which could lock the whole
application. It would be much easier to run reindexes, vacuum full, etc…
without worrying about bringing application down because of lock
contention.
Do we already have it?
No.
Does it follow SQL spec, or the community-agreed behavior?
I don't know if there is a consensus on this new GUC. statement_timeout
is obviously not in the SQL spec.
Does it include pg_dump support (if applicable)?
Not applicable
Are there dangers?
Yes, as it rewrites all the timeout code. I feel it is much cleaner this
way, as there is a generic set of function managing all sigalarm code,
but it heavily touches this part.
Have all the bases been covered?
I tried all sql statements I could think of (select, insert, update,
delete, truncate, drop, create index, adding constraint, lock.
I tried having statement_timeout, lock_timeout and deadlock_timeout at
very short and close or equal values. It worked too.
Rollback to savepoint while holding locks dont crash PostgreSQL anymore.
Other timeouts such as archive_timeout and checkpoint_timeout still
work.
Does the feature work as advertised?
Yes
Are there corner cases the author has failed to consider?
I didn't find any.
Are there any assertion failures or crashes?
No.
Does the patch slow down simple tests?
No
If it claims to improve performance, does it?
Not applicable
Does it slow down other things?
No
Does it follow the project coding guidelines?
I think so
Are there portability issues?
No, all the portable code (acquiring locks and manipulating sigalarm) is
the same as before.
Will it work on Windows/BSD etc?
It should. I couldn't test it though.
Are the comments sufficient and accurate?
Yes
Does it do what it says, correctly?
Yes
Does it produce compiler warnings?
No
Can you make it crash?
Not anymore
Is everything done in a way that fits together coherently with other
features/modules?
Yes, I think so. The new way of handling sigalarm seems more robust to
me.
Are there interdependencies that can cause problems?
I don't see any.
Regards,
Marc
2012-04-23 15:08 keltezéssel, Marc Cousin írta:
On Mon, 2012-04-23 at 10:53 +0200, Boszormenyi Zoltan wrote:
2012-04-10 09:02 keltezéssel, Boszormenyi Zoltan írta:
2012-04-06 14:47 keltezéssel, Cousin Marc írta:
On 05/04/12 08:02, Boszormenyi Zoltan wrote:
2012-04-04 21:30 keltezéssel, Alvaro Herrera írta:
I think this patch is doing two things: first touching infrastructure
stuff and then adding lock_timeout on top of that. Would it work to
split the patch in two pieces?Sure. Attached is the split version.
Best regards,
Zoltán BöszörményiHi,
I've started looking at and testing both patches.
Technically speaking, I think the source looks much better than the
first version of lock timeout, and may help adding other timeouts in the
future. I haven't tested it in depth though, because I encountered the
following problem:While testing the patch, I found a way to crash PG. But what's weird is
that it crashes also with an unpatched git version.Here is the way to reproduce it (I have done it with a pgbench schema):
- Set a small statement_timeout (just to save time during the tests)
Session1:
=#BEGIN;
=#lock TABLE pgbench_accounts ;Session 2:
=#BEGIN;
=#lock TABLE pgbench_accounts ;
ERROR: canceling statement due to statement timeout
=# lock TABLE pgbench_accounts ;I'm using \set ON_ERROR_ROLLBACK INTERACTIVE by the way. It can also be
done with a rollback to savepoint of course.Session 2 crashes with this : TRAP : FailedAssertion(«
!(locallock->holdsStrongLockCount == 0) », fichier : « lock.c », ligne :
749).It can also be done without a statement_timeout, and a control-C on the
second lock table.I didn't touch anything but this. It occurs everytime, when asserts are
activated.I tried it on 9.1.3, and I couldn't make it crash with the same sequence
of events. So maybe it's something introduced since ? Or is the assert
still valid ?Cheers
Attached are the new patches. I rebased them to current GIT and
they are expected to be applied after Robert Haas' patch in the
"bug in fast-path locking" thread.Now it survives the above scenario.
Best regards,
Zoltán BöszörményiNew patch attached, rebased to today's GIT.
Best regards,
Zoltán BöszörményiOk, I've done what was missing from the review (from when I had a bug in
locking the other day), so here is the full review. By the way, this
patch doesn't belong to current commitfest, but to the next one.
It was added to 2012-Next when I posted it, 2012-01 was already
closed for new additions.
Is the patch in context diff format?
YesDoes it apply cleanly to the current git master?
YesDoes it include reasonable tests, necessary doc patches, etc?
The new lock_timeout GUC is documented. There are regression tests.Read what the patch is supposed to do, and consider:
Does the patch actually implement that?
YesDo we want that?
I do. Mostly for administrative jobs which could lock the whole
application. It would be much easier to run reindexes, vacuum full, etc…
without worrying about bringing application down because of lock
contention.Do we already have it?
No.Does it follow SQL spec, or the community-agreed behavior?
I don't know if there is a consensus on this new GUC. statement_timeout
is obviously not in the SQL spec.Does it include pg_dump support (if applicable)?
Not applicableAre there dangers?
Yes, as it rewrites all the timeout code. I feel it is much cleaner this
way, as there is a generic set of function managing all sigalarm code,
but it heavily touches this part.Have all the bases been covered?
I tried all sql statements I could think of (select, insert, update,
delete, truncate, drop, create index, adding constraint, lock.I tried having statement_timeout, lock_timeout and deadlock_timeout at
very short and close or equal values. It worked too.Rollback to savepoint while holding locks dont crash PostgreSQL anymore.
Other timeouts such as archive_timeout and checkpoint_timeout still
work.Does the feature work as advertised?
YesAre there corner cases the author has failed to consider?
I didn't find any.Are there any assertion failures or crashes?
No.Does the patch slow down simple tests?
NoIf it claims to improve performance, does it?
Not applicableDoes it slow down other things?
NoDoes it follow the project coding guidelines?
I think soAre there portability issues?
No, all the portable code (acquiring locks and manipulating sigalarm) is
the same as before.Will it work on Windows/BSD etc?
It should. I couldn't test it though.Are the comments sufficient and accurate?
YesDoes it do what it says, correctly?
YesDoes it produce compiler warnings?
NoCan you make it crash?
Not anymoreIs everything done in a way that fits together coherently with other
features/modules?
Yes, I think so. The new way of handling sigalarm seems more robust to
me.Are there interdependencies that can cause problems?
I don't see any.Regards,
Marc
Thanks for the review.
Best regards,
Zoltán Böszörményi
--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig& Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/
Hi,
2012-04-24 09:00 keltezéssel, Boszormenyi Zoltan írta:
2012-04-23 15:08 keltezéssel, Marc Cousin írta:
On Mon, 2012-04-23 at 10:53 +0200, Boszormenyi Zoltan wrote:
2012-04-10 09:02 keltezéssel, Boszormenyi Zoltan írta:
2012-04-06 14:47 keltezéssel, Cousin Marc írta:
On 05/04/12 08:02, Boszormenyi Zoltan wrote:
2012-04-04 21:30 keltezéssel, Alvaro Herrera írta:
I think this patch is doing two things: first touching infrastructure
stuff and then adding lock_timeout on top of that. Would it work to
split the patch in two pieces?Sure. Attached is the split version.
Best regards,
Zoltán BöszörményiHi,
I've started looking at and testing both patches.
Technically speaking, I think the source looks much better than the
first version of lock timeout, and may help adding other timeouts in the
future. I haven't tested it in depth though, because I encountered the
following problem:While testing the patch, I found a way to crash PG. But what's weird is
that it crashes also with an unpatched git version.Here is the way to reproduce it (I have done it with a pgbench schema):
- Set a small statement_timeout (just to save time during the tests)
Session1:
=#BEGIN;
=#lock TABLE pgbench_accounts ;Session 2:
=#BEGIN;
=#lock TABLE pgbench_accounts ;
ERROR: canceling statement due to statement timeout
=# lock TABLE pgbench_accounts ;I'm using \set ON_ERROR_ROLLBACK INTERACTIVE by the way. It can also be
done with a rollback to savepoint of course.Session 2 crashes with this : TRAP : FailedAssertion(«
!(locallock->holdsStrongLockCount == 0) », fichier : « lock.c », ligne :
749).It can also be done without a statement_timeout, and a control-C on the
second lock table.I didn't touch anything but this. It occurs everytime, when asserts are
activated.I tried it on 9.1.3, and I couldn't make it crash with the same sequence
of events. So maybe it's something introduced since ? Or is the assert
still valid ?Cheers
Attached are the new patches. I rebased them to current GIT and
they are expected to be applied after Robert Haas' patch in the
"bug in fast-path locking" thread.Now it survives the above scenario.
Best regards,
Zoltán BöszörményiNew patch attached, rebased to today's GIT.
Best regards,
Zoltán BöszörményiOk, I've done what was missing from the review (from when I had a bug in
locking the other day), so here is the full review. By the way, this
patch doesn't belong to current commitfest, but to the next one.It was added to 2012-Next when I posted it, 2012-01 was already
closed for new additions.Is the patch in context diff format?
YesDoes it apply cleanly to the current git master?
YesDoes it include reasonable tests, necessary doc patches, etc?
The new lock_timeout GUC is documented. There are regression tests.Read what the patch is supposed to do, and consider:
Does the patch actually implement that?
YesDo we want that?
I do. Mostly for administrative jobs which could lock the whole
application. It would be much easier to run reindexes, vacuum full, etc…
without worrying about bringing application down because of lock
contention.Do we already have it?
No.Does it follow SQL spec, or the community-agreed behavior?
I don't know if there is a consensus on this new GUC. statement_timeout
is obviously not in the SQL spec.Does it include pg_dump support (if applicable)?
Not applicableAre there dangers?
Yes, as it rewrites all the timeout code. I feel it is much cleaner this
way, as there is a generic set of function managing all sigalarm code,
but it heavily touches this part.Have all the bases been covered?
I tried all sql statements I could think of (select, insert, update,
delete, truncate, drop, create index, adding constraint, lock.I tried having statement_timeout, lock_timeout and deadlock_timeout at
very short and close or equal values. It worked too.Rollback to savepoint while holding locks dont crash PostgreSQL anymore.
Other timeouts such as archive_timeout and checkpoint_timeout still
work.Does the feature work as advertised?
YesAre there corner cases the author has failed to consider?
I didn't find any.Are there any assertion failures or crashes?
No.Does the patch slow down simple tests?
NoIf it claims to improve performance, does it?
Not applicableDoes it slow down other things?
NoDoes it follow the project coding guidelines?
I think soAre there portability issues?
No, all the portable code (acquiring locks and manipulating sigalarm) is
the same as before.Will it work on Windows/BSD etc?
It should. I couldn't test it though.Are the comments sufficient and accurate?
YesDoes it do what it says, correctly?
YesDoes it produce compiler warnings?
NoCan you make it crash?
Not anymoreIs everything done in a way that fits together coherently with other
features/modules?
Yes, I think so. The new way of handling sigalarm seems more robust to
me.Are there interdependencies that can cause problems?
I don't see any.Regards,
Marc
Thanks for the review.
Best regards,
Zoltán Böszörményi
New patches are attached, rebased to current GIT to avoid
a (trivial) reject. No other changes.
Best regards,
Zoltán Böszörményi
--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig& Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/
Hi,
another rebasing and applied the GIT changes in
ada8fa08fc6cf5f199b6df935b4d0a730aaa4fec to the
Windows implementation of PGSemaphoreTimedLock.
Best regards,
Zoltán Böszörményi
--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig& Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/
Excerpts from Boszormenyi Zoltan's message of vie may 11 03:54:13 -0400 2012:
Hi,
another rebasing and applied the GIT changes in
ada8fa08fc6cf5f199b6df935b4d0a730aaa4fec to the
Windows implementation of PGSemaphoreTimedLock.
Hi,
I gave the framework patch a look. One thing I'm not sure about is the
way you've defined the API. It seems a bit strange to have a nice and
clean, generic interface in timeout.h; and then have the internal
implementation of the API cluttered with details such as what to do when
the deadlock timeout expires. Wouldn't it be much better if we could
keep the details of CheckDeadLock itself within proc.c, for example?
Same for the other specific Check*Timeout functions. It seems to me
that the only timeout.c specific requirement is to be able to poke
base_timeouts[].indicator and fin_time. Maybe that can get done in
timeout.c and then have the generic checker call the module-specific
checker. ... I see that you have things to do before and after setting
"indicator". Maybe you could split the module-specific Check functions
in two and have timeout.c call each in turn. Other than that I don't
see that this should pose any difficulty.
Also, I see you're calling GetCurrentTimestamp() in the checkers; maybe
more than once per signal if you have multiple timeouts enabled. Maybe
it'd be better to call it in once handle_sig_alarm and then pass the
value down, if necessary (AFAICS if you restructure the code as I
suggest above, you don't need to get the value down the the
module-specific code).
As for the Init*Timeout() and Destroy*Timeout() functions, they seem
a bit pointless -- I mean if they just always call the generic
InitTimeout and DestroyTimeout functions, why not just define the struct
in a way that makes the specific functions unnecessary? Maybe you even
already have all that's necessary ... I think you just change
base_timeouts[i].timeout_destroy(false); to DestroyTimeout(i, false) and
so on.
Minor nitpick: the "Interface:" comment in timeout.c seems useless.
That kind of thing tends to get overlooked and obsolete over time. We
have examples of such things all over the place. I'd just rip it and
have timeout.h be the interface documentation.
--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
BTW it doesn't seem that datatype/timestamp.h is really necessary in
timeout.h.
--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
2012-06-18 19:46 keltezéssel, Alvaro Herrera írta:
Excerpts from Boszormenyi Zoltan's message of vie may 11 03:54:13 -0400 2012:
Hi,
another rebasing and applied the GIT changes in
ada8fa08fc6cf5f199b6df935b4d0a730aaa4fec to the
Windows implementation of PGSemaphoreTimedLock.Hi,
I gave the framework patch a look. One thing I'm not sure about is the
way you've defined the API. It seems a bit strange to have a nice and
clean, generic interface in timeout.h; and then have the internal
implementation of the API cluttered with details such as what to do when
the deadlock timeout expires. Wouldn't it be much better if we could
keep the details of CheckDeadLock itself within proc.c, for example?
Do you mean adding a callback function argument to for enable_timeout()
would be better?
Same for the other specific Check*Timeout functions. It seems to me
that the only timeout.c specific requirement is to be able to poke
base_timeouts[].indicator and fin_time. Maybe that can get done in
timeout.c and then have the generic checker call the module-specific
checker.
Or instead of static functions, Check* functions can be external
to timeout.c. It seemed to be a good idea to move all the timeout
related functions to timeout.c.
... I see that you have things to do before and after setting
"indicator". Maybe you could split the module-specific Check functions
in two and have timeout.c call each in turn. Other than that I don't
see that this should pose any difficulty.Also, I see you're calling GetCurrentTimestamp() in the checkers; maybe
more than once per signal if you have multiple timeouts enabled.
Actually, GetCurrentTimestamp() is not called multiple times,
because only the first timeout source in the queue can get triggered.
Maybe
it'd be better to call it in once handle_sig_alarm and then pass the
value down, if necessary (AFAICS if you restructure the code as I
suggest above, you don't need to get the value down the the
module-specific code).
But yes, this way it can be cleaner.
As for the Init*Timeout() and Destroy*Timeout() functions, they seem
a bit pointless -- I mean if they just always call the generic
InitTimeout and DestroyTimeout functions, why not just define the struct
in a way that makes the specific functions unnecessary? Maybe you even
already have all that's necessary ... I think you just change
base_timeouts[i].timeout_destroy(false); to DestroyTimeout(i, false) and
so on.
OK, I will experiment with it.
Minor nitpick: the "Interface:" comment in timeout.c seems useless.
That kind of thing tends to get overlooked and obsolete over time. We
have examples of such things all over the place. I'd just rip it and
have timeout.h be the interface documentation.
OK.
BTW it doesn't seem that datatype/timestamp.h is really necessary in
timeout.h.
You are wrong. get_timeout_start() returns TimestampTz and it's
defined in datatype/timestamp.h.
Thanks for the review, I will send the new code soon.
Best regards,
Zoltán Böszörményi
--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/
Hi,
attached are the new patches.
The base patch is now simplified and is about 10K smaller:
- no more individual Init* and Destroy* functions
- Check* functions don't check for "now", it's done centrally by the signal handler
- CheckDeadLock() is moved back to proc.c and made public for timeout.c
- a new header storage/proctimeout.h is introduced, so timeout.c can know
about CheckDeadLock()
The lock_timeout patch gained a new feature and enum GUC:
SET lock_timeout_option = { 'per_lock' | 'per_statement' } ;
'per_lock' is the default and carries the previous behaviour: the timeout value
applies to all locks individually. The statement may take up to N*timeout.
'per_statement' behaves like statement_timeout. The important difference is
that statement_timeout may trigger during the executor phase when a long
result set is already being returned to the client and the transfer is cut because
of the timeout. On the other hand, lock_timeout may only trigger during locking
the objects and if all locks were granted under the specified time, the result set
is then returned to the client.
Best regards,
Zoltán Böszörményi
--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/