Unexpected serialization error
Hi guys,
I had no luck over at the general Slack channel, so I'm beginning to treat this
as a bug. Here's a test case that unexpectedly fails at the last insert. I
expect it not to fail because the rows that the two transactions act on are
unrelated to one another.
CREATE TABLE t1 (
class integer NOT NULL
);
CREATE INDEX ON t1 (class);
CREATE TABLE t2 (
class integer NOT NULL
);
CREATE INDEX ON t2 (class);
ALTER SYSTEM SET enable_seqscan TO off;
-- Session 1
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT * FROM t2 WHERE class=1;
-- Session 2
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT * FROM t1 WHERE class=2;
INSERT INTO t2 VALUES(2);
COMMIT;
-- Session 1
INSERT INTO t1 VALUES(1);
COMMIT;
Postgres version is PostgreSQL 12.5 on x86_64-pc-linux-musl, compiled by
gcc (Alpine 9.3.0) 9.3.0, 64-bit.
-- Lule
Luka Žitnik <luka.zitnik@gmail.com> writes:
I had no luck over at general slack channel. So I'm beginning to treat this
as a bug. Here's a test case that unexpectedly fails at last insert. I
expect it not to fail because the rows that the two transactions act on are
unrelated to one another.
The bug is that you're assuming exact tracking of the SSI serialization
rules. It's not done that way, because it'd be prohibitively expensive.
(I've not dug into this example in any detail, but I suspect it's
acquiring page-level not tuple-level predicate locks, thus the
transactions conflict because each tries to update a page the other one
already read.)
The short answer is that ANY application that's depending on serializable
mode MUST be prepared to retry serialization failures. You don't get to
skip that just because there theoretically shouldn't be a failure.
regards, tom lane
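[Editor's note: a minimal, hypothetical sketch of the retry discipline Tom describes, in Python. The `SerializationFailure` exception and the flaky transaction body are stand-ins for a real driver error (SQLSTATE 40001 in PostgreSQL) and a real transaction; this is not PostgreSQL or driver code.]

```python
# Sketch of the retry loop any SERIALIZABLE client needs: when a
# serialization failure is raised, the whole transaction body is re-run.
# SerializationFailure stands in for the driver error carrying
# SQLSTATE 40001 ("could not serialize access ...").

class SerializationFailure(Exception):
    """Stand-in for the driver's serialization-failure error."""

def run_with_retry(txn_body, max_attempts=5):
    """Run txn_body(), re-running it on serialization failures."""
    for attempt in range(1, max_attempts + 1):
        try:
            return txn_body()
        except SerializationFailure:
            if attempt == max_attempts:
                raise  # give up after too many retries

# Demo: a transaction body that fails twice before succeeding.
attempts = []

def flaky_txn():
    attempts.append(1)
    if len(attempts) < 3:
        raise SerializationFailure("could not serialize access")
    return "committed"

result = run_with_retry(flaky_txn)
print(result, len(attempts))  # committed 3
```

The point is that the retry wraps the entire transaction body, not a single statement, because the failed transaction's work is rolled back in full.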
Got it, thanks. If this particular test case implied worse-than-expected
performance (e.g. significantly more than the 0.03% failure rate reported for
the RUBiS example in http://vldb.org/pvldb/vol5/p1850_danrkports_vldb2012.pdf),
it probably would have been taken into consideration.
Btw, more out of curiosity: is it possible to nudge the test to pass by
changing any of these settings?
ssi_test=# select name from pg_settings where name ~ 'pred';
name
--------------------------------
max_pred_locks_per_page
max_pred_locks_per_relation
max_pred_locks_per_transaction
(3 rows)
Here are the lock types I see right before the last insert:
ssi_test=# SELECT mode, locktype, relation::regclass, page, tuple
FROM pg_locks WHERE mode = 'SIReadLock';
mode | locktype | relation | page | tuple
------------+----------+--------------+------+-------
SIReadLock | page | t1_class_idx | 1 |
SIReadLock | page | t2_class_idx | 1 |
(2 rows)
On Sat, Feb 13, 2021 at 6:21 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Hi,
On 2021-02-12 19:14:17 +0100, Luka Žitnik wrote:
I had no luck over at general slack channel. So I'm beginning to treat this
as a bug. Here's a test case that unexpectedly fails at last insert. I
expect it not to fail because the rows that the two transactions act on are
unrelated to one another.
CREATE TABLE t1 (
class integer NOT NULL
);
CREATE INDEX ON t1 (class);
CREATE TABLE t2 (
class integer NOT NULL
);
CREATE INDEX ON t2 (class);
ALTER SYSTEM SET enable_seqscan TO off;
-- Session 1
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT * FROM t2 WHERE class=1;
-- Session 2
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT * FROM t1 WHERE class=2;
INSERT INTO t2 VALUES(2);
COMMIT;
-- Session 1
INSERT INTO t1 VALUES(1);
COMMIT;
I think you're hitting multiple issues here... For one, your tables are
initially empty, in which case we mark the entire index as a conflict
during the SELECT * FROM t2 WHERE class=1;
Second, even if the indexes weren't empty, the granularity of index
predicate locking is a page - which means that S2's predicate lock for
SELECT * FROM t1 WHERE class=2; will conflict with S1's INSERT INTO t1
VALUES(1).
If the SELECTs actually hit rows (and the inserts were updates), you'd
presumably have the sequence succeed. Same with an index where the
inserted values weren't on the same page as the looked-up values.
Greetings,
Andres Freund
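[Editor's note: a toy model, not PostgreSQL code, illustrating Andres's page-granularity point. The `page_of` mapping and the 100-entries-per-page figure are made up for illustration; real B-tree page capacity depends on key size and page layout.]

```python
# Toy model of page-granularity predicate locking: each transaction
# records the index pages its reads touched; a later write conflicts if
# it lands on a page some other transaction has read, even when the
# actual rows are unrelated.

PAGE_SIZE = 100  # pretend 100 index entries fit on one page

def page_of(key):
    """Hypothetical mapping from an index key to an index page number."""
    return key // PAGE_SIZE

def conflicts(read_keys, write_key):
    """Page-level check: does a write hit any page covered by the reads?"""
    read_pages = {page_of(k) for k in read_keys}
    return page_of(write_key) in read_pages

# In the reported test, keys 1 and 2 land on the same (only) index page,
# so S1's INSERT of key 1 collides with S2's page lock from reading key 2:
print(conflicts(read_keys=[2], write_key=1))   # True

# With enough data that the keys live on different index pages, the same
# logical test would pass:
print(conflicts(read_keys=[250], write_key=1)) # False
```

This matches the pg_locks output shown earlier in the thread: the SIRead locks are on index pages, not tuples, so whether the test passes depends on how the keys are distributed across pages.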
Thank you, Andres. I appreciate your thoughts. I do notice that the lock
level changes from table to page as soon as there are several rows in the
table. But what's really astonishing to me is to start thinking of pages as
the predicate-lock granularity and, consequently, of the possibility of the
same test passing with a certain amount of data in the tables. This is a
great find for me.
On Mon, 22 Feb 2021, 02:28 Andres Freund, <andres@anarazel.de> wrote: