BUG #17949: Adding an index introduces serialisation anomalies.

Started by PG Bug reporting form · almost 3 years ago · 27 messages · bugs
#1PG Bug reporting form
noreply@postgresql.org

The following bug has been logged on the website:

Bug reference: 17949
Logged by: Artem Anisimov
Email address: artem.anisimov.255@gmail.com
PostgreSQL version: 15.3
Operating system: fedora 38
Description:

Hello dear pg authors,

I have come across a behaviour in pg that contradicts
https://www.postgresql.org/docs/15/transaction-iso.html#XACT-SERIALIZABLE.

I've minimised my testcase to a scenario that essentially duplicates
one you describe in the following paragraph:

In particular, it is possible to see unique constraint violations
caused by conflicts with overlapping Serializable transactions even
after explicitly checking that the key isn't present before attempting
to insert it. This can be avoided by making sure that all Serializable
transactions that insert potentially conflicting keys explicitly check
if they can do so first.

At the end of the report there is a C reproducer that highlights the
problem. Let me give a high-level overview of the scenario first.

I have N threads each trying to "acquire an exclusive lock" this way:
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT * FROM locks WHERE path = $1;
quit if there already is a lock
INSERT INTO locks(path, ...) VALUES($1, ...);
COMMIT;

Once all threads have attempted to acquire a lock, I count how many of
them succeeded with their INSERT. Normally only one thread succeeds,
which is what I expect. However, if I add a HASH or a BTREE index on
"path", it becomes possible for multiple threads to INSERT
successfully, which violates serialisability.

The problem can be reproduced with the default postgresql.conf, but it
takes some time. If I increase "shared_buffers" to 1024MB, the issue
appears almost immediately (perhaps because there are fewer chances to
promote predicate locks to locks on the whole table?).

I've seen this behaviour with pg 13, 15 and 16 (503b055).

Now let us see the reproducer. It has two primary components:
1. test_once() spawns 32 threads that try to acquire a lock, waits for them,
and counts the number of "acquired exclusive locks",
2. try_acquire_lock() executes a transaction described above.

To build the reproducer:
$ gcc -std=c99 -o test test.c -pthread `pkg-config libpq --cflags --libs`

To run the reproducer:
$ ./test
or
$ DB_CONNSTR="dbname=abc host=def user=ghi" ./test

The reproducer:

#define _GNU_SOURCE

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <assert.h>

#include <libpq-fe.h>

// Comment this to make serialisation anomalies go away.
#define WITH_INDEX

// I have seen the problem with as few as 3 threads. 32 threads make
// the issue appear much sooner.
#define NR_THREADS (32)
#define NR_RUNS (1024 * 1024)

static PGconn *conns[NR_THREADS];

static void* try_acquire_lock(void *arg)
{
    PGconn *c = arg;
    PGresult *res;
    int ntuples;

    res = PQexec(c, "BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE");
    assert(PQresultStatus(res) == PGRES_COMMAND_OK);
    PQclear(res);

    res = PQexec(c, "SELECT * FROM locks WHERE path = 'xyz'");
    assert(PQresultStatus(res) == PGRES_TUPLES_OK);
    ntuples = PQntuples(res);
    PQclear(res);

    if (ntuples > 0) {
        // someone else already has a lock
        res = PQexec(c, "COMMIT");
        assert(PQresultStatus(res) == PGRES_COMMAND_OK);
        PQclear(res);
        return NULL;
    }

    // The INSERT and COMMIT may legitimately fail with a serialisation
    // error when another thread wins the race, so their status is
    // deliberately not asserted.
    res = PQexec(c, "INSERT INTO locks(path) VALUES('xyz')");
    PQclear(res);

    res = PQexec(c, "COMMIT");
    PQclear(res);

    return NULL;
}

static void test_once(void)
{
    PGconn *c = conns[0];
    PGresult *res;
    int ntuples;

    pthread_t thrs[NR_THREADS];
    for (int i = 0; i < NR_THREADS; ++i)
        pthread_create(&thrs[i], NULL, &try_acquire_lock, conns[i]);
    for (int i = 0; i < NR_THREADS; ++i)
        pthread_join(thrs[i], NULL);

    res = PQexec(c, "SELECT * FROM locks WHERE path = 'xyz'");
    assert(PQresultStatus(res) == PGRES_TUPLES_OK);
    ntuples = PQntuples(res);
    PQclear(res);

    if (ntuples != 1)
        printf("ntuples = %d\n", ntuples);
    assert(ntuples == 1);

    res = PQexec(c, "TRUNCATE TABLE locks");
    assert(PQresultStatus(res) == PGRES_COMMAND_OK);
    PQclear(res);
}

static void prepare_db(void)
{
    PGconn *c = conns[0];
    PGresult *res;

    // The DROP may fail if the table doesn't exist yet, so its status
    // is not checked.
    res = PQexec(c, "DROP TABLE locks");
    PQclear(res);

    res = PQexec(c, "CREATE TABLE locks (path TEXT NOT NULL)");
    assert(PQresultStatus(res) == PGRES_COMMAND_OK);
    PQclear(res);

#ifdef WITH_INDEX
    res = PQexec(c, "CREATE INDEX ON locks USING HASH(path)");
    assert(PQresultStatus(res) == PGRES_COMMAND_OK);
    PQclear(res);
#endif
}

int main(void)
{
    const char *connstr = getenv("DB_CONNSTR");
    if (connstr == NULL)
        connstr = "dbname=postgres";

    for (int i = 0; i < NR_THREADS; ++i) {
        conns[i] = PQconnectdb(connstr);
        assert(PQstatus(conns[i]) == CONNECTION_OK);
    }
    prepare_db();

    for (int i = 0; i < NR_RUNS; ++i)
        test_once();

    for (int i = 0; i < NR_THREADS; ++i)
        PQfinish(conns[i]);

    return 0;
}

#2Thomas Munro
thomas.munro@gmail.com
In reply to: PG Bug reporting form (#1)
Re: BUG #17949: Adding an index introduces serialisation anomalies.

Hi,

Reproduced here. Thanks for the reproducer. I agree that something is
wrong here, but I haven't had time to figure out what yet; let me
share what I've noticed so far. I modified your test to add a pid
column to the locks table and to insert pg_backend_pid() into it, and
got:

postgres=# select xmin, * from locks;

┌───────┬──────┬───────┐
│ xmin │ path │ pid │
├───────┼──────┼───────┤
│ 17634 │ xyz │ 32932 │
│ 17639 │ xyz │ 32957 │
└───────┴──────┴───────┘

Then I filtered the logs (having turned the logging up to capture all
queries) so I could see just those PIDs and saw this sequence:

2023-05-29 00:15:43.933 EDT [32932] LOG: duration: 0.182 ms
statement: BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE
2023-05-29 00:15:43.934 EDT [32957] LOG: duration: 0.276 ms
statement: BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE
2023-05-29 00:15:43.935 EDT [32932] LOG: duration: 1.563 ms
statement: SELECT * FROM locks WHERE path = 'xyz'
2023-05-29 00:15:43.936 EDT [32932] LOG: duration: 0.126 ms
statement: INSERT INTO locks(path, pid) VALUES('xyz',
pg_backend_pid())
2023-05-29 00:15:43.937 EDT [32957] LOG: duration: 2.191 ms
statement: SELECT * FROM locks WHERE path = 'xyz'
2023-05-29 00:15:43.937 EDT [32957] LOG: duration: 0.261 ms
statement: INSERT INTO locks(path, pid) VALUES('xyz',
pg_backend_pid())
2023-05-29 00:15:43.937 EDT [32932] LOG: duration: 0.222 ms statement: COMMIT
2023-05-29 00:15:43.939 EDT [32957] LOG: duration: 1.775 ms statement: COMMIT

That sequence if run (without overlap) in the logged order is normally
rejected. The query plan being used (at least when I run the query
myself) looks like this:

Query Text: SELECT * FROM locks WHERE path = 'xyz'
Bitmap Heap Scan on locks (cost=4.20..13.67 rows=6 width=36)
Recheck Cond: (path = 'xyz'::text)
-> Bitmap Index Scan on locks_path_idx (cost=0.00..4.20 rows=6 width=0)
Index Cond: (path = 'xyz'::text)

#3Artem Anisimov
artem.anisimov.255@gmail.com
In reply to: Thomas Munro (#2)
Re: BUG #17949: Adding an index introduces serialisation anomalies.

Hi Thomas,

thank you for the confirmation and analysis. Did you have a chance to
take a more detailed look at the problem?

Best regards,

Artem.


#4Thomas Munro
thomas.munro@gmail.com
In reply to: Artem Anisimov (#3)
Re: BUG #17949: Adding an index introduces serialisation anomalies.

On Tue, Jun 13, 2023 at 8:00 PM Artem Anisimov
<artem.anisimov.255@gmail.com> wrote:

thank you for the confirmation and analysis. Did you have a chance to
take a more detailed look at the problem?

Looking into it...

#5Dmitry Dolgov
9erthalion6@gmail.com
In reply to: Thomas Munro (#2)
Re: BUG #17949: Adding an index introduces serialisation anomalies.

On Mon, May 29, 2023 at 09:29:09PM -0400, Thomas Munro wrote:

Hi,

Reproduced here. Thanks for the reproducer. I agree that something
is wrong here, but I haven't had time to figure out what, yet, but let
me share what I noticed so far... I modified your test to add a pid
column to the locks table and to insert insert pg_backend_pid() into
it, and got:

postgres=# select xmin, * from locks;

┌───────┬──────┬───────┐
│ xmin │ path │ pid │
├───────┼──────┼───────┤
│ 17634 │ xyz │ 32932 │
│ 17639 │ xyz │ 32957 │
└───────┴──────┴───────┘

Then I filtered the logs (having turned the logging up to capture all
queries) so I could see just those PIDs and saw this sequence:

2023-05-29 00:15:43.933 EDT [32932] LOG: duration: 0.182 ms
statement: BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE
2023-05-29 00:15:43.934 EDT [32957] LOG: duration: 0.276 ms
statement: BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE
2023-05-29 00:15:43.935 EDT [32932] LOG: duration: 1.563 ms
statement: SELECT * FROM locks WHERE path = 'xyz'
2023-05-29 00:15:43.936 EDT [32932] LOG: duration: 0.126 ms
statement: INSERT INTO locks(path, pid) VALUES('xyz',
pg_backend_pid())
2023-05-29 00:15:43.937 EDT [32957] LOG: duration: 2.191 ms
statement: SELECT * FROM locks WHERE path = 'xyz'
2023-05-29 00:15:43.937 EDT [32957] LOG: duration: 0.261 ms
statement: INSERT INTO locks(path, pid) VALUES('xyz',
pg_backend_pid())
2023-05-29 00:15:43.937 EDT [32932] LOG: duration: 0.222 ms statement: COMMIT
2023-05-29 00:15:43.939 EDT [32957] LOG: duration: 1.775 ms statement: COMMIT

I've tried to reproduce it as well, adding more logging around the
serialization code. If it helps, what I observe is that the second
overlapping transaction, the one that started a bit later, does not
error out, because in OnConflict_CheckForSerializationFailure (when
checking for "writer has become a pivot") there are no more conflicts
returned by SHMQueueNext. All the rest of the reported serialization
conflicts come from this check, so I assume the incorrect transaction
should fail there too. Not sure yet why that is so.

#6Thomas Munro
thomas.munro@gmail.com
In reply to: Dmitry Dolgov (#5)
Re: BUG #17949: Adding an index introduces serialisation anomalies.

On Thu, Jun 15, 2023 at 7:29 PM Dmitry Dolgov <9erthalion6@gmail.com> wrote:

I've tried to reproduce it as well, adding more logging around the
serialization code. If it helps, what I observe is the second
overlapping transaction, that has started a bit later, do not error out
because in OnConflict_CheckForSerializationFailure (when checking for
"writer has become a pivot") there are no more conflicts received from
SHMQueueNext. All the rest of the reported serialization conflicts are
coming from this check, so I assume the incorrect transaction should
fail there too. Not sure yet why is that so.

Some more observations: it happens on 11 and master, happens with
btrees, and happens with bitmapscan disabled (eg with a plain index
scan), but so far in my testing it doesn't happen if the table already
contains one other tuple (ie if you change the reproducer to insert
another row ('foo') after the TRUNCATE). There is a special case for
predicate locking empty indexes, which uses a relation-level lock
(since there are no pages to lock yet), but that doesn't seem to be
wrong, and if you hack it to lock pages 1 and 2 instead, it still
reproduces. Pondering the empty-index case made me wonder if the case
"If we found one of our own SIREAD locks to remove, remove it now" was
implicated (that's something that would not happen for a
relation-level lock), but it still reproduces if you comment out that
optimisation. So far I have not been able to reproduce it below 8
threads. Hmm, I wonder if there might be a missing check/lock in some
racy code path around the initial creation of the root page...

#7Thomas Munro
thomas.munro@gmail.com
In reply to: Thomas Munro (#6)
Re: BUG #17949: Adding an index introduces serialisation anomalies.

How is this schedule supposed to work?

S1: _bt_search(&buf)
S1: if (!BufferIsValid(buf)) // because index is empty
S1: {
S1: PredicateLockRelation(...);
S1: ...
S1: return false; // no tuples for you

S2: _bt_search(&buf)

S1: INSERT ...

S2: if (!BufferIsValid(buf)) // because index *was* empty
S2: {
S2: PredicateLockRelation(...);
S2: ...
S2: return false; // no tuples for you

S2: ...

My point is that S2 won't ever scan S1's tuples, so it won't pass S1's
xid to CheckForSerializableConflictOut(). Am I missing something? I
can repro this with NR_THREADS set to only 2, after inserting
pg_usleep(1) after the _bt_search() in _bt_first() (sched_yield()
wasn't quite enough).

#8Thomas Munro
thomas.munro@gmail.com
In reply to: Thomas Munro (#7)
Re: BUG #17949: Adding an index introduces serialisation anomalies.

On Mon, Jun 19, 2023 at 2:29 PM Thomas Munro <thomas.munro@gmail.com> wrote:

S2: _bt_search(&buf)

S1: INSERT ...

S2: PredicateLockRelation(...);

My point is that S2 won't ever scan S1's tuples, so it won't pass S1's
xid to CheckForSerializableConflictOut()

[completing that sentence a little more] ... so this was our only
chance to detect that S2 read an object that S1 wrote. But the
_bt_search() and PredicateLockXXX() calls are not atomic and not
rechecked, so a write between them is invisible to the algorithm. I'm
not sure about this, but it's the idea I'm exploring...

#9Thomas Munro
thomas.munro@gmail.com
In reply to: Thomas Munro (#8)
Re: BUG #17949: Adding an index introduces serialisation anomalies.

The attached shows one approach to the problem I described already,
namely scans of empty btrees that concurrently gain an initial root
page. This seems to fix the posted repro case if using a btree with
enable_seqscan=off, enable_bitmapscan=off. That's the variant I had
been working with, because it was simpler.

But that turns out to be only one problem hit by the repro.
Apparently there's a second bug, if you let it use bitmap heapscans.
Or perhaps I am misdiagnosing the above. It could be something like:
btgetbitmap not following the predicate lock protocol correctly (but
it looks OK at a glance), or bitmap heapscan not checking for
conflicts out comprehensively (xids on invisible tuples we scan), eg
not fetching heap tuples for some short cut reason somewhere. But
that's all I have time for today.

Attachments:

0001-Fix-rare-race-in-SSI-interaction-with-empty-btrees.patch (application/octet-stream, +24 -15)
#10Thomas Munro
thomas.munro@gmail.com
In reply to: Thomas Munro (#9)
Re: BUG #17949: Adding an index introduces serialisation anomalies.

On Mon, Jun 19, 2023 at 6:50 PM Thomas Munro <thomas.munro@gmail.com> wrote:

... or bitmap heapscan not checking for
conflicts out comprehensively (xids on invisible tuples we scan), eg
not fetching heap tuples for some short cut reason somewhere. ...

Ahh, here's such a place. I can't reproduce it with the patch already
posted + this check commented out.

--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -2127,16 +2127,18 @@ heapam_scan_bitmap_next_block(TableScanDesc scan,

hscan->rs_cindex = 0;
hscan->rs_ntuples = 0;

+#if 0
/*
* Ignore any claimed entries past what we think is the end of the
* relation. It may have been extended after the start of our scan (we
* only hold an AccessShareLock, and it could be inserts from this
* backend).
*/
if (block >= hscan->rs_nblocks)
return false;
+#endif

#11Dmitry Dolgov
9erthalion6@gmail.com
In reply to: Thomas Munro (#10)
Re: BUG #17949: Adding an index introduces serialisation anomalies.

On Mon, Jun 19, 2023 at 09:30:12PM +1200, Thomas Munro wrote:
On Mon, Jun 19, 2023 at 6:50 PM Thomas Munro <thomas.munro@gmail.com> wrote:

... or bitmap heapscan not checking for
conflicts out comprehensively (xids on invisible tuples we scan), eg
not fetching heap tuples for some short cut reason somewhere. ...

Ahh, here's such a place. I can't reproduce it with the patch already
posted + this check commented out.

--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -2127,16 +2127,18 @@ heapam_scan_bitmap_next_block(TableScanDesc scan,

hscan->rs_cindex = 0;
hscan->rs_ntuples = 0;

+#if 0
/*
* Ignore any claimed entries past what we think is the end of the
* relation. It may have been extended after the start of our scan (we
* only hold an AccessShareLock, and it could be inserts from this
* backend).
*/
if (block >= hscan->rs_nblocks)
return false;
+#endif

Great, thanks! I can confirm that after applying both the posted patch
and the change above, the issue is no longer reproducible.

One thing I've noticed is that one can observe a similar issue using a
gin index and int[] for the "path" column, even with the changes from
this thread applied. The gin implementation does something similar to
btree in startScanEntry: it lands in the "No entry found" branch, but
instead of locking the relation it locks "the leaf page, to lock the
place where the entry would've been, had there been one". A similar
fix that retries ginFindLeafPage didn't solve the problem, even when
locking the whole relation instead, but maybe I'm missing something.

#12Thomas Munro
thomas.munro@gmail.com
In reply to: Dmitry Dolgov (#11)
Re: BUG #17949: Adding an index introduces serialisation anomalies.

On Tue, Jun 20, 2023 at 12:18 AM Dmitry Dolgov <9erthalion6@gmail.com> wrote:

On Mon, Jun 19, 2023 at 09:30:12PM +1200, Thomas Munro wrote:
+#if 0
/*
* Ignore any claimed entries past what we think is the end of the
* relation. It may have been extended after the start of our scan (we
* only hold an AccessShareLock, and it could be inserts from this
* backend).
*/
if (block >= hscan->rs_nblocks)
return false;
+#endif

Great, thanks! Can confirm, after applying both the posted patch and the
change above the issue is not reproducible anymore.

Here's a cleaned-up version of the first two changes. What do you
think about the assertions I make in the commit message for 0002?

One thing I've noticed is that one can observe a similar issue using a
gin index and int[] for the "path" column, even applying changes from
the thread. The gin implementation does something similar to btree in
startScanEntry -- it lands in "No entry found" branch, but instead of
locking the relation it locks "the leaf page, to lock the place where
the entry would've been, had there been one". The similar fix retrying
ginFindLeafPage didn't solve the problem, even if locking the whole
relation instead, but maybe I'm missing something.

Ouch. I would have to go and study gin's interlocking model, but one
superficial bug I spotted is that ginget.c's collectMatchBitmap()
calls PredicateLockPage(stack->buffer), where a block number is
expected. I wish we had strong typedefs, to reject stuff like that at
compile time. But fixing that alone isn't enough.

In case someone who knows more about gin is interested in helping, I
attach Artem's repro, modified to use gin.

Attachments:

v2-0001-Fix-race-in-SSI-interaction-with-empty-btrees.patch (text/x-patch, +25 -15)
v2-0002-Fix-race-in-SSI-interaction-with-bitmap-heap-scan.patch (text/x-patch, +0 -10)
test-ssi-gin-repro.c (text/x-csrc)
#13Thomas Munro
thomas.munro@gmail.com
In reply to: Thomas Munro (#12)
Re: BUG #17949: Adding an index introduces serialisation anomalies.

On Tue, Jun 20, 2023 at 1:22 PM Thomas Munro <thomas.munro@gmail.com> wrote:

On Tue, Jun 20, 2023 at 12:18 AM Dmitry Dolgov <9erthalion6@gmail.com> wrote:

One thing I've noticed is that one can observe a similar issue using a
gin index and int[] for the "path" column, even applying changes from
the thread. The gin implementation does something similar to btree in
startScanEntry -- it lands in "No entry found" branch, but instead of
locking the relation it locks "the leaf page, to lock the place where
the entry would've been, had there been one". The similar fix retrying
ginFindLeafPage didn't solve the problem, even if locking the whole
relation instead, but maybe I'm missing something.

Ouch. I would have to go and study gin's interlocking model, but one
superficial bug I spotted is that ginget.c's collectMatchBitmap()
calls PredicateLockPage(stack->buffer), where a block number is
expected. I wish we had strong typedefs, to reject stuff like that at
compile time. But fixing that alone isn't enough.

In case someone who knows more about gin is interested in helping, I
attach Artem's repro, modified to use gin.

This is probably going to go faster if I CC the authors of commit
0bef1c06. Any ideas about how we're missing rw-conflicts under high
concurrency?

#14Thomas Munro
thomas.munro@gmail.com
In reply to: Thomas Munro (#13)
Re: BUG #17949: Adding an index introduces serialisation anomalies.

On Wed, Jun 21, 2023 at 10:54 AM Thomas Munro <thomas.munro@gmail.com> wrote:

This is probably going to go faster if I CC the authors of commit
0bef1c06. Any ideas about how we're missing rw-conflicts under high
concurrency?

I guess one (probably stupid) question I have: where is the
CheckForSerializableConflictIn(rootPostingTree) that I was expecting
to see when writing to an existing posting tree? IIUC that should
pair with PredicateLockPage(btree->index, rootPostingTree, snapshot)
when reading.

#15Dmitry Dolgov
9erthalion6@gmail.com
In reply to: Thomas Munro (#12)
Re: BUG #17949: Adding an index introduces serialisation anomalies.

On Tue, Jun 20, 2023 at 01:22:19PM +1200, Thomas Munro wrote:
On Tue, Jun 20, 2023 at 12:18 AM Dmitry Dolgov <9erthalion6@gmail.com> wrote:

On Mon, Jun 19, 2023 at 09:30:12PM +1200, Thomas Munro wrote:
+#if 0
/*
* Ignore any claimed entries past what we think is the end of the
* relation. It may have been extended after the start of our scan (we
* only hold an AccessShareLock, and it could be inserts from this
* backend).
*/
if (block >= hscan->rs_nblocks)
return false;
+#endif

Great, thanks! Can confirm, after applying both the posted patch and the
change above the issue is not reproducible anymore.

Here's a cleaned-up version of the first two changes. What do you
think about the assertions I make in the commit message for 0002?

Yep, it sounds correct to me. After a quick look I couldn't find where
exactly the similar code lives in the pre-tableam versions, so I can't
say anything about back-patching there.

#16Dmitry Dolgov
9erthalion6@gmail.com
In reply to: Thomas Munro (#14)
Re: BUG #17949: Adding an index introduces serialisation anomalies.

On Wed, Jun 21, 2023 at 11:20:28AM +1200, Thomas Munro wrote:
On Wed, Jun 21, 2023 at 10:54 AM Thomas Munro <thomas.munro@gmail.com> wrote:

This is probably going to go faster if I CC the authors of commit
0bef1c06. Any ideas about how we're missing rw-conflicts under high
concurrency?

I guess one (probably stupid) question I have: where is the
CheckForSerializableConflictIn(rootPostingTree) that I was expecting
to see when writing to an existing posting tree? IIUC that should
pair with PredicateLockPage(btree->index, rootPostingTree, snapshot)
when reading.

I can't find it either, but based on my superficial investigation this
particular reproducer doesn't seem to hit the posting-tree
functionality at all. What I observe is that the inserting transaction
uses ginHeapTupleFastCollect + ginHeapTupleFastInsert, and the
corresponding comment says that serialization in this case depends on
the metabuffer:

/*
* An insertion to the pending list could logically belong anywhere in the
* tree, so it conflicts with all serializable scans. All scans acquire a
* predicate lock on the metabuffer to represent that.
*/

Now, the reading transaction actually does PredicateLockPage on the
metabuffer inside scanPendingInsert, but strangely it doesn't lock
anything, because the SerializationNeededForRead condition is false.
I'm trying to verify whether that is somehow part of the issue, or
whether something is broken on my side.

#17Thomas Munro
thomas.munro@gmail.com
In reply to: Dmitry Dolgov (#16)
Re: BUG #17949: Adding an index introduces serialisation anomalies.

On Wed, Jun 21, 2023 at 9:04 PM Dmitry Dolgov <9erthalion6@gmail.com> wrote:

Now the reading transaction actually does PredicateLockPage on the
metabuffer inside scanPendingInsert, but strangely enough it doesn't
lock anything because the SerializationNeededForRead condition is false.
I'm trying to verify if it's somehow a part of the issue, or something
is broken on my side.

Maybe you were confused by the presence of non-SSI transactions in the
repro (eg the transaction that sets up the index)?

To answer my own earlier question, the conflict-in check for posting
trees is hidden in getFindLeafPage(..., true, ...).

I spent some more time trying to grok this today. FTR it reproduces
faster without the extra tuple that the repro I posted inserts after
TRUNCATE (the point of that was to find out whether it was an
empty-to-non-empty transition). I still don't know what's wrong, but I
am beginning to suspect the "fast" code. It seems as though, under
high concurrency, we sometimes fail to scan a recently inserted tuple
(invisible to our snapshot, but needed for SSI checks), but I don't
yet know why.

#18Dmitry Dolgov
9erthalion6@gmail.com
In reply to: Thomas Munro (#17)
Re: BUG #17949: Adding an index introduces serialisation anomalies.

On Thu, Jun 22, 2023 at 10:02:19PM +1200, Thomas Munro wrote:
On Wed, Jun 21, 2023 at 9:04 PM Dmitry Dolgov <9erthalion6@gmail.com> wrote:

Now the reading transaction actually does PredicateLockPage on the
metabuffer inside scanPendingInsert, but strangely enough it doesn't
lock anything because the SerializationNeededForRead condition is false.
I'm trying to verify if it's somehow a part of the issue, or something
is broken on my side.

Maybe you were confused by the presence of non-SSI transactions in the
repro (eg the transaction that sets up the index)?

Yeah, sort of. I need to optimize the way I consume the logs.

To answer my own earlier question, the conflict-in check for posting
trees is hidden in getFindLeafPage(..., true, ...).

I spent some more time trying to grok this today. FTR it reproduces
faster without the extra tuple that repro I posted inserts after
TRUNCATE (the point of that was to find out whether it was an
empty-to-non-empty transition). I still don't know what's wrong but I
am beginning to suspect the "fast" code. It seems as though, under
high concurrency, we sometimes don't scan a recently inserted
(invisible to our snapshot, but needed for SSI checks) tuple, but I
don't yet know why.

Yep, it's definitely something in the "fast" path. Testing the same
scenario with an index created with (fastupdate=off) works just fine
for me.

#19Dmitry Dolgov
9erthalion6@gmail.com
In reply to: Dmitry Dolgov (#18)
Re: BUG #17949: Adding an index introduces serialisation anomalies.

On Fri, Jun 23, 2023 at 04:05:42PM +0200, Dmitry Dolgov wrote:

I spent some more time trying to grok this today. FTR it reproduces
faster without the extra tuple that repro I posted inserts after
TRUNCATE (the point of that was to find out whether it was an
empty-to-non-empty transition). I still don't know what's wrong but I
am beginning to suspect the "fast" code. It seems as though, under
high concurrency, we sometimes don't scan a recently inserted
(invisible to our snapshot, but needed for SSI checks) tuple, but I
don't yet know why.

Yep, it's definitely something in the "fast" path. Testing the same, but
with an index having (fastupdate=off) works just fine for me.

I've managed to resolve it, or at least reduce the chances of the
issue appearing, by semi-randomly adding more
CheckForSerializableConflictIn / PredicateLock calls around the new
sublist that has to be created in ginHeapTupleFastInsert. I haven't
seen the reproducer fail with this changeset after running it multiple
times for a couple of minutes, whereas on the main branch, with the
two fixes from Thomas included, it was failing within a couple of
seconds.

	diff --git a/src/backend/access/gin/ginfast.c b/src/backend/access/gin/ginfast.c
	@@ -198,6 +199,7 @@ makeSublist(Relation index, IndexTuple *tuples, int32 ntuples,
			/*
			 * Write last page
			 */
	+       CheckForSerializableConflictIn(index, NULL, BufferGetBlockNumber(curBuffer));
			res->tail = BufferGetBlockNumber(curBuffer);
			res->tailFreeSize = writeListPage(index, curBuffer,
	@@ -273,6 +275,11 @@ ginHeapTupleFastInsert(GinState *ginstate, GinTupleCollector *collector)
							separateList = true;
							LockBuffer(metabuffer, GIN_UNLOCK);
					}
	+               else
	+               {
	+                       CheckForSerializableConflictIn(index, NULL, metadata->head);
	+               }
			}
	diff --git a/src/backend/access/gin/ginget.c b/src/backend/access/gin/ginget.c
	@@ -140,7 +140,7 @@ collectMatchBitmap(GinBtreeData *btree, GinBtreeStack *stack,
			 * Predicate lock entry leaf page, following pages will be locked by
			 * moveRightIfItNeeded()
			 */
	-       PredicateLockPage(btree->index, stack->buffer, snapshot);
	+       PredicateLockPage(btree->index, BufferGetBlockNumber(stack->buffer), snapshot);
			for (;;)
			{
	@@ -1925,6 +1926,8 @@ gingetbitmap(IndexScanDesc scan, TIDBitmap *tbm)
			/*
			 * Set up the scan keys, and check for unsatisfiable query.
			 */
	+       PredicateLockRelation(scan->indexRelation, scan->xs_snapshot);
			ginFreeScanKeys(so);            /* there should be no keys yet, but just to be

Now, the last PredicateLockRelation does look rather weird. But
without it, or with this locking happening in scanPendingInsert
(instead of locking the meta page), or without the other changes in
ginfast.c, the reproducer still fails. This of course makes it not a
proper solution by any means, but hopefully it will help in
understanding the problem a bit better.

#20Thomas Munro
thomas.munro@gmail.com
In reply to: Dmitry Dolgov (#19)
Re: BUG #17949: Adding an index introduces serialisation anomalies.

On Sun, Jun 25, 2023 at 1:59 AM Dmitry Dolgov <9erthalion6@gmail.com> wrote:

I've managed to resolve it, or at least reduce the chances for the issue to
appear, via semi-randomly adding more CheckForSerializableConflictIn /
PredicateLock around the new sublist that has to be created in
ginHeapTupleFastInsert. I haven't seen the reproducer failing with this
changeset after running it multiple times for a couple of minutes, where on the
main branch, with the two fixes from Thomas included, it was failing within a
couple of seconds.

Ahh, right, thanks. I don't think we need to lock all those pages as
you showed, since this whole "fast" path is supposed to be covered by
the meta page (in other words, GIN is expected to have the highest
possible serialisation failure rate under SSI unless you turn fast
updates off). But there is an ordering bug with the existing
predicate lock in that code, which allows this to happen:

S1: CheckForSerializableConflictIn(meta)

S2: PredicateLockPage(meta)
S2: scan, find no tuples

S1: BufferLock(EXCLUSIVE)
S1: modify stuff...

CheckForSerializableConflictIn() was written with the assumption that
you're inserting a tuple (ie you have the page containing the tuple
locked), so you'll either conflict with a reader who already has a
predicate lock at that point, OR you'll insert first and then the
reader will see your (invisible-to-snapshot) tuples. But here we're
doing some fancy footwork with a meta page, and we screwed up the
ordering, leaving a window in which neither of those things happens.
Perhaps it was coded that way because of the drop-then-reacquire
dance, but it's easy enough to move the check in both branches. Does
that make sense?

Attachments:

v3-0001-Fix-race-in-SSI-interaction-with-empty-btrees.patch (text/x-patch, +25 -15)
v3-0002-Fix-race-in-SSI-interaction-with-bitmap-heap-scan.patch (text/x-patch, +0 -10)
v3-0003-Fix-race-in-SSI-interaction-with-gin-fast-path.patch (text/x-patch, +11 -4)
#21Peter Geoghegan
In reply to: Thomas Munro (#20)
#22Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Thomas Munro (#20)
#23Thomas Munro
thomas.munro@gmail.com
In reply to: Peter Geoghegan (#21)
#24Thomas Munro
thomas.munro@gmail.com
In reply to: Heikki Linnakangas (#22)
#25Thomas Munro
thomas.munro@gmail.com
In reply to: Thomas Munro (#24)
#26Artem Anisimov
artem.anisimov.255@gmail.com
In reply to: Thomas Munro (#25)
#27Thomas Munro
thomas.munro@gmail.com
In reply to: Artem Anisimov (#26)