BUG #17949: Adding an index introduces serialisation anomalies. - Mailing list pgsql-bugs

From:      PG Bug reporting form
Subject:   BUG #17949: Adding an index introduces serialisation anomalies.
Date:
Msg-id:    17949-a0f17035294a55e2@postgresql.org
Responses: Re: BUG #17949: Adding an index introduces serialisation anomalies.
           (Thomas Munro <thomas.munro@gmail.com>)
List:      pgsql-bugs
The following bug has been logged on the website:

Bug reference:      17949
Logged by:          Artem Anisimov
Email address:      artem.anisimov.255@gmail.com
PostgreSQL version: 15.3
Operating system:   fedora 38
Description:

Hello dear pg authors,

I have come across a behaviour in pg that contradicts
https://www.postgresql.org/docs/15/transaction-iso.html#XACT-SERIALIZABLE.
I've minimised my testcase to a scenario that essentially duplicates a
scenario that you describe in the following paragraph:

> In particular, it is possible to see unique constraint violations
> caused by conflicts with overlapping Serializable transactions
> even after explicitly checking that the key isn't present
> before attempting to insert it.
> This can be avoided by making sure that all Serializable transactions that
> insert potentially conflicting keys explicitly check if they can do so first.

At the end of the report there is a C reproducer that highlights the problem.
Let me give a high-level overview of the scenario first. I have N threads,
each trying to "acquire an exclusive lock" this way:

    BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
    SELECT * FROM locks WHERE path = $1;
    -- quit if there already is a lock
    INSERT INTO locks(path, ...) VALUES($1, ...);
    COMMIT;

Once all threads have attempted to acquire a lock, I count how many of them
succeeded with the INSERT. Normally only one thread succeeds, which is what
I expect. However, if I add a HASH or a BTREE index on "path", it becomes
possible for multiple threads to INSERT successfully, which violates
serialisability.

The problem can be reproduced with the default postgresql.conf, but it takes
some time. If I increase "shared_buffers" to 1024MB, the issue appears almost
immediately (fewer chances to promote predicate locks to locks on the whole
table?). I've seen this behaviour with pg 13, 15 and 16 (503b055).

Now let us see the reproducer. It has two primary components:
1. test_once() spawns 32 threads that try to acquire a lock, waits for them,
   and counts the number of "acquired exclusive locks";
2. try_acquire_lock() executes the transaction described above.

To build the reproducer:

    $ gcc -std=c99 -o test test.c -pthread `pkg-config libpq --cflags --libs`

To run the reproducer:

    $ ./test
or
    $ DB_CONNSTR="dbname=abc host=def user=ghi" ./test

The reproducer:

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <assert.h>
#include <libpq-fe.h>

// Comment this to make serialisation anomalies go away.
#define WITH_INDEX

// I have seen the problem with as few as 3 threads. 32 threads make
// the issue appear much sooner.
#define NR_THREADS (32)
#define NR_RUNS (1024 * 1024)

static PGconn *conns[NR_THREADS];

static void* try_acquire_lock(void *arg)
{
	PGconn *c = arg;
	PGresult *res;
	int ntuples;

	res = PQexec(c, "BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE");
	assert(PQresultStatus(res) == PGRES_COMMAND_OK);
	PQclear(res);

	res = PQexec(c, "SELECT * FROM locks WHERE path = 'xyz'");
	assert(PQresultStatus(res) == PGRES_TUPLES_OK);
	ntuples = PQntuples(res);
	PQclear(res);

	if (ntuples > 0) {
		// someone else already has a lock
		res = PQexec(c, "COMMIT");
		assert(PQresultStatus(res) == PGRES_COMMAND_OK);
		PQclear(res);
		return NULL;
	}

	// No asserts here: the INSERT and COMMIT are allowed to fail with a
	// serialisation error when transactions overlap.
	res = PQexec(c, "INSERT INTO locks(path) VALUES('xyz')");
	PQclear(res);
	res = PQexec(c, "COMMIT");
	PQclear(res);
	return NULL;
}

static void test_once(void)
{
	PGconn *c = conns[0];
	PGresult *res;
	int ntuples;
	pthread_t thrs[NR_THREADS];

	for (int i = 0; i < NR_THREADS; ++i)
		pthread_create(&thrs[i], NULL, &try_acquire_lock, conns[i]);
	for (int i = 0; i < NR_THREADS; ++i)
		pthread_join(thrs[i], NULL);

	res = PQexec(c, "SELECT * FROM locks WHERE path = 'xyz'");
	assert(PQresultStatus(res) == PGRES_TUPLES_OK);
	ntuples = PQntuples(res);
	PQclear(res);

	if (ntuples != 1)
		printf("ntuples = %d\n", ntuples);
	assert(ntuples == 1);

	res = PQexec(c, "TRUNCATE TABLE locks");
	assert(PQresultStatus(res) == PGRES_COMMAND_OK);
	PQclear(res);
}

static void prepare_db(void)
{
	PGconn *c = conns[0];
	PGresult *res;

	// May fail on the first run, when the table does not exist yet.
	res = PQexec(c, "DROP TABLE locks");
	PQclear(res);

	res = PQexec(c, "CREATE TABLE locks (path TEXT NOT NULL)");
	assert(PQresultStatus(res) == PGRES_COMMAND_OK);
	PQclear(res);

#ifdef WITH_INDEX
	res = PQexec(c, "CREATE INDEX ON locks USING HASH(path)");
	assert(PQresultStatus(res) == PGRES_COMMAND_OK);
	PQclear(res);
#endif
}

int main(void)
{
	const char *connstr = getenv("DB_CONNSTR");
	if (connstr == NULL)
		connstr = "dbname=postgres";

	for (int i = 0; i < NR_THREADS; ++i) {
		conns[i] = PQconnectdb(connstr);
		assert(PQstatus(conns[i]) == CONNECTION_OK);
	}

	prepare_db();
	for (int i = 0; i < NR_RUNS; ++i)
		test_once();

	for (int i = 0; i < NR_THREADS; ++i)
		PQfinish(conns[i]);
	return 0;
}
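An aside on the reproducer's design: a Serializable client is normally
expected to see serialisation failures (SQLSTATE 40001) on the INSERT or
COMMIT and retry the whole transaction, which is why those two calls above
deliberately have no asserts. In libpq the SQLSTATE comes from
PQresultErrorField(res, PG_DIAG_SQLSTATE). A minimal sketch of just the
classification step (the helper name is my own, and it is written without
any database dependency so the surrounding retry loop is omitted):

#include <stdio.h>
#include <string.h>

/* SQLSTATE 40001 is serialization_failure; 40P01 (deadlock_detected) is
 * also commonly retried. In a real client the sqlstate string would come
 * from PQresultErrorField(res, PG_DIAG_SQLSTATE), which may be NULL. */
static int should_retry(const char *sqlstate)
{
	if (sqlstate == NULL)
		return 0;
	return strcmp(sqlstate, "40001") == 0 ||
	       strcmp(sqlstate, "40P01") == 0;
}

int main(void)
{
	printf("%d %d %d\n",
	       should_retry("40001"),   /* retryable */
	       should_retry("40P01"),   /* retryable */
	       should_retry("23505"));  /* unique_violation: not retryable */
	return 0;
}

A retrying client would loop over BEGIN/SELECT/INSERT/COMMIT while
should_retry() holds; the anomaly reported here is that, with the index in
place, none of the overlapping transactions gets a 40001 in the first place.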