BUG #18163: Catcache entry invalidation might be missed when toast processed inside CatalogCacheCreateEntry - Mailing list pgsql-bugs

From PG Bug reporting form
Subject BUG #18163: Catcache entry invalidation might be missed when toast processed inside CatalogCacheCreateEntry
Date
Msg-id 18163-859bad19a43edcf6@postgresql.org
Whole thread Raw
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      18163
Logged by:          Alexander Lakhin
Email address:      exclusion@gmail.com
PostgreSQL version: 16.0
Operating system:   Ubuntu 22.04
Description:

[ inspired by Tom Lane's suspicions in the discussion of bug #18014 ]

When a catalog entry created inside CatalogCacheCreateEntry, for tuples
with toasted fields this function fetches toast values and thus processes
AcceptInvalidationMessages calls, that might invalidate the entry created,
but CatalogCacheCreateEntry still returns the entry as valid.

I've constructed the following demo script for the issue:
numraces=10
echo "
SELECT format('CREATE ROLE user_%s', i) FROM generate_series(1, 4000) i
\gexec" | psql >/dev/null

for ((c=1;c<=$numraces;c++)); do
createdb db$c
echo "
CREATE DOMAIN dom AS text;

SELECT format('
  GRANT USAGE ON TYPE dom TO user_%s', i) FROM generate_series(1, 4000) i
\gexec

CREATE TABLE tab(time timestamp, id int, content dom);
" | psql db$c >/dev/null
done

for ((n=1;n<=300;n++)); do
echo "ITERATION $n"

for ((c=1;c<=$numraces;c++)); do
echo "
BEGIN;
ALTER DOMAIN dom SET DEFAULT 'x';
SELECT pg_sleep(0.1);
COMMIT;
SELECT pg_sleep(0.05 + random() * 0.10); -- tuning might be needed
INSERT INTO tab SELECT now(), 101; SELECT pg_sleep(0.1);
INSERT INTO tab SELECT now(), 102; SELECT pg_sleep(0.1);
INSERT INTO tab SELECT now(), 103; SELECT pg_sleep(0.1);
INSERT INTO tab SELECT now(), 104; SELECT pg_sleep(0.1);
INSERT INTO tab SELECT now(), 105;
" | psql db$c >/dev/null &

echo "
BEGIN;
SELECT pg_sleep(0.05);
SELECT oid FROM pg_type WHERE typname = 'dom' FOR UPDATE;
ROLLBACK;
ALTER DOMAIN dom SET DEFAULT 'xx';
INSERT INTO tab SELECT now(), 201; SELECT pg_sleep(0.1);
INSERT INTO tab SELECT now(), 202; SELECT pg_sleep(0.1);
INSERT INTO tab SELECT now(), 203; SELECT pg_sleep(0.1);
INSERT INTO tab SELECT now(), 204; SELECT pg_sleep(0.1);
INSERT INTO tab SELECT now(), 205;
" | psql db$c >/dev/null &
done
wait

for ((c=1;c<=$numraces;c++)); do
errcnt=$(
echo "
SELECT COUNT(*) FROM tab t, tab t1, tab t2
  WHERE t1.time < t.time AND t1.content <> t.content
    AND t2.time > t.time AND t2.content <> t.content;
" | psql db1 -Aqt)
if [[ $errcnt == 0 ]]; then
  psql db$c -c "TRUNCATE TABLE tab" >/dev/null
else
  echo "Pair $c has got inconsistent domain definitions:"
  psql db$c -c "SELECT time, id, content FROM tab ORDER BY time"
  break
fi
done

[[ $errcnt == 0 ]] || break;
done

It shows how two sessions might see two different live catcache entries
simultaneously:
...
ITERATION 20
Pair 1 has got inconsistent domain definitions:
            time            | id  | content 
----------------------------+-----+---------
 2023-10-20 10:31:30.214179 | 101 | x
 2023-10-20 10:31:30.222808 | 201 | xx
 2023-10-20 10:31:30.324768 | 202 | xx
 2023-10-20 10:31:30.326976 | 102 | x
 2023-10-20 10:31:30.42655  | 203 | xx
 2023-10-20 10:31:30.428698 | 103 | x
 2023-10-20 10:31:30.527381 | 204 | xx
 2023-10-20 10:31:30.530396 | 104 | x
 2023-10-20 10:31:30.629155 | 205 | xx
 2023-10-20 10:31:30.632042 | 105 | x
(10 rows)

I've  patched catcache.c to catch the anomaly easier:
@@ -40,2 +40,3 @@
 #include "utils/syscache.h"
+#include "tcop/tcopprot.h"
 
@@ -1812,2 +1813,3 @@ CatalogCacheCreateEntry(CatCache *cache, HeapTuple
ntp, Datum *arguments,
         Assert(!negative);
+if (debug_query_string &&  strstr(debug_query_string, "now(), 101") !=
NULL) pg_usleep(1000L);
 
but I believe it could be seen without patching, that would just require
more time.


pgsql-bugs by date:

Previous
From: Laurenz Albe
Date:
Subject: Re: group by true now errors with non-integer constant in GROUP BY
Next
From: Janning Vygen
Date:
Subject: slow pg_dump with bytea