Re: BUG #17821: Assertion failed in heap_update() due to heap pruning - Mailing list pgsql-bugs
From | Alexander Lakhin |
---|---|
Subject | Re: BUG #17821: Assertion failed in heap_update() due to heap pruning |
Date | |
Msg-id | 25c3399b-58a3-d727-7f97-93394fd1afa9@gmail.com Whole thread Raw |
In response to | Re: BUG #17821: Assertion failed in heap_update() due to heap pruning (Andres Freund <andres@anarazel.de>) |
Responses |
Re: BUG #17821: Assertion failed in heap_update() due to heap pruning
|
List | pgsql-bugs |
Hi, 04.03.2023 19:48, Andres Freund wrote: > My first suspicion would be that we aren't holding a lock in the right moment, to enforce cache invalidation processing. > I've encountered another case of that assertion failure (this time with pg_statistic): numclients=10 for ((c=1;c<=numclients;c++)); do createdb db$c done for ((i=1;i<=50;i++)); do echo "iteration $i" for ((c=1;c<=numclients;c++)); do echo " CREATE TABLE t(i int4, t1 text[], t2 text[]); INSERT INTO t SELECT g, ARRAY[repeat('x', g)], ARRAY[repeat('x', g)] FROM generate_series(1, 200) g; ANALYZE t; " | psql -d db$c >/dev/null done for ((c=1;c<=numclients;c++)); do echo " ANALYZE t; " | psql -d db$c >/dev/null & echo " SELECT * FROM t WHERE i = 1; ANALYZE t; " | psql -d db$c >/dev/null & done wait grep 'TRAP:' server.log && { break; } for ((c=1;c<=numclients;c++)); do echo " DROP TABLE t; VACUUM pg_statistic; " | psql -d db$c >/dev/null done done ... iteration 13 server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. connection to server was lost TRAP: failed Assert("ItemIdIsNormal(lp)"), File: "heapam.c", Line: 3074, PID: 89697 Reproduced on REL_12_STABLE (starting from 5e0928005) .. master. (Modifying pruneheap.c as in [1] not needed for this case.) Here we have the following sequence of events: Session 1 ANALYZE t; // needed just to produce material for pruning do_analyze_rel() -> update_attstats() -> CatalogTupleUpdateWithInfo() -> heap_update(ctid(18,16)); PrepareToInvalidateCacheTuple(hashvalue: {hash1}) Session 2: SELECT * FROM t WHERE i = 1; SearchCatCacheInternal(hashValue: {hash1}) SearchCatCacheMiss(hashValue: {hash1}) ProcessInvalidationMessages() CatCacheInvalidate(hashValue: {hash1}) CatCacheInvalidate(hashValue: {hash2}) CatCacheInvalidate(hashValue: {hash1}) SearchCatCacheMiss returns tid(18,16) ANALYZE t; do_analyze_rel() -> update_attstats() attno=0 SearchSysCache3() -> SearchCatCacheInternal(hashValue: {hash2}) -> SearchCatCacheMiss() -> heap_page_prune(block: 18) attno=1 SearchSysCache3() -> SearchCatCacheInternal(hashValue: {hash1}) -> oldtup = tid(18,16) CatalogTupleUpdateWithInfo() -> heap_update(tid(18, 16)) (A server.log with additional tracing messages for one of the failed runs is attached.) In other words, a backend can get some tuple id from catcache and then prune the underlying page while searching for another (missing) entry in the cache (any other reading of the page that may trigger pruning will do as well). After that, an attempt to update that tuple might trigger the Assert(). [1] https://www.postgresql.org/message-id/17821-dd8c334263399284%40postgresql.org Best regards, Alexander
Attachment
pgsql-bugs by date: