Re: Potential "AIO / io workers" inter-worker locking issue in PG18? - Mailing list pgsql-bugs
From | Marco Boeringa |
---|---|
Subject | Re: Potential "AIO / io workers" inter-worker locking issue in PG18? |
Date | |
Msg-id | 53b44572-0ceb-4149-b361-07da2da91032@boeringa.demon.nl Whole thread Raw |
In response to | Re: Potential "AIO / io workers" inter-worker locking issue in PG18? (Andres Freund <andres@anarazel.de>) |
Responses |
Re: Potential "AIO / io workers" inter-worker locking issue in PG18?
|
List | pgsql-bugs |
Hi Andres,

> Could you show perf report --no-children? That would show us which
> individual functions, rather than call-stacks, take the longest.

See entirely below!

> I don't immediately see how this could be related to AIO.

Yes, you could be right that this is not related to AIO at all, but to another issue introduced in PG18. The only reason I initially thought of AIO was of course that it is one of the main new features of PG18, and I could imagine "workers" getting into some sort of inter-worker locking issue, just like threads can.

For sure, I did not see this issue in <= PG17, so some change in PG18 is causing it. Additionally, there is a small chance it might be related to PostGIS, as that was upgraded as well (3.5.2 --> 3.6.0) during the PG upgrade, since PG18 requires PostGIS 3.6.0 minimum. The query does use PostGIS functions, but none that AFAIK rely on e.g. a spatial index. Functions like ST_Area just process an individual geometry, not the spatial relationship between multiple geometries.

As I wrote before, this is a multi-threaded Python application (actually developed in a GIS) that uses Python's 'concurrent.futures' threading framework to create jobs of records to process for each thread, significantly speeding up the processing. The queries are in fact dynamically built by the code, and are part of a much larger geoprocessing workflow, so it is hard to run them separately and provide a query plan (although in this case I could, by rewriting part of the query below). However, I want to stress that any query plan is unlikely to yield anything. In normal circumstances, and in PG17 and below, this code runs fine! It is only about 1 in 4 runs in PG18 that goes berserk and makes a processing step that should take < 10 seconds all of a sudden take > 2 hours.

UPDATE osm.landcover_scrubs_small_scale_2_ply AS t1
SET area_geo = t2.area_geo,
    perim_geo = t2.perim_geo,
    compact_geo = CASE WHEN t2.area_geo > 0
        THEN ((power(t2.perim_geo,2) / t2.area_geo) / (4 * pi()))
        ELSE 0 END,
    npoints_geo = t2.npoints_geo,
    comp_npoints_geo = CASE WHEN t2.npoints_geo > 0
        THEN (CASE WHEN t2.area_geo > 0
                   THEN ((power(t2.perim_geo,2) / t2.area_geo) / (4 * pi()))
                   ELSE 0 END / t2.npoints_geo)
        ELSE 0 END,
    convex_ratio_geo = CASE WHEN ST_Area(ST_ConvexHull(way)::geography,true) > 0
        THEN (t2.area_geo / ST_Area(ST_ConvexHull(way)::geography,true))
        ELSE 1 END
FROM (SELECT objectid,
             ST_Area(way::geography,true) AS area_geo,
             ST_Perimeter(way::geography,true) AS perim_geo,
             ST_NPoints(way) AS npoints_geo
      FROM osm.landcover_scrubs_small_scale_2_ply) AS t2
WHERE (t2.objectid = t1.objectid)
  AND t1.objectid IN (SELECT t3.objectid FROM mini_test.osm.osm_tmp_28128_ch5 AS t3)

> So somehow >60% of the CPU time is spent fetching tuples corresponding to
> index entries. That seems ... a lot. Is it possible that you have a lot of
> dead rows in the involved tables?

Yes, that is perfectly possible. However, the particular table is only just over 100k records. It is true that my code is designed to process literally *every* record in a table. However, I specifically set adjusted table storage parameters with much more aggressive vacuum settings (essentially forcing a vacuum after something like 10k dead tuples, irrespective of the size of the table). This has worked really well, and I have successfully UPDATEd Facebook Daylight sized tables of > 1B records in full with the same code, without ever running into this particular issue, nor into transaction ID wraparound issues.
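For reference, the kind of per-table storage parameter overrides I mean look roughly like this (illustrative values reconstructed from my description above, not copied from the live database):

    ALTER TABLE osm.landcover_scrubs_small_scale_2_ply SET (
        -- Trigger autovacuum after ~10k dead tuples regardless of table size,
        -- by zeroing the scale factor and fixing the threshold.
        autovacuum_vacuum_scale_factor = 0,
        autovacuum_vacuum_threshold = 10000
    );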
One particular thing to note as well is that, due to careful design of the jobs taking page locality into account, a deliberately low table fill factor, and plenty of RAM, quite a few (but not all) of the steps in the geoprocessing workflow manage to run almost completely as PostgreSQL 'HOT' updates, even when updating all records in the table, and even for very large tables. (A rough query to check both the dead tuple count and the HOT update ratio for the table is sketched after the perf output below.)

*** sudo perf -p <PID of one stuck postgres backend> -g -d 10 ***
*** sudo perf report --no-children ***

Samples: 40K of event 'task-clock:ppp', Event count (approx.): 10008250000
  Overhead  Command   Shared Object  Symbol
+   12,81%  postgres  postgres       [.] LWLockAttemptLock
+   10,70%  postgres  postgres       [.] heap_hot_search_buffer
+    9,27%  postgres  postgres       [.] tts_heap_getsomeattrs.lto_priv.0
+    6,77%  postgres  postgres       [.] LWLockReleaseInternal
+    5,89%  postgres  postgres       [.] hash_search_with_hash_value
+    5,79%  postgres  postgres       [.] ExecInterpExpr
+    4,14%  postgres  postgres       [.] PinBuffer
+    3,61%  postgres  postgres       [.] heapam_index_fetch_tuple.lto_priv.0
+    3,00%  postgres  postgres       [.] ReleaseAndReadBuffer
+    2,43%  postgres  postgres       [.] LWLockAcquire
+    2,38%  postgres  postgres       [.] HeapTupleSatisfiesVisibility
+    2,36%  postgres  postgres       [.] index_getnext_tid
+    1,96%  postgres  postgres       [.] index_fetch_heap
+    1,90%  postgres  postgres       [.] _bt_readpage
+    1,85%  postgres  postgres       [.] LWLockRelease
+    1,74%  postgres  postgres       [.] heap_page_prune_opt
+    1,49%  postgres  postgres       [.] IndexNext
+    1,47%  postgres  postgres       [.] _bt_checkkeys
+    1,47%  postgres  postgres       [.] ExecScan
+    1,47%  postgres  postgres       [.] XidInMVCCSnapshot
+    1,19%  postgres  postgres       [.] StartReadBuffer
+    1,18%  postgres  postgres       [.] GetPrivateRefCountEntry.lto_priv.0
+    1,12%  postgres  postgres       [.] _bt_next
+    1,10%  postgres  postgres       [.] UnpinBufferNoOwner.lto_priv.0
+    1,09%  postgres  postgres       [.] MemoryContextReset
+    1,08%  postgres  postgres       [.] btgettuple
+    0,91%  postgres  postgres       [.] _bt_check_compare.lto_priv.0
+    0,82%  postgres  postgres       [.] ExecEvalSysVar
+    0,82%  postgres  postgres       [.] ExecStoreBufferHeapTuple
+    0,71%  postgres  postgres       [.] hash_bytes
+    0,65%  postgres  postgres       [.] tts_virtual_clear.lto_priv.0
+    0,63%  postgres  postgres       [.] GlobalVisTestFor
+    0,63%  postgres  postgres       [.] ExecNestLoop
+    0,55%  postgres  postgres       [.] HeapTupleIsSurelyDead
+    0,52%  postgres  postgres       [.] ResourceOwnerForget
     0,48%  postgres  postgres       [.] slot_getsomeattrs_int
     0,42%  postgres  postgres       [.] PredicateLockTID
     0,40%  postgres  postgres       [.] ReadBufferExtended
     0,37%  postgres  postgres       [.] _bt_saveitem
     0,31%  postgres  postgres       [.] ExecIndexScan
     0,30%  postgres  libc.so.6      [.] __memcmp_sse2
     0,28%  postgres  postgres       [.] _bt_setuppostingitems
     0,26%  postgres  postgres       [.] ReleaseBuffer
     0,23%  postgres  postgres       [.] ResourceOwnerEnlarge
     0,23%  postgres  postgres       [.] HeapCheckForSerializableConflictOut
     0,22%  postgres  postgres       [.] IncrBufferRefCount
     0,18%  postgres  postgres       [.] pgstat_count_io_op
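That check against pg_stat_user_tables would look roughly like this (a sketch of what I would run the next time a backend gets stuck, not output from the actual run):

    -- Dead tuple count, HOT update ratio and last autovacuum for the table involved
    SELECT relname,
           n_live_tup,
           n_dead_tup,
           n_tup_upd,
           n_tup_hot_upd,
           round(100.0 * n_tup_hot_upd / NULLIF(n_tup_upd, 0), 1) AS hot_update_pct,
           last_autovacuum
    FROM pg_stat_user_tables
    WHERE schemaname = 'osm'
      AND relname = 'landcover_scrubs_small_scale_2_ply';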