GISTSTATE is too large - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | GISTSTATE is too large |
Date | |
Msg-id | 20210425222053.y6vcdiqohdoshi26@alap3.anarazel.de Whole thread Raw |
Responses |
Re: GISTSTATE is too large
Re: GISTSTATE is too large |
List | pgsql-hackers |
Hi, On twitter it was mentioned [1] that gist index builds spend a lot of time in FunctionCall3Coll. Which could be addressed to a good degree by not using FunctionCall3Coll() which needs to call InitFunctionCallInfoData() every time, but instead doing so once by including the FunctionCallInfo in GISTSTATE. Which made me look at GISTSTATEs layout. And, uh, I was a bit shocked: struct GISTSTATE { MemoryContext scanCxt; /* 0 8 */ MemoryContext tempCxt; /* 8 8 */ TupleDesc leafTupdesc; /* 16 8 */ TupleDesc nonLeafTupdesc; /* 24 8 */ TupleDesc fetchTupdesc; /* 32 8 */ FmgrInfo consistentFn[32]; /* 40 1536 */ /* --- cacheline 24 boundary (1536 bytes) was 40 bytes ago --- */ FmgrInfo unionFn[32]; /* 1576 1536 */ ... /* --- cacheline 216 boundary (13824 bytes) was 40 bytes ago --- */ Oid supportCollation[32]; /* 13864 128 */ /* size: 13992, cachelines: 219, members: 15 */ /* last cacheline: 40 bytes */ }; So the basic GISTSTATE is 14kB large. And all the information needed to call support functions for one attribute is spaced so far appart that it's guaranteed to be on different cachelines and to be very unlikely to be prefetched by the hardware prefetcher. It seems pretty clear that this should be changed to be something more like typedef struct GIST_COL_STATE { FmgrInfo consistentFn; FmgrInfo unionFn; FmgrInfo compressFn; FmgrInfo decompressFn; FmgrInfo penaltyFn; FmgrInfo picksplitFn; FmgrInfo equalFn; FmgrInfo distanceFn; FmgrInfo fetchFn; /* Collations to pass to the support functions */ Oid supportCollation; } GIST_COL_STATE; typedef struct GISTSTATE { MemoryContext scanCxt; /* context for scan-lifespan data */ MemoryContext tempCxt; /* short-term context for calling functions */ TupleDesc leafTupdesc; /* index's tuple descriptor */ TupleDesc nonLeafTupdesc; /* truncated tuple descriptor for non-leaf * pages */ TupleDesc fetchTupdesc; /* tuple descriptor for tuples returned in an * index-only scan */ GIST_COL_STATE column_state[FLEXIBLE_ARRAY_MEMBER]; } with initGISTstate allocating based on IndexRelationGetNumberOfKeyAttributes() instead of using a constant. And then subsequently change GIST_COL_STATE to embed the FunctionCallInfo, rather than initializiing them on the stack for every call. I'm not planning on doing this work, but I thought it's sensible to send to the list anyway. Greetings, Andres Freund [1] https://twitter.com/komzpa/status/1386420422225240065
pgsql-hackers by date: