Thread: GetRelationPath() vs critical sections
Hi, In the AIO patchset there are cases where we have to LOG failures inside a critical section. This is necessary because e.g. a buffer read might complete while we are waiting for a WAL write inside a critical section. We can't just defer the log message, as the IO might end up being waited-on/completed-by another backend than the backend that issued the IO, so we'd defer logging issues until an effectively arbitrary later time. In general emitting a LOG inside a critical section isn't a huge issue - we made sure that elog.c has a reserve of memory to be able to log without crashing. However, the current message for buffer IO issues use relpath*() (ending up in a call to GetRelationPath()). Which in turn uses psprintf() to generate the path. Which in turn violates the no-memory-allocations-in-critical-sections rule, as the containing memory context will typically not have ->allowInCritSection == true. It's not obvious to me what the best way to deal with this is. One idea I had was to add an errrelpath() that switches to edata->assoc_context before calling relpath(), but that would end up leaking memory, as FreeErrorDataContents() wouldn't know about the allocation. Obviously we could add a version of GetRelationPath() that just prints into a caller provided buffer - but that's somewhat awkward API wise. A third approach would be to have a dedicated memory context for this kind of thing that's reset after logging the message - but that comes awkwardly close to duplicating ErrorContext. I wonder if we're lacking a bit of infrastructure here... Greetings, Andres Freund
On Wed, Sep 04, 2024 at 11:58:33AM -0400, Andres Freund wrote: > In general emitting a LOG inside a critical section isn't a huge issue - we > made sure that elog.c has a reserve of memory to be able to log without > crashing. > > However, the current message for buffer IO issues use relpath*() (ending up in > a call to GetRelationPath()). Which in turn uses psprintf() to generate the > path. Which in turn violates the no-memory-allocations-in-critical-sections > rule, as the containing memory context will typically not have > ->allowInCritSection == true. > > It's not obvious to me what the best way to deal with this is. > > One idea I had was to add an errrelpath() that switches to > edata->assoc_context before calling relpath(), but that would end up leaking > memory, as FreeErrorDataContents() wouldn't know about the allocation. > > Obviously we could add a version of GetRelationPath() that just prints into a > caller provided buffer - but that's somewhat awkward API wise. Agreed. > A third approach would be to have a dedicated memory context for this kind of > thing that's reset after logging the message - but that comes awkwardly close > to duplicating ErrorContext. That's how I'd try to do it. Today's ErrorContext is the context for allocations FreeErrorDataContents() knows how to find. The new context would be for calling into arbitrary code unknown to FreeErrorDataContents(). Most of the time, we'd avoid reset work for the new context, since it would have no allocations. Ideally, errstart() would switch to the new context before returning true, and errfinish() would switch back. That way, you could just call relpath*() without an errrelpath() to help. This does need functions called in ereport() argument lists to not use CurrentMemoryContext for allocations that need to survive longer. I'd not be concerned about imposing that in a major release. What obstacles would arise if we did that? > I wonder if we're lacking a bit of infrastructure here... Conceptually, the ereport() argument list should be a closure that runs in a suitable mcxt. I think we're not far from the goal.
On Thu, Sep 5, 2024 at 3:58 AM Andres Freund <andres@anarazel.de> wrote: > Obviously we could add a version of GetRelationPath() that just prints into a > caller provided buffer - but that's somewhat awkward API wise. For the record, that's exactly what I did in the patch I proposed to fix our long standing RelationTruncate() data-eating bug: https://www.postgresql.org/message-id/flat/CA%2BhUKG%2B5nfWcpnZ%3DZ%3DUpGvY1tTF%3D4QU_0U_07EFaKmH7Nr%2BNLQ%40mail.gmail.com#aa061db119ee7a4b5390af56e24f475d
Thomas Munro <thomas.munro@gmail.com> writes: > On Thu, Sep 5, 2024 at 3:58 AM Andres Freund <andres@anarazel.de> wrote: >> Obviously we could add a version of GetRelationPath() that just prints into a >> caller provided buffer - but that's somewhat awkward API wise. > > For the record, that's exactly what I did in the patch I proposed to > fix our long standing RelationTruncate() data-eating bug: > > https://www.postgresql.org/message-id/flat/CA%2BhUKG%2B5nfWcpnZ%3DZ%3DUpGvY1tTF%3D4QU_0U_07EFaKmH7Nr%2BNLQ%40mail.gmail.com#aa061db119ee7a4b5390af56e24f475d I want to have a dicussion on the user provided buffer APIs. I just get the similar feedback on [1] because of this recently.. I agree that "user provided buffer" API is bad for the reasons like: a). inconvenient since user need to provide the buffer. b) unsafe because user may provide a incorrect buffer. But it still have some advantages, like c). allocate the memory in a expected MemoryContext rather than CurrentMemoryContext. d). Allocating the memory at the different time rather than executing the API e). API can write the data to the user descired buffer directly rather than another copy after. My user case at [1] is because of (c) and (e), and the user case here looks because of factor (d). Come to the badness of "user provided buffer" API, I think we can ease them by providing both the non-user-buffer API and user-provided-buffer API. Since the later one is safe and convenient, so people probably user the non-user-buffer API by default and just the user who wants the benefits of "provided-buffer" would use that API. Am I miss some important factors on this topic? [1] https://www.postgresql.org/message-id/1882669.1726701697%40sss.pgh.pa.us (I read the above topic [1] now, I just realize I proposed to [change] the API rather than adding an new variant, that's not my intention and that's my fault). -- Best Regards Andy Fan
On Wed, Feb 19, 2025 at 3:35 PM Andres Freund <andres@anarazel.de> wrote: > After thinking about this for an embarassingly long time, I think there's > actually a considerably better answer for a case like this: A function that > returns a fixed-length string by value: > > - Compilers can fairly easily warn about on-stack values that goes out of > scope > > - Because we don't need to free the memory anymore, some code that that > previously needed to explicitly free the memory doesn't need to anymore > (c.f. AbortBufferIO()). > > - The max lenght isn't that long, so it's actually reasonably efficient, > likely commonly cheaper than psprintf. I like it! > This made me realize that WaitReadBuffers() leaks a small bit of memory in the > in zero_damaged_pages path. Hardly the end of the world, but it does show how > annoying the current interface is. Right. It uses relpath() for an error message, but unlike similar examples this one is only a WARNING so it might run an unbounded number of times before the context cleans the memory up. Not new code, just moved around in v17.
Hi, On 2025-02-20 14:00:10 +1300, Thomas Munro wrote: > On Wed, Feb 19, 2025 at 3:35 PM Andres Freund <andres@anarazel.de> wrote: > > After thinking about this for an embarassingly long time, I think there's > > actually a considerably better answer for a case like this: A function that > > returns a fixed-length string by value: > > > > - Compilers can fairly easily warn about on-stack values that goes out of > > scope > > > > - Because we don't need to free the memory anymore, some code that that > > previously needed to explicitly free the memory doesn't need to anymore > > (c.f. AbortBufferIO()). > > > > - The max lenght isn't that long, so it's actually reasonably efficient, > > likely commonly cheaper than psprintf. > > I like it! Does anybody have opinions about whether we should keep a backward compatible interface in place or not? Via https://codesearch.debian.net/ I tried to look for references. Unfortunately I had to exclude "relpath" as there are just too many independent hits, due to the python function of the same name. For relpathperm(), relpathbackend(), GetRelationPath() there looks to be just fincore. There also are two copies of our code, but those we don't need to care about (libpg-query and ruby-pg-query), since they're just going to copy the new code when updating. I also looked for matches including relpath that had "fork" elsewhere in the file, it's hard to see a potential use of relpath() not using fork in the arguments or such. Which makes me think it's not worth having a backward compatible interface? Greetings, Andres Freund
On Thu, Feb 20, 2025 at 12:40:57PM -0500, Andres Freund wrote: > On 2025-02-20 14:00:10 +1300, Thomas Munro wrote: > > On Wed, Feb 19, 2025 at 3:35 PM Andres Freund <andres@anarazel.de> wrote: > > > After thinking about this for an embarassingly long time, I think there's > > > actually a considerably better answer for a case like this: A function that > > > returns a fixed-length string by value: > > > > > > - Compilers can fairly easily warn about on-stack values that goes out of > > > scope > > > > > > - Because we don't need to free the memory anymore, some code that that > > > previously needed to explicitly free the memory doesn't need to anymore > > > (c.f. AbortBufferIO()). > > > > > > - The max lenght isn't that long, so it's actually reasonably efficient, > > > likely commonly cheaper than psprintf. > > > > I like it! Works for me. > Unfortunately I had to exclude "relpath" as there are just too many > independent hits, due to the python function of the same name. For > relpathperm(), relpathbackend(), GetRelationPath() there looks to be just > fincore. PGXN has few hits, and some of these are false positives or otherwise irrelevant: $ grep -re '[^.]\(relpath[a-z]*\|GetRelationPath\)(' | sed 's/-[^:]*/:/'|sort -u db2_fdw::extern char *GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode, jsoncdc:: pub fn GetRelationPath(dbNode: Oid, spcNode: Oid, relNode: Oid, openbarter:: path = relpath(frame.f_code.co_filename, refdir) # relative to refdir pg_bulkload::#define relpath(rnode, forknum) relpath((rnode)) pg_bulkload:: fname = relpath(bknode, MAIN_FORKNUM); pg_bulkload:: fname = relpath(rnode, MAIN_FORKNUM); pg_repack::#define relpath(rnode, forknum) relpath((rnode)) plv8:: def _to_relpath(self, abspath, _): plv8:: def _to_relpath(self, abspath, test_root): plv8:: yield self._to_relpath(abspath, test_root) tblsize_nolock:: relationpath = relpath(*rfn); > Which makes me think it's not worth having a backward compatible interface? Agreed. Even if 100% of those matches had to change, that's below standard level of breakage for a major release.
Andres Freund <andres@anarazel.de> writes: > Does anybody have opinions about whether we should keep a backward compatible > interface in place or not? I vote for "not" --- doesn't seem like there'll be much external code affected, and we make comparably-sized API breaks all the time. As a matter of style, I wonder if it'd be better to have these functions write into a caller-supplied variable. That seems more in keeping with most other places in Postgres, and it would save a copying step in cases where the caller needs the result on the heap. I realize that returning structs has been in C for decades, but that doesn't mean I want some of our APIs doing it one way and some the other. regards, tom lane
Hi, On 2025-02-20 14:11:16 -0500, Tom Lane wrote: > As a matter of style, I wonder if it'd be better to have these > functions write into a caller-supplied variable. I wondered about that too, it does however make some code more awkward, e.g. because there's not a good surrounding block to put the variable in. I mildly prefer the return-by-value approach, but not more. > That seems more in keeping with most other places in Postgres, and it would > save a copying step in cases where the caller needs the result on the heap. I don't think there are many cases where we have to then copy the value to the heap, fwiw, with a more complete patch. It's perhaps worth noting that on most (all?) architectures the *caller* will reserve space for the return value of functions that return structs by value. In general it'll be easier for the compiler to optimize code where it knows the lifetime of the relevant memory, which is harder for a function that gets passed a "target" memory location than when it's returning by value. But of course it's very hard to believe this ever matters for the case at hand. The patch curently uses a hardcoded 6 for the length of MAX_BACKENDS. Does anybody have a good idea for how to either a) derive 6 from MAX_BACKENDS in a way that can be used in a C array size b) Use a static assert to check that it fits? Otherwise I'll just put it in the test that my prior version already added. Not as good as compile-time, but should still make it easy to find the issue, if we ever increase/decrease MAX_BACKENDS. In the attached I 1) changed all the GetRelationPath()/relpath* uses to the new way, the names now are unchanged 2) introduced a PROCNUMBER_CHARS, instead hardcoding 6 in the length 3) renamed RelPathString to RelPathStr and change the path member to str I also added a second patch changing _mdfd_segpath() to do something similar, because it felt somewhat silly to allocate memory after the prior change. Not entirely sure it's worth it. Greetings, Andres Freund
Attachment
On Fri, Feb 21, 2025 at 9:28 AM Andres Freund <andres@anarazel.de> wrote: > The patch curently uses a hardcoded 6 for the length of MAX_BACKENDS. Does > anybody have a good idea for how to either > > a) derive 6 from MAX_BACKENDS in a way that can be used in a C array size Do we even check the binary digits? BTW I see a place in lwlock.c that still talks about 2^23-1, looks like it is out of date. Hmmm, I wonder if it would be better to start by declaring how many bits we want to use, given that is our real constraint. And then: #define PROCNUMBER_BITS 18 #define MAX_BACKENDS ((1 << PROCNUMBER_BITS) - 1) #define PROCNUMBER_CHARS DECIMAL_DIGITS_FOR_BITS(PROCNUMBER_BITS) ... with a little helper ported to preprocessor hell from Hacker's Delight magic[1] for that last bit. See attached. But if that's a bit too nuts... > b) Use a static assert to check that it fits? Right, easy stuff like sizeof(CppString2(MAX_BACKENDS)) - 1 can only work if the token is a decimal number. I suppose you could just use constants: #define MAX_BACKENDS 0x3ffff #define PROCNUMBER_BITS 18 #define PROCNUMBER_CHARS 6 ... and then use the previous magic to statically assert their relationship inside one translation unit, to keep it out of a widely included header. [1] https://lemire.me/blog/2021/06/03/computing-the-number-of-digits-of-an-integer-even-faster/
Attachment
Thomas Munro <thomas.munro@gmail.com> writes: > On Fri, Feb 21, 2025 at 9:28 AM Andres Freund <andres@anarazel.de> wrote: >> The patch curently uses a hardcoded 6 for the length of MAX_BACKENDS. Does >> anybody have a good idea for how to either >> >> a) derive 6 from MAX_BACKENDS in a way that can be used in a C array size This all seems quite over-the-top to me. Just allocate 10 characters, which is certainly enough for the output of a %u format spec, and call it good. (Yeah, I know that's not as much fun, but ...) regards, tom lane
On Fri, Feb 21, 2025 at 6:20 PM Thomas Munro <thomas.munro@gmail.com> wrote: > #define PROCNUMBER_BITS 18 > #define MAX_BACKENDS ((1 << PROCNUMBER_BITS) - 1) > #define PROCNUMBER_CHARS DECIMAL_DIGITS_FOR_BITS(PROCNUMBER_BITS) > > ... with a little helper ported to preprocessor hell from Hacker's > Delight magic[1] for that last bit. See attached. But if that's a > bit too nuts... Erm, wait of course you just need: #define DECIMAL_DIGITS(n) \ ((n) < 10 ? 1 : \ (n) < 100 ? 2 : \ (n) < 1000 ? 3 : \ (n) < 10000 ? 4 : \ (n) < 100000 ? 5 : \ (n) < 1000000 ? 6 : \ (n) < 10000000 ? 7 : \ (n) < 100000000 ? 8 : \ (n) < 1000000000 ? 9 : 10) #define PROCNUMBER_BITS 18 #define MAX_BACKENDS ((1 << PROCNUMBER_BITS) - 1) #define PROCNUMBER_CHARS DECIMAL_DIGITS(MAX_BACKENDS)
Hi, On 2025-02-21 18:20:39 +1300, Thomas Munro wrote: > On Fri, Feb 21, 2025 at 9:28 AM Andres Freund <andres@anarazel.de> wrote: > > The patch curently uses a hardcoded 6 for the length of MAX_BACKENDS. Does > > anybody have a good idea for how to either > > > > a) derive 6 from MAX_BACKENDS in a way that can be used in a C array size > > Do we even check the binary digits? BTW I see a place in lwlock.c > that still talks about 2^23-1, looks like it is out of date. Yea. I actually posted a patch addressing that recently-ish: https://postgr.es/m/7bye3hvkdk7ybt4ofigimr4ahshsj657aqatfc5trd5pbp5qd4%40dir2wsx2lgo6 Patch 0001 in that series would be nice to have here, because it moves MAX_BACKENDS out of postmaster.h. I kind had hoped somebody would comment on whether pg_config_manual.h is a good place for the define. I guess we could also move it to storage/procnumber.h. > Hmmm, I wonder if it would be better to start by declaring how many bits we > want to use, given that is our real constraint. And then: > > #define PROCNUMBER_BITS 18 > #define MAX_BACKENDS ((1 << PROCNUMBER_BITS) - 1) > #define PROCNUMBER_CHARS DECIMAL_DIGITS_FOR_BITS(PROCNUMBER_BITS) > > ... with a little helper ported to preprocessor hell from Hacker's > Delight magic[1] for that last bit. See attached. But if that's a > bit too nuts... This seems a bit too complicated to be worth it. I am inclined to think the approach taken in the patch of just having a regression test verifying that the numbers match is good enough. The one arguments I can see for something like DECIMAL_DIGITS_TABLE is that we could define OIDCHARS the same way. With a quick grep I couldn't immediately find other candidates though. > > b) Use a static assert to check that it fits? > > Right, easy stuff like sizeof(CppString2(MAX_BACKENDS)) - 1 can only > work if the token is a decimal number. I suppose you could just use > constants: > > #define MAX_BACKENDS 0x3ffff > #define PROCNUMBER_BITS 18 > #define PROCNUMBER_CHARS 6 > > ... and then use the previous magic to statically assert their > relationship inside one translation unit, to keep it out of a widely > included header. I think we could also just define MAX_BACKENDS as a decimal number. I don't find 0x3ffff that meaningfully better than 262143 Greetings, Andres Freund
On Sat, Feb 22, 2025 at 5:15 AM Andres Freund <andres@anarazel.de> wrote: > On 2025-02-21 18:20:39 +1300, Thomas Munro wrote: > > On Fri, Feb 21, 2025 at 9:28 AM Andres Freund <andres@anarazel.de> wrote: > > Do we even check the binary digits? BTW I see a place in lwlock.c > > that still talks about 2^23-1, looks like it is out of date. > > Yea. I actually posted a patch addressing that recently-ish: > https://postgr.es/m/7bye3hvkdk7ybt4ofigimr4ahshsj657aqatfc5trd5pbp5qd4%40dir2wsx2lgo6 > > Patch 0001 in that series would be nice to have here, because it moves > MAX_BACKENDS out of postmaster.h. I kind had hoped somebody would comment on > whether pg_config_manual.h is a good place for the define. > > I guess we could also move it to storage/procnumber.h. procnumber.h seems like the right place, at least without a separate discussion of the ramifications of making it configurable, no? (I thought there were ideas about squeezing it down to 16 bits so you could jam two of 'em into an atomic uint32_t for list headers or something like that, off-topic here except to say that it seems to conflict with the idea of making it user-increasable?) > > Hmmm, I wonder if it would be better to start by declaring how many bits we > > want to use, given that is our real constraint. And then: > > > > #define PROCNUMBER_BITS 18 > > #define MAX_BACKENDS ((1 << PROCNUMBER_BITS) - 1) > > #define PROCNUMBER_CHARS DECIMAL_DIGITS_FOR_BITS(PROCNUMBER_BITS) > > > > ... with a little helper ported to preprocessor hell from Hacker's > > Delight magic[1] for that last bit. See attached. But if that's a > > bit too nuts... > > This seems a bit too complicated to be worth it. I am inclined to think the > approach taken in the patch of just having a regression test verifying that > the numbers match is good enough. > > The one arguments I can see for something like DECIMAL_DIGITS_TABLE is that we > could define OIDCHARS the same way. With a quick grep I couldn't immediately > find other candidates though. If anyone ever does consider using something like that, whether to derive values or just to make assertions that they agree inside a translation unit somewhere, FTR here's why I posted the simple version after being temporarily nerdsniped by Hacker's Delight: the bitswizzling algorithm was translated from branch-free code, which is fun but irrelevant here and that property didn't even survive the translation. So you might as well just use the naive definition of DECIMAL_DIGITS(n) I posted immediately after.
On Sat, Feb 22, 2025 at 1:30 PM Thomas Munro <thomas.munro@gmail.com> wrote: > procnumber.h seems like the right place, at least without a separate > discussion of the ramifications of making it configurable, no? (I > thought there were ideas about squeezing it down to 16 bits so you > could jam two of 'em into an atomic uint32_t for list headers or > something like that, off-topic here except to say that it seems to > conflict with the idea of making it user-increasable?) (Please ignore this comment, I'll comment on the other thread instead. Sorry I hadn't seen your patches over there yet: when I started talking about the definition and assertions around MAX_BACKENDS in here, it had jumped out at me independently while trying to answer your question about compile-time log10 stuff, because I also noticed that we sucked even at codifying the log2 constraints.)
Hi, On 2025-02-20 15:28:39 -0500, Andres Freund wrote: > On 2025-02-20 14:11:16 -0500, Tom Lane wrote: > > As a matter of style, I wonder if it'd be better to have these > > functions write into a caller-supplied variable. > > I wondered about that too, it does however make some code more awkward, > e.g. because there's not a good surrounding block to put the variable in. > > I mildly prefer the return-by-value approach, but not more. Since I've not heard anything on that point, I'm currently thinking with going with return-by-value. Attached are two very mildly updated patches. Most of the changes is added commit messages. Greetings, Andres Freund
Attachment
Hi, On 2025-02-24 10:50:21 -0500, Andres Freund wrote: > Since I've not heard anything on that point, I'm currently thinking with going > with return-by-value. Attached are two very mildly updated patches. Most of > the changes is added commit messages. And pushed. Greetings, Andres Freund