Re: Is RecoveryConflictInterrupt() entirely safe in a signal handler? - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: Is RecoveryConflictInterrupt() entirely safe in a signal handler?
Date
Msg-id CA+hUKGLNOFKfAG5ETuOEEzukgmOFGsiunfL2AfZmKhWx8EJE3g@mail.gmail.com
Whole thread Raw
In response to Re: Is RecoveryConflictInterrupt() entirely safe in a signal handler?  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: Is RecoveryConflictInterrupt() entirely safe in a signal handler?
List pgsql-hackers
On Mon, Jan 2, 2023 at 8:38 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> On Sat, Dec 31, 2022 at 6:36 PM Noah Misch <noah@leadboat.com> wrote:
> > On Sat, Dec 31, 2022 at 10:06:53AM +1300, Thomas Munro wrote:
> > > Idea #8 is a realisation that twisting oneself into a pretzel to avoid
> > > having to change the regexp code or its REG_CANCEL control flow may be
> > > a bit silly.  If the only thing it really needs to do is free some
> > > memory, maybe the regexp module should provide a function that frees
> > > everything that is safe to call from our rcancelrequested callback, so
> > > we can do so before we longjmp back to Kansas.  Then the REG_CANCEL
> > > code paths would be effectively unreachable in PostgreSQL.  I don't
> > > know if it's better or worse than your idea #6, "use palloc instead,
> > > it already has garbage collection, duh", but it's a different take on
> > > the same realisation that this is just about free().
> >
> > PG_TRY() isn't free, so it's nice that (6) doesn't add one.  If (6) fails in
> > some not-yet-revealed way, (8) could get more relevant.
> >
> > > I guess idea #6 must be pretty easy to try: just point that MALLOC()
> > > macro to palloc(), and do a plain old CFI() in rcancelrequested().
>
> It's not quite so easy: in RE_compile_and_cache we construct objects
> with arbitrary cache-managed lifetime, which suggests we need a cache
> memory context, but we could also fail mid construction, which
> suggests we'd need a dedicated per-regex object memory context that is
> made permanent with the MemoryContextSetParent() trick (as we see
> elsewhere for cached things that are constructed by code that might
> throw), ...

Here's an experiment-grade attempt at idea #6 done that way, for
discussion.  You can see how much memory is wasted by each regex_t,
which I guess is probably on the order of a couple of hundred KB if
you use all 32 regex cache slots using ALLOCSET_SMALL_SIZES as I did
here:

postgres=# select 'x' ~ 'hello world .*';
-[ RECORD 1 ]
?column? | f

postgres=# select * from pg_backend_memory_contexts where name =
'RegexpMemoryContext';
-[ RECORD 1 ]-+-------------------------
name          | RegexpMemoryContext
ident         | hello world .*
parent        | RegexpCacheMemoryContext
level         | 2
total_bytes   | 13376
total_nblocks | 5
free_bytes    | 5144
free_chunks   | 8
used_bytes    | 8232

There's some more memory allocated in regc_pg_locale.c with raw
malloc() that could probably benefit from a pallocisation just to be
able to measure it, but I didn't touch that here.

The recovery conflict patch itself is unchanged, except that I removed
the claim in the commit message that this would be back-patched.  It's
pretty clear that this would need to spend a decent amount of time on
master only.

Attachment

pgsql-hackers by date:

Previous
From: David Rowley
Date:
Subject: Re: [PATCH] Improve ability to display optimizer analysis using OPTIMIZER_DEBUG
Next
From: Andres Freund
Date:
Subject: Re: pgsql: Delay commit status checks until freezing executes.