Re: pgsql: Teach DSM registry to ERROR if attaching to an uninitialized ent - Mailing list pgsql-hackers

From Robert Haas
Subject Re: pgsql: Teach DSM registry to ERROR if attaching to an uninitialized ent
Msg-id CA+Tgmob6knHwEbQv11n_Yy0GyFzPgjhC33LroWso0prt7hCksg@mail.gmail.com
In response to Re: pgsql: Teach DSM registry to ERROR if attaching to an uninitialized ent  (Nathan Bossart <nathandbossart@gmail.com>)
Responses Re: pgsql: Teach DSM registry to ERROR if attaching to an uninitialized ent
List pgsql-hackers
On Mon, Nov 24, 2025 at 4:25 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
> On Mon, Nov 24, 2025 at 04:02:18PM -0500, Robert Haas wrote:
> > I haven't done a full review of this and I'm not sure whether you want
> > me to spend more time on it...
>
> I'd appreciate an eyeball check if you have the time.

OK. I am not too familiar with the current code but I will have a look.

> > I'm guessing that the reason why that doesn't easily work is
> > because you're relying on those locks to prevent multiple backends
> > from doing the same initialization?
>
> For GetNamedDSMSegment(), I bet we could avoid needing a PG_CATCH by taking
> the DSMRegistryLock exclusively when accessing the registry.  But for
> GetNamedDSA() and GetNamedDSHash(), we want to keep the half-initialized
> entry so that we don't leak LWLock tranche IDs.  I initially thought that
> might be okay, but if every backend is retrying, you could quickly run out
> of tranche IDs.

I think that leaking tranche IDs is definitely a bad plan, even if it
didn't cause us to run out. Ideally, I think the design goal should be
to set up the structure of things so that we never need to do anything
beyond error cleanup if we fail. If we can't achieve that, well then
fine, but that is what I would view as most desirable. For instance, I
would be inclined to set up GetNamedDSA like this:

1. Find or create the dshash entry. If that fails, we haven't
done anything yet, so no problem.

2. Next, if the dshash entry doesn't have an assigned tranche ID
yet, then try to assign one. This probably won't fail. But if it does,
it will produce a meaningful error, and every future attempt will
likely produce the same, still-meaningful error. I think this is a
significant improvement over having every attempt but the first return
"requested DSA \"%s\" failed initialization," which hides the real
cause of failure.

3. Next, if the segment doesn't yet have a DSA, then try to create
one. This could fail due to lack of memory; if so, future attempts
might succeed, or might continue to fail with out-of-memory errors.
Assuming that creating the DSA is successful, pin the DSA and store
the handle into the dshash entry. On the other hand, if we don't
create a DSA here because one already exists, try to attach to it.
That could also fail, but if it does, we can still retry later.

4. Pin the mapping for the DSA that we created or attached to in the
previous step.

5. dshash_release_lock.
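
The five steps above might be sketched in C-style pseudocode roughly as
follows. This is only an illustration of the proposed control flow, not
the registry's actual code: the entry layout and field names are
invented, error handling is elided, and the exact dsa/dshash/LWLock API
signatures vary by server version.

/* Hypothetical entry layout; field names are assumptions. */
typedef struct NamedDSAEntry
{
    char        name[NAMEDATALEN];
    int         tranche_id;     /* 0 = not yet assigned */
    dsa_handle  handle;         /* DSA_HANDLE_INVALID = not yet created */
} NamedDSAEntry;

dsa_area *
GetNamedDSA_sketch(dshash_table *registry, const char *name)
{
    bool        found;
    NamedDSAEntry *entry;
    dsa_area   *dsa;

    /* 1. Find or create the entry; failure here leaves nothing behind. */
    entry = dshash_find_or_insert(registry, name, &found);

    /*
     * 2. Assign a tranche ID if the entry lacks one.  If this errors out,
     * later attempts should fail with the same meaningful message.
     */
    if (entry->tranche_id == 0)
        entry->tranche_id = LWLockNewTrancheId();

    /*
     * 3. Create the DSA if it doesn't exist yet, else attach to it.
     * Either can fail (e.g. out of memory), but a later retry can still
     * make progress because nothing is left half-initialized.
     */
    if (entry->handle == DSA_HANDLE_INVALID)
    {
        dsa = dsa_create(entry->tranche_id);
        dsa_pin(dsa);
        entry->handle = dsa_get_handle(dsa);
    }
    else
        dsa = dsa_attach(entry->handle);

    /* 4. Keep the mapping for the life of this backend. */
    dsa_pin_mapping(dsa);

    /* 5. Release the partition lock on the dshash entry. */
    dshash_release_lock(registry, entry);

    return dsa;
}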

In other words, I'd try to create a system that always attempts to
make forward progress, but when that's not possible, fails without
permanently leaking any resources or spoiling anything for the next
backend that wants to make an attempt. One worry that we've mentioned
previously is that this might lead to every backend retrying. But as
the code is currently written, that basically happens anyway, except that
all attempts but the first produce a generic, uninformative error. Now, if we
do what I'm proposing here, we'll repeatedly try to create DSM, DSA,
and DSH objects in a way that actually allocates memory, so the
retries get a little bit more expensive. At the moment, I'm inclined
to believe this doesn't really matter. If it's true that failures are
very rare, and I suspect it is, then it doesn't really matter if the
cost of retries goes up a bit. If hypothetically they are common
enough to matter, then we definitely need a mechanism that can't get
permanently stuck in a half-initialized state. If we were to find that
retrying too many times too quickly creates other problems, my
suggested response would be to add a timestamp to the dshash entry and
limit retries to once per minute or something. However, in the absence
of evidence that we need such a mechanism, I'd be inclined to guess
that we don't.
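
The throttling idea is simple to model. Here is a minimal self-contained
sketch; the struct, function name, and the idea of keeping the timestamp
alongside the entry are all illustrative assumptions, not the registry's
actual layout:

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical: last_attempt would live in the dshash entry. */
typedef struct RetryState
{
    time_t      last_attempt;   /* 0 = never attempted */
} RetryState;

/*
 * Return true if enough time has passed to attempt initialization again,
 * recording the attempt time; otherwise return false so the caller can
 * re-report the previous failure without redoing the expensive work.
 */
static bool
retry_allowed(RetryState *state, time_t now, int min_interval_secs)
{
    if (state->last_attempt != 0 &&
        now - state->last_attempt < min_interval_secs)
        return false;           /* too soon; skip the retry */
    state->last_attempt = now;
    return true;
}
```

With a 60-second interval, the first attempt always proceeds, an attempt
30 seconds later is skipped, and one 61 seconds after the recorded
attempt proceeds again.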

Thoughts?

--
Robert Haas
EDB: http://www.enterprisedb.com


