Re: Let's make PostgreSQL multi-threaded - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Let's make PostgreSQL multi-threaded
Date
Msg-id CA+TgmoaWMgRf4xv8N7_1XqZyT-KWeD=n0mN2fm4i_Mwmg6oepg@mail.gmail.com
In response to Re: Let's make PostgreSQL multi-threaded  (Heikki Linnakangas <hlinnaka@iki.fi>)
List pgsql-hackers
On Tue, Jun 6, 2023 at 11:46 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> Bruce was worried about the loss of isolation that the separate address
> spaces give, and Jeremy shared an anecdote on that. That is an
> objection to the idea itself, i.e. even if the transition was smooth,
> bug-free and effortless, that point remains. I personally think the
> isolation we get from separate address spaces is overrated. Yes, it
> gives you some protection, but given how much shared memory there is,
> the blast radius is large even with separate backend processes.

An interesting idea might be to look at the places where we ereport or
elog FATAL due to some kind of backend data structure corruption and
ask whether there would be an argument for elevating the level to
PANIC if we changed this. There are definitely some places where we
argue that the only corrupted state is backend-local and thus we don't
need to PANIC if it's corrupted. I wonder to what extent this change
would undermine that argument.

Even if it does, I think it's worth it. Corrupted backend-local data
structures aren't that common, thankfully.
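
To make the FATAL-versus-PANIC question concrete, here is a minimal sketch in C. It is not PostgreSQL code: the enums and the function corruption_severity() are hypothetical names invented for illustration, and the policy it encodes (escalate backend-local corruption to PANIC once backends share an address space) is only the most conservative possible answer, not a settled conclusion.

```c
#include <stdbool.h>

/* Hypothetical severity levels, loosely modeled on elog's FATAL and PANIC. */
typedef enum
{
    SEV_FATAL,  /* terminate only the affected backend */
    SEV_PANIC   /* take down the whole server and run crash recovery */
} Severity;

/* Hypothetical classification of where the corrupted state lives. */
typedef enum
{
    SCOPE_BACKEND_LOCAL,    /* private to one backend */
    SCOPE_SHARED            /* visible to every backend */
} CorruptionScope;

/*
 * Assumed policy, for illustration only.  With one process per backend,
 * backend-local corruption is contained by the address-space boundary, so
 * FATAL is enough.  With threads, "local" memory is reachable from every
 * other thread, so the sketch conservatively escalates to PANIC.
 */
static Severity
corruption_severity(CorruptionScope scope, bool threaded)
{
    if (scope == SCOPE_SHARED)
        return SEV_PANIC;
    return threaded ? SEV_PANIC : SEV_FATAL;
}
```

Under this assumed policy, the only call sites whose behavior changes are the ones that currently justify FATAL by pointing out that the damaged state is backend-local.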

> I don't think this is worth it, unless we plan to eventually remove the
> multi-process mode. We could e.g. make lock table expandable in threaded
> mode, and fixed-size in process mode, but the big gains would come from
> being able to share things between threads and have variable-length
> shared data structures more easily. As long as you need to also support
> processes, you need to code to the lowest common denominator and don't
> really get the benefits.
>
> I don't know how long a transition period we need. Maybe 1 release, maybe 5.

I think 1 release is wildly optimistic. Even if someone wrote a patch
for this and got it committed this release cycle, it's likely that
there would be follow-up commits needed over a period of several years
before it really worked as well as we'd like. Only after that could we
consider deprecating the per-process way. But I don't think that's
necessarily a huge problem. I originally intended DSM as an optional
feature: if you didn't have it, then you couldn't use features that
depended on it, but the rest of the system still worked. Eventually,
other people liked it enough that we decided to introduce hard
dependencies on it. I think that's a good model for a change like
this. When the inventor of a new system thinks that we should have a
hard dependency on it, MEH. When there's a groundswell of other,
unaffiliated hackers making that argument, COOL.

I'm also not quite convinced that there's no long-term use case for
multi-process mode. Maybe you're right and there isn't, but that
amounts to arguing that every extension in the world will be happy to
run in a multi-threaded world. I don't know if I quite
believe that. It also amounts to arguing that performance is going to
be better for everyone in this new multi-threaded mode, and that it
won't cause unforeseen problems for any significant number of users,
and maybe those things are true, but I think we need to get this new
system in place and get some real-world experience before we can judge
these kinds of things. I agree that, in theory, it would be nice to
get to a place where the multi-process mode is a dinosaur and we
can just rip it out ... but I don't share your confidence that we can
get there in any short time period.

--
Robert Haas
EDB: http://www.enterprisedb.com


