Re: Let's make PostgreSQL multi-threaded - Mailing list pgsql-hackers

From Hannu Krosing
Subject Re: Let's make PostgreSQL multi-threaded
Msg-id CAMT0RQR+NLJw6hmN88D7RPe=vUEr7DB52dKmS9tWPO_eX2Np0g@mail.gmail.com
In response to Let's make PostgreSQL multi-threaded  (Heikki Linnakangas <hlinnaka@iki.fi>)
List pgsql-hackers
On Mon, Jun 5, 2023 at 4:52 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>
> If there are no major objections, I'm going to update the developer FAQ,
> removing the excuses there for why we don't use threads [1].

I think it is not wise to start the wholesale removal of the objections there.

But I think it is worthwhile to revisit the section about threads,
maybe split out the historical part that is no longer true, and provide
both pros and cons for each remaining point.

I started with this short summary from the discussion in this thread,
feel free to expand, argue, fix :)
* is a current excuse
-- is a counterargument or ack
----------------
As an example, threads are not yet used instead of multiple processes
for backends because:
* Historically, threads were poorly supported and buggy.
-- yes they were, but that is not relevant now that threads are
well supported and no longer buggy

* An error in one backend can corrupt other backends if they're
threads within a single process
-- still valid for silent corruption
-- for a detected crash: yes, but we restart all backends after a
crash anyway.

* Speed improvements using threads are small compared to the remaining
backend startup time.
-- we now have some measurements that show significant performance
improvements not related to startup time

* The backend code would be more complex.
-- this is still the case
-- even more worrisome is that all extensions would also need to be rewritten
-- and many incompatibilities will be silent and could take years to find

* Terminating backend processes allows the OS to cleanly and quickly
free all resources, protecting against memory and file descriptor
leaks and making backend shutdown cheaper and faster
-- still true

* Debugging threaded programs is much harder than debugging worker
processes, and core dumps are much less useful
-- this was countered by claiming that
  -- by now we have reasonable debugger support for threads
  -- there is no direct debugger support for debugging a setup exactly
like PostgreSQL's processes + shared memory

* Sharing of read-only executable mappings and the use of
shared_buffers means processes, like threads, are very memory
efficient
-- this seems to say that the current process model is as good as threads?
-- there were a few counterarguments
  -- per-backend virtual memory mappings can add up to a significant
amount of extra RAM usage
  -- the discussion did not yet touch on the various per-backend caches
(pg_catalog cache, statement cache), which are arguably easier to
implement in a threaded model
  -- a TLB reload at each process switch is expensive and would be
mostly avoided with threads

* Regular creation and destruction of processes helps protect against
memory fragmentation, which can be hard to manage in long-running
processes
-- probably still true
-------------------------------------
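
To make the isolation point above concrete (an error in one backend
corrupting others when they are threads), here is a minimal sketch in
Python. It is not PostgreSQL code and does not model shared memory; the
names scribble, run_in_thread and run_in_process are made up for the
illustration, and os.fork() assumes a POSIX system.

```python
import os
import threading


def scribble(state):
    """Stand-in for a buggy backend that overwrites memory it does not own."""
    state["value"] = "corrupted"


def run_in_thread(state):
    # A thread shares the caller's address space, so the stray write
    # is visible to the caller afterwards.
    worker = threading.Thread(target=scribble, args=(state,))
    worker.start()
    worker.join()
    return state["value"]


def run_in_process(state):
    # A forked child works on its own copy-on-write copy of memory,
    # so the stray write stays private to the child.
    pid = os.fork()  # POSIX only
    if pid == 0:
        scribble(state)
        os._exit(0)  # leave the child without running parent cleanup
    os.waitpid(pid, 0)
    return state["value"]


if __name__ == "__main__":
    print("after thread: ", run_in_thread({"value": "intact"}))
    print("after process:", run_in_process({"value": "intact"}))
```

In the threaded case the caller sees the overwritten value; in the
forked case it does not. Anything placed in explicitly shared memory
(as shared_buffers is) would of course be exposed in both models.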


