Re: Let's make PostgreSQL multi-threaded - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: Let's make PostgreSQL multi-threaded
Msg-id CAFiTN-vJqo4TSBpkQTJqhYz6CL0M=cPhQZUXnop1uDC47s2hBg@mail.gmail.com
In response to Re: Let's make PostgreSQL multi-threaded  (Hannu Krosing <hannuk@google.com>)
List pgsql-hackers
On Sat, Jun 10, 2023 at 11:32 PM Hannu Krosing <hannuk@google.com> wrote:
>
> On Mon, Jun 5, 2023 at 4:52 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> >
> > If there are no major objections, I'm going to update the developer FAQ,
> > removing the excuses there for why we don't use threads [1].
>
> I think it is not wise to start the wholesale removal of the objections there.
>
> But I think it is worthwhile to revisit the section about threads and
> maybe split out the historic part which is no longer true, and provide
> both pros and cons for these.
>
> I started with this short summary from the discussion in this thread,
> feel free to expand, argue, fix :)
> * is current excuse
> -- is counterargument or ack
> ----------------
> As an example, threads are not yet used instead of multiple processes
> for backends because:
> * Historically, threads were poorly supported and buggy.
> -- yes they were, not relevant now when threads are well-supported and non-buggy
>
> * An error in one backend can corrupt other backends if they're
> threads within a single process
> -- still valid for silent corruption
> -- for detected crash - yes, but we are restarting all backends in
> case of crash anyway.
>
> * Speed improvements using threads are small compared to the remaining
> backend startup time.
> -- we now have some measurements that show significant performance
> improvements not related to startup time
>
> * The backend code would be more complex.
> -- this is still the case
> -- even more worrisome is that all extensions also need to be rewritten
> -- and many incompatibilities will be silent and take potentially years to find
>
> * Terminating backend processes allows the OS to cleanly and quickly
> free all resources, protecting against memory and file descriptor
> leaks and making backend shutdown cheaper and faster
> -- still true
>
> * Debugging threaded programs is much harder than debugging worker
> processes, and core dumps are much less useful
> -- this was countered by claiming that
>   -- by now we have reasonable debugger support for threads
>   -- there is no direct debugger support for debugging the exact
> system set up like PostgreSQL processes + shared memory
>
> * Sharing of read-only executable mappings and the use of
> shared_buffers means processes, like threads, are very memory
> efficient
> -- this seems to say that the current process model is as good as threads?
> -- there were a few counterarguments
>   -- per-backend virtual memory mapping can add up to a significant
> amount of extra RAM usage
>   -- the discussion did not yet touch various per-backend caches
> (pg_catalog cache, statement cache) which are arguably easier to
> implement in a threaded model
>   -- TLB reload at each process switch is expensive and would be
> mostly avoided in case of threads

I think it is worth mentioning that the parallel worker infrastructure,
e.g. for 'parallel query' and 'parallel vacuum', would be simplified in
a threaded model.
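
To illustrate the point, here is a minimal sketch (not PostgreSQL code;
parallel_sum, sum_worker, WorkerArg and NWORKERS are hypothetical names
made up for this example). Today's parallel workers are separate
processes, so the leader has to place shared state in a dynamic shared
memory segment. With threads, workers could simply be handed a pointer
to ordinary memory, roughly like this:

```c
/* Hypothetical sketch, not PostgreSQL code: worker threads in one process
 * read a caller-owned array by pointer, with no shared-memory segment. */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NWORKERS 4              /* assumed fixed worker count for the sketch */

typedef struct WorkerArg
{
    const int  *data;           /* ordinary memory, visible to every thread */
    size_t      start;          /* first index this worker handles */
    size_t      end;            /* one past the last index */
    long        partial;        /* result slot written only by this worker */
} WorkerArg;

static void *
sum_worker(void *p)
{
    WorkerArg  *arg = (WorkerArg *) p;
    long        sum = 0;

    for (size_t i = arg->start; i < arg->end; i++)
        sum += arg->data[i];
    arg->partial = sum;
    return NULL;
}

/* Split the array into NWORKERS chunks and sum them in parallel. */
long
parallel_sum(const int *data, size_t n)
{
    pthread_t   threads[NWORKERS];
    WorkerArg   args[NWORKERS];
    size_t      chunk = (n + NWORKERS - 1) / NWORKERS;
    long        total = 0;

    for (int w = 0; w < NWORKERS; w++)
    {
        size_t      start = (size_t) w * chunk;
        size_t      end = start + chunk;
        int         rc;

        if (start > n)
            start = n;
        if (end > n)
            end = n;
        args[w] = (WorkerArg) {data, start, end, 0};
        rc = pthread_create(&threads[w], NULL, sum_worker, &args[w]);
        assert(rc == 0);
        (void) rc;              /* silence unused warning under NDEBUG */
    }

    /* Joining a worker makes its partial result safe to read. */
    for (int w = 0; w < NWORKERS; w++)
    {
        pthread_join(threads[w], NULL);
        total += args[w].partial;
    }
    return total;
}
```

Each worker writes only its own slot, so the sketch needs no locking; a
real executor would of course need far more than this, including error
handling and cancellation.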

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


