Re: Let's make PostgreSQL multi-threaded - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Let's make PostgreSQL multi-threaded
Date
Msg-id 3d8ffaa5-b9a1-9538-9ac3-ffa751449f4b@enterprisedb.com
In response to Re: Let's make PostgreSQL multi-threaded  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers

On 6/8/23 01:37, Thomas Munro wrote:
> On Thu, Jun 8, 2023 at 10:37 AM Jeremy Schneider
> <schneider@ardentperf.com> wrote:
>> On 6/7/23 2:39 PM, Thomas Kellerer wrote:
>>> Tomas Vondra schrieb am 07.06.2023 um 21:20:
>>>> Also, which other projects did this transition? Is there something we
>>>> could learn from them? Were they restricted to much smaller list of
>>>> platforms?
>>>
>>> Not open source, but Oracle was historically multi-threaded on Windows
>>> and multi-process on all other platforms.
>>> I _think_ starting with 19c you can optionally run it multi-threaded on
>>> Linux as well.
>> Looks like it actually became publicly available in 12c. AFAICT Oracle
>> supports both modes today, with a config parameter to switch between them.
> 
> It's old, but this describes the 4 main models and which well known
> RDBMSes use them in section 2.3:
> 
> https://dsf.berkeley.edu/papers/fntdb07-architecture.pdf
> 
> TL;DR DB2 is the winner, it can do process-per-connection,
> thread-per-connection, process-pool or thread-pool.
> 

I think the basic architectures are well known, especially from the
user perspective. I'm more interested in the challenges these projects
faced while moving from one architecture to another, and in how and why
they support more than one.

In [1] Heikki argued that:

    I don't think this is worth it, unless we plan to eventually remove
    the multi-process mode. ... As long as you need to also support
    processes, you need to code to the lowest common denominator and
    don't really get the benefits.

But these projects clearly support multiple architectures, and have no
intention of ditching any of them. So how did they do that? Surely they
think there are benefits.

One option would be to just have separate code paths for processes and
threads, but the effort required to maintain and improve that would be
prohibitive. So the only feasible option seems to be that they managed
to abstract the subsystems enough for the "regular" code to not care
about the model.
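
To make that last point a bit more concrete, here is a minimal sketch
of what "regular code that does not care about the model" could look
like. It is Python rather than C purely for brevity, and every name in
it (ThreadWorker, ProcessWorker, run_session_work, sum_of_squares) is
hypothetical: it says nothing about how DB2 or Oracle actually do this,
and it ignores the hard part, which is shared state. The process
variant assumes a POSIX system with fork().

```python
import os
import pickle
import threading


class ThreadWorker:
    """Hypothetical worker that runs a task in a thread of this process."""

    def __init__(self, func, *args):
        self._result = None
        self._thread = threading.Thread(target=self._run, args=(func, args))
        self._thread.start()

    def _run(self, func, args):
        self._result = func(*args)

    def join(self):
        self._thread.join()
        return self._result


class ProcessWorker:
    """Hypothetical worker that runs a task in a forked child (POSIX only)."""

    def __init__(self, func, *args):
        self._read_fd, write_fd = os.pipe()
        self._pid = os.fork()
        if self._pid == 0:
            # Child: compute, send the pickled result to the parent, exit
            # without returning into the caller's code.
            status = 1
            try:
                os.close(self._read_fd)
                with os.fdopen(write_fd, "wb") as out:
                    pickle.dump(func(*args), out)
                status = 0
            finally:
                os._exit(status)
        # Parent: keep only the read end of the pipe.
        os.close(write_fd)

    def join(self):
        with os.fdopen(self._read_fd, "rb") as src:
            result = pickle.load(src)
        os.waitpid(self._pid, 0)
        return result


def sum_of_squares(n):
    """Stand-in for real per-session work."""
    return sum(i * i for i in range(n))


def run_session_work(worker_cls, inputs):
    """The "regular" code: it only sees the start/join interface."""
    workers = [worker_cls(sum_of_squares, n) for n in inputs]
    return [w.join() for w in workers]
```

Calling run_session_work(ThreadWorker, [3, 4]) and
run_session_work(ProcessWorker, [3, 4]) goes through exactly the same
caller code. The sketch only works because the task returns its result
explicitly; anything that relies on globals behaves differently in the
two models, and that is presumably where most of the abstraction effort
would go.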


[1]
https://www.postgresql.org/message-id/6e3082dc-ff29-9cbf-847e-5f570828b46b@iki.fi

> I understand this thread to be about thread-per-connection (= backend,
> session, socket) for now.

Maybe, although people also proposed switching parallel query to
threads (so that'd be multiple threads per session). But I don't think
it really matters; the concerns are mostly about moving from one
architecture to another and/or supporting both.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


