Re: Let's make PostgreSQL multi-threaded - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Let's make PostgreSQL multi-threaded
Date
Msg-id 3185fb3b-bbff-4b3e-78c1-3fb9befe6ef8@iki.fi
In response to Re: Let's make PostgreSQL multi-threaded  ("Tristan Partin" <tristan@neon.tech>)
Responses Re: Let's make PostgreSQL multi-threaded
List pgsql-hackers
On 05/06/2023 11:28, Tristan Partin wrote:
> On Mon Jun 5, 2023 at 9:51 AM CDT, Heikki Linnakangas wrote:
>> # Extensions
>>
>> A lot of extensions also contain global variables or other things that
>> break in a multi-threaded environment. We need a way to label extensions
>> that support multi-threading. And in the future, also extensions that
>> *require* a multi-threaded server.
>>
>> Let's add flags to the control file to mark if the extension is
>> thread-safe and/or process-safe. If you try to load an extension that's
>> not compatible with the server's mode, throw an error.
>>
>> We might need new functions in addition to _PG_init, called at connection
>> startup and shutdown. And the background worker API probably needs some changes.
> 
> It would be a good idea to start exposing a variable through pkg-config
> to tell whether the backend is multi-threaded or multi-process.

I think we need to support both modes without having to recompile the 
server or the extensions. So it needs to be a runtime check.
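
To make the runtime check concrete, here is a rough sketch, written in
Python only for brevity (the real check would live in the server's C code).
The control-file keys thread_safe and process_safe, the defaults, and all
function names are hypothetical; nothing like this exists today.

```python
from enum import Enum


class ServerMode(Enum):
    MULTI_PROCESS = "process"
    MULTI_THREADED = "thread"


def parse_control_flags(control_text):
    """Read the hypothetical thread_safe / process_safe flags from
    control-file text made of "key = value" lines."""
    # Assumed defaults: existing extensions were written for the process
    # model, so treat an unlabeled extension as process-safe only.
    flags = {"thread_safe": False, "process_safe": True}
    for line in control_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key in flags:
            flags[key] = value.strip("'\"").lower() in ("true", "on", "yes", "1")
    return flags


def check_extension_compatible(name, control_text, mode):
    """Raise an error if the extension is not labeled for the server's mode.

    The mode is a runtime value, so the same extension binary can be
    checked against either kind of server without recompiling.
    """
    flags = parse_control_flags(control_text)
    needed = "thread_safe" if mode is ServerMode.MULTI_THREADED else "process_safe"
    if not flags[needed]:
        raise RuntimeError(
            f'extension "{name}" is not marked {needed}; '
            f"cannot load it in {mode.value} mode"
        )
```

With these assumed defaults, an unlabeled extension keeps loading in
multi-process mode and is rejected in multi-threaded mode until its author
opts in.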

>> # Exposed PIDs
>>
>> We expose backend process PIDs to users in a few places.
>> pg_stat_activity.pid and pg_terminate_backend(), for example. They need
>> to be replaced, or we can assign a fake PID to each connection when
>> running in multi-threaded mode.
> 
> Would it be possible to just transparently slot in the thread ID
> instead?

Perhaps. It might break applications that use the PID directly with e.g. 
'kill <PID>', though.
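
For comparison, the fake-PID alternative could look roughly like the sketch
below (Python for brevity; the class and method names are made up for
illustration). Each connection gets a synthetic ID, and the analogue of
pg_terminate_backend() signals the serving thread through that ID instead
of sending an OS signal.

```python
import itertools
import threading


class FakePidRegistry:
    """Hands out synthetic backend IDs for connections served by threads."""

    def __init__(self, first_id=1):
        self._next_id = itertools.count(first_id)
        self._lock = threading.Lock()
        self._terminate_events = {}  # fake PID -> threading.Event

    def register(self):
        """Called at connection startup; returns the connection's fake PID."""
        with self._lock:
            fake_pid = next(self._next_id)
            self._terminate_events[fake_pid] = threading.Event()
        return fake_pid

    def unregister(self, fake_pid):
        """Called at connection shutdown."""
        with self._lock:
            self._terminate_events.pop(fake_pid, None)

    def terminate_backend(self, fake_pid):
        """Request termination; returns True if the ID was live."""
        with self._lock:
            event = self._terminate_events.get(fake_pid)
        if event is None:
            return False
        event.set()
        return True

    def should_terminate(self, fake_pid):
        """Polled by the serving thread at safe points."""
        with self._lock:
            event = self._terminate_events.get(fake_pid)
        return event is not None and event.is_set()
```

This sketch never reuses an ID, which sidesteps the question of a stale ID
hitting a new connection. It does nothing for the 'kill <PID>' case: a fake
PID means nothing to the operating system, so such scripts would break
either way.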

>> The Python interpreter has a Global Interpreter Lock. It's not possible
>> to create two completely independent Python interpreters in the same
>> process; there will be some lock contention on the GIL. Fortunately, the
>> Python community just accepted https://peps.python.org/pep-0684/. That's
>> exactly what we need: it makes it possible for separate interpreters to
>> have their own GILs. It's not clear to me if that's in Python 3.12
>> already, or under development for some future version, but by the time
>> we make the switch in Postgres, there probably will be a solution in
>> CPython.
> 
> 3.12 is the currently in-development version of Python. 3.12 is planned
> for release in October of this year.
> 
> A workaround that some projects use is to run multiple Python
> interpreters[0], though it seems uncommon. That might be worth noting,
> depending on the minimum version of Python that Postgres aims to support
> (I'm not sure what the policy is).
> 
> The C-API of Python also provides mechanisms for releasing the GIL. I am
> not familiar with how Postgres uses Python, but I have seen huge
> improvements to performance with well-placed GIL releases in
> multi-threaded contexts. Surely this API would just become a no-op after
> the PEP is implemented.
> 
> [0]: https://peps.python.org/pep-0684/#existing-use-of-multiple-interpreters

Oh, cool. I'm inclined to jump straight to PEP 684 and require Python
3.12 in multi-threaded mode, though, or just accept that it's slow. But
let's see what the state of the world is when we get there.
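
The two options could be expressed as a small policy decision, sketched
below. This assumes that PEP 684 ships in Python 3.12, as discussed above;
the function name, the "strict" switch and the returned labels are invented
for illustration and are not an existing PL/Python setting.

```python
import sys

# Assumption: per-interpreter GILs (PEP 684) are available from 3.12 on.
PER_INTERPRETER_GIL_MIN = (3, 12)


def plpython_gil_policy(multi_threaded, strict=False, version=None):
    """Decide how a PL/Python-like language would run in a given server mode.

    strict=True  -> require a Python with per-interpreter GILs
    strict=False -> fall back to one shared GIL and accept the slowdown
    """
    if version is None:
        version = sys.version_info[:2]
    if not multi_threaded:
        # One interpreter per backend process: no GIL sharing between backends.
        return "process-per-backend"
    if tuple(version) >= PER_INTERPRETER_GIL_MIN:
        return "per-interpreter-gil"
    if strict:
        raise RuntimeError(
            "multi-threaded mode requires Python %d.%d or later"
            % PER_INTERPRETER_GIL_MIN
        )
    return "shared-gil"
```

Either way this is a check made when the language is loaded, not at compile
time, which fits the runtime-check approach for extensions.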

-- 
Heikki Linnakangas
Neon (https://neon.tech)



