Thread: libpq and multi-threading

libpq and multi-threading

From
"Michael J. Baars"
Date:
Hi All,

I have a question about libpq and multi-threading.

In the PostgreSQL documentation (https://www.postgresql.org/docs/15/libpq-threading.html) it says that results can be passed around freely between threads. However, when I try to read the result from the parent thread, the program crashes with a segmentation fault.

I have already tried to set the PostgreSQL 'dynamic_shared_memory_type' configuration option to 'mmap', but this does not help.

Am I doing something wrong? How can I make libpq use mmap to allocate memory that can be read from the parent thread?

Best regards,
Mischa Baars.




Re: libpq and multi-threading

From
Laurenz Albe
Date:
On Tue, 2023-05-02 at 11:38 +0200, Michael J. Baars wrote:
> I have a question about libpq and multi-threading.
>
> In the PostgreSQL documentation (https://www.postgresql.org/docs/15/libpq-threading.html)
> it says that results can be passed around freely between threads. However, when I try to read
> the result from the parent thread, the program crashes with a segmentation fault.

That's too little information.

Yours,
Laurenz Albe



Re: libpq and multi-threading

From
"Michael J. Baars"
Date:
Hello Laurenz,

I don't think it is, but let me shed some more light on it.

After playing around a little with threads and memory, I now know that the PGresult is not read-only, it is read-once. The child can only read the portion of parent memory that was written before the thread started. Read-only is not strong enough.

Let me correct my first mail. Making libpq use mmap is not good enough either. Shared memory allocated by the child cannot be accessed by the parent. I remembered right after pushing the send button. Shared memory needed by the child therefore has to be allocated through the parent.

In conclusion, I have found no way to pass the PGresult around other than by copying it to shared memory. Rather disappointing. One store too many if you ask me. But passing PGresults around freely between threads, because they are supposedly read-only, is not a finding that I was able to reproduce from here.


Re: libpq and multi-threading

From
"David G. Johnston"
Date:
On Tue, May 2, 2023 at 2:38 AM Michael J. Baars <mjbaars1977.pgsql.hackers@gmail.com> wrote:
> I have already tried to set the PostgreSQL 'dynamic_shared_memory_type' configuration option to 'mmap', but this does not help.


Of course it doesn't, that is a server-side configuration.

"Specifies the dynamic shared memory implementation that the server should use." 


David J.

Re: libpq and multi-threading

From
"Michael J. Baars"
Date:
Hi David,

My mistake. Too much fiddling around, but better than no fiddling around. It appears both sides make mistakes, or does your freely passing around work better than mine?


Re: libpq and multi-threading

From
"Peter J. Holzer"
Date:
On 2023-05-02 17:43:06 +0200, Michael J. Baars wrote:
> I don't think it is, but let me shed some more light on it.

One possibly quite important piece of information you haven't told us
yet is which OS you use.

Or how you create the threads, how you pass the results around, what
else you are possibly doing between getting the result and trying to use
it ...

A short self-contained test case might shed some light on this.


> After playing around a little with threads and memory, I now know that the
> PGresult is not read-only, it is read-once. The child can only read that
> portion of parent memory, that was written before the thread started. Read-only
> is not strong enough.
>
> Let me correct my first mail. Making libpq use mmap is not good enough either.
> Shared memory allocated by the child can not be accessed by the parent.

Are you sure you are talking about threads and not processes? In the OSs
I am familiar with, threads (of the same process) share a common address
space. You don't need explicit shared memory and there is no such thing
as "parent memory" (there is thread-local storage, but that's more a
compiler/library construct).
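
To illustrate the shared address space described above, here is a minimal sketch, assuming POSIX threads (for example on Linux, built with `-pthread`). The names `worker` and `fetch_from_thread` are hypothetical and do not come from libpq or from this thread. The child thread allocates a buffer with malloc and the calling thread reads it after pthread_join, with no explicit shared memory involved.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical worker: copies the given text into a freshly
 * allocated heap buffer and returns that buffer to the joiner. */
static void *worker(void *arg)
{
    const char *text = arg;
    char *buf = malloc(strlen(text) + 1);

    if (buf != NULL)
        strcpy(buf, text);
    return buf;                 /* picked up by pthread_join() */
}

/* Starts one thread, waits for it, and returns the buffer that the
 * thread allocated. The caller owns the buffer and must free() it.
 * Returns NULL if the thread could not be created or joined, or if
 * the allocation inside the thread failed. */
char *fetch_from_thread(const char *text)
{
    pthread_t tid;
    void *ret = NULL;

    assert(text != NULL);

    /* The worker only reads the string, so dropping const is safe here. */
    if (pthread_create(&tid, NULL, worker, (void *)text) != 0)
        return NULL;
    if (pthread_join(tid, &ret) != 0)
        return NULL;
    return ret;
}
```

Calling `fetch_from_thread("some text")` from the main thread should hand back a copy that was allocated entirely inside the child thread; pthread_join also provides the synchronization that makes the child's writes visible to the caller.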

        hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"


Re: libpq and multi-threading

From
"Michael J. Baars"
Date:
Hi Peter,

The shared common address space is controlled by the clone(2) CLONE_VM option. Indeed this results in an environment in which both the parent and the child can read / write each other's memory, but dynamic memory being allocated using malloc(3) from two different threads simultaneously will result in internal interference.

Because libpq makes use of malloc to store results, you will come to find that the CLONE_VM option was not the option you were looking for.


Re: libpq and multi-threading

From
Michael Loftis
Date:

That is not a thread. The Linux man page for clone says so right at the start:

“clone, __clone2, clone3 - create a child process”

What you want is pthread_create (or similar)

There’s a bunch of poorly documented dragons if you’re trying to treat a child process as a thread. Use POSIX Threads: pretty much any time PG or anything else Linux-based says “thread”, it means a POSIX Threads environment.


--

"Genius might be described as a supreme capacity for getting its possessors
into trouble of all kinds."
-- Samuel Butler

Re: libpq and multi-threading

From
"Michael J. Baars"
Date:
Hi Michael,

Are pthread_* functions really such an improvement over clone? Do they make a 'freely passing around' of PGresult objects possible? Like it matters, process or thread.

We were talking about the documentation and this 'freely passing around' PGresult object. I just don't think it is as simple as the documentation makes you believe.


Re: libpq and multi-threading

From
"Peter J. Holzer"
Date:
On 2023-05-03 06:35:26 -0600, Michael Loftis wrote:
> That is not a thread. Linux man clone right at the start …
>
> “clone, __clone2, clone3 - create a child process”
>
> What you want is pthread_create (or similar)

clone is the system call which is used to create both processes and
threads (in the early days of Linux that generalization was thought to
be beneficial, but POSIX has all kinds of special rules for processes
and threads so it may actually have made stuff more complicated.)

I do agree that pthread_create (or the C11 thrd_create) is the way to
go. It will just call clone behind the scenes, but it will do so with
the right flags and possibly set up some other stuff expected by the
rest of the C library, too.

There may be good reasons to use the low level function in some cases.
But I'd say that in that case you should better know what that means
exactly.
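
As an illustration of handing a finished result from one pthread_create thread to another, here is a minimal sketch, assuming POSIX threads (built with `-pthread`). It deliberately uses a stand-in `struct result` instead of a real PGresult so that it builds without libpq; `struct mailbox`, `producer` and `produce_and_receive` are hypothetical names. The result is written once, published under a mutex, and only read afterwards.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for a query result (assumption: a real program would
 * hand over a PGresult pointer in the same way). */
struct result {
    int  nrows;
    char first_value[32];
};

/* One-slot mailbox protected by a mutex and a condition variable. */
struct mailbox {
    pthread_mutex_t lock;
    pthread_cond_t  ready;
    struct result  *slot;   /* published result, or NULL on failure */
    int             done;   /* set once the producer has published */
};

/* Producer thread: builds the result, then publishes the pointer. */
static void *producer(void *arg)
{
    struct mailbox *mb = arg;
    struct result *r = malloc(sizeof *r);

    if (r != NULL) {
        r->nrows = 1;
        strcpy(r->first_value, "42");
    }
    pthread_mutex_lock(&mb->lock);
    mb->slot = r;
    mb->done = 1;
    pthread_cond_signal(&mb->ready);
    pthread_mutex_unlock(&mb->lock);
    return NULL;
}

/* Starts the producer, waits for the published result and returns it.
 * The caller owns the result and must free() it. NULL on failure. */
struct result *produce_and_receive(void)
{
    struct mailbox mb;
    pthread_t tid;
    struct result *r = NULL;

    mb.slot = NULL;
    mb.done = 0;
    if (pthread_mutex_init(&mb.lock, NULL) != 0)
        return NULL;
    if (pthread_cond_init(&mb.ready, NULL) != 0) {
        pthread_mutex_destroy(&mb.lock);
        return NULL;
    }

    if (pthread_create(&tid, NULL, producer, &mb) == 0) {
        pthread_mutex_lock(&mb.lock);
        while (!mb.done)                    /* guards against spurious wakeups */
            pthread_cond_wait(&mb.ready, &mb.lock);
        assert(mb.done);
        r = mb.slot;
        pthread_mutex_unlock(&mb.lock);
        pthread_join(tid, NULL);            /* mailbox is on our stack: join first */
    }

    pthread_cond_destroy(&mb.ready);
    pthread_mutex_destroy(&mb.lock);
    return r;
}
```

The mutex and condition variable are only needed for the handoff itself; once the pointer has been received, the receiving thread reads the result without further locking because nothing modifies it any more.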

        hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"


Re: libpq and multi-threading

From
Geoff Winkless
Date:
On Wed, 3 May 2023 at 12:11, Michael J. Baars <mjbaars1977.pgsql.hackers@gmail.com> wrote:
> The shared common address space is controlled by the clone(2) CLONE_VM option. Indeed this results in an environment in which both the parent and the child can read / write each other's memory, but dynamic memory being allocated using malloc(3) from two different threads simultaneously will result in internal interference.

There's an interesting note here


TL;DR: glibc malloc does not cope well with threads created with clone(). Use pthread_create if you wish to use glibc malloc.
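
For comparison, here is a minimal sketch of the pthread_create side of that advice, assuming POSIX threads and a thread-aware malloc such as the one in glibc (built with `-pthread`). The names `alloc_loop` and `run_concurrent_malloc`, and the loop counts, are hypothetical values chosen for illustration. Several threads call malloc and free at the same time and each checks its own blocks.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_THREADS       8      /* arbitrary upper bound for the sketch */
#define ALLOCS_PER_THREAD 10000  /* arbitrary amount of work per thread */

/* Each thread allocates, fills, checks and frees many small blocks.
 * Returns NULL on success and a non-NULL marker on the first failure. */
static void *alloc_loop(void *arg)
{
    unsigned char tag = (unsigned char)(uintptr_t)arg;

    for (int i = 0; i < ALLOCS_PER_THREAD; i++) {
        size_t len = 16 + (size_t)(i % 64);
        unsigned char *p = malloc(len);

        if (p == NULL)
            return (void *)(uintptr_t)1;
        memset(p, tag, len);
        if (p[0] != tag || p[len - 1] != tag) {
            free(p);
            return (void *)(uintptr_t)1;
        }
        free(p);
    }
    return NULL;
}

/* Runs nthreads allocation loops concurrently.
 * Returns the number of threads that failed to start or reported
 * a problem; 0 is the expected outcome. */
int run_concurrent_malloc(int nthreads)
{
    pthread_t tids[MAX_THREADS];
    int started = 0;
    int failures = 0;

    assert(nthreads > 0 && nthreads <= MAX_THREADS);

    for (int i = 0; i < nthreads; i++) {
        void *tag = (void *)(uintptr_t)(i + 1);   /* per-thread fill byte */

        if (pthread_create(&tids[i], NULL, alloc_loop, tag) != 0) {
            failures++;
            break;
        }
        started++;
    }
    for (int i = 0; i < started; i++) {
        void *ret = NULL;

        if (pthread_join(tids[i], &ret) != 0 || ret != NULL)
            failures++;
    }
    return failures;
}
```

A call such as `run_concurrent_malloc(4)` is expected to return 0 under these assumptions. The sketch says nothing about threads created with a raw clone() call, which is the case the note above warns about.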

Geoff