libpq and multi-threading

Started by Michael J. Baars, almost 3 years ago, 11 messages, general
#1 Michael J. Baars
mjbaars1977.pgsql.hackers@gmail.com

Hi All,

I have a question about libpq and multi-threading.

In the PostgreSQL documentation (
https://www.postgresql.org/docs/15/libpq-threading.html) it says that
results can be passed around freely between threads. However, when I try to
read the result from the parent thread, the program crashes with a
segmentation fault.

I have already tried to set the PostgreSQL 'dynamic_shared_memory_type'
configuration option to 'mmap', but this does not help.

Am I doing something wrong? How can I make libpq use mmap to allocate
memory that *can* be read from the parent thread?

Best regards,
Mischa Baars.

#2 Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Michael J. Baars (#1)
Re: libpq and multi-threading

On Tue, 2023-05-02 at 11:38 +0200, Michael J. Baars wrote:

> I have a question about libpq and multi-threading.
>
> In the PostgreSQL documentation (https://www.postgresql.org/docs/15/libpq-threading.html)
> it says that results can be passed around freely between threads. However, when I try to read
> the result from the parent thread, the program crashes with a segmentation fault.

That's too little information.

Yours,
Laurenz Albe

#3 Michael J. Baars
mjbaars1977.pgsql.hackers@gmail.com
In reply to: Laurenz Albe (#2)
Re: libpq and multi-threading

Hello Laurenz,

I don't think it is, but let me shed some more light on it.

After playing around a little with threads and memory, I now know that the
PGresult is not read-only, it is read-once. The child can only read that
portion of parent memory, that was written before the thread started.
Read-only is not strong enough.

Let me correct my first mail. Making libpq use mmap is not good enough
either. Shared memory allocated by the child can not be accessed by the
parent. I remembered right after pushing the send button. Shared memory
needed by the child therefore has to be allocated through the parent.

In conclusion, I have found no way to pass the PGresult around, other than
by copying it to shared memory. Rather disappointing. One store too many if
you ask me. But passing PGresults around freely between threads, because
they are supposedly read-only, is not a finding that I was able to
reproduce from here.

#4 David G. Johnston
david.g.johnston@gmail.com
In reply to: Michael J. Baars (#1)
Re: libpq and multi-threading

On Tue, May 2, 2023 at 2:38 AM Michael J. Baars <
mjbaars1977.pgsql.hackers@gmail.com> wrote:

> I have already tried to set the PostgreSQL 'dynamic_shared_memory_type'
> configuration option to 'mmap', but this does not help.

Of course it doesn't, that is a server-side configuration.

"Specifies the dynamic shared memory implementation that the server should
use."

https://www.postgresql.org/docs/current/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-MEMORY

David J.

#5 Michael J. Baars
mjbaars1977.pgsql.hackers@gmail.com
In reply to: David G. Johnston (#4)
Re: libpq and multi-threading

Hi David,

My mistake. Too much fiddling around, but better than no fiddling around.
It appears both sides make mistakes, or does your freely passing around
work better than mine?

#6 Peter J. Holzer
hjp-pgsql@hjp.at
In reply to: Michael J. Baars (#3)
Re: libpq and multi-threading

On 2023-05-02 17:43:06 +0200, Michael J. Baars wrote:

> I don't think it is, but let me shed some more light on it.

One possibly quite important piece of information you haven't told us yet is
which OS you use.

Or how you create the threads, how you pass the results around, what
else you are possibly doing between getting the result and trying to use
it ...

A short self-contained test case might shed some light on this.

> After playing around a little with threads and memory, I now know that the
> PGresult is not read-only, it is read-once. The child can only read that
> portion of parent memory, that was written before the thread started. Read-only
> is not strong enough.
>
> Let me correct my first mail. Making libpq use mmap is not good enough either.
> Shared memory allocated by the child can not be accessed by the parent.

Are you sure you are talking about threads and not processes? In the OSs
I am familiar with, threads (of the same process) share a common address
space. You don't need explicit shared memory and there is no such thing
as "parent memory" (there is thread-local storage, but that's more a
compiler/library construct).

hp

--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp@hjp.at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"
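
Peter's point about a shared address space can be illustrated with a minimal sketch, assuming a POSIX system with pthreads. A worker thread allocates a result with malloc, and the parent reads and frees it after pthread_join. The struct fake_result and the functions worker, fetch_in_thread and demo are hypothetical stand-ins for illustration; they are not part of libpq, and no database connection is involved.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a PGresult: heap data produced by a worker. */
struct fake_result {
    int  ntuples;
    char value[32];
};

/* Worker thread: allocate the result on the process-wide heap and return it. */
static void *worker(void *arg)
{
    (void) arg;
    struct fake_result *res = malloc(sizeof *res);
    if (res == NULL)
        return NULL;
    res->ntuples = 1;
    strcpy(res->value, "written by child");
    return res;                 /* ownership passes to whoever joins */
}

/* Parent side: start the worker, wait for it, and hand back its result. */
struct fake_result *fetch_in_thread(void)
{
    pthread_t tid;
    void *ret = NULL;

    if (pthread_create(&tid, NULL, worker, NULL) != 0)
        return NULL;
    if (pthread_join(tid, &ret) != 0)
        return NULL;
    return ret;
}

/* Example caller: read the child's result in the parent, then free it. */
int demo(void)
{
    struct fake_result *res = fetch_in_thread();
    if (res == NULL)
        return 1;
    assert(res->ntuples == 1);  /* the parent sees what the child wrote */
    printf("%d row: %s\n", res->ntuples, res->value);
    free(res);
    return 0;
}
```

Calling demo() from main (built with -pthread where the toolchain needs it) exercises the whole hand-off. pthread_join is what tells the parent that the worker has finished writing; no mmap or explicit shared memory is involved.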

#7 Michael J. Baars
mjbaars1977.pgsql.hackers@gmail.com
In reply to: Peter J. Holzer (#6)
Re: libpq and multi-threading

Hi Peter,

The shared common address space is controlled by the clone(2) CLONE_VM
option. Indeed this results in an environment in which both the parent and
the child can read / write each other's memory, but dynamic memory being
allocated using malloc(3) from two different threads simultaneously will
result in internal interference.

Because libpq makes use of malloc to store results, you will come to find
that the CLONE_VM option was not the option you were looking for.

#8 Michael Loftis
mloftis@wgops.com
In reply to: Michael J. Baars (#7)
Re: libpq and multi-threading

That is not a thread. The Linux man page for clone says so right at the start:

“clone, __clone2, clone3 - create a child process”

What you want is pthread_create (or similar).

There are a bunch of poorly documented dragons if you’re trying to treat a
child process as a thread. Use POSIX Threads: pretty much any time PG or
anything else Linux-based says "thread", it means a POSIX Threads
environment.

--

"Genius might be described as a supreme capacity for getting its possessors
into trouble of all kinds."
-- Samuel Butler

#9 Michael J. Baars
mjbaars1977.pgsql.hackers@gmail.com
In reply to: Michael Loftis (#8)
Re: libpq and multi-threading

Hi Michael,

Are pthread_* functions really such an improvement over clone? Do they make
'freely passing around' PGresult objects possible? As if it matters whether
it is a process or a thread.

We were talking about the documentation and this 'freely passing around'
of PGresult objects. I just don't think it is as simple as the documentation
makes you believe.

#10 Peter J. Holzer
hjp-pgsql@hjp.at
In reply to: Michael Loftis (#8)
Re: libpq and multi-threading

On 2023-05-03 06:35:26 -0600, Michael Loftis wrote:

> That is not a thread. Linux man clone right at the start …
>
> “clone, __clone2, clone3 - create a child process”
>
> What you want is pthread_create (or similar)

clone is the system call which is used to create both processes and
threads (in the early days of Linux that generalization was thought to
be beneficial, but POSIX has all kinds of special rules for processes
and threads so it may actually have made stuff more complicated.)

I do agree that pthread_create (or the C11 thrd_create) is the way to
go. It will just call clone behind the scenes, but it will do so with
the right flags and possibly set up some other stuff expected by the
rest of the C library, too.

There may be good reasons to use the low-level function in some cases.
But I'd say that in that case you had better know exactly what that
means.

hp

--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp@hjp.at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"

#11 Geoff Winkless
pgsqladmin@geoff.dj
In reply to: Michael J. Baars (#7)
Re: libpq and multi-threading

On Wed, 3 May 2023 at 12:11, Michael J. Baars <
mjbaars1977.pgsql.hackers@gmail.com> wrote:

> The shared common address space is controlled by the clone(2) CLONE_VM
> option. Indeed this results in an environment in which both the parent and
> the child can read / write each other's memory, but dynamic memory being
> allocated using malloc(3) from two different threads simultaneously will
> result in internal interference.

There's an interesting note here:

https://stackoverflow.com/a/45285877

TL;DR: glibc malloc does not cope well with threads created with clone().
Use pthread_create if you wish to use glibc malloc.

Geoff
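
For contrast with the clone() case, here is a small sketch, assuming a POSIX system whose malloc is thread-safe for threads created with pthread_create (as glibc's is). Several threads allocate, fill, verify and free blocks concurrently, and each counts blocks that contain a byte it did not write. The names alloc_loop, concurrent_malloc_errors and check_concurrent_malloc, and the constants, are hypothetical and chosen only for illustration. It is a smoke test, not a proof of allocator correctness.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define NTHREADS 4      /* illustrative values, not tuned */
#define NROUNDS  2000
#define BATCH    8
#define BLOCK    64

struct job {
    unsigned char tag;  /* byte pattern unique to this thread */
    long errors;        /* failed allocations plus corrupted blocks */
};

/* Allocate a batch of blocks, fill them, then verify and free them. */
static void *alloc_loop(void *arg)
{
    struct job *job = arg;

    for (int round = 0; round < NROUNDS; round++) {
        unsigned char *blocks[BATCH];

        for (int b = 0; b < BATCH; b++) {
            blocks[b] = malloc(BLOCK);
            if (blocks[b] == NULL)
                job->errors++;
            else
                memset(blocks[b], job->tag, BLOCK);
        }
        for (int b = 0; b < BATCH; b++) {
            if (blocks[b] == NULL)
                continue;
            for (int k = 0; k < BLOCK; k++) {
                if (blocks[b][k] != job->tag) {
                    job->errors++;      /* another thread wrote here */
                    break;
                }
            }
            free(blocks[b]);
        }
    }
    return NULL;
}

/* Run the threads; return the total error count, or -1 if one failed to start. */
long concurrent_malloc_errors(void)
{
    pthread_t tids[NTHREADS];
    struct job jobs[NTHREADS];
    long total = 0;
    int started = 0;

    for (int i = 0; i < NTHREADS; i++) {
        jobs[i].tag = (unsigned char) (i + 1);
        jobs[i].errors = 0;
        if (pthread_create(&tids[i], NULL, alloc_loop, &jobs[i]) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
        total += jobs[i].errors;
    }
    return started == NTHREADS ? total : -1;
}

/* Example caller: abort if any thread saw a corrupted block. */
void check_concurrent_malloc(void)
{
    assert(concurrent_malloc_errors() == 0);
}
```

Each thread writes only to its own job entry, and the parent reads the counters only after pthread_join, so the sketch needs no extra locking.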