Assertion failure with barriers in parallel hash join
Hi all,
prion, which uses -DRELCACHE_FORCE_RELEASE -DCATCACHE_FORCE_RELEASE,
has just failed in an interesting way:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prion&dt=2020-09-29%2005%3A24%3A11
The assertion failure happens in a parallel worker when attempting to
attach a barrier:
#2 0x00000000009027d2 in ExceptionalCondition
(conditionName=conditionName@entry=0xa80846 "!barrier->static_party",
errorType=errorType@entry=0x955e22 "FailedAssertion",
fileName=fileName@entry=0xa807a8
"/home/ec2-user/bf/root/REL_13_STABLE/pgsql.build/../pgsql/src/backend/storage/ipc/barrier.c",
lineNumber=lineNumber@entry=218) at
/home/ec2-user/bf/root/REL_13_STABLE/pgsql.build/../pgsql/src/backend/utils/error/assert.c:67
#3 0x00000000007b6b1f in BarrierAttach
(barrier=barrier@entry=0x7f73c9d76008) at
/home/ec2-user/bf/root/REL_13_STABLE/pgsql.build/../pgsql/src/backend/storage/ipc/barrier.c:218
#4 0x0000000000682ebf in ExecParallelHashJoinNewBatch
(hjstate=<optimized out>, hjstate=<optimized out>) at
/home/ec2-user/bf/root/REL_13_STABLE/pgsql.build/../pgsql/src/backend/executor/nodeHashjoin.c:1132
#5 ExecHashJoinImpl (parallel=true, pstate=0x1248d88) at
/home/ec2-user/bf/root/REL_13_STABLE/pgsql.build/../pgsql/src/backend/executor/nodeHashjoin.c:560
#6 ExecParallelHashJoin (pstate=0x1248d88) at
/home/ec2-user/bf/root/REL_13_STABLE/pgsql.build/../pgsql/src/backend/executor/nodeHashjoin.c:607
#7 0x0000000000666d48 in ExecProcNodeInstr (node=0x1248d88) at
/home/ec2-user/bf/root/REL_13_STABLE/pgsql.build/../pgsql/src/backend/executor/execProcnode.c:466
#8 0x000000000065f322 in ExecProcNode (node=0x1248d88) at
/home/ec2-user/bf/root/REL_13_STABLE/pgsql.build/../pgsql/src/include/executor/executor.h:245
Thanks,
--
Michael
On Tue, Sep 29, 2020 at 7:11 PM Michael Paquier <michael@paquier.xyz> wrote:
#2 0x00000000009027d2 in ExceptionalCondition
(conditionName=conditionName@entry=0xa80846 "!barrier->static_party",
#4 0x0000000000682ebf in ExecParallelHashJoinNewBatch
Thanks. Ohhh. I think I see how that condition was reached and what
to do about it, but I'll need to look more closely. I'm away on
vacation right now, and will update in a couple of days when I'm back
at a real computer.
On Tue, Sep 29, 2020 at 9:12 PM Thomas Munro <thomas.munro@gmail.com> wrote:
On Tue, Sep 29, 2020 at 7:11 PM Michael Paquier <michael@paquier.xyz> wrote:
#2 0x00000000009027d2 in ExceptionalCondition
(conditionName=conditionName@entry=0xa80846 "!barrier->static_party",
#4 0x0000000000682ebf in ExecParallelHashJoinNewBatch
Thanks. Ohhh. I think I see how that condition was reached and what
to do about it, but I'll need to look more closely. I'm away on
vacation right now, and will update in a couple of days when I'm back
at a real computer.
Here's a throw-away patch to add some sleeps that trigger the problem,
and a first draft fix. I'll do some more testing of this next week
and see if I can simplify it.
On Thu, Oct 1, 2020 at 8:08 PM Thomas Munro <thomas.munro@gmail.com> wrote:
On Tue, Sep 29, 2020 at 9:12 PM Thomas Munro <thomas.munro@gmail.com> wrote:
On Tue, Sep 29, 2020 at 7:11 PM Michael Paquier <michael@paquier.xyz> wrote:
#2 0x00000000009027d2 in ExceptionalCondition
(conditionName=conditionName@entry=0xa80846 "!barrier->static_party",
#4 0x0000000000682ebf in ExecParallelHashJoinNewBatch
Thanks. Ohhh. I think I see how that condition was reached and what
to do about it, but I'll need to look more closely. I'm away on
vacation right now, and will update in a couple of days when I'm back
at a real computer.
Here's a throw-away patch to add some sleeps that trigger the problem,
and a first draft fix. I'll do some more testing of this next week
and see if I can simplify it.
I was just taking a look at the patch and noticed the commit message
says:
With unlucky timing and parallel_leader_participation off...
Is parallel_leader_participation being off required to reproduce the
issue?
On Tue, Oct 13, 2020 at 12:15 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
On Thu, Oct 1, 2020 at 8:08 PM Thomas Munro <thomas.munro@gmail.com> wrote:
On Tue, Sep 29, 2020 at 9:12 PM Thomas Munro <thomas.munro@gmail.com> wrote:
Here's a throw-away patch to add some sleeps that trigger the problem,
and a first draft fix. I'll do some more testing of this next week
and see if I can simplify it.
I was just taking a look at the patch and noticed the commit message
says:
With unlucky timing and parallel_leader_participation off...
Is parallel_leader_participation being off required to reproduce the
issue?
Yeah, because otherwise the leader detaches last so the problem doesn't arise.
On Tue, Oct 13, 2020 at 12:18 PM Thomas Munro <thomas.munro@gmail.com> wrote:
On Tue, Oct 13, 2020 at 12:15 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
On Thu, Oct 1, 2020 at 8:08 PM Thomas Munro <thomas.munro@gmail.com> wrote:
On Tue, Sep 29, 2020 at 9:12 PM Thomas Munro <thomas.munro@gmail.com> wrote:
Here's a throw-away patch to add some sleeps that trigger the problem,
and a first draft fix. I'll do some more testing of this next week
and see if I can simplify it.
I was just taking a look at the patch and noticed the commit message
says:
With unlucky timing and parallel_leader_participation off...
Is parallel_leader_participation being off required to reproduce the
issue?
Yeah, because otherwise the leader detaches last so the problem doesn't arise.
While working on Melanie's Parallel Hash Full Join patch I remembered
that this (apparently extremely rare) race still needs fixing. Here
is a slightly tidied version, which I'm adding to the next CF for CI
coverage.
Here also is a picture from an unfinished description of this
algorithm that I've been trying to write, which might help explain
the change. It's a phase diagram, where you can see the phases "run"
(= all processes try to work on batches) and "done" (= one process is
freeing the shmem objects for tracking batches, anyone who attaches to
the barrier in this phase knows that it's not even safe to access
batch bookkeeping memory). Before this patch there is no "run", just
"done" (= process batches and then one process frees, which has a race
if someone else attaches really late, after the freeing has begun).
I'm currently wondering whether this can be further improved using
Melanie's new BarrierArriveAndDetachExceptLast() function.
(In the code the phase names have -ing on the end, I'll probably drop
those, because commit 3048898e73c did that to the corresponding wait
events.)
Attachments:
v2-0001-Inject-fault-timing.patchtext/x-patch; charset=US-ASCII; name=v2-0001-Inject-fault-timing.patchDownload+11-1
v2-0002-Fix-race-condition-in-parallel-hash-join-batch-cl.patchtext/x-patch; charset=US-ASCII; name=v2-0002-Fix-race-condition-in-parallel-hash-join-batch-cl.patchDownload+58-33
phj-barriers.pngimage/png; name=phj-barriers.pngDownload+1-0
On Sat, Mar 6, 2021 at 9:56 AM Thomas Munro <thomas.munro@gmail.com> wrote:
While working on Melanie's Parallel Hash Full Join patch I remembered
that this (apparently extremely rare) race still needs fixing. Here
is a slightly tidied version, which I'm adding to the next CF for CI
coverage.
Pushed and back-patched.
On Wed, Mar 17, 2021 at 6:17 PM Thomas Munro <thomas.munro@gmail.com> wrote:
On Sat, Mar 6, 2021 at 9:56 AM Thomas Munro <thomas.munro@gmail.com> wrote:
While working on Melanie's Parallel Hash Full Join patch I remembered
that this (apparently extremely rare) race still needs fixing. Here
is a slightly tidied version, which I'm adding to the next CF for CI
coverage.
Pushed and back-patched.
According to BF animal elver there is something wrong with this
commit. Looking into it.
On Wed, Mar 17, 2021 at 6:58 PM Thomas Munro <thomas.munro@gmail.com> wrote:
According to BF animal elver there is something wrong with this
commit. Looking into it.
Assertion failure reproduced here and understood, but unfortunately
it'll take some more time to fix this. I've reverted the commit for
now to unbreak the ~5 machines that are currently showing red in the
build farm.
On Wed, Mar 17, 2021 at 8:18 AM Thomas Munro <thomas.munro@gmail.com> wrote:
On Wed, Mar 17, 2021 at 6:58 PM Thomas Munro <thomas.munro@gmail.com> wrote:
According to BF animal elver there is something wrong with this
commit. Looking into it.
Assertion failure reproduced here and understood, but unfortunately
it'll take some more time to fix this. I've reverted the commit for
now to unbreak the ~5 machines that are currently showing red in the
build farm.
So, I think the premise of the patch
v6-0001-Fix-race-condition-in-parallel-hash-join-batch-cl.patch makes
sense: freeing the hashtable memory should happen atomically with
updating the flag that indicates to all others that the memory is not to
be accessed any longer.
As the buildfarm likely reported, leading you to revert the patch:
when the inner side is empty and we bail out before advancing the
build barrier past PHJ_BUILD_HASHING_OUTER, we trip the new Asserts
you've added in ExecHashTableDetach().
Assert(!pstate ||
       BarrierPhase(&pstate->build_barrier) >= PHJ_BUILD_RUNNING);
Hmm.
Maybe if the inner side is empty, we can advance the build barrier to
the end?
We help batch 0 along like this in ExecParallelHashJoinSetUpBatches().
But, I'm not sure we can expect the process executing this code to be
attached to the build barrier, can we?
@@ -296,7 +304,19 @@ ExecHashJoinImpl(PlanState *pstate, bool parallel)
                  * outer relation.
                  */
                 if (hashtable->totalTuples == 0 &&
                     !HJ_FILL_OUTER(node))
+                {
+                    if (parallel)
+                    {
+                        Barrier    *build_barrier;
+
+                        build_barrier = &parallel_state->build_barrier;
+                        while (BarrierPhase(build_barrier) < PHJ_BUILD_DONE)
+                            BarrierArriveAndWait(build_barrier, 0);
+                        BarrierDetach(build_barrier);
+                    }
+
                     return NULL;
+                }
Also, silly question: why couldn't we just set the pstate->batches pointer to
InvalidDsaPointer before doing dsa_free() (of course saving the pointer
so that we can actually do the freeing with it)? Is there still a race?
On Wed, Mar 31, 2021 at 6:25 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
I've attached a new version of the patch which contains my suggested fix
for the problem with the empty inner optimization.
If you apply Thomas's original bug-repro patch with the injected
sleeps and use the following DDL and query, you can reproduce the
issue with the empty-inner optimization present in his v2 patch.
Then, if you apply my attached v3 patch, you can see that we no
longer trip the assert.
drop table if exists empty_simple;
create table empty_simple(id int, col2 text);
update pg_class set reltuples = 10000 where relname = 'empty_simple';
drop table if exists simple;
create table simple as
select generate_series(1, 20000) AS id, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa';
analyze simple;
set min_parallel_table_scan_size = 0;
set parallel_setup_cost = 0;
set enable_hashjoin = on;
set parallel_leader_participation to off;
set work_mem = '4MB';
set enable_parallel_hash = on;
select count(*) from empty_simple join simple using (id);
- Melanie
Attachments:
v3-0001-Fix-race-condition-in-parallel-hash-join-batch-cl.patchapplication/octet-stream; name=v3-0001-Fix-race-condition-in-parallel-hash-join-batch-cl.patchDownload+71-33
The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: tested, passed
Documentation: not tested
Hi all,
We recently encountered the same bug in the field. Oleksii Kozlov managed to come up with reproduction steps which reliably trigger it. Interestingly, the bug does not only manifest as a failing assertion, but also as a segmentation fault, in builds with assertions disabled as well as enabled (!). So it can crash production environments. We applied Melanie's proposed v3 patch to the REL_14_3 branch and can confirm that with the patch, neither the assertion failure nor the segmentation fault occurs any longer.
I have also glanced at the code and the implementation looks fine. However, I'm not an expert on the fairly involved hash join state machine.
There seems to be no need for additional documentation.
For completeness here is the stack trace of the segmentation fault.
Like the stack trace from the assertion failure initially shared by Michael and also encountered by us, the stack trace of the segmentation fault also contains ExecParallelHashJoinNewBatch().
#9 | Source "/opt/src/backend/executor/execMain.c", line 361, in standard_ExecutorRun
| Source "/opt/src/backend/executor/execMain.c", line 1551, in ExecutePlan
Source "/opt/src/include/executor/executor.h", line 257, in ExecProcNode [0x657e4d]
#8 | Source "/opt/src/backend/executor/nodeAgg.c", line 2179, in ExecAgg
Source "/opt/src/backend/executor/nodeAgg.c", line 2364, in agg_retrieve_direct [0x66ba60]
#7 | Source "/opt/src/backend/executor/nodeAgg.c", line 581, in fetch_input_tuple
Source "/opt/src/include/executor/executor.h", line 257, in ExecProcNode [0x66d585]
#6 | Source "/opt/src/backend/executor/nodeHashjoin.c", line 607, in ExecParallelHashJoin
| Source "/opt/src/backend/executor/nodeHashjoin.c", line 560, in ExecHashJoinImpl
Source "/opt/src/backend/executor/nodeHashjoin.c", line 1132, in ExecParallelHashJoinNewBatch [0x67a89b]
#5 | Source "/opt/src/backend/storage/ipc/barrier.c", line 242, in BarrierAttach
Source "/opt/src/include/storage/s_lock.h", line 228, in tas [0x7c2a1b]
#4 Object "/lib/x86_64-linux-gnu/libpthread.so.0", at 0x7f4db364841f, in __funlockfile
--
David Geier
(ServiceNow)
The new status of this patch is: Ready for Committer
On Thu, Jun 2, 2022 at 9:31 PM David Geier <geidav.pg@gmail.com> wrote:
We recently encountered the same bug in the field. Oleksii Kozlov managed to come up with reproduction steps, which reliably trigger it. Interestingly, the bug does not only manifest as failing assertion, but also as segmentation fault; in builds with disabled and with enabled (!) assertions. So it can crash production environments. We applied the proposed patch v3 from Melanie to the REL_14_3 branch and can confirm that with the patch neither the assertion nor the segmentation fault still occur.
Thanks for the report, testing, and for creating the CF entry.
I assume you are using parallel_leader_participation=off, and the
reason we haven't heard more about this is because few people do that.
By coincidence I was just about to restart a bunch of hash join
projects and have been paging the topic area back into my brain, so
I'll start with another round of testing/analysis of this bug/patch
next week.
Hi Thomas,
Correct. We're running with parallel leader participation disabled,
and we have to, because another custom plan node we built relies on that.
That would be great. Anything else I can help with to get this patch in?
Thanks!
--
David
(ServiceNow)
On Fri, 3 Jun 2022 at 00:06, Thomas Munro <thomas.munro@gmail.com> wrote:
Hi Thomas,
Can we make progress with this patch in the current commit fest, or
discuss what is still missing to bring this in?
Thanks!
--
David Geier
(ServiceNow)
On Thu, Nov 17, 2022 at 8:01 PM David Geier <geidav.pg@gmail.com> wrote:
Can we make progress with this patch in the current commit fest, or discuss what is still missing to bring this in?
Hi David,
Sorry for the delay. I'll aim to get this done in the next few days.
Thanks!
Please let me know if I can help out, e.g. with re-testing.
--
David Geier
(ServiceNow)