Issue with markers in isolation tester? Or not?

Started by Michail Nikolaevabout 1 year ago15 messages

Michail Nikolaev

michail.nikolaev@gmail.com

about 1 year ago

Hello, everyone!

While stabilizing tests for [0]/messages/by-id/ZnoZ6GNwkJmq-gTh@paquier.xyz I have noticed unclear (and confusing in my
opinion) behavior of markers in the isolation tester.

I have attached a test with reproducer.

There are two permutations in the test:

permutation
after(before)
before
detach1
wakeup1
detach2
wakeup2

In that case, I see expected results:
step before: <... completed>
step after: <... completed>

But changing the order of steps slightly:

permutation
after(before)
wakeup1 <------- wakeup moved here
before
detach1
detach2
wakeup2

makes "after" to be completed before "before". Does that make sense?

In my cases, there are 2-steps of injection points for each of 3 sessions,
so, it is difficult to make it bullet-proof in that case without a lot of
additional injection points.

Best regards,
Mikhail.

[0]: /messages/by-id/ZnoZ6GNwkJmq-gTh@paquier.xyz
/messages/by-id/ZnoZ6GNwkJmq-gTh@paquier.xyz

Michail Nikolaev

michail.nikolaev@gmail.com

about 1 year ago

In reply to: Michail Nikolaev (#1)

Re: Issue with markers in isolation tester? Or not?

Hello!

In case if someone will look for a workaround - it is possible to use
"notices N" mark.\

Instead of

step before {
SELECT injection_points_run('injection-points-wait-2');
}

use something like:

step before {
DO $$
BEGIN
PERFORM injection_points_run('injection-points-wait-2');
RAISE NOTICE 'before is complete';
END $$
}

and then:

permutation
after(before notices 1)
before
detach1
wakeup1
detach2
wakeup2

permutation
after(before notices 1)
wakeup1
before
detach1
detach2
wakeup2

in such a case, both permutations report "before" to be completed before
"after", not after.

Best regards,
Michail.

Show quoted text

noah@leadboat.com

about 1 year ago

In reply to: Michail Nikolaev (#1)

Re: Issue with markers in isolation tester? Or not?

On Mon, Jan 13, 2025 at 02:02:00AM +0100, Michail Nikolaev wrote:

While stabilizing tests for [0] I have noticed unclear (and confusing in my
opinion) behavior of markers in the isolation tester.

I have attached a test with reproducer.

There are two permutations in the test:

permutation
after(before)
before
detach1
wakeup1
detach2
wakeup2

In that case, I see expected results:
step before: <... completed>
step after: <... completed>

But changing the order of steps slightly:

permutation
after(before)
wakeup1 <------- wakeup moved here
before
detach1
detach2
wakeup2

makes "after" to be completed before "before". Does that make sense?

Yes. I don't see a good reason for isolationtester to disregard that
"(before)" marker, so I think you've found a bug in isolationtester. How
might we fix it?

Michail Nikolaev

michail.nikolaev@gmail.com

about 1 year ago

In reply to: Noah Misch (#3)

Re: Issue with markers in isolation tester? Or not?

Hello, Noah!

Thanks for your attention!

It looks like the simplest solution is just to count the number of
not-yet-completed steps and check that value.

A patch with such changes (+ the same test + README update + commit
message) is attached.

Best regards,
Mikhail.

noah@leadboat.com

about 1 year ago

In reply to: Michail Nikolaev (#4)

Re: Issue with markers in isolation tester? Or not?

On Tue, Jan 14, 2025 at 12:16:22PM +0100, Michail Nikolaev wrote:

--- a/src/test/isolation/README
+++ b/src/test/isolation/README
@@ -183,9 +183,9 @@ A marker consisting solely of a step name indicates that this step may
not be reported as completing until that other step has completed.
This allows stabilizing cases where two queries might be seen to complete
in either order.  Note that this step can be *launched* before the other
-step has completed.  (If the other step is used more than once in the
-current permutation, this step cannot complete while any of those
-instances is active.)
+step has completed or launched.  (If the other step is used more than once
+in the current permutation, this step cannot complete while any of those
+instances are not yet complete.)

I misunderstood, and I was mistaken to see this as a bug fix. The
isolationtester is acting per its definition, and this would be a definition
change. Do others have opinions on the merits of today's definition vs. the
proposed definition?

I'd need to review the motivating test to form my own opinion on whether the
new definition makes it easier to write tests. I'm optimistic that it does
make things easier, because I think one can get the old behavior by defining
two steps with the same commands. Thanks for the patch. You could bundle
this in the thread that wants this to stabilize that thread's CI results.

As a side note on the broader topic of the difficulty of stabilizing isolation
test race conditions, I've thought it would be helpful to have a "strict
blocking condition mode" that complains at the end of the test run if the spec
had completion order ambiguities. I think these are the rules sufficient to
have a fully-specified completion order:

1. Every step that waits needs a blocking condition.
2. No two steps A & B name the same step C as a blocking condition. Instead,
make B block on A.
3. If a step appears in another step's blocking condition and is not the last
step listed in the permutation, the step listed after the unblocker needs a
blocking condition on a formerly-blocked step.

+session s1
+step after	{ SELECT injection_points_run('injection-points-wait-1'); }

FYI, I prefer the convention of ending each step name with the session number,
i.e. s/after/after1/ here. The first isolation specs used that style. I
mildly prefer it over the newer alternative styles, and I strongly prefer
styles that convey the session identity over styles that don't. No need to
send a new patch version, though.

Michail Nikolaev

michail.nikolaev@gmail.com

about 1 year ago

In reply to: Noah Misch (#5)

Re: Issue with markers in isolation tester? Or not?

Hello, Noah!

I misunderstood, and I was mistaken to see this as a bug fix. The
isolationtester is acting per its definition, and this would be a

definition

change.

Yes, you are right, but I think it is better to clarify it somehow
because from my point of view that definition feels like logic in patched
version.
But maybe my non-native English prevents me from understanding it correctly.

In my opinion, it is better to add some clarification like "this marker
affects only other steps which were launched before this step or during its
execution".

I'd need to review the motivating test to form my own opinion on whether

the

new definition makes it easier to write tests.
You could bundle this in the thread that wants this to stabilize that

thread's CI results.

I'll think about it, but currently it is stabilized using "notices"
and, you know, more changes - harder to merge :)

Also, I'll try one more time to stabilize it without "notices" - maybe I'll
get some insight.
Anyway - I'll ping you in that thread [0]/messages/by-id/ZnoZ6GNwkJmq-gTh@paquier.xyz (you're already in recipients) or
bring it here.

Best regards,
Mikhail.

[0]: /messages/by-id/ZnoZ6GNwkJmq-gTh@paquier.xyz
/messages/by-id/ZnoZ6GNwkJmq-gTh@paquier.xyz

Michael Paquier

michael@paquier.xyz

about 1 year ago

In reply to: Noah Misch (#5)

Re: Issue with markers in isolation tester? Or not?

On Tue, Jan 14, 2025 at 11:48:28AM -0800, Noah Misch wrote:

I misunderstood, and I was mistaken to see this as a bug fix. The
isolationtester is acting per its definition, and this would be a definition
change. Do others have opinions on the merits of today's definition vs. the
proposed definition?

I agree that Michail's case with the handling of the markers is kind
of strange in the case he has reported. Anyway, do you think that it
would be a good idea to change that knowing for how long we've relied
on isolationtester to do things the way they are? If we do it only on
HEAD, it would make the back-patching of newly-added tests for bug
fixes potentially more complex to deal with. If applying the new
definition across all the stable branches, this could impact something
outside code :/

I'd need to review the motivating test to form my own opinion on whether the
new definition makes it easier to write tests. I'm optimistic that it does
make things easier, because I think one can get the old behavior by defining
two steps with the same commands. Thanks for the patch. You could bundle
this in the thread that wants this to stabilize that thread's CI results.

If you can get a backward-compatible behavior, well, no objections
from here, I guess.

FYI, I prefer the convention of ending each step name with the session number,
i.e. s/after/after1/ here. The first isolation specs used that style. I
mildly prefer it over the newer alternative styles, and I strongly prefer
styles that convey the session identity over styles that don't. No need to
send a new patch version, though.

+1.  Including the session number in the steps makes debugging easier.
--
Michael

tgl@sss.pgh.pa.us

about 1 year ago

In reply to: Michael Paquier (#7)

Re: Issue with markers in isolation tester? Or not?

Michael Paquier <michael@paquier.xyz> writes:

On Tue, Jan 14, 2025 at 11:48:28AM -0800, Noah Misch wrote:

I misunderstood, and I was mistaken to see this as a bug fix. The
isolationtester is acting per its definition, and this would be a definition
change. Do others have opinions on the merits of today's definition vs. the
proposed definition?

I agree that Michail's case with the handling of the markers is kind
of strange in the case he has reported. Anyway, do you think that it
would be a good idea to change that knowing for how long we've relied
on isolationtester to do things the way they are?

We've changed it before, so I'm not totally averse to considering
more change. But I'm not happy about the lack of any analysis in
this thread of the impact on existing test cases. In particular,
it seems to me that this proposal amounts to reducing the number
of states that can be reached during an isolation test. So it
seems to me that means strictly less coverage than we had before.
I'd prefer to start by asking why the proposed new test can't be
made stable without such a restriction.

(I don't mean to minimize the question of whether we need better
documentation about this behavior.)

regards, tom lane

noah@leadboat.com

about 1 year ago

In reply to: Michael Paquier (#7)

Re: Issue with markers in isolation tester? Or not?

On Wed, Jan 15, 2025 at 02:41:31PM +0900, Michael Paquier wrote:

On Tue, Jan 14, 2025 at 11:48:28AM -0800, Noah Misch wrote:

I misunderstood, and I was mistaken to see this as a bug fix. The
isolationtester is acting per its definition, and this would be a definition
change. Do others have opinions on the merits of today's definition vs. the
proposed definition?

I agree that Michail's case with the handling of the markers is kind
of strange in the case he has reported. Anyway, do you think that it
would be a good idea to change that knowing for how long we've relied
on isolationtester to do things the way they are?

I think that would depend on what proposed tests become easier. Since the
original poster used the "notices" marker type to solve the motivating use
case, let's shelve this until there's new demand.

If we ever unshelve this and find much in today's specs that would malfunction
under the new definition, we might offer both definitions. A new marker
syntax would reach the new definition, and old specs would work as before.

If we do it only on
HEAD, it would make the back-patching of newly-added tests for bug
fixes potentially more complex to deal with. If applying the new
definition across all the stable branches, this could impact something
outside code :/

When I'm exposed to non-core tests, those tests usually target many major
versions simultaneously. Having the spec meaning vary across major versions
would hurt them more than it would help. Hence, back-patch if we do adopt the
change. As above, the impact on existing specs could inform whether we add
syntax vs. redefine existing syntax.

Michail Nikolaev

michail.nikolaev@gmail.com

about 1 year ago

In reply to: Noah Misch (#9)

Re: Issue with markers in isolation tester? Or not?

Hello, everyone!

I was able to stabilize (I hope so) tests on [0]/messages/by-id/CANtu0ojXmqjmEzp-=aJSxjsdE76iAsRgHBoK0QtYHimb_mEfsg@mail.gmail.com without any changes to
isolationtester and without "notices N".

So, I agree, it is better to put it on a shelf now. But a few words in
documentation may be a good idea.

Best regards,
Mikhail.

[0]: /messages/by-id/CANtu0ojXmqjmEzp-=aJSxjsdE76iAsRgHBoK0QtYHimb_mEfsg@mail.gmail.com
/messages/by-id/CANtu0ojXmqjmEzp-=aJSxjsdE76iAsRgHBoK0QtYHimb_mEfsg@mail.gmail.com

Michael Paquier

michael@paquier.xyz

about 1 year ago

In reply to: Michail Nikolaev (#10)

Re: Issue with markers in isolation tester? Or not?

On Sat, Jan 18, 2025 at 02:03:33PM +0100, Michail Nikolaev wrote:

I was able to stabilize (I hope so) tests on [0] without any changes to
isolationtester and without "notices N".

So, I agree, it is better to put it on a shelf now. But a few words in
documentation may be a good idea.

Could you summarize here what you have done to achieve test
stabilization in your new patch set posted at [1]/messages/by-id/CANtu0ohStYQkJL+_=40Odcqu2A4QPrM_VmofwkkudHGHFgymkA@mail.gmail.com -- Michael without using the
proposal of this thread?

Perhaps that's OK to leave things as they are, but your proposal here
is leading to a more natural definition in terms of how test ordering
can be enforced with injection points, leading to a less confusing
definition. The analysis of existing tests if this proposal moves
forward would be required, of course.

[1]: /messages/by-id/CANtu0ohStYQkJL+_=40Odcqu2A4QPrM_VmofwkkudHGHFgymkA@mail.gmail.com -- Michael
--
Michael

Michail Nikolaev

michail.nikolaev@gmail.com

about 1 year ago

In reply to: Michael Paquier (#11)

Re: Issue with markers in isolation tester? Or not?

Hello, Michael!

Could you summarize here what you have done to achieve test
stabilization in your new patch set posted at [1] without using the
proposal of this thread?

Mostly idea is next:

Let's imagine we have two steps - step_before and step_after which may end
in either order.
Then instead of such step/markers structure:

step_before(step_after)
stepN
stepN+1
step_after

use the next:

step_before
stepN
stepN+1
step_after(step_before)

In the first case, there are two possible results:
1) step_before is finished before step_after - reported as step_before,
step_after
2) step_after is launched before step_before ends - reported as step_after,
step_before

But in the case second variant:
1) step_before is finished before step_after - reported as step_before,
step_after
2) step_after is launched before step_before ends - reported as
step_before, step_after

So, the second option provides the same result regardless of order of
finishing of step_before and step_after, which is the thing I want to
achieve here.

Best regards,
Mikhail.

noah@leadboat.com

about 1 year ago

In reply to: Michail Nikolaev (#12)

Re: Issue with markers in isolation tester? Or not?

On Sun, Jan 19, 2025 at 07:26:58PM +0100, Michail Nikolaev wrote:

Mostly idea is next:

Let's imagine we have two steps - step_before and step_after which may end
in either order.
Then instead of such step/markers structure:

step_before(step_after)
stepN
stepN+1
step_after

use the next:

step_before
stepN
stepN+1
step_after(step_before)

In the first case, there are two possible results:
1) step_before is finished before step_after - reported as step_before,
step_after
2) step_after is launched before step_before ends - reported as step_after,
step_before

But in the case second variant:
1) step_before is finished before step_after - reported as step_before,
step_after
2) step_after is launched before step_before ends - reported as
step_before, step_after

So, the second option provides the same result regardless of order of
finishing of step_before and step_after, which is the thing I want to
achieve here.

Stepping back, any variation in the total ordering of step-start and
step-complete events necessarily changes the expected output. Hence, we want
to freeze that ordering. We could do that directly, by having isolationtester
read a format that lists start and complete events. We could also have the
isolationtester write the same format, so you could draft a permutation
without blocking conditions and just save the order that isolationtester
discovers by experiment. I'd find that less mentally taxing than following
rules to reach a frozen ordering via blocking conditions.

As an example of one concrete way to achieve that, have isolationtester write
a results/SPECNAME.wait file that contains syntax like "permutation-event
waiting:S1 S2 complete:S1 S3". You'd copy that file to
expected/SPECNAME.wait, which isolationtester would read at startup. This
wouldn't replace "notices" blocking conditions for certain complex tests, and
tests with multiple expected outputs might not want it. It would cover the
rest. Any unprefixed entry (e.g. the plain "S2") would get the simplification
of skipping the check for whether the step has entered a wait.

How well would that meet your test's needs?

Michail Nikolaev

michail.nikolaev@gmail.com

about 1 year ago

In reply to: Noah Misch (#13)

Re: Issue with markers in isolation tester? Or not?

Hello, Noah!

Stepping back, any variation in the total ordering of step-start and
step-complete events necessarily changes the expected output. Hence, we

want

to freeze that ordering. We could do that directly, by having

isolationtester

read a format that lists start and complete events. We could also have

the

isolationtester write the same format, so you could draft a permutation
without blocking conditions and just save the order that isolationtester
discovers by experiment. I'd find that less mentally taxing than

following

rules to reach a frozen ordering via blocking conditions.

How well would that meet your test's needs?

It feels a bit strange to me to force isolationtest to output something in
a specific order, especially since we don't actually care about the order
in the test.

But why do we need that output at all? I think a simpler solution would be
to mark a step in a way that prevents isolationtester from outputting
detailed information about the completion of that step—just an overview of
the steps launched and completed at the end of the permutation.

In my tests, I evaluate how [CREATE|RE] INDEX CONCURRENTLY interferes with
UPSERTs. I’m not testing the exact ordering; I simply set the backends into
a specific state (with different index information) and let them proceed
one by one. I don’t care which one finishes first or last—I’m only
interested in identifying errors in the output.

Best regards,
Mikhail.

noah@leadboat.com

about 1 year ago

In reply to: Michail Nikolaev (#14)

Re: Issue with markers in isolation tester? Or not?

On Sun, Jan 19, 2025 at 09:55:32PM +0100, Michail Nikolaev wrote:

Stepping back, any variation in the total ordering of step-start and
step-complete events necessarily changes the expected output. Hence, we

want

to freeze that ordering. We could do that directly, by having

isolationtester

read a format that lists start and complete events. We could also have

the

isolationtester write the same format, so you could draft a permutation
without blocking conditions and just save the order that isolationtester
discovers by experiment. I'd find that less mentally taxing than

following

rules to reach a frozen ordering via blocking conditions.

How well would that meet your test's needs?

It feels a bit strange to me to force isolationtest to output something in
a specific order, especially since we don't actually care about the order
in the test.

But why do we need that output at all?

It's good to think of ways we can output differently for more stability.
However, ordered output helps the human studying what happened. Today's
output is well-optimized for that. Today's output resembles what a person
would see if running a collection of psql sessions. For example, if I'm
reviewing a patch, reviewing the expected spec output feels natural.

I think a simpler solution would be
to mark a step in a way that prevents isolationtester from outputting
detailed information about the completion of that step—just an overview of
the steps launched and completed at the end of the permutation.

My main concern about that is the difficulty of marking steps:

- Best: no markings
- Medium: easy rules for when you need a marking
- Worst: run on FreeBSD to determine marking requirements by experiment

I've not been able to identify an easy rule that avoids "run on FreeBSD" with
today's framework. (I have a non-easy list of four rules, but I'm not
confident even that list is complete.) I can't think of an easy rule for
"just an overview" markings, either. Even if there is one, not needing
markings is better, all else being equal. What other advantages may
compensate for this concern?

There's another, minor concern. The more "just an overview" markings in a
spec, the more reading its expected output wouldn't feel like reading psql
transcripts. I'd likely end up logging both today's ordered format and the
new format. We'd diff the new format only, and we'd write the old format for
human debugging.

In my tests, I evaluate how [CREATE|RE] INDEX CONCURRENTLY interferes with
UPSERTs. I’m not testing the exact ordering; I simply set the backends into
a specific state (with different index information) and let them proceed
one by one. I don’t care which one finishes first or last—I’m only
interested in identifying errors in the output.

That makes sense.

Thanks,
nm