Testing of MVCC

Started by Matt Miller · over 20 years ago · 31 messages · hackers, general
#1Matt Miller
mattm@epx.com
hackersgeneral

I want to write some regression tests that confirm the behavior of
multiple connections simultaneously going at the same tables/rows. Is
there something like this already, e.g. in src/test/regress? In
particular I want to confirm the robustness of some PL/pgSQL functions in a
multi-user environment. I could probably just bang away from multiple
interactive psql sessions, but I like to script the whole thing, and I'd
like it all to fit neatly into the current "make check" regression
tests.

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Matt Miller (#1)
hackersgeneral
Re: Testing of MVCC

Matt Miller <mattm@epx.com> writes:

I want to write some regression tests that confirm the behavior of
multiple connections simultaneously going at the same tables/rows. Is
there something like this already, e.g. in src/test/regress?

No. You should consult the pghackers archives --- there have been
discussions in the past about creating a test harness that would support
useful concurrent testing. No one's gotten around to it yet, but surely
we need one.

regards, tom lane

#3Karsten Hilbert
Karsten.Hilbert@gmx.net
In reply to: Tom Lane (#2)
hackersgeneral
Re: Testing of MVCC

Matt Miller <mattm@epx.com> writes:

I want to write some regression tests that confirm the behavior of
multiple connections simultaneously going at the same tables/rows. Is
there something like this already, e.g. in src/test/regress?

No. You should consult the pghackers archives --- there have been
discussions in the past about creating a test harness that would support
useful concurrent testing. No one's gotten around to it yet, but surely
we need one.

There's something *somewhat* related here:

http://savannah.gnu.org/cgi-bin/viewcvs/gnumed/gnumed/gnumed/client/testing/concurrency-torture-test.py

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346

#4Richard D Levine
Richard_D_Levine@raytheon.com
In reply to: Karsten Hilbert (#3)
hackersgeneral
Re: Testing of MVCC
Firebird has MVCC also (they call it multi-generational record architecture
--- MGRA), and may have at least a good test plan, though it may not cover
effects of rules, triggers, functions, and constraints.  Those are the
killer test cases.  I don't have time to look.

http://firebird.sourceforge.net/

Rick

pgsql-general-owner@postgresql.org wrote on 08/09/2005 02:19:56 PM:

Matt Miller <mattm@epx.com> writes:

I want to write some regression tests that confirm the behavior of
multiple connections simultaneously going at the same tables/rows. Is
there something like this already, e.g. in src/test/regress?

No. You should consult the pghackers archives --- there have been
discussions in the past about creating a test harness that would support
useful concurrent testing. No one's gotten around to it yet, but surely
we need one.

There's something *somewhat* related here:

http://savannah.gnu.org/cgi-bin/viewcvs/gnumed/gnumed/gnumed/client/testing/concurrency-torture-test.py

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346

#5Matt Miller
mattm@epx.com
In reply to: Tom Lane (#2)
hackersgeneral
Re: [GENERAL] Testing of MVCC

On Mon, 2005-08-08 at 16:59 -0400, Tom Lane wrote:

Matt Miller <mattm@epx.com> writes:

I want to write some regression tests that confirm the behavior of
multiple connections simultaneously going at the same tables/rows. Is
there something like this already, e.g. in src/test/regress?

No. ... but surely we need one.

It seems to me that contrib/dblink could greatly simplify the design and
coding of multi-user regression tests. Is there objection to a portion
of src/test/regress depending on contrib/dblink? I'm not sure yet how
that dependency would look, but I'm mainly wondering if there are
objections in principle to depending on contrib/.

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Matt Miller (#5)
hackersgeneral
Re: [GENERAL] Testing of MVCC

Matt Miller <mattm@epx.com> writes:

It seems to me that contrib/dblink could greatly simplify the design and
coding of multi-user regression tests. Is there objection to a portion
of src/test/regress depending on contrib/dblink?

Yes. Given the difficulties we had in getting the contrib/dblink
regression tests to pass in the buildfarm, and the environmental
sensitivity it has, I don't think making the core tests depend on it
is a hot idea. In any case I doubt it would be very useful, since
a script based on that still doesn't let you issue concurrent queries.

regards, tom lane

#7Matt Miller
mattm@epx.com
In reply to: Tom Lane (#6)
hackersgeneral
Re: [GENERAL] Testing of MVCC

On Wed, 2005-08-10 at 16:41 -0400, Tom Lane wrote:

Matt Miller <mattm@epx.com> writes:

It seems to me that contrib/dblink could greatly simplify the design and
coding of multi-user regression tests.

I doubt it would be very useful, since
a script based on that still doesn't let you issue concurrent queries.

I think it would be useful to allow a test script to first create a set
of committed and uncommitted transactions, and to then issue some
queries on another connection to confirm that the other connection has a
proper view of the database at that point. This type of test is
serialized, but I think it would be a useful multi-user test. Also, the
output from such a test is probably pretty easy to fit into the
diff-based validation of "make check."
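
A minimal sketch of the serialized kind of test described above: one connection leaves a transaction uncommitted, and a second connection checks what it can see before and after the commit. To keep the example self-contained it uses Python's standard-library sqlite3 module as a stand-in for a PostgreSQL connection, so it illustrates the shape of the test rather than PostgreSQL's MVCC itself. The table name, key values, and function name are hypothetical.

```python
import os
import sqlite3
import tempfile


def run_visibility_test(db_path):
    """Run a fixed, serialized sequence on two connections; return output lines."""
    out = []
    writer = sqlite3.connect(db_path)
    reader = sqlite3.connect(db_path)
    try:
        writer.execute("CREATE TABLE t (k TEXT PRIMARY KEY)")
        writer.execute("INSERT INTO t VALUES ('committed')")
        writer.commit()

        # Open a transaction on the writer and leave it uncommitted.
        writer.execute("INSERT INTO t VALUES ('pending')")

        def snapshot(label):
            # Fetch everything so the reader holds no lock after this call.
            rows = reader.execute("SELECT k FROM t ORDER BY k").fetchall()
            out.append(label + ": " + ",".join(k for (k,) in rows))

        snapshot("reader before commit")
        writer.commit()
        snapshot("reader after commit")
    finally:
        reader.close()
        writer.close()
    return out


def main():
    with tempfile.TemporaryDirectory() as tmp:
        for line in run_visibility_test(os.path.join(tmp, "mvcc_demo.db")):
            print(line)


if __name__ == "__main__":
    main()
```

Because every step is issued in a fixed order by a single driver, the printed lines are the same on every run, which is what makes this style of test easy to compare against an expected-output file.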

I realize that we also need to have tests that spawn several connections
and run scripts concurrently across those connections. I agree that
this type of test would probably not benefit fundamentally from
contrib/dblink. However, I was grasping a bit to see how the output
from such a concurrent test would be diff'ed with an expected output in
a meaningful way. So, to continue to progress on this problem, I
figured that a contrib/dblink dependency would at least allow me to
start coding something...

Is there objection to a portion
of src/test/regress depending on contrib/dblink?

Yes.

Understood.

#8Matt Miller
mattm@epx.com
In reply to: Matt Miller (#5)
hackersgeneral
Re: Testing of MVCC

On Mon, 2005-08-08 at 16:59 -0400, Tom Lane wrote:

Matt Miller <mattm@epx.com> writes:

I want to write some regression tests that confirm the behavior of
multiple connections simultaneously going at the same tables/rows. Is
there something like this already, e.g. in src/test/regress?

No. ... but surely we need one.

The attached patch allows src/test/regress/pg_regress.sh to recognize
lines that begin with "curr_test:" in the schedule file. Tests named on
such a line are run concurrently across multiple connections. To make
use of this facility each test in the group must begin with the line:

select * from concurrency_test where key = '<test_name>' for update;

where <test_name> is replaced by the name of that test. This will enable
pg_regress to start this test at the same time as the other tests in the
group.
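
As a rough illustration of the proposal, the sketch below parses schedule text that mixes the existing "test:" and "ignore:" directives with the proposed "curr_test:" directive, and builds the first statement each concurrent test is required to start with. It is written in Python rather than in the shell of pg_regress.sh, and the parsing rules, function names, and the lock_a/lock_b test names are assumptions; the attached patch may work differently.

```python
def parse_schedule(text):
    """Return a list of (directive, [test names]) tuples from schedule text."""
    # "curr_test" is the directive proposed in this message, not an existing one.
    known = {"test", "ignore", "curr_test"}
    entries = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        directive, sep, rest = line.partition(":")
        directive = directive.strip()
        if not sep or directive not in known:
            raise ValueError(f"line {lineno}: unrecognized schedule line: {raw!r}")
        entries.append((directive, rest.split()))
    return entries


def gate_statement(test_name):
    """First line each concurrent test must begin with, per the proposal."""
    return f"select * from concurrency_test where key = '{test_name}' for update;"


SCHEDULE = """\
# ordinary test group, then a hypothetical concurrent group
test: boolean char
curr_test: lock_a lock_b
"""

if __name__ == "__main__":
    for directive, names in parse_schedule(SCHEDULE):
        print(directive, names)
        if directive == "curr_test":
            for name in names:
                print("  ", gate_statement(name))
```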

Is this a reasonable starting point for a concurrent testing framework?

This does not address the issue of how to interpret the test output.
Maybe the simplest solution is to force test writers to generate output
that does not depend on the relative progress of any concurrent tests.
Or, maybe the "ignore:" directive in the schedule file could be employed
somehow.
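
One way to keep the report layout independent of relative progress is to buffer each test's output separately and emit the buffers in schedule order once all tests have finished. The sketch below shows that idea with plain Python threads standing in for test sessions; the function names and the random sleeps are illustrative assumptions. It only stabilizes how output is arranged: each test's own results must still not depend on what the other tests are doing.

```python
import random
import threading
import time


def run_concurrently(tests):
    """Run {name: callable} concurrently; return output grouped per test.

    Each callable receives an emit(line) function that appends to that
    test's private buffer, so interleaving never shows up in the report.
    """
    buffers = {name: [] for name in tests}
    threads = [
        threading.Thread(target=body, args=(buffers[name].append,))
        for name, body in tests.items()
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    report = []
    for name in tests:  # dicts keep insertion order, i.e. schedule order
        report.append(f"== {name} ==")
        report.extend(buffers[name])
    return report


def make_test(label):
    """Build a fake test body that emits three lines with random delays."""
    def body(emit):
        for step in range(3):
            time.sleep(random.uniform(0, 0.01))
            emit(f"{label} step {step}")
    return body


if __name__ == "__main__":
    for line in run_concurrently({"a": make_test("a"), "b": make_test("b")}):
        print(line)
```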

Attachments:

curr_test.patch (text/x-patch; charset=UTF-8), +76 -65

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Matt Miller (#8)
hackersgeneral
Re: Testing of MVCC

Matt Miller <mattm@epx.com> writes:

The attached patch allows src/test/regress/pg_regress.sh to recognize
lines that begin with "curr_test:" in the schedule file. Tests named on
such a line are run concurrently across multiple connections.

This doesn't seem like any advance over the existing parallel-test
facility. Synchronizing the test starts slightly more closely
isn't really going to buy anything: you still can't control or even
predict relative progress.

Maybe the simplest solution is to force test writers to generate output
that does not depend on the relative progress of any concurrent tests.

Well, that's exactly the situation we have now, and it's not really
adequate.

What we really need is a test program that can issue a command on one
connection (perhaps waiting for it to finish, perhaps not) and then
issue other commands on other connections, all according to a script.
I am unsure that the existing pg_regress infrastructure is the right
place to start from. Perhaps we should look at Expect or something
similar.
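
A minimal sketch of the kind of driver described above, written in Python for brevity. Each named session gets its own worker thread, and a small script tells the driver to send a command and wait ("run"), send without waiting ("start"), or collect an earlier result ("wait"). The script syntax, the class and function names, and the FakeBackend (a single shared lock standing in for a database server) are all hypothetical; a real harness would replace FakeBackend with calls through an actual client library and would need timeouts.

```python
import queue
import threading


class Session:
    """One named connection, served by its own worker thread."""

    def __init__(self, name, execute):
        self.name = name
        self._execute = execute
        self._inbox = queue.Queue()
        self._worker = threading.Thread(target=self._serve, daemon=True)
        self._worker.start()

    def _serve(self):
        # Process commands in order until the None sentinel arrives.
        for command, done, result in iter(self._inbox.get, None):
            try:
                result.append(self._execute(self.name, command))
            except Exception as exc:
                result.append(f"ERROR: {exc}")
            done.set()

    def send(self, command):
        """Queue a command and return immediately with a ticket for its result."""
        done, result = threading.Event(), []
        self._inbox.put((command, done, result))
        return command, done, result

    def close(self):
        self._inbox.put(None)
        self._worker.join(timeout=1.0)


def run_script(script, execute):
    """Interpret 'run|start|wait <session> [command]' lines; return a log."""
    sessions, pending, log = {}, {}, []

    def finish(name, command, done, result):
        done.wait()
        log.append(f"{name}: {command} -> {result[0]}")

    try:
        for raw in script.splitlines():
            line = raw.strip()
            if not line or line.startswith("#"):
                continue
            verb, name, *rest = line.split(None, 2)
            if name not in sessions:
                sessions[name] = Session(name, execute)
            if verb == "wait":      # collect the result of an earlier "start"
                finish(name, *pending.pop(name))
            elif verb == "run":     # send and wait for completion
                finish(name, *sessions[name].send(rest[0]))
            elif verb == "start":   # send, but do not wait
                pending[name] = sessions[name].send(rest[0])
                log.append(f"{name}: {rest[0]} -> (started)")
            else:
                raise ValueError(f"unknown verb: {verb!r}")
    finally:
        for session in sessions.values():
            session.close()
    return log


class FakeBackend:
    """Toy stand-in for a server: one exclusive lock shared by all sessions."""

    def __init__(self):
        self._lock = threading.Lock()

    def __call__(self, session, command):
        if command == "LOCK":
            self._lock.acquire()  # blocks while another session holds the lock
        elif command == "UNLOCK":
            self._lock.release()
        else:
            raise ValueError(f"unsupported command: {command}")
        return "ok"


DEMO_SCRIPT = """\
run   c1 LOCK
start c2 LOCK
run   c1 UNLOCK
wait  c2
run   c2 UNLOCK
"""

if __name__ == "__main__":
    for entry in run_script(DEMO_SCRIPT, FakeBackend()):
        print(entry)
```

The log is written only by the driver, in script order, so it stays the same from run to run even though the second session's LOCK is blocked in the background until the first session unlocks.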

regards, tom lane

#10Matt Miller
mattm@epx.com
In reply to: Tom Lane (#9)
hackersgeneral
Re: Testing of MVCC

What we really need is a test program that can issue a command on one
connection (perhaps waiting for it to finish, perhaps not) and then
issue other commands on other connections, all according to a script.

It seems to me that this is what contrib/dblink could allow, but when I
presented that idea earlier you replied:

I doubt it would be very useful, since a script based on that
still doesn't let you issue concurrent queries.

So, I guess I'm not clear on what you're thinking.

Perhaps we should look at Expect or something similar.

Where can I get more info on Expect?

#11Matt Miller
mattm@epx.com
In reply to: Matt Miller (#10)
hackersgeneral
Re: Testing of MVCC

Perhaps we should look at Expect or something similar.

Where can I get more info on Expect?

I think I found it:

http://expect.nist.gov/

#12Michael Fuhr
mike@fuhr.org
In reply to: Matt Miller (#10)
hackersgeneral
Re: Testing of MVCC

On Mon, Aug 15, 2005 at 10:37:06PM +0000, Matt Miller wrote:

Perhaps we should look at Expect or something similar.

Where can I get more info on Expect?

http://www.google.com/

:-)

Or here:

http://expect.nist.gov/

--
Michael Fuhr

#13Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#9)
hackersgeneral
Re: Testing of MVCC

Tom Lane wrote:

What we really need is a test program that can issue a command on one
connection (perhaps waiting for it to finish, perhaps not) and then
issue other commands on other connections, all according to a script.
I am unsure that the existing pg_regress infrastructure is the right
place to start from. Perhaps we should look at Expect or something
similar.

Or else a harness that operates at the library/connection level rather
than trying to control a tty app.

Expect is very cool, but it would impose an extra dependency on tcl that
we don't now have for building and testing, and I am not sure how easy
or even possible it is to get it to work in a satisfactory way on
Windows. The NIST site says it's in AS Tcl, but in the docs that
accompany my copy of same, it says "Unix only" on the Expect manual page.

Just some words of caution.

One other note: please be very careful in changing pg_regress.sh -
getting it right especially on Windows was very time consuming, and it
is horribly fragile.

cheers

andrew

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#13)
hackersgeneral
Re: Testing of MVCC

Andrew Dunstan <andrew@dunslane.net> writes:

Or else a harness that operates at the library/connection level rather
than trying to control a tty app.

Right. What is sort of in the back of my mind is a C program that can
open more than one connection, and it reads a script that tells it
"fire this command out on this connection". The question at hand is
whether we can avoid re-inventing the wheel.

Expect is very cool, but it would impose an extra dependency on tcl that
we don't now have for building and testing,

True. I was pointing to it more as an example of the sorts of tools
people have built for this type of problem.

I'm pretty sure there are re-implementations of Expect out there that
don't use Tcl; would you be happier with, say, a perl-based tool?

regards, tom lane

#15Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#14)
hackersgeneral
Re: Testing of MVCC

Tom Lane wrote:

Expect is very cool, but it would impose an extra dependency on tcl that
we don't now have for building and testing,

True. I was pointing to it more as an example of the sorts of tools
people have built for this type of problem.

I'm pretty sure there are re-implementations of Expect out there that
don't use Tcl; would you be happier with, say, a perl-based tool?

Yes, because we already have a dependency on perl. But don't be
surprised if we can't find such a beast, especially one that runs under
the weird MSys DTK perl - I won't even begin to tell you the nightmares
that caused with getting buildfarm to work on Windows.

BTW, further reading indicates that AS Expect does exist for Windows,
but it's a commercial offering, not a free one. Others appear to be
somewhat limited in value, but I could be wrong.

cheers

andrew

#16Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#15)
hackersgeneral
Re: Testing of MVCC

Andrew Dunstan <andrew@dunslane.net> writes:

Tom Lane wrote:

I'm pretty sure there are re-implementations of Expect out there that
don't use Tcl; would you be happier with, say, a perl-based tool?

Yes, because we already have a dependency on perl. But don't be
surprised if we can't find such a beast, especially one that runs under
the weird MSys DTK perl -

[ digs... ] It looks like what I was remembering is
http://search.cpan.org/~lbrocard/Test-Expect-0.29/lib/Test/Expect.pm
which seems to leave all the interesting problems (like driving more
than one program-under-test) to the user's own devices. Sigh.

regards, tom lane

#17Greg Stark
gsstark@mit.edu
In reply to: Tom Lane (#16)
hackersgeneral
Re: Testing of MVCC

Tom Lane <tgl@sss.pgh.pa.us> writes:

[ digs... ] It looks like what I was remembering is
http://search.cpan.org/~lbrocard/Test-Expect-0.29/lib/Test/Expect.pm
which seems to leave all the interesting problems (like driving more
than one program-under-test) to the user's own devices. Sigh.

The goal here is to find race conditions in the server, right? There's no real
chance of any race condition errors in psql as far as I can see, perhaps in
the \commands but I don't think that's what you're worried about here.

So why bother with driving multiple invocations of psql under Expect. Just use
DBD::Pg to open as many connections as you want and issue whatever queries you
want.

The driver program would be really simple. I'm not sure if you would specify
the series of queries with a perl data structure or define a text file format
that it would parse. Either seems pretty straightforward.

If you're worried about adding a dependency on DBD::Pg which would create a
circular dependency, well, it's just the test harness, it would just mean
someone would have to go build DBD::Pg before running the tests. (Personally
my inclination would be to break the cycle by including DBD::Pg in core but
that seems to be an uphill battle these days.)

--
greg

#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Greg Stark (#17)
hackersgeneral
Re: Testing of MVCC

Greg Stark <gsstark@mit.edu> writes:

So why bother with driving multiple invocations of psql under
Expect. Just use DBD::Pg to open as many connections as you want and
issue whatever queries you want.

The bit that I think is missing in DBI is "issue a command and don't
wait for the result just yet". Without that, you cannot for instance
stack up several waiters for the same lock, as you might wish to do to
verify that they get released in the correct order once the original
lock holder goes away. Or stack up some conflicting waiters and check
to see if deadlock is detected when it should be ... or contrariwise,
not signalled when it should not be. There's lots of stuff you can
do that isn't exactly probing for race conditions, yet would be awfully
nice to check for in a routine test suite.
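
The lock-ordering check described above can be sketched with a toy model. Below, FifoLock is a hypothetical in-process lock that grants waiters in arrival order and exposes its wait queue. The harness starts each waiter without waiting for it to finish, but does not start the next one until the previous one is visibly queued, then releases the holder and records the grant order. Against a real server, the analogue of waiters() would presumably be a query on lock state (for example the pg_locks view) from a separate connection. Everything here is an illustration in Python, not a DBI or libpq example.

```python
import threading
import time
from collections import deque


class FifoLock:
    """Toy lock that grants waiters in arrival order and exposes its queue."""

    def __init__(self):
        self._cond = threading.Condition()
        self._holder = None
        self._queue = deque()

    def acquire(self, who):
        with self._cond:
            self._queue.append(who)
            # Wait until the lock is free and we are at the head of the queue.
            while self._holder is not None or self._queue[0] != who:
                self._cond.wait()
            self._queue.popleft()
            self._holder = who

    def release(self, who):
        with self._cond:
            assert self._holder == who
            self._holder = None
            self._cond.notify_all()

    def waiters(self):
        with self._cond:
            return list(self._queue)


def wait_until(predicate, timeout=5.0):
    """Poll until predicate() is true, or raise TimeoutError."""
    deadline = time.monotonic() + timeout
    while not predicate():
        if time.monotonic() > deadline:
            raise TimeoutError("condition not reached")
        time.sleep(0.001)


def check_release_order(names):
    """Stack up waiters in a known order and return the order they were granted."""
    lock, granted = FifoLock(), []

    def session(name):
        lock.acquire(name)
        granted.append(name)  # recorded while holding the lock
        lock.release(name)

    lock.acquire("holder")
    threads = []
    for count, name in enumerate(names, start=1):
        thread = threading.Thread(target=session, args=(name,))
        thread.start()
        threads.append(thread)
        # Do not start the next waiter until this one is visibly queued.
        wait_until(lambda: len(lock.waiters()) == count)
    lock.release("holder")
    for thread in threads:
        thread.join(timeout=5.0)
    return granted


if __name__ == "__main__":
    print(check_release_order(["s1", "s2", "s3"]))
```

The important step is the wait between launches: without confirming that each waiter is actually blocked before starting the next, the arrival order is not known and the release order cannot be checked.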

I might be wrong though, not being exactly a DBI guru ... can this
sort of thing be done?

regards, tom lane

#19Tino Wildenhain
tino@wildenhain.de
In reply to: Tom Lane (#18)
hackersgeneral
Re: Testing of MVCC

Tom Lane schrieb:

Greg Stark <gsstark@mit.edu> writes:

So why bother with driving multiple invocations of psql under
Expect. Just use DBD::Pg to open as many connections as you want and
issue whatever queries you want.

The bit that I think is missing in DBI is "issue a command and don't
wait for the result just yet". Without that, you cannot for instance
stack up several waiters for the same lock, as you might wish to do to
verify that they get released in the correct order once the original
lock holder goes away. Or stack up some conflicting waiters and check
to see if deadlock is detected when it should be ... or contrariwise,
not signalled when it should not be. There's lots of stuff you can
do that isn't exactly probing for race conditions, yet would be awfully
nice to check for in a routine test suite.

I might be wrong though, not being exactly a DBI guru ... can this
sort of thing be done?

I wonder if you don't have a wrapper around libpq you can use like that?

#20Andrew Piskorski
atp@piskorski.com
In reply to: Tom Lane (#9)
hackersgeneral
Re: Testing of MVCC

On Mon, Aug 15, 2005 at 06:01:20PM -0400, Tom Lane wrote:

What we really need is a test program that can issue a command on one
connection (perhaps waiting for it to finish, perhaps not) and then
issue other commands on other connections, all according to a script.

Well, using Tcl with its Tcl Threads Extension should certainly let
you easily control multiple concurrent PostgreSQL connections. (The
Thread Extension's APIs are particularly nice for multi-threaded
programming.) Its docs are here:

http://cvs.sourceforge.net/viewcvs.py/tcl/thread/doc/html/

I am unsure that the existing pg_regress infrastructure is the right
place to start from. Perhaps we should look at Expect or something
similar.

I don't have any clear idea of what sort of tests you want to run
"according to a script" though, so I'm not sure whether the Tcl
Threads Extension, or Expect, or some other tool would best meet your
needs.

--
Andrew Piskorski <atp@piskorski.com>
http://www.piskorski.com/

#21Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tino Wildenhain (#19)
hackersgeneral
#22Tino Wildenhain
tino@wildenhain.de
In reply to: Tom Lane (#21)
hackersgeneral
#23Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#18)
hackersgeneral
#24Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#21)
hackersgeneral
#25Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#24)
hackersgeneral
#26Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#25)
hackersgeneral
#27Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#26)
hackersgeneral
#28Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#26)
hackersgeneral
#29Kaare Rasmussen
kar@kakidata.dk
In reply to: Andrew Dunstan (#26)
hackersgeneral
#30Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Tom Lane (#18)
hackersgeneral
#31Bruce Momjian
bruce@momjian.us
In reply to: Jim Nasby (#30)
hackersgeneral