Testing of MVCC

Started by Matt Miller over 20 years ago · 31 messages
#1 Matt Miller
mattm@epx.com

I want to write some regression tests that confirm the behavior of
multiple connections simultaneously going at the same tables/rows. Is
there something like this already, e.g. in src/test/regress? In
particular I want to confirm the robustness of some PL/pgSQL functions in a
multi-user environment. I could probably just bang away from multiple
interactive psql sessions, but I'd like to script the whole thing, and I'd
like it all to fit neatly into the current "make check" regression
tests.

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Matt Miller (#1)
Re: Testing of MVCC

Matt Miller <mattm@epx.com> writes:

I want to write some regression tests that confirm the behavior of
multiple connections simultaneously going at the same tables/rows. Is
there something like this already, e.g. in src/test/regress?

No. You should consult the pghackers archives --- there have been
discussions in the past about creating a test harness that would support
useful concurrent testing. No one's gotten around to it yet, but surely
we need one.

regards, tom lane

#3 Karsten Hilbert
Karsten.Hilbert@gmx.net
In reply to: Tom Lane (#2)
Re: Testing of MVCC

Matt Miller <mattm@epx.com> writes:

I want to write some regression tests that confirm the behavior of
multiple connections simultaneously going at the same tables/rows. Is
there something like this already, e.g. in src/test/regress?

No. You should consult the pghackers archives --- there have been
discussions in the past about creating a test harness that would support
useful concurrent testing. No one's gotten around to it yet, but surely
we need one.

There's something *somewhat* related here:

http://savannah.gnu.org/cgi-bin/viewcvs/gnumed/gnumed/gnumed/client/testing/concurrency-torture-test.py

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346

#4 Noname
Richard_D_Levine@raytheon.com
In reply to: Karsten Hilbert (#3)
Re: Testing of MVCC
Firebird has MVCC also (they call it multi-generational record architecture
--- MGRA), and may have at least a good test plan, though it may not cover
effects of rules, triggers, functions, and constraints.  Those are the
killer test cases.  I don't have time to look.

http://firebird.sourceforge.net/

Rick

pgsql-general-owner@postgresql.org wrote on 08/09/2005 02:19:56 PM:

Matt Miller <mattm@epx.com> writes:

I want to write some regression tests that confirm the behavior of
multiple connections simultaneously going at the same tables/rows. Is
there something like this already, e.g. in src/test/regress?

No. You should consult the pghackers archives --- there have been
discussions in the past about creating a test harness that would support
useful concurrent testing. No one's gotten around to it yet, but surely
we need one.

There's something *somewhat* related here:

http://savannah.gnu.org/cgi-bin/viewcvs/gnumed/gnumed/gnumed/client/testing/concurrency-torture-test.py

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346


#5 Matt Miller
mattm@epx.com
In reply to: Tom Lane (#2)
Re: [GENERAL] Testing of MVCC

On Mon, 2005-08-08 at 16:59 -0400, Tom Lane wrote:

Matt Miller <mattm@epx.com> writes:

I want to write some regression tests that confirm the behavior of
multiple connections simultaneously going at the same tables/rows. Is
there something like this already, e.g. in src/test/regress?

No. ... but surely we need one.

It seems to me that contrib/dblink could greatly simplify the design and
coding of multi-user regression tests. Is there objection to a portion
of src/test/regress depending on contrib/dblink? I'm not sure yet how
that dependency would look, but I'm mainly wondering if there are
objections in principle to depending on contrib/.

#6 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Matt Miller (#5)
Re: [GENERAL] Testing of MVCC

Matt Miller <mattm@epx.com> writes:

It seems to me that contrib/dblink could greatly simplify the design and
coding of multi-user regression tests. Is there objection to a portion
of src/test/regress depending on contrib/dblink?

Yes. Given the difficulties we had in getting the contrib/dblink
regression tests to pass in the buildfarm, and the environmental
sensitivity it has, I don't think making the core tests depend on it
is a hot idea. In any case I doubt it would be very useful, since
a script based on that still doesn't let you issue concurrent queries.

regards, tom lane

#7 Matt Miller
mattm@epx.com
In reply to: Tom Lane (#6)
Re: [GENERAL] Testing of MVCC

On Wed, 2005-08-10 at 16:41 -0400, Tom Lane wrote:

Matt Miller <mattm@epx.com> writes:

It seems to me that contrib/dblink could greatly simplify the design and
coding of multi-user regression tests.

I doubt it would be very useful, since
a script based on that still doesn't let you issue concurrent queries.

I think it would be useful to allow a test script to first create a set
of committed and uncommitted transactions, and to then issue some
queries on another connection to confirm that the other connection has a
proper view of the database at that point. This type of test is
serialized, but I think it would be a useful multi-user test. Also, the
output from such a test is probably pretty easy to fit into the
diff-based validation of "make check."
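The serialized visibility check described here can be sketched without dblink at all. The following illustration uses Python with SQLite purely as a stand-in for two PostgreSQL connections (no server assumed, and the database name is hypothetical): one connection leaves a transaction open while a second connection verifies what it can and cannot see, before and after COMMIT, producing exactly the kind of fixed, diff-able output "make check" needs.

```python
import os
import sqlite3
import tempfile

# SQLite stands in for PostgreSQL here, just to show the shape of the test.
path = os.path.join(tempfile.mkdtemp(), "visibility_demo.db")
writer = sqlite3.connect(path, isolation_level=None)  # manual transactions
reader = sqlite3.connect(path, isolation_level=None)

writer.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")
writer.execute("INSERT INTO t VALUES (1)")

writer.execute("BEGIN")
writer.execute("INSERT INTO t VALUES (2)")  # deliberately left uncommitted

# The reader must not see the uncommitted row ...
before_rows = reader.execute("SELECT id FROM t ORDER BY id").fetchall()

writer.execute("COMMIT")

# ... but must see it once the writer commits.
after_rows = reader.execute("SELECT id FROM t ORDER BY id").fetchall()

print(before_rows, after_rows)
```

Because every step is serialized, the output is deterministic and diffs cleanly against an expected file.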

I realize that we also need to have tests that spawn several connections
and run scripts concurrently across those connections. I agree that
this type of test would probably not benefit fundamentally from
contrib/dblink. However, I was grasping a bit to see how the output
from such a concurrent test would be diff'ed with an expected output in
a meaningful way. So, to continue to progress on this problem, I
figured that a contrib/dblink dependency would at least allow me to
start coding something...

Is there objection to a portion
of src/test/regress depending on contrib/dblink?

Yes.

Understood.

#8 Matt Miller
mattm@epx.com
In reply to: Matt Miller (#5)
1 attachment(s)
Re: Testing of MVCC

On Mon, 2005-08-08 at 16:59 -0400, Tom Lane wrote:

Matt Miller <mattm@epx.com> writes:

I want to write some regression tests that confirm the behavior of
multiple connections simultaneously going at the same tables/rows. Is
there something like this already, e.g. in src/test/regress?

No. ... but surely we need one.

The attached patch allows src/test/regress/pg_regress.sh to recognize
lines that begin with "curr_test:" in the schedule file. Tests named on
such a line are run concurrently across multiple connections. To make
use of this facility each test in the group must begin with the line:

select * from concurrency_test where key = '<test_name>' for update;

where <test_name> is replace by the name of that test. This will enable
pg_regress to start this test at the same time as the other tests in the
group.

Is this a reasonable starting point for a concurrent testing framework?

This does not address the issue of how to interpret the test output.
Maybe the simplest solution is to force test writers to generate output
that does not depend on the relative progress of any concurrent tests.
Or, maybe the "ignore:" directive in the schedule file could be employed
somehow.
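The handshake the patch builds with SELECT ... FOR UPDATE is a classic start barrier: the coordinator locks one row per test, each test blocks on its own row as its first step, and releasing all the locks at once launches every test simultaneously. A minimal sketch of the same idea in Python threads (all names here are illustrative, not from the patch):

```python
import threading

tests = ["test_a", "test_b", "test_c"]
locks = {name: threading.Lock() for name in tests}  # one "row" per test
started = []

def run_test(name):
    # Equivalent of "select * from concurrency_test where key = ... for update":
    # block until the coordinator releases this test's lock.
    with locks[name]:
        pass
    started.append(name)  # the real test body would run here

# Coordinator: grab every lock before any test is spawned.
for lock in locks.values():
    lock.acquire()

threads = [threading.Thread(target=run_test, args=(n,)) for n in tests]
for t in threads:
    t.start()  # all three block immediately on their lock

# Equivalent of the patch's ROLLBACK: release all locks, starting every
# test at (nearly) the same instant.
for lock in locks.values():
    lock.release()
for t in threads:
    t.join()
```

As Tom notes in the next message, this only synchronizes the starts; it cannot control relative progress afterward.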

Attachments:

curr_test.patch (text/x-patch; charset=UTF-8)
Index: pg_regress.sh
===================================================================
RCS file: /var/local/pgcvs/pgsql/src/test/regress/pg_regress.sh,v
retrieving revision 1.59
diff -c -r1.59 pg_regress.sh
*** pg_regress.sh	17 Jul 2005 18:28:45 -0000	1.59
--- pg_regress.sh	15 Aug 2005 21:20:03 -0000
***************
*** 623,628 ****
--- 623,632 ----
  do
      # Count line numbers
      lno=`expr $lno + 1`
+ 
+     # Init concurrency flag
+     concurrent=
+ 
      [ -z "$line" ] && continue
  
      set X $line; shift
***************
*** 631,636 ****
--- 635,647 ----
          shift
          ignore_list="$ignore_list $@"
          continue
+     elif [ x"$1" = x"curr_test:" ]; then
+         # init support for concurrent test group
+         concurrent=1
+         cat /dev/null >"$inputdir/sql/concurrency_test_init.sql"
+         echo "create table concurrency_test (key varchar primary key);" >>"$inputdir/sql/concurrency_test_init.sql"
+         ( $PSQL -d "$dbname" <"$inputdir/sql/concurrency_test_init.sql" >"$outputdir/results/concurrency_test_init.out" 2>&1 )&
+         wait
      elif [ x"$1" != x"test:" ]; then
          echo "$me:$schedule:$lno: syntax error"
          (exit 2); exit
***************
*** 649,671 ****
          ( $PSQL -d "$dbname" <"$inputdir/sql/$1.sql" >"$outputdir/results/$1.out" 2>&1 )&
          wait
      else
!         # Start a parallel group
!         $ECHO_N "parallel group ($# tests): $ECHO_C"
!         if [ $maxconnections -gt 0 ] ; then
!             connnum=0
!             test $# -gt $maxconnections && $ECHO_N "(in groups of $maxconnections) $ECHO_C"
!         fi
!         for name do
!             ( 
!               $PSQL -d "$dbname" <"$inputdir/sql/$name.sql" >"$outputdir/results/$name.out" 2>&1
!               $ECHO_N " $name$ECHO_C"
!             ) &
              if [ $maxconnections -gt 0 ] ; then
!                 connnum=`expr \( $connnum + 1 \) % $maxconnections`
!                 test $connnum -eq 0 && wait
              fi
!         done
!         wait
          echo
      fi
  
--- 660,717 ----
          ( $PSQL -d "$dbname" <"$inputdir/sql/$1.sql" >"$outputdir/results/$1.out" 2>&1 )&
          wait
      else
!         # ----------
!         # If this is a concurrent test group then write the script "concurrency_test.sql"
!         # which will spawn and synchronize each test in the group.
!         #
!         # Concurrent test groups do not respect $maxconnections.
!         #
!         # If this is not a concurrent test group then just run each test directly.
!         # ----------
! 
!         if [ "$concurrent" = "1" ]; then
!             $ECHO_N "concurrent group ($# tests): $ECHO_C"
! 
!             # insert a lock record for each test
!             cat /dev/null >"$inputdir/sql/concurrency_test.sql"
!             echo "BEGIN;" >>"$inputdir/sql/concurrency_test.sql"
!             for name do
!                 echo "insert into concurrency_test values ('$name');" >>"$inputdir/sql/concurrency_test.sql"
!             done
!             echo "COMMIT;" >>"$inputdir/sql/concurrency_test.sql"
! 
!             # for each test, acquire the lock and then spawn the test
!             echo "BEGIN;" >>"$inputdir/sql/concurrency_test.sql"
!             for name do
!                 echo "select * from concurrency_test where key = '$name' for update;" >>"$inputdir/sql/concurrency_test.sql"
!                 echo "\! $PSQL -d \"$dbname\" <\"$inputdir/sql/$name.sql\" >\"$outputdir/results/$name.out\" 2>&1 &" >>"$inputdir/sql/concurrency_test.sql"
!             done
! 
!             # release all locks, concurrently launching all tests
!             echo "ROLLBACK;" >>"$inputdir/sql/concurrency_test.sql"
! 
!             # done writing the script.  fire it.
!             ( $PSQL -d "$dbname" <"$inputdir/sql/concurrency_test.sql" >"$outputdir/results/concurrency_test.out" 2>&1 )&
!             wait
!         else
!             $ECHO_N "parallel group ($# tests): $ECHO_C"
              if [ $maxconnections -gt 0 ] ; then
!                 connnum=0
!                 test $# -gt $maxconnections && $ECHO_N "(in groups of $maxconnections) $ECHO_C"
              fi
! 
!             for name do
!                 (
!                   $PSQL -d "$dbname" <"$inputdir/sql/$name.sql" >"$outputdir/results/$name.out" 2>&1
!                   $ECHO_N " $name$ECHO_C"
!                 ) &
!                 if [ $maxconnections -gt 0 ] ; then
!                     connnum=`expr \( $connnum + 1 \) % $maxconnections`
!                     test $connnum -eq 0 && wait
!                 fi
!             done
!             wait
!         fi
          echo
      fi
  
#9 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Matt Miller (#8)
Re: Testing of MVCC

Matt Miller <mattm@epx.com> writes:

The attached patch allows src/test/regress/pg_regress.sh to recognize
lines that begin with "curr_test:" in the schedule file. Tests named on
such a line are run concurrently across multiple connections.

This doesn't seem like any advance over the existing parallel-test
facility. Synchronizing the test starts slightly more closely
isn't really going to buy anything: you still can't control or even
predict relative progress.

Maybe the simplest solution is to force test writers to generate output
that does not depend on the relative progress of any concurrent tests.

Well, that's exactly the situation we have now, and it's not really
adequate.

What we really need is a test program that can issue a command on one
connection (perhaps waiting for it to finish, perhaps not) and then
issue other commands on other connections, all according to a script.
I am unsure that the existing pg_regress infrastructure is the right
place to start from. Perhaps we should look at Expect or something
similar.

regards, tom lane
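The driver Tom describes — issue a command on a named connection, according to a script — can be prototyped in a few lines. The sketch below is a toy only (SQLite stands in for PostgreSQL, and the script format is invented for illustration): it replays (connection, command) steps and collects query results for later diffing.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "driver_demo.db")
# Two named connections, as a multi-connection test driver would hold.
conns = {name: sqlite3.connect(path, isolation_level=None) for name in ("A", "B")}

# A script: each step names the connection to run the command on.
script = [
    ("A", "CREATE TABLE t (id INTEGER)"),
    ("A", "BEGIN"),
    ("A", "INSERT INTO t VALUES (1)"),
    ("B", "SELECT count(*) FROM t"),  # B must not see A's open transaction
    ("A", "COMMIT"),
    ("B", "SELECT count(*) FROM t"),  # now it must
]

results = []
for conn_name, sql in script:
    cur = conns[conn_name].execute(sql)
    if sql.lstrip().upper().startswith("SELECT"):
        results.append((conn_name, sql, cur.fetchall()))
```

What this toy cannot do is exactly what Tom flags next for DBI: issue a command and *not* wait for it, which is needed to stack up blocked waiters.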

#10 Matt Miller
mattm@epx.com
In reply to: Tom Lane (#9)
Re: Testing of MVCC

What we really need is a test program that can issue a command on one
connection (perhaps waiting for it to finish, perhaps not) and then
issue other commands on other connections, all according to a script.

It seems to me that this is what contrib/dblink could allow, but when I
presented that idea earlier you replied:

I doubt it would be very useful, since a script based on that
still doesn't let you issue concurrent queries.

So, I guess I'm not clear on what you're thinking.

Perhaps we should look at Expect or something similar.

Where can I get more info on Expect?

#11 Matt Miller
mattm@epx.com
In reply to: Matt Miller (#10)
Re: Testing of MVCC

Perhaps we should look at Expect or something similar.

Where can I get more info on Expect?

I think I found it:

http://expect.nist.gov/

#12 Michael Fuhr
mike@fuhr.org
In reply to: Matt Miller (#10)
Re: Testing of MVCC

On Mon, Aug 15, 2005 at 10:37:06PM +0000, Matt Miller wrote:

Perhaps we should look at Expect or something similar.

Where can I get more info on Expect?

http://www.google.com/

:-)

Or here:

http://expect.nist.gov/

--
Michael Fuhr

#13 Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#9)
Re: Testing of MVCC

Tom Lane wrote:

What we really need is a test program that can issue a command on one
connection (perhaps waiting for it to finish, perhaps not) and then
issue other commands on other connections, all according to a script.
I am unsure that the existing pg_regress infrastructure is the right
place to start from. Perhaps we should look at Expect or something
similar.

Or else a harness that operates at the library/connection level rather
than trying to control a tty app.

Expect is very cool, but it would impose an extra dependency on tcl that
we don't now have for building and testing, and I am not sure how easy
or even possible it is to get it to work in a satisfactory way on
Windows. The NIST site says it's in AS Tcl, but in the docs that
accompany my copy of same, it says "Unix only" on the Expect manual page.

Just some words of caution.

One other note: please be very careful in changing pg_regress.sh -
getting it right especially on Windows was very time consuming, and it
is horribly fragile.

cheers

andrew

#14 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#13)
Re: Testing of MVCC

Andrew Dunstan <andrew@dunslane.net> writes:

Or else a harness that operates at the library/connection level rather
than trying to control a tty app.

Right. What is sort of in the back of my mind is a C program that can
open more than one connection, and it reads a script that tells it
"fire this command out on this connection". The question at hand is
whether we can avoid re-inventing the wheel.

Expect is very cool, but it would impose an extra dependency on tcl that
we don't now have for building and testing,

True. I was pointing to it more as an example of the sorts of tools
people have built for this type of problem.

I'm pretty sure there are re-implementations of Expect out there that
don't use Tcl; would you be happier with, say, a perl-based tool?

regards, tom lane

#15 Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#14)
Re: Testing of MVCC

Tom Lane wrote:

Expect is very cool, but it would impose an extra dependency on tcl that
we don't now have for building and testing,

True. I was pointing to it more as an example of the sorts of tools
people have built for this type of problem.

I'm pretty sure there are re-implementations of Expect out there that
don't use Tcl; would you be happier with, say, a perl-based tool?

Yes, because we already have a dependency on perl. But don't be
surprised if we can't find such a beast, especially one that runs under
the weird MSys DTK perl - I won't even begin to tell you the nightmares
that caused with getting buildfarm to work on Windows.

BTW, further reading indicates that AS Expect does exist for Windows,
but it's a commercial offering, not a free one. Others appear to be
somewhat limited in value, but I could be wrong.

cheers

andrew

#16 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#15)
Re: Testing of MVCC

Andrew Dunstan <andrew@dunslane.net> writes:

Tom Lane wrote:

I'm pretty sure there are re-implementations of Expect out there that
don't use Tcl; would you be happier with, say, a perl-based tool?

Yes, because we already have a dependency on perl. But don't be
surprised if we can't find such a beast, especially one that runs under
the weird MSys DTK perl -

[ digs... ] It looks like what I was remembering is
http://search.cpan.org/~lbrocard/Test-Expect-0.29/lib/Test/Expect.pm
which seems to leave all the interesting problems (like driving more
than one program-under-test) to the user's own devices. Sigh.

regards, tom lane

#17 Greg Stark
gsstark@mit.edu
In reply to: Tom Lane (#16)
Re: Testing of MVCC

Tom Lane <tgl@sss.pgh.pa.us> writes:

[ digs... ] It looks like what I was remembering is
http://search.cpan.org/~lbrocard/Test-Expect-0.29/lib/Test/Expect.pm
which seems to leave all the interesting problems (like driving more
than one program-under-test) to the user's own devices. Sigh.

The goal here is to find race conditions in the server, right? There's no real
chance of any race condition errors in psql as far as I can see, perhaps in
the \commands but I don't think that's what you're worried about here.

So why bother with driving multiple invocations of psql under Expect. Just use
DBD::Pg to open as many connections as you want and issue whatever queries you
want.

The driver program would be really simple. I'm not sure if you would specify
the series of queries with a perl data structure or define a text file format
that it would parse. Either seems pretty straightforward.

If you're worried about adding a dependency on DBD::Pg which would create a
circular dependency, well, it's just the test harness, it would just mean
someone would have to go build DBD::Pg before running the tests. (Personally
my inclination would be to break the cycle by including DBD::Pg in core but
that seems to be an uphill battle these days.)

--
greg

#18 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Greg Stark (#17)
Re: Testing of MVCC

Greg Stark <gsstark@mit.edu> writes:

So why bother with driving multiple invocations of psql under
Expect. Just use DBD::Pg to open as many connections as you want and
issue whatever queries you want.

The bit that I think is missing in DBI is "issue a command and don't
wait for the result just yet". Without that, you cannot for instance
stack up several waiters for the same lock, as you might wish to do to
verify that they get released in the correct order once the original
lock holder goes away. Or stack up some conflicting waiters and check
to see if deadlock is detected when it should be ... or contrariwise,
not signalled when it should not be. There's lots of stuff you can
do that isn't exactly probing for race conditions, yet would be awfully
nice to check for in a routine test suite.

I might be wrong though, not being exactly a DBI guru ... can this
sort of thing be done?

regards, tom lane
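At the C level this primitive does exist: libpq's nonblocking PQsendQuery/PQgetResult pair lets a client submit a command and come back for the result later. The "issue and don't wait" pattern, and the stacked-waiters test Tom wants it for, can be sketched abstractly with futures (Python stands in for the driver; a plain lock stands in for a row lock, and all names are illustrative):

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

lock = threading.Lock()
order = []

def waiter(name):
    with lock:            # blocks until the "original lock holder" goes away
        order.append(name)
    return name

lock.acquire()            # the original lock holder

pool = ThreadPoolExecutor(max_workers=2)
f1 = pool.submit(waiter, "w1")  # issue a command, don't wait for the result
f2 = pool.submit(waiter, "w2")  # stack up a second waiter behind the same lock

time.sleep(0.1)
# Neither waiter can have finished: the lock is still held.
assert not f1.done() and not f2.done()

lock.release()            # holder goes away; both waiters proceed
done = sorted(f.result() for f in (f1, f2))
pool.shutdown()
```

A real harness would additionally assert on *release order* and on deadlock detection, which is precisely what a blocking driver cannot express.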

#19 Tino Wildenhain
tino@wildenhain.de
In reply to: Tom Lane (#18)
Re: Testing of MVCC

Tom Lane schrieb:

Greg Stark <gsstark@mit.edu> writes:

So why bother with driving multiple invocations of psql under
Expect. Just use DBD::Pg to open as many connections as you want and
issue whatever queries you want.

The bit that I think is missing in DBI is "issue a command and don't
wait for the result just yet". Without that, you cannot for instance
stack up several waiters for the same lock, as you might wish to do to
verify that they get released in the correct order once the original
lock holder goes away. Or stack up some conflicting waiters and check
to see if deadlock is detected when it should be ... or contrariwise,
not signalled when it should not be. There's lots of stuff you can
do that isn't exactly probing for race conditions, yet would be awfully
nice to check for in a routine test suite.

I might be wrong though, not being exactly a DBI guru ... can this
sort of thing be done?

I wonder if there isn't a wrapper around libpq you could use like that?

#20 Andrew Piskorski
atp@piskorski.com
In reply to: Tom Lane (#9)
Re: Testing of MVCC

On Mon, Aug 15, 2005 at 06:01:20PM -0400, Tom Lane wrote:

What we really need is a test program that can issue a command on one
connection (perhaps waiting for it to finish, perhaps not) and then
issue other commands on other connections, all according to a script.

Well, using Tcl with its Tcl Threads Extension should certainly let
you easily control multiple concurrent PostgreSQL connections. (The
Thread Extension's APIs are particularly nice for multi-threaded
programming.) Its docs are here:

http://cvs.sourceforge.net/viewcvs.py/tcl/thread/doc/html/

I am unsure that the existing pg_regress infrastructure is the right
place to start from. Perhaps we should look at Expect or something
similar.

I don't have any clear idea of what sort of tests you want to run
"according to a script" though, so I'm not sure whether the Tcl
Threads Extension, or Expect, or some other tool would best meet your
needs.

--
Andrew Piskorski <atp@piskorski.com>
http://www.piskorski.com/

#21 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tino Wildenhain (#19)
Re: Testing of MVCC

Tino Wildenhain <tino@wildenhain.de> writes:

Tom Lane schrieb:

The bit that I think is missing in DBI is "issue a command and don't
wait for the result just yet". ...
I might be wrong though, not being exactly a DBI guru ... can this
sort of thing be done?

I wonder if you dont have a wrapper around libpq you can use like that?

Sure, it wouldn't take much to create a minimal C+libpq program that
would do the basics. But the history of testing tools teaches that
you soon find yourself wanting a whole lot more functionality, like
conditional tests, looping, etc, in the test-driver mechanism.
That's the wheel that I don't want to re-invent. And it's a big part
of the reason why stuff like Expect and the Perl Test modules have
become so popular: you have a full scripting language right there at
your command.

Maybe the right answer is just to hack up Pg.pm or DBD::Pg to provide
the needed asynchronous-command-submission facility, and go forward
from there using the Perl Test framework.

regards, tom lane

#22 Tino Wildenhain
tino@wildenhain.de
In reply to: Tom Lane (#21)
Re: Testing of MVCC

Tom Lane schrieb:

Tino Wildenhain <tino@wildenhain.de> writes:

Tom Lane schrieb:

The bit that I think is missing in DBI is "issue a command and don't
wait for the result just yet". ...
I might be wrong though, not being exactly a DBI guru ... can this
sort of thing be done?

I wonder if you dont have a wrapper around libpq you can use like that?

Sure, it wouldn't take much to create a minimal C+libpq program that
would do the basics. But the history of testing tools teaches that

Well, no. I was just thinking Perl might have something similar to
Python's pyPgSQL module, which has both a DB-API 2 interface and
low-level access to libpq, all nicely accessible from the scripting
language. I'm using it for NOTIFY/LISTEN, for example.

you soon find yourself wanting a whole lot more functionality, like
conditional tests, looping, etc, in the test-driver mechanism.
That's the wheel that I don't want to re-invent. And it's a big part
of the reason why stuff like Expect and the Perl Test modules have
become so popular: you have a full scripting language right there at
your command.

Sure, see above :)

Maybe the right answer is just to hack up Pg.pm or DBD::Pg to provide
the needed asynchronous-command-submission facility, and go forward
from there using the Perl Test framework.

Nothing like that on CPAN already?

#23 Greg Stark
gsstark@mit.edu
In reply to: Tom Lane (#18)
Re: Testing of MVCC

Tom Lane <tgl@sss.pgh.pa.us> writes:

Greg Stark <gsstark@mit.edu> writes:

So why bother with driving multiple invocations of psql under
Expect. Just use DBD::Pg to open as many connections as you want and
issue whatever queries you want.

The bit that I think is missing in DBI is "issue a command and don't
wait for the result just yet". Without that, you cannot for instance
stack up several waiters for the same lock, as you might wish to do to
verify that they get released in the correct order once the original
lock holder goes away. Or stack up some conflicting waiters and check
to see if deadlock is detected when it should be ... or contrariwise,
not signalled when it should not be. There's lots of stuff you can
do that isn't exactly probing for race conditions, yet would be awfully
nice to check for in a routine test suite.

I might be wrong though, not being exactly a DBI guru ... can this
sort of thing be done?

Hm.

The API is designed like that. You issue a query in one call and retrieve the
results in a separate call (or series of calls). I don't know the DBD::Pg
implementation well enough to be sure it's a 1:1 mapping to Postgres wire
protocol messages, but I would expect you could get pretty fine control at
that level.

I doubt it's using asynchronous I/O though. Which would mean, for example,
that you can't arrange to send a message while another connection is in the
middle of receiving a large message.

I think part of this boils down to a deficiency in the Postgres wire protocol
though. It doesn't allow for interleaving calls in the middle of downloading a
large results block. That means DBD::Pg would be in bad shape if it returned
control to the user while in the process of downloading query results. If the
user issued any calls to the driver in that state it would have to return some
sort of error.

By comparison DBD::Oracle can stream results to the user while still
continuing to download more results. It tries to adjust the number of records
read whenever the buffer empties to keep the network pipeline full. This
allows the user to process records while the database is still working on
executing the query and the network is still working on shipping the results.
(Obviously this works better with some plans than others.) And the driver can
cancel or issue other queries between any of these block reads.

--
greg

#24 Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#21)
Re: Testing of MVCC

Tom Lane wrote:

Sure, it wouldn't take much to create a minimal C+libpq program that
would do the basics. But the history of testing tools teaches that
you soon find yourself wanting a whole lot more functionality, like
conditional tests, looping, etc, in the test-driver mechanism.
That's the wheel that I don't want to re-invent. And it's a big part
of the reason why stuff like Expect and the Perl Test modules have
become so popular: you have a full scripting language right there at
your command.

Maybe the right answer is just to hack up Pg.pm or DBD::Pg to provide
the needed asynchronous-command-submission facility, and go forward
from there using the Perl Test framework.

How will we make sure it's consistent? People have widely varying
versions of DBD::Pg and DBI installed, not to mention the bewildering
array of Test::Foo modules out there (just try installing Template
Toolkit on a less than very modern perl and see yourself get into module
hell). The only way I can see of working on this path would be to keep
and make our own copies of the needed modules, and point PERL5LIB at
that collection. But that would constitute a large extra buildtime burden.

A better solution might be to hack something out of the pure perl DBD
driver and use that. It's known to have some problems, but maybe this
would be a good impetus to iron those out, and this would reduce us to
carrying a single non-compiled perl module (plus whatever test framework
we need).

cheers

andrew

#25 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#24)
Re: Testing of MVCC

Andrew Dunstan <andrew@dunslane.net> writes:

Tom Lane wrote:

Maybe the right answer is just to hack up Pg.pm or DBD::Pg to provide
the needed asynchronous-command-submission facility, and go forward
from there using the Perl Test framework.

How will we make sure it's consistent? People have widely varying
versions of DBD::Pg and DBI installed, not to mention the bewildering
array of Test::Foo modules out there

Yeah, that would be an issue. But can't a Perl script require
"version >= m.n" for each module it uses?

I had actually been thinking to myself that Pg.pm might be a better base
because it's more self-contained.

Another line of thought is to write a fresh implementation of the wire
protocol all in Perl, so as not to depend on DBI or much of anything
except Perl's TCP support (which I hope is reasonably well standardized
;-)). If you wanted to do any testing at the protocol level ---
handling of bad messages, say --- you'd pretty much need this anyway
because no driver is going to let you get at things at such a low level.
But it'd raise the cost of getting started quite a bit.

regards, tom lane
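The "fresh implementation of the wire protocol" would start with exactly this kind of byte-bashing. As a rough illustration (in Python rather than Perl, and covering only the v3 StartupMessage, not the authentication exchange Tom mentions later), the first bytes a from-scratch client sends look like this: an int32 length that includes itself, the int32 protocol version 3.0 (196608), NUL-terminated name/value parameter pairs, and a final NUL.

```python
import struct

def startup_message(params):
    """Build a PostgreSQL v3 StartupMessage from a dict of parameters."""
    body = struct.pack("!i", 196608)  # protocol version 3.0 (3 << 16)
    for name, value in params.items():
        body += name.encode() + b"\0" + value.encode() + b"\0"
    body += b"\0"  # terminator after the last parameter pair
    # Length prefix counts itself (4 bytes) plus the body.
    return struct.pack("!i", len(body) + 4) + body

msg = startup_message({"user": "postgres", "database": "regression"})
```

Everything after this point is a stream of typed, length-prefixed messages, which is why Tom suggests letting an existing driver do the authentication and then taking over the bare socket.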

#26 Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#25)
Re: Testing of MVCC

Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

Tom Lane wrote:

Maybe the right answer is just to hack up Pg.pm or DBD::Pg to provide
the needed asynchronous-command-submission facility, and go forward
from there using the Perl Test framework.

How will we make sure it's consistent? People have widely varying
versions of DBD::Pg and DBI installed, not to mention the bewildering
array of Test::Foo modules out there

Yeah, that would be an issue. But can't a Perl script require
"version >= m.n" for each module it uses?

Yes it can, but are you going to restrict building or running
regressions to only those platforms that have our required modules
installed? That might be thought a tad unfriendly.

I had actually been thinking to myself that Pg.pm might be a better base
because it's more self-contained.

Another line of thought is to write a fresh implementation of the wire
protocol all in Perl, so as not to depend on DBI or much of anything
except Perl's TCP support (which I hope is reasonably well standardized
;-)). If you wanted to do any testing at the protocol level ---
handling of bad messages, say --- you'd pretty much need this anyway
because no driver is going to let you get at things at such a low level.
But it'd raise the cost of getting started quite a bit.

I think we're mostly on the same page.

Maybe pulling some code from this would give us a leg up rather than
having to start from scratch: http://search.cpan.org/~arc/DBD-PgPP-0.05/

cheers

andrew

#27Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#26)
Re: Testing of MVCC

Andrew Dunstan <andrew@dunslane.net> writes:

Tom Lane wrote:

Yeah, that would be an issue. But can't a Perl script require
"version >= m.n" for each module it uses?

Yes it can, but are you going to restrict building or running
regressions to only those platforms that have our required modules
installed? That might be thought a tad unfriendly.

Well, how we package all this is still TBD. We might want to set it up
as a test suite completely separate from the existing regression tests.
Then, if you don't want to install the needed Perl stuff, you just don't
run it.

regards, tom lane

#28Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#26)
Re: Testing of MVCC

Andrew Dunstan <andrew@dunslane.net> writes:

Tom Lane wrote:

Another line of thought is to write a fresh implementation of the wire
protocol all in Perl, so as not to depend on DBI or much of anything
except Perl's TCP support (which I hope is reasonably well standardized
;-)). If you wanted to do any testing at the protocol level ---
handling of bad messages, say --- you'd pretty much need this anyway
because no driver is going to let you get at things at such a low level.
But it'd raise the cost of getting started quite a bit.

Maybe pulling some code from this would give us a leg up rather than
having to start from scratch: http://search.cpan.org/~arc/DBD-PgPP-0.05/

Another thought on this point: I think most of the pain is associated
with handling the startup authentication negotiation (at least if you
want to support all the auth method options). Once you've got a live
connection the protocol's not all that complicated. So one way to
leverage some code would be to use DBD or another existing driver to
handle the startup phase, and then just hack it to turn over the bare
socket to the test code.

regards, tom lane

#29Kaare Rasmussen
kar@kakidata.dk
In reply to: Andrew Dunstan (#26)
Re: Testing of MVCC

Yes it can, but are you going to restrict building or running
regressions to only those platforms that have our required modules
installed? That might be thought a tad unfriendly.

You could include DBD::Pg with the distribution and run it locally. Perhaps
even DBI, leaving Perl the only unknown.

#30Jim C. Nasby
jnasby@pervasive.com
In reply to: Tom Lane (#18)
Re: Testing of MVCC

On Tue, Aug 16, 2005 at 12:24:34AM -0400, Tom Lane wrote:

Greg Stark <gsstark@mit.edu> writes:

So why bother with driving multiple invocations of psql under
Expect. Just use DBD::Pg to open as many connections as you want and
issue whatever queries you want.

The bit that I think is missing in DBI is "issue a command and don't
wait for the result just yet". Without that, you cannot for instance
stack up several waiters for the same lock, as you might wish to do to
verify that they get released in the correct order once the original
lock holder goes away. Or stack up some conflicting waiters and check
to see if deadlock is detected when it should be ... or contrariwise,
not signalled when it should not be. There's lots of stuff you can
do that isn't exactly probing for race conditions, yet would be awfully
nice to check for in a routine test suite.

I might be wrong though, not being exactly a DBI guru ... can this
sort of thing be done?

Even if it can't be done, would it be reasonable to spawn multiple perl
processes, each of which handles one database connection? I suspect it
wouldn't be too hard to write a daemon in perl that would sit between
the test code and a pile of DBI connections.
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com 512-569-9461
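
Jim's daemon idea can be sketched roughly as follows: one worker per session,
each owning a single connection, with the test script feeding statements
through queues. A minimal Python sketch, where threads stand in for the
separate Perl processes and `stub_execute` stands in for a real DBI handle
(both names are illustrative, not from the thread):

```python
import queue, threading

def session_worker(cmd_q, result_q, execute):
    # One worker per simulated session; in Jim's scheme each would be a
    # separate process holding its own DBI connection.  `execute` is a
    # stand-in for running SQL on that connection.
    while True:
        sql = cmd_q.get()
        if sql is None:
            return
        result_q.put(execute(sql))

def stub_execute(sql):
    return "ok: " + sql  # placeholder; no real database involved

sessions = []
for _ in range(2):
    cq, rq = queue.Queue(), queue.Queue()
    threading.Thread(target=session_worker, args=(cq, rq, stub_execute),
                     daemon=True).start()
    sessions.append((cq, rq))

# The driving script can now interleave statements across sessions freely,
# e.g. have session 0 take a lock before session 1 tries to acquire it:
sessions[0][0].put("BEGIN")
sessions[1][0].put("BEGIN")
results = [rq.get() for cq, rq in sessions]
```

With real connections the workers would need to be separate processes, as Jim
suggests, so that one session blocking on a lock cannot stall the others.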

#31Greg Stark
gsstark@mit.edu
In reply to: Jim C. Nasby (#30)
Re: Testing of MVCC

"Jim C. Nasby" <jnasby@pervasive.com> writes:

On Tue, Aug 16, 2005 at 12:24:34AM -0400, Tom Lane wrote:

Greg Stark <gsstark@mit.edu> writes:

So why bother with driving multiple invocations of psql under
Expect. Just use DBD::Pg to open as many connections as you want and
issue whatever queries you want.

The bit that I think is missing in DBI is "issue a command and don't
wait for the result just yet". Without that, you cannot for instance
stack up several waiters for the same lock, as you might wish to do to
verify that they get released in the correct order once the original
lock holder goes away. Or stack up some conflicting waiters and check
to see if deadlock is detected when it should be ... or contrariwise,
not signalled when it should not be. There's lots of stuff you can
do that isn't exactly probing for race conditions, yet would be awfully
nice to check for in a routine test suite.

I might be wrong though, not being exactly a DBI guru ... can this
sort of thing be done?

Even if it can't be done, would it be reasonable to spawn multiple perl
processes, each of which handles one database connection? I suspect it
wouldn't be too hard to write a daemon in perl that would sit between
the test code and a pile of DBI connections.

Well then you're back in the same boat as using psql or any other subprocess.

However I'm unconvinced that it cannot be done. Certainly the API separates
pretty much each phase of the query process. You can issue queries and not
look at the result set of the query immediately. What's unclear is what the
driver is doing behind the scenes. It may be reading the results sooner than
the API makes it appear.

There are also more places to potentially try to interleave actions than the
API actually exposes. I'm not sure how important those additional points would
be though since they are necessarily pretty small windows or else the API
would expose them.

So for example if you issue a query you can regain control before reading the
actual results of the query. However you cannot regain control before
reading the message that indicates your query was at least received properly.
In practice I'm not sure that matters since that one simple response would
undoubtedly fit within the server's network buffers anyways so whether the
client waits for it or not seems unlikely to have any effect on the server.

There is also the problem of trying to have two processes reading results at
the same time. Say you want to test that two concurrent sequential scans
behave properly. libpq (and therefore DBI, since it's built on libpq) can only
read the entire result set. You can do things before you read the results, but
once you ask for the results you don't get control back until the entire
result set is ready.

--
greg
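
The "issue a command and don't wait for the result" primitive Tom asked for
amounts to separating the send from the blocking read, which libpq's
asynchronous functions (PQsendQuery and friends) expose but which, per this
thread, DBI did not. A toy Python sketch of the pattern against a fake
backend over a socketpair (no real server or wire protocol involved):

```python
import select, socket, threading

def fake_backend(conn):
    # Stand-in for a server: echoes each newline-terminated "query".
    for line in conn.makefile("r"):
        conn.sendall(("result of: " + line.strip() + "\n").encode())

a, b = socket.socketpair()
threading.Thread(target=fake_backend, args=(b,), daemon=True).start()

a.sendall(b"SELECT 1\n")                     # issue the command ...
ready, _, _ = select.select([a], [], [], 0)  # ... without blocking on it
# Control returns to the test driver immediately; the reply may or may not
# have arrived yet.  Only this later read actually blocks:
reply = a.makefile("r").readline()
```

This is exactly the window in which a test harness could stack up several
waiters on a lock before collecting any of their results.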