Using LibPq in TAP tests via FFI

Started by Andrew Dunstanover 1 year ago27 messages
Jump to latest
#1Andrew Dunstan
andrew@dunslane.net

Over at [1]  /messages/by-id/20240612152812.ixz3eiz2p475gose@awork3.anarazel.de Andres expressed enthusiasm for enabling TAP tests to call
LibPQ directly via FFI, and there was some support from others as well.
Attached is a very rough POC for just that.There are two perl modules,
one which wraps libpq (or almost all of it) in perl, and another which
uses that module to create a session object that can be used to run SQL.
Also in the patch is a modification of one TAP test (arbitrarily chosen
as src/bin/pg_amcheck/t/004_verify_heapam.p) to use the new interface,
so it doesn't use psql at all.

There's a bunch of work to do here, but for a morning's work it's not
too bad :-) Luckily I had most of the first file already to hand.

Next I plan to look at some of the recovery tests and other uses of
background_psql, which might be more challenging,a dn require extension
of the session API. Also there's a lot of error checking and
documentation that need to be added.

cheers

andrew

[1]:   /messages/by-id/20240612152812.ixz3eiz2p475gose@awork3.anarazel.de

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#2Andrew Dunstan
andrew@dunslane.net
In reply to: Andrew Dunstan (#1)
Re: Using LibPq in TAP tests via FFI

On 2024-06-14 Fr 11:09, Andrew Dunstan wrote:

Over at [1] Andres expressed enthusiasm for enabling TAP tests to call
LibPQ directly via FFI, and there was some support from others as
well. Attached is a very rough POC for just that.There are two perl
modules, one which wraps libpq (or almost all of it) in perl, and
another which uses that module to create a session object that can be
used to run SQL. Also in the patch is a modification of one TAP test
(arbitrarily chosen as src/bin/pg_amcheck/t/004_verify_heapam.p) to
use the new interface, so it doesn't use psql at all.

There's a bunch of work to do here, but for a morning's work it's not
too bad :-) Luckily I had most of the first file already to hand.

Next I plan to look at some of the recovery tests and other uses of
background_psql, which might be more challenging,a dn require
extension of the session API. Also there's a lot of error checking and
documentation that need to be added.

cheers

andrew

[1]
/messages/by-id/20240612152812.ixz3eiz2p475gose@awork3.anarazel.de

And here's the patch

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Attachments:

pqffi-poc.patchtext/x-patch; charset=UTF-8; name=pqffi-poc.patchDownload+761-38
#3Robert Haas
robertmhaas@gmail.com
In reply to: Andrew Dunstan (#2)
Re: Using LibPq in TAP tests via FFI

On Fri, Jun 14, 2024 at 11:11 AM Andrew Dunstan <andrew@dunslane.net> wrote:

And here's the patch

I haven't reviewed the patch, but a big +1 for the idea. Not only this
might cut down on the resource costs of running the tests in CI, as
Andres has pointed out a few times, but it also could lead to much
nicer user interfaces. For instance, right now, we have a number of
TAP tests that are parsing psql output to recover the values returned
by queries. Perhaps eventually - or maybe already, again I haven't
looked at the code - you'll be able to do something like
$resultset->[0][0] to pull the first column out of the first row. That
kind of thing could substantially improve the readability and
maintainability of some of our tests.

--
Robert Haas
EDB: http://www.enterprisedb.com

#4Andres Freund
andres@anarazel.de
In reply to: Andrew Dunstan (#1)
Re: Using LibPq in TAP tests via FFI

Hi,

On June 14, 2024 8:09:43 AM PDT, Andrew Dunstan <andrew@dunslane.net> wrote:

Over at [1] Andres expressed enthusiasm for enabling TAP tests to call LibPQ directly via FFI, and there was some support from others as well. Attached is a very rough POC for just that.There are two perl modules, one which wraps libpq (or almost all of it) in perl, and another which uses that module to create a session object that can be used to run SQL. Also in the patch is a modification of one TAP test (arbitrarily chosen as src/bin/pg_amcheck/t/004_verify_heapam.p) to use the new interface, so it doesn't use psql at all.

There's a bunch of work to do here, but for a morning's work it's not too bad :-) Luckily I had most of the first file already to hand.

Yay!

Next I plan to look at some of the recovery tests and other uses of background_psql, which might be more challenging,a dn require extension of the session API. Also there's a lot of error checking and documentation that need to be added.

I'd suggest trying to convert the various looping constructs first, they're responsible for a large number of spawned shells. And I vaguely recall that there were none/very few that depend on actually being run via psql.

Andres

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

#5Andres Freund
andres@anarazel.de
In reply to: Andrew Dunstan (#2)
Re: Using LibPq in TAP tests via FFI

Hi,

On 2024-06-14 11:11:38 -0400, Andrew Dunstan wrote:

On 2024-06-14 Fr 11:09, Andrew Dunstan wrote:

Over at [1] Andres expressed enthusiasm for enabling TAP tests to call
LibPQ directly via FFI, and there was some support from others as well.
Attached is a very rough POC for just that.There are two perl modules,
one which wraps libpq (or almost all of it) in perl, and another which
uses that module to create a session object that can be used to run SQL.

What are your current thoughts about a fallback for this? It seems possible
to implement the session module ontop of BackgroundPsql.pm, if necessary. But
I suspect we'll eventually get to a point where that gets less and less
convenient.

How much of a dependency is FFI::Platypus, compared to requiring perl headers
to be installed? In case FFI::Platypus is a complicted dependency, a small XS
wrapper could be an alternative.

Greetings,

Andres Freund

#6Thomas Munro
thomas.munro@gmail.com
In reply to: Robert Haas (#3)
Re: Using LibPq in TAP tests via FFI

On Sat, Jun 15, 2024 at 3:33 AM Robert Haas <robertmhaas@gmail.com> wrote:

I haven't reviewed the patch, but a big +1 for the idea. Not only this
might cut down on the resource costs of running the tests in CI, as

It would be good to keep some context between the threads here. For
the archives' sake, here is where the potential savings were reported,
and this and other ideas were discussed:

/messages/by-id/CA+hUKGJoEO33K=ZynsH=xkiEyfBMZjOoqBK+gouBdTGW2-woig@mail.gmail.com

#7Andrew Dunstan
andrew@dunslane.net
In reply to: Andres Freund (#4)
Re: Using LibPq in TAP tests via FFI

On Fri, Jun 14, 2024 at 11:40 AM Andres Freund <andres@anarazel.de> wrote:

Hi,

On June 14, 2024 8:09:43 AM PDT, Andrew Dunstan <andrew@dunslane.net>
wrote:

Over at [1] Andres expressed enthusiasm for enabling TAP tests to call

LibPQ directly via FFI, and there was some support from others as well.
Attached is a very rough POC for just that.There are two perl modules, one
which wraps libpq (or almost all of it) in perl, and another which uses
that module to create a session object that can be used to run SQL. Also in
the patch is a modification of one TAP test (arbitrarily chosen as
src/bin/pg_amcheck/t/004_verify_heapam.p) to use the new interface, so it
doesn't use psql at all.

There's a bunch of work to do here, but for a morning's work it's not too

bad :-) Luckily I had most of the first file already to hand.

Yay!

Next I plan to look at some of the recovery tests and other uses of

background_psql, which might be more challenging,a dn require extension of
the session API. Also there's a lot of error checking and documentation
that need to be added.

I'd suggest trying to convert the various looping constructs first,
they're responsible for a large number of spawned shells. And I vaguely
recall that there were none/very few that depend on actually being run via
psql.

Yeah, here's a new version with a few more scripts modified, and also
poll_query_until() adjusted. That seems to be the biggest looping construct.

The biggest remaining unadjusted script users of psql are all in the
subscription and recovery tests.

cheers

andrew

Attachments:

pqffi-poc-2.patchtext/x-patch; charset=US-ASCII; name=pqffi-poc-2.patchDownload+843-89
#8Andrew Dunstan
andrew@dunslane.net
In reply to: Thomas Munro (#6)
Re: Using LibPq in TAP tests via FFI

On Fri, Jun 14, 2024 at 7:42 PM Thomas Munro <thomas.munro@gmail.com> wrote:

On Sat, Jun 15, 2024 at 3:33 AM Robert Haas <robertmhaas@gmail.com> wrote:

I haven't reviewed the patch, but a big +1 for the idea. Not only this
might cut down on the resource costs of running the tests in CI, as

It would be good to keep some context between the threads here. For
the archives' sake, here is where the potential savings were reported,
and this and other ideas were discussed:

/messages/by-id/CA+hUKGJoEO33K=ZynsH=xkiEyfBMZjOoqBK+gouBdTGW2-woig@mail.gmail.com

Yeah thanks for adding that context.

cheers

andrew

#9Andrew Dunstan
andrew@dunslane.net
In reply to: Andres Freund (#5)
Re: Using LibPq in TAP tests via FFI

On Fri, Jun 14, 2024 at 12:25 PM Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2024-06-14 11:11:38 -0400, Andrew Dunstan wrote:

On 2024-06-14 Fr 11:09, Andrew Dunstan wrote:

Over at [1] Andres expressed enthusiasm for enabling TAP tests to call
LibPQ directly via FFI, and there was some support from others as well.
Attached is a very rough POC for just that.There are two perl modules,
one which wraps libpq (or almost all of it) in perl, and another which
uses that module to create a session object that can be used to run

SQL.

What are your current thoughts about a fallback for this? It seems
possible
to implement the session module ontop of BackgroundPsql.pm, if necessary.
But
I suspect we'll eventually get to a point where that gets less and less
convenient.

I guess it's a question of how widely available FFI::Platypus is. I know
it's available pretty much out of the box on Strawberry Perl and Msys2'
ucrt perl. It works fine on my Ubuntu ARM64 instance. On my Mac I had to
install it via cpan, but that worked fine. For the moment CYgwin has me
beat, but I believe it's possible to make it work - at least the docs
suggest it is. Not sure about other platforms.

I agree with you that falling back on BackgroundPsql is not a terribly
satisfactory solution.

How much of a dependency is FFI::Platypus, compared to requiring perl
headers
to be installed? In case FFI::Platypus is a complicted dependency, a
small XS
wrapper could be an alternative.

Sure we could look at it. I might need to enlist some assistance there :-).
Using FFI is really nice because it does so much of the work for you.

cheers

andrew

#10Andres Freund
andres@anarazel.de
In reply to: Andrew Dunstan (#9)
Re: Using LibPq in TAP tests via FFI

Hi,

On 2024-06-16 17:43:05 -0400, Andrew Dunstan wrote:

On Fri, Jun 14, 2024 at 12:25 PM Andres Freund <andres@anarazel.de> wrote:
I guess it's a question of how widely available FFI::Platypus is. I know
it's available pretty much out of the box on Strawberry Perl and Msys2'
ucrt perl.

FWIW I hacked a bit on CI, trying to make it work. Took a bit, partially
because CI uses an older strawberry perl without FFI::Platypus. And
FFI::Platypus didn't build with that.

Updating that to 5.38 causes some complaints about LANG that I haven't hunted
down, just avoided by unsetting LANG.

As-is your patch didn't work, because it has "systempath => []", which caused
libpq to not load, because it depended on things in the system path...

What's the reason for that?

After commenting that out, all but one tests passed:

[20:21:31.137] ------------------------------------- 8< -------------------------------------
[20:21:31.137] stderr:
[20:21:31.137] # Failed test 'psql connect success'
[20:21:31.137] # at C:/cirrus/src/test/recovery/t/041_checkpoint_at_promote.pl line 161.
[20:21:31.137] # got: '2'
[20:21:31.137] # expected: '0'
[20:21:31.137] # Failed test 'psql select 1'
[20:21:31.137] # at C:/cirrus/src/test/recovery/t/041_checkpoint_at_promote.pl line 162.
[20:21:31.137] # got: ''
[20:21:31.137] # expected: '1'
[20:21:31.137] # Looks like you failed 2 tests of 6.
[20:21:31.137]
[20:21:31.137] (test program exited with status code 2)
[20:21:31.137] ------------------------------------------------------------------------------
[20:21:31.137]

Due to concurrency and run-to-run variance I wouldn't bet too much on this,
but the modified tests do have improved test times:

before:

[19:40:47.468] 135/296 postgresql:pg_amcheck / pg_amcheck/004_verify_heapam OK 7.70s 32 subtests passed
[19:43:40.853] 232/296 postgresql:amcheck / amcheck/001_verify_heapam OK 36.50s 272 subtests passed

after:
[20:22:55.495] 133/296 postgresql:pg_amcheck / pg_amcheck/004_verify_heapam OK 4.60s 32 subtests passed
[20:25:13.641] 212/296 postgresql:amcheck / amcheck/001_verify_heapam OK 4.87s 272 subtests passed

I looked at a few past runs and there never were instances of
amcheck/001_verify_heapam that were even close to as fast as this.

The overall tests time did improve some, but that is hard to weigh due to the
test failure.

I agree with you that falling back on BackgroundPsql is not a terribly
satisfactory solution.

I'm somewhat doubtful we'll just agree on making FFI::Platypus a hard
dependency, but if we agree to do so...

Greetings,

Andres Freund

#11Thomas Munro
thomas.munro@gmail.com
In reply to: Andres Freund (#10)
Re: Using LibPq in TAP tests via FFI

On Mon, Jun 17, 2024 at 10:38 AM Andres Freund <andres@anarazel.de> wrote:

before:

[19:40:47.468] 135/296 postgresql:pg_amcheck / pg_amcheck/004_verify_heapam OK 7.70s 32 subtests passed
[19:43:40.853] 232/296 postgresql:amcheck / amcheck/001_verify_heapam OK 36.50s 272 subtests passed

after:
[20:22:55.495] 133/296 postgresql:pg_amcheck / pg_amcheck/004_verify_heapam OK 4.60s 32 subtests passed
[20:25:13.641] 212/296 postgresql:amcheck / amcheck/001_verify_heapam OK 4.87s 272 subtests passed

Nice!

I agree with you that falling back on BackgroundPsql is not a terribly
satisfactory solution.

I'm somewhat doubtful we'll just agree on making FFI::Platypus a hard
dependency, but if we agree to do so...

Why can't we just do that? I mean, do we have any concrete reason to
think that it'll block a supported platform?

I'm personally willing to test/validate on the full set of non-Linux
Unixen and write up the install instructions to help eg build farm
animal owners adjust. Really this is mostly about libffi, which is
super widely ported, and it is required by Python which we already
soft-depend on, and will hard-depend on if we drop autoconf. The rest
is presumably just Perl xs glue to drive it, which, if it doesn't work
on some niche platform, you'd think should be easy enough to fix...

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Munro (#11)
Re: Using LibPq in TAP tests via FFI

Thomas Munro <thomas.munro@gmail.com> writes:

On Mon, Jun 17, 2024 at 10:38 AM Andres Freund <andres@anarazel.de> wrote:

I'm somewhat doubtful we'll just agree on making FFI::Platypus a hard
dependency, but if we agree to do so...

Why can't we just do that? I mean, do we have any concrete reason to
think that it'll block a supported platform?

IIUC, this would only be a hard dependency if you want to run certain
TAP tests (maybe eventually all of them). Seems like not that much of
a roadblock for somebody that's just trying to build PG for
themselves. I agree we'd want it on most buildfarm animals
eventually, but they pretty much all have python installed ...

regards, tom lane

#13Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#12)
Re: Using LibPq in TAP tests via FFI

Hi,

On 2024-06-16 19:07:49 -0400, Tom Lane wrote:

Thomas Munro <thomas.munro@gmail.com> writes:

On Mon, Jun 17, 2024 at 10:38 AM Andres Freund <andres@anarazel.de> wrote:

I'm somewhat doubtful we'll just agree on making FFI::Platypus a hard
dependency, but if we agree to do so...

Why can't we just do that? I mean, do we have any concrete reason to
think that it'll block a supported platform?

IIUC, this would only be a hard dependency if you want to run certain
TAP tests (maybe eventually all of them).

I think it'd be all of them within a very short timeframe. IMO we'd want to
convert a bunch of the code in Cluster.pm to use psql-less connections to
maximize the benefit across all tests, without needing to modify all of them.

Greetings,

Andres Freund

#14Thomas Munro
thomas.munro@gmail.com
In reply to: Thomas Munro (#11)
Re: Using LibPq in TAP tests via FFI

On Mon, Jun 17, 2024 at 10:57 AM Thomas Munro <thomas.munro@gmail.com> wrote:

I'm personally willing to test/validate on the full set of non-Linux
Unixen and write up the install instructions to help eg build farm
animal owners adjust.

I created a page where we can log "works/doesn't work" and "installed
how" information:

https://wiki.postgresql.org/wiki/Platypus

I'll go and test the BSDs and hopefully illumos. And then maybe Macs
if Tom doesn't beat me to it.

#15Andres Freund
andres@anarazel.de
In reply to: Thomas Munro (#14)
Re: Using LibPq in TAP tests via FFI

Hi,

On 2024-06-17 12:03:28 +1200, Thomas Munro wrote:

And then maybe Macs if Tom doesn't beat me to it.

macports even has a platypus package, so that should be easy.

For CI it should suffice to add p5.34-ffi-platypus to the list of packages in
macos' setup_additional_packages_script, they then should get automatically
cached.

Greetings,

Andres Freund

#16Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#15)
Re: Using LibPq in TAP tests via FFI

Andres Freund <andres@anarazel.de> writes:

On 2024-06-17 12:03:28 +1200, Thomas Munro wrote:

And then maybe Macs if Tom doesn't beat me to it.

macports even has a platypus package, so that should be easy.

Less easy if you don't want to depend on macports or homebrew.
However, I see something a bit promising-looking in the base system:

$ ls -l /usr/lib/*ffi*
-rwxr-xr-x 1 root wheel 100720 May 7 03:01 /usr/lib/libffi-trampolines.dylib

regards, tom lane

#17Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Munro (#11)
Re: Using LibPq in TAP tests via FFI

Thomas Munro <thomas.munro@gmail.com> writes:

Really this is mostly about libffi, which is
super widely ported, and it is required by Python

BTW, what form does that "requirement" take exactly? I see no
evidence that the core python3 executable is linked to libffi
on any of the machines I checked.

regards, tom lane

#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#16)
Re: Using LibPq in TAP tests via FFI

I wrote:

Andres Freund <andres@anarazel.de> writes:

macports even has a platypus package, so that should be easy.

Less easy if you don't want to depend on macports or homebrew.

I tried "sudo cpan install FFI::Platypus" against macOS Sonoma's
base-system perl. It seemed to compile all right, but a nontrivial
fraction of its self-tests fail:

Files=72, Tests=296, 7 wallclock secs ( 0.10 usr 0.07 sys + 4.96 cusr 1.34 csys = 6.47 CPU)
Result: FAIL
Failed 33/72 test programs. 87/296 subtests failed.
make: *** [test_dynamic] Error 3
PLICEASE/FFI-Platypus-2.08.tar.gz
/usr/bin/make test -- NOT OK

No energy for digging deeper tonight.

regards, tom lane

#19Thomas Munro
thomas.munro@gmail.com
In reply to: Tom Lane (#17)
Re: Using LibPq in TAP tests via FFI

On Mon, Jun 17, 2024 at 12:34 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Thomas Munro <thomas.munro@gmail.com> writes:

Really this is mostly about libffi, which is
super widely ported, and it is required by Python

BTW, what form does that "requirement" take exactly? I see no
evidence that the core python3 executable is linked to libffi
on any of the machines I checked.

There is another library in between:

$ ldd /usr/local/lib/python3.11/lib-dynload/_ctypes.cpython-311.so
/usr/local/lib/python3.11/lib-dynload/_ctypes.cpython-311.so:
libffi.so.8 => /usr/local/lib/libffi.so.8 (0x214865b76000)
libdl.so.1 => /usr/lib/libdl.so.1 (0x214864bcc000)
libthr.so.3 => /lib/libthr.so.3 (0x214866862000)
libc.so.7 => /lib/libc.so.7 (0x214863e03000)

Perhaps it's technically possible to build Python without the ctypes
module, but I'm not sure and I don't see anywhere that describes it
explicitly as optional.

#20Thomas Munro
thomas.munro@gmail.com
In reply to: Thomas Munro (#19)
Re: Using LibPq in TAP tests via FFI

On Mon, Jun 17, 2024 at 1:36 PM Thomas Munro <thomas.munro@gmail.com> wrote:

Perhaps it's technically possible to build Python without the ctypes
module, but I'm not sure and I don't see anywhere that describes it
explicitly as optional.

One clue is that they used to bundle their own copy of libffi before
Python 3.7. You had a choice of that or --with-system-ffi, but I
don't see an option for none. I might be missing something about
their build system, though.

#21Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Andrew Dunstan (#7)
#22Andrew Dunstan
andrew@dunslane.net
In reply to: Alvaro Herrera (#21)
#23Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Andrew Dunstan (#22)
#24Andrew Dunstan
andrew@dunslane.net
In reply to: Andres Freund (#10)
#25Andrew Dunstan
andrew@dunslane.net
In reply to: Andrew Dunstan (#24)
#26Thomas Munro
thomas.munro@gmail.com
In reply to: Andrew Dunstan (#25)
#27Andrew Dunstan
andrew@dunslane.net
In reply to: Thomas Munro (#26)