pgsql: Further cleanup of autoconf output files for GSSAPI changes.

Started by Tom Lane over 2 years ago, 31 messages
#1Tom Lane
tgl@sss.pgh.pa.us

Further cleanup of autoconf output files for GSSAPI changes.

Running autoheader was missed in f7431bca8. This is cosmetic since
we aren't using these HAVE_ symbols, but let's get everything in
sync while we're looking at this.

Discussion: /messages/by-id/2422362.1681741814@sss.pgh.pa.us

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/d48ac0070cff197125088959e5644ed051322f5e

Modified Files
--------------
src/include/pg_config.h.in | 6 ++++++
1 file changed, 6 insertions(+)

#2Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#1)
Re: pgsql: Further cleanup of autoconf output files for GSSAPI changes.

On 2023-04-17 Mo 11:22, Tom Lane wrote:

Further cleanup of autoconf output files for GSSAPI changes.

Running autoheader was missed in f7431bca8. This is cosmetic since
we aren't using these HAVE_ symbols, but let's get everything in
sync while we're looking at this.

Discussion: /messages/by-id/2422362.1681741814@sss.pgh.pa.us

I think this also needs a fix in src/tools/msvc/Solution.pm, see
<https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bowerbird&dt=2023-04-17%2016%3A30%3A03>

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#2)
Re: pgsql: Further cleanup of autoconf output files for GSSAPI changes.

Andrew Dunstan <andrew@dunslane.net> writes:

I think this also needs a fix in src/tools/msvc/Solution.pm, see
<https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bowerbird&dt=2023-04-17%2016%3A30%3A03>

Argh, forgot about that. Will fix.

(This three-build-system business can't go away soon enough.)

regards, tom lane

#4Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#3)
Re: pgsql: Further cleanup of autoconf output files for GSSAPI changes.

On 2023-04-17 Mo 15:53, Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

I think this also needs a fix in src/tools/msvc/Solution.pm, see
<https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bowerbird&dt=2023-04-17%2016%3A30%3A03>

Argh, forgot about that. Will fix.

(This three-build-system business can't go away soon enough.)

From my POV we can remove it any time - I am still having Windows
issues with meson, but only with MSYS2. The MSVC meson build on drongo
is perfectly well behaved.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#4)
Re: pgsql: Further cleanup of autoconf output files for GSSAPI changes.

Andrew Dunstan <andrew@dunslane.net> writes:

On 2023-04-17 Mo 15:53, Tom Lane wrote:

(This three-build-system business can't go away soon enough.)

From my POV we can remove it any time - I am still having Windows
issues with meson, but only with MSYS2. The MSVC meson build on drongo
is perfectly well behaved.

I think the outcome of the discussion a week or so ago was that
we want the MSVC scripts in v16, but we can nuke them from HEAD
as soon as the branch is made.

autoconf unfortunately will have to live a good bit longer ...
I don't think we're anywhere near the point where the meson
system is mature enough to drop that.

regards, tom lane

#6Andres Freund
andres@anarazel.de
In reply to: Andrew Dunstan (#4)
Re: pgsql: Further cleanup of autoconf output files for GSSAPI changes.

On 2023-04-17 16:22:30 -0400, Andrew Dunstan wrote:

I am still having Windows issues with meson, but only with MSYS2.

Any more details on that? I might be able to help out / improve things.

#7Andrew Dunstan
andrew@dunslane.net
In reply to: Andres Freund (#6)
Re: pgsql: Further cleanup of autoconf output files for GSSAPI changes.

On 2023-04-20 Th 11:06, Andres Freund wrote:

On 2023-04-17 16:22:30 -0400, Andrew Dunstan wrote:

I am still having Windows issues with meson, but only with MSYS2.

Any more details on that? I might be able to help out / improve things.

For some reason which makes no sense to me the buildfarm animal fails at
the first Stop-Db step. The DB is actually stopped, but pg_ctl returns a
non-zero status. The thing that's really odd is that meson isn't at all
involved in this step. But it's happened enough that I've had to back
off using meson builds on fairywren - I'm going to do more testing on a
new Windows instance.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#8Andrew Dunstan
andrew@dunslane.net
In reply to: Andrew Dunstan (#7)
issue with meson builds on msys2

[redirecting to -hackers]

On 2023-04-20 Th 15:37, Andrew Dunstan wrote:

On 2023-04-20 Th 11:06, Andres Freund wrote:

On 2023-04-17 16:22:30 -0400, Andrew Dunstan wrote:

I am still having Windows issues with meson, but only with MSYS2.

Any more details on that? I might be able to help out / improve things.

For some reason which makes no sense to me the buildfarm animal fails
at the first Stop-Db step. The DB is actually stopped, but pg_ctl
returns a non-zero status. The thing that's really odd is that meson
isn't at all involved in this step. But it's happened enough that I've
had to back off using meson builds on fairywren - I'm going to do more
testing on a new Windows instance.

Still running into this, and I am rather stumped. This is a blocker for
buildfarm support for meson:

Here's a simple illustration of the problem. If I do the identical test
with a non-meson build there is no problem:

pgrunner@EC2AMAZ-GCB871B UCRT64 ~/bf
$ export PGCTLTIMEOUT=300

pgrunner@EC2AMAZ-GCB871B UCRT64 ~/bf
$ /usr/bin/perl -e 'chdir "root/HEAD/instkeep.2023-04-25_11-09-41";
system("bin/pg_ctl -D data-C -l logfile start") ; print "fail\n" if $?; '
waiting for server to start.... done
server started

pgrunner@EC2AMAZ-GCB871B UCRT64 ~/bf
$ /usr/bin/perl -e 'chdir "root/HEAD/instkeep.2023-04-25_11-09-41";
system("bin/pg_ctl -D data-C -l logfile stop") ; print "fail\n" if $?; '
waiting for server to shut down....fail

pgrunner@EC2AMAZ-GCB871B UCRT64 ~/bf
$ tail root/HEAD/instkeep.2023-04-25_11-09-41/logfile
2023-04-26 12:44:50.188 UTC [5132:2] LOG:  listening on Unix socket
"C:/tools/msys64/tmp/buildfarm-jaWBkm/.s.PGSQL.5678"
2023-04-26 12:44:50.249 UTC [5388:1] LOG:  database system was shut down
at 2023-04-26 12:43:02 UTC
2023-04-26 12:44:50.260 UTC [5132:3] LOG:  database system is ready to
accept connections
2023-04-26 12:45:01.542 UTC [5132:4] LOG:  received fast shutdown request
2023-04-26 12:45:01.542 UTC [5132:5] LOG:  aborting any active transactions
2023-04-26 12:45:01.547 UTC [5132:6] LOG:  background worker "logical
replication launcher" (PID 3876) exited with exit code 1
2023-04-26 12:45:01.550 UTC [6032:1] LOG:  shutting down
2023-04-26 12:45:01.551 UTC [6032:2] LOG:  checkpoint starting: shutdown
immediate
2023-04-26 12:45:01.557 UTC [6032:3] LOG:  checkpoint complete: wrote 2
buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.003
s, sync=0.001 s, total=0.007 s; sync files=0, longest=0.000 s,
average=0.000 s; distance=0 kB, estimate=0 kB; lsn=0/1034E7F8, redo
lsn=0/1034E7F8
2023-04-26 12:45:01.568 UTC [5132:7] LOG:  database system is shut down

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#8)
Re: issue with meson builds on msys2

Andrew Dunstan <andrew@dunslane.net> writes:

For some reason which makes no sense to me the buildfarm animal fails
at the first Stop-Db step. The DB is actually stopped, but pg_ctl
returns a non-zero status. The thing that's really odd is that meson
isn't at all involved in this step. But it's happened enough that I've
had to back off using meson builds on fairywren - I'm going to do more
testing on a new Windows instance.

Here's a simple illustration of the problem. If I do the identical test
with a non-meson build there is no problem:

pgrunner@EC2AMAZ-GCB871B UCRT64 ~/bf
$ /usr/bin/perl -e 'chdir "root/HEAD/instkeep.2023-04-25_11-09-41";
system("bin/pg_ctl -D data-C -l logfile stop") ; print "fail\n" if $?; '
waiting for server to shut down....fail

Looking at the pg_ctl source code, the only way I can explain that
printout is that do_stop called wait_for_postmaster_stop which,
after one or more loops, exited via one of its exit() calls.
The lack of any message can be explained if we imagine that
write_stderr() output is going to the bit bucket. I'd start by
changing those write_stderr's to print_msg(), which visibly
does work; that should confirm the existence of the stderr
issue and show you how wait_for_postmaster_stop is failing.

regards, tom lane

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#9)
Re: issue with meson builds on msys2

I wrote:

Looking at the pg_ctl source code, the only way I can explain that
printout is that do_stop called wait_for_postmaster_stop which,
after one or more loops, exited via one of its exit() calls.

Ah, a little too hasty there: it's get_pgpid() that has to be
reaching an exit().

regards, tom lane

#11Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#10)
Re: issue with meson builds on msys2

On 2023-04-26 We 10:58, Tom Lane wrote:

I wrote:

Looking at the pg_ctl source code, the only way I can explain that
printout is that do_stop called wait_for_postmaster_stop which,
after one or more loops, exited via one of its exit() calls.

Ah, a little too hasty there: it's get_pgpid() that has to be
reaching an exit().

If I redirect the output to a file (which is what the buildfarm client
actually does), it seems like it completes successfully, but I still get
a non-zero exit:

pgrunner@EC2AMAZ-GCB871B UCRT64 ~/bf
$ /usr/bin/perl -e 'chdir "root/HEAD/instkeep.2023-04-25_11-09-41";
system("bin/pg_ctl -D data-C -l logfile stop > stoplog 2>&1") ; print
"BANG\n" if $?; '
BANG

pgrunner@EC2AMAZ-GCB871B UCRT64 ~/bf
$ cat root/HEAD/instkeep.2023-04-25_11-09-41/stoplog
waiting for server to shut down.... done
server stopped

It seems more than odd that we get to where the "server stopped" message
is printed but we get a failure.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#11)
Re: issue with meson builds on msys2

Andrew Dunstan <andrew@dunslane.net> writes:

If I redirect the output to a file (which is what the buildfarm client
actually does), it seems like it completes successfully, but I still get
a non-zero exit:

pgrunner@EC2AMAZ-GCB871B UCRT64 ~/bf
$ /usr/bin/perl -e 'chdir "root/HEAD/instkeep.2023-04-25_11-09-41";
system("bin/pg_ctl -D data-C -l logfile stop > stoplog 2>&1") ; print
"BANG\n" if $?; '
BANG

pgrunner@EC2AMAZ-GCB871B UCRT64 ~/bf
$ cat root/HEAD/instkeep.2023-04-25_11-09-41/stoplog
waiting for server to shut down.... done
server stopped

That's ... just wacko. do_stop() emits "waiting for server to shut
down...", "done", and "server stopped" in the same way (via print_msg).
How is it that all three messages show up in one context but not the
other? Could wait_for_postmaster_stop or get_pgpid be bollixing the
stdout channel somehow? Try redirecting stdout and stderr separately
to see if that proves anything.

It seems more than odd that we get to where the "server stopped" message
is printed but we get a failure.

Indeed, that's even weirder. do_stop() returns directly to the
exit(0) in main().

regards, tom lane

#13Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#12)
Re: issue with meson builds on msys2

On 2023-04-26 We 11:30, Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

If I redirect the output to a file (which is what the buildfarm client
actually does), it seems like it completes successfully, but I still get
a non-zero exit:
pgrunner@EC2AMAZ-GCB871B UCRT64 ~/bf
$ /usr/bin/perl -e 'chdir "root/HEAD/instkeep.2023-04-25_11-09-41";
system("bin/pg_ctl -D data-C -l logfile stop > stoplog 2>&1") ; print
"BANG\n" if $?; '
BANG
pgrunner@EC2AMAZ-GCB871B UCRT64 ~/bf
$ cat root/HEAD/instkeep.2023-04-25_11-09-41/stoplog
waiting for server to shut down.... done
server stopped

That's ... just wacko. do_stop() emits "waiting for server to shut
down...", "done", and "server stopped" in the same way (via print_msg).
How is it that all three messages show up in one context but not the
other? Could wait_for_postmaster_stop or get_pgpid be bollixing the
stdout channel somehow? Try redirecting stdout and stderr separately
to see if that proves anything.

Doesn't seem to prove much:

pgrunner@EC2AMAZ-GCB871B UCRT64 ~/bf
$ /usr/bin/perl -e 'chdir "root/HEAD/instkeep.2023-04-25_11-09-41";
system("bin/pg_ctl -D data-C -l logfile stop > stoplog.out 2>
stoplog.err") ; print "BANG\n" if $?; '
BANG

pgrunner@EC2AMAZ-GCB871B UCRT64 ~/bf
$ cat root/HEAD/instkeep.2023-04-25_11-09-41/stoplog.out
waiting for server to shut down.... done
server stopped

pgrunner@EC2AMAZ-GCB871B UCRT64 ~/bf
$ cat root/HEAD/instkeep.2023-04-25_11-09-41/stoplog.err

pgrunner@EC2AMAZ-GCB871B UCRT64 ~/bf

It seems more than odd that we get to where the "server stopped" message
is printed but we get a failure.

Indeed, that's even weirder. do_stop() returns directly to the
exit(0) in main().

And if I call it via IPC::Run it works:

pgrunner@EC2AMAZ-GCB871B UCRT64 ~/bf
$ /usr/bin/perl -e 'chdir "root/HEAD/instkeep.2023-04-25_11-09-41"; use
IPC::Run; my ($out, $err) = ("",""); IPC::Run::run ["bin/pg_ctl", "-D",
"data-C", "stop"], \"",\$out,\$err ; print "BANG\n" if $?; print "out:
$out\n" if $out; print "err: $err\n" if $err;'
out: waiting for server to shut down.... done
server stopped

It seems there is something odd in how msys perl (not ucrt perl)
implements system() that is tickled by this, but why that should only
occur when it's built using meson is completely beyond me. It should be
just another executable. And pg_ctl is behaving properly as far as we
can see. I'm not quite sure where to go from here. I guess I can try to
see if we have IPC::Run and if so use it. That would probably get me
over the hurdle for fairywren. This has already consumed far too much of
my time.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#14Andres Freund
andres@anarazel.de
In reply to: Andrew Dunstan (#8)
Re: issue with meson builds on msys2

Hi,

On 2023-04-26 09:59:05 -0400, Andrew Dunstan wrote:

Still running into this, and I am rather stumped. This is a blocker for
buildfarm support for meson:

Here's a simple illustration of the problem. If I do the identical test with
a non-meson build there is no problem:

This happens 100% reproducible?

pgrunner@EC2AMAZ-GCB871B UCRT64 ~/bf
$ export PGCTLTIMEOUT=300

pgrunner@EC2AMAZ-GCB871B UCRT64 ~/bf
$ /usr/bin/perl -e 'chdir "root/HEAD/instkeep.2023-04-25_11-09-41";
system("bin/pg_ctl -D data-C -l logfile start") ; print "fail\n" if $?; '
waiting for server to start.... done
server started

Does it happen as well if you use ucrt perl? Not because I think we should
require it, just to narrow the space.

Any chance that doing export MSYS=winjitdebug changes something? There's quite
a bit of similarity with the python issue you've also encountered - python
would just exit with a failure-indicating exit code.

pgrunner@EC2AMAZ-GCB871B UCRT64 ~/bf
$ /usr/bin/perl -e 'chdir "root/HEAD/instkeep.2023-04-25_11-09-41";
system("bin/pg_ctl -D data-C -l logfile stop") ; print "fail\n" if $?; '
waiting for server to shut down....fail

Hm. I don't remember the details, but in the python case I was able to get
some additional error code somehow, which then indicated that the
child-process failed with the NT status code indicating the equivalent of a
segfault.

I guess system() in msys perl will invoke bash as a shell to execute the
program. Perhaps the failing program isn't actually pg_ctl, but the shell? If
it is indeed bash, what does the shell report as the exit code of pg_ctl?
E.g. doing something like
system('bin/pg_ctl -D data-C -l logfile stop; echo $?');

Could you do ldd (with mingw's ldd, which understands PE binaries) of meson
and autoconf built pg_ctl on your machine? I wonder if we end up with a
different windows runtime or such. In the python case I had some
circumstantial evidence that the problem was dependent on the windows runtime
version.

Downthread you mention that the issue doesn't happen with IPC::Run - the
biggest difference I can see is that IPC::Run would IIRC not use a shell? Does
the problem "re-appear" if you make IPC::Run use a shell?

Greetings,

Andres Freund

#15Andrew Dunstan
andrew@dunslane.net
In reply to: Andres Freund (#14)
Re: issue with meson builds on msys2

On 2023-04-27 Th 18:18, Andres Freund wrote:

Hi,

On 2023-04-26 09:59:05 -0400, Andrew Dunstan wrote:

Still running into this, and I am rather stumped. This is a blocker for
buildfarm support for meson:

Here's a simple illustration of the problem. If I do the identical test with
a non-meson build there is no problem:

This happens 100% reproducible?

For a sufficiently modern installation of msys2 (20230318 version) this
is reproducible on autoconf builds as well.

For now it's off my list of meson blockers. I will pursue the issue when
I have time, but for now the IPC::Run workaround is sufficient.

The main thing that's now an issue on Windows is support for various
options like libxml2. I installed the libxml2 distro from the package
manager scoop, generated .lib files for the libxml2 and libxslt DLLs,
and was able to build with autoconf on msys2, and with our MSVC support,
but not with meson in either case. It looks like we need to expand the
logic in meson.build for a number of these, just as we have done for
perl, python, openssl, ldap etc.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#16Andres Freund
andres@anarazel.de
In reply to: Andrew Dunstan (#15)
Re: issue with meson builds on msys2

Hi,

On 2023-05-03 09:20:28 -0400, Andrew Dunstan wrote:

On 2023-04-27 Th 18:18, Andres Freund wrote:

Hi,

On 2023-04-26 09:59:05 -0400, Andrew Dunstan wrote:

Still running into this, and I am rather stumped. This is a blocker for
buildfarm support for meson:

Here's a simple illustration of the problem. If I do the identical test with
a non-meson build there is no problem:

This happens 100% reproducible?

For a sufficiently modern installation of msys2 (20230318 version) this is
reproducible on autoconf builds as well.

Oh. Seems like something we need to dig into independent of meson then :(

The main thing that's now an issue on Windows is support for various options
like libxml2. I installed the libxml2 distro from the package manager scoop,
generated .lib files for the libxml2 and libxslt DLLs, and was able to build
with autoconf on msys2, and with our MSVC support, but not with meson in
either case. It looks like we need to expand the logic in meson.build for a
number of these, just as we have done for perl, python, openssl, ldap etc.

I seriously doubt that trying to support every possible packaging thing on
windows is a good idea. What's the point of building against libraries from a
packaging solution that doesn't even come with .lib files? Windows already is
a massive pain to support for postgres, making it even more complicated / less
predictable is a really bad idea.

IMO, for windows, the path we should go down is to provide one documented way
to build the dependencies (e.g. using vcpkg or conan, perhaps also supporting
msys distributed libs), and define using something else to be unsupported (in
the "we don't help you", not in the "we explicitly try to break things"
sense). And it should be something that understands needing to build debug
and non-debug libraries.

Greetings,

Andres Freund

#17Andrew Dunstan
andrew@dunslane.net
In reply to: Andrew Dunstan (#15)
Re: issue with meson builds on msys2

On 2023-05-03 We 09:20, Andrew Dunstan wrote:

On 2023-04-27 Th 18:18, Andres Freund wrote:

Hi,

On 2023-04-26 09:59:05 -0400, Andrew Dunstan wrote:

Still running into this, and I am rather stumped. This is a blocker for
buildfarm support for meson:

Here's a simple illustration of the problem. If I do the identical test with
a non-meson build there is no problem:

This happens 100% reproducible?

For a sufficiently modern installation of msys2 (20230318 version)
this is reproducible on autoconf builds as well.

For now it's off my list of meson blockers. I will pursue the issue
when I have time, but for now the IPC::Run workaround is sufficient.

The main thing that's now an issue on Windows is support for various
options like libxml2. I installed the libxml2 distro from the package
manager scoop, generated .lib files for the libxml2 and libxslt DLLs,
and was able to build with autoconf on msys2, and with our MSVC
support, but not with meson in either case. It looks like we need to
expand the logic in meson.build for a number of these, just as we have
done for perl, python, openssl, ldap etc.

I've actually made some progress on this front. I grabbed and built
https://github.com/pkgconf/pkgconf.git (with meson :-) )

After that I set PKG_CONFIG_PATH to point to where the libxml .pc files
are installed, and lo and behold the meson/msvc build worked with libxml
/ libxslt. I did have to move libxml's openssl.pc file aside, as the
distro's version of openssl is extremely old, and we don't want to use
it (I'm using 3.1.0).
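
As a rough sketch of that setup in an msys2-style shell: the directory
below is a hypothetical placeholder, not the actual install location, and
the check assumes a pkg-config-compatible binary is on PATH under the name
pkg-config (a pkgconf build may install it as pkgconf instead). libxml-2.0
is the module name used by libxml2's .pc file.

```shell
# Hypothetical location of the libxml2/libxslt .pc files; adjust as needed.
pc_dir="/c/tools/libxml2/lib/pkgconfig"

# Prepend it to any existing search path. ':' is the separator assumed
# here; a native Windows pkgconf may expect ';' instead.
PKG_CONFIG_PATH="${pc_dir}${PKG_CONFIG_PATH:+:$PKG_CONFIG_PATH}"
export PKG_CONFIG_PATH

# Only probe for the module if a pkg-config implementation is available.
if command -v pkg-config >/dev/null 2>&1; then
    pkg-config --exists libxml-2.0 \
        && echo "libxml-2.0 found" \
        || echo "libxml-2.0 not found"
fi
```

If the probe reports the module as found, meson's dependency lookup
should be able to see it as well, assuming meson picks up the same
pkg-config binary and environment.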

Of course, this imposes an extra build dependency for Windows, but it's
not too onerous.

It also means that if anyone wants to use some dependency without a .pc
file they would need to create one. I'll keep trying to expand the list
of things I configure with.

Next targets will include ldap, lz4 and zstd.

I also need to test this with msys2; so far I have only tested with MSVC.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#18Andrew Dunstan
andrew@dunslane.net
In reply to: Andres Freund (#16)
Re: issue with meson builds on msys2

On 2023-05-03 We 14:26, Andres Freund wrote:

Hi,

On 2023-05-03 09:20:28 -0400, Andrew Dunstan wrote:

On 2023-04-27 Th 18:18, Andres Freund wrote:

Hi,

On 2023-04-26 09:59:05 -0400, Andrew Dunstan wrote:

Still running into this, and I am rather stumped. This is a blocker for
buildfarm support for meson:

Here's a simple illustration of the problem. If I do the identical test with
a non-meson build there is no problem:

This happens 100% reproducible?

For a sufficiently modern installation of msys2 (20230318 version) this is
reproducible on autoconf builds as well.

Oh. Seems like something we need to dig into independent of meson then :(

The main thing that's now an issue on Windows is support for various options
like libxml2. I installed the libxml2 distro from the package manager scoop,
generated .lib files for the libxml2 and libxslt DLLs, and was able to build
with autoconf on msys2, and with our MSVC support, but not with meson in
either case. It looks like we need to expand the logic in meson.build for a
number of these, just as we have done for perl, python, openssl, ldap etc.

I seriously doubt that trying to support every possible packaging thing on
windows is a good idea. What's the point of building against libraries from a
packaging solution that doesn't even come with .lib files? Windows already is
a massive pain to support for postgres, making it even more complicated / less
predictable is a really bad idea.

IMO, for windows, the path we should go down is to provide one documented way
to build the dependencies (e.g. using vcpkg or conan, perhaps also supporting
msys distributed libs), and define using something else to be unsupported (in
the "we don't help you", not in the "we explicitly try to break things"
sense). And it should be something that understands needing to build debug
and non-debug libraries.

I'm not familiar with conan. I have struggled considerably with vcpkg in
the past.

I don't think there is any one perfect answer.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#19Andres Freund
andres@anarazel.de
In reply to: Andrew Dunstan (#15)
Re: issue with meson builds on msys2

Hi,

On 2023-05-03 09:20:28 -0400, Andrew Dunstan wrote:

On 2023-04-27 Th 18:18, Andres Freund wrote:

On 2023-04-26 09:59:05 -0400, Andrew Dunstan wrote:

Still running into this, and I am rather stumped. This is a blocker for
buildfarm support for meson:

Here's a simple illustration of the problem. If I do the identical test with
a non-meson build there is no problem:

This happens 100% reproducible?

For a sufficiently modern installation of msys2 (20230318 version) this is
reproducible on autoconf builds as well.

For now it's off my list of meson blockers. I will pursue the issue when I
have time, but for now the IPC::Run workaround is sufficient.

Hm. I can't reproduce this in my test win10 VM, unfortunately. What OS / OS
version is the host? Any chance to get systeminfo.exe output or something like
that?

I think we ought to do something here. If newer environments cause failures
like this, it seems likely that this will spread to more and more applications
over time...

Greetings,

Andres Freund

#20Andrew Dunstan
andrew@dunslane.net
In reply to: Andres Freund (#19)
Re: issue with meson builds on msys2

On 2023-05-04 Th 19:54, Andres Freund wrote:

Hi,

On 2023-05-03 09:20:28 -0400, Andrew Dunstan wrote:

On 2023-04-27 Th 18:18, Andres Freund wrote:

On 2023-04-26 09:59:05 -0400, Andrew Dunstan wrote:

Still running into this, and I am rather stumped. This is a blocker for
buildfarm support for meson:

Here's a simple illustration of the problem. If I do the identical test with
a non-meson build there is no problem:

This happens 100% reproducible?

For a sufficiently modern installation of msys2 (20230318 version) this is
reproducible on autoconf builds as well.

For now it's off my list of meson blockers. I will pursue the issue when I
have time, but for now the IPC::Run workaround is sufficient.

Hm. I can't reproduce this in my test win10 VM, unfortunately. What OS / OS
version is the host? Any chance to get systeminfo.exe output or something like
that?

It's a Windows Server 2019 (v 1809) instance running on AWS.

Here's an extract from systeminfo:

OS Name:                   Microsoft Windows Server 2019 Datacenter
OS Version:                10.0.17763 N/A Build 17763
OS Manufacturer:           Microsoft Corporation
OS Configuration:          Standalone Server
OS Build Type:             Multiprocessor Free
Registered Owner:          EC2
Registered Organization:   Amazon.com
Product ID:                00430-00000-00000-AA796
Original Install Date:     4/24/2023, 10:28:31 AM
System Boot Time:          4/24/2023, 1:49:59 PM
System Manufacturer:       Amazon EC2
System Model:              t3.large
System Type:               x64-based PC
Processor(s):              1 Processor(s) Installed.
                           [01]: Intel64 Family 6 Model 85 Stepping 7
GenuineIntel ~2500 Mhz
BIOS Version:              Amazon EC2 1.0, 10/16/2017
Windows Directory:         C:\Windows
System Directory:          C:\Windows\system32
Boot Device:               \Device\HarddiskVolume1
System Locale:             en-us;English (United States)
Input Locale:              en-us;English (United States)
Time Zone:                 (UTC) Coordinated Universal Time
Total Physical Memory:     8,090 MB
Available Physical Memory: 4,843 MB
Virtual Memory: Max Size:  10,010 MB
Virtual Memory: Available: 7,405 MB
Virtual Memory: In Use:    2,605 MB

I think we ought to do something here. If newer environments cause failures
like this, it seems likely that this will spread to more and more applications
over time...

Just to reassure myself I have not been hallucinating, I repeated the test.

pgrunner@EC2AMAZ-GCB871B UCRT64 ~/bf/root/HEAD/inst
$ /usr/bin/perl -e 'system(qq{"bin/pg_ctl" -D data-C -w -l logfile start >
startlog 2>&1}) ; print $? ? "BANG: $?\n" : "OK\n";'
OK

pgrunner@EC2AMAZ-GCB871B UCRT64 ~/bf/root/HEAD/inst
$ /usr/bin/perl -e 'system(qq{"bin/pg_ctl" -D data-C -w -l logfile stop >
stoplog 2>&1}) ; print $? ? "BANG: $?\n" : "OK\n";'
BANG: 33280
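
For reference, Perl's $? after system() is a wait status, not a plain
exit code: the exit code is in the high byte ($? >> 8) and the
terminating signal, if any, is in the low 7 bits ($? & 127). A minimal
sketch of decoding the value above with shell arithmetic (decode_status
is a made-up helper name, not part of any tool mentioned here):

```shell
# Decode a Perl-style wait status into exit code and signal number.
decode_status() {
    status=$1
    exit_code=$(( status >> 8 ))   # high byte: the child's exit code
    signal=$(( status & 127 ))     # low 7 bits: terminating signal, if any
    echo "exit=$exit_code signal=$signal"
}

decode_status 33280
```

So 33280 corresponds to an exit code of 130 with no signal recorded. 130
is the value bash conventionally reports for a command killed by SIGINT
(128 + 2), but whether that convention is what produced 130 here is not
established.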

If you want to play I can arrange access.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#21Andres Freund
andres@anarazel.de
In reply to: Andrew Dunstan (#20)
Re: issue with meson builds on msys2

Hi,

On 2023-05-05 07:08:39 -0400, Andrew Dunstan wrote:

On 2023-05-04 Th 19:54, Andres Freund wrote:

Hm. I can't reproduce this in my test win10 VM, unfortunately. What OS / OS
version is the host? Any chance to get systeminfo.exe output or something like
that?

It's a Windows Server 2019 (v 1809) instance running on AWS.

Hm. When I hit the python issue I also couldn't repro it on windows 10. Cirrus
was also using Windows Server 2019...

I think we ought to do something here. If newer environments cause failures
like this, it seems likely that this will spread to more and more applications
over time...

Just to reassure myself I have not been hallucinating, I repeated the test.

pgrunner@EC2AMAZ-GCB871B UCRT64 ~/bf/root/HEAD/inst
$ /usr/bin/perl -e 'system(qq{"bin/pg_ctl" -D data-C -w -l logfile start >
startlog 2>&1}) ; print $? ? "BANG: $?\n" : "OK\n";'
OK

pgrunner@EC2AMAZ-GCB871B UCRT64 ~/bf/root/HEAD/inst
$ /usr/bin/perl -e 'system(qq{"bin/pg_ctl" -D data-C -w -l logfile stop >
stoplog 2>&1}) ; print $? ? "BANG: $?\n" : "OK\n";'
BANG: 33280

Oh, so it only happens when stopping, never when starting? That's
interesting...

If you want to play I can arrange access.

That'd be very helpful.

Greetings,

Andres Freund

#22Andres Freund
andres@anarazel.de
In reply to: Andrew Dunstan (#20)
Re: issue with meson builds on msys2

Hi,

On 2023-05-05 07:08:39 -0400, Andrew Dunstan wrote:

If you want to play I can arrange access.

Andrew did - thanks!

A first observation is that making the shell command slightly more
complicated, by echoing $? after pg_ctl, prevents the error:

/usr/bin/perl -e 'system(qq{"bin/pg_ctl" -D data-C -w -l logfile start > startlog 2>&1}) ;system(qq{"bin/pg_ctl" -D data-C -w -l logfile stop > stoplog 2>&1;}) ; print $? ? "BANG: $?\n" : "OK\n";'
BANG: 33280

/usr/bin/perl -e 'system(qq{"bin/pg_ctl" -D data-C -w -l logfile start > startlog 2>&1}) ;system(qq{"bin/pg_ctl" -D data-C -w -l logfile stop > stoplog 2>&1; echo $?}) ; print $? ? "BANG: $?\n" : "OK\n";'
0
OK

So does adding another layer of shell, either manually or via a subshell.

As Andrew observed earlier, the issue does not occur when not performing
redirection of the output. One interesting bit there is that the perl docs for
system include:
https://perldoc.perl.org/functions/system

If there are no shell metacharacters in the argument, it is split into words
and passed directly to execvp, which is more efficient. On Windows, only the
system PROGRAM LIST syntax will reliably avoid using the shell; system LIST,
even with more than one element, will fall back to the shell if the first
spawn fails.

My guess is that the issue somehow is triggered around the shell handling.

One relevant bit: If I use strace (from msys) within system, the subprograms
(shell and pg_ctl) actually exit with 0, from what I can tell - but 33280
still is returned. Unfortunately, if I use strace for all of perl, the error
vanishes.

Perhaps there are some odd interactions with the stuff that InheritstdHandles()
does?

Andrew, is it ok if I modify pg_ctl.c and rebuild? I don't know how "detached"
from the actual buildfarm animal the system you gave me access to is...

Greetings,

Andres Freund

#23Andrew Dunstan
andrew@dunslane.net
In reply to: Andres Freund (#22)
Re: issue with meson builds on msys2

On 2023-05-15 Mo 15:38, Andres Freund wrote:

Hi,

On 2023-05-05 07:08:39 -0400, Andrew Dunstan wrote:

If you want to play I can arrange access.

Andrew did - thanks!

A first observation is that making the shell command slightly more
complicated, by echoing $? after pg_ctl, prevents the error:

/usr/bin/perl -e 'system(qq{"bin/pg_ctl" -D data-C -w -l logfile start > startlog 2>&1}) ;system(qq{"bin/pg_ctl" -D data-C -w -l logfile stop > stoplog 2>&1;}) ; print $? ? "BANG: $?\n" : "OK\n";'
BANG: 33280

/usr/bin/perl -e 'system(qq{"bin/pg_ctl" -D data-C -w -l logfile start > startlog 2>&1}) ;system(qq{"bin/pg_ctl" -D data-C -w -l logfile stop > stoplog 2>&1; echo $?}) ; print $? ? "BANG: $?\n" : "OK\n";'
0
OK

You're now testing something else, namely the return of the echo rather
than the call to pg_ctl, so I don't think this is any kind of answer. It
would just be ignoring the result of pg_ctl.

So does adding another layer of shell, either manually or via a subshell.

As Andrew observed earlier, the issue does not occur when not performing
redirection of the output. One interesting bit there is that the perl docs for
system include:
https://perldoc.perl.org/functions/system

If there are no shell metacharacters in the argument, it is split into words
and passed directly to execvp, which is more efficient. On Windows, only the
system PROGRAM LIST syntax will reliably avoid using the shell; system LIST,
even with more than one element, will fall back to the shell if the first
spawn fails.

My guess is that the issue somehow is triggered around the shell handling.

One relevant bit: If I use strace (from msys) within system, the subprograms
(shell and pg_ctl) actually exit with 0, from what I can tell - but 33280
still is returned. Unfortunately, if I use strace for all of perl, the error
vanishes.

Perhaps there are some odd interactions with the stuff that InheritStdHandles()
does?

I observed the same thing with strace. Kind of a Heisenbug.

Andrew, is it ok if I modify pg_ctl.c and rebuild? I don't know how "detached"
from the actual buildfarm animal the system you gave me access to is...

Feel free to do anything you want. This is a completely separate
instance from the buildfarm animals. When we're done with this issue the
EC2 instance will go away.

If you use the script just run in test mode or from-source mode, so it
doesn't try to report results (that would fail anyway, as it doesn't
have a registered secret). You might have to force have_ipc_run to 0. Or
you can just build / install manually.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#24Andres Freund
andres@anarazel.de
In reply to: Andrew Dunstan (#23)
Re: issue with meson builds on msys2

Hi,

On 2023-05-15 16:01:39 -0400, Andrew Dunstan wrote:

On 2023-05-15 Mo 15:38, Andres Freund wrote:

Hi,

On 2023-05-05 07:08:39 -0400, Andrew Dunstan wrote:

If you want to play I can arrange access.

Andrew did - thanks!

A first observation is that making the shell command slightly more
complicated, by echoing $? after pg_ctl, prevents the error:

/usr/bin/perl -e 'system(qq{"bin/pg_ctl" -D data-C -w -l logfile start > startlog 2>&1}) ;system(qq{"bin/pg_ctl" -D data-C -w -l logfile stop > stoplog 2>&1;}) ; print $? ? "BANG: $?\n" : "OK\n";'
BANG: 33280

/usr/bin/perl -e 'system(qq{"bin/pg_ctl" -D data-C -w -l logfile start > startlog 2>&1}) ;system(qq{"bin/pg_ctl" -D data-C -w -l logfile stop > stoplog 2>&1; echo $?}) ; print $? ? "BANG: $?\n" : "OK\n";'
0
OK

You're now testing something else, namely the return of the echo rather than
the call to pg_ctl, so I don't think this is any kind of answer. It would
just be ignoring the result of pg_ctl.

It wouldn't really - the echo $? inside the system() would report the
error. Which it doesn't - note the "0" in the second output.
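To make that point concrete, here is a minimal sketch of how the two statuses relate. It assumes a plain POSIX sh, and `sh -c "exit 3"` is only a stand-in for a failing pg_ctl; the variable names are illustrative. The `echo $?` inside the command prints the status of the preceding command, while the status seen from outside is that of the echo itself.

```shell
# 'sh -c "exit 3"' stands in for a failing command (hypothetical, not pg_ctl).
# The inner shell prints the stand-in's status; its own exit status is echo's.
inner_output=$(sh -c 'sh -c "exit 3"; echo "$?"')
outer_status=$?

echo "inner: $inner_output"   # status of the stand-in, as printed by echo
echo "outer: $outer_status"   # status of the inner shell, i.e. of echo
```

So if pg_ctl had really failed in the second perl example, the printed value would have been non-zero even though the outer $? is 0.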

Andrew, is it ok if I modify pg_ctl.c and rebuild? I don't know how "detached"
from the actual buildfarm animal the system you gave me access to is...

Feel free to do anything you want. This is a completely separate instance
from the buildfarm animals. When we're done with this issue the EC2 instance
will go away.

Thanks!

Greetings,

Andres Freund

#25Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#24)
Re: issue with meson builds on msys2

Hi,

On 2023-05-15 13:13:26 -0700, Andres Freund wrote:

It wouldn't really - the echo $? inside the system() would report the
error. Which it doesn't - note the "0" in the second output.

Ah. Interesting. Part of the issue is perl (or msys?) swallowing some error
details.

I could see more details in strace once I added another layer of shell
evaluation inside the system() call.

  190  478261 [main] bash 44432 frok::parent: CreateProcessW (C:\tools\nmsys64\usr\bin\bash.exe, C:\tools\nmsys64\usr\bin\bash.exe, 0, 0, 1, 0x420, 0, 0, 0x7FFFFBE10, 0x7FFFFBDB0)
--- Process 7152 created
[...]
 1556  196093 [main] bash 44433 child_info_spawn::worker: pid 44433, prog_arg ./tmp_install/tools/nmsys64/home/pgrunner/bf/root/HEAD/inst/bin/pg_ctl, cmd line C:\tools\nmsys64\home\pgrunner\bf\root\HEAD\pgsql.build\tmp_install\tools\nmsys64\home\pgrunner\bf\root\HEAD\inst\bin\pg_ctl.exe -D t -w -l logfile stop)
  128  196221 [main] bash 44433! child_info_spawn::worker: new process name \\?\C:\tools\nmsys64\home\pgrunner\bf\root\HEAD\pgsql.build\tmp_install\tools\nmsys64\home\pgrunner\bf\root\HEAD\inst\bin\pg_ctl.exe
[...]
--- Process 6136 (pid: 44433) exited with status 0x0
[...]
--- Process 7152 exited with status 0xc000013a
5292450 5816310 [waitproc] bash 44432 pinfo::maybe_set_exit_code_from_windows: pid 44433, exit value - old 0x0, windows 0xC000013A, MSYS 0x8000002

So indeed, pg_ctl exits with 0, but bash ends up with a different exit code.

What's very interesting here is that the error is 0xC000013A, which is quite
different from the 33280 that perl then reports. From what I can see bash
actually returns 0xC000013A - I don't know how perl ends up with 33280 /
0x8200 from that.
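One possible reading of the 33280, as a sketch only: if perl's $? holds the usual wait status, with the exit code in the high byte and the signal number in the low 7 bits, then 33280 decodes to exit code 130. By shell convention 130 is 128 + 2, i.e. "terminated by SIGINT", which would at least be consistent with a Ctrl-C style exit. The variable names below are illustrative, and this does not explain where in msys the translation happens.

```shell
# Decode the value perl reported, assuming the usual wait-status layout:
# exit code in bits 8-15, terminating signal in the low 7 bits.
status=33280

exit_code=$(( status >> 8 ))
signal_bits=$(( status & 127 ))
# Shell convention: exit codes above 128 mean "128 + signal number".
implied_signal=$(( exit_code - 128 ))

printf 'wait status:    0x%X\n' "$status"
printf 'exit code:      %d\n' "$exit_code"
printf 'signal bits:    %d\n' "$signal_bits"
printf 'implied signal: %d\n' "$implied_signal"
```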

Either way, 0xC000013A is interesting - that's STATUS_CONTROL_C_EXIT.

Very interestingly the problem vanishes as soon as I add a redirection for
standard input into the mix. Notably it suffices to redirect stdin in the
pg_ctl *start*, even if not done for pg_ctl stop. There also is no issue if
perl's stdin is redirected from /dev/null.

My guess is that msys has an issue with refcounting consoles across multiple
processes.

After that I was able to reproduce the issue without really involving perl:

bash -c './tmp_install/tools/nmsys64/home/pgrunner/bf/root/HEAD/inst/bin/pg_ctl -D t -w -l logfile start > startlog 2>&1; ./tmp_install/tools/nmsys64/home/pgrunner/bf/root/HEAD/inst/bin/pg_ctl -D t -w -l logfile stop > stoplog 2>&1; echo inner: $?'; echo outer: $?

+ bash -c './tmp_install/tools/nmsys64/home/pgrunner/bf/root/HEAD/inst/bin/pg_ctl -D t -w -l logfile start > startlog 2>&1; ./tmp_install/tools/nmsys64/home/pgrunner/bf/root/HEAD/inst/bin/pg_ctl -D t -w -l logfile stop > stoplog 2>&1; echo inner: $?'
inner: 130
+ echo outer: 0
outer: 0

If you add -e, the inner: is obviously "transferred" to the outer: output.
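The effect of -e can be sketched without pg_ctl. This assumes a POSIX sh, with `sh -c "exit 130"` as a hypothetical stand-in for the stop step that comes back with 130. With -e the inner shell exits at the failing command, so the echo never runs and the status shows up on the outside instead.

```shell
# 'sh -c "exit 130"' stands in for the failing "pg_ctl stop" step (hypothetical).
# With -e, the inner shell aborts at the failure, before the echo runs.
status=0
output=$(sh -e -c 'sh -c "exit 130"; echo "inner: $?"') || status=$?

echo "output: '$output'"   # nothing was printed by the inner shell
echo "outer: $status"      # the failure status is now visible outside
```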

As soon as either the pg_ctl for the start, or the whole bash invocation, has
stdin redirected, the problem vanishes.

Greetings,

Andres Freund

#26Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#25)
Re: issue with meson builds on msys2

Hi,

On 2023-05-15 15:30:28 -0700, Andres Freund wrote:

As soon as either the pg_ctl for the start, or the whole bash invocation, has
stdin redirected, the problem vanishes.

For a moment I thought this could be related to InheritStdHandles() - but no,
it doesn't make a difference.

There's loads of handles referencing cygwin alive in pg_ctl.

Based on difference in strace output for bash -c "pg_ctl stop" for the case
where start redirected stdin (#1) and where not (#2), it looks like some part
of msys / cygwin sees that stdin is alive when preparing to execute "pg_ctl
stop", and then runs into trouble.

The way we start the child process on windows makes the use of cmd.exe for
redirection pretty odd.

I couldn't trivially reproduce this with a much simpler case (just nohup
sleep). Perhaps it's dependent on a wrapper cmd or such.

Greetings,

Andres Freund

#27Andrew Dunstan
andrew@dunslane.net
In reply to: Andres Freund (#26)
Re: issue with meson builds on msys2

On 2023-05-15 Mo 19:43, Andres Freund wrote:

Hi,

On 2023-05-15 15:30:28 -0700, Andres Freund wrote:

As soon as either the pg_ctl for the start, or the whole bash invocation, has
stdin redirected, the problem vanishes.

For a moment I thought this could be related to InheritStdHandles() - but no,
it doesn't make a difference.

There's loads of handles referencing cygwin alive in pg_ctl.

Based on difference in strace output for bash -c "pg_ctl stop" for the case
where start redirected stdin (#1) and where not (#2), it looks like some part
of msys / cygwin sees that stdin is alive when preparing to execute "pg_ctl
stop", and then runs into trouble.

The way we start the child process on windows makes the use of cmd.exe for
redirection pretty odd.

I couldn't trivially reproduce this with a much simpler case (just nohup
sleep). Perhaps it's dependent on a wrapper cmd or such.

I don't know where this all leaves us. It's still more than odd that the
start works fine and the stop doesn't.

This piece of code has worked happily for years. It's only a recent
installation or update of msys2 that's made the problem appear.

I have implemented a workaround where IPC::Run is available - that means
a little extra one-off work for people using msys2, but it's not a huge
burden. Beyond that I don't really want to spend a lot more energy on it.

I suppose the alternative would be to change the way the buildfarm calls
pg_ctl stop. Do you have a concrete suggestion for that?

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#28Andres Freund
andres@anarazel.de
In reply to: Andrew Dunstan (#27)
Re: issue with meson builds on msys2

Hi,

On 2023-05-16 08:55:20 -0400, Andrew Dunstan wrote:

I don't know where this all leaves us. It's still more than odd that the
start works fine and the stop doesn't.

From what I understand it's just a question of starting another shell, with
some redirection, after having previously started a shell, which left a
program running (thus still referencing the same console device).

This piece of code has worked happily for years. It's only a recent
installation or update of msys2 that's made the problem appear.

Yea, it does look like a bug somewhere. I just don't know how to make it a
small enough reproducer right now.

I have implemented a workaround where IPC::Run is available - that means a
little extra one-off work for people using msys2, but it's not a huge
burden. Beyond that I don't really want to spend a lot more energy on it.

I suppose the alternative would be to change the way the buildfarm calls
pg_ctl stop. Do you have a concrete suggestion for that?

The easiest fix is to redirect stdin to /dev/null (or some file, if that's
easier to do portably) - that should fix the problem entirely, without needing
IPC::Run.
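As a rough sketch of what that redirection looks like: the helper name `run_without_stdin` is made up for illustration, and the demonstration uses a small stand-in command rather than pg_ctl so that it runs anywhere. The stand-in just reports whether it could read anything from stdin.

```shell
# Hypothetical helper: run a command with stdin detached from the console.
run_without_stdin() {
    "$@" < /dev/null
}

# Stand-in command: reports whether anything could be read from stdin.
result=$(run_without_stdin sh -c 'if read -r line; then echo got-input; else echo no-input; fi')
echo "$result"

# For the case in this thread the call would be along these lines (not run here):
#   run_without_stdin bin/pg_ctl -D data-C -w -l logfile start > startlog 2>&1
```

The same effect can be had by appending `< /dev/null` directly to the command string passed to system().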

Greetings,

Andres Freund

#29Andrew Dunstan
andrew@dunslane.net
In reply to: Andres Freund (#28)
Re: issue with meson builds on msys2

On 2023-05-16 Tu 17:52, Andres Freund wrote:

I suppose the alternative would be to change the way the buildfarm calls
pg_ctl stop. Do you have a concrete suggestion for that?

The easiest fix is to redirect stdin to /dev/null (or some file, if that's
easier to do portably) - that should fix the problem entirely, without needing
IPC::Run.

Should only be needed for the start command, right? I can probably just
add "< $devnull" to the command. I'll test it out.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#30Andres Freund
andres@anarazel.de
In reply to: Andrew Dunstan (#29)
Re: issue with meson builds on msys2

Hi,

On May 17, 2023 2:51:41 PM PDT, Andrew Dunstan <andrew@dunslane.net> wrote:

On 2023-05-16 Tu 17:52, Andres Freund wrote:

I suppose the alternative would be to change the way the buildfarm calls
pg_ctl stop. Do you have a concrete suggestion for that?

The easiest fix is to redirect stdin to /dev/null (or some file, if that's
easier to do portably) - that should fix the problem entirely, without needing
IPC::Run.

Should only be needed for the start command, right?

I think so.

I can probably just add "< $devnull" to the command. I'll test it out.

Cool.

Andres


#31Andrew Dunstan
andrew@dunslane.net
In reply to: Andres Freund (#30)
Re: issue with meson builds on msys2

On 2023-05-17 We 17:55, Andres Freund wrote:

Hi,

On May 17, 2023 2:51:41 PM PDT, Andrew Dunstan <andrew@dunslane.net> wrote:

On 2023-05-16 Tu 17:52, Andres Freund wrote:

I suppose the alternative would be to change the way the buildfarm calls
pg_ctl stop. Do you have a concrete suggestion for that?

The easiest fix is to redirect stdin to /dev/null (or some file, if that's
easier to do portably) - that should fix the problem entirely, without needing
IPC::Run.

Should only be needed for the start command, right?

I think so.

I can probably just add "< $devnull" to the command. I'll test it out.

Cool.

OK, that seems to work. *whew*. Thanks for your help.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com