Regression test fails when 1) old PG is installed and 2) meson/ninja build is used

Started by Hayato Kuroda (Fujitsu), 9 months ago, 5 messages
#1 Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com

Dear hackers,

While creating patches for older branches, I ran into the $SUBJECT. I do not know meson
well, so I am not sure whether this is intentional.

Reproducer
=======
I could reproduce the failure with the following steps:

1. Install an old PG, e.g., PG16, on your system. Its .so files must be on your $LD_LIBRARY_PATH.
2. Build a newer PG, e.g., master, with the meson build system [1].
3. Run the regression tests; an ERROR is reported [2].

This issue does not happen when I use the autoconf/make build system.

Analysis
=====

According to the log, the instance started, but psql did not work correctly:

```
----------------------------------- stdout -----------------------------------
# executing test in /home/hayato/builddir/testrun/regress/regress group regress test regress
# initializing database system by copying initdb template
# using temp instance on port 40047 with PID 949892
Bail out!# test failed
----------------------------------- stderr -----------------------------------
psql: symbol lookup error: psql: undefined symbol: PQservice
# command failed: "psql" -X -q -c "CREATE DATABASE \"regression\" ...

(test program exited with status code 2)
==============================================================================
```

It looks like psql requires the function `PQservice` from the library but could
not find it in the libpq.so that was loaded. Since the function was introduced in
PG18, I suspect psql was resolved against the .so file of the old (installed) PG.
IIUC, each command should use the libraries in tmp_install, not the system's ones.
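
One way to check this suspicion is to see which directory on the library search path provides libpq first. The following is a minimal sketch, not a definitive diagnosis: `first_lib_dir` is a hypothetical helper name, `libpq.so.5` is assumed to be the library file name, and the search is a simplified model that ignores RPATH/RUNPATH and the ld.so cache.

```shell
# Hypothetical helper: print the first directory in a colon-separated search
# path that contains the given file name. This is a simplified model of an
# LD_LIBRARY_PATH lookup; it ignores RPATH/RUNPATH and the ld.so cache.
first_lib_dir() {
    local search_path=$1 lib_name=$2 dir
    local IFS=:
    for dir in $search_path; do
        # Skip empty entries rather than treating them as the current directory.
        [ -n "$dir" ] || continue
        if [ -e "$dir/$lib_name" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    # No directory in the search path provides the library.
    return 1
}

# Example (commented out; the result depends on the environment):
#   first_lib_dir "$LD_LIBRARY_PATH" libpq.so.5
```

If the printed directory is not the one under tmp_install, running `nm -D` on that file and searching the output for `PQservice` would show whether it is the old library.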

Is this an issue to be solved in the PG community, or is it expected behavior of meson/ninja?
Or could it happen only in my environment?

Note
====
I'm using AlmaLinux 9.5. I can give more details if needed.

[1]:
```
$ meson setup -Dinjection_points=true -Dcassert=true --optimization=0 --debug ../postgres/
The Meson build system
Version: 0.63.3
...
Project name: postgresql
Project version: 18devel
...
$ ninja
...
```

[2]:
```
$ meson test --suite setup --suite regress
ninja: Entering directory `/home/hayato/builddir'
ninja: no work to do.
1/4 postgresql:setup / tmp_install OK 0.82s
2/4 postgresql:setup / install_test_files OK 0.05s
3/4 postgresql:setup / initdb_cache OK 1.88s
4/4 postgresql:regress / regress/regress ERROR 3.66s exit status 2
...
Ok: 3
Expected Fail: 0
Fail: 1
Unexpected Pass: 0
Skipped: 0
Timeout: 0
```

Best regards,
Hayato Kuroda
FUJITSU LIMITED

#2 Andres Freund
andres@anarazel.de
In reply to: Hayato Kuroda (Fujitsu) (#1)
Re: Regression test fails when 1) old PG is installed and 2) meson/ninja build is used

Hi,

On 2025-04-11 07:53:07 +0000, Hayato Kuroda (Fujitsu) wrote:

Dear hackers,

While creating patches for older branches, I ran into the $SUBJECT. I do not know meson
well, so I am not sure whether this is intentional.

Reproducer
=======
I could reproduce the failure with the following steps:

1. Install an old PG, e.g., PG16, on your system. Its .so files must be on your $LD_LIBRARY_PATH.
2. Build a newer PG, e.g., master, with the meson build system [1].
3. Run the regression tests; an ERROR is reported [2].

This issue does not happen when I use the autoconf/make build system.

Analysis
=====

According to the log, the instance started, but psql did not work correctly:

```
----------------------------------- stdout -----------------------------------
# executing test in /home/hayato/builddir/testrun/regress/regress group regress test regress
# initializing database system by copying initdb template
# using temp instance on port 40047 with PID 949892
Bail out!# test failed
----------------------------------- stderr -----------------------------------
psql: symbol lookup error: psql: undefined symbol: PQservice
# command failed: "psql" -X -q -c "CREATE DATABASE \"regression\" ...

(test program exited with status code 2)
==============================================================================
```

I can't reproduce this. For me the psql started by pg_regress is the one in
tmp_install and so is the libpq it links to.

$ killall -STOP psql
$ ps aux|grep psql
andres 3375208 0.0 0.0 28696 9972 pts/5 T Apr10 0:00 psql tpch_10
andres 3597915 1.0 0.0 36036 10120 ? T 09:42 0:00 psql -X -a -q -d regression -v HIDE_TABLEAM=on -v HIDE_TOAST_COMPRESSION=on
andres 3597916 1.0 0.0 36036 10120 ? T 09:42 0:00 psql -X -a -q -d regression -v HIDE_TABLEAM=on -v HIDE_TOAST_COMPRESSION=on
andres 3597918 0.6 0.0 36036 10144 ? T 09:42 0:00 psql -X -a -q -d regression -v HIDE_TABLEAM=on -v HIDE_TOAST_COMPRESSION=on
andres 3597920 0.3 0.0 36036 10104 ? T 09:42 0:00 psql -X -a -q -d regression -v HIDE_TABLEAM=on -v HIDE_TOAST_COMPRESSION=on
andres 3597922 0.6 0.0 36036 10120 ? T 09:42 0:00 psql -X -a -q -d regression -v HIDE_TABLEAM=on -v HIDE_TOAST_COMPRESSION=on
andres 3597955 0.0 0.0 6608 2180 pts/0 S+ 09:42 0:00 grep psql
$ ls -l /proc/3597918/exe
lrwxrwxrwx 1 andres andres 0 Apr 11 09:42 /proc/3597918/exe -> /srv/dev/build/postgres/m-dev-assert/tmp_install/srv/dev/install/postgres/m-dev-assert/bin/psql

$ less /proc/3597918/maps
...
000 103:06 4831894711 /srv/dev/build/postgres/m-dev-assert/tmp_install/srv/dev/install/postgres/m-dev-assert/lib/x86_64-linux-gnu/libpq.so.5.18

And meson-logs/testlog.txt shows that the command is executed with
PATH=/srv/dev/build/postgres/m-dev-assert/tmp_install//srv/dev/install/postgres/m-dev-assert/bin:<other things>
LD_LIBRARY_PATH=/srv/dev/build/postgres/m-dev-assert/tmp_install//srv/dev/install/postgres/m-dev-assert/lib/x86_64-linux-gnu

Can you check whether your meson-logs/testlog.txt shows the appropriate
PATH/LD_LIBRARY_PATH and whether libpq is in the right place?

Greetings,

Andres Freund

#3 Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: Andres Freund (#2)
2 attachment(s)
RE: Regression test fails when 1) old PG is installed and 2) meson/ninja build is used

Dear Andres,

Sorry for the late response. I could not reproduce the failure for a while, but now I can again.
Once it occurs, it persists even if I create another build directory.
There might be hidden conditions that I have not found yet.

I can't reproduce this. For me the psql started by pg_regress is the one in
tmp_install and so is the libpq it links to.

$ killall -STOP psql
$ ps aux|grep psql
andres 3375208 0.0 0.0 28696 9972 pts/5 T Apr10 0:00 psql tpch_10
andres 3597915 1.0 0.0 36036 10120 ? T 09:42 0:00 psql -X -a -q -d regression -v HIDE_TABLEAM=on -v HIDE_TOAST_COMPRESSION=on
andres 3597916 1.0 0.0 36036 10120 ? T 09:42 0:00 psql -X -a -q -d regression -v HIDE_TABLEAM=on -v HIDE_TOAST_COMPRESSION=on
andres 3597918 0.6 0.0 36036 10144 ? T 09:42 0:00 psql -X -a -q -d regression -v HIDE_TABLEAM=on -v HIDE_TOAST_COMPRESSION=on
andres 3597920 0.3 0.0 36036 10104 ? T 09:42 0:00 psql -X -a -q -d regression -v HIDE_TABLEAM=on -v HIDE_TOAST_COMPRESSION=on
andres 3597922 0.6 0.0 36036 10120 ? T 09:42 0:00 psql -X -a -q -d regression -v HIDE_TABLEAM=on -v HIDE_TOAST_COMPRESSION=on
andres 3597955 0.0 0.0 6608 2180 pts/0 S+ 09:42 0:00 grep psql
$ ls -l /proc/3597918/exe
lrwxrwxrwx 1 andres andres 0 Apr 11 09:42 /proc/3597918/exe -> /srv/dev/build/postgres/m-dev-assert/tmp_install/srv/dev/install/postgres/m-dev-assert/bin/psql

$ less /proc/3597918/maps
...
000 103:06 4831894711 /srv/dev/build/postgres/m-dev-assert/tmp_install/srv/dev/install/postgres/m-dev-assert/lib/x86_64-linux-gnu/libpq.so.5.18

Hmm. I could not do this check because the psql command either fails to start or
exits immediately with the symbol lookup error.

I am not sure whether it is meaningful, but I attached the output of the ldd command,
run with the same PATH/LD_LIBRARY_PATH that meson test uses.

And meson-logs/testlog.txt shows that the command is executed with
PATH=/srv/dev/build/postgres/m-dev-assert/tmp_install//srv/dev/install/postgres/m-dev-assert/bin:<other things>
LD_LIBRARY_PATH=/srv/dev/build/postgres/m-dev-assert/tmp_install//srv/dev/install/postgres/m-dev-assert/lib/x86_64-linux-gnu

Can you check whether your meson-logs/testlog.txt shows the appropriate
PATH/LD_LIBRARY_PATH and whether libpq is in the right place?

I also checked PATH/LD_LIBRARY_PATH and they looked correct.

PATH=/home/hayato/builddir/tmp_install//usr/local/pgsql/bin:/home/hayato/builddir/src/test/regress:/usr/local/pgsql/bin/:<others>...
LD_LIBRARY_PATH=/home/hayato/builddir/tmp_install//usr/local/pgsql/lib64:/usr/local/pgsql/lib:/usr/local/lib:/usr/lib64/...

Attached is a file with some lines extracted from testlog.txt.

Best regards,
Hayato Kuroda
FUJITSU LIMITED

Attachments:

mod_testlog.txt (text/plain)
ldd_log.txt (text/plain)

#4 Andrew Dunstan
andrew@dunslane.net
In reply to: Hayato Kuroda (Fujitsu) (#3)
Re: Regression test fails when 1) old PG is installed and 2) meson/ninja build is used

On 2025-04-21 Mo 7:42 AM, Hayato Kuroda (Fujitsu) wrote:

I also checked PATH/LD_LIBRARY_PATH and they looked correct.

PATH=/home/hayato/builddir/tmp_install//usr/local/pgsql/bin:/home/hayato/builddir/src/test/regress:/usr/local/pgsql/bin/:<others>...
LD_LIBRARY_PATH=/home/hayato/builddir/tmp_install//usr/local/pgsql/lib64:/usr/local/pgsql/lib:/usr/local/lib:/usr/lib64/...

What is that extra stuff doing on the end of your LD_LIBRARY_PATH? That
looks wrong. Do you have LD_LIBRARY_PATH set in your calling environment?

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#5 Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: Andrew Dunstan (#4)
RE: Regression test fails when 1) old PG is installed and 2) meson/ninja build is used

Dear Andrew,

What is that extra stuff doing on the end of your LD_LIBRARY_PATH?
That looks wrong. Do you have LD_LIBRARY_PATH set in your calling environment?

To confirm, are you asking about the LD_LIBRARY_PATH in my bash session? Here it is:

```bash
$ echo $LD_LIBRARY_PATH
/usr/local/pgsql/lib:/usr/local/lib:/usr/lib64/
```

I changed the first entry to point to the lib64 directory, but the result did not change.
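
To see at a glance which entries of a search path come from outside the build tree, each entry can be listed with a marker. This is only a rough sketch: `mark_path_entries` is a hypothetical helper name, and the build directory is whatever the caller passes in.

```shell
# Hypothetical helper: list each entry of a colon-separated path and mark
# whether it lies under the given build directory.
mark_path_entries() {
    local search_path=$1 build_dir=$2 dir
    local IFS=:
    for dir in $search_path; do
        # Skip empty entries.
        [ -n "$dir" ] || continue
        case $dir in
            "$build_dir"/*) printf 'build   %s\n' "$dir" ;;
            *)              printf 'outside %s\n' "$dir" ;;
        esac
    done
}

# Example (commented out; pass the LD_LIBRARY_PATH value from testlog.txt):
#   mark_path_entries "$LD_LIBRARY_PATH" /home/hayato/builddir
```

Entries marked `outside` are the ones inherited from the calling environment rather than set up for tmp_install.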

Best regards,
Hayato Kuroda
FUJITSU LIMITED