'Bad file descriptor: dup2( 1, 2 )' error on MacOS CI tasks

Started by Nazir Bilal Yavuz, 13 days ago, 6 messages, hackers
#1Nazir Bilal Yavuz
byavuz81@gmail.com

Hi,

psql/010_tab_completion, psql/030_pager and
authentication/001_password tests started to fail on MacOS CI tasks
[1]. Example failure log (the error message is the same on all of the
failing tests):

```
# Checking port 27038
# Found port 27038
Name: main
Version: 19devel
Data directory:
/Users/admin/pgsql/build/testrun/psql/030_pager/data/t_030_pager_main_data/pgdata
Backup directory:
/Users/admin/pgsql/build/testrun/psql/030_pager/data/t_030_pager_main_data/backup
Archive directory:
/Users/admin/pgsql/build/testrun/psql/030_pager/data/t_030_pager_main_data/archives
Connection string: port=27038
host=/var/folders/hm/d7rr9ds96qx995ns72ry9g4m0000gn/T/rkP99U6zgp
Log file: /Users/admin/pgsql/build/testrun/psql/030_pager/log/030_pager_main.log
[11:23:15.358](0.026s) # initializing database system by copying initdb template
# Running: cp -RPp
/Users/admin/pgsql/build/tmp_install/initdb-template
/Users/admin/pgsql/build/testrun/psql/030_pager/data/t_030_pager_main_data/pgdata
# Running: /Users/admin/pgsql/build/src/test/regress/pg_regress
--config-auth /Users/admin/pgsql/build/testrun/psql/030_pager/data/t_030_pager_main_data/pgdata
### Starting node "main"
# Running: pg_ctl --wait --pgdata
/Users/admin/pgsql/build/testrun/psql/030_pager/data/t_030_pager_main_data/pgdata
--log /Users/admin/pgsql/build/testrun/psql/030_pager/log/030_pager_main.log
--options --cluster-name=main start
waiting for server to start.... done
server started
# Postmaster PID for node "main" is 8554
Bad file descriptor: dup2( 1, 2 ) at
/Users/admin/pgsql/src/test/perl/PostgreSQL/Test/BackgroundPsql.pm
line 114.
at /Users/admin/pgsql/src/test/perl/PostgreSQL/Test/BackgroundPsql.pm line 114.
# Postmaster PID for node "main" is 8554
### Stopping node "main" using mode immediate
# Running: pg_ctl --pgdata
/Users/admin/pgsql/build/testrun/psql/030_pager/data/t_030_pager_main_data/pgdata
--mode immediate stop
waiting for server to shut down.... done
server stopped
# No postmaster PID for node "main"
```
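
As background on the message itself: dup2() fails with EBADF when its source descriptor is not open, so 'dup2( 1, 2 )' presumably means fd 1 was not a valid descriptor at the point where stderr was being redirected to it. Below is a minimal sketch of just that errno behaviour in Python. It is not the IPC::Run or IO::Pty code path, and the function name is only illustrative:

```python
import errno
import os


def dup2_from_closed_fd() -> int:
    """Return the errno raised when dup2() is given a closed source fd."""
    # Open a pipe only to obtain two descriptors that we own.
    read_fd, write_fd = os.pipe()
    # Close the read end, so read_fd now names a descriptor that is not open.
    os.close(read_fd)
    try:
        # Same shape as dup2(1, 2): duplicate a source onto a target.
        os.dup2(read_fd, write_fd)
    except OSError as exc:
        return exc.errno
    finally:
        # The target is still open when dup2() fails; clean it up.
        os.close(write_fd)
    return 0


if __name__ == "__main__":
    print(errno.errorcode[dup2_from_closed_fd()])
```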

My current hypothesis is that all of these three tests use 'IO::Pty'.
On MacOS CI, we install this via MacPorts using the 'p5.34-io-tty'
package [2], which was updated about 2.5 days ago. I am not sure if it is
relevant but it has a known issue on Darwin [3]:

```
Darwin 7.9.0
HPUX 10.20 & 11.00
EOF on the slave tty is not reported back to the master.
```

I attempted to verify this by downgrading 'p5.34-io-tty' to the
previous version (1.20) and checking whether CI passes, but I couldn't
confirm it, as I don't have a MacOS machine and, for some reason, the Cirrus
Terminal doesn't show up when I try to run CI with terminal access.

Please note that this problem doesn't happen on CFBot or Postgres CI
yet. I think there are two possible reasons:

1- We install packages by using MacPorts and then we save them as
cache so we don't need to install them for each CI Run. Problems will
start when this cache is invalidated or expired.
2- CFBot and Postgres CI use persistent workers. Cirrus CI might have
updated the Sequoia macOS VM images but persistent workers aren't
updated yet. Problems will start when they are updated.

[1]: https://cirrus-ci.com/task/6188397108133888
[2]: https://ports.macports.org/port/p5.34-io-tty/
[3]: https://metacpan.org/pod/IO::Tty#VERIFIED-SYSTEMS,-KNOWN-ISSUES

--
Regards,
Nazir Bilal Yavuz
Microsoft

#2Andres Freund
andres@anarazel.de
In reply to: Nazir Bilal Yavuz (#1)
Re: 'Bad file descriptor: dup2( 1, 2 )' error on MacOS CI tasks

Hi,

On 2026-04-01 16:28:21 +0300, Nazir Bilal Yavuz wrote:

psql/010_tab_completion, psql/030_pager and
authentication/001_password tests started to fail on MacOS CI tasks
[1]. Example failure log (error message is same on all of the failing
tests):
...
# Postmaster PID for node "main" is 8554
Bad file descriptor: dup2( 1, 2 ) at
/Users/admin/pgsql/src/test/perl/PostgreSQL/Test/BackgroundPsql.pm
line 114.
at /Users/admin/pgsql/src/test/perl/PostgreSQL/Test/BackgroundPsql.pm line 114.

Ugh.

My current hypothesis is that all of these three tests use 'IO::Pty'.
On MacOS CI, we install this via MacPorts using the 'p5.34-io-tty'
package [2], which was updated about 2.5 days ago. I am not sure if is
relevant but it has a known issue on Darwin [3]:

```
Darwin 7.9.0
HPUX 10.20 & 11.00
EOF on the slave tty is not reported back to the master.
```

I attempted to verify this by downgrading 'p5.34-io-tty' to the
previous version (1.20) and confirm that CI passes but I couldn't
confirm it as I don't have a MacOS machine and for some reason Cirrus
Terminal doesn't show up when I try to run CI with terminal access.

Please note that this problem doesn't happen on CFBot or Postgres CI
yet. I think there are two possible reasons:

1- We install packages by using MacPorts and then we save them as
cache so we don't need to install them for each CI Run. Problems will
start when this cache is invalidated or expired.

It's that. I cleared the cache for macos on my repo, and see the same issue
after that.

https://cirrus-ci.com/task/5023293209575424

This presumably means that every macports user (e.g. Tom) will see this as
well after installing the latest updates.

I'm afraid the guy maintaining both IPC::Run [1] and IO::Tty has gone all in on AI
authored code. Both IPC::Run and IO::Tty have seen more merges in the last
week than in the 5 years before. Stuff getting merged left and right, with
failing tests to boot.

If I wanted to do a supply chain attack on postgres, this would be the
way. Hijack IPC::Run, edit the commits locally on a committer's machine before
push, to add a backdoor, celebrate.

Greetings,

Andres Freund

[1]: /messages/by-id/CAN55FZ06xanSbJdHe-CurjX_qNuBWZDEvS1kAk36L38YCtZXnw@mail.gmail.com

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#2)
Re: 'Bad file descriptor: dup2( 1, 2 )' error on MacOS CI tasks

Andres Freund <andres@anarazel.de> writes:

On 2026-04-01 16:28:21 +0300, Nazir Bilal Yavuz wrote:

My current hypothesis is that all of these three tests use 'IO::Pty'.
On MacOS CI, we install this via MacPorts using the 'p5.34-io-tty'
package [2], which was updated about 2.5 days ago. I am not sure if is
relevant but it has a known issue on Darwin [3]:

This presumably means that every macports user (e.g. Tom), will see this as
well after installing the latest updates.

There must be some other factor involved. indri has been using
p5.34-io-tty since I last did "port update" there, which looks
to have been 22 March. And it's not failing. My laptop is
also okay with these tests, and it likewise has

$ port installed | grep io-.ty
p5.34-io-tty @1.200.0_0 (active)

indri and laptop are both on latest macOS Tahoe, maybe the
problem only manifests on older releases?

regards, tom lane

#4Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#3)
Re: 'Bad file descriptor: dup2( 1, 2 )' error on MacOS CI tasks

Hi,

On 2026-04-01 10:26:54 -0400, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

On 2026-04-01 16:28:21 +0300, Nazir Bilal Yavuz wrote:

My current hypothesis is that all of these three tests use 'IO::Pty'.
On MacOS CI, we install this via MacPorts using the 'p5.34-io-tty'
package [2], which was updated about 2.5 days ago. I am not sure if is
relevant but it has a known issue on Darwin [3]:

This presumably means that every macports user (e.g. Tom), will see this as
well after installing the latest updates.

There must be some other factor involved. indri has been using
p5.34-io-tty since I last did "port update" there, which looks
to have been 22 March. And it's not failing. My laptop is
also okay with these tests, and it likewise has

$ port installed | grep io-.ty
p5.34-io-tty @1.200.0_0 (active)

I think that means you have the old version. The new, broken, version appears
to be p5.34-io-tty @ 1.240.0

The problematic version was just released two days ago, so if you last did a
port update on the 22nd, it'd make sense that you don't have the problematic
version yet.
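
For what it's worth, the two version strings in this thread look consistent with how Perl treats decimal versions: the fraction is read in groups of three digits, so 1.20 normalizes to 1.200.0. I am assuming the MacPorts port versions follow the same scheme and that 1.240.0 corresponds to an upstream 1.24. A rough Python sketch of the mapping (the helper name is made up):

```python
def perl_decimal_to_dotted(version: str) -> str:
    """Convert a Perl decimal version such as '1.20' to dotted form '1.200.0'."""
    integer, _, fraction = version.partition(".")
    # Pad the fraction with zeros to a multiple of three digits.
    if len(fraction) % 3:
        fraction += "0" * (3 - len(fraction) % 3)
    # Each group of three digits becomes one numeric component.
    groups = [fraction[i:i + 3] for i in range(0, len(fraction), 3)]
    parts = [int(integer)] + [int(group) for group in groups]
    # Dotted versions carry at least three components.
    while len(parts) < 3:
        parts.append(0)
    return ".".join(str(part) for part in parts)


if __name__ == "__main__":
    for decimal in ("1.20", "1.24"):
        print(decimal, "->", perl_decimal_to_dotted(decimal))
```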

The macports change was:
https://github.com/macports/macports-ports/commit/7aafb014a1204e7bef28d5ec5b907bd2b0dee018

Greetings,

Andres Freund

#5Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Andres Freund (#2)
Re: 'Bad file descriptor: dup2( 1, 2 )' error on MacOS CI tasks

On Wed, Apr 1, 2026 at 6:58 AM Andres Freund <andres@anarazel.de> wrote:

I'm afraid the guy maintaining both IPC::Run [1] and IO::Tty has gone all in on AI
authored code. Both IPC::Run and IO::Tty have seen more merges in the last
week than in the 5 years before. Stuff getting merged left and right, with
failing tests to boot.

If I wanted to do a supply chain attack on postgres, this would be the
way. Hijack IPC::Run, edit the commits locally on a committers machine before
push, to add a backdoor, celebrate.

I did consider locking the exact version of IPC::Run during the NetBSD
flake debacle [1], but abandoned it after the cross-platform pain... I
believed signature verification was "good enough" at the time. Should
we reconsider?

--Jacob

[1]: https://github.com/anarazel/pg-vm-images/pull/125

#6Daniel Gustafsson
daniel@yesql.se
In reply to: Jacob Champion (#5)
Re: 'Bad file descriptor: dup2( 1, 2 )' error on MacOS CI tasks

On 1 Apr 2026, at 17:36, Jacob Champion <jacob.champion@enterprisedb.com> wrote:

On Wed, Apr 1, 2026 at 6:58 AM Andres Freund <andres@anarazel.de> wrote:

I'm afraid the guy maintaining both IPC::Run [1] and IO::Tty has gone all in on AI
authored code. Both IPC::Run and IO::Tty have seen more merges in the last
week than in the 5 years before. Stuff getting merged left and right, with
failing tests to boot.

If I wanted to do a supply chain attack on postgres, this would be the
way. Hijack IPC::Run, edit the commits locally on a committers machine before
push, to add a backdoor, celebrate.

I did consider locking the exact version of IPC::Run during the NetBSD
flake debacle [1], but abandoned it after the cross-platform pain... I
believed signature verification was "good enough" at the time. Should
we reconsider?

I think it makes sense to pin the versions of these modules; I personally have
them pinned on my system to avoid surprises from package upgrades.
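
A minimal sketch of what a check for such a pin could look like, in Python. It assumes 'port installed' lines have the shape Tom showed upthread ('name @version_revision (active)'); the PINNED table and the helper names are hypothetical, not anything in our tree:

```python
import re
from typing import List

# Hypothetical pin table: port name -> the only version we accept.
PINNED = {"p5.34-io-tty": "1.200.0"}

# Matches e.g. "p5.34-io-tty @1.200.0_0 (active)".
# The revision ("_0") and any variants are matched but ignored.
PORT_LINE = re.compile(r"^\s*(\S+)\s+@([^_+\s]+)\S*\s+\(active\)\s*$")


def pin_violations(port_installed_output: str) -> List[str]:
    """Return one message per active port whose version differs from its pin."""
    problems = []
    for line in port_installed_output.splitlines():
        match = PORT_LINE.match(line)
        if not match:
            continue  # header lines, inactive ports, anything unrecognized
        name, version = match.groups()
        wanted = PINNED.get(name)
        if wanted is not None and version != wanted:
            problems.append(f"{name}: installed {version}, pinned {wanted}")
    return problems
```

Fed the output of 'port installed', an empty list would mean every pinned port matches; a setup step could refuse to continue otherwise.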

--
Daniel Gustafsson