fix: propagate M4 env variable to flex subprocess

Started by J. Javier Maestro8 months ago13 messages
#1J. Javier Maestro
jjmaestro@ieee.org
1 attachment(s)

Hi all,

I was trying to build Postgres with Meson specifying the binaries of the
tools such as bison, flex and m4. I found that the m4 binary wasn't being
used by flex, and digging into this led me to the pgflex wrapper.

The pgflex wrapper runs flex with an explicit environment, so it doesn't
inherit environment variables from the parent process. However, flex can
use the M4 env variable and/or the PATH (via execvp) to find the m4 macro
processor.

Thus, since flex honors the M4 env variable, it should be propagated to the
subprocess environment if it's set in the parent environment. This patch
fixes it.

Let me know what you think!

Regards,

Javier

Attachments:

fix-propagate-M4-env-variable-to-flex-subprocess.patchapplication/octet-stream; name=fix-propagate-M4-env-variable-to-flex-subprocess.patchDownload
From 1af963cc47b3507e2f954642867ee34a81729013 Mon Sep 17 00:00:00 2001
From: Javier Maestro <jjmaestro@ieee.org>
Date: Mon, 30 Sep 2024 14:47:19 +0100
Subject: [PATCH] fix: propagate M4 env variable to flex subprocess

The pgflex wrapper runs flex with an explicit environment, so it doesn't
inherit environment variables from the parent process.

However, flex can use the M4 env variable and/or the PATH (via execvp)
to find the m4 macro processor.
---
 src/tools/pgflex | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/src/tools/pgflex b/src/tools/pgflex
index baabe2d..15f766c 100755
--- a/src/tools/pgflex
+++ b/src/tools/pgflex
@@ -53,6 +53,11 @@ os.chdir(args.privatedir)
 # don't even need to make this conditional.
 env = {'FLEX_TMP_DIR': args.privatedir}
 
+# flex honors the M4 env variable, so if it's set in the environment
+# it should be propagated to the subprocess environment
+if "M4" in os.environ:
+    env['M4'] = os.environ["M4"]
+
 # build flex invocation
 command = [args.flex, '-o', args.output_file]
 if args.no_backup:
-- 
2.39.5 (Apple Git-154)

#2Andres Freund
andres@anarazel.de
In reply to: J. Javier Maestro (#1)
Re: fix: propagate M4 env variable to flex subprocess

Hi,

On 2025-05-12 23:14:59 -0400, J. Javier Maestro wrote:

The pgflex wrapper runs flex with an explicit environment, so it doesn't
inherit environment variables from the parent process. However, flex can
use the M4 env variable and/or the PATH (via execvp) to find the m4 macro
processor.

Thus, since flex honors the M4 env variable, it should be propagated to the
subprocess environment if it's set in the parent environment. This patch
fixes it.

I think it probably was not intentional to fully specify the environment,
rather than just *adding* FLEX_TMP_DIR to the caller's environment. Bilal, I
think you wrote this originally, do you recall?

It seems like an issue beyond just M4...

Greetings,

Andres Freund

#3J. Javier Maestro
jjmaestro@ieee.org
In reply to: Andres Freund (#2)
Re: fix: propagate M4 env variable to flex subprocess

On Tue, May 13, 2025 at 11:54 AM Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2025-05-12 23:14:59 -0400, J. Javier Maestro wrote:

The pgflex wrapper runs flex with an explicit environment, so it doesn't
inherit environment variables from the parent process. However, flex can
use the M4 env variable and/or the PATH (via execvp) to find the m4 macro
processor.

Thus, since flex honors the M4 env variable, it should be propagated to

the

subprocess environment if it's set in the parent environment. This patch
fixes it.

I think it probably was not intentional to fully specify the environment,
rather than just *adding* FLEX_TMP_DIR to the caller's environment.

I think so, it definitely looks like the only intent was to
specify FLEX_TMP_DIR.

But even if the goal wasn’t to fully specify the environment, this fix is
only passing an env variable that's supposed to be there because Flex will
honor it if set.

Bilal, I think you wrote this originally, do you recall?

It seems like an issue beyond just M4...

IIRC the rest of the tools in the environment have ways to be specified via
Meson options (BISON, FLEX, PERL) so the only issue I see is Flex not being
able to find the specific m4 binary. What other issue(s) are you
considering?

Thanks!

Javier

#4Nazir Bilal Yavuz
byavuz81@gmail.com
In reply to: Andres Freund (#2)
Re: fix: propagate M4 env variable to flex subprocess

Hi,

On Tue, 13 May 2025 at 18:54, Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2025-05-12 23:14:59 -0400, J. Javier Maestro wrote:

The pgflex wrapper runs flex with an explicit environment, so it doesn't
inherit environment variables from the parent process. However, flex can
use the M4 env variable and/or the PATH (via execvp) to find the m4 macro
processor.

Thus, since flex honors the M4 env variable, it should be propagated to the
subprocess environment if it's set in the parent environment. This patch
fixes it.

I think it probably was not intentional to fully specify the environment,
rather than just *adding* FLEX_TMP_DIR to the caller's environment. Bilal, I
think you wrote this originally, do you recall?

I found the original pull request [1]https://github.com/anarazel/postgres/pull/51 but it does not include the
'FLEX_TMP_DIR' part. I tried to search where the 'FLEX_TMP_DIR' part
is added [2]https://github.com/anarazel/postgres/commit/cd817afd4ab006b90307a940d96b5116d649165c, it seems that this part is added while rebasing so there
is no specific commit.

[1]: https://github.com/anarazel/postgres/pull/51
[2]: https://github.com/anarazel/postgres/commit/cd817afd4ab006b90307a940d96b5116d649165c

--
Regards,
Nazir Bilal Yavuz
Microsoft

#5J. Javier Maestro
jjmaestro@ieee.org
In reply to: Nazir Bilal Yavuz (#4)
Re: fix: propagate M4 env variable to flex subprocess

On Tue, May 20, 2025 at 8:53 AM Nazir Bilal Yavuz <byavuz81@gmail.com>
wrote:

Hi,

On Tue, 13 May 2025 at 18:54, Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2025-05-12 23:14:59 -0400, J. Javier Maestro wrote:

The pgflex wrapper runs flex with an explicit environment, so it

doesn't

inherit environment variables from the parent process. However, flex

can

use the M4 env variable and/or the PATH (via execvp) to find the m4

macro

processor.

Thus, since flex honors the M4 env variable, it should be propagated

to the

subprocess environment if it's set in the parent environment. This

patch

fixes it.

I think it probably was not intentional to fully specify the environment,
rather than just *adding* FLEX_TMP_DIR to the caller's environment.

Bilal, I

think you wrote this originally, do you recall?

I found the original pull request [1] but it does not include the
'FLEX_TMP_DIR' part. I tried to search where the 'FLEX_TMP_DIR' part
is added [2], it seems that this part is added while rebasing so there
is no specific commit.

[1] https://github.com/anarazel/postgres/pull/51
[2]
https://github.com/anarazel/postgres/commit/cd817afd4ab006b90307a940d96b5116d649165c

Thanks for the context.

Apart from these pointers to the original PRs, could you (and Andres) give
me feedback on the patch? Do you think it's OK to merge? Should I add it to
a commitfest?

On this note, on top of that patch, I also have other patches that I think
would be relevant. Given that you and Andres seem to use Github, here are
the patches that I would like to discuss:

https://github.com/monogres/postgres/compare/REL_16_0...monogres/patches/16.0

The last one is more of a "hack" but still, it shows the issues with the
current approach to "embedding metadata" in header files that end up in the
binaries.

What would be the most effective way to discuss these patches? Would it be
better to go one-by-one or to create a new thread focused on discussing
reproducibility, to discuss all of them?

Thanks a lot beforehand for your help,

Regards,

Javier

#6Andres Freund
andres@anarazel.de
In reply to: J. Javier Maestro (#3)
Re: fix: propagate M4 env variable to flex subprocess

Hi,

On 2025-05-17 23:32:24 -0400, J. Javier Maestro wrote:

On Tue, May 13, 2025 at 11:54 AM Andres Freund <andres@anarazel.de> wrote:

Bilal, I think you wrote this originally, do you recall?

It seems like an issue beyond just M4...

IIRC the rest of the tools in the environment have ways to be specified via
Meson options (BISON, FLEX, PERL) so the only issue I see is Flex not being
able to find the specific m4 binary. What other issue(s) are you
considering?

PATH, LD_LIBRARY_PATH, ...

I think this really should just add to the environment, rather than supplant
it. Do you want to write a patch like that? Otherwise I can.

Greetings,

Andres Freund

#7J. Javier Maestro
jjmaestro@ieee.org
In reply to: Andres Freund (#6)
1 attachment(s)
Re: fix: propagate M4 env variable to flex subprocess

On Wed, May 28, 2025 at 6:08 PM Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2025-05-17 23:32:24 -0400, J. Javier Maestro wrote:

On Tue, May 13, 2025 at 11:54 AM Andres Freund <andres@anarazel.de>

wrote:

Bilal, I think you wrote this originally, do you recall?

It seems like an issue beyond just M4...

IIRC the rest of the tools in the environment have ways to be specified

via

Meson options (BISON, FLEX, PERL) so the only issue I see is Flex not

being

able to find the specific m4 binary. What other issue(s) are you
considering?

PATH, LD_LIBRARY_PATH, ...

I think this really should just add to the environment, rather than
supplant
it.

Ah, understood. That definitely looks like a better option.

Do you want to write a patch like that? Otherwise I can.

Sure, I've attached the new patch. Let me know what you think, and if it's
OK, what are the next steps to get the patch merged in main!

Thanks,

Javier

Attachments:

0001-fix-pgflex-propagate-environment-to-flex-subprocess.patchapplication/octet-stream; name=0001-fix-pgflex-propagate-environment-to-flex-subprocess.patchDownload
From d3c74f4f5ded7ef8df964d995b26a4032ec75bf2 Mon Sep 17 00:00:00 2001
From: Javier Maestro <jjmaestro@ieee.org>
Date: Wed, 28 May 2025 18:51:00 +0100
Subject: [PATCH] fix: pgflex - propagate environment to flex subprocess
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Python's subprocess.run docs say that if the env argument is not None,
it will be used "instead of the default behavior of inheriting the
current process’ environment".

However, the environment should be preserved, only adding FLEX_TMP_DIR
to it. For more info, see the following thread:

https://www.postgresql.org/message-id/ot4w2y6es446gjgvhbpp45qt3wx65n5epjxgsu3ghgr63yuizn%40qeurkcofswmv
---
 src/tools/pgflex | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/tools/pgflex b/src/tools/pgflex
index 3986b06874..b8d9aa0086 100755
--- a/src/tools/pgflex
+++ b/src/tools/pgflex
@@ -48,7 +48,7 @@ os.chdir(args.privatedir)
 # contents. Set FLEX_TMP_DIR to the target private directory to avoid
 # that. That environment variable isn't consulted on other platforms, so we
 # don't even need to make this conditional.
-env = {'FLEX_TMP_DIR': args.privatedir}
+os.environ['FLEX_TMP_DIR'] = args.privatedir
 
 # build flex invocation
 command = [args.flex, '-o', args.output_file]
@@ -58,7 +58,7 @@ command += args.flex_flags
 command += [args.input_file]
 
 # create .c file from .l file
-sp = subprocess.run(command, env=env)
+sp = subprocess.run(command)
 if sp.returncode != 0:
     sys.exit(sp.returncode)
 
-- 
2.39.5 (Apple Git-154)

#8Peter Eisentraut
peter@eisentraut.org
In reply to: J. Javier Maestro (#7)
Re: fix: propagate M4 env variable to flex subprocess

On 28.05.25 20:42, J. Javier Maestro wrote:

On Wed, May 28, 2025 at 6:08 PM Andres Freund <andres@anarazel.de
<mailto:andres@anarazel.de>> wrote:

Hi,

On 2025-05-17 23:32:24 -0400, J. Javier Maestro wrote:

On Tue, May 13, 2025 at 11:54 AM Andres Freund

<andres@anarazel.de <mailto:andres@anarazel.de>> wrote:

Bilal, I think you wrote this originally, do you recall?

It seems like an issue beyond just M4...

IIRC the rest of the tools in the environment have ways to be

specified via

Meson options (BISON, FLEX, PERL) so the only issue I see is Flex

not being

able to find the specific m4 binary. What other issue(s) are you
considering?

PATH, LD_LIBRARY_PATH, ...

I think this really should just add to the environment, rather than
supplant
it.

Ah, understood. That definitely looks like a better option.

Do you want to write a patch like that? Otherwise I can.

Sure, I've attached the new patch. Let me know what you think, and if
it's OK, what are the next steps to get the patch merged in main!

This patch looks right to me.

I would wait for the PG19 branching at this point, unless there is a
concrete need for backpatching.

#9Nikolay Samokhvalov
nik@postgres.ai
In reply to: J. Javier Maestro (#7)
Re: fix: propagate M4 env variable to flex subprocess

On Wed, May 28, 2025 at 11:43 AM J. Javier Maestro <jjmaestro@ieee.org>
wrote:

On Wed, May 28, 2025 at 6:08 PM Andres Freund <andres@anarazel.de> wrote:

...

Do you want to write a patch like that? Otherwise I can.

Sure, I've attached the new patch. Let me know what you think, and if it's
OK, what are the next steps to get the patch merged in main!

I checked 0001 version, it builds and works as expected.

Not to lose it, created commitfest entry:
https://commitfest.postgresql.org/patch/5835/

(and marked as ready, considering others' words in this thread)

#10J. Javier Maestro
jjmaestro@ieee.org
In reply to: Peter Eisentraut (#8)
Re: fix: propagate M4 env variable to flex subprocess

On Tue, Jun 17, 2025 at 6:15 AM Peter Eisentraut <peter@eisentraut.org>
wrote:

This patch looks right to me.

Great, thanks!

I would wait for the PG19 branching at this point, unless there is a

concrete need for backpatching.

This patch fixes just one of the issues I found when trying to have a
reproducible PG build... and, given the fact that there are still other
issues that make a PG build non-reproducible (see the other patches in
https://github.com/monogres/postgres/compare/REL_16_0...monogres/patches/16.0)
I'd say it's OK if it's not backported.

Thank you!

Javier

#11J. Javier Maestro
jjmaestro@ieee.org
In reply to: Nikolay Samokhvalov (#9)
Re: fix: propagate M4 env variable to flex subprocess

On Sun, Jun 22, 2025 at 10:23 PM Nikolay Samokhvalov <nik@postgres.ai>
wrote:

On Wed, May 28, 2025 at 11:43 AM J. Javier Maestro <jjmaestro@ieee.org>
wrote:

On Wed, May 28, 2025 at 6:08 PM Andres Freund <andres@anarazel.de> wrote:

...

Do you want to write a patch like that? Otherwise I can.

Sure, I've attached the new patch. Let me know what you think, and if
it's OK, what are the next steps to get the patch merged in main!

I checked 0001 version, it builds and works as expected.

Not to lose it, created commitfest entry:
https://commitfest.postgresql.org/patch/5835/

(and marked as ready, considering others' words in this thread)

Thank you!

Cheers,

Javier

#12Peter Eisentraut
peter@eisentraut.org
In reply to: Peter Eisentraut (#8)
Re: fix: propagate M4 env variable to flex subprocess

On 17.06.25 07:15, Peter Eisentraut wrote:

On 28.05.25 20:42, J. Javier Maestro wrote:

On Wed, May 28, 2025 at 6:08 PM Andres Freund <andres@anarazel.de
<mailto:andres@anarazel.de>> wrote:

    Hi,

    On 2025-05-17 23:32:24 -0400, J. Javier Maestro wrote:
     > On Tue, May 13, 2025 at 11:54 AM Andres Freund
    <andres@anarazel.de <mailto:andres@anarazel.de>> wrote:
     > > Bilal, I think you wrote this originally, do you recall?
     > >
     > > It seems like an issue beyond just M4...
     > >
     >
     > IIRC the rest of the tools in the environment have ways to be
    specified via
     > Meson options (BISON, FLEX, PERL) so the only issue I see is Flex
    not being
     > able to find the specific m4 binary. What other issue(s) are you
     > considering?

    PATH, LD_LIBRARY_PATH, ...

    I think this really should just add to the environment, rather than
    supplant
    it.

Ah, understood. That definitely looks like a better option.

    Do you want to write a patch like that? Otherwise I can.

Sure, I've attached the new patch. Let me know what you think, and if
it's OK, what are the next steps to get the patch merged in main!

This patch looks right to me.

I would wait for the PG19 branching at this point, unless there is a
concrete need for backpatching.

committed

#13J. Javier Maestro
jjmaestro@ieee.org
In reply to: Peter Eisentraut (#12)
Re: fix: propagate M4 env variable to flex subprocess

On Mon, Jun 30, 2025 at 11:25 AM Peter Eisentraut <peter@eisentraut.org>
wrote:

On 17.06.25 07:15, Peter Eisentraut wrote:

On 28.05.25 20:42, J. Javier Maestro wrote:

On Wed, May 28, 2025 at 6:08 PM Andres Freund <andres@anarazel.de
<mailto:andres@anarazel.de>> wrote:

Hi,

On 2025-05-17 23:32:24 -0400, J. Javier Maestro wrote:

On Tue, May 13, 2025 at 11:54 AM Andres Freund

<andres@anarazel.de <mailto:andres@anarazel.de>> wrote:

Bilal, I think you wrote this originally, do you recall?

It seems like an issue beyond just M4...

IIRC the rest of the tools in the environment have ways to be

specified via

Meson options (BISON, FLEX, PERL) so the only issue I see is Flex

not being

able to find the specific m4 binary. What other issue(s) are you
considering?

PATH, LD_LIBRARY_PATH, ...

I think this really should just add to the environment, rather than
supplant
it.

Ah, understood. That definitely looks like a better option.

Do you want to write a patch like that? Otherwise I can.

Sure, I've attached the new patch. Let me know what you think, and if
it's OK, what are the next steps to get the patch merged in main!

This patch looks right to me.

I would wait for the PG19 branching at this point, unless there is a
concrete need for backpatching.

committed

Thanks!

Javier