make dist using git archive
One of the goals is to make the creation of the distribution tarball
more directly traceable to the git repository. That is why we removed
the "make distprep" step.
Here I want to take another step in that direction, by changing "make
dist" to directly use "git archive", rather than the custom shell script
it currently runs.
The simple summary is that it would run
git archive --format tar.gz --prefix postgresql-17devel/ HEAD -o
postgresql-17devel.tar.gz
(with appropriate version numbers, of course), and that's the tarball we
would ship.
There are analogous commands for other compression formats.
The actual command gets subtly more complicated if you need to run this
in a separate build directory. In my attached patch, the make version
doesn't support vpath at the moment, just so that it's easier to
understand for now. The meson version is a bit hairy.
I have studied and tested this quite a bit, and I have found that the
archives produced this way are deterministic and reproducible, meaning
for a given commit the result file should always be bit-for-bit identical.
The exception is that if you use a git version older than 2.38.0, gzip
records the platform in the archive, so you'd get a different output on
Windows vs. macOS vs. "UNIX" (everything else). In git 2.38.0, this was
changed so that everything is recorded as "UNIX" now. This is just
something to keep in mind. This issue is specific to the gzip format,
it does not affect other compression formats.
Meson has its own distribution building command (meson dist), but opted
against using that. The main problem is that the way they have
implemented it, it is not deterministic in the above sense. (Another
point is of course that we probably want a "make" version for the time
being.)
But the target name "dist" in meson is reserved for that reason, so I
needed to call the custom target "pgdist".
I did take one idea from meson: It runs a check before git archive that
the checkout is clean. That way, you avoid mistakes because of
uncommitted changes. This works well in my "make" implementation. In
the meson implementation, I had to find a workaround, because a
custom_target cannot have a dependency on a run_target. As also
mentioned above, the whole meson implementation is a bit ugly.
Anyway, with the attached patch you can do
make dist
or
meson compile -C build pgdist
and it produces the same set of tarballs as before, except it's done
differently.
The actual build scripts need some fine-tuning, but the general idea is
correct, I think.
Attachments:
v0-0001-make-dist-uses-git-archive.patchtext/plain; charset=UTF-8; name=v0-0001-make-dist-uses-git-archive.patchDownload
From 4b128faca90238d0a0bb6949a8050c2501d1bd67 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Sat, 20 Jan 2024 21:54:36 +0100
Subject: [PATCH v0] make dist uses git archive
---
GNUmakefile.in | 34 ++++++++++++----------------------
meson.build | 38 ++++++++++++++++++++++++++++++++++++++
2 files changed, 50 insertions(+), 22 deletions(-)
diff --git a/GNUmakefile.in b/GNUmakefile.in
index eba569e930e..3e04785ada2 100644
--- a/GNUmakefile.in
+++ b/GNUmakefile.in
@@ -87,29 +87,19 @@ update-unicode: | submake-generated-headers submake-libpgport
distdir = postgresql-$(VERSION)
dummy = =install=
+GIT = git
+
dist: $(distdir).tar.gz $(distdir).tar.bz2
- rm -rf $(distdir)
-
-$(distdir).tar: distdir
- $(TAR) chf $@ $(distdir)
-
-.INTERMEDIATE: $(distdir).tar
-
-distdir-location:
- @echo $(distdir)
-
-distdir:
- rm -rf $(distdir)* $(dummy)
- for x in `cd $(top_srcdir) && find . \( -name CVS -prune \) -o \( -name .git -prune \) -o -print`; do \
- file=`expr X$$x : 'X\./\(.*\)'`; \
- if test -d "$(top_srcdir)/$$file" ; then \
- mkdir "$(distdir)/$$file" && chmod 777 "$(distdir)/$$file"; \
- else \
- ln "$(top_srcdir)/$$file" "$(distdir)/$$file" >/dev/null 2>&1 \
- || cp "$(top_srcdir)/$$file" "$(distdir)/$$file"; \
- fi || exit; \
- done
- $(MAKE) -C $(distdir) distclean
+
+.PHONY: check-dirty-index
+check-dirty-index:
+ $(GIT) diff-index --quiet HEAD
+
+$(distdir).tar.gz: check-dirty-index
+ $(GIT) archive --format tar.gz --prefix $(distdir)/ HEAD -o $@
+
+$(distdir).tar.bz2: check-dirty-index
+ $(GIT) -c tar.tar.bz2.command='$(BZIP2) -c' archive --format tar.bz2 --prefix $(distdir)/ HEAD -o $@
distcheck: dist
rm -rf $(dummy)
diff --git a/meson.build b/meson.build
index c317144b6bc..f0d870c5192 100644
--- a/meson.build
+++ b/meson.build
@@ -3347,6 +3347,44 @@ run_target('help',
+###############################################################
+# Distribution archive
+###############################################################
+
+git = find_program('git', required: false, native: true, disabler: true)
+bzip2 = find_program('bzip2', required: false, native: true, disabler: true)
+
+distdir = meson.project_name() + '-' + meson.project_version()
+
+check_dirty_index = run_target('check-dirty-index',
+ command: [git, 'diff-index', '--quiet', 'HEAD'])
+
+tar_gz = custom_target('tar.gz',
+ build_always_stale: true,
+ command: [git, '-C', '@SOURCE_ROOT@', 'archive',
+ '--format', 'tar.gz',
+ '--prefix', distdir + '/',
+ '-o', '@BUILD_ROOT@/@OUTPUT@',
+ 'HEAD', '.'],
+ install: false,
+ output: distdir + '.tar.gz',
+)
+
+tar_bz2 = custom_target('tar.bz2',
+ build_always_stale: true,
+ command: [git, '-C', '@SOURCE_ROOT@', '-c', 'tar.tar.bz2.command=' + bzip2.path() + ' -c', 'archive',
+ '--format', 'tar.bz2',
+ '--prefix', distdir + '/',
+ '-o', '@BUILD_ROOT@/@OUTPUT@',
+ 'HEAD', '.'],
+ install: false,
+ output: distdir + '.tar.bz2',
+)
+
+alias_target('pgdist', [check_dirty_index, tar_gz, tar_bz2])
+
+
+
###############################################################
# The End, The End, My Friend
###############################################################
--
2.43.0
Hi,
On Mon, Jan 22, 2024 at 3:32 PM Peter Eisentraut <peter@eisentraut.org> wrote:
One of the goals is to make the creation of the distribution tarball
more directly traceable to the git repository. That is why we removed
the "make distprep" step.Here I want to take another step in that direction, by changing "make
dist" to directly use "git archive", rather than the custom shell script
it currently runs.The simple summary is that it would run
git archive --format tar.gz --prefix postgresql-17devel/ HEAD -o
postgresql-17devel.tar.gz(with appropriate version numbers, of course), and that's the tarball we
would ship.There are analogous commands for other compression formats.
The actual command gets subtly more complicated if you need to run this
in a separate build directory. In my attached patch, the make version
doesn't support vpath at the moment, just so that it's easier to
understand for now. The meson version is a bit hairy.I have studied and tested this quite a bit, and I have found that the
archives produced this way are deterministic and reproducible, meaning
for a given commit the result file should always be bit-for-bit identical.The exception is that if you use a git version older than 2.38.0, gzip
records the platform in the archive, so you'd get a different output on
Windows vs. macOS vs. "UNIX" (everything else). In git 2.38.0, this was
changed so that everything is recorded as "UNIX" now. This is just
something to keep in mind. This issue is specific to the gzip format,
it does not affect other compression formats.Meson has its own distribution building command (meson dist), but opted
against using that. The main problem is that the way they have
implemented it, it is not deterministic in the above sense. (Another
point is of course that we probably want a "make" version for the time
being.)But the target name "dist" in meson is reserved for that reason, so I
needed to call the custom target "pgdist".I did take one idea from meson: It runs a check before git archive that
the checkout is clean. That way, you avoid mistakes because of
uncommitted changes. This works well in my "make" implementation. In
the meson implementation, I had to find a workaround, because a
custom_target cannot have a dependency on a run_target. As also
mentioned above, the whole meson implementation is a bit ugly.Anyway, with the attached patch you can do
make dist
or
meson compile -C build pgdist
I played this with meson build on macOS, the packages are generated
in source root but not build root, I'm sure if this is by design but I think
polluting *working directory* is not good.
Another thing I'd like to point out is, should we also introduce *git commit*
or maybe *git tag* to package name, something like:
git archive --format tar.gz --prefix postgresql-17devel/ HEAD -o
postgresql-17devel-`git rev-parse --short HEAD`.tar.gz
git archive --format tar.gz --prefix postgresql-17devel/ HEAD -o
postgresql-`git describe --tags`.tar.gz
and it produces the same set of tarballs as before, except it's done
differently.The actual build scripts need some fine-tuning, but the general idea is
correct, I think.
I think this is a good idea, thanks for working on this.
--
Regards
Junwang Zhao
On 22.01.24 13:10, Junwang Zhao wrote:
I played this with meson build on macOS, the packages are generated
in source root but not build root, I'm sure if this is by design but I think
polluting *working directory* is not good.
Yes, it's not good, but I couldn't find a way to make it work.
This is part of the complications with meson I referred to. The
@BUILD_ROOT@ placeholder in custom_target() is apparently always a
relative path, but it doesn't know that git -C changes the current
directory.
Another thing I'd like to point out is, should we also introduce *git commit*
or maybe *git tag* to package name, something like:git archive --format tar.gz --prefix postgresql-17devel/ HEAD -o
postgresql-17devel-`git rev-parse --short HEAD`.tar.gz
git archive --format tar.gz --prefix postgresql-17devel/ HEAD -o
postgresql-`git describe --tags`.tar.gz
I'm not sure why we would need it built-in. It can be done by hand, of
course.
On Mon Jan 22, 2024 at 1:31 AM CST, Peter Eisentraut wrote:
From 4b128faca90238d0a0bb6949a8050c2501d1bd67 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Sat, 20 Jan 2024 21:54:36 +0100
Subject: [PATCH v0] make dist uses git archive---
GNUmakefile.in | 34 ++++++++++++----------------------
meson.build | 38 ++++++++++++++++++++++++++++++++++++++
2 files changed, 50 insertions(+), 22 deletions(-)diff --git a/GNUmakefile.in b/GNUmakefile.in index eba569e930e..3e04785ada2 100644 --- a/GNUmakefile.in +++ b/GNUmakefile.in @@ -87,29 +87,19 @@ update-unicode: | submake-generated-headers submake-libpgport distdir = postgresql-$(VERSION) dummy = =install=+GIT = git + dist: $(distdir).tar.gz $(distdir).tar.bz2 - rm -rf $(distdir) - -$(distdir).tar: distdir - $(TAR) chf $@ $(distdir) - -.INTERMEDIATE: $(distdir).tar - -distdir-location: - @echo $(distdir) - -distdir: - rm -rf $(distdir)* $(dummy) - for x in `cd $(top_srcdir) && find . \( -name CVS -prune \) -o \( -name .git -prune \) -o -print`; do \ - file=`expr X$$x : 'X\./\(.*\)'`; \ - if test -d "$(top_srcdir)/$$file" ; then \ - mkdir "$(distdir)/$$file" && chmod 777 "$(distdir)/$$file"; \ - else \ - ln "$(top_srcdir)/$$file" "$(distdir)/$$file" >/dev/null 2>&1 \ - || cp "$(top_srcdir)/$$file" "$(distdir)/$$file"; \ - fi || exit; \ - done - $(MAKE) -C $(distdir) distclean + +.PHONY: check-dirty-index +check-dirty-index: + $(GIT) diff-index --quiet HEAD + +$(distdir).tar.gz: check-dirty-index + $(GIT) archive --format tar.gz --prefix $(distdir)/ HEAD -o $@ + +$(distdir).tar.bz2: check-dirty-index + $(GIT) -c tar.tar.bz2.command='$(BZIP2) -c' archive --format tar.bz2 --prefix $(distdir)/ HEAD -o $@distcheck: dist rm -rf $(dummy) diff --git a/meson.build b/meson.build index c317144b6bc..f0d870c5192 100644 --- a/meson.build +++ b/meson.build @@ -3347,6 +3347,44 @@ run_target('help',+############################################################### +# Distribution archive +############################################################### + +git = find_program('git', required: false, native: true, disabler: true) +bzip2 = find_program('bzip2', required: false, native: true, disabler: true)
This doesn't need to be a disabler. git is fine as-is. See later
comment. Disablers only work like you are expecting when they are used
like how git is used. Once you call a method like .path(), all bets are
off.
+distdir = meson.project_name() + '-' + meson.project_version() + +check_dirty_index = run_target('check-dirty-index', + command: [git, 'diff-index', '--quiet', 'HEAD'])
Seems like you might want to add -C here too?
+ +tar_gz = custom_target('tar.gz', + build_always_stale: true, + command: [git, '-C', '@SOURCE_ROOT@', 'archive', + '--format', 'tar.gz', + '--prefix', distdir + '/', + '-o', '@BUILD_ROOT@/@OUTPUT@', + 'HEAD', '.'], + install: false, + output: distdir + '.tar.gz', +) + +tar_bz2 = custom_target('tar.bz2', + build_always_stale: true, + command: [git, '-C', '@SOURCE_ROOT@', '-c', 'tar.tar.bz2.command=' + bzip2.path() + ' -c', 'archive', + '--format', 'tar.bz2', + '--prefix', distdir + '/',
- '-o', '@BUILD_ROOT@/@OUTPUT@',
+ '-o', join_paths(meson.build_root(), '@OUTPUT@'),
This will generate the tarballs in the build directory. Do the same for
the previous target. Tested locally.
+ 'HEAD', '.'], + install: false, + output: distdir + '.tar.bz2', +)
The bz2 target should be wrapped in an `if bzip2.found()`. It is
possible for git to be found, but not bzip2. I might also define the bz2
command out of line. Also, you may want to add
these programs to meson_options.txt for overriding, even though the
"meson-ic" way is to use a machine file.
+ +alias_target('pgdist', [check_dirty_index, tar_gz, tar_bz2])
Are you intending for the check_dirty_index target to prohibit the other
two targets from running? Currently that is not the case. If it is what
you intend, use a stamp file or something to indicate a relationship.
Alternatively, inline the git diff-index into the other commands. These
might also do better as external scripts. It would reduce duplication
between the autotools and Meson builds.
+ + + ############################################################### # The End, The End, My Friend ###############################################################
I am not really following why we can't use the builtin Meson dist
command. The only difference from my testing is it doesn't use
a --prefix argument.
--
Tristan Partin
Neon (https://neon.tech)
On Tue, Jan 23, 2024 at 2:36 AM Peter Eisentraut <peter@eisentraut.org> wrote:
On 22.01.24 13:10, Junwang Zhao wrote:
I played this with meson build on macOS, the packages are generated
in source root but not build root, I'm sure if this is by design but I think
polluting *working directory* is not good.Yes, it's not good, but I couldn't find a way to make it work.
This is part of the complications with meson I referred to. The
@BUILD_ROOT@ placeholder in custom_target() is apparently always a
relative path, but it doesn't know that git -C changes the current
directory.Another thing I'd like to point out is, should we also introduce *git commit*
or maybe *git tag* to package name, something like:git archive --format tar.gz --prefix postgresql-17devel/ HEAD -o
postgresql-17devel-`git rev-parse --short HEAD`.tar.gz
git archive --format tar.gz --prefix postgresql-17devel/ HEAD -o
postgresql-`git describe --tags`.tar.gzI'm not sure why we would need it built-in. It can be done by hand, of
course.
If this is only used by the release phase, one can do this by hand.
*commit id/tag* in package name can be used to identify the git source,
which might be useful for cooperation between QA and dev team,
but surely there are better ways for this, so I do not have a strong
opinion here.
--
Regards
Junwang Zhao
On 22.01.24 21:04, Tristan Partin wrote:
I am not really following why we can't use the builtin Meson dist
command. The only difference from my testing is it doesn't use a
--prefix argument.
Here are some problems I have identified:
1. meson dist internally runs gzip without the -n option. That makes
the tar.gz archive include a timestamp, which in turn makes it not
reproducible.
2. Because gzip includes a platform indicator in the archive, the
produced tar.gz archive is not reproducible across platforms. (I don't
know if gzip has an option to avoid that. git archive uses an internal
gzip implementation that handles this.)
3. Meson does not support tar.bz2 archives.
4. Meson uses git archive internally, but then unpacks and repacks the
archive, which loses the ability to use git get-tar-commit-id.
5. I have found that the tar archives created by meson and git archive
include the files in different orders. I suspect that the Python
tarfile module introduces some either randomness or platform dependency.
6. meson dist is also slower because of the additional work.
7. meson dist produces .sha256sum files but we have called them .sha256.
(This is obviously trivial, but it is something that would need to be
dealt with somehow nonetheless.)
Most or all of these issues are fixable, either upstream in Meson or by
adjusting our own requirements. But for now this route would have some
significant disadvantages.
On Tue Jan 23, 2024 at 3:30 AM CST, Peter Eisentraut wrote:
On 22.01.24 21:04, Tristan Partin wrote:
I am not really following why we can't use the builtin Meson dist
command. The only difference from my testing is it doesn't use a
--prefix argument.Here are some problems I have identified:
1. meson dist internally runs gzip without the -n option. That makes
the tar.gz archive include a timestamp, which in turn makes it not
reproducible.2. Because gzip includes a platform indicator in the archive, the
produced tar.gz archive is not reproducible across platforms. (I don't
know if gzip has an option to avoid that. git archive uses an internal
gzip implementation that handles this.)3. Meson does not support tar.bz2 archives.
4. Meson uses git archive internally, but then unpacks and repacks the
archive, which loses the ability to use git get-tar-commit-id.5. I have found that the tar archives created by meson and git archive
include the files in different orders. I suspect that the Python
tarfile module introduces some either randomness or platform dependency.6. meson dist is also slower because of the additional work.
7. meson dist produces .sha256sum files but we have called them .sha256.
(This is obviously trivial, but it is something that would need to be
dealt with somehow nonetheless.)Most or all of these issues are fixable, either upstream in Meson or by
adjusting our own requirements. But for now this route would have some
significant disadvantages.
Thanks Peter. I will bring these up with upstream!
--
Tristan Partin
Neon (https://neon.tech)
On Wed Jan 24, 2024 at 10:18 AM CST, Tristan Partin wrote:
On Tue Jan 23, 2024 at 3:30 AM CST, Peter Eisentraut wrote:
On 22.01.24 21:04, Tristan Partin wrote:
I am not really following why we can't use the builtin Meson dist
command. The only difference from my testing is it doesn't use a
--prefix argument.Here are some problems I have identified:
1. meson dist internally runs gzip without the -n option. That makes
the tar.gz archive include a timestamp, which in turn makes it not
reproducible.
It doesn't look like Python provides the facilities to affect this.
2. Because gzip includes a platform indicator in the archive, the
produced tar.gz archive is not reproducible across platforms. (I don't
know if gzip has an option to avoid that. git archive uses an internal
gzip implementation that handles this.)
Same reason as above.
3. Meson does not support tar.bz2 archives.
Submitted https://github.com/mesonbuild/meson/pull/12770.
4. Meson uses git archive internally, but then unpacks and repacks the
archive, which loses the ability to use git get-tar-commit-id.
Because Meson allows projects to distribute arbitrary files via
meson.add_dist_script(), and can include subprojects via `meson dist
--include-subprojects`, this doesn't seem like an easily solvable
problem.
5. I have found that the tar archives created by meson and git archive
include the files in different orders. I suspect that the Python
tarfile module introduces some either randomness or platform dependency.
Seems likely.
6. meson dist is also slower because of the additional work.
Not easily solvable due to 4.
7. meson dist produces .sha256sum files but we have called them .sha256.
(This is obviously trivial, but it is something that would need to be
dealt with somehow nonetheless.)Most or all of these issues are fixable, either upstream in Meson or by
adjusting our own requirements. But for now this route would have some
significant disadvantages.Thanks Peter. I will bring these up with upstream!
I think the solution to point 4 is to not unpack/repack if there are no
dist scripts and/or subprojects to distribute. I can take a look at
this later. I think this would also solve points 1, 2, 5, and 6 because
at that point meson is just calling git-archive.
--
Tristan Partin
Neon (https://neon.tech)
On 24.01.24 18:57, Tristan Partin wrote:
4. Meson uses git archive internally, but then unpacks and repacks
the > archive, which loses the ability to use git get-tar-commit-id.
Because Meson allows projects to distribute arbitrary files via
meson.add_dist_script(), and can include subprojects via `meson dist
--include-subprojects`, this doesn't seem like an easily solvable problem.
git archive has the --add-file option, which can probably do the same
thing. Subprojects are another thing, but obviously are more rarely used.
I think the solution to point 4 is to not unpack/repack if there are no
dist scripts and/or subprojects to distribute. I can take a look at this
later. I think this would also solve points 1, 2, 5, and 6 because at
that point meson is just calling git-archive.
I think that would be a useful direction.
On 22.01.24 21:04, Tristan Partin wrote:
+git = find_program('git', required: false, native: true, disabler: true) +bzip2 = find_program('bzip2', required: false, native: true, disabler: true)This doesn't need to be a disabler. git is fine as-is. See later
comment. Disablers only work like you are expecting when they are used
like how git is used. Once you call a method like .path(), all bets are
off.
ok, fixed
+distdir = meson.project_name() + '-' + meson.project_version() + +check_dirty_index = run_target('check-dirty-index', + command: [git, 'diff-index', '--quiet', 'HEAD'])Seems like you might want to add -C here too?
done
+tar_bz2 = custom_target('tar.bz2', + build_always_stale: true, + command: [git, '-C', '@SOURCE_ROOT@', '-c', 'tar.tar.bz2.command=' + bzip2.path() + ' -c', 'archive', + '--format', 'tar.bz2', + '--prefix', distdir + '/',- '-o', '@BUILD_ROOT@/@OUTPUT@', + '-o', join_paths(meson.build_root(), '@OUTPUT@'),This will generate the tarballs in the build directory. Do the same for
the previous target. Tested locally.
Fixed, thanks. I had struggled with this one.
+ 'HEAD', '.'], + install: false, + output: distdir + '.tar.bz2', +)The bz2 target should be wrapped in an `if bzip2.found()`.
Well, I think we want the dist step to fail if bzip2 is not there. At
least that is the current expectation.
+alias_target('pgdist', [check_dirty_index, tar_gz, tar_bz2])
Are you intending for the check_dirty_index target to prohibit the other
two targets from running? Currently that is not the case.
Yes, that was the hope, and that's how the make dist variant works. But
I couldn't figure this out with meson. Also, the above actually also
doesn't work with older meson versions, so I had to comment this out to
get CI to work.
If it is what
you intend, use a stamp file or something to indicate a relationship.
Alternatively, inline the git diff-index into the other commands. These
might also do better as external scripts. It would reduce duplication
between the autotools and Meson builds.
Yeah, maybe that's a direction.
The updated patch also supports vpath builds with make now.
I have also added a CI patch, for amusement. Maybe we'll want to keep
it, though.
Attachments:
v1-0002-WIP-Add-dist-building-to-CI.patchtext/plain; charset=UTF-8; name=v1-0002-WIP-Add-dist-building-to-CI.patchDownload
From 3def9a78f25643304aa9cc847fc93376546d35bf Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Thu, 25 Jan 2024 15:35:48 +0100
Subject: [PATCH v1 2/2] WIP: Add dist building to CI
Note: freebsd repartition script didn't copy .git directory
---
.cirrus.tasks.yml | 28 +++++++++++++++++++++++++
src/tools/ci/gcp_freebsd_repartition.sh | 2 +-
2 files changed, 29 insertions(+), 1 deletion(-)
diff --git a/.cirrus.tasks.yml b/.cirrus.tasks.yml
index e4e1bcfeb99..6eba3073a35 100644
--- a/.cirrus.tasks.yml
+++ b/.cirrus.tasks.yml
@@ -202,6 +202,11 @@ task:
build/tmp_install/usr/local/pgsql/bin/pg_ctl -D build/runningcheck stop
EOF
+ dist_script: |
+ su postgres -c 'meson compile -C build -v pgdist'
+ tar_artifacts:
+ path: "build*/*.tar.*"
+
on_failure:
# if the server continues running, it often causes cirrus-ci to fail
# during upload, as it doesn't expect artifacts to change size
@@ -345,6 +350,10 @@ task:
make -s ${CHECK} ${CHECKFLAGS} -j${TEST_JOBS}
EOF
+ dist_script: su postgres -c "make dist -j${BUILD_JOBS}"
+ tar_artifacts:
+ path: "build*/*.tar.*"
+
on_failure:
<<: *on_failure_ac
@@ -403,6 +412,10 @@ task:
PYTHONCOERCECLOCALE=0 LANG=C meson test $MTEST_ARGS -C build-32 --num-processes ${TEST_JOBS}
EOF
+ dist_script: su postgres -c 'meson compile -C build-32 -v pgdist'
+ tar_artifacts:
+ path: "build*/*.tar.*"
+
on_failure:
<<: *on_failure_meson
@@ -496,6 +509,10 @@ task:
ulimit -n 1024 # default is 256, pretty low
meson test $MTEST_ARGS --num-processes ${TEST_JOBS}
+ dist_script: meson compile -C build -v pgdist
+ tar_artifacts:
+ path: "build*/*.tar.*"
+
on_failure:
<<: *on_failure_meson
cores_script: src/tools/ci/cores_backtrace.sh macos "${HOME}/cores"
@@ -567,6 +584,12 @@ task:
vcvarsall x64
meson test %MTEST_ARGS% --num-processes %TEST_JOBS%
+ dist_script: |
+ vcvarsall x64
+ meson compile -C build -v pgdist
+ tar_artifacts:
+ path: "build*/*.tar.*"
+
on_failure:
<<: *on_failure_meson
crashlog_artifacts:
@@ -627,6 +650,11 @@ task:
test_world_script: |
%BASH% -c "meson test %MTEST_ARGS% --num-processes %TEST_JOBS%"
+ dist_script: |
+ %BASH% -c "meson compile -C build -v pgdist"
+ tar_artifacts:
+ path: "build*/*.tar.*"
+
on_failure:
<<: *on_failure_meson
crashlog_artifacts:
diff --git a/src/tools/ci/gcp_freebsd_repartition.sh b/src/tools/ci/gcp_freebsd_repartition.sh
index 2d5e1738998..edf38e17575 100755
--- a/src/tools/ci/gcp_freebsd_repartition.sh
+++ b/src/tools/ci/gcp_freebsd_repartition.sh
@@ -25,4 +25,4 @@ du -hs $CIRRUS_WORKING_DIR
mv $CIRRUS_WORKING_DIR $CIRRUS_WORKING_DIR.orig
mkdir $CIRRUS_WORKING_DIR
mount -o noatime /dev/da0p3 $CIRRUS_WORKING_DIR
-cp -r $CIRRUS_WORKING_DIR.orig/* $CIRRUS_WORKING_DIR/
+cp -a $CIRRUS_WORKING_DIR.orig/ $CIRRUS_WORKING_DIR/
--
2.43.0
v1-0001-make-dist-uses-git-archive.patchtext/plain; charset=UTF-8; name=v1-0001-make-dist-uses-git-archive.patchDownload
From 13612f9a1c486e8acfe4156a7bc069a66fd30e77 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Thu, 25 Jan 2024 12:28:28 +0100
Subject: [PATCH v1 1/2] make dist uses git archive
Discussion: https://www.postgresql.org/message-id/flat/40e80f77-a294-4f29-a16f-e21bc7bc75fc%40eisentraut.org
---
GNUmakefile.in | 34 ++++++++++++----------------------
meson.build | 41 +++++++++++++++++++++++++++++++++++++++++
2 files changed, 53 insertions(+), 22 deletions(-)
diff --git a/GNUmakefile.in b/GNUmakefile.in
index eba569e930e..680c765dd73 100644
--- a/GNUmakefile.in
+++ b/GNUmakefile.in
@@ -87,29 +87,19 @@ update-unicode: | submake-generated-headers submake-libpgport
distdir = postgresql-$(VERSION)
dummy = =install=
+GIT = git
+
dist: $(distdir).tar.gz $(distdir).tar.bz2
- rm -rf $(distdir)
-
-$(distdir).tar: distdir
- $(TAR) chf $@ $(distdir)
-
-.INTERMEDIATE: $(distdir).tar
-
-distdir-location:
- @echo $(distdir)
-
-distdir:
- rm -rf $(distdir)* $(dummy)
- for x in `cd $(top_srcdir) && find . \( -name CVS -prune \) -o \( -name .git -prune \) -o -print`; do \
- file=`expr X$$x : 'X\./\(.*\)'`; \
- if test -d "$(top_srcdir)/$$file" ; then \
- mkdir "$(distdir)/$$file" && chmod 777 "$(distdir)/$$file"; \
- else \
- ln "$(top_srcdir)/$$file" "$(distdir)/$$file" >/dev/null 2>&1 \
- || cp "$(top_srcdir)/$$file" "$(distdir)/$$file"; \
- fi || exit; \
- done
- $(MAKE) -C $(distdir) distclean
+
+.PHONY: check-dirty-index
+check-dirty-index:
+ $(GIT) -C $(srcdir) diff-index --quiet HEAD
+
+$(distdir).tar.gz: check-dirty-index
+ $(GIT) -C $(srcdir) archive --format tar.gz --prefix $(distdir)/ HEAD -o $(abs_top_builddir)/$@
+
+$(distdir).tar.bz2: check-dirty-index
+ $(GIT) -C $(srcdir) -c tar.tar.bz2.command='$(BZIP2) -c' archive --format tar.bz2 --prefix $(distdir)/ HEAD -o $(abs_top_builddir)/$@
distcheck: dist
rm -rf $(dummy)
diff --git a/meson.build b/meson.build
index 55184db2488..43884e7cfdd 100644
--- a/meson.build
+++ b/meson.build
@@ -3348,6 +3348,47 @@ run_target('help',
+###############################################################
+# Distribution archive
+###############################################################
+
+git = find_program('git', required: false, native: true)
+bzip2 = find_program('bzip2', required: false, native: true)
+
+distdir = meson.project_name() + '-' + meson.project_version()
+
+check_dirty_index = run_target('check-dirty-index',
+ command: [git, '-C', '@SOURCE_ROOT@',
+ 'diff-index', '--quiet', 'HEAD'])
+
+tar_gz = custom_target('tar.gz',
+ build_always_stale: true,
+ command: [git, '-C', '@SOURCE_ROOT@', 'archive',
+ '--format', 'tar.gz',
+ '--prefix', distdir + '/',
+ '-o', join_paths(meson.build_root(), '@OUTPUT@'),
+ 'HEAD', '.'],
+ install: false,
+ output: distdir + '.tar.gz',
+)
+
+tar_bz2 = custom_target('tar.bz2',
+ build_always_stale: true,
+ command: [git, '-C', '@SOURCE_ROOT@', '-c', 'tar.tar.bz2.command="' + bzip2.path() + '" -c', 'archive',
+ '--format', 'tar.bz2',
+ '--prefix', distdir + '/',
+ '-o', join_paths(meson.build_root(), '@OUTPUT@'),
+ 'HEAD', '.'],
+ install: false,
+ output: distdir + '.tar.bz2',
+)
+
+# FIXME: would like to add check_dirty_index here, broken(?) before
+# meson 0.60.0
+alias_target('pgdist', [tar_gz, tar_bz2])
+
+
+
###############################################################
# The End, The End, My Friend
###############################################################
base-commit: 46d8587b504170ca14f064890bc7ccbbd7523f81
--
2.43.0
On Thu Jan 25, 2024 at 10:04 AM CST, Peter Eisentraut wrote:
On 22.01.24 21:04, Tristan Partin wrote:
+ 'HEAD', '.'], + install: false, + output: distdir + '.tar.bz2', +)The bz2 target should be wrapped in an `if bzip2.found()`.
The way that this currently works is that you will fail at configure
time if bz2 doesn't exist on the system. Meson will try to resolve
a .path() method on a NotFoundProgram. You might want to define the bz2
target to just call `exit 1` in this case.
if bzip2.found()
# do your current target
else
bz2 = run_target('tar.bz2', command: ['exit', 1])
endif
This should cause pgdist to appropriately fail at run time when
generating the bz2 tarball.
Well, I think we want the dist step to fail if bzip2 is not there. At
least that is the current expectation.+alias_target('pgdist', [check_dirty_index, tar_gz, tar_bz2])
Are you intending for the check_dirty_index target to prohibit the other
two targets from running? Currently that is not the case.Yes, that was the hope, and that's how the make dist variant works. But
I couldn't figure this out with meson. Also, the above actually also
doesn't work with older meson versions, so I had to comment this out to
get CI to work.If it is what
you intend, use a stamp file or something to indicate a relationship.
Alternatively, inline the git diff-index into the other commands. These
might also do better as external scripts. It would reduce duplication
between the autotools and Meson builds.Yeah, maybe that's a direction.
For what it's worth, I run Meson 1.3, and the behavior of generating the
tarballs even though it is a dirty tree still occurred. In the new patch
you seem to say it was fixed in 0.60.
--
Tristan Partin
Neon (https://neon.tech)
On 25.01.24 17:25, Tristan Partin wrote:
The way that this currently works is that you will fail at configure
time if bz2 doesn't exist on the system. Meson will try to resolve a
.path() method on a NotFoundProgram. You might want to define the bz2
target to just call `exit 1` in this case.if bzip2.found()
# do your current target
else
bz2 = run_target('tar.bz2', command: ['exit', 1])
endifThis should cause pgdist to appropriately fail at run time when
generating the bz2 tarball.
Ok, done that way.
For what it's worth, I run Meson 1.3, and the behavior of generating the
tarballs even though it is a dirty tree still occurred. In the new patch
you seem to say it was fixed in 0.60.
The problem I'm referring to is that before 0.60, alias_target cannot
depend on run_target (only "build target"). This is AFAICT not
documented and might not have been an intentional change, but you can
trace it in the meson source code, and it shows in the PostgreSQL CI.
That's also why for the above bzip2 issue I have to use custom_target in
place of your run_target.
Attachments:
v2-0001-make-dist-uses-git-archive.patchtext/plain; charset=UTF-8; name=v2-0001-make-dist-uses-git-archive.patchDownload
From 48ddd079c1530baaabf92a5650ff9a7bfa7ac3d6 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Thu, 25 Jan 2024 12:28:28 +0100
Subject: [PATCH v2 1/2] make dist uses git archive
Discussion: https://www.postgresql.org/message-id/flat/40e80f77-a294-4f29-a16f-e21bc7bc75fc%40eisentraut.org
---
GNUmakefile.in | 34 ++++++++++++----------------------
meson.build | 47 +++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 59 insertions(+), 22 deletions(-)
diff --git a/GNUmakefile.in b/GNUmakefile.in
index eba569e930e..680c765dd73 100644
--- a/GNUmakefile.in
+++ b/GNUmakefile.in
@@ -87,29 +87,19 @@ update-unicode: | submake-generated-headers submake-libpgport
distdir = postgresql-$(VERSION)
dummy = =install=
+GIT = git
+
dist: $(distdir).tar.gz $(distdir).tar.bz2
- rm -rf $(distdir)
-
-$(distdir).tar: distdir
- $(TAR) chf $@ $(distdir)
-
-.INTERMEDIATE: $(distdir).tar
-
-distdir-location:
- @echo $(distdir)
-
-distdir:
- rm -rf $(distdir)* $(dummy)
- for x in `cd $(top_srcdir) && find . \( -name CVS -prune \) -o \( -name .git -prune \) -o -print`; do \
- file=`expr X$$x : 'X\./\(.*\)'`; \
- if test -d "$(top_srcdir)/$$file" ; then \
- mkdir "$(distdir)/$$file" && chmod 777 "$(distdir)/$$file"; \
- else \
- ln "$(top_srcdir)/$$file" "$(distdir)/$$file" >/dev/null 2>&1 \
- || cp "$(top_srcdir)/$$file" "$(distdir)/$$file"; \
- fi || exit; \
- done
- $(MAKE) -C $(distdir) distclean
+
+.PHONY: check-dirty-index
+check-dirty-index:
+ $(GIT) -C $(srcdir) diff-index --quiet HEAD
+
+$(distdir).tar.gz: check-dirty-index
+ $(GIT) -C $(srcdir) archive --format tar.gz --prefix $(distdir)/ HEAD -o $(abs_top_builddir)/$@
+
+$(distdir).tar.bz2: check-dirty-index
+ $(GIT) -C $(srcdir) -c tar.tar.bz2.command='$(BZIP2) -c' archive --format tar.bz2 --prefix $(distdir)/ HEAD -o $(abs_top_builddir)/$@
distcheck: dist
rm -rf $(dummy)
diff --git a/meson.build b/meson.build
index 55184db2488..c245a1818e0 100644
--- a/meson.build
+++ b/meson.build
@@ -3348,6 +3348,53 @@ run_target('help',
+###############################################################
+# Distribution archive
+###############################################################
+
+git = find_program('git', required: false, native: true)
+bzip2 = find_program('bzip2', required: false, native: true)
+
+distdir = meson.project_name() + '-' + meson.project_version()
+
+check_dirty_index = run_target('check-dirty-index',
+ command: [git, '-C', '@SOURCE_ROOT@', 'diff-index', '--quiet', 'HEAD'])
+
+tar_gz = custom_target('tar.gz',
+ build_always_stale: true,
+ command: [git, '-C', '@SOURCE_ROOT@', 'archive',
+ '--format', 'tar.gz',
+ '--prefix', distdir + '/',
+ '-o', join_paths(meson.build_root(), '@OUTPUT@'),
+ 'HEAD', '.'],
+ install: false,
+ output: distdir + '.tar.gz',
+)
+
+if bzip2.found()
+ tar_bz2 = custom_target('tar.bz2',
+ build_always_stale: true,
+ command: [git, '-C', '@SOURCE_ROOT@', '-c', 'tar.tar.bz2.command="' + bzip2.path() + '" -c', 'archive',
+ '--format', 'tar.bz2',
+ '--prefix', distdir + '/',
+ '-o', join_paths(meson.build_root(), '@OUTPUT@'),
+ 'HEAD', '.'],
+ install: false,
+ output: distdir + '.tar.bz2',
+ )
+else
+ tar_bz2 = custom_target('tar.bz2',
+ command: ['sh', '-c', 'exit 1'],
+ output: distdir + '.tar.bz2',
+ )
+endif
+
+# FIXME: would like to add check_dirty_index here, broken(?) before
+# meson 0.60.0
+alias_target('pgdist', [tar_gz, tar_bz2])
+
+
+
###############################################################
# The End, The End, My Friend
###############################################################
base-commit: 46d8587b504170ca14f064890bc7ccbbd7523f81
--
2.43.0
v2-0002-WIP-Add-dist-building-to-CI.patchtext/plain; charset=UTF-8; name=v2-0002-WIP-Add-dist-building-to-CI.patchDownload
From 65055901d7476ecef67ff96f8218ca8fa344838f Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Thu, 25 Jan 2024 15:35:48 +0100
Subject: [PATCH v2 2/2] WIP: Add dist building to CI
Note: freebsd repartition script didn't copy .git directory
---
.cirrus.tasks.yml | 28 +++++++++++++++++++++++++
src/tools/ci/gcp_freebsd_repartition.sh | 2 +-
2 files changed, 29 insertions(+), 1 deletion(-)
diff --git a/.cirrus.tasks.yml b/.cirrus.tasks.yml
index e4e1bcfeb99..6eba3073a35 100644
--- a/.cirrus.tasks.yml
+++ b/.cirrus.tasks.yml
@@ -202,6 +202,11 @@ task:
build/tmp_install/usr/local/pgsql/bin/pg_ctl -D build/runningcheck stop
EOF
+ dist_script: |
+ su postgres -c 'meson compile -C build -v pgdist'
+ tar_artifacts:
+ path: "build*/*.tar.*"
+
on_failure:
# if the server continues running, it often causes cirrus-ci to fail
# during upload, as it doesn't expect artifacts to change size
@@ -345,6 +350,10 @@ task:
make -s ${CHECK} ${CHECKFLAGS} -j${TEST_JOBS}
EOF
+ dist_script: su postgres -c "make dist -j${BUILD_JOBS}"
+ tar_artifacts:
+ path: "build*/*.tar.*"
+
on_failure:
<<: *on_failure_ac
@@ -403,6 +412,10 @@ task:
PYTHONCOERCECLOCALE=0 LANG=C meson test $MTEST_ARGS -C build-32 --num-processes ${TEST_JOBS}
EOF
+ dist_script: su postgres -c 'meson compile -C build-32 -v pgdist'
+ tar_artifacts:
+ path: "build*/*.tar.*"
+
on_failure:
<<: *on_failure_meson
@@ -496,6 +509,10 @@ task:
ulimit -n 1024 # default is 256, pretty low
meson test $MTEST_ARGS --num-processes ${TEST_JOBS}
+ dist_script: meson compile -C build -v pgdist
+ tar_artifacts:
+ path: "build*/*.tar.*"
+
on_failure:
<<: *on_failure_meson
cores_script: src/tools/ci/cores_backtrace.sh macos "${HOME}/cores"
@@ -567,6 +584,12 @@ task:
vcvarsall x64
meson test %MTEST_ARGS% --num-processes %TEST_JOBS%
+ dist_script: |
+ vcvarsall x64
+ meson compile -C build -v pgdist
+ tar_artifacts:
+ path: "build*/*.tar.*"
+
on_failure:
<<: *on_failure_meson
crashlog_artifacts:
@@ -627,6 +650,11 @@ task:
test_world_script: |
%BASH% -c "meson test %MTEST_ARGS% --num-processes %TEST_JOBS%"
+ dist_script: |
+ %BASH% -c "meson compile -C build -v pgdist"
+ tar_artifacts:
+ path: "build*/*.tar.*"
+
on_failure:
<<: *on_failure_meson
crashlog_artifacts:
diff --git a/src/tools/ci/gcp_freebsd_repartition.sh b/src/tools/ci/gcp_freebsd_repartition.sh
index 2d5e1738998..edf38e17575 100755
--- a/src/tools/ci/gcp_freebsd_repartition.sh
+++ b/src/tools/ci/gcp_freebsd_repartition.sh
@@ -25,4 +25,4 @@ du -hs $CIRRUS_WORKING_DIR
mv $CIRRUS_WORKING_DIR $CIRRUS_WORKING_DIR.orig
mkdir $CIRRUS_WORKING_DIR
mount -o noatime /dev/da0p3 $CIRRUS_WORKING_DIR
-cp -r $CIRRUS_WORKING_DIR.orig/* $CIRRUS_WORKING_DIR/
+cp -a $CIRRUS_WORKING_DIR.orig/ $CIRRUS_WORKING_DIR/
--
2.43.0
On Fri Jan 26, 2024 at 12:28 AM CST, Peter Eisentraut wrote:
On 25.01.24 17:25, Tristan Partin wrote:
For what it's worth, I run Meson 1.3, and the behavior of generating the
tarballs even though it is a dirty tree still occurred. In the new patch
you seem to say it was fixed in 0.60.The problem I'm referring to is that before 0.60, alias_target cannot
depend on run_target (only "build target"). This is AFAICT not
documented and might not have been an intentional change, but you can
trace it in the meson source code, and it shows in the PostgreSQL CI.
That's also why for the above bzip2 issue I have to use custom_target in
place of your run_target.
https://github.com/mesonbuild/meson/pull/12783
Thanks for finding these issues.
--
Tristan Partin
Neon (https://neon.tech)
Hello, meson developer here.
On 1/23/24 4:30 AM, Peter Eisentraut wrote:
On 22.01.24 21:04, Tristan Partin wrote:
I am not really following why we can't use the builtin Meson dist
command. The only difference from my testing is it doesn't use a
--prefix argument.Here are some problems I have identified:
1. meson dist internally runs gzip without the -n option. That makes
the tar.gz archive include a timestamp, which in turn makes it not
reproducible.
Well, it uses python tarfile which uses python gzip support under the
hood, but yes, that is true, python tarfile doesn't expose this tunable.
2. Because gzip includes a platform indicator in the archive, the
produced tar.gz archive is not reproducible across platforms. (I don't
know if gzip has an option to avoid that. git archive uses an internal
gzip implementation that handles this.)
This appears to be https://github.com/python/cpython/issues/112346
3. Meson does not support tar.bz2 archives.
Simple enough to add, but I'm a bit surprised as usually people seem to
want either gzip for portability or xz for efficient compression.
4. Meson uses git archive internally, but then unpacks and repacks the
archive, which loses the ability to use git get-tar-commit-id.
What do you use this for? IMO a more robust way to track the commit used
is to use gitattributes export-subst to write a `.git_archival.txt` file
containing the commit sha1 and other info -- this can be read even after
the file is extracted, which means it can also be used to bake the ID
into the built binaries e.g. as part of --version output.
5. I have found that the tar archives created by meson and git archive
include the files in different orders. I suspect that the Python
tarfile module introduces some either randomness or platform dependency.
Different orders is meaningless, the question is whether the order is
internally consistent. Python uses sorted() to guarantee a stable order,
which may be a different algorithm than the one git-archive uses to
guarantee a stable order. But the order should be stable and that is
what matters.
6. meson dist is also slower because of the additional work.
I'm amenable to skipping the extraction/recombination of subprojects and
running of dist scripts in the event that neither exist, as Tristan
offered to do, but...
7. meson dist produces .sha256sum files but we have called them .sha256.
(This is obviously trivial, but it is something that would need to be
dealt with somehow nonetheless.)Most or all of these issues are fixable, either upstream in Meson or by
adjusting our own requirements. But for now this route would have some
significant disadvantages.
Overall I feel like much of this is about requiring dist tarballs to be
byte-identical to other dist tarballs, although reproducible builds is
mainly about artifacts, not sources, and for sources it doesn't
generally matter unless the sources are ephemeral and generated
on-demand (in which case it is indeed very important to produce the same
tarball each time). A tarball is usually generated once, signed, and
uploaded to release hosting. Meson already guarantees the contents are
strictly based on the built tag.
--
Eli Schwartz
Attachments:
On 26.01.24 22:18, Eli Schwartz wrote:
Hello, meson developer here.
Hello, and thanks for popping in!
3. Meson does not support tar.bz2 archives.
Simple enough to add, but I'm a bit surprised as usually people seem to
want either gzip for portability or xz for efficient compression.
We may very well end up updating our requirements here before too long,
so I wouldn't bother with this on the meson side. Last time we
discussed this, there were still platforms under support that didn't
have xz easily available.
4. Meson uses git archive internally, but then unpacks and repacks the
archive, which loses the ability to use git get-tar-commit-id.What do you use this for? IMO a more robust way to track the commit used
is to use gitattributes export-subst to write a `.git_archival.txt` file
containing the commit sha1 and other info -- this can be read even after
the file is extracted, which means it can also be used to bake the ID
into the built binaries e.g. as part of --version output.
It's a marginal use case, for sure. But it is something that git
provides tooling for that is universally available. Any alternative
would be an ad-hoc solution that is specific to our project and would be
different for the next project.
5. I have found that the tar archives created by meson and git archive
include the files in different orders. I suspect that the Python
tarfile module introduces some either randomness or platform dependency.Different orders is meaningless, the question is whether the order is
internally consistent. Python uses sorted() to guarantee a stable order,
which may be a different algorithm than the one git-archive uses to
guarantee a stable order. But the order should be stable and that is
what matters.
(FWIW, I couldn't reproduce this anymore, so maybe it's not actually an
issue.)
Overall I feel like much of this is about requiring dist tarballs to be
byte-identical to other dist tarballs, although reproducible builds is
mainly about artifacts, not sources, and for sources it doesn't
generally matter unless the sources are ephemeral and generated
on-demand (in which case it is indeed very important to produce the same
tarball each time).
The source tarball is, in a way, also an artifact.
I think it's useful that others can easily independently verify that the
produced tarball matches what they have locally. It's not an absolute
requirement, but given that it is possible, it seems useful to take
advantage of it.
In a way, this also avoids the need for signing the tarball, which we
don't do. So maybe that contributes to a different perspective.
On 1/31/24 3:03 AM, Peter Eisentraut wrote:
What do you use this for? IMO a more robust way to track the commit used
is to use gitattributes export-subst to write a `.git_archival.txt` file
containing the commit sha1 and other info -- this can be read even after
the file is extracted, which means it can also be used to bake the ID
into the built binaries e.g. as part of --version output.It's a marginal use case, for sure. But it is something that git
provides tooling for that is universally available. Any alternative
would be an ad-hoc solution that is specific to our project and would be
different for the next project.
mercurial has the "archivemeta" config setting that exports similar
information, but forces the filename ".hg_archival.txt".
The setuptools-scm project follows this pattern by requiring the git
file to be called ".git_archival.txt" with a set pattern mimicking the
hg one:
https://setuptools-scm.readthedocs.io/en/latest/usage/#git-archives
So, I guess you could use this and then it would not be specific to your
project. :)
Overall I feel like much of this is about requiring dist tarballs to be
byte-identical to other dist tarballs, although reproducible builds is
mainly about artifacts, not sources, and for sources it doesn't
generally matter unless the sources are ephemeral and generated
on-demand (in which case it is indeed very important to produce the same
tarball each time).The source tarball is, in a way, also an artifact.
I think it's useful that others can easily independently verify that the
produced tarball matches what they have locally. It's not an absolute
requirement, but given that it is possible, it seems useful to take
advantage of it.In a way, this also avoids the need for signing the tarball, which we
don't do. So maybe that contributes to a different perspective.
Since you mention signing and not as a simple "aside"...
That's a fascinating perspective. I wonder how people independently
verify that what they have locally (I assume from git clones) matches
what the postgres committers have authorized.
I'm a bit skeptical that you can avoid the need to perform code-signing
at some stage, somewhere, somehow, by suggesting that people can simply
git clone, run some commands and compare the tarball. The point of
signing is to verify that no one has acquired an untraceable API token
they should not have and gotten write access to the authoritative server
then uploaded malicious code under various forged identities, possibly
overwriting previous versions, either in git or out of git.
Ideally git commits should be signed, but that requires large numbers of
people to have security-minded git commit habits. From a quick check of
the postgres commit logs, only one person seems to be regularly signing
commits, which does provide a certain measure of protection -- an
attacker cannot attack via `git push --force` across that boundary, and
those commits serve as verifiable states that multiple people have seen.
The tags aren't signed either, which is a big issue for verifiably
identifying the release artifacts published by the release manager. Even
if not every commit is signed, having signed tags provides a known
coordination point of code that has been broadly tested and code-signed
for mass use.
...
In summary, my opinion is that using git-get-tar-commit-id provides zero
security guarantees, and if that's not something you are worried about
then that's one thing, but if you were expecting it to *replace* signing
the tarball, then that's.... very much another thing entirely, and not
one I can agree at all with.
--
Eli Schwartz
Attachments:
On Wed, Jan 31, 2024 at 10:50 AM Eli Schwartz <eschwartz93@gmail.com> wrote:
Ideally git commits should be signed, but that requires large numbers of
people to have security-minded git commit habits. From a quick check of
the postgres commit logs, only one person seems to be regularly signing
commits, which does provide a certain measure of protection -- an
attacker cannot attack via `git push --force` across that boundary, and
those commits serve as verifiable states that multiple people have seen.The tags aren't signed either, which is a big issue for verifiably
identifying the release artifacts published by the release manager. Even
if not every commit is signed, having signed tags provides a known
coordination point of code that has been broadly tested and code-signed
for mass use.In summary, my opinion is that using git-get-tar-commit-id provides zero
security guarantees, and if that's not something you are worried about
then that's one thing, but if you were expecting it to *replace* signing
the tarball, then that's.... very much another thing entirely, and not
one I can agree at all with.
I read this part with interest. I think there's definitely something
to be said for strengthening some of our practices in this area. At
the same time, I think it's reasonable for Peter to want to pursue the
limited goal he stated in the original post, namely reproducible
tarball generation, without getting tangled up in possible policy
changes that might be controversial and might require a bunch of
planning and coordination. "GPG signatures are good" can be true
without "reproducible tarball generation is good" being false; and if
"git archive" allows for that and "meson dist" doesn't, then we're
unlikely to adopt "meson dist".
--
Robert Haas
EDB: http://www.enterprisedb.com
Small update: I noticed that on Windows (at least the one that is
running the CI job), I need to use git -c core.autocrlf=false, otherwise
git archive does line-ending conversion for the files it puts into the
archive. With this fix, all the archives produced by all the CI jobs
across the different platforms match, except the .tar.gz archive from
the Linux job, which I suspect suffers from an old git version. We
should get the Linux images updated to a newer Debian version soon
anyway, so I think that issue will go away.
Attachments:
v3-0001-make-dist-uses-git-archive.patchtext/plain; charset=UTF-8; name=v3-0001-make-dist-uses-git-archive.patchDownload
From e6447b8a9ffd354549827e90e8bec007e703374d Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Sun, 11 Feb 2024 23:58:42 +0100
Subject: [PATCH v3 1/2] make dist uses git archive
Discussion: https://www.postgresql.org/message-id/flat/40e80f77-a294-4f29-a16f-e21bc7bc75fc%40eisentraut.org
---
GNUmakefile.in | 34 ++++++++++++---------------------
meson.build | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 64 insertions(+), 22 deletions(-)
diff --git a/GNUmakefile.in b/GNUmakefile.in
index eba569e930e..73560ea7d0a 100644
--- a/GNUmakefile.in
+++ b/GNUmakefile.in
@@ -87,29 +87,19 @@ update-unicode: | submake-generated-headers submake-libpgport
distdir = postgresql-$(VERSION)
dummy = =install=
+GIT = git
+
dist: $(distdir).tar.gz $(distdir).tar.bz2
- rm -rf $(distdir)
-
-$(distdir).tar: distdir
- $(TAR) chf $@ $(distdir)
-
-.INTERMEDIATE: $(distdir).tar
-
-distdir-location:
- @echo $(distdir)
-
-distdir:
- rm -rf $(distdir)* $(dummy)
- for x in `cd $(top_srcdir) && find . \( -name CVS -prune \) -o \( -name .git -prune \) -o -print`; do \
- file=`expr X$$x : 'X\./\(.*\)'`; \
- if test -d "$(top_srcdir)/$$file" ; then \
- mkdir "$(distdir)/$$file" && chmod 777 "$(distdir)/$$file"; \
- else \
- ln "$(top_srcdir)/$$file" "$(distdir)/$$file" >/dev/null 2>&1 \
- || cp "$(top_srcdir)/$$file" "$(distdir)/$$file"; \
- fi || exit; \
- done
- $(MAKE) -C $(distdir) distclean
+
+.PHONY: check-dirty-index
+check-dirty-index:
+ $(GIT) -C $(srcdir) diff-index --quiet HEAD
+
+$(distdir).tar.gz: check-dirty-index
+ $(GIT) -C $(srcdir) -c core.autocrlf=false archive --format tar.gz --prefix $(distdir)/ HEAD -o $(abs_top_builddir)/$@
+
+$(distdir).tar.bz2: check-dirty-index
+ $(GIT) -C $(srcdir) -c core.autocrlf=false -c tar.tar.bz2.command='$(BZIP2) -c' archive --format tar.bz2 --prefix $(distdir)/ HEAD -o $(abs_top_builddir)/$@
distcheck: dist
rm -rf $(dummy)
diff --git a/meson.build b/meson.build
index 8ed51b6aae8..5d68c66ac0a 100644
--- a/meson.build
+++ b/meson.build
@@ -3352,6 +3352,58 @@ run_target('help',
+###############################################################
+# Distribution archive
+###############################################################
+
+git = find_program('git', required: false, native: true)
+bzip2 = find_program('bzip2', required: false, native: true)
+
+distdir = meson.project_name() + '-' + meson.project_version()
+
+check_dirty_index = run_target('check-dirty-index',
+ command: [git, '-C', '@SOURCE_ROOT@', 'diff-index', '--quiet', 'HEAD'])
+
+tar_gz = custom_target('tar.gz',
+ build_always_stale: true,
+ command: [git, '-C', '@SOURCE_ROOT@',
+ '-c', 'core.autocrlf=false',
+ 'archive',
+ '--format', 'tar.gz',
+ '--prefix', distdir + '/',
+ '-o', join_paths(meson.build_root(), '@OUTPUT@'),
+ 'HEAD', '.'],
+ install: false,
+ output: distdir + '.tar.gz',
+)
+
+if bzip2.found()
+ tar_bz2 = custom_target('tar.bz2',
+ build_always_stale: true,
+ command: [git, '-C', '@SOURCE_ROOT@',
+ '-c', 'core.autocrlf=false',
+ '-c', 'tar.tar.bz2.command="' + bzip2.path() + '" -c',
+ 'archive',
+ '--format', 'tar.bz2',
+ '--prefix', distdir + '/',
+ '-o', join_paths(meson.build_root(), '@OUTPUT@'),
+ 'HEAD', '.'],
+ install: false,
+ output: distdir + '.tar.bz2',
+ )
+else
+ tar_bz2 = custom_target('tar.bz2',
+ command: ['sh', '-c', 'exit 1'],
+ output: distdir + '.tar.bz2',
+ )
+endif
+
+# FIXME: would like to add check_dirty_index here, broken(?) before
+# meson 0.60.0
+alias_target('pgdist', [tar_gz, tar_bz2])
+
+
+
###############################################################
# The End, The End, My Friend
###############################################################
base-commit: ce571434ae7027462565706236a0c6fbdf603734
--
2.43.1
v3-0002-WIP-Add-dist-building-to-CI.patchtext/plain; charset=UTF-8; name=v3-0002-WIP-Add-dist-building-to-CI.patchDownload
From 0227fa494dab9302290c63afc04d0efc9c7d175d Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Sun, 11 Feb 2024 23:58:42 +0100
Subject: [PATCH v3 2/2] WIP: Add dist building to CI
Note: freebsd repartition script didn't copy .git directory
---
.cirrus.tasks.yml | 28 +++++++++++++++++++++++++
src/tools/ci/gcp_freebsd_repartition.sh | 2 +-
2 files changed, 29 insertions(+), 1 deletion(-)
diff --git a/.cirrus.tasks.yml b/.cirrus.tasks.yml
index e4e1bcfeb99..6eba3073a35 100644
--- a/.cirrus.tasks.yml
+++ b/.cirrus.tasks.yml
@@ -202,6 +202,11 @@ task:
build/tmp_install/usr/local/pgsql/bin/pg_ctl -D build/runningcheck stop
EOF
+ dist_script: |
+ su postgres -c 'meson compile -C build -v pgdist'
+ tar_artifacts:
+ path: "build*/*.tar.*"
+
on_failure:
# if the server continues running, it often causes cirrus-ci to fail
# during upload, as it doesn't expect artifacts to change size
@@ -345,6 +350,10 @@ task:
make -s ${CHECK} ${CHECKFLAGS} -j${TEST_JOBS}
EOF
+ dist_script: su postgres -c "make dist -j${BUILD_JOBS}"
+ tar_artifacts:
+ path: "build*/*.tar.*"
+
on_failure:
<<: *on_failure_ac
@@ -403,6 +412,10 @@ task:
PYTHONCOERCECLOCALE=0 LANG=C meson test $MTEST_ARGS -C build-32 --num-processes ${TEST_JOBS}
EOF
+ dist_script: su postgres -c 'meson compile -C build-32 -v pgdist'
+ tar_artifacts:
+ path: "build*/*.tar.*"
+
on_failure:
<<: *on_failure_meson
@@ -496,6 +509,10 @@ task:
ulimit -n 1024 # default is 256, pretty low
meson test $MTEST_ARGS --num-processes ${TEST_JOBS}
+ dist_script: meson compile -C build -v pgdist
+ tar_artifacts:
+ path: "build*/*.tar.*"
+
on_failure:
<<: *on_failure_meson
cores_script: src/tools/ci/cores_backtrace.sh macos "${HOME}/cores"
@@ -567,6 +584,12 @@ task:
vcvarsall x64
meson test %MTEST_ARGS% --num-processes %TEST_JOBS%
+ dist_script: |
+ vcvarsall x64
+ meson compile -C build -v pgdist
+ tar_artifacts:
+ path: "build*/*.tar.*"
+
on_failure:
<<: *on_failure_meson
crashlog_artifacts:
@@ -627,6 +650,11 @@ task:
test_world_script: |
%BASH% -c "meson test %MTEST_ARGS% --num-processes %TEST_JOBS%"
+ dist_script: |
+ %BASH% -c "meson compile -C build -v pgdist"
+ tar_artifacts:
+ path: "build*/*.tar.*"
+
on_failure:
<<: *on_failure_meson
crashlog_artifacts:
diff --git a/src/tools/ci/gcp_freebsd_repartition.sh b/src/tools/ci/gcp_freebsd_repartition.sh
index 2d5e1738998..edf38e17575 100755
--- a/src/tools/ci/gcp_freebsd_repartition.sh
+++ b/src/tools/ci/gcp_freebsd_repartition.sh
@@ -25,4 +25,4 @@ du -hs $CIRRUS_WORKING_DIR
mv $CIRRUS_WORKING_DIR $CIRRUS_WORKING_DIR.orig
mkdir $CIRRUS_WORKING_DIR
mount -o noatime /dev/da0p3 $CIRRUS_WORKING_DIR
-cp -r $CIRRUS_WORKING_DIR.orig/* $CIRRUS_WORKING_DIR/
+cp -a $CIRRUS_WORKING_DIR.orig/ $CIRRUS_WORKING_DIR/
--
2.43.1
On Sun Feb 11, 2024 at 5:09 PM CST, Peter Eisentraut wrote:
Small update: I noticed that on Windows (at least the one that is
running the CI job), I need to use git -c core.autocrlf=false, otherwise
git archive does line-ending conversion for the files it puts into the
archive. With this fix, all the archives produced by all the CI jobs
across the different platforms match, except the .tar.gz archive from
the Linux job, which I suspect suffers from an old git version. We
should get the Linux images updated to a newer Debian version soon
anyway, so I think that issue will go away.
I think with this change, it is unlikely I will be able to upstream
anything to Meson that would benefit Postgres here since setting this
option seems project dependent.
--
Tristan Partin
Neon (https://neon.tech)
On 12.02.24 18:26, Tristan Partin wrote:
On Sun Feb 11, 2024 at 5:09 PM CST, Peter Eisentraut wrote:
Small update: I noticed that on Windows (at least the one that is
running the CI job), I need to use git -c core.autocrlf=false,
otherwise git archive does line-ending conversion for the files it
puts into the archive. With this fix, all the archives produced by
all the CI jobs across the different platforms match, except the
.tar.gz archive from the Linux job, which I suspect suffers from an
old git version. We should get the Linux images updated to a newer
Debian version soon anyway, so I think that issue will go away.I think with this change, it is unlikely I will be able to upstream
anything to Meson that would benefit Postgres here since setting this
option seems project dependent.
Meson is vulnerable to the same problem: If the person who makes the
release had some crlf-related git setting activated in their
environment, then that would affect the tarball. And such a tarball
would be genuinely broken for non-Windows users, because at least some
parts of Unix systems can't process such CRLF files correctly.
(This is easy to test: Run meson dist with core.autocrlf=true on the
postgresql tree on a non-Windows system. It will fail during dist check.)
Here is an updated version of this patch set.
I have removed the "dirty check" stuff. It didn't really work well/was
buggy under meson, and it failed mysteriously on the Linux CI tasks. So
let's just drop that functionality for now.
I have also added a more complete commit message and some more code
comments.
I have extracted the freebsd CI script fix into a separate patch (0002).
I think this is useful even if we don't take the full CI patch (0003).
About the 0003 patch: It seems useful in principle to test these things
continuously. The dist script runs about 10 seconds in each task, and
takes a bit of disk space for the artifacts. I'm not sure to what
degree this might bother someone.
Attachments:
v4-0001-make-dist-uses-git-archive.patchtext/plain; charset=UTF-8; name=v4-0001-make-dist-uses-git-archive.patchDownload
From d7c11e243d46e7e4ca0c4d0397680c3cce003687 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Sun, 11 Feb 2024 23:58:42 +0100
Subject: [PATCH v4 1/3] make dist uses git archive
This changes "make dist" to directly use "git archive", rather than
the custom shell script it currently runs.
This is to make the creation of the distribution tarball more directly
traceable to the git repository. That is why we removed the "make
distprep" step.
"make dist" continues to produce a .gz and a .bz2 tarball as before.
The archives produced this way are deterministic and reproducible,
meaning for a given commit the result file should always be
bit-for-bit identical. The exception is that if you use a git version
older than 2.38.0, gzip records the platform in the archive, so you'd
get a different output on Windows vs. macOS vs. "UNIX" (everything
else). In git 2.38.0, this was changed so that everything is recorded
as "UNIX" now. This is just something to keep in mind. This issue is
specific to the gzip format, it does not affect other compression
formats.
Meson has its own distribution building command (meson dist), but we
are not using that at this point. The main problem is that the way
they have implemented it, it is not deterministic in the above sense.
Also, we want a "make" version for the time being. But the target
name "dist" in meson is reserved for that reason, so we call the
custom target "pgdist" (so call something like "meson compile -C build
pgdist").
Discussion: https://www.postgresql.org/message-id/flat/40e80f77-a294-4f29-a16f-e21bc7bc75fc%40eisentraut.org
---
GNUmakefile.in | 35 +++++++++++++--------------------
meson.build | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 65 insertions(+), 22 deletions(-)
diff --git a/GNUmakefile.in b/GNUmakefile.in
index 4d8fc794bbb..4dc13a3e2ea 100644
--- a/GNUmakefile.in
+++ b/GNUmakefile.in
@@ -87,29 +87,20 @@ update-unicode: | submake-generated-headers submake-libpgport
distdir = postgresql-$(VERSION)
dummy = =install=
+GIT = git
+
dist: $(distdir).tar.gz $(distdir).tar.bz2
- rm -rf $(distdir)
-
-$(distdir).tar: distdir
- $(TAR) chf $@ $(distdir)
-
-.INTERMEDIATE: $(distdir).tar
-
-distdir-location:
- @echo $(distdir)
-
-distdir:
- rm -rf $(distdir)* $(dummy)
- for x in `cd $(top_srcdir) && find . \( -name CVS -prune \) -o \( -name .git -prune \) -o -print`; do \
- file=`expr X$$x : 'X\./\(.*\)'`; \
- if test -d "$(top_srcdir)/$$file" ; then \
- mkdir "$(distdir)/$$file" && chmod 777 "$(distdir)/$$file"; \
- else \
- ln "$(top_srcdir)/$$file" "$(distdir)/$$file" >/dev/null 2>&1 \
- || cp "$(top_srcdir)/$$file" "$(distdir)/$$file"; \
- fi || exit; \
- done
- $(MAKE) -C $(distdir) distclean
+
+# Note: core.autocrlf=false is needed to avoid line-ending conversion
+# in case the environment has a different setting. Without this, a
+# tarball created on Windows might be different than on, and unusable
+# on, Unix machines.
+
+$(distdir).tar.gz:
+ $(GIT) -C $(srcdir) -c core.autocrlf=false archive --format tar.gz --prefix $(distdir)/ HEAD -o $(abs_top_builddir)/$@
+
+$(distdir).tar.bz2:
+ $(GIT) -C $(srcdir) -c core.autocrlf=false -c tar.tar.bz2.command='$(BZIP2) -c' archive --format tar.bz2 --prefix $(distdir)/ HEAD -o $(abs_top_builddir)/$@
distcheck: dist
rm -rf $(dummy)
diff --git a/meson.build b/meson.build
index c8fdfeb0ec3..7e233317075 100644
--- a/meson.build
+++ b/meson.build
@@ -3359,6 +3359,58 @@ run_target('help',
+###############################################################
+# Distribution archive
+###############################################################
+
+git = find_program('git', required: false, native: true)
+bzip2 = find_program('bzip2', required: false, native: true)
+
+distdir = meson.project_name() + '-' + meson.project_version()
+
+# Note: core.autocrlf=false is needed to avoid line-ending conversion
+# in case the environment has a different setting. Without this, a
+# tarball created on Windows might be different than on, and unusable
+# on, Unix machines.
+
+tar_gz = custom_target('tar.gz',
+ build_always_stale: true,
+ command: [git, '-C', '@SOURCE_ROOT@',
+ '-c', 'core.autocrlf=false',
+ 'archive',
+ '--format', 'tar.gz',
+ '--prefix', distdir + '/',
+ '-o', join_paths(meson.build_root(), '@OUTPUT@'),
+ 'HEAD', '.'],
+ install: false,
+ output: distdir + '.tar.gz',
+)
+
+if bzip2.found()
+ tar_bz2 = custom_target('tar.bz2',
+ build_always_stale: true,
+ command: [git, '-C', '@SOURCE_ROOT@',
+ '-c', 'core.autocrlf=false',
+ '-c', 'tar.tar.bz2.command="' + bzip2.path() + '" -c',
+ 'archive',
+ '--format', 'tar.bz2',
+ '--prefix', distdir + '/',
+ '-o', join_paths(meson.build_root(), '@OUTPUT@'),
+ 'HEAD', '.'],
+ install: false,
+ output: distdir + '.tar.bz2',
+ )
+else
+ tar_bz2 = custom_target('tar.bz2',
+ command: ['sh', '-c', 'exit 1'],
+ output: distdir + '.tar.bz2',
+ )
+endif
+
+alias_target('pgdist', [tar_gz, tar_bz2])
+
+
+
###############################################################
# The End, The End, My Friend
###############################################################
base-commit: a145f424d5248a09d766e8cb503b999290cb3b31
--
2.44.0
v4-0002-ci-freebsd-repartition-script-didn-t-copy-.git-di.patchtext/plain; charset=UTF-8; name=v4-0002-ci-freebsd-repartition-script-didn-t-copy-.git-di.patchDownload
From 2ba3315c75ff5f2a0123e6e0bc258e8bb4ae2f3a Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Thu, 21 Mar 2024 09:19:50 +0100
Subject: [PATCH v4 2/3] ci: freebsd repartition script didn't copy .git
directory
We need a slightly different "cp" incantation to make sure top-level
"dot" files, such as ".git", are also copied.
This is relevant for example if a script wants to execute a git
command. This currently does not happen, but it has come up while
testing other patches.
Discussion: https://www.postgresql.org/message-id/flat/40e80f77-a294-4f29-a16f-e21bc7bc75fc%40eisentraut.org
---
src/tools/ci/gcp_freebsd_repartition.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/tools/ci/gcp_freebsd_repartition.sh b/src/tools/ci/gcp_freebsd_repartition.sh
index cc7f6bc35ef..3adb8fb88ec 100755
--- a/src/tools/ci/gcp_freebsd_repartition.sh
+++ b/src/tools/ci/gcp_freebsd_repartition.sh
@@ -23,4 +23,4 @@ du -hs $CIRRUS_WORKING_DIR
mv $CIRRUS_WORKING_DIR $CIRRUS_WORKING_DIR.orig
mkdir $CIRRUS_WORKING_DIR
mount -o noatime /dev/md1 $CIRRUS_WORKING_DIR
-cp -r $CIRRUS_WORKING_DIR.orig/* $CIRRUS_WORKING_DIR/
+cp -a $CIRRUS_WORKING_DIR.orig/ $CIRRUS_WORKING_DIR/
--
2.44.0
v4-0003-ci-Add-dist-building.patchtext/plain; charset=UTF-8; name=v4-0003-ci-Add-dist-building.patchDownload
From 1e4aec24aceec0f1ef874c2a968ec9e9908e6405 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Thu, 21 Mar 2024 09:23:13 +0100
Subject: [PATCH v4 3/3] ci: Add dist building
XXX only for testing
---
.cirrus.tasks.yml | 28 ++++++++++++++++++++++++++++
1 file changed, 28 insertions(+)
diff --git a/.cirrus.tasks.yml b/.cirrus.tasks.yml
index 1adfdfdd456..38c5cc54311 100644
--- a/.cirrus.tasks.yml
+++ b/.cirrus.tasks.yml
@@ -202,6 +202,11 @@ task:
build/tmp_install/usr/local/pgsql/bin/pg_ctl -D build/runningcheck stop
EOF
+ dist_script: |
+ su postgres -c 'meson compile -C build -v pgdist'
+ tar_artifacts:
+ path: "build*/*.tar.*"
+
on_failure:
# if the server continues running, it often causes cirrus-ci to fail
# during upload, as it doesn't expect artifacts to change size
@@ -345,6 +350,10 @@ task:
make -s ${CHECK} ${CHECKFLAGS} -j${TEST_JOBS}
EOF
+ dist_script: su postgres -c "make dist -j${BUILD_JOBS}"
+ tar_artifacts:
+ path: "build*/*.tar.*"
+
on_failure:
<<: *on_failure_ac
@@ -403,6 +412,10 @@ task:
PYTHONCOERCECLOCALE=0 LANG=C meson test $MTEST_ARGS -C build-32 --num-processes ${TEST_JOBS}
EOF
+ dist_script: su postgres -c 'meson compile -C build-32 -v pgdist'
+ tar_artifacts:
+ path: "build*/*.tar.*"
+
on_failure:
<<: *on_failure_meson
@@ -498,6 +511,10 @@ task:
ulimit -n 1024 # default is 256, pretty low
meson test $MTEST_ARGS --num-processes ${TEST_JOBS}
+ dist_script: meson compile -C build -v pgdist
+ tar_artifacts:
+ path: "build*/*.tar.*"
+
on_failure:
<<: *on_failure_meson
cores_script: src/tools/ci/cores_backtrace.sh macos "${HOME}/cores"
@@ -569,6 +586,12 @@ task:
vcvarsall x64
meson test %MTEST_ARGS% --num-processes %TEST_JOBS%
+ dist_script: |
+ vcvarsall x64
+ meson compile -C build -v pgdist
+ tar_artifacts:
+ path: "build*/*.tar.*"
+
on_failure:
<<: *on_failure_meson
crashlog_artifacts:
@@ -629,6 +652,11 @@ task:
test_world_script: |
%BASH% -c "meson test %MTEST_ARGS% --num-processes %TEST_JOBS%"
+ dist_script: |
+ %BASH% -c "meson compile -C build -v pgdist"
+ tar_artifacts:
+ path: "build*/*.tar.*"
+
on_failure:
<<: *on_failure_meson
crashlog_artifacts:
--
2.44.0
On Thu Mar 21, 2024 at 3:44 AM CDT, Peter Eisentraut wrote:
Here is an updated version of this patch set.
You should add 'disabler: true' to the git find_program in Meson. If Git
doesn't exist on the system with the way your patch is currently
written, the targets would be defined, even though they would never
succeed.
You may also want to make sure that we are actually in a Git repository.
I don't think git-archive works outside one.
Re the autoclrf, is this something we could throw in a .gitattributes
files?
I have removed the "dirty check" stuff. It didn't really work well/was
buggy under meson, and it failed mysteriously on the Linux CI tasks. So
let's just drop that functionality for now.I have also added a more complete commit message and some more code
comments.
Meson has its own distribution building command (meson dist), but we
are not using that at this point. The main problem is that the way
they have implemented it, it is not deterministic in the above sense.
Also, we want a "make" version for the time being. But the target
name "dist" in meson is reserved for that reason, so we call the
custom target "pgdist" (so call something like "meson compile -C build
pgdist").
I would suggest poisoning `meson dist` in the following way:
if not meson.is_subproject()
# Maybe edit the comment...Maybe tell perl to print this message
# instead and then exit non-zero?
#
# Meson has its own distribution building command (meson dist), but we
# are not using that at this point. The main problem is that the way
# they have implemented it, it is not deterministic in the above sense.
# Also, we want a "make" version for the time being. But the target
# name "dist" in meson is reserved for that reason, so we call the
# custom target "pgdist" (so call something like "meson compile -C build
# pgdist").
#
# We don't poison the dist if we are a subproject because it is
# possible that the parent project may want to create a dist using
# the builtin Meson method.
meson.add_dist_script(perl, '-e', 'exit 1')
endif
I have extracted the freebsd CI script fix into a separate patch (0002).
I think this is useful even if we don't take the full CI patch (0003).
0002 looks pretty reasonable to me.
About the 0003 patch: It seems useful in principle to test these things
continuously. The dist script runs about 10 seconds in each task, and
takes a bit of disk space for the artifacts. I'm not sure to what
degree this might bother someone.
0003 works for me :).
--
Tristan Partin
Neon (https://neon.tech)
On 22.03.24 18:29, Tristan Partin wrote:
On Thu Mar 21, 2024 at 3:44 AM CDT, Peter Eisentraut wrote:
Here is an updated version of this patch set.
You should add 'disabler: true' to the git find_program in Meson. If Git
doesn't exist on the system with the way your patch is currently
written, the targets would be defined, even though they would never
succeed.
Ok, added. (I had it in there in an earlier version, but I think I
misread one of your earlier messages and removed it.)
You may also want to make sure that we are actually in a Git repository.
I don't think git-archive works outside one.
Then git archive will print an error. That seems ok.
Re the autoclrf, is this something we could throw in a .gitattributes
files?
We don't want to apply it to all git commands, just this one in this
context.
I would suggest poisoning `meson dist` in the following way:
if not meson.is_subproject()
[...]
meson.add_dist_script(perl, '-e', 'exit 1')
endif
Good idea, added that.
I have extracted the freebsd CI script fix into a separate patch
(0002). I think this is useful even if we don't take the full CI
patch (0003).0002 looks pretty reasonable to me.
Committed that one in the meantime.
Attachments:
v5-0001-make-dist-uses-git-archive.patchtext/plain; charset=UTF-8; name=v5-0001-make-dist-uses-git-archive.patchDownload
From 680967a621083756e1e182df7394a739008a21d6 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Sun, 11 Feb 2024 23:58:42 +0100
Subject: [PATCH v5 1/2] make dist uses git archive
This changes "make dist" to directly use "git archive", rather than
the custom shell script it currently runs.
This is to make the creation of the distribution tarball more directly
traceable to the git repository. That is why we removed the "make
distprep" step.
"make dist" continues to produce a .gz and a .bz2 tarball as before.
The archives produced this way are deterministic and reproducible,
meaning for a given commit the result file should always be
bit-for-bit identical. The exception is that if you use a git version
older than 2.38.0, gzip records the platform in the archive, so you'd
get a different output on Windows vs. macOS vs. "UNIX" (everything
else). In git 2.38.0, this was changed so that everything is recorded
as "UNIX" now. This is just something to keep in mind. This issue is
specific to the gzip format, it does not affect other compression
formats.
Meson has its own distribution building command (meson dist), but we
are not using that at this point. The main problem is that the way
they have implemented it, it is not deterministic in the above sense.
Also, we want a "make" version for the time being. But the target
name "dist" in meson is reserved for that reason, so we call the
custom target "pgdist" (so call something like "meson compile -C build
pgdist").
Discussion: https://www.postgresql.org/message-id/flat/40e80f77-a294-4f29-a16f-e21bc7bc75fc%40eisentraut.org
---
GNUmakefile.in | 35 ++++++++++----------------
meson.build | 66 ++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 79 insertions(+), 22 deletions(-)
diff --git a/GNUmakefile.in b/GNUmakefile.in
index 4d8fc794bbb..4dc13a3e2ea 100644
--- a/GNUmakefile.in
+++ b/GNUmakefile.in
@@ -87,29 +87,20 @@ update-unicode: | submake-generated-headers submake-libpgport
distdir = postgresql-$(VERSION)
dummy = =install=
+GIT = git
+
dist: $(distdir).tar.gz $(distdir).tar.bz2
- rm -rf $(distdir)
-
-$(distdir).tar: distdir
- $(TAR) chf $@ $(distdir)
-
-.INTERMEDIATE: $(distdir).tar
-
-distdir-location:
- @echo $(distdir)
-
-distdir:
- rm -rf $(distdir)* $(dummy)
- for x in `cd $(top_srcdir) && find . \( -name CVS -prune \) -o \( -name .git -prune \) -o -print`; do \
- file=`expr X$$x : 'X\./\(.*\)'`; \
- if test -d "$(top_srcdir)/$$file" ; then \
- mkdir "$(distdir)/$$file" && chmod 777 "$(distdir)/$$file"; \
- else \
- ln "$(top_srcdir)/$$file" "$(distdir)/$$file" >/dev/null 2>&1 \
- || cp "$(top_srcdir)/$$file" "$(distdir)/$$file"; \
- fi || exit; \
- done
- $(MAKE) -C $(distdir) distclean
+
+# Note: core.autocrlf=false is needed to avoid line-ending conversion
+# in case the environment has a different setting. Without this, a
+# tarball created on Windows might be different than on, and unusable
+# on, Unix machines.
+
+$(distdir).tar.gz:
+ $(GIT) -C $(srcdir) -c core.autocrlf=false archive --format tar.gz --prefix $(distdir)/ HEAD -o $(abs_top_builddir)/$@
+
+$(distdir).tar.bz2:
+ $(GIT) -C $(srcdir) -c core.autocrlf=false -c tar.tar.bz2.command='$(BZIP2) -c' archive --format tar.bz2 --prefix $(distdir)/ HEAD -o $(abs_top_builddir)/$@
distcheck: dist
rm -rf $(dummy)
diff --git a/meson.build b/meson.build
index c8fdfeb0ec3..4981447414f 100644
--- a/meson.build
+++ b/meson.build
@@ -3359,6 +3359,72 @@ run_target('help',
+###############################################################
+# Distribution archive
+###############################################################
+
+# Meson has its own distribution building command (meson dist), but we
+# are not using that at this point. The main problem is that the way
+# they have implemented it, it is not deterministic. Also, we want it
+# to be equivalent to the "make" version for the time being. But the
+# target name "dist" in meson is reserved for that reason, so we call
+# the custom target "pgdist".
+
+git = find_program('git', required: false, native: true, disabler: true)
+bzip2 = find_program('bzip2', required: false, native: true)
+
+distdir = meson.project_name() + '-' + meson.project_version()
+
+# Note: core.autocrlf=false is needed to avoid line-ending conversion
+# in case the environment has a different setting. Without this, a
+# tarball created on Windows might be different than on, and unusable
+# on, Unix machines.
+
+tar_gz = custom_target('tar.gz',
+ build_always_stale: true,
+ command: [git, '-C', '@SOURCE_ROOT@',
+ '-c', 'core.autocrlf=false',
+ 'archive',
+ '--format', 'tar.gz',
+ '--prefix', distdir + '/',
+ '-o', join_paths(meson.build_root(), '@OUTPUT@'),
+ 'HEAD', '.'],
+ install: false,
+ output: distdir + '.tar.gz',
+)
+
+if bzip2.found()
+ tar_bz2 = custom_target('tar.bz2',
+ build_always_stale: true,
+ command: [git, '-C', '@SOURCE_ROOT@',
+ '-c', 'core.autocrlf=false',
+ '-c', 'tar.tar.bz2.command="' + bzip2.path() + '" -c',
+ 'archive',
+ '--format', 'tar.bz2',
+ '--prefix', distdir + '/',
+ '-o', join_paths(meson.build_root(), '@OUTPUT@'),
+ 'HEAD', '.'],
+ install: false,
+ output: distdir + '.tar.bz2',
+ )
+else
+ tar_bz2 = custom_target('tar.bz2',
+ command: [perl, '-e', 'exit 1'],
+ output: distdir + '.tar.bz2',
+ )
+endif
+
+alias_target('pgdist', [tar_gz, tar_bz2])
+
+# Make the standard "dist" command fail, to prevent accidental use.
+# But not if we are in a subproject, in case the parent project wants to
+# create a dist using the standard Meson command.
+if not meson.is_subproject()
+ meson.add_dist_script(perl, '-e', 'exit 1')
+endif
+
+
+
###############################################################
# The End, The End, My Friend
###############################################################
base-commit: fc2d260c7e6236fe2447dad0f8415e72f4be66a2
--
2.44.0
v5-0002-ci-Add-dist-building.patchtext/plain; charset=UTF-8; name=v5-0002-ci-Add-dist-building.patchDownload
From 32cf0bbe02a5378075ec4a368edc2a2d274fe602 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Thu, 21 Mar 2024 09:23:13 +0100
Subject: [PATCH v5 2/2] ci: Add dist building
XXX only for testing
---
.cirrus.tasks.yml | 28 ++++++++++++++++++++++++++++
1 file changed, 28 insertions(+)
diff --git a/.cirrus.tasks.yml b/.cirrus.tasks.yml
index 1adfdfdd456..38c5cc54311 100644
--- a/.cirrus.tasks.yml
+++ b/.cirrus.tasks.yml
@@ -202,6 +202,11 @@ task:
build/tmp_install/usr/local/pgsql/bin/pg_ctl -D build/runningcheck stop
EOF
+ dist_script: |
+ su postgres -c 'meson compile -C build -v pgdist'
+ tar_artifacts:
+ path: "build*/*.tar.*"
+
on_failure:
# if the server continues running, it often causes cirrus-ci to fail
# during upload, as it doesn't expect artifacts to change size
@@ -345,6 +350,10 @@ task:
make -s ${CHECK} ${CHECKFLAGS} -j${TEST_JOBS}
EOF
+ dist_script: su postgres -c "make dist -j${BUILD_JOBS}"
+ tar_artifacts:
+ path: "build*/*.tar.*"
+
on_failure:
<<: *on_failure_ac
@@ -403,6 +412,10 @@ task:
PYTHONCOERCECLOCALE=0 LANG=C meson test $MTEST_ARGS -C build-32 --num-processes ${TEST_JOBS}
EOF
+ dist_script: su postgres -c 'meson compile -C build-32 -v pgdist'
+ tar_artifacts:
+ path: "build*/*.tar.*"
+
on_failure:
<<: *on_failure_meson
@@ -498,6 +511,10 @@ task:
ulimit -n 1024 # default is 256, pretty low
meson test $MTEST_ARGS --num-processes ${TEST_JOBS}
+ dist_script: meson compile -C build -v pgdist
+ tar_artifacts:
+ path: "build*/*.tar.*"
+
on_failure:
<<: *on_failure_meson
cores_script: src/tools/ci/cores_backtrace.sh macos "${HOME}/cores"
@@ -569,6 +586,12 @@ task:
vcvarsall x64
meson test %MTEST_ARGS% --num-processes %TEST_JOBS%
+ dist_script: |
+ vcvarsall x64
+ meson compile -C build -v pgdist
+ tar_artifacts:
+ path: "build*/*.tar.*"
+
on_failure:
<<: *on_failure_meson
crashlog_artifacts:
@@ -629,6 +652,11 @@ task:
test_world_script: |
%BASH% -c "meson test %MTEST_ARGS% --num-processes %TEST_JOBS%"
+ dist_script: |
+ %BASH% -c "meson compile -C build -v pgdist"
+ tar_artifacts:
+ path: "build*/*.tar.*"
+
on_failure:
<<: *on_failure_meson
crashlog_artifacts:
--
2.44.0
3 comments left that are inconsequential. Feel free to ignore.
+# Meson has its own distribution building command (meson dist), but we +# are not using that at this point. The main problem is that the way +# they have implemented it, it is not deterministic. Also, we want it +# to be equivalent to the "make" version for the time being. But the +# target name "dist" in meson is reserved for that reason, so we call +# the custom target "pgdist".
The second sentence is a run-on.
+if bzip2.found() + tar_bz2 = custom_target('tar.bz2', + build_always_stale: true, + command: [git, '-C', '@SOURCE_ROOT@', + '-c', 'core.autocrlf=false', + '-c', 'tar.tar.bz2.command="' + bzip2.path() + '" -c', + 'archive', + '--format', 'tar.bz2', + '--prefix', distdir + '/', + '-o', join_paths(meson.build_root(), '@OUTPUT@'), + 'HEAD', '.'], + install: false, + output: distdir + '.tar.bz2', + )
You might find Meson's string formatting syntax creates a more readable
command string:
'tar.tar.bz2.command=@0@ -c'.format(bzip2.path())
And then 'install: false' is the default if you feel like leaving it
out.
Otherwise, let's get this in!
--
Tristan Partin
Neon (https://neon.tech)
On 24.03.24 16:42, Tristan Partin wrote:
You might find Meson's string formatting syntax creates a more readable
command string:'tar.tar.bz2.command=@0@ -c'.format(bzip2.path())
And then 'install: false' is the default if you feel like leaving it out.
Otherwise, let's get this in!
Done and committed.
Hi,
On 2024-03-25 06:44:33 +0100, Peter Eisentraut wrote:
Done and committed.
This triggered a new warning for me:
../../../../../home/andres/src/postgresql/meson.build:3422: WARNING: Project targets '>=0.54' but uses feature introduced in '0.55.0': Passing executable/found program object to script parameter of add_dist_script.
Greetings,
Andres
On 26.03.24 01:23, Andres Freund wrote:
On 2024-03-25 06:44:33 +0100, Peter Eisentraut wrote:
Done and committed.
This triggered a new warning for me:
../../../../../home/andres/src/postgresql/meson.build:3422: WARNING: Project targets '>=0.54' but uses feature introduced in '0.55.0': Passing executable/found program object to script parameter of add_dist_script.
Hmm, I don't see that. Is there another version dependency that
controls when you see version dependency warnings? ;-)
We could trivially remove this particular line, or perhaps put a
if meson.version().version_compare('>=0.55')
around it. (But would that still warn?)
Hi,
On 2024-03-26 08:36:58 +0100, Peter Eisentraut wrote:
On 26.03.24 01:23, Andres Freund wrote:
On 2024-03-25 06:44:33 +0100, Peter Eisentraut wrote:
Done and committed.
This triggered a new warning for me:
../../../../../home/andres/src/postgresql/meson.build:3422: WARNING: Project targets '>=0.54' but uses feature introduced in '0.55.0': Passing executable/found program object to script parameter of add_dist_script.
Hmm, I don't see that. Is there another version dependency that controls
when you see version dependency warnings? ;-)
Sometimes an incompatibility is later noticed and a warning is introduced at
that point.
We could trivially remove this particular line, or perhaps put a
if meson.version().version_compare('>=0.55')
around it. (But would that still warn?)
It shouldn't, no. As long as the code is actually executed within the check,
it avoids the warning. However if you just set a variable inside the version
gated block and then later use the variable outside that, it will
warn. Probably hard to avoid...
Greetings,
Andres Freund
On Tue Mar 26, 2024 at 2:56 AM CDT, Andres Freund wrote:
Hi,
On 2024-03-26 08:36:58 +0100, Peter Eisentraut wrote:
On 26.03.24 01:23, Andres Freund wrote:
On 2024-03-25 06:44:33 +0100, Peter Eisentraut wrote:
Done and committed.
This triggered a new warning for me:
../../../../../home/andres/src/postgresql/meson.build:3422: WARNING: Project targets '>=0.54' but uses feature introduced in '0.55.0': Passing executable/found program object to script parameter of add_dist_script.
Hmm, I don't see that. Is there another version dependency that controls
when you see version dependency warnings? ;-)Sometimes an incompatibility is later noticed and a warning is introduced at
that point.We could trivially remove this particular line, or perhaps put a
if meson.version().version_compare('>=0.55')
around it. (But would that still warn?)
It shouldn't, no. As long as the code is actually executed within the check,
it avoids the warning. However if you just set a variable inside the version
gated block and then later use the variable outside that, it will
warn. Probably hard to avoid...
The following change also makes the warning go away, but the version
comparison seems better to me due to how we choose not to use machine
files for overriding programs[0]If someone wants to make a plea here: https://github.com/mesonbuild/meson/pull/12623. :(
- meson.add_dist_script(perl, ...)
+ meson.add_dist_script('perl', ...)
Aside, but I think since we dropped AIX, we can bump the required Meson
version. My last analysis of the situation told me that the AIX
buildfarm animals were the only machines which didn't have a Python
version capable of running a newer version. I would need to look at the
situation again though.
[0]: If someone wants to make a plea here: https://github.com/mesonbuild/meson/pull/12623
https://github.com/mesonbuild/meson/pull/12623
--
Tristan Partin
Neon (https://neon.tech)
On Wed Jan 24, 2024 at 11:57 AM CST, Tristan Partin wrote:
On Wed Jan 24, 2024 at 10:18 AM CST, Tristan Partin wrote:
On Tue Jan 23, 2024 at 3:30 AM CST, Peter Eisentraut wrote:
On 22.01.24 21:04, Tristan Partin wrote:
3. Meson does not support tar.bz2 archives.
This has now been merged. It will be in 1.5, so we will probably see it
in RHEL in a decade :P.
4. Meson uses git archive internally, but then unpacks and repacks the
archive, which loses the ability to use git get-tar-commit-id.Because Meson allows projects to distribute arbitrary files via
meson.add_dist_script(), and can include subprojects via `meson dist
--include-subprojects`, this doesn't seem like an easily solvable
problem.Thanks Peter. I will bring these up with upstream!
I think the solution to point 4 is to not unpack/repack if there are no
dist scripts and/or subprojects to distribute. I can take a look at
this later. I think this would also solve points 1, 2, 5, and 6 because
at that point meson is just calling git-archive.
I think implementing a solution to point 4 is a little bit more pressing
given that reproducible tarballs are more important after the xz
debaucle. I will try to give it some effort soon.
--
Tristan Partin
Neon (https://neon.tech)