preserve timestamps when installing headers
Hi hackers,
I noticed that `make install` updates modification time for all
installed headers. This leads to recompilation of all dependent
objects, which is inconvenient for example when working on a
third-party extension. A way to solve this would be to pass
`INSTALL="install -p"` to `configure`, to make `install` preserve the
timestamp. After this, a new problem arises -- the
`src/include/Makefile` doesn't use `install` for all headers, but
instead uses `cp`. This patch adds `-p` switch to `cp` invocation in
these files, to make it preserve timestamps. Combined with the
aforementioned install flag, it allows a developer to hack on both
postgres and a third-party extension at the same time, without the
unneeded recompilation.
--
Alexander Kuzmenkov
Timescale
Attachments:
preserve-header-timestamps.patchtext/x-patch; charset=US-ASCII; name=preserve-header-timestamps.patchDownload
diff --git a/src/include/Makefile b/src/include/Makefile
index 5f257a958c..27242ff910 100644
--- a/src/include/Makefile
+++ b/src/include/Makefile
@@ -48,14 +48,16 @@ install: all installdirs
$(INSTALL_DATA) utils/fmgroids.h '$(DESTDIR)$(includedir_server)/utils'
$(INSTALL_DATA) utils/fmgrprotos.h '$(DESTDIR)$(includedir_server)/utils'
# We don't use INSTALL_DATA for performance reasons --- there are a lot of files
-# (in fact, we have to take some pains to avoid overlength shell commands here)
- cp $(srcdir)/*.h '$(DESTDIR)$(includedir_server)'/
+# (in fact, we have to take some pains to avoid overlength shell commands here).
+# We preserve timestamps while copying to avoid recompiling dependent objects
+# such as third-party extensions.
+ cp -p $(srcdir)/*.h '$(DESTDIR)$(includedir_server)'/
for dir in $(SUBDIRS); do \
- cp $(srcdir)/$$dir/*.h '$(DESTDIR)$(includedir_server)'/$$dir/ || exit; \
+ cp -p $(srcdir)/$$dir/*.h '$(DESTDIR)$(includedir_server)'/$$dir/ || exit; \
done
ifeq ($(vpath_build),yes)
for file in catalog/schemapg.h catalog/system_fk_info.h catalog/pg_*_d.h parser/gram.h storage/lwlocknames.h utils/probes.h; do \
- cp $$file '$(DESTDIR)$(includedir_server)'/$$file || exit; \
+ cp -p $$file '$(DESTDIR)$(includedir_server)'/$$file || exit; \
done
endif
cd '$(DESTDIR)$(includedir_server)' && chmod $(INSTALL_DATA_MODE) *.h
The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: tested, passed
Documentation: tested, passed
Personally, I don't often encounter the problem that Alexander is describing, but I agree that there are cases when the simplest way to debug a tricky bug is to make a modification to the core. In fact, I used this technique to diagnose [1]https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=98ec35b0.
Unless anyone can think of the scenario when the proposed change will break something, I would suggest merging it.
[1]: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=98ec35b0
The new status of this patch is: Ready for Committer
On Tue, Oct 12, 2021 at 01:22:50PM +0300, Alexander Kuzmenkov wrote:
I noticed that `make install` updates modification time for all
installed headers. This leads to recompilation of all dependent
objects, which is inconvenient for example when working on a
third-party extension. A way to solve this would be to pass
`INSTALL="install -p"` to `configure`, to make `install` preserve the
timestamp. After this, a new problem arises -- the
`src/include/Makefile` doesn't use `install` for all headers, but
instead uses `cp`. This patch adds `-p` switch to `cp` invocation in
these files, to make it preserve timestamps. Combined with the
aforementioned install flag, it allows a developer to hack on both
postgres and a third-party extension at the same time, without the
unneeded recompilation.
The use of cp instead of $(INSTALL_DATA) for the installation of the
headers comes from a703269, back from 2005. How do numbers compare
today, 16 years later?
--
Michael
Michael Paquier <michael@paquier.xyz> writes:
On Tue, Oct 12, 2021 at 01:22:50PM +0300, Alexander Kuzmenkov wrote:
This patch adds `-p` switch to `cp` invocation in
these files, to make it preserve timestamps.
The use of cp instead of $(INSTALL_DATA) for the installation of the
headers comes from a703269, back from 2005. How do numbers compare
today, 16 years later?
According to a nearby copy of POSIX, "cp -p" does a lot more than
preserve timestamps. It also specifies preserving file ownership,
which seems absolutely catastrophic for the standard use-case of
"build as some ordinary user, then install as root".
TBH, I am not convinced that the complained-of case is enough of a
problem to justify any change in our build rules, even if there
weren't any semantic issues. If you are worried about build times,
you should be using ccache, and IME builds using ccache are not
terribly impacted by file timestamp changes.
regards, tom lane
On Mon, Dec 06, 2021 at 01:51:39AM -0500, Tom Lane wrote:
TBH, I am not convinced that the complained-of case is enough of a
problem to justify any change in our build rules, even if there
weren't any semantic issues. If you are worried about build times,
you should be using ccache, and IME builds using ccache are not
terribly impacted by file timestamp changes.
FWIW, I am not on board with changing build semantics or any
assumptions the header installation relies on either, but I could see
a point in switching back to INSTALL_DATA instead of cp to be
consistent with the rest of the build, iff the argument made back in
2005 about the performance of this code path does not hold anymore.
If we do that, it would then be possible to feed a custom INSTALL
command to ./configure.
--
Michael
On 06.12.21 07:51, Tom Lane wrote:
TBH, I am not convinced that the complained-of case is enough of a
problem to justify any change in our build rules, even if there
weren't any semantic issues. If you are worried about build times,
you should be using ccache, and IME builds using ccache are not
terribly impacted by file timestamp changes.
I have never heard of a dependency-based build system taking into
account the timestamps of files outside of the source (or build) tree.
It does make sense to some degree, but it seems very unusual, and
basically nothing works like that. I'm also not sure how packaging
systems preserve file timestamps. Maybe it's a thing now, but I'd like
to see a more comprehensive analysis before we commit to this.
On 06.12.21 12:15, Michael Paquier wrote:
FWIW, I am not on board with changing build semantics or any
assumptions the header installation relies on either, but I could see
a point in switching back to INSTALL_DATA instead of cp to be
consistent with the rest of the build, iff the argument made back in
2005 about the performance of this code path does not hold anymore.
I think you will find that it is still very slow.
Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
On 06.12.21 12:15, Michael Paquier wrote:
FWIW, I am not on board with changing build semantics or any
assumptions the header installation relies on either, but I could see
a point in switching back to INSTALL_DATA instead of cp to be
consistent with the rest of the build, iff the argument made back in
2005 about the performance of this code path does not hold anymore.
I think you will find that it is still very slow.
That would likely depend on whether configure had found a suitable
"install" program or decided to fall back on config/install-sh.
The latter will definitely be horribly slow, but C-coded install
utilities are probably no slower than "cp".
However, there's another problem with using INSTALL_DATA as a solution
to this issue: why would you expect that to preserve timestamps?
install-sh won't. I see that /usr/bin/install (which configure picks
on my RHEL box) won't preserve them by default, but it has a -p
option to do so. I would not bet on that being portable to all of
the myriad of foo-install programs that configure will accept, though.
On the whole, I think we should just reject this proposal and move on.
The portability hazards seem significant, and it's really unclear
to me what the advantages are (per Peter's earlier comment).
regards, tom lane
On 04.01.22 22:21, Tom Lane wrote:
However, there's another problem with using INSTALL_DATA as a solution
to this issue: why would you expect that to preserve timestamps?
install-sh won't. I see that /usr/bin/install (which configure picks
on my RHEL box) won't preserve them by default, but it has a -p
option to do so. I would not bet on that being portable to all of
the myriad of foo-install programs that configure will accept, though.
I don't think preserving timestamps should be the default behavior, but
I would support organizing things so that additional options can be
passed to "install" to make it do whatever the user prefers. But that
won't work if some installations don't go through install.
We could have some mode where "install" is used instead of "cp", if
someone wants to figure out exactly how to make that determination.
Btw., a quick test of make -C src/include/ install:
cp (current code): 0.5 s
GNU install: 0.6 s
install-sh: 12.5 s
Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
I don't think preserving timestamps should be the default behavior, but
I would support organizing things so that additional options can be
passed to "install" to make it do whatever the user prefers. But that
won't work if some installations don't go through install.
Check, but ...
Btw., a quick test of make -C src/include/ install:
cp (current code): 0.5 s
GNU install: 0.6 s
install-sh: 12.5 s
So this says that there's only a performance issue with install-sh;
but that's used by just a tiny minority of systems anymore. Scraping
the buildfarm's configure results, I find this many animals reporting
each of these choices:
4 /bin/install -c
8 config/install-sh -c
2 /opt/packages/coreutils-8.6/inst/bin/install -c
1 /usr/bin/ginstall -c
100 /usr/bin/install -c
1 /usr/gnu/bin/install -c
The 8 holdouts are
gaur
haddock
hake
hornet
hoverfly
mandrill
sungazer
tern
ie, ancient HPUX, OpenIndiana (Solaris), and AIX, none of which
are likely development platforms anymore --- and if somebody
did care about this, there's nothing stopping them from
installing GNU install on their machine.
So I fear we're optimizing for a case that stopped being mainstream
a decade or more back. I could get behind switching the code back
to using $(INSTALL) for this, and then offering some way to inject
user-selected switches into the $(INSTALL) invocations. That
wouldn't need much more than another gmake macro. (Does there
need to be a way to inject such switches only into header
installations, or is it OK to do it across the board?)
[ wanders away wondering how this'd affect the meson conversion
project ]
regards, tom lane
On 11/01/2022 00:03, Tom Lane wrote:
Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
I don't think preserving timestamps should be the default behavior, but
I would support organizing things so that additional options can be
passed to "install" to make it do whatever the user prefers. But that
won't work if some installations don't go through install.
+1. We just bumped into this with Neon, where we have a build script
that generates Rust bindings from the PostgreSQL header files. The build
script runs "make install", and because that changes the mtime even if
there were no changes to the headers, the bindings are also regenerated
every time.
So I fear we're optimizing for a case that stopped being mainstream
a decade or more back. I could get behind switching the code back
to using $(INSTALL) for this, and then offering some way to inject
user-selected switches into the $(INSTALL) invocations. That
wouldn't need much more than another gmake macro. (Does there
need to be a way to inject such switches only into header
installations, or is it OK to do it across the board?)
Here's a patch to switch back to $(INSTALL). With that, you can do:
./configure INSTALL="/usr/bin/install -C"
[ wanders away wondering how this'd affect the meson conversion
project ]
If anything, I guess this will help, by making the Makefile a bit less
special.
- Heikki
Attachments:
0001-Use-normal-install-program-to-install-server-headers.patchtext/x-patch; charset=UTF-8; name=0001-Use-normal-install-program-to-install-server-headers.patchDownload
From c0305f950e45ed905cc8145879d0b48390c387fa Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Fri, 9 Sep 2022 21:25:26 +0300
Subject: [PATCH 1/1] Use normal install program to install server headers.
Commit a7032690f9 replaced $(INSTALL) with plain "cp" for installing the
server header files. It sped up "make install" significantly, because
the old logic called $(INSTALL) separately for every header file,
whereas plain "cp" could copy all the files in one command. However, we
have long since made it a requirement that $(INSTALL) can also install
multiple files in one command, see commit f1c5247563. Switch back to
$(INSTALL).
Discussion: https://www.postgresql.org/message-id/200503252305.j2PN52m23610%40candle.pha.pa.us
Discussion: https://www.postgresql.org/message-id/2415283.1641852217%40sss.pgh.pa.us
---
src/include/Makefile | 13 +++----------
1 file changed, 3 insertions(+), 10 deletions(-)
diff --git a/src/include/Makefile b/src/include/Makefile
index 0b4cab9bb1..1e50400617 100644
--- a/src/include/Makefile
+++ b/src/include/Makefile
@@ -48,22 +48,15 @@ install: all installdirs
$(INSTALL_DATA) utils/errcodes.h '$(DESTDIR)$(includedir_server)/utils'
$(INSTALL_DATA) utils/fmgroids.h '$(DESTDIR)$(includedir_server)/utils'
$(INSTALL_DATA) utils/fmgrprotos.h '$(DESTDIR)$(includedir_server)/utils'
-# We don't use INSTALL_DATA for performance reasons --- there are a lot of files
-# (in fact, we have to take some pains to avoid overlength shell commands here)
- cp $(srcdir)/*.h '$(DESTDIR)$(includedir_server)'/
+ $(INSTALL_DATA) $(srcdir)/*.h '$(DESTDIR)$(includedir_server)'
for dir in $(SUBDIRS); do \
- cp $(srcdir)/$$dir/*.h '$(DESTDIR)$(includedir_server)'/$$dir/ || exit; \
+ $(INSTALL_DATA) $(srcdir)/$$dir/*.h '$(DESTDIR)$(includedir_server)'/$$dir || exit; \
done
ifeq ($(vpath_build),yes)
for file in catalog/schemapg.h catalog/system_fk_info.h catalog/pg_*_d.h parser/gram.h storage/lwlocknames.h utils/probes.h; do \
- cp $$file '$(DESTDIR)$(includedir_server)'/$$file || exit; \
+ $(INSTALL_DATA) $$file '$(DESTDIR)$(includedir_server)'/$$file || exit; \
done
endif
- cd '$(DESTDIR)$(includedir_server)' && chmod $(INSTALL_DATA_MODE) *.h
- for dir in $(SUBDIRS); do \
- cd '$(DESTDIR)$(includedir_server)'/$$dir || exit; \
- chmod $(INSTALL_DATA_MODE) *.h || exit; \
- done
installdirs:
$(MKDIR_P) '$(DESTDIR)$(includedir)/libpq' '$(DESTDIR)$(includedir_internal)/libpq'
--
2.30.2
On Fri, Sep 09, 2022 at 10:23:57PM +0300, Heikki Linnakangas wrote:
On 11/01/2022 00:03, Tom Lane wrote:
Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
I don't think preserving timestamps should be the default behavior, but
I would support organizing things so that additional options can be
passed to "install" to make it do whatever the user prefers. But that
won't work if some installations don't go through install.
Here's a patch to switch back to $(INSTALL). With that, you can do:
./configure INSTALL="/usr/bin/install -C"
+1, I recently looked for a way to do that while trying to accelerate
Cygwin/Mingw.
--
Justin
On 09.09.22 21:23, Heikki Linnakangas wrote:
So I fear we're optimizing for a case that stopped being mainstream
a decade or more back. I could get behind switching the code back
to using $(INSTALL) for this, and then offering some way to inject
user-selected switches into the $(INSTALL) invocations. That
wouldn't need much more than another gmake macro. (Does there
need to be a way to inject such switches only into header
installations, or is it OK to do it across the board?)Here's a patch to switch back to $(INSTALL).
I'm content to go ahead with this.
On 12/09/2022 18:49, Peter Eisentraut wrote:
On 09.09.22 21:23, Heikki Linnakangas wrote:
So I fear we're optimizing for a case that stopped being mainstream
a decade or more back. I could get behind switching the code back
to using $(INSTALL) for this, and then offering some way to inject
user-selected switches into the $(INSTALL) invocations. That
wouldn't need much more than another gmake macro. (Does there
need to be a way to inject such switches only into header
installations, or is it OK to do it across the board?)Here's a patch to switch back to $(INSTALL).
I'm content to go ahead with this.
Committed.
- Heikki