Multi-branch committing in git, revisited

Started by Tom Lane | over 15 years ago | 28 messages | hackers

tgl@sss.pgh.pa.us

over 15 years ago

Okay, so now that I've actually done a couple of multi-branch commits...

I'm using the multiple-work-directory arrangement suggested on our wiki
page. The work flow seems to boil down to:

* Prepare patch in master
* Stage patch with git add
* git diff --staged >/tmp/patch-head
* cd into REL9_0_STABLE workdir
* patch -p0 </tmp/patch-head
* Adjust patch if needed
* Stage patch with git add
* git diff --staged >/tmp/patch-90
* cd into REL8_4_STABLE workdir
* patch -p0 </tmp/patch-90
* ... lather, rinse, repeat ...
* cd back to master
* git commit -F /tmp/commitmsg
* cd into REL9_0_STABLE workdir
* git commit -F /tmp/commitmsg
* cd into REL8_4_STABLE workdir
* git commit -F /tmp/commitmsg
* ... lather, rinse, repeat ...
* git push
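
A minimal sketch of the steps above, under assumptions that are not in
the original post: a throwaway repository, a single back branch,
"git worktree" standing in for the wiki's multiple-work-directory
arrangement, "--no-prefix" so that the diff applies at strip level 0,
and "git apply -p0" in place of "patch -p0". File names and the commit
message are hypothetical.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q repo && cd repo
git symbolic-ref HEAD refs/heads/master    # make the first branch "master"
git config user.name "Demo Committer"
git config user.email "demo@example.com"
echo "original line" > file.txt
git add file.txt
git commit -q -m "base"

# One extra working directory per back branch (assumption: git worktree).
git branch REL9_0_STABLE
git worktree add ../rel90 REL9_0_STABLE >/dev/null 2>&1

# Prepare and stage the patch in master, then export it.
echo "the fix" >> file.txt
git add file.txt
git diff --staged --no-prefix > "$work/patch-head"

# Apply it in the back branch, adjust if needed, and stage it there.
cd ../rel90
git apply -p0 "$work/patch-head"
git add file.txt

# Only now commit everywhere, back to back, with one shared message.
printf 'Fix the thing\n' > "$work/commitmsg"
for dir in "$work/repo" "$work/rel90"; do
    git -C "$dir" commit -q -F "$work/commitmsg"
done
```

Because every branch is only staged until the final loop, the commits
end up seconds apart regardless of how long the back-patching took.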

While this isn't much worse than what I was used to with CVS, it's
definitely not better. I think that I could simplify transferring the
patch back to older branches if I could use git cherry-pick. However,
that only works on already-committed patches. If I commit in master
before I start working on 9.0, and so on back, then the commits will be
separated in time by a significant amount, thus destroying any chance of
having git_topo_order recognize them as related. So we're back up
against the problem of git not really understanding the relationships of
patches in different branches.

One idea that comes to me is to do each patch in a temporary side branch
off the relevant maintenance branch, and then only the final commits merging
that work back to the public branches need be closely spaced. But being
still a git novice, it's not exactly clear to me how to do that, and
anyway it seems like there's still a possibility of trouble if there are
unexpected merge failures on some of the branches. (That would only
be an issue if the back-patching took long enough for some of the
branches to change underneath me, which isn't too likely, but if it
did happen how do I recover and still keep the commits close together?)

So it seems like maybe we still need some more thought about how to
recognize related commits in different branches. Or at the very least,
we need a best-practice document explaining how to manage this --- we
shouldn't expect every committer to reinvent this wheel for himself.

Comments?

regards, tom lane

Andrew Dunstan

andrew@dunslane.net

over 15 years ago

In reply to: Tom Lane (#1)

Re: Multi-branch committing in git, revisited

On 09/21/2010 09:20 PM, Tom Lane wrote:

Okay, so now that I've actually done a couple of multi-branch commits...

I'm using the multiple-work-directory arrangement suggested on our wiki
page. The work flow seems to boil down to:

* Prepare patch in master
* Stage patch with git add
* git diff --staged>/tmp/patch-head
* cd into REL9_0_STABLE workdir
* patch -p0</tmp/patch-head
* Adjust patch if needed
* Stage patch with git add
* git diff --staged>/tmp/patch-90
* cd into REL8_4_STABLE workdir
* patch -p0</tmp/patch-90
* ... lather, rinse, repeat ...
* cd back to master
* git commit -F /tmp/commitmsg
* cd into REL9_0_STABLE workdir
* git commit -F /tmp/commitmsg
* cd into REL8_4_STABLE workdir
* git commit -F /tmp/commitmsg
* ... lather, rinse, repeat ...
* git push

While this isn't much worse than what I was used to with CVS, it's
definitely not better. I think that I could simplify transferring the
patch back to older branches if I could use git cherry-pick. However,
that only works on already-committed patches. If I commit in master
before I start working on 9.0, and so on back, then the commits will be
separated in time by a significant amount, thus destroying any chance of
having git_topo_order recognize them as related. So we're back up
against the problem of git not really understanding the relationships of
patches in different branches.

One idea that comes to me is to do each patch in a temporary side branch
off that maintenance branch, and then only the final commits merging
that work back to the public branches need be closely spaced. But being
still a git novice, it's not exactly clear to me how to do that, and
anyway it seems like there's still possibility for trouble if there's
unexpected merging failures on some of the branches. (That would only
be an issue if the back-patching took long enough for some of the
branches to change underneath me, which isn't too likely, but if it
did happen how do I recover and still keep the commits close together?)

I suspect you should indeed be working on topic branches for everything
non-trivial. You can safely commit on them, cherry-pick from them,
squash from them onto the main branch, and then discard them. You can
resolve problems before committing on the main branch by doing:

git checkout mainbranch
git merge --squash --no-commit topicbranch

Unless I'm missing something there's no reason you should ever have to
have a significant delay between branches in commits/pushes.
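
A minimal sketch of that topic-branch cycle, assuming a throwaway
repository; the branch name "topic", the file, and the messages are
hypothetical. Note that "--squash" already stops before committing, so
"--no-commit" is harmless but not required.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q repo && cd repo
git symbolic-ref HEAD refs/heads/master    # make the first branch "master"
git config user.name "Demo Committer"
git config user.email "demo@example.com"
echo "base" > file.txt
git add file.txt
git commit -q -m "base"

# Work freely on a short-lived topic branch; these commits are private.
git checkout -q -b topic
echo "step 1" >> file.txt
git commit -q -am "wip 1"
echo "step 2" >> file.txt
git commit -q -am "wip 2"

# Fold the whole branch into one staged change on the main branch.
git checkout -q master
git merge -q --squash --no-commit topic
git commit -q -m "Fix the thing"

# The topic branch was never merged in the graph sense, so force-delete it.
git branch -q -D topic
```

The main branch gains exactly one commit, and its timestamp is the time
of the final "git commit", not the time the work started.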

I know git has something of a learning curve, but I'm betting that in a
month or so you'll be better at using it than most of us ;-)

cheers

andrew

David E. Wheeler

david@kineticode.com

over 15 years ago

In reply to: Tom Lane (#1)

Re: Multi-branch committing in git, revisited

On Sep 21, 2010, at 6:20 PM, Tom Lane wrote:

While this isn't much worse than what I was used to with CVS, it's
definitely not better. I think that I could simplify transferring the
patch back to older branches if I could use git cherry-pick. However,
that only works on already-committed patches. If I commit in master
before I start working on 9.0, and so on back, then the commits will be
separated in time by a significant amount, thus destroying any chance of
having git_topo_order recognize them as related. So we're back up
against the problem of git not really understanding the relationships of
patches in different branches.

You could commit in each one as you go, cherry-pick, and then in each one

git reset --soft HEAD^

Then they'd all be patched and staged.

Best,

David

Bruce Momjian

bruce@momjian.us

over 15 years ago

In reply to: David E. Wheeler (#3)

Re: Multi-branch committing in git, revisited

David E. Wheeler wrote:

On Sep 21, 2010, at 6:20 PM, Tom Lane wrote:

While this isn't much worse than what I was used to with CVS, it's
definitely not better. I think that I could simplify transferring the
patch back to older branches if I could use git cherry-pick. However,
that only works on already-committed patches. If I commit in master
before I start working on 9.0, and so on back, then the commits will be
separated in time by a significant amount, thus destroying any chance of
having git_topo_order recognize them as related. So we're back up
against the problem of git not really understanding the relationships of
patches in different branches.

You could commit in each one as you go, cherry-pick, and then in each one

git reset --soft HEAD^

Then they'd all be patched and staged.

If I understand correctly, that 'git reset' will mark all branch changes
as staged but not committed, and then you can commit all branches at
once and push it. Is that right?

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

David E. Wheeler

david@kineticode.com

over 15 years ago

In reply to: Bruce Momjian (#4)

Re: Multi-branch committing in git, revisited

On Sep 21, 2010, at 8:01 PM, Bruce Momjian wrote:

Then they'd all be patched and staged.

If I understand correctly, that 'git reset' will mark all branch changes
as staged but not committed, and then you can commit all branches at
once and push it. Is that right?

Right.

David

David Christensen

david@endpoint.com

over 15 years ago

In reply to: Tom Lane (#1)

Re: Multi-branch committing in git, revisited

If I commit in master
before I start working on 9.0, and so on back, then the commits will be
separated in time by a significant amount, thus destroying any chance of
having git_topo_order recognize them as related. So we're back up
against the problem of git not really understanding the relationships of
patches in different branches.

Is the issue here the clock time spent between the commits, i.e., the possibility that someone is going to push to the specific branches in between or the date/time that the commit itself displays? Because if it's specifically commit time that's at issue, I believe `git cherry-pick` preserves the original commit time from the original commit, which should make that a non-issue. Even if you need to fix up a commit to get the cherry-pick to apply, you can always `git commit -C <ref-of-cherry-pick>` to preserve the authorship/commit time for the commit in question.
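
A minimal sketch of the fix-up case, assuming a throwaway repository
in which the back branch has diverged on the same line so that the
cherry-pick stops on a conflict. The file contents, messages, and the
fixed author date are hypothetical.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q repo && cd repo
git symbolic-ref HEAD refs/heads/master    # make the first branch "master"
git config user.name "Demo Committer"
git config user.email "demo@example.com"
printf 'a\nb\nc\n' > file.txt
git add file.txt
git commit -q -m "base"
git branch REL9_0_STABLE

# The fix on master, with a fixed author date so it is easy to recognize.
printf 'a\nb fixed\nc\n' > file.txt
GIT_AUTHOR_DATE="1285070400 +0000" git commit -q -am "Fix the thing"

# The back branch touched the same line, so the pick cannot apply cleanly.
git checkout -q REL9_0_STABLE
printf 'a\nb old\nc\n' > file.txt
git commit -q -am "older variant"
git cherry-pick master >/dev/null 2>&1 || echo "cherry-pick stopped on a conflict"

# Resolve by hand, stage, and reuse message and authorship from the original.
printf 'a\nb fixed\nc\n' > file.txt
git add file.txt
git commit -q -C master
```

What "-C" carries over is the message and the author information,
including the author timestamp; the committer timestamp is still the
time of the new commit.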

Regards,

David
--
David Christensen
End Point Corporation
david@endpoint.com

Bruce Momjian

bruce@momjian.us

over 15 years ago

In reply to: David Christensen (#6)

Re: Multi-branch committing in git, revisited

David Christensen wrote:

If I commit in master
before I start working on 9.0, and so on back, then the commits will be
separated in time by a significant amount, thus destroying any chance of
having git_topo_order recognize them as related. So we're back up
against the problem of git not really understanding the relationships of
patches in different branches.

Is the issue here the clock time spent between the commits, i.e., the
possibility that someone is going to push to the specific branches in
between or the date/time that the commit itself displays? Because if
it's specifically commit time that's at issue, I believe `git
cherry-pick` preserves the original commit time from the original
commit, which should make that a non-issue. Even if you need to fix
up a commit to get the cherry-pick to apply, you can always `git commit
-C <ref-of-cherry-pick>` to preserve the authorship/commit time for
the commit in question.

The problem is that the cherry-picks will often have to be modified, and it
can take 20+ minutes to resolve some of them.

Bruce Momjian

bruce@momjian.us

over 15 years ago

In reply to: David E. Wheeler (#5)

Re: Multi-branch committing in git, revisited

David E. Wheeler wrote:

On Sep 21, 2010, at 8:01 PM, Bruce Momjian wrote:

Then they'd all be patched and staged.

If I understand correctly, that 'git reset' will mark all branch changes
as staged but not committed, and then you can commit all branches at
once and push it. Is that right?

Right.

OK, I am scared I actually understood that. ;-)

Is there a command to commit all staged changes in all branches, or do we
have to go around to each branch to do the commit on each one? I
realize the push works to push all branch commits (or does it only do that
from the master branch?).

Tom Lane

tgl@sss.pgh.pa.us

over 15 years ago

In reply to: David E. Wheeler (#5)

Re: Multi-branch committing in git, revisited

"David E. Wheeler" <david@kineticode.com> writes:

On Sep 21, 2010, at 8:01 PM, Bruce Momjian wrote:

Then they'd all be patched and staged.

If I understand correctly, that 'git reset' will mark all branch changes
as staged but not committed, and then you can commit all branches at
once and push it. Is that right?

Right.

You sure about the "staged" part? If I'm reading the git-reset man
page correctly, this command will revert your commit position and index,
leaving only the modified work files behind. So it looks to me like
you need another round of git add, or at least git commit -a.

Offhand I think I like Andrew's recommendation of a shortlived branch
better. In essence your idea is using the tip of "master" itself as a
shortlived branch, which is maybe a bit too cute. If you get distracted
and need to do something else for awhile, the tip of "master" is not
where you want your not-yet-pushable work to be.

(For those following along at home, there are some mighty instructive
examples in the git-reset man page.)

regards, tom lane

#10

Bruce Momjian

bruce@momjian.us

over 15 years ago

In reply to: Tom Lane (#9)

Re: Multi-branch committing in git, revisited

Tom Lane wrote:

"David E. Wheeler" <david@kineticode.com> writes:

On Sep 21, 2010, at 8:01 PM, Bruce Momjian wrote:

Then they'd all be patched and staged.

If I understand correctly, that 'git reset' will mark all branch changes
as staged but not committed, and then you can commit all branches at
once and push it. Is that right?

Right.

You sure about the "staged" part? If I'm reading the git-reset man
page correctly, this command will revert your commit position and index,
leaving only the modified work files behind. So it looks to me like
you need another round of git add, or at least git commit -a.

The command was:

git reset --soft HEAD^

The --soft says:

--soft
Does not touch the index file nor the working tree
at all, but requires them to be in a good order.
This leaves all your changed files "Changes to be
committed", as git status would put it.

and the HEAD^ is the same as HEAD^1, which is one commit backward from
HEAD. I assume "Changes to be committed" means "staged".
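
A minimal sketch that checks this reading of "--soft" in a throwaway
repository; the file name and commit messages are hypothetical.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q repo && cd repo
git symbolic-ref HEAD refs/heads/master    # make the first branch "master"
git config user.name "Demo Committer"
git config user.email "demo@example.com"
echo "base" > file.txt
git add file.txt
git commit -q -m "base"
echo "change" >> file.txt
git commit -q -am "to be undone"

# Move the branch back one commit; index and working tree are left alone.
git reset -q --soft HEAD^

staged=$(git diff --cached --name-only)    # in the index, not yet committed
unstaged=$(git diff --name-only)           # in the work tree, not in the index
echo "staged: ${staged:-none}"
echo "unstaged: ${unstaged:-none}"
```

The change shows up under "staged" only, so no further "git add" is
needed before committing again. A plain "git reset HEAD^" (the default
"--mixed" mode) would instead leave it unstaged.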

Offhand I think I like Andrew's recommendation of a shortlived branch
better. In essence your idea is using the tip of "master" itself as a
shortlived branch, which is maybe a bit too cute. If you get distracted

True.

#11

Bruce Momjian

bruce@momjian.us

over 15 years ago

In reply to: Bruce Momjian (#10)

Re: Multi-branch committing in git, revisited

Bruce Momjian wrote:

Offhand I think I like Andrew's recommendation of a shortlived branch
better. In essence your idea is using the tip of "master" itself as a
shortlived branch, which is maybe a bit too cute. If you get distracted

True.

However, keep in mind that creating a branch in every existing backpatch
branch is going to create even more backpatching monkey-work.

#12

Tom Lane

tgl@sss.pgh.pa.us

over 15 years ago

In reply to: David Christensen (#6)

Re: Multi-branch committing in git, revisited

David Christensen <david@endpoint.com> writes:

Is the issue here the clock time spent between the commits, i.e., the possibility that someone is going to push to the specific branches in between or the date/time that the commit itself displays? Because if it's specifically commit time that's at issue, I believe `git cherry-pick` preserves the original commit time from the original commit, which should make that a non-issue. Even if you need to fix up a commit to get the cherry-pick to apply, you can always `git commit -C <ref-of-cherry-pick>` to preserve the authorship/commit time for the commit in question.

Oh, yeah, that's interesting. So you could force all the commits to
match the timestamp of the first one. That's sort of the wrong end
of the process though --- I'd rather have a timestamp closer to when
I'm done than when I start.

The other thing that comes to mind on further reflection is that by
the time you get done with the back-patching, the commit log message
might be different from what you thought it would be when you started.
I had an example just today:
http://git.postgresql.org/gitweb?p=postgresql.git;a=commit;h=829f5b3571241cae2cc1a02923439cd0725d683c
Fixing "make distdir" wasn't part of what I was doing when I started.
I suppose I could have done that in a separate series of commits, but
if the idea is to make things more efficient not less so, that's not
the direction I want to go.

So right now I'm thinking that the best approach is to do the work
in temporary topic branches, then make up a commit message and use
it while doing squash merges onto the public branches. I hadn't thought
when I went into this that a two-line patch would justify a temporary
branch, but if you need to back-patch it, maybe it does.

In principle I guess that we could decide to use -c or -C for the squash
merges and thus make their commit timestamps exactly the same not only
approximately the same. This seems a bit overly anal retentive to me
at the moment, but maybe sometime in the future it'll be an idea worth
adopting.
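
A minimal sketch of that approach, assuming a throwaway repository with
one back branch; the topic branch names, file contents, and message are
hypothetical. The second squash merge uses "-C" to copy the message and
author timestamp from the first.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q repo && cd repo
git symbolic-ref HEAD refs/heads/master    # make the first branch "master"
git config user.name "Demo Committer"
git config user.email "demo@example.com"
echo "base" > file.txt
git add file.txt
git commit -q -m "base"
git branch REL9_0_STABLE

# Do the real work on throwaway topic branches, one per public branch.
git checkout -q -b topic-master master
echo "fix for master" >> file.txt
git commit -q -am "wip on master"
git checkout -q -b topic-90 REL9_0_STABLE
echo "fix for 9.0" >> file.txt
git commit -q -am "wip on 9.0"

# Write the final message only once all the back-patching is done.
msg="$work/commitmsg"
printf 'Fix the thing\n\nDetails known only after back-patching.\n' > "$msg"

git checkout -q master
git merge -q --squash topic-master
git commit -q -F "$msg"

git checkout -q REL9_0_STABLE
git merge -q --squash topic-90
git commit -q -C master    # same message and author timestamp as master's commit

git branch -q -D topic-master topic-90
```

Replacing "-C master" with "-F $msg" gives the "approximately the same"
variant: identical messages, timestamps a few seconds apart.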

regards, tom lane

#13

Tom Lane

tgl@sss.pgh.pa.us

over 15 years ago

In reply to: Bruce Momjian (#7)

Re: Multi-branch committing in git, revisited

Bruce Momjian <bruce@momjian.us> writes:

The problem is that the cherry-picks will often have to be modified, and it
can take 20+ minutes to resolve some of them.

To say nothing of actually *testing* the patch in each branch, which is
Strongly Recommended if it didn't apply cleanly. I've not infrequently
spent many hours on a difficult back-patch sequence.

regards, tom lane

#14

Tom Lane

tgl@sss.pgh.pa.us

over 15 years ago

In reply to: Bruce Momjian (#11)

Re: Multi-branch committing in git, revisited

Bruce Momjian <bruce@momjian.us> writes:

However, keep in mind that creating a branch in every existing backpatch
branch is going to create even more backpatching monkey-work.

Monkey-work is scriptable though. It'll all be worth it if git
cherry-pick is even marginally smarter about back-merging the actual
patches. In principle it could be less easily confused than plain
old patch, but I was a bit discouraged by the upthread comment that
it's just a shorthand for "git diff | patch" :-(

regards, tom lane

#15

A.M.

agentm@themactionfaction.com

over 15 years ago

In reply to: Tom Lane (#9)

Re: Multi-branch committing in git, revisited

On Sep 21, 2010, at 11:19 PM, Tom Lane wrote:

Offhand I think I like Andrew's recommendation of a shortlived branch
better. In essence your idea is using the tip of "master" itself as a
shortlived branch, which is maybe a bit too cute. If you get distracted
and need to do something else for awhile, the tip of "master" is not
where you want your not-yet-pushable work to be.

For uncommitted work, see also "git stash".
http://www.kernel.org/pub/software/scm/git/docs/git-stash.html
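
A minimal sketch of setting uncommitted work aside and picking it up
again, assuming a throwaway repository; the file and its contents are
hypothetical.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q repo && cd repo
git symbolic-ref HEAD refs/heads/master    # make the first branch "master"
git config user.name "Demo Committer"
git config user.email "demo@example.com"
echo "base" > file.txt
git add file.txt
git commit -q -m "base"

# Half-finished work that should not sit on the tip of master.
echo "half-finished back-patch" >> file.txt

git stash >/dev/null                       # shelve it; the tree is clean again
clean_after_stash=$(git status --porcelain)

# ... deal with the interruption on a clean tree ...

git stash pop >/dev/null                   # bring the work back
```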

Cheers,
M

#16

David E. Wheeler

david@kineticode.com

over 15 years ago

In reply to: Tom Lane (#9)

Re: Multi-branch committing in git, revisited

On Sep 21, 2010, at 8:19 PM, Tom Lane wrote:

You sure about the "staged" part?

Yes, I do it all the time (I make a lot of mistakes).

Offhand I think I like Andrew's recommendation of a shortlived branch
better. In essence your idea is using the tip of "master" itself as a
shortlived branch, which is maybe a bit too cute. If you get distracted
and need to do something else for awhile, the tip of "master" is not
where you want your not-yet-pushable work to be.

Yes, I think using branches for everything is generally the way to go. But if you wanted to just use your existing approach, then reset --soft HEAD^ would work, too.

Best,

David

#17

Heikki Linnakangas

heikki.linnakangas@enterprisedb.com

over 15 years ago

In reply to: Tom Lane (#1)

Re: Multi-branch committing in git, revisited

On 22/09/10 04:20, Tom Lane wrote:

Okay, so now that I've actually done a couple of multi-branch commits...

I'm using the multiple-work-directory arrangement suggested on our wiki
page. The work flow seems to boil down to:

* Prepare patch in master
* Stage patch with git add
* git diff --staged>/tmp/patch-head
* cd into REL9_0_STABLE workdir
* patch -p0</tmp/patch-head
* Adjust patch if needed
* Stage patch with git add
* git diff --staged>/tmp/patch-90
* cd into REL8_4_STABLE workdir
* patch -p0</tmp/patch-90
* ... lather, rinse, repeat ...
* cd back to master
* git commit -F /tmp/commitmsg
* cd into REL9_0_STABLE workdir
* git commit -F /tmp/commitmsg
* cd into REL8_4_STABLE workdir
* git commit -F /tmp/commitmsg
* ... lather, rinse, repeat ...
* git push

While this isn't much worse than what I was used to with CVS, it's
definitely not better. I think that I could simplify transferring the
patch back to older branches if I could use git cherry-pick. However,
that only works on already-committed patches. If I commit in master
before I start working on 9.0, and so on back, then the commits will be
separated in time by a significant amount, thus destroying any chance of
having git_topo_order recognize them as related.

In git, each commit has two timestamps: an author timestamp and a committer
timestamp. They are usually the same, but when you cherry-pick, the
cherry-picked commit retains the original author timestamp, while commit
timestamp changes. "git log" shows only the author timestamp, and if I'm
reading git_topo_order correctly, that's what it cares about too. "git
log --format=fuller" can be used to show both.

So AFAICS, if you use cherry-pick, you're fine. Even if you don't for
some reason, you can override the author timestamp with "git commit
--date=<date>".
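
A minimal sketch that makes the two timestamps visible, assuming a
throwaway repository. The dates are arbitrary epoch values set through
the GIT_AUTHOR_DATE and GIT_COMMITTER_DATE environment variables so
that the back-patch appears to be committed well after the original.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q repo && cd repo
git symbolic-ref HEAD refs/heads/master    # make the first branch "master"
git config user.name "Demo Committer"
git config user.email "demo@example.com"
echo "base" > file.txt
git add file.txt
GIT_AUTHOR_DATE="1285070400 +0000" GIT_COMMITTER_DATE="1285070400 +0000" \
    git commit -q -m "base"
git branch REL9_0_STABLE

# The original fix on master, one hour later.
echo "fix" > fix.txt
git add fix.txt
GIT_AUTHOR_DATE="1285074000 +0000" GIT_COMMITTER_DATE="1285074000 +0000" \
    git commit -q -m "Fix the thing"

# The back-patch is picked another ninety minutes later.
git checkout -q REL9_0_STABLE
GIT_COMMITTER_DATE="1285079400 +0000" git cherry-pick master >/dev/null

author_master=$(git log -1 --format=%at master)
author_back=$(git log -1 --format=%at REL9_0_STABLE)
committer_back=$(git log -1 --format=%ct REL9_0_STABLE)

# Show both dates for the back-patched commit.
git log -1 --format=fuller REL9_0_STABLE
```

The author date of the picked commit matches the original on master,
while its committer date reflects when the pick was made.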

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#18

Elvis Pranskevichus

el@prans.net

over 15 years ago

In reply to: Tom Lane (#14)

Re: Multi-branch committing in git, revisited

On September 21, 2010 11:47:30 pm Tom Lane wrote:

Bruce Momjian <bruce@momjian.us> writes:

However, keep in mind that creating a branch in every existing backpatch
branch is going to create even more backpatching monkey-work.

Monkey-work is scriptable though. It'll all be worth it if git
cherry-pick is even marginally smarter about back-merging the actual
patches. In principle it could be less easily confused than plain
old patch, but I was a bit discouraged by the upthread comment that
it's just a shorthand for "git diff | patch" :-(

regards, tom lane

cherry-pick is NOT just a shorthand for git diff | patch. It is a
single-commit merge tool. The man page does not indicate that, but you can
supply the merge strategy parameter, just like you would do with git merge,
but AFAIR it's not necessary and cherry-pick will fall back to the default
recursive merge when needed.
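
A minimal sketch of one visible difference. The scenario is an
illustration and not taken from the thread: a file was moved on master
after the back branch forked, so a plain diff of the fix names a path
that does not exist on the back branch, while the cherry-pick's merge
machinery follows the rename. Paths and contents are hypothetical.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q repo && cd repo
git symbolic-ref HEAD refs/heads/master    # make the first branch "master"
git config user.name "Demo Committer"
git config user.email "demo@example.com"
mkdir old
printf 'line 1\nline 2\nline 3\nline 4\n' > old/name.c
git add old/name.c
git commit -q -m "base"
git branch REL9_0_STABLE

# On master the file is moved, and only afterwards fixed.
mkdir new
git mv old/name.c new/name.c
git commit -q -m "Move file"
printf 'line 1\nline 2 fixed\nline 3\nline 4\n' > new/name.c
git commit -q -am "Fix the thing"

git checkout -q REL9_0_STABLE

# A plain diff of the fix refers to new/name.c, which is absent here.
if git diff master~1 master | git apply --check 2>/dev/null; then
    plain_patch=applies
else
    plain_patch=rejected
fi
echo "plain patch: $plain_patch"

# The cherry-pick detects the rename and patches old/name.c instead.
git cherry-pick master >/dev/null
```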

Elvis

#19

Abhijit Menon-Sen

ams@2ndQuadrant.com

over 15 years ago

In reply to: Tom Lane (#1)

Re: Multi-branch committing in git, revisited

At 2010-09-21 21:20:06 -0400, tgl@sss.pgh.pa.us wrote:

So it seems like maybe we still need some more thought about how to
recognize related commits in different branches.

I'd suggest using "git cherry-pick -x" (or something similar) to mark
backported patches:

-x When recording the commit, append to the original commit message
a note that indicates which commit this change was cherry-picked
from. Append the note only for cherry picks without conflicts.
Do not use this option if you are cherry-picking from your
private branch because the information is useless to the
recipient. If on the other hand you are cherry-picking between
two publicly visible branches (e.g. backporting a fix to a
maintenance branch for an older release from a development
branch), adding this information can be useful.

I don't think it makes any sense to contort your workflow to commit to
different branches at the same instant just to be able to group commits
by timestamp. Using the trail left by cherry-pick -x is much better. You
can just commit your changes to master and cherry-pick them wherever you
want to. This is independent of doing the work in a topic branch.

(Of course, with git it's less troublesome to merge forward rather than
pick backwards, but that's a workflow change that's a lot harder to
adjust to.)
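
A minimal sketch of the trail that "-x" leaves and of one way to search
for it afterwards, assuming a throwaway repository; the file, message,
and the "git log --grep" lookup are illustrative choices.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q repo && cd repo
git symbolic-ref HEAD refs/heads/master    # make the first branch "master"
git config user.name "Demo Committer"
git config user.email "demo@example.com"
echo "base" > file.txt
git add file.txt
git commit -q -m "base"
git branch REL9_0_STABLE

# Commit the fix on master first.
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Fix the thing"
orig=$(git rev-parse master)

# Back-patch it, recording where it came from.
git checkout -q REL9_0_STABLE
git cherry-pick -x master >/dev/null
note=$(git log -1 --format=%b REL9_0_STABLE)
echo "$note"

# Later: list every back-patch of that commit, on any branch.
backports=$(git log --all --format=%H --grep="cherry picked from commit $orig")
echo "$backports"
```

The link between the commits lives in the message itself, so it does
not depend on how far apart in time the commits were made.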

-- ams

#20

Magnus Hagander

magnus@hagander.net

over 15 years ago

In reply to: Tom Lane (#14)

Re: Multi-branch committing in git, revisited

On Wed, Sep 22, 2010 at 05:47, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Bruce Momjian <bruce@momjian.us> writes:

However, keep in mind that creating a branch in every existing backpatch
branch is going to create even more backpatching monkey-work.

Monkey-work is scriptable though. It'll all be worth it if git
cherry-pick is even marginally smarter about back-merging the actual
patches. In principle it could be less easily confused than plain
old patch, but I was a bit discouraged by the upthread comment that
it's just a shorthand for "git diff | patch" :-(

FWIW, here's the workflow I just tried for the gitignore patch (blame
me and not the workflow if I broke the patch, btw :P)

* Have a master "committers" repo, with all active branches checked out
  (and a simple script that updates and can reset them all automatically)
* Have a working repo, where I do my changes. Each branch gets a checkout
  when necessary here, and this is where I apply it. I've just used
  inline checkouts, but I don't see why it shouldn't work with workdirs etc.
* In the working repo, apply patch to master branch.
* Then use git cherry-pick to get it into the back branches (still in
  the working repo). At this point, also do the testing of the backpatch.

At this point we have commits with potentially lots of time in between them.
So now we squash these onto the committers repository, using a small script
that does this:

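#!/bin/sh

set -e

CMSG=/tmp/commitmsg.$$

# Write the shared commit message. "editor" is the Debian-style name for
# the default editor; substitute $EDITOR on other systems.
editor "$CMSG"

if [ ! -f "$CMSG" ]; then
    echo "No commit message, aborting."
    exit 0
fi

export BRANCHES="master REL9_0_STABLE REL8_4_STABLE REL8_3_STABLE
REL8_2_STABLE REL8_1_STABLE REL8_0_STABLE REL7_4_STABLE"

# "local" is the remote that points at the working repo.
echo "Fetching local changes so they are available to merge"
git fetch local

# Squash each branch's work into a single commit, all with the same message.
for B in $BRANCHES; do
    echo "Switching and merging $B..."
    git checkout "$B"
    git merge --squash --no-commit "local/$B"
    git commit -F "$CMSG"
done

rm -f "$CMSG"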

BTW, before pushing, I like to do something like this:

git push --dry-run 2>&1 | egrep -v "^To" | awk '{print $1}' |
    xargs git log --format=fuller

just to get a list of exactly what I'm about to push :-) That doesn't
mean there won't be mistakes, but maybe fewer of them...
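
A minimal sketch of the same pre-push check done a different way, by
comparing the local branch with its remote-tracking ref rather than
parsing "git push --dry-run" output. It assumes a throwaway bare
repository standing in for the real origin and covers a single branch;
the names and messages are hypothetical.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q --bare origin.git
git init -q clone && cd clone
git symbolic-ref HEAD refs/heads/master    # make the first branch "master"
git config user.name "Demo Committer"
git config user.email "demo@example.com"
git remote add origin ../origin.git
echo "base" > file.txt
git add file.txt
git commit -q -m "base"
git push -q origin master                  # also updates origin/master locally

# A new local commit that has not been pushed yet.
echo "fix" >> file.txt
git commit -q -am "Fix the thing"

# Everything on master that origin/master does not have yet.
pending=$(git log --format=%s origin/master..master)
git log --format=fuller origin/master..master
```

Running the last command once per branch gives the same list of
about-to-be-pushed commits, with both author and committer dates shown.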