Managing multiple branches in git
[ it's way past time for a new subject thread ]
Marko Kreen <markokr@gmail.com> writes:
They cannot be same commits in GIT as the resulting tree is different.
This brings up something that I've been wondering about: my limited
exposure to git hasn't shown me any sane way to work with multiple
release branches.
The way that I have things set up for CVS is that I have a checkout
of HEAD, and also "sticky" checkouts of the back branches:
pgsql/ ...
REL8_3/pgsql/ ... (made with -r REL8_3_STABLE)
REL8_2/pgsql/ ...
etc
Each of these is configured (using --prefix) to install into a separate
installation tree. So I can switch my attention to one branch or
another by cd'ing to the right place and adjusting a few environment
variables such as PATH and PGDATA.
The way I prepare a patch that has to be back-patched is first to make
and test the fix in HEAD. Then apply it (using diff/patch and perhaps
manual adjustments) to the first back branch, and test that. Repeat for
each back branch as far as I want to go. Almost always, there is a
certain amount of manual adjustment involved due to renamings,
historical changes of pgindent rules, etc. Once I have all the versions
tested, I prepare a commit message and commit all the branches. This
results in one commit message per branch in the pgsql-committers
archives, and just one commit in the cvs2cl representation of the
history --- which is what I want.
I don't see any even-approximately-sane way to handle similar cases
in git. From what I've learned so far, you can have one checkout
at a time in a git working tree, which would mean N copies of the
entire repository if I want N working trees. Not to mention the
impossibility of getting it to regard parallel commits as related
in any way whatsoever.
So how is this normally done with git?
regards, tom lane
On Jun 2, 2009, at 8:43 AM, Tom Lane wrote:
Each of these is configured (using --prefix) to install into a separate
installation tree. So I can switch my attention to one branch or
another by cd'ing to the right place and adjusting a few environment
variables such as PATH and PGDATA.
Yeah, with git, rather than cd'ing to another directory, you'd just do
`git checkout rel8_3` and work from the same directory.
So how is this normally done with git?
For better or for worse, because git is project-oriented rather than
filesystem-oriented, you can't commit to all the branches at once. You
have to commit to each one independently. You can push them all back
to the canonical repository at once, and the canonical repository's
commit hooks can trigger for all of the commits at once (or so I
gather from getting emails from GitHub with a bunch of commits listed
in a single message), but each commit is still independent.
It has to do with the fundamentally different way in which Git works:
snapshots of your code rather than different directories.
Best,
David
"David E. Wheeler" <david@kineticode.com> writes:
Yeah, with git, rather than cd'ing to another directory, you'd just do
`git checkout rel8_3` and work from the same directory.
That's what I'd gathered, and frankly it is not an acceptable answer.
Sure, the "checkout" operation is remarkably fast, but it does nothing
for derived files. What would really be involved here (if I wanted to
be sure of having a non-broken build) is
make maintainer-clean
git checkout rel8_3
configure
make
which takes long enough that I'll have plenty of time to consider
how much I hate git. If there isn't a better way proposed, I'm
going to flip back to voting against this conversion. I need tools
that work for me not against me.
regards, tom lane
On Jun 2, 2009, at 9:03 AM, Tom Lane wrote:
"David E. Wheeler" <david@kineticode.com> writes:
Yeah, with git, rather than cd'ing to another directory, you'd just do
`git checkout rel8_3` and work from the same directory.
That's what I'd gathered, and frankly it is not an acceptable answer.
Sure, the "checkout" operation is remarkably fast, but it does nothing
for derived files. What would really be involved here (if I wanted to
be sure of having a non-broken build) is
make maintainer-clean
git checkout rel8_3
configure
make
which takes long enough that I'll have plenty of time to consider
how much I hate git. If there isn't a better way proposed, I'm
going to flip back to voting against this conversion. I need tools
that work for me not against me.
Well, you can have as many clones of a repository as you like. You can
keep one with master checked out, another with rel8_3, another with
rel8_2, etc. You'd just have to write a script to keep them in sync
(shouldn't be too difficult, each just has all the others as an origin
-- or maybe you have one that's canonical on your system).
Best,
David
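A minimal sketch of the "one clone per branch plus a sync script" idea, not a tested recipe for the real PostgreSQL repository. The directory names and the stand-in `canonical` repository are assumptions made up for the demo; in real use `canonical` would be your one local clone of upstream.

```sh
#!/bin/sh
# Sketch only: directory names and the stand-in "canonical" repo are assumptions.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

demo=$(mktemp -d)            # throwaway directory for the demo
cd "$demo"

# Stand-in for the one canonical copy on your system.
git init -q canonical
git -C canonical symbolic-ref HEAD refs/heads/master
echo v1 > canonical/README
git -C canonical add README
git -C canonical commit -q -m "initial commit"
git -C canonical branch rel8_3
git -C canonical branch rel8_2

# One clone per branch; each has the canonical copy as its origin.
for b in master rel8_3 rel8_2; do
    git clone -q --branch "$b" canonical "$b"
done

# Later, new commits arrive in the canonical copy...
echo v2 > canonical/README
git -C canonical commit -q -a -m "update README"

# ...and the sync script is just a loop over the per-branch clones.
for b in master rel8_3 rel8_2; do
    git -C "$b" pull -q --ff-only
done
```

Each per-branch directory keeps its own configured and compiled state, so switching branches is still a `cd`. Clones made from a local path hard-link the object files by default where the filesystem allows it, so the extra copies cost little disk space.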
David E. Wheeler wrote:
Well, you can have as many clones of a repository as you like. You can
keep one with master checked out, another with rel8_3, another with
rel8_2, etc. You'd just have to write a script to keep them in sync
(shouldn't be too difficult, each just has all the others as an origin --
or maybe you have one that's canonical on your system).
Hmm, but is there a way to create those clones from a single local
"database"?
(I like the monotone model much better. This mixing of working copies
and databases as if they were a single thing is silly and uncomfortable
to use.)
--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
* David E. Wheeler <david@kineticode.com> [090602 11:56]:
On Jun 2, 2009, at 8:43 AM, Tom Lane wrote:
Each of these is configured (using --prefix) to install into a separate
installation tree. So I can switch my attention to one branch or
another by cd'ing to the right place and adjusting a few environment
variables such as PATH and PGDATA.
Yeah, with git, rather than cd'ing to another directory, you'd just do
`git checkout rel8_3` and work from the same directory.
But that loses his "configured" and "compiled" state...
But git isn't forcing him to change his workflow at all...
He *can* keep completely separate "git repositories" for each release
and work just as before. This will carry with it a full "separate"
history in each repository, and I think that extra couple hundred MB is
what he's hoping to avoid.
But git has concepts of "object alternates" and "reference
repositories". To mimic your workflow, I would probably do something
like:
## Make my reference repository, cloned from "official" where everyone pushes
mountie@pumpkin:~/projects/postgresql$ git clone --bare --mirror git://repo.or.cz/PostgreSQL.git PostgreSQL.git
## Make my local master development repository
mountie@pumpkin:~/projects/postgresql$ git clone --reference PostgreSQL.git git://repo.or.cz/PostgreSQL.git master
Initialized empty Git repository in /home/mountie/projects/postgresql/master/.git/
## Make my local REL8_3_STABLE development repository
mountie@pumpkin:~/projects/postgresql$ git clone --reference PostgreSQL.git git://repo.or.cz/PostgreSQL.git REL8_3_STABLE
Initialized empty Git repository in /home/mountie/projects/postgresql/REL8_3_STABLE/.git/
mountie@pumpkin:~/projects/postgresql$ cd REL8_3_STABLE/
mountie@pumpkin:~/projects/postgresql/REL8_3_STABLE$ git checkout --track -b REL8_3_STABLE origin/REL8_3_STABLE
Branch REL8_3_STABLE set up to track remote branch refs/remotes/origin/REL8_3_STABLE.
Switched to a new branch 'REL8_3_STABLE'
Now, the master/REL8_3_STABLE directories are both complete git
repositories, independent of each other, except that they both reference
the "objects" in the PostgreSQL.git repository. They don't contain the
historical objects in their own object store. And I would couple that
with a cronjob:
*/15 * * * * git --git-dir=$HOME/projects/postgresql/PostgreSQL.git fetch --quiet
which will keep my "reference" project up to date (a la rsync-the-CVSROOT,
or cvsup-a-mirror anybody currently has when working with CVS)...
Then Tom can keep working pretty much as he currently does.
a.
--
Aidan Van Dyk Create like a god,
aidan@highrise.ca command like a king,
http://www.highrise.ca/ work like a slave.
On Tue, Jun 2, 2009 at 5:16 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
David E. Wheeler wrote:
Well, you can have as many clones of a repository as you like. You can
keep one with master checked out, another with rel8_3, another with
rel8_2, etc. You'd just have to write a script to keep them in sync
(shouldn't be too difficult, each just has all the others as an origin --
or maybe you have one that's canonical on your system).
Hmm, but is there a way to create those clones from a single local
"database"?
Just barely paying attention here, but isn't 'git clone --local' what you need?
--
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com
Yeah, I was annoyed by the issue with having to reconfigure as well.
There are various tricks you can do with separate repositories, though.
You could have the older-branch repositories be clones of the HEAD branch
repository, so that when you push from them the changes just go to that
repository; then you can push all three branches together (not sure if
you can do it all in one command, though).
You can also have the different repositories share data files, which I
think will mean you don't have to pull other people's commits
repeatedly. (The default is to have local clones use hard links, so
they don't take a lot of space and they're quick to sync anyway.)
There's also an option to make a clone without the full history, but
local clones are fast enough to create that there's probably no point.
Incidentally I use git-clean -x -d -f instead of make maintainer-clean.
--
Greg
On Jun 2, 2009, at 9:16 AM, Alvaro Herrera wrote:
Well, you can have as many clones of a repository as you like. You can
keep one with master checked out, another with rel8_3, another with
rel8_2, etc. You'd just have to write a script to keep them in sync
(shouldn't be too difficult, each just has all the others as an origin --
or maybe you have one that's canonical on your system).
Hmm, but is there a way to create those clones from a single local
"database"?
Yeah, that's what I meant by a "canonical copy on your system."
(I like the monotone model much better. This mixing of working copies
and databases as if they were a single thing is silly and uncomfortable
to use.)
Monotone?
Best,
David
On 6/2/09, Tom Lane <tgl@sss.pgh.pa.us> wrote:
[ it's way past time for a new subject thread ]
Marko Kreen <markokr@gmail.com> writes:
They cannot be same commits in GIT as the resulting tree is different.
This brings up something that I've been wondering about: my limited
exposure to git hasn't shown me any sane way to work with multiple
release branches.
The way that I have things set up for CVS is that I have a checkout
of HEAD, and also "sticky" checkouts of the back branches:
pgsql/ ...
REL8_3/pgsql/ ... (made with -r REL8_3_STABLE)
REL8_2/pgsql/ ...
etc
Each of these is configured (using --prefix) to install into a separate
installation tree. So I can switch my attention to one branch or
another by cd'ing to the right place and adjusting a few environment
variables such as PATH and PGDATA.
The way I prepare a patch that has to be back-patched is first to make
and test the fix in HEAD. Then apply it (using diff/patch and perhaps
manual adjustments) to the first back branch, and test that. Repeat for
each back branch as far as I want to go. Almost always, there is a
certain amount of manual adjustment involved due to renamings,
historical changes of pgindent rules, etc. Once I have all the versions
tested, I prepare a commit message and commit all the branches. This
results in one commit message per branch in the pgsql-committers
archives, and just one commit in the cvs2cl representation of the
history --- which is what I want.
I don't see any even-approximately-sane way to handle similar cases
in git. From what I've learned so far, you can have one checkout
at a time in a git working tree, which would mean N copies of the
entire repository if I want N working trees. Not to mention the
impossibility of getting it to regard parallel commits as related
in any way whatsoever.
Whether you use several branches in one tree or several checked-out
trees should be a personal preference; both ways are possible with GIT.
So how is this normally done with git?
If you are talking about backbranch fixes, then the most
"version-controlled" way to do it would be to use the lowest branch as
the base, commit the fix there, and then merge it upwards.
Now whether it succeeds depends on the merge points between branches,
as the VCS takes the nearest merge point as the base for its merge logic.
I think that is also the actual thing that Markus is concerned about.
But instead of having random merge points between branches that depend
on when some new file was added, we could simply import all branches
with linear history and later simply say to git that:
* 7.4 is merged into 8.0
..
* 8.2 is merged into 8.3
* 8.3 is merged into HEAD
without any file changes. Logically this would mean that "any changes in
branch N-1 are already in N".
So afterwards, when working fully with GIT, any upwards merges
work without any fuss, as git does not need to consider the old history
imported from CVS at all.
--
marko
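One way to record "8.3 is merged into HEAD" without any file changes is git's `ours` merge strategy. The sketch below is an assumption about how such marker merges could be made, not necessarily what Marko had in mind; the repository, file names, and commit messages are invented for the demo.

```sh
#!/bin/sh
# Sketch only: repository layout and file names are made up for illustration.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

demo=$(mktemp -d)
git init -q "$demo/repo"
cd "$demo/repo"
git symbolic-ref HEAD refs/heads/master

echo base > file.c
git add file.c
git commit -q -m "common ancestor"
git branch REL8_3_STABLE

# The branches diverge, as they would in history imported from CVS.
echo "head work" > file.c
git commit -q -a -m "work on HEAD"
git checkout -q REL8_3_STABLE
echo "8.3 work" > file.c
git commit -q -a -m "work on 8.3"

# Record "8.3 is merged into HEAD" while keeping master's files untouched.
git checkout -q master
git merge -q -s ours -m "Mark REL8_3_STABLE as merged into HEAD" REL8_3_STABLE

# A later fix on the back branch now merges upward using only the new change.
git checkout -q REL8_3_STABLE
echo fix > fix.c
git add fix.c
git commit -q -m "back-branch fix"
git checkout -q master
git merge -q -m "Merge fix from REL8_3_STABLE" REL8_3_STABLE
```

After the marker merge, the back branch's old tip is an ancestor of master, so the second merge only has to consider `fix.c` and leaves master's version of `file.c` alone.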
On 06/02/2009 05:43 PM, Tom Lane wrote:
Marko Kreen <markokr@gmail.com> writes:
They cannot be same commits in GIT as the resulting tree is different.
I don't see any even-approximately-sane way to handle similar cases
in git. From what I've learned so far, you can have one checkout
at a time in a git working tree, which would mean N copies of the
entire repository if I want N working trees. Not to mention the
impossibility of getting it to regard parallel commits as related
in any way whatsoever.
You can use the "--reference" option to git clone to refer to objects in
another clone. That way most of the commits will be stored only there;
only the local commits will be in the local checkout.
Andres
Alvaro Herrera <alvherre@commandprompt.com> writes:
Hmm, but is there a way to create those clones from a single local
"database"?
(I like the monotone model much better. This mixing of working copies
and databases as if they were a single thing is silly and uncomfortable
to use.)
I agree, .git as a subdirectory of the working directory doesn't make
much sense to me.
I wondered for a second about symlinking .git from several checkout
directories to a common master, but AFAICT .git stores both the
"repository" and status information about the current checkout, so
that's not gonna work.
In the one large project that I have a git tree for, .git seems to
eat only about as much disk space as the checkout (so apparently the
compression is pretty effective). So it wouldn't be totally impractical
to have a separate repository for each branch, but it sure seems like
an ugly and klugy way to do it. And we'd still end up with the "same"
commit on different branches appearing entirely unrelated.
At the same time, I don't really buy the theory that relating commits on
different branches via merges will work. In my experience it is very
seldom the case that a patch applies to each back branch with no manual
effort whatever, which is what I gather the merge functionality could
help with. So maybe there's not much help to be had on this ...
regards, tom lane
On 06/02/2009 06:33 PM, Tom Lane wrote:
At the same time, I don't really buy the theory that relating commits on
different branches via merges will work. In my experience it is very
seldom the case that a patch applies to each back branch with no manual
effort whatever, which is what I gather the merge functionality could
help with. So maybe there's not much help to be had on this ...
You can do a merge and adjust the result before committing it. That way
the merge-tracking information is still recorded correctly even though
you changed the merged result, so that further merge operations can
consider the specific change to be applied on both/some/all branches.
This will happen by default if there is a merge conflict or can be
forced by using the --no-commit option to merge.
Andres
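A minimal sketch of the `--no-commit` flow described above, using an invented repository and file contents. Note that `--no-commit` only stops a true merge; if the merge could be a fast-forward, `--no-ff` would be needed as well, which the demo avoids by giving master a commit of its own.

```sh
#!/bin/sh
# Sketch only: file names and contents are made up for illustration.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

demo=$(mktemp -d)
git init -q "$demo/repo"
cd "$demo/repo"
git symbolic-ref HEAD refs/heads/master

printf 'int limit = 10;\n' > conf.c
git add conf.c
git commit -q -m "base"
git branch REL8_3_STABLE

# master moves on, so the later merge is a real merge, not a fast-forward.
echo "head only" > head.c
git add head.c
git commit -q -m "HEAD-only work"

# The fix is committed on the back branch first.
git checkout -q REL8_3_STABLE
printf 'int limit = 20;\n' > conf.c
git commit -q -a -m "Raise limit"

# Merge upward, but stop before committing so the result can be adjusted.
git checkout -q master
git merge -q --no-commit REL8_3_STABLE
printf 'int limit = 20; /* adjusted for HEAD */\n' > conf.c
git add conf.c
git commit -q -m "Merge REL8_3_STABLE: raise limit (adjusted for HEAD)"
```

The resulting commit has both branch tips as parents, so git treats the back-branch fix as already merged even though the content on master differs.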
Tom Lane wrote:
I agree, .git as a subdirectory of the working directory doesn't make
much sense to me.
I wondered for a second about symlinking .git from several checkout
directories to a common master, but AFAICT .git stores both the
"repository" and status information about the current checkout, so
that's not gonna work.
In the one large project that I have a git tree for, .git seems to
eat only about as much disk space as the checkout (so apparently the
compression is pretty effective). So it wouldn't be totally impractical
to have a separate repository for each branch, but it sure seems like
an ugly and klugy way to do it. And we'd still end up with the "same"
commit on different branches appearing entirely unrelated.
I am curious about why an end user would really care? CVS and SVN both
kept local workspace directories containing metadata. If anything, I
find GIT the least intrusive of these three, as the .git is only in the
top-level directory, whereas CVS and SVN like to pollute every directory.
Assuming you don't keep binaries under source control, the .git
containing all history is very often smaller than the "pristine copy"
kept by CVS or SVN in their metadata directories, so space isn't really
the issue.
Maybe think of it more like a feature. GIT keeps a local cache of the
entire repo, whereas SVN and CVS only keep a local cache of the commit
you are based on. It's a feature that you can review history without
network connectivity.
Cheers,
mark
--
Mark Mielke <mark@mielke.cc>
On 6/2/09, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Alvaro Herrera <alvherre@commandprompt.com> writes:
Hmm, but is there a way to create those clones from a single local
"database"?
(I like the monotone model much better. This mixing of working copies
and databases as if they were a single thing is silly and uncomfortable
to use.)
I agree, .git as a subdirectory of the working directory doesn't make
much sense to me.
I wondered for a second about symlinking .git from several checkout
directories to a common master, but AFAICT .git stores both the
"repository" and status information about the current checkout, so
that's not gonna work.
You cannot share .git, but you can share the object directory
(.git/objects), which contains the bulk of the data. There are various
ways to do it; a symlink should be one of them.
In the one large project that I have a git tree for, .git seems to
eat only about as much disk space as the checkout (so apparently the
compression is pretty effective). So it wouldn't be totally impractical
to have a separate repository for each branch, but it sure seems like
an ugly and klugy way to do it. And we'd still end up with the "same"
commit on different branches appearing entirely unrelated.
At the same time, I don't really buy the theory that relating commits on
different branches via merges will work. In my experience it is very
seldom the case that a patch applies to each back branch with no manual
effort whatever, which is what I gather the merge functionality could
help with. So maybe there's not much help to be had on this ...
Sure, if branches are different enough, the merge commit would
contain a lot of code changes. But still, you would get a single "main"
commit with a log message, plus a bunch of merge commits, which may be
nicer than several duplicate commits.
--
marko
On Jun 2, 2009, at 9:23 AM, Aidan Van Dyk wrote:
Yeah, with git, rather than cd'ing to another directory, you'd just do
`git checkout rel8_3` and work from the same directory.
But that loses his "configured" and "compiled" state...
But git isn't forcing him to change his workflow at all...
I defer to your clearly superior knowledge. Git is simple, but there
is *so* much to learn!
David
Tom Lane wrote:
Marko Kreen <markokr@gmail.com> writes:
They cannot be same commits in GIT as the resulting tree is different.
The way I prepare a patch that has to be back-patched is first to make
and test the fix in HEAD. Then apply it (using diff/patch and perhaps
manual adjustments) to the first back branch, and test that. Repeat for
each back branch as far as I want to go. Almost always, there is a
certain amount of manual adjustment involved due to renamings,
historical changes of pgindent rules, etc. Once I have all the versions
tested, I prepare a commit message and commit all the branches. This
results in one commit message per branch in the pgsql-committers
archives, and just one commit in the cvs2cl representation of the
history --- which is what I want.
I think the closest equivalent to what you're doing here is:
"git cherry-pick -n -x <the commit you want to pull>"
The "git cherry-pick" command does something similar to the diff/patch work.
The "-n" prevents an automatic commit, to allow for manual adjustments.
The "-x" flag adds a note to the commit comment describing the relationship
between the commits.
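A minimal sketch of that back-patch flow on a throwaway repository. The file name, its contents, and the commit messages are assumptions for the demo; `--no-edit` is used only so the script runs without opening an editor.

```sh
#!/bin/sh
# Sketch only: file names and contents are made up for illustration.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

demo=$(mktemp -d)
git init -q "$demo/repo"
cd "$demo/repo"
git symbolic-ref HEAD refs/heads/master

echo old > bug.c
git add bug.c
git commit -q -m "base"
git branch REL8_3_STABLE

# Make and commit the fix on HEAD first.
echo fixed > bug.c
git commit -q -a -m "Fix the bug"
fix=$(git rev-parse HEAD)

# Apply it to the back branch without committing (-n), noting its origin (-x).
git checkout -q REL8_3_STABLE
git cherry-pick -n -x "$fix"

# Manual adjustments and testing for this branch would happen here.
echo 'fixed /* 8.3 variant */' > bug.c
git add bug.c

# Commit with the message cherry-pick prepared, including the -x note.
git commit -q --no-edit
```

The back-branch commit message ends with a "(cherry picked from commit ...)" line naming the HEAD commit, which is the only link git records between the two.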
It seems to me we could make a cvs2cl-like script that's aware
of the comments "git cherry-pick -x" inserts and rolls them up
in much the same way that cvs2cl does.
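A rough sketch of what the core of such a roll-up script might look like. The function name `rollup_backports` and the demo repository are hypothetical, and this only pairs each back-patch with the commit it was picked from; it does not reproduce cvs2cl's output format.

```sh
#!/bin/sh
# Sketch only: rollup_backports is a hypothetical helper, not an existing tool.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# Print "<original commit> <backport commit>" pairs, sorted by original.
rollup_backports() {
    git -C "$1" rev-list --all --grep='cherry picked from commit' |
    while read -r c; do
        orig=$(git -C "$1" show -s --format=%b "$c" |
               sed -n 's/^(cherry picked from commit \([0-9a-f]*\))$/\1/p')
        printf '%s %s\n' "$orig" "$c"
    done | sort
}

# Throwaway repository with one fix back-patched to two branches.
demo=$(mktemp -d)
git init -q "$demo/repo"
cd "$demo/repo"
git symbolic-ref HEAD refs/heads/master
echo old > bug.c
git add bug.c
git commit -q -m "base"
for b in REL8_3_STABLE REL8_2_STABLE; do
    git checkout -q -b "$b" master
    echo "$b" > VERSION          # make each back branch distinct
    git add VERSION
    git commit -q -m "Branch off $b"
done
git checkout -q master
echo fixed > bug.c
git commit -q -a -m "Fix the bug"
fix=$(git rev-parse HEAD)
for b in REL8_3_STABLE REL8_2_STABLE; do
    git checkout -q "$b"
    git cherry-pick -x "$fix" >/dev/null
done
git checkout -q master

rollup_backports "$demo/repo"
```

Lines that share the same first column belong to one logical change, so a reporting script could group on that column and print the log message once.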
The way that I have things set up for CVS is that I have a checkout
of HEAD, and also "sticky" checkouts of the back branches...
Each of these is configured (using --prefix) to install into a separate
installation tree. ...
I think the most similar thing here would be for you to have one
normal clone of the "official" repository, and then use
git-clone --local
when you set up the back branch directories. The --local flag will
use hard-links to avoid wasting space & time of maintaining multiple
copies of histories.
I don't see any even-approximately-sane way to handle similar cases
in git. From what I've learned so far, you can have one checkout
at a time in a git working tree, which would mean N copies of the
entire repository if I want N working trees....
git-clone --local avoids that.
... Not to mention the
impossibility of getting it to regard parallel commits as related
in any way whatsoever.
Well - "related in any way whatsoever" seems possible either
through the comments added in the "-x" flag in git-cherry-pick, or
with the other workflows people described where you fix the bug in
a new branch off some ancestor of all the releases (ideally near
where the bug occurred) and merge them into the branches.
* Tom Lane <tgl@sss.pgh.pa.us> [090602 12:35]:
Alvaro Herrera <alvherre@commandprompt.com> writes:
Hmm, but is there a way to create those clones from a single local
"database"?
(I like the monotone model much better. This mixing of working copies
and databases as if they were a single thing is silly and uncomfortable
to use.)
I agree, .git as a subdirectory of the working directory doesn't make
much sense to me.
The main reason why git does this is that the "index" (git's equivalent of
the CVS/*) resides in one place instead of in each directory. So, if you
have multiple working directories sharing a single .git, you get them
tromping on each other's "index".
That said, you can symlink almost everything *inside* .git to other
repositories.
For instance, if you had the "Reference" repository I showed last time,
instead of doing the "git clone", you could do:
#Make a new REL8_2_STABLE working area
mountie@pumpkin:~/pg-work$ REF=$(pwd)/PostgreSQL.git
mountie@pumpkin:~/pg-work$ mkdir REL8_2_STABLE
mountie@pumpkin:~/pg-work$ cd REL8_2_STABLE/
mountie@pumpkin:~/pg-work/REL8_2_STABLE$ git init
# And now make everything point back
mountie@pumpkin:~/pg-work/REL8_2_STABLE$ mkdir .git/refs/remotes && ln -s $REF/refs/heads .git/refs/remotes/origin
mountie@pumpkin:~/pg-work/REL8_2_STABLE$ rm -Rf .git/objects && ln -s $REF/objects .git/objects
mountie@pumpkin:~/pg-work/REL8_2_STABLE$ rmdir .git/refs/tags && ln -s $REF/refs/tags .git/refs/tags
mountie@pumpkin:~/pg-work/REL8_2_STABLE$ rm -Rf .git/info && ln -s $REF/info .git/info
mountie@pumpkin:~/pg-work/REL8_2_STABLE$ rm -Rf .git/hooks && ln -s $REF/hooks .git/hooks
This will leave you with an independent config, independent index,
independent heads, and independent reflogs, with shared "remote"
tracking branches, a shared "object" store, shared "tags", and shared
hooks.
And make sure you don't purge any unused objects out of any of these
subdirs, because they don't know that the object might be in use in
another subdir... This warning is the one reason why it's usually
recommended to just use a reference repository, and not have to worry.
a.
--
Aidan Van Dyk Create like a god,
aidan@highrise.ca command like a king,
http://www.highrise.ca/ work like a slave.
Mark Mielke wrote:
I am curious about why an end user would really care? CVS and SVN both
kept local workspace directories containing metadata. If anything, I
find GIT the least intrusive of these three, as the .git is only in the
top-level directory, whereas CVS and SVN like to pollute every directory.
That's not the problem. The problem is that it is kept in the same
directory as the checked out copy. It would be a lot more usable if it
was possible to store it elsewhere.
Yes, the .svn directories are a PITA.
--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On 6/2/09, Alvaro Herrera <alvherre@commandprompt.com> wrote:
Mark Mielke wrote:
I am curious about why an end user would really care? CVS and SVN both
kept local workspace directories containing metadata. If anything, I
find GIT the least intrusive of these three, as the .git is only in the
top-level directory, whereas CVS and SVN like to pollute every directory.
That's not the problem. The problem is that it is kept in the same
directory as the checked out copy. It would be a lot more usable if it
was possible to store it elsewhere.
export GIT_DIR=...
--
marko
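A minimal sketch of keeping the repository data outside the checked-out copy with `GIT_DIR`. The paths are hypothetical, and pairing it with `GIT_WORK_TREE` and a repository created with `git init --bare` is an assumption about one workable setup, not something stated in the message above.

```sh
#!/bin/sh
# Sketch only: both paths below are made-up locations for the demo.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

demo=$(mktemp -d)
store="$demo/store/pgsql.git"   # where the repository data lives
work="$demo/checkout"           # the checked-out copy, with no .git inside
mkdir -p "$demo/store" "$work"

# Create the repository on its own, away from any working directory.
git init -q --bare "$store"

# Point git at the repository and at the working directory explicitly.
export GIT_DIR="$store"
export GIT_WORK_TREE="$work"

cd "$work"
echo hello > README
git add README
git commit -q -m "initial commit"
```

With both variables exported, ordinary git commands run in the checkout operate on the external repository, and the working directory contains only the project's own files.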