* Looking for a way to set up Git correctly
@ 2010-11-11 3:25 Dennis
2010-11-11 9:38 ` Alex Riesen
2010-11-11 13:25 ` Enrico Weigelt
0 siblings, 2 replies; 5+ messages in thread
From: Dennis @ 2010-11-11 3:25 UTC
To: git
I have a situation.
I have started a web project (call it branch1), and have maintained it
without a version control system for quite some time.
Then, I copied it to another folder (branch2) and while the project remained
essentially the same, I have changed a few of the internal paths and some
variable names inside the files.
Then, a few months later on, I copied branch2 to a folder called branch3 and
also modified some of the variable names and some of the internal structure
of the files.
Thus I ended up with 3 folders on my local HDD with pretty much the same
file names and folder structure and everything, and most of the file
content, except those small deltas that made those files different for each
branch.
I guess it's never too late, and now I want to put these 3 projects into a
version control system, and I chose git.
Now, this can be either really simple or really complicated. My first
question is: how do I set the repository up in the proper way where I could
work on all 3 projects separately, with additional possibility of working on
branch1 only and later committing my changes to branch2 and branch3. (Since
projects are virtually identical, a fix in one branch usually needs to be
propagated to other branches)
First, I assume I will use a single repository for this. Then, do I simply
set up 3 branches and start using them, or is there a way to set git up to
capitalize on the projects being nearly identical?
My second question is that each branch has a huge folder with image data.
By huge I mean 1 to 4Gb, depending on the branch. Since images are not
directly relevant to the development work, is there a way to not include
those folders in git? To be honest though, I probably should include them,
but I wanted to ask about this separately as the git repository may get
large, since all 3 branches may grow to 9Gb or so.
Thus I am looking for a git way to handle my situation. Is this simple or
is it hard?
Are there any recommendations before I jump in?
Dennis
* Re: Looking for a way to set up Git correctly
From: Alex Riesen @ 2010-11-11 9:38 UTC
To: Dennis; +Cc: git
On Thu, Nov 11, 2010 at 04:25, Dennis <denny@dennymagicsite.com> wrote:
> I have started a web project (call it branch1), and have maintained it
> without a version control system for quite some time.
> Then, I copied it to another folder (branch2) and while the project remained
> essentially the same, I have changed a few of the internal paths and some
> variable names inside the files.
> Then, a few months later on, I copied branch2 to a folder called branch3 and
> also modified some of the variable names and some of the internal structure
> of the files.
>
> Thus I ended up with 3 folders on my local HDD with pretty much the same
> file names and folder structure and everything, and most of the file
> content, except those small deltas that made those files different for each
> branch.
>
> I guess it's never too late, and now I want to put these 3 projects into a
> version control system, and I chose git.
>
> Now, this can be either really simple or really complicated. My first
> question is: how do I set the repository up in the proper way where I could
> work on all 3 projects separately, with additional possibility of working on
> branch1 only and later committing my changes to branch2 and branch3. (Since
> projects are virtually identical, a fix in one branch usually needs to be
> propagated to other branches)
> First, I assume I will use a single repository for this. Then, do I simply
> set up 3 branches and start using them, or is there a way to set git up to
> capitalize on the projects being nearly identical?
Assuming I've got the relationships of your "branches" right:
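$ cp -a branch1 branch && cd branch
$ git init
$ echo /huge-images/ >.gitignore
$ git add .gitignore; git add .; git commit; git branch branch1
$ git checkout -b branch2
$ cp -a ../branch2/. .; # "/." copies the contents over the checkout, not into ./branch2
$ git add -A .; git commit
$ git checkout -b branch3
$ cp -a ../branch3/. .
$ git add -A .; git commit
# cp does not delete anything: a file that branch2 or branch3 dropped
# has to be removed with "git rm" before the corresponding commit.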
> My second question is that each branch has a huge folder with image data. By
> huge I mean 1 to 4Gb, depending on the branch. Since images are not
> directly relevant to the development work, is there a way to not include
> those folders in git? To be honest though, I probably should include them,
> but I wanted to ask about this separately as the git repository may get
> large, since all 3 branches may grow to 9Gb or so.
>
> Thus I am looking for a git way to handle my situation. Is this simple or
> is it hard?
If you add the images you will eventually run into problems (heavy
swapping, for one).
Git is not really set up to work with big binary files (a file must fit into
memory completely).
* Re: Looking for a way to set up Git correctly
From: Enrico Weigelt @ 2010-11-11 13:25 UTC
To: git
* Dennis <denny@dennymagicsite.com> wrote:
Hi,
> Now, this can be either really simple or really complicated. My first
> question is: how do I set the repository up in the proper way where I
> could work on all 3 projects separately, with additional possibility of
> working on branch1 only and later committing my changes to branch2 and
> branch3.
As a first step you could create a separate git repo in each of the 3
directories and add everything to it (git init, git add -A, git commit). Then
rename the branches properly (so instead of "master", they'll be called
"branch1", "branch2", "branch3" or something like that). Create another
(maybe bare) repo elsewhere, add it as a remote to the three other ones
and push their branches to it. Now you have 4 repos, 3 for working
on the individual branches and another for collecting them all (hub model).
You could also choose to throw the first three away and only work in
the last one.
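The hub model described above can be sketched roughly as follows. This is
only an illustration in a scratch directory: the file contents, the remote
name "hub" and the demo identity are assumptions, not part of the original
setup.

```shell
set -e
# Scratch area with three stand-in project folders (hypothetical contents).
scratch=$(mktemp -d)
cd "$scratch"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q --bare hub.git          # the collecting "hub" repository

for b in branch1 branch2 branch3
do
    mkdir "$b"
    echo "site variant $b" >"$b/index.html"
    (
        cd "$b"
        git init -q
        git add -A
        git commit -q -m "Import $b as found on disk"
        git branch -m "$b"          # rename the initial branch after the folder
        git remote add hub "$scratch/hub.git"
        git push -q hub "$b"
    )
done

git --git-dir=hub.git branch        # the hub now holds all three branches
```

Each folder stays a working repository of its own, and the hub only collects
what is pushed to it.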
> (Since projects are virtually identical, a fix in one branch
> usually needs to be propagated to other branches)
In your case, cherry-pick might be the right tool for you.
You could also do a little bit of refactoring, making a 4th branch which
the other 3 are then rebased onto. Then you could do your fixes in that
branch and merge it into the other 3, or rebase them onto it.
> My second question is that each branch has a huge folder with image data.
> By huge I mean 1 to 4Gb, depending on the branch. Since images are not
> directly relevant to the development work, is there a way to not include
> those folders in git?
See the .gitignore file.
Nevertheless, it might be useful to also have all the images in the
repo for backup reasons.
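A small sketch of the .gitignore approach, assuming the image folder is
called huge-images (the name used earlier in this thread) and sits at the
top of the project. The other file names are made up.

```shell
set -e
scratch=$(mktemp -d)
cd "$scratch"
git init -q site
cd site
mkdir huge-images
echo "placeholder" >huge-images/photo.jpg
echo "<html></html>" >index.html

# The leading slash anchors the pattern at the repository root,
# the trailing slash restricts it to directories.
echo /huge-images/ >.gitignore

git add -A
git status --short      # the images should not show up here
```

The images stay on disk where the site expects them. Git simply never looks
at them.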
BTW: if you're concerned about disk space, you could add the object dir
of the 4th (hub) repository to the 3 working repos as an alternate
(.git/objects/info/alternates; run git-gc in the hub repo before that!).
Subsequent gc runs will remove the objects that are already present in the
hub. But beware! If you remove something in the hub repo and run git-gc
there, you could lose objects in the other repos! (Maybe it would be wise
to add the 3 working repos as remotes in the hub and always run a
git remote update before git-gc in the hub.)
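For illustration, sharing the hub's object directory might look like the
sketch below. It assumes the usual alternates mechanism (a path written to
.git/objects/info/alternates) and uses made-up repository names. The warning
above applies in full: once a working repo borrows objects, pruning the hub
can break it.

```shell
set -e
scratch=$(mktemp -d)
cd "$scratch"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q --bare hub.git
git init -q work
cd work
echo "hello" >file.txt
git add -A
git commit -q -m "Initial import"
git push -q "$scratch/hub.git" HEAD:refs/heads/branch1

# Pack the hub first, as advised above.
git --git-dir="$scratch/hub.git" gc -q

# Let the working repo borrow objects from the hub's object directory.
echo "$scratch/hub.git/objects" >.git/objects/info/alternates

# -a repacks everything, -d drops what became redundant,
# -l leaves out objects that are available through the alternate.
git repack -a -d -l -q
git count-objects -v    # local object counts should have shrunk
git fsck                # the history must still be complete
```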
cu
--
----------------------------------------------------------------------
Enrico Weigelt, metux IT service -- http://www.metux.de/
phone: +49 36207 519931 email: weigelt@metux.de
mobile: +49 151 27565287 icq: 210169427 skype: nekrad666
----------------------------------------------------------------------
Embedded-Linux / Portierung / Opensource-QM / Verteilte Systeme
----------------------------------------------------------------------
* Re: Looking for a way to set up Git correctly
From: Jonathan Nieder @ 2010-11-11 16:46 UTC
To: Dennis; +Cc: git, Alex Riesen, Enrico Weigelt
(+cc: Dennis again, Alex)
Hi,
Enrico Weigelt wrote:
> * Dennis <denny@dennymagicsite.com> wrote:
>> Now, this can be either really simple or really complicated. My first
>> question is: how do I set the repository up in the proper way where I
>> could work on all 3 projects separately, with additional possibility of
>> working on branch1 only and later committing my changes to branch2 and
>> branch3.
>
> As first step you could create 3 separate git repos in each directory
> and add everything to it (git init, git add -A, git commit). Then
> rename the branches properly (so instead of "master", they'll be called
> "branch1", "branch2", "branch3" or something like that). Create another
> (maybe bare) repo elsewhere, add it as remote to the three other ones
> and push their branches upwards.
So this would look something like:
for i in project1 project2 project3
do
    (
        cd "$i" &&
        git init &&
        git add . &&
        git commit
    )
done
git init main
cd main
for i in project1 project2 project3
do
    git fetch "../$i" "master:$i"
done
cd ..
mkdir -p away
mv project1 project2 project3 away/
If you would like multiple worktrees (one for each branch, maybe) for
the main repo, you might want to look into the new-workdir script in
contrib/workdir (but do consider the caveats[1]).
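As an aside, Git releases from 2.5 onward include a built-in "git worktree"
command that covers much the same ground as the contrib script. It is a
different tool from the one named above, and the sketch assumes such a newer
Git is installed. The directory name project2-tree is made up.

```shell
set -e
scratch=$(mktemp -d)
cd "$scratch"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q main
cd main
echo base >index.html
git add -A
git commit -q -m "Initial import"
git branch -m project1
git branch project2

# Check out project2 in a sibling directory that shares main's object store.
git worktree add ../project2-tree project2
git worktree list
```

By default "git worktree add" refuses to check out a branch that is already
checked out elsewhere, which avoids the first caveat in [1].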
>> (Since projects are virtually identical, a fix in one branch
>> usually needs to be propagated to other branches)
>
> In your case, cherry-pick might be the right tool for you.
e.g., when project3 gets a new fix:
git checkout project1
git cherry-pick project3
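To see what those two commands do, here is a self-contained sketch in a
scratch repository. The file names and commit messages are invented for the
example.

```shell
set -e
scratch=$(mktemp -d)
cd "$scratch"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q main
cd main
printf 'line one\nline two\n' >page.txt
git add -A
git commit -q -m "Initial import"
git branch -m project1

git checkout -q -b project3
echo "project3 only" >local.txt
git add -A
git commit -q -m "project3: local customisation"
printf 'line one\nline two, fixed\n' >page.txt
git commit -q -a -m "Fix line two"

git checkout -q project1
git cherry-pick project3    # copies only the tip commit of project3
git log --oneline
```

Only the fix arrives on project1. The earlier project3-specific commit is
left where it was.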
> You could also do a little bit of refactoring, making a 4th branch which
> the other 3 are then rebased onto.
Right, what is the actual relationship between these projects? Do
they actually represent branches in the history of a single project?
Suppose project1 is historically an ancestor to project2, project3,
and project4, which are independent. (Maybe project1 is the initial
version and projects 2,3,4 are ports to other platforms.) You could
take this into account when initially setting up the branches, like
this:
git init main
cd main
GIT_DIR=$(pwd)/.git; export GIT_DIR
GIT_WORK_TREE=../project1 git add .
GIT_WORK_TREE=../project1 git commit
git branch -m project1
for i in project2 project3 project4
do
    git checkout -b "$i" project1
    GIT_WORK_TREE="../$i" git add -A
    GIT_WORK_TREE="../$i" git commit
done
(and use gitk --all when done to make sure everything looks right)
Alternatively, you can rearrange the history afterwards:
$ git cat-file commit project2 | tee project2
tree 76db51024713f6ef191928a8445d48d39ab55434
author Junio C Hamano <gitster@pobox.com> 1289324716 -0800
committer Junio C Hamano <gitster@pobox.com> 1289324716 -0800
project2: an excellent project
$ git rev-parse project1
$ vi project2
... add a "parent <object id>" line after the tree line, where
<object id> is the full object name that rev-parse printed ...
$ git hash-object -t commit -w project2
$ git branch -f project2 <the object name hash-object prints>
... repeat for project3 and project4 ...
$ gitk --all; # to make sure everything looks right
This is less convenient than it ought to be. It would be nice to add
a "git graft" command to automate this procedure, which
- interacts well with "git replace"
- doesn't interact poorly with "git fetch" like .git/info/grafts does
- could be more convenient to use than .git/info/grafts.
As the gitworkflows man page mentions, if you make your fixes on the
oldest branch they apply to (project1) and then merge to all later
branches, then the fixes will propagate forward correctly. See the
"Graduation" and "Merging upwards" sections of gitworkflows for details.
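A rough sketch of merging upwards, under the assumption that project2 and
project3 both descend from project1. The file contents and commit messages
are stand-ins.

```shell
set -e
scratch=$(mktemp -d)
cd "$scratch"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q main
cd main
printf 'header\nbody\n' >page.txt
git add -A
git commit -q -m "project1: initial import"
git branch -m project1

# Later variants, each with one commit of its own.
for b in project2 project3
do
    git checkout -q -b "$b" project1
    echo "settings for $b" >"$b.conf"
    git add -A
    git commit -q -m "$b: local settings"
done

# The fix goes onto the oldest branch it applies to ...
git checkout -q project1
printf 'header\nbody, fixed\n' >page.txt
git commit -q -a -m "Fix body"

# ... and is then merged upwards into each later branch.
for b in project2 project3
do
    git checkout -q "$b"
    git merge -q -m "Merge fixes from project1" project1
done
git log --graph --oneline --all
```

Because the fix is merged rather than copied, Git remembers that the later
branches already contain it.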
>> My second question is that each branch has a huge folder with image data.
>> By huge I mean 1 to 4Gb, depending on the branch. Since images are not
>> directly relevant to the development work, is there a way to not include
>> those folders in git?
I would suggest tracking a symlink to another repository (or to a
directory tracked through other means, like unison).
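As a sketch of the symlink idea: the names image-store and images are
assumptions, and the image directory is just a plain folder here.

```shell
set -e
scratch=$(mktemp -d)
cd "$scratch"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# The bulky data lives outside the repository.
mkdir image-store
echo "placeholder" >image-store/photo.jpg

git init -q site
cd site
ln -s ../image-store images     # relative link, so the layout can be moved as a whole
git add images
git commit -q -m "Track the image folder as a symlink"
git ls-files -s images          # mode 120000 marks a symbolic link
```

Git stores only the link target, so the repository stays small while the
site still finds its images under the usual path.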
Hope that helps,
Jonathan
[1] If you have two worktrees for the same project with the
same branch checked out at a given moment, the results can be
confusing (changes made in one worktree will look like they have
been committed and undone in the other).
The "detached HEAD" feature (which git-checkout.1 explains) and
multiple worktrees do not interact so well: the need to preserve
commits while no branch was checked out in one worktree will not be
taken into account when "git gc" runs (explicitly or implicitly!) on
the other. This can be very disconcerting.
* Re: Looking for a way to set up Git correctly
From: Jonathan Nieder @ 2010-11-11 19:38 UTC
To: denny; +Cc: git, Alex Riesen, Enrico Weigelt
denny@dennymagicsite.com wrote:
> I am still looking through your replies and getting familiar with
> git commands.
By the way, please ignore that GIT_WORK_TREE stuff I did. It
probably works, but it's ugly. :) That example could have been
written better as
git init everything
GIT_DIR=$(pwd)/everything/.git; export GIT_DIR
(
cd common-ancestor
git add -A
git commit
git branch -m ancestor
)
(
cd project1
git checkout -b project1 ancestor
git add -A
git commit
)
... etc ..
unset GIT_DIR
cd everything
git checkout -f project1; # -f populates the still-empty work tree
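For the second and later projects, "git checkout -b" may refuse to switch
(or may touch files) because the index still describes the previous import.
One possible variation, sketched here with made-up contents, switches
branches with "git symbolic-ref" instead, which leaves files and index
alone. This is an alternative to the steps above, not what was originally
posted.

```shell
set -e
scratch=$(mktemp -d)
cd "$scratch"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in snapshots of the existing folders.
mkdir common-ancestor project1 project2
echo "v0" >common-ancestor/app.txt
echo "v1" >project1/app.txt
echo "v2" >project2/app.txt
echo "extra" >project2/extra.txt

git init -q everything
GIT_DIR=$scratch/everything/.git; export GIT_DIR
(
    cd common-ancestor
    git add -A
    git commit -q -m "Common ancestor"
    git branch -m ancestor
)
for p in project1 project2
do
    (
        cd "$p"
        git branch "$p" ancestor
        git symbolic-ref HEAD "refs/heads/$p"   # switch branch, touch nothing else
        git add -A                              # make the index match this folder
        git commit -q -m "Import $p"
    )
done
unset GIT_DIR

cd everything
git checkout -q -f project1     # fill the empty work tree
git log --graph --oneline --all
```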
[...]
> From a developer's point of view, working on projectX means making
> some changes and committing them to the repo for that project. The
> developer may not be aware of other projects existing.
For concreteness, I am imagining these directories represent various
versions of the Almquist shell. The common ancestor is the BSD4.3/Net-2
version and various projects may have built from there in different
directions: NetBSD sh, FreeBSD sh, dash. (Yes, I am oversimplifying. :))
Now suppose they have diverged so wildly that it is never possible to
synchronize code with each other. Instead, they can copy fixes, and
this is especially convenient when the fixes are phrased as diffs to
the common ancestor.
To facilitate this, Alice revives the BSD4.3/Net-2 sh project with a
"fixes only" policy. Her daily work might look like this:
$ git fetch netbsd
$ git log netbsd/for-alice@{1}..netbsd/for-alice; # any good patches today?
$ git cherry-pick -s 67fd89980; # a good patch.
... quick test ...
$ git cherry-pick -s 897ac8; # another good patch.
... quick test ...
...
$ git fetch freebsd
... and similarly for the rest of the patch submitters ...
$ git am emailed-patch
Then to more thoroughly test the result:
$ git checkout -b throwaway; # new throw-away branch.[1]
$ git merge netbsd/master; # will the changes work for netbsd?
... thorough test ...
$ git reset --keep master
$ git merge freebsd/master; # how about freebsd?
... etc ...
And finally she pushes the changes out.
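A condensed sketch of the fetch and cherry-pick part of Alice's routine. The
repository layout, file name and commit messages are invented. Only the
remote name netbsd and the branch name for-alice follow the example above.

```shell
set -e
scratch=$(mktemp -d)
cd "$scratch"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Alice's "fixes only" repository.
git init -q alice
(
    cd alice
    printf 'parse one\nparse two\n' >parser.txt
    git add -A
    git commit -q -m "Ancestor import"
)

# A descendant project that offers a fix on its for-alice branch.
git clone -q alice netbsd
(
    cd netbsd
    git checkout -q -b for-alice
    printf 'parse one\nparse two, fixed\n' >parser.txt
    git commit -q -a -m "Fix parsing of two"
)

cd alice
git remote add netbsd "$scratch/netbsd"
git fetch -q netbsd
git log --oneline HEAD..netbsd/for-alice    # any good patches today?
git cherry-pick -s netbsd/for-alice         # -s appends a Signed-off-by line
```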
> Without knowing anything about git for a moment, one ideal workflow
> is where a developer makes changes to projectX that touch the base
> and projectX specific features. Then the developer commits them and
> pushes them to the main repo. The main repo contains all projects.
> During the commit, changes to the base automagically get pushed to
> all projects that share that base
If it is a matter of what files are touched, then maybe the base is
actually something like a library, which should be managed as a
separate project. See the "git submodule" manual if you would like to
try something like this but still keep the projects coupled.
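A minimal submodule sketch, assuming the shared base has already been split
into a repository of its own. The names base and projectX are placeholders,
and a real setup would more likely point at a network URL than a local path.

```shell
set -e
scratch=$(mktemp -d)
cd "$scratch"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q base
(
    cd base
    echo "shared code" >lib.txt
    git add -A
    git commit -q -m "Shared base"
)

git init -q projectX
cd projectX
# Recent Git versions refuse local-path submodule clones unless the file
# transport is allowed explicitly, hence the -c option in this demo.
git -c protocol.file.allow=always submodule add "$scratch/base" base
git commit -q -m "Add the shared base as a submodule"
git submodule status
```

projectX records which commit of base it uses, so each project can move to a
newer base when it is ready.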
On the other hand, remaining in the situation from before:
Suppose Sam is the NetBSD sh maintainer. The first step in working on
a new release might be
$ git fetch ancestor
$ git log -p HEAD..FETCH_HEAD; # fixes look okay?
$ git pull ancestor
since Alice tends to include only safe, well tested fixes.
Many changes Sam makes are specific to his project, but today he comes
up with a fix that might be useful for other ash descendants.
So instead of committing directly, he can try:
$ git checkout for-alice; # carry the fix to the for-alice branch
... test ...
$ git commit -a; # commit it.
If it is not an urgent fix, at this point he might do
$ git checkout master; # back to the main NetBSD branch, without the fix
and give the other projects some time to work on the patch and come up
with a better fix. Or he might cherry-pick the commit from for-alice,
and even publish it and encourage others to cherry-pick directly from
him to get the fix out ASAP.
Notice that not all changes to the base files are necessarily useful
for other descendants of the ancestral program. So in this example,
propagation of changes between projects is fairly explicit.
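Sam's side of this can be sketched as below. Everything except the branch
names master and for-alice is invented, and the sketch relies on both
branches pointing at the same commit when the uncommitted fix is carried
over.

```shell
set -e
scratch=$(mktemp -d)
cd "$scratch"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q netbsd-sh
cd netbsd-sh
printf 'one\ntwo\n' >sh.txt
git add -A
git commit -q -m "Import"
git branch -M master
git branch for-alice

# A fix written while on master, not yet committed.
printf 'one\ntwo, fixed\n' >sh.txt

git checkout -q for-alice       # the uncommitted change travels along
git commit -q -a -m "Fix two"
git checkout -q master          # master does not have the fix yet
git cherry-pick for-alice       # urgent after all: copy it over
```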
[1] "git checkout HEAD^0" would be more convenient.
See DETACHED HEAD in the git checkout manual if interested.