git@vger.kernel.org mailing list mirror (one of many)
* Best practices for updating old repos
@ 2017-06-16  0:52 Michael Eager
  2017-06-16  4:22 ` Stefan Beller
       [not found] ` <CAGUfqurYdFJDT4+XzyPvo3sxeT=zjXqZGCPpDbUFwjqG1B3pBw@mail.gmail.com>
  0 siblings, 2 replies; 5+ messages in thread
From: Michael Eager @ 2017-06-16  0:52 UTC (permalink / raw)
  To: git

Hi All --

I'm working with code that is based on a five-year-old repository.
There are 130 local commits since the repo was forked.  Naturally,
the upstream project has moved on significantly.

I'm wondering about best approaches to updating the repo to the
current upstream version.  Here are the approaches I've considered:

- Rebase from upstream.  Likely almost every patch will fail with
   multiple merge conflicts.

- Merge local branch into upstream.  Likely many merge failures, but
   fewer than with rebase.

- Apply individual patches from the old repo to the upstream repo.
   Fix merge conflicts, rebuild, fix build failures.  There may be
   some duplication and additional merge problems created, where a
   later patch from the old repo fixes the same conflict or build
   failure.

I've tried each of these approaches on various projects.  Each has
problems.  After resolving the merge conflicts there are build failures
which need to be fixed, which means creating additional patches.  The
result is that the patch history is a bit chaotic, with later patches
fixing problems in earlier ones.  I've tried to reorder the fix patches
so that each follows the patch it corrects and the two can be combined,
but that can be difficult.
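
Git has some support for keeping a fix next to the patch it corrects:
`git commit --fixup=<commit>` titles the fix "fixup! <subject>", and
`git rebase -i --autosquash` moves such commits behind their targets.
The sketch below only illustrates that reordering on a list of subject
lines.  `group_fixups` is a hypothetical helper, and it matches exact
subjects, which is simpler than what git itself does.

```python
def group_fixups(subjects):
    """Reorder commit subjects so each 'fixup! X' directly follows 'X'.

    `subjects` is a list of commit titles, oldest first.  Matching is
    by exact subject text (an assumption made for this sketch).
    """
    prefix = "fixup! "
    fixups = {}   # target subject -> fix subjects, in original order
    regular = []  # non-fix commits, in original order

    for subject in subjects:
        if subject.startswith(prefix):
            fixups.setdefault(subject[len(prefix):], []).append(subject)
        else:
            regular.append(subject)

    result = []
    for subject in regular:
        result.append(subject)
        # Attach any fixes recorded for this commit right after it.
        result.extend(fixups.pop(subject, []))

    # Fixes whose target was not found are kept at the end.
    for leftover in fixups.values():
        result.extend(leftover)
    return result


if __name__ == "__main__":
    history = ["add parser", "add lexer", "fixup! add parser"]
    for line in group_fixups(history):
        print(line)
```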

I've used Stacked Git a little, but don't know if it will make
any of this easier.

On some projects, I've reimplemented changes in the upstream repo,
abandoning the patch history from the old repo:

- Create a diff of the old repo against upstream.  Apply only the
   changes that add new functionality, i.e. the ones that came from
   the patches to the old repo.  Fix problems caused by API changes,
   renamed files, etc.

- Re-implement the changes on the upstream repo.  Some of the old
   code would be re-used, but modified to fit in the current upstream.
   Some new code would be written.

One other variant of the rebase approach I've thought of is to do
this incrementally, rebasing the old repo against an upstream commit
a short time after the old repo was forked, fixing any conflicts,
rebuilding and fixing build failures.  Then repeat, with a bit
newer commit.  Then repeat, until I get to the top.  This sounds
tedious, but some of it can be automated.  It also might result in
my making the changes compatible with upstream code which was later
abandoned or significantly changed.
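
The incremental variant could be scripted roughly as follows.  This is
a sketch under assumptions, not a tested workflow: `pick_checkpoints`,
`rebase_commands` and `run_plan` are hypothetical names, the list of
upstream commits is assumed to come from something like
`git rev-list --reverse <fork-point>..<upstream>`, and `build_cmd`
stands for whatever builds the project.

```python
import subprocess


def pick_checkpoints(upstream_commits, step):
    """Pick every `step`-th upstream commit (oldest first).

    The upstream tip is always included as the last checkpoint.
    """
    if step < 1:
        raise ValueError("step must be at least 1")
    picks = upstream_commits[step - 1::step]
    if upstream_commits and (not picks or picks[-1] != upstream_commits[-1]):
        picks.append(upstream_commits[-1])
    return picks


def rebase_commands(checkpoints, branch):
    """Build one `git rebase <checkpoint> <branch>` command per checkpoint."""
    return [["git", "rebase", checkpoint, branch] for checkpoint in checkpoints]


def run_plan(commands, build_cmd):
    """Run each rebase, then the build; stop at the first failure.

    `git rebase` exits non-zero on a conflict, so check=True raises
    CalledProcessError and leaves the repository for manual fixing.
    """
    for command in commands:
        subprocess.run(command, check=True)
        subprocess.run(build_cmd, check=True)


if __name__ == "__main__":
    commits = ["c1", "c2", "c3", "c4", "c5"]  # placeholder commit ids
    for command in rebase_commands(pick_checkpoints(commits, 2), "local"):
        print(" ".join(command))
```

A smaller `step` means more, but presumably smaller, rounds of conflict
resolution; it does not avoid the risk of adapting to upstream code
that was later abandoned.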

Anyone have a different approach that I should consider?  Or maybe
offer advice on how to make one of these approaches work better?
What is best practice to update an old repo?

-- 
Michael Eager	 eager@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077


* Re: Best practices for updating old repos
  2017-06-16  0:52 Best practices for updating old repos Michael Eager
@ 2017-06-16  4:22 ` Stefan Beller
  2017-06-16  6:32   ` Michael Eager
       [not found] ` <CAGUfqurYdFJDT4+XzyPvo3sxeT=zjXqZGCPpDbUFwjqG1B3pBw@mail.gmail.com>
  1 sibling, 1 reply; 5+ messages in thread
From: Stefan Beller @ 2017-06-16  4:22 UTC (permalink / raw)
  To: Michael Eager; +Cc: git@vger.kernel.org

On Thu, Jun 15, 2017 at 5:52 PM, Michael Eager <eager@eagerm.com> wrote:

> One other variant of the rebase approach I've thought of is to do
> this incrementally, rebasing the old repo against an upstream commit
> a short time after the old repo was forked, fixing any conflicts,
> rebuilding and fixing build failures.  Then repeat, with a bit
> newer commit.  Then repeat, until I get to the top.  This sounds
> tedious, but some of it can be automated.  It also might result in
> my making the changes compatible with upstream code which was later
> abandoned or significantly changed.

This sounds like

https://github.com/mhagger/git-imerge
https://www.youtube.com/watch?v=FMZ2_-Ny_zc


* Re: Best practices for updating old repos
       [not found] ` <CAGUfqurYdFJDT4+XzyPvo3sxeT=zjXqZGCPpDbUFwjqG1B3pBw@mail.gmail.com>
@ 2017-06-16  6:24   ` Michael Eager
  2017-06-16 10:16   ` Fwd: " Michael O'Cleirigh
  1 sibling, 0 replies; 5+ messages in thread
From: Michael Eager @ 2017-06-16  6:24 UTC (permalink / raw)
  To: Michael O'Cleirigh; +Cc: git

Thanks for your comments.

On 06/15/2017 07:57 PM, Michael O'Cleirigh wrote:
> Hi Michael,
>
> In git if you don't merge often then you get these merge conflict hell situations.
>
> In my experience the main conflicts come not from the unified diff of those 130 commits but from
> differences in the surrounding code.
>
> Merging/rebasing/cherry-picking directly onto the latest upstream sounds impossible to me.
>
> These conflicts come from the distance between the local fork branch and the upstream branch.
>
> You need to merge through closer commits first to have a hope of getting something automatic to work.
>
> Something like getting the list of releases made in the upstream in the last 5 years and merging
> them in order into the fork branch.
>
> i.e. merge v1, merge v2, ... merge v300
>
> I went through something similar with a subversion repo we converted to git.
>
> In subversion they were cherry picking done work into a release branch.
>
> In git a feature branch mode was being used.
>
> It turned out some commits were never cherry picked and bringing them to the latest release was hard.
>
> We tried many of the approaches you outlined, took what git would give us automatically and in the
> most hairy cases recreated the changes on the latest upstream by reading the diff of the original
> commit and rewriting it on the latest code.
>
> In terms of how the history looks after the merge conflicts are resolved you could internalize the
> fixups into a single commit applied onto the original fork branch.  So that history would show the
> 130 commit branch directly merged into the upstream.
>
> You would use the git-commit-tree command to create a commit that reuses the merged tree id and has
> the 130th commit id and the upstream commit id as its parents, which makes it a merge commit.
>
> Regards,
>
> Michael




* Re: Best practices for updating old repos
  2017-06-16  4:22 ` Stefan Beller
@ 2017-06-16  6:32   ` Michael Eager
  0 siblings, 0 replies; 5+ messages in thread
From: Michael Eager @ 2017-06-16  6:32 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git@vger.kernel.org

On 06/15/2017 09:22 PM, Stefan Beller wrote:
> On Thu, Jun 15, 2017 at 5:52 PM, Michael Eager <eager@eagerm.com> wrote:
>
>> One other variant of the rebase approach I've thought of is to do
>> this incrementally, rebasing the old repo against an upstream commit
>> a short time after the old repo was forked, fixing any conflicts,
>> rebuilding and fixing build failures.  Then repeat, with a bit
>> newer commit.  Then repeat, until I get to the top.  This sounds
>> tedious, but some of it can be automated.  It also might result in
>> my making the changes compatible with upstream code which was later
>> abandoned or significantly changed.
>
> This sounds like
>
> https://github.com/mhagger/git-imerge
> https://www.youtube.com/watch?v=FMZ2_-Ny_zc

Thanks, Stefan.   I'll look into this; it may be similar to what
I was thinking of doing.  I'll have to watch the video.



* Fwd: Best practices for updating old repos
       [not found] ` <CAGUfqurYdFJDT4+XzyPvo3sxeT=zjXqZGCPpDbUFwjqG1B3pBw@mail.gmail.com>
  2017-06-16  6:24   ` Michael Eager
@ 2017-06-16 10:16   ` Michael O'Cleirigh
  1 sibling, 0 replies; 5+ messages in thread
From: Michael O'Cleirigh @ 2017-06-16 10:16 UTC (permalink / raw)
  To: git

(Sorry I sent this originally last night in gmail but not in plain
text mode and it bounced)


Hi Michael,

In git if you don't merge often then you get these merge conflict hell
situations.

In my experience the main conflicts come not from the unified diff of
those 130 commits but from differences in the surrounding code.

Merging/rebasing/cherry-picking directly onto the latest upstream sounds
impossible to me.

These conflicts come from the distance between the local fork branch
and the upstream branch.

You need to merge through closer commits first to have a hope of
getting something automatic to work.

Something like getting the list of releases made in the upstream in
the last 5 years and merging them in order into the fork branch.

i.e. merge v1, merge v2, ... merge v300
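
A rough sketch of that release-by-release plan, assuming the upstream
release tags are plain numeric versions such as v1, v2 or v1.10
(pre-release tags like -rc1 would need extra handling).  `version_key`
and `merge_plan` are hypothetical names; only `git merge --no-edit` is
a real command here.

```python
import re


def version_key(tag):
    """Turn 'v1.10.2' into [1, 10, 2] so that v10 sorts after v2."""
    return [int(number) for number in re.findall(r"\d+", tag)]


def merge_plan(tags):
    """Return one `git merge` command per release tag, oldest first."""
    ordered = sorted(tags, key=version_key)
    return [["git", "merge", "--no-edit", tag] for tag in ordered]


if __name__ == "__main__":
    # Run these one at a time on the fork branch, resolving conflicts
    # and rebuilding after each merge before moving to the next tag.
    for command in merge_plan(["v10", "v2", "v1"]):
        print(" ".join(command))
```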

I went through something similar with a subversion repo we converted to git.

In subversion they were cherry picking done work into a release branch.

In git a feature branch mode was being used.

It turned out some commits were never cherry picked and bringing them
to the latest release was hard.

We tried many of the approaches you outlined, took what git would give
us automatically and in the most hairy cases recreated the changes on
the latest upstream by reading the diff of the original commit and
rewriting it on the latest code.

In terms of how the history looks after the merge conflicts are
resolved you could internalize the fixups into a single commit applied
onto the original fork branch.  So that history would show the 130
commit branch directly merged into the upstream.

You would use the git-commit-tree command to create a commit that
reuses the merged tree id and has the 130th commit id and the upstream
commit id as its parents, which makes it a merge commit.
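
As a sketch of what that could look like, assuming the fully resolved
result already exists as some commit (called `resolved` below).  The
function names and the parent order (upstream first, fork second) are
assumptions made for illustration; `git commit-tree <tree> -p <parent>
-m <message>` is the real plumbing command.

```python
import subprocess


def commit_tree_command(resolved_commit, fork_tip, upstream_tip, message):
    """Build a `git commit-tree` call that reuses the resolved tree.

    The new commit gets two parents, so history shows the fork branch
    merged directly into upstream.
    """
    tree = f"{resolved_commit}^{{tree}}"  # tree of the resolved commit
    return [
        "git", "commit-tree", tree,
        "-p", upstream_tip,  # first parent: upstream (assumed order)
        "-p", fork_tip,      # second parent: tip of the 130 commits
        "-m", message,
    ]


def create_merge_commit(resolved_commit, fork_tip, upstream_tip, message):
    """Run the command and return the id of the new merge commit."""
    command = commit_tree_command(resolved_commit, fork_tip, upstream_tip, message)
    done = subprocess.run(command, check=True, capture_output=True, text=True)
    return done.stdout.strip()


if __name__ == "__main__":
    print(" ".join(commit_tree_command(
        "resolved", "fork", "upstream/master", "Merge fork into upstream")))
```

The command only creates the commit object and prints its id; a branch
still has to be pointed at it afterwards.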

Regards,

Michael


