git@vger.kernel.org mailing list mirror (one of many)
* Git Merge 2020 slides and reproducibility
@ 2020-03-06 15:00 Elijah Newren
  2020-03-06 16:40 ` Derrick Stolee
  2020-03-07 13:38 ` Konstantin Tokarev
  0 siblings, 2 replies; 8+ messages in thread
From: Elijah Newren @ 2020-03-06 15:00 UTC (permalink / raw)
  To: Git Mailing List

Hi,

Had a few different folks ask me at Git Merge about slides for my
talk.  I said I'd post them on github somewhere, but in case you were
one of the folks and have a hard time finding it...they are up at
https://github.com/newren/presentations/blob/pdfs/merge-performance/merge-performance-slides.pdf
and steps to reproduce the speedups I got can be found at
https://github.com/newren/git/blob/git-merge-2020-demo/README.md
(though be forewarned that the code has lots of fixmes & ifdefs &
other problems, has awful commit messages, etc.; I will be cleaning it
up soon).

I know the "suggested" way to make this stuff available was on
Twitter, but I don't really have much of any social media presence
(I can't even access the blog I once had) and don't want to make a
twitter account just for this.  (If someone else wants to repost my
slides, feel free.)

Elijah

* Re: Git Merge 2020 slides and reproducibility
  2020-03-06 15:00 Git Merge 2020 slides and reproducibility Elijah Newren
@ 2020-03-06 16:40 ` Derrick Stolee
  2020-03-06 18:25   ` Matheus Tavares Bernardino
  2020-03-07 13:38 ` Konstantin Tokarev
  1 sibling, 1 reply; 8+ messages in thread
From: Derrick Stolee @ 2020-03-06 16:40 UTC (permalink / raw)
  To: Elijah Newren, Git Mailing List

On 3/6/2020 10:00 AM, Elijah Newren wrote:
> Had a few different folks ask me at Git Merge about slides for my
> talk.  I said I'd post them on github somewhere, but in case you were
> one of the folks and have a hard time finding it...they are up at
> https://github.com/newren/presentations/blob/pdfs/merge-performance/merge-performance-slides.pdf

Thanks! I guess I can post mine, too:

https://stolee.dev/docs/git-merge-2020.pdf

> and steps to reproduce the speedups I got can be found at
> https://github.com/newren/git/blob/git-merge-2020-demo/README.md
> (though be forewarned that the code has lots of fixmes & ifdefs &
> other problems, has awful commit messages, etc.; I will be cleaning it
> up soon).
> 
> I know the "suggested" way to make this stuff available was on
> Twitter, but I don't really have much of any social media presence
> (I can't even access the blog I once had) and don't want to make a
> twitter account just for this.  (If someone else wants to repost my
> slides, feel free.)

Done: https://twitter.com/stolee/status/1235968445637771265?s=20

Thanks!
-Stolee

* Re: Git Merge 2020 slides and reproducibility
  2020-03-06 16:40 ` Derrick Stolee
@ 2020-03-06 18:25   ` Matheus Tavares Bernardino
  0 siblings, 0 replies; 8+ messages in thread
From: Matheus Tavares Bernardino @ 2020-03-06 18:25 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Elijah Newren, Git Mailing List

On Fri, Mar 6, 2020 at 8:40 AM Derrick Stolee <stolee@gmail.com> wrote:
>
> On 3/6/2020 10:00 AM, Elijah Newren wrote:
> > Had a few different folks ask me at Git Merge about slides for my
> > talk.  I said I'd post them on github somewhere, but in case you were
> > one of the folks and have a hard time finding it...they are up at
> > https://github.com/newren/presentations/blob/pdfs/merge-performance/merge-performance-slides.pdf
>
> Thanks! I guess I can post mine, too:
>
> https://stolee.dev/docs/git-merge-2020.pdf

Thank you both for making your slides available. And for the great
presentations, as well!

---
Matheus

* Re: Git Merge 2020 slides and reproducibility
  2020-03-06 15:00 Git Merge 2020 slides and reproducibility Elijah Newren
  2020-03-06 16:40 ` Derrick Stolee
@ 2020-03-07 13:38 ` Konstantin Tokarev
  2020-03-07 16:03   ` Elijah Newren
  1 sibling, 1 reply; 8+ messages in thread
From: Konstantin Tokarev @ 2020-03-07 13:38 UTC (permalink / raw)
  To: Elijah Newren, Git Mailing List



06.03.2020, 18:00, "Elijah Newren" <newren@gmail.com>:
> Hi,
>
> Had a few different folks ask me at Git Merge about slides for my
> talk. I said I'd post them on github somewhere, but in case you were
> one of the folks and have a hard time finding it...they are up at
> https://github.com/newren/presentations/blob/pdfs/merge-performance/merge-performance-slides.pdf
> and steps to reproduce the speedups I got can be found at
> https://github.com/newren/git/blob/git-merge-2020-demo/README.md
> (though be forewarned that the code has lots of fixmes & ifdefs &
> other problems, has awful commit messages, etc.; I will be cleaning it
> up soon).

Hello, I've just tried your branch on my repository and it seems like it can
be a salvation from all rename-related pain that I'm regularly facing when
doing merges and cherry-picks! Thank you very much, I hope it will be
integrated into mainline soon.

However, when testing my previous merges which had to be done with a helper
script, I've encountered a case of

CONFLICT (directory rename split)

Is there any way to prevent a conflict in this case if the files are the same,
and to merge their contents if there are differences? I think it would be
reasonable to assume that the move done in the newest commit should win, and
to allow the user to change the strategy via a command-line option, provide an
explicit hint about where files should be moved, or maybe even decide
interactively.

-- 
Regards,
Konstantin


* Re: Git Merge 2020 slides and reproducibility
  2020-03-07 13:38 ` Konstantin Tokarev
@ 2020-03-07 16:03   ` Elijah Newren
  2020-03-07 19:38     ` Konstantin Tokarev
  0 siblings, 1 reply; 8+ messages in thread
From: Elijah Newren @ 2020-03-07 16:03 UTC (permalink / raw)
  To: Konstantin Tokarev; +Cc: Git Mailing List

On Sat, Mar 7, 2020 at 5:38 AM Konstantin Tokarev <annulen@yandex.ru> wrote:
>
> 06.03.2020, 18:00, "Elijah Newren" <newren@gmail.com>:
> > Hi,
> >
> > Had a few different folks ask me at Git Merge about slides for my
> > talk. I said I'd post them on github somewhere, but in case you were
> > one of the folks and have a hard time finding it...they are up at
> > https://github.com/newren/presentations/blob/pdfs/merge-performance/merge-performance-slides.pdf
> > and steps to reproduce the speedups I got can be found at
> > https://github.com/newren/git/blob/git-merge-2020-demo/README.md
> > (though be forewarned that the code has lots of fixmes & ifdefs &
> > other problems, has awful commit messages, etc.; I will be cleaning it
> > up soon).
>
> Hello, I've just tried your branch on my repository and it seems like it can
> be a salvation from all rename-related pain that I'm regularly facing when
> doing merges and cherry-picks! Thank you very much, I hope it will be
> integrated into mainline soon.

Wow, thanks for trying it out.  Please note that while it _might_ be
okay to use for real work, I am not that confident that it is.  There
are a number of factors making the 'demo' label I gave it a rather
fitting one:

  * I only started using it personally on a real world repository (or
two) about a week and a half ago. (Before then, I knew merge-ort
didn't work.)
  * The second real world repo I used it on uncovered a bug in my code
that the testsuite didn't catch[1]
  * Although I've tested with two real world repos now, that testing
was very minimal; I was focused on getting the demo ready and
implementing as many optimizations as I could.
  * While the outer merge, rebase, and cherry-pick commands will
accept a bunch of merge-machinery options and pass them along,
merge-ort flat ignores them all.
  * merge-ort is hardcoded for merge.directoryRenames=true, when the
default should be merge.directoryRenames=conflict
  * it has a bunch of FIXMEs, some of which are code cleanliness
issues but some of which represent minor bugs

[1] https://lore.kernel.org/git/911de63afa274b0791e4d4252934a5e9b0031f10.1582762465.git.gitgitgadget@gmail.com/
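
A note on the merge.directoryRenames item in the list above: in stock Git
the setting accepts false, true, or conflict, and `git -c name=value`
overrides it for a single invocation. Below is a minimal Python sketch that
only builds such a command line. The helper name `merge_argv` and the branch
name `topic` are made up for illustration, and per the list above the demo's
merge-ort would presumably act as if the value were true whatever is passed.

```python
# Hypothetical helper (not part of Git): build the argv for a one-off merge
# with an explicit merge.directoryRenames value. It does not run anything.
ALLOWED = ("false", "true", "conflict")


def merge_argv(branch, directory_renames="conflict"):
    """Return the argument list for `git merge` with the setting overridden."""
    if directory_renames not in ALLOWED:
        raise ValueError(f"unsupported value: {directory_renames!r}")
    # `git -c name=value` applies the setting to this invocation only.
    return ["git", "-c", f"merge.directoryRenames={directory_renames}",
            "merge", branch]


if __name__ == "__main__":
    # "topic" is a placeholder branch name.
    print(" ".join(merge_argv("topic", "true")))
```

The resulting list could be handed to `subprocess.run` inside a scratch
clone; building it separately keeps the sketch free of side effects.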

Also...

> However, when testing my previous merges which had to be done with helper
> script, I've encountered case of
>
> CONFLICT (directory rename split)
>
> Is there any way to prevent conflict in this case if files are the same, and
> merge their contents if there are differences? I think it would be reasonable
> to assume that move done in newest commit should win, and allow user
> to change strategy via command line option, provide explicit hint where files
> should be moved, or maybe even decide it interactively.

This conflict message is known to trigger in some cases where it
shouldn't; it may be that you're just experiencing annoyance from
that.  Let me fix that issue before worrying about workarounds.


Also, if you try out the 'fast-rebase' builtin from that branch (which
is a demo only and not meant to become a real command), note that its
usage message is really helpful:
$ git fast-rebase -h
fatal: usage: read the code, figure out how to use it, then do so

It's the kind of thing you put in code when you're trying to get it
working the night before you'll include its results in your talk (and
finish getting it to work the morning of)...



Anyway, thank you very much for giving it a whirl and reporting, just
please be cautious about depending on it since it's still work in
progress.

Elijah

* Re: Git Merge 2020 slides and reproducibility
  2020-03-07 16:03   ` Elijah Newren
@ 2020-03-07 19:38     ` Konstantin Tokarev
  2020-03-09 15:50       ` Elijah Newren
  0 siblings, 1 reply; 8+ messages in thread
From: Konstantin Tokarev @ 2020-03-07 19:38 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Git Mailing List



07.03.2020, 19:03, "Elijah Newren" <newren@gmail.com>:
> On Sat, Mar 7, 2020 at 5:38 AM Konstantin Tokarev <annulen@yandex.ru> wrote:
>>  06.03.2020, 18:00, "Elijah Newren" <newren@gmail.com>:
>>  > Hi,
>>  >
>>  > Had a few different folks ask me at Git Merge about slides for my
>>  > talk. I said I'd post them on github somewhere, but in case you were
>>  > one of the folks and have a hard time finding it...they are up at
>>  > https://github.com/newren/presentations/blob/pdfs/merge-performance/merge-performance-slides.pdf
>>  > and steps to reproduce the speedups I got can be found at
>>  > https://github.com/newren/git/blob/git-merge-2020-demo/README.md
>>  > (though be forewarned that the code has lots of fixmes & ifdefs &
>>  > other problems, has awful commit messages, etc.; I will be cleaning it
>>  > up soon).
>>
>>  Hello, I've just tried your branch on my repository and it seems like it can
>>  be a salvation from all rename-related pain that I'm regularly facing when
>>  doing merges and cherry-picks! Thank you very much, I hope it will be
>>  integrated into mainline soon.
>
> Wow, thanks for trying it out. Please note that while it _might_ be
> okay to use for real work, I am not that confident that it is.

Do not worry, I made a full copy of the repo before trying anything.

> There
> are a number of factors making the 'demo' label I gave it a rather
> fitting one:
>
>   * I only started using it personally on a real world repository (or
> two) about a week and a half ago. (Before then, I knew merge-ort
> didn't work.)
>   * The second real world repo I used it on uncovered a bug in my code
> that the testsuite didn't catch[1]
>   * Although I've tested with two real world repos now, that testing
> was very minimal; I was focused on getting the demo ready and
> implementing as many optimizations as I could.
>   * While the outer merge, rebase, and cherry-pick commands will
> accept a bunch of merge-machinery options and pass them along,
> merge-ort flat ignores them all.
>   * merge-ort is hardcoded for merge.directoryRenames=true, when the
> default should be merge.directoryRenames=conflict

directoryRenames=true is actually one of the features that I was badly
missing and had somehow overlooked.

>   * it has a bunch of FIXMEs, some of which are code cleanliness
> issues but some of which represent minor bugs
>
> [1] https://lore.kernel.org/git/911de63afa274b0791e4d4252934a5e9b0031f10.1582762465.git.gitgitgadget@gmail.com/
>
> Also...
>
>>  However, when testing my previous merges which had to be done with helper
>>  script, I've encountered case of
>>
>>  CONFLICT (directory rename split)
>>
>>  Is there any way to prevent conflict in this case if files are the same, and
>>  merge their contents if there are differences? I think it would be reasonable
>>  to assume that move done in newest commit should win, and allow user
>>  to change strategy via command line option, provide explicit hint where files
>>  should be moved, or maybe even decide it interactively.
>
> This conflict message is known to trigger in some cases where it
> shouldn't; it may be that you're just experiencing annoyance from
> that. Let me fix that issue before worrying about workarounds.

Well, in my case a directory of files was moved to path A in one of the
merged heads and to path B in another, so I guess it was legitimate.

Are you going to continue development in the same branch?
When do you expect it to be ready for review?
-- 
Regards,
Konstantin



* Re: Git Merge 2020 slides and reproducibility
  2020-03-07 19:38     ` Konstantin Tokarev
@ 2020-03-09 15:50       ` Elijah Newren
  2020-03-10 14:36         ` Konstantin Tokarev
  0 siblings, 1 reply; 8+ messages in thread
From: Elijah Newren @ 2020-03-09 15:50 UTC (permalink / raw)
  To: Konstantin Tokarev; +Cc: Git Mailing List

On Sat, Mar 7, 2020 at 11:38 AM Konstantin Tokarev <annulen@yandex.ru> wrote:
>
> 07.03.2020, 19:03, "Elijah Newren" <newren@gmail.com>:
> > On Sat, Mar 7, 2020 at 5:38 AM Konstantin Tokarev <annulen@yandex.ru> wrote:
...
> >>  However, when testing my previous merges which had to be done with helper
> >>  script, I've encountered case of
> >>
> >>  CONFLICT (directory rename split)
> >>
> >>  Is there any way to prevent conflict in this case if files are the same, and
> >>  merge their contents if there are differences? I think it would be reasonable
> >>  to assume that move done in newest commit should win, and allow user
> >>  to change strategy via command line option, provide explicit hint where files
> >>  should be moved, or maybe even decide it interactively.
> >
> > This conflict message is known to trigger in some cases where it
> > shouldn't; it may be that you're just experiencing annoyance from
> > that. Let me fix that issue before worrying about workarounds.
>
> Well, in my case a directory of files was moved to path A in one of merged heads
> and to path B in another, so I guess it was legitimate.

The point of directory rename detection is to allow new paths on the
unrenamed side of history to follow the directory rename.  So, while
there may have been an ambiguous directory rename, if there were no
new paths to be moved by it, then that directory rename is irrelevant
and shouldn't be reported as a problem.  (If you did have new paths on
the unrenamed side in that directory, then yes, it's legitimate.)
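
To make the rule in the paragraph above concrete, here is a toy Python model.
It is a sketch under stated assumptions, not Git's implementation: the
function names are hypothetical, only each file's immediate parent directory
is considered, a "split" is taken to mean a tie for the most files among the
destinations, and the real requirement that the old directory be gone on the
renaming side is ignored.

```python
from collections import Counter
from posixpath import basename, dirname, join


def detect_directory_renames(renames):
    """Infer directory renames from file renames on ONE side of history.

    `renames` maps old file path -> new file path.
    Returns (dir_map, split): dir_map maps an old directory to the
    destination that received the most files; split holds old directories
    whose top destinations tie (the ambiguous case).
    """
    votes = {}
    for old, new in renames.items():
        old_dir, new_dir = dirname(old), dirname(new)
        if old_dir != new_dir:
            votes.setdefault(old_dir, Counter())[new_dir] += 1

    dir_map, split = {}, set()
    for old_dir, counter in votes.items():
        ranked = counter.most_common(2)
        if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
            split.add(old_dir)            # no clear winner (assumed rule)
        else:
            dir_map[old_dir] = ranked[0][0]
    return dir_map, split


def place_new_paths(new_paths, dir_map, split):
    """Decide where paths ADDED on the unrenamed side should end up.

    Returns (placed, conflicts). An ambiguous directory only produces a
    conflict if some new path actually needs to be moved out of it.
    """
    placed, conflicts = {}, []
    for path in new_paths:
        parent = dirname(path)
        if parent in split:
            conflicts.append(path)
        elif parent in dir_map:
            placed[path] = join(dir_map[parent], basename(path))
        else:
            placed[path] = path
    return placed, conflicts


if __name__ == "__main__":
    # Hypothetical paths: lib/ is split evenly between src/ and core/.
    ambiguous = {"lib/a.c": "src/a.c", "lib/b.c": "core/b.c"}
    dir_map, split = detect_directory_renames(ambiguous)
    print(place_new_paths([], dir_map, split))             # nothing to move
    print(place_new_paths(["lib/new.c"], dir_map, split))  # needs a decision
```

In this model the first call reports no conflicts even though lib/ was
renamed ambiguously, because no new path depends on the answer; only the
second call, where lib/new.c has to go somewhere, surfaces a problem.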

> Are you going to continue development in the same branch?

Nope, the branch exists for reproducibility of the demo.  Right now,
my plan is to work on the 'ort' branch (which the git-merge-2020-demo
branch was a snapshot of), but I reserve the right at any time to push
up code to that branch that doesn't even compile or is known to be
horribly broken.

> When do you expect it to be ready for review?

Good question.  There's other work I've been pushing off with the
excuse of preparing for the Git Merge 2020 conference, and working on
those other things may limit my time on this and make it harder to
give good guesstimates.

I'm hoping that _parts_ of it will be ready to review a week or two
after 2.26 is released.  That will not mean I'm done with development
at that time, just that I'm trying to get feedback in parallel with
doing further development.  Besides competing priorities, there's
another reason to be somewhat cautious about the timeline: I don't
want us to replace one area of the code that only one person is
willing to touch with a different scary beast that no one wants to
touch.  So, I need to put some work into high level algorithm and data
structure documentation, splitting up patches nicely, etc.  And the
purpose of writing those documents isn't to put the design in stone,
but rather to make review easier -- at which point I expect at least
one big change or two (and dozens of small changes) to be requested
for maintenance/performance/API-design reasons.  I'll be disappointed
if I don't get that kind of feedback, as I'll be worried we're just
putting a new black box into place.

I happen to think that the basics of the new module are nicer than the
old merge-recursive module I'm replacing, but the performance work
complicated things a fair amount and I want to make it more
approachable.  So, we'll see.

I know this is horribly vague.  Sorry.

Elijah

* Re: Git Merge 2020 slides and reproducibility
  2020-03-09 15:50       ` Elijah Newren
@ 2020-03-10 14:36         ` Konstantin Tokarev
  0 siblings, 0 replies; 8+ messages in thread
From: Konstantin Tokarev @ 2020-03-10 14:36 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Git Mailing List



09.03.2020, 18:50, "Elijah Newren" <newren@gmail.com>:
> On Sat, Mar 7, 2020 at 11:38 AM Konstantin Tokarev <annulen@yandex.ru> wrote:
>>  07.03.2020, 19:03, "Elijah Newren" <newren@gmail.com>:
>>  > On Sat, Mar 7, 2020 at 5:38 AM Konstantin Tokarev <annulen@yandex.ru> wrote:
>
> ...
>>  >> However, when testing my previous merges which had to be done with helper
>>  >> script, I've encountered case of
>>  >>
>>  >> CONFLICT (directory rename split)
>>  >>
>>  >> Is there any way to prevent conflict in this case if files are the same, and
>>  >> merge their contents if there are differences? I think it would be reasonable
>>  >> to assume that move done in newest commit should win, and allow user
>>  >> to change strategy via command line option, provide explicit hint where files
>>  >> should be moved, or maybe even decide it interactively.
>>  >
>>  > This conflict message is known to trigger in some cases where it
>>  > shouldn't; it may be that you're just experiencing annoyance from
>>  > that. Let me fix that issue before worrying about workarounds.
>>
>>  Well, in my case a directory of files was moved to path A in one of merged heads
>>  and to path B in another, so I guess it was legitimate.
>
> The point of directory rename detection is to allow new paths on the
> unrenamed side of history to follow the directory rename. So, while
> there may have been an ambiguous directory rename, if there were no
> new paths to be moved by it, then that directory rename is irrelevant
> and shouldn't be reported as a problem. (If you did have new paths on
> the unrenamed side in that directory, then yes, it's legitimate.)

In my case, the two sides have different renames, but the files in the
directory in question are mostly unchanged. It would even work for me if the
merge placed them in the wrong directory in the end, as long as it merged the
file contents automatically.

>
>>  Are you going to continue development in the same branch?
>
> Nope, the branch exists for reproducibility of the demo. Right now,
> my plan is to work on the 'ort' branch (which the git-merge-2020-demo
> branch was a snapshot of), but I reserve the right at any time to push
> up code to that branch that doesn't even compile or is known to be
> horribly broken.
>
>>  When do you expect it to be ready for review?
>
> Good question. There's other work I've been pushing off with the
> excuse of preparing for the Git Merge 2020 conference, and working on
> those other things may limit my time on this and make it harder to
> give good guesstimates.
>
> I'm hoping that _parts_ of it will be ready to review a week or two
> after 2.26 is released. That will not mean I'm done with development
> at that time, just that I'm trying to get feedback in parallel with
> doing further development. Besides competing priorities, there's
> another reason to be somewhat cautious about the timeline: I don't
> want us to replace one area of the code that only one person is
> willing to touch with a different scary beast that no one wants to
> touch. So, I need to put some work into high level algorithm and data
> structure documentation, splitting up patches nicely, etc. And the
> purpose of writing those documents isn't to put the design in stone,
> but rather to make review easier -- at which point I expect at least
> one big change or two (and dozens of small changes) to be requested
> for maintenance/performance/API-design reasons. I'll be disappointed
> if I don't get that kind of feedback, as I'll be worried we're just
> putting a new black box into place.
>
> I happen to think that the basics of the new module are nicer than the
> old merge-recursive module I'm replacing, but the performance work
> complicated things a fair amount and I want to make it more
> approachable. So, we'll see.

Personally, I would always prefer correct rename detection over speed,
even if things become _slower_, just to resolve fewer conflicts manually.
However, I guess planning all optimizations up front may be necessary to choose
optimal data structures.

>
> I know this is horribly vague. Sorry.

No problem, thanks a lot for your work and this information!

>
> Elijah

-- 
Regards,
Konstantin

