Git: new feature suggestion

git@vger.kernel.org mailing list mirror (one of many)

* Git: new feature suggestion
@ 2017-01-18 10:40 Joao Pinto
  2017-01-18 18:50 ` Stefan Beller
  2017-01-19  6:33 ` Konstantin Khomoutov
  0 siblings, 2 replies; 15+ messages in thread
From: Joao Pinto @ 2017-01-18 10:40 UTC (permalink / raw)
  To: git; +Cc: Linus Torvalds, CARLOS.PALMINHA@synopsys.com

Hello,

My name is Joao Pinto, I work at Synopsys and I am a frequent Linux Kernel
contributor.

Let me start by congratulating you on the fantastic work you have been doing with
Git, which is an excellent tool.

The Linux kernel, like all systems, needs to be improved and reorganized to be
better prepared for future development, and sometimes we need to change
folder/file names or even move things around.
I have seen a lot of Linux developers avoid these reorganization operations
because they would lose the renamed file's history: a new log is created
for the new file, even if it is just a renamed version of the old one.
I am sending you this e-mail to suggest the creation of a new feature in Git:
when renamed, a file or folder should inherit its parent’s log, and a “rename: …”
entry would be automatically created, or there would be some kind of pointer to
its “old” form, to make history analysis easier.

I volunteer to help in the new feature if you find it useful. I think it would
improve log history analysis and would enable developers to better organize old
code.

Thank you for your attention.

Best Regards,
Joao Pinto

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Git: new feature suggestion
  2017-01-18 10:40 Git: new feature suggestion Joao Pinto
@ 2017-01-18 18:50 ` Stefan Beller
  2017-01-18 19:04   ` Joao Pinto
  2017-01-19  6:33 ` Konstantin Khomoutov
  1 sibling, 1 reply; 15+ messages in thread
From: Stefan Beller @ 2017-01-18 18:50 UTC (permalink / raw)
  To: Joao Pinto
  Cc: git@vger.kernel.org, Linus Torvalds, CARLOS.PALMINHA@synopsys.com

On Wed, Jan 18, 2017 at 2:40 AM, Joao Pinto <Joao.Pinto@synopsys.com> wrote:
> Hello,
>
> My name is Joao Pinto, I work at Synopsys and I am a frequent Linux Kernel
> contributor.
>
> Let me start by congratulating you on the fantastic work you have been doing with
> Git, which is an excellent tool.
>
> The Linux kernel, like all systems, needs to be improved and reorganized to be
> better prepared for future development, and sometimes we need to change
> folder/file names or even move things around.
> I have seen a lot of Linux developers avoid these reorganization operations
> because they would lose the renamed file's history: a new log is created
> for the new file, even if it is just a renamed version of the old one.

Well there are a couple of things to help with digging in the logs.

git log:
       --follow
           Continue listing the history of a file beyond renames (works only
           for a single file).

       -M[<n>], --find-renames[=<n>]
           If generating diffs, detect and report renames for each commit. For
           following files across renames while traversing history, see
           --follow. If n is specified, it is a threshold on the similarity
           index (i.e. amount of addition/deletions compared to the file’s
           size). For example, -M90% means Git should consider a delete/add
           pair to be a rename if more than 90% of the file hasn’t changed.
           Without a % sign, the number is to be read as a fraction, with a
           decimal point before it. I.e., -M5 becomes 0.5, and is thus the
           same as -M50%. Similarly, -M05 is the same as -M5%. To limit
           detection to exact renames, use -M100%. The default similarity
           index is 50%.

       -C[<n>], --find-copies[=<n>]
           Detect copies as well as renames. See also --find-copies-harder. If
           n is specified, it has the same meaning as for -M<n>.



> I am sending you this e-mail to suggest the creation of a new feature in Git:
> when renamed, a file or folder should inherit its parent’s log, and a “rename: …”
> entry would be automatically created, or there would be some kind of pointer to
> its “old” form, to make history analysis easier.

How do you currently analyse history, which detailed feature is missing?

Mind that in the Git data model we deliberately do not record the rename
at commit time, but rather want to identify the renames at log time.
This is because in the meantime between commit and log viewing someone
could have written a better rename detection, whereas at commit time we'd
be stuck with ancient cruft forever. ;)

>
> I volunteer to help in the new feature if you find it useful. I think it would
> improve log history analysis and would enable developers to better organize old
> code.

IMHO complete renames (i.e. git mv path/a/file.c path/b/thing.c) are already
covered quite well. Partial rename (e.g. moving code from one file into two
separate files or vice versa) is still a bit hard.
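
As a minimal sketch of how a complete rename is recovered at log time rather
than recorded at commit time: the script below assumes a throwaway repository,
and every path, identity and commit message in it is hypothetical. It compares
a plain path-limited log with `--follow`, and asks the diff machinery to
report the rename with `-M`.

```shell
#!/bin/sh
set -e

# Throwaway repository; all names below are made up for illustration.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

mkdir -p path/a path/b
printf 'int answer(void)\n{\n    return 42;\n}\n' > path/a/file.c
git add path/a/file.c
git commit -q -m "add file.c"

# A pure rename: the content is unchanged, so the similarity is 100%.
git mv path/a/file.c path/b/thing.c
git commit -q -m "move file.c to thing.c"

# Path-limited log without --follow: history stops at the rename commit.
plain=$(git log --oneline -- path/b/thing.c | wc -l | tr -d ' ')

# With --follow: history continues under the old name.
followed=$(git log --oneline --follow -- path/b/thing.c | wc -l | tr -d ' ')

# -M asks the diff to pair the delete/add up as a rename (R plus similarity).
status=$(git diff -M --name-status HEAD~1 HEAD | cut -c1-4)

echo "plain=$plain followed=$followed status=$status"
```

Nothing about the rename is stored in the commit itself; the pairing is
computed each time the log or diff is produced.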

I started such a new feature, see
https://public-inbox.org/git/20160903033120.20511-1-sbeller@google.com/
latest code is at https://github.com/stefanbeller/git/commits/colored_diff12,
but the latest two commits are bogus and need rewriting.

I think this feature is not 100% what you are aiming at, but is very close.

Thanks,
Stefan


* Re: Git: new feature suggestion
  2017-01-18 18:50 ` Stefan Beller
@ 2017-01-18 19:04   ` Joao Pinto
  0 siblings, 0 replies; 15+ messages in thread
From: Joao Pinto @ 2017-01-18 19:04 UTC (permalink / raw)
  To: Stefan Beller, Joao Pinto
  Cc: git@vger.kernel.org, Linus Torvalds, CARLOS.PALMINHA@synopsys.com


Hi Stefan,

On 1/18/2017 at 6:50 PM, Stefan Beller wrote:
> [...]

Great info, helps a lot! I am going to analyse and get back to you ASAP.

Thanks



* Re: Git: new feature suggestion
  2017-01-18 10:40 Git: new feature suggestion Joao Pinto
  2017-01-18 18:50 ` Stefan Beller
@ 2017-01-19  6:33 ` Konstantin Khomoutov
  2017-01-19 17:55   ` Joao Pinto
                     ` (2 more replies)
  1 sibling, 3 replies; 15+ messages in thread
From: Konstantin Khomoutov @ 2017-01-19  6:33 UTC (permalink / raw)
  To: Joao Pinto; +Cc: git, Linus Torvalds, CARLOS.PALMINHA@synopsys.com

On Wed, 18 Jan 2017 10:40:52 +0000
Joao Pinto <Joao.Pinto@synopsys.com> wrote:

[...]
> I have seen a lot of Linux developers avoid these reorganization
> operations because they would lose the renamed file's history: a new
> log is created for the new file, even if it is just a renamed version
> of the old one. I am sending you this e-mail to suggest the
> creation of a new feature in Git: when renamed, a file or folder
> should inherit its parent’s log, and a “rename: …” entry would be
> automatically created, or there would be some kind of pointer to its
> “old” form, to make history analysis easier.

Git does not record renames because of its stance that what matters is
code _of the whole project_ as opposed to its location in a particular
file.

Hence with regard to renames Git "works backwards" by detecting them
dynamically while traversing the history (such as with `git log`
etc).  This detection uses certain heuristics which can be controlled
with knobs pointed to by Stefan Beller.

Still, I welcome you to read the sort-of "reference" post by Linus
Torvalds [1] in which he explains the reasoning behind this approach
implemented in Git.  IMO, understanding the reasoning behind the idea
is much better than just mechanically learning how to use it.

The whole thread (esp. Torvalds' replies) is worth reading, but that
particular mail summarizes the whole thing very well.

(The reference link to it used to be [2], but Gmane is not fully
recovered to be able to display it.)

1. http://public-inbox.org/git/Pine.LNX.4.58.0504150753440.7211@ppc970.osdl.org/
2. http://thread.gmane.org/gmane.comp.version-control.git/27/focus=217


* Re: Git: new feature suggestion
  2017-01-19  6:33 ` Konstantin Khomoutov
@ 2017-01-19 17:55   ` Joao Pinto
  2017-01-19 18:17   ` Junio C Hamano
  2017-01-19 18:39   ` Linus Torvalds
  2 siblings, 0 replies; 15+ messages in thread
From: Joao Pinto @ 2017-01-19 17:55 UTC (permalink / raw)
  To: Konstantin Khomoutov, Joao Pinto
  Cc: git, Linus Torvalds, CARLOS.PALMINHA@synopsys.com


Hi,

On 1/19/2017 at 6:33 AM, Konstantin Khomoutov wrote:
> [...]

Thank you very much for the info!

Joao


* Re: Git: new feature suggestion
  2017-01-19  6:33 ` Konstantin Khomoutov
  2017-01-19 17:55   ` Joao Pinto
@ 2017-01-19 18:17   ` Junio C Hamano
  2017-01-19 18:39   ` Linus Torvalds
  2 siblings, 0 replies; 15+ messages in thread
From: Junio C Hamano @ 2017-01-19 18:17 UTC (permalink / raw)
  To: Konstantin Khomoutov
  Cc: Joao Pinto, git, Linus Torvalds, CARLOS.PALMINHA@synopsys.com

Konstantin Khomoutov <kostix+git@007spb.ru> writes:

> Still, I welcome you to read the sort-of "reference" post by Linus
> Torvalds [1] in which he explains the reasoning behind this approach
> implemented in Git.  IMO, understanding the reasoning behind the idea
> is much better than just mechanically learning how to use it.
>
> The whole thread (esp. Torvalds' replies) is worth reading, but that
> particular mail summarizes the whole thing very well.
>
> (The reference link to it used to be [2], but Gmane is not fully
> recovered to be able to display it.)
>
> 1. http://public-inbox.org/git/Pine.LNX.4.58.0504150753440.7211@ppc970.osdl.org/
> 2. http://thread.gmane.org/gmane.comp.version-control.git/27/focus=217

Indeed.  Thanks for providing a link to it here ;-)

The message is the most important one in the early history of Git,
and it still is one of the most important messages in the Git
mailing-list archive.  "git log -S<block>" was designed to take a
block of text (even though people misuse it and feed a single line
to it) exactly because it wanted to serve the "tracking when that
file+line changed" part in that vision.  The rename detection in
"diff" was meant to be used on the commit "git log -S<block>" finds
to see if the found change came from another file so that the user
can decide that "digging further" part needs to be done for another
file.  "git blame" with the -M and -C options was done to mostly
automate the "drilling down" process that finds the last commit that
touched each line in the above process, and when used with tools
like "tig", you can even peel one commit back and "zoom down" if the
found commit is an uninteresting one (e.g. a change with only code
formatting).
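
A minimal sketch of the two steps described above, assuming a throwaway
repository in which a hypothetical function `scale_and_clamp` is moved from
`a.c` to `b.c` (all names and messages are made up). `git log -S` is fed a
two-line block to find the commit where that block appeared in `b.c`, and
`git blame -C` then attributes the moved lines to the file they came from.

```shell
#!/bin/sh
set -e

# Throwaway repository; file names, function and messages are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

cat > a.c <<'EOF'
int unrelated_a(void)
{
    return 0;
}

static int scale_and_clamp(int value, int limit)
{
    int scaled = value * 2;
    if (scaled > limit)
        scaled = limit;
    return scaled;
}
EOF
cat > b.c <<'EOF'
int unrelated_b(void)
{
    return 1;
}
EOF
git add a.c b.c
git commit -q -m "add a.c and b.c"

# Move the function: append lines 6-12 of a.c to b.c, then drop them from a.c.
{ echo; sed -n '6,12p' a.c; } >> b.c
head -n 4 a.c > a.tmp && mv a.tmp a.c
git commit -q -a -m "move scale_and_clamp to b.c"

# Pickaxe with a block of text: which commit changed its occurrence in b.c?
block=$(printf '    int scaled = value * 2;\n    if (scaled > limit)')
found=$(git log -S"$block" --format=%s -- b.c)

# Blame with copy detection: count lines of b.c traced back to a.c.
origin_lines=$(git blame -C --line-porcelain b.c | grep -c '^filename a.c$')

echo "found=$found origin_lines=$origin_lines"
```

The pickaxe answers "when did this block show up here", and the blame pass
answers "where did it live before"; the second step is what would otherwise
be done by hand with rename detection on the commit found by the first.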

One thing that is still missing in the current version of Git,
compared to the "ideal SCM" the message envisioned, is the part that
notices: "oops, that line didn't even exist in the previous version,
BUT I FOUND FIVE PLACES that matched almost perfectly in the same
diff, and here they are".


* Re: Git: new feature suggestion
  2017-01-19  6:33 ` Konstantin Khomoutov
  2017-01-19 17:55   ` Joao Pinto
  2017-01-19 18:17   ` Junio C Hamano
@ 2017-01-19 18:39   ` Linus Torvalds
  2017-01-19 18:54     ` Joao Pinto
  2017-01-19 21:48     ` Jakub Narębski
  2 siblings, 2 replies; 15+ messages in thread
From: Linus Torvalds @ 2017-01-19 18:39 UTC (permalink / raw)
  To: Konstantin Khomoutov
  Cc: Joao Pinto, Git Mailing List, CARLOS.PALMINHA@synopsys.com

On Wed, Jan 18, 2017 at 10:33 PM, Konstantin Khomoutov
<kostix+git@007spb.ru> wrote:
>
> Still, I welcome you to read the sort-of "reference" post by Linus
> Torvalds [1] in which he explains the reasoning behind this approach
> implemented in Git.

It's worth noting that that discussion was from some _very_ early days
in git (one week into the whole thing), when none of those
visualization tools were actually implemented.

Even now, ten years after the fact, plain git doesn't actually do what
I outlined. Yes, "git blame -Cw" works fairly well, and is in general
better than the traditional per-file "annotate". And yes, "git log
--follow" does another (small) part of the outlined thing, but is
really not very powerful.

Some tools on top of git do more, but I think in general this is an
area that could easily be improved upon. For example, the whole
iterative and interactive drilling down in history of a particular
file is very inconvenient to do with "git blame" (you find a commit
that changes the area in some way that you don't find interesting, so
then you have to restart git blame with the parent of that
uninteresting commit).

You can do it in tig, but I suspect a more graphical tool might be better.

.. and we still end up having a lot of things where we simply just
work with pathnames. For example, when doing merges, it's absolutely
_wonderful_ doing

   gitk --merge <filename>

to see what happened to that filename that has a conflict during the
merge. But it's all based on the whole-file changes, and sometimes
you'd like to see just the commits that generate one particular
conflict (in the kernel, things like the MAINTAINERS file can have
quite a lot of changes, but they are all pretty independent, and what
you want to see is just "changes to this area").

We do have the "-L" flag to git log, but it doesn't actually work for
this particular case because of limitations.
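
For reference, a minimal sketch of what `git log -L` does in the simple,
non-merge case, assuming a throwaway repository with a hypothetical
`entries.txt` file (all names and messages are made up). It does not
reproduce the merge-conflict limitation mentioned above; it only shows the
"changes to this area" filtering on a linear history.

```shell
#!/bin/sh
set -e

# Throwaway repository; file name and commit messages are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# Twelve independent entries, loosely modelled on a MAINTAINERS-style file.
i=1
while [ "$i" -le 12 ]; do
    echo "entry $i"
    i=$((i + 1))
done > entries.txt
git add entries.txt
git commit -q -m "add entries"

# One commit touches the top of the file...
{ echo "entry 1 (updated)"; tail -n +2 entries.txt; } > entries.tmp
mv entries.tmp entries.txt
git commit -q -a -m "update first entry"

# ...and another touches the bottom.
{ head -n 11 entries.txt; echo "entry 12 (updated)"; } > entries.tmp
mv entries.tmp entries.txt
git commit -q -a -m "update last entry"

# Only commits that changed lines 1-2 are listed; -L also prints the
# patch for that range, so keep just the subject lines here.
subjects=$(git log -L1,2:entries.txt --format='subject:%s' | grep '^subject:')
count=$(printf '%s\n' "$subjects" | wc -l | tr -d ' ')

printf '%s\n' "$subjects"
```

The commit that only changed the last entry is filtered out, which is the
per-area view that a whole-file `gitk --merge <filename>` cannot give.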

So what I'm trying to say is that the argument from 10+ years ago that
"you can do better with intelligent tools after-the-fact" is very much
true, but it's also true that we don't actually have all those
intelligent tools, and this is an area that could still be improved
upon. Some of them are actually available as add-ons in various
graphical IDEs that use git.

                 Linus


* Re: Git: new feature suggestion
  2017-01-19 18:39   ` Linus Torvalds
@ 2017-01-19 18:54     ` Joao Pinto
  2017-01-19 19:16       ` Linus Torvalds
  2017-01-19 21:48     ` Jakub Narębski
  1 sibling, 1 reply; 15+ messages in thread
From: Joao Pinto @ 2017-01-19 18:54 UTC (permalink / raw)
  To: Linus Torvalds, Konstantin Khomoutov
  Cc: Joao Pinto, Git Mailing List, CARLOS.PALMINHA@synopsys.com


Hi Linus,

On 1/19/2017 at 6:39 PM, Linus Torvalds wrote:
> [...]

I am currently facing some challenges in one of the Linux subsystems where a rename
of a set of folders and files would be the perfect scenario for future
development, but the suggestion is not accepted, not because it's not correct,
but because it makes the maintainer's life harder in backporting bug fixes and new
features to older kernel versions and because it is not easy to follow the
renamed file/folder history from the kernel.org history logs.

As nature shows us, the ability to adapt is the key to survival, so Linux
would gain a lot from some new features in git that can make maintainers' lives
easier. Assisted backporting would be an excellent feature for them.

Have you ever thought about optimizing backport operations through git or through
an add-on to it?

I am available to help if this feature makes sense for git users.

Thanks,
Joao


* Re: Git: new feature suggestion
  2017-01-19 18:54     ` Joao Pinto
@ 2017-01-19 19:16       ` Linus Torvalds
  2017-01-19 21:51         ` Joao Pinto
  0 siblings, 1 reply; 15+ messages in thread
From: Linus Torvalds @ 2017-01-19 19:16 UTC (permalink / raw)
  To: Joao Pinto
  Cc: Konstantin Khomoutov, Git Mailing List,
	CARLOS.PALMINHA@synopsys.com

On Thu, Jan 19, 2017 at 10:54 AM, Joao Pinto <Joao.Pinto@synopsys.com> wrote:
>
> I am currently facing some challenges in one of the Linux subsystems where a rename
> of a set of folders and files would be the perfect scenario for future
> development, but the suggestion is not accepted, not because it's not correct,
> but because it makes the maintainer's life harder in backporting bug fixes and new
> features to older kernel versions and because it is not easy to follow the
> renamed file/folder history from the kernel.org history logs.

Honestly, that's less of a git issue, and more of a "patch will not
apply across versions" issue.

No amount of rename detection will ever fix that, simply because the
rename hadn't even _happened_ in the old versions that things get
backported to.

("git cherry-pick" can do a merge resolution and thus do "backwards"
renaming too, so tooling can definitely help, but it still ends up
meaning that even trivial patches are no longer the _same_ trivial
patch across versions).
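
A minimal sketch of that "backwards" renaming, assuming a throwaway
repository with a hypothetical driver file that is renamed on the
development branch after a `stable` branch was forked (all paths, branch
names and messages are made up). The fix is written against the new path and
then cherry-picked onto the branch that still has the old one.

```shell
#!/bin/sh
set -e

# Throwaway repository; paths, branch names and messages are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

mkdir -p drivers/old
printf 'int probe(void)\n{\n    return 0;\n}\n' > drivers/old/driver.c
git add drivers/old/driver.c
git commit -q -m "add driver"

# Remember the development branch name and fork "stable" before the rename.
dev=$(git rev-parse --abbrev-ref HEAD)
git branch stable

# Development continues: rename first, then fix a bug under the new path.
mkdir -p drivers/new
git mv drivers/old/driver.c drivers/new/driver.c
git commit -q -m "rename driver directory"

printf 'int probe(void)\n{\n    return 1;\n}\n' > drivers/new/driver.c
git commit -q -a -m "fix probe return value"

# Backport: the rename never happened on stable, so the merge machinery
# behind cherry-pick has to map the new path back onto the old one.
git checkout -q stable
git cherry-pick "$dev" >/dev/null

fixed=$(grep -c 'return 1;' drivers/old/driver.c)
subject=$(git log -1 --format=%s)

echo "fixed=$fixed subject=$subject"
```

The backported commit ends up touching `drivers/old/driver.c`, so even in
this trivial case the patch on `stable` is not textually the same patch as
the one on the development branch.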

So renaming things increases maintainer workloads in those situations
regardless of any tooling issues.

(You may also be referring to the Mellanox mess, where this issue is
very much exacerbated by having different groups working on the same
thing, and maintainers having very much a "I will not take _anything_
from any of the groups that makes my life more complicated" model,
because those groups fucked up so much in the past).

In other words, quite often issues are about workflows rather than
tools. The networking layer probably has more of this, because David
actually does the backports himself, so he _really_ doesn't want to
complicate things.

               Linus


* Re: Git: new feature suggestion
  2017-01-19 19:16       ` Linus Torvalds
@ 2017-01-19 21:51         ` Joao Pinto
  2017-01-19 22:03           ` Stefan Beller
  0 siblings, 1 reply; 15+ messages in thread
From: Joao Pinto @ 2017-01-19 21:51 UTC (permalink / raw)
  To: Linus Torvalds, Joao Pinto
  Cc: Konstantin Khomoutov, Git Mailing List,
	CARLOS.PALMINHA@synopsys.com

On 1/19/2017 at 7:16 PM, Linus Torvalds wrote:
> [...]

I totally understand David's side! Synopsys is a well-known IP vendor, and for a
long time its focus was the IP only. Nowadays the strategy has changed and
Synopsys is very keen to help in open source, namely Linux, developing the
drivers for new IP cores and participating in the improvement of existing ones.
I am part of the team that has that job.

In the USB and PCI subsystems, developers created common Synopsys drivers (focused
on the HW IP), and so today they are widely used by all the SoCs that use Synopsys
IP.

In the network subsystem, there are some drivers that target the same IP but
were made by different companies. stmmac is an excellent driver for Synopsys MAC
10/100/1000/QOS IPs, but there was another driver, made by AXIS, that also
targeted the QOS IP. We detected that issue and merged the AXIS-specific driver
ops into stmmac, and nowadays AXIS uses stmmac. So fewer drivers to maintain!

The idea that was rejected consisted of renaming stmicro/stmmac to dwc/stmmac
and having dwc (designware controllers) as the official driver spot for
Synopsys Ethernet IPs.
There is another example of duplication: the AMD and Samsung XGMAC
drivers, which target the same Synopsys XGMAC IP.

I am giving these examples because, although the refactor adds work for
backporting, it reduces maintenance, since we would have fewer duplicated
drivers than we have today.

Thanks,
Joao


* Re: Git: new feature suggestion
  2017-01-19 21:51         ` Joao Pinto
@ 2017-01-19 22:03           ` Stefan Beller
  2017-01-20 10:44             ` Joao Pinto
  0 siblings, 1 reply; 15+ messages in thread
From: Stefan Beller @ 2017-01-19 22:03 UTC (permalink / raw)
  To: Joao Pinto
  Cc: Linus Torvalds, Konstantin Khomoutov, Git Mailing List,
	CARLOS.PALMINHA@synopsys.com

On Thu, Jan 19, 2017 at 1:51 PM, Joao Pinto <Joao.Pinto@synopsys.com> wrote:
> [...]
> I totally understand David's side! Synopsys is a well-known IP vendor, and for a
> long time its focus was the IP only. Nowadays the strategy has changed and
> Synopsys is very keen to help in open source, namely Linux, developing the
> drivers for new IP cores and participating in the improvement of existing ones.
> I am part of the team that has that job.
>
> In the USB and PCI subsystems, developers created common Synopsys drivers (focused
> on the HW IP), and so today they are widely used by all the SoCs that use Synopsys
> IP.
>
> In the network subsystem, there are some drivers that target the same IP but
> were made by different companies. stmmac is an excellent driver for Synopsys MAC
> 10/100/1000/QOS IPs, but there was another driver, made by AXIS, that also
> targeted the QOS IP. We detected that issue and merged the AXIS-specific driver
> ops into stmmac, and nowadays AXIS uses stmmac. So fewer drivers to maintain!
>
> The idea that was rejected consisted of renaming stmicro/stmmac to dwc/stmmac
> and having dwc (designware controllers) as the official driver spot for
> Synopsys Ethernet IPs.
> There is another example of duplication: the AMD and Samsung XGMAC
> drivers, which target the same Synopsys XGMAC IP.
>
> I am giving these examples because, although the refactor adds work for
> backporting, it reduces maintenance, since we would have fewer duplicated
> drivers than we have today.

This sounds as if the code in question would only receive backports for a
specific time (determined by HW lifecycle, maintenance life cycle and such).

So I wonder if this could be solved by not just renaming but additionally
adding a symbolic link, such that the files in question seem to appear twice
on the file system.
Then backports ought to be applicable (hoping git-am doesn't choke on symlinks),
and after a while, once there are no backports any more (due to life cycle
reasons), remove the link?
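
A minimal sketch of the symbolic link idea above, written in Python so it
can be tried on any POSIX filesystem without a repository. The names
`rename_with_compat_link`, `demo` and the file `main.c` are hypothetical;
the directory names are the stmicro/stmmac and dwc/stmmac example from this
thread. It only shows that a write through the old path lands in the
renamed file. It does not show whether git-am accepts a patch whose path
goes through a symlink, which is the open question raised above.

```python
import os
import tempfile
from pathlib import Path


def rename_with_compat_link(root: Path, old: str, new: str) -> None:
    """Move root/old to root/new and leave a relative symlink at root/old."""
    old_path, new_path = root / old, root / new
    new_path.parent.mkdir(parents=True, exist_ok=True)
    old_path.rename(new_path)
    # Use a relative target so the link stays valid if the tree is relocated.
    target = os.path.relpath(new_path, start=old_path.parent)
    old_path.symlink_to(target, target_is_directory=True)


def demo() -> str:
    """Rename a directory, write through the old path, read via the new one."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "stmicro" / "stmmac").mkdir(parents=True)
        (root / "stmicro" / "stmmac" / "main.c").write_text("int x;\n")

        rename_with_compat_link(root, "stmicro/stmmac", "dwc/stmmac")

        # A tool that still uses the old path ends up editing the new file.
        (root / "stmicro" / "stmmac" / "main.c").write_text("int y;\n")
        return (root / "dwc" / "stmmac" / "main.c").read_text()


if __name__ == "__main__":
    print(demo(), end="")
```

Removing the link later is a plain unlink of the old path; the renamed
directory is not affected.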

This also sounds like the kind of problem that others have run into before;
how did they solve it?

Thanks,
Stefan

>
> Thanks,
> Joao
>
>
>>                Linus
>>
>

* Re: Git: new feature suggestion
  2017-01-19 22:03           ` Stefan Beller
@ 2017-01-20 10:44             ` Joao Pinto
  0 siblings, 0 replies; 15+ messages in thread
From: Joao Pinto @ 2017-01-20 10:44 UTC (permalink / raw)
  To: Stefan Beller, Joao Pinto
  Cc: Linus Torvalds, Konstantin Khomoutov, Git Mailing List,
	CARLOS.PALMINHA@synopsys.com


Hi Stefan,

On 1/19/2017 at 10:03 PM, Stefan Beller wrote:
> On Thu, Jan 19, 2017 at 1:51 PM, Joao Pinto <Joao.Pinto@synopsys.com> wrote:
>> On 1/19/2017 at 7:16 PM, Linus Torvalds wrote:
>>> On Thu, Jan 19, 2017 at 10:54 AM, Joao Pinto <Joao.Pinto@synopsys.com> wrote:
>>>>
>>>> I am currently facing some challenges in one of the Linux subsystems where a rename
>>>> of a set of folders and files would be the perfect scenario for future
>>>> development, but the suggestion is not accepted, not because it's not correct,
>>>> but because it makes the maintainer's life harder in backporting bug fixes and new
>>>> features to older kernel versions and because it is not easy to follow the
>>>> renamed file/folder history from the kernel.org history logs.
>>>
>>> Honestly, that's less of a git issue, and more of a "patch will not
>>> apply across versions" issue.
>>>
>>> No amount of rename detection will ever fix that, simply because the
>>> rename hadn't even _happened_ in the old versions that things get
>>> backported to.
>>>
>>> ("git cherry-pick" can do a merge resolution and thus do "backwards"
>>> renaming too, so tooling can definitely help, but it still ends up
>>> meaning that even trivial patches are no longer the _same_ trivial
>>> patch across versions).
>>>
>>> So renaming things increases maintainer workloads in those situations
>>> regardless of any tooling issues.
>>>
>>> (You may also be referring to the mellanox mess, where this issue is
>>> very much exacerbated by having different groups working on the same
>>> thing, and maintainers having very much a "I will not take _anything_
>>> from any of the groups that makes my life more complicated" model,
>>> because those groups fucked up so much in the past).
>>>
>>> In other words, quite often issues are about workflows rather than
>>> tools. The networking layer probably has more of this, because David
>>> actually does the backports himself, so he _really_ doesn't want to
>>> complicate things.
>>
>> I totally understand David's side! Synopsys is a well-known IP Vendor, and for a
>> long time its focus was the IP only. Nowadays the strategy has changed and
>> Synopsys is very keen to help in Open Source, namely Linux, developing the
>> drivers for new IP Cores and participating in the improvement of existing ones.
>> I am part of the team that has that job.
>>
>> In the USB and PCI subsystems developers created common Synopsys drivers (focused on
>> the HW IP) and so today they are massively used by all the SoCs that use Synopsys
>> IP.
>>
>> In the network subsystem, there are some drivers that target the same IP but
>> were made by different companies. stmmac is an excellent driver for Synopsys MAC
>> 10/100/1000/QOS IPs, but there was another driver made by AXIS that also
>> targeted the QOS IP. We detected that issue and merged the AXIS specific driver
>> ops to stmmac, and nowadays, AXIS uses stmmac. So fewer drivers to maintain!
>>
>> The idea that was rejected consisted of renaming stmicro/stmmac to dwc/stmmac
>> and to have dwc (designware controllers) as the official driver spot for
>> Synopsys Ethernet IPs.
>> There is another example of duplication, which is AMD's and Samsung's XGMAC
>> driver, targeting the same Synopsys XGMAC IP.
>>
>> I am giving these examples because although the refactor adds work for
>> backporting, it reduces the maintenance since we would have fewer duplicated
>> drivers than we have today.
> 
> This sounds as if the code in question would only receive backports for a
> specific time (determined by HW lifecycle, maintenance life cycle and such).
> 
> So I wonder if this could be solved by not just renaming but additionally
> adding a symbolic link, such that the files in question seem to appear twice
> on the file system.
> Then backports ought to be applicable (hoping git-am doesn't choke on symlinks),
> and after a while, once there are no backports any more (due to life cycle
> reasons), remove the link?
> 
> This also sounds like the kind of problem that others have run into before;
> how did they solve it?

I am currently involved in the PCI host/ refactoring, which will cause a
total reorganization of the folder, because a new PCIe Endpoint is coming
from Texas Instruments. Bjorn (PCI Maintainer) is ok with it.
The network subsystem is a very busy one, with lots of activity, so I
understand David Miller's point, because he is already overloaded with work,
but we have to find a way to improve it and prepare for the future.

I think this is a hot topic that should be discussed, since it might hold back
the evolution of some subsystems.

Thanks,
Joao

> 
> Thanks,
> Stefan
> 
>>
>> Thanks,
>> Joao
>>
>>
>>>                Linus
>>>
>>


* Re: Git: new feature suggestion
  2017-01-19 18:39   ` Linus Torvalds
  2017-01-19 18:54     ` Joao Pinto
@ 2017-01-19 21:48     ` Jakub Narębski
  2017-01-20  0:26       ` Linus Torvalds
  1 sibling, 1 reply; 15+ messages in thread
From: Jakub Narębski @ 2017-01-19 21:48 UTC (permalink / raw)
  To: Linus Torvalds, Konstantin Khomoutov
  Cc: Joao Pinto, Git Mailing List, CARLOS.PALMINHA@synopsys.com

On 19.01.2017 at 19:39, Linus Torvalds wrote:
> On Wed, Jan 18, 2017 at 10:33 PM, Konstantin Khomoutov
> <kostix+git@007spb.ru> wrote:
>>
>> Still, I welcome you to read the sort-of "reference" post by Linus
>> Torvalds [1] in which he explains the reasoning behind this approach
>> implemented in Git.
> 
> It's worth noting that that discussion was from some _very_ early days
> in git (one week into the whole thing), when none of those
> visualization tools were actually implemented.
> 
> Even now, ten years after the fact, plain git doesn't actually do what
> I outlined. Yes, "git blame -Cw" works fairly well, and is in general
> better than the traditional per-file "annotate". And yes, "git log
> --follow" does another (small) part of the outlined thing, but is
> really not very powerful.

It is really a pity that "git log --follow" is so limited; its
development stopped at an early 'good enough' implementation.

For example "git log --follow gitweb/gitweb.perl" would not show
the whole history of a file (which was once an independent project),
and "git log --follow" doesn't work for directories or multiple
files.
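
To illustrate why following a file across renames is a heuristic rather
than a lookup: git stores no rename records, so a tool has to infer a
rename from how similar the deleted and the added content are. The sketch
below is not git's implementation. It uses Python's difflib as a stand-in
similarity measure, and `similarity`, `detect_renames` and the sample paths
are assumptions made for illustration. The 0.5 threshold loosely mirrors
the 50% default of git's -M option.

```python
from difflib import SequenceMatcher
from typing import Dict


def similarity(a: str, b: str) -> float:
    """Line-based similarity in [0, 1]; a stand-in for git's own scoring."""
    return SequenceMatcher(None, a.splitlines(), b.splitlines()).ratio()


def detect_renames(
    deleted: Dict[str, str],
    added: Dict[str, str],
    threshold: float = 0.5,
) -> Dict[str, str]:
    """Pair each deleted path with its most similar added path, if any."""
    renames: Dict[str, str] = {}
    unclaimed = dict(added)
    for old_path, old_text in deleted.items():
        # Pick the best remaining candidate; None when nothing is left.
        best = max(
            unclaimed,
            key=lambda p: similarity(old_text, unclaimed[p]),
            default=None,
        )
        if best is not None and similarity(old_text, unclaimed[best]) >= threshold:
            renames[old_path] = best
            del unclaimed[best]  # each added file explains one deletion
    return renames


if __name__ == "__main__":
    deleted = {"stmicro/stmmac/main.c": "a\nb\nc\nd\n"}
    added = {
        "dwc/stmmac/main.c": "a\nb\nc\ne\n",  # 3 of 4 lines unchanged
        "dwc/other.c": "x\ny\n",              # unrelated content
    }
    print(detect_renames(deleted, added))
```

A file that is renamed and heavily rewritten in the same commit falls
below the threshold and looks like an unrelated delete plus add, which is
one way a followed history can end early.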

> 
> Some tools on top of git do more, but I think in general this is an
> area that could easily be improved upon. For example, the whole
> iterative and interactive drilling down in history of a particular
> file is very inconvenient to do with "git blame" (you find a commit
> that changes the area in some way that you don't find interesting, so
> then you have to restart git blame with the parent of that
> uninteresting commit).
> 
> You can do it in tig, but I suspect a more graphical tool might be better.

Well, we do have "git gui blame".

[...]
-- 
Jakub Narębski

* Re: Git: new feature suggestion
  2017-01-19 21:48     ` Jakub Narębski
@ 2017-01-20  0:26       ` Linus Torvalds
  2017-01-20 11:18         ` Jakub Narębski
  0 siblings, 1 reply; 15+ messages in thread
From: Linus Torvalds @ 2017-01-20  0:26 UTC (permalink / raw)
  To: Jakub Narębski
  Cc: Konstantin Khomoutov, Joao Pinto, Git Mailing List,
	CARLOS.PALMINHA@synopsys.com

On Thu, Jan 19, 2017 at 1:48 PM, Jakub Narębski <jnareb@gmail.com> wrote:
> On 19.01.2017 at 19:39, Linus Torvalds wrote:
>>
>> You can do it in tig, but I suspect a more graphical tool might be better.
>
> Well, we do have "git gui blame".

Does that actually work for people? Because it really doesn't for me.

And I'm not just talking about the aesthetics of the thing, but the
whole experience, and the whole "dig into parent" which just gives me
an error message.

            Linus

* Re: Git: new feature suggestion
  2017-01-20  0:26       ` Linus Torvalds
@ 2017-01-20 11:18         ` Jakub Narębski
  0 siblings, 0 replies; 15+ messages in thread
From: Jakub Narębski @ 2017-01-20 11:18 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Konstantin Khomoutov, Joao Pinto, Git Mailing List,
	CARLOS.PALMINHA@synopsys.com

On 20.01.2017 at 01:26, Linus Torvalds wrote:
> On Thu, Jan 19, 2017 at 1:48 PM, Jakub Narębski <jnareb@gmail.com> wrote:
>> On 19.01.2017 at 19:39, Linus Torvalds wrote:
>>>
>>> You can do it in tig, but I suspect a more graphical tool might be better.
>>
>> Well, we do have "git gui blame".
> 
> Does that actually work for people? Because it really doesn't for me.
> 
> And I'm not just talking about the aesthetics of the thing, but the
> whole experience, and the whole "dig into parent" which just gives me
> an error message.

Strange. I had been using "git gui blame" _because_ of its "dig to parent"
functionality, and it worked for me just fine.

The other thing that I like about "git gui blame" is that it shows both
the commit that moved the fragment of code (via "git blame"), and the
commit that created the fragment of code (via "git blame -C -C -w", I think).

Anyway, all of this (sub)discussion is about archeology, but what might
be more important is automatic rename handling when integrating changes,
be it git-am, git-merge, or something else...
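
As a rough illustration of what "rename handling when integrating changes"
involves for a backport, the sketch below rewrites the file paths in a
unified diff from the post-rename location back to the pre-rename one. It
is an assumption-laden toy, not what git-am or git-cherry-pick do
internally: `retarget_patch` and the file `main.c` are hypothetical, and
only the header lines are touched. Even when the hunks apply, the result
is no longer the same patch that went into the newer tree, which is the
point made earlier in this thread.

```python
def retarget_patch(patch: str, new_prefix: str, old_prefix: str) -> str:
    """Rewrite diff header paths from new_prefix back to old_prefix.

    Only 'diff --git', '--- a/' and '+++ b/' header lines are changed;
    hunk bodies are passed through untouched. A removed line whose text
    happens to start with '-- a/' would be misread as a header, so this
    is a sketch rather than a robust parser.
    """
    headers = ("diff --git ", "--- a/", "+++ b/")
    out = []
    for line in patch.splitlines(keepends=True):
        if line.startswith(headers):
            line = line.replace(new_prefix, old_prefix)
        out.append(line)
    return "".join(out)


if __name__ == "__main__":
    patch = (
        "diff --git a/dwc/stmmac/main.c b/dwc/stmmac/main.c\n"
        "--- a/dwc/stmmac/main.c\n"
        "+++ b/dwc/stmmac/main.c\n"
        "@@ -1 +1 @@\n"
        "-int x;\n"
        "+int y;\n"
    )
    print(retarget_patch(patch, "dwc/stmmac/", "stmicro/stmmac/"), end="")
```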

-- 
Jakub Narębski

Thread overview: 15+ messages
2017-01-18 10:40 Git: new feature suggestion Joao Pinto
2017-01-18 18:50 ` Stefan Beller
2017-01-18 19:04   ` Joao Pinto
2017-01-19  6:33 ` Konstantin Khomoutov
2017-01-19 17:55   ` Joao Pinto
2017-01-19 18:17   ` Junio C Hamano
2017-01-19 18:39   ` Linus Torvalds
2017-01-19 18:54     ` Joao Pinto
2017-01-19 19:16       ` Linus Torvalds
2017-01-19 21:51         ` Joao Pinto
2017-01-19 22:03           ` Stefan Beller
2017-01-20 10:44             ` Joao Pinto
2017-01-19 21:48     ` Jakub Narębski
2017-01-20  0:26       ` Linus Torvalds
2017-01-20 11:18         ` Jakub Narębski
