git@vger.kernel.org mailing list mirror (one of many)
* rationale behind git not tracking history of branches
@ 2020-05-26 21:01 Kevin Buchs
  2020-05-27  2:50 ` Jonathan Nieder
  2020-05-27 16:10 ` Randall S. Becker
  0 siblings, 2 replies; 5+ messages in thread
From: Kevin Buchs @ 2020-05-26 21:01 UTC (permalink / raw)
  To: git

For many years of using Git, I always struggled to make sense of
commit history graphs (git log --graph; gitk). Just recently I
discovered that git does not track the history of branches to which
commits belonged and the lightbulb turned on. This is proving to be
painful in a project I inherited with permanent multiple branches.
Now, I am a bit curious as to the rationale behind this intentional
decision not to track branch history. Is it entirely a matter of
keeping branches lightweight?

I am assuming one can backfill for the missing capability by using a
commit hook to manually track when a branch head is changed. Perhaps
by storing the branch in the commit notes.

Kevin Buchs, Senior Engineer, New Context Services
kevin.buchs@newcontext.com  507-251-7463  St. Cloud, MN


* Re: rationale behind git not tracking history of branches
  2020-05-26 21:01 rationale behind git not tracking history of branches Kevin Buchs
@ 2020-05-27  2:50 ` Jonathan Nieder
  2020-05-27 16:24   ` Kevin Buchs
  2020-05-27 16:10 ` Randall S. Becker
  1 sibling, 1 reply; 5+ messages in thread
From: Jonathan Nieder @ 2020-05-27  2:50 UTC (permalink / raw)
  To: Kevin Buchs; +Cc: git

Hi,

Kevin Buchs wrote:

> For many years of using Git, I always struggled to make sense of
> commit history graphs (git log --graph; gitk). Just recently I
> discovered that git does not track the history of branches to which
> commits belonged and the lightbulb turned on. This is proving to be
> painful in a project I inherited with permanent multiple branches.
> Now, I am a bit curious as to the rationale behind this intentional
> decision not to track branch history. Is it entirely a matter of
> keeping branches lightweight?
>
> I am assuming one can backfill for the missing capability by using a
> commit hook to manually track when a branch head is changed. Perhaps
> by storing the branch in the commit notes.

I think this comes down to a question of mental model: one thing I
value when using Git is that each commit does *not* belong to a
specific branch --- branches describe the shape of history, and
commits are points in that history.

This becomes particularly relevant when working with multiple
colleagues, sharing history between different servers: I may have a
branch I call "linus" that points to the same history that a colleague
called "master".

That said, I can understand how that may be difficult to get used to
coming from other version control systems (such as Subversion) in
which a revision does belong to a branch.

Can you say a little more about what aim you're trying to achieve when
you want to make this lookup?  For example:

- are you looking to figure out what the commit author was working
  on when they made the commit?  For that, the commit message is meant
  to provide context, and a commit hook like you describe can be a
  good way to enforce that (for example if you want every commit
  message to contain a bug number for context).

- are you looking to find out *when* a commit became part of a
  particular published branch?  It's true that Git doesn't provide a
  good way to do that today.  I have some hope that some best
  practices like discussed in [1][2] will coalesce for attesting to
  the history of a branch's state.

  If you always perform merges with --no-ff, then you can find some
  things out using the --first-parent history.  It is possible to
  enforce such practices using hooks, but this may be a lot of fuss
  for little gain, depending on the underlying need.

  I find "git log --ancestry-path" to be very useful for finding out
  *how* a commit became part of a particular published branch.

- or are you looking for this information for some other purpose?
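
As a rough illustration of the two options mentioned in the list above, the
following sketch builds a throwaway repository and runs both queries. The
branch names (main, topic) and commit subjects are made up for the example;
only the git options themselves come from the message.

```shell
#!/bin/sh
set -e
# Throwaway repository; every name below is illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"
git symbolic-ref HEAD refs/heads/main

git commit -q --allow-empty -m "base"
git checkout -q -b topic
git commit -q --allow-empty -m "topic 1"
topic1=$(git rev-parse HEAD)
git commit -q --allow-empty -m "topic 2"
git checkout -q main
# --no-ff forces a merge commit even though a fast-forward was possible.
git merge -q --no-ff -m "Merge topic" topic

# First-parent history: only what happened "on" main, merges included.
first_parent=$(git log --first-parent --format=%s main | paste -sd, -)
# Ancestry path: the chain of commits connecting "topic 1" to main.
path=$(git log --ancestry-path --format=%s "$topic1..main" | paste -sd, -)

echo "first-parent:  $first_parent"
echo "ancestry-path: $path"
```

With --no-ff, the first-parent view lists the merge and "base" but not the
individual topic commits, while the ancestry-path view shows how "topic 1"
reached main (through "topic 2" and the merge).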

Returning to your question: one reason that I find Git not recording
the current branch name to be liberating is that I am not great at
naming things.  I can use a placeholder name, knowing that I am its
only audience, without fear of embarrassment.

Thanks and hope that helps,
Jonathan

[1] https://git.eclipse.org/r/c/51128, describing refs/meta/push-certs
[2] https://lore.kernel.org/git/22945.15202.337224.529980@chiark.greenend.org.uk/


* RE: rationale behind git not tracking history of branches
  2020-05-26 21:01 rationale behind git not tracking history of branches Kevin Buchs
  2020-05-27  2:50 ` Jonathan Nieder
@ 2020-05-27 16:10 ` Randall S. Becker
  1 sibling, 0 replies; 5+ messages in thread
From: Randall S. Becker @ 2020-05-27 16:10 UTC (permalink / raw)
  To: 'Kevin Buchs', git

On May 26, 2020 5:01 PM, Kevin Buchs wrote:
> For many years of using Git, I always struggled to make sense of commit
> history graphs (git log --graph; gitk). Just recently I discovered that git does
> not track the history of branches to which commits belonged and the
> lightbulb turned on. This is proving to be painful in a project I inherited with
> permanent multiple branches.
> Now, I am a bit curious as to the rationale behind this intentional decision
> not to track branch history. Is it entirely a matter of keeping branches
> lightweight?
> 
> I am assuming one can backfill for the missing capability by using a commit
> hook to manually track when a branch head is changed. Perhaps by storing
> the branch in the commit notes.

Based on the distributed nature of git, the interpretation of the history of a branch can differ between local clones. The history only comes together as commits are merged on an upstream repository, so even in the upstream, the interpretation is potentially different from what the branch's interpretation is in someone's clone. The parent-child commit structure in the underlying Merkle tree is the only definitive interpretation and is a post-merge perspective independent of any interpretation of the branch itself.

The point-in-time interpretation of a branch is simply the HEAD where the branch pointer is located, but the point-in-time interpretation is ambiguous across multiple clones and even the upstream. In order to disambiguate the interpretation, the branch contents have to be moved to a common repository which only understands its branch HEAD post push, which may be disallowed without a merge depending on the current tree structure.

To track a branch's history across all domains, and make it meaningful, you could write a hook (commit, push, etc.) that understands what the current state of the branch is and where, and interpret the history based on your own project needs. The record would be something like a relation of tuples: {some-unique-repo-clone-identifier; branch; commit; date-and-time}, as the branch state (HEAD) is time-varying and can repeat its presence on a commit multiple times (git branch -f branch-name commit-ish). The "some-unique-repo-clone-identifier" is problematic, because there isn't one currently AFAIK.
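
A minimal sketch of such a hook, assuming a post-commit hook and a notes ref named "branch-origin" (both are choices made for this example, not anything git prescribes). It records only the branch and a timestamp; the clone identifier from the tuple is left out because, as noted, git does not provide one.

```shell
#!/bin/sh
set -e
# Throwaway repository; the notes ref name "branch-origin" is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"
git symbolic-ref HEAD refs/heads/dev

mkdir -p .git/hooks
cat > .git/hooks/post-commit <<'EOF'
#!/bin/sh
# Record the branch that was checked out when the commit was created.
# Does nothing on a detached HEAD, where there is no branch name.
branch=$(git symbolic-ref --quiet --short HEAD) || exit 0
stamp=$(date -u +%Y-%m-%dT%H:%M:%SZ)
git notes --ref=branch-origin add -f -m "branch=$branch date=$stamp" HEAD
EOF
chmod +x .git/hooks/post-commit

git commit -q --allow-empty -m "first change"

# Read the note back for the commit just made.
note=$(git notes --ref=branch-origin show HEAD)
echo "$note"
```

The note only says where the commit was created in this clone. It says nothing about branches the commit reaches later, notes are not pushed or fetched by default, and rewriting a commit (amend, rebase) leaves the note behind unless notes rewriting is configured.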

That's my interpretation anyway based on how I see things in today's git world - I might very well be wrong.

Regards,
Randall

-- Brief whoami:
 NonStop developer since approximately 211288444200000000
 UNIX developer since approximately 421664400
-- In my real life, I talk too much.





* Re: rationale behind git not tracking history of branches
  2020-05-27  2:50 ` Jonathan Nieder
@ 2020-05-27 16:24   ` Kevin Buchs
  2020-05-27 19:31     ` Michal Suchánek
  0 siblings, 1 reply; 5+ messages in thread
From: Kevin Buchs @ 2020-05-27 16:24 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git

Hi Jonathan,

Thanks for the reply. I will give you my current situation. I am just
taking over a project that many cooks were involved in previously. We
have three persistent branches: dev, staging and master which
correspond directly to three CD environments: dev, staging and prod.
The nominal commit history ought to be that all changes happen in the
dev branch, and that the latest dev is merged into staging and then to
master at appropriate milestones of testing. However, the history of
commit chains clearly shows that is not the case. Here is what gitk
shows me: https://1drv.ms/u/s!AgKA2HL-SveIha4Y_5lihkQO7ulfKQ?e=oA9PEi
.
Now, you can see that the nominal practice was not followed. Sure,
there are many commit messages to indicate merges and I could assume
those are correct and possibly reconstruct which branch each commit
might have belonged to. However, you can see there were a series of
changes to multiple commit chains, when there should have just been a
single chain - corresponding to the dev branch. How do I know there
were not changes that should be included in dev that were stranded?
Based on the overall "smell" of this project, I really don't believe I
should trust those commit messages. So, what would make my task a
whole lot easier would be if there were a display of columns
corresponding to the branches and any commits that were in multiple
branches, due to merge fast forwarding, would explicitly show up in
the corresponding columns.
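
Git has no such column display, but the question "which branches currently
contain this commit" can be asked directly. A small sketch, assuming a
throwaway repository with the three branch names from this message and an
invented commit subject:

```shell
#!/bin/sh
set -e
# Throwaway repository; the commit subjects are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"
git symbolic-ref HEAD refs/heads/dev

git commit -q --allow-empty -m "base"
git branch staging
git branch master
git commit -q --allow-empty -m "feature"
feature=$(git rev-parse HEAD)

# Simulate promoting dev to staging by fast-forward.
git branch -f staging dev

# List the local branches whose tip has the commit as an ancestor.
containing=$(git for-each-ref --contains "$feature" \
    --format='%(refname:short)' refs/heads | paste -sd, -)
echo "$containing"
```

Here the "feature" commit is reported for dev and staging but not master.
This reflects the branch tips as they are now, not where the commit was
originally made.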

Going forward, I believe I can make my own work much clearer for
others who might pick it up by having some way to sort out branch
history. I will probably use --no-ff to make history clearer. And
maybe notes to actually record to which branch a commit belongs.

I had been studying this Q&A:
https://stackoverflow.com/questions/2706797/finding-what-branch-a-git-commit-came-from
. I have come to realize the only item that is really useful there is
the first comment on the question. It says that the lack of recording
of branch history in git is "by design". I understand the point of
making branches lightweight, though I don't know that recording a
branch would make them much heavier. I wondered if there was another
motivation in the design of git. Knowing that might help me to use the
tool more effectively or adapt it to where I would like it to be.

Kevin Buchs, Senior Engineer, New Context Services
kevin.buchs@newcontext.com  507-251-7463  St. Cloud, MN





* Re: rationale behind git not tracking history of branches
  2020-05-27 16:24   ` Kevin Buchs
@ 2020-05-27 19:31     ` Michal Suchánek
  0 siblings, 0 replies; 5+ messages in thread
From: Michal Suchánek @ 2020-05-27 19:31 UTC (permalink / raw)
  To: Kevin Buchs; +Cc: git

On Wed, May 27, 2020 at 11:24:59AM -0500, Kevin Buchs wrote:
> Hi Jonathan,
> 
> Thanks for the reply. I will give you my current situation. I am just
> taking over a project that many cooks were involved in previously. We
> have three persistent branches: dev, staging and master which
> correspond directly to three CD environments: dev, staging and prod.
> The nominal commit history ought to be that all changes happen in the
> dev branch, and that the latest dev is merged into staging and then to
> master at appropriate milestones of testing. However, the history of
In this setup you should have merges on dev only. staging should be
behind dev and master behind staging, but any merge between these branches
should be fast-forward. Creating the merge commits would only add noise.

> commit chains clearly show that is not the case. Here is what gitk
> shows me: https://1drv.ms/u/s!AgKA2HL-SveIha4Y_5lihkQO7ulfKQ?e=oA9PEi
> .
> Now, you can see that the nominal practice was not followed. Sure,
> there are many commit messages to indicate merges and I could assume
> those are correct and possibly reconstruct which branch each commit
> might have belonged to. However, you can see there were a series of
> changes to multiple commit chains, when there should have just been a
> single chain - corresponding to the dev branch. How do I know there
> were not changes that should be included in dev that were stranded?
Find commits that are not ancestors of dev branch tip.
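
One way to run that check, sketched on a throwaway repository. The branch
names are the ones from the quoted message; the commit subjects are invented
for the example.

```shell
#!/bin/sh
set -e
# Throwaway repository; the commit subjects are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"
git symbolic-ref HEAD refs/heads/dev

git commit -q --allow-empty -m "base"
git branch staging
git branch master
git commit -q --allow-empty -m "dev work"

# A change made directly on master that never went through dev.
git checkout -q master
git commit -q --allow-empty -m "hotfix made directly on master"
git checkout -q dev

# Commits reachable from staging or master but not from the dev tip.
stranded=$(git log --format=%s staging master --not dev)
echo "$stranded"
```

Only the hotfix is listed. An empty result would mean every commit on
staging and master is already an ancestor of dev.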

But if you are taking over the project maybe just auditing the actual
difference between the branches might be easier. Then you can use git
blame to see how the pieces of code that differ entered the branch in
question.

Going from the diverging history to one where the branches are as
described, you can merge master to dev once to make all commits formally
ancestors of dev, preferably after examining and reconciling the
differences.
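
A rough sketch of that sequence (audit the difference, merge once, confirm
nothing is left stranded), using a throwaway repository. The file names and
commit subjects are hypothetical.

```shell
#!/bin/sh
set -e
# Throwaway repository; file names and subjects are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"
git symbolic-ref HEAD refs/heads/dev

echo "v1" > app.txt
git add app.txt
git commit -q -m "base"
git branch master
echo "v2" > app.txt
git commit -q -a -m "dev work"

git checkout -q master
echo "fix" > hotfix.txt
git add hotfix.txt
git commit -q -m "hotfix on master"
git checkout -q dev

# Audit: which files differ between the two branch tips?
changed=$(git diff --name-only dev master | paste -sd, -)

# Merge master into dev once so its commits become ancestors of dev.
git merge -q -m "Merge master into dev" master

# Count commits on master that are still not reachable from dev.
remaining=$(git rev-list --count master --not dev)
echo "changed: $changed, remaining: $remaining"
```

After the merge the count drops to zero. In a real repository the diff would
be reviewed (and "git blame" consulted on the differing lines) before
merging.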

HTH

Michal

