Feature request: add a metadata in the commit: the "commited in branch" information

git@vger.kernel.org mailing list mirror (one of many)

* Feature request: add a metadata in the commit: the "commited in branch" information
@ 2019-12-23 12:56 Arnaud Bertrand
  2019-12-29 23:17 ` Junio C Hamano
  0 siblings, 1 reply; 6+ messages in thread
From: Arnaud Bertrand @ 2019-12-23 12:56 UTC
  To: git

Hello,

Git is a nice tool, but one of the most important pieces of missing
information is the branch on which a commit was made.
I understand that in the Git philosophy, once it is merged, a branch can
disappear. But for a lot of companies, an SCM is also a guardian of the
history.
From this point of view, keeping track of the branch name at the time
of the commit would be a very big improvement (and a major argument
for switching to Git).
I am talking only about a piece of metadata, exactly like the
committer's username, email or date, no more.
If the branch is later removed or renamed, so what? We would at least
have its name at the time of the commit (better than nothing).

Today, all my Git repositories use hooks to add the name of the
branch as a header of the commit message. But it would be so much better
to have it officially and automatically, accessible as git log metadata.
It does not impose any constraints, just a few more characters in the commit.
We could also imagine a core.branchInCommit parameter (true by default
;-) ) that could be set to false by those who don't want it.
The only commands affected should be git commit, git merge --no-ff and
git log, which should be able to show this metadata.
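
The hook-based workaround mentioned above could look roughly like the
following prepare-commit-msg hook. This is only a sketch under stated
assumptions: the "[branch] " prefix format and the function names
add_branch_header and current_branch are invented for illustration and
are not the hooks actually used in these repositories.

```python
#!/usr/bin/env python3
"""Sketch of a prepare-commit-msg hook that records the current branch name."""
import subprocess
import sys


def add_branch_header(message: str, branch: str) -> str:
    """Prefix the message with "[branch] " unless it is already there.

    The "[branch] " format is an assumption made for this example.
    """
    header = f"[{branch}] "
    if not branch or message.startswith(header):
        return message
    return header + message


def current_branch() -> str:
    """Return the checked-out branch name, or "" on a detached HEAD."""
    result = subprocess.run(
        ["git", "symbolic-ref", "--quiet", "--short", "HEAD"],
        capture_output=True,
        text=True,
    )
    return result.stdout.strip() if result.returncode == 0 else ""


def main(argv):
    # Git passes the path of the commit message file as the first argument.
    msg_path = argv[1]
    with open(msg_path, encoding="utf-8") as fh:
        message = fh.read()
    with open(msg_path, "w", encoding="utf-8") as fh:
        fh.write(add_branch_header(message, current_branch()))


if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv)
```

Such a file would have to be installed as .git/hooks/prepare-commit-msg
and made executable in every clone, which is exactly the per-repository
setup this request would like to avoid.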

Best regards,

Arnaud Bertrand


* Re: Feature request: add a metadata in the commit: the "commited in branch" information
  2019-12-23 12:56 Feature request: add a metadata in the commit: the "commited in branch" information Arnaud Bertrand
@ 2019-12-29 23:17 ` Junio C Hamano
  2019-12-29 23:53   ` Arnaud Bertrand
  0 siblings, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2019-12-29 23:17 UTC
  To: Arnaud Bertrand; +Cc: git

Arnaud Bertrand <xda@abalgo.com> writes:

> I understand that in the Git philosophy, once it is merged, a branch can
> disappear. But for a lot of companies, an SCM is also a guardian of the
> history.

A lot more important point than "once it is merged" is that the
branch identity is strictly local to your repository.  Contaminating
the object header, which is cast in stone and cannot be modified
after the fact, with such a piece of information will not mix well
with the rest of Git, so ...


* Re: Feature request: add a metadata in the commit: the "commited in branch" information
  2019-12-29 23:17 ` Junio C Hamano
@ 2019-12-29 23:53   ` Arnaud Bertrand
  2019-12-30  4:15     ` Theodore Y. Ts'o
  0 siblings, 1 reply; 6+ messages in thread
From: Arnaud Bertrand @ 2019-12-29 23:53 UTC
  To: Junio C Hamano; +Cc: Arnaud Bertrand, git

Hi Junio,

It really depends on how Git is used. With big collaborative projects
(like Git or the Linux kernel), you are totally right.
For development within a company that has teams of 10-20 developers
and follows a proper SCM plan, the name of the branch is regulated and
meaningful, and mostly linked to a bug tracking system. For audits and
traceability, the branch name is really important... certainly more
than the email of the developer ;-)
So the "contamination" is negligible compared to the benefit in this context.
It will also help graphical tools to produce a comprehensive
representation, which can make Git even better.

If you think it is a bad idea to have it by default, what about an
option to activate this functionality? Today, with the patch I've
written, it is not a problem if there is no branch name in the commit.
The only thing is the "%Xb" placeholder.

I would like your advice about the name. I have called the metadata
"branch" but, even though it really is the name of the branch, I
think it is too "hard". I would prefer "BranchOfCommit" or something
similar that is more explicit, although I think that one is too heavy.
Do you have other suggestions?

Thanks for your feedback.


* Re: Feature request: add a metadata in the commit: the "commited in branch" information
  2019-12-29 23:53   ` Arnaud Bertrand
@ 2019-12-30  4:15     ` Theodore Y. Ts'o
  2019-12-30 11:59       ` Arnaud Bertrand
  0 siblings, 1 reply; 6+ messages in thread
From: Theodore Y. Ts'o @ 2019-12-30  4:15 UTC
  To: Arnaud Bertrand; +Cc: Junio C Hamano, git

On Mon, Dec 30, 2019 at 12:53:56AM +0100, Arnaud Bertrand wrote:
> Hi Junio,
> 
> It really depends on how Git is used. With big collaborative projects
> (like Git or the Linux kernel), you are totally right.
> For development within a company that has teams of 10-20 developers
> and follows a proper SCM plan, the name of the branch is regulated and
> meaningful, and mostly linked to a bug tracking system. For audits and
> traceability, the branch name is really important... certainly more
> than the email of the developer ;-)
> So the "contamination" is negligible compared to the benefit in this context.
> It will also help graphical tools to produce a comprehensive
> representation, which can make Git even better.

Why does it need to be the branch name?  You can add your own extra
metadata to the git description.  So for example, I might have a git
commit that looks like this:

    ext4: avoid declaring fs inconsistent due to invalid file handles

    If we receive a file handle, either from NFS or open_by_handle_at(2),
    and it points at an inode which has not been initialized, and the file
    system has metadata checksums enabled, we shouldn't try to get the
    inode, discover the checksum is invalid, and then declare the file
    system as being inconsistent.

    ... <details of repro omitted to keep this email short>

    Google-Bug-Id: 120690101
    Upstream-5.0-SHA1: 8a363970d1dc38c4ec4ad575c862f776f468d057
    Tested: used the repro to verify that open_by_handle_at(2)
       will not declare the fs inconsistent
    Effort: storage/ext4       
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    Change-Id: Iafb6da7c360a4c34b882f7fd6a91e3bb

The tie-in to the bug tracking system is done via "Google-Bug-Id:".
The Effort tag is used to identify which subteam should be responsible
for rebasing the commit to a newer upstream kernel.  (E.g., how to
account for all of the patches made on top of 4.14.x when you are
rebasing to the newer 4.19 long-term-stable kernel, to make sure all
not-yet-upstreamed commits are carried over during the rebase
process.)

The Upstream-X.Y-SHA1: tag indicates that this is an upstream commit
that was backported to the internal kernel.  If the commit isn't an
upstream backport, we have a policy (which is enforced via an
automated bot when the commit goes through Gerrit for review) that
there must be an "Upstream-Plan: " tag indicating how the committer
plans to get the change upstream.

The automated review bot also enforces that a Tested: tag exists,
describing how the developer tested the change, and Change-Id: is used
to link the commit to Gerrit, which is how we enforce that all commits
have to be reviewed by a second engineer before they are allowed into
the production kernel sources.  We also maintain all of the Gerrit
comments as history and so we can have accountability as to who
reviewed a commit before it was submitted into the release repository.

We also have automated bots which will run checkpatch and note the
warnings from checkpatch as Gerrit comments; and if the kernel doesn't
build on a variety of architectures and configurations (e.g., debug,
installer, etc.) a bot can also report this and add a -1 Gerrit review.

See?  You can do an awful lot without regulating and recording the
branch name used by the engineer.  We have full audit and traceability
through the Gerrit reviews, and we can use the Google-Bug-Id to track
which release versions of which kernels have which bugs fixed.

The bottom line is each company is going to have a different workflow
for doing reviews, linkage to bug tracking systems, auditability, etc.
If everybody were to demand that their unique scheme be supported
directly in Git, it would be a mess.  The scheme that I've described
above needs no special git features.  It just uses some git hooks as a
convenience to developers to help them fill in these required
fields, using Gerrit for commit review, and some bots which submit
reviews to Gerrit.
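
As a rough illustration of the kind of check such a hook or review bot
might perform, here is a minimal sketch that looks for the tags named
above in the last paragraph of a commit message. It is not the actual
bot: the parsing is deliberately simpler than Git's own trailer
handling, and REQUIRED_TRAILERS, parse_trailers and policy_violations
are hypothetical names chosen for this example.

```python
import re

# Hypothetical policy modeled on the tags described above.
REQUIRED_TRAILERS = ("Tested", "Change-Id")
TRAILER_RE = re.compile(r"^([A-Za-z0-9][A-Za-z0-9.-]*):\s*(.*)$")
UPSTREAM_RE = re.compile(r"^Upstream-(Plan|.+-SHA1)$")


def parse_trailers(message):
    """Collect "Key: value" lines from the last paragraph of a message.

    Indented lines are treated as continuations of the previous value.
    """
    paragraphs = [p for p in re.split(r"\n\s*\n", message.strip()) if p.strip()]
    trailers = {}
    if len(paragraphs) < 2:
        return trailers  # subject line only, so there is no trailer block
    key = None
    for line in paragraphs[-1].splitlines():
        match = TRAILER_RE.match(line)
        if match:
            key = match.group(1)
            trailers[key] = match.group(2).strip()
        elif key is not None and line[:1].isspace():
            trailers[key] = (trailers[key] + " " + line.strip()).strip()
    return trailers


def policy_violations(message):
    """Return a list of human-readable problems; empty means acceptable."""
    trailers = parse_trailers(message)
    problems = [f"missing {key}:" for key in REQUIRED_TRAILERS if not trailers.get(key)]
    if not any(UPSTREAM_RE.match(key) for key in trailers):
        problems.append("missing Upstream-X.Y-SHA1: or Upstream-Plan:")
    return problems
```

A commit-msg hook could call policy_violations() on the message file and
exit non-zero when the list is not empty; the same function could back a
bot that posts the problems as review comments.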

Cheers,

						- Ted


* Re: Feature request: add a metadata in the commit: the "commited in branch" information
  2019-12-30  4:15     ` Theodore Y. Ts'o
@ 2019-12-30 11:59       ` Arnaud Bertrand
  2019-12-30 15:15         ` Paul Smith
  0 siblings, 1 reply; 6+ messages in thread
From: Arnaud Bertrand @ 2019-12-30 11:59 UTC
  To: Theodore Y. Ts'o; +Cc: Arnaud Bertrand, Junio C Hamano, git

On Mon, Dec 30, 2019 at 05:15, Theodore Y. Ts'o <tytso@mit.edu> wrote:
>
> Why does it need to be the branch name?  You can add your own extra
> metadata to the git description.

That's exactly my problem.  It is not possible "by default". I mean:
- It is only possible if the developer configures something
- or if there is an upper layer that guarantees this
By default, no hook is shipped with the clone. So, as far as I
know (and I hope I'm wrong!), you have to use upper-layer tools or
change environment variables to activate this feature. Furthermore, it
will never be used by third-party tools to beautify the branch
representation.

I think that the major problem is that Git calls something a "branch"
when it is not one. Git branches are, in reality, "dynamic tags". In
other words, when you are on such a tag and you perform a commit, the
dynamic tag moves with your commit. So it is not really a branch as
ClearCase, Mercurial, SVN or CVS define it.
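
The "dynamic tag" description can be illustrated with a toy model. This
is a deliberately simplified sketch and not Git's actual implementation;
the Repo class and its methods are invented for this example.

```python
class Repo:
    """Toy model: commits know their parent, branches are movable names."""

    def __init__(self):
        self.parents = {}     # commit id -> parent commit id (None for a root)
        self.refs = {}        # branch name -> commit id the name points at
        self.head = "master"  # name of the checked-out branch
        self._counter = 0

    def commit(self):
        """Create a commit and move the checked-out branch name onto it."""
        self._counter += 1
        commit_id = f"c{self._counter}"
        # The commit records its parent, but not the branch it was made on.
        self.parents[commit_id] = self.refs.get(self.head)
        self.refs[self.head] = commit_id
        return commit_id

    def branch(self, name):
        """Create a new name pointing at the current commit."""
        self.refs[name] = self.refs[self.head]

    def checkout(self, name):
        self.head = name

    def delete_branch(self, name):
        """Drop the name; the commits themselves are untouched."""
        del self.refs[name]


repo = Repo()
first = repo.commit()
repo.branch("topic")
repo.checkout("topic")
second = repo.commit()      # only the "topic" name moves
repo.delete_branch("topic")
# The commit "second" still exists with "first" as its parent, but
# nothing in the model remembers that it was made on "topic".
```

In this model the branch name lives only in the refs table, which is why
deleting or renaming it leaves no trace in the commits.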

You have convinced me that Gerrit is a very nice tool that complements Git  ;-)
However, one of the main points of Git is that it is easy to set up
(once the tool is installed, it takes one second to set up a new
repository and track files).
So, I certainly don't want to weaken this strong point of Git!

> The bottom line is each company is going to have a different workflow
> for doing reviews, linkage to bug tracking systems, auditability, etc.
> If everybody were to demand that their unique scheme be supported
> directly in Git, it would be a mess.

Including the name of the branch is not trivial or fanciful; it is
important in most workflows.
For example, see this article:
https://nvie.com/posts/a-successful-git-branching-model/
The branch name is fundamental there, and the pictures shown in the
article can never be reproduced without keeping track of the branch name.
Git made the notion of a branch lightweight, and therefore easier to
use, which is a good thing. But why forget the branch history,
especially if it is only a light piece of metadata in the commit?
Don't you agree that the name of the branch where modifications were
made could be really valuable information?
Don't you agree that a representation like the one in the article above
is clearer than the standard Git representation?

I think that adding the branch name as "weak" metadata, invisible except
through a dedicated request such as a specific placeholder in git log,
would add real value compared to the inconvenience of its absence.

Cheers,

Arnaud


* Re: Feature request: add a metadata in the commit: the "commited in branch" information
  2019-12-30 11:59       ` Arnaud Bertrand
@ 2019-12-30 15:15         ` Paul Smith
  0 siblings, 0 replies; 6+ messages in thread
From: Paul Smith @ 2019-12-30 15:15 UTC
  To: Arnaud Bertrand; +Cc: git

On Mon, 2019-12-30 at 12:59 +0100, Arnaud Bertrand wrote:
> > Why does it need to be the branch name?  You can add your own extra
> > metadata to the git description.
> 
> That's exactly my problem.  It is not possible "by default". I mean:
> - It is only possible if the developer configures something
> - or if there is an upper layer that guarantees this
> By default, no hook is shipped with the clone. So, as far as I
> know (and I hope I'm wrong!), you have to use upper-layer tools or
> change environment variables to activate this feature.

In general I have found that trying to mandate what users do in their own
repositories on their own systems is a losing proposition.

Instead, we put requirements on what content is pushed to the central
repository.  Because the central repository is managed by the SCM admin
team we always know only properly-constructed commits can appear there,
without having to assume that every individual developer's local
environment has been set up in a specific way.

This can be done with hooks in the central repository: there are Git hooks
that are run before any push is accepted, which can cause the push to be
rejected, and hooks that are run after a push is accepted, which can be
used for triggering other operations.

So if you have a requirement about contents of Git commit message format,
for example, you can enforce that via these hooks.  If someone attempts to
push commits to the central repository and the commit message has an
incorrect format then the push is rejected and they'll have to fix it
before they can proceed to push.

In the environments I've been associated with we don't care about branch
names; instead everything is based on bug tracker identifiers.  Every
commit needs to be associated with a valid bug ID (added to the commit
message) and the pre-push hook verifies this and rejects the commit if not.
Then after the push is accepted, post-push hooks will update the bug
tracker with information about the push (SHA, software version, etc.).  This
ensures that development and management can use the bug tracker as their
primary planning tool to know what has been accomplished and what is left
to accomplish.  Since the commit message is persisted through cherry-picks, 
etc. it allows us to know which bugs were fixed in which different patch
release branches as well.
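
A server-side check of this kind could be sketched as follows. The
sketch assumes Git's pre-receive hook, which runs in the central
repository before any ref is updated and receives one
"<old> <new> <ref>" line per pushed ref on standard input. The
"Bug-Id:" trailer name and the helper names are assumptions made for
illustration, not the setup described above.

```python
import re
import subprocess

# Hypothetical trailer name; a real site would match its own tracker's format.
BUG_ID_RE = re.compile(r"^Bug-Id:\s*\d+\s*$", re.MULTILINE)
# Value Git sends in place of a commit for a created or deleted ref
# (40 zeros in SHA-1 repositories).
ZERO = "0" * 40


def has_bug_id(message):
    """True if the commit message carries a Bug-Id line."""
    return BUG_ID_RE.search(message) is not None


def git(*args):
    """Run a git command in the current repository and return its output."""
    return subprocess.run(
        ["git", *args], check=True, capture_output=True, text=True
    ).stdout


def pushed_commits(old, new):
    """List the commits a ref update would introduce."""
    if new == ZERO:
        return []  # the ref is being deleted, nothing to check
    if old == ZERO:
        # New ref: check everything not already reachable from existing refs.
        return git("rev-list", new, "--not", "--all").split()
    return git("rev-list", f"{old}..{new}").split()


def rejected_commits(stdin_lines):
    """Return the commits that lack a bug ID, given the hook's input lines."""
    bad = []
    for line in stdin_lines:
        old, new, _ref = line.split()
        for sha in pushed_commits(old, new):
            if not has_bug_id(git("log", "-1", "--format=%B", sha)):
                bad.append(sha)
    return bad
```

A pre-receive script would call rejected_commits(sys.stdin), print the
offending commits, and exit with a non-zero status to refuse the push.
Updating the bug tracker afterwards would belong in a post-receive hook.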
