git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Junio C Hamano <junkio@cox.net>
To: Carl Worth <cworth@cworth.org>
Cc: git@vger.kernel.org
Subject: Re: [RFC] Two conceptually distinct commit commands
Date: Tue, 05 Dec 2006 17:13:20 -0800	[thread overview]
Message-ID: <7vejrdbzdb.fsf@assigned-by-dhcp.cox.net> (raw)
In-Reply-To: <87d56z32e1.wl%cworth@cworth.org> (Carl Worth's message of "Mon, 04 Dec 2006 11:08:22 -0800")

Carl Worth <cworth@cworth.org> writes:

>     ... The case I would use it for is fairly common,
>     (and something that I think will speak to Junio who often brings
>     up a similar scenario).
>
>     Here's where I would like this functionality:
>
> 	I receive a patch while I'm in the middle of doing other work,
> 	(but with a clean index compared to HEAD, which is what I've
> 	usually). The patch looks good, so I want to commit it right
> 	away, but I do want to separate it into two or more pieces,
> 	(commonly this is because I want to separate the "add a test
> 	case demonstrating a bug" part from the "fix the bug"
> 	part).

(This is offtopic)

I often faced situations like that during git.git history.  One
patch to expose the bug in the existing code, and another to fix
it.  And there are three ways to make that commit.

 (1) one commit exposes, then another fixes.
 (2) one commit fixes, then another verifies the bug is no more.
 (3) one commit to include both.

In my experience, (1) is only useful during the time I am coming
up with the fix (if I am fixing it myself) or during the time I
am reviewing and committing the fix (if I am applying somebody
else's patch).  Committing in that order lets me validate the
brokenness after making the first commit, and then lets me feel
good by not seeing that problem after the second commit.  But
this means I deliberately record a state that is known not to
pass the test, which means it is a problem for somebody else in
the future when the history needs to be bisected to hunt for an
unrelated bug.  If the "test" is just an optional test in the
test suite, then it is easy to work around (the person who is
bisecting can ignore that bug by not running that particular
test), but if it is an assert somewhere deep inside the code,
ignoring it is not very easy, especially if the person who is
bisecting is not familiar with that part of the code.

What I recommend people to do these days is either (2) or (3),
but do so _after_ verifying the fix in the reverse order.  The
criteria to choose between (2) or (3) is fairly simple: if the
"test" is easily separable (e.g. changes to a test script file
that does not overlap with the "fix" patch), roll both in one
commit.  Then it would not later cause problems for bisection.

Enough of offtopic.

The sequence to split a patch in place would be (I'll speak in
the present tense and pretend Nico's "git add" does not exist
yet):

	git apply
        git update-index <files for the first batch>
        git commit
        git commit -a ;# the remainder

so you do not necessarily need a new "concept".  It is
inconvenient that you need to deal with new files, but that is a
minor detail compared to a bigger problem I'll mention in the
next paragraph.

I think the problem with your thinking is that you still are
talking from "file boundary matters" point of view.  The above
sequence is only useful if the patch to be split is separable
cleanly at the file boundary, in which case, I would (and I've
done so often) split the patch file in my editor and run two
independent "git apply + git commit" sequences.  That way, I
could test each in isolation (perhaps with some dirty working
tree state if I do not stash them away and do reset --hard).
Anything more realistic and practically useful would require
splitting of the patch in semantic ways regardless of file
boundaries.

As I have already said (and you seemed to share the same
discipline), I do not like people committing anything
non-trivial that is not tested.  The patch you received might
not have been tested by the submitter, but there is a chance
that it might have been ;-).  But with the way you said you want
to make the commits in the message I am responding to, the first
commit would never have been tested by anybody in isolation, not
by the original submitter even if he tested the patch before
giving it to you, nor you -- your working tree had either none
of his patch or all of it, and never was in the state with only
the first batch.

So while at the theoretical level I understand what you would
want to achieve with the "single patch that should have been
sent as two patch series" example, from the practical point of
view I do not see much value in it (because "file boundary
matters" is a minority case that is not very interesting), and
from the discipline point of view I would rather not want to
have such a too-convenient way to commit things that were
different in nontrivial ways from what you had in your working
tree (if we use something like Darcs record to update
hunk-by-hunk).

>     The old behavior is still available with the --include option, but
>     nobody has ever come out in favor of that being a useful command,

That is a slight overstatement.  When committing with paths
argument to conclude a merge, --include semantics is the only
variant that makes sense (--only semantics is so wrong that
there is a safety valve that catches it).  Most of the time,
however, if I need to resolve and record a complicated merge, I
would either do "update-index" to clear the deck of the paths
that I already dealt with, and by the time I would type "git
commit", I have an index that has exactly what I want in the
merge commit.  That makes --include a less often used form.  If
a merge is small and easy to resolve at only a few paths, it
still is handy to say "git commit -i resolved-path.c".  It does
not add anything to the semantics -- it is only a typesaver.


  parent reply	other threads:[~2006-12-06  1:13 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-12-04 19:08 [RFC] Two conceptually distinct commit commands Carl Worth
2006-12-04 20:10 ` Carl Worth
2006-12-04 21:19   ` Jakub Narebski
2006-12-05  2:36     ` Carl Worth
2006-12-05  0:52 ` Horst H. von Brand
2006-12-05  1:18   ` Carl Worth
2006-12-05  2:14     ` Horst H. von Brand
2006-12-05  2:32       ` Carl Worth
2006-12-05  1:19   ` Jakub Narebski
2006-12-05  3:51 ` Theodore Tso
2006-12-05  6:33   ` Junio C Hamano
2006-12-05  6:38   ` Carl Worth
2006-12-06  1:13 ` Junio C Hamano [this message]
2006-12-06  4:53   ` Carl Worth
2006-12-06  9:54     ` Commit order in git.git, was " Johannes Schindelin
2006-12-06 16:14     ` Carl Worth
2006-12-06 18:31     ` Junio C Hamano
2006-12-06 23:29       ` Carl Worth

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7vejrdbzdb.fsf@assigned-by-dhcp.cox.net \
    --to=junkio@cox.net \
    --cc=cworth@cworth.org \
    --cc=git@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).