git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Don Zickus <dzickus@redhat.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH] git-apply doesn't handle same name patches well [V3]
Date: Thu, 19 Jun 2008 15:15:16 -0700	[thread overview]
Message-ID: <7vk5glrs0b.fsf@gitster.siamese.dyndns.org> (raw)
In-Reply-To: <20080619213341.GP16941@redhat.com> (Don Zickus's message of "Thu, 19 Jun 2008 17:33:41 -0400")

Don Zickus <dzickus@redhat.com> writes:

> On Tue, Jun 17, 2008 at 05:42:54PM -0700, Junio C Hamano wrote:
>> Don Zickus <dzickus@redhat.com> writes:
> ...
> I was going to try to figure out a way to grab it from fn_cache but I
> wasn't sure how much of the 'lstat' info is needed later.

The usual case of one-diff-one-path patch application wants to make sure
that there is no discrepancy between the index and the work tree for the
path when working in --index mode.  When working in work-tree-only mode,
lstat just makes sure that the path to be patched actually exists (or
doesn't, if it is a creation patch).

When fn_cache is used, you pretend as if the resulting path exists and up
to date when found in fn_cache and the previous round succeeded, so you
can substitute the lstat (you cannot just lose it, but need to make sure
the path exists after applying the earlier fn_cache contents when handling
a later patch that wants to touch an existing file).  As you pretend that
the previous round succeeded, you do not have to check the up-to-dateness
between the index and work tree when dealing with a path that has previous
result in fn_cache, even when operating in --index mode.

>> or do you mean that the first patch rename-edits A to B, but the second
>> one still wants to edit A in place and you would want to pretend as if the
>> later one is for a patch to B?  I would think that is doable but asking
>> for too much magic, and a tool with too much magic is scary.
>
> Personally I think this case should be a failure.  I even attached a
> testcase in my patch to make sure this failed.  I wasn't comfortable doing
> this magic either.

Good.

>> There is a case where a normal git patch contains two separate patches to
>> the same file.  A typechange patch is always expressed as a deletion of
>> the old path immediately followed by a creation of the same path.  I have
>> to wonder why that codepath for handing that particular special case is
>> not changed in this patch.  Surely the mechanism you are adding is a
>> generalization that can cover such a case as well, isn't it?
>
> Heh.  Maybe, but I didn't know the code well enough to do that.  Pointers?

See the way "prev_patch" is used in check_patch.

>> > @@ -2176,6 +2184,38 @@ static int read_file_or_gitlink(struct cache_entry *ce, struct strbuf *buf)
>> >  	return 0;
>> >  }
>> >  
>> > +struct patch *in_fn_cache(char *name)
>> > +{
>> > +	struct path_list_item *item;
>> > +
>> > +	item = path_list_lookup(name, &fn_cache);
>> > +	if (item != NULL)
>> > +		return (struct patch *)item->util;
>> > +
>> > +	return NULL;
>> > +}
>> > +
>> > +void add_to_fn_cache(char *name, struct patch *patch)
>> > +{
>> > +	struct path_list_item *item;
>> > +
>> > +	/* Always add new_name unless patch is a deletion */
>> > +	if (name != NULL) {
>> > +		item = path_list_insert(name, &fn_cache);
>> > +		item->util = patch;
>> > +	}
>> > +
>> > +	/* skip normal diffs, creations and copies */
>> 
>> This comment is a "Huh?".
> I was just making a note about which cases I wanted to skip and which ones
> I wanted to process.  I can expand on it.  For example, patches that
> contain normal diffs, file creations and git copies or ignored as don't
> cares.  Only file deletions and git renames were interesting to me in the
> code below.

The function's purpose is to record what the expected state of the path
after the current patch has applied successfully, and

 * If it is not a deletion, you will record the postimage, so that a later
   patch can work from there, not from what is in the initial state;

 * If it is a deletion (or rename-away), you record that the path after
   this patch no longer exists, so that you can catch a later broken patch
   that tries to touch the path.

You have already handled "normal diff, creation and copy" in the first
part "if (name != NULL)", but you talk about it again here, which was the
"Huh?" inducing part.  It gave an impression that the part that follows
does something other than the above two.

>> > +	/*
>> > +	 * store a failure on rename/deletion cases because
>> > +	 * later chunks shouldn't patch old names
>> > +	 */
>> > +	if ((name == NULL) || (patch->is_rename)) {
>> > +		item = path_list_insert(patch->old_name, &fn_cache);
>> > +		item->util = (struct patch *) -1;

If you have a patch that does A->B (rename), C->A (rename), A->A (mod),
would your code handle that?

>> If you look at the patch->old_name _anyway_, why do you give a separate
>> name parameter to this function?  The function would be much easier to
>> read if you pass only patch, and use patch->new_name instead of the
>> separate name parameter.  Otherwise the reader needs to scroll down and
>> figure out that name is a new name by looking at the call site to
>> understand what is going on.
>
> Yeah, leftover code that was added when I ran into rename and copy
> problems.
>
>> 
>> > +	}
>> > +}
>> > +
>> >  static int apply_data(struct patch *patch, struct stat *st, struct cache_entry *ce)
>> >  {
>> >  	struct strbuf buf;
>> > @@ -2188,7 +2228,16 @@ static int apply_data(struct patch *patch, struct stat *st, struct cache_entry *
>> >  		if (read_file_or_gitlink(ce, &buf))
>> >  			return error("read of %s failed", patch->old_name);
>> >  	} else if (patch->old_name) {
>> > -		if (S_ISGITLINK(patch->old_mode)) {
>> > +		struct patch *tpatch = in_fn_cache(patch->old_name);
>> > +
>> > +		if (tpatch != NULL) {
>> > +			if (tpatch == (struct patch *) -1) {
>> > +				return error("patch %s has been renamed/deleted",
>> > +					patch->old_name);
>> > +			}
>> > +			/* We have a patched copy in memory use that */
>> > +			strbuf_add(&buf, tpatch->result, tpatch->resultsize);
>> > +		} else if (S_ISGITLINK(patch->old_mode)) {
>> 
>> Isn't this wrong?  Why can't this new enhancement be used while operating
>> only on the index?
>
> I don't know, can it?  You tell me.  I wasn't sure on the whole index
> thing worked.

Perhaps a "git-apply" primer might help.  The program can work in three
modes of operation.

 * normal mode: look at and operate only on work tree files.

 * --index mode: work on both the index and the work tree.  IOW, patch the
   files and immediately do "git add -u" on that path, so that even
   addition and deletion are recorded in the index.  In this case, the
   paths involved must be up-to-date between the work tree and the index
   when you start "git apply"; otherwise you will lose your local
   changes.

 * --cached mode: work only on the index and never look at nor touch the
   work tree.

Now, if you have a patch that has A->A (mod), followed by another A->A(mod),
is there a good reason why you allow it in the first two modes and not the
last one?  I do not think so.

      reply	other threads:[~2008-06-19 22:16 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-06-16 20:04 [PATCH] git-apply doesn't handle same name patches well [V3] Don Zickus
2008-06-16 20:27 ` Jakub Narebski
2008-06-17  9:40 ` Johannes Schindelin
2008-06-18  0:42 ` Junio C Hamano
2008-06-19 21:33   ` Don Zickus
2008-06-19 22:15     ` Junio C Hamano [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7vk5glrs0b.fsf@gitster.siamese.dyndns.org \
    --to=gitster@pobox.com \
    --cc=dzickus@redhat.com \
    --cc=git@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).