From: Junio C Hamano <gitster@pobox.com>
To: Don Zickus <dzickus@redhat.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH] git-apply doesn't handle same name patches well [V3]
Date: Thu, 19 Jun 2008 15:15:16 -0700 [thread overview]
Message-ID: <7vk5glrs0b.fsf@gitster.siamese.dyndns.org> (raw)
In-Reply-To: <20080619213341.GP16941@redhat.com> (Don Zickus's message of "Thu, 19 Jun 2008 17:33:41 -0400")
Don Zickus <dzickus@redhat.com> writes:
> On Tue, Jun 17, 2008 at 05:42:54PM -0700, Junio C Hamano wrote:
>> Don Zickus <dzickus@redhat.com> writes:
> ...
> I was going to try to figure out a way to grab it from fn_cache but I
> wasn't sure how much of the 'lstat' info is needed later.
The usual case of one-diff-one-path patch application wants to make sure
that there is no discrepancy between the index and the work tree for the
path when working in --index mode. When working in work-tree-only mode,
lstat just makes sure that the path to be patched actually exists (or
doesn't, if it is a creation patch).
When fn_cache is used, you pretend as if the resulting path exists and up
to date when found in fn_cache and the previous round succeeded, so you
can substitute the lstat (you cannot just lose it, but need to make sure
the path exists after applying the earlier fn_cache contents when handling
a later patch that wants to touch an existing file). As you pretend that
the previous round succeeded, you do not have to check the up-to-dateness
between the index and work tree when dealing with a path that has previous
result in fn_cache, even when operating in --index mode.
>> or do you mean that the first patch rename-edits A to B, but the second
>> one still wants to edit A in place and you would want to pretend as if the
>> later one is for a patch to B? I would think that is doable but asking
>> for too much magic, and a tool with too much magic is scary.
>
> Personally I think this case should be a failure. I even attached a
> testcase in my patch to make sure this failed. I wasn't comfortable doing
> this magic either.
Good.
>> There is a case where a normal git patch contains two separate patches to
>> the same file. A typechange patch is always expressed as a deletion of
>> the old path immediately followed by a creation of the same path. I have
>> to wonder why that codepath for handing that particular special case is
>> not changed in this patch. Surely the mechanism you are adding is a
>> generalization that can cover such a case as well, isn't it?
>
> Heh. Maybe, but I didn't know the code well enough to do that. Pointers?
See the way "prev_patch" is used in check_patch.
>> > @@ -2176,6 +2184,38 @@ static int read_file_or_gitlink(struct cache_entry *ce, struct strbuf *buf)
>> > return 0;
>> > }
>> >
>> > +struct patch *in_fn_cache(char *name)
>> > +{
>> > + struct path_list_item *item;
>> > +
>> > + item = path_list_lookup(name, &fn_cache);
>> > + if (item != NULL)
>> > + return (struct patch *)item->util;
>> > +
>> > + return NULL;
>> > +}
>> > +
>> > +void add_to_fn_cache(char *name, struct patch *patch)
>> > +{
>> > + struct path_list_item *item;
>> > +
>> > + /* Always add new_name unless patch is a deletion */
>> > + if (name != NULL) {
>> > + item = path_list_insert(name, &fn_cache);
>> > + item->util = patch;
>> > + }
>> > +
>> > + /* skip normal diffs, creations and copies */
>>
>> This comment is a "Huh?".
> I was just making a note about which cases I wanted to skip and which ones
> I wanted to process. I can expand on it. For example, patches that
> contain normal diffs, file creations and git copies or ignored as don't
> cares. Only file deletions and git renames were interesting to me in the
> code below.
The function's purpose is to record what the expected state of the path
after the current patch has applied successfully, and
* If it is not a deletion, you will record the postimage, so that a later
patch can work from there, not from what is in the initial state;
* If it is a deletion (or rename-away), you record that the path after
this patch no longer exists, so that you can catch a later broken patch
that tries to touch the path.
You have already handled "normal diff, creation and copy" in the first
part "if (name != NULL)", but you talk about it again here, which was the
"Huh?" inducing part. It gave an impression that the part that follows
does something other than the above two.
>> > + /*
>> > + * store a failure on rename/deletion cases because
>> > + * later chunks shouldn't patch old names
>> > + */
>> > + if ((name == NULL) || (patch->is_rename)) {
>> > + item = path_list_insert(patch->old_name, &fn_cache);
>> > + item->util = (struct patch *) -1;
If you have a patch that does A->B (rename), C->A (rename), A->A (mod),
would your code handle that?
>> If you look at the patch->old_name _anyway_, why do you give a separate
>> name parameter to this function? The function would be much easier to
>> read if you pass only patch, and use patch->new_name instead of the
>> separate name parameter. Otherwise the reader needs to scroll down and
>> figure out that name is a new name by looking at the call site to
>> understand what is going on.
>
> Yeah, leftover code that was added when I ran into rename and copy
> problems.
>
>>
>> > + }
>> > +}
>> > +
>> > static int apply_data(struct patch *patch, struct stat *st, struct cache_entry *ce)
>> > {
>> > struct strbuf buf;
>> > @@ -2188,7 +2228,16 @@ static int apply_data(struct patch *patch, struct stat *st, struct cache_entry *
>> > if (read_file_or_gitlink(ce, &buf))
>> > return error("read of %s failed", patch->old_name);
>> > } else if (patch->old_name) {
>> > - if (S_ISGITLINK(patch->old_mode)) {
>> > + struct patch *tpatch = in_fn_cache(patch->old_name);
>> > +
>> > + if (tpatch != NULL) {
>> > + if (tpatch == (struct patch *) -1) {
>> > + return error("patch %s has been renamed/deleted",
>> > + patch->old_name);
>> > + }
>> > + /* We have a patched copy in memory use that */
>> > + strbuf_add(&buf, tpatch->result, tpatch->resultsize);
>> > + } else if (S_ISGITLINK(patch->old_mode)) {
>>
>> Isn't this wrong? Why can't this new enhancement be used while operating
>> only on the index?
>
> I don't know, can it? You tell me. I wasn't sure on the whole index
> thing worked.
Perhaps a "git-apply" primer might help. The program can work in three
modes of operation.
* normal mode: look at and operate only on work tree files.
* --index mode: work on both the index and the work tree. IOW, patch the
files and immediately do "git add -u" on that path, so that even
addition and deletion are recorded in the index. In this case, the
paths involved must be up-to-date between the work tree and the index
when you start "git apply"; otherwise you will lose your local
changes.
* --cached mode: work only on the index and never look at nor touch the
work tree.
Now, if you have a patch that has A->A (mod), followed by another A->A(mod),
is there a good reason why you allow it in the first two modes and not the
last one? I do not think so.
prev parent reply other threads:[~2008-06-19 22:16 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-06-16 20:04 [PATCH] git-apply doesn't handle same name patches well [V3] Don Zickus
2008-06-16 20:27 ` Jakub Narebski
2008-06-17 9:40 ` Johannes Schindelin
2008-06-18 0:42 ` Junio C Hamano
2008-06-19 21:33 ` Don Zickus
2008-06-19 22:15 ` Junio C Hamano [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=7vk5glrs0b.fsf@gitster.siamese.dyndns.org \
--to=gitster@pobox.com \
--cc=dzickus@redhat.com \
--cc=git@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).