git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Linus Torvalds <torvalds@linux-foundation.org>
To: David Kastrup <dak@gnu.org>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 1/5] Add 'df_name_compare()' helper function
Date: Thu, 6 Mar 2008 07:58:48 -0800 (PST)	[thread overview]
Message-ID: <alpine.LFD.1.00.0803060743180.12253@woody.linux-foundation.org> (raw)
In-Reply-To: <8563w03sxv.fsf@lola.goethe.zz>



On Thu, 6 Mar 2008, David Kastrup wrote:
> 
> Uh, that screams just bloody murder at me with regard to working on
> sorted material.  You'll never even get into the situation where the
> conflicting "equal" entries will be compared if you presorted them with
> something in between.

And you really can't.

> Consider the case of a merge of
> 
> A:
> blubb
> blubb.hi
> 
> B:
> blubb.hi
> blubb/
> 
> Any traversal that is done reasonably efficiently will never compare
> blubb to blubb/ and I don't see how to get around this.

Correct. There _is_ no sort order that will find the conflict in a single 
pass, yet get the sorting right for this case, because there is no sort 
that can satisfy both the "get DF conflicts together" _and_ the "keep 
things in the right order" requirements.

Btw, the old code was no better (and was probably worse) - it used 
"strcmp()" to order the things, so it effectively did the same thing, it 
just wasn't as obvious about it (and got the case of "foo/" vs "foo." 
entirely wrong). The new code makes this hack very explicit.

> Basically, I think you need a special traversal routine.

Yes, we need to handle it in two passes. Which is actually hopefully not 
all that hard, but it is totally impossible (at least for me) with the old 
code that was so hard to follow.

So what I did was to rewrite the code so that it can be followed and 
maintained, and now my plan is to simply put the result in a separate 
index. 

We've always really wanted to do that *anyway* (right now we have this 
"discard and re-read index" cruft in the callers and in the error cases to 
get back the index we overwrote when merging!), and I bet we could have 
done that with the old code too, but _I_ couldn't have done it.

The problem with having a single index as both source and destination is 
that you cannot do a two-phase thing, because you are basically destroying 
one of your sources in your first phase (or put another way: in the second 
phase, you don't really know whether the index entry you now find was 
there to begin with, of whether you just generated it yourself earlier).

So what that series of five patches do is to not fix a single bug, but to 
make the thing a bit easier for me to look at (and I think it's good for 
other reasons too - one less arbitrary tree traversal function!)

I'm hoping that people can look at the patches and test them, and say 
"yeah, that looks like it does the same thing we used to", and then when I 
get the energy (hopefully later today) I'll just make it take a separate 
source and destination index.

		Linus

  reply	other threads:[~2008-03-06 16:00 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-03-06  4:28 [PATCH 0/5] Split-up "unpack_trees()" cleanup series Linus Torvalds
2008-03-06  2:25 ` [PATCH 1/5] Add 'df_name_compare()' helper function Linus Torvalds
2008-03-06 13:03   ` David Kastrup
2008-03-06 15:58     ` Linus Torvalds [this message]
2008-03-06 21:50       ` David Kastrup
2008-03-06  2:59 ` [PATCH 2/5] Make 'traverse_tree()' use linked structure rather than 'const char *base' Linus Torvalds
2008-03-06  3:44 ` [PATCH 3/5] Add return value to 'traverse_tree()' callback Linus Torvalds
2008-03-06  4:06 ` [PATCH 4/5] Make 'traverse_trees()' traverse conflicting DF entries in parallel Linus Torvalds
2008-03-06  4:15 ` [PATCH 5/5] Move 'unpack_trees()' over to 'traverse_trees()' interface Linus Torvalds
2008-03-06  4:51 ` [PATCH 0/5] Split-up "unpack_trees()" cleanup series Linus Torvalds
2008-03-07  0:13 ` Daniel Barkalow
2008-03-07  2:04   ` Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.LFD.1.00.0803060743180.12253@woody.linux-foundation.org \
    --to=torvalds@linux-foundation.org \
    --cc=dak@gnu.org \
    --cc=git@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).