git@vger.kernel.org mailing list mirror (one of many)
* Re: Merge with git-pasky II.
From: Bram Cohen @ 2005-04-26 18:55 UTC (permalink / raw)
  To: git

(my apologies for responding to old messages, I only just subscribed to
this list)

Linus Torvalds wrote:
> On Thu, 14 Apr 2005, Junio C Hamano wrote:
> >
> > You say "merge these two trees" above (I take it that you mean
> > "merge these two trees, taking account of this tree as their
> > common ancestor", so actually you are dealing with three trees),
>
> Yes. We're definitely talking three trees.

The LCA for different files might be at different points in the history.
Forcing them to all come from the same point produces very bad merges.

> The fact is, we know how to make tree merges unambiguous, by just
> totally ignoring the history between them.  Ie we know how to merge
> data. I am pretty damn sure that _nobody_ knows how to merge "data over
> time".

You're incorrect. Codeville does exactly that (history-aware merges which
do the right thing even in cases where 3-way merge can't)

> > This however opens up another set of can of worms---it would
> > involve not just three trees but all the trees in the commit
> > chain in between.
>
> Exactly.  I seriously believe that the model is _broken_, simply because
> it gets too complicated. At some point it boils down to "keep it simple,
> stupid".

The Codeville merge algorithm is also quite simple, and is already
implemented and mature.

> I've not even been convinced that renames are worth it. Nobody has
> really given a good reason why.

If one person renames a file and another person modifies it then the
changes should be applied to the moved file.

Also, there's the directory rename case where one person moves a directory
and another person adds a file to it, in which case the file should be
moved to the new directory location on merge. I gather that BK doesn't
support this functionality, but Codeville and Monotone both do.
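
The rename-versus-modify case above can be illustrated with a toy sketch. This is not Codeville's or Monotone's algorithm; it assumes, purely for illustration, that every file carries a stable identity (the hypothetical `file_id` keys below) so that path and content can each be merged three-way on their own. It only handles files present in all three trees.

```python
from typing import Dict, Tuple

# Hypothetical model: file_id -> (path, content). Real systems track identity differently.
Tree = Dict[str, Tuple[str, str]]


def pick(base: str, ours: str, theirs: str) -> str:
    """Three-way choice for one value: take whichever side changed it."""
    if ours == theirs:
        return ours
    if ours == base:
        return theirs
    if theirs == base:
        return ours
    raise ValueError(f"conflict: {ours!r} vs {theirs!r}")


def merge_trees(base: Tree, ours: Tree, theirs: Tree) -> Tree:
    """Merge path and content independently for files known to all three trees."""
    merged: Tree = {}
    for file_id in base.keys() & ours.keys() & theirs.keys():
        b, o, t = base[file_id], ours[file_id], theirs[file_id]
        merged[file_id] = (pick(b[0], o[0], t[0]), pick(b[1], o[1], t[1]))
    return merged


base = {"f1": ("src/util.c", "v1")}
ours = {"f1": ("lib/util.c", "v1")}    # one person renames the file
theirs = {"f1": ("src/util.c", "v2")}  # another person modifies it
merged = merge_trees(base, ours, theirs)
print(merged)
```

Because the path and the content are merged separately, the modification follows the file to its new name. Without some notion of identity, the same inputs look like a deletion on one side and an edit on the other.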

>    I think you might as well interpret the whole object thing. Git
> _does_ tell you how the objects changed, and I actually believe that a
> diff that works in between objects (ie can show "these lines moved from
> this file X to that file Y") is a _hell_ of a lot more powerful than
> "rename"  is.
>
>    So I'd seriously suggest that instead of worrying about renames,
> people think about global diffs that aren't per-file. Git is good at
> limiting the changes to a set of objects, and it should be entirely
> possible to think of diffs as ways of moving lines _between_ objects and
> not just within objects. It's quite common to move a function from one
> file to another - certainly more so than renaming the whole file.
>
> In other words, I really believe renames are just a meaningless special
> case of a much more interesting problem. Which is just one reason why
> I'm not at all interested in bothering with them other than as a "data
> moved" thing, which git already handles very well indeed.

Nothing, not even your beloved BitKeeper, has 'move lines between files'
functionality, and for good reason.

To begin with, it's behaviorally extremely dubious. It would not be
uncommon for the system to erroneously think that some lines deleted from
one file were added to another, and then further changes down the line
would cause random unrelated files to get modified in unpredictable ways
when merges happened.

Also, it presents a completely unsolved UI problem. If one person moves
lines 5-15 of file A to file B, and another person concurrently rewrites
lines 10-20 of file A, how on earth is that supposed to be presented to
the user? Codeville can support line moves *within* files just fine, but
doesn't do it because of the UI problem of presenting all the corner
cases. Maybe someday somebody will do a PhD thesis on that topic and we'll
add it, but until then we're sticking with the basic functionality.
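
To make the corner case concrete, here is a minimal sketch of the overlap in the example above. The function name and the inclusive `(start, end)` representation are assumptions made for illustration; no existing tool is being described.

```python
from typing import Optional, Tuple

LineRange = Tuple[int, int]  # inclusive (start, end), 1-based line numbers


def overlap(a: LineRange, b: LineRange) -> Optional[LineRange]:
    """Return the lines touched by both edits, or None if they are disjoint."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start <= end else None


moved = (5, 15)       # one person moves these lines of file A to file B
rewritten = (10, 20)  # another person concurrently rewrites these lines of file A
contested = overlap(moved, rewritten)
print(contested)
```

The contested lines have both left file A and been rewritten in place, while the remaining lines of each edit belong to only one side. A merge tool would have to decide where the rewritten half ends up and how to show that to the user.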

Honestly, that you would think of doing whole-tree three-way merges and
even consider moving lines between files shows that you haven't explored
the merge problem very deeply. This is a much harder problem than you
think it is, and one which has already been solved by other systems.

-Bram


* Re: Re: Merge with git-pasky II.
From: Linus Torvalds @ 2005-04-17 17:34 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Petr Baudis, Simon Fowler, David Lang, git



On Sun, 17 Apr 2005, Ingo Molnar wrote:
> 
> in fact, this attack cannot even be proven to be malicious, purely via 
> the email from Malice: it could be incredible bad luck that caused that 
> good-looking patch to be mistakenly matching a dangerous object.

I really hate theoretical discussions. 

The fact is, a lot of _crap_ engineering gets done because of the question
"what if?". It results in over-engineering, often to the point where the 
end result is quite a lot measurably worse than the sane results.

You are _literally_ arguing for the equivalent of "what if a meteorite hit
my plane while it was in flight - maybe I should add three inches of
high-tension armored steel around the plane, so that my passengers would
be protected".

That's not engineering. That's five-year-olds discussing building their
imaginary forts ("I want gun-turrets and a mechanical horse one mile high,
and my command center is 5 miles under-ground and totally encased in 5
meters of lead").

I absolutely _hate_ doing engineering on the principle of "this might be
possible in theory", and I'm violently opposed to it. So far, I have not
heard a single argument that I consider even _remotely_ likely.

The thing is, even if you can force a hash collision by sending somebody 
a patch, it's really pretty much almost guaranteed that the patch is not 
just "a few strange characters", unless sha1 is really broken to the point 
where it's not cryptographically secure _at_all_.

In other words, unless somebody finds a way to make sha1 appear as nothing
more than a complicated set of parity bits, all brute-force "get the same
sha1" is likely to be about generating a really strange blob based on the
thing you want to replace - and by "really strange" I mean total binary
crap. And likely _much_ bigger too. And by "much bigger" I mean "possibly
gigabytes of data".

And the thing is, _if_ somebody finds a way to make sha1 act as just a
complex parity bit, and comes up with generating a clashing object that
actually makes sense, then going to sha256 is likely pointless too - I
think the algorithm is basically the same, just with more bits. If you've
broken sha1 to the point where it's _that_ breakable, then you've likely
broken sha256 too. Nobody has ever proven that you couldn't break sha256 
with some really clever algorithm...

So if you start playing "what if?" games, dammit, I can play mine.

If we want to have any kind of confidence that the hash is really
unbreakable, we should make it not just longer than 160 bits, we should
make sure that it's two or more hashes, and that they are based on totally
different principles.

And we should all digitally sign every single object too, and we should
use 4096-bit PGP keys and unguessable passphrases that are at least 20
words in length. And we should then build a bunker 5 miles underground,
encased in lead, so that somebody cannot flip a few bits with a ray-gun, 
and make us believe that the sha1's match when they don't. Oh, and we need 
to all wear aluminum propeller beanies to make sure that they don't use 
that ray-gun to make us do the modification _ourselves_.

And the thing is, that's just crazy talk. The difference between a crazy
person and an intelligent one is that the crazy one doesn't realize what
makes sense in the world. The goal of good engineering is not to ask "what
if?", but to ask "how do I make this work as well as possible".

So please stop with the theoretical sha1 attacks. It is simply NOT TRUE
that you can generate an object that looks halfway sane and still gets you
the sha1 you want. Even the "breakage" doesn't actually do that.  And if
it ever _does_ become true, it will quite possibly be thanks to some
technology that breaks other hashes too.

So until proven otherwise, I worry about accidental hashes, and in 160
bits of good hashing, that just isn't an issue either. Anybody who
compares a 128-bit md5-sum to a 160-bit sha1 doesn't understand the math.  
It didn't get "slightly less likely" to happen. It got so _unbelievably_
less likely to happen that it's not even funny.
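
The size of that gap can be estimated with the standard birthday-bound approximation. The sketch below assumes an ideal hash whose outputs are uniformly distributed, and the object count is a made-up figure chosen only for illustration.

```python
from fractions import Fraction


def collision_probability(num_objects: int, hash_bits: int) -> Fraction:
    """Birthday-bound approximation p ~= n*(n-1) / 2^(bits+1), valid while p is small."""
    return Fraction(num_objects * (num_objects - 1), 2 ** (hash_bits + 1))


n = 10 ** 9  # hypothetical number of objects in a repository
p128 = collision_probability(n, 128)
p160 = collision_probability(n, 160)

print(f"128-bit hash: {float(p128):.3e}")
print(f"160-bit hash: {float(p160):.3e}")
print(f"ratio: {p128 / p160}")  # exactly 2**32
```

For the same number of objects, the 32 extra bits make an accidental collision 2^32 (about four billion) times less likely under this approximation.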

				Linus

* Merge with git-pasky II.
From: Petr Baudis @ 2005-04-14  0:29 UTC (permalink / raw)
  To: torvalds; +Cc: git

  Hello Linus,

  I think my tree should be ready for merging with you. It is the final
tree and I've already switched my main branch for it, so it's what
people doing git pull are getting for some time already.

  Its main contents are all of my shell scripts. Apart from that, some
tiny fixes scattered all around can be found there, as well as some
patches which went through the mailing list. My last merge with you
concerned your commit 39021759c903a943a33a28cfbd5070d36d851581.

  It's again

	rsync://pasky.or.cz/git/

this time my HEAD is fba83970090ef54c6eb86dcc2c2d5087af5ac637.

  Note that my rsync tree still contains even my old branch; I thought
I'd leave it around in the public objects database for some time, should
anyone want to have a look at the history of some of the scripts. But if
you want it gone, tell me and I will prune it (and perhaps offer it in
/git-old/ or whatever). I'm using the following:

	fsck-cache --unreachable $(commit-id) | grep unreachable \
		| cut -d ' ' -f 2 | sed 's/^\(..\)/.git\/objects\/\1\//' \
		| xargs rm
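
The `sed` expression in that pipeline turns each object ID into a file path by splitting off its first two hex digits as a directory name. A minimal sketch of the same mapping follows; the function name and the validation are assumptions added for illustration, and nothing is deleted.

```python
from pathlib import PurePosixPath


def loose_object_path(object_id: str, git_dir: str = ".git") -> PurePosixPath:
    """Map a 40-hex-digit object ID to <git_dir>/objects/<first 2 digits>/<remaining 38>."""
    if len(object_id) != 40 or any(c not in "0123456789abcdef" for c in object_id):
        raise ValueError(f"not a 40-digit lowercase hex object ID: {object_id!r}")
    return PurePosixPath(git_dir) / "objects" / object_id[:2] / object_id[2:]


# The HEAD commit ID quoted in the message above.
path = loose_object_path("fba83970090ef54c6eb86dcc2c2d5087af5ac637")
print(path)  # .git/objects/fb/a83970090ef54c6eb86dcc2c2d5087af5ac637
```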

  Thanks,

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor
