From: Nicolas Pitre <nico@cam.org>
To: Jeff King <peff@peff.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Roman Shaposhnik <rvs@sun.com>,
	git@vger.kernel.org
Subject: Re: Achieving efficient storage of weirdly structured repos
Date: Sun, 06 Apr 2008 20:36:19 -0400 (EDT)
Message-ID: <alpine.LFD.1.00.0804062028260.2947@xanadu.home>
In-Reply-To: <20080407001833.GA16558@sigill.intra.peff.net>

On Sun, 6 Apr 2008, Jeff King wrote:

> On Sun, Apr 06, 2008 at 08:13:10PM -0400, Nicolas Pitre wrote:
> 
> > Well, in your example, the large image part should already be common to 
> > many objects due to deltas if they're really the same: different objects 
> > will only have different EXIF data plus a delta reference to the same 
> > base image object. So in a way the split is already there.  Needs only 
> > that some applications exploit this information at runtime.
> 
> Yes, the resulting packfiles find the deltas and are pretty efficient
> (although it is quite slow to pack).  However, the delta information is
> not used at all for inexact rename detection. Are you proposing to make
> that information available to the rename detector?

In practice I don't know how well that would work, since the current
delta heuristic groups deltas and their base according to the name under
which those objects are known.  So it is possible that some inexact
renames end up creating objects that currently never delta against each
other, even if that would be the right thing to do.
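
To illustrate the name-based grouping, here is a rough Python sketch
modeled on my understanding of the name hash that pack-objects uses as a
sort key.  The constants and the function name are assumptions for
illustration, not a copy of the actual source, and the real sort also
takes object type and size into account.

```python
def name_hash(path: str) -> int:
    """Sketch of a name-based sort key (assumed, not git's exact code).

    Each new character is placed in the top byte and everything seen
    before is shifted down, so the last characters of the name dominate
    and whitespace is ignored.
    """
    h = 0
    for ch in path.encode():
        if ch in b" \t\n\r\x0b\x0c":
            continue
        h = ((h >> 2) + (ch << 24)) & 0xFFFFFFFF  # emulate 32-bit unsigned
    return h


if __name__ == "__main__":
    # Hypothetical paths: names with the same ending land near each other.
    for p in sorted(["a.c", "b.c", "a.h", "img/photo.jpg"], key=name_hash):
        print(f"{name_hash(p):#010x}  {p}")
```

Objects are only tried against neighbours within a limited window of
that sorted list, so two blobs with near-identical content but names
that hash far apart may never be compared at all.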

But in some cases it might be beneficial to look at the delta objects
themselves when diffing files, as the delta might already contain the
information telling the upper layer that files A and B are in fact 90%
the same and that they differ only from offset X to Y.
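
As a minimal sketch of what "looking at the delta" could mean, the
Python below walks the copy/insert instructions of a pack delta and
reports how much of the result is copied from the base and which result
ranges are new.  The function names and the returned dictionary are
hypothetical; nothing like this is claimed to exist in git's diff code,
and the encoding handled here is the copy/insert delta layout as I
understand it.

```python
def read_size(buf: bytes, pos: int):
    """Decode a little-endian base-128 size from the delta header."""
    value = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return value, pos


def delta_summary(delta: bytes) -> dict:
    """Summarize a delta: copied fraction and inserted result ranges."""
    base_size, pos = read_size(delta, 0)
    result_size, pos = read_size(delta, pos)
    copied, out, inserted = 0, 0, []
    while pos < len(delta):
        cmd = delta[pos]
        pos += 1
        if cmd & 0x80:                      # copy from base
            offset = size = 0
            for i in range(4):              # optional offset bytes
                if cmd & (1 << i):
                    offset |= delta[pos] << (8 * i)
                    pos += 1
            for i in range(3):              # optional size bytes
                if cmd & (0x10 << i):
                    size |= delta[pos] << (8 * i)
                    pos += 1
            size = size or 0x10000          # size 0 means 64 KiB
            copied += size
            out += size
        elif cmd:                           # insert `cmd` literal bytes
            inserted.append((out, out + cmd))
            pos += cmd
            out += cmd
        else:
            raise ValueError("reserved delta opcode 0")
    return {
        "base_size": base_size,
        "result_size": result_size,
        "similarity": copied / result_size if result_size else 1.0,
        "inserted_ranges": inserted,
    }


if __name__ == "__main__":
    # Hand-built delta: copy 45 bytes, insert 10 new bytes, copy 45 more.
    demo = bytes([100, 100, 0x90, 45, 10]) + b"x" * 10 + bytes([0x91, 55, 45])
    print(delta_summary(demo))
```

With such a summary, an upper layer could in principle decide that two
files are similar without inflating and comparing both blobs, provided
the packer happened to store one as a delta against the other.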


Nicolas

