From: "C. Scott Ananian" <cscott@cscott.net>
To: Martin Uecker <muecker@gmx.de>
Cc: git@vger.kernel.org
Subject: Re: space compression (again)
Date: Sat, 16 Apr 2005 11:11:00 -0400 (EDT)
Message-ID: <Pine.LNX.4.61.0504161101470.29343@cag.csail.mit.edu>
In-Reply-To: <20050416143905.GA10370@macavity>
On Sat, 16 Apr 2005, Martin Uecker wrote:
> The right thing (TM) is to switch from SHA1 of compressed
> content for the complete monolithic file to a merkle hash tree
> of the uncompressed content. This would make the hash
> independent of the actual storage method (chunked or not).
It would certainly be nice to change to a hash of the uncompressed
content, rather than a hash of the compressed content, but it's not
strictly necessary, since files are fetched all at once: there's no 'read
subrange' operation on blobs.
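
To make the difference concrete, here is a minimal sketch of the two
naming schemes. The function names are hypothetical, and it hashes bare
file contents with Python's hashlib and zlib rather than reproducing
git's actual object format, so treat it as an illustration only.

```python
import hashlib
import zlib


def id_from_compressed(content, level):
    """Object id as SHA-1 of the zlib-compressed bytes (storage-dependent)."""
    return hashlib.sha1(zlib.compress(content, level)).hexdigest()


def id_from_uncompressed(content):
    """Object id as SHA-1 of the raw bytes (storage-independent)."""
    return hashlib.sha1(content).hexdigest()


content = b"int main(void) { return 0; }\n" * 100

# Same content, two compression levels: the compressed bytes differ,
# so an id derived from them differs too.
print(id_from_compressed(content, 1) == id_from_compressed(content, 9))

# The id of the uncompressed content is unaffected by how it is stored.
stored = zlib.compress(content, 9)
print(id_from_uncompressed(zlib.decompress(stored)) == id_from_uncompressed(content))
```

The ids in the first comparison differ even for identical content,
because the zlib stream header itself records the compression level
that was used.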
I assume 'merkle hash tree' is talking about:
http://www.open-content.net/specs/draft-jchapweske-thex-02.html
..which is very interesting, but not quite what I was thinking.
The merkle hash approach seems to require fixed chunk boundaries.
The rsync approach does not use fixed chunk boundaries; this is necessary
to ensure good storage reuse for the expected case (i.e., inserting a single
line at the start or in the middle of the file, which changes all the
chunk boundaries).
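
The effect can be sketched with a toy chunker. This is not rsync's
actual algorithm; it is a simplified content-defined chunker in the same
spirit, which places a boundary wherever a rolling sum over the last few
bytes hits a target value. WINDOW, DIVISOR, and all function names are
assumptions chosen for illustration.

```python
import random

WINDOW = 16      # bytes of context that decide a boundary (assumed value)
DIVISOR = 64     # gives roughly one boundary per 64 bytes (assumed value)


def fixed_chunks(data, size=64):
    """Split data at fixed offsets: 0, size, 2*size, ..."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def content_defined_chunks(data):
    """Split data wherever the sum of the last WINDOW bytes hits a target.

    The boundary test depends only on the bytes inside the window, so a
    boundary moves along with its content when earlier bytes are inserted.
    """
    chunks = []
    start = 0
    rolling = 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= WINDOW:
            rolling -= data[i - WINDOW]   # slide the window forward by one
        if i + 1 >= WINDOW and rolling % DIVISOR == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])       # trailing partial chunk
    return chunks


def reused(old_chunks, new_chunks):
    """Count old chunks that also appear, byte for byte, among the new ones."""
    available = set(new_chunks)
    return sum(1 for chunk in old_chunks if chunk in available)


# Simulate inserting one line at the start of a file.
rng = random.Random(0)
old = bytes(rng.getrandbits(8) for _ in range(4096))
new = b"one inserted line\n" + old

print("fixed reuse:", reused(fixed_chunks(old), fixed_chunks(new)))
print("content-defined reuse:",
      reused(content_defined_chunks(old), content_defined_chunks(new)))
```

With fixed offsets the insertion shifts every chunk, so essentially
nothing is reused. With content-defined boundaries, at most the first
chunk of the old file fails to reappear, since every later boundary
depends only on bytes that are unchanged.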
Further, in the absence of subrange reads on blobs, it's not entirely
clear what using a merkle hash would buy you.
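
For reference, a hash tree over fixed-size chunks might look roughly
like the sketch below. It is only loosely modeled on the THEX draft:
the chunk size, the 0x00/0x01 tag bytes, and the function names are
assumptions, and SHA-1 is used simply because it is the hash under
discussion.

```python
import hashlib

CHUNK = 1024  # fixed leaf size in bytes (assumed; any fixed size works)


def leaf_hashes(data):
    """SHA-1 of each fixed-size chunk, tagged 0x00 to mark it as a leaf."""
    pieces = [data[i:i + CHUNK] for i in range(0, len(data), CHUNK)] or [b""]
    return [hashlib.sha1(b"\x00" + piece).digest() for piece in pieces]


def merkle_root(data):
    """Combine leaf hashes pairwise, level by level, into one root hash."""
    level = leaf_hashes(data)
    while len(level) > 1:
        paired = []
        for i in range(0, len(level) - 1, 2):
            # Interior nodes are tagged 0x01 so they cannot pass for leaves.
            paired.append(
                hashlib.sha1(b"\x01" + level[i] + level[i + 1]).digest())
        if len(level) % 2:
            paired.append(level[-1])   # odd node is carried up unchanged
        level = paired
    return level[0]


print(merkle_root(b"hello world\n" * 500).hex())
```

The per-chunk hashes are what would let a reader verify one chunk
without the rest of the file, which only pays off if blobs can be read
a subrange at a time.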
--scott
WASHTUB supercomputer security Mk 48 justice ODUNIT radar COBRA JANE
SSBN 731 BATF KUJUMP SECANT operation class struggle SYNCARP KGB ODACID
( http://cscott.net/ )