git@vger.kernel.org mailing list mirror (one of many)
From: Daniel Hulme <st@istic.org>
To: Jon Smirl <jonsmirl@gmail.com>
Cc: "Shawn O. Pearce" <spearce@spearce.org>,
	Git Mailing List <git@vger.kernel.org>
Subject: Re: Calculating tree nodes
Date: Tue, 4 Sep 2007 17:20:19 +0100
Message-ID: <20070904162019.GA10441@istic.org>
In-Reply-To: <9e4733910709032026s7f94eed9h25d5165840cc38d2@mail.gmail.com>


On Mon, Sep 03, 2007 at 11:26:30PM -0400, Jon Smirl wrote:
> This is something that has always bugged me about file systems. File
> systems force hierarchical naming due to their directory structure.
> There is no reason they have to work that way. Google is an example of
> a giant file system that works just fine without hierarchical
> directories. The full path should be just another attribute on the
> file. If you want a hierarchical index into the file system you can
> generate it by walking the files or using triggers. But you could also
> delete the hierarchical directory and replace it with something else
> like a full text index. Directories would become a computationally
> generated cache, not a critical part of the file system. But this is a
> git list so I shouldn't go too far off into file system design.

Am I the only one who thinks that this idea of moving filenames from
tree objects into blobs does the *opposite* of what you're trying to
achieve?

It seems, though I could be completely misinterpreting what you're
saying, that you want to be able to get rid of directories and replace
them with some other index into your files: maybe a full-text index,
maybe a spatial index for geographic data, maybe something else
entirely. As things stand, you could do that by editing the core to
introduce a new object type 'fulltext' whose contents maybe look like

aardvark <sha1 of a blob> <sha1 of another blob>
abacus <sha1>
...
zebra <yet another sha1> <maybe the same sha1 I mentioned before>

or even something hierarchical, with each index mapping from the first
letter of the index term to the sha1 of another index, which in turn
maps second letters, and so on. Whatever. The point is, it works
parallel to tree. You could have the blobs referenced by your fulltext
object also be referenced by a tree object. If you really don't like
directory trees, you can dispense with tree objects in your repo
entirely. Either way you have a mapping from keys to blobs.

Then you could have your commits and tags include sha1's of fulltext
objects rather than (or as well as) tree objects, and you get your wish.
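
A minimal sketch of the arrangement described above, in Python, may make
it concrete. It is an illustration under stated assumptions, not git's
implementation: the in-memory `store`, the helpers `put` and `put_index`,
the line-based serialization of both index types, and the `fulltext`
header in the commit are all hypothetical. Only the blob hashing
(sha1 over "<type> <size>\0<payload>") follows what git actually does.

```python
import hashlib

# Hypothetical in-memory object store: sha1 hex digest -> raw object bytes.
store = {}

def put(obj_type, payload):
    """Store an object under the sha1 of '<type> <size>\\0<payload>'."""
    data = obj_type.encode() + b" " + str(len(payload)).encode() + b"\0" + payload
    key = hashlib.sha1(data).hexdigest()
    store[key] = data
    return key

def put_index(obj_type, mapping):
    """Serialize {key: [blob sha1, ...]} as sorted 'key sha1 sha1 ...' lines.

    Simplified on purpose: real git tree objects use a binary format with
    file modes; this text format is an assumption made for the sketch.
    """
    lines = [key + " " + " ".join(ids) for key, ids in sorted(mapping.items())]
    return put(obj_type, "\n".join(lines).encode())

# Blobs hold content only; no name is stored inside them.
readme = put("blob", b"the aardvark met a zebra\n")
notes = put("blob", b"an abacus and a zebra\n")

# Two independent indexes over the same blobs.
tree = put_index("tree", {"README": [readme], "doc/notes.txt": [notes]})
fulltext = put_index("fulltext", {
    "aardvark": [readme],
    "abacus": [notes],
    "zebra": [readme, notes],
})

# A commit may reference either index, or both (hypothetical header names).
commit = put("commit", ("tree " + tree + "\nfulltext " + fulltext + "\n").encode())
```

Dropping the `tree` line from the commit would leave the blobs reachable
through the `fulltext` object alone, which is the point: neither index
is privileged, because the blobs know nothing about either.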

OTOH, imagine if you move filenames into the blobs. Now, no matter what
other index types you introduce, they'll always be secondary to the
traditional, path-and-filename method of finding files. Crucially, you
can't introduce new blobs into the repo without giving them filenames.
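
The contrast can be sketched in a few lines of Python. The encoding used
for a blob that carries its own filename (path, NUL byte, content) is an
assumption made purely for illustration, as are the function names; the
proposal in this thread does not specify a format.

```python
import hashlib

def blob_id(content):
    """Content-only scheme: the id depends on nothing but the bytes."""
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()

def named_blob_id(path, content):
    """Hypothetical scheme: the path is stored inside the blob and hashed."""
    if not path:
        # With the name inside the object, a nameless blob cannot exist.
        raise ValueError("a named blob cannot be created without a path")
    return blob_id(path.encode() + b"\0" + content)

data = b"same bytes\n"

# Content-only: one id, which any kind of index can point at.
plain = blob_id(data)

# Name-in-blob: the path is baked into the object's identity.
a = named_blob_id("src/a.txt", data)
b = named_blob_id("docs/b.txt", data)
```

Under this assumed encoding `a` and `b` differ although the content is
identical, and a full-text or spatial index could only ever refer to
objects that already carry a path.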

As you said in your other thread,

> Integrating indexing into the data is not normally done in a database.

But isn't this exactly what integrating filenames into blobs would do?

-- 
There is no such thing as a small specification change.
http://surreal.istic.org/            Forcing the lines through the snow.


