From: Jonathan Nieder <jrnieder@gmail.com>
To: Thorsten Glaser <tg@mirbsd.de>
Cc: Michael J Gruber <git@drmicha.warpmail.net>,
Richard Hartmann <richih.mailinglist@gmail.com>,
Git List <git@vger.kernel.org>
Subject: Re: Tracking file metadata in git -- fix metastore or enhance git?
Date: Fri, 8 Apr 2011 14:45:48 -0500
Message-ID: <20110408194548.GA26094@elie>
In-Reply-To: <Pine.BSM.4.64L.1104081903550.22999@herc.mirbsd.org>
Thorsten Glaser wrote:
> Jonathan Nieder dixit:
>> I think the most native-looking way to store metadata associated to
>> paths is .gitattributes. It also has the nice feature of allowing a
>> single attribute to apply to multiple files.
>
> Eh, no. Think of extended attributes like, say, NTFS Resource Forks.
> They’re just different “lines” into the “plane” a file can be, if
> you excuse the metaphor. (All parallel, of course.)
Do you mean no, it doesn't have that feature? ;-)
Each git commit (try it with "git cat-file commit HEAD") looks like so:

    tree <tree name>
    parent <commit name for first parent>
    parent <commit name for second parent>
    ...
    author <author identity and time of authorship>
    committer <committer identity and time committed>
    encoding <encoding of log message (optional)>

    <free-form change description>
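To make that layout concrete, here is a minimal Python sketch that splits a raw commit object into its header fields and its change description. The function name parse_commit and the sample identities, hashes and timestamps are made up for illustration, and the sketch assumes every header fits on a single line.

```python
def parse_commit(raw: bytes):
    """Split a raw commit object into (headers, message).

    headers is a list of (key, value) pairs rather than a dict, so that
    repeated keys such as "parent" are all kept, in order.
    Assumption: every header value fits on one line.
    """
    # The first blank line separates the header block from the description.
    head, _, message = raw.partition(b"\n\n")
    headers = []
    for line in head.split(b"\n"):
        key, _, value = line.partition(b" ")
        headers.append((key.decode("ascii"), value.decode("utf-8", "replace")))
    return headers, message.decode("utf-8", "replace")


# Hypothetical sample; the hashes and identities are placeholders.
sample_commit = (
    b"tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n"
    b"parent 1111111111111111111111111111111111111111\n"
    b"parent 2222222222222222222222222222222222222222\n"
    b"author A U Thor <author@example.com> 1300000000 -0500\n"
    b"committer C O Mitter <committer@example.com> 1300000000 -0500\n"
    b"\n"
    b"Merge two hypothetical branches\n"
)

headers, message = parse_commit(sample_commit)
parents = [value for key, value in headers if key == "parent"]
```

Note that nothing in this structure is keyed by path: the only link from a commit to file names is the single "tree" header.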
Where could one sneak in some per-path metadata?
- as new header fields after "encoding" (teaching git fsck, git commit
  --amend, and so on about it)? That can work but it would slow down
  operations not interested in this metadata. It is best not to have
  O(number of paths) header fields.
- in the change description? Yes, that can work, too, and it doesn't
  even require changing the commit format.
- a new header field pointing to another object? That is possible as
  a last resort.
Anyway, filenames and associated content are not what commits are
about; commits are just nodes in a revision graph, with trees representing
the tracked trees.
Okay, so what about the trees?
    <mode> SP <filename> NUL <object name>
    ...
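For illustration, a small Python sketch that walks entries in that format. It assumes the body of the tree object has already been decompressed and its object header stripped, and that <object name> is the 20-byte binary SHA-1; parse_tree and the sample entries are hypothetical.

```python
def parse_tree(raw: bytes, hash_len: int = 20):
    """Return a list of (mode, filename, hex object name) tuples.

    Assumption: raw is the body of a tree object, and each object name is
    stored as hash_len raw bytes (20 for SHA-1), not as hex text.
    """
    entries = []
    pos = 0
    while pos < len(raw):
        space = raw.index(b" ", pos)        # end of <mode>
        nul = raw.index(b"\0", space + 1)   # end of <filename>
        mode = raw[pos:space].decode("ascii")
        name = raw[space + 1:nul]           # kept as bytes; no encoding is assumed
        oid = raw[nul + 1:nul + 1 + hash_len]
        if len(oid) != hash_len:
            raise ValueError("truncated tree entry")
        entries.append((mode, name, oid.hex()))
        pos = nul + 1 + hash_len
    return entries


# Hypothetical tree: a regular file, a sidecar file holding its extra
# metadata, and a subdirectory. The object names are placeholders.
sample_tree = (
    b"100644 foo.c\0" + bytes.fromhex("11" * 20)
    + b"100644 foo.c.ntfs-resource-fork\0" + bytes.fromhex("22" * 20)
    + b"40000 src\0" + bytes.fromhex("33" * 20)
)

entries = parse_tree(sample_tree)
```

Each entry has exactly three slots, which is why the options below come down to the mode, the object name, or the filename.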
Where can we sneak something in?
- use a currently invalid <mode>? No, tracking metadata is probably
  not worth breaking old git clients.
- use an invalid object name? No (for the same reason).
- use a special filename? Then old git clients will treat the file
  as a regular file, so they still get access to the data.
So you see, using ordinary files (whether called .gitattributes or
foo.c.ntfs-resource-fork) to track this extra data makes a lot of
sense.
Now Michael mentioned an alternative, which is to store this
information in separate objects. That way, you could push your
history without the extra metadata, you could edit the metadata
without changing the commit names of the history, separately
garbage-collect metadata you're not interested in, etc. If that is
your goal, then "git notes" is exactly the right solution.
> They are just
> another facet of each file.
Sure, like the atime, the inode number, the uid of the user who wrote
them, and the model number of the disk used to store them.
Oh, you mean they're _relevant_ facets? Yes, that's believable,
though I suspect that's only going to sometimes be the case. So the
operator should say "yes, I'm interested in tracking this extra
information". To summarize the above, some ways this could work
behind the scenes:
* dotfiles with metadata;
* a Makefile to install files with metadata (i.e., the "source"
  consists of plain files, while the "build product" has the
  specified metadata);
* something else. Hopefully the above explains the relevant
  constraints so you can surprise us.
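As a rough sketch of the first option, the following Python records each file's permission bits in a dotfile that is tracked like any other file, and reapplies them later. The file name ".metadata", the tab-separated line format and both function names are assumptions made up for this example, not an existing git or metastore convention; symlinks and file names containing tabs or newlines are not handled.

```python
import os
import stat
import tempfile

METADATA_FILE = ".metadata"  # hypothetical name, not a git convention


def save_modes(root):
    """Write "<octal mode> TAB <relative path>" lines to root/.metadata."""
    lines = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip the repository's own data and keep the output order stable.
        dirnames[:] = sorted(d for d in dirnames if d != ".git")
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # symlinks are out of scope for this sketch
            rel = os.path.relpath(path, root)
            if rel == METADATA_FILE:
                continue
            mode = stat.S_IMODE(os.lstat(path).st_mode)
            lines.append(f"{mode:04o}\t{rel}\n")
    with open(os.path.join(root, METADATA_FILE), "w") as f:
        f.writelines(lines)


def restore_modes(root):
    """Reapply the permission bits recorded by save_modes()."""
    with open(os.path.join(root, METADATA_FILE)) as f:
        for line in f:
            mode, rel = line.rstrip("\n").split("\t", 1)
            os.chmod(os.path.join(root, rel), int(mode, 8))


# Usage: record a restrictive mode, lose it, then restore it.
with tempfile.TemporaryDirectory() as root:
    path = os.path.join(root, "secret.txt")
    with open(path, "w") as f:
        f.write("x\n")
    os.chmod(path, 0o600)
    save_modes(root)
    os.chmod(path, 0o644)  # simulate a checkout that did not keep the mode
    restore_modes(root)
    restored = stat.S_IMODE(os.stat(path).st_mode)
```

Because the dotfile is an ordinary tracked file, old git clients still see the data, which is the property argued for above.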
Hope that helps.
Jonathan