git@vger.kernel.org mailing list mirror (one of many)
From: "Randall S. Becker" <rsbecker@nexbridge.com>
To: "'Philip Oakley'" <philipoakley@iee.org>,
	"'Mike Hommey'" <mh@glandium.org>, <git@vger.kernel.org>
Subject: RE: Allowing weak references to blobs and strong references to commits
Date: Tue, 31 Mar 2015 17:08:47 -0400
Message-ID: <006a01d06bf6$e2512540$a6f36fc0$@nexbridge.com>
In-Reply-To: <1E05987AFD4A4ABCB5515905B517C021@PhilipOakley>

On March 31, 2015 3:55 PM Philip Oakley wrote:
> From: "Mike Hommey" <mh@glandium.org>
> [...]
> > So I thought, since commits are already allowed in tree objects, for
> > submodules, why not add a bit to the mode that would tell git that
> > those commit object references are meant to always be there aka strong
> > reference, as opposed to the current weak references for submodules.
> > I was thinking something like 0200000, which is above S_IFMT, but I
> > haven't checked if mode is expected to be a short anywhere, maybe one
> > of the file permission flags could be abused instead (sticky bit?).
> >
> > I could see this used in the future to e.g. implement a fetchable
> > reflog (which could be a ref to a tree with strong references to
> > commits).
> >
> > Then that got me thinking that the opposite would be useful to me as
> > well: I'm currently storing mercurial manifests as git trees with
> > (weak) commit references using the mercurial sha1s for files.
> > Unfortunately, that doesn't allow to store the corresponding file
> > permissions, so I'm going through hoops to get that. It would be
> > simpler for me if I could just declare files or symlinks with the
> > right permissions and say 'the corresponding blob doesn't need to
> > exist'.
> > I'm sure other tools using git as storage would have a use for such
> > weak references.
> >
> The "weak references" idea is something that's on my back list of
> Toh-Doh's for the purpose of having a Narrow clone.
> 
> However it's not that easy as you need to consider three areas - what's on
> disk (worktree/file system), what's in the index, and what's in the object
> store and how a coherent view is kept of all three without breakage.
> 
> The 'Sparse Checkout' / 'Skip Worktree' (see `git help read-tree`) covers
> the first two but not the third (which submodules does) [that's your 'the
> corresponding blob doesn't need to exist' aspect from my perspective]
> 
> 
> > What do you think about this? Does that seem reasonable to have in git
> > core, and if yes, how would you go about implementing it (same bit
> > with different meaning for blobs and commits (or would you rather that
> > were only done for commits and not for blobs)? what should I be
> > careful about, besides making sure gc and fsck don't mess up?)
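
On the mode-bit question quoted above, a small sketch may help frame it.
This is only arithmetic on the values mentioned in the mail: 0200000 is the
proposed (hypothetical) flag, 0170000 is the POSIX S_IFMT mask, 0160000 is
the mode git uses for submodule entries, and 01000 is the sticky bit. The
describe() helper is made up for the example; nothing here is implemented
in git.

```python
S_IFMT = 0o170000        # POSIX file-type mask
S_IFGITLINK = 0o160000   # tree-entry mode git uses for submodule commits
S_ISVTX = 0o001000       # sticky bit, the alternative floated in the mail
PROPOSED_BIT = 0o200000  # hypothetical "strong reference" flag from the mail


def describe(flag: int) -> dict:
    """Report whether a candidate flag collides with the type mask
    and whether it still fits in an unsigned 16-bit mode field."""
    return {
        "outside_type_mask": (flag & S_IFMT) == 0,
        "fits_in_16_bits": 0 <= flag <= 0xFFFF,
    }


# A gitlink entry carrying the proposed flag keeps its file type intact.
strong_gitlink = S_IFGITLINK | PROPOSED_BIT

if __name__ == "__main__":
    print("proposed bit:", describe(PROPOSED_BIT))
    print("sticky bit:  ", describe(S_ISVTX))
    print("type of flagged gitlink: %o" % (strong_gitlink & S_IFMT))
```

The point of the check is that 0200000 equals 2**16, so it sits above
S_IFMT but would be lost anywhere the mode is held in a 16-bit short,
whereas the sticky bit fits but already has a meaning of its own.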

I don't know whether this is relevant or not - forgiveness requested in
advance. It may be useful to store primarily the SHA1 for a weak object. In
a product called RMS, this was called an "External Reference". The file
itself was not stored, but its signature was. A commit could be confirmed as
validly and completely on disk only if the signature matched (so git status
would know). If the file was missing, or had an invalid signature, the
working area was considered dirty (so git status would presumably report
"modified"). All signatures were stored for these types of files, but the
contents were not - hence "external". Apart from the contents, we stored all
other repository attributes, with the obvious risks. This was typically used
to track versions of the compilers and headers being used for builds. Those
were managed by a separate systems operations group and we did not want to
store them in the repository, but we wanted to know the signatures in case
we had to go back in time. From my point of view, I would like to be able to
have /usr/include (example only) as a working area where I can be 100%
certain it contains what I expect it to contain, but I don't really want to
store the objects in a repository - and may not have root anyway.

Cheers,
Randall

-- Brief whoami: NonStop&UNIX developer since approximately
UNIX(421664400)/NonStop(211288444200000000)
-- In my real life, I talk too much.
