git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Michael Haggerty <mhagger@alum.mit.edu>
To: Jeff King <peff@peff.net>, Junio C Hamano <gitster@pobox.com>
Cc: "Duy Nguyen" <pclouds@gmail.com>, "Johannes Sixt" <j6t@kdbg.org>,
	"Torsten Bögershausen" <tboegi@web.de>,
	"Git Mailing List" <git@vger.kernel.org>
Subject: Re: [PATCH v4 00/32] Lockfile correctness and refactoring
Date: Fri, 12 Sep 2014 13:28:11 +0200	[thread overview]
Message-ID: <5412D8CB.4000809@alum.mit.edu> (raw)
In-Reply-To: <20140910191159.GA23655@peff.net>

On 09/10/2014 09:11 PM, Jeff King wrote:
> On Wed, Sep 10, 2014 at 09:51:03AM -0700, Junio C Hamano wrote:
> 
>> Jeff King <peff@peff.net> writes:
>>
>>> Yes, we don't let normal fetchers see these repos. They're only for
>>> holding shared objects and the ref tips to keep them reachable.
>>
>> Are these individual refs have relations to the real world after
>> they are created?  To ask it another way, let's say that a branch in
>> a repository, which is using this as a shared object store, caused
>> one of these refs to be created; now the origin repository rewinds
>> or deletes that branch---do you do anything to the ref in the shared
>> object store at that point?
> 
> Yes, we fetch from them before doing any maintenance in the shared
> repository (like running repack). That's how objects migrate into the
> shared repository, as well.
> 
>> I am wondering if it makes sense to maintain a single ref that
>> reaches all the commits in this shared object store repository,
>> instead of keeping these millions of refs.  When you need to make
>> more objects kept and reachable, create an octopus with the current
>> tip and tips of all these refs that causes you to wish making these
>> "more objects kept and reachable".  Obviously that won't work well
>> if the reason why your current scheme uses refs is because you
>> adjust individual refs to prune some objects---hence the first
>> question in this message.
> 
> Exactly. You could do this if you threw away and re-made the octopus
> after each fetch (and then threw away the individual branches that went
> into it). For that matter, if all you really want are the tips for
> reachability, you can basically run "for-each-ref | sort -u"; most of
> these refs are tags that are duplicated between each fork.
> 
> However, having the individual tips does make some things easier. If I
> only keep unique tips and I drop a tip from fork A, I would then need to
> check every other fork to see if any other fork has the same tip. OTOH,
> that means visiting N packed-refs files, each with (let's say) 3000
> refs. As opposed to dealing with a packed-refs file with N*3000 refs. So
> it's really not that different.

I think it would make more sense to have one octopus per fork, rather
than one octopus for all of the forks. The octopus for a fork could be
discarded and re-created every time that fork changed, without having to
worry about which of the old tips is still reachable from another fork.
In fact, if you happen to know that a particular update to a fork is
pure fast-forward, you could update the fork octopus by merging the
changed tips into the old fork octopus without having to re-knit all of
the unchanged tips together again.

The shared repository would need one reference per fork--still much
better than a reference for every reference in every fork. If even that
is too many references, you could knit the fork octopuses into an
overall octopus for all forks. But then updating that octopus for a
change to a fork would become more difficult, because you would have to
read the octopuses for the N-1 unchanged forks from the old octopus and
knit those together with the new octopus for the modified fork.

But all this imagineering doesn't mitigate the other reasons you list
below for not wanting to guarantee reachability using this trick.

> We also use the individual ref tips for packing. They factor into the
> bitmap selection, and we have some patches (which I've been meaning to
> upstream for a while now) to make delta selections in the shared-object
> repository that will have a high chance of reuse in clones of individual
> forks. And it's useful to query them for various reasons (e.g., "who is
> referencing this object?").
> [...]

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu

  reply	other threads:[~2014-09-12 11:35 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-09-06  7:50 [PATCH v4 00/32] Lockfile correctness and refactoring Michael Haggerty
2014-09-06  7:50 ` [PATCH v4 01/32] unable_to_lock_die(): rename function from unable_to_lock_index_die() Michael Haggerty
2014-09-06  7:50 ` [PATCH v4 02/32] api-lockfile: expand the documentation Michael Haggerty
2014-09-09 22:40   ` Junio C Hamano
2014-09-06  7:50 ` [PATCH v4 03/32] rollback_lock_file(): do not clear filename redundantly Michael Haggerty
2014-09-11 19:13   ` Ronnie Sahlberg
2014-09-06  7:50 ` [PATCH v4 04/32] rollback_lock_file(): exit early if lock is not active Michael Haggerty
2014-09-11 19:13   ` Ronnie Sahlberg
2014-09-06  7:50 ` [PATCH v4 05/32] rollback_lock_file(): set fd to -1 Michael Haggerty
2014-09-06  7:50 ` [PATCH v4 06/32] lockfile: unlock file if lockfile permissions cannot be adjusted Michael Haggerty
2014-09-09 22:39   ` Junio C Hamano
2014-09-12 11:03     ` Michael Haggerty
2014-09-06  7:50 ` [PATCH v4 07/32] hold_lock_file_for_append(): release lock on errors Michael Haggerty
2014-09-09 22:41   ` Junio C Hamano
2014-09-12 11:04     ` Michael Haggerty
2014-09-06  7:50 ` [PATCH v4 08/32] lock_file(): always add lock_file object to lock_file_list Michael Haggerty
2014-09-06  7:50 ` [PATCH v4 09/32] lockfile.c: document the various states of lock_file objects Michael Haggerty
2014-09-11 19:57   ` Ronnie Sahlberg
2014-09-06  7:50 ` [PATCH v4 10/32] cache.h: define constants LOCK_SUFFIX and LOCK_SUFFIX_LEN Michael Haggerty
2014-09-11 22:15   ` Ronnie Sahlberg
2014-09-12 16:44     ` Michael Haggerty
2014-09-11 22:42   ` Ronnie Sahlberg
2014-09-12 17:13     ` Michael Haggerty
2014-09-12 17:32       ` Ronnie Sahlberg
2014-09-06  7:50 ` [PATCH v4 11/32] delete_ref_loose(): don't muck around in the lock_file's filename Michael Haggerty
2014-09-13  7:41   ` Johannes Sixt
2014-09-14  6:27     ` Michael Haggerty
2014-09-14  6:38       ` Michael Haggerty
2014-09-14 14:49         ` Johannes Sixt
2014-09-06  7:50 ` [PATCH v4 12/32] prepare_index(): declare return value to be (const char *) Michael Haggerty
2014-09-06  7:50 ` [PATCH v4 13/32] write_packed_entry_fn(): convert cb_data into a (const int *) Michael Haggerty
2014-09-11 19:55   ` Ronnie Sahlberg
2014-09-06  7:50 ` [PATCH v4 14/32] lock_file(): exit early if lockfile cannot be opened Michael Haggerty
2014-09-11 22:49   ` Ronnie Sahlberg
2014-09-06  7:50 ` [PATCH v4 15/32] remove_lock_file(): call rollback_lock_file() Michael Haggerty
2014-09-06  7:50 ` [PATCH v4 16/32] commit_lock_file(): inline temporary variable Michael Haggerty
2014-09-06  7:50 ` [PATCH v4 17/32] commit_lock_file(): die() if called for unlocked lockfile object Michael Haggerty
2014-09-06  7:50 ` [PATCH v4 18/32] commit_lock_file(): if close fails, roll back Michael Haggerty
2014-09-06  7:50 ` [PATCH v4 19/32] commit_lock_file(): rollback lock file on failure to rename Michael Haggerty
2014-09-10  7:55   ` Jeff King
2014-09-10 12:55     ` Duy Nguyen
2014-09-06  7:50 ` [PATCH v4 20/32] api-lockfile: document edge cases Michael Haggerty
2014-09-06  7:50 ` [PATCH v4 21/32] dump_marks(): remove a redundant call to rollback_lock_file() Michael Haggerty
2014-09-06  7:50 ` [PATCH v4 22/32] git_config_set_multivar_in_file(): avoid " Michael Haggerty
2014-09-06  7:50 ` [PATCH v4 23/32] lockfile: avoid transitory invalid states Michael Haggerty
2014-09-06  7:50 ` [PATCH v4 24/32] struct lock_file: declare some fields volatile Michael Haggerty
2014-09-06  7:50 ` [PATCH v4 25/32] try_merge_strategy(): remove redundant lock_file allocation Michael Haggerty
2014-09-06  7:50 ` [PATCH v4 26/32] try_merge_strategy(): use a statically-allocated lock_file object Michael Haggerty
2014-09-06  7:50 ` [PATCH v4 27/32] commit_lock_file(): use a strbuf to manage temporary space Michael Haggerty
2014-09-06  7:50 ` [PATCH v4 28/32] Change lock_file::filename into a strbuf Michael Haggerty
2014-09-06  7:50 ` [PATCH v4 29/32] resolve_symlink(): use a strbuf for internal scratch space Michael Haggerty
2014-09-06  7:50 ` [PATCH v4 30/32] resolve_symlink(): take a strbuf parameter Michael Haggerty
2014-09-06  7:50 ` [PATCH v4 31/32] trim_last_path_elm(): replace last_path_elm() Michael Haggerty
2014-09-06  7:50 ` [PATCH v4 32/32] Extract a function commit_lock_file_to() Michael Haggerty
2014-09-07 14:21 ` [PATCH v4 00/32] Lockfile correctness and refactoring Torsten Bögershausen
2014-09-12 12:50   ` Michael Haggerty
2014-09-08 22:35 ` Junio C Hamano
2014-09-10  8:13 ` Jeff King
2014-09-10 10:25   ` Duy Nguyen
2014-09-10 10:30     ` Jeff King
2014-09-10 16:51       ` Junio C Hamano
2014-09-10 19:11         ` Jeff King
2014-09-12 11:28           ` Michael Haggerty [this message]
2014-09-12 11:13         ` Michael Haggerty
2014-09-12 14:21   ` Michael Haggerty
2014-09-13 18:51     ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5412D8CB.4000809@alum.mit.edu \
    --to=mhagger@alum.mit.edu \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=j6t@kdbg.org \
    --cc=pclouds@gmail.com \
    --cc=peff@peff.net \
    --cc=tboegi@web.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).