git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jonathan Nieder <jrnieder@gmail.com>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH 2/2] fetch-pack: avoid repeatedly re-scanning pack directory
Date: Sun, 27 Jan 2013 02:27:53 -0800	[thread overview]
Message-ID: <20130127102753.GB4228@elie.Belkin> (raw)
In-Reply-To: <20130126224043.GB20849@sigill.intra.peff.net>

Hi,

Jeff King wrote:

> When we look up a sha1 object for reading, we first check
> packfiles, and then loose objects. If we still haven't found
> it, we re-scan the list of packfiles in `objects/pack`. This
> final step ensures that we can co-exist with a simultaneous
> repack process which creates a new pack and then prunes the
> old object.

I like the context above and what follows it, but I think you forgot
to mention what the patch actually does. :)

I guess it is:

	However, in the first scan over refs in fetch-pack.c::everything_local,
	this double-check of packfiles is not necessary since we are only
	trying to get a rough estimate of the last time we fetched from this
	remote repository in order to find good candidate common commits ---
	a missed object would only result in a slightly slower fetch.

	Avoid that slow second scan in the common case by guarding the object
	lookup with has_sha1_file().

Sounds like it would not affect most fetches except by making them
a lot faster in the many-refs case, so for what it's worth I like it.

I had not read this codepath before.  I'm left with a few questions:

 * Why is 49bb805e ("Do not ask for objects known to be complete",
   2005-10-19) trying to do?  Are we hurting that in any way?

   For the sake of an example, suppose in my stalled project I
   maintain 20 topic branches against an unmoving mainline I do not
   advertise and you regularly fetch from me.  The cutoff is the
   *newest* commit date of any of my topic branches you already have.
   By declaring you have that topic branch you avoid a complicated
   negotiation to discover that we both have the mainline.  Is that
   the goal?

 * Is has_sha1_file() generally succeptible to the race against repack
   you mentioned?  How is that normally dealt with?

 * Can a slow operation get confused if an object is incorporated into
   a pack and then expelled again by two repacks in sequence?

Thanks,
Jonathan

  reply	other threads:[~2013-01-27 10:28 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-01-26 22:40 [PATCH 0/2] optimizing pack access on "read only" fetch repos Jeff King
2013-01-26 22:40 ` [PATCH 1/2] fetch: run gc --auto after fetching Jeff King
2013-01-27  1:51   ` Jonathan Nieder
     [not found]   ` <87bmopzbqx.fsf@gmail.com>
2017-07-12 20:00     ` git gc --auto aquires *.lock files that make a subsequent git-fetch error out Jeff King
2017-07-12 20:30       ` Ævar Arnfjörð Bjarmason
2017-07-12 20:43         ` Jeff King
2013-01-26 22:40 ` [PATCH 2/2] fetch-pack: avoid repeatedly re-scanning pack directory Jeff King
2013-01-27 10:27   ` Jonathan Nieder [this message]
2013-01-27 20:09     ` Junio C Hamano
2013-01-27 23:20       ` Jonathan Nieder
2013-01-27  6:32 ` [PATCH 0/2] optimizing pack access on "read only" fetch repos Junio C Hamano
2013-01-29  8:06   ` Shawn Pearce
2013-01-29  8:29   ` Jeff King
2013-01-29 15:25     ` Martin Fick
2013-01-29 15:58     ` Junio C Hamano
2013-01-29 21:19       ` Jeff King
2013-01-29 22:26         ` Junio C Hamano
2013-01-31 16:47         ` Shawn Pearce
2013-02-01  9:14           ` Jeff King
2013-02-02 10:07             ` Shawn Pearce
2013-01-29 11:01   ` Duy Nguyen
  -- strict thread matches above, loose matches on Subject: below --
2012-12-07 13:53 [PATCH 0/2] make repeated fetch faster on "read only" repos Jeff King
2012-12-07 14:04 ` [PATCH 2/2] fetch-pack: avoid repeatedly re-scanning pack directory Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130127102753.GB4228@elie.Belkin \
    --to=jrnieder@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).