From: Junio C Hamano <gitster@pobox.com>
To: "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
Cc: git@vger.kernel.org, Jeff King <peff@peff.net>
Subject: Re: [PATCH v3 4/4] clone: open a shortcut for connectivity check
Date: Fri, 03 May 2013 09:15:15 -0700
Message-ID: <7vwqrgxcoc.fsf@alter.siamese.dyndns.org>
In-Reply-To: <1367584514-19806-5-git-send-email-pclouds@gmail.com> ("Nguyễn Thái Ngọc Duy"'s message of "Fri, 3 May 2013 19:35:14 +0700")

Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:

> In order to make sure the cloned repository is good, we run "rev-list
> --objects --not --all $new_refs" on the repository. This is expensive
> on large repositories. This patch attempts to mitigate the impact in
> this special case.
>
> In the "good" clone case, we only have one pack.

If "On large repositories" is the focus, we need to take into
account the fact that pack.packSizeLimit can split and store the
incoming packstream to multiple packs, so "only have one pack" is
misleading.

I think you can still do the same trick even when we split the pack
as index-pack will keep track of the objects it saw in the same
incoming pack stream (but I am writing this from memory without
looking at the original code you are touching, so please double
check).

> If all of the
> following are met, we can be sure that all objects reachable from the
> new refs exist, which is the intention of running "rev-list ...":
>
>  - all refs point to an object in the pack
>  - there are no dangling pointers in any object in the pack
>  - no objects in the pack point to objects outside the pack
>
> The second and third checks can be done with the help of index-pack as
> a slight variation of --strict check (which introduces a new condition
> for the shortcut: pack transfer must be used and the number of objects
> large enough to call index-pack). The first is checked in
> check_everything_connected after we get an "ok" from index-pack.
>
> "index-pack + new checks" is still faster than the current "index-pack
> + rev-list", which is the whole point of this patch. If any of the

Does the same check apply if we end up on the unpack-objects
codepath?

> This shortcut is not applied to shallow clones, partly because shallow
> clones should have no more objects than a usual fetch and the cost of
> rev-list is acceptable, partly to avoid dealing with corner cases when
> grafting is involved.
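
For readers following the thread, the three conditions quoted above can be sketched as a toy model. This is only an illustration under stated assumptions, not git's code and not the patch's implementation: the pack is modelled as a dict from object id to the ids it points to, and the names clone_shortcut_ok, pack, refs and other_objects are all hypothetical.

```python
def clone_shortcut_ok(pack, refs, other_objects=frozenset()):
    """Toy version of the three conditions from the quoted commit message.

    pack          -- dict: object id -> list of object ids it points to
    refs          -- dict: ref name -> object id
    other_objects -- ids that exist in the repository but outside the pack
    (All three names are hypothetical; git has no such API.)
    """
    # Condition 1: all refs point to an object in the pack.
    for name, oid in refs.items():
        if oid not in pack:
            return False, f"ref {name} points outside the pack"

    # Conditions 2 and 3: every pointer held by an object in the pack
    # must resolve to another object in the same pack.
    for oid, links in pack.items():
        for target in links:
            if target in pack:
                continue
            if target in other_objects:
                # Condition 3 violated: the target exists, but not in the pack.
                return False, f"{oid} points to {target} outside the pack"
            # Condition 2 violated: the target does not exist at all.
            return False, f"{oid} has a dangling pointer to {target}"

    return True, "connected"


# Two commits, two trees and one shared blob, all in one pack.
pack = {"c2": ["c1", "t2"], "c1": ["t1"], "t1": ["b1"], "t2": ["b1"], "b1": []}
refs = {"refs/heads/master": "c2"}
print(clone_shortcut_ok(pack, refs))
```

In this model a failed check would simply mean falling back to the full rev-list walk, so the shortcut can only skip work, never accept a repository the slow path would reject.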
