git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
To: git@vger.kernel.org
Cc: "Junio C Hamano" <gitster@pobox.com>, "Jeff King" <peff@peff.net>,
	"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
Subject: [PATCH v2 4/5] index-pack, unpack-objects: add --not-so-strict for connectivity check
Date: Wed,  1 May 2013 17:59:33 +0700	[thread overview]
Message-ID: <1367405974-22190-5-git-send-email-pclouds@gmail.com> (raw)
In-Reply-To: <1367405974-22190-1-git-send-email-pclouds@gmail.com>

--not-so-strict only checks if all links from objects in the pack
point to real objects (either in current repo, or from the pack
itself). It's like check_everything_connected() except that:

 - it does not follow DAG in order
 - it can detect incomplete object islands
 - it seems to be faster than "rev-list --objects --all"

On my box, "rev-list --objects --all" takes 34 seconds. index-pack takes

215.25user 8.42system 1:32.31elapsed 242%CPU (0avgtext+0avgdata 1357328maxresident)k
0inputs+1421016outputs (0major+1222987minor)pagefaults 0swaps

And index-pack --not-so-strict takes

pack    96a4e3befa40bf38eddc2d7c99246a59af4ad55d
229.75user 11.31system 1:42.50elapsed 235%CPU (0avgtext+0avgdata 1876816maxresident)k
0inputs+1421016outputs (0major+1307989minor)pagefaults 0swaps

The overhead is about 10 seconds, just 1/3 of rev-list, which makes it
in a better position to replace check_everything_connected(). If this
holds true for general case, it could reduce fetch time by a little bit.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 Documentation/git-index-pack.txt     | 3 +++
 Documentation/git-unpack-objects.txt | 4 ++++
 builtin/index-pack.c                 | 7 ++++++-
 builtin/unpack-objects.c             | 9 +++++++--
 4 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/Documentation/git-index-pack.txt b/Documentation/git-index-pack.txt
index bde8eec..51af7be 100644
--- a/Documentation/git-index-pack.txt
+++ b/Documentation/git-index-pack.txt
@@ -74,6 +74,9 @@ OPTIONS
 --strict::
 	Die, if the pack contains broken objects or links.
 
+--not-so-strict::
+	Die if the pack contains broken links. For internal use only.
+
 --threads=<n>::
 	Specifies the number of threads to spawn when resolving
 	deltas. This requires that index-pack be compiled with
diff --git a/Documentation/git-unpack-objects.txt b/Documentation/git-unpack-objects.txt
index ff23494..14b6018 100644
--- a/Documentation/git-unpack-objects.txt
+++ b/Documentation/git-unpack-objects.txt
@@ -44,6 +44,10 @@ OPTIONS
 --strict::
 	Don't write objects with broken content or links.
 
+--not-so-strict::
+	Don't write objects with broken links. For internal
+	use only.
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 1fd56d9..5d28a1b 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -77,6 +77,7 @@ static int nr_threads;
 
 static int from_stdin;
 static int strict;
+static int do_fsck_object;
 static int verbose;
 static int show_stat;
 
@@ -756,7 +757,8 @@ static void sha1_object(const void *data, struct object_entry *obj_entry,
 			obj = parse_object_buffer(sha1, type, size, buf, &eaten);
 			if (!obj)
 				die(_("invalid %s"), typename(type));
-			if (fsck_object(obj, 1, fsck_error_function))
+			if (do_fsck_object &&
+			    fsck_object(obj, 1, fsck_error_function))
 				die(_("Error in object"));
 			if (fsck_walk(obj, mark_link, NULL))
 				die(_("Not all child objects of %s are reachable"), sha1_to_hex(obj->sha1));
@@ -1511,6 +1513,9 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 				fix_thin_pack = 1;
 			} else if (!strcmp(arg, "--strict")) {
 				strict = 1;
+				do_fsck_object = 1;
+			} else if (!strcmp(arg, "--not-so-strict")) {
+				strict = 1;
 			} else if (!strcmp(arg, "--verify")) {
 				verify = 1;
 			} else if (!strcmp(arg, "--verify-stat")) {
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index 2217d7b..dd0518b 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -12,7 +12,7 @@
 #include "decorate.h"
 #include "fsck.h"
 
-static int dry_run, quiet, recover, has_errors, strict;
+static int dry_run, quiet, recover, has_errors, strict, do_fsck_object;
 static const char unpack_usage[] = "git unpack-objects [-n] [-q] [-r] [--strict] < pack-file";
 
 /* We always read in 4kB chunks. */
@@ -198,7 +198,7 @@ static int check_object(struct object *obj, int type, void *data)
 		return 0;
 	}
 
-	if (fsck_object(obj, 1, fsck_error_function))
+	if (do_fsck_object && fsck_object(obj, 1, fsck_error_function))
 		die("Error in object");
 	if (fsck_walk(obj, check_object, NULL))
 		die("Error on reachable objects of %s", sha1_to_hex(obj->sha1));
@@ -520,6 +520,11 @@ int cmd_unpack_objects(int argc, const char **argv, const char *prefix)
 				continue;
 			}
 			if (!strcmp(arg, "--strict")) {
+				do_fsck_object = 1;
+				strict = 1;
+				continue;
+			}
+			if (!strcmp(arg, "--not-so-strict")) {
 				strict = 1;
 				continue;
 			}
-- 
1.8.2.83.gc99314b

  parent reply	other threads:[~2013-05-01 10:59 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-03-31 11:09 [PATCH 0/4] check_everything_connected replacement Nguyễn Thái Ngọc Duy
2013-03-31 11:09 ` [PATCH 1/4] fetch-pack: save shallow file before fetching the pack Nguyễn Thái Ngọc Duy
2013-04-01 14:53   ` Junio C Hamano
2013-04-05  2:11     ` Duy Nguyen
2013-03-31 11:09 ` [PATCH 2/4] index-pack: remove dead code (it should never happen) Nguyễn Thái Ngọc Duy
2013-03-31 11:09 ` [PATCH 3/4] index-pack, unpack-objects: add --not-so-strict for connectivity check Nguyễn Thái Ngọc Duy
2013-03-31 11:09 ` [PATCH 4/4] Use --not-so-strict on all pack transfer " Nguyễn Thái Ngọc Duy
2013-04-01 14:48 ` [PATCH 0/4] check_everything_connected replacement Junio C Hamano
2013-05-01 10:59 ` [PATCH v2 0/5] " Nguyễn Thái Ngọc Duy
2013-05-01 10:59   ` [PATCH v2 1/5] clone: let the user know when check_everything_connected is run Nguyễn Thái Ngọc Duy
2013-05-01 10:59   ` [PATCH v2 2/5] fetch-pack: prepare updated shallow file before fetching the pack Nguyễn Thái Ngọc Duy
2013-05-01 20:27     ` Junio C Hamano
2013-05-02 10:04       ` Duy Nguyen
2013-05-01 10:59   ` [PATCH v2 3/5] index-pack: remove dead code (it should never happen) Nguyễn Thái Ngọc Duy
2013-05-01 10:59   ` Nguyễn Thái Ngọc Duy [this message]
2013-05-01 23:35     ` [PATCH v2 4/5] index-pack, unpack-objects: add --not-so-strict for connectivity check Junio C Hamano
2013-05-02  9:53       ` Duy Nguyen
2013-05-02 16:27         ` Junio C Hamano
2013-05-03  2:29           ` Duy Nguyen
2013-05-03  6:33             ` Junio C Hamano
2013-05-03  6:55               ` Junio C Hamano
2013-05-03  7:09                 ` Duy Nguyen
2013-05-03  8:16                   ` Eric Sunshine
2013-05-01 10:59   ` [PATCH v2 5/5] Use --not-so-strict on all pack transfer " Nguyễn Thái Ngọc Duy
2013-05-03 12:35   ` [PATCH v3 0/4] check_everything_connected replacement Nguyễn Thái Ngọc Duy
2013-05-03 12:35     ` [PATCH v3 1/4] clone: let the user know when check_everything_connected is run Nguyễn Thái Ngọc Duy
2013-05-03 12:35     ` [PATCH v3 2/4] fetch-pack: prepare updated shallow file before fetching the pack Nguyễn Thái Ngọc Duy
2013-05-03 12:37       ` Eric Sunshine
2013-05-07 15:59       ` Junio C Hamano
2013-05-26  1:01         ` Duy Nguyen
2013-05-03 12:35     ` [PATCH v3 3/4] index-pack: remove dead code (it should never happen) Nguyễn Thái Ngọc Duy
2013-05-03 12:35     ` [PATCH v3 4/4] clone: open a shortcut for connectivity check Nguyễn Thái Ngọc Duy
2013-05-03 12:41       ` Eric Sunshine
2013-05-03 16:15       ` Junio C Hamano
2013-05-04  1:10         ` Duy Nguyen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1367405974-22190-5-git-send-email-pclouds@gmail.com \
    --to=pclouds@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).