git@vger.kernel.org mailing list mirror (one of many)
* [PATCH 0/5] avoiding pointless pack-directory re-scans
@ 2017-11-20 20:26 Jeff King
  2017-11-20 20:26 ` [PATCH 1/5] p5550: factor our nonsense-pack creation Jeff King
                   ` (4 more replies)
  0 siblings, 5 replies; 25+ messages in thread
From: Jeff King @ 2017-11-20 20:26 UTC (permalink / raw)
  To: git

I recently dug into a performance problem running "git fetch" in a
repository with 5000 packs. Now obviously that's a silly number of packs
to have, but I did find some pretty low-hanging fruit.  Most of the time
was spent in pointlessly re-scanning the objects/pack directory.

This series has two fixes, along with a perf test that covers this case.
I think the perf test is especially important here because we actually
fixed one of these cases (patch 4) already, but later regressed it
without noticing. The perf suite could have caught that (and would after
this series).

There are numbers in the individual commits, but here's the before/after
for the whole series:

Test            origin            HEAD
--------------------------------------------------------
5551.4: fetch   5.48(4.99+0.50)   0.14(0.09+0.05) -97.4%

That's on a somewhat-contrived setup meant to maximize our chances of
noticing a regression. But I do think these re-scans are probably
kicking in for normal situations, too; they just aren't quite expensive
enough for anybody to have noticed and dug into them.

Patches:

  [1/5]: p5550: factor our nonsense-pack creation
  [2/5]: t/perf/lib-pack: use fast-import checkpoint to create packs
  [3/5]: p5551: add a script to test fetch pack-dir rescans
  [4/5]: everything_local: use "quick" object existence check
  [5/5]: sha1_file: don't re-scan pack directory for null sha1

 fetch-pack.c                 |  3 ++-
 sha1_file.c                  |  3 +++
 t/perf/lib-pack.sh           | 25 ++++++++++++++++++++
 t/perf/p5550-fetch-tags.sh   | 25 ++------------------
 t/perf/p5551-fetch-rescan.sh | 55 ++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 87 insertions(+), 24 deletions(-)
 create mode 100644 t/perf/lib-pack.sh
 create mode 100755 t/perf/p5551-fetch-rescan.sh

-Peff

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 1/5] p5550: factor our nonsense-pack creation
  2017-11-20 20:26 [PATCH 0/5] avoiding pointless pack-directory re-scans Jeff King
@ 2017-11-20 20:26 ` Jeff King
  2017-11-20 23:55   ` Eric Sunshine
  2017-11-20 20:27 ` [PATCH 2/5] t/perf/lib-pack: use fast-import checkpoint to create packs Jeff King
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 25+ messages in thread
From: Jeff King @ 2017-11-20 20:26 UTC (permalink / raw)
  To: git

We have a function to create a bunch of irrelevant packs to
measure the expense of reprepare_packed_git(). Let's make
that available to other perf scripts.

Signed-off-by: Jeff King <peff@peff.net>
---
 t/perf/lib-pack.sh         | 29 +++++++++++++++++++++++++++++
 t/perf/p5550-fetch-tags.sh | 25 ++-----------------------
 2 files changed, 31 insertions(+), 23 deletions(-)
 create mode 100644 t/perf/lib-pack.sh

diff --git a/t/perf/lib-pack.sh b/t/perf/lib-pack.sh
new file mode 100644
index 0000000000..501bb7b272
--- /dev/null
+++ b/t/perf/lib-pack.sh
@@ -0,0 +1,29 @@
+# Helpers for dealing with large numbers of packs.
+
+# create $1 nonsense packs, each with a single blob
+create_packs () {
+	perl -le '
+		my ($n) = @ARGV;
+		for (1..$n) {
+			print "blob";
+			print "data <<EOF";
+			print "$_";
+			print "EOF";
+		}
+	' "$@" |
+	git fast-import &&
+
+	git cat-file --batch-all-objects --batch-check='%(objectname)' |
+	while read sha1
+	do
+		echo $sha1 | git pack-objects .git/objects/pack/pack
+	done
+}
+
+# create a large number of packs, disabling any gc which might
+# cause us to repack them
+setup_many_packs () {
+	git config gc.auto 0 &&
+	git config gc.autopacklimit 0 &&
+	create_packs 500
+}
diff --git a/t/perf/p5550-fetch-tags.sh b/t/perf/p5550-fetch-tags.sh
index a5dc39f86a..d0e0e019ea 100755
--- a/t/perf/p5550-fetch-tags.sh
+++ b/t/perf/p5550-fetch-tags.sh
@@ -20,6 +20,7 @@ start to show a noticeable performance problem on my machine, but without
 taking too long to set up and run the tests.
 '
 . ./perf-lib.sh
+. "$TEST_DIRECTORY/perf/lib-pack.sh"
 
 # make a long nonsense history on branch $1, consisting of $2 commits, each
 # with a unique file pointing to the blob at $2.
@@ -44,26 +45,6 @@ create_tags () {
 	git update-ref --stdin
 }
 
-# create $1 nonsense packs, each with a single blob
-create_packs () {
-	perl -le '
-		my ($n) = @ARGV;
-		for (1..$n) {
-			print "blob";
-			print "data <<EOF";
-			print "$_";
-			print "EOF";
-		}
-	' "$@" |
-	git fast-import &&
-
-	git cat-file --batch-all-objects --batch-check='%(objectname)' |
-	while read sha1
-	do
-		echo $sha1 | git pack-objects .git/objects/pack/pack
-	done
-}
-
 test_expect_success 'create parent and child' '
 	git init parent &&
 	git -C parent commit --allow-empty -m base &&
@@ -84,9 +65,7 @@ test_expect_success 'populate parent tags' '
 test_expect_success 'create child packs' '
 	(
 		cd child &&
-		git config gc.auto 0 &&
-		git config gc.autopacklimit 0 &&
-		create_packs 500
+		setup_many_packs
 	)
 '
 
-- 
2.15.0.494.g79a8547723



* [PATCH 2/5] t/perf/lib-pack: use fast-import checkpoint to create packs
  2017-11-20 20:26 [PATCH 0/5] avoiding pointless pack-directory re-scans Jeff King
  2017-11-20 20:26 ` [PATCH 1/5] p5550: factor our nonsense-pack creation Jeff King
@ 2017-11-20 20:27 ` Jeff King
  2017-11-20 20:28 ` [PATCH 3/5] p5551: add a script to test fetch pack-dir rescans Jeff King
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 25+ messages in thread
From: Jeff King @ 2017-11-20 20:27 UTC (permalink / raw)
  To: git

We currently use fast-import only to create a large number
of objects, and then run O(n) invocations of pack-objects to
turn them into packs.

We can do this faster by just asking fast-import to
checkpoint and create a pack for each (after telling it
not to unpack tiny packs into loose objects).
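
For illustration, the input stream for two packs would then look like
the output of this sketch (gen_stream is a made-up shell stand-in for
the perl one-liner, shown here just to visualize the stream):

```shell
# Sketch of the fast-import input that create_packs generates after
# this patch: one blob definition plus a "checkpoint" per pack.
# (gen_stream is an illustrative name, not part of the patch.)
gen_stream () {
	n=$1
	i=1
	while [ "$i" -le "$n" ]
	do
		printf 'blob\ndata <<EOF\n%s\nEOF\ncheckpoint\n' "$i"
		i=$((i + 1))
	done
}

gen_stream 2
```

Each "checkpoint" makes fast-import close the current packfile and
start a new one, so with fastimport.unpacklimit set to 0 every
single-blob pack is kept as a pack instead of being exploded into
loose objects.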

Signed-off-by: Jeff King <peff@peff.net>
---
 t/perf/lib-pack.sh | 10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/t/perf/lib-pack.sh b/t/perf/lib-pack.sh
index 501bb7b272..d3865db286 100644
--- a/t/perf/lib-pack.sh
+++ b/t/perf/lib-pack.sh
@@ -9,15 +9,10 @@ create_packs () {
 			print "data <<EOF";
 			print "$_";
 			print "EOF";
+			print "checkpoint"
 		}
 	' "$@" |
-	git fast-import &&
-
-	git cat-file --batch-all-objects --batch-check='%(objectname)' |
-	while read sha1
-	do
-		echo $sha1 | git pack-objects .git/objects/pack/pack
-	done
+	git fast-import
 }
 
 # create a large number of packs, disabling any gc which might
@@ -25,5 +20,6 @@ create_packs () {
 setup_many_packs () {
 	git config gc.auto 0 &&
 	git config gc.autopacklimit 0 &&
+	git config fastimport.unpacklimit 0 &&
 	create_packs 500
 }
-- 
2.15.0.494.g79a8547723



* [PATCH 3/5] p5551: add a script to test fetch pack-dir rescans
  2017-11-20 20:26 [PATCH 0/5] avoiding pointless pack-directory re-scans Jeff King
  2017-11-20 20:26 ` [PATCH 1/5] p5550: factor our nonsense-pack creation Jeff King
  2017-11-20 20:27 ` [PATCH 2/5] t/perf/lib-pack: use fast-import checkpoint to create packs Jeff King
@ 2017-11-20 20:28 ` Jeff King
  2017-11-20 20:29 ` [PATCH 4/5] everything_local: use "quick" object existence check Jeff King
  2017-11-20 20:35 ` [PATCH 5/5] sha1_file: don't re-scan pack directory for null sha1 Jeff King
  4 siblings, 0 replies; 25+ messages in thread
From: Jeff King @ 2017-11-20 20:28 UTC (permalink / raw)
  To: git

Since fetch often deals with object-ids we don't have (yet),
it's an easy mistake for it to use a function like
parse_object() that gives the correct result (e.g., NULL)
but does so very slowly (because after failing to find the
object, we re-scan the pack directory looking for new
packs).

The regular test suite won't catch this because the end
result is correct, but we would want to know about
performance regressions, too. Let's add a test to the
regression suite.

Note that this uses a synthetic repository that has a large
number of packs. That's not ideal, as it means we're not
testing what "normal" users see (in fact, some of these
problems have existed for ages without anybody noticing
simply because a rescan on a normal repository just isn't
that expensive).

So what we're really looking for here is the spike you'd
notice in a pathological case (a lot of unknown objects
coming into a repo with a lot of packs). If that's fast,
then the normal cases should be, too.

Note that the test also makes liberal use of $MODERN_GIT for
setup; some of these regressions go back a ways, and we
should be able to use it to find the problems there.

Signed-off-by: Jeff King <peff@peff.net>
---
 t/perf/p5551-fetch-rescan.sh | 55 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 55 insertions(+)
 create mode 100755 t/perf/p5551-fetch-rescan.sh

diff --git a/t/perf/p5551-fetch-rescan.sh b/t/perf/p5551-fetch-rescan.sh
new file mode 100755
index 0000000000..b99dc23e32
--- /dev/null
+++ b/t/perf/p5551-fetch-rescan.sh
@@ -0,0 +1,55 @@
+#!/bin/sh
+
+test_description='fetch performance with many packs
+
+It is common for fetch to consider objects that we might not have, and it is an
+easy mistake for the code to use a function like `parse_object` that might
+give the correct _answer_ on such an object, but do so slowly (due to
+re-scanning the pack directory for lookup failures).
+
+The resulting performance drop can be hard to notice in a real repository, but
+becomes quite large in a repository with a large number of packs. So this
+test creates a more pathological case, since any mistakes would produce a more
+noticeable slowdown.
+'
+. ./perf-lib.sh
+. "$TEST_DIRECTORY"/perf/lib-pack.sh
+
+test_expect_success 'create parent and child' '
+	git init parent &&
+	git clone parent child
+'
+
+
+test_expect_success 'create refs in the parent' '
+	(
+		cd parent &&
+		git commit --allow-empty -m foo &&
+		head=$(git rev-parse HEAD) &&
+		test_seq 1000 |
+		sed "s,.*,update refs/heads/& $head," |
+		$MODERN_GIT update-ref --stdin
+	)
+'
+
+test_expect_success 'create many packs in the child' '
+	(
+		cd child &&
+		setup_many_packs
+	)
+'
+
+test_perf 'fetch' '
+	# start at the same state for each iteration
+	obj=$($MODERN_GIT -C parent rev-parse HEAD) &&
+	(
+		cd child &&
+		$MODERN_GIT for-each-ref --format="delete %(refname)" refs/remotes |
+		$MODERN_GIT update-ref --stdin &&
+		rm -vf .git/objects/$(echo $obj | sed "s|^..|&/|") &&
+
+		git fetch
+	)
+'
+
+test_done
-- 
2.15.0.494.g79a8547723



* [PATCH 4/5] everything_local: use "quick" object existence check
  2017-11-20 20:26 [PATCH 0/5] avoiding pointless pack-directory re-scans Jeff King
                   ` (2 preceding siblings ...)
  2017-11-20 20:28 ` [PATCH 3/5] p5551: add a script to test fetch pack-dir rescans Jeff King
@ 2017-11-20 20:29 ` Jeff King
  2017-11-20 20:35 ` [PATCH 5/5] sha1_file: don't re-scan pack directory for null sha1 Jeff King
  4 siblings, 0 replies; 25+ messages in thread
From: Jeff King @ 2017-11-20 20:29 UTC (permalink / raw)
  To: git

In b495697b82 (fetch-pack: avoid repeatedly re-scanning pack
directory, 2013-01-26), we noticed that everything_local()
could waste time trying to find and parse objects which we
_expect_ to be missing. The solution was to put
has_sha1_file() in front of parse_object() to skip the
more-expensive parse attempt.

That optimization was negated later when has_sha1_file()
learned to do the same re-scan in 45e8a74873 (has_sha1_file:
re-check pack directory before giving up, 2013-08-30).

We can restore it by using the "quick" flag to tell
has_sha1_file (actually has_object_file these days) that we
prefer speed to thoroughness for this call.  See also the
fixes in 5827a0354 and 0eeb077be7 for prior art and
discussion on using the "quick" flag for these cases.
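
To make the tradeoff concrete, here is a toy shell model of the two
strategies (purely illustrative; the names and logic are made up and
are not git's actual code): a "thorough" lookup re-scans the pack
directory on a miss, while a "quick" one accepts a possibly-stale
answer.

```shell
# Toy model: a cached list of packs; a miss either triggers a re-scan
# (thorough) or is reported immediately (quick). Illustration only.
packdir=$(mktemp -d)
touch "$packdir/pack-aaaa.pack" "$packdir/pack-bbbb.pack"
scans=0

rescan () {
	scans=$((scans + 1))
	packs=$(ls "$packdir" | tr '\n' ' ')
}

lookup () { # $1 = pack name, $2 = "quick" or "thorough"
	case " $packs " in
	*" $1 "*) return 0 ;;
	esac
	if [ "$2" = thorough ]
	then
		rescan # maybe a simultaneous repack created it
		case " $packs " in
		*" $1 "*) return 0 ;;
		esac
	fi
	return 1
}

rescan                                  # initial scan
lookup pack-cccc.pack thorough || true  # miss: costs an extra scan
lookup pack-cccc.pack quick || true     # miss: no extra scan
echo "scans after lookups: $scans"
rm -r "$packdir"
```

The thorough miss pays for an extra directory scan on every lookup of
an object we don't have, which is roughly the cost p5551 exercises.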

The recently-added performance regression test in p5551
demonstrates the problem. You can see the original fix:

  Test            b495697b82^       b495697b82
  --------------------------------------------------------
  5551.4: fetch   1.68(1.33+0.35)   0.87(0.69+0.18) -48.2%

and then the regression:

  Test            45e8a74873^       45e8a74873
  ---------------------------------------------------------
  5551.4: fetch   0.96(0.77+0.19)   2.55(2.04+0.50) +165.6%

and now our fix:

  Test            HEAD^             HEAD
  --------------------------------------------------------
  5551.4: fetch   7.21(6.58+0.63)   5.47(5.04+0.43) -24.1%

You can also see that other things have gotten a lot slower
since 2013. We'll deal with those in separate patches.

Signed-off-by: Jeff King <peff@peff.net>
---
 fetch-pack.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index 008b25d3db..9f6b07ad91 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -716,7 +716,8 @@ static int everything_local(struct fetch_pack_args *args,
 	for (ref = *refs; ref; ref = ref->next) {
 		struct object *o;
 
-		if (!has_object_file(&ref->old_oid))
+		if (!has_object_file_with_flags(&ref->old_oid,
+						OBJECT_INFO_QUICK))
 			continue;
 
 		o = parse_object(&ref->old_oid);
-- 
2.15.0.494.g79a8547723



* [PATCH 5/5] sha1_file: don't re-scan pack directory for null sha1
  2017-11-20 20:26 [PATCH 0/5] avoiding pointless pack-directory re-scans Jeff King
                   ` (3 preceding siblings ...)
  2017-11-20 20:29 ` [PATCH 4/5] everything_local: use "quick" object existence check Jeff King
@ 2017-11-20 20:35 ` Jeff King
  2017-11-20 20:47   ` Stefan Beller
                     ` (2 more replies)
  4 siblings, 3 replies; 25+ messages in thread
From: Jeff King @ 2017-11-20 20:35 UTC (permalink / raw)
  To: git

In theory nobody should ever ask the low-level object code
for a null sha1. It's used as a sentinel for "no such
object" in lots of places, so leaking through to this level
is a sign that the higher-level code is not being careful
about its error-checking.  In practice, though, quite a few
code paths seem to rely on the null sha1 lookup failing as a
way to quietly propagate non-existence (e.g., by feeding it
to lookup_commit_reference_gently(), which then returns
NULL).

When this happens, we do two inefficient things:

  1. We actually search for the null sha1 in packs and in
     the loose object directory.

  2. When we fail to find it, we re-scan the pack directory
     in case a simultaneous repack happened to move it from
     loose to packed.

It's debatable whether (1) is a good idea or not. The
performance drop is at least linear in the number of
lookups, so it's not too bad. And if by some 2^-160th chance
somebody does have such an object, we probably ought to
access it. On the other hand, enough code paths treat the
null sha1 specially that the object probably isn't usable,
anyway.

Problem (2), on the other hand, is a pretty severe
performance issue. If you have a large number of packs,
rescanning the pack directory can be very expensive. And it
only helps in the case that you have an object with the null
sha1 _and_ somebody was simultaneously repacking it. The
tradeoff is certainly a bad one there.

In an ideal world, we'd simply fix all of the callers to
notice the null sha1 and avoid passing it to us. But a
simple experiment to catch this with a BUG() shows that
there are a large number of code paths.

In the meantime, let's address problem (2). That's the
minimal fix we can do to make the performance problems go
away. p5551 shows this off (when a fetched ref is new, the
"old" sha1 is 0{40}, which ends up being passed for
fast-forward checks, the status table abbreviations, etc):

  Test            HEAD^             HEAD
  --------------------------------------------------------
  5551.4: fetch   5.51(5.03+0.48)   0.17(0.10+0.06) -96.9%

We could address (1), too, but there's not much performance
improvement left to make.
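
For reference, the sentinel in question is just forty zero hex digits,
and the loose-object path that gets probed for it can be derived with
the same sed transformation p5551 uses:

```shell
# The null sha1 sentinel: forty zeros.
null_sha1=$(printf '%040d' 0)
echo "$null_sha1"

# Loose-object lookup splits off the first two hex digits as a
# fan-out directory (the same transformation as p5551's sed).
loose_path=$(echo "$null_sha1" | sed "s|^..|&/|")
echo ".git/objects/$loose_path"
```

With this patch, a lookup of that sha1 still checks the packs and that
loose path once, but on failure it now returns immediately instead of
re-scanning the pack directory.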

Signed-off-by: Jeff King <peff@peff.net>
---
This is the minimal fix that addresses the performance issues.
I'd actually have no problem at all declaring that looking up a null
sha1 is insane, and having the object-lookup routines simply return "no
such object" without even doing the loose/pack lookup first.

I'm also fine with going down the BUG() route and fixing the
higher-level callers, but it's a big enough task (with little enough
real-world impact) that I think it would be worth applying this in the
meantime.

 sha1_file.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/sha1_file.c b/sha1_file.c
index 8a7c6b7eba..dde0ad101d 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1180,6 +1180,9 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi,
 		if (!sha1_loose_object_info(real, oi, flags))
 			return 0;
 
+		if (is_null_sha1(sha1))
+			return -1;
+
 		/* Not a loose object; someone else may have just packed it. */
 		if (flags & OBJECT_INFO_QUICK) {
 			return -1;
-- 
2.15.0.494.g79a8547723


* Re: [PATCH 5/5] sha1_file: don't re-scan pack directory for null sha1
  2017-11-20 20:35 ` [PATCH 5/5] sha1_file: don't re-scan pack directory for null sha1 Jeff King
@ 2017-11-20 20:47   ` Stefan Beller
  2017-11-20 20:58     ` Jeff King
  2017-11-21  2:37   ` Junio C Hamano
  2017-11-21  5:20   ` Junio C Hamano
  2 siblings, 1 reply; 25+ messages in thread
From: Stefan Beller @ 2017-11-20 20:47 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On Mon, Nov 20, 2017 at 12:35 PM, Jeff King <peff@peff.net> wrote:
> In theory nobody should ever ask the low-level object code
> for a null sha1. It's used as a sentinel for "no such
> object" in lots of places, so leaking through to this level
> is a sign that the higher-level code is not being careful
> about its error-checking.  In practice, though, quite a few
> code paths seem to rely on the null sha1 lookup failing as a
> way to quietly propagate non-existence (e.g., by feeding it
> to lookup_commit_reference_gently(), which then returns
> NULL).
>
> When this happens, we do two inefficient things:
>
>   1. We actually search for the null sha1 in packs and in
>      the loose object directory.
>
>   2. When we fail to find it, we re-scan the pack directory
>      in case a simultaneous repack happened to move it from
>      loose to packed.
>
> It's debatable whether (1) is a good idea or not. The
> performance drop is at least linear in the number of
> lookups, so it's not too bad. And if by some 2^-160th chance
> somebody does have such an object, we probably ought to
> access it. On the other hand, enough code paths treat the
> null sha1 specially that the object probably isn't usable,
> anyway.
>
> Problem (2), on the other hand, is a pretty severe
> performance issue. If you have a large number of packs,
> rescanning the pack directory can be very expensive. And it
> only helps in the case that you have an object with the null
> sha1 _and_ somebody was simultaneously repacking it. The
> tradeoff is certainly a bad one there.
>
> In an ideal world, we'd simply fix all of the callers to
> notice the null sha1 and avoid passing it to us. But a
> simple experiment to catch this with a BUG() shows that
> there are a large number of code paths.
>
> In the meantime, let's address problem (2). That's the
> minimal fix we can do to make the performance problems go
> away. p5551 shows this off (when a fetched ref is new, the
> "old" sha1 is 0{40}, which ends up being passed for
> fast-forward checks, the status table abbreviations, etc):
>
>   Test            HEAD^             HEAD
>   --------------------------------------------------------
>   5551.4: fetch   5.51(5.03+0.48)   0.17(0.10+0.06) -96.9%
>
> We could address (1), too, but there's not much performance
> improvement left to make.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> This is the minimal fix that addresses the performance issues.
> I'd actually have no problem at all declaring that looking up a null
> sha1 is insane, and having the object-lookup routines simply return "no
> such object" without even doing the loose/pack lookup first.
>
> I'm also fine with going down the BUG() route and fixing the
> higher-level callers, but it's a big enough task (with little enough
> real-world impact) that I think it would be worth applying this in the
> meantime.

It would have a lot of impact in the future, as new developers would
be hindered from mis-using the API. The (unwritten) series introducing
BUG() would help a lot in 'holding it right', and I would expect fewer
performance regressions over time.

The patch is impressively small for such a performance gain.
Personally I think (1) (which essentially means "making null sha1
work like a regular sha1") is quite an endeavor at this point in time
for this code base.

As a tangent, a similar but solved problem is how a NUL in user data
is treated in the diff code (xdiff, for example); there we have been
careful since the beginning (though I think we don't have tests for
it, so it might be broken).

Stefan


* Re: [PATCH 5/5] sha1_file: don't re-scan pack directory for null sha1
  2017-11-20 20:47   ` Stefan Beller
@ 2017-11-20 20:58     ` Jeff King
  0 siblings, 0 replies; 25+ messages in thread
From: Jeff King @ 2017-11-20 20:58 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

On Mon, Nov 20, 2017 at 12:47:53PM -0800, Stefan Beller wrote:

> > This is the minimal fix that addresses the performance issues.
> > I'd actually have no problem at all declaring that looking up a null
> > sha1 is insane, and having the object-lookup routines simply return "no
> > such object" without even doing the loose/pack lookup first.
> >
> > I'm also fine with going down the BUG() route and fixing the
> > higher-level callers, but it's a big enough task (with little enough
> > real-world impact) that I think it would be worth applying this in the
> > meantime.
> 
> It would have a lot of impact in the future, when new developers
> are hindered mis-using the API. The (unwritten) series with introducing
> BUG() would help a lot in 'holding it right' and I would expect fewer
> performance
> regressions over time.

Yeah, I agree the BUG() route is the nicest. I started trying to hunt
down every code path causing a test failure, though, and soon got
overwhelmed.

I actually think it's not the worst thing in the world to say "it's OK
to look up the null sha1; the result is well-defined, and it does not
exist". It's perhaps philosophically a little ugly, but I think it's
quite practical.

> The patch is impressively small for such a performance gain.
> Personally I think (1) (which essentially means "making null sha1
> work like a regular sha1") is quite an endeavor at this point in time
> for this code base.

Yeah, it would be a crazy amount of effort to try to support that, for
very little gain. If we did want to just declare that the null sha1 does
not exist (even if you happen to have objects/00/00... in your
repository), the patch is similarly easy:

diff --git a/sha1_file.c b/sha1_file.c
index 8a7c6b7eba..754fe7f03e 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1152,6 +1152,9 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi,
 				    lookup_replace_object(sha1) :
 				    sha1;
 
+	if (is_null_sha1(sha1))
+		return -1;
+
 	if (!oi)
 		oi = &blank_oi;
 

> As a tangent, a similar but solved problem in the diff code is how
> NUL in user data is treated in xdiff for example, as there we kept
> being careful since the beginning (though I think we don't have tests
> for it, so it might be broken)

There I think it's nice if we can be tolerant of NULs, because they're a
thing that might actually happen in user input. A real null sha1 has a
2^-160 chance of happening, and the design of sha1 is such that it's
hard for somebody to do it intentionally.

You might be able to get up to some mischief with replace objects,
though.

-Peff


* Re: [PATCH 1/5] p5550: factor our nonsense-pack creation
  2017-11-20 20:26 ` [PATCH 1/5] p5550: factor our nonsense-pack creation Jeff King
@ 2017-11-20 23:55   ` Eric Sunshine
  2017-11-21 15:58     ` Jeff King
  0 siblings, 1 reply; 25+ messages in thread
From: Eric Sunshine @ 2017-11-20 23:55 UTC (permalink / raw)
  To: Jeff King; +Cc: Git List

On Mon, Nov 20, 2017 at 3:26 PM, Jeff King <peff@peff.net> wrote:
> p5550: factor our nonsense-pack creation

s/our/out/, I guess.

> We have a function to create a bunch of irrelevant packs to
> measure the expense of reprepare_packed_git(). Let's make
> that available to other perf scripts.
>
> Signed-off-by: Jeff King <peff@peff.net>


* Re: [PATCH 5/5] sha1_file: don't re-scan pack directory for null sha1
  2017-11-20 20:35 ` [PATCH 5/5] sha1_file: don't re-scan pack directory for null sha1 Jeff King
  2017-11-20 20:47   ` Stefan Beller
@ 2017-11-21  2:37   ` Junio C Hamano
  2017-11-21 22:57     ` Jeff King
  2017-11-21  5:20   ` Junio C Hamano
  2 siblings, 1 reply; 25+ messages in thread
From: Junio C Hamano @ 2017-11-21  2:37 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Jeff King <peff@peff.net> writes:

> In an ideal world, we'd simply fix all of the callers to
> notice the null sha1 and avoid passing it to us. But a
> simple experiment to catch this with a BUG() shows that
> there are a large number of code paths.

Well, we can view this (or the alternative you sent later that does
the same a bit earlier in the function) as "fixing the callers", given
that we have already refactored the common logic into a helper
function that all of these callers call into ;-).



* Re: [PATCH 5/5] sha1_file: don't re-scan pack directory for null sha1
  2017-11-20 20:35 ` [PATCH 5/5] sha1_file: don't re-scan pack directory for null sha1 Jeff King
  2017-11-20 20:47   ` Stefan Beller
  2017-11-21  2:37   ` Junio C Hamano
@ 2017-11-21  5:20   ` Junio C Hamano
  2017-11-21 23:17     ` Jeff King
  2 siblings, 1 reply; 25+ messages in thread
From: Junio C Hamano @ 2017-11-21  5:20 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Jeff King <peff@peff.net> writes:

> This is the minimal fix that addresses the performance issues.
> I'd actually have no problem at all declaring that looking up a null
> sha1 is insane, and having the object-lookup routines simply return "no
> such object" without even doing the loose/pack lookup first.
>
> diff --git a/sha1_file.c b/sha1_file.c
> index 8a7c6b7eba..dde0ad101d 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -1180,6 +1180,9 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi,
>  		if (!sha1_loose_object_info(real, oi, flags))
>  			return 0;
>  
> +		if (is_null_sha1(sha1))
> +			return -1;
> +
>  		/* Not a loose object; someone else may have just packed it. */
>  		if (flags & OBJECT_INFO_QUICK) {
>  			return -1;

After queuing this series to an earlier part of 'pu' and resolving a
conflict with jh/fsck-promisors topic, I ended up with a code that
rejects 0{40} a lot earlier, before we try to see if a pack entry
for 0{40} exists, even though the patch that is queued on this topic
is more conservative (i.e. the above one).

Perhaps we would want to use the alternate version that declares the
0{40} is a sentinel that signals that there is no such object in
this topic---that would give us a consistent semantics without
having to adjust jh/fsck-promisors when it becomes ready to be
merged.




* Re: [PATCH 1/5] p5550: factor our nonsense-pack creation
  2017-11-20 23:55   ` Eric Sunshine
@ 2017-11-21 15:58     ` Jeff King
  2017-11-22  0:32       ` Stefan Beller
  0 siblings, 1 reply; 25+ messages in thread
From: Jeff King @ 2017-11-21 15:58 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Git List

On Mon, Nov 20, 2017 at 06:55:51PM -0500, Eric Sunshine wrote:

> On Mon, Nov 20, 2017 at 3:26 PM, Jeff King <peff@peff.net> wrote:
> > p5550: factor our nonsense-pack creation
> 
> s/our/out/, I guess.

Heh, yes. I even fixed it once, but I have the funny habit of noticing
such typos while reading the "todo" list of "rebase -i" and fixing them
there. Which of course has no impact whatsoever on the commit. :-/

-Peff


* Re: [PATCH 5/5] sha1_file: don't re-scan pack directory for null sha1
  2017-11-21  2:37   ` Junio C Hamano
@ 2017-11-21 22:57     ` Jeff King
  2017-11-22  1:42       ` Junio C Hamano
  0 siblings, 1 reply; 25+ messages in thread
From: Jeff King @ 2017-11-21 22:57 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Tue, Nov 21, 2017 at 11:37:28AM +0900, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > In an ideal world, we'd simply fix all of the callers to
> > notice the null sha1 and avoid passing it to us. But a
> > simple experiment to catch this with a BUG() shows that
> > there are a large number of code paths.
> 
> Well, we can view this (or the alternative you sent later that does
> the same a bit earlier in the function) as "fixing the caller" but
> has already refactord the common logic to a helper function that all
> of these callers call into ;-).

Yes, I'm definitely tempted by that view. :)

What worries me, though, is that callers who lazily propagate the null
sha1 to the lookup functions cannot reasonably tell the difference
between "this object was corrupt or missing" and "we passed the null
sha1, and a missing object is expected".

For example, look at how fetch.c:update_local_ref() looks up objects.
It feeds the old and new sha1 to lookup_commit_reference_gently(), and
if either is NULL, it skips the fast-forward check. That makes sense if
we expect the null sha1 to get translated into a NULL commit. But it
also triggers for a broken ref, and we'd overwrite it (when the right
thing is probably refusing to update).

Here's a runnable example:

-- >8 --
git init parent
git -C parent commit --allow-empty -m base

git clone parent child
git -C parent commit --allow-empty -m more

cd child
rm -f .git/objects/??/*
git fetch
-- 8< --

That final fetch spews a bunch of errors about broken refs, and then
overwrites the value of origin/master, even though it's broken (in this
case it actually is a fast-forward, but the child repo doesn't even know
that).

I'm not sure what the right behavior is, but I'm pretty sure that's not
it. Probably one of:

  - skip updating the ref when we see the breakage

  - ditto, but terminate the whole operation, since we might be deleting
    other refs and in a broken repo we're probably best to make as few
    changes as possible

  - behave as if it was a non-ff, which would allow "--force" to
    overwrite the broken ref. Maybe convenient for fixing things, but
    possibly surprising (and it's not that hard to just delete the
    broken refs manually before proceeding).

-Peff


* Re: [PATCH 5/5] sha1_file: don't re-scan pack directory for null sha1
  2017-11-21  5:20   ` Junio C Hamano
@ 2017-11-21 23:17     ` Jeff King
  2017-11-22  1:49       ` Junio C Hamano
  0 siblings, 1 reply; 25+ messages in thread
From: Jeff King @ 2017-11-21 23:17 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Tue, Nov 21, 2017 at 02:20:19PM +0900, Junio C Hamano wrote:

> After queuing this series to an earlier part of 'pu' and resolving a
> conflict with jh/fsck-promisors topic, I ended up with a code that
> rejects 0{40} a lot earlier, before we try to see if a pack entry
> for 0{40} exists, even though the patch that is queued on this topic
> is more conservative (i.e. the above one).
> 
> Perhaps we would want to use the alternate version that declares the
> 0{40} is a sentinel that signals that there is no such object in
> this topic---that would give us a consistent semantics without
> having to adjust jh/fsck-promisors when it becomes ready to be
> merged.

I think we could adjust the patches without too much effort. That said,
I am very much on the fence about whether to just declare the null sha1
as "not a real object" and automatically return from the lookup without
doing any work. But if you agree that it's not likely to hurt anything,
it's IMHO a lot easier to reason about.

Here's a re-roll of patch 5 that behaves this way (the patch should be
unsurprising, but I've updated the commit message to match).

I did notice one other thing: the function looks up replace objects, so
we have both the original and the replaced sha1 available. My earlier
patch used the original sha1, but this one uses the replaced value.
I'm not sure if that's sane or not. It lets the fast-path kick in if you
point a replace ref at 0{40}. And it lets you point refs/replace/0{40}
to something else. I doubt that's useful, but it could perhaps be for
debugging, etc.

In most cases, of course, it wouldn't matter (since pointing to or from
the null sha1 is vaguely crazy in the first place).

-- >8 --
Subject: [PATCH v2 5/5] sha1_file: fast-path null sha1 as a missing object

In theory nobody should ever ask the low-level object code
for a null sha1. It's used as a sentinel for "no such
object" in lots of places, so leaking through to this level
is a sign that the higher-level code is not being careful
about its error-checking.  In practice, though, quite a few
code paths seem to rely on the null sha1 lookup failing as a
way to quietly propagate non-existence (e.g., by feeding it
to lookup_commit_reference_gently(), which then returns
NULL).

When this happens, we do two inefficient things:

  1. We actually search for the null sha1 in packs and in
     the loose object directory.

  2. When we fail to find it, we re-scan the pack directory
     in case a simultaneous repack happened to move it from
     loose to packed. This can be very expensive if you have
     a large number of packs.

Only the second one actually causes noticeable performance
problems, so we could treat them independently. But for the
sake of simplicity (both of code and of reasoning about it),
it makes sense to just declare that the null sha1 cannot be
a real on-disk object, and looking it up will always return
"no such object".

There's no real loss of functionality in doing so. Its use
as a sentinel value means that anybody who is unlucky
enough to hit the 1-in-2^160 chance of generating an object
with that sha1 is already going to find the object largely
unusable.

In an ideal world, we'd simply fix all of the callers to
notice the null sha1 and avoid passing it to us. But a
simple experiment to catch this with a BUG() shows that
there are a large number of code paths that do so.

So in the meantime, let's fix the performance problem by
taking a fast exit from the object lookup when we see a null
sha1. p5551 shows off the improvement (when a fetched ref is
new, the "old" sha1 is 0{40}, which ends up being passed for
fast-forward checks, the status table abbreviations, etc):

  Test            HEAD^             HEAD
  --------------------------------------------------------
  5551.4: fetch   5.51(5.03+0.48)   0.17(0.10+0.06) -96.9%

Signed-off-by: Jeff King <peff@peff.net>
---
 sha1_file.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/sha1_file.c b/sha1_file.c
index 8a7c6b7eba..ae4b71f445 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1152,6 +1152,9 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi,
 				    lookup_replace_object(sha1) :
 				    sha1;
 
+	if (is_null_sha1(real))
+		return -1;
+
 	if (!oi)
 		oi = &blank_oi;
 
-- 
2.15.0.578.g35b419775f


* Re: [PATCH 1/5] p5550: factor our nonsense-pack creation
  2017-11-21 15:58     ` Jeff King
@ 2017-11-22  0:32       ` Stefan Beller
  2017-11-22 22:38         ` Jeff King
  0 siblings, 1 reply; 25+ messages in thread
From: Stefan Beller @ 2017-11-22  0:32 UTC (permalink / raw)
  To: Jeff King; +Cc: Eric Sunshine, Git List

On Tue, Nov 21, 2017 at 7:58 AM, Jeff King <peff@peff.net> wrote:
> On Mon, Nov 20, 2017 at 06:55:51PM -0500, Eric Sunshine wrote:
>
>> On Mon, Nov 20, 2017 at 3:26 PM, Jeff King <peff@peff.net> wrote:
>> > p5550: factor our nonsense-pack creation
>>
>> s/our/out/, I guess.
>
> Heh, yes. I even fixed it once, but I have the funny habit of noticing
> such typos while reading the "todo" list of "rebase -i" and fixing them
> there. Which of course has no impact whatsoever on the commit. :-/
>

That has happened to me a couple of times as well.
On first thought, this sounds like a UX bug.

On second thought, the text after the abbreviated hash can be
user-dependent IIRC (there is a format option controlling how the
rebase instruction sheet is populated from the log), so it would not
be easy to take fixes from that line into the commit as a fixup.

* Re: [PATCH 5/5] sha1_file: don't re-scan pack directory for null sha1
  2017-11-21 22:57     ` Jeff King
@ 2017-11-22  1:42       ` Junio C Hamano
  2017-11-22 22:36         ` Jeff King
  0 siblings, 1 reply; 25+ messages in thread
From: Junio C Hamano @ 2017-11-22  1:42 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Jeff King <peff@peff.net> writes:

> I'm not sure what the right behavior is, but I'm pretty sure that's not
> it. Probably one of:
>
>   - skip updating the ref when we see the breakage
>
>   - ditto, but terminate the whole operation, since we might be deleting
>     other refs and in a broken repo we're probably best to make as few
>     changes as possible
>
>   - behave as if it was a non-ff, which would allow "--force" to
>     overwrite the broken ref. Maybe convenient for fixing things, but
>     possibly surprising (and it's not that hard to just delete the
>     broken refs manually before proceeding).

Perhaps the last one would be the ideal endgame, but the second one
may be a good stopping point in the shorter term.


* Re: [PATCH 5/5] sha1_file: don't re-scan pack directory for null sha1
  2017-11-21 23:17     ` Jeff King
@ 2017-11-22  1:49       ` Junio C Hamano
  2017-11-22  3:17         ` Jeff King
  0 siblings, 1 reply; 25+ messages in thread
From: Junio C Hamano @ 2017-11-22  1:49 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Jeff King <peff@peff.net> writes:

> Here's a re-roll of patch 5 that behaves this way (the patch should be
> unsurprising, but I've updated the commit message to match).
>
> I did notice one other thing: the function looks up replace objects, so
> we have both the original and the replaced sha1 available. My earlier
> patch used the original sha1, but this one uses the replaced value.
> I'm not sure if that's sane or not. It lets the fast-path kick in if you
> point a replace ref at 0{40}. And it lets you point refs/replace/0{40}
> to something else. I doubt that's useful, but it could perhaps be for
> debugging, etc.
>
> In most cases, of course, it wouldn't matter (since pointing to or from
> the null sha1 is vaguely crazy in the first place).

I tend to agree that those who go crazy/fancy with replace mechanism
can keep both halves when it breaks.

WRT existing codepaths that pass 0{40} and refuse to notice a
potential repository corruption (from getting a NULL for a non-null
object name), I think we would need a sweep of the codebase and fix
them in the longer term.  As long as that will happen someday, either
approach, "we know 'no loose object? let's redo the packs' is the
part that matters performance-wise, so let's do a short-cut only for
that" or "we know that callers that come with 0{40} want to get NULL
back", should be OK, I would think.




* Re: [PATCH 5/5] sha1_file: don't re-scan pack directory for null sha1
  2017-11-22  1:49       ` Junio C Hamano
@ 2017-11-22  3:17         ` Jeff King
  0 siblings, 0 replies; 25+ messages in thread
From: Jeff King @ 2017-11-22  3:17 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Wed, Nov 22, 2017 at 10:49:25AM +0900, Junio C Hamano wrote:

> WRT existing codepaths that pass 0{40} and refuse to notice a
> potential repository corruption (from getting a NULL for a non-null
> object name), I think we would need a sweep of the codebase and fix
> them in the longer term.  As long as that will happen someday, either
> approach, "we know 'no loose object? let's redo the packs' is the
> part that matters performance-wise, so let's do a short-cut only for
> that" or "we know that callers that come with 0{40} want to get NULL
> back", should be OK, I would think.

I agree. Let's go with the "v2 5/5" I posted then.

I'll try to work up a patch for the fetch.c case I found tomorrow, but I
suspect there are many more. But that's largely orthogonal to the
series.

-Peff

* Re: [PATCH 5/5] sha1_file: don't re-scan pack directory for null sha1
  2017-11-22  1:42       ` Junio C Hamano
@ 2017-11-22 22:36         ` Jeff King
  2017-11-23  2:35           ` Junio C Hamano
  0 siblings, 1 reply; 25+ messages in thread
From: Jeff King @ 2017-11-22 22:36 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Wed, Nov 22, 2017 at 10:42:30AM +0900, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > I'm not sure what the right behavior is, but I'm pretty sure that's not
> > it. Probably one of:
> >
> >   - skip updating the ref when we see the breakage
> >
> >   - ditto, but terminate the whole operation, since we might be deleting
> >     other refs and in a broken repo we're probably best to make as few
> >     changes as possible
> >
> >   - behave as if it was a non-ff, which would allow "--force" to
> >     overwrite the broken ref. Maybe convenient for fixing things, but
> >     possibly surprising (and it's not that hard to just delete the
> >     broken refs manually before proceeding).
> 
> Perhaps the last one would be the ideal endgame, but the second one
> may be a good stopping point in the shorter term.

This turns out to be a lot trickier than I expected. The crux of the
matter is that the case we care about is hidden inside
lookup_commit_reference_gently(), which doesn't distinguish between
corruption and "not a commit".

So there are four cases we care about for this call in fetch:

  1. We fed a real sha1 and got a commit (or peeled to one).

  2. We fed a real sha1 which resolved to a non-commit, and we got NULL.

  3. We fed a real sha1 and the object was missing or corrupted, and we
     got NULL.

  4. We fed a null sha1 and got NULL.

Right now we lump cases 2-4 together as "do not do a fast-forward
check". That's fine for 2 and 4, but probably not for 3. We can easily
catch case 4 ourselves (if we care to), but distinguishing case 3 from
the others is hard. How should lookup_commit_reference_gently() signal
it to us?

Or should lookup_commit_reference_gently() die on corruption? That's not
very "gentle", but I think the "gently" here is really about "it might
not be a commit", not "the repo might be corrupted". But I think even
that may be the tip of the iceberg. The next thing we do is feed the
commits to in_merge_bases(), which will happily return "nope" if the old
commit cannot be parsed (because it has only a boolean return value).

So I dunno. Maybe it is a losing battle to try to pass this kind of
corruption information up the stack.  I'm tempted to say that there
should just be a "paranoid" flag to globally die() whenever we see a
corruption (and you could run with it normally, but relax it whenever
you're investigating a broken repo). But I doubt even that works. Not
having the "old_oid" object at all would be a repo corruption here, but
how are the low-level routines supposed to know when a missing object is
a corruption and when it is not?

-Peff

* Re: [PATCH 1/5] p5550: factor our nonsense-pack creation
  2017-11-22  0:32       ` Stefan Beller
@ 2017-11-22 22:38         ` Jeff King
  2017-11-23  2:41           ` Junio C Hamano
  0 siblings, 1 reply; 25+ messages in thread
From: Jeff King @ 2017-11-22 22:38 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Eric Sunshine, Git List

On Tue, Nov 21, 2017 at 04:32:42PM -0800, Stefan Beller wrote:

> > Heh, yes. I even fixed it once, but I have the funny habit of noticing
> > such typos while reading the "todo" list of "rebase -i" and fixing them
> > there. Which of course has no impact whatsoever on the commit. :-/
> 
> That has happened to me a couple of times as well.
> On first thought, this sounds like a UX bug.
> 
> On second thought, the text after the abbreviated hash can be
> user-dependent IIRC (there is a format option controlling how the
> rebase instruction sheet is populated from the log), so it would not
> be easy to take fixes from that line into the commit as a fixup.

Right, I came to the same conclusion (we may even have discussed this on
the list, I don't remember). The current "todo" format says that only
the command and sha1 matter, and we'd be changing that. Maybe that's not
so bad if the user has to enable the feature themselves (and clearly it
would be incompatible with a custom format option). But I think in the
end it probably just makes sense to retrain my expectation, and remember
to "reword" instead of "pick".

-Peff

* Re: [PATCH 5/5] sha1_file: don't re-scan pack directory for null sha1
  2017-11-22 22:36         ` Jeff King
@ 2017-11-23  2:35           ` Junio C Hamano
  2017-11-24 17:32             ` Jeff King
  0 siblings, 1 reply; 25+ messages in thread
From: Junio C Hamano @ 2017-11-23  2:35 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Jeff King <peff@peff.net> writes:

> So there are four cases we care about for this call in fetch:
>
>   1. We fed a real sha1 and got a commit (or peeled to one).
>
>   2. We fed a real sha1 which resolved to a non-commit, and we got NULL.
>
>   3. We fed a real sha1 and the object was missing or corrupted, and we
>      got NULL.
>
>   4. We fed a null sha1 and got NULL.
>
> Right now we lump cases 2-4 together as "do not do a fast-forward
> check". That's fine for 2 and 4, but probably not for 3. We can easily
> catch case 4 ourselves (if we care to), but distinguishing case 3 from
> the others is hard. How should lookup_commit_reference_gently() signal
> it to us?

Not limiting us to the caller in the "fetch" codepath, I think the
expectation by callers of lookup_commit_reference_gently() in the
ideal world would be:

 - It has an object name, and wants to use it as a point in the commit
   DAG to define the traversal over the DAG, if it refers to a
   commit known to us.

 - It does not know if these object names represent a tag object, a
   commit object, or some other object.  It does not know if the
   local repository actually has them (e.g. we received a "have"
   from the other side---missing is expected).

 - Hence, it would happily accept a NULL as "we do not have it" and
   "we do have it, but it is not a commit-ish".

And from that point of view, having 2, 3a (missing), and 4 (0{40})
yield NULL is perfectly fine.  3b (exists but broken) may be a
noteworthy event, but for the purpose of the caller, it may want to
proceed as if the object is missing from our end, so it might deserve
warning() but not die(), at least as the default behaviour.

* Re: [PATCH 1/5] p5550: factor our nonsense-pack creation
  2017-11-22 22:38         ` Jeff King
@ 2017-11-23  2:41           ` Junio C Hamano
  2017-11-23  5:02             ` Jeff King
  0 siblings, 1 reply; 25+ messages in thread
From: Junio C Hamano @ 2017-11-23  2:41 UTC (permalink / raw)
  To: Jeff King; +Cc: Stefan Beller, Eric Sunshine, Git List

Jeff King <peff@peff.net> writes:

> Right, I came to the same conclusion (we may even have discussed this on
> the list, I don't remember). The current "todo" format says that only
> the command and sha1 matter, and we'd be changing that. Maybe that's not
> so bad if the user has to enable the feature themselves (and clearly it
> would be incompatible with a custom format option).

If you are in the habit of always writing 4 or more lines, the
chance that you would make a typo on the line you can correct in the
"todo" list with such a feature is at most 25% ;-).

I think one downside is that a mere presence of such an option hints
that we somehow encourage people to commit with a title-only message.

* Re: [PATCH 1/5] p5550: factor our nonsense-pack creation
  2017-11-23  2:41           ` Junio C Hamano
@ 2017-11-23  5:02             ` Jeff King
  0 siblings, 0 replies; 25+ messages in thread
From: Jeff King @ 2017-11-23  5:02 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Stefan Beller, Eric Sunshine, Git List

On Thu, Nov 23, 2017 at 11:41:25AM +0900, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > Right, I came to the same conclusion (we may even have discussed this on
> > the list, I don't remember). The current "todo" format says that only
> > the command and sha1 matter, and we'd be changing that. Maybe that's not
> > so bad if the user has to enable the feature themselves (and clearly it
> > would be incompatible with a custom format option).
> 
> If you are in the habit of always writing 4 or more lines, the
> chance that you would make a typo on the line you can correct in the
> "todo" list with such a feature is at most 25% ;-).

You'd think, but the distribution of typos doesn't seem to be uniform.
:)

I think what actually happens is that I quite often do my final
proofread in my MUA, where the subject is displayed apart from the rest
of the message. So I tend to gloss over it, and any errors there are
more likely to make it to the list.

> I think one downside is that a mere presence of such an option hints
> that we somehow encourage people to commit with a title-only message.

Yeah, though it might be handy for me, I think it just opens up too
many funny questions like this.

-Peff

* Re: [PATCH 5/5] sha1_file: don't re-scan pack directory for null sha1
  2017-11-23  2:35           ` Junio C Hamano
@ 2017-11-24 17:32             ` Jeff King
  2017-11-25  3:20               ` Junio C Hamano
  0 siblings, 1 reply; 25+ messages in thread
From: Jeff King @ 2017-11-24 17:32 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Thu, Nov 23, 2017 at 11:35:21AM +0900, Junio C Hamano wrote:

> Not limiting us to the caller in the "fetch" codepath, I think the
> expectation by callers of lookup_commit_reference_gently() in the
> ideal world would be:
> 
>  - It has an object name, and wants to use it as a point in the commit
>    DAG to define the traversal over the DAG, if it refers to a
>    commit known to us.
> 
>  - It does not know if these object names represent a tag object, a
>    commit object, or some other object.  It does not know if the
>    local repository actually has them (e.g. we received a "have"
>    from the other side---missing is expected).
> 
>  - Hence, it would happily accept a NULL as "we do not have it" and
>    "we do have it, but it is not a commit-ish".
> 
> And from that point of view, having 2, 3a (missing), and 4 (0{40})
> yield NULL is perfectly fine.  3b (exists but broken) may be a
> noteworthy event, but for the purpose of the caller, it may want to
> proceed as if the object is missing from our end, so it might deserve
> warning() but not die(), at least as the default behaviour.

Hmm, yeah, I did not differentiate 3a and 3b in my analysis. How we'd
want to handle "missing" would vary from caller to caller, I think.
Certainly for this case in "fetch" where it's the "old" value for a ref,
it would be a corruption not to have it. Just grepping around a few of
the other callers, I see:

  - archive.c:parse_treeish_arg: fine not to have it (we'd complain soon
    after that it doesn't point to a tree either). But also fine to
    complain hard.

  - blame.c:dwim_reverse_initial, and checkout_paths and switch_branches
    in checkout.c: missing object would mean a broken HEAD

  - show-branch.c:append_ref: missing would mean a broken ref

  - clear_commit_marks_for_object_array: conceptually OK to have a
    missing object, though I suspect practically impossible since we
    examined and marked the objects earlier

  - ref-filter's --merged, --contains, etc: the low-level code quietly
    ignores a missing object or non-commit, but the command-line parser
    enforces that we find a commit. So probably impossible to trigger,
    but arguably the second call should be a BUG().

So I dunno. That is far from exhaustive, but it does seem like most
cases should assume the presence of the file.

As for your proposed behavior:

> And from that point of view, having 2, 3a (missing), and 4 (0{40})
> yield NULL is perfectly fine.  3b (exists but broken) may be a
> noteworthy event, but for the purpose of the caller, it may want to
> proceed as if the object is missing from our end, so it might deserve
> warning() but not die(), at least as the default behaviour.

That's actually not far from what happens now, with the only difference
being that we _do_ actually die() on a corruption (really any error
except ENOENT). I forgot that when I wrote my earlier message. You can
see it by updating the "fetch" reproduction I sent earlier to corrupt:

-- >8 --
git init parent
git -C parent commit --allow-empty -m base

git clone parent child
git -C parent commit --allow-empty -m more

cd child
for i in .git/objects/??/*
do
	chmod +w $i
	echo corrupt >$i
done
git fetch
-- 8< --

which gives something like:

error: inflate: data stream error (incorrect header check)
error: unable to unpack 55c66a401fd4190382f9cd8b70c11f9f5adb044e header
fatal: loose object 55c66a401fd4190382f9cd8b70c11f9f5adb044e (stored in .git/objects/55/c66a401fd4190382f9cd8b70c11f9f5adb044e) is corrupt

So that's good. I do still think that many of the callers of
lookup_commit_reference_gently() probably ought to die() in the
"missing" case rather than continue (and produce subtly wrong answers).
But it may not be that big a deal. For the most part, all bets are off
in a corrupt repo. My main concern is just that we do not want the
corruption to spread or to make it harder for us to recover from (and
that includes allowing us to delete or overwrite other data that would
otherwise be forbidden, which is what's happening in the fetch case).
Most of the callers probably don't fall into that situation, but it
might be nice to err on the side of caution.

-Peff

* Re: [PATCH 5/5] sha1_file: don't re-scan pack directory for null sha1
  2017-11-24 17:32             ` Jeff King
@ 2017-11-25  3:20               ` Junio C Hamano
  0 siblings, 0 replies; 25+ messages in thread
From: Junio C Hamano @ 2017-11-25  3:20 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Jeff King <peff@peff.net> writes:

> So I dunno. That is far from exhaustive, but it does seem like most
> cases should assume the presence of the file.

You are right.  I should have considered that "we got a random
object-name looking thing and we do not care if it does not exist"
is a very narrow minority case.  Most of the object names we deal
with come from local repository traversal, and unless things like the
"fsck-promisor" topic come into the picture, we should always have them
available locally.

> But it may not be that big a deal. For the most part, all bets are off
> in a corrupt repo. My main concern is just that we do not want the
> corruption to spread or to make it harder for us to recover from (and
> that includes allowing us to delete or overwrite other data that would
> otherwise be forbidden, which is what's happening in the fetch case).

Absolutely.

Thanks.

Thread overview: 25+ messages
2017-11-20 20:26 [PATCH 0/5] avoiding pointless pack-directory re-scans Jeff King
2017-11-20 20:26 ` [PATCH 1/5] p5550: factor our nonsense-pack creation Jeff King
2017-11-20 23:55   ` Eric Sunshine
2017-11-21 15:58     ` Jeff King
2017-11-22  0:32       ` Stefan Beller
2017-11-22 22:38         ` Jeff King
2017-11-23  2:41           ` Junio C Hamano
2017-11-23  5:02             ` Jeff King
2017-11-20 20:27 ` [PATCH 2/5] t/perf/lib-pack: use fast-import checkpoint to create packs Jeff King
2017-11-20 20:28 ` [PATCH 3/5] p5551: add a script to test fetch pack-dir rescans Jeff King
2017-11-20 20:29 ` [PATCH 4/5] everything_local: use "quick" object existence check Jeff King
2017-11-20 20:35 ` [PATCH 5/5] sha1_file: don't re-scan pack directory for null sha1 Jeff King
2017-11-20 20:47   ` Stefan Beller
2017-11-20 20:58     ` Jeff King
2017-11-21  2:37   ` Junio C Hamano
2017-11-21 22:57     ` Jeff King
2017-11-22  1:42       ` Junio C Hamano
2017-11-22 22:36         ` Jeff King
2017-11-23  2:35           ` Junio C Hamano
2017-11-24 17:32             ` Jeff King
2017-11-25  3:20               ` Junio C Hamano
2017-11-21  5:20   ` Junio C Hamano
2017-11-21 23:17     ` Jeff King
2017-11-22  1:49       ` Junio C Hamano
2017-11-22  3:17         ` Jeff King

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).