* [PATCH 0/2] Make partial clone/fetch work when transfer.fsckobjects=1 @ 2018-03-14 18:42 Jonathan Tan 2018-03-14 18:42 ` [PATCH 1/2] index-pack: support checking objects but not links Jonathan Tan 2018-03-14 18:42 ` [PATCH 2/2] fetch-pack: do not check links for partial fetch Jonathan Tan 0 siblings, 2 replies; 6+ messages in thread From: Jonathan Tan @ 2018-03-14 18:42 UTC (permalink / raw To: git; +Cc: git, Jonathan Tan One of my colleagues noticed that we obtain a "fatal: did not receive expected object" error when partial-cloning (that is, with --filter set) if transfer.fsckobjects is true. Here's a fix for that. Jonathan Tan (2): index-pack: support checking objects but not links fetch-pack: do not check links for partial fetch Documentation/git-index-pack.txt | 3 +++ builtin/index-pack.c | 6 ++++-- fetch-pack.c | 13 +++++++++++-- t/t5302-pack-index.sh | 5 +++++ t/t5616-partial-clone.sh | 11 +++++++++++ 5 files changed, 34 insertions(+), 4 deletions(-) -- 2.16.2.520.gd0db9edba.dirty ^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH 1/2] index-pack: support checking objects but not links 2018-03-14 18:42 [PATCH 0/2] Make partial clone/fetch work when transfer.fsckobjects=1 Jonathan Tan @ 2018-03-14 18:42 ` Jonathan Tan 2018-03-14 18:42 ` [PATCH 2/2] fetch-pack: do not check links for partial fetch Jonathan Tan 1 sibling, 0 replies; 6+ messages in thread From: Jonathan Tan @ 2018-03-14 18:42 UTC (permalink / raw To: git; +Cc: git, Jonathan Tan The index-pack command currently supports the --check-self-contained-and-connected argument, for internal use only, that instructs it to only check for broken links and not broken objects. For partial clones, we need the inverse, so add a --fsck-objects argument that checks for broken objects and not broken links, also for internal use only. This will be used by fetch-pack in a subsequent patch. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> --- Documentation/git-index-pack.txt | 3 +++ builtin/index-pack.c | 6 ++++-- t/t5302-pack-index.sh | 5 +++++ 3 files changed, 12 insertions(+), 2 deletions(-) diff --git a/Documentation/git-index-pack.txt b/Documentation/git-index-pack.txt index 1b4b65d66..138edb47b 100644 --- a/Documentation/git-index-pack.txt +++ b/Documentation/git-index-pack.txt @@ -77,6 +77,9 @@ OPTIONS --check-self-contained-and-connected:: Die if the pack contains broken links. For internal use only. +--fsck-objects:: + Die if the pack contains broken objects. For internal use only. + --threads=<n>:: Specifies the number of threads to spawn when resolving deltas. This requires that index-pack be compiled with diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 59878e70b..f46cb5967 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -827,7 +827,7 @@ static void sha1_object(const void *data, struct object_entry *obj_entry, free(has_data); } - if (strict) { + if (strict || do_fsck_object) { read_lock(); if (type == OBJ_BLOB) { struct blob *blob = lookup_blob(oid); @@ -853,7 +853,7 @@ static void sha1_object(const void *data, struct object_entry *obj_entry, if (do_fsck_object && fsck_object(obj, buf, size, &fsck_options)) die(_("Error in object")); - if (fsck_walk(obj, NULL, &fsck_options)) + if (strict && fsck_walk(obj, NULL, &fsck_options)) die(_("Not all child objects of %s are reachable"), oid_to_hex(&obj->oid)); if (obj->type == OBJ_TREE) { @@ -1688,6 +1688,8 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) } else if (!strcmp(arg, "--check-self-contained-and-connected")) { strict = 1; check_self_contained_and_connected = 1; + } else if (!strcmp(arg, "--fsck-objects")) { + do_fsck_object = 1; } else if (!strcmp(arg, "--verify")) { verify = 1; } else if (!strcmp(arg, "--verify-stat")) { diff --git a/t/t5302-pack-index.sh b/t/t5302-pack-index.sh index c2fc584da..d695a6082 100755 --- a/t/t5302-pack-index.sh +++ b/t/t5302-pack-index.sh @@ -262,4 +262,9 @@ EOF grep "^warning:.* expected .tagger. line" err ' +test_expect_success 'index-pack --fsck-objects also warns upon missing tagger in tag' ' + git index-pack --fsck-objects tag-test-${pack1}.pack 2>err && + grep "^warning:.* expected .tagger. line" err +' + test_done -- 2.16.2.520.gd0db9edba.dirty ^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 2/2] fetch-pack: do not check links for partial fetch 2018-03-14 18:42 [PATCH 0/2] Make partial clone/fetch work when transfer.fsckobjects=1 Jonathan Tan 2018-03-14 18:42 ` [PATCH 1/2] index-pack: support checking objects but not links Jonathan Tan @ 2018-03-14 18:42 ` Jonathan Tan 2018-03-14 21:55 ` Junio C Hamano 1 sibling, 1 reply; 6+ messages in thread From: Jonathan Tan @ 2018-03-14 18:42 UTC (permalink / raw To: git; +Cc: git, Jonathan Tan When doing a partial clone or fetch with transfer.fsckobjects=1, use the --fsck-objects instead of the --strict flag when invoking index-pack so that links are not checked, only objects. This is because incomplete links are expected when doing a partial clone or fetch. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> --- fetch-pack.c | 13 +++++++++++-- t/t5616-partial-clone.sh | 11 +++++++++++ 2 files changed, 22 insertions(+), 2 deletions(-) diff --git a/fetch-pack.c b/fetch-pack.c index d97461296..1d6117565 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -886,8 +886,17 @@ static int get_pack(struct fetch_pack_args *args, ? fetch_fsck_objects : transfer_fsck_objects >= 0 ? transfer_fsck_objects - : 0) - argv_array_push(&cmd.args, "--strict"); + : 0) { + if (args->from_promisor) + /* + * We cannot use --strict in index-pack because it + * checks both broken objects and links, but we only + * want to check for broken objects. + */ + argv_array_push(&cmd.args, "--fsck-objects"); + else + argv_array_push(&cmd.args, "--strict"); + } cmd.in = demux.out; cmd.git_cmd = 1; diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh index 29d863118..cee556536 100755 --- a/t/t5616-partial-clone.sh +++ b/t/t5616-partial-clone.sh @@ -143,4 +143,15 @@ test_expect_success 'manual prefetch of missing objects' ' test_line_count = 0 observed.oids ' +test_expect_success 'partial clone with transfer.fsckobjects=1 uses index-pack --fsck-objects' ' + git init src && + test_commit -C src x && + test_config -C src uploadpack.allowfilter 1 && + test_config -C src uploadpack.allowanysha1inwant 1 && + + GIT_TRACE="$(pwd)/trace" git -c transfer.fsckobjects=1 \ + clone --filter="blob:none" "file://$(pwd)/src" dst && + grep "git index-pack.*--fsck-objects" trace +' + test_done -- 2.16.2.520.gd0db9edba.dirty ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2] fetch-pack: do not check links for partial fetch 2018-03-14 18:42 ` [PATCH 2/2] fetch-pack: do not check links for partial fetch Jonathan Tan @ 2018-03-14 21:55 ` Junio C Hamano 2018-03-14 22:24 ` Jonathan Tan 0 siblings, 1 reply; 6+ messages in thread From: Junio C Hamano @ 2018-03-14 21:55 UTC (permalink / raw To: Jonathan Tan; +Cc: git, git Jonathan Tan <jonathantanmy@google.com> writes: > When doing a partial clone or fetch with transfer.fsckobjects=1, use the > --fsck-objects instead of the --strict flag when invoking index-pack so > that links are not checked, only objects. This is because incomplete > links are expected when doing a partial clone or fetch. It is expected that _some_ links are missing, but this makes me wonder if we can do better than disabling the connectivity check altogether. Does "git fetch" lack sufficient information to attempt the connectivity check, and when (and only when) it hits a broken link, see if that is because the connectivity check traversal is crossing a "partial" fetch boundary, or something along that line? > > Signed-off-by: Jonathan Tan <jonathantanmy@google.com> > --- > fetch-pack.c | 13 +++++++++++-- > t/t5616-partial-clone.sh | 11 +++++++++++ > 2 files changed, 22 insertions(+), 2 deletions(-) > > diff --git a/fetch-pack.c b/fetch-pack.c > index d97461296..1d6117565 100644 > --- a/fetch-pack.c > +++ b/fetch-pack.c > @@ -886,8 +886,17 @@ static int get_pack(struct fetch_pack_args *args, > ? fetch_fsck_objects > : transfer_fsck_objects >= 0 > ? transfer_fsck_objects > - : 0) > - argv_array_push(&cmd.args, "--strict"); > + : 0) { > + if (args->from_promisor) > + /* > + * We cannot use --strict in index-pack because it > + * checks both broken objects and links, but we only > + * want to check for broken objects. > + */ > + argv_array_push(&cmd.args, "--fsck-objects"); > + else > + argv_array_push(&cmd.args, "--strict"); > + } > > cmd.in = demux.out; > cmd.git_cmd = 1; > diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh > index 29d863118..cee556536 100755 > --- a/t/t5616-partial-clone.sh > +++ b/t/t5616-partial-clone.sh > @@ -143,4 +143,15 @@ test_expect_success 'manual prefetch of missing objects' ' > test_line_count = 0 observed.oids > ' > > +test_expect_success 'partial clone with transfer.fsckobjects=1 uses index-pack --fsck-objects' ' > + git init src && > + test_commit -C src x && > + test_config -C src uploadpack.allowfilter 1 && > + test_config -C src uploadpack.allowanysha1inwant 1 && > + > + GIT_TRACE="$(pwd)/trace" git -c transfer.fsckobjects=1 \ > + clone --filter="blob:none" "file://$(pwd)/src" dst && > + grep "git index-pack.*--fsck-objects" trace > +' > + > test_done ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2] fetch-pack: do not check links for partial fetch 2018-03-14 21:55 ` Junio C Hamano @ 2018-03-14 22:24 ` Jonathan Tan 2018-03-15 17:15 ` Junio C Hamano 0 siblings, 1 reply; 6+ messages in thread From: Jonathan Tan @ 2018-03-14 22:24 UTC (permalink / raw To: Junio C Hamano; +Cc: git, git On Wed, 14 Mar 2018 14:55:31 -0700 Junio C Hamano <gitster@pobox.com> wrote: > Jonathan Tan <jonathantanmy@google.com> writes: > > > When doing a partial clone or fetch with transfer.fsckobjects=1, use the > > --fsck-objects instead of the --strict flag when invoking index-pack so > > that links are not checked, only objects. This is because incomplete > > links are expected when doing a partial clone or fetch. > > It is expected that _some_ links are missing, but this makes me > wonder if we can do better than disabling the connectivity check > altogether. Does "git fetch" lack sufficient information to attempt > the connectivity check, and when (and only when) it hits a broken > link, see if that is because the connectivity check traversal is > crossing a "partial" fetch boundary, or something along that line? Our only definition (currently) for the "partial" fetch boundary is whether an object in a promisor packfile (a packfile obtained from the promisor remote) references it, so I think that checking for crossing a "partial" fetch boundary does not add anything. This is because by that definition, any missing links observed from objects newly fetched from the promisor remote cross a "partial" fetch boundary (since all objects fetched in this way "promise" all objects that they refer to). But it is true that we might be able to do better in checking, for example, that a packfile fetched using a blob size limit contains all referenced trees (that is, only blobs are allowed to be missing). ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2] fetch-pack: do not check links for partial fetch 2018-03-14 22:24 ` Jonathan Tan @ 2018-03-15 17:15 ` Junio C Hamano 0 siblings, 0 replies; 6+ messages in thread From: Junio C Hamano @ 2018-03-15 17:15 UTC (permalink / raw To: Jonathan Tan; +Cc: git, git Jonathan Tan <jonathantanmy@google.com> writes: > Our only definition (currently) for the "partial" fetch boundary is > whether an object in a promisor packfile (a packfile obtained from the > promisor remote) references it, so I think that checking for crossing a > "partial" fetch boundary does not add anything. Ah, that's a perfect answer. When we find a link that leads to a missing object in a pack after receiving it from a remote repository that gives us promises, the missing object must be available from the remote repository by definition. Makes perfect sense ;-) Thanks. ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2018-03-15 17:15 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2018-03-14 18:42 [PATCH 0/2] Make partial clone/fetch work when transfer.fsckobjects=1 Jonathan Tan 2018-03-14 18:42 ` [PATCH 1/2] index-pack: support checking objects but not links Jonathan Tan 2018-03-14 18:42 ` [PATCH 2/2] fetch-pack: do not check links for partial fetch Jonathan Tan 2018-03-14 21:55 ` Junio C Hamano 2018-03-14 22:24 ` Jonathan Tan 2018-03-15 17:15 ` Junio C Hamano
Code repositories for project(s) associated with this public inbox https://80x24.org/mirrors/git.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).