From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on dcvr.yhbt.net X-Spam-Level: X-Spam-ASN: AS31976 209.132.180.0/23 X-Spam-Status: No, score=-11.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI,T_DKIMWL_WL_MED,USER_IN_DEF_DKIM_WL shortcircuit=no autolearn=ham autolearn_force=no version=3.4.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by dcvr.yhbt.net (Postfix) with ESMTP id 73D3D1F404 for ; Wed, 15 Aug 2018 23:23:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727340AbeHPCRn (ORCPT ); Wed, 15 Aug 2018 22:17:43 -0400 Received: from mail-qt0-f202.google.com ([209.85.216.202]:47036 "EHLO mail-qt0-f202.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726185AbeHPCRn (ORCPT ); Wed, 15 Aug 2018 22:17:43 -0400 Received: by mail-qt0-f202.google.com with SMTP id o18-v6so2254699qtp.13 for ; Wed, 15 Aug 2018 16:23:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=nP8VYUmsR8Imqs/S+/MpHN1IBb/kPeDtP9PLDNieSPw=; b=tRKooDvnq/cNB/EpVwU3QfkrSYXeef51XCjvWwikJR91zPURvAMbxwE1m/XB7RnKxM jJi9BvhsLXiNsL4D7uOvUOIZfViWcRi211w+Rmzrf5ETO8ubzYMKYpoUFOXvMugsdO0u H1dZc1wBW5o1ySsA311zpqV6rs/XwYRThVk50xVP5ZAyAEedZn3f7N1ITPqeb6jjSNtq G28tNY6ZqWDp4MJVZsR2vm9koElGo3d7yEQKPaBiHe8z68fnqMr6IYbz04ePHukaG22i D7X0sZS53UmPgbdVhYMoH6ED0fhdlrkQfrTostImO6TdS5olTpOrBw0K1BiBUUBBYsTU 2AgA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=nP8VYUmsR8Imqs/S+/MpHN1IBb/kPeDtP9PLDNieSPw=; b=QtSrZUvfi6CrBfOFIC77xjUsg0VCxp/FUtgWf2L8FvqfOI05jinFUFslsXduBgeUer x6zPvsF8kZeH9Cx4YRnOB/4qqA25AuJuz4V/tTKZ54JgJw7Bfp+Sc2v/WEm4lhlmsY6p j93tH2dtecXqwzfckpK4HFHqesypOMr5C5+jPxlT4dpLT/eGjPJ3ZmTgG6AVLYcTYaoQ IbFbQ+xGUMeKoKqkEDjHiuJRWiRSofHjlIeG98Ju3zjLsiRggx5dHAXfM23cyXHOYkUV HFAAYg27mA1vGRs1JPueeDe5w5zEGJRVdFFLYSKMsaD1346ArJhMDT5P7Irlu7J4RAuU WweA== X-Gm-Message-State: AOUpUlEBYFkYGMVbGyKPYfY7xA5amsKEzSzw9vuJ38gI3PIJoSvNHO6m fUXjylRkWWP0kFtVEFx7RhWdT11w7Zkhw8PttajY5grJArgNtA5L8KNpVflODo6iz2QjCxWT08T wjnEL8Ynb4MCdEa0OBUMPZ+Iggd9x6+t3QIMROAsZodu2JKuAtfhnC1R0+i0= X-Google-Smtp-Source: AA+uWPyQY8zWkyS+SK4q4dl7YYugZBNuzy8I6AEdGaPDWHItzqclb/RQRoIfjXnHRUmTz1H0+39RRq3V3PpX X-Received: by 2002:a0c:f648:: with SMTP id s8-v6mr14122871qvm.40.1534375401579; Wed, 15 Aug 2018 16:23:21 -0700 (PDT) Date: Wed, 15 Aug 2018 16:19:11 -0700 In-Reply-To: Message-Id: <5d3b4e4acb73009e4cefecd0965fe5dd371efea1.1534374650.git.matvore@google.com> Mime-Version: 1.0 References: X-Mailer: git-send-email 2.18.0.865.gffc8e1a3cd6-goog Subject: [PATCH v6 6/6] list-objects-filter: implement filter tree:0 From: Matthew DeVore To: git@vger.kernel.org Cc: Matthew DeVore , git@jeffhostetler.com, jeffhost@microsoft.com, peff@peff.net, stefanbeller@gmail.com, jonathantanmy@google.com, gitster@pobox.com Content-Type: text/plain; charset="UTF-8" Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Teach list-objects the "tree:0" filter which allows for filtering out all tree and blob objects (unless other objects are explicitly specified by the user). The purpose of this patch is to allow smaller partial clones. The name of this filter - tree:0 - does not explicitly specify that it also filters out all blobs, but this should not cause much confusion because blobs are not at all useful without the trees that refer to them. I also consider only:commits as a name, but this is inaccurate because it suggests that annotated tags are omitted, but actually they are included. The name "tree:0" allows later filtering based on depth, i.e. "tree:1" would filter out all but the root tree and blobs. In order to avoid confusion between 0 and capital O, the documentation was worded in a somewhat round-about way that also hints at this future improvement to the feature. Signed-off-by: Matthew DeVore --- Documentation/rev-list-options.txt | 5 +++ list-objects-filter-options.c | 4 +++ list-objects-filter-options.h | 1 + list-objects-filter.c | 50 ++++++++++++++++++++++++++ t/t5317-pack-objects-filter-objects.sh | 28 +++++++++++++++ t/t5616-partial-clone.sh | 38 ++++++++++++++++++++ t/t6112-rev-list-filters-objects.sh | 12 +++++++ 7 files changed, 138 insertions(+) diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt index 7b273635d..5f1672913 100644 --- a/Documentation/rev-list-options.txt +++ b/Documentation/rev-list-options.txt @@ -731,6 +731,11 @@ the requested refs. + The form '--filter=sparse:path=' similarly uses a sparse-checkout specification contained in . ++ +The form '--filter=tree:' omits all blobs and trees whose depth +from the root tree is >= (minimum depth if an object is located +at multiple depths in the commits traversed). Currently, only =0 +is supported, which omits all blobs and trees. --no-filter:: Turn off any previous `--filter=` argument. diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c index c0e2bd6a0..a28382940 100644 --- a/list-objects-filter-options.c +++ b/list-objects-filter-options.c @@ -50,6 +50,10 @@ static int gently_parse_list_objects_filter( return 0; } + } else if (!strcmp(arg, "tree:0")) { + filter_options->choice = LOFC_TREE_NONE; + return 0; + } else if (skip_prefix(arg, "sparse:oid=", &v0)) { struct object_context oc; struct object_id sparse_oid; diff --git a/list-objects-filter-options.h b/list-objects-filter-options.h index 0000a61f8..af64e5c66 100644 --- a/list-objects-filter-options.h +++ b/list-objects-filter-options.h @@ -10,6 +10,7 @@ enum list_objects_filter_choice { LOFC_DISABLED = 0, LOFC_BLOB_NONE, LOFC_BLOB_LIMIT, + LOFC_TREE_NONE, LOFC_SPARSE_OID, LOFC_SPARSE_PATH, LOFC__COUNT /* must be last */ diff --git a/list-objects-filter.c b/list-objects-filter.c index a0ba78b20..8e3caf5bf 100644 --- a/list-objects-filter.c +++ b/list-objects-filter.c @@ -80,6 +80,55 @@ static void *filter_blobs_none__init( return d; } +/* + * A filter for list-objects to omit ALL trees and blobs from the traversal. + * Can OPTIONALLY collect a list of the omitted OIDs. + */ +struct filter_trees_none_data { + struct oidset *omits; +}; + +static enum list_objects_filter_result filter_trees_none( + enum list_objects_filter_situation filter_situation, + struct object *obj, + const char *pathname, + const char *filename, + void *filter_data_) +{ + struct filter_trees_none_data *filter_data = filter_data_; + + switch (filter_situation) { + default: + die("unknown filter_situation"); + return LOFR_ZERO; + + case LOFS_BEGIN_TREE: + case LOFS_BLOB: + if (filter_data->omits) + oidset_insert(filter_data->omits, &obj->oid); + return LOFR_MARK_SEEN; /* but not LOFR_DO_SHOW (hard omit) */ + + case LOFS_END_TREE: + assert(obj->type == OBJ_TREE); + return LOFR_ZERO; + + } +} + +static void* filter_trees_none__init( + struct oidset *omitted, + struct list_objects_filter_options *filter_options, + filter_object_fn *filter_fn, + filter_free_fn *filter_free_fn) +{ + struct filter_trees_none_data *d = xcalloc(1, sizeof(*d)); + d->omits = omitted; + + *filter_fn = filter_trees_none; + *filter_free_fn = free; + return d; +} + /* * A filter for list-objects to omit large blobs. * And to OPTIONALLY collect a list of the omitted OIDs. @@ -374,6 +423,7 @@ static filter_init_fn s_filters[] = { NULL, filter_blobs_none__init, filter_blobs_limit__init, + filter_trees_none__init, filter_sparse_oid__init, filter_sparse_path__init, }; diff --git a/t/t5317-pack-objects-filter-objects.sh b/t/t5317-pack-objects-filter-objects.sh index 5e35f33bf..7a4d49ea1 100755 --- a/t/t5317-pack-objects-filter-objects.sh +++ b/t/t5317-pack-objects-filter-objects.sh @@ -72,6 +72,34 @@ test_expect_success 'get an error for missing tree object' ' grep -q "bad tree object" bad_tree ' +test_expect_success 'setup for tests of tree:0' ' + mkdir r1/subtree && + echo "This is a file in a subtree" >r1/subtree/file && + git -C r1 add subtree/file && + git -C r1 commit -m subtree +' + +test_expect_success 'verify tree:0 packfile has no blobs or trees' ' + git -C r1 pack-objects --rev --stdout --filter=tree:0 >commitsonly.pack <<-EOF && + HEAD + EOF + git -C r1 index-pack ../commitsonly.pack && + git -C r1 verify-pack -v ../commitsonly.pack >objs && + ! grep -E "tree|blob" objs +' + +test_expect_success 'grab tree directly when using tree:0' ' + # We should get the tree specified directly but not its blobs or subtrees. + git -C r1 pack-objects --rev --stdout --filter=tree:0 >commitsonly.pack <<-EOF && + HEAD: + EOF + git -C r1 index-pack ../commitsonly.pack && + git -C r1 verify-pack -v ../commitsonly.pack >objs && + awk "/tree|blob/{print \$1}" objs >trees_and_blobs && + git -C r1 rev-parse HEAD: >expected && + test_cmp trees_and_blobs expected +' + # Test blob:limit=[kmg] filter. # We boundary test around the size parameter. The filter is strictly less than # the value, so size 500 and 1000 should have the same results, but 1001 should diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh index bbbe7537d..8eeb85fbc 100755 --- a/t/t5616-partial-clone.sh +++ b/t/t5616-partial-clone.sh @@ -154,6 +154,44 @@ test_expect_success 'partial clone with transfer.fsckobjects=1 uses index-pack - grep "git index-pack.*--fsck-objects" trace ' +test_expect_success 'use fsck before and after manually fetching a missing subtree' ' + # push new commit so server has a subtree + mkdir src/dir && + echo "in dir" >src/dir/file.txt && + git -C src add dir/file.txt && + git -C src commit -m "file in dir" && + git -C src push -u srv master && + SUBTREE=$(git -C src rev-parse HEAD:dir) && + + rm -rf dst && + git clone --no-checkout --filter=tree:0 "file://$(pwd)/srv.bare" dst && + git -C dst fsck && + + # Make sure we only have commits, and all trees and blobs are missing. + git -C dst rev-list master --missing=allow-any --objects >fetched_objects && + awk -f print_1.awk fetched_objects \ + | xargs -n1 git -C dst cat-file -t >fetched_types && + sort fetched_types -u >unique_types.observed && + echo commit >unique_types.expected && + test_cmp unique_types.observed unique_types.expected && + + # Auto-fetch a tree with cat-file. + git -C dst cat-file -p $SUBTREE >tree_contents && + grep file.txt tree_contents && + + # fsck still works after an auto-fetch of a tree. + git -C dst fsck && + + # Auto-fetch all remaining trees and blobs with --missing=error + git -C dst rev-list master --missing=error --objects >fetched_objects && + test_line_count = 70 fetched_objects && + awk -f print_1.awk fetched_objects \ + | xargs -n1 git -C dst cat-file -t >fetched_types && + sort fetched_types -u >unique_types.observed && + printf "blob\ncommit\ntree\n" >unique_types.expected && + test_cmp unique_types.observed unique_types.expected +' + test_expect_success 'partial clone fetches blobs pointed to by refs even if normally filtered out' ' rm -rf src dst && git init src && diff --git a/t/t6112-rev-list-filters-objects.sh b/t/t6112-rev-list-filters-objects.sh index fc0f92a16..30bf1c73e 100755 --- a/t/t6112-rev-list-filters-objects.sh +++ b/t/t6112-rev-list-filters-objects.sh @@ -213,6 +213,18 @@ test_expect_success 'rev-list W/ --missing=print and --missing=allow-any for tre test_line_count = 0 rev_list_err ' +# Test tree:0 filter. + +test_expect_success 'verify tree:0 includes trees in "filtered" output' ' + git -C r3 rev-list HEAD --quiet --objects --filter-print-omitted --filter=tree:0 \ + | awk -f print_1.awk \ + | sed s/~// \ + | xargs -n1 git -C r3 cat-file -t \ + | sort -u >filtered_types && + printf "blob\ntree\n" > expected && + test_cmp filtered_types expected +' + # Delete some loose objects and use rev-list, but WITHOUT any filtering. # This models previously omitted objects that we did not receive. -- 2.18.0.865.gffc8e1a3cd6-goog