From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on dcvr.yhbt.net X-Spam-Level: X-Spam-ASN: AS31976 209.132.180.0/23 X-Spam-Status: No, score=-4.3 required=3.0 tests=AWL,BAYES_00, DATE_IN_PAST_12_24,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_HI,RP_MATCHES_RCVD shortcircuit=no autolearn=ham autolearn_force=no version=3.4.0 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by dcvr.yhbt.net (Postfix) with ESMTP id 809F11F859 for ; Sat, 6 Aug 2016 20:47:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752441AbcHFUq7 (ORCPT ); Sat, 6 Aug 2016 16:46:59 -0400 Received: from mail-io0-f171.google.com ([209.85.223.171]:36435 "EHLO mail-io0-f171.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752497AbcHFUqv (ORCPT ); Sat, 6 Aug 2016 16:46:51 -0400 Received: by mail-io0-f171.google.com with SMTP id b62so328125315iod.3 for ; Sat, 06 Aug 2016 13:46:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=eno9D1ew6B7o1gl5FllFl+quNOvUCYGBlyoOLN7Pkc4=; b=f2UoS0nJqwXeIHP0Pe6bYUKzidscopObNMF+V3MRVZnCaURaFJbvh0d3suS+60Qp5N o4X165+U/JDYa1yFfPyzBfYtsRYUSQk6VARr5+BmfRPNUKXi+GR73v+WYVBxT5qPn/iI oot6NpdpTQbHyPW778mSafLvbdI26HUatPde3ypQTEuI2dRD6IcB2HdBmweMs65YVxVf uW0joBBe4wqgJ5Wfh/ROPwZA+kHKZ0WAfrjLH3xJhqp3sA8WLGTjwcRXrvvy8JqpO+t8 e5aMChacMEXZE390naLcRWZ58Z6ZlV2duORjz8sEwT6Ca2UcFC2m6ZptKUNBlOH9sS77 j5xg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=eno9D1ew6B7o1gl5FllFl+quNOvUCYGBlyoOLN7Pkc4=; b=dejelrCF81nBvWANIU/628nnq6q5zmH4FL7XT601BJpLXt0NXV9ur3z9+fhBWPdBvw StyFB6F6Y2BmTWSS1l0mHy/tX/2hOMETtO2DifT91Kqwx6JeX2vk71dwvdlxEQ4rVGIf /+Dmic8SSRAxvvo13ekHDp+tDWVZ/wbTR2NpiQDbGi6ZMf6ozkPi+oOorkUYaiijEC7G Wjs5BbL5VGZUSgEomhk7DYy6SrIl9z2y76nKrskqvZKotuz6t07NrN7Rdo6BBcrpY5ik Qdx2G2gkwTufHK9VDHFm/l7qnc/NslF1QvJHGZQOKpBkkn+nny7VUsDmdS4UZFFs+GfG NpHg== X-Gm-Message-State: AEkoouvTeM8ADWOoGKO2RwohtUsFgcmi4crrB4o+l4BR5fcZWELYAEG6F/Nx7UysGdvjhQaTuMKrR7umSKsSPrGO X-Received: by 10.107.144.10 with SMTP id s10mr79403189iod.165.1470437113527; Fri, 05 Aug 2016 15:45:13 -0700 (PDT) MIME-Version: 1.0 Received: by 10.107.128.66 with HTTP; Fri, 5 Aug 2016 15:45:13 -0700 (PDT) In-Reply-To: References: <20160804195159.7788-1-sbeller@google.com> <20160804195159.7788-3-sbeller@google.com> From: Stefan Beller Date: Fri, 5 Aug 2016 15:45:13 -0700 Message-ID: Subject: Re: [PATCH 2/6] t7408: merge short tests, factor out testing method To: Junio C Hamano Cc: "git@vger.kernel.org" , mst@redhat.com, Jens Lehmann Content-Type: text/plain; charset=UTF-8 Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org On Fri, Aug 5, 2016 at 1:45 PM, Junio C Hamano wrote: > Stefan Beller writes: > >> Tests consisting of one line each can be consolidated to have fewer tests >> to run as well as fewer lines of code. >> >> When having just a few git commands, do not create a new shell but >> use the -C flag in Git to execute in the correct directory. > > Good motivations. > >> Signed-off-by: Stefan Beller >> --- >> t/t7408-submodule-reference.sh | 50 +++++++++++++++--------------------------- >> 1 file changed, 18 insertions(+), 32 deletions(-) >> >> diff --git a/t/t7408-submodule-reference.sh b/t/t7408-submodule-reference.sh >> index afcc629..1416cbd 100755 >> --- a/t/t7408-submodule-reference.sh >> +++ b/t/t7408-submodule-reference.sh >> @@ -10,6 +10,16 @@ base_dir=$(pwd) >> >> U=$base_dir/UPLOAD_LOG > > Is this line needed anywhere? > > We (perhaps unfortunately) still need $base_dir because we want to > give an absolute file:/// URL to "submodule add", but I do not think > we use $U, so let's get rid of it. > >> +test_alternate_usage() >> +{ > > According to Documentation/CodingGuidelines, this should be: > > test_alternate_usage () { > > Somehow the helper name sounds as if it is testing if an alternate > is used correctly (i.e. the machinery may attempt to use alternate > but not in a correct way), not testing if an alternate is correctly > used (i.e. the machinery incorrectly forgets to use an alternate at > all), but maybe it is just me. > >> + alternates_file=$1 >> + working_dir=$2 > > These are good (they can be on a single line), but you would > want &&-chain just as other lines. > >> + test_line_count = 1 $alternates_file && > > This needs to quote "$alternates_file" especially in a helper > function you have no control over future callers of. > > I wonder if we want to check the actual contents of the alternate; > it may force us to worry about the infamous "should we expect > $(pwd)/something or $PWD/something" if we did so, so it is not a > strong suggestion. > >> + echo "0 objects, 0 kilobytes" >expect && >> + git -C $working_dir count-objects >current && >> + diff expect current > > It is more customary to name two "expect" vs "actual", and compare > them using "test_cmp" not "diff". > >> +} >> + >> test_expect_success 'preparing first repository' ' >> test_create_repo A && >> ( >> @@ -42,44 +52,20 @@ test_expect_success 'preparing superproject' ' >> ) >> ' >> >> -test_expect_success 'submodule add --reference' ' >> +test_expect_success 'submodule add --reference uses alternates' ' >> ( >> cd super && >> git submodule add --reference ../B "file://$base_dir/A" sub && >> git commit -m B-super-added >> - ) >> -' >> - >> -test_expect_success 'after add: existence of info/alternates' ' >> - test_line_count = 1 super/.git/modules/sub/objects/info/alternates >> -' >> - >> -test_expect_success 'that reference gets used with add' ' >> - ( >> - cd super/sub && >> - echo "0 objects, 0 kilobytes" > expected && >> - git count-objects > current && >> - diff expected current This is where the "diff" and "current" above came from. >> - ) >> -' > > Completely unrelated tangent, but after seeing the "how would we > make a more intelligent choice of the diff boundary" topic, I > wondered if we can notice that at this point there is a logical > boundary and do something intelligent about it. All the removed > lines above have become "test_alternate" we see below, while all the > removed lines below upto "test_alternate" correspond to the updated > test at the end. I guess that would require even more knowledge of the underlying content that we track in Git. Originally I started the diff boundary topic, as I assumed that the new line detection is not doing harm. What Michael came up with is impressive though I fear there might be a selection bias in the corpus, i.e. we are missing some projects that get worse by it and these projects would have had a great influence on the selection of the tuning parameters. I guess we'll just wait until someone speaks up and points at worse examples there. What you're asking here is a complete new ballpark IMHO as it takes diff to a whole new (syntactical) level. As of now the world agrees that '\n' seems to be a good character to put in text documents, so we use it as a splitting token in our underlying diff driver, i.e. the diffs are primarily by line, no matter how many characters change in that line. Sometimes there is a "secondary" aspect such as coloring and pointing out the characters that changed in a line[1]. [1] random example: https://gerrit-review.googlesource.com/#/c/79354/3/project.py Visually we focus on the changed characters, but the underlying diff is still "by line". We could write another "diff driver" that would not segment the text by '\n' as a primary method and then showing the changed hunks with +/-, but it would try to find rearrangements in the underlying text: As an example, this patch with such a diff driver could read partially as: ----8<---- ***diff-driver: moved-text line 42 -> 14, length:7, post-operations: 1 test_expect_success 'that reference gets used with add' ' ( cd super/sub && * echo "0 objects, 0 kilobytes" > expected && * git count-objects > current && * diff expected current ) ***diff-driver:post-operation 1: *** modify moved text with: +test_alternate_usage() { + alternates_file="$1" && + working_dir="$2" && * echo "0 objects, 0 kilobytes" > expected && * git count-objects > current && * diff expected current + } ***diff-driver: deletion 40,50 - ) -' - -test_expect_success 'after add: existence of info/alternates' ' - test_line_count = 1 super/.git/modules/sub/objects/info/alternates -' - ----8<---- Thinking about this further, this would not require knowledge of the underlying content, it is actually "just" an intra-file-rename detection. We have a rename detection from one blob to another, so we could also have a similar thing to deduplicate within one blob, which would track moving code around. > >> -test_expect_success 'cloning superproject' ' >> - git clone super super-clone >> -' >> - >> -test_expect_success 'update with reference' ' >> - cd super-clone && git submodule update --init --reference ../B >> -' >> - >> -test_expect_success 'after update: existence of info/alternates' ' >> - test_line_count = 1 super-clone/.git/modules/sub/objects/info/alternates >> + ) && >> + test_alternate_usage super/.git/modules/sub/objects/info/alternates super/sub >> ' >> >> -test_expect_success 'that reference gets used with update' ' >> - cd super-clone/sub && >> - echo "0 objects, 0 kilobytes" > expected && >> - git count-objects > current && >> - diff expected current >> +test_expect_success 'updating superproject keeps alternates' ' >> + test_when_finished "rm -rf super-clone" && > > This one is new; we do not remove A, B or super. Does it matter if > we leave super-clone behind? Is super-clone so special? "We need it in a later test." It comes down to philosophy of how to write tests. I spent some time in 740* and this is a surprising short test. Compare to 7404 and 7400 for example. These are very long, so when you want to add another test (no matter if testing for a fixed regression or a new feature), you have lots of repos like super, super2, super3, sub, sub1, sumodule, anothersub, and none of the names make sense. (Can I reuse them? do they have some weird corner case configuration that I need to undo? etc) To stop that from happening again I want to have a clean slate, i.e. all clones are deleted shortly after using, so it is obvious what to use for testing. > >> + git clone super super-clone && >> + git -C super-clone submodule update --init --reference ../B && >> + test_alternate_usage super-clone/.git/modules/sub/objects/info/alternates super-clone/sub >> ' >> >> test_done