From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net X-Spam-Level: X-Spam-ASN: AS53758 23.128.96.0/24 X-Spam-Status: No, score=-4.0 required=3.0 tests=AWL,BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI,SPF_HELO_PASS,SPF_PASS shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2 Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by dcvr.yhbt.net (Postfix) with ESMTP id 296ED1FA17 for ; Fri, 30 Apr 2021 21:41:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232611AbhD3Vlt (ORCPT ); Fri, 30 Apr 2021 17:41:49 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58036 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231905AbhD3Vlm (ORCPT ); Fri, 30 Apr 2021 17:41:42 -0400 Received: from mail-qv1-xf2e.google.com (mail-qv1-xf2e.google.com [IPv6:2607:f8b0:4864:20::f2e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A3B9FC06138D for ; Fri, 30 Apr 2021 14:40:53 -0700 (PDT) Received: by mail-qv1-xf2e.google.com with SMTP id t14so5118624qvl.10 for ; Fri, 30 Apr 2021 14:40:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=usp.br; s=usp-google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=t23D3WB4vmdvDXKy81WrtRCExr2GHpVXwNDoKFSvJ7E=; b=pZbGia0V/g6ma6aMvWf35AKO0YHBHmTxGg2xa3cV1Tj9TabRwVEAu5JXLA9CtvUEHZ 65B3haAQvCwjAoXLjWqMnyjmxyWnP5kHX6Wfqt3no2Im5QzwjxHD9IP0L0zf82iHeE6j t+KSft153Z1eaFl8e+cVx8Z3Thkseo2DI10LbXfxtkG+YLyUr/tgsEYbbXnaf4g2oC+N O3x8nizjexEKU8Q+bBOM7dQ0/6Gj5zsTBqwvRLGknFZK2qjgUwZY5Cg+X1xAD5IO6AGG X7v5lQ4lWYy2QDu1mHUc5QNS2/NIJnU+DZ/Xr5XThlE7A2TMelqReukTJj6+s95TPzdZ njzg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=t23D3WB4vmdvDXKy81WrtRCExr2GHpVXwNDoKFSvJ7E=; b=Z2WvXM1Jc6YAZpzRbjL/0ePSCFkhcYA2W4udXBG41cbCgiShR9CQJFbPtWLNc9p+yU Z5On8sww7/5iZk9l69tuuKm04qfqSLCIyeYGheDwP4//usyDtvQf7aKjljW66cJwxgut cDAOlRsfiavx7/NuYfZW4LzJHBGBG8Tirb9T4QdzUhaZWXbU31z36t0xQZlTg9b6RcdT 4RUSRDBcd8+rutfeLS1126qbFWn7GS6HZI81nGybgjzVonThBKQ7CMQfb4XqEPJANEvt 0e47qyXdLMtE12ZuCELQNlLxvL41CKaZDFstszSw4Iy0wVljxRY88aBBC+6ssNsHmmgV GUdQ== X-Gm-Message-State: AOAM533+ipAyaz1B/RL6XLeUOuXFjnDgkP4kafYEPBnhjvo/D/tU1nOh eBlAbOshTgfj1M9IhPwbhOxX3Uswsf8TuA== X-Google-Smtp-Source: ABdhPJwar9i9IpnokQuJMOUW9wr+WSoT5I7ksZOTPD6JHVn+0zikKTlgs/dh+ByPdM3DBhiLgNepRw== X-Received: by 2002:ad4:504f:: with SMTP id m15mr8022337qvq.56.1619818852558; Fri, 30 Apr 2021 14:40:52 -0700 (PDT) Received: from mango.meuintelbras.local ([177.32.118.149]) by smtp.gmail.com with ESMTPSA id j13sm3123718qth.57.2021.04.30.14.40.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 30 Apr 2021 14:40:52 -0700 (PDT) From: Matheus Tavares To: git@vger.kernel.org Cc: christian.couder@gmail.com, git@jeffhostetler.com, stolee@gmail.com Subject: [PATCH v2 5/8] parallel-checkout: add tests related to path collisions Date: Fri, 30 Apr 2021 18:40:32 -0300 Message-Id: X-Mailer: git-send-email 2.30.1 In-Reply-To: References: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Add tests to confirm that path collisions are properly detected by checkout workers, both to avoid race conditions and to report colliding entries on clone. Co-authored-by: Jeff Hostetler Signed-off-by: Matheus Tavares --- parallel-checkout.c | 4 + t/lib-parallel-checkout.sh | 4 +- t/t2081-parallel-checkout-collisions.sh | 162 ++++++++++++++++++++++++ 3 files changed, 168 insertions(+), 2 deletions(-) create mode 100755 t/t2081-parallel-checkout-collisions.sh diff --git a/parallel-checkout.c b/parallel-checkout.c index 09e8b10a35..6fb3f1e6c9 100644 --- a/parallel-checkout.c +++ b/parallel-checkout.c @@ -8,6 +8,7 @@ #include "sigchain.h" #include "streaming.h" #include "thread-utils.h" +#include "trace2.h" struct pc_worker { struct child_process cp; @@ -326,6 +327,7 @@ void write_pc_item(struct parallel_checkout_item *pc_item, if (dir_sep && !has_dirs_only_path(path.buf, dir_sep - path.buf, state->base_dir_len)) { pc_item->status = PC_ITEM_COLLIDED; + trace2_data_string("pcheckout", NULL, "collision/dirname", path.buf); goto out; } @@ -341,6 +343,8 @@ void write_pc_item(struct parallel_checkout_item *pc_item, * call should have already caught these cases. */ pc_item->status = PC_ITEM_COLLIDED; + trace2_data_string("pcheckout", NULL, + "collision/basename", path.buf); } else { error_errno("failed to open file '%s'", path.buf); pc_item->status = PC_ITEM_FAILED; diff --git a/t/lib-parallel-checkout.sh b/t/lib-parallel-checkout.sh index f60b22ef34..d6740425b1 100644 --- a/t/lib-parallel-checkout.sh +++ b/t/lib-parallel-checkout.sh @@ -22,12 +22,12 @@ test_checkout_workers () { local trace_file=trace-test-checkout-workers && rm -f "$trace_file" && - GIT_TRACE2="$(pwd)/$trace_file" "$@" && + GIT_TRACE2="$(pwd)/$trace_file" "$@" 2>&8 && local workers=$(grep "child_start\[..*\] git checkout--worker" "$trace_file" | wc -l) && test $workers -eq $expected_workers && rm "$trace_file" -} +} 8>&2 2>&4 # Verify that both the working tree and the index were created correctly verify_checkout () { diff --git a/t/t2081-parallel-checkout-collisions.sh b/t/t2081-parallel-checkout-collisions.sh new file mode 100755 index 0000000000..f6fcfc0c1e --- /dev/null +++ b/t/t2081-parallel-checkout-collisions.sh @@ -0,0 +1,162 @@ +#!/bin/sh + +test_description="path collisions during parallel checkout + +Parallel checkout must detect path collisions to: + +1) Avoid racily writing to different paths that represent the same file on disk. +2) Report the colliding entries on clone. + +The tests in this file exercise parallel checkout's collision detection code in +both these mechanics. +" + +. ./test-lib.sh +. "$TEST_DIRECTORY/lib-parallel-checkout.sh" + +TEST_ROOT="$PWD" + +test_expect_success CASE_INSENSITIVE_FS 'setup' ' + empty_oid=$(git hash-object -w --stdin objs <<-EOF && + 100644 $empty_oid FILE_X + 100644 $empty_oid FILE_x + 100644 $empty_oid file_X + 100644 $empty_oid file_x + EOF + git update-index --index-info >filter.log + EOF +' + +test_workers_in_event_trace () +{ + test $1 -eq $(grep ".event.:.child_start..*checkout--worker" $2 | wc -l) +} + +test_expect_success CASE_INSENSITIVE_FS 'worker detects basename collision' ' + GIT_TRACE2_EVENT="$(pwd)/trace" git \ + -c checkout.workers=2 -c checkout.thresholdForParallelism=0 \ + checkout . && + + test_workers_in_event_trace 2 trace && + collisions=$(grep -i "category.:.pcheckout.,.key.:.collision/basename.,.value.:.file_x.}" trace | wc -l) && + test $collisions -eq 3 +' + +test_expect_success CASE_INSENSITIVE_FS 'worker detects dirname collision' ' + test_config filter.logger.smudge "\"$TEST_ROOT/logger_script\" %f" && + empty_oid=$(git hash-object -w --stdin objs <<-EOF && + 100644 $empty_oid A/B + 100644 $empty_oid A/C + 100644 $empty_oid a + 100644 $attr_oid .gitattributes + EOF + git rm -rf . && + git update-index --index-info expected.log && + test_cmp filter.log expected.log && + + # Check that it used the right number of workers and detected the collisions + test_workers_in_event_trace 2 trace && + grep "category.:.pcheckout.,.key.:.collision/dirname.,.value.:.A/B.}" trace && + grep "category.:.pcheckout.,.key.:.collision/dirname.,.value.:.A/C.}" trace +' + +test_expect_success SYMLINKS,CASE_INSENSITIVE_FS 'do not follow symlinks colliding with leading dir' ' + empty_oid=$(git hash-object -w --stdin objs <<-EOF && + 120000 $symlink_oid D + 100644 $empty_oid d/x + 100644 $empty_oid e/y + EOF + git rm -rf . && + git update-index --index-info stderr && + + grep FILE_X stderr && + grep FILE_x stderr && + grep file_X stderr && + grep file_x stderr && + grep "the following paths have collided" stderr +' + +# This test ensures that the collision report code is correctly looking for +# colliding peers in the second half of the cache_entry array. This is done by +# defining a smudge command for the *last* array entry, which makes it +# non-eligible for parallel-checkout. Thus, it is checked out *first*, before +# spawning the workers. +# +# Note: this test doesn't work on Windows because, on this system, the +# collision report code uses strcmp() to find the colliding pairs when +# core.ignoreCase is false. And we need this setting for this test so that only +# 'file_x' matches the pattern of the filter attribute. But the test works on +# OSX, where the colliding pairs are found using inode. +# +test_expect_success CASE_INSENSITIVE_FS,!MINGW,!CYGWIN \ + 'collision report on clone (w/ colliding peer after the detected entry)' ' + + test_config_global filter.logger.smudge "\"$TEST_ROOT/logger_script\" %f" && + git reset --hard basename_collision && + echo "file_x filter=logger" >.gitattributes && + git add .gitattributes && + git commit -m "filter for file_x" && + + rm -rf clone-repo && + set_checkout_config 2 0 && + test_checkout_workers 2 \ + git -c core.ignoreCase=false clone . clone-repo 2>stderr && + + grep FILE_X stderr && + grep FILE_x stderr && + grep file_X stderr && + grep file_x stderr && + grep "the following paths have collided" stderr && + + # Check that only "file_x" was filtered + echo file_x >expected.log && + test_cmp clone-repo/filter.log expected.log +' + +test_done -- 2.30.1