From: Matheus Tavares <matheus.bernardino@usp.br>
To: git@vger.kernel.org
Cc: gitster@pobox.com, git@jeffhostetler.com,
	chriscool@tuxfamily.org, peff@peff.net, newren@gmail.com,
	jrnieder@gmail.com, martin.agren@gmail.com
Subject: [PATCH v3 00/19] Parallel Checkout (part I)
Date: Wed, 28 Oct 2020 23:14:37 -0300	[thread overview]
Message-ID: <cover.1603937110.git.matheus.bernardino@usp.br> (raw)
In-Reply-To: <cover.1600814153.git.matheus.bernardino@usp.br>

There were some semantic conflicts between this series and
jk/checkout-index-errors, so I rebased my series on top of that branch.

Also, I'd like to ask reviewers to confirm that the file descriptor
redirection in git_pc() (patch 17) is correct, as I'm not very
familiar with the test suite's descriptor conventions.
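
For reference, here is a sketch of what that redirection is meant to
do, assuming test-lib.sh's usual convention that descriptor 4 is where
a test's stderr normally goes (e.g. shown with --verbose):

	git_pc()
	{
		# Inside the function, fd 2 points at the test
		# framework's fd 4 (see the trailing redirections below),
		# so the helper's own chatter does not end up in whatever
		# the caller redirects stderr to.
		#
		# The actual git command writes its stderr to fd 8, which
		# is a copy of the caller's original stderr, so
		# `git_pc ... 2>output` still captures git's messages.
		"$@" 2>&8
	} 8>&2 2>&4
	# Note the order: fd 8 is duplicated from the original stderr
	# *before* fd 2 is pointed at fd 4.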

Main changes since v2:

Patch 10:
  - Squashed Peff's patch removing an unused function parameter.

Patch 11:
  - Valgrind used to complain about send_one_item() passing
    uninitialized bytes to a syscall (write(2)). The bytes in question
    come from the unused positions of oid->hash[] when the hash is
    SHA-1. Since the workers won't use these bytes, there is no real
    harm, but the warning could cause confusion and even get in the way
    of detecting real errors. So I replaced the oidcpy() call with
    hashcpy().
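
    To illustrate, the relevant layout is roughly (a simplified sketch,
    not the actual patch):

	struct object_id {
		unsigned char hash[GIT_MAX_RAWSZ]; /* 32 bytes */
	};

	/*
	 * With SHA-1, only the first the_hash_algo->rawsz (20) bytes of
	 * hash[] are initialized. oidcpy() copies all GIT_MAX_RAWSZ
	 * bytes, so the uninitialized tail ends up in the buffer that
	 * send_one_item() later hands to write(2), which is what
	 * Valgrind flags. hashcpy() copies only the_hash_algo->rawsz
	 * bytes, so nothing uninitialized reaches the syscall.
	 */
	hashcpy(fixed_portion->oid.hash, pc_item->ce->oid.hash);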

Patch 16:
  - Replaced the non-portable '\+' in grep with '..*' (in
    t/lib-parallel-checkout.sh); see the short note at the end of the
    Patch 16 items.

  - Properly quoted function parameters in t/lib-parallel-checkout.sh,
    as Jonathan pointed out.

  - In t2080, dropped tests that used git.git as test data, and added
    two more tests to check clone with parallel-checkout using the
    artificial repo already created for other tests.

  - No longer skip the clone tests when GIT_TEST_DEFAULT_HASH is
    sha256. A bug in clone used to make these tests fail with that
    setting, but it was fixed in 47ac970309 ("builtin/clone: avoid
    failure with GIT_DEFAULT_HASH", 2020-09-20).
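
    A note on the grep change above: '\+' in a basic regular expression
    is a GNU extension, whereas '..*' (any character followed by zero or
    more characters) is the portable way to express "one or more
    characters":

	# non-portable (GNU-only BRE extension):
	grep "child_start\[.\+\] git checkout--helper" trace

	# portable equivalent, as now used in t/lib-parallel-checkout.sh:
	grep "child_start\[..*\] git checkout--helper" trace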

Patch 17:
  - The test t2081-parallel-checkout-collisions.sh had a bug in which
    the filter options were passed to git incorrectly. These options
    were conditionally defined through a shell variable whose quoting
    was wrong. This should have made the test fail but, in fact,
    another bug (using the arithmetic operator `-eq` on strings) was
    preventing the problematic section from ever running. Both bugs are
    now fixed, and the test script was also simplified by using
    lib-parallel-checkout.sh and eliminating the helper function. (Both
    pitfalls are illustrated briefly after these items.)

  - Use "$TEST_ROOT/logger_script" instead of "../logger_script", to be
    on the safe side.
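
    To illustrate the two pitfalls mentioned above, here is a simplified
    sketch of the old (buggy) test code:

	# `test ... -eq ...` is an arithmetic comparison; with string
	# operands it errors out and returns false, so this branch
	# never ran:
	if test "$filter" -eq "use_filter"
	then
		# The embedded double quotes are literal characters, not
		# shell quoting; when $filter_opts was later expanded
		# unquoted, its value was split at every space:
		filter_opts='-c filter.logger.smudge="../logger_script %f"'
	fi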


Jeff Hostetler (4):
  convert: make convert_attrs() and convert structs public
  convert: add [async_]convert_to_working_tree_ca() variants
  convert: add get_stream_filter_ca() variant
  convert: add conv_attrs classification

Matheus Tavares (15):
  entry: extract a header file for entry.c functions
  entry: make fstat_output() and read_blob_entry() public
  entry: extract cache_entry update from write_entry()
  entry: move conv_attrs lookup up to checkout_entry()
  entry: add checkout_entry_ca() which takes preloaded conv_attrs
  unpack-trees: add basic support for parallel checkout
  parallel-checkout: make it truly parallel
  parallel-checkout: support progress displaying
  make_transient_cache_entry(): optionally alloc from mem_pool
  builtin/checkout.c: complete parallel checkout support
  checkout-index: add parallel checkout support
  parallel-checkout: add tests for basic operations
  parallel-checkout: add tests related to clone collisions
  parallel-checkout: add tests related to .gitattributes
  ci: run test round with parallel-checkout enabled

 .gitignore                              |   1 +
 Documentation/config/checkout.txt       |  21 +
 Makefile                                |   2 +
 apply.c                                 |   1 +
 builtin.h                               |   1 +
 builtin/checkout--helper.c              | 142 ++++++
 builtin/checkout-index.c                |  22 +-
 builtin/checkout.c                      |  21 +-
 builtin/difftool.c                      |   3 +-
 cache.h                                 |  34 +-
 ci/run-build-and-tests.sh               |   1 +
 convert.c                               | 121 +++--
 convert.h                               |  68 +++
 entry.c                                 | 102 ++--
 entry.h                                 |  54 ++
 git.c                                   |   2 +
 parallel-checkout.c                     | 638 ++++++++++++++++++++++++
 parallel-checkout.h                     | 103 ++++
 read-cache.c                            |  12 +-
 t/README                                |   4 +
 t/lib-encoding.sh                       |  25 +
 t/lib-parallel-checkout.sh              |  46 ++
 t/t0028-working-tree-encoding.sh        |  25 +-
 t/t2080-parallel-checkout-basics.sh     | 170 +++++++
 t/t2081-parallel-checkout-collisions.sh |  98 ++++
 t/t2082-parallel-checkout-attributes.sh | 174 +++++++
 unpack-trees.c                          |  22 +-
 27 files changed, 1758 insertions(+), 155 deletions(-)
 create mode 100644 builtin/checkout--helper.c
 create mode 100644 entry.h
 create mode 100644 parallel-checkout.c
 create mode 100644 parallel-checkout.h
 create mode 100644 t/lib-encoding.sh
 create mode 100644 t/lib-parallel-checkout.sh
 create mode 100755 t/t2080-parallel-checkout-basics.sh
 create mode 100755 t/t2081-parallel-checkout-collisions.sh
 create mode 100755 t/t2082-parallel-checkout-attributes.sh

Range-diff against v2:
 1:  b9d2a329d3 =  1:  dfc3e0fd62 convert: make convert_attrs() and convert structs public
 2:  313c3bcbeb =  2:  c5fbd1e16d convert: add [async_]convert_to_working_tree_ca() variants
 3:  29bbdb78e9 =  3:  c77b16f694 convert: add get_stream_filter_ca() variant
 4:  a1cf5df961 =  4:  18c3f4247e convert: add conv_attrs classification
 5:  25b311745a =  5:  2caa2c4345 entry: extract a header file for entry.c functions
 6:  dbee09e936 =  6:  bfa52df9e2 entry: make fstat_output() and read_blob_entry() public
 7:  b61b5c44f0 =  7:  91ef17f533 entry: extract cache_entry update from write_entry()
 8:  667ad0dea7 =  8:  81e03baab1 entry: move conv_attrs lookup up to checkout_entry()
 9:  4ddb34209e =  9:  e1b886f823 entry: add checkout_entry_ca() which takes preloaded conv_attrs
10:  af0d790973 ! 10:  2bdc13664e unpack-trees: add basic support for parallel checkout
    @@ parallel-checkout.c (new)
     +}
     +
     +static int write_pc_item_to_fd(struct parallel_checkout_item *pc_item, int fd,
    -+			       const char *path, struct checkout *state)
    ++			       const char *path)
     +{
     +	int ret;
     +	struct stream_filter *filter;
    @@ parallel-checkout.c (new)
     +		goto out;
     +	}
     +
    -+	if (write_pc_item_to_fd(pc_item, fd, path.buf, state)) {
    ++	if (write_pc_item_to_fd(pc_item, fd, path.buf)) {
     +		/* Error was already reported. */
     +		pc_item->status = PC_ITEM_FAILED;
     +		goto out;
11:  991169488b ! 11:  096e543fd2 parallel-checkout: make it truly parallel
    @@ Documentation/config/checkout.txt: will checkout the '<something>' branch on ano
     +	The number of parallel workers to use when updating the working tree.
     +	The default is one, i.e. sequential execution. If set to a value less
     +	than one, Git will use as many workers as the number of logical cores
    -+	available. This setting and checkout.thresholdForParallelism affect all
    -+	commands that perform checkout. E.g. checkout, switch, clone, reset,
    -+	sparse-checkout, read-tree, etc.
    ++	available. This setting and `checkout.thresholdForParallelism` affect
    ++	all commands that perform checkout. E.g. checkout, clone, reset,
    ++	sparse-checkout, etc.
     ++
     +Note: parallel checkout usually delivers better performance for repositories
     +located on SSDs or over NFS. For repositories on spinning disks and/or machines
    @@ parallel-checkout.c: static void write_pc_item(struct parallel_checkout_item *pc
     +
     +	fixed_portion = (struct pc_item_fixed_portion *)data;
     +	fixed_portion->id = pc_item->id;
    -+	oidcpy(&fixed_portion->oid, &pc_item->ce->oid);
     +	fixed_portion->ce_mode = pc_item->ce->ce_mode;
     +	fixed_portion->crlf_action = pc_item->ca.crlf_action;
     +	fixed_portion->ident = pc_item->ca.ident;
     +	fixed_portion->name_len = name_len;
     +	fixed_portion->working_tree_encoding_len = working_tree_encoding_len;
    ++	/*
    ++	 * We use hashcpy() instead of oidcpy() because the hash[] positions
    ++	 * after `the_hash_algo->rawsz` might not be initialized. And Valgrind
    ++	 * would complain about passing uninitialized bytes to a syscall
    ++	 * (write(2)). There is no real harm in this case, but the warning could
    ++	 * hinder the detection of actual errors.
    ++	 */
    ++	hashcpy(fixed_portion->oid.hash, pc_item->ce->oid.hash);
     +
     +	variant = data + sizeof(*fixed_portion);
     +	if (working_tree_encoding_len) {
12:  7ceadf2427 = 12:  9cfeb4821c parallel-checkout: support progress displaying
13:  f13b4c17f4 = 13:  da99b671e6 make_transient_cache_entry(): optionally alloc from mem_pool
14:  d7885a1130 = 14:  d3d561754a builtin/checkout.c: complete parallel checkout support
15:  1cf9b807f7 ! 15:  ee34c6e149 checkout-index: add parallel checkout support
    @@ builtin/checkout-index.c
      #define CHECKOUT_ALL 4
      static int nul_term_line;
     @@ builtin/checkout-index.c: int cmd_checkout_index(int argc, const char **argv, const char *prefix)
    - 	int prefix_length;
      	int force = 0, quiet = 0, not_new = 0;
      	int index_opt = 0;
    + 	int err = 0;
     +	int pc_workers, pc_threshold;
      	struct option builtin_checkout_index_options[] = {
      		OPT_BOOL('a', "all", &all,
    @@ builtin/checkout-index.c: int cmd_checkout_index(int argc, const char **argv, co
      	for (i = 0; i < argc; i++) {
      		const char *arg = argv[i];
     @@ builtin/checkout-index.c: int cmd_checkout_index(int argc, const char **argv, const char *prefix)
    + 		strbuf_release(&buf);
    + 	}
    + 
    +-	if (err)
    +-		return 1;
    +-
      	if (all)
      		checkout_all(prefix, prefix_length);
      
     +	if (pc_workers > 1) {
    -+		/* Errors were already reported */
    -+		run_parallel_checkout(&state, pc_workers, pc_threshold,
    -+				      NULL, NULL);
    ++		err |= run_parallel_checkout(&state, pc_workers, pc_threshold,
    ++					     NULL, NULL);
     +	}
    ++
    ++	if (err)
    ++		return 1;
     +
      	if (is_lock_file_locked(&lock_file) &&
      	    write_locked_index(&the_index, &lock_file, COMMIT_LOCK))
16:  64b41d537e ! 16:  05299a3cc0 parallel-checkout: add tests for basic operations
    @@ Commit message
         for symlinks in the leading directories and the abidance to --force.
     
         Note: some helper functions are added to a common lib file which is only
    -    included by t2080 for now. But it will also be used by another
    -    parallel-checkout test in a following patch.
    +    included by t2080 for now. But it will also be used by other
    +    parallel-checkout tests in the following patches.
     
         Original-patch-by: Jeff Hostetler <jeffhost@microsoft.com>
         Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
    @@ t/lib-parallel-checkout.sh (new)
     +
      +# Runs `git -c checkout.workers=$1 -c checkout.thresholdForParallelism=$2 ${@:4}`
     +# and checks that the number of workers spawned is equal to $3.
    ++#
     +git_pc()
     +{
     +	if test $# -lt 4
     +	then
     +		BUG "too few arguments to git_pc()"
    -+	fi
    ++	fi &&
     +
     +	workers=$1 threshold=$2 expected_workers=$3 &&
    -+	shift && shift && shift &&
    ++	shift 3 &&
     +
     +	rm -f trace &&
     +	GIT_TRACE2="$(pwd)/trace" git \
     +		-c checkout.workers=$workers \
     +		-c checkout.thresholdForParallelism=$threshold \
     +		-c advice.detachedHead=0 \
    -+		$@ &&
    ++		"$@" &&
     +
     +	# Check that the expected number of workers has been used. Note that it
    -+	# can be different than the requested number in two cases: when the
    -+	# quantity of entries to be checked out is less than the number of
    -+	# workers; and when the threshold has not been reached.
    ++	# can be different from the requested number in two cases: when the
    ++	# threshold is not reached; and when there are not enough
    ++	# parallel-eligible entries for all workers.
     +	#
    -+	local workers_in_trace=$(grep "child_start\[.\+\] git checkout--helper" trace | wc -l) &&
    ++	local workers_in_trace=$(grep "child_start\[..*\] git checkout--helper" trace | wc -l) &&
     +	test $workers_in_trace -eq $expected_workers &&
     +	rm -f trace
     +}
    @@ t/lib-parallel-checkout.sh (new)
     +# Verify that both the working tree and the index were created correctly
     +verify_checkout()
     +{
    -+	git -C $1 diff-index --quiet HEAD -- &&
    -+	git -C $1 diff-index --quiet --cached HEAD -- &&
    -+	git -C $1 status --porcelain >$1.status &&
    -+	test_must_be_empty $1.status
    ++	git -C "$1" diff-index --quiet HEAD -- &&
    ++	git -C "$1" diff-index --quiet --cached HEAD -- &&
    ++	git -C "$1" status --porcelain >"$1".status &&
    ++	test_must_be_empty "$1".status
     +}
     
      ## t/t2080-parallel-checkout-basics.sh (new) ##
    @@ t/t2080-parallel-checkout-basics.sh (new)
     +. ./test-lib.sh
     +. "$TEST_DIRECTORY/lib-parallel-checkout.sh"
     +
    -+# NEEDSWORK: cloning a SHA1 repo with GIT_TEST_DEFAULT_HASH set to "sha256"
    -+# currently produces a wrong result (See
    -+# https://lore.kernel.org/git/20200911151717.43475-1-matheus.bernardino@usp.br/).
    -+# So we skip the "parallel-checkout during clone" tests when this test flag is
    -+# set to "sha256". Remove this when the bug is fixed.
    -+#
    -+if test "$GIT_TEST_DEFAULT_HASH" = "sha256"
    -+then
    -+	skip_all="t2080 currently don't work with GIT_TEST_DEFAULT_HASH=sha256"
    -+	test_done
    -+fi
    -+
    -+R_BASE=$GIT_BUILD_DIR
    -+
    -+test_expect_success 'sequential clone' '
    -+	git_pc 1 0 0 clone --quiet -- $R_BASE r_sequential &&
    -+	verify_checkout r_sequential
    -+'
    -+
    -+test_expect_success 'parallel clone' '
    -+	git_pc 2 0 2 clone --quiet -- $R_BASE r_parallel &&
    -+	verify_checkout r_parallel
    -+'
    -+
    -+test_expect_success 'fallback to sequential clone (threshold)' '
    -+	git -C $R_BASE ls-files >files &&
    -+	nr_files=$(wc -l <files) &&
    -+	threshold=$(($nr_files + 1)) &&
    -+
    -+	git_pc 2 $threshold 0 clone --quiet -- $R_BASE r_sequential_fallback &&
    -+	verify_checkout r_sequential_fallback
    -+'
    -+
    -+# Just to be paranoid, actually compare the contents of the worktrees directly.
    -+test_expect_success 'compare working trees from clones' '
    -+	rm -rf r_sequential/.git &&
    -+	rm -rf r_parallel/.git &&
    -+	rm -rf r_sequential_fallback/.git &&
    -+	diff -qr r_sequential r_parallel &&
    -+	diff -qr r_sequential r_sequential_fallback
    -+'
    -+
     +# Test parallel-checkout with different operations (creation, deletion,
     +# modification) and entry types. A branch switch from B1 to B2 will contain:
     +#
    @@ t/t2080-parallel-checkout-basics.sh (new)
     +	verify_checkout various_sequential_fallback
     +'
     +
    -+test_expect_success SYMLINKS 'compare working trees from checkouts' '
    -+	rm -rf various_sequential/.git &&
    -+	rm -rf various_parallel/.git &&
    -+	rm -rf various_sequential_fallback/.git &&
    -+	diff -qr various_sequential various_parallel &&
    -+	diff -qr various_sequential various_sequential_fallback
    ++test_expect_success SYMLINKS 'parallel checkout on clone' '
    ++	git -C various checkout --recurse-submodules B2 &&
    ++	git_pc 2 0 2 clone --recurse-submodules various various_parallel_clone  &&
    ++	verify_checkout various_parallel_clone
    ++'
    ++
    ++test_expect_success SYMLINKS 'fallback to sequential checkout on clone (threshold)' '
    ++	git -C various checkout --recurse-submodules B2 &&
    ++	git_pc 2 100 0 clone --recurse-submodules various various_sequential_fallback_clone &&
    ++	verify_checkout various_sequential_fallback_clone
    ++'
    ++
    ++# Just to be paranoid, actually compare the working trees' contents directly.
    ++test_expect_success SYMLINKS 'compare the working trees' '
    ++	rm -rf various_*/.git &&
    ++	rm -rf various_*/d/.git &&
    ++
    ++	diff -r various_sequential various_parallel &&
    ++	diff -r various_sequential various_sequential_fallback &&
    ++	diff -r various_sequential various_parallel_clone &&
    ++	diff -r various_sequential various_sequential_fallback_clone
     +'
     +
     +test_cmp_str()
17:  70708d3e31 ! 17:  3d140dcacb parallel-checkout: add tests related to clone collisions
    @@ Commit message
         Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
         Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
     
    + ## t/lib-parallel-checkout.sh ##
    +@@ t/lib-parallel-checkout.sh: git_pc()
    + 		-c checkout.workers=$workers \
    + 		-c checkout.thresholdForParallelism=$threshold \
    + 		-c advice.detachedHead=0 \
    +-		"$@" &&
    ++		"$@" 2>&8 &&
    + 
    + 	# Check that the expected number of workers has been used. Note that it
    + 	# can be different from the requested number in two cases: when the
    +@@ t/lib-parallel-checkout.sh: git_pc()
    + 	local workers_in_trace=$(grep "child_start\[..*\] git checkout--helper" trace | wc -l) &&
    + 	test $workers_in_trace -eq $expected_workers &&
    + 	rm -f trace
    +-}
    ++} 8>&2 2>&4
    + 
    + # Verify that both the working tree and the index were created correctly
    + verify_checkout()
    +
      ## t/t2081-parallel-checkout-collisions.sh (new) ##
     @@
     +#!/bin/sh
     +
    -+test_description='parallel-checkout collisions'
    ++test_description='parallel-checkout collisions
    ++
    ++When there are path collisions during a clone, Git should report a warning
    ++listing all of the colliding entries. The sequential code detects a collision
    ++by calling lstat() before trying to open(O_CREAT) the file. Then, to find the
    ++colliding pair of an item k, it searches cache_entry[0, k-1].
    ++
    ++This is not sufficient in parallel checkout since:
    ++
    ++- A colliding file may be created between the lstat() and open() calls;
    ++- A colliding entry might appear in the second half of the cache_entry array.
    ++
    ++The tests in this file make sure that the collision detection code is extended
    ++for parallel checkout.
    ++'
     +
     +. ./test-lib.sh
    ++. "$TEST_DIRECTORY/lib-parallel-checkout.sh"
     +
    -+# When there are pathname collisions during a clone, Git should report a warning
    -+# listing all of the colliding entries. The sequential code detects a collision
    -+# by calling lstat() before trying to open(O_CREAT) the file. Then, to find the
    -+# colliding pair of an item k, it searches cache_entry[0, k-1].
    -+#
    -+# This is not sufficient in parallel-checkout mode since colliding files may be
    -+# created in a racy order. The tests in this file make sure the collision
    -+# detection code is extended for parallel-checkout. This is done in two parts:
    -+#
    -+# - First, two parallel workers create four colliding files racily.
    -+# - Then this exercise is repeated but forcing the colliding pair to appear in
    -+#   the second half of the cache_entry's array.
    -+#
    -+# The second item uses the fact that files with clean/smudge filters are not
    -+# parallel-eligible; and that they are processed sequentially *before* any
    -+# worker is spawned. We set a filter attribute to the last entry in the
    -+# cache_entry[] array, making it non-eligible, so that it is populated first.
    -+# This way, we can test if the collision detection code is correctly looking
    -+# for collision pairs in the second half of the array.
    ++TEST_ROOT="$PWD"
     +
     +test_expect_success CASE_INSENSITIVE_FS 'setup' '
    -+	file_hex=$(git hash-object -w --stdin </dev/null) &&
    -+	file_oct=$(echo $file_hex | hex2oct) &&
    ++	file_x_hex=$(git hash-object -w --stdin </dev/null) &&
    ++	file_x_oct=$(echo $file_x_hex | hex2oct) &&
     +
     +	attr_hex=$(echo "file_x filter=logger" | git hash-object -w --stdin) &&
     +	attr_oct=$(echo $attr_hex | hex2oct) &&
     +
    -+	printf "100644 FILE_X\0${file_oct}" >tree &&
    -+	printf "100644 FILE_x\0${file_oct}" >>tree &&
    -+	printf "100644 file_X\0${file_oct}" >>tree &&
    -+	printf "100644 file_x\0${file_oct}" >>tree &&
    ++	printf "100644 FILE_X\0${file_x_oct}" >tree &&
    ++	printf "100644 FILE_x\0${file_x_oct}" >>tree &&
    ++	printf "100644 file_X\0${file_x_oct}" >>tree &&
    ++	printf "100644 file_x\0${file_x_oct}" >>tree &&
     +	printf "100644 .gitattributes\0${attr_oct}" >>tree &&
     +
     +	tree_hex=$(git hash-object -w -t tree --stdin <tree) &&
     +	commit_hex=$(git commit-tree -m collisions $tree_hex) &&
     +	git update-ref refs/heads/collisions $commit_hex &&
     +
    -+	write_script logger_script <<-\EOF
    ++	write_script "$TEST_ROOT"/logger_script <<-\EOF
     +	echo "$@" >>filter.log
     +	EOF
     +'
     +
    -+clone_and_check_collision()
    -+{
    -+	id=$1 workers=$2 threshold=$3 expected_workers=$4 filter=$5 &&
    -+
    -+	filter_opts=
    -+	if test "$filter" -eq "use_filter"
    -+	then
    -+		# We use `core.ignoreCase=0` so that only `file_x`
    -+		# matches the pattern in .gitattributes.
    -+		#
    -+		filter_opts='-c filter.logger.smudge="../logger_script %f" -c core.ignoreCase=0'
    -+	fi &&
    -+
    -+	test_path_is_missing $id.trace &&
    -+	GIT_TRACE2="$(pwd)/$id.trace" git \
    -+		-c checkout.workers=$workers \
    -+		-c checkout.thresholdForParallelism=$threshold \
    -+		$filter_opts clone --branch=collisions -- . r_$id 2>$id.warning &&
    -+
    -+	# Check that checkout spawned the right number of workers
    -+	workers_in_trace=$(grep "child_start\[.\] git checkout--helper" $id.trace | wc -l) &&
    -+	test $workers_in_trace -eq $expected_workers &&
    -+
    -+	if test $filter -eq "use_filter"
    -+	then
    -+		#  Make sure only 'file_x' was filtered
    -+		test_path_is_file r_$id/filter.log &&
    ++for mode in parallel sequential-fallback
    ++do
    ++
    ++	case $mode in
    ++	parallel)		workers=2 threshold=0 expected_workers=2 ;;
    ++	sequential-fallback)	workers=2 threshold=100 expected_workers=0 ;;
    ++	esac
    ++
    ++	test_expect_success CASE_INSENSITIVE_FS "collision detection on $mode clone" '
    ++		git_pc $workers $threshold $expected_workers \
    ++			clone --branch=collisions . $mode 2>$mode.stderr &&
    ++
    ++		grep FILE_X $mode.stderr &&
    ++		grep FILE_x $mode.stderr &&
    ++		grep file_X $mode.stderr &&
    ++		grep file_x $mode.stderr &&
    ++		test_i18ngrep "the following paths have collided" $mode.stderr
    ++	'
    ++
    ++	# The following test ensures that the collision detection code is
    ++	# correctly looking for colliding peers in the second half of the
    ++	# cache_entry array. This is done by defining a smudge command for the
    ++	# *last* array entry, which makes it non-eligible for parallel-checkout.
    ++	# The last entry is then checked out *before* any worker is spawned,
    ++	# making it succeed and the workers' entries collide.
    ++	#
     ++	# Note: this test doesn't work on Windows because, on this system,
    ++	# collision detection uses strcmp() when core.ignoreCase=false. And we
    ++	# have to set core.ignoreCase=false so that only 'file_x' matches the
    ++	# pattern of the filter attribute. But it works on OSX, where collision
    ++	# detection uses inode.
    ++	#
    ++	test_expect_success CASE_INSENSITIVE_FS,!MINGW,!CYGWIN "collision detection on $mode clone w/ filter" '
    ++		git_pc $workers $threshold $expected_workers \
    ++			-c core.ignoreCase=false \
    ++			-c filter.logger.smudge="\"$TEST_ROOT/logger_script\" %f" \
    ++			clone --branch=collisions . ${mode}_with_filter \
    ++			2>${mode}_with_filter.stderr &&
    ++
    ++		grep FILE_X ${mode}_with_filter.stderr &&
    ++		grep FILE_x ${mode}_with_filter.stderr &&
    ++		grep file_X ${mode}_with_filter.stderr &&
    ++		grep file_x ${mode}_with_filter.stderr &&
    ++		test_i18ngrep "the following paths have collided" ${mode}_with_filter.stderr &&
    ++
    ++		# Make sure only "file_x" was filtered
    ++		test_path_is_file ${mode}_with_filter/filter.log &&
     +		echo file_x >expected.filter.log &&
    -+		test_cmp r_$id/filter.log expected.filter.log
    -+	else
    -+		test_path_is_missing r_$id/filter.log
    -+	fi &&
    -+
    -+	grep FILE_X $id.warning &&
    -+	grep FILE_x $id.warning &&
    -+	grep file_X $id.warning &&
    -+	grep file_x $id.warning &&
    -+	test_i18ngrep "the following paths have collided" $id.warning
    -+}
    -+
    -+test_expect_success CASE_INSENSITIVE_FS 'collision detection on parallel clone' '
    -+	clone_and_check_collision parallel 2 0 2
    -+'
    -+
    -+test_expect_success CASE_INSENSITIVE_FS 'collision detection on fallback to sequential clone' '
    -+	git ls-tree --name-only -r collisions >files &&
    -+	nr_files=$(wc -l <files) &&
    -+	threshold=$(($nr_files + 1)) &&
    -+	clone_and_check_collision sequential 2 $threshold 0
    -+'
    -+
    -+# The next two tests don't work on Windows because, on this system, collision
    -+# detection uses strcmp() (when core.ignoreCase=0) to find the colliding pair.
    -+# But they work on OSX, where collision detection uses inode.
    -+
    -+test_expect_success CASE_INSENSITIVE_FS,!MINGW,!CYGWIN 'collision detection on parallel clone w/ filter' '
    -+	clone_and_check_collision parallel-with-filter 2 0 2 use_filter
    -+'
    -+
    -+test_expect_success CASE_INSENSITIVE_FS,!MINGW,!CYGWIN 'collision detection on fallback to sequential clone w/ filter' '
    -+	git ls-tree --name-only -r collisions >files &&
    -+	nr_files=$(wc -l <files) &&
    -+	threshold=$(($nr_files + 1)) &&
    -+	clone_and_check_collision sequential-with-filter 2 $threshold 0 use_filter
    -+'
    ++		test_cmp ${mode}_with_filter/filter.log expected.filter.log
    ++	'
    ++done
     +
     +test_done
18:  ece38f0483 = 18:  b26f676cae parallel-checkout: add tests related to .gitattributes
19:  b4cb5905d2 ! 19:  641c61f9b6 ci: run test round with parallel-checkout enabled
    @@ t/lib-parallel-checkout.sh
     +
       # Runs `git -c checkout.workers=$1 -c checkout.thresholdForParallelism=$2 ${@:4}`
      # and checks that the number of workers spawned is equal to $3.
    - git_pc()
    -
    - ## t/t2081-parallel-checkout-collisions.sh ##
    -@@
    - test_description='parallel-checkout collisions'
    - 
    - . ./test-lib.sh
    -+. "$TEST_DIRECTORY/lib-parallel-checkout.sh"
    - 
    - # When there are pathname collisions during a clone, Git should report a warning
    - # listing all of the colliding entries. The sequential code detects a collision
    + #
-- 
2.28.0


