git@vger.kernel.org list mirror (unofficial, one of many)
* [RFC PATCH 00/21] [RFC] Parallel checkout
@ 2020-08-10 21:33 Matheus Tavares
  2020-08-10 21:33 ` [RFC PATCH 01/21] convert: make convert_attrs() and convert structs public Matheus Tavares
                   ` (23 more replies)
  0 siblings, 24 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-08-10 21:33 UTC (permalink / raw)
  To: git; +Cc: stolee, jeffhost

This series adds parallel workers to the checkout machinery. The cache
entries are distributed among helper processes, which are responsible
for reading, filtering, and writing the blobs to the working tree. This
should benefit all commands that call unpack_trees() or check_updates(),
such as checkout, clone, sparse-checkout, and checkout-index.
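
Conceptually, the distribution amounts to splitting the list of entries
into per-worker blocks (a simplified Python sketch of the idea only;
the actual series streams entries to checkout--helper processes over
pipes, and the real distribution strategy may differ):

```python
# Hypothetical sketch: split the to-be-written cache entries into
# contiguous blocks, one per worker, with sizes differing by at most one.

def distribute(entries, num_workers):
    """Split `entries` into `num_workers` contiguous blocks."""
    base, rem = divmod(len(entries), num_workers)
    blocks, start = [], 0
    for i in range(num_workers):
        size = base + (1 if i < rem else 0)
        blocks.append(entries[start:start + size])
        start += size
    return blocks

print(distribute(list(range(7)), 3))  # → [[0, 1, 2], [3, 4], [5, 6]]
```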

This proposal is based on two previous ones, by Duy [1] and Jeff [2]. It
uses some of the patches from these two series, with additional changes.
The final parallel version was benchmarked on three operations with
cold cache in the linux repo: cloning v5.8, checking out v5.8 from
v2.6.15, and checking out v5.8 from v5.7. The three tables below show
the mean run times and standard deviations over 5 runs on a local file
system, a Linux NFS server, and Amazon EFS. The number of workers was
chosen based on what produced the best result for each case.

Local:

            Clone                  Checkout I             Checkout II
Sequential  8.180 s ± 0.021 s      6.936 s ± 0.030 s      2.585 s ± 0.005 s
10 workers  3.406 s ± 0.187 s      2.164 s ± 0.033 s      1.050 s ± 0.021 s
Speedup     2.40 ± 0.13            3.21 ± 0.05            2.46 ± 0.05

Linux NFS server (v4.1, on EBS, single availability zone):

            Clone                  Checkout I             Checkout II
Sequential  208.069 s ± 2.522 s    198.610 s ± 1.979 s    54.376 s ± 1.333 s
32 workers  58.170 s ± 0.648 s     56.471 s ± 0.093 s     22.311 s ± 0.220 s
Speedup     3.58 ± 0.06            3.52 ± 0.04            2.44 ± 0.06

EFS (v4.1, replicated over multiple availability zones):

            Clone                  Checkout I             Checkout II
Sequential  1143.655 s ± 11.819 s  1277.891 s ± 10.481 s  396.891 s ± 7.505 s
64 workers  94.778 s ± 4.984 s     201.674 s ± 2.286 s    149.951 s ± 12.895 s
Speedup     12.07 ± 0.65           6.34 ± 0.09            2.65 ± 0.23


I also repeated the local benchmark tests including pc-p4-core [2], to
make sure the new proposal doesn't have performance regressions:

            Clone                  Checkout I             Checkout II
pc-p4-core  3.746 s ± 0.044 s      3.158 s ± 0.041 s      1.597 s ± 0.019 s
10 workers  3.595 s ± 0.111 s      2.263 s ± 0.027 s      1.098 s ± 0.023 s
Speedup     1.04 ± 0.03            1.40 ± 0.02            1.45 ± 0.04
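
The Speedup rows above follow from the means and standard deviations,
assuming independent errors and standard propagation for a ratio (a
small Python check, not part of the series):

```python
import math

def speedup(seq_mean, seq_sd, par_mean, par_sd):
    """Ratio of sequential to parallel run time, with the standard
    deviation propagated for a quotient of independent variables."""
    r = seq_mean / par_mean
    sd = r * math.sqrt((seq_sd / seq_mean) ** 2 + (par_sd / par_mean) ** 2)
    return round(r, 2), round(sd, 2)

# Local clone: 8.180 s ± 0.021 s sequential vs 3.406 s ± 0.187 s
# with 10 workers.
print(speedup(8.180, 0.021, 3.406, 0.187))  # → (2.4, 0.13)
```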


The series is divided into three blocks:

- The first 9 patches are preparatory steps in convert.c and entry.c.
- The middle 7 actually implement parallel checkout.
- The last 5 are ideas for further optimization of the parallel version.
  They don't make a huge difference on local file systems (e.g. the
  linux clone is only 1.04x faster than the previous parallel code), but
  on distributed file systems the difference is significant: 1.15x
  faster on NFS and 1.83x faster on Amazon EFS. (For comparison, the
  timings before these additional patches can be seen in the commit
  message of patch 11.)

The first 4 patches come from [2]. I haven't been able to get in touch
with Jeff yet to ask for his approval on them, so I didn't include his
Signed-off-by for the time being.

Note: we probably want to add some extra validation and perf tests. But,
for now, parallel checkout is enabled by default in this series (with no
threshold on the minimum number of entries), so the existing test suite
already exercises the parallel code (see [3]).

There are some additional optimization possibilities I want to
experiment with later, such as:
- Work stealing, to better redistribute tasks in case of non-uniform
  workloads. Duy already proposed a way to implement this in his
  original series.
- Add a --stat option to checkout--helper, to avoid calling stat() when
  state.refresh_cache is false.
- Try to detect when a repository is on NFS/EFS and automatically use a
  higher number of workers, as this turned out to be very effective on
  distributed file systems.

[1]: https://gitlab.com/pclouds/git/-/commits/parallel-checkout
[2]: https://github.com/jeffhostetler/git/commits/pc-p4-core
[3]: https://github.com/matheustavares/git/actions/runs/203036951 

----
Notes on the benchmarks:

Local tests were executed on an i7-7700HQ (4 cores with
hyper-threading) running Manjaro Linux, with an SSD. NFS and EFS tests
were executed on an Amazon EC2 c5n.large instance, with 2 vCPUs. The
Linux NFS server ran on an m6g.large instance with a 1 TB EBS GP2
volume. For the pc-p4-core tests, I used the set of parameters that
produced the fastest mean execution time (over 5 runs) on my machine,
which was:
- For clone: async mode, 22 helpers, 2 writers, 10 preloading slots
- For checkout I: async mode, 20 helpers, 2 writers, 20 preloading slots
- For checkout II: sync mode, 4 helpers, 2 writers, 30 preloading slots


Jeff Hostetler (4):
  convert: make convert_attrs() and convert structs public
  convert: add [async_]convert_to_working_tree_ca() variants
  convert: add get_stream_filter_ca() variant
  convert: add conv_attrs classification

Matheus Tavares (17):
  entry: extract a header file for entry.c functions
  entry: make fstat_output() and read_blob_entry() public
  entry: extract cache_entry update from write_entry()
  entry: move conv_attrs lookup up to checkout_entry()
  entry: add checkout_entry_ca() which takes preloaded conv_attrs
  unpack-trees: add basic support for parallel checkout
  parallel-checkout: make it truly parallel
  parallel-checkout: add configuration options
  parallel-checkout: support progress displaying
  make_transient_cache_entry(): optionally alloc from mem_pool
  builtin/checkout.c: complete parallel checkout support
  checkout-index: add parallel checkout support
  parallel-checkout: avoid stat() calls in workers
  entry: use is_dir_sep() when checking leading dirs
  symlinks: make has_dirs_only_path() track FL_NOENT
  parallel-checkout: create leading dirs in workers
  parallel-checkout: skip checking the working tree on clone

 .gitignore                        |   1 +
 Documentation/config/checkout.txt |  16 +
 Makefile                          |   2 +
 apply.c                           |   1 +
 builtin.h                         |   1 +
 builtin/checkout--helper.c        | 135 +++++++
 builtin/checkout-index.c          |  17 +
 builtin/checkout.c                |  21 +-
 builtin/difftool.c                |   3 +-
 cache.h                           |  35 +-
 convert.c                         | 121 +++---
 convert.h                         |  68 ++++
 entry.c                           | 180 +++++++--
 entry.h                           |  54 +++
 git.c                             |   2 +
 parallel-checkout.c               | 611 ++++++++++++++++++++++++++++++
 parallel-checkout.h               | 103 +++++
 read-cache.c                      |  12 +-
 symlinks.c                        |  42 +-
 unpack-trees.c                    |  24 +-
 20 files changed, 1292 insertions(+), 157 deletions(-)
 create mode 100644 builtin/checkout--helper.c
 create mode 100644 entry.h
 create mode 100644 parallel-checkout.c
 create mode 100644 parallel-checkout.h

-- 
2.27.0


^ permalink raw reply	[flat|nested] 154+ messages in thread

* [RFC PATCH 01/21] convert: make convert_attrs() and convert structs public
  2020-08-10 21:33 [RFC PATCH 00/21] [RFC] Parallel checkout Matheus Tavares
@ 2020-08-10 21:33 ` Matheus Tavares
  2020-08-10 21:33 ` [RFC PATCH 02/21] convert: add [async_]convert_to_working_tree_ca() variants Matheus Tavares
                   ` (22 subsequent siblings)
  23 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-08-10 21:33 UTC (permalink / raw)
  To: git
  Cc: stolee, jeffhost, Lars Schneider, Torsten Bögershausen,
	Junio C Hamano, brian m. carlson

From: Jeff Hostetler <jeffhost@microsoft.com>

Move convert_attrs() declaration from convert.c to convert.h, together
with the conv_attrs struct and the crlf_action enum. This function and
the data structures will be used outside convert.c in the upcoming
parallel checkout implementation.

[matheus.bernardino: squash and reword msg]
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 convert.c | 23 ++---------------------
 convert.h | 24 ++++++++++++++++++++++++
 2 files changed, 26 insertions(+), 21 deletions(-)

diff --git a/convert.c b/convert.c
index 572449825c..9710d770dc 100644
--- a/convert.c
+++ b/convert.c
@@ -24,17 +24,6 @@
 #define CONVERT_STAT_BITS_TXT_CRLF  0x2
 #define CONVERT_STAT_BITS_BIN       0x4
 
-enum crlf_action {
-	CRLF_UNDEFINED,
-	CRLF_BINARY,
-	CRLF_TEXT,
-	CRLF_TEXT_INPUT,
-	CRLF_TEXT_CRLF,
-	CRLF_AUTO,
-	CRLF_AUTO_INPUT,
-	CRLF_AUTO_CRLF
-};
-
 struct text_stat {
 	/* NUL, CR, LF and CRLF counts */
 	unsigned nul, lonecr, lonelf, crlf;
@@ -1300,18 +1289,10 @@ static int git_path_check_ident(struct attr_check_item *check)
 	return !!ATTR_TRUE(value);
 }
 
-struct conv_attrs {
-	struct convert_driver *drv;
-	enum crlf_action attr_action; /* What attr says */
-	enum crlf_action crlf_action; /* When no attr is set, use core.autocrlf */
-	int ident;
-	const char *working_tree_encoding; /* Supported encoding or default encoding if NULL */
-};
-
 static struct attr_check *check;
 
-static void convert_attrs(const struct index_state *istate,
-			  struct conv_attrs *ca, const char *path)
+void convert_attrs(const struct index_state *istate,
+		   struct conv_attrs *ca, const char *path)
 {
 	struct attr_check_item *ccheck = NULL;
 
diff --git a/convert.h b/convert.h
index e29d1026a6..aeb4a1be9a 100644
--- a/convert.h
+++ b/convert.h
@@ -37,6 +37,27 @@ enum eol {
 #endif
 };
 
+enum crlf_action {
+	CRLF_UNDEFINED,
+	CRLF_BINARY,
+	CRLF_TEXT,
+	CRLF_TEXT_INPUT,
+	CRLF_TEXT_CRLF,
+	CRLF_AUTO,
+	CRLF_AUTO_INPUT,
+	CRLF_AUTO_CRLF
+};
+
+struct convert_driver;
+
+struct conv_attrs {
+	struct convert_driver *drv;
+	enum crlf_action attr_action; /* What attr says */
+	enum crlf_action crlf_action; /* When no attr is set, use core.autocrlf */
+	int ident;
+	const char *working_tree_encoding; /* Supported encoding or default encoding if NULL */
+};
+
 enum ce_delay_state {
 	CE_NO_DELAY = 0,
 	CE_CAN_DELAY = 1,
@@ -102,6 +123,9 @@ void convert_to_git_filter_fd(const struct index_state *istate,
 int would_convert_to_git_filter_fd(const struct index_state *istate,
 				   const char *path);
 
+void convert_attrs(const struct index_state *istate,
+		   struct conv_attrs *ca, const char *path);
+
 /*
  * Initialize the checkout metadata with the given values.  Any argument may be
  * NULL if it is not applicable.  The treeish should be a commit if that is
-- 
2.27.0



* [RFC PATCH 02/21] convert: add [async_]convert_to_working_tree_ca() variants
  2020-08-10 21:33 [RFC PATCH 00/21] [RFC] Parallel checkout Matheus Tavares
  2020-08-10 21:33 ` [RFC PATCH 01/21] convert: make convert_attrs() and convert structs public Matheus Tavares
@ 2020-08-10 21:33 ` Matheus Tavares
  2020-08-10 21:33 ` [RFC PATCH 03/21] convert: add get_stream_filter_ca() variant Matheus Tavares
                   ` (21 subsequent siblings)
  23 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-08-10 21:33 UTC (permalink / raw)
  To: git
  Cc: stolee, jeffhost, Nguyễn Thái Ngọc Duy,
	Junio C Hamano, Lars Schneider

From: Jeff Hostetler <jeffhost@microsoft.com>

Separate the attribute gathering from the actual conversion by adding
_ca() variants of the conversion functions. These variants receive a
precomputed 'struct conv_attrs', and thus do not depend on an index
state. They will be used in a future patch adding parallel checkout
support, for two reasons:

- We will already load the conversion attributes in checkout_entry(),
  before conversion, to decide whether a path is eligible for parallel
  checkout. Therefore, it would be wasteful to load them again later,
  for the actual conversion.

- The parallel workers will be responsible for reading, converting and
  writing blobs to the working tree. They won't have access to the main
  process' index state, so they cannot load the attributes. Instead,
  they will receive the preloaded ones and call the _ca() variant of
  the conversion functions. Furthermore, the attributes machinery is
  optimized to handle paths in sequential order, so it's better to leave
  it for the main process, anyway.
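
The split can be illustrated with a toy analogue (hypothetical Python
with made-up names; it only mirrors the shape of the C API in the diff
below, not its actual behavior):

```python
# Toy analogue of the _ca() split: an expensive per-path attribute
# lookup is done once by the caller, and the conversion function takes
# the precomputed result instead of redoing the lookup itself.

def load_attrs(path):          # stands in for convert_attrs()
    return {"crlf": path.endswith(".txt")}

def convert(path, data):       # old-style entry point: looks up attrs itself
    return convert_ca(load_attrs(path), data)

def convert_ca(attrs, data):   # _ca() variant: attrs are passed in
    return data.replace(b"\n", b"\r\n") if attrs["crlf"] else data

attrs = load_attrs("a.txt")          # loaded once, in the "main process"
print(convert_ca(attrs, b"x\ny\n"))  # → b'x\r\ny\r\n'
```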

[matheus.bernardino: squash, remove one function definition and reword]
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 convert.c | 50 ++++++++++++++++++++++++++++++++++++--------------
 convert.h |  9 +++++++++
 2 files changed, 45 insertions(+), 14 deletions(-)

diff --git a/convert.c b/convert.c
index 9710d770dc..757dc2585c 100644
--- a/convert.c
+++ b/convert.c
@@ -1450,7 +1450,7 @@ void convert_to_git_filter_fd(const struct index_state *istate,
 	ident_to_git(dst->buf, dst->len, dst, ca.ident);
 }
 
-static int convert_to_working_tree_internal(const struct index_state *istate,
+static int convert_to_working_tree_internal(const struct conv_attrs *ca,
 					    const char *path, const char *src,
 					    size_t len, struct strbuf *dst,
 					    int normalizing,
@@ -1458,11 +1458,8 @@ static int convert_to_working_tree_internal(const struct index_state *istate,
 					    struct delayed_checkout *dco)
 {
 	int ret = 0, ret_filter = 0;
-	struct conv_attrs ca;
-
-	convert_attrs(istate, &ca, path);
 
-	ret |= ident_to_worktree(src, len, dst, ca.ident);
+	ret |= ident_to_worktree(src, len, dst, ca->ident);
 	if (ret) {
 		src = dst->buf;
 		len = dst->len;
@@ -1472,24 +1469,24 @@ static int convert_to_working_tree_internal(const struct index_state *istate,
 	 * is a smudge or process filter (even if the process filter doesn't
 	 * support smudge).  The filters might expect CRLFs.
 	 */
-	if ((ca.drv && (ca.drv->smudge || ca.drv->process)) || !normalizing) {
-		ret |= crlf_to_worktree(src, len, dst, ca.crlf_action);
+	if ((ca->drv && (ca->drv->smudge || ca->drv->process)) || !normalizing) {
+		ret |= crlf_to_worktree(src, len, dst, ca->crlf_action);
 		if (ret) {
 			src = dst->buf;
 			len = dst->len;
 		}
 	}
 
-	ret |= encode_to_worktree(path, src, len, dst, ca.working_tree_encoding);
+	ret |= encode_to_worktree(path, src, len, dst, ca->working_tree_encoding);
 	if (ret) {
 		src = dst->buf;
 		len = dst->len;
 	}
 
 	ret_filter = apply_filter(
-		path, src, len, -1, dst, ca.drv, CAP_SMUDGE, meta, dco);
-	if (!ret_filter && ca.drv && ca.drv->required)
-		die(_("%s: smudge filter %s failed"), path, ca.drv->name);
+		path, src, len, -1, dst, ca->drv, CAP_SMUDGE, meta, dco);
+	if (!ret_filter && ca->drv && ca->drv->required)
+		die(_("%s: smudge filter %s failed"), path, ca->drv->name);
 
 	return ret | ret_filter;
 }
@@ -1500,7 +1497,9 @@ int async_convert_to_working_tree(const struct index_state *istate,
 				  const struct checkout_metadata *meta,
 				  void *dco)
 {
-	return convert_to_working_tree_internal(istate, path, src, len, dst, 0, meta, dco);
+	struct conv_attrs ca;
+	convert_attrs(istate, &ca, path);
+	return convert_to_working_tree_internal(&ca, path, src, len, dst, 0, meta, dco);
 }
 
 int convert_to_working_tree(const struct index_state *istate,
@@ -1508,13 +1507,36 @@ int convert_to_working_tree(const struct index_state *istate,
 			    size_t len, struct strbuf *dst,
 			    const struct checkout_metadata *meta)
 {
-	return convert_to_working_tree_internal(istate, path, src, len, dst, 0, meta, NULL);
+	struct conv_attrs ca;
+	convert_attrs(istate, &ca, path);
+	return convert_to_working_tree_internal(&ca, path, src, len, dst, 0, meta, NULL);
+}
+
+int async_convert_to_working_tree_ca(const struct conv_attrs *ca,
+				     const char *path, const char *src,
+				     size_t len, struct strbuf *dst,
+				     const struct checkout_metadata *meta,
+				     void *dco)
+{
+	return convert_to_working_tree_internal(ca, path, src, len, dst, 0, meta, dco);
+}
+
+int convert_to_working_tree_ca(const struct conv_attrs *ca,
+			       const char *path, const char *src,
+			       size_t len, struct strbuf *dst,
+			       const struct checkout_metadata *meta)
+{
+	return convert_to_working_tree_internal(ca, path, src, len, dst, 0, meta, NULL);
 }
 
 int renormalize_buffer(const struct index_state *istate, const char *path,
 		       const char *src, size_t len, struct strbuf *dst)
 {
-	int ret = convert_to_working_tree_internal(istate, path, src, len, dst, 1, NULL, NULL);
+	struct conv_attrs ca;
+	int ret;
+
+	convert_attrs(istate, &ca, path);
+	ret = convert_to_working_tree_internal(&ca, path, src, len, dst, 1, NULL, NULL);
 	if (ret) {
 		src = dst->buf;
 		len = dst->len;
diff --git a/convert.h b/convert.h
index aeb4a1be9a..46d537d1ae 100644
--- a/convert.h
+++ b/convert.h
@@ -100,11 +100,20 @@ int convert_to_working_tree(const struct index_state *istate,
 			    const char *path, const char *src,
 			    size_t len, struct strbuf *dst,
 			    const struct checkout_metadata *meta);
+int convert_to_working_tree_ca(const struct conv_attrs *ca,
+			       const char *path, const char *src,
+			       size_t len, struct strbuf *dst,
+			       const struct checkout_metadata *meta);
 int async_convert_to_working_tree(const struct index_state *istate,
 				  const char *path, const char *src,
 				  size_t len, struct strbuf *dst,
 				  const struct checkout_metadata *meta,
 				  void *dco);
+int async_convert_to_working_tree_ca(const struct conv_attrs *ca,
+				     const char *path, const char *src,
+				     size_t len, struct strbuf *dst,
+				     const struct checkout_metadata *meta,
+				     void *dco);
 int async_query_available_blobs(const char *cmd,
 				struct string_list *available_paths);
 int renormalize_buffer(const struct index_state *istate,
-- 
2.27.0



* [RFC PATCH 03/21] convert: add get_stream_filter_ca() variant
  2020-08-10 21:33 [RFC PATCH 00/21] [RFC] Parallel checkout Matheus Tavares
  2020-08-10 21:33 ` [RFC PATCH 01/21] convert: make convert_attrs() and convert structs public Matheus Tavares
  2020-08-10 21:33 ` [RFC PATCH 02/21] convert: add [async_]convert_to_working_tree_ca() variants Matheus Tavares
@ 2020-08-10 21:33 ` Matheus Tavares
  2020-08-10 21:33 ` [RFC PATCH 04/21] convert: add conv_attrs classification Matheus Tavares
                   ` (20 subsequent siblings)
  23 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-08-10 21:33 UTC (permalink / raw)
  To: git
  Cc: stolee, jeffhost, Torsten Bögershausen,
	Nguyễn Thái Ngọc Duy, Johannes Schindelin,
	Jakub Narębski, Lars Schneider, Junio C Hamano,
	brian m. carlson

From: Jeff Hostetler <jeffhost@microsoft.com>

As with the previous patch, we will also need to call
get_stream_filter() with a precomputed `struct conv_attrs` when we add
support for parallel checkout workers. So add the _ca() variant, which
takes the conversion attributes struct as a parameter.

[matheus.bernardino: move header comment to ca() variant and reword msg]
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 convert.c | 28 +++++++++++++++++-----------
 convert.h |  2 ++
 2 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/convert.c b/convert.c
index 757dc2585c..8e995b39c3 100644
--- a/convert.c
+++ b/convert.c
@@ -1963,34 +1963,31 @@ static struct stream_filter *ident_filter(const struct object_id *oid)
 }
 
 /*
- * Return an appropriately constructed filter for the path, or NULL if
+ * Return an appropriately constructed filter for the given ca, or NULL if
  * the contents cannot be filtered without reading the whole thing
  * in-core.
  *
  * Note that you would be crazy to set CRLF, smudge/clean or ident to a
  * large binary blob you would want us not to slurp into the memory!
  */
-struct stream_filter *get_stream_filter(const struct index_state *istate,
-					const char *path,
-					const struct object_id *oid)
+struct stream_filter *get_stream_filter_ca(const struct conv_attrs *ca,
+					   const struct object_id *oid)
 {
-	struct conv_attrs ca;
 	struct stream_filter *filter = NULL;
 
-	convert_attrs(istate, &ca, path);
-	if (ca.drv && (ca.drv->process || ca.drv->smudge || ca.drv->clean))
+	if (ca->drv && (ca->drv->process || ca->drv->smudge || ca->drv->clean))
 		return NULL;
 
-	if (ca.working_tree_encoding)
+	if (ca->working_tree_encoding)
 		return NULL;
 
-	if (ca.crlf_action == CRLF_AUTO || ca.crlf_action == CRLF_AUTO_CRLF)
+	if (ca->crlf_action == CRLF_AUTO || ca->crlf_action == CRLF_AUTO_CRLF)
 		return NULL;
 
-	if (ca.ident)
+	if (ca->ident)
 		filter = ident_filter(oid);
 
-	if (output_eol(ca.crlf_action) == EOL_CRLF)
+	if (output_eol(ca->crlf_action) == EOL_CRLF)
 		filter = cascade_filter(filter, lf_to_crlf_filter());
 	else
 		filter = cascade_filter(filter, &null_filter_singleton);
@@ -1998,6 +1995,15 @@ struct stream_filter *get_stream_filter(const struct index_state *istate,
 	return filter;
 }
 
+struct stream_filter *get_stream_filter(const struct index_state *istate,
+					const char *path,
+					const struct object_id *oid)
+{
+	struct conv_attrs ca;
+	convert_attrs(istate, &ca, path);
+	return get_stream_filter_ca(&ca, oid);
+}
+
 void free_stream_filter(struct stream_filter *filter)
 {
 	filter->vtbl->free(filter);
diff --git a/convert.h b/convert.h
index 46d537d1ae..262c1a1d46 100644
--- a/convert.h
+++ b/convert.h
@@ -169,6 +169,8 @@ struct stream_filter; /* opaque */
 struct stream_filter *get_stream_filter(const struct index_state *istate,
 					const char *path,
 					const struct object_id *);
+struct stream_filter *get_stream_filter_ca(const struct conv_attrs *ca,
+					   const struct object_id *oid);
 void free_stream_filter(struct stream_filter *);
 int is_null_stream_filter(struct stream_filter *);
 
-- 
2.27.0



* [RFC PATCH 04/21] convert: add conv_attrs classification
  2020-08-10 21:33 [RFC PATCH 00/21] [RFC] Parallel checkout Matheus Tavares
                   ` (2 preceding siblings ...)
  2020-08-10 21:33 ` [RFC PATCH 03/21] convert: add get_stream_filter_ca() variant Matheus Tavares
@ 2020-08-10 21:33 ` Matheus Tavares
  2020-08-10 21:33 ` [RFC PATCH 05/21] entry: extract a header file for entry.c functions Matheus Tavares
                   ` (19 subsequent siblings)
  23 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-08-10 21:33 UTC (permalink / raw)
  To: git
  Cc: stolee, jeffhost, Lars Schneider, Torsten Bögershausen,
	brian m. carlson, Nguyễn Thái Ngọc Duy,
	Junio C Hamano

From: Jeff Hostetler <jeffhost@microsoft.com>

Create `enum conv_attrs_classification` to express the different ways
that attributes are handled for a blob during checkout.

This will be used in a later commit when deciding whether to add a file
to the parallel or delayed queue during checkout. For now, we can also
use it in get_stream_filter_ca() to simplify the function (as the
classifying logic is the same).

[matheus.bernardino: use classification in get_stream_filter_ca()]
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 convert.c | 26 +++++++++++++++++++-------
 convert.h | 33 +++++++++++++++++++++++++++++++++
 2 files changed, 52 insertions(+), 7 deletions(-)

diff --git a/convert.c b/convert.c
index 8e995b39c3..c037bb99eb 100644
--- a/convert.c
+++ b/convert.c
@@ -1975,13 +1975,7 @@ struct stream_filter *get_stream_filter_ca(const struct conv_attrs *ca,
 {
 	struct stream_filter *filter = NULL;
 
-	if (ca->drv && (ca->drv->process || ca->drv->smudge || ca->drv->clean))
-		return NULL;
-
-	if (ca->working_tree_encoding)
-		return NULL;
-
-	if (ca->crlf_action == CRLF_AUTO || ca->crlf_action == CRLF_AUTO_CRLF)
+	if (classify_conv_attrs(ca) != CA_CLASS_STREAMABLE)
 		return NULL;
 
 	if (ca->ident)
@@ -2037,3 +2031,21 @@ void clone_checkout_metadata(struct checkout_metadata *dst,
 	if (blob)
 		oidcpy(&dst->blob, blob);
 }
+
+enum conv_attrs_classification classify_conv_attrs(const struct conv_attrs *ca)
+{
+	if (ca->drv) {
+		if (ca->drv->process)
+			return CA_CLASS_INCORE_PROCESS;
+		if (ca->drv->smudge || ca->drv->clean)
+			return CA_CLASS_INCORE_FILTER;
+	}
+
+	if (ca->working_tree_encoding)
+		return CA_CLASS_INCORE;
+
+	if (ca->crlf_action == CRLF_AUTO || ca->crlf_action == CRLF_AUTO_CRLF)
+		return CA_CLASS_INCORE;
+
+	return CA_CLASS_STREAMABLE;
+}
diff --git a/convert.h b/convert.h
index 262c1a1d46..523ba9b140 100644
--- a/convert.h
+++ b/convert.h
@@ -190,4 +190,37 @@ int stream_filter(struct stream_filter *,
 		  const char *input, size_t *isize_p,
 		  char *output, size_t *osize_p);
 
+enum conv_attrs_classification {
+	/*
+	 * The blob must be loaded into a buffer before it can be
+	 * smudged. All smudging is done in-proc.
+	 */
+	CA_CLASS_INCORE,
+
+	/*
+	 * The blob must be loaded into a buffer, but uses a
+	 * single-file driver filter, such as rot13.
+	 */
+	CA_CLASS_INCORE_FILTER,
+
+	/*
+	 * The blob must be loaded into a buffer, but uses a
+	 * long-running driver process, such as LFS. This might or
+	 * might not use delayed operations. (The important thing is
+	 * that there is a single subordinate long-running process
+	 * handling all associated blobs and in case of delayed
+	 * operations, may hold per-blob state.)
+	 */
+	CA_CLASS_INCORE_PROCESS,
+
+	/*
+	 * The blob can be streamed and smudged without needing to
+	 * completely read it into a buffer.
+	 */
+	CA_CLASS_STREAMABLE,
+};
+
+enum conv_attrs_classification classify_conv_attrs(
+	const struct conv_attrs *ca);
+
 #endif /* CONVERT_H */
-- 
2.27.0



* [RFC PATCH 05/21] entry: extract a header file for entry.c functions
  2020-08-10 21:33 [RFC PATCH 00/21] [RFC] Parallel checkout Matheus Tavares
                   ` (3 preceding siblings ...)
  2020-08-10 21:33 ` [RFC PATCH 04/21] convert: add conv_attrs classification Matheus Tavares
@ 2020-08-10 21:33 ` Matheus Tavares
  2020-08-10 21:33 ` [RFC PATCH 06/21] entry: make fstat_output() and read_blob_entry() public Matheus Tavares
                   ` (18 subsequent siblings)
  23 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-08-10 21:33 UTC (permalink / raw)
  To: git
  Cc: stolee, jeffhost, Nguyễn Thái Ngọc Duy,
	Ben Peart, Christian Couder, Lars Schneider, Stefan Beller,
	René Scharfe, Junio C Hamano

The declarations of entry.c's public functions and structures currently
reside in cache.h. Although not many, they contribute to the size of
cache.h and, when changed, cause unnecessary recompilation of modules
that don't really use these functions. So let's move them to a new
entry.h header.

Original-patch-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 apply.c                  |  1 +
 builtin/checkout-index.c |  1 +
 builtin/checkout.c       |  1 +
 builtin/difftool.c       |  1 +
 cache.h                  | 24 -----------------------
 entry.c                  |  9 +--------
 entry.h                  | 41 ++++++++++++++++++++++++++++++++++++++++
 unpack-trees.c           |  1 +
 8 files changed, 47 insertions(+), 32 deletions(-)
 create mode 100644 entry.h

diff --git a/apply.c b/apply.c
index 8bff604dbe..1443c307a4 100644
--- a/apply.c
+++ b/apply.c
@@ -21,6 +21,7 @@
 #include "quote.h"
 #include "rerere.h"
 #include "apply.h"
+#include "entry.h"
 
 struct gitdiff_data {
 	struct strbuf *root;
diff --git a/builtin/checkout-index.c b/builtin/checkout-index.c
index a854fd16e7..0f1ff73129 100644
--- a/builtin/checkout-index.c
+++ b/builtin/checkout-index.c
@@ -11,6 +11,7 @@
 #include "quote.h"
 #include "cache-tree.h"
 #include "parse-options.h"
+#include "entry.h"
 
 #define CHECKOUT_ALL 4
 static int nul_term_line;
diff --git a/builtin/checkout.c b/builtin/checkout.c
index 2837195491..3e09b29cfe 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -26,6 +26,7 @@
 #include "unpack-trees.h"
 #include "wt-status.h"
 #include "xdiff-interface.h"
+#include "entry.h"
 
 static const char * const checkout_usage[] = {
 	N_("git checkout [<options>] <branch>"),
diff --git a/builtin/difftool.c b/builtin/difftool.c
index 7ac432b881..dfa22b67eb 100644
--- a/builtin/difftool.c
+++ b/builtin/difftool.c
@@ -23,6 +23,7 @@
 #include "lockfile.h"
 #include "object-store.h"
 #include "dir.h"
+#include "entry.h"
 
 static int trust_exit_code;
 
diff --git a/cache.h b/cache.h
index 0290849c19..e6963cf8fe 100644
--- a/cache.h
+++ b/cache.h
@@ -1695,30 +1695,6 @@ const char *show_ident_date(const struct ident_split *id,
  */
 int ident_cmp(const struct ident_split *, const struct ident_split *);
 
-struct checkout {
-	struct index_state *istate;
-	const char *base_dir;
-	int base_dir_len;
-	struct delayed_checkout *delayed_checkout;
-	struct checkout_metadata meta;
-	unsigned force:1,
-		 quiet:1,
-		 not_new:1,
-		 clone:1,
-		 refresh_cache:1;
-};
-#define CHECKOUT_INIT { NULL, "" }
-
-#define TEMPORARY_FILENAME_LENGTH 25
-int checkout_entry(struct cache_entry *ce, const struct checkout *state, char *topath, int *nr_checkouts);
-void enable_delayed_checkout(struct checkout *state);
-int finish_delayed_checkout(struct checkout *state, int *nr_checkouts);
-/*
- * Unlink the last component and schedule the leading directories for
- * removal, such that empty directories get removed.
- */
-void unlink_entry(const struct cache_entry *ce);
-
 struct cache_def {
 	struct strbuf path;
 	int flags;
diff --git a/entry.c b/entry.c
index 449bd32dee..f46c06e831 100644
--- a/entry.c
+++ b/entry.c
@@ -6,6 +6,7 @@
 #include "submodule.h"
 #include "progress.h"
 #include "fsmonitor.h"
+#include "entry.h"
 
 static void create_directories(const char *path, int path_len,
 			       const struct checkout *state)
@@ -429,14 +430,6 @@ static void mark_colliding_entries(const struct checkout *state,
 	}
 }
 
-/*
- * Write the contents from ce out to the working tree.
- *
- * When topath[] is not NULL, instead of writing to the working tree
- * file named by ce, a temporary file is created by this function and
- * its name is returned in topath[], which must be able to hold at
- * least TEMPORARY_FILENAME_LENGTH bytes long.
- */
 int checkout_entry(struct cache_entry *ce, const struct checkout *state,
 		   char *topath, int *nr_checkouts)
 {
diff --git a/entry.h b/entry.h
new file mode 100644
index 0000000000..2d69185448
--- /dev/null
+++ b/entry.h
@@ -0,0 +1,41 @@
+#ifndef ENTRY_H
+#define ENTRY_H
+
+#include "cache.h"
+#include "convert.h"
+
+struct checkout {
+	struct index_state *istate;
+	const char *base_dir;
+	int base_dir_len;
+	struct delayed_checkout *delayed_checkout;
+	struct checkout_metadata meta;
+	unsigned force:1,
+		 quiet:1,
+		 not_new:1,
+		 clone:1,
+		 refresh_cache:1;
+};
+#define CHECKOUT_INIT { NULL, "" }
+
+#define TEMPORARY_FILENAME_LENGTH 25
+
+/*
+ * Write the contents from ce out to the working tree.
+ *
+ * When topath[] is not NULL, instead of writing to the working tree
+ * file named by ce, a temporary file is created by this function and
+ * its name is returned in topath[], which must be able to hold at
+ * least TEMPORARY_FILENAME_LENGTH bytes long.
+ */
+int checkout_entry(struct cache_entry *ce, const struct checkout *state,
+		   char *topath, int *nr_checkouts);
+void enable_delayed_checkout(struct checkout *state);
+int finish_delayed_checkout(struct checkout *state, int *nr_checkouts);
+/*
+ * Unlink the last component and schedule the leading directories for
+ * removal, such that empty directories get removed.
+ */
+void unlink_entry(const struct cache_entry *ce);
+
+#endif /* ENTRY_H */
diff --git a/unpack-trees.c b/unpack-trees.c
index 323280dd48..a511fadd89 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -16,6 +16,7 @@
 #include "fsmonitor.h"
 #include "object-store.h"
 #include "promisor-remote.h"
+#include "entry.h"
 
 /*
  * Error messages expected by scripts out of plumbing commands such as
-- 
2.27.0


^ permalink raw reply	[flat|nested] 154+ messages in thread

* [RFC PATCH 06/21] entry: make fstat_output() and read_blob_entry() public
  2020-08-10 21:33 [RFC PATCH 00/21] [RFC] Parallel checkout Matheus Tavares
                   ` (4 preceding siblings ...)
  2020-08-10 21:33 ` [RFC PATCH 05/21] entry: extract a header file for entry.c functions Matheus Tavares
@ 2020-08-10 21:33 ` Matheus Tavares
  2020-08-10 21:33 ` [RFC PATCH 07/21] entry: extract cache_entry update from write_entry() Matheus Tavares
                   ` (17 subsequent siblings)
  23 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-08-10 21:33 UTC (permalink / raw)
  To: git
  Cc: stolee, jeffhost, brian m. carlson, Junio C Hamano,
	Brandon Williams, Jeff King, Denton Liu, Jonathan Nieder,
	Nguyễn Thái Ngọc Duy

These two functions will be used by the parallel checkout code, so let's
make them public. Note: since fstat_output() is now becoming public, it
is renamed to fstat_checkout_output() to avoid future name collisions.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 entry.c | 8 ++++----
 entry.h | 2 ++
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/entry.c b/entry.c
index f46c06e831..cc27564473 100644
--- a/entry.c
+++ b/entry.c
@@ -84,7 +84,7 @@ static int create_file(const char *path, unsigned int mode)
 	return open(path, O_WRONLY | O_CREAT | O_EXCL, mode);
 }
 
-static void *read_blob_entry(const struct cache_entry *ce, unsigned long *size)
+void *read_blob_entry(const struct cache_entry *ce, unsigned long *size)
 {
 	enum object_type type;
 	void *blob_data = read_object_file(&ce->oid, &type, size);
@@ -109,7 +109,7 @@ static int open_output_fd(char *path, const struct cache_entry *ce, int to_tempf
 	}
 }
 
-static int fstat_output(int fd, const struct checkout *state, struct stat *st)
+int fstat_checkout_output(int fd, const struct checkout *state, struct stat *st)
 {
 	/* use fstat() only when path == ce->name */
 	if (fstat_is_reliable() &&
@@ -132,7 +132,7 @@ static int streaming_write_entry(const struct cache_entry *ce, char *path,
 		return -1;
 
 	result |= stream_blob_to_fd(fd, &ce->oid, filter, 1);
-	*fstat_done = fstat_output(fd, state, statbuf);
+	*fstat_done = fstat_checkout_output(fd, state, statbuf);
 	result |= close(fd);
 
 	if (result)
@@ -346,7 +346,7 @@ static int write_entry(struct cache_entry *ce,
 
 		wrote = write_in_full(fd, new_blob, size);
 		if (!to_tempfile)
-			fstat_done = fstat_output(fd, state, &st);
+			fstat_done = fstat_checkout_output(fd, state, &st);
 		close(fd);
 		free(new_blob);
 		if (wrote < 0)
diff --git a/entry.h b/entry.h
index 2d69185448..f860e60846 100644
--- a/entry.h
+++ b/entry.h
@@ -37,5 +37,7 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts);
  * removal, such that empty directories get removed.
  */
 void unlink_entry(const struct cache_entry *ce);
+void *read_blob_entry(const struct cache_entry *ce, unsigned long *size);
+int fstat_checkout_output(int fd, const struct checkout *state, struct stat *st);
 
 #endif /* ENTRY_H */
-- 
2.27.0



* [RFC PATCH 07/21] entry: extract cache_entry update from write_entry()
  2020-08-10 21:33 [RFC PATCH 00/21] [RFC] Parallel checkout Matheus Tavares
                   ` (5 preceding siblings ...)
  2020-08-10 21:33 ` [RFC PATCH 06/21] entry: make fstat_output() and read_blob_entry() public Matheus Tavares
@ 2020-08-10 21:33 ` Matheus Tavares
  2020-08-10 21:33 ` [RFC PATCH 08/21] entry: move conv_attrs lookup up to checkout_entry() Matheus Tavares
                   ` (16 subsequent siblings)
  23 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-08-10 21:33 UTC (permalink / raw)
  To: git
  Cc: stolee, jeffhost, Lars Schneider, Jeff King, Junio C Hamano,
	Nguyễn Thái Ngọc Duy, Johannes Schindelin,
	Ben Peart

This code will be used by the parallel checkout functions, outside
entry.c, so let's extract it into a public function.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 entry.c | 26 +++++++++++++++++---------
 entry.h |  2 ++
 2 files changed, 19 insertions(+), 9 deletions(-)

diff --git a/entry.c b/entry.c
index cc27564473..837629a804 100644
--- a/entry.c
+++ b/entry.c
@@ -251,6 +251,19 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
 	return errs;
 }
 
+void update_ce_after_write(const struct checkout *state, struct cache_entry *ce,
+			   struct stat *st)
+{
+	if (state->refresh_cache) {
+		assert(state->istate);
+		fill_stat_cache_info(state->istate, ce, st);
+		ce->ce_flags |= CE_UPDATE_IN_BASE;
+		mark_fsmonitor_invalid(state->istate, ce);
+		state->istate->cache_changed |= CE_ENTRY_CHANGED;
+	}
+}
+
+
 static int write_entry(struct cache_entry *ce,
 		       char *path, const struct checkout *state, int to_tempfile)
 {
@@ -371,15 +384,10 @@ static int write_entry(struct cache_entry *ce,
 
 finish:
 	if (state->refresh_cache) {
-		assert(state->istate);
-		if (!fstat_done)
-			if (lstat(ce->name, &st) < 0)
-				return error_errno("unable to stat just-written file %s",
-						   ce->name);
-		fill_stat_cache_info(state->istate, ce, &st);
-		ce->ce_flags |= CE_UPDATE_IN_BASE;
-		mark_fsmonitor_invalid(state->istate, ce);
-		state->istate->cache_changed |= CE_ENTRY_CHANGED;
+		if (!fstat_done && lstat(ce->name, &st) < 0)
+			return error_errno("unable to stat just-written file %s",
+					   ce->name);
+		update_ce_after_write(state, ce, &st);
 	}
 delayed:
 	return 0;
diff --git a/entry.h b/entry.h
index f860e60846..664aed1576 100644
--- a/entry.h
+++ b/entry.h
@@ -39,5 +39,7 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts);
 void unlink_entry(const struct cache_entry *ce);
 void *read_blob_entry(const struct cache_entry *ce, unsigned long *size);
 int fstat_checkout_output(int fd, const struct checkout *state, struct stat *st);
+void update_ce_after_write(const struct checkout *state, struct cache_entry *ce,
+			   struct stat *st);
 
 #endif /* ENTRY_H */
-- 
2.27.0



* [RFC PATCH 08/21] entry: move conv_attrs lookup up to checkout_entry()
  2020-08-10 21:33 [RFC PATCH 00/21] [RFC] Parallel checkout Matheus Tavares
                   ` (6 preceding siblings ...)
  2020-08-10 21:33 ` [RFC PATCH 07/21] entry: extract cache_entry update from write_entry() Matheus Tavares
@ 2020-08-10 21:33 ` Matheus Tavares
  2020-08-10 21:33 ` [RFC PATCH 09/21] entry: add checkout_entry_ca() which takes preloaded conv_attrs Matheus Tavares
                   ` (15 subsequent siblings)
  23 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-08-10 21:33 UTC (permalink / raw)
  To: git
  Cc: stolee, jeffhost, Thomas Gummerer, Junio C Hamano,
	Nguyễn Thái Ngọc Duy, brian m. carlson

In a following patch, checkout_entry() will use conv_attrs to decide
whether an entry should be enqueued for parallel checkout or not. But
the attributes lookup only happens lower in this call stack. To avoid
the unnecessary work of loading the attributes twice, let's move it up
to checkout_entry(), and pass the loaded struct down to write_entry().

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 entry.c | 39 +++++++++++++++++++++++++++------------
 1 file changed, 27 insertions(+), 12 deletions(-)

diff --git a/entry.c b/entry.c
index 837629a804..59d5335ff1 100644
--- a/entry.c
+++ b/entry.c
@@ -263,9 +263,9 @@ void update_ce_after_write(const struct checkout *state, struct cache_entry *ce,
 	}
 }
 
-
-static int write_entry(struct cache_entry *ce,
-		       char *path, const struct checkout *state, int to_tempfile)
+/* Note: ca is used (and required) iff the entry refers to a regular file. */
+static int write_entry(struct cache_entry *ce, char *path, struct conv_attrs *ca,
+		       const struct checkout *state, int to_tempfile)
 {
 	unsigned int ce_mode_s_ifmt = ce->ce_mode & S_IFMT;
 	struct delayed_checkout *dco = state->delayed_checkout;
@@ -282,8 +282,7 @@ static int write_entry(struct cache_entry *ce,
 	clone_checkout_metadata(&meta, &state->meta, &ce->oid);
 
 	if (ce_mode_s_ifmt == S_IFREG) {
-		struct stream_filter *filter = get_stream_filter(state->istate, ce->name,
-								 &ce->oid);
+		struct stream_filter *filter = get_stream_filter_ca(ca, &ce->oid);
 		if (filter &&
 		    !streaming_write_entry(ce, path, filter,
 					   state, to_tempfile,
@@ -330,14 +329,17 @@ static int write_entry(struct cache_entry *ce,
 		 * Convert from git internal format to working tree format
 		 */
 		if (dco && dco->state != CE_NO_DELAY) {
-			ret = async_convert_to_working_tree(state->istate, ce->name, new_blob,
-							    size, &buf, &meta, dco);
+			ret = async_convert_to_working_tree_ca(ca, ce->name,
+							       new_blob, size,
+							       &buf, &meta, dco);
 			if (ret && string_list_has_string(&dco->paths, ce->name)) {
 				free(new_blob);
 				goto delayed;
 			}
-		} else
-			ret = convert_to_working_tree(state->istate, ce->name, new_blob, size, &buf, &meta);
+		} else {
+			ret = convert_to_working_tree_ca(ca, ce->name, new_blob,
+							 size, &buf, &meta);
+		}
 
 		if (ret) {
 			free(new_blob);
@@ -443,6 +445,7 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state,
 {
 	static struct strbuf path = STRBUF_INIT;
 	struct stat st;
+	struct conv_attrs ca;
 
 	if (ce->ce_flags & CE_WT_REMOVE) {
 		if (topath)
@@ -455,8 +458,13 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state,
 		return 0;
 	}
 
-	if (topath)
-		return write_entry(ce, topath, state, 1);
+	if (topath) {
+		if (S_ISREG(ce->ce_mode)) {
+			convert_attrs(state->istate, &ca, ce->name);
+			return write_entry(ce, topath, &ca, state, 1);
+		}
+		return write_entry(ce, topath, NULL, state, 1);
+	}
 
 	strbuf_reset(&path);
 	strbuf_add(&path, state->base_dir, state->base_dir_len);
@@ -520,9 +528,16 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state,
 		return 0;
 
 	create_directories(path.buf, path.len, state);
+
 	if (nr_checkouts)
 		(*nr_checkouts)++;
-	return write_entry(ce, path.buf, state, 0);
+
+	if (S_ISREG(ce->ce_mode)) {
+		convert_attrs(state->istate, &ca, ce->name);
+		return write_entry(ce, path.buf, &ca, state, 0);
+	}
+
+	return write_entry(ce, path.buf, NULL, state, 0);
 }
 
 void unlink_entry(const struct cache_entry *ce)
-- 
2.27.0



* [RFC PATCH 09/21] entry: add checkout_entry_ca() which takes preloaded conv_attrs
  2020-08-10 21:33 [RFC PATCH 00/21] [RFC] Parallel checkout Matheus Tavares
                   ` (7 preceding siblings ...)
  2020-08-10 21:33 ` [RFC PATCH 08/21] entry: move conv_attrs lookup up to checkout_entry() Matheus Tavares
@ 2020-08-10 21:33 ` Matheus Tavares
  2020-08-10 21:33 ` [RFC PATCH 10/21] unpack-trees: add basic support for parallel checkout Matheus Tavares
                   ` (14 subsequent siblings)
  23 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-08-10 21:33 UTC (permalink / raw)
  To: git
  Cc: stolee, jeffhost, Thomas Gummerer, Junio C Hamano, Denton Liu,
	Nguyễn Thái Ngọc Duy

The parallel checkout machinery will call checkout_entry() for entries
that could not be written in parallel due to path collisions. At this
point, we will already be holding the conversion attributes for each
entry, and it would be wasteful to let checkout_entry() load these
again. Instead, let's add the checkout_entry_ca() variant, which
optionally takes a preloaded conv_attrs struct.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 entry.c | 23 ++++++++++++-----------
 entry.h | 13 +++++++++++--
 2 files changed, 23 insertions(+), 13 deletions(-)

diff --git a/entry.c b/entry.c
index 59d5335ff1..f9835afba3 100644
--- a/entry.c
+++ b/entry.c
@@ -440,12 +440,13 @@ static void mark_colliding_entries(const struct checkout *state,
 	}
 }
 
-int checkout_entry(struct cache_entry *ce, const struct checkout *state,
-		   char *topath, int *nr_checkouts)
+int checkout_entry_ca(struct cache_entry *ce, struct conv_attrs *ca,
+		      const struct checkout *state, char *topath,
+		      int *nr_checkouts)
 {
 	static struct strbuf path = STRBUF_INIT;
 	struct stat st;
-	struct conv_attrs ca;
+	struct conv_attrs ca_buf;
 
 	if (ce->ce_flags & CE_WT_REMOVE) {
 		if (topath)
@@ -459,11 +460,11 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state,
 	}
 
 	if (topath) {
-		if (S_ISREG(ce->ce_mode)) {
-			convert_attrs(state->istate, &ca, ce->name);
-			return write_entry(ce, topath, &ca, state, 1);
+		if (S_ISREG(ce->ce_mode) && !ca) {
+			convert_attrs(state->istate, &ca_buf, ce->name);
+			ca = &ca_buf;
 		}
-		return write_entry(ce, topath, NULL, state, 1);
+		return write_entry(ce, topath, ca, state, 1);
 	}
 
 	strbuf_reset(&path);
@@ -532,12 +533,12 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state,
 	if (nr_checkouts)
 		(*nr_checkouts)++;
 
-	if (S_ISREG(ce->ce_mode)) {
-		convert_attrs(state->istate, &ca, ce->name);
-		return write_entry(ce, path.buf, &ca, state, 0);
+	if (S_ISREG(ce->ce_mode) && !ca) {
+		convert_attrs(state->istate, &ca_buf, ce->name);
+		ca = &ca_buf;
 	}
 
-	return write_entry(ce, path.buf, NULL, state, 0);
+	return write_entry(ce, path.buf, ca, state, 0);
 }
 
 void unlink_entry(const struct cache_entry *ce)
diff --git a/entry.h b/entry.h
index 664aed1576..2081fbbbab 100644
--- a/entry.h
+++ b/entry.h
@@ -27,9 +27,18 @@ struct checkout {
  * file named by ce, a temporary file is created by this function and
  * its name is returned in topath[], which must be able to hold at
  * least TEMPORARY_FILENAME_LENGTH bytes long.
+ *
+ * With checkout_entry_ca(), callers can optionally pass a preloaded
+ * conv_attrs struct (to avoid reloading it), when ce refers to a
+ * regular file. If ca is NULL, the attributes will be loaded
+ * internally when (and if) needed.
  */
-int checkout_entry(struct cache_entry *ce, const struct checkout *state,
-		   char *topath, int *nr_checkouts);
+#define checkout_entry(ce, state, topath, nr_checkouts) \
+		checkout_entry_ca(ce, NULL, state, topath, nr_checkouts)
+int checkout_entry_ca(struct cache_entry *ce, struct conv_attrs *ca,
+		      const struct checkout *state, char *topath,
+		      int *nr_checkouts);
+
 void enable_delayed_checkout(struct checkout *state);
 int finish_delayed_checkout(struct checkout *state, int *nr_checkouts);
 /*
-- 
2.27.0



* [RFC PATCH 10/21] unpack-trees: add basic support for parallel checkout
  2020-08-10 21:33 [RFC PATCH 00/21] [RFC] Parallel checkout Matheus Tavares
                   ` (8 preceding siblings ...)
  2020-08-10 21:33 ` [RFC PATCH 09/21] entry: add checkout_entry_ca() which takes preloaded conv_attrs Matheus Tavares
@ 2020-08-10 21:33 ` Matheus Tavares
  2020-08-10 21:33 ` [RFC PATCH 11/21] parallel-checkout: make it truly parallel Matheus Tavares
                   ` (13 subsequent siblings)
  23 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-08-10 21:33 UTC (permalink / raw)
  To: git
  Cc: stolee, jeffhost, Nguyễn Thái Ngọc Duy,
	Jonathan Tan, René Scharfe, Christian Couder, Stefan Beller,
	Junio C Hamano, Lars Schneider

This new interface allows us to enqueue some of the entries being
checked out to later call write_entry() for them in parallel. For now,
the parallel checkout machinery is enabled by default and there is no
user configuration, but run_parallel_checkout() just writes the queued
entries in sequence (without spawning additional workers). The next
patch will actually implement the parallelism and, later, we will make
it configurable.

When there are path collisions among the entries being written (which
can happen e.g. with case-sensitive files in case-insensitive file
systems), the parallel checkout code detects the problem and marks the
checkout_item with CI_RETRY. Later, these items are sequentially fed to
checkout_entry() again. This is similar to the way the sequential code
deals with collisions, overwriting the previously checked out entries
with the subsequent ones. The only difference is that, once we start
writing the entries in parallel, we will no longer be able to determine
which of the colliding entries will survive on disk (for the sequential
algorithm, it is always the last one). Finally, just like the sequential
code, there is no additional overhead when there are no collisions.

Note: we continue the loop of write_checkout_item() even if the previous
call returned an error. This is how checkout_entry() is called in
builtin/checkout.c:checkout_paths() and unpack-trees.c:check_updates().
In the case of fatal errors, die() aborts the loop.

Co-authored-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Co-authored-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---

For consistency, the parallel code replicates the sequential behavior of
overwriting colliding entries. However, during parallel checkout it's
possible to distinguish a path collision from the case where a path was
already present in the working tree before checkout. So, in the event of
a collision, we could choose to write a single entry and skip overwriting
it with the next ones. Does that sound reasonable, or are there other
problems with not writing the extra colliding entries?

 Makefile            |   1 +
 entry.c             |   4 +
 parallel-checkout.c | 340 ++++++++++++++++++++++++++++++++++++++++++++
 parallel-checkout.h |  20 +++
 unpack-trees.c      |   6 +-
 5 files changed, 370 insertions(+), 1 deletion(-)
 create mode 100644 parallel-checkout.c
 create mode 100644 parallel-checkout.h

diff --git a/Makefile b/Makefile
index 65f8cfb236..caab8e6401 100644
--- a/Makefile
+++ b/Makefile
@@ -933,6 +933,7 @@ LIB_OBJS += pack-revindex.o
 LIB_OBJS += pack-write.o
 LIB_OBJS += packfile.o
 LIB_OBJS += pager.o
+LIB_OBJS += parallel-checkout.o
 LIB_OBJS += parse-options-cb.o
 LIB_OBJS += parse-options.o
 LIB_OBJS += patch-delta.o
diff --git a/entry.c b/entry.c
index f9835afba3..47c2c20d5a 100644
--- a/entry.c
+++ b/entry.c
@@ -7,6 +7,7 @@
 #include "progress.h"
 #include "fsmonitor.h"
 #include "entry.h"
+#include "parallel-checkout.h"
 
 static void create_directories(const char *path, int path_len,
 			       const struct checkout *state)
@@ -538,6 +539,9 @@ int checkout_entry_ca(struct cache_entry *ce, struct conv_attrs *ca,
 		ca = &ca_buf;
 	}
 
+	if (!enqueue_checkout(ce, ca))
+		return 0;
+
 	return write_entry(ce, path.buf, ca, state, 0);
 }
 
diff --git a/parallel-checkout.c b/parallel-checkout.c
new file mode 100644
index 0000000000..e3b44eeb34
--- /dev/null
+++ b/parallel-checkout.c
@@ -0,0 +1,340 @@
+#include "cache.h"
+#include "entry.h"
+#include "parallel-checkout.h"
+#include "streaming.h"
+
+enum ci_status {
+	CI_PENDING = 0,
+	CI_SUCCESS,
+	CI_RETRY,
+	CI_FAILED,
+};
+
+struct checkout_item {
+	/* pointer to a istate->cache[] entry. Not owned by us. */
+	struct cache_entry *ce;
+	struct conv_attrs ca;
+	struct stat st;
+	enum ci_status status;
+};
+
+struct parallel_checkout {
+	struct checkout_item *items;
+	size_t nr, alloc;
+};
+
+static struct parallel_checkout *parallel_checkout = NULL;
+
+enum pc_status {
+	PC_UNINITIALIZED = 0,
+	PC_ACCEPTING_ENTRIES,
+	PC_RUNNING,
+	PC_HANDLING_RESULTS,
+};
+
+static enum pc_status pc_status = PC_UNINITIALIZED;
+
+void init_parallel_checkout(void)
+{
+	if (parallel_checkout)
+		BUG("parallel checkout already initialized");
+
+	parallel_checkout = xcalloc(1, sizeof(*parallel_checkout));
+	pc_status = PC_ACCEPTING_ENTRIES;
+}
+
+static void finish_parallel_checkout(void)
+{
+	if (!parallel_checkout)
+		BUG("cannot finish parallel checkout: not initialized yet");
+
+	free(parallel_checkout->items);
+	FREE_AND_NULL(parallel_checkout);
+	pc_status = PC_UNINITIALIZED;
+}
+
+static int is_eligible_for_parallel_checkout(const struct cache_entry *ce,
+					     const struct conv_attrs *ca)
+{
+	enum conv_attrs_classification c;
+
+	if (!S_ISREG(ce->ce_mode))
+		return 0;
+
+	c = classify_conv_attrs(ca);
+	switch (c) {
+	case CA_CLASS_INCORE:
+		return 1;
+
+	case CA_CLASS_INCORE_FILTER:
+		/*
+		 * It would be safe to allow concurrent instances of
+		 * single-file smudge filters, like rot13, but we should not
+		 * assume that all filters are parallel-process safe. So we
+		 * don't allow this.
+		 */
+		return 0;
+
+	case CA_CLASS_INCORE_PROCESS:
+		/*
+		 * The parallel queue and the delayed queue are not compatible,
+		 * so they must be kept completely separated. And we can't tell
+		 * if a long-running process will delay its response without
+		 * actually asking it to perform the filtering. Therefore, this
+		 * type of filter is not allowed in parallel checkout.
+		 *
+		 * Furthermore, there should only be one instance of the
+		 * long-running process filter as we don't know how it is
+		 * managing its own concurrency. So, spreading the entries that
+		 * require such a filter among the parallel workers would
+		 * require a lot more inter-process communication. We would
+		 * probably have to designate a single process to interact with
+		 * the filter and send all the necessary data to it, for each
+		 * entry.
+		 */
+		return 0;
+
+	case CA_CLASS_STREAMABLE:
+		return 1;
+
+	default:
+		BUG("unsupported conv_attrs classification '%d'", c);
+	}
+}
+
+int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca)
+{
+	struct checkout_item *ci;
+
+	if (!parallel_checkout || pc_status != PC_ACCEPTING_ENTRIES ||
+	    !is_eligible_for_parallel_checkout(ce, ca))
+		return -1;
+
+	ALLOC_GROW(parallel_checkout->items, parallel_checkout->nr + 1,
+		   parallel_checkout->alloc);
+
+	ci = &parallel_checkout->items[parallel_checkout->nr++];
+	ci->ce = ce;
+	memcpy(&ci->ca, ca, sizeof(ci->ca));
+
+	return 0;
+}
+
+static int handle_results(struct checkout *state)
+{
+	int ret = 0;
+	size_t i;
+
+	pc_status = PC_HANDLING_RESULTS;
+
+	for (i = 0; i < parallel_checkout->nr; ++i) {
+		struct checkout_item *ci = &parallel_checkout->items[i];
+		struct stat *st = &ci->st;
+
+		switch(ci->status) {
+		case CI_SUCCESS:
+			update_ce_after_write(state, ci->ce, st);
+			break;
+		case CI_RETRY:
+			/*
+			 * The failures for which we set CI_RETRY are the ones
+			 * that might have been caused by a path collision. So
+			 * we let checkout_entry_ca() retry writing, as it will
+			 * properly handle collisions and the creation of
+			 * leading dirs in the entry's path.
+			 */
+			ret |= checkout_entry_ca(ci->ce, &ci->ca, state, NULL, NULL);
+			break;
+		case CI_FAILED:
+			ret = -1;
+			break;
+		case CI_PENDING:
+			BUG("parallel checkout finished with pending entries");
+		default:
+			BUG("unknown checkout item status in parallel checkout");
+		}
+	}
+
+	return ret;
+}
+
+static int reset_fd(int fd, const char *path)
+{
+	if (lseek(fd, 0, SEEK_SET) != 0)
+		return error_errno("failed to rewind descriptor of %s", path);
+	if (ftruncate(fd, 0))
+		return error_errno("failed to truncate file %s", path);
+	return 0;
+}
+
+static int write_checkout_item_to_fd(int fd, struct checkout *state,
+				     struct checkout_item *ci, const char *path)
+{
+	int ret;
+	struct stream_filter *filter;
+	struct strbuf buf = STRBUF_INIT;
+	char *new_blob;
+	unsigned long size;
+	size_t newsize = 0;
+	ssize_t wrote;
+
+	/* Sanity check */
+	assert(is_eligible_for_parallel_checkout(ci->ce, &ci->ca));
+
+	filter = get_stream_filter_ca(&ci->ca, &ci->ce->oid);
+	if (filter) {
+		if (stream_blob_to_fd(fd, &ci->ce->oid, filter, 1)) {
+			/* On error, reset fd to try writing without streaming */
+			if (reset_fd(fd, path))
+				return -1;
+		} else {
+			return 0;
+		}
+	}
+
+	new_blob = read_blob_entry(ci->ce, &size);
+	if (!new_blob)
+		return error("unable to read sha1 file of %s (%s)", path,
+			     oid_to_hex(&ci->ce->oid));
+
+	/*
+	 * checkout metadata is used to give context for external process
+	 * filters. Files requiring such filters are not eligible for parallel
+	 * checkout, so pass NULL.
+	 */
+	ret = convert_to_working_tree_ca(&ci->ca, ci->ce->name, new_blob, size,
+					 &buf, NULL);
+
+	if (ret) {
+		free(new_blob);
+		new_blob = strbuf_detach(&buf, &newsize);
+		size = newsize;
+	}
+
+	wrote = write_in_full(fd, new_blob, size);
+	free(new_blob);
+	if (wrote < 0)
+		return error("unable to write file %s", path);
+
+	return 0;
+}
+
+static int close_and_clear(int *fd)
+{
+	int ret = 0;
+
+	if (*fd >= 0) {
+		ret = close(*fd);
+		*fd = -1;
+	}
+
+	return ret;
+}
+
+static int check_leading_dirs(const char *path, int len, int prefix_len)
+{
+	const char *slash = path + len;
+
+	while (slash > path && *slash != '/')
+		slash--;
+
+	return has_dirs_only_path(path, slash - path, prefix_len);
+}
+
+static void write_checkout_item(struct checkout *state, struct checkout_item *ci)
+{
+	unsigned int mode = (ci->ce->ce_mode & 0100) ? 0777 : 0666;
+	int fd = -1, fstat_done = 0;
+	struct strbuf path = STRBUF_INIT;
+
+	strbuf_add(&path, state->base_dir, state->base_dir_len);
+	strbuf_add(&path, ci->ce->name, ce_namelen(ci->ce));
+
+	/*
+	 * At this point, leading dirs should have already been created. But if
+	 * a symlink being checked out has collided with one of the dirs, due to
+	 * file system folding rules, it's possible that the dirs are no longer
+	 * present. So we have to check again, and report any path collisions.
+	 */
+	if (!check_leading_dirs(path.buf, path.len, state->base_dir_len)) {
+		ci->status = CI_RETRY;
+		goto out;
+	}
+
+	fd = open(path.buf, O_WRONLY | O_CREAT | O_EXCL, mode);
+
+	if (fd < 0) {
+		if (errno == EEXIST || errno == EISDIR) {
+			/*
+			 * Errors which probably represent a path collision.
+			 * Suppress the error message and mark the ci to be
+			 * retried later, sequentially. ENOTDIR and ENOENT are
+			 * also interesting, but check_leading_dirs() should
+			 * have already caught these cases.
+			 */
+			ci->status = CI_RETRY;
+		} else {
+			error_errno("failed to open file %s", path.buf);
+			ci->status = CI_FAILED;
+		}
+		goto out;
+	}
+
+	if (write_checkout_item_to_fd(fd, state, ci, path.buf)) {
+		/* Error was already reported. */
+		ci->status = CI_FAILED;
+		goto out;
+	}
+
+	fstat_done = fstat_checkout_output(fd, state, &ci->st);
+
+	if (close_and_clear(&fd)) {
+		error_errno("unable to close file %s", path.buf);
+		ci->status = CI_FAILED;
+		goto out;
+	}
+
+	if (state->refresh_cache && !fstat_done && lstat(path.buf, &ci->st) < 0) {
+		error_errno("unable to stat just-written file %s", path.buf);
+		ci->status = CI_FAILED;
+		goto out;
+	}
+
+	ci->status = CI_SUCCESS;
+
+out:
+	/*
+	 * No need to check close() return. At this point, either fd is already
+	 * closed, or we are on an error path, that has already been reported.
+	 */
+	close_and_clear(&fd);
+	strbuf_release(&path);
+}
+
+static int run_checkout_sequentially(struct checkout *state)
+{
+	size_t i;
+
+	for (i = 0; i < parallel_checkout->nr; ++i) {
+		struct checkout_item *ci = &parallel_checkout->items[i];
+		write_checkout_item(state, ci);
+	}
+
+	return handle_results(state);
+}
+
+
+int run_parallel_checkout(struct checkout *state)
+{
+	int ret;
+
+	if (!parallel_checkout)
+		BUG("cannot run parallel checkout: not initialized yet");
+
+	pc_status = PC_RUNNING;
+
+	ret = run_checkout_sequentially(state);
+
+	finish_parallel_checkout();
+	return ret;
+}
diff --git a/parallel-checkout.h b/parallel-checkout.h
new file mode 100644
index 0000000000..8eef59ffcd
--- /dev/null
+++ b/parallel-checkout.h
@@ -0,0 +1,20 @@
+#ifndef PARALLEL_CHECKOUT_H
+#define PARALLEL_CHECKOUT_H
+
+struct cache_entry;
+struct checkout;
+struct conv_attrs;
+
+void init_parallel_checkout(void);
+
+/*
+ * Return -1 if parallel checkout is currently not enabled or if the entry is
+ * not eligible for parallel checkout. Otherwise, enqueue the entry for later
+ * write and return 0.
+ */
+int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca);
+
+/* Write all the queued entries, returning 0 on success. */
+int run_parallel_checkout(struct checkout *state);
+
+#endif /* PARALLEL_CHECKOUT_H */
diff --git a/unpack-trees.c b/unpack-trees.c
index a511fadd89..1b1da7485a 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -17,6 +17,7 @@
 #include "object-store.h"
 #include "promisor-remote.h"
 #include "entry.h"
+#include "parallel-checkout.h"
 
 /*
  * Error messages expected by scripts out of plumbing commands such as
@@ -438,7 +439,6 @@ static int check_updates(struct unpack_trees_options *o,
 	if (should_update_submodules())
 		load_gitmodules_file(index, &state);
 
-	enable_delayed_checkout(&state);
 	if (has_promisor_remote()) {
 		/*
 		 * Prefetch the objects that are to be checked out in the loop
@@ -461,6 +461,9 @@ static int check_updates(struct unpack_trees_options *o,
 					   to_fetch.oid, to_fetch.nr);
 		oid_array_clear(&to_fetch);
 	}
+
+	enable_delayed_checkout(&state);
+	init_parallel_checkout();
 	for (i = 0; i < index->cache_nr; i++) {
 		struct cache_entry *ce = index->cache[i];
 
@@ -474,6 +477,7 @@ static int check_updates(struct unpack_trees_options *o,
 		}
 	}
 	stop_progress(&progress);
+	errs |= run_parallel_checkout(&state);
 	errs |= finish_delayed_checkout(&state, NULL);
 	git_attr_set_direction(GIT_ATTR_CHECKIN);
 
-- 
2.27.0



* [RFC PATCH 11/21] parallel-checkout: make it truly parallel
  2020-08-10 21:33 [RFC PATCH 00/21] [RFC] Parallel checkout Matheus Tavares
                   ` (9 preceding siblings ...)
  2020-08-10 21:33 ` [RFC PATCH 10/21] unpack-trees: add basic support for parallel checkout Matheus Tavares
@ 2020-08-10 21:33 ` Matheus Tavares
  2020-08-19 21:34   ` Jeff Hostetler
  2020-08-10 21:33 ` [RFC PATCH 12/21] parallel-checkout: add configuration options Matheus Tavares
                   ` (12 subsequent siblings)
  23 siblings, 1 reply; 154+ messages in thread
From: Matheus Tavares @ 2020-08-10 21:33 UTC (permalink / raw)
  To: git
  Cc: stolee, jeffhost, Nguyễn Thái Ngọc Duy,
	Paul Tan, Denton Liu, Remi Lespinet, Junio C Hamano

Use multiple worker processes to distribute the queued entries and call
write_checkout_item() in parallel for them. The items are distributed
uniformly in contiguous chunks. This minimizes the chances of two
workers writing to the same directory simultaneously, which could
affect performance due to lock contention in the kernel. Work stealing
(or any other form of re-distribution) is not implemented yet.

For now, the number of workers is equal to the number of logical cores
available. But the next patch will add settings to configure this.
Distributed file systems, such as NFS and EFS, can benefit from using
more workers than the actual number of cores (see timings below).

The parallel version was benchmarked during three operations in the
linux repo, with cold cache: cloning v5.8, checking out v5.8 from
v2.6.15 (checkout I) and checking out v5.8 from v5.7 (checkout II). The
three tables below show the mean run times and standard deviations for
5 runs in a local file system, a Linux NFS server and Amazon EFS. The
numbers of workers were chosen based on what produces the best result
for each case.

Local:

            Clone                  Checkout I             Checkout II
Sequential  8.180 s ± 0.021 s      6.936 s ± 0.030 s      2.585 s ± 0.005 s
10 workers  3.633 s ± 0.040 s      2.288 s ± 0.026 s      1.058 s ± 0.015 s
Speedup     2.25 ± 0.03            3.03 ± 0.04            2.44 ± 0.03

Linux NFS server (v4.1, on EBS, single availability zone):

            Clone                  Checkout I             Checkout II
Sequential  208.069 s ± 2.522 s    198.610 s ± 1.979 s    54.376 s ± 1.333 s
32 workers  67.078 s ±  0.878 s    64.828 s ± 0.387 s     22.993 s ± 0.252 s
Speedup     3.10 ± 0.06            3.06 ± 0.04            2.36 ± 0.06

EFS (v4.1, replicated over multiple availability zones):

            Clone                  Checkout I             Checkout II
Sequential  1143.655 s ± 11.819 s  1277.891 s ± 10.481 s  396.891 s ± 7.505 s
64 workers  173.242 s ± 1.484 s    282.421 s ± 1.521 s    165.424 s ± 9.564 s
Speedup     6.60 ± 0.09            4.52 ± 0.04            2.40 ± 0.15

Local tests were executed on an i7-7700HQ (4 cores with hyper-threading)
running Manjaro Linux, with SSD. NFS and EFS tests were executed on an
Amazon EC2 c5n.large instance, with 2 vCPUs. The Linux NFS server was
running on an m6g.large instance with a 1 TB EBS GP2 volume. Before each
timing, the linux repository was removed (or checked back out), and
`sync && sysctl vm.drop_caches=3` was executed.

Co-authored-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Co-authored-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 .gitignore                 |   1 +
 Makefile                   |   1 +
 builtin.h                  |   1 +
 builtin/checkout--helper.c | 135 +++++++++++++++++++++
 entry.c                    |  13 +-
 git.c                      |   2 +
 parallel-checkout.c        | 237 +++++++++++++++++++++++++++++++------
 parallel-checkout.h        |  74 +++++++++++-
 8 files changed, 425 insertions(+), 39 deletions(-)
 create mode 100644 builtin/checkout--helper.c

diff --git a/.gitignore b/.gitignore
index ee509a2ad2..6c01f0a58c 100644
--- a/.gitignore
+++ b/.gitignore
@@ -33,6 +33,7 @@
 /git-check-mailmap
 /git-check-ref-format
 /git-checkout
+/git-checkout--helper
 /git-checkout-index
 /git-cherry
 /git-cherry-pick
diff --git a/Makefile b/Makefile
index caab8e6401..926473d484 100644
--- a/Makefile
+++ b/Makefile
@@ -1049,6 +1049,7 @@ BUILTIN_OBJS += builtin/check-attr.o
 BUILTIN_OBJS += builtin/check-ignore.o
 BUILTIN_OBJS += builtin/check-mailmap.o
 BUILTIN_OBJS += builtin/check-ref-format.o
+BUILTIN_OBJS += builtin/checkout--helper.o
 BUILTIN_OBJS += builtin/checkout-index.o
 BUILTIN_OBJS += builtin/checkout.o
 BUILTIN_OBJS += builtin/clean.o
diff --git a/builtin.h b/builtin.h
index a5ae15bfe5..5790c68750 100644
--- a/builtin.h
+++ b/builtin.h
@@ -122,6 +122,7 @@ int cmd_branch(int argc, const char **argv, const char *prefix);
 int cmd_bundle(int argc, const char **argv, const char *prefix);
 int cmd_cat_file(int argc, const char **argv, const char *prefix);
 int cmd_checkout(int argc, const char **argv, const char *prefix);
+int cmd_checkout__helper(int argc, const char **argv, const char *prefix);
 int cmd_checkout_index(int argc, const char **argv, const char *prefix);
 int cmd_check_attr(int argc, const char **argv, const char *prefix);
 int cmd_check_ignore(int argc, const char **argv, const char *prefix);
diff --git a/builtin/checkout--helper.c b/builtin/checkout--helper.c
new file mode 100644
index 0000000000..269cf02feb
--- /dev/null
+++ b/builtin/checkout--helper.c
@@ -0,0 +1,135 @@
+#include "builtin.h"
+#include "config.h"
+#include "entry.h"
+#include "parallel-checkout.h"
+#include "parse-options.h"
+#include "pkt-line.h"
+
+static void packet_to_ci(char *line, int len, struct checkout_item *ci)
+{
+	struct ci_fixed_portion *fixed_portion;
+	char *encoding, *variant;
+
+	if (len < sizeof(struct ci_fixed_portion))
+		BUG("checkout worker received too short item (got %d, exp %d)",
+		    len, (int)sizeof(struct ci_fixed_portion));
+
+	fixed_portion = (struct ci_fixed_portion *)line;
+
+	if (len - sizeof(struct ci_fixed_portion) !=
+		fixed_portion->name_len + fixed_portion->working_tree_encoding_len)
+		BUG("checkout worker received corrupted item");
+
+	variant = line + sizeof(struct ci_fixed_portion);
+	if (fixed_portion->working_tree_encoding_len) {
+		encoding = xmemdupz(variant,
+				    fixed_portion->working_tree_encoding_len);
+		variant += fixed_portion->working_tree_encoding_len;
+	} else {
+		encoding = NULL;
+	}
+
+	memset(ci, 0, sizeof(*ci));
+	ci->ce = make_empty_transient_cache_entry(fixed_portion->name_len);
+	ci->ce->ce_namelen = fixed_portion->name_len;
+	ci->ce->ce_mode = fixed_portion->ce_mode;
+	memcpy(ci->ce->name, variant, ci->ce->ce_namelen);
+	oidcpy(&ci->ce->oid, &fixed_portion->oid);
+
+	ci->id = fixed_portion->id;
+	ci->ca.attr_action = fixed_portion->attr_action;
+	ci->ca.crlf_action = fixed_portion->crlf_action;
+	ci->ca.ident = fixed_portion->ident;
+	ci->ca.working_tree_encoding = encoding;
+}
+
+static void report_result(struct checkout_item *ci)
+{
+	struct ci_result res = { 0 };
+	size_t size;
+
+	res.id = ci->id;
+	res.status = ci->status;
+
+	if (ci->status == CI_SUCCESS) {
+		res.st = ci->st;
+		size = sizeof(res);
+	} else {
+		size = ci_result_base_size();
+	}
+
+	packet_write(1, (const char *)&res, size);
+}
+
+/* Free the worker-side malloced data, but not the ci itself. */
+static void release_checkout_item_data(struct checkout_item *ci)
+{
+	free((char *)ci->ca.working_tree_encoding);
+	discard_cache_entry(ci->ce);
+}
+
+static void worker_loop(struct checkout *state)
+{
+	struct checkout_item *items = NULL;
+	size_t i, nr = 0, alloc = 0;
+
+	while (1) {
+		int len;
+		char *line = packet_read_line(0, &len);
+
+		if (!line)
+			break;
+
+		ALLOC_GROW(items, nr + 1, alloc);
+		packet_to_ci(line, len, &items[nr++]);
+	}
+
+	for (i = 0; i < nr; ++i) {
+		struct checkout_item *ci = &items[i];
+		write_checkout_item(state, ci);
+		report_result(ci);
+		release_checkout_item_data(ci);
+	}
+
+	packet_flush(1);
+
+	free(items);
+}
+
+static const char * const checkout_helper_usage[] = {
+	N_("git checkout--helper [<options>]"),
+	NULL
+};
+
+int cmd_checkout__helper(int argc, const char **argv, const char *prefix)
+{
+	struct checkout state = CHECKOUT_INIT;
+	struct option checkout_helper_options[] = {
+		OPT_STRING(0, "prefix", &state.base_dir, N_("string"),
+			N_("when creating files, prepend <string>")),
+		OPT_END()
+	};
+
+	if (argc == 2 && !strcmp(argv[1], "-h"))
+		usage_with_options(checkout_helper_usage,
+				   checkout_helper_options);
+
+	git_config(git_default_config, NULL);
+	argc = parse_options(argc, argv, prefix, checkout_helper_options,
+			     checkout_helper_usage, 0);
+	if (argc > 0)
+		usage_with_options(checkout_helper_usage, checkout_helper_options);
+
+	if (state.base_dir)
+		state.base_dir_len = strlen(state.base_dir);
+
+	/*
+	 * Setting this on the worker won't actually update the index. We just
+	 * need to pretend to, in order to induce the checkout machinery to
+	 * stat() the written entries.
+	 */
+	state.refresh_cache = 1;
+
+	worker_loop(&state);
+	return 0;
+}
diff --git a/entry.c b/entry.c
index 47c2c20d5a..b6c808dffa 100644
--- a/entry.c
+++ b/entry.c
@@ -427,8 +427,17 @@ static void mark_colliding_entries(const struct checkout *state,
 	for (i = 0; i < state->istate->cache_nr; i++) {
 		struct cache_entry *dup = state->istate->cache[i];
 
-		if (dup == ce)
-			break;
+		if (dup == ce) {
+			/*
+			 * Parallel checkout creates the files in a racy order.
+			 * So the other side of the collision may appear after
+			 * the given cache_entry in the array.
+			 */
+			if (parallel_checkout_status() == PC_HANDLING_RESULTS)
+				continue;
+			else
+				break;
+		}
 
 		if (dup->ce_flags & (CE_MATCHED | CE_VALID | CE_SKIP_WORKTREE))
 			continue;
diff --git a/git.c b/git.c
index 8bd1d7551d..78c7bd412c 100644
--- a/git.c
+++ b/git.c
@@ -486,6 +486,8 @@ static struct cmd_struct commands[] = {
 	{ "check-mailmap", cmd_check_mailmap, RUN_SETUP },
 	{ "check-ref-format", cmd_check_ref_format, NO_PARSEOPT  },
 	{ "checkout", cmd_checkout, RUN_SETUP | NEED_WORK_TREE },
+	{ "checkout--helper", cmd_checkout__helper,
+		RUN_SETUP | NEED_WORK_TREE | SUPPORT_SUPER_PREFIX },
 	{ "checkout-index", cmd_checkout_index,
 		RUN_SETUP | NEED_WORK_TREE},
 	{ "cherry", cmd_cherry, RUN_SETUP },
diff --git a/parallel-checkout.c b/parallel-checkout.c
index e3b44eeb34..ec42342bc8 100644
--- a/parallel-checkout.c
+++ b/parallel-checkout.c
@@ -1,39 +1,23 @@
 #include "cache.h"
 #include "entry.h"
 #include "parallel-checkout.h"
+#include "pkt-line.h"
+#include "run-command.h"
 #include "streaming.h"
 
-enum ci_status {
-	CI_PENDING = 0,
-	CI_SUCCESS,
-	CI_RETRY,
-	CI_FAILED,
-};
-
-struct checkout_item {
-	/* pointer to a istate->cache[] entry. Not owned by us. */
-	struct cache_entry *ce;
-	struct conv_attrs ca;
-	struct stat st;
-	enum ci_status status;
-};
-
 struct parallel_checkout {
 	struct checkout_item *items;
 	size_t nr, alloc;
 };
 
 static struct parallel_checkout *parallel_checkout = NULL;
-
-enum pc_status {
-	PC_UNINITIALIZED = 0,
-	PC_ACCEPTING_ENTRIES,
-	PC_RUNNING,
-	PC_HANDLING_RESULTS,
-};
-
 static enum pc_status pc_status = PC_UNINITIALIZED;
 
+enum pc_status parallel_checkout_status(void)
+{
+	return pc_status;
+}
+
 void init_parallel_checkout(void)
 {
 	if (parallel_checkout)
@@ -113,9 +97,11 @@ int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca)
 	ALLOC_GROW(parallel_checkout->items, parallel_checkout->nr + 1,
 		   parallel_checkout->alloc);
 
-	ci = &parallel_checkout->items[parallel_checkout->nr++];
+	ci = &parallel_checkout->items[parallel_checkout->nr];
 	ci->ce = ce;
 	memcpy(&ci->ca, ca, sizeof(ci->ca));
+	ci->id = parallel_checkout->nr;
+	parallel_checkout->nr++;
 
 	return 0;
 }
@@ -200,7 +186,8 @@ static int write_checkout_item_to_fd(int fd, struct checkout *state,
 	/*
 	 * checkout metadata is used to give context for external process
 	 * filters. Files requiring such filters are not eligible for parallel
-	 * checkout, so pass NULL.
+	 * checkout, so pass NULL. Note: if that changes, the metadata must also
+	 * be passed from the main process to the workers.
 	 */
 	ret = convert_to_working_tree_ca(&ci->ca, ci->ce->name, new_blob, size,
 					 &buf, NULL);
@@ -241,14 +228,14 @@ static int check_leading_dirs(const char *path, int len, int prefix_len)
 	return has_dirs_only_path(path, slash - path, prefix_len);
 }
 
-static void write_checkout_item(struct checkout *state, struct checkout_item *ci)
+void write_checkout_item(struct checkout *state, struct checkout_item *ci)
 {
 	unsigned int mode = (ci->ce->ce_mode & 0100) ? 0777 : 0666;
 	int fd = -1, fstat_done = 0;
 	struct strbuf path = STRBUF_INIT;
 
 	strbuf_add(&path, state->base_dir, state->base_dir_len);
-	strbuf_add(&path, ci->ce->name, ce_namelen(ci->ce));
+	strbuf_add(&path, ci->ce->name, ci->ce->ce_namelen);
 
 	/*
 	 * At this point, leading dirs should have already been created. But if
@@ -311,30 +298,214 @@ static void write_checkout_item(struct checkout *state, struct checkout_item *ci
 	strbuf_release(&path);
 }
 
-static int run_checkout_sequentially(struct checkout *state)
+static void send_one_item(int fd, struct checkout_item *ci)
+{
+	size_t len_data;
+	char *data, *variant;
+	struct ci_fixed_portion *fixed_portion;
+	const char *working_tree_encoding = ci->ca.working_tree_encoding;
+	size_t name_len = ci->ce->ce_namelen;
+	size_t working_tree_encoding_len = working_tree_encoding ?
+					   strlen(working_tree_encoding) : 0;
+
+	len_data = sizeof(struct ci_fixed_portion) + name_len +
+		   working_tree_encoding_len;
+
+	data = xcalloc(1, len_data);
+
+	fixed_portion = (struct ci_fixed_portion *)data;
+	fixed_portion->id = ci->id;
+	oidcpy(&fixed_portion->oid, &ci->ce->oid);
+	fixed_portion->ce_mode = ci->ce->ce_mode;
+	fixed_portion->attr_action = ci->ca.attr_action;
+	fixed_portion->crlf_action = ci->ca.crlf_action;
+	fixed_portion->ident = ci->ca.ident;
+	fixed_portion->name_len = name_len;
+	fixed_portion->working_tree_encoding_len = working_tree_encoding_len;
+
+	variant = data + sizeof(*fixed_portion);
+	if (working_tree_encoding_len) {
+		memcpy(variant, working_tree_encoding, working_tree_encoding_len);
+		variant += working_tree_encoding_len;
+	}
+	memcpy(variant, ci->ce->name, name_len);
+
+	packet_write(fd, data, len_data);
+
+	free(data);
+}
+
+static void send_batch(int fd, size_t start, size_t nr)
 {
 	size_t i;
+	for (i = 0; i < nr; ++i)
+		send_one_item(fd, &parallel_checkout->items[start + i]);
+	packet_flush(fd);
+}
 
-	for (i = 0; i < parallel_checkout->nr; ++i) {
-		struct checkout_item *ci = &parallel_checkout->items[i];
-		write_checkout_item(state, ci);
+static struct child_process *setup_workers(struct checkout *state, int num_workers)
+{
+	struct child_process *workers;
+	int i, workers_with_one_extra_item;
+	size_t base_batch_size, next_to_assign = 0;
+
+	base_batch_size = parallel_checkout->nr / num_workers;
+	workers_with_one_extra_item = parallel_checkout->nr % num_workers;
+	ALLOC_ARRAY(workers, num_workers);
+
+	for (i = 0; i < num_workers; ++i) {
+		struct child_process *cp = &workers[i];
+		size_t batch_size = base_batch_size;
+
+		child_process_init(cp);
+		cp->git_cmd = 1;
+		cp->in = -1;
+		cp->out = -1;
+		strvec_push(&cp->args, "checkout--helper");
+		if (state->base_dir_len)
+			strvec_pushf(&cp->args, "--prefix=%s", state->base_dir);
+		if (start_command(cp))
+			die(_("failed to spawn checkout worker"));
+
+		/* distribute the extra work evenly */
+		if (i < workers_with_one_extra_item)
+			batch_size++;
+
+		send_batch(cp->in, next_to_assign, batch_size);
+		next_to_assign += batch_size;
 	}
 
+	return workers;
+}
+
+static void finish_workers(struct child_process *workers, int num_workers)
+{
+	int i;
+	for (i = 0; i < num_workers; ++i) {
+		struct child_process *w = &workers[i];
+		if (w->in >= 0)
+			close(w->in);
+		if (w->out >= 0)
+			close(w->out);
+		if (finish_command(w))
+			die(_("checkout worker finished with error"));
+	}
+	free(workers);
+}
+
+static void parse_and_save_result(const char *line, int len)
+{
+	struct ci_result *res;
+	struct checkout_item *ci;
+
+	/*
+	 * Worker should send either the full result struct or just the base
+	 * (i.e. no stat data).
+	 */
+	if (len != ci_result_base_size() && len != sizeof(struct ci_result))
+		BUG("received corrupted item from checkout worker");
+
+	res = (struct ci_result *)line;
+
+	if (res->id >= parallel_checkout->nr)
+		BUG("checkout worker sent unknown item id");
+
+	ci = &parallel_checkout->items[res->id];
+	ci->status = res->status;
+
+	/*
+	 * Worker only sends stat data on success. Otherwise, we *cannot* access
+	 * res->st as that will be an invalid address.
+	 */
+	if (res->status == CI_SUCCESS)
+		ci->st = res->st;
+}
+
+static void gather_results_from_workers(struct child_process *workers,
+					int num_workers)
+{
+	int i, active_workers = num_workers;
+	struct pollfd *pfds;
+
+	CALLOC_ARRAY(pfds, num_workers);
+	for (i = 0; i < num_workers; ++i) {
+		pfds[i].fd = workers[i].out;
+		pfds[i].events = POLLIN;
+	}
+
+	while (active_workers) {
+		int nr = poll(pfds, num_workers, -1);
+
+		if (nr < 0) {
+			if (errno == EINTR)
+				continue;
+			die_errno("failed to poll checkout workers");
+		}
+
+		for (i = 0; i < num_workers && nr > 0; ++i) {
+			struct pollfd *pfd = &pfds[i];
+
+			if (!pfd->revents)
+				continue;
+
+			if (pfd->revents & POLLIN) {
+				int len;
+				const char *line = packet_read_line(pfd->fd, &len);
+
+				if (!line) {
+					pfd->fd = -1;
+					active_workers--;
+				} else {
+					parse_and_save_result(line, len);
+				}
+			} else if (pfd->revents & POLLHUP) {
+				pfd->fd = -1;
+				active_workers--;
+			} else if (pfd->revents & (POLLNVAL | POLLERR)) {
+				die(_("error polling from checkout worker"));
+			}
+
+			nr--;
+		}
+	}
+
+	free(pfds);
+}
+
+static int run_checkout_sequentially(struct checkout *state)
+{
+	size_t i;
+	for (i = 0; i < parallel_checkout->nr; ++i)
+		write_checkout_item(state, &parallel_checkout->items[i]);
 	return handle_results(state);
 }
 
+static const int workers_threshold = 0;
 
 int run_parallel_checkout(struct checkout *state)
 {
-	int ret;
+	int num_workers = online_cpus();
+	int ret = 0;
+	struct child_process *workers;
 
 	if (!parallel_checkout)
 		BUG("cannot run parallel checkout: not initialized yet");
 
 	pc_status = PC_RUNNING;
 
-	ret = run_checkout_sequentially(state);
+	if (parallel_checkout->nr == 0) {
+		goto done;
+	} else if (parallel_checkout->nr < workers_threshold || num_workers == 1) {
+		ret = run_checkout_sequentially(state);
+		goto done;
+	}
+
+	workers = setup_workers(state, num_workers);
+	gather_results_from_workers(workers, num_workers);
+	finish_workers(workers, num_workers);
+	ret = handle_results(state);
 
+done:
 	finish_parallel_checkout();
 	return ret;
 }
diff --git a/parallel-checkout.h b/parallel-checkout.h
index 8eef59ffcd..f25f2874ae 100644
--- a/parallel-checkout.h
+++ b/parallel-checkout.h
@@ -1,10 +1,21 @@
 #ifndef PARALLEL_CHECKOUT_H
 #define PARALLEL_CHECKOUT_H
 
-struct cache_entry;
-struct checkout;
-struct conv_attrs;
+#include "entry.h"
+#include "convert.h"
 
+/****************************************************************
+ * Users of parallel checkout
+ ****************************************************************/
+
+enum pc_status {
+	PC_UNINITIALIZED = 0,
+	PC_ACCEPTING_ENTRIES,
+	PC_RUNNING,
+	PC_HANDLING_RESULTS,
+};
+
+enum pc_status parallel_checkout_status(void);
 void init_parallel_checkout(void);
 
 /*
@@ -14,7 +25,62 @@ void init_parallel_checkout(void);
  */
 int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca);
 
-/* Write all the queued entries, returning 0 on success.*/
+/* Write all the queued entries, returning 0 on success. */
 int run_parallel_checkout(struct checkout *state);
 
+/****************************************************************
+ * Interface with checkout--helper
+ ****************************************************************/
+
+enum ci_status {
+	CI_PENDING = 0,
+	CI_SUCCESS,
+	CI_RETRY,
+	CI_FAILED,
+};
+
+struct checkout_item {
+	/*
+	 * In the main process, ce points to an istate->cache[] entry, and is not
+	 * owned by us. In the workers, the memory is owned and *must* be released.
+	 */
+	struct cache_entry *ce;
+	struct conv_attrs ca;
+	size_t id; /* position in parallel_checkout->items[] of main process */
+
+	/* Output fields, sent from workers. */
+	enum ci_status status;
+	struct stat st;
+};
+
+/*
+ * The fixed-size portion of `struct checkout_item` that is sent to the workers.
+ * Following this will be 2 strings: ca.working_tree_encoding and ce.name; these
+ * are NOT null-terminated, since their lengths are in the fixed portion.
+ */
+struct ci_fixed_portion {
+	size_t id;
+	struct object_id oid;
+	unsigned int ce_mode;
+	enum crlf_action attr_action;
+	enum crlf_action crlf_action;
+	int ident;
+	size_t working_tree_encoding_len;
+	size_t name_len;
+};
+
+/*
+ * The `struct checkout_item` fields returned by the workers. The order is
+ * important here, especially st being the last one, as it is omitted on error.
+ */
+struct ci_result {
+	size_t id;
+	enum ci_status status;
+	struct stat st;
+};
+
+#define ci_result_base_size() offsetof(struct ci_result, st)
+
+void write_checkout_item(struct checkout *state, struct checkout_item *ci);
+
 #endif /* PARALLEL_CHECKOUT_H */
-- 
2.27.0


^ permalink raw reply	[flat|nested] 154+ messages in thread

* [RFC PATCH 12/21] parallel-checkout: add configuration options
  2020-08-10 21:33 [RFC PATCH 00/21] [RFC] Parallel checkout Matheus Tavares
                   ` (10 preceding siblings ...)
  2020-08-10 21:33 ` [RFC PATCH 11/21] parallel-checkout: make it truly parallel Matheus Tavares
@ 2020-08-10 21:33 ` Matheus Tavares
  2020-08-10 21:33 ` [RFC PATCH 13/21] parallel-checkout: support progress displaying Matheus Tavares
                   ` (11 subsequent siblings)
  23 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-08-10 21:33 UTC (permalink / raw)
  To: git
  Cc: stolee, jeffhost, Nguyễn Thái Ngọc Duy,
	Junio C Hamano, René Scharfe, Stefan Beller

Add the checkout.workers and checkout.workersThreshold settings, which
allow users to configure and/or disable the parallel checkout feature as
desired. The first setting defines the number of workers, and the second
defines the minimum number of entries required to attempt a parallel
checkout.
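
As a usage sketch, the new settings could look like this in a config
file (the values are illustrative, not recommendations):

```ini
[checkout]
	workers = 16            # number of helper processes
	workersThreshold = 100  # below this many entries, run sequentially
```

Setting checkout.workers to 1 effectively disables the parallel code
path, since one worker means sequential execution.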

Co-authored-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---

I still have to evaluate the best default value for
checkout.workersThreshold. For now, I used 0 so that the test suite uses
parallel checkout by default, exercising the new code. I'm open to
suggestions on how we can improve testing for it once
checkout.workersThreshold is no longer 0.

Note: the default number of workers can probably be calculated better as
well, e.g. by multiplying the number of cores by some factor. My machine,
for example, has 8 logical cores, but 10 workers leads to the fastest
execution.

 Documentation/config/checkout.txt | 16 ++++++++++++++++
 parallel-checkout.c               | 26 +++++++++++++++++++++-----
 parallel-checkout.h               | 11 +++++++++--
 unpack-trees.c                    | 10 +++++++---
 4 files changed, 53 insertions(+), 10 deletions(-)

diff --git a/Documentation/config/checkout.txt b/Documentation/config/checkout.txt
index 6b646813ab..9dabdf9231 100644
--- a/Documentation/config/checkout.txt
+++ b/Documentation/config/checkout.txt
@@ -16,3 +16,19 @@ will checkout the '<something>' branch on another remote,
 and by linkgit:git-worktree[1] when 'git worktree add' refers to a
 remote branch. This setting might be used for other checkout-like
 commands or functionality in the future.
+
+checkout.workers::
+	The number of worker processes to use when updating the working tree.
+	If unset (or set to a value less than one), Git will use as many
+	workers as the number of logical cores available. One means sequential
+	execution. This and the checkout.workersThreshold settings affect all
+	commands which perform checkout. E.g. checkout, switch, clone,
+	sparse-checkout, read-tree, etc.
+
+checkout.workersThreshold::
+	If set to a positive number, parallel checkout will not be attempted
+	when the number of files to be updated is less than the defined limit.
+	When set to a negative number or unset, it defaults to 0. The reasoning
+	behind this config is that, when modifying a small number of files,
+	sequential execution might be faster, as it avoids the cost of spawning
+	subprocesses and inter-process communication.
diff --git a/parallel-checkout.c b/parallel-checkout.c
index ec42342bc8..e0fca4d380 100644
--- a/parallel-checkout.c
+++ b/parallel-checkout.c
@@ -4,6 +4,8 @@
 #include "pkt-line.h"
 #include "run-command.h"
 #include "streaming.h"
+#include "thread-utils.h"
+#include "config.h"
 
 struct parallel_checkout {
 	struct checkout_item *items;
@@ -18,6 +20,19 @@ enum pc_status parallel_checkout_status(void)
 	return pc_status;
 }
 
+#define DEFAULT_WORKERS_THRESHOLD 0
+
+void get_parallel_checkout_configs(int *num_workers, int *threshold)
+{
+	if (git_config_get_int("checkout.workers", num_workers) ||
+	    *num_workers < 1)
+		*num_workers = online_cpus();
+
+	if (git_config_get_int("checkout.workersThreshold", threshold) ||
+	    *threshold < 0)
+		*threshold = DEFAULT_WORKERS_THRESHOLD;
+}
+
 void init_parallel_checkout(void)
 {
 	if (parallel_checkout)
@@ -480,22 +495,23 @@ static int run_checkout_sequentially(struct checkout *state)
 	return handle_results(state);
 }
 
-static const int workers_threshold = 0;
-
-int run_parallel_checkout(struct checkout *state)
+int run_parallel_checkout(struct checkout *state, int num_workers, int threshold)
 {
-	int num_workers = online_cpus();
 	int ret = 0;
 	struct child_process *workers;
 
 	if (!parallel_checkout)
 		BUG("cannot run parallel checkout: not initialized yet");
 
+	if (num_workers < 1)
+		BUG("invalid number of workers for run_parallel_checkout: %d",
+		    num_workers);
+
 	pc_status = PC_RUNNING;
 
 	if (parallel_checkout->nr == 0) {
 		goto done;
-	} else if (parallel_checkout->nr < workers_threshold || num_workers == 1) {
+	} else if (parallel_checkout->nr < threshold || num_workers == 1) {
 		ret = run_checkout_sequentially(state);
 		goto done;
 	}
diff --git a/parallel-checkout.h b/parallel-checkout.h
index f25f2874ae..b4d412c8b5 100644
--- a/parallel-checkout.h
+++ b/parallel-checkout.h
@@ -18,6 +18,9 @@ enum pc_status {
 enum pc_status parallel_checkout_status(void);
 void init_parallel_checkout(void);
 
+/* Reads the checkout.workers and checkout.workersThreshold settings. */
+void get_parallel_checkout_configs(int *num_workers, int *threshold);
+
 /*
  * Return -1 if parallel checkout is currently not enabled or if the entry is
  * not eligible for parallel checkout. Otherwise, enqueue the entry for later
@@ -25,8 +28,12 @@ void init_parallel_checkout(void);
  */
 int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca);
 
-/* Write all the queued entries, returning 0 on success. */
-int run_parallel_checkout(struct checkout *state);
+/*
+ * Write all the queued entries, returning 0 on success. If the number of
+ * entries is below the specified threshold, the operation is performed
+ * sequentially.
+ */
+int run_parallel_checkout(struct checkout *state, int num_workers, int threshold);
 
 /****************************************************************
  * Interface with checkout--helper
diff --git a/unpack-trees.c b/unpack-trees.c
index 1b1da7485a..117ed42370 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -399,7 +399,7 @@ static int check_updates(struct unpack_trees_options *o,
 	int errs = 0;
 	struct progress *progress;
 	struct checkout state = CHECKOUT_INIT;
-	int i;
+	int i, pc_workers, pc_threshold;
 
 	trace_performance_enter();
 	state.force = 1;
@@ -462,8 +462,11 @@ static int check_updates(struct unpack_trees_options *o,
 		oid_array_clear(&to_fetch);
 	}
 
+	get_parallel_checkout_configs(&pc_workers, &pc_threshold);
+
 	enable_delayed_checkout(&state);
-	init_parallel_checkout();
+	if (pc_workers > 1)
+		init_parallel_checkout();
 	for (i = 0; i < index->cache_nr; i++) {
 		struct cache_entry *ce = index->cache[i];
 
@@ -477,7 +480,8 @@ static int check_updates(struct unpack_trees_options *o,
 		}
 	}
 	stop_progress(&progress);
-	errs |= run_parallel_checkout(&state);
+	if (pc_workers > 1)
+		errs |= run_parallel_checkout(&state, pc_workers, pc_threshold);
 	errs |= finish_delayed_checkout(&state, NULL);
 	git_attr_set_direction(GIT_ATTR_CHECKIN);
 
-- 
2.27.0



* [RFC PATCH 13/21] parallel-checkout: support progress displaying
  2020-08-10 21:33 [RFC PATCH 00/21] [RFC] Parallel checkout Matheus Tavares
                   ` (11 preceding siblings ...)
  2020-08-10 21:33 ` [RFC PATCH 12/21] parallel-checkout: add configuration options Matheus Tavares
@ 2020-08-10 21:33 ` Matheus Tavares
  2020-08-10 21:33 ` [RFC PATCH 14/21] make_transient_cache_entry(): optionally alloc from mem_pool Matheus Tavares
                   ` (10 subsequent siblings)
  23 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-08-10 21:33 UTC (permalink / raw)
  To: git
  Cc: stolee, jeffhost, Nguyễn Thái Ngọc Duy,
	Junio C Hamano, Johannes Schindelin, Elijah Newren

Original-patch-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 parallel-checkout.c | 40 +++++++++++++++++++++++++++++++++++++---
 parallel-checkout.h |  4 +++-
 unpack-trees.c      | 11 ++++++++---
 3 files changed, 48 insertions(+), 7 deletions(-)

diff --git a/parallel-checkout.c b/parallel-checkout.c
index e0fca4d380..78bf2de5ea 100644
--- a/parallel-checkout.c
+++ b/parallel-checkout.c
@@ -2,6 +2,7 @@
 #include "entry.h"
 #include "parallel-checkout.h"
 #include "pkt-line.h"
+#include "progress.h"
 #include "run-command.h"
 #include "streaming.h"
 #include "thread-utils.h"
@@ -10,6 +11,8 @@
 struct parallel_checkout {
 	struct checkout_item *items;
 	size_t nr, alloc;
+	struct progress *progress;
+	unsigned int *progress_cnt;
 };
 
 static struct parallel_checkout *parallel_checkout = NULL;
@@ -121,6 +124,22 @@ int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca)
 	return 0;
 }
 
+size_t pc_queue_size(void)
+{
+	if (!parallel_checkout)
+		return 0;
+	return parallel_checkout->nr;
+}
+
+static void advance_progress_meter(void)
+{
+	if (parallel_checkout && parallel_checkout->progress) {
+		(*parallel_checkout->progress_cnt)++;
+		display_progress(parallel_checkout->progress,
+				 *parallel_checkout->progress_cnt);
+	}
+}
+
 static int handle_results(struct checkout *state)
 {
 	int ret = 0;
@@ -132,6 +151,10 @@ static int handle_results(struct checkout *state)
 		struct checkout_item *ci = &parallel_checkout->items[i];
 		struct stat *st = &ci->st;
 
+		/*
+		 * Note: progress meter was already incremented for CI_SUCCESS
+		 * and CI_FAILED.
+		 */
 		switch(ci->status) {
 		case CI_SUCCESS:
 			update_ce_after_write(state, ci->ce, st);
@@ -145,6 +168,7 @@ static int handle_results(struct checkout *state)
 			 * leading dirs in the entry's path.
 			 */
 			ret |= checkout_entry_ca(ci->ce, &ci->ca, state, NULL, NULL);
+			advance_progress_meter();
 			break;
 		case CI_FAILED:
 			ret = -1;
@@ -434,6 +458,9 @@ static void parse_and_save_result(const char *line, int len)
 	 */
 	if (res->status == CI_SUCCESS)
 		ci->st = res->st;
+
+	if (res->status != CI_RETRY)
+		advance_progress_meter();
 }
 
 static void gather_results_from_workers(struct child_process *workers,
@@ -490,12 +517,17 @@ static void gather_results_from_workers(struct child_process *workers,
 static int run_checkout_sequentially(struct checkout *state)
 {
 	size_t i;
-	for (i = 0; i < parallel_checkout->nr; ++i)
-		write_checkout_item(state, &parallel_checkout->items[i]);
+	for (i = 0; i < parallel_checkout->nr; ++i) {
+		struct checkout_item *ci = &parallel_checkout->items[i];
+		write_checkout_item(state, ci);
+		if (ci->status != CI_RETRY)
+			advance_progress_meter();
+	}
 	return handle_results(state);
 }
 
-int run_parallel_checkout(struct checkout *state, int num_workers, int threshold)
+int run_parallel_checkout(struct checkout *state, int num_workers, int threshold,
+			  struct progress *progress, unsigned int *progress_cnt)
 {
 	int ret = 0;
 	struct child_process *workers;
@@ -508,6 +540,8 @@ int run_parallel_checkout(struct checkout *state, int num_workers, int threshold
 		    num_workers);
 
 	pc_status = PC_RUNNING;
+	parallel_checkout->progress = progress;
+	parallel_checkout->progress_cnt = progress_cnt;
 
 	if (parallel_checkout->nr == 0) {
 		goto done;
diff --git a/parallel-checkout.h b/parallel-checkout.h
index b4d412c8b5..2b81a5db6c 100644
--- a/parallel-checkout.h
+++ b/parallel-checkout.h
@@ -27,13 +27,15 @@ void get_parallel_checkout_configs(int *num_workers, int *threshold);
  * write and return 0.
  */
 int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca);
+size_t pc_queue_size(void);
 
 /*
  * Write all the queued entries, returning 0 on success. If the number of
  * entries is below the specified threshold, the operation is performed
  * sequentially.
  */
-int run_parallel_checkout(struct checkout *state, int num_workers, int threshold);
+int run_parallel_checkout(struct checkout *state, int num_workers, int threshold,
+			  struct progress *progress, unsigned int *progress_cnt);
 
 /****************************************************************
  * Interface with checkout--helper
diff --git a/unpack-trees.c b/unpack-trees.c
index 117ed42370..e05e6ceff2 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -471,17 +471,22 @@ static int check_updates(struct unpack_trees_options *o,
 		struct cache_entry *ce = index->cache[i];
 
 		if (ce->ce_flags & CE_UPDATE) {
+			size_t last_pc_queue_size = pc_queue_size();
+
 			if (ce->ce_flags & CE_WT_REMOVE)
 				BUG("both update and delete flags are set on %s",
 				    ce->name);
-			display_progress(progress, ++cnt);
 			ce->ce_flags &= ~CE_UPDATE;
 			errs |= checkout_entry(ce, &state, NULL, NULL);
+
+			if (last_pc_queue_size == pc_queue_size())
+				display_progress(progress, ++cnt);
 		}
 	}
-	stop_progress(&progress);
 	if (pc_workers > 1)
-		errs |= run_parallel_checkout(&state, pc_workers, pc_threshold);
+		errs |= run_parallel_checkout(&state, pc_workers, pc_threshold,
+					      progress, &cnt);
+	stop_progress(&progress);
 	errs |= finish_delayed_checkout(&state, NULL);
 	git_attr_set_direction(GIT_ATTR_CHECKIN);
 
-- 
2.27.0


^ permalink raw reply	[flat|nested] 154+ messages in thread

* [RFC PATCH 14/21] make_transient_cache_entry(): optionally alloc from mem_pool
  2020-08-10 21:33 [RFC PATCH 00/21] [RFC] Parallel checkout Matheus Tavares
                   ` (12 preceding siblings ...)
  2020-08-10 21:33 ` [RFC PATCH 13/21] parallel-checkout: support progress displaying Matheus Tavares
@ 2020-08-10 21:33 ` Matheus Tavares
  2020-08-10 21:33 ` [RFC PATCH 15/21] builtin/checkout.c: complete parallel checkout support Matheus Tavares
                   ` (9 subsequent siblings)
  23 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-08-10 21:33 UTC (permalink / raw)
  To: git
  Cc: stolee, jeffhost, Nguyễn Thái Ngọc Duy,
	Patryk Obara, Johannes Schindelin, Junio C Hamano,
	Jameson Miller, Jeff King

Allow make_transient_cache_entry() to optionally receive a mem_pool
struct in which it should allocate the entry. This will be used in the
following patch to store some transient entries that must persist
until parallel checkout finishes.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 builtin/checkout--helper.c |  2 +-
 builtin/checkout.c         |  2 +-
 builtin/difftool.c         |  2 +-
 cache.h                    | 10 +++++-----
 read-cache.c               | 12 ++++++++----
 unpack-trees.c             |  2 +-
 6 files changed, 17 insertions(+), 13 deletions(-)

diff --git a/builtin/checkout--helper.c b/builtin/checkout--helper.c
index 269cf02feb..d2ab40cb4c 100644
--- a/builtin/checkout--helper.c
+++ b/builtin/checkout--helper.c
@@ -30,7 +30,7 @@ static void packet_to_ci(char *line, int len, struct checkout_item *ci)
 	}
 
 	memset(ci, 0, sizeof(*ci));
-	ci->ce = make_empty_transient_cache_entry(fixed_portion->name_len);
+	ci->ce = make_empty_transient_cache_entry(fixed_portion->name_len, NULL);
 	ci->ce->ce_namelen = fixed_portion->name_len;
 	ci->ce->ce_mode = fixed_portion->ce_mode;
 	memcpy(ci->ce->name, variant, ci->ce->ce_namelen);
diff --git a/builtin/checkout.c b/builtin/checkout.c
index 3e09b29cfe..8e4a3c1df0 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -291,7 +291,7 @@ static int checkout_merged(int pos, const struct checkout *state, int *nr_checko
 	if (write_object_file(result_buf.ptr, result_buf.size, blob_type, &oid))
 		die(_("Unable to add merge result for '%s'"), path);
 	free(result_buf.ptr);
-	ce = make_transient_cache_entry(mode, &oid, path, 2);
+	ce = make_transient_cache_entry(mode, &oid, path, 2, NULL);
 	if (!ce)
 		die(_("make_cache_entry failed for path '%s'"), path);
 	status = checkout_entry(ce, state, NULL, nr_checkouts);
diff --git a/builtin/difftool.c b/builtin/difftool.c
index dfa22b67eb..5e7a57c8c2 100644
--- a/builtin/difftool.c
+++ b/builtin/difftool.c
@@ -323,7 +323,7 @@ static int checkout_path(unsigned mode, struct object_id *oid,
 	struct cache_entry *ce;
 	int ret;
 
-	ce = make_transient_cache_entry(mode, oid, path, 0);
+	ce = make_transient_cache_entry(mode, oid, path, 0, NULL);
 	ret = checkout_entry(ce, state, NULL, NULL);
 
 	discard_cache_entry(ce);
diff --git a/cache.h b/cache.h
index e6963cf8fe..e2b41c5f8b 100644
--- a/cache.h
+++ b/cache.h
@@ -355,16 +355,16 @@ struct cache_entry *make_empty_cache_entry(struct index_state *istate,
 					   size_t name_len);
 
 /*
- * Create a cache_entry that is not intended to be added to an index.
- * Caller is responsible for discarding the cache_entry
- * with `discard_cache_entry`.
+ * Create a cache_entry that is not intended to be added to an index. If mp is
+ * not NULL, the entry is allocated within the given memory pool. Caller is
+ * responsible for discarding the cache_entry with `discard_cache_entry`.
  */
 struct cache_entry *make_transient_cache_entry(unsigned int mode,
 					       const struct object_id *oid,
 					       const char *path,
-					       int stage);
+					       int stage, struct mem_pool *mp);
 
-struct cache_entry *make_empty_transient_cache_entry(size_t name_len);
+struct cache_entry *make_empty_transient_cache_entry(size_t len, struct mem_pool *mp);
 
 /*
  * Discard cache entry.
diff --git a/read-cache.c b/read-cache.c
index 8ed1c29b54..eeb122cca4 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -811,8 +811,10 @@ struct cache_entry *make_empty_cache_entry(struct index_state *istate, size_t le
 	return mem_pool__ce_calloc(find_mem_pool(istate), len);
 }
 
-struct cache_entry *make_empty_transient_cache_entry(size_t len)
+struct cache_entry *make_empty_transient_cache_entry(size_t len, struct mem_pool *mp)
 {
+	if (mp)
+		return mem_pool__ce_calloc(mp, len);
 	return xcalloc(1, cache_entry_size(len));
 }
 
@@ -846,8 +848,10 @@ struct cache_entry *make_cache_entry(struct index_state *istate,
 	return ret;
 }
 
-struct cache_entry *make_transient_cache_entry(unsigned int mode, const struct object_id *oid,
-					       const char *path, int stage)
+struct cache_entry *make_transient_cache_entry(unsigned int mode,
+					       const struct object_id *oid,
+					       const char *path, int stage,
+					       struct mem_pool *mp)
 {
 	struct cache_entry *ce;
 	int len;
@@ -858,7 +862,7 @@ struct cache_entry *make_transient_cache_entry(unsigned int mode, const struct o
 	}
 
 	len = strlen(path);
-	ce = make_empty_transient_cache_entry(len);
+	ce = make_empty_transient_cache_entry(len, mp);
 
 	oidcpy(&ce->oid, oid);
 	memcpy(ce->name, path, len);
diff --git a/unpack-trees.c b/unpack-trees.c
index e05e6ceff2..dcb40dc8fa 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1031,7 +1031,7 @@ static struct cache_entry *create_ce_entry(const struct traverse_info *info,
 	size_t len = traverse_path_len(info, tree_entry_len(n));
 	struct cache_entry *ce =
 		is_transient ?
-		make_empty_transient_cache_entry(len) :
+		make_empty_transient_cache_entry(len, NULL) :
 		make_empty_cache_entry(istate, len);
 
 	ce->ce_mode = create_ce_mode(n->mode);
-- 
2.27.0


^ permalink raw reply	[flat|nested] 154+ messages in thread

* [RFC PATCH 15/21] builtin/checkout.c: complete parallel checkout support
  2020-08-10 21:33 [RFC PATCH 00/21] [RFC] Parallel checkout Matheus Tavares
                   ` (13 preceding siblings ...)
  2020-08-10 21:33 ` [RFC PATCH 14/21] make_transient_cache_entry(): optionally alloc from mem_pool Matheus Tavares
@ 2020-08-10 21:33 ` Matheus Tavares
  2020-08-10 21:33 ` [RFC PATCH 16/21] checkout-index: add " Matheus Tavares
                   ` (8 subsequent siblings)
  23 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-08-10 21:33 UTC (permalink / raw)
  To: git
  Cc: stolee, jeffhost, Junio C Hamano, Nguyễn Thái Ngọc Duy

There is one code path in builtin/checkout.c which still doesn't benefit
from parallel checkout because it calls checkout_entry() directly,
instead of unpack_trees(). Let's add parallel support for this missing
spot as well. Note: the transient cache entries allocated in
checkout_merged() are now allocated in a mem_pool which is only
discarded after parallel checkout finishes. This is done because the
entries need to be valid when run_parallel_checkout() is called.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 builtin/checkout.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/builtin/checkout.c b/builtin/checkout.c
index 8e4a3c1df0..b9230d5009 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -27,6 +27,7 @@
 #include "wt-status.h"
 #include "xdiff-interface.h"
 #include "entry.h"
+#include "parallel-checkout.h"
 
 static const char * const checkout_usage[] = {
 	N_("git checkout [<options>] <branch>"),
@@ -230,7 +231,8 @@ static int checkout_stage(int stage, const struct cache_entry *ce, int pos,
 		return error(_("path '%s' does not have their version"), ce->name);
 }
 
-static int checkout_merged(int pos, const struct checkout *state, int *nr_checkouts)
+static int checkout_merged(int pos, const struct checkout *state,
+			   int *nr_checkouts, struct mem_pool *ce_mem_pool)
 {
 	struct cache_entry *ce = active_cache[pos];
 	const char *path = ce->name;
@@ -291,11 +293,10 @@ static int checkout_merged(int pos, const struct checkout *state, int *nr_checko
 	if (write_object_file(result_buf.ptr, result_buf.size, blob_type, &oid))
 		die(_("Unable to add merge result for '%s'"), path);
 	free(result_buf.ptr);
-	ce = make_transient_cache_entry(mode, &oid, path, 2, NULL);
+	ce = make_transient_cache_entry(mode, &oid, path, 2, ce_mem_pool);
 	if (!ce)
 		die(_("make_cache_entry failed for path '%s'"), path);
 	status = checkout_entry(ce, state, NULL, nr_checkouts);
-	discard_cache_entry(ce);
 	return status;
 }
 
@@ -359,16 +360,22 @@ static int checkout_worktree(const struct checkout_opts *opts,
 	int nr_checkouts = 0, nr_unmerged = 0;
 	int errs = 0;
 	int pos;
+	int pc_workers, pc_threshold;
+	struct mem_pool *ce_mem_pool = NULL;
 
 	state.force = 1;
 	state.refresh_cache = 1;
 	state.istate = &the_index;
 
+	mem_pool_init(&ce_mem_pool, 0);
+	get_parallel_checkout_configs(&pc_workers, &pc_threshold);
 	init_checkout_metadata(&state.meta, info->refname,
 			       info->commit ? &info->commit->object.oid : &info->oid,
 			       NULL);
 
 	enable_delayed_checkout(&state);
+	if (pc_workers > 1)
+		init_parallel_checkout();
 	for (pos = 0; pos < active_nr; pos++) {
 		struct cache_entry *ce = active_cache[pos];
 		if (ce->ce_flags & CE_MATCHED) {
@@ -384,10 +391,15 @@ static int checkout_worktree(const struct checkout_opts *opts,
 						       &nr_checkouts, opts->overlay_mode);
 			else if (opts->merge)
 				errs |= checkout_merged(pos, &state,
-							&nr_unmerged);
+							&nr_unmerged,
+							ce_mem_pool);
 			pos = skip_same_name(ce, pos) - 1;
 		}
 	}
+	if (pc_workers > 1)
+		errs |= run_parallel_checkout(&state, pc_workers, pc_threshold,
+					      NULL, NULL);
+	mem_pool_discard(ce_mem_pool, should_validate_cache_entries());
 	remove_marked_cache_entries(&the_index, 1);
 	remove_scheduled_dirs();
 	errs |= finish_delayed_checkout(&state, &nr_checkouts);
-- 
2.27.0


^ permalink raw reply	[flat|nested] 154+ messages in thread

* [RFC PATCH 16/21] checkout-index: add parallel checkout support
  2020-08-10 21:33 [RFC PATCH 00/21] [RFC] Parallel checkout Matheus Tavares
                   ` (14 preceding siblings ...)
  2020-08-10 21:33 ` [RFC PATCH 15/21] builtin/checkout.c: complete parallel checkout support Matheus Tavares
@ 2020-08-10 21:33 ` Matheus Tavares
  2020-08-10 21:33 ` [RFC PATCH 17/21] parallel-checkout: avoid stat() calls in workers Matheus Tavares
                   ` (7 subsequent siblings)
  23 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-08-10 21:33 UTC (permalink / raw)
  To: git
  Cc: stolee, jeffhost, Martin Ågren,
	Nguyễn Thái Ngọc Duy, Jeff King, Junio C Hamano

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 builtin/checkout-index.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/builtin/checkout-index.c b/builtin/checkout-index.c
index 0f1ff73129..33fb933c30 100644
--- a/builtin/checkout-index.c
+++ b/builtin/checkout-index.c
@@ -12,6 +12,7 @@
 #include "cache-tree.h"
 #include "parse-options.h"
 #include "entry.h"
+#include "parallel-checkout.h"
 
 #define CHECKOUT_ALL 4
 static int nul_term_line;
@@ -160,6 +161,7 @@ int cmd_checkout_index(int argc, const char **argv, const char *prefix)
 	int prefix_length;
 	int force = 0, quiet = 0, not_new = 0;
 	int index_opt = 0;
+	int pc_workers, pc_threshold;
 	struct option builtin_checkout_index_options[] = {
 		OPT_BOOL('a', "all", &all,
 			N_("check out all files in the index")),
@@ -214,6 +216,14 @@ int cmd_checkout_index(int argc, const char **argv, const char *prefix)
 		hold_locked_index(&lock_file, LOCK_DIE_ON_ERROR);
 	}
 
+	if (!to_tempfile)
+		get_parallel_checkout_configs(&pc_workers, &pc_threshold);
+	else
+		pc_workers = 1;
+
+	if (pc_workers > 1)
+		init_parallel_checkout();
+
 	/* Check out named files first */
 	for (i = 0; i < argc; i++) {
 		const char *arg = argv[i];
@@ -256,6 +266,12 @@ int cmd_checkout_index(int argc, const char **argv, const char *prefix)
 	if (all)
 		checkout_all(prefix, prefix_length);
 
+	if (pc_workers > 1) {
+		/* Errors were already reported */
+		run_parallel_checkout(&state, pc_workers, pc_threshold,
+				      NULL, NULL);
+	}
+
 	if (is_lock_file_locked(&lock_file) &&
 	    write_locked_index(&the_index, &lock_file, COMMIT_LOCK))
 		die("Unable to write new index file");
-- 
2.27.0


^ permalink raw reply	[flat|nested] 154+ messages in thread

* [RFC PATCH 17/21] parallel-checkout: avoid stat() calls in workers
  2020-08-10 21:33 [RFC PATCH 00/21] [RFC] Parallel checkout Matheus Tavares
                   ` (15 preceding siblings ...)
  2020-08-10 21:33 ` [RFC PATCH 16/21] checkout-index: add " Matheus Tavares
@ 2020-08-10 21:33 ` Matheus Tavares
  2020-08-10 21:33 ` [RFC PATCH 18/21] entry: use is_dir_sep() when checking leading dirs Matheus Tavares
                   ` (6 subsequent siblings)
  23 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-08-10 21:33 UTC (permalink / raw)
  To: git
  Cc: stolee, jeffhost, Johannes Schindelin,
	Nguyễn Thái Ngọc Duy, Elijah Newren,
	Junio C Hamano

The current parallel checkout implementation requires the workers to
stat() the path components of each entry before writing, to make sure
they are all real directories and not symlinks or something else. The
stat() info is cached, so this procedure should not be so bad
performance-wise. But the exact same check is already done by the main
process, before enqueueing the entries for parallel checkout, to remove
files that were in the way and create the leading dirs. The reason we
still need the second check is that, in case of path collisions, a
symlink X could be created after an entry x/f was enqueued, leading the
parallel worker to wrongly create the file at X/f. If we postpone the
symlinks' checkouts, though, we can avoid the need for these stat() calls
in the workers. Other types of path collisions are still possible, such
as a regular file X being written before the worker tries to write x/f.
But that's OK, since the parallel checkout machinery will check the
return value of open() to detect such collisions (which would not be possible
for the symlink case, as open() would succeed).

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 entry.c             | 10 +++++++
 parallel-checkout.c | 71 ++++++++++++++++++++++++++++-----------------
 parallel-checkout.h |  8 +++++
 unpack-trees.c      |  4 ++-
 4 files changed, 65 insertions(+), 28 deletions(-)

diff --git a/entry.c b/entry.c
index b6c808dffa..6208df23df 100644
--- a/entry.c
+++ b/entry.c
@@ -477,6 +477,16 @@ int checkout_entry_ca(struct cache_entry *ce, struct conv_attrs *ca,
 		return write_entry(ce, topath, ca, state, 1);
 	}
 
+	/*
+	 * If a regular file x/f is queued for parallel checkout and a symlink
+	 * X is created now, the worker could wrongly create the file at X/f
+	 * due to path collision. Thus, symlinks are only created after
+	 * parallel-eligible entries.
+	 */
+	if (parallel_checkout_status() == PC_ACCEPTING_ENTRIES &&
+	    S_ISLNK(ce->ce_mode)) {
+		enqueue_symlink_checkout(ce, nr_checkouts);
+		return 0;
+	}
+
 	strbuf_reset(&path);
 	strbuf_add(&path, state->base_dir, state->base_dir_len);
 	strbuf_add(&path, ce->name, ce_namelen(ce));
diff --git a/parallel-checkout.c b/parallel-checkout.c
index 78bf2de5ea..fee93460c1 100644
--- a/parallel-checkout.c
+++ b/parallel-checkout.c
@@ -140,6 +140,44 @@ static void advance_progress_meter(void)
 	}
 }
 
+struct symlink_checkout_item {
+	struct cache_entry *ce;
+	int *nr_checkouts;
+};
+
+static struct symlink_checkout_item *symlink_queue = NULL;
+static size_t symlink_queue_nr = 0, symlink_queue_alloc = 0;
+
+void enqueue_symlink_checkout(struct cache_entry *ce, int *nr_checkouts)
+{
+	assert(S_ISLNK(ce->ce_mode));
+	ALLOC_GROW(symlink_queue, symlink_queue_nr + 1, symlink_queue_alloc);
+	symlink_queue[symlink_queue_nr].ce = ce;
+	symlink_queue[symlink_queue_nr].nr_checkouts = nr_checkouts;
+	symlink_queue_nr++;
+}
+
+size_t symlink_queue_size(void)
+{
+	return symlink_queue_nr;
+}
+
+static int checkout_symlink_queue(struct checkout *state)
+{
+	size_t i;
+	int ret = 0;
+
+	for (i = 0; i < symlink_queue_nr; ++i) {
+		struct symlink_checkout_item *sci = &symlink_queue[i];
+		ret |= checkout_entry(sci->ce, state, NULL, sci->nr_checkouts);
+		advance_progress_meter();
+	}
+
+	FREE_AND_NULL(symlink_queue);
+	symlink_queue_nr = symlink_queue_alloc = 0;
+	return ret;
+}
+
 static int handle_results(struct checkout *state)
 {
 	int ret = 0;
@@ -257,16 +295,6 @@ static int close_and_clear(int *fd)
 	return ret;
 }
 
-static int check_leading_dirs(const char *path, int len, int prefix_len)
-{
-	const char *slash = path + len;
-
-	while (slash > path && *slash != '/')
-		slash--;
-
-	return has_dirs_only_path(path, slash - path, prefix_len);
-}
-
 void write_checkout_item(struct checkout *state, struct checkout_item *ci)
 {
 	unsigned int mode = (ci->ce->ce_mode & 0100) ? 0777 : 0666;
@@ -276,27 +304,15 @@ void write_checkout_item(struct checkout *state, struct checkout_item *ci)
 	strbuf_add(&path, state->base_dir, state->base_dir_len);
 	strbuf_add(&path, ci->ce->name, ci->ce->ce_namelen);
 
-	/*
-	 * At this point, leading dirs should have already been created. But if
-	 * a symlink being checked out has collided with one of the dirs, due to
-	 * file system folding rules, it's possible that the dirs are no longer
-	 * present. So we have to check again, and report any path collisions.
-	 */
-	if (!check_leading_dirs(path.buf, path.len, state->base_dir_len)) {
-		ci->status = CI_RETRY;
-		goto out;
-	}
-
 	fd = open(path.buf, O_WRONLY | O_CREAT | O_EXCL, mode);
 
 	if (fd < 0) {
-		if (errno == EEXIST || errno == EISDIR) {
+		if (errno == EEXIST || errno == EISDIR || errno == ENOENT ||
+		    errno == ENOTDIR) {
 			/*
 			 * Errors which probably represent a path collision.
 			 * Suppress the error message and mark the ci to be
-			 * retried later, sequentially. ENOTDIR and ENOENT are
-			 * also interesting, but check_leading_dirs() should
-			 * have already caught these cases.
+			 * retried later, sequentially.
 			 */
 			ci->status = CI_RETRY;
 		} else {
@@ -523,7 +539,7 @@ static int run_checkout_sequentially(struct checkout *state)
 		if (ci->status != CI_RETRY)
 			advance_progress_meter();
 	}
-	return handle_results(state);
+	return handle_results(state) | checkout_symlink_queue(state);
 }
 
 int run_parallel_checkout(struct checkout *state, int num_workers, int threshold,
@@ -553,7 +569,8 @@ int run_parallel_checkout(struct checkout *state, int num_workers, int threshold
 	workers = setup_workers(state, num_workers);
 	gather_results_from_workers(workers, num_workers);
 	finish_workers(workers, num_workers);
-	ret = handle_results(state);
+	ret |= handle_results(state);
+	ret |= checkout_symlink_queue(state);
 
 done:
 	finish_parallel_checkout();
diff --git a/parallel-checkout.h b/parallel-checkout.h
index 2b81a5db6c..a4f7e5b7bd 100644
--- a/parallel-checkout.h
+++ b/parallel-checkout.h
@@ -29,6 +29,14 @@ void get_parallel_checkout_configs(int *num_workers, int *threshold);
 int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca);
 size_t pc_queue_size(void);
 
+/*
+ * Enqueues a symlink to be checked out *sequentially* after the parallel
+ * checkout finishes. This is done to avoid path collisions with leading dirs,
+ * which could make parallel workers write a file to the wrong place.
+ */
+void enqueue_symlink_checkout(struct cache_entry *ce, int *nr_checkouts);
+size_t symlink_queue_size(void);
+
 /*
  * Write all the queued entries, returning 0 on success. If the number of
  * entries is below the specified threshold, the operation is performed
diff --git a/unpack-trees.c b/unpack-trees.c
index dcb40dc8fa..01928d3d65 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -472,6 +472,7 @@ static int check_updates(struct unpack_trees_options *o,
 
 		if (ce->ce_flags & CE_UPDATE) {
 			size_t last_pc_queue_size = pc_queue_size();
+			size_t last_symlink_queue_size = symlink_queue_size();
 
 			if (ce->ce_flags & CE_WT_REMOVE)
 				BUG("both update and delete flags are set on %s",
@@ -479,7 +480,8 @@ static int check_updates(struct unpack_trees_options *o,
 			ce->ce_flags &= ~CE_UPDATE;
 			errs |= checkout_entry(ce, &state, NULL, NULL);
 
-			if (last_pc_queue_size == pc_queue_size())
+			if (last_pc_queue_size == pc_queue_size() &&
+			    last_symlink_queue_size == symlink_queue_size())
 				display_progress(progress, ++cnt);
 		}
 	}
-- 
2.27.0


^ permalink raw reply	[flat|nested] 154+ messages in thread

* [RFC PATCH 18/21] entry: use is_dir_sep() when checking leading dirs
  2020-08-10 21:33 [RFC PATCH 00/21] [RFC] Parallel checkout Matheus Tavares
                   ` (16 preceding siblings ...)
  2020-08-10 21:33 ` [RFC PATCH 17/21] parallel-checkout: avoid stat() calls in workers Matheus Tavares
@ 2020-08-10 21:33 ` Matheus Tavares
  2020-08-10 21:33 ` [RFC PATCH 19/21] symlinks: make has_dirs_only_path() track FL_NOENT Matheus Tavares
                   ` (5 subsequent siblings)
  23 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-08-10 21:33 UTC (permalink / raw)
  To: git; +Cc: stolee, jeffhost, Junio C Hamano

The test 'prevent git~1 squatting on Windows' in t7415 adds the file
'd./a/x' and the submodule 'd\a' to the index, with `git -c
core.protectNTFS=false update-index --add --cacheinfo`. Then it performs
a clone with `--recurse-submodules`. Since "d./" and "d\" represent the
same entry on NTFS, the operation is expected to fail, because the
submodule directory is not empty by the time "d\a" is cloned. With
parallel checkout, this condition is still valid: although we call
checkout_entry() for gitlinks before we write regular files (which are
delayed for later parallel write), the actual submodule cloning only
happens after unpack_trees() returns, in builtin/clone.c:checkout().

Note, however, that we do create the submodule directory (and leading
directories) in unpack_trees(). But the current code iterates through
path components only considering "/", not "\", which is also valid on
Windows. The reason we don't fail to create the leading dir "d" for the
gitlink "d\a" is because, by the time we call mkdir("d\a"), "d" was
already created for the regular file 'd./a/x'. Again, this is still true
for parallel checkout, since we create leading dirs sequentially, even
for entries that are delayed for later writing. But in a following patch,
we will allow checkout workers to create the leading directories in
parallel, for better performance. Therefore, when checkout_entry() is
called for the gitlink "d\a", "d" won't be present yet, and mkdir("d\a")
will fail with ENOENT. To solve this, in preparation for the said patch,
let's use is_dir_sep() when checking path components, so that
checkout_entry() can correctly create "d" for the gitlink "d\a".

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---

I'm not sure if this is the right way to make t7415 work with
parallel-checkout; or if we should, perhaps, change the test to add the
submodule at 'a/d'. I'd love if someone more familiar with Windows could
review this one.


 entry.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/entry.c b/entry.c
index 6208df23df..19f2c1d132 100644
--- a/entry.c
+++ b/entry.c
@@ -19,7 +19,7 @@ static void create_directories(const char *path, int path_len,
 		do {
 			buf[len] = path[len];
 			len++;
-		} while (len < path_len && path[len] != '/');
+		} while (len < path_len && !is_dir_sep(path[len]));
 		if (len >= path_len)
 			break;
 		buf[len] = 0;
@@ -404,7 +404,7 @@ static int check_path(const char *path, int len, struct stat *st, int skiplen)
 {
 	const char *slash = path + len;
 
-	while (path < slash && *slash != '/')
+	while (path < slash && !is_dir_sep(*slash))
 		slash--;
 	if (!has_dirs_only_path(path, slash - path, skiplen)) {
 		errno = ENOENT;
-- 
2.27.0


^ permalink raw reply	[flat|nested] 154+ messages in thread

* [RFC PATCH 19/21] symlinks: make has_dirs_only_path() track FL_NOENT
  2020-08-10 21:33 [RFC PATCH 00/21] [RFC] Parallel checkout Matheus Tavares
                   ` (17 preceding siblings ...)
  2020-08-10 21:33 ` [RFC PATCH 18/21] entry: use is_dir_sep() when checking leading dirs Matheus Tavares
@ 2020-08-10 21:33 ` Matheus Tavares
  2020-08-10 21:33 ` [RFC PATCH 20/21] parallel-checkout: create leading dirs in workers Matheus Tavares
                   ` (4 subsequent siblings)
  23 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-08-10 21:33 UTC (permalink / raw)
  To: git
  Cc: stolee, jeffhost, Denton Liu,
	Nguyễn Thái Ngọc Duy, Jeff King, Junio C Hamano

In the next patch, the parallel-checkout workers will be able to create
the leading directories of the entries being written by themselves, to
increase performance. But to do so, the main process will first need to
remove non-directory files that can potentially be in the way (the
reasoning is discussed in the next patch). This can be done without much
cost by calling has_dirs_only_path() for each path component until we
find the first one which is not a real directory, which should then be
removed. This operation is cheap because it doesn't have to call stat()
again for each component, as the information is already cached from the
previous call at entry.c:check_path().

However, when has_dirs_only_path() returns false, we don't know if the
component doesn't exist or if it exists as another file type. The best
we could do in this case would be to stat() the component again. When
there are many files to be checked out inside the same directory (yet
to be created by a worker), we would have to call stat() for the same
directory once for each path, even though there is nothing to be
unlinked there. We can skip these stat() calls by making
has_dirs_only_path() also ask for FL_NOENT caching and converting its
return value to a tri-state.

Note: since we are now caching FL_NOENT, we also need to manually
invalidate the cache when we create a directory in a path previously
cached as non-existent.

While we are here, also remove duplicated comments in
has_dirs_only_path() and check_leading_path().

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 cache.h             |  1 +
 entry.c             | 11 +++++++++--
 parallel-checkout.c |  3 +++
 symlinks.c          | 42 ++++++++++++++++++------------------------
 4 files changed, 31 insertions(+), 26 deletions(-)

diff --git a/cache.h b/cache.h
index e2b41c5f8b..7a08cd6f0e 100644
--- a/cache.h
+++ b/cache.h
@@ -1711,6 +1711,7 @@ int has_symlink_leading_path(const char *name, int len);
 int threaded_has_symlink_leading_path(struct cache_def *, const char *, int);
 int check_leading_path(const char *name, int len);
 int has_dirs_only_path(const char *name, int len, int prefix_len);
+void reset_default_lstat_cache(void);
 void schedule_dir_for_removal(const char *name, int len);
 void remove_scheduled_dirs(void);
 
diff --git a/entry.c b/entry.c
index 19f2c1d132..e876adff19 100644
--- a/entry.c
+++ b/entry.c
@@ -14,6 +14,7 @@ static void create_directories(const char *path, int path_len,
 {
 	char *buf = xmallocz(path_len);
 	int len = 0;
+	int reset_cache = 0;
 
 	while (len < path_len) {
 		do {
@@ -31,7 +32,7 @@ static void create_directories(const char *path, int path_len,
 		 * we test the path components of the prefix with the
 		 * stat() function instead of the lstat() function.
 		 */
-		if (has_dirs_only_path(buf, len, state->base_dir_len))
+		if (has_dirs_only_path(buf, len, state->base_dir_len) > 0)
 			continue; /* ok, it is already a directory. */
 
 		/*
@@ -45,8 +46,14 @@ static void create_directories(const char *path, int path_len,
 			    !unlink_or_warn(buf) && !mkdir(buf, 0777))
 				continue;
 			die_errno("cannot create directory at '%s'", buf);
+		} else {
+			/* The cache had FL_NOENT, but we now created a dir */
+			reset_cache = 1;
 		}
 	}
+
+	if (reset_cache)
+		reset_default_lstat_cache();
 	free(buf);
 }
 
@@ -406,7 +413,7 @@ static int check_path(const char *path, int len, struct stat *st, int skiplen)
 
 	while (path < slash && !is_dir_sep(*slash))
 		slash--;
-	if (!has_dirs_only_path(path, slash - path, skiplen)) {
+	if (has_dirs_only_path(path, slash - path, skiplen) <= 0) {
 		errno = ENOENT;
 		return -1;
 	}
diff --git a/parallel-checkout.c b/parallel-checkout.c
index fee93460c1..4d72540256 100644
--- a/parallel-checkout.c
+++ b/parallel-checkout.c
@@ -185,6 +185,9 @@ static int handle_results(struct checkout *state)
 
 	pc_status = PC_HANDLING_RESULTS;
 
+	/* Workers might have created dirs, so the cache must be invalidated */
+	reset_default_lstat_cache();
+
 	for (i = 0; i < parallel_checkout->nr; ++i) {
 		struct checkout_item *ci = &parallel_checkout->items[i];
 		struct stat *st = &ci->st;
diff --git a/symlinks.c b/symlinks.c
index 69d458a24d..3adf6ef8a1 100644
--- a/symlinks.c
+++ b/symlinks.c
@@ -47,6 +47,11 @@ static inline void reset_lstat_cache(struct cache_def *cache)
 	 */
 }
 
+void reset_default_lstat_cache(void)
+{
+	reset_lstat_cache(&default_cache);
+}
+
 #define FL_DIR      (1 << 0)
 #define FL_NOENT    (1 << 1)
 #define FL_SYMLINK  (1 << 2)
@@ -210,15 +215,6 @@ int has_symlink_leading_path(const char *name, int len)
 	return threaded_has_symlink_leading_path(&default_cache, name, len);
 }
 
-/*
- * Return zero if path 'name' has a leading symlink component or
- * if some leading path component does not exists.
- *
- * Return -1 if leading path exists and is a directory.
- *
- * Return path length if leading path exists and is neither a
- * directory nor a symlink.
- */
 int check_leading_path(const char *name, int len)
 {
 	return threaded_check_leading_path(&default_cache, name, len);
@@ -246,30 +242,28 @@ static int threaded_check_leading_path(struct cache_def *cache, const char *name
 		return match_len;
 }
 
-/*
- * Return non-zero if all path components of 'name' exists as a
- * directory.  If prefix_len > 0, we will test with the stat()
- * function instead of the lstat() function for a prefix length of
- * 'prefix_len', thus we then allow for symlinks in the prefix part as
- * long as those points to real existing directories.
- */
 int has_dirs_only_path(const char *name, int len, int prefix_len)
 {
 	return threaded_has_dirs_only_path(&default_cache, name, len, prefix_len);
 }
 
 /*
- * Return non-zero if all path components of 'name' exists as a
- * directory.  If prefix_len > 0, we will test with the stat()
- * function instead of the lstat() function for a prefix length of
- * 'prefix_len', thus we then allow for symlinks in the prefix part as
- * long as those points to real existing directories.
+ * Return a positive number if all path components of 'name' exist as
+ * directories, a negative number if a component does not exist, and 0 otherwise
+ * (e.g. a component exists but as another file type). If prefix_len > 0, we
+ * will test with the stat() function instead of the lstat() function for a
+ * prefix length of 'prefix_len', thus we return +1 for symlinks in the prefix
+ * part as long as those points to real existing directories.
  */
 static int threaded_has_dirs_only_path(struct cache_def *cache, const char *name, int len, int prefix_len)
 {
-	return lstat_cache(cache, name, len,
-			   FL_DIR|FL_FULLPATH, prefix_len) &
-		FL_DIR;
+	int flags = lstat_cache(cache, name, len,
+				FL_NOENT|FL_DIR|FL_FULLPATH, prefix_len);
+	if (flags & FL_DIR)
+		return 1;
+	if (flags & FL_NOENT)
+		return -1;
+	return 0;
 }
 
 static struct strbuf removal = STRBUF_INIT;
-- 
2.27.0


^ permalink raw reply	[flat|nested] 154+ messages in thread

* [RFC PATCH 20/21] parallel-checkout: create leading dirs in workers
  2020-08-10 21:33 [RFC PATCH 00/21] [RFC] Parallel checkout Matheus Tavares
                   ` (18 preceding siblings ...)
  2020-08-10 21:33 ` [RFC PATCH 19/21] symlinks: make has_dirs_only_path() track FL_NOENT Matheus Tavares
@ 2020-08-10 21:33 ` Matheus Tavares
  2020-08-10 21:33 ` [RFC PATCH 21/21] parallel-checkout: skip checking the working tree on clone Matheus Tavares
                   ` (3 subsequent siblings)
  23 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-08-10 21:33 UTC (permalink / raw)
  To: git
  Cc: stolee, jeffhost, Junio C Hamano,
	Nguyễn Thái Ngọc Duy, Thomas Gummerer

Allow the parallel workers to create the leading directories of the
entries being checked out, instead of pre-creating them in the main
process. This optimization should be more effective on file systems with
higher I/O latency.

Part of the process of creating leading dirs is the removal of any
non-directory file that could be in the way. This is currently done
inside entry.c:create_directories(). However, if we were to move this to
the workers as well, we would risk removing a file just written by
another worker, which collided with the one currently being written.  In
a worse scenario, we could remove the file right after a worker has
closed it but before it called stat(). To avoid these problems, let's
remove the non-directory files in the main process. And to avoid the
cost of extra lstat() calls in this process, we use
has_dirs_only_path(), which will have the necessary information already
cached from check_path().

Finally, to create the leading dirs in the workers, we could re-use
create_directories(). But, unlike the main process, we wouldn't have the
stat() information cached. Thus, let's use raceproof_create_file(),
which will only stat() the path components after an open() failure,
saving us time when creating subsequent files in the same directory.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 entry.c             | 45 ++++++++++++++++++++++++++++++++++++++++++---
 parallel-checkout.c | 42 ++++++++++++++++++++++++++++++++++++------
 2 files changed, 78 insertions(+), 9 deletions(-)

diff --git a/entry.c b/entry.c
index e876adff19..5dfd4d150d 100644
--- a/entry.c
+++ b/entry.c
@@ -57,6 +57,43 @@ static void create_directories(const char *path, int path_len,
 	free(buf);
 }
 
+static void remove_non_dirs(const char *path, int path_len,
+			    const struct checkout *state)
+{
+	char *buf = xmallocz(path_len);
+	int len = 0;
+
+	while (len < path_len) {
+		int ret;
+
+		do {
+			buf[len] = path[len];
+			len++;
+		} while (len < path_len && !is_dir_sep(path[len]));
+		if (len >= path_len)
+			break;
+		buf[len] = 0;
+
+		ret = has_dirs_only_path(buf, len, state->base_dir_len);
+
+		if (ret > 0)
+			continue; /* Is directory. */
+		if (ret < 0)
+			break; /* No entry */
+
+		/* ret == 0: not a directory, let's unlink it. */
+
+		if (!state->force)
+			die("'%s' already exists, and it's not a directory", buf);
+
+		if (unlink(buf))
+			die_errno("cannot unlink '%s'", buf);
+		else
+			break;
+	}
+	free(buf);
+}
+
 static void remove_subtree(struct strbuf *path)
 {
 	DIR *dir = opendir(path->buf);
@@ -555,8 +592,6 @@ int checkout_entry_ca(struct cache_entry *ce, struct conv_attrs *ca,
 	} else if (state->not_new)
 		return 0;
 
-	create_directories(path.buf, path.len, state);
-
 	if (nr_checkouts)
 		(*nr_checkouts)++;
 
@@ -565,9 +600,13 @@ int checkout_entry_ca(struct cache_entry *ce, struct conv_attrs *ca,
 		ca = &ca_buf;
 	}
 
-	if (!enqueue_checkout(ce, ca))
+	if (!enqueue_checkout(ce, ca)) {
+		/* "clean" path so that workers can create leading dirs */
+		remove_non_dirs(path.buf, path.len, state);
 		return 0;
+	}
 
+	create_directories(path.buf, path.len, state);
 	return write_entry(ce, path.buf, ca, state, 0);
 }
 
diff --git a/parallel-checkout.c b/parallel-checkout.c
index 4d72540256..5b73d8fa4b 100644
--- a/parallel-checkout.c
+++ b/parallel-checkout.c
@@ -298,20 +298,48 @@ static int close_and_clear(int *fd)
 	return ret;
 }
 
+struct ci_open_data {
+	int fd;
+	unsigned int mode;
+};
+
+static int ci_open(const char *path, void *cb)
+{
+	struct ci_open_data *data = cb;
+	data->fd = open(path, O_WRONLY | O_CREAT | O_EXCL, data->mode);
+
+	if (data->fd < 0) {
+		/*
+		 * EISDIR can only indicate path collisions among the entries
+		 * being checked out. We don't need raceproof_create_file() to
+		 * try removing empty dirs. Instead, just let the caller know
+		 * that the path already exists, so that the collision can be
+		 * properly handled later.
+		 */
+		if (errno == EISDIR)
+			errno = EEXIST;
+		return 1;
+	}
+
+	return 0;
+}
+
 void write_checkout_item(struct checkout *state, struct checkout_item *ci)
 {
-	unsigned int mode = (ci->ce->ce_mode & 0100) ? 0777 : 0666;
+	struct ci_open_data open_data;
 	int fd = -1, fstat_done = 0;
 	struct strbuf path = STRBUF_INIT;
 
+	open_data.mode = (ci->ce->ce_mode & 0100) ? 0777 : 0666;
 	strbuf_add(&path, state->base_dir, state->base_dir_len);
 	strbuf_add(&path, ci->ce->name, ci->ce->ce_namelen);
 
-	fd = open(path.buf, O_WRONLY | O_CREAT | O_EXCL, mode);
-
-	if (fd < 0) {
-		if (errno == EEXIST || errno == EISDIR || errno == ENOENT ||
-		    errno == ENOTDIR) {
+	/*
+	 * The main process already removed any non-directory file that was in
+	 * the way. So if we find one, it's a path collision.
+	 */
+	if (raceproof_create_file(path.buf, ci_open, &open_data)) {
+		if (errno == EEXIST || errno == ENOTDIR || errno == ENOENT) {
 			/*
 			 * Errors which probably represent a path collision.
 			 * Suppress the error message and mark the ci to be
@@ -325,6 +353,8 @@ void write_checkout_item(struct checkout *state, struct checkout_item *ci)
 		goto out;
 	}
 
+	fd = open_data.fd;
+
 	if (write_checkout_item_to_fd(fd, state, ci, path.buf)) {
 		/* Error was already reported. */
 		ci->status = CI_FAILED;
-- 
2.27.0


^ permalink raw reply	[flat|nested] 154+ messages in thread

* [RFC PATCH 21/21] parallel-checkout: skip checking the working tree on clone
  2020-08-10 21:33 [RFC PATCH 00/21] [RFC] Parallel checkout Matheus Tavares
                   ` (19 preceding siblings ...)
  2020-08-10 21:33 ` [RFC PATCH 20/21] parallel-checkout: create leading dirs in workers Matheus Tavares
@ 2020-08-10 21:33 ` Matheus Tavares
  2020-08-12 16:57 ` [RFC PATCH 00/21] [RFC] Parallel checkout Jeff Hostetler
                   ` (2 subsequent siblings)
  23 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-08-10 21:33 UTC (permalink / raw)
  To: git; +Cc: stolee, jeffhost, Junio C Hamano, Thomas Gummerer

If the current checkout process is part of a clone, we can skip some
steps that check paths in the working tree, as we know it was previously
empty. More specifically, we can enqueue the entry for parallel checkout
before calling check_path() to see if the path was already present and
up-to-date. We can also skip calling remove_non_dirs().

Note: this optimization is only possible because the parallel checkout
machinery will detect path collisions, and call checkout_entry_ca()
again for them, going through the check_path() logic.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 entry.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/entry.c b/entry.c
index 5dfd4d150d..8c03e23811 100644
--- a/entry.c
+++ b/entry.c
@@ -513,12 +513,24 @@ int checkout_entry_ca(struct cache_entry *ce, struct conv_attrs *ca,
 		return 0;
 	}
 
-	if (topath) {
+	if (topath || state->clone) {
 		if (S_ISREG(ce->ce_mode) && !ca) {
 			convert_attrs(state->istate, &ca_buf, ce->name);
 			ca = &ca_buf;
 		}
-		return write_entry(ce, topath, ca, state, 1);
+		if (topath)
+			return write_entry(ce, topath, ca, state, 1);
+		/*
+		 * Since we are cloning, there should be no previous files in
+		 * the working tree. So we can skip calling remove_non_dirs()
+		 * and check_path(). (parallel-checkout.c will take care of path
+		 * collisions.)
+		 */
+		if (!enqueue_checkout(ce, ca)) {
+			if (nr_checkouts)
+				(*nr_checkouts)++;
+			return 0;
+		}
 	}
 
 	/*
-- 
2.27.0


^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH 00/21] [RFC] Parallel checkout
  2020-08-10 21:33 [RFC PATCH 00/21] [RFC] Parallel checkout Matheus Tavares
                   ` (20 preceding siblings ...)
  2020-08-10 21:33 ` [RFC PATCH 21/21] parallel-checkout: skip checking the working tree on clone Matheus Tavares
@ 2020-08-12 16:57 ` Jeff Hostetler
  2020-09-22 22:49 ` [PATCH v2 00/19] Parallel Checkout (part I) Matheus Tavares
  2020-10-01 16:42 ` [RFC PATCH 00/21] [RFC] Parallel checkout Jeff Hostetler
  23 siblings, 0 replies; 154+ messages in thread
From: Jeff Hostetler @ 2020-08-12 16:57 UTC (permalink / raw)
  To: Matheus Tavares, git; +Cc: stolee, jeffhost



On 8/10/20 5:33 PM, Matheus Tavares wrote:
> This series adds parallel workers to the checkout machinery. The cache
> entries are distributed among helper processes which are responsible for
> reading, filtering and writing the blobs to the working tree. This
> should benefit all commands that call unpack_trees() or check_updates(),
> such as: checkout, clone, sparse-checkout, checkout-index, etc.
> 
> This proposal is based on two previous ones, by Duy [1] and Jeff [2]. It
> uses some of the patches from these two series, with additional changes.
> The final parallel version was benchmarked during three operations with
> cold cache in the linux repo: cloning v5.8, checking out v5.8 from
> v2.6.15 and checking out v5.8 from v5.7. The three tables below show the
> mean run times and standard deviations for 5 runs in: a local file
> system, a Linux NFS server and Amazon EFS. The number of workers was
> chosen based on what produces the best result for each case.
> 
 > ...
> 
> The first 4 patches come from [2]. I couldn't get in touch with Jeff yet
> and ask for his approval on them, so I didn't include his Signed-off-by,
> for the time being.

This looks like an interesting mixture of our efforts.  Thanks for
picking it up.  I got re-tasked earlier this summer and had to put
it on hold.  I've given it a quick read and like the overall shape.
I still need to give it an in-depth review and run some perf tests
on Windows and on the gigantic Windows and Office repos.

Please feel free to add my sign-off to those commits.

 > ...

Jeff

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH 11/21] parallel-checkout: make it truly parallel
  2020-08-10 21:33 ` [RFC PATCH 11/21] parallel-checkout: make it truly parallel Matheus Tavares
@ 2020-08-19 21:34   ` Jeff Hostetler
  2020-08-20  1:33     ` Matheus Tavares Bernardino
  0 siblings, 1 reply; 154+ messages in thread
From: Jeff Hostetler @ 2020-08-19 21:34 UTC (permalink / raw)
  To: Matheus Tavares, git
  Cc: stolee, jeffhost, Nguyễn Thái Ngọc Duy,
	Paul Tan, Denton Liu, Remi Lespinet, Junio C Hamano



On 8/10/20 5:33 PM, Matheus Tavares wrote:
> Use multiple worker processes to distribute the queued entries and call
> write_checkout_item() in parallel for them. The items are distributed
> uniformly in contiguous chunks. This minimizes the chances of two
> workers writing to the same directory simultaneously, which could
> affect performance due to lock contention in the kernel. Work stealing
> (or any other form of re-distribution) is not implemented yet.
> 
> For now, the number of workers is equal to the number of logical cores
> available. But the next patch will add settings to configure this.
> Distributed file systems, such as NFS and EFS, can benefit from using
> more workers than the actual number of cores (see timings below).
> 
> The parallel version was benchmarked during three operations in the
> linux repo, with cold cache: cloning v5.8, checking out v5.8 from
> v2.6.15 (checkout I) and checking out v5.8 from v5.7 (checkout II). The
> three tables below show the mean run times and standard deviations for
> 5 runs in a local file system, a Linux NFS server and Amazon EFS. The
> numbers of workers were chosen based on what produces the best result
> for each case.
> 
> Local:
> 
>              Clone                  Checkout I             Checkout II
> Sequential  8.180 s ± 0.021 s      6.936 s ± 0.030 s      2.585 s ± 0.005 s
> 10 workers  3.633 s ± 0.040 s      2.288 s ± 0.026 s      1.058 s ± 0.015 s
> Speedup     2.25 ± 0.03            3.03 ± 0.04            2.44 ± 0.03
> 
> Linux NFS server (v4.1, on EBS, single availability zone):
> 
>              Clone                  Checkout I             Checkout II
> Sequential  208.069 s ± 2.522 s    198.610 s ± 1.979 s    54.376 s ± 1.333 s
> 32 workers  67.078 s ±  0.878 s    64.828 s ± 0.387 s     22.993 s ± 0.252 s
> Speedup     3.10 ± 0.06            3.06 ± 0.04            2.36 ± 0.06
> 
> EFS (v4.1, replicated over multiple availability zones):
> 
>              Clone                  Checkout I             Checkout II
> Sequential  1143.655 s ± 11.819 s  1277.891 s ± 10.481 s  396.891 s ± 7.505 s
> 64 workers  173.242 s ± 1.484 s    282.421 s ± 1.521 s    165.424 s ± 9.564 s
> Speedup     6.60 ± 0.09            4.52 ± 0.04            2.40 ± 0.15
> 
> Local tests were executed in an i7-7700HQ (4 cores with hyper-threading)
> running Manjaro Linux, with SSD. NFS and EFS tests were executed in an
> Amazon EC2 c5n.large instance, with 2 vCPUs. The Linux NFS server was
> running on a m6g.large instance with 1 TB, EBS GP2 volume. Before each
> timing, the linux repository was removed (or checked out back), and
> `sync && sysctl vm.drop_caches=3` was executed.
> 
> Co-authored-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> Co-authored-by: Jeff Hostetler <jeffhost@microsoft.com>
> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
> ---
>   .gitignore                 |   1 +
>   Makefile                   |   1 +
>   builtin.h                  |   1 +
>   builtin/checkout--helper.c | 135 +++++++++++++++++++++
>   entry.c                    |  13 +-
>   git.c                      |   2 +
>   parallel-checkout.c        | 237 +++++++++++++++++++++++++++++++------
>   parallel-checkout.h        |  74 +++++++++++-
>   8 files changed, 425 insertions(+), 39 deletions(-)
>   create mode 100644 builtin/checkout--helper.c
> 
> diff --git a/.gitignore b/.gitignore
> index ee509a2ad2..6c01f0a58c 100644
> --- a/.gitignore
> +++ b/.gitignore
> @@ -33,6 +33,7 @@
>   /git-check-mailmap
>   /git-check-ref-format
>   /git-checkout
> +/git-checkout--helper
>   /git-checkout-index
>   /git-cherry
>   /git-cherry-pick
> diff --git a/Makefile b/Makefile
> index caab8e6401..926473d484 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -1049,6 +1049,7 @@ BUILTIN_OBJS += builtin/check-attr.o
>   BUILTIN_OBJS += builtin/check-ignore.o
>   BUILTIN_OBJS += builtin/check-mailmap.o
>   BUILTIN_OBJS += builtin/check-ref-format.o
> +BUILTIN_OBJS += builtin/checkout--helper.o
>   BUILTIN_OBJS += builtin/checkout-index.o
>   BUILTIN_OBJS += builtin/checkout.o
>   BUILTIN_OBJS += builtin/clean.o
> diff --git a/builtin.h b/builtin.h
> index a5ae15bfe5..5790c68750 100644
> --- a/builtin.h
> +++ b/builtin.h
> @@ -122,6 +122,7 @@ int cmd_branch(int argc, const char **argv, const char *prefix);
>   int cmd_bundle(int argc, const char **argv, const char *prefix);
>   int cmd_cat_file(int argc, const char **argv, const char *prefix);
>   int cmd_checkout(int argc, const char **argv, const char *prefix);
> +int cmd_checkout__helper(int argc, const char **argv, const char *prefix);
>   int cmd_checkout_index(int argc, const char **argv, const char *prefix);
>   int cmd_check_attr(int argc, const char **argv, const char *prefix);
>   int cmd_check_ignore(int argc, const char **argv, const char *prefix);
> diff --git a/builtin/checkout--helper.c b/builtin/checkout--helper.c
> new file mode 100644
> index 0000000000..269cf02feb
> --- /dev/null
> +++ b/builtin/checkout--helper.c
> @@ -0,0 +1,135 @@
> +#include "builtin.h"
> +#include "config.h"
> +#include "entry.h"
> +#include "parallel-checkout.h"
> +#include "parse-options.h"
> +#include "pkt-line.h"
> +
> +static void packet_to_ci(char *line, int len, struct checkout_item *ci)
> +{
> +	struct ci_fixed_portion *fixed_portion;
> +	char *encoding, *variant;
> +
> +	if (len < sizeof(struct ci_fixed_portion))
> +		BUG("checkout worker received too short item (got %d, exp %d)",
> +		    len, (int)sizeof(struct ci_fixed_portion));
> +
> +	fixed_portion = (struct ci_fixed_portion *)line;
> +
> +	if (len - sizeof(struct ci_fixed_portion) !=
> +		fixed_portion->name_len + fixed_portion->working_tree_encoding_len)
> +		BUG("checkout worker received corrupted item");
> +
> +	variant = line + sizeof(struct ci_fixed_portion);
> +	if (fixed_portion->working_tree_encoding_len) {
> +		encoding = xmemdupz(variant,
> +				    fixed_portion->working_tree_encoding_len);
> +		variant += fixed_portion->working_tree_encoding_len;
> +	} else {
> +		encoding = NULL;
> +	}
> +
> +	memset(ci, 0, sizeof(*ci));
> +	ci->ce = make_empty_transient_cache_entry(fixed_portion->name_len);
> +	ci->ce->ce_namelen = fixed_portion->name_len;
> +	ci->ce->ce_mode = fixed_portion->ce_mode;
> +	memcpy(ci->ce->name, variant, ci->ce->ce_namelen);
> +	oidcpy(&ci->ce->oid, &fixed_portion->oid);
> +
> +	ci->id = fixed_portion->id;
> +	ci->ca.attr_action = fixed_portion->attr_action;
> +	ci->ca.crlf_action = fixed_portion->crlf_action;
> +	ci->ca.ident = fixed_portion->ident;
> +	ci->ca.working_tree_encoding = encoding;
> +}
> +
> +static void report_result(struct checkout_item *ci)
> +{
> +	struct ci_result res = { 0 };
> +	size_t size;
> +
> +	res.id = ci->id;
> +	res.status = ci->status;
> +
> +	if (ci->status == CI_SUCCESS) {
> +		res.st = ci->st;
> +		size = sizeof(res);
> +	} else {
> +		size = ci_result_base_size();
> +	}
> +
> +	packet_write(1, (const char *)&res, size);
> +}
> +
> +/* Free the worker-side malloced data, but not the ci itself. */
> +static void release_checkout_item_data(struct checkout_item *ci)
> +{
> +	free((char *)ci->ca.working_tree_encoding);
> +	discard_cache_entry(ci->ce);
> +}
> +
> +static void worker_loop(struct checkout *state)
> +{
> +	struct checkout_item *items = NULL;
> +	size_t i, nr = 0, alloc = 0;
> +
> +	while (1) {
> +		int len;
> +		char *line = packet_read_line(0, &len);
> +
> +		if (!line)
> +			break;
> +
> +		ALLOC_GROW(items, nr + 1, alloc);
> +		packet_to_ci(line, len, &items[nr++]);
> +	}
> +
> +	for (i = 0; i < nr; ++i) {
> +		struct checkout_item *ci = &items[i];
> +		write_checkout_item(state, ci);
> +		report_result(ci);
> +		release_checkout_item_data(ci);
> +	}
> +
> +	packet_flush(1);
> +
> +	free(items);
> +}
> +
> +static const char * const checkout_helper_usage[] = {
> +	N_("git checkout--helper [<options>]"),
> +	NULL
> +};
> +
> +int cmd_checkout__helper(int argc, const char **argv, const char *prefix)
> +{
> +	struct checkout state = CHECKOUT_INIT;
> +	struct option checkout_helper_options[] = {
> +		OPT_STRING(0, "prefix", &state.base_dir, N_("string"),
> +			N_("when creating files, prepend <string>")),
> +		OPT_END()
> +	};
> +
> +	if (argc == 2 && !strcmp(argv[1], "-h"))
> +		usage_with_options(checkout_helper_usage,
> +				   checkout_helper_options);
> +
> +	git_config(git_default_config, NULL);
> +	argc = parse_options(argc, argv, prefix, checkout_helper_options,
> +			     checkout_helper_usage, 0);
> +	if (argc > 0)
> +		usage_with_options(checkout_helper_usage, checkout_helper_options);
> +
> +	if (state.base_dir)
> +		state.base_dir_len = strlen(state.base_dir);
> +
> +	/*
> +	 * Setting this on worker won't actually update the index. We just need
> +	 * to pretend so to induce the checkout machinery to stat() the written
> +	 * entries.
> +	 */
> +	state.refresh_cache = 1;
> +
> +	worker_loop(&state);
> +	return 0;
> +}
> diff --git a/entry.c b/entry.c
> index 47c2c20d5a..b6c808dffa 100644
> --- a/entry.c
> +++ b/entry.c
> @@ -427,8 +427,17 @@ static void mark_colliding_entries(const struct checkout *state,
>   	for (i = 0; i < state->istate->cache_nr; i++) {
>   		struct cache_entry *dup = state->istate->cache[i];
>   
> -		if (dup == ce)
> -			break;
> +		if (dup == ce) {
> +			/*
> +			 * Parallel checkout creates the files in a racy order.
> +			 * So the other side of the collision may appear after
> +			 * the given cache_entry in the array.
> +			 */
> +			if (parallel_checkout_status() == PC_HANDLING_RESULTS)
> +				continue;
> +			else
> +				break;
> +		}
>   
>   		if (dup->ce_flags & (CE_MATCHED | CE_VALID | CE_SKIP_WORKTREE))
>   			continue;
> diff --git a/git.c b/git.c
> index 8bd1d7551d..78c7bd412c 100644
> --- a/git.c
> +++ b/git.c
> @@ -486,6 +486,8 @@ static struct cmd_struct commands[] = {
>   	{ "check-mailmap", cmd_check_mailmap, RUN_SETUP },
>   	{ "check-ref-format", cmd_check_ref_format, NO_PARSEOPT  },
>   	{ "checkout", cmd_checkout, RUN_SETUP | NEED_WORK_TREE },
> +	{ "checkout--helper", cmd_checkout__helper,
> +		RUN_SETUP | NEED_WORK_TREE | SUPPORT_SUPER_PREFIX },
>   	{ "checkout-index", cmd_checkout_index,
>   		RUN_SETUP | NEED_WORK_TREE},
>   	{ "cherry", cmd_cherry, RUN_SETUP },
> diff --git a/parallel-checkout.c b/parallel-checkout.c
> index e3b44eeb34..ec42342bc8 100644
> --- a/parallel-checkout.c
> +++ b/parallel-checkout.c
> @@ -1,39 +1,23 @@
>   #include "cache.h"
>   #include "entry.h"
>   #include "parallel-checkout.h"
> +#include "pkt-line.h"
> +#include "run-command.h"
>   #include "streaming.h"
>   
> -enum ci_status {
> -	CI_PENDING = 0,
> -	CI_SUCCESS,
> -	CI_RETRY,
> -	CI_FAILED,
> -};
> -
> -struct checkout_item {
> -	/* pointer to a istate->cache[] entry. Not owned by us. */
> -	struct cache_entry *ce;
> -	struct conv_attrs ca;
> -	struct stat st;
> -	enum ci_status status;
> -};
> -
>   struct parallel_checkout {
>   	struct checkout_item *items;
>   	size_t nr, alloc;
>   };
>   
>   static struct parallel_checkout *parallel_checkout = NULL;
> -
> -enum pc_status {
> -	PC_UNINITIALIZED = 0,
> -	PC_ACCEPTING_ENTRIES,
> -	PC_RUNNING,
> -	PC_HANDLING_RESULTS,
> -};
> -
>   static enum pc_status pc_status = PC_UNINITIALIZED;
>   
> +enum pc_status parallel_checkout_status(void)
> +{
> +	return pc_status;
> +}
> +
>   void init_parallel_checkout(void)
>   {
>   	if (parallel_checkout)
> @@ -113,9 +97,11 @@ int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca)
>   	ALLOC_GROW(parallel_checkout->items, parallel_checkout->nr + 1,
>   		   parallel_checkout->alloc);
>   
> -	ci = &parallel_checkout->items[parallel_checkout->nr++];
> +	ci = &parallel_checkout->items[parallel_checkout->nr];
>   	ci->ce = ce;
>   	memcpy(&ci->ca, ca, sizeof(ci->ca));
> +	ci->id = parallel_checkout->nr;
> +	parallel_checkout->nr++;
>   
>   	return 0;
>   }
> @@ -200,7 +186,8 @@ static int write_checkout_item_to_fd(int fd, struct checkout *state,
>   	/*
>   	 * checkout metadata is used to give context for external process
>   	 * filters. Files requiring such filters are not eligible for parallel
> -	 * checkout, so pass NULL.
> +	 * checkout, so pass NULL. Note: if that changes, the metadata must also
> +	 * be passed from the main process to the workers.
>   	 */
>   	ret = convert_to_working_tree_ca(&ci->ca, ci->ce->name, new_blob, size,
>   					 &buf, NULL);
> @@ -241,14 +228,14 @@ static int check_leading_dirs(const char *path, int len, int prefix_len)
>   	return has_dirs_only_path(path, slash - path, prefix_len);
>   }
>   
> -static void write_checkout_item(struct checkout *state, struct checkout_item *ci)
> +void write_checkout_item(struct checkout *state, struct checkout_item *ci)
>   {
>   	unsigned int mode = (ci->ce->ce_mode & 0100) ? 0777 : 0666;
>   	int fd = -1, fstat_done = 0;
>   	struct strbuf path = STRBUF_INIT;
>   
>   	strbuf_add(&path, state->base_dir, state->base_dir_len);
> -	strbuf_add(&path, ci->ce->name, ce_namelen(ci->ce));
> +	strbuf_add(&path, ci->ce->name, ci->ce->ce_namelen);
>   
>   	/*
>   	 * At this point, leading dirs should have already been created. But if
> @@ -311,30 +298,214 @@ static void write_checkout_item(struct checkout *state, struct checkout_item *ci
>   	strbuf_release(&path);
>   }
>   
> -static int run_checkout_sequentially(struct checkout *state)
> +static void send_one_item(int fd, struct checkout_item *ci)
> +{
> +	size_t len_data;
> +	char *data, *variant;
> +	struct ci_fixed_portion *fixed_portion;
> +	const char *working_tree_encoding = ci->ca.working_tree_encoding;
> +	size_t name_len = ci->ce->ce_namelen;
> +	size_t working_tree_encoding_len = working_tree_encoding ?
> +					   strlen(working_tree_encoding) : 0;
> +
> +	len_data = sizeof(struct ci_fixed_portion) + name_len +
> +		   working_tree_encoding_len;
> +
> +	data = xcalloc(1, len_data);
> +
> +	fixed_portion = (struct ci_fixed_portion *)data;
> +	fixed_portion->id = ci->id;
> +	oidcpy(&fixed_portion->oid, &ci->ce->oid);
> +	fixed_portion->ce_mode = ci->ce->ce_mode;
> +	fixed_portion->attr_action = ci->ca.attr_action;
> +	fixed_portion->crlf_action = ci->ca.crlf_action;
> +	fixed_portion->ident = ci->ca.ident;
> +	fixed_portion->name_len = name_len;
> +	fixed_portion->working_tree_encoding_len = working_tree_encoding_len;
> +
> +	variant = data + sizeof(*fixed_portion);
> +	if (working_tree_encoding_len) {
> +		memcpy(variant, working_tree_encoding, working_tree_encoding_len);
> +		variant += working_tree_encoding_len;
> +	}
> +	memcpy(variant, ci->ce->name, name_len);
> +
> +	packet_write(fd, data, len_data);
> +
> +	free(data);
> +}
> +
> +static void send_batch(int fd, size_t start, size_t nr)
>   {
>   	size_t i;
> +	for (i = 0; i < nr; ++i)
> +		send_one_item(fd, &parallel_checkout->items[start + i]);
> +	packet_flush(fd);
> +}
>   
> -	for (i = 0; i < parallel_checkout->nr; ++i) {
> -		struct checkout_item *ci = &parallel_checkout->items[i];
> -		write_checkout_item(state, ci);
> +static struct child_process *setup_workers(struct checkout *state, int num_workers)
> +{
> +	struct child_process *workers;
> +	int i, workers_with_one_extra_item;
> +	size_t base_batch_size, next_to_assign = 0;
> +
> +	base_batch_size = parallel_checkout->nr / num_workers;
> +	workers_with_one_extra_item = parallel_checkout->nr % num_workers;
> +	ALLOC_ARRAY(workers, num_workers);
> +
> +	for (i = 0; i < num_workers; ++i) {
> +		struct child_process *cp = &workers[i];
> +		size_t batch_size = base_batch_size;
> +
> +		child_process_init(cp);
> +		cp->git_cmd = 1;
> +		cp->in = -1;
> +		cp->out = -1;
> +		strvec_push(&cp->args, "checkout--helper");
> +		if (state->base_dir_len)
> +			strvec_pushf(&cp->args, "--prefix=%s", state->base_dir);
> +		if (start_command(cp))
> +			die(_("failed to spawn checkout worker"));

We should consider splitting this loop into one to start the helpers
and another loop to later send them their assignments.  This would
better hide the process startup costs.

When comparing this version with my pc-p4-core branch on Windows,
I was seeing a delay of 0.8 seconds between each helper process
getting started.  And on my version a delay of 0.2 seconds between them.

I was testing with a huge repo and the batch size was ~200k, so the
foreground process was stuck in send_batch() for a while before it
could start the next helper process.

It still takes the same amount of time to send each batch, but
the 2nd thru nth helpers can be starting while we are sending the
batch to the 1st helper.  (This might just be a Windows issue because
of how slow process creation is on Windows....)

We could maybe also save a little time splitting the batches
across the helpers, but that's a refinement for later...

> +
> +		/* distribute the extra work evenly */
> +		if (i < workers_with_one_extra_item)
> +			batch_size++;
> +
> +		send_batch(cp->in, next_to_assign, batch_size);
> +		next_to_assign += batch_size;
>   	}
>   
> +	return workers;
> +}
> +
> +static void finish_workers(struct child_process *workers, int num_workers)
> +{
> +	int i;
> +	for (i = 0; i < num_workers; ++i) {
> +		struct child_process *w = &workers[i];
> +		if (w->in >= 0)
> +			close(w->in);
> +		if (w->out >= 0)
> +			close(w->out);

You might also consider splitting this loop too.  The net-net here
is that the foreground process closes the handle to the child and
waits for the child to exit -- which it will because it gets EOF on
its stdin.

But the foreground process is stuck in a wait() for it to do so.

You could make finish_workers() just call close() on all the child
handles and then have an atexit() handler to actually wait() and
reap them.  This would let the children exit asynchronously (while
the caller here in the foreground process is updating the index
on disk, for example).


> +		if (finish_command(w))
> +			die(_("checkout worker finished with error"));
> +	}
> +	free(workers);
> +}
> +
> +static void parse_and_save_result(const char *line, int len)
> +{
> +	struct ci_result *res;
> +	struct checkout_item *ci;
> +
> +	/*
> +	 * Worker should send either the full result struct or just the base
> +	 * (i.e. no stat data).
> +	 */
> +	if (len != ci_result_base_size() && len != sizeof(struct ci_result))
> +		BUG("received corrupted item from checkout worker");
> +
> +	res = (struct ci_result *)line;
> +
> +	if (res->id >= parallel_checkout->nr)
> +		BUG("checkout worker sent unknown item id");
> +
> +	ci = &parallel_checkout->items[res->id];
> +	ci->status = res->status;
> +
> +	/*
> +	 * Worker only sends stat data on success. Otherwise, we *cannot* access
> +	 * res->st as that will be an invalid address.
> +	 */
> +	if (res->status == CI_SUCCESS)
> +		ci->st = res->st;
> +}
> +
> +static void gather_results_from_workers(struct child_process *workers,
> +					int num_workers)
> +{
> +	int i, active_workers = num_workers;
> +	struct pollfd *pfds;
> +
> +	CALLOC_ARRAY(pfds, num_workers);
> +	for (i = 0; i < num_workers; ++i) {
> +		pfds[i].fd = workers[i].out;
> +		pfds[i].events = POLLIN;
> +	}
> +
> +	while (active_workers) {
> +		int nr = poll(pfds, num_workers, -1);
> +
> +		if (nr < 0) {
> +			if (errno == EINTR)
> +				continue;
> +			die_errno("failed to poll checkout workers");
> +		}
> +
> +		for (i = 0; i < num_workers && nr > 0; ++i) {
> +			struct pollfd *pfd = &pfds[i];
> +
> +			if (!pfd->revents)
> +				continue;
> +
> +			if (pfd->revents & POLLIN) {
> +				int len;
> +				const char *line = packet_read_line(pfd->fd, &len);
> +
> +				if (!line) {
> +					pfd->fd = -1;
> +					active_workers--;
> +				} else {
> +					parse_and_save_result(line, len);
> +				}
> +			} else if (pfd->revents & POLLHUP) {
> +				pfd->fd = -1;
> +				active_workers--;
> +			} else if (pfd->revents & (POLLNVAL | POLLERR)) {
> +				die(_("error polling from checkout worker"));
> +			}
> +
> +			nr--;
> +		}
> +	}
> +
> +	free(pfds);
> +}
> +
> +static int run_checkout_sequentially(struct checkout *state)
> +{
> +	size_t i;
> +	for (i = 0; i < parallel_checkout->nr; ++i)
> +		write_checkout_item(state, &parallel_checkout->items[i]);
>   	return handle_results(state);
>   }
>   
> +static const int workers_threshold = 0;
>   
>   int run_parallel_checkout(struct checkout *state)
>   {
> -	int ret;
> +	int num_workers = online_cpus();
> +	int ret = 0;
> +	struct child_process *workers;
>   
>   	if (!parallel_checkout)
>   		BUG("cannot run parallel checkout: not initialized yet");
>   
>   	pc_status = PC_RUNNING;
>   
> -	ret = run_checkout_sequentially(state);
> +	if (parallel_checkout->nr == 0) {
> +		goto done;
> +	} else if (parallel_checkout->nr < workers_threshold || num_workers == 1) {
> +		ret = run_checkout_sequentially(state);
> +		goto done;
> +	}
> +
> +	workers = setup_workers(state, num_workers);
> +	gather_results_from_workers(workers, num_workers);
> +	finish_workers(workers, num_workers);
> +	ret = handle_results(state);
>   
> +done:
>   	finish_parallel_checkout();
>   	return ret;
>   }
> diff --git a/parallel-checkout.h b/parallel-checkout.h
> index 8eef59ffcd..f25f2874ae 100644
> --- a/parallel-checkout.h
> +++ b/parallel-checkout.h
> @@ -1,10 +1,21 @@
>   #ifndef PARALLEL_CHECKOUT_H
>   #define PARALLEL_CHECKOUT_H
>   
> -struct cache_entry;
> -struct checkout;
> -struct conv_attrs;
> +#include "entry.h"
> +#include "convert.h"
>   
> +/****************************************************************
> + * Users of parallel checkout
> + ****************************************************************/
> +
> +enum pc_status {
> +	PC_UNINITIALIZED = 0,
> +	PC_ACCEPTING_ENTRIES,
> +	PC_RUNNING,
> +	PC_HANDLING_RESULTS,
> +};
> +
> +enum pc_status parallel_checkout_status(void);
>   void init_parallel_checkout(void);
>   
>   /*
> @@ -14,7 +25,62 @@ void init_parallel_checkout(void);
>    */
>   int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca);
>   
> -/* Write all the queued entries, returning 0 on success.*/
> +/* Write all the queued entries, returning 0 on success. */
>   int run_parallel_checkout(struct checkout *state);
>   
> +/****************************************************************
> + * Interface with checkout--helper
> + ****************************************************************/
> +
> +enum ci_status {
> +	CI_PENDING = 0,
> +	CI_SUCCESS,
> +	CI_RETRY,
> +	CI_FAILED,
> +};
> +
> +struct checkout_item {
> +	/*
> +	 * In the main process, ce points to an istate->cache[] entry, so it
> +	 * is not owned by us. In the workers, the memory is owned by them and
> +	 * *must be* released.
> +	struct cache_entry *ce;
> +	struct conv_attrs ca;
> +	size_t id; /* position in parallel_checkout->items[] of main process */
> +
> +	/* Output fields, sent from workers. */
> +	enum ci_status status;
> +	struct stat st;
> +};
> +
> +/*
> + * The fixed-size portion of `struct checkout_item` that is sent to the workers.
> + * Following this will be 2 strings: ca.working_tree_encoding and ce.name.
> + * These are NOT null-terminated, since their sizes are stored in the fixed
> + * portion.
> + */
> +struct ci_fixed_portion {
> +	size_t id;
> +	struct object_id oid;
> +	unsigned int ce_mode;
> +	enum crlf_action attr_action;
> +	enum crlf_action crlf_action;
> +	int ident;
> +	size_t working_tree_encoding_len;
> +	size_t name_len;
> +};
> +
> +/*
> + * The `struct checkout_item` fields returned by the workers. The order is
> + * important here, especially stat being the last one, as it is omitted on
> + * error.
> + */
> +struct ci_result {
> +	size_t id;
> +	enum ci_status status;
> +	struct stat st;
> +};
> +
> +#define ci_result_base_size() offsetof(struct ci_result, st)
> +
> +void write_checkout_item(struct checkout *state, struct checkout_item *ci);
> +
>   #endif /* PARALLEL_CHECKOUT_H */
> 

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH 11/21] parallel-checkout: make it truly parallel
  2020-08-19 21:34   ` Jeff Hostetler
@ 2020-08-20  1:33     ` Matheus Tavares Bernardino
  2020-08-20 14:39       ` Jeff Hostetler
  0 siblings, 1 reply; 154+ messages in thread
From: Matheus Tavares Bernardino @ 2020-08-20  1:33 UTC (permalink / raw)
  To: Jeff Hostetler
  Cc: git, Derrick Stolee, jeffhost,
	Nguyễn Thái Ngọc Duy, Paul Tan, Denton Liu,
	Remi Lespinet, Junio C Hamano

On Wed, Aug 19, 2020 at 6:34 PM Jeff Hostetler <git@jeffhostetler.com> wrote:
>
> On 8/10/20 5:33 PM, Matheus Tavares wrote:
> >
> > +static struct child_process *setup_workers(struct checkout *state, int num_workers)
> > +{
> > +     struct child_process *workers;
> > +     int i, workers_with_one_extra_item;
> > +     size_t base_batch_size, next_to_assign = 0;
> > +
> > +     base_batch_size = parallel_checkout->nr / num_workers;
> > +     workers_with_one_extra_item = parallel_checkout->nr % num_workers;
> > +     ALLOC_ARRAY(workers, num_workers);
> > +
> > +     for (i = 0; i < num_workers; ++i) {
> > +             struct child_process *cp = &workers[i];
> > +             size_t batch_size = base_batch_size;
> > +
> > +             child_process_init(cp);
> > +             cp->git_cmd = 1;
> > +             cp->in = -1;
> > +             cp->out = -1;
> > +             strvec_push(&cp->args, "checkout--helper");
> > +             if (state->base_dir_len)
> > +                     strvec_pushf(&cp->args, "--prefix=%s", state->base_dir);
> > +             if (start_command(cp))
> > +                     die(_("failed to spawn checkout worker"));
>
> We should consider splitting this loop into one to start the helpers
> and another loop to later send them their assignments.  This would
> better hide the process startup costs.
>
> When comparing this version with my pc-p4-core branch on Windows,
> I was seeing a delay of 0.8 seconds between each helper process
> getting started.  And on my version a delay of 0.2 between them.
>
> I was testing with a huge repo and the batch size was ~200k, so the
> foreground process was stuck in send_batch() for a while before it
> could start the next helper process.
>
> It still takes the same amount of time to send each batch, but
> the 2nd thru nth helpers can be starting while we are sending the
> batch to the 1st helper.  (This might just be a Windows issue because
> of how slow process creation is on Windows....)

Thanks for the explanation. I will split the loop in v2.
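For the archive, the two-phase idea can be modeled with a small
standalone program. This is only a sketch of the pattern (plain
fork()/pipe() stand-ins instead of git's run-command API; the
batch-splitting math mirrors setup_workers() above), not the actual
v2 code:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Same "spread the remainder evenly" split as setup_workers() above. */
static size_t batch_size_for(int i, size_t nr, int num_workers)
{
	size_t base = nr / num_workers;
	int extra = nr % num_workers;

	return base + (i < extra);
}

/*
 * Two-phase startup: spawn every worker first, then send the batch
 * assignments, so workers 2..n can boot while the first batch is
 * being written. Returns how many workers exited cleanly.
 */
static int two_phase_demo(size_t nr, int num_workers)
{
	int (*fds)[2] = malloc(sizeof(*fds) * num_workers);
	pid_t *pids = malloc(sizeof(*pids) * num_workers);
	size_t next = 0;
	int i, ok = 0;

	/* Phase 1: start all workers before sending any work. */
	for (i = 0; i < num_workers; i++) {
		if (pipe(fds[i]) < 0 || (pids[i] = fork()) < 0)
			exit(1);
		if (!pids[i]) {
			/* Worker: block until its batch spec arrives. */
			char buf[64];
			close(fds[i][1]);
			_exit(read(fds[i][0], buf, sizeof(buf)) > 0 ? 0 : 1);
		}
		close(fds[i][0]);
	}

	/* Phase 2: distribute the batches to the running workers. */
	for (i = 0; i < num_workers; i++) {
		char msg[64];
		size_t sz = batch_size_for(i, nr, num_workers);
		int len = snprintf(msg, sizeof(msg), "%zu %zu", next, sz);

		if (write(fds[i][1], msg, len) != len)
			exit(1);
		close(fds[i][1]);
		next += sz;
	}

	/* Cf. finish_workers(): reap everyone. */
	for (i = 0; i < num_workers; i++) {
		int status;
		if (waitpid(pids[i], &status, 0) == pids[i] &&
		    WIFEXITED(status) && !WEXITSTATUS(status))
			ok++;
	}
	free(fds);
	free(pids);
	return ok;
}
```

With this shape, the send_batch() cost for worker 1 overlaps with the
startup of workers 2..n instead of delaying them.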

> We could maybe also save a little time splitting the batches
> across the helpers, but that's a refinement for later...
>
> > +
> > +             /* distribute the extra work evenly */
> > +             if (i < workers_with_one_extra_item)
> > +                     batch_size++;
> > +
> > +             send_batch(cp->in, next_to_assign, batch_size);
> > +             next_to_assign += batch_size;
> >       }
> >
> > +     return workers;
> > +}
> > +
> > +static void finish_workers(struct child_process *workers, int num_workers)
> > +{
> > +     int i;
> > +     for (i = 0; i < num_workers; ++i) {
> > +             struct child_process *w = &workers[i];
> > +             if (w->in >= 0)
> > +                     close(w->in);
> > +             if (w->out >= 0)
> > +                     close(w->out);
>
> You might also consider splitting this loop too.  The net-net here
> is that the foreground process closes the handle to the child and
> waits for the child to exit -- which it will because it get EOF on
> its stdin.
>
> But the foreground process is stuck in a wait() for it to do so.
>
> You could make finish_workers() just call close() on all the child
> handles and then have an atexit() handler to actually wait() and
> reap them.  This would let the children exit asynchronously (while
> the caller here in the foreground process is updating the index
> on disk, for example).

Makes sense, thanks. And I think we could achieve this by setting both
`clean_on_exit` and `wait_after_clean` on the child_process struct,
right? (BTW, I have just noticed that we probably want these flags set
for another reason too: we wouldn't want the workers to keep checking
out files if the main process was killed.)

Maybe the downside of using the atexit() handler, instead of calling
finish_command(), would be that we cannot free the `workers` array
when cleaning up parallel-checkout, right? Also, we wouldn't be able
to report an error if the worker ends with an error code (but at this
point it would have already sent all the results to the foreground
process, anyway).

Do you think that we can mitigate the wait() cost by just splitting
the loop in two, like we are going to do in setup_workers()?

>
> > +             if (finish_command(w))
> > +                     die(_("checkout worker finished with error"));
> > +     }
> > +     free(workers);
> > +}


* Re: [RFC PATCH 11/21] parallel-checkout: make it truly parallel
  2020-08-20  1:33     ` Matheus Tavares Bernardino
@ 2020-08-20 14:39       ` Jeff Hostetler
  0 siblings, 0 replies; 154+ messages in thread
From: Jeff Hostetler @ 2020-08-20 14:39 UTC (permalink / raw)
  To: Matheus Tavares Bernardino
  Cc: git, Derrick Stolee, jeffhost,
	Nguyễn Thái Ngọc Duy, Paul Tan, Denton Liu,
	Remi Lespinet, Junio C Hamano



On 8/19/20 9:33 PM, Matheus Tavares Bernardino wrote:
> On Wed, Aug 19, 2020 at 6:34 PM Jeff Hostetler <git@jeffhostetler.com> wrote:
>>
>> On 8/10/20 5:33 PM, Matheus Tavares wrote:
>>>
>>> +static struct child_process *setup_workers(struct checkout *state, int num_workers)
>>> +{
>>> +     struct child_process *workers;
>>> +     int i, workers_with_one_extra_item;
>>> +     size_t base_batch_size, next_to_assign = 0;
>>> +
>>> +     base_batch_size = parallel_checkout->nr / num_workers;
>>> +     workers_with_one_extra_item = parallel_checkout->nr % num_workers;
>>> +     ALLOC_ARRAY(workers, num_workers);
>>> +
>>> +     for (i = 0; i < num_workers; ++i) {
>>> +             struct child_process *cp = &workers[i];
>>> +             size_t batch_size = base_batch_size;
>>> +
>>> +             child_process_init(cp);
>>> +             cp->git_cmd = 1;
>>> +             cp->in = -1;
>>> +             cp->out = -1;
>>> +             strvec_push(&cp->args, "checkout--helper");
>>> +             if (state->base_dir_len)
>>> +                     strvec_pushf(&cp->args, "--prefix=%s", state->base_dir);
>>> +             if (start_command(cp))
>>> +                     die(_("failed to spawn checkout worker"));
>>
>> We should consider splitting this loop into one to start the helpers
>> and another loop to later send them their assignments.  This would
>> better hide the process startup costs.
>>
>> When comparing this version with my pc-p4-core branch on Windows,
>> I was seeing a delay of 0.8 seconds between each helper process
>> getting started.  And on my version a delay of 0.2 between them.
>>
>> I was testing with a huge repo and the batch size was ~200k, so the
>> foreground process was stuck in send_batch() for a while before it
>> could start the next helper process.
>>
>> It still takes the same amount of time to send each batch, but
>> the 2nd thru nth helpers can be starting while we are sending the
>> batch to the 1st helper.  (This might just be a Windows issue because
>> of how slow process creation is on Windows....)
> 
> Thanks for the explanation. I will split the loop in v2.
> 
>> We could maybe also save a little time splitting the batches
>> across the helpers, but that's a refinement for later...
>>
>>> +
>>> +             /* distribute the extra work evenly */
>>> +             if (i < workers_with_one_extra_item)
>>> +                     batch_size++;
>>> +
>>> +             send_batch(cp->in, next_to_assign, batch_size);
>>> +             next_to_assign += batch_size;
>>>        }
>>>
>>> +     return workers;
>>> +}
>>> +
>>> +static void finish_workers(struct child_process *workers, int num_workers)
>>> +{
>>> +     int i;
>>> +     for (i = 0; i < num_workers; ++i) {
>>> +             struct child_process *w = &workers[i];
>>> +             if (w->in >= 0)
>>> +                     close(w->in);
>>> +             if (w->out >= 0)
>>> +                     close(w->out);
>>
>> You might also consider splitting this loop too.  The net-net here
>> is that the foreground process closes the handle to the child and
>> waits for the child to exit -- which it will because it gets EOF on
>> its stdin.
>>
>> But the foreground process is stuck in a wait() for it to do so.
>>
>> You could make finish_workers() just call close() on all the child
>> handles and then have an atexit() handler to actually wait() and
>> reap them.  This would let the children exit asynchronously (while
>> the caller here in the foreground process is updating the index
>> on disk, for example).
> 
> Makes sense, thanks. And I think we could achieve this by setting both
> `clean_on_exit` and `wait_after_clean` on the child_process struct,
> right? (BTW, I have just noticed that we probably want these flags set
> for another reason too: we wouldn't want the workers to keep checking
> out files if the main process was killed.)
> 
> Maybe the downside of using the atexit() handler, instead of calling
> finish_command(), would be that we cannot free the `workers` array
> when cleaning up parallel-checkout, right? Also, we wouldn't be able
> to report an error if the worker ends with an error code (but at this
> point it would have already sent all the results to the foreground
> process, anyway).
> 
> Do you think that we can mitigate the wait() cost by just splitting
> the loop in two, like we are going to do in setup_workers()?

Yeah, I was thinking about that after I hit send, that it'd be simpler
to do a close loop and then a wait loop and not bother with the atexit
complexity and yet get us most of the gains.

> 
>>
>>> +             if (finish_command(w))
>>> +                     die(_("checkout worker finished with error"));
>>> +     }
>>> +     free(workers);
>>> +}


* [PATCH v2 00/19] Parallel Checkout (part I)
  2020-08-10 21:33 [RFC PATCH 00/21] [RFC] Parallel checkout Matheus Tavares
                   ` (21 preceding siblings ...)
  2020-08-12 16:57 ` [RFC PATCH 00/21] [RFC] Parallel checkout Jeff Hostetler
@ 2020-09-22 22:49 ` Matheus Tavares
  2020-09-22 22:49   ` [PATCH v2 01/19] convert: make convert_attrs() and convert structs public Matheus Tavares
                     ` (19 more replies)
  2020-10-01 16:42 ` [RFC PATCH 00/21] [RFC] Parallel checkout Jeff Hostetler
  23 siblings, 20 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-09-22 22:49 UTC (permalink / raw)
  To: git; +Cc: jeffhost, chriscool, peff, t.gummerer, newren

This series adds helper workers to checkout, parallelizing the reading,
filtering and writing of multiple blobs to the working tree.

Since v1, I got the chance to benchmark parallel checkout in more
machines. The results showed that the parallelization is most effective
with repositories located on SSDs or over distributed file systems. For
local file systems on spinning disks, it does not always bring good
performance. In fact, it sometimes even causes a slowdown. But given
the results on the two first cases, I think it's worth having the
parallel code as an optional (and non-default) setting.

The size of the repository being checked out and the compression level
on the packfiles also influence how much performance gain we can get
from parallel checkout. For example, downloading the Linux repo from
GitHub and from kernel.org, I got packfiles of 2.9GB and 1.4GB,
respectively. The number of objects was the same, but GitHub's had a
smaller number of delta-chains with size >= 7 [A]. For this reason, the
sequential checkout after GitHub's clone was considerably faster than
the sequential checkout after kernel.org's clone. And the speedup from
parallel checkout was more modest (but it was faster in absolute values,
nevertheless).

[A]: https://docs.google.com/spreadsheets/d/1dDGLym77JAGCVYhKQHe44r3pqtrsvHrjS4NmD_Hqr6k/edit?usp=sharing

V2 got bigger with tests and some additional optimizations, so I decided
to divide the original series into two parts to facilitate reviewing.
This one consists of:

- The first 9 patches are preparatory steps in convert.c and entry.c.
- The middle 6 actually implement parallel checkout.
- The last 4 add tests.

Part II will contain some extra optimizations, like work stealing and
the creation of leading directories in parallel. With that, workers
won't need to stat() the path components again before opening the files
for writing. We will also skip some stat() calls during clone.


Major changes since v1:

General:
- Added tests
- Parallel checkout is no longer the default, since not all machines
  benefit from it.
- Rebased on top of master to use the adjusted mem_pool API of
  en/mem-pool.

Patch 10:
- Converted BUG() to error(), in handle_results(), when we finish
  parallel checkout with pending entries. This is not really a BUG; it
  can happen when a worker dies before sending all of its results. Also,
  by emitting an error message instead of die()'ing, we can continue
  processing the next results and, thus, avoid wasting successful work.
- Added missing initialization of ci->status on enqueue_entry().
- Fixed a bug in which the collision report during clone would not be
  correct when the file that is written first appears after its
  colliding pair in the cache array.
- Reworded commit message and added comment in handle_results() to
  explain why we retry writing entries with path collisions.
- Renamed CI_RETRY to CI_COLLISION, to make it easier to change the
  behavior on collided entries in the future, if necessary.
- Some other minor changes like:
  * Removed unnecessary PC_HANDLING_RESULTS status.
  * Statically allocated the global parallel_checkout struct.
  * Renamed checkout_item to parallel_checkout_item.

Patch 11:
- Made parse_and_save_result() safer by checking that the received data
  has the expected size, instead of trusting ci->status and possibly
  accessing an invalid address on errors.
- Limited the workers to the number of enqueued entries.
- Added comment in packet_to_ci() mentioning why it's OK to encode
  NULL as a zero length string when sending the working_tree_encoding to
  workers.
- Split subprocess' spawning and finalizing loops, to mitigate the
  spawn/wait cost.
- Don't die() when a worker exits with an error code (only report the
  error), to avoid wasting good work by not updating the index with the 
  stat information from the written entries.
- Renamed checkout.workersThreshold to checkout.thresholdForParallelism.


Jeff Hostetler (4):
  convert: make convert_attrs() and convert structs public
  convert: add [async_]convert_to_working_tree_ca() variants
  convert: add get_stream_filter_ca() variant
  convert: add conv_attrs classification

Matheus Tavares (15):
  entry: extract a header file for entry.c functions
  entry: make fstat_output() and read_blob_entry() public
  entry: extract cache_entry update from write_entry()
  entry: move conv_attrs lookup up to checkout_entry()
  entry: add checkout_entry_ca() which takes preloaded conv_attrs
  unpack-trees: add basic support for parallel checkout
  parallel-checkout: make it truly parallel
  parallel-checkout: support progress displaying
  make_transient_cache_entry(): optionally alloc from mem_pool
  builtin/checkout.c: complete parallel checkout support
  checkout-index: add parallel checkout support
  parallel-checkout: add tests for basic operations
  parallel-checkout: add tests related to clone collisions
  parallel-checkout: add tests related to .gitattributes
  ci: run test round with parallel-checkout enabled

 .gitignore                              |   1 +
 Documentation/config/checkout.txt       |  21 +
 Makefile                                |   2 +
 apply.c                                 |   1 +
 builtin.h                               |   1 +
 builtin/checkout--helper.c              | 142 ++++++
 builtin/checkout-index.c                |  17 +
 builtin/checkout.c                      |  21 +-
 builtin/difftool.c                      |   3 +-
 cache.h                                 |  34 +-
 ci/run-build-and-tests.sh               |   1 +
 convert.c                               | 121 +++--
 convert.h                               |  68 +++
 entry.c                                 | 102 ++--
 entry.h                                 |  54 ++
 git.c                                   |   2 +
 parallel-checkout.c                     | 631 ++++++++++++++++++++++++
 parallel-checkout.h                     | 103 ++++
 read-cache.c                            |  12 +-
 t/README                                |   4 +
 t/lib-encoding.sh                       |  25 +
 t/lib-parallel-checkout.sh              |  45 ++
 t/t0028-working-tree-encoding.sh        |  25 +-
 t/t2080-parallel-checkout-basics.sh     | 197 ++++++++
 t/t2081-parallel-checkout-collisions.sh | 116 +++++
 t/t2082-parallel-checkout-attributes.sh | 174 +++++++
 unpack-trees.c                          |  22 +-
 27 files changed, 1793 insertions(+), 152 deletions(-)
 create mode 100644 builtin/checkout--helper.c
 create mode 100644 entry.h
 create mode 100644 parallel-checkout.c
 create mode 100644 parallel-checkout.h
 create mode 100644 t/lib-encoding.sh
 create mode 100644 t/lib-parallel-checkout.sh
 create mode 100755 t/t2080-parallel-checkout-basics.sh
 create mode 100755 t/t2081-parallel-checkout-collisions.sh
 create mode 100755 t/t2082-parallel-checkout-attributes.sh

-- 
2.28.0



* [PATCH v2 01/19] convert: make convert_attrs() and convert structs public
  2020-09-22 22:49 ` [PATCH v2 00/19] Parallel Checkout (part I) Matheus Tavares
@ 2020-09-22 22:49   ` Matheus Tavares
  2020-09-22 22:49   ` [PATCH v2 02/19] convert: add [async_]convert_to_working_tree_ca() variants Matheus Tavares
                     ` (18 subsequent siblings)
  19 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-09-22 22:49 UTC (permalink / raw)
  To: git; +Cc: jeffhost, chriscool, peff, t.gummerer, newren

From: Jeff Hostetler <jeffhost@microsoft.com>

Move convert_attrs() declaration from convert.c to convert.h, together
with the conv_attrs struct and the crlf_action enum. This function and
the data structures will be used outside convert.c in the upcoming
parallel checkout implementation.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
[matheus.bernardino: squash and reword msg]
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 convert.c | 23 ++---------------------
 convert.h | 24 ++++++++++++++++++++++++
 2 files changed, 26 insertions(+), 21 deletions(-)

diff --git a/convert.c b/convert.c
index 8e6c292421..941a845692 100644
--- a/convert.c
+++ b/convert.c
@@ -24,17 +24,6 @@
 #define CONVERT_STAT_BITS_TXT_CRLF  0x2
 #define CONVERT_STAT_BITS_BIN       0x4
 
-enum crlf_action {
-	CRLF_UNDEFINED,
-	CRLF_BINARY,
-	CRLF_TEXT,
-	CRLF_TEXT_INPUT,
-	CRLF_TEXT_CRLF,
-	CRLF_AUTO,
-	CRLF_AUTO_INPUT,
-	CRLF_AUTO_CRLF
-};
-
 struct text_stat {
 	/* NUL, CR, LF and CRLF counts */
 	unsigned nul, lonecr, lonelf, crlf;
@@ -1297,18 +1286,10 @@ static int git_path_check_ident(struct attr_check_item *check)
 	return !!ATTR_TRUE(value);
 }
 
-struct conv_attrs {
-	struct convert_driver *drv;
-	enum crlf_action attr_action; /* What attr says */
-	enum crlf_action crlf_action; /* When no attr is set, use core.autocrlf */
-	int ident;
-	const char *working_tree_encoding; /* Supported encoding or default encoding if NULL */
-};
-
 static struct attr_check *check;
 
-static void convert_attrs(const struct index_state *istate,
-			  struct conv_attrs *ca, const char *path)
+void convert_attrs(const struct index_state *istate,
+		   struct conv_attrs *ca, const char *path)
 {
 	struct attr_check_item *ccheck = NULL;
 
diff --git a/convert.h b/convert.h
index e29d1026a6..aeb4a1be9a 100644
--- a/convert.h
+++ b/convert.h
@@ -37,6 +37,27 @@ enum eol {
 #endif
 };
 
+enum crlf_action {
+	CRLF_UNDEFINED,
+	CRLF_BINARY,
+	CRLF_TEXT,
+	CRLF_TEXT_INPUT,
+	CRLF_TEXT_CRLF,
+	CRLF_AUTO,
+	CRLF_AUTO_INPUT,
+	CRLF_AUTO_CRLF
+};
+
+struct convert_driver;
+
+struct conv_attrs {
+	struct convert_driver *drv;
+	enum crlf_action attr_action; /* What attr says */
+	enum crlf_action crlf_action; /* When no attr is set, use core.autocrlf */
+	int ident;
+	const char *working_tree_encoding; /* Supported encoding or default encoding if NULL */
+};
+
 enum ce_delay_state {
 	CE_NO_DELAY = 0,
 	CE_CAN_DELAY = 1,
@@ -102,6 +123,9 @@ void convert_to_git_filter_fd(const struct index_state *istate,
 int would_convert_to_git_filter_fd(const struct index_state *istate,
 				   const char *path);
 
+void convert_attrs(const struct index_state *istate,
+		   struct conv_attrs *ca, const char *path);
+
 /*
  * Initialize the checkout metadata with the given values.  Any argument may be
  * NULL if it is not applicable.  The treeish should be a commit if that is
-- 
2.28.0



* [PATCH v2 02/19] convert: add [async_]convert_to_working_tree_ca() variants
  2020-09-22 22:49 ` [PATCH v2 00/19] Parallel Checkout (part I) Matheus Tavares
  2020-09-22 22:49   ` [PATCH v2 01/19] convert: make convert_attrs() and convert structs public Matheus Tavares
@ 2020-09-22 22:49   ` Matheus Tavares
  2020-09-22 22:49   ` [PATCH v2 03/19] convert: add get_stream_filter_ca() variant Matheus Tavares
                     ` (17 subsequent siblings)
  19 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-09-22 22:49 UTC (permalink / raw)
  To: git; +Cc: jeffhost, chriscool, peff, t.gummerer, newren

From: Jeff Hostetler <jeffhost@microsoft.com>

Separate the attribute gathering from the actual conversion by adding
_ca() variants of the conversion functions. These variants receive a
precomputed 'struct conv_attrs', thus not relying on an index state.
They will be used in a future patch adding parallel checkout support,
for two reasons:

- We will already load the conversion attributes in checkout_entry(),
  before conversion, to decide whether a path is eligible for parallel
  checkout. Therefore, it would be wasteful to load them again later,
  for the actual conversion.

- The parallel workers will be responsible for reading, converting and
  writing blobs to the working tree. They won't have access to the main
  process' index state, so they cannot load the attributes. Instead,
  they will receive the preloaded ones and call the _ca() variant of
  the conversion functions. Furthermore, the attributes machinery is
  optimized to handle paths in sequential order, so it's better to leave
  it for the main process, anyway.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
[matheus.bernardino: squash, remove one function definition and reword]
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 convert.c | 50 ++++++++++++++++++++++++++++++++++++--------------
 convert.h |  9 +++++++++
 2 files changed, 45 insertions(+), 14 deletions(-)

diff --git a/convert.c b/convert.c
index 941a845692..55bcce891c 100644
--- a/convert.c
+++ b/convert.c
@@ -1447,7 +1447,7 @@ void convert_to_git_filter_fd(const struct index_state *istate,
 	ident_to_git(dst->buf, dst->len, dst, ca.ident);
 }
 
-static int convert_to_working_tree_internal(const struct index_state *istate,
+static int convert_to_working_tree_internal(const struct conv_attrs *ca,
 					    const char *path, const char *src,
 					    size_t len, struct strbuf *dst,
 					    int normalizing,
@@ -1455,11 +1455,8 @@ static int convert_to_working_tree_internal(const struct index_state *istate,
 					    struct delayed_checkout *dco)
 {
 	int ret = 0, ret_filter = 0;
-	struct conv_attrs ca;
-
-	convert_attrs(istate, &ca, path);
 
-	ret |= ident_to_worktree(src, len, dst, ca.ident);
+	ret |= ident_to_worktree(src, len, dst, ca->ident);
 	if (ret) {
 		src = dst->buf;
 		len = dst->len;
@@ -1469,24 +1466,24 @@ static int convert_to_working_tree_internal(const struct index_state *istate,
 	 * is a smudge or process filter (even if the process filter doesn't
 	 * support smudge).  The filters might expect CRLFs.
 	 */
-	if ((ca.drv && (ca.drv->smudge || ca.drv->process)) || !normalizing) {
-		ret |= crlf_to_worktree(src, len, dst, ca.crlf_action);
+	if ((ca->drv && (ca->drv->smudge || ca->drv->process)) || !normalizing) {
+		ret |= crlf_to_worktree(src, len, dst, ca->crlf_action);
 		if (ret) {
 			src = dst->buf;
 			len = dst->len;
 		}
 	}
 
-	ret |= encode_to_worktree(path, src, len, dst, ca.working_tree_encoding);
+	ret |= encode_to_worktree(path, src, len, dst, ca->working_tree_encoding);
 	if (ret) {
 		src = dst->buf;
 		len = dst->len;
 	}
 
 	ret_filter = apply_filter(
-		path, src, len, -1, dst, ca.drv, CAP_SMUDGE, meta, dco);
-	if (!ret_filter && ca.drv && ca.drv->required)
-		die(_("%s: smudge filter %s failed"), path, ca.drv->name);
+		path, src, len, -1, dst, ca->drv, CAP_SMUDGE, meta, dco);
+	if (!ret_filter && ca->drv && ca->drv->required)
+		die(_("%s: smudge filter %s failed"), path, ca->drv->name);
 
 	return ret | ret_filter;
 }
@@ -1497,7 +1494,9 @@ int async_convert_to_working_tree(const struct index_state *istate,
 				  const struct checkout_metadata *meta,
 				  void *dco)
 {
-	return convert_to_working_tree_internal(istate, path, src, len, dst, 0, meta, dco);
+	struct conv_attrs ca;
+	convert_attrs(istate, &ca, path);
+	return convert_to_working_tree_internal(&ca, path, src, len, dst, 0, meta, dco);
 }
 
 int convert_to_working_tree(const struct index_state *istate,
@@ -1505,13 +1504,36 @@ int convert_to_working_tree(const struct index_state *istate,
 			    size_t len, struct strbuf *dst,
 			    const struct checkout_metadata *meta)
 {
-	return convert_to_working_tree_internal(istate, path, src, len, dst, 0, meta, NULL);
+	struct conv_attrs ca;
+	convert_attrs(istate, &ca, path);
+	return convert_to_working_tree_internal(&ca, path, src, len, dst, 0, meta, NULL);
+}
+
+int async_convert_to_working_tree_ca(const struct conv_attrs *ca,
+				     const char *path, const char *src,
+				     size_t len, struct strbuf *dst,
+				     const struct checkout_metadata *meta,
+				     void *dco)
+{
+	return convert_to_working_tree_internal(ca, path, src, len, dst, 0, meta, dco);
+}
+
+int convert_to_working_tree_ca(const struct conv_attrs *ca,
+			       const char *path, const char *src,
+			       size_t len, struct strbuf *dst,
+			       const struct checkout_metadata *meta)
+{
+	return convert_to_working_tree_internal(ca, path, src, len, dst, 0, meta, NULL);
 }
 
 int renormalize_buffer(const struct index_state *istate, const char *path,
 		       const char *src, size_t len, struct strbuf *dst)
 {
-	int ret = convert_to_working_tree_internal(istate, path, src, len, dst, 1, NULL, NULL);
+	struct conv_attrs ca;
+	int ret;
+
+	convert_attrs(istate, &ca, path);
+	ret = convert_to_working_tree_internal(&ca, path, src, len, dst, 1, NULL, NULL);
 	if (ret) {
 		src = dst->buf;
 		len = dst->len;
diff --git a/convert.h b/convert.h
index aeb4a1be9a..46d537d1ae 100644
--- a/convert.h
+++ b/convert.h
@@ -100,11 +100,20 @@ int convert_to_working_tree(const struct index_state *istate,
 			    const char *path, const char *src,
 			    size_t len, struct strbuf *dst,
 			    const struct checkout_metadata *meta);
+int convert_to_working_tree_ca(const struct conv_attrs *ca,
+			       const char *path, const char *src,
+			       size_t len, struct strbuf *dst,
+			       const struct checkout_metadata *meta);
 int async_convert_to_working_tree(const struct index_state *istate,
 				  const char *path, const char *src,
 				  size_t len, struct strbuf *dst,
 				  const struct checkout_metadata *meta,
 				  void *dco);
+int async_convert_to_working_tree_ca(const struct conv_attrs *ca,
+				     const char *path, const char *src,
+				     size_t len, struct strbuf *dst,
+				     const struct checkout_metadata *meta,
+				     void *dco);
 int async_query_available_blobs(const char *cmd,
 				struct string_list *available_paths);
 int renormalize_buffer(const struct index_state *istate,
-- 
2.28.0



* [PATCH v2 03/19] convert: add get_stream_filter_ca() variant
  2020-09-22 22:49 ` [PATCH v2 00/19] Parallel Checkout (part I) Matheus Tavares
  2020-09-22 22:49   ` [PATCH v2 01/19] convert: make convert_attrs() and convert structs public Matheus Tavares
  2020-09-22 22:49   ` [PATCH v2 02/19] convert: add [async_]convert_to_working_tree_ca() variants Matheus Tavares
@ 2020-09-22 22:49   ` Matheus Tavares
  2020-09-22 22:49   ` [PATCH v2 04/19] convert: add conv_attrs classification Matheus Tavares
                     ` (16 subsequent siblings)
  19 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-09-22 22:49 UTC (permalink / raw)
  To: git; +Cc: jeffhost, chriscool, peff, t.gummerer, newren

From: Jeff Hostetler <jeffhost@microsoft.com>

As in the previous patch, we will also need to call get_stream_filter()
with a precomputed `struct conv_attrs` when we add support for parallel
checkout workers. So add the _ca() variant, which takes the conversion
attributes struct as a parameter.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
[matheus.bernardino: move header comment to ca() variant and reword msg]
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
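Note: the shape of this change — splitting the istate-taking function
into a thin wrapper over a _ca() core — can be sketched outside of Git
with stand-in types. Everything below (conv_attrs_t, index_state_t,
load_attrs(), filter_blob*()) is hypothetical and only illustrates the
pattern; it is not Git's real API:

```c
/* Hypothetical stand-ins for Git's conv_attrs / index_state. */
typedef struct { int crlf; int ident; } conv_attrs_t;
typedef struct { const char *name; } index_state_t;

/* Pretend attribute lookup (the expensive step we want to skip
 * when a caller already holds the attributes). */
void load_attrs(const index_state_t *istate, conv_attrs_t *ca, const char *path)
{
	(void)istate; (void)path;
	ca->crlf = 1;
	ca->ident = 0;
}

/* The _ca() core: takes precomputed attributes and does the real work. */
int filter_blob_ca(const conv_attrs_t *ca, const char *path)
{
	(void)path;
	return ca->crlf ? 1 : 0;	/* toy "filtering" decision */
}

/* The original entry point becomes a thin wrapper that loads the
 * attributes itself and forwards to the _ca() core. */
int filter_blob(const index_state_t *istate, const char *path)
{
	conv_attrs_t ca;
	load_attrs(istate, &ca, path);
	return filter_blob_ca(&ca, path);
}
```

Both entry points must stay behaviorally identical; only the place
where the attribute lookup happens differs.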
 convert.c | 28 +++++++++++++++++-----------
 convert.h |  2 ++
 2 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/convert.c b/convert.c
index 55bcce891c..c112ea23cb 100644
--- a/convert.c
+++ b/convert.c
@@ -1960,34 +1960,31 @@ static struct stream_filter *ident_filter(const struct object_id *oid)
 }
 
 /*
- * Return an appropriately constructed filter for the path, or NULL if
+ * Return an appropriately constructed filter for the given ca, or NULL if
  * the contents cannot be filtered without reading the whole thing
  * in-core.
  *
  * Note that you would be crazy to set CRLF, smudge/clean or ident to a
  * large binary blob you would want us not to slurp into the memory!
  */
-struct stream_filter *get_stream_filter(const struct index_state *istate,
-					const char *path,
-					const struct object_id *oid)
+struct stream_filter *get_stream_filter_ca(const struct conv_attrs *ca,
+					   const struct object_id *oid)
 {
-	struct conv_attrs ca;
 	struct stream_filter *filter = NULL;
 
-	convert_attrs(istate, &ca, path);
-	if (ca.drv && (ca.drv->process || ca.drv->smudge || ca.drv->clean))
+	if (ca->drv && (ca->drv->process || ca->drv->smudge || ca->drv->clean))
 		return NULL;
 
-	if (ca.working_tree_encoding)
+	if (ca->working_tree_encoding)
 		return NULL;
 
-	if (ca.crlf_action == CRLF_AUTO || ca.crlf_action == CRLF_AUTO_CRLF)
+	if (ca->crlf_action == CRLF_AUTO || ca->crlf_action == CRLF_AUTO_CRLF)
 		return NULL;
 
-	if (ca.ident)
+	if (ca->ident)
 		filter = ident_filter(oid);
 
-	if (output_eol(ca.crlf_action) == EOL_CRLF)
+	if (output_eol(ca->crlf_action) == EOL_CRLF)
 		filter = cascade_filter(filter, lf_to_crlf_filter());
 	else
 		filter = cascade_filter(filter, &null_filter_singleton);
@@ -1995,6 +1992,15 @@ struct stream_filter *get_stream_filter(const struct index_state *istate,
 	return filter;
 }
 
+struct stream_filter *get_stream_filter(const struct index_state *istate,
+					const char *path,
+					const struct object_id *oid)
+{
+	struct conv_attrs ca;
+	convert_attrs(istate, &ca, path);
+	return get_stream_filter_ca(&ca, oid);
+}
+
 void free_stream_filter(struct stream_filter *filter)
 {
 	filter->vtbl->free(filter);
diff --git a/convert.h b/convert.h
index 46d537d1ae..262c1a1d46 100644
--- a/convert.h
+++ b/convert.h
@@ -169,6 +169,8 @@ struct stream_filter; /* opaque */
 struct stream_filter *get_stream_filter(const struct index_state *istate,
 					const char *path,
 					const struct object_id *);
+struct stream_filter *get_stream_filter_ca(const struct conv_attrs *ca,
+					   const struct object_id *oid);
 void free_stream_filter(struct stream_filter *);
 int is_null_stream_filter(struct stream_filter *);
 
-- 
2.28.0


^ permalink raw reply	[flat|nested] 154+ messages in thread

* [PATCH v2 04/19] convert: add conv_attrs classification
  2020-09-22 22:49 ` [PATCH v2 00/19] Parallel Checkout (part I) Matheus Tavares
                     ` (2 preceding siblings ...)
  2020-09-22 22:49   ` [PATCH v2 03/19] convert: add get_stream_filter_ca() variant Matheus Tavares
@ 2020-09-22 22:49   ` Matheus Tavares
  2020-09-22 22:49   ` [PATCH v2 05/19] entry: extract a header file for entry.c functions Matheus Tavares
                     ` (15 subsequent siblings)
  19 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-09-22 22:49 UTC (permalink / raw)
  To: git; +Cc: jeffhost, chriscool, peff, t.gummerer, newren

From: Jeff Hostetler <jeffhost@microsoft.com>

Create `enum conv_attrs_classification` to express the different ways
that attributes are handled for a blob during checkout.

This will be used in a later commit when deciding whether to add a file
to the parallel or delayed queue during checkout. For now, we can also
use it in get_stream_filter_ca() to simplify the function (as the
classifying logic is the same).

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
[matheus.bernardino: use classification in get_stream_filter_ca()]
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
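Note: the classification logic added below boils down to an ordered
sequence of checks, from "needs the most in-core handling" to
"streamable". A toy mirror of it, with simplified stand-in fields
(not Git's actual conv_attrs):

```c
/* Toy classification: fields are simplified stand-ins for the
 * attributes Git inspects (filter driver kind, encoding, CRLF mode). */
enum ca_class {
	CLASS_INCORE,		/* must be smudged fully in-core */
	CLASS_INCORE_FILTER,	/* in-core, via single-shot smudge/clean */
	CLASS_INCORE_PROCESS,	/* in-core, via long-running process */
	CLASS_STREAMABLE,	/* can be streamed to disk */
};

struct toy_attrs {
	int has_process;	/* long-running filter process (e.g. LFS) */
	int has_smudge;		/* single-shot smudge/clean filter */
	int tree_encoding;	/* working-tree-encoding set */
	int crlf_auto;		/* CRLF_AUTO / CRLF_AUTO_CRLF */
};

enum ca_class classify(const struct toy_attrs *ca)
{
	if (ca->has_process)
		return CLASS_INCORE_PROCESS;
	if (ca->has_smudge)
		return CLASS_INCORE_FILTER;
	if (ca->tree_encoding || ca->crlf_auto)
		return CLASS_INCORE;
	return CLASS_STREAMABLE;
}
```

The order matters: a process filter wins over a smudge/clean pair, and
only entries that hit none of the checks are streamable.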
 convert.c | 26 +++++++++++++++++++-------
 convert.h | 33 +++++++++++++++++++++++++++++++++
 2 files changed, 52 insertions(+), 7 deletions(-)

diff --git a/convert.c b/convert.c
index c112ea23cb..633ad6976a 100644
--- a/convert.c
+++ b/convert.c
@@ -1972,13 +1972,7 @@ struct stream_filter *get_stream_filter_ca(const struct conv_attrs *ca,
 {
 	struct stream_filter *filter = NULL;
 
-	if (ca->drv && (ca->drv->process || ca->drv->smudge || ca->drv->clean))
-		return NULL;
-
-	if (ca->working_tree_encoding)
-		return NULL;
-
-	if (ca->crlf_action == CRLF_AUTO || ca->crlf_action == CRLF_AUTO_CRLF)
+	if (classify_conv_attrs(ca) != CA_CLASS_STREAMABLE)
 		return NULL;
 
 	if (ca->ident)
@@ -2034,3 +2028,21 @@ void clone_checkout_metadata(struct checkout_metadata *dst,
 	if (blob)
 		oidcpy(&dst->blob, blob);
 }
+
+enum conv_attrs_classification classify_conv_attrs(const struct conv_attrs *ca)
+{
+	if (ca->drv) {
+		if (ca->drv->process)
+			return CA_CLASS_INCORE_PROCESS;
+		if (ca->drv->smudge || ca->drv->clean)
+			return CA_CLASS_INCORE_FILTER;
+	}
+
+	if (ca->working_tree_encoding)
+		return CA_CLASS_INCORE;
+
+	if (ca->crlf_action == CRLF_AUTO || ca->crlf_action == CRLF_AUTO_CRLF)
+		return CA_CLASS_INCORE;
+
+	return CA_CLASS_STREAMABLE;
+}
diff --git a/convert.h b/convert.h
index 262c1a1d46..523ba9b140 100644
--- a/convert.h
+++ b/convert.h
@@ -190,4 +190,37 @@ int stream_filter(struct stream_filter *,
 		  const char *input, size_t *isize_p,
 		  char *output, size_t *osize_p);
 
+enum conv_attrs_classification {
+	/*
+	 * The blob must be loaded into a buffer before it can be
+	 * smudged. All smudging is done in-proc.
+	 */
+	CA_CLASS_INCORE,
+
+	/*
+	 * The blob must be loaded into a buffer, but uses a
+	 * single-file driver filter, such as rot13.
+	 */
+	CA_CLASS_INCORE_FILTER,
+
+	/*
+	 * The blob must be loaded into a buffer, but uses a
+	 * long-running driver process, such as LFS. This might or
+	 * might not use delayed operations. (The important thing is
+	 * that there is a single subordinate long-running process
+	 * handling all associated blobs and in case of delayed
+	 * operations, may hold per-blob state.)
+	 */
+	CA_CLASS_INCORE_PROCESS,
+
+	/*
+	 * The blob can be streamed and smudged without needing to
+	 * completely read it into a buffer.
+	 */
+	CA_CLASS_STREAMABLE,
+};
+
+enum conv_attrs_classification classify_conv_attrs(
+	const struct conv_attrs *ca);
+
 #endif /* CONVERT_H */
-- 
2.28.0


^ permalink raw reply	[flat|nested] 154+ messages in thread

* [PATCH v2 05/19] entry: extract a header file for entry.c functions
  2020-09-22 22:49 ` [PATCH v2 00/19] Parallel Checkout (part I) Matheus Tavares
                     ` (3 preceding siblings ...)
  2020-09-22 22:49   ` [PATCH v2 04/19] convert: add conv_attrs classification Matheus Tavares
@ 2020-09-22 22:49   ` Matheus Tavares
  2020-09-22 22:49   ` [PATCH v2 06/19] entry: make fstat_output() and read_blob_entry() public Matheus Tavares
                     ` (14 subsequent siblings)
  19 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-09-22 22:49 UTC (permalink / raw)
  To: git; +Cc: jeffhost, chriscool, peff, t.gummerer, newren

The declarations of entry.c's public functions and structures currently
reside in cache.h. Although not many, they contribute to the size of
cache.h and, when changed, cause the unnecessary recompilation of
modules that don't really use these functions. So let's move them to a
new entry.h header.

Original-patch-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 apply.c                  |  1 +
 builtin/checkout-index.c |  1 +
 builtin/checkout.c       |  1 +
 builtin/difftool.c       |  1 +
 cache.h                  | 24 -----------------------
 entry.c                  |  9 +--------
 entry.h                  | 41 ++++++++++++++++++++++++++++++++++++++++
 unpack-trees.c           |  1 +
 8 files changed, 47 insertions(+), 32 deletions(-)
 create mode 100644 entry.h

diff --git a/apply.c b/apply.c
index 76dba93c97..ddec80b4b0 100644
--- a/apply.c
+++ b/apply.c
@@ -21,6 +21,7 @@
 #include "quote.h"
 #include "rerere.h"
 #include "apply.h"
+#include "entry.h"
 
 struct gitdiff_data {
 	struct strbuf *root;
diff --git a/builtin/checkout-index.c b/builtin/checkout-index.c
index a854fd16e7..0f1ff73129 100644
--- a/builtin/checkout-index.c
+++ b/builtin/checkout-index.c
@@ -11,6 +11,7 @@
 #include "quote.h"
 #include "cache-tree.h"
 #include "parse-options.h"
+#include "entry.h"
 
 #define CHECKOUT_ALL 4
 static int nul_term_line;
diff --git a/builtin/checkout.c b/builtin/checkout.c
index 0951f8fee5..b18b9d6f3c 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -26,6 +26,7 @@
 #include "unpack-trees.h"
 #include "wt-status.h"
 #include "xdiff-interface.h"
+#include "entry.h"
 
 static const char * const checkout_usage[] = {
 	N_("git checkout [<options>] <branch>"),
diff --git a/builtin/difftool.c b/builtin/difftool.c
index 7ac432b881..dfa22b67eb 100644
--- a/builtin/difftool.c
+++ b/builtin/difftool.c
@@ -23,6 +23,7 @@
 #include "lockfile.h"
 #include "object-store.h"
 #include "dir.h"
+#include "entry.h"
 
 static int trust_exit_code;
 
diff --git a/cache.h b/cache.h
index cee8aa5dc3..17350cafa2 100644
--- a/cache.h
+++ b/cache.h
@@ -1706,30 +1706,6 @@ const char *show_ident_date(const struct ident_split *id,
  */
 int ident_cmp(const struct ident_split *, const struct ident_split *);
 
-struct checkout {
-	struct index_state *istate;
-	const char *base_dir;
-	int base_dir_len;
-	struct delayed_checkout *delayed_checkout;
-	struct checkout_metadata meta;
-	unsigned force:1,
-		 quiet:1,
-		 not_new:1,
-		 clone:1,
-		 refresh_cache:1;
-};
-#define CHECKOUT_INIT { NULL, "" }
-
-#define TEMPORARY_FILENAME_LENGTH 25
-int checkout_entry(struct cache_entry *ce, const struct checkout *state, char *topath, int *nr_checkouts);
-void enable_delayed_checkout(struct checkout *state);
-int finish_delayed_checkout(struct checkout *state, int *nr_checkouts);
-/*
- * Unlink the last component and schedule the leading directories for
- * removal, such that empty directories get removed.
- */
-void unlink_entry(const struct cache_entry *ce);
-
 struct cache_def {
 	struct strbuf path;
 	int flags;
diff --git a/entry.c b/entry.c
index a0532f1f00..b0b8099699 100644
--- a/entry.c
+++ b/entry.c
@@ -6,6 +6,7 @@
 #include "submodule.h"
 #include "progress.h"
 #include "fsmonitor.h"
+#include "entry.h"
 
 static void create_directories(const char *path, int path_len,
 			       const struct checkout *state)
@@ -429,14 +430,6 @@ static void mark_colliding_entries(const struct checkout *state,
 	}
 }
 
-/*
- * Write the contents from ce out to the working tree.
- *
- * When topath[] is not NULL, instead of writing to the working tree
- * file named by ce, a temporary file is created by this function and
- * its name is returned in topath[], which must be able to hold at
- * least TEMPORARY_FILENAME_LENGTH bytes long.
- */
 int checkout_entry(struct cache_entry *ce, const struct checkout *state,
 		   char *topath, int *nr_checkouts)
 {
diff --git a/entry.h b/entry.h
new file mode 100644
index 0000000000..2d69185448
--- /dev/null
+++ b/entry.h
@@ -0,0 +1,41 @@
+#ifndef ENTRY_H
+#define ENTRY_H
+
+#include "cache.h"
+#include "convert.h"
+
+struct checkout {
+	struct index_state *istate;
+	const char *base_dir;
+	int base_dir_len;
+	struct delayed_checkout *delayed_checkout;
+	struct checkout_metadata meta;
+	unsigned force:1,
+		 quiet:1,
+		 not_new:1,
+		 clone:1,
+		 refresh_cache:1;
+};
+#define CHECKOUT_INIT { NULL, "" }
+
+#define TEMPORARY_FILENAME_LENGTH 25
+
+/*
+ * Write the contents from ce out to the working tree.
+ *
+ * When topath[] is not NULL, instead of writing to the working tree
+ * file named by ce, a temporary file is created by this function and
+ * its name is returned in topath[], which must be able to hold at
+ * least TEMPORARY_FILENAME_LENGTH bytes long.
+ */
+int checkout_entry(struct cache_entry *ce, const struct checkout *state,
+		   char *topath, int *nr_checkouts);
+void enable_delayed_checkout(struct checkout *state);
+int finish_delayed_checkout(struct checkout *state, int *nr_checkouts);
+/*
+ * Unlink the last component and schedule the leading directories for
+ * removal, such that empty directories get removed.
+ */
+void unlink_entry(const struct cache_entry *ce);
+
+#endif /* ENTRY_H */
diff --git a/unpack-trees.c b/unpack-trees.c
index 323280dd48..a511fadd89 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -16,6 +16,7 @@
 #include "fsmonitor.h"
 #include "object-store.h"
 #include "promisor-remote.h"
+#include "entry.h"
 
 /*
  * Error messages expected by scripts out of plumbing commands such as
-- 
2.28.0


^ permalink raw reply	[flat|nested] 154+ messages in thread

* [PATCH v2 06/19] entry: make fstat_output() and read_blob_entry() public
  2020-09-22 22:49 ` [PATCH v2 00/19] Parallel Checkout (part I) Matheus Tavares
                     ` (4 preceding siblings ...)
  2020-09-22 22:49   ` [PATCH v2 05/19] entry: extract a header file for entry.c functions Matheus Tavares
@ 2020-09-22 22:49   ` Matheus Tavares
  2020-09-22 22:49   ` [PATCH v2 07/19] entry: extract cache_entry update from write_entry() Matheus Tavares
                     ` (13 subsequent siblings)
  19 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-09-22 22:49 UTC (permalink / raw)
  To: git; +Cc: jeffhost, chriscool, peff, t.gummerer, newren

These two functions will be used by the parallel checkout code, so let's
make them public. Note: since fstat_output() is now public, it is
renamed to fstat_checkout_output() to avoid future name collisions.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 entry.c | 8 ++++----
 entry.h | 2 ++
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/entry.c b/entry.c
index b0b8099699..b36071a610 100644
--- a/entry.c
+++ b/entry.c
@@ -84,7 +84,7 @@ static int create_file(const char *path, unsigned int mode)
 	return open(path, O_WRONLY | O_CREAT | O_EXCL, mode);
 }
 
-static void *read_blob_entry(const struct cache_entry *ce, unsigned long *size)
+void *read_blob_entry(const struct cache_entry *ce, unsigned long *size)
 {
 	enum object_type type;
 	void *blob_data = read_object_file(&ce->oid, &type, size);
@@ -109,7 +109,7 @@ static int open_output_fd(char *path, const struct cache_entry *ce, int to_tempf
 	}
 }
 
-static int fstat_output(int fd, const struct checkout *state, struct stat *st)
+int fstat_checkout_output(int fd, const struct checkout *state, struct stat *st)
 {
 	/* use fstat() only when path == ce->name */
 	if (fstat_is_reliable() &&
@@ -132,7 +132,7 @@ static int streaming_write_entry(const struct cache_entry *ce, char *path,
 		return -1;
 
 	result |= stream_blob_to_fd(fd, &ce->oid, filter, 1);
-	*fstat_done = fstat_output(fd, state, statbuf);
+	*fstat_done = fstat_checkout_output(fd, state, statbuf);
 	result |= close(fd);
 
 	if (result)
@@ -346,7 +346,7 @@ static int write_entry(struct cache_entry *ce,
 
 		wrote = write_in_full(fd, new_blob, size);
 		if (!to_tempfile)
-			fstat_done = fstat_output(fd, state, &st);
+			fstat_done = fstat_checkout_output(fd, state, &st);
 		close(fd);
 		free(new_blob);
 		if (wrote < 0)
diff --git a/entry.h b/entry.h
index 2d69185448..f860e60846 100644
--- a/entry.h
+++ b/entry.h
@@ -37,5 +37,7 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts);
  * removal, such that empty directories get removed.
  */
 void unlink_entry(const struct cache_entry *ce);
+void *read_blob_entry(const struct cache_entry *ce, unsigned long *size);
+int fstat_checkout_output(int fd, const struct checkout *state, struct stat *st);
 
 #endif /* ENTRY_H */
-- 
2.28.0


^ permalink raw reply	[flat|nested] 154+ messages in thread

* [PATCH v2 07/19] entry: extract cache_entry update from write_entry()
  2020-09-22 22:49 ` [PATCH v2 00/19] Parallel Checkout (part I) Matheus Tavares
                     ` (5 preceding siblings ...)
  2020-09-22 22:49   ` [PATCH v2 06/19] entry: make fstat_output() and read_blob_entry() public Matheus Tavares
@ 2020-09-22 22:49   ` Matheus Tavares
  2020-09-22 22:49   ` [PATCH v2 08/19] entry: move conv_attrs lookup up to checkout_entry() Matheus Tavares
                     ` (12 subsequent siblings)
  19 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-09-22 22:49 UTC (permalink / raw)
  To: git; +Cc: jeffhost, chriscool, peff, t.gummerer, newren

This code will be used by the parallel checkout functions, outside
entry.c, so extract it to a public function.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 entry.c | 25 ++++++++++++++++---------
 entry.h |  2 ++
 2 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/entry.c b/entry.c
index b36071a610..1d2df188e5 100644
--- a/entry.c
+++ b/entry.c
@@ -251,6 +251,18 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
 	return errs;
 }
 
+void update_ce_after_write(const struct checkout *state, struct cache_entry *ce,
+			   struct stat *st)
+{
+	if (state->refresh_cache) {
+		assert(state->istate);
+		fill_stat_cache_info(state->istate, ce, st);
+		ce->ce_flags |= CE_UPDATE_IN_BASE;
+		mark_fsmonitor_invalid(state->istate, ce);
+		state->istate->cache_changed |= CE_ENTRY_CHANGED;
+	}
+}
+
 static int write_entry(struct cache_entry *ce,
 		       char *path, const struct checkout *state, int to_tempfile)
 {
@@ -371,15 +383,10 @@ static int write_entry(struct cache_entry *ce,
 
 finish:
 	if (state->refresh_cache) {
-		assert(state->istate);
-		if (!fstat_done)
-			if (lstat(ce->name, &st) < 0)
-				return error_errno("unable to stat just-written file %s",
-						   ce->name);
-		fill_stat_cache_info(state->istate, ce, &st);
-		ce->ce_flags |= CE_UPDATE_IN_BASE;
-		mark_fsmonitor_invalid(state->istate, ce);
-		state->istate->cache_changed |= CE_ENTRY_CHANGED;
+		if (!fstat_done && lstat(ce->name, &st) < 0)
+			return error_errno("unable to stat just-written file %s",
+					   ce->name);
+		update_ce_after_write(state, ce, &st);
 	}
 delayed:
 	return 0;
diff --git a/entry.h b/entry.h
index f860e60846..664aed1576 100644
--- a/entry.h
+++ b/entry.h
@@ -39,5 +39,7 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts);
 void unlink_entry(const struct cache_entry *ce);
 void *read_blob_entry(const struct cache_entry *ce, unsigned long *size);
 int fstat_checkout_output(int fd, const struct checkout *state, struct stat *st);
+void update_ce_after_write(const struct checkout *state, struct cache_entry *ce,
+			   struct stat *st);
 
 #endif /* ENTRY_H */
-- 
2.28.0


^ permalink raw reply	[flat|nested] 154+ messages in thread

* [PATCH v2 08/19] entry: move conv_attrs lookup up to checkout_entry()
  2020-09-22 22:49 ` [PATCH v2 00/19] Parallel Checkout (part I) Matheus Tavares
                     ` (6 preceding siblings ...)
  2020-09-22 22:49   ` [PATCH v2 07/19] entry: extract cache_entry update from write_entry() Matheus Tavares
@ 2020-09-22 22:49   ` Matheus Tavares
  2020-10-01 15:53     ` Jeff Hostetler
  2020-09-22 22:49   ` [PATCH v2 09/19] entry: add checkout_entry_ca() which takes preloaded conv_attrs Matheus Tavares
                     ` (11 subsequent siblings)
  19 siblings, 1 reply; 154+ messages in thread
From: Matheus Tavares @ 2020-09-22 22:49 UTC (permalink / raw)
  To: git; +Cc: jeffhost, chriscool, peff, t.gummerer, newren

In a following patch, checkout_entry() will use conv_attrs to decide
whether an entry should be enqueued for parallel checkout or not. But
the attributes lookup only happens lower in this call stack. To avoid
the unnecessary work of loading the attributes twice, let's move it up
to checkout_entry(), and pass the loaded struct down to write_entry().

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
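Note: the hoisting done here — doing an expensive lookup once in the
caller and passing the result down, instead of letting each callee
repeat it — can be sketched with illustrative names (lookup(),
should_enqueue(), do_write() are hypothetical, not Git's functions):

```c
static int n_lookups;	/* counts lookups, to show the saving */

typedef struct { int streamable; } attrs_t;	/* toy conv_attrs */

static void lookup(attrs_t *a, const char *path)
{
	(void)path;
	n_lookups++;
	a->streamable = 1;
}

static int should_enqueue(const attrs_t *a) { return a->streamable; }
static int do_write(const attrs_t *a)       { return a->streamable ? 0 : -1; }

/* Before the change, both helpers would call lookup() themselves
 * (two lookups per entry). After it, the caller looks up once and
 * passes the precomputed result down to whichever path runs. */
int process(const char *path)
{
	attrs_t a;
	lookup(&a, path);
	if (should_enqueue(&a))
		return 0;	/* queued for a parallel worker */
	return do_write(&a);	/* fallback: write immediately */
}
```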
 entry.c | 38 +++++++++++++++++++++++++++-----------
 1 file changed, 27 insertions(+), 11 deletions(-)

diff --git a/entry.c b/entry.c
index 1d2df188e5..8237859b12 100644
--- a/entry.c
+++ b/entry.c
@@ -263,8 +263,9 @@ void update_ce_after_write(const struct checkout *state, struct cache_entry *ce,
 	}
 }
 
-static int write_entry(struct cache_entry *ce,
-		       char *path, const struct checkout *state, int to_tempfile)
+/* Note: ca is used (and required) iff the entry refers to a regular file. */
+static int write_entry(struct cache_entry *ce, char *path, struct conv_attrs *ca,
+		       const struct checkout *state, int to_tempfile)
 {
 	unsigned int ce_mode_s_ifmt = ce->ce_mode & S_IFMT;
 	struct delayed_checkout *dco = state->delayed_checkout;
@@ -281,8 +282,7 @@ static int write_entry(struct cache_entry *ce,
 	clone_checkout_metadata(&meta, &state->meta, &ce->oid);
 
 	if (ce_mode_s_ifmt == S_IFREG) {
-		struct stream_filter *filter = get_stream_filter(state->istate, ce->name,
-								 &ce->oid);
+		struct stream_filter *filter = get_stream_filter_ca(ca, &ce->oid);
 		if (filter &&
 		    !streaming_write_entry(ce, path, filter,
 					   state, to_tempfile,
@@ -329,14 +329,17 @@ static int write_entry(struct cache_entry *ce,
 		 * Convert from git internal format to working tree format
 		 */
 		if (dco && dco->state != CE_NO_DELAY) {
-			ret = async_convert_to_working_tree(state->istate, ce->name, new_blob,
-							    size, &buf, &meta, dco);
+			ret = async_convert_to_working_tree_ca(ca, ce->name,
+							       new_blob, size,
+							       &buf, &meta, dco);
 			if (ret && string_list_has_string(&dco->paths, ce->name)) {
 				free(new_blob);
 				goto delayed;
 			}
-		} else
-			ret = convert_to_working_tree(state->istate, ce->name, new_blob, size, &buf, &meta);
+		} else {
+			ret = convert_to_working_tree_ca(ca, ce->name, new_blob,
+							 size, &buf, &meta);
+		}
 
 		if (ret) {
 			free(new_blob);
@@ -442,6 +445,7 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state,
 {
 	static struct strbuf path = STRBUF_INIT;
 	struct stat st;
+	struct conv_attrs ca;
 
 	if (ce->ce_flags & CE_WT_REMOVE) {
 		if (topath)
@@ -454,8 +458,13 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state,
 		return 0;
 	}
 
-	if (topath)
-		return write_entry(ce, topath, state, 1);
+	if (topath) {
+		if (S_ISREG(ce->ce_mode)) {
+			convert_attrs(state->istate, &ca, ce->name);
+			return write_entry(ce, topath, &ca, state, 1);
+		}
+		return write_entry(ce, topath, NULL, state, 1);
+	}
 
 	strbuf_reset(&path);
 	strbuf_add(&path, state->base_dir, state->base_dir_len);
@@ -517,9 +526,16 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state,
 		return 0;
 
 	create_directories(path.buf, path.len, state);
+
 	if (nr_checkouts)
 		(*nr_checkouts)++;
-	return write_entry(ce, path.buf, state, 0);
+
+	if (S_ISREG(ce->ce_mode)) {
+		convert_attrs(state->istate, &ca, ce->name);
+		return write_entry(ce, path.buf, &ca, state, 0);
+	}
+
+	return write_entry(ce, path.buf, NULL, state, 0);
 }
 
 void unlink_entry(const struct cache_entry *ce)
-- 
2.28.0


^ permalink raw reply	[flat|nested] 154+ messages in thread

* [PATCH v2 09/19] entry: add checkout_entry_ca() which takes preloaded conv_attrs
  2020-09-22 22:49 ` [PATCH v2 00/19] Parallel Checkout (part I) Matheus Tavares
                     ` (7 preceding siblings ...)
  2020-09-22 22:49   ` [PATCH v2 08/19] entry: move conv_attrs lookup up to checkout_entry() Matheus Tavares
@ 2020-09-22 22:49   ` Matheus Tavares
  2020-09-22 22:49   ` [PATCH v2 10/19] unpack-trees: add basic support for parallel checkout Matheus Tavares
                     ` (10 subsequent siblings)
  19 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-09-22 22:49 UTC (permalink / raw)
  To: git; +Cc: jeffhost, chriscool, peff, t.gummerer, newren

The parallel checkout machinery will call checkout_entry() for entries
that could not be written in parallel due to path collisions. At this
point, we will already be holding the conversion attributes for each
entry, and it would be wasteful to let checkout_entry() load these
again. Instead, let's add the checkout_entry_ca() variant, which
optionally takes a preloaded conv_attrs struct.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 entry.c | 23 ++++++++++++-----------
 entry.h | 13 +++++++++++--
 2 files changed, 23 insertions(+), 13 deletions(-)

diff --git a/entry.c b/entry.c
index 8237859b12..9d79a5671f 100644
--- a/entry.c
+++ b/entry.c
@@ -440,12 +440,13 @@ static void mark_colliding_entries(const struct checkout *state,
 	}
 }
 
-int checkout_entry(struct cache_entry *ce, const struct checkout *state,
-		   char *topath, int *nr_checkouts)
+int checkout_entry_ca(struct cache_entry *ce, struct conv_attrs *ca,
+		      const struct checkout *state, char *topath,
+		      int *nr_checkouts)
 {
 	static struct strbuf path = STRBUF_INIT;
 	struct stat st;
-	struct conv_attrs ca;
+	struct conv_attrs ca_buf;
 
 	if (ce->ce_flags & CE_WT_REMOVE) {
 		if (topath)
@@ -459,11 +460,11 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state,
 	}
 
 	if (topath) {
-		if (S_ISREG(ce->ce_mode)) {
-			convert_attrs(state->istate, &ca, ce->name);
-			return write_entry(ce, topath, &ca, state, 1);
+		if (S_ISREG(ce->ce_mode) && !ca) {
+			convert_attrs(state->istate, &ca_buf, ce->name);
+			ca = &ca_buf;
 		}
-		return write_entry(ce, topath, NULL, state, 1);
+		return write_entry(ce, topath, ca, state, 1);
 	}
 
 	strbuf_reset(&path);
@@ -530,12 +531,12 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state,
 	if (nr_checkouts)
 		(*nr_checkouts)++;
 
-	if (S_ISREG(ce->ce_mode)) {
-		convert_attrs(state->istate, &ca, ce->name);
-		return write_entry(ce, path.buf, &ca, state, 0);
+	if (S_ISREG(ce->ce_mode) && !ca) {
+		convert_attrs(state->istate, &ca_buf, ce->name);
+		ca = &ca_buf;
 	}
 
-	return write_entry(ce, path.buf, NULL, state, 0);
+	return write_entry(ce, path.buf, ca, state, 0);
 }
 
 void unlink_entry(const struct cache_entry *ce)
diff --git a/entry.h b/entry.h
index 664aed1576..2081fbbbab 100644
--- a/entry.h
+++ b/entry.h
@@ -27,9 +27,18 @@ struct checkout {
  * file named by ce, a temporary file is created by this function and
  * its name is returned in topath[], which must be able to hold at
  * least TEMPORARY_FILENAME_LENGTH bytes long.
+ *
+ * With checkout_entry_ca(), callers can optionally pass a preloaded
+ * conv_attrs struct (to avoid reloading it), when ce refers to a
+ * regular file. If ca is NULL, the attributes will be loaded
+ * internally when (and if) needed.
  */
-int checkout_entry(struct cache_entry *ce, const struct checkout *state,
-		   char *topath, int *nr_checkouts);
+#define checkout_entry(ce, state, topath, nr_checkouts) \
+		checkout_entry_ca(ce, NULL, state, topath, nr_checkouts)
+int checkout_entry_ca(struct cache_entry *ce, struct conv_attrs *ca,
+		      const struct checkout *state, char *topath,
+		      int *nr_checkouts);
+
 void enable_delayed_checkout(struct checkout *state);
 int finish_delayed_checkout(struct checkout *state, int *nr_checkouts);
 /*
-- 
2.28.0


^ permalink raw reply	[flat|nested] 154+ messages in thread

* [PATCH v2 10/19] unpack-trees: add basic support for parallel checkout
  2020-09-22 22:49 ` [PATCH v2 00/19] Parallel Checkout (part I) Matheus Tavares
                     ` (8 preceding siblings ...)
  2020-09-22 22:49   ` [PATCH v2 09/19] entry: add checkout_entry_ca() which takes preloaded conv_attrs Matheus Tavares
@ 2020-09-22 22:49   ` Matheus Tavares
  2020-10-05  6:17     ` [PATCH] parallel-checkout: drop unused checkout state parameter Jeff King
  2020-09-22 22:49   ` [PATCH v2 11/19] parallel-checkout: make it truly parallel Matheus Tavares
                     ` (9 subsequent siblings)
  19 siblings, 1 reply; 154+ messages in thread
From: Matheus Tavares @ 2020-09-22 22:49 UTC (permalink / raw)
  To: git; +Cc: jeffhost, chriscool, peff, t.gummerer, newren

This new interface allows us to enqueue some of the entries being
checked out, so that write_entry() can later be called for them in
parallel. For now,
the parallel checkout machinery is enabled by default and there is no
user configuration, but run_parallel_checkout() just writes the queued
entries in sequence (without spawning additional workers). The next
patch will actually implement the parallelism and, later, we will make
it configurable.

When there are path collisions among the entries being written (which
can happen e.g. with case-sensitive files in case-insensitive file
systems), the parallel checkout code detects the problem and marks the
item with PC_ITEM_COLLIDED. Later, these items are sequentially fed to
checkout_entry() again. This is similar to the way the sequential code
deals with collisions, overwriting the previously checked out entries
with the subsequent ones. The only difference is that, when we start
writing the entries in parallel, we won't be able to determine which of
the colliding entries will survive on disk (for the sequential
algorithm, it is always the last one).

I also experimented with the idea of not overwriting colliding entries,
and it seemed to work well in my simple tests. However, because just one
entry of each colliding group would be actually written, the others
would have null lstat() fields on the index. This might not be a problem
by itself, but it could cause performance penalties for subsequent
commands that need to refresh the index: when the cached st_size value
is 0, read-cache.c:ie_modified() will go to the filesystem to see if the
contents match. As mentioned in the function:

    * Immediately after read-tree or update-index --cacheinfo,
    * the length field is zero, as we have never even read the
    * lstat(2) information once, and we cannot trust DATA_CHANGED
    * returned by ie_match_stat() which in turn was returned by
    * ce_match_stat_basic() to signal that the filesize of the
    * blob changed.  We have to actually go to the filesystem to
    * see if the contents match, and if so, should answer "unchanged".

So, if we have N entries in a colliding group and we decide to write and
lstat() only one of them, every subsequent git-status will have to read,
convert, and hash the written file N - 1 times, to check that the N - 1
unwritten entries are dirty. By checking out all colliding entries (like
the sequential code does), we only pay the overhead once.

Co-authored-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Co-authored-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---

Note: currently, we have to check leading directories again before
writing each parallel-eligible entry, as explained in the respective
code comment. But I plan to remove this extra work in part II, by
postponing the checkout of symlinks to *after* the parallel-eligible
entries.
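The collision handling described in the commit message — write
everything, but flag colliding groups so they can be retried
sequentially afterwards — can be sketched with a toy queue. This is a
simplified illustration (case-insensitive name comparison stands in
for filesystem collision detection), not Git's parallel-checkout code:

```c
#include <ctype.h>
#include <stddef.h>

struct item {
	const char *name;
	int written;
	int collided;	/* mirrors PC_ITEM_COLLIDED */
};

/* Toy collision predicate: equal names ignoring case, as on a
 * case-insensitive filesystem. */
static int names_collide(const char *a, const char *b)
{
	for (; *a && *b; a++, b++)
		if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
			return 0;
	return !*a && !*b;
}

/* Bulk pass: "write" every entry, and mark every member of a
 * colliding group so that checkout_entry() can be re-run for them
 * sequentially later. */
void run_queue(struct item *q, size_t n)
{
	size_t i, j;

	for (i = 0; i < n; i++) {
		q[i].written = 1;
		for (j = 0; j < i; j++)
			if (names_collide(q[i].name, q[j].name))
				q[i].collided = q[j].collided = 1;
	}
}
```

As in the series, all members of a colliding group are written (and
lstat()'d), so later index refreshes don't have to re-hash N - 1
entries with zeroed stat data.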

 Makefile            |   1 +
 entry.c             |  17 +-
 parallel-checkout.c | 368 ++++++++++++++++++++++++++++++++++++++++++++
 parallel-checkout.h |  27 ++++
 unpack-trees.c      |   6 +-
 5 files changed, 416 insertions(+), 3 deletions(-)
 create mode 100644 parallel-checkout.c
 create mode 100644 parallel-checkout.h

diff --git a/Makefile b/Makefile
index f1b1bc8aa0..3edcdc534c 100644
--- a/Makefile
+++ b/Makefile
@@ -932,6 +932,7 @@ LIB_OBJS += pack-revindex.o
 LIB_OBJS += pack-write.o
 LIB_OBJS += packfile.o
 LIB_OBJS += pager.o
+LIB_OBJS += parallel-checkout.o
 LIB_OBJS += parse-options-cb.o
 LIB_OBJS += parse-options.o
 LIB_OBJS += patch-delta.o
diff --git a/entry.c b/entry.c
index 9d79a5671f..6676954431 100644
--- a/entry.c
+++ b/entry.c
@@ -7,6 +7,7 @@
 #include "progress.h"
 #include "fsmonitor.h"
 #include "entry.h"
+#include "parallel-checkout.h"
 
 static void create_directories(const char *path, int path_len,
 			       const struct checkout *state)
@@ -426,8 +427,17 @@ static void mark_colliding_entries(const struct checkout *state,
 	for (i = 0; i < state->istate->cache_nr; i++) {
 		struct cache_entry *dup = state->istate->cache[i];
 
-		if (dup == ce)
-			break;
+		if (dup == ce) {
+			/*
+			 * Parallel checkout creates the files in no particular
+			 * order. So the other side of the collision may appear
+			 * after the given cache_entry in the array.
+			 */
+			if (parallel_checkout_status() == PC_RUNNING)
+				continue;
+			else
+				break;
+		}
 
 		if (dup->ce_flags & (CE_MATCHED | CE_VALID | CE_SKIP_WORKTREE))
 			continue;
@@ -536,6 +546,9 @@ int checkout_entry_ca(struct cache_entry *ce, struct conv_attrs *ca,
 		ca = &ca_buf;
 	}
 
+	if (!enqueue_checkout(ce, ca))
+		return 0;
+
 	return write_entry(ce, path.buf, ca, state, 0);
 }
 
diff --git a/parallel-checkout.c b/parallel-checkout.c
new file mode 100644
index 0000000000..7dc8ab2a67
--- /dev/null
+++ b/parallel-checkout.c
@@ -0,0 +1,368 @@
+#include "cache.h"
+#include "entry.h"
+#include "parallel-checkout.h"
+#include "streaming.h"
+
+enum pc_item_status {
+	PC_ITEM_PENDING = 0,
+	PC_ITEM_WRITTEN,
+	/*
+	 * The entry could not be written because there was another file
+	 * already present in its path or leading directories. Since
+	 * checkout_entry_ca() removes such files from the working tree before
+	 * enqueueing the entry for parallel checkout, it means that there was
+	 * a path collision among the entries being written.
+	 */
+	PC_ITEM_COLLIDED,
+	PC_ITEM_FAILED,
+};
+
+struct parallel_checkout_item {
+	/* pointer to a istate->cache[] entry. Not owned by us. */
+	struct cache_entry *ce;
+	struct conv_attrs ca;
+	struct stat st;
+	enum pc_item_status status;
+};
+
+struct parallel_checkout {
+	enum pc_status status;
+	struct parallel_checkout_item *items;
+	size_t nr, alloc;
+};
+
+static struct parallel_checkout parallel_checkout = { 0 };
+
+enum pc_status parallel_checkout_status(void)
+{
+	return parallel_checkout.status;
+}
+
+void init_parallel_checkout(void)
+{
+	if (parallel_checkout.status != PC_UNINITIALIZED)
+		BUG("parallel checkout already initialized");
+
+	parallel_checkout.status = PC_ACCEPTING_ENTRIES;
+}
+
+static void finish_parallel_checkout(void)
+{
+	if (parallel_checkout.status == PC_UNINITIALIZED)
+		BUG("cannot finish parallel checkout: not initialized yet");
+
+	free(parallel_checkout.items);
+	memset(&parallel_checkout, 0, sizeof(parallel_checkout));
+}
+
+static int is_eligible_for_parallel_checkout(const struct cache_entry *ce,
+					     const struct conv_attrs *ca)
+{
+	enum conv_attrs_classification c;
+
+	if (!S_ISREG(ce->ce_mode))
+		return 0;
+
+	c = classify_conv_attrs(ca);
+	switch (c) {
+	case CA_CLASS_INCORE:
+		return 1;
+
+	case CA_CLASS_INCORE_FILTER:
+		/*
+		 * It would be safe to allow concurrent instances of
+		 * single-file smudge filters, like rot13, but we should not
+		 * assume that all filters are parallel-process safe. So we
+		 * don't allow this.
+		 */
+		return 0;
+
+	case CA_CLASS_INCORE_PROCESS:
+		/*
+		 * The parallel queue and the delayed queue are not compatible,
+		 * so they must be kept completely separated. And we can't tell
+		 * if a long-running process will delay its response without
+		 * actually asking it to perform the filtering. Therefore, this
+		 * type of filter is not allowed in parallel checkout.
+		 *
+		 * Furthermore, there should only be one instance of the
+		 * long-running process filter as we don't know how it is
+		 * managing its own concurrency. So, spreading the entries that
+		 * require such a filter among the parallel workers would
+		 * require a lot more inter-process communication. We would
+		 * probably have to designate a single process to interact with
+		 * the filter and send all the necessary data to it, for each
+		 * entry.
+		 */
+		return 0;
+
+	case CA_CLASS_STREAMABLE:
+		return 1;
+
+	default:
+		BUG("unsupported conv_attrs classification '%d'", c);
+	}
+}
+
+int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca)
+{
+	struct parallel_checkout_item *pc_item;
+
+	if (parallel_checkout.status != PC_ACCEPTING_ENTRIES ||
+	    !is_eligible_for_parallel_checkout(ce, ca))
+		return -1;
+
+	ALLOC_GROW(parallel_checkout.items, parallel_checkout.nr + 1,
+		   parallel_checkout.alloc);
+
+	pc_item = &parallel_checkout.items[parallel_checkout.nr++];
+	pc_item->ce = ce;
+	memcpy(&pc_item->ca, ca, sizeof(pc_item->ca));
+	pc_item->status = PC_ITEM_PENDING;
+
+	return 0;
+}
+
+static int handle_results(struct checkout *state)
+{
+	int ret = 0;
+	size_t i;
+	int have_pending = 0;
+
+	/*
+	 * We first update the successfully written entries with the collected
+	 * stat() data, so that they can be found by mark_colliding_entries(),
+	 * in the next loop, when necessary.
+	 */
+	for (i = 0; i < parallel_checkout.nr; ++i) {
+		struct parallel_checkout_item *pc_item = &parallel_checkout.items[i];
+		if (pc_item->status == PC_ITEM_WRITTEN)
+			update_ce_after_write(state, pc_item->ce, &pc_item->st);
+	}
+
+	for (i = 0; i < parallel_checkout.nr; ++i) {
+		struct parallel_checkout_item *pc_item = &parallel_checkout.items[i];
+
+		switch(pc_item->status) {
+		case PC_ITEM_WRITTEN:
+			/* Already handled */
+			break;
+		case PC_ITEM_COLLIDED:
+			/*
+			 * The entry could not be checked out due to a path
+			 * collision with another entry. Since there can only
+			 * be one entry of each colliding group on the disk, we
+			 * could skip trying to check out this one and move on.
+			 * However, this would leave the unwritten entries with
+			 * null stat() fields in the index, which could
+			 * potentially slow down subsequent operations that
+			 * require refreshing it: git would not be able to
+			 * trust st_size and would have to go to the filesystem
+			 * to see if the contents match (see ie_modified()).
+			 *
+			 * Instead, let's pay the overhead only once, now, and
+			 * call checkout_entry_ca() again for this file, to
+			 * have its stat() data stored in the index. This also
+			 * has the benefit of adding this entry and its
+			 * colliding pair to the collision report message.
+			 * Additionally, this overwriting behavior is consistent
+			 * with what the sequential checkout does, so it doesn't
+			 * add any extra overhead.
+			 */
+			ret |= checkout_entry_ca(pc_item->ce, &pc_item->ca,
+						 state, NULL, NULL);
+			break;
+		case PC_ITEM_PENDING:
+			have_pending = 1;
+			/* fall through */
+		case PC_ITEM_FAILED:
+			ret = -1;
+			break;
+		default:
+			BUG("unknown checkout item status in parallel checkout");
+		}
+	}
+
+	if (have_pending)
+		error(_("parallel checkout finished with pending entries"));
+
+	return ret;
+}
+
+static int reset_fd(int fd, const char *path)
+{
+	if (lseek(fd, 0, SEEK_SET) != 0)
+		return error_errno("failed to rewind descriptor of %s", path);
+	if (ftruncate(fd, 0))
+		return error_errno("failed to truncate file %s", path);
+	return 0;
+}
+
+static int write_pc_item_to_fd(struct parallel_checkout_item *pc_item, int fd,
+			       const char *path, struct checkout *state)
+{
+	int ret;
+	struct stream_filter *filter;
+	struct strbuf buf = STRBUF_INIT;
+	char *new_blob;
+	unsigned long size;
+	size_t newsize = 0;
+	ssize_t wrote;
+
+	/* Sanity check */
+	assert(is_eligible_for_parallel_checkout(pc_item->ce, &pc_item->ca));
+
+	filter = get_stream_filter_ca(&pc_item->ca, &pc_item->ce->oid);
+	if (filter) {
+		if (stream_blob_to_fd(fd, &pc_item->ce->oid, filter, 1)) {
+			/* On error, reset fd to try writing without streaming */
+			if (reset_fd(fd, path))
+				return -1;
+		} else {
+			return 0;
+		}
+	}
+
+	new_blob = read_blob_entry(pc_item->ce, &size);
+	if (!new_blob)
+		return error("unable to read sha1 file of %s (%s)", path,
+			     oid_to_hex(&pc_item->ce->oid));
+
+	/*
+	 * checkout metadata is used to give context for external process
+	 * filters. Files requiring such filters are not eligible for parallel
+	 * checkout, so pass NULL.
+	 */
+	ret = convert_to_working_tree_ca(&pc_item->ca, pc_item->ce->name,
+					 new_blob, size, &buf, NULL);
+
+	if (ret) {
+		free(new_blob);
+		new_blob = strbuf_detach(&buf, &newsize);
+		size = newsize;
+	}
+
+	wrote = write_in_full(fd, new_blob, size);
+	free(new_blob);
+	if (wrote < 0)
+		return error("unable to write file %s", path);
+
+	return 0;
+}
+
+static int close_and_clear(int *fd)
+{
+	int ret = 0;
+
+	if (*fd >= 0) {
+		ret = close(*fd);
+		*fd = -1;
+	}
+
+	return ret;
+}
+
+static int check_leading_dirs(const char *path, int len, int prefix_len)
+{
+	const char *slash = path + len;
+
+	while (slash > path && *slash != '/')
+		slash--;
+
+	return has_dirs_only_path(path, slash - path, prefix_len);
+}
+
+static void write_pc_item(struct parallel_checkout_item *pc_item,
+			  struct checkout *state)
+{
+	unsigned int mode = (pc_item->ce->ce_mode & 0100) ? 0777 : 0666;
+	int fd = -1, fstat_done = 0;
+	struct strbuf path = STRBUF_INIT;
+
+	strbuf_add(&path, state->base_dir, state->base_dir_len);
+	strbuf_add(&path, pc_item->ce->name, pc_item->ce->ce_namelen);
+
+	/*
+	 * At this point, leading dirs should have already been created. But if
+	 * a symlink being checked out has collided with one of the dirs, due to
+	 * file system folding rules, it's possible that the dirs are no longer
+	 * present. So we have to check again, and report any path collisions.
+	 */
+	if (!check_leading_dirs(path.buf, path.len, state->base_dir_len)) {
+		pc_item->status = PC_ITEM_COLLIDED;
+		goto out;
+	}
+
+	fd = open(path.buf, O_WRONLY | O_CREAT | O_EXCL, mode);
+
+	if (fd < 0) {
+		if (errno == EEXIST || errno == EISDIR) {
+			/*
+			 * Errors which probably represent a path collision.
+			 * Suppress the error message and mark the item to be
+			 * retried later, sequentially. ENOTDIR and ENOENT are
+			 * also interesting, but check_leading_dirs() should
+			 * have already caught these cases.
+			 */
+			pc_item->status = PC_ITEM_COLLIDED;
+		} else {
+			error_errno("failed to open file %s", path.buf);
+			pc_item->status = PC_ITEM_FAILED;
+		}
+		goto out;
+	}
+
+	if (write_pc_item_to_fd(pc_item, fd, path.buf, state)) {
+		/* Error was already reported. */
+		pc_item->status = PC_ITEM_FAILED;
+		goto out;
+	}
+
+	fstat_done = fstat_checkout_output(fd, state, &pc_item->st);
+
+	if (close_and_clear(&fd)) {
+		error_errno("unable to close file %s", path.buf);
+		pc_item->status = PC_ITEM_FAILED;
+		goto out;
+	}
+
+	if (state->refresh_cache && !fstat_done && lstat(path.buf, &pc_item->st) < 0) {
+		error_errno("unable to stat just-written file %s", path.buf);
+		pc_item->status = PC_ITEM_FAILED;
+		goto out;
+	}
+
+	pc_item->status = PC_ITEM_WRITTEN;
+
+out:
+	/*
+	 * No need to check close() return. At this point, either fd is already
+	 * closed, or we are on an error path, that has already been reported.
+	 */
+	close_and_clear(&fd);
+	strbuf_release(&path);
+}
+
+static void write_items_sequentially(struct checkout *state)
+{
+	size_t i;
+
+	for (i = 0; i < parallel_checkout.nr; ++i)
+		write_pc_item(&parallel_checkout.items[i], state);
+}
+
+int run_parallel_checkout(struct checkout *state)
+{
+	int ret;
+
+	if (parallel_checkout.status != PC_ACCEPTING_ENTRIES)
+		BUG("cannot run parallel checkout: uninitialized or already running");
+
+	parallel_checkout.status = PC_RUNNING;
+
+	write_items_sequentially(state);
+	ret = handle_results(state);
+
+	finish_parallel_checkout();
+	return ret;
+}
diff --git a/parallel-checkout.h b/parallel-checkout.h
new file mode 100644
index 0000000000..e6d6fc01ea
--- /dev/null
+++ b/parallel-checkout.h
@@ -0,0 +1,27 @@
+#ifndef PARALLEL_CHECKOUT_H
+#define PARALLEL_CHECKOUT_H
+
+struct cache_entry;
+struct checkout;
+struct conv_attrs;
+
+enum pc_status {
+	PC_UNINITIALIZED = 0,
+	PC_ACCEPTING_ENTRIES,
+	PC_RUNNING,
+};
+
+enum pc_status parallel_checkout_status(void);
+void init_parallel_checkout(void);
+
+/*
+ * Return -1 if parallel checkout is currently not enabled or if the entry is
+ * not eligible for parallel checkout. Otherwise, enqueue the entry for later
+ * write and return 0.
+ */
+int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca);
+
+/* Write all the queued entries, returning 0 on success. */
+int run_parallel_checkout(struct checkout *state);
+
+#endif /* PARALLEL_CHECKOUT_H */
diff --git a/unpack-trees.c b/unpack-trees.c
index a511fadd89..1b1da7485a 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -17,6 +17,7 @@
 #include "object-store.h"
 #include "promisor-remote.h"
 #include "entry.h"
+#include "parallel-checkout.h"
 
 /*
  * Error messages expected by scripts out of plumbing commands such as
@@ -438,7 +439,6 @@ static int check_updates(struct unpack_trees_options *o,
 	if (should_update_submodules())
 		load_gitmodules_file(index, &state);
 
-	enable_delayed_checkout(&state);
 	if (has_promisor_remote()) {
 		/*
 		 * Prefetch the objects that are to be checked out in the loop
@@ -461,6 +461,9 @@ static int check_updates(struct unpack_trees_options *o,
 					   to_fetch.oid, to_fetch.nr);
 		oid_array_clear(&to_fetch);
 	}
+
+	enable_delayed_checkout(&state);
+	init_parallel_checkout();
 	for (i = 0; i < index->cache_nr; i++) {
 		struct cache_entry *ce = index->cache[i];
 
@@ -474,6 +477,7 @@ static int check_updates(struct unpack_trees_options *o,
 		}
 	}
 	stop_progress(&progress);
+	errs |= run_parallel_checkout(&state);
 	errs |= finish_delayed_checkout(&state, NULL);
 	git_attr_set_direction(GIT_ATTR_CHECKIN);
 
-- 
2.28.0


^ permalink raw reply	[flat|nested] 154+ messages in thread

* [PATCH v2 11/19] parallel-checkout: make it truly parallel
  2020-09-22 22:49 ` [PATCH v2 00/19] Parallel Checkout (part I) Matheus Tavares
                     ` (9 preceding siblings ...)
  2020-09-22 22:49   ` [PATCH v2 10/19] unpack-trees: add basic support for parallel checkout Matheus Tavares
@ 2020-09-22 22:49   ` Matheus Tavares
  2020-09-29 19:52     ` Martin Ågren
  2020-09-22 22:49   ` [PATCH v2 12/19] parallel-checkout: support progress displaying Matheus Tavares
                     ` (8 subsequent siblings)
  19 siblings, 1 reply; 154+ messages in thread
From: Matheus Tavares @ 2020-09-22 22:49 UTC (permalink / raw)
  To: git; +Cc: jeffhost, chriscool, peff, t.gummerer, newren

Use multiple worker processes to distribute the queued entries and call
write_checkout_item() in parallel for them. The items are distributed
uniformly in contiguous chunks. This minimizes the chances of two
workers writing to the same directory simultaneously, which could
affect performance due to lock contention in the kernel. Work stealing
(or any other form of re-distribution) is not implemented yet.

The parallel version was benchmarked during three operations in the
linux repo, with cold cache: cloning v5.8, checking out v5.8 from
v2.6.15 (checkout I) and checking out v5.8 from v5.7 (checkout II). The
four tables below show the mean run times and standard deviations for
5 runs in: a local file system with SSD, a local file system with HDD, a
Linux NFS server, and Amazon EFS. The numbers of workers were chosen
based on what produced the best result for each case.

Local SSD:

            Clone                  Checkout I             Checkout II
Sequential  8.171 s ± 0.206 s      8.735 s ± 0.230 s      4.166 s ± 0.246 s
10 workers  3.277 s ± 0.138 s      3.774 s ± 0.188 s      2.561 s ± 0.120 s
Speedup     2.49 ± 0.12            2.31 ± 0.13            1.63 ± 0.12

Local HDD:

            Clone                  Checkout I             Checkout II
Sequential  35.157 s ± 0.205 s     48.835 s ± 0.407 s     47.302 s ± 1.435 s
8 workers   35.538 s ± 0.325 s     49.353 s ± 0.826 s     48.919 s ± 0.416 s
Speedup     0.99 ± 0.01            0.99 ± 0.02            0.97 ± 0.03

Linux NFS server (v4.1, on EBS, single availability zone):

            Clone                  Checkout I             Checkout II
Sequential  216.070 s ± 3.611 s    211.169 s ± 3.147 s    57.446 s ± 1.301 s
32 workers  67.997 s ± 0.740 s     66.563 s ± 0.457 s     23.708 s ± 0.622 s
Speedup     3.18 ± 0.06            3.17 ± 0.05            2.42 ± 0.08

EFS (v4.1, replicated over multiple availability zones):

            Clone                  Checkout I             Checkout II
Sequential  1249.329 s ± 13.857 s  1438.979 s ± 78.792 s  543.919 s ± 18.745 s
64 workers  225.864 s ± 12.433 s   316.345 s ± 1.887 s    183.648 s ± 10.095 s
Speedup     5.53 ± 0.31            4.55 ± 0.25            2.96 ± 0.19

The above benchmarks show that parallel checkout is most effective on
repositories located on an SSD or over a distributed file system. For
local file systems on spinning disks and/or older machines, parallelism
does not always bring a performance improvement; in fact, it can even
increase the run time. For this reason, the sequential code is
still the default. Two settings are added to optionally enable and
configure the new parallel version as desired.

Local SSD tests were executed on an i7-7700HQ (4 cores with
hyper-threading) running Manjaro Linux. Local HDD tests were executed on
an i7-2600 (also 4 cores with hyper-threading) with a Seagate Barracuda
7200 rpm SATA 3.0 HDD, running Debian 9.13. NFS and EFS tests were
executed on an Amazon EC2 c5n.large instance, with 2 vCPUs. The Linux
NFS server was running on an m6g.large instance with a 1 TB EBS GP2
volume. Before each timing, the linux repository was removed (or checked
back out), and `sync && sysctl vm.drop_caches=3` was executed.

Co-authored-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Co-authored-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 .gitignore                        |   1 +
 Documentation/config/checkout.txt |  21 +++
 Makefile                          |   1 +
 builtin.h                         |   1 +
 builtin/checkout--helper.c        | 142 ++++++++++++++++
 git.c                             |   2 +
 parallel-checkout.c               | 273 +++++++++++++++++++++++++++---
 parallel-checkout.h               |  84 ++++++++-
 unpack-trees.c                    |  10 +-
 9 files changed, 501 insertions(+), 34 deletions(-)
 create mode 100644 builtin/checkout--helper.c

diff --git a/.gitignore b/.gitignore
index d0f692a355..6427739814 100644
--- a/.gitignore
+++ b/.gitignore
@@ -33,6 +33,7 @@
 /git-check-mailmap
 /git-check-ref-format
 /git-checkout
+/git-checkout--helper
 /git-checkout-index
 /git-cherry
 /git-cherry-pick
diff --git a/Documentation/config/checkout.txt b/Documentation/config/checkout.txt
index 6b646813ab..44eb58bcd3 100644
--- a/Documentation/config/checkout.txt
+++ b/Documentation/config/checkout.txt
@@ -16,3 +16,24 @@ will checkout the '<something>' branch on another remote,
 and by linkgit:git-worktree[1] when 'git worktree add' refers to a
 remote branch. This setting might be used for other checkout-like
 commands or functionality in the future.
+
+checkout.workers::
+	The number of parallel workers to use when updating the working tree.
+	The default is one, i.e. sequential execution. If set to a value less
+	than one, Git will use as many workers as the number of logical cores
+	available. This setting and checkout.thresholdForParallelism affect all
+	commands that perform checkout. E.g. checkout, switch, clone, reset,
+	sparse-checkout, read-tree, etc.
++
+Note: parallel checkout usually delivers better performance for repositories
+located on SSDs or over NFS. For repositories on spinning disks and/or machines
+with a small number of cores, the default sequential checkout often performs
+better. The size and compression level of a repository might also influence how
+well the parallel version performs.
+
+checkout.thresholdForParallelism::
+	When running parallel checkout with a small number of files, the cost
+	of subprocess spawning and inter-process communication might outweigh
+	the parallelization gains. This setting allows defining the minimum
+	number of files for which parallel checkout should be attempted. The
+	default is 100.
diff --git a/Makefile b/Makefile
index 3edcdc534c..e9c6616180 100644
--- a/Makefile
+++ b/Makefile
@@ -1049,6 +1049,7 @@ BUILTIN_OBJS += builtin/check-attr.o
 BUILTIN_OBJS += builtin/check-ignore.o
 BUILTIN_OBJS += builtin/check-mailmap.o
 BUILTIN_OBJS += builtin/check-ref-format.o
+BUILTIN_OBJS += builtin/checkout--helper.o
 BUILTIN_OBJS += builtin/checkout-index.o
 BUILTIN_OBJS += builtin/checkout.o
 BUILTIN_OBJS += builtin/clean.o
diff --git a/builtin.h b/builtin.h
index ba954e180c..b52243848d 100644
--- a/builtin.h
+++ b/builtin.h
@@ -123,6 +123,7 @@ int cmd_bugreport(int argc, const char **argv, const char *prefix);
 int cmd_bundle(int argc, const char **argv, const char *prefix);
 int cmd_cat_file(int argc, const char **argv, const char *prefix);
 int cmd_checkout(int argc, const char **argv, const char *prefix);
+int cmd_checkout__helper(int argc, const char **argv, const char *prefix);
 int cmd_checkout_index(int argc, const char **argv, const char *prefix);
 int cmd_check_attr(int argc, const char **argv, const char *prefix);
 int cmd_check_ignore(int argc, const char **argv, const char *prefix);
diff --git a/builtin/checkout--helper.c b/builtin/checkout--helper.c
new file mode 100644
index 0000000000..67fe37cf11
--- /dev/null
+++ b/builtin/checkout--helper.c
@@ -0,0 +1,142 @@
+#include "builtin.h"
+#include "config.h"
+#include "entry.h"
+#include "parallel-checkout.h"
+#include "parse-options.h"
+#include "pkt-line.h"
+
+static void packet_to_pc_item(char *line, int len,
+			      struct parallel_checkout_item *pc_item)
+{
+	struct pc_item_fixed_portion *fixed_portion;
+	char *encoding, *variant;
+
+	if (len < sizeof(struct pc_item_fixed_portion))
+		BUG("checkout worker received too short item (got %dB, exp %dB)",
+		    len, (int)sizeof(struct pc_item_fixed_portion));
+
+	fixed_portion = (struct pc_item_fixed_portion *)line;
+
+	if (len - sizeof(struct pc_item_fixed_portion) !=
+		fixed_portion->name_len + fixed_portion->working_tree_encoding_len)
+		BUG("checkout worker received corrupted item");
+
+	variant = line + sizeof(struct pc_item_fixed_portion);
+
+	/*
+	 * Note: the main process uses zero length to communicate that the
+	 * encoding is NULL. There is no use case in actually sending an empty
+	 * string since it's considered as NULL when ca.working_tree_encoding
+	 * is set at git_path_check_encoding().
+	 */
+	if (fixed_portion->working_tree_encoding_len) {
+		encoding = xmemdupz(variant,
+				    fixed_portion->working_tree_encoding_len);
+		variant += fixed_portion->working_tree_encoding_len;
+	} else {
+		encoding = NULL;
+	}
+
+	memset(pc_item, 0, sizeof(*pc_item));
+	pc_item->ce = make_empty_transient_cache_entry(fixed_portion->name_len);
+	pc_item->ce->ce_namelen = fixed_portion->name_len;
+	pc_item->ce->ce_mode = fixed_portion->ce_mode;
+	memcpy(pc_item->ce->name, variant, pc_item->ce->ce_namelen);
+	oidcpy(&pc_item->ce->oid, &fixed_portion->oid);
+
+	pc_item->id = fixed_portion->id;
+	pc_item->ca.crlf_action = fixed_portion->crlf_action;
+	pc_item->ca.ident = fixed_portion->ident;
+	pc_item->ca.working_tree_encoding = encoding;
+}
+
+static void report_result(struct parallel_checkout_item *pc_item)
+{
+	struct pc_item_result res = { 0 };
+	size_t size;
+
+	res.id = pc_item->id;
+	res.status = pc_item->status;
+
+	if (pc_item->status == PC_ITEM_WRITTEN) {
+		res.st = pc_item->st;
+		size = sizeof(res);
+	} else {
+		size = PC_ITEM_RESULT_BASE_SIZE;
+	}
+
+	packet_write(1, (const char *)&res, size);
+}
+
+/* Free the worker-side malloced data, but not pc_item itself. */
+static void release_pc_item_data(struct parallel_checkout_item *pc_item)
+{
+	free((char *)pc_item->ca.working_tree_encoding);
+	discard_cache_entry(pc_item->ce);
+}
+
+static void worker_loop(struct checkout *state)
+{
+	struct parallel_checkout_item *items = NULL;
+	size_t i, nr = 0, alloc = 0;
+
+	while (1) {
+		int len;
+		char *line = packet_read_line(0, &len);
+
+		if (!line)
+			break;
+
+		ALLOC_GROW(items, nr + 1, alloc);
+		packet_to_pc_item(line, len, &items[nr++]);
+	}
+
+	for (i = 0; i < nr; ++i) {
+		struct parallel_checkout_item *pc_item = &items[i];
+		write_pc_item(pc_item, state);
+		report_result(pc_item);
+		release_pc_item_data(pc_item);
+	}
+
+	packet_flush(1);
+
+	free(items);
+}
+
+static const char * const checkout_helper_usage[] = {
+	N_("git checkout--helper [<options>]"),
+	NULL
+};
+
+int cmd_checkout__helper(int argc, const char **argv, const char *prefix)
+{
+	struct checkout state = CHECKOUT_INIT;
+	struct option checkout_helper_options[] = {
+		OPT_STRING(0, "prefix", &state.base_dir, N_("string"),
+			N_("when creating files, prepend <string>")),
+		OPT_END()
+	};
+
+	if (argc == 2 && !strcmp(argv[1], "-h"))
+		usage_with_options(checkout_helper_usage,
+				   checkout_helper_options);
+
+	git_config(git_default_config, NULL);
+	argc = parse_options(argc, argv, prefix, checkout_helper_options,
+			     checkout_helper_usage, 0);
+	if (argc > 0)
+		usage_with_options(checkout_helper_usage, checkout_helper_options);
+
+	if (state.base_dir)
+		state.base_dir_len = strlen(state.base_dir);
+
+	/*
+	 * Setting this on worker won't actually update the index. We just need
+	 * to pretend so to induce the checkout machinery to stat() the written
+	 * entries.
+	 */
+	state.refresh_cache = 1;
+
+	worker_loop(&state);
+	return 0;
+}
diff --git a/git.c b/git.c
index 01c456edce..a09357fc56 100644
--- a/git.c
+++ b/git.c
@@ -487,6 +487,8 @@ static struct cmd_struct commands[] = {
 	{ "check-mailmap", cmd_check_mailmap, RUN_SETUP },
 	{ "check-ref-format", cmd_check_ref_format, NO_PARSEOPT  },
 	{ "checkout", cmd_checkout, RUN_SETUP | NEED_WORK_TREE },
+	{ "checkout--helper", cmd_checkout__helper,
+		RUN_SETUP | NEED_WORK_TREE | SUPPORT_SUPER_PREFIX },
 	{ "checkout-index", cmd_checkout_index,
 		RUN_SETUP | NEED_WORK_TREE},
 	{ "cherry", cmd_cherry, RUN_SETUP },
diff --git a/parallel-checkout.c b/parallel-checkout.c
index 7dc8ab2a67..7ea0faa526 100644
--- a/parallel-checkout.c
+++ b/parallel-checkout.c
@@ -1,28 +1,15 @@
 #include "cache.h"
 #include "entry.h"
 #include "parallel-checkout.h"
+#include "pkt-line.h"
+#include "run-command.h"
 #include "streaming.h"
+#include "thread-utils.h"
+#include "config.h"
 
-enum pc_item_status {
-	PC_ITEM_PENDING = 0,
-	PC_ITEM_WRITTEN,
-	/*
-	 * The entry could not be written because there was another file
-	 * already present in its path or leading directories. Since
-	 * checkout_entry_ca() removes such files from the working tree before
-	 * enqueueing the entry for parallel checkout, it means that there was
-	 * a path collision among the entries being written.
-	 */
-	PC_ITEM_COLLIDED,
-	PC_ITEM_FAILED,
-};
-
-struct parallel_checkout_item {
-	/* pointer to a istate->cache[] entry. Not owned by us. */
-	struct cache_entry *ce;
-	struct conv_attrs ca;
-	struct stat st;
-	enum pc_item_status status;
+struct pc_worker {
+	struct child_process cp;
+	size_t next_to_complete, nr_to_complete;
 };
 
 struct parallel_checkout {
@@ -38,6 +25,19 @@ enum pc_status parallel_checkout_status(void)
 	return parallel_checkout.status;
 }
 
+#define DEFAULT_THRESHOLD_FOR_PARALLELISM 100
+
+void get_parallel_checkout_configs(int *num_workers, int *threshold)
+{
+	if (git_config_get_int("checkout.workers", num_workers))
+		*num_workers = 1;
+	else if (*num_workers < 1)
+		*num_workers = online_cpus();
+
+	if (git_config_get_int("checkout.thresholdForParallelism", threshold))
+		*threshold = DEFAULT_THRESHOLD_FOR_PARALLELISM;
+}
+
 void init_parallel_checkout(void)
 {
 	if (parallel_checkout.status != PC_UNINITIALIZED)
@@ -115,10 +115,12 @@ int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca)
 	ALLOC_GROW(parallel_checkout.items, parallel_checkout.nr + 1,
 		   parallel_checkout.alloc);
 
-	pc_item = &parallel_checkout.items[parallel_checkout.nr++];
+	pc_item = &parallel_checkout.items[parallel_checkout.nr];
 	pc_item->ce = ce;
 	memcpy(&pc_item->ca, ca, sizeof(pc_item->ca));
 	pc_item->status = PC_ITEM_PENDING;
+	pc_item->id = parallel_checkout.nr;
+	parallel_checkout.nr++;
 
 	return 0;
 }
@@ -231,7 +233,8 @@ static int write_pc_item_to_fd(struct parallel_checkout_item *pc_item, int fd,
 	/*
 	 * checkout metadata is used to give context for external process
 	 * filters. Files requiring such filters are not eligible for parallel
-	 * checkout, so pass NULL.
+	 * checkout, so pass NULL. Note: if that changes, the metadata must also
+	 * be passed from the main process to the workers.
 	 */
 	ret = convert_to_working_tree_ca(&pc_item->ca, pc_item->ce->name,
 					 new_blob, size, &buf, NULL);
@@ -272,8 +275,8 @@ static int check_leading_dirs(const char *path, int len, int prefix_len)
 	return has_dirs_only_path(path, slash - path, prefix_len);
 }
 
-static void write_pc_item(struct parallel_checkout_item *pc_item,
-			  struct checkout *state)
+void write_pc_item(struct parallel_checkout_item *pc_item,
+		   struct checkout *state)
 {
 	unsigned int mode = (pc_item->ce->ce_mode & 0100) ? 0777 : 0666;
 	int fd = -1, fstat_done = 0;
@@ -343,6 +346,214 @@ static void write_pc_item(struct parallel_checkout_item *pc_item,
 	strbuf_release(&path);
 }
 
+static void send_one_item(int fd, struct parallel_checkout_item *pc_item)
+{
+	size_t len_data;
+	char *data, *variant;
+	struct pc_item_fixed_portion *fixed_portion;
+	const char *working_tree_encoding = pc_item->ca.working_tree_encoding;
+	size_t name_len = pc_item->ce->ce_namelen;
+	size_t working_tree_encoding_len = working_tree_encoding ?
+					   strlen(working_tree_encoding) : 0;
+
+	len_data = sizeof(struct pc_item_fixed_portion) + name_len +
+		   working_tree_encoding_len;
+
+	data = xcalloc(1, len_data);
+
+	fixed_portion = (struct pc_item_fixed_portion *)data;
+	fixed_portion->id = pc_item->id;
+	oidcpy(&fixed_portion->oid, &pc_item->ce->oid);
+	fixed_portion->ce_mode = pc_item->ce->ce_mode;
+	fixed_portion->crlf_action = pc_item->ca.crlf_action;
+	fixed_portion->ident = pc_item->ca.ident;
+	fixed_portion->name_len = name_len;
+	fixed_portion->working_tree_encoding_len = working_tree_encoding_len;
+
+	variant = data + sizeof(*fixed_portion);
+	if (working_tree_encoding_len) {
+		memcpy(variant, working_tree_encoding, working_tree_encoding_len);
+		variant += working_tree_encoding_len;
+	}
+	memcpy(variant, pc_item->ce->name, name_len);
+
+	packet_write(fd, data, len_data);
+
+	free(data);
+}
+
+static void send_batch(int fd, size_t start, size_t nr)
+{
+	size_t i;
+	for (i = 0; i < nr; ++i)
+		send_one_item(fd, &parallel_checkout.items[start + i]);
+	packet_flush(fd);
+}
+
+static struct pc_worker *setup_workers(struct checkout *state, int num_workers)
+{
+	struct pc_worker *workers;
+	int i, workers_with_one_extra_item;
+	size_t base_batch_size, next_to_assign = 0;
+
+	ALLOC_ARRAY(workers, num_workers);
+
+	for (i = 0; i < num_workers; ++i) {
+		struct child_process *cp = &workers[i].cp;
+
+		child_process_init(cp);
+		cp->git_cmd = 1;
+		cp->in = -1;
+		cp->out = -1;
+		cp->clean_on_exit = 1;
+		strvec_push(&cp->args, "checkout--helper");
+		if (state->base_dir_len)
+			strvec_pushf(&cp->args, "--prefix=%s", state->base_dir);
+		if (start_command(cp))
+			die(_("failed to spawn checkout worker"));
+	}
+
+	base_batch_size = parallel_checkout.nr / num_workers;
+	workers_with_one_extra_item = parallel_checkout.nr % num_workers;
+
+	for (i = 0; i < num_workers; ++i) {
+		struct pc_worker *worker = &workers[i];
+		size_t batch_size = base_batch_size;
+
+		/* distribute the extra work evenly */
+		if (i < workers_with_one_extra_item)
+			batch_size++;
+
+		send_batch(worker->cp.in, next_to_assign, batch_size);
+		worker->next_to_complete = next_to_assign;
+		worker->nr_to_complete = batch_size;
+
+		next_to_assign += batch_size;
+	}
+
+	return workers;
+}
+
+static void finish_workers(struct pc_worker *workers, int num_workers)
+{
+	int i;
+
+	/*
+	 * Close pipes before calling finish_command() to let the workers
+	 * exit asynchronously and avoid spending extra time on wait().
+	 */
+	for (i = 0; i < num_workers; ++i) {
+		struct child_process *cp = &workers[i].cp;
+		if (cp->in >= 0)
+			close(cp->in);
+		if (cp->out >= 0)
+			close(cp->out);
+	}
+
+	for (i = 0; i < num_workers; ++i) {
+		if (finish_command(&workers[i].cp))
+			error(_("checkout worker %d finished with error"), i);
+	}
+
+	free(workers);
+}
+
+#define ASSERT_PC_ITEM_RESULT_SIZE(got, exp) \
+do { \
+	if (got != exp) \
+		BUG("corrupted result from checkout worker (got %dB, exp %dB)", \
+		    got, exp); \
+} while(0)
+
+static void parse_and_save_result(const char *line, int len,
+				  struct pc_worker *worker)
+{
+	struct pc_item_result *res;
+	struct parallel_checkout_item *pc_item;
+	struct stat *st = NULL;
+
+	if (len < PC_ITEM_RESULT_BASE_SIZE)
+		BUG("too short result from checkout worker (got %dB, exp %dB)",
+		    len, (int)PC_ITEM_RESULT_BASE_SIZE);
+
+	res = (struct pc_item_result *)line;
+
+	/*
+	 * The worker should send the full result struct on success, or
+	 * just the base (i.e. no stat data) otherwise.
+	 */
+	if (res->status == PC_ITEM_WRITTEN) {
+		ASSERT_PC_ITEM_RESULT_SIZE(len, (int)sizeof(struct pc_item_result));
+		st = &res->st;
+	} else {
+		ASSERT_PC_ITEM_RESULT_SIZE(len, (int)PC_ITEM_RESULT_BASE_SIZE);
+	}
+
+	if (!worker->nr_to_complete || res->id != worker->next_to_complete)
+		BUG("checkout worker sent unexpected item id");
+
+	worker->next_to_complete++;
+	worker->nr_to_complete--;
+
+	pc_item = &parallel_checkout.items[res->id];
+	pc_item->status = res->status;
+	if (st)
+		pc_item->st = *st;
+}
+
+
+static void gather_results_from_workers(struct pc_worker *workers,
+					int num_workers)
+{
+	int i, active_workers = num_workers;
+	struct pollfd *pfds;
+
+	CALLOC_ARRAY(pfds, num_workers);
+	for (i = 0; i < num_workers; ++i) {
+		pfds[i].fd = workers[i].cp.out;
+		pfds[i].events = POLLIN;
+	}
+
+	while (active_workers) {
+		int nr = poll(pfds, num_workers, -1);
+
+		if (nr < 0) {
+			if (errno == EINTR)
+				continue;
+			die_errno("failed to poll checkout workers");
+		}
+
+		for (i = 0; i < num_workers && nr > 0; ++i) {
+			struct pc_worker *worker = &workers[i];
+			struct pollfd *pfd = &pfds[i];
+
+			if (!pfd->revents)
+				continue;
+
+			if (pfd->revents & POLLIN) {
+				int len;
+				const char *line = packet_read_line(pfd->fd, &len);
+
+				if (!line) {
+					pfd->fd = -1;
+					active_workers--;
+				} else {
+					parse_and_save_result(line, len, worker);
+				}
+			} else if (pfd->revents & POLLHUP) {
+				pfd->fd = -1;
+				active_workers--;
+			} else if (pfd->revents & (POLLNVAL | POLLERR)) {
+				die(_("error polling from checkout worker"));
+			}
+
+			nr--;
+		}
+	}
+
+	free(pfds);
+}
+
 static void write_items_sequentially(struct checkout *state)
 {
 	size_t i;
@@ -351,7 +562,7 @@ static void write_items_sequentially(struct checkout *state)
 		write_pc_item(&parallel_checkout.items[i], state);
 }
 
-int run_parallel_checkout(struct checkout *state)
+int run_parallel_checkout(struct checkout *state, int num_workers, int threshold)
 {
 	int ret;
 
@@ -360,7 +571,17 @@ int run_parallel_checkout(struct checkout *state)
 
 	parallel_checkout.status = PC_RUNNING;
 
-	write_items_sequentially(state);
+	if (parallel_checkout.nr < num_workers)
+		num_workers = parallel_checkout.nr;
+
+	if (num_workers <= 1 || parallel_checkout.nr < threshold) {
+		write_items_sequentially(state);
+	} else {
+		struct pc_worker *workers = setup_workers(state, num_workers);
+		gather_results_from_workers(workers, num_workers);
+		finish_workers(workers, num_workers);
+	}
+
 	ret = handle_results(state);
 
 	finish_parallel_checkout();
diff --git a/parallel-checkout.h b/parallel-checkout.h
index e6d6fc01ea..0c9984584e 100644
--- a/parallel-checkout.h
+++ b/parallel-checkout.h
@@ -1,9 +1,12 @@
 #ifndef PARALLEL_CHECKOUT_H
 #define PARALLEL_CHECKOUT_H
 
-struct cache_entry;
-struct checkout;
-struct conv_attrs;
+#include "entry.h"
+#include "convert.h"
+
+/****************************************************************
+ * Users of parallel checkout
+ ****************************************************************/
 
 enum pc_status {
 	PC_UNINITIALIZED = 0,
@@ -12,6 +15,7 @@ enum pc_status {
 };
 
 enum pc_status parallel_checkout_status(void);
+void get_parallel_checkout_configs(int *num_workers, int *threshold);
 void init_parallel_checkout(void);
 
 /*
@@ -21,7 +25,77 @@ void init_parallel_checkout(void);
  */
 int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca);
 
-/* Write all the queued entries, returning 0 on success.*/
-int run_parallel_checkout(struct checkout *state);
+/*
+ * Write all the queued entries, returning 0 on success. If the number of
+ * entries is smaller than the specified threshold, the operation is performed
+ * sequentially.
+ */
+int run_parallel_checkout(struct checkout *state, int num_workers, int threshold);
+
+/****************************************************************
+ * Interface with checkout--helper
+ ****************************************************************/
+
+enum pc_item_status {
+	PC_ITEM_PENDING = 0,
+	PC_ITEM_WRITTEN,
+	/*
+	 * The entry could not be written because there was another file
+	 * already present in its path or leading directories. Since
+	 * checkout_entry_ca() removes such files from the working tree before
+	 * enqueueing the entry for parallel checkout, it means that there was
+	 * a path collision among the entries being written.
+	 */
+	PC_ITEM_COLLIDED,
+	PC_ITEM_FAILED,
+};
+
+struct parallel_checkout_item {
+	/*
+	 * In the main process, ce points to an istate->cache[] entry, so it is
+	 * not owned by us. In workers, the memory is owned and *must be* released.
+	 */
+	struct cache_entry *ce;
+	struct conv_attrs ca;
+	size_t id; /* position in parallel_checkout.items[] of main process */
+
+	/* Output fields, sent from workers. */
+	enum pc_item_status status;
+	struct stat st;
+};
+
+/*
+ * The fixed-size portion of `struct parallel_checkout_item` that is sent to the
+ * workers. It is followed by two strings: ca.working_tree_encoding and ce.name.
+ * These are NOT null-terminated, since their sizes are given in the fixed
+ * portion.
+ *
+ * Note that not all fields of conv_attrs and cache_entry are passed, only the
+ * ones that will be required by the workers to smudge and write the entry.
+ */
+struct pc_item_fixed_portion {
+	size_t id;
+	struct object_id oid;
+	unsigned int ce_mode;
+	enum crlf_action crlf_action;
+	int ident;
+	size_t working_tree_encoding_len;
+	size_t name_len;
+};
+
+/*
+ * The fields of `struct parallel_checkout_item` that are returned by the
+ * workers. Note: `st` must be the last field, as it is omitted on error.
+ */
+struct pc_item_result {
+	size_t id;
+	enum pc_item_status status;
+	struct stat st;
+};
+
+#define PC_ITEM_RESULT_BASE_SIZE offsetof(struct pc_item_result, st)
+
+void write_pc_item(struct parallel_checkout_item *pc_item,
+		   struct checkout *state);
 
 #endif /* PARALLEL_CHECKOUT_H */
diff --git a/unpack-trees.c b/unpack-trees.c
index 1b1da7485a..117ed42370 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -399,7 +399,7 @@ static int check_updates(struct unpack_trees_options *o,
 	int errs = 0;
 	struct progress *progress;
 	struct checkout state = CHECKOUT_INIT;
-	int i;
+	int i, pc_workers, pc_threshold;
 
 	trace_performance_enter();
 	state.force = 1;
@@ -462,8 +462,11 @@ static int check_updates(struct unpack_trees_options *o,
 		oid_array_clear(&to_fetch);
 	}
 
+	get_parallel_checkout_configs(&pc_workers, &pc_threshold);
+
 	enable_delayed_checkout(&state);
-	init_parallel_checkout();
+	if (pc_workers > 1)
+		init_parallel_checkout();
 	for (i = 0; i < index->cache_nr; i++) {
 		struct cache_entry *ce = index->cache[i];
 
@@ -477,7 +480,8 @@ static int check_updates(struct unpack_trees_options *o,
 		}
 	}
 	stop_progress(&progress);
-	errs |= run_parallel_checkout(&state);
+	if (pc_workers > 1)
+		errs |= run_parallel_checkout(&state, pc_workers, pc_threshold);
 	errs |= finish_delayed_checkout(&state, NULL);
 	git_attr_set_direction(GIT_ATTR_CHECKIN);
 
-- 
2.28.0


^ permalink raw reply	[flat|nested] 154+ messages in thread

* [PATCH v2 12/19] parallel-checkout: support progress displaying
  2020-09-22 22:49 ` [PATCH v2 00/19] Parallel Checkout (part I) Matheus Tavares
                     ` (10 preceding siblings ...)
  2020-09-22 22:49   ` [PATCH v2 11/19] parallel-checkout: make it truly parallel Matheus Tavares
@ 2020-09-22 22:49   ` Matheus Tavares
  2020-09-22 22:49   ` [PATCH v2 13/19] make_transient_cache_entry(): optionally alloc from mem_pool Matheus Tavares
                     ` (7 subsequent siblings)
  19 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-09-22 22:49 UTC (permalink / raw)
  To: git; +Cc: jeffhost, chriscool, peff, t.gummerer, newren

Original-patch-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 parallel-checkout.c | 34 +++++++++++++++++++++++++++++++---
 parallel-checkout.h |  4 +++-
 unpack-trees.c      | 11 ++++++++---
 3 files changed, 42 insertions(+), 7 deletions(-)

diff --git a/parallel-checkout.c b/parallel-checkout.c
index 7ea0faa526..5156b14c53 100644
--- a/parallel-checkout.c
+++ b/parallel-checkout.c
@@ -2,6 +2,7 @@
 #include "entry.h"
 #include "parallel-checkout.h"
 #include "pkt-line.h"
+#include "progress.h"
 #include "run-command.h"
 #include "streaming.h"
 #include "thread-utils.h"
@@ -16,6 +17,8 @@ struct parallel_checkout {
 	enum pc_status status;
 	struct parallel_checkout_item *items;
 	size_t nr, alloc;
+	struct progress *progress;
+	unsigned int *progress_cnt;
 };
 
 static struct parallel_checkout parallel_checkout = { 0 };
@@ -125,6 +128,20 @@ int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca)
 	return 0;
 }
 
+size_t pc_queue_size(void)
+{
+	return parallel_checkout.nr;
+}
+
+static void advance_progress_meter(void)
+{
+	if (parallel_checkout.progress) {
+		(*parallel_checkout.progress_cnt)++;
+		display_progress(parallel_checkout.progress,
+				 *parallel_checkout.progress_cnt);
+	}
+}
+
 static int handle_results(struct checkout *state)
 {
 	int ret = 0;
@@ -173,6 +190,7 @@ static int handle_results(struct checkout *state)
 			 */
 			ret |= checkout_entry_ca(pc_item->ce, &pc_item->ca,
 						 state, NULL, NULL);
+			advance_progress_meter();
 			break;
 		case PC_ITEM_PENDING:
 			have_pending = 1;
@@ -499,6 +517,9 @@ static void parse_and_save_result(const char *line, int len,
 	pc_item->status = res->status;
 	if (st)
 		pc_item->st = *st;
+
+	if (res->status != PC_ITEM_COLLIDED)
+		advance_progress_meter();
 }
 
 
@@ -558,11 +579,16 @@ static void write_items_sequentially(struct checkout *state)
 {
 	size_t i;
 
-	for (i = 0; i < parallel_checkout.nr; ++i)
-		write_pc_item(&parallel_checkout.items[i], state);
+	for (i = 0; i < parallel_checkout.nr; ++i) {
+		struct parallel_checkout_item *pc_item = &parallel_checkout.items[i];
+		write_pc_item(pc_item, state);
+		if (pc_item->status != PC_ITEM_COLLIDED)
+			advance_progress_meter();
+	}
 }
 
-int run_parallel_checkout(struct checkout *state, int num_workers, int threshold)
+int run_parallel_checkout(struct checkout *state, int num_workers, int threshold,
+			  struct progress *progress, unsigned int *progress_cnt)
 {
 	int ret;
 
@@ -570,6 +596,8 @@ int run_parallel_checkout(struct checkout *state, int num_workers, int threshold
 		BUG("cannot run parallel checkout: uninitialized or already running");
 
 	parallel_checkout.status = PC_RUNNING;
+	parallel_checkout.progress = progress;
+	parallel_checkout.progress_cnt = progress_cnt;
 
 	if (parallel_checkout.nr < num_workers)
 		num_workers = parallel_checkout.nr;
diff --git a/parallel-checkout.h b/parallel-checkout.h
index 0c9984584e..6c3a016c0b 100644
--- a/parallel-checkout.h
+++ b/parallel-checkout.h
@@ -24,13 +24,15 @@ void init_parallel_checkout(void);
  * write and return 0.
  */
 int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca);
+size_t pc_queue_size(void);
 
 /*
  * Write all the queued entries, returning 0 on success. If the number of
  * entries is smaller than the specified threshold, the operation is performed
  * sequentially.
  */
-int run_parallel_checkout(struct checkout *state, int num_workers, int threshold);
+int run_parallel_checkout(struct checkout *state, int num_workers, int threshold,
+			  struct progress *progress, unsigned int *progress_cnt);
 
 /****************************************************************
  * Interface with checkout--helper
diff --git a/unpack-trees.c b/unpack-trees.c
index 117ed42370..e05e6ceff2 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -471,17 +471,22 @@ static int check_updates(struct unpack_trees_options *o,
 		struct cache_entry *ce = index->cache[i];
 
 		if (ce->ce_flags & CE_UPDATE) {
+			size_t last_pc_queue_size = pc_queue_size();
+
 			if (ce->ce_flags & CE_WT_REMOVE)
 				BUG("both update and delete flags are set on %s",
 				    ce->name);
-			display_progress(progress, ++cnt);
 			ce->ce_flags &= ~CE_UPDATE;
 			errs |= checkout_entry(ce, &state, NULL, NULL);
+
+			if (last_pc_queue_size == pc_queue_size())
+				display_progress(progress, ++cnt);
 		}
 	}
-	stop_progress(&progress);
 	if (pc_workers > 1)
-		errs |= run_parallel_checkout(&state, pc_workers, pc_threshold);
+		errs |= run_parallel_checkout(&state, pc_workers, pc_threshold,
+					      progress, &cnt);
+	stop_progress(&progress);
 	errs |= finish_delayed_checkout(&state, NULL);
 	git_attr_set_direction(GIT_ATTR_CHECKIN);
 
-- 
2.28.0



* [PATCH v2 13/19] make_transient_cache_entry(): optionally alloc from mem_pool
  2020-09-22 22:49 ` [PATCH v2 00/19] Parallel Checkout (part I) Matheus Tavares
                     ` (11 preceding siblings ...)
  2020-09-22 22:49   ` [PATCH v2 12/19] parallel-checkout: support progress displaying Matheus Tavares
@ 2020-09-22 22:49   ` Matheus Tavares
  2020-09-22 22:49   ` [PATCH v2 14/19] builtin/checkout.c: complete parallel checkout support Matheus Tavares
                     ` (6 subsequent siblings)
  19 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-09-22 22:49 UTC (permalink / raw)
  To: git; +Cc: jeffhost, chriscool, peff, t.gummerer, newren

Allow make_transient_cache_entry() to optionally receive a mem_pool
struct in which it should allocate the entry. This will be used in the
following patch, to store some transient entries which should persist
until parallel checkout finishes.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 builtin/checkout--helper.c |  2 +-
 builtin/checkout.c         |  2 +-
 builtin/difftool.c         |  2 +-
 cache.h                    | 10 +++++-----
 read-cache.c               | 12 ++++++++----
 unpack-trees.c             |  2 +-
 6 files changed, 17 insertions(+), 13 deletions(-)

diff --git a/builtin/checkout--helper.c b/builtin/checkout--helper.c
index 67fe37cf11..9646ed9eeb 100644
--- a/builtin/checkout--helper.c
+++ b/builtin/checkout--helper.c
@@ -38,7 +38,7 @@ static void packet_to_pc_item(char *line, int len,
 	}
 
 	memset(pc_item, 0, sizeof(*pc_item));
-	pc_item->ce = make_empty_transient_cache_entry(fixed_portion->name_len);
+	pc_item->ce = make_empty_transient_cache_entry(fixed_portion->name_len, NULL);
 	pc_item->ce->ce_namelen = fixed_portion->name_len;
 	pc_item->ce->ce_mode = fixed_portion->ce_mode;
 	memcpy(pc_item->ce->name, variant, pc_item->ce->ce_namelen);
diff --git a/builtin/checkout.c b/builtin/checkout.c
index b18b9d6f3c..c0bf5e6711 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -291,7 +291,7 @@ static int checkout_merged(int pos, const struct checkout *state, int *nr_checko
 	if (write_object_file(result_buf.ptr, result_buf.size, blob_type, &oid))
 		die(_("Unable to add merge result for '%s'"), path);
 	free(result_buf.ptr);
-	ce = make_transient_cache_entry(mode, &oid, path, 2);
+	ce = make_transient_cache_entry(mode, &oid, path, 2, NULL);
 	if (!ce)
 		die(_("make_cache_entry failed for path '%s'"), path);
 	status = checkout_entry(ce, state, NULL, nr_checkouts);
diff --git a/builtin/difftool.c b/builtin/difftool.c
index dfa22b67eb..5e7a57c8c2 100644
--- a/builtin/difftool.c
+++ b/builtin/difftool.c
@@ -323,7 +323,7 @@ static int checkout_path(unsigned mode, struct object_id *oid,
 	struct cache_entry *ce;
 	int ret;
 
-	ce = make_transient_cache_entry(mode, oid, path, 0);
+	ce = make_transient_cache_entry(mode, oid, path, 0, NULL);
 	ret = checkout_entry(ce, state, NULL, NULL);
 
 	discard_cache_entry(ce);
diff --git a/cache.h b/cache.h
index 17350cafa2..a394263f0e 100644
--- a/cache.h
+++ b/cache.h
@@ -355,16 +355,16 @@ struct cache_entry *make_empty_cache_entry(struct index_state *istate,
 					   size_t name_len);
 
 /*
- * Create a cache_entry that is not intended to be added to an index.
- * Caller is responsible for discarding the cache_entry
- * with `discard_cache_entry`.
+ * Create a cache_entry that is not intended to be added to an index. If mp is
+ * not NULL, the entry is allocated within the given memory pool. Caller is
+ * responsible for discarding the cache_entry with `discard_cache_entry`.
  */
 struct cache_entry *make_transient_cache_entry(unsigned int mode,
 					       const struct object_id *oid,
 					       const char *path,
-					       int stage);
+					       int stage, struct mem_pool *mp);
 
-struct cache_entry *make_empty_transient_cache_entry(size_t name_len);
+struct cache_entry *make_empty_transient_cache_entry(size_t len, struct mem_pool *mp);
 
 /*
  * Discard cache entry.
diff --git a/read-cache.c b/read-cache.c
index ecf6f68994..f9bac760af 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -813,8 +813,10 @@ struct cache_entry *make_empty_cache_entry(struct index_state *istate, size_t le
 	return mem_pool__ce_calloc(find_mem_pool(istate), len);
 }
 
-struct cache_entry *make_empty_transient_cache_entry(size_t len)
+struct cache_entry *make_empty_transient_cache_entry(size_t len, struct mem_pool *mp)
 {
+	if (mp)
+		return mem_pool__ce_calloc(mp, len);
 	return xcalloc(1, cache_entry_size(len));
 }
 
@@ -848,8 +850,10 @@ struct cache_entry *make_cache_entry(struct index_state *istate,
 	return ret;
 }
 
-struct cache_entry *make_transient_cache_entry(unsigned int mode, const struct object_id *oid,
-					       const char *path, int stage)
+struct cache_entry *make_transient_cache_entry(unsigned int mode,
+					       const struct object_id *oid,
+					       const char *path, int stage,
+					       struct mem_pool *mp)
 {
 	struct cache_entry *ce;
 	int len;
@@ -860,7 +864,7 @@ struct cache_entry *make_transient_cache_entry(unsigned int mode, const struct o
 	}
 
 	len = strlen(path);
-	ce = make_empty_transient_cache_entry(len);
+	ce = make_empty_transient_cache_entry(len, mp);
 
 	oidcpy(&ce->oid, oid);
 	memcpy(ce->name, path, len);
diff --git a/unpack-trees.c b/unpack-trees.c
index e05e6ceff2..dcb40dc8fa 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1031,7 +1031,7 @@ static struct cache_entry *create_ce_entry(const struct traverse_info *info,
 	size_t len = traverse_path_len(info, tree_entry_len(n));
 	struct cache_entry *ce =
 		is_transient ?
-		make_empty_transient_cache_entry(len) :
+		make_empty_transient_cache_entry(len, NULL) :
 		make_empty_cache_entry(istate, len);
 
 	ce->ce_mode = create_ce_mode(n->mode);
-- 
2.28.0



* [PATCH v2 14/19] builtin/checkout.c: complete parallel checkout support
  2020-09-22 22:49 ` [PATCH v2 00/19] Parallel Checkout (part I) Matheus Tavares
                     ` (12 preceding siblings ...)
  2020-09-22 22:49   ` [PATCH v2 13/19] make_transient_cache_entry(): optionally alloc from mem_pool Matheus Tavares
@ 2020-09-22 22:49   ` Matheus Tavares
  2020-09-22 22:49   ` [PATCH v2 15/19] checkout-index: add " Matheus Tavares
                     ` (5 subsequent siblings)
  19 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-09-22 22:49 UTC (permalink / raw)
  To: git; +Cc: jeffhost, chriscool, peff, t.gummerer, newren

There is one code path in builtin/checkout.c which still doesn't benefit
from parallel checkout because it calls checkout_entry() directly,
instead of unpack_trees(). Let's add parallel support for this missing
spot as well. Note: the transient cache entries allocated in
checkout_merged() are now allocated in a mem_pool which is only
discarded after parallel checkout finishes. This is done because the
entries need to be valid when run_parallel_checkout() is called.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 builtin/checkout.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/builtin/checkout.c b/builtin/checkout.c
index c0bf5e6711..ddc4079b85 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -27,6 +27,7 @@
 #include "wt-status.h"
 #include "xdiff-interface.h"
 #include "entry.h"
+#include "parallel-checkout.h"
 
 static const char * const checkout_usage[] = {
 	N_("git checkout [<options>] <branch>"),
@@ -230,7 +231,8 @@ static int checkout_stage(int stage, const struct cache_entry *ce, int pos,
 		return error(_("path '%s' does not have their version"), ce->name);
 }
 
-static int checkout_merged(int pos, const struct checkout *state, int *nr_checkouts)
+static int checkout_merged(int pos, const struct checkout *state,
+			   int *nr_checkouts, struct mem_pool *ce_mem_pool)
 {
 	struct cache_entry *ce = active_cache[pos];
 	const char *path = ce->name;
@@ -291,11 +293,10 @@ static int checkout_merged(int pos, const struct checkout *state, int *nr_checko
 	if (write_object_file(result_buf.ptr, result_buf.size, blob_type, &oid))
 		die(_("Unable to add merge result for '%s'"), path);
 	free(result_buf.ptr);
-	ce = make_transient_cache_entry(mode, &oid, path, 2, NULL);
+	ce = make_transient_cache_entry(mode, &oid, path, 2, ce_mem_pool);
 	if (!ce)
 		die(_("make_cache_entry failed for path '%s'"), path);
 	status = checkout_entry(ce, state, NULL, nr_checkouts);
-	discard_cache_entry(ce);
 	return status;
 }
 
@@ -359,16 +360,22 @@ static int checkout_worktree(const struct checkout_opts *opts,
 	int nr_checkouts = 0, nr_unmerged = 0;
 	int errs = 0;
 	int pos;
+	int pc_workers, pc_threshold;
+	struct mem_pool ce_mem_pool;
 
 	state.force = 1;
 	state.refresh_cache = 1;
 	state.istate = &the_index;
 
+	mem_pool_init(&ce_mem_pool, 0);
+	get_parallel_checkout_configs(&pc_workers, &pc_threshold);
 	init_checkout_metadata(&state.meta, info->refname,
 			       info->commit ? &info->commit->object.oid : &info->oid,
 			       NULL);
 
 	enable_delayed_checkout(&state);
+	if (pc_workers > 1)
+		init_parallel_checkout();
 	for (pos = 0; pos < active_nr; pos++) {
 		struct cache_entry *ce = active_cache[pos];
 		if (ce->ce_flags & CE_MATCHED) {
@@ -384,10 +391,15 @@ static int checkout_worktree(const struct checkout_opts *opts,
 						       &nr_checkouts, opts->overlay_mode);
 			else if (opts->merge)
 				errs |= checkout_merged(pos, &state,
-							&nr_unmerged);
+							&nr_unmerged,
+							&ce_mem_pool);
 			pos = skip_same_name(ce, pos) - 1;
 		}
 	}
+	if (pc_workers > 1)
+		errs |= run_parallel_checkout(&state, pc_workers, pc_threshold,
+					      NULL, NULL);
+	mem_pool_discard(&ce_mem_pool, should_validate_cache_entries());
 	remove_marked_cache_entries(&the_index, 1);
 	remove_scheduled_dirs();
 	errs |= finish_delayed_checkout(&state, &nr_checkouts);
-- 
2.28.0



* [PATCH v2 15/19] checkout-index: add parallel checkout support
  2020-09-22 22:49 ` [PATCH v2 00/19] Parallel Checkout (part I) Matheus Tavares
                     ` (13 preceding siblings ...)
  2020-09-22 22:49   ` [PATCH v2 14/19] builtin/checkout.c: complete parallel checkout support Matheus Tavares
@ 2020-09-22 22:49   ` Matheus Tavares
  2020-09-22 22:49   ` [PATCH v2 16/19] parallel-checkout: add tests for basic operations Matheus Tavares
                     ` (4 subsequent siblings)
  19 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-09-22 22:49 UTC (permalink / raw)
  To: git; +Cc: jeffhost, chriscool, peff, t.gummerer, newren

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 builtin/checkout-index.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/builtin/checkout-index.c b/builtin/checkout-index.c
index 0f1ff73129..33fb933c30 100644
--- a/builtin/checkout-index.c
+++ b/builtin/checkout-index.c
@@ -12,6 +12,7 @@
 #include "cache-tree.h"
 #include "parse-options.h"
 #include "entry.h"
+#include "parallel-checkout.h"
 
 #define CHECKOUT_ALL 4
 static int nul_term_line;
@@ -160,6 +161,7 @@ int cmd_checkout_index(int argc, const char **argv, const char *prefix)
 	int prefix_length;
 	int force = 0, quiet = 0, not_new = 0;
 	int index_opt = 0;
+	int pc_workers, pc_threshold;
 	struct option builtin_checkout_index_options[] = {
 		OPT_BOOL('a', "all", &all,
 			N_("check out all files in the index")),
@@ -214,6 +216,14 @@ int cmd_checkout_index(int argc, const char **argv, const char *prefix)
 		hold_locked_index(&lock_file, LOCK_DIE_ON_ERROR);
 	}
 
+	if (!to_tempfile)
+		get_parallel_checkout_configs(&pc_workers, &pc_threshold);
+	else
+		pc_workers = 1;
+
+	if (pc_workers > 1)
+		init_parallel_checkout();
+
 	/* Check out named files first */
 	for (i = 0; i < argc; i++) {
 		const char *arg = argv[i];
@@ -256,6 +266,12 @@ int cmd_checkout_index(int argc, const char **argv, const char *prefix)
 	if (all)
 		checkout_all(prefix, prefix_length);
 
+	if (pc_workers > 1) {
+		/* Errors were already reported */
+		run_parallel_checkout(&state, pc_workers, pc_threshold,
+				      NULL, NULL);
+	}
+
 	if (is_lock_file_locked(&lock_file) &&
 	    write_locked_index(&the_index, &lock_file, COMMIT_LOCK))
 		die("Unable to write new index file");
-- 
2.28.0



* [PATCH v2 16/19] parallel-checkout: add tests for basic operations
  2020-09-22 22:49 ` [PATCH v2 00/19] Parallel Checkout (part I) Matheus Tavares
                     ` (14 preceding siblings ...)
  2020-09-22 22:49   ` [PATCH v2 15/19] checkout-index: add " Matheus Tavares
@ 2020-09-22 22:49   ` Matheus Tavares
  2020-10-20  1:35     ` Jonathan Nieder
  2020-09-22 22:49   ` [PATCH v2 17/19] parallel-checkout: add tests related to clone collisions Matheus Tavares
                     ` (3 subsequent siblings)
  19 siblings, 1 reply; 154+ messages in thread
From: Matheus Tavares @ 2020-09-22 22:49 UTC (permalink / raw)
  To: git; +Cc: jeffhost, chriscool, peff, t.gummerer, newren

Add tests to populate the working tree during clone and checkout using
the sequential and parallel modes, to confirm that they produce
identical results. Also test basic checkout mechanics, such as checking
for symlinks in the leading directories and adherence to --force.

Note: some helper functions are added to a common lib file which is only
included by t2080 for now. But it will also be used by another
parallel-checkout test in a following patch.

Original-patch-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 t/lib-parallel-checkout.sh          |  39 ++++++
 t/t2080-parallel-checkout-basics.sh | 197 ++++++++++++++++++++++++++++
 2 files changed, 236 insertions(+)
 create mode 100644 t/lib-parallel-checkout.sh
 create mode 100755 t/t2080-parallel-checkout-basics.sh

diff --git a/t/lib-parallel-checkout.sh b/t/lib-parallel-checkout.sh
new file mode 100644
index 0000000000..c95ca27711
--- /dev/null
+++ b/t/lib-parallel-checkout.sh
@@ -0,0 +1,39 @@
+# Helpers for t208* tests
+
+# Runs `git -c checkout.workers=$1 -c checkout.thresholdForParallelism=$2 ${@:4}`
+# and checks that the number of workers spawned is equal to $3.
+git_pc()
+{
+	if test $# -lt 4
+	then
+		BUG "too few arguments to git_pc()"
+	fi
+
+	workers=$1 threshold=$2 expected_workers=$3 &&
+	shift && shift && shift &&
+
+	rm -f trace &&
+	GIT_TRACE2="$(pwd)/trace" git \
+		-c checkout.workers=$workers \
+		-c checkout.thresholdForParallelism=$threshold \
+		-c advice.detachedHead=0 \
+		"$@" &&
+
+	# Check that the expected number of workers has been used. Note that it
+	# can be different than the requested number in two cases: when the
+	# quantity of entries to be checked out is less than the number of
+	# workers; and when the threshold has not been reached.
+	#
+	local workers_in_trace=$(grep "child_start\[..*\] git checkout--helper" trace | wc -l) &&
+	test $workers_in_trace -eq $expected_workers &&
+	rm -f trace
+}
+
+# Verify that both the working tree and the index were created correctly
+verify_checkout()
+{
+	git -C $1 diff-index --quiet HEAD -- &&
+	git -C $1 diff-index --quiet --cached HEAD -- &&
+	git -C $1 status --porcelain >$1.status &&
+	test_must_be_empty $1.status
+}
diff --git a/t/t2080-parallel-checkout-basics.sh b/t/t2080-parallel-checkout-basics.sh
new file mode 100755
index 0000000000..c088a06ecc
--- /dev/null
+++ b/t/t2080-parallel-checkout-basics.sh
@@ -0,0 +1,197 @@
+#!/bin/sh
+
+test_description='parallel-checkout basics
+
+Ensure that parallel-checkout basically works on clone and checkout, spawning
+the required number of workers and correctly populating both the index and
+working tree.
+'
+
+TEST_NO_CREATE_REPO=1
+. ./test-lib.sh
+. "$TEST_DIRECTORY/lib-parallel-checkout.sh"
+
+# NEEDSWORK: cloning a SHA1 repo with GIT_TEST_DEFAULT_HASH set to "sha256"
+# currently produces a wrong result (See
+# https://lore.kernel.org/git/20200911151717.43475-1-matheus.bernardino@usp.br/).
+# So we skip the "parallel-checkout during clone" tests when this test flag is
+# set to "sha256". Remove this when the bug is fixed.
+#
+if test "$GIT_TEST_DEFAULT_HASH" = "sha256"
+then
+	skip_all="t2080 currently doesn't work with GIT_TEST_DEFAULT_HASH=sha256"
+	test_done
+fi
+
+R_BASE=$GIT_BUILD_DIR
+
+test_expect_success 'sequential clone' '
+	git_pc 1 0 0 clone --quiet -- $R_BASE r_sequential &&
+	verify_checkout r_sequential
+'
+
+test_expect_success 'parallel clone' '
+	git_pc 2 0 2 clone --quiet -- $R_BASE r_parallel &&
+	verify_checkout r_parallel
+'
+
+test_expect_success 'fallback to sequential clone (threshold)' '
+	git -C $R_BASE ls-files >files &&
+	nr_files=$(wc -l <files) &&
+	threshold=$(($nr_files + 1)) &&
+
+	git_pc 2 $threshold 0 clone --quiet -- $R_BASE r_sequential_fallback &&
+	verify_checkout r_sequential_fallback
+'
+
+# Just to be paranoid, actually compare the contents of the worktrees directly.
+test_expect_success 'compare working trees from clones' '
+	rm -rf r_sequential/.git &&
+	rm -rf r_parallel/.git &&
+	rm -rf r_sequential_fallback/.git &&
+	diff -qr r_sequential r_parallel &&
+	diff -qr r_sequential r_sequential_fallback
+'
+
+# Test parallel-checkout with different operations (creation, deletion,
+# modification) and entry types. A branch switch from B1 to B2 will contain:
+#
+# - a (file):      modified
+# - e/x (file):    deleted
+# - b (symlink):   deleted
+# - b/f (file):    created
+# - e (symlink):   created
+# - d (submodule): created
+#
+test_expect_success SYMLINKS 'setup repo for checkout with various operations' '
+	git init various &&
+	(
+		cd various &&
+		git checkout -b B1 &&
+		echo a>a &&
+		mkdir e &&
+		echo e/x >e/x &&
+		ln -s e b &&
+		git add -A &&
+		git commit -m B1 &&
+
+		git checkout -b B2 &&
+		echo modified >a &&
+		rm -rf e &&
+		rm b &&
+		mkdir b &&
+		echo b/f >b/f &&
+		ln -s b e &&
+		git init d &&
+		test_commit -C d f &&
+		git submodule add ./d &&
+		git add -A &&
+		git commit -m B2 &&
+
+		git checkout --recurse-submodules B1
+	)
+'
+
+test_expect_success SYMLINKS 'sequential checkout' '
+	cp -R various various_sequential &&
+	git_pc 1 0 0 -C various_sequential checkout --recurse-submodules B2 &&
+	verify_checkout various_sequential
+'
+
+test_expect_success SYMLINKS 'parallel checkout' '
+	cp -R various various_parallel &&
+	git_pc 2 0 2 -C various_parallel checkout --recurse-submodules B2 &&
+	verify_checkout various_parallel
+'
+
+test_expect_success SYMLINKS 'fallback to sequential checkout (threshold)' '
+	cp -R various various_sequential_fallback &&
+	git_pc 2 100 0 -C various_sequential_fallback checkout --recurse-submodules B2 &&
+	verify_checkout various_sequential_fallback
+'
+
+test_expect_success SYMLINKS 'compare working trees from checkouts' '
+	rm -rf various_sequential/.git &&
+	rm -rf various_parallel/.git &&
+	rm -rf various_sequential_fallback/.git &&
+	diff -qr various_sequential various_parallel &&
+	diff -qr various_sequential various_sequential_fallback
+'
+
+test_cmp_str()
+{
+	echo "$1" >tmp &&
+	test_cmp tmp "$2"
+}
+
+test_expect_success 'parallel checkout respects --[no-]force' '
+	git init dirty &&
+	(
+		cd dirty &&
+		mkdir D &&
+		test_commit D/F &&
+		test_commit F &&
+
+		echo changed >F.t &&
+		rm -rf D &&
+		echo changed >D &&
+
+		# We expect 0 workers because there is nothing to be updated
+		git_pc 2 0 0 checkout HEAD &&
+		test_path_is_file D &&
+		test_cmp_str changed D &&
+		test_cmp_str changed F.t &&
+
+		git_pc 2 0 2 checkout --force HEAD &&
+		test_path_is_dir D &&
+		test_cmp_str D/F D/F.t &&
+		test_cmp_str F F.t
+	)
+'
+
+test_expect_success SYMLINKS 'parallel checkout checks for symlinks in leading dirs' '
+	git init symlinks &&
+	(
+		cd symlinks &&
+		mkdir D E &&
+
+		# Create two entries in D to have enough work for 2 parallel
+		# workers
+		test_commit D/A &&
+		test_commit D/B &&
+		test_commit E/C &&
+		rm -rf D &&
+		ln -s E D &&
+
+		git_pc 2 0 2 checkout --force HEAD &&
+		! test -L D &&
+		test_cmp_str D/A D/A.t &&
+		test_cmp_str D/B D/B.t
+	)
+'
+
+test_expect_success SYMLINKS,CASE_INSENSITIVE_FS 'symlink colliding with leading dir' '
+	git init colliding-symlink &&
+	(
+		cd colliding-symlink &&
+		file_hex=$(git hash-object -w --stdin </dev/null) &&
+		file_oct=$(echo $file_hex | hex2oct) &&
+
+		sym_hex=$(echo "./D" | git hash-object -w --stdin) &&
+		sym_oct=$(echo $sym_hex | hex2oct) &&
+
+		printf "100644 D/A\0${file_oct}" >tree &&
+		printf "100644 E/B\0${file_oct}" >>tree &&
+		printf "120000 e\0${sym_oct}" >>tree &&
+
+		tree_hex=$(git hash-object -w -t tree --stdin <tree) &&
+		commit_hex=$(git commit-tree -m collisions $tree_hex) &&
+		git update-ref refs/heads/colliding-symlink $commit_hex &&
+
+		git_pc 2 0 2 checkout colliding-symlink &&
+		test_path_is_dir D &&
+		test_path_is_missing D/B
+	)
+'
+
+test_done
-- 
2.28.0


^ permalink raw reply	[flat|nested] 154+ messages in thread
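The threshold fallback exercised above boils down to a single comparison between the number of entries to be updated and `checkout.thresholdForParallelism`; a standalone sketch with made-up numbers (illustrative only, not code from the patch):

```shell
# Illustrative sketch: with the threshold set above the number of
# files, the parallel machinery is skipped and no workers are spawned.
nr_files=42
threshold=$((nr_files + 1))
if [ "$nr_files" -lt "$threshold" ]
then
	echo "sequential checkout, 0 workers spawned"
else
	echo "parallel checkout"
fi
```

This mirrors why `git_pc 2 $threshold 0` expects zero workers in the fallback tests.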

* [PATCH v2 17/19] parallel-checkout: add tests related to clone collisions
  2020-09-22 22:49 ` [PATCH v2 00/19] Parallel Checkout (part I) Matheus Tavares
                     ` (15 preceding siblings ...)
  2020-09-22 22:49   ` [PATCH v2 16/19] parallel-checkout: add tests for basic operations Matheus Tavares
@ 2020-09-22 22:49   ` Matheus Tavares
  2020-09-22 22:49   ` [PATCH v2 18/19] parallel-checkout: add tests related to .gitattributes Matheus Tavares
                     ` (2 subsequent siblings)
  19 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-09-22 22:49 UTC (permalink / raw)
  To: git; +Cc: jeffhost, chriscool, peff, t.gummerer, newren

Add tests to confirm that path collisions are properly reported during a
clone operation using parallel-checkout.

Original-patch-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 t/t2081-parallel-checkout-collisions.sh | 115 ++++++++++++++++++++++++
 1 file changed, 115 insertions(+)
 create mode 100755 t/t2081-parallel-checkout-collisions.sh

diff --git a/t/t2081-parallel-checkout-collisions.sh b/t/t2081-parallel-checkout-collisions.sh
new file mode 100755
index 0000000000..3ce195b892
--- /dev/null
+++ b/t/t2081-parallel-checkout-collisions.sh
@@ -0,0 +1,115 @@
+#!/bin/sh
+
+test_description='parallel-checkout collisions'
+
+. ./test-lib.sh
+
+# When there are pathname collisions during a clone, Git should report a warning
+# listing all of the colliding entries. The sequential code detects a collision
+# by calling lstat() before trying to open(O_CREAT) the file. Then, to find the
+# colliding pair of an item k, it searches cache_entry[0, k-1].
+#
+# This is not sufficient in parallel-checkout mode since colliding files may be
+# created in a racy order. The tests in this file make sure the collision
+# detection code is extended for parallel-checkout. This is done in two parts:
+#
+# - First, two parallel workers create four colliding files racily.
+# - Then this exercise is repeated but forcing the colliding pair to appear in
+#   the second half of the cache_entry's array.
+#
+# The second item uses the fact that files with clean/smudge filters are not
+# parallel-eligible; and that they are processed sequentially *before* any
+# worker is spawned. We set a filter attribute to the last entry in the
+# cache_entry[] array, making it non-eligible, so that it is populated first.
+# This way, we can test if the collision detection code is correctly looking
+# for collision pairs in the second half of the array.
+
+test_expect_success CASE_INSENSITIVE_FS 'setup' '
+	file_hex=$(git hash-object -w --stdin </dev/null) &&
+	file_oct=$(echo $file_hex | hex2oct) &&
+
+	attr_hex=$(echo "file_x filter=logger" | git hash-object -w --stdin) &&
+	attr_oct=$(echo $attr_hex | hex2oct) &&
+
+	printf "100644 FILE_X\0${file_oct}" >tree &&
+	printf "100644 FILE_x\0${file_oct}" >>tree &&
+	printf "100644 file_X\0${file_oct}" >>tree &&
+	printf "100644 file_x\0${file_oct}" >>tree &&
+	printf "100644 .gitattributes\0${attr_oct}" >>tree &&
+
+	tree_hex=$(git hash-object -w -t tree --stdin <tree) &&
+	commit_hex=$(git commit-tree -m collisions $tree_hex) &&
+	git update-ref refs/heads/collisions $commit_hex &&
+
+	write_script logger_script <<-\EOF
+	echo "$@" >>filter.log
+	EOF
+'
+
+clone_and_check_collision()
+{
+	id=$1 workers=$2 threshold=$3 expected_workers=$4 filter=$5 &&
+
+	filter_opts=
+	if test "$filter" = "use_filter"
+	then
+		# We use `core.ignoreCase=0` so that only `file_x`
+		# matches the pattern in .gitattributes.
+		#
+		filter_opts='-c filter.logger.smudge="../logger_script %f" -c core.ignoreCase=0'
+	fi &&
+
+	test_path_is_missing $id.trace &&
+	GIT_TRACE2="$(pwd)/$id.trace" git \
+		-c checkout.workers=$workers \
+		-c checkout.thresholdForParallelism=$threshold \
+		$filter_opts clone --branch=collisions -- . r_$id 2>$id.warning &&
+
+	# Check that checkout spawned the right number of workers
+	workers_in_trace=$(grep "child_start\[.\] git checkout--helper" $id.trace | wc -l) &&
+	test $workers_in_trace -eq $expected_workers &&
+
+	if test "$filter" = "use_filter"
+	then
+		#  Make sure only 'file_x' was filtered
+		test_path_is_file r_$id/filter.log &&
+		echo file_x >expected.filter.log &&
+		test_cmp r_$id/filter.log expected.filter.log
+	else
+		test_path_is_missing r_$id/filter.log
+	fi &&
+
+	grep FILE_X $id.warning &&
+	grep FILE_x $id.warning &&
+	grep file_X $id.warning &&
+	grep file_x $id.warning &&
+	test_i18ngrep "the following paths have collided" $id.warning
+}
+
+test_expect_success CASE_INSENSITIVE_FS 'collision detection on parallel clone' '
+	clone_and_check_collision parallel 2 0 2
+'
+
+test_expect_success CASE_INSENSITIVE_FS 'collision detection on fallback to sequential clone' '
+	git ls-tree --name-only -r collisions >files &&
+	nr_files=$(wc -l <files) &&
+	threshold=$(($nr_files + 1)) &&
+	clone_and_check_collision sequential 2 $threshold 0
+'
+
+# The next two tests don't work on Windows because, on that system, collision
+# detection uses strcmp() (when core.ignoreCase=0) to find the colliding pair.
+# But they work on OSX, where collision detection uses the inode.
+
+test_expect_success CASE_INSENSITIVE_FS,!MINGW,!CYGWIN 'collision detection on parallel clone w/ filter' '
+	clone_and_check_collision parallel-with-filter 2 0 2 use_filter
+'
+
+test_expect_success CASE_INSENSITIVE_FS,!MINGW,!CYGWIN 'collision detection on fallback to sequential clone w/ filter' '
+	git ls-tree --name-only -r collisions >files &&
+	nr_files=$(wc -l <files) &&
+	threshold=$(($nr_files + 1)) &&
+	clone_and_check_collision sequential-with-filter 2 $threshold 0 use_filter
+'
+
+test_done
-- 
2.28.0


^ permalink raw reply	[flat|nested] 154+ messages in thread

* [PATCH v2 18/19] parallel-checkout: add tests related to .gitattributes
  2020-09-22 22:49 ` [PATCH v2 00/19] Parallel Checkout (part I) Matheus Tavares
                     ` (16 preceding siblings ...)
  2020-09-22 22:49   ` [PATCH v2 17/19] parallel-checkout: add tests related to clone collisions Matheus Tavares
@ 2020-09-22 22:49   ` Matheus Tavares
  2020-09-22 22:49   ` [PATCH v2 19/19] ci: run test round with parallel-checkout enabled Matheus Tavares
  2020-10-29  2:14   ` [PATCH v3 00/19] Parallel Checkout (part I) Matheus Tavares
  19 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-09-22 22:49 UTC (permalink / raw)
  To: git; +Cc: jeffhost, chriscool, peff, t.gummerer, newren

Add tests to confirm that `struct conv_attrs` data is correctly passed
from the main process to the workers, and that they properly smudge
files before writing to the working tree. Also check that
non-parallel-eligible entries, such as regular files that require
external filters, are correctly smudged and written when
parallel-checkout is enabled.

Note: to avoid repeating code, some helper functions are extracted from
t0028 into a common lib file.

Original-patch-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 t/lib-encoding.sh                       |  25 ++++
 t/t0028-working-tree-encoding.sh        |  25 +---
 t/t2082-parallel-checkout-attributes.sh | 174 ++++++++++++++++++++++++
 3 files changed, 200 insertions(+), 24 deletions(-)
 create mode 100644 t/lib-encoding.sh
 create mode 100755 t/t2082-parallel-checkout-attributes.sh

diff --git a/t/lib-encoding.sh b/t/lib-encoding.sh
new file mode 100644
index 0000000000..c52ffbbed5
--- /dev/null
+++ b/t/lib-encoding.sh
@@ -0,0 +1,25 @@
+# Encoding helpers used by t0028 and t2082
+
+test_lazy_prereq NO_UTF16_BOM '
+	test $(printf abc | iconv -f UTF-8 -t UTF-16 | wc -c) = 6
+'
+
+test_lazy_prereq NO_UTF32_BOM '
+	test $(printf abc | iconv -f UTF-8 -t UTF-32 | wc -c) = 12
+'
+
+write_utf16 () {
+	if test_have_prereq NO_UTF16_BOM
+	then
+		printf '\376\377'
+	fi &&
+	iconv -f UTF-8 -t UTF-16
+}
+
+write_utf32 () {
+	if test_have_prereq NO_UTF32_BOM
+	then
+		printf '\0\0\376\377'
+	fi &&
+	iconv -f UTF-8 -t UTF-32
+}
diff --git a/t/t0028-working-tree-encoding.sh b/t/t0028-working-tree-encoding.sh
index bfc4fb9af5..4fffc3a639 100755
--- a/t/t0028-working-tree-encoding.sh
+++ b/t/t0028-working-tree-encoding.sh
@@ -3,33 +3,10 @@
 test_description='working-tree-encoding conversion via gitattributes'
 
 . ./test-lib.sh
+. "$TEST_DIRECTORY/lib-encoding.sh"
 
 GIT_TRACE_WORKING_TREE_ENCODING=1 && export GIT_TRACE_WORKING_TREE_ENCODING
 
-test_lazy_prereq NO_UTF16_BOM '
-	test $(printf abc | iconv -f UTF-8 -t UTF-16 | wc -c) = 6
-'
-
-test_lazy_prereq NO_UTF32_BOM '
-	test $(printf abc | iconv -f UTF-8 -t UTF-32 | wc -c) = 12
-'
-
-write_utf16 () {
-	if test_have_prereq NO_UTF16_BOM
-	then
-		printf '\376\377'
-	fi &&
-	iconv -f UTF-8 -t UTF-16
-}
-
-write_utf32 () {
-	if test_have_prereq NO_UTF32_BOM
-	then
-		printf '\0\0\376\377'
-	fi &&
-	iconv -f UTF-8 -t UTF-32
-}
-
 test_expect_success 'setup test files' '
 	git config core.eol lf &&
 
diff --git a/t/t2082-parallel-checkout-attributes.sh b/t/t2082-parallel-checkout-attributes.sh
new file mode 100755
index 0000000000..6800574588
--- /dev/null
+++ b/t/t2082-parallel-checkout-attributes.sh
@@ -0,0 +1,174 @@
+#!/bin/sh
+
+test_description='parallel-checkout: attributes
+
+Verify that parallel-checkout correctly creates files that require
+conversions, as specified in .gitattributes. The main point here is
+to check that the conv_attr data is correctly sent to the workers
+and that it contains sufficient information to smudge files
+properly (without access to the index or attribute stack).
+'
+
+TEST_NO_CREATE_REPO=1
+. ./test-lib.sh
+. "$TEST_DIRECTORY/lib-parallel-checkout.sh"
+. "$TEST_DIRECTORY/lib-encoding.sh"
+
+test_expect_success 'parallel-checkout with ident' '
+	git init ident &&
+	(
+		cd ident &&
+		echo "A ident" >.gitattributes &&
+		echo "\$Id\$" >A &&
+		echo "\$Id\$" >B &&
+		git add -A &&
+		git commit -m id &&
+
+		rm A B &&
+		git_pc 2 0 2 reset --hard &&
+		hexsz=$(test_oid hexsz) &&
+		grep -E "\\\$Id: [0-9a-f]{$hexsz} \\\$" A &&
+		grep "\\\$Id\\\$" B
+	)
+'
+
+test_expect_success 'parallel-checkout with re-encoding' '
+	git init encoding &&
+	(
+		cd encoding &&
+		echo text >utf8-text &&
+		cat utf8-text | write_utf16 >utf16-text &&
+
+		echo "A working-tree-encoding=UTF-16" >.gitattributes &&
+		cp utf16-text A &&
+		cp utf16-text B &&
+		git add A B .gitattributes &&
+		git commit -m encoding &&
+
+		# Check that A (and only A) is stored in UTF-8
+		git cat-file -p :A >A.internal &&
+		test_cmp_bin utf8-text A.internal &&
+		git cat-file -p :B >B.internal &&
+		test_cmp_bin utf16-text B.internal &&
+
+		# Check that A is re-encoded during checkout
+		rm A B &&
+		git_pc 2 0 2 checkout A B &&
+		test_cmp_bin utf16-text A
+	)
+'
+
+test_expect_success 'parallel-checkout with eol conversions' '
+	git init eol &&
+	(
+		cd eol &&
+		git config core.autocrlf false &&
+		printf "multi\r\nline\r\ntext" >crlf-text &&
+		printf "multi\nline\ntext" >lf-text &&
+
+		echo "A text eol=crlf" >.gitattributes &&
+		echo "B -text" >>.gitattributes &&
+		cp crlf-text A &&
+		cp crlf-text B &&
+		git add A B .gitattributes &&
+		git commit -m eol &&
+
+		# Check that A (and only A) is stored with LF format
+		git cat-file -p :A >A.internal &&
+		test_cmp_bin lf-text A.internal &&
+		git cat-file -p :B >B.internal &&
+		test_cmp_bin crlf-text B.internal &&
+
+		# Check that A is converted to CRLF during checkout
+		rm A B &&
+		git_pc 2 0 2 checkout A B &&
+		test_cmp_bin crlf-text A
+	)
+'
+
+test_cmp_str()
+{
+	echo "$1" >tmp &&
+	test_cmp tmp "$2"
+}
+
+# Entries that require an external filter are not eligible for parallel
+# checkout. Check that both the parallel-eligible and non-eligible entries are
+# properly written in a single checkout process.
+#
+test_expect_success 'parallel-checkout and external filter' '
+	git init filter &&
+	(
+		cd filter &&
+		git config filter.x2y.clean "tr x y" &&
+		git config filter.x2y.smudge "tr y x" &&
+		git config filter.x2y.required true &&
+
+		echo "A filter=x2y" >.gitattributes &&
+		echo x >A &&
+		echo x >B &&
+		echo x >C &&
+		git add -A &&
+		git commit -m filter &&
+
+		# Check that A (and only A) was cleaned
+		git cat-file -p :A >A.internal &&
+		test_cmp_str y A.internal &&
+		git cat-file -p :B >B.internal &&
+		test_cmp_str x B.internal &&
+		git cat-file -p :C >C.internal &&
+		test_cmp_str x C.internal &&
+
+		rm A B C *.internal &&
+		git_pc 2 0 2 checkout A B C &&
+		test_cmp_str x A &&
+		test_cmp_str x B &&
+		test_cmp_str x C
+	)
+'
+
+# The delayed queue is independent from the parallel queue, and they should be
+# able to work together in the same checkout process.
+#
+test_expect_success PERL 'parallel-checkout and delayed checkout' '
+	write_script rot13-filter.pl "$PERL_PATH" \
+		<"$TEST_DIRECTORY"/t0021/rot13-filter.pl &&
+	test_config_global filter.delay.process \
+		"\"$(pwd)/rot13-filter.pl\" \"$(pwd)/delayed.log\" clean smudge delay" &&
+	test_config_global filter.delay.required true &&
+
+	echo "a b c" >delay-content &&
+	echo "n o p" >delay-rot13-content &&
+
+	git init delayed &&
+	(
+		cd delayed &&
+		echo "*.a filter=delay" >.gitattributes &&
+		cp ../delay-content test-delay10.a &&
+		cp ../delay-content test-delay11.a &&
+		echo parallel >parallel1.b &&
+		echo parallel >parallel2.b &&
+		git add -A &&
+		git commit -m delayed &&
+
+		# Check that the stored data was cleaned
+		git cat-file -p :test-delay10.a > delay10.internal &&
+		test_cmp delay10.internal ../delay-rot13-content &&
+		git cat-file -p :test-delay11.a > delay11.internal &&
+		test_cmp delay11.internal ../delay-rot13-content &&
+		rm *.internal &&
+
+		rm *.a *.b
+	) &&
+
+	git_pc 2 0 2 -C delayed checkout -f &&
+	verify_checkout delayed &&
+
+	# Check that the *.a files got to the delay queue and were filtered
+	grep "smudge test-delay10.a .* \[DELAYED\]" delayed.log &&
+	grep "smudge test-delay11.a .* \[DELAYED\]" delayed.log &&
+	test_cmp delayed/test-delay10.a delay-content &&
+	test_cmp delayed/test-delay11.a delay-content
+'
+
+test_done
-- 
2.28.0


^ permalink raw reply	[flat|nested] 154+ messages in thread
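The `filter.x2y` configuration in the external-filter test above can be mimicked outside of git to see the clean/smudge round trip; a small sketch (the `clean`/`smudge` function names are illustrative stand-ins for the configured `tr` commands):

```shell
# Mirrors filter.x2y above: clean maps x->y when staging,
# smudge maps y back to x on checkout.
clean () { tr x y; }
smudge () { tr y x; }
echo x | clean            # stored in the object database as "y"
echo x | clean | smudge   # written back to the working tree as "x"
```

The test's `cat-file -p :A` checks verify the first half of this round trip, and the parallel checkout verifies the second.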

* [PATCH v2 19/19] ci: run test round with parallel-checkout enabled
  2020-09-22 22:49 ` [PATCH v2 00/19] Parallel Checkout (part I) Matheus Tavares
                     ` (17 preceding siblings ...)
  2020-09-22 22:49   ` [PATCH v2 18/19] parallel-checkout: add tests related to .gitattributes Matheus Tavares
@ 2020-09-22 22:49   ` Matheus Tavares
  2020-10-29  2:14   ` [PATCH v3 00/19] Parallel Checkout (part I) Matheus Tavares
  19 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-09-22 22:49 UTC (permalink / raw)
  To: git; +Cc: jeffhost, chriscool, peff, t.gummerer, newren

We already have tests for the basic parallel-checkout operations. But
this code can also run in other commands, such as git-read-tree and
git-sparse-checkout, which are currently not tested with multiple
workers. To promote wider test coverage without duplicating tests:

1. Add the GIT_TEST_CHECKOUT_WORKERS environment variable, to optionally
   force parallel-checkout execution during the whole test suite.

2. Include this variable in the second test round of the linux-gcc job
   of our ci scripts. This round runs `make test` again with some
   optional GIT_TEST_* variables enabled, so there is no additional
   overhead in exercising the parallel-checkout code here.

Note: the specific parallel-checkout tests t208* cannot be used in
combination with GIT_TEST_CHECKOUT_WORKERS as they need to set and check
the number of workers by themselves. So skip those tests when this flag
is set.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 ci/run-build-and-tests.sh               |  1 +
 parallel-checkout.c                     | 14 ++++++++++++++
 t/README                                |  4 ++++
 t/lib-parallel-checkout.sh              |  6 ++++++
 t/t2081-parallel-checkout-collisions.sh |  1 +
 5 files changed, 26 insertions(+)

diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
index 6c27b886b8..aa32ddc361 100755
--- a/ci/run-build-and-tests.sh
+++ b/ci/run-build-and-tests.sh
@@ -22,6 +22,7 @@ linux-gcc)
 	export GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS=1
 	export GIT_TEST_MULTI_PACK_INDEX=1
 	export GIT_TEST_ADD_I_USE_BUILTIN=1
+	export GIT_TEST_CHECKOUT_WORKERS=2
 	make test
 	;;
 linux-clang)
diff --git a/parallel-checkout.c b/parallel-checkout.c
index 5156b14c53..94b44d2a48 100644
--- a/parallel-checkout.c
+++ b/parallel-checkout.c
@@ -32,6 +32,20 @@ enum pc_status parallel_checkout_status(void)
 
 void get_parallel_checkout_configs(int *num_workers, int *threshold)
 {
+	char *env_workers = getenv("GIT_TEST_CHECKOUT_WORKERS");
+
+	if (env_workers && *env_workers) {
+		if (strtol_i(env_workers, 10, num_workers)) {
+			die("invalid value for GIT_TEST_CHECKOUT_WORKERS: '%s'",
+			    env_workers);
+		}
+		if (*num_workers < 1)
+			*num_workers = online_cpus();
+
+		*threshold = 0;
+		return;
+	}
+
 	if (git_config_get_int("checkout.workers", num_workers))
 		*num_workers = 1;
 	else if (*num_workers < 1)
diff --git a/t/README b/t/README
index 2adaf7c2d2..cd1b15c55a 100644
--- a/t/README
+++ b/t/README
@@ -425,6 +425,10 @@ GIT_TEST_DEFAULT_HASH=<hash-algo> specifies which hash algorithm to
 use in the test scripts. Recognized values for <hash-algo> are "sha1"
 and "sha256".
 
+GIT_TEST_CHECKOUT_WORKERS=<n> overrides the 'checkout.workers' setting
+to <n> and 'checkout.thresholdForParallelism' to 0, forcing the
+execution of the parallel-checkout code.
+
 Naming Tests
 ------------
 
diff --git a/t/lib-parallel-checkout.sh b/t/lib-parallel-checkout.sh
index c95ca27711..80bb0a0900 100644
--- a/t/lib-parallel-checkout.sh
+++ b/t/lib-parallel-checkout.sh
@@ -1,5 +1,11 @@
 # Helpers for t208* tests
 
+if ! test -z "$GIT_TEST_CHECKOUT_WORKERS"
+then
+	skip_all="skipping test, GIT_TEST_CHECKOUT_WORKERS is set"
+	test_done
+fi
+
 # Runs `git -c checkout.workers=$1 -c checkout.thesholdForParallelism=$2 ${@:4}`
 # and checks that the number of workers spawned is equal to $3.
 git_pc()
diff --git a/t/t2081-parallel-checkout-collisions.sh b/t/t2081-parallel-checkout-collisions.sh
index 3ce195b892..5dbff54bfb 100755
--- a/t/t2081-parallel-checkout-collisions.sh
+++ b/t/t2081-parallel-checkout-collisions.sh
@@ -3,6 +3,7 @@
 test_description='parallel-checkout collisions'
 
 . ./test-lib.sh
+. "$TEST_DIRECTORY/lib-parallel-checkout.sh"
 
 # When there are pathname collisions during a clone, Git should report a warning
 # listing all of the colliding entries. The sequential code detects a collision
-- 
2.28.0


^ permalink raw reply	[flat|nested] 154+ messages in thread
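The C hunk in `get_parallel_checkout_configs()` above can be paraphrased in shell to show the override order; this is an illustrative sketch, not code from the patch (the `online_cpus()` fallback for values below one is omitted):

```shell
# The env var, when set, wins over checkout.workers and forces
# checkout.thresholdForParallelism down to 0, so even tiny checkouts
# exercise the parallel code path.
GIT_TEST_CHECKOUT_WORKERS=2
num_workers=1 threshold=100   # stand-ins for the configured values
if [ -n "$GIT_TEST_CHECKOUT_WORKERS" ]
then
	num_workers=$GIT_TEST_CHECKOUT_WORKERS
	threshold=0
fi
echo "$num_workers $threshold"
```

This is also why the t208* tests must skip themselves when the variable is set: they assert exact worker counts that the override would clobber.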

* Re: [PATCH v2 11/19] parallel-checkout: make it truly parallel
  2020-09-22 22:49   ` [PATCH v2 11/19] parallel-checkout: make it truly parallel Matheus Tavares
@ 2020-09-29 19:52     ` Martin Ågren
  2020-09-30 14:02       ` Matheus Tavares Bernardino
  0 siblings, 1 reply; 154+ messages in thread
From: Martin Ågren @ 2020-09-29 19:52 UTC (permalink / raw)
  To: Matheus Tavares
  Cc: Git Mailing List, Jeff Hostetler, Christian Couder, Jeff King,
	Thomas Gummerer, Elijah Newren

Hi Matheus,

On Wed, 23 Sep 2020 at 00:53, Matheus Tavares <matheus.bernardino@usp.br> wrote:
> --- a/Documentation/config/checkout.txt
> +++ b/Documentation/config/checkout.txt
> @@ -16,3 +16,24 @@ will checkout the '<something>' branch on another remote,
>  and by linkgit:git-worktree[1] when 'git worktree add' refers to a
>  remote branch. This setting might be used for other checkout-like
>  commands or functionality in the future.
> +
> +checkout.workers::
> +       The number of parallel workers to use when updating the working tree.
> +       The default is one, i.e. sequential execution. If set to a value less
> +       than one, Git will use as many workers as the number of logical cores
> +       available. This setting and checkout.thresholdForParallelism affect all

If you end up rerolling this patch series for other reasons, you might
want to consider using `backticks` around
`checkout.thresholdForParallelism` so that it gets typeset as monospace.

> +       commands that perform checkout. E.g. checkout, switch, clone, reset,
> +       sparse-checkout, read-tree, etc.

Similarly here. Or perhaps go for "linkgit:git-checkout[1],
linkgit:git-switch[1]" etc.

BTW, as far as "e.g." goes, this list looks quite long. :) I almost
get the feeling you've made it fairly exhaustive and added the "e.g.,"
more as future proofing than anything else. I don't think anyone would
complain if you left out, say, the plumbing `git read-tree` from the list.

> +Note: parallel checkout usually delivers better performance for repositories
> +located on SSDs or over NFS. For repositories on spinning disks and/or machines
> +with a small number of cores, the default sequential checkout often performs
> +better. The size and compression level of a repository might also influence how
> +well the parallel version performs.
> +
> +checkout.thresholdForParallelism::

Sorry if this has already been discussed, but this "For" looks somewhat
odd. Basically every config knob is "somethingForSomething". ;-) Could
this be `checkout.parallelismThreshold`?


> +       When running parallel checkout with a small number of files, the cost
> +       of subprocess spawning and inter-process communication might outweigh
> +       the parallelization gains. This setting allows to define the minimum
> +       number of files for which parallel checkout should be attempted. The
> +       default is 100.

Martin

^ permalink raw reply	[flat|nested] 154+ messages in thread
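The quoted documentation says a value below one means one worker per logical core; a hedged sketch of that resolution rule (`getconf _NPROCESSORS_ONLN` stands in for git's internal `online_cpus()`; the function name is illustrative, not from the patch):

```shell
# Resolve the effective worker count from a checkout.workers value:
# anything below 1 means "one worker per logical core".
resolve_workers () {
	if [ "$1" -lt 1 ]
	then
		getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1
	else
		echo "$1"
	fi
}
resolve_workers 8   # -> 8
resolve_workers 0   # -> the number of logical cores on this machine
```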

* Re: [PATCH v2 11/19] parallel-checkout: make it truly parallel
  2020-09-29 19:52     ` Martin Ågren
@ 2020-09-30 14:02       ` Matheus Tavares Bernardino
  0 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares Bernardino @ 2020-09-30 14:02 UTC (permalink / raw)
  To: Martin Ågren
  Cc: Git Mailing List, Jeff Hostetler, Christian Couder, Jeff King,
	Thomas Gummerer, Elijah Newren

Hi, Martin

On Tue, Sep 29, 2020 at 4:52 PM Martin Ågren <martin.agren@gmail.com> wrote:
>
> Hi Matheus,
>
> On Wed, 23 Sep 2020 at 00:53, Matheus Tavares <matheus.bernardino@usp.br> wrote:
> > --- a/Documentation/config/checkout.txt
> > +++ b/Documentation/config/checkout.txt
> > @@ -16,3 +16,24 @@ will checkout the '<something>' branch on another remote,
> >  and by linkgit:git-worktree[1] when 'git worktree add' refers to a
> >  remote branch. This setting might be used for other checkout-like
> >  commands or functionality in the future.
> > +
> > +checkout.workers::
> > +       The number of parallel workers to use when updating the working tree.
> > +       The default is one, i.e. sequential execution. If set to a value less
> > +       than one, Git will use as many workers as the number of logical cores
> > +       available. This setting and checkout.thresholdForParallelism affect all
>
> If you end up rerolling this patch series for other reasons, you might
> want to consider using `backticks` around
> `checkout.thresholdForParallelism` so that it gets typeset as monospace.

Sure, thanks!

> > +       commands that perform checkout. E.g. checkout, switch, clone, reset,
> > +       sparse-checkout, read-tree, etc.
>
> Similarly here. Or perhaps go for "linkgit:git-checkout[1],
> linkgit:git-switch[1]" etc.
>
> BTW, as far as ".e.g.," goes, this list looks quite long. :) I almost
> get the feeling you've made it fairly exhaustive and added the "e.g.,"
> more as future proofing than anything else. I don't think anyone would
> complain if you left out, say, the plumbing `git read-tree` from the list.

Yeah, I might have gotten a little too carried away there hehe :)

> > +Note: parallel checkout usually delivers better performance for repositories
> > +located on SSDs or over NFS. For repositories on spinning disks and/or machines
> > +with a small number of cores, the default sequential checkout often performs
> > +better. The size and compression level of a repository might also influence how
> > +well the parallel version performs.
> > +
> > +checkout.thresholdForParallelism::
>
> Sorry if this has already been discussed, but this "For" looks somewhat
> odd. Basically every config knob is "somethingForSomething". ;-) Could
> this be `checkout.parallelismThreshold`?

Thanks for the suggestion. TBH, I spent more time than I'd like to
admit trying to come up with a name for this setting... But
`checkout.parallelismThreshold` does sound better.

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [PATCH v2 08/19] entry: move conv_attrs lookup up to checkout_entry()
  2020-09-22 22:49   ` [PATCH v2 08/19] entry: move conv_attrs lookup up to checkout_entry() Matheus Tavares
@ 2020-10-01 15:53     ` Jeff Hostetler
  2020-10-01 15:59       ` Jeff Hostetler
  0 siblings, 1 reply; 154+ messages in thread
From: Jeff Hostetler @ 2020-10-01 15:53 UTC (permalink / raw)
  To: Matheus Tavares, git; +Cc: jeffhost, chriscool, peff, t.gummerer, newren



On 9/22/20 6:49 PM, Matheus Tavares wrote:
> In a following patch, checkout_entry() will use conv_attrs to decide
> whether an entry should be enqueued for parallel checkout or not. But
> the attributes lookup only happens lower in this call stack. To avoid
> the unnecessary work of loading the attributes twice, let's move it up
> to checkout_entry(), and pass the loaded struct down to write_entry().
> 
> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
> ---
>   entry.c | 38 +++++++++++++++++++++++++++-----------
>   1 file changed, 27 insertions(+), 11 deletions(-)
> 
> diff --git a/entry.c b/entry.c
> index 1d2df188e5..8237859b12 100644
> --- a/entry.c
> +++ b/entry.c
> @@ -263,8 +263,9 @@ void update_ce_after_write(const struct checkout *state, struct cache_entry *ce,
>   	}
>   }
>   
> -static int write_entry(struct cache_entry *ce,
> -		       char *path, const struct checkout *state, int to_tempfile)
> +/* Note: ca is used (and required) iff the entry refers to a regular file. */
> +static int write_entry(struct cache_entry *ce, char *path, struct conv_attrs *ca,
> +		       const struct checkout *state, int to_tempfile)
>   {
>   	unsigned int ce_mode_s_ifmt = ce->ce_mode & S_IFMT;
>   	struct delayed_checkout *dco = state->delayed_checkout;
> @@ -281,8 +282,7 @@ static int write_entry(struct cache_entry *ce,
>   	clone_checkout_metadata(&meta, &state->meta, &ce->oid);
>   
>   	if (ce_mode_s_ifmt == S_IFREG) {
> -		struct stream_filter *filter = get_stream_filter(state->istate, ce->name,
> -								 &ce->oid);
> +		struct stream_filter *filter = get_stream_filter_ca(ca, &ce->oid);
>   		if (filter &&
>   		    !streaming_write_entry(ce, path, filter,
>   					   state, to_tempfile,
> @@ -329,14 +329,17 @@ static int write_entry(struct cache_entry *ce,
>   		 * Convert from git internal format to working tree format
>   		 */
>   		if (dco && dco->state != CE_NO_DELAY) {
> -			ret = async_convert_to_working_tree(state->istate, ce->name, new_blob,
> -							    size, &buf, &meta, dco);
> +			ret = async_convert_to_working_tree_ca(ca, ce->name,
> +							       new_blob, size,
> +							       &buf, &meta, dco);
>   			if (ret && string_list_has_string(&dco->paths, ce->name)) {
>   				free(new_blob);
>   				goto delayed;
>   			}
> -		} else
> -			ret = convert_to_working_tree(state->istate, ce->name, new_blob, size, &buf, &meta);
> +		} else {
> +			ret = convert_to_working_tree_ca(ca, ce->name, new_blob,
> +							 size, &buf, &meta);
> +		}
>   
>   		if (ret) {
>   			free(new_blob);
> @@ -442,6 +445,7 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state,
>   {
>   	static struct strbuf path = STRBUF_INIT;
>   	struct stat st;
> +	struct conv_attrs ca;

I have to wonder if it would be clearer to move this declaration of `ca`
into the two `if { ... }` blocks where it is used -- to indicate that it
is only defined in two cases where we call `convert_attrs()`.

There are several other calls to `write_entry()` that pass NULL and it
could cause confusion.


>   
>   	if (ce->ce_flags & CE_WT_REMOVE) {
>   		if (topath)
> @@ -454,8 +458,13 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state,
>   		return 0;
>   	}
>   
> -	if (topath)
> -		return write_entry(ce, topath, state, 1);
> +	if (topath) {
> +		if (S_ISREG(ce->ce_mode)) {
> +			convert_attrs(state->istate, &ca, ce->name);
> +			return write_entry(ce, topath, &ca, state, 1);
> +		}
> +		return write_entry(ce, topath, NULL, state, 1);
> +	}
>   
>   	strbuf_reset(&path);
>   	strbuf_add(&path, state->base_dir, state->base_dir_len);
> @@ -517,9 +526,16 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state,
>   		return 0;
>   
>   	create_directories(path.buf, path.len, state);
> +
>   	if (nr_checkouts)
>   		(*nr_checkouts)++;
> -	return write_entry(ce, path.buf, state, 0);
> +
> +	if (S_ISREG(ce->ce_mode)) {
> +		convert_attrs(state->istate, &ca, ce->name);
> +		return write_entry(ce, path.buf, &ca, state, 0);
> +	}
> +
> +	return write_entry(ce, path.buf, NULL, state, 0);
>   }
>   
>   void unlink_entry(const struct cache_entry *ce)
> 

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [PATCH v2 08/19] entry: move conv_attrs lookup up to checkout_entry()
  2020-10-01 15:53     ` Jeff Hostetler
@ 2020-10-01 15:59       ` Jeff Hostetler
  0 siblings, 0 replies; 154+ messages in thread
From: Jeff Hostetler @ 2020-10-01 15:59 UTC (permalink / raw)
  To: Matheus Tavares, git; +Cc: jeffhost, chriscool, peff, t.gummerer, newren



On 10/1/20 11:53 AM, Jeff Hostetler wrote:
> 
> 
> On 9/22/20 6:49 PM, Matheus Tavares wrote:
>> In a following patch, checkout_entry() will use conv_attrs to decide
>> whether an entry should be enqueued for parallel checkout or not. But
>> the attributes lookup only happens lower in this call stack. To avoid
>> the unnecessary work of loading the attributes twice, let's move it up
>> to checkout_entry(), and pass the loaded struct down to write_entry().
>>
>> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
>> ---
>>   entry.c | 38 +++++++++++++++++++++++++++-----------
>>   1 file changed, 27 insertions(+), 11 deletions(-)
>>
>> diff --git a/entry.c b/entry.c
>> index 1d2df188e5..8237859b12 100644
>> --- a/entry.c
>> +++ b/entry.c
>> @@ -263,8 +263,9 @@ void update_ce_after_write(const struct checkout *state, struct cache_entry *ce,
>>       }
>>   }
>> -static int write_entry(struct cache_entry *ce,
>> -               char *path, const struct checkout *state, int to_tempfile)
>> +/* Note: ca is used (and required) iff the entry refers to a regular file. */
>> +static int write_entry(struct cache_entry *ce, char *path, struct conv_attrs *ca,
>> +               const struct checkout *state, int to_tempfile)
>>   {
>>       unsigned int ce_mode_s_ifmt = ce->ce_mode & S_IFMT;
>>       struct delayed_checkout *dco = state->delayed_checkout;
>> @@ -281,8 +282,7 @@ static int write_entry(struct cache_entry *ce,
>>       clone_checkout_metadata(&meta, &state->meta, &ce->oid);
>>       if (ce_mode_s_ifmt == S_IFREG) {
>> -        struct stream_filter *filter = get_stream_filter(state->istate, ce->name,
>> -                                 &ce->oid);
>> +        struct stream_filter *filter = get_stream_filter_ca(ca, &ce->oid);
>>           if (filter &&
>>               !streaming_write_entry(ce, path, filter,
>>                          state, to_tempfile,
>> @@ -329,14 +329,17 @@ static int write_entry(struct cache_entry *ce,
>>            * Convert from git internal format to working tree format
>>            */
>>           if (dco && dco->state != CE_NO_DELAY) {
>> -            ret = async_convert_to_working_tree(state->istate, ce->name, new_blob,
>> -                                size, &buf, &meta, dco);
>> +            ret = async_convert_to_working_tree_ca(ca, ce->name,
>> +                                   new_blob, size,
>> +                                   &buf, &meta, dco);
>>               if (ret && string_list_has_string(&dco->paths, ce->name)) {
>>                   free(new_blob);
>>                   goto delayed;
>>               }
>> -        } else
>> -            ret = convert_to_working_tree(state->istate, ce->name, new_blob, size, &buf, &meta);
>> +        } else {
>> +            ret = convert_to_working_tree_ca(ca, ce->name, new_blob,
>> +                             size, &buf, &meta);
>> +        }
>>           if (ret) {
>>               free(new_blob);
>> @@ -442,6 +445,7 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state,
>>   {
>>       static struct strbuf path = STRBUF_INIT;
>>       struct stat st;
>> +    struct conv_attrs ca;
> 
> I have to wonder if it would be clearer to move this declaration of `ca`
> into the two `if { ... }` blocks where it is used -- to indicate that it
> is only defined in two cases where we call `convert_attrs()`.
> 
> There are several other calls to `write_entry()` that pass NULL and it
> could cause confusion.


Nevermind, I see what you did in step 9 and this makes sense.

> 
> 
>>       if (ce->ce_flags & CE_WT_REMOVE) {
>>           if (topath)
>> @@ -454,8 +458,13 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state,
>>           return 0;
>>       }
>> -    if (topath)
>> -        return write_entry(ce, topath, state, 1);
>> +    if (topath) {
>> +        if (S_ISREG(ce->ce_mode)) {
>> +            convert_attrs(state->istate, &ca, ce->name);
>> +            return write_entry(ce, topath, &ca, state, 1);
>> +        }
>> +        return write_entry(ce, topath, NULL, state, 1);
>> +    }
>>       strbuf_reset(&path);
>>       strbuf_add(&path, state->base_dir, state->base_dir_len);
>> @@ -517,9 +526,16 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state,
>>           return 0;
>>       create_directories(path.buf, path.len, state);
>> +
>>       if (nr_checkouts)
>>           (*nr_checkouts)++;
>> -    return write_entry(ce, path.buf, state, 0);
>> +
>> +    if (S_ISREG(ce->ce_mode)) {
>> +        convert_attrs(state->istate, &ca, ce->name);
>> +        return write_entry(ce, path.buf, &ca, state, 0);
>> +    }
>> +
>> +    return write_entry(ce, path.buf, NULL, state, 0);
>>   }
>>   void unlink_entry(const struct cache_entry *ce)
>>

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH 00/21] [RFC] Parallel checkout
  2020-08-10 21:33 [RFC PATCH 00/21] [RFC] Parallel checkout Matheus Tavares
                   ` (22 preceding siblings ...)
  2020-09-22 22:49 ` [PATCH v2 00/19] Parallel Checkout (part I) Matheus Tavares
@ 2020-10-01 16:42 ` Jeff Hostetler
  23 siblings, 0 replies; 154+ messages in thread
From: Jeff Hostetler @ 2020-10-01 16:42 UTC (permalink / raw)
  To: Matheus Tavares, git; +Cc: stolee, jeffhost



On 8/10/20 5:33 PM, Matheus Tavares wrote:
> This series adds parallel workers to the checkout machinery. The cache
> entries are distributed among helper processes which are responsible for
> reading, filtering and writing the blobs to the working tree. This
> should benefit all commands that call unpack_trees() or check_updates(),
> such as: checkout, clone, sparse-checkout, checkout-index, etc.

This series looks very good!
Thanks for your attention to detail.

Jeff


^ permalink raw reply	[flat|nested] 154+ messages in thread

* [PATCH] parallel-checkout: drop unused checkout state parameter
  2020-09-22 22:49   ` [PATCH v2 10/19] unpack-trees: add basic support for parallel checkout Matheus Tavares
@ 2020-10-05  6:17     ` Jeff King
  2020-10-05 13:13       ` Matheus Tavares Bernardino
  0 siblings, 1 reply; 154+ messages in thread
From: Jeff King @ 2020-10-05  6:17 UTC (permalink / raw)
  To: Matheus Tavares
  Cc: Junio C Hamano, git, jeffhost, chriscool, t.gummerer, newren

On Tue, Sep 22, 2020 at 07:49:24PM -0300, Matheus Tavares wrote:

> +static int write_pc_item_to_fd(struct parallel_checkout_item *pc_item, int fd,
> +			       const char *path, struct checkout *state)

The "state" parameter is unused here. Maybe this on top of
mt/parallel-checkout-part-1?

-- >8 --
Subject: parallel-checkout: drop unused checkout state parameter

The write_pc_item_to_fd() function takes a "struct checkout *state"
parameter, but never uses it. This was true in its introduction in
fa33dd99f0 (unpack-trees: add basic support for parallel checkout,
2020-09-22). Its caller, write_pc_item(), has already pulled the useful
bits from the state struct into the "path" variable. Let's drop the
useless parameter.

Signed-off-by: Jeff King <peff@peff.net>
---
 parallel-checkout.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/parallel-checkout.c b/parallel-checkout.c
index 94b44d2a48..d077618719 100644
--- a/parallel-checkout.c
+++ b/parallel-checkout.c
@@ -233,7 +233,7 @@ static int reset_fd(int fd, const char *path)
 }
 
 static int write_pc_item_to_fd(struct parallel_checkout_item *pc_item, int fd,
-			       const char *path, struct checkout *state)
+			       const char *path)
 {
 	int ret;
 	struct stream_filter *filter;
@@ -347,7 +347,7 @@ void write_pc_item(struct parallel_checkout_item *pc_item,
 		goto out;
 	}
 
-	if (write_pc_item_to_fd(pc_item, fd, path.buf, state)) {
+	if (write_pc_item_to_fd(pc_item, fd, path.buf)) {
 		/* Error was already reported. */
 		pc_item->status = PC_ITEM_FAILED;
 		goto out;
-- 
2.28.0.1295.g4824feede7


^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [PATCH] parallel-checkout: drop unused checkout state parameter
  2020-10-05  6:17     ` [PATCH] parallel-checkout: drop unused checkout state parameter Jeff King
@ 2020-10-05 13:13       ` Matheus Tavares Bernardino
  2020-10-05 13:45         ` Jeff King
  0 siblings, 1 reply; 154+ messages in thread
From: Matheus Tavares Bernardino @ 2020-10-05 13:13 UTC (permalink / raw)
  To: Jeff King
  Cc: Junio C Hamano, git, Jeff Hostetler, Christian Couder,
	Thomas Gummerer, Elijah Newren

On Mon, Oct 5, 2020 at 3:17 AM Jeff King <peff@peff.net> wrote:
>
> On Tue, Sep 22, 2020 at 07:49:24PM -0300, Matheus Tavares wrote:
>
> > +static int write_pc_item_to_fd(struct parallel_checkout_item *pc_item, int fd,
> > +                            const char *path, struct checkout *state)
>
> The "state" parameter is unused here. Maybe this on top of
> mt/parallel-checkout-part-1?
>
> -- >8 --
> Subject: parallel-checkout: drop unused checkout state parameter
>
> The write_pc_item_to_fd() function takes a "struct checkout *state"
> parameter, but never uses it. This was true in its introduction in
> fa33dd99f0 (unpack-trees: add basic support for parallel checkout,
> 2020-09-22). Its caller, write_pc_item(), has already pulled the useful
> bits from the state struct into the "path" variable. Let's drop the
> useless parameter.
>
> Signed-off-by: Jeff King <peff@peff.net>

Good catch, thanks.

I was going to suggest squashing this into fa33dd99f0, but I noticed
that mt/parallel-checkout-part-1 is already in next. We don't re-roll
series that are already in next, right?

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [PATCH] parallel-checkout: drop unused checkout state parameter
  2020-10-05 13:13       ` Matheus Tavares Bernardino
@ 2020-10-05 13:45         ` Jeff King
  0 siblings, 0 replies; 154+ messages in thread
From: Jeff King @ 2020-10-05 13:45 UTC (permalink / raw)
  To: Matheus Tavares Bernardino
  Cc: Junio C Hamano, git, Jeff Hostetler, Christian Couder,
	Thomas Gummerer, Elijah Newren

On Mon, Oct 05, 2020 at 10:13:21AM -0300, Matheus Tavares Bernardino wrote:

> I was going to suggest squashing this into fa33dd99f0, but I noticed
> that mt/parallel-checkout-part-1 is already in next. We don't re-roll
> series that are already in next, right?

Correct. That's also why I noticed it; I build my day-to-day Git by
integrating next with my personal topics, and one of my topics turns on
-Wunused-parameter. :)

-Peff

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [PATCH v2 16/19] parallel-checkout: add tests for basic operations
  2020-09-22 22:49   ` [PATCH v2 16/19] parallel-checkout: add tests for basic operations Matheus Tavares
@ 2020-10-20  1:35     ` Jonathan Nieder
  2020-10-20  2:55       ` Taylor Blau
  2020-10-20  3:18       ` Matheus Tavares Bernardino
  0 siblings, 2 replies; 154+ messages in thread
From: Jonathan Nieder @ 2020-10-20  1:35 UTC (permalink / raw)
  To: Matheus Tavares; +Cc: git, jeffhost, chriscool, peff, t.gummerer, newren

Hi,

Matheus Tavares wrote:

> Add tests to populate the working tree during clone and checkout using
> the sequential and parallel modes, to confirm that they produce
> identical results. Also test basic checkout mechanics, such as checking
> for symlinks in the leading directories and the abidance to --force.

Thanks for implementing parallel checkout!  I'm excited about the
feature.  And thanks for including these tests.

[...]
> --- /dev/null
> +++ b/t/lib-parallel-checkout.sh
> @@ -0,0 +1,39 @@
[...]
> > +# Runs `git -c checkout.workers=$1 -c checkout.thresholdForParallelism=$2 ${@:4}`
> +# and checks that the number of workers spawned is equal to $3.
> +git_pc()

nit: what does git_pc mean?  Can this spell it out more verbosely, or
could callers take on more of the burden?  (Perhaps it would make sense
to use a helper that uses test_config to set the relevant configuration,
and then the caller can use plain "git clone"?)

[...]
> +	GIT_TRACE2="$(pwd)/trace" git \
> +		-c checkout.workers=$workers \
> +		-c checkout.thresholdForParallelism=$threshold \
> +		-c advice.detachedHead=0 \
> +		$@ &&

$@ needs to be quoted, or else it will act like $* (and in particular it
won't handle parameters with embedded spaces).
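
To make the distinction concrete, here is a standalone snippet (the
unquoted/quoted helpers are fabricated for illustration, not from the
series): plain $@ re-splits each argument on whitespace, while "$@"
preserves every argument as one word.

```shell
#!/bin/sh
# Hypothetical helpers (not from the series) contrasting $@ and "$@".
unquoted () {
	for arg in $@; do printf '[%s]\n' "$arg"; done
}
quoted () {
	for arg in "$@"; do printf '[%s]\n' "$arg"; done
}

unquoted "a b" c   # re-splits on whitespace: [a] [b] [c]
quoted "a b" c     # preserves arguments:     [a b] [c]
```

Since git_pc() forwards arbitrary git options and paths, the
difference shows up as soon as any parameter contains a space.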

> +
> +	# Check that the expected number of workers has been used. Note that it
> +	# can be different than the requested number in two cases: when the
> +	# quantity of entries to be checked out is less than the number of
> +	# workers; and when the threshold has not been reached.
> +	#
> +	local workers_in_trace=$(grep "child_start\[.\+\] git checkout--helper" trace | wc -l) &&

Do we use grep's \+ operator in other tests?  I thought we preferred to
use the more portable *, but it may be that I'm out of date.

[...]
> +# Verify that both the working tree and the index were created correctly
> +verify_checkout()
> +{
> +	git -C $1 diff-index --quiet HEAD -- &&
> +	git -C $1 diff-index --quiet --cached HEAD -- &&
> +	git -C $1 status --porcelain >$1.status &&
> +	test_must_be_empty $1.status
> +}

Like git_pc, this is not easy to take in at a glance.

"$1" needs to be quoted if we are to handle paths with spaces.

[...]
> --- /dev/null
> +++ b/t/t2080-parallel-checkout-basics.sh
> @@ -0,0 +1,197 @@
> +#!/bin/sh
> +
> +test_description='parallel-checkout basics
> +
> +Ensure that parallel-checkout basically works on clone and checkout, spawning
> +the required number of workers and correctly populating both the index and
> +working tree.
> +'
> +
> +TEST_NO_CREATE_REPO=1
> +. ./test-lib.sh
> +. "$TEST_DIRECTORY/lib-parallel-checkout.sh"
> +
> +# NEEDSWORK: cloning a SHA1 repo with GIT_TEST_DEFAULT_HASH set to "sha256"
> +# currently produces a wrong result (See
> +# https://lore.kernel.org/git/20200911151717.43475-1-matheus.bernardino@usp.br/).
> +# So we skip the "parallel-checkout during clone" tests when this test flag is
> +# set to "sha256". Remove this when the bug is fixed.
> +#
> +if test "$GIT_TEST_DEFAULT_HASH" = "sha256"
> +then
> +	skip_all="t2080 currently don't work with GIT_TEST_DEFAULT_HASH=sha256"
> +	test_done
> +fi
> +
> +R_BASE=$GIT_BUILD_DIR
> +
> +test_expect_success 'sequential clone' '
> +	git_pc 1 0 0 clone --quiet -- $R_BASE r_sequential &&

This fails when I run it while building from a tarball, which is
preventing me from releasing this patch series to Debian experimental.

Can we use an artificial repo instead of git.git?  Using git.git as
test data seems like a recipe for hard-to-reproduce test failures.

Thanks and hope that helps,
Jonathan

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [PATCH v2 16/19] parallel-checkout: add tests for basic operations
  2020-10-20  1:35     ` Jonathan Nieder
@ 2020-10-20  2:55       ` Taylor Blau
  2020-10-20 13:18         ` Matheus Tavares Bernardino
  2020-10-20  3:18       ` Matheus Tavares Bernardino
  1 sibling, 1 reply; 154+ messages in thread
From: Taylor Blau @ 2020-10-20  2:55 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Matheus Tavares, git, jeffhost, chriscool, peff, t.gummerer, newren

On Mon, Oct 19, 2020 at 06:35:58PM -0700, Jonathan Nieder wrote:
> > +
> > +	# Check that the expected number of workers has been used. Note that it
> > +	# can be different than the requested number in two cases: when the
> > +	# quantity of entries to be checked out is less than the number of
> > +	# workers; and when the threshold has not been reached.
> > +	#
> > +	local workers_in_trace=$(grep "child_start\[.\+\] git checkout--helper" trace | wc -l) &&
>
> Do we use grep's \+ operator in other tests?  I thought we preferred to
> use the more portable *, but it may be that I'm out of date.

You're not out-of-date; I looked at this myself a couple of months ago:

  https://lore.kernel.org/git/20200812140352.GC74542@syl.lan/

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [PATCH v2 16/19] parallel-checkout: add tests for basic operations
  2020-10-20  1:35     ` Jonathan Nieder
  2020-10-20  2:55       ` Taylor Blau
@ 2020-10-20  3:18       ` Matheus Tavares Bernardino
  2020-10-20  4:16         ` Jonathan Nieder
  2020-10-20 19:14         ` Junio C Hamano
  1 sibling, 2 replies; 154+ messages in thread
From: Matheus Tavares Bernardino @ 2020-10-20  3:18 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: git, Jeff Hostetler, Christian Couder, Jeff King,
	Thomas Gummerer, Elijah Newren

Hi, Jonathan

On Mon, Oct 19, 2020 at 10:36 PM Jonathan Nieder <jrnieder@gmail.com> wrote:
>
> Hi,
>
> Matheus Tavares wrote:
>
> > Add tests to populate the working tree during clone and checkout using
> > the sequential and parallel modes, to confirm that they produce
> > identical results. Also test basic checkout mechanics, such as checking
> > for symlinks in the leading directories and the abidance to --force.
>
> Thanks for implementing parallel checkout!  I'm excited about the
> feature.  And thanks for including these tests.

Thanks for the comments and feedback :)

> [...]
> > --- /dev/null
> > +++ b/t/lib-parallel-checkout.sh
> > @@ -0,0 +1,39 @@
> [...]
> > > +# Runs `git -c checkout.workers=$1 -c checkout.thresholdForParallelism=$2 ${@:4}`
> > +# and checks that the number of workers spawned is equal to $3.
> > +git_pc()
>
> nit: what does git_pc mean?

The idea was "git w/ parallel-checkout". But I realize it may have
gotten too abbreviated...

> Can this spell it out more verbosely, or
> could callers take on more of the burden?  (Perhaps it would make sense
> to use a helper that uses test_config to set the relevant configuration,
> and then the caller can use plain "git clone"?)

Hmm, it's possible, but I think we might end up with quite a lot of
repetition (to always check that checkout spawned the right number of
workers).

> [...]
> > +     GIT_TRACE2="$(pwd)/trace" git \
> > +             -c checkout.workers=$workers \
> > +             -c checkout.thresholdForParallelism=$threshold \
> > +             -c advice.detachedHead=0 \
> > +             $@ &&
>
> $@ needs to be quoted, or else it will act like $* (and in particular it
> won't handle parameters with embedded spaces).

Nice catch, thanks! I will send a patch for this tomorrow.

> > +
> > +     # Check that the expected number of workers has been used. Note that it
> > +     # can be different than the requested number in two cases: when the
> > +     # quantity of entries to be checked out is less than the number of
> > +     # workers; and when the threshold has not been reached.
> > +     #
> > +     local workers_in_trace=$(grep "child_start\[.\+\] git checkout--helper" trace | wc -l) &&
>
> Do we use grep's \+ operator in other tests?  I thought we preferred to
> use the more portable *, but it may be that I'm out of date.

Oh, I didn't know about the portability issue with \+. This is already
in `next`, but I guess it's worth sending a follow-up patch to fix it,
right? (I see we have a second \+ occurrence in t7508, which could be
changed in the same patch.)

> [...]
> > +# Verify that both the working tree and the index were created correctly
> > +verify_checkout()
> > +{
> > +     git -C $1 diff-index --quiet HEAD -- &&
> > +     git -C $1 diff-index --quiet --cached HEAD -- &&
> > +     git -C $1 status --porcelain >$1.status &&
> > +     test_must_be_empty $1.status
> > +}
>
> Like git_pc, this is not easy to take in at a glance.
>
> "$1" needs to be quoted if we are to handle paths with spaces.

Thanks, again :) Currently, this function doesn't get paths with
spaces, but I agree that it's better to be cautious here.

> [...]
> > --- /dev/null
> > +++ b/t/t2080-parallel-checkout-basics.sh
> > @@ -0,0 +1,197 @@
> > +#!/bin/sh
> > +
> > +test_description='parallel-checkout basics
> > +
> > +Ensure that parallel-checkout basically works on clone and checkout, spawning
> > +the required number of workers and correctly populating both the index and
> > +working tree.
> > +'
> > +
> > +TEST_NO_CREATE_REPO=1
> > +. ./test-lib.sh
> > +. "$TEST_DIRECTORY/lib-parallel-checkout.sh"
> > +
> > +# NEEDSWORK: cloning a SHA1 repo with GIT_TEST_DEFAULT_HASH set to "sha256"
> > +# currently produces a wrong result (See
> > +# https://lore.kernel.org/git/20200911151717.43475-1-matheus.bernardino@usp.br/).
> > +# So we skip the "parallel-checkout during clone" tests when this test flag is
> > +# set to "sha256". Remove this when the bug is fixed.
> > +#
> > +if test "$GIT_TEST_DEFAULT_HASH" = "sha256"
> > +then
> > +     skip_all="t2080 currently don't work with GIT_TEST_DEFAULT_HASH=sha256"
> > +     test_done
> > +fi
> > +
> > +R_BASE=$GIT_BUILD_DIR
> > +
> > +test_expect_success 'sequential clone' '
> > +     git_pc 1 0 0 clone --quiet -- $R_BASE r_sequential &&
>
> This fails when I run it when building from a tarball, which is
> presenting me from releasing this patch series to Debian experimental.

Sorry for the trouble :( It didn't occur to me, while writing the
test, that it could also be run from the tarball.

> Can we use an artificial repo instead of git.git?  Using git.git as
> test data seems like a recipe for hard-to-reproduce test failures.

I think we could maybe drop these tests. There are already some
similar tests below these, which use an artificial repository. The
goal of using git.git in this section was to test parallel-checkout
with a real-world repo, and hopefully catch errors that we might not
see with small artificial ones.  But you have a very valid concern as
well. Hmm, I'm not sure what the best solution is in this case. What
do you think?

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [PATCH v2 16/19] parallel-checkout: add tests for basic operations
  2020-10-20  3:18       ` Matheus Tavares Bernardino
@ 2020-10-20  4:16         ` Jonathan Nieder
  2020-10-20 19:14         ` Junio C Hamano
  1 sibling, 0 replies; 154+ messages in thread
From: Jonathan Nieder @ 2020-10-20  4:16 UTC (permalink / raw)
  To: Matheus Tavares Bernardino
  Cc: git, Jeff Hostetler, Christian Couder, Jeff King,
	Thomas Gummerer, Elijah Newren

Matheus Tavares Bernardino wrote:
> On Mon, Oct 19, 2020 at 10:36 PM Jonathan Nieder <jrnieder@gmail.com> wrote:

>> Can we use an artificial repo instead of git.git?  Using git.git as
>> test data seems like a recipe for hard-to-reproduce test failures.
>
> I think we could maybe drop these tests. There are already some
> similar tests below these, which use an artificial repository. The
> goal of using git.git in this section was to test parallel-checkout
> with a real-world repo, and hopefully catch errors that we might not
> see with small artificial ones.  But you have a very valid concern, as
> well. Hmm, I'm not sure what is the best solution to this case. What
> do you think?

I see.  I suppose my preference would be to have a real-world example
in t/perf/ (see t/perf/README for how it allows an arbitrary repo to
be passed in) instead of in the regression tests.  In the regression
testsuite I'd focus more on particular behaviors I want to test (e.g.,
a file being replaced by a directory, that kind of thing).

Behaviors exercised by git.git are in some sense the *least* important
thing to test here, since developers in the Git project know to
advocate for those and exercise them day-to-day.  Where the testsuite
shines is in being able to advocate for use cases that are exercised
by other populations --- a testsuite failure can be a reminder to not
forget about the features other people need that are not part of our
own daily lives.

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [PATCH v2 16/19] parallel-checkout: add tests for basic operations
  2020-10-20  2:55       ` Taylor Blau
@ 2020-10-20 13:18         ` Matheus Tavares Bernardino
  2020-10-20 19:09           ` Junio C Hamano
  0 siblings, 1 reply; 154+ messages in thread
From: Matheus Tavares Bernardino @ 2020-10-20 13:18 UTC (permalink / raw)
  To: Taylor Blau
  Cc: Jonathan Nieder, git, Jeff Hostetler, Christian Couder,
	Jeff King, Thomas Gummerer, Elijah Newren

On Mon, Oct 19, 2020 at 11:55 PM Taylor Blau <me@ttaylorr.com> wrote:
>
> On Mon, Oct 19, 2020 at 06:35:58PM -0700, Jonathan Nieder wrote:
> > > +
> > > +   # Check that the expected number of workers has been used. Note that it
> > > +   # can be different than the requested number in two cases: when the
> > > +   # quantity of entries to be checked out is less than the number of
> > > +   # workers; and when the threshold has not been reached.
> > > +   #
> > > +   local workers_in_trace=$(grep "child_start\[.\+\] git checkout--helper" trace | wc -l) &&
> >
> > Do we use grep's \+ operator in other tests?  I thought we preferred to
> > use the more portable *, but it may be that I'm out of date.
>
> You're not out-of-date; I looked at this myself a couple of months ago:
>
>   https://lore.kernel.org/git/20200812140352.GC74542@syl.lan/

Thanks for the pointer; I'll replace .\+ with ..*, then.

I noticed we also have some uses of + and ? in tests, with `grep -E`
(or egrep). Are we OK with ERE or did these maybe just slip in by
accident?

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [PATCH v2 16/19] parallel-checkout: add tests for basic operations
  2020-10-20 13:18         ` Matheus Tavares Bernardino
@ 2020-10-20 19:09           ` Junio C Hamano
  0 siblings, 0 replies; 154+ messages in thread
From: Junio C Hamano @ 2020-10-20 19:09 UTC (permalink / raw)
  To: Matheus Tavares Bernardino
  Cc: Taylor Blau, Jonathan Nieder, git, Jeff Hostetler,
	Christian Couder, Jeff King, Thomas Gummerer, Elijah Newren

Matheus Tavares Bernardino <matheus.bernardino@usp.br> writes:

> I noticed we also have some uses of + and ? in tests, with `grep -E`
> (or egrep). Are we OK with ERE or did these maybe just slip in by
> accident?

We are OK with 'grep -E' and 'egrep' and write '+' and '?' as valid
ERE elements.  What we are not OK with is invoking ERE elements in
an expression that is supposed to be a BRE by prefixing them with a
backslash, e.g. '\+'.  Perhaps it is a GNU extension?

We also need to remove the '\+' used with sed in
t/perf/bisect_regression.  What is sad is that this trick and
"sed -E" are both GNUisms, and there is no portable way to use ERE
with sed X-<.

But we could resort to Perl in truly tricky cases ;-).
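
To put the portable spellings side by side, a standalone snippet (the
trace line is fabricated for illustration): in BRE, "one or more" is
written by repeating the atom (XX*, here ..*), while grep -E accepts
the ERE '+' directly; only the backslashed '\+' in plain grep relies
on the GNU extension.

```shell
#!/bin/sh
# Fabricated trace line resembling the one matched in the tests.
printf 'child_start[42] git checkout--helper\n' >trace

# Portable BRE: repeat the atom instead of using the GNU '\+'.
grep 'child_start\[..*\] git checkout--helper' trace

# Portable ERE: '+' is a standard ERE operator under grep -E.
grep -E 'child_start\[.+\] git checkout--helper' trace

rm -f trace
```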




^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [PATCH v2 16/19] parallel-checkout: add tests for basic operations
  2020-10-20  3:18       ` Matheus Tavares Bernardino
  2020-10-20  4:16         ` Jonathan Nieder
@ 2020-10-20 19:14         ` Junio C Hamano
  1 sibling, 0 replies; 154+ messages in thread
From: Junio C Hamano @ 2020-10-20 19:14 UTC (permalink / raw)
  To: Matheus Tavares Bernardino
  Cc: Jonathan Nieder, git, Jeff Hostetler, Christian Couder,
	Jeff King, Thomas Gummerer, Elijah Newren

Matheus Tavares Bernardino <matheus.bernardino@usp.br> writes:

> Oh, I didn't know about the portability issue with \+. This is already
> in `next`, but I guess it's worth sending a follow-up patch to fix it,
> right? (I see we have a second \+ occurrence in t7508, which could be
> changed in the same patch.)

Note that soon, typically a week, after a release, the tip of the
next branch is rewound and all the topics that did not graduate to
master have a chance to get a clean start.  This may be a good use
case for that chance.

^ permalink raw reply	[flat|nested] 154+ messages in thread

* [PATCH v3 00/19] Parallel Checkout (part I)
  2020-09-22 22:49 ` [PATCH v2 00/19] Parallel Checkout (part I) Matheus Tavares
                     ` (18 preceding siblings ...)
  2020-09-22 22:49   ` [PATCH v2 19/19] ci: run test round with parallel-checkout enabled Matheus Tavares
@ 2020-10-29  2:14   ` Matheus Tavares
  2020-10-29  2:14     ` [PATCH v3 01/19] convert: make convert_attrs() and convert structs public Matheus Tavares
                       ` (21 more replies)
  19 siblings, 22 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-10-29  2:14 UTC (permalink / raw)
  To: git; +Cc: gitster, git, chriscool, peff, newren, jrnieder, martin.agren

There were some semantic conflicts between this series and
jk/checkout-index-errors, so I rebased my series on top of that.

Also, I'd ask reviewers to please confirm that my descriptor
redirection in git_pc() (patch 17) is correct, as I'm not very
familiar with the test suite's descriptors.

Main changes since v2:

Patch 10:
  - Squashed Peff's patch removing a useless function parameter.

Patch 11:
  - Valgrind used to complain about send_one_item() passing
    uninitialized bytes to a syscall (write(2)). The bytes in question
    come from the unused positions in oid->hash[] when the hash is SHA-1.
    Since the workers won't use these bytes, there is no real harm. But
    the warning could cause confusion and even get in the way of
    detecting real errors, so I replaced the oidcpy() call with
    hashcpy().

Patch 16:
  - Replaced use of the non-portable '\+' in grep with '..*' (in
    t/lib-parallel-checkout.sh).

  - Properly quoted function parameters in t/lib-parallel-checkout.sh,
    as Jonathan pointed out.

  - In t2080, dropped tests that used git.git as test data, and added
    two more tests to check clone with parallel-checkout using the
    artificial repo already created for other tests.

  - No longer skip clone tests when GIT_TEST_DEFAULT_HASH is sha256. A
    bug in clone used to make the tests fail with this setting, but it
    was fixed in 47ac970309 ("builtin/clone: avoid failure with
    GIT_DEFAULT_HASH", 2020-09-20).

Patch 17:
  - The test t2081-parallel-checkout-collisions.sh had a bug in which
    the filter options were wrongly passed to git: they were
    conditionally defined through a shell variable whose quoting was
    wrong. This should have made the test fail but, in fact, another
    bug (using the arithmetic operator `-eq` on strings) was preventing
    the problematic section from ever running. Both bugs are now fixed,
    and the test script was also simplified by making use of
    lib-parallel-checkout.sh and eliminating the helper function.

  - Use "$TEST_ROOT/logger_script" instead of "../logger_script", to be
    on the safe side.
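
For reference, the `-eq` mixup mentioned above can be reproduced in
isolation (standalone snippet, not taken from the series): test(1)'s
-eq is an arithmetic comparison, so with non-numeric operands it
errors out with a nonzero status instead of comparing, and a section
guarded by such a condition never runs; string comparison needs '='.

```shell
#!/bin/sh
# test's -eq is numeric only; on strings it fails with an error
# (status 2), so the "then" branch below is never taken.
if test "collisions" -eq "collisions" 2>/dev/null
then
	echo "reached (never happens)"
else
	echo "arithmetic -eq rejects strings"
fi

# '=' performs the intended string comparison.
test "collisions" = "collisions" && echo "string = compares correctly"
```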


Jeff Hostetler (4):
  convert: make convert_attrs() and convert structs public
  convert: add [async_]convert_to_working_tree_ca() variants
  convert: add get_stream_filter_ca() variant
  convert: add conv_attrs classification

Matheus Tavares (15):
  entry: extract a header file for entry.c functions
  entry: make fstat_output() and read_blob_entry() public
  entry: extract cache_entry update from write_entry()
  entry: move conv_attrs lookup up to checkout_entry()
  entry: add checkout_entry_ca() which takes preloaded conv_attrs
  unpack-trees: add basic support for parallel checkout
  parallel-checkout: make it truly parallel
  parallel-checkout: support progress displaying
  make_transient_cache_entry(): optionally alloc from mem_pool
  builtin/checkout.c: complete parallel checkout support
  checkout-index: add parallel checkout support
  parallel-checkout: add tests for basic operations
  parallel-checkout: add tests related to clone collisions
  parallel-checkout: add tests related to .gitattributes
  ci: run test round with parallel-checkout enabled

 .gitignore                              |   1 +
 Documentation/config/checkout.txt       |  21 +
 Makefile                                |   2 +
 apply.c                                 |   1 +
 builtin.h                               |   1 +
 builtin/checkout--helper.c              | 142 ++++++
 builtin/checkout-index.c                |  22 +-
 builtin/checkout.c                      |  21 +-
 builtin/difftool.c                      |   3 +-
 cache.h                                 |  34 +-
 ci/run-build-and-tests.sh               |   1 +
 convert.c                               | 121 +++--
 convert.h                               |  68 +++
 entry.c                                 | 102 ++--
 entry.h                                 |  54 ++
 git.c                                   |   2 +
 parallel-checkout.c                     | 638 ++++++++++++++++++++++++
 parallel-checkout.h                     | 103 ++++
 read-cache.c                            |  12 +-
 t/README                                |   4 +
 t/lib-encoding.sh                       |  25 +
 t/lib-parallel-checkout.sh              |  46 ++
 t/t0028-working-tree-encoding.sh        |  25 +-
 t/t2080-parallel-checkout-basics.sh     | 170 +++++++
 t/t2081-parallel-checkout-collisions.sh |  98 ++++
 t/t2082-parallel-checkout-attributes.sh | 174 +++++++
 unpack-trees.c                          |  22 +-
 27 files changed, 1758 insertions(+), 155 deletions(-)
 create mode 100644 builtin/checkout--helper.c
 create mode 100644 entry.h
 create mode 100644 parallel-checkout.c
 create mode 100644 parallel-checkout.h
 create mode 100644 t/lib-encoding.sh
 create mode 100644 t/lib-parallel-checkout.sh
 create mode 100755 t/t2080-parallel-checkout-basics.sh
 create mode 100755 t/t2081-parallel-checkout-collisions.sh
 create mode 100755 t/t2082-parallel-checkout-attributes.sh

Range-diff against v2:
 1:  b9d2a329d3 =  1:  dfc3e0fd62 convert: make convert_attrs() and convert structs public
 2:  313c3bcbeb =  2:  c5fbd1e16d convert: add [async_]convert_to_working_tree_ca() variants
 3:  29bbdb78e9 =  3:  c77b16f694 convert: add get_stream_filter_ca() variant
 4:  a1cf5df961 =  4:  18c3f4247e convert: add conv_attrs classification
 5:  25b311745a =  5:  2caa2c4345 entry: extract a header file for entry.c functions
 6:  dbee09e936 =  6:  bfa52df9e2 entry: make fstat_output() and read_blob_entry() public
 7:  b61b5c44f0 =  7:  91ef17f533 entry: extract cache_entry update from write_entry()
 8:  667ad0dea7 =  8:  81e03baab1 entry: move conv_attrs lookup up to checkout_entry()
 9:  4ddb34209e =  9:  e1b886f823 entry: add checkout_entry_ca() which takes preloaded conv_attrs
10:  af0d790973 ! 10:  2bdc13664e unpack-trees: add basic support for parallel checkout
    @@ parallel-checkout.c (new)
     +}
     +
     +static int write_pc_item_to_fd(struct parallel_checkout_item *pc_item, int fd,
    -+			       const char *path, struct checkout *state)
    ++			       const char *path)
     +{
     +	int ret;
     +	struct stream_filter *filter;
    @@ parallel-checkout.c (new)
     +		goto out;
     +	}
     +
    -+	if (write_pc_item_to_fd(pc_item, fd, path.buf, state)) {
    ++	if (write_pc_item_to_fd(pc_item, fd, path.buf)) {
     +		/* Error was already reported. */
     +		pc_item->status = PC_ITEM_FAILED;
     +		goto out;
11:  991169488b ! 11:  096e543fd2 parallel-checkout: make it truly parallel
    @@ Documentation/config/checkout.txt: will checkout the '<something>' branch on ano
     +	The number of parallel workers to use when updating the working tree.
     +	The default is one, i.e. sequential execution. If set to a value less
     +	than one, Git will use as many workers as the number of logical cores
    -+	available. This setting and checkout.thresholdForParallelism affect all
    -+	commands that perform checkout. E.g. checkout, switch, clone, reset,
    -+	sparse-checkout, read-tree, etc.
    ++	available. This setting and `checkout.thresholdForParallelism` affect
    ++	all commands that perform checkout. E.g. checkout, clone, reset,
    ++	sparse-checkout, etc.
     ++
     +Note: parallel checkout usually delivers better performance for repositories
     +located on SSDs or over NFS. For repositories on spinning disks and/or machines
    @@ parallel-checkout.c: static void write_pc_item(struct parallel_checkout_item *pc
     +
     +	fixed_portion = (struct pc_item_fixed_portion *)data;
     +	fixed_portion->id = pc_item->id;
    -+	oidcpy(&fixed_portion->oid, &pc_item->ce->oid);
     +	fixed_portion->ce_mode = pc_item->ce->ce_mode;
     +	fixed_portion->crlf_action = pc_item->ca.crlf_action;
     +	fixed_portion->ident = pc_item->ca.ident;
     +	fixed_portion->name_len = name_len;
     +	fixed_portion->working_tree_encoding_len = working_tree_encoding_len;
    ++	/*
    ++	 * We use hashcpy() instead of oidcpy() because the hash[] positions
    ++	 * after `the_hash_algo->rawsz` might not be initialized. And Valgrind
    ++	 * would complain about passing uninitialized bytes to a syscall
    ++	 * (write(2)). There is no real harm in this case, but the warning could
    ++	 * hinder the detection of actual errors.
    ++	 */
    ++	hashcpy(fixed_portion->oid.hash, pc_item->ce->oid.hash);
     +
     +	variant = data + sizeof(*fixed_portion);
     +	if (working_tree_encoding_len) {
12:  7ceadf2427 = 12:  9cfeb4821c parallel-checkout: support progress displaying
13:  f13b4c17f4 = 13:  da99b671e6 make_transient_cache_entry(): optionally alloc from mem_pool
14:  d7885a1130 = 14:  d3d561754a builtin/checkout.c: complete parallel checkout support
15:  1cf9b807f7 ! 15:  ee34c6e149 checkout-index: add parallel checkout support
    @@ builtin/checkout-index.c
      #define CHECKOUT_ALL 4
      static int nul_term_line;
     @@ builtin/checkout-index.c: int cmd_checkout_index(int argc, const char **argv, const char *prefix)
    - 	int prefix_length;
      	int force = 0, quiet = 0, not_new = 0;
      	int index_opt = 0;
    + 	int err = 0;
     +	int pc_workers, pc_threshold;
      	struct option builtin_checkout_index_options[] = {
      		OPT_BOOL('a', "all", &all,
    @@ builtin/checkout-index.c: int cmd_checkout_index(int argc, const char **argv, co
      	for (i = 0; i < argc; i++) {
      		const char *arg = argv[i];
     @@ builtin/checkout-index.c: int cmd_checkout_index(int argc, const char **argv, const char *prefix)
    + 		strbuf_release(&buf);
    + 	}
    + 
    +-	if (err)
    +-		return 1;
    +-
      	if (all)
      		checkout_all(prefix, prefix_length);
      
     +	if (pc_workers > 1) {
    -+		/* Errors were already reported */
    -+		run_parallel_checkout(&state, pc_workers, pc_threshold,
    -+				      NULL, NULL);
    ++		err |= run_parallel_checkout(&state, pc_workers, pc_threshold,
    ++					     NULL, NULL);
     +	}
    ++
    ++	if (err)
    ++		return 1;
     +
      	if (is_lock_file_locked(&lock_file) &&
      	    write_locked_index(&the_index, &lock_file, COMMIT_LOCK))
16:  64b41d537e ! 16:  05299a3cc0 parallel-checkout: add tests for basic operations
    @@ Commit message
         for symlinks in the leading directories and the abidance to --force.
     
         Note: some helper functions are added to a common lib file which is only
    -    included by t2080 for now. But it will also be used by another
    -    parallel-checkout test in a following patch.
    +    included by t2080 for now. But it will also be used by other
    +    parallel-checkout tests in the following patches.
     
         Original-patch-by: Jeff Hostetler <jeffhost@microsoft.com>
         Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
    @@ t/lib-parallel-checkout.sh (new)
     +
     +# Runs `git -c checkout.workers=$1 -c checkout.thesholdForParallelism=$2 ${@:4}`
     +# and checks that the number of workers spawned is equal to $3.
    ++#
     +git_pc()
     +{
     +	if test $# -lt 4
     +	then
     +		BUG "too few arguments to git_pc()"
    -+	fi
    ++	fi &&
     +
     +	workers=$1 threshold=$2 expected_workers=$3 &&
    -+	shift && shift && shift &&
    ++	shift 3 &&
     +
     +	rm -f trace &&
     +	GIT_TRACE2="$(pwd)/trace" git \
     +		-c checkout.workers=$workers \
     +		-c checkout.thresholdForParallelism=$threshold \
     +		-c advice.detachedHead=0 \
    -+		$@ &&
    ++		"$@" &&
     +
     +	# Check that the expected number of workers has been used. Note that it
    -+	# can be different than the requested number in two cases: when the
    -+	# quantity of entries to be checked out is less than the number of
    -+	# workers; and when the threshold has not been reached.
    ++	# can be different from the requested number in two cases: when the
    ++	# threshold is not reached; and when there are not enough
    ++	# parallel-eligible entries for all workers.
     +	#
    -+	local workers_in_trace=$(grep "child_start\[.\+\] git checkout--helper" trace | wc -l) &&
    ++	local workers_in_trace=$(grep "child_start\[..*\] git checkout--helper" trace | wc -l) &&
     +	test $workers_in_trace -eq $expected_workers &&
     +	rm -f trace
     +}
    @@ t/lib-parallel-checkout.sh (new)
     +# Verify that both the working tree and the index were created correctly
     +verify_checkout()
     +{
    -+	git -C $1 diff-index --quiet HEAD -- &&
    -+	git -C $1 diff-index --quiet --cached HEAD -- &&
    -+	git -C $1 status --porcelain >$1.status &&
    -+	test_must_be_empty $1.status
    ++	git -C "$1" diff-index --quiet HEAD -- &&
    ++	git -C "$1" diff-index --quiet --cached HEAD -- &&
    ++	git -C "$1" status --porcelain >"$1".status &&
    ++	test_must_be_empty "$1".status
     +}
     
      ## t/t2080-parallel-checkout-basics.sh (new) ##
    @@ t/t2080-parallel-checkout-basics.sh (new)
     +. ./test-lib.sh
     +. "$TEST_DIRECTORY/lib-parallel-checkout.sh"
     +
    -+# NEEDSWORK: cloning a SHA1 repo with GIT_TEST_DEFAULT_HASH set to "sha256"
    -+# currently produces a wrong result (See
    -+# https://lore.kernel.org/git/20200911151717.43475-1-matheus.bernardino@usp.br/).
    -+# So we skip the "parallel-checkout during clone" tests when this test flag is
    -+# set to "sha256". Remove this when the bug is fixed.
    -+#
    -+if test "$GIT_TEST_DEFAULT_HASH" = "sha256"
    -+then
    -+	skip_all="t2080 currently don't work with GIT_TEST_DEFAULT_HASH=sha256"
    -+	test_done
    -+fi
    -+
    -+R_BASE=$GIT_BUILD_DIR
    -+
    -+test_expect_success 'sequential clone' '
    -+	git_pc 1 0 0 clone --quiet -- $R_BASE r_sequential &&
    -+	verify_checkout r_sequential
    -+'
    -+
    -+test_expect_success 'parallel clone' '
    -+	git_pc 2 0 2 clone --quiet -- $R_BASE r_parallel &&
    -+	verify_checkout r_parallel
    -+'
    -+
    -+test_expect_success 'fallback to sequential clone (threshold)' '
    -+	git -C $R_BASE ls-files >files &&
    -+	nr_files=$(wc -l <files) &&
    -+	threshold=$(($nr_files + 1)) &&
    -+
    -+	git_pc 2 $threshold 0 clone --quiet -- $R_BASE r_sequential_fallback &&
    -+	verify_checkout r_sequential_fallback
    -+'
    -+
    -+# Just to be paranoid, actually compare the contents of the worktrees directly.
    -+test_expect_success 'compare working trees from clones' '
    -+	rm -rf r_sequential/.git &&
    -+	rm -rf r_parallel/.git &&
    -+	rm -rf r_sequential_fallback/.git &&
    -+	diff -qr r_sequential r_parallel &&
    -+	diff -qr r_sequential r_sequential_fallback
    -+'
    -+
     +# Test parallel-checkout with different operations (creation, deletion,
     +# modification) and entry types. A branch switch from B1 to B2 will contain:
     +#
    @@ t/t2080-parallel-checkout-basics.sh (new)
     +	verify_checkout various_sequential_fallback
     +'
     +
    -+test_expect_success SYMLINKS 'compare working trees from checkouts' '
    -+	rm -rf various_sequential/.git &&
    -+	rm -rf various_parallel/.git &&
    -+	rm -rf various_sequential_fallback/.git &&
    -+	diff -qr various_sequential various_parallel &&
    -+	diff -qr various_sequential various_sequential_fallback
    ++test_expect_success SYMLINKS 'parallel checkout on clone' '
    ++	git -C various checkout --recurse-submodules B2 &&
    ++	git_pc 2 0 2 clone --recurse-submodules various various_parallel_clone  &&
    ++	verify_checkout various_parallel_clone
    ++'
    ++
    ++test_expect_success SYMLINKS 'fallback to sequential checkout on clone (threshold)' '
    ++	git -C various checkout --recurse-submodules B2 &&
    ++	git_pc 2 100 0 clone --recurse-submodules various various_sequential_fallback_clone &&
    ++	verify_checkout various_sequential_fallback_clone
    ++'
    ++
    ++# Just to be paranoid, actually compare the working trees' contents directly.
    ++test_expect_success SYMLINKS 'compare the working trees' '
    ++	rm -rf various_*/.git &&
    ++	rm -rf various_*/d/.git &&
    ++
    ++	diff -r various_sequential various_parallel &&
    ++	diff -r various_sequential various_sequential_fallback &&
    ++	diff -r various_sequential various_parallel_clone &&
    ++	diff -r various_sequential various_sequential_fallback_clone
     +'
     +
     +test_cmp_str()
17:  70708d3e31 ! 17:  3d140dcacb parallel-checkout: add tests related to clone collisions
    @@ Commit message
         Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
         Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
     
    + ## t/lib-parallel-checkout.sh ##
    +@@ t/lib-parallel-checkout.sh: git_pc()
    + 		-c checkout.workers=$workers \
    + 		-c checkout.thresholdForParallelism=$threshold \
    + 		-c advice.detachedHead=0 \
    +-		"$@" &&
    ++		"$@" 2>&8 &&
    + 
    + 	# Check that the expected number of workers has been used. Note that it
    + 	# can be different from the requested number in two cases: when the
    +@@ t/lib-parallel-checkout.sh: git_pc()
    + 	local workers_in_trace=$(grep "child_start\[..*\] git checkout--helper" trace | wc -l) &&
    + 	test $workers_in_trace -eq $expected_workers &&
    + 	rm -f trace
    +-}
    ++} 8>&2 2>&4
    + 
    + # Verify that both the working tree and the index were created correctly
    + verify_checkout()
    +
      ## t/t2081-parallel-checkout-collisions.sh (new) ##
     @@
     +#!/bin/sh
     +
    -+test_description='parallel-checkout collisions'
    ++test_description='parallel-checkout collisions
    ++
    ++When there are path collisions during a clone, Git should report a warning
    ++listing all of the colliding entries. The sequential code detects a collision
    ++by calling lstat() before trying to open(O_CREAT) the file. Then, to find the
    ++colliding pair of an item k, it searches cache_entry[0, k-1].
    ++
    ++This is not sufficient in parallel checkout since:
    ++
    ++- A colliding file may be created between the lstat() and open() calls;
    ++- A colliding entry might appear in the second half of the cache_entry array.
    ++
    ++The tests in this file make sure that the collision detection code is extended
    ++for parallel checkout.
    ++'
     +
     +. ./test-lib.sh
    ++. "$TEST_DIRECTORY/lib-parallel-checkout.sh"
     +
    -+# When there are pathname collisions during a clone, Git should report a warning
    -+# listing all of the colliding entries. The sequential code detects a collision
    -+# by calling lstat() before trying to open(O_CREAT) the file. Then, to find the
    -+# colliding pair of an item k, it searches cache_entry[0, k-1].
    -+#
    -+# This is not sufficient in parallel-checkout mode since colliding files may be
    -+# created in a racy order. The tests in this file make sure the collision
    -+# detection code is extended for parallel-checkout. This is done in two parts:
    -+#
    -+# - First, two parallel workers create four colliding files racily.
    -+# - Then this exercise is repeated but forcing the colliding pair to appear in
    -+#   the second half of the cache_entry's array.
    -+#
    -+# The second item uses the fact that files with clean/smudge filters are not
    -+# parallel-eligible; and that they are processed sequentially *before* any
    -+# worker is spawned. We set a filter attribute to the last entry in the
    -+# cache_entry[] array, making it non-eligible, so that it is populated first.
    -+# This way, we can test if the collision detection code is correctly looking
    -+# for collision pairs in the second half of the array.
    ++TEST_ROOT="$PWD"
     +
     +test_expect_success CASE_INSENSITIVE_FS 'setup' '
    -+	file_hex=$(git hash-object -w --stdin </dev/null) &&
    -+	file_oct=$(echo $file_hex | hex2oct) &&
    ++	file_x_hex=$(git hash-object -w --stdin </dev/null) &&
    ++	file_x_oct=$(echo $file_x_hex | hex2oct) &&
     +
     +	attr_hex=$(echo "file_x filter=logger" | git hash-object -w --stdin) &&
     +	attr_oct=$(echo $attr_hex | hex2oct) &&
     +
    -+	printf "100644 FILE_X\0${file_oct}" >tree &&
    -+	printf "100644 FILE_x\0${file_oct}" >>tree &&
    -+	printf "100644 file_X\0${file_oct}" >>tree &&
    -+	printf "100644 file_x\0${file_oct}" >>tree &&
    ++	printf "100644 FILE_X\0${file_x_oct}" >tree &&
    ++	printf "100644 FILE_x\0${file_x_oct}" >>tree &&
    ++	printf "100644 file_X\0${file_x_oct}" >>tree &&
    ++	printf "100644 file_x\0${file_x_oct}" >>tree &&
     +	printf "100644 .gitattributes\0${attr_oct}" >>tree &&
     +
     +	tree_hex=$(git hash-object -w -t tree --stdin <tree) &&
     +	commit_hex=$(git commit-tree -m collisions $tree_hex) &&
     +	git update-ref refs/heads/collisions $commit_hex &&
     +
    -+	write_script logger_script <<-\EOF
    ++	write_script "$TEST_ROOT"/logger_script <<-\EOF
     +	echo "$@" >>filter.log
     +	EOF
     +'
     +
    -+clone_and_check_collision()
    -+{
    -+	id=$1 workers=$2 threshold=$3 expected_workers=$4 filter=$5 &&
    -+
    -+	filter_opts=
    -+	if test "$filter" -eq "use_filter"
    -+	then
    -+		# We use `core.ignoreCase=0` so that only `file_x`
    -+		# matches the pattern in .gitattributes.
    -+		#
    -+		filter_opts='-c filter.logger.smudge="../logger_script %f" -c core.ignoreCase=0'
    -+	fi &&
    -+
    -+	test_path_is_missing $id.trace &&
    -+	GIT_TRACE2="$(pwd)/$id.trace" git \
    -+		-c checkout.workers=$workers \
    -+		-c checkout.thresholdForParallelism=$threshold \
    -+		$filter_opts clone --branch=collisions -- . r_$id 2>$id.warning &&
    -+
    -+	# Check that checkout spawned the right number of workers
    -+	workers_in_trace=$(grep "child_start\[.\] git checkout--helper" $id.trace | wc -l) &&
    -+	test $workers_in_trace -eq $expected_workers &&
    -+
    -+	if test $filter -eq "use_filter"
    -+	then
    -+		#  Make sure only 'file_x' was filtered
    -+		test_path_is_file r_$id/filter.log &&
    ++for mode in parallel sequential-fallback
    ++do
    ++
    ++	case $mode in
    ++	parallel)		workers=2 threshold=0 expected_workers=2 ;;
    ++	sequential-fallback)	workers=2 threshold=100 expected_workers=0 ;;
    ++	esac
    ++
    ++	test_expect_success CASE_INSENSITIVE_FS "collision detection on $mode clone" '
    ++		git_pc $workers $threshold $expected_workers \
    ++			clone --branch=collisions . $mode 2>$mode.stderr &&
    ++
    ++		grep FILE_X $mode.stderr &&
    ++		grep FILE_x $mode.stderr &&
    ++		grep file_X $mode.stderr &&
    ++		grep file_x $mode.stderr &&
    ++		test_i18ngrep "the following paths have collided" $mode.stderr
    ++	'
    ++
    ++	# The following test ensures that the collision detection code is
    ++	# correctly looking for colliding peers in the second half of the
    ++	# cache_entry array. This is done by defining a smudge command for the
    ++	# *last* array entry, which makes it non-eligible for parallel-checkout.
    ++	# The last entry is then checked out *before* any worker is spawned,
    ++	# making it succeed and the workers' entries collide.
    ++	#
    ++	# Note: this test don't work on Windows because, on this system,
    ++	# collision detection uses strcmp() when core.ignoreCase=false. And we
    ++	# have to set core.ignoreCase=false so that only 'file_x' matches the
    ++	# pattern of the filter attribute. But it works on OSX, where collision
    ++	# detection uses inode.
    ++	#
    ++	test_expect_success CASE_INSENSITIVE_FS,!MINGW,!CYGWIN "collision detection on $mode clone w/ filter" '
    ++		git_pc $workers $threshold $expected_workers \
    ++			-c core.ignoreCase=false \
    ++			-c filter.logger.smudge="\"$TEST_ROOT/logger_script\" %f" \
    ++			clone --branch=collisions . ${mode}_with_filter \
    ++			2>${mode}_with_filter.stderr &&
    ++
    ++		grep FILE_X ${mode}_with_filter.stderr &&
    ++		grep FILE_x ${mode}_with_filter.stderr &&
    ++		grep file_X ${mode}_with_filter.stderr &&
    ++		grep file_x ${mode}_with_filter.stderr &&
    ++		test_i18ngrep "the following paths have collided" ${mode}_with_filter.stderr &&
    ++
    ++		# Make sure only "file_x" was filtered
    ++		test_path_is_file ${mode}_with_filter/filter.log &&
     +		echo file_x >expected.filter.log &&
    -+		test_cmp r_$id/filter.log expected.filter.log
    -+	else
    -+		test_path_is_missing r_$id/filter.log
    -+	fi &&
    -+
    -+	grep FILE_X $id.warning &&
    -+	grep FILE_x $id.warning &&
    -+	grep file_X $id.warning &&
    -+	grep file_x $id.warning &&
    -+	test_i18ngrep "the following paths have collided" $id.warning
    -+}
    -+
    -+test_expect_success CASE_INSENSITIVE_FS 'collision detection on parallel clone' '
    -+	clone_and_check_collision parallel 2 0 2
    -+'
    -+
    -+test_expect_success CASE_INSENSITIVE_FS 'collision detection on fallback to sequential clone' '
    -+	git ls-tree --name-only -r collisions >files &&
    -+	nr_files=$(wc -l <files) &&
    -+	threshold=$(($nr_files + 1)) &&
    -+	clone_and_check_collision sequential 2 $threshold 0
    -+'
    -+
    -+# The next two tests don't work on Windows because, on this system, collision
    -+# detection uses strcmp() (when core.ignoreCase=0) to find the colliding pair.
    -+# But they work on OSX, where collision detection uses inode.
    -+
    -+test_expect_success CASE_INSENSITIVE_FS,!MINGW,!CYGWIN 'collision detection on parallel clone w/ filter' '
    -+	clone_and_check_collision parallel-with-filter 2 0 2 use_filter
    -+'
    -+
    -+test_expect_success CASE_INSENSITIVE_FS,!MINGW,!CYGWIN 'collision detection on fallback to sequential clone w/ filter' '
    -+	git ls-tree --name-only -r collisions >files &&
    -+	nr_files=$(wc -l <files) &&
    -+	threshold=$(($nr_files + 1)) &&
    -+	clone_and_check_collision sequential-with-filter 2 $threshold 0 use_filter
    -+'
    ++		test_cmp ${mode}_with_filter/filter.log expected.filter.log
    ++	'
    ++done
     +
     +test_done
18:  ece38f0483 = 18:  b26f676cae parallel-checkout: add tests related to .gitattributes
19:  b4cb5905d2 ! 19:  641c61f9b6 ci: run test round with parallel-checkout enabled
    @@ t/lib-parallel-checkout.sh
     +
      # Runs `git -c checkout.workers=$1 -c checkout.thesholdForParallelism=$2 ${@:4}`
      # and checks that the number of workers spawned is equal to $3.
    - git_pc()
    -
    - ## t/t2081-parallel-checkout-collisions.sh ##
    -@@
    - test_description='parallel-checkout collisions'
    - 
    - . ./test-lib.sh
    -+. "$TEST_DIRECTORY/lib-parallel-checkout.sh"
    - 
    - # When there are pathname collisions during a clone, Git should report a warning
    - # listing all of the colliding entries. The sequential code detects a collision
    + #
-- 
2.28.0


^ permalink raw reply	[flat|nested] 154+ messages in thread

* [PATCH v3 01/19] convert: make convert_attrs() and convert structs public
  2020-10-29  2:14   ` [PATCH v3 00/19] Parallel Checkout (part I) Matheus Tavares
@ 2020-10-29  2:14     ` Matheus Tavares
  2020-10-29 23:40       ` Junio C Hamano
  2020-10-29  2:14     ` [PATCH v3 02/19] convert: add [async_]convert_to_working_tree_ca() variants Matheus Tavares
                       ` (20 subsequent siblings)
  21 siblings, 1 reply; 154+ messages in thread
From: Matheus Tavares @ 2020-10-29  2:14 UTC (permalink / raw)
  To: git
  Cc: gitster, git, chriscool, peff, newren, jrnieder, martin.agren,
	Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Move convert_attrs() declaration from convert.c to convert.h, together
with the conv_attrs struct and the crlf_action enum. This function and
the data structures will be used outside convert.c in the upcoming
parallel checkout implementation.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
[matheus.bernardino: squash and reword msg]
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 convert.c | 23 ++---------------------
 convert.h | 24 ++++++++++++++++++++++++
 2 files changed, 26 insertions(+), 21 deletions(-)

diff --git a/convert.c b/convert.c
index ee360c2f07..eb14714979 100644
--- a/convert.c
+++ b/convert.c
@@ -24,17 +24,6 @@
 #define CONVERT_STAT_BITS_TXT_CRLF  0x2
 #define CONVERT_STAT_BITS_BIN       0x4
 
-enum crlf_action {
-	CRLF_UNDEFINED,
-	CRLF_BINARY,
-	CRLF_TEXT,
-	CRLF_TEXT_INPUT,
-	CRLF_TEXT_CRLF,
-	CRLF_AUTO,
-	CRLF_AUTO_INPUT,
-	CRLF_AUTO_CRLF
-};
-
 struct text_stat {
 	/* NUL, CR, LF and CRLF counts */
 	unsigned nul, lonecr, lonelf, crlf;
@@ -1297,18 +1286,10 @@ static int git_path_check_ident(struct attr_check_item *check)
 	return !!ATTR_TRUE(value);
 }
 
-struct conv_attrs {
-	struct convert_driver *drv;
-	enum crlf_action attr_action; /* What attr says */
-	enum crlf_action crlf_action; /* When no attr is set, use core.autocrlf */
-	int ident;
-	const char *working_tree_encoding; /* Supported encoding or default encoding if NULL */
-};
-
 static struct attr_check *check;
 
-static void convert_attrs(const struct index_state *istate,
-			  struct conv_attrs *ca, const char *path)
+void convert_attrs(const struct index_state *istate,
+		   struct conv_attrs *ca, const char *path)
 {
 	struct attr_check_item *ccheck = NULL;
 
diff --git a/convert.h b/convert.h
index e29d1026a6..aeb4a1be9a 100644
--- a/convert.h
+++ b/convert.h
@@ -37,6 +37,27 @@ enum eol {
 #endif
 };
 
+enum crlf_action {
+	CRLF_UNDEFINED,
+	CRLF_BINARY,
+	CRLF_TEXT,
+	CRLF_TEXT_INPUT,
+	CRLF_TEXT_CRLF,
+	CRLF_AUTO,
+	CRLF_AUTO_INPUT,
+	CRLF_AUTO_CRLF
+};
+
+struct convert_driver;
+
+struct conv_attrs {
+	struct convert_driver *drv;
+	enum crlf_action attr_action; /* What attr says */
+	enum crlf_action crlf_action; /* When no attr is set, use core.autocrlf */
+	int ident;
+	const char *working_tree_encoding; /* Supported encoding or default encoding if NULL */
+};
+
 enum ce_delay_state {
 	CE_NO_DELAY = 0,
 	CE_CAN_DELAY = 1,
@@ -102,6 +123,9 @@ void convert_to_git_filter_fd(const struct index_state *istate,
 int would_convert_to_git_filter_fd(const struct index_state *istate,
 				   const char *path);
 
+void convert_attrs(const struct index_state *istate,
+		   struct conv_attrs *ca, const char *path);
+
 /*
  * Initialize the checkout metadata with the given values.  Any argument may be
  * NULL if it is not applicable.  The treeish should be a commit if that is
-- 
2.28.0



* [PATCH v3 02/19] convert: add [async_]convert_to_working_tree_ca() variants
  2020-10-29  2:14   ` [PATCH v3 00/19] Parallel Checkout (part I) Matheus Tavares
  2020-10-29  2:14     ` [PATCH v3 01/19] convert: make convert_attrs() and convert structs public Matheus Tavares
@ 2020-10-29  2:14     ` Matheus Tavares
  2020-10-29 23:48       ` Junio C Hamano
  2020-10-29  2:14     ` [PATCH v3 03/19] convert: add get_stream_filter_ca() variant Matheus Tavares
                       ` (19 subsequent siblings)
  21 siblings, 1 reply; 154+ messages in thread
From: Matheus Tavares @ 2020-10-29  2:14 UTC (permalink / raw)
  To: git
  Cc: gitster, git, chriscool, peff, newren, jrnieder, martin.agren,
	Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Separate the attribute gathering from the actual conversion by adding
_ca() variants of the conversion functions. These variants receive a
precomputed 'struct conv_attrs', thus not relying on an index state.
They will be used in a future patch adding parallel checkout support,
for two reasons:

- We will already load the conversion attributes in checkout_entry(),
  before conversion, to decide whether a path is eligible for parallel
  checkout. Therefore, it would be wasteful to load them again later,
  for the actual conversion.

- The parallel workers will be responsible for reading, converting and
  writing blobs to the working tree. They won't have access to the main
  process' index state, so they cannot load the attributes. Instead,
  they will receive the preloaded ones and call the _ca() variant of
  the conversion functions. Furthermore, the attributes machinery is
  optimized to handle paths in sequential order, so it's better to leave
  it for the main process, anyway.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
[matheus.bernardino: squash, remove one function definition and reword]
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 convert.c | 50 ++++++++++++++++++++++++++++++++++++--------------
 convert.h |  9 +++++++++
 2 files changed, 45 insertions(+), 14 deletions(-)

diff --git a/convert.c b/convert.c
index eb14714979..191a42a0ae 100644
--- a/convert.c
+++ b/convert.c
@@ -1447,7 +1447,7 @@ void convert_to_git_filter_fd(const struct index_state *istate,
 	ident_to_git(dst->buf, dst->len, dst, ca.ident);
 }
 
-static int convert_to_working_tree_internal(const struct index_state *istate,
+static int convert_to_working_tree_internal(const struct conv_attrs *ca,
 					    const char *path, const char *src,
 					    size_t len, struct strbuf *dst,
 					    int normalizing,
@@ -1455,11 +1455,8 @@ static int convert_to_working_tree_internal(const struct index_state *istate,
 					    struct delayed_checkout *dco)
 {
 	int ret = 0, ret_filter = 0;
-	struct conv_attrs ca;
-
-	convert_attrs(istate, &ca, path);
 
-	ret |= ident_to_worktree(src, len, dst, ca.ident);
+	ret |= ident_to_worktree(src, len, dst, ca->ident);
 	if (ret) {
 		src = dst->buf;
 		len = dst->len;
@@ -1469,24 +1466,24 @@ static int convert_to_working_tree_internal(const struct index_state *istate,
 	 * is a smudge or process filter (even if the process filter doesn't
 	 * support smudge).  The filters might expect CRLFs.
 	 */
-	if ((ca.drv && (ca.drv->smudge || ca.drv->process)) || !normalizing) {
-		ret |= crlf_to_worktree(src, len, dst, ca.crlf_action);
+	if ((ca->drv && (ca->drv->smudge || ca->drv->process)) || !normalizing) {
+		ret |= crlf_to_worktree(src, len, dst, ca->crlf_action);
 		if (ret) {
 			src = dst->buf;
 			len = dst->len;
 		}
 	}
 
-	ret |= encode_to_worktree(path, src, len, dst, ca.working_tree_encoding);
+	ret |= encode_to_worktree(path, src, len, dst, ca->working_tree_encoding);
 	if (ret) {
 		src = dst->buf;
 		len = dst->len;
 	}
 
 	ret_filter = apply_filter(
-		path, src, len, -1, dst, ca.drv, CAP_SMUDGE, meta, dco);
-	if (!ret_filter && ca.drv && ca.drv->required)
-		die(_("%s: smudge filter %s failed"), path, ca.drv->name);
+		path, src, len, -1, dst, ca->drv, CAP_SMUDGE, meta, dco);
+	if (!ret_filter && ca->drv && ca->drv->required)
+		die(_("%s: smudge filter %s failed"), path, ca->drv->name);
 
 	return ret | ret_filter;
 }
@@ -1497,7 +1494,9 @@ int async_convert_to_working_tree(const struct index_state *istate,
 				  const struct checkout_metadata *meta,
 				  void *dco)
 {
-	return convert_to_working_tree_internal(istate, path, src, len, dst, 0, meta, dco);
+	struct conv_attrs ca;
+	convert_attrs(istate, &ca, path);
+	return convert_to_working_tree_internal(&ca, path, src, len, dst, 0, meta, dco);
 }
 
 int convert_to_working_tree(const struct index_state *istate,
@@ -1505,13 +1504,36 @@ int convert_to_working_tree(const struct index_state *istate,
 			    size_t len, struct strbuf *dst,
 			    const struct checkout_metadata *meta)
 {
-	return convert_to_working_tree_internal(istate, path, src, len, dst, 0, meta, NULL);
+	struct conv_attrs ca;
+	convert_attrs(istate, &ca, path);
+	return convert_to_working_tree_internal(&ca, path, src, len, dst, 0, meta, NULL);
+}
+
+int async_convert_to_working_tree_ca(const struct conv_attrs *ca,
+				     const char *path, const char *src,
+				     size_t len, struct strbuf *dst,
+				     const struct checkout_metadata *meta,
+				     void *dco)
+{
+	return convert_to_working_tree_internal(ca, path, src, len, dst, 0, meta, dco);
+}
+
+int convert_to_working_tree_ca(const struct conv_attrs *ca,
+			       const char *path, const char *src,
+			       size_t len, struct strbuf *dst,
+			       const struct checkout_metadata *meta)
+{
+	return convert_to_working_tree_internal(ca, path, src, len, dst, 0, meta, NULL);
 }
 
 int renormalize_buffer(const struct index_state *istate, const char *path,
 		       const char *src, size_t len, struct strbuf *dst)
 {
-	int ret = convert_to_working_tree_internal(istate, path, src, len, dst, 1, NULL, NULL);
+	struct conv_attrs ca;
+	int ret;
+
+	convert_attrs(istate, &ca, path);
+	ret = convert_to_working_tree_internal(&ca, path, src, len, dst, 1, NULL, NULL);
 	if (ret) {
 		src = dst->buf;
 		len = dst->len;
diff --git a/convert.h b/convert.h
index aeb4a1be9a..46d537d1ae 100644
--- a/convert.h
+++ b/convert.h
@@ -100,11 +100,20 @@ int convert_to_working_tree(const struct index_state *istate,
 			    const char *path, const char *src,
 			    size_t len, struct strbuf *dst,
 			    const struct checkout_metadata *meta);
+int convert_to_working_tree_ca(const struct conv_attrs *ca,
+			       const char *path, const char *src,
+			       size_t len, struct strbuf *dst,
+			       const struct checkout_metadata *meta);
 int async_convert_to_working_tree(const struct index_state *istate,
 				  const char *path, const char *src,
 				  size_t len, struct strbuf *dst,
 				  const struct checkout_metadata *meta,
 				  void *dco);
+int async_convert_to_working_tree_ca(const struct conv_attrs *ca,
+				     const char *path, const char *src,
+				     size_t len, struct strbuf *dst,
+				     const struct checkout_metadata *meta,
+				     void *dco);
 int async_query_available_blobs(const char *cmd,
 				struct string_list *available_paths);
 int renormalize_buffer(const struct index_state *istate,
-- 
2.28.0


^ permalink raw reply	[flat|nested] 154+ messages in thread

* [PATCH v3 03/19] convert: add get_stream_filter_ca() variant
  2020-10-29  2:14   ` [PATCH v3 00/19] Parallel Checkout (part I) Matheus Tavares
  2020-10-29  2:14     ` [PATCH v3 01/19] convert: make convert_attrs() and convert structs public Matheus Tavares
  2020-10-29  2:14     ` [PATCH v3 02/19] convert: add [async_]convert_to_working_tree_ca() variants Matheus Tavares
@ 2020-10-29  2:14     ` Matheus Tavares
  2020-10-29 23:51       ` Junio C Hamano
  2020-10-29  2:14     ` [PATCH v3 04/19] convert: add conv_attrs classification Matheus Tavares
                       ` (18 subsequent siblings)
  21 siblings, 1 reply; 154+ messages in thread
From: Matheus Tavares @ 2020-10-29  2:14 UTC (permalink / raw)
  To: git
  Cc: gitster, git, chriscool, peff, newren, jrnieder, martin.agren,
	Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

As in the previous patch, we will also need to call get_stream_filter()
with a precomputed `struct conv_attrs` when we add support for parallel
checkout workers. So add a _ca() variant which takes the conversion
attributes struct as a parameter.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
[matheus.bernardino: move header comment to ca() variant and reword msg]
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 convert.c | 28 +++++++++++++++++-----------
 convert.h |  2 ++
 2 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/convert.c b/convert.c
index 191a42a0ae..bd4d3f01cd 100644
--- a/convert.c
+++ b/convert.c
@@ -1960,34 +1960,31 @@ static struct stream_filter *ident_filter(const struct object_id *oid)
 }
 
 /*
- * Return an appropriately constructed filter for the path, or NULL if
+ * Return an appropriately constructed filter for the given ca, or NULL if
  * the contents cannot be filtered without reading the whole thing
  * in-core.
  *
  * Note that you would be crazy to set CRLF, smudge/clean or ident to a
  * large binary blob you would want us not to slurp into the memory!
  */
-struct stream_filter *get_stream_filter(const struct index_state *istate,
-					const char *path,
-					const struct object_id *oid)
+struct stream_filter *get_stream_filter_ca(const struct conv_attrs *ca,
+					   const struct object_id *oid)
 {
-	struct conv_attrs ca;
 	struct stream_filter *filter = NULL;
 
-	convert_attrs(istate, &ca, path);
-	if (ca.drv && (ca.drv->process || ca.drv->smudge || ca.drv->clean))
+	if (ca->drv && (ca->drv->process || ca->drv->smudge || ca->drv->clean))
 		return NULL;
 
-	if (ca.working_tree_encoding)
+	if (ca->working_tree_encoding)
 		return NULL;
 
-	if (ca.crlf_action == CRLF_AUTO || ca.crlf_action == CRLF_AUTO_CRLF)
+	if (ca->crlf_action == CRLF_AUTO || ca->crlf_action == CRLF_AUTO_CRLF)
 		return NULL;
 
-	if (ca.ident)
+	if (ca->ident)
 		filter = ident_filter(oid);
 
-	if (output_eol(ca.crlf_action) == EOL_CRLF)
+	if (output_eol(ca->crlf_action) == EOL_CRLF)
 		filter = cascade_filter(filter, lf_to_crlf_filter());
 	else
 		filter = cascade_filter(filter, &null_filter_singleton);
@@ -1995,6 +1992,15 @@ struct stream_filter *get_stream_filter(const struct index_state *istate,
 	return filter;
 }
 
+struct stream_filter *get_stream_filter(const struct index_state *istate,
+					const char *path,
+					const struct object_id *oid)
+{
+	struct conv_attrs ca;
+	convert_attrs(istate, &ca, path);
+	return get_stream_filter_ca(&ca, oid);
+}
+
 void free_stream_filter(struct stream_filter *filter)
 {
 	filter->vtbl->free(filter);
diff --git a/convert.h b/convert.h
index 46d537d1ae..262c1a1d46 100644
--- a/convert.h
+++ b/convert.h
@@ -169,6 +169,8 @@ struct stream_filter; /* opaque */
 struct stream_filter *get_stream_filter(const struct index_state *istate,
 					const char *path,
 					const struct object_id *);
+struct stream_filter *get_stream_filter_ca(const struct conv_attrs *ca,
+					   const struct object_id *oid);
 void free_stream_filter(struct stream_filter *);
 int is_null_stream_filter(struct stream_filter *);
 
-- 
2.28.0



* [PATCH v3 04/19] convert: add conv_attrs classification
  2020-10-29  2:14   ` [PATCH v3 00/19] Parallel Checkout (part I) Matheus Tavares
                       ` (2 preceding siblings ...)
  2020-10-29  2:14     ` [PATCH v3 03/19] convert: add get_stream_filter_ca() variant Matheus Tavares
@ 2020-10-29  2:14     ` Matheus Tavares
  2020-10-29 23:53       ` Junio C Hamano
  2020-10-29  2:14     ` [PATCH v3 05/19] entry: extract a header file for entry.c functions Matheus Tavares
                       ` (17 subsequent siblings)
  21 siblings, 1 reply; 154+ messages in thread
From: Matheus Tavares @ 2020-10-29  2:14 UTC (permalink / raw)
  To: git
  Cc: gitster, git, chriscool, peff, newren, jrnieder, martin.agren,
	Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Create `enum conv_attrs_classification` to express the different ways
that attributes are handled for a blob during checkout.

This will be used in a later commit when deciding whether to add a file
to the parallel or delayed queue during checkout. For now, we can also
use it in get_stream_filter_ca() to simplify the function (as the
classifying logic is the same).

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
[matheus.bernardino: use classification in get_stream_filter_ca()]
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 convert.c | 26 +++++++++++++++++++-------
 convert.h | 33 +++++++++++++++++++++++++++++++++
 2 files changed, 52 insertions(+), 7 deletions(-)

diff --git a/convert.c b/convert.c
index bd4d3f01cd..c0b45149b5 100644
--- a/convert.c
+++ b/convert.c
@@ -1972,13 +1972,7 @@ struct stream_filter *get_stream_filter_ca(const struct conv_attrs *ca,
 {
 	struct stream_filter *filter = NULL;
 
-	if (ca->drv && (ca->drv->process || ca->drv->smudge || ca->drv->clean))
-		return NULL;
-
-	if (ca->working_tree_encoding)
-		return NULL;
-
-	if (ca->crlf_action == CRLF_AUTO || ca->crlf_action == CRLF_AUTO_CRLF)
+	if (classify_conv_attrs(ca) != CA_CLASS_STREAMABLE)
 		return NULL;
 
 	if (ca->ident)
@@ -2034,3 +2028,21 @@ void clone_checkout_metadata(struct checkout_metadata *dst,
 	if (blob)
 		oidcpy(&dst->blob, blob);
 }
+
+enum conv_attrs_classification classify_conv_attrs(const struct conv_attrs *ca)
+{
+	if (ca->drv) {
+		if (ca->drv->process)
+			return CA_CLASS_INCORE_PROCESS;
+		if (ca->drv->smudge || ca->drv->clean)
+			return CA_CLASS_INCORE_FILTER;
+	}
+
+	if (ca->working_tree_encoding)
+		return CA_CLASS_INCORE;
+
+	if (ca->crlf_action == CRLF_AUTO || ca->crlf_action == CRLF_AUTO_CRLF)
+		return CA_CLASS_INCORE;
+
+	return CA_CLASS_STREAMABLE;
+}
diff --git a/convert.h b/convert.h
index 262c1a1d46..523ba9b140 100644
--- a/convert.h
+++ b/convert.h
@@ -190,4 +190,37 @@ int stream_filter(struct stream_filter *,
 		  const char *input, size_t *isize_p,
 		  char *output, size_t *osize_p);
 
+enum conv_attrs_classification {
+	/*
+	 * The blob must be loaded into a buffer before it can be
+	 * smudged. All smudging is done in-proc.
+	 */
+	CA_CLASS_INCORE,
+
+	/*
+	 * The blob must be loaded into a buffer, but uses a
+	 * single-file driver filter, such as rot13.
+	 */
+	CA_CLASS_INCORE_FILTER,
+
+	/*
+	 * The blob must be loaded into a buffer, but uses a
+	 * long-running driver process, such as LFS. This might or
+	 * might not use delayed operations. (The important thing is
+	 * that there is a single subordinate long-running process
+	 * handling all associated blobs and in case of delayed
+	 * operations, may hold per-blob state.)
+	 */
+	CA_CLASS_INCORE_PROCESS,
+
+	/*
+	 * The blob can be streamed and smudged without needing to
+	 * completely read it into a buffer.
+	 */
+	CA_CLASS_STREAMABLE,
+};
+
+enum conv_attrs_classification classify_conv_attrs(
+	const struct conv_attrs *ca);
+
 #endif /* CONVERT_H */
-- 
2.28.0



* [PATCH v3 05/19] entry: extract a header file for entry.c functions
  2020-10-29  2:14   ` [PATCH v3 00/19] Parallel Checkout (part I) Matheus Tavares
                       ` (3 preceding siblings ...)
  2020-10-29  2:14     ` [PATCH v3 04/19] convert: add conv_attrs classification Matheus Tavares
@ 2020-10-29  2:14     ` Matheus Tavares
  2020-10-30 21:36       ` Junio C Hamano
  2020-10-29  2:14     ` [PATCH v3 06/19] entry: make fstat_output() and read_blob_entry() public Matheus Tavares
                       ` (16 subsequent siblings)
  21 siblings, 1 reply; 154+ messages in thread
From: Matheus Tavares @ 2020-10-29  2:14 UTC (permalink / raw)
  To: git; +Cc: gitster, git, chriscool, peff, newren, jrnieder, martin.agren

The declarations of entry.c's public functions and structures currently
reside in cache.h. Although they are not many, they contribute to the
size of cache.h and, when changed, cause unnecessary recompilation of
modules that don't actually use these functions. So let's move them to a
new entry.h header.

Original-patch-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 apply.c                  |  1 +
 builtin/checkout-index.c |  1 +
 builtin/checkout.c       |  1 +
 builtin/difftool.c       |  1 +
 cache.h                  | 24 -----------------------
 entry.c                  |  9 +--------
 entry.h                  | 41 ++++++++++++++++++++++++++++++++++++++++
 unpack-trees.c           |  1 +
 8 files changed, 47 insertions(+), 32 deletions(-)
 create mode 100644 entry.h

diff --git a/apply.c b/apply.c
index 76dba93c97..ddec80b4b0 100644
--- a/apply.c
+++ b/apply.c
@@ -21,6 +21,7 @@
 #include "quote.h"
 #include "rerere.h"
 #include "apply.h"
+#include "entry.h"
 
 struct gitdiff_data {
 	struct strbuf *root;
diff --git a/builtin/checkout-index.c b/builtin/checkout-index.c
index 4bbfc92dce..9276ed0258 100644
--- a/builtin/checkout-index.c
+++ b/builtin/checkout-index.c
@@ -11,6 +11,7 @@
 #include "quote.h"
 #include "cache-tree.h"
 #include "parse-options.h"
+#include "entry.h"
 
 #define CHECKOUT_ALL 4
 static int nul_term_line;
diff --git a/builtin/checkout.c b/builtin/checkout.c
index 0951f8fee5..b18b9d6f3c 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -26,6 +26,7 @@
 #include "unpack-trees.h"
 #include "wt-status.h"
 #include "xdiff-interface.h"
+#include "entry.h"
 
 static const char * const checkout_usage[] = {
 	N_("git checkout [<options>] <branch>"),
diff --git a/builtin/difftool.c b/builtin/difftool.c
index 7ac432b881..dfa22b67eb 100644
--- a/builtin/difftool.c
+++ b/builtin/difftool.c
@@ -23,6 +23,7 @@
 #include "lockfile.h"
 #include "object-store.h"
 #include "dir.h"
+#include "entry.h"
 
 static int trust_exit_code;
 
diff --git a/cache.h b/cache.h
index c0072d43b1..ccfeb9ba2b 100644
--- a/cache.h
+++ b/cache.h
@@ -1706,30 +1706,6 @@ const char *show_ident_date(const struct ident_split *id,
  */
 int ident_cmp(const struct ident_split *, const struct ident_split *);
 
-struct checkout {
-	struct index_state *istate;
-	const char *base_dir;
-	int base_dir_len;
-	struct delayed_checkout *delayed_checkout;
-	struct checkout_metadata meta;
-	unsigned force:1,
-		 quiet:1,
-		 not_new:1,
-		 clone:1,
-		 refresh_cache:1;
-};
-#define CHECKOUT_INIT { NULL, "" }
-
-#define TEMPORARY_FILENAME_LENGTH 25
-int checkout_entry(struct cache_entry *ce, const struct checkout *state, char *topath, int *nr_checkouts);
-void enable_delayed_checkout(struct checkout *state);
-int finish_delayed_checkout(struct checkout *state, int *nr_checkouts);
-/*
- * Unlink the last component and schedule the leading directories for
- * removal, such that empty directories get removed.
- */
-void unlink_entry(const struct cache_entry *ce);
-
 struct cache_def {
 	struct strbuf path;
 	int flags;
diff --git a/entry.c b/entry.c
index a0532f1f00..b0b8099699 100644
--- a/entry.c
+++ b/entry.c
@@ -6,6 +6,7 @@
 #include "submodule.h"
 #include "progress.h"
 #include "fsmonitor.h"
+#include "entry.h"
 
 static void create_directories(const char *path, int path_len,
 			       const struct checkout *state)
@@ -429,14 +430,6 @@ static void mark_colliding_entries(const struct checkout *state,
 	}
 }
 
-/*
- * Write the contents from ce out to the working tree.
- *
- * When topath[] is not NULL, instead of writing to the working tree
- * file named by ce, a temporary file is created by this function and
- * its name is returned in topath[], which must be able to hold at
- * least TEMPORARY_FILENAME_LENGTH bytes long.
- */
 int checkout_entry(struct cache_entry *ce, const struct checkout *state,
 		   char *topath, int *nr_checkouts)
 {
diff --git a/entry.h b/entry.h
new file mode 100644
index 0000000000..2d69185448
--- /dev/null
+++ b/entry.h
@@ -0,0 +1,41 @@
+#ifndef ENTRY_H
+#define ENTRY_H
+
+#include "cache.h"
+#include "convert.h"
+
+struct checkout {
+	struct index_state *istate;
+	const char *base_dir;
+	int base_dir_len;
+	struct delayed_checkout *delayed_checkout;
+	struct checkout_metadata meta;
+	unsigned force:1,
+		 quiet:1,
+		 not_new:1,
+		 clone:1,
+		 refresh_cache:1;
+};
+#define CHECKOUT_INIT { NULL, "" }
+
+#define TEMPORARY_FILENAME_LENGTH 25
+
+/*
+ * Write the contents from ce out to the working tree.
+ *
+ * When topath[] is not NULL, instead of writing to the working tree
+ * file named by ce, a temporary file is created by this function and
+ * its name is returned in topath[], which must be able to hold at
+ * least TEMPORARY_FILENAME_LENGTH bytes.
+ */
+int checkout_entry(struct cache_entry *ce, const struct checkout *state,
+		   char *topath, int *nr_checkouts);
+void enable_delayed_checkout(struct checkout *state);
+int finish_delayed_checkout(struct checkout *state, int *nr_checkouts);
+/*
+ * Unlink the last component and schedule the leading directories for
+ * removal, such that empty directories get removed.
+ */
+void unlink_entry(const struct cache_entry *ce);
+
+#endif /* ENTRY_H */
diff --git a/unpack-trees.c b/unpack-trees.c
index 323280dd48..a511fadd89 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -16,6 +16,7 @@
 #include "fsmonitor.h"
 #include "object-store.h"
 #include "promisor-remote.h"
+#include "entry.h"
 
 /*
  * Error messages expected by scripts out of plumbing commands such as
-- 
2.28.0



* [PATCH v3 06/19] entry: make fstat_output() and read_blob_entry() public
  2020-10-29  2:14   ` [PATCH v3 00/19] Parallel Checkout (part I) Matheus Tavares
                       ` (4 preceding siblings ...)
  2020-10-29  2:14     ` [PATCH v3 05/19] entry: extract a header file for entry.c functions Matheus Tavares
@ 2020-10-29  2:14     ` Matheus Tavares
  2020-10-29  2:14     ` [PATCH v3 07/19] entry: extract cache_entry update from write_entry() Matheus Tavares
                       ` (15 subsequent siblings)
  21 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-10-29  2:14 UTC (permalink / raw)
  To: git; +Cc: gitster, git, chriscool, peff, newren, jrnieder, martin.agren

These two functions will be used by the parallel checkout code, so let's
make them public. Note: fstat_output() is renamed to
fstat_checkout_output() now that it has become public, to avoid future
name collisions.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 entry.c | 8 ++++----
 entry.h | 2 ++
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/entry.c b/entry.c
index b0b8099699..b36071a610 100644
--- a/entry.c
+++ b/entry.c
@@ -84,7 +84,7 @@ static int create_file(const char *path, unsigned int mode)
 	return open(path, O_WRONLY | O_CREAT | O_EXCL, mode);
 }
 
-static void *read_blob_entry(const struct cache_entry *ce, unsigned long *size)
+void *read_blob_entry(const struct cache_entry *ce, unsigned long *size)
 {
 	enum object_type type;
 	void *blob_data = read_object_file(&ce->oid, &type, size);
@@ -109,7 +109,7 @@ static int open_output_fd(char *path, const struct cache_entry *ce, int to_tempf
 	}
 }
 
-static int fstat_output(int fd, const struct checkout *state, struct stat *st)
+int fstat_checkout_output(int fd, const struct checkout *state, struct stat *st)
 {
 	/* use fstat() only when path == ce->name */
 	if (fstat_is_reliable() &&
@@ -132,7 +132,7 @@ static int streaming_write_entry(const struct cache_entry *ce, char *path,
 		return -1;
 
 	result |= stream_blob_to_fd(fd, &ce->oid, filter, 1);
-	*fstat_done = fstat_output(fd, state, statbuf);
+	*fstat_done = fstat_checkout_output(fd, state, statbuf);
 	result |= close(fd);
 
 	if (result)
@@ -346,7 +346,7 @@ static int write_entry(struct cache_entry *ce,
 
 		wrote = write_in_full(fd, new_blob, size);
 		if (!to_tempfile)
-			fstat_done = fstat_output(fd, state, &st);
+			fstat_done = fstat_checkout_output(fd, state, &st);
 		close(fd);
 		free(new_blob);
 		if (wrote < 0)
diff --git a/entry.h b/entry.h
index 2d69185448..f860e60846 100644
--- a/entry.h
+++ b/entry.h
@@ -37,5 +37,7 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts);
  * removal, such that empty directories get removed.
  */
 void unlink_entry(const struct cache_entry *ce);
+void *read_blob_entry(const struct cache_entry *ce, unsigned long *size);
+int fstat_checkout_output(int fd, const struct checkout *state, struct stat *st);
 
 #endif /* ENTRY_H */
-- 
2.28.0



* [PATCH v3 07/19] entry: extract cache_entry update from write_entry()
  2020-10-29  2:14   ` [PATCH v3 00/19] Parallel Checkout (part I) Matheus Tavares
                       ` (5 preceding siblings ...)
  2020-10-29  2:14     ` [PATCH v3 06/19] entry: make fstat_output() and read_blob_entry() public Matheus Tavares
@ 2020-10-29  2:14     ` Matheus Tavares
  2020-10-29  2:14     ` [PATCH v3 08/19] entry: move conv_attrs lookup up to checkout_entry() Matheus Tavares
                       ` (14 subsequent siblings)
  21 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-10-29  2:14 UTC (permalink / raw)
  To: git; +Cc: gitster, git, chriscool, peff, newren, jrnieder, martin.agren

This code will be used by the parallel checkout functions, outside
entry.c, so extract it to a public function.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 entry.c | 25 ++++++++++++++++---------
 entry.h |  2 ++
 2 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/entry.c b/entry.c
index b36071a610..1d2df188e5 100644
--- a/entry.c
+++ b/entry.c
@@ -251,6 +251,18 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
 	return errs;
 }
 
+void update_ce_after_write(const struct checkout *state, struct cache_entry *ce,
+			   struct stat *st)
+{
+	if (state->refresh_cache) {
+		assert(state->istate);
+		fill_stat_cache_info(state->istate, ce, st);
+		ce->ce_flags |= CE_UPDATE_IN_BASE;
+		mark_fsmonitor_invalid(state->istate, ce);
+		state->istate->cache_changed |= CE_ENTRY_CHANGED;
+	}
+}
+
 static int write_entry(struct cache_entry *ce,
 		       char *path, const struct checkout *state, int to_tempfile)
 {
@@ -371,15 +383,10 @@ static int write_entry(struct cache_entry *ce,
 
 finish:
 	if (state->refresh_cache) {
-		assert(state->istate);
-		if (!fstat_done)
-			if (lstat(ce->name, &st) < 0)
-				return error_errno("unable to stat just-written file %s",
-						   ce->name);
-		fill_stat_cache_info(state->istate, ce, &st);
-		ce->ce_flags |= CE_UPDATE_IN_BASE;
-		mark_fsmonitor_invalid(state->istate, ce);
-		state->istate->cache_changed |= CE_ENTRY_CHANGED;
+		if (!fstat_done && lstat(ce->name, &st) < 0)
+			return error_errno("unable to stat just-written file %s",
+					   ce->name);
+		update_ce_after_write(state, ce, &st);
 	}
 delayed:
 	return 0;
diff --git a/entry.h b/entry.h
index f860e60846..664aed1576 100644
--- a/entry.h
+++ b/entry.h
@@ -39,5 +39,7 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts);
 void unlink_entry(const struct cache_entry *ce);
 void *read_blob_entry(const struct cache_entry *ce, unsigned long *size);
 int fstat_checkout_output(int fd, const struct checkout *state, struct stat *st);
+void update_ce_after_write(const struct checkout *state, struct cache_entry *ce,
+			   struct stat *st);
 
 #endif /* ENTRY_H */
-- 
2.28.0



* [PATCH v3 08/19] entry: move conv_attrs lookup up to checkout_entry()
  2020-10-29  2:14   ` [PATCH v3 00/19] Parallel Checkout (part I) Matheus Tavares
                       ` (6 preceding siblings ...)
  2020-10-29  2:14     ` [PATCH v3 07/19] entry: extract cache_entry update from write_entry() Matheus Tavares
@ 2020-10-29  2:14     ` Matheus Tavares
  2020-10-30 21:58       ` Junio C Hamano
  2020-10-29  2:14     ` [PATCH v3 09/19] entry: add checkout_entry_ca() which takes preloaded conv_attrs Matheus Tavares
                       ` (13 subsequent siblings)
  21 siblings, 1 reply; 154+ messages in thread
From: Matheus Tavares @ 2020-10-29  2:14 UTC (permalink / raw)
  To: git; +Cc: gitster, git, chriscool, peff, newren, jrnieder, martin.agren

In a later patch, checkout_entry() will use conv_attrs to decide
whether an entry should be enqueued for parallel checkout. But
the attributes lookup only happens lower in this call stack. To avoid
the unnecessary work of loading the attributes twice, let's move it up
to checkout_entry(), and pass the loaded struct down to write_entry().

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 entry.c | 38 +++++++++++++++++++++++++++-----------
 1 file changed, 27 insertions(+), 11 deletions(-)

diff --git a/entry.c b/entry.c
index 1d2df188e5..8237859b12 100644
--- a/entry.c
+++ b/entry.c
@@ -263,8 +263,9 @@ void update_ce_after_write(const struct checkout *state, struct cache_entry *ce,
 	}
 }
 
-static int write_entry(struct cache_entry *ce,
-		       char *path, const struct checkout *state, int to_tempfile)
+/* Note: ca is used (and required) iff the entry refers to a regular file. */
+static int write_entry(struct cache_entry *ce, char *path, struct conv_attrs *ca,
+		       const struct checkout *state, int to_tempfile)
 {
 	unsigned int ce_mode_s_ifmt = ce->ce_mode & S_IFMT;
 	struct delayed_checkout *dco = state->delayed_checkout;
@@ -281,8 +282,7 @@ static int write_entry(struct cache_entry *ce,
 	clone_checkout_metadata(&meta, &state->meta, &ce->oid);
 
 	if (ce_mode_s_ifmt == S_IFREG) {
-		struct stream_filter *filter = get_stream_filter(state->istate, ce->name,
-								 &ce->oid);
+		struct stream_filter *filter = get_stream_filter_ca(ca, &ce->oid);
 		if (filter &&
 		    !streaming_write_entry(ce, path, filter,
 					   state, to_tempfile,
@@ -329,14 +329,17 @@ static int write_entry(struct cache_entry *ce,
 		 * Convert from git internal format to working tree format
 		 */
 		if (dco && dco->state != CE_NO_DELAY) {
-			ret = async_convert_to_working_tree(state->istate, ce->name, new_blob,
-							    size, &buf, &meta, dco);
+			ret = async_convert_to_working_tree_ca(ca, ce->name,
+							       new_blob, size,
+							       &buf, &meta, dco);
 			if (ret && string_list_has_string(&dco->paths, ce->name)) {
 				free(new_blob);
 				goto delayed;
 			}
-		} else
-			ret = convert_to_working_tree(state->istate, ce->name, new_blob, size, &buf, &meta);
+		} else {
+			ret = convert_to_working_tree_ca(ca, ce->name, new_blob,
+							 size, &buf, &meta);
+		}
 
 		if (ret) {
 			free(new_blob);
@@ -442,6 +445,7 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state,
 {
 	static struct strbuf path = STRBUF_INIT;
 	struct stat st;
+	struct conv_attrs ca;
 
 	if (ce->ce_flags & CE_WT_REMOVE) {
 		if (topath)
@@ -454,8 +458,13 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state,
 		return 0;
 	}
 
-	if (topath)
-		return write_entry(ce, topath, state, 1);
+	if (topath) {
+		if (S_ISREG(ce->ce_mode)) {
+			convert_attrs(state->istate, &ca, ce->name);
+			return write_entry(ce, topath, &ca, state, 1);
+		}
+		return write_entry(ce, topath, NULL, state, 1);
+	}
 
 	strbuf_reset(&path);
 	strbuf_add(&path, state->base_dir, state->base_dir_len);
@@ -517,9 +526,16 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state,
 		return 0;
 
 	create_directories(path.buf, path.len, state);
+
 	if (nr_checkouts)
 		(*nr_checkouts)++;
-	return write_entry(ce, path.buf, state, 0);
+
+	if (S_ISREG(ce->ce_mode)) {
+		convert_attrs(state->istate, &ca, ce->name);
+		return write_entry(ce, path.buf, &ca, state, 0);
+	}
+
+	return write_entry(ce, path.buf, NULL, state, 0);
 }
 
 void unlink_entry(const struct cache_entry *ce)
-- 
2.28.0



* [PATCH v3 09/19] entry: add checkout_entry_ca() which takes preloaded conv_attrs
  2020-10-29  2:14   ` [PATCH v3 00/19] Parallel Checkout (part I) Matheus Tavares
                       ` (7 preceding siblings ...)
  2020-10-29  2:14     ` [PATCH v3 08/19] entry: move conv_attrs lookup up to checkout_entry() Matheus Tavares
@ 2020-10-29  2:14     ` Matheus Tavares
  2020-10-30 22:02       ` Junio C Hamano
  2020-10-29  2:14     ` [PATCH v3 10/19] unpack-trees: add basic support for parallel checkout Matheus Tavares
                       ` (12 subsequent siblings)
  21 siblings, 1 reply; 154+ messages in thread
From: Matheus Tavares @ 2020-10-29  2:14 UTC (permalink / raw)
  To: git; +Cc: gitster, git, chriscool, peff, newren, jrnieder, martin.agren

The parallel checkout machinery will call checkout_entry() for entries
that could not be written in parallel due to path collisions. At this
point, we will already be holding the conversion attributes for each
entry, and it would be wasteful to let checkout_entry() load these
again. Instead, let's add the checkout_entry_ca() variant, which
optionally takes a preloaded conv_attrs struct.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 entry.c | 23 ++++++++++++-----------
 entry.h | 13 +++++++++++--
 2 files changed, 23 insertions(+), 13 deletions(-)

diff --git a/entry.c b/entry.c
index 8237859b12..9d79a5671f 100644
--- a/entry.c
+++ b/entry.c
@@ -440,12 +440,13 @@ static void mark_colliding_entries(const struct checkout *state,
 	}
 }
 
-int checkout_entry(struct cache_entry *ce, const struct checkout *state,
-		   char *topath, int *nr_checkouts)
+int checkout_entry_ca(struct cache_entry *ce, struct conv_attrs *ca,
+		      const struct checkout *state, char *topath,
+		      int *nr_checkouts)
 {
 	static struct strbuf path = STRBUF_INIT;
 	struct stat st;
-	struct conv_attrs ca;
+	struct conv_attrs ca_buf;
 
 	if (ce->ce_flags & CE_WT_REMOVE) {
 		if (topath)
@@ -459,11 +460,11 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state,
 	}
 
 	if (topath) {
-		if (S_ISREG(ce->ce_mode)) {
-			convert_attrs(state->istate, &ca, ce->name);
-			return write_entry(ce, topath, &ca, state, 1);
+		if (S_ISREG(ce->ce_mode) && !ca) {
+			convert_attrs(state->istate, &ca_buf, ce->name);
+			ca = &ca_buf;
 		}
-		return write_entry(ce, topath, NULL, state, 1);
+		return write_entry(ce, topath, ca, state, 1);
 	}
 
 	strbuf_reset(&path);
@@ -530,12 +531,12 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state,
 	if (nr_checkouts)
 		(*nr_checkouts)++;
 
-	if (S_ISREG(ce->ce_mode)) {
-		convert_attrs(state->istate, &ca, ce->name);
-		return write_entry(ce, path.buf, &ca, state, 0);
+	if (S_ISREG(ce->ce_mode) && !ca) {
+		convert_attrs(state->istate, &ca_buf, ce->name);
+		ca = &ca_buf;
 	}
 
-	return write_entry(ce, path.buf, NULL, state, 0);
+	return write_entry(ce, path.buf, ca, state, 0);
 }
 
 void unlink_entry(const struct cache_entry *ce)
diff --git a/entry.h b/entry.h
index 664aed1576..2081fbbbab 100644
--- a/entry.h
+++ b/entry.h
@@ -27,9 +27,18 @@ struct checkout {
  * file named by ce, a temporary file is created by this function and
  * its name is returned in topath[], which must be able to hold at
  * least TEMPORARY_FILENAME_LENGTH bytes.
+ *
+ * With checkout_entry_ca(), callers can optionally pass a preloaded
+ * conv_attrs struct (to avoid reloading it), when ce refers to a
+ * regular file. If ca is NULL, the attributes will be loaded
+ * internally when (and if) needed.
  */
-int checkout_entry(struct cache_entry *ce, const struct checkout *state,
-		   char *topath, int *nr_checkouts);
+#define checkout_entry(ce, state, topath, nr_checkouts) \
+		checkout_entry_ca(ce, NULL, state, topath, nr_checkouts)
+int checkout_entry_ca(struct cache_entry *ce, struct conv_attrs *ca,
+		      const struct checkout *state, char *topath,
+		      int *nr_checkouts);
+
 void enable_delayed_checkout(struct checkout *state);
 int finish_delayed_checkout(struct checkout *state, int *nr_checkouts);
 /*
-- 
2.28.0


^ permalink raw reply	[flat|nested] 154+ messages in thread

* [PATCH v3 10/19] unpack-trees: add basic support for parallel checkout
  2020-10-29  2:14   ` [PATCH v3 00/19] Parallel Checkout (part I) Matheus Tavares
                       ` (8 preceding siblings ...)
  2020-10-29  2:14     ` [PATCH v3 09/19] entry: add checkout_entry_ca() which takes preloaded conv_attrs Matheus Tavares
@ 2020-10-29  2:14     ` Matheus Tavares
  2020-11-02 19:35       ` Junio C Hamano
  2020-10-29  2:14     ` [PATCH v3 11/19] parallel-checkout: make it truly parallel Matheus Tavares
                       ` (11 subsequent siblings)
  21 siblings, 1 reply; 154+ messages in thread
From: Matheus Tavares @ 2020-10-29  2:14 UTC (permalink / raw)
  To: git; +Cc: gitster, git, chriscool, peff, newren, jrnieder, martin.agren

This new interface allows us to enqueue some of the entries being
checked out to later call write_entry() for them in parallel. For now,
the parallel checkout machinery is enabled by default and there is no
user configuration, but run_parallel_checkout() just writes the queued
entries in sequence (without spawning additional workers). The next
patch will actually implement the parallelism and, later, we will make
it configurable.

When there are path collisions among the entries being written (which
can happen e.g. with case-sensitive files in case-insensitive file
systems), the parallel checkout code detects the problem and marks the
item with PC_ITEM_COLLIDED. Later, these items are sequentially fed to
checkout_entry() again. This is similar to the way the sequential code
deals with collisions, overwriting the previously checked out entries
with the subsequent ones. The only difference is that, when we start
writing the entries in parallel, we won't be able to determine which of
the colliding entries will survive on disk (for the sequential
algorithm, it is always the last one).

I also experimented with the idea of not overwriting colliding entries,
and it seemed to work well in my simple tests. However, because just one
entry of each colliding group would be actually written, the others
would have null lstat() fields in the index. This might not be a problem
by itself, but it could cause performance penalties for subsequent
commands that need to refresh the index: when the cached st_size value
is 0, read-cache.c:ie_modified() will go to the filesystem to see if the
contents match. As mentioned in the function:

    * Immediately after read-tree or update-index --cacheinfo,
    * the length field is zero, as we have never even read the
    * lstat(2) information once, and we cannot trust DATA_CHANGED
    * returned by ie_match_stat() which in turn was returned by
    * ce_match_stat_basic() to signal that the filesize of the
    * blob changed.  We have to actually go to the filesystem to
    * see if the contents match, and if so, should answer "unchanged".

So, if we have N entries in a colliding group and we decide to write and
lstat() only one of them, every subsequent git-status will have to read,
convert, and hash the written file N - 1 times, to check that the N - 1
unwritten entries are dirty. By checking out all colliding entries (like
the sequential code does), we only pay the overhead once.

Co-authored-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Co-authored-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 Makefile            |   1 +
 entry.c             |  17 +-
 parallel-checkout.c | 368 ++++++++++++++++++++++++++++++++++++++++++++
 parallel-checkout.h |  27 ++++
 unpack-trees.c      |   6 +-
 5 files changed, 416 insertions(+), 3 deletions(-)
 create mode 100644 parallel-checkout.c
 create mode 100644 parallel-checkout.h

diff --git a/Makefile b/Makefile
index 1fb0ec1705..10ee5e709b 100644
--- a/Makefile
+++ b/Makefile
@@ -945,6 +945,7 @@ LIB_OBJS += pack-revindex.o
 LIB_OBJS += pack-write.o
 LIB_OBJS += packfile.o
 LIB_OBJS += pager.o
+LIB_OBJS += parallel-checkout.o
 LIB_OBJS += parse-options-cb.o
 LIB_OBJS += parse-options.o
 LIB_OBJS += patch-delta.o
diff --git a/entry.c b/entry.c
index 9d79a5671f..6676954431 100644
--- a/entry.c
+++ b/entry.c
@@ -7,6 +7,7 @@
 #include "progress.h"
 #include "fsmonitor.h"
 #include "entry.h"
+#include "parallel-checkout.h"
 
 static void create_directories(const char *path, int path_len,
 			       const struct checkout *state)
@@ -426,8 +427,17 @@ static void mark_colliding_entries(const struct checkout *state,
 	for (i = 0; i < state->istate->cache_nr; i++) {
 		struct cache_entry *dup = state->istate->cache[i];
 
-		if (dup == ce)
-			break;
+		if (dup == ce) {
+			/*
+			 * Parallel checkout creates the files in no particular
+			 * order. So the other side of the collision may appear
+			 * after the given cache_entry in the array.
+			 */
+			if (parallel_checkout_status() == PC_RUNNING)
+				continue;
+			else
+				break;
+		}
 
 		if (dup->ce_flags & (CE_MATCHED | CE_VALID | CE_SKIP_WORKTREE))
 			continue;
@@ -536,6 +546,9 @@ int checkout_entry_ca(struct cache_entry *ce, struct conv_attrs *ca,
 		ca = &ca_buf;
 	}
 
+	if (!enqueue_checkout(ce, ca))
+		return 0;
+
 	return write_entry(ce, path.buf, ca, state, 0);
 }
 
diff --git a/parallel-checkout.c b/parallel-checkout.c
new file mode 100644
index 0000000000..981dbe6ff3
--- /dev/null
+++ b/parallel-checkout.c
@@ -0,0 +1,368 @@
+#include "cache.h"
+#include "entry.h"
+#include "parallel-checkout.h"
+#include "streaming.h"
+
+enum pc_item_status {
+	PC_ITEM_PENDING = 0,
+	PC_ITEM_WRITTEN,
+	/*
+	 * The entry could not be written because there was another file
+	 * already present in its path or leading directories. Since
+	 * checkout_entry_ca() removes such files from the working tree before
+	 * enqueueing the entry for parallel checkout, it means that there was
+	 * a path collision among the entries being written.
+	 */
+	PC_ITEM_COLLIDED,
+	PC_ITEM_FAILED,
+};
+
+struct parallel_checkout_item {
+	/* pointer to an istate->cache[] entry. Not owned by us. */
+	struct cache_entry *ce;
+	struct conv_attrs ca;
+	struct stat st;
+	enum pc_item_status status;
+};
+
+struct parallel_checkout {
+	enum pc_status status;
+	struct parallel_checkout_item *items;
+	size_t nr, alloc;
+};
+
+static struct parallel_checkout parallel_checkout = { 0 };
+
+enum pc_status parallel_checkout_status(void)
+{
+	return parallel_checkout.status;
+}
+
+void init_parallel_checkout(void)
+{
+	if (parallel_checkout.status != PC_UNINITIALIZED)
+		BUG("parallel checkout already initialized");
+
+	parallel_checkout.status = PC_ACCEPTING_ENTRIES;
+}
+
+static void finish_parallel_checkout(void)
+{
+	if (parallel_checkout.status == PC_UNINITIALIZED)
+		BUG("cannot finish parallel checkout: not initialized yet");
+
+	free(parallel_checkout.items);
+	memset(&parallel_checkout, 0, sizeof(parallel_checkout));
+}
+
+static int is_eligible_for_parallel_checkout(const struct cache_entry *ce,
+					     const struct conv_attrs *ca)
+{
+	enum conv_attrs_classification c;
+
+	if (!S_ISREG(ce->ce_mode))
+		return 0;
+
+	c = classify_conv_attrs(ca);
+	switch (c) {
+	case CA_CLASS_INCORE:
+		return 1;
+
+	case CA_CLASS_INCORE_FILTER:
+		/*
+		 * It would be safe to allow concurrent instances of
+		 * single-file smudge filters, like rot13, but we should not
+		 * assume that all filters are parallel-process safe. So we
+		 * don't allow this.
+		 */
+		return 0;
+
+	case CA_CLASS_INCORE_PROCESS:
+		/*
+		 * The parallel queue and the delayed queue are not compatible,
+		 * so they must be kept completely separated. And we can't tell
+		 * if a long-running process will delay its response without
+		 * actually asking it to perform the filtering. Therefore, this
+		 * type of filter is not allowed in parallel checkout.
+		 *
+		 * Furthermore, there should only be one instance of the
+		 * long-running process filter as we don't know how it is
+		 * managing its own concurrency. So, spreading the entries that
+		 * require such a filter among the parallel workers would
+		 * require a lot more inter-process communication. We would
+		 * probably have to designate a single process to interact with
+		 * the filter and send all the necessary data to it, for each
+		 * entry.
+		 */
+		return 0;
+
+	case CA_CLASS_STREAMABLE:
+		return 1;
+
+	default:
+		BUG("unsupported conv_attrs classification '%d'", c);
+	}
+}
+
+int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca)
+{
+	struct parallel_checkout_item *pc_item;
+
+	if (parallel_checkout.status != PC_ACCEPTING_ENTRIES ||
+	    !is_eligible_for_parallel_checkout(ce, ca))
+		return -1;
+
+	ALLOC_GROW(parallel_checkout.items, parallel_checkout.nr + 1,
+		   parallel_checkout.alloc);
+
+	pc_item = &parallel_checkout.items[parallel_checkout.nr++];
+	pc_item->ce = ce;
+	memcpy(&pc_item->ca, ca, sizeof(pc_item->ca));
+	pc_item->status = PC_ITEM_PENDING;
+
+	return 0;
+}
+
+static int handle_results(struct checkout *state)
+{
+	int ret = 0;
+	size_t i;
+	int have_pending = 0;
+
+	/*
+	 * We first update the successfully written entries with the collected
+	 * stat() data, so that they can be found by mark_colliding_entries(),
+	 * in the next loop, when necessary.
+	 */
+	for (i = 0; i < parallel_checkout.nr; ++i) {
+		struct parallel_checkout_item *pc_item = &parallel_checkout.items[i];
+		if (pc_item->status == PC_ITEM_WRITTEN)
+			update_ce_after_write(state, pc_item->ce, &pc_item->st);
+	}
+
+	for (i = 0; i < parallel_checkout.nr; ++i) {
+		struct parallel_checkout_item *pc_item = &parallel_checkout.items[i];
+
+		switch(pc_item->status) {
+		case PC_ITEM_WRITTEN:
+			/* Already handled */
+			break;
+		case PC_ITEM_COLLIDED:
+			/*
+			 * The entry could not be checked out due to a path
+			 * collision with another entry. Since there can only
+			 * be one entry of each colliding group on the disk, we
+			 * could skip trying to check out this one and move on.
+			 * However, this would leave the unwritten entries with
+			 * null stat() fields on the index, which could
+			 * potentially slow down subsequent operations that
+			 * require refreshing it: git would not be able to
+			 * trust st_size and would have to go to the filesystem
+			 * to see if the contents match (see ie_modified()).
+			 *
+			 * Instead, let's pay the overhead only once, now, and
+			 * call checkout_entry_ca() again for this file, to
+			 * have its stat() data stored in the index. This also
+			 * has the benefit of adding this entry and its
+			 * colliding pair to the collision report message.
+			 * Additionally, this overwriting behavior is consistent
+			 * with what the sequential checkout does, so it doesn't
+			 * add any extra overhead.
+			 */
+			ret |= checkout_entry_ca(pc_item->ce, &pc_item->ca,
+						 state, NULL, NULL);
+			break;
+		case PC_ITEM_PENDING:
+			have_pending = 1;
+			/* fall through */
+		case PC_ITEM_FAILED:
+			ret = -1;
+			break;
+		default:
+			BUG("unknown checkout item status in parallel checkout");
+		}
+	}
+
+	if (have_pending)
+		error(_("parallel checkout finished with pending entries"));
+
+	return ret;
+}
+
+static int reset_fd(int fd, const char *path)
+{
+	if (lseek(fd, 0, SEEK_SET) != 0)
+		return error_errno("failed to rewind descriptor of %s", path);
+	if (ftruncate(fd, 0))
+		return error_errno("failed to truncate file %s", path);
+	return 0;
+}
+
+static int write_pc_item_to_fd(struct parallel_checkout_item *pc_item, int fd,
+			       const char *path)
+{
+	int ret;
+	struct stream_filter *filter;
+	struct strbuf buf = STRBUF_INIT;
+	char *new_blob;
+	unsigned long size;
+	size_t newsize = 0;
+	ssize_t wrote;
+
+	/* Sanity check */
+	assert(is_eligible_for_parallel_checkout(pc_item->ce, &pc_item->ca));
+
+	filter = get_stream_filter_ca(&pc_item->ca, &pc_item->ce->oid);
+	if (filter) {
+		if (stream_blob_to_fd(fd, &pc_item->ce->oid, filter, 1)) {
+			/* On error, reset fd to try writing without streaming */
+			if (reset_fd(fd, path))
+				return -1;
+		} else {
+			return 0;
+		}
+	}
+
+	new_blob = read_blob_entry(pc_item->ce, &size);
+	if (!new_blob)
+		return error("unable to read sha1 file of %s (%s)", path,
+			     oid_to_hex(&pc_item->ce->oid));
+
+	/*
+	 * checkout metadata is used to give context for external process
+	 * filters. Files requiring such filters are not eligible for parallel
+	 * checkout, so pass NULL.
+	 */
+	ret = convert_to_working_tree_ca(&pc_item->ca, pc_item->ce->name,
+					 new_blob, size, &buf, NULL);
+
+	if (ret) {
+		free(new_blob);
+		new_blob = strbuf_detach(&buf, &newsize);
+		size = newsize;
+	}
+
+	wrote = write_in_full(fd, new_blob, size);
+	free(new_blob);
+	if (wrote < 0)
+		return error("unable to write file %s", path);
+
+	return 0;
+}
+
+static int close_and_clear(int *fd)
+{
+	int ret = 0;
+
+	if (*fd >= 0) {
+		ret = close(*fd);
+		*fd = -1;
+	}
+
+	return ret;
+}
+
+static int check_leading_dirs(const char *path, int len, int prefix_len)
+{
+	const char *slash = path + len;
+
+	while (slash > path && *slash != '/')
+		slash--;
+
+	return has_dirs_only_path(path, slash - path, prefix_len);
+}
+
+static void write_pc_item(struct parallel_checkout_item *pc_item,
+			  struct checkout *state)
+{
+	unsigned int mode = (pc_item->ce->ce_mode & 0100) ? 0777 : 0666;
+	int fd = -1, fstat_done = 0;
+	struct strbuf path = STRBUF_INIT;
+
+	strbuf_add(&path, state->base_dir, state->base_dir_len);
+	strbuf_add(&path, pc_item->ce->name, pc_item->ce->ce_namelen);
+
+	/*
+	 * At this point, leading dirs should have already been created. But if
+	 * a symlink being checked out has collided with one of the dirs, due to
+	 * file system folding rules, it's possible that the dirs are no longer
+	 * present. So we have to check again, and report any path collisions.
+	 */
+	if (!check_leading_dirs(path.buf, path.len, state->base_dir_len)) {
+		pc_item->status = PC_ITEM_COLLIDED;
+		goto out;
+	}
+
+	fd = open(path.buf, O_WRONLY | O_CREAT | O_EXCL, mode);
+
+	if (fd < 0) {
+		if (errno == EEXIST || errno == EISDIR) {
+			/*
+			 * Errors which probably represent a path collision.
+			 * Suppress the error message and mark the item to be
+			 * retried later, sequentially. ENOTDIR and ENOENT are
+			 * also interesting, but check_leading_dirs() should
+			 * have already caught these cases.
+			 */
+			pc_item->status = PC_ITEM_COLLIDED;
+		} else {
+			error_errno("failed to open file %s", path.buf);
+			pc_item->status = PC_ITEM_FAILED;
+		}
+		goto out;
+	}
+
+	if (write_pc_item_to_fd(pc_item, fd, path.buf)) {
+		/* Error was already reported. */
+		pc_item->status = PC_ITEM_FAILED;
+		goto out;
+	}
+
+	fstat_done = fstat_checkout_output(fd, state, &pc_item->st);
+
+	if (close_and_clear(&fd)) {
+		error_errno("unable to close file %s", path.buf);
+		pc_item->status = PC_ITEM_FAILED;
+		goto out;
+	}
+
+	if (state->refresh_cache && !fstat_done && lstat(path.buf, &pc_item->st) < 0) {
+		error_errno("unable to stat just-written file %s", path.buf);
+		pc_item->status = PC_ITEM_FAILED;
+		goto out;
+	}
+
+	pc_item->status = PC_ITEM_WRITTEN;
+
+out:
+	/*
+	 * No need to check close() return. At this point, either fd is already
+	 * closed, or we are on an error path, that has already been reported.
+	 */
+	close_and_clear(&fd);
+	strbuf_release(&path);
+}
+
+static void write_items_sequentially(struct checkout *state)
+{
+	size_t i;
+
+	for (i = 0; i < parallel_checkout.nr; ++i)
+		write_pc_item(&parallel_checkout.items[i], state);
+}
+
+int run_parallel_checkout(struct checkout *state)
+{
+	int ret;
+
+	if (parallel_checkout.status != PC_ACCEPTING_ENTRIES)
+		BUG("cannot run parallel checkout: uninitialized or already running");
+
+	parallel_checkout.status = PC_RUNNING;
+
+	write_items_sequentially(state);
+	ret = handle_results(state);
+
+	finish_parallel_checkout();
+	return ret;
+}
diff --git a/parallel-checkout.h b/parallel-checkout.h
new file mode 100644
index 0000000000..e6d6fc01ea
--- /dev/null
+++ b/parallel-checkout.h
@@ -0,0 +1,27 @@
+#ifndef PARALLEL_CHECKOUT_H
+#define PARALLEL_CHECKOUT_H
+
+struct cache_entry;
+struct checkout;
+struct conv_attrs;
+
+enum pc_status {
+	PC_UNINITIALIZED = 0,
+	PC_ACCEPTING_ENTRIES,
+	PC_RUNNING,
+};
+
+enum pc_status parallel_checkout_status(void);
+void init_parallel_checkout(void);
+
+/*
+ * Return -1 if parallel checkout is currently not enabled or if the entry is
+ * not eligible for parallel checkout. Otherwise, enqueue the entry for later
+ * write and return 0.
+ */
+int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca);
+
+/* Write all the queued entries, returning 0 on success. */
+int run_parallel_checkout(struct checkout *state);
+
+#endif /* PARALLEL_CHECKOUT_H */
diff --git a/unpack-trees.c b/unpack-trees.c
index a511fadd89..1b1da7485a 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -17,6 +17,7 @@
 #include "object-store.h"
 #include "promisor-remote.h"
 #include "entry.h"
+#include "parallel-checkout.h"
 
 /*
  * Error messages expected by scripts out of plumbing commands such as
@@ -438,7 +439,6 @@ static int check_updates(struct unpack_trees_options *o,
 	if (should_update_submodules())
 		load_gitmodules_file(index, &state);
 
-	enable_delayed_checkout(&state);
 	if (has_promisor_remote()) {
 		/*
 		 * Prefetch the objects that are to be checked out in the loop
@@ -461,6 +461,9 @@ static int check_updates(struct unpack_trees_options *o,
 					   to_fetch.oid, to_fetch.nr);
 		oid_array_clear(&to_fetch);
 	}
+
+	enable_delayed_checkout(&state);
+	init_parallel_checkout();
 	for (i = 0; i < index->cache_nr; i++) {
 		struct cache_entry *ce = index->cache[i];
 
@@ -474,6 +477,7 @@ static int check_updates(struct unpack_trees_options *o,
 		}
 	}
 	stop_progress(&progress);
+	errs |= run_parallel_checkout(&state);
 	errs |= finish_delayed_checkout(&state, NULL);
 	git_attr_set_direction(GIT_ATTR_CHECKIN);
 
-- 
2.28.0


^ permalink raw reply	[flat|nested] 154+ messages in thread

* [PATCH v3 11/19] parallel-checkout: make it truly parallel
  2020-10-29  2:14   ` [PATCH v3 00/19] Parallel Checkout (part I) Matheus Tavares
                       ` (9 preceding siblings ...)
  2020-10-29  2:14     ` [PATCH v3 10/19] unpack-trees: add basic support for parallel checkout Matheus Tavares
@ 2020-10-29  2:14     ` Matheus Tavares
  2020-10-29  2:14     ` [PATCH v3 12/19] parallel-checkout: support progress displaying Matheus Tavares
                       ` (10 subsequent siblings)
  21 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-10-29  2:14 UTC (permalink / raw)
  To: git; +Cc: gitster, git, chriscool, peff, newren, jrnieder, martin.agren

Use multiple worker processes to distribute the queued entries and call
write_pc_item() in parallel for them. The items are distributed
uniformly in contiguous chunks. This minimizes the chances of two
workers writing to the same directory simultaneously, which could
affect performance due to lock contention in the kernel. Work stealing
(or any other form of re-distribution) is not implemented yet.

The parallel version was benchmarked during three operations in the
linux repo, with cold cache: cloning v5.8, checking out v5.8 from
v2.6.15 (checkout I) and checking out v5.8 from v5.7 (checkout II). The
four tables below show the mean run times and standard deviations for
5 runs in: a local file system with SSD, a local file system with HDD, a
Linux NFS server, and Amazon EFS. The numbers of workers were chosen
based on what produces the best result for each case.

Local SSD:

            Clone                  Checkout I             Checkout II
Sequential  8.171 s ± 0.206 s      8.735 s ± 0.230 s      4.166 s ± 0.246 s
10 workers  3.277 s ± 0.138 s      3.774 s ± 0.188 s      2.561 s ± 0.120 s
Speedup     2.49 ± 0.12            2.31 ± 0.13            1.63 ± 0.12

Local HDD:

            Clone                  Checkout I             Checkout II
Sequential  35.157 s ± 0.205 s     48.835 s ± 0.407 s     47.302 s ± 1.435 s
8 workers   35.538 s ± 0.325 s     49.353 s ± 0.826 s     48.919 s ± 0.416 s
Speedup     0.99 ± 0.01            0.99 ± 0.02            0.97 ± 0.03

Linux NFS server (v4.1, on EBS, single availability zone):

            Clone                  Checkout I             Checkout II
Sequential  216.070 s ± 3.611 s    211.169 s ± 3.147 s    57.446 s ± 1.301 s
32 workers  67.997 s ± 0.740 s     66.563 s ± 0.457 s     23.708 s ± 0.622 s
Speedup     3.18 ± 0.06            3.17 ± 0.05            2.42 ± 0.08

EFS (v4.1, replicated over multiple availability zones):

            Clone                  Checkout I             Checkout II
Sequential  1249.329 s ± 13.857 s  1438.979 s ± 78.792 s  543.919 s ± 18.745 s
64 workers  225.864 s ± 12.433 s   316.345 s ± 1.887 s    183.648 s ± 10.095 s
Speedup     5.53 ± 0.31            4.55 ± 0.25            2.96 ± 0.19

The above benchmarks show that parallel checkout is most effective on
repositories located on an SSD or over a distributed file system. For
local file systems on spinning disks, and/or older machines, the
parallelism does not always bring good performance, and can even
increase the run time. For this reason, the sequential code is
still the default. Two settings are added to optionally enable and
configure the new parallel version as desired.

Local SSD tests were executed in an i7-7700HQ (4 cores with
hyper-threading) running Manjaro Linux. Local HDD tests were executed in
an i7-2600 (also 4 cores with hyper-threading), HDD Seagate Barracuda
7200 rpm SATA 3.0, running Debian 9.13. NFS and EFS tests were
executed on an Amazon EC2 c5n.large instance with 2 vCPUs. The Linux
NFS server ran on an m6g.large instance with a 1 TB EBS GP2 volume.
Before each timing, the linux repository was removed (or checked back
out), and `sync && sysctl vm.drop_caches=3` was executed.

Co-authored-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Co-authored-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 .gitignore                        |   1 +
 Documentation/config/checkout.txt |  21 +++
 Makefile                          |   1 +
 builtin.h                         |   1 +
 builtin/checkout--helper.c        | 142 +++++++++++++++
 git.c                             |   2 +
 parallel-checkout.c               | 280 +++++++++++++++++++++++++++---
 parallel-checkout.h               |  84 ++++++++-
 unpack-trees.c                    |  10 +-
 9 files changed, 508 insertions(+), 34 deletions(-)
 create mode 100644 builtin/checkout--helper.c

diff --git a/.gitignore b/.gitignore
index 6232d33924..1a341ea184 100644
--- a/.gitignore
+++ b/.gitignore
@@ -33,6 +33,7 @@
 /git-check-mailmap
 /git-check-ref-format
 /git-checkout
+/git-checkout--helper
 /git-checkout-index
 /git-cherry
 /git-cherry-pick
diff --git a/Documentation/config/checkout.txt b/Documentation/config/checkout.txt
index 6b646813ab..23e8f7cde0 100644
--- a/Documentation/config/checkout.txt
+++ b/Documentation/config/checkout.txt
@@ -16,3 +16,24 @@ will checkout the '<something>' branch on another remote,
 and by linkgit:git-worktree[1] when 'git worktree add' refers to a
 remote branch. This setting might be used for other checkout-like
 commands or functionality in the future.
+
+checkout.workers::
+	The number of parallel workers to use when updating the working tree.
+	The default is one, i.e. sequential execution. If set to a value less
+	than one, Git will use as many workers as the number of logical cores
+	available. This setting and `checkout.thresholdForParallelism` affect
+	all commands that perform checkout. E.g. checkout, clone, reset,
+	sparse-checkout, etc.
++
+Note: parallel checkout usually delivers better performance for repositories
+located on SSDs or over NFS. For repositories on spinning disks and/or machines
+with a small number of cores, the default sequential checkout often performs
+better. The size and compression level of a repository might also influence how
+well the parallel version performs.
+
+checkout.thresholdForParallelism::
+	When running parallel checkout with a small number of files, the cost
+	of subprocess spawning and inter-process communication might outweigh
+	the parallelization gains. This setting allows defining the minimum
+	number of files for which parallel checkout should be attempted. The
+	default is 100.
diff --git a/Makefile b/Makefile
index 10ee5e709b..535e6e94aa 100644
--- a/Makefile
+++ b/Makefile
@@ -1063,6 +1063,7 @@ BUILTIN_OBJS += builtin/check-attr.o
 BUILTIN_OBJS += builtin/check-ignore.o
 BUILTIN_OBJS += builtin/check-mailmap.o
 BUILTIN_OBJS += builtin/check-ref-format.o
+BUILTIN_OBJS += builtin/checkout--helper.o
 BUILTIN_OBJS += builtin/checkout-index.o
 BUILTIN_OBJS += builtin/checkout.o
 BUILTIN_OBJS += builtin/clean.o
diff --git a/builtin.h b/builtin.h
index 53fb290963..2abbe14b0b 100644
--- a/builtin.h
+++ b/builtin.h
@@ -123,6 +123,7 @@ int cmd_bugreport(int argc, const char **argv, const char *prefix);
 int cmd_bundle(int argc, const char **argv, const char *prefix);
 int cmd_cat_file(int argc, const char **argv, const char *prefix);
 int cmd_checkout(int argc, const char **argv, const char *prefix);
+int cmd_checkout__helper(int argc, const char **argv, const char *prefix);
 int cmd_checkout_index(int argc, const char **argv, const char *prefix);
 int cmd_check_attr(int argc, const char **argv, const char *prefix);
 int cmd_check_ignore(int argc, const char **argv, const char *prefix);
diff --git a/builtin/checkout--helper.c b/builtin/checkout--helper.c
new file mode 100644
index 0000000000..67fe37cf11
--- /dev/null
+++ b/builtin/checkout--helper.c
@@ -0,0 +1,142 @@
+#include "builtin.h"
+#include "config.h"
+#include "entry.h"
+#include "parallel-checkout.h"
+#include "parse-options.h"
+#include "pkt-line.h"
+
+static void packet_to_pc_item(char *line, int len,
+			      struct parallel_checkout_item *pc_item)
+{
+	struct pc_item_fixed_portion *fixed_portion;
+	char *encoding, *variant;
+
+	if (len < sizeof(struct pc_item_fixed_portion))
+		BUG("checkout worker received too short item (got %dB, exp %dB)",
+		    len, (int)sizeof(struct pc_item_fixed_portion));
+
+	fixed_portion = (struct pc_item_fixed_portion *)line;
+
+	if (len - sizeof(struct pc_item_fixed_portion) !=
+		fixed_portion->name_len + fixed_portion->working_tree_encoding_len)
+		BUG("checkout worker received corrupted item");
+
+	variant = line + sizeof(struct pc_item_fixed_portion);
+
+	/*
+	 * Note: the main process uses zero length to communicate that the
+	 * encoding is NULL. There is no use case in actually sending an empty
+	 * string since it's considered as NULL when ca.working_tree_encoding
+	 * is set at git_path_check_encoding().
+	 */
+	if (fixed_portion->working_tree_encoding_len) {
+		encoding = xmemdupz(variant,
+				    fixed_portion->working_tree_encoding_len);
+		variant += fixed_portion->working_tree_encoding_len;
+	} else {
+		encoding = NULL;
+	}
+
+	memset(pc_item, 0, sizeof(*pc_item));
+	pc_item->ce = make_empty_transient_cache_entry(fixed_portion->name_len);
+	pc_item->ce->ce_namelen = fixed_portion->name_len;
+	pc_item->ce->ce_mode = fixed_portion->ce_mode;
+	memcpy(pc_item->ce->name, variant, pc_item->ce->ce_namelen);
+	oidcpy(&pc_item->ce->oid, &fixed_portion->oid);
+
+	pc_item->id = fixed_portion->id;
+	pc_item->ca.crlf_action = fixed_portion->crlf_action;
+	pc_item->ca.ident = fixed_portion->ident;
+	pc_item->ca.working_tree_encoding = encoding;
+}
+
+static void report_result(struct parallel_checkout_item *pc_item)
+{
+	struct pc_item_result res = { 0 };
+	size_t size;
+
+	res.id = pc_item->id;
+	res.status = pc_item->status;
+
+	if (pc_item->status == PC_ITEM_WRITTEN) {
+		res.st = pc_item->st;
+		size = sizeof(res);
+	} else {
+		size = PC_ITEM_RESULT_BASE_SIZE;
+	}
+
+	packet_write(1, (const char *)&res, size);
+}
+
+/* Free the worker-side malloced data, but not pc_item itself. */
+static void release_pc_item_data(struct parallel_checkout_item *pc_item)
+{
+	free((char *)pc_item->ca.working_tree_encoding);
+	discard_cache_entry(pc_item->ce);
+}
+
+static void worker_loop(struct checkout *state)
+{
+	struct parallel_checkout_item *items = NULL;
+	size_t i, nr = 0, alloc = 0;
+
+	while (1) {
+		int len;
+		char *line = packet_read_line(0, &len);
+
+		if (!line)
+			break;
+
+		ALLOC_GROW(items, nr + 1, alloc);
+		packet_to_pc_item(line, len, &items[nr++]);
+	}
+
+	for (i = 0; i < nr; ++i) {
+		struct parallel_checkout_item *pc_item = &items[i];
+		write_pc_item(pc_item, state);
+		report_result(pc_item);
+		release_pc_item_data(pc_item);
+	}
+
+	packet_flush(1);
+
+	free(items);
+}
+
+static const char * const checkout_helper_usage[] = {
+	N_("git checkout--helper [<options>]"),
+	NULL
+};
+
+int cmd_checkout__helper(int argc, const char **argv, const char *prefix)
+{
+	struct checkout state = CHECKOUT_INIT;
+	struct option checkout_helper_options[] = {
+		OPT_STRING(0, "prefix", &state.base_dir, N_("string"),
+			N_("when creating files, prepend <string>")),
+		OPT_END()
+	};
+
+	if (argc == 2 && !strcmp(argv[1], "-h"))
+		usage_with_options(checkout_helper_usage,
+				   checkout_helper_options);
+
+	git_config(git_default_config, NULL);
+	argc = parse_options(argc, argv, prefix, checkout_helper_options,
+			     checkout_helper_usage, 0);
+	if (argc > 0)
+		usage_with_options(checkout_helper_usage, checkout_helper_options);
+
+	if (state.base_dir)
+		state.base_dir_len = strlen(state.base_dir);
+
+	/*
+	 * Setting this on worker won't actually update the index. We just need
+	 * to pretend so to induce the checkout machinery to stat() the written
+	 * entries.
+	 */
+	state.refresh_cache = 1;
+
+	worker_loop(&state);
+	return 0;
+}
diff --git a/git.c b/git.c
index 4bdcdad2cc..384f144593 100644
--- a/git.c
+++ b/git.c
@@ -487,6 +487,8 @@ static struct cmd_struct commands[] = {
 	{ "check-mailmap", cmd_check_mailmap, RUN_SETUP },
 	{ "check-ref-format", cmd_check_ref_format, NO_PARSEOPT  },
 	{ "checkout", cmd_checkout, RUN_SETUP | NEED_WORK_TREE },
+	{ "checkout--helper", cmd_checkout__helper,
+		RUN_SETUP | NEED_WORK_TREE | SUPPORT_SUPER_PREFIX },
 	{ "checkout-index", cmd_checkout_index,
 		RUN_SETUP | NEED_WORK_TREE},
 	{ "cherry", cmd_cherry, RUN_SETUP },
diff --git a/parallel-checkout.c b/parallel-checkout.c
index 981dbe6ff3..a5508e27c2 100644
--- a/parallel-checkout.c
+++ b/parallel-checkout.c
@@ -1,28 +1,15 @@
 #include "cache.h"
 #include "entry.h"
 #include "parallel-checkout.h"
+#include "pkt-line.h"
+#include "run-command.h"
 #include "streaming.h"
+#include "thread-utils.h"
+#include "config.h"
 
-enum pc_item_status {
-	PC_ITEM_PENDING = 0,
-	PC_ITEM_WRITTEN,
-	/*
-	 * The entry could not be written because there was another file
-	 * already present in its path or leading directories. Since
-	 * checkout_entry_ca() removes such files from the working tree before
-	 * enqueueing the entry for parallel checkout, it means that there was
-	 * a path collision among the entries being written.
-	 */
-	PC_ITEM_COLLIDED,
-	PC_ITEM_FAILED,
-};
-
-struct parallel_checkout_item {
-	/* pointer to a istate->cache[] entry. Not owned by us. */
-	struct cache_entry *ce;
-	struct conv_attrs ca;
-	struct stat st;
-	enum pc_item_status status;
+struct pc_worker {
+	struct child_process cp;
+	size_t next_to_complete, nr_to_complete;
 };
 
 struct parallel_checkout {
@@ -38,6 +25,19 @@ enum pc_status parallel_checkout_status(void)
 	return parallel_checkout.status;
 }
 
+#define DEFAULT_THRESHOLD_FOR_PARALLELISM 100
+
+void get_parallel_checkout_configs(int *num_workers, int *threshold)
+{
+	if (git_config_get_int("checkout.workers", num_workers))
+		*num_workers = 1;
+	else if (*num_workers < 1)
+		*num_workers = online_cpus();
+
+	if (git_config_get_int("checkout.thresholdForParallelism", threshold))
+		*threshold = DEFAULT_THRESHOLD_FOR_PARALLELISM;
+}
+
 void init_parallel_checkout(void)
 {
 	if (parallel_checkout.status != PC_UNINITIALIZED)
@@ -115,10 +115,12 @@ int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca)
 	ALLOC_GROW(parallel_checkout.items, parallel_checkout.nr + 1,
 		   parallel_checkout.alloc);
 
-	pc_item = &parallel_checkout.items[parallel_checkout.nr++];
+	pc_item = &parallel_checkout.items[parallel_checkout.nr];
 	pc_item->ce = ce;
 	memcpy(&pc_item->ca, ca, sizeof(pc_item->ca));
 	pc_item->status = PC_ITEM_PENDING;
+	pc_item->id = parallel_checkout.nr;
+	parallel_checkout.nr++;
 
 	return 0;
 }
@@ -231,7 +233,8 @@ static int write_pc_item_to_fd(struct parallel_checkout_item *pc_item, int fd,
 	/*
 	 * checkout metadata is used to give context for external process
 	 * filters. Files requiring such filters are not eligible for parallel
-	 * checkout, so pass NULL.
+	 * checkout, so pass NULL. Note: if that changes, the metadata must also
+	 * be passed from the main process to the workers.
 	 */
 	ret = convert_to_working_tree_ca(&pc_item->ca, pc_item->ce->name,
 					 new_blob, size, &buf, NULL);
@@ -272,8 +275,8 @@ static int check_leading_dirs(const char *path, int len, int prefix_len)
 	return has_dirs_only_path(path, slash - path, prefix_len);
 }
 
-static void write_pc_item(struct parallel_checkout_item *pc_item,
-			  struct checkout *state)
+void write_pc_item(struct parallel_checkout_item *pc_item,
+		   struct checkout *state)
 {
 	unsigned int mode = (pc_item->ce->ce_mode & 0100) ? 0777 : 0666;
 	int fd = -1, fstat_done = 0;
@@ -343,6 +346,221 @@ static void write_pc_item(struct parallel_checkout_item *pc_item,
 	strbuf_release(&path);
 }
 
+static void send_one_item(int fd, struct parallel_checkout_item *pc_item)
+{
+	size_t len_data;
+	char *data, *variant;
+	struct pc_item_fixed_portion *fixed_portion;
+	const char *working_tree_encoding = pc_item->ca.working_tree_encoding;
+	size_t name_len = pc_item->ce->ce_namelen;
+	size_t working_tree_encoding_len = working_tree_encoding ?
+					   strlen(working_tree_encoding) : 0;
+
+	len_data = sizeof(struct pc_item_fixed_portion) + name_len +
+		   working_tree_encoding_len;
+
+	data = xcalloc(1, len_data);
+
+	fixed_portion = (struct pc_item_fixed_portion *)data;
+	fixed_portion->id = pc_item->id;
+	fixed_portion->ce_mode = pc_item->ce->ce_mode;
+	fixed_portion->crlf_action = pc_item->ca.crlf_action;
+	fixed_portion->ident = pc_item->ca.ident;
+	fixed_portion->name_len = name_len;
+	fixed_portion->working_tree_encoding_len = working_tree_encoding_len;
+	/*
+	 * We use hashcpy() instead of oidcpy() because the hash[] positions
+	 * after `the_hash_algo->rawsz` might not be initialized. And Valgrind
+	 * would complain about passing uninitialized bytes to a syscall
+	 * (write(2)). There is no real harm in this case, but the warning could
+	 * hinder the detection of actual errors.
+	 */
+	hashcpy(fixed_portion->oid.hash, pc_item->ce->oid.hash);
+
+	variant = data + sizeof(*fixed_portion);
+	if (working_tree_encoding_len) {
+		memcpy(variant, working_tree_encoding, working_tree_encoding_len);
+		variant += working_tree_encoding_len;
+	}
+	memcpy(variant, pc_item->ce->name, name_len);
+
+	packet_write(fd, data, len_data);
+
+	free(data);
+}
+
+static void send_batch(int fd, size_t start, size_t nr)
+{
+	size_t i;
+	for (i = 0; i < nr; ++i)
+		send_one_item(fd, &parallel_checkout.items[start + i]);
+	packet_flush(fd);
+}
+
+static struct pc_worker *setup_workers(struct checkout *state, int num_workers)
+{
+	struct pc_worker *workers;
+	int i, workers_with_one_extra_item;
+	size_t base_batch_size, next_to_assign = 0;
+
+	ALLOC_ARRAY(workers, num_workers);
+
+	for (i = 0; i < num_workers; ++i) {
+		struct child_process *cp = &workers[i].cp;
+
+		child_process_init(cp);
+		cp->git_cmd = 1;
+		cp->in = -1;
+		cp->out = -1;
+		cp->clean_on_exit = 1;
+		strvec_push(&cp->args, "checkout--helper");
+		if (state->base_dir_len)
+			strvec_pushf(&cp->args, "--prefix=%s", state->base_dir);
+		if (start_command(cp))
+			die(_("failed to spawn checkout worker"));
+	}
+
+	base_batch_size = parallel_checkout.nr / num_workers;
+	workers_with_one_extra_item = parallel_checkout.nr % num_workers;
+
+	for (i = 0; i < num_workers; ++i) {
+		struct pc_worker *worker = &workers[i];
+		size_t batch_size = base_batch_size;
+
+		/* distribute the extra work evenly */
+		if (i < workers_with_one_extra_item)
+			batch_size++;
+
+		send_batch(worker->cp.in, next_to_assign, batch_size);
+		worker->next_to_complete = next_to_assign;
+		worker->nr_to_complete = batch_size;
+
+		next_to_assign += batch_size;
+	}
+
+	return workers;
+}
+
+static void finish_workers(struct pc_worker *workers, int num_workers)
+{
+	int i;
+
+	/*
+	 * Close pipes before calling finish_command() to let the workers
+	 * exit asynchronously and avoid spending extra time on wait().
+	 */
+	for (i = 0; i < num_workers; ++i) {
+		struct child_process *cp = &workers[i].cp;
+		if (cp->in >= 0)
+			close(cp->in);
+		if (cp->out >= 0)
+			close(cp->out);
+	}
+
+	for (i = 0; i < num_workers; ++i) {
+		if (finish_command(&workers[i].cp))
+			error(_("checkout worker %d finished with error"), i);
+	}
+
+	free(workers);
+}
+
+#define ASSERT_PC_ITEM_RESULT_SIZE(got, exp) \
+do { \
+	if (got != exp) \
+		BUG("corrupted result from checkout worker (got %dB, exp %dB)", \
+		    got, exp); \
+} while (0)
+
+static void parse_and_save_result(const char *line, int len,
+				  struct pc_worker *worker)
+{
+	struct pc_item_result *res;
+	struct parallel_checkout_item *pc_item;
+	struct stat *st = NULL;
+
+	if (len < PC_ITEM_RESULT_BASE_SIZE)
+		BUG("too short result from checkout worker (got %dB, exp %dB)",
+		    len, (int)PC_ITEM_RESULT_BASE_SIZE);
+
+	res = (struct pc_item_result *)line;
+
+	/*
+	 * A worker should send either the full result struct on success, or
+	 * just the base (i.e. no stat data) otherwise.
+	 */
+	if (res->status == PC_ITEM_WRITTEN) {
+		ASSERT_PC_ITEM_RESULT_SIZE(len, (int)sizeof(struct pc_item_result));
+		st = &res->st;
+	} else {
+		ASSERT_PC_ITEM_RESULT_SIZE(len, (int)PC_ITEM_RESULT_BASE_SIZE);
+	}
+
+	if (!worker->nr_to_complete || res->id != worker->next_to_complete)
+		BUG("checkout worker sent unexpected item id");
+
+	worker->next_to_complete++;
+	worker->nr_to_complete--;
+
+	pc_item = &parallel_checkout.items[res->id];
+	pc_item->status = res->status;
+	if (st)
+		pc_item->st = *st;
+}
+
+
+static void gather_results_from_workers(struct pc_worker *workers,
+					int num_workers)
+{
+	int i, active_workers = num_workers;
+	struct pollfd *pfds;
+
+	CALLOC_ARRAY(pfds, num_workers);
+	for (i = 0; i < num_workers; ++i) {
+		pfds[i].fd = workers[i].cp.out;
+		pfds[i].events = POLLIN;
+	}
+
+	while (active_workers) {
+		int nr = poll(pfds, num_workers, -1);
+
+		if (nr < 0) {
+			if (errno == EINTR)
+				continue;
+			die_errno("failed to poll checkout workers");
+		}
+
+		for (i = 0; i < num_workers && nr > 0; ++i) {
+			struct pc_worker *worker = &workers[i];
+			struct pollfd *pfd = &pfds[i];
+
+			if (!pfd->revents)
+				continue;
+
+			if (pfd->revents & POLLIN) {
+				int len;
+				const char *line = packet_read_line(pfd->fd, &len);
+
+				if (!line) {
+					pfd->fd = -1;
+					active_workers--;
+				} else {
+					parse_and_save_result(line, len, worker);
+				}
+			} else if (pfd->revents & POLLHUP) {
+				pfd->fd = -1;
+				active_workers--;
+			} else if (pfd->revents & (POLLNVAL | POLLERR)) {
+				die(_("error polling from checkout worker"));
+			}
+
+			nr--;
+		}
+	}
+
+	free(pfds);
+}
+
 static void write_items_sequentially(struct checkout *state)
 {
 	size_t i;
@@ -351,7 +569,7 @@ static void write_items_sequentially(struct checkout *state)
 		write_pc_item(&parallel_checkout.items[i], state);
 }
 
-int run_parallel_checkout(struct checkout *state)
+int run_parallel_checkout(struct checkout *state, int num_workers, int threshold)
 {
 	int ret;
 
@@ -360,7 +578,17 @@ int run_parallel_checkout(struct checkout *state)
 
 	parallel_checkout.status = PC_RUNNING;
 
-	write_items_sequentially(state);
+	if (parallel_checkout.nr < num_workers)
+		num_workers = parallel_checkout.nr;
+
+	if (num_workers <= 1 || parallel_checkout.nr < threshold) {
+		write_items_sequentially(state);
+	} else {
+		struct pc_worker *workers = setup_workers(state, num_workers);
+		gather_results_from_workers(workers, num_workers);
+		finish_workers(workers, num_workers);
+	}
+
 	ret = handle_results(state);
 
 	finish_parallel_checkout();
diff --git a/parallel-checkout.h b/parallel-checkout.h
index e6d6fc01ea..0c9984584e 100644
--- a/parallel-checkout.h
+++ b/parallel-checkout.h
@@ -1,9 +1,12 @@
 #ifndef PARALLEL_CHECKOUT_H
 #define PARALLEL_CHECKOUT_H
 
-struct cache_entry;
-struct checkout;
-struct conv_attrs;
+#include "entry.h"
+#include "convert.h"
+
+/****************************************************************
+ * Users of parallel checkout
+ ****************************************************************/
 
 enum pc_status {
 	PC_UNINITIALIZED = 0,
@@ -12,6 +15,7 @@ enum pc_status {
 };
 
 enum pc_status parallel_checkout_status(void);
+void get_parallel_checkout_configs(int *num_workers, int *threshold);
 void init_parallel_checkout(void);
 
 /*
@@ -21,7 +25,77 @@ void init_parallel_checkout(void);
  */
 int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca);
 
-/* Write all the queued entries, returning 0 on success.*/
-int run_parallel_checkout(struct checkout *state);
+/*
+ * Write all the queued entries, returning 0 on success. If the number of
+ * entries is smaller than the specified threshold, the operation is performed
+ * sequentially.
+ */
+int run_parallel_checkout(struct checkout *state, int num_workers, int threshold);
+
+/****************************************************************
+ * Interface with checkout--helper
+ ****************************************************************/
+
+enum pc_item_status {
+	PC_ITEM_PENDING = 0,
+	PC_ITEM_WRITTEN,
+	/*
+	 * The entry could not be written because there was another file
+	 * already present in its path or leading directories. Since
+	 * checkout_entry_ca() removes such files from the working tree before
+	 * enqueueing the entry for parallel checkout, it means that there was
+	 * a path collision among the entries being written.
+	 */
+	PC_ITEM_COLLIDED,
+	PC_ITEM_FAILED,
+};
+
+struct parallel_checkout_item {
+	/*
+	 * In the main process, ce points to an istate->cache[] entry, so it is
+	 * not owned by us. In the workers, ce is owned and *must be* released.
+	 */
+	struct cache_entry *ce;
+	struct conv_attrs ca;
+	size_t id; /* position in parallel_checkout.items[] of main process */
+
+	/* Output fields, sent from workers. */
+	enum pc_item_status status;
+	struct stat st;
+};
+
+/*
+ * The fixed-size portion of `struct parallel_checkout_item` that is sent to the
+ * workers. It is followed by two strings: ca.working_tree_encoding and
+ * ce.name. These are NOT null-terminated, since their sizes are stored in
+ * the fixed portion.
+ *
+ * Note that not all fields of conv_attrs and cache_entry are passed, only the
+ * ones that will be required by the workers to smudge and write the entry.
+ */
+struct pc_item_fixed_portion {
+	size_t id;
+	struct object_id oid;
+	unsigned int ce_mode;
+	enum crlf_action crlf_action;
+	int ident;
+	size_t working_tree_encoding_len;
+	size_t name_len;
+};
+
+/*
+ * The fields of `struct parallel_checkout_item` that are returned by the
+ * workers. Note: `st` must be the last one, as it is omitted on error.
+ */
+struct pc_item_result {
+	size_t id;
+	enum pc_item_status status;
+	struct stat st;
+};
+
+#define PC_ITEM_RESULT_BASE_SIZE offsetof(struct pc_item_result, st)
+
+void write_pc_item(struct parallel_checkout_item *pc_item,
+		   struct checkout *state);
 
 #endif /* PARALLEL_CHECKOUT_H */
diff --git a/unpack-trees.c b/unpack-trees.c
index 1b1da7485a..117ed42370 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -399,7 +399,7 @@ static int check_updates(struct unpack_trees_options *o,
 	int errs = 0;
 	struct progress *progress;
 	struct checkout state = CHECKOUT_INIT;
-	int i;
+	int i, pc_workers, pc_threshold;
 
 	trace_performance_enter();
 	state.force = 1;
@@ -462,8 +462,11 @@ static int check_updates(struct unpack_trees_options *o,
 		oid_array_clear(&to_fetch);
 	}
 
+	get_parallel_checkout_configs(&pc_workers, &pc_threshold);
+
 	enable_delayed_checkout(&state);
-	init_parallel_checkout();
+	if (pc_workers > 1)
+		init_parallel_checkout();
 	for (i = 0; i < index->cache_nr; i++) {
 		struct cache_entry *ce = index->cache[i];
 
@@ -477,7 +480,8 @@ static int check_updates(struct unpack_trees_options *o,
 		}
 	}
 	stop_progress(&progress);
-	errs |= run_parallel_checkout(&state);
+	if (pc_workers > 1)
+		errs |= run_parallel_checkout(&state, pc_workers, pc_threshold);
 	errs |= finish_delayed_checkout(&state, NULL);
 	git_attr_set_direction(GIT_ATTR_CHECKIN);
 
-- 
2.28.0


^ permalink raw reply	[flat|nested] 154+ messages in thread

* [PATCH v3 12/19] parallel-checkout: support progress displaying
  2020-10-29  2:14   ` [PATCH v3 00/19] Parallel Checkout (part I) Matheus Tavares
                       ` (10 preceding siblings ...)
  2020-10-29  2:14     ` [PATCH v3 11/19] parallel-checkout: make it truly parallel Matheus Tavares
@ 2020-10-29  2:14     ` Matheus Tavares
  2020-10-29  2:14     ` [PATCH v3 13/19] make_transient_cache_entry(): optionally alloc from mem_pool Matheus Tavares
                       ` (9 subsequent siblings)
  21 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-10-29  2:14 UTC (permalink / raw)
  To: git; +Cc: gitster, git, chriscool, peff, newren, jrnieder, martin.agren

Original-patch-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 parallel-checkout.c | 34 +++++++++++++++++++++++++++++++---
 parallel-checkout.h |  4 +++-
 unpack-trees.c      | 11 ++++++++---
 3 files changed, 42 insertions(+), 7 deletions(-)

diff --git a/parallel-checkout.c b/parallel-checkout.c
index a5508e27c2..c5c449d224 100644
--- a/parallel-checkout.c
+++ b/parallel-checkout.c
@@ -2,6 +2,7 @@
 #include "entry.h"
 #include "parallel-checkout.h"
 #include "pkt-line.h"
+#include "progress.h"
 #include "run-command.h"
 #include "streaming.h"
 #include "thread-utils.h"
@@ -16,6 +17,8 @@ struct parallel_checkout {
 	enum pc_status status;
 	struct parallel_checkout_item *items;
 	size_t nr, alloc;
+	struct progress *progress;
+	unsigned int *progress_cnt;
 };
 
 static struct parallel_checkout parallel_checkout = { 0 };
@@ -125,6 +128,20 @@ int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca)
 	return 0;
 }
 
+size_t pc_queue_size(void)
+{
+	return parallel_checkout.nr;
+}
+
+static void advance_progress_meter(void)
+{
+	if (parallel_checkout.progress) {
+		(*parallel_checkout.progress_cnt)++;
+		display_progress(parallel_checkout.progress,
+				 *parallel_checkout.progress_cnt);
+	}
+}
+
 static int handle_results(struct checkout *state)
 {
 	int ret = 0;
@@ -173,6 +190,7 @@ static int handle_results(struct checkout *state)
 			 */
 			ret |= checkout_entry_ca(pc_item->ce, &pc_item->ca,
 						 state, NULL, NULL);
+			advance_progress_meter();
 			break;
 		case PC_ITEM_PENDING:
 			have_pending = 1;
@@ -506,6 +524,9 @@ static void parse_and_save_result(const char *line, int len,
 	pc_item->status = res->status;
 	if (st)
 		pc_item->st = *st;
+
+	if (res->status != PC_ITEM_COLLIDED)
+		advance_progress_meter();
 }
 
 
@@ -565,11 +586,16 @@ static void write_items_sequentially(struct checkout *state)
 {
 	size_t i;
 
-	for (i = 0; i < parallel_checkout.nr; ++i)
-		write_pc_item(&parallel_checkout.items[i], state);
+	for (i = 0; i < parallel_checkout.nr; ++i) {
+		struct parallel_checkout_item *pc_item = &parallel_checkout.items[i];
+		write_pc_item(pc_item, state);
+		if (pc_item->status != PC_ITEM_COLLIDED)
+			advance_progress_meter();
+	}
 }
 
-int run_parallel_checkout(struct checkout *state, int num_workers, int threshold)
+int run_parallel_checkout(struct checkout *state, int num_workers, int threshold,
+			  struct progress *progress, unsigned int *progress_cnt)
 {
 	int ret;
 
@@ -577,6 +603,8 @@ int run_parallel_checkout(struct checkout *state, int num_workers, int threshold
 		BUG("cannot run parallel checkout: uninitialized or already running");
 
 	parallel_checkout.status = PC_RUNNING;
+	parallel_checkout.progress = progress;
+	parallel_checkout.progress_cnt = progress_cnt;
 
 	if (parallel_checkout.nr < num_workers)
 		num_workers = parallel_checkout.nr;
diff --git a/parallel-checkout.h b/parallel-checkout.h
index 0c9984584e..6c3a016c0b 100644
--- a/parallel-checkout.h
+++ b/parallel-checkout.h
@@ -24,13 +24,15 @@ void init_parallel_checkout(void);
  * write and return 0.
  */
 int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca);
+size_t pc_queue_size(void);
 
 /*
  * Write all the queued entries, returning 0 on success. If the number of
  * entries is smaller than the specified threshold, the operation is performed
  * sequentially.
  */
-int run_parallel_checkout(struct checkout *state, int num_workers, int threshold);
+int run_parallel_checkout(struct checkout *state, int num_workers, int threshold,
+			  struct progress *progress, unsigned int *progress_cnt);
 
 /****************************************************************
  * Interface with checkout--helper
diff --git a/unpack-trees.c b/unpack-trees.c
index 117ed42370..e05e6ceff2 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -471,17 +471,22 @@ static int check_updates(struct unpack_trees_options *o,
 		struct cache_entry *ce = index->cache[i];
 
 		if (ce->ce_flags & CE_UPDATE) {
+			size_t last_pc_queue_size = pc_queue_size();
+
 			if (ce->ce_flags & CE_WT_REMOVE)
 				BUG("both update and delete flags are set on %s",
 				    ce->name);
-			display_progress(progress, ++cnt);
 			ce->ce_flags &= ~CE_UPDATE;
 			errs |= checkout_entry(ce, &state, NULL, NULL);
+
+			if (last_pc_queue_size == pc_queue_size())
+				display_progress(progress, ++cnt);
 		}
 	}
-	stop_progress(&progress);
 	if (pc_workers > 1)
-		errs |= run_parallel_checkout(&state, pc_workers, pc_threshold);
+		errs |= run_parallel_checkout(&state, pc_workers, pc_threshold,
+					      progress, &cnt);
+	stop_progress(&progress);
 	errs |= finish_delayed_checkout(&state, NULL);
 	git_attr_set_direction(GIT_ATTR_CHECKIN);
 
-- 
2.28.0


* [PATCH v3 13/19] make_transient_cache_entry(): optionally alloc from mem_pool
  2020-10-29  2:14   ` [PATCH v3 00/19] Parallel Checkout (part I) Matheus Tavares
                       ` (11 preceding siblings ...)
  2020-10-29  2:14     ` [PATCH v3 12/19] parallel-checkout: support progress displaying Matheus Tavares
@ 2020-10-29  2:14     ` Matheus Tavares
  2020-10-29  2:14     ` [PATCH v3 14/19] builtin/checkout.c: complete parallel checkout support Matheus Tavares
                       ` (8 subsequent siblings)
  21 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-10-29  2:14 UTC (permalink / raw)
  To: git; +Cc: gitster, git, chriscool, peff, newren, jrnieder, martin.agren

Allow make_transient_cache_entry() to optionally receive a mem_pool
struct in which it should allocate the entry. This will be used in the
following patch to store some transient entries that must persist
until parallel checkout finishes.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 builtin/checkout--helper.c |  2 +-
 builtin/checkout.c         |  2 +-
 builtin/difftool.c         |  2 +-
 cache.h                    | 10 +++++-----
 read-cache.c               | 12 ++++++++----
 unpack-trees.c             |  2 +-
 6 files changed, 17 insertions(+), 13 deletions(-)

diff --git a/builtin/checkout--helper.c b/builtin/checkout--helper.c
index 67fe37cf11..9646ed9eeb 100644
--- a/builtin/checkout--helper.c
+++ b/builtin/checkout--helper.c
@@ -38,7 +38,7 @@ static void packet_to_pc_item(char *line, int len,
 	}
 
 	memset(pc_item, 0, sizeof(*pc_item));
-	pc_item->ce = make_empty_transient_cache_entry(fixed_portion->name_len);
+	pc_item->ce = make_empty_transient_cache_entry(fixed_portion->name_len, NULL);
 	pc_item->ce->ce_namelen = fixed_portion->name_len;
 	pc_item->ce->ce_mode = fixed_portion->ce_mode;
 	memcpy(pc_item->ce->name, variant, pc_item->ce->ce_namelen);
diff --git a/builtin/checkout.c b/builtin/checkout.c
index b18b9d6f3c..c0bf5e6711 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -291,7 +291,7 @@ static int checkout_merged(int pos, const struct checkout *state, int *nr_checko
 	if (write_object_file(result_buf.ptr, result_buf.size, blob_type, &oid))
 		die(_("Unable to add merge result for '%s'"), path);
 	free(result_buf.ptr);
-	ce = make_transient_cache_entry(mode, &oid, path, 2);
+	ce = make_transient_cache_entry(mode, &oid, path, 2, NULL);
 	if (!ce)
 		die(_("make_cache_entry failed for path '%s'"), path);
 	status = checkout_entry(ce, state, NULL, nr_checkouts);
diff --git a/builtin/difftool.c b/builtin/difftool.c
index dfa22b67eb..5e7a57c8c2 100644
--- a/builtin/difftool.c
+++ b/builtin/difftool.c
@@ -323,7 +323,7 @@ static int checkout_path(unsigned mode, struct object_id *oid,
 	struct cache_entry *ce;
 	int ret;
 
-	ce = make_transient_cache_entry(mode, oid, path, 0);
+	ce = make_transient_cache_entry(mode, oid, path, 0, NULL);
 	ret = checkout_entry(ce, state, NULL, NULL);
 
 	discard_cache_entry(ce);
diff --git a/cache.h b/cache.h
index ccfeb9ba2b..b5074b2cb2 100644
--- a/cache.h
+++ b/cache.h
@@ -355,16 +355,16 @@ struct cache_entry *make_empty_cache_entry(struct index_state *istate,
 					   size_t name_len);
 
 /*
- * Create a cache_entry that is not intended to be added to an index.
- * Caller is responsible for discarding the cache_entry
- * with `discard_cache_entry`.
+ * Create a cache_entry that is not intended to be added to an index. If mp is
+ * not NULL, the entry is allocated within the given memory pool. Caller is
+ * responsible for discarding the cache_entry with `discard_cache_entry`.
  */
 struct cache_entry *make_transient_cache_entry(unsigned int mode,
 					       const struct object_id *oid,
 					       const char *path,
-					       int stage);
+					       int stage, struct mem_pool *mp);
 
-struct cache_entry *make_empty_transient_cache_entry(size_t name_len);
+struct cache_entry *make_empty_transient_cache_entry(size_t len, struct mem_pool *mp);
 
 /*
  * Discard cache entry.
diff --git a/read-cache.c b/read-cache.c
index ecf6f68994..f9bac760af 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -813,8 +813,10 @@ struct cache_entry *make_empty_cache_entry(struct index_state *istate, size_t le
 	return mem_pool__ce_calloc(find_mem_pool(istate), len);
 }
 
-struct cache_entry *make_empty_transient_cache_entry(size_t len)
+struct cache_entry *make_empty_transient_cache_entry(size_t len, struct mem_pool *mp)
 {
+	if (mp)
+		return mem_pool__ce_calloc(mp, len);
 	return xcalloc(1, cache_entry_size(len));
 }
 
@@ -848,8 +850,10 @@ struct cache_entry *make_cache_entry(struct index_state *istate,
 	return ret;
 }
 
-struct cache_entry *make_transient_cache_entry(unsigned int mode, const struct object_id *oid,
-					       const char *path, int stage)
+struct cache_entry *make_transient_cache_entry(unsigned int mode,
+					       const struct object_id *oid,
+					       const char *path, int stage,
+					       struct mem_pool *mp)
 {
 	struct cache_entry *ce;
 	int len;
@@ -860,7 +864,7 @@ struct cache_entry *make_transient_cache_entry(unsigned int mode, const struct o
 	}
 
 	len = strlen(path);
-	ce = make_empty_transient_cache_entry(len);
+	ce = make_empty_transient_cache_entry(len, mp);
 
 	oidcpy(&ce->oid, oid);
 	memcpy(ce->name, path, len);
diff --git a/unpack-trees.c b/unpack-trees.c
index e05e6ceff2..dcb40dc8fa 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1031,7 +1031,7 @@ static struct cache_entry *create_ce_entry(const struct traverse_info *info,
 	size_t len = traverse_path_len(info, tree_entry_len(n));
 	struct cache_entry *ce =
 		is_transient ?
-		make_empty_transient_cache_entry(len) :
+		make_empty_transient_cache_entry(len, NULL) :
 		make_empty_cache_entry(istate, len);
 
 	ce->ce_mode = create_ce_mode(n->mode);
-- 
2.28.0


* [PATCH v3 14/19] builtin/checkout.c: complete parallel checkout support
  2020-10-29  2:14   ` [PATCH v3 00/19] Parallel Checkout (part I) Matheus Tavares
                       ` (12 preceding siblings ...)
  2020-10-29  2:14     ` [PATCH v3 13/19] make_transient_cache_entry(): optionally alloc from mem_pool Matheus Tavares
@ 2020-10-29  2:14     ` Matheus Tavares
  2020-10-29  2:14     ` [PATCH v3 15/19] checkout-index: add " Matheus Tavares
                       ` (7 subsequent siblings)
  21 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-10-29  2:14 UTC (permalink / raw)
  To: git; +Cc: gitster, git, chriscool, peff, newren, jrnieder, martin.agren

There is one code path in builtin/checkout.c which still doesn't benefit
from parallel checkout because it calls checkout_entry() directly,
instead of unpack_trees(). Let's add parallel support for this missing
spot as well. Note: the transient cache entries allocated in
checkout_merged() are now allocated in a mem_pool which is only
discarded after parallel checkout finishes. This is done because the
entries need to be valid when run_parallel_checkout() is called.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 builtin/checkout.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/builtin/checkout.c b/builtin/checkout.c
index c0bf5e6711..ddc4079b85 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -27,6 +27,7 @@
 #include "wt-status.h"
 #include "xdiff-interface.h"
 #include "entry.h"
+#include "parallel-checkout.h"
 
 static const char * const checkout_usage[] = {
 	N_("git checkout [<options>] <branch>"),
@@ -230,7 +231,8 @@ static int checkout_stage(int stage, const struct cache_entry *ce, int pos,
 		return error(_("path '%s' does not have their version"), ce->name);
 }
 
-static int checkout_merged(int pos, const struct checkout *state, int *nr_checkouts)
+static int checkout_merged(int pos, const struct checkout *state,
+			   int *nr_checkouts, struct mem_pool *ce_mem_pool)
 {
 	struct cache_entry *ce = active_cache[pos];
 	const char *path = ce->name;
@@ -291,11 +293,10 @@ static int checkout_merged(int pos, const struct checkout *state, int *nr_checko
 	if (write_object_file(result_buf.ptr, result_buf.size, blob_type, &oid))
 		die(_("Unable to add merge result for '%s'"), path);
 	free(result_buf.ptr);
-	ce = make_transient_cache_entry(mode, &oid, path, 2, NULL);
+	ce = make_transient_cache_entry(mode, &oid, path, 2, ce_mem_pool);
 	if (!ce)
 		die(_("make_cache_entry failed for path '%s'"), path);
 	status = checkout_entry(ce, state, NULL, nr_checkouts);
-	discard_cache_entry(ce);
 	return status;
 }
 
@@ -359,16 +360,22 @@ static int checkout_worktree(const struct checkout_opts *opts,
 	int nr_checkouts = 0, nr_unmerged = 0;
 	int errs = 0;
 	int pos;
+	int pc_workers, pc_threshold;
+	struct mem_pool ce_mem_pool;
 
 	state.force = 1;
 	state.refresh_cache = 1;
 	state.istate = &the_index;
 
+	mem_pool_init(&ce_mem_pool, 0);
+	get_parallel_checkout_configs(&pc_workers, &pc_threshold);
 	init_checkout_metadata(&state.meta, info->refname,
 			       info->commit ? &info->commit->object.oid : &info->oid,
 			       NULL);
 
 	enable_delayed_checkout(&state);
+	if (pc_workers > 1)
+		init_parallel_checkout();
 	for (pos = 0; pos < active_nr; pos++) {
 		struct cache_entry *ce = active_cache[pos];
 		if (ce->ce_flags & CE_MATCHED) {
@@ -384,10 +391,15 @@ static int checkout_worktree(const struct checkout_opts *opts,
 						       &nr_checkouts, opts->overlay_mode);
 			else if (opts->merge)
 				errs |= checkout_merged(pos, &state,
-							&nr_unmerged);
+							&nr_unmerged,
+							&ce_mem_pool);
 			pos = skip_same_name(ce, pos) - 1;
 		}
 	}
+	if (pc_workers > 1)
+		errs |= run_parallel_checkout(&state, pc_workers, pc_threshold,
+					      NULL, NULL);
+	mem_pool_discard(&ce_mem_pool, should_validate_cache_entries());
 	remove_marked_cache_entries(&the_index, 1);
 	remove_scheduled_dirs();
 	errs |= finish_delayed_checkout(&state, &nr_checkouts);
-- 
2.28.0


* [PATCH v3 15/19] checkout-index: add parallel checkout support
  2020-10-29  2:14   ` [PATCH v3 00/19] Parallel Checkout (part I) Matheus Tavares
                       ` (13 preceding siblings ...)
  2020-10-29  2:14     ` [PATCH v3 14/19] builtin/checkout.c: complete parallel checkout support Matheus Tavares
@ 2020-10-29  2:14     ` Matheus Tavares
  2020-10-29  2:14     ` [PATCH v3 16/19] parallel-checkout: add tests for basic operations Matheus Tavares
                       ` (6 subsequent siblings)
  21 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-10-29  2:14 UTC (permalink / raw)
  To: git; +Cc: gitster, git, chriscool, peff, newren, jrnieder, martin.agren

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 builtin/checkout-index.c | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/builtin/checkout-index.c b/builtin/checkout-index.c
index 9276ed0258..9a2e255f58 100644
--- a/builtin/checkout-index.c
+++ b/builtin/checkout-index.c
@@ -12,6 +12,7 @@
 #include "cache-tree.h"
 #include "parse-options.h"
 #include "entry.h"
+#include "parallel-checkout.h"
 
 #define CHECKOUT_ALL 4
 static int nul_term_line;
@@ -169,6 +170,7 @@ int cmd_checkout_index(int argc, const char **argv, const char *prefix)
 	int force = 0, quiet = 0, not_new = 0;
 	int index_opt = 0;
 	int err = 0;
+	int pc_workers, pc_threshold;
 	struct option builtin_checkout_index_options[] = {
 		OPT_BOOL('a', "all", &all,
 			N_("check out all files in the index")),
@@ -223,6 +225,14 @@ int cmd_checkout_index(int argc, const char **argv, const char *prefix)
 		hold_locked_index(&lock_file, LOCK_DIE_ON_ERROR);
 	}
 
+	if (!to_tempfile)
+		get_parallel_checkout_configs(&pc_workers, &pc_threshold);
+	else
+		pc_workers = 1;
+
+	if (pc_workers > 1)
+		init_parallel_checkout();
+
 	/* Check out named files first */
 	for (i = 0; i < argc; i++) {
 		const char *arg = argv[i];
@@ -262,12 +272,17 @@ int cmd_checkout_index(int argc, const char **argv, const char *prefix)
 		strbuf_release(&buf);
 	}
 
-	if (err)
-		return 1;
-
 	if (all)
 		checkout_all(prefix, prefix_length);
 
+	if (pc_workers > 1) {
+		err |= run_parallel_checkout(&state, pc_workers, pc_threshold,
+					     NULL, NULL);
+	}
+
+	if (err)
+		return 1;
+
 	if (is_lock_file_locked(&lock_file) &&
 	    write_locked_index(&the_index, &lock_file, COMMIT_LOCK))
 		die("Unable to write new index file");
-- 
2.28.0


* [PATCH v3 16/19] parallel-checkout: add tests for basic operations
  2020-10-29  2:14   ` [PATCH v3 00/19] Parallel Checkout (part I) Matheus Tavares
                       ` (14 preceding siblings ...)
  2020-10-29  2:14     ` [PATCH v3 15/19] checkout-index: add " Matheus Tavares
@ 2020-10-29  2:14     ` Matheus Tavares
  2020-10-29  2:14     ` [PATCH v3 17/19] parallel-checkout: add tests related to clone collisions Matheus Tavares
                       ` (5 subsequent siblings)
  21 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-10-29  2:14 UTC (permalink / raw)
  To: git; +Cc: gitster, git, chriscool, peff, newren, jrnieder, martin.agren

Add tests to populate the working tree during clone and checkout using
the sequential and parallel modes, to confirm that they produce
identical results. Also test basic checkout mechanics, such as checking
for symlinks in the leading directories and honoring --force.

Note: some helper functions are added to a common lib file, which is only
included by t2080 for now but will also be used by other parallel-checkout
tests in the following patches.

Original-patch-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
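Note on how the new git_pc helper verifies the worker count: it greps the
GIT_TRACE2 log for checkout--helper child_start events. A git-free sketch of
that check follows; the trace lines are fabricated for illustration (a real
trace2 log carries more fields per line):

```shell
# Count parallel-checkout workers the way git_pc does: one
# "child_start" trace2 event is logged per spawned helper.
# (Fabricated trace content, for illustration only.)
cat >trace <<-\EOF
child_start[0] git checkout--helper
child_start[1] git checkout--helper
child_start[2] git fetch-pack
EOF

workers_in_trace=$(grep "child_start\[..*\] git checkout--helper" trace | wc -l)
test "$workers_in_trace" -eq 2 && echo "expected worker count"
```

The same grep pattern appears verbatim in the helper below; only checkout--helper
children are counted, so other child processes do not skew the check.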
 t/lib-parallel-checkout.sh          |  40 +++++++
 t/t2080-parallel-checkout-basics.sh | 170 ++++++++++++++++++++++++++++
 2 files changed, 210 insertions(+)
 create mode 100644 t/lib-parallel-checkout.sh
 create mode 100755 t/t2080-parallel-checkout-basics.sh

diff --git a/t/lib-parallel-checkout.sh b/t/lib-parallel-checkout.sh
new file mode 100644
index 0000000000..4dad9043fb
--- /dev/null
+++ b/t/lib-parallel-checkout.sh
@@ -0,0 +1,40 @@
+# Helpers for t208* tests
+
+# Runs `git -c checkout.workers=$1 -c checkout.thresholdForParallelism=$2 ${@:4}`
+# and checks that the number of workers spawned is equal to $3.
+#
+git_pc()
+{
+	if test $# -lt 4
+	then
+		BUG "too few arguments to git_pc()"
+	fi &&
+
+	workers=$1 threshold=$2 expected_workers=$3 &&
+	shift 3 &&
+
+	rm -f trace &&
+	GIT_TRACE2="$(pwd)/trace" git \
+		-c checkout.workers=$workers \
+		-c checkout.thresholdForParallelism=$threshold \
+		-c advice.detachedHead=0 \
+		"$@" &&
+
+	# Check that the expected number of workers has been used. Note that it
+	# can be different from the requested number in two cases: when the
+	# threshold is not reached; and when there are not enough
+	# parallel-eligible entries for all workers.
+	#
+	local workers_in_trace=$(grep "child_start\[..*\] git checkout--helper" trace | wc -l) &&
+	test $workers_in_trace -eq $expected_workers &&
+	rm -f trace
+}
+
+# Verify that both the working tree and the index were created correctly
+verify_checkout()
+{
+	git -C "$1" diff-index --quiet HEAD -- &&
+	git -C "$1" diff-index --quiet --cached HEAD -- &&
+	git -C "$1" status --porcelain >"$1".status &&
+	test_must_be_empty "$1".status
+}
diff --git a/t/t2080-parallel-checkout-basics.sh b/t/t2080-parallel-checkout-basics.sh
new file mode 100755
index 0000000000..edea88f14f
--- /dev/null
+++ b/t/t2080-parallel-checkout-basics.sh
@@ -0,0 +1,170 @@
+#!/bin/sh
+
+test_description='parallel-checkout basics
+
+Ensure that parallel-checkout basically works on clone and checkout, spawning
+the required number of workers and correctly populating both the index and
+working tree.
+'
+
+TEST_NO_CREATE_REPO=1
+. ./test-lib.sh
+. "$TEST_DIRECTORY/lib-parallel-checkout.sh"
+
+# Test parallel-checkout with different operations (creation, deletion,
+# modification) and entry types. A branch switch from B1 to B2 will contain:
+#
+# - a (file):      modified
+# - e/x (file):    deleted
+# - b (symlink):   deleted
+# - b/f (file):    created
+# - e (symlink):   created
+# - d (submodule): created
+#
+test_expect_success SYMLINKS 'setup repo for checkout with various operations' '
+	git init various &&
+	(
+		cd various &&
+		git checkout -b B1 &&
+		echo a>a &&
+		mkdir e &&
+		echo e/x >e/x &&
+		ln -s e b &&
+		git add -A &&
+		git commit -m B1 &&
+
+		git checkout -b B2 &&
+		echo modified >a &&
+		rm -rf e &&
+		rm b &&
+		mkdir b &&
+		echo b/f >b/f &&
+		ln -s b e &&
+		git init d &&
+		test_commit -C d f &&
+		git submodule add ./d &&
+		git add -A &&
+		git commit -m B2 &&
+
+		git checkout --recurse-submodules B1
+	)
+'
+
+test_expect_success SYMLINKS 'sequential checkout' '
+	cp -R various various_sequential &&
+	git_pc 1 0 0 -C various_sequential checkout --recurse-submodules B2 &&
+	verify_checkout various_sequential
+'
+
+test_expect_success SYMLINKS 'parallel checkout' '
+	cp -R various various_parallel &&
+	git_pc 2 0 2 -C various_parallel checkout --recurse-submodules B2 &&
+	verify_checkout various_parallel
+'
+
+test_expect_success SYMLINKS 'fallback to sequential checkout (threshold)' '
+	cp -R various various_sequential_fallback &&
+	git_pc 2 100 0 -C various_sequential_fallback checkout --recurse-submodules B2 &&
+	verify_checkout various_sequential_fallback
+'
+
+test_expect_success SYMLINKS 'parallel checkout on clone' '
+	git -C various checkout --recurse-submodules B2 &&
+	git_pc 2 0 2 clone --recurse-submodules various various_parallel_clone  &&
+	verify_checkout various_parallel_clone
+'
+
+test_expect_success SYMLINKS 'fallback to sequential checkout on clone (threshold)' '
+	git -C various checkout --recurse-submodules B2 &&
+	git_pc 2 100 0 clone --recurse-submodules various various_sequential_fallback_clone &&
+	verify_checkout various_sequential_fallback_clone
+'
+
+# Just to be paranoid, actually compare the working trees' contents directly.
+test_expect_success SYMLINKS 'compare the working trees' '
+	rm -rf various_*/.git &&
+	rm -rf various_*/d/.git &&
+
+	diff -r various_sequential various_parallel &&
+	diff -r various_sequential various_sequential_fallback &&
+	diff -r various_sequential various_parallel_clone &&
+	diff -r various_sequential various_sequential_fallback_clone
+'
+
+test_cmp_str()
+{
+	echo "$1" >tmp &&
+	test_cmp tmp "$2"
+}
+
+test_expect_success 'parallel checkout respects --[no]-force' '
+	git init dirty &&
+	(
+		cd dirty &&
+		mkdir D &&
+		test_commit D/F &&
+		test_commit F &&
+
+		echo changed >F.t &&
+		rm -rf D &&
+		echo changed >D &&
+
+		# We expect 0 workers because there is nothing to be updated
+		git_pc 2 0 0 checkout HEAD &&
+		test_path_is_file D &&
+		test_cmp_str changed D &&
+		test_cmp_str changed F.t &&
+
+		git_pc 2 0 2 checkout --force HEAD &&
+		test_path_is_dir D &&
+		test_cmp_str D/F D/F.t &&
+		test_cmp_str F F.t
+	)
+'
+
+test_expect_success SYMLINKS 'parallel checkout checks for symlinks in leading dirs' '
+	git init symlinks &&
+	(
+		cd symlinks &&
+		mkdir D E &&
+
+		# Create two entries in D to have enough work for 2 parallel
+		# workers
+		test_commit D/A &&
+		test_commit D/B &&
+		test_commit E/C &&
+		rm -rf D &&
+		ln -s E D &&
+
+		git_pc 2 0 2 checkout --force HEAD &&
+		! test -L D &&
+		test_cmp_str D/A D/A.t &&
+		test_cmp_str D/B D/B.t
+	)
+'
+
+test_expect_success SYMLINKS,CASE_INSENSITIVE_FS 'symlink colliding with leading dir' '
+	git init colliding-symlink &&
+	(
+		cd colliding-symlink &&
+		file_hex=$(git hash-object -w --stdin </dev/null) &&
+		file_oct=$(echo $file_hex | hex2oct) &&
+
+		sym_hex=$(echo "./D" | git hash-object -w --stdin) &&
+		sym_oct=$(echo $sym_hex | hex2oct) &&
+
+		printf "100644 D/A\0${file_oct}" >tree &&
+		printf "100644 E/B\0${file_oct}" >>tree &&
+		printf "120000 e\0${sym_oct}" >>tree &&
+
+		tree_hex=$(git hash-object -w -t tree --stdin <tree) &&
+		commit_hex=$(git commit-tree -m collisions $tree_hex) &&
+		git update-ref refs/heads/colliding-symlink $commit_hex &&
+
+		git_pc 2 0 2 checkout colliding-symlink &&
+		test_path_is_dir D &&
+		test_path_is_missing D/B
+	)
+'
+
+test_done
-- 
2.28.0



* [PATCH v3 17/19] parallel-checkout: add tests related to clone collisions
  2020-10-29  2:14   ` [PATCH v3 00/19] Parallel Checkout (part I) Matheus Tavares
                       ` (15 preceding siblings ...)
  2020-10-29  2:14     ` [PATCH v3 16/19] parallel-checkout: add tests for basic operations Matheus Tavares
@ 2020-10-29  2:14     ` Matheus Tavares
  2020-10-29  2:14     ` [PATCH v3 18/19] parallel-checkout: add tests related to .gitattributes Matheus Tavares
                       ` (4 subsequent siblings)
  21 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-10-29  2:14 UTC (permalink / raw)
  To: git; +Cc: gitster, git, chriscool, peff, newren, jrnieder, martin.agren

Add tests to confirm that path collisions are properly reported during a
clone operation using parallel-checkout.

Original-patch-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
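A note on the file-descriptor juggling this patch adds to git_pc (`2>&8` inside
the body, `8>&2 2>&4` on the function): the caller-visible stderr is saved on
fd 8 so that git's stderr can still be captured by the caller's own
redirection, while the helper's other diagnostics go elsewhere. A minimal
git-free sketch of the same pattern; the function name and messages here are
made up:

```shell
# Duplicate the caller's stderr onto fd 8 before re-pointing fd 2;
# one chosen command can then keep writing to the caller's stderr.
demo() {
	echo "internal chatter" >&2	# follows the function's fd 2
	echo "for the caller" >&8	# follows the caller's fd 2
} 8>&2 2>/dev/null

demo 2>captured		# the caller's redirection is what fd 8 inherits
cat captured		# prints: for the caller
```

In git_pc the second redirection is `2>&4` rather than `2>/dev/null`, so the
helper's own chatter still reaches the test harness's verbose fd.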
 t/lib-parallel-checkout.sh              |  4 +-
 t/t2081-parallel-checkout-collisions.sh | 98 +++++++++++++++++++++++++
 2 files changed, 100 insertions(+), 2 deletions(-)
 create mode 100755 t/t2081-parallel-checkout-collisions.sh

diff --git a/t/lib-parallel-checkout.sh b/t/lib-parallel-checkout.sh
index 4dad9043fb..e62a433eb1 100644
--- a/t/lib-parallel-checkout.sh
+++ b/t/lib-parallel-checkout.sh
@@ -18,7 +18,7 @@ git_pc()
 		-c checkout.workers=$workers \
 		-c checkout.thresholdForParallelism=$threshold \
 		-c advice.detachedHead=0 \
-		"$@" &&
+		"$@" 2>&8 &&
 
 	# Check that the expected number of workers has been used. Note that it
 	# can be different from the requested number in two cases: when the
@@ -28,7 +28,7 @@ git_pc()
 	local workers_in_trace=$(grep "child_start\[..*\] git checkout--helper" trace | wc -l) &&
 	test $workers_in_trace -eq $expected_workers &&
 	rm -f trace
-}
+} 8>&2 2>&4
 
 # Verify that both the working tree and the index were created correctly
 verify_checkout()
diff --git a/t/t2081-parallel-checkout-collisions.sh b/t/t2081-parallel-checkout-collisions.sh
new file mode 100755
index 0000000000..5cab2dcd2c
--- /dev/null
+++ b/t/t2081-parallel-checkout-collisions.sh
@@ -0,0 +1,98 @@
+#!/bin/sh
+
+test_description='parallel-checkout collisions
+
+When there are path collisions during a clone, Git should report a warning
+listing all of the colliding entries. The sequential code detects a collision
+by calling lstat() before trying to open(O_CREAT) the file. Then, to find the
+colliding pair of an item k, it searches cache_entry[0, k-1].
+
+This is not sufficient in parallel checkout since:
+
+- A colliding file may be created between the lstat() and open() calls;
+- A colliding entry might appear in the second half of the cache_entry array.
+
+The tests in this file make sure that the collision detection code is extended
+for parallel checkout.
+'
+
+. ./test-lib.sh
+. "$TEST_DIRECTORY/lib-parallel-checkout.sh"
+
+TEST_ROOT="$PWD"
+
+test_expect_success CASE_INSENSITIVE_FS 'setup' '
+	file_x_hex=$(git hash-object -w --stdin </dev/null) &&
+	file_x_oct=$(echo $file_x_hex | hex2oct) &&
+
+	attr_hex=$(echo "file_x filter=logger" | git hash-object -w --stdin) &&
+	attr_oct=$(echo $attr_hex | hex2oct) &&
+
+	printf "100644 FILE_X\0${file_x_oct}" >tree &&
+	printf "100644 FILE_x\0${file_x_oct}" >>tree &&
+	printf "100644 file_X\0${file_x_oct}" >>tree &&
+	printf "100644 file_x\0${file_x_oct}" >>tree &&
+	printf "100644 .gitattributes\0${attr_oct}" >>tree &&
+
+	tree_hex=$(git hash-object -w -t tree --stdin <tree) &&
+	commit_hex=$(git commit-tree -m collisions $tree_hex) &&
+	git update-ref refs/heads/collisions $commit_hex &&
+
+	write_script "$TEST_ROOT"/logger_script <<-\EOF
+	echo "$@" >>filter.log
+	EOF
+'
+
+for mode in parallel sequential-fallback
+do
+
+	case $mode in
+	parallel)		workers=2 threshold=0 expected_workers=2 ;;
+	sequential-fallback)	workers=2 threshold=100 expected_workers=0 ;;
+	esac
+
+	test_expect_success CASE_INSENSITIVE_FS "collision detection on $mode clone" '
+		git_pc $workers $threshold $expected_workers \
+			clone --branch=collisions . $mode 2>$mode.stderr &&
+
+		grep FILE_X $mode.stderr &&
+		grep FILE_x $mode.stderr &&
+		grep file_X $mode.stderr &&
+		grep file_x $mode.stderr &&
+		test_i18ngrep "the following paths have collided" $mode.stderr
+	'
+
+	# The following test ensures that the collision detection code is
+	# correctly looking for colliding peers in the second half of the
+	# cache_entry array. This is done by defining a smudge command for the
+	# *last* array entry, which makes it non-eligible for parallel-checkout.
+	# The last entry is then checked out *before* any worker is spawned,
+	# making it succeed and the workers' entries collide.
+	#
+	# Note: this test doesn't work on Windows because, on that system,
+	# collision detection uses strcmp() when core.ignoreCase=false. And we
+	# have to set core.ignoreCase=false so that only 'file_x' matches the
+	# pattern of the filter attribute. But it works on OSX, where collision
+	# detection uses the inode.
+	#
+	test_expect_success CASE_INSENSITIVE_FS,!MINGW,!CYGWIN "collision detection on $mode clone w/ filter" '
+		git_pc $workers $threshold $expected_workers \
+			-c core.ignoreCase=false \
+			-c filter.logger.smudge="\"$TEST_ROOT/logger_script\" %f" \
+			clone --branch=collisions . ${mode}_with_filter \
+			2>${mode}_with_filter.stderr &&
+
+		grep FILE_X ${mode}_with_filter.stderr &&
+		grep FILE_x ${mode}_with_filter.stderr &&
+		grep file_X ${mode}_with_filter.stderr &&
+		grep file_x ${mode}_with_filter.stderr &&
+		test_i18ngrep "the following paths have collided" ${mode}_with_filter.stderr &&
+
+		# Make sure only "file_x" was filtered
+		test_path_is_file ${mode}_with_filter/filter.log &&
+		echo file_x >expected.filter.log &&
+		test_cmp ${mode}_with_filter/filter.log expected.filter.log
+	'
+done
+
+test_done
-- 
2.28.0



* [PATCH v3 18/19] parallel-checkout: add tests related to .gitattributes
  2020-10-29  2:14   ` [PATCH v3 00/19] Parallel Checkout (part I) Matheus Tavares
                       ` (16 preceding siblings ...)
  2020-10-29  2:14     ` [PATCH v3 17/19] parallel-checkout: add tests related to clone collisions Matheus Tavares
@ 2020-10-29  2:14     ` Matheus Tavares
  2020-10-29  2:14     ` [PATCH v3 19/19] ci: run test round with parallel-checkout enabled Matheus Tavares
                       ` (3 subsequent siblings)
  21 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-10-29  2:14 UTC (permalink / raw)
  To: git; +Cc: gitster, git, chriscool, peff, newren, jrnieder, martin.agren

Add tests to confirm that `struct conv_attrs` data is correctly passed
from the main process to the workers, and that they properly smudge
files before writing to the working tree. Also check that
non-parallel-eligible entries, such as regular files that require
external filters, are correctly smudge and written when
parallel-checkout is enabled.

Note: to avoid repeating code, some helper functions are extracted from
t0028 into a common lib file.

Original-patch-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
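For the eol test below: an entry with `text eol=crlf` is stored LF-normalized
in the object database and smudged back to CRLF on checkout. A git-free sketch
of the clean direction only (the smudge direction needs care with the final
unterminated line, which is why the test compares against prebuilt files):

```shell
# Build the same two payloads the test uses; note the missing
# final newline, which makes naive CRLF re-insertion tricky.
printf 'multi\r\nline\r\ntext' >crlf-text
printf 'multi\nline\ntext' >lf-text

# "clean" direction: what gets committed for a text entry
tr -d '\r' <crlf-text >normalized
cmp -s normalized lf-text && echo "CRLF payload normalizes to the LF one"
```

The real conversion is done by convert.c, of course; the sketch only shows why
`test_cmp_bin lf-text A.internal` is the right assertion for the committed blob.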
 t/lib-encoding.sh                       |  25 ++++
 t/t0028-working-tree-encoding.sh        |  25 +---
 t/t2082-parallel-checkout-attributes.sh | 174 ++++++++++++++++++++++++
 3 files changed, 200 insertions(+), 24 deletions(-)
 create mode 100644 t/lib-encoding.sh
 create mode 100755 t/t2082-parallel-checkout-attributes.sh

diff --git a/t/lib-encoding.sh b/t/lib-encoding.sh
new file mode 100644
index 0000000000..c52ffbbed5
--- /dev/null
+++ b/t/lib-encoding.sh
@@ -0,0 +1,25 @@
+# Encoding helpers used by t0028 and t2082
+
+test_lazy_prereq NO_UTF16_BOM '
+	test $(printf abc | iconv -f UTF-8 -t UTF-16 | wc -c) = 6
+'
+
+test_lazy_prereq NO_UTF32_BOM '
+	test $(printf abc | iconv -f UTF-8 -t UTF-32 | wc -c) = 12
+'
+
+write_utf16 () {
+	if test_have_prereq NO_UTF16_BOM
+	then
+		printf '\376\377'
+	fi &&
+	iconv -f UTF-8 -t UTF-16
+}
+
+write_utf32 () {
+	if test_have_prereq NO_UTF32_BOM
+	then
+		printf '\0\0\376\377'
+	fi &&
+	iconv -f UTF-8 -t UTF-32
+}
diff --git a/t/t0028-working-tree-encoding.sh b/t/t0028-working-tree-encoding.sh
index bfc4fb9af5..4fffc3a639 100755
--- a/t/t0028-working-tree-encoding.sh
+++ b/t/t0028-working-tree-encoding.sh
@@ -3,33 +3,10 @@
 test_description='working-tree-encoding conversion via gitattributes'
 
 . ./test-lib.sh
+. "$TEST_DIRECTORY/lib-encoding.sh"
 
 GIT_TRACE_WORKING_TREE_ENCODING=1 && export GIT_TRACE_WORKING_TREE_ENCODING
 
-test_lazy_prereq NO_UTF16_BOM '
-	test $(printf abc | iconv -f UTF-8 -t UTF-16 | wc -c) = 6
-'
-
-test_lazy_prereq NO_UTF32_BOM '
-	test $(printf abc | iconv -f UTF-8 -t UTF-32 | wc -c) = 12
-'
-
-write_utf16 () {
-	if test_have_prereq NO_UTF16_BOM
-	then
-		printf '\376\377'
-	fi &&
-	iconv -f UTF-8 -t UTF-16
-}
-
-write_utf32 () {
-	if test_have_prereq NO_UTF32_BOM
-	then
-		printf '\0\0\376\377'
-	fi &&
-	iconv -f UTF-8 -t UTF-32
-}
-
 test_expect_success 'setup test files' '
 	git config core.eol lf &&
 
diff --git a/t/t2082-parallel-checkout-attributes.sh b/t/t2082-parallel-checkout-attributes.sh
new file mode 100755
index 0000000000..6800574588
--- /dev/null
+++ b/t/t2082-parallel-checkout-attributes.sh
@@ -0,0 +1,174 @@
+#!/bin/sh
+
+test_description='parallel-checkout: attributes
+
+Verify that parallel-checkout correctly creates files that require
+conversions, as specified in .gitattributes. The main point here is
+to check that the conv_attr data is correctly sent to the workers
+and that it contains sufficient information to smudge files
+properly (without access to the index or attribute stack).
+'
+
+TEST_NO_CREATE_REPO=1
+. ./test-lib.sh
+. "$TEST_DIRECTORY/lib-parallel-checkout.sh"
+. "$TEST_DIRECTORY/lib-encoding.sh"
+
+test_expect_success 'parallel-checkout with ident' '
+	git init ident &&
+	(
+		cd ident &&
+		echo "A ident" >.gitattributes &&
+		echo "\$Id\$" >A &&
+		echo "\$Id\$" >B &&
+		git add -A &&
+		git commit -m id &&
+
+		rm A B &&
+		git_pc 2 0 2 reset --hard &&
+		hexsz=$(test_oid hexsz) &&
+		grep -E "\\\$Id: [0-9a-f]{$hexsz} \\\$" A &&
+		grep "\\\$Id\\\$" B
+	)
+'
+
+test_expect_success 'parallel-checkout with re-encoding' '
+	git init encoding &&
+	(
+		cd encoding &&
+		echo text >utf8-text &&
+		cat utf8-text | write_utf16 >utf16-text &&
+
+		echo "A working-tree-encoding=UTF-16" >.gitattributes &&
+		cp utf16-text A &&
+		cp utf16-text B &&
+		git add A B .gitattributes &&
+		git commit -m encoding &&
+
+		# Check that A (and only A) is stored in UTF-8
+		git cat-file -p :A >A.internal &&
+		test_cmp_bin utf8-text A.internal &&
+		git cat-file -p :B >B.internal &&
+		test_cmp_bin utf16-text B.internal &&
+
+		# Check that A is re-encoded during checkout
+		rm A B &&
+		git_pc 2 0 2 checkout A B &&
+		test_cmp_bin utf16-text A
+	)
+'
+
+test_expect_success 'parallel-checkout with eol conversions' '
+	git init eol &&
+	(
+		cd eol &&
+		git config core.autocrlf false &&
+		printf "multi\r\nline\r\ntext" >crlf-text &&
+		printf "multi\nline\ntext" >lf-text &&
+
+		echo "A text eol=crlf" >.gitattributes &&
+		echo "B -text" >>.gitattributes &&
+		cp crlf-text A &&
+		cp crlf-text B &&
+		git add A B .gitattributes &&
+		git commit -m eol &&
+
+		# Check that A (and only A) is stored with LF format
+		git cat-file -p :A >A.internal &&
+		test_cmp_bin lf-text A.internal &&
+		git cat-file -p :B >B.internal &&
+		test_cmp_bin crlf-text B.internal &&
+
+		# Check that A is converted to CRLF during checkout
+		rm A B &&
+		git_pc 2 0 2 checkout A B &&
+		test_cmp_bin crlf-text A
+	)
+'
+
+test_cmp_str()
+{
+	echo "$1" >tmp &&
+	test_cmp tmp "$2"
+}
+
+# Entries that require an external filter are not eligible for parallel
+# checkout. Check that both the parallel-eligible and non-eligible entries are
+# properly written in a single checkout process.
+#
+test_expect_success 'parallel-checkout and external filter' '
+	git init filter &&
+	(
+		cd filter &&
+		git config filter.x2y.clean "tr x y" &&
+		git config filter.x2y.smudge "tr y x" &&
+		git config filter.x2y.required true &&
+
+		echo "A filter=x2y" >.gitattributes &&
+		echo x >A &&
+		echo x >B &&
+		echo x >C &&
+		git add -A &&
+		git commit -m filter &&
+
+		# Check that A (and only A) was cleaned
+		git cat-file -p :A >A.internal &&
+		test_cmp_str y A.internal &&
+		git cat-file -p :B >B.internal &&
+		test_cmp_str x B.internal &&
+		git cat-file -p :C >C.internal &&
+		test_cmp_str x C.internal &&
+
+		rm A B C *.internal &&
+		git_pc 2 0 2 checkout A B C &&
+		test_cmp_str x A &&
+		test_cmp_str x B &&
+		test_cmp_str x C
+	)
+'
+
+# The delayed queue is independent from the parallel queue, and they should be
+# able to work together in the same checkout process.
+#
+test_expect_success PERL 'parallel-checkout and delayed checkout' '
+	write_script rot13-filter.pl "$PERL_PATH" \
+		<"$TEST_DIRECTORY"/t0021/rot13-filter.pl &&
+	test_config_global filter.delay.process \
+		"\"$(pwd)/rot13-filter.pl\" \"$(pwd)/delayed.log\" clean smudge delay" &&
+	test_config_global filter.delay.required true &&
+
+	echo "a b c" >delay-content &&
+	echo "n o p" >delay-rot13-content &&
+
+	git init delayed &&
+	(
+		cd delayed &&
+		echo "*.a filter=delay" >.gitattributes &&
+		cp ../delay-content test-delay10.a &&
+		cp ../delay-content test-delay11.a &&
+		echo parallel >parallel1.b &&
+		echo parallel >parallel2.b &&
+		git add -A &&
+		git commit -m delayed &&
+
+		# Check that the stored data was cleaned
+		git cat-file -p :test-delay10.a > delay10.internal &&
+		test_cmp delay10.internal ../delay-rot13-content &&
+		git cat-file -p :test-delay11.a > delay11.internal &&
+		test_cmp delay11.internal ../delay-rot13-content &&
+		rm *.internal &&
+
+		rm *.a *.b
+	) &&
+
+	git_pc 2 0 2 -C delayed checkout -f &&
+	verify_checkout delayed &&
+
+	# Check that the *.a files got to the delay queue and were filtered
+	grep "smudge test-delay10.a .* \[DELAYED\]" delayed.log &&
+	grep "smudge test-delay11.a .* \[DELAYED\]" delayed.log &&
+	test_cmp delayed/test-delay10.a delay-content &&
+	test_cmp delayed/test-delay11.a delay-content
+'
+
+test_done
-- 
2.28.0



* [PATCH v3 19/19] ci: run test round with parallel-checkout enabled
  2020-10-29  2:14   ` [PATCH v3 00/19] Parallel Checkout (part I) Matheus Tavares
                       ` (17 preceding siblings ...)
  2020-10-29  2:14     ` [PATCH v3 18/19] parallel-checkout: add tests related to .gitattributes Matheus Tavares
@ 2020-10-29  2:14     ` Matheus Tavares
  2020-10-29 19:48     ` [PATCH v3 00/19] Parallel Checkout (part I) Junio C Hamano
                       ` (2 subsequent siblings)
  21 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-10-29  2:14 UTC (permalink / raw)
  To: git; +Cc: gitster, git, chriscool, peff, newren, jrnieder, martin.agren

We already have tests for the basic parallel-checkout operations. But
this code can also run in other commands, such as git-read-tree and
git-sparse-checkout, which are currently not tested with multiple
workers. To promote wider test coverage without duplicating tests:

1. Add the GIT_TEST_CHECKOUT_WORKERS environment variable, to optionally
   force parallel-checkout execution during the whole test suite.

2. Include this variable in the second test round of the linux-gcc job
   of our ci scripts. This round runs `make test` again with some
   optional GIT_TEST_* variables enabled, so there is no additional
   overhead in exercising the parallel-checkout code here.

Note: the specific parallel-checkout tests t208* cannot be used in
combination with GIT_TEST_CHECKOUT_WORKERS as they need to set and check
the number of workers by themselves. So skip those tests when this flag
is set.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
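The override that get_parallel_checkout_configs() gains can be summarized in
shell terms. This is a hedged sketch of the decision flow only, not the actual
implementation (which lives in C and uses strtol_i() and online_cpus()):

```shell
# GIT_TEST_CHECKOUT_WORKERS, when set, wins over checkout.workers
# and forces the parallelism threshold down to 0.
GIT_TEST_CHECKOUT_WORKERS=2	# as exported for the linux-gcc CI job

if test -n "$GIT_TEST_CHECKOUT_WORKERS"
then
	num_workers=$GIT_TEST_CHECKOUT_WORKERS
	if test "$num_workers" -lt 1
	then
		num_workers=$(nproc)	# stand-in for online_cpus()
	fi
	threshold=0
fi
echo "workers=$num_workers threshold=$threshold"
```

A value below 1 asks for one worker per online CPU, matching the C fallback;
the threshold of 0 guarantees the parallel code runs even on tiny checkouts.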
 ci/run-build-and-tests.sh  |  1 +
 parallel-checkout.c        | 14 ++++++++++++++
 t/README                   |  4 ++++
 t/lib-parallel-checkout.sh |  6 ++++++
 4 files changed, 25 insertions(+)

diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
index 6c27b886b8..aa32ddc361 100755
--- a/ci/run-build-and-tests.sh
+++ b/ci/run-build-and-tests.sh
@@ -22,6 +22,7 @@ linux-gcc)
 	export GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS=1
 	export GIT_TEST_MULTI_PACK_INDEX=1
 	export GIT_TEST_ADD_I_USE_BUILTIN=1
+	export GIT_TEST_CHECKOUT_WORKERS=2
 	make test
 	;;
 linux-clang)
diff --git a/parallel-checkout.c b/parallel-checkout.c
index c5c449d224..7482447f2d 100644
--- a/parallel-checkout.c
+++ b/parallel-checkout.c
@@ -32,6 +32,20 @@ enum pc_status parallel_checkout_status(void)
 
 void get_parallel_checkout_configs(int *num_workers, int *threshold)
 {
+	char *env_workers = getenv("GIT_TEST_CHECKOUT_WORKERS");
+
+	if (env_workers && *env_workers) {
+		if (strtol_i(env_workers, 10, num_workers)) {
+			die("invalid value for GIT_TEST_CHECKOUT_WORKERS: '%s'",
+			    env_workers);
+		}
+		if (*num_workers < 1)
+			*num_workers = online_cpus();
+
+		*threshold = 0;
+		return;
+	}
+
 	if (git_config_get_int("checkout.workers", num_workers))
 		*num_workers = 1;
 	else if (*num_workers < 1)
diff --git a/t/README b/t/README
index 2adaf7c2d2..cd1b15c55a 100644
--- a/t/README
+++ b/t/README
@@ -425,6 +425,10 @@ GIT_TEST_DEFAULT_HASH=<hash-algo> specifies which hash algorithm to
 use in the test scripts. Recognized values for <hash-algo> are "sha1"
 and "sha256".
 
+GIT_TEST_CHECKOUT_WORKERS=<n> overrides the 'checkout.workers' setting
+to <n> and 'checkout.thresholdForParallelism' to 0, forcing the
+execution of the parallel-checkout code.
+
 Naming Tests
 ------------
 
diff --git a/t/lib-parallel-checkout.sh b/t/lib-parallel-checkout.sh
index e62a433eb1..7b454da375 100644
--- a/t/lib-parallel-checkout.sh
+++ b/t/lib-parallel-checkout.sh
@@ -1,5 +1,11 @@
 # Helpers for t208* tests
 
+if ! test -z "$GIT_TEST_CHECKOUT_WORKERS"
+then
+	skip_all="skipping test, GIT_TEST_CHECKOUT_WORKERS is set"
+	test_done
+fi
+
 # Runs `git -c checkout.workers=$1 -c checkout.thresholdForParallelism=$2 ${@:4}`
 # and checks that the number of workers spawned is equal to $3.
 #
-- 
2.28.0



* Re: [PATCH v3 00/19] Parallel Checkout (part I)
  2020-10-29  2:14   ` [PATCH v3 00/19] Parallel Checkout (part I) Matheus Tavares
                       ` (18 preceding siblings ...)
  2020-10-29  2:14     ` [PATCH v3 19/19] ci: run test round with parallel-checkout enabled Matheus Tavares
@ 2020-10-29 19:48     ` Junio C Hamano
  2020-10-30 15:58     ` Jeff Hostetler
  2020-11-04 20:32     ` [PATCH v4 " Matheus Tavares
  21 siblings, 0 replies; 154+ messages in thread
From: Junio C Hamano @ 2020-10-29 19:48 UTC (permalink / raw)
  To: Matheus Tavares; +Cc: git, git, chriscool, peff, newren, jrnieder, martin.agren

Matheus Tavares <matheus.bernardino@usp.br> writes:

> There were some semantic conflicts between this series and
> jk/checkout-index-errors, so I rebased my series on top of that.

That is sensible, as you'd want to be able to rely on the exit
status from the command while testing.

Will replace what has been queued.


* Re: [PATCH v3 01/19] convert: make convert_attrs() and convert structs public
  2020-10-29  2:14     ` [PATCH v3 01/19] convert: make convert_attrs() and convert structs public Matheus Tavares
@ 2020-10-29 23:40       ` Junio C Hamano
  2020-10-30 17:01         ` Matheus Tavares Bernardino
  0 siblings, 1 reply; 154+ messages in thread
From: Junio C Hamano @ 2020-10-29 23:40 UTC (permalink / raw)
  To: Matheus Tavares
  Cc: git, git, chriscool, peff, newren, jrnieder, martin.agren,
	Jeff Hostetler

Matheus Tavares <matheus.bernardino@usp.br> writes:

> diff --git a/convert.h b/convert.h
> index e29d1026a6..aeb4a1be9a 100644
> --- a/convert.h
> +++ b/convert.h
> @@ -37,6 +37,27 @@ enum eol {
>  #endif
>  };
>  
> +enum crlf_action {
> +	CRLF_UNDEFINED,
> +	CRLF_BINARY,
> +	CRLF_TEXT,
> +	CRLF_TEXT_INPUT,
> +	CRLF_TEXT_CRLF,
> +	CRLF_AUTO,
> +	CRLF_AUTO_INPUT,
> +	CRLF_AUTO_CRLF
> +};
> +
> +struct convert_driver;
> +
> +struct conv_attrs {
> +	struct convert_driver *drv;
> +	enum crlf_action attr_action; /* What attr says */
> +	enum crlf_action crlf_action; /* When no attr is set, use core.autocrlf */
> +	int ident;
> +	const char *working_tree_encoding; /* Supported encoding or default encoding if NULL */
> +};
> +
>  enum ce_delay_state {
>  	CE_NO_DELAY = 0,
>  	CE_CAN_DELAY = 1,
> @@ -102,6 +123,9 @@ void convert_to_git_filter_fd(const struct index_state *istate,
>  int would_convert_to_git_filter_fd(const struct index_state *istate,
>  				   const char *path);
>  
> +void convert_attrs(const struct index_state *istate,
> +		   struct conv_attrs *ca, const char *path);
> +
>  /*
>   * Initialize the checkout metadata with the given values.  Any argument may be
>   * NULL if it is not applicable.  The treeish should be a commit if that is

The new global symbols are reasonable, I would think, with a
possible exception of "crlf_action", which may want to also have
"conv" or "convert" somewhere in its name.



* Re: [PATCH v3 02/19] convert: add [async_]convert_to_working_tree_ca() variants
  2020-10-29  2:14     ` [PATCH v3 02/19] convert: add [async_]convert_to_working_tree_ca() variants Matheus Tavares
@ 2020-10-29 23:48       ` Junio C Hamano
  0 siblings, 0 replies; 154+ messages in thread
From: Junio C Hamano @ 2020-10-29 23:48 UTC (permalink / raw)
  To: Matheus Tavares
  Cc: git, git, chriscool, peff, newren, jrnieder, martin.agren,
	Jeff Hostetler

Matheus Tavares <matheus.bernardino@usp.br> writes:

> -static int convert_to_working_tree_internal(const struct index_state *istate,
> +static int convert_to_working_tree_internal(const struct conv_attrs *ca,

Makes sense.  Once we know conv_attrs, we do not need the istate to
convert the contents.

> @@ -1497,7 +1494,9 @@ int async_convert_to_working_tree(const struct index_state *istate,
>  				  const struct checkout_metadata *meta,
>  				  void *dco)
>  {
> -	return convert_to_working_tree_internal(istate, path, src, len, dst, 0, meta, dco);
> +	struct conv_attrs ca;
> +	convert_attrs(istate, &ca, path);
> +	return convert_to_working_tree_internal(&ca, path, src, len, dst, 0, meta, dco);
>  }
>
> @@ -1505,13 +1504,36 @@ int convert_to_working_tree(const struct index_state *istate,
>  			    size_t len, struct strbuf *dst,
>  			    const struct checkout_metadata *meta)
>  {
> -	return convert_to_working_tree_internal(istate, path, src, len, dst, 0, meta, NULL);
> +	struct conv_attrs ca;
> +	convert_attrs(istate, &ca, path);
> +	return convert_to_working_tree_internal(&ca, path, src, len, dst, 0, meta, NULL);
> +}

OK, these naturally implement "let's lift convert_attrs() out of the
callee and move it to the callers".  However...

> +int async_convert_to_working_tree_ca(const struct conv_attrs *ca,
> +				     const char *path, const char *src,
> +				     size_t len, struct strbuf *dst,
> +				     const struct checkout_metadata *meta,
> +				     void *dco)
> +{
> +	return convert_to_working_tree_internal(ca, path, src, len, dst, 0, meta, dco);
> +}
> +
> +int convert_to_working_tree_ca(const struct conv_attrs *ca,
> +			       const char *path, const char *src,
> +			       size_t len, struct strbuf *dst,
> +			       const struct checkout_metadata *meta)
> +{
> +	return convert_to_working_tree_internal(ca, path, src, len, dst, 0, meta, NULL);
>  }

... shouldn't they be implemented as thin wrappers around these new
*_ca() variants of the API functions?  Otherwise, the *_ca()
variants are not yet used by anybody yet at this step, are they?


* Re: [PATCH v3 03/19] convert: add get_stream_filter_ca() variant
  2020-10-29  2:14     ` [PATCH v3 03/19] convert: add get_stream_filter_ca() variant Matheus Tavares
@ 2020-10-29 23:51       ` Junio C Hamano
  0 siblings, 0 replies; 154+ messages in thread
From: Junio C Hamano @ 2020-10-29 23:51 UTC (permalink / raw)
  To: Matheus Tavares
  Cc: git, git, chriscool, peff, newren, jrnieder, martin.agren,
	Jeff Hostetler

Matheus Tavares <matheus.bernardino@usp.br> writes:

> From: Jeff Hostetler <jeffhost@microsoft.com>
>
> Like the previous patch, we will also need to call get_stream_filter()
> with a precomputed `struct conv_attrs`, when we add support for parallel
> checkout workers. So add the _ca() variant which takes the conversion
> attributes struct as a parameter.
>
> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
> [matheus.bernardino: move header comment to ca() variant and reword msg]
> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
> ---
>  convert.c | 28 +++++++++++++++++-----------
>  convert.h |  2 ++
>  2 files changed, 19 insertions(+), 11 deletions(-)

Same idea as 02/19, which is sound.

It makes readers wonder why this one is separate, while
convert_to_working_tree(), async_convert_to_working_tree(), and
renormalize_buffer() were done in a single patch in a single step,
though.




* Re: [PATCH v3 04/19] convert: add conv_attrs classification
  2020-10-29  2:14     ` [PATCH v3 04/19] convert: add conv_attrs classification Matheus Tavares
@ 2020-10-29 23:53       ` Junio C Hamano
  0 siblings, 0 replies; 154+ messages in thread
From: Junio C Hamano @ 2020-10-29 23:53 UTC (permalink / raw)
  To: Matheus Tavares
  Cc: git, git, chriscool, peff, newren, jrnieder, martin.agren,
	Jeff Hostetler

Matheus Tavares <matheus.bernardino@usp.br> writes:

> From: Jeff Hostetler <jeffhost@microsoft.com>
>
> Create `enum conv_attrs_classification` to express the different ways
> that attributes are handled for a blob during checkout.
>
> This will be used in a later commit when deciding whether to add a file
> to the parallel or delayed queue during checkout. For now, we can also
> use it in get_stream_filter_ca() to simplify the function (as the
> classifying logic is the same).
>
> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
> [matheus.bernardino: use classification in get_stream_filter_ca()]
> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
> ---
>  convert.c | 26 +++++++++++++++++++-------
>  convert.h | 33 +++++++++++++++++++++++++++++++++
>  2 files changed, 52 insertions(+), 7 deletions(-)

Yup, having an actual user of the new layer of abstraction in the
same patch makes it more easily understandable.  If only the new
function and enum were presented without anybody using, it would
have been much harder to swallow, without visible and immediate
benefit.

Looking good.


* Re: [PATCH v3 00/19] Parallel Checkout (part I)
  2020-10-29  2:14   ` [PATCH v3 00/19] Parallel Checkout (part I) Matheus Tavares
                       ` (19 preceding siblings ...)
  2020-10-29 19:48     ` [PATCH v3 00/19] Parallel Checkout (part I) Junio C Hamano
@ 2020-10-30 15:58     ` Jeff Hostetler
  2020-11-04 20:32     ` [PATCH v4 " Matheus Tavares
  21 siblings, 0 replies; 154+ messages in thread
From: Jeff Hostetler @ 2020-10-30 15:58 UTC (permalink / raw)
  To: Matheus Tavares, git
  Cc: gitster, chriscool, peff, newren, jrnieder, martin.agren



On 10/28/20 10:14 PM, Matheus Tavares wrote:
 > ...

Looks good to me.
Thanks for pushing this forward.

Jeff



* Re: [PATCH v3 01/19] convert: make convert_attrs() and convert structs public
  2020-10-29 23:40       ` Junio C Hamano
@ 2020-10-30 17:01         ` Matheus Tavares Bernardino
  2020-10-30 17:38           ` Junio C Hamano
  0 siblings, 1 reply; 154+ messages in thread
From: Matheus Tavares Bernardino @ 2020-10-30 17:01 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Jeff Hostetler, Christian Couder, Jeff King, Elijah Newren,
	Jonathan Nieder, Martin Ågren, Jeff Hostetler

On Thu, Oct 29, 2020 at 8:40 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Matheus Tavares <matheus.bernardino@usp.br> writes:
>
> > diff --git a/convert.h b/convert.h
> > index e29d1026a6..aeb4a1be9a 100644
> > --- a/convert.h
> > +++ b/convert.h
> > @@ -37,6 +37,27 @@ enum eol {
> >  #endif
> >  };
> >
> > +enum crlf_action {
> > +     CRLF_UNDEFINED,
> > +     CRLF_BINARY,
> > +     CRLF_TEXT,
> > +     CRLF_TEXT_INPUT,
> > +     CRLF_TEXT_CRLF,
> > +     CRLF_AUTO,
> > +     CRLF_AUTO_INPUT,
> > +     CRLF_AUTO_CRLF
> > +};
> > +
> > +struct convert_driver;
> > +
> > +struct conv_attrs {
> > +     struct convert_driver *drv;
> > +     enum crlf_action attr_action; /* What attr says */
> > +     enum crlf_action crlf_action; /* When no attr is set, use core.autocrlf */
> > +     int ident;
> > +     const char *working_tree_encoding; /* Supported encoding or default encoding if NULL */
> > +};
> > +
> >  enum ce_delay_state {
> >       CE_NO_DELAY = 0,
> >       CE_CAN_DELAY = 1,
> > @@ -102,6 +123,9 @@ void convert_to_git_filter_fd(const struct index_state *istate,
> >  int would_convert_to_git_filter_fd(const struct index_state *istate,
> >                                  const char *path);
> >
> > +void convert_attrs(const struct index_state *istate,
> > +                struct conv_attrs *ca, const char *path);
> > +
> >  /*
> >   * Initialize the checkout metadata with the given values.  Any argument may be
> >   * NULL if it is not applicable.  The treeish should be a commit if that is
>
> The new global symbols are reasonable, I would think, with a
> possible exception of "crlf_action", which may want to also have
> "conv" or "convert" somewhere in its name.

OK. Maybe `enum crlf_conv_action`? In this case, should I also change
the prefix of the enum values? I'm not sure if it's worth it, though,
since there are about 52 occurrences of them.


* Re: [PATCH v3 01/19] convert: make convert_attrs() and convert structs public
  2020-10-30 17:01         ` Matheus Tavares Bernardino
@ 2020-10-30 17:38           ` Junio C Hamano
  0 siblings, 0 replies; 154+ messages in thread
From: Junio C Hamano @ 2020-10-30 17:38 UTC (permalink / raw)
  To: Matheus Tavares Bernardino
  Cc: git, Jeff Hostetler, Christian Couder, Jeff King, Elijah Newren,
	Jonathan Nieder, Martin Ågren, Jeff Hostetler

Matheus Tavares Bernardino <matheus.bernardino@usp.br> writes:

>> > +enum crlf_action {
>> > +     CRLF_UNDEFINED,
>> > +     CRLF_BINARY,
>> > +     CRLF_TEXT,
>> > +     CRLF_TEXT_INPUT,
>> > +     CRLF_TEXT_CRLF,
>> > +     CRLF_AUTO,
>> > +     CRLF_AUTO_INPUT,
>> > +     CRLF_AUTO_CRLF
>> > +};
>> > +
>> > +struct convert_driver;
>> > +
>> > +struct conv_attrs {
>> > +     struct convert_driver *drv;
>> ...
>> > +void convert_attrs(const struct index_state *istate,
>> > +                struct conv_attrs *ca, const char *path);
>> > +
>> >  /*
>> >   * Initialize the checkout metadata with the given values.  Any argument may be
>> >   * NULL if it is not applicable.  The treeish should be a commit if that is
>>
>> The new global symbols are reasonable, I would think, with a
>> possible exception of "crlf_action", which may want to also have
>> "conv" or "convert" somewhere in its name.
>
> OK. Maybe `enum crlf_conv_action`? In this case, should I also change

Either that, or "conv_crlf_action" (or even use the fully spelled
"convert_" as the prefix common to the global symbols from the
subsystem).

> In this case, should I also change
> the prefix of the enum values? I'm not sure if it's worth it, though,
> since there are about 52 occurrences of them.

At the use sites, these constants will be passed to, or compared
with values returned by, the API functions whose names make it clear
that they are from the "convert_" family, so I think it is OK to
leave the values as-is, as long as there is no unrelated symbol
whose name starts with "CRLF_" (and "git grep '^CRLF_'" tells me
that there is not any).

Thanks.


* Re: [PATCH v3 05/19] entry: extract a header file for entry.c functions
  2020-10-29  2:14     ` [PATCH v3 05/19] entry: extract a header file for entry.c functions Matheus Tavares
@ 2020-10-30 21:36       ` Junio C Hamano
  0 siblings, 0 replies; 154+ messages in thread
From: Junio C Hamano @ 2020-10-30 21:36 UTC (permalink / raw)
  To: Matheus Tavares; +Cc: git, git, chriscool, peff, newren, jrnieder, martin.agren

Matheus Tavares <matheus.bernardino@usp.br> writes:

> The declarations of entry.c's public functions and structures currently
> reside in cache.h. Although not many, they contribute to the size of
> cache.h and, when changed, cause the unnecessary recompilation of
> modules that don't really use these functions. So let's move them to a
> new entry.h header.

Good idea.  This is mostly moving things around, so there are only a
few minor nits.

> diff --git a/entry.h b/entry.h
> new file mode 100644
> index 0000000000..2d69185448
> --- /dev/null
> +++ b/entry.h
> @@ -0,0 +1,41 @@
> +#ifndef ENTRY_H
> +#define ENTRY_H
> +
> +#include "cache.h"
> +#include "convert.h"
> +
> +struct checkout {
> +	struct index_state *istate;
> +	const char *base_dir;
> +	int base_dir_len;
> +	struct delayed_checkout *delayed_checkout;
> +	struct checkout_metadata meta;
> +	unsigned force:1,
> +		 quiet:1,
> +		 not_new:1,
> +		 clone:1,
> +		 refresh_cache:1;
> +};
> +#define CHECKOUT_INIT { NULL, "" }
> +

It makes sense to have a blank here, like you did, as we just
completed the definition of "struct checkout" and things directly
related to it.

> +#define TEMPORARY_FILENAME_LENGTH 25
> +
> +/*
> + * Write the contents from ce out to the working tree.
> + *
> + * When topath[] is not NULL, instead of writing to the working tree
> + * file named by ce, a temporary file is created by this function and
> + * its name is returned in topath[], which must be able to hold at
> + * least TEMPORARY_FILENAME_LENGTH bytes long.
> + */
> +int checkout_entry(struct cache_entry *ce, const struct checkout *state,
> +		   char *topath, int *nr_checkouts);

The comment before the above block applies to both the function and
to the TEMPORARY_FILENAME_LENGTH preprocessor macro.  And this is
where we conclude the definition related to the function so it is a
good idea to have a blank line here....

> +void enable_delayed_checkout(struct checkout *state);
> +int finish_delayed_checkout(struct checkout *state, int *nr_checkouts);

....and here, as we have finished talking about the "delayed" stuff.

> +/*
> + * Unlink the last component and schedule the leading directories for
> + * removal, such that empty directories get removed.
> + */
> +void unlink_entry(const struct cache_entry *ce);
> +
> +#endif /* ENTRY_H */
> diff --git a/unpack-trees.c b/unpack-trees.c
> index 323280dd48..a511fadd89 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -16,6 +16,7 @@
>  #include "fsmonitor.h"
>  #include "object-store.h"
>  #include "promisor-remote.h"
> +#include "entry.h"
>  
>  /*
>   * Error messages expected by scripts out of plumbing commands such as


* Re: [PATCH v3 08/19] entry: move conv_attrs lookup up to checkout_entry()
  2020-10-29  2:14     ` [PATCH v3 08/19] entry: move conv_attrs lookup up to checkout_entry() Matheus Tavares
@ 2020-10-30 21:58       ` Junio C Hamano
  0 siblings, 0 replies; 154+ messages in thread
From: Junio C Hamano @ 2020-10-30 21:58 UTC (permalink / raw)
  To: Matheus Tavares; +Cc: git, git, chriscool, peff, newren, jrnieder, martin.agren

Matheus Tavares <matheus.bernardino@usp.br> writes:

> +/* Note: ca is used (and required) iff the entry refers to a regular file. */

This reflects how the current code happens to work, and it is
unlikely to change (in other words, I offhand do not think of a
reason why attributes may affect checking out a symlink or a
submodule), so that's probably OK.  I mention this specifically
because ...

> +static int write_entry(struct cache_entry *ce, char *path, struct conv_attrs *ca,
> +		       const struct checkout *state, int to_tempfile)
>  {
>  	unsigned int ce_mode_s_ifmt = ce->ce_mode & S_IFMT;
>  	struct delayed_checkout *dco = state->delayed_checkout;
> @@ -281,8 +282,7 @@ static int write_entry(struct cache_entry *ce,
>  	clone_checkout_metadata(&meta, &state->meta, &ce->oid);
>  
>  	if (ce_mode_s_ifmt == S_IFREG) {
> -		struct stream_filter *filter = get_stream_filter(state->istate, ce->name,
> -								 &ce->oid);
> +		struct stream_filter *filter = get_stream_filter_ca(ca, &ce->oid);
>  		if (filter &&
>  		    !streaming_write_entry(ce, path, filter,
>  					   state, to_tempfile,
> @@ -329,14 +329,17 @@ static int write_entry(struct cache_entry *ce,
>  		 * Convert from git internal format to working tree format
>  		 */
>  		if (dco && dco->state != CE_NO_DELAY) {
> -			ret = async_convert_to_working_tree(state->istate, ce->name, new_blob,
> -							    size, &buf, &meta, dco);
> +			ret = async_convert_to_working_tree_ca(ca, ce->name,
> +							       new_blob, size,
> +							       &buf, &meta, dco);
>  			if (ret && string_list_has_string(&dco->paths, ce->name)) {
>  				free(new_blob);
>  				goto delayed;
>  			}
> -		} else
> -			ret = convert_to_working_tree(state->istate, ce->name, new_blob, size, &buf, &meta);
> +		} else {
> +			ret = convert_to_working_tree_ca(ca, ce->name, new_blob,
> +							 size, &buf, &meta);
> +		}
>  
>  		if (ret) {
>  			free(new_blob);
> @@ -442,6 +445,7 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state,
>  {
>  	static struct strbuf path = STRBUF_INIT;
>  	struct stat st;
> +	struct conv_attrs ca;
>  
>  	if (ce->ce_flags & CE_WT_REMOVE) {
>  		if (topath)
> @@ -454,8 +458,13 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state,
>  		return 0;
>  	}
>  
> -	if (topath)
> -		return write_entry(ce, topath, state, 1);
> +	if (topath) {
> +		if (S_ISREG(ce->ce_mode)) {
> +			convert_attrs(state->istate, &ca, ce->name);
> +			return write_entry(ce, topath, &ca, state, 1);
> +		}
> +		return write_entry(ce, topath, NULL, state, 1);
> +	}

... it looked somewhat upside-down at first glance that we
decide whether lower level routines are allowed to use the ca at this
high level in the callchain.  But it is the point of this change
to lift the point to make the decision to use attributes higher in
the callchain, so it would be OK (or "unavoidable").

I wonder if it is worth avoiding the early return from the inner block,
like this:

	struct conv_attrs *use_ca = NULL;
	...
	if (topath) {
		struct conv_attrs ca;
		if (S_ISREG(...)) {
			convert_attrs(... &ca ...);
			use_ca = &ca;
 		}
		return write_entry(ce, topath, use_ca, state, 1);
	}

which would make it easier to further add code that is common to
both regular file and other things before we call write_entry().

The same comment applies to the codepath where a new file gets
created in the next hunk.

> @@ -517,9 +526,16 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state,
>  		return 0;
>  
>  	create_directories(path.buf, path.len, state);
> +
>  	if (nr_checkouts)
>  		(*nr_checkouts)++;
> -	return write_entry(ce, path.buf, state, 0);
> +
> +	if (S_ISREG(ce->ce_mode)) {
> +		convert_attrs(state->istate, &ca, ce->name);
> +		return write_entry(ce, path.buf, &ca, state, 0);
> +	}
> +
> +	return write_entry(ce, path.buf, NULL, state, 0);
>  }
>  
>  void unlink_entry(const struct cache_entry *ce)


* Re: [PATCH v3 09/19] entry: add checkout_entry_ca() which takes preloaded conv_attrs
  2020-10-29  2:14     ` [PATCH v3 09/19] entry: add checkout_entry_ca() which takes preloaded conv_attrs Matheus Tavares
@ 2020-10-30 22:02       ` Junio C Hamano
  0 siblings, 0 replies; 154+ messages in thread
From: Junio C Hamano @ 2020-10-30 22:02 UTC (permalink / raw)
  To: Matheus Tavares; +Cc: git, git, chriscool, peff, newren, jrnieder, martin.agren

Matheus Tavares <matheus.bernardino@usp.br> writes:

> The parallel checkout machinery will call checkout_entry() for entries
> that could not be written in parallel due to path collisions. At this
> point, we will already be holding the conversion attributes for each
> entry, and it would be wasteful to let checkout_entry() load these
> again. Instead, let's add the checkout_entry_ca() variant, which
> optionally takes a preloaded conv_attrs struct.
>
> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
> ---

I think my review comment to 08/19 is partly taken care of with this
step.  Perhaps the progression will become simpler to understand if
we add this new helper first?  I dunno.  In either case, the end
result of applying both 08 and 09 looks quite nicer than the state
after up to 07 are applied.



* Re: [PATCH v3 10/19] unpack-trees: add basic support for parallel checkout
  2020-10-29  2:14     ` [PATCH v3 10/19] unpack-trees: add basic support for parallel checkout Matheus Tavares
@ 2020-11-02 19:35       ` Junio C Hamano
  2020-11-03  3:48         ` Matheus Tavares Bernardino
  0 siblings, 1 reply; 154+ messages in thread
From: Junio C Hamano @ 2020-11-02 19:35 UTC (permalink / raw)
  To: Matheus Tavares; +Cc: git, git, chriscool, peff, newren, jrnieder, martin.agren

Matheus Tavares <matheus.bernardino@usp.br> writes:

> This new interface allows us to enqueue some of the entries being
> checked out to later call write_entry() for them in parallel. For now,
> the parallel checkout machinery is enabled by default and there is no
> user configuration, but run_parallel_checkout() just writes the queued
> entries in sequence (without spawning additional workers).

In other words, this would show the worst case overhead caused by
the framework to allow parallel checkout, relative to the current
code.  Which is quite a sensible and separate step to have in the
series.  I like it.

> The next
> patch will actually implement the parallelism and, later, we will make
> it configurable.

OK.

> When there are path collisions among the entries being written (which
> can happen e.g. with case-sensitive files in case-insensitive file
> systems), the parallel checkout code detects the problem and marks the
> item with PC_ITEM_COLLIDED. Later, these items are sequentially fed to
> checkout_entry() again. This is similar to the way the sequential code
> deals with collisions, overwriting the previously checked out entries
> with the subsequent ones. The only difference is that, when we start
> writing the entries in parallel, we won't be able to determine which of
> the colliding entries will survive on disk (for the sequential
> algorithm, it is always the last one).

Sure.  "The last one" determinism does not buy us very much, but it
is prudent to keep such a behavioural difference in mind.

> I also experimented with the idea of not overwriting colliding entries,
> and it seemed to work well in my simple tests. However, because just one
> entry of each colliding group would be actually written, the others
> would have null lstat() fields on the index. This might not be a problem
> by itself, but it could cause performance penalties for subsequent
> commands that need to refresh the index: when the st_size value cached
> is 0, read-cache.c:ie_modified() will go to the filesystem to see if the
> contents match. As mentioned in the function:
>
>     * Immediately after read-tree or update-index --cacheinfo,
>     * the length field is zero, as we have never even read the
>     * lstat(2) information once, and we cannot trust DATA_CHANGED
>     * returned by ie_match_stat() which in turn was returned by
>     * ce_match_stat_basic() to signal that the filesize of the
>     * blob changed.  We have to actually go to the filesystem to
>     * see if the contents match, and if so, should answer "unchanged".
>
> So, if we have N entries in a colliding group and we decide to write and
> lstat() only one of them, every subsequent git-status will have to read,
> convert, and hash the written file N - 1 times, to check that the N - 1
> unwritten entries are dirty. By checking out all colliding entries (like
> the sequential code does), we only pay the overhead once.

And the cost of writing them out N times is not free, either, I
presume?

But I do not see the point of wasting engineering effort by trying
to make it more efficient to create a corrupt working tree that is
unusable because some paths that ought to exist are missing, so I
think it is OK.

> Co-authored-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> Co-authored-by: Jeff Hostetler <jeffhost@microsoft.com>
> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
> ---
>  Makefile            |   1 +
>  entry.c             |  17 +-
>  parallel-checkout.c | 368 ++++++++++++++++++++++++++++++++++++++++++++
>  parallel-checkout.h |  27 ++++
>  unpack-trees.c      |   6 +-
>  5 files changed, 416 insertions(+), 3 deletions(-)
>  create mode 100644 parallel-checkout.c
>  create mode 100644 parallel-checkout.h
>
> diff --git a/Makefile b/Makefile
> index 1fb0ec1705..10ee5e709b 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -945,6 +945,7 @@ LIB_OBJS += pack-revindex.o
>  LIB_OBJS += pack-write.o
>  LIB_OBJS += packfile.o
>  LIB_OBJS += pager.o
> +LIB_OBJS += parallel-checkout.o
>  LIB_OBJS += parse-options-cb.o
>  LIB_OBJS += parse-options.o
>  LIB_OBJS += patch-delta.o
> diff --git a/entry.c b/entry.c
> index 9d79a5671f..6676954431 100644
> --- a/entry.c
> +++ b/entry.c
> @@ -7,6 +7,7 @@
>  #include "progress.h"
>  #include "fsmonitor.h"
>  #include "entry.h"
> +#include "parallel-checkout.h"
>  
>  static void create_directories(const char *path, int path_len,
>  			       const struct checkout *state)
> @@ -426,8 +427,17 @@ static void mark_colliding_entries(const struct checkout *state,
>  	for (i = 0; i < state->istate->cache_nr; i++) {
>  		struct cache_entry *dup = state->istate->cache[i];
>  
> -		if (dup == ce)
> -			break;
> +		if (dup == ce) {
> +			/*
> +			 * Parallel checkout creates the files in no particular
> +			 * order. So the other side of the collision may appear
> +			 * after the given cache_entry in the array.
> +			 */
> +			if (parallel_checkout_status() == PC_RUNNING)
> +				continue;
> +			else
> +				break;
> +		}
>  
>  		if (dup->ce_flags & (CE_MATCHED | CE_VALID | CE_SKIP_WORKTREE))
>  			continue;
> @@ -536,6 +546,9 @@ int checkout_entry_ca(struct cache_entry *ce, struct conv_attrs *ca,
>  		ca = &ca_buf;
>  	}
>  
> +	if (!enqueue_checkout(ce, ca))
> +		return 0;
> +
>  	return write_entry(ce, path.buf, ca, state, 0);

It is not wrong but feels strange that paths that cannot be
handled by the parallel codepath for whatever reason are written using
the fallback code, but the fallback actually touches the disk before
the queued paths for parallel writeout ;-) What's the reason why
some paths cannot be handled by the new codepath again?  Also, can a
path that is handled by the fallback code collide with other paths
that are handled by the parallel codepath, and what happens for
these paths?

>  }
>  
> diff --git a/parallel-checkout.c b/parallel-checkout.c
> new file mode 100644
> index 0000000000..981dbe6ff3
> --- /dev/null
> +++ b/parallel-checkout.c
> @@ -0,0 +1,368 @@
> +#include "cache.h"
> +#include "entry.h"
> +#include "parallel-checkout.h"
> +#include "streaming.h"
> +
> +enum pc_item_status {
> +	PC_ITEM_PENDING = 0,
> +	PC_ITEM_WRITTEN,
> +	/*
> +	 * The entry could not be written because there was another file
> +	 * already present in its path or leading directories. Since
> +	 * checkout_entry_ca() removes such files from the working tree before
> +	 * enqueueing the entry for parallel checkout, it means that there was
> +	 * a path collision among the entries being written.
> +	 */
> +	PC_ITEM_COLLIDED,
> +	PC_ITEM_FAILED,
> +};
> +
> +struct parallel_checkout_item {
> +	/* pointer to a istate->cache[] entry. Not owned by us. */
> +	struct cache_entry *ce;
> +	struct conv_attrs ca;
> +	struct stat st;
> +	enum pc_item_status status;
> +};
> +
> +struct parallel_checkout {
> +	enum pc_status status;
> +	struct parallel_checkout_item *items;
> +	size_t nr, alloc;
> +};
> +
> +static struct parallel_checkout parallel_checkout = { 0 };

Can't we let this be handled by BSS by not explicitly giving an
initial value?

> +enum pc_status parallel_checkout_status(void)
> +{
> +	return parallel_checkout.status;
> +}
> +
> +void init_parallel_checkout(void)
> +{
> +	if (parallel_checkout.status != PC_UNINITIALIZED)
> +		BUG("parallel checkout already initialized");
> +
> +	parallel_checkout.status = PC_ACCEPTING_ENTRIES;
> +}
> +
> +static void finish_parallel_checkout(void)
> +{
> +	if (parallel_checkout.status == PC_UNINITIALIZED)
> +		BUG("cannot finish parallel checkout: not initialized yet");
> +
> +	free(parallel_checkout.items);
> +	memset(&parallel_checkout, 0, sizeof(parallel_checkout));
> +}
> +
> +static int is_eligible_for_parallel_checkout(const struct cache_entry *ce,
> +					     const struct conv_attrs *ca)
> +{
> +	enum conv_attrs_classification c;
> +
> +	if (!S_ISREG(ce->ce_mode))
> +		return 0;
> +
> +	c = classify_conv_attrs(ca);
> +	switch (c) {
> +	case CA_CLASS_INCORE:
> +		return 1;
> +
> +	case CA_CLASS_INCORE_FILTER:
> +		/*
> +		 * It would be safe to allow concurrent instances of
> +		 * single-file smudge filters, like rot13, but we should not
> +		 * assume that all filters are parallel-process safe. So we
> +		 * don't allow this.
> +		 */
> +		return 0;
> +
> +	case CA_CLASS_INCORE_PROCESS:
> +		/*
> +		 * The parallel queue and the delayed queue are not compatible,
> +		 * so they must be kept completely separated. And we can't tell
> +		 * if a long-running process will delay its response without
> +		 * actually asking it to perform the filtering. Therefore, this
> +		 * type of filter is not allowed in parallel checkout.
> +		 *
> +		 * Furthermore, there should only be one instance of the
> +		 * long-running process filter as we don't know how it is
> +		 * managing its own concurrency. So, spreading the entries that
> +		 * requisite such a filter among the parallel workers would
> +		 * require a lot more inter-process communication. We would
> +		 * probably have to designate a single process to interact with
> +		 * the filter and send all the necessary data to it, for each
> +		 * entry.
> +		 */
> +		return 0;
> +
> +	case CA_CLASS_STREAMABLE:
> +		return 1;
> +
> +	default:
> +		BUG("unsupported conv_attrs classification '%d'", c);
> +	}
> +}

OK, the comments fairly clearly explain the reason for each case.
Good.

> +static int handle_results(struct checkout *state)
> +{
> +	int ret = 0;
> +	size_t i;
> +	int have_pending = 0;
> +
> +	/*
> +	 * We first update the successfully written entries with the collected
> +	 * stat() data, so that they can be found by mark_colliding_entries(),
> +	 * in the next loop, when necessary.
> +	 */
> +	for (i = 0; i < parallel_checkout.nr; ++i) {

We encourage post_increment++ when there is no particular reason to
do otherwise in this codebase (I won't repeat in the remainder of
this review).

> +static int reset_fd(int fd, const char *path)
> +{
> +	if (lseek(fd, 0, SEEK_SET) != 0)
> +		return error_errno("failed to rewind descriptor of %s", path);
> +	if (ftruncate(fd, 0))
> +		return error_errno("failed to truncate file %s", path);
> +	return 0;
> +}

This is in the error codepath when streaming fails, and we'll later
attempt the normal "read object in-core, write it out" codepath, but
is it enough to just ftruncate() it?  I am wondering why it is OK
not to unlink() the failed one---is it the caller who is responsible
for opening the file descriptor to write to, and at the layer of the
caller of this helper there is no way to re-open it, or something
like that?

	... /me looks ahead and it seems the answer is "yes".
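As a self-contained illustration of the rewind-and-truncate pattern
under discussion (simplified and outside Git's codebase; the
"failed streaming attempt" is simulated by writing garbage first):

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Reset an already-open descriptor so a fallback write can reuse it:
 * rewind the offset, then drop whatever the failed attempt left behind.
 */
static int reset_fd(int fd, const char *path)
{
	if (lseek(fd, 0, SEEK_SET) != 0) {
		perror(path);
		return -1;
	}
	if (ftruncate(fd, 0)) {
		perror(path);
		return -1;
	}
	return 0;
}

/* Demo: a partial "streamed" write is discarded and replaced in full. */
static ssize_t write_with_fallback(int fd, const char *path,
				   const char *good, size_t len)
{
	/* simulate a streaming attempt that left partial output */
	if (write(fd, "partial-garbage", 15) < 0)
		return -1;
	if (reset_fd(fd, path))
		return -1;
	return write(fd, good, len);
}
```

The point being that the caller keeps the same descriptor open across
both attempts, so no unlink()/re-open is needed.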

> +static int write_pc_item_to_fd(struct parallel_checkout_item *pc_item, int fd,
> +			       const char *path)
> ...
> +	if (filter) {
> +		if (stream_blob_to_fd(fd, &pc_item->ce->oid, filter, 1)) {
> +			/* On error, reset fd to try writing without streaming */
> +			if (reset_fd(fd, path))
> +				return -1;
> +		} else {
> +			return 0;
> +		}
> +	}
> +
> +	new_blob = read_blob_entry(pc_item->ce, &size);
> ...
> +	wrote = write_in_full(fd, new_blob, size);

> +static int check_leading_dirs(const char *path, int len, int prefix_len)
> +{
> +	const char *slash = path + len;
> +
> +	while (slash > path && *slash != '/')
> +		slash--;

It is kind of surprising that we do not give ourselves an easy-to-use
helper to find the separator between dirname and basename.  If there
were, we would not even need this helper function with an unclear name
(i.e. "check" does not mean much to those who are trying to
understand the caller---"leading directories are checked for
what???" will be their question).

Perhaps create or find such a helper to remove this function and use
has_dirs_only_path() directly in the caller?

> +	return has_dirs_only_path(path, slash - path, prefix_len);
> +}
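A minimal sketch of the kind of separator-finding helper the comment
asks for, in the spirit of Git's existing find_last_dir_sep() (the
name and exact contract here are hypothetical):

```c
#include <string.h>

/*
 * Return a pointer to the last '/' within the first `len` bytes of
 * `path`, or NULL when the path has no directory component.
 */
static const char *find_dir_sep(const char *path, size_t len)
{
	const char *slash = path + len;

	while (slash > path && *slash != '/')
		slash--;
	return (*slash == '/') ? slash : NULL;
}
```

With such a helper the caller could compute `slash - path` itself and
pass it straight to has_dirs_only_path(), making the intent ("do the
leading directories still exist?") visible at the call site.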

> +static void write_pc_item(struct parallel_checkout_item *pc_item,
> +			  struct checkout *state)
> +{
> +	unsigned int mode = (pc_item->ce->ce_mode & 0100) ? 0777 : 0666;
> +	int fd = -1, fstat_done = 0;
> +	struct strbuf path = STRBUF_INIT;
> +
> +	strbuf_add(&path, state->base_dir, state->base_dir_len);
> +	strbuf_add(&path, pc_item->ce->name, pc_item->ce->ce_namelen);
> +
> +	/*
> +	 * At this point, leading dirs should have already been created. But if
> +	 * a symlink being checked out has collided with one of the dirs, due to
> +	 * file system folding rules, it's possible that the dirs are no longer

Is "file system folding rule" clear to readers of the code after
this patch lands?  It isn't, at least to me.

> +	 * present. So we have to check again, and report any path collisions.
> +	 */
> +	if (!check_leading_dirs(path.buf, path.len, state->base_dir_len)) {
> +		pc_item->status = PC_ITEM_COLLIDED;
> +		goto out;
> +	}

Thanks.


* Re: [PATCH v3 10/19] unpack-trees: add basic support for parallel checkout
  2020-11-02 19:35       ` Junio C Hamano
@ 2020-11-03  3:48         ` Matheus Tavares Bernardino
  0 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares Bernardino @ 2020-11-03  3:48 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Jeff Hostetler, Christian Couder, Jeff King, Elijah Newren,
	Jonathan Nieder, Martin Ågren

On Mon, Nov 2, 2020 at 4:35 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Matheus Tavares <matheus.bernardino@usp.br> writes:
[...]
> >
> > @@ -536,6 +546,9 @@ int checkout_entry_ca(struct cache_entry *ce, struct conv_attrs *ca,
> >               ca = &ca_buf;
> >       }
> >
> > +     if (!enqueue_checkout(ce, ca))
> > +             return 0;
> > +
> >       return write_entry(ce, path.buf, ca, state, 0);
>
> It it is not wrong but feels strange that paths that cannot be
> handled by parallel codepath for whatever reason are written using
> the fallback code, but the fallback actually touches the disk before
> the queued paths for parallel writeout ;-)

Yeah... I also considered having a second "sequential_checkout_item"
queue, and iterating it after the parallel-eligible entries. But I
thought that it might be better to write the ineligible entries right
away and save a little memory (especially for the regular files, for
which we would also have to hold the conversion attributes).

With that said, I ended up adding a second queue in part 2, just for
symlinks. By postponing the checkout of symlinks we can avoid the
check_leading_dirs() function and the additional lstat() calls in the
workers. This also makes it possible to create the leading directories
in parallel (in part 3) with raceproof_create_file(), which is quite
nice as it only calls stat() when open() fails. And since symlinks
probably appear in smaller numbers than regular files, this second
queue should never get too long.

> What's the reason why
> some paths cannot be handled by the new codepath again?

Submodules and symlinks are not eligible for parallel checkout mainly
because it would be hard to detect collisions when they are involved.
For symlinks, one worker could create the symlink a/b => d right
before another worker tries to open() and write() a/b/c, which would
then produce the wrong a/d/c file. And for submodules, we could have a
worker checking out a submodule S while another worker writes the
colliding regular file s/f.
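The symlink hazard described above is easy to reproduce by hand. A small illustrative shell session (paths are made up for the demo, mirroring the a/b => d example):

```shell
# Demo of the hazard: once a/b is a symlink to d, a write to
# a/b/c silently lands in a/d/c instead.
cd "$(mktemp -d)"
mkdir -p a/d
ln -s d a/b            # a/b -> d (relative target, resolves to a/d)
echo content >a/b/c    # intended path is a/b/c...
ls a/d                 # ...but the file actually lives at a/d/c
```

This is exactly why a worker racing against another worker's symlink creation cannot safely open() and write() paths below it.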

As for regular files, we don't parallelize the checkout of entries
that require external filters, mainly because we cannot guarantee
that such filters are parallel-process safe. Besides, the
delayed-checkout queue is incompatible with the parallel-checkout
queue, in the sense that each entry may only be present in one of
the two queues.

> Also, can a
> path that is handled by the fallback code collide with other paths
> that are handled by the parallel codepath, and what happens for
> these paths?

Yes, it can happen. But the parallel-checkout machinery should be
ready for it. There are two cases:

1. Both paths collide in the basename (e.g. a/b and a/B)
2. One path collides with the dirname of the other (e.g. a/b and a/B/c)

For both cases, the collision will happen when trying to write the
parallel-eligible path. This happens because, for now, all paths that
are ineligible for parallel-checkout are checked out first. So, in the
first case, we will detect the collision when open() fails in
write_pc_item().

The second case is a little trickier, since [in part 1] we create the
leading directories right before enqueueing an entry for
parallel-checkout. An ineligible entry could then collide with the
dirname of an already enqueued parallel-eligible entry, removing (and
replacing) the created dirs. Also, the ineligible entry could be a
symlink, and we want to avoid the case of workers writing the entry
a/b/c at a/d/c due to a symlink in b. These collisions with the
dirname are detected when has_dirs_only_path() fails in
check_leading_dirs().

Furthermore, there is no risk that has_dirs_only_path() succeeds but
then another entry collides with the leading directories before the
actual checkout, because once the workers have started, no file or
directory is ever removed.

> >  }
> >
> > diff --git a/parallel-checkout.c b/parallel-checkout.c
> > new file mode 100644
> > index 0000000000..981dbe6ff3
> > --- /dev/null
> > +++ b/parallel-checkout.c
> > @@ -0,0 +1,368 @@
> > +#include "cache.h"
> > +#include "entry.h"
> > +#include "parallel-checkout.h"
> > +#include "streaming.h"
> > +
> > +enum pc_item_status {
> > +     PC_ITEM_PENDING = 0,
> > +     PC_ITEM_WRITTEN,
> > +     /*
> > +      * The entry could not be written because there was another file
> > +      * already present in its path or leading directories. Since
> > +      * checkout_entry_ca() removes such files from the working tree before
> > +      * enqueueing the entry for parallel checkout, it means that there was
> > +      * a path collision among the entries being written.
> > +      */
> > +     PC_ITEM_COLLIDED,
> > +     PC_ITEM_FAILED,
> > +};
> > +
> > +struct parallel_checkout_item {
> > +     /* pointer to a istate->cache[] entry. Not owned by us. */
> > +     struct cache_entry *ce;
> > +     struct conv_attrs ca;
> > +     struct stat st;
> > +     enum pc_item_status status;
> > +};
> > +
> > +struct parallel_checkout {
> > +     enum pc_status status;
> > +     struct parallel_checkout_item *items;
> > +     size_t nr, alloc;
> > +};
> > +
> > +static struct parallel_checkout parallel_checkout = { 0 };
>
> Can't we let this handled by BSS by not explicitly giving an initial
> value?

Good catch, thanks.

> > +enum pc_status parallel_checkout_status(void)
> > +{
> > +     return parallel_checkout.status;
> > +}
> > +
> > +void init_parallel_checkout(void)
> > +{
> > +     if (parallel_checkout.status != PC_UNINITIALIZED)
> > +             BUG("parallel checkout already initialized");
> > +
> > +     parallel_checkout.status = PC_ACCEPTING_ENTRIES;
> > +}
> > +
> > +static void finish_parallel_checkout(void)
> > +{
> > +     if (parallel_checkout.status == PC_UNINITIALIZED)
> > +             BUG("cannot finish parallel checkout: not initialized yet");
> > +
> > +     free(parallel_checkout.items);
> > +     memset(&parallel_checkout, 0, sizeof(parallel_checkout));
> > +}
> > +
> > +static int is_eligible_for_parallel_checkout(const struct cache_entry *ce,
> > +                                          const struct conv_attrs *ca)
> > +{
> > +     enum conv_attrs_classification c;
> > +
> > +     if (!S_ISREG(ce->ce_mode))
> > +             return 0;
> > +
> > +     c = classify_conv_attrs(ca);
> > +     switch (c) {
> > +     case CA_CLASS_INCORE:
> > +             return 1;
> > +
> > +     case CA_CLASS_INCORE_FILTER:
> > +             /*
> > +              * It would be safe to allow concurrent instances of
> > +              * single-file smudge filters, like rot13, but we should not
> > +              * assume that all filters are parallel-process safe. So we
> > +              * don't allow this.
> > +              */
> > +             return 0;
> > +
> > +     case CA_CLASS_INCORE_PROCESS:
> > +             /*
> > +              * The parallel queue and the delayed queue are not compatible,
> > +              * so they must be kept completely separated. And we can't tell
> > +              * if a long-running process will delay its response without
> > +              * actually asking it to perform the filtering. Therefore, this
> > +              * type of filter is not allowed in parallel checkout.
> > +              *
> > +              * Furthermore, there should only be one instance of the
> > +              * long-running process filter as we don't know how it is
> > +              * managing its own concurrency. So, spreading the entries that
> > +              * requisite such a filter among the parallel workers would
> > +              * require a lot more inter-process communication. We would
> > +              * probably have to designate a single process to interact with
> > +              * the filter and send all the necessary data to it, for each
> > +              * entry.
> > +              */
> > +             return 0;
> > +
> > +     case CA_CLASS_STREAMABLE:
> > +             return 1;
> > +
> > +     default:
> > +             BUG("unsupported conv_attrs classification '%d'", c);
> > +     }
> > +}
>
> OK, the comments fairly clearly explain the reason for each case.
> Good.
>
> > +static int handle_results(struct checkout *state)
> > +{
> > +     int ret = 0;
> > +     size_t i;
> > +     int have_pending = 0;
> > +
> > +     /*
> > +      * We first update the successfully written entries with the collected
> > +      * stat() data, so that they can be found by mark_colliding_entries(),
> > +      * in the next loop, when necessary.
> > +      */
> > +     for (i = 0; i < parallel_checkout.nr; ++i) {
>
> We encourage post_increment++ when there is no particular reason to
> do otherwise in this codebase (I won't repeat in the remainder of
> this review).

OK, I will fix the pre-increments, thanks.

> > +static int reset_fd(int fd, const char *path)
> > +{
> > +     if (lseek(fd, 0, SEEK_SET) != 0)
> > +             return error_errno("failed to rewind descriptor of %s", path);
> > +     if (ftruncate(fd, 0))
> > +             return error_errno("failed to truncate file %s", path);
> > +     return 0;
> > +}
>
> This is in the error codepath when streaming fails, and we'll later
> attempt the normal "read object in-core, write it out" codepath, but
> is it enough to just ftruncate() it?  I am wondering why it is OK
> not to unlink() the failed one---is it the caller who is responsible
> for opening the file descriptor to write to, and at the layer of the
> caller of this helper there is no way to re-open it, or something
> like that?

Right. We also avoid unlinking the failed one to keep the invariant
that the first worker to successfully open(O_CREAT | O_EXCL) a file
has the "ownership" for that path. So other workers that try to open
the same path will know that there is a collision and can immediately
abort checking out their entry.

>         ... /me looks ahead and it seems the answer is "yes".
>
> > +static int write_pc_item_to_fd(struct parallel_checkout_item *pc_item, int fd,
> > +                            const char *path)
> > ...
> > +     if (filter) {
> > +             if (stream_blob_to_fd(fd, &pc_item->ce->oid, filter, 1)) {
> > +                     /* On error, reset fd to try writing without streaming */
> > +                     if (reset_fd(fd, path))
> > +                             return -1;
> > +             } else {
> > +                     return 0;
> > +             }
> > +     }
> > +
> > +     new_blob = read_blob_entry(pc_item->ce, &size);
> > ...
> > +     wrote = write_in_full(fd, new_blob, size);
>
> > +static int check_leading_dirs(const char *path, int len, int prefix_len)
> > +{
> > +     const char *slash = path + len;
> > +
> > +     while (slash > path && *slash != '/')
> > +             slash--;
>
> It is kind of surprising that we do not give us an easy-to-use
> helper to find the separtor between dirname and basename.  If there
> were, we do not even need this helper function with an unclear name
> (i.e. "check" does not mean much to those who are trying to
> understand the caller---"leading directories are checked for
> what???" will be their question).
>
> Perhaps create or find such a helper to remove this function and use
> has_dirs_only_path() directly in the caller?

OK, I'll look into it. It would be better if we can reuse an already
present helper, since this call to has_dirs_only_path() will be
removed in part 2.

> > +     return has_dirs_only_path(path, slash - path, prefix_len);
> > +}
>
> > +static void write_pc_item(struct parallel_checkout_item *pc_item,
> > +                       struct checkout *state)
> > +{
> > +     unsigned int mode = (pc_item->ce->ce_mode & 0100) ? 0777 : 0666;
> > +     int fd = -1, fstat_done = 0;
> > +     struct strbuf path = STRBUF_INIT;
> > +
> > +     strbuf_add(&path, state->base_dir, state->base_dir_len);
> > +     strbuf_add(&path, pc_item->ce->name, pc_item->ce->ce_namelen);
> > +
> > +     /*
> > +      * At this point, leading dirs should have already been created. But if
> > +      * a symlink being checked out has collided with one of the dirs, due to
> > +      * file system folding rules, it's possible that the dirs are no longer
>
> Is "file system folding rule" clear to readers of the code after
> this patch lands?  It isn't at least to me.

OK, I will rephrase this paragraph to make it clearer.

^ permalink raw reply	[flat|nested] 154+ messages in thread

* [PATCH v4 00/19] Parallel Checkout (part I)
  2020-10-29  2:14   ` [PATCH v3 00/19] Parallel Checkout (part I) Matheus Tavares
                       ` (20 preceding siblings ...)
  2020-10-30 15:58     ` Jeff Hostetler
@ 2020-11-04 20:32     ` Matheus Tavares
  2020-11-04 20:33       ` [PATCH v4 01/19] convert: make convert_attrs() and convert structs public Matheus Tavares
                         ` (19 more replies)
  21 siblings, 20 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-11-04 20:32 UTC (permalink / raw)
  To: git; +Cc: gitster, git, chriscool, peff, newren, jrnieder, martin.agren

Changes since v3:

Patch 1:
  - Renamed 'enum crlf_action' to 'enum convert_crlf_action', since it's
    now public and the latter better suits the global namespace.

Patch 2:
  - Implemented the regular [async_]convert_to_working_tree() functions
    as thin wrappers around the new _ca() variants.

Patches 5 and 6:
  - Properly added blank lines to separate declaration blocks in entry.h.

Patch 8:
  - Used a `struct conv_attrs ca_buf` (together with the `ca` pointer)
    to avoid the early return in checkout_entry() when
    S_ISREG(ce->ce_mode). I think this makes the patch a little easier
    to parse and also simplifies the next patch.

Patch 10:
  - Removed explicit zero initialization of the static struct parallel_checkout.
  - Removed the check_leading_dirs() function, which had a quite generic
    name, and integrated the code into write_pc_item().

    Note: for this change, I used the find_last_dir_sep() helper, which
    is slightly slower since it doesn't take the path's length and thus
    has to iterate over the whole string. Alternatively, we could add a
    strbuf_find_last_dir_sep() variant that takes an strbuf and starts
    the search from the end, saving some iterations per path. But this
    snippet will be removed in part 2, so I thought it wouldn't be worth
    adding a new helper now.
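For illustration, the hypothetical strbuf_find_last_dir_sep() mentioned above would essentially reduce to a reverse scan over a buffer of known length. Sketch only: no such helper exists in Git as of this series, and a real implementation would check is_dir_sep() rather than a literal '/' to also handle '\' on Windows:

```c
#include <stddef.h>

/* Reverse scan: with the length already known (as in an strbuf), we
 * can start from the end instead of strlen()-ing the whole path. */
static const char *rfind_dir_sep(const char *buf, size_t len)
{
	while (len--)
		if (buf[len] == '/')
			return buf + len;
	return NULL;
}
```

With an strbuf, this would be called as rfind_dir_sep(sb->buf, sb->len), touching only the characters after the last separator.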

  - Rephrased comment about the has_dirs_only_path() call in workers, for
    better clarity.

- Changed all unnecessary uses of ++pre_increment in the series to
  post_increment++ (patches 10 and 11).


Jeff Hostetler (4):
  convert: make convert_attrs() and convert structs public
  convert: add [async_]convert_to_working_tree_ca() variants
  convert: add get_stream_filter_ca() variant
  convert: add conv_attrs classification

Matheus Tavares (15):
  entry: extract a header file for entry.c functions
  entry: make fstat_output() and read_blob_entry() public
  entry: extract cache_entry update from write_entry()
  entry: move conv_attrs lookup up to checkout_entry()
  entry: add checkout_entry_ca() which takes preloaded conv_attrs
  unpack-trees: add basic support for parallel checkout
  parallel-checkout: make it truly parallel
  parallel-checkout: support progress displaying
  make_transient_cache_entry(): optionally alloc from mem_pool
  builtin/checkout.c: complete parallel checkout support
  checkout-index: add parallel checkout support
  parallel-checkout: add tests for basic operations
  parallel-checkout: add tests related to clone collisions
  parallel-checkout: add tests related to .gitattributes
  ci: run test round with parallel-checkout enabled

 .gitignore                              |   1 +
 Documentation/config/checkout.txt       |  21 +
 Makefile                                |   2 +
 apply.c                                 |   1 +
 builtin.h                               |   1 +
 builtin/checkout--helper.c              | 142 ++++++
 builtin/checkout-index.c                |  22 +-
 builtin/checkout.c                      |  21 +-
 builtin/difftool.c                      |   3 +-
 cache.h                                 |  34 +-
 ci/run-build-and-tests.sh               |   1 +
 convert.c                               | 130 ++---
 convert.h                               |  96 +++-
 entry.c                                 | 102 ++--
 entry.h                                 |  55 +++
 git.c                                   |   2 +
 parallel-checkout.c                     | 632 ++++++++++++++++++++++++
 parallel-checkout.h                     | 103 ++++
 read-cache.c                            |  12 +-
 t/README                                |   4 +
 t/lib-encoding.sh                       |  25 +
 t/lib-parallel-checkout.sh              |  46 ++
 t/t0028-working-tree-encoding.sh        |  25 +-
 t/t2080-parallel-checkout-basics.sh     | 170 +++++++
 t/t2081-parallel-checkout-collisions.sh |  98 ++++
 t/t2082-parallel-checkout-attributes.sh | 174 +++++++
 unpack-trees.c                          |  22 +-
 27 files changed, 1766 insertions(+), 179 deletions(-)
 create mode 100644 builtin/checkout--helper.c
 create mode 100644 entry.h
 create mode 100644 parallel-checkout.c
 create mode 100644 parallel-checkout.h
 create mode 100644 t/lib-encoding.sh
 create mode 100644 t/lib-parallel-checkout.sh
 create mode 100755 t/t2080-parallel-checkout-basics.sh
 create mode 100755 t/t2081-parallel-checkout-collisions.sh
 create mode 100755 t/t2082-parallel-checkout-attributes.sh

Range-diff against v3:
 1:  dfc3e0fd62 !  1:  2726f6dc05 convert: make convert_attrs() and convert structs public
    @@ Commit message
         Move convert_attrs() declaration from convert.c to convert.h, together
         with the conv_attrs struct and the crlf_action enum. This function and
         the data structures will be used outside convert.c in the upcoming
    -    parallel checkout implementation.
    +    parallel checkout implementation. Note that crlf_action is renamed to
    +    convert_crlf_action, which is more appropriate for the global namespace.
     
         Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
         [matheus.bernardino: squash and reword msg]
    @@ convert.c
      struct text_stat {
      	/* NUL, CR, LF and CRLF counts */
      	unsigned nul, lonecr, lonelf, crlf;
    +@@ convert.c: static int text_eol_is_crlf(void)
    + 	return 0;
    + }
    + 
    +-static enum eol output_eol(enum crlf_action crlf_action)
    ++static enum eol output_eol(enum convert_crlf_action crlf_action)
    + {
    + 	switch (crlf_action) {
    + 	case CRLF_BINARY:
    +@@ convert.c: static int has_crlf_in_index(const struct index_state *istate, const char *path)
    + }
    + 
    + static int will_convert_lf_to_crlf(struct text_stat *stats,
    +-				   enum crlf_action crlf_action)
    ++				   enum convert_crlf_action crlf_action)
    + {
    + 	if (output_eol(crlf_action) != EOL_CRLF)
    + 		return 0;
    +@@ convert.c: static int encode_to_worktree(const char *path, const char *src, size_t src_len,
    + static int crlf_to_git(const struct index_state *istate,
    + 		       const char *path, const char *src, size_t len,
    + 		       struct strbuf *buf,
    +-		       enum crlf_action crlf_action, int conv_flags)
    ++		       enum convert_crlf_action crlf_action, int conv_flags)
    + {
    + 	struct text_stat stats;
    + 	char *dst;
    +@@ convert.c: static int crlf_to_git(const struct index_state *istate,
    + 	return 1;
    + }
    + 
    +-static int crlf_to_worktree(const char *src, size_t len,
    +-			    struct strbuf *buf, enum crlf_action crlf_action)
    ++static int crlf_to_worktree(const char *src, size_t len, struct strbuf *buf,
    ++			    enum convert_crlf_action crlf_action)
    + {
    + 	char *to_free = NULL;
    + 	struct text_stat stats;
    +@@ convert.c: static const char *git_path_check_encoding(struct attr_check_item *check)
    + 	return value;
    + }
    + 
    +-static enum crlf_action git_path_check_crlf(struct attr_check_item *check)
    ++static enum convert_crlf_action git_path_check_crlf(struct attr_check_item *check)
    + {
    + 	const char *value = check->value;
    + 
     @@ convert.c: static int git_path_check_ident(struct attr_check_item *check)
      	return !!ATTR_TRUE(value);
      }
    @@ convert.c: static int git_path_check_ident(struct attr_check_item *check)
      
     
      ## convert.h ##
    -@@ convert.h: enum eol {
    - #endif
    +@@ convert.h: struct checkout_metadata {
    + 	struct object_id blob;
      };
      
    -+enum crlf_action {
    ++enum convert_crlf_action {
     +	CRLF_UNDEFINED,
     +	CRLF_BINARY,
     +	CRLF_TEXT,
    @@ convert.h: enum eol {
     +
     +struct conv_attrs {
     +	struct convert_driver *drv;
    -+	enum crlf_action attr_action; /* What attr says */
    -+	enum crlf_action crlf_action; /* When no attr is set, use core.autocrlf */
    ++	enum convert_crlf_action attr_action; /* What attr says */
    ++	enum convert_crlf_action crlf_action; /* When no attr is set, use core.autocrlf */
     +	int ident;
     +	const char *working_tree_encoding; /* Supported encoding or default encoding if NULL */
     +};
     +
    - enum ce_delay_state {
    - 	CE_NO_DELAY = 0,
    - 	CE_CAN_DELAY = 1,
    -@@ convert.h: void convert_to_git_filter_fd(const struct index_state *istate,
    - int would_convert_to_git_filter_fd(const struct index_state *istate,
    - 				   const char *path);
    - 
     +void convert_attrs(const struct index_state *istate,
     +		   struct conv_attrs *ca, const char *path);
     +
    - /*
    -  * Initialize the checkout metadata with the given values.  Any argument may be
    -  * NULL if it is not applicable.  The treeish should be a commit if that is
    + extern enum eol core_eol;
    + extern char *check_roundtrip_encoding;
    + const char *get_cached_convert_stats_ascii(const struct index_state *istate,
 2:  c5fbd1e16d !  2:  fc03417592 convert: add [async_]convert_to_working_tree_ca() variants
    @@ convert.c: static int convert_to_working_tree_internal(const struct index_state
      
      	return ret | ret_filter;
      }
    -@@ convert.c: int async_convert_to_working_tree(const struct index_state *istate,
    - 				  const struct checkout_metadata *meta,
    - 				  void *dco)
    - {
    --	return convert_to_working_tree_internal(istate, path, src, len, dst, 0, meta, dco);
    -+	struct conv_attrs ca;
    -+	convert_attrs(istate, &ca, path);
    -+	return convert_to_working_tree_internal(&ca, path, src, len, dst, 0, meta, dco);
    - }
      
    - int convert_to_working_tree(const struct index_state *istate,
    -@@ convert.c: int convert_to_working_tree(const struct index_state *istate,
    - 			    size_t len, struct strbuf *dst,
    - 			    const struct checkout_metadata *meta)
    - {
    --	return convert_to_working_tree_internal(istate, path, src, len, dst, 0, meta, NULL);
    -+	struct conv_attrs ca;
    -+	convert_attrs(istate, &ca, path);
    -+	return convert_to_working_tree_internal(&ca, path, src, len, dst, 0, meta, NULL);
    -+}
    -+
    +-int async_convert_to_working_tree(const struct index_state *istate,
    +-				  const char *path, const char *src,
    +-				  size_t len, struct strbuf *dst,
    +-				  const struct checkout_metadata *meta,
    +-				  void *dco)
     +int async_convert_to_working_tree_ca(const struct conv_attrs *ca,
     +				     const char *path, const char *src,
     +				     size_t len, struct strbuf *dst,
     +				     const struct checkout_metadata *meta,
     +				     void *dco)
    -+{
    + {
    +-	return convert_to_working_tree_internal(istate, path, src, len, dst, 0, meta, dco);
     +	return convert_to_working_tree_internal(ca, path, src, len, dst, 0, meta, dco);
    -+}
    -+
    + }
    + 
    +-int convert_to_working_tree(const struct index_state *istate,
    +-			    const char *path, const char *src,
    +-			    size_t len, struct strbuf *dst,
    +-			    const struct checkout_metadata *meta)
     +int convert_to_working_tree_ca(const struct conv_attrs *ca,
     +			       const char *path, const char *src,
     +			       size_t len, struct strbuf *dst,
     +			       const struct checkout_metadata *meta)
    -+{
    + {
    +-	return convert_to_working_tree_internal(istate, path, src, len, dst, 0, meta, NULL);
     +	return convert_to_working_tree_internal(ca, path, src, len, dst, 0, meta, NULL);
      }
      
    @@ convert.c: int convert_to_working_tree(const struct index_state *istate,
      		len = dst->len;
     
      ## convert.h ##
    -@@ convert.h: int convert_to_working_tree(const struct index_state *istate,
    - 			    const char *path, const char *src,
    - 			    size_t len, struct strbuf *dst,
    - 			    const struct checkout_metadata *meta);
    +@@ convert.h: const char *get_convert_attr_ascii(const struct index_state *istate,
    + int convert_to_git(const struct index_state *istate,
    + 		   const char *path, const char *src, size_t len,
    + 		   struct strbuf *dst, int conv_flags);
    +-int convert_to_working_tree(const struct index_state *istate,
    +-			    const char *path, const char *src,
    +-			    size_t len, struct strbuf *dst,
    +-			    const struct checkout_metadata *meta);
    +-int async_convert_to_working_tree(const struct index_state *istate,
    +-				  const char *path, const char *src,
    +-				  size_t len, struct strbuf *dst,
    +-				  const struct checkout_metadata *meta,
    +-				  void *dco);
     +int convert_to_working_tree_ca(const struct conv_attrs *ca,
     +			       const char *path, const char *src,
     +			       size_t len, struct strbuf *dst,
     +			       const struct checkout_metadata *meta);
    - int async_convert_to_working_tree(const struct index_state *istate,
    - 				  const char *path, const char *src,
    - 				  size_t len, struct strbuf *dst,
    - 				  const struct checkout_metadata *meta,
    - 				  void *dco);
     +int async_convert_to_working_tree_ca(const struct conv_attrs *ca,
     +				     const char *path, const char *src,
     +				     size_t len, struct strbuf *dst,
     +				     const struct checkout_metadata *meta,
     +				     void *dco);
    ++static inline int convert_to_working_tree(const struct index_state *istate,
    ++					  const char *path, const char *src,
    ++					  size_t len, struct strbuf *dst,
    ++					  const struct checkout_metadata *meta)
    ++{
    ++	struct conv_attrs ca;
    ++	convert_attrs(istate, &ca, path);
    ++	return convert_to_working_tree_ca(&ca, path, src, len, dst, meta);
    ++}
    ++static inline int async_convert_to_working_tree(const struct index_state *istate,
    ++						const char *path, const char *src,
    ++						size_t len, struct strbuf *dst,
    ++						const struct checkout_metadata *meta,
    ++						void *dco)
    ++{
    ++	struct conv_attrs ca;
    ++	convert_attrs(istate, &ca, path);
    ++	return async_convert_to_working_tree_ca(&ca, path, src, len, dst, meta, dco);
    ++}
      int async_query_available_blobs(const char *cmd,
      				struct string_list *available_paths);
      int renormalize_buffer(const struct index_state *istate,
 3:  c77b16f694 =  3:  8ce20f1031 convert: add get_stream_filter_ca() variant
 4:  18c3f4247e =  4:  aa1eb461f4 convert: add conv_attrs classification
 5:  2caa2c4345 !  5:  cb3dea224b entry: extract a header file for entry.c functions
    @@ entry.h (new)
     +#define CHECKOUT_INIT { NULL, "" }
     +
     +#define TEMPORARY_FILENAME_LENGTH 25
    -+
     +/*
     + * Write the contents from ce out to the working tree.
     + *
    @@ entry.h (new)
     + */
     +int checkout_entry(struct cache_entry *ce, const struct checkout *state,
     +		   char *topath, int *nr_checkouts);
    ++
     +void enable_delayed_checkout(struct checkout *state);
     +int finish_delayed_checkout(struct checkout *state, int *nr_checkouts);
    ++
     +/*
     + * Unlink the last component and schedule the leading directories for
     + * removal, such that empty directories get removed.
 6:  bfa52df9e2 !  6:  46ed6274d7 entry: make fstat_output() and read_blob_entry() public
    @@ entry.c: static int write_entry(struct cache_entry *ce,
     
      ## entry.h ##
     @@ entry.h: int finish_delayed_checkout(struct checkout *state, int *nr_checkouts);
    -  * removal, such that empty directories get removed.
       */
      void unlink_entry(const struct cache_entry *ce);
    + 
     +void *read_blob_entry(const struct cache_entry *ce, unsigned long *size);
     +int fstat_checkout_output(int fd, const struct checkout *state, struct stat *st);
    - 
    ++
      #endif /* ENTRY_H */
 7:  91ef17f533 !  7:  a0479d02ff entry: extract cache_entry update from write_entry()
    @@ entry.c: static int write_entry(struct cache_entry *ce,
      	return 0;
     
      ## entry.h ##
    -@@ entry.h: int finish_delayed_checkout(struct checkout *state, int *nr_checkouts);
    - void unlink_entry(const struct cache_entry *ce);
    +@@ entry.h: void unlink_entry(const struct cache_entry *ce);
    + 
      void *read_blob_entry(const struct cache_entry *ce, unsigned long *size);
      int fstat_checkout_output(int fd, const struct checkout *state, struct stat *st);
     +void update_ce_after_write(const struct checkout *state, struct cache_entry *ce,
 8:  81e03baab1 !  8:  5c993cc27f entry: move conv_attrs lookup up to checkout_entry()
    @@ entry.c: int checkout_entry(struct cache_entry *ce, const struct checkout *state
      {
      	static struct strbuf path = STRBUF_INIT;
      	struct stat st;
    -+	struct conv_attrs ca;
    ++	struct conv_attrs ca_buf, *ca = NULL;
      
      	if (ce->ce_flags & CE_WT_REMOVE) {
      		if (topath)
    @@ entry.c: int checkout_entry(struct cache_entry *ce, const struct checkout *state
     -		return write_entry(ce, topath, state, 1);
     +	if (topath) {
     +		if (S_ISREG(ce->ce_mode)) {
    -+			convert_attrs(state->istate, &ca, ce->name);
    -+			return write_entry(ce, topath, &ca, state, 1);
    ++			convert_attrs(state->istate, &ca_buf, ce->name);
    ++			ca = &ca_buf;
     +		}
    -+		return write_entry(ce, topath, NULL, state, 1);
    ++		return write_entry(ce, topath, ca, state, 1);
     +	}
      
      	strbuf_reset(&path);
    @@ entry.c: int checkout_entry(struct cache_entry *ce, const struct checkout *state
     -	return write_entry(ce, path.buf, state, 0);
     +
     +	if (S_ISREG(ce->ce_mode)) {
    -+		convert_attrs(state->istate, &ca, ce->name);
    -+		return write_entry(ce, path.buf, &ca, state, 0);
    ++		convert_attrs(state->istate, &ca_buf, ce->name);
    ++		ca = &ca_buf;
     +	}
     +
     +	return write_entry(ce, path.buf, NULL, state, 0);
 9:  e1b886f823 !  9:  aa635bda21 entry: add checkout_entry_ca() which takes preloaded conv_attrs
    @@ entry.c: static void mark_colliding_entries(const struct checkout *state,
      {
      	static struct strbuf path = STRBUF_INIT;
      	struct stat st;
    --	struct conv_attrs ca;
    +-	struct conv_attrs ca_buf, *ca = NULL;
     +	struct conv_attrs ca_buf;
      
      	if (ce->ce_flags & CE_WT_REMOVE) {
    @@ entry.c: int checkout_entry(struct cache_entry *ce, const struct checkout *state
      
      	if (topath) {
     -		if (S_ISREG(ce->ce_mode)) {
    --			convert_attrs(state->istate, &ca, ce->name);
    --			return write_entry(ce, topath, &ca, state, 1);
     +		if (S_ISREG(ce->ce_mode) && !ca) {
    -+			convert_attrs(state->istate, &ca_buf, ce->name);
    -+			ca = &ca_buf;
    + 			convert_attrs(state->istate, &ca_buf, ce->name);
    + 			ca = &ca_buf;
      		}
    --		return write_entry(ce, topath, NULL, state, 1);
    -+		return write_entry(ce, topath, ca, state, 1);
    - 	}
    - 
    - 	strbuf_reset(&path);
     @@ entry.c: int checkout_entry(struct cache_entry *ce, const struct checkout *state,
      	if (nr_checkouts)
      		(*nr_checkouts)++;
      
     -	if (S_ISREG(ce->ce_mode)) {
    --		convert_attrs(state->istate, &ca, ce->name);
    --		return write_entry(ce, path.buf, &ca, state, 0);
     +	if (S_ISREG(ce->ce_mode) && !ca) {
    -+		convert_attrs(state->istate, &ca_buf, ce->name);
    -+		ca = &ca_buf;
    + 		convert_attrs(state->istate, &ca_buf, ce->name);
    + 		ca = &ca_buf;
      	}
      
     -	return write_entry(ce, path.buf, NULL, state, 0);
    @@ entry.h: struct checkout {
     +int checkout_entry_ca(struct cache_entry *ce, struct conv_attrs *ca,
     +		      const struct checkout *state, char *topath,
     +		      int *nr_checkouts);
    -+
    + 
      void enable_delayed_checkout(struct checkout *state);
      int finish_delayed_checkout(struct checkout *state, int *nr_checkouts);
    - /*
10:  2bdc13664e ! 10:  bc8447cd9c unpack-trees: add basic support for parallel checkout
    @@ parallel-checkout.c (new)
     +	size_t nr, alloc;
     +};
     +
    -+static struct parallel_checkout parallel_checkout = { 0 };
    ++static struct parallel_checkout parallel_checkout;
     +
     +enum pc_status parallel_checkout_status(void)
     +{
    @@ parallel-checkout.c (new)
     +	 * stat() data, so that they can be found by mark_colliding_entries(),
     +	 * in the next loop, when necessary.
     +	 */
    -+	for (i = 0; i < parallel_checkout.nr; ++i) {
    ++	for (i = 0; i < parallel_checkout.nr; i++) {
     +		struct parallel_checkout_item *pc_item = &parallel_checkout.items[i];
     +		if (pc_item->status == PC_ITEM_WRITTEN)
     +			update_ce_after_write(state, pc_item->ce, &pc_item->st);
     +	}
     +
    -+	for (i = 0; i < parallel_checkout.nr; ++i) {
    ++	for (i = 0; i < parallel_checkout.nr; i++) {
     +		struct parallel_checkout_item *pc_item = &parallel_checkout.items[i];
     +
     +		switch(pc_item->status) {
    @@ parallel-checkout.c (new)
     +	return ret;
     +}
     +
    -+static int check_leading_dirs(const char *path, int len, int prefix_len)
    -+{
    -+	const char *slash = path + len;
    -+
    -+	while (slash > path && *slash != '/')
    -+		slash--;
    -+
    -+	return has_dirs_only_path(path, slash - path, prefix_len);
    -+}
    -+
     +static void write_pc_item(struct parallel_checkout_item *pc_item,
     +			  struct checkout *state)
     +{
     +	unsigned int mode = (pc_item->ce->ce_mode & 0100) ? 0777 : 0666;
     +	int fd = -1, fstat_done = 0;
     +	struct strbuf path = STRBUF_INIT;
    ++	const char *dir_sep;
     +
     +	strbuf_add(&path, state->base_dir, state->base_dir_len);
     +	strbuf_add(&path, pc_item->ce->name, pc_item->ce->ce_namelen);
     +
    ++	dir_sep = find_last_dir_sep(path.buf);
    ++
     +	/*
    -+	 * At this point, leading dirs should have already been created. But if
    -+	 * a symlink being checked out has collided with one of the dirs, due to
    -+	 * file system folding rules, it's possible that the dirs are no longer
    -+	 * present. So we have to check again, and report any path collisions.
    ++	 * The leading dirs should have been already created by now. But, in
    ++	 * case of path collisions, one of the dirs could have been replaced by
    ++	 * a symlink (checked out after we enqueued this entry for parallel
    ++	 * checkout). Thus, we must check the leading dirs again.
     +	 */
    -+	if (!check_leading_dirs(path.buf, path.len, state->base_dir_len)) {
    ++	if (dir_sep && !has_dirs_only_path(path.buf, dir_sep - path.buf,
    ++					   state->base_dir_len)) {
     +		pc_item->status = PC_ITEM_COLLIDED;
     +		goto out;
     +	}
    @@ parallel-checkout.c (new)
     +			 * Errors which probably represent a path collision.
     +			 * Suppress the error message and mark the item to be
     +			 * retried later, sequentially. ENOTDIR and ENOENT are
    -+			 * also interesting, but check_leading_dirs() should
    -+			 * have already caught these cases.
    ++			 * also interesting, but the above has_dirs_only_path()
    ++			 * call should have already caught these cases.
     +			 */
     +			pc_item->status = PC_ITEM_COLLIDED;
     +		} else {
    @@ parallel-checkout.c (new)
     +{
     +	size_t i;
     +
    -+	for (i = 0; i < parallel_checkout.nr; ++i)
    ++	for (i = 0; i < parallel_checkout.nr; i++)
     +		write_pc_item(&parallel_checkout.items[i], state);
     +}
     +
11:  096e543fd2 ! 11:  815137685a parallel-checkout: make it truly parallel
    @@ builtin/checkout--helper.c (new)
     +		packet_to_pc_item(line, len, &items[nr++]);
     +	}
     +
    -+	for (i = 0; i < nr; ++i) {
    ++	for (i = 0; i < nr; i++) {
     +		struct parallel_checkout_item *pc_item = &items[i];
     +		write_pc_item(pc_item, state);
     +		report_result(pc_item);
    @@ parallel-checkout.c: static int write_pc_item_to_fd(struct parallel_checkout_ite
      	 */
      	ret = convert_to_working_tree_ca(&pc_item->ca, pc_item->ce->name,
      					 new_blob, size, &buf, NULL);
    -@@ parallel-checkout.c: static int check_leading_dirs(const char *path, int len, int prefix_len)
    - 	return has_dirs_only_path(path, slash - path, prefix_len);
    +@@ parallel-checkout.c: static int close_and_clear(int *fd)
    + 	return ret;
      }
      
     -static void write_pc_item(struct parallel_checkout_item *pc_item,
    @@ parallel-checkout.c: static void write_pc_item(struct parallel_checkout_item *pc
     +static void send_batch(int fd, size_t start, size_t nr)
     +{
     +	size_t i;
    -+	for (i = 0; i < nr; ++i)
    ++	for (i = 0; i < nr; i++)
     +		send_one_item(fd, &parallel_checkout.items[start + i]);
     +	packet_flush(fd);
     +}
    @@ parallel-checkout.c: static void write_pc_item(struct parallel_checkout_item *pc
     +
     +	ALLOC_ARRAY(workers, num_workers);
     +
    -+	for (i = 0; i < num_workers; ++i) {
    ++	for (i = 0; i < num_workers; i++) {
     +		struct child_process *cp = &workers[i].cp;
     +
     +		child_process_init(cp);
    @@ parallel-checkout.c: static void write_pc_item(struct parallel_checkout_item *pc
     +	base_batch_size = parallel_checkout.nr / num_workers;
     +	workers_with_one_extra_item = parallel_checkout.nr % num_workers;
     +
    -+	for (i = 0; i < num_workers; ++i) {
    ++	for (i = 0; i < num_workers; i++) {
     +		struct pc_worker *worker = &workers[i];
     +		size_t batch_size = base_batch_size;
     +
    @@ parallel-checkout.c: static void write_pc_item(struct parallel_checkout_item *pc
     +	 * Close pipes before calling finish_command() to let the workers
     +	 * exit asynchronously and avoid spending extra time on wait().
     +	 */
    -+	for (i = 0; i < num_workers; ++i) {
    ++	for (i = 0; i < num_workers; i++) {
     +		struct child_process *cp = &workers[i].cp;
     +		if (cp->in >= 0)
     +			close(cp->in);
    @@ parallel-checkout.c: static void write_pc_item(struct parallel_checkout_item *pc
     +			close(cp->out);
     +	}
     +
    -+	for (i = 0; i < num_workers; ++i) {
    ++	for (i = 0; i < num_workers; i++) {
     +		if (finish_command(&workers[i].cp))
     +			error(_("checkout worker %d finished with error"), i);
     +	}
    @@ parallel-checkout.c: static void write_pc_item(struct parallel_checkout_item *pc
     +	struct pollfd *pfds;
     +
     +	CALLOC_ARRAY(pfds, num_workers);
    -+	for (i = 0; i < num_workers; ++i) {
    ++	for (i = 0; i < num_workers; i++) {
     +		pfds[i].fd = workers[i].cp.out;
     +		pfds[i].events = POLLIN;
     +	}
    @@ parallel-checkout.c: static void write_pc_item(struct parallel_checkout_item *pc
     +			die_errno("failed to poll checkout workers");
     +		}
     +
    -+		for (i = 0; i < num_workers && nr > 0; ++i) {
    ++		for (i = 0; i < num_workers && nr > 0; i++) {
     +			struct pc_worker *worker = &workers[i];
     +			struct pollfd *pfd = &pfds[i];
     +
    @@ parallel-checkout.h: void init_parallel_checkout(void);
     +	size_t id;
     +	struct object_id oid;
     +	unsigned int ce_mode;
    -+	enum crlf_action crlf_action;
    ++	enum convert_crlf_action crlf_action;
     +	int ident;
     +	size_t working_tree_encoding_len;
     +	size_t name_len;
12:  9cfeb4821c ! 12:  2b42621582 parallel-checkout: support progress displaying
    @@ parallel-checkout.c: struct parallel_checkout {
     +	unsigned int *progress_cnt;
      };
      
    - static struct parallel_checkout parallel_checkout = { 0 };
    + static struct parallel_checkout parallel_checkout;
     @@ parallel-checkout.c: int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca)
      	return 0;
      }
    @@ parallel-checkout.c: static void write_items_sequentially(struct checkout *state
      {
      	size_t i;
      
    --	for (i = 0; i < parallel_checkout.nr; ++i)
    +-	for (i = 0; i < parallel_checkout.nr; i++)
     -		write_pc_item(&parallel_checkout.items[i], state);
    -+	for (i = 0; i < parallel_checkout.nr; ++i) {
    ++	for (i = 0; i < parallel_checkout.nr; i++) {
     +		struct parallel_checkout_item *pc_item = &parallel_checkout.items[i];
     +		write_pc_item(pc_item, state);
     +		if (pc_item->status != PC_ITEM_COLLIDED)
13:  da99b671e6 = 13:  960116579a make_transient_cache_entry(): optionally alloc from mem_pool
14:  d3d561754a = 14:  fb9f2f580c builtin/checkout.c: complete parallel checkout support
15:  ee34c6e149 = 15:  a844451e58 checkout-index: add parallel checkout support
16:  05299a3cc0 = 16:  3733857ffa parallel-checkout: add tests for basic operations
17:  3d140dcacb = 17:  c8a2974f81 parallel-checkout: add tests related to clone collisions
18:  b26f676cae = 18:  86fccd57d5 parallel-checkout: add tests related to .gitattributes
19:  641c61f9b6 = 19:  7f3e23cc38 ci: run test round with parallel-checkout enabled
-- 
2.28.0


^ permalink raw reply	[flat|nested] 154+ messages in thread

* [PATCH v4 01/19] convert: make convert_attrs() and convert structs public
  2020-11-04 20:32     ` [PATCH v4 " Matheus Tavares
@ 2020-11-04 20:33       ` Matheus Tavares
  2020-12-05 10:40         ` Christian Couder
  2020-11-04 20:33       ` [PATCH v4 02/19] convert: add [async_]convert_to_working_tree_ca() variants Matheus Tavares
                         ` (18 subsequent siblings)
  19 siblings, 1 reply; 154+ messages in thread
From: Matheus Tavares @ 2020-11-04 20:33 UTC (permalink / raw)
  To: git
  Cc: gitster, git, chriscool, peff, newren, jrnieder, martin.agren,
	Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Move convert_attrs() declaration from convert.c to convert.h, together
with the conv_attrs struct and the crlf_action enum. This function and
the data structures will be used outside convert.c in the upcoming
parallel checkout implementation. Note that crlf_action is renamed to
convert_crlf_action, which is more appropriate for the global namespace.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
[matheus.bernardino: squash and reword msg]
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 convert.c | 35 ++++++++---------------------------
 convert.h | 24 ++++++++++++++++++++++++
 2 files changed, 32 insertions(+), 27 deletions(-)

diff --git a/convert.c b/convert.c
index ee360c2f07..f13b001273 100644
--- a/convert.c
+++ b/convert.c
@@ -24,17 +24,6 @@
 #define CONVERT_STAT_BITS_TXT_CRLF  0x2
 #define CONVERT_STAT_BITS_BIN       0x4
 
-enum crlf_action {
-	CRLF_UNDEFINED,
-	CRLF_BINARY,
-	CRLF_TEXT,
-	CRLF_TEXT_INPUT,
-	CRLF_TEXT_CRLF,
-	CRLF_AUTO,
-	CRLF_AUTO_INPUT,
-	CRLF_AUTO_CRLF
-};
-
 struct text_stat {
 	/* NUL, CR, LF and CRLF counts */
 	unsigned nul, lonecr, lonelf, crlf;
@@ -172,7 +161,7 @@ static int text_eol_is_crlf(void)
 	return 0;
 }
 
-static enum eol output_eol(enum crlf_action crlf_action)
+static enum eol output_eol(enum convert_crlf_action crlf_action)
 {
 	switch (crlf_action) {
 	case CRLF_BINARY:
@@ -246,7 +235,7 @@ static int has_crlf_in_index(const struct index_state *istate, const char *path)
 }
 
 static int will_convert_lf_to_crlf(struct text_stat *stats,
-				   enum crlf_action crlf_action)
+				   enum convert_crlf_action crlf_action)
 {
 	if (output_eol(crlf_action) != EOL_CRLF)
 		return 0;
@@ -499,7 +488,7 @@ static int encode_to_worktree(const char *path, const char *src, size_t src_len,
 static int crlf_to_git(const struct index_state *istate,
 		       const char *path, const char *src, size_t len,
 		       struct strbuf *buf,
-		       enum crlf_action crlf_action, int conv_flags)
+		       enum convert_crlf_action crlf_action, int conv_flags)
 {
 	struct text_stat stats;
 	char *dst;
@@ -585,8 +574,8 @@ static int crlf_to_git(const struct index_state *istate,
 	return 1;
 }
 
-static int crlf_to_worktree(const char *src, size_t len,
-			    struct strbuf *buf, enum crlf_action crlf_action)
+static int crlf_to_worktree(const char *src, size_t len, struct strbuf *buf,
+			    enum convert_crlf_action crlf_action)
 {
 	char *to_free = NULL;
 	struct text_stat stats;
@@ -1247,7 +1236,7 @@ static const char *git_path_check_encoding(struct attr_check_item *check)
 	return value;
 }
 
-static enum crlf_action git_path_check_crlf(struct attr_check_item *check)
+static enum convert_crlf_action git_path_check_crlf(struct attr_check_item *check)
 {
 	const char *value = check->value;
 
@@ -1297,18 +1286,10 @@ static int git_path_check_ident(struct attr_check_item *check)
 	return !!ATTR_TRUE(value);
 }
 
-struct conv_attrs {
-	struct convert_driver *drv;
-	enum crlf_action attr_action; /* What attr says */
-	enum crlf_action crlf_action; /* When no attr is set, use core.autocrlf */
-	int ident;
-	const char *working_tree_encoding; /* Supported encoding or default encoding if NULL */
-};
-
 static struct attr_check *check;
 
-static void convert_attrs(const struct index_state *istate,
-			  struct conv_attrs *ca, const char *path)
+void convert_attrs(const struct index_state *istate,
+		   struct conv_attrs *ca, const char *path)
 {
 	struct attr_check_item *ccheck = NULL;
 
diff --git a/convert.h b/convert.h
index e29d1026a6..5678e99922 100644
--- a/convert.h
+++ b/convert.h
@@ -63,6 +63,30 @@ struct checkout_metadata {
 	struct object_id blob;
 };
 
+enum convert_crlf_action {
+	CRLF_UNDEFINED,
+	CRLF_BINARY,
+	CRLF_TEXT,
+	CRLF_TEXT_INPUT,
+	CRLF_TEXT_CRLF,
+	CRLF_AUTO,
+	CRLF_AUTO_INPUT,
+	CRLF_AUTO_CRLF
+};
+
+struct convert_driver;
+
+struct conv_attrs {
+	struct convert_driver *drv;
+	enum convert_crlf_action attr_action; /* What attr says */
+	enum convert_crlf_action crlf_action; /* When no attr is set, use core.autocrlf */
+	int ident;
+	const char *working_tree_encoding; /* Supported encoding or default encoding if NULL */
+};
+
+void convert_attrs(const struct index_state *istate,
+		   struct conv_attrs *ca, const char *path);
+
 extern enum eol core_eol;
 extern char *check_roundtrip_encoding;
 const char *get_cached_convert_stats_ascii(const struct index_state *istate,
-- 
2.28.0


^ permalink raw reply	[flat|nested] 154+ messages in thread

* [PATCH v4 02/19] convert: add [async_]convert_to_working_tree_ca() variants
  2020-11-04 20:32     ` [PATCH v4 " Matheus Tavares
  2020-11-04 20:33       ` [PATCH v4 01/19] convert: make convert_attrs() and convert structs public Matheus Tavares
@ 2020-11-04 20:33       ` Matheus Tavares
  2020-12-05 11:10         ` Christian Couder
  2020-11-04 20:33       ` [PATCH v4 03/19] convert: add get_stream_filter_ca() variant Matheus Tavares
                         ` (17 subsequent siblings)
  19 siblings, 1 reply; 154+ messages in thread
From: Matheus Tavares @ 2020-11-04 20:33 UTC (permalink / raw)
  To: git
  Cc: gitster, git, chriscool, peff, newren, jrnieder, martin.agren,
	Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Separate the attribute gathering from the actual conversion by adding
_ca() variants of the conversion functions. These variants receive a
precomputed 'struct conv_attrs', thus not relying on an index state.
They will be used in a future patch adding parallel checkout support,
for two reasons:

- We will already load the conversion attributes in checkout_entry(),
  before conversion, to decide whether a path is eligible for parallel
  checkout. Therefore, it would be wasteful to load them again later,
  for the actual conversion.

- The parallel workers will be responsible for reading, converting and
  writing blobs to the working tree. They won't have access to the main
  process' index state, so they cannot load the attributes. Instead,
  they will receive the preloaded ones and call the _ca() variant of
  the conversion functions. Furthermore, the attributes machinery is
  optimized to handle paths in sequential order, so it's better to leave
  it for the main process, anyway.
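
The split described by the two points above can be sketched in plain C. Note that the types and helpers below are simplified stand-ins for illustration only, not git's real API; the point is just the calling pattern: compute the attributes once, use them for the eligibility decision, then reuse the same struct for the actual conversion:

```c
#include <string.h>

/* Hypothetical, trimmed-down stand-in for git's conv_attrs. */
struct conv_attrs {
	int needs_filter;
};

/*
 * Pretend attribute lookup. In real git this consults the index and
 * .gitattributes, so only the main process can do it.
 */
static void load_attrs(struct conv_attrs *ca, const char *path)
{
	ca->needs_filter = strstr(path, ".txt") != NULL;
}

/*
 * The _ca() pattern: the conversion works from precomputed attributes
 * only, so a worker without index access could call it too.
 * Returns 1 if the content was (notionally) converted.
 */
static int convert_to_working_tree_ca(const struct conv_attrs *ca)
{
	return ca->needs_filter;
}

/*
 * Caller pattern: a single attribute lookup feeds both the
 * eligibility decision and the conversion itself.
 */
static int checkout_entry_sketch(const char *path, int *converted)
{
	struct conv_attrs ca;

	load_attrs(&ca, path);		/* one lookup, up front */
	if (ca.needs_filter)		/* eligibility decision */
		*converted = convert_to_working_tree_ca(&ca);
	else
		*converted = 0;
	return 0;
}
```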

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
[matheus.bernardino: squash, remove one function definition and reword]
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 convert.c | 47 ++++++++++++++++++++++++-----------------------
 convert.h | 37 ++++++++++++++++++++++++++++---------
 2 files changed, 52 insertions(+), 32 deletions(-)

diff --git a/convert.c b/convert.c
index f13b001273..ab3d517233 100644
--- a/convert.c
+++ b/convert.c
@@ -1447,7 +1447,7 @@ void convert_to_git_filter_fd(const struct index_state *istate,
 	ident_to_git(dst->buf, dst->len, dst, ca.ident);
 }
 
-static int convert_to_working_tree_internal(const struct index_state *istate,
+static int convert_to_working_tree_internal(const struct conv_attrs *ca,
 					    const char *path, const char *src,
 					    size_t len, struct strbuf *dst,
 					    int normalizing,
@@ -1455,11 +1455,8 @@ static int convert_to_working_tree_internal(const struct index_state *istate,
 					    struct delayed_checkout *dco)
 {
 	int ret = 0, ret_filter = 0;
-	struct conv_attrs ca;
-
-	convert_attrs(istate, &ca, path);
 
-	ret |= ident_to_worktree(src, len, dst, ca.ident);
+	ret |= ident_to_worktree(src, len, dst, ca->ident);
 	if (ret) {
 		src = dst->buf;
 		len = dst->len;
@@ -1469,49 +1466,53 @@ static int convert_to_working_tree_internal(const struct index_state *istate,
 	 * is a smudge or process filter (even if the process filter doesn't
 	 * support smudge).  The filters might expect CRLFs.
 	 */
-	if ((ca.drv && (ca.drv->smudge || ca.drv->process)) || !normalizing) {
-		ret |= crlf_to_worktree(src, len, dst, ca.crlf_action);
+	if ((ca->drv && (ca->drv->smudge || ca->drv->process)) || !normalizing) {
+		ret |= crlf_to_worktree(src, len, dst, ca->crlf_action);
 		if (ret) {
 			src = dst->buf;
 			len = dst->len;
 		}
 	}
 
-	ret |= encode_to_worktree(path, src, len, dst, ca.working_tree_encoding);
+	ret |= encode_to_worktree(path, src, len, dst, ca->working_tree_encoding);
 	if (ret) {
 		src = dst->buf;
 		len = dst->len;
 	}
 
 	ret_filter = apply_filter(
-		path, src, len, -1, dst, ca.drv, CAP_SMUDGE, meta, dco);
-	if (!ret_filter && ca.drv && ca.drv->required)
-		die(_("%s: smudge filter %s failed"), path, ca.drv->name);
+		path, src, len, -1, dst, ca->drv, CAP_SMUDGE, meta, dco);
+	if (!ret_filter && ca->drv && ca->drv->required)
+		die(_("%s: smudge filter %s failed"), path, ca->drv->name);
 
 	return ret | ret_filter;
 }
 
-int async_convert_to_working_tree(const struct index_state *istate,
-				  const char *path, const char *src,
-				  size_t len, struct strbuf *dst,
-				  const struct checkout_metadata *meta,
-				  void *dco)
+int async_convert_to_working_tree_ca(const struct conv_attrs *ca,
+				     const char *path, const char *src,
+				     size_t len, struct strbuf *dst,
+				     const struct checkout_metadata *meta,
+				     void *dco)
 {
-	return convert_to_working_tree_internal(istate, path, src, len, dst, 0, meta, dco);
+	return convert_to_working_tree_internal(ca, path, src, len, dst, 0, meta, dco);
 }
 
-int convert_to_working_tree(const struct index_state *istate,
-			    const char *path, const char *src,
-			    size_t len, struct strbuf *dst,
-			    const struct checkout_metadata *meta)
+int convert_to_working_tree_ca(const struct conv_attrs *ca,
+			       const char *path, const char *src,
+			       size_t len, struct strbuf *dst,
+			       const struct checkout_metadata *meta)
 {
-	return convert_to_working_tree_internal(istate, path, src, len, dst, 0, meta, NULL);
+	return convert_to_working_tree_internal(ca, path, src, len, dst, 0, meta, NULL);
 }
 
 int renormalize_buffer(const struct index_state *istate, const char *path,
 		       const char *src, size_t len, struct strbuf *dst)
 {
-	int ret = convert_to_working_tree_internal(istate, path, src, len, dst, 1, NULL, NULL);
+	struct conv_attrs ca;
+	int ret;
+
+	convert_attrs(istate, &ca, path);
+	ret = convert_to_working_tree_internal(&ca, path, src, len, dst, 1, NULL, NULL);
 	if (ret) {
 		src = dst->buf;
 		len = dst->len;
diff --git a/convert.h b/convert.h
index 5678e99922..a4838b5e5c 100644
--- a/convert.h
+++ b/convert.h
@@ -99,15 +99,34 @@ const char *get_convert_attr_ascii(const struct index_state *istate,
 int convert_to_git(const struct index_state *istate,
 		   const char *path, const char *src, size_t len,
 		   struct strbuf *dst, int conv_flags);
-int convert_to_working_tree(const struct index_state *istate,
-			    const char *path, const char *src,
-			    size_t len, struct strbuf *dst,
-			    const struct checkout_metadata *meta);
-int async_convert_to_working_tree(const struct index_state *istate,
-				  const char *path, const char *src,
-				  size_t len, struct strbuf *dst,
-				  const struct checkout_metadata *meta,
-				  void *dco);
+int convert_to_working_tree_ca(const struct conv_attrs *ca,
+			       const char *path, const char *src,
+			       size_t len, struct strbuf *dst,
+			       const struct checkout_metadata *meta);
+int async_convert_to_working_tree_ca(const struct conv_attrs *ca,
+				     const char *path, const char *src,
+				     size_t len, struct strbuf *dst,
+				     const struct checkout_metadata *meta,
+				     void *dco);
+static inline int convert_to_working_tree(const struct index_state *istate,
+					  const char *path, const char *src,
+					  size_t len, struct strbuf *dst,
+					  const struct checkout_metadata *meta)
+{
+	struct conv_attrs ca;
+	convert_attrs(istate, &ca, path);
+	return convert_to_working_tree_ca(&ca, path, src, len, dst, meta);
+}
+static inline int async_convert_to_working_tree(const struct index_state *istate,
+						const char *path, const char *src,
+						size_t len, struct strbuf *dst,
+						const struct checkout_metadata *meta,
+						void *dco)
+{
+	struct conv_attrs ca;
+	convert_attrs(istate, &ca, path);
+	return async_convert_to_working_tree_ca(&ca, path, src, len, dst, meta, dco);
+}
 int async_query_available_blobs(const char *cmd,
 				struct string_list *available_paths);
 int renormalize_buffer(const struct index_state *istate,
-- 
2.28.0



* [PATCH v4 03/19] convert: add get_stream_filter_ca() variant
  2020-11-04 20:32     ` [PATCH v4 " Matheus Tavares
  2020-11-04 20:33       ` [PATCH v4 01/19] convert: make convert_attrs() and convert structs public Matheus Tavares
  2020-11-04 20:33       ` [PATCH v4 02/19] convert: add [async_]convert_to_working_tree_ca() variants Matheus Tavares
@ 2020-11-04 20:33       ` Matheus Tavares
  2020-12-05 11:45         ` Christian Couder
  2020-11-04 20:33       ` [PATCH v4 04/19] convert: add conv_attrs classification Matheus Tavares
                         ` (16 subsequent siblings)
  19 siblings, 1 reply; 154+ messages in thread
From: Matheus Tavares @ 2020-11-04 20:33 UTC (permalink / raw)
  To: git
  Cc: gitster, git, chriscool, peff, newren, jrnieder, martin.agren,
	Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Like the previous patch, we will also need to call get_stream_filter()
with a precomputed `struct conv_attrs`, when we add support for parallel
checkout workers. So add the _ca() variant which takes the conversion
attributes struct as a parameter.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
[matheus.bernardino: move header comment to ca() variant and reword msg]
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 convert.c | 28 +++++++++++++++++-----------
 convert.h |  2 ++
 2 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/convert.c b/convert.c
index ab3d517233..0a61e4e9bf 100644
--- a/convert.c
+++ b/convert.c
@@ -1939,34 +1939,31 @@ static struct stream_filter *ident_filter(const struct object_id *oid)
 }
 
 /*
- * Return an appropriately constructed filter for the path, or NULL if
+ * Return an appropriately constructed filter for the given ca, or NULL if
  * the contents cannot be filtered without reading the whole thing
  * in-core.
  *
  * Note that you would be crazy to set CRLF, smudge/clean or ident to a
  * large binary blob you would want us not to slurp into the memory!
  */
-struct stream_filter *get_stream_filter(const struct index_state *istate,
-					const char *path,
-					const struct object_id *oid)
+struct stream_filter *get_stream_filter_ca(const struct conv_attrs *ca,
+					   const struct object_id *oid)
 {
-	struct conv_attrs ca;
 	struct stream_filter *filter = NULL;
 
-	convert_attrs(istate, &ca, path);
-	if (ca.drv && (ca.drv->process || ca.drv->smudge || ca.drv->clean))
+	if (ca->drv && (ca->drv->process || ca->drv->smudge || ca->drv->clean))
 		return NULL;
 
-	if (ca.working_tree_encoding)
+	if (ca->working_tree_encoding)
 		return NULL;
 
-	if (ca.crlf_action == CRLF_AUTO || ca.crlf_action == CRLF_AUTO_CRLF)
+	if (ca->crlf_action == CRLF_AUTO || ca->crlf_action == CRLF_AUTO_CRLF)
 		return NULL;
 
-	if (ca.ident)
+	if (ca->ident)
 		filter = ident_filter(oid);
 
-	if (output_eol(ca.crlf_action) == EOL_CRLF)
+	if (output_eol(ca->crlf_action) == EOL_CRLF)
 		filter = cascade_filter(filter, lf_to_crlf_filter());
 	else
 		filter = cascade_filter(filter, &null_filter_singleton);
@@ -1974,6 +1971,15 @@ struct stream_filter *get_stream_filter(const struct index_state *istate,
 	return filter;
 }
 
+struct stream_filter *get_stream_filter(const struct index_state *istate,
+					const char *path,
+					const struct object_id *oid)
+{
+	struct conv_attrs ca;
+	convert_attrs(istate, &ca, path);
+	return get_stream_filter_ca(&ca, oid);
+}
+
 void free_stream_filter(struct stream_filter *filter)
 {
 	filter->vtbl->free(filter);
diff --git a/convert.h b/convert.h
index a4838b5e5c..484b50965d 100644
--- a/convert.h
+++ b/convert.h
@@ -179,6 +179,8 @@ struct stream_filter; /* opaque */
 struct stream_filter *get_stream_filter(const struct index_state *istate,
 					const char *path,
 					const struct object_id *);
+struct stream_filter *get_stream_filter_ca(const struct conv_attrs *ca,
+					   const struct object_id *oid);
 void free_stream_filter(struct stream_filter *);
 int is_null_stream_filter(struct stream_filter *);
 
-- 
2.28.0



* [PATCH v4 04/19] convert: add conv_attrs classification
  2020-11-04 20:32     ` [PATCH v4 " Matheus Tavares
                         ` (2 preceding siblings ...)
  2020-11-04 20:33       ` [PATCH v4 03/19] convert: add get_stream_filter_ca() variant Matheus Tavares
@ 2020-11-04 20:33       ` Matheus Tavares
  2020-12-05 12:07         ` Christian Couder
  2020-11-04 20:33       ` [PATCH v4 05/19] entry: extract a header file for entry.c functions Matheus Tavares
                         ` (15 subsequent siblings)
  19 siblings, 1 reply; 154+ messages in thread
From: Matheus Tavares @ 2020-11-04 20:33 UTC (permalink / raw)
  To: git
  Cc: gitster, git, chriscool, peff, newren, jrnieder, martin.agren,
	Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Create `enum conv_attrs_classification` to express the different ways
that attributes are handled for a blob during checkout.

This will be used in a later commit when deciding whether to add a file
to the parallel or delayed queue during checkout. For now, we can also
use it in get_stream_filter_ca() to simplify the function (as the
classifying logic is the same).
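
The decision tree can be condensed into a self-contained sketch. The structs here are trimmed to just the fields the classification reads (the real ones carry more members); the control flow mirrors classify_conv_attrs() from the patch below:

```c
#include <stddef.h>

/* Trimmed-down stand-ins for git's structs (illustration only). */
struct convert_driver { int process; int smudge; int clean; };

enum convert_crlf_action { CRLF_UNDEFINED, CRLF_AUTO, CRLF_AUTO_CRLF };

struct conv_attrs {
	struct convert_driver *drv;
	enum convert_crlf_action crlf_action;
	const char *working_tree_encoding;
};

enum conv_attrs_classification {
	CA_CLASS_INCORE,
	CA_CLASS_INCORE_FILTER,
	CA_CLASS_INCORE_PROCESS,
	CA_CLASS_STREAMABLE,
};

/*
 * Long-running driver processes and single-file filters are the most
 * restrictive cases; encodings and auto-CRLF force in-core handling;
 * everything else can be streamed.
 */
enum conv_attrs_classification classify_conv_attrs(const struct conv_attrs *ca)
{
	if (ca->drv) {
		if (ca->drv->process)
			return CA_CLASS_INCORE_PROCESS;
		if (ca->drv->smudge || ca->drv->clean)
			return CA_CLASS_INCORE_FILTER;
	}

	if (ca->working_tree_encoding)
		return CA_CLASS_INCORE;

	if (ca->crlf_action == CRLF_AUTO || ca->crlf_action == CRLF_AUTO_CRLF)
		return CA_CLASS_INCORE;

	return CA_CLASS_STREAMABLE;
}
```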

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
[matheus.bernardino: use classification in get_stream_filter_ca()]
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 convert.c | 26 +++++++++++++++++++-------
 convert.h | 33 +++++++++++++++++++++++++++++++++
 2 files changed, 52 insertions(+), 7 deletions(-)

diff --git a/convert.c b/convert.c
index 0a61e4e9bf..3b2d626268 100644
--- a/convert.c
+++ b/convert.c
@@ -1951,13 +1951,7 @@ struct stream_filter *get_stream_filter_ca(const struct conv_attrs *ca,
 {
 	struct stream_filter *filter = NULL;
 
-	if (ca->drv && (ca->drv->process || ca->drv->smudge || ca->drv->clean))
-		return NULL;
-
-	if (ca->working_tree_encoding)
-		return NULL;
-
-	if (ca->crlf_action == CRLF_AUTO || ca->crlf_action == CRLF_AUTO_CRLF)
+	if (classify_conv_attrs(ca) != CA_CLASS_STREAMABLE)
 		return NULL;
 
 	if (ca->ident)
@@ -2013,3 +2007,21 @@ void clone_checkout_metadata(struct checkout_metadata *dst,
 	if (blob)
 		oidcpy(&dst->blob, blob);
 }
+
+enum conv_attrs_classification classify_conv_attrs(const struct conv_attrs *ca)
+{
+	if (ca->drv) {
+		if (ca->drv->process)
+			return CA_CLASS_INCORE_PROCESS;
+		if (ca->drv->smudge || ca->drv->clean)
+			return CA_CLASS_INCORE_FILTER;
+	}
+
+	if (ca->working_tree_encoding)
+		return CA_CLASS_INCORE;
+
+	if (ca->crlf_action == CRLF_AUTO || ca->crlf_action == CRLF_AUTO_CRLF)
+		return CA_CLASS_INCORE;
+
+	return CA_CLASS_STREAMABLE;
+}
diff --git a/convert.h b/convert.h
index 484b50965d..43e567a59b 100644
--- a/convert.h
+++ b/convert.h
@@ -200,4 +200,37 @@ int stream_filter(struct stream_filter *,
 		  const char *input, size_t *isize_p,
 		  char *output, size_t *osize_p);
 
+enum conv_attrs_classification {
+	/*
+	 * The blob must be loaded into a buffer before it can be
+	 * smudged. All smudging is done in-proc.
+	 */
+	CA_CLASS_INCORE,
+
+	/*
+	 * The blob must be loaded into a buffer, but uses a
+	 * single-file driver filter, such as rot13.
+	 */
+	CA_CLASS_INCORE_FILTER,
+
+	/*
+	 * The blob must be loaded into a buffer, but uses a
+	 * long-running driver process, such as LFS. This might or
+	 * might not use delayed operations. (The important thing is
+	 * that there is a single subordinate long-running process
+	 * handling all associated blobs and in case of delayed
+	 * operations, may hold per-blob state.)
+	 */
+	CA_CLASS_INCORE_PROCESS,
+
+	/*
+	 * The blob can be streamed and smudged without needing to
+	 * completely read it into a buffer.
+	 */
+	CA_CLASS_STREAMABLE,
+};
+
+enum conv_attrs_classification classify_conv_attrs(
+	const struct conv_attrs *ca);
+
 #endif /* CONVERT_H */
-- 
2.28.0



* [PATCH v4 05/19] entry: extract a header file for entry.c functions
  2020-11-04 20:32     ` [PATCH v4 " Matheus Tavares
                         ` (3 preceding siblings ...)
  2020-11-04 20:33       ` [PATCH v4 04/19] convert: add conv_attrs classification Matheus Tavares
@ 2020-11-04 20:33       ` Matheus Tavares
  2020-12-06  8:31         ` Christian Couder
  2020-11-04 20:33       ` [PATCH v4 06/19] entry: make fstat_output() and read_blob_entry() public Matheus Tavares
                         ` (14 subsequent siblings)
  19 siblings, 1 reply; 154+ messages in thread
From: Matheus Tavares @ 2020-11-04 20:33 UTC (permalink / raw)
  To: git; +Cc: gitster, git, chriscool, peff, newren, jrnieder, martin.agren

The declarations of entry.c's public functions and structures currently
reside in cache.h. Although not many, they contribute to the size of
cache.h and, when changed, cause unnecessary recompilation of
modules that don't really use these functions. So let's move them to a
new entry.h header.

Original-patch-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 apply.c                  |  1 +
 builtin/checkout-index.c |  1 +
 builtin/checkout.c       |  1 +
 builtin/difftool.c       |  1 +
 cache.h                  | 24 -----------------------
 entry.c                  |  9 +--------
 entry.h                  | 42 ++++++++++++++++++++++++++++++++++++++++
 unpack-trees.c           |  1 +
 8 files changed, 48 insertions(+), 32 deletions(-)
 create mode 100644 entry.h

diff --git a/apply.c b/apply.c
index 76dba93c97..ddec80b4b0 100644
--- a/apply.c
+++ b/apply.c
@@ -21,6 +21,7 @@
 #include "quote.h"
 #include "rerere.h"
 #include "apply.h"
+#include "entry.h"
 
 struct gitdiff_data {
 	struct strbuf *root;
diff --git a/builtin/checkout-index.c b/builtin/checkout-index.c
index 4bbfc92dce..9276ed0258 100644
--- a/builtin/checkout-index.c
+++ b/builtin/checkout-index.c
@@ -11,6 +11,7 @@
 #include "quote.h"
 #include "cache-tree.h"
 #include "parse-options.h"
+#include "entry.h"
 
 #define CHECKOUT_ALL 4
 static int nul_term_line;
diff --git a/builtin/checkout.c b/builtin/checkout.c
index 0951f8fee5..b18b9d6f3c 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -26,6 +26,7 @@
 #include "unpack-trees.h"
 #include "wt-status.h"
 #include "xdiff-interface.h"
+#include "entry.h"
 
 static const char * const checkout_usage[] = {
 	N_("git checkout [<options>] <branch>"),
diff --git a/builtin/difftool.c b/builtin/difftool.c
index 7ac432b881..dfa22b67eb 100644
--- a/builtin/difftool.c
+++ b/builtin/difftool.c
@@ -23,6 +23,7 @@
 #include "lockfile.h"
 #include "object-store.h"
 #include "dir.h"
+#include "entry.h"
 
 static int trust_exit_code;
 
diff --git a/cache.h b/cache.h
index c0072d43b1..ccfeb9ba2b 100644
--- a/cache.h
+++ b/cache.h
@@ -1706,30 +1706,6 @@ const char *show_ident_date(const struct ident_split *id,
  */
 int ident_cmp(const struct ident_split *, const struct ident_split *);
 
-struct checkout {
-	struct index_state *istate;
-	const char *base_dir;
-	int base_dir_len;
-	struct delayed_checkout *delayed_checkout;
-	struct checkout_metadata meta;
-	unsigned force:1,
-		 quiet:1,
-		 not_new:1,
-		 clone:1,
-		 refresh_cache:1;
-};
-#define CHECKOUT_INIT { NULL, "" }
-
-#define TEMPORARY_FILENAME_LENGTH 25
-int checkout_entry(struct cache_entry *ce, const struct checkout *state, char *topath, int *nr_checkouts);
-void enable_delayed_checkout(struct checkout *state);
-int finish_delayed_checkout(struct checkout *state, int *nr_checkouts);
-/*
- * Unlink the last component and schedule the leading directories for
- * removal, such that empty directories get removed.
- */
-void unlink_entry(const struct cache_entry *ce);
-
 struct cache_def {
 	struct strbuf path;
 	int flags;
diff --git a/entry.c b/entry.c
index a0532f1f00..b0b8099699 100644
--- a/entry.c
+++ b/entry.c
@@ -6,6 +6,7 @@
 #include "submodule.h"
 #include "progress.h"
 #include "fsmonitor.h"
+#include "entry.h"
 
 static void create_directories(const char *path, int path_len,
 			       const struct checkout *state)
@@ -429,14 +430,6 @@ static void mark_colliding_entries(const struct checkout *state,
 	}
 }
 
-/*
- * Write the contents from ce out to the working tree.
- *
- * When topath[] is not NULL, instead of writing to the working tree
- * file named by ce, a temporary file is created by this function and
- * its name is returned in topath[], which must be able to hold at
- * least TEMPORARY_FILENAME_LENGTH bytes long.
- */
 int checkout_entry(struct cache_entry *ce, const struct checkout *state,
 		   char *topath, int *nr_checkouts)
 {
diff --git a/entry.h b/entry.h
new file mode 100644
index 0000000000..acbbb90220
--- /dev/null
+++ b/entry.h
@@ -0,0 +1,42 @@
+#ifndef ENTRY_H
+#define ENTRY_H
+
+#include "cache.h"
+#include "convert.h"
+
+struct checkout {
+	struct index_state *istate;
+	const char *base_dir;
+	int base_dir_len;
+	struct delayed_checkout *delayed_checkout;
+	struct checkout_metadata meta;
+	unsigned force:1,
+		 quiet:1,
+		 not_new:1,
+		 clone:1,
+		 refresh_cache:1;
+};
+#define CHECKOUT_INIT { NULL, "" }
+
+#define TEMPORARY_FILENAME_LENGTH 25
+/*
+ * Write the contents from ce out to the working tree.
+ *
+ * When topath[] is not NULL, instead of writing to the working tree
+ * file named by ce, a temporary file is created by this function and
+ * its name is returned in topath[], which must be able to hold at
+ * least TEMPORARY_FILENAME_LENGTH bytes long.
+ */
+int checkout_entry(struct cache_entry *ce, const struct checkout *state,
+		   char *topath, int *nr_checkouts);
+
+void enable_delayed_checkout(struct checkout *state);
+int finish_delayed_checkout(struct checkout *state, int *nr_checkouts);
+
+/*
+ * Unlink the last component and schedule the leading directories for
+ * removal, such that empty directories get removed.
+ */
+void unlink_entry(const struct cache_entry *ce);
+
+#endif /* ENTRY_H */
diff --git a/unpack-trees.c b/unpack-trees.c
index 323280dd48..a511fadd89 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -16,6 +16,7 @@
 #include "fsmonitor.h"
 #include "object-store.h"
 #include "promisor-remote.h"
+#include "entry.h"
 
 /*
  * Error messages expected by scripts out of plumbing commands such as
-- 
2.28.0


^ permalink raw reply	[flat|nested] 154+ messages in thread

* [PATCH v4 06/19] entry: make fstat_output() and read_blob_entry() public
  2020-11-04 20:32     ` [PATCH v4 " Matheus Tavares
                         ` (4 preceding siblings ...)
  2020-11-04 20:33       ` [PATCH v4 05/19] entry: extract a header file for entry.c functions Matheus Tavares
@ 2020-11-04 20:33       ` Matheus Tavares
  2020-11-04 20:33       ` [PATCH v4 07/19] entry: extract cache_entry update from write_entry() Matheus Tavares
                         ` (13 subsequent siblings)
  19 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-11-04 20:33 UTC (permalink / raw)
  To: git; +Cc: gitster, git, chriscool, peff, newren, jrnieder, martin.agren

These two functions will be used by the parallel checkout code, so let's
make them public. Note: since it is now public, fstat_output() is
renamed to fstat_checkout_output(), to avoid future name collisions.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 entry.c | 8 ++++----
 entry.h | 3 +++
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/entry.c b/entry.c
index b0b8099699..b36071a610 100644
--- a/entry.c
+++ b/entry.c
@@ -84,7 +84,7 @@ static int create_file(const char *path, unsigned int mode)
 	return open(path, O_WRONLY | O_CREAT | O_EXCL, mode);
 }
 
-static void *read_blob_entry(const struct cache_entry *ce, unsigned long *size)
+void *read_blob_entry(const struct cache_entry *ce, unsigned long *size)
 {
 	enum object_type type;
 	void *blob_data = read_object_file(&ce->oid, &type, size);
@@ -109,7 +109,7 @@ static int open_output_fd(char *path, const struct cache_entry *ce, int to_tempf
 	}
 }
 
-static int fstat_output(int fd, const struct checkout *state, struct stat *st)
+int fstat_checkout_output(int fd, const struct checkout *state, struct stat *st)
 {
 	/* use fstat() only when path == ce->name */
 	if (fstat_is_reliable() &&
@@ -132,7 +132,7 @@ static int streaming_write_entry(const struct cache_entry *ce, char *path,
 		return -1;
 
 	result |= stream_blob_to_fd(fd, &ce->oid, filter, 1);
-	*fstat_done = fstat_output(fd, state, statbuf);
+	*fstat_done = fstat_checkout_output(fd, state, statbuf);
 	result |= close(fd);
 
 	if (result)
@@ -346,7 +346,7 @@ static int write_entry(struct cache_entry *ce,
 
 		wrote = write_in_full(fd, new_blob, size);
 		if (!to_tempfile)
-			fstat_done = fstat_output(fd, state, &st);
+			fstat_done = fstat_checkout_output(fd, state, &st);
 		close(fd);
 		free(new_blob);
 		if (wrote < 0)
diff --git a/entry.h b/entry.h
index acbbb90220..60df93ca78 100644
--- a/entry.h
+++ b/entry.h
@@ -39,4 +39,7 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts);
  */
 void unlink_entry(const struct cache_entry *ce);
 
+void *read_blob_entry(const struct cache_entry *ce, unsigned long *size);
+int fstat_checkout_output(int fd, const struct checkout *state, struct stat *st);
+
 #endif /* ENTRY_H */
-- 
2.28.0



* [PATCH v4 07/19] entry: extract cache_entry update from write_entry()
  2020-11-04 20:32     ` [PATCH v4 " Matheus Tavares
                         ` (5 preceding siblings ...)
  2020-11-04 20:33       ` [PATCH v4 06/19] entry: make fstat_output() and read_blob_entry() public Matheus Tavares
@ 2020-11-04 20:33       ` Matheus Tavares
  2020-12-06  8:53         ` Christian Couder
  2020-11-04 20:33       ` [PATCH v4 08/19] entry: move conv_attrs lookup up to checkout_entry() Matheus Tavares
                         ` (12 subsequent siblings)
  19 siblings, 1 reply; 154+ messages in thread
From: Matheus Tavares @ 2020-11-04 20:33 UTC (permalink / raw)
  To: git; +Cc: gitster, git, chriscool, peff, newren, jrnieder, martin.agren

This code will be used by the parallel checkout functions, outside
entry.c, so let's extract it into a public function.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 entry.c | 25 ++++++++++++++++---------
 entry.h |  2 ++
 2 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/entry.c b/entry.c
index b36071a610..1d2df188e5 100644
--- a/entry.c
+++ b/entry.c
@@ -251,6 +251,18 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
 	return errs;
 }
 
+void update_ce_after_write(const struct checkout *state, struct cache_entry *ce,
+			   struct stat *st)
+{
+	if (state->refresh_cache) {
+		assert(state->istate);
+		fill_stat_cache_info(state->istate, ce, st);
+		ce->ce_flags |= CE_UPDATE_IN_BASE;
+		mark_fsmonitor_invalid(state->istate, ce);
+		state->istate->cache_changed |= CE_ENTRY_CHANGED;
+	}
+}
+
 static int write_entry(struct cache_entry *ce,
 		       char *path, const struct checkout *state, int to_tempfile)
 {
@@ -371,15 +383,10 @@ static int write_entry(struct cache_entry *ce,
 
 finish:
 	if (state->refresh_cache) {
-		assert(state->istate);
-		if (!fstat_done)
-			if (lstat(ce->name, &st) < 0)
-				return error_errno("unable to stat just-written file %s",
-						   ce->name);
-		fill_stat_cache_info(state->istate, ce, &st);
-		ce->ce_flags |= CE_UPDATE_IN_BASE;
-		mark_fsmonitor_invalid(state->istate, ce);
-		state->istate->cache_changed |= CE_ENTRY_CHANGED;
+		if (!fstat_done && lstat(ce->name, &st) < 0)
+			return error_errno("unable to stat just-written file %s",
+					   ce->name);
+		update_ce_after_write(state, ce, &st);
 	}
 delayed:
 	return 0;
diff --git a/entry.h b/entry.h
index 60df93ca78..ea7290bcd5 100644
--- a/entry.h
+++ b/entry.h
@@ -41,5 +41,7 @@ void unlink_entry(const struct cache_entry *ce);
 
 void *read_blob_entry(const struct cache_entry *ce, unsigned long *size);
 int fstat_checkout_output(int fd, const struct checkout *state, struct stat *st);
+void update_ce_after_write(const struct checkout *state, struct cache_entry *ce,
+			   struct stat *st);
 
 #endif /* ENTRY_H */
-- 
2.28.0



* [PATCH v4 08/19] entry: move conv_attrs lookup up to checkout_entry()
  2020-11-04 20:32     ` [PATCH v4 " Matheus Tavares
                         ` (6 preceding siblings ...)
  2020-11-04 20:33       ` [PATCH v4 07/19] entry: extract cache_entry update from write_entry() Matheus Tavares
@ 2020-11-04 20:33       ` Matheus Tavares
  2020-12-06  9:35         ` Christian Couder
  2020-11-04 20:33       ` [PATCH v4 09/19] entry: add checkout_entry_ca() which takes preloaded conv_attrs Matheus Tavares
                         ` (11 subsequent siblings)
  19 siblings, 1 reply; 154+ messages in thread
From: Matheus Tavares @ 2020-11-04 20:33 UTC (permalink / raw)
  To: git; +Cc: gitster, git, chriscool, peff, newren, jrnieder, martin.agren

In a following patch, checkout_entry() will use conv_attrs to decide
whether an entry should be enqueued for parallel checkout or not. But
the attributes lookup only happens lower in this call stack. To avoid
the unnecessary work of loading the attributes twice, let's move it up
to checkout_entry(), and pass the loaded struct down to write_entry().

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 entry.c | 38 +++++++++++++++++++++++++++-----------
 1 file changed, 27 insertions(+), 11 deletions(-)

diff --git a/entry.c b/entry.c
index 1d2df188e5..486712c3a9 100644
--- a/entry.c
+++ b/entry.c
@@ -263,8 +263,9 @@ void update_ce_after_write(const struct checkout *state, struct cache_entry *ce,
 	}
 }
 
-static int write_entry(struct cache_entry *ce,
-		       char *path, const struct checkout *state, int to_tempfile)
+/* Note: ca is used (and required) iff the entry refers to a regular file. */
+static int write_entry(struct cache_entry *ce, char *path, struct conv_attrs *ca,
+		       const struct checkout *state, int to_tempfile)
 {
 	unsigned int ce_mode_s_ifmt = ce->ce_mode & S_IFMT;
 	struct delayed_checkout *dco = state->delayed_checkout;
@@ -281,8 +282,7 @@ static int write_entry(struct cache_entry *ce,
 	clone_checkout_metadata(&meta, &state->meta, &ce->oid);
 
 	if (ce_mode_s_ifmt == S_IFREG) {
-		struct stream_filter *filter = get_stream_filter(state->istate, ce->name,
-								 &ce->oid);
+		struct stream_filter *filter = get_stream_filter_ca(ca, &ce->oid);
 		if (filter &&
 		    !streaming_write_entry(ce, path, filter,
 					   state, to_tempfile,
@@ -329,14 +329,17 @@ static int write_entry(struct cache_entry *ce,
 		 * Convert from git internal format to working tree format
 		 */
 		if (dco && dco->state != CE_NO_DELAY) {
-			ret = async_convert_to_working_tree(state->istate, ce->name, new_blob,
-							    size, &buf, &meta, dco);
+			ret = async_convert_to_working_tree_ca(ca, ce->name,
+							       new_blob, size,
+							       &buf, &meta, dco);
 			if (ret && string_list_has_string(&dco->paths, ce->name)) {
 				free(new_blob);
 				goto delayed;
 			}
-		} else
-			ret = convert_to_working_tree(state->istate, ce->name, new_blob, size, &buf, &meta);
+		} else {
+			ret = convert_to_working_tree_ca(ca, ce->name, new_blob,
+							 size, &buf, &meta);
+		}
 
 		if (ret) {
 			free(new_blob);
@@ -442,6 +445,7 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state,
 {
 	static struct strbuf path = STRBUF_INIT;
 	struct stat st;
+	struct conv_attrs ca_buf, *ca = NULL;
 
 	if (ce->ce_flags & CE_WT_REMOVE) {
 		if (topath)
@@ -454,8 +458,13 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state,
 		return 0;
 	}
 
-	if (topath)
-		return write_entry(ce, topath, state, 1);
+	if (topath) {
+		if (S_ISREG(ce->ce_mode)) {
+			convert_attrs(state->istate, &ca_buf, ce->name);
+			ca = &ca_buf;
+		}
+		return write_entry(ce, topath, ca, state, 1);
+	}
 
 	strbuf_reset(&path);
 	strbuf_add(&path, state->base_dir, state->base_dir_len);
@@ -517,9 +526,16 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state,
 		return 0;
 
 	create_directories(path.buf, path.len, state);
+
 	if (nr_checkouts)
 		(*nr_checkouts)++;
-	return write_entry(ce, path.buf, state, 0);
+
+	if (S_ISREG(ce->ce_mode)) {
+		convert_attrs(state->istate, &ca_buf, ce->name);
+		ca = &ca_buf;
+	}
+
+	return write_entry(ce, path.buf, NULL, state, 0);
 }
 
 void unlink_entry(const struct cache_entry *ce)
-- 
2.28.0



* [PATCH v4 09/19] entry: add checkout_entry_ca() which takes preloaded conv_attrs
  2020-11-04 20:32     ` [PATCH v4 " Matheus Tavares
                         ` (7 preceding siblings ...)
  2020-11-04 20:33       ` [PATCH v4 08/19] entry: move conv_attrs lookup up to checkout_entry() Matheus Tavares
@ 2020-11-04 20:33       ` Matheus Tavares
  2020-12-06 10:02         ` Christian Couder
  2020-11-04 20:33       ` [PATCH v4 10/19] unpack-trees: add basic support for parallel checkout Matheus Tavares
                         ` (10 subsequent siblings)
  19 siblings, 1 reply; 154+ messages in thread
From: Matheus Tavares @ 2020-11-04 20:33 UTC (permalink / raw)
  To: git; +Cc: gitster, git, chriscool, peff, newren, jrnieder, martin.agren

The parallel checkout machinery will call checkout_entry() for entries
that could not be written in parallel due to path collisions. At this
point, we will already be holding the conversion attributes for each
entry, and it would be wasteful to let checkout_entry() load these
again. Instead, let's add the checkout_entry_ca() variant, which
optionally takes a preloaded conv_attrs struct.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 entry.c | 13 +++++++------
 entry.h | 12 ++++++++++--
 2 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/entry.c b/entry.c
index 486712c3a9..9d79a5671f 100644
--- a/entry.c
+++ b/entry.c
@@ -440,12 +440,13 @@ static void mark_colliding_entries(const struct checkout *state,
 	}
 }
 
-int checkout_entry(struct cache_entry *ce, const struct checkout *state,
-		   char *topath, int *nr_checkouts)
+int checkout_entry_ca(struct cache_entry *ce, struct conv_attrs *ca,
+		      const struct checkout *state, char *topath,
+		      int *nr_checkouts)
 {
 	static struct strbuf path = STRBUF_INIT;
 	struct stat st;
-	struct conv_attrs ca_buf, *ca = NULL;
+	struct conv_attrs ca_buf;
 
 	if (ce->ce_flags & CE_WT_REMOVE) {
 		if (topath)
@@ -459,7 +460,7 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state,
 	}
 
 	if (topath) {
-		if (S_ISREG(ce->ce_mode)) {
+		if (S_ISREG(ce->ce_mode) && !ca) {
 			convert_attrs(state->istate, &ca_buf, ce->name);
 			ca = &ca_buf;
 		}
@@ -530,12 +531,12 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state,
 	if (nr_checkouts)
 		(*nr_checkouts)++;
 
-	if (S_ISREG(ce->ce_mode)) {
+	if (S_ISREG(ce->ce_mode) && !ca) {
 		convert_attrs(state->istate, &ca_buf, ce->name);
 		ca = &ca_buf;
 	}
 
-	return write_entry(ce, path.buf, NULL, state, 0);
+	return write_entry(ce, path.buf, ca, state, 0);
 }
 
 void unlink_entry(const struct cache_entry *ce)
diff --git a/entry.h b/entry.h
index ea7290bcd5..d8244c5db2 100644
--- a/entry.h
+++ b/entry.h
@@ -26,9 +26,17 @@ struct checkout {
  * file named by ce, a temporary file is created by this function and
  * its name is returned in topath[], which must be able to hold at
  * least TEMPORARY_FILENAME_LENGTH bytes long.
+ *
+ * With checkout_entry_ca(), callers can optionally pass a preloaded
+ * conv_attrs struct (to avoid reloading it), when ce refers to a
+ * regular file. If ca is NULL, the attributes will be loaded
+ * internally when (and if) needed.
  */
-int checkout_entry(struct cache_entry *ce, const struct checkout *state,
-		   char *topath, int *nr_checkouts);
+#define checkout_entry(ce, state, topath, nr_checkouts) \
+		checkout_entry_ca(ce, NULL, state, topath, nr_checkouts)
+int checkout_entry_ca(struct cache_entry *ce, struct conv_attrs *ca,
+		      const struct checkout *state, char *topath,
+		      int *nr_checkouts);
 
 void enable_delayed_checkout(struct checkout *state);
 int finish_delayed_checkout(struct checkout *state, int *nr_checkouts);
-- 
2.28.0



* [PATCH v4 10/19] unpack-trees: add basic support for parallel checkout
  2020-11-04 20:32     ` [PATCH v4 " Matheus Tavares
                         ` (8 preceding siblings ...)
  2020-11-04 20:33       ` [PATCH v4 09/19] entry: add checkout_entry_ca() which takes preloaded conv_attrs Matheus Tavares
@ 2020-11-04 20:33       ` Matheus Tavares
  2020-12-06 11:36         ` Christian Couder
  2020-11-04 20:33       ` [PATCH v4 11/19] parallel-checkout: make it truly parallel Matheus Tavares
                         ` (9 subsequent siblings)
  19 siblings, 1 reply; 154+ messages in thread
From: Matheus Tavares @ 2020-11-04 20:33 UTC (permalink / raw)
  To: git; +Cc: gitster, git, chriscool, peff, newren, jrnieder, martin.agren

This new interface allows us to enqueue some of the entries being
checked out to later call write_entry() for them in parallel. For now,
the parallel checkout machinery is enabled by default and there is no
user configuration, but run_parallel_checkout() just writes the queued
entries in sequence (without spawning additional workers). The next
patch will actually implement the parallelism and, later, we will make
it configurable.

When there are path collisions among the entries being written (which
can happen e.g. with case-sensitive files in case-insensitive file
systems), the parallel checkout code detects the problem and marks the
item with PC_ITEM_COLLIDED. Later, these items are sequentially fed to
checkout_entry() again. This is similar to the way the sequential code
deals with collisions, overwriting the previously checked out entries
with the subsequent ones. The only difference is that, when we start
writing the entries in parallel, we won't be able to determine which of
the colliding entries will survive on disk (for the sequential
algorithm, it is always the last one).

I also experimented with the idea of not overwriting colliding entries,
and it seemed to work well in my simple tests. However, because only one
entry of each colliding group would actually be written, the others
would be left with null lstat() fields in the index. This might not be a
problem by itself, but it could cause performance penalties for
subsequent commands that need to refresh the index: when the cached
st_size value is 0, read-cache.c:ie_modified() will go to the filesystem
to see if the contents match. As mentioned in the function:

    * Immediately after read-tree or update-index --cacheinfo,
    * the length field is zero, as we have never even read the
    * lstat(2) information once, and we cannot trust DATA_CHANGED
    * returned by ie_match_stat() which in turn was returned by
    * ce_match_stat_basic() to signal that the filesize of the
    * blob changed.  We have to actually go to the filesystem to
    * see if the contents match, and if so, should answer "unchanged".

So, if we have N entries in a colliding group and we decide to write and
lstat() only one of them, every subsequent git-status will have to read,
convert, and hash the written file N - 1 times, to check that the N - 1
unwritten entries are dirty. By checking out all colliding entries (like
the sequential code does), we only pay the overhead once.

Co-authored-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Co-authored-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 Makefile            |   1 +
 entry.c             |  17 ++-
 parallel-checkout.c | 362 ++++++++++++++++++++++++++++++++++++++++++++
 parallel-checkout.h |  27 ++++
 unpack-trees.c      |   6 +-
 5 files changed, 410 insertions(+), 3 deletions(-)
 create mode 100644 parallel-checkout.c
 create mode 100644 parallel-checkout.h

diff --git a/Makefile b/Makefile
index 1fb0ec1705..10ee5e709b 100644
--- a/Makefile
+++ b/Makefile
@@ -945,6 +945,7 @@ LIB_OBJS += pack-revindex.o
 LIB_OBJS += pack-write.o
 LIB_OBJS += packfile.o
 LIB_OBJS += pager.o
+LIB_OBJS += parallel-checkout.o
 LIB_OBJS += parse-options-cb.o
 LIB_OBJS += parse-options.o
 LIB_OBJS += patch-delta.o
diff --git a/entry.c b/entry.c
index 9d79a5671f..6676954431 100644
--- a/entry.c
+++ b/entry.c
@@ -7,6 +7,7 @@
 #include "progress.h"
 #include "fsmonitor.h"
 #include "entry.h"
+#include "parallel-checkout.h"
 
 static void create_directories(const char *path, int path_len,
 			       const struct checkout *state)
@@ -426,8 +427,17 @@ static void mark_colliding_entries(const struct checkout *state,
 	for (i = 0; i < state->istate->cache_nr; i++) {
 		struct cache_entry *dup = state->istate->cache[i];
 
-		if (dup == ce)
-			break;
+		if (dup == ce) {
+			/*
+			 * Parallel checkout creates the files in no particular
+			 * order. So the other side of the collision may appear
+			 * after the given cache_entry in the array.
+			 */
+			if (parallel_checkout_status() == PC_RUNNING)
+				continue;
+			else
+				break;
+		}
 
 		if (dup->ce_flags & (CE_MATCHED | CE_VALID | CE_SKIP_WORKTREE))
 			continue;
@@ -536,6 +546,9 @@ int checkout_entry_ca(struct cache_entry *ce, struct conv_attrs *ca,
 		ca = &ca_buf;
 	}
 
+	if (!enqueue_checkout(ce, ca))
+		return 0;
+
 	return write_entry(ce, path.buf, ca, state, 0);
 }
 
diff --git a/parallel-checkout.c b/parallel-checkout.c
new file mode 100644
index 0000000000..fd871b09d3
--- /dev/null
+++ b/parallel-checkout.c
@@ -0,0 +1,362 @@
+#include "cache.h"
+#include "entry.h"
+#include "parallel-checkout.h"
+#include "streaming.h"
+
+enum pc_item_status {
+	PC_ITEM_PENDING = 0,
+	PC_ITEM_WRITTEN,
+	/*
+	 * The entry could not be written because there was another file
+	 * already present in its path or leading directories. Since
+	 * checkout_entry_ca() removes such files from the working tree before
+	 * enqueueing the entry for parallel checkout, it means that there was
+	 * a path collision among the entries being written.
+	 */
+	PC_ITEM_COLLIDED,
+	PC_ITEM_FAILED,
+};
+
+struct parallel_checkout_item {
+	/* pointer to an istate->cache[] entry. Not owned by us. */
+	struct cache_entry *ce;
+	struct conv_attrs ca;
+	struct stat st;
+	enum pc_item_status status;
+};
+
+struct parallel_checkout {
+	enum pc_status status;
+	struct parallel_checkout_item *items;
+	size_t nr, alloc;
+};
+
+static struct parallel_checkout parallel_checkout;
+
+enum pc_status parallel_checkout_status(void)
+{
+	return parallel_checkout.status;
+}
+
+void init_parallel_checkout(void)
+{
+	if (parallel_checkout.status != PC_UNINITIALIZED)
+		BUG("parallel checkout already initialized");
+
+	parallel_checkout.status = PC_ACCEPTING_ENTRIES;
+}
+
+static void finish_parallel_checkout(void)
+{
+	if (parallel_checkout.status == PC_UNINITIALIZED)
+		BUG("cannot finish parallel checkout: not initialized yet");
+
+	free(parallel_checkout.items);
+	memset(&parallel_checkout, 0, sizeof(parallel_checkout));
+}
+
+static int is_eligible_for_parallel_checkout(const struct cache_entry *ce,
+					     const struct conv_attrs *ca)
+{
+	enum conv_attrs_classification c;
+
+	if (!S_ISREG(ce->ce_mode))
+		return 0;
+
+	c = classify_conv_attrs(ca);
+	switch (c) {
+	case CA_CLASS_INCORE:
+		return 1;
+
+	case CA_CLASS_INCORE_FILTER:
+		/*
+		 * It would be safe to allow concurrent instances of
+		 * single-file smudge filters, like rot13, but we should not
+		 * assume that all filters are parallel-process safe. So we
+		 * don't allow this.
+		 */
+		return 0;
+
+	case CA_CLASS_INCORE_PROCESS:
+		/*
+		 * The parallel queue and the delayed queue are not compatible,
+		 * so they must be kept completely separated. And we can't tell
+		 * if a long-running process will delay its response without
+		 * actually asking it to perform the filtering. Therefore, this
+		 * type of filter is not allowed in parallel checkout.
+		 *
+		 * Furthermore, there should only be one instance of the
+		 * long-running process filter as we don't know how it is
+		 * managing its own concurrency. So, spreading the entries that
+		 * need such a filter among the parallel workers would
+		 * require a lot more inter-process communication. We would
+		 * probably have to designate a single process to interact with
+		 * the filter and send all the necessary data to it, for each
+		 * entry.
+		 */
+		return 0;
+
+	case CA_CLASS_STREAMABLE:
+		return 1;
+
+	default:
+		BUG("unsupported conv_attrs classification '%d'", c);
+	}
+}
+
+int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca)
+{
+	struct parallel_checkout_item *pc_item;
+
+	if (parallel_checkout.status != PC_ACCEPTING_ENTRIES ||
+	    !is_eligible_for_parallel_checkout(ce, ca))
+		return -1;
+
+	ALLOC_GROW(parallel_checkout.items, parallel_checkout.nr + 1,
+		   parallel_checkout.alloc);
+
+	pc_item = &parallel_checkout.items[parallel_checkout.nr++];
+	pc_item->ce = ce;
+	memcpy(&pc_item->ca, ca, sizeof(pc_item->ca));
+	pc_item->status = PC_ITEM_PENDING;
+
+	return 0;
+}
+
+static int handle_results(struct checkout *state)
+{
+	int ret = 0;
+	size_t i;
+	int have_pending = 0;
+
+	/*
+	 * We first update the successfully written entries with the collected
+	 * stat() data, so that they can be found by mark_colliding_entries(),
+	 * in the next loop, when necessary.
+	 */
+	for (i = 0; i < parallel_checkout.nr; i++) {
+		struct parallel_checkout_item *pc_item = &parallel_checkout.items[i];
+		if (pc_item->status == PC_ITEM_WRITTEN)
+			update_ce_after_write(state, pc_item->ce, &pc_item->st);
+	}
+
+	for (i = 0; i < parallel_checkout.nr; i++) {
+		struct parallel_checkout_item *pc_item = &parallel_checkout.items[i];
+
+		switch(pc_item->status) {
+		case PC_ITEM_WRITTEN:
+			/* Already handled */
+			break;
+		case PC_ITEM_COLLIDED:
+			/*
+			 * The entry could not be checked out due to a path
+			 * collision with another entry. Since there can only
+			 * be one entry of each colliding group on the disk, we
+			 * could skip trying to check out this one and move on.
+			 * However, this would leave the unwritten entries with
+			 * null stat() fields in the index, which could
+			 * potentially slow down subsequent operations that
+			 * require refreshing it: git would not be able to
+			 * trust st_size and would have to go to the filesystem
+			 * to see if the contents match (see ie_modified()).
+			 *
+			 * Instead, let's pay the overhead only once, now, and
+			 * call checkout_entry_ca() again for this file, to
+			 * have its stat() data stored in the index. This also
+			 * has the benefit of adding this entry and its
+			 * colliding pair to the collision report message.
+			 * Additionally, this overwriting behavior is consistent
+			 * with what the sequential checkout does, so it doesn't
+			 * add any extra overhead.
+			 */
+			ret |= checkout_entry_ca(pc_item->ce, &pc_item->ca,
+						 state, NULL, NULL);
+			break;
+		case PC_ITEM_PENDING:
+			have_pending = 1;
+			/* fall through */
+		case PC_ITEM_FAILED:
+			ret = -1;
+			break;
+		default:
+			BUG("unknown checkout item status in parallel checkout");
+		}
+	}
+
+	if (have_pending)
+		error(_("parallel checkout finished with pending entries"));
+
+	return ret;
+}
+
+static int reset_fd(int fd, const char *path)
+{
+	if (lseek(fd, 0, SEEK_SET) != 0)
+		return error_errno("failed to rewind descriptor of %s", path);
+	if (ftruncate(fd, 0))
+		return error_errno("failed to truncate file %s", path);
+	return 0;
+}
+
+static int write_pc_item_to_fd(struct parallel_checkout_item *pc_item, int fd,
+			       const char *path)
+{
+	int ret;
+	struct stream_filter *filter;
+	struct strbuf buf = STRBUF_INIT;
+	char *new_blob;
+	unsigned long size;
+	size_t newsize = 0;
+	ssize_t wrote;
+
+	/* Sanity check */
+	assert(is_eligible_for_parallel_checkout(pc_item->ce, &pc_item->ca));
+
+	filter = get_stream_filter_ca(&pc_item->ca, &pc_item->ce->oid);
+	if (filter) {
+		if (stream_blob_to_fd(fd, &pc_item->ce->oid, filter, 1)) {
+			/* On error, reset fd to try writing without streaming */
+			if (reset_fd(fd, path))
+				return -1;
+		} else {
+			return 0;
+		}
+	}
+
+	new_blob = read_blob_entry(pc_item->ce, &size);
+	if (!new_blob)
+		return error("unable to read sha1 file of %s (%s)", path,
+			     oid_to_hex(&pc_item->ce->oid));
+
+	/*
+	 * checkout metadata is used to give context for external process
+	 * filters. Files requiring such filters are not eligible for parallel
+	 * checkout, so pass NULL.
+	 */
+	ret = convert_to_working_tree_ca(&pc_item->ca, pc_item->ce->name,
+					 new_blob, size, &buf, NULL);
+
+	if (ret) {
+		free(new_blob);
+		new_blob = strbuf_detach(&buf, &newsize);
+		size = newsize;
+	}
+
+	wrote = write_in_full(fd, new_blob, size);
+	free(new_blob);
+	if (wrote < 0)
+		return error("unable to write file %s", path);
+
+	return 0;
+}
+
+static int close_and_clear(int *fd)
+{
+	int ret = 0;
+
+	if (*fd >= 0) {
+		ret = close(*fd);
+		*fd = -1;
+	}
+
+	return ret;
+}
+
+static void write_pc_item(struct parallel_checkout_item *pc_item,
+			  struct checkout *state)
+{
+	unsigned int mode = (pc_item->ce->ce_mode & 0100) ? 0777 : 0666;
+	int fd = -1, fstat_done = 0;
+	struct strbuf path = STRBUF_INIT;
+	const char *dir_sep;
+
+	strbuf_add(&path, state->base_dir, state->base_dir_len);
+	strbuf_add(&path, pc_item->ce->name, pc_item->ce->ce_namelen);
+
+	dir_sep = find_last_dir_sep(path.buf);
+
+	/*
+	 * The leading dirs should have been already created by now. But, in
+	 * case of path collisions, one of the dirs could have been replaced by
+	 * a symlink (checked out after we enqueued this entry for parallel
+	 * checkout). Thus, we must check the leading dirs again.
+	 */
+	if (dir_sep && !has_dirs_only_path(path.buf, dir_sep - path.buf,
+					   state->base_dir_len)) {
+		pc_item->status = PC_ITEM_COLLIDED;
+		goto out;
+	}
+
+	fd = open(path.buf, O_WRONLY | O_CREAT | O_EXCL, mode);
+
+	if (fd < 0) {
+		if (errno == EEXIST || errno == EISDIR) {
+			/*
+			 * Errors which probably represent a path collision.
+			 * Suppress the error message and mark the item to be
+			 * retried later, sequentially. ENOTDIR and ENOENT are
+			 * also interesting, but the above has_dirs_only_path()
+			 * call should have already caught these cases.
+			 */
+			pc_item->status = PC_ITEM_COLLIDED;
+		} else {
+			error_errno("failed to open file %s", path.buf);
+			pc_item->status = PC_ITEM_FAILED;
+		}
+		goto out;
+	}
+
+	if (write_pc_item_to_fd(pc_item, fd, path.buf)) {
+		/* Error was already reported. */
+		pc_item->status = PC_ITEM_FAILED;
+		goto out;
+	}
+
+	fstat_done = fstat_checkout_output(fd, state, &pc_item->st);
+
+	if (close_and_clear(&fd)) {
+		error_errno("unable to close file %s", path.buf);
+		pc_item->status = PC_ITEM_FAILED;
+		goto out;
+	}
+
+	if (state->refresh_cache && !fstat_done && lstat(path.buf, &pc_item->st) < 0) {
+		error_errno("unable to stat just-written file %s", path.buf);
+		pc_item->status = PC_ITEM_FAILED;
+		goto out;
+	}
+
+	pc_item->status = PC_ITEM_WRITTEN;
+
+out:
+	/*
+	 * No need to check close() return. At this point, either fd is already
+	 * closed, or we are on an error path, that has already been reported.
+	 */
+	close_and_clear(&fd);
+	strbuf_release(&path);
+}
+
+static void write_items_sequentially(struct checkout *state)
+{
+	size_t i;
+
+	for (i = 0; i < parallel_checkout.nr; i++)
+		write_pc_item(&parallel_checkout.items[i], state);
+}
+
+int run_parallel_checkout(struct checkout *state)
+{
+	int ret;
+
+	if (parallel_checkout.status != PC_ACCEPTING_ENTRIES)
+		BUG("cannot run parallel checkout: uninitialized or already running");
+
+	parallel_checkout.status = PC_RUNNING;
+
+	write_items_sequentially(state);
+	ret = handle_results(state);
+
+	finish_parallel_checkout();
+	return ret;
+}
diff --git a/parallel-checkout.h b/parallel-checkout.h
new file mode 100644
index 0000000000..e6d6fc01ea
--- /dev/null
+++ b/parallel-checkout.h
@@ -0,0 +1,27 @@
+#ifndef PARALLEL_CHECKOUT_H
+#define PARALLEL_CHECKOUT_H
+
+struct cache_entry;
+struct checkout;
+struct conv_attrs;
+
+enum pc_status {
+	PC_UNINITIALIZED = 0,
+	PC_ACCEPTING_ENTRIES,
+	PC_RUNNING,
+};
+
+enum pc_status parallel_checkout_status(void);
+void init_parallel_checkout(void);
+
+/*
+ * Return -1 if parallel checkout is currently not enabled or if the entry is
+ * not eligible for parallel checkout. Otherwise, enqueue the entry for later
+ * write and return 0.
+ */
+int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca);
+
+/* Write all the queued entries, returning 0 on success. */
+int run_parallel_checkout(struct checkout *state);
+
+#endif /* PARALLEL_CHECKOUT_H */
diff --git a/unpack-trees.c b/unpack-trees.c
index a511fadd89..1b1da7485a 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -17,6 +17,7 @@
 #include "object-store.h"
 #include "promisor-remote.h"
 #include "entry.h"
+#include "parallel-checkout.h"
 
 /*
  * Error messages expected by scripts out of plumbing commands such as
@@ -438,7 +439,6 @@ static int check_updates(struct unpack_trees_options *o,
 	if (should_update_submodules())
 		load_gitmodules_file(index, &state);
 
-	enable_delayed_checkout(&state);
 	if (has_promisor_remote()) {
 		/*
 		 * Prefetch the objects that are to be checked out in the loop
@@ -461,6 +461,9 @@ static int check_updates(struct unpack_trees_options *o,
 					   to_fetch.oid, to_fetch.nr);
 		oid_array_clear(&to_fetch);
 	}
+
+	enable_delayed_checkout(&state);
+	init_parallel_checkout();
 	for (i = 0; i < index->cache_nr; i++) {
 		struct cache_entry *ce = index->cache[i];
 
@@ -474,6 +477,7 @@ static int check_updates(struct unpack_trees_options *o,
 		}
 	}
 	stop_progress(&progress);
+	errs |= run_parallel_checkout(&state);
 	errs |= finish_delayed_checkout(&state, NULL);
 	git_attr_set_direction(GIT_ATTR_CHECKIN);
 
-- 
2.28.0


^ permalink raw reply	[flat|nested] 154+ messages in thread
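
The collision detection in write_pc_item() above hinges on opening with
O_CREAT|O_EXCL, so a concurrent writer that got there first surfaces as
EEXIST (or EISDIR for a directory). A minimal sketch of that check, using a
hypothetical helper rather than git's actual code:

```c
/*
 * Hypothetical helper (not git's code) illustrating the collision check
 * above: O_CREAT|O_EXCL makes open() fail with EEXIST/EISDIR when the
 * path is already occupied, which the caller treats as a collision to
 * be retried sequentially later.
 */
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 0 on success (*out_fd set), 1 on collision, -1 on other errors. */
static int open_for_checkout(const char *path, mode_t mode, int *out_fd)
{
	int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, mode);

	if (fd < 0)
		return (errno == EEXIST || errno == EISDIR) ? 1 : -1;
	*out_fd = fd;
	return 0;
}
```

Opening the same path twice this way yields success then a collision,
mirroring how only one of two colliding entries gets written and the loser
is deferred to the sequential retry phase.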

* [PATCH v4 11/19] parallel-checkout: make it truly parallel
  2020-11-04 20:32     ` [PATCH v4 " Matheus Tavares
                         ` (9 preceding siblings ...)
  2020-11-04 20:33       ` [PATCH v4 10/19] unpack-trees: add basic support for parallel checkout Matheus Tavares
@ 2020-11-04 20:33       ` Matheus Tavares
  2020-12-16 22:31         ` Emily Shaffer
  2020-11-04 20:33       ` [PATCH v4 12/19] parallel-checkout: support progress displaying Matheus Tavares
                         ` (8 subsequent siblings)
  19 siblings, 1 reply; 154+ messages in thread
From: Matheus Tavares @ 2020-11-04 20:33 UTC (permalink / raw)
  To: git; +Cc: gitster, git, chriscool, peff, newren, jrnieder, martin.agren

Use multiple worker processes to distribute the queued entries and call
write_checkout_item() in parallel for them. The items are distributed
uniformly in contiguous chunks. This minimizes the chances of two
workers writing to the same directory simultaneously, which could
affect performance due to lock contention in the kernel. Work stealing
(or any other form of re-distribution) is not implemented yet.
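
The contiguous-chunk split can be sketched with two hypothetical helpers
(not git's code) that mirror the arithmetic later used in setup_workers():

```c
#include <stddef.h>

/*
 * Hypothetical helpers (not git's code): every worker gets a base batch
 * of nr / num_workers items, and the first nr % num_workers workers get
 * one extra item, so consecutive workers cover consecutive index ranges.
 */
static size_t batch_size(size_t nr, int num_workers, int worker_id)
{
	size_t base = nr / num_workers;
	int extra = nr % num_workers; /* workers with one extra item */

	return base + (worker_id < extra ? 1 : 0);
}

static size_t batch_start(size_t nr, int num_workers, int worker_id)
{
	size_t start = 0;
	int i;

	for (i = 0; i < worker_id; i++)
		start += batch_size(nr, num_workers, i);
	return start;
}
```

For example, 10 entries over 3 workers yields batches of sizes 4, 3 and 3,
starting at items 0, 4 and 7.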

The parallel version was benchmarked during three operations in the
linux repo, with cold cache: cloning v5.8, checking out v5.8 from
v2.6.15 (checkout I) and checking out v5.8 from v5.7 (checkout II). The
four tables below show the mean run times and standard deviations of 5
runs in: a local file system with SSD, a local file system with HDD, a
Linux NFS server, and Amazon EFS. The number of workers was chosen
based on what produced the best result for each case.

Local SSD:

            Clone                  Checkout I             Checkout II
Sequential  8.171 s ± 0.206 s      8.735 s ± 0.230 s      4.166 s ± 0.246 s
10 workers  3.277 s ± 0.138 s      3.774 s ± 0.188 s      2.561 s ± 0.120 s
Speedup     2.49 ± 0.12            2.31 ± 0.13            1.63 ± 0.12

Local HDD:

            Clone                  Checkout I             Checkout II
Sequential  35.157 s ± 0.205 s     48.835 s ± 0.407 s     47.302 s ± 1.435 s
8 workers   35.538 s ± 0.325 s     49.353 s ± 0.826 s     48.919 s ± 0.416 s
Speedup     0.99 ± 0.01            0.99 ± 0.02            0.97 ± 0.03

Linux NFS server (v4.1, on EBS, single availability zone):

            Clone                  Checkout I             Checkout II
Sequential  216.070 s ± 3.611 s    211.169 s ± 3.147 s    57.446 s ± 1.301 s
32 workers  67.997 s ± 0.740 s     66.563 s ± 0.457 s     23.708 s ± 0.622 s
Speedup     3.18 ± 0.06            3.17 ± 0.05            2.42 ± 0.08

EFS (v4.1, replicated over multiple availability zones):

            Clone                  Checkout I             Checkout II
Sequential  1249.329 s ± 13.857 s  1438.979 s ± 78.792 s  543.919 s ± 18.745 s
64 workers  225.864 s ± 12.433 s   316.345 s ± 1.887 s    183.648 s ± 10.095 s
Speedup     5.53 ± 0.31            4.55 ± 0.25            2.96 ± 0.19

The above benchmarks show that parallel checkout is most effective on
repositories located on an SSD or over a distributed file system. For
local file systems on spinning disks, and/or older machines, parallelism
does not always improve performance; in fact, it can even increase the
run time. For this reason, the sequential code is still the default. Two
settings are added to optionally enable and configure the new parallel
version as desired.
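
The resulting enablement rule can be condensed into a small sketch
(hypothetical helper, not git's code): parallelism only kicks in when,
after clamping the worker count to the queue size, more than one worker
remains and the queue reaches the threshold.

```c
#include <stddef.h>

/*
 * Hypothetical condensation (not git's code) of the decision made in
 * run_parallel_checkout(): write sequentially unless enough entries are
 * queued and more than one worker would actually be used.
 */
static int use_parallel(int num_workers, size_t queued, int threshold)
{
	if ((size_t)num_workers > queued)
		num_workers = (int)queued;
	return num_workers > 1 && queued >= (size_t)threshold;
}
```

For instance, with checkout.workers=10 and
checkout.thresholdForParallelism=100, a 50-entry checkout still runs
sequentially.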

Local SSD tests were executed on an i7-7700HQ (4 cores with
hyper-threading) running Manjaro Linux. Local HDD tests were executed on
an i7-2600 (also 4 cores with hyper-threading), with a Seagate Barracuda
7200 rpm SATA 3.0 HDD, running Debian 9.13. NFS and EFS tests were
executed on an Amazon EC2 c5n.large instance, with 2 vCPUs. The Linux
NFS server was running on an m6g.large instance with a 1 TB EBS GP2
volume. Before each timing, the linux repository was removed (or checked
out back), and `sync && sysctl vm.drop_caches=3` was executed.

Co-authored-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Co-authored-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 .gitignore                        |   1 +
 Documentation/config/checkout.txt |  21 +++
 Makefile                          |   1 +
 builtin.h                         |   1 +
 builtin/checkout--helper.c        | 142 +++++++++++++++
 git.c                             |   2 +
 parallel-checkout.c               | 280 +++++++++++++++++++++++++++---
 parallel-checkout.h               |  84 ++++++++-
 unpack-trees.c                    |  10 +-
 9 files changed, 508 insertions(+), 34 deletions(-)
 create mode 100644 builtin/checkout--helper.c

diff --git a/.gitignore b/.gitignore
index 6232d33924..1a341ea184 100644
--- a/.gitignore
+++ b/.gitignore
@@ -33,6 +33,7 @@
 /git-check-mailmap
 /git-check-ref-format
 /git-checkout
+/git-checkout--helper
 /git-checkout-index
 /git-cherry
 /git-cherry-pick
diff --git a/Documentation/config/checkout.txt b/Documentation/config/checkout.txt
index 6b646813ab..23e8f7cde0 100644
--- a/Documentation/config/checkout.txt
+++ b/Documentation/config/checkout.txt
@@ -16,3 +16,24 @@ will checkout the '<something>' branch on another remote,
 and by linkgit:git-worktree[1] when 'git worktree add' refers to a
 remote branch. This setting might be used for other checkout-like
 commands or functionality in the future.
+
+checkout.workers::
+	The number of parallel workers to use when updating the working tree.
+	The default is one, i.e. sequential execution. If set to a value less
+	than one, Git will use as many workers as the number of logical cores
+	available. This setting and `checkout.thresholdForParallelism` affect
+	all commands that perform checkout. E.g. checkout, clone, reset,
+	sparse-checkout, etc.
++
+Note: parallel checkout usually delivers better performance for repositories
+located on SSDs or over NFS. For repositories on spinning disks and/or machines
+with a small number of cores, the default sequential checkout often performs
+better. The size and compression level of a repository might also influence how
+well the parallel version performs.
+
+checkout.thresholdForParallelism::
+	When running parallel checkout with a small number of files, the cost
+	of subprocess spawning and inter-process communication might outweigh
+	the parallelization gains. This setting defines the minimum number
+	of files for which parallel checkout should be attempted. The
+	default is 100.
diff --git a/Makefile b/Makefile
index 10ee5e709b..535e6e94aa 100644
--- a/Makefile
+++ b/Makefile
@@ -1063,6 +1063,7 @@ BUILTIN_OBJS += builtin/check-attr.o
 BUILTIN_OBJS += builtin/check-ignore.o
 BUILTIN_OBJS += builtin/check-mailmap.o
 BUILTIN_OBJS += builtin/check-ref-format.o
+BUILTIN_OBJS += builtin/checkout--helper.o
 BUILTIN_OBJS += builtin/checkout-index.o
 BUILTIN_OBJS += builtin/checkout.o
 BUILTIN_OBJS += builtin/clean.o
diff --git a/builtin.h b/builtin.h
index 53fb290963..2abbe14b0b 100644
--- a/builtin.h
+++ b/builtin.h
@@ -123,6 +123,7 @@ int cmd_bugreport(int argc, const char **argv, const char *prefix);
 int cmd_bundle(int argc, const char **argv, const char *prefix);
 int cmd_cat_file(int argc, const char **argv, const char *prefix);
 int cmd_checkout(int argc, const char **argv, const char *prefix);
+int cmd_checkout__helper(int argc, const char **argv, const char *prefix);
 int cmd_checkout_index(int argc, const char **argv, const char *prefix);
 int cmd_check_attr(int argc, const char **argv, const char *prefix);
 int cmd_check_ignore(int argc, const char **argv, const char *prefix);
diff --git a/builtin/checkout--helper.c b/builtin/checkout--helper.c
new file mode 100644
index 0000000000..a61ed76f0d
--- /dev/null
+++ b/builtin/checkout--helper.c
@@ -0,0 +1,142 @@
+#include "builtin.h"
+#include "config.h"
+#include "entry.h"
+#include "parallel-checkout.h"
+#include "parse-options.h"
+#include "pkt-line.h"
+
+static void packet_to_pc_item(char *line, int len,
+			      struct parallel_checkout_item *pc_item)
+{
+	struct pc_item_fixed_portion *fixed_portion;
+	char *encoding, *variant;
+
+	if (len < sizeof(struct pc_item_fixed_portion))
+		BUG("checkout worker received too short item (got %dB, exp %dB)",
+		    len, (int)sizeof(struct pc_item_fixed_portion));
+
+	fixed_portion = (struct pc_item_fixed_portion *)line;
+
+	if (len - sizeof(struct pc_item_fixed_portion) !=
+		fixed_portion->name_len + fixed_portion->working_tree_encoding_len)
+		BUG("checkout worker received corrupted item");
+
+	variant = line + sizeof(struct pc_item_fixed_portion);
+
+	/*
+	 * Note: the main process uses zero length to communicate that the
+	 * encoding is NULL. There is no use case in actually sending an empty
+	 * string since it's considered as NULL when ca.working_tree_encoding
+	 * is set at git_path_check_encoding().
+	 */
+	if (fixed_portion->working_tree_encoding_len) {
+		encoding = xmemdupz(variant,
+				    fixed_portion->working_tree_encoding_len);
+		variant += fixed_portion->working_tree_encoding_len;
+	} else {
+		encoding = NULL;
+	}
+
+	memset(pc_item, 0, sizeof(*pc_item));
+	pc_item->ce = make_empty_transient_cache_entry(fixed_portion->name_len);
+	pc_item->ce->ce_namelen = fixed_portion->name_len;
+	pc_item->ce->ce_mode = fixed_portion->ce_mode;
+	memcpy(pc_item->ce->name, variant, pc_item->ce->ce_namelen);
+	oidcpy(&pc_item->ce->oid, &fixed_portion->oid);
+
+	pc_item->id = fixed_portion->id;
+	pc_item->ca.crlf_action = fixed_portion->crlf_action;
+	pc_item->ca.ident = fixed_portion->ident;
+	pc_item->ca.working_tree_encoding = encoding;
+}
+
+static void report_result(struct parallel_checkout_item *pc_item)
+{
+	struct pc_item_result res = { 0 };
+	size_t size;
+
+	res.id = pc_item->id;
+	res.status = pc_item->status;
+
+	if (pc_item->status == PC_ITEM_WRITTEN) {
+		res.st = pc_item->st;
+		size = sizeof(res);
+	} else {
+		size = PC_ITEM_RESULT_BASE_SIZE;
+	}
+
+	packet_write(1, (const char *)&res, size);
+}
+
+/* Free the worker-side malloced data, but not pc_item itself. */
+static void release_pc_item_data(struct parallel_checkout_item *pc_item)
+{
+	free((char *)pc_item->ca.working_tree_encoding);
+	discard_cache_entry(pc_item->ce);
+}
+
+static void worker_loop(struct checkout *state)
+{
+	struct parallel_checkout_item *items = NULL;
+	size_t i, nr = 0, alloc = 0;
+
+	while (1) {
+		int len;
+		char *line = packet_read_line(0, &len);
+
+		if (!line)
+			break;
+
+		ALLOC_GROW(items, nr + 1, alloc);
+		packet_to_pc_item(line, len, &items[nr++]);
+	}
+
+	for (i = 0; i < nr; i++) {
+		struct parallel_checkout_item *pc_item = &items[i];
+		write_pc_item(pc_item, state);
+		report_result(pc_item);
+		release_pc_item_data(pc_item);
+	}
+
+	packet_flush(1);
+
+	free(items);
+}
+
+static const char * const checkout_helper_usage[] = {
+	N_("git checkout--helper [<options>]"),
+	NULL
+};
+
+int cmd_checkout__helper(int argc, const char **argv, const char *prefix)
+{
+	struct checkout state = CHECKOUT_INIT;
+	struct option checkout_helper_options[] = {
+		OPT_STRING(0, "prefix", &state.base_dir, N_("string"),
+			N_("when creating files, prepend <string>")),
+		OPT_END()
+	};
+
+	if (argc == 2 && !strcmp(argv[1], "-h"))
+		usage_with_options(checkout_helper_usage,
+				   checkout_helper_options);
+
+	git_config(git_default_config, NULL);
+	argc = parse_options(argc, argv, prefix, checkout_helper_options,
+			     checkout_helper_usage, 0);
+	if (argc > 0)
+		usage_with_options(checkout_helper_usage, checkout_helper_options);
+
+	if (state.base_dir)
+		state.base_dir_len = strlen(state.base_dir);
+
+	/*
+	 * Setting this on the worker won't actually update the index. We just
+	 * need to pretend to, so that the checkout machinery stat()s the written
+	 * entries.
+	 */
+	state.refresh_cache = 1;
+
+	worker_loop(&state);
+	return 0;
+}
diff --git a/git.c b/git.c
index 4bdcdad2cc..384f144593 100644
--- a/git.c
+++ b/git.c
@@ -487,6 +487,8 @@ static struct cmd_struct commands[] = {
 	{ "check-mailmap", cmd_check_mailmap, RUN_SETUP },
 	{ "check-ref-format", cmd_check_ref_format, NO_PARSEOPT  },
 	{ "checkout", cmd_checkout, RUN_SETUP | NEED_WORK_TREE },
+	{ "checkout--helper", cmd_checkout__helper,
+		RUN_SETUP | NEED_WORK_TREE | SUPPORT_SUPER_PREFIX },
 	{ "checkout-index", cmd_checkout_index,
 		RUN_SETUP | NEED_WORK_TREE},
 	{ "cherry", cmd_cherry, RUN_SETUP },
diff --git a/parallel-checkout.c b/parallel-checkout.c
index fd871b09d3..2d77998f46 100644
--- a/parallel-checkout.c
+++ b/parallel-checkout.c
@@ -1,28 +1,15 @@
 #include "cache.h"
 #include "entry.h"
 #include "parallel-checkout.h"
+#include "pkt-line.h"
+#include "run-command.h"
 #include "streaming.h"
+#include "thread-utils.h"
+#include "config.h"
 
-enum pc_item_status {
-	PC_ITEM_PENDING = 0,
-	PC_ITEM_WRITTEN,
-	/*
-	 * The entry could not be written because there was another file
-	 * already present in its path or leading directories. Since
-	 * checkout_entry_ca() removes such files from the working tree before
-	 * enqueueing the entry for parallel checkout, it means that there was
-	 * a path collision among the entries being written.
-	 */
-	PC_ITEM_COLLIDED,
-	PC_ITEM_FAILED,
-};
-
-struct parallel_checkout_item {
-	/* pointer to a istate->cache[] entry. Not owned by us. */
-	struct cache_entry *ce;
-	struct conv_attrs ca;
-	struct stat st;
-	enum pc_item_status status;
+struct pc_worker {
+	struct child_process cp;
+	size_t next_to_complete, nr_to_complete;
 };
 
 struct parallel_checkout {
@@ -38,6 +25,19 @@ enum pc_status parallel_checkout_status(void)
 	return parallel_checkout.status;
 }
 
+#define DEFAULT_THRESHOLD_FOR_PARALLELISM 100
+
+void get_parallel_checkout_configs(int *num_workers, int *threshold)
+{
+	if (git_config_get_int("checkout.workers", num_workers))
+		*num_workers = 1;
+	else if (*num_workers < 1)
+		*num_workers = online_cpus();
+
+	if (git_config_get_int("checkout.thresholdForParallelism", threshold))
+		*threshold = DEFAULT_THRESHOLD_FOR_PARALLELISM;
+}
+
 void init_parallel_checkout(void)
 {
 	if (parallel_checkout.status != PC_UNINITIALIZED)
@@ -115,10 +115,12 @@ int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca)
 	ALLOC_GROW(parallel_checkout.items, parallel_checkout.nr + 1,
 		   parallel_checkout.alloc);
 
-	pc_item = &parallel_checkout.items[parallel_checkout.nr++];
+	pc_item = &parallel_checkout.items[parallel_checkout.nr];
 	pc_item->ce = ce;
 	memcpy(&pc_item->ca, ca, sizeof(pc_item->ca));
 	pc_item->status = PC_ITEM_PENDING;
+	pc_item->id = parallel_checkout.nr;
+	parallel_checkout.nr++;
 
 	return 0;
 }
@@ -231,7 +233,8 @@ static int write_pc_item_to_fd(struct parallel_checkout_item *pc_item, int fd,
 	/*
 	 * checkout metadata is used to give context for external process
 	 * filters. Files requiring such filters are not eligible for parallel
-	 * checkout, so pass NULL.
+	 * checkout, so pass NULL. Note: if that changes, the metadata must also
+	 * be passed from the main process to the workers.
 	 */
 	ret = convert_to_working_tree_ca(&pc_item->ca, pc_item->ce->name,
 					 new_blob, size, &buf, NULL);
@@ -262,8 +265,8 @@ static int close_and_clear(int *fd)
 	return ret;
 }
 
-static void write_pc_item(struct parallel_checkout_item *pc_item,
-			  struct checkout *state)
+void write_pc_item(struct parallel_checkout_item *pc_item,
+		   struct checkout *state)
 {
 	unsigned int mode = (pc_item->ce->ce_mode & 0100) ? 0777 : 0666;
 	int fd = -1, fstat_done = 0;
@@ -337,6 +340,221 @@ static void write_pc_item(struct parallel_checkout_item *pc_item,
 	strbuf_release(&path);
 }
 
+static void send_one_item(int fd, struct parallel_checkout_item *pc_item)
+{
+	size_t len_data;
+	char *data, *variant;
+	struct pc_item_fixed_portion *fixed_portion;
+	const char *working_tree_encoding = pc_item->ca.working_tree_encoding;
+	size_t name_len = pc_item->ce->ce_namelen;
+	size_t working_tree_encoding_len = working_tree_encoding ?
+					   strlen(working_tree_encoding) : 0;
+
+	len_data = sizeof(struct pc_item_fixed_portion) + name_len +
+		   working_tree_encoding_len;
+
+	data = xcalloc(1, len_data);
+
+	fixed_portion = (struct pc_item_fixed_portion *)data;
+	fixed_portion->id = pc_item->id;
+	fixed_portion->ce_mode = pc_item->ce->ce_mode;
+	fixed_portion->crlf_action = pc_item->ca.crlf_action;
+	fixed_portion->ident = pc_item->ca.ident;
+	fixed_portion->name_len = name_len;
+	fixed_portion->working_tree_encoding_len = working_tree_encoding_len;
+	/*
+	 * We use hashcpy() instead of oidcpy() because the hash[] positions
+	 * after `the_hash_algo->rawsz` might not be initialized. And Valgrind
+	 * would complain about passing uninitialized bytes to a syscall
+	 * (write(2)). There is no real harm in this case, but the warning could
+	 * hinder the detection of actual errors.
+	 */
+	hashcpy(fixed_portion->oid.hash, pc_item->ce->oid.hash);
+
+	variant = data + sizeof(*fixed_portion);
+	if (working_tree_encoding_len) {
+		memcpy(variant, working_tree_encoding, working_tree_encoding_len);
+		variant += working_tree_encoding_len;
+	}
+	memcpy(variant, pc_item->ce->name, name_len);
+
+	packet_write(fd, data, len_data);
+
+	free(data);
+}
+
+static void send_batch(int fd, size_t start, size_t nr)
+{
+	size_t i;
+	for (i = 0; i < nr; i++)
+		send_one_item(fd, &parallel_checkout.items[start + i]);
+	packet_flush(fd);
+}
+
+static struct pc_worker *setup_workers(struct checkout *state, int num_workers)
+{
+	struct pc_worker *workers;
+	int i, workers_with_one_extra_item;
+	size_t base_batch_size, next_to_assign = 0;
+
+	ALLOC_ARRAY(workers, num_workers);
+
+	for (i = 0; i < num_workers; i++) {
+		struct child_process *cp = &workers[i].cp;
+
+		child_process_init(cp);
+		cp->git_cmd = 1;
+		cp->in = -1;
+		cp->out = -1;
+		cp->clean_on_exit = 1;
+		strvec_push(&cp->args, "checkout--helper");
+		if (state->base_dir_len)
+			strvec_pushf(&cp->args, "--prefix=%s", state->base_dir);
+		if (start_command(cp))
+			die(_("failed to spawn checkout worker"));
+	}
+
+	base_batch_size = parallel_checkout.nr / num_workers;
+	workers_with_one_extra_item = parallel_checkout.nr % num_workers;
+
+	for (i = 0; i < num_workers; i++) {
+		struct pc_worker *worker = &workers[i];
+		size_t batch_size = base_batch_size;
+
+		/* distribute the extra work evenly */
+		if (i < workers_with_one_extra_item)
+			batch_size++;
+
+		send_batch(worker->cp.in, next_to_assign, batch_size);
+		worker->next_to_complete = next_to_assign;
+		worker->nr_to_complete = batch_size;
+
+		next_to_assign += batch_size;
+	}
+
+	return workers;
+}
+
+static void finish_workers(struct pc_worker *workers, int num_workers)
+{
+	int i;
+
+	/*
+	 * Close pipes before calling finish_command() to let the workers
+	 * exit asynchronously and avoid spending extra time on wait().
+	 */
+	for (i = 0; i < num_workers; i++) {
+		struct child_process *cp = &workers[i].cp;
+		if (cp->in >= 0)
+			close(cp->in);
+		if (cp->out >= 0)
+			close(cp->out);
+	}
+
+	for (i = 0; i < num_workers; i++) {
+		if (finish_command(&workers[i].cp))
+			error(_("checkout worker %d finished with error"), i);
+	}
+
+	free(workers);
+}
+
+#define ASSERT_PC_ITEM_RESULT_SIZE(got, exp) \
+do { \
+	if (got != exp) \
+		BUG("corrupted result from checkout worker (got %dB, exp %dB)", \
+		    got, exp); \
+} while(0)
+
+static void parse_and_save_result(const char *line, int len,
+				  struct pc_worker *worker)
+{
+	struct pc_item_result *res;
+	struct parallel_checkout_item *pc_item;
+	struct stat *st = NULL;
+
+	if (len < PC_ITEM_RESULT_BASE_SIZE)
+		BUG("too short result from checkout worker (got %dB, exp %dB)",
+		    len, (int)PC_ITEM_RESULT_BASE_SIZE);
+
+	res = (struct pc_item_result *)line;
+
+	/*
+	 * Worker should send either the full result struct on success, or
+	 * just the base (i.e. no stat data) otherwise.
+	 */
+	if (res->status == PC_ITEM_WRITTEN) {
+		ASSERT_PC_ITEM_RESULT_SIZE(len, (int)sizeof(struct pc_item_result));
+		st = &res->st;
+	} else {
+		ASSERT_PC_ITEM_RESULT_SIZE(len, (int)PC_ITEM_RESULT_BASE_SIZE);
+	}
+
+	if (!worker->nr_to_complete || res->id != worker->next_to_complete)
+		BUG("checkout worker sent unexpected item id");
+
+	worker->next_to_complete++;
+	worker->nr_to_complete--;
+
+	pc_item = &parallel_checkout.items[res->id];
+	pc_item->status = res->status;
+	if (st)
+		pc_item->st = *st;
+}
+
+
+static void gather_results_from_workers(struct pc_worker *workers,
+					int num_workers)
+{
+	int i, active_workers = num_workers;
+	struct pollfd *pfds;
+
+	CALLOC_ARRAY(pfds, num_workers);
+	for (i = 0; i < num_workers; i++) {
+		pfds[i].fd = workers[i].cp.out;
+		pfds[i].events = POLLIN;
+	}
+
+	while (active_workers) {
+		int nr = poll(pfds, num_workers, -1);
+
+		if (nr < 0) {
+			if (errno == EINTR)
+				continue;
+			die_errno("failed to poll checkout workers");
+		}
+
+		for (i = 0; i < num_workers && nr > 0; i++) {
+			struct pc_worker *worker = &workers[i];
+			struct pollfd *pfd = &pfds[i];
+
+			if (!pfd->revents)
+				continue;
+
+			if (pfd->revents & POLLIN) {
+				int len;
+				const char *line = packet_read_line(pfd->fd, &len);
+
+				if (!line) {
+					pfd->fd = -1;
+					active_workers--;
+				} else {
+					parse_and_save_result(line, len, worker);
+				}
+			} else if (pfd->revents & POLLHUP) {
+				pfd->fd = -1;
+				active_workers--;
+			} else if (pfd->revents & (POLLNVAL | POLLERR)) {
+				die(_("error polling from checkout worker"));
+			}
+
+			nr--;
+		}
+	}
+
+	free(pfds);
+}
+
 static void write_items_sequentially(struct checkout *state)
 {
 	size_t i;
@@ -345,7 +563,7 @@ static void write_items_sequentially(struct checkout *state)
 		write_pc_item(&parallel_checkout.items[i], state);
 }
 
-int run_parallel_checkout(struct checkout *state)
+int run_parallel_checkout(struct checkout *state, int num_workers, int threshold)
 {
 	int ret;
 
@@ -354,7 +572,17 @@ int run_parallel_checkout(struct checkout *state)
 
 	parallel_checkout.status = PC_RUNNING;
 
-	write_items_sequentially(state);
+	if (parallel_checkout.nr < num_workers)
+		num_workers = parallel_checkout.nr;
+
+	if (num_workers <= 1 || parallel_checkout.nr < threshold) {
+		write_items_sequentially(state);
+	} else {
+		struct pc_worker *workers = setup_workers(state, num_workers);
+		gather_results_from_workers(workers, num_workers);
+		finish_workers(workers, num_workers);
+	}
+
 	ret = handle_results(state);
 
 	finish_parallel_checkout();
diff --git a/parallel-checkout.h b/parallel-checkout.h
index e6d6fc01ea..54314ccdc5 100644
--- a/parallel-checkout.h
+++ b/parallel-checkout.h
@@ -1,9 +1,12 @@
 #ifndef PARALLEL_CHECKOUT_H
 #define PARALLEL_CHECKOUT_H
 
-struct cache_entry;
-struct checkout;
-struct conv_attrs;
+#include "entry.h"
+#include "convert.h"
+
+/****************************************************************
+ * Users of parallel checkout
+ ****************************************************************/
 
 enum pc_status {
 	PC_UNINITIALIZED = 0,
@@ -12,6 +15,7 @@ enum pc_status {
 };
 
 enum pc_status parallel_checkout_status(void);
+void get_parallel_checkout_configs(int *num_workers, int *threshold);
 void init_parallel_checkout(void);
 
 /*
@@ -21,7 +25,77 @@ void init_parallel_checkout(void);
  */
 int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca);
 
-/* Write all the queued entries, returning 0 on success.*/
-int run_parallel_checkout(struct checkout *state);
+/*
+ * Write all the queued entries, returning 0 on success. If the number of
+ * entries is smaller than the specified threshold, the operation is performed
+ * sequentially.
+ */
+int run_parallel_checkout(struct checkout *state, int num_workers, int threshold);
+
+/****************************************************************
+ * Interface with checkout--helper
+ ****************************************************************/
+
+enum pc_item_status {
+	PC_ITEM_PENDING = 0,
+	PC_ITEM_WRITTEN,
+	/*
+	 * The entry could not be written because there was another file
+	 * already present in its path or leading directories. Since
+	 * checkout_entry_ca() removes such files from the working tree before
+	 * enqueueing the entry for parallel checkout, it means that there was
+	 * a path collision among the entries being written.
+	 */
+	PC_ITEM_COLLIDED,
+	PC_ITEM_FAILED,
+};
+
+struct parallel_checkout_item {
+	/*
+	 * In the main process, ce points to an istate->cache[] entry, so it is
+	 * not owned by us. In workers, the ce is allocated and *must be* freed.
+	 */
+	struct cache_entry *ce;
+	struct conv_attrs ca;
+	size_t id; /* position in parallel_checkout.items[] of main process */
+
+	/* Output fields, sent from workers. */
+	enum pc_item_status status;
+	struct stat st;
+};
+
+/*
+ * The fixed-size portion of `struct parallel_checkout_item` that is sent to the
+ * workers. Following this will be 2 strings: ca.working_tree_encoding and
+ * ce.name. These are NOT null-terminated, since their sizes are in the fixed
+ * portion.
+ *
+ * Note that not all fields of conv_attrs and cache_entry are passed, only the
+ * ones that will be required by the workers to smudge and write the entry.
+ */
+struct pc_item_fixed_portion {
+	size_t id;
+	struct object_id oid;
+	unsigned int ce_mode;
+	enum convert_crlf_action crlf_action;
+	int ident;
+	size_t working_tree_encoding_len;
+	size_t name_len;
+};
+
+/*
+ * The fields of `struct parallel_checkout_item` that are returned by the
+ * workers. Note: `st` must be the last one, as it is omitted on error.
+ */
+struct pc_item_result {
+	size_t id;
+	enum pc_item_status status;
+	struct stat st;
+};
+
+#define PC_ITEM_RESULT_BASE_SIZE offsetof(struct pc_item_result, st)
+
+void write_pc_item(struct parallel_checkout_item *pc_item,
+		   struct checkout *state);
 
 #endif /* PARALLEL_CHECKOUT_H */
diff --git a/unpack-trees.c b/unpack-trees.c
index 1b1da7485a..117ed42370 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -399,7 +399,7 @@ static int check_updates(struct unpack_trees_options *o,
 	int errs = 0;
 	struct progress *progress;
 	struct checkout state = CHECKOUT_INIT;
-	int i;
+	int i, pc_workers, pc_threshold;
 
 	trace_performance_enter();
 	state.force = 1;
@@ -462,8 +462,11 @@ static int check_updates(struct unpack_trees_options *o,
 		oid_array_clear(&to_fetch);
 	}
 
+	get_parallel_checkout_configs(&pc_workers, &pc_threshold);
+
 	enable_delayed_checkout(&state);
-	init_parallel_checkout();
+	if (pc_workers > 1)
+		init_parallel_checkout();
 	for (i = 0; i < index->cache_nr; i++) {
 		struct cache_entry *ce = index->cache[i];
 
@@ -477,7 +480,8 @@ static int check_updates(struct unpack_trees_options *o,
 		}
 	}
 	stop_progress(&progress);
-	errs |= run_parallel_checkout(&state);
+	if (pc_workers > 1)
+		errs |= run_parallel_checkout(&state, pc_workers, pc_threshold);
 	errs |= finish_delayed_checkout(&state, NULL);
 	git_attr_set_direction(GIT_ATTR_CHECKIN);
 
-- 
2.28.0


^ permalink raw reply	[flat|nested] 154+ messages in thread
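
The main-process/worker wire format in the patch above (a fixed-size header
followed by the non-NUL-terminated encoding and name strings) can be
sketched as follows, with assumed struct and helper names rather than git's
actual types:

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical reduction (not git's code) of pc_item_fixed_portion:
 * the header records both string lengths, so the variable-length tail
 * needs no NUL terminators.
 */
struct fixed_hdr {
	size_t encoding_len;
	size_t name_len;
};

/* Caller owns the returned buffer; *out_len receives its total size. */
static char *pack_item(const char *encoding, const char *name, size_t *out_len)
{
	struct fixed_hdr hdr = {
		.encoding_len = encoding ? strlen(encoding) : 0,
		.name_len = strlen(name),
	};
	size_t len = sizeof(hdr) + hdr.encoding_len + hdr.name_len;
	char *buf = malloc(len), *p;

	memcpy(buf, &hdr, sizeof(hdr));
	p = buf + sizeof(hdr);
	if (hdr.encoding_len) {
		memcpy(p, encoding, hdr.encoding_len);
		p += hdr.encoding_len;
	}
	memcpy(p, name, hdr.name_len);
	*out_len = len;
	return buf;
}
```

Because the header carries both lengths, the receiving side can split the
tail without scanning for terminators, just as packet_to_pc_item() does.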

* [PATCH v4 12/19] parallel-checkout: support progress displaying
  2020-11-04 20:32     ` [PATCH v4 " Matheus Tavares
                         ` (10 preceding siblings ...)
  2020-11-04 20:33       ` [PATCH v4 11/19] parallel-checkout: make it truly parallel Matheus Tavares
@ 2020-11-04 20:33       ` Matheus Tavares
  2020-11-04 20:33       ` [PATCH v4 13/19] make_transient_cache_entry(): optionally alloc from mem_pool Matheus Tavares
                         ` (7 subsequent siblings)
  19 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-11-04 20:33 UTC (permalink / raw)
  To: git; +Cc: gitster, git, chriscool, peff, newren, jrnieder, martin.agren

Original-patch-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 parallel-checkout.c | 34 +++++++++++++++++++++++++++++++---
 parallel-checkout.h |  4 +++-
 unpack-trees.c      | 11 ++++++++---
 3 files changed, 42 insertions(+), 7 deletions(-)

diff --git a/parallel-checkout.c b/parallel-checkout.c
index 2d77998f46..72ac93d541 100644
--- a/parallel-checkout.c
+++ b/parallel-checkout.c
@@ -2,6 +2,7 @@
 #include "entry.h"
 #include "parallel-checkout.h"
 #include "pkt-line.h"
+#include "progress.h"
 #include "run-command.h"
 #include "streaming.h"
 #include "thread-utils.h"
@@ -16,6 +17,8 @@ struct parallel_checkout {
 	enum pc_status status;
 	struct parallel_checkout_item *items;
 	size_t nr, alloc;
+	struct progress *progress;
+	unsigned int *progress_cnt;
 };
 
 static struct parallel_checkout parallel_checkout;
@@ -125,6 +128,20 @@ int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca)
 	return 0;
 }
 
+size_t pc_queue_size(void)
+{
+	return parallel_checkout.nr;
+}
+
+static void advance_progress_meter(void)
+{
+	if (parallel_checkout.progress) {
+		(*parallel_checkout.progress_cnt)++;
+		display_progress(parallel_checkout.progress,
+				 *parallel_checkout.progress_cnt);
+	}
+}
+
 static int handle_results(struct checkout *state)
 {
 	int ret = 0;
@@ -173,6 +190,7 @@ static int handle_results(struct checkout *state)
 			 */
 			ret |= checkout_entry_ca(pc_item->ce, &pc_item->ca,
 						 state, NULL, NULL);
+			advance_progress_meter();
 			break;
 		case PC_ITEM_PENDING:
 			have_pending = 1;
@@ -500,6 +518,9 @@ static void parse_and_save_result(const char *line, int len,
 	pc_item->status = res->status;
 	if (st)
 		pc_item->st = *st;
+
+	if (res->status != PC_ITEM_COLLIDED)
+		advance_progress_meter();
 }
 
 
@@ -559,11 +580,16 @@ static void write_items_sequentially(struct checkout *state)
 {
 	size_t i;
 
-	for (i = 0; i < parallel_checkout.nr; i++)
-		write_pc_item(&parallel_checkout.items[i], state);
+	for (i = 0; i < parallel_checkout.nr; i++) {
+		struct parallel_checkout_item *pc_item = &parallel_checkout.items[i];
+		write_pc_item(pc_item, state);
+		if (pc_item->status != PC_ITEM_COLLIDED)
+			advance_progress_meter();
+	}
 }
 
-int run_parallel_checkout(struct checkout *state, int num_workers, int threshold)
+int run_parallel_checkout(struct checkout *state, int num_workers, int threshold,
+			  struct progress *progress, unsigned int *progress_cnt)
 {
 	int ret;
 
@@ -571,6 +597,8 @@ int run_parallel_checkout(struct checkout *state, int num_workers, int threshold
 		BUG("cannot run parallel checkout: uninitialized or already running");
 
 	parallel_checkout.status = PC_RUNNING;
+	parallel_checkout.progress = progress;
+	parallel_checkout.progress_cnt = progress_cnt;
 
 	if (parallel_checkout.nr < num_workers)
 		num_workers = parallel_checkout.nr;
diff --git a/parallel-checkout.h b/parallel-checkout.h
index 54314ccdc5..8377b179d5 100644
--- a/parallel-checkout.h
+++ b/parallel-checkout.h
@@ -24,13 +24,15 @@ void init_parallel_checkout(void);
  * write and return 0.
  */
 int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca);
+size_t pc_queue_size(void);
 
 /*
  * Write all the queued entries, returning 0 on success. If the number of
  * entries is smaller than the specified threshold, the operation is performed
  * sequentially.
  */
-int run_parallel_checkout(struct checkout *state, int num_workers, int threshold);
+int run_parallel_checkout(struct checkout *state, int num_workers, int threshold,
+			  struct progress *progress, unsigned int *progress_cnt);
 
 /****************************************************************
  * Interface with checkout--helper
diff --git a/unpack-trees.c b/unpack-trees.c
index 117ed42370..e05e6ceff2 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -471,17 +471,22 @@ static int check_updates(struct unpack_trees_options *o,
 		struct cache_entry *ce = index->cache[i];
 
 		if (ce->ce_flags & CE_UPDATE) {
+			size_t last_pc_queue_size = pc_queue_size();
+
 			if (ce->ce_flags & CE_WT_REMOVE)
 				BUG("both update and delete flags are set on %s",
 				    ce->name);
-			display_progress(progress, ++cnt);
 			ce->ce_flags &= ~CE_UPDATE;
 			errs |= checkout_entry(ce, &state, NULL, NULL);
+
+			if (last_pc_queue_size == pc_queue_size())
+				display_progress(progress, ++cnt);
 		}
 	}
-	stop_progress(&progress);
 	if (pc_workers > 1)
-		errs |= run_parallel_checkout(&state, pc_workers, pc_threshold);
+		errs |= run_parallel_checkout(&state, pc_workers, pc_threshold,
+					      progress, &cnt);
+	stop_progress(&progress);
 	errs |= finish_delayed_checkout(&state, NULL);
 	git_attr_set_direction(GIT_ATTR_CHECKIN);
 
-- 
2.28.0


^ permalink raw reply	[flat|nested] 154+ messages in thread

* [PATCH v4 13/19] make_transient_cache_entry(): optionally alloc from mem_pool
  2020-11-04 20:32     ` [PATCH v4 " Matheus Tavares
                         ` (11 preceding siblings ...)
  2020-11-04 20:33       ` [PATCH v4 12/19] parallel-checkout: support progress displaying Matheus Tavares
@ 2020-11-04 20:33       ` Matheus Tavares
  2020-11-04 20:33       ` [PATCH v4 14/19] builtin/checkout.c: complete parallel checkout support Matheus Tavares
                         ` (6 subsequent siblings)
  19 siblings, 0 replies; 154+ messages in thread
From: Matheus Tavares @ 2020-11-04 20:33 UTC (permalink / raw)
  To: git; +Cc: gitster, git, chriscool, peff, newren, jrnieder, martin.agren

Allow make_transient_cache_entry() to optionally receive a mem_pool
struct in which it should allocate the entry. This will be used in the
following patch to store some transient entries which should persist
until parallel checkout finishes.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 builtin/checkout--helper.c |  2 +-
 builtin/checkout.c         |  2 +-
 builtin/difftool.c         |  2 +-
 cache.h                    | 10 +++++-----
 read-cache.c               | 12 ++++++++----
 unpack-trees.c             |  2 +-
 6 files changed, 17 insertions(+), 13 deletions(-)

diff --git a/builtin/checkout--helper.c b/builtin/checkout--helper.c
index a61ed76f0d..5d6f3e71d0 100644
--- a/builtin/checkout--helper.c
+++ b/builtin/checkout--helper.c
@@ -38,7 +38,7 @@ static void packet_to_pc_item(char *line, int len,
 	}
 
 	memset(pc_item, 0, sizeof(*pc_item));
-	pc_item->ce = make_empty_transient_cache_entry(fixed_portion->name_len);
+	pc_item->ce = make_empty_transient_cache_entry(fixed_portion->name_len, NULL);
 	pc_item->ce->ce_namelen = fixed_portion->name_len;
 	pc_item->ce->ce_mode = fixed_portion->ce_mode;
 	memcpy(pc_item->ce->name, variant, pc_item->ce->ce_namelen);
diff --git a/builtin/checkout.c b/builtin/checkout.c
index b18b9d6f3c..c0bf5e6711 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -291,7 +291,7 @@ static int checkout_merged(int pos, const struct checkout *state, int *nr_checko
 	if (write_object_file(result_buf.ptr, result_buf.size, blob_type, &oid))
 		die(_("Unable to add merge result for '%s'"), path);
 	free(result_buf.ptr);
-	ce = make_transient_cache_entry(mode, &oid, path, 2);
+	ce = make_transient_cache_entry(mode, &oid, path, 2, NULL);
 	if (!ce)
 		die(_("make_cache_entry failed for path '%s'"), path);
 	status = checkout_entry(ce, state, NULL, nr_checkouts);