From: Matheus Tavares <matheus.bernardino@usp.br>
To: gitster@pobox.com
Cc: git@vger.kernel.org, christian.couder@gmail.com, git@jeffhostetler.com
Subject: [PATCH v4 0/5] Parallel Checkout (part 2)
Date: Mon, 19 Apr 2021 16:53:30 -0300
Message-ID: <cover.1618861380.git.matheus.bernardino@usp.br>
In-Reply-To: <cover.1618790794.git.matheus.bernardino@usp.br>

This version is almost identical to v3, but the last patch incorporates
the typo fixes and other rewording suggestions Christian made about the
design doc in the last round.

I decided to remove the sentence about step 3 dominating the execution
time, as that's not always the case, e.g. on a non-local clone or with
sparse-checkout.

Matheus Tavares (5):
  unpack-trees: add basic support for parallel checkout
  parallel-checkout: make it truly parallel
  parallel-checkout: add configuration options
  parallel-checkout: support progress displaying
  parallel-checkout: add design documentation

 .gitignore                                    |   1 +
 Documentation/Makefile                        |   1 +
 Documentation/config/checkout.txt             |  21 +
 Documentation/technical/parallel-checkout.txt | 270 ++++++++
 Makefile                                      |   2 +
 builtin.h                                     |   1 +
 builtin/checkout--worker.c                    | 145 ++++
 entry.c                                       |  17 +-
 git.c                                         |   2 +
 parallel-checkout.c                           | 655 ++++++++++++++++++
 parallel-checkout.h                           | 111 +++
 unpack-trees.c                                |  19 +-
 12 files changed, 1240 insertions(+), 5 deletions(-)
 create mode 100644 Documentation/technical/parallel-checkout.txt
 create mode 100644 builtin/checkout--worker.c
 create mode 100644 parallel-checkout.c
 create mode 100644 parallel-checkout.h

Range-diff against v3:
1: 7096822c14 = 1: 7096822c14 unpack-trees: add basic support for parallel checkout
2: 4526516ea0 = 2: 4526516ea0 parallel-checkout: make it truly parallel
3: ad165c0637 = 3: ad165c0637 parallel-checkout: add configuration options
4: cf9e28dc0e = 4: cf9e28dc0e parallel-checkout: support progress displaying
5: 415d4114aa ! 5: fd929f072c parallel-checkout: add design documentation
@@ Documentation/technical/parallel-checkout.txt (new)
+* Step 4: Write the new index to disk.
+
+Step 3 is the focus of the "parallel checkout" effort described here.
-+It dominates the execution time for most of the above command types.
+
+Sequential Implementation
+-------------------------
@@ Documentation/technical/parallel-checkout.txt (new)
+It wouldn't be safe to perform Step 3b in parallel, as there could be
+race conditions between file creations and removals. Instead, the
+parallel checkout framework lets the sequential code handle Step 3b,
-+and use parallel workers to replace the sequential
++and uses parallel workers to replace the sequential
+`entry.c:write_entry()` calls from Step 3c.
+
+Rejected Multi-Threaded Solution
@@ Documentation/technical/parallel-checkout.txt (new)
+warning for the user, like the classic sequential checkout does.
+
+The workers are able to detect both collisions among the entries being
-+concurrently written and collisions among parallel-eligible and
-+ineligible entries. The general idea for collision detection is quite
-+straightforward: for each parallel-eligible entry, the main process must
-+remove all files that prevent this entry from being written (before
-+enqueueing it). This includes any non-directory file in the leading path
-+of the entry. Later, when a worker gets assigned the entry, it looks
-+again for the non-directories files and for an already existing file at
-+the entry's path. If any of these checks finds something, the worker
-+knows that there was a path collision.
++concurrently written and collisions between a parallel-eligible entry
++and an ineligible entry. The general idea for collision detection is
++quite straightforward: for each parallel-eligible entry, the main
++process must remove all files that prevent this entry from being written
++(before enqueueing it). This includes any non-directory file in the
++leading path of the entry. Later, when a worker gets assigned the entry,
++it looks again for the non-directories files and for an already existing
++file at the entry's path. If any of these checks finds something, the
++worker knows that there was a path collision.
+
+Because parallel checkout can distinguish path collisions from the case
+where the file was already present in the working tree before checkout,
@@ Documentation/technical/parallel-checkout.txt (new)
+Besides, long-running filters may use the delayed checkout feature to
+postpone the return of some filtered blobs. The delayed checkout queue
+and the parallel checkout queue are not compatible and should remain
-+separated.
++separate.
++
+Note: regular files that only require internal filters, like end-of-line
+conversion and re-encoding, are eligible for parallel checkout.
@@ Documentation/technical/parallel-checkout.txt (new)
+The API
+-------
+
-+The parallel checkout API was designed with the goal to minimize changes
-+to the current users of the checkout machinery. This means that they
-+don't have to call a different function for sequential or parallel
++The parallel checkout API was designed with the goal of minimizing
++changes to the current users of the checkout machinery. This means that
++they don't have to call a different function for sequential or parallel
+checkout. As already mentioned, `checkout_entry()` will automatically
+insert the given entry in the parallel checkout queue when this feature
+is enabled and the entry is eligible; otherwise, it will just write the
--
2.30.1