From: Derrick Stolee <stolee@gmail.com>
To: Johannes Schindelin <Johannes.Schindelin@gmx.de>,
Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, gitster@pobox.com, newren@gmail.com,
matheus.bernardino@usp.br,
Derrick Stolee <derrickstolee@github.com>,
Derrick Stolee <dstolee@microsoft.com>
Subject: Re: [PATCH v3 4/8] unpack-trees: fix nested sparse-dir search
Date: Fri, 20 Aug 2021 11:18:18 -0400
Message-ID: <b7bd8e73-de86-c563-8b7d-405310ce6c57@gmail.com>
In-Reply-To: <nycvar.QRO.7.76.6.2108190950540.55@tvgsbejvaqbjf.bet>
On 8/19/2021 4:01 AM, Johannes Schindelin wrote:
> Hi Stolee,
>
> On Tue, 17 Aug 2021, Derrick Stolee via GitGitGadget wrote:
>
>> From: Derrick Stolee <dstolee@microsoft.com>
>>
>> The iterated search in find_cache_entry() was recently modified to
>> include a loop that searches backwards for a sparse directory entry that
>> matches the given traverse_info and name_entry. However, the string
>> comparison failed to actually concatenate those two strings, so this
>> failed to find a sparse directory when it was not a top-level directory.
>>
>> This caused some errors in rare cases where a 'git checkout' spanned a
>> diff that modified files within the sparse directory entry, but we could
>> not correctly find the entry.
>
> Good explanation.
>
> I wonder a bit about the performance impact. How "hot" is this function?
> I.e. how often is it called, on average?
>
> I ask because I see opportunities to optimize in both directions: it could
> be written more concisely (if speed does not matter as much), and it could
> be made faster (if speed matters a lot). See below for more.

I would definitely optimize for speed here. This can be a very hot path,
I believe.

>> + strbuf_addstr(&full_path, info->traverse_path);
>> + strbuf_add(&full_path, p->path, p->pathlen);
>> + strbuf_addch(&full_path, '/');
>
> This could be reduced to:
>
> strbuf_addf(&full_path, "%s%.*s/",
> info->traverse_path, (int)p->pathlen, p->path);

We should definitely avoid formatted strings here, if possible.

> But if speed matters, we probably need something more like this:
>
> size_t full_path_len;
> const char *full_path;
> char *full_path_1 = NULL;
>
> if (!*info->traverse_path) {
> full_path = p->path;
> full_path_len = p->pathlen;
> } else {
> size_t len = strlen(info->traverse_path);
>
> full_path_len = len + p->pathlen + 1;
> full_path = full_path_1 = xmalloc(full_path_len + 1);
> memcpy(full_path_1, info->traverse_path, len);
> memcpy(full_path_1 + len, p->path, p->pathlen);
> full_path_1[full_path_len - 1] = '/';
> full_path_1[full_path_len] = '\0';
> }

The critical benefit here is that we do not need to allocate a
buffer when traverse_path is empty. That might be a worthwhile
investment, and it would justify using bare 'char *'s instead of
a 'struct strbuf'.

If traverse_path is usually non-empty, then we could continue using
strbufs as a helper and get the planned performance gains by using
strbuf_grow(&full_path, full_path_len + 1) followed by strbuf_add()
(instead of strbuf_addstr()). That would make this code a bit less
ugly, with the only real overhead being the extra insertions of '\0'
characters as we add the strings to the strbuf.
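
To make the "grow once, then append" idea concrete, here is a rough, self-contained sketch. It does not use git's real strbuf code: 'struct pathbuf', pathbuf_grow(), pathbuf_add() and build_full_path() are hypothetical stand-ins that only imitate what strbuf_grow() and strbuf_add() are described as doing above. It also assumes that traverse_path already ends in '/' whenever it is non-empty.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for git's 'struct strbuf'; not git's actual code. */
struct pathbuf {
	char *buf;
	size_t len;
	size_t alloc;
};

/* Make room for 'extra' more bytes plus a trailing NUL (imitates strbuf_grow()). */
static void pathbuf_grow(struct pathbuf *pb, size_t extra)
{
	size_t need = pb->len + extra + 1;

	if (need > pb->alloc) {
		char *p = realloc(pb->buf, need);
		if (!p)
			abort();
		pb->buf = p;
		pb->alloc = need;
	}
}

/* Append 'len' bytes and re-terminate the buffer (imitates strbuf_add()). */
static void pathbuf_add(struct pathbuf *pb, const char *data, size_t len)
{
	pathbuf_grow(pb, len); /* no-op when the buffer was pre-sized */
	memcpy(pb->buf + pb->len, data, len);
	pb->len += len;
	pb->buf[pb->len] = '\0'; /* the "extra" NUL write mentioned above */
}

/* Build "<traverse_path><name>/" after sizing the buffer a single time. */
static void build_full_path(struct pathbuf *out, const char *traverse_path,
			    const char *name, size_t name_len)
{
	size_t tlen = strlen(traverse_path);

	out->len = 0; /* reuse any storage left over from a previous call */
	pathbuf_grow(out, tlen + name_len + 1);
	pathbuf_add(out, traverse_path, tlen);
	pathbuf_add(out, name, name_len);
	pathbuf_add(out, "/", 1);
}
```

Starting from a zero-initialized 'struct pathbuf', calling build_full_path() with "a/b/" and "c" leaves "a/b/c/" in the buffer. A later call with a shorter path reuses the same storage, so the only recurring cost is the three small copies and their NUL writes.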

I will need to investigate to see which one is best.

Thanks,
-Stolee