From: "René Scharfe" <l.s.r@web.de>
To: Derrick Stolee <stolee@gmail.com>,
Johannes Schindelin <Johannes.Schindelin@gmx.de>,
Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, gitster@pobox.com, newren@gmail.com,
matheus.bernardino@usp.br,
Derrick Stolee <derrickstolee@github.com>,
Derrick Stolee <dstolee@microsoft.com>
Subject: Re: [PATCH v3 4/8] unpack-trees: fix nested sparse-dir search
Date: Fri, 20 Aug 2021 22:22:41 +0200
Message-ID: <15624e06-d5cd-d83d-f894-f8ffe3809db0@web.de>
In-Reply-To: <6d3844e7-bcda-490b-ba08-57c9e8058c4b@web.de>
On 20.08.21 at 21:35, René Scharfe wrote:
> On 20.08.21 at 17:18, Derrick Stolee wrote:
>> On 8/19/2021 4:01 AM, Johannes Schindelin wrote:
>>> Hi Stolee,
>>>
>>> On Tue, 17 Aug 2021, Derrick Stolee via GitGitGadget wrote:
>>>
>>>> From: Derrick Stolee <dstolee@microsoft.com>
>>>>
>>>> The iterated search in find_cache_entry() was recently modified to
>>>> include a loop that searches backwards for a sparse directory entry that
>>>> matches the given traverse_info and name_entry. However, the string
>>>> comparison failed to actually concatenate those two strings, so this
>>>> failed to find a sparse directory when it was not a top-level directory.
>>>>
>>>> This caused some errors in rare cases where a 'git checkout' spanned a
>>>> diff that modified files within the sparse directory entry, but we could
>>>> not correctly find the entry.
>>>
>>> Good explanation.
>>>
>>> I wonder a bit about the performance impact. How "hot" is this function?
>>> I.e. how often is it called, on average?
>>>
>>> I ask because I see opportunities to optimize in both directions: it could
>>> be written more concisely (if speed does not matter as much), and it could
>>> be made faster (if speed matters a lot). See below for more.
>>
>> I would definitely optimize for speed here. This can be a very hot path,
>> I believe.
>>
>>>> + strbuf_addstr(&full_path, info->traverse_path);
>>>> + strbuf_add(&full_path, p->path, p->pathlen);
>>>> + strbuf_addch(&full_path, '/');
>>>
>>> This could be reduced to:
>>>
>>> 	strbuf_addf(&full_path, "%s%.*s/",
>>> 		    info->traverse_path, (int)p->pathlen, p->path);
>>
>> We should definitely avoid formatted strings here, if possible.
>>
>>> But if speed matters, we probably need something more like this:
>>>
>>> 	size_t full_path_len;
>>> 	const char *full_path;
>>> 	char *full_path_1 = NULL;
>>>
>>> 	if (!*info->traverse_path) {
>>> 		full_path = p->path;
>>> 		full_path_len = p->pathlen;
>>> 	} else {
>>> 		size_t len = strlen(info->traverse_path);
>>>
>>> 		full_path_len = len + p->pathlen + 1;
>>> 		full_path = full_path_1 = xmalloc(full_path_len + 1);
>>> 		memcpy(full_path_1, info->traverse_path, len);
>>> 		memcpy(full_path_1 + len, p->path, p->pathlen);
>>> 		full_path_1[full_path_len - 1] = '/';
>>> 		full_path_1[full_path_len] = '\0';
>>> 	}
>>
>> The critical benefit here is that we do not need to allocate a
>> buffer if the traverse_path does not exist. That might be a
>> worthwhile investment. That leads to justifying the use of
>> bare 'char *'s instead of 'struct strbuf'.
>>
>> If the traverse_path is usually non-null, then we could continue using
>> strbufs as a helper and get the planned performance gains by using
>> strbuf_grow(&full_path, full_path_len + 1) followed by strbuf_add()
>> (instead of strbuf_addstr()). That would make this code a bit less
>> ugly with the only real overhead being the extra insertions of '\0'
>> characters as we add the strings to the strbuf().
>
> You create full_path only to compare it to another string. You can
> compare the pieces directly, without allocating and copying:
>
> 	const char *path;
>
> 	if (!skip_prefix(ce->name, info->traverse_path, &path) ||
> 	    strncmp(path, p->path, p->pathlen) ||
> 	    strcmp(path + p->pathlen, "/"))

The strcmp line is wrong (it should be path[p->pathlen] != '/'), but
you get the idea.

> 		return NULL;
>
> A test would be nice to demonstrate the fixed issue.
>
> René
>
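
For illustration, here is a minimal standalone sketch of the
piecewise comparison with the corrected last condition. It is not
taken from any patch in this series: skip_prefix_sketch() is a
stand-in for git's skip_prefix(), and sparse_dir_name_matches() and
its parameter names are hypothetical, chosen so the snippet compiles
outside of git. It only checks that a slash follows the name, as in
the correction above; whether the entry must also end right after
that slash is left open here.

```c
#include <stddef.h>
#include <string.h>

/*
 * Stand-in for git's skip_prefix(): if str starts with prefix, store a
 * pointer to the remainder in *out and return 1; otherwise return 0.
 */
static int skip_prefix_sketch(const char *str, const char *prefix,
			      const char **out)
{
	size_t len = strlen(prefix);

	if (strncmp(str, prefix, len))
		return 0;
	*out = str + len;
	return 1;
}

/*
 * Hypothetical helper: does ce_name start with traverse_path, then
 * name, then a slash?  name need not be NUL-terminated (name_len gives
 * its length) but is assumed to contain no NUL bytes.  Nothing is
 * allocated or copied.
 */
static int sparse_dir_name_matches(const char *ce_name,
				   const char *traverse_path,
				   const char *name, size_t name_len)
{
	const char *rest;

	if (!skip_prefix_sketch(ce_name, traverse_path, &rest))
		return 0;
	if (strncmp(rest, name, name_len))
		return 0;
	/*
	 * strncmp() returned 0, so rest holds at least name_len non-NUL
	 * bytes and rest[name_len] is safe to read.
	 */
	return rest[name_len] == '/';
}
```

With traverse_path "deep/" and name "deeper" (length 6), this accepts
"deep/deeper/" and rejects "deep/deeper2/" because the byte after the
name is not a slash. An empty traverse_path covers the top-level case
without a separate branch.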