git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "brian m. carlson" <sandals@crustytoothpaste.net>
To: Matheus Tavares <matheus.bernardino@usp.br>
Cc: git@vger.kernel.org
Subject: Re: [RFC PATCH] parallel-checkout: send the new object_id algo field to the workers
Date: Fri, 14 May 2021 19:53:07 +0000	[thread overview]
Message-ID: <YJ7VI5kHAAk126YJ@camp.crustytoothpaste.net> (raw)
In-Reply-To: <a4225bc79d963c5a411105e2b75f9ca4b80de835.1621000916.git.matheus.bernardino@usp.br>

[-- Attachment #1: Type: text/plain, Size: 2339 bytes --]

On 2021-05-14 at 14:36:00, Matheus Tavares wrote:
> An object_id storing a SHA-1 name has some unused bytes at the end of
> the hash array. Since these bytes are not used, they are usually not
> initialized to any value either. However, at
> parallel_checkout.c:send_one_item() the object_id of a cache entry is
> copied into a buffer which is later sent to a checkout worker through a
> pipe write(). This makes Valgrind complain about passing uninitialized
> bytes to a syscall. The worker won't use these uninitialized bytes
> either, but the warning could confuse someone trying to debug this code;
> So instead of using oidcpy(), send_one_item() uses hashcpy() to only
> copy the used/initialized bytes of the object_id, and leave the
> remaining part with zeros.
> 
> However, since cf0983213c ("hash: add an algo member to struct
> object_id", 2021-04-26), using hashcpy() is no longer sufficient here as
> it won't copy the new algo field from the object_id. Let's add and use a
> new function which meets both our requirements of copying all the
> important object_id data while still avoiding the uninitialized bytes,
> by padding the end of the hash array in the destination object_id. With
> this change, we also no longer need the destination buffer from
> send_one_item() to be initialized with zeros, so let's switch from
> xcalloc() to xmalloc() to make this clear.
> 
> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
> ---
> 
> Hi, brian
> 
> I've read the hash transition plan, but I'm not confident to say that I
> fully understand it yet, so maybe this patch is not exactly what we need
> here. Mainly, I'm not sure I understand in which cases we will have an
> object_id.algo that is not the_hash_algo. Is it for the early transition
> phases, where we have a SHA-256 repo that accepts user input as SHA-1? 

Yes, that's correct, as well as for interoperability with remotes using
a different hash algorithm.

> Also, the object_id's copied here at send_one_item() always come from a
> `struct cache_entry`. In this case, can they still have different
> `algo`s or do we expect them to be the_hash_algo?

No, things in the index should always use the same algorithm..

The patch looks fine to me.
-- 
brian m. carlson (he/him or they/them)
Houston, Texas, US

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 262 bytes --]

  reply	other threads:[~2021-05-14 19:53 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-14 14:36 [RFC PATCH] parallel-checkout: send the new object_id algo field to the workers Matheus Tavares
2021-05-14 19:53 ` brian m. carlson [this message]
2021-05-17 16:54   ` Derrick Stolee
2021-05-17 19:32     ` Junio C Hamano
2021-05-17 19:49 ` [PATCH v2] " Matheus Tavares

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YJ7VI5kHAAk126YJ@camp.crustytoothpaste.net \
    --to=sandals@crustytoothpaste.net \
    --cc=git@vger.kernel.org \
    --cc=matheus.bernardino@usp.br \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).