git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: git@vger.kernel.org
Cc: Patrick Steinhardt <ps@pks.im>
Subject: [PATCH v5 0/5] Fix use of uninitialized hash algorithms
Date: Mon, 20 May 2024 16:14:29 -0700	[thread overview]
Message-ID: <20240520231434.1816979-1-gitster@pobox.com> (raw)
In-Reply-To: <cover.1715582857.git.ps@pks.im>

A change recently merged to 'next' stops us from defaulting to using
SHA-1 unless other code (like a logic early in the start-up sequence
to see what hash is being used in the repository we are working in)
explicitly sets it, leading to a (deliberate) crash of "git" when we
forgot to cover certain code paths.

It turns out we have a few.  Notable ones are all operations that
are designed to work outside a repository.  We should go over all
such code paths and give them a reasonable default when there is one
available (e.g. for historical reasons, patch-id is documented to
work with SHA-1 hashes, so arguably it, or at least when it is
invoked with the "--stable" option, should do so everywhere, not
just in SHA-1 repositories, but in SHA-256 repository or outside any
repository).  In the meantime, if an end-user hits such a "bug"
before we can fix it, it would be nice to give them an escape hatch
to restore the historical behaviour of falling back to use SHA-1.

These patches are designed to apply on a merge of c8aed5e8
(repository: stop setting SHA1 as the default object hash,
2024-05-07) into 3e4a232f (The third batch, 2024-05-13), which has
been the same base throughout the past iterations.

In this fifth iteration:

 - The first step no longer falls back to GIT_DEFAULT_HASH; the
   escape hatch is a dedicated GIT_TEST_DEFAULT_HASH_ALGO
   environment variable, but hopefully we do not have to advertise
   it all that often.

 - The second step has been simplified somewhat to use the "nongit"
   helper when we only need to run a single "git" command in t1517.
   The way the expected output files were prepared in the previous
   versions did not correctly force use of SHA-1 algorithm, which
   has been corrected.  The third step and fourth step for t1517
   continue to be "flip expect_failure to expect_success", but you
   can see context differences in the range-diff.

 - The fourth step also has a fix for t1007 where the previous
   iterations did not correctly force use of SHA-1 to prepare the
   expected output.

Otherwise this round should be ready, modulo possible typoes.


Junio C Hamano (3):
  setup: add an escape hatch for "no more default hash algorithm" change
  t1517: test commands that are designed to be run outside repository
  apply: fix uninitialized hash function

Patrick Steinhardt (2):
  builtin/patch-id: fix uninitialized hash function
  builtin/hash-object: fix uninitialized hash function

 builtin/apply.c         |  4 +++
 builtin/hash-object.c   |  3 +++
 builtin/patch-id.c      | 13 +++++++++
 repository.c            | 44 ++++++++++++++++++++++++++++++
 t/t1007-hash-object.sh  |  6 +++++
 t/t1517-outside-repo.sh | 59 +++++++++++++++++++++++++++++++++++++++++
 t/t4204-patch-id.sh     | 34 ++++++++++++++++++++++++
 7 files changed, 163 insertions(+)
 create mode 100755 t/t1517-outside-repo.sh

-- 
2.45.1-216-g4365c6fcf9

1:  3038f73e1c ! 1:  f5e6868a61 setup: add an escape hatch for "no more default hash algorithm" change
    @@ Commit message
         Partially revert c8aed5e8 (repository: stop setting SHA1 as the
         default object hash, 2024-05-07), to keep end-user systems still
         broken when we have gap in our test coverage but yet give them an
    -    escape hatch to set the GIT_DEFAULT_HASH environment variable to
    -    "sha1" in order to revert to the previous behaviour.  This variable
    -    has been in use for using SHA-256 hash by default, and it should be
    -    a better fit than inventing a new and test-only knob.
    +    escape hatch to set the GIT_TEST_DEFAULT_HASH_ALGO environment
    +    variable to "sha1" in order to revert to the previous behaviour.
     
    -    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    +    Due to the way the end-user facing GIT_DEFAULT_HASH environment
    +    variable is used in our test suite, we unfortunately cannot reuse it
    +    for this purpose.
     
    - ## environment.h ##
    -@@ environment.h: const char *getenv_safe(struct strvec *argv, const char *name);
    - #define GIT_OPTIONAL_LOCKS_ENVIRONMENT "GIT_OPTIONAL_LOCKS"
    - #define GIT_TEXT_DOMAIN_DIR_ENVIRONMENT "GIT_TEXTDOMAINDIR"
    - #define GIT_ATTR_SOURCE_ENVIRONMENT "GIT_ATTR_SOURCE"
    -+#define GIT_DEFAULT_HASH_ENVIRONMENT "GIT_DEFAULT_HASH"
    - 
    - /*
    -  * Environment variable used in handshaking the wire protocol.
    +    Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
      ## repository.c ##
    -@@
    - #include "git-compat-util.h"
    - #include "abspath.h"
    -+#include "environment.h"
    - #include "repository.h"
    - #include "object-store-ll.h"
    - #include "config.h"
     @@
      static struct repository the_repo;
      struct repository *the_repository = &the_repo;
      
    ++/*
    ++ * An escape hatch: if we hit a bug in the production code that fails
    ++ * to set an appropriate hash algorithm (most likely to happen when
    ++ * running outside a repository), we can tell the user who reported
    ++ * the crash to set the environment variable to "sha1" (all lowercase)
    ++ * to revert to the historical behaviour of defaulting to SHA-1.
    ++ */
     +static void set_default_hash_algo(struct repository *repo)
     +{
     +	const char *hash_name;
     +	int algo;
     +
    -+	hash_name = getenv(GIT_DEFAULT_HASH_ENVIRONMENT);
    ++	hash_name = getenv("GIT_TEST_DEFAULT_HASH_ALGO");
     +	if (!hash_name)
     +		return;
     +	algo = hash_algo_by_name(hash_name);
    -+
    -+	/*
    -+	 * NEEDSWORK: after all, falling back to SHA-1 by assigning
    -+	 * GIT_HASH_SHA1 to algo here, instead of returning, may give
    -+	 * us better behaviour.
    -+	 */
     +	if (algo == GIT_HASH_UNKNOWN)
     +		return;
     +
    @@ repository.c: void initialize_repository(struct repository *repo)
     +	 * of the hashed value to see if their input looks like a
     +	 * plausible hash value.
     +	 *
    -+	 * We are in the process of identifying the codepaths and
    ++	 * We are in the process of identifying such code paths and
     +	 * giving them an appropriate default individually; any
    -+	 * unconverted codepath that tries to access the_hash_algo
    ++	 * unconverted code paths that try to access the_hash_algo
     +	 * will thus fail.  The end-users however have an escape hatch
    -+	 * to set GIT_DEFAULT_HASH environment variable to "sha1" get
    -+	 * back the old behaviour of defaulting to SHA-1.
    ++	 * to set GIT_TEST_DEFAULT_HASH_ALGO environment variable to
    ++	 * "sha1" to get back the old behaviour of defaulting to SHA-1.
    ++	 *
    ++	 * This escape hatch is deliberately kept unadvertised, so
    ++	 * that they see crashes and we can get a report before
    ++	 * telling them about it.
     +	 */
     +	if (repo == the_repository)
     +		set_default_hash_algo(repo);
      }
      
      static void expand_base_dir(char **out, const char *in,
    -
    - ## setup.c ##
    -@@ setup.c: int daemonize(void)
    - #define TEST_FILEMODE 1
    - #endif
    - 
    --#define GIT_DEFAULT_HASH_ENVIRONMENT "GIT_DEFAULT_HASH"
    --
    - static void copy_templates_1(struct strbuf *path, struct strbuf *template_path,
    - 			     DIR *dir)
    - {
2:  e56ae68961 ! 2:  f3f9bf0480 t1517: test commands that are designed to be run outside repository
    @@ t/t1517-outside-repo.sh (new)
     +	git diff >sample.patch
     +'
     +
    -+test_expect_failure 'compute a patch-id outside repository' '
    -+	git patch-id <sample.patch >patch-id.expect &&
    -+	(
    -+		cd non-repo &&
    -+		git patch-id <../sample.patch >../patch-id.actual
    -+	) &&
    ++test_expect_failure 'compute a patch-id outside repository (uses SHA-1)' '
    ++	nongit env GIT_DEFAULT_HASH=sha1 \
    ++		git patch-id <sample.patch >patch-id.expect &&
    ++	nongit \
    ++		git patch-id <sample.patch >patch-id.actual &&
     +	test_cmp patch-id.expect patch-id.actual
     +'
     +
    -+test_expect_failure 'hash-object outside repository' '
    -+	git hash-object --stdin <sample.patch >hash.expect &&
    -+	(
    -+		cd non-repo &&
    -+		git hash-object --stdin <../sample.patch >../hash.actual
    -+	) &&
    ++test_expect_failure 'hash-object outside repository (uses SHA-1)' '
    ++	nongit env GIT_DEFAULT_HASH=sha1 \
    ++		git hash-object --stdin <sample.patch >hash.expect &&
    ++	nongit \
    ++		git hash-object --stdin <sample.patch >hash.actual &&
     +	test_cmp hash.expect hash.actual
     +'
     +
3:  e1ae0e95fc ! 3:  246cf395d0 builtin/patch-id: fix uninitialized hash function
    @@ t/t1517-outside-repo.sh: test_expect_success 'set up a non-repo directory and te
      	git diff >sample.patch
      '
      
    --test_expect_failure 'compute a patch-id outside repository' '
    -+test_expect_success 'compute a patch-id outside repository' '
    - 	git patch-id <sample.patch >patch-id.expect &&
    - 	(
    - 		cd non-repo &&
    +-test_expect_failure 'compute a patch-id outside repository (uses SHA-1)' '
    ++test_expect_success 'compute a patch-id outside repository (uses SHA-1)' '
    + 	nongit env GIT_DEFAULT_HASH=sha1 \
    + 		git patch-id <sample.patch >patch-id.expect &&
    + 	nongit \
     
      ## t/t4204-patch-id.sh ##
     @@ t/t4204-patch-id.sh: test_expect_success 'patch-id handles diffs with one line of before/after' '
4:  1c8ce0d024 ! 4:  fe14490548 builtin/hash-object: fix uninitialized hash function
    @@ t/t1007-hash-object.sh: test_expect_success '--literally with extra-long type' '
      	echo example | git hash-object -t $t --literally --stdin
      '
      
    -+test_expect_success '--stdin outside of repository' '
    ++test_expect_success '--stdin outside of repository (uses SHA-1)' '
     +	nongit git hash-object --stdin <hello >actual &&
    -+	echo "$(test_oid hello)" >expect &&
    ++	echo "$(test_oid --hash=sha1 hello)" >expect &&
     +	test_cmp expect actual
     +'
     +
      test_done
     
      ## t/t1517-outside-repo.sh ##
    -@@ t/t1517-outside-repo.sh: test_expect_success 'compute a patch-id outside repository' '
    +@@ t/t1517-outside-repo.sh: test_expect_success 'compute a patch-id outside repository (uses SHA-1)' '
      	test_cmp patch-id.expect patch-id.actual
      '
      
    --test_expect_failure 'hash-object outside repository' '
    -+test_expect_success 'hash-object outside repository' '
    - 	git hash-object --stdin <sample.patch >hash.expect &&
    - 	(
    - 		cd non-repo &&
    +-test_expect_failure 'hash-object outside repository (uses SHA-1)' '
    ++test_expect_success 'hash-object outside repository (uses SHA-1)' '
    + 	nongit env GIT_DEFAULT_HASH=sha1 \
    + 		git hash-object --stdin <sample.patch >hash.expect &&
    + 	nongit \
5:  5b0ec43757 ! 5:  c6d5e68374 apply: fix uninitialized hash function
    @@ builtin/apply.c: int cmd_apply(int argc, const char **argv, const char *prefix)
      				   apply_usage);
     
      ## t/t1517-outside-repo.sh ##
    -@@ t/t1517-outside-repo.sh: test_expect_success 'hash-object outside repository' '
    +@@ t/t1517-outside-repo.sh: test_expect_success 'hash-object outside repository (uses SHA-1)' '
      	test_cmp hash.expect hash.actual
      '
      


  parent reply	other threads:[~2024-05-20 23:14 UTC|newest]

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-13  7:15 [PATCH 0/2] Fix use of uninitialized hash algos Patrick Steinhardt
2024-05-13  7:15 ` [PATCH 1/2] builtin/patch-id: fix uninitialized hash function Patrick Steinhardt
2024-05-13  7:15 ` [PATCH 2/2] builtin/hash-object: " Patrick Steinhardt
2024-05-14  0:16   ` Junio C Hamano
2024-05-13 16:01 ` [PATCH 0/2] Fix use of uninitialized hash algos Junio C Hamano
2024-05-13 18:36   ` Junio C Hamano
2024-05-13 19:21 ` [PATCH v2 0/4] Fix use of uninitialized hash algorithms Junio C Hamano
2024-05-13 19:21   ` [PATCH v2 1/4] setup: add an escape hatch for "no more default hash algorithm" change Junio C Hamano
2024-05-13 19:48     ` Kyle Lippincott
2024-05-13 19:21   ` [PATCH v2 2/4] t1517: test commands that are designed to be run outside repository Junio C Hamano
2024-05-13 19:57     ` Kyle Lippincott
2024-05-13 20:33       ` Junio C Hamano
2024-05-13 21:00         ` Junio C Hamano
2024-05-13 21:07           ` Kyle Lippincott
2024-05-13 19:21   ` [PATCH v2 3/4] builtin/patch-id: fix uninitialized hash function Junio C Hamano
2024-05-13 19:21   ` [PATCH v2 4/4] builtin/hash-object: " Junio C Hamano
2024-05-13 21:28   ` [PATCH 5/4] apply: " Junio C Hamano
2024-05-13 22:41 ` [PATCH v3 0/5] Fix use of uninitialized hash algorithms Junio C Hamano
2024-05-13 22:41   ` [PATCH v3 1/5] setup: add an escape hatch for "no more default hash algorithm" change Junio C Hamano
2024-05-13 22:41   ` [PATCH v3 2/5] t1517: test commands that are designed to be run outside repository Junio C Hamano
2024-05-13 22:41   ` [PATCH v3 3/5] builtin/patch-id: fix uninitialized hash function Junio C Hamano
2024-05-13 23:11     ` Junio C Hamano
2024-05-14  4:31       ` Patrick Steinhardt
2024-05-14 15:52         ` Junio C Hamano
2024-05-13 22:41   ` [PATCH v3 4/5] builtin/hash-object: " Junio C Hamano
2024-05-13 23:13     ` Junio C Hamano
2024-05-14  4:32       ` Patrick Steinhardt
2024-05-14 15:55         ` Junio C Hamano
2024-05-13 22:41   ` [PATCH v3 5/5] apply: " Junio C Hamano
2024-05-14  1:14 ` [PATCH v4 0/5] Fix use of uninitialized hash algorithms Junio C Hamano
2024-05-14  1:14   ` [PATCH v4 1/5] setup: add an escape hatch for "no more default hash algorithm" change Junio C Hamano
2024-05-14  4:32     ` Patrick Steinhardt
2024-05-14 15:05       ` Junio C Hamano
2024-05-14 17:19     ` Junio C Hamano
2024-05-15 12:23       ` Patrick Steinhardt
2024-05-16 15:31       ` Junio C Hamano
2024-05-14  1:14   ` [PATCH v4 2/5] t1517: test commands that are designed to be run outside repository Junio C Hamano
2024-05-14  4:32     ` Patrick Steinhardt
2024-05-14 15:08       ` Junio C Hamano
2024-05-15 12:24         ` Patrick Steinhardt
2024-05-15 14:15           ` Junio C Hamano
2024-05-15 14:25             ` Patrick Steinhardt
2024-05-15 15:40               ` Junio C Hamano
2024-05-14  1:14   ` [PATCH v4 3/5] builtin/patch-id: fix uninitialized hash function Junio C Hamano
2024-05-14  1:14   ` [PATCH v4 4/5] builtin/hash-object: " Junio C Hamano
2024-05-17 23:49     ` Junio C Hamano
2024-05-20 21:19       ` Junio C Hamano
2024-05-20 22:45         ` Junio C Hamano
2024-05-14  1:14   ` [PATCH v4 5/5] apply: " Junio C Hamano
2024-05-20 23:14 ` Junio C Hamano [this message]
2024-05-20 23:14   ` [PATCH v5 1/5] setup: add an escape hatch for "no more default hash algorithm" change Junio C Hamano
2024-05-21  7:57     ` Patrick Steinhardt
2024-05-21 15:59       ` Junio C Hamano
2024-05-20 23:14   ` [PATCH v5 2/5] t1517: test commands that are designed to be run outside repository Junio C Hamano
2024-05-20 23:14   ` [PATCH v5 3/5] builtin/patch-id: fix uninitialized hash function Junio C Hamano
2024-05-20 23:14   ` [PATCH v5 4/5] builtin/hash-object: " Junio C Hamano
2024-05-20 23:14   ` [PATCH v5 5/5] apply: " Junio C Hamano
2024-05-21  7:58     ` Patrick Steinhardt
2024-05-21 13:36       ` Junio C Hamano
2024-05-21  7:58   ` [PATCH v5 0/5] Fix use of uninitialized hash algorithms Patrick Steinhardt
2024-05-21 18:07     ` Junio C Hamano
2024-05-22  4:51       ` Patrick Steinhardt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240520231434.1816979-1-gitster@pobox.com \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=ps@pks.im \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).