git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] setup: recognise extensions.objectFormat
@ 2018-01-24 13:09 Patryk Obara
  2018-01-24 13:37 ` Patryk Obara
  2018-01-25 10:08 ` Duy Nguyen
  0 siblings, 2 replies; 6+ messages in thread
From: Patryk Obara @ 2018-01-24 13:09 UTC (permalink / raw)
  To: git, Junio C Hamano, Nguyễn Thái Ngọc Duy,
	Jeff King, brian m . carlson
  Cc: Jonathan Nieder, Brandon Williams, Jonathan Tan, Stefan Beller

This extension selects which hashing algorithm from vtable should be
used for reading and writing objects in the object store.  At the moment
supports only single value (sha-1).

In case value of objectFormat is an unknown hashing algorithm, Git
command will fail with following message:

  fatal: unknown repository extensions found:
	  objectformat = <value>

To indicate, that this specific objectFormat value is not recognised.

The objectFormat extension is not allowed in repository marked as
version 0 to prevent any possibility of accidentally writing a NewHash
object in the sha-1 object store. This extension behaviour is different
than preciousObjects extension (which is allowed in repo version 0).

Add tests and documentation note about new extension.
---
 Documentation/technical/repository-version.txt |  8 ++++++++
 setup.c                                        | 27 ++++++++++++++++++++++++++
 t/t1302-repo-version.sh                        | 15 ++++++++++++++
 3 files changed, 50 insertions(+)

diff --git a/Documentation/technical/repository-version.txt b/Documentation/technical/repository-version.txt
index 00ad37986e..14a75a7fee 100644
--- a/Documentation/technical/repository-version.txt
+++ b/Documentation/technical/repository-version.txt
@@ -86,3 +86,11 @@ for testing format-1 compatibility.
 When the config key `extensions.preciousObjects` is set to `true`,
 objects in the repository MUST NOT be deleted (e.g., by `git-prune` or
 `git repack -d`).
+
+`objectFormat`
+~~~~~~~~~~~~~~
+
+This extension instructs Git to use a specific algorithm for addressing
+and interpreting objects in the object store. Currently, the only
+supported object format is `sha-1`.  See `hash-function-transition.txt`
+document for more detailed explanation.
diff --git a/setup.c b/setup.c
index 8cc34186ce..b48a90d9ce 100644
--- a/setup.c
+++ b/setup.c
@@ -405,6 +405,31 @@ void setup_work_tree(void)
 	initialized = 1;
 }
 
+static int find_object_format(const char *value)
+{
+	int i;
+	for (i = GIT_HASH_SHA1; i < GIT_HASH_NALGOS; ++i) {
+		if (strcmp(value, hash_algos[i].name) == 0)
+			return i;
+	}
+	return GIT_HASH_UNKNOWN;
+}
+
+static void detect_object_format(struct repository_format *data,
+				 const char *value)
+{
+	if (data->version == 0)
+		die("invalid repository format version");
+
+	data->hash_algo = find_object_format(value);
+	if (data->hash_algo == GIT_HASH_UNKNOWN) {
+		char object_format[25];
+		xsnprintf(object_format, sizeof(object_format),
+			  "objectformat = %s", value);
+		string_list_append(&data->unknown_extensions, object_format);
+	}
+}
+
 static int check_repo_format(const char *var, const char *value, void *vdata)
 {
 	struct repository_format *data = vdata;
@@ -422,6 +447,8 @@ static int check_repo_format(const char *var, const char *value, void *vdata)
 			;
 		else if (!strcmp(ext, "preciousobjects"))
 			data->precious_objects = git_config_bool(var, value);
+		else if (!strcmp(ext, "objectformat"))
+			detect_object_format(data, value);
 		else
 			string_list_append(&data->unknown_extensions, ext);
 	} else if (strcmp(var, "core.bare") == 0) {
diff --git a/t/t1302-repo-version.sh b/t/t1302-repo-version.sh
index ce4cff13bb..227b397ff2 100755
--- a/t/t1302-repo-version.sh
+++ b/t/t1302-repo-version.sh
@@ -107,4 +107,19 @@ test_expect_success 'gc runs without complaint' '
 	git gc
 '
 
+test_expect_success 'object-format not allowed in repo version=0' '
+	mkconfig 0 "objectFormat = sha-1" >.git/config &&
+	check_abort
+'
+
+test_expect_success 'object-format=sha-1 allowed' '
+	mkconfig 1 "objectFormat = sha-1" >.git/config &&
+	check_allow
+'
+
+test_expect_success 'object-format=foo unsupported' '
+	mkconfig 1 "objectFormat = foo" >.git/config &&
+	check_abort
+'
+
 test_done

base-commit: 5be1f00a9a701532232f57958efab4be8c959a29
-- 
2.14.3


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] setup: recognise extensions.objectFormat
  2018-01-24 13:09 [PATCH] setup: recognise extensions.objectFormat Patryk Obara
@ 2018-01-24 13:37 ` Patryk Obara
  2018-01-25 10:08 ` Duy Nguyen
  1 sibling, 0 replies; 6+ messages in thread
From: Patryk Obara @ 2018-01-24 13:37 UTC (permalink / raw)
  To: git, Junio C Hamano, Nguyễn Thái Ngọc Duy,
	Jeff King, brian m . carlson
  Cc: Jonathan Nieder, Brandon Williams, Jonathan Tan, Stefan Beller

Argh! Forgot to sign-off the commit…

-- 
| ← Ceci n'est pas une pipe
Patryk Obara

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] setup: recognise extensions.objectFormat
  2018-01-24 13:09 [PATCH] setup: recognise extensions.objectFormat Patryk Obara
  2018-01-24 13:37 ` Patryk Obara
@ 2018-01-25 10:08 ` Duy Nguyen
  2018-01-26 14:41   ` Johannes Schindelin
  2018-01-26 18:06   ` Stefan Beller
  1 sibling, 2 replies; 6+ messages in thread
From: Duy Nguyen @ 2018-01-25 10:08 UTC (permalink / raw)
  To: Patryk Obara
  Cc: Git Mailing List, Junio C Hamano, Jeff King, brian m . carlson,
	Jonathan Nieder, Brandon Williams, Jonathan Tan, Stefan Beller

On Wed, Jan 24, 2018 at 8:09 PM, Patryk Obara <patryk.obara@gmail.com> wrote:
> This extension selects which hashing algorithm from vtable should be
> used for reading and writing objects in the object store.  At the moment
> supports only single value (sha-1).
>
> In case value of objectFormat is an unknown hashing algorithm, Git
> command will fail with following message:
>
>   fatal: unknown repository extensions found:
>           objectformat = <value>
>
> To indicate, that this specific objectFormat value is not recognised.
>
> The objectFormat extension is not allowed in repository marked as
> version 0 to prevent any possibility of accidentally writing a NewHash
> object in the sha-1 object store. This extension behaviour is different
> than preciousObjects extension (which is allowed in repo version 0).

This config is so sensitive I wonder if we should forbid changing it
via git-config. You can't simply change this and expect anything to
work anyway.

"git init" can have an option to specify object format. "git clone"
naturally inherits the format from the remote repository. Maybe a
future command allows to convert hash algorithm on an existing repo
(*). But other than that nobody is allowed to change this.

(*) it's probably git-clone that does this job, cloning and converting
at the same time.

> +`objectFormat`
> +~~~~~~~~~~~~~~
> +
> +This extension instructs Git to use a specific algorithm for addressing
> +and interpreting objects in the object store. Currently, the only
> +supported object format is `sha-1`.  See `hash-function-transition.txt`
> +document for more detailed explanation.

Maybe the word "experimental" should be mentioned somewhere.

> +static int find_object_format(const char *value)
> +{
> +       int i;
> +       for (i = GIT_HASH_SHA1; i < GIT_HASH_NALGOS; ++i) {
> +               if (strcmp(value, hash_algos[i].name) == 0)
> +                       return i;
> +       }
> +       return GIT_HASH_UNKNOWN;
> +}
> +
> +static void detect_object_format(struct repository_format *data,
> +                                const char *value)
> +{
> +       if (data->version == 0)
> +               die("invalid repository format version");

die(_("invalid repository format version '%d'"), data->version);

> +
> +       data->hash_algo = find_object_format(value);
> +       if (data->hash_algo == GIT_HASH_UNKNOWN) {
> +               char object_format[25];
> +               xsnprintf(object_format, sizeof(object_format),
> +                         "objectformat = %s", value);

We have strbuf so that we don't have to deal with fixed size buffers like this.
-- 
Duy

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] setup: recognise extensions.objectFormat
  2018-01-25 10:08 ` Duy Nguyen
@ 2018-01-26 14:41   ` Johannes Schindelin
  2018-01-26 17:44     ` Patryk Obara
  2018-01-26 18:06   ` Stefan Beller
  1 sibling, 1 reply; 6+ messages in thread
From: Johannes Schindelin @ 2018-01-26 14:41 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: Patryk Obara, Git Mailing List, Junio C Hamano, Jeff King,
	brian m . carlson, Jonathan Nieder, Brandon Williams,
	Jonathan Tan, Stefan Beller

Hi Duy,

On Thu, 25 Jan 2018, Duy Nguyen wrote:

> On Wed, Jan 24, 2018 at 8:09 PM, Patryk Obara <patryk.obara@gmail.com> wrote:
> > This extension selects which hashing algorithm from vtable should be
> > used for reading and writing objects in the object store.  At the moment
> > supports only single value (sha-1).
> >
> > In case value of objectFormat is an unknown hashing algorithm, Git
> > command will fail with following message:
> >
> >   fatal: unknown repository extensions found:
> >           objectformat = <value>
> >
> > To indicate, that this specific objectFormat value is not recognised.
> >
> > The objectFormat extension is not allowed in repository marked as
> > version 0 to prevent any possibility of accidentally writing a NewHash
> > object in the sha-1 object store. This extension behaviour is different
> > than preciousObjects extension (which is allowed in repo version 0).
> 
> This config is so sensitive I wonder if we should forbid changing it
> via git-config. You can't simply change this and expect anything to
> work anyway.

I don't think it makes sense to forbid `git config` from changing these
values, as it is all-too-easy to change them via `git config -e` *anyway*.
And we already have the repositoryFormat precedent with the exact same
issue.

In my opinion, it would only complicate the code, for very little (if at
all noticable) benefit.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] setup: recognise extensions.objectFormat
  2018-01-26 14:41   ` Johannes Schindelin
@ 2018-01-26 17:44     ` Patryk Obara
  0 siblings, 0 replies; 6+ messages in thread
From: Patryk Obara @ 2018-01-26 17:44 UTC (permalink / raw)
  To: Johannes Schindelin, Duy Nguyen
  Cc: Git Mailing List, Junio C Hamano, Jeff King, brian m . carlson,
	Jonathan Nieder, Brandon Williams, Jonathan Tan, Stefan Beller

On 26/01/2018 15:41, Johannes Schindelin wrote:

> On Thu, 25 Jan 2018, Duy Nguyen wrote:
>>
>> This config is so sensitive I wonder if we should forbid changing it
>> via git-config. You can't simply change this and expect anything to
>> work anyway.
> 
> I don't think it makes sense to forbid `git config` from changing these
> values, as it is all-too-easy to change them via `git config -e` *anyway*.
> And we already have the repositoryFormat precedent with the exact same
> issue.
> 
> In my opinion, it would only complicate the code, for very little (if at
> all noticable) benefit.

That's my sentiment as well, but some measure of user protection might 
be necessary.

I was thinking about doing a sanity check: when objectFormat is set,
re-hash N (randomly selected?) objects, where N is sufficient to get 
3-sigma confidence.

This might be necessary to prevent object store corruption e.g. when
objectFormat is set in repo, that is cloned with reference from repo 
without extension.

-- 
| ← Ceci n'est pas une pipe
Patryk Obara

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] setup: recognise extensions.objectFormat
  2018-01-25 10:08 ` Duy Nguyen
  2018-01-26 14:41   ` Johannes Schindelin
@ 2018-01-26 18:06   ` Stefan Beller
  1 sibling, 0 replies; 6+ messages in thread
From: Stefan Beller @ 2018-01-26 18:06 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: Patryk Obara, Git Mailing List, Junio C Hamano, Jeff King,
	brian m . carlson, Jonathan Nieder, Brandon Williams,
	Jonathan Tan

On Thu, Jan 25, 2018 at 2:08 AM, Duy Nguyen <pclouds@gmail.com> wrote:
> On Wed, Jan 24, 2018 at 8:09 PM, Patryk Obara <patryk.obara@gmail.com> wrote:
>> This extension selects which hashing algorithm from vtable should be
>> used for reading and writing objects in the object store.  At the moment
>> supports only single value (sha-1).
>>
>> In case value of objectFormat is an unknown hashing algorithm, Git
>> command will fail with following message:
>>
>>   fatal: unknown repository extensions found:
>>           objectformat = <value>
>>
>> To indicate, that this specific objectFormat value is not recognised.
>>
>> The objectFormat extension is not allowed in repository marked as
>> version 0 to prevent any possibility of accidentally writing a NewHash
>> object in the sha-1 object store. This extension behaviour is different
>> than preciousObjects extension (which is allowed in repo version 0).
>
> This config is so sensitive I wonder if we should forbid changing it
> via git-config. You can't simply change this and expect anything to
> work anyway.

You may have a local tool to do so, "git convert-repo-to <newhash>",
that would also adjust this setting.

I'd second Johannes to not add special error handling and forbidding
git-config to change this setting.

> "git init" can have an option to specify object format.

makes sense.

> "git clone"
> naturally inherits the format from the remote repository.

not necessarily. As the hash transition plan suggests, you'd start with
a local conversion, so it would make sense to have a
"clone --convert-locally-to <newhash>" flag.

> Maybe a
> future command allows to convert hash algorithm on an existing repo
> (*). But other than that nobody is allowed to change this.
>
> (*) it's probably git-clone that does this job, cloning and converting
> at the same time.

I think we also need on-the-fly conversion in fetch/push, so we can work
converted locally, but pretend to the outside we speak sha1 just fine.,

Stefan

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2018-01-26 18:06 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-24 13:09 [PATCH] setup: recognise extensions.objectFormat Patryk Obara
2018-01-24 13:37 ` Patryk Obara
2018-01-25 10:08 ` Duy Nguyen
2018-01-26 14:41   ` Johannes Schindelin
2018-01-26 17:44     ` Patryk Obara
2018-01-26 18:06   ` Stefan Beller

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).