git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [RFC/PATCH 0/2] bumping repository format version
@ 2015-06-23 10:50 Jeff King
  2015-06-23 10:53 ` [PATCH 1/2] introduce "extensions" form of core.repositoryformatversion Jeff King
  2015-06-23 10:54 ` [PATCH 2/2] introduce "preciousObjects" repository extension Jeff King
  0 siblings, 2 replies; 13+ messages in thread
From: Jeff King @ 2015-06-23 10:50 UTC (permalink / raw)
  To: git

We've managed to avoid bumping core.repositoryformatversion for the past
10 years, which is great. But I think there are some looming features
that are going to need it. The most obvious one is changing the ref
storage, where we need some way to tell older gits "no, please don't
think that taking 'refs/heads/foo.lock' is sufficient to actually lock".

The first patch in this series is an attempt to pave the way for version
bumps like this in as painless a way as possible, by letting us mark
incompatible "extensions" by name. That way we can version things like
"how do you lock a ref" independent of the main repositoryformatversion
setting (just like we do for index version, for example).

See the explanation in the first patch for more details. The second
patch shows another use of the "extension" feature to provide safety
in shared-object repos against older versions of git.

  [1/2]: introduce "extensions" form of core.repositoryformatversion
  [2/2]: introduce "preciousObjects" repository extension

-Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/2] introduce "extensions" form of core.repositoryformatversion
  2015-06-23 10:50 [RFC/PATCH 0/2] bumping repository format version Jeff King
@ 2015-06-23 10:53 ` Jeff King
  2015-06-23 10:54 ` [PATCH 2/2] introduce "preciousObjects" repository extension Jeff King
  1 sibling, 0 replies; 13+ messages in thread
From: Jeff King @ 2015-06-23 10:53 UTC (permalink / raw)
  To: git

Normally we try to avoid bumps of the whole-repository
core.repositoryformatversion field. However, it is
unavoidable if we want to safely change certain aspects of
git in a backwards-incompatible way (e.g., modifying the set
of ref tips that we must traverse to generate a list of
unreachable, safe-to-prune objects).

If we were to bump the repository version for every such
change, then any implementation understanding version `X`
would also have to understand `X-1`, `X-2`, and so forth,
even though the incompatibilities may be in orthogonal parts
of the system, and there is otherwise no reason we cannot
implement one without the other (or more importantly, that
the user cannot choose to use one feature without the other,
weighing the tradeoff in compatibility only for that
particular feature).

This patch documents the existing repositoryformatversion
strategy and introduces a new format, "1", which lets a
repository specify that it must run with an arbitrary set of
extensions. This can be used, for example:

 - to inform git that the objects should not be pruned based
   only on the reachability of the ref tips (e.g, because it
   has "clone --shared" children)

 - that the refs are stored in a format besides the usual
   "refs" and "packed-refs" directories

Because we bump to format "1", and because format "1"
requires that a running git knows about any extensions
mentioned, we know that older versions of the code will not
do something dangerous when confronted with these new
formats.

For example, if the user chooses to use database storage for
refs, they may set the "extensions.refbackend" config to
"db". Older versions of git will not understand format "1"
and bail. Versions of git which understand "1" but do not
know about "refbackend", or which know about "refbackend"
but not about the "db" backend, will refuse to run. This is
annoying, of course, but much better than the alternative of
claiming that there are no refs in the repository, or
writing to a location that other implementations will not
read.

Note that we are only defining the rules for format 1 here.
We do not ever write format 1 ourselves; it is a tool that
is meant to be used by users and future extensions to
provide safety with older implementations.

Signed-off-by: Jeff King <peff@peff.net>
---
 Documentation/technical/repository-version.txt | 81 ++++++++++++++++++++++++++
 cache.h                                        |  6 ++
 setup.c                                        | 37 +++++++++++-
 t/t1302-repo-version.sh                        | 38 ++++++++++++
 4 files changed, 159 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/technical/repository-version.txt

diff --git a/Documentation/technical/repository-version.txt b/Documentation/technical/repository-version.txt
new file mode 100644
index 0000000..3d7106d
--- /dev/null
+++ b/Documentation/technical/repository-version.txt
@@ -0,0 +1,81 @@
+Git Repository Format Versions
+==============================
+
+Every git repository is marked with a numeric version in the
+`core.repositoryformatversion` key of its `config` file. This version
+specifies the rules for operating on the on-disk repository data. An
+implementation of git which does not understand a particular version
+advertised by an on-disk repository MUST NOT operate on that repository;
+doing so risks not only producing wrong results, but actually losing
+data.
+
+Because of this rule, version bumps should be kept to an absolute
+minimum. Instead, we generally prefer these strategies:
+
+  - bumping format version numbers of individual data files (e.g.,
+    index, packfiles, etc). This restricts the incompatibilities only to
+    those files.
+
+  - introducing new data that gracefully degrades when used by older
+    clients (e.g., pack bitmap files are ignored by older clients, which
+    simply do not take advantage of the optimization they provide).
+
+A whole-repository format version bump should only be part of a change
+that cannot be independently versioned. For instance, if one were to
+change the reachability rules for objects, or the rules for locking
+refs, that would require a bump of the repository format version.
+
+Note that this applies only to accessing the repository's disk contents
+directly. An older client which understands only format `0` may still
+connect via `git://` to a repository using format `1`, as long as the
+server process understands format `1`.
+
+The preferred strategy for rolling out a version bump (whether whole
+repository or for a single file) is to teach git to read the new format,
+and allow writing the new format with a config switch or command line
+option (for experimentation or for those who do not care about backwards
+compatibility with older gits). Then after a long period to allow the
+reading capability to become common, we may switch to writing the new
+format by default.
+
+The currently defined format versions are:
+
+Version `0`
+-----------
+
+This is the format defined by the initial version of git, including but
+not limited to the format of the repository directory, the repository
+configuration file, and the object and ref storage. Specifying the
+complete behavior of git is beyond the scope of this document.
+
+Version `1`
+-----------
+
+This format is identical to version `0`, with the following exceptions:
+
+  1. When reading the `core.repositoryformatversion` variable, a git
+     implementation which supports version 1 MUST also read any
+     configuration keys found in the `extensions` section of the
+     configuration file.
+
+  2. If a version-1 repository specifies any `extensions.*` keys that
+     the running git has not implemented, the operation MUST NOT
+     proceed. Similarly, if the value of any known key is not understood
+     by the implementation, the operation MUST NOT proceed.
+
+Note that if no extensions are specified in the config file, then
+`core.repositoryformatversion` SHOULD be set to `0` (setting it to `1`
+provides no benefit, and makes the repository incompatible with older
+implementations of git).
+
+This document will serve as the master list for extensions. Any
+implementation wishing to define a new extension should make a note of
+it here, in order to claim the name.
+
+The defined extensions are:
+
+`noop`
+~~~~~~
+
+This extension does not change git's behavior at all. It is useful only
+for testing format-1 compatibility.
diff --git a/cache.h b/cache.h
index 571c98f..bee425b 100644
--- a/cache.h
+++ b/cache.h
@@ -686,7 +686,13 @@ extern char *notes_ref_name;
 
 extern int grafts_replace_parents;
 
+/*
+ * GIT_REPO_VERSION is the version we write by default. The
+ * _READ variant is the highest number we know how to
+ * handle.
+ */
 #define GIT_REPO_VERSION 0
+#define GIT_REPO_VERSION_READ 1
 extern int repository_format_version;
 extern int check_repository_format(void);
 
diff --git a/setup.c b/setup.c
index 82c0cc2..0d53846 100644
--- a/setup.c
+++ b/setup.c
@@ -5,6 +5,7 @@
 static int inside_git_dir = -1;
 static int inside_work_tree = -1;
 static int work_tree_config_is_bogus;
+static struct string_list unknown_extensions = STRING_LIST_INIT_DUP;
 
 /*
  * The input parameter must contain an absolute path, and it must already be
@@ -352,10 +353,23 @@ void setup_work_tree(void)
 
 static int check_repo_format(const char *var, const char *value, void *cb)
 {
+	const char *ext;
+
 	if (strcmp(var, "core.repositoryformatversion") == 0)
 		repository_format_version = git_config_int(var, value);
 	else if (strcmp(var, "core.sharedrepository") == 0)
 		shared_repository = git_config_perm(var, value);
+	else if (skip_prefix(var, "extensions.", &ext)) {
+		/*
+		 * record any known extensions here; otherwise,
+		 * we fall through to recording it as unknown, and
+		 * check_repository_format will complain
+		 */
+		if (!strcmp(ext, "noop"))
+			;
+		else
+			string_list_append(&unknown_extensions, ext);
+	}
 	return 0;
 }
 
@@ -366,6 +380,8 @@ static int check_repository_format_gently(const char *gitdir, int *nongit_ok)
 	config_fn_t fn;
 	int ret = 0;
 
+	string_list_clear(&unknown_extensions, 0);
+
 	if (get_common_dir(&sb, gitdir))
 		fn = check_repo_format;
 	else
@@ -383,16 +399,31 @@ static int check_repository_format_gently(const char *gitdir, int *nongit_ok)
 	 * is a good one.
 	 */
 	git_config_early(fn, NULL, repo_config);
-	if (GIT_REPO_VERSION < repository_format_version) {
+	if (GIT_REPO_VERSION_READ < repository_format_version) {
 		if (!nongit_ok)
 			die ("Expected git repo version <= %d, found %d",
-			     GIT_REPO_VERSION, repository_format_version);
+			     GIT_REPO_VERSION_READ, repository_format_version);
 		warning("Expected git repo version <= %d, found %d",
-			GIT_REPO_VERSION, repository_format_version);
+			GIT_REPO_VERSION_READ, repository_format_version);
 		warning("Please upgrade Git");
 		*nongit_ok = -1;
 		ret = -1;
 	}
+
+	if (repository_format_version >= 1 && unknown_extensions.nr) {
+		int i;
+
+		if (!nongit_ok)
+			die("unknown repository extension: %s",
+			    unknown_extensions.items[0].string);
+
+		for (i = 0; i < unknown_extensions.nr; i++)
+			warning("unknown repository extension: %s",
+				unknown_extensions.items[i].string);
+		*nongit_ok = -1;
+		ret = -1;
+	}
+
 	strbuf_release(&sb);
 	return ret;
 }
diff --git a/t/t1302-repo-version.sh b/t/t1302-repo-version.sh
index 0d9388a..8dd6fd7 100755
--- a/t/t1302-repo-version.sh
+++ b/t/t1302-repo-version.sh
@@ -67,4 +67,42 @@ test_expect_success 'gitdir required mode' '
 	)
 '
 
+check_allow () {
+	git rev-parse --git-dir >actual &&
+	echo .git >expect &&
+	test_cmp expect actual
+}
+
+check_abort () {
+	test_must_fail git rev-parse --git-dir
+}
+
+# avoid git-config, since it cannot be trusted to run
+# in a repository with a broken version
+mkconfig () {
+	echo '[core]' &&
+	echo "repositoryformatversion = $1" &&
+	shift &&
+
+	if test $# -gt 0; then
+		echo '[extensions]' &&
+		for i in "$@"; do
+			echo "$i"
+		done
+	fi
+}
+
+while read outcome version extensions; do
+	test_expect_success "$outcome version=$version $extensions" "
+		mkconfig $version $extensions >.git/config &&
+		check_${outcome}
+	"
+done <<\EOF
+allow 0
+allow 1
+allow 1 noop
+abort 1 no-such-extension
+allow 0 no-such-extension
+EOF
+
 test_done
-- 
2.4.4.719.g3984bc6

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/2] introduce "preciousObjects" repository extension
  2015-06-23 10:50 [RFC/PATCH 0/2] bumping repository format version Jeff King
  2015-06-23 10:53 ` [PATCH 1/2] introduce "extensions" form of core.repositoryformatversion Jeff King
@ 2015-06-23 10:54 ` Jeff King
  2015-06-23 11:14   ` Duy Nguyen
                     ` (3 more replies)
  1 sibling, 4 replies; 13+ messages in thread
From: Jeff King @ 2015-06-23 10:54 UTC (permalink / raw)
  To: git

If this extension is used in a repository, then no
operations should run which may drop objects from the object
storage. This can be useful if you are sharing that storage
with other repositories whose refs you cannot see.

For instance, if you do:

  $ git clone -s parent child
  $ git -C parent config extensions.preciousObjects true
  $ git -C parent config core.repositoryformatversion 1

you now have additional safety when running git in the
parent repository. Prunes and repacks will bail with an
error, and `git gc` will skip those operations (it will
continue to pack refs and do other non-object operations).
Older versions of git, when run in the repository, will
fail on every operation.

Note that we do not set the preciousObjects extension by
default when doing a "clone -s", as doing so breaks
backwards compatibility. It is a decision the user should
make explicitly.

Signed-off-by: Jeff King <peff@peff.net>
---
 Documentation/technical/repository-version.txt |  7 +++++++
 builtin/gc.c                                   | 20 +++++++++++---------
 builtin/prune.c                                |  3 +++
 builtin/repack.c                               |  3 +++
 cache.h                                        |  1 +
 environment.c                                  |  1 +
 setup.c                                        |  2 ++
 t/t1302-repo-version.sh                        | 22 ++++++++++++++++++++++
 8 files changed, 50 insertions(+), 9 deletions(-)

diff --git a/Documentation/technical/repository-version.txt b/Documentation/technical/repository-version.txt
index 3d7106d..00ad379 100644
--- a/Documentation/technical/repository-version.txt
+++ b/Documentation/technical/repository-version.txt
@@ -79,3 +79,10 @@ The defined extensions are:
 
 This extension does not change git's behavior at all. It is useful only
 for testing format-1 compatibility.
+
+`preciousObjects`
+~~~~~~~~~~~~~~~~~
+
+When the config key `extensions.preciousObjects` is set to `true`,
+objects in the repository MUST NOT be deleted (e.g., by `git-prune` or
+`git repack -d`).
diff --git a/builtin/gc.c b/builtin/gc.c
index 36fe333..8b8dc6b 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -352,15 +352,17 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 	if (gc_before_repack())
 		return -1;
 
-	if (run_command_v_opt(repack.argv, RUN_GIT_CMD))
-		return error(FAILED_RUN, repack.argv[0]);
-
-	if (prune_expire) {
-		argv_array_push(&prune, prune_expire);
-		if (quiet)
-			argv_array_push(&prune, "--no-progress");
-		if (run_command_v_opt(prune.argv, RUN_GIT_CMD))
-			return error(FAILED_RUN, prune.argv[0]);
+	if (!repository_format_precious_objects) {
+		if (run_command_v_opt(repack.argv, RUN_GIT_CMD))
+			return error(FAILED_RUN, repack.argv[0]);
+
+		if (prune_expire) {
+			argv_array_push(&prune, prune_expire);
+			if (quiet)
+				argv_array_push(&prune, "--no-progress");
+			if (run_command_v_opt(prune.argv, RUN_GIT_CMD))
+				return error(FAILED_RUN, prune.argv[0]);
+		}
 	}
 
 	if (prune_worktrees_expire) {
diff --git a/builtin/prune.c b/builtin/prune.c
index 0c73246..fc0c8e8 100644
--- a/builtin/prune.c
+++ b/builtin/prune.c
@@ -218,6 +218,9 @@ int cmd_prune(int argc, const char **argv, const char *prefix)
 		return 0;
 	}
 
+	if (repository_format_precious_objects)
+		die("cannot prune in a precious-objects repo");
+
 	while (argc--) {
 		unsigned char sha1[20];
 		const char *name = *argv++;
diff --git a/builtin/repack.c b/builtin/repack.c
index af7340c..8ae7fe5 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -193,6 +193,9 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 	argc = parse_options(argc, argv, prefix, builtin_repack_options,
 				git_repack_usage, 0);
 
+	if (delete_redundant && repository_format_precious_objects)
+		die("cannot repack in a precious-objects repo");
+
 	if (pack_kept_objects < 0)
 		pack_kept_objects = write_bitmaps;
 
diff --git a/cache.h b/cache.h
index bee425b..9b59a63 100644
--- a/cache.h
+++ b/cache.h
@@ -694,6 +694,7 @@ extern int grafts_replace_parents;
 #define GIT_REPO_VERSION 0
 #define GIT_REPO_VERSION_READ 1
 extern int repository_format_version;
+extern int repository_format_precious_objects;
 extern int check_repository_format(void);
 
 #define MTIME_CHANGED	0x0001
diff --git a/environment.c b/environment.c
index 61c685b..da66e82 100644
--- a/environment.c
+++ b/environment.c
@@ -26,6 +26,7 @@ int warn_ambiguous_refs = 1;
 int warn_on_object_refname_ambiguity = 1;
 int ref_paranoia = -1;
 int repository_format_version;
+int repository_format_precious_objects;
 const char *git_commit_encoding;
 const char *git_log_output_encoding;
 int shared_repository = PERM_UMASK;
diff --git a/setup.c b/setup.c
index 0d53846..8b8dca9 100644
--- a/setup.c
+++ b/setup.c
@@ -367,6 +367,8 @@ static int check_repo_format(const char *var, const char *value, void *cb)
 		 */
 		if (!strcmp(ext, "noop"))
 			;
+		else if (!strcmp(ext, "preciousobjects"))
+			repository_format_precious_objects = git_config_bool(var, value);
 		else
 			string_list_append(&unknown_extensions, ext);
 	}
diff --git a/t/t1302-repo-version.sh b/t/t1302-repo-version.sh
index 8dd6fd7..9bcd349 100755
--- a/t/t1302-repo-version.sh
+++ b/t/t1302-repo-version.sh
@@ -105,4 +105,26 @@ abort 1 no-such-extension
 allow 0 no-such-extension
 EOF
 
+test_expect_success 'precious-objects allowed' '
+	mkconfig 1 preciousObjects >.git/config &&
+	check_allow
+'
+
+test_expect_success 'precious-objects blocks destructive repack' '
+	test_must_fail git repack -ad
+'
+
+test_expect_success 'other repacks are OK' '
+	test_commit foo &&
+	git repack
+'
+
+test_expect_success 'precious-objects blocks prune' '
+	test_must_fail git prune
+'
+
+test_expect_success 'gc runs without complaint' '
+	git gc
+'
+
 test_done
-- 
2.4.4.719.g3984bc6

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] introduce "preciousObjects" repository extension
  2015-06-23 10:54 ` [PATCH 2/2] introduce "preciousObjects" repository extension Jeff King
@ 2015-06-23 11:14   ` Duy Nguyen
  2015-06-23 11:47     ` Jeff King
  2015-06-23 21:05   ` Junio C Hamano
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 13+ messages in thread
From: Duy Nguyen @ 2015-06-23 11:14 UTC (permalink / raw)
  To: Jeff King; +Cc: Git Mailing List

On Tue, Jun 23, 2015 at 5:54 PM, Jeff King <peff@peff.net> wrote:
> diff --git a/builtin/prune.c b/builtin/prune.c
> index 0c73246..fc0c8e8 100644
> --- a/builtin/prune.c
> +++ b/builtin/prune.c
> @@ -218,6 +218,9 @@ int cmd_prune(int argc, const char **argv, const char *prefix)
>                 return 0;
>         }
>
> +       if (repository_format_precious_objects)
> +               die("cannot prune in a precious-objects repo");
> +
>         while (argc--) {
>                 unsigned char sha1[20];
>                 const char *name = *argv++;
> diff --git a/builtin/repack.c b/builtin/repack.c
> index af7340c..8ae7fe5 100644
> --- a/builtin/repack.c
> +++ b/builtin/repack.c
> @@ -193,6 +193,9 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
>         argc = parse_options(argc, argv, prefix, builtin_repack_options,
>                                 git_repack_usage, 0);
>
> +       if (delete_redundant && repository_format_precious_objects)
> +               die("cannot repack in a precious-objects repo");
> +
>         if (pack_kept_objects < 0)
>                 pack_kept_objects = write_bitmaps;
>

I know both commands have translatable strings that are not marked
with _(). But if you reroll, maybe you could add _() for these new
strings. It's even better if you mark all others in these commands
too, if you have time.
-- 
Duy

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] introduce "preciousObjects" repository extension
  2015-06-23 11:14   ` Duy Nguyen
@ 2015-06-23 11:47     ` Jeff King
  0 siblings, 0 replies; 13+ messages in thread
From: Jeff King @ 2015-06-23 11:47 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Git Mailing List

On Tue, Jun 23, 2015 at 06:14:22PM +0700, Duy Nguyen wrote:

> > +       if (delete_redundant && repository_format_precious_objects)
> > +               die("cannot repack in a precious-objects repo");
> > +
> >         if (pack_kept_objects < 0)
> >                 pack_kept_objects = write_bitmaps;
> >
> 
> I know both commands have translatable strings that are not marked
> with _(). But if you reroll, maybe you could add _() for these new
> strings. It's even better if you mark all others in these commands
> too, if you have time.

Yeah, I agree these should be marked for translation. Thanks.

-Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] introduce "preciousObjects" repository extension
  2015-06-23 10:54 ` [PATCH 2/2] introduce "preciousObjects" repository extension Jeff King
  2015-06-23 11:14   ` Duy Nguyen
@ 2015-06-23 21:05   ` Junio C Hamano
  2015-06-24  7:50     ` Jeff King
  2015-06-23 21:31   ` David Turner
  2015-06-24  8:12   ` Jeff King
  3 siblings, 1 reply; 13+ messages in thread
From: Junio C Hamano @ 2015-06-23 21:05 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Jeff King <peff@peff.net> writes:

>  This extension does not change git's behavior at all. It is useful only
>  for testing format-1 compatibility.
> +
> +`preciousObjects`
> +~~~~~~~~~~~~~~~~~
> +
> +When the config key `extensions.preciousObjects` is set to `true`,
> +objects in the repository MUST NOT be deleted (e.g., by `git-prune` or
> +`git repack -d`).

OK.  In essense, the 'extension' on the disk is like 'capability' on
the wire, in that you are not supposed to ask for capability they do
not understand, and you are not supposed to touch a repository you
do not understand.

And the above looks like a reasonable sample "feature" to protect by
the 'extension' system.

> +	if (delete_redundant && repository_format_precious_objects)
> +		die("cannot repack in a precious-objects repo");

This message initially threw me off during my cursory reading, but
the code tells me that this is only about "repack -d".

Unfortunately the users do not get the chance to read the code;
perhaps s/cannot repack/& -d/; or something?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] introduce "preciousObjects" repository extension
  2015-06-23 10:54 ` [PATCH 2/2] introduce "preciousObjects" repository extension Jeff King
  2015-06-23 11:14   ` Duy Nguyen
  2015-06-23 21:05   ` Junio C Hamano
@ 2015-06-23 21:31   ` David Turner
  2015-06-24  7:55     ` Jeff King
  2015-06-24  8:12   ` Jeff King
  3 siblings, 1 reply; 13+ messages in thread
From: David Turner @ 2015-06-23 21:31 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On Tue, 2015-06-23 at 06:54 -0400, Jeff King wrote:
> +	mkconfig 1 preciousObjects >.git/config &&

nit: I think it's better to use git config rather than manually writing
config files.  git config is more future-proof if we up switching config
backends; it's also more composable with other configs (making this test
easier to copy and manipulate), and more explicit.

Other than that it looks good to me.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] introduce "preciousObjects" repository extension
  2015-06-23 21:05   ` Junio C Hamano
@ 2015-06-24  7:50     ` Jeff King
  2015-06-24 17:15       ` Junio C Hamano
  0 siblings, 1 reply; 13+ messages in thread
From: Jeff King @ 2015-06-24  7:50 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Tue, Jun 23, 2015 at 02:05:28PM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> >  This extension does not change git's behavior at all. It is useful only
> >  for testing format-1 compatibility.
> > +
> > +`preciousObjects`
> > +~~~~~~~~~~~~~~~~~
> > +
> > +When the config key `extensions.preciousObjects` is set to `true`,
> > +objects in the repository MUST NOT be deleted (e.g., by `git-prune` or
> > +`git repack -d`).
> 
> OK.  In essense, the 'extension' on the disk is like 'capability' on
> the wire, in that you are not supposed to ask for capability they do
> not understand, and you are not supposed to touch a repository you
> do not understand.

Yeah, I think that is a good analogy.

> > +	if (delete_redundant && repository_format_precious_objects)
> > +		die("cannot repack in a precious-objects repo");
> 
> This message initially threw me off during my cursory reading, but
> the code tells me that this is only about "repack -d".
> 
> Unfortunately the users do not get the chance to read the code;
> perhaps s/cannot repack/& -d/; or something?

I agree that would be better. I originally just blocked all use of
git-repack, but at the last minute softened it to just "repack -d". I'm
not sure if that would actually help anyone in practice. Sure, doing
"git repack" without any options is not destructive, but I wonder if
anybody actually does it. They either run `git gc`, or they probably do
something more exotic and disable the safety check during that run[1].

So I think we could squash in the patch below (which also marks the
strings for translation). But I'd also be OK with the rule covering all
of `git repack`.

-Peff

[1] One of my proposed uses for this is to revamp the way we handle
    shared objects on GitHub servers. Right now objects get pushed to
    individual forks, and then migrate to a shared repository that is
    accessed via the alternates mechanism. I would like to move to
    symlinking the `objects/` directory to write directly into the
    shared space. But the destruction from accidentally running
    something like `git gc` in a fork is very high. With this patch, we
    can bump the forks to the v1 format and mark their objects as
    precious.

---
diff --git a/builtin/prune.c b/builtin/prune.c
index fc0c8e8..6a58e75 100644
--- a/builtin/prune.c
+++ b/builtin/prune.c
@@ -219,7 +219,7 @@ int cmd_prune(int argc, const char **argv, const char *prefix)
 	}
 
 	if (repository_format_precious_objects)
-		die("cannot prune in a precious-objects repo");
+		die(_("cannot prune in a precious-objects repo"));
 
 	while (argc--) {
 		unsigned char sha1[20];
diff --git a/builtin/repack.c b/builtin/repack.c
index 8ae7fe5..3beda2c 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -194,7 +194,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 				git_repack_usage, 0);
 
 	if (delete_redundant && repository_format_precious_objects)
-		die("cannot repack in a precious-objects repo");
+		die(_("cannot delete packs in a precious-objects repo"));
 
 	if (pack_kept_objects < 0)
 		pack_kept_objects = write_bitmaps;

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] introduce "preciousObjects" repository extension
  2015-06-23 21:31   ` David Turner
@ 2015-06-24  7:55     ` Jeff King
  0 siblings, 0 replies; 13+ messages in thread
From: Jeff King @ 2015-06-24  7:55 UTC (permalink / raw)
  To: David Turner; +Cc: git

On Tue, Jun 23, 2015 at 05:31:14PM -0400, David Turner wrote:

> On Tue, 2015-06-23 at 06:54 -0400, Jeff King wrote:
> > +	mkconfig 1 preciousObjects >.git/config &&
> 
> nit: I think it's better to use git config rather than manually writing
> config files.  git config is more future-proof if we up switching config
> backends; it's also more composable with other configs (making this test
> easier to copy and manipulate), and more explicit.

I would have preferred that, but it makes the tests very fragile. You
are depending on "git config" running even when the current directory is
not a valid git repo (and we walk up to the surround git.git directory
and change the config there!).

I guess we could use "git config -f .git/config", though that is
partially defeating the purpose of your suggestion.

I dunno. I kind of figured we would cross that bridge if and when we
come to it. I imagine you are pretty sensitive to it, though, having
just waded through all the tests that assume various things about
.git/refs. :)

-Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] introduce "preciousObjects" repository extension
  2015-06-23 10:54 ` [PATCH 2/2] introduce "preciousObjects" repository extension Jeff King
                     ` (2 preceding siblings ...)
  2015-06-23 21:31   ` David Turner
@ 2015-06-24  8:12   ` Jeff King
  2015-06-24 10:29     ` Duy Nguyen
  3 siblings, 1 reply; 13+ messages in thread
From: Jeff King @ 2015-06-24  8:12 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git

On Tue, Jun 23, 2015 at 06:54:11AM -0400, Jeff King wrote:

> diff --git a/builtin/prune.c b/builtin/prune.c
> index 0c73246..fc0c8e8 100644
> --- a/builtin/prune.c
> +++ b/builtin/prune.c
> @@ -218,6 +218,9 @@ int cmd_prune(int argc, const char **argv, const char *prefix)
>  		return 0;
>  	}
>  
> +	if (repository_format_precious_objects)
> +		die("cannot prune in a precious-objects repo");
> +

By the way, I originally wrote this patch on an older version of git,
and was surprised that `git gc` broke when I brought it forward. The
problem was that gc now runs `git prune --worktree`, and my die() was
affecting that case.

It really seems misplaced to me to make worktree-pruning part of
git-prune. I imagine the rationale was "well, we are pruning things, so
let's add an option to this command rather than make a new one". But I
do not think that follows our usual pattern with gc, which is:

  1. Individual areas of code handle their own cleanup. E.g., "reflog
     expire", "rerere gc".

  2. We tie them together with "git gc", not with "git prune".

So it seems weird that "git prune --worktree" is a fundamentally
different command than "git prune". I think by (1), it should be a
separate "git prune-worktree" (or better yet, "git worktree prune", as
the start of a command for manipulating the list of multiple-worktrees).

Not a _huge_ deal, but if we want to change it, it would be nice to do so
now before it is part of a release. Thoughts?

-Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] introduce "preciousObjects" repository extension
  2015-06-24  8:12   ` Jeff King
@ 2015-06-24 10:29     ` Duy Nguyen
  0 siblings, 0 replies; 13+ messages in thread
From: Duy Nguyen @ 2015-06-24 10:29 UTC (permalink / raw)
  To: Jeff King; +Cc: Git Mailing List

On Wed, Jun 24, 2015 at 3:12 PM, Jeff King <peff@peff.net> wrote:
> On Tue, Jun 23, 2015 at 06:54:11AM -0400, Jeff King wrote:
>
>> diff --git a/builtin/prune.c b/builtin/prune.c
>> index 0c73246..fc0c8e8 100644
>> --- a/builtin/prune.c
>> +++ b/builtin/prune.c
>> @@ -218,6 +218,9 @@ int cmd_prune(int argc, const char **argv, const char *prefix)
>>               return 0;
>>       }
>>
>> +     if (repository_format_precious_objects)
>> +             die("cannot prune in a precious-objects repo");
>> +
>
> By the way, I originally wrote this patch on an older version of git,
> and was surprised that `git gc` broke when I brought it forward. The
> problem was that gc now runs `git prune --worktree`, and my die() was
> affecting that case.
>
> It really seems misplaced to me to make worktree-pruning part of
> git-prune. I imagine the rationale was "well, we are pruning things, so
> let's add an option to this command rather than make a new one". But I
> do not think that follows our usual pattern with gc, which is:
>
>   1. Individual areas of code handle their own cleanup. E.g., "reflog
>      expire", "rerere gc".
>
>   2. We tie them together with "git gc", not with "git prune".
>
> So it seems weird that "git prune --worktree" is a fundamentally
> different command than "git prune". I think by (1), it should be a
> separate "git prune-worktree" (or better yet, "git worktree prune", as
> the start of a command for manipulating the list of multiple-worktrees).
>
> Not a _huge_ deal, but if we want to change it, it would be nice to do so
> now before it is part of a release. Thoughts?

I was misled by the generic name "prune" :) (OK my bad, the document
clearly says it's about object database). Maybe we should make an
alias prune-objects.. And you caught me too late, I also added
prune_shallow() in there.

Multiple worktree feature is not released yet so I still have some
time to move it out. Yeah "git worktree prune" makes sense. I think I
need a way to list worktrees anyway. I didn't find good enough reason
to create "git worktree" back then just for listing..
-- 
Duy

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] introduce "preciousObjects" repository extension
  2015-06-24  7:50     ` Jeff King
@ 2015-06-24 17:15       ` Junio C Hamano
  2015-06-25 10:07         ` Jeff King
  0 siblings, 1 reply; 13+ messages in thread
From: Junio C Hamano @ 2015-06-24 17:15 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Jeff King <peff@peff.net> writes:

>> > +	if (delete_redundant && repository_format_precious_objects)
>> > +		die("cannot repack in a precious-objects repo");
>> 
>> This message initially threw me off during my cursory reading, but
>> the code tells me that this is only about "repack -d".
>> 
>> Unfortunately the users do not get the chance to read the code;
>> perhaps s/cannot repack/& -d/; or something?
>
> I agree that would be better. I originally just blocked all use of
> git-repack, but at the last minute softened it to just "repack -d". I'm
> not sure if that would actually help anyone in practice. Sure, doing
> "git repack" without any options is not destructive, but I wonder if
> anybody actually does it.

Hmph, if you cannot afford to lose objects that are unreachable from
your refs (because you know your repository has borrowers) but are
suffering from too many packs, wouldn't "repack -a" be the most
natural thing to do?  Maybe I am biased, but "git gc" is not the
first thing that comes to my mind in that situation.

> So I think we could squash in the patch below (which also marks the
> strings for translation). But I'd also be OK with the rule covering all
> of `git repack`.

OK, will squash it in.

> [1] One of my proposed uses for this is to revamp the way we handle
>     shared objects on GitHub servers. Right now objects get pushed to
>     individual forks, and then migrate to a shared repository that is
>     accessed via the alternates mechanism. I would like to move to
>     symlinking the `objects/` directory to write directly into the
>     shared space. But the destruction from accidentally running
>     something like `git gc` in a fork is very high. With this patch, we
>     can bump the forks to the v1 format and mark their objects as
>     precious.
>
> ---
> diff --git a/builtin/prune.c b/builtin/prune.c
> index fc0c8e8..6a58e75 100644
> --- a/builtin/prune.c
> +++ b/builtin/prune.c
> @@ -219,7 +219,7 @@ int cmd_prune(int argc, const char **argv, const char *prefix)
>  	}
>  
>  	if (repository_format_precious_objects)
> -		die("cannot prune in a precious-objects repo");
> +		die(_("cannot prune in a precious-objects repo"));
>  
>  	while (argc--) {
>  		unsigned char sha1[20];
> diff --git a/builtin/repack.c b/builtin/repack.c
> index 8ae7fe5..3beda2c 100644
> --- a/builtin/repack.c
> +++ b/builtin/repack.c
> @@ -194,7 +194,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
>  				git_repack_usage, 0);
>  
>  	if (delete_redundant && repository_format_precious_objects)
> -		die("cannot repack in a precious-objects repo");
> +		die(_("cannot delete packs in a precious-objects repo"));
>  
>  	if (pack_kept_objects < 0)
>  		pack_kept_objects = write_bitmaps;

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] introduce "preciousObjects" repository extension
  2015-06-24 17:15       ` Junio C Hamano
@ 2015-06-25 10:07         ` Jeff King
  0 siblings, 0 replies; 13+ messages in thread
From: Jeff King @ 2015-06-25 10:07 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Wed, Jun 24, 2015 at 10:15:08AM -0700, Junio C Hamano wrote:

> > I agree that would be better. I originally just blocked all use of
> > git-repack, but at the last minute softened it to just "repack -d". I'm
> > not sure if that would actually help anyone in practice. Sure, doing
> > "git repack" without any options is not destructive, but I wonder if
> > anybody actually does it.
> 
> Hmph, if you cannot afford to lose objects that are unreachable from
> your refs (because you know your repository has borrowers) but are
> suffering from too many packs, wouldn't "repack -a" be the most
> natural thing to do?  Maybe I am biased, but "git gc" is not the
> first thing that comes to my mind in that situation.

My assumption was that people fall into one of two categories:

  - people who just run `git gc`

  - people who are doing something clever, and will use `git
    repack` to do a full repack after making sure it is safe to do so.

    E.g., after "clone -s", it might be OK to do:

      for i in ../*.git; do
        git fetch $i +refs/*:refs/remotes/$i/*
      done
      git -c extensions.preciousObjects=false repack -ad

    But only the user can make that decision; git does not know whether
    "../*.git" is the complete set of children.

Certainly "git repack -a" is a safe stopgap in the shared-object parent,
but eventually you will want to do the clever thing. :)

I think it is OK to use this patch as a starting point, and for people
to loosen the rules later if there is a combination of repack flags that
are safe to run but not covered by the current logic (there is no
regression, since preciousObjects is a new extension, and going forward
it is OK to allow new safe things to open up workflows, but not the
other way around).

It may even be that the current patch even allows any sane workflow; I
am only claiming that I did not think too hard on it, and tried to err
on the side of safety, and allowing the workflow above.

-Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2015-06-25 10:12 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-06-23 10:50 [RFC/PATCH 0/2] bumping repository format version Jeff King
2015-06-23 10:53 ` [PATCH 1/2] introduce "extensions" form of core.repositoryformatversion Jeff King
2015-06-23 10:54 ` [PATCH 2/2] introduce "preciousObjects" repository extension Jeff King
2015-06-23 11:14   ` Duy Nguyen
2015-06-23 11:47     ` Jeff King
2015-06-23 21:05   ` Junio C Hamano
2015-06-24  7:50     ` Jeff King
2015-06-24 17:15       ` Junio C Hamano
2015-06-25 10:07         ` Jeff King
2015-06-23 21:31   ` David Turner
2015-06-24  7:55     ` Jeff King
2015-06-24  8:12   ` Jeff King
2015-06-24 10:29     ` Duy Nguyen

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).