git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Michael Haggerty <mhagger@alum.mit.edu>
To: David Turner <dturner@twopensource.com>, git@vger.kernel.org
Subject: Re: [PATCH 13/16] init: allow alternate backends to be set for new repos
Date: Wed, 23 Dec 2015 14:34:23 +0100	[thread overview]
Message-ID: <567AA2DF.1020408@alum.mit.edu> (raw)
In-Reply-To: <1449102921-7707-14-git-send-email-dturner@twopensource.com>

On 12/03/2015 01:35 AM, David Turner wrote:
> git init learns a new argument --refs-backend-type.  Presently, only
> "files" is supported, but later we will add other backends.
> 
> When this argument is used, the repository's core.refsBackendType
> configuration value is set, and the refs backend's initdb function is
> used to set up the ref database.
> 
> Signed-off-by: David Turner <dturner@twopensource.com>
> ---
>  Documentation/git-init-db.txt |  2 +-
>  Documentation/git-init.txt    |  6 +++++-
>  builtin/init-db.c             | 10 ++++++++++
>  cache.h                       |  2 ++
>  config.c                      | 20 ++++++++++++++++++++
>  environment.c                 |  1 +
>  path.c                        | 32 ++++++++++++++++++++++++++++++--
>  refs.c                        |  8 ++++++++
>  refs.h                        | 12 ++++++++++++
>  setup.c                       | 10 ++++++++++
>  10 files changed, 99 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/git-init-db.txt b/Documentation/git-init-db.txt
> index 648a6cd..72fbd71 100644
> --- a/Documentation/git-init-db.txt
> +++ b/Documentation/git-init-db.txt
> @@ -9,7 +9,7 @@ git-init-db - Creates an empty Git repository
>  SYNOPSIS
>  --------
>  [verse]
> -'git init-db' [-q | --quiet] [--bare] [--template=<template_directory>] [--separate-git-dir <git dir>] [--shared[=<permissions>]]
> +'git init-db' [-q | --quiet] [--bare] [--template=<template_directory>] [--separate-git-dir <git dir>] [--shared[=<permissions>]] [--refs-backend-type=<name>]
>  
>  
>  DESCRIPTION
> diff --git a/Documentation/git-init.txt b/Documentation/git-init.txt
> index 8174d27..9ea6753 100644
> --- a/Documentation/git-init.txt
> +++ b/Documentation/git-init.txt
> @@ -12,7 +12,7 @@ SYNOPSIS
>  'git init' [-q | --quiet] [--bare] [--template=<template_directory>]
>  	  [--separate-git-dir <git dir>]
>  	  [--shared[=<permissions>]] [directory]
> -
> +	  [--refs-backend-type=<name>]

ISTM that "backend" (used here in this option name, and in the manpage)
is not such a meaningful term to users. Could we pick a less obscure
term? E.g., maybe "--ref-storage=<name>"?


>  DESCRIPTION
>  -----------
> @@ -113,6 +113,10 @@ does not exist, it will be created.
>  
>  --
>  
> +--refs-backend-type=<name>::
> +Type of refs backend. Default is to use the original "files" backend,
> +which stores ref data in files in .git/refs and .git/packed-refs.
> +
>  TEMPLATE DIRECTORY
>  ------------------
>  
> diff --git a/builtin/init-db.c b/builtin/init-db.c
> index 4771e7e..44db591 100644
> --- a/builtin/init-db.c
> +++ b/builtin/init-db.c
> @@ -204,6 +204,14 @@ static int create_default_files(const char *template_path)
>  		adjust_shared_perm(get_git_dir());
>  	}
>  
> +	if (refs_backend_type) {
> +		struct refdb_config_data config_data = {NULL};
> +		git_config_set("core.refsBackendType", refs_backend_type);
> +		config_data.refs_backend_type = refs_backend_type;
> +		config_data.refs_base = get_git_dir();
> +		set_refs_backend(refs_backend_type, &config_data);
> +	}
> +

Here is the source of the "void *data" argument that puzzled me in patch
08/16. I'm still puzzled. This code, which is later also used for the
LMDB backend, *always* passes that function a "struct refdb_config_data
*". So why is the argument declared to be "void *"?

If, on the other hand, you want to preserve a backend's freedom to
require different extra data in this parameter, then this code in
init-db.c, and code like it elsewhere, would have to know about the
reference backends so that it knows what data to pass to each one. In
that case, there would be no reason to make this function virtual; this
code could just as well call a different function (with a different
signature) depending on the backend that is in use.

Either way, something seems strange here.

(Remember that another alternative is to let the refs backend read
whatever specialized information it needs from the config by itself.)

>  	if (refs_init_db(&err, shared_repository))
>  		die("failed to set up refs db: %s", err.buf);
>  
> @@ -469,6 +477,8 @@ int cmd_init_db(int argc, const char **argv, const char *prefix)
>  		OPT_BIT('q', "quiet", &flags, N_("be quiet"), INIT_DB_QUIET),
>  		OPT_STRING(0, "separate-git-dir", &real_git_dir, N_("gitdir"),
>  			   N_("separate git dir from working tree")),
> +		OPT_STRING(0, "refs-backend-type", &refs_backend_type,
> +			   N_("name"), N_("name of backend type to use")),
>  		OPT_END()
>  	};
>  
> diff --git a/cache.h b/cache.h
> index 707455a..d1534db 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -696,6 +696,8 @@ extern enum object_creation_mode object_creation_mode;
>  
>  extern char *notes_ref_name;
>  
> +extern const char *refs_backend_type;
> +
>  extern int grafts_replace_parents;
>  
>  /*
> diff --git a/config.c b/config.c
> index 248a21a..210aa08 100644
> --- a/config.c
> +++ b/config.c
> @@ -10,6 +10,7 @@
>  #include "exec_cmd.h"
>  #include "strbuf.h"
>  #include "quote.h"
> +#include "refs.h"
>  #include "hashmap.h"
>  #include "string-list.h"
>  #include "utf8.h"
> @@ -1207,6 +1208,25 @@ int git_config_early(config_fn_t fn, void *data, const char *repo_config)
>  	}
>  
>  	if (repo_config && !access_or_die(repo_config, R_OK, 0)) {
> +		struct refdb_config_data refdb_data = {NULL};
> +		char *repo_config_copy;
> +
> +		/*
> +		 * make sure we always read the backend config from the
> +		 * core section on startup
> +		 */
> +		ret += git_config_from_file(refdb_config, repo_config,
> +					    &refdb_data);
> +
> +		repo_config_copy = xstrdup(repo_config);
> +		refdb_data.refs_base = xstrdup(dirname(repo_config_copy));
> +		free(repo_config_copy);
> +
> +		if (refdb_data.refs_backend_type &&
> +		    strcmp(refdb_data.refs_backend_type, "files")) {
> +			die("Unexpected backend %s", refdb_data.refs_backend_type);
> +		}

The refs code already maintains a list of valid backend names. It would
be better to ask it to validate this parameter than to maintain a second
list here. Or just call set_refs_backend() and let *it* fail.

> +
>  		ret += git_config_from_file(fn, repo_config, data);
>  		found += 1;
>  	}
> diff --git a/environment.c b/environment.c
> index 2da7fe2..8dbf0ab 100644
> --- a/environment.c
> +++ b/environment.c
> @@ -66,6 +66,7 @@ int merge_log_config = -1;
>  int precomposed_unicode = -1; /* see probe_utf8_pathname_composition() */
>  struct startup_info *startup_info;
>  unsigned long pack_size_limit_cfg;
> +const char *refs_backend_type;
>  
>  #ifndef PROTECT_HFS_DEFAULT
>  #define PROTECT_HFS_DEFAULT 0
> diff --git a/path.c b/path.c
> index 3cd155e..86a8035 100644
> --- a/path.c
> +++ b/path.c
> @@ -2,6 +2,7 @@
>   * Utilities for paths and pathnames
>   */
>  #include "cache.h"
> +#include "refs.h"
>  #include "strbuf.h"
>  #include "string-list.h"
>  #include "dir.h"
> @@ -510,9 +511,36 @@ int validate_headref(const char *path)
>  	unsigned char sha1[20];
>  	int fd;
>  	ssize_t len;
> +	struct refdb_config_data refdb_data = {NULL, NULL};
> +
> +	if (lstat(path, &st) < 0) {
> +		int backend_type_set;
> +		struct strbuf config_path = STRBUF_INIT;
> +		if (path) {
> +			char *pathdup = xstrdup(path);
> +			char *git_dir = dirname(pathdup);
> +			strbuf_addf(&config_path, "%s/%s", git_dir, "config");
> +			free(pathdup);
> +		} else {
> +			strbuf_addstr(&config_path, "config");
> +		}
>  
> -	if (lstat(path, &st) < 0)
> -		return -1;
> +		if (git_config_from_file(refdb_config, config_path.buf, &refdb_data)) {
> +			strbuf_release(&config_path);
> +			return -1;
> +		}
> +
> +		backend_type_set = !!refdb_data.refs_backend_type;
> +		free((void *)refdb_data.refs_backend_type);
> +		free((void *)refdb_data.refs_base);
> +		strbuf_release(&config_path);
> +		/*
> +		 * Alternate backends are assumed to keep HEAD
> +		 * in a valid state, so there's no need to do
> +		 * further validation.
> +		 */
> +		return backend_type_set ? 0 : -1;
> +	}

I'm not sure what problem you are trying to solve with this particular
change. Won't a lmdb-backed repository still have a "HEAD" file with the
usual contents?

In any case, it seems like you are doing work that belongs at a higher
level.

validate_headref() has only one caller, is_git_directory(). The purpose
of that function is to check whether a specified directory is a
plausible Git directory, which it does by looking for some required files:

* "objects/" (unless GIT_OBJECT_DIRECTORY is set)
* "refs/"
* "HEAD" (which must also have appropriate contents)

NB: it doesn't care whether there is a "config" file. Presumably that
was invented later.

It seems to me that whether a repository should have a "refs/" and/or
"HEAD", and if so what contents "HEAD" should have, or whether (for
example) it should have a "refdb" file, depends on what reference
backend is in use. But *that* depends on the contents of "config" or its
absence.

You are reading "config" down here in validate_headref(). It feels to me
more like is_git_directory() should read "config", determine which
reference backend seems to be in use, and then ask *that* reference
backend whether the directory is a plausible Git directory. The files
backend would check for "HEAD" and "refs/" and the LMDB backend would
check for "HEAD" and "refdb/".

Note that this approach would require "config" to be read and parsed an
extra time. That might be considered too expensive to do so early in the
startup code. I don't know.

By the way, this code ends up reading "config" files a bit more
promiscuously than the old code. I don't think this raises any security
considerations, but I wanted to point it out anyway. If somebody can put
a hostile "config" file in a directory where you are running git, then
they can probably put an "objects/", "refs/", and "HEAD" file there,
which is all that is needed to get even the old code to read the
"config" file.

>  	/* Make sure it is a "refs/.." symlink */
>  	if (S_ISLNK(st.st_mode)) {
> diff --git a/refs.c b/refs.c
> index e48e43a..96e1673 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -24,6 +24,14 @@ struct ref_be *the_refs_backend = &refs_be_files;
>   */
>  struct ref_be *refs_backends = &refs_be_files;
>  
> +const char *refs_backend_type;
> +
> +void register_refs_backend(struct ref_be *be)
> +{
> +	be->next = refs_backends;
> +	refs_backends = be;
> +}
> +
>  /*
>   * This function is used to switch to an alternate backend.
>   */
> diff --git a/refs.h b/refs.h
> index c211b9e..c3670e8 100644
> --- a/refs.h
> +++ b/refs.h
> @@ -510,6 +510,18 @@ extern int reflog_expire(const char *refname, const unsigned char *sha1,
>  			 reflog_expiry_cleanup_fn cleanup_fn,
>  			 void *policy_cb_data);
>  
> +struct refdb_config_data {
> +	const char *refs_backend_type;
> +	const char *refs_base;
> +};
> +/*
> + * Read the refdb configuration data out of the config file
> + */
> +int refdb_config(const char *var, const char *value, void *ptr);
> +
> +struct ref_be;
>  int set_refs_backend(const char *name, void *data);
>  
> +void register_refs_backend(struct ref_be *be);
> +
>  #endif /* REFS_H */
> diff --git a/setup.c b/setup.c
> index d343725..de6b8ac 100644
> --- a/setup.c
> +++ b/setup.c
> @@ -1,5 +1,6 @@
>  #include "cache.h"
>  #include "dir.h"
> +#include "refs.h"
>  #include "string-list.h"
>  
>  static int inside_git_dir = -1;
> @@ -263,6 +264,15 @@ int get_common_dir_noenv(struct strbuf *sb, const char *gitdir)
>  	return ret;
>  }
>  
> +int refdb_config(const char *var, const char *value, void *ptr)
> +{
> +       struct refdb_config_data *cdata = ptr;
> +
> +       if (!strcmp(var, "core.refsbackendtype"))
> +	       cdata->refs_backend_type = xstrdup((char *)value);

Unnecessary cast.

> +       return 0;
> +}
> +
>  /*
>   * Test if it looks like we're at a git directory.
>   * We want to see:

I can guarantee that before long somebody will want a config option that
can be put in a global .gitconfig to choose the default refs backend
type for new repositories when it is not specified explicitly via the
command line. But that can wait :-)

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu

  parent reply	other threads:[~2015-12-23 13:41 UTC|newest]

Thread overview: 73+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-12-03  0:35 [PATCH 00/16] LMDB refs backend atop pre-vtable David Turner
2015-12-03  0:35 ` [PATCH 01/16] refs: add a backend method structure with transaction functions David Turner
2015-12-05  0:07   ` Junio C Hamano
2015-12-03  0:35 ` [PATCH 02/16] refs: add methods for misc ref operations David Turner
2015-12-11 23:33   ` Junio C Hamano
2015-12-11 23:49     ` David Turner
2015-12-11 23:39   ` Junio C Hamano
2015-12-11 23:49     ` David Turner
2015-12-12  0:23       ` Junio C Hamano
2015-12-12  0:48         ` David Turner
2015-12-18  4:06     ` Howard Chu
2015-12-03  0:35 ` [PATCH 03/16] refs: add methods for the ref iterators David Turner
2016-01-03  0:06   ` David Aguilar
2016-01-04 19:01     ` Junio C Hamano
2016-01-05 13:43       ` Michael Haggerty
2016-01-05 18:56         ` Junio C Hamano
2016-01-04 19:12     ` Ronnie Sahlberg
2016-01-04 20:26       ` Junio C Hamano
2016-01-05  1:17         ` Jeff King
2016-01-05  3:29           ` Junio C Hamano
2015-12-03  0:35 ` [PATCH 04/16] refs: add do_for_each_per_worktree_ref David Turner
2015-12-11 23:52   ` Junio C Hamano
2015-12-12  0:01     ` David Turner
2015-12-03  0:35 ` [PATCH 05/16] refs: add methods for reflog David Turner
2015-12-03  0:35 ` [PATCH 06/16] refs: add method for initial ref transaction commit David Turner
2015-12-03  0:35 ` [PATCH 07/16] refs: add method for delete_refs David Turner
2015-12-03  0:35 ` [PATCH 08/16] refs: add methods to init refs backend and db David Turner
2015-12-23  5:33   ` Michael Haggerty
2015-12-23  6:54     ` David Turner
2015-12-03  0:35 ` [PATCH 09/16] refs: add method to rename refs David Turner
2015-12-03  0:35 ` [PATCH 10/16] refs: make lock generic David Turner
2015-12-03  0:35 ` [PATCH 11/16] refs: move duplicate check to common code David Turner
2015-12-23  6:27   ` Michael Haggerty
2016-01-05 16:42     ` David Turner
2015-12-03  0:35 ` [PATCH 12/16] refs: always handle non-normal refs in files backend David Turner
2015-12-23  8:02   ` Michael Haggerty
2016-01-06  0:13     ` David Turner
2016-01-06 23:41     ` [PATCH/RFC v2 1/3] refs: allow log-only updates David Turner
2016-01-06 23:41       ` [PATCH/RFC v2 2/3] refs: resolve symbolic refs first David Turner
2016-01-06 23:41       ` [PATCH/RFC v2 3/3] refs: always handle non-normal refs in files backend David Turner
2016-01-08 12:52         ` David Turner
2016-01-06 23:42     ` [PATCH 12/16] " David Turner
2015-12-03  0:35 ` [PATCH 13/16] init: allow alternate backends to be set for new repos David Turner
2015-12-05  0:07   ` Junio C Hamano
2015-12-05  6:30   ` Duy Nguyen
2015-12-05  7:44     ` Jeff King
2015-12-08  0:38       ` David Turner
2015-12-23  9:52       ` Michael Haggerty
2015-12-23 20:01         ` Jeff King
2015-12-10 18:02   ` Jeff King
2015-12-10 19:36     ` David Turner
2015-12-23 11:30   ` [PATCH] clone: use child_process for recursive checkouts Michael Haggerty
2016-01-06 23:41     ` David Turner
2015-12-23 13:34   ` Michael Haggerty [this message]
2016-01-05 17:26     ` [PATCH 13/16] init: allow alternate backends to be set for new repos David Turner
2016-01-05 18:03       ` Junio C Hamano
2016-01-05 18:24         ` David Turner
2016-01-06 12:02         ` Michael Haggerty
2016-01-06 12:52     ` Duy Nguyen
2016-01-07  3:31       ` Shawn Pearce
2015-12-03  0:35 ` [PATCH 14/16] refs: allow ref backend to be set for clone David Turner
2015-12-23 13:51   ` Michael Haggerty
2015-12-23 20:23     ` Eric Sunshine
2015-12-03  0:35 ` [PATCH 15/16] refs: add LMDB refs backend David Turner
2015-12-05  0:08   ` Junio C Hamano
2015-12-05  0:25     ` David Turner
2015-12-17  1:00   ` Jonathan Nieder
2015-12-17  2:31     ` David Turner
2015-12-17 20:49       ` Jonathan Nieder
2015-12-23 14:32   ` Michael Haggerty
2016-01-08 16:05     ` David Turner
2015-12-03  0:35 ` [PATCH 16/16] refs: tests for lmdb backend David Turner
2015-12-22 23:56 ` [PATCH 00/16] LMDB refs backend atop pre-vtable David Turner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=567AA2DF.1020408@alum.mit.edu \
    --to=mhagger@alum.mit.edu \
    --cc=dturner@twopensource.com \
    --cc=git@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).