From: Taylor Blau <me@ttaylorr.com>
To: git@vger.kernel.org
Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com,
martin.agren@gmail.com
Subject: [PATCH v2 0/5] commit-graph: use 'struct object_directory *' everywhere
Date: Mon, 3 Feb 2020 13:17:53 -0800 [thread overview]
Message-ID: <cover.1580764494.git.me@ttaylorr.com> (raw)
In-Reply-To: <cover.1580424766.git.me@ttaylorr.com>
Hi,
Here is a re-roll of the series in this thread to replace string-based
path comparison in 'commit-graph.c' code with raw pointer comparison of
'struct object_directory *'s.
The only thing that has changed substantially since v1 is that I swapped
the order of patches 2/6 and 4/6. What was patch 3/6 got folded into
what is now patch 2/5. This resolves an inconvenience where we had to
define a helper function in 'commit-graph.c', when it really belonged in
'builtin/commit-graph.c'.
I took a few other suggestions from Martin in what is now patch 4/5, and
noticed a few other small things along the way. A range-diff since v1 is
included below.
Thanks as always,
Taylor
Taylor Blau (5):
t5318: don't pass non-object directory to '--object-dir'
commit-graph.h: store an odb in 'struct write_commit_graph_context'
commit-graph.h: store object directory in 'struct commit_graph'
commit-graph.c: remove path normalization, comparison
commit-graph.h: use odb in 'load_commit_graph_one_fd_st'
Documentation/git-commit-graph.txt | 5 +-
builtin/commit-graph.c | 34 +++++++--
builtin/commit.c | 2 +-
builtin/fetch.c | 2 +-
builtin/gc.c | 2 +-
commit-graph.c | 115 +++++++++++++----------------
commit-graph.h | 15 ++--
t/helper/test-read-graph.c | 8 +-
t/t5318-commit-graph.sh | 4 +-
9 files changed, 100 insertions(+), 87 deletions(-)
Range-diff against v1:
1: 09ba72c85a = 1: 84a8709ad1 t5318: don't pass non-object directory to '--object-dir'
4: 9784a5db3c ! 2: d9819cfb33 commit-graph.h: store an odb in 'struct write_commit_graph_context'
@@ Metadata
## Commit message ##
commit-graph.h: store an odb in 'struct write_commit_graph_context'
- In a previous patch, the 'char *object_dir' in 'struct commit_graph' was
- replaced with a 'struct object_directory'. This patch applies the same
- treatement to 'struct write_commit_graph_context', which is an
- intermediate step towards getting rid of all path normalization in
- 'commit-graph.c'.
-
- Instead of taking a 'char *object_dir', functions that construct a
- 'struct write_commit_graph_context' now take a 'struct object_directory
- *'. Any code that needs an object directory path use '->path' instead.
-
- This ensures that all calls to functions that perform path normalization
- are given arguments which do not themselves require normalization. This
- prepares those functions to drop their normalization entirely, which
- will occur in the subsequent patch.
+ There are lots of places in 'commit-graph.h' where a function either has
+ (or almost has) a full 'struct object_directory *', accesses '->path',
+ and then throws away the rest of the struct.
+
+ This can cause headaches when comparing the locations of object
+ directories across alternates (e.g., in the case of deciding if two
+ commit-graph layers can be merged). These paths are normalized with
+ 'normalize_path_copy()' which mitigates some comparison issues, but not
+ all [1].
+
+ Replace usage of 'char *object_dir' with 'odb->path' by storing a
+ 'struct object_directory *' in the 'write_commit_graph_context'
+ structure. This is an intermediate step towards getting rid of all path
+ normalization in 'commit-graph.c'.
+
+ Resolving a user-provided '--object-dir' argument now requires that we
+ compare it to the known alternates for equality. Prior to this patch,
+ an unknown '--object-dir' argument would silently exit with status zero.
+
+ This can clearly lead to unintended behavior, such as verifying
+ commit-graphs that aren't in a repository's own object store (or one of
+ its alternates), or causing a typo to mask a legitimate commit-graph
+ verification failure. Make this error non-silent by 'die()'-ing when the
+ given '--object-dir' does not match any known alternate object store.
+
+ [1]: In my testing, for example, I can get one side of the commit-graph
+ code to fill object_dir with "./objects" and the other with just
+ "objects".
Signed-off-by: Taylor Blau <me@ttaylorr.com>
+ ## Documentation/git-commit-graph.txt ##
+@@ Documentation/git-commit-graph.txt: OPTIONS
+ file. This parameter exists to specify the location of an alternate
+ that only has the objects directory, not a full `.git` directory. The
+ commit-graph file is expected to be in the `<dir>/info` directory and
+- the packfiles are expected to be in `<dir>/pack`.
++ the packfiles are expected to be in `<dir>/pack`. If the directory
++ could not be made into an absolute path, or does not match any known
++ object directory, `git commit-graph ...` will exit with non-zero
++ status.
+
+ --[no-]progress::
+ Turn progress on/off explicitly. If neither is specified, progress is
+
## builtin/commit-graph.c ##
+@@ builtin/commit-graph.c: static struct opts_commit_graph {
+ int progress;
+ } opts;
+
++struct object_directory *find_odb(struct repository *r, const char *obj_dir)
++{
++ struct object_directory *odb;
++ char *obj_dir_real = real_pathdup(obj_dir, 1);
++
++ prepare_alt_odb(r);
++ for (odb = r->objects->odb; odb; odb = odb->next) {
++ if (!strcmp(obj_dir_real, real_path(odb->path)))
++ break;
++ }
++
++ free(obj_dir_real);
++
++ if (!odb)
++ die(_("could not find object directory matching %s"), obj_dir);
++ return odb;
++}
++
+ static int graph_verify(int argc, const char **argv)
+ {
+ struct commit_graph *graph = NULL;
++ struct object_directory *odb = NULL;
+ char *graph_name;
+ int open_ok;
+ int fd;
+@@ builtin/commit-graph.c: static int graph_verify(int argc, const char **argv)
+ if (opts.progress)
+ flags |= COMMIT_GRAPH_WRITE_PROGRESS;
+
+- graph_name = get_commit_graph_filename(opts.obj_dir);
++ odb = find_odb(the_repository, opts.obj_dir);
++ graph_name = get_commit_graph_filename(odb->path);
+ open_ok = open_commit_graph(graph_name, &fd, &st);
+ if (!open_ok && errno != ENOENT)
+ die_errno(_("Could not open commit-graph '%s'"), graph_name);
+@@ builtin/commit-graph.c: static int graph_verify(int argc, const char **argv)
+
+ if (open_ok)
+ graph = load_commit_graph_one_fd_st(fd, &st);
+- else
+- graph = read_commit_graph_one(the_repository, opts.obj_dir);
++ else
++ graph = read_commit_graph_one(the_repository, odb->path);
+
+ /* Return failure if open_ok predicted success */
+ if (!graph)
@@ builtin/commit-graph.c: static int graph_write(int argc, const char **argv)
- odb = find_odb_or_die(the_repository, opts.obj_dir);
+ {
+ struct string_list *pack_indexes = NULL;
+ struct string_list *commit_hex = NULL;
++ struct object_directory *odb = NULL;
+ struct string_list lines;
+ int result = 0;
+ enum commit_graph_write_flags flags = 0;
+@@ builtin/commit-graph.c: static int graph_write(int argc, const char **argv)
+ flags |= COMMIT_GRAPH_WRITE_PROGRESS;
+
+ read_replace_refs = 0;
++ odb = find_odb(the_repository, opts.obj_dir);
if (opts.reachable) {
-- if (write_commit_graph_reachable(odb->path, flags, &split_opts))
+- if (write_commit_graph_reachable(opts.obj_dir, flags, &split_opts))
+ if (write_commit_graph_reachable(odb, flags, &split_opts))
return 1;
return 0;
@@ builtin/commit-graph.c: static int graph_write(int argc, const char **argv)
UNLEAK(buf);
}
-- if (write_commit_graph(odb->path,
+- if (write_commit_graph(opts.obj_dir,
+ if (write_commit_graph(odb,
pack_indexes,
commit_hex,
@@ commit-graph.c: static void split_graph_merge_strategy(struct write_commit_graph
while (g && (g->num_commits <= size_mult * num_commits ||
(max_commits && num_commits > max_commits))) {
-- if (strcmp(g->odb->path, ctx->obj_dir))
-+ if (strcmp(g->odb->path, ctx->odb->path))
+- if (strcmp(g->obj_dir, ctx->obj_dir))
++ if (strcmp(g->obj_dir, ctx->odb->path))
break;
num_commits += g->num_commits;
@@ commit-graph.c: static void split_graph_merge_strategy(struct write_commit_graph_context *ctx)
- char *old_graph_name = get_commit_graph_filename(g->odb->path);
+ char *old_graph_name = get_commit_graph_filename(g->obj_dir);
if (!strcmp(g->filename, old_graph_name) &&
-- strcmp(g->odb->path, ctx->obj_dir)) {
-+ strcmp(g->odb->path, ctx->odb->path)) {
+- strcmp(g->obj_dir, ctx->obj_dir)) {
++ strcmp(g->obj_dir, ctx->odb->path)) {
ctx->num_commit_graphs_after = 1;
ctx->new_base_graph = NULL;
}
@@ commit-graph.c: int write_commit_graph(const char *obj_dir,
ctx->oids.alloc = split_opts->max_commits;
if (ctx->append) {
-- struct object_directory *odb = find_odb(ctx->r, ctx->obj_dir);
-- prepare_commit_graph_one(ctx->r, odb);
-+ prepare_commit_graph_one(ctx->r, ctx->odb);
+- prepare_commit_graph_one(ctx->r, ctx->obj_dir);
++ prepare_commit_graph_one(ctx->r, ctx->odb->path);
if (ctx->r->objects->commit_graph)
ctx->oids.alloc += ctx->r->objects->commit_graph->num_commits;
}
@@ commit-graph.c: int write_commit_graph(const char *obj_dir,
for (i = 0; i < ctx->num_commit_graphs_after; i++) {
## commit-graph.h ##
+@@
+ #include "repository.h"
+ #include "string-list.h"
+ #include "cache.h"
++#include "object-store.h"
+
+ #define GIT_TEST_COMMIT_GRAPH "GIT_TEST_COMMIT_GRAPH"
+ #define GIT_TEST_COMMIT_GRAPH_DIE_ON_LOAD "GIT_TEST_COMMIT_GRAPH_DIE_ON_LOAD"
@@ commit-graph.h: struct split_commit_graph_opts {
* is not compatible with the commit-graph feature, then the
* methods will return 0 without writing a commit-graph.
2: 328884f5f6 ! 3: 5fd5cfca6e commit-graph.h: store object directory in 'struct commit_graph'
@@ Metadata
## Commit message ##
commit-graph.h: store object directory in 'struct commit_graph'
- There are lots of places in 'commit-graph.h' where a function either has
- (or almost has) a full 'struct object_directory *', accesses '->path',
- and then throws away the rest of the struct.
+ In a previous patch, the 'char *object_dir' in 'struct commit_graph' was
+ replaced with a 'struct object_directory'. This patch applies the same
+ treatment to 'struct commit_graph', which is another intermediate step
+ towards getting rid of all path normalization in 'commit-graph.c'.
- This can cause headaches when comparing the locations of object
- directories across alternates (e.g., in the case of deciding if two
- commit-graph layers can be merged). These paths are normalized with
- 'normalize_path_copy()' which mitigates some comparison issues, but not
- all [1].
+ Instead of taking a 'char *object_dir', functions that construct a
+ 'struct commit_graph' now take a 'struct object_directory *'. Any code
+ that needs an object directory path use '->path' instead.
- Instead of getting rid of the 'struct object_directory *', store that
- insead of a 'char *odb' in 'struct commit_graph'. Once the 'struct
- write_commit_graph_context' has an object_directory pointer, too, this
- will allow calling code to replace these error-prone path comparisons
- with raw pointer comparisons, thereby circumventing any
- normalization-related errors. This will be introduced in a subsequent
- patch.
-
- [1]: In my testing, for example, I can get one side of the commit-graph
- code to fill object_dir with "./objects" and the other with just
- "objects".
+ This ensures that all calls to functions that perform path normalization
+ are given arguments which do not themselves require normalization. This
+ prepares those functions to drop their normalization entirely, which
+ will occur in the subsequent patch.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
## builtin/commit-graph.c ##
@@ builtin/commit-graph.c: static int graph_verify(int argc, const char **argv)
-
if (open_ok)
graph = load_commit_graph_one_fd_st(fd, &st);
-- else
-- graph = read_commit_graph_one(the_repository, opts.obj_dir);
-+ else {
-+ struct object_directory *odb;
-+ if ((odb = find_odb(the_repository, opts.obj_dir)))
-+ graph = read_commit_graph_one(the_repository, odb);
-+ }
+ else
+- graph = read_commit_graph_one(the_repository, odb->path);
++ graph = read_commit_graph_one(the_repository, odb);
/* Return failure if open_ok predicted success */
if (!graph)
-@@ builtin/commit-graph.c: static int graph_write(int argc, const char **argv)
- struct string_list lines;
- int result = 0;
- enum commit_graph_write_flags flags = 0;
-+ struct object_directory *odb = NULL;
-
- static struct option builtin_commit_graph_write_options[] = {
- OPT_STRING(0, "object-dir", &opts.obj_dir,
-@@ builtin/commit-graph.c: static int graph_write(int argc, const char **argv)
- flags |= COMMIT_GRAPH_WRITE_PROGRESS;
-
- read_replace_refs = 0;
-+ odb = find_odb(the_repository, opts.obj_dir);
-
- if (opts.reachable) {
-- if (write_commit_graph_reachable(opts.obj_dir, flags, &split_opts))
-+ if (write_commit_graph_reachable(odb->path, flags, &split_opts))
- return 1;
- return 0;
- }
-@@ builtin/commit-graph.c: static int graph_write(int argc, const char **argv)
- UNLEAK(buf);
- }
-
-- if (write_commit_graph(opts.obj_dir,
-+ if (write_commit_graph(odb->path,
- pack_indexes,
- commit_hex,
- flags,
-
- ## builtin/commit.c ##
-@@
- #include "help.h"
- #include "commit-reach.h"
- #include "commit-graph.h"
-+#include "object-store.h"
-
- static const char * const builtin_commit_usage[] = {
- N_("git commit [<options>] [--] <pathspec>..."),
## commit-graph.c ##
-@@ commit-graph.c: static uint8_t oid_version(void)
- return 1;
- }
-
-+struct object_directory *find_odb(struct repository *r, const char *obj_dir)
-+{
-+ struct object_directory *odb;
-+ char *obj_dir_real = real_pathdup(obj_dir, 1);
-+ int cmp = -1;
-+
-+ prepare_alt_odb(r);
-+ for (odb = r->objects->odb; odb; odb = odb->next) {
-+ cmp = strcmp(obj_dir_real, real_path(odb->path));
-+ if (!cmp)
-+ break;
-+ }
-+
-+ free(obj_dir_real);
-+
-+ if (cmp)
-+ odb = NULL;
-+ return odb;
-+}
-+
- static struct commit_graph *alloc_commit_graph(void)
- {
- struct commit_graph *g = xcalloc(1, sizeof(*g));
@@ commit-graph.c: static struct commit_graph *load_commit_graph_one(const char *graph_file)
return g;
}
@@ commit-graph.c: static void split_graph_merge_strategy(struct write_commit_graph
while (g && (g->num_commits <= size_mult * num_commits ||
(max_commits && num_commits > max_commits))) {
-- if (strcmp(g->obj_dir, ctx->obj_dir))
-+ if (strcmp(g->odb->path, ctx->obj_dir))
+- if (strcmp(g->obj_dir, ctx->odb->path))
++ if (strcmp(g->odb->path, ctx->odb->path))
break;
num_commits += g->num_commits;
@@ commit-graph.c: static void split_graph_merge_strategy(struct write_commit_graph
+ char *old_graph_name = get_commit_graph_filename(g->odb->path);
if (!strcmp(g->filename, old_graph_name) &&
-- strcmp(g->obj_dir, ctx->obj_dir)) {
-+ strcmp(g->odb->path, ctx->obj_dir)) {
+- strcmp(g->obj_dir, ctx->odb->path)) {
++ strcmp(g->odb->path, ctx->odb->path)) {
ctx->num_commit_graphs_after = 1;
ctx->new_base_graph = NULL;
}
-@@ commit-graph.c: int write_commit_graph(const char *obj_dir,
+@@ commit-graph.c: int write_commit_graph(struct object_directory *odb,
ctx->oids.alloc = split_opts->max_commits;
if (ctx->append) {
-- prepare_commit_graph_one(ctx->r, ctx->obj_dir);
-+ struct object_directory *odb = find_odb(ctx->r, ctx->obj_dir);
-+ prepare_commit_graph_one(ctx->r, odb);
+- prepare_commit_graph_one(ctx->r, ctx->odb->path);
++ prepare_commit_graph_one(ctx->r, ctx->odb);
if (ctx->r->objects->commit_graph)
ctx->oids.alloc += ctx->r->objects->commit_graph->num_commits;
}
## commit-graph.h ##
-@@
- #include "repository.h"
- #include "string-list.h"
- #include "cache.h"
-+#include "object-store.h"
-
- #define GIT_TEST_COMMIT_GRAPH "GIT_TEST_COMMIT_GRAPH"
- #define GIT_TEST_COMMIT_GRAPH_DIE_ON_LOAD "GIT_TEST_COMMIT_GRAPH_DIE_ON_LOAD"
-@@ commit-graph.h: struct commit;
- char *get_commit_graph_filename(const char *obj_dir);
- int open_commit_graph(const char *graph_file, int *fd, struct stat *st);
-
-+struct object_directory *find_odb(struct repository *r, const char *obj_dir);
-+
- /*
- * Given a commit struct, try to fill the commit struct info, including:
- * 1. tree object
@@ commit-graph.h: struct commit_graph {
uint32_t num_commits;
struct object_id oid;
3: ce884d7742 < -: ---------- builtin/commit-graph.c: die() with unknown '--object-dir'
5: c9b2ba46ab ! 4: f14e95aa7e commit-graph.c: remove path normalization, comparison
@@ Commit message
Signed-off-by: Taylor Blau <me@ttaylorr.com>
## builtin/commit-graph.c ##
-@@ builtin/commit-graph.c: static struct object_directory *find_odb_or_die(struct repository *r,
- static int graph_verify(int argc, const char **argv)
- {
- struct commit_graph *graph = NULL;
-+ struct object_directory *odb = NULL;
- char *graph_name;
- int open_ok;
- int fd;
@@ builtin/commit-graph.c: static int graph_verify(int argc, const char **argv)
- if (opts.progress)
flags |= COMMIT_GRAPH_WRITE_PROGRESS;
-- graph_name = get_commit_graph_filename(opts.obj_dir);
-+ odb = find_odb_or_die(the_repository, opts.obj_dir);
+ odb = find_odb(the_repository, opts.obj_dir);
+- graph_name = get_commit_graph_filename(odb->path);
+ graph_name = get_commit_graph_filename(odb);
open_ok = open_commit_graph(graph_name, &fd, &st);
if (!open_ok && errno != ENOENT)
die_errno(_("Could not open commit-graph '%s'"), graph_name);
-@@ builtin/commit-graph.c: static int graph_verify(int argc, const char **argv)
-
- if (open_ok)
- graph = load_commit_graph_one_fd_st(fd, &st);
-- else {
-- struct object_directory *odb;
-- if ((odb = find_odb_or_die(the_repository, opts.obj_dir)))
-- graph = read_commit_graph_one(the_repository, odb);
-- }
-+ else
-+ graph = read_commit_graph_one(the_repository, odb);
-
- /* Return failure if open_ok predicted success */
- if (!graph)
## commit-graph.c ##
@@
@@ commit-graph.h
+char *get_commit_graph_filename(struct object_directory *odb);
int open_commit_graph(const char *graph_file, int *fd, struct stat *st);
- struct object_directory *find_odb(struct repository *r, const char *obj_dir);
+ /*
## t/helper/test-read-graph.c ##
@@ t/helper/test-read-graph.c: int cmd__read_graph(int argc, const char **argv)
6: e70483f771 = 5: aa12b7378b commit-graph.h: use odb in 'load_commit_graph_one_fd_st'
--
2.25.0.119.gaa12b7378b
next prev parent reply other threads:[~2020-02-03 21:17 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-01-30 23:00 [PATCH 0/6] commit-graph: use 'struct object_directory *' everywhere Taylor Blau
2020-01-30 23:00 ` [PATCH 1/6] t5318: don't pass non-object directory to '--object-dir' Taylor Blau
2020-01-30 23:00 ` [PATCH 5/6] commit-graph.c: remove path normalization, comparison Taylor Blau
2020-01-30 23:00 ` [PATCH 6/6] commit-graph.h: use odb in 'load_commit_graph_one_fd_st' Taylor Blau
2020-01-30 23:00 ` [PATCH 2/6] commit-graph.h: store object directory in 'struct commit_graph' Taylor Blau
2020-01-31 6:52 ` Martin Ågren
2020-01-31 10:20 ` Jeff King
2020-01-31 19:19 ` Martin Ågren
2020-02-03 4:36 ` Taylor Blau
2020-02-03 8:36 ` Jeff King
2020-01-31 20:49 ` Junio C Hamano
2020-02-03 3:58 ` Taylor Blau
2020-01-30 23:00 ` [PATCH 4/6] commit-graph.h: store an odb in 'struct write_commit_graph_context' Taylor Blau
2020-01-30 23:00 ` [PATCH 3/6] builtin/commit-graph.c: die() with unknown '--object-dir' Taylor Blau
2020-01-31 10:30 ` [PATCH 0/6] commit-graph: use 'struct object_directory *' everywhere Jeff King
2020-01-31 13:22 ` Derrick Stolee
2020-02-03 4:38 ` Taylor Blau
2020-02-03 21:17 ` Taylor Blau [this message]
2020-02-03 21:17 ` [PATCH v2 1/5] t5318: don't pass non-object directory to '--object-dir' Taylor Blau
2020-02-03 21:17 ` [PATCH v2 2/5] commit-graph.h: store an odb in 'struct write_commit_graph_context' Taylor Blau
2020-02-04 5:51 ` Taylor Blau
2020-02-04 19:54 ` Junio C Hamano
2020-02-04 21:28 ` Taylor Blau
2020-02-04 21:44 ` Junio C Hamano
2020-02-03 21:18 ` [PATCH v2 3/5] commit-graph.h: store object directory in 'struct commit_graph' Taylor Blau
2020-02-03 21:18 ` [PATCH v2 4/5] commit-graph.c: remove path normalization, comparison Taylor Blau
2020-02-03 21:18 ` [PATCH v2 5/5] commit-graph.h: use odb in 'load_commit_graph_one_fd_st' Taylor Blau
2020-02-05 12:30 ` [PATCH v2 0/5] commit-graph: use 'struct object_directory *' everywhere Jeff King
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=cover.1580764494.git.me@ttaylorr.com \
--to=me@ttaylorr.com \
--cc=dstolee@microsoft.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=martin.agren@gmail.com \
--cc=peff@peff.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).