git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] worktree: teach `repair` to fix multi-directional breakage
@ 2020-12-08 17:37 Eric Sunshine
  2020-12-08 21:47 ` Junio C Hamano
  2020-12-21  8:16 ` [PATCH v2 0/1] teach `worktree repair` to fix two-way linkage Eric Sunshine
  0 siblings, 2 replies; 7+ messages in thread
From: Eric Sunshine @ 2020-12-08 17:37 UTC (permalink / raw)
  To: git; +Cc: Philippe Blain, Duy Nguyen, Eric Sunshine

`git worktree repair` knows how to repair the two-way links between the
repository and a worktree as long as a link in one or the other
direction is sound. For instance, if a linked worktree is moved (without
using `git worktree move`), repair is possible because the worktree
still knows the location of the repository even though the repository no
longer knows where the worktree is. Similarly, if the repository is
moved, repair is possible since the repository still knows the locations
of the worktrees even though the worktrees no longer know where the
repository is.

However, if both the repository and the worktrees are moved, then links
are severed in both directions, and no repair is possible. This is the
case even when the new worktree locations are specified as arguments to
`git worktree repair`. The reason for this limitation is twofold. First,
when `repair` consults the worktree's gitfile (/path/to/worktree/.git)
to determine the corresponding <repo>/worktrees/<id>/gitdir file to fix,
<repo> is the old path to the repository, thus it is unable to fix the
`gitdir` file at its new location since it doesn't know where it is.
Second, when `repair` consults <repo>/worktrees/<id>/gitdir to find the
location of the worktree's gitfile (/path/to/worktree/.git), the path
recorded in `gitdir` is the old location of the worktree's gitfile, thus
it is unable to repair the gitfile since it doesn't know where it is.

Fix these shortcomings by teaching `repair` to attempt to infer the new
location of the <repo>/worktrees/<id>/gitdir file when the location
recorded in the worktree's gitfile has become stale but the file is
otherwise well-formed.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
---

This patch addresses a limitation of `git worktree repair` in which it
is unable to repair the two-way links between the repository and
secondary worktrees if both the repository and the worktrees have been
moved manually. This limitation was known at the time `repair` was
introduced but I left it and other bells-and-whistles unimplemented
because bare bone `repair` was already fairly complex and I worried
about reviewer fatigue. However, a recent report[1] by Philippe, in
which he encountered exactly this situation, prompted me to think about
how `repair` should deal with the case, and this patch is the result.

[1]: https://lore.kernel.org/git/63AC7AC2-5D32-479B-BF9E-0E5C31351A1B@gmail.com/

 Documentation/git-worktree.txt |  5 +++++
 builtin/worktree.c             |  2 +-
 t/t2406-worktree-repair.sh     | 25 +++++++++++++++++++++
 worktree.c                     | 41 ++++++++++++++++++++++++++++++++++
 4 files changed, 72 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-worktree.txt b/Documentation/git-worktree.txt
index af06128cc9..02a706c4c0 100644
--- a/Documentation/git-worktree.txt
+++ b/Documentation/git-worktree.txt
@@ -143,6 +143,11 @@ locate it. Running `repair` within the recently-moved working tree will
 reestablish the connection. If multiple linked working trees are moved,
 running `repair` from any working tree with each tree's new `<path>` as
 an argument, will reestablish the connection to all the specified paths.
++
+If both the main working tree and linked working trees have been moved
+manually, then running `repair` in the main working tree and specifying the
+new `<path>` of each linked working tree will reestablish all connections
+in both directions.
 
 unlock::
 
diff --git a/builtin/worktree.c b/builtin/worktree.c
index 197fd24a55..71287b2da6 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -1052,10 +1052,10 @@ static int repair(int ac, const char **av, const char *prefix)
 	int rc = 0;
 
 	ac = parse_options(ac, av, prefix, options, worktree_usage, 0);
-	repair_worktrees(report_repair, &rc);
 	p = ac > 0 ? av : self;
 	for (; *p; p++)
 		repair_worktree_at_path(*p, report_repair, &rc);
+	repair_worktrees(report_repair, &rc);
 	return rc;
 }
 
diff --git a/t/t2406-worktree-repair.sh b/t/t2406-worktree-repair.sh
index 1fe468bfe8..687342bfa7 100755
--- a/t/t2406-worktree-repair.sh
+++ b/t/t2406-worktree-repair.sh
@@ -104,6 +104,16 @@ test_expect_success 'repo not found; .git not file' '
 	test_i18ngrep ".git is not a file" err
 '
 
+test_expect_success 'repo not found; .git not referencing repo' '
+	test_when_finished "rm -rf side not-a-repo && git worktree prune" &&
+	git worktree add --detach side &&
+	sed s,\.git/worktrees/side$,not-a-repo, side/.git >side/.newgit &&
+	mv side/.newgit side/.git &&
+	mkdir not-a-repo &&
+	test_must_fail git worktree repair side 2>err &&
+	test_i18ngrep ".git file does not reference a repository" err
+'
+
 test_expect_success 'repo not found; .git file broken' '
 	test_when_finished "rm -rf orig moved && git worktree prune" &&
 	git worktree add --detach orig &&
@@ -176,4 +186,19 @@ test_expect_success 'repair multiple gitdir files' '
 	test_must_be_empty err
 '
 
+test_expect_success 'repair moved main and linked worktrees' '
+	test_when_finished "rm -rf orig moved" &&
+	test_create_repo orig/main &&
+	test_commit -C orig/main init &&
+	git -C orig/main worktree add --detach ../side &&
+	sed s,orig/side/\.git,moved/side/.git, \
+		orig/main/.git/worktrees/side/gitdir >expect-gitdir &&
+	sed s,orig/main/.git/worktrees/side,moved/main/.git/worktrees/side, \
+		orig/side/.git >expect-gitfile &&
+	mv orig moved &&
+	git -C moved/main worktree repair ../side &&
+	test_cmp expect-gitdir moved/main/.git/worktrees/side/gitdir &&
+	test_cmp expect-gitfile moved/side/.git
+'
+
 test_done
diff --git a/worktree.c b/worktree.c
index f84ceae87d..821b233479 100644
--- a/worktree.c
+++ b/worktree.c
@@ -644,6 +644,42 @@ static int is_main_worktree_path(const char *path)
 	return !cmp;
 }
 
+/*
+ * If both the main worktree and linked worktree have been moved, then the
+ * gitfile /path/to/worktree/.git won't point into the repository, thus we
+ * won't know which <repo>/worktrees/<id>/gitdir to repair. However, we may
+ * be able to infer the gitdir by manually reading /path/to/worktree/.git,
+ * extracting the <id>, and checking if <repo>/worktrees/<id> exists.
+ */
+static char *infer_backlink(const char *gitfile)
+{
+	struct strbuf actual = STRBUF_INIT;
+	struct strbuf inferred = STRBUF_INIT;
+	const char *id;
+
+	if (strbuf_read_file(&actual, gitfile, 0) < 0)
+		goto error;
+	if (!starts_with(actual.buf, "gitdir:"))
+		goto error;
+	if (!(id = find_last_dir_sep(actual.buf)))
+		goto error;
+	strbuf_trim(&actual);
+	id++; /* advance past '/' to point at <id> */
+	if (!*id)
+		goto error;
+	strbuf_git_common_path(&inferred, the_repository, "worktrees/%s", id);
+	if (!is_directory(inferred.buf))
+		goto error;
+
+	strbuf_release(&actual);
+	return strbuf_detach(&inferred, NULL);
+
+error:
+	strbuf_release(&actual);
+	strbuf_release(&inferred);
+	return NULL;
+}
+
 /*
  * Repair <repo>/worktrees/<id>/gitdir if missing, corrupt, or not pointing at
  * the worktree's path.
@@ -675,6 +711,11 @@ void repair_worktree_at_path(const char *path,
 	if (err == READ_GITFILE_ERR_NOT_A_FILE) {
 		fn(1, realdotgit.buf, _("unable to locate repository; .git is not a file"), cb_data);
 		goto done;
+	} else if (err == READ_GITFILE_ERR_NOT_A_REPO) {
+		if (!(backlink = infer_backlink(realdotgit.buf))) {
+			fn(1, realdotgit.buf, _("unable to locate repository; .git file does not reference a repository"), cb_data);
+			goto done;
+		}
 	} else if (err) {
 		fn(1, realdotgit.buf, _("unable to locate repository; .git file broken"), cb_data);
 		goto done;
-- 
2.29.2.684.gfbc64c5ab5


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] worktree: teach `repair` to fix multi-directional breakage
  2020-12-08 17:37 [PATCH] worktree: teach `repair` to fix multi-directional breakage Eric Sunshine
@ 2020-12-08 21:47 ` Junio C Hamano
  2020-12-17 10:22   ` Eric Sunshine
  2020-12-21  8:16 ` [PATCH v2 0/1] teach `worktree repair` to fix two-way linkage Eric Sunshine
  1 sibling, 1 reply; 7+ messages in thread
From: Junio C Hamano @ 2020-12-08 21:47 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: git, Philippe Blain, Duy Nguyen

Eric Sunshine <sunshine@sunshineco.com> writes:

> However, if both the repository and the worktrees are moved, then links
> are severed in both directions, and no repair is possible. This is the
> case even when the new worktree locations are specified as arguments to
> `git worktree repair`. The reason for this limitation is twofold. First,
> when `repair` consults the worktree's gitfile (/path/to/worktree/.git)
> to determine the corresponding <repo>/worktrees/<id>/gitdir file to fix,
> <repo> is the old path to the repository, thus it is unable to fix the
> `gitdir` file at its new location since it doesn't know where it is.
> Second, when `repair` consults <repo>/worktrees/<id>/gitdir to find the
> location of the worktree's gitfile (/path/to/worktree/.git), the path
> recorded in `gitdir` is the old location of the worktree's gitfile, thus
> it is unable to repair the gitfile since it doesn't know where it is.

Third, when a worktree of an unrelated repository is moved to
location the worktree moved like that used to occupy, or an
unrelated repository is moved to the repository moved like that used
to occupy, the gitfile of the moved worktree would point at an
unrelated repository (which may not even be a clone of the same
project).

There probably are other failure cases possible.  Perhaps having to
notice and fail when both worktree and repository moved (i.e. the
case your patch handles) indicates that we would need to be able to
handle such a situation sensibly as well?  Do we leave some clue to
help us validate that the repository the gitfile points at indeed
corresponds to the one our worktree works with and vice versa?  If
we don't currently, should we?

> Fix these shortcomings by teaching `repair` to attempt to infer the new
> location of the <repo>/worktrees/<id>/gitdir file when the location
> recorded in the worktree's gitfile has become stale but the file is
> otherwise well-formed.

Hmph, after reading the "first" and "second" above a few times, it
is unclear how this "inference" would work.  The gitfile points at
the old location of the repository, which was moved to elsewhere
without telling the working tree (i.e. "First" problem), so...?

> diff --git a/Documentation/git-worktree.txt b/Documentation/git-worktree.txt
> index af06128cc9..02a706c4c0 100644
> --- a/Documentation/git-worktree.txt
> +++ b/Documentation/git-worktree.txt
> @@ -143,6 +143,11 @@ locate it. Running `repair` within the recently-moved working tree will
>  reestablish the connection. If multiple linked working trees are moved,
>  running `repair` from any working tree with each tree's new `<path>` as
>  an argument, will reestablish the connection to all the specified paths.
> ++
> +If both the main working tree and linked working trees have been moved
> +manually, then running `repair` in the main working tree and specifying the
> +new `<path>` of each linked working tree will reestablish all connections
> +in both directions.

This sounds like miracles, but it is perfectly the right thing to do
to say what we can do without telling users how we do so.

> diff --git a/worktree.c b/worktree.c
> index f84ceae87d..821b233479 100644
> --- a/worktree.c
> +++ b/worktree.c
> @@ -644,6 +644,42 @@ static int is_main_worktree_path(const char *path)
>  	return !cmp;
>  }
>  
> +/*
> + * If both the main worktree and linked worktree have been moved, then the
> + * gitfile /path/to/worktree/.git won't point into the repository, thus we
> + * won't know which <repo>/worktrees/<id>/gitdir to repair. However, we may
> + * be able to infer the gitdir by manually reading /path/to/worktree/.git,
> + * extracting the <id>, and checking if <repo>/worktrees/<id> exists.
> + */
> +static char *infer_backlink(const char *gitfile)
> +{
> +	struct strbuf actual = STRBUF_INIT;
> +	struct strbuf inferred = STRBUF_INIT;
> +	const char *id;
> +
> +	if (strbuf_read_file(&actual, gitfile, 0) < 0)
> +		goto error;
> +	if (!starts_with(actual.buf, "gitdir:"))
> +		goto error;
> +	if (!(id = find_last_dir_sep(actual.buf)))
> +		goto error;
> +	strbuf_trim(&actual);
> +	id++; /* advance past '/' to point at <id> */
> +	if (!*id)
> +		goto error;
> +	strbuf_git_common_path(&inferred, the_repository, "worktrees/%s", id);
> +	if (!is_directory(inferred.buf))
> +		goto error;
> +
> +	strbuf_release(&actual);
> +	return strbuf_detach(&inferred, NULL);
> +
> +error:
> +	strbuf_release(&actual);
> +	strbuf_release(&inferred);
> +	return NULL;
> +}
> +
>  /*
>   * Repair <repo>/worktrees/<id>/gitdir if missing, corrupt, or not pointing at
>   * the worktree's path.
> @@ -675,6 +711,11 @@ void repair_worktree_at_path(const char *path,
>  	if (err == READ_GITFILE_ERR_NOT_A_FILE) {
>  		fn(1, realdotgit.buf, _("unable to locate repository; .git is not a file"), cb_data);
>  		goto done;
> +	} else if (err == READ_GITFILE_ERR_NOT_A_REPO) {
> +		if (!(backlink = infer_backlink(realdotgit.buf))) {
> +			fn(1, realdotgit.buf, _("unable to locate repository; .git file does not reference a repository"), cb_data);
> +			goto done;
> +		}
>  	} else if (err) {
>  		fn(1, realdotgit.buf, _("unable to locate repository; .git file broken"), cb_data);
>  		goto done;


> diff --git a/builtin/worktree.c b/builtin/worktree.c
> index 197fd24a55..71287b2da6 100644
> --- a/builtin/worktree.c
> +++ b/builtin/worktree.c
> @@ -1052,10 +1052,10 @@ static int repair(int ac, const char **av, const char *prefix)
>  	int rc = 0;
>  
>  	ac = parse_options(ac, av, prefix, options, worktree_usage, 0);
> -	repair_worktrees(report_repair, &rc);
>  	p = ac > 0 ? av : self;
>  	for (; *p; p++)
>  		repair_worktree_at_path(*p, report_repair, &rc);
> +	repair_worktrees(report_repair, &rc);
>  	return rc;
>  }

The reason why repair_worktrees() has to wait until the individual
repairing is done for all the given paths in the new code is...?

> diff --git a/t/t2406-worktree-repair.sh b/t/t2406-worktree-repair.sh
> index 1fe468bfe8..687342bfa7 100755
> --- a/t/t2406-worktree-repair.sh
> +++ b/t/t2406-worktree-repair.sh
> @@ -104,6 +104,16 @@ test_expect_success 'repo not found; .git not file' '
>  	test_i18ngrep ".git is not a file" err
>  '
>  
> +test_expect_success 'repo not found; .git not referencing repo' '
> +	test_when_finished "rm -rf side not-a-repo && git worktree prune" &&
> +	git worktree add --detach side &&
> +	sed s,\.git/worktrees/side$,not-a-repo, side/.git >side/.newgit &&
> +	mv side/.newgit side/.git &&
> +	mkdir not-a-repo &&
> +	test_must_fail git worktree repair side 2>err &&
> +	test_i18ngrep ".git file does not reference a repository" err
> +'

Failing upon empty directory and things that do not make sense is
good.  A more interesting or realistic case would have the side/.git
gitlink points at an unrelated repository because the user moved the
repository and replaced it with another repository, which has the
".git/worktree/" directory and even ".git/worktree/side" directory.

> +test_expect_success 'repair moved main and linked worktrees' '
> +	test_when_finished "rm -rf orig moved" &&
> +	test_create_repo orig/main &&
> +	test_commit -C orig/main init &&
> +	git -C orig/main worktree add --detach ../side &&
> +	sed s,orig/side/\.git,moved/side/.git, \

The pattern needs anchored with '$' to the right, in order not to
get confused by a substring like "ooorig/side/.gitto/" that appears
in the $(cwd).

> +		orig/main/.git/worktrees/side/gitdir >expect-gitdir &&
> +	sed s,orig/main/.git/worktrees/side,moved/main/.git/worktrees/side, \
> +		orig/side/.git >expect-gitfile &&

> +	mv orig moved &&

Hmmm, so this single move is what moves both the repository with the
primary worktree and the secondary worktree.  Does the "fix" assume
that the relative location between them do not change?  How realistic
is that, I wonder?

In any case, I think the problem description in the proposed log
message is quite well written but the solution seems unclear.

Thanks.

> +	git -C moved/main worktree repair ../side &&
> +	test_cmp expect-gitdir moved/main/.git/worktrees/side/gitdir &&
> +	test_cmp expect-gitfile moved/side/.git
> +'
> +
>  test_done

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] worktree: teach `repair` to fix multi-directional breakage
  2020-12-08 21:47 ` Junio C Hamano
@ 2020-12-17 10:22   ` Eric Sunshine
  2020-12-17 19:49     ` Junio C Hamano
  0 siblings, 1 reply; 7+ messages in thread
From: Eric Sunshine @ 2020-12-17 10:22 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git List, Philippe Blain, Duy Nguyen

On Tue, Dec 8, 2020 at 4:48 PM Junio C Hamano <gitster@pobox.com> wrote:
> Eric Sunshine <sunshine@sunshineco.com> writes:
> > However, if both the repository and the worktrees are moved, then links
> > are severed in both directions, and no repair is possible. [...]
>
> Third, when a worktree of an unrelated repository is moved to
> location the worktree moved like that used to occupy, or an
> unrelated repository is moved to the repository moved like that used
> to occupy, the gitfile of the moved worktree would point at an
> unrelated repository (which may not even be a clone of the same
> project).
>
> There probably are other failure cases possible.

There are indeed many ways to shoot oneself in the foot.

> Perhaps having to
> notice and fail when both worktree and repository moved (i.e. the
> case your patch handles) indicates that we would need to be able to
> handle such a situation sensibly as well?  Do we leave some clue to
> help us validate that the repository the gitfile points at indeed
> corresponds to the one our worktree works with and vice versa?  If
> we don't currently, should we?

We only have the paths making the two-way link between the repository
and the secondary worktrees. Specifically, the outgoing
<repo>/worktrees/<id>/gitdir points at /path/to/worktree/.git, and the
incoming /path/to/worktree/.git points at <repo>/worktrees/<id>.

As for the scenario you mention about moving a worktree to the
location previously occupied by a worktree belonging to a different
repository, or moving a repository to a location previously occupied
by a different repository, it is certainly possible to get into odd
situations. For instance, if  "worktree-A" from repository "A" is
moved to the location previously occupied by "worktree-B" from
repository "B", then "worktree-A" won't consider itself broken, but
both repositories "A" and "B" will consider the worktree broken
(because "A" will complain that "worktree-A" is missing, and "B" will
complain that "worktree-A" doesn't point back at "B").

I alluded to such confusion in my original response[1] to Philippe by
saying that, if not careful, a heuristic for inferring the new
location could end up hijacking a worktree from a different
repository.

So, there is no mechanism for reliably associating a worktree with a
repository, and this limitation isn't specific to secondary worktrees
created by `git worktree add`. There is no reliable association
between the repository (which may have been split off with
--separate-git-dir, for instance) and the main worktree either. This
may suggest the need to assign unique signatures to the repository and
each worktree (including the main worktree), but such a change
requires plenty of thought and is far outside the scope of this simple
enhancement to `git worktree repair`.

[1]: https://lore.kernel.org/git/CAPig+cT=-nNcfrzjSmTXymhFkB22bPFE6QRKXqPtat2ipUdboQ@mail.gmail.com/

> > Fix these shortcomings by teaching `repair` to attempt to infer the new
> > location of the <repo>/worktrees/<id>/gitdir file when the location
> > recorded in the worktree's gitfile has become stale but the file is
> > otherwise well-formed.
>
> Hmph, after reading the "first" and "second" above a few times, it
> is unclear how this "inference" would work.  The gitfile points at
> the old location of the repository, which was moved to elsewhere
> without telling the working tree (i.e. "First" problem), so...?

The inference is intentionally simple-minded. There is no path-based
inference or other heuristic at play; the entire inference is based
upon <id>. The worktree's path is specified as an argument. `git
worktree repair` manually reads the ".git" gitfile at that location
and, if it is well-formed, extracts the <id>. It then searches for a
corresponding <id> in <repo>/worktrees/ and, if found, concludes that
there is a reasonable match and "repairs" <repo>/worktrees/<id>/gitdir
to point at the specified worktree path.

I can update the commit message to describe the heuristic rather than
requiring the reader consult infer_backlink() in the patch itself, if
that would help.

The patch incorporates two safeguards to avoid hijacking a worktree
from a different repository. The first, as described above, is that it
requires an <id> match between the repository and the worktree. That
itself is not foolproof for preventing hijack, so the second safeguard
is that the inference will only kick in if the worktree's
/path/to/worktree/.git gitfile does not point at a repository. I can
mention these safeguards in the commit message, as well.

It's still possible to bypass the safeguards by moving the foreign
repository too, in which case the foreign worktree being repaired
won't point at its repository anymore, thus there is no 100% guarantee
of safety (without the introduction of unique signatures mentioned
above). However, the goal of this patch (and `git worktree repair` in
general) is to help users recover from what are likely to be common
problems, not to protect users from shooting themselves in the foot by
the many means available.

A couple observations if, despite the safeguards, the user does manage
to hijack a worktree: (1) the problem will likely be noticed quickly
when, for instance, `git status` reports the bulk of the worktree's
files as untracked and other expected files missing due to the
mismatched `index`; (2) it's easy enough to recover from the situation
(without damage) by manually patching <repo>/worktrees/<id>/gitdir and
/path/to/worktree/.git, which is what the user would have had to do
anyhow prior to the existence of `git worktree repair`.

> > -     repair_worktrees(report_repair, &rc);
> >       p = ac > 0 ? av : self;
> >       for (; *p; p++)
> >               repair_worktree_at_path(*p, report_repair, &rc);
> > +     repair_worktrees(report_repair, &rc);
>
> The reason why repair_worktrees() has to wait until the individual
> repairing is done for all the given paths in the new code is...?

Because the repair can't succeed if both halves of the two-way link
are broken, but can succeed if at least one half is sound. As
described above, the worktree path is specified as an argument to `git
worktree repair` and the new inference may allow
repair_worktree_at_path() to repair the outgoing
<repo>/worktrees/<id>/gitdir, thus restoring one half of the two-way
link, which then allows the subsequent repair_worktrees() to repair
the other half, the incoming /path/to/worktree/.git file.

I realized even when writing the commit message that the reason for
this particular change might not be obvious, but I also was concerned
that I wouldn't be able to explain it in a succinct way without
overwhelming the rest of the commit message. However, I can try to
come up with suitable phrasing and incorporate the explanation into
the commit message.

> > +test_expect_success 'repo not found; .git not referencing repo' '
> > +     test_when_finished "rm -rf side not-a-repo && git worktree prune" &&
> > +     git worktree add --detach side &&
> > +     sed s,\.git/worktrees/side$,not-a-repo, side/.git >side/.newgit &&
> > +     mv side/.newgit side/.git &&
> > +     mkdir not-a-repo &&
> > +     test_must_fail git worktree repair side 2>err &&
> > +     test_i18ngrep ".git file does not reference a repository" err
> > +'
>
> Failing upon empty directory and things that do not make sense is
> good.  A more interesting or realistic case would have the side/.git
> gitlink points at an unrelated repository because the user moved the
> repository and replaced it with another repository, which has the
> ".git/worktree/" directory and even ".git/worktree/side" directory.

Detecting that sort of situation is outside the scope of this patch
since `git worktree repair` itself can't presently detect it. Some
future enhancement may make this possible, but the current change
doesn't deal with it.

This test is aimed specifically at verifying that
repair_worktree_at_path() now specially handles
READ_GITFILE_ERR_NOT_A_REPO returned by read_gitfile_gently() rather
than lumping it together with the other generic errors returned by the
function. It recognizes this specific error because, as a safety
mechanism against hijacking, the new inference is only attempted when
READ_GITFILE_ERR_NOT_A_REPO is returned.

> > +test_expect_success 'repair moved main and linked worktrees' '
> > +     test_when_finished "rm -rf orig moved" &&
> > +     test_create_repo orig/main &&
> > +     test_commit -C orig/main init &&
> > +     git -C orig/main worktree add --detach ../side &&
> > +     sed s,orig/side/\.git,moved/side/.git, \
>
> The pattern needs anchored with '$' to the right, in order not to
> get confused by a substring like "ooorig/side/.gitto/" that appears
> in the $(cwd).

Well spotted. The test I added just above this test correctly anchors
the pattern, but I forgot it here. Will fix.

> > +             orig/main/.git/worktrees/side/gitdir >expect-gitdir &&
> > +     sed s,orig/main/.git/worktrees/side,moved/main/.git/worktrees/side, \
> > +             orig/side/.git >expect-gitfile &&
>
> > +     mv orig moved &&
>
> Hmmm, so this single move is what moves both the repository with the
> primary worktree and the secondary worktree.  Does the "fix" assume
> that the relative location between them do not change?  How realistic
> is that, I wonder?

As mentioned above, the inference is not path-based; it relies only
upon matching up the <id>. So this `mv orig moved` is just a
convenient way to relocate both the repository and worktree; it is not
intended to convey any meaning about the inference mechanism. However,
to avoid potentially misleading readers, I can revise the test to work
with distinct paths and manipulate them separately.

> In any case, I think the problem description in the proposed log
> message is quite well written but the solution seems unclear.

I'll update the commit message to talk about the <id>-based inference.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] worktree: teach `repair` to fix multi-directional breakage
  2020-12-17 10:22   ` Eric Sunshine
@ 2020-12-17 19:49     ` Junio C Hamano
  2020-12-17 20:02       ` Eric Sunshine
  0 siblings, 1 reply; 7+ messages in thread
From: Junio C Hamano @ 2020-12-17 19:49 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Git List, Philippe Blain, Duy Nguyen

Eric Sunshine <sunshine@sunshineco.com> writes:

> We only have the paths making the two-way link between the repository
> and the secondary worktrees. Specifically, the outgoing
> <repo>/worktrees/<id>/gitdir points at /path/to/worktree/.git, and the
> incoming /path/to/worktree/.git points at <repo>/worktrees/<id>.

Yes.

> The inference is intentionally simple-minded. There is no path-based
> inference or other heuristic at play; the entire inference is based
> upon <id>. The worktree's path is specified as an argument. `git
> worktree repair` manually reads the ".git" gitfile at that location
> and, if it is well-formed, extracts the <id>. It then searches for a
> corresponding <id> in <repo>/worktrees/ and,...

That is exactly the point I got confused.  The worktree's path comes
as an argument from the user, so we'd trust it.  And it has ".git"
that is a gitfile that used to point at <repo> but because we are
trying to deal with a situation where both worktree and repo moved,
it cannot be used to learn where <repo> is.  Then, even after
learning what <id> to use, how would we know where to use that <id>
to find <repo>/worktrees/<id>, when the location of <repo> is unknown?

I think the answer is "where the user starts the 'git worktree'
command has to be under control of some repository (perhaps found by
a call to setup_git_directory()), and we'd use that one as <repo>".

Since I did not see that (in hindsight an obvious) piece, I failed
to see how it could possibly work.  But now it is clear.

Thanks.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] worktree: teach `repair` to fix multi-directional breakage
  2020-12-17 19:49     ` Junio C Hamano
@ 2020-12-17 20:02       ` Eric Sunshine
  0 siblings, 0 replies; 7+ messages in thread
From: Eric Sunshine @ 2020-12-17 20:02 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git List, Philippe Blain, Duy Nguyen

On Thu, Dec 17, 2020 at 2:49 PM Junio C Hamano <gitster@pobox.com> wrote:
> Eric Sunshine <sunshine@sunshineco.com> writes:
> > The inference is intentionally simple-minded. There is no path-based
> > inference or other heuristic at play; the entire inference is based
> > upon <id>. The worktree's path is specified as an argument. `git
> > worktree repair` manually reads the ".git" gitfile at that location
> > and, if it is well-formed, extracts the <id>. It then searches for a
> > corresponding <id> in <repo>/worktrees/ and,...
>
> That is exactly the point I got confused.  The worktree's path comes
> as an argument from the user, so we'd trust it.  And it has ".git"
> that is a gitfile that used to point at <repo> but because we are
> trying to deal with a situation where both worktree and repo moved,
> it cannot be used to learn where <repo> is.  Then, even after
> learning what <id> to use, how would we know where to use that <id>
> to find <repo>/worktrees/<id>, when the location of <repo> is unknown?
>
> I think the answer is "where the user starts the 'git worktree'
> command has to be under control of some repository (perhaps found by
> a call to setup_git_directory()), and we'd use that one as <repo>".

Correct. This is why the documentation update:

    If both the main working tree and linked working trees have been
    moved manually, then running `repair` in the main working tree and
    specifying the new `<path>` of each linked working tree will
    reestablish all connections in both directions.

says explicitly that `git worktree repair` must be run in the main
worktree for this particular case. (For a bare repository, the command
would be run in the bare repository instead, but I omitted that bit to
avoid bogging down the documentation, and because the couple preceding
paragraphs already mention the bare repository case, so I figured the
reader would understand.)

I could also have mentioned in the commit message the requirement of
running `git worktree repair` in the main worktree (or bare repo), but
didn't want to repeat what the patch itself was already saying. But I
think I'll update the commit message to mention it when re-rolling
since it's important information for understanding how the repair
works.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v2 0/1] teach `worktree repair` to fix two-way linkage
  2020-12-08 17:37 [PATCH] worktree: teach `repair` to fix multi-directional breakage Eric Sunshine
  2020-12-08 21:47 ` Junio C Hamano
@ 2020-12-21  8:16 ` Eric Sunshine
  2020-12-21  8:16   ` [PATCH v2 1/1] worktree: teach `repair` to fix multi-directional breakage Eric Sunshine
  1 sibling, 1 reply; 7+ messages in thread
From: Eric Sunshine @ 2020-12-21  8:16 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philippe Blain, Duy Nguyen, Eric Sunshine

This is a re-roll of [1] which addresses a limitation of `git worktree
repair` in which it is unable to repair the two-way links between the
repository and its secondary worktrees if both the repository and the
worktrees have been moved manually.

The primary change in v2 is that the commit message has been expanded
quite a bit to explain how the new repair mechanism works, rather than
expecting the reader to glean the necessary information from the patch
itself, as well as to express the reasoning behind a couple changes
which might not be obvious at first glance.

The patch itself is unchanged with the exception of two minor
alterations to one of the two added tests. First, anchoring has been
added to a couple `sed` expressions. Second, the rename of the main and
secondary worktrees is now done with two distinct `mv` commands rather
than a single "clever" `mv` so as to avoid giving the reader the false
impression that the cleverness implies something about how the repair
mechanism works (it doesn't imply anything, it was just done as a
convenience in v1).

[1]: https://lore.kernel.org/git/20201208173705.5770-1-sunshine@sunshineco.com/

Eric Sunshine (1):
  worktree: teach `repair` to fix multi-directional breakage

 Documentation/git-worktree.txt |  5 +++++
 builtin/worktree.c             |  2 +-
 t/t2406-worktree-repair.sh     | 26 +++++++++++++++++++++
 worktree.c                     | 41 ++++++++++++++++++++++++++++++++++
 4 files changed, 73 insertions(+), 1 deletion(-)

Range-diff against v1:
1:  956e61bbc8 ! 1:  cd38528672 worktree: teach `repair` to fix multi-directional breakage
    @@ Commit message
         Fix these shortcomings by teaching `repair` to attempt to infer the new
         location of the <repo>/worktrees/<id>/gitdir file when the location
         recorded in the worktree's gitfile has become stale but the file is
    -    otherwise well-formed.
    +    otherwise well-formed. The inference is intentionally simple-minded.
    +    For each worktree path specified as an argument, `git worktree repair`
    +    manually reads the ".git" gitfile at that location and, if it is
    +    well-formed, extracts the <id>. It then searches for a corresponding
    +    <id> in <repo>/worktrees/ and, if found, concludes that there is a
    +    reasonable match and updates <repo>/worktrees/<id>/gitdir to point at
    +    the specified worktree path. In order for <repo> to be known, `git
    +    worktree repair` must be run in the main worktree or bare repository.
    +
    +    `git worktree repair` first attempts to repair each incoming
    +    /path/to/worktree/.git gitfile to point at the repository, and then
    +    attempts to repair outgoing <repo>/worktrees/<id>/gitdir files to point
    +    at the worktrees. This sequence was chosen arbitrarily when originally
    +    implemented since the order of fixes is immaterial as long as one side
    +    of the two-way link between the repository and a worktree is sound.
    +    However, for this new repair technique to work, the order must be
    +    reversed. This is because the new inference mechanism, when it is
    +    successful, allows the outgoing <repo>/worktrees/<id>/gitdir file to be
    +    repaired, thus fixing one side of the two-way link. Once that side is
    +    fixed, the other side can be fixed by the existing repair mechanism,
    +    hence the order of repairs is now significant.
    +
    +    Two safeguards are employed to avoid hijacking a worktree from a
    +    different repository if the user accidentally specifies a foreign
    +    worktree as an argument. The first, as described above, is that it
    +    requires an <id> match between the repository and the worktree. That
    +    itself is not foolproof for preventing hijack, so the second safeguard
    +    is that the inference will only kick in if the worktree's
    +    /path/to/worktree/.git gitfile does not point at a repository.
     
         Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
     
    @@ t/t2406-worktree-repair.sh: test_expect_success 'repair multiple gitdir files' '
      '
      
     +test_expect_success 'repair moved main and linked worktrees' '
    -+	test_when_finished "rm -rf orig moved" &&
    -+	test_create_repo orig/main &&
    -+	test_commit -C orig/main init &&
    -+	git -C orig/main worktree add --detach ../side &&
    -+	sed s,orig/side/\.git,moved/side/.git, \
    -+		orig/main/.git/worktrees/side/gitdir >expect-gitdir &&
    -+	sed s,orig/main/.git/worktrees/side,moved/main/.git/worktrees/side, \
    -+		orig/side/.git >expect-gitfile &&
    -+	mv orig moved &&
    -+	git -C moved/main worktree repair ../side &&
    -+	test_cmp expect-gitdir moved/main/.git/worktrees/side/gitdir &&
    -+	test_cmp expect-gitfile moved/side/.git
    ++	test_when_finished "rm -rf main side mainmoved sidemoved" &&
    ++	test_create_repo main &&
    ++	test_commit -C main init &&
    ++	git -C main worktree add --detach ../side &&
    ++	sed "s,side/\.git$,sidemoved/.git," \
    ++		main/.git/worktrees/side/gitdir >expect-gitdir &&
    ++	sed "s,main/.git/worktrees/side$,mainmoved/.git/worktrees/side," \
    ++		side/.git >expect-gitfile &&
    ++	mv main mainmoved &&
    ++	mv side sidemoved &&
    ++	git -C mainmoved worktree repair ../sidemoved &&
    ++	test_cmp expect-gitdir mainmoved/.git/worktrees/side/gitdir &&
    ++	test_cmp expect-gitfile sidemoved/.git
     +'
     +
      test_done
-- 
2.30.0.rc1.243.g5298b911bd


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v2 1/1] worktree: teach `repair` to fix multi-directional breakage
  2020-12-21  8:16 ` [PATCH v2 0/1] teach `worktree repair` to fix two-way linkage Eric Sunshine
@ 2020-12-21  8:16   ` Eric Sunshine
  0 siblings, 0 replies; 7+ messages in thread
From: Eric Sunshine @ 2020-12-21  8:16 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philippe Blain, Duy Nguyen, Eric Sunshine

`git worktree repair` knows how to repair the two-way links between the
repository and a worktree as long as a link in one or the other
direction is sound. For instance, if a linked worktree is moved (without
using `git worktree move`), repair is possible because the worktree
still knows the location of the repository even though the repository no
longer knows where the worktree is. Similarly, if the repository is
moved, repair is possible since the repository still knows the locations
of the worktrees even though the worktrees no longer know where the
repository is.

However, if both the repository and the worktrees are moved, then links
are severed in both directions, and no repair is possible. This is the
case even when the new worktree locations are specified as arguments to
`git worktree repair`. The reason for this limitation is twofold. First,
when `repair` consults the worktree's gitfile (/path/to/worktree/.git)
to determine the corresponding <repo>/worktrees/<id>/gitdir file to fix,
<repo> is the old path to the repository, thus it is unable to fix the
`gitdir` file at its new location since it doesn't know where it is.
Second, when `repair` consults <repo>/worktrees/<id>/gitdir to find the
location of the worktree's gitfile (/path/to/worktree/.git), the path
recorded in `gitdir` is the old location of the worktree's gitfile, thus
it is unable to repair the gitfile since it doesn't know where it is.

Fix these shortcomings by teaching `repair` to attempt to infer the new
location of the <repo>/worktrees/<id>/gitdir file when the location
recorded in the worktree's gitfile has become stale but the file is
otherwise well-formed. The inference is intentionally simple-minded.
For each worktree path specified as an argument, `git worktree repair`
manually reads the ".git" gitfile at that location and, if it is
well-formed, extracts the <id>. It then searches for a corresponding
<id> in <repo>/worktrees/ and, if found, concludes that there is a
reasonable match and updates <repo>/worktrees/<id>/gitdir to point at
the specified worktree path. In order for <repo> to be known, `git
worktree repair` must be run in the main worktree or bare repository.

`git worktree repair` first attempts to repair each incoming
/path/to/worktree/.git gitfile to point at the repository, and then
attempts to repair outgoing <repo>/worktrees/<id>/gitdir files to point
at the worktrees. This sequence was chosen arbitrarily when originally
implemented since the order of fixes is immaterial as long as one side
of the two-way link between the repository and a worktree is sound.
However, for this new repair technique to work, the order must be
reversed. This is because the new inference mechanism, when it is
successful, allows the outgoing <repo>/worktrees/<id>/gitdir file to be
repaired, thus fixing one side of the two-way link. Once that side is
fixed, the other side can be fixed by the existing repair mechanism,
hence the order of repairs is now significant.

Two safeguards are employed to avoid hijacking a worktree from a
different repository if the user accidentally specifies a foreign
worktree as an argument. The first, as described above, is that it
requires an <id> match between the repository and the worktree. That
itself is not foolproof for preventing hijack, so the second safeguard
is that the inference will only kick in if the worktree's
/path/to/worktree/.git gitfile does not point at a repository.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
---
 Documentation/git-worktree.txt |  5 +++++
 builtin/worktree.c             |  2 +-
 t/t2406-worktree-repair.sh     | 26 +++++++++++++++++++++
 worktree.c                     | 41 ++++++++++++++++++++++++++++++++++
 4 files changed, 73 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-worktree.txt b/Documentation/git-worktree.txt
index af06128cc9..02a706c4c0 100644
--- a/Documentation/git-worktree.txt
+++ b/Documentation/git-worktree.txt
@@ -143,6 +143,11 @@ locate it. Running `repair` within the recently-moved working tree will
 reestablish the connection. If multiple linked working trees are moved,
 running `repair` from any working tree with each tree's new `<path>` as
 an argument, will reestablish the connection to all the specified paths.
++
+If both the main working tree and linked working trees have been moved
+manually, then running `repair` in the main working tree and specifying the
+new `<path>` of each linked working tree will reestablish all connections
+in both directions.
 
 unlock::
 
diff --git a/builtin/worktree.c b/builtin/worktree.c
index 197fd24a55..71287b2da6 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -1052,10 +1052,10 @@ static int repair(int ac, const char **av, const char *prefix)
 	int rc = 0;
 
 	ac = parse_options(ac, av, prefix, options, worktree_usage, 0);
-	repair_worktrees(report_repair, &rc);
 	p = ac > 0 ? av : self;
 	for (; *p; p++)
 		repair_worktree_at_path(*p, report_repair, &rc);
+	repair_worktrees(report_repair, &rc);
 	return rc;
 }
 
diff --git a/t/t2406-worktree-repair.sh b/t/t2406-worktree-repair.sh
index 1fe468bfe8..f73741886b 100755
--- a/t/t2406-worktree-repair.sh
+++ b/t/t2406-worktree-repair.sh
@@ -104,6 +104,16 @@ test_expect_success 'repo not found; .git not file' '
 	test_i18ngrep ".git is not a file" err
 '
 
+test_expect_success 'repo not found; .git not referencing repo' '
+	test_when_finished "rm -rf side not-a-repo && git worktree prune" &&
+	git worktree add --detach side &&
+	sed s,\.git/worktrees/side$,not-a-repo, side/.git >side/.newgit &&
+	mv side/.newgit side/.git &&
+	mkdir not-a-repo &&
+	test_must_fail git worktree repair side 2>err &&
+	test_i18ngrep ".git file does not reference a repository" err
+'
+
 test_expect_success 'repo not found; .git file broken' '
 	test_when_finished "rm -rf orig moved && git worktree prune" &&
 	git worktree add --detach orig &&
@@ -176,4 +186,20 @@ test_expect_success 'repair multiple gitdir files' '
 	test_must_be_empty err
 '
 
+test_expect_success 'repair moved main and linked worktrees' '
+	test_when_finished "rm -rf main side mainmoved sidemoved" &&
+	test_create_repo main &&
+	test_commit -C main init &&
+	git -C main worktree add --detach ../side &&
+	sed "s,side/\.git$,sidemoved/.git," \
+		main/.git/worktrees/side/gitdir >expect-gitdir &&
+	sed "s,main/.git/worktrees/side$,mainmoved/.git/worktrees/side," \
+		side/.git >expect-gitfile &&
+	mv main mainmoved &&
+	mv side sidemoved &&
+	git -C mainmoved worktree repair ../sidemoved &&
+	test_cmp expect-gitdir mainmoved/.git/worktrees/side/gitdir &&
+	test_cmp expect-gitfile sidemoved/.git
+'
+
 test_done
diff --git a/worktree.c b/worktree.c
index f84ceae87d..821b233479 100644
--- a/worktree.c
+++ b/worktree.c
@@ -644,6 +644,42 @@ static int is_main_worktree_path(const char *path)
 	return !cmp;
 }
 
+/*
+ * If both the main worktree and linked worktree have been moved, then the
+ * gitfile /path/to/worktree/.git won't point into the repository, thus we
+ * won't know which <repo>/worktrees/<id>/gitdir to repair. However, we may
+ * be able to infer the gitdir by manually reading /path/to/worktree/.git,
+ * extracting the <id>, and checking if <repo>/worktrees/<id> exists.
+ */
+static char *infer_backlink(const char *gitfile)
+{
+	struct strbuf actual = STRBUF_INIT;
+	struct strbuf inferred = STRBUF_INIT;
+	const char *id;
+
+	if (strbuf_read_file(&actual, gitfile, 0) < 0)
+		goto error;
+	if (!starts_with(actual.buf, "gitdir:"))
+		goto error;
+	if (!(id = find_last_dir_sep(actual.buf)))
+		goto error;
+	strbuf_trim(&actual);
+	id++; /* advance past '/' to point at <id> */
+	if (!*id)
+		goto error;
+	strbuf_git_common_path(&inferred, the_repository, "worktrees/%s", id);
+	if (!is_directory(inferred.buf))
+		goto error;
+
+	strbuf_release(&actual);
+	return strbuf_detach(&inferred, NULL);
+
+error:
+	strbuf_release(&actual);
+	strbuf_release(&inferred);
+	return NULL;
+}
+
 /*
  * Repair <repo>/worktrees/<id>/gitdir if missing, corrupt, or not pointing at
  * the worktree's path.
@@ -675,6 +711,11 @@ void repair_worktree_at_path(const char *path,
 	if (err == READ_GITFILE_ERR_NOT_A_FILE) {
 		fn(1, realdotgit.buf, _("unable to locate repository; .git is not a file"), cb_data);
 		goto done;
+	} else if (err == READ_GITFILE_ERR_NOT_A_REPO) {
+		if (!(backlink = infer_backlink(realdotgit.buf))) {
+			fn(1, realdotgit.buf, _("unable to locate repository; .git file does not reference a repository"), cb_data);
+			goto done;
+		}
 	} else if (err) {
 		fn(1, realdotgit.buf, _("unable to locate repository; .git file broken"), cb_data);
 		goto done;
-- 
2.30.0.rc1.243.g5298b911bd


^ permalink raw reply related	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-12-21  8:20 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-08 17:37 [PATCH] worktree: teach `repair` to fix multi-directional breakage Eric Sunshine
2020-12-08 21:47 ` Junio C Hamano
2020-12-17 10:22   ` Eric Sunshine
2020-12-17 19:49     ` Junio C Hamano
2020-12-17 20:02       ` Eric Sunshine
2020-12-21  8:16 ` [PATCH v2 0/1] teach `worktree repair` to fix two-way linkage Eric Sunshine
2020-12-21  8:16   ` [PATCH v2 1/1] worktree: teach `repair` to fix multi-directional breakage Eric Sunshine

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).