git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / mirror / code / Atom feed
* should git maintenance prefetch be taught to honor remote.fetch refspec?
@ 2021-04-01 18:49 Tom Saeger
  2021-04-01 19:07 ` Derrick Stolee
  0 siblings, 1 reply; 27+ messages in thread
From: Tom Saeger @ 2021-04-01 18:49 UTC (permalink / raw)
  To: git

I've recently setup git maintenance and noticed prefetch task
fetches ALL remote refs (well not tags) and does not honor
remote fetch config settings.

Details:

# narrowed set of remote refs produced by `git fetch --all --prune`
$ git show-ref | grep "refs/remotes/" | wc -l
81

# git maintenance prefetch refs
$ git show-ref | grep "refs/prefetch/" | wc -l
262

# git worktree with 7 remotes
$ git remote | wc -l
7

# some remotes configured to fetch narrowed set of remote refs
$ git config --local --list | grep ".fetch=" | wc -l
11

# concrete example of remote fetch config
$ git config --local --list | grep "stable.fetch="
remote.stable.fetch=+refs/heads/linux-4.14.y:refs/remotes/stable/linux-4.14.y
remote.stable.fetch=+refs/heads/linux-5.4.y:refs/remotes/stable/linux-5.4.y


Should git maintenance prefetch task be taught to honor remote fetch config settings?


Something equivalent to:

git config --local --list \
| sed -ne '/remote\..*fetch=/{s#refs/remotes/#refs/prefetch/#;s#^remote\.\([^\.]\+\)\.fetch=#\1 #;p;};' \
| while read -r remote refspec ; do
    git fetch ${remote} --prune --no-tags --no-write-fetch-head --recurse-submodules=no --refmap= --quiet ${refspec};
done


Regards,

--Tom

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: should git maintenance prefetch be taught to honor remote.fetch refspec?
  2021-04-01 18:49 should git maintenance prefetch be taught to honor remote.fetch refspec? Tom Saeger
@ 2021-04-01 19:07 ` Derrick Stolee
  2021-04-01 19:42   ` Tom Saeger
  2021-04-01 20:14   ` Junio C Hamano
  0 siblings, 2 replies; 27+ messages in thread
From: Derrick Stolee @ 2021-04-01 19:07 UTC (permalink / raw)
  To: Tom Saeger, git

On 4/1/2021 2:49 PM, Tom Saeger wrote:
> I've recently setup git maintenance and noticed prefetch task
> fetches ALL remote refs (well not tags) and does not honor
> remote fetch config settings.

This is intentional. The point is to get the latest objects
without modifying any local copies of refs. You still need
to run "git fetch" manually to update the refs, but that
should be faster because you have most of the objects already
in your copy of the repository.

Here is the essential part of the documentation [1]:

	The refmap is custom to avoid updating local or
	remote branches (those in refs/heads or refs/remotes).
	Instead, the remote refs are stored in
	refs/prefetch/<remote>/. Also, tags are not updated.

	This is done to avoid disrupting the remote-tracking
	branches. The end users expect these refs to stay
	unmoved unless they initiate a fetch. With prefetch
	task, however, the objects necessary to complete a
	later real fetch would already be obtained, so the real
	fetch would go faster. In the ideal case, it will just
	become an update to a bunch of remote-tracking branches
	without any object transfer.

[1] https://git-scm.com/docs/git-maintenance#Documentation/git-maintenance.txt-prefetch

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: should git maintenance prefetch be taught to honor remote.fetch refspec?
  2021-04-01 19:07 ` Derrick Stolee
@ 2021-04-01 19:42   ` Tom Saeger
  2021-04-01 20:14   ` Junio C Hamano
  1 sibling, 0 replies; 27+ messages in thread
From: Tom Saeger @ 2021-04-01 19:42 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: git

On Thu, Apr 01, 2021 at 03:07:37PM -0400, Derrick Stolee wrote:
> On 4/1/2021 2:49 PM, Tom Saeger wrote:
> > I've recently setup git maintenance and noticed prefetch task
> > fetches ALL remote refs (well not tags) and does not honor
> > remote fetch config settings.
> 
> This is intentional. The point is to get the latest objects
> without modifying any local copies of refs. You still need
> to run "git fetch" manually to update the refs, but that
> should be faster because you have most of the objects already
> in your copy of the repository.

I get all that.

In my particular use-case - I constrained the fetch refspec to purposely
limit the number of objects and remote refs only to those of interest.

"git fetch" won't ever use the other 120 refs that got pulled in by
prefetch.

If the configured remote fetch refspecs were honored,
prefetch would only update those objects/refs which `git fetch` would
have.

> 
> Here is the essential part of the documentation [1]:
> 
> 	The refmap is custom to avoid updating local or
> 	remote branches (those in refs/heads or refs/remotes).
> 	Instead, the remote refs are stored in
> 	refs/prefetch/<remote>/. Also, tags are not updated.
> 
> 	This is done to avoid disrupting the remote-tracking
> 	branches. The end users expect these refs to stay
> 	unmoved unless they initiate a fetch. With prefetch
> 	task, however, the objects necessary to complete a
> 	later real fetch would already be obtained, so the real
> 	fetch would go faster. In the ideal case, it will just
> 	become an update to a bunch of remote-tracking branches
> 	without any object transfer.
> 
> [1] https://git-scm.com/docs/git-maintenance#Documentation/git-maintenance.txt-prefetch
> 
> Thanks,
> -Stolee

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: should git maintenance prefetch be taught to honor remote.fetch refspec?
  2021-04-01 19:07 ` Derrick Stolee
  2021-04-01 19:42   ` Tom Saeger
@ 2021-04-01 20:14   ` Junio C Hamano
  2021-04-01 22:11     ` Tom Saeger
  2021-04-01 22:25     ` Derrick Stolee
  1 sibling, 2 replies; 27+ messages in thread
From: Junio C Hamano @ 2021-04-01 20:14 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Tom Saeger, git

Derrick Stolee <stolee@gmail.com> writes:

> On 4/1/2021 2:49 PM, Tom Saeger wrote:
>> I've recently setup git maintenance and noticed prefetch task
>> fetches ALL remote refs (well not tags) and does not honor
>> remote fetch config settings.
>
> This is intentional. The point is to get the latest objects
> without modifying any local copies of refs. You still need
> to run "git fetch" manually to update the refs, but that
> should be faster because you have most of the objects already
> in your copy of the repository.

You answered only half of the question, I think.

The plain vanilla refspec after you clone would be

    [remote "origin"]
	fetch = +refs/heads/*:refs/remotes/origin/*

and "maintenance prefetch" intentionally redirect the destination
part away from refs/remotes/origin/ to avoid disrupting the
reference to @{upstream} etc. that are used locally.  And
intentionally doing so is good.

But imagine you are tracking my 'todo' branch which has nothing
common with the history of Git.  You'd have

    [remote "origin"]
	url = git://git.kernel.org/pub/scm/git/git.git
	fetch = +refs/heads/todo:refs/remotes/origin/todo

If "maintenance prefetch" does "+refs/heads/*:refs/prefetch/origin/*",
you'd end up the history of Git, which is not interesting for you who
are only interested in the 'todo' branch (to be checked out in the
Meta subdirectory to use Meta/cook script or read Meta/whats-cooking.txt
file).

So, redirecting the right-hand of configured refspec is a good idea;
not copying the left-hand of configured refspec, and unconditionally
using "refs/heads/*" is not.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: should git maintenance prefetch be taught to honor remote.fetch refspec?
  2021-04-01 20:14   ` Junio C Hamano
@ 2021-04-01 22:11     ` Tom Saeger
  2021-04-01 22:25     ` Derrick Stolee
  1 sibling, 0 replies; 27+ messages in thread
From: Tom Saeger @ 2021-04-01 22:11 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Derrick Stolee, git

On Thu, Apr 01, 2021 at 01:14:05PM -0700, Junio C Hamano wrote:
> Derrick Stolee <stolee@gmail.com> writes:
> 
> > On 4/1/2021 2:49 PM, Tom Saeger wrote:
> >> I've recently setup git maintenance and noticed prefetch task
> >> fetches ALL remote refs (well not tags) and does not honor
> >> remote fetch config settings.
> >
> > This is intentional. The point is to get the latest objects
> > without modifying any local copies of refs. You still need
> > to run "git fetch" manually to update the refs, but that
> > should be faster because you have most of the objects already
> > in your copy of the repository.
> 
> You answered only half of the question, I think.
> 
> The plain vanilla refspec after you clone would be
> 
>     [remote "origin"]
> 	fetch = +refs/heads/*:refs/remotes/origin/*
> 
> and "maintenance prefetch" intentionally redirect the destination
> part away from refs/remotes/origin/ to avoid disrupting the
> reference to @{upstream} etc. that are used locally.  And
> intentionally doing so is good.
> 
> But imagine you are tracking my 'todo' branch which has nothing
> common with the history of Git.  You'd have
> 
>     [remote "origin"]
> 	url = git://git.kernel.org/pub/scm/git/git.git
> 	fetch = +refs/heads/todo:refs/remotes/origin/todo
> 
> If "maintenance prefetch" does "+refs/heads/*:refs/prefetch/origin/*",
> you'd end up the history of Git, which is not interesting for you who
> are only interested in the 'todo' branch (to be checked out in the
> Meta subdirectory to use Meta/cook script or read Meta/whats-cooking.txt
> file).

Yes - precisely.

> 
> So, redirecting the right-hand of configured refspec is a good idea;
> not copying the left-hand of configured refspec, and unconditionally
> using "refs/heads/*" is not.
> 

And multiple values can be set for remote.remotename.fetch


Behavior equivalent to:

git remote | while read -r remote; do
    refspecs=$(
    git config --get-all "remote.${remote}.fetch" \
    | sed 's#refs/remotes/#refs/prefetch/#' \
    | xargs)
    git fetch "${remote}" --prune --no-tags --no-write-fetch-head --recurse-submodules=no --refmap= --quiet ${refspecs};
done

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: should git maintenance prefetch be taught to honor remote.fetch refspec?
  2021-04-01 20:14   ` Junio C Hamano
  2021-04-01 22:11     ` Tom Saeger
@ 2021-04-01 22:25     ` Derrick Stolee
  2021-04-02 18:27       ` Tom Saeger
  1 sibling, 1 reply; 27+ messages in thread
From: Derrick Stolee @ 2021-04-01 22:25 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Tom Saeger, git

On 4/1/2021 4:14 PM, Junio C Hamano wrote:
> Derrick Stolee <stolee@gmail.com> writes:
> 
>> On 4/1/2021 2:49 PM, Tom Saeger wrote:
>>> I've recently setup git maintenance and noticed prefetch task
>>> fetches ALL remote refs (well not tags) and does not honor
>>> remote fetch config settings.
>>
>> This is intentional. The point is to get the latest objects
>> without modifying any local copies of refs. You still need
>> to run "git fetch" manually to update the refs, but that
>> should be faster because you have most of the objects already
>> in your copy of the repository.
> 
> You answered only half of the question, I think.

You are right. Thanks, both, for pointing that out.
 
> The plain vanilla refspec after you clone would be
> 
>     [remote "origin"]
> 	fetch = +refs/heads/*:refs/remotes/origin/*
> 
> and "maintenance prefetch" intentionally redirect the destination
> part away from refs/remotes/origin/ to avoid disrupting the
> reference to @{upstream} etc. that are used locally.  And
> intentionally doing so is good.
> 
> But imagine you are tracking my 'todo' branch which has nothing
> common with the history of Git.  You'd have
> 
>     [remote "origin"]
> 	url = git://git.kernel.org/pub/scm/git/git.git
> 	fetch = +refs/heads/todo:refs/remotes/origin/todo
> 
> If "maintenance prefetch" does "+refs/heads/*:refs/prefetch/origin/*",
> you'd end up the history of Git, which is not interesting for you who
> are only interested in the 'todo' branch (to be checked out in the
> Meta subdirectory to use Meta/cook script or read Meta/whats-cooking.txt
> file).
> 
> So, redirecting the right-hand of configured refspec is a good idea;
> not copying the left-hand of configured refspec, and unconditionally
> using "refs/heads/*" is not.
 
This makes sense as a way to augment the feature. It doesn't seem
like a common scenario, but it would be good for users to have
that flexibility.

Upon initial inspection, it shouldn't be too much work. However,
there is some generality to the refspec that might not be wholly
appropriate for prefetch (such as the exact_sha1 option). I'm
unfamiliar with the advanced forms of the refspec, so it'll take
some time to have confidence in this approach.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: should git maintenance prefetch be taught to honor remote.fetch refspec?
  2021-04-01 22:25     ` Derrick Stolee
@ 2021-04-02 18:27       ` Tom Saeger
  2021-04-02 20:43         ` Derrick Stolee
  0 siblings, 1 reply; 27+ messages in thread
From: Tom Saeger @ 2021-04-02 18:27 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Junio C Hamano, git

On Thu, Apr 01, 2021 at 06:25:36PM -0400, Derrick Stolee wrote:
> On 4/1/2021 4:14 PM, Junio C Hamano wrote:
> > Derrick Stolee <stolee@gmail.com> writes:
> > 
> >> On 4/1/2021 2:49 PM, Tom Saeger wrote:
> >>> I've recently setup git maintenance and noticed prefetch task
> >>> fetches ALL remote refs (well not tags) and does not honor
> >>> remote fetch config settings.
> >>
> >> This is intentional. The point is to get the latest objects
> >> without modifying any local copies of refs. You still need
> >> to run "git fetch" manually to update the refs, but that
> >> should be faster because you have most of the objects already
> >> in your copy of the repository.
> > 
> > You answered only half of the question, I think.
> 
> You are right. Thanks, both, for pointing that out.
>  
> > The plain vanilla refspec after you clone would be
> > 
> >     [remote "origin"]
> > 	fetch = +refs/heads/*:refs/remotes/origin/*
> > 
> > and "maintenance prefetch" intentionally redirect the destination
> > part away from refs/remotes/origin/ to avoid disrupting the
> > reference to @{upstream} etc. that are used locally.  And
> > intentionally doing so is good.
> > 
> > But imagine you are tracking my 'todo' branch which has nothing
> > common with the history of Git.  You'd have
> > 
> >     [remote "origin"]
> > 	url = git://git.kernel.org/pub/scm/git/git.git
> > 	fetch = +refs/heads/todo:refs/remotes/origin/todo
> > 
> > If "maintenance prefetch" does "+refs/heads/*:refs/prefetch/origin/*",
> > you'd end up the history of Git, which is not interesting for you who
> > are only interested in the 'todo' branch (to be checked out in the
> > Meta subdirectory to use Meta/cook script or read Meta/whats-cooking.txt
> > file).
> > 
> > So, redirecting the right-hand of configured refspec is a good idea;
> > not copying the left-hand of configured refspec, and unconditionally
> > using "refs/heads/*" is not.
>  
> This makes sense as a way to augment the feature. It doesn't seem
> like a common scenario, but it would be good for users to have
> that flexibility.

It's common for me, especially on repos requiring 'maintenance'.

> 
> Upon initial inspection, it shouldn't be too much work. However,
> there is some generality to the refspec that might not be wholly
> appropriate for prefetch (such as the exact_sha1 option). I'm
> unfamiliar with the advanced forms of the refspec, so it'll take
> some time to have confidence in this approach.

Didn't know about exact_sha1.  prefetch probably wouldn't do
anything in that case?

'negative' refspecs - hmm haven't tried those.  I see how that might
complicate things or maybe not.

generally isn't it still changing the right-hand side of refspec?

replacing ":refs/" with ":refs/prefetch/"

This would still work for refspecs with negative patterns right?

I'm willing to help with/test/review this, thanks for investigating.


> 
> Thanks,
> -Stolee

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: should git maintenance prefetch be taught to honor remote.fetch refspec?
  2021-04-02 18:27       ` Tom Saeger
@ 2021-04-02 20:43         ` Derrick Stolee
  2021-04-02 21:07           ` Derrick Stolee
                             ` (3 more replies)
  0 siblings, 4 replies; 27+ messages in thread
From: Derrick Stolee @ 2021-04-02 20:43 UTC (permalink / raw)
  To: Tom Saeger; +Cc: Junio C Hamano, git

On 4/2/2021 2:27 PM, Tom Saeger wrote:
> On Thu, Apr 01, 2021 at 06:25:36PM -0400, Derrick Stolee wrote:
>> On 4/1/2021 4:14 PM, Junio C Hamano wrote:
>>> Derrick Stolee <stolee@gmail.com> writes:
>>>
>>>> On 4/1/2021 2:49 PM, Tom Saeger wrote:
>>>
>>> So, redirecting the right-hand of configured refspec is a good idea;
>>> not copying the left-hand of configured refspec, and unconditionally
>>> using "refs/heads/*" is not.
>>  
>> This makes sense as a way to augment the feature. It doesn't seem
>> like a common scenario, but it would be good for users to have
>> that flexibility.
> 
> It's common for me, especially on repos requiring 'maintenance'.

I'm sure that once this is a tool in your belt, then it becomes
common. The number of users who think about refspecs is likely a
small proportion. But features should work as well as they can
for as many users as possible. There's a way forward here, it
just is a little tricky due to the generality of refspecs.

>> Upon initial inspection, it shouldn't be too much work. However,
>> there is some generality to the refspec that might not be wholly
>> appropriate for prefetch (such as the exact_sha1 option). I'm
>> unfamiliar with the advanced forms of the refspec, so it'll take
>> some time to have confidence in this approach.
> 
> Didn't know about exact_sha1.  prefetch probably wouldn't do
> anything in that case?

My guess is that those should be dropped and ignored. But
maybe the approach below will still work?

> 'negative' refspecs - hmm haven't tried those.  I see how that might
> complicate things or maybe not.
> 
> generally isn't it still changing the right-hand side of refspec?
> 
> replacing ":refs/" with ":refs/prefetch/"

Right, this substring replacement might be easiest to achieve. The
'struct refspec' doesn't make it incredibly easy. Perhaps skipping
the refspec parsing and just doing that substring swap directly from
the config value might be the best approach.

> This would still work for refspecs with negative patterns right?

One of the issues is that negative patterns have no ":refs/"
substring.

The other issue is that exact matches (no "*") have an exact
string in the destination, too, so replacing the _entire_
destination with "refs/prefetch/<remote>/*" breaks the refspec.
I think the substring approach will still work here.

> I'm willing to help with/test/review this, thanks for investigating.
I have a branch available [1], but I'm seeing some failures only
on FreeBSD [2] and I can't understand why that platform is failing
this test. The current version (as of this writing) does not do
the substring replacement technique, and hence it just gives up
on exact matches. I will try the substring approach as an
alternative and see where that gets me.

[1] https://github.com/gitgitgadget/git/pull/924
[2] https://github.com/gitgitgadget/git/pull/924/checks?check_run_id=2256079534

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: should git maintenance prefetch be taught to honor remote.fetch refspec?
  2021-04-02 20:43         ` Derrick Stolee
@ 2021-04-02 21:07           ` Derrick Stolee
  2021-04-02 21:39             ` Tom Saeger
  2021-04-02 21:15           ` Tom Saeger
                             ` (2 subsequent siblings)
  3 siblings, 1 reply; 27+ messages in thread
From: Derrick Stolee @ 2021-04-02 21:07 UTC (permalink / raw)
  To: Tom Saeger; +Cc: Junio C Hamano, git

On 4/2/2021 4:43 PM, Derrick Stolee wrote:
> On 4/2/2021 2:27 PM, Tom Saeger wrote:
>> generally isn't it still changing the right-hand side of refspec?
>>
>> replacing ":refs/" with ":refs/prefetch/"
> 
> Right, this substring replacement might be easiest to achieve. The
> 'struct refspec' doesn't make it incredibly easy. Perhaps skipping
> the refspec parsing and just doing that substring swap directly from
> the config value might be the best approach.
> 
>> This would still work for refspecs with negative patterns right?
> 
> One of the issues is that negative patterns have no ":refs/"
> substring.
> 
> The other issue is that exact matches (no "*") have an exact
> string in the destination, too, so replacing the _entire_
> destination with "refs/prefetch/<remote>/*" breaks the refspec.
> I think the substring approach will still work here.

I updated my branch with the substring approach, which is
probably the better solution. Please give it a try. I don't
expect that change to help the FreeBSD build, but we will see.
 
> [1] https://github.com/gitgitgadget/git/pull/924
> [2] https://github.com/gitgitgadget/git/pull/924/checks?check_run_id=2256079534

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: should git maintenance prefetch be taught to honor remote.fetch refspec?
  2021-04-02 20:43         ` Derrick Stolee
  2021-04-02 21:07           ` Derrick Stolee
@ 2021-04-02 21:15           ` Tom Saeger
  2021-04-02 21:19           ` Junio C Hamano
  2021-04-02 22:32           ` Eric Sunshine
  3 siblings, 0 replies; 27+ messages in thread
From: Tom Saeger @ 2021-04-02 21:15 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Junio C Hamano, git

On Fri, Apr 02, 2021 at 04:43:16PM -0400, Derrick Stolee wrote:
> On 4/2/2021 2:27 PM, Tom Saeger wrote:
> > On Thu, Apr 01, 2021 at 06:25:36PM -0400, Derrick Stolee wrote:
> >> On 4/1/2021 4:14 PM, Junio C Hamano wrote:
> >>> Derrick Stolee <stolee@gmail.com> writes:
> >>>
> >>>> On 4/1/2021 2:49 PM, Tom Saeger wrote:
> >>>
> >>> So, redirecting the right-hand of configured refspec is a good idea;
> >>> not copying the left-hand of configured refspec, and unconditionally
> >>> using "refs/heads/*" is not.
> >>  
> >> This makes sense as a way to augment the feature. It doesn't seem
> >> like a common scenario, but it would be good for users to have
> >> that flexibility.
> > 
> > It's common for me, especially on repos requiring 'maintenance'.
> 
> I'm sure that once this is a tool in your belt, then it becomes
> common. The number of users who think about refspecs is likely a
> small proportion. But features should work as well as they can
> for as many users as possible. There's a way forward here, it
> just is a little tricky due to the generality of refspecs.
> 
> >> Upon initial inspection, it shouldn't be too much work. However,
> >> there is some generality to the refspec that might not be wholly
> >> appropriate for prefetch (such as the exact_sha1 option). I'm
> >> unfamiliar with the advanced forms of the refspec, so it'll take
> >> some time to have confidence in this approach.
> > 
> > Didn't know about exact_sha1.  prefetch probably wouldn't do
> > anything in that case?
> 
> My guess is that those should be dropped and ignored. But
> maybe the approach below will still work?
> 
> > 'negative' refspecs - hmm haven't tried those.  I see how that might
> > complicate things or maybe not.
> > 
> > generally isn't it still changing the right-hand side of refspec?
> > 
> > replacing ":refs/" with ":refs/prefetch/"
> 
> Right, this substring replacement might be easiest to achieve. The
> 'struct refspec' doesn't make it incredibly easy. Perhaps skipping
> the refspec parsing and just doing that substring swap directly from
> the config value might be the best approach.
> 
> > This would still work for refspecs with negative patterns right?
> 
> One of the issues is that negative patterns have no ":refs/"
> substring.
> 
> The other issue is that exact matches (no "*") have an exact
> string in the destination, too, so replacing the _entire_
> destination with "refs/prefetch/<remote>/*" breaks the refspec.
> I think the substring approach will still work here.
> 
> > I'm willing to help with/test/review this, thanks for investigating.
> I have a branch available [1], but I'm seeing some failures only
> on FreeBSD [2] and I can't understand why that platform is failing
> this test. The current version (as of this writing) does not do
> the substring replacement technique, and hence it just gives up
> on exact matches. I will try the substring approach as an
> alternative and see where that gets me.

I also worked up a patch, not nearly as elegant as yours, but it did
work.
I didn't think about changing fetch_remote to take struct remote like what
you've done.

Thanks - I'll give this a try.

--Tom

> 
> [1] https://github.com/gitgitgadget/git/pull/924
> [2] https://github.com/gitgitgadget/git/pull/924/checks?check_run_id=2256079534
> 
> Thanks,
> -Stolee

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: should git maintenance prefetch be taught to honor remote.fetch refspec?
  2021-04-02 20:43         ` Derrick Stolee
  2021-04-02 21:07           ` Derrick Stolee
  2021-04-02 21:15           ` Tom Saeger
@ 2021-04-02 21:19           ` Junio C Hamano
  2021-04-02 21:33             ` Tom Saeger
  2021-04-04 20:25             ` Derrick Stolee
  2021-04-02 22:32           ` Eric Sunshine
  3 siblings, 2 replies; 27+ messages in thread
From: Junio C Hamano @ 2021-04-02 21:19 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Tom Saeger, git

Derrick Stolee <stolee@gmail.com> writes:

> The other issue is that exact matches (no "*") have an exact
> string in the destination, too, so replacing the _entire_
> destination with "refs/prefetch/<remote>/*" breaks the refspec.
> I think the substring approach will still work here.

I wonder if that matters.  If the exact format says

	[remote "origin"]
	        url = git://git.kernel.org/pub/scm/git/git.git/
		fetch = +refs/heads/todo:refs/x/y/z

you can just add refs/prefetch/<remote>/ to the entire RHS to ensure
that (1) the prefetch would not affect anything outside the 'prefetch'
to break @{upstream} and friends, and (2) the prefetch from this remote
would not affect anything used for other remotes.

IOW, the RHS, as long as it is hidden from normal operations and
does not conflict with interaction with other repositories, where
exactly in the refs hierarchy these objects are "parked" does not
matter, I would think.

I do not recommend unparsed refspec and textually munging, by the
way.  Doesn't

	git fetch master:remotes/origin/master

first parse to normalize the src/dst sides to turn it into

	git fetch refs/heads/master:refs/remotes/origin/master

which is what you want to further redirect to the prefetch hierarchy

	git fetch +refs/heads/master:refs/prefetch/origin/refs/remotes/origin/master

or whatever munging scheme is used?

Also, I wonder if there should be a mechanism to prune the
prefetched refs, but that is totally outside the scope of the
problem we have been discussing in this thread.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: should git maintenance prefetch be taught to honor remote.fetch refspec?
  2021-04-02 21:19           ` Junio C Hamano
@ 2021-04-02 21:33             ` Tom Saeger
  2021-04-04 20:25             ` Derrick Stolee
  1 sibling, 0 replies; 27+ messages in thread
From: Tom Saeger @ 2021-04-02 21:33 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Derrick Stolee, git

On Fri, Apr 02, 2021 at 02:19:06PM -0700, Junio C Hamano wrote:
> Derrick Stolee <stolee@gmail.com> writes:
> 
> > The other issue is that exact matches (no "*") have an exact
> > string in the destination, too, so replacing the _entire_
> > destination with "refs/prefetch/<remote>/*" breaks the refspec.
> > I think the substring approach will still work here.
> 
> I wonder if that matters.  If the exact format says
> 
> 	[remote "origin"]
> 	        url = git://git.kernel.org/pub/scm/git/git.git/
> 		fetch = +refs/heads/todo:refs/x/y/z
> 
> you can just add refs/prefetch/<remote>/ to the entire RHS to ensure
> that (1) the prefetch would not affect anything outside the 'prefetch'
> to break @{upstream} and friends, and (2) the prefetch from this remote
> would not affect anything used for other remotes.
> 
> IOW, the RHS, as long as it is hidden from normal operations and
> does not conflict with interaction with other repositories, where
> exactly in the refs hierarchy these objects are "parked" does not
> matter, I would think.
> 
> I do not recommend unparsed refspec and textually munging, by the
> way.  Doesn't
> 
> 	git fetch master:remotes/origin/master
> 
> first parse to normalize the src/dst sides to turn it into
> 
> 	git fetch refs/heads/master:refs/remotes/origin/master
> 
> which is what you want to further redirect to the prefetch hierarchy
> 
> 	git fetch +refs/heads/master:refs/prefetch/origin/refs/remotes/origin/master
> 
> or whatever munging scheme is used?
> 
> Also, I wonder if there should be a mechanism to prune the
> prefetched refs, but that is totally outside the scope of the
> problem we have been discussing in this thread.

FWIW - did this locally to test:

git show-ref | grep -F "prefetch" | cut -d' ' -f2 | while read -r ref ; do git update-ref -d $ref; done
git maintenance run --task prefetch
git show-ref | egrep "refs/remotes" | wc -l
git show-ref | egrep "refs/prefetch" | wc -l

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: should git maintenance prefetch be taught to honor remote.fetch refspec?
  2021-04-02 21:07           ` Derrick Stolee
@ 2021-04-02 21:39             ` Tom Saeger
  2021-04-02 22:09               ` Junio C Hamano
  0 siblings, 1 reply; 27+ messages in thread
From: Tom Saeger @ 2021-04-02 21:39 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Junio C Hamano, git

On Fri, Apr 02, 2021 at 05:07:09PM -0400, Derrick Stolee wrote:
> On 4/2/2021 4:43 PM, Derrick Stolee wrote:
> > On 4/2/2021 2:27 PM, Tom Saeger wrote:
> >> generally isn't it still changing the right-hand side of refspec?
> >>
> >> replacing ":refs/" with ":refs/prefetch/"
> > 
> > Right, this substring replacement might be easiest to achieve. The
> > 'struct refspec' doesn't make it incredibly easy. Perhaps skipping
> > the refspec parsing and just doing that substring swap directly from
> > the config value might be the best approach.
> > 
> >> This would still work for refspecs with negative patterns right?
> > 
> > One of the issues is that negative patterns have no ":refs/"
> > substring.
> > 
> > The other issue is that exact matches (no "*") have an exact
> > string in the destination, too, so replacing the _entire_
> > destination with "refs/prefetch/<remote>/*" breaks the refspec.
> > I think the substring approach will still work here.
> 
> I updated my branch with the substring approach, which is
> probably the better solution. Please give it a try. I don't
> expect that change to help the FreeBSD build, but we will see.


This worked for all the scenarios I tried, which had both negatives and
multi remote fetch values.

Looks good!

Reviewed-by: Tom Saeger <tom.saeger@oracle.com>

>  
> > [1] https://github.com/gitgitgadget/git/pull/924
> > [2] https://github.com/gitgitgadget/git/pull/924/checks?check_run_id=2256079534
> 
> Thanks,
> -Stolee

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: should git maintenance prefetch be taught to honor remote.fetch refspec?
  2021-04-02 21:39             ` Tom Saeger
@ 2021-04-02 22:09               ` Junio C Hamano
  2021-04-02 22:27                 ` Tom Saeger
  0 siblings, 1 reply; 27+ messages in thread
From: Junio C Hamano @ 2021-04-02 22:09 UTC (permalink / raw)
  To: Tom Saeger; +Cc: Derrick Stolee, git

Tom Saeger <tom.saeger@oracle.com> writes:

> On Fri, Apr 02, 2021 at 05:07:09PM -0400, Derrick Stolee wrote:
>> On 4/2/2021 4:43 PM, Derrick Stolee wrote:
>> > On 4/2/2021 2:27 PM, Tom Saeger wrote:
>> >> generally isn't it still changing the right-hand side of refspec?
>> >>
>> >> replacing ":refs/" with ":refs/prefetch/"
>> > 
>> > Right, this substring replacement might be easiest to achieve. The
>> > 'struct refspec' doesn't make it incredibly easy. Perhaps skipping
>> > the refspec parsing and just doing that substring swap directly from
>> > the config value might be the best approach.
>> > 
>> >> This would still work for refspecs with negative patterns right?
>> > 
>> > One of the issues is that negative patterns have no ":refs/"
>> > substring.
>> > 
>> > The other issue is that exact matches (no "*") have an exact
>> > string in the destination, too, so replacing the _entire_
>> > destination with "refs/prefetch/<remote>/*" breaks the refspec.
>> > I think the substring approach will still work here.
>> 
>> I updated my branch with the substring approach, which is
>> probably the better solution. Please give it a try. I don't
>> expect that change to help the FreeBSD build, but we will see.
>
>
> This worked for all the scenarios I tried, which had both negatives and
> multi remote fetch values.
>
> Looks good!
>
> Reviewed-by: Tom Saeger <tom.saeger@oracle.com>

That sounds more like tested-by, but anyway, thanks for working well
together.



>
>>  
>> > [1] https://github.com/gitgitgadget/git/pull/924
>> > [2] https://github.com/gitgitgadget/git/pull/924/checks?check_run_id=2256079534
>> 
>> Thanks,
>> -Stolee

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: should git maintenance prefetch be taught to honor remote.fetch refspec?
  2021-04-02 22:09               ` Junio C Hamano
@ 2021-04-02 22:27                 ` Tom Saeger
  0 siblings, 0 replies; 27+ messages in thread
From: Tom Saeger @ 2021-04-02 22:27 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Derrick Stolee, git

On Fri, Apr 02, 2021 at 03:09:01PM -0700, Junio C Hamano wrote:
> Tom Saeger <tom.saeger@oracle.com> writes:
> 
> > On Fri, Apr 02, 2021 at 05:07:09PM -0400, Derrick Stolee wrote:
> >> On 4/2/2021 4:43 PM, Derrick Stolee wrote:
> >> > On 4/2/2021 2:27 PM, Tom Saeger wrote:
> >> >> generally isn't it still changing the right-hand side of refspec?
> >> >>
> >> >> replacing ":refs/" with ":refs/prefetch/"
> >> > 
> >> > Right, this substring replacement might be easiest to achieve. The
> >> > 'struct refspec' doesn't make it incredibly easy. Perhaps skipping
> >> > the refspec parsing and just doing that substring swap directly from
> >> > the config value might be the best approach.
> >> > 
> >> >> This would still work for refspecs with negative patterns right?
> >> > 
> >> > One of the issues is that negative patterns have no ":refs/"
> >> > substring.
> >> > 
> >> > The other issue is that exact matches (no "*") have an exact
> >> > string in the destination, too, so replacing the _entire_
> >> > destination with "refs/prefetch/<remote>/*" breaks the refspec.
> >> > I think the substring approach will still work here.
> >> 
> >> I updated my branch with the substring approach, which is
> >> probably the better solution. Please give it a try. I don't
> >> expect that change to help the FreeBSD build, but we will see.
> >
> >
> > This worked for all the scenarios I tried, which had both negatives and
> > multi remote fetch values.
> >
> > Looks good!
> >
> > Reviewed-by: Tom Saeger <tom.saeger@oracle.com>
> 
> That sounds more like tested-by, but anyway, thanks for working well
> together.

Tested-by: Tom Saeger <tom.saeger@oracle.com>

works for me, I did review the code but perhaps it's best to leave
reviews to others.

> 
> 
> 
> >
> >>  
> >> > [1] https://github.com/gitgitgadget/git/pull/924
> >> > [2] https://github.com/gitgitgadget/git/pull/924/checks?check_run_id=2256079534
> >> 
> >> Thanks,
> >> -Stolee

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: should git maintenance prefetch be taught to honor remote.fetch refspec?
  2021-04-02 20:43         ` Derrick Stolee
                             ` (2 preceding siblings ...)
  2021-04-02 21:19           ` Junio C Hamano
@ 2021-04-02 22:32           ` Eric Sunshine
  2021-04-03 20:21             ` Tom Saeger
  3 siblings, 1 reply; 27+ messages in thread
From: Eric Sunshine @ 2021-04-02 22:32 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Tom Saeger, Junio C Hamano, Git List

On Fri, Apr 2, 2021 at 4:43 PM Derrick Stolee <stolee@gmail.com> wrote:
> I have a branch available [1], but I'm seeing some failures only
> on FreeBSD [2] and I can't understand why that platform is failing
> this test. The current version (as of this writing) does not do
> the substring replacement technique, and hence it just gives up
> on exact matches. I will try the substring approach as an
> alternative and see where that gets me.
>
> [1] https://github.com/gitgitgadget/git/pull/924
> [2] https://github.com/gitgitgadget/git/pull/924/checks?check_run_id=2256079534

The "+" in patterns such as `+refs/heads/\\*:refs/prefetch...` is what
is throwing it off. FreeBSD `grep` doesn't seem to like it, though
it's not clear why. Escaping it the same way as you escaped "*"
doesn't make it work. Replacing "+" with catchall "." does work, so
that's one way to fix it.

However, all the escaping you need to do in these refspec patterns to
pass them to `grep` is ugly. A much better solution may be to change
the `grep` in test-lib-functions.sh:test_subcommand() to `grep -F` to
force it to match literally. That way, you can drop all the backslash
escaping, including those in front of "[" and "]". A cursory audit of
callers test_subcommand() seems to indicate that none of them pass
regex patterns, so using `-F` is probably safe and a good idea.

By the way, the `coccinelle` check is also "failing", correctly
suggesting that you change:

    strbuf_addf(&replace, ":refs/prefetch/");

to:

    strbuf_addstr(&replace, ":refs/prefetch/");

in `builtin/gc.c`.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: should git maintenance prefetch be taught to honor remote.fetch refspec?
  2021-04-02 22:32           ` Eric Sunshine
@ 2021-04-03 20:21             ` Tom Saeger
  2021-04-03 22:41               ` Derrick Stolee
  0 siblings, 1 reply; 27+ messages in thread
From: Tom Saeger @ 2021-04-03 20:21 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Derrick Stolee, Junio C Hamano, Git List

On Fri, Apr 02, 2021 at 06:32:56PM -0400, Eric Sunshine wrote:
> On Fri, Apr 2, 2021 at 4:43 PM Derrick Stolee <stolee@gmail.com> wrote:
> > I have a branch available [1], but I'm seeing some failures only
> > on FreeBSD [2] and I can't understand why that platform is failing
> > this test. The current version (as of this writing) does not do
> > the substring replacement technique, and hence it just gives up
> > on exact matches. I will try the substring approach as an
> > alternative and see where that gets me.
> >
> > [1] https://github.com/gitgitgadget/git/pull/924
> > [2] https://github.com/gitgitgadget/git/pull/924/checks?check_run_id=2256079534
> 
> The "+" in patterns such as `+refs/heads/\\*:refs/prefetch...` is what
> is throwing it off. FreeBSD `grep` doesn't seem to like it, though
> it's not clear why. Escaping it the same way as you escaped "*"
> doesn't make it work. Replacing "+" with catchall "." does work, so
> that's one way to fix it.
> 
> However, all the escaping you need to do in these refspec patterns to
> pass them to `grep` is ugly. A much better solution may be to change
> the `grep` in test-lib-functions.sh:test_subcommand() to `grep -F` to
> force it to match literally. That way, you can drop all the backslash
> escaping, including those in front of "[" and "]". A cursory audit of
> callers test_subcommand() seems to indicate that none of them pass
> regex patterns, so using `-F` is probably safe and a good idea.
> 
> By the way, the `coccinelle` check is also "failing", correctly
> suggesting that you change:
> 
>     strbuf_addf(&replace, ":refs/prefetch/");
> 
> to:
> 
>     strbuf_addstr(&replace, ":refs/prefetch/");
> 
> in `builtin/gc.c`.

Was curious - so rolled all Eric's feedback into a patch.

This passes 'make test' and 'make coccicheck' locally, not sure about FreeBSD.

Stolee - can you squash this onto your PR and try it if you agree?

--Tom


diff --git a/builtin/gc.c b/builtin/gc.c
index 98229bda25fd..eb4a03a2b47b 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -894,7 +894,7 @@ static int fetch_remote(struct remote *remote, void *cbdata)
 		if (index_of_refs) {
 			struct strbuf replace = STRBUF_INIT;
 			strbuf_add(&replace, raw, index_of_refs - raw);
-			strbuf_addf(&replace, ":refs/prefetch/");
+			strbuf_addstr(&replace, ":refs/prefetch/");
 			strbuf_addstr(&replace, index_of_refs + 6);
 
 			strvec_push(&child.args, replace.buf);
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 93ae93b73611..3366ea188782 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -142,8 +142,8 @@ test_expect_success 'prefetch multiple remotes' '
 	test_commit -C clone2 two &&
 	GIT_TRACE2_EVENT="$(pwd)/run-prefetch.txt" git maintenance run --task=prefetch 2>/dev/null &&
 	fetchargs="--prune --no-tags --no-write-fetch-head --recurse-submodules=no --refmap= --quiet" &&
-	test_subcommand git fetch remote1 $fetchargs +refs/heads/\\*:refs/prefetch/remotes/remote1/\\* <run-prefetch.txt &&
-	test_subcommand git fetch remote2 $fetchargs +refs/heads/\\*:refs/prefetch/remotes/remote2/\\* <run-prefetch.txt &&
+	test_subcommand git fetch remote1 $fetchargs +refs/heads/*:refs/prefetch/remotes/remote1/* <run-prefetch.txt &&
+	test_subcommand git fetch remote2 $fetchargs +refs/heads/*:refs/prefetch/remotes/remote2/* <run-prefetch.txt &&
 	test_path_is_missing .git/refs/remotes &&
 	git log prefetch/remotes/remote1/one &&
 	git log prefetch/remotes/remote2/two &&
@@ -169,12 +169,12 @@ test_expect_success 'prefetch custom refspecs' '
 	fetchargs="--prune --no-tags --no-write-fetch-head --recurse-submodules=no --refmap= --quiet" &&
 
 	# skips second refspec because it is not a pattern type
-	rs1="+refs/heads/\\*:refs/prefetch/remotes/remote1/\\*" &&
+	rs1="+refs/heads/*:refs/prefetch/remotes/remote1/*" &&
 	rs2="+refs/heads/special/fetched:refs/prefetch/heads/fetched" &&
 	rs3="^refs/heads/special/secret/not-fetched" &&
 
 	test_subcommand git fetch remote1 $fetchargs $rs1 $rs2 $rs3 <prefetch-refspec.txt &&
-	test_subcommand git fetch remote2 $fetchargs +refs/heads/\\*:refs/prefetch/remotes/remote2/\\* <prefetch-refspec.txt &&
+	test_subcommand git fetch remote2 $fetchargs +refs/heads/*:refs/prefetch/remotes/remote2/* <prefetch-refspec.txt &&
 
 	# first refspec is overridden by second
 	test_must_fail git rev-parse refs/prefetch/special/fetched &&
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 6348e8d7339c..e9a1cf3e227a 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -1652,9 +1652,9 @@ test_subcommand () {
 
 	if test -n "$negate"
 	then
-		! grep "\[$expr\]"
+		! grep -F "$expr"
 	else
-		grep "\[$expr\]"
+		grep -F "$expr"
 	fi
 }
 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: should git maintenance prefetch be taught to honor remote.fetch refspec?
  2021-04-03 20:21             ` Tom Saeger
@ 2021-04-03 22:41               ` Derrick Stolee
  0 siblings, 0 replies; 27+ messages in thread
From: Derrick Stolee @ 2021-04-03 22:41 UTC (permalink / raw)
  To: Tom Saeger, Eric Sunshine; +Cc: Junio C Hamano, Git List

On 4/3/2021 4:21 PM, Tom Saeger wrote:
> On Fri, Apr 02, 2021 at 06:32:56PM -0400, Eric Sunshine wrote:
>> On Fri, Apr 2, 2021 at 4:43 PM Derrick Stolee <stolee@gmail.com> wrote:
>>> I have a branch available [1], but I'm seeing some failures only
>>> on FreeBSD [2] and I can't understand why that platform is failing
>>> this test. The current version (as of this writing) does not do
>>> the substring replacement technique, and hence it just gives up
>>> on exact matches. I will try the substring approach as an
>>> alternative and see where that gets me.
>>>
>>> [1] https://github.com/gitgitgadget/git/pull/924
>>> [2] https://github.com/gitgitgadget/git/pull/924/checks?check_run_id=2256079534
>>
>> The "+" in patterns such as `+refs/heads/\\*:refs/prefetch...` is what
>> is throwing it off. FreeBSD `grep` doesn't seem to like it, though
>> it's not clear why. Escaping it the same way as you escaped "*"
>> doesn't make it work. Replacing "+" with catchall "." does work, so
>> that's one way to fix it.
>>
>> However, all the escaping you need to do in these refspec patterns to
>> pass them to `grep` is ugly. A much better solution may be to change
>> the `grep` in test-lib-functions.sh:test_subcommand() to `grep -F` to
>> force it to match literally. That way, you can drop all the backslash
>> escaping, including those in front of "[" and "]". A cursory audit of
>> callers test_subcommand() seems to indicate that none of them pass
>> regex patterns, so using `-F` is probably safe and a good idea.

Yes, this is an excellent idea. Thanks.

>> By the way, the `coccinelle` check is also "failing", correctly
>> suggesting that you change:
>>
>>     strbuf_addf(&replace, ":refs/prefetch/");
>>
>> to:
>>
>>     strbuf_addstr(&replace, ":refs/prefetch/");
>>
>> in `builtin/gc.c`.

Good point! I'll get that, too.

> Was curious - so rolled all Eric's feedback into a patch.
> 
> This passes 'make test' and 'make coccicheck' locally, not sure about FreeBSD.
> 
> Stolee - can you squash this onto your PR and try it if you agree?

I made my own version of a patch before seeing this one, but
thanks for working on it.
> diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
> index 6348e8d7339c..e9a1cf3e227a 100644
> --- a/t/test-lib-functions.sh
> +++ b/t/test-lib-functions.sh
> @@ -1652,9 +1652,9 @@ test_subcommand () {
>  
>  	if test -n "$negate"
>  	then
> -		! grep "\[$expr\]"
> +		! grep -F "$expr"
>  	else
> -		grep "\[$expr\]"
> +		grep -F "$expr"
>  	fi

Specifically I would use "[$expr]" here so it includes the
terminating characters of the JSON array that we are testing
against. It can be important that we have the complete set
of arguments, so these brackets are important.

Thanks, all. I'm taking a look at a mechanism for doing this
without munging the raw string directly, as Junio mentioned.
It might require adding an output format method in the refspec
API, but that's probably a good thing, anyway.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: should git maintenance prefetch be taught to honor remote.fetch refspec?
  2021-04-02 21:19           ` Junio C Hamano
  2021-04-02 21:33             ` Tom Saeger
@ 2021-04-04 20:25             ` Derrick Stolee
  2021-04-04 23:10               ` Junio C Hamano
  1 sibling, 1 reply; 27+ messages in thread
From: Derrick Stolee @ 2021-04-04 20:25 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Tom Saeger, git



On 4/2/2021 5:19 PM, Junio C Hamano wrote:
> Derrick Stolee <stolee@gmail.com> writes:
> 
>> The other issue is that exact matches (no "*") have an exact
>> string in the destination, too, so replacing the _entire_
>> destination with "refs/prefetch/<remote>/*" breaks the refspec.
>> I think the substring approach will still work here.
> 
> I wonder if that matters.  If the exact format says
> 
> 	[remote "origin"]
> 	        url = git://git.kernel.org/pub/scm/git/git.git/
> 		fetch = +refs/heads/todo:refs/x/y/z
> 
> you can just add refs/prefetch/<remote>/ to the entire RHS to ensure
> that (1) the prefetch would not affect anything outside the 'prefetch'
> to break @{upstream} and friends, and (2) the prefetch from this remote
> would not affect anything used for other remotes.
> 
> IOW, the RHS, as long as it is hidden from normal operations and
> does not conflict with interaction with other repositories, where
> exactly in the refs hierarchy these objects are "parked" does not
> matter, I would think.
> 
> I do not recommend unparsed refspec and textually munging, by the
> way.  Doesn't
> 
> 	git fetch master:remotes/origin/master
> 
> first parse to normalize the src/dst sides to turn it into
> 
> 	git fetch refs/heads/master:refs/remotes/origin/master

I do not see evidence of this being reflected in my testing. The
good news is that I'm creating a test scenario that will give us
a clear way to test what the refspec parsing (and new output
format logic) is doing.

Now, it might be that we are having an equivalent behavior to
what you are suggesting because of the way refs are interpreted
when those refs are created based on the refspec. This complicates
things slightly, and I can work to see if there is a good way to
modify the refspec parsing to make this an explicit assumption.

> which is what you want to further redirect to the prefetch hierarchy
> 
> 	git fetch +refs/heads/master:refs/prefetch/origin/refs/remotes/origin/master
> 
> or whatever munging scheme is used?

The replacement will use the existing refspec's patterns for
partitioning the refspace, except we add 'prefetch' and no longer
insert the remote name ourselves. This allows us to have shorter
refnames while respecting the given layout as much as possible.

> Also, I wonder if there should be a mechanism to prune the
> prefetched refs, but that is totally outside the scope of the
> problem we have been discussing in this thread.
 
Yes, this is a good idea. Prefetch refs are going to add the
"force" option (starts with '+') because these refs should be
updated automatically when the remote refs are force-pushed.
Having a mechanism for removing dropped refs is a good idea.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: should git maintenance prefetch be taught to honor remote.fetch refspec?
  2021-04-04 20:25             ` Derrick Stolee
@ 2021-04-04 23:10               ` Junio C Hamano
  2021-04-05 13:20                 ` Derrick Stolee
  0 siblings, 1 reply; 27+ messages in thread
From: Junio C Hamano @ 2021-04-04 23:10 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Tom Saeger, git

Derrick Stolee <stolee@gmail.com> writes:

>> I do not recommend unparsed refspec and textually munging, by the
>> way.  Doesn't
>> 
>> 	git fetch master:remotes/origin/master
>> 
>> first parse to normalize the src/dst sides to turn it into
>> 
>> 	git fetch refs/heads/master:refs/remotes/origin/master

I tried to jug my memory in this area a bit by reading the relevant
code.  For non-wildcard refspec, e.g. with

    [remote "origin"]
	url = ../git.git/
	fetch = master:remotes/origin/master
	tagopt = --no-tags

you'd get

    $ git fetch -v
    From ../git
     * [new branch]            master     -> origin/master
    $ git for-each-ref
    2e36527f23b7f6ae15e6f21ac3b08bf3fed6ee48 commit	refs/remotes/origin/master

It all happens inside remote.c::get_fetch_map(), I think.

Thanks.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: should git maintenance prefetch be taught to honor remote.fetch refspec?
  2021-04-04 23:10               ` Junio C Hamano
@ 2021-04-05 13:20                 ` Derrick Stolee
  2021-04-05 18:48                   ` Junio C Hamano
  0 siblings, 1 reply; 27+ messages in thread
From: Derrick Stolee @ 2021-04-05 13:20 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Tom Saeger, git

On 4/4/2021 7:10 PM, Junio C Hamano wrote:
> Derrick Stolee <stolee@gmail.com> writes:
>> On 4/2/2021 5:19 PM, Junio C Hamano wrote:
>>> I do not recommend unparsed refspec and textually munging, by the
>>> way.  Doesn't
>>>
>>> 	git fetch master:remotes/origin/master
>>>
>>> first parse to normalize the src/dst sides to turn it into
>>>
>>> 	git fetch refs/heads/master:refs/remotes/origin/master
> 
> I tried to jug my memory in this area a bit by reading the relevant
> code.  For non-wildcard refspec, e.g. with
> 
>     [remote "origin"]
> 	url = ../git.git/
> 	fetch = master:remotes/origin/master
> 	tagopt = --no-tags
> 
> you'd get
> 
>     $ git fetch -v
>     From ../git
>      * [new branch]            master     -> origin/master
>     $ git for-each-ref
>     2e36527f23b7f6ae15e6f21ac3b08bf3fed6ee48 commit	refs/remotes/origin/master
> 
> It all happens inside remote.c::get_fetch_map(), I think.

I see there that some refs are being matched with some
expectation of remote refs inside get_remote_ref() (which
calls find_ref_by_name_abbrev() as a helper). There does not
appear to be any place that modifies the refs directly in
general, and instead refs are matched from the refspec using
standard short-ref rules.

This is particularly shown in examples like "git push topic"
never modifying the push-refspec from having 'src' equal to
"topic" and instead the ref machinery discovers that "topic"
really means "refs/heads/topic".

I took your advice to not munge raw refpsecs and instead
worked directly on the 'dst' of the struct refspec_item. This
required adding a method that formats a refspec, which I test
carefully. I spent about an hour trying to have parse_refspec()
add the appropriate "refs/" prefix, but it becomes difficult
when we intend to make a distinction between "refs/heads/" and
other sub-areas within "refs/". Finally, I punted on that
conversion and made the logic in 'prefetch' extremely obvious:

1. If the refspec's 'dst' starts with "refs/", then replace
   that prefix with "refs/prefetch/".

2. If the refspec's 'dst' does not start with "refs/", then
   concatenate "refs/prefetch/" and 'dst'.

This will keep a roughly-equivalent partition of refs (some
might have previously collided that will not any more).

I have posted my patch series [1], so please take a look. It
builds up the infrastructure to properly test such a refspec
expansion, if we wish to do so.

[1] https://lore.kernel.org/git/pull.924.git.1617627856.gitgitgadget@gmail.com/

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: should git maintenance prefetch be taught to honor remote.fetch refspec?
  2021-04-05 13:20                 ` Derrick Stolee
@ 2021-04-05 18:48                   ` Junio C Hamano
  2021-04-05 20:38                     ` Tom Saeger
  0 siblings, 1 reply; 27+ messages in thread
From: Junio C Hamano @ 2021-04-05 18:48 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Tom Saeger, git

Derrick Stolee <stolee@gmail.com> writes:

> ... but it becomes difficult
> when we intend to make a distinction between "refs/heads/" and
> other sub-areas within "refs/". Finally, I punted on that
> conversion and made the logic in 'prefetch' extremely obvious:
>
> 1. If the refspec's 'dst' starts with "refs/", then replace
>    that prefix with "refs/prefetch/".
>
> 2. If the refspec's 'dst' does not start with "refs/", then
>    concatenate "refs/prefetch/" and 'dst'.
>
> This will keep a roughly-equivalent partition of refs (some
> might have previously collided that will not any more).

Makes sense.  Do we need to add another rule?

3. If the refspec does not have 'dst', ignore it.

> I have posted my patch series [1], so please take a look. It
> builds up the infrastructure to properly test such a refspec
> expansion, if we wish to do so.
>
> [1] https://lore.kernel.org/git/pull.924.git.1617627856.gitgitgadget@gmail.com/
>
> Thanks,
> -Stolee

Thanks.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: should git maintenance prefetch be taught to honor remote.fetch refspec?
  2021-04-05 18:48                   ` Junio C Hamano
@ 2021-04-05 20:38                     ` Tom Saeger
  2021-04-05 20:47                       ` Junio C Hamano
  0 siblings, 1 reply; 27+ messages in thread
From: Tom Saeger @ 2021-04-05 20:38 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Derrick Stolee, git

On Mon, Apr 05, 2021 at 11:48:56AM -0700, Junio C Hamano wrote:
> Derrick Stolee <stolee@gmail.com> writes:
> 
> > ... but it becomes difficult
> > when we intend to make a distinction between "refs/heads/" and
> > other sub-areas within "refs/". Finally, I punted on that
> > conversion and made the logic in 'prefetch' extremely obvious:
> >
> > 1. If the refspec's 'dst' starts with "refs/", then replace
> >    that prefix with "refs/prefetch/".
> >
> > 2. If the refspec's 'dst' does not start with "refs/", then
> >    concatenate "refs/prefetch/" and 'dst'.
> >
> > This will keep a roughly-equivalent partition of refs (some
> > might have previously collided that will not any more).
> 
> Makes sense.  Do we need to add another rule?
> 
> 3. If the refspec does not have 'dst', ignore it.

I just tried what I think you're saying here.

Consider this fetch config:

$ git config --local --get-regexp "pr-924"
remote.pr-924.url https://github.com/gitgitgadget/git
remote.pr-924.fetch +refs/tags/pr-924/derrickstolee/maintenance/refspec-v1

Seems legal, fetch even works

$ git fetch pr-924
Fetching pr-924
From https://github.com/gitgitgadget/git
 * tag                         pr-924/derrickstolee/maintenance/refspec-v1 -> FETCH_HEAD


Prefetch segfaults

$ git maintenance run --task prefetch
[1]    3811315 segmentation fault (core dumped)  git maintenance run --task prefetch


refspec which isn't a 'negative' with a src, but no dst.


Quick hack:

diff --git a/builtin/gc.c b/builtin/gc.c
index 92cb8b4e0bfa..2f4b8f2375c3 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -899,6 +899,10 @@ static int fetch_remote(struct remote *remote, void *cbdata)
                        continue;
                }

+               if (!rsi->dst) {
+                       continue;
+               }
+
                refspec_item_init(&replace, remote->fetch.raw[i], 1);

                /*


Better, but now a different issue:

$ git maintenance run --task prefetch
fatal: --refmap option is only meaningful with command-line refspec(s).
error: failed to prefetch remotes
error: task 'prefetch' failed

So a "remote" where the only fetch url is skipped.


> 
> > I have posted my patch series [1], so please take a look. It
> > builds up the infrastructure to properly test such a refspec
> > expansion, if we wish to do so.
> >
> > [1] https://lore.kernel.org/git/pull.924.git.1617627856.gitgitgadget@gmail.com/
> >
> > Thanks,
> > -Stolee
> 
> Thanks.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: should git maintenance prefetch be taught to honor remote.fetch refspec?
  2021-04-05 20:38                     ` Tom Saeger
@ 2021-04-05 20:47                       ` Junio C Hamano
  2021-04-05 20:49                         ` Derrick Stolee
  0 siblings, 1 reply; 27+ messages in thread
From: Junio C Hamano @ 2021-04-05 20:47 UTC (permalink / raw)
  To: Tom Saeger; +Cc: Derrick Stolee, git

Tom Saeger <tom.saeger@oracle.com> writes:

> $ git config --local --get-regexp "pr-924"
> remote.pr-924.url https://github.com/gitgitgadget/git
> remote.pr-924.fetch +refs/tags/pr-924/derrickstolee/maintenance/refspec-v1
>
> Seems legal, fetch even works

Yes.  For a ref that is one-shot use (like PR tags), this does not
make much sense, but

    [remote "submaintainer1"]
	url = ... repository of submaintainer #1 ...
	fetch = master
	tagopts = --no-tags

is a reasonable thing to have for those who regularly work with
submaintainer(s) of their project.  They'd do

	$ git pull submaintainer1

to accept the work their submaintainers have done.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: should git maintenance prefetch be taught to honor remote.fetch refspec?
  2021-04-05 20:47                       ` Junio C Hamano
@ 2021-04-05 20:49                         ` Derrick Stolee
  2021-04-05 20:50                           ` Tom Saeger
  0 siblings, 1 reply; 27+ messages in thread
From: Derrick Stolee @ 2021-04-05 20:49 UTC (permalink / raw)
  To: Junio C Hamano, Tom Saeger; +Cc: git

On 4/5/2021 4:47 PM, Junio C Hamano wrote:
> Tom Saeger <tom.saeger@oracle.com> writes:
> 
>> $ git config --local --get-regexp "pr-924"
>> remote.pr-924.url https://github.com/gitgitgadget/git
>> remote.pr-924.fetch +refs/tags/pr-924/derrickstolee/maintenance/refspec-v1
>>
>> Seems legal, fetch even works
> 
> Yes.  For a ref that is one-shot use (like PR tags), this does not
> make much sense, but
> 
>     [remote "submaintainer1"]
> 	url = ... repository of submaintainer #1 ...
> 	fetch = master
> 	tagopts = --no-tags
> 
> is a reasonable thing to have for those who regularly work with
> submaintainer(s) of their project.  They'd do
> 
> 	$ git pull submaintainer1
> 
> to accept the work their submaintainers have done.

Thanks for the extra testing! I'll be sure to fix this bug in v2.

-Stolee
 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: should git maintenance prefetch be taught to honor remote.fetch refspec?
  2021-04-05 20:49                         ` Derrick Stolee
@ 2021-04-05 20:50                           ` Tom Saeger
  2021-04-05 20:54                             ` Derrick Stolee
  0 siblings, 1 reply; 27+ messages in thread
From: Tom Saeger @ 2021-04-05 20:50 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Junio C Hamano, git

On Mon, Apr 05, 2021 at 04:49:29PM -0400, Derrick Stolee wrote:
> On 4/5/2021 4:47 PM, Junio C Hamano wrote:
> > Tom Saeger <tom.saeger@oracle.com> writes:
> > 
> >> $ git config --local --get-regexp "pr-924"
> >> remote.pr-924.url https://github.com/gitgitgadget/git
> >> remote.pr-924.fetch +refs/tags/pr-924/derrickstolee/maintenance/refspec-v1
> >>
> >> Seems legal, fetch even works
> > 
> > Yes.  For a ref that is one-shot use (like PR tags), this does not
> > make much sense, but
> > 
> >     [remote "submaintainer1"]
> > 	url = ... repository of submaintainer #1 ...
> > 	fetch = master
> > 	tagopts = --no-tags
> > 
> > is a reasonable thing to have for those who regularly work with
> > submaintainer(s) of their project.  They'd do
> > 
> > 	$ git pull submaintainer1
> > 
> > to accept the work their submaintainers have done.
> 
> Thanks for the extra testing! I'll be sure to fix this bug in v2.
> 
> -Stolee
>  

Hacked this:

diff --git a/builtin/gc.c b/builtin/gc.c
index 92cb8b4e0bfa..8c0fcbd3bb7e 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -879,6 +879,7 @@ static int fetch_remote(struct remote *remote, void *cbdata)
        struct maintenance_run_opts *opts = cbdata;
        struct child_process child = CHILD_PROCESS_INIT;
        int i;
+       int nargs;

        child.git_cmd = 1;
        strvec_pushl(&child.args, "fetch", remote->name, "--prune", "--no-tags",
@@ -888,6 +889,8 @@ static int fetch_remote(struct remote *remote, void *cbdata)
        if (opts->quiet)
                strvec_push(&child.args, "--quiet");

+       nargs = child.args.nr;
+
        for (i = 0; i < remote->fetch.nr; i++) {
                struct refspec_item replace;
                struct refspec_item *rsi = &remote->fetch.items[i];
@@ -899,6 +902,10 @@ static int fetch_remote(struct remote *remote, void *cbdata)
                        continue;
                }

+               if (!rsi->dst) {
+                       continue;
+               }
+
                refspec_item_init(&replace, remote->fetch.raw[i], 1);

                /*
@@ -920,6 +927,10 @@ static int fetch_remote(struct remote *remote, void *cbdata)
                refspec_item_clear(&replace);
        }

+       /* skip remote if no refspecs to fetch */
+       if (child.args.nr <= nargs)
+               return 0;
+
        return !!run_command(&child);
 }


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: should git maintenance prefetch be taught to honor remote.fetch refspec?
  2021-04-05 20:50                           ` Tom Saeger
@ 2021-04-05 20:54                             ` Derrick Stolee
  0 siblings, 0 replies; 27+ messages in thread
From: Derrick Stolee @ 2021-04-05 20:54 UTC (permalink / raw)
  To: Tom Saeger; +Cc: Junio C Hamano, git

On 4/5/2021 4:50 PM, Tom Saeger wrote:
> On Mon, Apr 05, 2021 at 04:49:29PM -0400, Derrick Stolee wrote:
>> On 4/5/2021 4:47 PM, Junio C Hamano wrote:
>>> Tom Saeger <tom.saeger@oracle.com> writes:
>>>
>>>> $ git config --local --get-regexp "pr-924"
>>>> remote.pr-924.url https://github.com/gitgitgadget/git
>>>> remote.pr-924.fetch +refs/tags/pr-924/derrickstolee/maintenance/refspec-v1
>>>>
>>>> Seems legal, fetch even works
>>>
>>> Yes.  For a ref that is one-shot use (like PR tags), this does not
>>> make much sense, but
>>>
>>>     [remote "submaintainer1"]
>>> 	url = ... repository of submaintainer #1 ...
>>> 	fetch = master
>>> 	tagopts = --no-tags
>>>
>>> is a reasonable thing to have for those who regularly work with
>>> submaintainer(s) of their project.  They'd do
>>>
>>> 	$ git pull submaintainer1
>>>
>>> to accept the work their submaintainers have done.
>>
>> Thanks for the extra testing! I'll be sure to fix this bug in v2.
>>
>> -Stolee
>>  
> 
> Hacked this:
>
> @@ -920,6 +927,10 @@ static int fetch_remote(struct remote *remote, void *cbdata)
>                 refspec_item_clear(&replace);
>         }
> 
> +       /* skip remote if no refspecs to fetch */
> +       if (child.args.nr <= nargs)
> +               return 0;
> +

This is a good approach, but we'll need to clean up the data in
'child' since run_command() no longer does it for us.

>         return !!run_command(&child);

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2021-04-05 20:54 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-01 18:49 should git maintenance prefetch be taught to honor remote.fetch refspec? Tom Saeger
2021-04-01 19:07 ` Derrick Stolee
2021-04-01 19:42   ` Tom Saeger
2021-04-01 20:14   ` Junio C Hamano
2021-04-01 22:11     ` Tom Saeger
2021-04-01 22:25     ` Derrick Stolee
2021-04-02 18:27       ` Tom Saeger
2021-04-02 20:43         ` Derrick Stolee
2021-04-02 21:07           ` Derrick Stolee
2021-04-02 21:39             ` Tom Saeger
2021-04-02 22:09               ` Junio C Hamano
2021-04-02 22:27                 ` Tom Saeger
2021-04-02 21:15           ` Tom Saeger
2021-04-02 21:19           ` Junio C Hamano
2021-04-02 21:33             ` Tom Saeger
2021-04-04 20:25             ` Derrick Stolee
2021-04-04 23:10               ` Junio C Hamano
2021-04-05 13:20                 ` Derrick Stolee
2021-04-05 18:48                   ` Junio C Hamano
2021-04-05 20:38                     ` Tom Saeger
2021-04-05 20:47                       ` Junio C Hamano
2021-04-05 20:49                         ` Derrick Stolee
2021-04-05 20:50                           ` Tom Saeger
2021-04-05 20:54                             ` Derrick Stolee
2021-04-02 22:32           ` Eric Sunshine
2021-04-03 20:21             ` Tom Saeger
2021-04-03 22:41               ` Derrick Stolee

git@vger.kernel.org list mirror (unofficial, one of many)

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://public-inbox.org/git
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V1 git git/ https://public-inbox.org/git \
		git@vger.kernel.org
	public-inbox-index git

Example config snippet for mirrors.
Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.version-control.git
	nntp://ou63pmih66umazou.onion/inbox.comp.version-control.git
	nntp://czquwvybam4bgbro.onion/inbox.comp.version-control.git
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.version-control.git
	nntp://news.gmane.io/gmane.comp.version-control.git
 note: .onion URLs require Tor: https://www.torproject.org/

code repositories for project(s) associated with this inbox:

	https://80x24.org/mirrors/git.git

AGPL code for this site: git clone https://public-inbox.org/public-inbox.git