git@vger.kernel.org mailing list mirror (one of many)
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Jonathan Nieder <jrnieder@gmail.com>
Cc: "Git Mailing List" <git@vger.kernel.org>,
	"Junio C Hamano" <gitster@pobox.com>,
	"Brandon Williams" <bmwill@google.com>,
	"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
Subject: Re: [PATCH v2] clone: add a --no-tags option to clone without tags
Date: Wed, 26 Apr 2017 10:56:28 +0200
Message-ID: <CACBZZX7u_1hifAHxNJU+WCkdk2+s63PV5F5dSx=M5azPE4Ra0A@mail.gmail.com>
In-Reply-To: <20170425224521.GM28740@aiede.svl.corp.google.com>

On Wed, Apr 26, 2017 at 12:45 AM, Jonathan Nieder <jrnieder@gmail.com> wrote:
> Hi,
>
> Ævar Arnfjörð Bjarmason wrote:
>
>> Add a --no-tags option to "git clone" to clone without tags. Currently
>> there's no easy way to clone a repository and end up with just a
>> "master" branch via --single-branch, or track all branches and no
>> tags. Now --no-tags can be added to "git clone" with or without
>> --single-branch to clone a repository without tags.
>
> Now I've read the discussion from v1, so you can see my thoughts
> evolving in real time. :)
>
> The above feels a bit misleading when it says "there's no easy way to
> clone a repository and end up with just a 'master' branch".
> --single-branch does exactly that.

I'll reword this, what I meant is "just a master branch [and no other
references]". Not "just the master branch [and no other branches]".

> Some annotated tags *pointing to
> its history* come along for the ride, but what harm are they doing?

I'll explain this in a bit more detail in the commit message & docs,
both you & Junio (in <xmqq1ssparom.fsf@gitster.mtv.corp.google.com>)
seem to be making the assumption that only getting & maintaining the
tags for the branch you're cloning is cheap.

This assumes a repo that while large, doesn't get a lot of releases on
its main branch. E.g. linux.git has ~650k commits, and ~500 v* tags,
that's a tag every 1300 commits or so.

Now if you run this on linux.git:

    $ git rev-list origin/master | parallel -j6 --progress 'test "$(echo {} | cut -b1)" = 0 && git tag -a -m"msg" test-tag-{}'

You'll end up with slightly more than one tag per 16 commits, and now
a lot of everyday commands become slow, because many of them need to
look at every ref before they start:

    $ (time (git log -1 >/dev/null)) 2>&1|grep ^real
    real    0m1.304s

Whereas on a linux.git without all those tags:

    $ (time (git log -1 >/dev/null)) 2>&1|grep ^real
    real    0m0.027s

And you can imagine what this does to some other commands, e.g. bash completion:

    $ git log <TAB>
    Display all 512 possibilities? (y or n)

vs.:

    $ git log <TAB>
    Display all 42129 possibilities? (y or n)
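
As a rough sanity check on the numbers above, here is a small sketch
of the arithmetic. It assumes the first hex digit of a commit hash is
uniformly distributed, so that the filter in the tagging command
selects about one commit in 16; the variable names are only
illustrative.

```shell
# Approximate linux.git commit count quoted earlier in this message.
commits=650000

# One in 16 commits has a hash starting with "0" (assumption: uniform hashes).
expected_new_tags=$((commits / 16))

# Difference between the two completion counts shown above.
extra_candidates=$((42129 - 512))

echo "expected new tags:           $expected_new_tags"
echo "extra completion candidates: $extra_candidates"
```

The two figures (40625 and 41617) are in the same ballpark, which is
consistent with "a bit more than" one tag per 16 commits.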

Furthermore, if upstream has high tag churn, i.e. creates lots of
tags but prunes them regularly, then even if you set tagOpts=--prune
you'll still end up with an ever slower local repository as you slowly
accumulate every tag upstream has ever created. To avoid that you'd
need to fetch with:

    $ git fetch origin --prune 'refs/tags/*:refs/tags/*'
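
The tag accumulation can be reproduced on a pair of throwaway
repositories. This is only a minimal sketch: the repository names
("upstream", "downstream"), the tag name and the demo identity are
made up for illustration, and it assumes fetch.pruneTags is not set
in your config.

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"

# Hypothetical upstream with one commit and one short-lived tag.
git init -q upstream
git -C upstream -c user.name=demo -c user.email=demo@example.com \
	commit -q --allow-empty -m initial
git -C upstream tag short-lived

# The clone picks up the tag; upstream then deletes it.
git clone -q upstream downstream
git -C upstream tag -d short-lived >/dev/null

# A plain pruning fetch only prunes refs covered by the configured
# refspec (the remote-tracking branches), so the tag survives locally.
git -C downstream fetch -q --prune origin
after_plain=$(git -C downstream tag -l short-lived)

# With the explicit tag refspec the stale tag is pruned as well.
git -C downstream fetch -q --prune origin 'refs/tags/*:refs/tags/*'
after_refspec=$(git -C downstream tag -l short-lived)

echo "after plain --prune:   '$after_plain'"
echo "after refspec --prune: '$after_refspec'"
```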

Simply never fetching the tags in the first place & making sure they
aren't fetched avoids all of this, and is perfect e.g. for the use
case of something that runs an automated "pull" on the repo to index
its code (I initially wrote this for a https://github.com/etsy/hound/
setup), or if you'd just like to run the likes of "git log" on the
master branch without starting that up taking a second longer.

> In other words, I think the commit message needs a bit more detail about
> the use case, to say why omitting those tags is useful.  The use case
> is probably sane but it is not explained.  A side effect (and my main
> motivation) is that this would make it crystal clear to people looking
> at the patch in history that it is talking about tags that are part of
> "master"'s history, not tags pointing elsewhere.

I'll add that.

>> Before this the only way of doing this was either by manually tweaking
>> the config in a fresh repository:
>
> Usually commit messages refer to the state of things without some
> patch using the present tense --- e.g. "Without this patch, this
> --no-tags option can be emulated by (1) manually tweaking the config
> in a fresh repository, or (2) by setting tagOpt=--no-tags after
> cloning and deleting any existing tags".
>
> [...]
>> Which of course was also subtly buggy if --branch was pointed at a
>> tag, leaving the user in a detached head:
>>
>>     git clone --single-branch --branch v2.12.0 git@github.com:git/git.git &&
>>     cd git &&
>>     git config remote.origin.tagOpt --no-tags &&
>>     git tag -l | xargs git tag -d
>
> At this point I lose the trail of thought.  I don't think it's
> important to understanding the patch.

I'm going to leave that in because anyone who needs this feature for a
similar use-case (which I'll explain in more detail), would need to do
exactly that to get a bug-compatible version of the same behavior if
they need to run on an older git version for whatever reason.
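
For reference, the emulation can be tried against a local throwaway
repository instead of git.git. This is a sketch under assumptions:
the repository names, the tag name and the demo identity are
hypothetical, and it covers only the plain --single-branch case, not
the detached-HEAD case with --branch pointing at a tag.

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"

# Hypothetical upstream with a single tagged commit.
git init -q upstream
git -C upstream -c user.name=demo -c user.email=demo@example.com \
	commit -q --allow-empty -m initial
git -C upstream tag v1.0

# Without --no-tags, the tag pointing into the cloned history comes along.
git clone -q --single-branch upstream emulated
tags_before=$(git -C emulated tag -l | wc -l)

# Emulate --no-tags: stop following tags, then delete the existing ones.
git -C emulated config remote.origin.tagOpt --no-tags
git -C emulated tag -l | xargs git -C emulated tag -d >/dev/null

# A later fetch no longer brings the tag back.
git -C emulated fetch -q
tags_after=$(git -C emulated tag -l | wc -l)

echo "tags right after clone: $tags_before"
echo "tags after emulation and fetch: $tags_after"
```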

>> Now all this complexity becomes the much simpler:
>>
>>     git clone --single-branch --no-tags git@github.com:git/git.git
>>
>> Or in the case of cloning a single tag "branch":
>>
>>     git clone --single-branch --branch v2.12.0 --no-tags git@github.com:git/git.git
>
> Nice.
>
> [...]
>>  Documentation/git-clone.txt | 14 ++++++++-
>>  builtin/clone.c             | 13 ++++++--
>>  t/t5612-clone-refspec.sh    | 73 +++++++++++++++++++++++++++++++++++++++++++--
>>  3 files changed, 95 insertions(+), 5 deletions(-)
>>
>> diff --git a/Documentation/git-clone.txt b/Documentation/git-clone.txt
>> index 30052cce49..57b3f478ed 100644
>> --- a/Documentation/git-clone.txt
>> +++ b/Documentation/git-clone.txt
>> @@ -13,7 +13,7 @@ SYNOPSIS
>>         [-l] [-s] [--no-hardlinks] [-q] [-n] [--bare] [--mirror]
>>         [-o <name>] [-b <name>] [-u <upload-pack>] [--reference <repository>]
>>         [--dissociate] [--separate-git-dir <git dir>]
>> -       [--depth <depth>] [--[no-]single-branch]
>> +       [--depth <depth>] [--[no-]single-branch] [--no-tags]
>
> Can I pass --tags to negate a previous --no-tags?

Yeah both --tags and --no-no-tags work as with every other OPT_BOOL
option. See "[RFC PATCH] parse-options: disallow double-negations of
options starting with no-".
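
A quick way to check both spellings on a throwaway repository is
sketched below. It assumes a git that already has the --no-tags
option from this patch; the repository names, the tag name and the
demo identity are made up for illustration.

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"

# Hypothetical upstream with a single tagged commit.
git init -q upstream
git -C upstream -c user.name=demo -c user.email=demo@example.com \
	commit -q --allow-empty -m initial
git -C upstream tag v1.0

# --no-tags alone: no tags are cloned and tagOpt is recorded in the config.
git clone -q --no-tags upstream without-tags
tags_without=$(git -C without-tags tag -l | wc -l)
tagopt=$(git -C without-tags config remote.origin.tagOpt)

# A later --tags negates the earlier --no-tags, so the tag is cloned again.
git clone -q --no-tags --tags upstream with-tags
tags_with=$(git -C with-tags tag -l | wc -l)

echo "--no-tags:        $tags_without tag(s), tagOpt=$tagopt"
echo "--no-tags --tags: $tags_with tag(s)"
```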

> [...]
>> +--no-tags::
>> +     Don't clone any tags, and set
>> +     `remote.<remote>.tagOpt=--no-tags` in the config, ensuring
>> +     that future `git pull` and `git fetch` operations won't follow
>> +     any tags. Subsequent explicit tag fetches will still work,
>> +     (see linkgit:git-fetch[1]).
>> ++
>> +Can be used in conjunction with `--single-branch` to clone & maintain
>
> nit: s/&/and/

Will fix.

> [...]
>> +test_expect_success 'clone with --no-tags' '
>> +     (
>> +             cd dir_all_no_tags && git fetch &&
>> +             git for-each-ref refs/tags >../actual
>
> nit: this would be easier to read with the 'cd' and 'git fetch' on
> separate lines.
>
> [...]
>> +test_expect_success '--single-branch while HEAD pointing at master and --no-tags' '
>> +     (
>> +             cd dir_master_no_tags && git fetch &&
>
> Likewise.

This was following the existing style in the file, but sure, I'll
prepend a patch to this series to fix all of that before building this
patch on top.

>> +             git for-each-ref refs/remotes/origin |
>> +             sed -e "/HEAD$/d" \
>> +                 -e "s|/remotes/origin/|/heads/|" >../actual
>
> Can $/ be expanded by the shell?

I think not, and if there's some issue with it it's obscure enough to
not have caused issues since 31b808a032 ("clone --single: limit the
fetch refspec to fetched branch", 2012-09-20) which introduced this
pattern earlier in the test file, I'm just copy/pasting similar setup
from elsewhere in the file.
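
A small check of that sed invocation outside the test suite, as a
sketch: the two ref names below are sample input standing in for the
"git for-each-ref" output, not taken from the test file. In common
shells such as bash and dash the "$/" inside double quotes is left
alone, because "/" cannot start a parameter name.

```shell
# Sample input standing in for the for-each-ref output.
refs='refs/remotes/origin/HEAD
refs/remotes/origin/master'

# Same sed expressions as in the test: drop the HEAD line and
# rewrite remote-tracking names to local branch names.
actual=$(printf '%s\n' "$refs" |
	sed -e "/HEAD$/d" \
	    -e "s|/remotes/origin/|/heads/|")

echo "$actual"    # prints: refs/heads/master
```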
