From: "SZEDER Gábor" <szeder.dev@gmail.com>
To: Anders Janmyr <anders@janmyr.com>
Cc: git@vger.kernel.org
Subject: Re: Possible bug in git describe, additional commits differs when cloned with --depth
Date: Fri, 27 Sep 2019 14:02:01 +0200
Message-ID: <20190927120201.GM2637@szeder.dev>
In-Reply-To: <CA+UvoT7EBa6S6Fi7scYTo8mYKx=n1e=sPvxy5TRP3vG7gw97Xw@mail.gmail.com>

On Fri, Sep 27, 2019 at 11:51:07AM +0200, Anders Janmyr wrote:
> Hi,
> 
> I'm not sure if this is a bug or not but `git describe` gives
> different results when the repo has been cloned with `--depth` or not.
> 
> In the example below from the git repository the number of additional
> commits since the last tag differs: 256 vs. 265.
> 
> ```
> $ git clone https://github.com/git/git
> $ cd git/
> $ git describe
> v2.23.0-256-g4c86140027
> $ git rev-list -n 1 HEAD
> 4c86140027f4a0d2caaa3ab4bd8bfc5ce3c11c8a
> 
> 
> $ git clone --depth=50 https://github.com/git/git git-depth
> $ cd git-depth/
> $ git describe
> v2.23.0-265-g4c861400
> $ git rev-list -n 1 HEAD
> 4c86140027f4a0d2caaa3ab4bd8bfc5ce3c11c8a
> ```

I don't think this is a bug, but rather an inherent limitation of
shallow histories with lots of merges, and it affects not only 'git
describe', but any limited history traversal.

In the Git project new features are developed on their dedicated
branches, which are then eventually merged to 'master'.  Alas, we make
mistakes, and sometimes we realize that a feature was buggy after it
has already been merged to 'master'.  In such cases the bugfix is
often applied not on top of 'master', but on top of the feature
branch, so it can be merged to maintenance releases as well.

This results in a history like this:

  M2     Merge the bugfix to 'master'
  |  \
  |   \
 v2.0  |
  |    o  Bugfix for new feature
 CO2   |
  |    |
  M1  /  Merge 'new feature' to 'master'
  | \/
  |  o   new feature
  |  |
  |  o
  |  |
  | CO1
  |  |    
  | /
 v1.0

Describing M2 in a full repository results in something like
v2.0-2-gdeadbeef, because M2 contains only two commits that aren't in
v2.0 (M2 itself and the bugfix).

Now let's suppose that in a shallow repo the given '--depth=<N>'
resulted in a cutoff at commits CO1 and CO2, meaning that the shallow
repo does not include commits M1 and v1.0.  Consequently, Git can't
possibly see that the commits implementing the new feature are already
merged and thus reachable from v2.0, so it will count those three
commits in addition to M2 and the bugfix, resulting in v2.0-5-gabcdef.
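
To see the effect on a throwaway repository, here is a minimal sketch
that rebuilds a history shaped like the one above out of empty commits
and compares 'git rev-list --count' in the full repository and in a
shallow clone.  The directory names, commit messages and the 'c' helper
are made up for the demo.  Note that '--depth=3' on this small graph
cuts at CO2 and 'new feature' rather than at CO1 and CO2, so I would
expect the shallow count to be 3 instead of 5; the mechanism is the
same.

```shell
#!/bin/sh
# Sketch only: toy history made of empty commits, names are hypothetical.
set -e

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

tmp=$(mktemp -d)
cd "$tmp"

git init -q full
cd full
# Make sure the first branch is called 'master' regardless of config.
git symbolic-ref HEAD refs/heads/master

# Helper: create an empty commit with the given message.
c() { git commit -q --allow-empty -m "$1"; }

c "v1.0"
git tag -a -m v1.0 v1.0

# Feature branch forked from v1.0.
git checkout -q -b feature
c "CO1"
c "o"
c "new feature"

# M1: merge the new feature to 'master'.
git checkout -q master
git merge -q --no-ff -m "M1: merge new feature" feature
c "CO2"
c "v2.0"
git tag -a -m v2.0 v2.0

# The bugfix is applied on top of the feature branch, not on 'master'.
git checkout -q feature
c "bugfix for new feature"

# M2: merge the bugfix to 'master'.
git checkout -q master
git merge -q --no-ff -m "M2: merge the bugfix" feature

# Remember the tagged commit by object name, so the comparison below
# does not depend on the tag being present in the shallow clone.
v2=$(git rev-parse 'v2.0^{commit}')
full_count=$(git rev-list --count "$v2"..master)

cd ..
# A file:// URL is needed; --depth is ignored for plain local paths.
git clone -q --depth=3 "file://$tmp/full" shallow
cd shallow
shallow_count=$(git rev-list --count "$v2"..HEAD)

echo "full: $full_count  shallow: $shallow_count"
```

The full repository sees that the feature commits are already part of
v2.0 through M1; the shallow clone has no M1, so 'new feature' looks
like it came in only with M2.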

There is a lot more going on in the Git repository, so it's not as
simple as above.  Case in point is the merge d1a251a1fa (Merge branch
'en/checkout-mismerge-fix', 2019-09-09), which merges a fix for a bug
that happened before v2.22.0-rc0; that bug was introduced not in the
feature branch, but while merging that branch to 'master'.  The
result is still the same, though: since there are a lot of commits on
the ancestry path between that buggy merge and v2.23.0, '--depth=50'
doesn't include them all in the shallow clone, so Git can't possibly
know that the buggy merge is reachable from v2.23.0.

  # same in both the full and shallow repos
  $ git log --oneline v2.23.0..d1a251a1fa^ |wc -l
  57

  # in the full repo
  $ git log --oneline v2.23.0..d1a251a1fa |wc -l
  59

  # in the shallow repo
  $ git log --oneline v2.23.0..d1a251a1fa |wc -l
  132
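
If such counts matter to a script, it could check whether it is
running in a shallow repository and deepen it before trusting them.
A minimal sketch on a throwaway repository, assuming a reasonably
recent Git that knows 'git rev-parse --is-shallow-repository'; the
directory names are made up for the demo:

```shell
#!/bin/sh
# Sketch only: detect a shallow clone and fetch the missing history.
set -e

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

tmp=$(mktemp -d)
cd "$tmp"

# A tiny upstream with three commits in a row.
git init -q upstream
(
  cd upstream
  for i in 1 2 3; do
    git commit -q --allow-empty -m "commit $i"
  done
)

# Shallow clone that only contains the tip commit.
git clone -q --depth=1 "file://$tmp/upstream" work
cd work

before=$(git rev-parse --is-shallow-repository)
count_before=$(git rev-list --count HEAD)

# Fetch the rest of the history if the clone is shallow.
if [ "$before" = true ]; then
  git fetch -q --unshallow
fi

after=$(git rev-parse --is-shallow-repository)
count_after=$(git rev-list --count HEAD)

echo "shallow before: $before ($count_before commits visible)"
echo "shallow after:  $after ($count_after commits visible)"
```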

> In the example above the first version also gives additional digits
> for the SHA, g4c86140027 vs. g4c861400, but that is not always the case.

Git shows as many hex digits as needed to form a unique object name,
plus a few additional digits' worth of safety margin.  There are a lot
more objects in the full repository than in the shallow clone, which
means more hex digits in the abbreviated object name.
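
If the length of the abbreviation needs to be the same everywhere, it
can be requested explicitly instead of being left to Git's estimate.
A minimal sketch in a throwaway repository (the directory name is made
up); 'git describe' accepts '--abbrev=<n>' in the same spirit:

```shell
#!/bin/sh
# Sketch only: compare the automatic abbreviation with a fixed length.
set -e

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

tmp=$(mktemp -d)
cd "$tmp"
git init -q repo
cd repo
git commit -q --allow-empty -m "only commit"

full=$(git rev-parse HEAD)

# Length chosen by Git, depends on the repository and on configuration.
auto=$(git rev-parse --short HEAD)

# At least ten hex digits, regardless of how many objects there are.
fixed=$(git rev-parse --short=10 HEAD)

# There are no tags here, so --always falls back to the abbreviated
# object name, using the requested abbreviation length.
described=$(git describe --always --abbrev=10)

echo "full:      $full"
echo "auto:      $auto"
echo "fixed:     $fixed"
echo "described: $described"
```

The requested length is a minimum: Git still adds digits when they are
needed to keep the name unique.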


Thanks for letting us know.  I think this is worth a warning in
the documentation of 'git clone --depth'.


Thread overview:
2019-09-27  9:51 Possible bug in git describe, additional commits differs when cloned with --depth Anders Janmyr
2019-09-27 12:02 ` SZEDER Gábor [this message]
2019-10-08 12:31   ` [BUG] a commit date-related bug in 'git describe' [was: Re: Possible bug in git describe, additional commits differs when cloned with --depth] SZEDER Gábor
