git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Nelson Elhage <nelhage@nelhage.com>
To: git@vger.kernel.org
Subject: git clone --shallow-since can result in inconsistent shallow clones
Date: Tue, 25 Aug 2020 11:38:04 -0700	[thread overview]
Message-ID: <CAPSG9dZV2EPpVKkOMcjv5z+NF7rUu=V-ZkZNx47rCv122HsiKg@mail.gmail.com> (raw)

Thank you for filling out a Git bug report!
Please answer the following questions to help us understand your issue.

What did you do before the bug happened? (Steps to reproduce your issue)

I ran

  git clone --shallow-since="1548454011" "https://github.com/abseil/abseil-cpp"

to produce a shallow clone of abseil-cpp.git, with the aim of going
deep enough to grab commit `5e0dcf72c64fae912184d2e0de87195fe8f0a425`,
which I know to have a commit date of `1548454011`.

What did you expect to happen? (Expected behavior)

- I expected the command to produce a valid shallow git clone.
- I further expected the repository to include commit
  5e0dcf72c64fae912184d2e0de87195fe8f0a425, which has a commit date <=
  the provided `--shallow`, as do all of its descendants up to the
  `master` branch

What happened instead? (Actual behavior)

- The clone command produced an inconsistent shallow clone. In the
repository I see:

    $ cat .git/shallow
    5e0dcf72c64fae912184d2e0de87195fe8f0a425
    89ea0c5ff34aaa5855cfc7aa41f323b8a0ef0ede

But commit `5e0dcf72c64fae912184d2e0de87195fe8f0a425` is missing. An
attempt to `git fetch --unshallow` errors out, because the server
sends an `unshallow 5e0dcf72c64fae912184d2e0de87195fe8f0a425`, which
we are unable to execute since we're missing that object.

That object is also the specific one I mentioned above that I wanted.

What's different between what you expected and what actually happened?

Anything else you want to add:

The problem here is triggered by passing a `shallow-since` that lies
*between* the first and and second parents of a merge commit that
itself is on the first-parent spine. If we examine the relevant
portion of `abseil-cpp.git`'s history, we find:

    $ git --no-pager log --format='%h %ct' --graph
89ea0c5ff34aaa5855cfc7aa41f323b8a0ef0ede~6..89ea0c5ff34aaa5855cfc7aa41f323b8a0ef0ede
    *   89ea0c5 1548698816      # WANT
    |\
    | * 7ec3270 1548194022      # WANT
    * | 5e0dcf7 1548454011      # WANT
    * | 0dffca4 1548346230      # DON'T WANT
    * | 6b4201f 1548261751      # DON'T WANT
    |/
    * 0b1e6d4 1547838308        # DON'T WANT
    * efccc50 1547753737        # DON'T WANT

I've annotated the commits with WANT or DON'T WONT based on whether or
not their commit time is included by the `--shallow-since` filter.

What is happening, I believe, is that we are marking 89ea0c5 as
shallow, since its first parent is unwanted. However, marking it
shallow causes pack generation to ignore _all_ of its parents,
including 7ec3270, which we _do_ want. This results in the
inconsistent state where we mark `5e0dcf7` as shallow (and send the
`shallow` line), but don't send the actual object.

It's unfortunately a bit unclear to me what _should_ happen here. We
really want a way to mark `89ea0c5` as "partially-shallow", and send
its second parent, but not its first parent, but shallowness is a
property of an entire commit, not of a specific commit/parent
relationship. However, it'd be nice if we at least ended up with a
consistent state, instead of with a repository with invalid `shallow`
marks.

Please review the rest of the bug report below.
You can delete any lines you don't wish to share.


[System Info]
git version:
git version 2.28.0.461.g40977abb40
cpu: x86_64
built from commit: 40977abb4059c11004726852a79df64f4553944d
sizeof-long: 8
sizeof-size_t: 8
shell-path: /bin/sh
uname: Linux 5.4.0-42-generic #46-Ubuntu SMP Fri Jul 10 00:24:02 UTC 2020 x86_64
compiler info: gnuc: 9.3
libc info: glibc: 2.31
$SHELL (typically, interactive shell): /bin/bash


[Enabled Hooks]

- Nelson Elhage

             reply	other threads:[~2020-08-25 18:38 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-25 18:38 Nelson Elhage [this message]
2020-08-25 19:51 ` git clone --shallow-since can result in inconsistent shallow clones Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAPSG9dZV2EPpVKkOMcjv5z+NF7rUu=V-ZkZNx47rCv122HsiKg@mail.gmail.com' \
    --to=nelhage@nelhage.com \
    --cc=git@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).