git@vger.kernel.org mailing list mirror (one of many)
* git clone --shallow-since can result in inconsistent shallow clones
@ 2020-08-25 18:38 Nelson Elhage
  2020-08-25 19:51 ` Jeff King
  0 siblings, 1 reply; 2+ messages in thread
From: Nelson Elhage @ 2020-08-25 18:38 UTC
  To: git

Thank you for filling out a Git bug report!
Please answer the following questions to help us understand your issue.

What did you do before the bug happened? (Steps to reproduce your issue)

I ran

  git clone --shallow-since="1548454011" "https://github.com/abseil/abseil-cpp"

to produce a shallow clone of abseil-cpp.git, with the aim of going
deep enough to grab commit `5e0dcf72c64fae912184d2e0de87195fe8f0a425`,
which I know to have a commit date of `1548454011`.
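
For reference, the commit date used as the cutoff can be read from an
existing full clone. A minimal sketch, assuming such a clone is
available locally; the helper name `commit_time` is made up for
illustration and is not a git command:

```shell
# Print the committer timestamp (Unix epoch seconds) of a revision.
# Usage: commit_time <repo-dir> <rev>
commit_time() {
    # -s suppresses the diff; %ct is the committer date as a Unix timestamp
    git -C "$1" show -s --format=%ct "$2"
}
```

Running `commit_time abseil-cpp 5e0dcf72c64fae912184d2e0de87195fe8f0a425`
against a full clone should, per the commit date stated above, print
`1548454011`.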

What did you expect to happen? (Expected behavior)

- I expected the command to produce a valid shallow git clone.
- I further expected the repository to include commit
  5e0dcf72c64fae912184d2e0de87195fe8f0a425, which has a commit date >=
  the provided `--shallow-since` value, as do all of its descendants up
  to the `master` branch.

What happened instead? (Actual behavior)

- The clone command produced an inconsistent shallow clone. In the
repository I see:

    $ cat .git/shallow
    5e0dcf72c64fae912184d2e0de87195fe8f0a425
    89ea0c5ff34aaa5855cfc7aa41f323b8a0ef0ede

But commit `5e0dcf72c64fae912184d2e0de87195fe8f0a425` is missing. An
attempt to `git fetch --unshallow` errors out, because the server
sends an `unshallow 5e0dcf72c64fae912184d2e0de87195fe8f0a425`, which
we are unable to execute since we're missing that object.

That object is also the specific one I mentioned above that I wanted.
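
This kind of inconsistency can be detected mechanically by checking
that every commit listed in `.git/shallow` is actually present in the
object database. A rough sketch, assuming a non-bare repository with the
default `.git` layout; `missing_shallow_commits` is a hypothetical
helper, not a git command:

```shell
# Print every hash listed in <repo>/.git/shallow whose object is absent.
# Returns non-zero if at least one listed commit is missing.
# Usage: missing_shallow_commits <repo-dir>
missing_shallow_commits() {
    repo=$1
    shallow_file="$repo/.git/shallow"
    status=0
    [ -f "$shallow_file" ] || return 0   # no shallow file: nothing to check
    while read -r hash; do
        [ -n "$hash" ] || continue       # skip blank lines
        # cat-file -e exits non-zero when the object does not exist
        if ! git -C "$repo" cat-file -e "$hash" 2>/dev/null; then
            echo "$hash"
            status=1
        fi
    done < "$shallow_file"
    return "$status"
}
```

In the clone described above, this would be expected to report
`5e0dcf72c64fae912184d2e0de87195fe8f0a425`, since that commit is listed
in `.git/shallow` but its object was never received.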

What's different between what you expected and what actually happened?

Anything else you want to add:

The problem here is triggered by passing a `shallow-since` that lies
*between* the first and second parents of a merge commit that
itself is on the first-parent spine. If we examine the relevant
portion of `abseil-cpp.git`'s history, we find:

    $ git --no-pager log --format='%h %ct' --graph \
        89ea0c5ff34aaa5855cfc7aa41f323b8a0ef0ede~6..89ea0c5ff34aaa5855cfc7aa41f323b8a0ef0ede
    *   89ea0c5 1548698816      # WANT
    |\
    | * 7ec3270 1548194022      # WANT
    * | 5e0dcf7 1548454011      # WANT
    * | 0dffca4 1548346230      # DON'T WANT
    * | 6b4201f 1548261751      # DON'T WANT
    |/
    * 0b1e6d4 1547838308        # DON'T WANT
    * efccc50 1547753737        # DON'T WANT

I've annotated the commits with WANT or DON'T WANT based on whether or
not their commit time is included by the `--shallow-since` filter.

What is happening, I believe, is that we are marking 89ea0c5 as
shallow, since its first parent is unwanted. However, marking it
shallow causes pack generation to ignore _all_ of its parents,
including 7ec3270, which we _do_ want. This results in the
inconsistent state where we mark `5e0dcf7` as shallow (and send the
`shallow` line), but don't send the actual object.

It's unfortunately a bit unclear to me what _should_ happen here. We
really want a way to mark `89ea0c5` as "partially-shallow", and send
its second parent, but not its first parent, but shallowness is a
property of an entire commit, not of a specific commit/parent
relationship. However, it'd be nice if we at least ended up with a
consistent state, instead of with a repository with invalid `shallow`
marks.

Please review the rest of the bug report below.
You can delete any lines you don't wish to share.


[System Info]
git version:
git version 2.28.0.461.g40977abb40
cpu: x86_64
built from commit: 40977abb4059c11004726852a79df64f4553944d
sizeof-long: 8
sizeof-size_t: 8
shell-path: /bin/sh
uname: Linux 5.4.0-42-generic #46-Ubuntu SMP Fri Jul 10 00:24:02 UTC 2020 x86_64
compiler info: gnuc: 9.3
libc info: glibc: 2.31
$SHELL (typically, interactive shell): /bin/bash


[Enabled Hooks]

- Nelson Elhage

* Re: git clone --shallow-since can result in inconsistent shallow clones
  2020-08-25 18:38 git clone --shallow-since can result in inconsistent shallow clones Nelson Elhage
@ 2020-08-25 19:51 ` Jeff King
  0 siblings, 0 replies; 2+ messages in thread
From: Jeff King @ 2020-08-25 19:51 UTC
  To: Nelson Elhage; +Cc: git

On Tue, Aug 25, 2020 at 11:38:04AM -0700, Nelson Elhage wrote:

> It's unfortunately a bit unclear to me what _should_ happen here. We
> really want a way to mark `89ea0c5` as "partially-shallow", and send
> its second parent, but not its first parent, but shallowness is a
> property of an entire commit, not of a specific commit/parent
> relationship. However, it'd be nice if we at least ended up with a
> consistent state, instead of with a repository with invalid `shallow`
> marks.

I think this is the same issue I reported recently here:

  https://lore.kernel.org/git/20200721160643.GA3288097@coredump.intra.peff.net/

AFAICT the shallow feature is just defective and can't accurately
represent this situation. Unfortunately nobody seemed to have any bright
ideas, and the developer who implemented most of the shallow features
(including shallow-since) is no longer active.

So I suspect it is fixable, but probably requires somebody to get pretty
familiar with the shallow code, and propose a fix that involves both
code and a protocol change.

-Peff
