git@vger.kernel.org mailing list mirror (one of many)
* [BUG?] shallow-since can't handle merges
From: Jeff King @ 2020-07-21 16:06 UTC
  To: git

I came across an interesting case where the shallow code doesn't do
the right thing. From my reading, it doesn't look like it could possibly
work, but I also don't understand the shallow feature very well, so here
we are. :)

To reproduce:

  # make a copy of the "upstream" repo; you can also fetch directly from
  # it, but having it locally lets you poke at both sides of the
  # conversation
  git clone --bare https://github.com/DefinitelyTyped/DefinitelyTyped upstream.git

  # now make a local copy of a related repo
  git clone --bare --depth=1 https://github.com/Maxim-Mazurok/DefinitelyTyped --no-single-branch repo.git
  cd repo.git

  # and then try to deepen down to this date. This is early morning on
  # May 9 2020.
  git remote add upstream ../upstream.git
  git fetch upstream master --shallow-since=1589000000

The resulting history you get is cut off at May 12, even though there
are parent commits between May 9 and May 12 that should be included.
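
To double-check what the raw timestamp means, here is a minimal sketch
using only Python's standard library. It is an illustration, not one of
the reproduction steps; the variable name is arbitrary.

```python
from datetime import datetime, timezone

# The value passed to --shallow-since in the fetch command above.
cutoff = datetime.fromtimestamp(1589000000, tz=timezone.utc)

print(cutoff.isoformat())  # 2020-05-09T04:53:20+00:00
```

So anything committed at or after 04:53:20 UTC on May 9 2020 should be
considered new enough.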

Here's what the graph in upstream.git looks like:

  $ git log --all --graph --format='%cd %h' --date=iso-local
  * 2020-07-21 13:31:19 +0000 3e7ef84b0a
  * 2020-07-21 02:15:25 +0000 97f6a9df78
  [...]
  * 2020-05-12 23:57:33 +0000 9a0645f52a
  *   2020-05-12 23:55:43 +0000 c7883f4581
  |\
  | * 2020-04-28 17:17:56 +0000 a4b68cbe3a
  * |   2020-05-12 23:50:34 +0000 9d282b0d7d
  |\ \
  | * | 2020-05-05 23:53:20 +0000 daf04e0dcb
  | * | 2020-05-05 23:53:18 +0000 4b2129de54
  * | | 2020-05-12 22:56:55 +0000 c599db54c9
  * | | 2020-05-12 22:04:29 +0000 5bf87114a9
  [...]
  * | | 2020-05-09 09:25:35 +0000 f258742372
  * | | 2020-05-09 08:55:55 +0000 9c96a1cd72
  * | | 2020-05-08 19:55:36 +0000 77f55bc175
  [...]

So the actual "bottom" that we should send is 9c96a1cd72. But the
history we get only goes down to c7883f4581. The tricky thing, though,
is that c7883f4581 is a merge. Its first parent is still something we
want to send, but its second parent (a4b68cbe3a, from April 28) is too
old. The merge at 9d282b0d7d has the same problem: its first parent is
recent, but its second parent (daf04e0dcb, from May 5) is too old.

I walked through upload-pack in a debugger, and its procedure looks
sane. It does a revision walk with --max-age to come up with a list of
"not shallow" commits, which include all three of those, as well as the
commits in between. And then for each not-shallow commit, it sees if any
parents weren't visited, in which case we know we have a boundary. So
that commit gets marked as "shallow", to indicate that it will have its
parents truncated. In this case, all three of the commits I mentioned
get this. And indeed, if we use GIT_TRACE_PACKET, we can see that it
mentions them (and only them):

  packet:  upload-pack> shallow c7883f4581688909baf65c1219a8e842e369150b
  packet:  upload-pack> shallow 9d282b0d7d2e05d7324e70afa66c4ff1306e67b7
  packet:  upload-pack> shallow 9c96a1cd729d0209ba696f225b42d8aafd27295b
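
The procedure described above can be modeled on the graph excerpt with a
small Python sketch. This is a toy model of the description, not
upload-pack's actual code. Assumptions: the walk starts at 9a0645f52a
rather than the real tip, the elided "[...]" commits are collapsed so
that 5bf87114a9 points straight at f258742372, and the function names
are made up for illustration.

```python
from datetime import datetime, timezone

def utc(text):
    """Parse a 'YYYY-MM-DD HH:MM:SS' string as a UTC timestamp."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)

# commit -> (committer date, parents), copied from the graph excerpt.
# Assumption: elided commits are collapsed and the parents of the
# oldest commits shown are left out.
COMMITS = {
    "9a0645f52a": (utc("2020-05-12 23:57:33"), ["c7883f4581"]),
    "c7883f4581": (utc("2020-05-12 23:55:43"), ["9d282b0d7d", "a4b68cbe3a"]),
    "a4b68cbe3a": (utc("2020-04-28 17:17:56"), []),
    "9d282b0d7d": (utc("2020-05-12 23:50:34"), ["c599db54c9", "daf04e0dcb"]),
    "daf04e0dcb": (utc("2020-05-05 23:53:20"), ["4b2129de54"]),
    "4b2129de54": (utc("2020-05-05 23:53:18"), []),
    "c599db54c9": (utc("2020-05-12 22:56:55"), ["5bf87114a9"]),
    "5bf87114a9": (utc("2020-05-12 22:04:29"), ["f258742372"]),
    "f258742372": (utc("2020-05-09 09:25:35"), ["9c96a1cd72"]),
    "9c96a1cd72": (utc("2020-05-09 08:55:55"), ["77f55bc175"]),
    "77f55bc175": (utc("2020-05-08 19:55:36"), []),
}
CUTOFF = datetime.fromtimestamp(1589000000, tz=timezone.utc)

def walk_not_shallow(tip):
    """Step 1: collect commits reachable from tip that are new enough."""
    visited, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        date, parents = COMMITS[commit]
        if commit in visited or date < CUTOFF:
            continue  # already seen, or too old to be included
        visited.add(commit)
        stack.extend(parents)
    return visited

def mark_shallow(visited):
    """Step 2: a visited commit with any unvisited parent is a boundary."""
    return {c for c in visited
            if any(p not in visited for p in COMMITS[c][1])}

not_shallow = walk_not_shallow("9a0645f52a")
shallow = mark_shallow(not_shallow)
print(sorted(shallow))  # ['9c96a1cd72', '9d282b0d7d', 'c7883f4581']
```

Under those assumptions the model marks the same three commits as the
trace does.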

But here's the thing. We _don't_ want to mark those first two merges as
fully shallow. We only want to ignore their second parents, but let them
continue to traverse the first parents.
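
To make the difference concrete, here is a hedged Python sketch that
compares the two interpretations on the same graph excerpt. The
"per-parent cut" is purely hypothetical; nothing here claims that git or
the protocol can express it. Assumptions: elided commits are collapsed,
the walk starts at 9a0645f52a, and all names are illustrative.

```python
# commit -> parents, from the graph excerpt (elided commits collapsed).
PARENTS = {
    "9a0645f52a": ["c7883f4581"],
    "c7883f4581": ["9d282b0d7d", "a4b68cbe3a"],
    "a4b68cbe3a": [],
    "9d282b0d7d": ["c599db54c9", "daf04e0dcb"],
    "daf04e0dcb": ["4b2129de54"],
    "4b2129de54": [],
    "c599db54c9": ["5bf87114a9"],
    "5bf87114a9": ["f258742372"],
    "f258742372": ["9c96a1cd72"],
    "9c96a1cd72": ["77f55bc175"],
    "77f55bc175": [],
}
# The three commits announced as shallow in the packet trace.
SHALLOW = {"c7883f4581", "9d282b0d7d", "9c96a1cd72"}
# Commits older than the cutoff, per the dates in the graph excerpt.
TOO_OLD = {"a4b68cbe3a", "daf04e0dcb", "4b2129de54", "77f55bc175"}

def reachable(tip, parents_of):
    """Return every commit reachable from tip under a parent function."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents_of(commit))
    return seen

def whole_commit_cut(commit):
    """Current behavior: a shallow commit loses all of its parents."""
    return [] if commit in SHALLOW else PARENTS[commit]

def per_parent_cut(commit):
    """Hypothetical: drop only the parents that are too old."""
    return [p for p in PARENTS[commit] if p not in TOO_OLD]

current = reachable("9a0645f52a", whole_commit_cut)
desired = reachable("9a0645f52a", per_parent_cut)
print(sorted(current))            # ['9a0645f52a', 'c7883f4581']
print(sorted(desired - current))  # commits lost to the whole-commit cut
```

In this model the whole-commit cut stops at c7883f4581, while the
hypothetical per-parent cut would carry on down to 9c96a1cd72.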

I'm not sure if the shallow mechanism or protocol is even capable of
representing a partial graft like that. Maybe somebody who's more
familiar with the shallow system can comment?

-Peff
