git@vger.kernel.org mailing list mirror (one of many)
From: "James Ramsay" <james@jramsay.com.au>
To: git@vger.kernel.org
Subject: [TOPIC 5/17] Partial Clone
Date: Thu, 12 Mar 2020 15:00:54 +1100
Message-ID: <58425B78-7C6D-41CD-92AE-434D0A58F968@jramsay.com.au>
In-Reply-To: <AC2EB721-2979-43FD-922D-C5076A57F24B@jramsay.com.au>

1. Stolee: what is the status, who is deploying it, what issues need to
be handled? For example, downloading tags. Hard to recommend it highly.

2. Taylor: we deployed it. No activity except for internal testing. Some
more activity, but no crashes. Have been dragging our feet. Chicken-and-egg
problem: can’t deploy it because the client may not work, but hoping to
hear about problems.

3. ZJ: dark launched for mission-critical repos. Internal questions
from CI team, not sure about performance. Build farm hitting it hard 
with different filter specs.

4. Taylor: we have patches we are promising to the list. Only blob:none
and blob:limit for now, with other filters added incrementally. Bitmap
patches are on the list.
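
For readers unfamiliar with the two filter specs mentioned in item 4
("blob none" and "limit"), here is a minimal sketch of how a client would
request them. `--filter=blob:none` and `--filter=blob:limit=<size>` are
real `git clone` options; the helper names, the example URL and the
destination path are assumptions made up for illustration.

```python
import subprocess


def partial_clone_argv(url, dest, filter_spec="blob:none"):
    """Build the argv for a partial clone using the given filter spec."""
    # Only the two blob filters discussed in the notes are accepted here.
    # This restriction is a choice of this sketch, not of git itself.
    if filter_spec != "blob:none" and not filter_spec.startswith("blob:limit="):
        raise ValueError(f"unsupported filter spec in this sketch: {filter_spec}")
    return ["git", "clone", f"--filter={filter_spec}", url, dest]


def partial_clone(url, dest, filter_spec="blob:none"):
    """Run the clone; needs git on PATH and a server that allows filters."""
    subprocess.run(partial_clone_argv(url, dest, filter_spec), check=True)


if __name__ == "__main__":
    # Hypothetical URL; the commands are printed rather than executed.
    url = "https://example.com/repo.git"
    print(" ".join(partial_clone_argv(url, "repo")))
    print(" ".join(partial_clone_argv(url, "repo", "blob:limit=1m")))
```

With `blob:none` no blobs are sent up front and the client fetches them on
demand; with `blob:limit=1m` only blobs of 1 MiB or more are left out.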

5. James: I’ve been talking to customers who have high interest in 
this. But they are hesitant. Do people have similar situations, like 
shallow clones?

6. Jonathan N: we’re not using it en masse with server farms (see 
Terry’s stats). Performance issues with catchup, long periods of 
downloading and no progress. Missing progress display means a user waits 
and gets worried. On server side, reachability check can be expensive, 
in part because enumerating refs is expensive.

7. Peff: client experience sucks with N+1 situations. If the server 
operator side is tolerable, that way it’s easier to move the client 
side forward. By default, v2 just serves them up, no reachability check. 
Not sure if we’ll do that forever. Often have to inflate blobs that 
are deltas, and then delta compression which is not needed.

8. Stolee: Jonathan built a batch download when changing trees. Possible 
to improve by sending haves.

9. Jonathan N: if you’re using the blob:none filter and say “I have a
commit”, I might not actually have what the server expects.

10. Peff: could enumerate blobs

11. Demetr: Partial clones are dangerous for DoS attacks

12. Jonathan: JGit forbids most filters that can't use bitmaps.

13. Peff: just blob filters? Yes, so far.

14. Jonathan: as far as the client experience goes, we’re not batching 
often enough and not showing progress on catch-up fetches. Any other UX 
issues?

15. Jeff: no, those two are what I meant.

16. James: another question for git service providers: Is it a 
replacement for LFS?

17. Brian: some files compress, others don’t. Repacking can blow
up if you try to compress something that can’t be compressed. How do
we identify which objects to compress, and which not to?

18. Jonathan N: if you see something already compressed, tell zlib to do 
passthrough compression.
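
As a sketch of the passthrough idea in item 18: zlib's compression level 0
emits "stored" blocks, so the data is wrapped in a valid zlib stream
without any attempt to shrink it. How git would detect that an object is
already compressed is not covered in the notes; here the caller simply
passes a flag, and the function name `deflate` is made up for
illustration.

```python
import os
import zlib


def deflate(data, already_compressed):
    """Deflate data, using level 0 (stored blocks) when it is already compressed."""
    level = 0 if already_compressed else zlib.Z_DEFAULT_COMPRESSION
    return zlib.compress(data, level)


if __name__ == "__main__":
    text = b"partial clone " * 4096
    noise = os.urandom(len(text))  # stands in for an already-compressed file

    packed_text = deflate(text, already_compressed=False)
    packed_noise = deflate(noise, already_compressed=True)

    # Repetitive text shrinks; stored data grows by a few framing bytes only.
    print(len(text), len(packed_text))
    print(len(noise), len(packed_noise))

    # Either way the stream inflates back to the original bytes.
    assert zlib.decompress(packed_noise) == noise
```

The output is still a regular zlib stream, so readers need no special
case; only the CPU time spent searching for matches is saved.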

19. Taylor: two problems - which projects do you want to quarantine, 
where do you put them. CDN offloading would be nice.

20. Stolee: reachability bitmaps are tied to a single packfile. Becomes 
more and more expensive. Even just having them in another file requires 
a lot of work.

21. Taylor: we’re looking at some heuristics so that some parts of the 
pack can just be moved over verbatim.

22. Peff: I see three problems: multi pack lookups, bitmaps,

23. Jonathan N: we never generate on the fly deltas

24. Peff: there are pathological cases.

25. Terry: we are seeing 89k partial clones per day. Majority is clone. 
Shallow clone equivalent.

26. Peff: why? Is it better?

27. Jonathan N: initial clone is about the same as shallow. One reason
we encourage it: a follow-up fetch from a shallow clone is expensive
for the server.

28. Stolee: if you persist the previous shallow clone, it is much much 
cheaper to do incremental fetch.

29. Terry: JGit has enough shallow clone bugs that we often just send
everything. Make shallow clone obsolete.

30. Jonathan N: Jenkins style CI, option for shallow clone. Want to run 
diff or git describe, have to turn it off. Partial clone is simpler.

31. Minh: could the server force the client to partial clone?

32. Brian: risks, working on an airplane. I don’t want to do any kind 
of fetch operation on poor connection. Could be good for CI, but don’t 
want to break things for humans.

33. Jonathan N: if I am going to get on an airplane, is there a way to 
fill it in the background. There are workarounds, like run `git show` 
which needs everything.
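
Before going offline (item 33) it can help to see what a background fill
would still have to download. `git rev-list --objects --missing=print`
prints each object that is absent locally with a `?` prefix. A minimal
sketch, assuming git is on PATH; the function names are made up for
illustration and this is not an existing git feature.

```python
import subprocess


def parse_missing(rev_list_output):
    """Return the object IDs that rev-list reported as missing ('?' prefix)."""
    return [
        line[1:].strip()
        for line in rev_list_output.splitlines()
        if line.startswith("?")
    ]


def missing_objects(repo, rev="HEAD"):
    """List objects reachable from rev that are absent from the local repo."""
    result = subprocess.run(
        ["git", "-C", repo, "rev-list", "--objects", "--missing=print", rev],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_missing(result.stdout)


if __name__ == "__main__":
    # Run inside a partial clone; an empty list means nothing is missing.
    missing = missing_objects(".")
    print(f"{len(missing)} objects would need to be fetched")
```

An empty result for the revisions you plan to work on suggests they can be
checked out and inspected without further network access.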

34. Elijah: I want to fetch a bunch more stuff, but then not fetch
anymore: throw an error rather than hanging.

35. Jonathan: filter blob:none is people's first experience of the 
feature. Make it a first class ui concept, present a user oriented UI 
like git sparse-checkout?

36. Taylor: It looks like it’s simple to use, but there’s a lot to 
do to actually use it. And Scalar is doing that for you.

37. James: Some of our customers would be interested to have a feature 
that pushes down configuration to all the users. It would give them LFS 
by default, without the end-users doing something.

38. Jonathan: We considered enabling a global config at Google. For 
example for 1+GB files.

