From: "James Ramsay" <james@jramsay.com.au>
To: git@vger.kernel.org
Subject: [TOPIC 14/17] Aspects of merge-ort: cool, or crimes against humanity?
Date: Thu, 12 Mar 2020 15:11:35 +1100
Message-ID: <84A85206-F4A7-4F36-A302-C3986D6AFF91@jramsay.com.au>
In-Reply-To: <AC2EB721-2979-43FD-922D-C5076A57F24B@jramsay.com.au>

1. Elijah: ORT stands for Ostensibly Recursive’s Twin. As a merge
strategy, just like you can call ‘git merge -s recursive’ you can
call ‘git merge -s ort’. Git’s option parsing doesn’t require
the space after the ‘-s’, so ‘git merge -sort’ works too.
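
The point about the optional space can be illustrated with a small sketch. It uses Python's argparse, not Git's own option parser, and the `parse_strategy` helper is a hypothetical name chosen for this example; it only shows the general convention that a short option may carry its value with or without a separating space.

```python
import argparse


def parse_strategy(argv):
    """Return the value given to a short -s option (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="merge-demo")
    # A short option that takes one value, like a merge strategy name.
    parser.add_argument("-s", dest="strategy")
    return parser.parse_args(argv).strategy


# Both spellings yield the same strategy name.
print(parse_strategy(["-s", "ort"]))
print(parse_strategy(["-sort"]))
```

Both calls print `ort`: the attached form `-sort` is read as the option `-s` followed by the value `ort`.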

2. Major question is about performance & possible layering violations.
merge-recursive calls unpack_trees() to walk the trees, then needs to
get all file and directory names, and so walks the trees on the right
again, and then the trees on the left again. Then diff needs to walk
sets of trees twice, and then insert_stage_data() does a bunch more
narrow tree walks for rename detection. Lots of tree walking. I
replaced that with two tree walks.

3. I’m using traverse_trees() instead of unpack_trees(), avoiding the
index entirely (not even touching or creating cache_entry’s), and
building up information as I need it. I’m not calling diffcore_std(),
but instead directly calling diffcore_rename(). Is this horrifying? Or
is it justified by the performance gains?
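
The idea of gathering everything in one simultaneous walk, rather than walking each side separately, can be sketched as follows. This is a toy model and not merge-ort's code: trees are plain nested dicts, blob ids are short made-up strings, and `walk_three` and `describe` are hypothetical names. Plain sorted order is used for simplicity, which is not the ordering Git uses for tree entries.

```python
def describe(entry):
    """Summarize an entry: None if absent, '<tree>' for a directory."""
    if entry is None:
        return None
    return "<tree>" if isinstance(entry, dict) else entry


def walk_three(base, ours, theirs, prefix=""):
    """Yield (path, (base, ours, theirs)) for every path in any of the trees.

    All three trees are visited together, so each directory level is
    read once instead of once per side.
    """
    for name in sorted(set(base) | set(ours) | set(theirs)):
        entries = [tree.get(name) for tree in (base, ours, theirs)]
        path = prefix + name
        yield path, tuple(describe(e) for e in entries)
        if any(isinstance(e, dict) for e in entries):
            # Treat a missing or non-directory side as an empty directory.
            subtrees = [e if isinstance(e, dict) else {} for e in entries]
            yield from walk_three(*subtrees, prefix=path + "/")


base = {"README": "b1", "src": {"a.c": "b2"}}
ours = {"README": "b1", "src": {"a.c": "b3"}}
theirs = {"README": "b4", "src": {"a.c": "b2", "b.c": "b5"}}

for path, versions in walk_three(base, ours, theirs):
    print(path, versions)
```

Each yielded tuple already holds the three versions of a path, which is the information a merge needs to decide whether the path is unchanged, changed on one side, or changed on both.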

4. Peff: both, some of it sounds like an improvement, but maybe there 
were hidden benefits previously.

5. Elijah: I write to a tree before I do anything.

6. Peff: I like that. Seems like a cleanup to me. We have written
libgit2-like code for merging server-side.

7. Elijah: I’ve been adding tests for the past few years, more to add, 
feel good about it.

8. Jonathan N: If you are using a lower-layer thing, I would not say
you’re doing anything you shouldn’t. But if the docs say you
should not use diffcore_rename(), you can update the docs to say that
it’s fine to use it.

9. Elijah: three places directly write tree objects. All have different 
data structures they are writing from. Should I pull them out? But then 
my data structure was also different, so I’d have a fourth.

10. Peff: not worried because trees are simple. Worried about policy 
logic. Can’t write a tree entry with a double slash. Want this to be 
enforced everywhere, but no idea how hard that would be to implement. 
Not about lines of code, but consistency of policy. Fearful that only 
one place does it.

11. Elijah: I know merge-ort checks this, but it’s not nearby, so it 
could change.

12. Peff: as bad as it is to round trip through the index, it may bypass 
quality checks, which you will need to manually implement.
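
The concern about policy consistency can be sketched as a single validation helper that every tree writer would call. This is an illustration only: the function names are hypothetical, and the rule set is a simplified assumption, not the checks Git actually performs.

```python
def check_entry_name(name):
    """Raise ValueError if a single tree entry name breaks the policy."""
    if name == "":
        raise ValueError("empty entry name")
    if "/" in name:
        raise ValueError("entry name contains a slash: %r" % name)
    if "\0" in name:
        raise ValueError("entry name contains a NUL byte")
    if name in (".", "..", ".git"):
        raise ValueError("reserved entry name: %r" % name)


def split_checked_path(path):
    """Split a path into components, validating each one.

    A double slash shows up as an empty component and is rejected.
    """
    components = path.split("/")
    for component in components:
        check_entry_name(component)
    return components


print(split_checked_path("src/a.c"))
try:
    split_checked_path("src//a.c")
except ValueError as err:
    print("rejected:", err)
```

With one shared helper, adding a fourth tree writer does not add a fourth copy of the policy; each writer only has to remember to call it.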

13. Elijah: usability side, with the tree I’ve created, I could have 
.git/AUTOMERGE. I have an old tree, a new tree, and a checkout can get 
me there. Fixed a whole bunch of bugs for sparsity and submodules.

14. Elijah: If we use this to on-the-fly remerge as part of git-log in 
order to compare the merge commit to what the automatic merging would 
have done, where/how should we write objects as we go?

15. Jonathan N: can end up with a proliferation of packs. It would be
nice to have something similar to fast-import, with an in-memory store.
The dream is to never have loose files written.

16. Peff: I like your dream. But fast-import packs are bad. We assume
that packs are good, and thus need to use GC aggressively. This makes
that problem worse. If Git knows about objects that are not written to
disk, there is a risk that you write objects that are broken, but Git
doesn’t notice because it thinks it has the object when it’s only
in memory. Log is conceptually a read operation, but this would create
the need for writes.

17. Elijah: you could write into a temporary directory. Worried about 
`gc --auto` in the middle of my operation. If I write to a temp pack I 
could potentially avoid it.
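
The temporary-store idea can be modeled as an overlay: new objects go to a scratch area, reads fall through to the permanent store, and the scratch area is thrown away at the end. This is a toy in-memory model under stated assumptions, not how Git stores objects on disk; `OverlayStore` is a hypothetical name. The object id computation (SHA-1 over a `<type> <size>` header, a NUL byte, and the content) follows Git's SHA-1 object format.

```python
import hashlib


def object_id(kind, data):
    """Compute a Git-style SHA-1 object id for the given content."""
    header = ("%s %d" % (kind, len(data))).encode() + b"\0"
    return hashlib.sha1(header + data).hexdigest()


class OverlayStore:
    """Scratch object store layered over a permanent one (toy model)."""

    def __init__(self, permanent):
        self.permanent = permanent  # dict: object id -> (kind, data)
        self.scratch = {}           # temporary objects, never persisted

    def write(self, kind, data):
        oid = object_id(kind, data)
        if oid not in self.permanent:
            self.scratch[oid] = (kind, data)
        return oid

    def read(self, oid):
        if oid in self.scratch:
            return self.scratch[oid]
        return self.permanent[oid]

    def discard(self):
        """Drop every temporary object."""
        self.scratch.clear()


permanent = {}
store = OverlayStore(permanent)
oid = store.write("blob", b"")
print(oid)
store.discard()
print(oid in permanent)
```

After `discard()` the id is no longer readable, which mirrors the risk raised above: anything that still refers to a scratch object after it is gone points at a missing object.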

18. Elijah: large files. Rename detection might not work efficiently OR
correctly for sufficiently large files (binary or not). Limited bucket
size means that completely different files are treated as renames when
both are over 8MB. Should big files just not be compared?
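
Why a bounded bucket can make unrelated files look alike can be shown with a deliberately tiny toy model. This does not reproduce Git's similarity code or its real limits: the bucket count, the cap, the chunking, and all function names are assumptions picked for illustration. Once every counter has hit its cap, two different inputs produce the same histogram.

```python
import random
import zlib

BUCKETS = 16  # deliberately tiny, for illustration only
CAP = 255     # each counter saturates at this value


def histogram(data, chunk=4):
    """Count chunk hashes per bucket, saturating each counter at CAP."""
    counts = [0] * BUCKETS
    for i in range(0, len(data), chunk):
        bucket = zlib.crc32(data[i:i + chunk]) % BUCKETS
        counts[bucket] = min(CAP, counts[bucket] + 1)
    return counts


def similarity(a, b):
    """Shared histogram mass divided by the larger histogram's mass."""
    ha, hb = histogram(a), histogram(b)
    shared = sum(min(x, y) for x, y in zip(ha, hb))
    total = max(sum(ha), sum(hb))
    return shared / total if total else 1.0


def pseudo_random_bytes(seed, size):
    """Deterministic filler content standing in for a large file."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(size))


big_a = pseudo_random_bytes(1, 200000)
big_b = pseudo_random_bytes(2, 200000)
print(similarity(big_a, big_b))
```

The two large inputs have different content, yet the score is 1.0 because every bucket saturated for both. Small inputs of different sizes stay well below that, since their counters never reach the cap.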

19. Peff: maybe we should fix the hash…

20. Elijah: present situation is broken, maybe we can cheat in the short 
term, and avoid fixing?

21. Peff: seems more correct for now, but we’d need to document it.

22. Elijah: checkout has an --overwrite-ignore flag. Should merge have
the same flag?

23. Jonathan N: gitignore’s original use case was build outputs, which
can be regenerated. But then some people want to ignore `.hg`, which is
much more precious.

24. Peff: we can plumb it through to other commands later.

25. Brian: CI doesn’t really care. Moving between branches it would 
complain. For checkout and merge it makes sense to support just 
destroying.
