Notes from the Git Contributors' Summit 2021, virtual, Oct 19/20


* Notes from the Git Contributors' Summit 2021, virtual, Oct 19/20
@ 2021-10-21 11:55 Johannes Schindelin
  2021-10-21 11:55 ` [Summit topic] Crazy (and not so crazy) ideas Johannes Schindelin
                   ` (11 more replies)
  0 siblings, 12 replies; 58+ messages in thread
From: Johannes Schindelin @ 2021-10-21 11:55 UTC (permalink / raw)
  To: git

Team,

we held our second all-virtual Summit over the past two days. It was the
traditional unconference style meeting, with topics being proposed and
voted on right before the introduction round. It was really good to see
the human faces behind those email addresses.

32 contributors participated, and we spanned the timezones from PST to
IST. To make that possible, the event took place on two days, from
1500-1900 UTC, which meant that the attendees from the US West coast had
to get up really early, while it was past midnight in India at the end.

I would like to thank all participants for accommodating the time, and in
particular for creating such a friendly, collaborative atmosphere.

A particular shout-out to Jonathan Nieder, Emily Shaffer and Derrick
Stolee for taking notes. I am going to send out these notes in per-topic
subthreads, replying to this mail.

Day 1 topics:

* Crazy (and not so crazy) ideas
* SHA-256 Updates
* Server-side merge/rebase: needs and wants?
* Submodules and how to make them worth using
* Sparse checkout behavior and plans

Day 2 topics:

* The state of getting a reftable backend working in git.git
* Documentation (translations, FAQ updates, new user-focused, general
  improvements, etc.)
* Let's have public Git chalk talks
* Increasing diversity & inclusion (transition to `main`, etc)
* Improving Git UX
* Improving reviewer quality of life (patchwork, subsystem lists?, etc)

A few topics were left for a later date (maybe as public Git chalk talks):

* Making Git memory-leak free (already landed patches)
* Scaling Git
* Scaling ref advertisements
* Config-based hooks (and getting there via migration to hook.[ch] lib &
  "git hook run")
* Make git [clone|fetch] support pre-seeding via downloaded *.bundle files

Ciao,
Johannes


* [Summit topic] Crazy (and not so crazy) ideas
  2021-10-21 11:55 Notes from the Git Contributors' Summit 2021, virtual, Oct 19/20 Johannes Schindelin
@ 2021-10-21 11:55 ` Johannes Schindelin
  2021-10-21 12:30   ` Son Luong Ngoc
  2021-10-26 20:14   ` scripting speedups [was: [Summit topic] Crazy (and not so crazy) ideas] Eric Wong
  2021-10-21 11:55 ` [Summit topic] SHA-256 Updates Johannes Schindelin
                   ` (10 subsequent siblings)
  11 siblings, 2 replies; 58+ messages in thread
From: Johannes Schindelin @ 2021-10-21 11:55 UTC (permalink / raw)
  To: git


This session was led by Elijah Newren. Supporting cast: Johannes "Dscho"
Schindelin, Jonathan Tan, Jonathan "jrnieder" Nieder, brian m. carlson,
Jeff "Peff" King, Ævar Arnfjörð Bjarmason, Emily Shaffer, CB Bailey,
Taylor Blau, and Philip Oakley.

Notes:

* sent my idea for rebase merges on-list

* Test suite is slow. Shell scripts and process forking.

   * What if we had a special shell that interpreted the commands in a
     single process?

   * Even Git commands like rev-parse and hash-object, as long as that’s
     not the command you’re trying to test

   * Dscho wants to slip in a C-based solution

   * Jonathan Tan commented: going back to your custom shell for tests
     idea, one thing we could do is have a custom command that generates
     the repo commits that we want (and that saves process spawns and
     might make the tests simpler too)

      * We could replace several “setup repo” steps with “git fast-import”
        instead.

   * Dscho measured: 0.5 sec - 30 sec in setup steps. Can use fast-import,
     or can make a new format that helps us set up the test scenario

   * Elijah: test-lib-functions helpers could be built-ins
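
The “git fast-import” idea above can be sketched as follows. This is only
an illustration, not code from the test suite: the helper name
fast_import_stream, the identity, and the file name are hypothetical. Only
the stream commands (commit, committer, data, M ... inline) follow the
git-fast-import documentation. One stream describes many commits, so a
single process could stand in for a long chain of add/commit spawns.

```python
def _data(payload: bytes) -> bytes:
    """Encode a payload in fast-import's exact-byte-count 'data' format."""
    return b"data %d\n%s\n" % (len(payload), payload)


def fast_import_stream(n_commits: int, ref: str = "refs/heads/main") -> bytes:
    """Build a fast-import stream with n_commits linear commits on ref.

    Hypothetical test helper: every commit rewrites one file, 'file.txt'.
    """
    out = []
    for i in range(1, n_commits + 1):
        out.append(b"commit %s\n" % ref.encode())
        # Fixed identity and timestamps keep the resulting object ids stable.
        out.append(b"committer A U Thor <author@example.com> %d +0000\n"
                   % (1_000_000_000 + i))
        out.append(_data(b"commit %d\n" % i))
        # No 'from' line: the first commit becomes a root commit, and later
        # ones build on the branch tip created earlier in the same stream.
        out.append(b"M 100644 inline file.txt\n")
        out.append(_data(b"content %d\n" % i))
    return b"".join(out)


if __name__ == "__main__":
    import sys
    # Intended use, inside an empty repository:
    #   python make_stream.py | git fast-import
    sys.stdout.buffer.write(fast_import_stream(3))
```

Whether this actually beats the existing setup helpers would have to be
measured; the sketch only shows that the setup can be expressed as data.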

* Biggest idea: there are a lot of people who version control things via
  tarballs or .zip files per version. This prevents history from
  compressing well. Some people check in those compressed files into Git
  for purposes of history.

   * In particular, .jar files or npm packages. Initial testing showed
     that you can expand .jar files in a way that creates source-like
     files.

   * Jonathan Nieder points out that “pristine-tar” exists for a similar
     purpose: https://joeyh.name/code/pristine-tar/

   * Others use “git archive” for this purpose to mixed success.

   * jars and npm packages compress better if you store them in expanded
     form instead of compressed form

   * So many tools are used to using the end-archive, so while it’s
     tempting to have the build system be responsible for this, being able
     to “git add” the archive and have the right thing happen behind the
     scenes would be nice for ease of use

   * Goal here isn’t bit-for-bit reproducibility, just semantic
     reproducibility

   * What about other file formats that use zips, such as LibreOffice?

   * Git Merge 2018: Designers Git-It; A unified design system workflow
     did something similar, except made the tool understand the “exploded”
     file view.

   * Jonathan Tan mentions that smudge/clean filters can help, except this
     is about tree<->blob instead of blob<->blob

   * brian m. carlson mentions “git archive” output isn’t stable across
     Git versions. Should we have a canonical tar format that provides
     reproducibility?

   * Peff: tree<->blob filters can get confusing in the
     tree<->index<->worktree mapping. Possible, but requires careful
     thought about the details about when each spot

   * Old suggestion of a “blob-tree” type that allows storing a single
     index entry that corresponds to multiple trees and blobs in the
     background, possibly.

   * One long-term dream (inspired by Avery Pennarun’s “bup” tool) is to
     store large binary files in a tree-structured way that can store
     common regions as deltas, improve random access, parallelized
     hashing. Involves a consistent way to split the file into stable
     pieces, like --rsyncable uses (based on a rolling hash being zero).

   * Peff: you can do that at the object model layer or at the storage
     layer. The latter is less invasive.

   * jrnieder: The benefits of blobtree are greater at the object model
     layer --- e.g. not having to transmit chunks over the wire that you
     already have. I think the main obstacle has been that the benefits
     haven’t been enough to be worth the complexity. If that changes, we
     can imagine bundling it with some other object format changes, e.g.
     putting blob sizes in tree objects, and rolling it out as a new
     object-format.

   * Ævar: can we do this in a simpler manner, without deep technical
     changes? (Context: was thinking about this in the context of some
     $id$ questions.) Clean/smudge filters have some significant UX
     drawbacks. Has experience helping users trying to commit .jar files.
     Some simple advice saying “maybe you don’t want to commit this file
     type, here are some ways to expand it to a committable format…” based
     on patterns such as .gitignore or .gitattributes. We don’t have ways
     to indicate “this repo uses Git LFS, but you don’t have the plugin.”
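
The bup-style splitting mentioned above (cut a file wherever a rolling
hash hits zero) can be sketched like this. It is a toy under stated
assumptions: the window size, the mask, and the plain rolling sum are
illustrative choices, not what bup or --rsyncable actually use, and the
names split_chunks and chunk_ids are hypothetical.

```python
import hashlib
import random


def split_chunks(data: bytes, window: int = 16, mask: int = 0x3F) -> list:
    """Split data where a rolling sum over the last `window` bytes hits zero.

    Boundaries depend only on nearby content, so an edit early in the file
    shifts byte offsets without moving most of the later boundaries.
    """
    chunks, start, rolling = [], 0, 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= window:
            rolling -= data[i - window]  # slide the window forward
        long_enough = i + 1 - start >= window
        if long_enough and (rolling & mask) == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


def chunk_ids(data: bytes) -> set:
    """Content-address every chunk, as an object store would."""
    return {hashlib.sha256(c).hexdigest() for c in split_chunks(data)}


if __name__ == "__main__":
    rng = random.Random(0)
    original = bytes(rng.getrandbits(8) for _ in range(4096))
    edited = b"!" + original  # one byte inserted at the front
    shared = chunk_ids(original) & chunk_ids(edited)
    print(f"{len(shared)} chunks shared after a one-byte insertion")
```

With fixed-size blocks the inserted byte would change every block. Here
the chunks after the first common boundary keep their ids, which is what
makes deduplication and partial transfer plausible.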

* Emily: If I could rewrite the commit object format, I would change some
  things

   * Allow multiple authors

   * Add a layer of indirection to author name

      * brian has thought about this too: replace name with email address
        + some ssh key or something and use something mailmap-like to map
        it. Could be a backward-compatible approach

   * CB has been thinking about these problems in the background. Could
     randomly generate an identifier when you commit your first patch, an
     @example.com address to avoid conflicting with any real address.
     Mailmap can be a blob maintained by the project

      * In the process can get first-class multiple authors

      * If I have this id representing this particular pair of authors,
        can update what the id points to

      * Cool stuff but gets complicated

   * Just getting mailmap applied to trailers in “git log” would be huge

      * CB: main reason I don’t put myself in mailmap is that it’s not
        worth bothering without that feature

      * Ævar: “git log --author” would want the mapping, too. (and ‘git
        shortlog --group’) Do we do this only at the presentation layer or
        if we do it at a lower layer do we get such things for free?

      * If anyone’s interested, I might know where the dragons are hiding,
        happy to give advice

      * Peff: “git shortlog” already knows how to parse it out so this
        seems very possible

      * Taylor:
        https://lore.kernel.org/git/YW8A5FznqLYs7MqH@coredump.intra.peff.net/T/

   * Generation number was discussed ~2011(?)

   * Ævar: does this really need a format change? Two “author” fields
     would break things, but could have “author” and “x-author” header

   * General principle when changing formats: tease apart where it’s
     possible to achieve what you want in a backward-compatible way
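
A presentation-layer version of “mailmap applied to trailers” could look
roughly like the sketch below. It is an assumption-laden illustration, not
how “git log” or “git shortlog” are implemented: the function names are
hypothetical, only two of the mailmap line forms are handled, and any line
that looks like a known trailer is rewritten without locating the actual
trailer block.

```python
import re

IDENT = re.compile(r"^(?P<name>.*?)\s*<(?P<email>[^>]*)>$")


def parse_mailmap(text: str) -> dict:
    """Map lowercased commit email -> (proper name, proper email).

    Simplified: handles 'Proper Name <commit@email>' and
    'Proper Name <proper@email> <commit@email>' lines only.
    """
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        emails = re.findall(r"<([^>]*)>", line)
        if not emails:
            continue
        name = line.split("<", 1)[0].strip()
        # The last address is the one found in commits, the first the proper one.
        mapping[emails[-1].lower()] = (name, emails[0])
    return mapping


def map_trailers(message: str, mapping: dict,
                 keys=("co-authored-by", "signed-off-by", "reviewed-by")) -> str:
    """Rewrite identities in known trailers according to the mapping."""
    out = []
    for line in message.splitlines():
        key, sep, value = line.partition(": ")
        is_trailer = bool(sep) and key.lower() in keys
        match = IDENT.match(value.strip()) if is_trailer else None
        if match and match["email"].lower() in mapping:
            name, email = mapping[match["email"].lower()]
            # Keep the original name when the mailmap entry has none.
            line = f"{key}: {name or match['name']} <{email}>"
        out.append(line)
    return "\n".join(out)
```

Doing the mapping at this layer leaves the stored commit untouched, which
is one side of the question Ævar raises above.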

* Philip Oakley would like a commit id referring to an unborn branch as a
  proper id

   * brian: empty tree works for what you’re talking about when you want a
     diff

   * Philip: motivating example was “first parent is going nowhere, but
     you have a second parent”

   * jrnieder: I see, you want the --first-parent history of your
     published branch to match the reflog. As a workaround, you’re able to
     use an empty initial commit and use --no-ff merges whenever you pull
     things in, but you’re referring to wishing you didn’t have to make
     that empty initial commit

   * Ævar: reminds me of the discussion in
     https://www.fossil-scm.org/home/doc/trunk/www/fossil-v-git.wiki of
     commit/branch relationships


* [Summit topic] SHA-256 Updates
  2021-10-21 11:55 Notes from the Git Contributors' Summit 2021, virtual, Oct 19/20 Johannes Schindelin
  2021-10-21 11:55 ` [Summit topic] Crazy (and not so crazy) ideas Johannes Schindelin
@ 2021-10-21 11:55 ` Johannes Schindelin
  2021-10-21 11:56 ` [Summit topic] Server-side merge/rebase: needs and wants? Johannes Schindelin
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 58+ messages in thread
From: Johannes Schindelin @ 2021-10-21 11:55 UTC (permalink / raw)
  To: git


This session was led by brian m. carlson. Supporting cast: Jonathan
"jrnieder" Nieder, Derrick Stolee, Johannes "Dscho" Schindelin, Toon
Claes, and Ævar Arnfjörð Bjarmason.

Notes:

 1.  Summarizing where we are with what merged:

     1. We have full SHA256 support \o/

     2. Some minor glitches, updated the docs to reflect that

     3. It works. 2.30 is in a good state.

     4. None of the major forges support it yet but that will come

 2.  Interop between SHA256 repositories and SHA1 repositories

     1. Take each object we receive over the wire or create locally and give it
        a sha1 value as well

     2. We have a giant loose object index that maps sha256-id and sha1-id
        values. Hashmap

     3. Will be changed to some tree to allow prefix mapping

     4. index-pack has to take two passes over the pack, because you can’t map
        a commit before you’ve mapped the tree it points to (or more generally
        can’t map an object before the objects it references)

     5. Fortunately blobs don’t point to any other objects so this is
        relatively quick
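
The ordering constraint above (an object can only be mapped after the
objects it references) can be illustrated with a toy model. Everything
here is an assumption made for the example: the three-object repository,
the function names, and the body encoding, which embeds ids as hex text
rather than using Git’s real tree and commit formats. Only the
“type length NUL” header matches how Git hashes objects.

```python
import hashlib

# Toy repository: name -> (type, payload, names of referenced objects).
OBJECTS = {
    "readme": ("blob", b"", []),
    "tree": ("tree", b"", ["readme"]),
    "commit": ("commit", b"initial\n", ["tree"]),
}


def object_id(name: str, algo: str, cache: dict) -> str:
    """Hash an object under `algo`, embedding the ids of its references.

    The references are hashed first, which is why a commit cannot be
    mapped before its tree, while blobs need no lookups at all.
    """
    if (name, algo) not in cache:
        kind, payload, refs = OBJECTS[name]
        ref_lines = b"".join(
            object_id(ref, algo, cache).encode() + b"\n" for ref in refs)
        body = ref_lines + payload
        header = b"%s %d\0" % (kind.encode(), len(body))
        cache[(name, algo)] = hashlib.new(algo, header + body).hexdigest()
    return cache[(name, algo)]


def build_mapping() -> dict:
    """Return a sha256-id -> sha1-id table for every object."""
    cache = {}
    return {object_id(name, "sha256", cache): object_id(name, "sha1", cache)
            for name in OBJECTS}


if __name__ == "__main__":
    for sha256_id, sha1_id in build_mapping().items():
        print(sha256_id[:12], "->", sha1_id[:12])
```

The recursion plays the role of the second pass: blobs resolve
immediately, trees and commits only once their references are known.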

 3.  Submodules are tricky

     1. They come from a different repository so we don’t have anything to map
        to

     2. What I’m currently doing is requiring that the submodule be present
        locally and storing the mapping separately in the superproject

     3. The mapping isn’t sent over the wire. That could create some complexity
        around malicious histories

 4.  For the same reason we don’t have partial clone working

     1. Might require an on-disk format bump

     2. jrnieder: taking a step back, the hash verifies the full history via
        the Merkle tree property.

     3. However, with partial clone we already relax this: it is no longer
        verifiable, locally.

     4. Therefore, we place a lot of trust on the server.

     5. The server could tell us more information about the edge commits, e.g.
        SHA-1<->SHA-256 mapping

     6. Stolee: if I am sha256 client, that’s what I want, you kind of decide
        up front what you want

     7. jrnieder: at $DAYJOB common partial clone scenario is triangular
        workflow

     8. Stolee: how likely is it that the multiple hosts are not homogeneous
        (all SHA-1, or all SHA-256)?

     9. brian: Valuable to be able to work in SHA256 and refer in input+output
        to SHA-1. If someone refers to a SHA-1, you still want to be able to
        see what they’re referring to, to interact with other people, even
        though SHA-1 is insecure

 5.  Multi-pack index: doesn’t work, but won’t be hard to fix

 6.  We write signatures for both objects. When you “git commit --gpg-sign”, it
     can sign in both formats

     1. Verifies in current format

 7.  Timeframe for hosting providers moving to SHA256

     1. Dscho: should we have a multi provider meeting and coordinate that?
        Could be everyone waiting for others

     2. brian: cgit supports SHA256 already, allows self-hosting

     3. jrnieder: with interop, individuals can use SHA256 against servers that
        only support SHA-1. Then that creates pressure for the servers to
        support SHA256 for performance reasons

     4. brian: interop doesn’t exist yet. If GitHub decides I work on that for
        the next two months, I think I could do it. But requires the code
        getting written.

     5. Toon: we at gitlab have sha256 on our radar, but with a very low prio
        https://gitlab.com/groups/gitlab-org/-/epics/794

 8.  jrnieder: Signing: very old Git versions won’t know to invalidate them
     when I commit --amend. How old is “very old”?

     1. brian: somewhere between 2.20 and 2.28. In 2.20 started treating
        everything with “gpgsig” at start as a potential signature.

     2. There were a couple of bugs I fixed in 2.30, working on signature
        interoperability. Tested with sha256.

     3. Updated the transition plan: in tags, the trailing signature is always
        the current signature, other ones go in the header.

 9.  Updating other hosting provider glue to support sha256

     1. jrnieder: e.g. GitHub API, UIs, …. Is it hard, similar to the Git part,
        or a little easier?

     2. brian: hardest part is libgit2. Lots of hardcoded oids in its testsuite

     3. Libraries tend to be the hardest piece --- e.g. Gerrit will need JGit
        updates

     4. Dscho: gitk also has some references to hardcoded 40-length

     5. Ævar: some patches on the mailing list for gitk and git-gui to adapt
        them, from Carlos

     6. brian: hopefully the ecosystem learns from this experience and doesn’t
        just hardcode 64 here :)

 10. Interop code only supports 2 algorithms. Hopefully finish this transition
     before we need the next one :)


* [Summit topic] Server-side merge/rebase: needs and wants?
  2021-10-21 11:55 Notes from the Git Contributors' Summit 2021, virtual, Oct 19/20 Johannes Schindelin
  2021-10-21 11:55 ` [Summit topic] Crazy (and not so crazy) ideas Johannes Schindelin
  2021-10-21 11:55 ` [Summit topic] SHA-256 Updates Johannes Schindelin
@ 2021-10-21 11:56 ` Johannes Schindelin
  2021-10-22  3:06   ` Bagas Sanjaya
  2021-11-08 18:21   ` Taylor Blau
  2021-10-21 11:56 ` [Summit topic] Submodules and how to make them worth using Johannes Schindelin
                   ` (8 subsequent siblings)
  11 siblings, 2 replies; 58+ messages in thread
From: Johannes Schindelin @ 2021-10-21 11:56 UTC (permalink / raw)
  To: git


This session was led by Elijah Newren. Supporting cast: Christian Couder,
Jonathan "jrnieder" Nieder, brian m. carlson, Toon Claes, Orgad Shaneh,
Johannes "Dscho" Schindelin, Derrick Stolee, Philip Oakley, Jeff "Peff"
King, CB Bailey, Ævar Arnfjörð Bjarmason, and Phillip Wood.

Notes:

 1.  https://github.com/git/git/pull/1114

 2.  Not about exposing merge over Git protocol, but about providing plumbing
     for a server to use to run a merge

 3.  Not only merge/rebase, but also cherry-pick, revert

 4.  merge-ORT makes things a bit better, as it doesn’t have some of the
     problems of the recursive algorithm.

 5.  The challenge is not necessarily the technical challenges, but the UX for
     server tools that live “above” the git executable.

     1. What kind of output is needed? Machine-readable error messages?

     2. What Git objects must be created: a tree? A commit?

     3. How to handle, report, and store conflicts? Index is not typically
        available on the server.

 6.  Use case?

     1.  Currently servers use libgit2 to do these algorithms instead of git
         itself.

     2.  What would it take for us to move the servers off of libgit2 and onto
         Git?

     3.  This would help with a lot of compatibility issues (sha-256, new data
         formats)

     4.  Server cares about the exit code to record the success of the
         operation, including some details around which conflicts happened.

     5.  MUST NOT WRITE A REF in-process (because of replication), so must be
         at a deep plumbing level.

     6.  How to restart the merge once a user has submitted conflict
         resolutions?

     7.  Christian: GitLab also uses libgit2, would like to use C Git. Want to
         not need a worktree for scratch space.

     8.  jrnieder: Write tree with conflict markers and report the conflict
         (just a boolean), at least as an optional mode (JGit does this and
         Gerrit relies on it)

         1. Including when merging binary files, rename conflicts, etc where
            there’s no place to put the conflict markers

     9.  brian: fail-fast mode. Present that a conflict happens very quickly,
         allow conflict marker computation to be done later, upon user request
         or as a background job.

     10. Toon: GitLab would love to use merge ORT, and collaborate on it

     11. Orgad: In case you do have conflicts, does a mergetool-style frontend
         want the three competing versions?

 7.  Dscho: there’s a little-used “git merge-tree” plumbing command

     1. jrnieder: it’s a low-level doesn’t-resolve-conflicts thing, but nothing
        forces us to keep it that way. Intriguing idea
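
For illustration, here is how a server-side caller might consume such
plumbing. The sketch assumes the “git merge-tree --write-tree” mode that
later Git releases (2.38 and newer) document, which postdates these
notes: exit status 0 for a clean merge, 1 for conflicts, and the tree id
on the first output line. The wrapper and parser names are hypothetical.

```python
import subprocess


def parse_merge_tree(returncode: int, stdout: str) -> dict:
    """Interpret the exit status and output of a merge-tree invocation."""
    if returncode not in (0, 1):
        raise RuntimeError(f"merge could not be performed (exit {returncode})")
    lines = stdout.splitlines()
    if not lines:
        raise RuntimeError("no tree id in output")
    conflicted = []
    for line in lines[1:]:
        if not line:  # a blank line separates paths from messages
            break
        conflicted.append(line)
    return {"clean": returncode == 0, "tree": lines[0],
            "conflicted": conflicted}


def server_side_merge(repo: str, ours: str, theirs: str) -> dict:
    """Merge two commits without a worktree, an index, or a ref update."""
    proc = subprocess.run(
        ["git", "-C", repo, "merge-tree", "--write-tree", "--name-only",
         ours, theirs],
        capture_output=True, text=True)
    return parse_merge_tree(proc.returncode, proc.stdout)
```

The caller decides separately whether to create a commit from the tree
and update a ref, in line with the “must not write a ref” requirement.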

 8.  Difference between rebase and cherry-pick not all that big, apart from
     looking at HEAD (which does not make sense on the server-side)

 9.  --onto already strains the concept of the rebase, should maybe not be
     implicit.

 10. Stolee: Think about future extensibility: e.g., servers might want to
     support --autosquash

 11. It would be nice to rebase multiple, interconnected branches at the same
     time. But how to specify that?

 12. Dscho: I have this problem quite often with my many stacked patch series

 13. I use --recreate-merges (uses “label” command), create refs along the way

 14. Philip: I also rebase with merges and then run a script after the fact to
     update refs

 15. Peff: I do something lower-tech. When I have branches depending on each
     other, I set the upstream config. By doing rebases in the right order, the
     right thing happens.

 16. CB: This feature sounds really exciting; I often develop parallel,
     semi-independent changes that only come together in an octopus merge at
     the end

 17. Jonathan: Newcomers sometimes put commits that don’t belong together on
     the same branch; I wish there were a smooth way for them to just “drag
     over” a commit, which we don’t currently have because it involves multiple
     branches. Cheering you on.
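
Peff’s lower-tech approach (set the upstream config, then rebase in the
right order) amounts to a topological sort over the upstream
relationships. A minimal sketch, with a hypothetical function name and
made-up branch names:

```python
def rebase_order(upstream: dict) -> list:
    """Order branches so each comes after the branch it builds on.

    `upstream` maps a local branch to its configured upstream. Upstreams
    that are not themselves keys (e.g. "origin/main") are treated as roots.
    """
    order, state = [], {}

    def visit(branch):
        if state.get(branch) == "done":
            return
        if state.get(branch) == "visiting":
            raise ValueError(f"upstream cycle involving {branch}")
        state[branch] = "visiting"
        parent = upstream[branch]
        if parent in upstream:
            visit(parent)  # rebase the branch we depend on first
        state[branch] = "done"
        order.append(branch)

    for branch in sorted(upstream):
        visit(branch)
    return order


if __name__ == "__main__":
    stacked = {
        "a-final": "m-middle",
        "m-middle": "z-base",
        "z-base": "origin/main",
    }
    print(rebase_order(stacked))
```

Each branch in the returned order could then be checked out and rebased
onto its configured upstream in turn.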

 18. cherry-pick in the middle of an interrupted rebase

 19. If we unify them, then this gets messy

 20. Dscho: I’m a strong proponent of being able to cherry-pick while you’re
     rebasing. But I’m also missing the ability to do an interactive rebase in
     the middle of an interactive rebase. I implemented a nested interactive
     rebase in the tooling for Git for Windows, which works by prepending the
     current interactive rebase’s todo

 21. Peff: That works in that context, but is not fully generic (no way to
     --abort / --quit). Would want a stack of operations. I have a command
     called “git continue” that continues whatever operation is in progress.

 22. Once we have every high-level operation pushing / popping like this, that
     kind of thing becomes possible.

 23. Toon: I have that too, also “git abort”

 24. CB: “git abort” is slightly terrifying, we started with git shell and now
     we have git forth :)

 25. Dscho: could standardize on the git-rebase-todo script and add support for
     other operations, tricky bit would be how to implement nested commands
     in an abortable fashion

 26. Ævar: would be nice if these are pushable/sharable

 27. Is rebase the right top-level command?

 28. Phillip Wood: for refactoring history, would like a different abstraction
     from rebase

 29. I have a script that does that which works well

 30. jrnieder: https://github.com/arxanas/git-branchless has some non rebase
     based history manipulation helpers as well, can be useful for inspiration

 31. Elijah: I’m thinking of a “git replay” command


* [Summit topic] Submodules and how to make them worth using
  2021-10-21 11:55 Notes from the Git Contributors' Summit 2021, virtual, Oct 19/20 Johannes Schindelin
                   ` (2 preceding siblings ...)
  2021-10-21 11:56 ` [Summit topic] Server-side merge/rebase: needs and wants? Johannes Schindelin
@ 2021-10-21 11:56 ` Johannes Schindelin
  2021-10-21 11:56 ` [Summit topic] Sparse checkout behavior and plans Johannes Schindelin
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 58+ messages in thread
From: Johannes Schindelin @ 2021-10-21 11:56 UTC (permalink / raw)
  To: git


This session was led by Emily Shaffer. Supporting cast: brian m. carlson,
Orgad Shaneh, Jonathan "jrnieder" Nieder, Jeff Hostetler, and Philip
Oakley.

Notes:

 1. https://lore.kernel.org/git/YHofmWcIAidkvJiD@google.com/

    1. Internally at Google, a lot of use of “repo”

    2. Isn’t great, but not much alternative available

    3. Submodules are also not great, let’s make them better

    4. Some prior work: --recurse-submodules options

       1. I can run “git branch” with and without --recurse-submodules

    5. Being in recurse mode gives us a chance to be opinionated

    6. Don’t want to have a million options and create a lot of complexity

    7. Branches

       1. Superproject thinks “main” points to one set of states in submodules

       2. Submodules have “main” pointing elsewhere

       3. Which is right? The superproject is right, “git status” can show the
          difference

    8. Not trying to eliminate all complexity. There is some inherent
       complexity in stitching repositories together. But I want to make it
       predictable

    9. For specifics, see the RFC linked to above

 2. brian: Interested in current status, what’s been implemented

    1. Emily: workflow git clone / git branch / git commit / git push, all
       using submodule.recurse, worked well

    2. Intern Mahi Kolla sent a patch to recurse by default once you’ve done a
       --recurse-submodules clone

    3. Ran demo for an internal team, feedback was positive

    4. Used a hacky remote helper to map “git push” to “git push origin
       HEAD:refs/for/main”, we have plans for not needing that :)

    5. Partial clone with submodules is close to done, is another important
       part of this

    6. Glen and Josh have done some work on branching + setting tracking info.
       That’s key for making recursive push work in an intuitive way, because
       the branch you want to push to in each submodule is not always the same

    7. I also pushed a series storing a path in each submodule’s git directory
       to its superproject’s git directory. Use that as another phase in config
       parsing, inherited-from-superproject config. That combines well with
       config-based hooks (thanks Ævar for the help with that)

    8. Next steps are around fast-forward merges and rebases

    9. Specifics are in the doc linked to

 3. Interaction with Gerrit

    1. Orgad: when you push to a submodule and superproject, at merge time the
       submodule commit changes, what do you do in the superproject to handle
       this?

    2. jrnieder: This comes up in any review flow, not just Gerrit --- ideally
       you’d want to review the superproject and submodule changes together as
       one unit. There’s some work happening in Gerrit on “multi-change
       review”.

    3. What works today: Gerrit’s submodule subscription feature has the
       ability to update a superproject. If you have a set of submodule changes
       and a superproject change that are submitted together, then at submit
       time Gerrit will rewrite the superproject change to reflect what
       happened in the submodules.

    4. In the Android workflow the superproject only contains pointers to
       submodules so we don’t push changes for review to the superproject at
       all. So we handle this with submodule subscription.

    5. Emily: analogy to auto-generated merge commits

 4. Jeff Hostetler: back in 2014 Microsoft considered submodules, hit a can of
    worms

    1. Coordinating changes between submodule and superproject, this requires
       server-side locks to prevent edge cases

    2. Was hard enough that we abandoned it

    3. jrnieder: we’re viewing submodules as not a replacement for the
       monorepo, but as a separate thing for when components have an
       independent existence. Microsoft made the right choice by not using
       submodules artificially in the creation of the Windows monorepo.

 5. Jeff: do you want to support sub-sub-sub-submodules?

    1. Emily: we ruled that out.

    2. jrnieder: nested submodules already work well in Git, we’re not breaking
       that

       1. Philip Oakley: good; if that changes, please make docs + config clear
          about it

    3. As a matter of project hygiene, we encourage people to put their
       submodules directly at the top level. That way, you know what code
       you’re pulling in.

    4. That said, there are unusual use cases e.g. around a build that pulls
       together multiple versions of the full Android codebase. So we actually
       do take advantage of nested submodules for those niche cases

 6. Please read the design doc, and expect lotsa patches over the next 3-6
    months


* [Summit topic] Sparse checkout behavior and plans
  2021-10-21 11:55 Notes from the Git Contributors' Summit 2021, virtual, Oct 19/20 Johannes Schindelin
                   ` (3 preceding siblings ...)
  2021-10-21 11:56 ` [Summit topic] Submodules and how to make them worth using Johannes Schindelin
@ 2021-10-21 11:56 ` Johannes Schindelin
  2021-10-21 11:56 ` [Summit topic] The state of getting a reftable backend working in git.git Johannes Schindelin
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 58+ messages in thread
From: Johannes Schindelin @ 2021-10-21 11:56 UTC (permalink / raw)
  To: git


This session was led by Derrick Stolee. Supporting cast: Jonathan
"jrnieder" Nieder, Elijah Newren, Jeff Hostetler, Jeff "Peff" King,
Johannes "Dscho" Schindelin, Ævar Arnfjörð Bjarmason, Emily Shaffer,
Victoria Dye, brian m. carlson, and CB Bailey.

Notes:

 1.  Cone mode has stabilized

 2.  jrnieder: would sparse index without cone mode support be welcome?

     1. Stolee: you’re welcome to try ;-)

     2. Elijah: main theme: performance. Cone mode allows reasonable
        performance due to fewer rules to check

     3. Stolee: directory-level lookups mean lookups can have sublinear cost,
        since you can skip sparse rules (no need to check them in order to
        figure out whether or not a file is excluded or not)
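
The directory-level lookup Stolee describes can be sketched as two hash
sets. This is a simplified model of cone mode, not Git’s implementation;
build_cone and in_cone are hypothetical names. A path is included when
any ancestor directory was requested recursively, or when it sits
directly inside a parent of a requested directory.

```python
def build_cone(dirs):
    """Return (recursive, parents) directory sets for the requested dirs."""
    recursive = {d.strip("/") for d in dirs}
    parents = {""}  # files at the repository root are always included
    for d in recursive:
        parts = d.split("/")
        for i in range(1, len(parts)):
            parents.add("/".join(parts[:i]))
    return recursive, parents


def in_cone(path, recursive, parents):
    """Decide inclusion with set lookups only, one per path component."""
    parts = path.split("/")
    if "/".join(parts[:-1]) in parents:
        return True  # file directly inside a parent directory
    return any("/".join(parts[:i]) in recursive
               for i in range(1, len(parts)))


if __name__ == "__main__":
    recursive, parents = build_cone(["A/B"])
    for path in ["top.txt", "A/x.txt", "A/B/C/z.txt", "A/C/y.txt"]:
        print(path, in_cone(path, recursive, parents))
```

The cost of a lookup depends on the depth of the path, not on how many
directories are in the cone, whereas ordered pattern rules must in
general be checked one after another.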

 3.  Elijah: interested in “sparse clones”, i.e. clones that download
     everything related to a specified cone

     1.  Would be nice not having to download extra objects when already having
         specified a cone of interest

     2.  Jeff: the original partial clone had code to restrict to a cone

     3.  Peff: we still have the code, but turned it off, you can’t have bitmaps
         with that (too heavy on the server)

     4.  Stolee: also, how can the cone be updated if things change? Never
         solved that problem

     5.  Stolee: but the extra blob downloads turned out not to be too big of a
         problem

     6.  Stolee: got a feature request to restrict git log to the current cone,
         git grep already does that (thanks Matheus)

     7.  Elijah: “git grep” without revision arguments is restricted to
         worktree, so it respects the sparse checkout. When you pass a
         revision, though, it searches the whole tree

     8.  Many commands want to examine the whole tree, makes sense to figure
         out the UX (configuration, etc) of them together

     9.  Peff: Is diff code on someone’s radar?

     10. Stolee: I’d view that as part of the same story as “git log”, “git log
         -p”.

     11. Sparse index means we can avoid faulting in trees outside of HEAD, so
         it helps unlock this

 4.  Sparse index: Victoria and Lessley are taking lead on the number of
     commands supporting sparse index

     1. update-index, diff, blame, clean, stash, sparse-checkout itself so far
        supported only in the Microsoft fork of Git

     2. Enabled by default internally so helps us gather data

     3. Elijah: awesome that you’re working on this, sorry I haven’t been as
        responsive as I’d like on reviews

     4. I’m interested in “clean” in particular --- isn’t that about untracked
        files?

     5. Stolee: It uses the index to find what is tracked, want to avoid
        expanding the in-memory index. If there are files outside the sparse
        checkout area then it does expand.

 5.  jrnieder: question about failure modes

     1. When I convert a command, I make sure my code path doesn’t assume the
        cache array contains all entries. Then I turn off
        command_requires_full_index. What happens if I missed a spot?

     2. Stolee: I put ensure_full_index() in front of everything that assumes a
        full index, but if there’s a loop that we missed, there’s no extra
        protection.

     3. Example: cache-tree was calling itself, invalidating points,
        segfaulted.

     4. More worrying failure mode would be if commands proceed with bad data.
        Segfaulting is the good case!

     5. jrnieder is not too worried since we’re pretty far along and soon
        enough we’ll have converted all commands and these questions would be
        moot

        1. Stolee: goal isn’t to get 100% coverage, so the point where these
           questions become moot isn’t coming soon

        2. jrnieder: Thanks! Okay, I’ll take a look.

     6. http://sweng.the-davies.net/Home/rustys-api-design-manifesto

     7. Stolee is less worried because we have sufficient ensure_full_index
        calls.
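
The failure mode discussed in item 5 can be illustrated with a toy model. This is only a sketch under loud assumptions: `ToyIndex` and both counting functions are hypothetical stand-ins written for this note, not Git's real index structure or API. It shows why a loop that assumes a full index can quietly proceed with bad data unless an expansion guard (in the spirit of ensure_full_index()) runs first.

```python
from typing import Dict, List


class ToyIndex:
    """Toy stand-in for an index; NOT Git's real data structure."""

    def __init__(self, entries: List[str], trees: Dict[str, List[str]]):
        self.entries = entries  # a path ending in "/" is a collapsed directory
        self.trees = trees      # collapsed directory -> files it expands to
        self.sparse = any(e.endswith("/") for e in entries)

    def ensure_full(self) -> None:
        """Expand every collapsed directory entry into its files."""
        if not self.sparse:
            return
        full: List[str] = []
        for entry in self.entries:
            full.extend(self.trees[entry] if entry.endswith("/") else [entry])
        self.entries = sorted(full)
        self.sparse = False


def count_files_unguarded(index: ToyIndex) -> int:
    # Missed spot: silently counts a collapsed directory as a single file.
    return len(index.entries)


def count_files_guarded(index: ToyIndex) -> int:
    index.ensure_full()  # guard placed in front of code that assumes a full index
    return len(index.entries)
```

The unguarded function does not crash; it just returns a wrong answer, which is the "more worrying" case from the notes.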

 6.  One optimization we’re considering: not expanding the full index when
     anything outside the cone is needed (we’d like to maybe expand just the
     part that needs expanding)

     1. Elijah: we would still keep cone mode, but it’s a bit weird because the
        cone mode does not match what we have in the index

     2. Stolee: we might actually not need this

 7.  Stolee: in the process of this work, found D/F conflict issue, made a test
     illustrating it

 8.  Elijah: atomicity

     1.  checkout is a non-atomic operation. ^C makes a mess

     2.  “git sparse-checkout disable” is non-atomic. Takes a while, people ^C,
         and the very last step is updating the sparsity files. Leaves the
         worktree with a bunch of files they don’t need but commands ignore
         them

     3.  We run into problems because then they can check out a different
         branch, do a bunch of other work, then update the sparse-checkout and
         it will see these precious files it doesn’t want to overwrite

     4.  Should “git status” show them?

     5.  Dscho: We could set a flag on disk when you’re about to disable, then
         if we were interrupted print an error message to get the user to sort
         things out

     6.  Peff: I was going to suggest something similar. FS doesn’t make
         transactions easy, but we can at least do a rollback (signal handler),
         not foolproof, but it works pretty well and covers your ^C case.

     7.  Stolee: coming in 2.34: sparse-checkout reapply will delete ignored
         (and tracked?) files. Helps with these leftover files.

     8.  Elijah: no current way to get out of that state, thank you for making
         sparse-checkout reapply do that

     9.  Stolee: noticed during experimental release to people from Office.
         Everything was slow because they had run build and left behind ignored
         files

     10. jrnieder: Piggy-backing on Dscho’s comment, there’s a database
         analogy: record intent (in the database case, that’s a transaction
         journal) before the non-atomic steps that act on that intent. Suggests
         maybe we should be updating the sparsity pattern before the checkout
         step
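
The "record intent first" idea from Dscho and jrnieder in item 8 could look roughly like the sketch below. It assumes a hypothetical marker file name (`pending-operation.json`) and hypothetical helper names; nothing here is an existing Git mechanism. The intent is written atomically before the non-atomic steps start, and a leftover marker on the next run signals that an earlier run was interrupted.

```python
import json
import os
import tempfile
from pathlib import Path


def write_atomically(path: Path, text: str) -> None:
    """Write text to path via a temporary file and an atomic rename."""
    fd, tmp = tempfile.mkstemp(dir=path.parent)
    with os.fdopen(fd, "w") as f:
        f.write(text)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)


def run_with_intent(state_dir: Path, intent: dict, steps) -> None:
    """Record intent, run the non-atomic steps, then clear the record."""
    marker = state_dir / "pending-operation.json"  # hypothetical file name
    if marker.exists():
        pending = json.loads(marker.read_text())
        raise RuntimeError(f"interrupted operation needs attention: {pending}")
    write_atomically(marker, json.dumps(intent))
    for step in steps:
        step()  # may be interrupted part-way, e.g. by ^C
    marker.unlink()  # reached only if every step completed
```

A signal-handler rollback, as Peff suggests, would be a complementary measure; this sketch only covers detecting the interruption afterwards.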

 9.  That’s it, that’s the status update on what’s currently on the list.

 10. We have more plans, though.

 11. Idea: use git.git itself

     1. Tried it, but had to have 97% of the files to still be workable

     2. Could change the Makefile to accept that, say, po/ is missing

     3. Ævar: creates a lot of complexity for the build

     4. jrnieder: as VCS provider, what is our recommendation to build authors?
        Do we want them querying sparse checkout, do we want builds that Just
        Work in cone mode, do we want to treat sparse checkout as a thing that
        builds don’t need to support?

     5. Stolee: want build system to be able to tell Git about what needs to be
        checked out. “In-tree sparse checkout” (see below)

 12. Emily: we’re interested in sparse-checkout affecting the set of active
     submodules, just mentioning this as a heads-up

 13. [PATCH 00/10] [RFC] In-tree sparse-checkout definitions - Derrick Stolee
     via GitGitGadget
     (https://lore.kernel.org/git/pull.627.git.1588857462.gitgitgadget@gmail.com/)

 14. Victoria: today when you switch gears and work on something else you have
     to update the sparse checkout pattern

 15. Proposal here is to have in-tree sparse checkout definitions, e.g. a
     .gitdependencies file that lists, for the directories you’re working with,
     what other subdirectories they depend on

 16. That way, you get exactly the folders you need

 17. Stolee: Office has their own tool “scoper” that figures out dependencies
     and runs “git sparse-checkout set” for the user. Is confusing when you
     rebase and need to remember to run it

 18. Currently lives in a hook, custom and built for one engineering system,
     want to generalize and make a standard feature

 19. Victoria: being built in to Git would make sense because it’s general
     enough to work in most monorepo environments.

 20. Involves two pieces: having git understand the dependencies and assemble
     your sparse checkout cone using them, and having the build system maintain
     and use sparse checkout correctly.

 21. Some build setups tolerate missing directories reasonably well. If we make
     .gitdependencies more of a first-class concept then we could go further
     and make build systems handle missing directories as something that would
     be expected

 22. C# .proj files link to dependencies on other .proj files with relative
     path. But in a solution file collecting all .proj files, it lists all of
     them and you need to have them all present. If a subdirectory isn’t
     present, proposal is to build what is there instead of everything.

 23. Tried another prototype on how to do this in Bazel. It has a rigorous
     definition of inputs and outputs, and based on that you could translate to
     a .gitdependencies file or sparse-checkout pattern.

 24. Microsoft’s BuildXL has similar properties

 25. Victoria asks: how general is the above?

 26. brian: Many monorepos have multiple microservices. A cone can represent
     what a particular service needs to run.

 27. If you’re building one coherent product like Windows, you’re going to need
     some prebuilt artifacts that you pull down.

 28. jrnieder: Large monorepos often have strong remote build. Not everything
     you depend on needs to be available in source form locally

 29. CB: My team at Bloomberg has a teamwide “monorepo” (not Bloomberg-wide).
     We’re cmake based. Sparse checkout would be interesting for us. We’re
     experimenting with what’s called workspace builds: you have a thing you
     can build (a subdirectory), that you pull into the toplevel CMakeLists.txt
     as a single thing.

 30. With cmake you can declare a dependency with target_link_libraries. A
     dependency name can either be a cmake defined target in the codebase
     you’re building, or it can be a pre-built library pulled in another
     way, e.g. importing via a pkg-config file.

 31. At build time if I decide I want to change that library, I’ll expand my
     sparse-checkout region, and rerun cmake to have it understand the newly
     available source.

 32. Optionality: I don’t have to have that source checked out, but when it’s
     present I want to use it.

 33. Victoria: sounds like in-tree sparse checkout is more of an intermediate
     step. Sometimes you want the source, sometimes you want to pull in an
     external artifact.

 34. Elijah: we have a monorepo, about the size of the Linux kernel. Multiple
     separate services, interconnected pieces. Using sparse-checkout required
     some code changes, refactoring that wasn’t just around the build system.
     We created a tool before the sparse-checkout command existed, using older
     mechanisms, and then switched to sparse-checkout when it came out. We
     track our dependencies ourselves --- you need this set of modules (3 or 4)
     or the modules relevant to a particular team, and it then computes the
     relevant directories to get. We had to make some changes to adopt cone
     mode but I like it and the changes it led to. Then you run the build
     system --- you have files that declare the dependencies, are they newer
     than .git/info/sparse-checkout? If not then recompute them again.

 35. Potentially would want to rerun the dependency generation after you run a
     rebase as well…

 36. If we track it in-tree, there are some interesting cases we’ll run into
     (merge conflicts on this generated file).

 37. Also, tracking dependencies in two places can result in difficulty, skew.
     Maybe can generate one from the other.

 38. Our sparse checkout tends to be build oriented “what do I need for this
     build”. But testing inverts the dependency graph, want to see what tests
     depend on this code. We encourage them to test in the cloud but not
     everyone does that, leads fewer people to use sparse checkout.

 39. There’s some remote build, mixing-and-matching pieces built remotely and
     locally.

 40. Part of working in a monorepo is you need strong tool hygiene enforcement.
     Without that, you get a ball of mud of dependencies. Adopting sparse
     checkout drove modularity.

 41. Ævar: I’d be interested in a summary

 42. Git’s lack of support for sparse checkout was unusual, so I think this
     topic is well explored by previous version control systems
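
The in-tree dependency proposal (items 15, 16 and 20) amounts to computing a transitive closure over per-directory dependencies and handing the result to "git sparse-checkout set". The sketch below assumes the dependency data has already been parsed into a dictionary; the notes do not fix a file format for .gitdependencies, so the data layout, the directory names and the function name are all hypothetical.

```python
from collections import deque
from typing import Dict, List


def sparse_cone(deps: Dict[str, List[str]], wanted: List[str]) -> List[str]:
    """Return the wanted directories plus everything they transitively need."""
    seen = set(wanted)
    queue = deque(wanted)
    while queue:
        directory = queue.popleft()
        for dep in deps.get(directory, []):
            if dep not in seen:  # also protects against dependency cycles
                seen.add(dep)
                queue.append(dep)
    return sorted(seen)


# Hypothetical dependency data, standing in for parsed in-tree definitions.
deps = {
    "services/web": ["libs/http", "libs/auth"],
    "libs/auth": ["libs/crypto"],
    "libs/http": [],
}
cone = sparse_cone(deps, ["services/web"])

# Only builds and prints the command line; it does not run Git.
command = ["git", "sparse-checkout", "set", *cone]
print(" ".join(command))
```

As items 35 to 37 point out, the result would have to be recomputed after operations such as rebase, and keeping it in sync with the build system's own dependency data is the hard part that this sketch does not address.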

* [Summit topic] The state of getting a reftable backend working in git.git
  2021-10-21 11:55 Notes from the Git Contributors' Summit 2021, virtual, Oct 19/20 Johannes Schindelin
                   ` (4 preceding siblings ...)
  2021-10-21 11:56 ` [Summit topic] Sparse checkout behavior and plans Johannes Schindelin
@ 2021-10-21 11:56 ` Johannes Schindelin
  2021-10-25 19:00   ` Han-Wen Nienhuys
  2021-10-21 11:56 ` [Summit topic] Documentation (translations, FAQ updates, new user-focused, general improvements, etc.) Johannes Schindelin
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 58+ messages in thread
From: Johannes Schindelin @ 2021-10-21 11:56 UTC (permalink / raw)
  To: git

This session was led by Ævar Arnfjörð Bjarmason (on behalf of Han-Wen
Nienhuys, the driving force behind the reftable patches, who did not
attend the Summit). Supporting cast: Jonathan "jrnieder" Nieder, Johannes
"Dscho" Schindelin, Philip Oakley, Jeff "Peff" King, and Junio Hamano.

Notes:

 1.  Ævar: helping Han-Wen with reviewing

 2.  Was split into multiple patch series

 3.  Han-Wen implemented the reftable library, has been kicking around on the
     mailing list for over a year

 4.  Before reftable, need to merge some preliminaries

 5.  Odd cases:

     1. slightly different semantics of reflog

     2. Probably more things that haven’t cropped up yet

     3. Some tests are still broken, question is still: are the tests wrong, or
        the code

 6.  Plan is to get reftable library and underlying fixes in place, and then do
     the process of actual reftable-as-ref-backend afterward

     1. Jrnieder: sounds like you’re alluding to a mailing list thread. Do you
        have a link?

     2. Ævar: there are multiple, as the initial patch series has been split
        multiple times

     3. “Reftable plan”:
        https://lore.kernel.org/git/87h7jqz7k5.fsf@evledraar.gmail.com/

     4. Also alluded to in Han-Wen’s later rerolls

 7.  What is reftable?

     1. It’s a custom way to store refs

     2. Instead of writing individual files per ref, it’s a single file (or
        multiple files when updating the refs)

 8.  Dscho: three issues that were outstanding when I reviewed it

     1. That said, I clashed with Han-Wen

     2. Issue 1: licensing / contribution model

        1. Ævar: through the Software Freedom Conservancy we have good access
           to legal advice. I got advice there about how to document this well,
           will be getting us into an end state that will hopefully satisfy
           everybody

        2. To be clear, we already have some code in-tree that is under
           different licenses. xdiff is LGPL code used by libgit2, there’s
           contrib/ + compat/ code under various licenses

        3. For legal purposes want to make sure this is clear and unambiguous
           to everyone

        4. jrnieder: about contribution model, there is on-list discussion
           about this, taking patches in the normal way to this directory in
           git@vger.kernel.org is where I thought that ended up

        5. Ævar: yes, git.git as source-of-truth. Not like gitk where there’s a
           separate upstream repo

     3. Issue 2: coding style consistency, + not using git core data
        structures enough

        1. Ævar: still substantially true. Integrating into git.git means any
           stylistic or structural changes to fit well into git are fair game.
           Carlo has been helping with that

     4. Issue 3: I forgot the third :)

 9.  Philip Oakley: debugging when things go wrong

     1.  When reftable arrives, will people be unable to look behind the scenes
         at what’s going on when issues happen?

     2.  Especially for people who don’t understand refs as well

     3.  Jonathan: format =
         www.kernel.org/pub/software/scm/git/docs/technical/reftable.html
         [http://www.kernel.org/pub/software/scm/git/docs/technical/reftable.html]

     4.  Ævar: That’s a fair summary. It’s as though we didn’t have packfiles
         and only had loose files and then switched to using packfiles. Can’t
         just “cat” any more. Switching to a binary format

     5.  That said, you get advantages out of that. Situations where people end
         up needing to examine the low-level details are

     6.  Not a fully fair comparison, but we have this problem already with
         packed-refs, having to look in two places

     7.  Philip: A tool to export the refs as a directory tree might be handy
         for inspection

     8.  Peff: We have pretty good inspection tools that look at the whole ref
         database

     9.  Reftable has a set of files that go together. May want debugging tool
         to dump the content of a binary reftable file. But we can
         incrementally add those

     10. As we discover bugs, I expect to have to build tooling

     11. Dscho: We also have a .git/index file and don’t have tooling to
         interact with it other than the standard Git tools

     12. Ævar: To be clear, once these patches land it would still be optional,
         would not be the default ref backend

     13. Even if it’s 100% bug free, we still have concerns about users in the
         wild that make it not so easy to just flip the switch

     14. Not going to be the default backend any time soon

     15. jrnieder: makes sense to wait for a while to make it the default, even
         once it is robust, since we have to pay attention to what Git versions
         + implementations are out there in the wild

     16. Ævar: When you run “git init”, it currently still creates a branches/
         directory. Dscho tried to get rid of it before

     17. jrnieder: I think that previous attempt was getting rid of read
         capability, too

     18. Dscho: don’t remember the details, has been a couple years

     19. Junio: I do not think it is a bad idea to drop branches from template.

 10. jrnieder: Question about how to handle this kind of large contribution

 11. At some point does it make sense to take it, mark as experimental, and
     improve in place?

 12. Hoping the previous discussion will help me think about that

 13. Ævar: I agree about importing the bulk of the code as-is and iterating
     from there

 14. At that point it’s still not accessible to users but we get portability,
     testing, etc

 15. Dscho: Agreed, that makes sense to me
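
For context on the "having to look in two places" remark in item 9, here is a minimal sketch of how a ref is found with the existing files backend: a loose file under the Git directory takes precedence, and the packed-refs file is the fallback. The function names are made up for this note, and the sketch makes simplifying assumptions: it does not follow symbolic refs (a loose file containing "ref: ..." is returned as-is) and it skips peeled-tag lines. It says nothing about the reftable binary format itself.

```python
from pathlib import Path
from typing import Dict, Optional


def read_packed_refs(git_dir: Path) -> Dict[str, str]:
    """Parse <git_dir>/packed-refs into a {refname: object id} mapping."""
    refs: Dict[str, str] = {}
    packed = git_dir / "packed-refs"
    if not packed.exists():
        return refs
    for line in packed.read_text().splitlines():
        # '#' starts the header comment, '^' marks a peeled tag target.
        if not line or line[0] in "#^":
            continue
        oid, name = line.split(" ", 1)
        refs[name] = oid
    return refs


def resolve_ref(git_dir: Path, name: str) -> Optional[str]:
    """Look up a ref: the loose file wins, packed-refs is the fallback."""
    loose = git_dir / name
    if loose.is_file():
        return loose.read_text().strip()
    return read_packed_refs(git_dir).get(name)
```

Both sources are plain text that can be inspected with "cat", which is the property Philip and Ævar note would be lost with a binary format.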

* [Summit topic] Documentation (translations, FAQ updates, new user-focused, general improvements, etc.)
  2021-10-21 11:55 Notes from the Git Contributors' Summit 2021, virtual, Oct 19/20 Johannes Schindelin
                   ` (5 preceding siblings ...)
  2021-10-21 11:56 ` [Summit topic] The state of getting a reftable backend working in git.git Johannes Schindelin
@ 2021-10-21 11:56 ` Johannes Schindelin
  2021-10-22 14:20   ` Jean-Noël Avila
  2021-10-21 11:56 ` [Summit topic] Increasing diversity & inclusion (transition to `main`, etc) Johannes Schindelin
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 58+ messages in thread
From: Johannes Schindelin @ 2021-10-21 11:56 UTC (permalink / raw)
  To: git

This session was led by brian m. carlson. Supporting cast: Jeff "Peff"
King, Ævar Arnfjörð Bjarmason, Taylor Blau, Philip Oakley, Emily Shaffer,
CB Bailey, and Jonathan "jrnieder" Nieder.

Notes:

 1. Background: answering on StackOverflow, other avenues for user questions,
    even users from very large companies

 2. How can we improve documentation?

 3. Maybe even think about translating docs such as FAQs

 4. Peff: there’s an effort to translate manpages

    1. brian: Saw an announcement, haven’t seen what came of it

    2. Peff: Some translated pages are live on git-scm.com (a github repo with
       translations)

    3. Ævar: It uses a third-party tool (po4a) that uses gettext by making each
       paragraph a translated string. So it’s the same workflow as translating
       code changes

    4. Taylor: https://github.com/jnavila/git-manpages-l10n

 5. Philip Oakley: I see manpages used as reference material instead of
    educational documents

    1.  Audience often already knows what they’re looking up

    2.  That approach makes it harder to bring people in. Examples are of the
        difficult things instead of how to get started, workable examples that
        can be copy/pasted straight into the shell and tell you how things go

    3.  Emily: We have the two-part Git tutorial (“git help tutorial”) which is
        part of manpages, but I think it’s pretty dated. It starts with how to
        convert your zipfile-based software distribution to Git which is not
        where most people start these days

    4.  Philip: user manual also is not accessible as part of the manpages

    5.  CB: I wonder if this is even where people look. A lot of new users will
        hit Google and find git-scm.com/book which historically has been a very
        good introduction

    6.  Slightly misleading calling it Pro Git because it has good
        introductions

    7.  Philip: maybe the Git project wants to state: we don’t make great
        documentation, look elsewhere

    8.  jrnieder: thank you for the perspective. It’s not quite the intent,
        though, we might just not do a good enough job. For example, when
        examples are too complex, that’s worth improving

    9.  Used to have active contributors who maintained documentation better
        (e.g., Jon Loeliger)

    10. A part of the problem is the format. Pro Git can include diagrams, the
        Git user manual can’t (or at least doesn’t)

    11. brian: likes Pro Git, but maybe not the best for new folks (it assumes
        some familiarity with source code management)

    12. In stackoverflow you can see how people answer questions, how much less
        existing background they assume

    13. Ævar: One issue with the Pro Git book is that it is not under a free
        software license (though it is free of charge). That means it can’t be
        included in free software distributions.

    14. I want to close the gap between output we emit and providing backlinks
        to relevant documentation. E.g. sometimes when we emit advice output,
        we say what config variable is involved and sometimes we don’t

    15. Having documentation distributed with Git is also helpful for having
        something that’s up to date and matches the code people are using

    16. Philip Oakley: Google Season of Docs is a place we can help

    17. brian: Mining stackoverflow has been very helpful for FAQs, helps avoid
        having to give the same answer again and again

    18. Goal is to have a good FAQ in git/git, to be able to link to from
        StackOverflow

    19. Perl approach of including references in error messages is very useful
        for people being able to solve their own problems

    20. Ævar: “git help git” landing page is not so helpful. I’d prefer
        something like the perl manpage that gives an overview and table of
        contents and nothing else, instead of incorporating reference
        documentation about common options

    21. brian: I’d like to see both in the toplevel manpage. “How to invoke
        git” is something people expect to see when they run “man git”

    22. Ævar: agreed about synopsis, as long as it focuses on the commonly used
        options

    23. Peff: every time I want to look up perl commandline options, I run “man
        perl”, get annoyed, and then run “man perlrun”. I think “man git” does
        the things you’re describing but organized poorly. Even “git help”
        output does a better job of organizing. I also wouldn’t be sad to see
        the options section coming after.

    24. CB: dashed commands should not be listed

 6. Emily: Side topic: the state of git help on stackoverflow is abysmal

    1. Doesn’t have much Git project presence, devrel teams focus on
       company-specific things instead of Git basics.

    2. A lot of answers are just wrong

    3. Someone spending some 20% time on that could improve things a lot and in
       the process would see where people are struggling, which can help us
       make Git more intuitive and make better, more intuitive tutorial
       documentation

    4. Ævar: Having a commonly cited FAQ used in stackoverflow can be great

 7. Philip: there are commands that are (at least almost) undocumented, e.g.
    git rerere

    1. brian: Have seen occasions where people struggled with commands like
       this

    2. Ævar: have seen undocumented patches

    3. Tried to improve documentation e.g. git fetch --prune, sometimes
       phrasing is too concise to be helpful

    4. Emily: getting confused e.g. when notes are transported via operations
       such as rebase

    5. Philip: may go in hand with the lack of good examples

 8. Lots of good ideas for contributions!

* [Summit topic] Increasing diversity & inclusion (transition to `main`, etc)
  2021-10-21 11:55 Notes from the Git Contributors' Summit 2021, virtual, Oct 19/20 Johannes Schindelin
                   ` (6 preceding siblings ...)
  2021-10-21 11:56 ` [Summit topic] Documentation (translations, FAQ updates, new user-focused, general improvements, etc.) Johannes Schindelin
@ 2021-10-21 11:56 ` Johannes Schindelin
  2021-10-21 12:55   ` Son Luong Ngoc
  2021-10-21 11:57 ` [Summit topic] Improving Git UX Johannes Schindelin
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 58+ messages in thread
From: Johannes Schindelin @ 2021-10-21 11:56 UTC (permalink / raw)
  To: git

This session was led by Johannes "Dscho" Schindelin. Supporting cast:
brian "Bmc" carlson, Jeff "Peff" King, Taylor Blau, CB Bailey, Ævar
Arnfjörð "Avarab" Bjarmason, Jonathan "Jrnieder" Nieder, Derrick Stolee,
Lessley Dennington, Glen Choo, Philip Oakley, Victoria Dye, and Jonathan
"Jonathantanmy" Tan.

Notes:

 1.  Dscho: Background. If you look into archives, I myself have made a change
     in communication style. Used to do a lot of judgmental code reviews, but
     noticed that interactions which are respectful and collaborative can be
     much more productive and enjoyable. I learned this from management help at
     $dayjob; it helps a lot to strengthen the team & product. Two people with
     different points of view working together make a better result than one
     person

 2.  Lately noticing more places Git can improve, for example the default
     branch name ‘master’, which has a huge impact on a large part of the
     population, esp. in the USA. ‘main’ is a very good alternative adopted
     by most hosters (GitLab, GitHub, BitBucket, etc)

 3.  There is still work to do in documentation, other places. Some of these
     changes can be wide-reaching so dscho trying to contribute them in a less
     painful way; soon hopefully we can switch the default without this option
     workaround. A lot of work, but the impact is worth it

 4.  Still, a tiny first step - let’s go further, both with Git the tool and
     Git the community/list. Easy to forget there is someone on the other end
     of the email address who could feel discouraged

 5.  Git the community is still overwhelmingly male, white - what can we do?

 6.  Bmc: the note about conversational tone is important. I’ve heard from
     others, especially women that a confrontational review style is a turnoff;
     that’s true for me too. No longer interested in a 20-email debate about
     something. Let’s look for a more collaborative approach on list; when I
     point this out I don’t see a bunch of help backing me up, usually.

 7.  Peff: I like that bmc points out when folks aren’t behaving well; I often
     consciously don’t jump in to support you, because a pile-on is a bad
     experience for someone who’s usually ok, and if it’s someone who’s truly
     awful, I don’t want to feed the troll - they’re prone to escalate instead.
     Understandable that you could wonder on your end how your pushback is
     being received - we should be better about showing support.

 8.  Bmc: the goal is to make the list more positive; if I should do something
     differently please tell me

 9.  Taylor: I’ve had conversations off-list with people who are speaking up
     on-list. Stolee: +1, I reply (not reply-all) to say “yes thank you”

 10. Dscho: Firstly, bmc, you’re a great example and I appreciate that you push
     back when you do. When I see just bmc’s reply and nobody else, it seems
     the offender is feeling validated and continues behaving that way (because
     nobody agreed with bmc publicly).

 11. Dscho: maybe we could be quicker to ask for a video chat when someone is
     being harsh on list. Often when I point it out the quick response is “well
     I thought it was fine” and escalates.

 12. CB: I would be intimidated being invited into a private video call with
     someone on the mailing list when I’m a brand new contributor. I have
     experience helping moderate the #include<c++> discord which tries harder
     to moderate aggressively and make inclusive environment. Moderation takes
     a lot of load, but it might be a better alternative than instant dogpiling
     from all the senior Gitters. Maybe a happy medium to coordinate off-list
     first before everybody jumps on someone communicating rudely.

 13. Avarab: there are some instructions in the CoC about
     enforcement/escalation. We tend to take these reports off-list, so
     something does happen but it’s not so visible. Usually these resolve
     happily, but the list is left hanging, so it’s easy to get the impression
     that nothing happened. At the same time, there are negatives to making the
     whole thing public.

 14. Philip: often common words mean different things between nationalities,
     this can get in the way

 15. Dscho: Thank you for taking on the task to be on the escalation list, it’s
     hard. It can be really soul-draining, e.g. the GfW “how can we make Git
     more inclusive” issue. I had to stay away from the GfW fork entirely
     because I was so tired from trolling on that issue. From my point of view,
     the CoC is there to protect folks with less standing/representation. I
     thought for sure the CoC would never be invoked by a white male, but we
     saw it happen

 16. Dscho: for this session, hoping to find strategies to turn the tone around
     on list and avoid issues from the beginning, throughout the
     project/community. Recently read Nonviolent Communication, has tips to
     turn around the conversation even if it started poorly

 17. Jrnieder: are you saying someone in a position of power should never make
     use of the CoC? Dscho: I figured it was not there for me, it was there for
     people who don’t have the privilege I do. Jrnieder: A few things - CoC
     sets expectations regardless of who interactions are directed to;
     somewhere unwelcoming to established contributors but welcoming to newbies
     is still not an appealing place for some newcomers to invest in joining
     because they know they will one day be an established contributor.
     Secondly, if I have a dispute, having a guide for turning a potentially
     problematic event into a productive event is really important. A process
     that moves in that direction of nonviolent communication. Easier said than
     done. But that’s part of making a friendly environment, and overall that
     points to not treating the CoC as “only for some people”.

 18. Bmc: the goal of the CoC is to produce and preserve the community we want
     to have. It should produce a place where everybody can participate fully;
     if we have an environment that’s unproductive or toxic we lose
     contributors. I’ve left projects over a poor contributor experience
     before, because it wasn’t worth my time to deal with the overhead.
     Conversely, having a great and safe experience is a good way to attract
     diverse contributor base.

 19. Stolee: When we moved from MS to GH, we received quick feedback that we
     weren’t communicating well - too direct and unemotional. Maybe Git
     community communicates that way, but that’s not how most people interact;
     that makes me think that our “efficient and effective” communication is
     actually too aggressive, and easily interpreted as attacks on
     contributors. Basically… let’s all lighten up? :)

 20. Taylor: Yep, my “talking to GitHubbers at GitHub” voice is different from
     my “talking to Gitters on Git list” voice. New contributors, are we on the
     right track here?

 21. Lessley: I made first contribution recently, and had been warned about the
     list, papers about the Linux kernel, and that open source contribution
     could be a little contentious. I was really nervous and put it off, but in
     practice it was fine; I broke ‘seen’ which was embarrassing but was ok in
     general. Maybe I’d have more input if I had a larger contribution. I do
     wish I had gotten more review faster.

 22. Glen: I’m also a pretty new contributor. The communication style is a lot
     more direct than what I’m used to on the outside. But that doesn’t mean
     it’s unhelpful or unconstructive… but it takes some getting used to. I had
     to put in a lot of effort to trust that folks meant well and were trying
     to help, and that’s just how we communicate here. But I could imagine it
     being really intimidating if people aren’t used to that kind of review.

 23. Taylor: To emphasize, I also remember when I started, I felt like people
     were disappointed/upset by my contributions. Took me a while to
     internalize that people were trying to help me make my contributions
     better. So we should A) be careful to remember that new contributor
     experience, and B) be careful to set an example even in reviews with
     veteran contributors.

 24. Philip: Often different nations have different writing style. “Thanks.” at
     the end of the email means “Thanks but no thanks” in British English, but
     that’s not usually how it’s meant on Git list. I also noticed it’s not
     well explained why something is a problem. Reviewer thinks everybody knows
     why they’re making some comment, but the person who proposed the patch
     really didn’t see the issue and won’t understand. We should give a little
     more background when pointing out an issue.

 25. CB: We’re not the only project that won’t land things that aren’t
     technically excellent. It’s important on my team too, or else the whole
     company falls over next week. We’ve had lots of internal events and talks
     about inclusive code reviews. The few extra sentences - “Thanks for
     submitting, it looks great, the direction is good, I’ve got comments
     because xyz” - go a really long way. We recently had a new team member’s
     “good first issue” turn into a 2 week ordeal, and taking the extra time to
     say “this is good progress, it’s shaping up well” was helpful to keep from
     discouraging our new teammate.

 26. Jrnieder: Chromium project has had a problem with this too, having very
     high standards. So first Cr contribution would often just feel like a
     hazing ritual. The focus should be on helpfulness, not “demonstrate how
     much you care about the project by putting up with us”

 27. https://chromium.googlesource.com/chromium/src/+/main/docs/cr_respect.md

 28. ^ This covers a lot of what we were talking about; a good reference for
     better/respectful code reviews. Timeframe, tone, stating
     goals/expectations, etc. Should we adopt something similar in Git?

 29. Bmc: It can be frustrating to spend a lot of time on a series and then
     immediately get a lot of technical feedback, without any assurance
     that the direction is good. We should work harder to say “I’m glad to
     see this patch, looking forward to seeing it land” instead of just
     pointing out things to fix. Or “the way this patch looks vs. the last
     version is really great”

 30. Dscho: One thing my manager does well is to lead by asking, to give space
     for me to reflect on what I just said and think about more perspectives.
     This doesn’t put me on the defensive right away. I’ve made the mistake
     before by assuming any reply at all implies “I’m interested in this
     feature” - that’s not obvious, and instead my review comes out as “You’re
     doing this wrong, go away” :( and I don’t know how many people felt that
     and didn’t say anything, because they left.

 31. Victoria: Given the unique nature of mailing list reviews, even though
     there are a ton of resources on how to give respectful reviews, it’d be
     useful to do a more specific guide for Git, discussing how to structure
     review reply, how to follow up, etc.

 32. Stolee: We have discussed a “guide to reviewing” in Documentation/, along
     with SubmittingPatches and CodingGuidelines. We avoided it because it’s a
     lot of work, and I’m also worried about the review of the review doc.
     Would be a productive discussion….but a lot :)

 33. Jonathantanmy: Yep, I’m thinking of doing one like that, hopefully in a
     few weeks we can discuss it on list.

 34. Avarab: I think it’s a good thing to work on; we need to be really careful
     about what guidelines we pick and choose. Need to ensure an easy path for
     new contributors so they don’t need to read hours of documentation for a
     typo fix. Plus we need to ensure that this doc is accessible for folks who
     have different first language than English.

 35. Bmc: on git-lfs we have a contributor with very little English, so when we
     did the review I’d offer an alternative text, and we would work together.
     That process was useful to come up with readable documentation in a
     helpful way. That is, proposing a solution instead of pointing out the
     problem and saying “fix it” can help a lot in scenarios like this.

 36. Dscho: Yep, this is important and will help us be more accessible to
     contributors whose English is not super top notch Cambridge exam :)

* [Summit topic] Improving Git UX
  2021-10-21 11:55 Notes from the Git Contributors' Summit 2021, virtual, Oct 19/20 Johannes Schindelin
                   ` (7 preceding siblings ...)
  2021-10-21 11:56 ` [Summit topic] Increasing diversity & inclusion (transition to `main`, etc) Johannes Schindelin
@ 2021-10-21 11:57 ` Johannes Schindelin
  2021-10-21 16:45   ` changing the experimental 'git switch' (was: [Summit topic] Improving Git UX) Ævar Arnfjörð Bjarmason
  2021-10-21 11:57 ` [Summit topic] Improving reviewer quality of life (patchwork, subsystem lists?, etc) Johannes Schindelin
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 58+ messages in thread
From: Johannes Schindelin @ 2021-10-21 11:57 UTC (permalink / raw)
  To: git

This session was led by Johannes "Dscho" Schindelin. Supporting cast:
Jonathan "jrnieder" Nieder, brian m. carlson, Ævar Arnfjörð Bjarmason, CB
Bailey, Phillip Wood, and Emily Shaffer.

Notes:

 1.  A serious problem! ~½ of blog posts about Git start by ridiculing the
     command-line interface of Git

 2.  Is very flexible, and that flexibility is reflected in the user interface

 3.  On the other hand, it could be a lot easier to use. Example: GitHub CLI in
     Go, tries not to supplant Git but gives a really good user experience
     interacting with GitHub-specific entities like PRs and Issues. Everything
     you can do day-to-day in the web UI you can do in the command line, and
     it’s scriptable

 4.  Excellent discoverability. Never needed to check the manual. Well designed
     interface, good command line completion.

 5.  What would it take to revamp Git’s user interface?

 6.  “git restore” example

     1. Dscho doesn’t like it, feels wrong

     2. Designed by a software engineer, not a designer

     3. jrnieder: The manpage is pretty clear about “this is experimental,
        we’re willing to modify it based on feedback”. Is there information we
        can gather and work with a designer to improve it?

     4. brian: “I don’t know how to use it” is valuable feedback. Maybe the
        documentation is failing? Etc

     5. Can be a sign of excessive complexity or of it relying on too much
        previous knowledge to use

 7.  Ævar: On switch/restore in particular, there was a recent discussion.

     1. Ultimately came down to inconsistency with other commands in the same
        area

     2. I gave some suggestions

     3. Some patches started, there was some trepidation about making changes,
        though

 8.  Dscho: Is working one command at a time too incremental?

     1. Inconsistencies between different commands

     2. “git add -A” exists, “git commit -A” doesn’t

     3. Are those the most pressing problems? I don’t even know that

 9.  Want guidance from user interface experts that work on the command line

 10. gh command involved contractors that are no longer with GitHub

 11. CB: The “pip” project is doing UX research. I don’t know who commissioned
     it. Got UX experts to design study:
     https://pip.pypa.io/en/latest/ux_research_design/

 12. Phillip: The inconsistency between 'checkout -b' and 'switch -c' was
     deliberate in the hope that '-c' for 'create' would be easier for users to
     understand but ends up being confusing.

 13. jrnieder: we should not expect an “angel” to swoop in and solve all our
     problems for us. It’s more about how do we build this skill within the Git
     project (by improving our own skills or attracting new contributors)

 14. We should also consider that there are many people on the mailing list
     with plenty of backgrounds, we might just need to band together to get it
     done

 15. Emily: maybe we can get some training (maybe the SFC could fund it, or
     others in the Git ecosystem)

 16. jrnieder: training is easier to fund than a permanent engagement

 17. brian: if we had this expertise, we could probably make better decisions
     in the future

 18. Ævar: Conservancy could potentially find someone, but funding a different
     matter

 19. jrnieder: neutrality not all that important in this context, finding
     funding at Google or GitHub should be easy

 20. Ævar: often comes down to these inconsistencies. Getting anywhere with that
     might just be a long slog.

 21. Phillip Wood: checkout -b vs switch -c inconsistency was deliberate, in
     that “-c” for create is meant to be easier for a new user

 22. Dscho: Good first step is getting the UX design basics (training), I’ll
     look for funding

 23. Example: when I just run “git” with no other options, can that output be
     more helpful? “gh” has a nice overview when I run it.

 24. Ævar: That in particular was improved years ago

     1. Dscho: Oh! Good. Might be possible to keep improving along those lines.

     2. Dscho: Sounds like we have a good path forward. \o/

 25. hallway conversation

     1. the index as a UI concept, what if we didn’t have it?

     2. learning curve design space

     3. how much does telemetry help us?

     4. popcon-style telemetry

     5. statistical rigor in surveys

* [Summit topic] Improving reviewer quality of life (patchwork, subsystem lists?, etc)
  2021-10-21 11:55 Notes from the Git Contributors' Summit 2021, virtual, Oct 19/20 Johannes Schindelin
                   ` (8 preceding siblings ...)
  2021-10-21 11:57 ` [Summit topic] Improving Git UX Johannes Schindelin
@ 2021-10-21 11:57 ` Johannes Schindelin
  2021-10-21 13:41   ` Konstantin Ryabitsev
  2021-10-22  8:02 ` Missing notes, was Re: Notes from the Git Contributors' Summit 2021, virtual, Oct 19/20 Johannes Schindelin
  2021-10-22  9:44 ` Let's have public Git chalk talks, " Johannes Schindelin
  11 siblings, 1 reply; 58+ messages in thread
From: Johannes Schindelin @ 2021-10-21 11:57 UTC (permalink / raw)
  To: git

This session was led by Jonathan "jrnieder" Nieder. Supporting cast: brian
"bmc" carlson, Emily Shaffer, Johannes "Dscho" Schindelin, CB Bailey,
Junio Hamano, and Matthias Aßauer.

Notes:

 1. Been interested in this for a long time

 2. Dscho said he’s not able to follow everything on the mailing list

    1. if you have just one patch you send, reply-all works okay

    2. mailing list works reasonably well if you’re someone like Junio, working
       on it full time, has good mail filters, keeps up to date with everything

    3. If you’re in-between, does not work well

 3. Lessley mentioned in the Diversity & Inclusion context:

    1. you send your patch,

    2. want to have a solid review

    3. want guidance where to go from here

    4. timely feedback

    5. want to know where the patch stands

    6. What happens: guilt-based workflow, where reviewers reply much later
       after being prodded

 4. bmc: I want some way to track patches

    1. What have I reviewed before and what have I not reviewed since last
       time?

    2. Emily: most of this exists in patchwork. Our intern Raxel Gutierrez did
       work on that this summer. Alas, that doesn’t show up on
       patchwork.kernel.org because it’s using patchwork 2.x and the features
       are in 3.x

    3. https://youtu.be/24dL8yqhYNg

 5. bmc: I want some kind of bug tracking system

    1.  In git-lfs when I need a git feature, people are happy to send a patch,
        there’s no point of coordination for this. Don’t know where to send a
        patch, don’t know where to send a bug report

    2.  debbugs works okay, has a huge spam problem, but it works fine; email
        based

    3.  Emily: Every time this comes up I go oh $&!& because this is
        perennially a source of dispute. I don’t care what tracker we use, just
        want one

    4.  Dscho: everyone else is caught in the crossfire between jrnieder and
        me.

    5.  CB: Is there an option that makes you both equally miserable?

    6.  bmc: Could we get kernel.org to host something?

    7.  jrnieder: there’s a bugzilla instance at bugzilla.kernel.org, which
        might satisfy CB’s criterion

    8.  bmc: I want to have whatever we use send out to the list. That would
        avoid conversations going on without people in the mailing list centric
        workflow being aware of it. If we are all using a GitHub/GitLab based
        workflow then that’s not required

    9.  Emily: +1, great point

    10. jrnieder: Sounds like we have some common ground so seems worth
        starting a mailing list thread

    11. Junio: As long as I’m not the person operating the bug tracker, I’m
        happy :)

    12. Dscho: Is it important to you that it sends things to the mailing list?

    13. Junio: Not really. The extra tracking conversations are not as
        important to me. I think it’s a useful feature that if someone
        requests a feature and nothing happens for a while, it no longer
        produces overhead for people. That kind of old filtering
        feature is sometimes valuable.

    14. jrnieder: in a bug tracker, triage + common sense of priorities is very
        useful. Experiences in JGit bugzilla vs the Debian bugtracker (the
        latter is better curated)

    15. brian: I’m happy to volunteer to do some triage on the bugtracker. If
        other people will help out and contribute, happy to do that

    16. I’m also happy to work with kernel.org admins to get this set up for us
        if that’s what we want

    17. people would be expected to be kind+helpful in interactions there, can’t
        expect it to devolve into a cesspit

    18. Matthias: I’m happy to help with triage too

* Re: [Summit topic] Crazy (and not so crazy) ideas
  2021-10-21 11:55 ` [Summit topic] Crazy (and not so crazy) ideas Johannes Schindelin
@ 2021-10-21 12:30   ` Son Luong Ngoc
  2021-10-26 20:14   ` scripting speedups [was: [Summit topic] Crazy (and not so crazy) ideas] Eric Wong
  1 sibling, 0 replies; 58+ messages in thread
From: Son Luong Ngoc @ 2021-10-21 12:30 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Hi,

On Thu, Oct 21, 2021 at 1:56 PM Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
>
> This session was led by Elijah Newren. Supporting cast: Johannes "Dscho"
> Schindelin, Jonathan Tan, Jonathan "jrnieder" Nieder, brian m. carlson,
> Jeff "Peff" King, Ævar Arnfjörð Bjarmason, Emily Shaffer, CB Bailey,
> Taylor Blau, and Philip Oakley.
>
> Notes:
>

...

>
> * Biggest idea: there are a lot of people who version control things via
>   tarballs or .zip files per version. This prevents history from
>   compressing well. Some people check in those compressed files into Git
>   for purposes of history.
>

...

>
>    * Old suggestion of a “blob-tree” type that allows storing a single
>      index entry that corresponds to multiple trees and blobs in the
>      background, possibly.
>
>    * One long-term dream (inspired by Avery Pennarun’s “bup” tool) is to
>      store large binary files in a tree-structured way that can store
>      common regions as deltas, improve random access, parallelized
>      hashing. Involves a consistent way to split the file into stable
>      pieces, like --rsyncable uses (based on a rolling hash being zero).
>
>    * Peff: you can do that at the object model layer or at the storage
>      layer. The latter is less invasive.
>
>    * jrnieder: The benefits of blobtree are greater at the object model
>      layer --- e.g. not having to transmit chunks over the wire that you
>      already have. I think the main obstacle has been that the benefits
>      haven’t been enough to be worth the complexity. If that changes, we
>      can imagine bundling it with some other object format changes, e.g.
>      putting blob sizes in tree objects, and rolling it out as a new
>      object-format.
>

I think this was implemented as 'Blob Ref' in Yandex's vcs named Arc.
I was suggesting this to Gitlab folks earlier (1) as a possible solution to
large file storage.

Very glad to hear that it was brought up during the summit.

Cheers,
Son Luong.

(1): https://gitlab.com/gitlab-org/git/-/issues/93
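
For readers unfamiliar with the chunking idea quoted above (splitting a
file into stable pieces wherever a rolling hash is zero), here is a minimal
Python sketch. It is an illustration under stated assumptions, not code
from Git, bup, or Arc: the name split_chunks, the 16-byte window, and the
6-bit mask are made up for the example, and a plain rolling sum stands in
for whatever hash a real implementation would use.

```python
"""Content-defined chunking sketch (illustrative; not Git, bup, or Arc code)."""
from typing import List

WINDOW = 16          # bytes covered by the rolling sum (assumed value)
MASK = (1 << 6) - 1  # cut when the low 6 bits of the sum are zero (assumed value)


def split_chunks(data: bytes, window: int = WINDOW, mask: int = MASK) -> List[bytes]:
    """Cut `data` after every position where the sum of the last `window` bytes
    has all of its `mask` bits equal to zero."""
    chunks = []
    start = 0
    rolling = 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= window:
            # Drop the byte that just slid out of the window.
            rolling -= data[i - window]
        # Only test full windows, so a cut point depends solely on the
        # preceding `window` bytes and not on the position in the file.
        if i >= window - 1 and (rolling & mask) == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks
```

Because a cut point depends only on the preceding window of bytes,
prepending data to a file changes the first chunk or two but leaves the
later chunks byte-identical, which is what would let common regions be
stored once.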

* Re: [Summit topic] Increasing diversity & inclusion (transition to `main`, etc)
  2021-10-21 11:56 ` [Summit topic] Increasing diversity & inclusion (transition to `main`, etc) Johannes Schindelin
@ 2021-10-21 12:55   ` Son Luong Ngoc
  2021-10-22 10:02     ` vale check, was " Johannes Schindelin
  0 siblings, 1 reply; 58+ messages in thread
From: Son Luong Ngoc @ 2021-10-21 12:55 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Hi,

On Thu, Oct 21, 2021 at 1:57 PM Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
>
> This session was led by Johannes "Dscho" Schindelin. Supporting cast:
> brian "Bmc" carlson, Jeff "Peff" King, Taylor Blau, CB Bailey, Ævar
> Arnfjörð "Avarab" Bjarmason, Jonathan "Jrnieder" Nieder, Derrick Stolee,
> Lessley Dennington, Glen Choo, Philip Oakley, Victoria Dye, and Jonathan
> "Jonathantanmy" Tan.
>
> Notes:
>

...

>
>  5.  Git the community is still overwhelmingly male, white - what can we do?
>

...

>
>  19. Stolee: When we moved from MS to GH, we received quick feedback that we
>      weren’t communicating well - too direct and unemotional. Maybe Git
>      community communicates that way, but that’s not how most people interact;
>      that makes me think that our “efficient and effective” communication is
>      actually too aggressive, and easily interpreted as attacks on
>      contributors. Basically… let’s all lighten up? :)
>
>  20. Taylor: Yep, my “talking to GitHubbers at GitHub” voice is different from
>      my “talking to Gitters on Git list” voice. New contributors, are we on the
>      right track here?
>

...

>
>  34. Avarab: I think it’s a good thing to work on; we need to be really careful
>      about what guidelines we pick and choose. Need to ensure an easy path for
>      new contributors so they don’t need to read hours of documentation for a
>      typo fix. Plus we need to ensure that this doc is accessible for folks who
>      have different first language than English.
>
>  35. Bmc: on git-lfs we have a contributor with very little English, so when we
>      did the review I’d offer an alternative text, and we would work together.
>      That process was useful to come up with readable documentation in a
>      helpful way. That is, proposing a solution instead of pointing out the
>      problem and saying “fix it” can help a lot in scenarios like this.
>
>  36. Dscho: Yep, this is important and will help us be more accessible to
>      contributors whose English is not super top notch Cambridge exam :)

Yes, thanks for mentioning the non-English speaking community.

I have been an avid reader of the Git Mailing List for the past years and can't
help but notice contributions from folks working in Alibaba(China) have been
taking a lot more iterations to get to final reviews than usual contributions.

I would recommend, on top of having a guideline document, to have a
Vale check (1) setup as a commit-msg hook and run it as part of
GitGitGadget CI to help folks shorten the feedback loops in some basic cases.

Cheers,
Son Luong.

(1): https://docs.errata.ai/vale/styles
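
A commit-msg hook along those lines could look roughly like the Python
sketch below. It is only a sketch under assumptions: it presumes a "vale"
binary on PATH and a Vale configuration that applies to the commit message
file, and the function name check_message and its "checker" parameter are
hypothetical names chosen for the example. Git invokes the commit-msg hook
with the path of the file holding the proposed message as its first
argument, and a non-zero exit status aborts the commit.

```python
#!/usr/bin/env python3
"""commit-msg hook sketch: run a prose checker on the proposed commit message."""
import shutil
import subprocess
import sys


def check_message(msg_path, checker=("vale",)):
    """Run `checker` on the message file and return its exit status.

    Returns 0 (do not block the commit) when the checker is not installed,
    so contributors without the tool are not locked out.
    """
    if shutil.which(checker[0]) is None:
        print(f"{checker[0]} not found; skipping prose check", file=sys.stderr)
        return 0
    # The message file path is passed as the last argument to the checker.
    return subprocess.run([*checker, msg_path]).returncode


if __name__ == "__main__" and len(sys.argv) > 1:
    # sys.argv[1] is the path Git hands to the commit-msg hook.
    sys.exit(check_message(sys.argv[1]))
```

Skipping silently when the tool is missing is a design choice for the
sketch; a CI job would more likely treat a missing checker as an error.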

* Re: [Summit topic] Improving reviewer quality of life (patchwork, subsystem lists?, etc)
  2021-10-21 11:57 ` [Summit topic] Improving reviewer quality of life (patchwork, subsystem lists?, etc) Johannes Schindelin
@ 2021-10-21 13:41   ` Konstantin Ryabitsev
  2021-10-22 22:06     ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 58+ messages in thread
From: Konstantin Ryabitsev @ 2021-10-21 13:41 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

On Thu, Oct 21, 2021 at 01:57:11PM +0200, Johannes Schindelin wrote:
>  2. Dscho said he’s not able to follow everything on the mailing list
> 
>     1. if you have just one patch you send, reply-all works okay
> 
>     2. mailing list works reasonably well if you’re someone like Junio, working
>        on it full time, has good mail filters, keeps up to date with everything
> 
>     3. If you’re in-between, does not work well

This is a problem that's not actually unique to mailing lists. If you have any
project that is popular enough, at some point it reaches critical mass where
developer/user feedback becomes too much for anyone to keep up. GitHub
projects aren't immune to this either, but they do have a benefit of providing
an easy interface for someone to apply categorization to issues/discussions.

One of the efforts currently under way at public-inbox is the "lei" tool that
should allow similar workflows for mailing-list based interactions. At some
point we will be able to provide both topical and search-based subscriptions
to subsets of the mailing list traffic that you're interested in. Search-based
subscriptions will allow you to monitor the list for discussions relevant to
your interest (e.g. patches touching functions/files/keywords that you are
working on). Topical subscriptions are a bit more complicated and would
require someone to actively categorize mailing list discussions by keywords
(e.g. bugs, suggestions, security), which would allow others to monitor just
those aspects of mailing list discussion. The latter requires someone's active
involvement and dedication from the project side, not unlike for categorizing
issues reported at GitHub or any other issue tracker.
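
To make the search-based idea concrete, here is a small Python sketch
that filters raw messages by keyword. It is not how lei works and does not
use its interface; it only uses the standard library, and the function
name matching_subjects is a hypothetical name for this example.

```python
"""Search-based subscription sketch using only the standard library (not lei)."""
from email import message_from_string


def matching_subjects(raw_messages, keywords):
    """Return the subjects of messages whose subject or body mentions any
    of the keywords (case-insensitive substring match)."""
    wanted = [k.lower() for k in keywords]
    hits = []
    for raw in raw_messages:
        msg = message_from_string(raw)
        subject = msg.get("Subject", "")
        # Only plain, single-part bodies are searched in this sketch.
        body = "" if msg.is_multipart() else msg.get_payload()
        haystack = f"{subject}\n{body}".lower()
        if any(k in haystack for k in wanted):
            hits.append(subject)
    return hits
```

For example, passing ["builtin/checkout.c"] as the keyword list would pick
out only the patches or discussions that mention that file.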

If you're curious, you can see my presentation to Linux Plumbers last month,
which is here:
youtube: https://www.youtube.com/watch?v=mF10hgVIx9o&t=1490s
slides: https://linuxplumbersconf.org/event/11/contributions/983/attachments/759/1421/Doing%20more%20with%20lore%20and%20b4.pdf

>  4. bmc: I want some way to track patches
> 
>     1. What have I reviewed before and what have I not reviewed since last
>        time?
> 
>     2. Emily: most of this exists in patchwork. Our intern Raxel Gutierrez did
>        work on that this summer. Alas, that doesn’t show up on
>        patchwork.kernel.org because it’s using patchwork 2.x and the features
>        are in 3.x

3.x is a bit new still, but chances are we'll be running it in a couple of
months. Unfortunately, our previous experiences with major patchwork upgrades
have been a bit thorny, so I'm trying to approach this carefully in order not
to impact other projects relying on it. (Not a dig at patchwork folks, just an
observation.)

>     7.  jrnieder: there’s a bugzilla instance at bugzilla.kernel.org, which
>         might satisfy CB’s criterion
> 
>     8.  bmc: I want to have whatever we use send out to the list. That would
>         avoid conversations going on without people in the mailing list centric
>         workflow being aware of it. If we are all using a GitHub/GitLab based
>         workflow then that’s not required

Bugzilla's mail integration is fairly good and list-friendly. We have several
projects that largely interact with their bugzilla via mailing lists
(two-way). Note, that someone still has to do things like closing and
recategorizing bugs through the website.

Note, that the initial bug report must come in through the bugzilla web
interface. There's a way to create bugs via incoming mail, but it works very
poorly.

>     13. Junio: Not really. The extra tracking conversations are not as
>         important to me. I think it’s a feature that if someone requests a
>         feature and nothing happens for a while that it no longer produces
>         overhead for people is a useful feature. That kind of old filtering
>         feature is sometimes valuable.

I find that if there's no mailing list integration, then bugzilla generally
rots after the initial person getting the bug reports moves on. Then bugs
reported via bugzilla just sit there without anyone paying attention. At least
when bug reports get sent to the list, the ensuing discussions get reflected
in both the list archives and in bugzilla.

>     16. I’m also happy to work with kernel.org admins to get this set up for us
>         if that’s what we want

Consider this part done. :)

-K

* changing the experimental 'git switch' (was: [Summit topic] Improving Git UX)
  2021-10-21 11:57 ` [Summit topic] Improving Git UX Johannes Schindelin
@ 2021-10-21 16:45   ` Ævar Arnfjörð Bjarmason
  2021-10-21 23:03     ` changing the experimental 'git switch' Junio C Hamano
                       ` (3 more replies)
  0 siblings, 4 replies; 58+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-21 16:45 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Josh Steadmon

On Thu, Oct 21 2021, Johannes Schindelin wrote:

>  7.  Ævar: On switch/restore in particular, there was a recent discussion.
>
>      1. Ultimately came down to inconsistency with other commands in the same
>         area
>
>      2. I gave some suggestions

Those suggestions are at:
https://lore.kernel.org/git/877dkdwgfe.fsf@evledraar.gmail.com/ copying
the most relevant part from that:

In summary, I think it should be changed to act like this:

    |---------------------------+------------------------+---------------------------|
    | What                      | Now                    | New                       |
    |---------------------------+------------------------+---------------------------|
    | Switch                    | git switch existing    | git switch existing       |
    | Error                     | git switch nonexisting | <no change (errors)>      |
    | Switch with --merge       | git switch -m branch   | git switch --merge branch |
    | Create                    | git switch -c new      | git switch -n new         |
    | Create from existing      | N/A                    | git switch -c new [<old>] |
    | Move & switch to existing | N/A                    | git switch -m new [<old>] |
    |---------------------------+------------------------+---------------------------|

>      3. Some patches started, there was some trepidation about making changes,
>         though

I was thinking of this patch, i.e. it implements the "-n" option for
"git switch":
https://lore.kernel.org/git/20210709174310.94209-1-felipe.contreras@gmail.com/

We could then add the same to "git branch", i.e. "git branch foo" could
also be invoked as "git branch -n foo".

We'd then need to have a hard change in the semantics of the
(experimental) "git switch" command to make "-c" mean "copy" (like in
"git branch").

We'd then reach an end-state where these two commands would behave in
the same way for these common options, with the difference being that
"branch".

Whatever anyone thinks of my specific suggestions there I think that in
general we should be trying to aim more towards that in git's UI, even
to the point of slowly phasing in deprecations for non-experimental
commands. E.g. the "-n" option to "git fetch" comes to mind, which isn't
a synonym for "--dry-run", as in most other places.

I realize that doing that is hard, e.g. Josh Steadmon has a patch
on-list now to add a configurable "inherit" mode to "git branch" [1].

I noted in a similar vein as the table above that it would leave us with
another inconsistency between "branch" and "checkout"/"switch" in [2].

Does that mean we shouldn't take that patch and others like it until
such UX inconsistencies are addressed?

I really don't know, but I do think that the most viable path to a
better UX for git is to consider its UX more holistically.

To the extent that our UX is a mess I think it's mainly because we've
ended up with an accumulation of behavior that made sense in isolation
at the time, but which when combined presents bad or inconsistent UX to
the user.

1. https://lore.kernel.org/git/9628d145881cb875f8e284967e10f587b9f686f9.1631126999.git.steadmon@google.com/
2. https://lore.kernel.org/git/87a6j6tbsv.fsf@gmgdl.gmail.com/

* Re: changing the experimental 'git switch'
  2021-10-21 16:45   ` changing the experimental 'git switch' (was: [Summit topic] Improving Git UX) Ævar Arnfjörð Bjarmason
@ 2021-10-21 23:03     ` Junio C Hamano
  2021-10-22  3:33     ` changing the experimental 'git switch' (was: [Summit topic] Improving Git UX) Bagas Sanjaya
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 58+ messages in thread
From: Junio C Hamano @ 2021-10-21 23:03 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Johannes Schindelin, git, Josh Steadmon

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

>     | Switch with --merge       | git switch -m branch   | git switch --merge branch |
>     | Create                    | git switch -c new      | git switch -n new         |
>     | Create from existing      | N/A                    | git switch -c new [<old>] |

I agree that adding a way to "clone", which is missing from
"checkout/switch/branch", is a good idea.  I do not necessarily
think it is a good idea to say "-n" is "new" or "-c" is not "create"
but is "clone/copy".  As you said yourself in a later paragraph,
"-n" sometimes is "--dry-run", and as we can see here "-c" in the
context of a command that can create and clone (with two verbs
behaving differently) is ambiguous.

Starting with a spelled out --copy vs --create without muddying the
water with -n may be a sensible way forward.

* Re: [Summit topic] Server-side merge/rebase: needs and wants?
  2021-10-21 11:56 ` [Summit topic] Server-side merge/rebase: needs and wants? Johannes Schindelin
@ 2021-10-22  3:06   ` Bagas Sanjaya
  2021-10-22 10:01     ` Johannes Schindelin
  2021-11-08 18:21   ` Taylor Blau
  1 sibling, 1 reply; 58+ messages in thread
From: Bagas Sanjaya @ 2021-10-22  3:06 UTC (permalink / raw)
  To: Johannes Schindelin, git

On 21/10/21 18.56, Johannes Schindelin wrote:
>   5.  The challenge is not necessarily the technical challenges, but the UX for
>       server tools that live “above” the git executable.
> 
>       1. What kind of output is needed? Machine-readable error messages?
> 
>       2. What Git objects must be created: a tree? A commit?
> 
>       3. How to handle, report, and store conflicts? Index is not typically
>          available on the server.

1) I prefer human-readable (i.e. l10n-able) output, because the output 
messages for server-side merge/rebase are user-facing.

2) Same as when doing merge/rebase on local machine (merge commit if 
non-ff).

3) I think because on the server-side we have bare repo (instead of 
normal repo), we need to create temporary index just for merge/rebase. 
For conflicts, the users need to resolve them locally, then notify the 
server that they have been resolved, and continue merging process.

-- 
An old man doll... just what I always wanted! - Clara

* Re: changing the experimental 'git switch' (was: [Summit topic] Improving Git UX)
  2021-10-21 16:45   ` changing the experimental 'git switch' (was: [Summit topic] Improving Git UX) Ævar Arnfjörð Bjarmason
  2021-10-21 23:03     ` changing the experimental 'git switch' Junio C Hamano
@ 2021-10-22  3:33     ` Bagas Sanjaya
  2021-10-22 14:04     ` martin
  2021-10-25 16:44     ` Sergey Organov
  3 siblings, 0 replies; 58+ messages in thread
From: Bagas Sanjaya @ 2021-10-22  3:33 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, Johannes Schindelin
  Cc: git, Josh Steadmon

On 21/10/21 23.45, Ævar Arnfjörð Bjarmason wrote:
> In summary, I think it should be changed to act like this:
>      
>      |---------------------------+------------------------+---------------------------|
>      | What                      | Now                    | New                       |
>      |---------------------------+------------------------+---------------------------|
>      | Switch                    | git switch existing    | git switch existing       |
>      | Error                     | git switch nonexisting | <no change (errors)>      |
>      | Switch with --merge       | git switch -m branch   | git switch --merge branch |
>      | Create                    | git switch -c new      | git switch -n new         |
>      | Create from existing      | N/A                    | git switch -c new [<old>] |
>      | Move & switch to existing | N/A                    | git switch -m new [<old>] |
>      |---------------------------+------------------------+---------------------------|
> 

For the "Switch with --merge" case, it seems this just adds the long-option
variant of -m (--merge), right?
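
For reference, a minimal sketch, assuming Git 2.23 or later (the first
release with the experimental "git switch"): the released command already
accepts both -m and --merge, so the two spellings in that row can be
compared directly. The scratch repository, file name and branch names
below are made up for illustration.

```shell
set -e
# Throwaway repository; all names below are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # pin the initial branch name
git config user.name "Example User"
git config user.email "user@example.com"

echo base > file.txt
git add file.txt
git commit -q -m "base"
git branch topic

# Leave an uncommitted edit in the working tree.
echo "local edit" >> file.txt

# Long form: switch branches, carrying the local edit along.
git switch -q --merge topic
branch_long=$(git symbolic-ref --short HEAD)

# Short form: the same option, spelled -m.
git switch -q -m main
branch_short=$(git symbolic-ref --short HEAD)

echo "$branch_long $branch_short"
```

This only shows what "git switch" does today; it says nothing about how
-m would behave if it were reassigned as proposed in the table.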

-- 
An old man doll... just what I always wanted! - Clara

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Missing notes, was Re: Notes from the Git Contributors' Summit 2021, virtual, Oct 19/20
  2021-10-21 11:55 Notes from the Git Contributors' Summit 2021, virtual, Oct 19/20 Johannes Schindelin
                   ` (9 preceding siblings ...)
  2021-10-21 11:57 ` [Summit topic] Improving reviewer quality of life (patchwork, subsystem lists?, etc) Johannes Schindelin
@ 2021-10-22  8:02 ` Johannes Schindelin
  2021-10-22  8:22   ` Johannes Schindelin
  2021-10-22  9:44 ` Let's have public Git chalk talks, " Johannes Schindelin
  11 siblings, 1 reply; 58+ messages in thread
From: Johannes Schindelin @ 2021-10-22  8:02 UTC (permalink / raw)
  To: git

Hi,

On Thu, 21 Oct 2021, Johannes Schindelin wrote:

> Team,
>
> we held our second all-virtual Summit over the past two days. It was the
> traditional unconference style meeting, with topics being proposed and
> voted on right before the introduction round. It was really good to see
> the human faces behind those email addresses.
>
> 32 contributors participated, and we spanned the timezones from PST to
> IST. To make that possible, the event took place on two days, from
> 1500-1900 UTC, which meant that the attendees from the US West coast had
> to get up really early, while it was past midnight in India at the end.
>
> I would like to thank all participants for accommodating the time, and in
> particular for creating such a friendly, collaborative atmosphere.
>
> A particular shout-out to Jonathan Nieder, Emily Shaffer and Derrick
> Stolee for taking notes. I am going to send out these notes in per-topic
> subthreads, replying to this mail.
>
> Day 1 topics:
>
> * Crazy (and not so crazy) ideas
> * SHA-256 Updates
> * Server-side merge/rebase: needs and wants?
> * Submodules and how to make them worth using
> * Sparse checkout behavior and plans
>
> Day 2 topics:
>
> * The state of getting a reftable backend working in git.git
> * Documentation (translations, FAQ updates, new user-focused, general
>   improvements, etc.)
> * Let's have public Git chalk talks

You might wonder why I did not send out the notes for this talk.

But that is not true! I sent it 6 times already, in various variations,
and it never came through (but I did get two nastygrams telling me that my
message was rejected because it apparently triggered a filter).

I shall keep trying, but my hopes are pretty low by now.

Ciao,
Johannes

> * Increasing diversity & inclusion (transition to `main`, etc)
> * Improving Git UX
> * Improving reviewer quality of life (patchwork, subsystem lists?, etc)
>
> A few topics were left for a later date (maybe as public Git chalk talks):
>
> * Making Git memory-leak free (already landed patches)
> * Scaling Git
> * Scaling ref advertisements
> * Config-based hooks (and getting there via migration to hook.[ch] lib &
>   "git hook run")
> * Make git [clone|fetch] support pre-seeding via downloaded *.bundle files
>
> Ciao,
> Johannes
>

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: Missing notes, was Re: Notes from the Git Contributors' Summit 2021, virtual, Oct 19/20
  2021-10-22  8:02 ` Missing notes, was Re: Notes from the Git Contributors' Summit 2021, virtual, Oct 19/20 Johannes Schindelin
@ 2021-10-22  8:22   ` Johannes Schindelin
  2021-10-22  8:30     ` Johannes Schindelin
  0 siblings, 1 reply; 58+ messages in thread
From: Johannes Schindelin @ 2021-10-22  8:22 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 3919 bytes --]

Team,

I tried to reply with the full notes, which failed. So I'll try again,
this time in chunks.

On Fri, 22 Oct 2021, Johannes Schindelin wrote:

> Hi,
>
> On Thu, 21 Oct 2021, Johannes Schindelin wrote:
>
> > [...]
> >
> > * Let's have public Git chalk talks
>
> You might wonder why I did not send out the notes for this talk.
>
> But that is not true! I sent it 6 times already, in various variations,
> and it never came through (but I did get two nastygrams telling me that my
> message was rejected because it apparently triggered a filter).

This session was led by Emily Shaffer. Supporting cast: Ævar Arnfjörð
Bjarmason, brian m. carlson, CB Bailey, and Junio Hamano.

Notes:

 1.  What’s a public chalk talk?

     1.  At Google, once a week, the team meets up with no particular topic in
         mind, or a couple topics, very informal

     2.  One person’s turn each week to give an informal talk with a white
         board (not using chalk)

     3.  Topic should be technical and of interest to the presenter

     4.  For example: how does protocol v2 work

     5.  Collaborative, interactive user session

     6.  Helps by learning about things

     7.  Helps by honing skills like presentation skills

     8.  A lot of (good) humility involved. For example, colleagues who have
         been familiar with the project for a long time admitting they don’t
         know, or have been wrong about things. Makes others feel more
         comfortable with their perceived lack of knowledge

     9.  Could be good for everybody on the Git mailing list, might foster less
         combative communication on the list

     10. Might be a way to attract new people by presenting “old timers” as
         humble

 2.  Does that appeal to anybody else?

to be continued...


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: Missing notes, was Re: Notes from the Git Contributors' Summit 2021, virtual, Oct 19/20
  2021-10-22  8:22   ` Johannes Schindelin
@ 2021-10-22  8:30     ` Johannes Schindelin
  2021-10-22  9:07       ` Johannes Schindelin
  0 siblings, 1 reply; 58+ messages in thread
From: Johannes Schindelin @ 2021-10-22  8:30 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 5204 bytes --]

Continuing... (2nd try, with redactions)

On Fri, 22 Oct 2021, Johannes Schindelin wrote:

> I tried to reply with the full notes, which failed. So I'll try again,
> this time in chunks.
>
> On Fri, 22 Oct 2021, Johannes Schindelin wrote:
>
> > On Thu, 21 Oct 2021, Johannes Schindelin wrote:
> >
> > > [...]
> > >
> > > * Let's have public Git chalk talks
> >
> > You might wonder why I did not send out the notes for this talk.
> >
> > But that is not true! I sent it 6 times already, in various variations,
> > and it never came through (but I did get two nastygrams telling me that my
> > message was rejected because it apparently triggered a filter).
>
> [...]
>
>  2.  Does that appeal to anybody else?

[redacting a word that I suspect triggered vger's filter: it is a word
starting with "T" and continuing with "witch". Whenever you read "[itch]",
that's what I substituted for the culprit]

 3.  Ævar: I think it would be great; it has been a long time since we’ve
     seen each other, and it already feels different

 4.  One thing to keep in mind: it’s hard to program on a white board :-)

 5.  Emily: some challenges:

     1. How often?

     2. What time?

     3. Probably move things around (because we’re global)

     4. Tech to use? Jitsi? [itch]? ([itch] seems to be particularly popular to
        teach programming)

     5. Figure out what topics to present

 6.  Ævar: does not matter what tech to use

 7.  Emily: some difference may make it matter: on [itch], you can record, and
     they host recordings

 8.  One thing to worry about with recording: people might be reluctant to
     make public mistakes

 9.  It’s possible to do a [itch] stream, and not record it

to be continued...

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: Missing notes, was Re: Notes from the Git Contributors' Summit 2021, virtual, Oct 19/20
  2021-10-22  8:30     ` Johannes Schindelin
@ 2021-10-22  9:07       ` Johannes Schindelin
  0 siblings, 0 replies; 58+ messages in thread
From: Johannes Schindelin @ 2021-10-22  9:07 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 4247 bytes --]

Team,

after 10 failed attempts to send more notes, this might start to get a bit
annoying on my side.

On Fri, 22 Oct 2021, Johannes Schindelin wrote:

> On Fri, 22 Oct 2021, Johannes Schindelin wrote:
>
> > On Fri, 22 Oct 2021, Johannes Schindelin wrote:
> >
> > > On Thu, 21 Oct 2021, Johannes Schindelin wrote:
> > >
> > > > * Let's have public Git chalk talks
> > >
> > > You might wonder why I did not send out the notes for this talk.
> > >
> > > But that is not true! I sent it 6 times already, in various variations,
> > > and it never came through (but I did get two nastygrams telling me that my
> > > message was rejected because it apparently triggered a filter).
> >
> >
> [...]
>
>  9.  It’s possible to do a [itch] stream, and not record it

brian m. carlson offered an idea that is considerate of participants'
reservations but also accommodates Git contributors who would have loved
to see the presentation but were unable to attend due to timezones, time
conflicts, etc.: offer the recording for viewing only for a short while.

to be continued


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Let's have public Git chalk talks, was Re: Notes from the Git Contributors' Summit 2021, virtual, Oct 19/20
  2021-10-21 11:55 Notes from the Git Contributors' Summit 2021, virtual, Oct 19/20 Johannes Schindelin
                   ` (10 preceding siblings ...)
  2021-10-22  8:02 ` Missing notes, was Re: Notes from the Git Contributors' Summit 2021, virtual, Oct 19/20 Johannes Schindelin
@ 2021-10-22  9:44 ` Johannes Schindelin
  2021-10-25 12:58   ` Ævar Arnfjörð Bjarmason
  11 siblings, 1 reply; 58+ messages in thread
From: Johannes Schindelin @ 2021-10-22  9:44 UTC (permalink / raw)
  To: git

Team,

On Thu, 21 Oct 2021, Johannes Schindelin wrote:

> * Let's have public Git chalk talks

Okay, I give up on the mailing list. I tried some 20 times to send the
notes out in one form or another, and it simply is not working, and the
time I spent trying was definitely lost time.

So here is a link:
https://gist.github.com/dscho/003a0e112058e5794b5e08e84d34092d

Ciao,
Johannes

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [Summit topic] Server-side merge/rebase: needs and wants?
  2021-10-22  3:06   ` Bagas Sanjaya
@ 2021-10-22 10:01     ` Johannes Schindelin
  2021-10-23 20:52       ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 58+ messages in thread
From: Johannes Schindelin @ 2021-10-22 10:01 UTC (permalink / raw)
  To: Bagas Sanjaya; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 2141 bytes --]

Hi Bagas,

On Fri, 22 Oct 2021, Bagas Sanjaya wrote:

> On 21/10/21 18.56, Johannes Schindelin wrote:
> >   5.  The challenge is not necessarily the technical challenges, but the UX
> >   for
> >       server tools that live “above” the git executable.
> >
> >       1. What kind of output is needed? Machine-readable error messages?
> >
> >       2. What Git objects must be created: a tree? A commit?
> >
> >       3. How to handle, report, and store conflicts? Index is not typically
> >          available on the server.
>
> 1) I prefer human-readable (i.e. l10n-able) output, because the output
> messages for server-side merge/rebase are user-facing.

For server-side usage, human-readable output _from Git_ would not make
sense. It would be the responsibility of the server-side caller (which is
usually a web application) to present the result, potentially translated,
definitely prettified.

So while I agree with you that the result should be made pretty on the
server side, I disagree that this is Git's job. Instead, Git should
produce something eminently machine-parseable in this context.

> 2) Same as when doing a merge/rebase on a local machine (a merge commit
> if the merge is not a fast-forward).

Local usage is _totally_ different.

> 3) I think that, because the server side has a bare repo (instead of a
> normal repo), we need to create a temporary index just for the
> merge/rebase.

Merge ORT does not need a temporary index. That's the reason it is so much
faster than the regular merge-recursive.

> For conflicts, the users need to resolve them locally, then notify the
> server that they have been resolved, and continue the merge.

It is already possible e.g. on GitHub to resolve merge conflicts in the
web UI. That is very convenient, and I think we all agreed at the Summit
that this is a scenario Git should support as well as it can. We did not
come to any concrete conclusion about what that should look like (read:
what output format Git could use to better support server-side
consumption), though, and I think it basically comes down to experimenting
with a couple of approaches.
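
To make the "no temporary index" point concrete: Git releases after this
discussion (2.38 and later) added a "--write-tree" mode to "git merge-tree"
that runs merge-ort in a bare repository. The following is only a sketch
under that assumption; the repository layout, branch names and file
contents are invented, and on older versions it merely prints a message.

```shell
set -e
# Illustrative setup: build two diverging branches in a scratch repository.
work=$(mktemp -d)
cd "$work"
git init -q src
cd src
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
echo base > a.txt
git add a.txt
git commit -q -m "base"
git branch topic
echo main-side > b.txt
git add b.txt
git commit -q -m "main: add b.txt"
git checkout -q topic
echo topic-side > c.txt
git add c.txt
git commit -q -m "topic: add c.txt"

# A bare clone stands in for the server-side repository (no working tree).
cd "$work"
git clone -q --bare src server.git
cd server.git

# Clean merge of the two branches, computed without a checkout.
if tree=$(git merge-tree --write-tree main topic 2>/dev/null); then
    echo "merged tree: $tree"
    git ls-tree --name-only "$tree"
else
    tree=""
    echo "this Git has no 'merge-tree --write-tree' (needs 2.38 or later)"
fi
```

For a clean merge the command prints only the resulting tree ID, which a
server-side caller could then wrap in a commit; how conflicts should be
reported is exactly the open question above.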

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 58+ messages in thread

* vale check, was Re: [Summit topic] Increasing diversity & inclusion (transition to `main`, etc)
  2021-10-21 12:55   ` Son Luong Ngoc
@ 2021-10-22 10:02     ` Johannes Schindelin
  2021-10-22 10:03       ` Johannes Schindelin
  0 siblings, 1 reply; 58+ messages in thread
From: Johannes Schindelin @ 2021-10-22 10:02 UTC (permalink / raw)
  To: Son Luong Ngoc; +Cc: git

Hi Bagas,

On Thu, 21 Oct 2021, Son Luong Ngoc wrote:

> I would recommend, on top of having a guideline document, to have a
> Vale check (1) set up as a commit-msg hook and to run it as part of
> GitGitGadget CI to help folks shorten the feedback loops in some basic
> cases.
>
> [...]
>
> (1): https://docs.errata.ai/vale/styles

How about setting this up, then opening a PR?

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: vale check, was Re: [Summit topic] Increasing diversity & inclusion (transition to `main`, etc)
  2021-10-22 10:02     ` vale check, was " Johannes Schindelin
@ 2021-10-22 10:03       ` Johannes Schindelin
  0 siblings, 0 replies; 58+ messages in thread
From: Johannes Schindelin @ 2021-10-22 10:03 UTC (permalink / raw)
  To: Son Luong Ngoc; +Cc: git

Hi Son Luong,

On Fri, 22 Oct 2021, Johannes Schindelin wrote:

> Hi Bagas,

Sorry about that, I meant to address _you_, not Bagas.

Ciao,
Dscho


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: changing the experimental 'git switch' (was: [Summit topic] Improving Git UX)
  2021-10-21 16:45   ` changing the experimental 'git switch' (was: [Summit topic] Improving Git UX) Ævar Arnfjörð Bjarmason
  2021-10-21 23:03     ` changing the experimental 'git switch' Junio C Hamano
  2021-10-22  3:33     ` changing the experimental 'git switch' (was: [Summit topic] Improving Git UX) Bagas Sanjaya
@ 2021-10-22 14:04     ` martin
  2021-10-22 14:24       ` Ævar Arnfjörð Bjarmason
  2021-10-24  6:54       ` changing the experimental 'git switch' (was: [Summit topic] Improving Git UX) Martin
  2021-10-25 16:44     ` Sergey Organov
  3 siblings, 2 replies; 58+ messages in thread
From: martin @ 2021-10-22 14:04 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, Johannes Schindelin
  Cc: git, Josh Steadmon

On 21/10/2021 18:45, Ævar Arnfjörð Bjarmason wrote:
> E.g. the "-n" option to "git fetch" comes to mind, which isn't
> a synonym for "--dry-run", as in most other places.
>

-n is used for dry run in only a few commands. I found:
git add
git rm
git mv

But
cherry-pick => no commit
pull => no stat
rebase => no stat
merge => no stat
fetch => no tags
clone => no checkout

In any case, "-n" always has a "no" meaning (even dry run means "no
changes to be recorded").

So IMHO -n is a really bad idea for "new".
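
The two families of meanings can be checked directly. A small sketch, with
made-up repository and file names: "git add -n" is the dry-run spelling,
while "git clone -n" means --no-checkout.

```shell
set -e
# Scratch area; "src", "dst" and "file.txt" are illustrative names.
work=$(mktemp -d)
cd "$work"
git init -q src
cd src
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"

echo hello > file.txt

# "git add -n" is --dry-run: it reports what would be added, stages nothing.
dry_run_output=$(git add -n file.txt)
staged_after_dry_run=$(git ls-files)

git add file.txt
git commit -q -m "add file.txt"

# "git clone -n" is --no-checkout: history is cloned, no files are checked out.
cd "$work"
git clone -q -n src dst
if [ -e dst/file.txt ]; then checked_out=yes; else checked_out=no; fi

echo "$dry_run_output"
echo "checked out: $checked_out"
```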

About "-b" branch:
That gives no indication that something is created. I already find it
highly confusing for checkout,
because the word "branch" could also mean "check out an existing branch"
rather than doing a detached checkout.
However, others may be perfectly fine with -b only referring to branches
that will be created.

-c of course is also used for config in clone.... :)

If 2 letters could be used, then -c could be given twice for "create copy"
-c  => create
-c -c  => create copy
-cc  => create copy

----------
Also, will move/copy for switch actually be the same as for "git branch"?

I haven't used them, but from the docs, I take it that a
[new/replacement] branch will be created, and this branch's tip points
to the same commit as the original branch.

But in "git switch" a new commit for the tip is given, so that differs.
Maybe someone can educate me?
- For move, what is the difference between
   git switch --move existing_branch  commit
   git switch --force-create existing_branch  commit
As far as I know, only that the reflog will be copied/kept?

For copy, what does it mean at all?
   git switch --copy existing_branch  commit
That does not make any sense at all,
because "copy" means that "existing_branch" is to be kept. So copy needs
a name for the new branch.
I see 2 possible copies
   git switch --copy existing_branch  new_branch commit
   git switch --copy existing_branch  target_branch
For the latter, it switches to the existing "target_branch", but 
replaces its reflog.

Unless there is more to it than copying the reflog, wouldn't it be
better to add an option "--copy-reflog"?
Then you could do

   # replace the reflog of the existing target branch
   git switch --copy-reflog=branch  target_branch

   # new_branch will get the reflog / this is "copy"
   git switch --copy-reflog=branch  -c new_branch  target_branch

   # new_branch will get the reflog
   git switch --copy-reflog=branch  -C new_branch  target_branch

   # existing_branch will keep the reflog / this is "move"
   git switch --copy-reflog  -C existing_branch  target_branch
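
As a reference point for the reflog question, here is a sketch of how the
existing "git branch -c" and "git branch -m" behave, assuming a Git recent
enough to have "git branch -c"; the branch and file names are invented. It
shows only the current "git branch" behaviour, not what a future
"git switch --copy" or "--move" would do.

```shell
set -e
# Throwaway repository with two commits on "main" (illustrative names).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
echo one > f.txt
git add f.txt
git commit -q -m "one"
echo two >> f.txt
git commit -q -a -m "two"

# Copy: the new branch points at the same commit and gets a copy of the reflog.
git branch -c main copy
main_tip=$(git rev-parse main)
copy_tip=$(git rev-parse copy)
copy_reflog=$(git reflog show copy)

# Move (rename): the old name disappears, the reflog travels with the branch.
git branch -m copy moved
moved_reflog=$(git reflog show moved)
old_name_left=$(git branch --list copy)

echo "$moved_reflog"
```

The copied and the moved branch both still list the "commit: two" entry
that was originally recorded for "main".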

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [Summit topic] Documentation (translations, FAQ updates, new user-focused, general improvements, etc.)
  2021-10-21 11:56 ` [Summit topic] Documentation (translations, FAQ updates, new user-focused, general improvements, etc.) Johannes Schindelin
@ 2021-10-22 14:20   ` Jean-Noël Avila
  2021-10-22 14:31     ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 58+ messages in thread
From: Jean-Noël Avila @ 2021-10-22 14:20 UTC (permalink / raw)
  To: git

I'm sorry I was not at this meeting; my presence could have helped a bit
with some subtopics.

Le 21/10/2021 à 13:56, Johannes Schindelin a écrit :
> This session was led by brian m. carlson. Supporting cast: Jeff "Peff"
> King, Ævar Arnfjörð Bjarmason, Taylor Blau, Philip Oakley, Emily Shaffer,
> CB Bailey, and Jonathan "jrnieder" Nieder.
>
> Notes:
>
>  1. Background: answering on StackOverflow, other avenues for user questions,
>     even users from very large companies
>
>  2. How can we improve documentation?
>
>  3. Maybe even think about translating docs such as FAQs
>
>  4. Peff: there’s an effort to translate manpages
>
>     1. brian: Saw an announcement, haven’t seen what came of it

The effort is still ongoing. Unfortunately, there is not much output
from it yet, only the inclusion on git-scm.com.

A proposal was sent for Debian packages.

I'm open to any help with packaging what's already available in whatever
form is useful.

Some statistics:

* there are 23 po files: "pt_BR" is fully translated, "fr" half translated,
and "de" one third; most other languages have not really started (the
portion already translated was produced automatically from unmodified
strings).

* not all pages are included for translation; most porcelain pages
available on git-scm.com are included, but not, for instance, the config
parts or the guides. That's already 10,687 source segments and 206,700
source words, a volume similar to "Crime and Punishment" by
Dostoyevsky. And it really looks like a punishment to most apprentice
translators willing to start.

In order to lower the barrier to translators, the project is relying on
weblate: https://hosted.weblate.org/projects/git-manpages/translations/
while still retaining a "Developer's Certificate of Origin".

>
>     2. Peff: Some translated pages are live on git-scm.com (a github repo with
>        translations)

For instance, the git init manpage is already available in 8 languages.

>
>     3. Ævar: It uses a third-party tool (po4a) that uses gettext by making each
>        paragraph a translated string. So it’s the same workflow as translating
>        code changes

Asciidoc support is "co-developed" in po4a in parallel with the
translation: I fix bugs when they are found in the po files.

>     4. Taylor: https://github.com/jnavila/git-manpages-l10n

If it looks too personal, it can be moved into the git organization.

>
>  5. Philip Oakley: I see manpages used as reference material instead of
>     educational documents
>
>
>     12. In stackoverflow you can see how people answer questions, how much less
>         existing background they assume

Version control is usually already part of the culture of most users
(writers and engineers in other fields started using it some 10 years
ago). What their questions usually boil down to is: how can I use and
customize git features for my field of expertise? When software vendors
include git support in their applications, it is usually with reduced
functionality, and users quickly have to fall back to plain git when they
want a little more.

General rules can help in getting started with a new customization, but at
some point, the customization is specific to the tool. A library of
application-oriented customizations, help files and FAQs may be of
interest. Some customizations already exist, sometimes with errors
(meaning the maintainer of the customization has not fully understood
how git works), but they are scattered.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: changing the experimental 'git switch' (was: [Summit topic] Improving Git UX)
  2021-10-22 14:04     ` martin
@ 2021-10-22 14:24       ` Ævar Arnfjörð Bjarmason
  2021-10-22 15:30         ` martin
  2021-10-22 21:54         ` Sergey Organov
  2021-10-24  6:54       ` changing the experimental 'git switch' (was: [Summit topic] Improving Git UX) Martin
  1 sibling, 2 replies; 58+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-22 14:24 UTC (permalink / raw)
  To: martin; +Cc: Johannes Schindelin, git, Josh Steadmon


On Fri, Oct 22 2021, martin wrote:

> On 21/10/2021 18:45, Ævar Arnfjörð Bjarmason wrote:
>> E.g. the "-n" option to "git fetch" comes to mind, which isn't
>> a synonym for "--dry-run", as in most other places.
>>
>
> -n is used for dry run in only a few commands. I found:
> git add
> git rm
> git mv
>
> But
> cherry-pick => no commit
> pull => no stat
> rebase => no stat
> merge => no stat
> fetch => no tags
> clone => no checkout
>
> In any case, "-n" always has a "no" meaning (even dry run means "no
> changes to be recorded").
>
> So IMHO -n is a really bad idea for "new".

Good point. I think I've changed my mind on that, but can't think of a
good short flag for such a thing.

FWIW one reason this would be needed is that "switch" intentionally did
not take "git switch unknown-name" to create "unknown-name", but maybe
we could relax that if we just e.g. printed out a notice saying a new
branch is created (which we probably do already...).

I.e. then the worst that'll happen is that the user has to "git switch
-" and "git branch -d -", except I think the latter doesn't work, so
"git branch -d <that-name>".

> About "-b" branch:
> That does give no indication something is created. I find it highly
> confusing for checkout already,
> because the word "branch" could also mean "check out to existing
> branch" rather than doing a detached checkout.
> However, others may be perfectly fine with -b only referring to
> branches that will be created.
>
> -c of course is also used for config in clone.... :)
>
> If 2 letters could be used, then -c could be given twice for "create copy"
> -c  => create
> -c -c  => create copy
> -cc  => create copy

Hrm, that's interesting. But probably better to have a long-option. Some
short options (notably -v for --verbose) often work like that, but I
wonder if people wouldn't just be confused by it.

Maybe not.

> ----------
> Also, will move/copy for switch actually be the same as for "git branch"?
>
> I haven't used them, but from the docs, I take it that a
> [new/replacement] branch will be created, and this branch's tip points
> to the same commit as the origin branch.

Both of them can take an optional "copy/create from". So I think this is the
same for both already, aside from one not supporting "copy".

> But in "git switch" a new commit for the top is given. So that differs.
> Maybe someone can educate me ?
> - For move, where is the diff between
>   git switch --move existing_branch  commit
>   git switch --force-create existing_branch  commit
> Afaik only that the reflog will be copied/kept?
>
> For copy what does it mean at all?
>   git switch --copy existing_branch  commit
> Does not make any sense at all.
> Because "copy" means that "existing_branch" is to be kept. So copy
> needs a name for the new branch.
> I see 2 possible copies
>   git switch --copy existing_branch  new_branch commit
>   git switch --copy existing_branch  target_branch
> For the latter, it switches to the existing "target_branch", but
> replaces its reflog.

Maybe I'm being dense, but I'm not really seeing how a:

    git switch [some create option] <new> <old>

Would have caveats that we don't have already with:

    git branch [some create option] [<old>] <new>

Aside from the confusing switch-around of the arguments (which is
another UX wart...).

> Unless there is more to it than copying the reflog, wouldn't it be
> better to add an option "--copy-reflog"?
> Then you could do
> git switch --copy-reflog=branch   target_branch  # replace reflog of
> existing target branch
> git switch --copy-reflog=branch  -c new_branch  target_branch  #
> new_branch will get the reflog / this is "copy"
> git switch --copy-reflog=branch  -C new_branch  target_branch  #
> new_branch will get the reflog
> git switch --copy-reflog  -C existing_branch  target_branch  #
> existing_branch will keep the reflog. / this is "move"

Yes, I think "should it copy the reflog" is a thing that's arguably
either a missing feature or a bug in the "git branch" copy mode,
depending on your POV.

* Re: [Summit topic] Documentation (translations, FAQ updates, new user-focused, general improvements, etc.)
  2021-10-22 14:20   ` Jean-Noël Avila
@ 2021-10-22 14:31     ` Ævar Arnfjörð Bjarmason
  2021-10-27  7:02       ` Jean-Noël Avila
  2021-10-27  8:50       ` Jeff King
  0 siblings, 2 replies; 58+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-22 14:31 UTC (permalink / raw)
  To: Jean-Noël Avila; +Cc: git


On Fri, Oct 22 2021, Jean-Noël Avila wrote:

> I'm sorry I missed this meeting; my presence could have helped a bit for
> some subtopics.
>
> Le 21/10/2021 à 13:56, Johannes Schindelin a écrit :
>> This session was led by brian m. carlson. Supporting cast: Jeff "Peff"
>> King, Ævar Arnfjörð Bjarmason, Taylor Blau, Philip Oakley, Emily Shaffer,
>> CB Bailey, and Jonathan "jrnieder" Nieder.
>>
>> Notes:
>>
>>  1. Background: answering on StackOverflow, other avenues for user questions,
>>     even users from very large companies
>>
>>  2. How can we improve documentation?
>>
>>  3. Maybe even think about translating docs such as FAQs
>>
>>  4. Peff: there’s an effort to translate manpages
>>
>>     1. brian: Saw an announcement, haven’t seen what came of it
>
> The effort is still ongoing. Unfortunately, there isn't much output
> from it, only the inclusion on git-scm.com.
>
> A proposition was sent for Debian packages.
>
> I'm open for any help in packaging what's already available for whatever
> useful.
>
>
> For some statistics
>
> * there are 23 po files, "pt_BR" fully translated, "fr" half translated,
> "de" one third; most other languages have not really started (the
> portion already translated was made automatically for unmodified strings).
>
> * not all pages are included for translation; most porcelain pages
> available on git-scm.com are included, but for instance, not the config
> parts or the guides. That's already 10,687 source segments and 206,700
> source words, which is a volume similar to "Crime and Punishment" by
> Dostoyevsky. And it really looks like a punishment for most apprentice
> translators willing to start.
>
> In order to lower the barrier to translators, the project is relying on
> weblate: https://hosted.weblate.org/projects/git-manpages/translations/
> while still retaining a "Developer's Certificate of Origin".
>
>
>>
>>     2. Peff: Some translated pages are live on git-scm.com (a github repo with
>>        translations)
>
> For instance, the git init manpage is already available in 8 languages.
>
>
>>
>>     3. Ævar: It uses a third-party tool (po4a) that uses gettext by making each
>>        paragraph a translated string. So it’s the same workflow as translating
>>        code changes
>
> Asciidoc support is "co-developed" in po4a in parallel with the
> translation: I fix bugs when they are found in the po files.
>
>>     4. Taylor: https://github.com/jnavila/git-manpages-l10n
>
>
> If it looks too personal, it can be moved into the git organization.
>
>
>>
>>  5. Philip Oakley: I see manpages used as reference material instead of
>>     educational documents
>>
>>
>>     12. In stackoverflow you can see how people answer questions, how much less
>>         existing background they assume
>
> Version control is usually already in the culture of most users
> (writers, engineers in other fields have come to use them some 10 years
> ago). What their questions usually boil down to is: how can I use and
> customize git features for my field of expertise. When software editors
> include git support in their applications, it is usually with severed
> functions and users quickly have to get back to plain git when they want
> a little more.
>
> General rules can help start up with a new customization, but at some
> point, the customization is specific to the tool. A library of
> application oriented customizations, help files and FAQs may be of
> interest. Some customizations already exist, sometimes with errors
> (meaning the maintainer of the customization has not fully understood
> how git works) but they are scattered.

I'd very much support this living in-tree just as the po/* directory
already does. I.e. periodically pulled down.

There are many OS's that have something like "apt install
manpages-<lang>", so if we had these available they could be much more
useful to users.

E.g. I see I can "apt install manpages-pt", but if you're a Portuguese
speaker you probably won't chase down some third-party addition of
Portuguese manpages, and even if they're in Debian other package
maintainers might not add them if they're not in the "main" package etc.

What's standing in the way of us treating this in the same way as the
po/* directory, if anything?

* Re: changing the experimental 'git switch' (was: [Summit topic] Improving Git UX)
  2021-10-22 14:24       ` Ævar Arnfjörð Bjarmason
@ 2021-10-22 15:30         ` martin
  2021-10-23  8:27           ` changing the experimental 'git switch' Sergey Organov
  2021-10-22 21:54         ` Sergey Organov
  1 sibling, 1 reply; 58+ messages in thread
From: martin @ 2021-10-22 15:30 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Johannes Schindelin, git, Josh Steadmon

On 22/10/2021 16:24, Ævar Arnfjörð Bjarmason wrote:
> FWIW one reason this would be needed is that "switch" intentionally did
> not take "git switch unknown-name" to create "unknown-name", but maybe
> we could relax that if we just e.g. printed out a notice saying a new
> branch is created (which we probably do already...).
I think the "required flag for create" is a good idea and should be 
kept. (My 2 cents)

Having the "new name" identified also helps the user get the order
of the arguments right.
Take
   git rebase  <upstream>   <branch>
Why is the target listed before the source? But whichever way round, it 
is hard to remember.
If rebase would only take one <upstream>, and the optional <branch> 
would be a "--from", wouldn't that be easier?

>> If 2 letters could be used, then -c could be given twice for "create copy"
>> -c  => create
>> -c -c  => create copy
>> -cc  => create copy
> Hrm, that's interesting. But probably better to have a long-option.
Well, both: Long and short. But long is --copy or --create-copy.
The issue is finding a short option. -cc imho is still short.

>> But in "git switch" a new commit for the top is given. So that differs.
>> Maybe someone can educate me ?
> Maybe I'm being dense, but I'm not really seeing how a:
>
>      git switch [some create option] <new> <old>
>
> Would have caveats that we don't have already with:
I only tried to illuminate my question with some made-up examples (the 
examples were not meant to be a solution).

What exactly does copy/move do? Am I missing any point?
- They create a new branch (when using switch this branch will also be 
reset to a given commit)
- They copy the reflog
- "move" deletes the old branch (or can be seen as rename)

Anything else?



* Re: changing the experimental 'git switch'
  2021-10-22 14:24       ` Ævar Arnfjörð Bjarmason
  2021-10-22 15:30         ` martin
@ 2021-10-22 21:54         ` Sergey Organov
  1 sibling, 0 replies; 58+ messages in thread
From: Sergey Organov @ 2021-10-22 21:54 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: martin, Johannes Schindelin, git, Josh Steadmon

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> On Fri, Oct 22 2021, martin wrote:
>

[...]

>> If 2 letters could be used, then -c could be given twice for "create copy"
>> -c  => create
>> -c -c  => create copy
>> -cc  => create copy

Please, no!

> Hrm, that's interesting.

Yep, Git UI is too "interesting" already.

> But probably better to have a long-option.

Definitely.

> Some short options (notably -v for --verbose) often work like that,
> but I wonder if people wouldn't just be confused by it.

I would be confused. Those options that do behave like that usually
just increase (implicit) level of verbosity or debug level, so -vv is a
way to say --verbose=2, and -vvv => --verbose=3.

An option that changes its semantics depending on its sequence number is
something that I'd avoid like the plague.
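
To illustrate the counting convention described above, here is a minimal
sketch using Python's argparse. Python is used purely for illustration;
Git's own option parser is written in C and is not shown here, and the
"demo" program name and the verbosity() helper are hypothetical.

```python
import argparse


def build_parser():
    # Hypothetical demo parser; not part of Git.
    parser = argparse.ArgumentParser(prog="demo")
    # action="count" makes each repetition of -v raise the level by one,
    # so the option keeps the same meaning however often it appears.
    parser.add_argument("-v", "--verbose", action="count", default=0)
    return parser


def verbosity(argv):
    """Return the verbosity level implied by the given argument list."""
    return build_parser().parse_args(argv).verbose


if __name__ == "__main__":
    for argv in ([], ["-v"], ["-vv"], ["-v", "-v", "-v"]):
        print(argv, "->", verbosity(argv))
```

Here a repeated flag only increases a level; it never switches to a
different operation, which is the distinction drawn above.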

Thanks,
-- Sergey Organov

* Re: [Summit topic] Improving reviewer quality of life (patchwork, subsystem lists?, etc)
  2021-10-21 13:41   ` Konstantin Ryabitsev
@ 2021-10-22 22:06     ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 58+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-22 22:06 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: Johannes Schindelin, git


On Thu, Oct 21 2021, Konstantin Ryabitsev wrote:

> On Thu, Oct 21, 2021 at 01:57:11PM +0200, Johannes Schindelin wrote:
>>  2. Dscho said he’s not able to follow everything on the mailing list
>> 
>>     1. if you have just one patch you send, reply-all works okay
>> 
>>     2. mailing list works reasonably well if you’re someone like Junio, working
>>        on it full time, has good mail filters, keeps up to date with everything
>> 
>>     3. If you’re in-between, does not work well
>
> This is a problem that's not actually unique to mailing lists. If you have any
> project that is popular enough, at some point it reaches critical mass where
> developer/user feedback becomes too much for anyone to keep up. Github
> projects aren't immune to this either, but they do have a benefit of providing
> an easy interface for someone to apply categorization to issues/discussions.

I'd like to use this mail as a good jump-off point to link to my "how"
vs. "what" E-Mail from when this was last discussed. I think I
mentioned it in passing at the recent summit:

https://lore.kernel.org/git/87fszd3xo0.fsf@evledraar.gmail.com/

Especially as...

>>     13. Junio: Not really. The extra tracking conversations are not as
>>         important to me. I think it’s a feature that if someone requests a
>>         feature and nothing happens for a while that it no longer produces
>>         overhead for people is a useful feature. That kind of old filtering
>>         feature is sometimes valuable.
>
> I find that if there's no mailing list integration, then bugzilla generally
> rots after the initial person getting the bug reports moves on. Then bugs
> reported via bugzilla just sit there without anyone paying attention. At least
> when bug reports get sent to the list, the ensuing discussions get reflected
> in both the list archives and in bugzilla.

...it makes a passing mention of this "forgetting as a feature" aspect
of not having a bug tracker.

>
>>     16. I’m also happy to work with kernel.org admins to get this set up for us
>>         if that’s what we want
>
> Consider this part done. :)

And thank you for your contribution to kernel.org infrastructure.

* Re: changing the experimental 'git switch'
  2021-10-22 15:30         ` martin
@ 2021-10-23  8:27           ` Sergey Organov
  0 siblings, 0 replies; 58+ messages in thread
From: Sergey Organov @ 2021-10-23  8:27 UTC (permalink / raw)
  To: martin
  Cc: Ævar Arnfjörð Bjarmason, Johannes Schindelin, git,
	Josh Steadmon

martin <test2@mfriebe.de> writes:

> On 22/10/2021 16:24, Ævar Arnfjörð Bjarmason wrote:

[...]

>>> If 2 letters could be used, then -c could be given twice for "create copy"
>>> -c  => create
>>> -c -c  => create copy
>>> -cc  => create copy
>> Hrm, that's interesting. But probably better to have a long-option.
> Well, both: Long and short. But long is --copy or --create-copy.
> The issue is finding a short option. -cc imho is still short.

No -cc or --cc, please! -cc is not a single option, it's -c -c in a line,
and you will then have a hard time even describing -c.

--cc would be a point of confusion as well, e.g., see "git log --cc".

BTW, is it a frequent enough operation to even demand something shorter
than --copy?

Thanks,
-- Sergey Organov

* Re: [Summit topic] Server-side merge/rebase: needs and wants?
  2021-10-22 10:01     ` Johannes Schindelin
@ 2021-10-23 20:52       ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 58+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-23 20:52 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Bagas Sanjaya, git

On Fri, Oct 22 2021, Johannes Schindelin wrote:

> Hi Bagas,
>
> On Fri, 22 Oct 2021, Bagas Sanjaya wrote:
>
>> On 21/10/21 18.56, Johannes Schindelin wrote:
>> >   5.  The challenge is not necessarily the technical challenges, but the UX
>> >   for
>> >       server tools that live “above” the git executable.
>> >
>> >       1. What kind of output is needed? Machine-readable error messages?
>> >
>> >       2. What Git objects must be created: a tree? A commit?
>> >
>> >       3. How to handle, report, and store conflicts? Index is not typically
>> >          available on the server.
>>
>> 1) I prefer human-readable (i.e. l10n-able) output, because the output
>> messages for server-side merge/rebase are user-facing.
>
> For server-side usage, a human-readable output _by Git_ would not make
> sense. It would be the responsibility of the server-side caller (which is
> usually a web application) to present the result, potentially translated,
> definitely prettified.
>
> So while I agree with you that the result should be made pretty on the
> server side, I disagree that this is Git's job. Instead, Git should
> produce something eminently machine-parseable in this context.

Our server-side already produces human-readable output via die()
messages or ERR packets.

To the extent that we need human-readable output in say protocol v2 I
think it makes more sense to start supporting passing over the user's
locale to look the appropriate thing up in our *.mo files. The
alternative is maintaining an exhaustive catalog of unique error ID's or
whatever.

Most things should of course have meaningful error codes etc., I'm only
referring to the output that has some human readable "we've failed, and
here's a text explanation for why" component, see the various places in
protocol v0..2 where we emit that, e.g. telling a user what went wrong
with the "filter" arguments they provided etc.

* Re: changing the experimental 'git switch' (was: [Summit topic] Improving Git UX)
  2021-10-22 14:04     ` martin
  2021-10-22 14:24       ` Ævar Arnfjörð Bjarmason
@ 2021-10-24  6:54       ` Martin
  2021-10-24 20:27         ` changing the experimental 'git switch' Junio C Hamano
  1 sibling, 1 reply; 58+ messages in thread
From: Martin @ 2021-10-24  6:54 UTC (permalink / raw)
  Cc: git

On 22/10/2021 16:04, martin wrote:
> Unless there is more to it than copying the reflog, wouldn't it be
> better to add an option "--copy-reflog"?

Ok, I found the answer (actually 2 / see other mail)
> The |-c| and |-C| options have the exact same semantics as |-m| and 
> |-M|, except instead of the branch being renamed, it will be copied to 
> a new name, along with its config and reflog.
As for the "config" being part depends on which part of the man-page one 
reads.

But, anyway on the topic of "git switch"

Copying a branch (and maybe moving too) could be seen as an extension to 
creating a new one.
After all, after the copy operation there is a newly created branch. 
Only it has some more data with it.

So one could do
git switch  --settings-from <branch-with-reflog-and-conf> --create 
<new-branch>   <commit>
git switch  -s <branch-with-reflog-and-conf>   -c <new-branch>   <commit>

"settings-from" is just an example, there may be better names for it. 
Ideally not starting with a "c".

And using a name different from "copy" may be more accurate, because
unless it is created on the same <commit> to which the
<branch-with-reflog-and-conf> points, it is at best partially copied.

On top of that options could be brought in, to copy only reflog or only 
config.

Using the above with an --force-create / -C would make it a "move branch".
In this case there could be a shortcut, if <branch-with-reflog-and-conf> 
and (the old) <new-branch>  are the same.

* Re: changing the experimental 'git switch'
  2021-10-24  6:54       ` changing the experimental 'git switch' (was: [Summit topic] Improving Git UX) Martin
@ 2021-10-24 20:27         ` Junio C Hamano
  2021-10-25 12:48           ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 58+ messages in thread
From: Junio C Hamano @ 2021-10-24 20:27 UTC (permalink / raw)
  To: Martin; +Cc: git

Martin <git@mfriebe.de> writes:

> So one could do
> git switch  --settings-from <branch-with-reflog-and-conf> --create
> <new-branch>   <commit>
> git switch  -s <branch-with-reflog-and-conf>   -c <new-branch>   <commit>
>
> "settings-from" is just an example, there may be better names for
> it. Ideally not starting with a "c".
>
> And using a name different from "copy" may be more accurate, because
> unless it is created on the same <commit> to which the
> <branch-with-reflog-and-conf> points, it is at best partially copied.

I like the "copy the settings from this other branch when creating
this new branch" as a concept.

One thing that I find iffy is the reflog.  Even with the current
"create a new branch NEW, pointing at the same commit, tracking the
same remote-tracking branch, having the same branch description, and
pretending to have come along the same trajectory, out of this
original branch OLD", I actually find that the copying of reflog is
utterly questionable.  Before that operation, the new branch did not
exist, hence NEW@{4.days.ago} shouldn't say the same thing as
OLD@{4.days.ago} for the branch NEW that was created like so just a
minute ago.

If you generalize the operation to allow starting the new branch at
a different commit, it becomes even more strange to copy the reflog
of the "original" branch, which is not even the original for this
new branch.

Another thing nobody seems to have brought up is the branch
description.  We copy everything under branch.OLD.* to branch.NEW.*
and end up copying it from OLD to NEW, but I think that is also a
nonsense operation.

So, it probably makes sense to be more selective about what is
sensibly copied and what is not.  Reflog most likely does not
belong to the "sensibly copyable" set.  Tracking info most likely
does.  Among various configuration in branch.OLD.*, there may be
things like description that are not sensibly copyable.
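
The selective copy suggested above could be sketched roughly as follows.
This is a Python illustration over a plain dictionary standing in for
the configuration, not how "git branch -c" is implemented. The set of
keys treated as copyable is an assumption made for the example, and
copy_branch_settings() is a hypothetical name.

```python
# Assumption for illustration: which branch.<name>.* keys count as
# "sensibly copyable" is a policy choice, not something Git defines.
COPYABLE_KEYS = {"remote", "merge", "rebase", "pushremote"}


def copy_branch_settings(config, old, new):
    """Return a copy of config with selected branch.<old>.* keys
    duplicated under branch.<new>.*; the input dict is not modified."""
    prefix = f"branch.{old}."
    result = dict(config)
    for key, value in config.items():
        if not key.startswith(prefix):
            continue
        subkey = key[len(prefix):]
        # Tracking information is copied; anything else under
        # branch.<old>.* (such as the description) is left behind.
        if subkey.lower() in COPYABLE_KEYS:
            result[f"branch.{new}.{subkey}"] = value
    return result


if __name__ == "__main__":
    config = {
        "branch.OLD.remote": "origin",
        "branch.OLD.merge": "refs/heads/main",
        "branch.OLD.description": "work in progress",
    }
    for key, value in sorted(copy_branch_settings(config, "OLD", "NEW").items()):
        print(key, "=", value)
```

The reflog is deliberately absent from this sketch: under the reasoning
above it would not be copied at all.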

* Re: changing the experimental 'git switch'
  2021-10-24 20:27         ` changing the experimental 'git switch' Junio C Hamano
@ 2021-10-25 12:48           ` Ævar Arnfjörð Bjarmason
  2021-10-25 17:06             ` Junio C Hamano
  0 siblings, 1 reply; 58+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-25 12:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Martin, git


On Sun, Oct 24 2021, Junio C Hamano wrote:

> Martin <git@mfriebe.de> writes:
>
>> So one could do
>> git switch  --settings-from <branch-with-reflog-and-conf> --create
>> <new-branch>   <commit>
>> git switch  -s <branch-with-reflog-and-conf>   -c <new-branch>   <commit>
>>
>> "settings-from" is just an example, there may be better names for
>> it. Ideally not starting with a "c".
>>
>> And using a name different from "copy" may be more accurate, because
>> unless it is created on the same <commit> to which the
>> <branch-with-reflog-and-conf> points, it is at best partially copied.
>
> I like the "copy the settings from this other branch when creating
> this new branch" as a concept.
>
> One thing that I find iffy is the reflog.  Even with the current
> "create a new branch NEW, pointing at the same commit, tracking the
> same remote-tracking branch, having the same branch description, and
> pretending to have come along the same trajectory, out of this
> original branch OLD", I actually find that the copying of reflog is
> utterly questionable.  Before that operation, the new branch did not
> exist, hence NEW@{4.days.ago} shouldn't say the same thing as
> OLD@{4.days.ago} for the branch NEW that was created like so just a
> minute ago.
>
> If you generalize the operation to allow starting the new branch at
> a different commit, it becomes even more strange to copy the reflog
> of the "original" branch, which is not even the original for this
> new branch.
>
> Another thing nobody seems to have brought up is the branch
> description.  We copy everything under branch.OLD.* to branch.NEW.*
> and end up copying it from OLD to NEW, but I think that is also a
> nonsense operation.
>
> So, it probably makes sense to be more selective about what is
> sensibly copied and what is not.  Reflog most likely does not
> belong to the "sensibly copyable" set.  Tracking info most likely
> does.  Among various configuration in branch.OLD.*, there may be
> things like description that are not sensibly copyable.

It is a bit weird, but the main problem is that we'll use it for UI such
as @{-1} or whatever in addition to things like "x days ago". So if you
copy a branch for some ad-hoc testing, and were just running such a
command you might expect it to work.

For a user it also maps nicely to the mental model you'd have if you
copied two directories with the "-p" option to "cp", i.e. you'll be able
to run a "find" command on that checking mtime of N days ago and the
like.

Maybe it still doesn't make sense for those cases; these are just some
thoughts on UX edge cases.

* Re: Let's have public Git chalk talks, was Re: Notes from the Git Contributors' Summit 2021, virtual, Oct 19/20
  2021-10-22  9:44 ` Let's have public Git chalk talks, " Johannes Schindelin
@ 2021-10-25 12:58   ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 58+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-25 12:58 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

On Fri, Oct 22 2021, Johannes Schindelin wrote:

> Team,
>
> On Thu, 21 Oct 2021, Johannes Schindelin wrote:
>
>> * Let's have public Git chalk talks
>
> Okay, I give up on the mailing list. I tried some 20 times to send the
> notes out in one form or another, and it simply is not working, and the
> time I spent trying was definitely lost time.
>
> So here is a link:
> https://gist.github.com/dscho/003a0e112058e5794b5e08e84d34092d

Trying to see if it works for me. FWIW I received a bounce from GMail on
an unrelated mail of mine in this thread whose error suggests that
gmx.de's MX's may be rate limiting received mails on your account in
some way, which may or may not have anything to do with difficulties
interacting with kernel.org infrastructure in general.

Attempt to paste
https://gist.githubusercontent.com/dscho/003a0e112058e5794b5e08e84d34092d/raw/8d825e01152d957671d5da9e7c5217cabc37afa5/gistfile1.txt
follows below:

This session was led by Emily Shaffer. Supporting cast: Ævar Arnfjörð
Bjarmason, brian m. carlson, CB Bailey, and Junio Hamano.

Notes:

 1.  What’s a public chalk talk?

     1.  At Google, once a week, the team meets up with no particular topic in
         mind, or a couple topics, very informal

     2.  One person’s turn each week to give an informal talk with a white
         board (not using chalk)

     3.  Topic should be technical and of interest to the presenter

     4.  For example: how does protocol v2 work

     5.  Collaborative, interactive user session

     6.  Helps by learning about things

     7.  Helps by honing skills like presentation skills

     8.  A lot of (good) humility involved. For example, colleagues who have
         been familiar with the project for a long time admitting they don’t
         know, or have been wrong about things. Makes others feel more
         comfortable with their perceived lack of knowledge

     9.  Could be good for everybody on the Git mailing list, might foster less
         combative communication on the list

     10. Might be a way to attract new people by presenting “old timers” as
         humble

 2.  Does that appeal to anybody else?

 3.  Ævar: I think it would be great, it has been a long time since we’ve
     seen each other, and it already feels different

 4.  One thing to keep in mind: it’s hard to program on a white board :-)

 5.  Emily: some challenges:

     1. How often?

     2. What time?

     3. Probably move things around (because we’re global)

     4. Tech to use? Jitsi? Twitch? (Twitch seems to be particularly popular to
        teach programming)

     5. Figure out what topics to present

 6.  Ævar: does not matter what tech to use

 7.  Emily: some difference may make it matter: on Twitch, you can record, and
     they host recordings

 8.  One thing to worry about recording: people might be reticent to make
     public mistakes

 9.  It’s possible to do a Twitch stream, and not record it

 10. brian: maybe record it, but not keep the recordings forever

 11. People might be uncomfortable having their homes being recorded

 12. At GitHub, some sessions are recorded just so people from other timezones
     can watch later

 13. CB: would be a nice way to see the other contributors

 14. Really like the idea, hopefully won’t replace other things we do

 15. Emily: internally, often about patch series in progress (or not even
     started)

 16. So retaining recordings for long time makes even less sense

 17. Weekly might be too frequent; a monthly cadence sounds more reasonable?

 18. Junio: not sure we want an official schedule

 19. Assumed this would be an extension of what we do on IRC

 20. Remember when Linus would drop in and talk about a specific topic in
     depth, was nice

 21. Now we have video

 22. Emily: I fear if we don’t schedule it, it’ll never happen

 23. Ævar: would like it to be organized, maybe try some schedule and then
     iterate?

 24. brian: if it is scheduled, I can put it on my calendar, otherwise might be
     hard to block the time

 25. Every two weeks would be fine, especially when alternating timezones

 26. Emily: who besides me wants to volunteer for the other timezone?

 27. Ævar: if you start a schedule, I’ll see what I can do

 28. CB: also interested

 29. brian: can do, but Toronto is probably too close to California time

 30. Junio: schedule should be put on https://tinyurl.com/gitcal

 31. Emily: how about using a Google Sheet just like for the Contributors’
     Summit?

 32. One advantage to decide the topic in advance is that people can decide
     whether to make time to attend, on the other hand people might show up
     with a polished PowerPoint, which is not the idea

 33. brian: we can try, and if it does not work, make it less formal

 34. Emily: pretty much got what I need to start this

* Re: changing the experimental 'git switch'
  2021-10-21 16:45   ` changing the experimental 'git switch' (was: [Summit topic] Improving Git UX) Ævar Arnfjörð Bjarmason
                       ` (2 preceding siblings ...)
  2021-10-22 14:04     ` martin
@ 2021-10-25 16:44     ` Sergey Organov
  2021-10-25 22:23       ` Ævar Arnfjörð Bjarmason
  3 siblings, 1 reply; 58+ messages in thread
From: Sergey Organov @ 2021-10-25 16:44 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

[...]

> I really don't know, but I do think that the most viable path to a
> better UX for git is to consider its UX more holistically.
>
> To the extent that our UX is a mess I think it's mainly because we've
> ended up with an accumulation of behavior that made sense in isolation
> at the time, but which when combined presents bad or inconsistent UX to
> the user.

Yep. Moreover, this practice of "making sense" being the primary
reasoning factor doesn't work very well even in isolation, for single
Git sub-commands. As there is no defined underlying UI model, or rules,
or even clear guidelines of how to properly design command-line options,
multiple authors, all having their own sense and having no common ground
to base their decisions on, inevitably produce some spaghetti UI.

The UI model to be defined, provided we are serious about aiming at a
good design, in fact has at least 2 aspects to address:

1. Uniform top-level syntax of all the Git commands.

2. Uniform rules to handle command-line options.

Being hard to produce simple yet flexible design by itself, the problem
is further complicated by the need to absorb as much of the existing UI
as reasonably possible.

Once a model is defined though, we should be able to at least ensure new
designs fit the model, and then, over time, gradually replace legacy UIs
that currently don't fit.

As a side note, from this standpoint, discussing the deep details of "git
switch" options, or even the relevance of introducing "git switch" in
the first place, still has no proper grounding.

Not even touching (1) for now, let me put some feelers out to see if we
can even figure out what the rules or guidelines for command-line option
design might look like.

1. All options are divided into 2 classes: basic options and convenience
   options.

2. Minimalism. Every basic option should tweak exactly one aspect of
   program behavior.

3. Orthogonality. Every basic option should not "imply" any other
   option, nor change the behavior of any other option.

4. Reversibility. Every basic option should have a way to set it to any
   supported value at any moment, including setting it back to its
   default value.

5. Grouping for convenience. A convenience option (usually with a short
   syntax), should be semantically equivalent to an exact sequence of
   basic options, as if it were substituted at the place of the
   convenience option, and should not otherwise tweak program behavior.
   I.e., a convenience option should be simple textual synonym for
   particular sequence of basic options.

Please note that in the above model, the short form of a basic option is
formally considered to be a convenience option that is a synonym for the
long basic option.
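
A minimal sketch of how rules 2 to 5 might fit together, assuming a
made-up option table (none of the option names below are real Git
options) and a "last setting wins" reading of the reversibility rule:

```python
# Hypothetical option table: names are illustrative, not real Git options.
BASIC_DEFAULTS = {"color": "auto", "patch": "off", "stat": "off"}

# A convenience option is a plain textual synonym for a sequence of
# basic options (rule 5).
CONVENIENCE = {
    "-p": ["--patch=on"],
    "--summary-view": ["--stat=on", "--patch=off", "--color=never"],
}


def expand(argv):
    """Replace each convenience option, in place, by its basic options."""
    expanded = []
    for arg in argv:
        expanded.extend(CONVENIENCE.get(arg, [arg]))
    return expanded


def parse(argv):
    """Apply basic options left to right; the last setting of a key wins."""
    settings = dict(BASIC_DEFAULTS)
    for arg in expand(argv):
        if not arg.startswith("--") or "=" not in arg:
            raise ValueError(f"unknown option: {arg}")
        key, value = arg[2:].split("=", 1)
        if key not in settings:
            raise ValueError(f"unknown option: {arg}")
        settings[key] = value  # touches exactly one aspect (rules 2 and 3)
    return settings
```

Because expansion is purely textual, a later basic option can always
override part of an earlier convenience option, e.g.
parse(["--summary-view", "--patch=on"]).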

There are obviously some other useful guidelines that could be defined,
or some alternate approach could be chosen, but the primary point is
that if we want a consistent UI, we do need some rules, we need a
convenient implementation of the agreed-upon model, and we then need to
ensure that, of all the designs that "make sense", only those that fit
the underlying model are accepted.

Thanks,
-- Sergey Organov

* Re: changing the experimental 'git switch'
  2021-10-25 12:48           ` Ævar Arnfjörð Bjarmason
@ 2021-10-25 17:06             ` Junio C Hamano
  0 siblings, 0 replies; 58+ messages in thread
From: Junio C Hamano @ 2021-10-25 17:06 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Martin, git

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

>> So, it probably makes sense to be more selective about what is
>> sensibly copied and what is not.  Reflog most likely does not
>> belong to the "sensibly copyable" set.  Tracking info most likely
>> does.  Among various configuration in branch.OLD.*, there may be
>> things like description that are not sensibly copyable.
>
> It is a bit weird, but the main problem is that we'll use it for UI such
> as @{-1} or whatever in addition to things like "x days ago". So if you
> copy a branch for some ad-hoc testing, and were just running such a
> command you might expect it to work.

The event of new branch "creation" onward should be recorded to the
reflog of the newly created branch.  As of X days ago, the new
branch did not even exist, so that is not a good excuse to copy the
reflog.

Also @{-1} comes from the reflog of HEAD, which is different from
what we are discussing.

> For a user it also maps nicely to the mental model you'd have if you
> copied two directories with the "-p" option to "cp", i.e. you'll be able
> to run a "find" command on that checking mtime of N days ago and the
> like.
>
> Maybe it still doesn't make sense for those cases just some thoughts on
> UX edge cases.

To me, it makes no sense, with these analogies.  If I make a copy of
a file one month old with timestamp copied, I may appreciate that
the newly created copy hasn't yet been touched by looking at the old
timestamp, but that does not necessarily mean that I want to pretend
that the new file was there from that old date, or I want to pretend
that the last time the new file was edited before that was at an
even older time.

If I were renaming a branch, that is a totally different story.  In
the mental model, the "identity" of the branch did not change, only
the label that I use to refer to it (called "name") has changed.

But I do not expect copying to split and give half the identity of
the original to the new one.
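
A toy model of the distinction drawn above, assuming a simplified
in-memory representation of branches and reflogs (the class and method
names are hypothetical, and this models the behaviour argued for here,
not necessarily what "git branch -c" does today):

```python
import time


class Refs:
    """Toy model: branch name -> tip and reflog; not Git's real structures."""

    def __init__(self):
        self.branches = {}

    def create(self, name, tip, reason):
        # A new branch starts its own history with a single creation entry.
        self.branches[name] = {
            "tip": tip,
            "reflog": [(time.time(), tip, reason)],
        }

    def rename(self, old, new):
        # Same identity under a new label: the reflog moves along with it.
        branch = self.branches.pop(old)
        branch["reflog"].append(
            (time.time(), branch["tip"], f"renamed {old} to {new}")
        )
        self.branches[new] = branch

    def copy(self, old, new):
        # A copy is a new branch: it records its own creation event and
        # does not inherit the history of the original.
        self.create(new, self.branches[old]["tip"], f"copied from {old}")
```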

* Re: [Summit topic] The state of getting a reftable backend working in git.git
  2021-10-21 11:56 ` [Summit topic] The state of getting a reftable backend working in git.git Johannes Schindelin
@ 2021-10-25 19:00   ` Han-Wen Nienhuys
  2021-10-25 22:09     ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 58+ messages in thread
From: Han-Wen Nienhuys @ 2021-10-25 19:00 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

On Thu, Oct 21, 2021 at 1:56 PM Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
>
> This session was led by Ævar Arnfjörð Bjarmason (on behalf of Han-Wen
> Nienhuys, the driving force behind the reftable patches, who did not
> attend the Summit). Supporting cast: Jonathan "jrnieder" Nieder, Johannes
> "Dscho" Schindelin, Philip Oakley, Jeff "Peff" King, and Junio Hamano.
>

Thanks, Ævar, for doing this. I wanted to be there, but I took a
much-needed 2-week computer-less vacation.

>..
>      9.  Reftable has a set of files that go together. May want debugging tool
>          to dump the content of a binary reftable file. But we can
>          incrementally add those


The patch series includes a test-tool for dumping both individual
tables and a stack of tables. It's not super-polished, but it gets the
job done.

$ touch a ; ~/vc/git/git add a; ~/vc/git/git commit -mx
...

$  ~/vc/git/bin-wrappers/test-tool  dump-reftable -t
.git/reftable/0x000000000002-0x000000000002-327b23c6.ref
ref{refs/heads/main(2) val 1 ab21c324503544acca84eb55f5ee7dce24b23e15}
log{HEAD(2) Han-Wen Nienhuys <hanwen@google.com> 1635188263 0200
0000000000000000000000000000000000000000 =>
ab21c324503544acca84eb55f5ee7dce24b23e15

commit (initial): x

}
log{refs/heads/main(2) Han-Wen Nienhuys <hanwen@google.com> 1635188263 0200
0000000000000000000000000000000000000000 =>
ab21c324503544acca84eb55f5ee7dce24b23e15

commit (initial): x

}
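
For readers scripting around such table files, here is a small sketch
that splits a name like the one above into its parts. It assumes the
name has the shape <min update index>-<max update index>-<suffix>.ref
with hexadecimal fields; that reading, and the helper name, are
assumptions rather than something stated in this thread.

```python
import re

# Assumed shape: 0x<12 hex digits>-0x<12 hex digits>-<8 hex digits>.ref
_TABLE_NAME = re.compile(
    r"^0x([0-9a-f]{12})-0x([0-9a-f]{12})-([0-9a-f]{8})\.ref$"
)


def parse_table_name(name):
    """Return (min_index, max_index, suffix) for a reftable file name."""
    match = _TABLE_NAME.match(name)
    if match is None:
        raise ValueError(f"unexpected table name: {name}")
    return int(match.group(1), 16), int(match.group(2), 16), match.group(3)
```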


-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registration court and number: Hamburg, HRB 86891

Registered office: Hamburg

Managing directors: Paul Manicle, Halimah DeLaine Prado

* Re: [Summit topic] The state of getting a reftable backend working in git.git
  2021-10-25 19:00   ` Han-Wen Nienhuys
@ 2021-10-25 22:09     ` Ævar Arnfjörð Bjarmason
  2021-10-26  8:12       ` Han-Wen Nienhuys
  2021-10-26 15:51       ` Philip Oakley
  0 siblings, 2 replies; 58+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-25 22:09 UTC (permalink / raw)
  To: Han-Wen Nienhuys; +Cc: Johannes Schindelin, git, Philip Oakley

On Mon, Oct 25 2021, Han-Wen Nienhuys wrote:

> On Thu, Oct 21, 2021 at 1:56 PM Johannes Schindelin
> <Johannes.Schindelin@gmx.de> wrote:
>>
>> This session was led by Ævar Arnfjörð Bjarmason (on behalf of Han-Wen
>> Nienhuys, the driving force behind the reftable patches, who did not
>> attend the Summit). Supporting cast: Jonathan "jrnieder" Nieder, Johannes
>> "Dscho" Schindelin, Philip Oakley, Jeff "Peff" King, and Junio Hamano.
>>
>
> Thanks Ævar for doing this. I wanted to be there, but I took a much
> needed 2 week computer-less vacation.

No problem. As is perhaps clear from the notes, I had to hand-wave some
questions away since I didn't know about those things.

>>..
>>      9.  Reftable has a set of files that go together. May want debugging tool
>>          to dump the content of a binary reftable file. But we can
>>          incrementally add those
>
>
> The patch series includes a test-tool for dumping both individual
> tables and a stack of tables. It's not super-polished, but it gets the
> job done.
>
> $ touch a ; ~/vc/git/git add a; ~/vc/git/git commit -mx
> ...
>
> $  ~/vc/git/bin-wrappers/test-tool  dump-reftable -t
> .git/reftable/0x000000000002-0x000000000002-327b23c6.ref
> ref{refs/heads/main(2) val 1 ab21c324503544acca84eb55f5ee7dce24b23e15}
> log{HEAD(2) Han-Wen Nienhuys <hanwen@google.com> 1635188263 0200
> 0000000000000000000000000000000000000000 =>
> ab21c324503544acca84eb55f5ee7dce24b23e15
>
> commit (initial): x
>
> }
> log{refs/heads/main(2) Han-Wen Nienhuys <hanwen@google.com> 1635188263 0200
> 0000000000000000000000000000000000000000 =>
> ab21c324503544acca84eb55f5ee7dce24b23e15
>
> commit (initial): x
>
> }

Neat.

From memory I think the more general concern Philip Oakley was also
expressing (but maybe he'll chime in) could also be addressed by a tool
that just un-reftable-ifies a repository.

I think such a thing would be useful, and I think we don't have that
already. Isn't the choice between the files backend and reftable
currently an "init"-time setting?

It would be useful if for no other reason than to give users who are
looking at a repository that's weird somehow the ability to quickly
migrate 100% away from reftable, to see if it has any impact on whatever
they're seeing.

I wanted to implement a "git unpack-refs" a while ago for "pack-refs",
just to simulate some performance aspects of loose-refs without writing
an ad-hoc "ref exploder" one-liner again.

A migration tool would surely be pretty much that, no? I.e. we'd just
create a .git/refs.migrate or whatever, then hold a lock on reftable,
and in-place move .git/refs{.migrate,} (along with top-level files like
HEAD et al, presumably...).

Maybe there's more complexity I'm not considering than just the *.lock
dance in .git/*, but if not, such a tool could also convert freely
between the two backends, so you could try reftable out in an existing
checkout.
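
A rough sketch of the "write into a staging directory, then move it
into place" idea, assuming the refs are already available as a mapping
of ref name to object id. The function name is hypothetical, there is
no locking, and it refuses to run if a refs directory already exists,
so it illustrates only the staging step, not a real migration:

```python
import os


def write_loose_refs(git_dir, refs, staging="refs.migrate"):
    """Write refs (name -> hex object id) as loose files via a staging dir."""
    staging_dir = os.path.join(git_dir, staging)
    final_dir = os.path.join(git_dir, "refs")
    if os.path.exists(final_dir):
        raise FileExistsError(f"{final_dir} exists; refusing to overwrite")
    os.makedirs(staging_dir, exist_ok=True)
    for name, oid in refs.items():
        if not name.startswith("refs/"):
            raise ValueError(f"not under refs/: {name}")
        # refs/heads/main -> <staging>/heads/main
        path = os.path.join(staging_dir, *name.split("/")[1:])
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as f:
            f.write(oid + "\n")
    # A single rename publishes the whole tree at once.
    os.rename(staging_dir, final_dir)
    return final_dir
```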

* Re: changing the experimental 'git switch'
  2021-10-25 16:44     ` Sergey Organov
@ 2021-10-25 22:23       ` Ævar Arnfjörð Bjarmason
  2021-10-27 18:54         ` Sergey Organov
  0 siblings, 1 reply; 58+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-25 22:23 UTC (permalink / raw)
  To: Sergey Organov; +Cc: git, Jeff King


On Mon, Oct 25 2021, Sergey Organov wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
> [...]
>
>> I really don't know, but I do think that the most viable path to a
>> better UX for git is to consider its UX more holistically.
>>
>> To the extent that our UX is a mess I think it's mainly because we've
>> ended up with an accumulation of behavior that made sense in isolation
>> at the time, but which when combined presents bad or inconsistent UX to
>> the user.
>
> Yep. Moreover, this practice of "making sense" being the primary
> reasoning factor doesn't work very well even in isolation, for single
> Git sub-commands. As there is no defined underlying UI model, or rules,
> or even clear guidelines of how to properly design command-line options,
> multiple authors, all having their own sense and having no common ground
> to base their decisions on, inevitably produce some spaghetti UI.

Yes, we're definitely lacking on the documentation front here at least,
but I do think we have quite a bit of consistency in the form of
parse_options() users....

> The UI model to be defined, provided we are serious about aiming at a
> good design, in fact has at least 2 aspects to address:
>
> 1. Uniform top-level syntax of all the Git commands.

We have e.g. hash-object but nothing like hash_object, so there's that at
least..., but also mktag, not make-tag, so....

> 2. Uniform rules to handle command-line options.
>
> Being hard to produce simple yet flexible design by itself, the problem
> is further complicated by the need to absorb as much of the existing UI
> as reasonably possible.
>
> Once a model is defined though, we should be able to at least ensure new
> designs fit the model, and then, over time, gradually replace legacy UIs
> that currently don't fit.
>
> As a side-note, from this standpoint, discussing deep details of "git
> switch" options, or even relevancy of introducing of "git switch" in the
> first place, has still no proper ground.
>
> Not even touching (1) for now, let me put some feelers out to see if we
> can even figure how the rules or guidelines for command-line options
> design may look like.

Having hacked quite a bit on parse_options() recently, including quite a
bit of unsubmitted work, I've got some opinions in this area :)

That API is as close as we get to uniform UX in this area.

> 1. All options are divided into 2 classes: basic options and convenience
>    options.

Are you thinking of things like "git config --bool" vs. "git config
--type=bool" (let's ignore that we discourage the former for now), or
more like "common" vs. "obscure"?

> 2. Minimalism. Every basic option should tweak exactly one aspect of
>    program behavior.

Generally, although for things like "git log" you quickly end up
wanting to have pseudo-mode options imply one thing or the other,
sometimes for the better, sometimes for worse.

> 3. Orthogonality. Every basic option should not "imply" any other
>    option, nor change the behavior of any other option.

Yeah, generally.

> 4. Reversibility. Every basic option should have a way to set it to any
>    supported value at any moment, including setting it back to its
>    default value.

Yeah, for sure. We're generally quite good at this with parse_options(),
but there are exceptions (particularly with callbacks).

> 5. Grouping for convenience. A convenience option (usually with a short
>    syntax), should be semantically equivalent to an exact sequence of
>    basic options, as if it were substituted at the place of the
>    convenience option, and should not otherwise tweak program behavior.
>    I.e., a convenience option should be simple textual synonym for
>    particular sequence of basic options.

I think some examples for the above in terms of current git commands
would be quite helpful; I'm struggling to think of examples for some of
these.

> Please notice that in the above model basic option having a short form
> is formally considered to be a short convenience option that is a
> synonym for long basic option.
>
> There are obviously some other useful guidelines that could be defined,
> or some alternate approach could be chosen,but the primary point is that
> if we want a consistent UI, we do need some rules, and we need
> convenient implementation of the model agreed upon, and then ensure that
> from all the designs that "make sense", only those that fit into
> underlying model are accepted.

There was a recent discussion about cat-file option parsing semantics at
https://lore.kernel.org/git/87tuhuikhf.fsf@evledraar.gmail.com/

I have this unsubmitted (and updated from that discussion) patch to make
"cat-file" help friendlier:
https://github.com/avar/git/commit/bd32f57cd21

I wonder what you think about that new output vs. the old.

More generally, I've wanted to have some mode for parse_options() for a
while now to label a given option X as only going with option Y. We have
OPT_CMDMODE() for things that are mutually exclusive with all other
options, but not anything like an OPT_SUBCMDMODE() or whatever (and
sometimes such a thing would go with N "top-level modes", not just one).

Right now you need to do that manually, see the usage_msg_opt[f]()
verbosity at:
https://github.com/avar/git/blob/avar/cat-file-usage-and-options-handling/builtin/cat-file.c#L679-L755

A thing like that would be really useful, and would go a long way
towards consistent UX, as you could generate the sort of "grouped help"
shown in the commit link above with it, as well as have things like:

    git some-command --top-level-option --op<TAB>

tab-complete only those --op* options that go with that
--top-level-option.
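
To make the idea concrete, here is a small sketch of such a table and
the two things it would enable, validation and completion. It is not
parse_options() code; the mode and option names are placeholders taken
from, or invented in the spirit of, the example above.

```python
# Hypothetical table: which options go with which top-level mode.
MODE_OPTIONS = {
    "--top-level-option": {"--option-a", "--optimize"},
    "--other-mode": {"--output", "--verbose"},
}


def complete(mode, prefix):
    """Offer only those options of the given mode that match the prefix."""
    return sorted(
        opt for opt in MODE_OPTIONS.get(mode, ()) if opt.startswith(prefix)
    )


def check(mode, options):
    """Reject options that do not belong to the given top-level mode."""
    allowed = MODE_OPTIONS.get(mode)
    if allowed is None:
        raise ValueError(f"unknown mode: {mode}")
    stray = [opt for opt in options if opt not in allowed]
    if stray:
        raise ValueError(f"{', '.join(stray)} cannot be used with {mode}")
```

The same table could in principle drive grouped help output, since it
already records which options belong under which mode.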

I guess what I'm saying is that I agree with you, but just think that
incremental changes to these UX APIs are the most viable way forward.

* Re: [Summit topic] The state of getting a reftable backend working in git.git
  2021-10-25 22:09     ` Ævar Arnfjörð Bjarmason
@ 2021-10-26  8:12       ` Han-Wen Nienhuys
  2021-10-28 14:17         ` Philip Oakley
  2021-10-26 15:51       ` Philip Oakley
  1 sibling, 1 reply; 58+ messages in thread
From: Han-Wen Nienhuys @ 2021-10-26  8:12 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Johannes Schindelin, git, Philip Oakley

On Tue, Oct 26, 2021 at 12:16 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> From memory I think the more general concern Philip Oakley was also
> expressing (but maybe he'll chime in) could also be addressed by a tool
> that just un-reftable-ifies a repository.
>
> I think such a thing would be useful, and I think we don't have that
> already. Isn't the files backend or reftable usage now an "init"-time
> setting.
>..
> Maybe there's more complexity I'm not considering than just the *.lock
> dance in .git/*, but if not such a tool could also convert freely
> between the two backends, so you could try reftable out in an existing
> checkout.

I added a convert-ref-storage command to the JGit command line client
for exactly this:

$ jgit convert-ref-storage  -h
jgit convert-ref-storage [--format VAL] [--help (-h)] [--ssh [JSCH | APACHE]]

 --format VAL          : Format to convert to (reftable or refdir) (default:
                         reftable)
 --help (-h)           : display this help text (default: true)
 --ssh [JSCH | APACHE] : Selects the built-in ssh library to use, JSch or
                         Apache MINA sshd. (default: JSCH)

See here[1] for implementation. It's not safe for concurrent use with
other git commands, but that's hardly a common use-case.

[1] https://eclipse.googlesource.com/gerrit/jgit/jgit/+/1825a2230c06e7a6cbe23c69b63c3b7ecd2ceac6/org.eclipse.jgit/src/org/eclipse/jgit/internal/storage/file/FileRepository.java#806


-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.

* Re: [Summit topic] The state of getting a reftable backend working in git.git
  2021-10-25 22:09     ` Ævar Arnfjörð Bjarmason
  2021-10-26  8:12       ` Han-Wen Nienhuys
@ 2021-10-26 15:51       ` Philip Oakley
  1 sibling, 0 replies; 58+ messages in thread
From: Philip Oakley @ 2021-10-26 15:51 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, Han-Wen Nienhuys
  Cc: Johannes Schindelin, git

Hi Han-Wen,

On 25/10/2021 23:09, Ævar Arnfjörð Bjarmason wrote:
> On Mon, Oct 25 2021, Han-Wen Nienhuys wrote:
>
>> On Thu, Oct 21, 2021 at 1:56 PM Johannes Schindelin
>> <Johannes.Schindelin@gmx.de> wrote:
>>> This session was led by Ævar Arnfjörð Bjarmason (on behalf of Han-Wen
>>> Nienhuys, the driving force behind the reftable patches, who did not
>>> attend the Summit). Supporting cast: Jonathan "jrnieder" Nieder, Johannes
>>> "Dscho" Schindelin, Philip Oakley, Jeff "Peff" King, and Junio Hamano.
>>>
>> Thanks Ævar for doing this. I wanted to be there, but I took a much
>> needed 2 week computer-less vacation.
> No problem, as is perhaps clear from the notes I had to hand-wave some
> questions away since I didn't know about those things.
>
>>> ..
>>>      9.  Reftable has a set of files that go together. May want debugging tool
>>>          to dump the content of a binary reftable file. But we can
>>>          incrementally add those
>>
>> The patch series includes a test-tool for dumping both individual
>> tables and a stack of tables. It's not super-polished, but it gets the
>> job done.
>>
>> $ touch a ; ~/vc/git/git add a; ~/vc/git/git commit -mx
>> ...
>>
>> $  ~/vc/git/bin-wrappers/test-tool  dump-reftable -t
>> .git/reftable/0x000000000002-0x000000000002-327b23c6.ref
>> ref{refs/heads/main(2) val 1 ab21c324503544acca84eb55f5ee7dce24b23e15}
>> log{HEAD(2) Han-Wen Nienhuys <hanwen@google.com> 1635188263 0200
>> 0000000000000000000000000000000000000000 =>
>> ab21c324503544acca84eb55f5ee7dce24b23e15
>>
>> commit (initial): x
>>
>> }
>> log{refs/heads/main(2) Han-Wen Nienhuys <hanwen@google.com> 1635188263 0200
>> 0000000000000000000000000000000000000000 =>
>> ab21c324503544acca84eb55f5ee7dce24b23e15
>>
>> commit (initial): x
>>
>> }
> Neat.
>
> From memory I think the more general concern Philip Oakley was also
> expressing (but maybe he'll chime in) could also be addressed by a tool
> that just un-reftable-ifies a repository.

I was remembering my early exploits in trying to understand Git, when
all the web references tended to refer to the file system implementation
of refs, in a reverse-specification sort of way.

Refs can be hard to comprehend, especially when DWIMmery is also
involved and the user hasn't yet fully understood all the git commands
that can affect and read refs.

>
> I think such a thing would be useful, and I think we don't have that
> already. Isn't the files backend or reftable usage now an "init"-time
> setting.
>
> It would be useful if for no other reason than to give user who are
> looking at a repository that's weird somehow the ability to quickly
> migrate 100% away from reftable, to see if it has any impact on whatever
> they're seeing.

I remember the usefulness of the data_dumper when I was looking at the
early Git Visual Studio project generators and the like.

Having a similar dumper for the refs would be useful. I can see it being
split between a dumper for repos with just a few refs and one that can
cope with the thousands of refs scaling problem (some sort of selectivity?)
>
> I wanted to implement a "git unpack-refs" a while ago for "pack-refs",
> just to simulate some performance aspects of loose-refs without writing
> an ad-hoc "ref exploder" one-liner again.
>
> A migration tool would surely be pretty much that, no? I.e. we'd just
> create a .git/refs.migrate or whatever, then hold a lock on reftable,
> and in-place move .git/refs{.migrate,} (along with top-level files like
> HEAD et al, presumably...).

I could see an option that puts the exploded refs 'somewhere else' just
for inspection by a confused user...
>
> Maybe there's more complexity I'm not considering than just the *.lock
> dance in .git/*, but if not such a tool could also convert freely
> between the two backends, so you could try reftable out in an existing
> checkout.
Philip

* scripting speedups [was: [Summit topic] Crazy (and not so crazy) ideas]
  2021-10-21 11:55 ` [Summit topic] Crazy (and not so crazy) ideas Johannes Schindelin
  2021-10-21 12:30   ` Son Luong Ngoc
@ 2021-10-26 20:14   ` Eric Wong
  2021-10-30 19:58     ` Ævar Arnfjörð Bjarmason
  2021-11-02 13:52     ` scripting speedups [was: [Summit topic] Crazy (and not so crazy) ideas] Johannes Schindelin
  1 sibling, 2 replies; 58+ messages in thread
From: Eric Wong @ 2021-10-26 20:14 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> * Test suite is slow. Shell scripts and process forking.
> 
>    * What if we had a special shell that interpreted the commands in a
>      single process?
> 
>    * Even Git commands like rev-parse and hash-object, as long as that’s
>      not the command you’re trying to test

This is something I've wanted for a very long time as a scripter.
fast-import has been great over the years, as is
"cat-file --batch(-check)", but there are gaps that should be filled
(preferably without fragile linkage of shared libraries into a
script process).

>    * Dscho wants to slip in a C-based solution
> 
>    * Jonathan Tan commented: going back to your custom shell for tests
>      idea, one thing we could do is have a custom command that generates
>      the repo commits that we want (and that saves process spawns and
>      might make the tests simpler too)

Perhaps a not-seriously-proposed patch from 2006 could be
modernized for our now-libified internals:

https://yhbt.net/lore/git/Pine.LNX.4.64.0602232229340.3771@g5.osdl.org/

>       * We could replace several “setup repo” steps with “git fast-import”
>         instead.
> 
>    * Dscho measured: 0.5 sec - 30 sec in setup steps. Can use fast-import,
>      or can make a new format that helps us set up the test scenario

0.5s - 30s across the whole suite or individual tests?

Having a way to disable fsync globally should further improve
things, especially for people on slower storage.  libeatmydata
is available, but perhaps not widely available/known.

>    * Elijah: test-lib-functions helpers could be built ins
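
As an illustration of the "replace setup steps with fast-import" idea
quoted above, here is a sketch that builds a fast-import stream for a
linear series of commits in one go. The function name, the identity and
the timestamps are made up for the example, and paths are assumed not
to need quoting:

```python
def fast_import_stream(commits, ref="refs/heads/main",
                       ident="A U Thor <author@example.com>",
                       start=1112911993):
    """Build a stream; commits is a list of (message, {path: text})."""
    out = []
    for i, (message, files) in enumerate(commits):
        msg = message.encode()
        # Later commits on the same ref build on the previous one, so no
        # explicit "from" line is written here.
        out.append(f"commit {ref}\n".encode())
        out.append(f"committer {ident} {start + i} +0000\n".encode())
        out.append(b"data %d\n%s\n" % (len(msg), msg))
        for path, text in sorted(files.items()):
            blob = text.encode()
            out.append(f"M 100644 inline {path}\n".encode())
            out.append(b"data %d\n%s\n" % (len(blob), blob))
        out.append(b"\n")
    return b"".join(out)
```

The resulting bytes would be fed to a single "git fast-import" process
in a freshly initialized repository, instead of spawning an add and a
commit per test commit.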

* Re: [Summit topic] Documentation (translations, FAQ updates, new user-focused, general improvements, etc.)
  2021-10-22 14:31     ` Ævar Arnfjörð Bjarmason
@ 2021-10-27  7:02       ` Jean-Noël Avila
  2021-10-27  8:50       ` Jeff King
  1 sibling, 0 replies; 58+ messages in thread
From: Jean-Noël Avila @ 2021-10-27  7:02 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git

On Fri, Oct 22 2021, Ævar Arnfjörð Bjarmason wrote:
> 
> On Fri, Oct 22 2021, Jean-Noël Avila wrote:
> 
>> I'm sorry I missed this meeting; my presence could have helped a bit
>> with some subtopics.
>>
>> On 21/10/2021 at 13:56, Johannes Schindelin wrote:
>>> This session was led by brian m. carlson. Supporting cast: Jeff "Peff"
>>> King, Ævar Arnfjörð Bjarmason, Taylor Blau, Philip Oakley, Emily Shaffer,
>>> CB Bailey, and Jonathan "jrnieder" Nieder.
>>>
>>> Notes:
>>>
>>>  1. Background: answering on StackOverflow, other avenues for user questions,
>>>     even users from very large companies
>>>
>>>  2. How can we improve documentation?
>>>
>>>  3. Maybe even think about translating docs such as FAQs
>>>
>>>  4. Peff: there’s an effort to translate manpages
>>>
>>>     1. brian: Saw an announcement, haven’t seen what came of it
>>
>> The effort is still ongoing. Unfortunately, there isn't much output
>> from it yet, only the inclusion on git-scm.com.
>>
>> A proposal was sent for Debian packages.
>>
>> I'm open to any help in packaging what's already available in whatever
>> form is useful.
>>
>>
>> Some statistics:
>>
>> * there are 23 po files, "pt_BR" fully translated, "fr" half translated,
>> "de" one third; most other languages have not really started (the
>> portion already translated was made automatically for unmodified strings).
>>
>> * not all pages are included for translation; most porcelain pages
>> available on git-scm.com are included, but for instance, not the config
>> parts or the guides. That's already 10,687 source segments and 206,700
>> source words, which is a volume similar to "Crime and Punishment" by
>> Dostoyevsky. And it really looks like a punishment for most apprentice
>> translators willing to start.
>>
>> In order to lower the barrier to translators, the project is relying on
>> weblate: https://hosted.weblate.org/projects/git-manpages/translations/
>> while still retaining a "Developer's Certificate of Origin".
>>
>>
>>>
>>>     2. Peff: Some translated pages are live on git-scm.com (a github repo with
>>>        translations)
>>
>> For instance, the git init manpage is already available in 8 languages.
>>
>>
>>>
>>>     3. Ævar: It uses a third-party tool (po4a) that uses gettext by making each
>>>        paragraph a translated string. So it’s the same workflow as translating
>>>        code changes
>>
>> Asciidoc support is "co-developed" in po4a in parallel with the
>> translation: I fix bugs when they are found in the po files.
>>
>>>     4. Taylor: https://github.com/jnavila/git-manpages-l10n
>>
>>
>> If it looks too personal, it can be moved into the git organization.
>>
>>
>>>
>>>  5. Philip Oakley: I see manpages used as reference material instead of
>>>     educational documents
>>>
>>>
>>>     12. In stackoverflow you can see how people answer questions, how much less
>>>         existing background they assume
>>
>> Version control is usually already part of the culture of most users
>> (writers and engineers in other fields started using it some 10 years
>> ago). What their questions usually boil down to is: how can I use and
>> customize git features for my field of expertise? When software vendors
>> include git support in their applications, it is usually with reduced
>> functionality, and users quickly have to fall back to plain git when
>> they want a little more.
>>
>> General rules can help in getting started with a new customization, but
>> at some point the customization is specific to the tool. A library of
>> application-oriented customizations, help files and FAQs may be of
>> interest. Some customizations already exist, sometimes with errors
>> (meaning the maintainer of the customization has not fully understood
>> how git works), but they are scattered.
> 
> I'd very much support this living in-tree just as the po/* directory
> already does. I.e. periodically pulled down.
> 
> There are many OS's that have something like "apt install
> manpages-<lang>", so if we had these available they could be much more
> useful to users.
> 
> E.g. I see I can "apt install manpages-pt", but if you're a Portuguese
> speaker you probably won't chase down some third-party addition of
> Portuguese manpages, and even if they're in Debian other package
> maintainers might not add them if they're not in the "main" package etc.
> 
> What's standing in the way of us treating this in the same way as the
> po/* directory, if anything?
> 

 * I'm using Asciidoctor to process manpages because it processes the
asciidoc source directly, whereas a build that goes through the
intermediate docbook stage stops when some included files are missing
(a very common problem with such long-running translations, where some
included files are not yet translated).
 * I had understood from my initial presentation that adding this
content to "common" Git was not desirable. The question was more about
making the repo appear under the git organization on GitHub, not a full
integration.


* Re: [Summit topic] Documentation (translations, FAQ updates, new user-focused, general improvements, etc.)
  2021-10-22 14:31     ` Ævar Arnfjörð Bjarmason
  2021-10-27  7:02       ` Jean-Noël Avila
@ 2021-10-27  8:50       ` Jeff King
  1 sibling, 0 replies; 58+ messages in thread
From: Jeff King @ 2021-10-27  8:50 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Jean-Noël Avila, git

On Fri, Oct 22, 2021 at 04:31:46PM +0200, Ævar Arnfjörð Bjarmason wrote:

> I'd very much support this living in-tree just as the po/* directory
> already does. I.e. periodically pulled down.

Just a bit of a tangent here, since weblate was mentioned earlier.

I'd caution a bit against pulling the history generated by weblate
directly. It's pretty sub-optimal from a Git perspective: you have a
bunch of big .po files and then a ton of little commits changing one or
a handful of lines.

So the "logical" size of the repository (the sum of the actual object
sizes) ends up growing quite a bit. Deltas can help with the on-disk
size, but:

  - lots of operations scale with the logical size. The client-side
    index-pack of a clone, for instance, but also everyday stuff like
    "git log -S".

  - empirically we don't do a great job of finding these. See below for
    some numbers.

For instance, take https://github.com/phpmyadmin/phpmyadmin, a
repository which uses weblate (I don't mean to pick on them; it's just a
repo whose weblate-related packing I've looked into before). A fresh
clone is 1.3GB. If you do an aggressive repack, you can get it down to
about 550MB. But there's still tons of logical data. Running:

  git cat-file --batch-all-objects --batch-check='%(objectsize) %(objectsize:disk)' |
  perl -alne '
    $logical += $F[0]; $disk += $F[1];
    END { print "$logical / $disk = " . $logical / $disk }
  '

shows that there's over 70GB of logical data. It gets an impressive
156:1 compression ratio (for comparison, "normal" repos like linux.git
and git.git are around 40-60x in my experience).

If you split it up by directory, like this:

  git rev-list --objects --all --no-object-names -- po |
  git cat-file --batch-check='%(objectsize)' |
  perl -lne '$total += $_; END { print $total }'

you'll see that po/ accounts for almost 60GB of that logical size.
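
To see where the logical bytes go without picking one directory up
front, the same pipeline can be extended to bucket sizes by first path
component. This is only a rough sketch: it relies on the %(rest) atom of
"git cat-file --batch-check" to carry the path through, the helper name
logical_size_by_dir is made up for illustration, and an object reachable
under several paths is counted only under the first path that rev-list
reports for it.

```shell
# Sum logical object sizes per first path component.
# Reads "<size> <path>" lines on stdin, for example from:
#
#   git rev-list --objects --all |
#   git cat-file --batch-check='%(objectsize) %(rest)' |
#   logical_size_by_dir
#
# (logical_size_by_dir is a hypothetical helper, not a Git command.)
logical_size_by_dir () {
	awk '
		{
			size = $1
			# Everything after the first space is the path. It is
			# empty for commits, tags and root trees.
			path = substr($0, length($1) + 2)
			slash = index(path, "/")
			if (path == "")
				dir = "(no path)"
			else if (slash == 0)
				dir = path
			else
				dir = substr(path, 1, slash - 1)
			total[dir] += size
		}
		END {
			# %.0f avoids exponent notation for multi-gigabyte sums.
			for (dir in total)
				printf "%.0f %s\n", total[dir], dir
		}
	' | sort -rn
}
```

The output is one "<bytes> <name>" line per top-level entry, largest
first, so a directory like po/ that dominates the logical size should
show up on the first line.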

We face some of that in our current po/, too. They're big files, and
that's the nature of the problem space. But our current ones tend to be
edited by taking a pass over the whole file, rather than the one-liners
that a web-based workflow encourages.

To be clear, I'm not arguing against weblate in general. It's cool that
it makes it easier for people to contribute to translations. But I think
it has an outsized impact on size and performance compared to the rest
of the repository. That's a big price to pay for carrying the history
in-tree.

Obviously one option there is to squash the po/ history before pulling
it in. The weblate commit messages themselves aren't that useful. I'm
not actually sure if jnavila's work so far has been using weblate. The
commits in his git-html-l10n are much coarser than what I see in
phpmyadmin, for example (so maybe he's doing similar squashing already).

-Peff


* Re: changing the experimental 'git switch'
  2021-10-25 22:23       ` Ævar Arnfjörð Bjarmason
@ 2021-10-27 18:54         ` Sergey Organov
  0 siblings, 0 replies; 58+ messages in thread
From: Sergey Organov @ 2021-10-27 18:54 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, Jeff King

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> On Mon, Oct 25 2021, Sergey Organov wrote:
>
>> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>>
>> [...]
>>
>>> I really don't know, but I do think that the most viable path to a
>>> better UX for git is to consider its UX more holistically.
>>>
>>> To the extent that our UX is a mess I think it's mainly because we've
>>> ended up with an accumulation of behavior that made sense in isolation
>>> at the time, but which when combined presents bad or inconsistent UX to
>>> the user.
>>
>> Yep. Moreover, this practice of "making sense" being the primary
>> reasoning factor doesn't work very well even in isolation, for single
>> Git sub-commands. As there is no defined underlying UI model, or rules,
>> or even clear guidelines of how to properly design command-line options,
>> multiple authors, all having their own sense and having no common ground
>> to base their decisions on, inevitably produce some spaghetti UI.
>
> Yes we're definitely lacking on the documentation front here at least,
> but I do think we have quite a bit of consistency in the form of
> parse_options() users....

Yes, parse_options() is definitely a good step in the right direction.
It's not the magic pill though, as even if it gives all the tools for
proper options parsing and design, it still can't prevent all the ugly
hacks applied to the resulting program state after options are already
parsed.

>
>> The UI model to be defined, provided we are serious about aiming at a
>> good design, in fact has at least 2 aspects to address:
>>
>> 1. Uniform top-level syntax of all the Git commands.
>
> We have e.g. hash-object but nothing like hash_object, there's that at
> least..., but also mktag, not make-tag, so....

Naming conventions are nice to have too, but here I was more after
something like "man git" starting with more restrictive synopsis than it
is now, say:

SYNOPSIS

  git [COMMON_OPTIONS] [OBJECT] [OBJ_OPTIONS] [COMMAND] [ARGS]

  OBJECT := { branch, tag, stash, ... }

  COMMAND := OBJECT-dependent
  ...

So that, say, removal of file, tag, branch, etc., all were similar, say:

    git file delete
    git branch delete
    git tag delete
    git submodule delete

Measures to short-cut these full forms could probably be introduced,
so that, say, "git delete" is in fact parsed as "git file delete", and
then synonyms could be introduced, so that current "git rm" is still
accepted and becomes "git file delete" after parsing.
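
A minimal sketch of such a normalization step, assuming the
hypothetical "OBJECT COMMAND" grammar above. None of these forms exist
in Git today, and normalize_command is just an illustrative name:

```shell
# Rewrite short-cuts and synonyms into the full "OBJECT COMMAND ARGS" form.
# Hypothetical: "git file delete" and friends are part of the proposal
# being discussed, not of any released Git.
normalize_command () {
	case "$1" in
	delete)
		# Short-cut: a bare "delete" means "file delete".
		shift
		set -- file delete "$@"
		;;
	rm)
		# Synonym kept for backward compatibility.
		shift
		set -- file delete "$@"
		;;
	esac
	printf '%s\n' "$*"
}
```

With this, "normalize_command rm foo.c" and "normalize_command delete
foo.c" both print "file delete foo.c", while an already complete form
such as "branch delete topic" passes through unchanged.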

As has been mentioned elsewhere, it'd be nice to get some expertise in
the field of syntax and semantics of textual UIs to work on a suitable
design. I'm no expert in the field, just someone interested in
the topic.

>
>> 2. Uniform rules to handle command-line options.
>>
>> Being hard to produce simple yet flexible design by itself, the problem
>> is further complicated by the need to absorb as much of the existing UI
>> as reasonably possible.
>>
>> Once a model is defined though, we should be able to at least ensure new
>> designs fit the model, and then, over time, gradually replace legacy UIs
>> that currently don't fit.
>>
>> As a side-note, from this standpoint, discussing deep details of "git
>> switch" options, or even relevancy of introducing of "git switch" in the
>> first place, has still no proper ground.
>>
>> Not even touching (1) for now, let me put some feelers out to see if we
>> can even figure out what the rules or guidelines for command-line options
>> design may look like.
>
> Having hacked quite a bit on parse_options() recently, including quite a
> bit of unsubmitted work I've got some opinions in this area :)

Sure as hell you have! I only touched a little bit of it, and even that
raised some opinions ;-)

>
> That API is as close as we get to uniform UX in this area.

I'm all for the continuous work in this area, and thank you for taking
so much care of it.

>
>> 1. All options are divided into 2 classes: basic options and convenience
>>    options.
>
> Are you thinking of things like "git config --bool" v.s. "git config
> --type=bool" (let's ignore that we discourage the former for now), or
> more like "common" v.s. "obscure" ?

Neither. Basic options are to cover all the needed functionality, and
convenience options are to have most useful combinations of basic
options handy in a short form.

>
>> 2. Minimalism. Every basic option should tweak exactly one aspect of
>>    program behavior.
>
> Generally, although for things like "git log" you quickly end up with
> wanting to have pseudo-mode options imply one thing or the other,
> sometimes for the better, sometimes wfor worse.

In this model this is covered exactly by "convenience options". The
primary difference to the current status quo being that they don't
"imply" anything, or "defaults" anything, whatever that might mean. They
are simple textual synonyms for a set of basic options.

Users will mostly use convenience options, turning to basic option(s)
only when they need to achieve something very specific and not that
usual, or for scripting, or for aliasing.

>
>> 3. Orthogonality. Every basic option should not "imply" any other
>>    option, nor change the behavior of any other option.
>
> Yeah, generally.
>
>> 4. Reversibility. Every basic option should have a way to set it to any
>>    supported value at any moment, including setting it back to its
>>    default value.
>
> Yeah, for sure, we're generally quite good at this with parse_options(),
> but there's exceptions (particularly with callbacks).

Yes, parse_options() is definitely a step in the right direction. Though
do I already repeat myself?

>
>> 5. Grouping for convenience. A convenience option (usually with a short
>>    syntax), should be semantically equivalent to an exact sequence of
>>    basic options, as if it were substituted at the place of the
>>    convenience option, and should not otherwise tweak program behavior.
>>    I.e., a convenience option should be simple textual synonym for
>>    particular sequence of basic options.
>
> I think some examples for the above in terms of current git commands
> would be quite helpful, I'm struggling to think of examples for some of
> these.

OK, let's try something from "git log".

      --first-parent
           Follow only the first parent commit upon seeing a merge commit. [...]

           This option also changes default diff format for merge commits to first-parent, see
           --diff-merges=first-parent for details.

This violates the model as it changes both how commits are selected
(changes program behavior) and changes some default governed by another
option.

To adhere to the model, it might rather have been:

   --follow-first-parent
      Follow only the first parent commit upon seeing a merge commit.

   --first-parent
      Synonym for --follow-first-parent --diff-merges=first-parent

Here, --follow-first-parent and --diff-merges=first-parent are both
basic options, while --first-parent is convenience option. Please notice
how --first-parent is described entirely in terms of other options,
without anything else, so, saying:

   git log --first-parent

is *exactly* the same as saying:

   git log --follow-first-parent --diff-merges=first-parent

only shorter. [ OPT_SYNONYM()? ]

In this model all the functionality is to be covered by orthogonal
good-behaving basic options, and then convenience options are defined to
be like that, convenience, adding zero essential functionality.
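
The "simple textual synonym" rule can be sketched as a pure rewriting
step that runs before any option is interpreted. This is an
illustration only: --follow-first-parent is the hypothetical basic
option proposed above (Git has no such option today), and
expand_convenience_options is a made-up helper name.

```shell
# Replace each convenience option with the exact sequence of basic
# options it stands for; everything else is passed through untouched.
# Prints one option per line.
expand_convenience_options () {
	for arg in "$@"
	do
		case "$arg" in
		--first-parent)
			# Pure synonym: contributes no behavior of its own.
			# --follow-first-parent is hypothetical.
			printf '%s\n' --follow-first-parent --diff-merges=first-parent
			;;
		*)
			printf '%s\n' "$arg"
			;;
		esac
	done
}
```

Because the expansion happens in place, later options on the command
line can still override what the synonym set, which is what the
reversibility rule asks for.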

This would force an author of an option suitable for their current work
at hand to think and design suitable additional *basic* option(s) first,
and only then define new convenience option(s) in terms of basic
option(s) as needed.

>
>> Please notice that in the above model basic option having a short form
>> is formally considered to be a short convenience option that is a
>> synonym for long basic option.
>>
>> There are obviously some other useful guidelines that could be defined,
>> or some alternate approach could be chosen, but the primary point is that
>> if we want a consistent UI, we do need some rules, and we need
>> convenient implementation of the model agreed upon, and then ensure that
>> from all the designs that "make sense", only those that fit into
>> underlying model are accepted.
>
> There was a recent discussion about cat-file option parsing semantics at
> https://lore.kernel.org/git/87tuhuikhf.fsf@evledraar.gmail.com/
>
> I have this unsubmitted (and updated from that discussion) patch to make
> "cat-file" help friendlier:
> https://github.com/avar/git/commit/bd32f57cd21
>
> I wonder what you think about that new output v.s. the old.

[Un]fortunately I'm not familiar with cat-file at all, and reading
a few lines of the discussion you've referenced made me think I don't
want to, sorry. In general, I do think it's nice you are working hard on
the interfaces.

>
> More generally, I've wanted to have some mode for parse_options() for a
> while now to label a given option X as only going with option Y. We have
> OPT_CMDMODE() for things that are mutually exclusive with all other
> options, but not anything like a OPT_SUBCMDMODE() or whatever (and
> sometimes such a thing would go with N "top-level modes", not just
> one).

To me this looks like attempts to cover as many of the existing
use-cases for options in Git as possible, be they good or bad. This is
very nice to have anyway, but otherwise is orthogonal to the task of
defining a simple interface model that'd be targeted at eventually
obsoleting such support.

>
> Right now you need to do that manually, see the usage_msg_opt[f]()
> verbosity at:
> https://github.com/avar/git/blob/avar/cat-file-usage-and-options-handling/builtin/cat-file.c#L679-L755
>
> A thing like that would be really useful, and would go a long way
> towards consistent UX, as you could generate the sort of "grouped help"
> shown in the commit link above with it, as well as have things like:
>
>     git some-command --top-level-option --op<TAB>
>
> Tab-complete only those --op* options that go with that
> --top-level-option.

Probably this --top-level-option should better have been sub-command
rather than option in proper UI design. An option that changes semantics
so much that a bunch of options is to be replaced with something else
should probably be simply prohibited.

In general, it's a pain in the ass to handle option dependencies in
universal APIs such as parse_options(), so the best way of handling this
is to (eventually) get rid of the dependencies. Yeah, as usual, that's
simpler to say than to do, I know.

>
> I guess what I'm saying is that I agree with you, but just think that
> incremental changes to these UX APIs is the most viable way forward.

No objections. My primary point in this discussion is rather that
incremental changes only work fine when there is a defined target that
governs the direction of the steps being taken, and as far as I can see
we still have no such target, or maybe even no agreement on significance
of having one.

That said, the unified parse_options() being used everywhere will
definitely simplify transition to an advanced design in the future,
should such transition be ever attempted.

BTW, I didn't try hard, but talking about incremental changes, is it
possible to convert part(s) of handle_revision_opt() to the
parse_options() API? For example, diff_merges_parse_opts() is right now
mostly isolated part of handle_revision_opt(). Is it feasible to convert
it to use parse_options()?

Thanks,
-- Sergey Organov


* Re: [Summit topic] The state of getting a reftable backend working in git.git
  2021-10-26  8:12       ` Han-Wen Nienhuys
@ 2021-10-28 14:17         ` Philip Oakley
  0 siblings, 0 replies; 58+ messages in thread
From: Philip Oakley @ 2021-10-28 14:17 UTC (permalink / raw)
  To: Han-Wen Nienhuys, Ævar Arnfjörð Bjarmason
  Cc: Johannes Schindelin, git

On 26/10/2021 09:12, Han-Wen Nienhuys wrote:
> On Tue, Oct 26, 2021 at 12:16 AM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>> From memory I think the more general concern Philip Oakley was also
>> expressing (but maybe he'll chime in) could also be addressed by a tool
>> that just un-reftable-ifies a repository.
>>
>> I think such a thing would be useful, and I think we don't have that
>> already. Isn't the files backend or reftable usage now an "init"-time
>> setting?
>> ..
>> Maybe there's more complexity I'm not considering than just the *.lock
>> dance in .git/*, but if not such a tool could also convert freely
>> between the two backends, so you could try reftable out in an existing
>> checkout.
> I added a convert-ref-storage command to the JGit command line client
> for exactly this,
>
> $ jgit convert-ref-storage  -h
> jgit convert-ref-storage [--format VAL] [--help (-h)] [--ssh [JSCH | APACHE]]
>
>  --format VAL          : Format to convert to (reftable or refdir) (default:
>                          reftable)
>  --help (-h)           : display this help text (default: true)
>  --ssh [JSCH | APACHE] : Selects the built-in ssh library to use, JSch or
>                          Apache MINA sshd. (default: JSCH)
>
> See here[1] for implementation. It's not safe for concurrent use with
> other git commands, but that's hardly a common use-case.
>
> [1] https://eclipse.googlesource.com/gerrit/jgit/jgit/+/1825a2230c06e7a6cbe23c69b63c3b7ecd2ceac6/org.eclipse.jgit/src/org/eclipse/jgit/internal/storage/file/FileRepository.java#806

Useful to know that there is a method, though I was distinguishing the
inspection of the data, from the conversion between (in-use) storage types.

My request was more about having an easy access ramp for on-boarding new
users (who are likely to visualise the file system analogy better) than
for experienced Git admins and developers who may need to convert and
interact with the data.

One aspect (in the wider domain) may be extending the documentation to
help users understand ref manipulation and the meanings of the values.
-- 

Philip



* Re: scripting speedups [was: [Summit topic] Crazy (and not so crazy) ideas]
  2021-10-26 20:14   ` scripting speedups [was: [Summit topic] Crazy (and not so crazy) ideas] Eric Wong
@ 2021-10-30 19:58     ` Ævar Arnfjörð Bjarmason
  2021-11-03  9:24       ` test suite speedups via some not-so-crazy ideas (was: scripting speedups[...]) Ævar Arnfjörð Bjarmason
  2021-11-02 13:52     ` scripting speedups [was: [Summit topic] Crazy (and not so crazy) ideas] Johannes Schindelin
  1 sibling, 1 reply; 58+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-30 19:58 UTC (permalink / raw)
  To: Eric Wong; +Cc: Johannes Schindelin, git

On Tue, Oct 26 2021, Eric Wong wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
>> * Test suite is slow. Shell scripts and process forking.
>> 
>>    * What if we had a special shell that interpreted the commands in a
>>      single process?
>> 
>>    * Even Git commands like rev-parse and hash-object, as long as that’s
>>      not the command you’re trying to test
>
> This is something I've wanted for a very long time as a scripter.
> fast-import has been great over the years, as is
> "cat-file --batch(-check)", but there are gaps that should be filled
> (preferably without fragile linkage of shared libraries into a
> script process)
>
>>    * Dscho wants to slip in a C-based solution
>> 
>>    * Jonathan Tan commented: going back to your custom shell for tests
>>      idea, one thing we could do is have a custom command that generates
>>      the repo commits that we want (and that saves process spawns and
>>      might make the tests simpler too)
>
> Perhaps a not-seriously-proposed patch from 2006 could be
> modernized for our now-libified internals:

I think something very short of a "C-based solution" could give us most
of the wins here. Johannes was probably thinking of the scripting being
slow on Windows aspect of it.

But the main benefit of hypothetical C-based testing is that you can
connect it to the dependency tree we have in the Makefile, and only
re-run tests for code you needed to re-compile.

So e.g. we don't need to run tests that invoke "git tag" if the
dependency graph of builtin/tag.c didn't change.

With COMPUTE_HEADER_DEPENDENCIES we've got access to that dependency
information for our C code.

With trace2 we could record an initial test run, and know which built-in
commands are executed by which tests (even down to the sub-test level).

Connecting these two means that we can find all tests that, say, run "git
fsck", and if builtin/fsck.c is the only thing that changed in an
interactive rebase, those are the only tests we need to run.

Of course changes to things like cache.h or t/test-lib.sh would spoil
that cache entirely, but pretty much the same is true for re-compiling
things now, so would changing say builtin/init-db.c, as almost every
test does a "git init" somewhere.
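
As a rough sketch of the selection step, assume a map file with one
"<test script> <git command>" pair per line has already been extracted
from a trace2 recording of a full run. The map format, the helper name
select_tests and the naive "builtin/<name>.c implements git <name>"
mapping are all assumptions made for illustration. The latter is wrong
for files like builtin/init-db.c, and a real version would consult the
dependency information from COMPUTE_HEADER_DEPENDENCIES instead.

```shell
# select_tests MAP CHANGED_FILE...
# Print the test scripts worth re-running for the given changed files.
# MAP contains lines such as "t1450-fsck.sh fsck" (hypothetical format).
select_tests () {
	map=$1
	shift
	for src in "$@"
	do
		case "$src" in
		builtin/*.c)
			# Naive guess: builtin/fsck.c implements "git fsck".
			cmd=${src#builtin/}
			cmd=${cmd%.c}
			awk -v cmd="$cmd" '$2 == cmd { print $1 }' "$map"
			;;
		*)
			# cache.h, t/test-lib.sh and the like spoil the cache:
			# fall back to running every recorded test.
			awk '{ print $1 }' "$map"
			;;
		esac
	done | sort -u
}
```

For a change that only touches builtin/fsck.c this prints just the
scripts recorded as running "git fsck", while touching any file outside
builtin/ selects everything.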

But I think that approach is viable, and should take us from a huge
hypothetical project like "rewrite all the tests in C" to something
that's a viable weekend hacking project for someone who's interested.


* Re: scripting speedups [was: [Summit topic] Crazy (and not so crazy) ideas]
  2021-10-26 20:14   ` scripting speedups [was: [Summit topic] Crazy (and not so crazy) ideas] Eric Wong
  2021-10-30 19:58     ` Ævar Arnfjörð Bjarmason
@ 2021-11-02 13:52     ` Johannes Schindelin
  1 sibling, 0 replies; 58+ messages in thread
From: Johannes Schindelin @ 2021-11-02 13:52 UTC (permalink / raw)
  To: Eric Wong; +Cc: git


Hi Eric,

On Tue, 26 Oct 2021, Eric Wong wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> > * Test suite is slow. Shell scripts and process forking.
> >
> >    * What if we had a special shell that interpreted the commands in a
> >      single process?
> >
> >    * Even Git commands like rev-parse and hash-object, as long as that’s
> >      not the command you’re trying to test
>
> This is something I've wanted for a very long time as a scripter.
> fast-import has been great over the years, as is
> "cat-file --batch(-check)", but there are gaps that should be filled
> (preferably without fragile linkage of shared libraries into a
> script process)

The conclusion reached at the Summit seemed to be that we don't want to
get into that rabbit hole. We might very well end up maintaining a
POSIX-compatible shell inside Git. Definitely out of scope.

> >    * Dscho wants to slip in a C-based solution
> >
> >    * Jonathan Tan commented: going back to your custom shell for tests
> >      idea, one thing we could do is have a custom command that generates
> >      the repo commits that we want (and that saves process spawns and
> >      might make the tests simpler too)
>
> Perhaps a not-seriously-proposed patch from 2006 could be
> modernized for our now-libified internals:
>
> https://yhbt.net/lore/git/Pine.LNX.4.64.0602232229340.3771@g5.osdl.org/

Thanks for digging that out. I had looked for it multiple times over the
years, but searched using the wrong search terms.

However, as you can see, it went nowhere. Probably the (implicit)
conclusion was the same as above.

> >       * We could replace several “setup repo” steps with “git fast-import”
> >         instead.
> >
> >    * Dscho measured: 0.5 sec - 30 sec in setup steps. Can use fast-import,
> >      or can make a new format that helps us set up the test scenario
>
> 0.5s - 30s across the whole suite or individual tests?

That was just vague recollection, but it was for setup steps, i.e. the
initial test cases that do not even test Git's functionality but merely
want to set up a repository/worktree for the subsequent test cases to play
with.

> Having a way to disable fsync globally should further improve
> things, especially for people on slower storage.  libeatmydata
> is available, but perhaps not widely available/known.

What was missing from the notes was the crucial fact that I did this on
Windows, i.e. a platform that is pretty darned good at multi-tasking
(something with which Linux has historically struggled a bit), but not so
good at spawning wholesale processes.

So the problem really is that calling, say, `git commit` in a `for
$(test_seq 100)` loop is ridiculously expensive.

Even rewriting those setup test cases to something as verbose as a
`fast-import` stream accelerates them like you wouldn't believe.
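
To give an idea of what such a rewrite could look like, here is a
sketch that emits a fast-import stream for a linear history of N
commits, so that a single `git fast-import` process replaces N rounds
of `git add` and `git commit`. The helper name, the file names, the
branch name and the committer identity are placeholders chosen for
illustration, not something taken from the test suite.

```shell
# linear_history_stream N
# Emit a fast-import stream creating N commits on refs/heads/main, each
# adding one file. Intended use (not run here):
#
#   linear_history_stream 100 | git fast-import --quiet
#
# Note that fast-import updates neither the index nor the worktree.
linear_history_stream () {
	n=$1
	i=1
	while [ "$i" -le "$n" ]
	do
		msg="commit $i"
		content="content $i"
		printf 'commit refs/heads/main\n'
		printf 'mark :%d\n' "$i"
		printf 'committer C O Mitter <committer@example.com> %d +0000\n' \
			"$((1112911993 + i))"
		# "data <n>" announces exactly n bytes; the messages here are
		# plain ASCII, so the character count equals the byte count.
		printf 'data %d\n%s\n' "${#msg}" "$msg"
		if [ "$i" -gt 1 ]
		then
			printf 'from :%d\n' "$((i - 1))"
		fi
		printf 'M 100644 inline file-%d.t\n' "$i"
		printf 'data %d\n%s\n' "${#content}" "$content"
		printf '\n'
		i=$((i + 1))
	done
}
```

The shell loop still runs N times, but it only calls builtins like
printf, so the only process spawned for the whole setup should be the
one `git fast-import`.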

I even thought I threw out the idea of implementing a test helper that
could turn the output of `git log --graph --oneline` into a branch
replicating that structure, but it might have gotten lost in the noise.

I doubt that my test suite-centered commentary is very helpful for your
use cases, though.

Ciao,
Dscho


* test suite speedups via some not-so-crazy ideas (was: scripting speedups[...])
  2021-10-30 19:58     ` Ævar Arnfjörð Bjarmason
@ 2021-11-03  9:24       ` Ævar Arnfjörð Bjarmason
  2021-11-03 22:12         ` test suite speedups via some not-so-crazy ideas Junio C Hamano
  0 siblings, 1 reply; 58+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-03  9:24 UTC (permalink / raw)
  To: Eric Wong
  Cc: Johannes Schindelin, git, Lars Schneider, SZEDER Gábor,
	Jeff King


On Sat, Oct 30 2021, Ævar Arnfjörð Bjarmason wrote:

> On Tue, Oct 26 2021, Eric Wong wrote:
>
>> Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
>>> * Test suite is slow. Shell scripts and process forking.
>>> 
>>>    * What if we had a special shell that interpreted the commands in a
>>>      single process?
>>> 
>>>    * Even Git commands like rev-parse and hash-object, as long as that’s
>>>      not the command you’re trying to test
>>
>> This is something I've wanted for a very long time as a scripter.
>> fast-import has been great over the years, as is
>> "cat-file --batch(-check)", but there are gaps that should be filled
>> (preferably without fragile linkage of shared libraries into a
>> script process)
>>
>>>    * Dscho wants to slip in a C-based solution
>>> 
>>>    * Jonathan Tan commented: going back to your custom shell for tests
>>>      idea, one thing we could do is have a custom command that generates
>>>      the repo commits that we want (and that saves process spawns and
>>>      might make the tests simpler too)
>>
>> Perhaps a not-seriously-proposed patch from 2006 could be
>> modernized for our now-libified internals:
>
> I think something very short of a "C-based solution" could give us most
> of the wins here. Johannes was probably thinking of the scripting being
> slow on Windows aspect of it.
>
> But the main benefit of hypothetical C-based testing is that you can
> connect it to the dependency tree we have in the Makefile, and only
> re-run tests for code you needed to re-compile.
>
> So e.g. we don't need to run tests that invoke "git tag" if the
> dependency graph of builtin/tag.c didn't change.
>
> With COMPUTE_HEADER_DEPENDENCIES we've got access to that dependency
> information for our C code.
>
> With trace2 we could record an initial test run, and know which built-in
> commands are executed by which tests (even down to the sub-test level).
>
> Connecting these two means that we can find all tests that, say, run "git
> fsck", and if builtin/fsck.c is the only thing that changed in an
> interactive rebase, those are the only tests we need to run.
>
> Of course changes to things like cache.h or t/test-lib.sh would spoil
> that cache entirely, but pretty much the same is true for re-compiling
> things now, so would changing say builtin/init-db.c, as almost every
> test does a "git init" somewhere.
>
> But I think that approach is viable, and should take us from a huge
> hypothetical project like "rewrite all the tests in C" to something
> that's a viable weekend hacking project for someone who's interested.

First to outline some goals: I think saying we'd like to speed up
scripts is really getting into the weeds.

Surely we'd like to speed up test runs, and generally speaking our test
suite can be parallelized, and it mostly doesn't matter if it runs on
your computer or other people's computers, as long as it runs your
code. So:

 1. Even for contributors that have a slow system they could benefit from
    the hosted CI (on GitHub or wherever else) being faster.

 2. Our CI takes around 30-60m to finish.

 3. That CI time is almost entirely something that could be sped up by
    throwing hardware at it.

 4. We're currently using "Dv2 and DSv2-series" hosted runners
    (https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners),
    and we have quite a few people on-list who work for the
    company/companies involved.

    Is it within the realm of possibility to get more CI resources
    assigned to git/git's organization network?

 5. Or, is there willingness to host/pay for hosted runners from
    someone?

    Not wearing my PLC hat, I'd think that we could speed that up a lot with
    some reasonable money spending, and if pushing to CI made CI run in
    3-5m instead of 60m that would be worthwhile.

 6. Related to #5: I've been able to setup hosted runner jobs, and
    self-hosted runner jobs, but is there a way to do some opportunistic
    mixture of the two? Even one where self-hosted runners could come
    and go, and if they're present contribute resources to git/git's
    network?

 7. We run the various GIT_TEST_* etc. jobs in sequence, is there a
    reason for why we're serializing things in GitHub CI that could be
    parallelized?

    The vs-build and vs-test tests run in parallel, any reason we're not
    doing that trick on the ubuntu runners other than "nobody got to
    it?". We seem to be trying hard to do the exact opposite there..

    At the extreme end we could build git ~once, and have N tests depend
    on that, where N ~= $(ls t/*.sh) x $number_of_test_modes. But
    perhaps runner starting overhead starts to be the limiting factor at
    some point.

 8. To a first approximation, does anyone really care about getting an
    exhaustive list of all failures in a run, or just that we have *a*
    failure? You can always do an exhaustive run later.

 9. On the "no" answer to #8: When I build/test my own git I first run
    those tests that I modified in the relevant branches, and if any of
    those fail I just stop.

    I generally don't need to run the entirety of the rest of the test
    suite to stop and investigate why I have a failure.

    Perhaps our CI could use a similar trick, i.e. first test the set of
    modified test files, and perhaps with some ad-hoc matching of
    filenames, so e.g. if you modify builtin/add.c we'd run t/*add*.sh
    in the first set, and all with --immediate per #8 above.

    If we pass that we'd run the full set, minus that initial set.
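
The first pass described in #9 could be computed along these lines.
This is a sketch under assumptions: the list of changed paths would
come from something like "git diff --name-only" against the base
branch, the substring match on the builtin's name is the ad-hoc
heuristic mentioned above and will both over- and under-select, and
first_pass_tests is a made-up helper name.

```shell
# first_pass_tests ALL_TESTS_FILE CHANGED_PATH...
# ALL_TESTS_FILE lists every test script, one "t/tNNNN-*.sh" per line.
# Prints the scripts to run first (each with --immediate).
first_pass_tests () {
	all=$1
	shift
	for changed in "$@"
	do
		case "$changed" in
		t/t[0-9]*.sh)
			# A modified test script is always in the first pass.
			printf '%s\n' "$changed"
			;;
		builtin/*.c)
			# Ad-hoc match: builtin/add.c selects t/*add*.sh.
			name=${changed#builtin/}
			name=${name%.c}
			awk -v name="$name" 'index($0, name) > 0' "$all"
			;;
		esac
	done | sort -u
}
```

If every script in that first set passes, the second pass would run
whatever is in the full list but not in this output.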


* Re: test suite speedups via some not-so-crazy ideas
  2021-11-03  9:24       ` test suite speedups via some not-so-crazy ideas (was: scripting speedups[...]) Ævar Arnfjörð Bjarmason
@ 2021-11-03 22:12         ` Junio C Hamano
  0 siblings, 0 replies; 58+ messages in thread
From: Junio C Hamano @ 2021-11-03 22:12 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Eric Wong, Johannes Schindelin, git, Lars Schneider,
	SZEDER Gábor, Jeff King

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

>  8. To a first approximation, does anyone really care about getting an
>     exhaustive list of all failures in a run, or just that we have *a*
>     failure? You can always do an exhaustive run later.

I do, not necessarily because I want to catch all failures, but
mostly because I want to use the number of failing tests as a
rough sanity check.  I expect that the number is low, but not
necessarily zero, in the normal state, but if I see many in a run,
that rings different bells.  If we stop at the first failure, it
becomes harder to do this, and having to go there and restart with
"this time run the full set" manually is not really feasible.


* Re: [Summit topic] Server-side merge/rebase: needs and wants?
  2021-10-21 11:56 ` [Summit topic] Server-side merge/rebase: needs and wants? Johannes Schindelin
  2021-10-22  3:06   ` Bagas Sanjaya
@ 2021-11-08 18:21   ` Taylor Blau
  2021-11-09  2:15     ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 58+ messages in thread
From: Taylor Blau @ 2021-11-08 18:21 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Elijah Newren, git

I was discussing this with Elijah today in IRC. I thought that I sent
the following message to the list, but somehow dropped it from the CC
list, and only sent it to Elijah and Johannes.

Here it is in its entirety, this time copying the list.

On Thu, Oct 21, 2021 at 01:56:06PM +0200, Johannes Schindelin wrote:
>  5.  The challenge is not necessarily the technical challenges, but the UX for
>      server tools that live “above” the git executable.
>
>      1. What kind of output is needed? Machine-readable error messages?
>
>      2. What Git objects must be created: a tree? A commit?
>
>      3. How to handle, report, and store conflicts? Index is not typically
>         available on the server.

I looked a little bit more into what GitHub would need in order to make
the switch. For background, we currently perform merges and rebases
using libgit2 as the backend, for the obvious reason which is that in a
pre-ORT world we could not write an intermediate result without having
an index around.

(As a fun aside, we used to expand our bare copy of a repository into a
temporary working directory, perform the merge there, and then delete
the directory. We definitely don't do that anymore ;)).

It looks like our current libgit2 usage more-or-less returns an
(object_id, list<file>) tuple, where:

  - a non-NULL object_id is the result of a successful (i.e.,
    conflict-free) merge; specifically the oid of the resulting root
    tree

  - a NULL object_id and a non-empty list of files indicates that the
    merge could not be completed without manual conflict resolution, and
    the list of files indicates where the conflicts were

When we try to process a conflicted merge, we display the list of files
where conflicts were present in the web UI. We do have a UI to resolve
conflicts, but we populate the contents of that UI by telling libgit2 to
perform the same merge on *just that file*, and writing out the file
with its conflict markers as the result (and sending that result out to
a web editor).

So I think an ORT-powered server-side merge would have to be able to:

  - write out the contents of a merge (with a tree, not a commit), and
    indicate whether or not that merge was successful with an exit code

  - write out the list of files that had conflicts upon failure
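
As a rough illustration of that contract (not an existing Git command), here
is a minimal shell sketch. It assumes some merge backend has already produced
a tree OID and a list of conflicted paths; the helper name `report_merge` and
its output layout are hypothetical, made up for this example only.

```shell
# Hypothetical helper: report_merge <tree-oid-or-empty> [<conflicted-path>...]
# Mirrors the (object_id, list<file>) shape described above:
#   - non-empty OID and no paths: clean merge, print the tree OID, return 0
#   - anything else: print each conflicted path on its own line, return 1
report_merge () {
    oid=${1-}
    if test $# -gt 0
    then
        shift
    fi
    if test -n "$oid" && test $# -eq 0
    then
        printf 'tree %s\n' "$oid"
        return 0
    fi
    for path in "$@"
    do
        printf 'conflict %s\n' "$path"
    done
    return 1
}
```

A caller would then only need to check the exit status and read stdout, which
is about as much as the two bullet points above ask for.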

Given my limited knowledge of the ORT implementation, it seems like
writing out the conflicts themselves would be pretty easy. But GitHub
probably wouldn't use it, or at least not immediately, since we rely
heavily on being able to recreate the conflicts file-by-file as they are
needed.

Anyway, I happened to be looking into all of this during the summit, but
never wrote any of it down. So I figured that this might be helpful in
case folks are interested in pursuing this further. If so, let me know
if there are any other questions about what GitHub might want on the
backend, and I'll try to answer as best I can.

Thanks,
Taylor

* Re: [Summit topic] Server-side merge/rebase: needs and wants?
  2021-11-08 18:21   ` Taylor Blau
@ 2021-11-09  2:15     ` Ævar Arnfjörð Bjarmason
  2021-11-30 10:06       ` Christian Couder
  0 siblings, 1 reply; 58+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-09  2:15 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Johannes Schindelin, Elijah Newren, git

On Mon, Nov 08 2021, Taylor Blau wrote:

> I was discussing this with Elijah today in IRC. I thought that I sent
> the following message to the list, but somehow dropped it from the CC
> list, and only sent it to Elijah and Johannes.
>
> Here it is in its entirety, this time copying the list.
>
> On Thu, Oct 21, 2021 at 01:56:06PM +0200, Johannes Schindelin wrote:
>>  5.  The challenge is not necessarily the technical challenges, but the UX for
>>      server tools that live “above” the git executable.
>>
>>      1. What kind of output is needed? Machine-readable error messages?
>>
>>      2. What Git objects must be created: a tree? A commit?
>>
>>      3. How to handle, report, and store conflicts? Index is not typically
>>         available on the server.
>
> I looked a little bit more into what GitHub would need in order to make
> the switch. For background, we currently perform merges and rebases
> using libgit2 as the backend, for the obvious reason which is that in a
> pre-ORT world we could not write an intermediate result without having
> an index around.
>
> (As a fun aside, we used to expand our bare copy of a repository into a
> temporary working directory, perform the merge there, and then delete
> the directory. We definitely don't do that anymore ;)).
>
> It looks like our current libgit2 usage more-or-less returns an
> (object_id, list<file>) tuple, where:
>
>   - a non-NULL object_id is the result of a successful (i.e.,
>     conflict-free) merge; specifically the oid of the resulting root
>     tree
>
>   - a NULL object_id and a non-empty list of files indicates that the
>     merge could not be completed without manual conflict resolution, and
>     the list of files indicates where the conflicts were
>
> When we try to process a conflicted merge, we display the list of files
> where conflicts were present in the web UI. We do have a UI to resolve
> conflicts, but we populate the contents of that UI by telling libgit2 to
> perform the same merge on *just that file*, and writing out the file
> with its conflict markers as the result (and sending that result out to
> a web editor).
>
> So I think an ORT-powered server-side merge would have to be able to:
>
>   - write out the contents of a merge (with a tree, not a commit), and
>     indicate whether or not that merge was successful with an exit code
>
>   - write out the list of files that had conflicts upon failure
>
> Given my limited knowledge of the ORT implementation, it seems like
> writing out the conflicts themselves would be pretty easy. But GitHub
> probably wouldn't use it, or at least not immediately, since we rely
> heavily on being able to recreate the conflicts file-by-file as they are
> needed.
>
> Anyway, I happened to be looking into all of this during the summit, but
> never wrote any of it down. So I figured that this might be helpful in
> case folks are interested in pursuing this further. If so, let me know
> if there are any other questions about what GitHub might want on the
> backend, and I'll try to answer as best I can.

That's very informative, thanks.

Not that "ort" won't be much better at this, but FWIW git-merge-tree
sort of gets most of the way-ish to what you're describing already in
terms of a command interface.

I.e. I'm not the first or last to have (not for anything serious)
implemented a dry-run bare-repo merge with something like:

    git merge-tree origin/master git-for-windows/main origin/seen >diff
    # Better regex needed, but basically this
    grep "^\+<<<<<<< \.our$" diff && conflict=t

So with some parsing of that command output you can get a diff with one
side or the other applied.
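
A slightly more defensive variant of that check, as a sketch only: it assumes
the three-argument `git merge-tree` output marks conflicts with
`+<<<<<<< .our` lines (as in the snippet above) and that this output arrives
on stdin. The function name `has_conflict` is made up for the example.

```shell
# Hypothetical helper: succeeds if the merge-tree output on stdin contains
# an "our side" conflict marker added by the merge.
# In a basic regular expression a bare "+" is a literal plus sign, so no
# backslash is needed ("\+" is a GNU extension with a different meaning).
has_conflict () {
    grep -q '^+<<<<<<< \.our$'
}

# Possible usage (the three arguments are placeholders):
#   git merge-tree <base> <ours> <theirs> | has_conflict && conflict=t
```

Reading from a pipe rather than a temporary file also avoids leaving a `diff`
file behind in the repository.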

From there it's a matter of applying the patch, and from there you'd get
blobs/trees, which is painful from just having a diff & no index, so
it's a common use-case of libgit2 for just such basic usage.

But to the extent that we were talking about plumbing interfaces
wouldn't basically a git-merge-tree on steroids (or extension thereof)
do, i.e.:

 * Ask it to merge X heads, returns whether it worked or not
 * ... and can return a diff with conflict markers like this
 * ... for just some <pathspec>
 * ... maybe with the conflict already "resolved" one way or the other?
 * ... optionally, after some markers write one/both sides, spew out the
   relevant tree/blob OIDs
 * ... which again, could be limited by the <pathspec> above.

I'm thinking of something that basically works like git for-each-ref --format="". So:

    git merge-tree --format="..." <heads> -- <pathspec>

Where that <format> can be custom \0-delimited (or whatever) sections of
payload that could have whatever combination of the above you'd need. I
think git-for-each-ref is probably the best example we've got of a
plumbing interface in this category, i.e. being able to extract
arbitrary payloads via format specifiers & "path" (well, ref)
limitation.
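
For comparison, this is roughly how the existing for-each-ref interface is
consumed today. `%(refname)`, `%(objectname)` and `%00` (a NUL byte) are real
for-each-ref format atoms; nothing equivalent exists for merge-tree, and the
helper name `nul_to_tab` is made up for this sketch.

```shell
# Hypothetical helper: turn NUL-delimited fields on stdin into tab-separated
# ones so that line-oriented tools can consume them.
# (Assumes the fields themselves contain no tab characters.)
nul_to_tab () {
    tr '\000' '\t'
}

# Example with an existing command; %00 emits a NUL byte between fields:
#   git for-each-ref --format='%(refname)%00%(objectname)' refs/heads/ | nul_to_tab
```

A merge-tree `--format` along the lines proposed here could presumably be
post-processed the same way, with one record per path instead of per ref.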

Elijah probably has much better ideas already, I'm just spitballing. 

But if something like that worked it would be mostly a matter of
stealing code from for-each-ref and the like, and then the <handwaving>
mapping that to ORT callbacks somehow.

And then it could even learn a --batch mode, which with those formats
could allow calling it without paying the price for command
re-invocation, something like the update-ref/proposed cat-file interface
discussed in another thread at [1].

1. https://lore.kernel.org/git/211106.86k0hmgc8q.gmgdl@evledraar.gmail.com/

* Re: [Summit topic] Server-side merge/rebase: needs and wants?
  2021-11-09  2:15     ` Ævar Arnfjörð Bjarmason
@ 2021-11-30 10:06       ` Christian Couder
  0 siblings, 0 replies; 58+ messages in thread
From: Christian Couder @ 2021-11-30 10:06 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Taylor Blau, Johannes Schindelin, Elijah Newren, git

On Tue, Nov 9, 2021 at 1:18 PM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>
>
> On Mon, Nov 08 2021, Taylor Blau wrote:
>
> > I was discussing this with Elijah today in IRC. I thought that I sent
> > the following message to the list, but somehow dropped it from the CC
> > list, and only sent it to Elijah and Johannes.
> >
> > Here it is in its entirety, this time copying the list.
> >
> > On Thu, Oct 21, 2021 at 01:56:06PM +0200, Johannes Schindelin wrote:
> >>  5.  The challenge is not necessarily the technical challenges, but the UX for
> >>      server tools that live “above” the git executable.
> >>
> >>      1. What kind of output is needed? Machine-readable error messages?
> >>
> >>      2. What Git objects must be created: a tree? A commit?
> >>
> >>      3. How to handle, report, and store conflicts? Index is not typically
> >>         available on the server.
> >
> > I looked a little bit more into what GitHub would need in order to make
> > the switch. For background, we currently perform merges and rebases
> > using libgit2 as the backend, for the obvious reason which is that in a
> > pre-ORT world we could not write an intermediate result without having
> > an index around.
> >
> > (As a fun aside, we used to expand our bare copy of a repository into a
> > temporary working directory, perform the merge there, and then delete
> > the directory. We definitely don't do that anymore ;)).
> >
> > It looks like our current libgit2 usage more-or-less returns an
> > (object_id, list<file>) tuple, where:
> >
> >   - a non-NULL object_id is the result of a successful (i.e.,
> >     conflict-free) merge; specifically the oid of the resulting root
> >     tree
> >
> >   - a NULL object_id and a non-empty list of files indicates that the
> >     merge could not be completed without manual conflict resolution, and
> >     the list of files indicates where the conflicts were
> >
> > When we try to process a conflicted merge, we display the list of files
> > where conflicts were present in the web UI. We do have a UI to resolve
> > conflicts, but we populate the contents of that UI by telling libgit2 to
> > perform the same merge on *just that file*, and writing out the file
> > with its conflict markers as the result (and sending that result out to
> > a web editor).
> >
> > So I think an ORT-powered server-side merge would have to be able to:
> >
> >   - write out the contents of a merge (with a tree, not a commit), and
> >     indicate whether or not that merge was successful with an exit code
> >
> >   - write out the list of files that had conflicts upon failure
> >
> > Given my limited knowledge of the ORT implementation, it seems like
> > writing out the conflicts themselves would be pretty easy. But GitHub
> > probably wouldn't use it, or at least not immediately, since we rely
> > heavily on being able to recreate the conflicts file-by-file as they are
> > needed.
> >
> > Anyway, I happened to be looking into all of this during the summit, but
> > never wrote any of it down. So I figured that this might be helpful in
> > case folks are interested in pursuing this further. If so, let me know
> > if there are any other questions about what GitHub might want on the
> > backend, and I'll try to answer as best I can.
>
> That's very informative, thanks.

Yeah, thanks!

> Not that "ort" won't be much better at this,

I think the optimizations in "ort" could still be useful. Wouldn't it
be nice if rename detection was optimized for example?

> but FWIW git-merge-tree
> sort of gets most of the way-ish to what you're describing already in
> terms of a command interface.

Yeah, but if the engine is not up to date, I am not sure it's worth it
to reuse it just for the current very limited command interface.

> I.e. I'm not the first or last to have (not for anything serious)
> implemented a dry-run bare-repo merge with something like:
>
>     git merge-tree origin/master git-for-windows/main origin/seen >diff
>     # Better regex needed, but basically this
>     grep "^\+<<<<<<< \.our$" diff && conflict=t
>
> So with some parsing of that command output you can get a diff with one
> side or the other applied.

Yeah, it looks like it would be easy to add options like --ours,
--theirs, etc, to get only the part we are interested in. And we
already easily see if the merge conflicted or not from the current
output, as it seems to output:

"0 mode sha1 filename"

in case of a successful merge, and:

"1 mode sha1 filename"
"2 mode sha1 filename"
"3 mode sha1 filename"

in case of conflicts.

> From there it's a matter of applying the patch, and from there you'd get
> blobs/trees, which is painful from just having a diff & no index, so
> it's a common use-case of libgit2 for just such basic usage.
>
> But to the extent that we were talking about plumbing interfaces
> wouldn't basically a git-merge-tree on steroids (or extension thereof)
> do, i.e.:
>
>  * Ask it to merge X heads, returns whether it worked or not
>  * ... and can return a diff with conflict markers like this
>  * ... for just some <pathspec>
>  * ... maybe with the conflict already "resolved" one way or the other?
>  * ... optionally, after some markers write one/both sides, spew out the
>    relevant tree/blob OIDs
>  * ... which again, could be limited by the <pathspec> above.
>
> I'm thinking of something that basically works like git for-each-ref --format="". So:
>
>     git merge-tree --format="..." <heads> -- <pathspec>
>
> Where that <format> can be custom \0-delimited (or whatever) sections of
> payload that could have whatever combination of the above you'd need. I
> think git-for-each-ref is probably the best example we've got of a
> plumbing interface in this category, i.e. being able to extract
> arbitrary payloads via format specifiers & "path" (well, ref)
> limitation.

The current synopsis is:

git merge-tree <base-tree> <branch1> <branch2>

which is quite different from what you are proposing.

Given that it seems worth it to use a different underlying engine
(actually the "ort" one) than the current one, I think that it might
be better to start from scratch with a new command using the "ort"
engine.

> Elijah probably has much better ideas already, I'm just spitballing.

Yeah, I'd be interested in knowing Elijah's opinion on this. Although
maybe I misunderstood, but I thought that Elijah had plans to send
patches related to this to the list after v2.34.

> But if something like that worked it would be mostly a matter of
> stealing code from for-each-ref and the like, and then the <handwaving>
> mapping that to ORT callbacks somehow.

Yeah, but what would be left from the original git merge-tree then?

Wouldn't it make more sense to start with a new command that has
roughly the same features as git merge-tree and a similar interface
(though maybe not quite the same as we could anticipate some future
extensions and maybe learn a bit from other commands), but uses "ort"?
Then we could grow it as we want, without being burdened by the git
merge-tree legacy, in the same way as "ort" was developed without
being burdened by the recursive merge legacy?

> And then it could even learn a --batch mode, which with those formats
> could allow calling it without paying the price for command
> re-invocation, something like the update-ref/proposed cat-file interface
> discussed in another thread at [1].

Yeah, sure.

Thread overview: 58+ messages
2021-10-21 11:55 Notes from the Git Contributors' Summit 2021, virtual, Oct 19/20 Johannes Schindelin
2021-10-21 11:55 ` [Summit topic] Crazy (and not so crazy) ideas Johannes Schindelin
2021-10-21 12:30   ` Son Luong Ngoc
2021-10-26 20:14   ` scripting speedups [was: [Summit topic] Crazy (and not so crazy) ideas] Eric Wong
2021-10-30 19:58     ` Ævar Arnfjörð Bjarmason
2021-11-03  9:24       ` test suite speedups via some not-so-crazy ideas (was: scripting speedups[...]) Ævar Arnfjörð Bjarmason
2021-11-03 22:12         ` test suite speedups via some not-so-crazy ideas Junio C Hamano
2021-11-02 13:52     ` scripting speedups [was: [Summit topic] Crazy (and not so crazy) ideas] Johannes Schindelin
2021-10-21 11:55 ` [Summit topic] SHA-256 Updates Johannes Schindelin
2021-10-21 11:56 ` [Summit topic] Server-side merge/rebase: needs and wants? Johannes Schindelin
2021-10-22  3:06   ` Bagas Sanjaya
2021-10-22 10:01     ` Johannes Schindelin
2021-10-23 20:52       ` Ævar Arnfjörð Bjarmason
2021-11-08 18:21   ` Taylor Blau
2021-11-09  2:15     ` Ævar Arnfjörð Bjarmason
2021-11-30 10:06       ` Christian Couder
2021-10-21 11:56 ` [Summit topic] Submodules and how to make them worth using Johannes Schindelin
2021-10-21 11:56 ` [Summit topic] Sparse checkout behavior and plans Johannes Schindelin
2021-10-21 11:56 ` [Summit topic] The state of getting a reftable backend working in git.git Johannes Schindelin
2021-10-25 19:00   ` Han-Wen Nienhuys
2021-10-25 22:09     ` Ævar Arnfjörð Bjarmason
2021-10-26  8:12       ` Han-Wen Nienhuys
2021-10-28 14:17         ` Philip Oakley
2021-10-26 15:51       ` Philip Oakley
2021-10-21 11:56 ` [Summit topic] Documentation (translations, FAQ updates, new user-focused, general improvements, etc.) Johannes Schindelin
2021-10-22 14:20   ` Jean-Noël Avila
2021-10-22 14:31     ` Ævar Arnfjörð Bjarmason
2021-10-27  7:02       ` Jean-Noël Avila
2021-10-27  8:50       ` Jeff King
2021-10-21 11:56 ` [Summit topic] Increasing diversity & inclusion (transition to `main`, etc) Johannes Schindelin
2021-10-21 12:55   ` Son Luong Ngoc
2021-10-22 10:02     ` vale check, was " Johannes Schindelin
2021-10-22 10:03       ` Johannes Schindelin
2021-10-21 11:57 ` [Summit topic] Improving Git UX Johannes Schindelin
2021-10-21 16:45   ` changing the experimental 'git switch' (was: [Summit topic] Improving Git UX) Ævar Arnfjörð Bjarmason
2021-10-21 23:03     ` changing the experimental 'git switch' Junio C Hamano
2021-10-22  3:33     ` changing the experimental 'git switch' (was: [Summit topic] Improving Git UX) Bagas Sanjaya
2021-10-22 14:04     ` martin
2021-10-22 14:24       ` Ævar Arnfjörð Bjarmason
2021-10-22 15:30         ` martin
2021-10-23  8:27           ` changing the experimental 'git switch' Sergey Organov
2021-10-22 21:54         ` Sergey Organov
2021-10-24  6:54       ` changing the experimental 'git switch' (was: [Summit topic] Improving Git UX) Martin
2021-10-24 20:27         ` changing the experimental 'git switch' Junio C Hamano
2021-10-25 12:48           ` Ævar Arnfjörð Bjarmason
2021-10-25 17:06             ` Junio C Hamano
2021-10-25 16:44     ` Sergey Organov
2021-10-25 22:23       ` Ævar Arnfjörð Bjarmason
2021-10-27 18:54         ` Sergey Organov
2021-10-21 11:57 ` [Summit topic] Improving reviewer quality of life (patchwork, subsystem lists?, etc) Johannes Schindelin
2021-10-21 13:41   ` Konstantin Ryabitsev
2021-10-22 22:06     ` Ævar Arnfjörð Bjarmason
2021-10-22  8:02 ` Missing notes, was Re: Notes from the Git Contributors' Summit 2021, virtual, Oct 19/20 Johannes Schindelin
2021-10-22  8:22   ` Johannes Schindelin
2021-10-22  8:30     ` Johannes Schindelin
2021-10-22  9:07       ` Johannes Schindelin
2021-10-22  9:44 ` Let's have public Git chalk talks, " Johannes Schindelin
2021-10-25 12:58   ` Ævar Arnfjörð Bjarmason