git@vger.kernel.org mailing list mirror (one of many)
* Git for games working group
@ 2018-09-14 17:55 John Austin
  2018-09-14 19:00 ` Taylor Blau
  2018-09-14 21:21 ` Ævar Arnfjörð Bjarmason
  0 siblings, 2 replies; 35+ messages in thread
From: John Austin @ 2018-09-14 17:55 UTC (permalink / raw)
  To: git

Hey all,

I've been putting together a working group for game studios wanting to
use Git. There are a couple of blockers that keep most game and media
companies on Perforce or others, but most would love to use git if it
were feasible.

The biggest tasks I'd like to tackle are:
 - improvements to large file management (mostly solved by LFS, GVFS)
 - avoiding excessive binary file conflicts (this is one of the big
reasons most studios are on Perforce)

Is anyone interested in contributing/offering insights? I suspect most
folks here are git users as is, but if you know someone stuck on
Perforce, I'd love to chat with them!

Happy to field thoughts in this thread or answer other questions about
why git doesn't work for games at the moment.

Cheers,
JA



* Re: Git for games working group
  2018-09-14 17:55 Git for games working group John Austin
@ 2018-09-14 19:00 ` Taylor Blau
  2018-09-14 21:09   ` John Austin
  2018-09-14 21:13   ` John Austin
  2018-09-14 21:21 ` Ævar Arnfjörð Bjarmason
  1 sibling, 2 replies; 35+ messages in thread
From: Taylor Blau @ 2018-09-14 19:00 UTC (permalink / raw)
  To: John Austin; +Cc: git, sandals, larsxschneider, pastelmobilesuit

Hi John,

On Fri, Sep 14, 2018 at 10:55:39AM -0700, John Austin wrote:
> Is anyone interested in contributing/offering insights? I suspect most
> folks here are git users as is, but if you know someone stuck on
> Perforce, I'd love to chat with them!

I'm thrilled that other folks are interested in this, too. I'm not a
video game developer myself, but I am the maintainer of Git LFS. If
there's a capacity in which I could be useful to this group, I'd be more
than happy to offer myself in that capacity.

I'm cc-ing in brian carlson, Lars Schneider, and Preben Ingvaldsen on
this email, too, since they all serve on the core team of the project.

Thanks,
Taylor


* Re: Git for games working group
  2018-09-14 19:00 ` Taylor Blau
@ 2018-09-14 21:09   ` John Austin
  2018-09-15 16:40     ` Taylor Blau
  2018-09-14 21:13   ` John Austin
  1 sibling, 1 reply; 35+ messages in thread
From: John Austin @ 2018-09-14 21:09 UTC (permalink / raw)
  To: me; +Cc: git, sandals, larsxschneider, pastelmobilesuit

Hey Taylor,

Great to have your support! I think LFS has done a great job so far
solving the large file issue. I've been working myself on strategies
for handling binary conflicts, and particularly how to do it in a
git-friendly way (ie. avoiding as much centralization as possible and
playing into the commit/branching model of git). I've got to a loose
design that I like, but it'd be good to get some feedback, as well as
hearing what other game devs would want in a binary conflict system.

- John



* Re: Git for games working group
  2018-09-14 19:00 ` Taylor Blau
  2018-09-14 21:09   ` John Austin
@ 2018-09-14 21:13   ` John Austin
  2018-09-16  7:56     ` David Aguilar
  1 sibling, 1 reply; 35+ messages in thread
From: John Austin @ 2018-09-14 21:13 UTC (permalink / raw)
  To: me; +Cc: git, sandals, larsxschneider, pastelmobilesuit

Hey Taylor,

Great to have your support! I think LFS has done a great job so far
solving the large file issue. I've been working myself on strategies
for handling binary conflicts, and particularly how to do it in a
git-friendly way (ie. avoiding as much centralization as possible and
playing into the commit/branching model of git). I've got to a loose
design that I like, but it'd be good to get some feedback, as well as
hearing what other game devs would want in a binary conflict system.

- John

* Re: Git for games working group
  2018-09-14 17:55 Git for games working group John Austin
  2018-09-14 19:00 ` Taylor Blau
@ 2018-09-14 21:21 ` Ævar Arnfjörð Bjarmason
  2018-09-14 23:36   ` John Austin
  1 sibling, 1 reply; 35+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-09-14 21:21 UTC (permalink / raw)
  To: John Austin; +Cc: git


On Fri, Sep 14 2018, John Austin wrote:

>  - improvements to large file management (mostly solved by LFS, GVFS)

There's also the nascent "don't fetch all the blobs" work-in-progress
clone mode which might be of interest to you:
https://blog.github.com/2018-09-10-highlights-from-git-2-19/#partial-clones

>  - avoiding excessive binary file conflicts (this is one of the big
> reasons most studios are on Perforce)

Is this just a reference to the advisory locking mode perforce/cvs
etc. have or is there something else at play here?


* Re: Git for games working group
  2018-09-14 21:21 ` Ævar Arnfjörð Bjarmason
@ 2018-09-14 23:36   ` John Austin
  2018-09-15 16:42     ` Taylor Blau
  0 siblings, 1 reply; 35+ messages in thread
From: John Austin @ 2018-09-14 23:36 UTC (permalink / raw)
  To: avarab; +Cc: git

> There's also the nascent "don't fetch all the blobs" work-in-progress
> clone mode which might be of interest to you:
> https://blog.github.com/2018-09-10-highlights-from-git-2-19/#partial-clones

Yes! I've been pretty excited about this functionality. It drives a
lot of GVFS/VFS for Git under the hood. I think it's a great solution
to the repo-size issue.

> Is this just a reference to the advisory locking mode perforce/cvs
> etc. have or is there something else at play here?

Good catch. I actually phrased this precisely to avoid calling it
"File Locking".

An essential example would be a team of 5 audio designers working
together on the SFX for a game. If one designer wants to add a layer
of ambience to 40% of the .wav files, they have to coordinate with
everyone else on the project manually. Without coordination, this
designer will clobber any changes others made to these files in the
meantime. File Locking is the way that Perforce manages this, where a
developer can exclusively block modifications on a set of files across
the entire team.

File locking is just one solution to the problem. It's also one that
doesn't play well with git's decentralized structure and branching
model. I would state the problem more generally:
Developers need some way to know, as early as possible, if modifying a
file will cause conflicts upstream.

Optionally this knowledge can block modifying the file directly (if
we're certain there's already a conflicting version of the file on a
different branch).

JA



* Re: Git for games working group
  2018-09-14 21:09   ` John Austin
@ 2018-09-15 16:40     ` Taylor Blau
  2018-09-16 14:55       ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 35+ messages in thread
From: Taylor Blau @ 2018-09-15 16:40 UTC (permalink / raw)
  To: John Austin; +Cc: me, git, sandals, larsxschneider, pastelmobilesuit

On Fri, Sep 14, 2018 at 02:09:12PM -0700, John Austin wrote:
> I've been working myself on strategies for handling binary conflicts,
> and particularly how to do it in a git-friendly way (ie. avoiding as
> much centralization as possible and playing into the commit/branching
> model of git).

Git LFS handles conflict resolution and merging over binary files with
two primary mechanisms: (1) file locking, and (2) use of a merge-tool.

  1. is the most "non-Git-friendly" solution, since it requires the use
     of a centralized Git LFS server (to be run alongside your remote
     repository) and that every clone phones home to make sure that they
     are OK to acquire a lock.

     The workflow that we expect is that users will run 'git lfs lock
     /path/to/file' any time they want to make a change to an
     unmergeable file, and that this call first checks to make sure
     that they are the only person who would hold the lock.

     We also periodically "sync" the state of locks locally with those
     on the remote, namely during the post-merge, post-commit, and
     post-checkout hook(s).

     Users are expected to perform the 'git lfs unlock /path/to/file'
     anytime they "merge" their changes back into master, but the
     thought is that servers could be taught to automatically do this
     upon the remote detecting the merge.

  2. is a more Git-friendly approach, i.e., that the 'git mergetool'
     builtin does work with files tracked under Git LFS, i.e., that both
     sides of the merge are filtered so that the mergetool can resolve
     the changes in the large files instead of the textual pointers.


> I've got to a loose design that I like, but it'd be good to get some
> feedback, as well as hearing what other game devs would want in a
> binary conflict system.

Please do share, and I would be happy to provide feedback (and make
proposals to integrate favorable parts of your ideas into Git LFS).

Thanks,
Taylor


* Re: Git for games working group
  2018-09-14 23:36   ` John Austin
@ 2018-09-15 16:42     ` Taylor Blau
  2018-09-16 18:17       ` John Austin
  0 siblings, 1 reply; 35+ messages in thread
From: Taylor Blau @ 2018-09-15 16:42 UTC (permalink / raw)
  To: John Austin; +Cc: avarab, git

On Fri, Sep 14, 2018 at 04:36:19PM -0700, John Austin wrote:
> > There's also the nascent "don't fetch all the blobs" work-in-progress
> > clone mode which might be of interest to you:
> > https://blog.github.com/2018-09-10-highlights-from-git-2-19/#partial-clones
>
> Yes! I've been pretty excited about this functionality. It drives a
> lot of GVFS/VFS for Git under the hood. I think it's a great solution
> to the repo-size issue.

Right, though this still subjects the remote copy to all of the
difficulty of packing large objects (though Christian's work to support
other object database implementations would go a long way to help this).

Thanks,
Taylor


* Re: Git for games working group
  2018-09-14 21:13   ` John Austin
@ 2018-09-16  7:56     ` David Aguilar
  2018-09-17 13:48       ` Taylor Blau
  0 siblings, 1 reply; 35+ messages in thread
From: David Aguilar @ 2018-09-16  7:56 UTC (permalink / raw)
  To: John Austin; +Cc: me, git, sandals, larsxschneider, pastelmobilesuit

On Fri, Sep 14, 2018 at 02:13:28PM -0700, John Austin wrote:
> Hey Taylor,
> 
> Great to have your support! I think LFS has done a great job so far
> solving the large file issue. I've been working myself on strategies
> for handling binary conflicts, and particularly how to do it in a
> git-friendly way (ie. avoiding as much centralization as possible and
> playing into the commit/branching model of git). I've got to a loose
> design that I like, but it'd be good to get some feedback, as well as
> hearing what other game devs would want in a binary conflict system.
> 
> - John

Hey John, thanks for LFS, and thanks to Taylor for bringing up this topic.

Regarding file locking, the gitolite docs are insightful:
http://gitolite.com/gitolite/locking/index.html

File locking is how P4 handles binary conflicts.  It's actually
conflict prevention -- the locks prevent users from stepping
on each other without needing to actually talk to each other.

(I've always believed that this is actually a social problem
 (not a technical one) that is best served by better communication,
 but there's no doubt that having a technical guard in place is useful
 in many scenarios.)

From the POV of using Git as a P4 replacement, the locking support in
git-lfs seems like a fine solution to prevent binary conflicts.

https://github.com/git-lfs/git-lfs/wiki/File-Locking

Are there any missing features that would help improve the LFS solution?


Locking is just one aspect of binary conflicts.

In a lock-free world, another aspect is tooling around dealing
with actual conflicts.  It seems like the main challenges there are
related to introspection of changes and mechanisms for combining
changes.

Combining changes is inherently file-format specific, and I suspect
that native authoring tools are best used in those scenarios.
Maybe LFS can help deal with binary conflicts by having short and sweet
ways to grab the "base", "their" and "our" versions of the conflict
files.

Example:

	git lfs checkout --theirs --to theirs.wav conflict.wav
	git lfs checkout --ours --to ours.wav conflict.wav
	git lfs checkout --base --to base.wav conflict.wav

Then the user can use {ours,theirs,base}.wav to produce the
resolved result using their usual authoring tools.

From the plumbing perspective, we already have the tools to
do this today, but they're not really user-friendly because
they require the user to use "git cat-file --filters --path=..."
and redirect the output to get at their changes.
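
As a rough illustration of that plumbing, the sketch below wraps
"git cat-file --filters" in Python to write the three index stages of a
conflicted path to separate files. The function names and the output
naming scheme (conflict.base.wav and so on) are made up for this
example, and it assumes the path is given relative to the top of the
working tree.

```python
import subprocess
from pathlib import Path

# Index stage numbers Git assigns to each side of a conflicted path.
STAGES = {"base": 1, "ours": 2, "theirs": 3}


def stage_command(path, side):
    """Build the cat-file call that prints one side with filters applied.

    `path` is assumed to be relative to the top of the working tree.
    """
    stage = STAGES[side]
    return ["git", "cat-file", "--filters",
            f"--path={path}", f":{stage}:{path}"]


def export_stages(path):
    """Write each side next to the conflicted file (hypothetical helper)."""
    target = Path(path)
    written = {}
    for side in STAGES:
        # conflict.wav -> conflict.base.wav, conflict.ours.wav, ...
        out = target.with_name(f"{target.stem}.{side}{target.suffix}")
        with open(out, "wb") as fh:
            subprocess.run(stage_command(path, side), stdout=fh, check=True)
        written[side] = out
    return written
```

Calling export_stages("conflict.wav") inside a repository with that
path in conflict should leave three files to open in the usual
authoring tool. A side that has no index entry (for example, no common
base) makes the command fail, which this sketch simply lets propagate.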

Not sure if git-lfs is the right place for that kind of helper
wrapper command, but it's not a bad place for it either.
That said, none of these are user-friendly for non-Git users who
might be intimidated by a command line.

Is there anything we could add to git-cola to help?

Being able to save the different conflicted index stages to
separately named files seems like an obvious feature that
would help users when confronted with a binary conflict.

With LFS and the ongoing work related to MVFS, shallow clone,
and partial checkout, the reasons to use P4 over Git are becoming
less and less compelling.  It'd be great to polish the game asset
workflows further so that we can have a cohesive approach to
doing game asset development using Git that is easy enough for
non-technical users to use and understand.

I mention git-cola because it's a Git porcelain that already has
git-lfs support and I'm very much in favor of improving workflows
related to interacting with LFS, large files, repos, and binary content.

Are there other rough edges around (large) binary files that can be improved?

One thought that comes to mind is diffing -- I imagine that we
might want to use different diff tools depending on the file format.
Currently git-difftool uses a single tool for all files, but it seems
like being able to use different tools, based on the file type, could
be helpful.  Not sure if difftool is the right place for that, but
being able to specify different tools per-file seems to be useful in
that scenario.
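
To make the idea concrete, here is a small Python sketch of a wrapper
that picks a diff tool from the file name and builds a per-file
"git difftool" call. The pattern table and the tool names "audiodiff"
and "imagediff" are placeholders invented for this example; each would
need a matching difftool.<name>.cmd entry in the Git configuration.

```python
from fnmatch import fnmatch

# Hypothetical pattern -> tool table; the tool names are placeholders
# and would each need a matching difftool.<name>.cmd configuration.
TOOL_BY_PATTERN = [
    ("*.wav", "audiodiff"),
    ("*.psd", "imagediff"),
    ("*.png", "imagediff"),
]
DEFAULT_TOOL = "vimdiff"


def pick_tool(path):
    """Return the first tool whose pattern matches the file name."""
    name = path.rsplit("/", 1)[-1]
    for pattern, tool in TOOL_BY_PATTERN:
        if fnmatch(name, pattern):
            return tool
    return DEFAULT_TOOL


def difftool_command(path):
    """Build a per-file 'git difftool' invocation."""
    return ["git", "difftool", "--no-prompt",
            f"--tool={pick_tool(path)}", "--", path]
```

A wrapper like this would run one difftool call per changed path, so
each file type opens in the tool best suited to it.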

Another avenue that could use help is documentation about suggested
workflows.  Git's core documentation talks about various
large-file-centric features in isolation, but it'd be good to have a
single user-centric document (not unlike gitworkflows) to document best
practices for dealing with large files, repos, game assets, etc.

That alone would help dispel the myth that Git is unsuitable for
large repos, large files, and binary content.
-- 
David


* Re: Git for games working group
  2018-09-15 16:40     ` Taylor Blau
@ 2018-09-16 14:55       ` Ævar Arnfjörð Bjarmason
  2018-09-16 20:49         ` John Austin
  2018-09-17 13:55         ` Taylor Blau
  0 siblings, 2 replies; 35+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-09-16 14:55 UTC (permalink / raw)
  To: Taylor Blau
  Cc: John Austin, git, sandals, larsxschneider, pastelmobilesuit,
	Joey Hess


On Sat, Sep 15 2018, Taylor Blau wrote:

> On Fri, Sep 14, 2018 at 02:09:12PM -0700, John Austin wrote:
>> I've been working myself on strategies for handling binary conflicts,
>> and particularly how to do it in a git-friendly way (ie. avoiding as
>> much centralization as possible and playing into the commit/branching
>> model of git).
>
> Git LFS handles conflict resolution and merging over binary files with
> two primary mechanisms: (1) file locking, and (2) use of a merge-tool.
>
>   1. is the most "non-Git-friendly" solution, since it requires the use
>      of a centralized Git LFS server (to be run alongside your remote
>      repository) and that every clone phones home to make sure that they
>      are OK to acquire a lock.
>
>      The workflow that we expect is that users will run 'git lfs lock
>      /path/to/file' any time they want to make a change to an
>      unmergeable file, and that this call first checks to make sure
>      that they are the only person who would hold the lock.
>
>      We also periodically "sync" the state of locks locally with those
>      on the remote, namely during the post-merge, post-commit, and
>      post-checkout hook(s).
>
>      Users are expected to perform the 'git lfs unlock /path/to/file'
>      anytime they "merge" their changes back into master, but the
>      thought is that servers could be taught to automatically do this
>      upon the remote detecting the merge.
>
>   2. is a more Git-friendly approach, i.e., that the 'git mergetool'
>      builtin does work with files tracked under Git LFS, i.e., that both
>      sides of the merge are filtered so that the mergetool can resolve
>      the changes in the large files instead of the textual pointers.
>
>
>> I've got to a loose design that I like, but it'd be good to get some
>> feedback, as well as hearing what other game devs would want in a
>> binary conflict system.
>
> Please do share, and I would be happy to provide feedback (and make
> proposals to integrate favorable parts of your ideas into Git LFS).

All of this is obviously correct as far as git-lfs goes. Just to use
this as a jump-off comment on the topic of file locking and to frame
this discussion more generally.

It's true that a tool like git-lfs "requires the use of a centralized
[...] server" for file locking, but it's not the case that a feature
like file locking requires a centralized authority.

In particular, git-lfs unlike git-annex (which preceded it) does the
opposite of (to quote John upthread) "avoid[...] as much centralization
as possible", it *is* explicitly a centralized large file solution, not
a distributed one, as opposed to git-annex.

That's not a critique of git-lfs or the centralized method, or a
recommendation for decentralization in this context, but we already have
a similar distributed solution in the form of git-annex, it's just a hop
skip and a jump away from changing "who has the file" to "who has the
lock".

So how does that work? In the centralized case like
git-lfs/cvs/p4/whatever you have some "lock/unlock" command, and it
locks a file on a central server; locking is usually a [locked?, who]
state of "is it locked" and "who locked it?". Usually this is also
followed-up on the client-side by checking those files out without the
"w" flag.

In the hypothetical git-annex-like case (simplifying a bit for the
purposes of this explanation), for every FILE in your tree you have a
corresponding FILE.lock file, but it's not a boolean, but a log of who's
asked for locks, i.e. lines of:

    <repository UUID> <ts> <state> <who (email?)> <explanation?>

E.g.:

    $ cat Makefile.lock
    my-random-per-repo-id 2018-09-15 1 avarab@gmail.com "refactoring all Makefiles"
    my-random-per-repo-id 2018-09-16 0 avarab@gmail.com "done!"

This log is append-only; when clients encounter conflicts there's a
merge driver to ensure that all updates are kept.
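
A minimal Python sketch of what such a merge could do: keep every
distinct line from both sides and order the result by timestamp. The
function name is made up, and the sketch assumes well-formed lines
whose second field is an ISO date, as in the example above. Git's
built-in "union" merge driver (merge=union in .gitattributes) takes a
similar keep-both-sides approach for line-based files.

```python
def merge_lock_logs(ours, theirs):
    """Union-merge two versions of an append-only lock log.

    Every distinct line from either side is kept exactly once, and the
    result is ordered by the timestamp in the second field.
    """
    seen = set()
    merged = []
    for line in ours.splitlines() + theirs.splitlines():
        if line and line not in seen:
            seen.add(line)
            merged.append(line)
    # ISO dates sort correctly as plain strings; sort() is stable, so
    # entries with equal timestamps keep their first-seen order.
    merged.sort(key=lambda entry: entry.split()[1])
    return "\n".join(merged) + "\n" if merged else ""
```

Because no line is ever dropped, two clones that both appended to the
log while offline end up with the same merged content.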

You can then enact a policy saying you care or don't care about updates
from certain sources, or ignore locks older than so-and-so.
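
Such a policy could be sketched in Python as below. The function name,
the 30-day default and the "ignored_repos" parameter are assumptions
for illustration; the sketch also assumes log lines appear in
chronological order, so the last entry per repository and person wins.

```python
import shlex
from datetime import date, timedelta


def active_locks(log_text, today, max_age_days=30, ignored_repos=()):
    """Return {(repo, who): explanation} for locks still considered held.

    Each line is: <repository UUID> <ts> <state> <who> <explanation?>
    where state "1" means locked and "0" means released.
    """
    latest = {}
    for line in log_text.splitlines():
        if not line.strip():
            continue
        fields = shlex.split(line)  # keeps the quoted explanation whole
        repo, ts, state, who = fields[:4]
        note = fields[4] if len(fields) > 4 else ""
        if repo in ignored_repos:
            continue  # policy: we do not care about this source
        latest[(repo, who)] = (date.fromisoformat(ts), state == "1", note)
    cutoff = today - timedelta(days=max_age_days)
    return {key: note
            for key, (when, held, note) in latest.items()
            if held and when >= cutoff}
```

With the two-line Makefile.lock example above, the result is empty
because the second entry releases the lock.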

None of this is stuff I'd really recommend. It's just instructive to
point out that if someone wants a distributed locking solution for git,
it pretty much already exists, you can even (ab)use git-annex for it
today with a tiny hack on top.

I.e. each time you want to lock a file called Makefile just:

    echo We created a lock for this >Makefile.lock &&
    git annex add Makefile.lock &&
    git annex sync

And to release the lock:

    git annex rm Makefile.lock &&
    git annex sync

Then you and others using this just mentally pretend (or set up aliases)
that the following mapping exists:

    git annex get <file> && git annex sync ==> git lockit <file>
    git annex rm <file>  && git annex sync ==> git unlockit <file>

And that stuff like "git annex whereis" (designed to list "who has the
files") means "git annex who-has-locks".

Then you'd change the post-{checkout,merge} hooks to list the locks
"tracked annex files", chmod -w appropriately, and voila, a distributed
locking solution for git built on top of an existing tool you can
implement in a couple of hours.
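
The "chmod -w" part of such a hook might look roughly like this in
Python. The function name is hypothetical, and how the hook learns
which paths are locked by someone else is deliberately left out.

```python
import os
import stat

WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def mark_read_only(worktree, locked_paths):
    """Clear the write bits on files that someone else has locked.

    `locked_paths` are paths relative to `worktree`; how that list is
    obtained (e.g. from the .lock files) is left out of this sketch.
    Returns the paths that were actually changed.
    """
    changed = []
    for rel in locked_paths:
        full = os.path.join(worktree, rel)
        if not os.path.isfile(full):
            continue  # lock refers to a file not present in this checkout
        mode = stat.S_IMODE(os.stat(full).st_mode)
        if mode & WRITE_BITS:
            os.chmod(full, mode & ~WRITE_BITS)
            changed.append(rel)
    return changed
```

A post-checkout or post-merge hook would call this with the top of the
working tree and the current list of locked paths.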

Now, if I were in a game studio like this, would I do any of this? Nope,
I think even if you go for locks something like the centralized git-lfs
approach is simpler and probably more appropriate (you presumably want
to be centralized anyway).

But to be honest I don't really get the need for this given something
like the use-case noted upthread:

    > John Austin <john@astrangergravity.com> wrote:
    > An essential example would be a team of 5 audio designers working
    > together on the SFX for a game. If one designer wants to add a layer
    > of ambience to 40% of the .wav files, they have to coordinate with
    > everyone else on the project manually.

If you have 5 people working on a project together, isn't it more
straightforward to post in IRC/E-Mail:

    Hey @all, don't change *.wav files for the next couple of days,
    major refactoring.

That's what we do all the time over in the non-game-non-binary-assets SW
development world, and I daresay that even if you have textual
conflicts, they're sometimes just as hard to solve.

I.e. you can have two people on a team, unaware of each other, start
refactoring the same code in parallel in two completely different
ways, which then needs a lot of manual merging or throwing out most of
one implementation. The way that's usually dealt with is something like
the above example post to a ML.

But maybe I'm just not imagining the use-cases.


* Re: Git for games working group
  2018-09-15 16:42     ` Taylor Blau
@ 2018-09-16 18:17       ` John Austin
  2018-09-16 22:05         ` Jonathan Nieder
  0 siblings, 1 reply; 35+ messages in thread
From: John Austin @ 2018-09-16 18:17 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Ævar Arnfjörð Bjarmason, git

> Right, though this still subjects the remote copy to all of the
> difficulty of packing large objects (though Christian's work to support
> other object database implementations would go a long way to help this).

Ah, interesting -- I didn't realize this step was part of the
bottleneck. I presumed git didn't do much more than perhaps gzip'ing
binary files when it packed them up. Or do you mean the growing cost
of storing the objects locally as you work? Perhaps that could be
solved by allowing the client more control (ie. delete the oldest
blobs that exist on the server).



* Re: Git for games working group
  2018-09-16 14:55       ` Ævar Arnfjörð Bjarmason
@ 2018-09-16 20:49         ` John Austin
  2018-09-17 13:55         ` Taylor Blau
  1 sibling, 0 replies; 35+ messages in thread
From: John Austin @ 2018-09-16 20:49 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Taylor Blau, git, brian m. carlson, Lars Schneider,
	pastelmobilesuit, id

Thanks for all the thoughts so far -- I'm going to try to collate some
of my responses to avoid this getting too lengthy.

## Regarding Merging / Diffing
A couple of folks have suggested that we could improve merging /
diffing of binary files in general. I think this is useful, but can
only ever result in minor improvements, for the following reasons:

1. Game developers use an incredible number of proprietary file
formats: Maya, Houdini, Photoshop, Wwise, Unreal UAssets, etc. At the
end of the day, it's fairly unlikely that we can build visual merge
tools for these asset types without an enormous amount of corporate
support.

2. Merging doesn't have a meaning for many types of files. I think git
has trained us that everything is merge-able, but that's not always
the case. If you gave an audio designer two voice-over audio files and
asked them to merge them, they'd give you a pretty strange look. You
have to re-record it from scratch. Content files can be highly
intertwined and highly subjective: as a textual metaphor, every line
of content conflicts with every other line. Even if you had a perfect
merge tool, it just doesn't make much sense to try to merge changes,
unless it's an incredibly simple change.

## Regarding File Locking:
File locking works well enough in Perforce, but there are a couple of
issues I've found using file locking in LFS or in Gitolite (hadn't
seen this before, thanks!).

1. File Locking is an 'active' system. File Locking adds extra
operations that must be taken, both before writing to a file and then
after finishing your changes. Artists either must drop down to a
terminal (unlikely), or we must integrate our file-locking system with
existing artist tools (a large amount of work). Either way it adds a
lot of extra grunt-work. Imagine having to manually mark which files
you modify rather than just using git status. One of git's biggest
benefits is removing this type of manual labor.

2. File Locking doesn't extend well across branches. Acquiring a lock
usually blocks modifications to this file across all branches. This
cuts off basic branching models and features (like having release
branches) that are a large part of why git is so successful.

3. It's not entirely sound. Developer A can modify 'binary.bin', and
push the changes to master. Developer B, who is behind master by a
couple of days, can then unknowingly acquire the lock and make further
changes ignoring A's new commit. When B attempts to push, they will
get conflicts. If you look closely, this is a symptom of issue 2:
locking doesn't understand branches.

## "Implicit" Locking

Instead, I think it's better to think about how we can use the
structure of the git graph to solve the issue. Imagine the following
pre-commit hook for a developer attempting to commit 'binary.bin':

If there exists any commit touching binary.bin on a different branch
that is not integrated into this branch, block the commit.

In this case, making a commit with a file blocks others from touching
it, until they pull in that commit. To make the parallel, making a
commit acquires a 'lock' on the file, but there's no release. The only
requirement is that you always modify the latest version of the file.

This has issues of its own, and it's a simplification of the system I
have in mind. It means Developer A needs to have information about the
commit graph local to Developer B's machine (but notably not the
files). However I think it is a better starting place for thinking
about these sorts of systems. The locks fall implicitly from the
commit graph structure, so it plays well with all of your normal git
commands. You can branch, cherry-pick, rebase, etc without any extra
support or aliases. I'll write up something a bit more detailed in a
bit.
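
As a toy illustration of the rule, the Python sketch below models the
commit graph as plain dictionaries rather than reading a real
repository. All names here are made up for the example. In a real
hook, something like "git rev-list --all --not HEAD -- binary.bin"
would compute a similar set from the refs known locally.

```python
def ancestors(commit, parents):
    """Return `commit` plus everything reachable through parent links."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def blocking_commits(path, head, other_tips, parents, touched):
    """Find commits on other branches that modify `path` but are not
    yet part of `head`'s history. An empty result means the commit
    may proceed under the rule described above."""
    integrated = ancestors(head, parents)
    blockers = set()
    for tip in other_tips:
        for commit in ancestors(tip, parents) - integrated:
            if path in touched.get(commit, ()):
                blockers.add(commit)
    return blockers
```

For example, with commits B and C both branching from A, where only B
touches binary.bin, a commit to binary.bin on top of C is blocked by B
until B is merged in.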

- JA
On Sun, Sep 16, 2018 at 7:55 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
>
> On Sat, Sep 15 2018, Taylor Blau wrote:
>
> > On Fri, Sep 14, 2018 at 02:09:12PM -0700, John Austin wrote:
> >> I've been working myself on strategies for handling binary conflicts,
> >> and particularly how to do it in a git-friendly way (ie. avoiding as
> >> much centralization as possible and playing into the commit/branching
> >> model of git).
> >
> > Git LFS handles conflict resolution and merging over binary files with
> > two primary mechanisms: (1) file locking, and (2) use of a merge-tool.
> >
> >   1. is the most "non-Git-friendly" solution, since it requires the use
> >      of a centralized Git LFS server (to be run alongside your remote
> >      repository) and that every clone phones home to make sure that they
> >      are OK to acquire a lock.
> >
> >      The workflow that we expect is that users will run 'git lfs lock
> >      /path/to/file' any time they want to make a change to an
> >      unmergeable file, and that this call first checks to make sure
> >      that they are the only person who would hold the lock.
> >
> >      We also periodically "sync" the state of locks locally with those
> >      on the remote, namely during the post-merge, post-commit, and
> >      post-checkout hook(s).
> >
> >      Users are expected to perform the 'git lfs unlock /path/to/file'
> >      anytime they "merge" their changes back into master, but the
> >      thought is that servers could be taught to automatically do this
> >      upon the remote detecting the merge.
> >
> >   2. is a more Git-friendly approach, i.e., that the 'git mergetool'
> >      builtin does work with files tracked under Git LFS, i.e., that both
> >      sides of the merge are filtered so that the mergetool can resolve
> >      the changes in the large files instead of the textual pointers.
> >
> >
> >> I've got to a loose design that I like, but it'd be good to get some
> >> feedback, as well as hearing what other game devs would want in a
> >> binary conflict system.
> >
> > Please do share, and I would be happy to provide feedback (and make
> > proposals to integrate favorable parts of your ideas into Git LFS).
>
> All of this is obviously correct as far as git-lfs goes. Just to use
> this as a jump-off comment on the topic of file locking and to frame
> this discussion more generally.
>
> It's true that a tool like git-lfs "requires the use of a centralized
> [...] server" for file locking, but it's not the case that a feature
> like file locking requires a centralized authority.
>
> In particular, git-lfs unlike git-annex (which preceded it) does the
> opposite of (to quote John upthread) "avoid[...] as much centralization
> as possible", it *is* explicitly a centralized large file solution, not
> a distributed one, as opposed to git-annex.
>
> That's not a critique of git-lfs or the centralized method, or a
> recommendation for decentralization in this context, but we already have
> a similar distributed solution in the form of git-annex, it's just a hop
> skip and a jump away from changing "who has the file" to "who has the
> lock".
>
> So how does that work? In the centralized case like
> git-lfs/cvs/p4/whatever you have some "lock/unlock" command, and it
> locks a file on a central server, locking is usually a [locked?, who]
> state of "is it locked" and "who locked it?". Usually this is also
> followed-up on the client-side by checking those files out without the
> "w" flag.
>
> In the hypothetical git-annex-like case (simplifying a bit for the
> purposes of this explanation), for every FILE in your tree you have a
> corresponding FILE.lock file, but it's not a boolean, but a log of who's
> asked for locks, i.e. lines of:
>
>     <repository UUID> <ts> <state> <who (email?)> <explanation?>
>
> E.g.:
>
>     $ cat Makefile.lock
>     my-random-per-repo-id 2018-09-15 1 avarab@gmail.com "refactoring all Makefiles"
>     my-random-per-repo-id 2018-09-16 0 avarab@gmail.com "done!"
>
> This log is append-only, when clients encounter conflicts there's a
> merge driver to ensure that all updates are kept.
>
> You can then enact a policy saying you care or don't care about updates
> from certain sources, or ignore locks older than so-and-so.
>
> None of this is stuff I'd really recommend. It's just instructive to
> point out that if someone wants a distributed locking solution for git,
> it pretty much already exists, you can even (ab)use git-annex for it
> today with a tiny hack on top.
>
> I.e. each time you want to lock a file called Makefile just:
>
>     echo We created a lock for this >Makefile.lock &&
>     git annex add Makefile.lock &&
>     git annex sync
>
> And to release the lock:
>
>     git annex rm Makefile.lock &&
>     git annex sync
>
> Then you and others using this just mentally pretend (or setup aliases)
> that the following mapping exists:
>
>     git annex get <file> && git annex sync ==> git lockit <file>
>     git annex rm <file>  && git annex sync ==> git unlockit <file>
>
> And that stuff like "git annex whereis" (designed to list "who has the
> files") means "git annex who-has-locks".
>
> Then you'd change the post-{checkout,merge} hooks to list the locks
> "tracked annex files", chmod -w appropriately, and voila, a distributed
> locking solution for git built on top of an existing tool you can
> implement in a couple of hours.
>
> Now, if I were in a game studio like this would I do any of this? Nope,
> I think even if you go for locks something like the centralized git-lfs
> approach is simpler and probably more appropriate (you presumably want
> to be centralized anyway).
>
> But to be honest I don't really get the need for this given something
> like the use-case noted upthread:
>
>     > John Austin <john@astrangergravity.com> wrote:
>     > An essential example would be a team of 5 audio designers working
>     > together on the SFX for a game. If one designer wants to add a layer
>     > of ambience to 40% of the .wav files, they have to coordinate with
>     > everyone else on the project manually.
>
> If you have 5 people working on a project together, isn't it more
> straightforward to post in IRC/E-Mail:
>
>     Hey @all, don't change *.wav files for the next couple of days,
>     major refactoring.
>
> That's what we do all the time over in the non-game-non-binary-assets SW
> development world, and I daresay that even if you have textual
> conflicts, they're sometimes just as hard to solve.
>
> I.e. you can have two people unaware of each other on a team starting to
> refactor the same set of code in parallel, in two completely different
> ways, needing a lot of manual merging / throwing out of most of one
> implementation. The way that's usually dealt with is something like the
> above example post to a ML.
>
> But maybe I'm just not imagining the use-cases.
>



* Re: Git for games working group
  2018-09-16 18:17       ` John Austin
@ 2018-09-16 22:05         ` Jonathan Nieder
  2018-09-17 13:58           ` Taylor Blau
  0 siblings, 1 reply; 35+ messages in thread
From: Jonathan Nieder @ 2018-09-16 22:05 UTC (permalink / raw)
  To: John Austin; +Cc: Taylor Blau, Ævar Arnfjörð Bjarmason, git

Hi,

On Sun, Sep 16, 2018 at 11:17:27AM -0700, John Austin wrote:
> Taylor Blau wrote:

>> Right, though this still subjects the remote copy to all of the
>> difficulty of packing large objects (though Christian's work to support
>> other object database implementations would go a long way to help this).
>
> Ah, interesting -- I didn't realize this step was part of the
> bottleneck. I presumed git didn't do much more than perhaps gzip'ing
> binary files when it packed them up. Or do you mean the growing cost
> of storing the objects locally as you work? Perhaps that could be
> solved by allowing the client more control (ie. delete the oldest
> blobs that exist on the server).

John, I believe you are correct.  Taylor, can you elaborate about what
packing overhead you are referring to?

One thing I would like to see in the long run to help Git cope with
very large files is adding something similar to bup's "bupsplit" to
the packfile format (or even better, to the actual object format, so
that it affects object names).  In other words, using a rolling hash
to decide where to split a blob and using a tree-like structure so that
(1) common portions between files can be deduplicated and (2) portions
can be hashed in parallel.  I haven't heard of these things being the
bottleneck for anyone in practice today, though.
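
A minimal sketch of the rolling-hash split described above, under stated
assumptions: the window size, mask, and polynomial hash are arbitrary
illustrative choices (this is not bup's actual "bupsplit" algorithm, nor
any Git format), and split() and store() are hypothetical names. A cut
is made wherever the hash of the last WINDOW bytes matches a bit
pattern, so boundaries depend only on nearby content and the chunks
after an edit line up with the old ones again.

```python
import hashlib
import random

WINDOW = 48                       # bytes covered by the rolling hash (assumed)
MASK = (1 << 8) - 1               # cut roughly every 256 bytes on random data
BASE = 257
MOD = 1 << 32
TOP = pow(BASE, WINDOW - 1, MOD)  # weight of the byte leaving the window


def split(data):
    """Split data at content-defined boundaries chosen by a rolling hash."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        if i >= WINDOW:
            h = (h - data[i - WINDOW] * TOP) % MOD  # drop the oldest byte
        h = (h * BASE + byte) % MOD                 # take in the newest byte
        # Only cut once the window is full, so a cut depends on exactly
        # the last WINDOW bytes and nothing before them.
        if i + 1 >= WINDOW and (h & MASK) == MASK:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def store(data, objects):
    """Store each chunk once, keyed by its hash; return the list of chunk ids."""
    ids = []
    for chunk in split(data):
        oid = hashlib.sha1(chunk).hexdigest()
        objects[oid] = chunk  # identical chunks collapse to one entry
        ids.append(oid)
    return ids


rng = random.Random(0)
original = bytes(rng.randrange(256) for _ in range(20000))
edited = b"new header" + original  # a small edit at the front of the file

objects = {}
ids_old = store(original, objects)
ids_new = store(edited, objects)
```

Every chunk of the original except the first is reused when the edited
copy is stored, since only the region around the edit gets new
boundaries. Each chunk is hashed independently, so the hashing step
could also be spread across threads or processes.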

Thanks,
Jonathan


* Re: Git for games working group
  2018-09-16  7:56     ` David Aguilar
@ 2018-09-17 13:48       ` Taylor Blau
  0 siblings, 0 replies; 35+ messages in thread
From: Taylor Blau @ 2018-09-17 13:48 UTC (permalink / raw)
  To: David Aguilar
  Cc: John Austin, me, git, sandals, larsxschneider, pastelmobilesuit

On Sun, Sep 16, 2018 at 12:56:04AM -0700, David Aguilar wrote:
> Combining changes is inherently file-format specific, and I suspect
> that native authoring tools are best used in those scenarios.
> Maybe LFS can help deal with binary conflicts by having short and sweet
> ways to grab the "base", "their" and "our" versions of the conflict
> files.
>
> Example:
>
> 	git lfs checkout --theirs --to theirs.wav conflict.wav
> 	git lfs checkout --ours --to ours.wav conflict.wav
> 	git lfs checkout --base --to base.wav conflict.wav
>
> Then the user can use {ours,theirs,base}.wav to produce the
> resolved result using their usual authoring tools.

That's a good idea, and I think that it's sensible that we teach Git LFS
how to do it. I've opened an issue to that effect in our tracker:

  https://github.com/git-lfs/git-lfs/issues/3258

> One thought that comes to mind is diffing -- I imagine that we
> might want to use different diff tools depending on the file format.
> Currently git-difftool uses a single tool for all files, but it seems
> like being able to use different tools, based on the file type, could
> be helpful.

We have had some internal discussion about this. I think that we had
landed on something similar to:

  1. Teach .gitattributes a new mergetool= attribute, which would
     specify a reference to a mergetool driver, and

  2. Teach .gitconfig about a way to store mergetool drivers, similar to
     how we name filters today.

Upon my re-reading of this proposal, it was suggested that we implement
this in terms of 'git lfs mergetool', but I don't see why this wouldn't
be a good fit for Git in general.


Thanks,
Taylor


* Re: Git for games working group
  2018-09-16 14:55       ` Ævar Arnfjörð Bjarmason
  2018-09-16 20:49         ` John Austin
@ 2018-09-17 13:55         ` Taylor Blau
  2018-09-17 14:01           ` Randall S. Becker
  2018-09-17 15:00           ` Ævar Arnfjörð Bjarmason
  1 sibling, 2 replies; 35+ messages in thread
From: Taylor Blau @ 2018-09-17 13:55 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Taylor Blau, John Austin, git, sandals, larsxschneider,
	pastelmobilesuit, Joey Hess

On Sun, Sep 16, 2018 at 04:55:13PM +0200, Ævar Arnfjörð Bjarmason wrote:
> In the hypothetical git-annex-like case (simplifying a bit for the
> purposes of this explanation), for every FILE in your tree you have a
> corresponding FILE.lock file, but it's not a boolean, but a log of who's
> asked for locks, i.e. lines of:
>
>     <repository UUID> <ts> <state> <who (email?)> <explanation?>
>
> E.g.:
>
>     $ cat Makefile.lock
>     my-random-per-repo-id 2018-09-15 1 avarab@gmail.com "refactoring all Makefiles"
>     my-random-per-repo-id 2018-09-16 0 avarab@gmail.com "done!"
>
> This log is append-only, when clients encounter conflicts there's a
> merge driver to ensure that all updates are kept.

Certainly. I think that there are two things that aren't well expressed
under this mechanism:

  1. Having a log of locks held against that (a) file doesn't prevent us
     from introducing merge conflicts at the <file>.lock level, so we're
     reliant upon the caller first running 'git pull' and hoping that no
     one beats them out to locking and pushing their lock.

  2. Multi-file locks, e.g., "I need to lock file(s) X, Y, and Z
     together." This isn't possible in Git LFS today with the existing "git
     lfs lock" command (I had to check, but it takes only _one_ filename as
     its argument).

     Perhaps it would be nice to support something like this someday in
     Git LFS, but I think we would have to reimagine how this would look
     in your file.lock scheme.

Thanks,
Taylor


* Re: Git for games working group
  2018-09-16 22:05         ` Jonathan Nieder
@ 2018-09-17 13:58           ` Taylor Blau
  2018-09-17 15:58             ` Jonathan Nieder
  0 siblings, 1 reply; 35+ messages in thread
From: Taylor Blau @ 2018-09-17 13:58 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: John Austin, Taylor Blau, Ævar Arnfjörð Bjarmason,
	git

On Sun, Sep 16, 2018 at 03:05:48PM -0700, Jonathan Nieder wrote:
> Hi,
>
> On Sun, Sep 16, 2018 at 11:17:27AM -0700, John Austin wrote:
> > Taylor Blau wrote:
>
> >> Right, though this still subjects the remote copy to all of the
> >> difficulty of packing large objects (though Christian's work to support
> >> other object database implementations would go a long way to help this).
> >
> > Ah, interesting -- I didn't realize this step was part of the
> > bottleneck. I presumed git didn't do much more than perhaps gzip'ing
> > binary files when it packed them up. Or do you mean the growing cost
> > of storing the objects locally as you work? Perhaps that could be
> > solved by allowing the client more control (ie. delete the oldest
> > blobs that exist on the server).
>
> John, I believe you are correct.  Taylor, can you elaborate about what
> packing overhead you are referring to?

Jonathan, you are right. I was also referring to the increased time
that Git would spend trying to find good packfile chains with larger,
non-textual objects. I haven't done any hard benchmarking work on this,
so it may be a moot point.

> In other words, using a rolling hash to decide where to split a blob
> and use a tree-like structure so that (1) common portions between
> files can be deduplicated and (2) portions can be hashed in parallel.

I think that this is worth discussing further. Certainly, it would go a
good bit of the way to addressing the point that I responded to earlier
in this message.

Thanks,
Taylor


* RE: Git for games working group
  2018-09-17 13:55         ` Taylor Blau
@ 2018-09-17 14:01           ` Randall S. Becker
  2018-09-17 15:00           ` Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 35+ messages in thread
From: Randall S. Becker @ 2018-09-17 14:01 UTC (permalink / raw)
  To: 'Taylor Blau',
	'Ævar Arnfjörð Bjarmason'
  Cc: 'John Austin', git, sandals, larsxschneider,
	pastelmobilesuit, 'Joey Hess'

On September 17, 2018 9:55 AM Taylor Blau wrote:
> On Sun, Sep 16, 2018 at 04:55:13PM +0200, Ævar Arnfjörð Bjarmason wrote:
> > In the hypothetical git-annex-like case (simplifying a bit for the
> > purposes of this explanation), for every FILE in your tree you have a
> > corresponding FILE.lock file, but it's not a boolean, but a log of
> > who's asked for locks, i.e. lines of:
> >
> >     <repository UUID> <ts> <state> <who (email?)> <explanation?>
> >
> > E.g.:
> >
> >     $ cat Makefile.lock
> >     my-random-per-repo-id 2018-09-15 1 avarab@gmail.com "refactoring all Makefiles"
> >     my-random-per-repo-id 2018-09-16 0 avarab@gmail.com "done!"
> >
> > This log is append-only, when clients encounter conflicts there's a
> > merge driver to ensure that all updates are kept.
> 
> Certainly. I think that there are two things that aren't well expressed
> under this mechanism:
> 
>   1. Having a log of locks held against that (a) file doesn't prevent us
>      from introducing merge conflicts at the <file>.lock level, so we're
>      reliant upon the caller first running 'git pull' and hoping that no
>      one beats them out to locking and pushing their lock.
> 
>   2. Multi-file locks, e.g., "I need to lock file(s) X, Y, and Z
>      together." This isn't possible in Git LFS today with the existing "git
>      lfs lock" command (I had to check, but it takes only _one_ filename as
>      its argument).
> 
>      Perhaps it would be nice to support something like this someday in
>      Git LFS, but I think we would have to reimagine how this would look
>      in your file.lock scheme.

I have an interest in this particular scheme, so am looking at porting both
golang and git-lfs over to my platform (HPE-NonStop). The multi-file lock
problem can be addressed through a variety of cooperative schemes, and if I
get the port, I'm hoping to contribute something to solve it (that's a big
IF at this point in time) - there are known mutex patterns to solve this
AFAIR. My own community has a similar requirement, so I'm investigating.

Cheers,
Randall




* Re: Git for games working group
  2018-09-17 13:55         ` Taylor Blau
  2018-09-17 14:01           ` Randall S. Becker
@ 2018-09-17 15:00           ` Ævar Arnfjörð Bjarmason
  2018-09-17 15:57             ` Taylor Blau
  2018-09-17 16:47             ` Joey Hess
  1 sibling, 2 replies; 35+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-09-17 15:00 UTC (permalink / raw)
  To: Taylor Blau
  Cc: John Austin, git, sandals, larsxschneider, pastelmobilesuit,
	Joey Hess


On Mon, Sep 17 2018, Taylor Blau wrote:

> On Sun, Sep 16, 2018 at 04:55:13PM +0200, Ævar Arnfjörð Bjarmason wrote:
>> In the hypothetical git-annex-like case (simplifying a bit for the
>> purposes of this explanation), for every FILE in your tree you have a
>> corresponding FILE.lock file, but it's not a boolean, but a log of who's
>> asked for locks, i.e. lines of:
>>
>>     <repository UUID> <ts> <state> <who (email?)> <explanation?>
>>
>> E.g.:
>>
>>     $ cat Makefile.lock
>>     my-random-per-repo-id 2018-09-15 1 avarab@gmail.com "refactoring all Makefiles"
>>     my-random-per-repo-id 2018-09-16 0 avarab@gmail.com "done!"
>>
>> This log is append-only, when clients encounter conflicts there's a
>> merge driver to ensure that all updates are kept.
>
> Certainly. I think that there are two things that aren't well expressed
> under this mechanism:
>
>   1. Having a log of locks held against that (a) file doesn't prevent us
>      from introducing merge conflicts at the <file>.lock level, so we're
>      reliant upon the caller first running 'git pull' and hoping that no
>      one beats them out to locking and pushing their lock.

I was eliding a lot of details about how git-annex works under the
hood.

In reality under git-annex it's not a Makefile.lock file, but there's a
dedicated branch (called "git-annex") that stores this sort of metadata,
i.e. who has copies of the "Makefile" file. That branch has
dedicated merge drivers for the files it manages, so you never get into
these sorts of conflicts.

But yeah, the ad-hoc example I mentioned of:

    echo We created a lock for this >Makefile.lock

*Would* conflict if two users picked a different string, so in practice
you'd need something standard there, i.e. everyone would just echo
"magic git-annex lock" to the file & track it, so even if they did that
same action in parallel it wouldn't conflict.

There's surely other aspects of that square peg of large file tracking
not fitting the round hole of file locking, the point of my write-up was
not that *that* solution is perfect, but there's prior art here that's
very easily adopted to distributed locking if someone wanted to scratch
that itch, since the notion of keeping a log of who has/hasn't gotten a
file is very similar to a log of who has/hasn't locked some file(s) in
the tree.
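
To make the append-only log idea concrete, here is a rough sketch under
stated assumptions. It is not how git-annex's merge drivers are
implemented: merge_logs() and holders() are hypothetical helpers, the
line format is the "<repository UUID> <ts> <state> <who> <explanation>"
one from the earlier mail, and "repo-a", "repo-b", and the example.com
addresses are made up. The merge keeps every line from both sides, and
the current holders are derived by replaying the log.

```python
def merge_logs(ours, theirs):
    """Union merge: keep every line from both sides once, ordered by timestamp."""
    lines = {l for l in ours.splitlines() + theirs.splitlines() if l.strip()}
    ordered = sorted(lines, key=lambda l: (l.split()[1], l))
    return "\n".join(ordered) + "\n"


def holders(log):
    """Map repository id -> who, for repos whose latest entry has state 1."""
    entries = [l.split(maxsplit=4) for l in log.splitlines() if l.strip()]
    latest = {}
    for entry in sorted(entries, key=lambda e: e[1]):  # replay oldest first
        repo, _ts, state, who = entry[:4]
        latest[repo] = (state, who)
    return {repo: who for repo, (state, who) in latest.items() if state == "1"}


# One side released its lock while the other side took a new one.
ours = (
    'repo-a 2018-09-15 1 avarab@gmail.com "refactoring all Makefiles"\n'
    'repo-a 2018-09-16 0 avarab@gmail.com "done!"\n'
)
theirs = (
    'repo-a 2018-09-15 1 avarab@gmail.com "refactoring all Makefiles"\n'
    'repo-b 2018-09-16 1 bob@example.com "fixing the build"\n'
)
merged = merge_logs(ours, theirs)

# Two repositories asking for the lock before either has synced.
race = merge_logs(
    'repo-a 2018-09-17 1 alice@example.com "remix"\n',
    'repo-b 2018-09-17 1 bob@example.com "remix"\n',
)
```

The merge never conflicts, but in the second case holders(race) reports
two holders at once: the log records that both asked, it does not stop
either of them. A policy on top (first timestamp wins, ignore stale
entries, and so on) would have to decide what that means.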

>   2. Multi-file locks, e.g., "I need to lock file(s) X, Y, and Z
>      together." This isn't possible in Git LFS today with the existing "git
>      lfs lock" command (I had to check, but it takes only _one_ filename as
>      its argument).
>
>      Perhaps it would be nice to support something like this someday in
>      Git LFS, but I think we would have to reimagine how this would look
>      in your file.lock scheme.

If you can do it for 1 file you can do it for N with a for-loop, no? So
is this just a general UI issue in git-annex where some commands don't
take lists of filenames (or git pathspecs) to operate on, or a more
general issue with locking?


* Re: Git for games working group
  2018-09-17 15:00           ` Ævar Arnfjörð Bjarmason
@ 2018-09-17 15:57             ` Taylor Blau
  2018-09-17 16:21               ` Randall S. Becker
  2018-09-17 16:47             ` Joey Hess
  1 sibling, 1 reply; 35+ messages in thread
From: Taylor Blau @ 2018-09-17 15:57 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Taylor Blau, John Austin, git, sandals, larsxschneider,
	pastelmobilesuit, Joey Hess

On Mon, Sep 17, 2018 at 05:00:10PM +0200, Ævar Arnfjörð Bjarmason wrote:
> >   2. Multi-file locks, e.g., "I need to lock file(s) X, Y, and Z
> >      together." This isn't possible in Git LFS today with the existing "git
> >      lfs lock" command (I had to check, but it takes only _one_ filename as
> >      its argument).
> >
> >      Perhaps it would be nice to support something like this someday in
> >      Git LFS, but I think we would have to reimagine how this would look
> >      in your file.lock scheme.
>
> If you can do it for 1 file you can do it for N with a for-loop, no? So
> is this just a general UI issue in git-annex where some commands don't
> take lists of filenames (or git pathspecs) to operate on, or a more
> general issue with locking?

I think that it's more general.

I envision a scenario where between iterations of the for-loop, another
client acquires a lock later on in the list. I think that the general
problem here is that there is no transactional way to express "please
give me all N of these locks".

Thanks,
Taylor


* Re: Git for games working group
  2018-09-17 13:58           ` Taylor Blau
@ 2018-09-17 15:58             ` Jonathan Nieder
  2018-10-03 12:28               ` Thomas Braun
  0 siblings, 1 reply; 35+ messages in thread
From: Jonathan Nieder @ 2018-09-17 15:58 UTC (permalink / raw)
  To: Taylor Blau; +Cc: John Austin, Ævar Arnfjörð Bjarmason, git

Taylor Blau wrote:
> On Sun, Sep 16, 2018 at 03:05:48PM -0700, Jonathan Nieder wrote:
> > On Sun, Sep 16, 2018 at 11:17:27AM -0700, John Austin wrote:
> > > Taylor Blau wrote:

>>>> Right, though this still subjects the remote copy to all of the
>>>> difficulty of packing large objects (though Christian's work to support
>>>> other object database implementations would go a long way to help this).
>>>
>>> Ah, interesting -- I didn't realize this step was part of the
>>> bottleneck. I presumed git didn't do much more than perhaps gzip'ing
>>> binary files when it packed them up. Or do you mean the growing cost
>>> of storing the objects locally as you work? Perhaps that could be
>>> solved by allowing the client more control (ie. delete the oldest
>>> blobs that exist on the server).
>>
>> John, I believe you are correct.  Taylor, can you elaborate about what
>> packing overhead you are referring to?
>
> Jonathan, you are right. I was also referring to the increased time
> that Git would spend trying to find good packfile chains with larger,
> non-textual objects. I haven't done any hard benchmarking work on this,
> so it may be a moot point.

Ah, thanks.  See git-config(1):

	core.bigFileThreshold
		Files larger than this size are stored deflated,
		without attempting delta compression.

		Default is 512 MiB on all platforms.

If that's failing on your machine then it would be a bug, so we'd
definitely want to know.

Jonathan


* RE: Git for games working group
  2018-09-17 15:57             ` Taylor Blau
@ 2018-09-17 16:21               ` Randall S. Becker
  0 siblings, 0 replies; 35+ messages in thread
From: Randall S. Becker @ 2018-09-17 16:21 UTC (permalink / raw)
  To: 'Taylor Blau',
	'Ævar Arnfjörð Bjarmason'
  Cc: 'John Austin', git, sandals, larsxschneider,
	pastelmobilesuit, 'Joey Hess'

On September 17, 2018 11:58 AM, Taylor Blau wrote:
> On Mon, Sep 17, 2018 at 05:00:10PM +0200, Ævar Arnfjörð Bjarmason
> wrote:
> > >   2. Multi-file locks, e.g., "I need to lock file(s) X, Y, and Z
> > >      together." This isn't possible in Git LFS today with the existing "git
> > >      lfs lock" command (I had to check, but it takes only _one_ filename as
> > >      its argument).
> > >
> > >      Perhaps it would be nice to support something like this someday in
> > >      Git LFS, but I think we would have to reimagine how this would look
> > >      in your file.lock scheme.
> >
> > If you can do it for 1 file you can do it for N with a for-loop, no?
> > So is this just a general UI issue in git-annex where some commands
> > don't take lists of filenames (or git pathspecs) to operate on, or a
> > more general issue with locking?
> 
> I think that it's more general.
> 
> I envision a scenario where between iterations of the for-loop, another
> client acquires a lock later on in the list. I think that the general
> problem here is that there is no transactional way to express "please
> give me all N of these locks".

A composite mutex is better: construct a long name such as X+Y+Z.lock and
obtain that lock first, then attempt all of the locks X.lock, Y.lock, Z.lock,
and if any fail, free up what you did. Otherwise you run into a potential
mutex conflict if someone attempts the locks in a different order. Not
perfect, but it prevents two from going after the same set of resources, if
that set is common. Another pattern is to have a very temporary dir.lock
that is active while locks are being grabbed within a subtree, then released
when all locks are acquired or fail (so very short time). This second
pattern should generally work no matter what combination of locks are
required, although it single-threads lock acquisition - which is probably a
good thing functionally, but slower operationally.
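
The "attempt all locks, and if any fail, free up what you did" pattern
can be sketched roughly as follows. This is a sketch under stated
assumptions, not Git LFS code: LockTable is a hypothetical in-memory
stand-in for whatever service grants single-file locks atomically, and
lock_all() is an invented helper. Taking the locks in sorted order is
one way to sidestep the different-order problem.

```python
class LockTable:
    """Hypothetical stand-in for a service that grants one lock at a time."""

    def __init__(self):
        self._owners = {}

    def try_lock(self, path, owner):
        """Grant the lock only if nobody holds it; never blocks."""
        if path in self._owners:
            return False
        self._owners[path] = owner
        return True

    def unlock(self, path, owner):
        """Release the lock, but only on behalf of its current owner."""
        if self._owners.get(path) == owner:
            del self._owners[path]

    def owner_of(self, path):
        return self._owners.get(path)


def lock_all(table, paths, owner):
    """Take every lock or none: roll back on the first failure."""
    acquired = []
    for path in sorted(set(paths)):  # same order for every client
        if table.try_lock(path, owner):
            acquired.append(path)
        else:
            for held in reversed(acquired):
                table.unlock(held, owner)  # free up what we did
            return False
    return True


table = LockTable()
table.try_lock("Y.wav", "bob")

# Alice wants X, Y and Z together, but Bob already holds Y.
first_try = lock_all(table, ["Z.wav", "X.wav", "Y.wav"], "alice")
x_owner_after_failure = table.owner_of("X.wav")

# Once Bob releases Y, the same request goes through.
table.unlock("Y.wav", "bob")
second_try = lock_all(table, ["Z.wav", "X.wav", "Y.wav"], "alice")
```

The failed attempt leaves no partial locks behind, so other clients are
only blocked for as long as the attempt itself takes. It is still not a
transaction in the server: another client can observe X.wav as briefly
locked during the failed attempt.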

Cheers,
Randall



* Re: Git for games working group
  2018-09-17 15:00           ` Ævar Arnfjörð Bjarmason
  2018-09-17 15:57             ` Taylor Blau
@ 2018-09-17 16:47             ` Joey Hess
  2018-09-17 17:23               ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 35+ messages in thread
From: Joey Hess @ 2018-09-17 16:47 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Taylor Blau, John Austin, git, sandals, larsxschneider,
	pastelmobilesuit


Ævar Arnfjörð Bjarmason wrote:
> There's surely other aspects of that square peg of large file tracking
> not fitting the round hole of file locking, the point of my write-up was
> not that *that* solution is perfect, but there's prior art here that's
> very easily adopted to distributed locking if someone wanted to scratch
> that itch, since the notion of keeping a log of who has/hasn't gotten a
> file is very similar to a log of who has/hasn't locked some file(s) in
> the tree.

Actually they are fundamentally very different. git-annex's tracking of
locations of files is eventually consistent, which of course means that
at any given point in time it may be currently inconsistent. That is
fine for tracking locations of files, but not for locking.

When git-annex needs to do an operation that relies on someone else's
copy of a file actually being present, it uses real locking. That
locking is not centralized, instead it relies on the connections between
git repositories. That turns out to be sufficient for git-annex's own
locking needs, but it would not be sufficient to avoid file edit
conflict problems in eg a split brain situation.

-- 
see shy jo



* Re: Git for games working group
  2018-09-17 16:47             ` Joey Hess
@ 2018-09-17 17:23               ` Ævar Arnfjörð Bjarmason
  2018-09-23 17:28                 ` John Austin
  0 siblings, 1 reply; 35+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-09-17 17:23 UTC (permalink / raw)
  To: Joey Hess
  Cc: Taylor Blau, John Austin, git, sandals, larsxschneider,
	pastelmobilesuit


On Mon, Sep 17 2018, Joey Hess wrote:

> Ævar Arnfjörð Bjarmason wrote:
>> There's surely other aspects of that square peg of large file tracking
>> not fitting the round hole of file locking, the point of my write-up was
>> not that *that* solution is perfect, but there's prior art here that's
>> very easily adopted to distributed locking if someone wanted to scratch
>> that itch, since the notion of keeping a log of who has/hasn't gotten a
>> file is very similar to a log of who has/hasn't locked some file(s) in
>> the tree.
>
> Actually they are fundamentally very different. git-annex's tracking of
> locations of files is eventually consistent, which of course means that
> at any given point in time it may be currently inconsistent. That is
> fine for tracking locations of files, but not for locking.
>
> When git-annex needs to do an operation that relies on someone else's
> copy of a file actually being present, it uses real locking. That
> locking is not centralized, instead it relies on the connections between
> git repositories. That turns out to be sufficient for git-annex's own
> locking needs, but it would not be sufficient to avoid file edit
> conflict problems in eg a split brain situation.

Right, all of that's true. I forgot to explicitly say what I meant by
"locking" in this context. Clearly it's not suitable for something like
actual file locking (in the sense of flock() et al), but rather just
advisory locking in the loosest sense of the word, i.e. some git-ish way
of someone writing on the office whiteboard "unless you're Bob, don't
touch main.c today Tuesday Sep 17th, he's hacking on it".

So just a way to have some eventually consistent side channel to pass
such a message through git. Something similar to what git-annex does
with its "git-annex" branch would work for that, as long as everyone who
wanted to get such messages ran some equivalent of "git annex sync" in a
timely manner (or checked the office whiteboard every day...).

Such a scheme is never going to be 100% reliable even in centralized
source control systems, e.g. even with cvs/perforce you might pull the
latest changes, then go on a plane and edit the locked main.c. Then the
lock has "failed" in the sense of "the message didn't get there in time,
and two people who could have just picked different areas to work on
made conflicting edits".

As noted upthread this isn't my use-case, I just wanted to point out the
git-annex method of distributing metadata as a bolt-on to git as
interesting prior art. If someone wants "truly distributed, but with
file locking like cvs/perforce" something like what git-annex is doing
would probably work for them.


* Re: Git for games working group
  2018-09-17 17:23               ` Ævar Arnfjörð Bjarmason
@ 2018-09-23 17:28                 ` John Austin
  2018-09-23 17:56                   ` Randall S. Becker
  0 siblings, 1 reply; 35+ messages in thread
From: John Austin @ 2018-09-23 17:28 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: id, Taylor Blau, git, brian m. carlson, Lars Schneider,
	pastelmobilesuit

I've been putting together a prototype file-locking implementation for
a system that plays better with git. What are everyone's thoughts on
something like the following? I'm tentatively labeling this system
git-sync or sync-server. There are two pieces:

1. A centralized repository called the Global Graph that contains the
union of the git commit graphs of all local developer repos. When Developer A
makes a local commit on branch 'feature', git-sync will automatically
push that new commit up to the global server, under a name-spaced
branch: 'developera_repoabcdef/feature'. This can be done silently as
a force push, and shouldn't ever interrupt the developer's workflow.
Simple http queries can be made to the Global Graph, such as "Which
commits descend from commit abcdefgh?"

2. A client-side tool that queries the Global Graph to determine when
your current changes are in conflict with another developer. It might
ask "Are there any commits I don't have locally that modify
lockable_file.bin?". This could either be on pre-commit, or for more
security, be part of a read-only marking system a la Git LFS. There
wouldn't be any "lock" per se; rather, the client could refuse to
modify a file if it found other commits for that file in the global
graph.

The key here is the separation of concerns. The Global Graph is fairly
dimwitted -- it doesn't know anything about file locking. But it
provides a layer of information from which we can implement file
locking on the client side (or perhaps other interesting systems).
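
As a rough illustration of the two queries above, here is a sketch that
assumes the Global Graph can be viewed as a mapping from commit id to
its parents and the paths it modifies. Commit, ancestors(),
descendants(), and unseen_changes() are hypothetical names, the commit
ids are made up, and a real server would answer these over HTTP rather
than from an in-memory dict.

```python
from collections import namedtuple

# Hypothetical record for one commit as seen by the Global Graph.
Commit = namedtuple("Commit", ["parents", "files"])


def ancestors(graph, tip):
    """Return every commit reachable from tip, including tip itself."""
    seen, todo = set(), [tip]
    while todo:
        oid = todo.pop()
        if oid not in seen:
            seen.add(oid)
            todo.extend(graph[oid].parents)
    return seen


def descendants(graph, oid):
    """Answer "which commits descend from oid?" by brute force."""
    return {c for c in graph if c != oid and oid in ancestors(graph, c)}


def unseen_changes(graph, local_tip, path):
    """Commits missing from the local history that modify path."""
    have = ancestors(graph, local_tip)
    return {c for c in graph if c not in have and path in graph[c].files}


graph = {
    "a1": Commit(parents=[], files={"lockable_file.bin"}),
    "b2": Commit(parents=["a1"], files={"code.c"}),             # developer A
    "c3": Commit(parents=["a1"], files={"lockable_file.bin"}),  # developer B
}

# Developer A, sitting on b2, asks before touching the binary file.
blocked_by = unseen_changes(graph, "b2", "lockable_file.bin")
```

A non-empty result is what the client-side tool would treat as "someone
else has changed this file", either refusing the commit or keeping the
file read-only. The server itself only stores and walks the graph.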

Thoughts?

On Mon, Sep 17, 2018 at 10:23 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
>
> On Mon, Sep 17 2018, Joey Hess wrote:
>
> > Ævar Arnfjörð Bjarmason wrote:
> >> There's surely other aspects of that square peg of large file tracking
> >> not fitting the round hole of file locking, the point of my write-up was
> >> not that *that* solution is perfect, but there's prior art here that's
> >> very easily adopted to distributed locking if someone wanted to scratch
> >> that itch, since the notion of keeping a log of who has/hasn't gotten a
> >> file is very similar to a log of who has/hasn't locked some file(s) in
> >> the tree.
> >
> > Actually they are fundamentally very different. git-annex's tracking of
> > locations of files is eventually consistent, which of course means that
> > at any given point in time it may be currently inconsistent. That is
> > fine for tracking locations of files, but not for locking.
> >
> > When git-annex needs to do an operation that relies on someone else's
> > copy of a file actually being present, it uses real locking. That
> > locking is not centralized, instead it relies on the connections between
> > git repositories. That turns out to be sufficient for git-annex's own
> > locking needs, but it would not be sufficient to avoid file edit
> > conflict problems in eg a split brain situation.
>
> Right, all of that's true. I forgot to explicitly say what I meant by
> "locking" in this context. Clearly it's not suitable for something like
> actual file locking (in the sense of flock() et al), but rather just
> advisory locking in the loosest sense of the word, i.e. some git-ish way
> of someone writing on the office whiteboard "unless you're Bob, don't
> touch main.c today Tuesday Sep 17th, he's hacking on it".
>
> So just a way to have some eventually consistent side channel to pass
> such a message through git. Something similar to what git-annex does
> with its "git-annex" branch would work for that, as long as everyone who
> wanted to get such messages ran some equivalent of "git annex sync" in a
> timely manner (or checked the office whiteboard every day...).
>
> Such a scheme is never going to be 100% reliable even in centralized
> source control systems, e.g. even with cvs/perforce you might pull the
> latest changes, then go on a plane and edit the locked main.c. Then the
> lock has "failed" in the sense of "the message didn't get there in time,
> and two people who could have just picked different areas to work on
> made conflicting edits".
>
> As noted upthread this isn't my use-case, I just wanted to point to the
> git-annex method of distributing metadata as a bolt-on to git as
> interesting prior art. If someone wants "truly distributed, but with
> file locking like cvs/perforce" something like what git-annex is doing
> would probably work for them.
>



* RE: Git for games working group
  2018-09-23 17:28                 ` John Austin
@ 2018-09-23 17:56                   ` Randall S. Becker
  2018-09-23 19:53                     ` John Austin
  2018-09-24 13:59                     ` Taylor Blau
  0 siblings, 2 replies; 35+ messages in thread
From: Randall S. Becker @ 2018-09-23 17:56 UTC (permalink / raw)
  To: 'John Austin',
	'Ævar Arnfjörð Bjarmason'
  Cc: id, 'Taylor Blau', git, 'brian m. carlson',
	'Lars Schneider', pastelmobilesuit

On September 23, 2018 1:29 PM, John Austin wrote:
> I've been putting together a prototype file-locking implementation for a
> system that plays better with git. What are everyone's thoughts on
> something like the following? I'm tentatively labeling this system git-sync or
> sync-server. There are two pieces:
> 
> 1. A centralized repository called the Global Graph that contains the union git
> commit graph for local developer repos. When Developer A makes a local
> commit on branch 'feature', git-sync will automatically push that new commit
> up to the global server, under a name-spaced
> branch: 'developera_repoabcdef/feature'. This can be done silently as a
> force push, and shouldn't ever interrupt the developer's workflow.
> Simple http queries can be made to the Global Graph, such as "Which
> commits descend from commit abcdefgh?"
> 
> 2. A client-side tool that queries the Global Graph to determine when your
> current changes are in conflict with another developer. It might ask "Are
> there any commits I don't have locally that modify lockable_file.bin?". This
> could either be on pre-commit, or for more security, be part of a read-only
> marking system ala Git LFS. There wouldn't be any "lock" per se; rather, the
> client could refuse to modify a file if it found other commits for that file in the
> global graph.
> 
> The key here is the separation of concerns. The Global Graph is fairly
> dimwitted -- it doesn't know anything about file locking. But it provides a
> layer of information from which we can implement file locking on the client
> side (or perhaps other interesting systems).
> 
> Thoughts?

I'm encouraged by where this is going. I might suggest that "sync" is the wrong name here, with "mutex" being slightly better. I would even like to help with your effort, and I have non-unixy platforms I'd like to do this on.

Having this separate from git LFS is an even better idea IMO, and I would suggest implementing this using the same set of build tools that git uses so that it is broadly portable, unlike git LFS. Glad to help there too.

I would suggest that a higher-level grouping mechanism of resource groups might be helpful - as in "I need this directory" rather than "I need this file". Better still, I could see "I need all objects in this commit-ish", which would allow a revert operation to succeed or fail atomically while adhering to a lock requirement.
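
As a rough sketch of the directory-level grouping, assuming scopes are
expressed as repository-relative POSIX paths (the function names here
are hypothetical):

```python
from pathlib import PurePosixPath


def covers(scope, path):
    """True if `path` is the scope itself or lies beneath the scope directory."""
    scope_p = PurePosixPath(scope)
    path_p = PurePosixPath(path)
    return path_p == scope_p or scope_p in path_p.parents


def blocked_paths(scopes, changed_paths):
    """Changed paths that fall under any requested scope."""
    return [p for p in changed_paths if any(covers(s, p) for s in scopes)]
```

The "all objects in this commit-ish" variant would build the scope list
from the paths touched by that commit rather than from a directory.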

One bit that traditional lock-brokering systems implement involves forcing security attribute changes: an unlocked file is stored as chmod a-w to prevent accidental modification of lockables, and that is changed to chmod ?+w when a lock is acquired. It's not perfect, but it does catch a lot of errors.
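
A small sketch of that permission toggle, assuming POSIX-style mode
bits and taking the "?" in "?+w" to mean the file owner (an assumption;
other systems may grant group write instead). The function names are
hypothetical.

```python
import os
import stat

WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def make_read_only(path):
    """Equivalent of `chmod a-w`: clear every write bit on an unlocked file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode & ~WRITE_BITS)


def make_owner_writable(path):
    """Equivalent of `chmod u+w`: restore owner write once a lock is held."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode | stat.S_IWUSR)
```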

Cheers,
Randall




* Re: Git for games working group
  2018-09-23 17:56                   ` Randall S. Becker
@ 2018-09-23 19:53                     ` John Austin
  2018-09-23 19:55                       ` John Austin
                                         ` (2 more replies)
  2018-09-24 13:59                     ` Taylor Blau
  1 sibling, 3 replies; 35+ messages in thread
From: John Austin @ 2018-09-23 19:53 UTC (permalink / raw)
  To: Randall Becker
  Cc: Ævar Arnfjörð Bjarmason, id, Taylor Blau, git,
	brian m. carlson, Lars Schneider, pastelmobilesuit

On Sun, Sep 23, 2018 at 10:57 AM Randall S. Becker
<rsbecker@nexbridge.com> wrote:
>  I would even like to help with your effort and have non-unixy platforms I'd like to do this on.
> Having this separate from git LFS is an even better idea IMO, and I would suggest implementing this using the same set of build tools that git uses so that it is broadly portable, unlike git LFS. Glad to help there too.

Great to hear -- once the code is in a bit better shape I can open it
up on github. Cross platform is definitely one of my focuses. I'm
currently implementing in Rust because it targets the same space as C
and has great, near trivial, cross-platform support. What sorts of
platforms are you interested in? Windows is my first target because
that's where many game developers live.

> I would suggest that a higher-level grouping mechanism of resource groups might be helpful - as in "I need this directory" rather than "I need this file". Better still, I could see "I need all objects in this commit-ish", which would allow a revert operation to succeed or fail atomically while adhering to a lock requirement.
> One bit that traditional lock-brokering systems implement involves forcing security attribute changes: an unlocked file is stored as chmod a-w to prevent accidental modification of lockables, and that is changed to chmod ?+w when a lock is acquired. It's not perfect, but it does catch a lot of errors.

Agreed -- I think this is all up to how the query endpoint and client
is designed. A couple of different types of clients could be
implemented, depending on the policies you want in place. One could
have strict security that stored unlocked files with a-w, as
mentioned. Another could be a weaker client, and simply warn
developers when their current branch is in conflict.



* Re: Git for games working group
  2018-09-23 19:53                     ` John Austin
@ 2018-09-23 19:55                       ` John Austin
  2018-09-23 20:43                       ` Randall S. Becker
  2018-09-24 14:01                       ` Taylor Blau
  2 siblings, 0 replies; 35+ messages in thread
From: John Austin @ 2018-09-23 19:55 UTC (permalink / raw)
  To: Randall Becker
  Cc: Ævar Arnfjörð Bjarmason, id, Taylor Blau, git,
	brian m. carlson, Lars Schneider, pastelmobilesuit

Regarding integration into LFS, I'd like to build the library in such
a way that it would be easy to bundle with LFS (so they could share the
same git hooks), but also flexible enough to work for other
workflows.



* RE: Git for games working group
  2018-09-23 19:53                     ` John Austin
  2018-09-23 19:55                       ` John Austin
@ 2018-09-23 20:43                       ` Randall S. Becker
  2018-09-24 14:01                       ` Taylor Blau
  2 siblings, 0 replies; 35+ messages in thread
From: Randall S. Becker @ 2018-09-23 20:43 UTC (permalink / raw)
  To: 'John Austin'
  Cc: 'Ævar Arnfjörð Bjarmason', id,
	'Taylor Blau', git, 'brian m. carlson',
	'Lars Schneider', pastelmobilesuit

On September 23, 2018 3:54 PM, John Austin wrote:
> On Sun, Sep 23, 2018 at 10:57 AM Randall S. Becker
> <rsbecker@nexbridge.com> wrote:
> >  I would even like to help with your effort and have non-unixy platforms I'd
> like to do this on.
> > Having this separate from git LFS is an even better idea IMO, and I would
> suggest implementing this using the same set of build tools that git uses so
> that it is broadly portable, unlike git LFS. Glad to help there too.
> 
> Great to hear -- once the code is in a bit better shape I can open it up on
> github. Cross platform is definitely one of my focuses. I'm currently
> implementing in Rust because it targets the same space as C and has great,
> near trivial, cross-platform support. What sorts of platforms are you
> interested in? Windows is my first target because that's where many game
> developers live.

I have looked at porting Rust to my two mid-to-large platforms, which do not have a Rust port. I would prefer keeping within what git currently requires without adding dependencies, but I'd be happy to take a Rust prototype and translate it. My need is actually not for games, but for processes similar to those that game developers use. The following dependencies are not available on the two platforms I have in mind: g++ or clang, and cmake (despite efforts by people on the platform to do ports). This puts me in a difficult spot with Rust. I understand you might want to use Rust's implied threading, so I would be willing to do the pthread work to make it happen in C.

> > I would suggest that a higher-level grouping mechanism of resource groups
> might be helpful - as in "I need this directory" rather than "I need this file".
> Better still, I could see "I need all objects in this commit-ish", which would
> allow a revert operation to succeed or fail atomically while adhering to a lock
> requirement.
> > One bit that traditional lock-brokering systems implement involves forcing
> security attribute changes: an unlocked file is stored as chmod a-w to
> prevent accidental modification of lockables, and that is changed to chmod
> ?+w when a lock is acquired. It's not perfect, but it does catch a lot of errors.
> 
> Agreed -- I think this is all up to how the query endpoint and client is
> designed. A couple of different types of clients could be implemented,
> depending on the policies you want in place. One could have strict security
> that stored unlocked files with a-w, as mentioned. Another could be a
> weaker client, and simply warn developers when their current branch is in
> conflict.

Regards,
Randall



* Re: Git for games working group
  2018-09-23 17:56                   ` Randall S. Becker
  2018-09-23 19:53                     ` John Austin
@ 2018-09-24 13:59                     ` Taylor Blau
  1 sibling, 0 replies; 35+ messages in thread
From: Taylor Blau @ 2018-09-24 13:59 UTC (permalink / raw)
  To: Randall S. Becker
  Cc: 'John Austin',
	'Ævar Arnfjörð Bjarmason', id,
	'Taylor Blau', git, 'brian m. carlson',
	'Lars Schneider', pastelmobilesuit

On Sun, Sep 23, 2018 at 01:56:37PM -0400, Randall S. Becker wrote:
> On September 23, 2018 1:29 PM, John Austin wrote:
> > I've been putting together a prototype file-locking implementation for a
> > system that plays better with git. What are everyone's thoughts on
> > something like the following? I'm tentatively labeling this system git-sync or
> > sync-server. There are two pieces:
> >
> > 1. A centralized repository called the Global Graph that contains the union git
> > commit graph for local developer repos. When Developer A makes a local
> > commit on branch 'feature', git-sync will automatically push that new commit
> > up to the global server, under a name-spaced
> > branch: 'developera_repoabcdef/feature'. This can be done silently as a
> > force push, and shouldn't ever interrupt the developer's workflow.
> > Simple http queries can be made to the Global Graph, such as "Which
> > commits descend from commit abcdefgh?"
> >
> > 2. A client-side tool that queries the Global Graph to determine when your
> > current changes are in conflict with another developer. It might ask "Are
> > there any commits I don't have locally that modify lockable_file.bin?". This
> > could either be on pre-commit, or for more security, be part of a read-only
> > marking system ala Git LFS. There wouldn't be any "lock" per se; rather, the
> > client could refuse to modify a file if it found other commits for that file in the
> > global graph.
> >
> > The key here is the separation of concerns. The Global Graph is fairly
> > dimwitted -- it doesn't know anything about file locking. But it provides a
> > layer of information from which we can implement file locking on the client
> > side (or perhaps other interesting systems).
> >
> > Thoughts?
>
> I'm encouraged by where this is going. I might suggest "sync" is the
> wrong name here, with "mutex" being slightly better - I would even
> like to help with your effort and have non-unixy platforms I'd like to
> do this on.
>
> Having this separate from git LFS is an even better idea IMO, and I
> would suggest implementing this using the same set of build tools that
> git uses so that it is broadly portable, unlike git LFS. Glad to help
> there too.

I think that this is the way that we would prefer it, too. Ideally users
outside of those who have Git LFS installed or those that are regular
users of it should be able to interoperate with those using the global
graph.

We're thinking a lot about what should go into the next major version of
Git LFS, v3.0.0, and this seems a good candidate to me. We'd also want
to figure out how to transition v2.0.0-era locks into the new global
graph, but that seems a topic for a later discussion.

Thanks,
Taylor


* Re: Git for games working group
  2018-09-23 19:53                     ` John Austin
  2018-09-23 19:55                       ` John Austin
  2018-09-23 20:43                       ` Randall S. Becker
@ 2018-09-24 14:01                       ` Taylor Blau
  2018-09-24 15:34                         ` John Austin
  2 siblings, 1 reply; 35+ messages in thread
From: Taylor Blau @ 2018-09-24 14:01 UTC (permalink / raw)
  To: John Austin
  Cc: Randall Becker, Ævar Arnfjörð Bjarmason, id,
	Taylor Blau, git, brian m. carlson, Lars Schneider,
	pastelmobilesuit

On Sun, Sep 23, 2018 at 12:53:58PM -0700, John Austin wrote:
> On Sun, Sep 23, 2018 at 10:57 AM Randall S. Becker
> <rsbecker@nexbridge.com> wrote:
> >  I would even like to help with your effort and have non-unixy platforms I'd like to do this on.
> > Having this separate from git LFS is an even better idea IMO, and I would suggest implementing this using the same set of build tools that git uses so that it is broadly portable, unlike git LFS. Glad to help there too.
>
> Great to hear -- once the code is in a bit better shape I can open it
> up on github. Cross platform is definitely one of my focuses. I'm
> currently implementing in Rust because it targets the same space as C
> and has great, near trivial, cross-platform support. What sorts of
> platforms are you interested in? Windows is my first target because
> that's where many game developers live.

This would likely mean that Git LFS will have to reimplement it, since
we strictly avoid using CGo (Go's mechanism to issue function calls to
other languages).

The upshot is that it likely shouldn't be too much effort for anybody,
and the open-source community would get a Go implementation of the API,
too.

Thanks,
Taylor


* Re: Git for games working group
  2018-09-24 14:01                       ` Taylor Blau
@ 2018-09-24 15:34                         ` John Austin
  2018-09-24 19:58                           ` Taylor Blau
  0 siblings, 1 reply; 35+ messages in thread
From: John Austin @ 2018-09-24 15:34 UTC (permalink / raw)
  To: Taylor Blau
  Cc: Randall Becker, Ævar Arnfjörð Bjarmason, id, git,
	brian m. carlson, Lars Schneider, pastelmobilesuit

Perhaps git-global-graph is a decent name. GGG? G3? :). The structure
right now in my head looks a bit like:

Global Graph:
     client - post-commit git hooks to push changes up to the GG
     git server - just the standard git server configuration
     query server - replies with information about the current state of the GG

Locks Pre-Commit:
     client - pre-commit hook that makes requests to the GG query server
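
A sketch of what the post-commit client might run, assuming the Global
Graph is configured as an ordinary git remote. The remote name and the
function names are hypothetical; the branch naming follows the
'developera_repoabcdef/feature' scheme described earlier in the thread.

```python
def namespaced_branch(developer, repo_id, branch):
    """Branch name used on the Global Graph for one developer's local branch."""
    return f"{developer}_{repo_id}/{branch}"


def push_command(remote, developer, repo_id, branch):
    """argv for a silent force push of `branch` to its name-spaced counterpart."""
    target = namespaced_branch(developer, repo_id, branch)
    return [
        "git", "push", "--force", "--quiet", remote,
        f"refs/heads/{branch}:refs/heads/{target}",
    ]
```

The hook itself would pass the returned list to something like
`subprocess.run`, ignoring failures so that a commit is never blocked
by an unreachable server.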

For cross-platform compatibility, the Global Graph client and the
Locks/Conflicts client are the pieces that need to be usable on all
platforms. My goal is to keep these pieces as simple as possible. I'd
like to at least start prototyping these in Rust, hopefully in a way
that can either be easily ported or easily re-implemented in C later
on, once things are feature-frozen.

For LFS, the main points of integration I see are:
    -- bundling of packages (optionally install this package with a
normal LFS installation)
    -- `git lfs locks` integration, i.e. integration with the read-only
control of LFS

If we push more of the functionality into the gg query server, the
integration with `lfs locks` could be simple enough to be a couple of
web requests. That might help avoid integration issues.

> we strictly avoid using CGo
What's the main reason for this? Build system complexity?



* Re: Git for games working group
  2018-09-24 15:34                         ` John Austin
@ 2018-09-24 19:58                           ` Taylor Blau
  2018-09-25  4:05                             ` John Austin
  0 siblings, 1 reply; 35+ messages in thread
From: Taylor Blau @ 2018-09-24 19:58 UTC (permalink / raw)
  To: John Austin
  Cc: Taylor Blau, Randall Becker,
	Ævar Arnfjörð Bjarmason, id, git, brian m. carlson,
	Lars Schneider, pastelmobilesuit

On Mon, Sep 24, 2018 at 08:34:44AM -0700, John Austin wrote:
> Perhaps git-global-graph is a decent name. GGG? G3? :). The structure
> right now in my head looks a bit like:
>
> Global Graph:
>      client - post-commit git hooks to push changes up to the GG

I'm replying to this part of the email to note that this would cause Git
LFS to have to do some extra work, since running 'git lfs install'
already writes to .git/hooks/post-commit (ironically, to detect and
unlock locks that we should have released).

I'm not immediately sure about how we'd resolve this, though I suspect
it would look like either of:

  - Git LFS knows how to install or _append_ hooks to a given location,
    should one already exist at that path on disk, or

  - git-global-graph knows how to accommodate Git LFS, and can include a
    line that calls 'git-lfs-post-commit(1)', perhaps via:

      $ git global-graph install --git-lfs=$(which git-lfs)

    or similar.
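
As a rough sketch of the first option (appending to an existing hook),
assuming the hook is a plain shell script. The function name and the
appended command line are hypothetical.

```python
import stat
from pathlib import Path


def append_hook_line(hook_path, line):
    """Append `line` to a shell hook unless it is already present.

    Creates the hook with a /bin/sh shebang if it does not exist yet.
    Returns True if the file was changed.
    """
    hook = Path(hook_path)
    if hook.exists():
        content = hook.read_text()
        if line in content.splitlines():
            return False  # already installed, nothing to do
        if not content.endswith("\n"):
            content += "\n"
        content += line + "\n"
    else:
        content = "#!/bin/sh\n" + line + "\n"
    hook.write_text(content)
    hook.chmod(hook.stat().st_mode | stat.S_IXUSR)
    return True
```

Note that appending only works if the existing script does not end in
`exit` or `exec`; a real installer would need to detect those cases.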

> For LFS, the main points of integration I see are:
>     -- bundling of packages (optionally install this package with a
> normal LFS installation)
>     -- `git lfs locks` integration, i.e. integration with the read-only
> control of LFS

Sounds sane to me.

> > we strictly avoid using CGo
>
> What's the main reason for this? Build system complexity?

A couple of reasons. CGO is widely considered to be (1) slow and (2)
unsafe. For our purposes, this would almost be OK, except that it makes
it impossible for me to build cross-platform binaries without the
correct compilers installed.

Today, I build Git LFS for every pair in {Windows, Darwin, Linux,
FreeBSD} x {386, amd64} by running 'make release', and using CGO would
not allow me to do that.

Transitioning from Go to CGO during each call is notoriously expensive,
and concedes many of the benefits that led us to choose Go in the
first place. (Although now that I write much more C than Go, I don't
think I would make the same argument today ;-).)

Thanks,
Taylor


* Re: Git for games working group
  2018-09-24 19:58                           ` Taylor Blau
@ 2018-09-25  4:05                             ` John Austin
  2018-09-25 20:14                               ` Taylor Blau
  0 siblings, 1 reply; 35+ messages in thread
From: John Austin @ 2018-09-25  4:05 UTC (permalink / raw)
  To: Taylor Blau
  Cc: Randall Becker, Ævar Arnfjörð Bjarmason, id, git,
	brian m. carlson, Lars Schneider, pastelmobilesuit

On Mon, Sep 24, 2018 at 12:58 PM Taylor Blau <me@ttaylorr.com> wrote:
> I'm replying to this part of the email to note that this would cause Git
> LFS to have to do some extra work, since running 'git lfs install'
> already writes to .git/hooks/post-commit (ironically, to detect and
> unlock locks that we should have released).

Right, that should have been another bullet point. The fact that there
can only be one git hook is.. frustrating.

Perhaps, if LFS has an option to bundle global-graph, LFS could merge
the hooks when installing?

If you instead install global-graph after LFS, I think it should
probably attempt something like:
  -- first move the existing hook to a folder: post-commit.d/
  -- install the global-graph hook to post-commit.d/
  -- install a new hook at post-commit that simply calls all
executables in post-commit.d/
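
The three steps above could look roughly like the following, assuming
hooks are shell scripts on a POSIX-like system. The function name, the
"00-existing" file name, and the dispatcher script are all
hypothetical.

```python
from pathlib import Path

# Dispatcher installed as the real hook; runs each executable in "<hook>.d".
DISPATCHER = """#!/bin/sh
for hook in "$0.d"/*; do
    if [ -x "$hook" ]; then
        "$hook" "$@"
    fi
done
"""


def _write_executable(path, body):
    path.write_text(body)
    path.chmod(path.stat().st_mode | 0o111)


def install_into_hook_dir(hooks_dir, hook_name, new_name, new_body):
    hooks = Path(hooks_dir)
    hook = hooks / hook_name
    hook_d = hooks / (hook_name + ".d")
    hook_d.mkdir(exist_ok=True)
    # Step 1: move an existing (non-dispatcher) hook into the .d folder.
    if hook.exists() and hook.read_text() != DISPATCHER:
        hook.rename(hook_d / "00-existing")
    # Step 2: install the new hook alongside it.
    _write_executable(hook_d / new_name, new_body)
    # Step 3: install the dispatcher at the original hook path.
    _write_executable(hook, DISPATCHER)
```

Hooks run in glob (alphabetical) order, so the "00-" prefix keeps the
previously installed hook first.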

Not sure if this is something that's been discussed, since I know LFS
has a similar issue with existing hooks, but might be sensible.



* Re: Git for games working group
  2018-09-25  4:05                             ` John Austin
@ 2018-09-25 20:14                               ` Taylor Blau
  0 siblings, 0 replies; 35+ messages in thread
From: Taylor Blau @ 2018-09-25 20:14 UTC (permalink / raw)
  To: John Austin
  Cc: Taylor Blau, Randall Becker,
	Ævar Arnfjörð Bjarmason, id, git, brian m. carlson,
	Lars Schneider, pastelmobilesuit

On Mon, Sep 24, 2018 at 09:05:56PM -0700, John Austin wrote:
> On Mon, Sep 24, 2018 at 12:58 PM Taylor Blau <me@ttaylorr.com> wrote:
> > I'm replying to this part of the email to note that this would cause Git
> > LFS to have to do some extra work, since running 'git lfs install'
> > already writes to .git/hooks/post-commit (ironically, to detect and
> > unlock locks that we should have released).
>
> Right, that should have been another bullet point. The fact that there
> can only be one git hook is.. frustrating.

Sure, I think one approach to dealing with this is to teach Git how to
handle multiple hooks for the same hook phase.

I don't know how likely this is in practice to be something that would
be acceptable, since it seems to involve much more work than either of
our tools learning about the other.

> Perhaps, if LFS has an option to bundle global-graph, LFS could merge
> the hooks when installing?

Right. I think that (in an ideal world) both tools would know about the
other, so that we don't have to worry about who installs what first.

> If you instead install global-graph after LFS, I think it should
> probably attempt something like:
>   -- first move the existing hook to a folder: post-commit.d/
>   -- install the global-graph hook to post-commit.d/
>   -- install a new hook at post-commit that simply calls all
> executables in post-commit.d/
>
> Not sure if this is something that's been discussed, since I know LFS
> has a similar issue with existing hooks, but might be sensible.

Yeah, I think that that would be fine, too.

Thanks,
Taylor


* Re: Git for games working group
  2018-09-17 15:58             ` Jonathan Nieder
@ 2018-10-03 12:28               ` Thomas Braun
  0 siblings, 0 replies; 35+ messages in thread
From: Thomas Braun @ 2018-10-03 12:28 UTC (permalink / raw)
  To: Jonathan Nieder, Taylor Blau
  Cc: John Austin, Ævar Arnfjörð Bjarmason, git

Am 17.09.2018 um 17:58 schrieb Jonathan Nieder:

[...]

> Ah, thanks.  See git-config(1):
> 
> 	core.bigFileThreshold
> 		Files larger than this size are stored deflated,
> 		without attempting delta compression.
> 
> 		Default is 512 MiB on all platforms.
> 

In addition to core.bigFileThreshold you can also unset the delta
attribute for file extensions you don't want to get delta compressed.
See "git help attributes". And while you are at it, mark the files as
binary so that git diff/log don't have to guess.
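
For illustration, a .gitattributes fragment along those lines might
look like this. The file patterns are examples only; pick the
extensions that apply to your project. `-delta` unsets the delta
attribute, and `binary` is the built-in macro for `-diff -merge -text`.

```text
# .gitattributes
*.psd    -delta binary
*.uasset -delta binary
```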



Thread overview: 35+ messages
2018-09-14 17:55 Git for games working group John Austin
2018-09-14 19:00 ` Taylor Blau
2018-09-14 21:09   ` John Austin
2018-09-15 16:40     ` Taylor Blau
2018-09-16 14:55       ` Ævar Arnfjörð Bjarmason
2018-09-16 20:49         ` John Austin
2018-09-17 13:55         ` Taylor Blau
2018-09-17 14:01           ` Randall S. Becker
2018-09-17 15:00           ` Ævar Arnfjörð Bjarmason
2018-09-17 15:57             ` Taylor Blau
2018-09-17 16:21               ` Randall S. Becker
2018-09-17 16:47             ` Joey Hess
2018-09-17 17:23               ` Ævar Arnfjörð Bjarmason
2018-09-23 17:28                 ` John Austin
2018-09-23 17:56                   ` Randall S. Becker
2018-09-23 19:53                     ` John Austin
2018-09-23 19:55                       ` John Austin
2018-09-23 20:43                       ` Randall S. Becker
2018-09-24 14:01                       ` Taylor Blau
2018-09-24 15:34                         ` John Austin
2018-09-24 19:58                           ` Taylor Blau
2018-09-25  4:05                             ` John Austin
2018-09-25 20:14                               ` Taylor Blau
2018-09-24 13:59                     ` Taylor Blau
2018-09-14 21:13   ` John Austin
2018-09-16  7:56     ` David Aguilar
2018-09-17 13:48       ` Taylor Blau
2018-09-14 21:21 ` Ævar Arnfjörð Bjarmason
2018-09-14 23:36   ` John Austin
2018-09-15 16:42     ` Taylor Blau
2018-09-16 18:17       ` John Austin
2018-09-16 22:05         ` Jonathan Nieder
2018-09-17 13:58           ` Taylor Blau
2018-09-17 15:58             ` Jonathan Nieder
2018-10-03 12:28               ` Thomas Braun
