* standalone library/tool to query commit-graph? @ 2019-05-22 18:49 Karl Ostmo 2019-05-22 18:59 ` Derrick Stolee 0 siblings, 1 reply; 12+ messages in thread From: Karl Ostmo @ 2019-05-22 18:49 UTC (permalink / raw) To: git After producing the file ".git/objects/info/commit-graph" with the command "git commit-graph write", is there a way to answer queries like "git merge-base --is-ancestor" without having a .git directory? E.g. is there a library that will operate on the "commit-graph" file all by itself? Thanks, Karl ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: standalone library/tool to query commit-graph? 2019-05-22 18:49 standalone library/tool to query commit-graph? Karl Ostmo @ 2019-05-22 18:59 ` Derrick Stolee 2019-05-23 19:29 ` Jakub Narebski 0 siblings, 1 reply; 12+ messages in thread From: Derrick Stolee @ 2019-05-22 18:59 UTC (permalink / raw) To: Karl Ostmo, git On 5/22/2019 2:49 PM, Karl Ostmo wrote: > After producing the file ".git/objects/info/commit-graph" with the > command "git commit-graph write", is there a way to answer queries > like "git merge-base --is-ancestor" without having a .git directory? > E.g. is there a library that will operate on the "commit-graph" file > all by itself? You could certainly build such a tool, assuming your merge-base parameters are full-length commit ids. If you try to start at ref names, you'll need the .git directory. I would not expect such a tool to ever exist in the Git codebase. Instead, you would need a new project, say "graph-analyzer --graph=<path> --is-ancestor <id1> <id2>" Thanks, -Stolee ^ permalink raw reply [flat|nested] 12+ messages in thread
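To make the suggested "--is-ancestor <id1> <id2>" query concrete, here is a minimal Python sketch of the reachability check such a standalone tool might perform. It assumes the parent lists have already been loaded into a dictionary keyed by full-length commit id. The function name `is_ancestor` and the toy ids are hypothetical, and this is not how Git itself implements the check. A commit is treated as an ancestor of itself.

```python
from collections import deque

def is_ancestor(parents, ancestor, descendant):
    """Return True if `ancestor` is reachable from `descendant` via parent links.

    `parents` maps a full commit id to the list of its parent ids
    (an assumed in-memory form of the graph, not a Git API).
    """
    seen = {descendant}
    queue = deque([descendant])
    while queue:
        commit = queue.popleft()
        if commit == ancestor:
            return True
        for parent in parents.get(commit, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return False

# Toy history: "d" merges "b" and "c", which both descend from "a".
history = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
print(is_ancestor(history, "a", "d"))
print(is_ancestor(history, "b", "c"))
```

A plain breadth-first walk visits every commit between the two ids in the worst case. The generation numbers stored in the commit-graph exist to cut such walks short, which this sketch leaves out for brevity.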
* Re: standalone library/tool to query commit-graph? 2019-05-22 18:59 ` Derrick Stolee @ 2019-05-23 19:29 ` Jakub Narebski 2019-05-23 21:54 ` Ævar Arnfjörð Bjarmason 0 siblings, 1 reply; 12+ messages in thread From: Jakub Narebski @ 2019-05-23 19:29 UTC (permalink / raw) To: Derrick Stolee; +Cc: Karl Ostmo, git Derrick Stolee <stolee@gmail.com> writes: > On 5/22/2019 2:49 PM, Karl Ostmo wrote: >> After producing the file ".git/objects/info/commit-graph" with the >> command "git commit-graph write", is there a way to answer queries >> like "git merge-base --is-ancestor" without having a .git directory? >> E.g. is there a library that will operate on the "commit-graph" file >> all by itself? > > You could certainly build such a tool, assuming your merge-base parameters are > full-length commit ids. If you try to start at ref names, you'll need the .git > directory. > > I would not expect such a tool to ever exist in the Git codebase. Instead, you > would need a new project, say "graph-analyzer --graph=<path> --is-ancestor <id1> <id2>" It would be nice if such tool could convert commit-graph into other commonly used augmented graph storage formats, like GEXF (Graph Exchange XML Format), GraphML, GML (Graph Modelling Language), Pajek format or Graphviz .dot format. Wishfully thinking, -- Jakub Narębski ^ permalink raw reply [flat|nested] 12+ messages in thread
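As a small illustration of the conversion wished for above, the following Python sketch writes a commit DAG in Graphviz .dot syntax. It assumes the graph is already available as a mapping from commit id to parent ids. The function name `to_dot` and the sample ids are made up for the example, and none of the other formats mentioned (GEXF, GraphML, GML, Pajek) are covered.

```python
def to_dot(parents):
    """Render a {commit_id: [parent_ids]} mapping as Graphviz DOT text."""
    lines = ["digraph commits {"]
    for commit in sorted(parents):
        lines.append(f'  "{commit}";')
        for parent in parents[commit]:
            # Edges point from child to parent, the direction commits record.
            lines.append(f'  "{commit}" -> "{parent}";')
    lines.append("}")
    return "\n".join(lines)

print(to_dot({"a": [], "b": ["a"]}))
```

The resulting text could be piped to `dot -Tsvg` to draw the graph, provided Graphviz is installed.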
* Re: standalone library/tool to query commit-graph? 2019-05-23 19:29 ` Jakub Narebski @ 2019-05-23 21:54 ` Ævar Arnfjörð Bjarmason 2019-05-23 22:20 ` SZEDER Gábor 0 siblings, 1 reply; 12+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2019-05-23 21:54 UTC (permalink / raw) To: Jakub Narebski; +Cc: Derrick Stolee, Karl Ostmo, git On Thu, May 23 2019, Jakub Narebski wrote: > Derrick Stolee <stolee@gmail.com> writes: >> On 5/22/2019 2:49 PM, Karl Ostmo wrote: > >>> After producing the file ".git/objects/info/commit-graph" with the >>> command "git commit-graph write", is there a way to answer queries >>> like "git merge-base --is-ancestor" without having a .git directory? >>> E.g. is there a library that will operate on the "commit-graph" file >>> all by itself? >> >> You could certainly build such a tool, assuming your merge-base parameters are >> full-length commit ids. If you try to start at ref names, you'll need the .git >> directory. >> >> I would not expect such a tool to ever exist in the Git codebase. Instead, you >> would need a new project, say "graph-analyzer --graph=<path> --is-ancestor <id1> <id2>" > > It would be nice if such tool could convert commit-graph into other > commonly used augmented graph storage formats, like GEXF (Graph Exchange > XML Format), GraphML, GML (Graph Modelling Language), Pajek format or > Graphviz .dot format. Wouldn't that make more sense as a hypothetical output format for "log --graph" rather than something you'd want to emit from the commit-graph? Presumably you'd want to export in such a format to see the shape of the repo, and since the commit graph doesn't include any commits outside of packs you'd miss any loose commits. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: standalone library/tool to query commit-graph? 2019-05-23 21:54 ` Ævar Arnfjörð Bjarmason @ 2019-05-23 22:20 ` SZEDER Gábor 2019-05-23 23:48 ` Derrick Stolee 0 siblings, 1 reply; 12+ messages in thread From: SZEDER Gábor @ 2019-05-23 22:20 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: Jakub Narebski, Derrick Stolee, Karl Ostmo, git On Thu, May 23, 2019 at 11:54:22PM +0200, Ævar Arnfjörð Bjarmason wrote: > > On Thu, May 23 2019, Jakub Narebski wrote: > > > Derrick Stolee <stolee@gmail.com> writes: > >> On 5/22/2019 2:49 PM, Karl Ostmo wrote: > > > >>> After producing the file ".git/objects/info/commit-graph" with the > >>> command "git commit-graph write", is there a way to answer queries > >>> like "git merge-base --is-ancestor" without having a .git directory? > >>> E.g. is there a library that will operate on the "commit-graph" file > >>> all by itself? > >> > >> You could certainly build such a tool, assuming your merge-base parameters are > >> full-length commit ids. If you try to start at ref names, you'll need the .git > >> directory. > >> > >> I would not expect such a tool to ever exist in the Git codebase. Instead, you > >> would need a new project, say "graph-analyzer --graph=<path> --is-ancestor <id1> <id2>" > > > > It would be nice if such tool could convert commit-graph into other > > commonly used augmented graph storage formats, like GEXF (Graph Exchange > > XML Format), GraphML, GML (Graph Modelling Language), Pajek format or > > Graphviz .dot format. > > Wouldn't that make more sense as a hypothetical output format for "log > --graph" rather than something you'd want to emit from the commit-graph? > Presumably you'd want to export in such a format to see the shape of the > repo, and since the commit graph doesn't include any commits outside of > packs you'd miss any loose commits. No, the commit-graph includes loose commits as well. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: standalone library/tool to query commit-graph? 2019-05-23 22:20 ` SZEDER Gábor @ 2019-05-23 23:48 ` Derrick Stolee 2019-05-24 9:34 ` SZEDER Gábor 0 siblings, 1 reply; 12+ messages in thread From: Derrick Stolee @ 2019-05-23 23:48 UTC (permalink / raw) To: SZEDER Gábor, Ævar Arnfjörð Bjarmason Cc: Jakub Narebski, Karl Ostmo, git On 5/23/2019 6:20 PM, SZEDER Gábor wrote: > On Thu, May 23, 2019 at 11:54:22PM +0200, Ævar Arnfjörð Bjarmason wrote: >> >> On Thu, May 23 2019, Jakub Narebski wrote: >> >>> Derrick Stolee <stolee@gmail.com> writes: >>>> On 5/22/2019 2:49 PM, Karl Ostmo wrote: >>> >>>>> After producing the file ".git/objects/info/commit-graph" with the >>>>> command "git commit-graph write", is there a way to answer queries >>>>> like "git merge-base --is-ancestor" without having a .git directory? >>>>> E.g. is there a library that will operate on the "commit-graph" file >>>>> all by itself? >>>> >>>> You could certainly build such a tool, assuming your merge-base parameters are >>>> full-length commit ids. If you try to start at ref names, you'll need the .git >>>> directory. >>>> >>>> I would not expect such a tool to ever exist in the Git codebase. Instead, you >>>> would need a new project, say "graph-analyzer --graph=<path> --is-ancestor <id1> <id2>" >>> >>> It would be nice if such tool could convert commit-graph into other >>> commonly used augmented graph storage formats, like GEXF (Graph Exchange >>> XML Format), GraphML, GML (Graph Modelling Language), Pajek format or >>> Graphviz .dot format. >> >> Wouldn't that make more sense as a hypothetical output format for "log >> --graph" rather than something you'd want to emit from the commit-graph? >> Presumably you'd want to export in such a format to see the shape of the >> repo, and since the commit graph doesn't include any commits outside of >> packs you'd miss any loose commits. > > No, the commit-graph includes loose commits as well. Depends on how you build the commit-graph. 
	git commit-graph write
	git commit-graph write --stdin-packs

These options build based on commits in packs (and closes under reachability).

	git commit-graph write --reachable
	git commit-graph write --stdin-commits

These options build based on a set of starting commits. Either the refs (--reachable) or the input commit ids (--stdin-commits).

But I do like the flexibility of `git log --graph` as you could export the graph after reparenting (with options like `--simplify-merges -- <path>`). You also would not include commits from random topic branches you have sitting around.

-Stolee

^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: standalone library/tool to query commit-graph? 2019-05-23 23:48 ` Derrick Stolee @ 2019-05-24 9:34 ` SZEDER Gábor 2019-05-24 9:49 ` Ævar Arnfjörð Bjarmason 0 siblings, 1 reply; 12+ messages in thread From: SZEDER Gábor @ 2019-05-24 9:34 UTC (permalink / raw) To: Derrick Stolee Cc: Ævar Arnfjörð Bjarmason, Jakub Narebski, Karl Ostmo, git On Thu, May 23, 2019 at 07:48:33PM -0400, Derrick Stolee wrote: > On 5/23/2019 6:20 PM, SZEDER Gábor wrote: > > On Thu, May 23, 2019 at 11:54:22PM +0200, Ævar Arnfjörð Bjarmason wrote: > >> and since the commit graph doesn't include any commits outside of > >> packs you'd miss any loose commits. > > > > No, the commit-graph includes loose commits as well. > > Depends on how you build the commit-graph. Yeah; I just didn't want to go into details, hoping that this short reply will be enough to jog Ævar's memory to recall our earlier discussion about this :) ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: standalone library/tool to query commit-graph? 2019-05-24 9:34 ` SZEDER Gábor @ 2019-05-24 9:49 ` Ævar Arnfjörð Bjarmason 2019-05-24 10:06 ` SZEDER Gábor 2019-06-25 18:27 ` Jakub Narebski 0 siblings, 2 replies; 12+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2019-05-24 9:49 UTC (permalink / raw) To: SZEDER Gábor; +Cc: Derrick Stolee, Jakub Narebski, Karl Ostmo, git On Fri, May 24 2019, SZEDER Gábor wrote: > On Thu, May 23, 2019 at 07:48:33PM -0400, Derrick Stolee wrote: >> On 5/23/2019 6:20 PM, SZEDER Gábor wrote: >> > On Thu, May 23, 2019 at 11:54:22PM +0200, Ævar Arnfjörð Bjarmason wrote: > >> >> and since the commit graph doesn't include any commits outside of >> >> packs you'd miss any loose commits. >> > >> > No, the commit-graph includes loose commits as well. >> >> Depends on how you build the commit-graph. > > Yeah; I just didn't want to go into details, hoping that this short > reply will be enough to jog Ævar's memory to recall our earlier > discussion about this :) To clarify (and I should have said) I meant it'll include only packed commits in the mode Karl Ostmo invoked it in, as Derrick points out. But yeah, you can of course give it arbitrary starting points, but needing to deal with those sorts of caveats makes it rather useless in practice for the sort of use-case Jakub mused about, but more importantly a full XML dump of the graph isn't going to get much of a benefit from the commit graph, it helps with algorithms that want to avoid those sorts of full walks. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: standalone library/tool to query commit-graph? 2019-05-24 9:49 ` Ævar Arnfjörð Bjarmason @ 2019-05-24 10:06 ` SZEDER Gábor 2019-05-24 10:49 ` Ævar Arnfjörð Bjarmason 2019-06-25 18:27 ` Jakub Narebski 1 sibling, 1 reply; 12+ messages in thread From: SZEDER Gábor @ 2019-05-24 10:06 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: Derrick Stolee, Jakub Narebski, Karl Ostmo, git On Fri, May 24, 2019 at 11:49:28AM +0200, Ævar Arnfjörð Bjarmason wrote: > > On Fri, May 24 2019, SZEDER Gábor wrote: > > > On Thu, May 23, 2019 at 07:48:33PM -0400, Derrick Stolee wrote: > >> On 5/23/2019 6:20 PM, SZEDER Gábor wrote: > >> > On Thu, May 23, 2019 at 11:54:22PM +0200, Ævar Arnfjörð Bjarmason wrote: > > > >> >> and since the commit graph doesn't include any commits outside of > >> >> packs you'd miss any loose commits. > >> > > >> > No, the commit-graph includes loose commits as well. > >> > >> Depends on how you build the commit-graph. > > > > Yeah; I just didn't want to go into details, hoping that this short > > reply will be enough to jog Ævar's memory to recall our earlier > > discussion about this :) > > To clarify (and I should have said) I meant it'll include only packed > commits in the mode Karl Ostmo invoked it in, as Derrick points out. No, even in that mode it will include loose objects as well, if it has to; that's what the "and closes under reachability" part of Derrick's reply means and that's what I showed in our earlier discussion at: https://public-inbox.org/git/20190322154943.GF22459@szeder.dev/ ^ permalink raw reply [flat|nested] 12+ messages in thread
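The "closes under reachability" behaviour discussed above can be sketched in a few lines of Python. This is only a model of the idea, assuming a hypothetical parent map and a hypothetical set of packed commit ids used as seeds. It does not reproduce what "git commit-graph write" actually does internally.

```python
def close_under_reachability(seeds, parents):
    """Return every commit reachable from `seeds` by following parent links."""
    closed = set()
    stack = list(seeds)
    while stack:
        commit = stack.pop()
        if commit in closed:
            continue
        closed.add(commit)
        stack.extend(parents.get(commit, []))
    return closed

# Hypothetical repository: c1 and c3 are packed, c2 and c4 are loose.
parents = {"c1": [], "c2": ["c1"], "c3": ["c2"], "c4": ["c3"]}
packed = {"c1", "c3"}
print(sorted(close_under_reachability(packed, parents)))
```

In this toy history the loose commit c2 ends up in the result because the packed commit c3 reaches it, while the loose commit c4 on top does not, since no seed reaches it. That matches both halves of the exchange: loose commits can be included, yet a recently added loose tip is missed.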
* Re: standalone library/tool to query commit-graph? 2019-05-24 10:06 ` SZEDER Gábor @ 2019-05-24 10:49 ` Ævar Arnfjörð Bjarmason 2019-05-24 11:37 ` SZEDER Gábor 0 siblings, 1 reply; 12+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2019-05-24 10:49 UTC (permalink / raw) To: SZEDER Gábor; +Cc: Derrick Stolee, Jakub Narebski, Karl Ostmo, git On Fri, May 24 2019, SZEDER Gábor wrote: > On Fri, May 24, 2019 at 11:49:28AM +0200, Ævar Arnfjörð Bjarmason wrote: >> >> On Fri, May 24 2019, SZEDER Gábor wrote: >> >> > On Thu, May 23, 2019 at 07:48:33PM -0400, Derrick Stolee wrote: >> >> On 5/23/2019 6:20 PM, SZEDER Gábor wrote: >> >> > On Thu, May 23, 2019 at 11:54:22PM +0200, Ævar Arnfjörð Bjarmason wrote: >> > >> >> >> and since the commit graph doesn't include any commits outside of >> >> >> packs you'd miss any loose commits. >> >> > >> >> > No, the commit-graph includes loose commits as well. >> >> >> >> Depends on how you build the commit-graph. >> > >> > Yeah; I just didn't want to go into details, hoping that this short >> > reply will be enough to jog Ævar's memory to recall our earlier >> > discussion about this :) >> >> To clarify (and I should have said) I meant it'll include only packed >> commits in the mode Karl Ostmo invoked it in, as Derrick points out. > > No, even in that mode it will include loose objects as well, if it has > to; that's what the "and closes under reachability" part of Derrick's > reply means and that's what I showed in our earlier discussion at: > > https://public-inbox.org/git/20190322154943.GF22459@szeder.dev/ I should have said "include any commits outside of packs [to seed the revision walk]". As you correctly point out there *are* caveats to that, e.g. it's possible to have packs & loose commits but you include everything because of reachability. 
For the purposes of the discussion Jakub started upthread, the not-quite-correct mental model that we generally accumulate loose objects that later coalesce into packs is close enough. I.e. for that reason, for most users a "git commit-graph write" won't produce a graph with all reachable commits. E.g. try cloning git.git, "git am"-ing a patch on top, and generating it again: it'll be the same (unless you picked a humongous patch). Similarly, it'll be incomplete for most users that have gc.writeCommitGraph=true on, since they use "gc --auto" and are likely in an in-between state where they have a semi-stale graph. So building tools directly on top of it shouldn't be anyone's first choice. Instead, walk the DAG and see if that walking code can, as an optimization, optimistically consult the commit-graph. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: standalone library/tool to query commit-graph? 2019-05-24 10:49 ` Ævar Arnfjörð Bjarmason @ 2019-05-24 11:37 ` SZEDER Gábor 0 siblings, 0 replies; 12+ messages in thread From: SZEDER Gábor @ 2019-05-24 11:37 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: Derrick Stolee, Jakub Narebski, Karl Ostmo, git On Fri, May 24, 2019 at 12:49:12PM +0200, Ævar Arnfjörð Bjarmason wrote: > >> > On Thu, May 23, 2019 at 07:48:33PM -0400, Derrick Stolee wrote: > >> >> On 5/23/2019 6:20 PM, SZEDER Gábor wrote: > >> >> > On Thu, May 23, 2019 at 11:54:22PM +0200, Ævar Arnfjörð Bjarmason wrote: > >> > > >> >> >> and since the commit graph doesn't include any commits outside of > >> >> >> packs you'd miss any loose commits. > >> >> > > >> >> > No, the commit-graph includes loose commits as well. > >> >> > >> >> Depends on how you build the commit-graph. > >> > > >> > Yeah; I just didn't want to go into details, hoping that this short > >> > reply will be enough to jog Ævar's memory to recall our earlier > >> > discussion about this :) > >> > >> To clarify (and I should have said) I meant it'll include only packed > >> commits in the mode Karl Ostmo invoked it in, as Derrick points out. > > > > No, even in that mode it will include loose objects as well, if it has > > to; that's what the "and closes under reachability" part of Derrick's > > reply means and that's what I showed in our earlier discussion at: > > > > https://public-inbox.org/git/20190322154943.GF22459@szeder.dev/ > > I should have said "include any commits outside of packs [to seed the > revision walk]". > > As you correctly point out there *are* caveats to that, e.g. it's > possible to have packs & loose commits but you include everything > because of reachability. > > For the purposes of the discussion Jakub started upthread the > not-quite-correct-but-close-enough mental model that we generally tend > to accumulate loose objects that later coalesce into packs is close > enough. > > I.e. 
for that reason for most users a "git commit-graph write" won't > produce a graph with all reachable commits, e.g. try cloning git.git, > "git am"-ing a patch on top, and generate it again, it'll be the same > (unless you picked a humongous patch). Ok, with this I finally understand what you meant. And it just reinforces my long-held belief that '--reachable' should be the default for 'git commit-graph write'... ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: standalone library/tool to query commit-graph? 2019-05-24 9:49 ` Ævar Arnfjörð Bjarmason 2019-05-24 10:06 ` SZEDER Gábor @ 2019-06-25 18:27 ` Jakub Narebski 1 sibling, 0 replies; 12+ messages in thread From: Jakub Narebski @ 2019-06-25 18:27 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: SZEDER Gábor, Derrick Stolee, Karl Ostmo, git Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: [...] > To clarify (and I should have said) I meant it'll include only packed > commits in the mode Karl Ostmo invoked it in, as Derrick points out. > > But yeah, you can of course give it arbitrary starting points, but > needing to deal with those sorts of caveats makes it rather useless in > practice for the sort of use-case Jakub mused about, but more > importantly a full XML dump of the graph isn't going to get much of a > benefit from the commit graph, it helps with algorithms that want to > avoid those sorts of full walks. Actually for an "XML dump" of a graph of revisions (assuming that you can give nodes and edges in arbitrary order in this graph output format) doing it using serialized commit-graph should be faster: you only need to read one file, and convert it to other format (perhaps even in a streaming manner). No need to delta-unpack, decompress and parse commit objects. Though on the other hand you are right: if "git log --graph" uses serialized commit graph, and it is used for XML / JSON dump, it should also be fast. If there is no serialized commit graph, you still can generate XML dump. Best, -- Jakub Narębski ^ permalink raw reply [flat|nested] 12+ messages in thread
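The "read one file and convert it" idea above can be sketched as follows in Python. The layout used here (a "CGPH" header, a chunk table, and the OIDL, CDAT and EDGE chunks) follows my reading of Git's commit-graph format documentation. The sketch assumes a single version 1 file with 20-byte SHA-1 ids and no base graphs, performs no checksum or bounds validation, and uses local names (`read_parents`, the constants) that are illustrative rather than part of any Git API.

```python
import struct

GRAPH_PARENT_NONE = 0x70000000  # marker for "no parent" in a commit record
GRAPH_EXTRA_EDGES = 0x80000000  # high bit set: value indexes the extra edge list
GRAPH_LAST_EDGE = 0x80000000    # high bit set inside the edge list: last parent

def read_parents(data, hash_len=20):
    """Return {commit_hex: [parent_hex, ...]} from raw commit-graph bytes."""
    signature, version, _hash_version, num_chunks, _reserved = struct.unpack_from(
        ">4sBBBB", data, 0)
    if signature != b"CGPH" or version != 1:
        raise ValueError("not a version 1 commit-graph file")

    # Chunk table: (4-byte id, 8-byte offset) entries plus a terminating entry.
    table = [struct.unpack_from(">4sQ", data, 8 + 12 * i)
             for i in range(num_chunks + 1)]
    chunks = {cid: (off, table[i + 1][1])
              for i, (cid, off) in enumerate(table[:-1])}

    oidl_start, oidl_end = chunks[b"OIDL"]
    count = (oidl_end - oidl_start) // hash_len
    oids = [data[oidl_start + i * hash_len:oidl_start + (i + 1) * hash_len].hex()
            for i in range(count)]

    cdat_start = chunks[b"CDAT"][0]
    edge_start = chunks.get(b"EDGE", (0, 0))[0]
    graph = {}
    for i, oid in enumerate(oids):
        # Record: tree id, two parent positions, 8 bytes of generation/date.
        record = cdat_start + i * (hash_len + 16)
        p1, p2 = struct.unpack_from(">II", data, record + hash_len)
        commit_parents = [] if p1 == GRAPH_PARENT_NONE else [oids[p1]]
        if p2 & GRAPH_EXTRA_EDGES:
            # Octopus merge: remaining parents live in the extra edge list.
            pos = p2 & 0x7FFFFFFF
            while True:
                (value,) = struct.unpack_from(">I", data, edge_start + 4 * pos)
                commit_parents.append(oids[value & 0x7FFFFFFF])
                if value & GRAPH_LAST_EDGE:
                    break
                pos += 1
        elif p2 != GRAPH_PARENT_NONE:
            commit_parents.append(oids[p2])
        graph[oid] = commit_parents
    return graph
```

Because every record has a fixed size, the loop never touches a commit object, which is the speed advantage described above. A caller might pass the bytes of ".git/objects/info/commit-graph" read with `open(path, "rb").read()`.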