git@vger.kernel.org mailing list mirror (one of many)
* [RFC] speed up git submodule
From: Fredrik Gustafsson @ 2013-06-17  0:05 UTC
  To: iveqy; +Cc: git

I've been playing a bit with Lua. It's an embeddable scripting language
with strong C integration. It's small and fast.

The interesting feature would be to call C functions directly from Lua. I
suppose that would increase speed even more, while keeping the
convenience of an interpreted language. Lua is smaller and faster
(well, as always, it depends on what you're doing) than Python and Ruby.
Perl is a real pain for the Windows folks (I've heard).

A correct implementation of Lua support would start a Lua interpreter
from inside git.c (or somewhere) and load the Lua code for a specific
command. That would make us independent of any Lua installation on the
target (although the git binary would grow by the size of the Lua
library, around 300 kB).

However, I did a quick test using Lua as a replacement for sh (without
direct calls to C functions) and the result is impressive. (This is the
wrong way to use Lua, though; shell scripting is not something Lua is
good at.)
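
One reason shell-style scripting is awkward in Lua: io.popen takes the
whole command as a single string that is handed to the shell, so every
argument has to be quoted by hand. A minimal sketch of that, not part of
the patch; the names shell_quote, build_cmd and run_argv are hypothetical
and a POSIX shell is assumed:

```lua
-- Hypothetical helper: quote one argument for a POSIX shell.
function shell_quote(arg)
	-- Wrap in single quotes; a literal ' becomes '\'' (close the quote,
	-- emit an escaped quote, reopen the quote).
	return "'" .. string.gsub(arg, "'", "'\\''") .. "'"
end

-- Hypothetical helper: turn a list of arguments into one command string.
function build_cmd(argv)
	local quoted = {}
	for i, a in ipairs(argv) do
		quoted[i] = shell_quote(a)
	end
	return table.concat(quoted, " ")
end

-- Run the command and return everything it wrote to stdout.
function run_argv(argv)
	local f = assert(io.popen(build_cmd(argv), 'r'))
	local out = f:read('*a')
	f:close()
	return out
end

-- The argument survives the shell even with a quote and spaces in it.
io.write(run_argv({ "printf", "%s\n", "it's a path with spaces" }))
```

With something like this, a path such as "my sub/module" could be passed
to git without being split into two words by the shell.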

I did some runs on a project with 52 submodules (or 53 if you count the
ones in .gitmodules). These results are pretty typical:
iveqy@kolya:~/projects/eracle_core$ time /home/iveqy/projects/git/git-submodule.lua > /dev/null

real    0m1.665s
user    0m0.276s
sys     0m0.452s
iveqy@kolya:~/projects/eracle_core$ time git submodule > /dev/null

real    0m3.413s
user    0m0.476s
sys     0m1.224s

For me, that speedup (roughly 2x in wall-clock time) does matter.

NOTICE!!!
This code is experimental. It has some known bugs and some style
issues. A state-of-the-art complete implementation would contain a few
more tests/jumps, less concatenation (which is extremely expensive in
Lua) and fewer git invocations.
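
To illustrate the concatenation point: a minimal sketch, not part of the
patch, that collects the output lines in a table and joins them once
with table.concat instead of growing a string with '..' in a loop. The
function name format_status, the entry fields and the sample data are
made up for the example:

```lua
-- Hypothetical helper: build the whole status listing with one join.
-- Each entry is a table with the fields prefix, sha1, path and revname.
function format_status(entries)
	local lines = {}
	for i, e in ipairs(entries) do
		-- string.format builds one line without a chain of '..' operators.
		lines[i] = string.format("%s%s %s (%s)", e.prefix, e.sha1, e.path, e.revname)
	end
	-- One table.concat call joins all lines; repeated '..' in the loop
	-- would copy the growing string on every iteration.
	return table.concat(lines, "\n") .. "\n"
end

-- Made-up sample data, only here to show the call.
local sample = {
	{ prefix = " ", sha1 = "1111111", path = "lib/a", revname = "v1.0" },
	{ prefix = "+", sha1 = "2222222", path = "lib/b", revname = "heads/master" },
}
io.write(format_status(sample))
```

The same idea would apply to building the long shell command strings:
push the pieces into a table and join them once.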

Signed-off-by: Fredrik Gustafsson <iveqy@iveqy.com>
---
 git-submodule.lua | 104 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 104 insertions(+)
 create mode 100755 git-submodule.lua

diff --git a/git-submodule.lua b/git-submodule.lua
new file mode 100755
index 0000000..14f71e6
--- /dev/null
+++ b/git-submodule.lua
@@ -0,0 +1,104 @@
+#!/usr/bin/lua
+
+function run_cmd(cmd)
+	local f = io.popen(cmd, 'r');
+	local out = f:read('*a');
+	f:close()
+	return out
+end
+
+function fwrite(fmt, ...)
+	return io.write(string.format(fmt, ...))
+end
+
+function read_gitmodules()
+	local inf = assert(io.open('.gitmodules', 'r'))
+	local config = inf:read("*all")
+	local gitmodules = {}
+	for sm in string.gmatch(config, '%[[^]]*%][^%[]*') do
+		local thismod = {}
+		local name = string.match(sm, '%[%s-submodule%s-"(.+)"%s-%]')
+		thismod["name"] = name
+		local path = ''
+		for k, v in string.gmatch(sm, '\n%s*([^=%s]*)%s*=%s*([^\n]*)') do
+			if k == 'path' then
+				path = v
+			else
+				thismod[k] = v
+			end
+		end
+		if path == '' then
+			fwrite("No path found for %s in .gitmodules\n", name)
+			os.exit(1)
+		end
+		gitmodules[path] = thismod
+	end
+
+	return gitmodules
+end
+
+function module_list()
+	local lsfiles = 'git ls-files --stage --error-unmatch -z || echo "#unmatched"'
+	local out = run_cmd(lsfiles)
+	local unmerged = ''
+	local subs = read_gitmodules()
+
+	for row in string.gmatch(out, '.-\0') do
+		if row == '#unmatched' then
+			os.exit(1)
+		end
+
+		local mode, sha1, stage, path = string.match(row, '(%d+)%s([0-9a-f]+)%s(.)%s(.*)\0')
+		if mode == '160000' then
+			if stage == '0' then
+				subs[path]["sha1"] = sha1
+				subs[path]["stage"] = stage
+			else
+				if unmerged ~= path then
+					local null_sha1 = '0000000000000000000000000000000000000000'
+					subs[path]["sha1"] = null_sha1
+					subs[path]["stage"] = 'U'
+				end
+				unmerged = path
+			end
+		end
+	end
+	return subs
+end
+
+function get_name_rev(path, sha1)
+	if sha1 == nil then sha1="" end
+	local cmd = "cd \"" .. path .. "\" && (git describe " .. sha1 ..
+				" 2>/dev/null || git describe --tags " .. sha1 ..
+				" 2>/dev/null || git describe --contains " .. sha1 ..
+				" 2>/dev/null || git describe --all --always " .. sha1 ..
+				" 2>/dev/null) "
+	return (string.gsub(run_cmd(cmd), '\n', ''))
+end
+
+function cmd_status()
+	local subs = module_list()
+	for smpath in pairs(subs) do
+		if (subs[smpath].sha1) then
+			if subs[smpath].stage == 'U' then
+				subs[smpath]["revname"] = get_name_rev(smpath, subs[smpath].sha1)
+				fwrite("U%s %s (%s)\n", subs[smpath].sha1, smpath, subs[smpath].revname)
+			elseif run_cmd("test -z " .. subs[smpath].url .. " || ! test -d " .. smpath .."/.git -o -f " .. smpath .. "/.git || echo '0' ") ~= '0\n' then
+				subs[smpath]["revname"] = get_name_rev(smpath, subs[smpath].sha1)
+				fwrite("-%s %s (%s)\n", subs[smpath].sha1, smpath, subs[smpath].revname)
+			elseif run_cmd("git diff-files --ignore-submodules=dirty --quiet -- " .. smpath .. " || echo '0'") ~= '0\n' then
+				-- No diff against the index: the submodule HEAD matches the recorded commit.
+				subs[smpath]["revname"] = get_name_rev(smpath, subs[smpath].sha1)
+				fwrite(" %s %s (%s)\n", subs[smpath].sha1, smpath, subs[smpath].revname)
+			else
+				subs[smpath].sha1 = string.gsub(run_cmd('cd ' .. smpath .. ' && git rev-parse --verify HEAD'), "\n", "")
+				subs[smpath]["revname"] = get_name_rev(smpath, subs[smpath].sha1)
+				fwrite("+%s %s (%s)\n", subs[smpath].sha1, smpath, subs[smpath].revname)
+			end
+		end
+	end
+end
+
+if arg[1] == nil or arg[1] == 'status' then
+	cmd_status()
+end
-- 
1.8.3.1.381.g2ab719e.dirty
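
As a side note on the record parsing in module_list(): a minimal sketch,
not part of the patch, that splits NUL-terminated `git ls-files --stage -z`
output without putting "\0" inside a pattern (Lua 5.1 patterns cannot
contain embedded zeros) and that anchors on the tab between stage and
path. The names split_nul and parse_stage_record and the sample input
are hypothetical:

```lua
-- Split a NUL-terminated record stream. A plain find (4th argument
-- true) compares raw bytes, so no pattern containing "\0" is needed.
function split_nul(s)
	local records, start = {}, 1
	while true do
		local stop = string.find(s, "\0", start, true)
		if not stop then break end
		records[#records + 1] = string.sub(s, start, stop - 1)
		start = stop + 1
	end
	return records
end

-- Parse one "<mode> <object> <stage>\t<path>" record.
-- Returns nil when the record does not have that shape.
function parse_stage_record(record)
	return string.match(record, "^(%d+) (%x+) (%d)\t(.*)$")
end

-- Made-up sample standing in for real `git ls-files --stage -z` output.
local sample = "160000 " .. string.rep("a", 40) .. " 0\tlibs/one\0" ..
	"100644 " .. string.rep("b", 40) .. " 0\tREADME\0"

for _, record in ipairs(split_nul(sample)) do
	local mode, sha1, stage, path = parse_stage_record(record)
	if mode == "160000" then
		-- Mode 160000 marks a gitlink, i.e. a submodule entry.
		print(stage, path)
	end
end
```

Records that do not parse come back as nil, so the caller can decide
whether to skip them or abort.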


* Re: [RFC] speed up git submodule
From: Thomas Rast @ 2013-06-17  8:29 UTC
  To: Fredrik Gustafsson; +Cc: git

Fredrik Gustafsson <iveqy@iveqy.com> writes:

> The interesting feature would be to call C functions directly from Lua. I
> suppose that would increase speed even more, while keeping the
> convenience of an interpreted language. Lua is smaller and faster
> (well, as always, it depends on what you're doing) than Python and Ruby.
> Perl is a real pain for the Windows folks (I've heard).
>
> A correct implementation of Lua support would start a Lua interpreter
> from inside git.c (or somewhere) and load the Lua code for a specific
> command. That would make us independent of any Lua installation on the
> target (although the git binary would grow by the size of the Lua
> library, around 300 kB).
>
> However, I did a quick test using Lua as a replacement for sh (without
> direct calls to C functions) and the result is impressive. (This is the
> wrong way to use Lua, though; shell scripting is not something Lua is
> good at.)

Ok, so as you say, to really buy us anything you'd have to interface Lua
with the C code directly.  Otherwise you might as well write it in Perl
instead, which is already a requirement for a lot of the "niceties".

However, instead of writing against git's C code, you could also
interface with libgit2, either from Lua or Perl...


BTW Peff once posted an interface to Lua for the --pretty formatters:

  http://thread.gmane.org/gmane.comp.version-control.git/206335

-- 
Thomas Rast
trast@{inf,student}.ethz.ch
