git@vger.kernel.org mailing list mirror (one of many)
From: Fredrik Gustafsson <iveqy@iveqy.com>
To: iveqy@iveqy.com
Cc: git@vger.kernel.org
Subject: [RFC] speed up git submodule
Date: Mon, 17 Jun 2013 02:05:03 +0200
Message-ID: <1371427503-9678-1-git-send-email-iveqy@iveqy.com>

I've been playing a bit with Lua. It's an embeddable scripting language
with strong C integration. It's small and fast.

The interesting feature would be to call C functions directly from Lua.
I suppose that would increase speed even more, while we keep the
convenience of an interpreted language. Lua is smaller and faster (well,
as always, it depends on what you're doing) than Python and Ruby. Perl
is a real pain for the Windows folks (I've heard).

A correct implementation of Lua support would start a Lua interpreter
from inside git.c (or somewhere) and load the Lua code for a specific
command. That would make us independent of any Lua installation on the
target (although the Lua library would grow the git binary by around
300 kB).

However, I did a quick test using Lua as a replacement for sh (without
direct calls to C functions) and the result is impressive. (This is the
wrong way to use Lua, though; shell scripting is not something Lua is
good at.)

I did some runs on a project with 52 submodules (or 53 if you count the
ones in .gitmodules). These results are pretty typical:
iveqy@kolya:~/projects/eracle_core$ time /home/iveqy/projects/git/git-submodule.lua > /dev/null

real    0m1.665s
user    0m0.276s
sys     0m0.452s
iveqy@kolya:~/projects/eracle_core$ time git submodule > /dev/null

real    0m3.413s
user    0m0.476s
sys     0m1.224s

For me, that speedup (roughly 2x in wall-clock time) does matter.

NOTICE!!!
This code is experimental. It has some known bugs and some style
issues. A complete, state-of-the-art implementation would contain a few
more tests/jumps, less string concatenation (which is extremely
expensive in Lua) and fewer git invocations.
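
On the concatenation point, here is a minimal sketch (not part of the
patch below) of the usual Lua idiom: collect the pieces in a table and
join them once with table.concat, instead of growing a string with ".."
inside a loop. The function name format_status and the entry fields
flag, sha1, path and revname are hypothetical, chosen only to resemble
the status output.

```lua
-- Sketch only: format_status and the entry fields are assumptions made
-- for illustration, they do not exist in git-submodule.lua.
local function format_status(entries)
	local lines = {}
	for i, e in ipairs(entries) do
		-- One formatted line per submodule; nothing is appended to a
		-- growing string here.
		lines[i] = string.format("%s%s %s (%s)\n", e.flag, e.sha1, e.path, e.revname)
	end
	-- Join all lines in a single pass.
	return table.concat(lines)
end

io.write(format_status({
	{ flag = " ", sha1 = "abc123", path = "lib/foo", revname = "v1.0" },
	{ flag = "+", sha1 = "def456", path = "lib/bar", revname = "heads/master" },
}))
```

Writing the result with one io.write call would also replace one
write per submodule, though whether that matters next to the cost of
the git invocations is untested.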

Signed-off-by: Fredrik Gustafsson <iveqy@iveqy.com>
---
 git-submodule.lua | 104 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 104 insertions(+)
 create mode 100755 git-submodule.lua

diff --git a/git-submodule.lua b/git-submodule.lua
new file mode 100755
index 0000000..14f71e6
--- /dev/null
+++ b/git-submodule.lua
@@ -0,0 +1,104 @@
+#!/usr/bin/lua
+
+function run_cmd(cmd)
+	local f = io.popen(cmd, 'r');
+	local out = f:read('*a');
+	f:close()
+	return out
+end
+
+function fwrite(fmt, ...)
+	return io.write(string.format(fmt, ...))
+end
+
+function read_gitmodules()
+	local inf = assert(io.open('.gitmodules', 'r'))
+	local config = inf:read("*all")
+	local gitmodules = {}
+	for sm in string.gmatch(config, '%[[^]]*%][^%[]*') do
+		local thismod = {}
+		local name = string.match(sm, '%[%s-submodule%s-"(.+)"%s-%]')
+		thismod["name"] = name
+		local path = ''
+		for k, v in string.gmatch(sm, '\n%s*([^=%s]*)%s*=%s*([^\n]*)') do
+			if k == 'path' then
+				path = v
+			else
+				thismod[k] = v
+			end
+		end
+		if path == '' then
+			fwrite("No path found for %s in .gitmodules\n", name)
+			os.exit(1)
+		end
+		gitmodules[path] = thismod
+	end
+
+	return gitmodules
+end
+
+function module_list()
+	local lsfiles = 'git ls-files --stage --error-unmatch -z || echo "#unmatched"'
+	local out = run_cmd(lsfiles)
+	local unmerged = ''
+	local subs = read_gitmodules()
+
+	for row in string.gmatch(out, '.-\0') do
+		if row == '#unmatched' then
+			os.exit(1)
+		end
+
+		local mode, sha1, stage, path = string.match(row, '(%d+)%s([0-9a-f]+)%s(.)%s(.*)\0')
+		if mode == '160000' then
+			if stage == '0' then
+				subs[path]["sha1"] = sha1
+				subs[path]["stage"] = stage
+			else
+				if unmerged ~= path then
+					local null_sha1 = '0000000000000000000000000000000000000000'
+					subs[path]["sha1"] = null_sha1
+					subs[path]["stage"] = 'U'
+				end
+				unmerged = path
+			end
+		end
+	end
+	return subs
+end
+
+function get_name_rev(path, sha1)
+	if sha1 == nil then sha1="" end
+	local cmd = "cd \"" .. path .. "\" && (git describe " .. sha1 ..
+				" 2>/dev/null || git describe --tags " .. sha1 ..
+				" 2>/dev/null || git describe --contains " .. sha1 ..
+				" 2>/dev/null || git describe --all --always " .. sha1 ..
+				" 2>/dev/null) "
+	return (string.gsub(run_cmd(cmd), '\n', ''))
+end
+
+function cmd_status()
+	local subs = module_list()
+	for smpath in pairs(subs) do
+		if (subs[smpath].sha1) then
+			if subs[smpath].stage == 'U' then
+				subs[smpath]["revname"] = get_name_rev(smpath, subs[smpath].sha1)
+				fwrite("U%s %s (%s)\n", subs[smpath].sha1, smpath, subs[smpath].revname)
+			elseif run_cmd("test -z " .. subs[smpath].url .. " || ! test -d " .. smpath .."/.git -o -f " .. smpath .. "/.git || echo '0' ") ~= '0\n' then
+				subs[smpath]["revname"] = get_name_rev(smpath, subs[smpath].sha1)
+				fwrite("-%s %s (%s)\n", subs[smpath].sha1, smpath, subs[smpath].revname)
+			elseif run_cmd("git diff-files --ignore-submodules=dirty --quiet -- " .. smpath .. " || echo '0'") ~= '0\n' then
+				-- diff-files printed nothing extra: the submodule matches the index
+				subs[smpath]["revname"] = get_name_rev(smpath, subs[smpath].sha1)
+				fwrite(" %s %s (%s)\n", subs[smpath].sha1, smpath, subs[smpath].revname)
+			else
+				subs[smpath].sha1 = string.gsub(run_cmd('cd ' .. smpath .. ' && git rev-parse --verify HEAD'), "\n", "")
+				subs[smpath]["revname"] = get_name_rev(smpath, subs[smpath].sha1)
+				fwrite("+%s %s (%s)\n", subs[smpath].sha1, smpath, subs[smpath].revname)
+			end
+		end
+	end
+end
+
+if arg[1] == nil or arg[1] == 'status' then
+	cmd_status()
+end
-- 
1.8.3.1.381.g2ab719e.dirty

Thread overview: 2+ messages
2013-06-17  0:05 Fredrik Gustafsson [this message]
2013-06-17  8:29 ` [RFC] speed up git submodule Thomas Rast