git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* Rewriting git-repack.sh in C
@ 2013-08-02 13:48 Stefan Beller
  2013-08-02 14:10 ` Duy Nguyen
  2013-08-05 10:34 ` Rewriting git-repack.sh in C Matthieu Moy
  0 siblings, 2 replies; 17+ messages in thread
From: Stefan Beller @ 2013-08-02 13:48 UTC (permalink / raw)
  To: GIT Mailing-list

[-- Attachment #1: Type: text/plain, Size: 837 bytes --]

Hello,

I'd like to rewrite the repack shell script in C.
So I tried the naive approach reading the man page and 
the script itself and write C program by matching each block/line 
of the script with a function in C

Now I stumble upon other git commands (git pack-objects).
What's the best way to approach such a plumbing command?

I don't think just calling cmd_pack_objects(argc, **argv) would 
be the right thing to do, as we're not using all the command 
line parameters, so some of the logic in cmd_pack_object could 
be skipped.
Another approach would be to use some of the functions as used 
by cmd_pack_objects, but these mostly reside in builtin/pack_objects.c
They'd need to be moved up to pack.h/pack.c.

So my question is, how you'd generally approach rewriting a 
shell script in C.

Stefan






[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 899 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Rewriting git-repack.sh in C
  2013-08-02 13:48 Rewriting git-repack.sh in C Stefan Beller
@ 2013-08-02 14:10 ` Duy Nguyen
  2013-08-02 16:36   ` Duy Nguyen
  2013-08-03  6:33   ` Fredrik Gustafsson
  2013-08-05 10:34 ` Rewriting git-repack.sh in C Matthieu Moy
  1 sibling, 2 replies; 17+ messages in thread
From: Duy Nguyen @ 2013-08-02 14:10 UTC (permalink / raw)
  To: Stefan Beller; +Cc: GIT Mailing-list

On Fri, Aug 2, 2013 at 8:48 PM, Stefan Beller
<stefanbeller@googlemail.com> wrote:
> Hello,
>
> I'd like to rewrite the repack shell script in C.
> So I tried the naive approach reading the man page and
> the script itself and write C program by matching each block/line
> of the script with a function in C
>
> Now I stumble upon other git commands (git pack-objects).
> What's the best way to approach such a plumbing command?
>
> I don't think just calling cmd_pack_objects(argc, **argv) would
> be the right thing to do, as we're not using all the command
> line parameters, so some of the logic in cmd_pack_object could
> be skipped.
> Another approach would be to use some of the functions as used
> by cmd_pack_objects, but these mostly reside in builtin/pack_objects.c
> They'd need to be moved up to pack.h/pack.c.
>
> So my question is, how you'd generally approach rewriting a
> shell script in C.

Start a new process via start_command/run_command interface. It's
safer to retain the process boundary at this stage. You can try to
integrate further later.
-- 
Duy

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Rewriting git-repack.sh in C
  2013-08-02 14:10 ` Duy Nguyen
@ 2013-08-02 16:36   ` Duy Nguyen
  2013-08-03  6:33   ` Fredrik Gustafsson
  1 sibling, 0 replies; 17+ messages in thread
From: Duy Nguyen @ 2013-08-02 16:36 UTC (permalink / raw)
  To: Stefan Beller; +Cc: GIT Mailing-list

On Fri, Aug 02, 2013 at 09:10:59PM +0700, Duy Nguyen wrote:
> On Fri, Aug 2, 2013 at 8:48 PM, Stefan Beller
> <stefanbeller@googlemail.com> wrote:
> > Hello,
> >
> > I'd like to rewrite the repack shell script in C.
> > So I tried the naive approach reading the man page and
> > the script itself and write C program by matching each block/line
> > of the script with a function in C
> >
> > ...
> >
> > So my question is, how you'd generally approach rewriting a
> > shell script in C.
> 
> Start a new process via start_command/run_command interface. It's
> safer to retain the process boundary at this stage. You can try to
> integrate further later.

I was in the middle of something and somehow read this as "rewriting
git-rebase.sh" :-X

For git-repack, because it ends with a single pack-objects call, we
might not need a new process after all (very much like tail call
optimization). This is what I got for some time, but still not polish
it for submission (and it may be a bit broken after the last
rebase). Maybe you can work off this, or from scratch if you think
it's too messy. It basically teaches pack-objects extra features that
repack needs, then make repack a wrapper of pack-objects.

-- 8< --
commit 25569a3958c3272b3eb5fa50dea680948f7a2768 (build-in-repack)
Author: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Date:   Wed Nov 9 19:21:39 2011 +0700

    Build in git-repack
    
    pack-objects learns a few more options to take over what's been done
    by git-repack.sh. cmd_repack() becomes a wrapper around
    cmd_pack_objects().

diff --git a/Makefile b/Makefile
index 0f931a2..b4010a6 100644
--- a/Makefile
+++ b/Makefile
@@ -460,7 +460,6 @@ SCRIPT_SH += git-mergetool.sh
 SCRIPT_SH += git-pull.sh
 SCRIPT_SH += git-quiltimport.sh
 SCRIPT_SH += git-rebase.sh
-SCRIPT_SH += git-repack.sh
 SCRIPT_SH += git-request-pull.sh
 SCRIPT_SH += git-stash.sh
 SCRIPT_SH += git-submodule.sh
@@ -584,6 +583,7 @@ BUILT_INS += git-init$X
 BUILT_INS += git-merge-subtree$X
 BUILT_INS += git-peek-remote$X
 BUILT_INS += git-repo-config$X
+BUILT_INS += git-repack$X
 BUILT_INS += git-show$X
 BUILT_INS += git-stage$X
 BUILT_INS += git-status$X
diff --git a/builtin.h b/builtin.h
index 64bab6b..feb958f 100644
--- a/builtin.h
+++ b/builtin.h
@@ -117,6 +117,7 @@ extern int cmd_reflog(int argc, const char **argv, const char *prefix);
 extern int cmd_remote(int argc, const char **argv, const char *prefix);
 extern int cmd_remote_ext(int argc, const char **argv, const char *prefix);
 extern int cmd_remote_fd(int argc, const char **argv, const char *prefix);
+extern int cmd_repack(int argc, const char **argv, const char *prefix);
 extern int cmd_repo_config(int argc, const char **argv, const char *prefix);
 extern int cmd_rerere(int argc, const char **argv, const char *prefix);
 extern int cmd_reset(int argc, const char **argv, const char *prefix);
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index f069462..1742ea1 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -18,10 +18,17 @@
 #include "refs.h"
 #include "streaming.h"
 #include "thread-utils.h"
+#include "sigchain.h"
 
 static const char *pack_usage[] = {
 	N_("git pack-objects --stdout [options...] [< ref-list | < object-list]"),
 	N_("git pack-objects [options...] base-name [< ref-list | < object-list]"),
+	N_("git pack-objects --repack [options...]"),
+	NULL
+};
+
+static char const * const repack_usage[] = {
+	N_("git repack [options]"),
 	NULL
 };
 
@@ -103,6 +110,15 @@ static struct object_entry *locate_object_entry(const unsigned char *sha1);
 static uint32_t written, written_delta;
 static uint32_t reused, reused_delta;
 
+#define REPACK_IN_PROGRESS		(1 << 0)
+#define REPACK_UPDATE_INFO		(1 << 1)
+#define REPACK_ALL_INTO_ONE		(1 << 2)
+#define REPACK_REMOVE_REDUNDANT		(1 << 3)
+
+static int repack_flags, nr_written_packs;
+static int repack_usedeltabaseoffset;
+static struct string_list written_packs;
+static struct string_list backup_files;
 
 static void *get_delta(struct object_entry *entry)
 {
@@ -792,9 +808,19 @@ static void write_pack_file(void)
 			snprintf(tmpname, sizeof(tmpname), "%s-", base_name);
 			finish_tmp_packfile(tmpname, pack_tmp_name,
 					    written_list, nr_written,
-					    &pack_idx_opts, sha1);
+					    &pack_idx_opts, sha1,
+					    repack_flags & REPACK_IN_PROGRESS ?
+					    &backup_files : NULL);
 			free(pack_tmp_name);
-			puts(sha1_to_hex(sha1));
+			if (repack_flags & REPACK_IN_PROGRESS) {
+				int len = strlen(tmpname);
+				char *s = xmalloc(len + 2);
+				memcpy(s, tmpname, len - 4);
+				memcpy(s + len - 4, ".pack", 6);
+				string_list_append(&written_packs, s);
+				nr_written_packs++;
+			} else
+				puts(sha1_to_hex(sha1));
 		}
 
 		/* mark written objects as written to previous pack */
@@ -2359,7 +2385,8 @@ static void get_object_list(int ac, const char **av)
 	save_commit_buffer = 0;
 	setup_revisions(ac, av, &revs, NULL);
 
-	while (fgets(line, sizeof(line), stdin) != NULL) {
+	while (!(repack_flags & REPACK_IN_PROGRESS) &&
+	       fgets(line, sizeof(line), stdin) != NULL) {
 		int len = strlen(line);
 		if (len && line[len - 1] == '\n')
 			line[--len] = 0;
@@ -2387,6 +2414,30 @@ static void get_object_list(int ac, const char **av)
 		loosen_unused_packed_objects(&revs);
 }
 
+static void rollback_repack(void)
+{
+	struct strbuf dst = STRBUF_INIT;
+	int i, ret;
+	for (i = 0; i < backup_files.nr; i++) {
+		const char *src = backup_files.items[i].string;
+		strbuf_addstr(&dst, src);
+		strbuf_setlen(&dst, dst.len - 4); /* remove .old */
+		ret = rename(src, dst.buf);
+		if (ret)
+			warning("failed to restore %s: %s", src, strerror(errno));
+		strbuf_setlen(&dst, 0);
+	}
+	strbuf_release(&dst);
+	string_list_clear(&backup_files, 0);
+}
+
+static void rollback_repack_on_signal(int signo)
+{
+	rollback_repack();
+	sigchain_pop(signo);
+	raise(signo);
+}
+
 static int option_parse_index_version(const struct option *opt,
 				      const char *arg, int unset)
 {
@@ -2436,11 +2487,12 @@ static int option_parse_ulong(const struct option *opt,
 
 int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 {
+	struct strbuf repack_base_name = STRBUF_INIT;
 	int use_internal_rev_list = 0;
 	int thin = 0;
 	int all_progress_implied = 0;
 	const char *rp_av[6];
-	int rp_ac = 0;
+	int i, rp_ac = 0;
 	int rev_list_unpacked = 0, rev_list_all = 0, rev_list_reflog = 0;
 	struct option pack_objects_options[] = {
 		OPT_SET_INT('q', "quiet", &progress,
@@ -2505,6 +2557,16 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 			    N_("pack compression level")),
 		OPT_SET_INT(0, "keep-true-parents", &grafts_replace_parents,
 			    N_("do not hide commits by grafts"), 0),
+
+		OPT_BIT(0, "repack", &repack_flags,
+			N_("repack mode"), REPACK_IN_PROGRESS),
+		OPT_BIT(0, "repack-all", &repack_flags,
+			N_("repack everything into one pack"), REPACK_ALL_INTO_ONE),
+		OPT_BIT(0, "remove-redundant", &repack_flags,
+			N_("remove redundant objects after repack"), REPACK_REMOVE_REDUNDANT),
+		OPT_BIT(0, "update-info", &repack_flags,
+			N_("run git-update-server-info after repack"), REPACK_UPDATE_INFO),
+
 		OPT_END(),
 	};
 
@@ -2556,6 +2618,36 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 	if (delta_search_threads != 1)
 		warning("no threads support, ignoring --threads");
 #endif
+	if ((repack_flags & REPACK_IN_PROGRESS) == 0 &&
+	    (repack_flags & ~REPACK_IN_PROGRESS))
+		die("--repack must be given for any repack related options");
+	if (repack_flags & REPACK_IN_PROGRESS) {
+		if (pack_to_stdout)
+			die("--stdout cannot be used with --repack");
+		if (argc)
+			die("base name cannot be used with --repack");
+
+		rp_av[rp_ac++] = "--all";
+		rp_av[rp_ac++] = "--reflog";
+		use_internal_rev_list = 1;
+
+		grafts_replace_parents = 0; /* --keep-true-parents */
+		ignore_packed_keep = 1;	    /* --honor-pack-keep */
+		non_empty = 1;		    /* --non-empty */
+
+		if (!(repack_flags & REPACK_ALL_INTO_ONE)) {
+			incremental = 1; /* --incremental */
+			rp_av[rp_ac++] = "--unpacked";
+		}
+
+		strbuf_addf(&repack_base_name,
+			    "%s/pack/pack", get_object_directory());
+		base_name = repack_base_name.buf;
+
+		sigchain_push_common(rollback_repack_on_signal);
+		atexit(rollback_repack);
+	}
+
 	if (!pack_to_stdout && !pack_size_limit)
 		pack_size_limit = pack_size_limit_cfg;
 	if (pack_to_stdout && pack_size_limit)
@@ -2598,5 +2690,184 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 		fprintf(stderr, "Total %"PRIu32" (delta %"PRIu32"),"
 			" reused %"PRIu32" (delta %"PRIu32")\n",
 			written, written_delta, reused, reused_delta);
+
+	if (!(repack_flags & REPACK_IN_PROGRESS))
+		return 0;
+
+	if (!nr_written_packs) {
+		printf(_("Nothing new to pack.\n"));
+		return 0;
+	}
+
+	/* At this point all new packs should be in place. We can
+	   safely remove old ones */
+	for (i = 0; i < backup_files.nr; i++) {
+		const char *s = backup_files.items[i].string;
+		int ret = unlink(s);
+		if (ret)
+			warning("failed to remove %s: %s", s, strerror(errno));
+	}
+	string_list_clear(&backup_files, 0);
+
+	if (repack_flags & REPACK_REMOVE_REDUNDANT) {
+		struct packed_git *p;
+		struct string_list to_be_removed = STRING_LIST_INIT_DUP;
+
+		/* free_pack_by_name() may have freed a few packs in
+		   write_pack_file() */
+		reprepare_packed_git();
+		for (p = packed_git; p; p = p->next) {
+			if (!p->pack_local || p->pack_keep)
+				continue;
+
+			for (i = 0; i < written_packs.nr; i++) {
+				char *s = written_packs.items[i].string;
+				if (!strcmp(s, p->pack_name))
+					break;
+			}
+			if (i < written_packs.nr)
+				continue;
+
+			string_list_append(&to_be_removed, p->pack_name);
+		}
+		written_packs.strdup_strings = 1;
+		string_list_clear(&written_packs, 0);
+
+		for (i = 0; i < to_be_removed.nr; i++) {
+			char *path = to_be_removed.items[i].string;
+
+			/* Windows limitation on unlink().
+			   See c74faea19e39ca933492f697596310397175c329 */
+			free_pack_by_name(path);
+
+			if (unlink(path))
+				warning("failed to remove %s: %s", path, strerror(errno));
+			strcpy(path + strlen(path)-5, ".idx");
+			if (unlink(path))
+				warning("failed to remove %s: %s", path, strerror(errno));
+		}
+		string_list_clear(&to_be_removed, 0);
+
+		reprepare_packed_git();
+		prune_packed_objects(progress ? PRUNE_PACKED_VERBOSE : 0);
+	}
+
+	if (repack_flags & REPACK_UPDATE_INFO)
+		update_server_info(0);
+
 	return 0;
 }
+
+static int repack_config(const char *k, const char *v, void *cb)
+{
+	if (!strcasecmp(k, "repack.usedeltabaseoffset")) {
+		repack_usedeltabaseoffset = git_config_bool(k, v);
+		return 0;
+	}
+	return git_default_config(k, v, cb);
+}
+
+int cmd_repack(int argc, const char **argv, const char *prefix)
+{
+	int all_in_one = 0;
+	int all_in_one_and_unreachable = 0;
+	int unpack_unreachable = 0;
+	int remove_redundant = 0;
+	int no_reuse_delta = 0;
+	int no_reuse_object = 0;
+	int no_update = 0;
+	int quiet = 0;
+	int local = 0;
+
+	struct option opts[] = {
+		OPT_BOOL('a', NULL, &all_in_one,
+			 "pack everything in a single pack"),
+		OPT_BOOL('A', NULL, &all_in_one_and_unreachable,
+			 "same as -a, and turn unreachable objects loose"),
+		OPT_BOOL('d', NULL, &remove_redundant,
+			 "remove redundant packs, and run git-prune-packed"),
+		OPT_BOOL('f', NULL, &no_reuse_delta,
+			 "pass --no-reuse-delta to git-pack-objects"),
+		OPT_BOOL('F', NULL, &no_reuse_object,
+			 "pass --no-reuse-object to git-pack-objects"),
+		OPT_BOOL('n', NULL, &no_update,
+			 "do not run git-update-server-info"),
+		OPT_BOOL('q', NULL, &quiet, "be quiet"),
+		OPT_BOOL('l', NULL, &local,
+			 "pass --local to git-pack-objects"),
+		{ OPTION_ARGUMENT, 0, "window", NULL, "n",
+		 "size of the window used for delta compression", 0 },
+		{ OPTION_ARGUMENT, 0, "window-memory", NULL, "n",
+		  "same as the above, but limit memory size instead of entries count", 0 },
+		{ OPTION_ARGUMENT, 0, "depth", NULL, "n",
+		  "limits the maximum delta depth", 0 },
+		{ OPTION_ARGUMENT, 0, "max-pack-size", NULL, "n",
+		  "maximum size of each packfile", 0},
+		OPT_END(),
+	};
+
+	const char *av[] = { "pack-objects", "--repack",
+			     NULL, /* --[no-]update-info */
+			     NULL, /* --delta-base-offset */
+			     NULL, /* --repack-all */
+			     NULL, /* --remove-redundant */
+			     NULL, /* --no-reuse-delta */
+			     NULL, /* --no-reuse-object */
+			     NULL, /* --local */
+			     NULL, /* -q */
+			     NULL, /* --unpack-unreachable */
+			     NULL, /* --window */
+			     NULL, /* --window-memory */
+			     NULL, /* --depth */
+			     NULL, /* --max-pack-size */
+			     NULL
+	};
+	int ac = 2;
+
+	git_config(repack_config, NULL);
+
+	argc = parse_options(argc, argv, prefix, opts, repack_usage, 0);
+
+	if (no_update)
+		av[ac++] = "--no-update-info";
+	else
+		av[ac++] = "--update-info";
+	if (repack_usedeltabaseoffset)
+		av[ac++] = "--delta-base-offset";
+	if (all_in_one_and_unreachable) {
+		av[ac++] = "--repack-all";
+		all_in_one = 1;
+		unpack_unreachable = 1;
+	}
+	if (all_in_one)
+		av[ac++] = "--repack-all";
+	if (remove_redundant)
+		av[ac++] = "--remove-redundant";
+	if (no_reuse_delta)
+		av[ac++] = "--no-reuse-delta";
+	if (no_reuse_object)
+		av[ac++] = "--no-reuse-object";
+	if (local)
+		av[ac++] = "--local";
+	if (quiet)
+		av[ac++] = "-q";
+	if ((ac + argc) * sizeof(*av) > sizeof(av))
+		die("Too many options");
+	memcpy(av + ac, argv, argc * sizeof(*argv));
+	ac += argc;
+
+	if (all_in_one && remove_redundant) {
+		struct packed_git *p;
+
+		prepare_packed_git();
+		for (p = packed_git; p; p = p->next) {
+			if (!p->pack_keep &&
+			    unpack_unreachable && remove_redundant) {
+				av[ac++] = "--unpack-unreachable";
+				break;
+			}
+		}
+	}
+
+	return cmd_pack_objects(ac, av, prefix);
+}
diff --git a/bulk-checkin.c b/bulk-checkin.c
index 6b0b6d4..3ca3c55 100644
--- a/bulk-checkin.c
+++ b/bulk-checkin.c
@@ -46,7 +46,7 @@ static void finish_bulk_checkin(struct bulk_checkin_state *state)
 	sprintf(packname, "%s/pack/pack-", get_object_directory());
 	finish_tmp_packfile(packname, state->pack_tmp_name,
 			    state->written, state->nr_written,
-			    &state->pack_idx_opts, sha1);
+			    &state->pack_idx_opts, sha1, NULL);
 	for (i = 0; i < state->nr_written; i++)
 		free(state->written[i]);
 
diff --git a/git-repack.sh b/contrib/examples/git-repack.sh
similarity index 100%
rename from git-repack.sh
rename to contrib/examples/git-repack.sh
diff --git a/git.c b/git.c
index 1ada169..db4e4b3 100644
--- a/git.c
+++ b/git.c
@@ -389,6 +389,7 @@ static void handle_internal_command(int argc, const char **argv)
 		{ "remote-ext", cmd_remote_ext },
 		{ "remote-fd", cmd_remote_fd },
 		{ "replace", cmd_replace, RUN_SETUP },
+		{ "repack", cmd_repack, RUN_SETUP },
 		{ "repo-config", cmd_repo_config, RUN_SETUP_GENTLY },
 		{ "rerere", cmd_rerere, RUN_SETUP },
 		{ "reset", cmd_reset, RUN_SETUP },
diff --git a/pack-write.c b/pack-write.c
index ca9e63b..e6aa7e3 100644
--- a/pack-write.c
+++ b/pack-write.c
@@ -1,6 +1,7 @@
 #include "cache.h"
 #include "pack.h"
 #include "csum-file.h"
+#include "string-list.h"
 
 void reset_pack_idx_option(struct pack_idx_option *opts)
 {
@@ -348,10 +349,12 @@ void finish_tmp_packfile(char *name_buffer,
 			 struct pack_idx_entry **written_list,
 			 uint32_t nr_written,
 			 struct pack_idx_option *pack_idx_opts,
-			 unsigned char sha1[])
+			 unsigned char sha1[],
+			 struct string_list *backup_files)
 {
 	const char *idx_tmp_name;
 	char *end_of_name_prefix = strrchr(name_buffer, 0);
+	struct stat st;
 
 	if (adjust_shared_perm(pack_tmp_name))
 		die_errno("unable to make temporary pack file readable");
@@ -368,6 +371,14 @@ void finish_tmp_packfile(char *name_buffer,
 		die_errno("unable to rename temporary pack file");
 
 	sprintf(end_of_name_prefix, "%s.idx", sha1_to_hex(sha1));
+	if (backup_files && !stat(name_buffer, &st)) {
+		struct strbuf old = STRBUF_INIT;
+		strbuf_addf(&old, "%s.old", name_buffer);
+		if (rename(name_buffer, old.buf))
+			die_errno("unable to rename pack %s", name_buffer);
+		string_list_append(backup_files, strbuf_detach(&old, NULL));
+	}
+	backup_file(name_buffer);
 	if (rename(idx_tmp_name, name_buffer))
 		die_errno("unable to rename temporary index file");
 
diff --git a/pack.h b/pack.h
index aa6ee7d..d3f92ad 100644
--- a/pack.h
+++ b/pack.h
@@ -90,7 +90,9 @@ extern int encode_in_pack_object_header(enum object_type, uintmax_t, unsigned ch
 #define PH_ERROR_PROTOCOL	(-3)
 extern int read_pack_header(int fd, struct pack_header *);
 
+struct string_list;
+
 extern struct sha1file *create_tmp_packfile(char **pack_tmp_name);
-extern void finish_tmp_packfile(char *name_buffer, const char *pack_tmp_name, struct pack_idx_entry **written_list, uint32_t nr_written, struct pack_idx_option *pack_idx_opts, unsigned char sha1[]);
+extern void finish_tmp_packfile(char *name_buffer, const char *pack_tmp_name, struct pack_idx_entry **written_list, uint32_t nr_written, struct pack_idx_option *pack_idx_opts, unsigned char sha1[], struct string_list *backup_files);
 
 #endif
-- 8< --

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: Rewriting git-repack.sh in C
  2013-08-02 14:10 ` Duy Nguyen
  2013-08-02 16:36   ` Duy Nguyen
@ 2013-08-03  6:33   ` Fredrik Gustafsson
  2013-08-03 10:03     ` Duy Nguyen
  1 sibling, 1 reply; 17+ messages in thread
From: Fredrik Gustafsson @ 2013-08-03  6:33 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Stefan Beller, GIT Mailing-list

On Fri, Aug 02, 2013 at 09:10:59PM +0700, Duy Nguyen wrote:
> > So my question is, how you'd generally approach rewriting a
> > shell script in C.
> 
> Start a new process via start_command/run_command interface. It's
> safer to retain the process boundary at this stage. You can try to
> integrate further later.

Is it really the right approach to just replace sh with C? IMHO forking
and wait for the result should not be done if it can be avoided. It just
adds overhead.

Of course you can argue that just replace sh with C is a good first step
towards to actually do the command in "full C". Altough I'm afraid that
that will get such low priority that it won't be done.

-- 
Med vänliga hälsningar
Fredrik Gustafsson

tel: 0733-608274
e-post: iveqy@iveqy.com

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Rewriting git-repack.sh in C
  2013-08-03  6:33   ` Fredrik Gustafsson
@ 2013-08-03 10:03     ` Duy Nguyen
  2013-08-07 14:00       ` [PATCH 0/4] " Stefan Beller
  0 siblings, 1 reply; 17+ messages in thread
From: Duy Nguyen @ 2013-08-03 10:03 UTC (permalink / raw)
  To: Fredrik Gustafsson; +Cc: Stefan Beller, GIT Mailing-list

On Sat, Aug 3, 2013 at 1:33 PM, Fredrik Gustafsson <iveqy@iveqy.com> wrote:
> On Fri, Aug 02, 2013 at 09:10:59PM +0700, Duy Nguyen wrote:
>> > So my question is, how you'd generally approach rewriting a
>> > shell script in C.
>>
>> Start a new process via start_command/run_command interface. It's
>> safer to retain the process boundary at this stage. You can try to
>> integrate further later.
>
> Is it really the right approach to just replace sh with C? IMHO forking
> and wait for the result should not be done if it can be avoided. It just
> adds overhead.

Agreed. As I said in another post, I misread this as rewriting
git-rebase.sh, which is a lot more complicated.

> Of course you can argue that just replace sh with C is a good first step
> towards to actually do the command in "full C". Altough I'm afraid that
> that will get such low priority that it won't be done.

One of the reasons I started porting git-repack.sh since 2011 and
never finished it.
-- 
Duy

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Rewriting git-repack.sh in C
  2013-08-02 13:48 Rewriting git-repack.sh in C Stefan Beller
  2013-08-02 14:10 ` Duy Nguyen
@ 2013-08-05 10:34 ` Matthieu Moy
  1 sibling, 0 replies; 17+ messages in thread
From: Matthieu Moy @ 2013-08-05 10:34 UTC (permalink / raw)
  To: Stefan Beller; +Cc: GIT Mailing-list

Stefan Beller <stefanbeller@googlemail.com> writes:

> Hello,
>
> I'd like to rewrite the repack shell script in C.
> So I tried the naive approach reading the man page and 
> the script itself and write C program by matching each block/line 
> of the script with a function in C

You should add one step here: check that tests are good, and possibly
add some to improve coverage. It's easier to add tests when you already
have a reference implementation.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 0/4] Re: Rewriting git-repack.sh in C
  2013-08-03 10:03     ` Duy Nguyen
@ 2013-08-07 14:00       ` Stefan Beller
  2013-08-07 14:00         ` [PATCH 1/4] Build in git-repack Stefan Beller
                           ` (3 more replies)
  0 siblings, 4 replies; 17+ messages in thread
From: Stefan Beller @ 2013-08-07 14:00 UTC (permalink / raw)
  To: pclouds, iveqy, git; +Cc: Stefan Beller

Hi Duy,

I applied your patch on the current master and added 3 patches, 
so git compiles and the testsuite works without additional breakages.

The functionality is obviously not yet completed as the backup_file
function is still empty. What do you intend that function will do?

Stefan

Nguyễn Thái Ngọc Duy (1):
  Build in git-repack

Stefan Beller (3):
  backup_file dummy function
  pack-objects: do not print usage when repacking
  repack: add unpack-unreachable

 Makefile                       |   2 +-
 builtin.h                      |   1 +
 builtin/pack-objects.c         | 284 ++++++++++++++++++++++++++++++++++++++++-
 bulk-checkin.c                 |   2 +-
 contrib/examples/git-repack.sh | 194 ++++++++++++++++++++++++++++
 git-repack.sh                  | 194 ----------------------------
 git.c                          |   1 +
 pack-write.c                   |  18 ++-
 pack.h                         |   4 +-
 9 files changed, 497 insertions(+), 203 deletions(-)
 create mode 100755 contrib/examples/git-repack.sh
 delete mode 100755 git-repack.sh

-- 
1.8.4.rc1.25.g2793d0a

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/4] Build in git-repack
  2013-08-07 14:00       ` [PATCH 0/4] " Stefan Beller
@ 2013-08-07 14:00         ` Stefan Beller
  2013-08-07 14:28           ` Matthieu Moy
  2013-08-07 14:00         ` [PATCH 2/4] backup_file dummy function Stefan Beller
                           ` (2 subsequent siblings)
  3 siblings, 1 reply; 17+ messages in thread
From: Stefan Beller @ 2013-08-07 14:00 UTC (permalink / raw)
  To: pclouds, iveqy, git

From: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>

pack-objects learns a few more options to take over what's been done
by git-repack.sh. cmd_repack() becomes a wrapper around
cmd_pack_objects().
---
 Makefile                       |   2 +-
 builtin.h                      |   1 +
 builtin/pack-objects.c         | 279 ++++++++++++++++++++++++++++++++++++++++-
 bulk-checkin.c                 |   2 +-
 contrib/examples/git-repack.sh | 194 ++++++++++++++++++++++++++++
 git-repack.sh                  | 194 ----------------------------
 git.c                          |   1 +
 pack-write.c                   |  13 +-
 pack.h                         |   4 +-
 9 files changed, 488 insertions(+), 202 deletions(-)
 create mode 100755 contrib/examples/git-repack.sh
 delete mode 100755 git-repack.sh

diff --git a/Makefile b/Makefile
index 3588ca1..8bd122b 100644
--- a/Makefile
+++ b/Makefile
@@ -464,7 +464,6 @@ SCRIPT_SH += git-pull.sh
 SCRIPT_SH += git-quiltimport.sh
 SCRIPT_SH += git-rebase.sh
 SCRIPT_SH += git-remote-testgit.sh
-SCRIPT_SH += git-repack.sh
 SCRIPT_SH += git-request-pull.sh
 SCRIPT_SH += git-stash.sh
 SCRIPT_SH += git-submodule.sh
@@ -596,6 +595,7 @@ BUILT_INS += git-get-tar-commit-id$X
 BUILT_INS += git-init$X
 BUILT_INS += git-merge-subtree$X
 BUILT_INS += git-peek-remote$X
+BUILT_INS += git-repack$X
 BUILT_INS += git-repo-config$X
 BUILT_INS += git-show$X
 BUILT_INS += git-stage$X
diff --git a/builtin.h b/builtin.h
index 8afa2de..b56cf07 100644
--- a/builtin.h
+++ b/builtin.h
@@ -102,6 +102,7 @@ extern int cmd_reflog(int argc, const char **argv, const char *prefix);
 extern int cmd_remote(int argc, const char **argv, const char *prefix);
 extern int cmd_remote_ext(int argc, const char **argv, const char *prefix);
 extern int cmd_remote_fd(int argc, const char **argv, const char *prefix);
+extern int cmd_repack(int argc, const char **argv, const char *prefix);
 extern int cmd_repo_config(int argc, const char **argv, const char *prefix);
 extern int cmd_rerere(int argc, const char **argv, const char *prefix);
 extern int cmd_reset(int argc, const char **argv, const char *prefix);
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index f069462..1742ea1 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -18,10 +18,17 @@
 #include "refs.h"
 #include "streaming.h"
 #include "thread-utils.h"
+#include "sigchain.h"
 
 static const char *pack_usage[] = {
 	N_("git pack-objects --stdout [options...] [< ref-list | < object-list]"),
 	N_("git pack-objects [options...] base-name [< ref-list | < object-list]"),
+	N_("git pack-objects --repack [options...]"),
+	NULL
+};
+
+static char const * const repack_usage[] = {
+	N_("git repack [options]"),
 	NULL
 };
 
@@ -103,6 +110,15 @@ static struct object_entry *locate_object_entry(const unsigned char *sha1);
 static uint32_t written, written_delta;
 static uint32_t reused, reused_delta;
 
+#define REPACK_IN_PROGRESS		(1 << 0)
+#define REPACK_UPDATE_INFO		(1 << 1)
+#define REPACK_ALL_INTO_ONE		(1 << 2)
+#define REPACK_REMOVE_REDUNDANT		(1 << 3)
+
+static int repack_flags, nr_written_packs;
+static int repack_usedeltabaseoffset;
+static struct string_list written_packs;
+static struct string_list backup_files;
 
 static void *get_delta(struct object_entry *entry)
 {
@@ -792,9 +808,19 @@ static void write_pack_file(void)
 			snprintf(tmpname, sizeof(tmpname), "%s-", base_name);
 			finish_tmp_packfile(tmpname, pack_tmp_name,
 					    written_list, nr_written,
-					    &pack_idx_opts, sha1);
+					    &pack_idx_opts, sha1,
+					    repack_flags & REPACK_IN_PROGRESS ?
+					    &backup_files : NULL);
 			free(pack_tmp_name);
-			puts(sha1_to_hex(sha1));
+			if (repack_flags & REPACK_IN_PROGRESS) {
+				int len = strlen(tmpname);
+				char *s = xmalloc(len + 2);
+				memcpy(s, tmpname, len - 4);
+				memcpy(s + len - 4, ".pack", 6);
+				string_list_append(&written_packs, s);
+				nr_written_packs++;
+			} else
+				puts(sha1_to_hex(sha1));
 		}
 
 		/* mark written objects as written to previous pack */
@@ -2359,7 +2385,8 @@ static void get_object_list(int ac, const char **av)
 	save_commit_buffer = 0;
 	setup_revisions(ac, av, &revs, NULL);
 
-	while (fgets(line, sizeof(line), stdin) != NULL) {
+	while (!(repack_flags & REPACK_IN_PROGRESS) &&
+	       fgets(line, sizeof(line), stdin) != NULL) {
 		int len = strlen(line);
 		if (len && line[len - 1] == '\n')
 			line[--len] = 0;
@@ -2387,6 +2414,30 @@ static void get_object_list(int ac, const char **av)
 		loosen_unused_packed_objects(&revs);
 }
 
+static void rollback_repack(void)
+{
+	struct strbuf dst = STRBUF_INIT;
+	int i, ret;
+	for (i = 0; i < backup_files.nr; i++) {
+		const char *src = backup_files.items[i].string;
+		strbuf_addstr(&dst, src);
+		strbuf_setlen(&dst, dst.len - 4); /* remove .old */
+		ret = rename(src, dst.buf);
+		if (ret)
+			warning("failed to restore %s: %s", src, strerror(errno));
+		strbuf_setlen(&dst, 0);
+	}
+	strbuf_release(&dst);
+	string_list_clear(&backup_files, 0);
+}
+
+static void rollback_repack_on_signal(int signo)
+{
+	rollback_repack();
+	sigchain_pop(signo);
+	raise(signo);
+}
+
 static int option_parse_index_version(const struct option *opt,
 				      const char *arg, int unset)
 {
@@ -2436,11 +2487,12 @@ static int option_parse_ulong(const struct option *opt,
 
 int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 {
+	struct strbuf repack_base_name = STRBUF_INIT;
 	int use_internal_rev_list = 0;
 	int thin = 0;
 	int all_progress_implied = 0;
 	const char *rp_av[6];
-	int rp_ac = 0;
+	int i, rp_ac = 0;
 	int rev_list_unpacked = 0, rev_list_all = 0, rev_list_reflog = 0;
 	struct option pack_objects_options[] = {
 		OPT_SET_INT('q', "quiet", &progress,
@@ -2505,6 +2557,16 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 			    N_("pack compression level")),
 		OPT_SET_INT(0, "keep-true-parents", &grafts_replace_parents,
 			    N_("do not hide commits by grafts"), 0),
+
+		OPT_BIT(0, "repack", &repack_flags,
+			N_("repack mode"), REPACK_IN_PROGRESS),
+		OPT_BIT(0, "repack-all", &repack_flags,
+			N_("repack everything into one pack"), REPACK_ALL_INTO_ONE),
+		OPT_BIT(0, "remove-redundant", &repack_flags,
+			N_("remove redundant objects after repack"), REPACK_REMOVE_REDUNDANT),
+		OPT_BIT(0, "update-info", &repack_flags,
+			N_("run git-update-server-info after repack"), REPACK_UPDATE_INFO),
+
 		OPT_END(),
 	};
 
@@ -2556,6 +2618,36 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 	if (delta_search_threads != 1)
 		warning("no threads support, ignoring --threads");
 #endif
+	if ((repack_flags & REPACK_IN_PROGRESS) == 0 &&
+	    (repack_flags & ~REPACK_IN_PROGRESS))
+		die("--repack must be given for any repack related options");
+	if (repack_flags & REPACK_IN_PROGRESS) {
+		if (pack_to_stdout)
+			die("--stdout cannot be used with --repack");
+		if (argc)
+			die("base name cannot be used with --repack");
+
+		rp_av[rp_ac++] = "--all";
+		rp_av[rp_ac++] = "--reflog";
+		use_internal_rev_list = 1;
+
+		grafts_replace_parents = 0; /* --keep-true-parents */
+		ignore_packed_keep = 1;	    /* --honor-pack-keep */
+		non_empty = 1;		    /* --non-empty */
+
+		if (!(repack_flags & REPACK_ALL_INTO_ONE)) {
+			incremental = 1; /* --incremental */
+			rp_av[rp_ac++] = "--unpacked";
+		}
+
+		strbuf_addf(&repack_base_name,
+			    "%s/pack/pack", get_object_directory());
+		base_name = repack_base_name.buf;
+
+		sigchain_push_common(rollback_repack_on_signal);
+		atexit(rollback_repack);
+	}
+
 	if (!pack_to_stdout && !pack_size_limit)
 		pack_size_limit = pack_size_limit_cfg;
 	if (pack_to_stdout && pack_size_limit)
@@ -2598,5 +2690,184 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 		fprintf(stderr, "Total %"PRIu32" (delta %"PRIu32"),"
 			" reused %"PRIu32" (delta %"PRIu32")\n",
 			written, written_delta, reused, reused_delta);
+
+	if (!(repack_flags & REPACK_IN_PROGRESS))
+		return 0;
+
+	if (!nr_written_packs) {
+		printf(_("Nothing new to pack.\n"));
+		return 0;
+	}
+
+	/* At this point all new packs should be in place. We can
+	   safely remove old ones */
+	for (i = 0; i < backup_files.nr; i++) {
+		const char *s = backup_files.items[i].string;
+		int ret = unlink(s);
+		if (ret)
+			warning("failed to remove %s: %s", s, strerror(errno));
+	}
+	string_list_clear(&backup_files, 0);
+
+	if (repack_flags & REPACK_REMOVE_REDUNDANT) {
+		struct packed_git *p;
+		struct string_list to_be_removed = STRING_LIST_INIT_DUP;
+
+		/* free_pack_by_name() may have freed a few packs in
+		   write_pack_file() */
+		reprepare_packed_git();
+		for (p = packed_git; p; p = p->next) {
+			if (!p->pack_local || p->pack_keep)
+				continue;
+
+			for (i = 0; i < written_packs.nr; i++) {
+				char *s = written_packs.items[i].string;
+				if (!strcmp(s, p->pack_name))
+					break;
+			}
+			if (i < written_packs.nr)
+				continue;
+
+			string_list_append(&to_be_removed, p->pack_name);
+		}
+		written_packs.strdup_strings = 1;
+		string_list_clear(&written_packs, 0);
+
+		for (i = 0; i < to_be_removed.nr; i++) {
+			char *path = to_be_removed.items[i].string;
+
+			/* Windows limitation on unlink().
+			   See c74faea19e39ca933492f697596310397175c329 */
+			free_pack_by_name(path);
+
+			if (unlink(path))
+				warning("failed to remove %s: %s", path, strerror(errno));
+			strcpy(path + strlen(path)-5, ".idx");
+			if (unlink(path))
+				warning("failed to remove %s: %s", path, strerror(errno));
+		}
+		string_list_clear(&to_be_removed, 0);
+
+		reprepare_packed_git();
+		prune_packed_objects(progress ? PRUNE_PACKED_VERBOSE : 0);
+	}
+
+	if (repack_flags & REPACK_UPDATE_INFO)
+		update_server_info(0);
+
 	return 0;
 }
+
+static int repack_config(const char *k, const char *v, void *cb)
+{
+	if (!strcasecmp(k, "repack.usedeltabaseoffset")) {
+		repack_usedeltabaseoffset = git_config_bool(k, v);
+		return 0;
+	}
+	return git_default_config(k, v, cb);
+}
+
+int cmd_repack(int argc, const char **argv, const char *prefix)
+{
+	int all_in_one = 0;
+	int all_in_one_and_unreachable = 0;
+	int unpack_unreachable = 0;
+	int remove_redundant = 0;
+	int no_reuse_delta = 0;
+	int no_reuse_object = 0;
+	int no_update = 0;
+	int quiet = 0;
+	int local = 0;
+
+	struct option opts[] = {
+		OPT_BOOL('a', NULL, &all_in_one,
+			 "pack everything in a single pack"),
+		OPT_BOOL('A', NULL, &all_in_one_and_unreachable,
+			 "same as -a, and turn unreachable objects loose"),
+		OPT_BOOL('d', NULL, &remove_redundant,
+			 "remove redundant packs, and run git-prune-packed"),
+		OPT_BOOL('f', NULL, &no_reuse_delta,
+			 "pass --no-reuse-delta to git-pack-objects"),
+		OPT_BOOL('F', NULL, &no_reuse_object,
+			 "pass --no-reuse-object to git-pack-objects"),
+		OPT_BOOL('n', NULL, &no_update,
+			 "do not run git-update-server-info"),
+		OPT_BOOL('q', NULL, &quiet, "be quiet"),
+		OPT_BOOL('l', NULL, &local,
+			 "pass --local to git-pack-objects"),
+		{ OPTION_ARGUMENT, 0, "window", NULL, "n",
+		 "size of the window used for delta compression", 0 },
+		{ OPTION_ARGUMENT, 0, "window-memory", NULL, "n",
+		  "same as the above, but limit memory size instead of entries count", 0 },
+		{ OPTION_ARGUMENT, 0, "depth", NULL, "n",
+		  "limits the maximum delta depth", 0 },
+		{ OPTION_ARGUMENT, 0, "max-pack-size", NULL, "n",
+		  "maximum size of each packfile", 0},
+		OPT_END(),
+	};
+
+	const char *av[] = { "pack-objects", "--repack",
+			     NULL, /* --[no-]update-info */
+			     NULL, /* --delta-base-offset */
+			     NULL, /* --repack-all */
+			     NULL, /* --remove-redundant */
+			     NULL, /* --no-reuse-delta */
+			     NULL, /* --no-reuse-object */
+			     NULL, /* --local */
+			     NULL, /* -q */
+			     NULL, /* --unpack-unreachable */
+			     NULL, /* --window */
+			     NULL, /* --window-memory */
+			     NULL, /* --depth */
+			     NULL, /* --max-pack-size */
+			     NULL
+	};
+	int ac = 2;
+
+	git_config(repack_config, NULL);
+
+	argc = parse_options(argc, argv, prefix, opts, repack_usage, 0);
+
+	if (no_update)
+		av[ac++] = "--no-update-info";
+	else
+		av[ac++] = "--update-info";
+	if (repack_usedeltabaseoffset)
+		av[ac++] = "--delta-base-offset";
+	if (all_in_one_and_unreachable) {
+		av[ac++] = "--repack-all";
+		all_in_one = 1;
+		unpack_unreachable = 1;
+	}
+	if (all_in_one)
+		av[ac++] = "--repack-all";
+	if (remove_redundant)
+		av[ac++] = "--remove-redundant";
+	if (no_reuse_delta)
+		av[ac++] = "--no-reuse-delta";
+	if (no_reuse_object)
+		av[ac++] = "--no-reuse-object";
+	if (local)
+		av[ac++] = "--local";
+	if (quiet)
+		av[ac++] = "-q";
+	if ((ac + argc) * sizeof(*av) > sizeof(av))
+		die("Too many options");
+	memcpy(av + ac, argv, argc * sizeof(*argv));
+	ac += argc;
+
+	if (all_in_one && remove_redundant) {
+		struct packed_git *p;
+
+		prepare_packed_git();
+		for (p = packed_git; p; p = p->next) {
+			if (!p->pack_keep &&
+			    unpack_unreachable && remove_redundant) {
+				av[ac++] = "--unpack-unreachable";
+				break;
+			}
+		}
+	}
+
+	return cmd_pack_objects(ac, av, prefix);
+}
diff --git a/bulk-checkin.c b/bulk-checkin.c
index 6b0b6d4..3ca3c55 100644
--- a/bulk-checkin.c
+++ b/bulk-checkin.c
@@ -46,7 +46,7 @@ static void finish_bulk_checkin(struct bulk_checkin_state *state)
 	sprintf(packname, "%s/pack/pack-", get_object_directory());
 	finish_tmp_packfile(packname, state->pack_tmp_name,
 			    state->written, state->nr_written,
-			    &state->pack_idx_opts, sha1);
+			    &state->pack_idx_opts, sha1, NULL);
 	for (i = 0; i < state->nr_written; i++)
 		free(state->written[i]);
 
diff --git a/contrib/examples/git-repack.sh b/contrib/examples/git-repack.sh
new file mode 100755
index 0000000..7579331
--- /dev/null
+++ b/contrib/examples/git-repack.sh
@@ -0,0 +1,194 @@
+#!/bin/sh
+#
+# Copyright (c) 2005 Linus Torvalds
+#
+
+OPTIONS_KEEPDASHDASH=
+OPTIONS_SPEC="\
+git repack [options]
+--
+a               pack everything in a single pack
+A               same as -a, and turn unreachable objects loose
+d               remove redundant packs, and run git-prune-packed
+f               pass --no-reuse-delta to git-pack-objects
+F               pass --no-reuse-object to git-pack-objects
+n               do not run git-update-server-info
+q,quiet         be quiet
+l               pass --local to git-pack-objects
+unpack-unreachable=  with -A, do not loosen objects older than this
+ Packing constraints
+window=         size of the window used for delta compression
+window-memory=  same as the above, but limit memory size instead of entries count
+depth=          limits the maximum delta depth
+max-pack-size=  maximum size of each packfile
+"
+SUBDIRECTORY_OK='Yes'
+. git-sh-setup
+
+no_update_info= all_into_one= remove_redundant= unpack_unreachable=
+local= no_reuse= extra=
+while test $# != 0
+do
+	case "$1" in
+	-n)	no_update_info=t ;;
+	-a)	all_into_one=t ;;
+	-A)	all_into_one=t
+		unpack_unreachable=--unpack-unreachable ;;
+	--unpack-unreachable)
+		unpack_unreachable="--unpack-unreachable=$2"; shift ;;
+	-d)	remove_redundant=t ;;
+	-q)	GIT_QUIET=t ;;
+	-f)	no_reuse=--no-reuse-delta ;;
+	-F)	no_reuse=--no-reuse-object ;;
+	-l)	local=--local ;;
+	--max-pack-size|--window|--window-memory|--depth)
+		extra="$extra $1=$2"; shift ;;
+	--) shift; break;;
+	*)	usage ;;
+	esac
+	shift
+done
+
+case "`git config --bool repack.usedeltabaseoffset || echo true`" in
+true)
+	extra="$extra --delta-base-offset" ;;
+esac
+
+PACKDIR="$GIT_OBJECT_DIRECTORY/pack"
+PACKTMP="$PACKDIR/.tmp-$$-pack"
+rm -f "$PACKTMP"-*
+trap 'rm -f "$PACKTMP"-*' 0 1 2 3 15
+
+# There will be more repacking strategies to come...
+case ",$all_into_one," in
+,,)
+	args='--unpacked --incremental'
+	;;
+,t,)
+	args= existing=
+	if [ -d "$PACKDIR" ]; then
+		for e in `cd "$PACKDIR" && find . -type f -name '*.pack' \
+			| sed -e 's/^\.\///' -e 's/\.pack$//'`
+		do
+			if [ -e "$PACKDIR/$e.keep" ]; then
+				: keep
+			else
+				existing="$existing $e"
+			fi
+		done
+		if test -n "$existing" -a -n "$unpack_unreachable" -a \
+			-n "$remove_redundant"
+		then
+			# This may have arbitrary user arguments, so we
+			# have to protect it against whitespace splitting
+			# when it gets run as "pack-objects $args" later.
+			# Fortunately, we know it's an approxidate, so we
+			# can just use dots instead.
+			args="$args $(echo "$unpack_unreachable" | tr ' ' .)"
+		fi
+	fi
+	;;
+esac
+
+mkdir -p "$PACKDIR" || exit
+
+args="$args $local ${GIT_QUIET:+-q} $no_reuse$extra"
+names=$(git pack-objects --keep-true-parents --honor-pack-keep --non-empty --all --reflog $args </dev/null "$PACKTMP") ||
+	exit 1
+if [ -z "$names" ]; then
+	say Nothing new to pack.
+fi
+
+# Ok we have prepared all new packfiles.
+
+# First see if there are packs of the same name and if so
+# if we can move them out of the way (this can happen if we
+# repacked immediately after packing fully.
+rollback=
+failed=
+for name in $names
+do
+	for sfx in pack idx
+	do
+		file=pack-$name.$sfx
+		test -f "$PACKDIR/$file" || continue
+		rm -f "$PACKDIR/old-$file" &&
+		mv "$PACKDIR/$file" "$PACKDIR/old-$file" || {
+			failed=t
+			break
+		}
+		rollback="$rollback $file"
+	done
+	test -z "$failed" || break
+done
+
+# If renaming failed for any of them, roll the ones we have
+# already renamed back to their original names.
+if test -n "$failed"
+then
+	rollback_failure=
+	for file in $rollback
+	do
+		mv "$PACKDIR/old-$file" "$PACKDIR/$file" ||
+		rollback_failure="$rollback_failure $file"
+	done
+	if test -n "$rollback_failure"
+	then
+		echo >&2 "WARNING: Some packs in use have been renamed by"
+		echo >&2 "WARNING: prefixing old- to their name, in order to"
+		echo >&2 "WARNING: replace them with the new version of the"
+		echo >&2 "WARNING: file.  But the operation failed, and"
+		echo >&2 "WARNING: attempt to rename them back to their"
+		echo >&2 "WARNING: original names also failed."
+		echo >&2 "WARNING: Please rename them in $PACKDIR manually:"
+		for file in $rollback_failure
+		do
+			echo >&2 "WARNING:   old-$file -> $file"
+		done
+	fi
+	exit 1
+fi
+
+# Now the ones with the same name are out of the way...
+fullbases=
+for name in $names
+do
+	fullbases="$fullbases pack-$name"
+	chmod a-w "$PACKTMP-$name.pack"
+	chmod a-w "$PACKTMP-$name.idx"
+	mv -f "$PACKTMP-$name.pack" "$PACKDIR/pack-$name.pack" &&
+	mv -f "$PACKTMP-$name.idx"  "$PACKDIR/pack-$name.idx" ||
+	exit
+done
+
+# Remove the "old-" files
+for name in $names
+do
+	rm -f "$PACKDIR/old-pack-$name.idx"
+	rm -f "$PACKDIR/old-pack-$name.pack"
+done
+
+# End of pack replacement.
+
+if test "$remove_redundant" = t
+then
+	# We know $existing are all redundant.
+	if [ -n "$existing" ]
+	then
+		( cd "$PACKDIR" &&
+		  for e in $existing
+		  do
+			case " $fullbases " in
+			*" $e "*) ;;
+			*)	rm -f "$e.pack" "$e.idx" "$e.keep" ;;
+			esac
+		  done
+		)
+	fi
+	git prune-packed ${GIT_QUIET:+-q}
+fi
+
+case "$no_update_info" in
+t) : ;;
+*) git update-server-info ;;
+esac
diff --git a/git-repack.sh b/git-repack.sh
deleted file mode 100755
index 7579331..0000000
--- a/git-repack.sh
+++ /dev/null
@@ -1,194 +0,0 @@
-#!/bin/sh
-#
-# Copyright (c) 2005 Linus Torvalds
-#
-
-OPTIONS_KEEPDASHDASH=
-OPTIONS_SPEC="\
-git repack [options]
---
-a               pack everything in a single pack
-A               same as -a, and turn unreachable objects loose
-d               remove redundant packs, and run git-prune-packed
-f               pass --no-reuse-delta to git-pack-objects
-F               pass --no-reuse-object to git-pack-objects
-n               do not run git-update-server-info
-q,quiet         be quiet
-l               pass --local to git-pack-objects
-unpack-unreachable=  with -A, do not loosen objects older than this
- Packing constraints
-window=         size of the window used for delta compression
-window-memory=  same as the above, but limit memory size instead of entries count
-depth=          limits the maximum delta depth
-max-pack-size=  maximum size of each packfile
-"
-SUBDIRECTORY_OK='Yes'
-. git-sh-setup
-
-no_update_info= all_into_one= remove_redundant= unpack_unreachable=
-local= no_reuse= extra=
-while test $# != 0
-do
-	case "$1" in
-	-n)	no_update_info=t ;;
-	-a)	all_into_one=t ;;
-	-A)	all_into_one=t
-		unpack_unreachable=--unpack-unreachable ;;
-	--unpack-unreachable)
-		unpack_unreachable="--unpack-unreachable=$2"; shift ;;
-	-d)	remove_redundant=t ;;
-	-q)	GIT_QUIET=t ;;
-	-f)	no_reuse=--no-reuse-delta ;;
-	-F)	no_reuse=--no-reuse-object ;;
-	-l)	local=--local ;;
-	--max-pack-size|--window|--window-memory|--depth)
-		extra="$extra $1=$2"; shift ;;
-	--) shift; break;;
-	*)	usage ;;
-	esac
-	shift
-done
-
-case "`git config --bool repack.usedeltabaseoffset || echo true`" in
-true)
-	extra="$extra --delta-base-offset" ;;
-esac
-
-PACKDIR="$GIT_OBJECT_DIRECTORY/pack"
-PACKTMP="$PACKDIR/.tmp-$$-pack"
-rm -f "$PACKTMP"-*
-trap 'rm -f "$PACKTMP"-*' 0 1 2 3 15
-
-# There will be more repacking strategies to come...
-case ",$all_into_one," in
-,,)
-	args='--unpacked --incremental'
-	;;
-,t,)
-	args= existing=
-	if [ -d "$PACKDIR" ]; then
-		for e in `cd "$PACKDIR" && find . -type f -name '*.pack' \
-			| sed -e 's/^\.\///' -e 's/\.pack$//'`
-		do
-			if [ -e "$PACKDIR/$e.keep" ]; then
-				: keep
-			else
-				existing="$existing $e"
-			fi
-		done
-		if test -n "$existing" -a -n "$unpack_unreachable" -a \
-			-n "$remove_redundant"
-		then
-			# This may have arbitrary user arguments, so we
-			# have to protect it against whitespace splitting
-			# when it gets run as "pack-objects $args" later.
-			# Fortunately, we know it's an approxidate, so we
-			# can just use dots instead.
-			args="$args $(echo "$unpack_unreachable" | tr ' ' .)"
-		fi
-	fi
-	;;
-esac
-
-mkdir -p "$PACKDIR" || exit
-
-args="$args $local ${GIT_QUIET:+-q} $no_reuse$extra"
-names=$(git pack-objects --keep-true-parents --honor-pack-keep --non-empty --all --reflog $args </dev/null "$PACKTMP") ||
-	exit 1
-if [ -z "$names" ]; then
-	say Nothing new to pack.
-fi
-
-# Ok we have prepared all new packfiles.
-
-# First see if there are packs of the same name and if so
-# if we can move them out of the way (this can happen if we
-# repacked immediately after packing fully.
-rollback=
-failed=
-for name in $names
-do
-	for sfx in pack idx
-	do
-		file=pack-$name.$sfx
-		test -f "$PACKDIR/$file" || continue
-		rm -f "$PACKDIR/old-$file" &&
-		mv "$PACKDIR/$file" "$PACKDIR/old-$file" || {
-			failed=t
-			break
-		}
-		rollback="$rollback $file"
-	done
-	test -z "$failed" || break
-done
-
-# If renaming failed for any of them, roll the ones we have
-# already renamed back to their original names.
-if test -n "$failed"
-then
-	rollback_failure=
-	for file in $rollback
-	do
-		mv "$PACKDIR/old-$file" "$PACKDIR/$file" ||
-		rollback_failure="$rollback_failure $file"
-	done
-	if test -n "$rollback_failure"
-	then
-		echo >&2 "WARNING: Some packs in use have been renamed by"
-		echo >&2 "WARNING: prefixing old- to their name, in order to"
-		echo >&2 "WARNING: replace them with the new version of the"
-		echo >&2 "WARNING: file.  But the operation failed, and"
-		echo >&2 "WARNING: attempt to rename them back to their"
-		echo >&2 "WARNING: original names also failed."
-		echo >&2 "WARNING: Please rename them in $PACKDIR manually:"
-		for file in $rollback_failure
-		do
-			echo >&2 "WARNING:   old-$file -> $file"
-		done
-	fi
-	exit 1
-fi
-
-# Now the ones with the same name are out of the way...
-fullbases=
-for name in $names
-do
-	fullbases="$fullbases pack-$name"
-	chmod a-w "$PACKTMP-$name.pack"
-	chmod a-w "$PACKTMP-$name.idx"
-	mv -f "$PACKTMP-$name.pack" "$PACKDIR/pack-$name.pack" &&
-	mv -f "$PACKTMP-$name.idx"  "$PACKDIR/pack-$name.idx" ||
-	exit
-done
-
-# Remove the "old-" files
-for name in $names
-do
-	rm -f "$PACKDIR/old-pack-$name.idx"
-	rm -f "$PACKDIR/old-pack-$name.pack"
-done
-
-# End of pack replacement.
-
-if test "$remove_redundant" = t
-then
-	# We know $existing are all redundant.
-	if [ -n "$existing" ]
-	then
-		( cd "$PACKDIR" &&
-		  for e in $existing
-		  do
-			case " $fullbases " in
-			*" $e "*) ;;
-			*)	rm -f "$e.pack" "$e.idx" "$e.keep" ;;
-			esac
-		  done
-		)
-	fi
-	git prune-packed ${GIT_QUIET:+-q}
-fi
-
-case "$no_update_info" in
-t) : ;;
-*) git update-server-info ;;
-esac
diff --git a/git.c b/git.c
index 2025f77..72d1dc9 100644
--- a/git.c
+++ b/git.c
@@ -397,6 +397,7 @@ static void handle_internal_command(int argc, const char **argv)
 		{ "remote-ext", cmd_remote_ext },
 		{ "remote-fd", cmd_remote_fd },
 		{ "replace", cmd_replace, RUN_SETUP },
+		{ "repack", cmd_repack, RUN_SETUP },
 		{ "repo-config", cmd_repo_config, RUN_SETUP_GENTLY },
 		{ "rerere", cmd_rerere, RUN_SETUP },
 		{ "reset", cmd_reset, RUN_SETUP },
diff --git a/pack-write.c b/pack-write.c
index ca9e63b..e6aa7e3 100644
--- a/pack-write.c
+++ b/pack-write.c
@@ -1,6 +1,7 @@
 #include "cache.h"
 #include "pack.h"
 #include "csum-file.h"
+#include "string-list.h"
 
 void reset_pack_idx_option(struct pack_idx_option *opts)
 {
@@ -348,10 +349,12 @@ void finish_tmp_packfile(char *name_buffer,
 			 struct pack_idx_entry **written_list,
 			 uint32_t nr_written,
 			 struct pack_idx_option *pack_idx_opts,
-			 unsigned char sha1[])
+			 unsigned char sha1[],
+			 struct string_list *backup_files)
 {
 	const char *idx_tmp_name;
 	char *end_of_name_prefix = strrchr(name_buffer, 0);
+	struct stat st;
 
 	if (adjust_shared_perm(pack_tmp_name))
 		die_errno("unable to make temporary pack file readable");
@@ -368,6 +371,14 @@ void finish_tmp_packfile(char *name_buffer,
 		die_errno("unable to rename temporary pack file");
 
 	sprintf(end_of_name_prefix, "%s.idx", sha1_to_hex(sha1));
+	if (backup_files && !stat(name_buffer, &st)) {
+		struct strbuf old = STRBUF_INIT;
+		strbuf_addf(&old, "%s.old", name_buffer);
+		if (rename(name_buffer, old.buf))
+			die_errno("unable to rename pack %s", name_buffer);
+		string_list_append(backup_files, strbuf_detach(&old, NULL));
+	}
+	backup_file(name_buffer);
 	if (rename(idx_tmp_name, name_buffer))
 		die_errno("unable to rename temporary index file");
 
diff --git a/pack.h b/pack.h
index aa6ee7d..d3f92ad 100644
--- a/pack.h
+++ b/pack.h
@@ -90,7 +90,9 @@ extern int encode_in_pack_object_header(enum object_type, uintmax_t, unsigned ch
 #define PH_ERROR_PROTOCOL	(-3)
 extern int read_pack_header(int fd, struct pack_header *);
 
+struct string_list;
+
 extern struct sha1file *create_tmp_packfile(char **pack_tmp_name);
-extern void finish_tmp_packfile(char *name_buffer, const char *pack_tmp_name, struct pack_idx_entry **written_list, uint32_t nr_written, struct pack_idx_option *pack_idx_opts, unsigned char sha1[]);
+extern void finish_tmp_packfile(char *name_buffer, const char *pack_tmp_name, struct pack_idx_entry **written_list, uint32_t nr_written, struct pack_idx_option *pack_idx_opts, unsigned char sha1[], struct string_list *backup_files);
 
 #endif
-- 
1.8.4.rc1.25.g2793d0a

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 2/4] backup_file dummy function
  2013-08-07 14:00       ` [PATCH 0/4] " Stefan Beller
  2013-08-07 14:00         ` [PATCH 1/4] Build in git-repack Stefan Beller
@ 2013-08-07 14:00         ` Stefan Beller
  2013-08-08  2:45           ` Duy Nguyen
  2013-08-07 14:00         ` [PATCH 3/4] pack-objects: do not print usage when repacking Stefan Beller
  2013-08-07 14:00         ` [PATCH 4/4] repack: add unpack-unreachable Stefan Beller
  3 siblings, 1 reply; 17+ messages in thread
From: Stefan Beller @ 2013-08-07 14:00 UTC (permalink / raw)
  To: pclouds, iveqy, git; +Cc: Stefan Beller

---
 pack-write.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/pack-write.c b/pack-write.c
index e6aa7e3..b728ea2 100644
--- a/pack-write.c
+++ b/pack-write.c
@@ -344,6 +344,11 @@ struct sha1file *create_tmp_packfile(char **pack_tmp_name)
 	return sha1fd(fd, *pack_tmp_name);
 }
 
+void backup_file(char *filename)
+{
+
+}
+
 void finish_tmp_packfile(char *name_buffer,
 			 const char *pack_tmp_name,
 			 struct pack_idx_entry **written_list,
-- 
1.8.4.rc1.25.g2793d0a

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 3/4] pack-objects: do not print usage when repacking
  2013-08-07 14:00       ` [PATCH 0/4] " Stefan Beller
  2013-08-07 14:00         ` [PATCH 1/4] Build in git-repack Stefan Beller
  2013-08-07 14:00         ` [PATCH 2/4] backup_file dummy function Stefan Beller
@ 2013-08-07 14:00         ` Stefan Beller
  2013-08-08  6:40           ` Antoine Pelisse
  2013-08-07 14:00         ` [PATCH 4/4] repack: add unpack-unreachable Stefan Beller
  3 siblings, 1 reply; 17+ messages in thread
From: Stefan Beller @ 2013-08-07 14:00 UTC (permalink / raw)
  To: pclouds, iveqy, git; +Cc: Stefan Beller

---
 builtin/pack-objects.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 1742ea1..0bd8f3b 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2585,7 +2585,7 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 		base_name = argv[0];
 		argc--;
 	}
-	if (pack_to_stdout != !base_name || argc)
+	if (!(repack_flags & REPACK_IN_PROGRESS) && (pack_to_stdout != !base_name || argc))
 		usage_with_options(pack_usage, pack_objects_options);
 
 	rp_av[rp_ac++] = "pack-objects";
-- 
1.8.4.rc1.25.g2793d0a

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 4/4] repack: add unpack-unreachable
  2013-08-07 14:00       ` [PATCH 0/4] " Stefan Beller
                           ` (2 preceding siblings ...)
  2013-08-07 14:00         ` [PATCH 3/4] pack-objects: do not print usage when repacking Stefan Beller
@ 2013-08-07 14:00         ` Stefan Beller
  3 siblings, 0 replies; 17+ messages in thread
From: Stefan Beller @ 2013-08-07 14:00 UTC (permalink / raw)
  To: pclouds, iveqy, git; +Cc: Stefan Beller

---
 builtin/pack-objects.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 0bd8f3b..0fe01fc 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2795,6 +2795,9 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 		OPT_BOOL('q', NULL, &quiet, "be quiet"),
 		OPT_BOOL('l', NULL, &local,
 			 "pass --local to git-pack-objects"),
+		{ OPTION_CALLBACK, 0, "unpack-unreachable", NULL, N_("time"),
+		  N_("unpack unreachable objects newer than <time>"),
+		  PARSE_OPT_OPTARG, option_parse_unpack_unreachable },
 		{ OPTION_ARGUMENT, 0, "window", NULL, "n",
 		 "size of the window used for delta compression", 0 },
 		{ OPTION_ARGUMENT, 0, "window-memory", NULL, "n",
-- 
1.8.4.rc1.25.g2793d0a

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] Build in git-repack
  2013-08-07 14:00         ` [PATCH 1/4] Build in git-repack Stefan Beller
@ 2013-08-07 14:28           ` Matthieu Moy
  2013-08-07 15:48             ` Junio C Hamano
  0 siblings, 1 reply; 17+ messages in thread
From: Matthieu Moy @ 2013-08-07 14:28 UTC (permalink / raw)
  To: Stefan Beller; +Cc: pclouds, iveqy, git

[ It's cool you're working on this, I'd really like a git-repack in C.
  That would fix this
  http://thread.gmane.org/gmane.comp.version-control.git/226458 ]

Stefan Beller <stefanbeller@googlemail.com> writes:

> From: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
>
> pack-objects learns a few more options to take over what's been done
> by git-repack.sh. cmd_repack() becomes a wrapper around
> cmd_pack_objects().

I think the patch would read easier if these were split into two
patches: one doing the real stuff in pack-objects, and then getting rid
of git-repack.sh to replace it with a trivial built-in.

Actually, I'm wondering why pack-objects requires so much changes.
git-repack.sh was already a relatively small wrapper around
pack-objects, and did not need the new options you add, so why are they
needed? In particular adding the new --update-info option that just does

> +	if (repack_flags & REPACK_UPDATE_INFO)
> +		update_server_info(0);

seems overkill to me: why don't you just let cmd_repack call
update_server_info(0)?

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] Build in git-repack
  2013-08-07 14:28           ` Matthieu Moy
@ 2013-08-07 15:48             ` Junio C Hamano
  2013-08-07 16:45               ` Stefan Beller
  2013-08-08  2:44               ` Duy Nguyen
  0 siblings, 2 replies; 17+ messages in thread
From: Junio C Hamano @ 2013-08-07 15:48 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Stefan Beller, pclouds, iveqy, git

Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> writes:

> [ It's cool you're working on this, I'd really like a git-repack in C.
>   That would fix this
>   http://thread.gmane.org/gmane.comp.version-control.git/226458 ]
>
> Stefan Beller <stefanbeller@googlemail.com> writes:
>
>> From: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
>>
>> pack-objects learns a few more options to take over what's been done
>> by git-repack.sh. cmd_repack() becomes a wrapper around
>> cmd_pack_objects().
>
> I think the patch would read easier if these were split into two
> patches: one doing the real stuff in pack-objects, and then getting rid
> of git-repack.sh to replace it with a trivial built-in.
>
> Actually, I'm wondering why pack-objects requires so much changes.
> git-repack.sh was already a relatively small wrapper around
> pack-objects, and did not need the new options you add, so why are they
> needed? In particular adding the new --update-info option that just does
>
>> +	if (repack_flags & REPACK_UPDATE_INFO)
>> +		update_server_info(0);
>
> seems overkill to me: why don't you just let cmd_repack call
> update_server_info(0)?

My feeling exactly.  I would rather see a patch that does not touch
pack-objects at all, and use run_command() interface to spawn it.
Once we do have to pack, the necessary processing cycle will dwarf
the fork/exec latency anyway, no?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] Build in git-repack
  2013-08-07 15:48             ` Junio C Hamano
@ 2013-08-07 16:45               ` Stefan Beller
  2013-08-08  2:44               ` Duy Nguyen
  1 sibling, 0 replies; 17+ messages in thread
From: Stefan Beller @ 2013-08-07 16:45 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Matthieu Moy, pclouds, iveqy, git

[-- Attachment #1: Type: text/plain, Size: 938 bytes --]

On 08/07/2013 05:48 PM, Junio C Hamano wrote:
> Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> writes:
>>
>> seems overkill to me: why don't you just let cmd_repack call
>> update_server_info(0)?
> 
> My feeling exactly.  I would rather see a patch that does not touch
> pack-objects at all, and use run_command() interface to spawn it.
> Once we do have to pack, the necessary processing cycle will dwarf
> the fork/exec latency anyway, no?
> 

Thanks for clarification. That was my initial idea as well, 
to not touch the pack-objects logic. 

However Duy send that patch (basically as is, 
I just made it apply again), 
and I guessed that I'd get to results
faster with an already existing approach.

I will reconsider and try to remove the additional logic from 
pack-objects again (so it will not get touched) and move it to the
repack itself. It is just a way to understand, 'what needs to be done'.

Stefan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 899 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] Build in git-repack
  2013-08-07 15:48             ` Junio C Hamano
  2013-08-07 16:45               ` Stefan Beller
@ 2013-08-08  2:44               ` Duy Nguyen
  1 sibling, 0 replies; 17+ messages in thread
From: Duy Nguyen @ 2013-08-08  2:44 UTC (permalink / raw)
  To: Junio C Hamano, Stefan Beller
  Cc: Matthieu Moy, Fredrik Gustafsson, Git Mailing List

On Wed, Aug 7, 2013 at 10:48 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> writes:
>
>> [ It's cool you're working on this, I'd really like a git-repack in C.
>>   That would fix this
>>   http://thread.gmane.org/gmane.comp.version-control.git/226458 ]
>>
>> Stefan Beller <stefanbeller@googlemail.com> writes:
>>
>>> From: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
>>>
>>> pack-objects learns a few more options to take over what's been done
>>> by git-repack.sh. cmd_repack() becomes a wrapper around
>>> cmd_pack_objects().
>>
>> I think the patch would read easier if these were split into two
>> patches: one doing the real stuff in pack-objects, and then getting rid
>> of git-repack.sh to replace it with a trivial built-in.
>>
>> Actually, I'm wondering why pack-objects requires so much changes.
>> git-repack.sh was already a relatively small wrapper around
>> pack-objects, and did not need the new options you add, so why are they
>> needed? In particular adding the new --update-info option that just does
>>
>>> +    if (repack_flags & REPACK_UPDATE_INFO)
>>> +            update_server_info(0);
>>
>> seems overkill to me: why don't you just let cmd_repack call
>> update_server_info(0)?
>
> My feeling exactly.  I would rather see a patch that does not touch
> pack-objects at all, and use run_command() interface to spawn it.
> Once we do have to pack, the necessary processing cycle will dwarf
> the fork/exec latency anyway, no?

I'm not opposed to run_command(). I think the reason I wanted to move
repack functionality to pack-objects is to avoid reading sha-1 from
pack-objects and reconstruct the paths again from the sha-1. But for
simplicity, perhaps we should not touch pack-objects at all. Then we
could have builtin/repack.c instead of stuffing cmd_repack in
builtin/pack-objects.c

@Stefan, if you want to push this work, feel free to take it as _your_
patch, rewrite as will. You don't need to retain my name.
-- 
Duy

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/4] backup_file dummy function
  2013-08-07 14:00         ` [PATCH 2/4] backup_file dummy function Stefan Beller
@ 2013-08-08  2:45           ` Duy Nguyen
  0 siblings, 0 replies; 17+ messages in thread
From: Duy Nguyen @ 2013-08-08  2:45 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Fredrik Gustafsson, Git Mailing List

On Wed, Aug 7, 2013 at 9:00 PM, Stefan Beller
<stefanbeller@googlemail.com> wrote:
> ---
>  pack-write.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/pack-write.c b/pack-write.c
> index e6aa7e3..b728ea2 100644
> --- a/pack-write.c
> +++ b/pack-write.c
> @@ -344,6 +344,11 @@ struct sha1file *create_tmp_packfile(char **pack_tmp_name)
>         return sha1fd(fd, *pack_tmp_name);
>  }
>
> +void backup_file(char *filename)
> +{
> +
> +}
> +

I think the function looks like this (before it got lost after rebase)

+static void backup_file(const char *path)
+{
+       struct stat st;
+       if (repack_flags & REPACK_IN_PROGRESS &&
+           !stat(path, &st)) {
+               struct strbuf old = STRBUF_INIT;
+               strbuf_addf(&old, "%s.old", path);
+               if (rename(path, old.buf))
+                       die_errno("unable to rename pack %s", path);
+               string_list_append(&backup_files, strbuf_detach(&old, NULL));
+       }
+}
+
-- 
Duy

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/4] pack-objects: do not print usage when repacking
  2013-08-07 14:00         ` [PATCH 3/4] pack-objects: do not print usage when repacking Stefan Beller
@ 2013-08-08  6:40           ` Antoine Pelisse
  0 siblings, 0 replies; 17+ messages in thread
From: Antoine Pelisse @ 2013-08-08  6:40 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Nguyễn Thái Ngọc Duy, iveqy, git

On Wed, Aug 7, 2013 at 4:00 PM, Stefan Beller
<stefanbeller@googlemail.com> wrote:
> ---
>  builtin/pack-objects.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
> index 1742ea1..0bd8f3b 100644
> --- a/builtin/pack-objects.c
> +++ b/builtin/pack-objects.c
> @@ -2585,7 +2585,7 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
>                 base_name = argv[0];
>                 argc--;
>         }
> -       if (pack_to_stdout != !base_name || argc)
> +       if (!(repack_flags & REPACK_IN_PROGRESS) && (pack_to_stdout != !base_name || argc))
>                 usage_with_options(pack_usage, pack_objects_options);
>
>         rp_av[rp_ac++] = "pack-objects";

Hello Stefan,
I'm not sure I understand that and why it's needed, would you mind
explain it to me ? (and/or maybe add it to the commit message)

Thanks,

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2013-08-08  6:40 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-08-02 13:48 Rewriting git-repack.sh in C Stefan Beller
2013-08-02 14:10 ` Duy Nguyen
2013-08-02 16:36   ` Duy Nguyen
2013-08-03  6:33   ` Fredrik Gustafsson
2013-08-03 10:03     ` Duy Nguyen
2013-08-07 14:00       ` [PATCH 0/4] " Stefan Beller
2013-08-07 14:00         ` [PATCH 1/4] Build in git-repack Stefan Beller
2013-08-07 14:28           ` Matthieu Moy
2013-08-07 15:48             ` Junio C Hamano
2013-08-07 16:45               ` Stefan Beller
2013-08-08  2:44               ` Duy Nguyen
2013-08-07 14:00         ` [PATCH 2/4] backup_file dummy function Stefan Beller
2013-08-08  2:45           ` Duy Nguyen
2013-08-07 14:00         ` [PATCH 3/4] pack-objects: do not print usage when repacking Stefan Beller
2013-08-08  6:40           ` Antoine Pelisse
2013-08-07 14:00         ` [PATCH 4/4] repack: add unpack-unreachable Stefan Beller
2013-08-05 10:34 ` Rewriting git-repack.sh in C Matthieu Moy

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).