From: "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
To: git@vger.kernel.org
Cc: "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
Subject: [PATCH v4 08/28] shallow.c: the 8 steps to select new commits for .git/shallow
Date: Thu,  5 Dec 2013 20:02:35 +0700
Message-ID: <1386248575-10206-9-git-send-email-pclouds@gmail.com>
In-Reply-To: <1386248575-10206-1-git-send-email-pclouds@gmail.com>

Suppose a fetch or push is requested between two shallow repositories
(with no history deepening or shortening). A pack that contains
necessary objects is transferred over together with .git/shallow of
the sender. The receiver has to determine whether it needs to update
.git/shallow if new refs need new shallow commits.

The rule here is to avoid updating .git/shallow by default. But we don't
want to waste the received pack. If the pack contains two refs, one
needs new shallow commits installed in .git/shallow and one does not,
we keep the latter and reject/warn about the former.

Even if updating .git/shallow is allowed, we only add the shallow
commits strictly necessary for the former ref (remember the sender can
send more shallow commits than necessary) and take care not to
accidentally cut the receiver's history short (no history shortening
is asked for).

So the steps to figure out which refs need which new shallow commits are:

1. Split the sender's shallow commit list into "ours" and "theirs"
   lists using has_sha1_file. Those that exist in the current repo go
   in "ours", the rest in "theirs".

2. Check the receiver .git/shallow, remove from "ours" the ones that
   also exist in .git/shallow.

3. Fetch the new pack. Either install or unpack it.

4. Do has_sha1_file on "theirs" list again. Drop the ones that fail
   has_sha1_file. Obviously the new pack does not need them.

5. If the pack is kept, remove from "ours" the ones that do not exist
   in the new pack.

6. Walk the new refs to answer the question "what shallow commits,
   both ours and theirs, are required in .git/shallow in order to add
   this ref?". Shallow commits not associated with any ref are removed
   from their respective lists.

7. (*) Check reachability (from the current refs) of all remaining
   commits in "ours". Those reachable are removed. We do not want to
   cut any part of our (reachable) history. We only walk commits
   here. The true reachability test is done by
   check_everything_connected() at the end as usual.

8. Combine the final "ours" and "theirs" and add them all to
   .git/shallow. Install new refs. The case where some hook rejects
   some refs on a push is explained in more detail in the push
   patches.
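
The cheap steps (1, 2 and 4) can be illustrated on toy data. The
following is a minimal sketch, not the code in this patch: commit ids
are plain strings, and struct id_set, id_set_has(), struct split,
split_shallow() and keep_existing() are hypothetical names standing in
for the object store, has_sha1_file() and the "ours"/"theirs" lists.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an object store or .git/shallow: a list of ids. */
struct id_set {
	const char **ids;
	size_t nr;
};

/* Toy replacement for has_sha1_file(): linear membership test. */
static int id_set_has(const struct id_set *set, const char *id)
{
	size_t i;
	for (i = 0; i < set->nr; i++)
		if (!strcmp(set->ids[i], id))
			return 1;
	return 0;
}

/* Both lists hold indexes into the sender's shallow array, not copies. */
struct split {
	int ours[16], nr_ours;
	int theirs[16], nr_theirs;
};

/*
 * Steps 1 and 2: commits we lack go to "theirs"; commits we have go
 * to "ours" unless our own shallow list already contains them.
 */
static void split_shallow(struct split *s, const char **sender, int nr,
			  const struct id_set *objects,
			  const struct id_set *our_shallow)
{
	int i;
	assert(nr <= 16); /* toy capacity limit of this sketch */
	s->nr_ours = s->nr_theirs = 0;
	for (i = 0; i < nr; i++) {
		if (!id_set_has(objects, sender[i]))
			s->theirs[s->nr_theirs++] = i;
		else if (!id_set_has(our_shallow, sender[i]))
			s->ours[s->nr_ours++] = i;
	}
}

/*
 * Step 4: compact the list in place, keeping only entries whose
 * commit now exists in "objects". Returns the new length.
 */
static int keep_existing(int *list, int nr, const char **sender,
			 const struct id_set *objects)
{
	int i, dst = 0;
	for (i = 0; i < nr; i++)
		if (id_set_has(objects, sender[list[i]]))
			list[dst++] = list[i];
	return dst;
}
```

Storing indexes rather than copies of the ids keeps both lists tied to
the sender's original array, which is also how struct shallow_info in
this patch is laid out. Step 5 would reuse the same in-place compaction
with a "is it in the new pack" test instead.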

Of these steps, #6 and #7 are expensive. Both require walking through
some commits, or in the worst case all commits. We would rather avoid
them, at least in the common case where the transferred pack does not
contain any shallow commits that the sender advertises. Let's look at
each scenario:

1) the sender has longer history than the receiver

   All shallow commits from the sender will be put into the "theirs"
   list at step 1 because none of them exist in the current repo. In
   the common case, "theirs" becomes empty at step 4 and we exit early.

2) the sender has shorter history than the receiver

   All shallow commits from the sender are likely in the "ours" list
   at step 1. In the common case, if the new pack is kept, we can
   empty "ours" and exit early at step 5.

   If the pack is not kept, we hit the expensive step 6, then exit
   after "ours" is emptied. There will be only a handful of objects to
   walk in the fast-forward case. If it is a forced update, we may
   need to walk to the bottom.

3) the sender has same .git/shallow as the receiver

   This is similar to case 2 except that "ours" should be emptied at
   step 2, so we exit early.

A fetch after "clone --depth=X" is case 1. A fetch after "clone" (from
a shallow repo) is case 3. Luckily they're cheap for the common case.
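
The early exits in the three cases can be summarized as a small
decision function. This is only a sketch under assumptions: struct
counts and early_exit_step() are hypothetical names, and the list
sizes after each step are passed in as plain numbers instead of being
computed from a repository.

```c
#include <assert.h>

/* Hypothetical summary of list sizes after the cheap steps. */
struct counts {
	int ours_after_step2;   /* "ours" left after steps 1 and 2 */
	int theirs_after_step1; /* "theirs" after the initial split */
	int theirs_after_step4; /* "theirs" after the pack arrived */
	int pack_kept;          /* non-zero if the pack was kept */
	int ours_after_step5;   /* only meaningful if pack_kept */
};

/*
 * Returns the step after which both lists are empty and the caller
 * can stop, or 6 if the expensive walk cannot be avoided.
 */
static int early_exit_step(const struct counts *c)
{
	/* Step 4 only removes entries, it never adds any. */
	assert(c->theirs_after_step4 <= c->theirs_after_step1);

	if (!c->ours_after_step2 && !c->theirs_after_step1)
		return 2; /* case 3: same .git/shallow on both sides */
	if (!c->ours_after_step2 && !c->theirs_after_step4)
		return 4; /* case 1: sender has longer history */
	if (c->theirs_after_step4)
		return 6; /* new shallow commits arrived, must walk */
	if (c->pack_kept && !c->ours_after_step5)
		return 5; /* case 2 with a kept pack */
	return 6;         /* case 2 with an unpacked pack */
}
```

For example, a fetch after "clone --depth=X" in the common case has an
empty "ours" and a "theirs" that drains at step 4, so the function
returns 4 and no commit walk is needed.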

A push from "clone --depth=X" falls into case 2, which is expensive.
Some more work may be done at the sender/client side to avoid more
work on the server side: if the transferred pack does not contain any
shallow commits, send-pack should not send any shallow commits to
receive-pack, effectively turning it into a normal push and avoiding
all steps.
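
A rough sketch of that sender-side idea, which this patch does not
implement: pack_has_shallow() and nr_shallow_to_send() are hypothetical
helpers, and the pack contents are modelled as a plain array of id
strings rather than a real pack index.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if any of the sender's shallow commits is in the pack. */
static int pack_has_shallow(const char **shallow, int nr_shallow,
			    const char **pack, int nr_pack)
{
	int i, j;
	assert(nr_shallow >= 0 && nr_pack >= 0);
	for (i = 0; i < nr_shallow; i++)
		for (j = 0; j < nr_pack; j++)
			if (!strcmp(shallow[i], pack[j]))
				return 1;
	return 0;
}

/*
 * Number of shallow commits the sender would announce: all of them
 * if the pack touches any, none otherwise (a normal push).
 */
static int nr_shallow_to_send(const char **shallow, int nr_shallow,
			      const char **pack, int nr_pack)
{
	return pack_has_shallow(shallow, nr_shallow, pack, nr_pack)
		? nr_shallow : 0;
}
```

When the result is 0 the receiver sees no shallow lines at all, so its
"ours" and "theirs" lists start out empty and none of the eight steps
has any work to do.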

This patch implements all steps except #3, which is already handled by
fetch-pack and receive-pack, and #6 and #7, which have their own patch
due to their size.

(*) In previous versions step 7 was put before step 3. I reordered it
    so that the common case that keeps the pack does not need to walk
    commits at all. In the future, if we implement a faster commit
    reachability check (maybe with the help of pack bitmaps or a commit
    cache), step 7 could become cheap and be moved up before 6 again.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 cache.h   |  2 ++
 commit.h  | 15 +++++++++++++
 shallow.c | 72 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 trace.c   |  2 +-
 4 files changed, 90 insertions(+), 1 deletion(-)

diff --git a/cache.h b/cache.h
index ce377e1..55dd4e3 100644
--- a/cache.h
+++ b/cache.h
@@ -1236,6 +1236,8 @@ __attribute__((format (printf, 2, 3)))
 extern void trace_argv_printf(const char **argv, const char *format, ...);
 extern void trace_repo_setup(const char *prefix);
 extern int trace_want(const char *key);
+__attribute__((format (printf, 2, 3)))
+extern void trace_printf_key(const char *key, const char *fmt, ...);
 extern void trace_strbuf(const char *key, const struct strbuf *buf);
 
 void packet_trace_identity(const char *prog);
diff --git a/commit.h b/commit.h
index 1faf717..9ead93b 100644
--- a/commit.h
+++ b/commit.h
@@ -193,6 +193,8 @@ extern struct commit_list *get_octopus_merge_bases(struct commit_list *in);
 /* largest positive number a signed 32-bit integer can contain */
 #define INFINITE_DEPTH 0x7fffffff
 
+struct sha1_array;
+struct ref;
 extern int register_shallow(const unsigned char *sha1);
 extern int unregister_shallow(const unsigned char *sha1);
 extern int for_each_commit_graft(each_commit_graft_fn, void *);
@@ -209,6 +211,19 @@ extern void setup_alternate_shallow(struct lock_file *shallow_lock,
 extern char *setup_temporary_shallow(const struct sha1_array *extra);
 extern void advertise_shallow_grafts(int);
 
+struct shallow_info {
+	struct sha1_array *shallow;
+	int *ours, nr_ours;
+	int *theirs, nr_theirs;
+	struct sha1_array *ref;
+};
+
+extern void prepare_shallow_info(struct shallow_info *, struct sha1_array *);
+extern void clear_shallow_info(struct shallow_info *);
+extern void remove_nonexistent_theirs_shallow(struct shallow_info *);
+extern void remove_nonexistent_ours_in_pack(struct shallow_info *,
+					    struct packed_git *);
+
 int is_descendant_of(struct commit *, struct commit_list *);
 int in_merge_bases(struct commit *, struct commit *);
 int in_merge_bases_many(struct commit *, int, struct commit **);
diff --git a/shallow.c b/shallow.c
index f9d1633..a6547ca 100644
--- a/shallow.c
+++ b/shallow.c
@@ -2,6 +2,12 @@
 #include "commit.h"
 #include "tag.h"
 #include "pkt-line.h"
+#include "remote.h"
+#include "refs.h"
+#include "sha1-array.h"
+#include "diff.h"
+#include "revision.h"
+#include "commit-slab.h"
 
 static int is_shallow = -1;
 static struct stat shallow_stat;
@@ -245,3 +251,69 @@ void advertise_shallow_grafts(int fd)
 		return;
 	for_each_commit_graft(advertise_shallow_grafts_cb, &fd);
 }
+
+#define TRACE_KEY "GIT_TRACE_SHALLOW"
+
+/*
+ * Step 1, split sender shallow commits into "ours" and "theirs"
+ * Step 2, clean "ours" based on .git/shallow
+ */
+void prepare_shallow_info(struct shallow_info *info, struct sha1_array *sa)
+{
+	int i;
+	trace_printf_key(TRACE_KEY, "shallow: prepare_shallow_info\n");
+	memset(info, 0, sizeof(*info));
+	info->shallow = sa;
+	if (!sa)
+		return;
+	info->ours = xmalloc(sizeof(*info->ours) * sa->nr);
+	info->theirs = xmalloc(sizeof(*info->theirs) * sa->nr);
+	for (i = 0; i < sa->nr; i++) {
+		if (has_sha1_file(sa->sha1[i])) {
+			struct commit_graft *graft;
+			graft = lookup_commit_graft(sa->sha1[i]);
+			if (graft && graft->nr_parent < 0)
+				continue;
+			info->ours[info->nr_ours++] = i;
+		} else
+			info->theirs[info->nr_theirs++] = i;
+	}
+}
+
+void clear_shallow_info(struct shallow_info *info)
+{
+	free(info->ours);
+	free(info->theirs);
+}
+
+/* Step 4, remove non-existent ones in "theirs" after getting the pack */
+
+void remove_nonexistent_theirs_shallow(struct shallow_info *info)
+{
+	unsigned char (*sha1)[20] = info->shallow->sha1;
+	int i, dst;
+	trace_printf_key(TRACE_KEY, "shallow: remove_nonexistent_theirs_shallow\n");
+	for (i = dst = 0; i < info->nr_theirs; i++) {
+		if (i != dst)
+			info->theirs[dst] = info->theirs[i];
+		if (has_sha1_file(sha1[info->theirs[i]]))
+			dst++;
+	}
+	info->nr_theirs = dst;
+}
+
+/* Step 5, remove non-existent ones in "ours" in the pack */
+void remove_nonexistent_ours_in_pack(struct shallow_info *info,
+				     struct packed_git *p)
+{
+	unsigned char (*sha1)[20] = info->shallow->sha1;
+	int i, dst;
+	trace_printf_key(TRACE_KEY, "shallow: remove_nonexistent_ours_in_pack\n");
+	for (i = dst = 0; i < info->nr_ours; i++) {
+		if (i != dst)
+			info->ours[dst] = info->ours[i];
+		if (find_pack_entry_one(sha1[info->ours[i]], p))
+			dst++;
+	}
+	info->nr_ours = dst;
+}
diff --git a/trace.c b/trace.c
index 3d744d1..08180a9 100644
--- a/trace.c
+++ b/trace.c
@@ -76,7 +76,7 @@ static void trace_vprintf(const char *key, const char *fmt, va_list ap)
 }
 
 __attribute__((format (printf, 2, 3)))
-static void trace_printf_key(const char *key, const char *fmt, ...)
+void trace_printf_key(const char *key, const char *fmt, ...)
 {
 	va_list ap;
 	va_start(ap, fmt);
-- 
1.8.5.1.25.g8667982
