git@vger.kernel.org mailing list mirror (one of many)
* [RFC PATCH 00/10] An attempt to move packfile funcs to its own file
@ 2017-08-08 19:32 Jonathan Tan
  2017-08-08 19:32 ` [RFC PATCH 01/10] pack: move pack name-related functions Jonathan Tan
                   ` (60 more replies)
  0 siblings, 61 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-08 19:32 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

While investigating annotating packfiles and loose objects to support
connectivity checks in partial clones [1], I decided to make the effort
to separate packfile-related code from sha1_file.c into its own file, to
make it easier to both code such changes and review them. Here is the
beginning of those efforts.

Is this worth doing, and if so, is this the right direction?

[1] https://public-inbox.org/git/20170804145113.5ceafafa@twelve2.svl.corp.google.com/

Jonathan Tan (10):
  pack: move pack name-related functions
  pack: move static state variables
  pack: move pack_report()
  pack: move open_pack_index(), parse_pack_index()
  pack: move release_pack_memory()
  pack: move pack-closing functions
  pack: move use_pack()
  pack: move unuse_pack()
  pack: move add_packed_git()
  pack: move install_packed_git()

 Makefile                 |   1 +
 builtin/am.c             |   1 +
 builtin/clone.c          |   1 +
 builtin/count-objects.c  |   1 +
 builtin/fetch.c          |   1 +
 builtin/merge.c          |   1 +
 builtin/pack-redundant.c |   1 +
 cache.h                  |  45 ----
 connected.c              |   1 +
 git-compat-util.h        |   2 -
 pack.c                   | 669 +++++++++++++++++++++++++++++++++++++++++++++++
 pack.h                   |  51 ++++
 sha1_file.c              | 667 ----------------------------------------------
 sha1_name.c              |   1 +
 streaming.c              |   1 +
 15 files changed, 730 insertions(+), 714 deletions(-)
 create mode 100644 pack.c

-- 
2.14.0.434.g98096fd7a8-goog



* [RFC PATCH 01/10] pack: move pack name-related functions
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
@ 2017-08-08 19:32 ` Jonathan Tan
  2017-08-08 20:36   ` Stefan Beller
  2017-08-08 19:32 ` [RFC PATCH 02/10] pack: move static state variables Jonathan Tan
                   ` (59 subsequent siblings)
  60 siblings, 1 reply; 88+ messages in thread
From: Jonathan Tan @ 2017-08-08 19:32 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

Currently, sha1_file.c and cache.h contain many functions, both related
to and unrelated to packfiles. This makes both files very large and
causes an unclear separation of concerns.

Create a new file, pack.c, to hold all packfile-related functions
currently in sha1_file.c, and designate the existing pack.h to hold
their declarations.

In this commit, the pack name-related functions are moved. Subsequent
commits will move the other functions.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 Makefile                 |  1 +
 builtin/pack-redundant.c |  1 +
 cache.h                  | 23 -----------------------
 pack.c                   | 23 +++++++++++++++++++++++
 pack.h                   | 23 +++++++++++++++++++++++
 sha1_file.c              | 22 ----------------------
 6 files changed, 48 insertions(+), 45 deletions(-)
 create mode 100644 pack.c

diff --git a/Makefile b/Makefile
index 461c845d3..a7b901a18 100644
--- a/Makefile
+++ b/Makefile
@@ -816,6 +816,7 @@ LIB_OBJS += notes-merge.o
 LIB_OBJS += notes-utils.o
 LIB_OBJS += object.o
 LIB_OBJS += oidset.o
+LIB_OBJS += pack.o
 LIB_OBJS += pack-bitmap.o
 LIB_OBJS += pack-bitmap-write.o
 LIB_OBJS += pack-check.o
diff --git a/builtin/pack-redundant.c b/builtin/pack-redundant.c
index cb1df1c76..df36d10e7 100644
--- a/builtin/pack-redundant.c
+++ b/builtin/pack-redundant.c
@@ -7,6 +7,7 @@
 */
 
 #include "builtin.h"
+#include "pack.h"
 
 #define BLKSIZE 512
 
diff --git a/cache.h b/cache.h
index 71fe09264..1f0f47819 100644
--- a/cache.h
+++ b/cache.h
@@ -902,20 +902,6 @@ extern void check_repository_format(void);
  */
 extern const char *sha1_file_name(const unsigned char *sha1);
 
-/*
- * Return the name of the (local) packfile with the specified sha1 in
- * its name.  The return value is a pointer to memory that is
- * overwritten each time this function is called.
- */
-extern char *sha1_pack_name(const unsigned char *sha1);
-
-/*
- * Return the name of the (local) pack index file with the specified
- * sha1 in its name.  The return value is a pointer to memory that is
- * overwritten each time this function is called.
- */
-extern char *sha1_pack_index_name(const unsigned char *sha1);
-
 /*
  * Return an abbreviated sha1 unique within this repository's object database.
  * The result will be at least `len` characters long, and will be NUL
@@ -1648,15 +1634,6 @@ extern void pack_report(void);
  */
 extern int odb_mkstemp(struct strbuf *template, const char *pattern);
 
-/*
- * Generate the filename to be used for a pack file with checksum "sha1" and
- * extension "ext". The result is written into the strbuf "buf", overwriting
- * any existing contents. A pointer to buf->buf is returned as a convenience.
- *
- * Example: odb_pack_name(out, sha1, "idx") => ".git/objects/pack/pack-1234..idx"
- */
-extern char *odb_pack_name(struct strbuf *buf, const unsigned char *sha1, const char *ext);
-
 /*
  * Create a pack .keep file named "name" (which should generally be the output
  * of odb_pack_name). Returns a file descriptor opened for writing, or -1 on
diff --git a/pack.c b/pack.c
new file mode 100644
index 000000000..0d191dfd6
--- /dev/null
+++ b/pack.c
@@ -0,0 +1,23 @@
+#include "cache.h"
+
+char *odb_pack_name(struct strbuf *buf,
+		    const unsigned char *sha1,
+		    const char *ext)
+{
+	strbuf_reset(buf);
+	strbuf_addf(buf, "%s/pack/pack-%s.%s", get_object_directory(),
+		    sha1_to_hex(sha1), ext);
+	return buf->buf;
+}
+
+char *sha1_pack_name(const unsigned char *sha1)
+{
+	static struct strbuf buf = STRBUF_INIT;
+	return odb_pack_name(&buf, sha1, "pack");
+}
+
+char *sha1_pack_index_name(const unsigned char *sha1)
+{
+	static struct strbuf buf = STRBUF_INIT;
+	return odb_pack_name(&buf, sha1, "idx");
+}
diff --git a/pack.h b/pack.h
index 8294341af..63bfde00c 100644
--- a/pack.h
+++ b/pack.h
@@ -101,4 +101,27 @@ extern int read_pack_header(int fd, struct pack_header *);
 extern struct sha1file *create_tmp_packfile(char **pack_tmp_name);
 extern void finish_tmp_packfile(struct strbuf *name_buffer, const char *pack_tmp_name, struct pack_idx_entry **written_list, uint32_t nr_written, struct pack_idx_option *pack_idx_opts, unsigned char sha1[]);
 
+/*
+ * Generate the filename to be used for a pack file with checksum "sha1" and
+ * extension "ext". The result is written into the strbuf "buf", overwriting
+ * any existing contents. A pointer to buf->buf is returned as a convenience.
+ *
+ * Example: odb_pack_name(out, sha1, "idx") => ".git/objects/pack/pack-1234..idx"
+ */
+extern char *odb_pack_name(struct strbuf *buf, const unsigned char *sha1, const char *ext);
+
+/*
+ * Return the name of the (local) packfile with the specified sha1 in
+ * its name.  The return value is a pointer to memory that is
+ * overwritten each time this function is called.
+ */
+extern char *sha1_pack_name(const unsigned char *sha1);
+
+/*
+ * Return the name of the (local) pack index file with the specified
+ * sha1 in its name.  The return value is a pointer to memory that is
+ * overwritten each time this function is called.
+ */
+extern char *sha1_pack_index_name(const unsigned char *sha1);
+
 #endif
diff --git a/sha1_file.c b/sha1_file.c
index b60ae15f7..7e511ce9e 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -278,28 +278,6 @@ static const char *alt_sha1_path(struct alternate_object_database *alt,
 	return buf->buf;
 }
 
- char *odb_pack_name(struct strbuf *buf,
-		     const unsigned char *sha1,
-		     const char *ext)
-{
-	strbuf_reset(buf);
-	strbuf_addf(buf, "%s/pack/pack-%s.%s", get_object_directory(),
-		    sha1_to_hex(sha1), ext);
-	return buf->buf;
-}
-
-char *sha1_pack_name(const unsigned char *sha1)
-{
-	static struct strbuf buf = STRBUF_INIT;
-	return odb_pack_name(&buf, sha1, "pack");
-}
-
-char *sha1_pack_index_name(const unsigned char *sha1)
-{
-	static struct strbuf buf = STRBUF_INIT;
-	return odb_pack_name(&buf, sha1, "idx");
-}
-
 struct alternate_object_database *alt_odb_list;
 static struct alternate_object_database **alt_odb_tail;
 
-- 
2.14.0.434.g98096fd7a8-goog
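
Reader's note on the comments moved into pack.h above: sha1_pack_name()
and sha1_pack_index_name() each return a pointer into a function-local
static strbuf, so the previous result is overwritten by the next call.
The sketch below illustrates that contract in isolation. It is not git
code: demo_pack_name(), demo_pack_name_copy(), the fixed-size buffer and
the "objects/pack" prefix are hypothetical stand-ins for the strbuf and
get_object_directory() used in the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for sha1_pack_name(): formats into a single
 * static buffer, so every call overwrites the previous result.
 */
static const char *demo_pack_name(const char *hex)
{
	static char buf[128];

	snprintf(buf, sizeof(buf), "objects/pack/pack-%s.pack", hex);
	return buf;
}

/*
 * Copy the name into caller-owned storage. This is what a caller has
 * to do when it needs two names to be valid at the same time.
 */
static void demo_pack_name_copy(const char *hex, char *out, size_t outlen)
{
	assert(out && outlen > 0);
	snprintf(out, outlen, "%s", demo_pack_name(hex));
}
```

Two consecutive calls to demo_pack_name() return the same address, and
only the most recent name is readable through it; a caller that needs
the earlier name has to copy it out first, as demo_pack_name_copy()
does. In the patch, odb_pack_name() avoids the problem by writing into
a strbuf supplied by the caller.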



* [RFC PATCH 02/10] pack: move static state variables
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
  2017-08-08 19:32 ` [RFC PATCH 01/10] pack: move pack name-related functions Jonathan Tan
@ 2017-08-08 19:32 ` Jonathan Tan
  2017-08-08 19:32 ` [RFC PATCH 03/10] pack: move pack_report() Jonathan Tan
                   ` (58 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-08 19:32 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

sha1_file.c declares some static variables that store packfile-related
state. Move them to pack.c.

The counters that were static are temporarily made global; subsequent
commits will make them static again.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 pack.c      | 14 ++++++++++++++
 pack.h      |  9 +++++++++
 sha1_file.c | 13 -------------
 3 files changed, 23 insertions(+), 13 deletions(-)

diff --git a/pack.c b/pack.c
index 0d191dfd6..0f46e0617 100644
--- a/pack.c
+++ b/pack.c
@@ -1,4 +1,5 @@
 #include "cache.h"
+#include "mru.h"
 
 char *odb_pack_name(struct strbuf *buf,
 		    const unsigned char *sha1,
@@ -21,3 +22,16 @@ char *sha1_pack_index_name(const unsigned char *sha1)
 	static struct strbuf buf = STRBUF_INIT;
 	return odb_pack_name(&buf, sha1, "idx");
 }
+
+unsigned int pack_used_ctr;
+unsigned int pack_mmap_calls;
+unsigned int peak_pack_open_windows;
+unsigned int pack_open_windows;
+unsigned int pack_open_fds;
+unsigned int pack_max_fds;
+size_t peak_pack_mapped;
+size_t pack_mapped;
+struct packed_git *packed_git;
+
+static struct mru packed_git_mru_storage;
+struct mru *packed_git_mru = &packed_git_mru_storage;
diff --git a/pack.h b/pack.h
index 63bfde00c..7fcd45f7b 100644
--- a/pack.h
+++ b/pack.h
@@ -124,4 +124,13 @@ extern char *sha1_pack_name(const unsigned char *sha1);
  */
 extern char *sha1_pack_index_name(const unsigned char *sha1);
 
+extern unsigned int pack_used_ctr;
+extern unsigned int pack_mmap_calls;
+extern unsigned int peak_pack_open_windows;
+extern unsigned int pack_open_windows;
+extern unsigned int pack_open_fds;
+extern unsigned int pack_max_fds;
+extern size_t peak_pack_mapped;
+extern size_t pack_mapped;
+
 #endif
diff --git a/sha1_file.c b/sha1_file.c
index 7e511ce9e..4d95e21eb 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -682,19 +682,6 @@ static int has_loose_object(const unsigned char *sha1)
 	return check_and_freshen(sha1, 0);
 }
 
-static unsigned int pack_used_ctr;
-static unsigned int pack_mmap_calls;
-static unsigned int peak_pack_open_windows;
-static unsigned int pack_open_windows;
-static unsigned int pack_open_fds;
-static unsigned int pack_max_fds;
-static size_t peak_pack_mapped;
-static size_t pack_mapped;
-struct packed_git *packed_git;
-
-static struct mru packed_git_mru_storage;
-struct mru *packed_git_mru = &packed_git_mru_storage;
-
 void pack_report(void)
 {
 	fprintf(stderr,
-- 
2.14.0.434.g98096fd7a8-goog



* [RFC PATCH 03/10] pack: move pack_report()
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
  2017-08-08 19:32 ` [RFC PATCH 01/10] pack: move pack name-related functions Jonathan Tan
  2017-08-08 19:32 ` [RFC PATCH 02/10] pack: move static state variables Jonathan Tan
@ 2017-08-08 19:32 ` Jonathan Tan
  2017-08-08 19:32 ` [RFC PATCH 04/10] pack: move open_pack_index(), parse_pack_index() Jonathan Tan
                   ` (57 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-08 19:32 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 cache.h     |  2 --
 pack.c      | 24 ++++++++++++++++++++++++
 pack.h      |  2 ++
 sha1_file.c | 24 ------------------------
 4 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/cache.h b/cache.h
index 1f0f47819..c7f802e4a 100644
--- a/cache.h
+++ b/cache.h
@@ -1624,8 +1624,6 @@ unsigned long approximate_object_count(void);
 extern struct packed_git *find_sha1_pack(const unsigned char *sha1,
 					 struct packed_git *packs);
 
-extern void pack_report(void);
-
 /*
  * Create a temporary file rooted in the object database directory, or
  * die on failure. The filename is taken from "pattern", which should have the
diff --git a/pack.c b/pack.c
index 0f46e0617..60d9fc3b0 100644
--- a/pack.c
+++ b/pack.c
@@ -35,3 +35,27 @@ struct packed_git *packed_git;
 
 static struct mru packed_git_mru_storage;
 struct mru *packed_git_mru = &packed_git_mru_storage;
+
+#define SZ_FMT PRIuMAX
+static inline uintmax_t sz_fmt(size_t s) { return s; }
+
+void pack_report(void)
+{
+	fprintf(stderr,
+		"pack_report: getpagesize()            = %10" SZ_FMT "\n"
+		"pack_report: core.packedGitWindowSize = %10" SZ_FMT "\n"
+		"pack_report: core.packedGitLimit      = %10" SZ_FMT "\n",
+		sz_fmt(getpagesize()),
+		sz_fmt(packed_git_window_size),
+		sz_fmt(packed_git_limit));
+	fprintf(stderr,
+		"pack_report: pack_used_ctr            = %10u\n"
+		"pack_report: pack_mmap_calls          = %10u\n"
+		"pack_report: pack_open_windows        = %10u / %10u\n"
+		"pack_report: pack_mapped              = "
+			"%10" SZ_FMT " / %10" SZ_FMT "\n",
+		pack_used_ctr,
+		pack_mmap_calls,
+		pack_open_windows, peak_pack_open_windows,
+		sz_fmt(pack_mapped), sz_fmt(peak_pack_mapped));
+}
diff --git a/pack.h b/pack.h
index 7fcd45f7b..6098bfe40 100644
--- a/pack.h
+++ b/pack.h
@@ -133,4 +133,6 @@ extern unsigned int pack_max_fds;
 extern size_t peak_pack_mapped;
 extern size_t pack_mapped;
 
+extern void pack_report(void);
+
 #endif
diff --git a/sha1_file.c b/sha1_file.c
index 4d95e21eb..0de39f480 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -29,9 +29,6 @@
 #include "mergesort.h"
 #include "quote.h"
 
-#define SZ_FMT PRIuMAX
-static inline uintmax_t sz_fmt(size_t s) { return s; }
-
 const unsigned char null_sha1[20];
 const struct object_id null_oid;
 const struct object_id empty_tree_oid = {
@@ -682,27 +679,6 @@ static int has_loose_object(const unsigned char *sha1)
 	return check_and_freshen(sha1, 0);
 }
 
-void pack_report(void)
-{
-	fprintf(stderr,
-		"pack_report: getpagesize()            = %10" SZ_FMT "\n"
-		"pack_report: core.packedGitWindowSize = %10" SZ_FMT "\n"
-		"pack_report: core.packedGitLimit      = %10" SZ_FMT "\n",
-		sz_fmt(getpagesize()),
-		sz_fmt(packed_git_window_size),
-		sz_fmt(packed_git_limit));
-	fprintf(stderr,
-		"pack_report: pack_used_ctr            = %10u\n"
-		"pack_report: pack_mmap_calls          = %10u\n"
-		"pack_report: pack_open_windows        = %10u / %10u\n"
-		"pack_report: pack_mapped              = "
-			"%10" SZ_FMT " / %10" SZ_FMT "\n",
-		pack_used_ctr,
-		pack_mmap_calls,
-		pack_open_windows, peak_pack_open_windows,
-		sz_fmt(pack_mapped), sz_fmt(peak_pack_mapped));
-}
-
 /*
  * Open and mmap the index file at path, perform a couple of
  * consistency checks, then record its information to p.  Return 0 on
-- 
2.14.0.434.g98096fd7a8-goog
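
Reader's note on the SZ_FMT / sz_fmt() pair that moves together with
pack_report(): older C libraries do not all support printf's "%zu", so
the code widens each size_t to uintmax_t and prints it with PRIuMAX. A
minimal standalone sketch of the idiom follows. SZ_FMT and sz_fmt() are
copied from the patch; format_mapped() is a hypothetical helper
invented here so that the result can be inspected.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>

#define SZ_FMT PRIuMAX
static inline uintmax_t sz_fmt(size_t s) { return s; }

/*
 * Hypothetical helper: format "current / peak" in the style of the
 * last line of pack_report(). Returns what snprintf() returns.
 */
static int format_mapped(char *out, size_t outlen, size_t cur, size_t peak)
{
	assert(out && outlen > 0);
	return snprintf(out, outlen,
			"mapped = %10" SZ_FMT " / %10" SZ_FMT,
			sz_fmt(cur), sz_fmt(peak));
}
```

The function wrapper matters: passing a bare size_t to a PRIuMAX
conversion would be undefined wherever size_t is narrower than
uintmax_t, while sz_fmt() performs the widening through its return
type.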



* [RFC PATCH 04/10] pack: move open_pack_index(), parse_pack_index()
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (2 preceding siblings ...)
  2017-08-08 19:32 ` [RFC PATCH 03/10] pack: move pack_report() Jonathan Tan
@ 2017-08-08 19:32 ` Jonathan Tan
  2017-08-08 20:19   ` Junio C Hamano
  2017-08-08 19:32 ` [RFC PATCH 05/10] pack: move release_pack_memory() Jonathan Tan
                   ` (56 subsequent siblings)
  60 siblings, 1 reply; 88+ messages in thread
From: Jonathan Tan @ 2017-08-08 19:32 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 builtin/count-objects.c |   1 +
 cache.h                 |   8 ---
 pack.c                  | 149 ++++++++++++++++++++++++++++++++++++++++++++++++
 pack.h                  |   8 +++
 sha1_file.c             | 140 ---------------------------------------------
 sha1_name.c             |   1 +
 6 files changed, 159 insertions(+), 148 deletions(-)

diff --git a/builtin/count-objects.c b/builtin/count-objects.c
index 1d82e61f2..185d3190a 100644
--- a/builtin/count-objects.c
+++ b/builtin/count-objects.c
@@ -10,6 +10,7 @@
 #include "builtin.h"
 #include "parse-options.h"
 #include "quote.h"
+#include "pack.h"
 
 static unsigned long garbage;
 static off_t size_garbage;
diff --git a/cache.h b/cache.h
index c7f802e4a..5d6839525 100644
--- a/cache.h
+++ b/cache.h
@@ -1603,8 +1603,6 @@ struct pack_entry {
 	struct packed_git *p;
 };
 
-extern struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path);
-
 /* A hook to report invalid files in pack directory */
 #define PACKDIR_FILE_PACK 1
 #define PACKDIR_FILE_IDX 2
@@ -1639,12 +1637,6 @@ extern int odb_mkstemp(struct strbuf *template, const char *pattern);
  */
 extern int odb_pack_keep(const char *name);
 
-/*
- * mmap the index file for the specified packfile (if it is not
- * already mmapped).  Return 0 on success.
- */
-extern int open_pack_index(struct packed_git *);
-
 /*
  * munmap the index file for the specified packfile (if it is
  * currently mmapped).
diff --git a/pack.c b/pack.c
index 60d9fc3b0..6edc43228 100644
--- a/pack.c
+++ b/pack.c
@@ -1,5 +1,6 @@
 #include "cache.h"
 #include "mru.h"
+#include "pack.h"
 
 char *odb_pack_name(struct strbuf *buf,
 		    const unsigned char *sha1,
@@ -59,3 +60,151 @@ void pack_report(void)
 		pack_open_windows, peak_pack_open_windows,
 		sz_fmt(pack_mapped), sz_fmt(peak_pack_mapped));
 }
+
+/*
+ * Open and mmap the index file at path, perform a couple of
+ * consistency checks, then record its information to p.  Return 0 on
+ * success.
+ */
+static int check_packed_git_idx(const char *path, struct packed_git *p)
+{
+	void *idx_map;
+	struct pack_idx_header *hdr;
+	size_t idx_size;
+	uint32_t version, nr, i, *index;
+	int fd = git_open(path);
+	struct stat st;
+
+	if (fd < 0)
+		return -1;
+	if (fstat(fd, &st)) {
+		close(fd);
+		return -1;
+	}
+	idx_size = xsize_t(st.st_size);
+	if (idx_size < 4 * 256 + 20 + 20) {
+		close(fd);
+		return error("index file %s is too small", path);
+	}
+	idx_map = xmmap(NULL, idx_size, PROT_READ, MAP_PRIVATE, fd, 0);
+	close(fd);
+
+	hdr = idx_map;
+	if (hdr->idx_signature == htonl(PACK_IDX_SIGNATURE)) {
+		version = ntohl(hdr->idx_version);
+		if (version < 2 || version > 2) {
+			munmap(idx_map, idx_size);
+			return error("index file %s is version %"PRIu32
+				     " and is not supported by this binary"
+				     " (try upgrading GIT to a newer version)",
+				     path, version);
+		}
+	} else
+		version = 1;
+
+	nr = 0;
+	index = idx_map;
+	if (version > 1)
+		index += 2;  /* skip index header */
+	for (i = 0; i < 256; i++) {
+		uint32_t n = ntohl(index[i]);
+		if (n < nr) {
+			munmap(idx_map, idx_size);
+			return error("non-monotonic index %s", path);
+		}
+		nr = n;
+	}
+
+	if (version == 1) {
+		/*
+		 * Total size:
+		 *  - 256 index entries 4 bytes each
+		 *  - 24-byte entries * nr (20-byte sha1 + 4-byte offset)
+		 *  - 20-byte SHA1 of the packfile
+		 *  - 20-byte SHA1 file checksum
+		 */
+		if (idx_size != 4*256 + nr * 24 + 20 + 20) {
+			munmap(idx_map, idx_size);
+			return error("wrong index v1 file size in %s", path);
+		}
+	} else if (version == 2) {
+		/*
+		 * Minimum size:
+		 *  - 8 bytes of header
+		 *  - 256 index entries 4 bytes each
+		 *  - 20-byte sha1 entry * nr
+		 *  - 4-byte crc entry * nr
+		 *  - 4-byte offset entry * nr
+		 *  - 20-byte SHA1 of the packfile
+		 *  - 20-byte SHA1 file checksum
+		 * And after the 4-byte offset table might be a
+		 * variable sized table containing 8-byte entries
+		 * for offsets larger than 2^31.
+		 */
+		unsigned long min_size = 8 + 4*256 + nr*(20 + 4 + 4) + 20 + 20;
+		unsigned long max_size = min_size;
+		if (nr)
+			max_size += (nr - 1)*8;
+		if (idx_size < min_size || idx_size > max_size) {
+			munmap(idx_map, idx_size);
+			return error("wrong index v2 file size in %s", path);
+		}
+		if (idx_size != min_size &&
+		    /*
+		     * make sure we can deal with large pack offsets.
+		     * 31-bit signed offset won't be enough, neither
+		     * 32-bit unsigned one will be.
+		     */
+		    (sizeof(off_t) <= 4)) {
+			munmap(idx_map, idx_size);
+			return error("pack too large for current definition of off_t in %s", path);
+		}
+	}
+
+	p->index_version = version;
+	p->index_data = idx_map;
+	p->index_size = idx_size;
+	p->num_objects = nr;
+	return 0;
+}
+
+int open_pack_index(struct packed_git *p)
+{
+	char *idx_name;
+	size_t len;
+	int ret;
+
+	if (p->index_data)
+		return 0;
+
+	if (!strip_suffix(p->pack_name, ".pack", &len))
+		die("BUG: pack_name does not end in .pack");
+	idx_name = xstrfmt("%.*s.idx", (int)len, p->pack_name);
+	ret = check_packed_git_idx(idx_name, p);
+	free(idx_name);
+	return ret;
+}
+
+static struct packed_git *alloc_packed_git(int extra)
+{
+	struct packed_git *p = xmalloc(st_add(sizeof(*p), extra));
+	memset(p, 0, sizeof(*p));
+	p->pack_fd = -1;
+	return p;
+}
+
+struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path)
+{
+	const char *path = sha1_pack_name(sha1);
+	size_t alloc = st_add(strlen(path), 1);
+	struct packed_git *p = alloc_packed_git(alloc);
+
+	memcpy(p->pack_name, path, alloc); /* includes NUL */
+	hashcpy(p->sha1, sha1);
+	if (check_packed_git_idx(idx_path, p)) {
+		free(p);
+		return NULL;
+	}
+
+	return p;
+}
diff --git a/pack.h b/pack.h
index 6098bfe40..5be0ed42a 100644
--- a/pack.h
+++ b/pack.h
@@ -135,4 +135,12 @@ extern size_t pack_mapped;
 
 extern void pack_report(void);
 
+/*
+ * mmap the index file for the specified packfile (if it is not
+ * already mmapped).  Return 0 on success.
+ */
+extern int open_pack_index(struct packed_git *);
+
+extern struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path);
+
 #endif
diff --git a/sha1_file.c b/sha1_file.c
index 0de39f480..2e414f5f5 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -679,130 +679,6 @@ static int has_loose_object(const unsigned char *sha1)
 	return check_and_freshen(sha1, 0);
 }
 
-/*
- * Open and mmap the index file at path, perform a couple of
- * consistency checks, then record its information to p.  Return 0 on
- * success.
- */
-static int check_packed_git_idx(const char *path, struct packed_git *p)
-{
-	void *idx_map;
-	struct pack_idx_header *hdr;
-	size_t idx_size;
-	uint32_t version, nr, i, *index;
-	int fd = git_open(path);
-	struct stat st;
-
-	if (fd < 0)
-		return -1;
-	if (fstat(fd, &st)) {
-		close(fd);
-		return -1;
-	}
-	idx_size = xsize_t(st.st_size);
-	if (idx_size < 4 * 256 + 20 + 20) {
-		close(fd);
-		return error("index file %s is too small", path);
-	}
-	idx_map = xmmap(NULL, idx_size, PROT_READ, MAP_PRIVATE, fd, 0);
-	close(fd);
-
-	hdr = idx_map;
-	if (hdr->idx_signature == htonl(PACK_IDX_SIGNATURE)) {
-		version = ntohl(hdr->idx_version);
-		if (version < 2 || version > 2) {
-			munmap(idx_map, idx_size);
-			return error("index file %s is version %"PRIu32
-				     " and is not supported by this binary"
-				     " (try upgrading GIT to a newer version)",
-				     path, version);
-		}
-	} else
-		version = 1;
-
-	nr = 0;
-	index = idx_map;
-	if (version > 1)
-		index += 2;  /* skip index header */
-	for (i = 0; i < 256; i++) {
-		uint32_t n = ntohl(index[i]);
-		if (n < nr) {
-			munmap(idx_map, idx_size);
-			return error("non-monotonic index %s", path);
-		}
-		nr = n;
-	}
-
-	if (version == 1) {
-		/*
-		 * Total size:
-		 *  - 256 index entries 4 bytes each
-		 *  - 24-byte entries * nr (20-byte sha1 + 4-byte offset)
-		 *  - 20-byte SHA1 of the packfile
-		 *  - 20-byte SHA1 file checksum
-		 */
-		if (idx_size != 4*256 + nr * 24 + 20 + 20) {
-			munmap(idx_map, idx_size);
-			return error("wrong index v1 file size in %s", path);
-		}
-	} else if (version == 2) {
-		/*
-		 * Minimum size:
-		 *  - 8 bytes of header
-		 *  - 256 index entries 4 bytes each
-		 *  - 20-byte sha1 entry * nr
-		 *  - 4-byte crc entry * nr
-		 *  - 4-byte offset entry * nr
-		 *  - 20-byte SHA1 of the packfile
-		 *  - 20-byte SHA1 file checksum
-		 * And after the 4-byte offset table might be a
-		 * variable sized table containing 8-byte entries
-		 * for offsets larger than 2^31.
-		 */
-		unsigned long min_size = 8 + 4*256 + nr*(20 + 4 + 4) + 20 + 20;
-		unsigned long max_size = min_size;
-		if (nr)
-			max_size += (nr - 1)*8;
-		if (idx_size < min_size || idx_size > max_size) {
-			munmap(idx_map, idx_size);
-			return error("wrong index v2 file size in %s", path);
-		}
-		if (idx_size != min_size &&
-		    /*
-		     * make sure we can deal with large pack offsets.
-		     * 31-bit signed offset won't be enough, neither
-		     * 32-bit unsigned one will be.
-		     */
-		    (sizeof(off_t) <= 4)) {
-			munmap(idx_map, idx_size);
-			return error("pack too large for current definition of off_t in %s", path);
-		}
-	}
-
-	p->index_version = version;
-	p->index_data = idx_map;
-	p->index_size = idx_size;
-	p->num_objects = nr;
-	return 0;
-}
-
-int open_pack_index(struct packed_git *p)
-{
-	char *idx_name;
-	size_t len;
-	int ret;
-
-	if (p->index_data)
-		return 0;
-
-	if (!strip_suffix(p->pack_name, ".pack", &len))
-		die("BUG: pack_name does not end in .pack");
-	idx_name = xstrfmt("%.*s.idx", (int)len, p->pack_name);
-	ret = check_packed_git_idx(idx_name, p);
-	free(idx_name);
-	return ret;
-}
-
 static void scan_windows(struct packed_git *p,
 	struct packed_git **lru_p,
 	struct pack_window **lru_w,
@@ -1300,22 +1176,6 @@ struct packed_git *add_packed_git(const char *path, size_t path_len, int local)
 	return p;
 }
 
-struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path)
-{
-	const char *path = sha1_pack_name(sha1);
-	size_t alloc = st_add(strlen(path), 1);
-	struct packed_git *p = alloc_packed_git(alloc);
-
-	memcpy(p->pack_name, path, alloc); /* includes NUL */
-	hashcpy(p->sha1, sha1);
-	if (check_packed_git_idx(idx_path, p)) {
-		free(p);
-		return NULL;
-	}
-
-	return p;
-}
-
 void install_packed_git(struct packed_git *pack)
 {
 	if (pack->pack_fd != -1)
diff --git a/sha1_name.c b/sha1_name.c
index 74fcb6d78..28b7c9fd8 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -9,6 +9,7 @@
 #include "remote.h"
 #include "dir.h"
 #include "sha1-array.h"
+#include "pack.h"
 
 static int get_sha1_oneline(const char *, unsigned char *, struct commit_list *);
 
-- 
2.14.0.434.g98096fd7a8-goog
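
Reader's note on the size checks in check_packed_git_idx(): the
arithmetic in its comments can be worked through on its own. The sketch
below restates the v1 and v2 size formulas from the moved code as small
functions. The function names are hypothetical, and size_t is used
where the patch uses unsigned long, so treat it as an illustration
rather than a drop-in replacement.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* v1: 256-entry fan-out, nr 24-byte entries, two 20-byte trailers. */
static size_t idx_v1_size(uint32_t nr)
{
	return 4 * 256 + (size_t)nr * 24 + 20 + 20;
}

/* v2 minimum: 8-byte header, fan-out, sha1 + crc + offset per object,
 * two 20-byte trailers. */
static size_t idx_v2_min_size(uint32_t nr)
{
	return 8 + 4 * 256 + (size_t)nr * (20 + 4 + 4) + 20 + 20;
}

/* v2 maximum: the code allows up to nr - 1 extra 8-byte offsets. */
static size_t idx_v2_max_size(uint32_t nr)
{
	size_t min = idx_v2_min_size(nr);
	size_t max = min;

	if (nr)
		max += ((size_t)nr - 1) * 8;
	assert(max >= min); /* no wrap-around for the sizes used here */
	return max;
}

/* A v2 index is accepted only if its size lies within [min, max]. */
static int idx_v2_size_ok(size_t idx_size, uint32_t nr)
{
	return idx_size >= idx_v2_min_size(nr) &&
	       idx_size <= idx_v2_max_size(nr);
}
```

For three objects this gives a v2 minimum of 1156 bytes and a maximum
of 1172 bytes, and an empty v1 index is exactly 1064 bytes, which is
the "4 * 256 + 20 + 20" lower bound checked before the file is mapped.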



* [RFC PATCH 05/10] pack: move release_pack_memory()
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (3 preceding siblings ...)
  2017-08-08 19:32 ` [RFC PATCH 04/10] pack: move open_pack_index(), parse_pack_index() Jonathan Tan
@ 2017-08-08 19:32 ` Jonathan Tan
  2017-08-08 19:32 ` [RFC PATCH 06/10] pack: move pack-closing functions Jonathan Tan
                   ` (55 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-08 19:32 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

The function unuse_one_window() needs to be temporarily made global. Its
scope will be restored to static in a subsequent commit.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 git-compat-util.h |  2 --
 pack.c            | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
 pack.h            |  4 ++++
 sha1_file.c       | 49 -------------------------------------------------
 4 files changed, 53 insertions(+), 51 deletions(-)

diff --git a/git-compat-util.h b/git-compat-util.h
index db9c22de7..201056e2d 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -749,8 +749,6 @@ const char *inet_ntop(int af, const void *src, char *dst, size_t size);
 extern int git_atexit(void (*handler)(void));
 #endif
 
-extern void release_pack_memory(size_t);
-
 typedef void (*try_to_free_t)(size_t);
 extern try_to_free_t set_try_to_free_routine(try_to_free_t);
 
diff --git a/pack.c b/pack.c
index 6edc43228..8daa74ad1 100644
--- a/pack.c
+++ b/pack.c
@@ -208,3 +208,52 @@ struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path)
 
 	return p;
 }
+
+static void scan_windows(struct packed_git *p,
+	struct packed_git **lru_p,
+	struct pack_window **lru_w,
+	struct pack_window **lru_l)
+{
+	struct pack_window *w, *w_l;
+
+	for (w_l = NULL, w = p->windows; w; w = w->next) {
+		if (!w->inuse_cnt) {
+			if (!*lru_w || w->last_used < (*lru_w)->last_used) {
+				*lru_p = p;
+				*lru_w = w;
+				*lru_l = w_l;
+			}
+		}
+		w_l = w;
+	}
+}
+
+int unuse_one_window(struct packed_git *current)
+{
+	struct packed_git *p, *lru_p = NULL;
+	struct pack_window *lru_w = NULL, *lru_l = NULL;
+
+	if (current)
+		scan_windows(current, &lru_p, &lru_w, &lru_l);
+	for (p = packed_git; p; p = p->next)
+		scan_windows(p, &lru_p, &lru_w, &lru_l);
+	if (lru_p) {
+		munmap(lru_w->base, lru_w->len);
+		pack_mapped -= lru_w->len;
+		if (lru_l)
+			lru_l->next = lru_w->next;
+		else
+			lru_p->windows = lru_w->next;
+		free(lru_w);
+		pack_open_windows--;
+		return 1;
+	}
+	return 0;
+}
+
+void release_pack_memory(size_t need)
+{
+	size_t cur = pack_mapped;
+	while (need >= (cur - pack_mapped) && unuse_one_window(NULL))
+		; /* nothing */
+}
diff --git a/pack.h b/pack.h
index 5be0ed42a..c16220586 100644
--- a/pack.h
+++ b/pack.h
@@ -143,4 +143,8 @@ extern int open_pack_index(struct packed_git *);
 
 extern struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path);
 
+extern int unuse_one_window(struct packed_git *current);
+
+extern void release_pack_memory(size_t);
+
 #endif
diff --git a/sha1_file.c b/sha1_file.c
index 2e414f5f5..644876e4e 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -679,55 +679,6 @@ static int has_loose_object(const unsigned char *sha1)
 	return check_and_freshen(sha1, 0);
 }
 
-static void scan_windows(struct packed_git *p,
-	struct packed_git **lru_p,
-	struct pack_window **lru_w,
-	struct pack_window **lru_l)
-{
-	struct pack_window *w, *w_l;
-
-	for (w_l = NULL, w = p->windows; w; w = w->next) {
-		if (!w->inuse_cnt) {
-			if (!*lru_w || w->last_used < (*lru_w)->last_used) {
-				*lru_p = p;
-				*lru_w = w;
-				*lru_l = w_l;
-			}
-		}
-		w_l = w;
-	}
-}
-
-static int unuse_one_window(struct packed_git *current)
-{
-	struct packed_git *p, *lru_p = NULL;
-	struct pack_window *lru_w = NULL, *lru_l = NULL;
-
-	if (current)
-		scan_windows(current, &lru_p, &lru_w, &lru_l);
-	for (p = packed_git; p; p = p->next)
-		scan_windows(p, &lru_p, &lru_w, &lru_l);
-	if (lru_p) {
-		munmap(lru_w->base, lru_w->len);
-		pack_mapped -= lru_w->len;
-		if (lru_l)
-			lru_l->next = lru_w->next;
-		else
-			lru_p->windows = lru_w->next;
-		free(lru_w);
-		pack_open_windows--;
-		return 1;
-	}
-	return 0;
-}
-
-void release_pack_memory(size_t need)
-{
-	size_t cur = pack_mapped;
-	while (need >= (cur - pack_mapped) && unuse_one_window(NULL))
-		; /* nothing */
-}
-
 static void mmap_limit_check(size_t length)
 {
 	static size_t limit = 0;
-- 
2.14.0.434.g98096fd7a8-goog
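
Reader's note on scan_windows() and unuse_one_window(): together they
pick the least recently used window that is not in use and unlink it
from a singly linked list, remembering the predecessor (lru_l) so the
unlink needs no second pass. The sketch below shows the same technique
on a single list. It is a simplified illustration, not git code:
struct window, push_window() and drop_lru_window() are hypothetical,
and the munmap() call and the counters from the patch are left out.

```c
#include <assert.h>
#include <stdlib.h>

struct window {
	struct window *next;
	unsigned int last_used;
	unsigned int inuse_cnt;
};

/* Prepend a new window to the list; aborts on allocation failure. */
static struct window *push_window(struct window *head,
				  unsigned int last_used,
				  unsigned int inuse_cnt)
{
	struct window *w = malloc(sizeof(*w));

	if (!w)
		abort();
	w->next = head;
	w->last_used = last_used;
	w->inuse_cnt = inuse_cnt;
	return w;
}

/*
 * Free the unused window with the smallest last_used value.
 * Returns 1 if a window was dropped, 0 if every window is in use.
 */
static int drop_lru_window(struct window **head)
{
	struct window *w, *prev = NULL;
	struct window *lru = NULL, *lru_prev = NULL;

	assert(head);
	for (w = *head; w; w = w->next) {
		if (!w->inuse_cnt &&
		    (!lru || w->last_used < lru->last_used)) {
			lru = w;
			lru_prev = prev; /* predecessor, for unlinking */
		}
		prev = w;
	}
	if (!lru)
		return 0;
	if (lru_prev)
		lru_prev->next = lru->next;
	else
		*head = lru->next;
	free(lru);
	return 1;
}
```

release_pack_memory() in the patch calls unuse_one_window() in a loop
until enough bytes have been unmapped or no unused window is left; a
caller of this sketch would loop on drop_lru_window() in the same way.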



* [RFC PATCH 06/10] pack: move pack-closing functions
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (4 preceding siblings ...)
  2017-08-08 19:32 ` [RFC PATCH 05/10] pack: move release_pack_memory() Jonathan Tan
@ 2017-08-08 19:32 ` Jonathan Tan
  2017-08-08 19:32 ` [RFC PATCH 07/10] pack: move use_pack() Jonathan Tan
                   ` (54 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-08 19:32 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

The function close_pack_fd() needs to be temporarily made global. Its
scope will be restored to static in a subsequent commit.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
In doing this, I discovered that some builtins close the packs even
though they, in theory, should not know anything about how objects are
stored. Can we remove those calls? (The tests pass with those calls
removed.)
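
For reviewers, the contract of close_pack_fd() that this commit
temporarily exposes can be sketched standalone. This is only an
illustration and not part of the patch: struct demo_pack,
demo_open_fds and demo_close_pack_fd() are hypothetical stand-ins for
struct packed_git, pack_open_fds and close_pack_fd(), and the real
close() call is replaced by a comment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct packed_git; only the field used here. */
struct demo_pack {
	int pack_fd;	/* -1 when no descriptor is open */
};

static unsigned int demo_open_fds;	/* plays the role of pack_open_fds */

/* Returns 1 if a descriptor was released, 0 if none was open. */
static int demo_close_pack_fd(struct demo_pack *p)
{
	assert(p != NULL);
	if (p->pack_fd < 0)
		return 0;
	/* The real function calls close(p->pack_fd) at this point. */
	demo_open_fds--;
	p->pack_fd = -1;
	return 1;
}
```

Calling it twice on the same pack is harmless: the second call sees
pack_fd == -1 and returns 0 without touching the counter, which is
what lets close_pack() and close_one_pack() share it.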
---
 builtin/am.c    |  1 +
 builtin/clone.c |  1 +
 builtin/fetch.c |  1 +
 builtin/merge.c |  1 +
 cache.h         |  8 --------
 pack.c          | 54 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 pack.h          |  9 +++++++++
 sha1_file.c     | 55 -------------------------------------------------------
 8 files changed, 67 insertions(+), 63 deletions(-)

diff --git a/builtin/am.c b/builtin/am.c
index c973bd96d..c38dd10a3 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -31,6 +31,7 @@
 #include "mailinfo.h"
 #include "apply.h"
 #include "string-list.h"
+#include "pack.h"
 
 /**
  * Returns 1 if the file is empty or does not exist, 0 otherwise.
diff --git a/builtin/clone.c b/builtin/clone.c
index 08b5cc433..53410a45d 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -25,6 +25,7 @@
 #include "remote.h"
 #include "run-command.h"
 #include "connected.h"
+#include "pack.h"
 
 /*
  * Overall FIXMEs:
diff --git a/builtin/fetch.c b/builtin/fetch.c
index c87e59f3b..196a3bfc4 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -17,6 +17,7 @@
 #include "connected.h"
 #include "argv-array.h"
 #include "utf8.h"
+#include "pack.h"
 
 static const char * const builtin_fetch_usage[] = {
 	N_("git fetch [<options>] [<repository> [<refspec>...]]"),
diff --git a/builtin/merge.c b/builtin/merge.c
index 900bafdb4..9cff4b276 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -32,6 +32,7 @@
 #include "gpg-interface.h"
 #include "sequencer.h"
 #include "string-list.h"
+#include "pack.h"
 
 #define DEFAULT_TWOHEAD (1<<0)
 #define DEFAULT_OCTOPUS (1<<1)
diff --git a/cache.h b/cache.h
index 5d6839525..25a21a61f 100644
--- a/cache.h
+++ b/cache.h
@@ -1637,15 +1637,7 @@ extern int odb_mkstemp(struct strbuf *template, const char *pattern);
  */
 extern int odb_pack_keep(const char *name);
 
-/*
- * munmap the index file for the specified packfile (if it is
- * currently mmapped).
- */
-extern void close_pack_index(struct packed_git *);
-
 extern unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t, unsigned long *);
-extern void close_pack_windows(struct packed_git *);
-extern void close_all_packs(void);
 extern void unuse_pack(struct pack_window **);
 extern void clear_delta_base_cache(void);
 extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local);
diff --git a/pack.c b/pack.c
index 8daa74ad1..c8e2dbdee 100644
--- a/pack.c
+++ b/pack.c
@@ -257,3 +257,57 @@ void release_pack_memory(size_t need)
 	while (need >= (cur - pack_mapped) && unuse_one_window(NULL))
 		; /* nothing */
 }
+
+void close_pack_windows(struct packed_git *p)
+{
+	while (p->windows) {
+		struct pack_window *w = p->windows;
+
+		if (w->inuse_cnt)
+			die("pack '%s' still has open windows to it",
+			    p->pack_name);
+		munmap(w->base, w->len);
+		pack_mapped -= w->len;
+		pack_open_windows--;
+		p->windows = w->next;
+		free(w);
+	}
+}
+
+int close_pack_fd(struct packed_git *p)
+{
+	if (p->pack_fd < 0)
+		return 0;
+
+	close(p->pack_fd);
+	pack_open_fds--;
+	p->pack_fd = -1;
+
+	return 1;
+}
+
+void close_pack_index(struct packed_git *p)
+{
+	if (p->index_data) {
+		munmap((void *)p->index_data, p->index_size);
+		p->index_data = NULL;
+	}
+}
+
+static void close_pack(struct packed_git *p)
+{
+	close_pack_windows(p);
+	close_pack_fd(p);
+	close_pack_index(p);
+}
+
+void close_all_packs(void)
+{
+	struct packed_git *p;
+
+	for (p = packed_git; p; p = p->next)
+		if (p->do_not_close)
+			die("BUG: want to close pack marked 'do-not-close'");
+		else
+			close_pack(p);
+}
diff --git a/pack.h b/pack.h
index c16220586..fd4668528 100644
--- a/pack.h
+++ b/pack.h
@@ -147,4 +147,13 @@ extern int unuse_one_window(struct packed_git *current);
 
 extern void release_pack_memory(size_t);
 
+extern void close_pack_windows(struct packed_git *);
+extern int close_pack_fd(struct packed_git *);
+/*
+ * munmap the index file for the specified packfile (if it is
+ * currently mmapped).
+ */
+extern void close_pack_index(struct packed_git *);
+extern void close_all_packs(void);
+
 #endif
diff --git a/sha1_file.c b/sha1_file.c
index 644876e4e..e2927244f 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -717,53 +717,6 @@ void *xmmap(void *start, size_t length,
 	return ret;
 }
 
-void close_pack_windows(struct packed_git *p)
-{
-	while (p->windows) {
-		struct pack_window *w = p->windows;
-
-		if (w->inuse_cnt)
-			die("pack '%s' still has open windows to it",
-			    p->pack_name);
-		munmap(w->base, w->len);
-		pack_mapped -= w->len;
-		pack_open_windows--;
-		p->windows = w->next;
-		free(w);
-	}
-}
-
-static int close_pack_fd(struct packed_git *p)
-{
-	if (p->pack_fd < 0)
-		return 0;
-
-	close(p->pack_fd);
-	pack_open_fds--;
-	p->pack_fd = -1;
-
-	return 1;
-}
-
-static void close_pack(struct packed_git *p)
-{
-	close_pack_windows(p);
-	close_pack_fd(p);
-	close_pack_index(p);
-}
-
-void close_all_packs(void)
-{
-	struct packed_git *p;
-
-	for (p = packed_git; p; p = p->next)
-		if (p->do_not_close)
-			die("BUG: want to close pack marked 'do-not-close'");
-		else
-			close_pack(p);
-}
-
-
 /*
  * The LRU pack is the one with the oldest MRU window, preferring packs
  * with no used windows, or the oldest mtime if it has no windows allocated.
@@ -846,14 +799,6 @@ void unuse_pack(struct pack_window **w_cursor)
 	}
 }
 
-void close_pack_index(struct packed_git *p)
-{
-	if (p->index_data) {
-		munmap((void *)p->index_data, p->index_size);
-		p->index_data = NULL;
-	}
-}
-
 static unsigned int get_max_fd_limit(void)
 {
 #ifdef RLIMIT_NOFILE
-- 
2.14.0.434.g98096fd7a8-goog



* [RFC PATCH 07/10] pack: move use_pack()
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (5 preceding siblings ...)
  2017-08-08 19:32 ` [RFC PATCH 06/10] pack: move pack-closing functions Jonathan Tan
@ 2017-08-08 19:32 ` Jonathan Tan
  2017-08-08 19:32 ` [RFC PATCH 08/10] pack: move unuse_pack() Jonathan Tan
                   ` (53 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-08 19:32 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

Make open_packed_git() global, since a function that remains in
sha1_file.c still calls it.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
Unlike the other commits where variables and functions are made global
and then remade static, open_packed_git is not remade static (because a
function in sha1_file.c still uses it).
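
As a reading aid for the moved code, here is a rough standalone sketch
of how use_pack() places a new window and how in_window() decides
whether an offset is covered. It is not part of the patch: struct
demo_window, demo_place_window() and demo_in_window() are hypothetical
names, int64_t stands in for off_t, and the window size is passed as a
parameter instead of being read from packed_git_window_size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields of struct pack_window used here. */
struct demo_window {
	int64_t offset;	/* start of the mapping within the pack */
	size_t len;	/* number of bytes mapped */
};

/* Follows the arithmetic in use_pack(); window_size must be at least 2. */
static struct demo_window demo_place_window(int64_t offset, int64_t pack_size,
					    size_t window_size)
{
	struct demo_window w;
	int64_t align = (int64_t)(window_size / 2);
	int64_t len;

	assert(align > 0 && offset >= 0 && offset < pack_size);
	/* Round the start down to a multiple of half the window size. */
	w.offset = (offset / align) * align;
	/* Map a full window, or whatever is left of the pack. */
	len = pack_size - w.offset;
	if (len > (int64_t)window_size)
		len = (int64_t)window_size;
	w.len = (size_t)len;
	return w;
}

/* Follows in_window(): 20 bytes (one hash) must be available after offset. */
static int demo_in_window(const struct demo_window *w, int64_t offset)
{
	return w->offset <= offset &&
	       offset + 20 <= w->offset + (int64_t)w->len;
}
```

With a 1024-byte window and a 5000-byte pack, offset 1300 yields a
window at [1024, 2048). Offset 2040 is rejected by that window because
fewer than 20 bytes follow it there, and a fresh window for it starts
at 1536, so the offset then sits well inside the mapping.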
---
 cache.h     |   1 -
 pack.c      | 303 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 pack.h      |  14 +--
 sha1_file.c | 285 --------------------------------------------------------
 streaming.c |   1 +
 5 files changed, 299 insertions(+), 305 deletions(-)

diff --git a/cache.h b/cache.h
index 25a21a61f..dd9f9a9ae 100644
--- a/cache.h
+++ b/cache.h
@@ -1637,7 +1637,6 @@ extern int odb_mkstemp(struct strbuf *template, const char *pattern);
  */
 extern int odb_pack_keep(const char *name);
 
-extern unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t, unsigned long *);
 extern void unuse_pack(struct pack_window **);
 extern void clear_delta_base_cache(void);
 extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local);
diff --git a/pack.c b/pack.c
index c8e2dbdee..85cb65558 100644
--- a/pack.c
+++ b/pack.c
@@ -24,14 +24,14 @@ char *sha1_pack_index_name(const unsigned char *sha1)
 	return odb_pack_name(&buf, sha1, "idx");
 }
 
-unsigned int pack_used_ctr;
-unsigned int pack_mmap_calls;
-unsigned int peak_pack_open_windows;
-unsigned int pack_open_windows;
+static unsigned int pack_used_ctr;
+static unsigned int pack_mmap_calls;
+static unsigned int peak_pack_open_windows;
+static unsigned int pack_open_windows;
 unsigned int pack_open_fds;
-unsigned int pack_max_fds;
-size_t peak_pack_mapped;
-size_t pack_mapped;
+static unsigned int pack_max_fds;
+static size_t peak_pack_mapped;
+static size_t pack_mapped;
 struct packed_git *packed_git;
 
 static struct mru packed_git_mru_storage;
@@ -228,7 +228,7 @@ static void scan_windows(struct packed_git *p,
 	}
 }
 
-int unuse_one_window(struct packed_git *current)
+static int unuse_one_window(struct packed_git *current)
 {
 	struct packed_git *p, *lru_p = NULL;
 	struct pack_window *lru_w = NULL, *lru_l = NULL;
@@ -274,7 +274,7 @@ void close_pack_windows(struct packed_git *p)
 	}
 }
 
-int close_pack_fd(struct packed_git *p)
+static int close_pack_fd(struct packed_git *p)
 {
 	if (p->pack_fd < 0)
 		return 0;
@@ -311,3 +311,288 @@ void close_all_packs(void)
 		else
 			close_pack(p);
 }
+
+/*
+ * The LRU pack is the one with the oldest MRU window, preferring packs
+ * with no used windows, or the oldest mtime if it has no windows allocated.
+ */
+static void find_lru_pack(struct packed_git *p, struct packed_git **lru_p, struct pack_window **mru_w, int *accept_windows_inuse)
+{
+	struct pack_window *w, *this_mru_w;
+	int has_windows_inuse = 0;
+
+	/*
+	 * Reject this pack if it has windows and the previously selected
+	 * one does not.  If this pack does not have windows, reject
+	 * it if the pack file is newer than the previously selected one.
+	 */
+	if (*lru_p && !*mru_w && (p->windows || p->mtime > (*lru_p)->mtime))
+		return;
+
+	for (w = this_mru_w = p->windows; w; w = w->next) {
+		/*
+		 * Reject this pack if any of its windows are in use,
+		 * but the previously selected pack did not have any
+		 * inuse windows.  Otherwise, record that this pack
+		 * has windows in use.
+		 */
+		if (w->inuse_cnt) {
+			if (*accept_windows_inuse)
+				has_windows_inuse = 1;
+			else
+				return;
+		}
+
+		if (w->last_used > this_mru_w->last_used)
+			this_mru_w = w;
+
+		/*
+		 * Reject this pack if it has windows that have been
+		 * used more recently than the previously selected pack.
+		 * If the previously selected pack had windows inuse and
+		 * we have not encountered a window in this pack that is
+		 * inuse, skip this check since we prefer a pack with no
+		 * inuse windows to one that has inuse windows.
+		 */
+		if (*mru_w && *accept_windows_inuse == has_windows_inuse &&
+		    this_mru_w->last_used > (*mru_w)->last_used)
+			return;
+	}
+
+	/*
+	 * Select this pack.
+	 */
+	*mru_w = this_mru_w;
+	*lru_p = p;
+	*accept_windows_inuse = has_windows_inuse;
+}
+
+static int close_one_pack(void)
+{
+	struct packed_git *p, *lru_p = NULL;
+	struct pack_window *mru_w = NULL;
+	int accept_windows_inuse = 1;
+
+	for (p = packed_git; p; p = p->next) {
+		if (p->pack_fd == -1)
+			continue;
+		find_lru_pack(p, &lru_p, &mru_w, &accept_windows_inuse);
+	}
+
+	if (lru_p)
+		return close_pack_fd(lru_p);
+
+	return 0;
+}
+
+static unsigned int get_max_fd_limit(void)
+{
+#ifdef RLIMIT_NOFILE
+	{
+		struct rlimit lim;
+
+		if (!getrlimit(RLIMIT_NOFILE, &lim))
+			return lim.rlim_cur;
+	}
+#endif
+
+#ifdef _SC_OPEN_MAX
+	{
+		long open_max = sysconf(_SC_OPEN_MAX);
+		if (0 < open_max)
+			return open_max;
+		/*
+		 * Otherwise, we got -1 for one of the two
+		 * reasons:
+		 *
+		 * (1) sysconf() did not understand _SC_OPEN_MAX
+		 *     and signaled an error with -1; or
+		 * (2) sysconf() said there is no limit.
+		 *
+		 * We _could_ clear errno before calling sysconf() to
+		 * tell these two cases apart and return a huge number
+		 * in the latter case to let the caller cap it to a
+		 * value that is not so selfish, but letting the
+		 * fallback OPEN_MAX codepath take care of these cases
+		 * is a lot simpler.
+		 */
+	}
+#endif
+
+#ifdef OPEN_MAX
+	return OPEN_MAX;
+#else
+	return 1; /* see the caller ;-) */
+#endif
+}
+
+/*
+ * Do not call this directly as this leaks p->pack_fd on error return;
+ * call open_packed_git() instead.
+ */
+static int open_packed_git_1(struct packed_git *p)
+{
+	struct stat st;
+	struct pack_header hdr;
+	unsigned char sha1[20];
+	unsigned char *idx_sha1;
+	long fd_flag;
+
+	if (!p->index_data && open_pack_index(p))
+		return error("packfile %s index unavailable", p->pack_name);
+
+	if (!pack_max_fds) {
+		unsigned int max_fds = get_max_fd_limit();
+
+		/* Save 3 for stdin/stdout/stderr, 22 for work */
+		if (25 < max_fds)
+			pack_max_fds = max_fds - 25;
+		else
+			pack_max_fds = 1;
+	}
+
+	while (pack_max_fds <= pack_open_fds && close_one_pack())
+		; /* nothing */
+
+	p->pack_fd = git_open(p->pack_name);
+	if (p->pack_fd < 0 || fstat(p->pack_fd, &st))
+		return -1;
+	pack_open_fds++;
+
+	/* If we created the struct before we had the pack we lack size. */
+	if (!p->pack_size) {
+		if (!S_ISREG(st.st_mode))
+			return error("packfile %s not a regular file", p->pack_name);
+		p->pack_size = st.st_size;
+	} else if (p->pack_size != st.st_size)
+		return error("packfile %s size changed", p->pack_name);
+
+	/* We leave these file descriptors open with sliding mmap;
+	 * there is no point keeping them open across exec(), though.
+	 */
+	fd_flag = fcntl(p->pack_fd, F_GETFD, 0);
+	if (fd_flag < 0)
+		return error("cannot determine file descriptor flags");
+	fd_flag |= FD_CLOEXEC;
+	if (fcntl(p->pack_fd, F_SETFD, fd_flag) == -1)
+		return error("cannot set FD_CLOEXEC");
+
+	/* Verify we recognize this pack file format. */
+	if (read_in_full(p->pack_fd, &hdr, sizeof(hdr)) != sizeof(hdr))
+		return error("file %s is far too short to be a packfile", p->pack_name);
+	if (hdr.hdr_signature != htonl(PACK_SIGNATURE))
+		return error("file %s is not a GIT packfile", p->pack_name);
+	if (!pack_version_ok(hdr.hdr_version))
+		return error("packfile %s is version %"PRIu32" and not"
+			" supported (try upgrading GIT to a newer version)",
+			p->pack_name, ntohl(hdr.hdr_version));
+
+	/* Verify the pack matches its index. */
+	if (p->num_objects != ntohl(hdr.hdr_entries))
+		return error("packfile %s claims to have %"PRIu32" objects"
+			     " while index indicates %"PRIu32" objects",
+			     p->pack_name, ntohl(hdr.hdr_entries),
+			     p->num_objects);
+	if (lseek(p->pack_fd, p->pack_size - sizeof(sha1), SEEK_SET) == -1)
+		return error("end of packfile %s is unavailable", p->pack_name);
+	if (read_in_full(p->pack_fd, sha1, sizeof(sha1)) != sizeof(sha1))
+		return error("packfile %s signature is unavailable", p->pack_name);
+	idx_sha1 = ((unsigned char *)p->index_data) + p->index_size - 40;
+	if (hashcmp(sha1, idx_sha1))
+		return error("packfile %s does not match index", p->pack_name);
+	return 0;
+}
+
+int open_packed_git(struct packed_git *p)
+{
+	if (!open_packed_git_1(p))
+		return 0;
+	close_pack_fd(p);
+	return -1;
+}
+
+static int in_window(struct pack_window *win, off_t offset)
+{
+	/* We must promise at least 20 bytes (one hash) after the
+	 * offset is available from this window, otherwise the offset
+	 * is not actually in this window and a different window (which
+	 * has that one hash excess) must be used.  This is to support
+	 * the object header and delta base parsing routines below.
+	 */
+	off_t win_off = win->offset;
+	return win_off <= offset
+		&& (offset + 20) <= (win_off + win->len);
+}
+
+unsigned char *use_pack(struct packed_git *p,
+		struct pack_window **w_cursor,
+		off_t offset,
+		unsigned long *left)
+{
+	struct pack_window *win = *w_cursor;
+
+	/* Since packfiles end in a hash of their content and it's
+	 * pointless to ask for an offset into the middle of that
+	 * hash, and the in_window function above wouldn't match
+	 * don't allow an offset too close to the end of the file.
+	 */
+	if (!p->pack_size && p->pack_fd == -1 && open_packed_git(p))
+		die("packfile %s cannot be accessed", p->pack_name);
+	if (offset > (p->pack_size - 20))
+		die("offset beyond end of packfile (truncated pack?)");
+	if (offset < 0)
+		die(_("offset before end of packfile (broken .idx?)"));
+
+	if (!win || !in_window(win, offset)) {
+		if (win)
+			win->inuse_cnt--;
+		for (win = p->windows; win; win = win->next) {
+			if (in_window(win, offset))
+				break;
+		}
+		if (!win) {
+			size_t window_align = packed_git_window_size / 2;
+			off_t len;
+
+			if (p->pack_fd == -1 && open_packed_git(p))
+				die("packfile %s cannot be accessed", p->pack_name);
+
+			win = xcalloc(1, sizeof(*win));
+			win->offset = (offset / window_align) * window_align;
+			len = p->pack_size - win->offset;
+			if (len > packed_git_window_size)
+				len = packed_git_window_size;
+			win->len = (size_t)len;
+			pack_mapped += win->len;
+			while (packed_git_limit < pack_mapped
+				&& unuse_one_window(p))
+				; /* nothing */
+			win->base = xmmap(NULL, win->len,
+				PROT_READ, MAP_PRIVATE,
+				p->pack_fd, win->offset);
+			if (win->base == MAP_FAILED)
+				die_errno("packfile %s cannot be mapped",
+					  p->pack_name);
+			if (!win->offset && win->len == p->pack_size
+				&& !p->do_not_close)
+				close_pack_fd(p);
+			pack_mmap_calls++;
+			pack_open_windows++;
+			if (pack_mapped > peak_pack_mapped)
+				peak_pack_mapped = pack_mapped;
+			if (pack_open_windows > peak_pack_open_windows)
+				peak_pack_open_windows = pack_open_windows;
+			win->next = p->windows;
+			p->windows = win;
+		}
+	}
+	if (win != *w_cursor) {
+		win->last_used = pack_used_ctr++;
+		win->inuse_cnt++;
+		*w_cursor = win;
+	}
+	offset -= win->offset;
+	if (left)
+		*left = win->len - xsize_t(offset);
+	return win->base + offset;
+}
diff --git a/pack.h b/pack.h
index fd4668528..bf2b99bf9 100644
--- a/pack.h
+++ b/pack.h
@@ -124,14 +124,7 @@ extern char *sha1_pack_name(const unsigned char *sha1);
  */
 extern char *sha1_pack_index_name(const unsigned char *sha1);
 
-extern unsigned int pack_used_ctr;
-extern unsigned int pack_mmap_calls;
-extern unsigned int peak_pack_open_windows;
-extern unsigned int pack_open_windows;
 extern unsigned int pack_open_fds;
-extern unsigned int pack_max_fds;
-extern size_t peak_pack_mapped;
-extern size_t pack_mapped;
 
 extern void pack_report(void);
 
@@ -143,12 +136,9 @@ extern int open_pack_index(struct packed_git *);
 
 extern struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path);
 
-extern int unuse_one_window(struct packed_git *current);
-
 extern void release_pack_memory(size_t);
 
 extern void close_pack_windows(struct packed_git *);
-extern int close_pack_fd(struct packed_git *);
 /*
  * munmap the index file for the specified packfile (if it is
  * currently mmapped).
@@ -156,4 +146,8 @@ extern int close_pack_fd(struct packed_git *);
 extern void close_pack_index(struct packed_git *);
 extern void close_all_packs(void);
 
+extern int open_packed_git(struct packed_git *p);
+
+extern unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t, unsigned long *);
+
 #endif
diff --git a/sha1_file.c b/sha1_file.c
index e2927244f..8f17a07e9 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -717,79 +717,6 @@ void *xmmap(void *start, size_t length,
 	return ret;
 }
 
-/*
- * The LRU pack is the one with the oldest MRU window, preferring packs
- * with no used windows, or the oldest mtime if it has no windows allocated.
- */
-static void find_lru_pack(struct packed_git *p, struct packed_git **lru_p, struct pack_window **mru_w, int *accept_windows_inuse)
-{
-	struct pack_window *w, *this_mru_w;
-	int has_windows_inuse = 0;
-
-	/*
-	 * Reject this pack if it has windows and the previously selected
-	 * one does not.  If this pack does not have windows, reject
-	 * it if the pack file is newer than the previously selected one.
-	 */
-	if (*lru_p && !*mru_w && (p->windows || p->mtime > (*lru_p)->mtime))
-		return;
-
-	for (w = this_mru_w = p->windows; w; w = w->next) {
-		/*
-		 * Reject this pack if any of its windows are in use,
-		 * but the previously selected pack did not have any
-		 * inuse windows.  Otherwise, record that this pack
-		 * has windows in use.
-		 */
-		if (w->inuse_cnt) {
-			if (*accept_windows_inuse)
-				has_windows_inuse = 1;
-			else
-				return;
-		}
-
-		if (w->last_used > this_mru_w->last_used)
-			this_mru_w = w;
-
-		/*
-		 * Reject this pack if it has windows that have been
-		 * used more recently than the previously selected pack.
-		 * If the previously selected pack had windows inuse and
-		 * we have not encountered a window in this pack that is
-		 * inuse, skip this check since we prefer a pack with no
-		 * inuse windows to one that has inuse windows.
-		 */
-		if (*mru_w && *accept_windows_inuse == has_windows_inuse &&
-		    this_mru_w->last_used > (*mru_w)->last_used)
-			return;
-	}
-
-	/*
-	 * Select this pack.
-	 */
-	*mru_w = this_mru_w;
-	*lru_p = p;
-	*accept_windows_inuse = has_windows_inuse;
-}
-
-static int close_one_pack(void)
-{
-	struct packed_git *p, *lru_p = NULL;
-	struct pack_window *mru_w = NULL;
-	int accept_windows_inuse = 1;
-
-	for (p = packed_git; p; p = p->next) {
-		if (p->pack_fd == -1)
-			continue;
-		find_lru_pack(p, &lru_p, &mru_w, &accept_windows_inuse);
-	}
-
-	if (lru_p)
-		return close_pack_fd(lru_p);
-
-	return 0;
-}
-
 void unuse_pack(struct pack_window **w_cursor)
 {
 	struct pack_window *w = *w_cursor;
@@ -799,218 +726,6 @@ void unuse_pack(struct pack_window **w_cursor)
 	}
 }
 
-static unsigned int get_max_fd_limit(void)
-{
-#ifdef RLIMIT_NOFILE
-	{
-		struct rlimit lim;
-
-		if (!getrlimit(RLIMIT_NOFILE, &lim))
-			return lim.rlim_cur;
-	}
-#endif
-
-#ifdef _SC_OPEN_MAX
-	{
-		long open_max = sysconf(_SC_OPEN_MAX);
-		if (0 < open_max)
-			return open_max;
-		/*
-		 * Otherwise, we got -1 for one of the two
-		 * reasons:
-		 *
-		 * (1) sysconf() did not understand _SC_OPEN_MAX
-		 *     and signaled an error with -1; or
-		 * (2) sysconf() said there is no limit.
-		 *
-		 * We _could_ clear errno before calling sysconf() to
-		 * tell these two cases apart and return a huge number
-		 * in the latter case to let the caller cap it to a
-		 * value that is not so selfish, but letting the
-		 * fallback OPEN_MAX codepath take care of these cases
-		 * is a lot simpler.
-		 */
-	}
-#endif
-
-#ifdef OPEN_MAX
-	return OPEN_MAX;
-#else
-	return 1; /* see the caller ;-) */
-#endif
-}
-
-/*
- * Do not call this directly as this leaks p->pack_fd on error return;
- * call open_packed_git() instead.
- */
-static int open_packed_git_1(struct packed_git *p)
-{
-	struct stat st;
-	struct pack_header hdr;
-	unsigned char sha1[20];
-	unsigned char *idx_sha1;
-	long fd_flag;
-
-	if (!p->index_data && open_pack_index(p))
-		return error("packfile %s index unavailable", p->pack_name);
-
-	if (!pack_max_fds) {
-		unsigned int max_fds = get_max_fd_limit();
-
-		/* Save 3 for stdin/stdout/stderr, 22 for work */
-		if (25 < max_fds)
-			pack_max_fds = max_fds - 25;
-		else
-			pack_max_fds = 1;
-	}
-
-	while (pack_max_fds <= pack_open_fds && close_one_pack())
-		; /* nothing */
-
-	p->pack_fd = git_open(p->pack_name);
-	if (p->pack_fd < 0 || fstat(p->pack_fd, &st))
-		return -1;
-	pack_open_fds++;
-
-	/* If we created the struct before we had the pack we lack size. */
-	if (!p->pack_size) {
-		if (!S_ISREG(st.st_mode))
-			return error("packfile %s not a regular file", p->pack_name);
-		p->pack_size = st.st_size;
-	} else if (p->pack_size != st.st_size)
-		return error("packfile %s size changed", p->pack_name);
-
-	/* We leave these file descriptors open with sliding mmap;
-	 * there is no point keeping them open across exec(), though.
-	 */
-	fd_flag = fcntl(p->pack_fd, F_GETFD, 0);
-	if (fd_flag < 0)
-		return error("cannot determine file descriptor flags");
-	fd_flag |= FD_CLOEXEC;
-	if (fcntl(p->pack_fd, F_SETFD, fd_flag) == -1)
-		return error("cannot set FD_CLOEXEC");
-
-	/* Verify we recognize this pack file format. */
-	if (read_in_full(p->pack_fd, &hdr, sizeof(hdr)) != sizeof(hdr))
-		return error("file %s is far too short to be a packfile", p->pack_name);
-	if (hdr.hdr_signature != htonl(PACK_SIGNATURE))
-		return error("file %s is not a GIT packfile", p->pack_name);
-	if (!pack_version_ok(hdr.hdr_version))
-		return error("packfile %s is version %"PRIu32" and not"
-			" supported (try upgrading GIT to a newer version)",
-			p->pack_name, ntohl(hdr.hdr_version));
-
-	/* Verify the pack matches its index. */
-	if (p->num_objects != ntohl(hdr.hdr_entries))
-		return error("packfile %s claims to have %"PRIu32" objects"
-			     " while index indicates %"PRIu32" objects",
-			     p->pack_name, ntohl(hdr.hdr_entries),
-			     p->num_objects);
-	if (lseek(p->pack_fd, p->pack_size - sizeof(sha1), SEEK_SET) == -1)
-		return error("end of packfile %s is unavailable", p->pack_name);
-	if (read_in_full(p->pack_fd, sha1, sizeof(sha1)) != sizeof(sha1))
-		return error("packfile %s signature is unavailable", p->pack_name);
-	idx_sha1 = ((unsigned char *)p->index_data) + p->index_size - 40;
-	if (hashcmp(sha1, idx_sha1))
-		return error("packfile %s does not match index", p->pack_name);
-	return 0;
-}
-
-static int open_packed_git(struct packed_git *p)
-{
-	if (!open_packed_git_1(p))
-		return 0;
-	close_pack_fd(p);
-	return -1;
-}
-
-static int in_window(struct pack_window *win, off_t offset)
-{
-	/* We must promise at least 20 bytes (one hash) after the
-	 * offset is available from this window, otherwise the offset
-	 * is not actually in this window and a different window (which
-	 * has that one hash excess) must be used.  This is to support
-	 * the object header and delta base parsing routines below.
-	 */
-	off_t win_off = win->offset;
-	return win_off <= offset
-		&& (offset + 20) <= (win_off + win->len);
-}
-
-unsigned char *use_pack(struct packed_git *p,
-		struct pack_window **w_cursor,
-		off_t offset,
-		unsigned long *left)
-{
-	struct pack_window *win = *w_cursor;
-
-	/* Since packfiles end in a hash of their content and it's
-	 * pointless to ask for an offset into the middle of that
-	 * hash, and the in_window function above wouldn't match
-	 * don't allow an offset too close to the end of the file.
-	 */
-	if (!p->pack_size && p->pack_fd == -1 && open_packed_git(p))
-		die("packfile %s cannot be accessed", p->pack_name);
-	if (offset > (p->pack_size - 20))
-		die("offset beyond end of packfile (truncated pack?)");
-	if (offset < 0)
-		die(_("offset before end of packfile (broken .idx?)"));
-
-	if (!win || !in_window(win, offset)) {
-		if (win)
-			win->inuse_cnt--;
-		for (win = p->windows; win; win = win->next) {
-			if (in_window(win, offset))
-				break;
-		}
-		if (!win) {
-			size_t window_align = packed_git_window_size / 2;
-			off_t len;
-
-			if (p->pack_fd == -1 && open_packed_git(p))
-				die("packfile %s cannot be accessed", p->pack_name);
-
-			win = xcalloc(1, sizeof(*win));
-			win->offset = (offset / window_align) * window_align;
-			len = p->pack_size - win->offset;
-			if (len > packed_git_window_size)
-				len = packed_git_window_size;
-			win->len = (size_t)len;
-			pack_mapped += win->len;
-			while (packed_git_limit < pack_mapped
-				&& unuse_one_window(p))
-				; /* nothing */
-			win->base = xmmap(NULL, win->len,
-				PROT_READ, MAP_PRIVATE,
-				p->pack_fd, win->offset);
-			if (win->base == MAP_FAILED)
-				die_errno("packfile %s cannot be mapped",
-					  p->pack_name);
-			if (!win->offset && win->len == p->pack_size
-				&& !p->do_not_close)
-				close_pack_fd(p);
-			pack_mmap_calls++;
-			pack_open_windows++;
-			if (pack_mapped > peak_pack_mapped)
-				peak_pack_mapped = pack_mapped;
-			if (pack_open_windows > peak_pack_open_windows)
-				peak_pack_open_windows = pack_open_windows;
-			win->next = p->windows;
-			p->windows = win;
-		}
-	}
-	if (win != *w_cursor) {
-		win->last_used = pack_used_ctr++;
-		win->inuse_cnt++;
-		*w_cursor = win;
-	}
-	offset -= win->offset;
-	if (left)
-		*left = win->len - xsize_t(offset);
-	return win->base + offset;
-}
-
 static struct packed_git *alloc_packed_git(int extra)
 {
 	struct packed_git *p = xmalloc(st_add(sizeof(*p), extra));
diff --git a/streaming.c b/streaming.c
index 9afa66b8b..f657018cf 100644
--- a/streaming.c
+++ b/streaming.c
@@ -3,6 +3,7 @@
  */
 #include "cache.h"
 #include "streaming.h"
+#include "pack.h"
 
 enum input_source {
 	stream_error = -1,
-- 
2.14.0.434.g98096fd7a8-goog



* [RFC PATCH 08/10] pack: move unuse_pack()
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (6 preceding siblings ...)
  2017-08-08 19:32 ` [RFC PATCH 07/10] pack: move use_pack() Jonathan Tan
@ 2017-08-08 19:32 ` Jonathan Tan
  2017-08-08 19:32 ` [RFC PATCH 09/10] pack: move add_packed_git() Jonathan Tan
                   ` (52 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-08 19:32 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 cache.h     | 1 -
 pack.c      | 9 +++++++++
 pack.h      | 1 +
 sha1_file.c | 9 ---------
 4 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/cache.h b/cache.h
index dd9f9a9ae..4812f3a63 100644
--- a/cache.h
+++ b/cache.h
@@ -1637,7 +1637,6 @@ extern int odb_mkstemp(struct strbuf *template, const char *pattern);
  */
 extern int odb_pack_keep(const char *name);
 
-extern void unuse_pack(struct pack_window **);
 extern void clear_delta_base_cache(void);
 extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local);
 
diff --git a/pack.c b/pack.c
index 85cb65558..93526ea7b 100644
--- a/pack.c
+++ b/pack.c
@@ -596,3 +596,12 @@ unsigned char *use_pack(struct packed_git *p,
 		*left = win->len - xsize_t(offset);
 	return win->base + offset;
 }
+
+void unuse_pack(struct pack_window **w_cursor)
+{
+	struct pack_window *w = *w_cursor;
+	if (w) {
+		w->inuse_cnt--;
+		*w_cursor = NULL;
+	}
+}
diff --git a/pack.h b/pack.h
index bf2b99bf9..3876e9ae6 100644
--- a/pack.h
+++ b/pack.h
@@ -149,5 +149,6 @@ extern void close_all_packs(void);
 extern int open_packed_git(struct packed_git *p);
 
 extern unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t, unsigned long *);
+extern void unuse_pack(struct pack_window **);
 
 #endif
diff --git a/sha1_file.c b/sha1_file.c
index 8f17a07e9..12501ef06 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -717,15 +717,6 @@ void *xmmap(void *start, size_t length,
 	return ret;
 }
 
-void unuse_pack(struct pack_window **w_cursor)
-{
-	struct pack_window *w = *w_cursor;
-	if (w) {
-		w->inuse_cnt--;
-		*w_cursor = NULL;
-	}
-}
-
 static struct packed_git *alloc_packed_git(int extra)
 {
 	struct packed_git *p = xmalloc(st_add(sizeof(*p), extra));
-- 
2.14.0.434.g98096fd7a8-goog



* [RFC PATCH 09/10] pack: move add_packed_git()
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (7 preceding siblings ...)
  2017-08-08 19:32 ` [RFC PATCH 08/10] pack: move unuse_pack() Jonathan Tan
@ 2017-08-08 19:32 ` Jonathan Tan
  2017-08-08 19:32 ` [RFC PATCH 10/10] pack: move install_packed_git() Jonathan Tan
                   ` (51 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-08 19:32 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 cache.h     |  1 -
 connected.c |  1 +
 pack.c      | 53 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 pack.h      |  1 +
 sha1_file.c | 61 -------------------------------------------------------------
 5 files changed, 55 insertions(+), 62 deletions(-)

diff --git a/cache.h b/cache.h
index 4812f3a63..bf93477e8 100644
--- a/cache.h
+++ b/cache.h
@@ -1638,7 +1638,6 @@ extern int odb_mkstemp(struct strbuf *template, const char *pattern);
 extern int odb_pack_keep(const char *name);
 
 extern void clear_delta_base_cache(void);
-extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local);
 
 /*
  * Make sure that a pointer access into an mmap'd index file is within bounds,
diff --git a/connected.c b/connected.c
index 136c2ac16..3e3f0148c 100644
--- a/connected.c
+++ b/connected.c
@@ -3,6 +3,7 @@
 #include "sigchain.h"
 #include "connected.h"
 #include "transport.h"
+#include "pack.h"
 
 /*
  * If we feed all the commits we want to verify to this command
diff --git a/pack.c b/pack.c
index 93526ea7b..efe0ed3e8 100644
--- a/pack.c
+++ b/pack.c
@@ -605,3 +605,56 @@ void unuse_pack(struct pack_window **w_cursor)
 		*w_cursor = NULL;
 	}
 }
+
+static void try_to_free_pack_memory(size_t size)
+{
+	release_pack_memory(size);
+}
+
+struct packed_git *add_packed_git(const char *path, size_t path_len, int local)
+{
+	static int have_set_try_to_free_routine;
+	struct stat st;
+	size_t alloc;
+	struct packed_git *p;
+
+	if (!have_set_try_to_free_routine) {
+		have_set_try_to_free_routine = 1;
+		set_try_to_free_routine(try_to_free_pack_memory);
+	}
+
+	/*
+	 * Make sure a corresponding .pack file exists and that
+	 * the index looks sane.
+	 */
+	if (!strip_suffix_mem(path, &path_len, ".idx"))
+		return NULL;
+
+	/*
+	 * ".pack" is long enough to hold any suffix we're adding (and
+	 * the use xsnprintf double-checks that)
+	 */
+	alloc = st_add3(path_len, strlen(".pack"), 1);
+	p = alloc_packed_git(alloc);
+	memcpy(p->pack_name, path, path_len);
+
+	xsnprintf(p->pack_name + path_len, alloc - path_len, ".keep");
+	if (!access(p->pack_name, F_OK))
+		p->pack_keep = 1;
+
+	xsnprintf(p->pack_name + path_len, alloc - path_len, ".pack");
+	if (stat(p->pack_name, &st) || !S_ISREG(st.st_mode)) {
+		free(p);
+		return NULL;
+	}
+
+	/* ok, it looks sane as far as we can check without
+	 * actually mapping the pack file.
+	 */
+	p->pack_size = st.st_size;
+	p->pack_local = local;
+	p->mtime = st.st_mtime;
+	if (path_len < 40 || get_sha1_hex(path + path_len - 40, p->sha1))
+		hashclr(p->sha1);
+	return p;
+}
diff --git a/pack.h b/pack.h
index 3876e9ae6..c1f3ff32d 100644
--- a/pack.h
+++ b/pack.h
@@ -150,5 +150,6 @@ extern int open_packed_git(struct packed_git *p);
 
 extern unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t, unsigned long *);
 extern void unuse_pack(struct pack_window **);
+extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local);
 
 #endif
diff --git a/sha1_file.c b/sha1_file.c
index 12501ef06..7f12b1ee0 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -717,67 +717,6 @@ void *xmmap(void *start, size_t length,
 	return ret;
 }
 
-static struct packed_git *alloc_packed_git(int extra)
-{
-	struct packed_git *p = xmalloc(st_add(sizeof(*p), extra));
-	memset(p, 0, sizeof(*p));
-	p->pack_fd = -1;
-	return p;
-}
-
-static void try_to_free_pack_memory(size_t size)
-{
-	release_pack_memory(size);
-}
-
-struct packed_git *add_packed_git(const char *path, size_t path_len, int local)
-{
-	static int have_set_try_to_free_routine;
-	struct stat st;
-	size_t alloc;
-	struct packed_git *p;
-
-	if (!have_set_try_to_free_routine) {
-		have_set_try_to_free_routine = 1;
-		set_try_to_free_routine(try_to_free_pack_memory);
-	}
-
-	/*
-	 * Make sure a corresponding .pack file exists and that
-	 * the index looks sane.
-	 */
-	if (!strip_suffix_mem(path, &path_len, ".idx"))
-		return NULL;
-
-	/*
-	 * ".pack" is long enough to hold any suffix we're adding (and
-	 * the use xsnprintf double-checks that)
-	 */
-	alloc = st_add3(path_len, strlen(".pack"), 1);
-	p = alloc_packed_git(alloc);
-	memcpy(p->pack_name, path, path_len);
-
-	xsnprintf(p->pack_name + path_len, alloc - path_len, ".keep");
-	if (!access(p->pack_name, F_OK))
-		p->pack_keep = 1;
-
-	xsnprintf(p->pack_name + path_len, alloc - path_len, ".pack");
-	if (stat(p->pack_name, &st) || !S_ISREG(st.st_mode)) {
-		free(p);
-		return NULL;
-	}
-
-	/* ok, it looks sane as far as we can check without
-	 * actually mapping the pack file.
-	 */
-	p->pack_size = st.st_size;
-	p->pack_local = local;
-	p->mtime = st.st_mtime;
-	if (path_len < 40 || get_sha1_hex(path + path_len - 40, p->sha1))
-		hashclr(p->sha1);
-	return p;
-}
-
 void install_packed_git(struct packed_git *pack)
 {
 	if (pack->pack_fd != -1)
-- 
2.14.0.434.g98096fd7a8-goog
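
The name handling in add_packed_git() relies on ".keep" and ".pack" having the same length, so a buffer sized for ".pack" can hold either suffix. A minimal sketch of that idea follows; sibling_name_sketch() is a hypothetical helper, it uses plain malloc() and snprintf() rather than git's wrappers, and it returns NULL where the real code would have xsnprintf() complain.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: given "<base>.idx", return a malloc'd buffer holding
 * "<base><suffix>", or NULL if the input does not end in ".idx" or the
 * suffix is longer than ".pack". The caller frees the result.
 */
static char *sibling_name_sketch(const char *idx_path, const char *suffix)
{
	size_t len, alloc;
	size_t idx_len = strlen(".idx");
	char *name;

	assert(idx_path && suffix);
	len = strlen(idx_path);
	if (len < idx_len || strcmp(idx_path + len - idx_len, ".idx"))
		return NULL;
	len -= idx_len;			/* length of "<base>" */

	/* Room for the base, the longest suffix we accept, and the NUL. */
	alloc = len + strlen(".pack") + 1;
	if (strlen(suffix) > strlen(".pack"))
		return NULL;

	name = malloc(alloc);
	if (!name)
		return NULL;
	memcpy(name, idx_path, len);
	snprintf(name + len, alloc - len, "%s", suffix);
	return name;
}
```

With this sketch, "pack-1234.idx" maps to "pack-1234.keep" or "pack-1234.pack", while a longer suffix such as ".bitmap" is rejected instead of being truncated.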


* [RFC PATCH 10/10] pack: move install_packed_git()
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (8 preceding siblings ...)
  2017-08-08 19:32 ` [RFC PATCH 09/10] pack: move add_packed_git() Jonathan Tan
@ 2017-08-08 19:32 ` Jonathan Tan
  2017-08-08 20:05 ` [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Junio C Hamano
                   ` (50 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-08 19:32 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 cache.h     |  1 -
 pack.c      | 11 ++++++++++-
 pack.h      |  4 ++--
 sha1_file.c |  9 ---------
 4 files changed, 12 insertions(+), 13 deletions(-)

diff --git a/cache.h b/cache.h
index bf93477e8..41562dc0b 100644
--- a/cache.h
+++ b/cache.h
@@ -1611,7 +1611,6 @@ extern void (*report_garbage)(unsigned seen_bits, const char *path);
 
 extern void prepare_packed_git(void);
 extern void reprepare_packed_git(void);
-extern void install_packed_git(struct packed_git *pack);
 
 /*
  * Give a rough count of objects in the repository. This sacrifices accuracy
diff --git a/pack.c b/pack.c
index efe0ed3e8..4eb65e460 100644
--- a/pack.c
+++ b/pack.c
@@ -28,7 +28,7 @@ static unsigned int pack_used_ctr;
 static unsigned int pack_mmap_calls;
 static unsigned int peak_pack_open_windows;
 static unsigned int pack_open_windows;
-unsigned int pack_open_fds;
+static unsigned int pack_open_fds;
 static unsigned int pack_max_fds;
 static size_t peak_pack_mapped;
 static size_t pack_mapped;
@@ -658,3 +658,12 @@ struct packed_git *add_packed_git(const char *path, size_t path_len, int local)
 		hashclr(p->sha1);
 	return p;
 }
+
+void install_packed_git(struct packed_git *pack)
+{
+	if (pack->pack_fd != -1)
+		pack_open_fds++;
+
+	pack->next = packed_git;
+	packed_git = pack;
+}
diff --git a/pack.h b/pack.h
index c1f3ff32d..576c4fc7c 100644
--- a/pack.h
+++ b/pack.h
@@ -124,8 +124,6 @@ extern char *sha1_pack_name(const unsigned char *sha1);
  */
 extern char *sha1_pack_index_name(const unsigned char *sha1);
 
-extern unsigned int pack_open_fds;
-
 extern void pack_report(void);
 
 /*
@@ -152,4 +150,6 @@ extern unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t
 extern void unuse_pack(struct pack_window **);
 extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local);
 
+extern void install_packed_git(struct packed_git *pack);
+
 #endif
diff --git a/sha1_file.c b/sha1_file.c
index 7f12b1ee0..b956ca0c9 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -717,15 +717,6 @@ void *xmmap(void *start, size_t length,
 	return ret;
 }
 
-void install_packed_git(struct packed_git *pack)
-{
-	if (pack->pack_fd != -1)
-		pack_open_fds++;
-
-	pack->next = packed_git;
-	packed_git = pack;
-}
-
 void (*report_garbage)(unsigned seen_bits, const char *path);
 
 static void report_helper(const struct string_list *list,
-- 
2.14.0.434.g98096fd7a8-goog


* Re: [RFC PATCH 00/10] An attempt to move packfile funcs to its own file
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (9 preceding siblings ...)
  2017-08-08 19:32 ` [RFC PATCH 10/10] pack: move install_packed_git() Jonathan Tan
@ 2017-08-08 20:05 ` Junio C Hamano
  2017-08-08 20:43   ` Jonathan Tan
  2017-08-09  1:22 ` [PATCH v2 00/25] Move exported " Jonathan Tan
                   ` (49 subsequent siblings)
  60 siblings, 1 reply; 88+ messages in thread
From: Junio C Hamano @ 2017-08-08 20:05 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

Jonathan Tan <jonathantanmy@google.com> writes:

> While investigating annotating packfiles and loose objects to support
> connectivity checks in partial clones [1], I decided to make the effort
> to separate packfile-related code from sha1_file.c into its own file, to
> make it easier to both code such changes and review them. Here is the
> beginning of those efforts.
>
> Is this something worth doing, and if yes, is this in the right
> direction?

Overall I think it is a good idea to slim down sha1_file.c *if* we
can keep the exposed surface area small enough.

I wonder if the names "pack.[ch]" communicate well that these are
"object access layer that is about reading from packfiles".  The
writer side is called "pack-objects.[ch]".

This may have to make some symbols that used to be private to the
"object access" layer, which was what sha1_file.c was about, global
symbols.  After moving things around, do we end up exposing too many
implementation details to the world outside the "object access"
layer?  I'd assume they are limited to the resulting pack.h and it
would be OK as long as nobody other than sha1_file.c and pack.c
would include it, though.

Thanks.

>
> [1] https://public-inbox.org/git/20170804145113.5ceafafa@twelve2.svl.corp.google.com/
>
> Jonathan Tan (10):
>   pack: move pack name-related functions
>   pack: move static state variables
>   pack: move pack_report()
>   pack: move open_pack_index(), parse_pack_index()
>   pack: move release_pack_memory()
>   pack: move pack-closing functions
>   pack: move use_pack()
>   pack: move unuse_pack()
>   pack: move add_packed_git()
>   pack: move install_packed_git()
>
>  Makefile                 |   1 +
>  builtin/am.c             |   1 +
>  builtin/clone.c          |   1 +
>  builtin/count-objects.c  |   1 +
>  builtin/fetch.c          |   1 +
>  builtin/merge.c          |   1 +
>  builtin/pack-redundant.c |   1 +
>  cache.h                  |  45 ----
>  connected.c              |   1 +
>  git-compat-util.h        |   2 -
>  pack.c                   | 669 +++++++++++++++++++++++++++++++++++++++++++++++
>  pack.h                   |  51 ++++
>  sha1_file.c              | 667 ----------------------------------------------
>  sha1_name.c              |   1 +
>  streaming.c              |   1 +
>  15 files changed, 730 insertions(+), 714 deletions(-)
>  create mode 100644 pack.c

* Re: [RFC PATCH 04/10] pack: move open_pack_index(), parse_pack_index()
  2017-08-08 19:32 ` [RFC PATCH 04/10] pack: move open_pack_index(), parse_pack_index() Jonathan Tan
@ 2017-08-08 20:19   ` Junio C Hamano
  2017-08-08 20:45     ` Jonathan Tan
  0 siblings, 1 reply; 88+ messages in thread
From: Junio C Hamano @ 2017-08-08 20:19 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

Jonathan Tan <jonathantanmy@google.com> writes:

> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> ---
>  builtin/count-objects.c |   1 +
>  cache.h                 |   8 ---
>  pack.c                  | 149 ++++++++++++++++++++++++++++++++++++++++++++++++
>  pack.h                  |   8 +++
>  sha1_file.c             | 140 ---------------------------------------------
>  sha1_name.c             |   1 +
>  6 files changed, 159 insertions(+), 148 deletions(-)

This patch is a bit strange...

> diff --git a/pack.c b/pack.c
> index 60d9fc3b0..6edc43228 100644
> --- a/pack.c
> +++ b/pack.c
> ...
> +static struct packed_git *alloc_packed_git(int extra)
> +{
> +	struct packed_git *p = xmalloc(st_add(sizeof(*p), extra));
> +	memset(p, 0, sizeof(*p));
> +	p->pack_fd = -1;
> +	return p;
> +}
> +
> +struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path)
> +{
> +	const char *path = sha1_pack_name(sha1);
> +	size_t alloc = st_add(strlen(path), 1);
> +	struct packed_git *p = alloc_packed_git(alloc);
> +
> +	memcpy(p->pack_name, path, alloc); /* includes NUL */
> +	hashcpy(p->sha1, sha1);
> +	if (check_packed_git_idx(idx_path, p)) {
> +		free(p);
> +		return NULL;
> +	}
> +
> +	return p;
> +}

We see these two functions appear in pack.c

> diff --git a/sha1_file.c b/sha1_file.c
> index 0de39f480..2e414f5f5 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> ...
> @@ -1300,22 +1176,6 @@ struct packed_git *add_packed_git(const char *path, size_t path_len, int local)
>  	return p;
>  }
>  
> -struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path)
> -{
> -	const char *path = sha1_pack_name(sha1);
> -	size_t alloc = st_add(strlen(path), 1);
> -	struct packed_git *p = alloc_packed_git(alloc);
> -
> -	memcpy(p->pack_name, path, alloc); /* includes NUL */
> -	hashcpy(p->sha1, sha1);
> -	if (check_packed_git_idx(idx_path, p)) {
> -		free(p);
> -		return NULL;
> -	}
> -
> -	return p;
> -}
> -

And we see parse_pack_index() came from sha1_file.c

But where did alloc_packed_git() come from?  Was the patch split
incorrectly or something?

When I applied the whole series and did

    git blame -s -w -M -C -C master.. pack.c

expecting that pretty much everything has come from sha1_file.c but
noticed that some lines were actually blamed to a version of pack.c
and these functions were among them.

* Re: [RFC PATCH 01/10] pack: move pack name-related functions
  2017-08-08 19:32 ` [RFC PATCH 01/10] pack: move pack name-related functions Jonathan Tan
@ 2017-08-08 20:36   ` Stefan Beller
  2017-08-08 20:50     ` Jonathan Tan
  0 siblings, 1 reply; 88+ messages in thread
From: Stefan Beller @ 2017-08-08 20:36 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git@vger.kernel.org

On Tue, Aug 8, 2017 at 12:32 PM, Jonathan Tan <jonathantanmy@google.com> wrote:
> Currently, sha1_file.c and cache.h contain many functions, both related
> to and unrelated to packfiles. This makes both files very large and
> causes an unclear separation of concerns.
>
> Create a new file, pack.c, to hold all packfile-related functions
> currently in sha1_file.c, and designate pack.h to hold these
> packfile-related functions.

There are also packed refs, so one could (like I did) think that
pack.c is for generic packing of things, maybe packfile.c
would be more clear?

* Re: [RFC PATCH 00/10] An attempt to move packfile funcs to its own file
  2017-08-08 20:05 ` [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Junio C Hamano
@ 2017-08-08 20:43   ` Jonathan Tan
  2017-08-08 21:04     ` Junio C Hamano
  0 siblings, 1 reply; 88+ messages in thread
From: Jonathan Tan @ 2017-08-08 20:43 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Tue, 08 Aug 2017 13:05:05 -0700
Junio C Hamano <gitster@pobox.com> wrote:

> Jonathan Tan <jonathantanmy@google.com> writes:
> 
> > While investigating annotating packfiles and loose objects to support
> > connectivity checks in partial clones [1], I decided to make the effort
> > to separate packfile-related code from sha1_file.c into its own file, to
> > make it easier to both code such changes and review them. Here is the
> > beginning of those efforts.
> >
> > Is this something worth doing, and if yes, is this in the right
> > direction?
> 
> Overall I think it is a good idea to slim down sha1_file.c *if* we
> can keep the exposed surface area small enough.

What do you mean by "keep the exposed surface area small enough"? If you
mean the total number of exposed functions in sha1_file and pack (once
everything is done), I think it will be almost the same as that
currently in sha1_file.

find_pack_entry() and has_packed_and_bad() (not yet in this patch set)
might need to be changed from static to global, but those are the only 2
I can think of. Anyway, I'll report the functions that need to be
changed from static to global at the end.

During this patch set, there might be some functions that need to be
temporarily made global, but those are reverted to static in the end.

> I wonder if the names "pack.[ch]" communicate well that these are
> "object access layer that is about reading from packfiles".  The
> writer side is called "pack-objects.[ch]".

This file will end up being slightly broader than reading from packfiles
- in particular, things like pack_report() (which reports statistics that
are not only about the in-repo packfiles themselves) and
parse_pack_index() (which parses an idx file obtained through http) are
there too. Hence the generic name, but I agree that there might be a
better name (or better set of names).

> This may have to make some symbols that used to be private to the
> "object access" layer, which was what sha1_file.c was about, global
> symbols.  After moving things around, do we end up exposing too many
> implementation details to the world outside the "object access"
> layer?  I'd assume they are limited to the resulting pack.h and it
> would be OK as long as nobody other than sha1_file.c and pack.c
> would include it, though.

As stated above, I don't think so, but I'll make a list of the functions
needing to be made global.

* Re: [RFC PATCH 04/10] pack: move open_pack_index(), parse_pack_index()
  2017-08-08 20:19   ` Junio C Hamano
@ 2017-08-08 20:45     ` Jonathan Tan
  0 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-08 20:45 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Tue, 08 Aug 2017 13:19:23 -0700
Junio C Hamano <gitster@pobox.com> wrote:

> Jonathan Tan <jonathantanmy@google.com> writes:
> 
> > Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> > ---
> >  builtin/count-objects.c |   1 +
> >  cache.h                 |   8 ---
> >  pack.c                  | 149 ++++++++++++++++++++++++++++++++++++++++++++++++
> >  pack.h                  |   8 +++
> >  sha1_file.c             | 140 ---------------------------------------------
> >  sha1_name.c             |   1 +
> >  6 files changed, 159 insertions(+), 148 deletions(-)
> 
> This patch is a bit strange...
> 
> > diff --git a/pack.c b/pack.c
> > index 60d9fc3b0..6edc43228 100644
> > --- a/pack.c
> > +++ b/pack.c
> > ...
> > +static struct packed_git *alloc_packed_git(int extra)
> > +{
> > +	struct packed_git *p = xmalloc(st_add(sizeof(*p), extra));
> > +	memset(p, 0, sizeof(*p));
> > +	p->pack_fd = -1;
> > +	return p;
> > +}
> > +
> > +struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path)
> > +{
> > +	const char *path = sha1_pack_name(sha1);
> > +	size_t alloc = st_add(strlen(path), 1);
> > +	struct packed_git *p = alloc_packed_git(alloc);
> > +
> > +	memcpy(p->pack_name, path, alloc); /* includes NUL */
> > +	hashcpy(p->sha1, sha1);
> > +	if (check_packed_git_idx(idx_path, p)) {
> > +		free(p);
> > +		return NULL;
> > +	}
> > +
> > +	return p;
> > +}
> 
> We see these two functions appear in pack.c
> 
> > diff --git a/sha1_file.c b/sha1_file.c
> > index 0de39f480..2e414f5f5 100644
> > --- a/sha1_file.c
> > +++ b/sha1_file.c
> > ...
> > @@ -1300,22 +1176,6 @@ struct packed_git *add_packed_git(const char *path, size_t path_len, int local)
> >  	return p;
> >  }
> >  
> > -struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path)
> > -{
> > -	const char *path = sha1_pack_name(sha1);
> > -	size_t alloc = st_add(strlen(path), 1);
> > -	struct packed_git *p = alloc_packed_git(alloc);
> > -
> > -	memcpy(p->pack_name, path, alloc); /* includes NUL */
> > -	hashcpy(p->sha1, sha1);
> > -	if (check_packed_git_idx(idx_path, p)) {
> > -		free(p);
> > -		return NULL;
> > -	}
> > -
> > -	return p;
> > -}
> > -
> 
> And we see parse_pack_index() came from sha1_file.c
> 
> But where did alloc_packed_git() come from?  Was the patch split
> incorrectly or something?
> 
> When I applied the whole series and did
> 
>     git blame -s -w -M -C -C master.. pack.c
> 
> expecting that pretty much everything has come from sha1_file.c but
> noticed that some lines were actually blamed to a version of pack.c
> and these functions were among them.

In this patch, alloc_packed_git() in pack.c is a duplicate of the
function of the same name in sha1_file.c, because at this point both
files still contain functions that use it. A subsequent patch in this
patch set will remove the copy in sha1_file.c.

I'll add a note explaining this to this patch in the next version.
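
As background on why both files need the helper for now: alloc_packed_git() allocates the struct together with trailing space for the pack name, and every function that creates a struct packed_git goes through it. A minimal sketch of that allocation pattern is below. The struct and function names are hypothetical, the struct is trimmed to two fields, and overflow is reported by returning NULL, whereas git's st_add() aborts instead.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for struct packed_git. */
struct pack_sketch {
	int pack_fd;
	char pack_name[];	/* flexible array member, sized at allocation */
};

/* Allocate the header plus `extra` bytes for the name; NULL on overflow. */
static struct pack_sketch *alloc_pack_sketch(size_t extra)
{
	struct pack_sketch *p;

	if (extra > SIZE_MAX - sizeof(*p))
		return NULL;
	p = malloc(sizeof(*p) + extra);
	if (!p)
		return NULL;
	memset(p, 0, sizeof(*p));
	p->pack_fd = -1;	/* "not opened yet" */
	return p;
}

/* Same call pattern as parse_pack_index(): size is strlen plus the NUL. */
static struct pack_sketch *pack_from_name_sketch(const char *path)
{
	size_t alloc;
	struct pack_sketch *p;

	assert(path);
	alloc = strlen(path) + 1;
	p = alloc_pack_sketch(alloc);
	if (p)
		memcpy(p->pack_name, path, alloc);	/* includes NUL */
	return p;
}
```

Only the fixed-size header is zeroed; the name bytes are filled in by the caller, which is why each caller computes its own `extra`.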

* Re: [RFC PATCH 01/10] pack: move pack name-related functions
  2017-08-08 20:36   ` Stefan Beller
@ 2017-08-08 20:50     ` Jonathan Tan
  2017-08-09 12:00       ` Christian Couder
  0 siblings, 1 reply; 88+ messages in thread
From: Jonathan Tan @ 2017-08-08 20:50 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git@vger.kernel.org

On Tue, 8 Aug 2017 13:36:24 -0700
Stefan Beller <sbeller@google.com> wrote:

> On Tue, Aug 8, 2017 at 12:32 PM, Jonathan Tan <jonathantanmy@google.com> wrote:
> > Currently, sha1_file.c and cache.h contain many functions, both related
> > to and unrelated to packfiles. This makes both files very large and
> > causes an unclear separation of concerns.
> >
> > Create a new file, pack.c, to hold all packfile-related functions
> > currently in sha1_file.c, and designate pack.h to hold these
> > packfile-related functions.
> 
> There are also packed refs, so one could (like I did) think that
> pack.c is for generic packing of things, maybe packfile.c
> would be more clear?

Good point. I'll use packfile.c and packfile.h in the next version.

* Re: [RFC PATCH 00/10] An attempt to move packfile funcs to its own file
  2017-08-08 20:43   ` Jonathan Tan
@ 2017-08-08 21:04     ` Junio C Hamano
  0 siblings, 0 replies; 88+ messages in thread
From: Junio C Hamano @ 2017-08-08 21:04 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

Jonathan Tan <jonathantanmy@google.com> writes:

> What do you mean by "keep the exposed surface area small enough"? If you
> mean the total number of exposed functions in sha1_file and pack (once
> everything is done), I think it will be almost the same as that
> currently in sha1_file.
> ...
> During this patch set, there might be some functions that need to be
> temporarily made global, but those are reverted to static in the end.

That is exactly what I meant.

> As stated above, I don't think so, but I'll make a list of the functions
> needing to be made global.

Good.

Thanks.

* [PATCH v2 00/25] Move exported packfile funcs to its own file
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (10 preceding siblings ...)
  2017-08-08 20:05 ` [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Junio C Hamano
@ 2017-08-09  1:22 ` Jonathan Tan
  2017-08-10 17:21   ` Stefan Beller
                     ` (2 more replies)
  2017-08-09  1:22 ` [PATCH v2 01/25] pack: move pack name-related functions Jonathan Tan
                   ` (48 subsequent siblings)
  60 siblings, 3 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-09  1:22 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, sbeller

Here is the complete patch set. I have only moved the exported functions
that operate on packfiles and their static helpers - for example,
static functions like freshen_packed_object() that are used only by
non-pack-specific functions are not moved.

In the end, 3 functions needed to be made global. They are
find_pack_entry(), mark_bad_packed_object(), and has_packed_and_bad().

Of the 3, find_pack_entry() is probably legitimately promoted. But I
think that the latter two functions needing to be accessed from
sha1_file.c points to a design that could be improved - they are only
used when packed_object_info() detects corruption, and used for marking
as bad and printing messages to the user respectively, which
packed_object_info() should probably do itself. But I have not made this
change in this patch set.

(Other than the 3 functions above, there are some variables and
functions that are temporarily made global, but reduced back to static
when the wide scope is no longer needed.)
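
To illustrate the "reduced back to static" point with one case: once install_packed_git() lives in the same file as the open-fd counter, the counter no longer needs an extern declaration. The sketch below shows that end state in a single translation unit. All names are hypothetical, the read-only accessor is an addition for illustration that is not part of this series, and the list head is kept static here only to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct packed_git. */
struct pack_node_sketch {
	int pack_fd;
	struct pack_node_sketch *next;
};

/* File-scope state: visible only inside this translation unit. */
static struct pack_node_sketch *pack_list_sketch;
static unsigned int open_fds_sketch;

/* Same shape as install_packed_git(): count the fd, push onto the list. */
void install_pack_sketch(struct pack_node_sketch *pack)
{
	assert(pack);
	if (pack->pack_fd != -1)
		open_fds_sketch++;
	pack->next = pack_list_sketch;
	pack_list_sketch = pack;
}

/* Illustrative accessor, so no caller needs `extern` on the counter. */
unsigned int open_fd_count_sketch(void)
{
	return open_fds_sketch;
}

/* Illustrative accessor for the list head. */
struct pack_node_sketch *first_pack_sketch(void)
{
	return pack_list_sketch;
}
```

While the function and the counter were in different files, the counter had to be visible across files; keeping both in one file lets the compiler enforce that nothing else modifies it.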

Jonathan Tan (25):
  pack: move pack name-related functions
  pack: move static state variables
  pack: move pack_report()
  pack: move open_pack_index(), parse_pack_index()
  pack: move release_pack_memory()
  pack: move pack-closing functions
  pack: move use_pack()
  pack: move unuse_pack()
  pack: move add_packed_git()
  pack: move install_packed_git()
  pack: move {,re}prepare_packed_git and approximate_object_count
  pack: move unpack_object_header()
  pack: move get_size_from_delta()
  pack: move unpack_object_header()
  sha1_file: set whence in storage-specific info fn
  sha1_file: remove read_packed_sha1()
  pack: move packed_object_info(), unpack_entry()
  pack: move nth_packed_object_{sha1,oid}
  pack: move check_pack_index_ptr(), nth_packed_object_offset()
  pack: move find_pack_entry_one(), is_pack_valid()
  pack: move find_sha1_pack()
  pack: move find_pack_entry() and make it global
  pack: move has_sha1_pack()
  pack: move has_pack_index()
  pack: move for_each_packed_object()

 Makefile                 |    1 +
 builtin/am.c             |    1 +
 builtin/cat-file.c       |    1 +
 builtin/clone.c          |    1 +
 builtin/count-objects.c  |    1 +
 builtin/fetch.c          |    1 +
 builtin/gc.c             |    1 +
 builtin/merge.c          |    1 +
 builtin/pack-redundant.c |    1 +
 builtin/prune-packed.c   |    1 +
 cache.h                  |  122 +--
 connected.c              |    1 +
 diff.c                   |    1 +
 git-compat-util.h        |    2 -
 http-backend.c           |    1 +
 http-push.c              |    1 +
 http-walker.c            |    1 +
 pack.h                   |  137 +++
 packfile.c               | 1905 +++++++++++++++++++++++++++++++++++
 path.c                   |    1 +
 reachable.c              |    1 +
 revision.c               |    1 +
 server-info.c            |    1 +
 sha1_file.c              | 2484 ++++++----------------------------------------
 sha1_name.c              |    1 +
 streaming.c              |    1 +
 26 files changed, 2350 insertions(+), 2321 deletions(-)
 create mode 100644 packfile.c

-- 
2.14.0.434.g98096fd7a8-goog


* [PATCH v2 01/25] pack: move pack name-related functions
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (11 preceding siblings ...)
  2017-08-09  1:22 ` [PATCH v2 00/25] Move exported " Jonathan Tan
@ 2017-08-09  1:22 ` Jonathan Tan
  2017-08-09  1:22 ` [PATCH v2 02/25] pack: move static state variables Jonathan Tan
                   ` (47 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-09  1:22 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, sbeller

Currently, sha1_file.c and cache.h contain many functions, both related
to and unrelated to packfiles. This makes both files very large and
causes an unclear separation of concerns.

Create a new file, packfile.c, to hold all packfile-related functions
currently in sha1_file.c, and designate pack.h to hold these
packfile-related functions.

In this commit, the pack name-related functions are moved. Subsequent
commits will move the other functions.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 Makefile                 |  1 +
 builtin/pack-redundant.c |  1 +
 cache.h                  | 23 -----------------------
 pack.h                   | 23 +++++++++++++++++++++++
 packfile.c               | 23 +++++++++++++++++++++++
 sha1_file.c              | 22 ----------------------
 6 files changed, 48 insertions(+), 45 deletions(-)
 create mode 100644 packfile.c

diff --git a/Makefile b/Makefile
index 461c845d3..5cdecaa17 100644
--- a/Makefile
+++ b/Makefile
@@ -816,6 +816,7 @@ LIB_OBJS += notes-merge.o
 LIB_OBJS += notes-utils.o
 LIB_OBJS += object.o
 LIB_OBJS += oidset.o
+LIB_OBJS += packfile.o
 LIB_OBJS += pack-bitmap.o
 LIB_OBJS += pack-bitmap-write.o
 LIB_OBJS += pack-check.o
diff --git a/builtin/pack-redundant.c b/builtin/pack-redundant.c
index cb1df1c76..df36d10e7 100644
--- a/builtin/pack-redundant.c
+++ b/builtin/pack-redundant.c
@@ -7,6 +7,7 @@
 */
 
 #include "builtin.h"
+#include "pack.h"
 
 #define BLKSIZE 512
 
diff --git a/cache.h b/cache.h
index 71fe09264..1f0f47819 100644
--- a/cache.h
+++ b/cache.h
@@ -902,20 +902,6 @@ extern void check_repository_format(void);
  */
 extern const char *sha1_file_name(const unsigned char *sha1);
 
-/*
- * Return the name of the (local) packfile with the specified sha1 in
- * its name.  The return value is a pointer to memory that is
- * overwritten each time this function is called.
- */
-extern char *sha1_pack_name(const unsigned char *sha1);
-
-/*
- * Return the name of the (local) pack index file with the specified
- * sha1 in its name.  The return value is a pointer to memory that is
- * overwritten each time this function is called.
- */
-extern char *sha1_pack_index_name(const unsigned char *sha1);
-
 /*
  * Return an abbreviated sha1 unique within this repository's object database.
  * The result will be at least `len` characters long, and will be NUL
@@ -1648,15 +1634,6 @@ extern void pack_report(void);
  */
 extern int odb_mkstemp(struct strbuf *template, const char *pattern);
 
-/*
- * Generate the filename to be used for a pack file with checksum "sha1" and
- * extension "ext". The result is written into the strbuf "buf", overwriting
- * any existing contents. A pointer to buf->buf is returned as a convenience.
- *
- * Example: odb_pack_name(out, sha1, "idx") => ".git/objects/pack/pack-1234..idx"
- */
-extern char *odb_pack_name(struct strbuf *buf, const unsigned char *sha1, const char *ext);
-
 /*
  * Create a pack .keep file named "name" (which should generally be the output
  * of odb_pack_name). Returns a file descriptor opened for writing, or -1 on
diff --git a/pack.h b/pack.h
index 8294341af..63bfde00c 100644
--- a/pack.h
+++ b/pack.h
@@ -101,4 +101,27 @@ extern int read_pack_header(int fd, struct pack_header *);
 extern struct sha1file *create_tmp_packfile(char **pack_tmp_name);
 extern void finish_tmp_packfile(struct strbuf *name_buffer, const char *pack_tmp_name, struct pack_idx_entry **written_list, uint32_t nr_written, struct pack_idx_option *pack_idx_opts, unsigned char sha1[]);
 
+/*
+ * Generate the filename to be used for a pack file with checksum "sha1" and
+ * extension "ext". The result is written into the strbuf "buf", overwriting
+ * any existing contents. A pointer to buf->buf is returned as a convenience.
+ *
+ * Example: odb_pack_name(out, sha1, "idx") => ".git/objects/pack/pack-1234..idx"
+ */
+extern char *odb_pack_name(struct strbuf *buf, const unsigned char *sha1, const char *ext);
+
+/*
+ * Return the name of the (local) packfile with the specified sha1 in
+ * its name.  The return value is a pointer to memory that is
+ * overwritten each time this function is called.
+ */
+extern char *sha1_pack_name(const unsigned char *sha1);
+
+/*
+ * Return the name of the (local) pack index file with the specified
+ * sha1 in its name.  The return value is a pointer to memory that is
+ * overwritten each time this function is called.
+ */
+extern char *sha1_pack_index_name(const unsigned char *sha1);
+
 #endif
diff --git a/packfile.c b/packfile.c
new file mode 100644
index 000000000..0d191dfd6
--- /dev/null
+++ b/packfile.c
@@ -0,0 +1,23 @@
+#include "cache.h"
+
+char *odb_pack_name(struct strbuf *buf,
+		    const unsigned char *sha1,
+		    const char *ext)
+{
+	strbuf_reset(buf);
+	strbuf_addf(buf, "%s/pack/pack-%s.%s", get_object_directory(),
+		    sha1_to_hex(sha1), ext);
+	return buf->buf;
+}
+
+char *sha1_pack_name(const unsigned char *sha1)
+{
+	static struct strbuf buf = STRBUF_INIT;
+	return odb_pack_name(&buf, sha1, "pack");
+}
+
+char *sha1_pack_index_name(const unsigned char *sha1)
+{
+	static struct strbuf buf = STRBUF_INIT;
+	return odb_pack_name(&buf, sha1, "idx");
+}
diff --git a/sha1_file.c b/sha1_file.c
index b60ae15f7..7e511ce9e 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -278,28 +278,6 @@ static const char *alt_sha1_path(struct alternate_object_database *alt,
 	return buf->buf;
 }
 
- char *odb_pack_name(struct strbuf *buf,
-		     const unsigned char *sha1,
-		     const char *ext)
-{
-	strbuf_reset(buf);
-	strbuf_addf(buf, "%s/pack/pack-%s.%s", get_object_directory(),
-		    sha1_to_hex(sha1), ext);
-	return buf->buf;
-}
-
-char *sha1_pack_name(const unsigned char *sha1)
-{
-	static struct strbuf buf = STRBUF_INIT;
-	return odb_pack_name(&buf, sha1, "pack");
-}
-
-char *sha1_pack_index_name(const unsigned char *sha1)
-{
-	static struct strbuf buf = STRBUF_INIT;
-	return odb_pack_name(&buf, sha1, "idx");
-}
-
 struct alternate_object_database *alt_odb_list;
 static struct alternate_object_database **alt_odb_tail;
 
-- 
2.14.0.434.g98096fd7a8-goog
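
A note on the contract documented for sha1_pack_name() and
sha1_pack_index_name() in the patch above: each returns a pointer into its
own function-local static strbuf, so the result is only valid until the next
call to the same function. The following is a minimal sketch of that pattern,
not Git code. demo_pack_name() and demo_pack_name_dup() are hypothetical
names, a fixed-size char array stands in for the strbuf, and the path prefix
is made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for sha1_pack_name(): formats the name into a
 * single static buffer that every call reuses.
 */
const char *demo_pack_name(const char *hex)
{
	static char buf[128];

	assert(hex != NULL);
	snprintf(buf, sizeof(buf), "objects/pack/pack-%s.pack", hex);
	return buf; /* same address on every call */
}

/*
 * Copy the result when it has to survive a later call.
 * The caller owns the returned memory and must free() it.
 */
char *demo_pack_name_dup(const char *hex)
{
	const char *name = demo_pack_name(hex);
	size_t len = strlen(name) + 1; /* include the NUL */
	char *copy = malloc(len);

	if (copy)
		memcpy(copy, name, len);
	return copy;
}
```

A caller that needs two names at once would keep the first through
demo_pack_name_dup() before asking for the second. This is the same reason
parse_pack_index() later in this series copies the result of sha1_pack_name()
into p->pack_name rather than storing the pointer.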



* [PATCH v2 02/25] pack: move static state variables
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (12 preceding siblings ...)
  2017-08-09  1:22 ` [PATCH v2 01/25] pack: move pack name-related functions Jonathan Tan
@ 2017-08-09  1:22 ` Jonathan Tan
  2017-08-09  1:22 ` [PATCH v2 03/25] pack: move pack_report() Jonathan Tan
                   ` (46 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-09  1:22 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, sbeller

sha1_file.c declares some static variables that store packfile-related
state. Move them to packfile.c.

They are temporarily made global, but subsequent commits will restore
their scope to static.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 pack.h      |  9 +++++++++
 packfile.c  | 14 ++++++++++++++
 sha1_file.c | 13 -------------
 3 files changed, 23 insertions(+), 13 deletions(-)

diff --git a/pack.h b/pack.h
index 63bfde00c..7fcd45f7b 100644
--- a/pack.h
+++ b/pack.h
@@ -124,4 +124,13 @@ extern char *sha1_pack_name(const unsigned char *sha1);
  */
 extern char *sha1_pack_index_name(const unsigned char *sha1);
 
+extern unsigned int pack_used_ctr;
+extern unsigned int pack_mmap_calls;
+extern unsigned int peak_pack_open_windows;
+extern unsigned int pack_open_windows;
+extern unsigned int pack_open_fds;
+extern unsigned int pack_max_fds;
+extern size_t peak_pack_mapped;
+extern size_t pack_mapped;
+
 #endif
diff --git a/packfile.c b/packfile.c
index 0d191dfd6..0f46e0617 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1,4 +1,5 @@
 #include "cache.h"
+#include "mru.h"
 
 char *odb_pack_name(struct strbuf *buf,
 		    const unsigned char *sha1,
@@ -21,3 +22,16 @@ char *sha1_pack_index_name(const unsigned char *sha1)
 	static struct strbuf buf = STRBUF_INIT;
 	return odb_pack_name(&buf, sha1, "idx");
 }
+
+unsigned int pack_used_ctr;
+unsigned int pack_mmap_calls;
+unsigned int peak_pack_open_windows;
+unsigned int pack_open_windows;
+unsigned int pack_open_fds;
+unsigned int pack_max_fds;
+size_t peak_pack_mapped;
+size_t pack_mapped;
+struct packed_git *packed_git;
+
+static struct mru packed_git_mru_storage;
+struct mru *packed_git_mru = &packed_git_mru_storage;
diff --git a/sha1_file.c b/sha1_file.c
index 7e511ce9e..4d95e21eb 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -682,19 +682,6 @@ static int has_loose_object(const unsigned char *sha1)
 	return check_and_freshen(sha1, 0);
 }
 
-static unsigned int pack_used_ctr;
-static unsigned int pack_mmap_calls;
-static unsigned int peak_pack_open_windows;
-static unsigned int pack_open_windows;
-static unsigned int pack_open_fds;
-static unsigned int pack_max_fds;
-static size_t peak_pack_mapped;
-static size_t pack_mapped;
-struct packed_git *packed_git;
-
-static struct mru packed_git_mru_storage;
-struct mru *packed_git_mru = &packed_git_mru_storage;
-
 void pack_report(void)
 {
 	fprintf(stderr,
-- 
2.14.0.434.g98096fd7a8-goog



* [PATCH v2 03/25] pack: move pack_report()
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (13 preceding siblings ...)
  2017-08-09  1:22 ` [PATCH v2 02/25] pack: move static state variables Jonathan Tan
@ 2017-08-09  1:22 ` Jonathan Tan
  2017-08-09  1:22 ` [PATCH v2 04/25] pack: move open_pack_index(), parse_pack_index() Jonathan Tan
                   ` (45 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-09  1:22 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, sbeller

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 cache.h     |  2 --
 pack.h      |  2 ++
 packfile.c  | 24 ++++++++++++++++++++++++
 sha1_file.c | 24 ------------------------
 4 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/cache.h b/cache.h
index 1f0f47819..c7f802e4a 100644
--- a/cache.h
+++ b/cache.h
@@ -1624,8 +1624,6 @@ unsigned long approximate_object_count(void);
 extern struct packed_git *find_sha1_pack(const unsigned char *sha1,
 					 struct packed_git *packs);
 
-extern void pack_report(void);
-
 /*
  * Create a temporary file rooted in the object database directory, or
  * die on failure. The filename is taken from "pattern", which should have the
diff --git a/pack.h b/pack.h
index 7fcd45f7b..6098bfe40 100644
--- a/pack.h
+++ b/pack.h
@@ -133,4 +133,6 @@ extern unsigned int pack_max_fds;
 extern size_t peak_pack_mapped;
 extern size_t pack_mapped;
 
+extern void pack_report(void);
+
 #endif
diff --git a/packfile.c b/packfile.c
index 0f46e0617..60d9fc3b0 100644
--- a/packfile.c
+++ b/packfile.c
@@ -35,3 +35,27 @@ struct packed_git *packed_git;
 
 static struct mru packed_git_mru_storage;
 struct mru *packed_git_mru = &packed_git_mru_storage;
+
+#define SZ_FMT PRIuMAX
+static inline uintmax_t sz_fmt(size_t s) { return s; }
+
+void pack_report(void)
+{
+	fprintf(stderr,
+		"pack_report: getpagesize()            = %10" SZ_FMT "\n"
+		"pack_report: core.packedGitWindowSize = %10" SZ_FMT "\n"
+		"pack_report: core.packedGitLimit      = %10" SZ_FMT "\n",
+		sz_fmt(getpagesize()),
+		sz_fmt(packed_git_window_size),
+		sz_fmt(packed_git_limit));
+	fprintf(stderr,
+		"pack_report: pack_used_ctr            = %10u\n"
+		"pack_report: pack_mmap_calls          = %10u\n"
+		"pack_report: pack_open_windows        = %10u / %10u\n"
+		"pack_report: pack_mapped              = "
+			"%10" SZ_FMT " / %10" SZ_FMT "\n",
+		pack_used_ctr,
+		pack_mmap_calls,
+		pack_open_windows, peak_pack_open_windows,
+		sz_fmt(pack_mapped), sz_fmt(peak_pack_mapped));
+}
diff --git a/sha1_file.c b/sha1_file.c
index 4d95e21eb..0de39f480 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -29,9 +29,6 @@
 #include "mergesort.h"
 #include "quote.h"
 
-#define SZ_FMT PRIuMAX
-static inline uintmax_t sz_fmt(size_t s) { return s; }
-
 const unsigned char null_sha1[20];
 const struct object_id null_oid;
 const struct object_id empty_tree_oid = {
@@ -682,27 +679,6 @@ static int has_loose_object(const unsigned char *sha1)
 	return check_and_freshen(sha1, 0);
 }
 
-void pack_report(void)
-{
-	fprintf(stderr,
-		"pack_report: getpagesize()            = %10" SZ_FMT "\n"
-		"pack_report: core.packedGitWindowSize = %10" SZ_FMT "\n"
-		"pack_report: core.packedGitLimit      = %10" SZ_FMT "\n",
-		sz_fmt(getpagesize()),
-		sz_fmt(packed_git_window_size),
-		sz_fmt(packed_git_limit));
-	fprintf(stderr,
-		"pack_report: pack_used_ctr            = %10u\n"
-		"pack_report: pack_mmap_calls          = %10u\n"
-		"pack_report: pack_open_windows        = %10u / %10u\n"
-		"pack_report: pack_mapped              = "
-			"%10" SZ_FMT " / %10" SZ_FMT "\n",
-		pack_used_ctr,
-		pack_mmap_calls,
-		pack_open_windows, peak_pack_open_windows,
-		sz_fmt(pack_mapped), sz_fmt(peak_pack_mapped));
-}
-
 /*
  * Open and mmap the index file at path, perform a couple of
  * consistency checks, then record its information to p.  Return 0 on
-- 
2.14.0.434.g98096fd7a8-goog
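
A note on the SZ_FMT / sz_fmt() pair that moves with pack_report() above:
size_t has no portable printf conversion in pre-C99 libraries, so the value is
widened to uintmax_t and printed with PRIuMAX. A minimal sketch of the same
idiom follows. demo_format_mapped() is a hypothetical helper that writes to a
caller-supplied buffer instead of stderr; it is not part of the patch.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>

/* Widen size_t so the argument type matches the PRIuMAX conversion. */
static inline uintmax_t sz_fmt(size_t s) { return s; }

/*
 * Format "current / peak" the way pack_report() does for pack_mapped,
 * each value right-aligned in a 10-character field.
 * Returns the value reported by snprintf().
 */
int demo_format_mapped(char *out, size_t outlen, size_t cur, size_t peak)
{
	assert(out != NULL && outlen > 0);
	return snprintf(out, outlen, "%10" PRIuMAX " / %10" PRIuMAX,
			sz_fmt(cur), sz_fmt(peak));
}
```

Passing a bare size_t to a PRIuMAX conversion would be undefined behaviour on
platforms where the two types differ in width, which is why the helper does
the conversion through its return type.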



* [PATCH v2 04/25] pack: move open_pack_index(), parse_pack_index()
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (14 preceding siblings ...)
  2017-08-09  1:22 ` [PATCH v2 03/25] pack: move pack_report() Jonathan Tan
@ 2017-08-09  1:22 ` Jonathan Tan
  2017-08-09  1:22 ` [PATCH v2 05/25] pack: move release_pack_memory() Jonathan Tan
                   ` (44 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-09  1:22 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, sbeller

alloc_packed_git() in packfile.c is duplicated from sha1_file.c. In a
subsequent commit, alloc_packed_git() will be removed from sha1_file.c.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 builtin/count-objects.c |   1 +
 cache.h                 |   8 ---
 pack.h                  |   8 +++
 packfile.c              | 149 ++++++++++++++++++++++++++++++++++++++++++++++++
 sha1_file.c             | 140 ---------------------------------------------
 sha1_name.c             |   1 +
 6 files changed, 159 insertions(+), 148 deletions(-)

diff --git a/builtin/count-objects.c b/builtin/count-objects.c
index 1d82e61f2..185d3190a 100644
--- a/builtin/count-objects.c
+++ b/builtin/count-objects.c
@@ -10,6 +10,7 @@
 #include "builtin.h"
 #include "parse-options.h"
 #include "quote.h"
+#include "pack.h"
 
 static unsigned long garbage;
 static off_t size_garbage;
diff --git a/cache.h b/cache.h
index c7f802e4a..5d6839525 100644
--- a/cache.h
+++ b/cache.h
@@ -1603,8 +1603,6 @@ struct pack_entry {
 	struct packed_git *p;
 };
 
-extern struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path);
-
 /* A hook to report invalid files in pack directory */
 #define PACKDIR_FILE_PACK 1
 #define PACKDIR_FILE_IDX 2
@@ -1639,12 +1637,6 @@ extern int odb_mkstemp(struct strbuf *template, const char *pattern);
  */
 extern int odb_pack_keep(const char *name);
 
-/*
- * mmap the index file for the specified packfile (if it is not
- * already mmapped).  Return 0 on success.
- */
-extern int open_pack_index(struct packed_git *);
-
 /*
  * munmap the index file for the specified packfile (if it is
  * currently mmapped).
diff --git a/pack.h b/pack.h
index 6098bfe40..5be0ed42a 100644
--- a/pack.h
+++ b/pack.h
@@ -135,4 +135,12 @@ extern size_t pack_mapped;
 
 extern void pack_report(void);
 
+/*
+ * mmap the index file for the specified packfile (if it is not
+ * already mmapped).  Return 0 on success.
+ */
+extern int open_pack_index(struct packed_git *);
+
+extern struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path);
+
 #endif
diff --git a/packfile.c b/packfile.c
index 60d9fc3b0..6edc43228 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1,5 +1,6 @@
 #include "cache.h"
 #include "mru.h"
+#include "pack.h"
 
 char *odb_pack_name(struct strbuf *buf,
 		    const unsigned char *sha1,
@@ -59,3 +60,151 @@ void pack_report(void)
 		pack_open_windows, peak_pack_open_windows,
 		sz_fmt(pack_mapped), sz_fmt(peak_pack_mapped));
 }
+
+/*
+ * Open and mmap the index file at path, perform a couple of
+ * consistency checks, then record its information to p.  Return 0 on
+ * success.
+ */
+static int check_packed_git_idx(const char *path, struct packed_git *p)
+{
+	void *idx_map;
+	struct pack_idx_header *hdr;
+	size_t idx_size;
+	uint32_t version, nr, i, *index;
+	int fd = git_open(path);
+	struct stat st;
+
+	if (fd < 0)
+		return -1;
+	if (fstat(fd, &st)) {
+		close(fd);
+		return -1;
+	}
+	idx_size = xsize_t(st.st_size);
+	if (idx_size < 4 * 256 + 20 + 20) {
+		close(fd);
+		return error("index file %s is too small", path);
+	}
+	idx_map = xmmap(NULL, idx_size, PROT_READ, MAP_PRIVATE, fd, 0);
+	close(fd);
+
+	hdr = idx_map;
+	if (hdr->idx_signature == htonl(PACK_IDX_SIGNATURE)) {
+		version = ntohl(hdr->idx_version);
+		if (version < 2 || version > 2) {
+			munmap(idx_map, idx_size);
+			return error("index file %s is version %"PRIu32
+				     " and is not supported by this binary"
+				     " (try upgrading GIT to a newer version)",
+				     path, version);
+		}
+	} else
+		version = 1;
+
+	nr = 0;
+	index = idx_map;
+	if (version > 1)
+		index += 2;  /* skip index header */
+	for (i = 0; i < 256; i++) {
+		uint32_t n = ntohl(index[i]);
+		if (n < nr) {
+			munmap(idx_map, idx_size);
+			return error("non-monotonic index %s", path);
+		}
+		nr = n;
+	}
+
+	if (version == 1) {
+		/*
+		 * Total size:
+		 *  - 256 index entries 4 bytes each
+		 *  - 24-byte entries * nr (20-byte sha1 + 4-byte offset)
+		 *  - 20-byte SHA1 of the packfile
+		 *  - 20-byte SHA1 file checksum
+		 */
+		if (idx_size != 4*256 + nr * 24 + 20 + 20) {
+			munmap(idx_map, idx_size);
+			return error("wrong index v1 file size in %s", path);
+		}
+	} else if (version == 2) {
+		/*
+		 * Minimum size:
+		 *  - 8 bytes of header
+		 *  - 256 index entries 4 bytes each
+		 *  - 20-byte sha1 entry * nr
+		 *  - 4-byte crc entry * nr
+		 *  - 4-byte offset entry * nr
+		 *  - 20-byte SHA1 of the packfile
+		 *  - 20-byte SHA1 file checksum
+		 * And after the 4-byte offset table might be a
+		 * variable sized table containing 8-byte entries
+		 * for offsets larger than 2^31.
+		 */
+		unsigned long min_size = 8 + 4*256 + nr*(20 + 4 + 4) + 20 + 20;
+		unsigned long max_size = min_size;
+		if (nr)
+			max_size += (nr - 1)*8;
+		if (idx_size < min_size || idx_size > max_size) {
+			munmap(idx_map, idx_size);
+			return error("wrong index v2 file size in %s", path);
+		}
+		if (idx_size != min_size &&
+		    /*
+		     * make sure we can deal with large pack offsets.
+		     * 31-bit signed offset won't be enough, neither
+		     * 32-bit unsigned one will be.
+		     */
+		    (sizeof(off_t) <= 4)) {
+			munmap(idx_map, idx_size);
+			return error("pack too large for current definition of off_t in %s", path);
+		}
+	}
+
+	p->index_version = version;
+	p->index_data = idx_map;
+	p->index_size = idx_size;
+	p->num_objects = nr;
+	return 0;
+}
+
+int open_pack_index(struct packed_git *p)
+{
+	char *idx_name;
+	size_t len;
+	int ret;
+
+	if (p->index_data)
+		return 0;
+
+	if (!strip_suffix(p->pack_name, ".pack", &len))
+		die("BUG: pack_name does not end in .pack");
+	idx_name = xstrfmt("%.*s.idx", (int)len, p->pack_name);
+	ret = check_packed_git_idx(idx_name, p);
+	free(idx_name);
+	return ret;
+}
+
+static struct packed_git *alloc_packed_git(int extra)
+{
+	struct packed_git *p = xmalloc(st_add(sizeof(*p), extra));
+	memset(p, 0, sizeof(*p));
+	p->pack_fd = -1;
+	return p;
+}
+
+struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path)
+{
+	const char *path = sha1_pack_name(sha1);
+	size_t alloc = st_add(strlen(path), 1);
+	struct packed_git *p = alloc_packed_git(alloc);
+
+	memcpy(p->pack_name, path, alloc); /* includes NUL */
+	hashcpy(p->sha1, sha1);
+	if (check_packed_git_idx(idx_path, p)) {
+		free(p);
+		return NULL;
+	}
+
+	return p;
+}
diff --git a/sha1_file.c b/sha1_file.c
index 0de39f480..2e414f5f5 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -679,130 +679,6 @@ static int has_loose_object(const unsigned char *sha1)
 	return check_and_freshen(sha1, 0);
 }
 
-/*
- * Open and mmap the index file at path, perform a couple of
- * consistency checks, then record its information to p.  Return 0 on
- * success.
- */
-static int check_packed_git_idx(const char *path, struct packed_git *p)
-{
-	void *idx_map;
-	struct pack_idx_header *hdr;
-	size_t idx_size;
-	uint32_t version, nr, i, *index;
-	int fd = git_open(path);
-	struct stat st;
-
-	if (fd < 0)
-		return -1;
-	if (fstat(fd, &st)) {
-		close(fd);
-		return -1;
-	}
-	idx_size = xsize_t(st.st_size);
-	if (idx_size < 4 * 256 + 20 + 20) {
-		close(fd);
-		return error("index file %s is too small", path);
-	}
-	idx_map = xmmap(NULL, idx_size, PROT_READ, MAP_PRIVATE, fd, 0);
-	close(fd);
-
-	hdr = idx_map;
-	if (hdr->idx_signature == htonl(PACK_IDX_SIGNATURE)) {
-		version = ntohl(hdr->idx_version);
-		if (version < 2 || version > 2) {
-			munmap(idx_map, idx_size);
-			return error("index file %s is version %"PRIu32
-				     " and is not supported by this binary"
-				     " (try upgrading GIT to a newer version)",
-				     path, version);
-		}
-	} else
-		version = 1;
-
-	nr = 0;
-	index = idx_map;
-	if (version > 1)
-		index += 2;  /* skip index header */
-	for (i = 0; i < 256; i++) {
-		uint32_t n = ntohl(index[i]);
-		if (n < nr) {
-			munmap(idx_map, idx_size);
-			return error("non-monotonic index %s", path);
-		}
-		nr = n;
-	}
-
-	if (version == 1) {
-		/*
-		 * Total size:
-		 *  - 256 index entries 4 bytes each
-		 *  - 24-byte entries * nr (20-byte sha1 + 4-byte offset)
-		 *  - 20-byte SHA1 of the packfile
-		 *  - 20-byte SHA1 file checksum
-		 */
-		if (idx_size != 4*256 + nr * 24 + 20 + 20) {
-			munmap(idx_map, idx_size);
-			return error("wrong index v1 file size in %s", path);
-		}
-	} else if (version == 2) {
-		/*
-		 * Minimum size:
-		 *  - 8 bytes of header
-		 *  - 256 index entries 4 bytes each
-		 *  - 20-byte sha1 entry * nr
-		 *  - 4-byte crc entry * nr
-		 *  - 4-byte offset entry * nr
-		 *  - 20-byte SHA1 of the packfile
-		 *  - 20-byte SHA1 file checksum
-		 * And after the 4-byte offset table might be a
-		 * variable sized table containing 8-byte entries
-		 * for offsets larger than 2^31.
-		 */
-		unsigned long min_size = 8 + 4*256 + nr*(20 + 4 + 4) + 20 + 20;
-		unsigned long max_size = min_size;
-		if (nr)
-			max_size += (nr - 1)*8;
-		if (idx_size < min_size || idx_size > max_size) {
-			munmap(idx_map, idx_size);
-			return error("wrong index v2 file size in %s", path);
-		}
-		if (idx_size != min_size &&
-		    /*
-		     * make sure we can deal with large pack offsets.
-		     * 31-bit signed offset won't be enough, neither
-		     * 32-bit unsigned one will be.
-		     */
-		    (sizeof(off_t) <= 4)) {
-			munmap(idx_map, idx_size);
-			return error("pack too large for current definition of off_t in %s", path);
-		}
-	}
-
-	p->index_version = version;
-	p->index_data = idx_map;
-	p->index_size = idx_size;
-	p->num_objects = nr;
-	return 0;
-}
-
-int open_pack_index(struct packed_git *p)
-{
-	char *idx_name;
-	size_t len;
-	int ret;
-
-	if (p->index_data)
-		return 0;
-
-	if (!strip_suffix(p->pack_name, ".pack", &len))
-		die("BUG: pack_name does not end in .pack");
-	idx_name = xstrfmt("%.*s.idx", (int)len, p->pack_name);
-	ret = check_packed_git_idx(idx_name, p);
-	free(idx_name);
-	return ret;
-}
-
 static void scan_windows(struct packed_git *p,
 	struct packed_git **lru_p,
 	struct pack_window **lru_w,
@@ -1300,22 +1176,6 @@ struct packed_git *add_packed_git(const char *path, size_t path_len, int local)
 	return p;
 }
 
-struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path)
-{
-	const char *path = sha1_pack_name(sha1);
-	size_t alloc = st_add(strlen(path), 1);
-	struct packed_git *p = alloc_packed_git(alloc);
-
-	memcpy(p->pack_name, path, alloc); /* includes NUL */
-	hashcpy(p->sha1, sha1);
-	if (check_packed_git_idx(idx_path, p)) {
-		free(p);
-		return NULL;
-	}
-
-	return p;
-}
-
 void install_packed_git(struct packed_git *pack)
 {
 	if (pack->pack_fd != -1)
diff --git a/sha1_name.c b/sha1_name.c
index 74fcb6d78..28b7c9fd8 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -9,6 +9,7 @@
 #include "remote.h"
 #include "dir.h"
 #include "sha1-array.h"
+#include "pack.h"
 
 static int get_sha1_oneline(const char *, unsigned char *, struct commit_list *);
 
-- 
2.14.0.434.g98096fd7a8-goog
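
A note on the size checks in check_packed_git_idx(), moved by the patch above:
the expected file size follows directly from the object count read from the
last fanout entry. The sketch below restates that arithmetic as standalone
functions. The demo_* names are hypothetical, and unlike the original (which
uses unsigned long) the sketch computes in size_t and ignores overflow on
platforms with a 32-bit size_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * v1: 256 fanout entries of 4 bytes, nr entries of 24 bytes
 * (20-byte SHA-1 + 4-byte offset), then two 20-byte SHA-1 trailers.
 */
size_t demo_idx_v1_size(uint32_t nr)
{
	return 4 * 256 + (size_t)nr * 24 + 20 + 20;
}

/*
 * v2 minimum: 8-byte header, fanout, then per object a 20-byte SHA-1,
 * a 4-byte CRC and a 4-byte offset, then the two trailers.
 */
size_t demo_idx_v2_min_size(uint32_t nr)
{
	return 8 + 4 * 256 + (size_t)nr * (20 + 4 + 4) + 20 + 20;
}

/* v2 maximum: at most nr - 1 extra 8-byte entries for large offsets. */
size_t demo_idx_v2_max_size(uint32_t nr)
{
	size_t max = demo_idx_v2_min_size(nr);

	if (nr)
		max += ((size_t)nr - 1) * 8;
	return max;
}

/* Nonzero if idx_size is plausible for a v2 index holding nr objects. */
int demo_idx_v2_size_ok(size_t idx_size, uint32_t nr)
{
	return idx_size >= demo_idx_v2_min_size(nr) &&
	       idx_size <= demo_idx_v2_max_size(nr);
}
```

For three objects this gives 1136 bytes for v1 and a range of 1156 to 1172
bytes for v2. A v1 index must match its size exactly, while a v2 index is
accepted anywhere in the range because the 8-byte offset table is optional.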



* [PATCH v2 05/25] pack: move release_pack_memory()
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (15 preceding siblings ...)
  2017-08-09  1:22 ` [PATCH v2 04/25] pack: move open_pack_index(), parse_pack_index() Jonathan Tan
@ 2017-08-09  1:22 ` Jonathan Tan
  2017-08-09  1:22 ` [PATCH v2 06/25] pack: move pack-closing functions Jonathan Tan
                   ` (43 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-09  1:22 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, sbeller

The function unuse_one_window() needs to be temporarily made global. Its
scope will be restored to static in a subsequent commit.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 git-compat-util.h |  2 --
 pack.h            |  4 ++++
 packfile.c        | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
 sha1_file.c       | 49 -------------------------------------------------
 4 files changed, 53 insertions(+), 51 deletions(-)

diff --git a/git-compat-util.h b/git-compat-util.h
index db9c22de7..201056e2d 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -749,8 +749,6 @@ const char *inet_ntop(int af, const void *src, char *dst, size_t size);
 extern int git_atexit(void (*handler)(void));
 #endif
 
-extern void release_pack_memory(size_t);
-
 typedef void (*try_to_free_t)(size_t);
 extern try_to_free_t set_try_to_free_routine(try_to_free_t);
 
diff --git a/pack.h b/pack.h
index 5be0ed42a..c16220586 100644
--- a/pack.h
+++ b/pack.h
@@ -143,4 +143,8 @@ extern int open_pack_index(struct packed_git *);
 
 extern struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path);
 
+extern int unuse_one_window(struct packed_git *current);
+
+extern void release_pack_memory(size_t);
+
 #endif
diff --git a/packfile.c b/packfile.c
index 6edc43228..8daa74ad1 100644
--- a/packfile.c
+++ b/packfile.c
@@ -208,3 +208,52 @@ struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path)
 
 	return p;
 }
+
+static void scan_windows(struct packed_git *p,
+	struct packed_git **lru_p,
+	struct pack_window **lru_w,
+	struct pack_window **lru_l)
+{
+	struct pack_window *w, *w_l;
+
+	for (w_l = NULL, w = p->windows; w; w = w->next) {
+		if (!w->inuse_cnt) {
+			if (!*lru_w || w->last_used < (*lru_w)->last_used) {
+				*lru_p = p;
+				*lru_w = w;
+				*lru_l = w_l;
+			}
+		}
+		w_l = w;
+	}
+}
+
+int unuse_one_window(struct packed_git *current)
+{
+	struct packed_git *p, *lru_p = NULL;
+	struct pack_window *lru_w = NULL, *lru_l = NULL;
+
+	if (current)
+		scan_windows(current, &lru_p, &lru_w, &lru_l);
+	for (p = packed_git; p; p = p->next)
+		scan_windows(p, &lru_p, &lru_w, &lru_l);
+	if (lru_p) {
+		munmap(lru_w->base, lru_w->len);
+		pack_mapped -= lru_w->len;
+		if (lru_l)
+			lru_l->next = lru_w->next;
+		else
+			lru_p->windows = lru_w->next;
+		free(lru_w);
+		pack_open_windows--;
+		return 1;
+	}
+	return 0;
+}
+
+void release_pack_memory(size_t need)
+{
+	size_t cur = pack_mapped;
+	while (need >= (cur - pack_mapped) && unuse_one_window(NULL))
+		; /* nothing */
+}
diff --git a/sha1_file.c b/sha1_file.c
index 2e414f5f5..644876e4e 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -679,55 +679,6 @@ static int has_loose_object(const unsigned char *sha1)
 	return check_and_freshen(sha1, 0);
 }
 
-static void scan_windows(struct packed_git *p,
-	struct packed_git **lru_p,
-	struct pack_window **lru_w,
-	struct pack_window **lru_l)
-{
-	struct pack_window *w, *w_l;
-
-	for (w_l = NULL, w = p->windows; w; w = w->next) {
-		if (!w->inuse_cnt) {
-			if (!*lru_w || w->last_used < (*lru_w)->last_used) {
-				*lru_p = p;
-				*lru_w = w;
-				*lru_l = w_l;
-			}
-		}
-		w_l = w;
-	}
-}
-
-static int unuse_one_window(struct packed_git *current)
-{
-	struct packed_git *p, *lru_p = NULL;
-	struct pack_window *lru_w = NULL, *lru_l = NULL;
-
-	if (current)
-		scan_windows(current, &lru_p, &lru_w, &lru_l);
-	for (p = packed_git; p; p = p->next)
-		scan_windows(p, &lru_p, &lru_w, &lru_l);
-	if (lru_p) {
-		munmap(lru_w->base, lru_w->len);
-		pack_mapped -= lru_w->len;
-		if (lru_l)
-			lru_l->next = lru_w->next;
-		else
-			lru_p->windows = lru_w->next;
-		free(lru_w);
-		pack_open_windows--;
-		return 1;
-	}
-	return 0;
-}
-
-void release_pack_memory(size_t need)
-{
-	size_t cur = pack_mapped;
-	while (need >= (cur - pack_mapped) && unuse_one_window(NULL))
-		; /* nothing */
-}
-
 static void mmap_limit_check(size_t length)
 {
 	static size_t limit = 0;
-- 
2.14.0.434.g98096fd7a8-goog
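
A note on scan_windows() and unuse_one_window(), moved by the patch above:
together they find the least recently used window that nobody holds
(inuse_cnt == 0), remember its predecessor, and unlink it from a singly linked
list. The sketch below shows that technique on a single list. struct
demo_window, demo_push() and demo_evict_lru() are hypothetical
simplifications: there is no munmap() and no list of packs, and window lengths
are assumed to be nonzero so that 0 can signal "nothing evicted".

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct demo_window {
	struct demo_window *next;
	unsigned int last_used;	/* smaller means older */
	unsigned int inuse_cnt;	/* nonzero means pinned */
	size_t len;
};

/* Prepend a new window to the list; returns NULL on allocation failure. */
struct demo_window *demo_push(struct demo_window **head,
			      unsigned int last_used,
			      unsigned int inuse_cnt, size_t len)
{
	struct demo_window *w = calloc(1, sizeof(*w));

	if (!w)
		return NULL;
	w->last_used = last_used;
	w->inuse_cnt = inuse_cnt;
	w->len = len;
	w->next = *head;
	*head = w;
	return w;
}

/*
 * Free the least recently used idle window in *head.
 * Returns its length, or 0 if every window is in use.
 */
size_t demo_evict_lru(struct demo_window **head)
{
	struct demo_window *w, *prev = NULL;
	struct demo_window *lru = NULL, *lru_prev = NULL;
	size_t len;

	assert(head != NULL);
	for (w = *head; w; prev = w, w = w->next) {
		if (w->inuse_cnt)
			continue; /* pinned windows are never candidates */
		if (!lru || w->last_used < lru->last_used) {
			lru = w;
			lru_prev = prev;
		}
	}
	if (!lru)
		return 0;
	if (lru_prev)
		lru_prev->next = lru->next;
	else
		*head = lru->next; /* the victim was the list head */
	len = lru->len;
	free(lru);
	return len;
}
```

release_pack_memory() in the patch is then a loop around this step: it keeps
evicting until the total freed exceeds the requested amount or no idle window
remains.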



* [PATCH v2 06/25] pack: move pack-closing functions
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (16 preceding siblings ...)
  2017-08-09  1:22 ` [PATCH v2 05/25] pack: move release_pack_memory() Jonathan Tan
@ 2017-08-09  1:22 ` Jonathan Tan
  2017-08-09  1:22 ` [PATCH v2 07/25] pack: move use_pack() Jonathan Tan
                   ` (42 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-09  1:22 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, sbeller

The function close_pack_fd() needs to be temporarily made global. Its
scope will be restored to static in a subsequent commit.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 builtin/am.c    |  1 +
 builtin/clone.c |  1 +
 builtin/fetch.c |  1 +
 builtin/merge.c |  1 +
 cache.h         |  8 --------
 pack.h          |  9 +++++++++
 packfile.c      | 54 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 sha1_file.c     | 55 -------------------------------------------------------
 8 files changed, 67 insertions(+), 63 deletions(-)

diff --git a/builtin/am.c b/builtin/am.c
index c973bd96d..c38dd10a3 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -31,6 +31,7 @@
 #include "mailinfo.h"
 #include "apply.h"
 #include "string-list.h"
+#include "pack.h"
 
 /**
  * Returns 1 if the file is empty or does not exist, 0 otherwise.
diff --git a/builtin/clone.c b/builtin/clone.c
index 08b5cc433..53410a45d 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -25,6 +25,7 @@
 #include "remote.h"
 #include "run-command.h"
 #include "connected.h"
+#include "pack.h"
 
 /*
  * Overall FIXMEs:
diff --git a/builtin/fetch.c b/builtin/fetch.c
index c87e59f3b..196a3bfc4 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -17,6 +17,7 @@
 #include "connected.h"
 #include "argv-array.h"
 #include "utf8.h"
+#include "pack.h"
 
 static const char * const builtin_fetch_usage[] = {
 	N_("git fetch [<options>] [<repository> [<refspec>...]]"),
diff --git a/builtin/merge.c b/builtin/merge.c
index 900bafdb4..9cff4b276 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -32,6 +32,7 @@
 #include "gpg-interface.h"
 #include "sequencer.h"
 #include "string-list.h"
+#include "pack.h"
 
 #define DEFAULT_TWOHEAD (1<<0)
 #define DEFAULT_OCTOPUS (1<<1)
diff --git a/cache.h b/cache.h
index 5d6839525..25a21a61f 100644
--- a/cache.h
+++ b/cache.h
@@ -1637,15 +1637,7 @@ extern int odb_mkstemp(struct strbuf *template, const char *pattern);
  */
 extern int odb_pack_keep(const char *name);
 
-/*
- * munmap the index file for the specified packfile (if it is
- * currently mmapped).
- */
-extern void close_pack_index(struct packed_git *);
-
 extern unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t, unsigned long *);
-extern void close_pack_windows(struct packed_git *);
-extern void close_all_packs(void);
 extern void unuse_pack(struct pack_window **);
 extern void clear_delta_base_cache(void);
 extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local);
diff --git a/pack.h b/pack.h
index c16220586..fd4668528 100644
--- a/pack.h
+++ b/pack.h
@@ -147,4 +147,13 @@ extern int unuse_one_window(struct packed_git *current);
 
 extern void release_pack_memory(size_t);
 
+extern void close_pack_windows(struct packed_git *);
+extern int close_pack_fd(struct packed_git *);
+/*
+ * munmap the index file for the specified packfile (if it is
+ * currently mmapped).
+ */
+extern void close_pack_index(struct packed_git *);
+extern void close_all_packs(void);
+
 #endif
diff --git a/packfile.c b/packfile.c
index 8daa74ad1..c8e2dbdee 100644
--- a/packfile.c
+++ b/packfile.c
@@ -257,3 +257,57 @@ void release_pack_memory(size_t need)
 	while (need >= (cur - pack_mapped) && unuse_one_window(NULL))
 		; /* nothing */
 }
+
+void close_pack_windows(struct packed_git *p)
+{
+	while (p->windows) {
+		struct pack_window *w = p->windows;
+
+		if (w->inuse_cnt)
+			die("pack '%s' still has open windows to it",
+			    p->pack_name);
+		munmap(w->base, w->len);
+		pack_mapped -= w->len;
+		pack_open_windows--;
+		p->windows = w->next;
+		free(w);
+	}
+}
+
+int close_pack_fd(struct packed_git *p)
+{
+	if (p->pack_fd < 0)
+		return 0;
+
+	close(p->pack_fd);
+	pack_open_fds--;
+	p->pack_fd = -1;
+
+	return 1;
+}
+
+void close_pack_index(struct packed_git *p)
+{
+	if (p->index_data) {
+		munmap((void *)p->index_data, p->index_size);
+		p->index_data = NULL;
+	}
+}
+
+static void close_pack(struct packed_git *p)
+{
+	close_pack_windows(p);
+	close_pack_fd(p);
+	close_pack_index(p);
+}
+
+void close_all_packs(void)
+{
+	struct packed_git *p;
+
+	for (p = packed_git; p; p = p->next)
+		if (p->do_not_close)
+			die("BUG: want to close pack marked 'do-not-close'");
+		else
+			close_pack(p);
+}
diff --git a/sha1_file.c b/sha1_file.c
index 644876e4e..e2927244f 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -717,53 +717,6 @@ void *xmmap(void *start, size_t length,
 	return ret;
 }
 
-void close_pack_windows(struct packed_git *p)
-{
-	while (p->windows) {
-		struct pack_window *w = p->windows;
-
-		if (w->inuse_cnt)
-			die("pack '%s' still has open windows to it",
-			    p->pack_name);
-		munmap(w->base, w->len);
-		pack_mapped -= w->len;
-		pack_open_windows--;
-		p->windows = w->next;
-		free(w);
-	}
-}
-
-static int close_pack_fd(struct packed_git *p)
-{
-	if (p->pack_fd < 0)
-		return 0;
-
-	close(p->pack_fd);
-	pack_open_fds--;
-	p->pack_fd = -1;
-
-	return 1;
-}
-
-static void close_pack(struct packed_git *p)
-{
-	close_pack_windows(p);
-	close_pack_fd(p);
-	close_pack_index(p);
-}
-
-void close_all_packs(void)
-{
-	struct packed_git *p;
-
-	for (p = packed_git; p; p = p->next)
-		if (p->do_not_close)
-			die("BUG: want to close pack marked 'do-not-close'");
-		else
-			close_pack(p);
-}
-
-
 /*
  * The LRU pack is the one with the oldest MRU window, preferring packs
  * with no used windows, or the oldest mtime if it has no windows allocated.
@@ -846,14 +799,6 @@ void unuse_pack(struct pack_window **w_cursor)
 	}
 }
 
-void close_pack_index(struct packed_git *p)
-{
-	if (p->index_data) {
-		munmap((void *)p->index_data, p->index_size);
-		p->index_data = NULL;
-	}
-}
-
 static unsigned int get_max_fd_limit(void)
 {
 #ifdef RLIMIT_NOFILE
-- 
2.14.0.434.g98096fd7a8-goog



* [PATCH v2 07/25] pack: move use_pack()
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (17 preceding siblings ...)
  2017-08-09  1:22 ` [PATCH v2 06/25] pack: move pack-closing functions Jonathan Tan
@ 2017-08-09  1:22 ` Jonathan Tan
  2017-08-09  1:22 ` [PATCH v2 08/25] pack: move unuse_pack() Jonathan Tan
                   ` (41 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-09  1:22 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, sbeller

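open_packed_git() moves to packfile.c together with use_pack(). Because
code remaining in sha1_file.c still calls it, it is temporarily made
global and declared in pack.h. Its scope will be restored to static in a
subsequent commit.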

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 cache.h     |   1 -
 pack.h      |  14 +--
 packfile.c  | 303 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 sha1_file.c | 285 --------------------------------------------------------
 streaming.c |   1 +
 5 files changed, 299 insertions(+), 305 deletions(-)

diff --git a/cache.h b/cache.h
index 25a21a61f..dd9f9a9ae 100644
--- a/cache.h
+++ b/cache.h
@@ -1637,7 +1637,6 @@ extern int odb_mkstemp(struct strbuf *template, const char *pattern);
  */
 extern int odb_pack_keep(const char *name);
 
-extern unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t, unsigned long *);
 extern void unuse_pack(struct pack_window **);
 extern void clear_delta_base_cache(void);
 extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local);
diff --git a/pack.h b/pack.h
index fd4668528..bf2b99bf9 100644
--- a/pack.h
+++ b/pack.h
@@ -124,14 +124,7 @@ extern char *sha1_pack_name(const unsigned char *sha1);
  */
 extern char *sha1_pack_index_name(const unsigned char *sha1);
 
-extern unsigned int pack_used_ctr;
-extern unsigned int pack_mmap_calls;
-extern unsigned int peak_pack_open_windows;
-extern unsigned int pack_open_windows;
 extern unsigned int pack_open_fds;
-extern unsigned int pack_max_fds;
-extern size_t peak_pack_mapped;
-extern size_t pack_mapped;
 
 extern void pack_report(void);
 
@@ -143,12 +136,9 @@ extern int open_pack_index(struct packed_git *);
 
 extern struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path);
 
-extern int unuse_one_window(struct packed_git *current);
-
 extern void release_pack_memory(size_t);
 
 extern void close_pack_windows(struct packed_git *);
-extern int close_pack_fd(struct packed_git *);
 /*
  * munmap the index file for the specified packfile (if it is
  * currently mmapped).
@@ -156,4 +146,8 @@ extern int close_pack_fd(struct packed_git *);
 extern void close_pack_index(struct packed_git *);
 extern void close_all_packs(void);
 
+extern int open_packed_git(struct packed_git *p);
+
+extern unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t, unsigned long *);
+
 #endif
diff --git a/packfile.c b/packfile.c
index c8e2dbdee..85cb65558 100644
--- a/packfile.c
+++ b/packfile.c
@@ -24,14 +24,14 @@ char *sha1_pack_index_name(const unsigned char *sha1)
 	return odb_pack_name(&buf, sha1, "idx");
 }
 
-unsigned int pack_used_ctr;
-unsigned int pack_mmap_calls;
-unsigned int peak_pack_open_windows;
-unsigned int pack_open_windows;
+static unsigned int pack_used_ctr;
+static unsigned int pack_mmap_calls;
+static unsigned int peak_pack_open_windows;
+static unsigned int pack_open_windows;
 unsigned int pack_open_fds;
-unsigned int pack_max_fds;
-size_t peak_pack_mapped;
-size_t pack_mapped;
+static unsigned int pack_max_fds;
+static size_t peak_pack_mapped;
+static size_t pack_mapped;
 struct packed_git *packed_git;
 
 static struct mru packed_git_mru_storage;
@@ -228,7 +228,7 @@ static void scan_windows(struct packed_git *p,
 	}
 }
 
-int unuse_one_window(struct packed_git *current)
+static int unuse_one_window(struct packed_git *current)
 {
 	struct packed_git *p, *lru_p = NULL;
 	struct pack_window *lru_w = NULL, *lru_l = NULL;
@@ -274,7 +274,7 @@ void close_pack_windows(struct packed_git *p)
 	}
 }
 
-int close_pack_fd(struct packed_git *p)
+static int close_pack_fd(struct packed_git *p)
 {
 	if (p->pack_fd < 0)
 		return 0;
@@ -311,3 +311,288 @@ void close_all_packs(void)
 		else
 			close_pack(p);
 }
+
+/*
+ * The LRU pack is the one with the oldest MRU window, preferring packs
+ * with no used windows, or the oldest mtime if it has no windows allocated.
+ */
+static void find_lru_pack(struct packed_git *p, struct packed_git **lru_p, struct pack_window **mru_w, int *accept_windows_inuse)
+{
+	struct pack_window *w, *this_mru_w;
+	int has_windows_inuse = 0;
+
+	/*
+	 * Reject this pack if it has windows and the previously selected
+	 * one does not.  If this pack does not have windows, reject
+	 * it if the pack file is newer than the previously selected one.
+	 */
+	if (*lru_p && !*mru_w && (p->windows || p->mtime > (*lru_p)->mtime))
+		return;
+
+	for (w = this_mru_w = p->windows; w; w = w->next) {
+		/*
+		 * Reject this pack if any of its windows are in use,
+		 * but the previously selected pack did not have any
+		 * inuse windows.  Otherwise, record that this pack
+		 * has windows in use.
+		 */
+		if (w->inuse_cnt) {
+			if (*accept_windows_inuse)
+				has_windows_inuse = 1;
+			else
+				return;
+		}
+
+		if (w->last_used > this_mru_w->last_used)
+			this_mru_w = w;
+
+		/*
+		 * Reject this pack if it has windows that have been
+		 * used more recently than the previously selected pack.
+		 * If the previously selected pack had windows inuse and
+		 * we have not encountered a window in this pack that is
+		 * inuse, skip this check since we prefer a pack with no
+		 * inuse windows to one that has inuse windows.
+		 */
+		if (*mru_w && *accept_windows_inuse == has_windows_inuse &&
+		    this_mru_w->last_used > (*mru_w)->last_used)
+			return;
+	}
+
+	/*
+	 * Select this pack.
+	 */
+	*mru_w = this_mru_w;
+	*lru_p = p;
+	*accept_windows_inuse = has_windows_inuse;
+}
+
+static int close_one_pack(void)
+{
+	struct packed_git *p, *lru_p = NULL;
+	struct pack_window *mru_w = NULL;
+	int accept_windows_inuse = 1;
+
+	for (p = packed_git; p; p = p->next) {
+		if (p->pack_fd == -1)
+			continue;
+		find_lru_pack(p, &lru_p, &mru_w, &accept_windows_inuse);
+	}
+
+	if (lru_p)
+		return close_pack_fd(lru_p);
+
+	return 0;
+}
+
+static unsigned int get_max_fd_limit(void)
+{
+#ifdef RLIMIT_NOFILE
+	{
+		struct rlimit lim;
+
+		if (!getrlimit(RLIMIT_NOFILE, &lim))
+			return lim.rlim_cur;
+	}
+#endif
+
+#ifdef _SC_OPEN_MAX
+	{
+		long open_max = sysconf(_SC_OPEN_MAX);
+		if (0 < open_max)
+			return open_max;
+		/*
+		 * Otherwise, we got -1 for one of the two
+		 * reasons:
+		 *
+		 * (1) sysconf() did not understand _SC_OPEN_MAX
+		 *     and signaled an error with -1; or
+		 * (2) sysconf() said there is no limit.
+		 *
+		 * We _could_ clear errno before calling sysconf() to
+		 * tell these two cases apart and return a huge number
+		 * in the latter case to let the caller cap it to a
+		 * value that is not so selfish, but letting the
+		 * fallback OPEN_MAX codepath take care of these cases
+		 * is a lot simpler.
+		 */
+	}
+#endif
+
+#ifdef OPEN_MAX
+	return OPEN_MAX;
+#else
+	return 1; /* see the caller ;-) */
+#endif
+}
+
+/*
+ * Do not call this directly as this leaks p->pack_fd on error return;
+ * call open_packed_git() instead.
+ */
+static int open_packed_git_1(struct packed_git *p)
+{
+	struct stat st;
+	struct pack_header hdr;
+	unsigned char sha1[20];
+	unsigned char *idx_sha1;
+	long fd_flag;
+
+	if (!p->index_data && open_pack_index(p))
+		return error("packfile %s index unavailable", p->pack_name);
+
+	if (!pack_max_fds) {
+		unsigned int max_fds = get_max_fd_limit();
+
+		/* Save 3 for stdin/stdout/stderr, 22 for work */
+		if (25 < max_fds)
+			pack_max_fds = max_fds - 25;
+		else
+			pack_max_fds = 1;
+	}
+
+	while (pack_max_fds <= pack_open_fds && close_one_pack())
+		; /* nothing */
+
+	p->pack_fd = git_open(p->pack_name);
+	if (p->pack_fd < 0 || fstat(p->pack_fd, &st))
+		return -1;
+	pack_open_fds++;
+
+	/* If we created the struct before we had the pack we lack size. */
+	if (!p->pack_size) {
+		if (!S_ISREG(st.st_mode))
+			return error("packfile %s not a regular file", p->pack_name);
+		p->pack_size = st.st_size;
+	} else if (p->pack_size != st.st_size)
+		return error("packfile %s size changed", p->pack_name);
+
+	/* We leave these file descriptors open with sliding mmap;
+	 * there is no point keeping them open across exec(), though.
+	 */
+	fd_flag = fcntl(p->pack_fd, F_GETFD, 0);
+	if (fd_flag < 0)
+		return error("cannot determine file descriptor flags");
+	fd_flag |= FD_CLOEXEC;
+	if (fcntl(p->pack_fd, F_SETFD, fd_flag) == -1)
+		return error("cannot set FD_CLOEXEC");
+
+	/* Verify we recognize this pack file format. */
+	if (read_in_full(p->pack_fd, &hdr, sizeof(hdr)) != sizeof(hdr))
+		return error("file %s is far too short to be a packfile", p->pack_name);
+	if (hdr.hdr_signature != htonl(PACK_SIGNATURE))
+		return error("file %s is not a GIT packfile", p->pack_name);
+	if (!pack_version_ok(hdr.hdr_version))
+		return error("packfile %s is version %"PRIu32" and not"
+			" supported (try upgrading GIT to a newer version)",
+			p->pack_name, ntohl(hdr.hdr_version));
+
+	/* Verify the pack matches its index. */
+	if (p->num_objects != ntohl(hdr.hdr_entries))
+		return error("packfile %s claims to have %"PRIu32" objects"
+			     " while index indicates %"PRIu32" objects",
+			     p->pack_name, ntohl(hdr.hdr_entries),
+			     p->num_objects);
+	if (lseek(p->pack_fd, p->pack_size - sizeof(sha1), SEEK_SET) == -1)
+		return error("end of packfile %s is unavailable", p->pack_name);
+	if (read_in_full(p->pack_fd, sha1, sizeof(sha1)) != sizeof(sha1))
+		return error("packfile %s signature is unavailable", p->pack_name);
+	idx_sha1 = ((unsigned char *)p->index_data) + p->index_size - 40;
+	if (hashcmp(sha1, idx_sha1))
+		return error("packfile %s does not match index", p->pack_name);
+	return 0;
+}
+
+int open_packed_git(struct packed_git *p)
+{
+	if (!open_packed_git_1(p))
+		return 0;
+	close_pack_fd(p);
+	return -1;
+}
+
+static int in_window(struct pack_window *win, off_t offset)
+{
+	/* We must promise at least 20 bytes (one hash) after the
+	 * offset is available from this window, otherwise the offset
+	 * is not actually in this window and a different window (which
+	 * has that one hash excess) must be used.  This is to support
+	 * the object header and delta base parsing routines below.
+	 */
+	off_t win_off = win->offset;
+	return win_off <= offset
+		&& (offset + 20) <= (win_off + win->len);
+}
+
+unsigned char *use_pack(struct packed_git *p,
+		struct pack_window **w_cursor,
+		off_t offset,
+		unsigned long *left)
+{
+	struct pack_window *win = *w_cursor;
+
+	/* Since packfiles end in a hash of their content and it's
+	 * pointless to ask for an offset into the middle of that
+	 * hash, and the in_window function above wouldn't match
+	 * don't allow an offset too close to the end of the file.
+	 */
+	if (!p->pack_size && p->pack_fd == -1 && open_packed_git(p))
+		die("packfile %s cannot be accessed", p->pack_name);
+	if (offset > (p->pack_size - 20))
+		die("offset beyond end of packfile (truncated pack?)");
+	if (offset < 0)
+		die(_("offset before end of packfile (broken .idx?)"));
+
+	if (!win || !in_window(win, offset)) {
+		if (win)
+			win->inuse_cnt--;
+		for (win = p->windows; win; win = win->next) {
+			if (in_window(win, offset))
+				break;
+		}
+		if (!win) {
+			size_t window_align = packed_git_window_size / 2;
+			off_t len;
+
+			if (p->pack_fd == -1 && open_packed_git(p))
+				die("packfile %s cannot be accessed", p->pack_name);
+
+			win = xcalloc(1, sizeof(*win));
+			win->offset = (offset / window_align) * window_align;
+			len = p->pack_size - win->offset;
+			if (len > packed_git_window_size)
+				len = packed_git_window_size;
+			win->len = (size_t)len;
+			pack_mapped += win->len;
+			while (packed_git_limit < pack_mapped
+				&& unuse_one_window(p))
+				; /* nothing */
+			win->base = xmmap(NULL, win->len,
+				PROT_READ, MAP_PRIVATE,
+				p->pack_fd, win->offset);
+			if (win->base == MAP_FAILED)
+				die_errno("packfile %s cannot be mapped",
+					  p->pack_name);
+			if (!win->offset && win->len == p->pack_size
+				&& !p->do_not_close)
+				close_pack_fd(p);
+			pack_mmap_calls++;
+			pack_open_windows++;
+			if (pack_mapped > peak_pack_mapped)
+				peak_pack_mapped = pack_mapped;
+			if (pack_open_windows > peak_pack_open_windows)
+				peak_pack_open_windows = pack_open_windows;
+			win->next = p->windows;
+			p->windows = win;
+		}
+	}
+	if (win != *w_cursor) {
+		win->last_used = pack_used_ctr++;
+		win->inuse_cnt++;
+		*w_cursor = win;
+	}
+	offset -= win->offset;
+	if (left)
+		*left = win->len - xsize_t(offset);
+	return win->base + offset;
+}
diff --git a/sha1_file.c b/sha1_file.c
index e2927244f..8f17a07e9 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -717,79 +717,6 @@ void *xmmap(void *start, size_t length,
 	return ret;
 }
 
-/*
- * The LRU pack is the one with the oldest MRU window, preferring packs
- * with no used windows, or the oldest mtime if it has no windows allocated.
- */
-static void find_lru_pack(struct packed_git *p, struct packed_git **lru_p, struct pack_window **mru_w, int *accept_windows_inuse)
-{
-	struct pack_window *w, *this_mru_w;
-	int has_windows_inuse = 0;
-
-	/*
-	 * Reject this pack if it has windows and the previously selected
-	 * one does not.  If this pack does not have windows, reject
-	 * it if the pack file is newer than the previously selected one.
-	 */
-	if (*lru_p && !*mru_w && (p->windows || p->mtime > (*lru_p)->mtime))
-		return;
-
-	for (w = this_mru_w = p->windows; w; w = w->next) {
-		/*
-		 * Reject this pack if any of its windows are in use,
-		 * but the previously selected pack did not have any
-		 * inuse windows.  Otherwise, record that this pack
-		 * has windows in use.
-		 */
-		if (w->inuse_cnt) {
-			if (*accept_windows_inuse)
-				has_windows_inuse = 1;
-			else
-				return;
-		}
-
-		if (w->last_used > this_mru_w->last_used)
-			this_mru_w = w;
-
-		/*
-		 * Reject this pack if it has windows that have been
-		 * used more recently than the previously selected pack.
-		 * If the previously selected pack had windows inuse and
-		 * we have not encountered a window in this pack that is
-		 * inuse, skip this check since we prefer a pack with no
-		 * inuse windows to one that has inuse windows.
-		 */
-		if (*mru_w && *accept_windows_inuse == has_windows_inuse &&
-		    this_mru_w->last_used > (*mru_w)->last_used)
-			return;
-	}
-
-	/*
-	 * Select this pack.
-	 */
-	*mru_w = this_mru_w;
-	*lru_p = p;
-	*accept_windows_inuse = has_windows_inuse;
-}
-
-static int close_one_pack(void)
-{
-	struct packed_git *p, *lru_p = NULL;
-	struct pack_window *mru_w = NULL;
-	int accept_windows_inuse = 1;
-
-	for (p = packed_git; p; p = p->next) {
-		if (p->pack_fd == -1)
-			continue;
-		find_lru_pack(p, &lru_p, &mru_w, &accept_windows_inuse);
-	}
-
-	if (lru_p)
-		return close_pack_fd(lru_p);
-
-	return 0;
-}
-
 void unuse_pack(struct pack_window **w_cursor)
 {
 	struct pack_window *w = *w_cursor;
@@ -799,218 +726,6 @@ void unuse_pack(struct pack_window **w_cursor)
 	}
 }
 
-static unsigned int get_max_fd_limit(void)
-{
-#ifdef RLIMIT_NOFILE
-	{
-		struct rlimit lim;
-
-		if (!getrlimit(RLIMIT_NOFILE, &lim))
-			return lim.rlim_cur;
-	}
-#endif
-
-#ifdef _SC_OPEN_MAX
-	{
-		long open_max = sysconf(_SC_OPEN_MAX);
-		if (0 < open_max)
-			return open_max;
-		/*
-		 * Otherwise, we got -1 for one of the two
-		 * reasons:
-		 *
-		 * (1) sysconf() did not understand _SC_OPEN_MAX
-		 *     and signaled an error with -1; or
-		 * (2) sysconf() said there is no limit.
-		 *
-		 * We _could_ clear errno before calling sysconf() to
-		 * tell these two cases apart and return a huge number
-		 * in the latter case to let the caller cap it to a
-		 * value that is not so selfish, but letting the
-		 * fallback OPEN_MAX codepath take care of these cases
-		 * is a lot simpler.
-		 */
-	}
-#endif
-
-#ifdef OPEN_MAX
-	return OPEN_MAX;
-#else
-	return 1; /* see the caller ;-) */
-#endif
-}
-
-/*
- * Do not call this directly as this leaks p->pack_fd on error return;
- * call open_packed_git() instead.
- */
-static int open_packed_git_1(struct packed_git *p)
-{
-	struct stat st;
-	struct pack_header hdr;
-	unsigned char sha1[20];
-	unsigned char *idx_sha1;
-	long fd_flag;
-
-	if (!p->index_data && open_pack_index(p))
-		return error("packfile %s index unavailable", p->pack_name);
-
-	if (!pack_max_fds) {
-		unsigned int max_fds = get_max_fd_limit();
-
-		/* Save 3 for stdin/stdout/stderr, 22 for work */
-		if (25 < max_fds)
-			pack_max_fds = max_fds - 25;
-		else
-			pack_max_fds = 1;
-	}
-
-	while (pack_max_fds <= pack_open_fds && close_one_pack())
-		; /* nothing */
-
-	p->pack_fd = git_open(p->pack_name);
-	if (p->pack_fd < 0 || fstat(p->pack_fd, &st))
-		return -1;
-	pack_open_fds++;
-
-	/* If we created the struct before we had the pack we lack size. */
-	if (!p->pack_size) {
-		if (!S_ISREG(st.st_mode))
-			return error("packfile %s not a regular file", p->pack_name);
-		p->pack_size = st.st_size;
-	} else if (p->pack_size != st.st_size)
-		return error("packfile %s size changed", p->pack_name);
-
-	/* We leave these file descriptors open with sliding mmap;
-	 * there is no point keeping them open across exec(), though.
-	 */
-	fd_flag = fcntl(p->pack_fd, F_GETFD, 0);
-	if (fd_flag < 0)
-		return error("cannot determine file descriptor flags");
-	fd_flag |= FD_CLOEXEC;
-	if (fcntl(p->pack_fd, F_SETFD, fd_flag) == -1)
-		return error("cannot set FD_CLOEXEC");
-
-	/* Verify we recognize this pack file format. */
-	if (read_in_full(p->pack_fd, &hdr, sizeof(hdr)) != sizeof(hdr))
-		return error("file %s is far too short to be a packfile", p->pack_name);
-	if (hdr.hdr_signature != htonl(PACK_SIGNATURE))
-		return error("file %s is not a GIT packfile", p->pack_name);
-	if (!pack_version_ok(hdr.hdr_version))
-		return error("packfile %s is version %"PRIu32" and not"
-			" supported (try upgrading GIT to a newer version)",
-			p->pack_name, ntohl(hdr.hdr_version));
-
-	/* Verify the pack matches its index. */
-	if (p->num_objects != ntohl(hdr.hdr_entries))
-		return error("packfile %s claims to have %"PRIu32" objects"
-			     " while index indicates %"PRIu32" objects",
-			     p->pack_name, ntohl(hdr.hdr_entries),
-			     p->num_objects);
-	if (lseek(p->pack_fd, p->pack_size - sizeof(sha1), SEEK_SET) == -1)
-		return error("end of packfile %s is unavailable", p->pack_name);
-	if (read_in_full(p->pack_fd, sha1, sizeof(sha1)) != sizeof(sha1))
-		return error("packfile %s signature is unavailable", p->pack_name);
-	idx_sha1 = ((unsigned char *)p->index_data) + p->index_size - 40;
-	if (hashcmp(sha1, idx_sha1))
-		return error("packfile %s does not match index", p->pack_name);
-	return 0;
-}
-
-static int open_packed_git(struct packed_git *p)
-{
-	if (!open_packed_git_1(p))
-		return 0;
-	close_pack_fd(p);
-	return -1;
-}
-
-static int in_window(struct pack_window *win, off_t offset)
-{
-	/* We must promise at least 20 bytes (one hash) after the
-	 * offset is available from this window, otherwise the offset
-	 * is not actually in this window and a different window (which
-	 * has that one hash excess) must be used.  This is to support
-	 * the object header and delta base parsing routines below.
-	 */
-	off_t win_off = win->offset;
-	return win_off <= offset
-		&& (offset + 20) <= (win_off + win->len);
-}
-
-unsigned char *use_pack(struct packed_git *p,
-		struct pack_window **w_cursor,
-		off_t offset,
-		unsigned long *left)
-{
-	struct pack_window *win = *w_cursor;
-
-	/* Since packfiles end in a hash of their content and it's
-	 * pointless to ask for an offset into the middle of that
-	 * hash, and the in_window function above wouldn't match
-	 * don't allow an offset too close to the end of the file.
-	 */
-	if (!p->pack_size && p->pack_fd == -1 && open_packed_git(p))
-		die("packfile %s cannot be accessed", p->pack_name);
-	if (offset > (p->pack_size - 20))
-		die("offset beyond end of packfile (truncated pack?)");
-	if (offset < 0)
-		die(_("offset before end of packfile (broken .idx?)"));
-
-	if (!win || !in_window(win, offset)) {
-		if (win)
-			win->inuse_cnt--;
-		for (win = p->windows; win; win = win->next) {
-			if (in_window(win, offset))
-				break;
-		}
-		if (!win) {
-			size_t window_align = packed_git_window_size / 2;
-			off_t len;
-
-			if (p->pack_fd == -1 && open_packed_git(p))
-				die("packfile %s cannot be accessed", p->pack_name);
-
-			win = xcalloc(1, sizeof(*win));
-			win->offset = (offset / window_align) * window_align;
-			len = p->pack_size - win->offset;
-			if (len > packed_git_window_size)
-				len = packed_git_window_size;
-			win->len = (size_t)len;
-			pack_mapped += win->len;
-			while (packed_git_limit < pack_mapped
-				&& unuse_one_window(p))
-				; /* nothing */
-			win->base = xmmap(NULL, win->len,
-				PROT_READ, MAP_PRIVATE,
-				p->pack_fd, win->offset);
-			if (win->base == MAP_FAILED)
-				die_errno("packfile %s cannot be mapped",
-					  p->pack_name);
-			if (!win->offset && win->len == p->pack_size
-				&& !p->do_not_close)
-				close_pack_fd(p);
-			pack_mmap_calls++;
-			pack_open_windows++;
-			if (pack_mapped > peak_pack_mapped)
-				peak_pack_mapped = pack_mapped;
-			if (pack_open_windows > peak_pack_open_windows)
-				peak_pack_open_windows = pack_open_windows;
-			win->next = p->windows;
-			p->windows = win;
-		}
-	}
-	if (win != *w_cursor) {
-		win->last_used = pack_used_ctr++;
-		win->inuse_cnt++;
-		*w_cursor = win;
-	}
-	offset -= win->offset;
-	if (left)
-		*left = win->len - xsize_t(offset);
-	return win->base + offset;
-}
-
 static struct packed_git *alloc_packed_git(int extra)
 {
 	struct packed_git *p = xmalloc(st_add(sizeof(*p), extra));
diff --git a/streaming.c b/streaming.c
index 9afa66b8b..f657018cf 100644
--- a/streaming.c
+++ b/streaming.c
@@ -3,6 +3,7 @@
  */
 #include "cache.h"
 #include "streaming.h"
+#include "pack.h"
 
 enum input_source {
 	stream_error = -1,
-- 
2.14.0.434.g98096fd7a8-goog

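A note on the window arithmetic in use_pack() and in_window(), which the
patch above moves unchanged. What follows is a minimal standalone sketch,
not code from the series: the demo_* names, the trimmed struct and the use
of int64_t in place of off_t are assumptions made for illustration. The
rounding to half the window size, the clamp at the end of the pack and the
20-byte margin mirror the moved code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct pack_window: only what the check needs. */
struct demo_window {
	int64_t offset; /* file offset where the mapping starts */
	size_t len;     /* number of bytes mapped */
};

#define DEMO_HASH_LEN 20 /* the moved code hard-codes 20 (one SHA-1) */

/* Round offset down to a multiple of half the window size. */
static int64_t demo_window_start(int64_t offset, size_t window_size)
{
	int64_t align = (int64_t)(window_size / 2);

	assert(align > 0 && offset >= 0);
	return (offset / align) * align;
}

/* Clamp the mapping length so it never runs past the end of the pack. */
static size_t demo_window_len(int64_t start, int64_t pack_size,
			      size_t window_size)
{
	int64_t len = pack_size - start;

	assert(len >= 0);
	if (len > (int64_t)window_size)
		len = (int64_t)window_size;
	return (size_t)len;
}

/* The offset must leave DEMO_HASH_LEN bytes available inside the window. */
static int demo_in_window(const struct demo_window *w, int64_t offset)
{
	return w->offset <= offset &&
	       offset + DEMO_HASH_LEN <= w->offset + (int64_t)w->len;
}
```

With a 6000-byte pack and a 4096-byte window, offset 5000 rounds down to a
window starting at 4096 that maps the remaining 1904 bytes. Offset 5990
fails the margin check, since fewer than 20 bytes follow it.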

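A second note, on the descriptor budget computed in open_packed_git_1()
above. This is a sketch of the arithmetic only; demo_pack_fd_budget() and
demo_must_close_one() are hypothetical names, and the real code stores the
result in pack_max_fds and compares it against pack_open_fds.

```c
#include <assert.h>

/* Same arithmetic as the moved code applies to get_max_fd_limit(). */
static unsigned int demo_pack_fd_budget(unsigned int max_fds)
{
	/* 3 for stdin/stdout/stderr plus 22 for other work */
	const unsigned int reserved = 25;

	if (reserved < max_fds)
		return max_fds - reserved;
	return 1; /* always allow at least one pack descriptor */
}

/* Nonzero when a pack must be closed before another may be opened. */
static int demo_must_close_one(unsigned int budget, unsigned int open_fds)
{
	assert(budget >= 1);
	return budget <= open_fds;
}
```

This also shows why the "return 1" fallback in get_max_fd_limit() is
harmless: any limit of 25 or less yields a budget of one pack descriptor.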

* [PATCH v2 08/25] pack: move unuse_pack()
From: Jonathan Tan @ 2017-08-09  1:22 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, sbeller

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 cache.h     | 1 -
 pack.h      | 1 +
 packfile.c  | 9 +++++++++
 sha1_file.c | 9 ---------
 4 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/cache.h b/cache.h
index dd9f9a9ae..4812f3a63 100644
--- a/cache.h
+++ b/cache.h
@@ -1637,7 +1637,6 @@ extern int odb_mkstemp(struct strbuf *template, const char *pattern);
  */
 extern int odb_pack_keep(const char *name);
 
-extern void unuse_pack(struct pack_window **);
 extern void clear_delta_base_cache(void);
 extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local);
 
diff --git a/pack.h b/pack.h
index bf2b99bf9..3876e9ae6 100644
--- a/pack.h
+++ b/pack.h
@@ -149,5 +149,6 @@ extern void close_all_packs(void);
 extern int open_packed_git(struct packed_git *p);
 
 extern unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t, unsigned long *);
+extern void unuse_pack(struct pack_window **);
 
 #endif
diff --git a/packfile.c b/packfile.c
index 85cb65558..93526ea7b 100644
--- a/packfile.c
+++ b/packfile.c
@@ -596,3 +596,12 @@ unsigned char *use_pack(struct packed_git *p,
 		*left = win->len - xsize_t(offset);
 	return win->base + offset;
 }
+
+void unuse_pack(struct pack_window **w_cursor)
+{
+	struct pack_window *w = *w_cursor;
+	if (w) {
+		w->inuse_cnt--;
+		*w_cursor = NULL;
+	}
+}
diff --git a/sha1_file.c b/sha1_file.c
index 8f17a07e9..12501ef06 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -717,15 +717,6 @@ void *xmmap(void *start, size_t length,
 	return ret;
 }
 
-void unuse_pack(struct pack_window **w_cursor)
-{
-	struct pack_window *w = *w_cursor;
-	if (w) {
-		w->inuse_cnt--;
-		*w_cursor = NULL;
-	}
-}
-
 static struct packed_git *alloc_packed_git(int extra)
 {
 	struct packed_git *p = xmalloc(st_add(sizeof(*p), extra));
-- 
2.14.0.434.g98096fd7a8-goog

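A note on how unuse_pack() pairs with use_pack(): the cursor holds at most
one pinned window, and inuse_cnt counts the cursors pinning a window. The
sketch below is a simplified model under that reading. demo_use() is a
hypothetical helper that leaves out window lookup and mmap and keeps only
the reference counting; demo_unuse() follows the moved function, with an
added assertion.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pack_window: just the pin count. */
struct demo_window {
	unsigned int inuse_cnt;
};

/* Pin win through the cursor, releasing whatever the cursor held before. */
static void demo_use(struct demo_window *win, struct demo_window **cursor)
{
	if (*cursor == win)
		return; /* already pinned via this cursor */
	if (*cursor)
		(*cursor)->inuse_cnt--;
	win->inuse_cnt++;
	*cursor = win;
}

/* Drop the pin and clear the cursor; a NULL cursor is a no-op. */
static void demo_unuse(struct demo_window **cursor)
{
	struct demo_window *w = *cursor;

	if (w) {
		assert(w->inuse_cnt > 0);
		w->inuse_cnt--;
		*cursor = NULL;
	}
}
```

Under this model, calling the release function twice on the same cursor is
safe, because the first call resets the cursor to NULL.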


* [PATCH v2 09/25] pack: move add_packed_git()
From: Jonathan Tan @ 2017-08-09  1:22 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, sbeller

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 cache.h     |  1 -
 connected.c |  1 +
 pack.h      |  1 +
 packfile.c  | 53 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 sha1_file.c | 61 -------------------------------------------------------------
 5 files changed, 55 insertions(+), 62 deletions(-)

diff --git a/cache.h b/cache.h
index 4812f3a63..bf93477e8 100644
--- a/cache.h
+++ b/cache.h
@@ -1638,7 +1638,6 @@ extern int odb_mkstemp(struct strbuf *template, const char *pattern);
 extern int odb_pack_keep(const char *name);
 
 extern void clear_delta_base_cache(void);
-extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local);
 
 /*
  * Make sure that a pointer access into an mmap'd index file is within bounds,
diff --git a/connected.c b/connected.c
index 136c2ac16..3e3f0148c 100644
--- a/connected.c
+++ b/connected.c
@@ -3,6 +3,7 @@
 #include "sigchain.h"
 #include "connected.h"
 #include "transport.h"
+#include "pack.h"
 
 /*
  * If we feed all the commits we want to verify to this command
diff --git a/pack.h b/pack.h
index 3876e9ae6..c1f3ff32d 100644
--- a/pack.h
+++ b/pack.h
@@ -150,5 +150,6 @@ extern int open_packed_git(struct packed_git *p);
 
 extern unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t, unsigned long *);
 extern void unuse_pack(struct pack_window **);
+extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local);
 
 #endif
diff --git a/packfile.c b/packfile.c
index 93526ea7b..efe0ed3e8 100644
--- a/packfile.c
+++ b/packfile.c
@@ -605,3 +605,56 @@ void unuse_pack(struct pack_window **w_cursor)
 		*w_cursor = NULL;
 	}
 }
+
+static void try_to_free_pack_memory(size_t size)
+{
+	release_pack_memory(size);
+}
+
+struct packed_git *add_packed_git(const char *path, size_t path_len, int local)
+{
+	static int have_set_try_to_free_routine;
+	struct stat st;
+	size_t alloc;
+	struct packed_git *p;
+
+	if (!have_set_try_to_free_routine) {
+		have_set_try_to_free_routine = 1;
+		set_try_to_free_routine(try_to_free_pack_memory);
+	}
+
+	/*
+	 * Make sure a corresponding .pack file exists and that
+	 * the index looks sane.
+	 */
+	if (!strip_suffix_mem(path, &path_len, ".idx"))
+		return NULL;
+
+	/*
+	 * ".pack" is long enough to hold any suffix we're adding (and
+	 * the use xsnprintf double-checks that)
+	 */
+	alloc = st_add3(path_len, strlen(".pack"), 1);
+	p = alloc_packed_git(alloc);
+	memcpy(p->pack_name, path, path_len);
+
+	xsnprintf(p->pack_name + path_len, alloc - path_len, ".keep");
+	if (!access(p->pack_name, F_OK))
+		p->pack_keep = 1;
+
+	xsnprintf(p->pack_name + path_len, alloc - path_len, ".pack");
+	if (stat(p->pack_name, &st) || !S_ISREG(st.st_mode)) {
+		free(p);
+		return NULL;
+	}
+
+	/* ok, it looks sane as far as we can check without
+	 * actually mapping the pack file.
+	 */
+	p->pack_size = st.st_size;
+	p->pack_local = local;
+	p->mtime = st.st_mtime;
+	if (path_len < 40 || get_sha1_hex(path + path_len - 40, p->sha1))
+		hashclr(p->sha1);
+	return p;
+}
diff --git a/sha1_file.c b/sha1_file.c
index 12501ef06..7f12b1ee0 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -717,67 +717,6 @@ void *xmmap(void *start, size_t length,
 	return ret;
 }
 
-static struct packed_git *alloc_packed_git(int extra)
-{
-	struct packed_git *p = xmalloc(st_add(sizeof(*p), extra));
-	memset(p, 0, sizeof(*p));
-	p->pack_fd = -1;
-	return p;
-}
-
-static void try_to_free_pack_memory(size_t size)
-{
-	release_pack_memory(size);
-}
-
-struct packed_git *add_packed_git(const char *path, size_t path_len, int local)
-{
-	static int have_set_try_to_free_routine;
-	struct stat st;
-	size_t alloc;
-	struct packed_git *p;
-
-	if (!have_set_try_to_free_routine) {
-		have_set_try_to_free_routine = 1;
-		set_try_to_free_routine(try_to_free_pack_memory);
-	}
-
-	/*
-	 * Make sure a corresponding .pack file exists and that
-	 * the index looks sane.
-	 */
-	if (!strip_suffix_mem(path, &path_len, ".idx"))
-		return NULL;
-
-	/*
-	 * ".pack" is long enough to hold any suffix we're adding (and
-	 * the use xsnprintf double-checks that)
-	 */
-	alloc = st_add3(path_len, strlen(".pack"), 1);
-	p = alloc_packed_git(alloc);
-	memcpy(p->pack_name, path, path_len);
-
-	xsnprintf(p->pack_name + path_len, alloc - path_len, ".keep");
-	if (!access(p->pack_name, F_OK))
-		p->pack_keep = 1;
-
-	xsnprintf(p->pack_name + path_len, alloc - path_len, ".pack");
-	if (stat(p->pack_name, &st) || !S_ISREG(st.st_mode)) {
-		free(p);
-		return NULL;
-	}
-
-	/* ok, it looks sane as far as we can check without
-	 * actually mapping the pack file.
-	 */
-	p->pack_size = st.st_size;
-	p->pack_local = local;
-	p->mtime = st.st_mtime;
-	if (path_len < 40 || get_sha1_hex(path + path_len - 40, p->sha1))
-		hashclr(p->sha1);
-	return p;
-}
-
 void install_packed_git(struct packed_git *pack)
 {
 	if (pack->pack_fd != -1)
-- 
2.14.0.434.g98096fd7a8-goog

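A note on the name handling in add_packed_git() above: the function strips
".idx" from the path and appends ".keep" and then ".pack" into one buffer
sized for the base name plus ".pack". The sketch below shows only the
".idx" to ".pack" step with standard C library calls.
demo_pack_name_from_idx() is a hypothetical helper; the real code uses
strip_suffix_mem(), st_add3() and xsnprintf(), and writes into
p->pack_name rather than returning a string.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Derive "<base>.pack" from "<base>.idx". Returns a malloc'd string the
 * caller frees, or NULL if path lacks the ".idx" suffix or malloc fails.
 */
static char *demo_pack_name_from_idx(const char *path)
{
	static const char idx[] = ".idx", pack[] = ".pack";
	size_t len, base, alloc;
	char *name;

	assert(path != NULL);
	len = strlen(path);
	if (len < strlen(idx) || strcmp(path + len - strlen(idx), idx))
		return NULL;
	base = len - strlen(idx);

	/* ".keep" and ".pack" have the same length, so one size fits both */
	alloc = base + strlen(pack) + 1;
	name = malloc(alloc);
	if (!name)
		return NULL;
	memcpy(name, path, base);
	snprintf(name + base, alloc - base, "%s", pack);
	return name;
}
```

Given "pack-1234.idx" the helper yields "pack-1234.pack". A path that does
not end in ".idx" yields NULL, matching the early return in the moved
function.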


* [PATCH v2 10/25] pack: move install_packed_git()
From: Jonathan Tan @ 2017-08-09  1:22 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, sbeller

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 cache.h     |  1 -
 pack.h      |  4 ++--
 packfile.c  | 11 ++++++++++-
 sha1_file.c |  9 ---------
 4 files changed, 12 insertions(+), 13 deletions(-)

diff --git a/cache.h b/cache.h
index bf93477e8..41562dc0b 100644
--- a/cache.h
+++ b/cache.h
@@ -1611,7 +1611,6 @@ extern void (*report_garbage)(unsigned seen_bits, const char *path);
 
 extern void prepare_packed_git(void);
 extern void reprepare_packed_git(void);
-extern void install_packed_git(struct packed_git *pack);
 
 /*
  * Give a rough count of objects in the repository. This sacrifices accuracy
diff --git a/pack.h b/pack.h
index c1f3ff32d..576c4fc7c 100644
--- a/pack.h
+++ b/pack.h
@@ -124,8 +124,6 @@ extern char *sha1_pack_name(const unsigned char *sha1);
  */
 extern char *sha1_pack_index_name(const unsigned char *sha1);
 
-extern unsigned int pack_open_fds;
-
 extern void pack_report(void);
 
 /*
@@ -152,4 +150,6 @@ extern unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t
 extern void unuse_pack(struct pack_window **);
 extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local);
 
+extern void install_packed_git(struct packed_git *pack);
+
 #endif
diff --git a/packfile.c b/packfile.c
index efe0ed3e8..4eb65e460 100644
--- a/packfile.c
+++ b/packfile.c
@@ -28,7 +28,7 @@ static unsigned int pack_used_ctr;
 static unsigned int pack_mmap_calls;
 static unsigned int peak_pack_open_windows;
 static unsigned int pack_open_windows;
-unsigned int pack_open_fds;
+static unsigned int pack_open_fds;
 static unsigned int pack_max_fds;
 static size_t peak_pack_mapped;
 static size_t pack_mapped;
@@ -658,3 +658,12 @@ struct packed_git *add_packed_git(const char *path, size_t path_len, int local)
 		hashclr(p->sha1);
 	return p;
 }
+
+void install_packed_git(struct packed_git *pack)
+{
+	if (pack->pack_fd != -1)
+		pack_open_fds++;
+
+	pack->next = packed_git;
+	packed_git = pack;
+}
diff --git a/sha1_file.c b/sha1_file.c
index 7f12b1ee0..b956ca0c9 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -717,15 +717,6 @@ void *xmmap(void *start, size_t length,
 	return ret;
 }
 
-void install_packed_git(struct packed_git *pack)
-{
-	if (pack->pack_fd != -1)
-		pack_open_fds++;
-
-	pack->next = packed_git;
-	packed_git = pack;
-}
-
 void (*report_garbage)(unsigned seen_bits, const char *path);
 
 static void report_helper(const struct string_list *list,
-- 
2.14.0.434.g98096fd7a8-goog



* [PATCH v2 11/25] pack: move {,re}prepare_packed_git and approximate_object_count
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (21 preceding siblings ...)
  2017-08-09  1:22 ` [PATCH v2 10/25] pack: move install_packed_git() Jonathan Tan
@ 2017-08-09  1:22 ` Jonathan Tan
  2017-08-09  1:22 ` [PATCH v2 12/25] pack: move unpack_object_header_buffer() Jonathan Tan
                   ` (37 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-09  1:22 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, sbeller

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 builtin/gc.c   |   1 +
 cache.h        |  15 ----
 http-backend.c |   1 +
 pack.h         |  15 ++++
 packfile.c     | 216 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 path.c         |   1 +
 server-info.c  |   1 +
 sha1_file.c    | 214 --------------------------------------------------------
 8 files changed, 235 insertions(+), 229 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index e6b84475a..f4fe023d3 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -19,6 +19,7 @@
 #include "sigchain.h"
 #include "argv-array.h"
 #include "commit.h"
+#include "pack.h"
 
 #define FAILED_RUN "failed to run %s"
 
diff --git a/cache.h b/cache.h
index 41562dc0b..f020dfade 100644
--- a/cache.h
+++ b/cache.h
@@ -1603,21 +1603,6 @@ struct pack_entry {
 	struct packed_git *p;
 };
 
-/* A hook to report invalid files in pack directory */
-#define PACKDIR_FILE_PACK 1
-#define PACKDIR_FILE_IDX 2
-#define PACKDIR_FILE_GARBAGE 4
-extern void (*report_garbage)(unsigned seen_bits, const char *path);
-
-extern void prepare_packed_git(void);
-extern void reprepare_packed_git(void);
-
-/*
- * Give a rough count of objects in the repository. This sacrifices accuracy
- * for speed.
- */
-unsigned long approximate_object_count(void);
-
 extern struct packed_git *find_sha1_pack(const unsigned char *sha1,
 					 struct packed_git *packs);
 
diff --git a/http-backend.c b/http-backend.c
index 519025d2c..12f7d421f 100644
--- a/http-backend.c
+++ b/http-backend.c
@@ -9,6 +9,7 @@
 #include "string-list.h"
 #include "url.h"
 #include "argv-array.h"
+#include "pack.h"
 
 static const char content_type[] = "Content-Type";
 static const char content_length[] = "Content-Length";
diff --git a/pack.h b/pack.h
index 576c4fc7c..cad5ed488 100644
--- a/pack.h
+++ b/pack.h
@@ -152,4 +152,19 @@ extern struct packed_git *add_packed_git(const char *path, size_t path_len, int
 
 extern void install_packed_git(struct packed_git *pack);
 
+/* A hook to report invalid files in pack directory */
+#define PACKDIR_FILE_PACK 1
+#define PACKDIR_FILE_IDX 2
+#define PACKDIR_FILE_GARBAGE 4
+extern void (*report_garbage)(unsigned seen_bits, const char *path);
+
+extern void prepare_packed_git(void);
+extern void reprepare_packed_git(void);
+
+/*
+ * Give a rough count of objects in the repository. This sacrifices accuracy
+ * for speed.
+ */
+unsigned long approximate_object_count(void);
+
 #endif
diff --git a/packfile.c b/packfile.c
index 4eb65e460..a517172f7 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1,6 +1,8 @@
 #include "cache.h"
 #include "mru.h"
 #include "pack.h"
+#include "dir.h"
+#include "mergesort.h"
 
 char *odb_pack_name(struct strbuf *buf,
 		    const unsigned char *sha1,
@@ -667,3 +669,217 @@ void install_packed_git(struct packed_git *pack)
 	pack->next = packed_git;
 	packed_git = pack;
 }
+
+void (*report_garbage)(unsigned seen_bits, const char *path);
+
+static void report_helper(const struct string_list *list,
+			  int seen_bits, int first, int last)
+{
+	if (seen_bits == (PACKDIR_FILE_PACK|PACKDIR_FILE_IDX))
+		return;
+
+	for (; first < last; first++)
+		report_garbage(seen_bits, list->items[first].string);
+}
+
+static void report_pack_garbage(struct string_list *list)
+{
+	int i, baselen = -1, first = 0, seen_bits = 0;
+
+	if (!report_garbage)
+		return;
+
+	string_list_sort(list);
+
+	for (i = 0; i < list->nr; i++) {
+		const char *path = list->items[i].string;
+		if (baselen != -1 &&
+		    strncmp(path, list->items[first].string, baselen)) {
+			report_helper(list, seen_bits, first, i);
+			baselen = -1;
+			seen_bits = 0;
+		}
+		if (baselen == -1) {
+			const char *dot = strrchr(path, '.');
+			if (!dot) {
+				report_garbage(PACKDIR_FILE_GARBAGE, path);
+				continue;
+			}
+			baselen = dot - path + 1;
+			first = i;
+		}
+		if (!strcmp(path + baselen, "pack"))
+			seen_bits |= 1;
+		else if (!strcmp(path + baselen, "idx"))
+			seen_bits |= 2;
+	}
+	report_helper(list, seen_bits, first, list->nr);
+}
+
+static void prepare_packed_git_one(char *objdir, int local)
+{
+	struct strbuf path = STRBUF_INIT;
+	size_t dirnamelen;
+	DIR *dir;
+	struct dirent *de;
+	struct string_list garbage = STRING_LIST_INIT_DUP;
+
+	strbuf_addstr(&path, objdir);
+	strbuf_addstr(&path, "/pack");
+	dir = opendir(path.buf);
+	if (!dir) {
+		if (errno != ENOENT)
+			error_errno("unable to open object pack directory: %s",
+				    path.buf);
+		strbuf_release(&path);
+		return;
+	}
+	strbuf_addch(&path, '/');
+	dirnamelen = path.len;
+	while ((de = readdir(dir)) != NULL) {
+		struct packed_git *p;
+		size_t base_len;
+
+		if (is_dot_or_dotdot(de->d_name))
+			continue;
+
+		strbuf_setlen(&path, dirnamelen);
+		strbuf_addstr(&path, de->d_name);
+
+		base_len = path.len;
+		if (strip_suffix_mem(path.buf, &base_len, ".idx")) {
+			/* Don't reopen a pack we already have. */
+			for (p = packed_git; p; p = p->next) {
+				size_t len;
+				if (strip_suffix(p->pack_name, ".pack", &len) &&
+				    len == base_len &&
+				    !memcmp(p->pack_name, path.buf, len))
+					break;
+			}
+			if (p == NULL &&
+			    /*
+			     * See if it really is a valid .idx file with
+			     * corresponding .pack file that we can map.
+			     */
+			    (p = add_packed_git(path.buf, path.len, local)) != NULL)
+				install_packed_git(p);
+		}
+
+		if (!report_garbage)
+			continue;
+
+		if (ends_with(de->d_name, ".idx") ||
+		    ends_with(de->d_name, ".pack") ||
+		    ends_with(de->d_name, ".bitmap") ||
+		    ends_with(de->d_name, ".keep"))
+			string_list_append(&garbage, path.buf);
+		else
+			report_garbage(PACKDIR_FILE_GARBAGE, path.buf);
+	}
+	closedir(dir);
+	report_pack_garbage(&garbage);
+	string_list_clear(&garbage, 0);
+	strbuf_release(&path);
+}
+
+static int approximate_object_count_valid;
+
+/*
+ * Give a fast, rough count of the number of objects in the repository. This
+ * ignores loose objects completely. If you have a lot of them, then either
+ * you should repack because your performance will be awful, or they are
+ * all unreachable objects about to be pruned, in which case they're not really
+ * interesting as a measure of repo size in the first place.
+ */
+unsigned long approximate_object_count(void)
+{
+	static unsigned long count;
+	if (!approximate_object_count_valid) {
+		struct packed_git *p;
+
+		prepare_packed_git();
+		count = 0;
+		for (p = packed_git; p; p = p->next) {
+			if (open_pack_index(p))
+				continue;
+			count += p->num_objects;
+		}
+	}
+	return count;
+}
+
+static void *get_next_packed_git(const void *p)
+{
+	return ((const struct packed_git *)p)->next;
+}
+
+static void set_next_packed_git(void *p, void *next)
+{
+	((struct packed_git *)p)->next = next;
+}
+
+static int sort_pack(const void *a_, const void *b_)
+{
+	const struct packed_git *a = a_;
+	const struct packed_git *b = b_;
+	int st;
+
+	/*
+	 * Local packs tend to contain objects specific to our
+	 * variant of the project than remote ones.  In addition,
+	 * remote ones could be on a network mounted filesystem.
+	 * Favor local ones for these reasons.
+	 */
+	st = a->pack_local - b->pack_local;
+	if (st)
+		return -st;
+
+	/*
+	 * Younger packs tend to contain more recent objects,
+	 * and more recent objects tend to get accessed more
+	 * often.
+	 */
+	if (a->mtime < b->mtime)
+		return 1;
+	else if (a->mtime == b->mtime)
+		return 0;
+	return -1;
+}
+
+static void rearrange_packed_git(void)
+{
+	packed_git = llist_mergesort(packed_git, get_next_packed_git,
+				     set_next_packed_git, sort_pack);
+}
+
+static void prepare_packed_git_mru(void)
+{
+	struct packed_git *p;
+
+	mru_clear(packed_git_mru);
+	for (p = packed_git; p; p = p->next)
+		mru_append(packed_git_mru, p);
+}
+
+static int prepare_packed_git_run_once = 0;
+void prepare_packed_git(void)
+{
+	struct alternate_object_database *alt;
+
+	if (prepare_packed_git_run_once)
+		return;
+	prepare_packed_git_one(get_object_directory(), 1);
+	prepare_alt_odb();
+	for (alt = alt_odb_list; alt; alt = alt->next)
+		prepare_packed_git_one(alt->path, 0);
+	rearrange_packed_git();
+	prepare_packed_git_mru();
+	prepare_packed_git_run_once = 1;
+}
+
+void reprepare_packed_git(void)
+{
+	approximate_object_count_valid = 0;
+	prepare_packed_git_run_once = 0;
+	prepare_packed_git();
+}
diff --git a/path.c b/path.c
index e485f9f93..ae3f2f65f 100644
--- a/path.c
+++ b/path.c
@@ -9,6 +9,7 @@
 #include "worktree.h"
 #include "submodule-config.h"
 #include "path.h"
+#include "pack.h"
 
 static int get_st_mode_bits(const char *path, int *mode)
 {
diff --git a/server-info.c b/server-info.c
index 5ec5b1d82..e9bc18a8c 100644
--- a/server-info.c
+++ b/server-info.c
@@ -3,6 +3,7 @@
 #include "object.h"
 #include "commit.h"
 #include "tag.h"
+#include "pack.h"
 
 /*
  * Create the file "path" by writing to a temporary file and renaming
diff --git a/sha1_file.c b/sha1_file.c
index b956ca0c9..bbce60f1c 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -717,220 +717,6 @@ void *xmmap(void *start, size_t length,
 	return ret;
 }
 
-void (*report_garbage)(unsigned seen_bits, const char *path);
-
-static void report_helper(const struct string_list *list,
-			  int seen_bits, int first, int last)
-{
-	if (seen_bits == (PACKDIR_FILE_PACK|PACKDIR_FILE_IDX))
-		return;
-
-	for (; first < last; first++)
-		report_garbage(seen_bits, list->items[first].string);
-}
-
-static void report_pack_garbage(struct string_list *list)
-{
-	int i, baselen = -1, first = 0, seen_bits = 0;
-
-	if (!report_garbage)
-		return;
-
-	string_list_sort(list);
-
-	for (i = 0; i < list->nr; i++) {
-		const char *path = list->items[i].string;
-		if (baselen != -1 &&
-		    strncmp(path, list->items[first].string, baselen)) {
-			report_helper(list, seen_bits, first, i);
-			baselen = -1;
-			seen_bits = 0;
-		}
-		if (baselen == -1) {
-			const char *dot = strrchr(path, '.');
-			if (!dot) {
-				report_garbage(PACKDIR_FILE_GARBAGE, path);
-				continue;
-			}
-			baselen = dot - path + 1;
-			first = i;
-		}
-		if (!strcmp(path + baselen, "pack"))
-			seen_bits |= 1;
-		else if (!strcmp(path + baselen, "idx"))
-			seen_bits |= 2;
-	}
-	report_helper(list, seen_bits, first, list->nr);
-}
-
-static void prepare_packed_git_one(char *objdir, int local)
-{
-	struct strbuf path = STRBUF_INIT;
-	size_t dirnamelen;
-	DIR *dir;
-	struct dirent *de;
-	struct string_list garbage = STRING_LIST_INIT_DUP;
-
-	strbuf_addstr(&path, objdir);
-	strbuf_addstr(&path, "/pack");
-	dir = opendir(path.buf);
-	if (!dir) {
-		if (errno != ENOENT)
-			error_errno("unable to open object pack directory: %s",
-				    path.buf);
-		strbuf_release(&path);
-		return;
-	}
-	strbuf_addch(&path, '/');
-	dirnamelen = path.len;
-	while ((de = readdir(dir)) != NULL) {
-		struct packed_git *p;
-		size_t base_len;
-
-		if (is_dot_or_dotdot(de->d_name))
-			continue;
-
-		strbuf_setlen(&path, dirnamelen);
-		strbuf_addstr(&path, de->d_name);
-
-		base_len = path.len;
-		if (strip_suffix_mem(path.buf, &base_len, ".idx")) {
-			/* Don't reopen a pack we already have. */
-			for (p = packed_git; p; p = p->next) {
-				size_t len;
-				if (strip_suffix(p->pack_name, ".pack", &len) &&
-				    len == base_len &&
-				    !memcmp(p->pack_name, path.buf, len))
-					break;
-			}
-			if (p == NULL &&
-			    /*
-			     * See if it really is a valid .idx file with
-			     * corresponding .pack file that we can map.
-			     */
-			    (p = add_packed_git(path.buf, path.len, local)) != NULL)
-				install_packed_git(p);
-		}
-
-		if (!report_garbage)
-			continue;
-
-		if (ends_with(de->d_name, ".idx") ||
-		    ends_with(de->d_name, ".pack") ||
-		    ends_with(de->d_name, ".bitmap") ||
-		    ends_with(de->d_name, ".keep"))
-			string_list_append(&garbage, path.buf);
-		else
-			report_garbage(PACKDIR_FILE_GARBAGE, path.buf);
-	}
-	closedir(dir);
-	report_pack_garbage(&garbage);
-	string_list_clear(&garbage, 0);
-	strbuf_release(&path);
-}
-
-static int approximate_object_count_valid;
-
-/*
- * Give a fast, rough count of the number of objects in the repository. This
- * ignores loose objects completely. If you have a lot of them, then either
- * you should repack because your performance will be awful, or they are
- * all unreachable objects about to be pruned, in which case they're not really
- * interesting as a measure of repo size in the first place.
- */
-unsigned long approximate_object_count(void)
-{
-	static unsigned long count;
-	if (!approximate_object_count_valid) {
-		struct packed_git *p;
-
-		prepare_packed_git();
-		count = 0;
-		for (p = packed_git; p; p = p->next) {
-			if (open_pack_index(p))
-				continue;
-			count += p->num_objects;
-		}
-	}
-	return count;
-}
-
-static void *get_next_packed_git(const void *p)
-{
-	return ((const struct packed_git *)p)->next;
-}
-
-static void set_next_packed_git(void *p, void *next)
-{
-	((struct packed_git *)p)->next = next;
-}
-
-static int sort_pack(const void *a_, const void *b_)
-{
-	const struct packed_git *a = a_;
-	const struct packed_git *b = b_;
-	int st;
-
-	/*
-	 * Local packs tend to contain objects specific to our
-	 * variant of the project than remote ones.  In addition,
-	 * remote ones could be on a network mounted filesystem.
-	 * Favor local ones for these reasons.
-	 */
-	st = a->pack_local - b->pack_local;
-	if (st)
-		return -st;
-
-	/*
-	 * Younger packs tend to contain more recent objects,
-	 * and more recent objects tend to get accessed more
-	 * often.
-	 */
-	if (a->mtime < b->mtime)
-		return 1;
-	else if (a->mtime == b->mtime)
-		return 0;
-	return -1;
-}
-
-static void rearrange_packed_git(void)
-{
-	packed_git = llist_mergesort(packed_git, get_next_packed_git,
-				     set_next_packed_git, sort_pack);
-}
-
-static void prepare_packed_git_mru(void)
-{
-	struct packed_git *p;
-
-	mru_clear(packed_git_mru);
-	for (p = packed_git; p; p = p->next)
-		mru_append(packed_git_mru, p);
-}
-
-static int prepare_packed_git_run_once = 0;
-void prepare_packed_git(void)
-{
-	struct alternate_object_database *alt;
-
-	if (prepare_packed_git_run_once)
-		return;
-	prepare_packed_git_one(get_object_directory(), 1);
-	prepare_alt_odb();
-	for (alt = alt_odb_list; alt; alt = alt->next)
-		prepare_packed_git_one(alt->path, 0);
-	rearrange_packed_git();
-	prepare_packed_git_mru();
-	prepare_packed_git_run_once = 1;
-}
-
-void reprepare_packed_git(void)
-{
-	approximate_object_count_valid = 0;
-	prepare_packed_git_run_once = 0;
-	prepare_packed_git();
-}
-
 static void mark_bad_packed_object(struct packed_git *p,
 				   const unsigned char *sha1)
 {
-- 
2.14.0.434.g98096fd7a8-goog



* [PATCH v2 12/25] pack: move unpack_object_header_buffer()
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (22 preceding siblings ...)
  2017-08-09  1:22 ` [PATCH v2 11/25] pack: move {,re}prepare_packed_git and approximate_object_count Jonathan Tan
@ 2017-08-09  1:22 ` Jonathan Tan
  2017-08-09  1:22 ` [PATCH v2 13/25] pack: move get_size_from_delta() Jonathan Tan
                   ` (36 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-09  1:22 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, sbeller

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 cache.h     |  1 -
 pack.h      |  2 ++
 packfile.c  | 25 +++++++++++++++++++++++++
 sha1_file.c | 25 -------------------------
 4 files changed, 27 insertions(+), 26 deletions(-)

diff --git a/cache.h b/cache.h
index f020dfade..9c70759a6 100644
--- a/cache.h
+++ b/cache.h
@@ -1661,7 +1661,6 @@ extern off_t find_pack_entry_one(const unsigned char *sha1, struct packed_git *)
 
 extern int is_pack_valid(struct packed_git *);
 extern void *unpack_entry(struct packed_git *, off_t, enum object_type *, unsigned long *);
-extern unsigned long unpack_object_header_buffer(const unsigned char *buf, unsigned long len, enum object_type *type, unsigned long *sizep);
 extern unsigned long get_size_from_delta(struct packed_git *, struct pack_window **, off_t);
 extern int unpack_object_header(struct packed_git *, struct pack_window **, off_t *, unsigned long *);
 
diff --git a/pack.h b/pack.h
index cad5ed488..4a7f88a38 100644
--- a/pack.h
+++ b/pack.h
@@ -167,4 +167,6 @@ extern void reprepare_packed_git(void);
  */
 unsigned long approximate_object_count(void);
 
+extern unsigned long unpack_object_header_buffer(const unsigned char *buf, unsigned long len, enum object_type *type, unsigned long *sizep);
+
 #endif
diff --git a/packfile.c b/packfile.c
index a517172f7..6e4f1c6e3 100644
--- a/packfile.c
+++ b/packfile.c
@@ -883,3 +883,28 @@ void reprepare_packed_git(void)
 	prepare_packed_git_run_once = 0;
 	prepare_packed_git();
 }
+
+unsigned long unpack_object_header_buffer(const unsigned char *buf,
+		unsigned long len, enum object_type *type, unsigned long *sizep)
+{
+	unsigned shift;
+	unsigned long size, c;
+	unsigned long used = 0;
+
+	c = buf[used++];
+	*type = (c >> 4) & 7;
+	size = c & 15;
+	shift = 4;
+	while (c & 0x80) {
+		if (len <= used || bitsizeof(long) <= shift) {
+			error("bad object header");
+			size = used = 0;
+			break;
+		}
+		c = buf[used++];
+		size += (c & 0x7f) << shift;
+		shift += 7;
+	}
+	*sizep = size;
+	return used;
+}
diff --git a/sha1_file.c b/sha1_file.c
index bbce60f1c..1f4b4ba2c 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -913,31 +913,6 @@ void *map_sha1_file(const unsigned char *sha1, unsigned long *size)
 	return map_sha1_file_1(NULL, sha1, size);
 }
 
-unsigned long unpack_object_header_buffer(const unsigned char *buf,
-		unsigned long len, enum object_type *type, unsigned long *sizep)
-{
-	unsigned shift;
-	unsigned long size, c;
-	unsigned long used = 0;
-
-	c = buf[used++];
-	*type = (c >> 4) & 7;
-	size = c & 15;
-	shift = 4;
-	while (c & 0x80) {
-		if (len <= used || bitsizeof(long) <= shift) {
-			error("bad object header");
-			size = used = 0;
-			break;
-		}
-		c = buf[used++];
-		size += (c & 0x7f) << shift;
-		shift += 7;
-	}
-	*sizep = size;
-	return used;
-}
-
 static int unpack_sha1_short_header(git_zstream *stream,
 				    unsigned char *map, unsigned long mapsize,
 				    void *buffer, unsigned long bufsiz)
-- 
2.14.0.434.g98096fd7a8-goog



* [PATCH v2 13/25] pack: move get_size_from_delta()
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (23 preceding siblings ...)
  2017-08-09  1:22 ` [PATCH v2 12/25] pack: move unpack_object_header_buffer() Jonathan Tan
@ 2017-08-09  1:22 ` Jonathan Tan
  2017-08-09  1:22 ` [PATCH v2 14/25] pack: move unpack_object_header() Jonathan Tan
                   ` (35 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-09  1:22 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, sbeller

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 cache.h     |  1 -
 pack.h      |  1 +
 packfile.c  | 40 ++++++++++++++++++++++++++++++++++++++++
 sha1_file.c | 39 ---------------------------------------
 4 files changed, 41 insertions(+), 40 deletions(-)

diff --git a/cache.h b/cache.h
index 9c70759a6..e29918c75 100644
--- a/cache.h
+++ b/cache.h
@@ -1661,7 +1661,6 @@ extern off_t find_pack_entry_one(const unsigned char *sha1, struct packed_git *)
 
 extern int is_pack_valid(struct packed_git *);
 extern void *unpack_entry(struct packed_git *, off_t, enum object_type *, unsigned long *);
-extern unsigned long get_size_from_delta(struct packed_git *, struct pack_window **, off_t);
 extern int unpack_object_header(struct packed_git *, struct pack_window **, off_t *, unsigned long *);
 
 /*
diff --git a/pack.h b/pack.h
index 4a7f88a38..69c92d8d2 100644
--- a/pack.h
+++ b/pack.h
@@ -168,5 +168,6 @@ extern void reprepare_packed_git(void);
 unsigned long approximate_object_count(void);
 
 extern unsigned long unpack_object_header_buffer(const unsigned char *buf, unsigned long len, enum object_type *type, unsigned long *sizep);
+extern unsigned long get_size_from_delta(struct packed_git *, struct pack_window **, off_t);
 
 #endif
diff --git a/packfile.c b/packfile.c
index 6e4f1c6e3..511afad85 100644
--- a/packfile.c
+++ b/packfile.c
@@ -3,6 +3,7 @@
 #include "pack.h"
 #include "dir.h"
 #include "mergesort.h"
+#include "delta.h"
 
 char *odb_pack_name(struct strbuf *buf,
 		    const unsigned char *sha1,
@@ -908,3 +909,42 @@ unsigned long unpack_object_header_buffer(const unsigned char *buf,
 	*sizep = size;
 	return used;
 }
+
+unsigned long get_size_from_delta(struct packed_git *p,
+				  struct pack_window **w_curs,
+			          off_t curpos)
+{
+	const unsigned char *data;
+	unsigned char delta_head[20], *in;
+	git_zstream stream;
+	int st;
+
+	memset(&stream, 0, sizeof(stream));
+	stream.next_out = delta_head;
+	stream.avail_out = sizeof(delta_head);
+
+	git_inflate_init(&stream);
+	do {
+		in = use_pack(p, w_curs, curpos, &stream.avail_in);
+		stream.next_in = in;
+		st = git_inflate(&stream, Z_FINISH);
+		curpos += stream.next_in - in;
+	} while ((st == Z_OK || st == Z_BUF_ERROR) &&
+		 stream.total_out < sizeof(delta_head));
+	git_inflate_end(&stream);
+	if ((st != Z_STREAM_END) && stream.total_out != sizeof(delta_head)) {
+		error("delta data unpack-initial failed");
+		return 0;
+	}
+
+	/* Examine the initial part of the delta to figure out
+	 * the result size.
+	 */
+	data = delta_head;
+
+	/* ignore base size */
+	get_delta_hdr_size(&data, delta_head+sizeof(delta_head));
+
+	/* Read the result size */
+	return get_delta_hdr_size(&data, delta_head+sizeof(delta_head));
+}
diff --git a/sha1_file.c b/sha1_file.c
index 1f4b4ba2c..7d354d9b6 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1099,45 +1099,6 @@ int parse_sha1_header(const char *hdr, unsigned long *sizep)
 	return parse_sha1_header_extended(hdr, &oi, 0);
 }
 
-unsigned long get_size_from_delta(struct packed_git *p,
-				  struct pack_window **w_curs,
-			          off_t curpos)
-{
-	const unsigned char *data;
-	unsigned char delta_head[20], *in;
-	git_zstream stream;
-	int st;
-
-	memset(&stream, 0, sizeof(stream));
-	stream.next_out = delta_head;
-	stream.avail_out = sizeof(delta_head);
-
-	git_inflate_init(&stream);
-	do {
-		in = use_pack(p, w_curs, curpos, &stream.avail_in);
-		stream.next_in = in;
-		st = git_inflate(&stream, Z_FINISH);
-		curpos += stream.next_in - in;
-	} while ((st == Z_OK || st == Z_BUF_ERROR) &&
-		 stream.total_out < sizeof(delta_head));
-	git_inflate_end(&stream);
-	if ((st != Z_STREAM_END) && stream.total_out != sizeof(delta_head)) {
-		error("delta data unpack-initial failed");
-		return 0;
-	}
-
-	/* Examine the initial part of the delta to figure out
-	 * the result size.
-	 */
-	data = delta_head;
-
-	/* ignore base size */
-	get_delta_hdr_size(&data, delta_head+sizeof(delta_head));
-
-	/* Read the result size */
-	return get_delta_hdr_size(&data, delta_head+sizeof(delta_head));
-}
-
 static off_t get_delta_base(struct packed_git *p,
 				    struct pack_window **w_curs,
 				    off_t *curpos,
-- 
2.14.0.434.g98096fd7a8-goog



* [PATCH v2 14/25] pack: move unpack_object_header()
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (24 preceding siblings ...)
  2017-08-09  1:22 ` [PATCH v2 13/25] pack: move get_size_from_delta() Jonathan Tan
@ 2017-08-09  1:22 ` Jonathan Tan
  2017-08-09  1:22 ` [PATCH v2 15/25] sha1_file: set whence in storage-specific info fn Jonathan Tan
                   ` (34 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-09  1:22 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, sbeller

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
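
Note for reviewers: the comment in unpack_object_header() below relies on
the pack object header encoding that unpack_object_header_buffer() decodes.
The following is a minimal standalone sketch of that decoding, not the code
being moved. The name decode_object_header_sketch() is hypothetical, and
git's error() reporting is replaced by a plain 0 return.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Sketch of the pack object header layout (hypothetical helper):
 *   first byte:  bit 7 = "more bytes follow", bits 4-6 = type,
 *                bits 0-3 = lowest 4 bits of the size
 *   later bytes: bit 7 = "more bytes follow", bits 0-6 = next 7 size bits
 * Returns the number of bytes consumed, or 0 if the header is truncated
 * or the size would not fit in an unsigned long. Outputs are written
 * only on success.
 */
size_t decode_object_header_sketch(const unsigned char *buf, size_t len,
				   unsigned *type, unsigned long *size)
{
	size_t used = 0;
	unsigned shift = 4;
	unsigned long c, sz;
	unsigned ty;

	assert(buf && type && size);
	if (!len)
		return 0;

	c = buf[used++];
	ty = (c >> 4) & 7;
	sz = c & 15;
	while (c & 0x80) {
		if (len <= used || sizeof(long) * CHAR_BIT <= shift)
			return 0;
		c = buf[used++];
		sz += (c & 0x7f) << shift;
		shift += 7;
	}
	*type = ty;
	*size = sz;
	return used;
}
```

For instance, the two bytes 0x95 0x0a carry type 1 and size 5 + (10 << 4),
consuming both bytes. Passing only the first byte is reported as a
truncated header.
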
 cache.h     |  1 -
 pack.h      |  1 +
 packfile.c  | 26 ++++++++++++++++++++++++++
 sha1_file.c | 26 --------------------------
 4 files changed, 27 insertions(+), 27 deletions(-)

diff --git a/cache.h b/cache.h
index e29918c75..1ae3333a0 100644
--- a/cache.h
+++ b/cache.h
@@ -1661,7 +1661,6 @@ extern off_t find_pack_entry_one(const unsigned char *sha1, struct packed_git *)
 
 extern int is_pack_valid(struct packed_git *);
 extern void *unpack_entry(struct packed_git *, off_t, enum object_type *, unsigned long *);
-extern int unpack_object_header(struct packed_git *, struct pack_window **, off_t *, unsigned long *);
 
 /*
  * Iterate over the files in the loose-object parts of the object
diff --git a/pack.h b/pack.h
index 69c92d8d2..5e3552392 100644
--- a/pack.h
+++ b/pack.h
@@ -169,5 +169,6 @@ unsigned long approximate_object_count(void);
 
 extern unsigned long unpack_object_header_buffer(const unsigned char *buf, unsigned long len, enum object_type *type, unsigned long *sizep);
 extern unsigned long get_size_from_delta(struct packed_git *, struct pack_window **, off_t);
+extern int unpack_object_header(struct packed_git *, struct pack_window **, off_t *, unsigned long *);
 
 #endif
diff --git a/packfile.c b/packfile.c
index 511afad85..a4db78ea0 100644
--- a/packfile.c
+++ b/packfile.c
@@ -948,3 +948,29 @@ unsigned long get_size_from_delta(struct packed_git *p,
 	/* Read the result size */
 	return get_delta_hdr_size(&data, delta_head+sizeof(delta_head));
 }
+
+int unpack_object_header(struct packed_git *p,
+			 struct pack_window **w_curs,
+			 off_t *curpos,
+			 unsigned long *sizep)
+{
+	unsigned char *base;
+	unsigned long left;
+	unsigned long used;
+	enum object_type type;
+
+	/* use_pack() assures us we have [base, base + 20) available
+	 * as a range that we can look at.  (Its actually the hash
+	 * size that is assured.)  With our object header encoding
+	 * the maximum deflated object size is 2^137, which is just
+	 * insane, so we know won't exceed what we have been given.
+	 */
+	base = use_pack(p, w_curs, *curpos, &left);
+	used = unpack_object_header_buffer(base, left, &type, sizep);
+	if (!used) {
+		type = OBJ_BAD;
+	} else
+		*curpos += used;
+
+	return type;
+}
diff --git a/sha1_file.c b/sha1_file.c
index 7d354d9b6..f3bcdae17 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1170,32 +1170,6 @@ static const unsigned char *get_delta_base_sha1(struct packed_git *p,
 		return NULL;
 }
 
-int unpack_object_header(struct packed_git *p,
-			 struct pack_window **w_curs,
-			 off_t *curpos,
-			 unsigned long *sizep)
-{
-	unsigned char *base;
-	unsigned long left;
-	unsigned long used;
-	enum object_type type;
-
-	/* use_pack() assures us we have [base, base + 20) available
-	 * as a range that we can look at.  (Its actually the hash
-	 * size that is assured.)  With our object header encoding
-	 * the maximum deflated object size is 2^137, which is just
-	 * insane, so we know won't exceed what we have been given.
-	 */
-	base = use_pack(p, w_curs, *curpos, &left);
-	used = unpack_object_header_buffer(base, left, &type, sizep);
-	if (!used) {
-		type = OBJ_BAD;
-	} else
-		*curpos += used;
-
-	return type;
-}
-
 static int retry_bad_packed_offset(struct packed_git *p, off_t obj_offset)
 {
 	int type;
-- 
2.14.0.434.g98096fd7a8-goog



* [PATCH v2 15/25] sha1_file: set whence in storage-specific info fn
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (25 preceding siblings ...)
  2017-08-09  1:22 ` [PATCH v2 14/25] pack: move unpack_object_header() Jonathan Tan
@ 2017-08-09  1:22 ` Jonathan Tan
  2017-08-09  1:22 ` [PATCH v2 16/25] sha1_file: remove read_packed_sha1() Jonathan Tan
                   ` (33 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-09  1:22 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, sbeller

Set oi->whence in sha1_loose_object_info() and packed_object_info()
instead of in their caller. sha1_object_info_extended() then no longer
needs to know about the delta base cache.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
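
Note for reviewers: the shape of the change in the commit message above can
be sketched with toy stand-ins. Everything suffixed _sketch is hypothetical
and much simpler than the real functions. The enum values reuse the names
from the diff, but their order here is arbitrary. The point is only that
each storage-specific function records where the object came from, so the
dispatching caller never consults the delta base cache itself.

```c
#include <assert.h>

/* Toy stand-ins for git's types; not the real definitions. */
enum whence_sketch { OI_CACHED, OI_LOOSE, OI_PACKED, OI_DBCACHED };

struct object_info_sketch {
	enum whence_sketch whence;
};

/* Loose-storage lookup: records the origin itself. */
int loose_object_info_sketch(struct object_info_sketch *oi)
{
	assert(oi);
	oi->whence = OI_LOOSE;
	return 0;
}

/*
 * Pack lookup: the only place that looks at the (simulated) delta base
 * cache, passed in here as a plain flag.
 */
int packed_object_info_sketch(struct object_info_sketch *oi,
			      int in_delta_base_cache)
{
	assert(oi);
	oi->whence = in_delta_base_cache ? OI_DBCACHED : OI_PACKED;
	return 0;
}

/* Caller: dispatches to a storage backend and never sets whence. */
int object_info_extended_sketch(struct object_info_sketch *oi,
				int found_in_pack, int in_delta_base_cache)
{
	if (!found_in_pack)
		return loose_object_info_sketch(oi);
	return packed_object_info_sketch(oi, in_delta_base_cache);
}
```

After the call, the caller can still branch on oi->whence (as the real
code does for OI_PACKED) without knowing how it was determined.
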
 sha1_file.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index f3bcdae17..9eadda388 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1473,6 +1473,9 @@ int packed_object_info(struct packed_git *p, off_t obj_offset,
 			hashclr(oi->delta_base_sha1);
 	}
 
+	oi->whence = in_delta_base_cache(p, obj_offset) ? OI_DBCACHED :
+							  OI_PACKED;
+
 out:
 	unuse_pack(&w_curs);
 	return type;
@@ -2002,6 +2005,7 @@ static int sha1_loose_object_info(const unsigned char *sha1,
 	if (oi->sizep == &size_scratch)
 		oi->sizep = NULL;
 	strbuf_release(&hdrbuf);
+	oi->whence = OI_LOOSE;
 	return (status < 0) ? status : 0;
 }
 
@@ -2039,10 +2043,8 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi,
 
 	if (!find_pack_entry(real, &e)) {
 		/* Most likely it's a loose object. */
-		if (!sha1_loose_object_info(real, oi, flags)) {
-			oi->whence = OI_LOOSE;
+		if (!sha1_loose_object_info(real, oi, flags))
 			return 0;
-		}
 
 		/* Not a loose object; someone else may have just packed it. */
 		if (flags & OBJECT_INFO_QUICK) {
@@ -2065,10 +2067,7 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi,
 	if (rtype < 0) {
 		mark_bad_packed_object(e.p, real);
 		return sha1_object_info_extended(real, oi, 0);
-	} else if (in_delta_base_cache(e.p, e.offset)) {
-		oi->whence = OI_DBCACHED;
-	} else {
-		oi->whence = OI_PACKED;
+	} else if (oi->whence == OI_PACKED) {
 		oi->u.packed.offset = e.offset;
 		oi->u.packed.pack = e.p;
 		oi->u.packed.is_delta = (rtype == OBJ_REF_DELTA ||
-- 
2.14.0.434.g98096fd7a8-goog



* [PATCH v2 16/25] sha1_file: remove read_packed_sha1()
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (26 preceding siblings ...)
  2017-08-09  1:22 ` [PATCH v2 15/25] sha1_file: set whence in storage-specific info fn Jonathan Tan
@ 2017-08-09  1:22 ` Jonathan Tan
  2017-08-09  1:22 ` [PATCH v2 17/25] pack: move packed_object_info(), unpack_entry() Jonathan Tan
                   ` (32 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-09  1:22 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, sbeller

Use read_object() in its place. This avoids duplication of code.

This makes force_object_loose() slightly slower (because of a redundant
check of loose object storage), but only in the error case.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
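A toy model of the trade-off described above, not the real code: every
name below (toy_store, toy_probe_loose(), toy_read_object(),
toy_force_loose()) is hypothetical, and storage is reduced to two
pointers. It only illustrates why routing the read through a generic
reader costs one extra loose-storage probe, and only when the read
fails.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for object storage: one packed and one loose copy. */
struct toy_store {
	const char *packed; /* NULL: not readable from any pack */
	const char *loose;  /* NULL: no loose copy */
	int loose_probes;   /* how many times loose storage was consulted */
};

static const char *toy_probe_loose(struct toy_store *s)
{
	s->loose_probes++;
	return s->loose;
}

/* Generic reader: try packed storage, then fall back to loose storage. */
static const char *toy_read_object(struct toy_store *s)
{
	if (s->packed)
		return s->packed;
	return toy_probe_loose(s);
}

/* Returns 0 if a loose copy exists afterwards, -1 if the object is unreadable. */
static int toy_force_loose(struct toy_store *s)
{
	const char *buf;

	if (toy_probe_loose(s))
		return 0;          /* already loose, nothing to do */
	buf = toy_read_object(s);  /* generic reader, not a pack-only one */
	if (!buf)
		return -1;         /* error case: loose storage was probed twice */
	s->loose = buf;            /* stand-in for writing the loose file */
	return 0;
}
```

In this model a readable packed object costs a single loose probe (the
initial check), while an unreadable one costs two.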
 sha1_file.c | 26 +-------------------------
 1 file changed, 1 insertion(+), 25 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 9eadda388..9e5444334 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -2091,30 +2091,6 @@ int sha1_object_info(const unsigned char *sha1, unsigned long *sizep)
 	return type;
 }
 
-static void *read_packed_sha1(const unsigned char *sha1,
-			      enum object_type *type, unsigned long *size)
-{
-	struct pack_entry e;
-	void *data;
-
-	if (!find_pack_entry(sha1, &e))
-		return NULL;
-	data = cache_or_unpack_entry(e.p, e.offset, size, type);
-	if (!data) {
-		/*
-		 * We're probably in deep shit, but let's try to fetch
-		 * the required object anyway from another pack or loose.
-		 * This should happen only in the presence of a corrupted
-		 * pack, and is better than failing outright.
-		 */
-		error("failed to read object %s at offset %"PRIuMAX" from %s",
-		      sha1_to_hex(sha1), (uintmax_t)e.offset, e.p->pack_name);
-		mark_bad_packed_object(e.p, sha1);
-		data = read_object(sha1, type, size);
-	}
-	return data;
-}
-
 int pretend_sha1_file(void *buf, unsigned long len, enum object_type type,
 		      unsigned char *sha1)
 {
@@ -2497,7 +2473,7 @@ int force_object_loose(const unsigned char *sha1, time_t mtime)
 
 	if (has_loose_object(sha1))
 		return 0;
-	buf = read_packed_sha1(sha1, &type, &len);
+	buf = read_object(sha1, &type, &len);
 	if (!buf)
 		return error("cannot read sha1_file for %s", sha1_to_hex(sha1));
 	hdrlen = xsnprintf(hdr, sizeof(hdr), "%s %lu", typename(type), len) + 1;
-- 
2.14.0.434.g98096fd7a8-goog



* [PATCH v2 17/25] pack: move packed_object_info(), unpack_entry()
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (27 preceding siblings ...)
  2017-08-09  1:22 ` [PATCH v2 16/25] sha1_file: remove read_packed_sha1() Jonathan Tan
@ 2017-08-09  1:22 ` Jonathan Tan
  2017-08-09  1:22 ` [PATCH v2 18/25] pack: move nth_packed_object_{sha1,oid} Jonathan Tan
                   ` (31 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-09  1:22 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, sbeller

Both sha1_file.c and packfile.c now need read_object(), so a copy of
read_object() was created in packfile.c.

This patch makes both mark_bad_packed_object() and has_packed_and_bad()
global. Unlike most of the functions made global by the other patches in
this series, these two functions need to remain global.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
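For readers unfamiliar with the two functions that become global, here
is a minimal standalone sketch of the bookkeeping they perform: a
per-pack array of raw hashes stored back to back, appended to without
duplicates. The names (toy_pack, toy_is_bad(), toy_mark_bad(),
TOY_RAWSZ) are hypothetical, and the sketch uses plain realloc() with
an error return where the real code uses xrealloc().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_RAWSZ 20 /* raw SHA-1 length in bytes */

/* Hypothetical stand-in for the bad-object fields of a pack. */
struct toy_pack {
	unsigned char *bad; /* num_bad * TOY_RAWSZ bytes, back to back */
	unsigned num_bad;
};

/* Returns 1 if hash is already recorded as bad in this pack, else 0. */
static int toy_is_bad(const struct toy_pack *p, const unsigned char *hash)
{
	unsigned i;

	for (i = 0; i < p->num_bad; i++)
		if (!memcmp(hash, p->bad + TOY_RAWSZ * i, TOY_RAWSZ))
			return 1;
	return 0;
}

/* Records hash as bad once; returns 0 on success, -1 if allocation fails. */
static int toy_mark_bad(struct toy_pack *p, const unsigned char *hash)
{
	unsigned char *grown;

	if (toy_is_bad(p, hash))
		return 0; /* already recorded, keep the list duplicate-free */
	grown = realloc(p->bad, (size_t)TOY_RAWSZ * (p->num_bad + 1));
	if (!grown)
		return -1;
	p->bad = grown;
	memcpy(p->bad + TOY_RAWSZ * p->num_bad, hash, TOY_RAWSZ);
	p->num_bad++;
	return 0;
}
```

Marking the same hash twice leaves the count unchanged, which is the
property the linear scan at the top of the marking function provides.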
 cache.h     |   7 -
 pack.h      |  11 +
 packfile.c  | 660 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 sha1_file.c | 676 ++----------------------------------------------------------
 4 files changed, 685 insertions(+), 669 deletions(-)

diff --git a/cache.h b/cache.h
index 1ae3333a0..b14098bf1 100644
--- a/cache.h
+++ b/cache.h
@@ -1186,9 +1186,6 @@ extern void *map_sha1_file(const unsigned char *sha1, unsigned long *size);
 extern int unpack_sha1_header(git_zstream *stream, unsigned char *map, unsigned long mapsize, void *buffer, unsigned long bufsiz);
 extern int parse_sha1_header(const char *hdr, unsigned long *sizep);
 
-/* global flag to enable extra checks when accessing packed objects */
-extern int do_check_packed_object_crc;
-
 extern int check_sha1_signature(const unsigned char *sha1, void *buf, unsigned long size, const char *type);
 
 extern int finalize_object_file(const char *tmpfile, const char *filename);
@@ -1621,8 +1618,6 @@ extern int odb_mkstemp(struct strbuf *template, const char *pattern);
  */
 extern int odb_pack_keep(const char *name);
 
-extern void clear_delta_base_cache(void);
-
 /*
  * Make sure that a pointer access into an mmap'd index file is within bounds,
  * and can provide at least 8 bytes of data.
@@ -1660,7 +1655,6 @@ extern off_t nth_packed_object_offset(const struct packed_git *, uint32_t n);
 extern off_t find_pack_entry_one(const unsigned char *sha1, struct packed_git *);
 
 extern int is_pack_valid(struct packed_git *);
-extern void *unpack_entry(struct packed_git *, off_t, enum object_type *, unsigned long *);
 
 /*
  * Iterate over the files in the loose-object parts of the object
@@ -1771,7 +1765,6 @@ struct object_info {
 /* Do not retry packed storage after checking packed and loose storage */
 #define OBJECT_INFO_QUICK 8
 extern int sha1_object_info_extended(const unsigned char *, struct object_info *, unsigned flags);
-extern int packed_object_info(struct packed_git *pack, off_t offset, struct object_info *);
 
 /* Dumb servers support */
 extern int update_server_info(int);
diff --git a/pack.h b/pack.h
index 5e3552392..2e6f357c3 100644
--- a/pack.h
+++ b/pack.h
@@ -171,4 +171,15 @@ extern unsigned long unpack_object_header_buffer(const unsigned char *buf, unsig
 extern unsigned long get_size_from_delta(struct packed_git *, struct pack_window **, off_t);
 extern int unpack_object_header(struct packed_git *, struct pack_window **, off_t *, unsigned long *);
 
+extern void clear_delta_base_cache(void);
+
+/* global flag to enable extra checks when accessing packed objects */
+extern int do_check_packed_object_crc;
+
+extern int packed_object_info(struct packed_git *pack, off_t offset, struct object_info *);
+extern void *unpack_entry(struct packed_git *, off_t, enum object_type *, unsigned long *);
+
+extern void mark_bad_packed_object(struct packed_git *p, const unsigned char *sha1);
+extern const struct packed_git *has_packed_and_bad(const unsigned char *sha1);
+
 #endif
diff --git a/packfile.c b/packfile.c
index a4db78ea0..a3745f9df 100644
--- a/packfile.c
+++ b/packfile.c
@@ -4,6 +4,8 @@
 #include "dir.h"
 #include "mergesort.h"
 #include "delta.h"
+#include "list.h"
+#include "streaming.h"
 
 char *odb_pack_name(struct strbuf *buf,
 		    const unsigned char *sha1,
@@ -974,3 +976,661 @@ int unpack_object_header(struct packed_git *p,
 
 	return type;
 }
+
+void mark_bad_packed_object(struct packed_git *p, const unsigned char *sha1)
+{
+	unsigned i;
+	for (i = 0; i < p->num_bad_objects; i++)
+		if (!hashcmp(sha1, p->bad_object_sha1 + GIT_SHA1_RAWSZ * i))
+			return;
+	p->bad_object_sha1 = xrealloc(p->bad_object_sha1,
+				      st_mult(GIT_MAX_RAWSZ,
+					      st_add(p->num_bad_objects, 1)));
+	hashcpy(p->bad_object_sha1 + GIT_SHA1_RAWSZ * p->num_bad_objects, sha1);
+	p->num_bad_objects++;
+}
+
+const struct packed_git *has_packed_and_bad(const unsigned char *sha1)
+{
+	struct packed_git *p;
+	unsigned i;
+
+	for (p = packed_git; p; p = p->next)
+		for (i = 0; i < p->num_bad_objects; i++)
+			if (!hashcmp(sha1, p->bad_object_sha1 + 20 * i))
+				return p;
+	return NULL;
+}
+
+static off_t get_delta_base(struct packed_git *p,
+				    struct pack_window **w_curs,
+				    off_t *curpos,
+				    enum object_type type,
+				    off_t delta_obj_offset)
+{
+	unsigned char *base_info = use_pack(p, w_curs, *curpos, NULL);
+	off_t base_offset;
+
+	/* use_pack() assured us we have [base_info, base_info + 20)
+	 * as a range that we can look at without walking off the
+	 * end of the mapped window.  Its actually the hash size
+	 * that is assured.  An OFS_DELTA longer than the hash size
+	 * is stupid, as then a REF_DELTA would be smaller to store.
+	 */
+	if (type == OBJ_OFS_DELTA) {
+		unsigned used = 0;
+		unsigned char c = base_info[used++];
+		base_offset = c & 127;
+		while (c & 128) {
+			base_offset += 1;
+			if (!base_offset || MSB(base_offset, 7))
+				return 0;  /* overflow */
+			c = base_info[used++];
+			base_offset = (base_offset << 7) + (c & 127);
+		}
+		base_offset = delta_obj_offset - base_offset;
+		if (base_offset <= 0 || base_offset >= delta_obj_offset)
+			return 0;  /* out of bound */
+		*curpos += used;
+	} else if (type == OBJ_REF_DELTA) {
+		/* The base entry _must_ be in the same pack */
+		base_offset = find_pack_entry_one(base_info, p);
+		*curpos += 20;
+	} else
+		die("I am totally screwed");
+	return base_offset;
+}
+
+/*
+ * Like get_delta_base above, but we return the sha1 instead of the pack
+ * offset. This means it is cheaper for REF deltas (we do not have to do
+ * the final object lookup), but more expensive for OFS deltas (we
+ * have to load the revidx to convert the offset back into a sha1).
+ */
+static const unsigned char *get_delta_base_sha1(struct packed_git *p,
+						struct pack_window **w_curs,
+						off_t curpos,
+						enum object_type type,
+						off_t delta_obj_offset)
+{
+	if (type == OBJ_REF_DELTA) {
+		unsigned char *base = use_pack(p, w_curs, curpos, NULL);
+		return base;
+	} else if (type == OBJ_OFS_DELTA) {
+		struct revindex_entry *revidx;
+		off_t base_offset = get_delta_base(p, w_curs, &curpos,
+						   type, delta_obj_offset);
+
+		if (!base_offset)
+			return NULL;
+
+		revidx = find_pack_revindex(p, base_offset);
+		if (!revidx)
+			return NULL;
+
+		return nth_packed_object_sha1(p, revidx->nr);
+	} else
+		return NULL;
+}
+
+static int retry_bad_packed_offset(struct packed_git *p, off_t obj_offset)
+{
+	int type;
+	struct revindex_entry *revidx;
+	const unsigned char *sha1;
+	revidx = find_pack_revindex(p, obj_offset);
+	if (!revidx)
+		return OBJ_BAD;
+	sha1 = nth_packed_object_sha1(p, revidx->nr);
+	mark_bad_packed_object(p, sha1);
+	type = sha1_object_info(sha1, NULL);
+	if (type <= OBJ_NONE)
+		return OBJ_BAD;
+	return type;
+}
+
+#define POI_STACK_PREALLOC 64
+
+static enum object_type packed_to_object_type(struct packed_git *p,
+					      off_t obj_offset,
+					      enum object_type type,
+					      struct pack_window **w_curs,
+					      off_t curpos)
+{
+	off_t small_poi_stack[POI_STACK_PREALLOC];
+	off_t *poi_stack = small_poi_stack;
+	int poi_stack_nr = 0, poi_stack_alloc = POI_STACK_PREALLOC;
+
+	while (type == OBJ_OFS_DELTA || type == OBJ_REF_DELTA) {
+		off_t base_offset;
+		unsigned long size;
+		/* Push the object we're going to leave behind */
+		if (poi_stack_nr >= poi_stack_alloc && poi_stack == small_poi_stack) {
+			poi_stack_alloc = alloc_nr(poi_stack_nr);
+			ALLOC_ARRAY(poi_stack, poi_stack_alloc);
+			memcpy(poi_stack, small_poi_stack, sizeof(off_t)*poi_stack_nr);
+		} else {
+			ALLOC_GROW(poi_stack, poi_stack_nr+1, poi_stack_alloc);
+		}
+		poi_stack[poi_stack_nr++] = obj_offset;
+		/* If parsing the base offset fails, just unwind */
+		base_offset = get_delta_base(p, w_curs, &curpos, type, obj_offset);
+		if (!base_offset)
+			goto unwind;
+		curpos = obj_offset = base_offset;
+		type = unpack_object_header(p, w_curs, &curpos, &size);
+		if (type <= OBJ_NONE) {
+			/* If getting the base itself fails, we first
+			 * retry the base, otherwise unwind */
+			type = retry_bad_packed_offset(p, base_offset);
+			if (type > OBJ_NONE)
+				goto out;
+			goto unwind;
+		}
+	}
+
+	switch (type) {
+	case OBJ_BAD:
+	case OBJ_COMMIT:
+	case OBJ_TREE:
+	case OBJ_BLOB:
+	case OBJ_TAG:
+		break;
+	default:
+		error("unknown object type %i at offset %"PRIuMAX" in %s",
+		      type, (uintmax_t)obj_offset, p->pack_name);
+		type = OBJ_BAD;
+	}
+
+out:
+	if (poi_stack != small_poi_stack)
+		free(poi_stack);
+	return type;
+
+unwind:
+	while (poi_stack_nr) {
+		obj_offset = poi_stack[--poi_stack_nr];
+		type = retry_bad_packed_offset(p, obj_offset);
+		if (type > OBJ_NONE)
+			goto out;
+	}
+	type = OBJ_BAD;
+	goto out;
+}
+
+static struct hashmap delta_base_cache;
+static size_t delta_base_cached;
+
+static LIST_HEAD(delta_base_cache_lru);
+
+struct delta_base_cache_key {
+	struct packed_git *p;
+	off_t base_offset;
+};
+
+struct delta_base_cache_entry {
+	struct hashmap hash;
+	struct delta_base_cache_key key;
+	struct list_head lru;
+	void *data;
+	unsigned long size;
+	enum object_type type;
+};
+
+static unsigned int pack_entry_hash(struct packed_git *p, off_t base_offset)
+{
+	unsigned int hash;
+
+	hash = (unsigned int)(intptr_t)p + (unsigned int)base_offset;
+	hash += (hash >> 8) + (hash >> 16);
+	return hash;
+}
+
+static struct delta_base_cache_entry *
+get_delta_base_cache_entry(struct packed_git *p, off_t base_offset)
+{
+	struct hashmap_entry entry;
+	struct delta_base_cache_key key;
+
+	if (!delta_base_cache.cmpfn)
+		return NULL;
+
+	hashmap_entry_init(&entry, pack_entry_hash(p, base_offset));
+	key.p = p;
+	key.base_offset = base_offset;
+	return hashmap_get(&delta_base_cache, &entry, &key);
+}
+
+static int delta_base_cache_key_eq(const struct delta_base_cache_key *a,
+				   const struct delta_base_cache_key *b)
+{
+	return a->p == b->p && a->base_offset == b->base_offset;
+}
+
+static int delta_base_cache_hash_cmp(const void *unused_cmp_data,
+				     const void *va, const void *vb,
+				     const void *vkey)
+{
+	const struct delta_base_cache_entry *a = va, *b = vb;
+	const struct delta_base_cache_key *key = vkey;
+	if (key)
+		return !delta_base_cache_key_eq(&a->key, key);
+	else
+		return !delta_base_cache_key_eq(&a->key, &b->key);
+}
+
+static int in_delta_base_cache(struct packed_git *p, off_t base_offset)
+{
+	return !!get_delta_base_cache_entry(p, base_offset);
+}
+
+/*
+ * Remove the entry from the cache, but do _not_ free the associated
+ * entry data. The caller takes ownership of the "data" buffer, and
+ * should copy out any fields it wants before detaching.
+ */
+static void detach_delta_base_cache_entry(struct delta_base_cache_entry *ent)
+{
+	hashmap_remove(&delta_base_cache, ent, &ent->key);
+	list_del(&ent->lru);
+	delta_base_cached -= ent->size;
+	free(ent);
+}
+
+static void *cache_or_unpack_entry(struct packed_git *p, off_t base_offset,
+	unsigned long *base_size, enum object_type *type)
+{
+	struct delta_base_cache_entry *ent;
+
+	ent = get_delta_base_cache_entry(p, base_offset);
+	if (!ent)
+		return unpack_entry(p, base_offset, type, base_size);
+
+	if (type)
+		*type = ent->type;
+	if (base_size)
+		*base_size = ent->size;
+	return xmemdupz(ent->data, ent->size);
+}
+
+static inline void release_delta_base_cache(struct delta_base_cache_entry *ent)
+{
+	free(ent->data);
+	detach_delta_base_cache_entry(ent);
+}
+
+void clear_delta_base_cache(void)
+{
+	struct list_head *lru, *tmp;
+	list_for_each_safe(lru, tmp, &delta_base_cache_lru) {
+		struct delta_base_cache_entry *entry =
+			list_entry(lru, struct delta_base_cache_entry, lru);
+		release_delta_base_cache(entry);
+	}
+}
+
+static void add_delta_base_cache(struct packed_git *p, off_t base_offset,
+	void *base, unsigned long base_size, enum object_type type)
+{
+	struct delta_base_cache_entry *ent = xmalloc(sizeof(*ent));
+	struct list_head *lru, *tmp;
+
+	delta_base_cached += base_size;
+
+	list_for_each_safe(lru, tmp, &delta_base_cache_lru) {
+		struct delta_base_cache_entry *f =
+			list_entry(lru, struct delta_base_cache_entry, lru);
+		if (delta_base_cached <= delta_base_cache_limit)
+			break;
+		release_delta_base_cache(f);
+	}
+
+	ent->key.p = p;
+	ent->key.base_offset = base_offset;
+	ent->type = type;
+	ent->data = base;
+	ent->size = base_size;
+	list_add_tail(&ent->lru, &delta_base_cache_lru);
+
+	if (!delta_base_cache.cmpfn)
+		hashmap_init(&delta_base_cache, delta_base_cache_hash_cmp, NULL, 0);
+	hashmap_entry_init(ent, pack_entry_hash(p, base_offset));
+	hashmap_add(&delta_base_cache, ent);
+}
+
+int packed_object_info(struct packed_git *p, off_t obj_offset,
+		       struct object_info *oi)
+{
+	struct pack_window *w_curs = NULL;
+	unsigned long size;
+	off_t curpos = obj_offset;
+	enum object_type type;
+
+	/*
+	 * We always get the representation type, but only convert it to
+	 * a "real" type later if the caller is interested.
+	 */
+	if (oi->contentp) {
+		*oi->contentp = cache_or_unpack_entry(p, obj_offset, oi->sizep,
+						      &type);
+		if (!*oi->contentp)
+			type = OBJ_BAD;
+	} else {
+		type = unpack_object_header(p, &w_curs, &curpos, &size);
+	}
+
+	if (!oi->contentp && oi->sizep) {
+		if (type == OBJ_OFS_DELTA || type == OBJ_REF_DELTA) {
+			off_t tmp_pos = curpos;
+			off_t base_offset = get_delta_base(p, &w_curs, &tmp_pos,
+							   type, obj_offset);
+			if (!base_offset) {
+				type = OBJ_BAD;
+				goto out;
+			}
+			*oi->sizep = get_size_from_delta(p, &w_curs, tmp_pos);
+			if (*oi->sizep == 0) {
+				type = OBJ_BAD;
+				goto out;
+			}
+		} else {
+			*oi->sizep = size;
+		}
+	}
+
+	if (oi->disk_sizep) {
+		struct revindex_entry *revidx = find_pack_revindex(p, obj_offset);
+		*oi->disk_sizep = revidx[1].offset - obj_offset;
+	}
+
+	if (oi->typep || oi->typename) {
+		enum object_type ptot;
+		ptot = packed_to_object_type(p, obj_offset, type, &w_curs,
+					     curpos);
+		if (oi->typep)
+			*oi->typep = ptot;
+		if (oi->typename) {
+			const char *tn = typename(ptot);
+			if (tn)
+				strbuf_addstr(oi->typename, tn);
+		}
+		if (ptot < 0) {
+			type = OBJ_BAD;
+			goto out;
+		}
+	}
+
+	if (oi->delta_base_sha1) {
+		if (type == OBJ_OFS_DELTA || type == OBJ_REF_DELTA) {
+			const unsigned char *base;
+
+			base = get_delta_base_sha1(p, &w_curs, curpos,
+						   type, obj_offset);
+			if (!base) {
+				type = OBJ_BAD;
+				goto out;
+			}
+
+			hashcpy(oi->delta_base_sha1, base);
+		} else
+			hashclr(oi->delta_base_sha1);
+	}
+
+	oi->whence = in_delta_base_cache(p, obj_offset) ? OI_DBCACHED :
+							  OI_PACKED;
+
+out:
+	unuse_pack(&w_curs);
+	return type;
+}
+
+static void *unpack_compressed_entry(struct packed_git *p,
+				    struct pack_window **w_curs,
+				    off_t curpos,
+				    unsigned long size)
+{
+	int st;
+	git_zstream stream;
+	unsigned char *buffer, *in;
+
+	buffer = xmallocz_gently(size);
+	if (!buffer)
+		return NULL;
+	memset(&stream, 0, sizeof(stream));
+	stream.next_out = buffer;
+	stream.avail_out = size + 1;
+
+	git_inflate_init(&stream);
+	do {
+		in = use_pack(p, w_curs, curpos, &stream.avail_in);
+		stream.next_in = in;
+		st = git_inflate(&stream, Z_FINISH);
+		if (!stream.avail_out)
+			break; /* the payload is larger than it should be */
+		curpos += stream.next_in - in;
+	} while (st == Z_OK || st == Z_BUF_ERROR);
+	git_inflate_end(&stream);
+	if ((st != Z_STREAM_END) || stream.total_out != size) {
+		free(buffer);
+		return NULL;
+	}
+
+	return buffer;
+}
+
+static void write_pack_access_log(struct packed_git *p, off_t obj_offset)
+{
+	static struct trace_key pack_access = TRACE_KEY_INIT(PACK_ACCESS);
+	trace_printf_key(&pack_access, "%s %"PRIuMAX"\n",
+			 p->pack_name, (uintmax_t)obj_offset);
+}
+
+int do_check_packed_object_crc;
+
+#define UNPACK_ENTRY_STACK_PREALLOC 64
+struct unpack_entry_stack_ent {
+	off_t obj_offset;
+	off_t curpos;
+	unsigned long size;
+};
+
+static void *read_object(const unsigned char *sha1, enum object_type *type,
+			 unsigned long *size)
+{
+	struct object_info oi = OBJECT_INFO_INIT;
+	void *content;
+	oi.typep = type;
+	oi.sizep = size;
+	oi.contentp = &content;
+
+	if (sha1_object_info_extended(sha1, &oi, 0) < 0)
+		return NULL;
+	return content;
+}
+
+void *unpack_entry(struct packed_git *p, off_t obj_offset,
+		   enum object_type *final_type, unsigned long *final_size)
+{
+	struct pack_window *w_curs = NULL;
+	off_t curpos = obj_offset;
+	void *data = NULL;
+	unsigned long size;
+	enum object_type type;
+	struct unpack_entry_stack_ent small_delta_stack[UNPACK_ENTRY_STACK_PREALLOC];
+	struct unpack_entry_stack_ent *delta_stack = small_delta_stack;
+	int delta_stack_nr = 0, delta_stack_alloc = UNPACK_ENTRY_STACK_PREALLOC;
+	int base_from_cache = 0;
+
+	write_pack_access_log(p, obj_offset);
+
+	/* PHASE 1: drill down to the innermost base object */
+	for (;;) {
+		off_t base_offset;
+		int i;
+		struct delta_base_cache_entry *ent;
+
+		ent = get_delta_base_cache_entry(p, curpos);
+		if (ent) {
+			type = ent->type;
+			data = ent->data;
+			size = ent->size;
+			detach_delta_base_cache_entry(ent);
+			base_from_cache = 1;
+			break;
+		}
+
+		if (do_check_packed_object_crc && p->index_version > 1) {
+			struct revindex_entry *revidx = find_pack_revindex(p, obj_offset);
+			off_t len = revidx[1].offset - obj_offset;
+			if (check_pack_crc(p, &w_curs, obj_offset, len, revidx->nr)) {
+				const unsigned char *sha1 =
+					nth_packed_object_sha1(p, revidx->nr);
+				error("bad packed object CRC for %s",
+				      sha1_to_hex(sha1));
+				mark_bad_packed_object(p, sha1);
+				unuse_pack(&w_curs);
+				return NULL;
+			}
+		}
+
+		type = unpack_object_header(p, &w_curs, &curpos, &size);
+		if (type != OBJ_OFS_DELTA && type != OBJ_REF_DELTA)
+			break;
+
+		base_offset = get_delta_base(p, &w_curs, &curpos, type, obj_offset);
+		if (!base_offset) {
+			error("failed to validate delta base reference "
+			      "at offset %"PRIuMAX" from %s",
+			      (uintmax_t)curpos, p->pack_name);
+			/* bail to phase 2, in hopes of recovery */
+			data = NULL;
+			break;
+		}
+
+		/* push object, proceed to base */
+		if (delta_stack_nr >= delta_stack_alloc
+		    && delta_stack == small_delta_stack) {
+			delta_stack_alloc = alloc_nr(delta_stack_nr);
+			ALLOC_ARRAY(delta_stack, delta_stack_alloc);
+			memcpy(delta_stack, small_delta_stack,
+			       sizeof(*delta_stack)*delta_stack_nr);
+		} else {
+			ALLOC_GROW(delta_stack, delta_stack_nr+1, delta_stack_alloc);
+		}
+		i = delta_stack_nr++;
+		delta_stack[i].obj_offset = obj_offset;
+		delta_stack[i].curpos = curpos;
+		delta_stack[i].size = size;
+
+		curpos = obj_offset = base_offset;
+	}
+
+	/* PHASE 2: handle the base */
+	switch (type) {
+	case OBJ_OFS_DELTA:
+	case OBJ_REF_DELTA:
+		if (data)
+			die("BUG: unpack_entry: left loop at a valid delta");
+		break;
+	case OBJ_COMMIT:
+	case OBJ_TREE:
+	case OBJ_BLOB:
+	case OBJ_TAG:
+		if (!base_from_cache)
+			data = unpack_compressed_entry(p, &w_curs, curpos, size);
+		break;
+	default:
+		data = NULL;
+		error("unknown object type %i at offset %"PRIuMAX" in %s",
+		      type, (uintmax_t)obj_offset, p->pack_name);
+	}
+
+	/* PHASE 3: apply deltas in order */
+
+	/* invariants:
+	 *   'data' holds the base data, or NULL if there was corruption
+	 */
+	while (delta_stack_nr) {
+		void *delta_data;
+		void *base = data;
+		void *external_base = NULL;
+		unsigned long delta_size, base_size = size;
+		int i;
+
+		data = NULL;
+
+		if (base)
+			add_delta_base_cache(p, obj_offset, base, base_size, type);
+
+		if (!base) {
+			/*
+			 * We're probably in deep shit, but let's try to fetch
+			 * the required base anyway from another pack or loose.
+			 * This is costly but should happen only in the presence
+			 * of a corrupted pack, and is better than failing outright.
+			 */
+			struct revindex_entry *revidx;
+			const unsigned char *base_sha1;
+			revidx = find_pack_revindex(p, obj_offset);
+			if (revidx) {
+				base_sha1 = nth_packed_object_sha1(p, revidx->nr);
+				error("failed to read delta base object %s"
+				      " at offset %"PRIuMAX" from %s",
+				      sha1_to_hex(base_sha1), (uintmax_t)obj_offset,
+				      p->pack_name);
+				mark_bad_packed_object(p, base_sha1);
+				base = read_object(base_sha1, &type, &base_size);
+				external_base = base;
+			}
+		}
+
+		i = --delta_stack_nr;
+		obj_offset = delta_stack[i].obj_offset;
+		curpos = delta_stack[i].curpos;
+		delta_size = delta_stack[i].size;
+
+		if (!base)
+			continue;
+
+		delta_data = unpack_compressed_entry(p, &w_curs, curpos, delta_size);
+
+		if (!delta_data) {
+			error("failed to unpack compressed delta "
+			      "at offset %"PRIuMAX" from %s",
+			      (uintmax_t)curpos, p->pack_name);
+			data = NULL;
+			free(external_base);
+			continue;
+		}
+
+		data = patch_delta(base, base_size,
+				   delta_data, delta_size,
+				   &size);
+
+		/*
+		 * We could not apply the delta; warn the user, but keep going.
+		 * Our failure will be noticed either in the next iteration of
+		 * the loop, or if this is the final delta, in the caller when
+		 * we return NULL. Those code paths will take care of making
+		 * a more explicit warning and retrying with another copy of
+		 * the object.
+		 */
+		if (!data)
+			error("failed to apply delta");
+
+		free(delta_data);
+		free(external_base);
+	}
+
+	if (final_type)
+		*final_type = type;
+	if (final_size)
+		*final_size = size;
+
+	unuse_pack(&w_curs);
+
+	if (delta_stack != small_delta_stack)
+		free(delta_stack);
+
+	return data;
+}
diff --git a/sha1_file.c b/sha1_file.c
index 9e5444334..fe7e0db76 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -717,32 +717,6 @@ void *xmmap(void *start, size_t length,
 	return ret;
 }
 
-static void mark_bad_packed_object(struct packed_git *p,
-				   const unsigned char *sha1)
-{
-	unsigned i;
-	for (i = 0; i < p->num_bad_objects; i++)
-		if (!hashcmp(sha1, p->bad_object_sha1 + GIT_SHA1_RAWSZ * i))
-			return;
-	p->bad_object_sha1 = xrealloc(p->bad_object_sha1,
-				      st_mult(GIT_MAX_RAWSZ,
-					      st_add(p->num_bad_objects, 1)));
-	hashcpy(p->bad_object_sha1 + GIT_SHA1_RAWSZ * p->num_bad_objects, sha1);
-	p->num_bad_objects++;
-}
-
-static const struct packed_git *has_packed_and_bad(const unsigned char *sha1)
-{
-	struct packed_git *p;
-	unsigned i;
-
-	for (p = packed_git; p; p = p->next)
-		for (i = 0; i < p->num_bad_objects; i++)
-			if (!hashcmp(sha1, p->bad_object_sha1 + 20 * i))
-				return p;
-	return NULL;
-}
-
 /*
  * With an in-core object data in "map", rehash it to make sure the
  * object name actually matches "sha1" to detect object corruption.
@@ -1099,628 +1073,6 @@ int parse_sha1_header(const char *hdr, unsigned long *sizep)
 	return parse_sha1_header_extended(hdr, &oi, 0);
 }
 
-static off_t get_delta_base(struct packed_git *p,
-				    struct pack_window **w_curs,
-				    off_t *curpos,
-				    enum object_type type,
-				    off_t delta_obj_offset)
-{
-	unsigned char *base_info = use_pack(p, w_curs, *curpos, NULL);
-	off_t base_offset;
-
-	/* use_pack() assured us we have [base_info, base_info + 20)
-	 * as a range that we can look at without walking off the
-	 * end of the mapped window.  Its actually the hash size
-	 * that is assured.  An OFS_DELTA longer than the hash size
-	 * is stupid, as then a REF_DELTA would be smaller to store.
-	 */
-	if (type == OBJ_OFS_DELTA) {
-		unsigned used = 0;
-		unsigned char c = base_info[used++];
-		base_offset = c & 127;
-		while (c & 128) {
-			base_offset += 1;
-			if (!base_offset || MSB(base_offset, 7))
-				return 0;  /* overflow */
-			c = base_info[used++];
-			base_offset = (base_offset << 7) + (c & 127);
-		}
-		base_offset = delta_obj_offset - base_offset;
-		if (base_offset <= 0 || base_offset >= delta_obj_offset)
-			return 0;  /* out of bound */
-		*curpos += used;
-	} else if (type == OBJ_REF_DELTA) {
-		/* The base entry _must_ be in the same pack */
-		base_offset = find_pack_entry_one(base_info, p);
-		*curpos += 20;
-	} else
-		die("I am totally screwed");
-	return base_offset;
-}
-
-/*
- * Like get_delta_base above, but we return the sha1 instead of the pack
- * offset. This means it is cheaper for REF deltas (we do not have to do
- * the final object lookup), but more expensive for OFS deltas (we
- * have to load the revidx to convert the offset back into a sha1).
- */
-static const unsigned char *get_delta_base_sha1(struct packed_git *p,
-						struct pack_window **w_curs,
-						off_t curpos,
-						enum object_type type,
-						off_t delta_obj_offset)
-{
-	if (type == OBJ_REF_DELTA) {
-		unsigned char *base = use_pack(p, w_curs, curpos, NULL);
-		return base;
-	} else if (type == OBJ_OFS_DELTA) {
-		struct revindex_entry *revidx;
-		off_t base_offset = get_delta_base(p, w_curs, &curpos,
-						   type, delta_obj_offset);
-
-		if (!base_offset)
-			return NULL;
-
-		revidx = find_pack_revindex(p, base_offset);
-		if (!revidx)
-			return NULL;
-
-		return nth_packed_object_sha1(p, revidx->nr);
-	} else
-		return NULL;
-}
-
-static int retry_bad_packed_offset(struct packed_git *p, off_t obj_offset)
-{
-	int type;
-	struct revindex_entry *revidx;
-	const unsigned char *sha1;
-	revidx = find_pack_revindex(p, obj_offset);
-	if (!revidx)
-		return OBJ_BAD;
-	sha1 = nth_packed_object_sha1(p, revidx->nr);
-	mark_bad_packed_object(p, sha1);
-	type = sha1_object_info(sha1, NULL);
-	if (type <= OBJ_NONE)
-		return OBJ_BAD;
-	return type;
-}
-
-#define POI_STACK_PREALLOC 64
-
-static enum object_type packed_to_object_type(struct packed_git *p,
-					      off_t obj_offset,
-					      enum object_type type,
-					      struct pack_window **w_curs,
-					      off_t curpos)
-{
-	off_t small_poi_stack[POI_STACK_PREALLOC];
-	off_t *poi_stack = small_poi_stack;
-	int poi_stack_nr = 0, poi_stack_alloc = POI_STACK_PREALLOC;
-
-	while (type == OBJ_OFS_DELTA || type == OBJ_REF_DELTA) {
-		off_t base_offset;
-		unsigned long size;
-		/* Push the object we're going to leave behind */
-		if (poi_stack_nr >= poi_stack_alloc && poi_stack == small_poi_stack) {
-			poi_stack_alloc = alloc_nr(poi_stack_nr);
-			ALLOC_ARRAY(poi_stack, poi_stack_alloc);
-			memcpy(poi_stack, small_poi_stack, sizeof(off_t)*poi_stack_nr);
-		} else {
-			ALLOC_GROW(poi_stack, poi_stack_nr+1, poi_stack_alloc);
-		}
-		poi_stack[poi_stack_nr++] = obj_offset;
-		/* If parsing the base offset fails, just unwind */
-		base_offset = get_delta_base(p, w_curs, &curpos, type, obj_offset);
-		if (!base_offset)
-			goto unwind;
-		curpos = obj_offset = base_offset;
-		type = unpack_object_header(p, w_curs, &curpos, &size);
-		if (type <= OBJ_NONE) {
-			/* If getting the base itself fails, we first
-			 * retry the base, otherwise unwind */
-			type = retry_bad_packed_offset(p, base_offset);
-			if (type > OBJ_NONE)
-				goto out;
-			goto unwind;
-		}
-	}
-
-	switch (type) {
-	case OBJ_BAD:
-	case OBJ_COMMIT:
-	case OBJ_TREE:
-	case OBJ_BLOB:
-	case OBJ_TAG:
-		break;
-	default:
-		error("unknown object type %i at offset %"PRIuMAX" in %s",
-		      type, (uintmax_t)obj_offset, p->pack_name);
-		type = OBJ_BAD;
-	}
-
-out:
-	if (poi_stack != small_poi_stack)
-		free(poi_stack);
-	return type;
-
-unwind:
-	while (poi_stack_nr) {
-		obj_offset = poi_stack[--poi_stack_nr];
-		type = retry_bad_packed_offset(p, obj_offset);
-		if (type > OBJ_NONE)
-			goto out;
-	}
-	type = OBJ_BAD;
-	goto out;
-}
-
-static struct hashmap delta_base_cache;
-static size_t delta_base_cached;
-
-static LIST_HEAD(delta_base_cache_lru);
-
-struct delta_base_cache_key {
-	struct packed_git *p;
-	off_t base_offset;
-};
-
-struct delta_base_cache_entry {
-	struct hashmap hash;
-	struct delta_base_cache_key key;
-	struct list_head lru;
-	void *data;
-	unsigned long size;
-	enum object_type type;
-};
-
-static unsigned int pack_entry_hash(struct packed_git *p, off_t base_offset)
-{
-	unsigned int hash;
-
-	hash = (unsigned int)(intptr_t)p + (unsigned int)base_offset;
-	hash += (hash >> 8) + (hash >> 16);
-	return hash;
-}
-
-static struct delta_base_cache_entry *
-get_delta_base_cache_entry(struct packed_git *p, off_t base_offset)
-{
-	struct hashmap_entry entry;
-	struct delta_base_cache_key key;
-
-	if (!delta_base_cache.cmpfn)
-		return NULL;
-
-	hashmap_entry_init(&entry, pack_entry_hash(p, base_offset));
-	key.p = p;
-	key.base_offset = base_offset;
-	return hashmap_get(&delta_base_cache, &entry, &key);
-}
-
-static int delta_base_cache_key_eq(const struct delta_base_cache_key *a,
-				   const struct delta_base_cache_key *b)
-{
-	return a->p == b->p && a->base_offset == b->base_offset;
-}
-
-static int delta_base_cache_hash_cmp(const void *unused_cmp_data,
-				     const void *va, const void *vb,
-				     const void *vkey)
-{
-	const struct delta_base_cache_entry *a = va, *b = vb;
-	const struct delta_base_cache_key *key = vkey;
-	if (key)
-		return !delta_base_cache_key_eq(&a->key, key);
-	else
-		return !delta_base_cache_key_eq(&a->key, &b->key);
-}
-
-static int in_delta_base_cache(struct packed_git *p, off_t base_offset)
-{
-	return !!get_delta_base_cache_entry(p, base_offset);
-}
-
-/*
- * Remove the entry from the cache, but do _not_ free the associated
- * entry data. The caller takes ownership of the "data" buffer, and
- * should copy out any fields it wants before detaching.
- */
-static void detach_delta_base_cache_entry(struct delta_base_cache_entry *ent)
-{
-	hashmap_remove(&delta_base_cache, ent, &ent->key);
-	list_del(&ent->lru);
-	delta_base_cached -= ent->size;
-	free(ent);
-}
-
-static void *cache_or_unpack_entry(struct packed_git *p, off_t base_offset,
-	unsigned long *base_size, enum object_type *type)
-{
-	struct delta_base_cache_entry *ent;
-
-	ent = get_delta_base_cache_entry(p, base_offset);
-	if (!ent)
-		return unpack_entry(p, base_offset, type, base_size);
-
-	if (type)
-		*type = ent->type;
-	if (base_size)
-		*base_size = ent->size;
-	return xmemdupz(ent->data, ent->size);
-}
-
-static inline void release_delta_base_cache(struct delta_base_cache_entry *ent)
-{
-	free(ent->data);
-	detach_delta_base_cache_entry(ent);
-}
-
-void clear_delta_base_cache(void)
-{
-	struct list_head *lru, *tmp;
-	list_for_each_safe(lru, tmp, &delta_base_cache_lru) {
-		struct delta_base_cache_entry *entry =
-			list_entry(lru, struct delta_base_cache_entry, lru);
-		release_delta_base_cache(entry);
-	}
-}
-
-static void add_delta_base_cache(struct packed_git *p, off_t base_offset,
-	void *base, unsigned long base_size, enum object_type type)
-{
-	struct delta_base_cache_entry *ent = xmalloc(sizeof(*ent));
-	struct list_head *lru, *tmp;
-
-	delta_base_cached += base_size;
-
-	list_for_each_safe(lru, tmp, &delta_base_cache_lru) {
-		struct delta_base_cache_entry *f =
-			list_entry(lru, struct delta_base_cache_entry, lru);
-		if (delta_base_cached <= delta_base_cache_limit)
-			break;
-		release_delta_base_cache(f);
-	}
-
-	ent->key.p = p;
-	ent->key.base_offset = base_offset;
-	ent->type = type;
-	ent->data = base;
-	ent->size = base_size;
-	list_add_tail(&ent->lru, &delta_base_cache_lru);
-
-	if (!delta_base_cache.cmpfn)
-		hashmap_init(&delta_base_cache, delta_base_cache_hash_cmp, NULL, 0);
-	hashmap_entry_init(ent, pack_entry_hash(p, base_offset));
-	hashmap_add(&delta_base_cache, ent);
-}
-
-int packed_object_info(struct packed_git *p, off_t obj_offset,
-		       struct object_info *oi)
-{
-	struct pack_window *w_curs = NULL;
-	unsigned long size;
-	off_t curpos = obj_offset;
-	enum object_type type;
-
-	/*
-	 * We always get the representation type, but only convert it to
-	 * a "real" type later if the caller is interested.
-	 */
-	if (oi->contentp) {
-		*oi->contentp = cache_or_unpack_entry(p, obj_offset, oi->sizep,
-						      &type);
-		if (!*oi->contentp)
-			type = OBJ_BAD;
-	} else {
-		type = unpack_object_header(p, &w_curs, &curpos, &size);
-	}
-
-	if (!oi->contentp && oi->sizep) {
-		if (type == OBJ_OFS_DELTA || type == OBJ_REF_DELTA) {
-			off_t tmp_pos = curpos;
-			off_t base_offset = get_delta_base(p, &w_curs, &tmp_pos,
-							   type, obj_offset);
-			if (!base_offset) {
-				type = OBJ_BAD;
-				goto out;
-			}
-			*oi->sizep = get_size_from_delta(p, &w_curs, tmp_pos);
-			if (*oi->sizep == 0) {
-				type = OBJ_BAD;
-				goto out;
-			}
-		} else {
-			*oi->sizep = size;
-		}
-	}
-
-	if (oi->disk_sizep) {
-		struct revindex_entry *revidx = find_pack_revindex(p, obj_offset);
-		*oi->disk_sizep = revidx[1].offset - obj_offset;
-	}
-
-	if (oi->typep || oi->typename) {
-		enum object_type ptot;
-		ptot = packed_to_object_type(p, obj_offset, type, &w_curs,
-					     curpos);
-		if (oi->typep)
-			*oi->typep = ptot;
-		if (oi->typename) {
-			const char *tn = typename(ptot);
-			if (tn)
-				strbuf_addstr(oi->typename, tn);
-		}
-		if (ptot < 0) {
-			type = OBJ_BAD;
-			goto out;
-		}
-	}
-
-	if (oi->delta_base_sha1) {
-		if (type == OBJ_OFS_DELTA || type == OBJ_REF_DELTA) {
-			const unsigned char *base;
-
-			base = get_delta_base_sha1(p, &w_curs, curpos,
-						   type, obj_offset);
-			if (!base) {
-				type = OBJ_BAD;
-				goto out;
-			}
-
-			hashcpy(oi->delta_base_sha1, base);
-		} else
-			hashclr(oi->delta_base_sha1);
-	}
-
-	oi->whence = in_delta_base_cache(p, obj_offset) ? OI_DBCACHED :
-							  OI_PACKED;
-
-out:
-	unuse_pack(&w_curs);
-	return type;
-}
-
-static void *unpack_compressed_entry(struct packed_git *p,
-				    struct pack_window **w_curs,
-				    off_t curpos,
-				    unsigned long size)
-{
-	int st;
-	git_zstream stream;
-	unsigned char *buffer, *in;
-
-	buffer = xmallocz_gently(size);
-	if (!buffer)
-		return NULL;
-	memset(&stream, 0, sizeof(stream));
-	stream.next_out = buffer;
-	stream.avail_out = size + 1;
-
-	git_inflate_init(&stream);
-	do {
-		in = use_pack(p, w_curs, curpos, &stream.avail_in);
-		stream.next_in = in;
-		st = git_inflate(&stream, Z_FINISH);
-		if (!stream.avail_out)
-			break; /* the payload is larger than it should be */
-		curpos += stream.next_in - in;
-	} while (st == Z_OK || st == Z_BUF_ERROR);
-	git_inflate_end(&stream);
-	if ((st != Z_STREAM_END) || stream.total_out != size) {
-		free(buffer);
-		return NULL;
-	}
-
-	return buffer;
-}
-
-static void *read_object(const unsigned char *sha1, enum object_type *type,
-			 unsigned long *size);
-
-static void write_pack_access_log(struct packed_git *p, off_t obj_offset)
-{
-	static struct trace_key pack_access = TRACE_KEY_INIT(PACK_ACCESS);
-	trace_printf_key(&pack_access, "%s %"PRIuMAX"\n",
-			 p->pack_name, (uintmax_t)obj_offset);
-}
-
-int do_check_packed_object_crc;
-
-#define UNPACK_ENTRY_STACK_PREALLOC 64
-struct unpack_entry_stack_ent {
-	off_t obj_offset;
-	off_t curpos;
-	unsigned long size;
-};
-
-void *unpack_entry(struct packed_git *p, off_t obj_offset,
-		   enum object_type *final_type, unsigned long *final_size)
-{
-	struct pack_window *w_curs = NULL;
-	off_t curpos = obj_offset;
-	void *data = NULL;
-	unsigned long size;
-	enum object_type type;
-	struct unpack_entry_stack_ent small_delta_stack[UNPACK_ENTRY_STACK_PREALLOC];
-	struct unpack_entry_stack_ent *delta_stack = small_delta_stack;
-	int delta_stack_nr = 0, delta_stack_alloc = UNPACK_ENTRY_STACK_PREALLOC;
-	int base_from_cache = 0;
-
-	write_pack_access_log(p, obj_offset);
-
-	/* PHASE 1: drill down to the innermost base object */
-	for (;;) {
-		off_t base_offset;
-		int i;
-		struct delta_base_cache_entry *ent;
-
-		ent = get_delta_base_cache_entry(p, curpos);
-		if (ent) {
-			type = ent->type;
-			data = ent->data;
-			size = ent->size;
-			detach_delta_base_cache_entry(ent);
-			base_from_cache = 1;
-			break;
-		}
-
-		if (do_check_packed_object_crc && p->index_version > 1) {
-			struct revindex_entry *revidx = find_pack_revindex(p, obj_offset);
-			off_t len = revidx[1].offset - obj_offset;
-			if (check_pack_crc(p, &w_curs, obj_offset, len, revidx->nr)) {
-				const unsigned char *sha1 =
-					nth_packed_object_sha1(p, revidx->nr);
-				error("bad packed object CRC for %s",
-				      sha1_to_hex(sha1));
-				mark_bad_packed_object(p, sha1);
-				unuse_pack(&w_curs);
-				return NULL;
-			}
-		}
-
-		type = unpack_object_header(p, &w_curs, &curpos, &size);
-		if (type != OBJ_OFS_DELTA && type != OBJ_REF_DELTA)
-			break;
-
-		base_offset = get_delta_base(p, &w_curs, &curpos, type, obj_offset);
-		if (!base_offset) {
-			error("failed to validate delta base reference "
-			      "at offset %"PRIuMAX" from %s",
-			      (uintmax_t)curpos, p->pack_name);
-			/* bail to phase 2, in hopes of recovery */
-			data = NULL;
-			break;
-		}
-
-		/* push object, proceed to base */
-		if (delta_stack_nr >= delta_stack_alloc
-		    && delta_stack == small_delta_stack) {
-			delta_stack_alloc = alloc_nr(delta_stack_nr);
-			ALLOC_ARRAY(delta_stack, delta_stack_alloc);
-			memcpy(delta_stack, small_delta_stack,
-			       sizeof(*delta_stack)*delta_stack_nr);
-		} else {
-			ALLOC_GROW(delta_stack, delta_stack_nr+1, delta_stack_alloc);
-		}
-		i = delta_stack_nr++;
-		delta_stack[i].obj_offset = obj_offset;
-		delta_stack[i].curpos = curpos;
-		delta_stack[i].size = size;
-
-		curpos = obj_offset = base_offset;
-	}
-
-	/* PHASE 2: handle the base */
-	switch (type) {
-	case OBJ_OFS_DELTA:
-	case OBJ_REF_DELTA:
-		if (data)
-			die("BUG: unpack_entry: left loop at a valid delta");
-		break;
-	case OBJ_COMMIT:
-	case OBJ_TREE:
-	case OBJ_BLOB:
-	case OBJ_TAG:
-		if (!base_from_cache)
-			data = unpack_compressed_entry(p, &w_curs, curpos, size);
-		break;
-	default:
-		data = NULL;
-		error("unknown object type %i at offset %"PRIuMAX" in %s",
-		      type, (uintmax_t)obj_offset, p->pack_name);
-	}
-
-	/* PHASE 3: apply deltas in order */
-
-	/* invariants:
-	 *   'data' holds the base data, or NULL if there was corruption
-	 */
-	while (delta_stack_nr) {
-		void *delta_data;
-		void *base = data;
-		void *external_base = NULL;
-		unsigned long delta_size, base_size = size;
-		int i;
-
-		data = NULL;
-
-		if (base)
-			add_delta_base_cache(p, obj_offset, base, base_size, type);
-
-		if (!base) {
-			/*
-			 * We're probably in deep shit, but let's try to fetch
-			 * the required base anyway from another pack or loose.
-			 * This is costly but should happen only in the presence
-			 * of a corrupted pack, and is better than failing outright.
-			 */
-			struct revindex_entry *revidx;
-			const unsigned char *base_sha1;
-			revidx = find_pack_revindex(p, obj_offset);
-			if (revidx) {
-				base_sha1 = nth_packed_object_sha1(p, revidx->nr);
-				error("failed to read delta base object %s"
-				      " at offset %"PRIuMAX" from %s",
-				      sha1_to_hex(base_sha1), (uintmax_t)obj_offset,
-				      p->pack_name);
-				mark_bad_packed_object(p, base_sha1);
-				base = read_object(base_sha1, &type, &base_size);
-				external_base = base;
-			}
-		}
-
-		i = --delta_stack_nr;
-		obj_offset = delta_stack[i].obj_offset;
-		curpos = delta_stack[i].curpos;
-		delta_size = delta_stack[i].size;
-
-		if (!base)
-			continue;
-
-		delta_data = unpack_compressed_entry(p, &w_curs, curpos, delta_size);
-
-		if (!delta_data) {
-			error("failed to unpack compressed delta "
-			      "at offset %"PRIuMAX" from %s",
-			      (uintmax_t)curpos, p->pack_name);
-			data = NULL;
-			free(external_base);
-			continue;
-		}
-
-		data = patch_delta(base, base_size,
-				   delta_data, delta_size,
-				   &size);
-
-		/*
-		 * We could not apply the delta; warn the user, but keep going.
-		 * Our failure will be noticed either in the next iteration of
-		 * the loop, or if this is the final delta, in the caller when
-		 * we return NULL. Those code paths will take care of making
-		 * a more explicit warning and retrying with another copy of
-		 * the object.
-		 */
-		if (!data)
-			error("failed to apply delta");
-
-		free(delta_data);
-		free(external_base);
-	}
-
-	if (final_type)
-		*final_type = type;
-	if (final_size)
-		*final_size = size;
-
-	unuse_pack(&w_curs);
-
-	if (delta_stack != small_delta_stack)
-		free(delta_stack);
-
-	return data;
-}
-
 const unsigned char *nth_packed_object_sha1(struct packed_git *p,
 					    uint32_t n)
 {
@@ -2091,6 +1443,20 @@ int sha1_object_info(const unsigned char *sha1, unsigned long *sizep)
 	return type;
 }
 
+static void *read_object(const unsigned char *sha1, enum object_type *type,
+			 unsigned long *size)
+{
+	struct object_info oi = OBJECT_INFO_INIT;
+	void *content;
+	oi.typep = type;
+	oi.sizep = size;
+	oi.contentp = &content;
+
+	if (sha1_object_info_extended(sha1, &oi, 0) < 0)
+		return NULL;
+	return content;
+}
+
 int pretend_sha1_file(void *buf, unsigned long len, enum object_type type,
 		      unsigned char *sha1)
 {
@@ -2109,20 +1475,6 @@ int pretend_sha1_file(void *buf, unsigned long len, enum object_type type,
 	return 0;
 }
 
-static void *read_object(const unsigned char *sha1, enum object_type *type,
-			 unsigned long *size)
-{
-	struct object_info oi = OBJECT_INFO_INIT;
-	void *content;
-	oi.typep = type;
-	oi.sizep = size;
-	oi.contentp = &content;
-
-	if (sha1_object_info_extended(sha1, &oi, 0) < 0)
-		return NULL;
-	return content;
-}
-
 /*
  * This function dies on corrupt objects; the callers who want to
  * deal with them should arrange to call read_object() and give error
-- 
2.14.0.434.g98096fd7a8-goog



* [PATCH v2 18/25] pack: move nth_packed_object_{sha1,oid}
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (28 preceding siblings ...)
  2017-08-09  1:22 ` [PATCH v2 17/25] pack: move packed_object_info(), unpack_entry() Jonathan Tan
@ 2017-08-09  1:22 ` Jonathan Tan
  2017-08-09  1:22 ` [PATCH v2 19/25] pack: move check_pack_index_ptr(), nth_packed_object_offset() Jonathan Tan
                   ` (30 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-09  1:22 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, sbeller

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
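Reader's note (below the three-dash line, so not part of the commit
message): nth_packed_object_sha1() relies on fixed arithmetic over the
mmapped .idx image. The sketch below restates that arithmetic on a plain
byte buffer. toy_nth_name() and its parameters are hypothetical names made
up for illustration; they are not part of git, and the sketch assumes
20-byte SHA-1 names, as the moved code does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FANOUT_BYTES (4 * 256)	/* 256 big-endian uint32 counters */
#define HASH_BYTES 20		/* SHA-1 */

/*
 * Hypothetical helper mirroring the pointer arithmetic in
 * nth_packed_object_sha1(): return a pointer to the nth object name
 * inside an in-memory .idx image, or NULL if n is out of range.
 */
static const unsigned char *toy_nth_name(const unsigned char *idx,
					 int version, uint32_t nr, uint32_t n)
{
	assert(version == 1 || version == 2);
	if (n >= nr)
		return NULL;
	if (version == 1)
		/* v1: fanout, then nr records of (4-byte offset, 20-byte name) */
		return idx + FANOUT_BYTES + 24 * (size_t)n + 4;
	/* v2: fanout plus 8 header bytes, then a dense table of 20-byte names */
	return idx + FANOUT_BYTES + 8 + HASH_BYTES * (size_t)n;
}
```

Unlike the real function, the sketch takes the version and object count as
arguments instead of reading them from struct packed_git, and it never
opens the index itself.
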
 cache.h     | 14 --------------
 pack.h      | 14 ++++++++++++++
 packfile.c  | 31 +++++++++++++++++++++++++++++++
 sha1_file.c | 31 -------------------------------
 4 files changed, 45 insertions(+), 45 deletions(-)

diff --git a/cache.h b/cache.h
index b14098bf1..f083d532e 100644
--- a/cache.h
+++ b/cache.h
@@ -1628,20 +1628,6 @@ extern int odb_pack_keep(const char *name);
  */
 extern void check_pack_index_ptr(const struct packed_git *p, const void *ptr);
 
-/*
- * Return the SHA-1 of the nth object within the specified packfile.
- * Open the index if it is not already open.  The return value points
- * at the SHA-1 within the mmapped index.  Return NULL if there is an
- * error.
- */
-extern const unsigned char *nth_packed_object_sha1(struct packed_git *, uint32_t n);
-/*
- * Like nth_packed_object_sha1, but write the data into the object specified by
- * the the first argument.  Returns the first argument on success, and NULL on
- * error.
- */
-extern const struct object_id *nth_packed_object_oid(struct object_id *, struct packed_git *, uint32_t n);
-
 /*
  * Return the offset of the nth object within the specified packfile.
  * The index must already be opened.
diff --git a/pack.h b/pack.h
index 2e6f357c3..023c97b37 100644
--- a/pack.h
+++ b/pack.h
@@ -182,4 +182,18 @@ extern void *unpack_entry(struct packed_git *, off_t, enum object_type *, unsign
 extern void mark_bad_packed_object(struct packed_git *p, const unsigned char *sha1);
 extern const struct packed_git *has_packed_and_bad(const unsigned char *sha1);
 
+/*
+ * Return the SHA-1 of the nth object within the specified packfile.
+ * Open the index if it is not already open.  The return value points
+ * at the SHA-1 within the mmapped index.  Return NULL if there is an
+ * error.
+ */
+extern const unsigned char *nth_packed_object_sha1(struct packed_git *, uint32_t n);
+/*
+ * Like nth_packed_object_sha1, but write the data into the object specified by
+ * the first argument.  Returns the first argument on success, and NULL on
+ * error.
+ */
+extern const struct object_id *nth_packed_object_oid(struct object_id *, struct packed_git *, uint32_t n);
+
 #endif
diff --git a/packfile.c b/packfile.c
index a3745f9df..b16cf648a 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1634,3 +1634,34 @@ void *unpack_entry(struct packed_git *p, off_t obj_offset,
 
 	return data;
 }
+
+const unsigned char *nth_packed_object_sha1(struct packed_git *p,
+					    uint32_t n)
+{
+	const unsigned char *index = p->index_data;
+	if (!index) {
+		if (open_pack_index(p))
+			return NULL;
+		index = p->index_data;
+	}
+	if (n >= p->num_objects)
+		return NULL;
+	index += 4 * 256;
+	if (p->index_version == 1) {
+		return index + 24 * n + 4;
+	} else {
+		index += 8;
+		return index + 20 * n;
+	}
+}
+
+const struct object_id *nth_packed_object_oid(struct object_id *oid,
+					      struct packed_git *p,
+					      uint32_t n)
+{
+	const unsigned char *hash = nth_packed_object_sha1(p, n);
+	if (!hash)
+		return NULL;
+	hashcpy(oid->hash, hash);
+	return oid;
+}
diff --git a/sha1_file.c b/sha1_file.c
index fe7e0db76..4cd2b1809 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1073,37 +1073,6 @@ int parse_sha1_header(const char *hdr, unsigned long *sizep)
 	return parse_sha1_header_extended(hdr, &oi, 0);
 }
 
-const unsigned char *nth_packed_object_sha1(struct packed_git *p,
-					    uint32_t n)
-{
-	const unsigned char *index = p->index_data;
-	if (!index) {
-		if (open_pack_index(p))
-			return NULL;
-		index = p->index_data;
-	}
-	if (n >= p->num_objects)
-		return NULL;
-	index += 4 * 256;
-	if (p->index_version == 1) {
-		return index + 24 * n + 4;
-	} else {
-		index += 8;
-		return index + 20 * n;
-	}
-}
-
-const struct object_id *nth_packed_object_oid(struct object_id *oid,
-					      struct packed_git *p,
-					      uint32_t n)
-{
-	const unsigned char *hash = nth_packed_object_sha1(p, n);
-	if (!hash)
-		return NULL;
-	hashcpy(oid->hash, hash);
-	return oid;
-}
-
 void check_pack_index_ptr(const struct packed_git *p, const void *vptr)
 {
 	const unsigned char *ptr = vptr;
-- 
2.14.0.434.g98096fd7a8-goog



* [PATCH v2 19/25] pack: move check_pack_index_ptr(), nth_packed_object_offset()
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (29 preceding siblings ...)
  2017-08-09  1:22 ` [PATCH v2 18/25] pack: move nth_packed_object_{sha1,oid} Jonathan Tan
@ 2017-08-09  1:22 ` Jonathan Tan
  2017-08-09  1:22 ` [PATCH v2 20/25] pack: move find_pack_entry_one(), is_pack_valid() Jonathan Tan
                   ` (29 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-09  1:22 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, sbeller

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
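Reader's note (below the three-dash line, so not part of the commit
message): the v2 branch of nth_packed_object_offset() decodes a 32-bit
table entry whose top bit, when set, redirects to a 64-bit table that
follows it. The sketch below walks the same layout on a plain byte buffer.
be32() and toy_nth_offset_v2() are hypothetical names, not git functions;
the sketch reads bytes one at a time instead of using ntohl() on a cast
pointer, and it omits the check_pack_index_ptr() bounds check that the
real code performs before the 64-bit read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a big-endian 32-bit value byte by byte (no alignment assumptions). */
static uint32_t be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Hypothetical v2-only counterpart of nth_packed_object_offset(). */
static uint64_t toy_nth_offset_v2(const unsigned char *idx,
				  uint32_t nr, uint32_t n)
{
	const unsigned char *p = idx + 4 * 256;	/* skip the fanout table */
	uint32_t off;

	assert(n < nr);
	/* skip 8 header bytes, nr 20-byte names and nr 4-byte CRC32s */
	p += 8 + (size_t)nr * (20 + 4);
	off = be32(p + 4 * (size_t)n);
	if (!(off & 0x80000000u))
		return off;	/* small offset stored inline */
	/* top bit set: low 31 bits index the 64-bit offset table */
	p += (size_t)nr * 4 + (size_t)(off & 0x7fffffffu) * 8;
	return ((uint64_t)be32(p) << 32) | be32(p + 4);
}
```

For a two-object index the 32-bit table starts at byte 1080 of the image
(1024 + 8 + 2 * 24) and the 64-bit table at byte 1088.
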
 cache.h     | 16 ----------------
 pack.h      | 16 ++++++++++++++++
 packfile.c  | 33 +++++++++++++++++++++++++++++++++
 sha1_file.c | 33 ---------------------------------
 4 files changed, 49 insertions(+), 49 deletions(-)

diff --git a/cache.h b/cache.h
index f083d532e..7686ccb30 100644
--- a/cache.h
+++ b/cache.h
@@ -1618,22 +1618,6 @@ extern int odb_mkstemp(struct strbuf *template, const char *pattern);
  */
 extern int odb_pack_keep(const char *name);
 
-/*
- * Make sure that a pointer access into an mmap'd index file is within bounds,
- * and can provide at least 8 bytes of data.
- *
- * Note that this is only necessary for variable-length segments of the file
- * (like the 64-bit extended offset table), as we compare the size to the
- * fixed-length parts when we open the file.
- */
-extern void check_pack_index_ptr(const struct packed_git *p, const void *ptr);
-
-/*
- * Return the offset of the nth object within the specified packfile.
- * The index must already be opened.
- */
-extern off_t nth_packed_object_offset(const struct packed_git *, uint32_t n);
-
 /*
  * If the object named sha1 is present in the specified packfile,
  * return its offset within the packfile; otherwise, return 0.
diff --git a/pack.h b/pack.h
index 023c97b37..e0e206e3c 100644
--- a/pack.h
+++ b/pack.h
@@ -196,4 +196,20 @@ extern const unsigned char *nth_packed_object_sha1(struct packed_git *, uint32_t
  */
 extern const struct object_id *nth_packed_object_oid(struct object_id *, struct packed_git *, uint32_t n);
 
+/*
+ * Make sure that a pointer access into an mmap'd index file is within bounds,
+ * and can provide at least 8 bytes of data.
+ *
+ * Note that this is only necessary for variable-length segments of the file
+ * (like the 64-bit extended offset table), as we compare the size to the
+ * fixed-length parts when we open the file.
+ */
+extern void check_pack_index_ptr(const struct packed_git *p, const void *ptr);
+
+/*
+ * Return the offset of the nth object within the specified packfile.
+ * The index must already be opened.
+ */
+extern off_t nth_packed_object_offset(const struct packed_git *, uint32_t n);
+
 #endif
diff --git a/packfile.c b/packfile.c
index b16cf648a..94c8af991 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1665,3 +1665,36 @@ const struct object_id *nth_packed_object_oid(struct object_id *oid,
 	hashcpy(oid->hash, hash);
 	return oid;
 }
+
+void check_pack_index_ptr(const struct packed_git *p, const void *vptr)
+{
+	const unsigned char *ptr = vptr;
+	const unsigned char *start = p->index_data;
+	const unsigned char *end = start + p->index_size;
+	if (ptr < start)
+		die(_("offset before start of pack index for %s (corrupt index?)"),
+		    p->pack_name);
+	/* No need to check for underflow; .idx files must be at least 8 bytes */
+	if (ptr >= end - 8)
+		die(_("offset beyond end of pack index for %s (truncated index?)"),
+		    p->pack_name);
+}
+
+off_t nth_packed_object_offset(const struct packed_git *p, uint32_t n)
+{
+	const unsigned char *index = p->index_data;
+	index += 4 * 256;
+	if (p->index_version == 1) {
+		return ntohl(*((uint32_t *)(index + 24 * n)));
+	} else {
+		uint32_t off;
+		index += 8 + p->num_objects * (20 + 4);
+		off = ntohl(*((uint32_t *)(index + 4 * n)));
+		if (!(off & 0x80000000))
+			return off;
+		index += p->num_objects * 4 + (off & 0x7fffffff) * 8;
+		check_pack_index_ptr(p, index);
+		return (((uint64_t)ntohl(*((uint32_t *)(index + 0)))) << 32) |
+				   ntohl(*((uint32_t *)(index + 4)));
+	}
+}
diff --git a/sha1_file.c b/sha1_file.c
index 4cd2b1809..0f4d68c5a 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1073,39 +1073,6 @@ int parse_sha1_header(const char *hdr, unsigned long *sizep)
 	return parse_sha1_header_extended(hdr, &oi, 0);
 }
 
-void check_pack_index_ptr(const struct packed_git *p, const void *vptr)
-{
-	const unsigned char *ptr = vptr;
-	const unsigned char *start = p->index_data;
-	const unsigned char *end = start + p->index_size;
-	if (ptr < start)
-		die(_("offset before start of pack index for %s (corrupt index?)"),
-		    p->pack_name);
-	/* No need to check for underflow; .idx files must be at least 8 bytes */
-	if (ptr >= end - 8)
-		die(_("offset beyond end of pack index for %s (truncated index?)"),
-		    p->pack_name);
-}
-
-off_t nth_packed_object_offset(const struct packed_git *p, uint32_t n)
-{
-	const unsigned char *index = p->index_data;
-	index += 4 * 256;
-	if (p->index_version == 1) {
-		return ntohl(*((uint32_t *)(index + 24 * n)));
-	} else {
-		uint32_t off;
-		index += 8 + p->num_objects * (20 + 4);
-		off = ntohl(*((uint32_t *)(index + 4 * n)));
-		if (!(off & 0x80000000))
-			return off;
-		index += p->num_objects * 4 + (off & 0x7fffffff) * 8;
-		check_pack_index_ptr(p, index);
-		return (((uint64_t)ntohl(*((uint32_t *)(index + 0)))) << 32) |
-				   ntohl(*((uint32_t *)(index + 4)));
-	}
-}
-
 off_t find_pack_entry_one(const unsigned char *sha1,
 				  struct packed_git *p)
 {
-- 
2.14.0.434.g98096fd7a8-goog



* [PATCH v2 20/25] pack: move find_pack_entry_one(), is_pack_valid()
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (30 preceding siblings ...)
  2017-08-09  1:22 ` [PATCH v2 19/25] pack: move check_pack_index_ptr(), nth_packed_object_offset() Jonathan Tan
@ 2017-08-09  1:22 ` Jonathan Tan
  2017-08-09  1:22 ` [PATCH v2 21/25] pack: move find_sha1_pack() Jonathan Tan
                   ` (28 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-09  1:22 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, sbeller

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
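Reader's note (below the three-dash line, so not part of the commit
message): find_pack_entry_one() uses the first byte of the object name to
pick a [lo, hi) range out of the fanout table, then binary-searches the
sorted name table within that range. The sketch below shows that idea in
isolation. toy_find_pos() is a hypothetical name; it assumes a fanout
array already converted to host byte order and a dense table of 20-byte
names (the v2 stride), and it returns a position or -1, whereas the real
function returns the pack offset or 0. It also uses a plain while loop
rather than the do/while in the moved code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HASH_BYTES 20

/*
 * fanout[b] holds the number of names whose first byte is <= b, so the
 * names starting with byte b occupy positions [fanout[b - 1], fanout[b]).
 */
static int toy_find_pos(const uint32_t fanout[256],
			const unsigned char *names,
			const unsigned char *key)
{
	uint32_t hi = fanout[key[0]];
	uint32_t lo = key[0] ? fanout[key[0] - 1] : 0;

	while (lo < hi) {
		uint32_t mi = lo + (hi - lo) / 2;
		int cmp = memcmp(names + (size_t)mi * HASH_BYTES,
				 key, HASH_BYTES);

		if (!cmp)
			return (int)mi;
		if (cmp > 0)
			hi = mi;	/* entry at mi sorts after key */
		else
			lo = mi + 1;	/* entry at mi sorts before key */
	}
	return -1;
}
```

The fanout step narrows the search to the names sharing the key's first
byte before any name comparison happens.
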
 cache.h     |  8 ------
 pack.h      | 10 ++++++--
 packfile.c  | 85 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 sha1_file.c | 84 ------------------------------------------------------------
 4 files changed, 93 insertions(+), 94 deletions(-)

diff --git a/cache.h b/cache.h
index 7686ccb30..b944aca69 100644
--- a/cache.h
+++ b/cache.h
@@ -1618,14 +1618,6 @@ extern int odb_mkstemp(struct strbuf *template, const char *pattern);
  */
 extern int odb_pack_keep(const char *name);
 
-/*
- * If the object named sha1 is present in the specified packfile,
- * return its offset within the packfile; otherwise, return 0.
- */
-extern off_t find_pack_entry_one(const unsigned char *sha1, struct packed_git *);
-
-extern int is_pack_valid(struct packed_git *);
-
 /*
  * Iterate over the files in the loose-object parts of the object
  * directory "path", triggering the following callbacks:
diff --git a/pack.h b/pack.h
index e0e206e3c..f5bd94813 100644
--- a/pack.h
+++ b/pack.h
@@ -144,8 +144,6 @@ extern void close_pack_windows(struct packed_git *);
 extern void close_pack_index(struct packed_git *);
 extern void close_all_packs(void);
 
-extern int open_packed_git(struct packed_git *p);
-
 extern unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t, unsigned long *);
 extern void unuse_pack(struct pack_window **);
 extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local);
@@ -212,4 +210,12 @@ extern void check_pack_index_ptr(const struct packed_git *p, const void *ptr);
  */
 extern off_t nth_packed_object_offset(const struct packed_git *, uint32_t n);
 
+/*
+ * If the object named sha1 is present in the specified packfile,
+ * return its offset within the packfile; otherwise, return 0.
+ */
+extern off_t find_pack_entry_one(const unsigned char *sha1, struct packed_git *);
+
+extern int is_pack_valid(struct packed_git *);
+
 #endif
diff --git a/packfile.c b/packfile.c
index 94c8af991..71017d2ec 100644
--- a/packfile.c
+++ b/packfile.c
@@ -6,6 +6,7 @@
 #include "delta.h"
 #include "list.h"
 #include "streaming.h"
+#include "sha1-lookup.h"
 
 char *odb_pack_name(struct strbuf *buf,
 		    const unsigned char *sha1,
@@ -1698,3 +1699,87 @@ off_t nth_packed_object_offset(const struct packed_git *p, uint32_t n)
 				   ntohl(*((uint32_t *)(index + 4)));
 	}
 }
+
+off_t find_pack_entry_one(const unsigned char *sha1,
+				  struct packed_git *p)
+{
+	const uint32_t *level1_ofs = p->index_data;
+	const unsigned char *index = p->index_data;
+	unsigned hi, lo, stride;
+	static int use_lookup = -1;
+	static int debug_lookup = -1;
+
+	if (debug_lookup < 0)
+		debug_lookup = !!getenv("GIT_DEBUG_LOOKUP");
+
+	if (!index) {
+		if (open_pack_index(p))
+			return 0;
+		level1_ofs = p->index_data;
+		index = p->index_data;
+	}
+	if (p->index_version > 1) {
+		level1_ofs += 2;
+		index += 8;
+	}
+	index += 4 * 256;
+	hi = ntohl(level1_ofs[*sha1]);
+	lo = ((*sha1 == 0x0) ? 0 : ntohl(level1_ofs[*sha1 - 1]));
+	if (p->index_version > 1) {
+		stride = 20;
+	} else {
+		stride = 24;
+		index += 4;
+	}
+
+	if (debug_lookup)
+		printf("%02x%02x%02x... lo %u hi %u nr %"PRIu32"\n",
+		       sha1[0], sha1[1], sha1[2], lo, hi, p->num_objects);
+
+	if (use_lookup < 0)
+		use_lookup = !!getenv("GIT_USE_LOOKUP");
+	if (use_lookup) {
+		int pos = sha1_entry_pos(index, stride, 0,
+					 lo, hi, p->num_objects, sha1);
+		if (pos < 0)
+			return 0;
+		return nth_packed_object_offset(p, pos);
+	}
+
+	do {
+		unsigned mi = (lo + hi) / 2;
+		int cmp = hashcmp(index + mi * stride, sha1);
+
+		if (debug_lookup)
+			printf("lo %u hi %u rg %u mi %u\n",
+			       lo, hi, hi - lo, mi);
+		if (!cmp)
+			return nth_packed_object_offset(p, mi);
+		if (cmp > 0)
+			hi = mi;
+		else
+			lo = mi+1;
+	} while (lo < hi);
+	return 0;
+}
+
+int is_pack_valid(struct packed_git *p)
+{
+	/* An already open pack is known to be valid. */
+	if (p->pack_fd != -1)
+		return 1;
+
+	/* If the pack has one window completely covering the
+	 * file size, the pack is known to be valid even if
+	 * the descriptor is not currently open.
+	 */
+	if (p->windows) {
+		struct pack_window *w = p->windows;
+
+		if (!w->offset && w->len == p->pack_size)
+			return 1;
+	}
+
+	/* Force the pack to open to prove it's valid. */
+	return !open_packed_git(p);
+}
diff --git a/sha1_file.c b/sha1_file.c
index 0f4d68c5a..75b9ceb39 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1073,90 +1073,6 @@ int parse_sha1_header(const char *hdr, unsigned long *sizep)
 	return parse_sha1_header_extended(hdr, &oi, 0);
 }
 
-off_t find_pack_entry_one(const unsigned char *sha1,
-				  struct packed_git *p)
-{
-	const uint32_t *level1_ofs = p->index_data;
-	const unsigned char *index = p->index_data;
-	unsigned hi, lo, stride;
-	static int use_lookup = -1;
-	static int debug_lookup = -1;
-
-	if (debug_lookup < 0)
-		debug_lookup = !!getenv("GIT_DEBUG_LOOKUP");
-
-	if (!index) {
-		if (open_pack_index(p))
-			return 0;
-		level1_ofs = p->index_data;
-		index = p->index_data;
-	}
-	if (p->index_version > 1) {
-		level1_ofs += 2;
-		index += 8;
-	}
-	index += 4 * 256;
-	hi = ntohl(level1_ofs[*sha1]);
-	lo = ((*sha1 == 0x0) ? 0 : ntohl(level1_ofs[*sha1 - 1]));
-	if (p->index_version > 1) {
-		stride = 20;
-	} else {
-		stride = 24;
-		index += 4;
-	}
-
-	if (debug_lookup)
-		printf("%02x%02x%02x... lo %u hi %u nr %"PRIu32"\n",
-		       sha1[0], sha1[1], sha1[2], lo, hi, p->num_objects);
-
-	if (use_lookup < 0)
-		use_lookup = !!getenv("GIT_USE_LOOKUP");
-	if (use_lookup) {
-		int pos = sha1_entry_pos(index, stride, 0,
-					 lo, hi, p->num_objects, sha1);
-		if (pos < 0)
-			return 0;
-		return nth_packed_object_offset(p, pos);
-	}
-
-	do {
-		unsigned mi = (lo + hi) / 2;
-		int cmp = hashcmp(index + mi * stride, sha1);
-
-		if (debug_lookup)
-			printf("lo %u hi %u rg %u mi %u\n",
-			       lo, hi, hi - lo, mi);
-		if (!cmp)
-			return nth_packed_object_offset(p, mi);
-		if (cmp > 0)
-			hi = mi;
-		else
-			lo = mi+1;
-	} while (lo < hi);
-	return 0;
-}
-
-int is_pack_valid(struct packed_git *p)
-{
-	/* An already open pack is known to be valid. */
-	if (p->pack_fd != -1)
-		return 1;
-
-	/* If the pack has one window completely covering the
-	 * file size, the pack is known to be valid even if
-	 * the descriptor is not currently open.
-	 */
-	if (p->windows) {
-		struct pack_window *w = p->windows;
-
-		if (!w->offset && w->len == p->pack_size)
-			return 1;
-	}
-
-	/* Force the pack to open to prove its valid. */
-	return !open_packed_git(p);
-}
-
 static int fill_pack_entry(const unsigned char *sha1,
 			   struct pack_entry *e,
 			   struct packed_git *p)
-- 
2.14.0.434.g98096fd7a8-goog



* [PATCH v2 21/25] pack: move find_sha1_pack()
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (31 preceding siblings ...)
  2017-08-09  1:22 ` [PATCH v2 20/25] pack: move find_pack_entry_one(), is_pack_valid() Jonathan Tan
@ 2017-08-09  1:22 ` Jonathan Tan
  2017-08-09  1:22 ` [PATCH v2 22/25] pack: move find_pack_entry() and make it global Jonathan Tan
                   ` (27 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-09  1:22 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, sbeller

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 cache.h       |  3 ---
 http-push.c   |  1 +
 http-walker.c |  1 +
 pack.h        |  3 +++
 packfile.c    | 13 +++++++++++++
 sha1_file.c   | 13 -------------
 6 files changed, 18 insertions(+), 16 deletions(-)

diff --git a/cache.h b/cache.h
index b944aca69..06a8caae6 100644
--- a/cache.h
+++ b/cache.h
@@ -1600,9 +1600,6 @@ struct pack_entry {
 	struct packed_git *p;
 };
 
-extern struct packed_git *find_sha1_pack(const unsigned char *sha1,
-					 struct packed_git *packs);
-
 /*
  * Create a temporary file rooted in the object database directory, or
  * die on failure. The filename is taken from "pattern", which should have the
diff --git a/http-push.c b/http-push.c
index c91f40a61..4e8a227d1 100644
--- a/http-push.c
+++ b/http-push.c
@@ -11,6 +11,7 @@
 #include "list-objects.h"
 #include "sigchain.h"
 #include "argv-array.h"
+#include "pack.h"
 
 #ifdef EXPAT_NEEDS_XMLPARSE_H
 #include <xmlparse.h>
diff --git a/http-walker.c b/http-walker.c
index ee049cb13..d6f0af944 100644
--- a/http-walker.c
+++ b/http-walker.c
@@ -4,6 +4,7 @@
 #include "http.h"
 #include "list.h"
 #include "transport.h"
+#include "pack.h"
 
 struct alt_base {
 	char *base;
diff --git a/pack.h b/pack.h
index f5bd94813..0517d6542 100644
--- a/pack.h
+++ b/pack.h
@@ -218,4 +218,7 @@ extern off_t find_pack_entry_one(const unsigned char *sha1, struct packed_git *)
 
 extern int is_pack_valid(struct packed_git *);
 
+extern struct packed_git *find_sha1_pack(const unsigned char *sha1,
+					 struct packed_git *packs);
+
 #endif
diff --git a/packfile.c b/packfile.c
index 71017d2ec..f16b56262 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1783,3 +1783,16 @@ int is_pack_valid(struct packed_git *p)
 	/* Force the pack to open to prove its valid. */
 	return !open_packed_git(p);
 }
+
+struct packed_git *find_sha1_pack(const unsigned char *sha1,
+				  struct packed_git *packs)
+{
+	struct packed_git *p;
+
+	for (p = packs; p; p = p->next) {
+		if (find_pack_entry_one(sha1, p))
+			return p;
+	}
+	return NULL;
+
+}
diff --git a/sha1_file.c b/sha1_file.c
index 75b9ceb39..229358663 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1126,19 +1126,6 @@ static int find_pack_entry(const unsigned char *sha1, struct pack_entry *e)
 	return 0;
 }
 
-struct packed_git *find_sha1_pack(const unsigned char *sha1,
-				  struct packed_git *packs)
-{
-	struct packed_git *p;
-
-	for (p = packs; p; p = p->next) {
-		if (find_pack_entry_one(sha1, p))
-			return p;
-	}
-	return NULL;
-
-}
-
 static int sha1_loose_object_info(const unsigned char *sha1,
 				  struct object_info *oi,
 				  int flags)
-- 
2.14.0.434.g98096fd7a8-goog



* [PATCH v2 22/25] pack: move find_pack_entry() and make it global
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (32 preceding siblings ...)
  2017-08-09  1:22 ` [PATCH v2 21/25] pack: move find_sha1_pack() Jonathan Tan
@ 2017-08-09  1:22 ` Jonathan Tan
  2017-08-09  1:22 ` [PATCH v2 23/25] pack: move has_sha1_pack() Jonathan Tan
                   ` (26 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-09  1:22 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, sbeller

This function needs to be global as it is used by sha1_file.c and will
be used by packfile.c.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 pack.h      |  2 ++
 packfile.c  | 53 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 sha1_file.c | 53 -----------------------------------------------------
 3 files changed, 55 insertions(+), 53 deletions(-)

diff --git a/pack.h b/pack.h
index 0517d6542..1021a781c 100644
--- a/pack.h
+++ b/pack.h
@@ -221,4 +221,6 @@ extern int is_pack_valid(struct packed_git *);
 extern struct packed_git *find_sha1_pack(const unsigned char *sha1,
 					 struct packed_git *packs);
 
+extern int find_pack_entry(const unsigned char *sha1, struct pack_entry *e);
+
 #endif
diff --git a/packfile.c b/packfile.c
index f16b56262..0f1e3338b 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1796,3 +1796,56 @@ struct packed_git *find_sha1_pack(const unsigned char *sha1,
 	return NULL;
 
 }
+
+static int fill_pack_entry(const unsigned char *sha1,
+			   struct pack_entry *e,
+			   struct packed_git *p)
+{
+	off_t offset;
+
+	if (p->num_bad_objects) {
+		unsigned i;
+		for (i = 0; i < p->num_bad_objects; i++)
+			if (!hashcmp(sha1, p->bad_object_sha1 + 20 * i))
+				return 0;
+	}
+
+	offset = find_pack_entry_one(sha1, p);
+	if (!offset)
+		return 0;
+
+	/*
+	 * We are about to tell the caller where they can locate the
+	 * requested object.  We better make sure the packfile is
+	 * still here and can be accessed before supplying that
+	 * answer, as it may have been deleted since the index was
+	 * loaded!
+	 */
+	if (!is_pack_valid(p))
+		return 0;
+	e->offset = offset;
+	e->p = p;
+	hashcpy(e->sha1, sha1);
+	return 1;
+}
+
+/*
+ * Iff a pack file contains the object named by sha1, return true and
+ * store its location to e.
+ */
+int find_pack_entry(const unsigned char *sha1, struct pack_entry *e)
+{
+	struct mru_entry *p;
+
+	prepare_packed_git();
+	if (!packed_git)
+		return 0;
+
+	for (p = packed_git_mru->head; p; p = p->next) {
+		if (fill_pack_entry(sha1, e, p->item)) {
+			mru_mark(packed_git_mru, p);
+			return 1;
+		}
+	}
+	return 0;
+}
diff --git a/sha1_file.c b/sha1_file.c
index 229358663..1a505eae5 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1073,59 +1073,6 @@ int parse_sha1_header(const char *hdr, unsigned long *sizep)
 	return parse_sha1_header_extended(hdr, &oi, 0);
 }
 
-static int fill_pack_entry(const unsigned char *sha1,
-			   struct pack_entry *e,
-			   struct packed_git *p)
-{
-	off_t offset;
-
-	if (p->num_bad_objects) {
-		unsigned i;
-		for (i = 0; i < p->num_bad_objects; i++)
-			if (!hashcmp(sha1, p->bad_object_sha1 + 20 * i))
-				return 0;
-	}
-
-	offset = find_pack_entry_one(sha1, p);
-	if (!offset)
-		return 0;
-
-	/*
-	 * We are about to tell the caller where they can locate the
-	 * requested object.  We better make sure the packfile is
-	 * still here and can be accessed before supplying that
-	 * answer, as it may have been deleted since the index was
-	 * loaded!
-	 */
-	if (!is_pack_valid(p))
-		return 0;
-	e->offset = offset;
-	e->p = p;
-	hashcpy(e->sha1, sha1);
-	return 1;
-}
-
-/*
- * Iff a pack file contains the object named by sha1, return true and
- * store its location to e.
- */
-static int find_pack_entry(const unsigned char *sha1, struct pack_entry *e)
-{
-	struct mru_entry *p;
-
-	prepare_packed_git();
-	if (!packed_git)
-		return 0;
-
-	for (p = packed_git_mru->head; p; p = p->next) {
-		if (fill_pack_entry(sha1, e, p->item)) {
-			mru_mark(packed_git_mru, p);
-			return 1;
-		}
-	}
-	return 0;
-}
-
 static int sha1_loose_object_info(const unsigned char *sha1,
 				  struct object_info *oi,
 				  int flags)
-- 
2.14.0.434.g98096fd7a8-goog
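Editorial sketch, not part of the patch: the find_pack_entry() moved above walks the packs in most-recently-used order and marks the pack that answered, so the next lookup tries it first. The self-contained toy below illustrates that pattern. It assumes that mru_mark() moves the entry to the front of the list; every toy_* name is hypothetical and stands in for git's real structures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; these are not git's real structures. */
struct toy_pack {
	const char *name;
	const char **objects;	/* names of the objects stored in this pack */
	size_t nr;
};

struct toy_mru_entry {
	struct toy_pack *item;
	struct toy_mru_entry *prev, *next;
};

struct toy_mru {
	struct toy_mru_entry *head, *tail;
};

/* Append an entry to the end of the list. */
static void toy_mru_append(struct toy_mru *mru, struct toy_mru_entry *e)
{
	e->next = NULL;
	e->prev = mru->tail;
	if (mru->tail)
		mru->tail->next = e;
	else
		mru->head = e;
	mru->tail = e;
}

/* Move an entry to the front so that the next lookup tries it first. */
static void toy_mru_mark(struct toy_mru *mru, struct toy_mru_entry *e)
{
	if (mru->head == e)
		return;
	/* Unlink; e is not the head, so e->prev is non-NULL. */
	e->prev->next = e->next;
	if (e->next)
		e->next->prev = e->prev;
	else
		mru->tail = e->prev;
	/* Relink at the head. */
	e->prev = NULL;
	e->next = mru->head;
	mru->head->prev = e;
	mru->head = e;
}

/* Linear scan standing in for the per-pack index lookup. */
static int toy_pack_has(const struct toy_pack *p, const char *name)
{
	size_t i;

	for (i = 0; i < p->nr; i++)
		if (!strcmp(p->objects[i], name))
			return 1;
	return 0;
}

/* Return the pack holding "name" and mark it most recently used, or NULL. */
static struct toy_pack *toy_find_pack(struct toy_mru *mru, const char *name)
{
	struct toy_mru_entry *e;

	assert(mru && name);
	for (e = mru->head; e; e = e->next) {
		if (toy_pack_has(e->item, name)) {
			toy_mru_mark(mru, e);
			return e->item;
		}
	}
	return NULL;
}
```

The point of the ordering is locality: consecutive lookups tend to hit the same pack, so moving the hit to the front keeps the common case short without any explicit sorting.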



* [PATCH v2 23/25] pack: move has_sha1_pack()
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (33 preceding siblings ...)
  2017-08-09  1:22 ` [PATCH v2 22/25] pack: move find_pack_entry() and make it global Jonathan Tan
@ 2017-08-09  1:22 ` Jonathan Tan
  2017-08-09  1:22 ` [PATCH v2 24/25] pack: move has_pack_index() Jonathan Tan
                   ` (25 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-09  1:22 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, sbeller

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 builtin/prune-packed.c | 1 +
 cache.h                | 2 --
 diff.c                 | 1 +
 pack.h                 | 2 ++
 packfile.c             | 6 ++++++
 revision.c             | 1 +
 sha1_file.c            | 6 ------
 7 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/builtin/prune-packed.c b/builtin/prune-packed.c
index ac978ad40..79130aa2e 100644
--- a/builtin/prune-packed.c
+++ b/builtin/prune-packed.c
@@ -2,6 +2,7 @@
 #include "cache.h"
 #include "progress.h"
 #include "parse-options.h"
+#include "pack.h"
 
 static const char * const prune_packed_usage[] = {
 	N_("git prune-packed [-n | --dry-run] [-q | --quiet]"),
diff --git a/cache.h b/cache.h
index 06a8caae6..d96d36d50 100644
--- a/cache.h
+++ b/cache.h
@@ -1190,8 +1190,6 @@ extern int check_sha1_signature(const unsigned char *sha1, void *buf, unsigned l
 
 extern int finalize_object_file(const char *tmpfile, const char *filename);
 
-extern int has_sha1_pack(const unsigned char *sha1);
-
 /*
  * Open the loose object at path, check its sha1, and return the contents,
  * type, and size. If the object is a blob, then "contents" may return NULL,
diff --git a/diff.c b/diff.c
index 85e714f6c..6bbc46326 100644
--- a/diff.c
+++ b/diff.c
@@ -20,6 +20,7 @@
 #include "string-list.h"
 #include "argv-array.h"
 #include "graph.h"
+#include "pack.h"
 
 #ifdef NO_FAST_WORKING_DIRECTORY
 #define FAST_WORKING_DIRECTORY 0
diff --git a/pack.h b/pack.h
index 1021a781c..ce0e15deb 100644
--- a/pack.h
+++ b/pack.h
@@ -223,4 +223,6 @@ extern struct packed_git *find_sha1_pack(const unsigned char *sha1,
 
 extern int find_pack_entry(const unsigned char *sha1, struct pack_entry *e);
 
+extern int has_sha1_pack(const unsigned char *sha1);
+
 #endif
diff --git a/packfile.c b/packfile.c
index 0f1e3338b..507f65236 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1849,3 +1849,9 @@ int find_pack_entry(const unsigned char *sha1, struct pack_entry *e)
 	}
 	return 0;
 }
+
+int has_sha1_pack(const unsigned char *sha1)
+{
+	struct pack_entry e;
+	return find_pack_entry(sha1, &e);
+}
diff --git a/revision.c b/revision.c
index 6603af944..2868c4fc8 100644
--- a/revision.c
+++ b/revision.c
@@ -19,6 +19,7 @@
 #include "dir.h"
 #include "cache-tree.h"
 #include "bisect.h"
+#include "pack.h"
 
 volatile show_early_output_fn_t show_early_output;
 
diff --git a/sha1_file.c b/sha1_file.c
index 1a505eae5..2610ea057 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1629,12 +1629,6 @@ int has_pack_index(const unsigned char *sha1)
 	return 1;
 }
 
-int has_sha1_pack(const unsigned char *sha1)
-{
-	struct pack_entry e;
-	return find_pack_entry(sha1, &e);
-}
-
 int has_sha1_file_with_flags(const unsigned char *sha1, int flags)
 {
 	if (!startup_info->have_repository)
-- 
2.14.0.434.g98096fd7a8-goog



* [PATCH v2 24/25] pack: move has_pack_index()
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (34 preceding siblings ...)
  2017-08-09  1:22 ` [PATCH v2 23/25] pack: move has_sha1_pack() Jonathan Tan
@ 2017-08-09  1:22 ` Jonathan Tan
  2017-08-09  1:22 ` [PATCH v2 25/25] pack: move for_each_packed_object() Jonathan Tan
                   ` (24 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-09  1:22 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, sbeller

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 cache.h     | 2 --
 pack.h      | 2 ++
 packfile.c  | 8 ++++++++
 sha1_file.c | 8 --------
 4 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/cache.h b/cache.h
index d96d36d50..656b39d51 100644
--- a/cache.h
+++ b/cache.h
@@ -1225,8 +1225,6 @@ extern int has_object_file_with_flags(const struct object_id *oid, int flags);
  */
 extern int has_loose_object_nonlocal(const unsigned char *sha1);
 
-extern int has_pack_index(const unsigned char *sha1);
-
 extern void assert_sha1_type(const unsigned char *sha1, enum object_type expect);
 
 /* Helper to check and "touch" a file */
diff --git a/pack.h b/pack.h
index ce0e15deb..2c2a347ba 100644
--- a/pack.h
+++ b/pack.h
@@ -225,4 +225,6 @@ extern int find_pack_entry(const unsigned char *sha1, struct pack_entry *e);
 
 extern int has_sha1_pack(const unsigned char *sha1);
 
+extern int has_pack_index(const unsigned char *sha1);
+
 #endif
diff --git a/packfile.c b/packfile.c
index 507f65236..28a16206c 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1855,3 +1855,11 @@ int has_sha1_pack(const unsigned char *sha1)
 	struct pack_entry e;
 	return find_pack_entry(sha1, &e);
 }
+
+int has_pack_index(const unsigned char *sha1)
+{
+	struct stat st;
+	if (stat(sha1_pack_index_name(sha1), &st))
+		return 0;
+	return 1;
+}
diff --git a/sha1_file.c b/sha1_file.c
index 2610ea057..8584f6cf2 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1621,14 +1621,6 @@ int force_object_loose(const unsigned char *sha1, time_t mtime)
 	return ret;
 }
 
-int has_pack_index(const unsigned char *sha1)
-{
-	struct stat st;
-	if (stat(sha1_pack_index_name(sha1), &st))
-		return 0;
-	return 1;
-}
-
 int has_sha1_file_with_flags(const unsigned char *sha1, int flags)
 {
 	if (!startup_info->have_repository)
-- 
2.14.0.434.g98096fd7a8-goog



* [PATCH v2 25/25] pack: move for_each_packed_object()
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (35 preceding siblings ...)
  2017-08-09  1:22 ` [PATCH v2 24/25] pack: move has_pack_index() Jonathan Tan
@ 2017-08-09  1:22 ` Jonathan Tan
  2017-08-18 22:20 ` [PATCH v3 00/23] Move exported packfile funcs to its own file Jonathan Tan
                   ` (23 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-09  1:22 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, sbeller

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 builtin/cat-file.c |  1 +
 cache.h            |  7 +------
 pack.h             | 11 +++++++++++
 packfile.c         | 40 ++++++++++++++++++++++++++++++++++++++++
 reachable.c        |  1 +
 sha1_file.c        | 40 ----------------------------------------
 6 files changed, 54 insertions(+), 46 deletions(-)

diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 96b786e48..316ef5c98 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -12,6 +12,7 @@
 #include "streaming.h"
 #include "tree-walk.h"
 #include "sha1-array.h"
+#include "pack.h"
 
 struct batch_options {
 	int enabled;
diff --git a/cache.h b/cache.h
index 656b39d51..6c3822783 100644
--- a/cache.h
+++ b/cache.h
@@ -1660,17 +1660,12 @@ int for_each_loose_file_in_objdir_buf(struct strbuf *path,
 				      void *data);
 
 /*
- * Iterate over loose and packed objects in both the local
+ * Iterate over loose objects in both the local
  * repository and any alternates repositories (unless the
  * LOCAL_ONLY flag is set).
  */
 #define FOR_EACH_OBJECT_LOCAL_ONLY 0x1
-typedef int each_packed_object_fn(const struct object_id *oid,
-				  struct packed_git *pack,
-				  uint32_t pos,
-				  void *data);
 extern int for_each_loose_object(each_loose_object_fn, void *, unsigned flags);
-extern int for_each_packed_object(each_packed_object_fn, void *, unsigned flags);
 
 struct object_info {
 	/* Request */
diff --git a/pack.h b/pack.h
index 2c2a347ba..905b05be5 100644
--- a/pack.h
+++ b/pack.h
@@ -227,4 +227,15 @@ extern int has_sha1_pack(const unsigned char *sha1);
 
 extern int has_pack_index(const unsigned char *sha1);
 
+/*
+ * Iterate over packed objects in both the local
+ * repository and any alternates repositories (unless the
+ * FOR_EACH_OBJECT_LOCAL_ONLY flag, defined in cache.h, is set).
+ */
+typedef int each_packed_object_fn(const struct object_id *oid,
+				  struct packed_git *pack,
+				  uint32_t pos,
+				  void *data);
+extern int for_each_packed_object(each_packed_object_fn, void *, unsigned flags);
+
 #endif
diff --git a/packfile.c b/packfile.c
index 28a16206c..031a40828 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1863,3 +1863,43 @@ int has_pack_index(const unsigned char *sha1)
 		return 0;
 	return 1;
 }
+
+static int for_each_object_in_pack(struct packed_git *p, each_packed_object_fn cb, void *data)
+{
+	uint32_t i;
+	int r = 0;
+
+	for (i = 0; i < p->num_objects; i++) {
+		struct object_id oid;
+
+		if (!nth_packed_object_oid(&oid, p, i))
+			return error("unable to get sha1 of object %u in %s",
+				     i, p->pack_name);
+
+		r = cb(&oid, p, i, data);
+		if (r)
+			break;
+	}
+	return r;
+}
+
+int for_each_packed_object(each_packed_object_fn cb, void *data, unsigned flags)
+{
+	struct packed_git *p;
+	int r = 0;
+	int pack_errors = 0;
+
+	prepare_packed_git();
+	for (p = packed_git; p; p = p->next) {
+		if ((flags & FOR_EACH_OBJECT_LOCAL_ONLY) && !p->pack_local)
+			continue;
+		if (open_pack_index(p)) {
+			pack_errors = 1;
+			continue;
+		}
+		r = for_each_object_in_pack(p, cb, data);
+		if (r)
+			break;
+	}
+	return r ? r : pack_errors;
+}
diff --git a/reachable.c b/reachable.c
index c62efbfd4..ef606ae17 100644
--- a/reachable.c
+++ b/reachable.c
@@ -9,6 +9,7 @@
 #include "cache-tree.h"
 #include "progress.h"
 #include "list-objects.h"
+#include "pack.h"
 
 struct connectivity_progress {
 	struct progress *progress;
diff --git a/sha1_file.c b/sha1_file.c
index 8584f6cf2..3f3f9174f 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -2013,46 +2013,6 @@ int for_each_loose_object(each_loose_object_fn cb, void *data, unsigned flags)
 	return foreach_alt_odb(loose_from_alt_odb, &alt);
 }
 
-static int for_each_object_in_pack(struct packed_git *p, each_packed_object_fn cb, void *data)
-{
-	uint32_t i;
-	int r = 0;
-
-	for (i = 0; i < p->num_objects; i++) {
-		struct object_id oid;
-
-		if (!nth_packed_object_oid(&oid, p, i))
-			return error("unable to get sha1 of object %u in %s",
-				     i, p->pack_name);
-
-		r = cb(&oid, p, i, data);
-		if (r)
-			break;
-	}
-	return r;
-}
-
-int for_each_packed_object(each_packed_object_fn cb, void *data, unsigned flags)
-{
-	struct packed_git *p;
-	int r = 0;
-	int pack_errors = 0;
-
-	prepare_packed_git();
-	for (p = packed_git; p; p = p->next) {
-		if ((flags & FOR_EACH_OBJECT_LOCAL_ONLY) && !p->pack_local)
-			continue;
-		if (open_pack_index(p)) {
-			pack_errors = 1;
-			continue;
-		}
-		r = for_each_object_in_pack(p, cb, data);
-		if (r)
-			break;
-	}
-	return r ? r : pack_errors;
-}
-
 static int check_stream_sha1(git_zstream *stream,
 			     const char *hdr,
 			     unsigned long size,
-- 
2.14.0.434.g98096fd7a8-goog
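Editorial sketch, not part of the patch: for_each_packed_object() follows a common callback-iteration contract. A nonzero return from the callback stops the walk and is passed back to the caller, a pack whose index cannot be opened is skipped but remembered as an error, and a flag restricts the walk to local packs. The toy below mirrors that control flow under simplified assumptions; the sketch_* names, the integer object ids and the `broken` field are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical analogue of FOR_EACH_OBJECT_LOCAL_ONLY. */
#define SKETCH_LOCAL_ONLY 0x1

struct sketch_pack {
	int local;		/* nonzero if the pack is in the local repository */
	int broken;		/* nonzero simulates an index that cannot be opened */
	const int *ids;		/* toy object identifiers */
	size_t nr;
	struct sketch_pack *next;
};

typedef int sketch_each_fn(int id, struct sketch_pack *pack,
			   size_t pos, void *data);

/*
 * Call cb for every object in every eligible pack.  Returns the first
 * nonzero callback result, else 1 if any pack had to be skipped
 * because it was broken, else 0.
 */
int sketch_for_each(struct sketch_pack *packs, sketch_each_fn *cb,
		    void *data, unsigned flags)
{
	struct sketch_pack *p;
	int r = 0, pack_errors = 0;
	size_t i;

	assert(cb);
	for (p = packs; p; p = p->next) {
		if ((flags & SKETCH_LOCAL_ONLY) && !p->local)
			continue;
		if (p->broken) {
			pack_errors = 1;
			continue;
		}
		for (i = 0; i < p->nr && !r; i++)
			r = cb(p->ids[i], p, i, data);
		if (r)
			break;
	}
	return r ? r : pack_errors;
}

/* Example callback: add up ids, and abort the walk on a negative id. */
int sketch_sum_cb(int id, struct sketch_pack *pack, size_t pos, void *data)
{
	(void)pack;
	(void)pos;
	if (id < 0)
		return -1;
	*(int *)data += id;
	return 0;
}
```

Returning the callback's own value, rather than a fixed error code, lets each caller define what "stop" means; the separate error flag keeps a damaged pack from hiding objects in the packs that follow it.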



* Re: [RFC PATCH 01/10] pack: move pack name-related functions
  2017-08-08 20:50     ` Jonathan Tan
@ 2017-08-09 12:00       ` Christian Couder
  2017-08-09 17:16         ` Jonathan Tan
  0 siblings, 1 reply; 88+ messages in thread
From: Christian Couder @ 2017-08-09 12:00 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: Stefan Beller, git@vger.kernel.org

On Tue, Aug 8, 2017 at 10:50 PM, Jonathan Tan <jonathantanmy@google.com> wrote:
> On Tue, 8 Aug 2017 13:36:24 -0700
> Stefan Beller <sbeller@google.com> wrote:
>>
>> There are also packed refs, so one could (like I did) think that
>> pack.c is for generic packing of things, maybe packfile.c
>> would be more clear?
>
> Good point. I'll use packfile.c and packfile.h in the next version.

It looks like you used "packfile.c" and "pack.h" in v2. Is there a
reason why it's not using "packfile.h"?


* Re: [RFC PATCH 01/10] pack: move pack name-related functions
  2017-08-09 12:00       ` Christian Couder
@ 2017-08-09 17:16         ` Jonathan Tan
  2017-08-11 19:38           ` Ben Peart
  0 siblings, 1 reply; 88+ messages in thread
From: Jonathan Tan @ 2017-08-09 17:16 UTC (permalink / raw)
  To: Christian Couder; +Cc: Stefan Beller, git@vger.kernel.org

On Wed, 9 Aug 2017 14:00:40 +0200
Christian Couder <christian.couder@gmail.com> wrote:

> On Tue, Aug 8, 2017 at 10:50 PM, Jonathan Tan <jonathantanmy@google.com> wrote:
> > On Tue, 8 Aug 2017 13:36:24 -0700
> > Stefan Beller <sbeller@google.com> wrote:
> >>
> >> There are also packed refs, so one could (like I did) think that
> >> pack.c is for generic packing of things, maybe packfile.c
> >> would be more clear?
> >
> > Good point. I'll use packfile.c and packfile.h in the next version.
> 
> It looks like you used "packfile.c" and "pack.h" in v2. Is there a
> reason why it's not using "packfile.h"?

Ah, I forgot to mention this in the cover letter. I thought that one
header was sufficient to cover all pack-related things, so if we wanted
to know which files used pack-related things, we would only need to
search for one string instead of two. Also, the division between
"pack.h" and the hypothetical "packfile.h" was not so clear to me.


* Re: [PATCH v2 00/25] Move exported packfile funcs to its own file
  2017-08-09  1:22 ` [PATCH v2 00/25] Move exported " Jonathan Tan
@ 2017-08-10 17:21   ` Stefan Beller
  2017-08-10 21:19   ` Junio C Hamano
  2017-08-11 19:41   ` [PATCH v2 00/25] Move exported packfile funcs to its own file Ben Peart
  2 siblings, 0 replies; 88+ messages in thread
From: Stefan Beller @ 2017-08-10 17:21 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git@vger.kernel.org, Junio C Hamano

On Tue, Aug 8, 2017 at 6:22 PM, Jonathan Tan <jonathantanmy@google.com> wrote:
> Here is the complete patch set. I have only moved the exported functions
> that operate with packfiles and their static helpers - for example,
> static functions like freshen_packed_object() that are used only by
> non-pack-specific functions are not moved.
>
> In the end, 3 functions needed to be made global. They are
> find_pack_entry(), mark_bad_packed_object(), and has_packed_and_bad().
>
> Of the 3, find_pack_entry() is probably legitimately promoted. But I
> think that the latter two functions needing to be accessed from
> sha1_file.c points to a design that could be improved - they are only
> used when packed_object_info() detects corruption, and used for marking
> as bad and printing messages to the user respectively, which
> packed_object_info() should probably do itself. But I have not made this
> change in this patch set.
>
> (Other than the 3 functions above, there are some variables and
> functions that are temporarily made global, but reduced back to static
> when the wide scope is no longer needed.)

I read through the patches yesterday and had no comment.

Thanks,
Stefan


* Re: [PATCH v2 00/25] Move exported packfile funcs to its own file
  2017-08-09  1:22 ` [PATCH v2 00/25] Move exported " Jonathan Tan
  2017-08-10 17:21   ` Stefan Beller
@ 2017-08-10 21:19   ` Junio C Hamano
  2017-08-10 21:59     ` Jonathan Tan
  2017-08-11 19:41   ` [PATCH v2 00/25] Move exported packfile funcs to its own file Ben Peart
  2 siblings, 1 reply; 88+ messages in thread
From: Junio C Hamano @ 2017-08-10 21:19 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, sbeller

Jonathan Tan <jonathantanmy@google.com> writes:

> Here is the complete patch set. I have only moved the exported functions
> that operate with packfiles and their static helpers - for example,
> static functions like freshen_packed_object() that are used only by
> non-pack-specific functions are not moved.

This will interfere with smaller changes and fixes we want to have
early in the 'master' branch, so while I think it is a good idea to
do something like this in the longer term, I'd have to ask you to
either hold on or rebase this on them (you'll know what else you are
conflicting with when you try to merge this to 'pu' yourself).

Thanks.


* Re: [PATCH v2 00/25] Move exported packfile funcs to its own file
  2017-08-10 21:19   ` Junio C Hamano
@ 2017-08-10 21:59     ` Jonathan Tan
  2017-08-10 22:40       ` Junio C Hamano
  0 siblings, 1 reply; 88+ messages in thread
From: Jonathan Tan @ 2017-08-10 21:59 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, sbeller

On Thu, 10 Aug 2017 14:19:59 -0700
Junio C Hamano <gitster@pobox.com> wrote:

> Jonathan Tan <jonathantanmy@google.com> writes:
> 
> > Here is the complete patch set. I have only moved the exported functions
> > that operate with packfiles and their static helpers - for example,
> > static functions like freshen_packed_object() that are used only by
> > non-pack-specific functions are not moved.
> 
> This will interfere with smaller changes and fixes we want to have
> early in the 'master' branch, so while I think it is a good idea to
> do something like this in the longer term, I'd have to ask you to
> either hold on or rebase this on them (you'll know what else you are
> conflicting with when you try to merge this to 'pu' yourself).
> 
> Thanks.

OK, I'll wait until you have updated the master branch, then I'll try to
rebase on it.


* Re: [PATCH v2 00/25] Move exported packfile funcs to its own file
  2017-08-10 21:59     ` Jonathan Tan
@ 2017-08-10 22:40       ` Junio C Hamano
  2017-08-11 20:36         ` [PATCH 0/2] non-move patches in preparation for packfile.c Jonathan Tan
                           ` (2 more replies)
  0 siblings, 3 replies; 88+ messages in thread
From: Junio C Hamano @ 2017-08-10 22:40 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, sbeller

Jonathan Tan <jonathantanmy@google.com> writes:

> On Thu, 10 Aug 2017 14:19:59 -0700
> Junio C Hamano <gitster@pobox.com> wrote:
>
>> Jonathan Tan <jonathantanmy@google.com> writes:
>> 
>> > Here is the complete patch set. I have only moved the exported functions
>> > that operate with packfiles and their static helpers - for example,
>> > static functions like freshen_packed_object() that are used only by
>> > non-pack-specific functions are not moved.
>> 
>> This will interfere with smaller changes and fixes we want to have
>> early in the 'master' branch, so while I think it is a good idea to
>> do something like this in the longer term, I'd have to ask you to
>> either hold on or rebase this on them (you'll know what else you are
>> conflicting with when you try to merge this to 'pu' yourself).
>> 
>> Thanks.
>
> OK, I'll wait until you have updated the master branch, then I'll try to
> rebase on it.

That will take a few weeks, and I do not think we want you idling
during that time ;-).

You'd need to double check, but I think the topics that cause
trouble are rs/find-pack-entry-bisection and jk/drop-sha1-entry-pos;
you can start from v2.14.1 and merge these topics on top and then
build your change on top.  That would allow you to start cooking
before both of them graduate to 'master', as I expect they are both
quick-to-next material.  There might be other topics that interfere
with what you are doing, but you can easily find out what they are
if you do a trial merge to 'next' and 'pu' yourself.

Thanks.


* Re: [RFC PATCH 01/10] pack: move pack name-related functions
  2017-08-09 17:16         ` Jonathan Tan
@ 2017-08-11 19:38           ` Ben Peart
  2017-08-11 21:34             ` Junio C Hamano
  0 siblings, 1 reply; 88+ messages in thread
From: Ben Peart @ 2017-08-11 19:38 UTC (permalink / raw)
  To: Jonathan Tan, Christian Couder; +Cc: Stefan Beller, git@vger.kernel.org



On 8/9/2017 1:16 PM, Jonathan Tan wrote:
> On Wed, 9 Aug 2017 14:00:40 +0200
> Christian Couder <christian.couder@gmail.com> wrote:
> 
>> On Tue, Aug 8, 2017 at 10:50 PM, Jonathan Tan <jonathantanmy@google.com> wrote:
>>> On Tue, 8 Aug 2017 13:36:24 -0700
>>> Stefan Beller <sbeller@google.com> wrote:
>>>>
>>>> There are also packed refs, so one could (like I did) think that
>>>> pack.c is for generic packing of things, maybe packfile.c
>>>> would be more clear?
>>>
>>> Good point. I'll use packfile.c and packfile.h in the next version.
>>
>> It looks like you used "packfile.c" and "pack.h" in v2. Is there a
>> reason why it's not using "packfile.h"?
> 
> Ah, I forgot to mention this in the cover letter. I thought that one
> header was sufficient to cover all pack-related things, so if we wanted
> to know which files used pack-related things, we would only need to
> search for one string instead of two. Also, the division between
> "pack.h" and the hypothetical "packfile.h" was not so clear to me.
> 

I prefer having source and the header files that export the functions 
have matching names to make it easy to find them.  I would prefer 
packfile.h vs pack.h myself.


* Re: [PATCH v2 00/25] Move exported packfile funcs to its own file
  2017-08-09  1:22 ` [PATCH v2 00/25] Move exported " Jonathan Tan
  2017-08-10 17:21   ` Stefan Beller
  2017-08-10 21:19   ` Junio C Hamano
@ 2017-08-11 19:41   ` Ben Peart
  2017-08-18 23:36     ` Jonathan Tan
  2 siblings, 1 reply; 88+ messages in thread
From: Ben Peart @ 2017-08-11 19:41 UTC (permalink / raw)
  To: Jonathan Tan, git; +Cc: gitster, sbeller



On 8/8/2017 9:22 PM, Jonathan Tan wrote:
> Here is the complete patch set. I have only moved the exported functions
> that operate with packfiles and their static helpers - for example,
> static functions like freshen_packed_object() that are used only by
> non-pack-specific functions are not moved.
> 
> In the end, 3 functions needed to be made global. They are
> find_pack_entry(), mark_bad_packed_object(), and has_packed_and_bad().
> 
> Of the 3, find_pack_entry() is probably legitimately promoted. But I
> think that the latter two functions needing to be accessed from
> sha1_file.c points to a design that could be improved - they are only
> used when packed_object_info() detects corruption, and used for marking
> as bad and printing messages to the user respectively, which
> packed_object_info() should probably do itself. But I have not made this
> change in this patch set.
> 
> (Other than the 3 functions above, there are some variables and
> functions that are temporarily made global, but reduced back to static
> when the wide scope is no longer needed.)
> 

Nice to see the pack file functions being refactored out.  I looked at 
the end result and it looked good to me.

Do you have the energy to do a similar refactoring for the remaining 
public functions residing in sha1_file.c?  Perhaps a new sha1_file.h? It 
would be nice to get more things out of cache.h. :)


* [PATCH 0/2] non-move patches in preparation for packfile.c
  2017-08-10 22:40       ` Junio C Hamano
@ 2017-08-11 20:36         ` Jonathan Tan
  2017-08-11 20:36         ` [PATCH 1/2] sha1_file: set whence in storage-specific info fn Jonathan Tan
  2017-08-11 20:36         ` [PATCH 2/2] sha1_file: remove read_packed_sha1() Jonathan Tan
  2 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-11 20:36 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

Thanks, I'll work on that.

In the meantime, could these 2 patches be merged in (after review, of
course) first? This will make the remaining set much easier to review,
as you can check most of them with the new --color-moved functionality
of diff.

Incidentally, after rebasing on "pu" and resolving the conflicts, I
noticed some "zebra stripes" (from --color-moved) where I didn't expect
them. I'll try to figure out that issue before I resend the rest of the
patches (which will be after the relevant branches are merged to
"next").

Jonathan Tan (2):
  sha1_file: set whence in storage-specific info fn
  sha1_file: remove read_packed_sha1()

 sha1_file.c | 39 +++++++--------------------------------
 1 file changed, 7 insertions(+), 32 deletions(-)

-- 
2.14.0.434.g98096fd7a8-goog



* [PATCH 1/2] sha1_file: set whence in storage-specific info fn
  2017-08-10 22:40       ` Junio C Hamano
  2017-08-11 20:36         ` [PATCH 0/2] non-move patches in preparation for packfile.c Jonathan Tan
@ 2017-08-11 20:36         ` Jonathan Tan
  2017-08-11 21:52           ` Junio C Hamano
  2017-08-11 20:36         ` [PATCH 2/2] sha1_file: remove read_packed_sha1() Jonathan Tan
  2 siblings, 1 reply; 88+ messages in thread
From: Jonathan Tan @ 2017-08-11 20:36 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

Move the setting of oi->whence to sha1_loose_object_info() and
packed_object_info().

This allows sha1_object_info_extended() to not need to know about the
delta base cache. This will be useful during a future refactoring in
which packfile-related functions, including the handling of the delta
base cache, will be moved to a separate file.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 sha1_file.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index b60ae15f7..910109fd9 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -2444,6 +2444,9 @@ int packed_object_info(struct packed_git *p, off_t obj_offset,
 			hashclr(oi->delta_base_sha1);
 	}
 
+	oi->whence = in_delta_base_cache(p, obj_offset) ? OI_DBCACHED :
+							  OI_PACKED;
+
 out:
 	unuse_pack(&w_curs);
 	return type;
@@ -2973,6 +2976,7 @@ static int sha1_loose_object_info(const unsigned char *sha1,
 	if (oi->sizep == &size_scratch)
 		oi->sizep = NULL;
 	strbuf_release(&hdrbuf);
+	oi->whence = OI_LOOSE;
 	return (status < 0) ? status : 0;
 }
 
@@ -3010,10 +3014,8 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi,
 
 	if (!find_pack_entry(real, &e)) {
 		/* Most likely it's a loose object. */
-		if (!sha1_loose_object_info(real, oi, flags)) {
-			oi->whence = OI_LOOSE;
+		if (!sha1_loose_object_info(real, oi, flags))
 			return 0;
-		}
 
 		/* Not a loose object; someone else may have just packed it. */
 		if (flags & OBJECT_INFO_QUICK) {
@@ -3036,10 +3038,7 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi,
 	if (rtype < 0) {
 		mark_bad_packed_object(e.p, real);
 		return sha1_object_info_extended(real, oi, 0);
-	} else if (in_delta_base_cache(e.p, e.offset)) {
-		oi->whence = OI_DBCACHED;
-	} else {
-		oi->whence = OI_PACKED;
+	} else if (oi->whence == OI_PACKED) {
 		oi->u.packed.offset = e.offset;
 		oi->u.packed.pack = e.p;
 		oi->u.packed.is_delta = (rtype == OBJ_REF_DELTA ||
-- 
2.14.0.434.g98096fd7a8-goog


^ permalink raw reply related	[flat|nested] 88+ messages in thread

* [PATCH 2/2] sha1_file: remove read_packed_sha1()
  2017-08-10 22:40       ` Junio C Hamano
  2017-08-11 20:36         ` [PATCH 0/2] non-move patches in preparation for packfile.c Jonathan Tan
  2017-08-11 20:36         ` [PATCH 1/2] sha1_file: set whence in storage-specific info fn Jonathan Tan
@ 2017-08-11 20:36         ` Jonathan Tan
  2017-08-11 22:06           ` Junio C Hamano
  2 siblings, 1 reply; 88+ messages in thread
From: Jonathan Tan @ 2017-08-11 20:36 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

Use read_object() in its place instead. This avoids duplication of code.

This makes force_object_loose() slightly slower (because of a redundant
check of loose object storage), but only in the error case.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 sha1_file.c | 26 +-------------------------
 1 file changed, 1 insertion(+), 25 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 910109fd9..0f758eabf 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -3062,30 +3062,6 @@ int sha1_object_info(const unsigned char *sha1, unsigned long *sizep)
 	return type;
 }
 
-static void *read_packed_sha1(const unsigned char *sha1,
-			      enum object_type *type, unsigned long *size)
-{
-	struct pack_entry e;
-	void *data;
-
-	if (!find_pack_entry(sha1, &e))
-		return NULL;
-	data = cache_or_unpack_entry(e.p, e.offset, size, type);
-	if (!data) {
-		/*
-		 * We're probably in deep shit, but let's try to fetch
-		 * the required object anyway from another pack or loose.
-		 * This should happen only in the presence of a corrupted
-		 * pack, and is better than failing outright.
-		 */
-		error("failed to read object %s at offset %"PRIuMAX" from %s",
-		      sha1_to_hex(sha1), (uintmax_t)e.offset, e.p->pack_name);
-		mark_bad_packed_object(e.p, sha1);
-		data = read_object(sha1, type, size);
-	}
-	return data;
-}
-
 int pretend_sha1_file(void *buf, unsigned long len, enum object_type type,
 		      unsigned char *sha1)
 {
@@ -3468,7 +3444,7 @@ int force_object_loose(const unsigned char *sha1, time_t mtime)
 
 	if (has_loose_object(sha1))
 		return 0;
-	buf = read_packed_sha1(sha1, &type, &len);
+	buf = read_object(sha1, &type, &len);
 	if (!buf)
 		return error("cannot read sha1_file for %s", sha1_to_hex(sha1));
 	hdrlen = xsnprintf(hdr, sizeof(hdr), "%s %lu", typename(type), len) + 1;
-- 
2.14.0.434.g98096fd7a8-goog


^ permalink raw reply related	[flat|nested] 88+ messages in thread

* Re: [RFC PATCH 01/10] pack: move pack name-related functions
  2017-08-11 19:38           ` Ben Peart
@ 2017-08-11 21:34             ` Junio C Hamano
  2017-08-16 22:53               ` Jonathan Tan
  0 siblings, 1 reply; 88+ messages in thread
From: Junio C Hamano @ 2017-08-11 21:34 UTC (permalink / raw)
  To: Ben Peart
  Cc: Jonathan Tan, Christian Couder, Stefan Beller,
	git@vger.kernel.org

Ben Peart <peartben@gmail.com> writes:

> On 8/9/2017 1:16 PM, Jonathan Tan wrote:
>
>> Ah, I forgot to mention this in the cover letter. I thought that one
>> header was sufficient to cover all pack-related things, so if we wanted
>> to know which files used pack-related things, we would only need to
>> search for one string instead of two. Also, the division between
>> "pack.h" and the hypothetical "packfile.h" was not so clear to me.
>
> I prefer having source and the header files that export the functions
> have matching names to make it easy to find them.  I would prefer
> packfile.h vs pack.h myself.

Meaning "If we have packfile.c, packfile.h is preferable over pack.h"?
I tend to agree with that.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH 1/2] sha1_file: set whence in storage-specific info fn
  2017-08-11 20:36         ` [PATCH 1/2] sha1_file: set whence in storage-specific info fn Jonathan Tan
@ 2017-08-11 21:52           ` Junio C Hamano
  0 siblings, 0 replies; 88+ messages in thread
From: Junio C Hamano @ 2017-08-11 21:52 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

Jonathan Tan <jonathantanmy@google.com> writes:

> Move the setting of oi->whence to sha1_loose_object_info() and
> packed_object_info().
>
> This allows sha1_object_info_extended() to not need to know about the
> delta base cache. This will be useful during a future refactoring in
> which packfile-related functions, including the handling of the delta
> base cache, will be moved to a separate file.
>
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> ---

Output from 

    git grep -E -e '(\.|->)whence'

makes me wonder if the oi->whence thing is a bit over-engineered,
though.  The only real user of this information is the streaming
code, which wants to see if it can grab an undeltified deflated data
directly out of the pack file (and if so from which packfile at what
offset), or if it can open a loose object file and slurp deflated
data out of it.

But that is totally outside the scope of this patch.  This looks
like a safe no-op conversion to me.
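
For readers following along, the shape of the conversion can be
sketched roughly as below. This is only an illustration under
simplifying assumptions: the types, the single-slot "cache", and the
function signatures are hypothetical stand-ins, not the real
definitions in sha1_file.c. The point is that each storage-specific
function records the provenance itself, so the generic caller only
has to look at the result.

```c
/* Hypothetical, simplified stand-ins for git's object_info machinery. */
enum whence { OI_CACHED, OI_LOOSE, OI_PACKED, OI_DBCACHED };

struct object_info {
	enum whence whence;
	long packed_offset;	/* meaningful only when whence == OI_PACKED */
};

/* Toy model of the delta base cache: it remembers a single offset. */
static long cached_offset = -1;

static int in_delta_base_cache(long offset)
{
	return offset == cached_offset;
}

/* The pack-specific lookup decides the provenance itself. */
static int packed_object_info(long offset, struct object_info *oi)
{
	oi->whence = in_delta_base_cache(offset) ? OI_DBCACHED : OI_PACKED;
	return 0;
}

/* The loose-object lookup does the same for its own storage. */
static int loose_object_info(struct object_info *oi)
{
	oi->whence = OI_LOOSE;
	return 0;
}

/* The generic caller no longer mentions the delta base cache at all. */
static int object_info_extended(int is_packed, long offset,
				struct object_info *oi)
{
	if (!is_packed)
		return loose_object_info(oi);
	if (packed_object_info(offset, oi) < 0)
		return -1;
	if (oi->whence == OI_PACKED)
		oi->packed_offset = offset;
	return 0;
}
```

In this sketch the packed-specific details are filled in only when
the object was read straight from the pack, which mirrors the
"else if (oi->whence == OI_PACKED)" branch in the patch.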

Thanks.

>  sha1_file.c | 13 ++++++-------
>  1 file changed, 6 insertions(+), 7 deletions(-)
>
> diff --git a/sha1_file.c b/sha1_file.c
> index b60ae15f7..910109fd9 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -2444,6 +2444,9 @@ int packed_object_info(struct packed_git *p, off_t obj_offset,
>  			hashclr(oi->delta_base_sha1);
>  	}
>  
> +	oi->whence = in_delta_base_cache(p, obj_offset) ? OI_DBCACHED :
> +							  OI_PACKED;
> +
>  out:
>  	unuse_pack(&w_curs);
>  	return type;
> @@ -2973,6 +2976,7 @@ static int sha1_loose_object_info(const unsigned char *sha1,
>  	if (oi->sizep == &size_scratch)
>  		oi->sizep = NULL;
>  	strbuf_release(&hdrbuf);
> +	oi->whence = OI_LOOSE;
>  	return (status < 0) ? status : 0;
>  }
>  
> @@ -3010,10 +3014,8 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi,
>  
>  	if (!find_pack_entry(real, &e)) {
>  		/* Most likely it's a loose object. */
> -		if (!sha1_loose_object_info(real, oi, flags)) {
> -			oi->whence = OI_LOOSE;
> +		if (!sha1_loose_object_info(real, oi, flags))
>  			return 0;
> -		}
>  
>  		/* Not a loose object; someone else may have just packed it. */
>  		if (flags & OBJECT_INFO_QUICK) {
> @@ -3036,10 +3038,7 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi,
>  	if (rtype < 0) {
>  		mark_bad_packed_object(e.p, real);
>  		return sha1_object_info_extended(real, oi, 0);
> -	} else if (in_delta_base_cache(e.p, e.offset)) {
> -		oi->whence = OI_DBCACHED;
> -	} else {
> -		oi->whence = OI_PACKED;
> +	} else if (oi->whence == OI_PACKED) {
>  		oi->u.packed.offset = e.offset;
>  		oi->u.packed.pack = e.p;
>  		oi->u.packed.is_delta = (rtype == OBJ_REF_DELTA ||

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH 2/2] sha1_file: remove read_packed_sha1()
  2017-08-11 20:36         ` [PATCH 2/2] sha1_file: remove read_packed_sha1() Jonathan Tan
@ 2017-08-11 22:06           ` Junio C Hamano
  0 siblings, 0 replies; 88+ messages in thread
From: Junio C Hamano @ 2017-08-11 22:06 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

Jonathan Tan <jonathantanmy@google.com> writes:

> Use read_object() in its place instead. This avoids duplication of code.
>
> This makes force_object_loose() slightly slower (because of a redundant
> check of loose object storage), but only in the error case.
>
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> ---
>  sha1_file.c | 26 +-------------------------
>  1 file changed, 1 insertion(+), 25 deletions(-)

The original code insisted on reading from pack and never from a
loose object, because it knew it would return early when it found a
loose version.  Now we allow a loose one to appear in the middle of
force_object_loose() operation and happily read from it when we do
not see a pack entry for the object---presumably because we are
racing with another simultaneous repack process, or something?---and
then write it out as a new (and identical) loose object, which would
not do any harm.

So this is not strictly a no-op conversion; I have a gut feeling
that it would make it more robust, not less, in the presence of
another racing repack process, but I haven't really thought through
race scenarios that may make a difference in its behaviour.
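
To make the difference concrete, here is a toy model of the two read
strategies. Everything in it is hypothetical (the toy_store struct,
the reader functions, and the simulate_race flag are invented for
illustration), and it assumes the lookup order described above: pack
first, then loose. It is a sketch of the reasoning, not of the actual
code paths.

```c
#include <stddef.h>

/* Hypothetical toy object store: where a copy of one object lives. */
struct toy_store {
	const char *packed;	/* NULL if there is no pack entry */
	const char *loose;	/* NULL if there is no loose copy */
};

typedef const char *(*reader_fn)(const struct toy_store *);

/* Old behaviour, in the spirit of read_packed_sha1(): packs only. */
static const char *read_packed_only(const struct toy_store *s)
{
	return s->packed;
}

/* New behaviour, in the spirit of read_object(): fall back to loose. */
static const char *read_any(const struct toy_store *s)
{
	return s->packed ? s->packed : s->loose;
}

/* Returns 0 if the object ends up loose, -1 if it could not be read. */
static int force_loose(struct toy_store *s, reader_fn reader,
		       int simulate_race)
{
	const char *buf;

	if (s->loose)
		return 0;
	if (simulate_race) {
		/* Pretend another process turned the pack entry loose here. */
		s->loose = s->packed;
		s->packed = NULL;
	}
	buf = reader(s);
	if (!buf)
		return -1;
	s->loose = buf;	/* rewriting an identical loose copy is harmless */
	return 0;
}
```

With simulate_race set, the packs-only reader fails even though the
object is still available, while the fallback reader succeeds; with
no race the two behave the same.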


> diff --git a/sha1_file.c b/sha1_file.c
> index 910109fd9..0f758eabf 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -3062,30 +3062,6 @@ int sha1_object_info(const unsigned char *sha1, unsigned long *sizep)
>  	return type;
>  }
>  
> -static void *read_packed_sha1(const unsigned char *sha1,
> -			      enum object_type *type, unsigned long *size)
> -{
> -	struct pack_entry e;
> -	void *data;
> -
> -	if (!find_pack_entry(sha1, &e))
> -		return NULL;
> -	data = cache_or_unpack_entry(e.p, e.offset, size, type);
> -	if (!data) {
> -		/*
> -		 * We're probably in deep shit, but let's try to fetch
> -		 * the required object anyway from another pack or loose.
> -		 * This should happen only in the presence of a corrupted
> -		 * pack, and is better than failing outright.
> -		 */
> -		error("failed to read object %s at offset %"PRIuMAX" from %s",
> -		      sha1_to_hex(sha1), (uintmax_t)e.offset, e.p->pack_name);
> -		mark_bad_packed_object(e.p, sha1);
> -		data = read_object(sha1, type, size);
> -	}
> -	return data;
> -}
> -
>  int pretend_sha1_file(void *buf, unsigned long len, enum object_type type,
>  		      unsigned char *sha1)
>  {
> @@ -3468,7 +3444,7 @@ int force_object_loose(const unsigned char *sha1, time_t mtime)
>  
>  	if (has_loose_object(sha1))
>  		return 0;
> -	buf = read_packed_sha1(sha1, &type, &len);
> +	buf = read_object(sha1, &type, &len);
>  	if (!buf)
>  		return error("cannot read sha1_file for %s", sha1_to_hex(sha1));
>  	hdrlen = xsnprintf(hdr, sizeof(hdr), "%s %lu", typename(type), len) + 1;

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [RFC PATCH 01/10] pack: move pack name-related functions
  2017-08-11 21:34             ` Junio C Hamano
@ 2017-08-16 22:53               ` Jonathan Tan
  0 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-16 22:53 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ben Peart, Christian Couder, Stefan Beller, git@vger.kernel.org

On Fri, 11 Aug 2017 14:34:27 -0700
Junio C Hamano <gitster@pobox.com> wrote:

> Ben Peart <peartben@gmail.com> writes:
> 
> > On 8/9/2017 1:16 PM, Jonathan Tan wrote:
> >
> >> Ah, I forgot to mention this in the cover letter. I thought that one
> >> header was sufficient to cover all pack-related things, so if we wanted
> >> to know which files used pack-related things, we would only need to
> >> search for one string instead of two. Also, the division between
> >> "pack.h" and the hypothetical "packfile.h" was not so clear to me.
> >
> > I prefer having source and the header files that export the functions
> > have matching names to make it easy to find them.  I would prefer
> > packfile.h vs pack.h myself.
> 
> Meaning "If we have packfile.c, packfile.h is preferable over pack.h"?
> I tend to agree with that.

Fair enough - I've changed it so that the functions now go into
packfile.h. I'll send it out once I know what to base it on (at least
jt/sha1-file-cleanup, and a few more branches that also modify
sha1_file.c).

^ permalink raw reply	[flat|nested] 88+ messages in thread

* [PATCH v3 00/23] Move exported packfile funcs to its own file
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (36 preceding siblings ...)
  2017-08-09  1:22 ` [PATCH v2 25/25] pack: move for_each_packed_object() Jonathan Tan
@ 2017-08-18 22:20 ` Jonathan Tan
  2017-08-19  7:33   ` Junio C Hamano
  2017-08-18 22:20 ` [PATCH v3 01/23] pack: move pack name-related functions Jonathan Tan
                   ` (22 subsequent siblings)
  60 siblings, 1 reply; 88+ messages in thread
From: Jonathan Tan @ 2017-08-18 22:20 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

> You'd need to double check, but I think the topics that cause
> trouble are rs/find-pack-entry-bisection and jk/drop-sha1-entry-pos;
> you can start from v2.14.1 and merge these topics on top and then
> build your change on top.  That would allow you to start cooking
> before both of them graduate to 'master', as I expect they are both
> quick-to-next material.  There might be other topics that interfere
> with what you are doing, but you can easily find out what they are
> if you do a trial merge to 'next' and 'pu' yourself.

OK - in addition to the 2 you mentioned, I have found some others
(likely added after you wrote that). The complete list is:
 - rs/find-pack-entry-bisection
 - jk/drop-sha1-entry-pos
 - jt/sha1-file-cleanup (formerly part of this set)
 - mk/use-size-t-in-zlib
 - rs/unpack-entry-leakfix

I have merged all of these and rebased my patches on top.

Other changes:
 - Used packfile.h instead of pack.h (following most people's
   preference)
 - Ensured that I added functions to packfile.h retaining the order they
   were originally in, so that if you run "git diff <base> --color-moved
   --patience", there are far fewer zebra stripes

The merge base commit can be accessed online [1], if you need it.

[1] https://github.com/jonathantanmy/git/commits/packmigrate

Jonathan Tan (23):
  pack: move pack name-related functions
  pack: move static state variables
  pack: move pack_report()
  pack: move open_pack_index(), parse_pack_index()
  pack: move release_pack_memory()
  pack: move pack-closing functions
  pack: move use_pack()
  pack: move unuse_pack()
  pack: move add_packed_git()
  pack: move install_packed_git()
  pack: move {,re}prepare_packed_git and approximate_object_count
  pack: move unpack_object_header_buffer()
  pack: move get_size_from_delta()
  pack: move unpack_object_header()
  pack: move clear_delta_base_cache(), packed_object_info(),
    unpack_entry()
  pack: move nth_packed_object_{sha1,oid}
  pack: move check_pack_index_ptr(), nth_packed_object_offset()
  pack: move find_pack_entry_one(), is_pack_valid()
  pack: move find_sha1_pack()
  pack: move find_pack_entry() and make it global
  pack: move has_sha1_pack()
  pack: move has_pack_index()
  pack: move for_each_packed_object()

 Makefile                 |    1 +
 builtin/am.c             |    1 +
 builtin/cat-file.c       |    1 +
 builtin/clone.c          |    1 +
 builtin/count-objects.c  |    1 +
 builtin/fetch.c          |    1 +
 builtin/fsck.c           |    1 +
 builtin/gc.c             |    1 +
 builtin/index-pack.c     |    1 +
 builtin/merge.c          |    1 +
 builtin/pack-objects.c   |    1 +
 builtin/pack-redundant.c |    1 +
 builtin/prune-packed.c   |    1 +
 builtin/receive-pack.c   |    1 +
 bulk-checkin.c           |    1 +
 cache.h                  |  122 +--
 connected.c              |    1 +
 diff.c                   |    1 +
 fast-import.c            |    1 +
 fetch-pack.c             |    1 +
 git-compat-util.h        |    2 -
 http-backend.c           |    1 +
 http-push.c              |    1 +
 http-walker.c            |    1 +
 http.c                   |    1 +
 outgoing/packfile.h      |    0
 pack-bitmap.c            |    1 +
 pack-check.c             |    1 +
 packfile.c               | 1896 +++++++++++++++++++++++++++++++++++
 packfile.h               |  138 +++
 path.c                   |    1 +
 reachable.c              |    1 +
 revision.c               |    1 +
 server-info.c            |    1 +
 sha1_file.c              | 2452 ++++++----------------------------------------
 sha1_name.c              |    1 +
 streaming.c              |    1 +
 37 files changed, 2354 insertions(+), 2287 deletions(-)
 create mode 100644 outgoing/packfile.h
 create mode 100644 packfile.c
 create mode 100644 packfile.h

-- 
2.14.1.480.gb18f417b89-goog


^ permalink raw reply	[flat|nested] 88+ messages in thread

* [PATCH v3 01/23] pack: move pack name-related functions
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (37 preceding siblings ...)
  2017-08-18 22:20 ` [PATCH v3 00/23] Move exported packfile funcs to its own file Jonathan Tan
@ 2017-08-18 22:20 ` Jonathan Tan
  2017-08-18 22:20 ` [PATCH v3 02/23] pack: move static state variables Jonathan Tan
                   ` (21 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-18 22:20 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

Currently, sha1_file.c and cache.h contain many functions, both related
to and unrelated to packfiles. This makes both files very large and
causes an unclear separation of concerns.

Create a new file, packfile.c, to hold all packfile-related functions
currently in sha1_file.c. It has a corresponding header packfile.h.

In this commit, the pack name-related functions are moved. Subsequent
commits will move the other functions.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 Makefile                 |  1 +
 builtin/index-pack.c     |  1 +
 builtin/pack-redundant.c |  1 +
 cache.h                  | 23 -----------------------
 fast-import.c            |  1 +
 http.c                   |  1 +
 outgoing/packfile.h      |  0
 packfile.c               | 23 +++++++++++++++++++++++
 packfile.h               | 27 +++++++++++++++++++++++++++
 sha1_file.c              | 23 +----------------------
 10 files changed, 56 insertions(+), 45 deletions(-)
 create mode 100644 outgoing/packfile.h
 create mode 100644 packfile.c
 create mode 100644 packfile.h

diff --git a/Makefile b/Makefile
index 461c845d3..5cdecaa17 100644
--- a/Makefile
+++ b/Makefile
@@ -816,6 +816,7 @@ LIB_OBJS += notes-merge.o
 LIB_OBJS += notes-utils.o
 LIB_OBJS += object.o
 LIB_OBJS += oidset.o
+LIB_OBJS += packfile.o
 LIB_OBJS += pack-bitmap.o
 LIB_OBJS += pack-bitmap-write.o
 LIB_OBJS += pack-check.o
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 26828c1d8..f2be145e1 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -12,6 +12,7 @@
 #include "exec_cmd.h"
 #include "streaming.h"
 #include "thread-utils.h"
+#include "packfile.h"
 
 static const char index_pack_usage[] =
 "git index-pack [-v] [-o <index-file>] [--keep | --keep=<msg>] [--verify] [--strict] (<pack-file> | --stdin [--fix-thin] [<pack-file>])";
diff --git a/builtin/pack-redundant.c b/builtin/pack-redundant.c
index cb1df1c76..aaa813632 100644
--- a/builtin/pack-redundant.c
+++ b/builtin/pack-redundant.c
@@ -7,6 +7,7 @@
 */
 
 #include "builtin.h"
+#include "packfile.h"
 
 #define BLKSIZE 512
 
diff --git a/cache.h b/cache.h
index fcba87a69..aa2b4d390 100644
--- a/cache.h
+++ b/cache.h
@@ -902,20 +902,6 @@ extern void check_repository_format(void);
  */
 extern const char *sha1_file_name(const unsigned char *sha1);
 
-/*
- * Return the name of the (local) packfile with the specified sha1 in
- * its name.  The return value is a pointer to memory that is
- * overwritten each time this function is called.
- */
-extern char *sha1_pack_name(const unsigned char *sha1);
-
-/*
- * Return the name of the (local) pack index file with the specified
- * sha1 in its name.  The return value is a pointer to memory that is
- * overwritten each time this function is called.
- */
-extern char *sha1_pack_index_name(const unsigned char *sha1);
-
 /*
  * Return an abbreviated sha1 unique within this repository's object database.
  * The result will be at least `len` characters long, and will be NUL
@@ -1656,15 +1642,6 @@ extern void pack_report(void);
  */
 extern int odb_mkstemp(struct strbuf *template, const char *pattern);
 
-/*
- * Generate the filename to be used for a pack file with checksum "sha1" and
- * extension "ext". The result is written into the strbuf "buf", overwriting
- * any existing contents. A pointer to buf->buf is returned as a convenience.
- *
- * Example: odb_pack_name(out, sha1, "idx") => ".git/objects/pack/pack-1234..idx"
- */
-extern char *odb_pack_name(struct strbuf *buf, const unsigned char *sha1, const char *ext);
-
 /*
  * Create a pack .keep file named "name" (which should generally be the output
  * of odb_pack_name). Returns a file descriptor opened for writing, or -1 on
diff --git a/fast-import.c b/fast-import.c
index a959161b4..49516d60e 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -167,6 +167,7 @@ Format of STDIN stream:
 #include "quote.h"
 #include "dir.h"
 #include "run-command.h"
+#include "packfile.h"
 
 #define PACK_ID_BITS 16
 #define MAX_PACK_ID ((1<<PACK_ID_BITS)-1)
diff --git a/http.c b/http.c
index c6c010f88..1d1e4ab01 100644
--- a/http.c
+++ b/http.c
@@ -11,6 +11,7 @@
 #include "pkt-line.h"
 #include "gettext.h"
 #include "transport.h"
+#include "packfile.h"
 
 static struct trace_key trace_curl = TRACE_KEY_INIT(CURL);
 #if LIBCURL_VERSION_NUM >= 0x070a08
diff --git a/outgoing/packfile.h b/outgoing/packfile.h
new file mode 100644
index 000000000..e69de29bb
diff --git a/packfile.c b/packfile.c
new file mode 100644
index 000000000..0d191dfd6
--- /dev/null
+++ b/packfile.c
@@ -0,0 +1,23 @@
+#include "cache.h"
+
+char *odb_pack_name(struct strbuf *buf,
+		    const unsigned char *sha1,
+		    const char *ext)
+{
+	strbuf_reset(buf);
+	strbuf_addf(buf, "%s/pack/pack-%s.%s", get_object_directory(),
+		    sha1_to_hex(sha1), ext);
+	return buf->buf;
+}
+
+char *sha1_pack_name(const unsigned char *sha1)
+{
+	static struct strbuf buf = STRBUF_INIT;
+	return odb_pack_name(&buf, sha1, "pack");
+}
+
+char *sha1_pack_index_name(const unsigned char *sha1)
+{
+	static struct strbuf buf = STRBUF_INIT;
+	return odb_pack_name(&buf, sha1, "idx");
+}
diff --git a/packfile.h b/packfile.h
new file mode 100644
index 000000000..3c4a0dbd7
--- /dev/null
+++ b/packfile.h
@@ -0,0 +1,27 @@
+#ifndef PACKFILE_H
+#define PACKFILE_H
+
+/*
+ * Generate the filename to be used for a pack file with checksum "sha1" and
+ * extension "ext". The result is written into the strbuf "buf", overwriting
+ * any existing contents. A pointer to buf->buf is returned as a convenience.
+ *
+ * Example: odb_pack_name(out, sha1, "idx") => ".git/objects/pack/pack-1234..idx"
+ */
+extern char *odb_pack_name(struct strbuf *buf, const unsigned char *sha1, const char *ext);
+
+/*
+ * Return the name of the (local) packfile with the specified sha1 in
+ * its name.  The return value is a pointer to memory that is
+ * overwritten each time this function is called.
+ */
+extern char *sha1_pack_name(const unsigned char *sha1);
+
+/*
+ * Return the name of the (local) pack index file with the specified
+ * sha1 in its name.  The return value is a pointer to memory that is
+ * overwritten each time this function is called.
+ */
+extern char *sha1_pack_index_name(const unsigned char *sha1);
+
+#endif
diff --git a/sha1_file.c b/sha1_file.c
index c888d7e5b..6e7a20b52 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -28,6 +28,7 @@
 #include "list.h"
 #include "mergesort.h"
 #include "quote.h"
+#include "packfile.h"
 
 #define SZ_FMT PRIuMAX
 static inline uintmax_t sz_fmt(size_t s) { return s; }
@@ -278,28 +279,6 @@ static const char *alt_sha1_path(struct alternate_object_database *alt,
 	return buf->buf;
 }
 
- char *odb_pack_name(struct strbuf *buf,
-		     const unsigned char *sha1,
-		     const char *ext)
-{
-	strbuf_reset(buf);
-	strbuf_addf(buf, "%s/pack/pack-%s.%s", get_object_directory(),
-		    sha1_to_hex(sha1), ext);
-	return buf->buf;
-}
-
-char *sha1_pack_name(const unsigned char *sha1)
-{
-	static struct strbuf buf = STRBUF_INIT;
-	return odb_pack_name(&buf, sha1, "pack");
-}
-
-char *sha1_pack_index_name(const unsigned char *sha1)
-{
-	static struct strbuf buf = STRBUF_INIT;
-	return odb_pack_name(&buf, sha1, "idx");
-}
-
 struct alternate_object_database *alt_odb_list;
 static struct alternate_object_database **alt_odb_tail;
 
-- 
2.14.1.480.gb18f417b89-goog


^ permalink raw reply related	[flat|nested] 88+ messages in thread

* [PATCH v3 02/23] pack: move static state variables
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (38 preceding siblings ...)
  2017-08-18 22:20 ` [PATCH v3 01/23] pack: move pack name-related functions Jonathan Tan
@ 2017-08-18 22:20 ` Jonathan Tan
  2017-08-18 22:20 ` [PATCH v3 03/23] pack: move pack_report() Jonathan Tan
                   ` (20 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-18 22:20 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

sha1_file.c declares some static variables that store packfile-related
state. Move them to packfile.c.

They are temporarily made global, but subsequent commits will restore
their scope back to static.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 packfile.c  | 14 ++++++++++++++
 packfile.h  |  9 +++++++++
 sha1_file.c | 13 -------------
 3 files changed, 23 insertions(+), 13 deletions(-)

diff --git a/packfile.c b/packfile.c
index 0d191dfd6..0f46e0617 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1,4 +1,5 @@
 #include "cache.h"
+#include "mru.h"
 
 char *odb_pack_name(struct strbuf *buf,
 		    const unsigned char *sha1,
@@ -21,3 +22,16 @@ char *sha1_pack_index_name(const unsigned char *sha1)
 	static struct strbuf buf = STRBUF_INIT;
 	return odb_pack_name(&buf, sha1, "idx");
 }
+
+unsigned int pack_used_ctr;
+unsigned int pack_mmap_calls;
+unsigned int peak_pack_open_windows;
+unsigned int pack_open_windows;
+unsigned int pack_open_fds;
+unsigned int pack_max_fds;
+size_t peak_pack_mapped;
+size_t pack_mapped;
+struct packed_git *packed_git;
+
+static struct mru packed_git_mru_storage;
+struct mru *packed_git_mru = &packed_git_mru_storage;
diff --git a/packfile.h b/packfile.h
index 3c4a0dbd7..a76bb7cec 100644
--- a/packfile.h
+++ b/packfile.h
@@ -24,4 +24,13 @@ extern char *sha1_pack_name(const unsigned char *sha1);
  */
 extern char *sha1_pack_index_name(const unsigned char *sha1);
 
+extern unsigned int pack_used_ctr;
+extern unsigned int pack_mmap_calls;
+extern unsigned int peak_pack_open_windows;
+extern unsigned int pack_open_windows;
+extern unsigned int pack_open_fds;
+extern unsigned int pack_max_fds;
+extern size_t peak_pack_mapped;
+extern size_t pack_mapped;
+
 #endif
diff --git a/sha1_file.c b/sha1_file.c
index 6e7a20b52..2b5ce9959 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -683,19 +683,6 @@ static int has_loose_object(const unsigned char *sha1)
 	return check_and_freshen(sha1, 0);
 }
 
-static unsigned int pack_used_ctr;
-static unsigned int pack_mmap_calls;
-static unsigned int peak_pack_open_windows;
-static unsigned int pack_open_windows;
-static unsigned int pack_open_fds;
-static unsigned int pack_max_fds;
-static size_t peak_pack_mapped;
-static size_t pack_mapped;
-struct packed_git *packed_git;
-
-static struct mru packed_git_mru_storage;
-struct mru *packed_git_mru = &packed_git_mru_storage;
-
 void pack_report(void)
 {
 	fprintf(stderr,
-- 
2.14.1.480.gb18f417b89-goog


^ permalink raw reply related	[flat|nested] 88+ messages in thread

* [PATCH v3 03/23] pack: move pack_report()
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (39 preceding siblings ...)
  2017-08-18 22:20 ` [PATCH v3 02/23] pack: move static state variables Jonathan Tan
@ 2017-08-18 22:20 ` Jonathan Tan
  2017-08-18 22:20 ` [PATCH v3 04/23] pack: move open_pack_index(), parse_pack_index() Jonathan Tan
                   ` (19 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-18 22:20 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 cache.h     |  2 --
 packfile.c  | 24 ++++++++++++++++++++++++
 packfile.h  |  2 ++
 sha1_file.c | 24 ------------------------
 4 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/cache.h b/cache.h
index aa2b4d390..a0497d469 100644
--- a/cache.h
+++ b/cache.h
@@ -1632,8 +1632,6 @@ unsigned long approximate_object_count(void);
 extern struct packed_git *find_sha1_pack(const unsigned char *sha1,
 					 struct packed_git *packs);
 
-extern void pack_report(void);
-
 /*
  * Create a temporary file rooted in the object database directory, or
  * die on failure. The filename is taken from "pattern", which should have the
diff --git a/packfile.c b/packfile.c
index 0f46e0617..60d9fc3b0 100644
--- a/packfile.c
+++ b/packfile.c
@@ -35,3 +35,27 @@ struct packed_git *packed_git;
 
 static struct mru packed_git_mru_storage;
 struct mru *packed_git_mru = &packed_git_mru_storage;
+
+#define SZ_FMT PRIuMAX
+static inline uintmax_t sz_fmt(size_t s) { return s; }
+
+void pack_report(void)
+{
+	fprintf(stderr,
+		"pack_report: getpagesize()            = %10" SZ_FMT "\n"
+		"pack_report: core.packedGitWindowSize = %10" SZ_FMT "\n"
+		"pack_report: core.packedGitLimit      = %10" SZ_FMT "\n",
+		sz_fmt(getpagesize()),
+		sz_fmt(packed_git_window_size),
+		sz_fmt(packed_git_limit));
+	fprintf(stderr,
+		"pack_report: pack_used_ctr            = %10u\n"
+		"pack_report: pack_mmap_calls          = %10u\n"
+		"pack_report: pack_open_windows        = %10u / %10u\n"
+		"pack_report: pack_mapped              = "
+			"%10" SZ_FMT " / %10" SZ_FMT "\n",
+		pack_used_ctr,
+		pack_mmap_calls,
+		pack_open_windows, peak_pack_open_windows,
+		sz_fmt(pack_mapped), sz_fmt(peak_pack_mapped));
+}
diff --git a/packfile.h b/packfile.h
index a76bb7cec..bfa94c8fe 100644
--- a/packfile.h
+++ b/packfile.h
@@ -33,4 +33,6 @@ extern unsigned int pack_max_fds;
 extern size_t peak_pack_mapped;
 extern size_t pack_mapped;
 
+extern void pack_report(void);
+
 #endif
diff --git a/sha1_file.c b/sha1_file.c
index 2b5ce9959..f7c8152ac 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -30,9 +30,6 @@
 #include "quote.h"
 #include "packfile.h"
 
-#define SZ_FMT PRIuMAX
-static inline uintmax_t sz_fmt(size_t s) { return s; }
-
 const unsigned char null_sha1[20];
 const struct object_id null_oid;
 const struct object_id empty_tree_oid = {
@@ -683,27 +680,6 @@ static int has_loose_object(const unsigned char *sha1)
 	return check_and_freshen(sha1, 0);
 }
 
-void pack_report(void)
-{
-	fprintf(stderr,
-		"pack_report: getpagesize()            = %10" SZ_FMT "\n"
-		"pack_report: core.packedGitWindowSize = %10" SZ_FMT "\n"
-		"pack_report: core.packedGitLimit      = %10" SZ_FMT "\n",
-		sz_fmt(getpagesize()),
-		sz_fmt(packed_git_window_size),
-		sz_fmt(packed_git_limit));
-	fprintf(stderr,
-		"pack_report: pack_used_ctr            = %10u\n"
-		"pack_report: pack_mmap_calls          = %10u\n"
-		"pack_report: pack_open_windows        = %10u / %10u\n"
-		"pack_report: pack_mapped              = "
-			"%10" SZ_FMT " / %10" SZ_FMT "\n",
-		pack_used_ctr,
-		pack_mmap_calls,
-		pack_open_windows, peak_pack_open_windows,
-		sz_fmt(pack_mapped), sz_fmt(peak_pack_mapped));
-}
-
 /*
  * Open and mmap the index file at path, perform a couple of
  * consistency checks, then record its information to p.  Return 0 on
-- 
2.14.1.480.gb18f417b89-goog



* [PATCH v3 04/23] pack: move open_pack_index(), parse_pack_index()
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (40 preceding siblings ...)
  2017-08-18 22:20 ` [PATCH v3 03/23] pack: move pack_report() Jonathan Tan
@ 2017-08-18 22:20 ` Jonathan Tan
  2017-08-18 22:20 ` [PATCH v3 05/23] pack: move release_pack_memory() Jonathan Tan
                   ` (18 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-18 22:20 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

alloc_packed_git() in packfile.c is duplicated from sha1_file.c. In a
subsequent commit, alloc_packed_git() will be removed from sha1_file.c.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 builtin/count-objects.c |   1 +
 builtin/fsck.c          |   1 +
 builtin/pack-objects.c  |   1 +
 cache.h                 |   8 ---
 pack-bitmap.c           |   1 +
 pack-check.c            |   1 +
 packfile.c              | 149 ++++++++++++++++++++++++++++++++++++++++++++++++
 packfile.h              |   8 +++
 sha1_file.c             | 140 ---------------------------------------------
 sha1_name.c             |   1 +
 10 files changed, 163 insertions(+), 148 deletions(-)

diff --git a/builtin/count-objects.c b/builtin/count-objects.c
index 1d82e61f2..33343818c 100644
--- a/builtin/count-objects.c
+++ b/builtin/count-objects.c
@@ -10,6 +10,7 @@
 #include "builtin.h"
 #include "parse-options.h"
 #include "quote.h"
+#include "packfile.h"
 
 static unsigned long garbage;
 static off_t size_garbage;
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 99dea7adf..c56207b21 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -15,6 +15,7 @@
 #include "progress.h"
 #include "streaming.h"
 #include "decorate.h"
+#include "packfile.h"
 
 #define REACHABLE 0x0001
 #define SEEN      0x0002
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 5c5d3d507..08f05cb84 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -25,6 +25,7 @@
 #include "sha1-array.h"
 #include "argv-array.h"
 #include "mru.h"
+#include "packfile.h"
 
 static const char *pack_usage[] = {
 	N_("git pack-objects --stdout [<options>...] [< <ref-list> | < <object-list>]"),
diff --git a/cache.h b/cache.h
index a0497d469..f271033db 100644
--- a/cache.h
+++ b/cache.h
@@ -1611,8 +1611,6 @@ struct pack_entry {
 	struct packed_git *p;
 };
 
-extern struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path);
-
 /* A hook to report invalid files in pack directory */
 #define PACKDIR_FILE_PACK 1
 #define PACKDIR_FILE_IDX 2
@@ -1647,12 +1645,6 @@ extern int odb_mkstemp(struct strbuf *template, const char *pattern);
  */
 extern int odb_pack_keep(const char *name);
 
-/*
- * mmap the index file for the specified packfile (if it is not
- * already mmapped).  Return 0 on success.
- */
-extern int open_pack_index(struct packed_git *);
-
 /*
  * munmap the index file for the specified packfile (if it is
  * currently mmapped).
diff --git a/pack-bitmap.c b/pack-bitmap.c
index 327634cd7..cb3d14ba4 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -9,6 +9,7 @@
 #include "pack-bitmap.h"
 #include "pack-revindex.h"
 #include "pack-objects.h"
+#include "packfile.h"
 
 /*
  * An entry on the bitmap index, representing the bitmap for a given
diff --git a/pack-check.c b/pack-check.c
index 84469168a..2086f5bb7 100644
--- a/pack-check.c
+++ b/pack-check.c
@@ -2,6 +2,7 @@
 #include "pack.h"
 #include "pack-revindex.h"
 #include "progress.h"
+#include "packfile.h"
 
 struct idx_entry {
 	off_t                offset;
diff --git a/packfile.c b/packfile.c
index 60d9fc3b0..6edc43228 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1,5 +1,6 @@
 #include "cache.h"
 #include "mru.h"
+#include "pack.h"
 
 char *odb_pack_name(struct strbuf *buf,
 		    const unsigned char *sha1,
@@ -59,3 +60,151 @@ void pack_report(void)
 		pack_open_windows, peak_pack_open_windows,
 		sz_fmt(pack_mapped), sz_fmt(peak_pack_mapped));
 }
+
+/*
+ * Open and mmap the index file at path, perform a couple of
+ * consistency checks, then record its information to p.  Return 0 on
+ * success.
+ */
+static int check_packed_git_idx(const char *path, struct packed_git *p)
+{
+	void *idx_map;
+	struct pack_idx_header *hdr;
+	size_t idx_size;
+	uint32_t version, nr, i, *index;
+	int fd = git_open(path);
+	struct stat st;
+
+	if (fd < 0)
+		return -1;
+	if (fstat(fd, &st)) {
+		close(fd);
+		return -1;
+	}
+	idx_size = xsize_t(st.st_size);
+	if (idx_size < 4 * 256 + 20 + 20) {
+		close(fd);
+		return error("index file %s is too small", path);
+	}
+	idx_map = xmmap(NULL, idx_size, PROT_READ, MAP_PRIVATE, fd, 0);
+	close(fd);
+
+	hdr = idx_map;
+	if (hdr->idx_signature == htonl(PACK_IDX_SIGNATURE)) {
+		version = ntohl(hdr->idx_version);
+		if (version < 2 || version > 2) {
+			munmap(idx_map, idx_size);
+			return error("index file %s is version %"PRIu32
+				     " and is not supported by this binary"
+				     " (try upgrading GIT to a newer version)",
+				     path, version);
+		}
+	} else
+		version = 1;
+
+	nr = 0;
+	index = idx_map;
+	if (version > 1)
+		index += 2;  /* skip index header */
+	for (i = 0; i < 256; i++) {
+		uint32_t n = ntohl(index[i]);
+		if (n < nr) {
+			munmap(idx_map, idx_size);
+			return error("non-monotonic index %s", path);
+		}
+		nr = n;
+	}
+
+	if (version == 1) {
+		/*
+		 * Total size:
+		 *  - 256 index entries 4 bytes each
+		 *  - 24-byte entries * nr (20-byte sha1 + 4-byte offset)
+		 *  - 20-byte SHA1 of the packfile
+		 *  - 20-byte SHA1 file checksum
+		 */
+		if (idx_size != 4*256 + nr * 24 + 20 + 20) {
+			munmap(idx_map, idx_size);
+			return error("wrong index v1 file size in %s", path);
+		}
+	} else if (version == 2) {
+		/*
+		 * Minimum size:
+		 *  - 8 bytes of header
+		 *  - 256 index entries 4 bytes each
+		 *  - 20-byte sha1 entry * nr
+		 *  - 4-byte crc entry * nr
+		 *  - 4-byte offset entry * nr
+		 *  - 20-byte SHA1 of the packfile
+		 *  - 20-byte SHA1 file checksum
+		 * And after the 4-byte offset table might be a
+		 * variable sized table containing 8-byte entries
+		 * for offsets larger than 2^31.
+		 */
+		unsigned long min_size = 8 + 4*256 + nr*(20 + 4 + 4) + 20 + 20;
+		unsigned long max_size = min_size;
+		if (nr)
+			max_size += (nr - 1)*8;
+		if (idx_size < min_size || idx_size > max_size) {
+			munmap(idx_map, idx_size);
+			return error("wrong index v2 file size in %s", path);
+		}
+		if (idx_size != min_size &&
+		    /*
+		     * make sure we can deal with large pack offsets.
+		     * 31-bit signed offset won't be enough, neither
+		     * 32-bit unsigned one will be.
+		     */
+		    (sizeof(off_t) <= 4)) {
+			munmap(idx_map, idx_size);
+			return error("pack too large for current definition of off_t in %s", path);
+		}
+	}
+
+	p->index_version = version;
+	p->index_data = idx_map;
+	p->index_size = idx_size;
+	p->num_objects = nr;
+	return 0;
+}
+
+int open_pack_index(struct packed_git *p)
+{
+	char *idx_name;
+	size_t len;
+	int ret;
+
+	if (p->index_data)
+		return 0;
+
+	if (!strip_suffix(p->pack_name, ".pack", &len))
+		die("BUG: pack_name does not end in .pack");
+	idx_name = xstrfmt("%.*s.idx", (int)len, p->pack_name);
+	ret = check_packed_git_idx(idx_name, p);
+	free(idx_name);
+	return ret;
+}
+
+static struct packed_git *alloc_packed_git(int extra)
+{
+	struct packed_git *p = xmalloc(st_add(sizeof(*p), extra));
+	memset(p, 0, sizeof(*p));
+	p->pack_fd = -1;
+	return p;
+}
+
+struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path)
+{
+	const char *path = sha1_pack_name(sha1);
+	size_t alloc = st_add(strlen(path), 1);
+	struct packed_git *p = alloc_packed_git(alloc);
+
+	memcpy(p->pack_name, path, alloc); /* includes NUL */
+	hashcpy(p->sha1, sha1);
+	if (check_packed_git_idx(idx_path, p)) {
+		free(p);
+		return NULL;
+	}
+
+	return p;
+}
diff --git a/packfile.h b/packfile.h
index bfa94c8fe..703887d41 100644
--- a/packfile.h
+++ b/packfile.h
@@ -33,6 +33,14 @@ extern unsigned int pack_max_fds;
 extern size_t peak_pack_mapped;
 extern size_t pack_mapped;
 
+extern struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path);
+
 extern void pack_report(void);
 
+/*
+ * mmap the index file for the specified packfile (if it is not
+ * already mmapped).  Return 0 on success.
+ */
+extern int open_pack_index(struct packed_git *);
+
 #endif
diff --git a/sha1_file.c b/sha1_file.c
index f7c8152ac..475d2032d 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -680,130 +680,6 @@ static int has_loose_object(const unsigned char *sha1)
 	return check_and_freshen(sha1, 0);
 }
 
-/*
- * Open and mmap the index file at path, perform a couple of
- * consistency checks, then record its information to p.  Return 0 on
- * success.
- */
-static int check_packed_git_idx(const char *path, struct packed_git *p)
-{
-	void *idx_map;
-	struct pack_idx_header *hdr;
-	size_t idx_size;
-	uint32_t version, nr, i, *index;
-	int fd = git_open(path);
-	struct stat st;
-
-	if (fd < 0)
-		return -1;
-	if (fstat(fd, &st)) {
-		close(fd);
-		return -1;
-	}
-	idx_size = xsize_t(st.st_size);
-	if (idx_size < 4 * 256 + 20 + 20) {
-		close(fd);
-		return error("index file %s is too small", path);
-	}
-	idx_map = xmmap(NULL, idx_size, PROT_READ, MAP_PRIVATE, fd, 0);
-	close(fd);
-
-	hdr = idx_map;
-	if (hdr->idx_signature == htonl(PACK_IDX_SIGNATURE)) {
-		version = ntohl(hdr->idx_version);
-		if (version < 2 || version > 2) {
-			munmap(idx_map, idx_size);
-			return error("index file %s is version %"PRIu32
-				     " and is not supported by this binary"
-				     " (try upgrading GIT to a newer version)",
-				     path, version);
-		}
-	} else
-		version = 1;
-
-	nr = 0;
-	index = idx_map;
-	if (version > 1)
-		index += 2;  /* skip index header */
-	for (i = 0; i < 256; i++) {
-		uint32_t n = ntohl(index[i]);
-		if (n < nr) {
-			munmap(idx_map, idx_size);
-			return error("non-monotonic index %s", path);
-		}
-		nr = n;
-	}
-
-	if (version == 1) {
-		/*
-		 * Total size:
-		 *  - 256 index entries 4 bytes each
-		 *  - 24-byte entries * nr (20-byte sha1 + 4-byte offset)
-		 *  - 20-byte SHA1 of the packfile
-		 *  - 20-byte SHA1 file checksum
-		 */
-		if (idx_size != 4*256 + nr * 24 + 20 + 20) {
-			munmap(idx_map, idx_size);
-			return error("wrong index v1 file size in %s", path);
-		}
-	} else if (version == 2) {
-		/*
-		 * Minimum size:
-		 *  - 8 bytes of header
-		 *  - 256 index entries 4 bytes each
-		 *  - 20-byte sha1 entry * nr
-		 *  - 4-byte crc entry * nr
-		 *  - 4-byte offset entry * nr
-		 *  - 20-byte SHA1 of the packfile
-		 *  - 20-byte SHA1 file checksum
-		 * And after the 4-byte offset table might be a
-		 * variable sized table containing 8-byte entries
-		 * for offsets larger than 2^31.
-		 */
-		unsigned long min_size = 8 + 4*256 + nr*(20 + 4 + 4) + 20 + 20;
-		unsigned long max_size = min_size;
-		if (nr)
-			max_size += (nr - 1)*8;
-		if (idx_size < min_size || idx_size > max_size) {
-			munmap(idx_map, idx_size);
-			return error("wrong index v2 file size in %s", path);
-		}
-		if (idx_size != min_size &&
-		    /*
-		     * make sure we can deal with large pack offsets.
-		     * 31-bit signed offset won't be enough, neither
-		     * 32-bit unsigned one will be.
-		     */
-		    (sizeof(off_t) <= 4)) {
-			munmap(idx_map, idx_size);
-			return error("pack too large for current definition of off_t in %s", path);
-		}
-	}
-
-	p->index_version = version;
-	p->index_data = idx_map;
-	p->index_size = idx_size;
-	p->num_objects = nr;
-	return 0;
-}
-
-int open_pack_index(struct packed_git *p)
-{
-	char *idx_name;
-	size_t len;
-	int ret;
-
-	if (p->index_data)
-		return 0;
-
-	if (!strip_suffix(p->pack_name, ".pack", &len))
-		die("BUG: pack_name does not end in .pack");
-	idx_name = xstrfmt("%.*s.idx", (int)len, p->pack_name);
-	ret = check_packed_git_idx(idx_name, p);
-	free(idx_name);
-	return ret;
-}
-
 static void scan_windows(struct packed_git *p,
 	struct packed_git **lru_p,
 	struct pack_window **lru_w,
@@ -1301,22 +1177,6 @@ struct packed_git *add_packed_git(const char *path, size_t path_len, int local)
 	return p;
 }
 
-struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path)
-{
-	const char *path = sha1_pack_name(sha1);
-	size_t alloc = st_add(strlen(path), 1);
-	struct packed_git *p = alloc_packed_git(alloc);
-
-	memcpy(p->pack_name, path, alloc); /* includes NUL */
-	hashcpy(p->sha1, sha1);
-	if (check_packed_git_idx(idx_path, p)) {
-		free(p);
-		return NULL;
-	}
-
-	return p;
-}
-
 void install_packed_git(struct packed_git *pack)
 {
 	if (pack->pack_fd != -1)
diff --git a/sha1_name.c b/sha1_name.c
index 74fcb6d78..ae237809c 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -9,6 +9,7 @@
 #include "remote.h"
 #include "dir.h"
 #include "sha1-array.h"
+#include "packfile.h"
 
 static int get_sha1_oneline(const char *, unsigned char *, struct commit_list *);
 
-- 
2.14.1.480.gb18f417b89-goog



* [PATCH v3 05/23] pack: move release_pack_memory()
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (41 preceding siblings ...)
  2017-08-18 22:20 ` [PATCH v3 04/23] pack: move open_pack_index(), parse_pack_index() Jonathan Tan
@ 2017-08-18 22:20 ` Jonathan Tan
  2017-08-18 22:20 ` [PATCH v3 06/23] pack: move pack-closing functions Jonathan Tan
                   ` (17 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-18 22:20 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

The function unuse_one_window() needs to be temporarily made global. Its
scope will be restored to static in a subsequent commit.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 git-compat-util.h |  2 --
 packfile.c        | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
 packfile.h        |  4 ++++
 sha1_file.c       | 49 -------------------------------------------------
 4 files changed, 53 insertions(+), 51 deletions(-)

diff --git a/git-compat-util.h b/git-compat-util.h
index db9c22de7..201056e2d 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -749,8 +749,6 @@ const char *inet_ntop(int af, const void *src, char *dst, size_t size);
 extern int git_atexit(void (*handler)(void));
 #endif
 
-extern void release_pack_memory(size_t);
-
 typedef void (*try_to_free_t)(size_t);
 extern try_to_free_t set_try_to_free_routine(try_to_free_t);
 
diff --git a/packfile.c b/packfile.c
index 6edc43228..8daa74ad1 100644
--- a/packfile.c
+++ b/packfile.c
@@ -208,3 +208,52 @@ struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path)
 
 	return p;
 }
+
+static void scan_windows(struct packed_git *p,
+	struct packed_git **lru_p,
+	struct pack_window **lru_w,
+	struct pack_window **lru_l)
+{
+	struct pack_window *w, *w_l;
+
+	for (w_l = NULL, w = p->windows; w; w = w->next) {
+		if (!w->inuse_cnt) {
+			if (!*lru_w || w->last_used < (*lru_w)->last_used) {
+				*lru_p = p;
+				*lru_w = w;
+				*lru_l = w_l;
+			}
+		}
+		w_l = w;
+	}
+}
+
+int unuse_one_window(struct packed_git *current)
+{
+	struct packed_git *p, *lru_p = NULL;
+	struct pack_window *lru_w = NULL, *lru_l = NULL;
+
+	if (current)
+		scan_windows(current, &lru_p, &lru_w, &lru_l);
+	for (p = packed_git; p; p = p->next)
+		scan_windows(p, &lru_p, &lru_w, &lru_l);
+	if (lru_p) {
+		munmap(lru_w->base, lru_w->len);
+		pack_mapped -= lru_w->len;
+		if (lru_l)
+			lru_l->next = lru_w->next;
+		else
+			lru_p->windows = lru_w->next;
+		free(lru_w);
+		pack_open_windows--;
+		return 1;
+	}
+	return 0;
+}
+
+void release_pack_memory(size_t need)
+{
+	size_t cur = pack_mapped;
+	while (need >= (cur - pack_mapped) && unuse_one_window(NULL))
+		; /* nothing */
+}
diff --git a/packfile.h b/packfile.h
index 703887d41..f6fe1c741 100644
--- a/packfile.h
+++ b/packfile.h
@@ -43,4 +43,8 @@ extern void pack_report(void);
  */
 extern int open_pack_index(struct packed_git *);
 
+extern int unuse_one_window(struct packed_git *current);
+
+extern void release_pack_memory(size_t);
+
 #endif
diff --git a/sha1_file.c b/sha1_file.c
index 475d2032d..d51efd78d 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -680,55 +680,6 @@ static int has_loose_object(const unsigned char *sha1)
 	return check_and_freshen(sha1, 0);
 }
 
-static void scan_windows(struct packed_git *p,
-	struct packed_git **lru_p,
-	struct pack_window **lru_w,
-	struct pack_window **lru_l)
-{
-	struct pack_window *w, *w_l;
-
-	for (w_l = NULL, w = p->windows; w; w = w->next) {
-		if (!w->inuse_cnt) {
-			if (!*lru_w || w->last_used < (*lru_w)->last_used) {
-				*lru_p = p;
-				*lru_w = w;
-				*lru_l = w_l;
-			}
-		}
-		w_l = w;
-	}
-}
-
-static int unuse_one_window(struct packed_git *current)
-{
-	struct packed_git *p, *lru_p = NULL;
-	struct pack_window *lru_w = NULL, *lru_l = NULL;
-
-	if (current)
-		scan_windows(current, &lru_p, &lru_w, &lru_l);
-	for (p = packed_git; p; p = p->next)
-		scan_windows(p, &lru_p, &lru_w, &lru_l);
-	if (lru_p) {
-		munmap(lru_w->base, lru_w->len);
-		pack_mapped -= lru_w->len;
-		if (lru_l)
-			lru_l->next = lru_w->next;
-		else
-			lru_p->windows = lru_w->next;
-		free(lru_w);
-		pack_open_windows--;
-		return 1;
-	}
-	return 0;
-}
-
-void release_pack_memory(size_t need)
-{
-	size_t cur = pack_mapped;
-	while (need >= (cur - pack_mapped) && unuse_one_window(NULL))
-		; /* nothing */
-}
-
 static void mmap_limit_check(size_t length)
 {
 	static size_t limit = 0;
-- 
2.14.1.480.gb18f417b89-goog



* [PATCH v3 06/23] pack: move pack-closing functions
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (42 preceding siblings ...)
  2017-08-18 22:20 ` [PATCH v3 05/23] pack: move release_pack_memory() Jonathan Tan
@ 2017-08-18 22:20 ` Jonathan Tan
  2017-08-18 22:20 ` [PATCH v3 07/23] pack: move use_pack() Jonathan Tan
                   ` (16 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-18 22:20 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

The function close_pack_fd() needs to be temporarily made global. Its
scope will be restored to static in a subsequent commit.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 builtin/am.c           |  1 +
 builtin/clone.c        |  1 +
 builtin/fetch.c        |  1 +
 builtin/merge.c        |  1 +
 builtin/receive-pack.c |  1 +
 cache.h                |  8 --------
 packfile.c             | 54 +++++++++++++++++++++++++++++++++++++++++++++++++
 packfile.h             | 11 ++++++++++
 sha1_file.c            | 55 --------------------------------------------------
 9 files changed, 70 insertions(+), 63 deletions(-)

diff --git a/builtin/am.c b/builtin/am.c
index c973bd96d..b9d9ff203 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -31,6 +31,7 @@
 #include "mailinfo.h"
 #include "apply.h"
 #include "string-list.h"
+#include "packfile.h"
 
 /**
  * Returns 1 if the file is empty or does not exist, 0 otherwise.
diff --git a/builtin/clone.c b/builtin/clone.c
index 08b5cc433..13abb075a 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -25,6 +25,7 @@
 #include "remote.h"
 #include "run-command.h"
 #include "connected.h"
+#include "packfile.h"
 
 /*
  * Overall FIXMEs:
diff --git a/builtin/fetch.c b/builtin/fetch.c
index c87e59f3b..c86c36f37 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -17,6 +17,7 @@
 #include "connected.h"
 #include "argv-array.h"
 #include "utf8.h"
+#include "packfile.h"
 
 static const char * const builtin_fetch_usage[] = {
 	N_("git fetch [<options>] [<repository> [<refspec>...]]"),
diff --git a/builtin/merge.c b/builtin/merge.c
index 900bafdb4..45e673dcc 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -32,6 +32,7 @@
 #include "gpg-interface.h"
 #include "sequencer.h"
 #include "string-list.h"
+#include "packfile.h"
 
 #define DEFAULT_TWOHEAD (1<<0)
 #define DEFAULT_OCTOPUS (1<<1)
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index cabdc55e0..0019a484f 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -23,6 +23,7 @@
 #include "fsck.h"
 #include "tmp-objdir.h"
 #include "oidset.h"
+#include "packfile.h"
 
 static const char * const receive_pack_usage[] = {
 	N_("git receive-pack <git-dir>"),
diff --git a/cache.h b/cache.h
index f271033db..bbc56566e 100644
--- a/cache.h
+++ b/cache.h
@@ -1645,15 +1645,7 @@ extern int odb_mkstemp(struct strbuf *template, const char *pattern);
  */
 extern int odb_pack_keep(const char *name);
 
-/*
- * munmap the index file for the specified packfile (if it is
- * currently mmapped).
- */
-extern void close_pack_index(struct packed_git *);
-
 extern unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t, size_t *);
-extern void close_pack_windows(struct packed_git *);
-extern void close_all_packs(void);
 extern void unuse_pack(struct pack_window **);
 extern void clear_delta_base_cache(void);
 extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local);
diff --git a/packfile.c b/packfile.c
index 8daa74ad1..c8e2dbdee 100644
--- a/packfile.c
+++ b/packfile.c
@@ -257,3 +257,57 @@ void release_pack_memory(size_t need)
 	while (need >= (cur - pack_mapped) && unuse_one_window(NULL))
 		; /* nothing */
 }
+
+void close_pack_windows(struct packed_git *p)
+{
+	while (p->windows) {
+		struct pack_window *w = p->windows;
+
+		if (w->inuse_cnt)
+			die("pack '%s' still has open windows to it",
+			    p->pack_name);
+		munmap(w->base, w->len);
+		pack_mapped -= w->len;
+		pack_open_windows--;
+		p->windows = w->next;
+		free(w);
+	}
+}
+
+int close_pack_fd(struct packed_git *p)
+{
+	if (p->pack_fd < 0)
+		return 0;
+
+	close(p->pack_fd);
+	pack_open_fds--;
+	p->pack_fd = -1;
+
+	return 1;
+}
+
+void close_pack_index(struct packed_git *p)
+{
+	if (p->index_data) {
+		munmap((void *)p->index_data, p->index_size);
+		p->index_data = NULL;
+	}
+}
+
+static void close_pack(struct packed_git *p)
+{
+	close_pack_windows(p);
+	close_pack_fd(p);
+	close_pack_index(p);
+}
+
+void close_all_packs(void)
+{
+	struct packed_git *p;
+
+	for (p = packed_git; p; p = p->next)
+		if (p->do_not_close)
+			die("BUG: want to close pack marked 'do-not-close'");
+		else
+			close_pack(p);
+}
diff --git a/packfile.h b/packfile.h
index f6fe1c741..c6a07de62 100644
--- a/packfile.h
+++ b/packfile.h
@@ -43,6 +43,17 @@ extern void pack_report(void);
  */
 extern int open_pack_index(struct packed_git *);
 
+/*
+ * munmap the index file for the specified packfile (if it is
+ * currently mmapped).
+ */
+extern void close_pack_index(struct packed_git *);
+
+extern void close_pack_windows(struct packed_git *);
+extern void close_all_packs(void);
+
+extern int close_pack_fd(struct packed_git *);
+
 extern int unuse_one_window(struct packed_git *current);
 
 extern void release_pack_memory(size_t);
diff --git a/sha1_file.c b/sha1_file.c
index d51efd78d..7913a69f1 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -718,53 +718,6 @@ void *xmmap(void *start, size_t length,
 	return ret;
 }
 
-void close_pack_windows(struct packed_git *p)
-{
-	while (p->windows) {
-		struct pack_window *w = p->windows;
-
-		if (w->inuse_cnt)
-			die("pack '%s' still has open windows to it",
-			    p->pack_name);
-		munmap(w->base, w->len);
-		pack_mapped -= w->len;
-		pack_open_windows--;
-		p->windows = w->next;
-		free(w);
-	}
-}
-
-static int close_pack_fd(struct packed_git *p)
-{
-	if (p->pack_fd < 0)
-		return 0;
-
-	close(p->pack_fd);
-	pack_open_fds--;
-	p->pack_fd = -1;
-
-	return 1;
-}
-
-static void close_pack(struct packed_git *p)
-{
-	close_pack_windows(p);
-	close_pack_fd(p);
-	close_pack_index(p);
-}
-
-void close_all_packs(void)
-{
-	struct packed_git *p;
-
-	for (p = packed_git; p; p = p->next)
-		if (p->do_not_close)
-			die("BUG: want to close pack marked 'do-not-close'");
-		else
-			close_pack(p);
-}
-
-
 /*
  * The LRU pack is the one with the oldest MRU window, preferring packs
  * with no used windows, or the oldest mtime if it has no windows allocated.
@@ -847,14 +800,6 @@ void unuse_pack(struct pack_window **w_cursor)
 	}
 }
 
-void close_pack_index(struct packed_git *p)
-{
-	if (p->index_data) {
-		munmap((void *)p->index_data, p->index_size);
-		p->index_data = NULL;
-	}
-}
-
 static unsigned int get_max_fd_limit(void)
 {
 #ifdef RLIMIT_NOFILE
-- 
2.14.1.480.gb18f417b89-goog



* [PATCH v3 07/23] pack: move use_pack()
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (43 preceding siblings ...)
  2017-08-18 22:20 ` [PATCH v3 06/23] pack: move pack-closing functions Jonathan Tan
@ 2017-08-18 22:20 ` Jonathan Tan
  2017-08-18 22:20 ` [PATCH v3 08/23] pack: move unuse_pack() Jonathan Tan
                   ` (15 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-18 22:20 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

The function open_packed_git() needs to be temporarily made global. Its
scope will be restored to static in a subsequent commit.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 cache.h     |   1 -
 packfile.c  | 303 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 packfile.h  |  14 +--
 sha1_file.c | 285 --------------------------------------------------------
 streaming.c |   1 +
 5 files changed, 298 insertions(+), 306 deletions(-)

diff --git a/cache.h b/cache.h
index bbc56566e..a27018210 100644
--- a/cache.h
+++ b/cache.h
@@ -1645,7 +1645,6 @@ extern int odb_mkstemp(struct strbuf *template, const char *pattern);
  */
 extern int odb_pack_keep(const char *name);
 
-extern unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t, size_t *);
 extern void unuse_pack(struct pack_window **);
 extern void clear_delta_base_cache(void);
 extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local);
diff --git a/packfile.c b/packfile.c
index c8e2dbdee..ea451d27e 100644
--- a/packfile.c
+++ b/packfile.c
@@ -24,14 +24,14 @@ char *sha1_pack_index_name(const unsigned char *sha1)
 	return odb_pack_name(&buf, sha1, "idx");
 }
 
-unsigned int pack_used_ctr;
-unsigned int pack_mmap_calls;
-unsigned int peak_pack_open_windows;
-unsigned int pack_open_windows;
+static unsigned int pack_used_ctr;
+static unsigned int pack_mmap_calls;
+static unsigned int peak_pack_open_windows;
+static unsigned int pack_open_windows;
 unsigned int pack_open_fds;
-unsigned int pack_max_fds;
-size_t peak_pack_mapped;
-size_t pack_mapped;
+static unsigned int pack_max_fds;
+static size_t peak_pack_mapped;
+static size_t pack_mapped;
 struct packed_git *packed_git;
 
 static struct mru packed_git_mru_storage;
@@ -228,7 +228,7 @@ static void scan_windows(struct packed_git *p,
 	}
 }
 
-int unuse_one_window(struct packed_git *current)
+static int unuse_one_window(struct packed_git *current)
 {
 	struct packed_git *p, *lru_p = NULL;
 	struct pack_window *lru_w = NULL, *lru_l = NULL;
@@ -274,7 +274,7 @@ void close_pack_windows(struct packed_git *p)
 	}
 }
 
-int close_pack_fd(struct packed_git *p)
+static int close_pack_fd(struct packed_git *p)
 {
 	if (p->pack_fd < 0)
 		return 0;
@@ -311,3 +311,288 @@ void close_all_packs(void)
 		else
 			close_pack(p);
 }
+
+/*
+ * The LRU pack is the one with the oldest MRU window, preferring packs
+ * with no used windows, or the oldest mtime if it has no windows allocated.
+ */
+static void find_lru_pack(struct packed_git *p, struct packed_git **lru_p, struct pack_window **mru_w, int *accept_windows_inuse)
+{
+	struct pack_window *w, *this_mru_w;
+	int has_windows_inuse = 0;
+
+	/*
+	 * Reject this pack if it has windows and the previously selected
+	 * one does not.  If this pack does not have windows, reject
+	 * it if the pack file is newer than the previously selected one.
+	 */
+	if (*lru_p && !*mru_w && (p->windows || p->mtime > (*lru_p)->mtime))
+		return;
+
+	for (w = this_mru_w = p->windows; w; w = w->next) {
+		/*
+		 * Reject this pack if any of its windows are in use,
+		 * but the previously selected pack did not have any
+		 * inuse windows.  Otherwise, record that this pack
+		 * has windows in use.
+		 */
+		if (w->inuse_cnt) {
+			if (*accept_windows_inuse)
+				has_windows_inuse = 1;
+			else
+				return;
+		}
+
+		if (w->last_used > this_mru_w->last_used)
+			this_mru_w = w;
+
+		/*
+		 * Reject this pack if it has windows that have been
+		 * used more recently than the previously selected pack.
+		 * If the previously selected pack had windows inuse and
+		 * we have not encountered a window in this pack that is
+		 * inuse, skip this check since we prefer a pack with no
+		 * inuse windows to one that has inuse windows.
+		 */
+		if (*mru_w && *accept_windows_inuse == has_windows_inuse &&
+		    this_mru_w->last_used > (*mru_w)->last_used)
+			return;
+	}
+
+	/*
+	 * Select this pack.
+	 */
+	*mru_w = this_mru_w;
+	*lru_p = p;
+	*accept_windows_inuse = has_windows_inuse;
+}
+
+static int close_one_pack(void)
+{
+	struct packed_git *p, *lru_p = NULL;
+	struct pack_window *mru_w = NULL;
+	int accept_windows_inuse = 1;
+
+	for (p = packed_git; p; p = p->next) {
+		if (p->pack_fd == -1)
+			continue;
+		find_lru_pack(p, &lru_p, &mru_w, &accept_windows_inuse);
+	}
+
+	if (lru_p)
+		return close_pack_fd(lru_p);
+
+	return 0;
+}
+
+static unsigned int get_max_fd_limit(void)
+{
+#ifdef RLIMIT_NOFILE
+	{
+		struct rlimit lim;
+
+		if (!getrlimit(RLIMIT_NOFILE, &lim))
+			return lim.rlim_cur;
+	}
+#endif
+
+#ifdef _SC_OPEN_MAX
+	{
+		long open_max = sysconf(_SC_OPEN_MAX);
+		if (0 < open_max)
+			return open_max;
+		/*
+		 * Otherwise, we got -1 for one of the two
+		 * reasons:
+		 *
+		 * (1) sysconf() did not understand _SC_OPEN_MAX
+		 *     and signaled an error with -1; or
+		 * (2) sysconf() said there is no limit.
+		 *
+		 * We _could_ clear errno before calling sysconf() to
+		 * tell these two cases apart and return a huge number
+		 * in the latter case to let the caller cap it to a
+		 * value that is not so selfish, but letting the
+		 * fallback OPEN_MAX codepath take care of these cases
+		 * is a lot simpler.
+		 */
+	}
+#endif
+
+#ifdef OPEN_MAX
+	return OPEN_MAX;
+#else
+	return 1; /* see the caller ;-) */
+#endif
+}
+
+/*
+ * Do not call this directly as this leaks p->pack_fd on error return;
+ * call open_packed_git() instead.
+ */
+static int open_packed_git_1(struct packed_git *p)
+{
+	struct stat st;
+	struct pack_header hdr;
+	unsigned char sha1[20];
+	unsigned char *idx_sha1;
+	long fd_flag;
+
+	if (!p->index_data && open_pack_index(p))
+		return error("packfile %s index unavailable", p->pack_name);
+
+	if (!pack_max_fds) {
+		unsigned int max_fds = get_max_fd_limit();
+
+		/* Save 3 for stdin/stdout/stderr, 22 for work */
+		if (25 < max_fds)
+			pack_max_fds = max_fds - 25;
+		else
+			pack_max_fds = 1;
+	}
+
+	while (pack_max_fds <= pack_open_fds && close_one_pack())
+		; /* nothing */
+
+	p->pack_fd = git_open(p->pack_name);
+	if (p->pack_fd < 0 || fstat(p->pack_fd, &st))
+		return -1;
+	pack_open_fds++;
+
+	/* If we created the struct before we had the pack we lack size. */
+	if (!p->pack_size) {
+		if (!S_ISREG(st.st_mode))
+			return error("packfile %s not a regular file", p->pack_name);
+		p->pack_size = st.st_size;
+	} else if (p->pack_size != st.st_size)
+		return error("packfile %s size changed", p->pack_name);
+
+	/* We leave these file descriptors open with sliding mmap;
+	 * there is no point keeping them open across exec(), though.
+	 */
+	fd_flag = fcntl(p->pack_fd, F_GETFD, 0);
+	if (fd_flag < 0)
+		return error("cannot determine file descriptor flags");
+	fd_flag |= FD_CLOEXEC;
+	if (fcntl(p->pack_fd, F_SETFD, fd_flag) == -1)
+		return error("cannot set FD_CLOEXEC");
+
+	/* Verify we recognize this pack file format. */
+	if (read_in_full(p->pack_fd, &hdr, sizeof(hdr)) != sizeof(hdr))
+		return error("file %s is far too short to be a packfile", p->pack_name);
+	if (hdr.hdr_signature != htonl(PACK_SIGNATURE))
+		return error("file %s is not a GIT packfile", p->pack_name);
+	if (!pack_version_ok(hdr.hdr_version))
+		return error("packfile %s is version %"PRIu32" and not"
+			" supported (try upgrading GIT to a newer version)",
+			p->pack_name, ntohl(hdr.hdr_version));
+
+	/* Verify the pack matches its index. */
+	if (p->num_objects != ntohl(hdr.hdr_entries))
+		return error("packfile %s claims to have %"PRIu32" objects"
+			     " while index indicates %"PRIu32" objects",
+			     p->pack_name, ntohl(hdr.hdr_entries),
+			     p->num_objects);
+	if (lseek(p->pack_fd, p->pack_size - sizeof(sha1), SEEK_SET) == -1)
+		return error("end of packfile %s is unavailable", p->pack_name);
+	if (read_in_full(p->pack_fd, sha1, sizeof(sha1)) != sizeof(sha1))
+		return error("packfile %s signature is unavailable", p->pack_name);
+	idx_sha1 = ((unsigned char *)p->index_data) + p->index_size - 40;
+	if (hashcmp(sha1, idx_sha1))
+		return error("packfile %s does not match index", p->pack_name);
+	return 0;
+}
+
+int open_packed_git(struct packed_git *p)
+{
+	if (!open_packed_git_1(p))
+		return 0;
+	close_pack_fd(p);
+	return -1;
+}
+
+static int in_window(struct pack_window *win, off_t offset)
+{
+	/* We must promise at least 20 bytes (one hash) after the
+	 * offset is available from this window, otherwise the offset
+	 * is not actually in this window and a different window (which
+	 * has that one hash excess) must be used.  This is to support
+	 * the object header and delta base parsing routines below.
+	 */
+	off_t win_off = win->offset;
+	return win_off <= offset
+		&& (offset + 20) <= (win_off + win->len);
+}
+
+unsigned char *use_pack(struct packed_git *p,
+		struct pack_window **w_cursor,
+		off_t offset,
+		size_t *left)
+{
+	struct pack_window *win = *w_cursor;
+
+	/* Since packfiles end in a hash of their content and it's
+	 * pointless to ask for an offset into the middle of that
+	 * hash, and the in_window function above wouldn't match
+	 * don't allow an offset too close to the end of the file.
+	 */
+	if (!p->pack_size && p->pack_fd == -1 && open_packed_git(p))
+		die("packfile %s cannot be accessed", p->pack_name);
+	if (offset > (p->pack_size - 20))
+		die("offset beyond end of packfile (truncated pack?)");
+	if (offset < 0)
+		die(_("offset before end of packfile (broken .idx?)"));
+
+	if (!win || !in_window(win, offset)) {
+		if (win)
+			win->inuse_cnt--;
+		for (win = p->windows; win; win = win->next) {
+			if (in_window(win, offset))
+				break;
+		}
+		if (!win) {
+			size_t window_align = packed_git_window_size / 2;
+			off_t len;
+
+			if (p->pack_fd == -1 && open_packed_git(p))
+				die("packfile %s cannot be accessed", p->pack_name);
+
+			win = xcalloc(1, sizeof(*win));
+			win->offset = (offset / window_align) * window_align;
+			len = p->pack_size - win->offset;
+			if (len > packed_git_window_size)
+				len = packed_git_window_size;
+			win->len = (size_t)len;
+			pack_mapped += win->len;
+			while (packed_git_limit < pack_mapped
+				&& unuse_one_window(p))
+				; /* nothing */
+			win->base = xmmap(NULL, win->len,
+				PROT_READ, MAP_PRIVATE,
+				p->pack_fd, win->offset);
+			if (win->base == MAP_FAILED)
+				die_errno("packfile %s cannot be mapped",
+					  p->pack_name);
+			if (!win->offset && win->len == p->pack_size
+				&& !p->do_not_close)
+				close_pack_fd(p);
+			pack_mmap_calls++;
+			pack_open_windows++;
+			if (pack_mapped > peak_pack_mapped)
+				peak_pack_mapped = pack_mapped;
+			if (pack_open_windows > peak_pack_open_windows)
+				peak_pack_open_windows = pack_open_windows;
+			win->next = p->windows;
+			p->windows = win;
+		}
+	}
+	if (win != *w_cursor) {
+		win->last_used = pack_used_ctr++;
+		win->inuse_cnt++;
+		*w_cursor = win;
+	}
+	offset -= win->offset;
+	if (left)
+		*left = win->len - xsize_t(offset);
+	return win->base + offset;
+}
diff --git a/packfile.h b/packfile.h
index c6a07de62..97cfc5e70 100644
--- a/packfile.h
+++ b/packfile.h
@@ -24,14 +24,7 @@ extern char *sha1_pack_name(const unsigned char *sha1);
  */
 extern char *sha1_pack_index_name(const unsigned char *sha1);
 
-extern unsigned int pack_used_ctr;
-extern unsigned int pack_mmap_calls;
-extern unsigned int peak_pack_open_windows;
-extern unsigned int pack_open_windows;
 extern unsigned int pack_open_fds;
-extern unsigned int pack_max_fds;
-extern size_t peak_pack_mapped;
-extern size_t pack_mapped;
 
 extern struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path);
 
@@ -49,13 +42,12 @@ extern int open_pack_index(struct packed_git *);
  */
 extern void close_pack_index(struct packed_git *);
 
+extern unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t, size_t *);
 extern void close_pack_windows(struct packed_git *);
 extern void close_all_packs(void);
 
-extern int close_pack_fd(struct packed_git *);
-
-extern int unuse_one_window(struct packed_git *current);
-
 extern void release_pack_memory(size_t);
 
+extern int open_packed_git(struct packed_git *p);
+
 #endif
diff --git a/sha1_file.c b/sha1_file.c
index 7913a69f1..7704801d1 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -718,79 +718,6 @@ void *xmmap(void *start, size_t length,
 	return ret;
 }
 
-/*
- * The LRU pack is the one with the oldest MRU window, preferring packs
- * with no used windows, or the oldest mtime if it has no windows allocated.
- */
-static void find_lru_pack(struct packed_git *p, struct packed_git **lru_p, struct pack_window **mru_w, int *accept_windows_inuse)
-{
-	struct pack_window *w, *this_mru_w;
-	int has_windows_inuse = 0;
-
-	/*
-	 * Reject this pack if it has windows and the previously selected
-	 * one does not.  If this pack does not have windows, reject
-	 * it if the pack file is newer than the previously selected one.
-	 */
-	if (*lru_p && !*mru_w && (p->windows || p->mtime > (*lru_p)->mtime))
-		return;
-
-	for (w = this_mru_w = p->windows; w; w = w->next) {
-		/*
-		 * Reject this pack if any of its windows are in use,
-		 * but the previously selected pack did not have any
-		 * inuse windows.  Otherwise, record that this pack
-		 * has windows in use.
-		 */
-		if (w->inuse_cnt) {
-			if (*accept_windows_inuse)
-				has_windows_inuse = 1;
-			else
-				return;
-		}
-
-		if (w->last_used > this_mru_w->last_used)
-			this_mru_w = w;
-
-		/*
-		 * Reject this pack if it has windows that have been
-		 * used more recently than the previously selected pack.
-		 * If the previously selected pack had windows inuse and
-		 * we have not encountered a window in this pack that is
-		 * inuse, skip this check since we prefer a pack with no
-		 * inuse windows to one that has inuse windows.
-		 */
-		if (*mru_w && *accept_windows_inuse == has_windows_inuse &&
-		    this_mru_w->last_used > (*mru_w)->last_used)
-			return;
-	}
-
-	/*
-	 * Select this pack.
-	 */
-	*mru_w = this_mru_w;
-	*lru_p = p;
-	*accept_windows_inuse = has_windows_inuse;
-}
-
-static int close_one_pack(void)
-{
-	struct packed_git *p, *lru_p = NULL;
-	struct pack_window *mru_w = NULL;
-	int accept_windows_inuse = 1;
-
-	for (p = packed_git; p; p = p->next) {
-		if (p->pack_fd == -1)
-			continue;
-		find_lru_pack(p, &lru_p, &mru_w, &accept_windows_inuse);
-	}
-
-	if (lru_p)
-		return close_pack_fd(lru_p);
-
-	return 0;
-}
-
 void unuse_pack(struct pack_window **w_cursor)
 {
 	struct pack_window *w = *w_cursor;
@@ -800,218 +727,6 @@ void unuse_pack(struct pack_window **w_cursor)
 	}
 }
 
-static unsigned int get_max_fd_limit(void)
-{
-#ifdef RLIMIT_NOFILE
-	{
-		struct rlimit lim;
-
-		if (!getrlimit(RLIMIT_NOFILE, &lim))
-			return lim.rlim_cur;
-	}
-#endif
-
-#ifdef _SC_OPEN_MAX
-	{
-		long open_max = sysconf(_SC_OPEN_MAX);
-		if (0 < open_max)
-			return open_max;
-		/*
-		 * Otherwise, we got -1 for one of the two
-		 * reasons:
-		 *
-		 * (1) sysconf() did not understand _SC_OPEN_MAX
-		 *     and signaled an error with -1; or
-		 * (2) sysconf() said there is no limit.
-		 *
-		 * We _could_ clear errno before calling sysconf() to
-		 * tell these two cases apart and return a huge number
-		 * in the latter case to let the caller cap it to a
-		 * value that is not so selfish, but letting the
-		 * fallback OPEN_MAX codepath take care of these cases
-		 * is a lot simpler.
-		 */
-	}
-#endif
-
-#ifdef OPEN_MAX
-	return OPEN_MAX;
-#else
-	return 1; /* see the caller ;-) */
-#endif
-}
-
-/*
- * Do not call this directly as this leaks p->pack_fd on error return;
- * call open_packed_git() instead.
- */
-static int open_packed_git_1(struct packed_git *p)
-{
-	struct stat st;
-	struct pack_header hdr;
-	unsigned char sha1[20];
-	unsigned char *idx_sha1;
-	long fd_flag;
-
-	if (!p->index_data && open_pack_index(p))
-		return error("packfile %s index unavailable", p->pack_name);
-
-	if (!pack_max_fds) {
-		unsigned int max_fds = get_max_fd_limit();
-
-		/* Save 3 for stdin/stdout/stderr, 22 for work */
-		if (25 < max_fds)
-			pack_max_fds = max_fds - 25;
-		else
-			pack_max_fds = 1;
-	}
-
-	while (pack_max_fds <= pack_open_fds && close_one_pack())
-		; /* nothing */
-
-	p->pack_fd = git_open(p->pack_name);
-	if (p->pack_fd < 0 || fstat(p->pack_fd, &st))
-		return -1;
-	pack_open_fds++;
-
-	/* If we created the struct before we had the pack we lack size. */
-	if (!p->pack_size) {
-		if (!S_ISREG(st.st_mode))
-			return error("packfile %s not a regular file", p->pack_name);
-		p->pack_size = st.st_size;
-	} else if (p->pack_size != st.st_size)
-		return error("packfile %s size changed", p->pack_name);
-
-	/* We leave these file descriptors open with sliding mmap;
-	 * there is no point keeping them open across exec(), though.
-	 */
-	fd_flag = fcntl(p->pack_fd, F_GETFD, 0);
-	if (fd_flag < 0)
-		return error("cannot determine file descriptor flags");
-	fd_flag |= FD_CLOEXEC;
-	if (fcntl(p->pack_fd, F_SETFD, fd_flag) == -1)
-		return error("cannot set FD_CLOEXEC");
-
-	/* Verify we recognize this pack file format. */
-	if (read_in_full(p->pack_fd, &hdr, sizeof(hdr)) != sizeof(hdr))
-		return error("file %s is far too short to be a packfile", p->pack_name);
-	if (hdr.hdr_signature != htonl(PACK_SIGNATURE))
-		return error("file %s is not a GIT packfile", p->pack_name);
-	if (!pack_version_ok(hdr.hdr_version))
-		return error("packfile %s is version %"PRIu32" and not"
-			" supported (try upgrading GIT to a newer version)",
-			p->pack_name, ntohl(hdr.hdr_version));
-
-	/* Verify the pack matches its index. */
-	if (p->num_objects != ntohl(hdr.hdr_entries))
-		return error("packfile %s claims to have %"PRIu32" objects"
-			     " while index indicates %"PRIu32" objects",
-			     p->pack_name, ntohl(hdr.hdr_entries),
-			     p->num_objects);
-	if (lseek(p->pack_fd, p->pack_size - sizeof(sha1), SEEK_SET) == -1)
-		return error("end of packfile %s is unavailable", p->pack_name);
-	if (read_in_full(p->pack_fd, sha1, sizeof(sha1)) != sizeof(sha1))
-		return error("packfile %s signature is unavailable", p->pack_name);
-	idx_sha1 = ((unsigned char *)p->index_data) + p->index_size - 40;
-	if (hashcmp(sha1, idx_sha1))
-		return error("packfile %s does not match index", p->pack_name);
-	return 0;
-}
-
-static int open_packed_git(struct packed_git *p)
-{
-	if (!open_packed_git_1(p))
-		return 0;
-	close_pack_fd(p);
-	return -1;
-}
-
-static int in_window(struct pack_window *win, off_t offset)
-{
-	/* We must promise at least 20 bytes (one hash) after the
-	 * offset is available from this window, otherwise the offset
-	 * is not actually in this window and a different window (which
-	 * has that one hash excess) must be used.  This is to support
-	 * the object header and delta base parsing routines below.
-	 */
-	off_t win_off = win->offset;
-	return win_off <= offset
-		&& (offset + 20) <= (win_off + win->len);
-}
-
-unsigned char *use_pack(struct packed_git *p,
-		struct pack_window **w_cursor,
-		off_t offset,
-		size_t *left)
-{
-	struct pack_window *win = *w_cursor;
-
-	/* Since packfiles end in a hash of their content and it's
-	 * pointless to ask for an offset into the middle of that
-	 * hash, and the in_window function above wouldn't match
-	 * don't allow an offset too close to the end of the file.
-	 */
-	if (!p->pack_size && p->pack_fd == -1 && open_packed_git(p))
-		die("packfile %s cannot be accessed", p->pack_name);
-	if (offset > (p->pack_size - 20))
-		die("offset beyond end of packfile (truncated pack?)");
-	if (offset < 0)
-		die(_("offset before end of packfile (broken .idx?)"));
-
-	if (!win || !in_window(win, offset)) {
-		if (win)
-			win->inuse_cnt--;
-		for (win = p->windows; win; win = win->next) {
-			if (in_window(win, offset))
-				break;
-		}
-		if (!win) {
-			size_t window_align = packed_git_window_size / 2;
-			off_t len;
-
-			if (p->pack_fd == -1 && open_packed_git(p))
-				die("packfile %s cannot be accessed", p->pack_name);
-
-			win = xcalloc(1, sizeof(*win));
-			win->offset = (offset / window_align) * window_align;
-			len = p->pack_size - win->offset;
-			if (len > packed_git_window_size)
-				len = packed_git_window_size;
-			win->len = (size_t)len;
-			pack_mapped += win->len;
-			while (packed_git_limit < pack_mapped
-				&& unuse_one_window(p))
-				; /* nothing */
-			win->base = xmmap(NULL, win->len,
-				PROT_READ, MAP_PRIVATE,
-				p->pack_fd, win->offset);
-			if (win->base == MAP_FAILED)
-				die_errno("packfile %s cannot be mapped",
-					  p->pack_name);
-			if (!win->offset && win->len == p->pack_size
-				&& !p->do_not_close)
-				close_pack_fd(p);
-			pack_mmap_calls++;
-			pack_open_windows++;
-			if (pack_mapped > peak_pack_mapped)
-				peak_pack_mapped = pack_mapped;
-			if (pack_open_windows > peak_pack_open_windows)
-				peak_pack_open_windows = pack_open_windows;
-			win->next = p->windows;
-			p->windows = win;
-		}
-	}
-	if (win != *w_cursor) {
-		win->last_used = pack_used_ctr++;
-		win->inuse_cnt++;
-		*w_cursor = win;
-	}
-	offset -= win->offset;
-	if (left)
-		*left = win->len - xsize_t(offset);
-	return win->base + offset;
-}
-
 static struct packed_git *alloc_packed_git(int extra)
 {
 	struct packed_git *p = xmalloc(st_add(sizeof(*p), extra));
diff --git a/streaming.c b/streaming.c
index 9afa66b8b..6f1c60f12 100644
--- a/streaming.c
+++ b/streaming.c
@@ -3,6 +3,7 @@
  */
 #include "cache.h"
 #include "streaming.h"
+#include "packfile.h"
 
 enum input_source {
 	stream_error = -1,
-- 
2.14.1.480.gb18f417b89-goog
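
For readers following the move of use_pack(), the window placement
arithmetic can be looked at in isolation. The following is only a
minimal sketch, not code from the patch: struct window_span,
place_window() and span_contains() are hypothetical names, and plain
int64_t/size_t stand in for off_t and the packed_git fields. It assumes,
as use_pack() does, that a new window starts at the requested offset
rounded down to half the window size and is clamped to the end of the
pack.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct pack_window; only the fields needed here. */
struct window_span {
	int64_t offset; /* start of the window within the pack */
	size_t len;     /* number of bytes covered by the window */
};

/*
 * Compute where a new window would be placed for a requested offset,
 * mirroring the arithmetic in use_pack(): align the start down to half
 * the window size, then clamp the length to what is left of the pack.
 */
static struct window_span place_window(int64_t offset, int64_t pack_size,
				       size_t window_size)
{
	struct window_span w;
	int64_t align = (int64_t)(window_size / 2);
	int64_t len;

	/* Same preconditions use_pack() enforces with die(). */
	assert(align > 0 && offset >= 0 && offset <= pack_size - 20);

	w.offset = (offset / align) * align;
	len = pack_size - w.offset;
	if (len > (int64_t)window_size)
		len = (int64_t)window_size;
	w.len = (size_t)len;
	return w;
}

/* Same test as in_window(): 20 bytes past offset must lie inside the window. */
static int span_contains(const struct window_span *w, int64_t offset)
{
	return w->offset <= offset &&
	       offset + 20 <= w->offset + (int64_t)w->len;
}
```

Because the start is aligned to half a window, consecutive windows
overlap, so an offset near the end of one window (where fewer than 20
bytes remain) is served by the next one instead.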



* [PATCH v3 08/23] pack: move unuse_pack()
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (44 preceding siblings ...)
  2017-08-18 22:20 ` [PATCH v3 07/23] pack: move use_pack() Jonathan Tan
@ 2017-08-18 22:20 ` Jonathan Tan
  2017-08-18 22:20 ` [PATCH v3 09/23] pack: move add_packed_git() Jonathan Tan
                   ` (14 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-18 22:20 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 cache.h     | 1 -
 packfile.c  | 9 +++++++++
 packfile.h  | 1 +
 sha1_file.c | 9 ---------
 4 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/cache.h b/cache.h
index a27018210..0313b0b8d 100644
--- a/cache.h
+++ b/cache.h
@@ -1645,7 +1645,6 @@ extern int odb_mkstemp(struct strbuf *template, const char *pattern);
  */
 extern int odb_pack_keep(const char *name);
 
-extern void unuse_pack(struct pack_window **);
 extern void clear_delta_base_cache(void);
 extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local);
 
diff --git a/packfile.c b/packfile.c
index ea451d27e..0c97c3a1a 100644
--- a/packfile.c
+++ b/packfile.c
@@ -596,3 +596,12 @@ unsigned char *use_pack(struct packed_git *p,
 		*left = win->len - xsize_t(offset);
 	return win->base + offset;
 }
+
+void unuse_pack(struct pack_window **w_cursor)
+{
+	struct pack_window *w = *w_cursor;
+	if (w) {
+		w->inuse_cnt--;
+		*w_cursor = NULL;
+	}
+}
diff --git a/packfile.h b/packfile.h
index 97cfc5e70..b5db490ab 100644
--- a/packfile.h
+++ b/packfile.h
@@ -45,6 +45,7 @@ extern void close_pack_index(struct packed_git *);
 extern unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t, size_t *);
 extern void close_pack_windows(struct packed_git *);
 extern void close_all_packs(void);
+extern void unuse_pack(struct pack_window **);
 
 extern void release_pack_memory(size_t);
 
diff --git a/sha1_file.c b/sha1_file.c
index 7704801d1..84d96d0ab 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -718,15 +718,6 @@ void *xmmap(void *start, size_t length,
 	return ret;
 }
 
-void unuse_pack(struct pack_window **w_cursor)
-{
-	struct pack_window *w = *w_cursor;
-	if (w) {
-		w->inuse_cnt--;
-		*w_cursor = NULL;
-	}
-}
-
 static struct packed_git *alloc_packed_git(int extra)
 {
 	struct packed_git *p = xmalloc(st_add(sizeof(*p), extra));
-- 
2.14.1.480.gb18f417b89-goog
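
As an illustration of the cursor convention that unuse_pack() completes,
here is a simplified sketch. It is not the packfile.c code: struct
demo_window, demo_use() and demo_unuse() are hypothetical, and the
window lookup and mmap handling of use_pack() are left out. It assumes
only what the moved code shows, namely that a cursor holds at most one
reference on a window through inuse_cnt and that dropping the cursor
releases it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pack_window; only the use count is kept. */
struct demo_window {
	unsigned int inuse_cnt;
};

/*
 * Point the cursor at w. If the cursor already refers to w nothing
 * changes; otherwise the previous window is released and w is pinned.
 */
static void demo_use(struct demo_window *w, struct demo_window **cursor)
{
	assert(w != NULL && cursor != NULL);
	if (*cursor == w)
		return;
	if (*cursor)
		(*cursor)->inuse_cnt--;
	w->inuse_cnt++;
	*cursor = w;
}

/* Same shape as unuse_pack(): release the pinned window, clear the cursor. */
static void demo_unuse(struct demo_window **cursor)
{
	struct demo_window *w = *cursor;
	if (w) {
		w->inuse_cnt--;
		*cursor = NULL;
	}
}
```

Clearing the cursor makes a second demo_unuse() on the same cursor a
no-op, which is why callers can release unconditionally on their exit
paths.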



* [PATCH v3 09/23] pack: move add_packed_git()
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (45 preceding siblings ...)
  2017-08-18 22:20 ` [PATCH v3 08/23] pack: move unuse_pack() Jonathan Tan
@ 2017-08-18 22:20 ` Jonathan Tan
  2017-08-18 22:20 ` [PATCH v3 10/23] pack: move install_packed_git() Jonathan Tan
                   ` (13 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-18 22:20 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 cache.h     |  1 -
 connected.c |  1 +
 packfile.c  | 53 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 packfile.h  |  1 +
 sha1_file.c | 61 -------------------------------------------------------------
 5 files changed, 55 insertions(+), 62 deletions(-)

diff --git a/cache.h b/cache.h
index 0313b0b8d..3625509f9 100644
--- a/cache.h
+++ b/cache.h
@@ -1646,7 +1646,6 @@ extern int odb_mkstemp(struct strbuf *template, const char *pattern);
 extern int odb_pack_keep(const char *name);
 
 extern void clear_delta_base_cache(void);
-extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local);
 
 /*
  * Make sure that a pointer access into an mmap'd index file is within bounds,
diff --git a/connected.c b/connected.c
index 136c2ac16..3e3f0148c 100644
--- a/connected.c
+++ b/connected.c
@@ -3,6 +3,7 @@
 #include "sigchain.h"
 #include "connected.h"
 #include "transport.h"
+#include "pack.h"
 
 /*
  * If we feed all the commits we want to verify to this command
diff --git a/packfile.c b/packfile.c
index 0c97c3a1a..d1433d8c7 100644
--- a/packfile.c
+++ b/packfile.c
@@ -605,3 +605,56 @@ void unuse_pack(struct pack_window **w_cursor)
 		*w_cursor = NULL;
 	}
 }
+
+static void try_to_free_pack_memory(size_t size)
+{
+	release_pack_memory(size);
+}
+
+struct packed_git *add_packed_git(const char *path, size_t path_len, int local)
+{
+	static int have_set_try_to_free_routine;
+	struct stat st;
+	size_t alloc;
+	struct packed_git *p;
+
+	if (!have_set_try_to_free_routine) {
+		have_set_try_to_free_routine = 1;
+		set_try_to_free_routine(try_to_free_pack_memory);
+	}
+
+	/*
+	 * Make sure a corresponding .pack file exists and that
+	 * the index looks sane.
+	 */
+	if (!strip_suffix_mem(path, &path_len, ".idx"))
+		return NULL;
+
+	/*
+	 * ".pack" is long enough to hold any suffix we're adding (and
+	 * the use xsnprintf double-checks that)
+	 */
+	alloc = st_add3(path_len, strlen(".pack"), 1);
+	p = alloc_packed_git(alloc);
+	memcpy(p->pack_name, path, path_len);
+
+	xsnprintf(p->pack_name + path_len, alloc - path_len, ".keep");
+	if (!access(p->pack_name, F_OK))
+		p->pack_keep = 1;
+
+	xsnprintf(p->pack_name + path_len, alloc - path_len, ".pack");
+	if (stat(p->pack_name, &st) || !S_ISREG(st.st_mode)) {
+		free(p);
+		return NULL;
+	}
+
+	/* ok, it looks sane as far as we can check without
+	 * actually mapping the pack file.
+	 */
+	p->pack_size = st.st_size;
+	p->pack_local = local;
+	p->mtime = st.st_mtime;
+	if (path_len < 40 || get_sha1_hex(path + path_len - 40, p->sha1))
+		hashclr(p->sha1);
+	return p;
+}
diff --git a/packfile.h b/packfile.h
index b5db490ab..1e932a49e 100644
--- a/packfile.h
+++ b/packfile.h
@@ -46,6 +46,7 @@ extern unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t
 extern void close_pack_windows(struct packed_git *);
 extern void close_all_packs(void);
 extern void unuse_pack(struct pack_window **);
+extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local);
 
 extern void release_pack_memory(size_t);
 
diff --git a/sha1_file.c b/sha1_file.c
index 84d96d0ab..0929fc10e 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -718,67 +718,6 @@ void *xmmap(void *start, size_t length,
 	return ret;
 }
 
-static struct packed_git *alloc_packed_git(int extra)
-{
-	struct packed_git *p = xmalloc(st_add(sizeof(*p), extra));
-	memset(p, 0, sizeof(*p));
-	p->pack_fd = -1;
-	return p;
-}
-
-static void try_to_free_pack_memory(size_t size)
-{
-	release_pack_memory(size);
-}
-
-struct packed_git *add_packed_git(const char *path, size_t path_len, int local)
-{
-	static int have_set_try_to_free_routine;
-	struct stat st;
-	size_t alloc;
-	struct packed_git *p;
-
-	if (!have_set_try_to_free_routine) {
-		have_set_try_to_free_routine = 1;
-		set_try_to_free_routine(try_to_free_pack_memory);
-	}
-
-	/*
-	 * Make sure a corresponding .pack file exists and that
-	 * the index looks sane.
-	 */
-	if (!strip_suffix_mem(path, &path_len, ".idx"))
-		return NULL;
-
-	/*
-	 * ".pack" is long enough to hold any suffix we're adding (and
-	 * the use xsnprintf double-checks that)
-	 */
-	alloc = st_add3(path_len, strlen(".pack"), 1);
-	p = alloc_packed_git(alloc);
-	memcpy(p->pack_name, path, path_len);
-
-	xsnprintf(p->pack_name + path_len, alloc - path_len, ".keep");
-	if (!access(p->pack_name, F_OK))
-		p->pack_keep = 1;
-
-	xsnprintf(p->pack_name + path_len, alloc - path_len, ".pack");
-	if (stat(p->pack_name, &st) || !S_ISREG(st.st_mode)) {
-		free(p);
-		return NULL;
-	}
-
-	/* ok, it looks sane as far as we can check without
-	 * actually mapping the pack file.
-	 */
-	p->pack_size = st.st_size;
-	p->pack_local = local;
-	p->mtime = st.st_mtime;
-	if (path_len < 40 || get_sha1_hex(path + path_len - 40, p->sha1))
-		hashclr(p->sha1);
-	return p;
-}
-
 void install_packed_git(struct packed_git *pack)
 {
 	if (pack->pack_fd != -1)
-- 
2.14.1.480.gb18f417b89-goog



* [PATCH v3 10/23] pack: move install_packed_git()
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (46 preceding siblings ...)
  2017-08-18 22:20 ` [PATCH v3 09/23] pack: move add_packed_git() Jonathan Tan
@ 2017-08-18 22:20 ` Jonathan Tan
  2017-08-18 22:20 ` [PATCH v3 11/23] pack: move {,re}prepare_packed_git and approximate_object_count Jonathan Tan
                   ` (12 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-18 22:20 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 cache.h     |  1 -
 packfile.c  | 11 ++++++++++-
 packfile.h  |  2 ++
 sha1_file.c |  9 ---------
 4 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/cache.h b/cache.h
index 3625509f9..c4d8bee52 100644
--- a/cache.h
+++ b/cache.h
@@ -1619,7 +1619,6 @@ extern void (*report_garbage)(unsigned seen_bits, const char *path);
 
 extern void prepare_packed_git(void);
 extern void reprepare_packed_git(void);
-extern void install_packed_git(struct packed_git *pack);
 
 /*
  * Give a rough count of objects in the repository. This sacrifices accuracy
diff --git a/packfile.c b/packfile.c
index d1433d8c7..9a65aa4f6 100644
--- a/packfile.c
+++ b/packfile.c
@@ -28,7 +28,7 @@ static unsigned int pack_used_ctr;
 static unsigned int pack_mmap_calls;
 static unsigned int peak_pack_open_windows;
 static unsigned int pack_open_windows;
-unsigned int pack_open_fds;
+static unsigned int pack_open_fds;
 static unsigned int pack_max_fds;
 static size_t peak_pack_mapped;
 static size_t pack_mapped;
@@ -658,3 +658,12 @@ struct packed_git *add_packed_git(const char *path, size_t path_len, int local)
 		hashclr(p->sha1);
 	return p;
 }
+
+void install_packed_git(struct packed_git *pack)
+{
+	if (pack->pack_fd != -1)
+		pack_open_fds++;
+
+	pack->next = packed_git;
+	packed_git = pack;
+}
diff --git a/packfile.h b/packfile.h
index 1e932a49e..a18029184 100644
--- a/packfile.h
+++ b/packfile.h
@@ -28,6 +28,8 @@ extern unsigned int pack_open_fds;
 
 extern struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path);
 
+extern void install_packed_git(struct packed_git *pack);
+
 extern void pack_report(void);
 
 /*
diff --git a/sha1_file.c b/sha1_file.c
index 0929fc10e..b77e7e3c3 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -718,15 +718,6 @@ void *xmmap(void *start, size_t length,
 	return ret;
 }
 
-void install_packed_git(struct packed_git *pack)
-{
-	if (pack->pack_fd != -1)
-		pack_open_fds++;
-
-	pack->next = packed_git;
-	packed_git = pack;
-}
-
 void (*report_garbage)(unsigned seen_bits, const char *path);
 
 static void report_helper(const struct string_list *list,
-- 
2.14.1.480.gb18f417b89-goog



* [PATCH v3 11/23] pack: move {,re}prepare_packed_git and approximate_object_count
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (47 preceding siblings ...)
  2017-08-18 22:20 ` [PATCH v3 10/23] pack: move install_packed_git() Jonathan Tan
@ 2017-08-18 22:20 ` Jonathan Tan
  2017-08-18 22:20 ` [PATCH v3 12/23] pack: move unpack_object_header_buffer() Jonathan Tan
                   ` (11 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-18 22:20 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 builtin/gc.c   |   1 +
 bulk-checkin.c |   1 +
 cache.h        |  15 ----
 connected.c    |   2 +-
 fetch-pack.c   |   1 +
 http-backend.c |   1 +
 packfile.c     | 217 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 packfile.h     |  16 ++++-
 path.c         |   1 +
 server-info.c  |   1 +
 sha1_file.c    | 214 --------------------------------------------------------
 11 files changed, 238 insertions(+), 232 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index e6b84475a..3c78fcb9b 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -19,6 +19,7 @@
 #include "sigchain.h"
 #include "argv-array.h"
 #include "commit.h"
+#include "packfile.h"
 
 #define FAILED_RUN "failed to run %s"
 
diff --git a/bulk-checkin.c b/bulk-checkin.c
index 5be7ce5c7..9a1f6c49a 100644
--- a/bulk-checkin.c
+++ b/bulk-checkin.c
@@ -6,6 +6,7 @@
 #include "csum-file.h"
 #include "pack.h"
 #include "strbuf.h"
+#include "packfile.h"
 
 static struct bulk_checkin_state {
 	unsigned plugged:1;
diff --git a/cache.h b/cache.h
index c4d8bee52..63765d481 100644
--- a/cache.h
+++ b/cache.h
@@ -1611,21 +1611,6 @@ struct pack_entry {
 	struct packed_git *p;
 };
 
-/* A hook to report invalid files in pack directory */
-#define PACKDIR_FILE_PACK 1
-#define PACKDIR_FILE_IDX 2
-#define PACKDIR_FILE_GARBAGE 4
-extern void (*report_garbage)(unsigned seen_bits, const char *path);
-
-extern void prepare_packed_git(void);
-extern void reprepare_packed_git(void);
-
-/*
- * Give a rough count of objects in the repository. This sacrifices accuracy
- * for speed.
- */
-unsigned long approximate_object_count(void);
-
 extern struct packed_git *find_sha1_pack(const unsigned char *sha1,
 					 struct packed_git *packs);
 
diff --git a/connected.c b/connected.c
index 3e3f0148c..f416b0505 100644
--- a/connected.c
+++ b/connected.c
@@ -3,7 +3,7 @@
 #include "sigchain.h"
 #include "connected.h"
 #include "transport.h"
-#include "pack.h"
+#include "packfile.h"
 
 /*
  * If we feed all the commits we want to verify to this command
diff --git a/fetch-pack.c b/fetch-pack.c
index fbbc99c88..105506e9a 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -17,6 +17,7 @@
 #include "prio-queue.h"
 #include "sha1-array.h"
 #include "oidset.h"
+#include "packfile.h"
 
 static int transfer_unpack_limit = -1;
 static int fetch_unpack_limit = -1;
diff --git a/http-backend.c b/http-backend.c
index 519025d2c..8076b1d5e 100644
--- a/http-backend.c
+++ b/http-backend.c
@@ -9,6 +9,7 @@
 #include "string-list.h"
 #include "url.h"
 #include "argv-array.h"
+#include "packfile.h"
 
 static const char content_type[] = "Content-Type";
 static const char content_length[] = "Content-Length";
diff --git a/packfile.c b/packfile.c
index 9a65aa4f6..9cf462856 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1,6 +1,9 @@
 #include "cache.h"
 #include "mru.h"
 #include "pack.h"
+#include "dir.h"
+#include "mergesort.h"
+#include "packfile.h"
 
 char *odb_pack_name(struct strbuf *buf,
 		    const unsigned char *sha1,
@@ -667,3 +670,217 @@ void install_packed_git(struct packed_git *pack)
 	pack->next = packed_git;
 	packed_git = pack;
 }
+
+void (*report_garbage)(unsigned seen_bits, const char *path);
+
+static void report_helper(const struct string_list *list,
+			  int seen_bits, int first, int last)
+{
+	if (seen_bits == (PACKDIR_FILE_PACK|PACKDIR_FILE_IDX))
+		return;
+
+	for (; first < last; first++)
+		report_garbage(seen_bits, list->items[first].string);
+}
+
+static void report_pack_garbage(struct string_list *list)
+{
+	int i, baselen = -1, first = 0, seen_bits = 0;
+
+	if (!report_garbage)
+		return;
+
+	string_list_sort(list);
+
+	for (i = 0; i < list->nr; i++) {
+		const char *path = list->items[i].string;
+		if (baselen != -1 &&
+		    strncmp(path, list->items[first].string, baselen)) {
+			report_helper(list, seen_bits, first, i);
+			baselen = -1;
+			seen_bits = 0;
+		}
+		if (baselen == -1) {
+			const char *dot = strrchr(path, '.');
+			if (!dot) {
+				report_garbage(PACKDIR_FILE_GARBAGE, path);
+				continue;
+			}
+			baselen = dot - path + 1;
+			first = i;
+		}
+		if (!strcmp(path + baselen, "pack"))
+			seen_bits |= 1;
+		else if (!strcmp(path + baselen, "idx"))
+			seen_bits |= 2;
+	}
+	report_helper(list, seen_bits, first, list->nr);
+}
+
+static void prepare_packed_git_one(char *objdir, int local)
+{
+	struct strbuf path = STRBUF_INIT;
+	size_t dirnamelen;
+	DIR *dir;
+	struct dirent *de;
+	struct string_list garbage = STRING_LIST_INIT_DUP;
+
+	strbuf_addstr(&path, objdir);
+	strbuf_addstr(&path, "/pack");
+	dir = opendir(path.buf);
+	if (!dir) {
+		if (errno != ENOENT)
+			error_errno("unable to open object pack directory: %s",
+				    path.buf);
+		strbuf_release(&path);
+		return;
+	}
+	strbuf_addch(&path, '/');
+	dirnamelen = path.len;
+	while ((de = readdir(dir)) != NULL) {
+		struct packed_git *p;
+		size_t base_len;
+
+		if (is_dot_or_dotdot(de->d_name))
+			continue;
+
+		strbuf_setlen(&path, dirnamelen);
+		strbuf_addstr(&path, de->d_name);
+
+		base_len = path.len;
+		if (strip_suffix_mem(path.buf, &base_len, ".idx")) {
+			/* Don't reopen a pack we already have. */
+			for (p = packed_git; p; p = p->next) {
+				size_t len;
+				if (strip_suffix(p->pack_name, ".pack", &len) &&
+				    len == base_len &&
+				    !memcmp(p->pack_name, path.buf, len))
+					break;
+			}
+			if (p == NULL &&
+			    /*
+			     * See if it really is a valid .idx file with
+			     * corresponding .pack file that we can map.
+			     */
+			    (p = add_packed_git(path.buf, path.len, local)) != NULL)
+				install_packed_git(p);
+		}
+
+		if (!report_garbage)
+			continue;
+
+		if (ends_with(de->d_name, ".idx") ||
+		    ends_with(de->d_name, ".pack") ||
+		    ends_with(de->d_name, ".bitmap") ||
+		    ends_with(de->d_name, ".keep"))
+			string_list_append(&garbage, path.buf);
+		else
+			report_garbage(PACKDIR_FILE_GARBAGE, path.buf);
+	}
+	closedir(dir);
+	report_pack_garbage(&garbage);
+	string_list_clear(&garbage, 0);
+	strbuf_release(&path);
+}
+
+static int approximate_object_count_valid;
+
+/*
+ * Give a fast, rough count of the number of objects in the repository. This
+ * ignores loose objects completely. If you have a lot of them, then either
+ * you should repack because your performance will be awful, or they are
+ * all unreachable objects about to be pruned, in which case they're not really
+ * interesting as a measure of repo size in the first place.
+ */
+unsigned long approximate_object_count(void)
+{
+	static unsigned long count;
+	if (!approximate_object_count_valid) {
+		struct packed_git *p;
+
+		prepare_packed_git();
+		count = 0;
+		for (p = packed_git; p; p = p->next) {
+			if (open_pack_index(p))
+				continue;
+			count += p->num_objects;
+		}
+	}
+	return count;
+}
+
+static void *get_next_packed_git(const void *p)
+{
+	return ((const struct packed_git *)p)->next;
+}
+
+static void set_next_packed_git(void *p, void *next)
+{
+	((struct packed_git *)p)->next = next;
+}
+
+static int sort_pack(const void *a_, const void *b_)
+{
+	const struct packed_git *a = a_;
+	const struct packed_git *b = b_;
+	int st;
+
+	/*
+	 * Local packs tend to contain objects specific to our
+	 * variant of the project than remote ones.  In addition,
+	 * remote ones could be on a network mounted filesystem.
+	 * Favor local ones for these reasons.
+	 */
+	st = a->pack_local - b->pack_local;
+	if (st)
+		return -st;
+
+	/*
+	 * Younger packs tend to contain more recent objects,
+	 * and more recent objects tend to get accessed more
+	 * often.
+	 */
+	if (a->mtime < b->mtime)
+		return 1;
+	else if (a->mtime == b->mtime)
+		return 0;
+	return -1;
+}
+
+static void rearrange_packed_git(void)
+{
+	packed_git = llist_mergesort(packed_git, get_next_packed_git,
+				     set_next_packed_git, sort_pack);
+}
+
+static void prepare_packed_git_mru(void)
+{
+	struct packed_git *p;
+
+	mru_clear(packed_git_mru);
+	for (p = packed_git; p; p = p->next)
+		mru_append(packed_git_mru, p);
+}
+
+static int prepare_packed_git_run_once = 0;
+void prepare_packed_git(void)
+{
+	struct alternate_object_database *alt;
+
+	if (prepare_packed_git_run_once)
+		return;
+	prepare_packed_git_one(get_object_directory(), 1);
+	prepare_alt_odb();
+	for (alt = alt_odb_list; alt; alt = alt->next)
+		prepare_packed_git_one(alt->path, 0);
+	rearrange_packed_git();
+	prepare_packed_git_mru();
+	prepare_packed_git_run_once = 1;
+}
+
+void reprepare_packed_git(void)
+{
+	approximate_object_count_valid = 0;
+	prepare_packed_git_run_once = 0;
+	prepare_packed_git();
+}
diff --git a/packfile.h b/packfile.h
index a18029184..1cfda1d00 100644
--- a/packfile.h
+++ b/packfile.h
@@ -24,12 +24,24 @@ extern char *sha1_pack_name(const unsigned char *sha1);
  */
 extern char *sha1_pack_index_name(const unsigned char *sha1);
 
-extern unsigned int pack_open_fds;
-
 extern struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path);
 
+/* A hook to report invalid files in pack directory */
+#define PACKDIR_FILE_PACK 1
+#define PACKDIR_FILE_IDX 2
+#define PACKDIR_FILE_GARBAGE 4
+extern void (*report_garbage)(unsigned seen_bits, const char *path);
+
+extern void prepare_packed_git(void);
+extern void reprepare_packed_git(void);
 extern void install_packed_git(struct packed_git *pack);
 
+/*
+ * Give a rough count of objects in the repository. This sacrifices accuracy
+ * for speed.
+ */
+unsigned long approximate_object_count(void);
+
 extern void pack_report(void);
 
 /*
diff --git a/path.c b/path.c
index e50d2befc..b533ec938 100644
--- a/path.c
+++ b/path.c
@@ -9,6 +9,7 @@
 #include "worktree.h"
 #include "submodule-config.h"
 #include "path.h"
+#include "packfile.h"
 
 static int get_st_mode_bits(const char *path, int *mode)
 {
diff --git a/server-info.c b/server-info.c
index 5ec5b1d82..26a6c20b7 100644
--- a/server-info.c
+++ b/server-info.c
@@ -3,6 +3,7 @@
 #include "object.h"
 #include "commit.h"
 #include "tag.h"
+#include "packfile.h"
 
 /*
  * Create the file "path" by writing to a temporary file and renaming
diff --git a/sha1_file.c b/sha1_file.c
index b77e7e3c3..51bb4d1db 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -718,220 +718,6 @@ void *xmmap(void *start, size_t length,
 	return ret;
 }
 
-void (*report_garbage)(unsigned seen_bits, const char *path);
-
-static void report_helper(const struct string_list *list,
-			  int seen_bits, int first, int last)
-{
-	if (seen_bits == (PACKDIR_FILE_PACK|PACKDIR_FILE_IDX))
-		return;
-
-	for (; first < last; first++)
-		report_garbage(seen_bits, list->items[first].string);
-}
-
-static void report_pack_garbage(struct string_list *list)
-{
-	int i, baselen = -1, first = 0, seen_bits = 0;
-
-	if (!report_garbage)
-		return;
-
-	string_list_sort(list);
-
-	for (i = 0; i < list->nr; i++) {
-		const char *path = list->items[i].string;
-		if (baselen != -1 &&
-		    strncmp(path, list->items[first].string, baselen)) {
-			report_helper(list, seen_bits, first, i);
-			baselen = -1;
-			seen_bits = 0;
-		}
-		if (baselen == -1) {
-			const char *dot = strrchr(path, '.');
-			if (!dot) {
-				report_garbage(PACKDIR_FILE_GARBAGE, path);
-				continue;
-			}
-			baselen = dot - path + 1;
-			first = i;
-		}
-		if (!strcmp(path + baselen, "pack"))
-			seen_bits |= 1;
-		else if (!strcmp(path + baselen, "idx"))
-			seen_bits |= 2;
-	}
-	report_helper(list, seen_bits, first, list->nr);
-}
-
-static void prepare_packed_git_one(char *objdir, int local)
-{
-	struct strbuf path = STRBUF_INIT;
-	size_t dirnamelen;
-	DIR *dir;
-	struct dirent *de;
-	struct string_list garbage = STRING_LIST_INIT_DUP;
-
-	strbuf_addstr(&path, objdir);
-	strbuf_addstr(&path, "/pack");
-	dir = opendir(path.buf);
-	if (!dir) {
-		if (errno != ENOENT)
-			error_errno("unable to open object pack directory: %s",
-				    path.buf);
-		strbuf_release(&path);
-		return;
-	}
-	strbuf_addch(&path, '/');
-	dirnamelen = path.len;
-	while ((de = readdir(dir)) != NULL) {
-		struct packed_git *p;
-		size_t base_len;
-
-		if (is_dot_or_dotdot(de->d_name))
-			continue;
-
-		strbuf_setlen(&path, dirnamelen);
-		strbuf_addstr(&path, de->d_name);
-
-		base_len = path.len;
-		if (strip_suffix_mem(path.buf, &base_len, ".idx")) {
-			/* Don't reopen a pack we already have. */
-			for (p = packed_git; p; p = p->next) {
-				size_t len;
-				if (strip_suffix(p->pack_name, ".pack", &len) &&
-				    len == base_len &&
-				    !memcmp(p->pack_name, path.buf, len))
-					break;
-			}
-			if (p == NULL &&
-			    /*
-			     * See if it really is a valid .idx file with
-			     * corresponding .pack file that we can map.
-			     */
-			    (p = add_packed_git(path.buf, path.len, local)) != NULL)
-				install_packed_git(p);
-		}
-
-		if (!report_garbage)
-			continue;
-
-		if (ends_with(de->d_name, ".idx") ||
-		    ends_with(de->d_name, ".pack") ||
-		    ends_with(de->d_name, ".bitmap") ||
-		    ends_with(de->d_name, ".keep"))
-			string_list_append(&garbage, path.buf);
-		else
-			report_garbage(PACKDIR_FILE_GARBAGE, path.buf);
-	}
-	closedir(dir);
-	report_pack_garbage(&garbage);
-	string_list_clear(&garbage, 0);
-	strbuf_release(&path);
-}
-
-static int approximate_object_count_valid;
-
-/*
- * Give a fast, rough count of the number of objects in the repository. This
- * ignores loose objects completely. If you have a lot of them, then either
- * you should repack because your performance will be awful, or they are
- * all unreachable objects about to be pruned, in which case they're not really
- * interesting as a measure of repo size in the first place.
- */
-unsigned long approximate_object_count(void)
-{
-	static unsigned long count;
-	if (!approximate_object_count_valid) {
-		struct packed_git *p;
-
-		prepare_packed_git();
-		count = 0;
-		for (p = packed_git; p; p = p->next) {
-			if (open_pack_index(p))
-				continue;
-			count += p->num_objects;
-		}
-	}
-	return count;
-}
-
-static void *get_next_packed_git(const void *p)
-{
-	return ((const struct packed_git *)p)->next;
-}
-
-static void set_next_packed_git(void *p, void *next)
-{
-	((struct packed_git *)p)->next = next;
-}
-
-static int sort_pack(const void *a_, const void *b_)
-{
-	const struct packed_git *a = a_;
-	const struct packed_git *b = b_;
-	int st;
-
-	/*
-	 * Local packs tend to contain objects specific to our
-	 * variant of the project than remote ones.  In addition,
-	 * remote ones could be on a network mounted filesystem.
-	 * Favor local ones for these reasons.
-	 */
-	st = a->pack_local - b->pack_local;
-	if (st)
-		return -st;
-
-	/*
-	 * Younger packs tend to contain more recent objects,
-	 * and more recent objects tend to get accessed more
-	 * often.
-	 */
-	if (a->mtime < b->mtime)
-		return 1;
-	else if (a->mtime == b->mtime)
-		return 0;
-	return -1;
-}
-
-static void rearrange_packed_git(void)
-{
-	packed_git = llist_mergesort(packed_git, get_next_packed_git,
-				     set_next_packed_git, sort_pack);
-}
-
-static void prepare_packed_git_mru(void)
-{
-	struct packed_git *p;
-
-	mru_clear(packed_git_mru);
-	for (p = packed_git; p; p = p->next)
-		mru_append(packed_git_mru, p);
-}
-
-static int prepare_packed_git_run_once = 0;
-void prepare_packed_git(void)
-{
-	struct alternate_object_database *alt;
-
-	if (prepare_packed_git_run_once)
-		return;
-	prepare_packed_git_one(get_object_directory(), 1);
-	prepare_alt_odb();
-	for (alt = alt_odb_list; alt; alt = alt->next)
-		prepare_packed_git_one(alt->path, 0);
-	rearrange_packed_git();
-	prepare_packed_git_mru();
-	prepare_packed_git_run_once = 1;
-}
-
-void reprepare_packed_git(void)
-{
-	approximate_object_count_valid = 0;
-	prepare_packed_git_run_once = 0;
-	prepare_packed_git();
-}
-
 static void mark_bad_packed_object(struct packed_git *p,
 				   const unsigned char *sha1)
 {
-- 
2.14.1.480.gb18f417b89-goog


^ permalink raw reply related	[flat|nested] 88+ messages in thread
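
The pack ordering that patch 11/23 moves into packfile.c can be shown in isolation. The following is a minimal sketch, not git's code: struct pack_info, compare_packs() and sort_pack_infos() are hypothetical names, and it sorts an array with qsort() where the patch sorts the packed_git linked list with llist_mergesort(). Only the comparison rule follows sort_pack(): local packs first, then newer mtime first.

```c
#include <stdlib.h>
#include <time.h>

/*
 * Hypothetical stand-in for the two fields of struct packed_git
 * that sort_pack() looks at.
 */
struct pack_info {
	int pack_local;	/* 1 if the pack is in our own object directory */
	time_t mtime;	/* modification time of the pack file */
};

/* Same ordering rule as sort_pack(): local before remote, newer before older. */
static int compare_packs(const void *a_, const void *b_)
{
	const struct pack_info *a = a_;
	const struct pack_info *b = b_;
	int st = a->pack_local - b->pack_local;

	if (st)
		return -st;	/* local (1) sorts ahead of non-local (0) */
	if (a->mtime < b->mtime)
		return 1;	/* the older pack goes later */
	if (a->mtime == b->mtime)
		return 0;
	return -1;
}

/* Array-based illustration; the patch itself sorts a linked list. */
static void sort_pack_infos(struct pack_info *packs, size_t nr)
{
	qsort(packs, nr, sizeof(*packs), compare_packs);
}
```

Given the entries {remote, mtime 300}, {local, mtime 100} and {local, mtime 200}, the sorted order is local/200, local/100, remote/300. Unlike a merge sort, qsort() is not guaranteed to be stable, so entries with equal keys may end up in either order in this sketch.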

* [PATCH v3 12/23] pack: move unpack_object_header_buffer()
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (48 preceding siblings ...)
  2017-08-18 22:20 ` [PATCH v3 11/23] pack: move {,re}prepare_packed_git and approximate_object_count Jonathan Tan
@ 2017-08-18 22:20 ` Jonathan Tan
  2017-08-18 22:20 ` [PATCH v3 13/23] pack: move get_size_from_delta() Jonathan Tan
                   ` (10 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-18 22:20 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 cache.h     |  1 -
 packfile.c  | 25 +++++++++++++++++++++++++
 packfile.h  |  2 ++
 sha1_file.c | 25 -------------------------
 4 files changed, 27 insertions(+), 26 deletions(-)

diff --git a/cache.h b/cache.h
index 63765d481..75cc0c497 100644
--- a/cache.h
+++ b/cache.h
@@ -1669,7 +1669,6 @@ extern off_t find_pack_entry_one(const unsigned char *sha1, struct packed_git *)
 
 extern int is_pack_valid(struct packed_git *);
 extern void *unpack_entry(struct packed_git *, off_t, enum object_type *, unsigned long *);
-extern unsigned long unpack_object_header_buffer(const unsigned char *buf, unsigned long len, enum object_type *type, unsigned long *sizep);
 extern unsigned long get_size_from_delta(struct packed_git *, struct pack_window **, off_t);
 extern int unpack_object_header(struct packed_git *, struct pack_window **, off_t *, unsigned long *);
 
diff --git a/packfile.c b/packfile.c
index 9cf462856..43b708812 100644
--- a/packfile.c
+++ b/packfile.c
@@ -884,3 +884,28 @@ void reprepare_packed_git(void)
 	prepare_packed_git_run_once = 0;
 	prepare_packed_git();
 }
+
+unsigned long unpack_object_header_buffer(const unsigned char *buf,
+		unsigned long len, enum object_type *type, unsigned long *sizep)
+{
+	unsigned shift;
+	unsigned long size, c;
+	unsigned long used = 0;
+
+	c = buf[used++];
+	*type = (c >> 4) & 7;
+	size = c & 15;
+	shift = 4;
+	while (c & 0x80) {
+		if (len <= used || bitsizeof(long) <= shift) {
+			error("bad object header");
+			size = used = 0;
+			break;
+		}
+		c = buf[used++];
+		size += (c & 0x7f) << shift;
+		shift += 7;
+	}
+	*sizep = size;
+	return used;
+}
diff --git a/packfile.h b/packfile.h
index 1cfda1d00..9f36e0112 100644
--- a/packfile.h
+++ b/packfile.h
@@ -62,6 +62,8 @@ extern void close_all_packs(void);
 extern void unuse_pack(struct pack_window **);
 extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local);
 
+extern unsigned long unpack_object_header_buffer(const unsigned char *buf, unsigned long len, enum object_type *type, unsigned long *sizep);
+
 extern void release_pack_memory(size_t);
 
 extern int open_packed_git(struct packed_git *p);
diff --git a/sha1_file.c b/sha1_file.c
index 51bb4d1db..b999957b0 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -914,31 +914,6 @@ void *map_sha1_file(const unsigned char *sha1, unsigned long *size)
 	return map_sha1_file_1(NULL, sha1, size);
 }
 
-unsigned long unpack_object_header_buffer(const unsigned char *buf,
-		unsigned long len, enum object_type *type, unsigned long *sizep)
-{
-	unsigned shift;
-	unsigned long size, c;
-	unsigned long used = 0;
-
-	c = buf[used++];
-	*type = (c >> 4) & 7;
-	size = c & 15;
-	shift = 4;
-	while (c & 0x80) {
-		if (len <= used || bitsizeof(long) <= shift) {
-			error("bad object header");
-			size = used = 0;
-			break;
-		}
-		c = buf[used++];
-		size += (c & 0x7f) << shift;
-		shift += 7;
-	}
-	*sizep = size;
-	return used;
-}
-
 static int unpack_sha1_short_header(git_zstream *stream,
 				    unsigned char *map, unsigned long mapsize,
 				    void *buffer, unsigned long bufsiz)
-- 
2.14.1.480.gb18f417b89-goog


^ permalink raw reply related	[flat|nested] 88+ messages in thread
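
For readers unfamiliar with the header encoding that unpack_object_header_buffer() parses, here is a standalone sketch of the same decoding loop. It is an illustration under assumptions, not the function from the patch: decode_object_header() is a hypothetical name, the type is returned as a plain unsigned instead of enum object_type, git's error() and bitsizeof() helpers are replaced with standard C, and an extra zero-length check is added that the original does not have.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Decode a pack object header: the first byte holds a continuation bit,
 * a 3-bit type and the low 4 bits of the size; each following byte adds
 * 7 more size bits while its high bit is set.
 * Returns the number of bytes consumed, or 0 on a malformed header.
 */
static size_t decode_object_header(const unsigned char *buf, size_t len,
				   unsigned *type, unsigned long *sizep)
{
	unsigned shift = 4;
	unsigned long size, c;
	size_t used = 0;

	if (!len)	/* assumption: the original expects at least one byte */
		return 0;

	c = buf[used++];
	*type = (c >> 4) & 7;
	size = c & 15;
	while (c & 0x80) {
		/* Stop if we run out of input or the size no longer fits. */
		if (len <= used || sizeof(long) * CHAR_BIT <= shift) {
			*sizep = 0;
			return 0;
		}
		c = buf[used++];
		size += (c & 0x7f) << shift;
		shift += 7;
	}
	*sizep = size;
	return used;
}
```

With the two bytes 0x95 0x0a, the sketch consumes 2 bytes and yields type 1 and size 165 (5 from the first byte plus 10 << 4 from the second). Passing only the first byte returns 0, because the continuation bit promises a byte that is not there.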

* [PATCH v3 13/23] pack: move get_size_from_delta()
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (49 preceding siblings ...)
  2017-08-18 22:20 ` [PATCH v3 12/23] pack: move unpack_object_header_buffer() Jonathan Tan
@ 2017-08-18 22:20 ` Jonathan Tan
  2017-08-18 22:20 ` [PATCH v3 14/23] pack: move unpack_object_header() Jonathan Tan
                   ` (9 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-18 22:20 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 cache.h     |  1 -
 packfile.c  | 40 ++++++++++++++++++++++++++++++++++++++++
 packfile.h  |  1 +
 sha1_file.c | 39 ---------------------------------------
 4 files changed, 41 insertions(+), 40 deletions(-)

diff --git a/cache.h b/cache.h
index 75cc0c497..87f65aeea 100644
--- a/cache.h
+++ b/cache.h
@@ -1669,7 +1669,6 @@ extern off_t find_pack_entry_one(const unsigned char *sha1, struct packed_git *)
 
 extern int is_pack_valid(struct packed_git *);
 extern void *unpack_entry(struct packed_git *, off_t, enum object_type *, unsigned long *);
-extern unsigned long get_size_from_delta(struct packed_git *, struct pack_window **, off_t);
 extern int unpack_object_header(struct packed_git *, struct pack_window **, off_t *, unsigned long *);
 
 /*
diff --git a/packfile.c b/packfile.c
index 43b708812..fa90b643e 100644
--- a/packfile.c
+++ b/packfile.c
@@ -4,6 +4,7 @@
 #include "dir.h"
 #include "mergesort.h"
 #include "packfile.h"
+#include "delta.h"
 
 char *odb_pack_name(struct strbuf *buf,
 		    const unsigned char *sha1,
@@ -909,3 +910,42 @@ unsigned long unpack_object_header_buffer(const unsigned char *buf,
 	*sizep = size;
 	return used;
 }
+
+unsigned long get_size_from_delta(struct packed_git *p,
+				  struct pack_window **w_curs,
+			          off_t curpos)
+{
+	const unsigned char *data;
+	unsigned char delta_head[20], *in;
+	git_zstream stream;
+	int st;
+
+	memset(&stream, 0, sizeof(stream));
+	stream.next_out = delta_head;
+	stream.avail_out = sizeof(delta_head);
+
+	git_inflate_init(&stream);
+	do {
+		in = use_pack(p, w_curs, curpos, &stream.avail_in);
+		stream.next_in = in;
+		st = git_inflate(&stream, Z_FINISH);
+		curpos += stream.next_in - in;
+	} while ((st == Z_OK || st == Z_BUF_ERROR) &&
+		 stream.total_out < sizeof(delta_head));
+	git_inflate_end(&stream);
+	if ((st != Z_STREAM_END) && stream.total_out != sizeof(delta_head)) {
+		error("delta data unpack-initial failed");
+		return 0;
+	}
+
+	/* Examine the initial part of the delta to figure out
+	 * the result size.
+	 */
+	data = delta_head;
+
+	/* ignore base size */
+	get_delta_hdr_size(&data, delta_head+sizeof(delta_head));
+
+	/* Read the result size */
+	return get_delta_hdr_size(&data, delta_head+sizeof(delta_head));
+}
diff --git a/packfile.h b/packfile.h
index 9f36e0112..9c3bce6b2 100644
--- a/packfile.h
+++ b/packfile.h
@@ -63,6 +63,7 @@ extern void unuse_pack(struct pack_window **);
 extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local);
 
 extern unsigned long unpack_object_header_buffer(const unsigned char *buf, unsigned long len, enum object_type *type, unsigned long *sizep);
+extern unsigned long get_size_from_delta(struct packed_git *, struct pack_window **, off_t);
 
 extern void release_pack_memory(size_t);
 
diff --git a/sha1_file.c b/sha1_file.c
index b999957b0..5d016ad6b 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1100,45 +1100,6 @@ int parse_sha1_header(const char *hdr, unsigned long *sizep)
 	return parse_sha1_header_extended(hdr, &oi, 0);
 }
 
-unsigned long get_size_from_delta(struct packed_git *p,
-				  struct pack_window **w_curs,
-			          off_t curpos)
-{
-	const unsigned char *data;
-	unsigned char delta_head[20], *in;
-	git_zstream stream;
-	int st;
-
-	memset(&stream, 0, sizeof(stream));
-	stream.next_out = delta_head;
-	stream.avail_out = sizeof(delta_head);
-
-	git_inflate_init(&stream);
-	do {
-		in = use_pack(p, w_curs, curpos, &stream.avail_in);
-		stream.next_in = in;
-		st = git_inflate(&stream, Z_FINISH);
-		curpos += stream.next_in - in;
-	} while ((st == Z_OK || st == Z_BUF_ERROR) &&
-		 stream.total_out < sizeof(delta_head));
-	git_inflate_end(&stream);
-	if ((st != Z_STREAM_END) && stream.total_out != sizeof(delta_head)) {
-		error("delta data unpack-initial failed");
-		return 0;
-	}
-
-	/* Examine the initial part of the delta to figure out
-	 * the result size.
-	 */
-	data = delta_head;
-
-	/* ignore base size */
-	get_delta_hdr_size(&data, delta_head+sizeof(delta_head));
-
-	/* Read the result size */
-	return get_delta_hdr_size(&data, delta_head+sizeof(delta_head));
-}
-
 static off_t get_delta_base(struct packed_git *p,
 				    struct pack_window **w_curs,
 				    off_t *curpos,
-- 
2.14.1.480.gb18f417b89-goog


^ permalink raw reply related	[flat|nested] 88+ messages in thread
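
get_size_from_delta() inflates only the first few bytes of a delta and then reads two variable-length sizes from them: the base size, which it discards, and the result size, which it returns. The sketch below shows that second step on an already-inflated buffer. It is modeled on git's get_delta_hdr_size() helper but is not a copy of it: read_delta_size() and delta_result_size() are hypothetical names, and the bounds and shift guards are additions made for this example.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Read one little-endian base-128 size: 7 payload bits per byte,
 * high bit set on every byte except the last. Advances *datap.
 */
static unsigned long read_delta_size(const unsigned char **datap,
				     const unsigned char *top)
{
	const unsigned char *data = *datap;
	unsigned long cmd, size = 0;
	unsigned shift = 0;

	while (data < top) {
		cmd = *data++;
		/* Guard added for this sketch: ignore bits that cannot fit. */
		if (shift < sizeof(size) * CHAR_BIT)
			size |= (cmd & 0x7f) << shift;
		shift += 7;
		if (!(cmd & 0x80))
			break;
	}
	*datap = data;
	return size;
}

/* Skip the base size and return the result size, as get_size_from_delta() does. */
static unsigned long delta_result_size(const unsigned char *delta, size_t len)
{
	const unsigned char *data = delta;
	const unsigned char *top = delta + len;

	read_delta_size(&data, top);	/* ignore base size */
	return read_delta_size(&data, top);
}
```

For the header bytes 0xe8 0x07 0x90 0x4e, the first size decodes to 1000 (0x68 + (7 << 7)) and the second to 10000 (0x10 + (0x4e << 7)), so delta_result_size() returns 10000.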

* [PATCH v3 14/23] pack: move unpack_object_header()
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (50 preceding siblings ...)
  2017-08-18 22:20 ` [PATCH v3 13/23] pack: move get_size_from_delta() Jonathan Tan
@ 2017-08-18 22:20 ` Jonathan Tan
  2017-08-18 22:20 ` [PATCH v3 15/23] pack: move clear_delta_base_cache(), packed_object_info(), unpack_entry() Jonathan Tan
                   ` (8 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-18 22:20 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 cache.h     |  1 -
 packfile.c  | 26 ++++++++++++++++++++++++++
 packfile.h  |  1 +
 sha1_file.c | 26 --------------------------
 4 files changed, 27 insertions(+), 27 deletions(-)

diff --git a/cache.h b/cache.h
index 87f65aeea..7adbc587d 100644
--- a/cache.h
+++ b/cache.h
@@ -1669,7 +1669,6 @@ extern off_t find_pack_entry_one(const unsigned char *sha1, struct packed_git *)
 
 extern int is_pack_valid(struct packed_git *);
 extern void *unpack_entry(struct packed_git *, off_t, enum object_type *, unsigned long *);
-extern int unpack_object_header(struct packed_git *, struct pack_window **, off_t *, unsigned long *);
 
 /*
  * Iterate over the files in the loose-object parts of the object
diff --git a/packfile.c b/packfile.c
index fa90b643e..3543b37b8 100644
--- a/packfile.c
+++ b/packfile.c
@@ -949,3 +949,29 @@ unsigned long get_size_from_delta(struct packed_git *p,
 	/* Read the result size */
 	return get_delta_hdr_size(&data, delta_head+sizeof(delta_head));
 }
+
+int unpack_object_header(struct packed_git *p,
+			 struct pack_window **w_curs,
+			 off_t *curpos,
+			 unsigned long *sizep)
+{
+	unsigned char *base;
+	size_t left;
+	size_t used;
+	enum object_type type;
+
+	/* use_pack() assures us we have [base, base + 20) available
+	 * as a range that we can look at.  (Its actually the hash
+	 * size that is assured.)  With our object header encoding
+	 * the maximum deflated object size is 2^137, which is just
+	 * insane, so we know won't exceed what we have been given.
+	 */
+	base = use_pack(p, w_curs, *curpos, &left);
+	used = unpack_object_header_buffer(base, left, &type, sizep);
+	if (!used) {
+		type = OBJ_BAD;
+	} else
+		*curpos += used;
+
+	return type;
+}
diff --git a/packfile.h b/packfile.h
index 9c3bce6b2..d22a528b5 100644
--- a/packfile.h
+++ b/packfile.h
@@ -64,6 +64,7 @@ extern struct packed_git *add_packed_git(const char *path, size_t path_len, int
 
 extern unsigned long unpack_object_header_buffer(const unsigned char *buf, unsigned long len, enum object_type *type, unsigned long *sizep);
 extern unsigned long get_size_from_delta(struct packed_git *, struct pack_window **, off_t);
+extern int unpack_object_header(struct packed_git *, struct pack_window **, off_t *, unsigned long *);
 
 extern void release_pack_memory(size_t);
 
diff --git a/sha1_file.c b/sha1_file.c
index 5d016ad6b..681dcf1c0 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1171,32 +1171,6 @@ static const unsigned char *get_delta_base_sha1(struct packed_git *p,
 		return NULL;
 }
 
-int unpack_object_header(struct packed_git *p,
-			 struct pack_window **w_curs,
-			 off_t *curpos,
-			 unsigned long *sizep)
-{
-	unsigned char *base;
-	size_t left;
-	size_t used;
-	enum object_type type;
-
-	/* use_pack() assures us we have [base, base + 20) available
-	 * as a range that we can look at.  (Its actually the hash
-	 * size that is assured.)  With our object header encoding
-	 * the maximum deflated object size is 2^137, which is just
-	 * insane, so we know won't exceed what we have been given.
-	 */
-	base = use_pack(p, w_curs, *curpos, &left);
-	used = unpack_object_header_buffer(base, left, &type, sizep);
-	if (!used) {
-		type = OBJ_BAD;
-	} else
-		*curpos += used;
-
-	return type;
-}
-
 static int retry_bad_packed_offset(struct packed_git *p, off_t obj_offset)
 {
 	int type;
-- 
2.14.1.480.gb18f417b89-goog


^ permalink raw reply related	[flat|nested] 88+ messages in thread

* [PATCH v3 15/23] pack: move clear_delta_base_cache(), packed_object_info(), unpack_entry()
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (51 preceding siblings ...)
  2017-08-18 22:20 ` [PATCH v3 14/23] pack: move unpack_object_header() Jonathan Tan
@ 2017-08-18 22:20 ` Jonathan Tan
  2017-08-18 22:20 ` [PATCH v3 16/23] pack: move nth_packed_object_{sha1,oid} Jonathan Tan
                   ` (7 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-18 22:20 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

Both sha1_file.c and packfile.c now need read_object(), so a copy of
read_object() was created in packfile.c.

This patch makes both mark_bad_packed_object() and has_packed_and_bad()
global. Unlike in most of the other patches in this series, these two
functions need to remain global.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 cache.h     |   7 -
 packfile.c  | 661 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 packfile.h  |  10 +
 sha1_file.c | 677 ++----------------------------------------------------------
 4 files changed, 685 insertions(+), 670 deletions(-)

diff --git a/cache.h b/cache.h
index 7adbc587d..11aa18e6a 100644
--- a/cache.h
+++ b/cache.h
@@ -1194,9 +1194,6 @@ extern void *map_sha1_file(const unsigned char *sha1, unsigned long *size);
 extern int unpack_sha1_header(git_zstream *stream, unsigned char *map, unsigned long mapsize, void *buffer, unsigned long bufsiz);
 extern int parse_sha1_header(const char *hdr, unsigned long *sizep);
 
-/* global flag to enable extra checks when accessing packed objects */
-extern int do_check_packed_object_crc;
-
 extern int check_sha1_signature(const unsigned char *sha1, void *buf, unsigned long size, const char *type);
 
 extern int finalize_object_file(const char *tmpfile, const char *filename);
@@ -1629,8 +1626,6 @@ extern int odb_mkstemp(struct strbuf *template, const char *pattern);
  */
 extern int odb_pack_keep(const char *name);
 
-extern void clear_delta_base_cache(void);
-
 /*
  * Make sure that a pointer access into an mmap'd index file is within bounds,
  * and can provide at least 8 bytes of data.
@@ -1668,7 +1663,6 @@ extern off_t nth_packed_object_offset(const struct packed_git *, uint32_t n);
 extern off_t find_pack_entry_one(const unsigned char *sha1, struct packed_git *);
 
 extern int is_pack_valid(struct packed_git *);
-extern void *unpack_entry(struct packed_git *, off_t, enum object_type *, unsigned long *);
 
 /*
  * Iterate over the files in the loose-object parts of the object
@@ -1779,7 +1773,6 @@ struct object_info {
 /* Do not retry packed storage after checking packed and loose storage */
 #define OBJECT_INFO_QUICK 8
 extern int sha1_object_info_extended(const unsigned char *, struct object_info *, unsigned flags);
-extern int packed_object_info(struct packed_git *pack, off_t offset, struct object_info *);
 
 /* Dumb servers support */
 extern int update_server_info(int);
diff --git a/packfile.c b/packfile.c
index 3543b37b8..624cc109e 100644
--- a/packfile.c
+++ b/packfile.c
@@ -5,6 +5,8 @@
 #include "mergesort.h"
 #include "packfile.h"
 #include "delta.h"
+#include "list.h"
+#include "streaming.h"
 
 char *odb_pack_name(struct strbuf *buf,
 		    const unsigned char *sha1,
@@ -975,3 +977,662 @@ int unpack_object_header(struct packed_git *p,
 
 	return type;
 }
+
+void mark_bad_packed_object(struct packed_git *p, const unsigned char *sha1)
+{
+	unsigned i;
+	for (i = 0; i < p->num_bad_objects; i++)
+		if (!hashcmp(sha1, p->bad_object_sha1 + GIT_SHA1_RAWSZ * i))
+			return;
+	p->bad_object_sha1 = xrealloc(p->bad_object_sha1,
+				      st_mult(GIT_MAX_RAWSZ,
+					      st_add(p->num_bad_objects, 1)));
+	hashcpy(p->bad_object_sha1 + GIT_SHA1_RAWSZ * p->num_bad_objects, sha1);
+	p->num_bad_objects++;
+}
+
+const struct packed_git *has_packed_and_bad(const unsigned char *sha1)
+{
+	struct packed_git *p;
+	unsigned i;
+
+	for (p = packed_git; p; p = p->next)
+		for (i = 0; i < p->num_bad_objects; i++)
+			if (!hashcmp(sha1, p->bad_object_sha1 + 20 * i))
+				return p;
+	return NULL;
+}
+
+static off_t get_delta_base(struct packed_git *p,
+				    struct pack_window **w_curs,
+				    off_t *curpos,
+				    enum object_type type,
+				    off_t delta_obj_offset)
+{
+	unsigned char *base_info = use_pack(p, w_curs, *curpos, NULL);
+	off_t base_offset;
+
+	/* use_pack() assured us we have [base_info, base_info + 20)
+	 * as a range that we can look at without walking off the
+	 * end of the mapped window.  Its actually the hash size
+	 * that is assured.  An OFS_DELTA longer than the hash size
+	 * is stupid, as then a REF_DELTA would be smaller to store.
+	 */
+	if (type == OBJ_OFS_DELTA) {
+		unsigned used = 0;
+		unsigned char c = base_info[used++];
+		base_offset = c & 127;
+		while (c & 128) {
+			base_offset += 1;
+			if (!base_offset || MSB(base_offset, 7))
+				return 0;  /* overflow */
+			c = base_info[used++];
+			base_offset = (base_offset << 7) + (c & 127);
+		}
+		base_offset = delta_obj_offset - base_offset;
+		if (base_offset <= 0 || base_offset >= delta_obj_offset)
+			return 0;  /* out of bound */
+		*curpos += used;
+	} else if (type == OBJ_REF_DELTA) {
+		/* The base entry _must_ be in the same pack */
+		base_offset = find_pack_entry_one(base_info, p);
+		*curpos += 20;
+	} else
+		die("I am totally screwed");
+	return base_offset;
+}
+
+/*
+ * Like get_delta_base above, but we return the sha1 instead of the pack
+ * offset. This means it is cheaper for REF deltas (we do not have to do
+ * the final object lookup), but more expensive for OFS deltas (we
+ * have to load the revidx to convert the offset back into a sha1).
+ */
+static const unsigned char *get_delta_base_sha1(struct packed_git *p,
+						struct pack_window **w_curs,
+						off_t curpos,
+						enum object_type type,
+						off_t delta_obj_offset)
+{
+	if (type == OBJ_REF_DELTA) {
+		unsigned char *base = use_pack(p, w_curs, curpos, NULL);
+		return base;
+	} else if (type == OBJ_OFS_DELTA) {
+		struct revindex_entry *revidx;
+		off_t base_offset = get_delta_base(p, w_curs, &curpos,
+						   type, delta_obj_offset);
+
+		if (!base_offset)
+			return NULL;
+
+		revidx = find_pack_revindex(p, base_offset);
+		if (!revidx)
+			return NULL;
+
+		return nth_packed_object_sha1(p, revidx->nr);
+	} else
+		return NULL;
+}
+
+static int retry_bad_packed_offset(struct packed_git *p, off_t obj_offset)
+{
+	int type;
+	struct revindex_entry *revidx;
+	const unsigned char *sha1;
+	revidx = find_pack_revindex(p, obj_offset);
+	if (!revidx)
+		return OBJ_BAD;
+	sha1 = nth_packed_object_sha1(p, revidx->nr);
+	mark_bad_packed_object(p, sha1);
+	type = sha1_object_info(sha1, NULL);
+	if (type <= OBJ_NONE)
+		return OBJ_BAD;
+	return type;
+}
+
+#define POI_STACK_PREALLOC 64
+
+static enum object_type packed_to_object_type(struct packed_git *p,
+					      off_t obj_offset,
+					      enum object_type type,
+					      struct pack_window **w_curs,
+					      off_t curpos)
+{
+	off_t small_poi_stack[POI_STACK_PREALLOC];
+	off_t *poi_stack = small_poi_stack;
+	int poi_stack_nr = 0, poi_stack_alloc = POI_STACK_PREALLOC;
+
+	while (type == OBJ_OFS_DELTA || type == OBJ_REF_DELTA) {
+		off_t base_offset;
+		unsigned long size;
+		/* Push the object we're going to leave behind */
+		if (poi_stack_nr >= poi_stack_alloc && poi_stack == small_poi_stack) {
+			poi_stack_alloc = alloc_nr(poi_stack_nr);
+			ALLOC_ARRAY(poi_stack, poi_stack_alloc);
+			memcpy(poi_stack, small_poi_stack, sizeof(off_t)*poi_stack_nr);
+		} else {
+			ALLOC_GROW(poi_stack, poi_stack_nr+1, poi_stack_alloc);
+		}
+		poi_stack[poi_stack_nr++] = obj_offset;
+		/* If parsing the base offset fails, just unwind */
+		base_offset = get_delta_base(p, w_curs, &curpos, type, obj_offset);
+		if (!base_offset)
+			goto unwind;
+		curpos = obj_offset = base_offset;
+		type = unpack_object_header(p, w_curs, &curpos, &size);
+		if (type <= OBJ_NONE) {
+			/* If getting the base itself fails, we first
+			 * retry the base, otherwise unwind */
+			type = retry_bad_packed_offset(p, base_offset);
+			if (type > OBJ_NONE)
+				goto out;
+			goto unwind;
+		}
+	}
+
+	switch (type) {
+	case OBJ_BAD:
+	case OBJ_COMMIT:
+	case OBJ_TREE:
+	case OBJ_BLOB:
+	case OBJ_TAG:
+		break;
+	default:
+		error("unknown object type %i at offset %"PRIuMAX" in %s",
+		      type, (uintmax_t)obj_offset, p->pack_name);
+		type = OBJ_BAD;
+	}
+
+out:
+	if (poi_stack != small_poi_stack)
+		free(poi_stack);
+	return type;
+
+unwind:
+	while (poi_stack_nr) {
+		obj_offset = poi_stack[--poi_stack_nr];
+		type = retry_bad_packed_offset(p, obj_offset);
+		if (type > OBJ_NONE)
+			goto out;
+	}
+	type = OBJ_BAD;
+	goto out;
+}
+
+static struct hashmap delta_base_cache;
+static size_t delta_base_cached;
+
+static LIST_HEAD(delta_base_cache_lru);
+
+struct delta_base_cache_key {
+	struct packed_git *p;
+	off_t base_offset;
+};
+
+struct delta_base_cache_entry {
+	struct hashmap hash;
+	struct delta_base_cache_key key;
+	struct list_head lru;
+	void *data;
+	unsigned long size;
+	enum object_type type;
+};
+
+static unsigned int pack_entry_hash(struct packed_git *p, off_t base_offset)
+{
+	unsigned int hash;
+
+	hash = (unsigned int)(intptr_t)p + (unsigned int)base_offset;
+	hash += (hash >> 8) + (hash >> 16);
+	return hash;
+}
+
+static struct delta_base_cache_entry *
+get_delta_base_cache_entry(struct packed_git *p, off_t base_offset)
+{
+	struct hashmap_entry entry;
+	struct delta_base_cache_key key;
+
+	if (!delta_base_cache.cmpfn)
+		return NULL;
+
+	hashmap_entry_init(&entry, pack_entry_hash(p, base_offset));
+	key.p = p;
+	key.base_offset = base_offset;
+	return hashmap_get(&delta_base_cache, &entry, &key);
+}
+
+static int delta_base_cache_key_eq(const struct delta_base_cache_key *a,
+				   const struct delta_base_cache_key *b)
+{
+	return a->p == b->p && a->base_offset == b->base_offset;
+}
+
+static int delta_base_cache_hash_cmp(const void *unused_cmp_data,
+				     const void *va, const void *vb,
+				     const void *vkey)
+{
+	const struct delta_base_cache_entry *a = va, *b = vb;
+	const struct delta_base_cache_key *key = vkey;
+	if (key)
+		return !delta_base_cache_key_eq(&a->key, key);
+	else
+		return !delta_base_cache_key_eq(&a->key, &b->key);
+}
+
+static int in_delta_base_cache(struct packed_git *p, off_t base_offset)
+{
+	return !!get_delta_base_cache_entry(p, base_offset);
+}
+
+/*
+ * Remove the entry from the cache, but do _not_ free the associated
+ * entry data. The caller takes ownership of the "data" buffer, and
+ * should copy out any fields it wants before detaching.
+ */
+static void detach_delta_base_cache_entry(struct delta_base_cache_entry *ent)
+{
+	hashmap_remove(&delta_base_cache, ent, &ent->key);
+	list_del(&ent->lru);
+	delta_base_cached -= ent->size;
+	free(ent);
+}
+
+static void *cache_or_unpack_entry(struct packed_git *p, off_t base_offset,
+	unsigned long *base_size, enum object_type *type)
+{
+	struct delta_base_cache_entry *ent;
+
+	ent = get_delta_base_cache_entry(p, base_offset);
+	if (!ent)
+		return unpack_entry(p, base_offset, type, base_size);
+
+	if (type)
+		*type = ent->type;
+	if (base_size)
+		*base_size = ent->size;
+	return xmemdupz(ent->data, ent->size);
+}
+
+static inline void release_delta_base_cache(struct delta_base_cache_entry *ent)
+{
+	free(ent->data);
+	detach_delta_base_cache_entry(ent);
+}
+
+void clear_delta_base_cache(void)
+{
+	struct list_head *lru, *tmp;
+	list_for_each_safe(lru, tmp, &delta_base_cache_lru) {
+		struct delta_base_cache_entry *entry =
+			list_entry(lru, struct delta_base_cache_entry, lru);
+		release_delta_base_cache(entry);
+	}
+}
+
+static void add_delta_base_cache(struct packed_git *p, off_t base_offset,
+	void *base, unsigned long base_size, enum object_type type)
+{
+	struct delta_base_cache_entry *ent = xmalloc(sizeof(*ent));
+	struct list_head *lru, *tmp;
+
+	delta_base_cached += base_size;
+
+	list_for_each_safe(lru, tmp, &delta_base_cache_lru) {
+		struct delta_base_cache_entry *f =
+			list_entry(lru, struct delta_base_cache_entry, lru);
+		if (delta_base_cached <= delta_base_cache_limit)
+			break;
+		release_delta_base_cache(f);
+	}
+
+	ent->key.p = p;
+	ent->key.base_offset = base_offset;
+	ent->type = type;
+	ent->data = base;
+	ent->size = base_size;
+	list_add_tail(&ent->lru, &delta_base_cache_lru);
+
+	if (!delta_base_cache.cmpfn)
+		hashmap_init(&delta_base_cache, delta_base_cache_hash_cmp, NULL, 0);
+	hashmap_entry_init(ent, pack_entry_hash(p, base_offset));
+	hashmap_add(&delta_base_cache, ent);
+}
+
+int packed_object_info(struct packed_git *p, off_t obj_offset,
+		       struct object_info *oi)
+{
+	struct pack_window *w_curs = NULL;
+	unsigned long size;
+	off_t curpos = obj_offset;
+	enum object_type type;
+
+	/*
+	 * We always get the representation type, but only convert it to
+	 * a "real" type later if the caller is interested.
+	 */
+	if (oi->contentp) {
+		*oi->contentp = cache_or_unpack_entry(p, obj_offset, oi->sizep,
+						      &type);
+		if (!*oi->contentp)
+			type = OBJ_BAD;
+	} else {
+		type = unpack_object_header(p, &w_curs, &curpos, &size);
+	}
+
+	if (!oi->contentp && oi->sizep) {
+		if (type == OBJ_OFS_DELTA || type == OBJ_REF_DELTA) {
+			off_t tmp_pos = curpos;
+			off_t base_offset = get_delta_base(p, &w_curs, &tmp_pos,
+							   type, obj_offset);
+			if (!base_offset) {
+				type = OBJ_BAD;
+				goto out;
+			}
+			*oi->sizep = get_size_from_delta(p, &w_curs, tmp_pos);
+			if (*oi->sizep == 0) {
+				type = OBJ_BAD;
+				goto out;
+			}
+		} else {
+			*oi->sizep = size;
+		}
+	}
+
+	if (oi->disk_sizep) {
+		struct revindex_entry *revidx = find_pack_revindex(p, obj_offset);
+		*oi->disk_sizep = revidx[1].offset - obj_offset;
+	}
+
+	if (oi->typep || oi->typename) {
+		enum object_type ptot;
+		ptot = packed_to_object_type(p, obj_offset, type, &w_curs,
+					     curpos);
+		if (oi->typep)
+			*oi->typep = ptot;
+		if (oi->typename) {
+			const char *tn = typename(ptot);
+			if (tn)
+				strbuf_addstr(oi->typename, tn);
+		}
+		if (ptot < 0) {
+			type = OBJ_BAD;
+			goto out;
+		}
+	}
+
+	if (oi->delta_base_sha1) {
+		if (type == OBJ_OFS_DELTA || type == OBJ_REF_DELTA) {
+			const unsigned char *base;
+
+			base = get_delta_base_sha1(p, &w_curs, curpos,
+						   type, obj_offset);
+			if (!base) {
+				type = OBJ_BAD;
+				goto out;
+			}
+
+			hashcpy(oi->delta_base_sha1, base);
+		} else
+			hashclr(oi->delta_base_sha1);
+	}
+
+	oi->whence = in_delta_base_cache(p, obj_offset) ? OI_DBCACHED :
+							  OI_PACKED;
+
+out:
+	unuse_pack(&w_curs);
+	return type;
+}
+
+static void *unpack_compressed_entry(struct packed_git *p,
+				    struct pack_window **w_curs,
+				    off_t curpos,
+				    unsigned long size)
+{
+	int st;
+	git_zstream stream;
+	unsigned char *buffer, *in;
+
+	buffer = xmallocz_gently(size);
+	if (!buffer)
+		return NULL;
+	memset(&stream, 0, sizeof(stream));
+	stream.next_out = buffer;
+	stream.avail_out = size + 1;
+
+	git_inflate_init(&stream);
+	do {
+		in = use_pack(p, w_curs, curpos, &stream.avail_in);
+		stream.next_in = in;
+		st = git_inflate(&stream, Z_FINISH);
+		if (!stream.avail_out)
+			break; /* the payload is larger than it should be */
+		curpos += stream.next_in - in;
+	} while (st == Z_OK || st == Z_BUF_ERROR);
+	git_inflate_end(&stream);
+	if ((st != Z_STREAM_END) || stream.total_out != size) {
+		free(buffer);
+		return NULL;
+	}
+
+	return buffer;
+}
+
+static void write_pack_access_log(struct packed_git *p, off_t obj_offset)
+{
+	static struct trace_key pack_access = TRACE_KEY_INIT(PACK_ACCESS);
+	trace_printf_key(&pack_access, "%s %"PRIuMAX"\n",
+			 p->pack_name, (uintmax_t)obj_offset);
+}
+
+int do_check_packed_object_crc;
+
+#define UNPACK_ENTRY_STACK_PREALLOC 64
+struct unpack_entry_stack_ent {
+	off_t obj_offset;
+	off_t curpos;
+	unsigned long size;
+};
+
+static void *read_object(const unsigned char *sha1, enum object_type *type,
+			 unsigned long *size)
+{
+	struct object_info oi = OBJECT_INFO_INIT;
+	void *content;
+	oi.typep = type;
+	oi.sizep = size;
+	oi.contentp = &content;
+
+	if (sha1_object_info_extended(sha1, &oi, 0) < 0)
+		return NULL;
+	return content;
+}
+
+void *unpack_entry(struct packed_git *p, off_t obj_offset,
+		   enum object_type *final_type, unsigned long *final_size)
+{
+	struct pack_window *w_curs = NULL;
+	off_t curpos = obj_offset;
+	void *data = NULL;
+	unsigned long size;
+	enum object_type type;
+	struct unpack_entry_stack_ent small_delta_stack[UNPACK_ENTRY_STACK_PREALLOC];
+	struct unpack_entry_stack_ent *delta_stack = small_delta_stack;
+	int delta_stack_nr = 0, delta_stack_alloc = UNPACK_ENTRY_STACK_PREALLOC;
+	int base_from_cache = 0;
+
+	write_pack_access_log(p, obj_offset);
+
+	/* PHASE 1: drill down to the innermost base object */
+	for (;;) {
+		off_t base_offset;
+		int i;
+		struct delta_base_cache_entry *ent;
+
+		ent = get_delta_base_cache_entry(p, curpos);
+		if (ent) {
+			type = ent->type;
+			data = ent->data;
+			size = ent->size;
+			detach_delta_base_cache_entry(ent);
+			base_from_cache = 1;
+			break;
+		}
+
+		if (do_check_packed_object_crc && p->index_version > 1) {
+			struct revindex_entry *revidx = find_pack_revindex(p, obj_offset);
+			off_t len = revidx[1].offset - obj_offset;
+			if (check_pack_crc(p, &w_curs, obj_offset, len, revidx->nr)) {
+				const unsigned char *sha1 =
+					nth_packed_object_sha1(p, revidx->nr);
+				error("bad packed object CRC for %s",
+				      sha1_to_hex(sha1));
+				mark_bad_packed_object(p, sha1);
+				data = NULL;
+				goto out;
+			}
+		}
+
+		type = unpack_object_header(p, &w_curs, &curpos, &size);
+		if (type != OBJ_OFS_DELTA && type != OBJ_REF_DELTA)
+			break;
+
+		base_offset = get_delta_base(p, &w_curs, &curpos, type, obj_offset);
+		if (!base_offset) {
+			error("failed to validate delta base reference "
+			      "at offset %"PRIuMAX" from %s",
+			      (uintmax_t)curpos, p->pack_name);
+			/* bail to phase 2, in hopes of recovery */
+			data = NULL;
+			break;
+		}
+
+		/* push object, proceed to base */
+		if (delta_stack_nr >= delta_stack_alloc
+		    && delta_stack == small_delta_stack) {
+			delta_stack_alloc = alloc_nr(delta_stack_nr);
+			ALLOC_ARRAY(delta_stack, delta_stack_alloc);
+			memcpy(delta_stack, small_delta_stack,
+			       sizeof(*delta_stack)*delta_stack_nr);
+		} else {
+			ALLOC_GROW(delta_stack, delta_stack_nr+1, delta_stack_alloc);
+		}
+		i = delta_stack_nr++;
+		delta_stack[i].obj_offset = obj_offset;
+		delta_stack[i].curpos = curpos;
+		delta_stack[i].size = size;
+
+		curpos = obj_offset = base_offset;
+	}
+
+	/* PHASE 2: handle the base */
+	switch (type) {
+	case OBJ_OFS_DELTA:
+	case OBJ_REF_DELTA:
+		if (data)
+			die("BUG: unpack_entry: left loop at a valid delta");
+		break;
+	case OBJ_COMMIT:
+	case OBJ_TREE:
+	case OBJ_BLOB:
+	case OBJ_TAG:
+		if (!base_from_cache)
+			data = unpack_compressed_entry(p, &w_curs, curpos, size);
+		break;
+	default:
+		data = NULL;
+		error("unknown object type %i at offset %"PRIuMAX" in %s",
+		      type, (uintmax_t)obj_offset, p->pack_name);
+	}
+
+	/* PHASE 3: apply deltas in order */
+
+	/* invariants:
+	 *   'data' holds the base data, or NULL if there was corruption
+	 */
+	while (delta_stack_nr) {
+		void *delta_data;
+		void *base = data;
+		void *external_base = NULL;
+		unsigned long delta_size, base_size = size;
+		int i;
+
+		data = NULL;
+
+		if (base)
+			add_delta_base_cache(p, obj_offset, base, base_size, type);
+
+		if (!base) {
+			/*
+			 * We're probably in deep shit, but let's try to fetch
+			 * the required base anyway from another pack or loose.
+			 * This is costly but should happen only in the presence
+			 * of a corrupted pack, and is better than failing outright.
+			 */
+			struct revindex_entry *revidx;
+			const unsigned char *base_sha1;
+			revidx = find_pack_revindex(p, obj_offset);
+			if (revidx) {
+				base_sha1 = nth_packed_object_sha1(p, revidx->nr);
+				error("failed to read delta base object %s"
+				      " at offset %"PRIuMAX" from %s",
+				      sha1_to_hex(base_sha1), (uintmax_t)obj_offset,
+				      p->pack_name);
+				mark_bad_packed_object(p, base_sha1);
+				base = read_object(base_sha1, &type, &base_size);
+				external_base = base;
+			}
+		}
+
+		i = --delta_stack_nr;
+		obj_offset = delta_stack[i].obj_offset;
+		curpos = delta_stack[i].curpos;
+		delta_size = delta_stack[i].size;
+
+		if (!base)
+			continue;
+
+		delta_data = unpack_compressed_entry(p, &w_curs, curpos, delta_size);
+
+		if (!delta_data) {
+			error("failed to unpack compressed delta "
+			      "at offset %"PRIuMAX" from %s",
+			      (uintmax_t)curpos, p->pack_name);
+			data = NULL;
+			free(external_base);
+			continue;
+		}
+
+		data = patch_delta(base, base_size,
+				   delta_data, delta_size,
+				   &size);
+
+		/*
+		 * We could not apply the delta; warn the user, but keep going.
+		 * Our failure will be noticed either in the next iteration of
+		 * the loop, or if this is the final delta, in the caller when
+		 * we return NULL. Those code paths will take care of making
+		 * a more explicit warning and retrying with another copy of
+		 * the object.
+		 */
+		if (!data)
+			error("failed to apply delta");
+
+		free(delta_data);
+		free(external_base);
+	}
+
+	if (final_type)
+		*final_type = type;
+	if (final_size)
+		*final_size = size;
+
+out:
+	unuse_pack(&w_curs);
+
+	if (delta_stack != small_delta_stack)
+		free(delta_stack);
+
+	return data;
+}
diff --git a/packfile.h b/packfile.h
index d22a528b5..c28eaccc6 100644
--- a/packfile.h
+++ b/packfile.h
@@ -60,8 +60,10 @@ extern unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t
 extern void close_pack_windows(struct packed_git *);
 extern void close_all_packs(void);
 extern void unuse_pack(struct pack_window **);
+extern void clear_delta_base_cache(void);
 extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local);
 
+extern void *unpack_entry(struct packed_git *, off_t, enum object_type *, unsigned long *);
 extern unsigned long unpack_object_header_buffer(const unsigned char *buf, unsigned long len, enum object_type *type, unsigned long *sizep);
 extern unsigned long get_size_from_delta(struct packed_git *, struct pack_window **, off_t);
 extern int unpack_object_header(struct packed_git *, struct pack_window **, off_t *, unsigned long *);
@@ -70,4 +72,12 @@ extern void release_pack_memory(size_t);
 
 extern int open_packed_git(struct packed_git *p);
 
+/* global flag to enable extra checks when accessing packed objects */
+extern int do_check_packed_object_crc;
+
+extern int packed_object_info(struct packed_git *pack, off_t offset, struct object_info *);
+
+extern void mark_bad_packed_object(struct packed_git *p, const unsigned char *sha1);
+extern const struct packed_git *has_packed_and_bad(const unsigned char *sha1);
+
 #endif
diff --git a/sha1_file.c b/sha1_file.c
index 681dcf1c0..e537ba089 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -718,32 +718,6 @@ void *xmmap(void *start, size_t length,
 	return ret;
 }
 
-static void mark_bad_packed_object(struct packed_git *p,
-				   const unsigned char *sha1)
-{
-	unsigned i;
-	for (i = 0; i < p->num_bad_objects; i++)
-		if (!hashcmp(sha1, p->bad_object_sha1 + GIT_SHA1_RAWSZ * i))
-			return;
-	p->bad_object_sha1 = xrealloc(p->bad_object_sha1,
-				      st_mult(GIT_MAX_RAWSZ,
-					      st_add(p->num_bad_objects, 1)));
-	hashcpy(p->bad_object_sha1 + GIT_SHA1_RAWSZ * p->num_bad_objects, sha1);
-	p->num_bad_objects++;
-}
-
-static const struct packed_git *has_packed_and_bad(const unsigned char *sha1)
-{
-	struct packed_git *p;
-	unsigned i;
-
-	for (p = packed_git; p; p = p->next)
-		for (i = 0; i < p->num_bad_objects; i++)
-			if (!hashcmp(sha1, p->bad_object_sha1 + 20 * i))
-				return p;
-	return NULL;
-}
-
 /*
  * With an in-core object data in "map", rehash it to make sure the
  * object name actually matches "sha1" to detect object corruption.
@@ -1100,629 +1074,6 @@ int parse_sha1_header(const char *hdr, unsigned long *sizep)
 	return parse_sha1_header_extended(hdr, &oi, 0);
 }
 
-static off_t get_delta_base(struct packed_git *p,
-				    struct pack_window **w_curs,
-				    off_t *curpos,
-				    enum object_type type,
-				    off_t delta_obj_offset)
-{
-	unsigned char *base_info = use_pack(p, w_curs, *curpos, NULL);
-	off_t base_offset;
-
-	/* use_pack() assured us we have [base_info, base_info + 20)
-	 * as a range that we can look at without walking off the
-	 * end of the mapped window.  Its actually the hash size
-	 * that is assured.  An OFS_DELTA longer than the hash size
-	 * is stupid, as then a REF_DELTA would be smaller to store.
-	 */
-	if (type == OBJ_OFS_DELTA) {
-		unsigned used = 0;
-		unsigned char c = base_info[used++];
-		base_offset = c & 127;
-		while (c & 128) {
-			base_offset += 1;
-			if (!base_offset || MSB(base_offset, 7))
-				return 0;  /* overflow */
-			c = base_info[used++];
-			base_offset = (base_offset << 7) + (c & 127);
-		}
-		base_offset = delta_obj_offset - base_offset;
-		if (base_offset <= 0 || base_offset >= delta_obj_offset)
-			return 0;  /* out of bound */
-		*curpos += used;
-	} else if (type == OBJ_REF_DELTA) {
-		/* The base entry _must_ be in the same pack */
-		base_offset = find_pack_entry_one(base_info, p);
-		*curpos += 20;
-	} else
-		die("I am totally screwed");
-	return base_offset;
-}
-
-/*
- * Like get_delta_base above, but we return the sha1 instead of the pack
- * offset. This means it is cheaper for REF deltas (we do not have to do
- * the final object lookup), but more expensive for OFS deltas (we
- * have to load the revidx to convert the offset back into a sha1).
- */
-static const unsigned char *get_delta_base_sha1(struct packed_git *p,
-						struct pack_window **w_curs,
-						off_t curpos,
-						enum object_type type,
-						off_t delta_obj_offset)
-{
-	if (type == OBJ_REF_DELTA) {
-		unsigned char *base = use_pack(p, w_curs, curpos, NULL);
-		return base;
-	} else if (type == OBJ_OFS_DELTA) {
-		struct revindex_entry *revidx;
-		off_t base_offset = get_delta_base(p, w_curs, &curpos,
-						   type, delta_obj_offset);
-
-		if (!base_offset)
-			return NULL;
-
-		revidx = find_pack_revindex(p, base_offset);
-		if (!revidx)
-			return NULL;
-
-		return nth_packed_object_sha1(p, revidx->nr);
-	} else
-		return NULL;
-}
-
-static int retry_bad_packed_offset(struct packed_git *p, off_t obj_offset)
-{
-	int type;
-	struct revindex_entry *revidx;
-	const unsigned char *sha1;
-	revidx = find_pack_revindex(p, obj_offset);
-	if (!revidx)
-		return OBJ_BAD;
-	sha1 = nth_packed_object_sha1(p, revidx->nr);
-	mark_bad_packed_object(p, sha1);
-	type = sha1_object_info(sha1, NULL);
-	if (type <= OBJ_NONE)
-		return OBJ_BAD;
-	return type;
-}
-
-#define POI_STACK_PREALLOC 64
-
-static enum object_type packed_to_object_type(struct packed_git *p,
-					      off_t obj_offset,
-					      enum object_type type,
-					      struct pack_window **w_curs,
-					      off_t curpos)
-{
-	off_t small_poi_stack[POI_STACK_PREALLOC];
-	off_t *poi_stack = small_poi_stack;
-	int poi_stack_nr = 0, poi_stack_alloc = POI_STACK_PREALLOC;
-
-	while (type == OBJ_OFS_DELTA || type == OBJ_REF_DELTA) {
-		off_t base_offset;
-		unsigned long size;
-		/* Push the object we're going to leave behind */
-		if (poi_stack_nr >= poi_stack_alloc && poi_stack == small_poi_stack) {
-			poi_stack_alloc = alloc_nr(poi_stack_nr);
-			ALLOC_ARRAY(poi_stack, poi_stack_alloc);
-			memcpy(poi_stack, small_poi_stack, sizeof(off_t)*poi_stack_nr);
-		} else {
-			ALLOC_GROW(poi_stack, poi_stack_nr+1, poi_stack_alloc);
-		}
-		poi_stack[poi_stack_nr++] = obj_offset;
-		/* If parsing the base offset fails, just unwind */
-		base_offset = get_delta_base(p, w_curs, &curpos, type, obj_offset);
-		if (!base_offset)
-			goto unwind;
-		curpos = obj_offset = base_offset;
-		type = unpack_object_header(p, w_curs, &curpos, &size);
-		if (type <= OBJ_NONE) {
-			/* If getting the base itself fails, we first
-			 * retry the base, otherwise unwind */
-			type = retry_bad_packed_offset(p, base_offset);
-			if (type > OBJ_NONE)
-				goto out;
-			goto unwind;
-		}
-	}
-
-	switch (type) {
-	case OBJ_BAD:
-	case OBJ_COMMIT:
-	case OBJ_TREE:
-	case OBJ_BLOB:
-	case OBJ_TAG:
-		break;
-	default:
-		error("unknown object type %i at offset %"PRIuMAX" in %s",
-		      type, (uintmax_t)obj_offset, p->pack_name);
-		type = OBJ_BAD;
-	}
-
-out:
-	if (poi_stack != small_poi_stack)
-		free(poi_stack);
-	return type;
-
-unwind:
-	while (poi_stack_nr) {
-		obj_offset = poi_stack[--poi_stack_nr];
-		type = retry_bad_packed_offset(p, obj_offset);
-		if (type > OBJ_NONE)
-			goto out;
-	}
-	type = OBJ_BAD;
-	goto out;
-}
-
-static struct hashmap delta_base_cache;
-static size_t delta_base_cached;
-
-static LIST_HEAD(delta_base_cache_lru);
-
-struct delta_base_cache_key {
-	struct packed_git *p;
-	off_t base_offset;
-};
-
-struct delta_base_cache_entry {
-	struct hashmap hash;
-	struct delta_base_cache_key key;
-	struct list_head lru;
-	void *data;
-	unsigned long size;
-	enum object_type type;
-};
-
-static unsigned int pack_entry_hash(struct packed_git *p, off_t base_offset)
-{
-	unsigned int hash;
-
-	hash = (unsigned int)(intptr_t)p + (unsigned int)base_offset;
-	hash += (hash >> 8) + (hash >> 16);
-	return hash;
-}
-
-static struct delta_base_cache_entry *
-get_delta_base_cache_entry(struct packed_git *p, off_t base_offset)
-{
-	struct hashmap_entry entry;
-	struct delta_base_cache_key key;
-
-	if (!delta_base_cache.cmpfn)
-		return NULL;
-
-	hashmap_entry_init(&entry, pack_entry_hash(p, base_offset));
-	key.p = p;
-	key.base_offset = base_offset;
-	return hashmap_get(&delta_base_cache, &entry, &key);
-}
-
-static int delta_base_cache_key_eq(const struct delta_base_cache_key *a,
-				   const struct delta_base_cache_key *b)
-{
-	return a->p == b->p && a->base_offset == b->base_offset;
-}
-
-static int delta_base_cache_hash_cmp(const void *unused_cmp_data,
-				     const void *va, const void *vb,
-				     const void *vkey)
-{
-	const struct delta_base_cache_entry *a = va, *b = vb;
-	const struct delta_base_cache_key *key = vkey;
-	if (key)
-		return !delta_base_cache_key_eq(&a->key, key);
-	else
-		return !delta_base_cache_key_eq(&a->key, &b->key);
-}
-
-static int in_delta_base_cache(struct packed_git *p, off_t base_offset)
-{
-	return !!get_delta_base_cache_entry(p, base_offset);
-}
-
-/*
- * Remove the entry from the cache, but do _not_ free the associated
- * entry data. The caller takes ownership of the "data" buffer, and
- * should copy out any fields it wants before detaching.
- */
-static void detach_delta_base_cache_entry(struct delta_base_cache_entry *ent)
-{
-	hashmap_remove(&delta_base_cache, ent, &ent->key);
-	list_del(&ent->lru);
-	delta_base_cached -= ent->size;
-	free(ent);
-}
-
-static void *cache_or_unpack_entry(struct packed_git *p, off_t base_offset,
-	unsigned long *base_size, enum object_type *type)
-{
-	struct delta_base_cache_entry *ent;
-
-	ent = get_delta_base_cache_entry(p, base_offset);
-	if (!ent)
-		return unpack_entry(p, base_offset, type, base_size);
-
-	if (type)
-		*type = ent->type;
-	if (base_size)
-		*base_size = ent->size;
-	return xmemdupz(ent->data, ent->size);
-}
-
-static inline void release_delta_base_cache(struct delta_base_cache_entry *ent)
-{
-	free(ent->data);
-	detach_delta_base_cache_entry(ent);
-}
-
-void clear_delta_base_cache(void)
-{
-	struct list_head *lru, *tmp;
-	list_for_each_safe(lru, tmp, &delta_base_cache_lru) {
-		struct delta_base_cache_entry *entry =
-			list_entry(lru, struct delta_base_cache_entry, lru);
-		release_delta_base_cache(entry);
-	}
-}
-
-static void add_delta_base_cache(struct packed_git *p, off_t base_offset,
-	void *base, unsigned long base_size, enum object_type type)
-{
-	struct delta_base_cache_entry *ent = xmalloc(sizeof(*ent));
-	struct list_head *lru, *tmp;
-
-	delta_base_cached += base_size;
-
-	list_for_each_safe(lru, tmp, &delta_base_cache_lru) {
-		struct delta_base_cache_entry *f =
-			list_entry(lru, struct delta_base_cache_entry, lru);
-		if (delta_base_cached <= delta_base_cache_limit)
-			break;
-		release_delta_base_cache(f);
-	}
-
-	ent->key.p = p;
-	ent->key.base_offset = base_offset;
-	ent->type = type;
-	ent->data = base;
-	ent->size = base_size;
-	list_add_tail(&ent->lru, &delta_base_cache_lru);
-
-	if (!delta_base_cache.cmpfn)
-		hashmap_init(&delta_base_cache, delta_base_cache_hash_cmp, NULL, 0);
-	hashmap_entry_init(ent, pack_entry_hash(p, base_offset));
-	hashmap_add(&delta_base_cache, ent);
-}
-
-int packed_object_info(struct packed_git *p, off_t obj_offset,
-		       struct object_info *oi)
-{
-	struct pack_window *w_curs = NULL;
-	unsigned long size;
-	off_t curpos = obj_offset;
-	enum object_type type;
-
-	/*
-	 * We always get the representation type, but only convert it to
-	 * a "real" type later if the caller is interested.
-	 */
-	if (oi->contentp) {
-		*oi->contentp = cache_or_unpack_entry(p, obj_offset, oi->sizep,
-						      &type);
-		if (!*oi->contentp)
-			type = OBJ_BAD;
-	} else {
-		type = unpack_object_header(p, &w_curs, &curpos, &size);
-	}
-
-	if (!oi->contentp && oi->sizep) {
-		if (type == OBJ_OFS_DELTA || type == OBJ_REF_DELTA) {
-			off_t tmp_pos = curpos;
-			off_t base_offset = get_delta_base(p, &w_curs, &tmp_pos,
-							   type, obj_offset);
-			if (!base_offset) {
-				type = OBJ_BAD;
-				goto out;
-			}
-			*oi->sizep = get_size_from_delta(p, &w_curs, tmp_pos);
-			if (*oi->sizep == 0) {
-				type = OBJ_BAD;
-				goto out;
-			}
-		} else {
-			*oi->sizep = size;
-		}
-	}
-
-	if (oi->disk_sizep) {
-		struct revindex_entry *revidx = find_pack_revindex(p, obj_offset);
-		*oi->disk_sizep = revidx[1].offset - obj_offset;
-	}
-
-	if (oi->typep || oi->typename) {
-		enum object_type ptot;
-		ptot = packed_to_object_type(p, obj_offset, type, &w_curs,
-					     curpos);
-		if (oi->typep)
-			*oi->typep = ptot;
-		if (oi->typename) {
-			const char *tn = typename(ptot);
-			if (tn)
-				strbuf_addstr(oi->typename, tn);
-		}
-		if (ptot < 0) {
-			type = OBJ_BAD;
-			goto out;
-		}
-	}
-
-	if (oi->delta_base_sha1) {
-		if (type == OBJ_OFS_DELTA || type == OBJ_REF_DELTA) {
-			const unsigned char *base;
-
-			base = get_delta_base_sha1(p, &w_curs, curpos,
-						   type, obj_offset);
-			if (!base) {
-				type = OBJ_BAD;
-				goto out;
-			}
-
-			hashcpy(oi->delta_base_sha1, base);
-		} else
-			hashclr(oi->delta_base_sha1);
-	}
-
-	oi->whence = in_delta_base_cache(p, obj_offset) ? OI_DBCACHED :
-							  OI_PACKED;
-
-out:
-	unuse_pack(&w_curs);
-	return type;
-}
-
-static void *unpack_compressed_entry(struct packed_git *p,
-				    struct pack_window **w_curs,
-				    off_t curpos,
-				    unsigned long size)
-{
-	int st;
-	git_zstream stream;
-	unsigned char *buffer, *in;
-
-	buffer = xmallocz_gently(size);
-	if (!buffer)
-		return NULL;
-	memset(&stream, 0, sizeof(stream));
-	stream.next_out = buffer;
-	stream.avail_out = size + 1;
-
-	git_inflate_init(&stream);
-	do {
-		in = use_pack(p, w_curs, curpos, &stream.avail_in);
-		stream.next_in = in;
-		st = git_inflate(&stream, Z_FINISH);
-		if (!stream.avail_out)
-			break; /* the payload is larger than it should be */
-		curpos += stream.next_in - in;
-	} while (st == Z_OK || st == Z_BUF_ERROR);
-	git_inflate_end(&stream);
-	if ((st != Z_STREAM_END) || stream.total_out != size) {
-		free(buffer);
-		return NULL;
-	}
-
-	return buffer;
-}
-
-static void *read_object(const unsigned char *sha1, enum object_type *type,
-			 unsigned long *size);
-
-static void write_pack_access_log(struct packed_git *p, off_t obj_offset)
-{
-	static struct trace_key pack_access = TRACE_KEY_INIT(PACK_ACCESS);
-	trace_printf_key(&pack_access, "%s %"PRIuMAX"\n",
-			 p->pack_name, (uintmax_t)obj_offset);
-}
-
-int do_check_packed_object_crc;
-
-#define UNPACK_ENTRY_STACK_PREALLOC 64
-struct unpack_entry_stack_ent {
-	off_t obj_offset;
-	off_t curpos;
-	unsigned long size;
-};
-
-void *unpack_entry(struct packed_git *p, off_t obj_offset,
-		   enum object_type *final_type, unsigned long *final_size)
-{
-	struct pack_window *w_curs = NULL;
-	off_t curpos = obj_offset;
-	void *data = NULL;
-	unsigned long size;
-	enum object_type type;
-	struct unpack_entry_stack_ent small_delta_stack[UNPACK_ENTRY_STACK_PREALLOC];
-	struct unpack_entry_stack_ent *delta_stack = small_delta_stack;
-	int delta_stack_nr = 0, delta_stack_alloc = UNPACK_ENTRY_STACK_PREALLOC;
-	int base_from_cache = 0;
-
-	write_pack_access_log(p, obj_offset);
-
-	/* PHASE 1: drill down to the innermost base object */
-	for (;;) {
-		off_t base_offset;
-		int i;
-		struct delta_base_cache_entry *ent;
-
-		ent = get_delta_base_cache_entry(p, curpos);
-		if (ent) {
-			type = ent->type;
-			data = ent->data;
-			size = ent->size;
-			detach_delta_base_cache_entry(ent);
-			base_from_cache = 1;
-			break;
-		}
-
-		if (do_check_packed_object_crc && p->index_version > 1) {
-			struct revindex_entry *revidx = find_pack_revindex(p, obj_offset);
-			off_t len = revidx[1].offset - obj_offset;
-			if (check_pack_crc(p, &w_curs, obj_offset, len, revidx->nr)) {
-				const unsigned char *sha1 =
-					nth_packed_object_sha1(p, revidx->nr);
-				error("bad packed object CRC for %s",
-				      sha1_to_hex(sha1));
-				mark_bad_packed_object(p, sha1);
-				data = NULL;
-				goto out;
-			}
-		}
-
-		type = unpack_object_header(p, &w_curs, &curpos, &size);
-		if (type != OBJ_OFS_DELTA && type != OBJ_REF_DELTA)
-			break;
-
-		base_offset = get_delta_base(p, &w_curs, &curpos, type, obj_offset);
-		if (!base_offset) {
-			error("failed to validate delta base reference "
-			      "at offset %"PRIuMAX" from %s",
-			      (uintmax_t)curpos, p->pack_name);
-			/* bail to phase 2, in hopes of recovery */
-			data = NULL;
-			break;
-		}
-
-		/* push object, proceed to base */
-		if (delta_stack_nr >= delta_stack_alloc
-		    && delta_stack == small_delta_stack) {
-			delta_stack_alloc = alloc_nr(delta_stack_nr);
-			ALLOC_ARRAY(delta_stack, delta_stack_alloc);
-			memcpy(delta_stack, small_delta_stack,
-			       sizeof(*delta_stack)*delta_stack_nr);
-		} else {
-			ALLOC_GROW(delta_stack, delta_stack_nr+1, delta_stack_alloc);
-		}
-		i = delta_stack_nr++;
-		delta_stack[i].obj_offset = obj_offset;
-		delta_stack[i].curpos = curpos;
-		delta_stack[i].size = size;
-
-		curpos = obj_offset = base_offset;
-	}
-
-	/* PHASE 2: handle the base */
-	switch (type) {
-	case OBJ_OFS_DELTA:
-	case OBJ_REF_DELTA:
-		if (data)
-			die("BUG: unpack_entry: left loop at a valid delta");
-		break;
-	case OBJ_COMMIT:
-	case OBJ_TREE:
-	case OBJ_BLOB:
-	case OBJ_TAG:
-		if (!base_from_cache)
-			data = unpack_compressed_entry(p, &w_curs, curpos, size);
-		break;
-	default:
-		data = NULL;
-		error("unknown object type %i at offset %"PRIuMAX" in %s",
-		      type, (uintmax_t)obj_offset, p->pack_name);
-	}
-
-	/* PHASE 3: apply deltas in order */
-
-	/* invariants:
-	 *   'data' holds the base data, or NULL if there was corruption
-	 */
-	while (delta_stack_nr) {
-		void *delta_data;
-		void *base = data;
-		void *external_base = NULL;
-		unsigned long delta_size, base_size = size;
-		int i;
-
-		data = NULL;
-
-		if (base)
-			add_delta_base_cache(p, obj_offset, base, base_size, type);
-
-		if (!base) {
-			/*
-			 * We're probably in deep shit, but let's try to fetch
-			 * the required base anyway from another pack or loose.
-			 * This is costly but should happen only in the presence
-			 * of a corrupted pack, and is better than failing outright.
-			 */
-			struct revindex_entry *revidx;
-			const unsigned char *base_sha1;
-			revidx = find_pack_revindex(p, obj_offset);
-			if (revidx) {
-				base_sha1 = nth_packed_object_sha1(p, revidx->nr);
-				error("failed to read delta base object %s"
-				      " at offset %"PRIuMAX" from %s",
-				      sha1_to_hex(base_sha1), (uintmax_t)obj_offset,
-				      p->pack_name);
-				mark_bad_packed_object(p, base_sha1);
-				base = read_object(base_sha1, &type, &base_size);
-				external_base = base;
-			}
-		}
-
-		i = --delta_stack_nr;
-		obj_offset = delta_stack[i].obj_offset;
-		curpos = delta_stack[i].curpos;
-		delta_size = delta_stack[i].size;
-
-		if (!base)
-			continue;
-
-		delta_data = unpack_compressed_entry(p, &w_curs, curpos, delta_size);
-
-		if (!delta_data) {
-			error("failed to unpack compressed delta "
-			      "at offset %"PRIuMAX" from %s",
-			      (uintmax_t)curpos, p->pack_name);
-			data = NULL;
-			free(external_base);
-			continue;
-		}
-
-		data = patch_delta(base, base_size,
-				   delta_data, delta_size,
-				   &size);
-
-		/*
-		 * We could not apply the delta; warn the user, but keep going.
-		 * Our failure will be noticed either in the next iteration of
-		 * the loop, or if this is the final delta, in the caller when
-		 * we return NULL. Those code paths will take care of making
-		 * a more explicit warning and retrying with another copy of
-		 * the object.
-		 */
-		if (!data)
-			error("failed to apply delta");
-
-		free(delta_data);
-		free(external_base);
-	}
-
-	if (final_type)
-		*final_type = type;
-	if (final_size)
-		*final_size = size;
-
-out:
-	unuse_pack(&w_curs);
-
-	if (delta_stack != small_delta_stack)
-		free(delta_stack);
-
-	return data;
-}
-
 const unsigned char *nth_packed_object_sha1(struct packed_git *p,
 					    uint32_t n)
 {
@@ -2082,6 +1433,20 @@ int sha1_object_info(const unsigned char *sha1, unsigned long *sizep)
 	return type;
 }
 
+static void *read_object(const unsigned char *sha1, enum object_type *type,
+			 unsigned long *size)
+{
+	struct object_info oi = OBJECT_INFO_INIT;
+	void *content;
+	oi.typep = type;
+	oi.sizep = size;
+	oi.contentp = &content;
+
+	if (sha1_object_info_extended(sha1, &oi, 0) < 0)
+		return NULL;
+	return content;
+}
+
 int pretend_sha1_file(void *buf, unsigned long len, enum object_type type,
 		      unsigned char *sha1)
 {
@@ -2100,20 +1465,6 @@ int pretend_sha1_file(void *buf, unsigned long len, enum object_type type,
 	return 0;
 }
 
-static void *read_object(const unsigned char *sha1, enum object_type *type,
-			 unsigned long *size)
-{
-	struct object_info oi = OBJECT_INFO_INIT;
-	void *content;
-	oi.typep = type;
-	oi.sizep = size;
-	oi.contentp = &content;
-
-	if (sha1_object_info_extended(sha1, &oi, 0) < 0)
-		return NULL;
-	return content;
-}
-
 /*
  * This function dies on corrupt objects; the callers who want to
  * deal with them should arrange to call read_object() and give error
-- 
2.14.1.480.gb18f417b89-goog



* [PATCH v3 16/23] pack: move nth_packed_object_{sha1,oid}
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (52 preceding siblings ...)
  2017-08-18 22:20 ` [PATCH v3 15/23] pack: move clear_delta_base_cache(), packed_object_info(), unpack_entry() Jonathan Tan
@ 2017-08-18 22:20 ` Jonathan Tan
  2017-08-18 22:20 ` [PATCH v3 17/23] pack: move check_pack_index_ptr(), nth_packed_object_offset() Jonathan Tan
                   ` (6 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-18 22:20 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
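Not part of the patch: a minimal standalone sketch of the position
arithmetic that nth_packed_object_sha1() performs on the mmapped index,
in case it helps when checking that the move is verbatim. The helper
name nth_sha1_pos() is hypothetical and does not exist in git. It
assumes the layout the moved code implies: a 256-entry fan-out table of
4-byte counts, then either 24-byte entries (4-byte offset followed by a
20-byte SHA-1) for version 1, or an 8-byte header and a table of bare
20-byte SHA-1s for later versions.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not in git): byte position of the nth SHA-1,
 * computed with the same constants as nth_packed_object_sha1().
 */
static size_t nth_sha1_pos(int index_version, uint32_t n)
{
	size_t pos = 4 * 256;	/* fan-out table: 256 entries, 4 bytes each */

	if (index_version == 1)
		/* 24-byte entries; skip the 4-byte offset in front of the SHA-1 */
		return pos + 24 * (size_t)n + 4;

	pos += 8;		/* version 2 adds an 8-byte header */
	return pos + 20 * (size_t)n;
}
```

Unlike the real function, this sketch does not open the index or check
n against num_objects; it only shows where the returned pointer lands.
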
 cache.h     | 14 --------------
 packfile.c  | 31 +++++++++++++++++++++++++++++++
 packfile.h  | 16 +++++++++++++++-
 sha1_file.c | 31 -------------------------------
 4 files changed, 46 insertions(+), 46 deletions(-)

diff --git a/cache.h b/cache.h
index 11aa18e6a..83aa3cc62 100644
--- a/cache.h
+++ b/cache.h
@@ -1636,20 +1636,6 @@ extern int odb_pack_keep(const char *name);
  */
 extern void check_pack_index_ptr(const struct packed_git *p, const void *ptr);
 
-/*
- * Return the SHA-1 of the nth object within the specified packfile.
- * Open the index if it is not already open.  The return value points
- * at the SHA-1 within the mmapped index.  Return NULL if there is an
- * error.
- */
-extern const unsigned char *nth_packed_object_sha1(struct packed_git *, uint32_t n);
-/*
- * Like nth_packed_object_sha1, but write the data into the object specified by
- * the the first argument.  Returns the first argument on success, and NULL on
- * error.
- */
-extern const struct object_id *nth_packed_object_oid(struct object_id *, struct packed_git *, uint32_t n);
-
 /*
  * Return the offset of the nth object within the specified packfile.
  * The index must already be opened.
diff --git a/packfile.c b/packfile.c
index 624cc109e..e9b16da94 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1636,3 +1636,34 @@ void *unpack_entry(struct packed_git *p, off_t obj_offset,
 
 	return data;
 }
+
+const unsigned char *nth_packed_object_sha1(struct packed_git *p,
+					    uint32_t n)
+{
+	const unsigned char *index = p->index_data;
+	if (!index) {
+		if (open_pack_index(p))
+			return NULL;
+		index = p->index_data;
+	}
+	if (n >= p->num_objects)
+		return NULL;
+	index += 4 * 256;
+	if (p->index_version == 1) {
+		return index + 24 * n + 4;
+	} else {
+		index += 8;
+		return index + 20 * n;
+	}
+}
+
+const struct object_id *nth_packed_object_oid(struct object_id *oid,
+					      struct packed_git *p,
+					      uint32_t n)
+{
+	const unsigned char *hash = nth_packed_object_sha1(p, n);
+	if (!hash)
+		return NULL;
+	hashcpy(oid->hash, hash);
+	return oid;
+}
diff --git a/packfile.h b/packfile.h
index c28eaccc6..56d70caa0 100644
--- a/packfile.h
+++ b/packfile.h
@@ -63,6 +63,21 @@ extern void unuse_pack(struct pack_window **);
 extern void clear_delta_base_cache(void);
 extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local);
 
+/*
+ * Return the SHA-1 of the nth object within the specified packfile.
+ * Open the index if it is not already open.  The return value points
+ * at the SHA-1 within the mmapped index.  Return NULL if there is an
+ * error.
+ */
+extern const unsigned char *nth_packed_object_sha1(struct packed_git *, uint32_t n);
+/*
+ * Like nth_packed_object_sha1, but write the data into the object specified by
+ * the first argument.  Returns the first argument on success, and NULL on
+ * error.
+ */
+extern const struct object_id *nth_packed_object_oid(struct object_id *, struct packed_git *, uint32_t n);
+
+
 extern void *unpack_entry(struct packed_git *, off_t, enum object_type *, unsigned long *);
 extern unsigned long unpack_object_header_buffer(const unsigned char *buf, unsigned long len, enum object_type *type, unsigned long *sizep);
 extern unsigned long get_size_from_delta(struct packed_git *, struct pack_window **, off_t);
@@ -79,5 +94,4 @@ extern int packed_object_info(struct packed_git *pack, off_t offset, struct obje
 
 extern void mark_bad_packed_object(struct packed_git *p, const unsigned char *sha1);
 extern const struct packed_git *has_packed_and_bad(const unsigned char *sha1);
-
 #endif
diff --git a/sha1_file.c b/sha1_file.c
index e537ba089..34fbe8e51 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1074,37 +1074,6 @@ int parse_sha1_header(const char *hdr, unsigned long *sizep)
 	return parse_sha1_header_extended(hdr, &oi, 0);
 }
 
-const unsigned char *nth_packed_object_sha1(struct packed_git *p,
-					    uint32_t n)
-{
-	const unsigned char *index = p->index_data;
-	if (!index) {
-		if (open_pack_index(p))
-			return NULL;
-		index = p->index_data;
-	}
-	if (n >= p->num_objects)
-		return NULL;
-	index += 4 * 256;
-	if (p->index_version == 1) {
-		return index + 24 * n + 4;
-	} else {
-		index += 8;
-		return index + 20 * n;
-	}
-}
-
-const struct object_id *nth_packed_object_oid(struct object_id *oid,
-					      struct packed_git *p,
-					      uint32_t n)
-{
-	const unsigned char *hash = nth_packed_object_sha1(p, n);
-	if (!hash)
-		return NULL;
-	hashcpy(oid->hash, hash);
-	return oid;
-}
-
 void check_pack_index_ptr(const struct packed_git *p, const void *vptr)
 {
 	const unsigned char *ptr = vptr;
-- 
2.14.1.480.gb18f417b89-goog



* [PATCH v3 17/23] pack: move check_pack_index_ptr(), nth_packed_object_offset()
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (53 preceding siblings ...)
  2017-08-18 22:20 ` [PATCH v3 16/23] pack: move nth_packed_object_{sha1,oid} Jonathan Tan
@ 2017-08-18 22:20 ` Jonathan Tan
  2017-08-18 22:20 ` [PATCH v3 18/23] pack: move find_pack_entry_one(), is_pack_valid() Jonathan Tan
                   ` (5 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-18 22:20 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
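Not part of the patch: a minimal standalone sketch of how the version 2
branch of nth_packed_object_offset() decodes an offset. A 32-bit entry
with the top bit clear is the offset itself; with the top bit set, the
low 31 bits select an 8-byte big-endian entry in the extended table.
The names be32() and decode_v2_offset() are hypothetical, and the two
table pointers are assumed to have been located by the caller.

```c
#include <stdint.h>

/* Read a 32-bit big-endian value without relying on alignment. */
static uint32_t be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Hypothetical helper (not in git): off32 points at the table of
 * 4-byte offsets, off64 at the table of 8-byte extended offsets.
 */
static uint64_t decode_v2_offset(const unsigned char *off32,
				 const unsigned char *off64,
				 uint32_t n)
{
	uint32_t off = be32(off32 + 4 * (uint64_t)n);
	const unsigned char *entry;

	if (!(off & 0x80000000))
		return off;	/* fits in 31 bits: stored directly */

	/* low 31 bits index the 64-bit table */
	entry = off64 + (uint64_t)(off & 0x7fffffff) * 8;
	return ((uint64_t)be32(entry) << 32) | be32(entry + 4);
}
```

The sketch omits the bounds check that the real code performs with
check_pack_index_ptr() before reading the extended entry, so it is only
safe on trusted input.
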
 cache.h     | 16 ----------------
 packfile.c  | 33 +++++++++++++++++++++++++++++++++
 packfile.h  | 16 ++++++++++++++++
 sha1_file.c | 33 ---------------------------------
 4 files changed, 49 insertions(+), 49 deletions(-)

diff --git a/cache.h b/cache.h
index 83aa3cc62..ee75a4949 100644
--- a/cache.h
+++ b/cache.h
@@ -1626,22 +1626,6 @@ extern int odb_mkstemp(struct strbuf *template, const char *pattern);
  */
 extern int odb_pack_keep(const char *name);
 
-/*
- * Make sure that a pointer access into an mmap'd index file is within bounds,
- * and can provide at least 8 bytes of data.
- *
- * Note that this is only necessary for variable-length segments of the file
- * (like the 64-bit extended offset table), as we compare the size to the
- * fixed-length parts when we open the file.
- */
-extern void check_pack_index_ptr(const struct packed_git *p, const void *ptr);
-
-/*
- * Return the offset of the nth object within the specified packfile.
- * The index must already be opened.
- */
-extern off_t nth_packed_object_offset(const struct packed_git *, uint32_t n);
-
 /*
  * If the object named sha1 is present in the specified packfile,
  * return its offset within the packfile; otherwise, return 0.
diff --git a/packfile.c b/packfile.c
index e9b16da94..e914422e9 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1667,3 +1667,36 @@ const struct object_id *nth_packed_object_oid(struct object_id *oid,
 	hashcpy(oid->hash, hash);
 	return oid;
 }
+
+void check_pack_index_ptr(const struct packed_git *p, const void *vptr)
+{
+	const unsigned char *ptr = vptr;
+	const unsigned char *start = p->index_data;
+	const unsigned char *end = start + p->index_size;
+	if (ptr < start)
+		die(_("offset before start of pack index for %s (corrupt index?)"),
+		    p->pack_name);
+	/* No need to check for underflow; .idx files must be at least 8 bytes */
+	if (ptr >= end - 8)
+		die(_("offset beyond end of pack index for %s (truncated index?)"),
+		    p->pack_name);
+}
+
+off_t nth_packed_object_offset(const struct packed_git *p, uint32_t n)
+{
+	const unsigned char *index = p->index_data;
+	index += 4 * 256;
+	if (p->index_version == 1) {
+		return ntohl(*((uint32_t *)(index + 24 * n)));
+	} else {
+		uint32_t off;
+		index += 8 + p->num_objects * (20 + 4);
+		off = ntohl(*((uint32_t *)(index + 4 * n)));
+		if (!(off & 0x80000000))
+			return off;
+		index += p->num_objects * 4 + (off & 0x7fffffff) * 8;
+		check_pack_index_ptr(p, index);
+		return (((uint64_t)ntohl(*((uint32_t *)(index + 0)))) << 32) |
+				   ntohl(*((uint32_t *)(index + 4)));
+	}
+}
diff --git a/packfile.h b/packfile.h
index 56d70caa0..8deb84bd1 100644
--- a/packfile.h
+++ b/packfile.h
@@ -63,6 +63,16 @@ extern void unuse_pack(struct pack_window **);
 extern void clear_delta_base_cache(void);
 extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local);
 
+/*
+ * Make sure that a pointer access into an mmap'd index file is within bounds,
+ * and can provide at least 8 bytes of data.
+ *
+ * Note that this is only necessary for variable-length segments of the file
+ * (like the 64-bit extended offset table), as we compare the size to the
+ * fixed-length parts when we open the file.
+ */
+extern void check_pack_index_ptr(const struct packed_git *p, const void *ptr);
+
 /*
  * Return the SHA-1 of the nth object within the specified packfile.
  * Open the index if it is not already open.  The return value points
@@ -77,6 +87,11 @@ extern const unsigned char *nth_packed_object_sha1(struct packed_git *, uint32_t
  */
 extern const struct object_id *nth_packed_object_oid(struct object_id *, struct packed_git *, uint32_t n);
 
+/*
+ * Return the offset of the nth object within the specified packfile.
+ * The index must already be opened.
+ */
+extern off_t nth_packed_object_offset(const struct packed_git *, uint32_t n);
 
 extern void *unpack_entry(struct packed_git *, off_t, enum object_type *, unsigned long *);
 extern unsigned long unpack_object_header_buffer(const unsigned char *buf, unsigned long len, enum object_type *type, unsigned long *sizep);
@@ -94,4 +109,5 @@ extern int packed_object_info(struct packed_git *pack, off_t offset, struct obje
 
 extern void mark_bad_packed_object(struct packed_git *p, const unsigned char *sha1);
 extern const struct packed_git *has_packed_and_bad(const unsigned char *sha1);
+
 #endif
diff --git a/sha1_file.c b/sha1_file.c
index 34fbe8e51..2d22bc228 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1074,39 +1074,6 @@ int parse_sha1_header(const char *hdr, unsigned long *sizep)
 	return parse_sha1_header_extended(hdr, &oi, 0);
 }
 
-void check_pack_index_ptr(const struct packed_git *p, const void *vptr)
-{
-	const unsigned char *ptr = vptr;
-	const unsigned char *start = p->index_data;
-	const unsigned char *end = start + p->index_size;
-	if (ptr < start)
-		die(_("offset before start of pack index for %s (corrupt index?)"),
-		    p->pack_name);
-	/* No need to check for underflow; .idx files must be at least 8 bytes */
-	if (ptr >= end - 8)
-		die(_("offset beyond end of pack index for %s (truncated index?)"),
-		    p->pack_name);
-}
-
-off_t nth_packed_object_offset(const struct packed_git *p, uint32_t n)
-{
-	const unsigned char *index = p->index_data;
-	index += 4 * 256;
-	if (p->index_version == 1) {
-		return ntohl(*((uint32_t *)(index + 24 * n)));
-	} else {
-		uint32_t off;
-		index += 8 + p->num_objects * (20 + 4);
-		off = ntohl(*((uint32_t *)(index + 4 * n)));
-		if (!(off & 0x80000000))
-			return off;
-		index += p->num_objects * 4 + (off & 0x7fffffff) * 8;
-		check_pack_index_ptr(p, index);
-		return (((uint64_t)ntohl(*((uint32_t *)(index + 0)))) << 32) |
-				   ntohl(*((uint32_t *)(index + 4)));
-	}
-}
-
 off_t find_pack_entry_one(const unsigned char *sha1,
 				  struct packed_git *p)
 {
-- 
2.14.1.480.gb18f417b89-goog



* [PATCH v3 18/23] pack: move find_pack_entry_one(), is_pack_valid()
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (54 preceding siblings ...)
  2017-08-18 22:20 ` [PATCH v3 17/23] pack: move check_pack_index_ptr(), nth_packed_object_offset() Jonathan Tan
@ 2017-08-18 22:20 ` Jonathan Tan
  2017-08-18 22:20 ` [PATCH v3 19/23] pack: move find_sha1_pack() Jonathan Tan
                   ` (4 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-18 22:20 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
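Not part of the patch: a minimal standalone sketch of the lookup that
find_pack_entry_one() performs, namely narrowing the range with the
fan-out table and then binary-searching the sorted hashes. The name
lookup_hash() and the HASH_LEN macro are hypothetical. To stay
self-contained, the sketch assumes a fan-out table already converted to
host byte order and a table of bare 20-byte hashes, and it returns the
position of the match rather than a pack offset.

```c
#include <stdint.h>
#include <string.h>

#define HASH_LEN 20

/*
 * Hypothetical helper (not in git). fanout[b] holds the number of
 * hashes whose first byte is <= b; hashes is sorted ascending.
 * Returns the index of the match, or -1 if "want" is absent.
 */
static long lookup_hash(const uint32_t fanout[256],
			const unsigned char *hashes,
			const unsigned char *want)
{
	uint32_t hi = fanout[want[0]];
	uint32_t lo = want[0] ? fanout[want[0] - 1] : 0;

	while (lo < hi) {
		uint32_t mi = lo + (hi - lo) / 2;
		int cmp = memcmp(hashes + (size_t)mi * HASH_LEN, want, HASH_LEN);

		if (!cmp)
			return (long)mi;
		if (cmp > 0)
			hi = mi;
		else
			lo = mi + 1;
	}
	return -1;
}
```

The real function additionally handles the 24-byte stride of version 1
indexes and maps the found position to an offset with
nth_packed_object_offset(); the midpoint here is computed as
lo + (hi - lo) / 2 instead of (lo + hi) / 2, which is a choice made for
this sketch only.
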
 cache.h     |  8 -------
 packfile.c  | 76 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 packfile.h  |  9 ++++++--
 sha1_file.c | 73 ----------------------------------------------------------
 4 files changed, 82 insertions(+), 84 deletions(-)

diff --git a/cache.h b/cache.h
index ee75a4949..9297d078a 100644
--- a/cache.h
+++ b/cache.h
@@ -1626,14 +1626,6 @@ extern int odb_mkstemp(struct strbuf *template, const char *pattern);
  */
 extern int odb_pack_keep(const char *name);
 
-/*
- * If the object named sha1 is present in the specified packfile,
- * return its offset within the packfile; otherwise, return 0.
- */
-extern off_t find_pack_entry_one(const unsigned char *sha1, struct packed_git *);
-
-extern int is_pack_valid(struct packed_git *);
-
 /*
  * Iterate over the files in the loose-object parts of the object
  * directory "path", triggering the following callbacks:
diff --git a/packfile.c b/packfile.c
index e914422e9..ad7336594 100644
--- a/packfile.c
+++ b/packfile.c
@@ -7,6 +7,7 @@
 #include "delta.h"
 #include "list.h"
 #include "streaming.h"
+#include "sha1-lookup.h"
 
 char *odb_pack_name(struct strbuf *buf,
 		    const unsigned char *sha1,
@@ -509,7 +510,7 @@ static int open_packed_git_1(struct packed_git *p)
 	return 0;
 }
 
-int open_packed_git(struct packed_git *p)
+static int open_packed_git(struct packed_git *p)
 {
 	if (!open_packed_git_1(p))
 		return 0;
@@ -1700,3 +1701,76 @@ off_t nth_packed_object_offset(const struct packed_git *p, uint32_t n)
 				   ntohl(*((uint32_t *)(index + 4)));
 	}
 }
+
+off_t find_pack_entry_one(const unsigned char *sha1,
+				  struct packed_git *p)
+{
+	const uint32_t *level1_ofs = p->index_data;
+	const unsigned char *index = p->index_data;
+	unsigned hi, lo, stride;
+	static int debug_lookup = -1;
+
+	if (debug_lookup < 0)
+		debug_lookup = !!getenv("GIT_DEBUG_LOOKUP");
+
+	if (!index) {
+		if (open_pack_index(p))
+			return 0;
+		level1_ofs = p->index_data;
+		index = p->index_data;
+	}
+	if (p->index_version > 1) {
+		level1_ofs += 2;
+		index += 8;
+	}
+	index += 4 * 256;
+	hi = ntohl(level1_ofs[*sha1]);
+	lo = ((*sha1 == 0x0) ? 0 : ntohl(level1_ofs[*sha1 - 1]));
+	if (p->index_version > 1) {
+		stride = 20;
+	} else {
+		stride = 24;
+		index += 4;
+	}
+
+	if (debug_lookup)
+		printf("%02x%02x%02x... lo %u hi %u nr %"PRIu32"\n",
+		       sha1[0], sha1[1], sha1[2], lo, hi, p->num_objects);
+
+	while (lo < hi) {
+		unsigned mi = (lo + hi) / 2;
+		int cmp = hashcmp(index + mi * stride, sha1);
+
+		if (debug_lookup)
+			printf("lo %u hi %u rg %u mi %u\n",
+			       lo, hi, hi - lo, mi);
+		if (!cmp)
+			return nth_packed_object_offset(p, mi);
+		if (cmp > 0)
+			hi = mi;
+		else
+			lo = mi+1;
+	}
+	return 0;
+}
+
+int is_pack_valid(struct packed_git *p)
+{
+	/* An already open pack is known to be valid. */
+	if (p->pack_fd != -1)
+		return 1;
+
+	/* If the pack has one window completely covering the
+	 * file size, the pack is known to be valid even if
+	 * the descriptor is not currently open.
+	 */
+	if (p->windows) {
+		struct pack_window *w = p->windows;
+
+		if (!w->offset && w->len == p->pack_size)
+			return 1;
+	}
+
+	/* Force the pack to open to prove its valid. */
+	return !open_packed_git(p);
+}
diff --git a/packfile.h b/packfile.h
index 8deb84bd1..4fca6fb28 100644
--- a/packfile.h
+++ b/packfile.h
@@ -93,6 +93,13 @@ extern const struct object_id *nth_packed_object_oid(struct object_id *, struct
  */
 extern off_t nth_packed_object_offset(const struct packed_git *, uint32_t n);
 
+/*
+ * If the object named sha1 is present in the specified packfile,
+ * return its offset within the packfile; otherwise, return 0.
+ */
+extern off_t find_pack_entry_one(const unsigned char *sha1, struct packed_git *);
+
+extern int is_pack_valid(struct packed_git *);
 extern void *unpack_entry(struct packed_git *, off_t, enum object_type *, unsigned long *);
 extern unsigned long unpack_object_header_buffer(const unsigned char *buf, unsigned long len, enum object_type *type, unsigned long *sizep);
 extern unsigned long get_size_from_delta(struct packed_git *, struct pack_window **, off_t);
@@ -100,8 +107,6 @@ extern int unpack_object_header(struct packed_git *, struct pack_window **, off_
 
 extern void release_pack_memory(size_t);
 
-extern int open_packed_git(struct packed_git *p);
-
 /* global flag to enable extra checks when accessing packed objects */
 extern int do_check_packed_object_crc;
 
diff --git a/sha1_file.c b/sha1_file.c
index 2d22bc228..27714f5e1 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1074,79 +1074,6 @@ int parse_sha1_header(const char *hdr, unsigned long *sizep)
 	return parse_sha1_header_extended(hdr, &oi, 0);
 }
 
-off_t find_pack_entry_one(const unsigned char *sha1,
-				  struct packed_git *p)
-{
-	const uint32_t *level1_ofs = p->index_data;
-	const unsigned char *index = p->index_data;
-	unsigned hi, lo, stride;
-	static int debug_lookup = -1;
-
-	if (debug_lookup < 0)
-		debug_lookup = !!getenv("GIT_DEBUG_LOOKUP");
-
-	if (!index) {
-		if (open_pack_index(p))
-			return 0;
-		level1_ofs = p->index_data;
-		index = p->index_data;
-	}
-	if (p->index_version > 1) {
-		level1_ofs += 2;
-		index += 8;
-	}
-	index += 4 * 256;
-	hi = ntohl(level1_ofs[*sha1]);
-	lo = ((*sha1 == 0x0) ? 0 : ntohl(level1_ofs[*sha1 - 1]));
-	if (p->index_version > 1) {
-		stride = 20;
-	} else {
-		stride = 24;
-		index += 4;
-	}
-
-	if (debug_lookup)
-		printf("%02x%02x%02x... lo %u hi %u nr %"PRIu32"\n",
-		       sha1[0], sha1[1], sha1[2], lo, hi, p->num_objects);
-
-	while (lo < hi) {
-		unsigned mi = (lo + hi) / 2;
-		int cmp = hashcmp(index + mi * stride, sha1);
-
-		if (debug_lookup)
-			printf("lo %u hi %u rg %u mi %u\n",
-			       lo, hi, hi - lo, mi);
-		if (!cmp)
-			return nth_packed_object_offset(p, mi);
-		if (cmp > 0)
-			hi = mi;
-		else
-			lo = mi+1;
-	}
-	return 0;
-}
-
-int is_pack_valid(struct packed_git *p)
-{
-	/* An already open pack is known to be valid. */
-	if (p->pack_fd != -1)
-		return 1;
-
-	/* If the pack has one window completely covering the
-	 * file size, the pack is known to be valid even if
-	 * the descriptor is not currently open.
-	 */
-	if (p->windows) {
-		struct pack_window *w = p->windows;
-
-		if (!w->offset && w->len == p->pack_size)
-			return 1;
-	}
-
-	/* Force the pack to open to prove its valid. */
-	return !open_packed_git(p);
-}
-
 static int fill_pack_entry(const unsigned char *sha1,
 			   struct pack_entry *e,
 			   struct packed_git *p)
-- 
2.14.1.480.gb18f417b89-goog



* [PATCH v3 19/23] pack: move find_sha1_pack()
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (55 preceding siblings ...)
  2017-08-18 22:20 ` [PATCH v3 18/23] pack: move find_pack_entry_one(), is_pack_valid() Jonathan Tan
@ 2017-08-18 22:20 ` Jonathan Tan
  2017-08-18 22:20 ` [PATCH v3 20/23] pack: move find_pack_entry() and make it global Jonathan Tan
                   ` (3 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-18 22:20 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 cache.h       |  3 ---
 http-push.c   |  1 +
 http-walker.c |  1 +
 packfile.c    | 13 +++++++++++++
 packfile.h    |  3 +++
 sha1_file.c   | 13 -------------
 6 files changed, 18 insertions(+), 16 deletions(-)

diff --git a/cache.h b/cache.h
index 9297d078a..1e90bb754 100644
--- a/cache.h
+++ b/cache.h
@@ -1608,9 +1608,6 @@ struct pack_entry {
 	struct packed_git *p;
 };
 
-extern struct packed_git *find_sha1_pack(const unsigned char *sha1,
-					 struct packed_git *packs);
-
 /*
  * Create a temporary file rooted in the object database directory, or
  * die on failure. The filename is taken from "pattern", which should have the
diff --git a/http-push.c b/http-push.c
index c91f40a61..e4c9b065c 100644
--- a/http-push.c
+++ b/http-push.c
@@ -11,6 +11,7 @@
 #include "list-objects.h"
 #include "sigchain.h"
 #include "argv-array.h"
+#include "packfile.h"
 
 #ifdef EXPAT_NEEDS_XMLPARSE_H
 #include <xmlparse.h>
diff --git a/http-walker.c b/http-walker.c
index ee049cb13..1ae8363de 100644
--- a/http-walker.c
+++ b/http-walker.c
@@ -4,6 +4,7 @@
 #include "http.h"
 #include "list.h"
 #include "transport.h"
+#include "packfile.h"
 
 struct alt_base {
 	char *base;
diff --git a/packfile.c b/packfile.c
index ad7336594..ba3a5eb3a 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1774,3 +1774,16 @@ int is_pack_valid(struct packed_git *p)
 	/* Force the pack to open to prove its valid. */
 	return !open_packed_git(p);
 }
+
+struct packed_git *find_sha1_pack(const unsigned char *sha1,
+				  struct packed_git *packs)
+{
+	struct packed_git *p;
+
+	for (p = packs; p; p = p->next) {
+		if (find_pack_entry_one(sha1, p))
+			return p;
+	}
+	return NULL;
+
+}
diff --git a/packfile.h b/packfile.h
index 4fca6fb28..a4ff6f6ed 100644
--- a/packfile.h
+++ b/packfile.h
@@ -42,6 +42,9 @@ extern void install_packed_git(struct packed_git *pack);
  */
 unsigned long approximate_object_count(void);
 
+extern struct packed_git *find_sha1_pack(const unsigned char *sha1,
+					 struct packed_git *packs);
+
 extern void pack_report(void);
 
 /*
diff --git a/sha1_file.c b/sha1_file.c
index 27714f5e1..8853672d2 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1127,19 +1127,6 @@ static int find_pack_entry(const unsigned char *sha1, struct pack_entry *e)
 	return 0;
 }
 
-struct packed_git *find_sha1_pack(const unsigned char *sha1,
-				  struct packed_git *packs)
-{
-	struct packed_git *p;
-
-	for (p = packs; p; p = p->next) {
-		if (find_pack_entry_one(sha1, p))
-			return p;
-	}
-	return NULL;
-
-}
-
 static int sha1_loose_object_info(const unsigned char *sha1,
 				  struct object_info *oi,
 				  int flags)
-- 
2.14.1.480.gb18f417b89-goog



* [PATCH v3 20/23] pack: move find_pack_entry() and make it global
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (56 preceding siblings ...)
  2017-08-18 22:20 ` [PATCH v3 19/23] pack: move find_sha1_pack() Jonathan Tan
@ 2017-08-18 22:20 ` Jonathan Tan
  2017-08-18 22:20 ` [PATCH v3 21/23] pack: move has_sha1_pack() Jonathan Tan
                   ` (2 subsequent siblings)
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-18 22:20 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

This function needs to be global as it is used by sha1_file.c and will
be used by packfile.c.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 packfile.c  | 53 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 packfile.h  |  2 ++
 sha1_file.c | 53 -----------------------------------------------------
 3 files changed, 55 insertions(+), 53 deletions(-)

diff --git a/packfile.c b/packfile.c
index ba3a5eb3a..ae5395f5f 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1787,3 +1787,56 @@ struct packed_git *find_sha1_pack(const unsigned char *sha1,
 	return NULL;
 
 }
+
+static int fill_pack_entry(const unsigned char *sha1,
+			   struct pack_entry *e,
+			   struct packed_git *p)
+{
+	off_t offset;
+
+	if (p->num_bad_objects) {
+		unsigned i;
+		for (i = 0; i < p->num_bad_objects; i++)
+			if (!hashcmp(sha1, p->bad_object_sha1 + 20 * i))
+				return 0;
+	}
+
+	offset = find_pack_entry_one(sha1, p);
+	if (!offset)
+		return 0;
+
+	/*
+	 * We are about to tell the caller where they can locate the
+	 * requested object.  We better make sure the packfile is
+	 * still here and can be accessed before supplying that
+	 * answer, as it may have been deleted since the index was
+	 * loaded!
+	 */
+	if (!is_pack_valid(p))
+		return 0;
+	e->offset = offset;
+	e->p = p;
+	hashcpy(e->sha1, sha1);
+	return 1;
+}
+
+/*
+ * Iff a pack file contains the object named by sha1, return true and
+ * store its location to e.
+ */
+int find_pack_entry(const unsigned char *sha1, struct pack_entry *e)
+{
+	struct mru_entry *p;
+
+	prepare_packed_git();
+	if (!packed_git)
+		return 0;
+
+	for (p = packed_git_mru->head; p; p = p->next) {
+		if (fill_pack_entry(sha1, e, p->item)) {
+			mru_mark(packed_git_mru, p);
+			return 1;
+		}
+	}
+	return 0;
+}
diff --git a/packfile.h b/packfile.h
index a4ff6f6ed..c9b4fcfaf 100644
--- a/packfile.h
+++ b/packfile.h
@@ -118,4 +118,6 @@ extern int packed_object_info(struct packed_git *pack, off_t offset, struct obje
 extern void mark_bad_packed_object(struct packed_git *p, const unsigned char *sha1);
 extern const struct packed_git *has_packed_and_bad(const unsigned char *sha1);
 
+extern int find_pack_entry(const unsigned char *sha1, struct pack_entry *e);
+
 #endif
diff --git a/sha1_file.c b/sha1_file.c
index 8853672d2..76c86639c 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1074,59 +1074,6 @@ int parse_sha1_header(const char *hdr, unsigned long *sizep)
 	return parse_sha1_header_extended(hdr, &oi, 0);
 }
 
-static int fill_pack_entry(const unsigned char *sha1,
-			   struct pack_entry *e,
-			   struct packed_git *p)
-{
-	off_t offset;
-
-	if (p->num_bad_objects) {
-		unsigned i;
-		for (i = 0; i < p->num_bad_objects; i++)
-			if (!hashcmp(sha1, p->bad_object_sha1 + 20 * i))
-				return 0;
-	}
-
-	offset = find_pack_entry_one(sha1, p);
-	if (!offset)
-		return 0;
-
-	/*
-	 * We are about to tell the caller where they can locate the
-	 * requested object.  We better make sure the packfile is
-	 * still here and can be accessed before supplying that
-	 * answer, as it may have been deleted since the index was
-	 * loaded!
-	 */
-	if (!is_pack_valid(p))
-		return 0;
-	e->offset = offset;
-	e->p = p;
-	hashcpy(e->sha1, sha1);
-	return 1;
-}
-
-/*
- * Iff a pack file contains the object named by sha1, return true and
- * store its location to e.
- */
-static int find_pack_entry(const unsigned char *sha1, struct pack_entry *e)
-{
-	struct mru_entry *p;
-
-	prepare_packed_git();
-	if (!packed_git)
-		return 0;
-
-	for (p = packed_git_mru->head; p; p = p->next) {
-		if (fill_pack_entry(sha1, e, p->item)) {
-			mru_mark(packed_git_mru, p);
-			return 1;
-		}
-	}
-	return 0;
-}
-
 static int sha1_loose_object_info(const unsigned char *sha1,
 				  struct object_info *oi,
 				  int flags)
-- 
2.14.1.480.gb18f417b89-goog



* [PATCH v3 21/23] pack: move has_sha1_pack()
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (57 preceding siblings ...)
  2017-08-18 22:20 ` [PATCH v3 20/23] pack: move find_pack_entry() and make it global Jonathan Tan
@ 2017-08-18 22:20 ` Jonathan Tan
  2017-08-18 22:20 ` [PATCH v3 22/23] pack: move has_pack_index() Jonathan Tan
  2017-08-18 22:20 ` [PATCH v3 23/23] pack: move for_each_packed_object() Jonathan Tan
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-18 22:20 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 builtin/prune-packed.c | 1 +
 cache.h                | 2 --
 diff.c                 | 1 +
 packfile.c             | 6 ++++++
 packfile.h             | 2 ++
 revision.c             | 1 +
 sha1_file.c            | 6 ------
 7 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/builtin/prune-packed.c b/builtin/prune-packed.c
index ac978ad40..97bfde24b 100644
--- a/builtin/prune-packed.c
+++ b/builtin/prune-packed.c
@@ -2,6 +2,7 @@
 #include "cache.h"
 #include "progress.h"
 #include "parse-options.h"
+#include "packfile.h"
 
 static const char * const prune_packed_usage[] = {
 	N_("git prune-packed [-n | --dry-run] [-q | --quiet]"),
diff --git a/cache.h b/cache.h
index 1e90bb754..286891df4 100644
--- a/cache.h
+++ b/cache.h
@@ -1198,8 +1198,6 @@ extern int check_sha1_signature(const unsigned char *sha1, void *buf, unsigned l
 
 extern int finalize_object_file(const char *tmpfile, const char *filename);
 
-extern int has_sha1_pack(const unsigned char *sha1);
-
 /*
  * Open the loose object at path, check its sha1, and return the contents,
  * type, and size. If the object is a blob, then "contents" may return NULL,
diff --git a/diff.c b/diff.c
index 85e714f6c..e9a1d6162 100644
--- a/diff.c
+++ b/diff.c
@@ -20,6 +20,7 @@
 #include "string-list.h"
 #include "argv-array.h"
 #include "graph.h"
+#include "packfile.h"
 
 #ifdef NO_FAST_WORKING_DIRECTORY
 #define FAST_WORKING_DIRECTORY 0
diff --git a/packfile.c b/packfile.c
index ae5395f5f..7472ab816 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1840,3 +1840,9 @@ int find_pack_entry(const unsigned char *sha1, struct pack_entry *e)
 	}
 	return 0;
 }
+
+int has_sha1_pack(const unsigned char *sha1)
+{
+	struct pack_entry e;
+	return find_pack_entry(sha1, &e);
+}
diff --git a/packfile.h b/packfile.h
index c9b4fcfaf..4945a1505 100644
--- a/packfile.h
+++ b/packfile.h
@@ -120,4 +120,6 @@ extern const struct packed_git *has_packed_and_bad(const unsigned char *sha1);
 
 extern int find_pack_entry(const unsigned char *sha1, struct pack_entry *e);
 
+extern int has_sha1_pack(const unsigned char *sha1);
+
 #endif
diff --git a/revision.c b/revision.c
index 6603af944..db97b9221 100644
--- a/revision.c
+++ b/revision.c
@@ -19,6 +19,7 @@
 #include "dir.h"
 #include "cache-tree.h"
 #include "bisect.h"
+#include "packfile.h"
 
 volatile show_early_output_fn_t show_early_output;
 
diff --git a/sha1_file.c b/sha1_file.c
index 76c86639c..e4975e0ae 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1630,12 +1630,6 @@ int has_pack_index(const unsigned char *sha1)
 	return 1;
 }
 
-int has_sha1_pack(const unsigned char *sha1)
-{
-	struct pack_entry e;
-	return find_pack_entry(sha1, &e);
-}
-
 int has_sha1_file_with_flags(const unsigned char *sha1, int flags)
 {
 	if (!startup_info->have_repository)
-- 
2.14.1.480.gb18f417b89-goog



* [PATCH v3 22/23] pack: move has_pack_index()
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (58 preceding siblings ...)
  2017-08-18 22:20 ` [PATCH v3 21/23] pack: move has_sha1_pack() Jonathan Tan
@ 2017-08-18 22:20 ` Jonathan Tan
  2017-08-18 22:20 ` [PATCH v3 23/23] pack: move for_each_packed_object() Jonathan Tan
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-18 22:20 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 cache.h     | 2 --
 packfile.c  | 8 ++++++++
 packfile.h  | 2 ++
 sha1_file.c | 8 --------
 4 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/cache.h b/cache.h
index 286891df4..dcbe37a3f 100644
--- a/cache.h
+++ b/cache.h
@@ -1233,8 +1233,6 @@ extern int has_object_file_with_flags(const struct object_id *oid, int flags);
  */
 extern int has_loose_object_nonlocal(const unsigned char *sha1);
 
-extern int has_pack_index(const unsigned char *sha1);
-
 extern void assert_sha1_type(const unsigned char *sha1, enum object_type expect);
 
 /* Helper to check and "touch" a file */
diff --git a/packfile.c b/packfile.c
index 7472ab816..7e293761b 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1846,3 +1846,11 @@ int has_sha1_pack(const unsigned char *sha1)
 	struct pack_entry e;
 	return find_pack_entry(sha1, &e);
 }
+
+int has_pack_index(const unsigned char *sha1)
+{
+	struct stat st;
+	if (stat(sha1_pack_index_name(sha1), &st))
+		return 0;
+	return 1;
+}
diff --git a/packfile.h b/packfile.h
index 4945a1505..1b6ea832c 100644
--- a/packfile.h
+++ b/packfile.h
@@ -122,4 +122,6 @@ extern int find_pack_entry(const unsigned char *sha1, struct pack_entry *e);
 
 extern int has_sha1_pack(const unsigned char *sha1);
 
+extern int has_pack_index(const unsigned char *sha1);
+
 #endif
diff --git a/sha1_file.c b/sha1_file.c
index e4975e0ae..fa422435f 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1622,14 +1622,6 @@ int force_object_loose(const unsigned char *sha1, time_t mtime)
 	return ret;
 }
 
-int has_pack_index(const unsigned char *sha1)
-{
-	struct stat st;
-	if (stat(sha1_pack_index_name(sha1), &st))
-		return 0;
-	return 1;
-}
-
 int has_sha1_file_with_flags(const unsigned char *sha1, int flags)
 {
 	if (!startup_info->have_repository)
-- 
2.14.1.480.gb18f417b89-goog
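
The has_pack_index() helper moved above is an existence check built on
stat(). A minimal sketch of the same pattern follows, assuming the caller
already has the path in hand; file_exists() is a hypothetical name, and
the real function derives the path from the pack's SHA-1 through
sha1_pack_index_name() rather than taking it as an argument.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/*
 * Hypothetical stand-in for has_pack_index(): returns 1 if stat() succeeds
 * on `path`, 0 otherwise. Git itself computes the path from the pack's
 * SHA-1; this sketch takes the path directly so that it is self-contained.
 */
int file_exists(const char *path)
{
	struct stat st;

	assert(path != NULL);
	/* stat() returns 0 on success and -1 on any failure (e.g. ENOENT). */
	if (stat(path, &st))
		return 0;
	return 1;
}
```

Note that any stat() failure, not only a missing file, is reported as
"not present", which matches the shape of the moved function.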



* [PATCH v3 23/23] pack: move for_each_packed_object()
  2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
                   ` (59 preceding siblings ...)
  2017-08-18 22:20 ` [PATCH v3 22/23] pack: move has_pack_index() Jonathan Tan
@ 2017-08-18 22:20 ` Jonathan Tan
  60 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-18 22:20 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 builtin/cat-file.c |  1 +
 cache.h            |  7 +------
 packfile.c         | 40 ++++++++++++++++++++++++++++++++++++++++
 packfile.h         | 11 +++++++++++
 reachable.c        |  1 +
 sha1_file.c        | 40 ----------------------------------------
 6 files changed, 54 insertions(+), 46 deletions(-)

diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 96b786e48..be5936017 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -12,6 +12,7 @@
 #include "streaming.h"
 #include "tree-walk.h"
 #include "sha1-array.h"
+#include "packfile.h"
 
 struct batch_options {
 	int enabled;
diff --git a/cache.h b/cache.h
index dcbe37a3f..2eeb21b02 100644
--- a/cache.h
+++ b/cache.h
@@ -1668,17 +1668,12 @@ int for_each_loose_file_in_objdir_buf(struct strbuf *path,
 				      void *data);
 
 /*
- * Iterate over loose and packed objects in both the local
+ * Iterate over loose objects in both the local
  * repository and any alternates repositories (unless the
  * LOCAL_ONLY flag is set).
  */
 #define FOR_EACH_OBJECT_LOCAL_ONLY 0x1
-typedef int each_packed_object_fn(const struct object_id *oid,
-				  struct packed_git *pack,
-				  uint32_t pos,
-				  void *data);
 extern int for_each_loose_object(each_loose_object_fn, void *, unsigned flags);
-extern int for_each_packed_object(each_packed_object_fn, void *, unsigned flags);
 
 struct object_info {
 	/* Request */
diff --git a/packfile.c b/packfile.c
index 7e293761b..1f11ef5b8 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1854,3 +1854,43 @@ int has_pack_index(const unsigned char *sha1)
 		return 0;
 	return 1;
 }
+
+static int for_each_object_in_pack(struct packed_git *p, each_packed_object_fn cb, void *data)
+{
+	uint32_t i;
+	int r = 0;
+
+	for (i = 0; i < p->num_objects; i++) {
+		struct object_id oid;
+
+		if (!nth_packed_object_oid(&oid, p, i))
+			return error("unable to get sha1 of object %u in %s",
+				     i, p->pack_name);
+
+		r = cb(&oid, p, i, data);
+		if (r)
+			break;
+	}
+	return r;
+}
+
+int for_each_packed_object(each_packed_object_fn cb, void *data, unsigned flags)
+{
+	struct packed_git *p;
+	int r = 0;
+	int pack_errors = 0;
+
+	prepare_packed_git();
+	for (p = packed_git; p; p = p->next) {
+		if ((flags & FOR_EACH_OBJECT_LOCAL_ONLY) && !p->pack_local)
+			continue;
+		if (open_pack_index(p)) {
+			pack_errors = 1;
+			continue;
+		}
+		r = for_each_object_in_pack(p, cb, data);
+		if (r)
+			break;
+	}
+	return r ? r : pack_errors;
+}
diff --git a/packfile.h b/packfile.h
index 1b6ea832c..ca4cc3b97 100644
--- a/packfile.h
+++ b/packfile.h
@@ -124,4 +124,15 @@ extern int has_sha1_pack(const unsigned char *sha1);
 
 extern int has_pack_index(const unsigned char *sha1);
 
+/*
+ * Iterate over packed objects in both the local
+ * repository and any alternates repositories (unless the
+ * FOR_EACH_OBJECT_LOCAL_ONLY flag, defined in cache.h, is set).
+ */
+typedef int each_packed_object_fn(const struct object_id *oid,
+				  struct packed_git *pack,
+				  uint32_t pos,
+				  void *data);
+extern int for_each_packed_object(each_packed_object_fn, void *, unsigned flags);
+
 #endif
diff --git a/reachable.c b/reachable.c
index c62efbfd4..d1ac5d97e 100644
--- a/reachable.c
+++ b/reachable.c
@@ -9,6 +9,7 @@
 #include "cache-tree.h"
 #include "progress.h"
 #include "list-objects.h"
+#include "packfile.h"
 
 struct connectivity_progress {
 	struct progress *progress;
diff --git a/sha1_file.c b/sha1_file.c
index fa422435f..0bb2343f8 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -2014,46 +2014,6 @@ int for_each_loose_object(each_loose_object_fn cb, void *data, unsigned flags)
 	return foreach_alt_odb(loose_from_alt_odb, &alt);
 }
 
-static int for_each_object_in_pack(struct packed_git *p, each_packed_object_fn cb, void *data)
-{
-	uint32_t i;
-	int r = 0;
-
-	for (i = 0; i < p->num_objects; i++) {
-		struct object_id oid;
-
-		if (!nth_packed_object_oid(&oid, p, i))
-			return error("unable to get sha1 of object %u in %s",
-				     i, p->pack_name);
-
-		r = cb(&oid, p, i, data);
-		if (r)
-			break;
-	}
-	return r;
-}
-
-int for_each_packed_object(each_packed_object_fn cb, void *data, unsigned flags)
-{
-	struct packed_git *p;
-	int r = 0;
-	int pack_errors = 0;
-
-	prepare_packed_git();
-	for (p = packed_git; p; p = p->next) {
-		if ((flags & FOR_EACH_OBJECT_LOCAL_ONLY) && !p->pack_local)
-			continue;
-		if (open_pack_index(p)) {
-			pack_errors = 1;
-			continue;
-		}
-		r = for_each_object_in_pack(p, cb, data);
-		if (r)
-			break;
-	}
-	return r ? r : pack_errors;
-}
-
 static int check_stream_sha1(git_zstream *stream,
 			     const char *hdr,
 			     unsigned long size,
-- 
2.14.1.480.gb18f417b89-goog
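
The control flow of for_each_packed_object() moved above can be sketched
outside of Git as follows. This is an illustration only: struct mock_pack,
the mock_* functions, SKETCH_LOCAL_ONLY and count_cb() are hypothetical
stand-ins, and the index_broken field merely simulates open_pack_index()
failing. What the sketch keeps from the patch is the contract: a nonzero
callback result stops the walk and is returned as-is, an unreadable pack is
skipped but remembered, and the final result is "r ? r : pack_errors".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_LOCAL_ONLY 0x1	/* stands in for FOR_EACH_OBJECT_LOCAL_ONLY */

/* Hypothetical miniature of struct packed_git: only what the loop needs. */
struct mock_pack {
	struct mock_pack *next;
	uint32_t num_objects;
	int pack_local;		/* nonzero if the pack is in the local repository */
	int index_broken;	/* nonzero simulates open_pack_index() failing */
};

typedef int mock_each_object_fn(struct mock_pack *pack, uint32_t pos, void *data);

/* Visit every object in one pack; stop at the first nonzero callback result. */
int mock_for_each_object_in_pack(struct mock_pack *p, mock_each_object_fn cb, void *data)
{
	uint32_t i;
	int r = 0;

	for (i = 0; i < p->num_objects; i++) {
		r = cb(p, i, data);
		if (r)
			break;
	}
	return r;
}

int mock_for_each_packed_object(struct mock_pack *packs, mock_each_object_fn cb,
				void *data, unsigned flags)
{
	struct mock_pack *p;
	int r = 0;
	int pack_errors = 0;

	assert(cb != NULL);
	for (p = packs; p; p = p->next) {
		if ((flags & SKETCH_LOCAL_ONLY) && !p->pack_local)
			continue;
		if (p->index_broken) {
			/* Remember the failure but keep walking other packs. */
			pack_errors = 1;
			continue;
		}
		r = mock_for_each_object_in_pack(p, cb, data);
		if (r)
			break;
	}
	/* A callback's own result takes precedence over a pack error. */
	return r ? r : pack_errors;
}

/* Example callback: count objects, aborting with -1 once a limit is hit. */
struct count_state {
	uint32_t seen;
	uint32_t limit;
};

int count_cb(struct mock_pack *pack, uint32_t pos, void *data)
{
	struct count_state *st = data;

	(void)pack;
	(void)pos;
	if (st->seen == st->limit)
		return -1;
	st->seen++;
	return 0;
}
```

A caller would chain a few struct mock_pack values through their next
pointers and pass count_cb together with a struct count_state as the data
argument.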



* Re: [PATCH v2 00/25] Move exported packfile funcs to its own file
  2017-08-11 19:41   ` [PATCH v2 00/25] Move exported packfile funcs to its own file Ben Peart
@ 2017-08-18 23:36     ` Jonathan Tan
  0 siblings, 0 replies; 88+ messages in thread
From: Jonathan Tan @ 2017-08-18 23:36 UTC (permalink / raw)
  To: Ben Peart; +Cc: git, gitster, sbeller

On Fri, 11 Aug 2017 15:41:28 -0400
Ben Peart <peartben@gmail.com> wrote:

> Nice to see the pack file functions being refactored out.  I looked at 
> the end result and it looked good to me.

Thanks.

> Do you have the energy to do a similar refactoring for the remaining 
> public functions residing in sha1_file.c?  Perhaps a new sha1_file.h? It 
> would be nice to get more things out of cache.h. :)

I agree that that would be desirable, but for now I'll leave that for
someone else to do :-)


* Re: [PATCH v3 00/23] Move exported packfile funcs to its own file
  2017-08-18 22:20 ` [PATCH v3 00/23] Move exported packfile funcs to its own file Jonathan Tan
@ 2017-08-19  7:33   ` Junio C Hamano
  2017-08-20  6:40     ` Junio C Hamano
  0 siblings, 1 reply; 88+ messages in thread
From: Junio C Hamano @ 2017-08-19  7:33 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, Martin Koegler

Jonathan Tan <jonathantanmy@google.com> writes:

>> You'd need to double check, but I think the topics that cause
>> trouble are rs/find-pack-entry-bisection and jk/drop-sha1-entry-pos;
>> you can start from v2.14.1 and merge these topics on top and then
>> build your change on top.  That would allow you to start cooking
>> before both of them graduate to 'master', as I expect they are both
>> quick-to-next material.  There might be other topics that interfere
>> with what you are doing, but you can easily find out what they are
>> if you do a trial merge to 'next' and 'pu' yourself.
>
> OK - in addition to the 2 you mentioned, I have found some others
> (likely added after you wrote that). The complete list is:
>  - rs/find-pack-entry-bisection
>  - jk/drop-sha1-entry-pos
>  - jt/sha1-file-cleanup (formerly part of this set)
>  - mk/use-size-t-in-zlib
>  - rs/unpack-entry-leakfix
>
> I have merged all of these and rebased my patches on top.
>
> Other changes:
>  - Used packfile.h instead of pack.h (following most people's
>    preference)
>  - Ensured that I added functions to packfile.h retaining the order they
>    were originally in, so that if you run "git diff <base> --color-moved
>    --patience", there are far fewer zebra stripes
>
> The merge base commit can be accessed online [1], if you need it.
>
> [1] https://github.com/jonathantanmy/git/commits/packmigrate

Thanks.

I have to say that this was a painful topic to integrate.

As you may know, the mk/use-size-t-in-zlib topic is being retracted
and getting rerolled as a larger size_t series, most of which still
needs help in reviewing.

The jt/sha1-file-cleanup topic is the only one among the other four
that are still not in 'next', and I think that topic, as well as the
other three, are all good and serve as a good base to build on top.
So I first rebuilt your patches on top of these four topics.  This
took some time but it wasn't all that painful.

The result cleanly merged to 'pu', I think, but it resulted in a
rather noisy conflict when I attempted to merge it to 'next'.  I
want both a merge to 'next' and a merge to 'pu' to be reasonably
clean for any topic to be viable [*1*].  Otherwise, the "initially
queue in 'pu', then cook in 'next', and eventually graduate to
'master'" workflow would not work well.

Anyway, I _think_ I finally got the conflict resolutions right for
merges of the topic to 'next', 'jch' and 'pu', so I will push the
result of merging to 'pu' out.  This unfortunately makes Martin's
ongoing size_t topic unmergeable to any of the integration branches
as-is, but let's make sure that topic is reviewed properly first (I
haven't seen people comment much on the individual patches, other
than a select few).


[Footnote]

*1* There is an intermediate point between 'master' and 'pu' called
    'jch', and I try to make sure any new topic merges cleanly to
    that branch, too, when accepting it.  That is the branch I use
    for everyday work as an early guinea-pig.


* Re: [PATCH v3 00/23] Move exported packfile funcs to its own file
  2017-08-19  7:33   ` Junio C Hamano
@ 2017-08-20  6:40     ` Junio C Hamano
  2017-08-21 18:40       ` Jonathan Tan
  0 siblings, 1 reply; 88+ messages in thread
From: Junio C Hamano @ 2017-08-20  6:40 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, Martin Koegler

Junio C Hamano <gitster@pobox.com> writes:

> I have to say that this was a painful topic to integrate.
>
> As you may know, the mk/use-size-t-in-zlib topic is being retracted
> and getting rerolled as a larger size_t series, most of which still
> needs help in reviewing.
>
> The jt/sha1-file-cleanup topic is the only one among the other four
> that are still not in 'next', and I think that topic, as well as the
> other three, are all good and serve as a good base to build on top.
> So I first rebuilt your patches on top of these four topics.  This
> took some time but it wasn't all that painful.

... but it turns out that I screwed it up in at least one place,
making Linux32 build fail (Thanks Lars and folks who pushed hard to
arrange Travis to build all my pushes to 'pu').  I'm pushing out my
second attempt.  Let's see how it goes.

A change like this that only moves code around without changing
anything is painful for everybody to keep around, as nobody can
safely touch the affected code while it is in flight.  On the other
hand, as long as it is only moving code around, such a change is
reasonably safe, and it is relatively easy to ensure that no change
other than code movement is involved.  So let's

 (1) make sure that the topics this depends on are sound by
     re-reading them once again, and merge them quickly down to
     'master';

 (2) merge this topic to 'next', optionally after rebasing it on
     'master', after (1) happens; and

 (3) quickly merge it to 'master', to get it over with.

In the meantime we'd need to refrain from taking code that touches
things that are moved by this series.

I plan to be offline for a week or so near the end of this month, so
I am hoping that we can do all of the above before that. That may
make us break our usual "tip of the 'master' is more stable and
robust than any released version" promise by potentially leaving it
broken for a while, but nobody can build on top of a fluid codebase
that is in the process of moving things around in a big way, so it
might not be such a bad idea to make it coincide with the period
when the tree must become quiescent due to my being offline.  We'll
see.

Thanks.


* Re: [PATCH v3 00/23] Move exported packfile funcs to its own file
  2017-08-20  6:40     ` Junio C Hamano
@ 2017-08-21 18:40       ` Jonathan Tan
  2017-08-21 22:55         ` Junio C Hamano
  0 siblings, 1 reply; 88+ messages in thread
From: Jonathan Tan @ 2017-08-21 18:40 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Martin Koegler

On Sat, 19 Aug 2017 23:40:33 -0700
Junio C Hamano <gitster@pobox.com> wrote:

> Junio C Hamano <gitster@pobox.com> writes:
> 
> > I have to say that this was a painful topic to integrate.
> >
> > As you may know, the mk/use-size-t-in-zlib topic is being retracted
> > and getting rerolled as a larger size_t series, most of which still
> > needs help in reviewing.
> >
> > The jt/sha1-file-cleanup topic is the only one among the other four
> > that are still not in 'next', and I think that topic, as well as the
> > other three, are all good and serve as a good base to build on top.
> > So I first rebuilt your patches on top of these four topics.  This
> > took some time but it wasn't all that painful.
> 
> ... but it turns out that I screwed it up in at least one place,
> making Linux32 build fail (Thanks Lars and folks who pushed hard to
> arrange Travis to build all my pushes to 'pu').  I'm pushing out my
> second attempt.  Let's see how it goes.

Thanks.

> A change like this that only moves code around without changing
> anything is painful for everybody to keep around, as nobody can
> safely touch the affected code while it is in flight.  On the other
> hand, as long as it is only moving code around, such a change is
> reasonably safe, and it is relatively easy to ensure that no change
> other than code movement is involved.  So let's
> 
>  (1) make sure that the topics this depends on are sound by
>      re-reading them once again, and merge them quickly down to
>      'master';

I took a look and they look sound.

 - rs/find-pack-entry-bisection resolves an issue that has been present
   since commit 1f68855 ("[PATCH] Teach read_sha1_file() and friends
   about packed git object store.", 2005-06-27).
 - jk/drop-sha1-entry-pos is some code deletion.
 - rs/unpack-entry-leakfix ensures that delta_stack is freed. This
   function does not (for example) expose the destination of delta_stack
   to its caller, so it is correct that delta_stack should be freed
   unless it points to the local buffer, just like in the success case.
 - jt/sha1-file-cleanup (my patches) still looks OK to me.
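
The cleanup rule described for rs/unpack-entry-leakfix (free the stack
unless it still points to the local buffer, on the error path as well as on
success) can be sketched as below. Everything here is hypothetical and is
not Git's code: collect(), PREALLOC, the fail_at parameter and the
live_heap_blocks counter exist only to make the pattern visible and
checkable.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define PREALLOC 4	/* hypothetical size of the on-stack buffer */

int live_heap_blocks;	/* bookkeeping for this sketch only */

/*
 * Push the values 0..n-1 onto a stack that starts in a local array and
 * moves to the heap once it outgrows it. If fail_at >= 0, an error is
 * simulated at that step. Returns the number of values pushed, or -1 on
 * error. Every exit goes through "out", so the heap copy is never leaked.
 */
int collect(int n, int fail_at)
{
	int small_stack[PREALLOC];
	int *stack = small_stack;
	int nr = 0, alloc = PREALLOC;
	int ret = -1;
	int i;

	assert(n >= 0);
	for (i = 0; i < n; i++) {
		if (i == fail_at)
			goto out;	/* error path: must still clean up */
		if (nr == alloc) {
			int *grown;

			alloc *= 2;
			if (stack == small_stack) {
				/* First overflow: copy out of the local buffer. */
				grown = malloc(alloc * sizeof(*grown));
				if (!grown)
					goto out;
				memcpy(grown, stack, nr * sizeof(*grown));
				live_heap_blocks++;
			} else {
				grown = realloc(stack, alloc * sizeof(*grown));
				if (!grown)
					goto out;	/* old block is still valid */
			}
			stack = grown;
		}
		stack[nr++] = i;
	}
	ret = nr;
out:
	/* Free only if the stack has left the local buffer. */
	if (stack != small_stack) {
		free(stack);
		live_heap_blocks--;
	}
	return ret;
}
```

Because the stack is never handed to the caller, the single check at "out"
is sufficient for both the success and the failure case.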

In your latest "What's cooking" (Aug 2017, #04; Fri, 18), you mentioned
that the first 3 will be merged to master, and the 4th will be merged to
next.

I didn't look at mk/use-size-t-in-zlib, which (as you said) is still
under review.

>  (2) merge this topic to 'next', optionally after rebasing it on
>      'master', after (1) happens; and
> 
>  (3) quickly merge it to 'master', to get it over with.
> 
> In the meantime we'd need to refrain from taking code that touches
> things that are moved by this series.
> 
> I plan to be offline for a week or so near the end of this month, so
> I am hoping that we can do all of the above before that. That may
> make us break our usual "tip of the 'master' is more stable and
> robust than any released version" promise by potentially leaving it
> broken for a while, but nobody can build on top of a fluid codebase
> that is in the process of moving things around in a big way, so it
> might not be such a bad idea to make it coincide with the period
> when the tree must become quiescent due to my being offline.  We'll
> see.
> 
> Thanks.


* Re: [PATCH v3 00/23] Move exported packfile funcs to its own file
  2017-08-21 18:40       ` Jonathan Tan
@ 2017-08-21 22:55         ` Junio C Hamano
  0 siblings, 0 replies; 88+ messages in thread
From: Junio C Hamano @ 2017-08-21 22:55 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, Martin Koegler

Jonathan Tan <jonathantanmy@google.com> writes:

> On Sat, 19 Aug 2017 23:40:33 -0700
> Junio C Hamano <gitster@pobox.com> wrote:
>
>> Junio C Hamano <gitster@pobox.com> writes:
>> 
>> > I have to say that this was a painful topic to integrate.
>> >
>> > As you may know, the mk/use-size-t-in-zlib topic is being retracted
>> > and getting rerolled as a larger size_t series, most of which still
>> > needs help in reviewing.
>> >
>> > The jt/sha1-file-cleanup topic is the only one among the other four
>> > that are still not in 'next', and I think that topic, as well as the
>> > other three, are all good and serve as a good base to build on top.
>> > So I first rebuilt your patches on top of these four topics.  This
>> > took some time but it wasn't all that painful.
>> 
>> ... but it turns out that I screwed it up in at least one place,
>> making Linux32 build fail (Thanks Lars and folks who pushed hard to
>> arrange Travis to build all my pushes to 'pu').  I'm pushing out my
>> second attempt.  Let's see how it goes.
>
> Thanks.

It seems that a later push late Sunday night, which carried the
second attempt, made it pass on Linux32 ;-)  Whew.

>>  (1) make sure that the topics this depends on are sound by
>>      re-reading them once again, and merge them quickly down to
>>      'master';
>
> I took a look and they look sound.
>
>  - rs/find-pack-entry-bisection resolves an issue that has been present
>    since commit 1f68855 ("[PATCH] Teach read_sha1_file() and friends
>    about packed git object store.", 2005-06-27).
>  - jk/drop-sha1-entry-pos is some code deletion.
>  - rs/unpack-entry-leakfix ensures that delta_stack is freed. This
>    function does not (for example) expose the destination of delta_stack
>    to its caller, so it is correct that delta_stack should be freed
>    unless it points to the local buffer, just like in the success case.
>  - jt/sha1-file-cleanup (my patches) still looks OK to me.
>
> In your latest "What's cooking" (Aug 2017, #04; Fri, 18), you mentioned
> that the first 3 will be merged to master, and the 4th will be merged to
> next.

Yup, thanks for double checking.  I'll be merging them down
soon-ish.

Thanks.


Thread overview: 88+ messages (download: mbox.gz / follow: Atom feed)
2017-08-08 19:32 [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Jonathan Tan
2017-08-08 19:32 ` [RFC PATCH 01/10] pack: move pack name-related functions Jonathan Tan
2017-08-08 20:36   ` Stefan Beller
2017-08-08 20:50     ` Jonathan Tan
2017-08-09 12:00       ` Christian Couder
2017-08-09 17:16         ` Jonathan Tan
2017-08-11 19:38           ` Ben Peart
2017-08-11 21:34             ` Junio C Hamano
2017-08-16 22:53               ` Jonathan Tan
2017-08-08 19:32 ` [RFC PATCH 02/10] pack: move static state variables Jonathan Tan
2017-08-08 19:32 ` [RFC PATCH 03/10] pack: move pack_report() Jonathan Tan
2017-08-08 19:32 ` [RFC PATCH 04/10] pack: move open_pack_index(), parse_pack_index() Jonathan Tan
2017-08-08 20:19   ` Junio C Hamano
2017-08-08 20:45     ` Jonathan Tan
2017-08-08 19:32 ` [RFC PATCH 05/10] pack: move release_pack_memory() Jonathan Tan
2017-08-08 19:32 ` [RFC PATCH 06/10] pack: move pack-closing functions Jonathan Tan
2017-08-08 19:32 ` [RFC PATCH 07/10] pack: move use_pack() Jonathan Tan
2017-08-08 19:32 ` [RFC PATCH 08/10] pack: move unuse_pack() Jonathan Tan
2017-08-08 19:32 ` [RFC PATCH 09/10] pack: move add_packed_git() Jonathan Tan
2017-08-08 19:32 ` [RFC PATCH 10/10] pack: move install_packed_git() Jonathan Tan
2017-08-08 20:05 ` [RFC PATCH 00/10] An attempt to move packfile funcs to its own file Junio C Hamano
2017-08-08 20:43   ` Jonathan Tan
2017-08-08 21:04     ` Junio C Hamano
2017-08-09  1:22 ` [PATCH v2 00/25] Move exported " Jonathan Tan
2017-08-10 17:21   ` Stefan Beller
2017-08-10 21:19   ` Junio C Hamano
2017-08-10 21:59     ` Jonathan Tan
2017-08-10 22:40       ` Junio C Hamano
2017-08-11 20:36         ` [PATCH 0/2] non-move patches in preparation for packfile.c Jonathan Tan
2017-08-11 20:36         ` [PATCH 1/2] sha1_file: set whence in storage-specific info fn Jonathan Tan
2017-08-11 21:52           ` Junio C Hamano
2017-08-11 20:36         ` [PATCH 2/2] sha1_file: remove read_packed_sha1() Jonathan Tan
2017-08-11 22:06           ` Junio C Hamano
2017-08-11 19:41   ` [PATCH v2 00/25] Move exported packfile funcs to its own file Ben Peart
2017-08-18 23:36     ` Jonathan Tan
2017-08-09  1:22 ` [PATCH v2 01/25] pack: move pack name-related functions Jonathan Tan
2017-08-09  1:22 ` [PATCH v2 02/25] pack: move static state variables Jonathan Tan
2017-08-09  1:22 ` [PATCH v2 03/25] pack: move pack_report() Jonathan Tan
2017-08-09  1:22 ` [PATCH v2 04/25] pack: move open_pack_index(), parse_pack_index() Jonathan Tan
2017-08-09  1:22 ` [PATCH v2 05/25] pack: move release_pack_memory() Jonathan Tan
2017-08-09  1:22 ` [PATCH v2 06/25] pack: move pack-closing functions Jonathan Tan
2017-08-09  1:22 ` [PATCH v2 07/25] pack: move use_pack() Jonathan Tan
2017-08-09  1:22 ` [PATCH v2 08/25] pack: move unuse_pack() Jonathan Tan
2017-08-09  1:22 ` [PATCH v2 09/25] pack: move add_packed_git() Jonathan Tan
2017-08-09  1:22 ` [PATCH v2 10/25] pack: move install_packed_git() Jonathan Tan
2017-08-09  1:22 ` [PATCH v2 11/25] pack: move {,re}prepare_packed_git and approximate_object_count Jonathan Tan
2017-08-09  1:22 ` [PATCH v2 12/25] pack: move unpack_object_header() Jonathan Tan
2017-08-09  1:22 ` [PATCH v2 13/25] pack: move get_size_from_delta() Jonathan Tan
2017-08-09  1:22 ` [PATCH v2 14/25] pack: move unpack_object_header() Jonathan Tan
2017-08-09  1:22 ` [PATCH v2 15/25] sha1_file: set whence in storage-specific info fn Jonathan Tan
2017-08-09  1:22 ` [PATCH v2 16/25] sha1_file: remove read_packed_sha1() Jonathan Tan
2017-08-09  1:22 ` [PATCH v2 17/25] pack: move packed_object_info(), unpack_entry() Jonathan Tan
2017-08-09  1:22 ` [PATCH v2 18/25] pack: move nth_packed_object_{sha1,oid} Jonathan Tan
2017-08-09  1:22 ` [PATCH v2 19/25] pack: move check_pack_index_ptr(), nth_packed_object_offset() Jonathan Tan
2017-08-09  1:22 ` [PATCH v2 20/25] pack: move find_pack_entry_one(), is_pack_valid() Jonathan Tan
2017-08-09  1:22 ` [PATCH v2 21/25] pack: move find_sha1_pack() Jonathan Tan
2017-08-09  1:22 ` [PATCH v2 22/25] pack: move find_pack_entry() and make it global Jonathan Tan
2017-08-09  1:22 ` [PATCH v2 23/25] pack: move has_sha1_pack() Jonathan Tan
2017-08-09  1:22 ` [PATCH v2 24/25] pack: move has_pack_index() Jonathan Tan
2017-08-09  1:22 ` [PATCH v2 25/25] pack: move for_each_packed_object() Jonathan Tan
2017-08-18 22:20 ` [PATCH v3 00/23] Move exported packfile funcs to its own file Jonathan Tan
2017-08-19  7:33   ` Junio C Hamano
2017-08-20  6:40     ` Junio C Hamano
2017-08-21 18:40       ` Jonathan Tan
2017-08-21 22:55         ` Junio C Hamano
2017-08-18 22:20 ` [PATCH v3 01/23] pack: move pack name-related functions Jonathan Tan
2017-08-18 22:20 ` [PATCH v3 02/23] pack: move static state variables Jonathan Tan
2017-08-18 22:20 ` [PATCH v3 03/23] pack: move pack_report() Jonathan Tan
2017-08-18 22:20 ` [PATCH v3 04/23] pack: move open_pack_index(), parse_pack_index() Jonathan Tan
2017-08-18 22:20 ` [PATCH v3 05/23] pack: move release_pack_memory() Jonathan Tan
2017-08-18 22:20 ` [PATCH v3 06/23] pack: move pack-closing functions Jonathan Tan
2017-08-18 22:20 ` [PATCH v3 07/23] pack: move use_pack() Jonathan Tan
2017-08-18 22:20 ` [PATCH v3 08/23] pack: move unuse_pack() Jonathan Tan
2017-08-18 22:20 ` [PATCH v3 09/23] pack: move add_packed_git() Jonathan Tan
2017-08-18 22:20 ` [PATCH v3 10/23] pack: move install_packed_git() Jonathan Tan
2017-08-18 22:20 ` [PATCH v3 11/23] pack: move {,re}prepare_packed_git and approximate_object_count Jonathan Tan
2017-08-18 22:20 ` [PATCH v3 12/23] pack: move unpack_object_header_buffer() Jonathan Tan
2017-08-18 22:20 ` [PATCH v3 13/23] pack: move get_size_from_delta() Jonathan Tan
2017-08-18 22:20 ` [PATCH v3 14/23] pack: move unpack_object_header() Jonathan Tan
2017-08-18 22:20 ` [PATCH v3 15/23] pack: move clear_delta_base_cache(), packed_object_info(), unpack_entry() Jonathan Tan
2017-08-18 22:20 ` [PATCH v3 16/23] pack: move nth_packed_object_{sha1,oid} Jonathan Tan
2017-08-18 22:20 ` [PATCH v3 17/23] pack: move check_pack_index_ptr(), nth_packed_object_offset() Jonathan Tan
2017-08-18 22:20 ` [PATCH v3 18/23] pack: move find_pack_entry_one(), is_pack_valid() Jonathan Tan
2017-08-18 22:20 ` [PATCH v3 19/23] pack: move find_sha1_pack() Jonathan Tan
2017-08-18 22:20 ` [PATCH v3 20/23] pack: move find_pack_entry() and make it global Jonathan Tan
2017-08-18 22:20 ` [PATCH v3 21/23] pack: move has_sha1_pack() Jonathan Tan
2017-08-18 22:20 ` [PATCH v3 22/23] pack: move has_pack_index() Jonathan Tan
2017-08-18 22:20 ` [PATCH v3 23/23] pack: move for_each_packed_object() Jonathan Tan
