From: Jeff King <peff@peff.net>
To: Nguyen Thai Ngoc Duy <pclouds@gmail.com>
Cc: David Michael Barr <b@rr-dav.id.au>,
Git Mailing List <git@vger.kernel.org>
Subject: Re: [RFC] pack-objects: compression level for non-blobs
Date: Sat, 29 Dec 2012 04:05:58 -0500
Message-ID: <20121229090558.GA31291@sigill.intra.peff.net>
In-Reply-To: <20121229052747.GA14928@sigill.intra.peff.net>

On Sat, Dec 29, 2012 at 12:27:47AM -0500, Jeff King wrote:

> > I think I tried the partial decompression for commit header and it did
> > not help much (or I misremember it, not so sure).
>
> I'll see if I can dig up the reference, as it was something I was going
> to look at next.

I tried the simple patch below, but it actually made things slower! I'm
assuming it is because the streaming setup is not micro-optimized very
well. A custom read_sha1_until_blank_line() could probably do better.

diff --git a/commit.c b/commit.c
index e8eb0ae..efd6c06 100644
--- a/commit.c
+++ b/commit.c
@@ -8,6 +8,7 @@
 #include "notes.h"
 #include "gpg-interface.h"
 #include "mergesort.h"
+#include "streaming.h"

 static struct commit_extra_header *read_commit_extra_header_lines(const char *buf, size_t len, const char **);

@@ -306,6 +307,39 @@ int parse_commit_buffer(struct commit *item, const void *buffer, unsigned long s
 	return 0;
 }

+static void *read_commit_header(const unsigned char *sha1,
+				enum object_type *type,
+				unsigned long *size)
+{
+	static const int chunk_size = 256;
+	struct strbuf buf = STRBUF_INIT;
+	struct git_istream *st;
+
+	st = open_istream(sha1, type, size, NULL);
+	if (!st)
+		return NULL;
+	while (1) {
+		size_t start = buf.len;
+		ssize_t readlen;
+
+		strbuf_grow(&buf, chunk_size);
+		readlen = read_istream(st, buf.buf + start, chunk_size);
+		if (readlen < 0) {
+			close_istream(st);
+			strbuf_release(&buf);
+			return NULL;
+		}
+		strbuf_setlen(&buf, start + readlen);
+
+		/* rescan the last old byte: "\n\n" may straddle two chunks */
+		if (!readlen || strstr(buf.buf + (start ? start - 1 : 0), "\n\n"))
+			break;
+	}
+
+	close_istream(st);
+	return strbuf_detach(&buf, size);
+}
+
 int parse_commit(struct commit *item)
 {
 	enum object_type type;
@@ -317,7 +351,11 @@ int parse_commit(struct commit *item)
 		return -1;
 	if (item->object.parsed)
 		return 0;
-	buffer = read_sha1_file(item->object.sha1, &type, &size);
+
+	if (!save_commit_buffer)
+		buffer = read_commit_header(item->object.sha1, &type, &size);
+	else
+		buffer = read_sha1_file(item->object.sha1, &type, &size);
 	if (!buffer)
 		return error("Could not read %s",
 			     sha1_to_hex(item->object.sha1));
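
For illustration only (this is not part of the patch above): the
"read until blank line" idea can be sketched in plain C without any
git internals. Everything here is an assumption made for the sketch:
struct mem_stream and mem_stream_read() are hypothetical stand-ins for
a streaming source like read_istream(), and read_until_blank_line() is
not an existing git function. The one subtle point it shows is that
each scan starts one byte before the newly read data, so a "\n\n" that
is split across two chunks is still noticed.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory stand-in for a streaming source. */
struct mem_stream {
	const char *data;
	size_t len;
	size_t pos;
};

/* Copy up to `want` bytes into `out`; returns bytes copied (0 at EOF). */
static size_t mem_stream_read(struct mem_stream *st, char *out, size_t want)
{
	size_t avail = st->len - st->pos;
	size_t n = want < avail ? want : avail;

	memcpy(out, st->data + st->pos, n);
	st->pos += n;
	return n;
}

/*
 * Read `chunk` bytes at a time until the buffer contains a blank line
 * ("\n\n") or the stream ends. Returns a NUL-terminated malloc'd buffer
 * that the caller must free, with its length stored in *out_len.
 * Returns NULL if allocation fails.
 */
static char *read_until_blank_line(struct mem_stream *st, size_t chunk,
				   size_t *out_len)
{
	char *buf = NULL;
	size_t len = 0;

	for (;;) {
		size_t start = len;
		size_t n;
		char *tmp = realloc(buf, len + chunk + 1);

		if (!tmp) {
			free(buf);
			return NULL;
		}
		buf = tmp;

		n = mem_stream_read(st, buf + start, chunk);
		len += n;
		buf[len] = '\0';

		/*
		 * Start the search one byte before the new data so that a
		 * "\n\n" straddling two chunks is still detected.
		 */
		if (!n || strstr(buf + (start ? start - 1 : 0), "\n\n"))
			break;
	}

	*out_len = len;
	return buf;
}
```

The returned buffer may run past the blank line by up to one chunk, so
a caller that wants only the header still has to stop at the first
"\n\n" itself. Whether something like this would beat the full
read_sha1_file() depends on the per-object setup cost of the stream,
which this sketch does not model.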