git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Derrick Stolee <stolee@gmail.com>
To: "git@vger.kernel.org" <git@vger.kernel.org>
Cc: Jeff King <peff@peff.net>, Derrick Stolee <dstolee@microsoft.com>,
	Junio C Hamano <gitster@pobox.com>
Subject: Question about get_cached_commit_buffer()
Date: Tue, 20 Feb 2018 17:12:50 -0500	[thread overview]
Message-ID: <ecbbe515-b7a8-3dc8-7d14-32412e7b12c3@gmail.com> (raw)

While working on my commit-graph patch [1] and using a local build in my 
usual workflows, I found a bug in my branch.

Essentially, when calling `git rev-list --header`, the header 
information is actually missing for many commits except the first. This 
was not caught in my testing since t6000-rev-list-misc.sh creates a test 
repo with only one commit.

The root cause is that the serialized commit graph gets its speedup by 
not loading buffers for every commit. For many use-cases (merge bases, 
--topo-order, etc.) we do not need the buffer for most of the commits. 
In the rev-list example, the buffer is loaded due to a side-effect of 
being referenced by HEAD or a branch. A similar effect happened in `git 
log`, hence the following change in my patch:

diff --git a/log-tree.c b/log-tree.c
index 580b3a9..14735d4 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -647,8 +647,7 @@ void show_log(struct rev_info *opt)
  		show_mergetag(opt, commit);
  	}
  
-	if (!get_cached_commit_buffer(commit, NULL))
-		return;
+	get_commit_buffer(commit, NULL);
  
  	if (opt->show_notes) {
  		int raw;


In rev-list, the "--header" option outputs a value and expects the 
buffer to be cached. It outputs the header info only if 
get_cached_commit_buffer() returns a non-null buffer, giving incorrect 
output. If it called get_commit_buffer() instead, it would immediately 
call get_cached_commit_buffer() and on failure actually load the buffer.

This has not been a problem before, since the buffer was always loaded 
at some point for each commit (and saved because of the 
save_commit_buffer global).

I propose to make get_cached_commit_buffer() static to commit.c and 
convert all callers to get_commit_buffer(). Is there any reason to _not_ 
do this? It seems that there is no functional or performance change.

After the serialized commit graph exists and is used, the only change is 
that we delay loading the buffer until we need to read the commit 
metadata that is not stored in the graph (message, author/committer).

Thanks,
-Stolee

[1] 
https://public-inbox.org/git/4d1ee202-7d79-d73c-6e05-d0fc85db943c@gmail.com/T/#m381bfd3f2eafbd254e35a5147cd198bc35055e92
     [Patch v4 00/14] Serialized Git Commit Graph

[2] 
https://public-inbox.org/git/20140610214039.GJ19147@sigill.intra.peff.net/
     [Patch 10/17] provide helpers to access the commit buffer

             reply	other threads:[~2018-02-20 22:12 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-02-20 22:12 Derrick Stolee [this message]
2018-02-20 22:57 ` Question about get_cached_commit_buffer() Jeff King
2018-02-21 14:13   ` Derrick Stolee
2018-02-21 18:48     ` Jeff King
2018-02-21 18:52       ` Jeff King
2018-02-21 19:17         ` [PATCH] commit: drop uses of get_cached_commit_buffer() Derrick Stolee
2018-02-21 19:19           ` Derrick Stolee
2018-02-21 23:34             ` Jeff King
2018-02-21 23:13           ` Jeff King
2018-02-21 23:22             ` Stefan Beller
2018-02-21 23:29               ` Jeff King
2018-02-22  1:52             ` Derrick Stolee
2018-02-21 22:22       ` Question about get_cached_commit_buffer() Junio C Hamano
2018-02-21 22:50         ` Jeff King
2018-02-21 23:03           ` Junio C Hamano
2018-02-21 23:27             ` [PATCH] revision: drop --show-all option Jeff King
2018-02-21 23:45               ` Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ecbbe515-b7a8-3dc8-7d14-32412e7b12c3@gmail.com \
    --to=stolee@gmail.com \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).