git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] cache-tree: populate cache-tree on successful merge
@ 2015-07-28 19:30 David Turner
  2015-07-28 19:50 ` Junio C Hamano
  2015-07-28 20:47 ` Junio C Hamano
  0 siblings, 2 replies; 10+ messages in thread
From: David Turner @ 2015-07-28 19:30 UTC (permalink / raw)
  To: git; +Cc: David Turner, Brian Degenhardt

When we unpack trees into an existing index, we discard the old index
and replace it with the new, merged index.  Ensure that this index has
its cache-tree populated.  This will make subsequent git status and
commit commands faster.

Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Brian Degenhardt <bmd@bmdhacks.com>
---

This patch is by my colleague, Brian Degenhardt (as part of his work
on git at Twitter).  I'm sending it with his and Twitter's approval.

---
 t/t0090-cache-tree.sh | 24 ++++++++++++++++++++++++
 unpack-trees.c        |  7 +++++++
 2 files changed, 31 insertions(+)

diff --git a/t/t0090-cache-tree.sh b/t/t0090-cache-tree.sh
index 601d02d..055cc19 100755
--- a/t/t0090-cache-tree.sh
+++ b/t/t0090-cache-tree.sh
@@ -199,6 +199,30 @@ test_expect_success 'checkout -B gives cache-tree' '
 	test_cache_tree
 '
 
+test_expect_success 'merge --ff-only maintains cache-tree' '
+	git checkout current &&
+	git checkout -b changes &&
+	test_commit llamas &&
+	test_commit pachyderm &&
+	test_cache_tree &&
+	git checkout current &&
+	test_cache_tree &&
+	git merge --ff-only changes &&
+	test_cache_tree
+'
+
+test_expect_success 'merge maintains cache-tree' '
+	git checkout current &&
+	git checkout -b changes2 &&
+	test_commit alpacas &&
+	test_cache_tree &&
+	git checkout current &&
+	test_commit struthio &&
+	test_cache_tree &&
+	git merge changes2 &&
+	test_cache_tree
+'
+
 test_expect_success 'partial commit gives cache-tree' '
 	git checkout -b partial no-children &&
 	test_commit one &&
diff --git a/unpack-trees.c b/unpack-trees.c
index 2927660..befc247 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1156,6 +1156,13 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 	o->src_index = NULL;
 	ret = check_updates(o) ? (-2) : 0;
 	if (o->dst_index) {
+		if (!o->result.cache_tree)
+			o->result.cache_tree = cache_tree();
+
+		if (!cache_tree_fully_valid(o->result.cache_tree)) {
+			cache_tree_update(&o->result, WRITE_TREE_SILENT | WRITE_TREE_REPAIR);
+		}
+
 		discard_index(o->dst_index);
 		*o->dst_index = o->result;
 	} else {
-- 
2.0.4.315.gad8727a-twtrsrc

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] cache-tree: populate cache-tree on successful merge
  2015-07-28 19:30 [PATCH] cache-tree: populate cache-tree on successful merge David Turner
@ 2015-07-28 19:50 ` Junio C Hamano
  2015-07-28 19:54   ` David Turner
  2015-07-28 20:47 ` Junio C Hamano
  1 sibling, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2015-07-28 19:50 UTC (permalink / raw)
  To: David Turner; +Cc: git, Brian Degenhardt

David Turner <dturner@twopensource.com> writes:

> When we unpack trees into an existing index, we discard the old index
> and replace it with the new, merged index.  Ensure that this index has
> its cache-tree populated.  This will make subsequent git status and
> commit commands faster.

Wouldn't it make repeated calls to "git merge" and friends to build
a long history slower, when the user does not run "git status" in
between?  E.g. "git cherry-pick -4 $other_topic", where you would
not even have a chance to run "git status" in the middle.  What do
the pros-and-cons look like?

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] cache-tree: populate cache-tree on successful merge
  2015-07-28 19:50 ` Junio C Hamano
@ 2015-07-28 19:54   ` David Turner
  2015-07-28 19:55     ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: David Turner @ 2015-07-28 19:54 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Brian Degenhardt

On Tue, 2015-07-28 at 12:50 -0700, Junio C Hamano wrote:
> David Turner <dturner@twopensource.com> writes:
> 
> > When we unpack trees into an existing index, we discard the old index
> > and replace it with the new, merged index.  Ensure that this index has
> > its cache-tree populated.  This will make subsequent git status and
> > commit commands faster.
> 
> Wouldn't it make repeated calls to "git merge" and friends to build
> a long history slower, when the user does not run "git status" in
> between?  E.g. "git cherry-pick -4 $other_topic", where you would
> not even have a chance to run "git status" in the middle.  What do
> the pros-and-cons look like?

I have not benchmarked, but I suspect it would not make those slower.

The work done to produce the cache-tree is work that the commit would
otherwise have to do.  So we're spending extra time in one place to
eliminate that work in a different place.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] cache-tree: populate cache-tree on successful merge
  2015-07-28 19:54   ` David Turner
@ 2015-07-28 19:55     ` Junio C Hamano
  2015-07-28 20:04       ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2015-07-28 19:55 UTC (permalink / raw)
  To: David Turner; +Cc: git, Brian Degenhardt

David Turner <dturner@twopensource.com> writes:

> The work done to produce the cache-tree is work that the commit would
> otherwise have to do.  So we're spending extra time in one place to
> eliminate that work in a different place.

Good point.  Thanks.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] cache-tree: populate cache-tree on successful merge
  2015-07-28 19:55     ` Junio C Hamano
@ 2015-07-28 20:04       ` Junio C Hamano
  2015-07-28 20:28         ` David Turner
  0 siblings, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2015-07-28 20:04 UTC (permalink / raw)
  To: David Turner; +Cc: git, Brian Degenhardt

Junio C Hamano <gitster@pobox.com> writes:

> David Turner <dturner@twopensource.com> writes:
>
>> The work done to produce the cache-tree is work that the commit would
>> otherwise have to do.  So we're spending extra time in one place to
>> eliminate that work in a different place.
>
> Good point.  Thanks.

Hmm, I forgot about another codepath.  What about operations that
are purely done to pouplate the index, without necessarily creating
a tree out of the index?

The most worrisome is "git checkout $branch" (two-tree merge).

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] cache-tree: populate cache-tree on successful merge
  2015-07-28 20:04       ` Junio C Hamano
@ 2015-07-28 20:28         ` David Turner
  2015-07-28 20:58           ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: David Turner @ 2015-07-28 20:28 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Brian Degenhardt

On Tue, 2015-07-28 at 13:04 -0700, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
> 
> > David Turner <dturner@twopensource.com> writes:
> >
> >> The work done to produce the cache-tree is work that the commit would
> >> otherwise have to do.  So we're spending extra time in one place to
> >> eliminate that work in a different place.
> >
> > Good point.  Thanks.
> 
> Hmm, I forgot about another codepath.  What about operations that
> are purely done to pouplate the index, without necessarily creating
> a tree out of the index?
> 
> The most worrisome is "git checkout $branch" (two-tree merge).

Git checkout $branch already populates the cache-tree; this is due to
patches I added last year:

commit aecf567cbfb6ab46e82f7f5df36fb6a2dd5bee69
Author: David Turner <dturner@twopensource.com>
Date:   Sat Jul 5 21:06:56 2014 -0700

    cache-tree: create/update cache-tree on checkout
    
    When git checkout checks out a branch, create or update the
    cache-tree so that subsequent operations are faster.
----

Admittedly, we do not test for the case where we must do a two-way merge
during a checkout, but I just tested that case, and it appears that we
do already populate the cache-tree in that case.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] cache-tree: populate cache-tree on successful merge
  2015-07-28 19:30 [PATCH] cache-tree: populate cache-tree on successful merge David Turner
  2015-07-28 19:50 ` Junio C Hamano
@ 2015-07-28 20:47 ` Junio C Hamano
  2015-07-28 21:18   ` David Turner
  1 sibling, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2015-07-28 20:47 UTC (permalink / raw)
  To: David Turner; +Cc: git, Brian Degenhardt

David Turner <dturner@twopensource.com> writes:

> When we unpack trees into an existing index, we discard the old index
> and replace it with the new, merged index.  Ensure that this index has
> its cache-tree populated.  This will make subsequent git status and
> commit commands faster.
>
> Signed-off-by: David Turner <dturner@twopensource.com>
> Signed-off-by: Brian Degenhardt <bmd@bmdhacks.com>
> ---
>
> This patch is by my colleague, Brian Degenhardt (as part of his work
> on git at Twitter).  I'm sending it with his and Twitter's approval.

I'd need to tweak the From:/Author: line then, and flip the order of
the sign-off, as Brian wrote and signed off then David relayed (as
attached).

> diff --git a/unpack-trees.c b/unpack-trees.c
> index 2927660..befc247 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -1156,6 +1156,13 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
>  	o->src_index = NULL;
>  	ret = check_updates(o) ? (-2) : 0;
>  	if (o->dst_index) {
> +		if (!o->result.cache_tree)
> +			o->result.cache_tree = cache_tree();
> +
> +		if (!cache_tree_fully_valid(o->result.cache_tree)) {
> +			cache_tree_update(&o->result, WRITE_TREE_SILENT | WRITE_TREE_REPAIR);
> +		}

This does the cache-tree thing unconditionally, not "on successful
merge".  cache_tree_update() would refuse when it sees an unmerged
entry, but somehow the discrepancy between the title and the code
bothers me.

By the way, I wonder if we can lose/revert aecf567c (cache-tree:
create/update cache-tree on checkout, 2014-07-05), now the
underlying unpack_trees() does the necessary cache_tree_update()
when a branch is checked out.

Thanks.

-- >8 --
From: Brian Degenhardt <bmd@bmdhacks.com>
Date: Tue, 28 Jul 2015 15:30:40 -0400
Subject: [PATCH] unpack-trees: populate cache-tree on successful merge

When we unpack trees into an existing index, we discard the old
index and replace it with the new, merged index.  Ensure that this
index has its cache-tree populated.  This will make subsequent git
status and commit commands faster.

Signed-off-by: Brian Degenhardt <bmd@bmdhacks.com>
Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t0090-cache-tree.sh | 24 ++++++++++++++++++++++++
 unpack-trees.c        |  8 ++++++++
 2 files changed, 32 insertions(+)

diff --git a/t/t0090-cache-tree.sh b/t/t0090-cache-tree.sh
index 601d02d..055cc19 100755
--- a/t/t0090-cache-tree.sh
+++ b/t/t0090-cache-tree.sh
@@ -199,6 +199,30 @@ test_expect_success 'checkout -B gives cache-tree' '
 	test_cache_tree
 '
 
+test_expect_success 'merge --ff-only maintains cache-tree' '
+	git checkout current &&
+	git checkout -b changes &&
+	test_commit llamas &&
+	test_commit pachyderm &&
+	test_cache_tree &&
+	git checkout current &&
+	test_cache_tree &&
+	git merge --ff-only changes &&
+	test_cache_tree
+'
+
+test_expect_success 'merge maintains cache-tree' '
+	git checkout current &&
+	git checkout -b changes2 &&
+	test_commit alpacas &&
+	test_cache_tree &&
+	git checkout current &&
+	test_commit struthio &&
+	test_cache_tree &&
+	git merge changes2 &&
+	test_cache_tree
+'
+
 test_expect_success 'partial commit gives cache-tree' '
 	git checkout -b partial no-children &&
 	test_commit one &&
diff --git a/unpack-trees.c b/unpack-trees.c
index be84ba2..d92f903 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1155,6 +1155,14 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 	o->src_index = NULL;
 	ret = check_updates(o) ? (-2) : 0;
 	if (o->dst_index) {
+		if (!ret) {
+			if (!o->result.cache_tree)
+				o->result.cache_tree = cache_tree();
+			if (!cache_tree_fully_valid(o->result.cache_tree))
+				cache_tree_update(&o->result,
+						  WRITE_TREE_SILENT |
+						  WRITE_TREE_REPAIR);
+		}
 		discard_index(o->dst_index);
 		*o->dst_index = o->result;
 	} else {
-- 
2.5.0-370-gf964943

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] cache-tree: populate cache-tree on successful merge
  2015-07-28 20:28         ` David Turner
@ 2015-07-28 20:58           ` Junio C Hamano
  0 siblings, 0 replies; 10+ messages in thread
From: Junio C Hamano @ 2015-07-28 20:58 UTC (permalink / raw)
  To: David Turner; +Cc: git, Brian Degenhardt

David Turner <dturner@twopensource.com> writes:

> Git checkout $branch already populates the cache-tree; this is due to
> patches I added last year:
>
> commit aecf567cbfb6ab46e82f7f5df36fb6a2dd5bee69

Heh, our mails crossed ;-)

Thanks.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] cache-tree: populate cache-tree on successful merge
  2015-07-28 20:47 ` Junio C Hamano
@ 2015-07-28 21:18   ` David Turner
  2015-07-28 21:38     ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: David Turner @ 2015-07-28 21:18 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Brian Degenhardt

On Tue, 2015-07-28 at 13:47 -0700, Junio C Hamano wrote:
> David Turner <dturner@twopensource.com> writes:
> 
> > When we unpack trees into an existing index, we discard the old index
> > and replace it with the new, merged index.  Ensure that this index has
> > its cache-tree populated.  This will make subsequent git status and
> > commit commands faster.
> >
> > Signed-off-by: David Turner <dturner@twopensource.com>
> > Signed-off-by: Brian Degenhardt <bmd@bmdhacks.com>
> > ---
> >
> > This patch is by my colleague, Brian Degenhardt (as part of his work
> > on git at Twitter).  I'm sending it with his and Twitter's approval.
> 
> I'd need to tweak the From:/Author: line then, and flip the order of
> the sign-off, as Brian wrote and signed off then David relayed (as
> attached).

Where do I put an Author: line? In the commit message above the
signoffs?  As an email header?  I didn't see an option to git send-email
that would do this.  I don't want to use the From: header because I want
to be the point-of-contact for these patches.

> > diff --git a/unpack-trees.c b/unpack-trees.c
> > index 2927660..befc247 100644
> > --- a/unpack-trees.c
> > +++ b/unpack-trees.c
> > @@ -1156,6 +1156,13 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
> >  	o->src_index = NULL;
> >  	ret = check_updates(o) ? (-2) : 0;
> >  	if (o->dst_index) {
> > +		if (!o->result.cache_tree)
> > +			o->result.cache_tree = cache_tree();
> > +
> > +		if (!cache_tree_fully_valid(o->result.cache_tree)) {
> > +			cache_tree_update(&o->result, WRITE_TREE_SILENT | WRITE_TREE_REPAIR);
> > +		}
> 
> This does the cache-tree thing unconditionally, not "on successful
> merge".  cache_tree_update() would refuse when it sees an unmerged
> entry, but somehow the discrepancy between the title and the code
> bothers me.
> 
> By the way, I wonder if we can lose/revert aecf567c (cache-tree:
> create/update cache-tree on checkout, 2014-07-05), now the
> underlying unpack_trees() does the necessary cache_tree_update()
> when a branch is checked out.

Well, the tests still pass, so I guess so. That is, we still need the
WRITE_TREE_REPAIR bit, but not the update check.

Will re-roll once I hear back on the author line.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] cache-tree: populate cache-tree on successful merge
  2015-07-28 21:18   ` David Turner
@ 2015-07-28 21:38     ` Junio C Hamano
  0 siblings, 0 replies; 10+ messages in thread
From: Junio C Hamano @ 2015-07-28 21:38 UTC (permalink / raw)
  To: David Turner; +Cc: git, Brian Degenhardt

David Turner <dturner@twopensource.com> writes:

> On Tue, 2015-07-28 at 13:47 -0700, Junio C Hamano wrote:
>> David Turner <dturner@twopensource.com> writes:
>> 
>> > When we unpack trees into an existing index, we discard the old index
>> > and replace it with the new, merged index.  Ensure that this index has
>> > its cache-tree populated.  This will make subsequent git status and
>> > commit commands faster.
>> >
>> > Signed-off-by: David Turner <dturner@twopensource.com>
>> > Signed-off-by: Brian Degenhardt <bmd@bmdhacks.com>
>> > ---
>> >
>> > This patch is by my colleague, Brian Degenhardt (as part of his work
>> > on git at Twitter).  I'm sending it with his and Twitter's approval.
>> 
>> I'd need to tweak the From:/Author: line then, and flip the order of
>> the sign-off, as Brian wrote and signed off then David relayed (as
>> attached).
>
> Where do I put an Author: line? In the commit message above the
> signoffs?  As an email header?  I didn't see an option to git send-email
> that would do this.  I don't want to use the From: header because I want
> to be the point-of-contact for these patches.

The message you are responding to would have been a good example of
forcing the author, subject and author-date to be different from the
e-mail headers.  That is, if you did "git am -s -c" on my message
you responded to, you would have seen a new commit authored by
Brian; and anybody responding to the message would have sent that
e-mail to me (and git@vger.kernel.org).

I think that is the arrangement you are looking for.

Delete everything before and including the "-- >8 --" line from my
message you responded to and then the person who applies does not
have to say "-c" but just with "git am -s" the same thing would have
happened.  E-mail coming from (and reply going to) you, but resulting
commit would be authored by Brian.

"git send-email", if you are sending somebody else's commit, should
automatically add the in-body header "From: Brian ..." as the first
line of the body, with a blank line and the body of the commit log.

>> By the way, I wonder if we can lose/revert aecf567c (cache-tree:
>> create/update cache-tree on checkout, 2014-07-05), now the
>> underlying unpack_trees() does the necessary cache_tree_update()
>> when a branch is checked out.
>
> Well, the tests still pass, so I guess so. That is, we still need the
> WRITE_TREE_REPAIR bit, but not the update check.
>
> Will re-roll once I hear back on the author line.

Let's not do the "drop cache-tree generation from checkout" in the
same patch.  It can be done as a separate patch but I do not think
it is a very high priority.

With that understanding, what I have received from you (with a minor
tweak shown in the message you are responding to) is already fine, I
think.

Thanks.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2015-07-28 21:38 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-28 19:30 [PATCH] cache-tree: populate cache-tree on successful merge David Turner
2015-07-28 19:50 ` Junio C Hamano
2015-07-28 19:54   ` David Turner
2015-07-28 19:55     ` Junio C Hamano
2015-07-28 20:04       ` Junio C Hamano
2015-07-28 20:28         ` David Turner
2015-07-28 20:58           ` Junio C Hamano
2015-07-28 20:47 ` Junio C Hamano
2015-07-28 21:18   ` David Turner
2015-07-28 21:38     ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).