git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Junio C Hamano <gitster@pobox.com>
Cc: Andreas Schwab <schwab@linux-m68k.org>,
	Olaf Hering <olaf@aepfle.de>,
	Git Mailing List <git@vger.kernel.org>
Subject: Re: history damage in linux.git
Date: Thu, 21 Apr 2016 10:43:17 -0700	[thread overview]
Message-ID: <CA+55aFwOtyW7zLHdJND=FGBWKBfhQV95RPVRG5gcoRUrtGCrAQ@mail.gmail.com> (raw)
In-Reply-To: <xmqqvb3aswp0.fsf@gitster.mtv.corp.google.com>

[-- Attachment #1: Type: text/plain, Size: 2478 bytes --]

On Thu, Apr 21, 2016 at 10:23 AM, Junio C Hamano <gitster@pobox.com> wrote:
>
> I think avoiding side branches to describe with the weight is a
> right thing to do, i.e. if you have this history:
>
>     X---o---o---o---o---v4.6
>      \             /
>       o-----------o
>
> you do not want to explain X as "v4.6~^2~2", and instead you want it
> as "v4.6~5", even though the former is 4 hops while the latter is 5
> hops (which is longer).

Yes. I have a new suggestion: make the "merge weight" depend on how
complex the name ends up being.

And that algorithm actually gives a completely new and improved path:

   aed06b9 tags/v3.13-rc2~32^2^2~47

which is still complex, but is actually the best one I've found so far.

To compare against the previous ones I looked at:

   v4.6-rc1~9^2~792    <- current git code
   v3.13-rc2~32^2^2~47     <- new with attached patch
   v3.13-rc7~9^2~14^2~42     <- merge weight 500
   v3.13~5^2~4^2~2^2~1^2~42   <- merge weight 1

that new one is actually the second-most dense, and uses the oldest
tag. So it's a good combination of denseness, but still gets the best
actual base tag.

The logic of the patch is that there are different "complexities" in
the naming, and it's not just whether you are following a second
parent, it's also if you're doing generational hops.

So when you do a first-parent change, the name stays simple (the last
number changes), and that means that the "distance" weight is low (so
that's the normal "+1" we do for first-parent.

But if it's not a first parent, there are two different cases:

 - generation > 0: we add "~%d^%d", and we approximate that with "four
characters", and use a cost that is four orders of magnitude higher
(so 10000).

 - generation == 0: we add just "^%d", so generally just two
characters. So use a cost that is just two orders of magnitude higher
(so 100).

In other words, I'm trying to convince people that my patch not only
gives a good result, but that the "weight numbers" I use make some
kind of conceptual sense from a complexity cost angle.

With that, the patch itself is attached.

I think it's better than what "git name-rev" does now. Is it optimal?
No, I think the *optimal* would use some kind of "does one tag contain
the other" logic, and discarding all base names that are just
supersets of another base that still reaches the target.

But this patch is small and simple, and has some excuses for its
behavior. What do people think?

                 Linus

[-- Attachment #2: patch.diff --]
[-- Type: text/plain, Size: 1360 bytes --]

 builtin/name-rev.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/builtin/name-rev.c b/builtin/name-rev.c
index 092e03c3cc9b..0354c8d222e1 100644
--- a/builtin/name-rev.c
+++ b/builtin/name-rev.c
@@ -16,9 +16,6 @@ typedef struct rev_name {
 
 static long cutoff = LONG_MAX;
 
-/* How many generations are maximally preferred over _one_ merge traversal? */
-#define MERGE_TRAVERSAL_WEIGHT 65535
-
 static void name_rev(struct commit *commit,
 		const char *tip_name, int generation, int distance,
 		int deref)
@@ -55,19 +52,26 @@ copy_data:
 			parents;
 			parents = parents->next, parent_number++) {
 		if (parent_number > 1) {
+			int weight;
 			size_t len;
 			char *new_name;
 
 			strip_suffix(tip_name, "^0", &len);
-			if (generation > 0)
+
+			// The extra merge traversal "weight" depends
+			// on how complex the resulting name is.
+			if (generation > 0) {
+				weight = 10000;
 				new_name = xstrfmt("%.*s~%d^%d", (int)len, tip_name,
 						   generation, parent_number);
-			else
+			} else {
+				weight = 100;
 				new_name = xstrfmt("%.*s^%d", (int)len, tip_name,
 						   parent_number);
+			}
 
 			name_rev(parents->item, new_name, 0,
-				distance + MERGE_TRAVERSAL_WEIGHT, 0);
+				distance + weight, 0);
 		} else {
 			name_rev(parents->item, tip_name, generation + 1,
 				distance + 1, 0);

  reply	other threads:[~2016-04-21 17:43 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-21 11:30 history damage in linux.git Olaf Hering
2016-04-21 12:10 ` Matthieu Moy
2016-04-21 12:32   ` Olaf Hering
2016-04-21 12:51     ` Matthieu Moy
2016-04-21 13:19 ` John Keeping
2016-04-21 15:54   ` Olaf Hering
2016-04-21 16:36     ` Matthieu Moy
2016-04-21 13:24 ` Andreas Schwab
2016-04-21 16:36   ` Linus Torvalds
2016-04-21 16:59     ` Junio C Hamano
2016-04-21 17:08       ` Jeff King
2016-04-21 17:23         ` Linus Torvalds
2016-04-21 17:44           ` Stefan Beller
2016-04-21 22:16             ` Junio C Hamano
2016-04-21 18:05           ` Jeff King
2016-04-21 18:18             ` Linus Torvalds
2016-04-22 13:38               ` Johannes Schindelin
2016-04-21 17:00     ` Linus Torvalds
2016-04-21 17:23       ` Junio C Hamano
2016-04-21 17:43         ` Linus Torvalds [this message]
2016-04-21 17:59           ` Linus Torvalds
2016-04-21 18:09             ` Jeff King
2016-04-21 19:27           ` Junio C Hamano
2016-04-21 19:43             ` Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CA+55aFwOtyW7zLHdJND=FGBWKBfhQV95RPVRG5gcoRUrtGCrAQ@mail.gmail.com' \
    --to=torvalds@linux-foundation.org \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=olaf@aepfle.de \
    --cc=schwab@linux-m68k.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).