From: Jeff King <peff@peff.net>
To: "Jakub Narębski" <jnareb@gmail.com>
Cc: Junio C Hamano <gitster@pobox.com>,
Stefan Beller <sbeller@google.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Marc Strapetz <marc.strapetz@syntevo.com>,
Git Mailing List <git@vger.kernel.org>
Subject: Re: topological index field for commit objects
Date: Fri, 1 Jul 2016 02:54:52 -0400
Message-ID: <20160701065452.GE5358@sigill.intra.peff.net>
In-Reply-To: <5774F4C7.805@gmail.com>

On Thu, Jun 30, 2016 at 12:30:31PM +0200, Jakub Narębski wrote:

> > This is one of the open questions. My older patches turned them off when
> > replacements and grafts are in effect.
>
> Well, if you store the cache of generation numbers in the packfile, or in
> the index of the packfile, or in the bitmap file, or in separate bitmap-like
> file, generating them on repack, then of course any grafts or replacements
> invalidate them... though for low level commands (like object counting)
> replacements are transparent -- or rather they are (and can be) treated as
> any other ref for reachability analysis.
>
> Well, if there are no grafts, you could still use them for doing
> "git --no-replace-objects log ...", isn't it?

Yes, replace refs don't invalidate the concept of a cache. It just
means that you invalidate the invariants of the cache for a specific
view, so you need a cache which matches that view.

It has been several years, but I remember at one point having patches
that summarized the graft/replace state as a single hash, and only used
the cache if it matched that state. So you could actually keep a cache
for some set of replace-refs that you have, as well as a cache for the
case that you've turned them off, etc.

I don't think that level of complexity is really worth it, though.

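To illustrate the "single hash" idea, here is a minimal Python sketch. It is
not the code from those patches; the names view_key and GenerationCaches,
and the way the state is serialized before hashing, are assumptions made up
for this example. The point is only that each distinct graft/replace view
gets its own key, and a cache is consulted only when the key matches.

```python
import hashlib


def view_key(replace_refs, grafts):
    """Summarize the graft/replace state as one hex digest.

    replace_refs: dict mapping original object id -> replacement id.
    grafts: dict mapping commit id -> list of grafted parent ids.
    The serialization below is hypothetical; any stable encoding works.
    """
    h = hashlib.sha1()
    for original, replacement in sorted(replace_refs.items()):
        h.update(f"replace {original} {replacement}\n".encode())
    for commit, parents in sorted(grafts.items()):
        h.update(f"graft {commit} {' '.join(parents)}\n".encode())
    return h.hexdigest()


class GenerationCaches:
    """Keeps one generation-number cache per graft/replace view."""

    def __init__(self):
        self._by_view = {}

    def store(self, key, generations):
        self._by_view[key] = dict(generations)

    def lookup(self, key):
        # Returns None when no cache was built for this exact view,
        # in which case the caller falls back to a normal traversal.
        return self._by_view.get(key)
```

With this shape, running with replacements turned off is simply the view
whose key is view_key({}, {}).
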
> >>> I have patches that generate and store the numbers at pack time, similar
> >>> to the way we do the reachability bitmaps.
>
> Ah, so those cached generation numbers are generated and stored at pack
> time. Where you store them: is it a separate file? Bitmap file? Packfile?

There were a few iterations of the concept over the years, but the
pack-time one uses a separate file with the same name prefix as a pack
(similar to the way bitmaps are stored). The big advantage there is that
we can piggy-back on the pack .idx to avoid having to write each sha1
again (20 bytes per commit, whereas the actual data we're caching is
only 4 bytes).

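A rough Python sketch of why piggy-backing on the .idx saves space. This is
not the actual on-disk format; the function names and the 4-byte big-endian
encoding are assumptions for illustration, and a plain sorted list of ids
stands in for the pack .idx. Because the ids are already stored in sorted
order elsewhere, the cache only needs one 4-byte value per entry, addressed
by position.

```python
import bisect
import struct


def write_generations(sorted_ids, generation_of):
    """Pack one 4-byte generation number per id, in the ids' sorted order.

    sorted_ids stands in for the order of entries in the pack .idx.
    No id is written here: position alone identifies the commit.
    """
    return b"".join(struct.pack(">I", generation_of[i]) for i in sorted_ids)


def lookup_generation(sorted_ids, blob, object_id):
    """Find object_id's position via binary search, then read its slot."""
    pos = bisect.bisect_left(sorted_ids, object_id)
    if pos == len(sorted_ids) or sorted_ids[pos] != object_id:
        return None  # not in this pack; caller must compute it another way
    (generation,) = struct.unpack_from(">I", blob, 4 * pos)
    return generation
```

For n commits this costs 4*n bytes, versus 24*n if each 20-byte sha1 were
written again next to its number.
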
> > At GitHub we are using them for --contains analysis, along with mass
> > ahead/behind (e.g., as in https://github.com/gitster/git/branches). My
> > plan is to send patches upstream, but they need some cleanup first.
>
> That would be nice to have, please.
>
> Er, is mass ahead/behind something that can be plugged into Git
> (e.g. for "git branch -v -v"), or is it something GitHub-specific?

We have a custom command, "git ahead-behind", where you can specify
arbitrary pairs of commits on stdin. But it's all backed by a function
which, yes, could be plugged into "branch -v -v". It caches any bitmaps
it needs, so if you are doing 100 ahead/behind comparisons against
"master", for example, it only has to find the bitmap for "master" once
(remember that we sometimes have to traverse to complete a bitmap when
a branch has been updated since the last repack).

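The caching behavior can be sketched in Python as follows. This is a toy
model, not the "git ahead-behind" implementation: Python sets stand in for
reachability bitmaps, the AheadBehind class and its walks counter are
hypothetical, and every reachable set is built by a full traversal rather
than by completing a stored bitmap.

```python
class AheadBehind:
    """Ahead/behind counts with a per-tip cache of reachable sets."""

    def __init__(self, parents):
        self.parents = parents  # commit id -> list of parent ids
        self._reach = {}        # cache: tip -> frozenset of reachable commits
        self.walks = 0          # number of traversals actually performed

    def reachable(self, tip):
        cached = self._reach.get(tip)
        if cached is None:
            self.walks += 1
            seen = set()
            stack = [tip]
            while stack:
                commit = stack.pop()
                if commit in seen:
                    continue
                seen.add(commit)
                stack.extend(self.parents.get(commit, ()))
            cached = self._reach[tip] = frozenset(seen)
        return cached

    def compare(self, tip, base):
        """Return (ahead, behind) of tip relative to base."""
        a = self.reachable(tip)
        b = self.reachable(base)
        return len(a - b), len(b - a)
```

Comparing many branches against the same base calls reachable() on that
base each time, but only the first call traverses; later calls reuse the
cached set.
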
-Peff