From: Jakub Narebski <jnareb@gmail.com>
To: Jeff King <peff@peff.net>
Cc: Ted Ts'o <tytso@mit.edu>, Jonathan Nieder <jrnieder@gmail.com>,
    Ævar Arnfjörð Bjarmason <avarab@gmail.com>,
    Clemens Buchacher <drizzd@aon.at>, git@vger.kernel.org,
    Junio C Hamano <gitster@pobox.com>
Subject: Re: generation numbers (was: [PATCH 0/4] Speed up git tag --contains)
Date: Wed, 6 Jul 2011 20:46:42 +0200
Message-ID: <201107062046.43820.jnareb@gmail.com>
In-Reply-To: <20110706181200.GD17978@sigill.intra.peff.net>

On Wed, 6 Jul 2011, Jeff King wrote:
> On Wed, Jul 06, 2011 at 11:01:03AM -0400, Ted Ts'o wrote:
>
> > Is it worth it to try to replicate this information across repositories?
>
> Probably not. I suggested notes-cache just because the amount of code is
> very trivial.

Well, generation numbers are universal and would help everybody.  For
new commits with a 'generation' header they would always be replicated;
for old commits with 'generation' notes / notes-cache they can be
replicated.

> One problem with notes storage is that it's not well optimized for tiny
> pieces of data like this (e.g., the generation number should fit in a
> 32-bit unsigned int, as its max is the size of the longest single path
> in the history graph). But notes are much more general; we will actually
> map each commit to a blob object containing the generation number, which
> is pretty wasteful.

Wasn't textconv-cache using commit-less notes?  The same can be done
for a generation notes-cache.  Though it is still wasteful...

By the way, would we be using a text representation (like in the
'generation' commit header), a 32-bit integer binary representation in
some byte ordering, or a variable-length integer (I think git uses them
somewhere)?

NB: I wonder if a 32-bit unsigned int would always be enough, for
example for Linux kernel + history.

> > Why not just simply have a cache file in the git directory which is
> > managed somewhat like gitk.cache; call it generation.cache?
>
> Yeah, that would be fine. With a sorted list of binary sha1s and 32-bit
> generation numbers, you're talking about 24 bytes per commit. Or a 6
> megabyte cache for linux-2.6.
>
> You'd probably want to be a little clever with updates. If I have
> calculated the generation number of every commit, and then do "git
> commit; git tag --contains HEAD", you probably don't want to rewrite the
> entire cache. You could probably journal a fixed number of entries in an
> unsorted file (or even in a parallel directory structure to loose
> objects), and then periodically write out the whole sorted list when the
> journal gets too big. Or choose a more clever data structure that can do
> in-place updates.

And that is the difference between gitk.cache (generated _once_ when
starting gitk, and regenerated on request) and the idea of
generation.cache.

I think it would be simpler to use generation header + generation
notes.  Or start with generation notes only.

-- 
Jakub Narebski
Poland
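
For illustration only, here is a rough sketch in Python of the cache layout discussed in the quoted text: a sorted list of binary SHA-1s, each followed by a 32-bit generation number, 24 bytes per record. Nothing here is git code. The function names, the toy four-commit history, the big-endian byte order, and the convention that a root commit has generation 1 (and any other commit has 1 + the largest generation among its parents) are all assumptions made for the example.

```python
import hashlib
import struct

# Assumed record layout: 20-byte binary SHA-1 + big-endian 32-bit unsigned int.
RECORD = struct.Struct(">20sI")


def generations(parents):
    """Map each commit to 1 + the largest generation among its parents (roots get 1)."""
    gen = {}
    for start in parents:
        stack = [start]
        while stack:  # iterative, so long histories do not hit the recursion limit
            commit = stack[-1]
            if commit in gen:
                stack.pop()
                continue
            pending = [p for p in parents[commit] if p not in gen]
            if pending:
                stack.extend(pending)  # compute the parents first
            else:
                gen[commit] = 1 + max((gen[p] for p in parents[commit]), default=0)
                stack.pop()
    return gen


def build_cache(gen):
    """Serialize as fixed-size records sorted by SHA-1."""
    return b"".join(RECORD.pack(sha, g) for sha, g in sorted(gen.items()))


def lookup(cache, sha):
    """Binary-search the sorted records; return the generation, or None if absent."""
    lo, hi = 0, len(cache) // RECORD.size
    while lo < hi:
        mid = (lo + hi) // 2
        key, g = RECORD.unpack_from(cache, mid * RECORD.size)
        if key == sha:
            return g
        if key < sha:
            lo = mid + 1
        else:
            hi = mid
    return None


def fake_sha(name):
    """Hypothetical stand-in for a commit id: the SHA-1 of a short label."""
    return hashlib.sha1(name.encode()).digest()


a, b, c, d = (fake_sha(n) for n in "abcd")
history = {a: [], b: [a], c: [a], d: [b, c]}  # d is a merge of b and c
gen = generations(history)
cache = build_cache(gen)
print(RECORD.size, len(cache), lookup(cache, d))  # prints "24 96 3"
```

At 24 bytes per record, the 6 megabyte figure quoted above works out to roughly 250,000 commits. The sketch rewrites the whole cache on every build; the journal of unsorted recent entries suggested in the quoted text is not modelled here.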