From: "Shawn O. Pearce" <spearce@spearce.org>
To: Junio C Hamano <gitster@pobox.com>
Cc: Jon Smirl <jonsmirl@gmail.com>, Git Mailing List <git@vger.kernel.org>
Subject: Re: Calculating tree nodes
Date: Wed, 5 Sep 2007 23:20:26 -0400
Message-ID: <20070906032026.GO18160@spearce.org>
In-Reply-To: <7vbqcinxdb.fsf@gitster.siamese.dyndns.org>

Junio C Hamano <gitster@pobox.com> wrote:
> "Shawn O. Pearce" <spearce@spearce.org> writes:
> 
> > There's nothing stopping us from creating additional indexes.
> > ...
> > But we can also store the notes alongside the commits in the
> > packfile, so that if the data for the commit has been paged in
...
> 
> I would agree with your main thrust "nobody prevents you from
> building additional index", but on a tangent, I am skeptical
> about adding too much to pack v4.  Especially "clustering the
> notes" part.
...
> Now, hopefully many operations do not need notes either,
> although notes themselves can store _anything_ so each of them
> could be large and/or each commit could have large number of
> them.  I suspect clustering notes along with the commit they
> annotate would break the locality of access for common case.

I'm inclined to agree.

It's something I've thought about doing.  But I haven't even prototyped
code for it, let alone shown numbers that say one way or the other.

One of the notes proposals was talking about having lots of different
classes of notes.  E.g. a "Signed-off-by" class and a "build and test
log results" class.

The former would generally be very small and may even want to be
shown most of the time the commit body is displayed (e.g. in gitk,
git-log).  These would be good candidates to cluster alongside the
commit.  Indeed they are clustered there today, just hung inside
of the commit object itself.  Nobody is bitching about the hit they
cause on the common case of `pack-objects`.  :)

The latter (build and test log) would generally be very large.
We would *not* want to cluster them.  But we might want to store
next to the commit a very small pointer to the note itself, such
as the note's SHA-1 or its offset within the packfile's index.
This would make locating those notes very cheap, while not having
a huge impact on the common case of commit traversal.
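
To make the small-versus-large split concrete, here is a rough sketch
in Python.  Nothing like this exists in Git or in any pack v4 code.
The names CommitEntry, attach_note and INLINE_LIMIT are made up for
illustration, the 256-byte threshold is an arbitrary assumption, and
the note is hashed as raw bytes rather than as a real Git object with
its header.

```python
import hashlib
from dataclasses import dataclass, field

# Assumed cutoff (bytes) between "small enough to cluster" and "too large".
INLINE_LIMIT = 256


@dataclass
class CommitEntry:
    """Hypothetical per-commit record as it might sit in a packfile."""
    commit_sha1: str
    inline_notes: dict = field(default_factory=dict)   # note class -> body
    note_pointers: dict = field(default_factory=dict)  # note class -> SHA-1 hex


def attach_note(entry, note_class, body, note_store):
    """Cluster small notes with the commit; keep only a pointer to large ones."""
    if len(body) <= INLINE_LIMIT:
        entry.inline_notes[note_class] = body
        return
    # Simplification: hash the raw body, not a Git object with a header.
    sha1 = hashlib.sha1(body).hexdigest()
    note_store[sha1] = body
    entry.note_pointers[note_class] = sha1


note_store = {}
entry = CommitEntry(commit_sha1="a" * 40)
attach_note(entry, "signed-off-by",
            b"Signed-off-by: A U Thor <author@example.com>", note_store)
attach_note(entry, "build-log", b"x" * 10000, note_store)
```

Commit traversal would then touch only the entry itself (a short
trailer plus a 40-character pointer), and the 10000-byte log is read
from note_store only when someone asks for it.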

Likewise we might want to pack a tag's SHA-1 alongside the commit
it points at, so that parsing the commit would immediately give us all
annotated tags that refer to that commit.  Tags are (usually) few
and far between.  But tools like git-describe are commonly used and
would benefit from not needing to build the commit->tag hashtable.
OK, well, git-describe cheats and uses the struct object hashtable,
but whatever.
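
For reference, the commit->tag table such a tool would otherwise have
to build looks roughly like the Python sketch below.  This is not how
git-describe is implemented.  The function names, the toy history and
the plain breadth-first walk are all assumptions for illustration.

```python
from collections import defaultdict, deque


def build_commit_to_tags(annotated_tags):
    """Build the commit -> [tag ids] table from (tag_id, target_commit) pairs.

    This is the up-front work that storing tag ids next to the commit
    would make unnecessary.
    """
    table = defaultdict(list)
    for tag_id, target_commit in annotated_tags:
        table[target_commit].append(tag_id)
    return dict(table)


def nearest_tagged_ancestor(start, parents, commit_to_tags):
    """Breadth-first walk from start; return (commit, depth, tags) or None."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        commit, depth = queue.popleft()
        if commit in commit_to_tags:
            return commit, depth, commit_to_tags[commit]
        for parent in parents.get(commit, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append((parent, depth + 1))
    return None


# Toy linear history: c3 -> c2 -> c1, with one annotated tag on c1.
parents = {"c3": ["c2"], "c2": ["c1"], "c1": []}
table = build_commit_to_tags([("t1", "c1")])
result = nearest_tagged_ancestor("c3", parents, table)
```

If the tag ids were stored with each commit, the lookup in
commit_to_tags would become a field read on the commit just parsed
and build_commit_to_tags would go away.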

You get my point.  I think.  And I got yours about not making the
common case worse than it already is today.

-- 
Shawn.
