git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Shawn O. Pearce" <spearce@spearce.org>
To: Jakub Narebski <jnareb@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: If you would write git from scratch now, what would you change?
Date: Mon, 26 Nov 2007 20:48:04 -0500	[thread overview]
Message-ID: <20071127014804.GJ14735@spearce.org> (raw)
In-Reply-To: <200711252248.27904.jnareb@gmail.com>

Jakub Narebski <jnareb@gmail.com> wrote:
> If you would write git from scratch now, from the beginning, without 
> concerns for backwards compatibility, what would you change, or what 
> would you want to have changed?

- Sort tree entries by name, *not* by name+type

  This has got to be my biggest gripe with Git.  I think Linus really
  screwed the pooch with this.  We've talked it over a few times
  on the list and he and I have just agreed to disagree on this.

  Ask any database person and they'll tell you how wrong the
  current tree ordering is.  Or they are nuts and don't get
  the concept of data integrity.

  Linus' excuse is that the current ordering makes working with
  the flat index faster as its just one index file.  That doesn't
  mean that the flat index file can't contain tree information.
  Like it does in say that new fangled cache-tree extension.  :-)

  This particular "design decision" has brought all sorts of bugs
  into the system, like the D/F merge conflict issues, and even one
  from Linus himself when he first introduced the submodule support.
  Lets not even talk about ugly that made things in jgit.


- Loose objects storage is difficult to work with

  The standard loose object format of DEFLATE("$type $size\0$data")
  makes it harder to work with as you need to inflate at least
  part of the object just to see what the hell it is or how big
  its final output buffer needs to be.

  It also makes it very hard to stream into a packfile if you have
  determined its not worth creating a delta for the object (or no
  suitable delta base is available).

  The new (now deprecated) loose object format that was based on
  the packfile header format simplified this and made it much
  easier to work with.


- No proper libgit

  Already been stated but we don't have a great library and we
  don't have a good way to build one right now either.  A lot of
  our internal code assumes die() will abort the process.  That's a
  very bad assumption to be making inside of a library.


- Binary packed-refs representation

  I probably wouldn't have done an ASCII based packed-refs file,
  or heck, even loose refs.  I probably would have just gone with
  a binary file that we wholesale rewrite every time there is any
  sort of ref update.

  We already do this with the index.  So every time we update a
  file path we are rewriting the entire index.  And we update
  file paths a heck of a lot more often than we update branch
  heads.  Or tags.

  But tools like for-each-ref get invoked heavily, and fast access
  to the ref database is important to overall performance.


- No GIT_OBJECT_DIRECTORY vs. GIT_DIR distinction

  This is causing problems with $GIT_DIR/objects/info/alternates
  and then try to repack repositories.  Not having the ref space of
  the alternates and/or borrowers considered during repacking can
  cause all sorts of fun breakage that may be hard to recover from.
  Plus it means you have to do funny "refs/forkee" hacks just to
  avoid pushing unnecessary objects over the wire when the other
  end is borrowing objects.

  I probably would have had the object directory unified with its
  ref database, so that they cannot be accessed individually.


All of the above is written with 20/20 hindsight and all that.

Looking back (and knowing myself well) I think the only item I
would have gotten right if I had written Git from scratch is the
first one above (the tree entry ordering).  I probably would have
done something equally "as bad" as what we have today for all of
the others...

-- 
Shawn.

  parent reply	other threads:[~2007-11-27  1:48 UTC|newest]

Thread overview: 83+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-11-25 21:48 If you would write git from scratch now, what would you change? Jakub Narebski
2007-11-25 22:23 ` Pierre Habouzit
2007-11-26  1:28   ` Steven Walter
2007-11-26  6:11     ` Junio C Hamano
2007-11-26  6:36       ` Adam Roben
2007-11-26 15:32         ` Carlos Rica
2007-11-26 16:40           ` Daniel Barkalow
2007-11-26 16:46 ` Andy Parkins
2007-11-26 17:10   ` Benoit Sigoure
2007-11-26 18:56     ` Jan Hudec
2007-11-26 19:12       ` David Kastrup
2007-11-26 19:34         ` Jan Hudec
2007-11-26 19:50           ` Michael Poole
2007-11-26 20:09             ` Jan Hudec
2007-11-26 20:31               ` Michael Poole
2007-11-26 20:48                 ` Jon Smirl
2007-11-26 20:11     ` Andy Parkins
2007-11-26 19:25   ` Marco Costalba
2007-11-27  1:20     ` Shawn O. Pearce
2007-11-27  1:46       ` Jakub Narebski
2007-11-27  1:58         ` Shawn O. Pearce
2007-11-27 11:39           ` Johannes Schindelin
2007-11-27 23:59             ` [RFC] git-gui USer's Survey 2007 (was: If you would write git from scratch now, what would you change?) Jakub Narebski
2007-11-28 12:32               ` Johannes Schindelin
2007-11-28 15:48                 ` Jason Sewall
2007-11-28 23:25                 ` Jan Hudec
2007-11-28 23:48                   ` Johannes Schindelin
2007-11-29  6:57                     ` Jan Hudec
2007-11-29 12:01                       ` Johannes Schindelin
2007-11-30 17:50                         ` Jan Hudec
2007-11-30 18:25                           ` Marco Costalba
2007-12-01  2:35                           ` Shawn O. Pearce
2007-12-01  2:53                             ` Marco Costalba
2007-11-28 13:18               ` [RFC] git-gui USer's Survey 2007 Sergei Organov
2007-11-27  8:45     ` If you would write git from scratch now, what would you change? Andy Parkins
2007-11-27 13:15       ` Marco Costalba
2007-11-27 23:56         ` Jan Hudec
2007-11-27 17:48       ` Johannes Schindelin
2007-12-04 11:00         ` Andy Parkins
2007-11-27 17:33   ` Jing Xue
2007-11-26 16:48 ` Jon Smirl
2007-11-26 17:11 ` David Kastrup
2007-11-26 19:27   ` Jan Hudec
2007-11-26 20:11     ` Benoit Sigoure
2007-11-26 20:36       ` Jan Hudec
2007-11-26 19:30   ` Nicolas Pitre
2007-11-26 19:34     ` David Kastrup
2007-11-26 19:57       ` Jan Hudec
2007-11-26 20:35         ` David Kastrup
2007-11-26 21:00           ` Jan Hudec
2007-11-26 21:28           ` Nicolas Pitre
2007-11-26 20:45         ` Wincent Colaiuta
2007-11-26 21:24           ` Junio C Hamano
2007-11-26 21:35             ` Nicolas Pitre
2007-11-26 21:47               ` Junio C Hamano
2007-11-26 22:03                 ` Nicolas Pitre
2007-11-27  1:03             ` Shawn O. Pearce
2007-11-27  3:35               ` Junio C Hamano
2007-11-27  5:10                 ` Steven Grimm
2007-11-26 21:27     ` Johannes Schindelin
2007-11-26 21:39       ` Nicolas Pitre
2007-11-26 21:40         ` Johannes Schindelin
2007-11-27 14:11     ` Andreas Ericsson
2007-11-27 14:38       ` Jakub Narebski
2007-11-26 19:18 ` Dana How
2007-11-26 19:52   ` Nicolas Pitre
2007-11-26 20:17     ` Dana How
2007-11-26 20:55       ` Nicolas Pitre
2007-11-26 22:02         ` Dana How
2007-11-26 22:22           ` Nicolas Pitre
2007-11-26 20:17   ` Jakub Narebski
2007-11-26 20:36     ` Dana How
2007-11-27  1:25   ` Shawn O. Pearce
2007-11-27  5:07     ` Nicolas Pitre
2007-11-27  1:48 ` Shawn O. Pearce [this message]
2007-11-27  1:54   ` Junio C Hamano
2007-11-27  1:59     ` Shawn O. Pearce
2007-11-27  2:15       ` Jakub Narebski
2007-11-27 11:47         ` C# binding, was " Johannes Schindelin
2007-11-27  4:58   ` Nicolas Pitre
2007-11-27  5:59     ` Dana How
2007-11-27  6:12       ` Shawn O. Pearce
2007-11-27 16:33     ` Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20071127014804.GJ14735@spearce.org \
    --to=spearce@spearce.org \
    --cc=git@vger.kernel.org \
    --cc=jnareb@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).