From: "Shawn O. Pearce" <spearce@spearce.org>
To: Jakub Narebski <jnareb@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: If you would write git from scratch now, what would you change?
Date: Mon, 26 Nov 2007 20:48:04 -0500
Message-ID: <20071127014804.GJ14735@spearce.org>
In-Reply-To: <200711252248.27904.jnareb@gmail.com>

Jakub Narebski <jnareb@gmail.com> wrote:
> If you would write git from scratch now, from the beginning, without
> concerns for backwards compatibility, what would you change, or what
> would you want to have changed?

- Sort tree entries by name, *not* by name+type

This has got to be my biggest gripe with Git. I think Linus really
screwed the pooch with this. We've talked it over a few times
on the list and he and I have just agreed to disagree on this.

Ask any database person and they'll tell you how wrong the
current tree ordering is. Or they are nuts and don't get
the concept of data integrity.

Linus' excuse is that the current ordering makes working with
the flat index faster as it's just one index file. That doesn't
mean that the flat index file can't contain tree information.
Like it does in, say, that newfangled cache-tree extension. :-)

This particular "design decision" has brought all sorts of bugs
into the system, like the D/F merge conflict issues, and even one
from Linus himself when he first introduced the submodule support.
Let's not even talk about how ugly that made things in jgit.
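
For illustration, here is a minimal sketch of the difference between
ordering by name only and the current tree ordering, in which a
directory entry compares as if its name ended in "/". The entry names
and the (name, is_dir) tuple layout are made up for this example; this
is not Git's actual code.

```python
# Toy comparison of plain name ordering vs. Git's tree-entry ordering.
# Entries are (name, is_dir) pairs; the names are hypothetical.

def plain_key(entry):
    """Sort key for ordering by name only, ignoring the entry type."""
    name, _is_dir = entry
    return name

def git_tree_key(entry):
    """Sort key mimicking the name+type ordering: a directory compares
    as if its name had a trailing '/'."""
    name, is_dir = entry
    return name + b"/" if is_dir else name

entries = [(b"foo", True), (b"foo.c", False), (b"foo-bar", False)]

by_name = [name for name, _ in sorted(entries, key=plain_key)]
by_git = [name for name, _ in sorted(entries, key=git_tree_key)]

print(by_name)  # [b'foo', b'foo-bar', b'foo.c']
print(by_git)   # [b'foo-bar', b'foo.c', b'foo']
```

The directory "foo" moves from first to last because '/' (0x2f) sorts
after both '-' (0x2d) and '.' (0x2e). Where an entry lands therefore
depends on whether it is a file or a directory.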

- Loose object storage is difficult to work with

The standard loose object format of DEFLATE("$type $size\0$data")
makes it harder to work with as you need to inflate at least
part of the object just to see what the hell it is or how big
its final output buffer needs to be.

It also makes it very hard to stream into a packfile if you have
determined it's not worth creating a delta for the object (or no
suitable delta base is available).

The new (now deprecated) loose object format that was based on
the packfile header format simplified this and made it much
easier to work with.
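
As a rough sketch of the standard layout described above, the
following builds DEFLATE("$type $size\0$data") with zlib and then
shows that even reading the type and size requires running the
inflater. The helper names encode_loose and peek_header, and the
64-byte probe size, are assumptions made for this example.

```python
import zlib

def encode_loose(obj_type, data):
    """Build a standard-format loose object: DEFLATE("type size\\0data")."""
    header = obj_type.encode("ascii") + b" " + str(len(data)).encode("ascii") + b"\0"
    return zlib.compress(header + data)

def peek_header(compressed, probe=64):
    """Inflate at most `probe` bytes, just enough to parse "type size\\0".

    Nothing about the object is visible without starting to inflate it.
    """
    inflater = zlib.decompressobj()
    head = inflater.decompress(compressed, probe)
    nul = head.index(b"\0")
    obj_type, size = head[:nul].split(b" ")
    return obj_type.decode("ascii"), int(size)

loose = encode_loose("blob", b"hello\n" * 1000)
print(peek_header(loose))  # ('blob', 6000)
```

Because the header sits inside the compressed stream, the object
cannot be copied into a pack verbatim either: the payload has to be
inflated and deflated again without the header.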

- No proper libgit

Already been stated but we don't have a great library and we
don't have a good way to build one right now either. A lot of
our internal code assumes die() will abort the process. That's a
very bad assumption to be making inside of a library.
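
To make the die() problem concrete, here is a small analogy in Python.
It is only an analogy: Git's die() is a C function, and RepoError,
read_ref_or_die and read_ref are hypothetical names invented for this
sketch.

```python
import sys

class RepoError(Exception):
    """Hypothetical error type that a library-style API could raise."""

def read_ref_or_die(refs, name):
    # die()-style: any failure ends the whole process, so a host
    # application embedding this code gets no chance to recover.
    if name not in refs:
        sys.exit(f"fatal: no such ref: {name}")
    return refs[name]

def read_ref(refs, name):
    # Library-friendly: report the failure and let the caller decide.
    if name not in refs:
        raise RepoError(f"no such ref: {name}")
    return refs[name]

refs = {"refs/heads/master": "0" * 40}

try:
    read_ref(refs, "refs/heads/missing")
except RepoError as err:
    print("recovered:", err)
```

With the second form a long-running caller such as a GUI or a server
can handle the error and continue; with the first it simply goes away.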

- Binary packed-refs representation

I probably wouldn't have done an ASCII based packed-refs file,
or heck, even loose refs. I probably would have just gone with
a binary file that we wholesale rewrite every time there is any
sort of ref update.

We already do this with the index. So every time we update a
file path we are rewriting the entire index. And we update
file paths a heck of a lot more often than we update branch
heads. Or tags.

But tools like for-each-ref get invoked heavily, and fast access
to the ref database is important to overall performance.
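
One way the "single binary file, rewritten wholesale" idea could look
is sketched below. The record layout (a 4-byte count, then for each
ref a 2-byte name length, the name, and a 20-byte object ID) is purely
an assumption for illustration; it is not an existing Git format, and
write_refs and read_refs are hypothetical helpers.

```python
import os
import struct
import tempfile

def write_refs(path, refs):
    """Rewrite the entire ref file: write a temp file, then rename it
    over the old one so readers never see a half-written file."""
    dir_name = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=dir_name)
    with os.fdopen(fd, "wb") as out:
        out.write(struct.pack(">I", len(refs)))
        for name in sorted(refs):
            raw = name.encode("utf-8")
            out.write(struct.pack(">H", len(raw)) + raw + bytes.fromhex(refs[name]))
    os.replace(tmp, path)

def read_refs(path):
    """Load every ref with one read and a linear scan of the buffer."""
    with open(path, "rb") as f:
        buf = f.read()
    (count,) = struct.unpack_from(">I", buf, 0)
    pos, refs = 4, {}
    for _ in range(count):
        (name_len,) = struct.unpack_from(">H", buf, pos)
        pos += 2
        name = buf[pos:pos + name_len].decode("utf-8")
        pos += name_len
        refs[name] = buf[pos:pos + 20].hex()
        pos += 20
    return refs

sample = {"refs/heads/master": "ab" * 20, "refs/tags/v1.0": "cd" * 20}
with tempfile.TemporaryDirectory() as tmp_dir:
    ref_file = os.path.join(tmp_dir, "refs.bin")
    write_refs(ref_file, sample)
    roundtrip = read_refs(ref_file)
print(roundtrip == sample)  # True
```

The write-then-rename step is the same trade the index already makes:
every update pays for a full rewrite, and in exchange a reader needs
only one file open and one read.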

- No GIT_OBJECT_DIRECTORY vs. GIT_DIR distinction

This is causing problems for people who use
$GIT_DIR/objects/info/alternates and then try to repack
repositories. Not having the ref space of the alternates and/or
borrowers considered during repacking can cause all sorts of fun
breakage that may be hard to recover from.

Plus it means you have to do funny "refs/forkee" hacks just to
avoid pushing unnecessary objects over the wire when the other
end is borrowing objects.

I probably would have had the object directory unified with its
ref database, so that they cannot be accessed individually.
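
The repacking hazard described above can be shown with a toy model.
Object IDs are short made-up strings and each ref maps directly to the
set of objects it needs; real reachability walks the commit graph, so
this is only a sketch of the failure mode, not of how repacking works.

```python
# Toy model of a shared object directory and two ref spaces.
shared_store = {"a1", "b2", "c3", "d4"}              # objects in the shared directory
owner_refs = {"refs/heads/master": {"a1", "b2"}}     # refs of the repo that owns it
borrower_refs = {"refs/heads/topic": {"b2", "c3"}}   # refs of a repo using alternates

def reachable(*ref_spaces):
    """Union of all objects needed by the given ref spaces."""
    keep = set()
    for refs in ref_spaces:
        for objs in refs.values():
            keep |= objs
    return keep

def prune(store, keep):
    """Drop every object in the store that nothing in `keep` needs."""
    return store & keep

# Pruning with only the owner's refs in view loses a borrowed object.
naive = prune(shared_store, reachable(owner_refs))
# Pruning with both ref spaces in view keeps it.
safe = prune(shared_store, reachable(owner_refs, borrower_refs))

missing = reachable(borrower_refs) - naive
print(sorted(missing))  # ['c3']
```

If the object directory and its ref database could not be accessed
separately, the first case could not arise, since every prune would
see all the refs that depend on the store.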

All of the above is written with 20/20 hindsight and all that.
Looking back (and knowing myself well) I think the only item I
would have gotten right if I had written Git from scratch is the
first one above (the tree entry ordering). I probably would have
done something equally "as bad" as what we have today for all of
the others...

--
Shawn.