git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Dana How" <danahow@gmail.com>
To: "Jakub Narebski" <jnareb@gmail.com>
Cc: git@vger.kernel.org, "Nicolas Pitre" <nico@cam.org>, danahow@gmail.com
Subject: Re: If you would write git from scratch now, what would you change?
Date: Mon, 26 Nov 2007 12:36:27 -0800	[thread overview]
Message-ID: <56b7f5510711261236n732d56dci6335541391e1e137@mail.gmail.com> (raw)
In-Reply-To: <200711262117.56326.jnareb@gmail.com>

On Nov 26, 2007 12:17 PM, Jakub Narebski <jnareb@gmail.com> wrote:
> On Mon, 26 Nov 2007, Dana How wrote:
> > On Nov 25, 2007 1:48 PM, Jakub Narebski <jnareb@gmail.com> wrote:
> >
> > > If you would write git from scratch now, from the beginning, without
> > > concerns for backwards compatibility, what would you change, or what
> > > would you want to have changed?
> >
> > Currently data can be quickly copied from pack to pack,
> > but data cannot be quickly copied blob->pack or pack->blob
> > (there was an alternate blob format that supported this,
> >  but it was deprecated).  Using the pack format for blobs
> > would fix this.  It would also mean blobs wouldn't need to
> > be uncompressed to get the blob type or size I believe.
>
> Could you do some benchmark for repository with your large objects
> as loose objects created with and without core.legacyHeaders (created
> with git pre 1.5.3), and as single blob packs, perhaps kept, with
> _undocumented_ (except for RelNotes) gitattribute delta unset for
> those files?
First of all,  this is a very reasonable request and what I should be doing.
Unfortunately,  I only have the cycles at the moment to point out this
issue,  which appears to be a problem from my perspective.

Currently,
a user who wants to publish some (large) files does the following:
git add (calls deflate)
git commit
git push (builds a pack to stdout, calling inflate and deflate on each blob).

So if the blob and pack formats were more similar (different blob format,
big blobs are singleton packs, etc) the zlib calls in git push go away.
The deflate call could be sped up by using 1 for compression level,
but it still takes time.

Another "solution" is to make each workgroup member's .git/objects
be a symlink to a tree with a lot of sticky bits and do some scripting.
(This means "git push" doesn't push any data and only alters stuff
 in .git/refs/heads on the server.)
I'm not entirely enthusiastic about this,  and when I mentioned it a while
ago it did cause some retching...

> From Documentation/RelNotes-1.5.3:
>
>   - We used to have core.legacyheaders configuration, when
>     set to false, allowed git to write loose objects in a format
>     that mimicks the format used by objects stored in packs.  It
>     turns out that this was not so useful.  Although we will
>     continue to read objects written in that format, we do not
>     honor that configuration anymore and create loose objects in
>     the legacy/traditional format.
>
>   - "pack-objects" honors "delta" attribute set in
>     .gitattributes.  It does not attempt to deltify blobs that
>     come from paths with delta attribute set to false.
>
>   - diff-delta code that is used for packing has been improved
>     to work better on big files.
>
> The last part is thanks to your comments, complaints and efforts, Dana.
Yes,  there have been some very useful improvements recently.

However,  I didn't actually push for the first "-" you list;
I was pushing for the "mimic" option even then
but some argument was presented to me against it,
to which I had no counter-argument until I understood git better later.

Thanks,
-- 
Dana L. How  danahow@gmail.com  +1 650 804 5991 cell

  reply	other threads:[~2007-11-26 20:37 UTC|newest]

Thread overview: 83+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-11-25 21:48 If you would write git from scratch now, what would you change? Jakub Narebski
2007-11-25 22:23 ` Pierre Habouzit
2007-11-26  1:28   ` Steven Walter
2007-11-26  6:11     ` Junio C Hamano
2007-11-26  6:36       ` Adam Roben
2007-11-26 15:32         ` Carlos Rica
2007-11-26 16:40           ` Daniel Barkalow
2007-11-26 16:46 ` Andy Parkins
2007-11-26 17:10   ` Benoit Sigoure
2007-11-26 18:56     ` Jan Hudec
2007-11-26 19:12       ` David Kastrup
2007-11-26 19:34         ` Jan Hudec
2007-11-26 19:50           ` Michael Poole
2007-11-26 20:09             ` Jan Hudec
2007-11-26 20:31               ` Michael Poole
2007-11-26 20:48                 ` Jon Smirl
2007-11-26 20:11     ` Andy Parkins
2007-11-26 19:25   ` Marco Costalba
2007-11-27  1:20     ` Shawn O. Pearce
2007-11-27  1:46       ` Jakub Narebski
2007-11-27  1:58         ` Shawn O. Pearce
2007-11-27 11:39           ` Johannes Schindelin
2007-11-27 23:59             ` [RFC] git-gui USer's Survey 2007 (was: If you would write git from scratch now, what would you change?) Jakub Narebski
2007-11-28 12:32               ` Johannes Schindelin
2007-11-28 15:48                 ` Jason Sewall
2007-11-28 23:25                 ` Jan Hudec
2007-11-28 23:48                   ` Johannes Schindelin
2007-11-29  6:57                     ` Jan Hudec
2007-11-29 12:01                       ` Johannes Schindelin
2007-11-30 17:50                         ` Jan Hudec
2007-11-30 18:25                           ` Marco Costalba
2007-12-01  2:35                           ` Shawn O. Pearce
2007-12-01  2:53                             ` Marco Costalba
2007-11-28 13:18               ` [RFC] git-gui USer's Survey 2007 Sergei Organov
2007-11-27  8:45     ` If you would write git from scratch now, what would you change? Andy Parkins
2007-11-27 13:15       ` Marco Costalba
2007-11-27 23:56         ` Jan Hudec
2007-11-27 17:48       ` Johannes Schindelin
2007-12-04 11:00         ` Andy Parkins
2007-11-27 17:33   ` Jing Xue
2007-11-26 16:48 ` Jon Smirl
2007-11-26 17:11 ` David Kastrup
2007-11-26 19:27   ` Jan Hudec
2007-11-26 20:11     ` Benoit Sigoure
2007-11-26 20:36       ` Jan Hudec
2007-11-26 19:30   ` Nicolas Pitre
2007-11-26 19:34     ` David Kastrup
2007-11-26 19:57       ` Jan Hudec
2007-11-26 20:35         ` David Kastrup
2007-11-26 21:00           ` Jan Hudec
2007-11-26 21:28           ` Nicolas Pitre
2007-11-26 20:45         ` Wincent Colaiuta
2007-11-26 21:24           ` Junio C Hamano
2007-11-26 21:35             ` Nicolas Pitre
2007-11-26 21:47               ` Junio C Hamano
2007-11-26 22:03                 ` Nicolas Pitre
2007-11-27  1:03             ` Shawn O. Pearce
2007-11-27  3:35               ` Junio C Hamano
2007-11-27  5:10                 ` Steven Grimm
2007-11-26 21:27     ` Johannes Schindelin
2007-11-26 21:39       ` Nicolas Pitre
2007-11-26 21:40         ` Johannes Schindelin
2007-11-27 14:11     ` Andreas Ericsson
2007-11-27 14:38       ` Jakub Narebski
2007-11-26 19:18 ` Dana How
2007-11-26 19:52   ` Nicolas Pitre
2007-11-26 20:17     ` Dana How
2007-11-26 20:55       ` Nicolas Pitre
2007-11-26 22:02         ` Dana How
2007-11-26 22:22           ` Nicolas Pitre
2007-11-26 20:17   ` Jakub Narebski
2007-11-26 20:36     ` Dana How [this message]
2007-11-27  1:25   ` Shawn O. Pearce
2007-11-27  5:07     ` Nicolas Pitre
2007-11-27  1:48 ` Shawn O. Pearce
2007-11-27  1:54   ` Junio C Hamano
2007-11-27  1:59     ` Shawn O. Pearce
2007-11-27  2:15       ` Jakub Narebski
2007-11-27 11:47         ` C# binding, was " Johannes Schindelin
2007-11-27  4:58   ` Nicolas Pitre
2007-11-27  5:59     ` Dana How
2007-11-27  6:12       ` Shawn O. Pearce
2007-11-27 16:33     ` Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=56b7f5510711261236n732d56dci6335541391e1e137@mail.gmail.com \
    --to=danahow@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=jnareb@gmail.com \
    --cc=nico@cam.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).