git@vger.kernel.org mailing list mirror (one of many)
From: Ramkumar Ramachandra <artagnon@gmail.com>
To: Jonathan Nieder <jrnieder@gmail.com>, Junio C Hamano <gitster@pobox.com>
Cc: Git List <git@vger.kernel.org>,
	David Barr <david.barr@cordelta.com>,
	Sverre Rabbelier <srabbelier@gmail.com>
Subject: Re: [PATCH 4/5] fast-export: Introduce --inline-blobs
Date: Thu, 20 Jan 2011 10:20:48 +0530
Message-ID: <20110120045046.GB9493@kytes>
In-Reply-To: <20110119214827.GA31733@burratino>

Hi,

Jonathan Nieder writes:
> Junio C Hamano wrote:
> > Ramkumar Ramachandra <artagnon@gmail.com> writes:
> 
> >> Introduce a new command-line option --inline-blobs that always inlines
> >> blobs instead of referring to them via marks or their original SHA-1
> >> hash.
> [...]
> > Hmm, this smells somewhat fishy.
> >
> > Wasn't G-F-I designed to be a common stream format for other SCMs to
> > generate streams, so that importers and exporters can be written once for
> > each SCM to interoperate?
> 
> Here is one way to sell it:
> 
> 	With the inline blobs feature, fast-import backends have to
> 	maintain less state.  Using it should speed up exporting.
> 
> 	This is made optional because ...
> 
> I haven't thought through whether it ought to be optional or measured
> the effect on import performance.

It simplifies other fast-import backends greatly, because persisting
blobs can be complicated and expensive. I was thinking of making
svn-fe support both inlined blobs, and blobs referenced by marks. When
it's possible to be cheap by optionally having inlined blobs, why not
optionally have them? The filter we develop later can be used for
older fast-import streams that don't have inlined blobs.
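
To make the difference concrete, here is a minimal sketch of the two
ways a stream can attach a blob to a file change. The helper names
(blob_by_mark, blob_inline) are hypothetical and exist only for this
illustration; the command syntax follows the "blob", "data" and
filemodify ("M") commands described in git-fast-import(1). In the
mark form the consumer has to remember every blob until some later
"M" line names its mark; in the inline form the content arrives
together with the file change.

```python
def blob_by_mark(mark, mode, path, content):
    """Mark-referenced form: a standalone blob plus an 'M' line naming its mark.

    All arguments except `mark` (an int) are bytes. Hypothetical helper.
    """
    blob = b"blob\nmark :%d\ndata %d\n%s\n" % (mark, len(content), content)
    modify = b"M %s :%d %s\n" % (mode, mark, path)
    return blob, modify


def blob_inline(mode, path, content):
    """Inline form: the 'M' line is followed directly by the blob's data."""
    return b"M %s inline %s\ndata %d\n%s\n" % (mode, path, len(content), content)


if __name__ == "__main__":
    blob, modify = blob_by_mark(1, b"100644", b"hello.txt", b"hello\n")
    print(blob.decode(), modify.decode(), sep="")
    print(blob_inline(b"100644", b"hello.txt", b"hello\n").decode(), end="")
```

A backend that only ever sees the second form never needs a
mark-to-blob table.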

On a related note, does it make sense to version our fast-import
stream format? It's certainly going to keep evolving with time, and we
need backward compatibility.

> A separate question is what an svn fast-import backend should do with
> all those blobs that are not ready to be written to dump.  As a hack
> while prototyping, one can rely on the "current" fast-export output,
> even though that is not flexible or futureproof.  Longer term, the
> following sounds very interesting

Good point. The functionality to persist blobs that are referenced by
marks probably shouldn't be in svn-fe at all.

> > Just thinking aloud, but is it possible to write a filter that converts an
> > arbitrary G-F-I stream with referenced blobs into a G-F-I stream without
> > referenced blobs by inlining all the blobs?
> 
> to avoid complexity in the svn fast-import backend itself.
> (Complicating detail: such a filter would presumably take responsibility
> for --export-marks, so it might want a way to retrieve commit marks
> from its downstream.)

This filter will need to persist every blob for the entire lifetime of
the program. We can't possibly do it in-memory, so we have to find
some way to persist them on-disk and retrieve them very
quickly. Jonathan suggested using something like Tokyo Cabinet earlier;
I'll start working and see what I come up with.
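
As a rough idea of what such a filter could look like, here is a
minimal sketch, not a finished design. It makes several assumptions
that a real filter could not: blobs are referenced only by marks
(never by SHA-1), every "data" command uses the exact byte count form,
and paths are copied verbatim. In place of a dedicated key-value store
it spills blobs into a plain temporary file and keeps only a
mark-to-(offset, length) index in memory. The function names
(inline_blobs, stash_blob) are made up for this example.

```python
import io
import tempfile


def stash_blob(src, store, index):
    """Consume one blob command (after its 'blob' line) and spill it to disk.

    Returns the next unprocessed line of the stream.
    """
    line = src.readline()
    mark = None
    if line.startswith(b"mark "):
        mark = line[5:].rstrip(b"\n")
        line = src.readline()
    if not line.startswith(b"data "):
        raise ValueError("unsupported blob header: %r" % line)
    size = int(line[5:].strip())
    payload = src.read(size)
    if mark is not None:
        # Append to the spill file and remember where the blob lives.
        store.seek(0, io.SEEK_END)
        index[mark] = (store.tell(), size)
        store.write(payload)
    nxt = src.readline()
    # Swallow the optional LF that may follow the blob's data.
    return src.readline() if nxt == b"\n" else nxt


def inline_blobs(src, dst):
    """Copy a fast-import stream from src to dst, inlining marked blobs."""
    index = {}  # mark (bytes, e.g. b":1") -> (offset, length) in the spill file
    with tempfile.TemporaryFile() as store:
        line = src.readline()
        while line:
            if line == b"blob\n":
                line = stash_blob(src, store, index)
                continue
            parts = line.rstrip(b"\n").split(b" ", 3)
            if line.startswith(b"data "):
                # Commit or tag message (or already-inline blob): copy the
                # payload through so its bytes are never parsed as commands.
                dst.write(line)
                dst.write(src.read(int(line[5:].strip())))
            elif len(parts) == 4 and parts[0] == b"M" and parts[2].startswith(b":"):
                _, mode, mark, path = parts
                offset, length = index[mark]
                store.seek(offset)
                dst.write(b"M " + mode + b" inline " + path + b"\n")
                dst.write(b"data %d\n" % length)
                dst.write(store.read(length))
                dst.write(b"\n")
            else:
                dst.write(line)
            line = src.readline()


if __name__ == "__main__":
    import sys
    inline_blobs(sys.stdin.buffer, sys.stdout.buffer)
```

The sketch drops the blob commands and their marks from the output, so
it says nothing yet about how --export-marks would be handled, and a
single append-only file with an in-memory index may or may not be fast
enough for large histories.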

-- Ram
