git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Kevin Wern <kevin.m.wern@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Kevin Wern <kevin.m.wern@gmail.com>, git@vger.kernel.org
Subject: Re: [PATCH 11/11] Resumable clone: implement primer logic in git-clone
Date: Wed, 28 Sep 2016 01:49:51 -0400	[thread overview]
Message-ID: <20160928054951.GF3762@kwern-HP-Pavilion-dv5-Notebook-PC> (raw)
In-Reply-To: <xmqqy42rh156.fsf@gitster.mtv.corp.google.com>

On Fri, Sep 16, 2016 at 04:32:05PM -0700, Junio C Hamano wrote:
> >  -----------
> > @@ -172,6 +173,12 @@ objects from the source repository into a pack in the cloned repository.
> >  	via ssh, this specifies a non-default path for the command
> >  	run on the other end.
> >  
> > +--prime-clone <prime-clone>::
> > +-p <prime-clone>::
> 
> Not many other options have single letter shorthand.  Is it expected
> that it is worth to let this option squat on a short-and-sweet "-p",
> perhaps because it is so frequently used?

I based my decision more on precedent than value--the "--upload-pack" option
had the shorthand "-u".

> > @@ -40,17 +42,20 @@ static const char * const builtin_clone_usage[] = {
> >  
> >  static int option_no_checkout, option_bare, option_mirror, option_single_branch = -1;
> >  static int option_local = -1, option_no_hardlinks, option_shared, option_recursive;
> > +static int option_resume;
> >  static char *option_template, *option_depth;
> > -static char *option_origin = NULL;
> > +static const char *option_origin = NULL;
> 
> Is this change related to anything you are doing here?
> 
> If you are fixing things while at it, please don't ;-) If you really
> want to, please also remove " = NULL", from this line and also from
> the next line.  Also do not add " = NULL" at the end of alt_res.
> 

Yes, it was because remote_config uses all const char* strings, and .name gets
assigned to option_origin when it pulls the previous info from config. I looked
to see if option_origin's underlying string is ever modified in a way that
makes it ineligible to be const, and it wasn't.

So, I should remove the NULLs here? And then initalize alt_res to NULL in
cmd_clone?

> >  static char *option_branch = NULL;
> >  ...
> > +static const struct alt_resource *alt_res = NULL;
> 
> > +static char *get_filename(const char *dir)
> > +{
> > +	char *dir_copy = xstrdup(dir);
> > +	strip_trailing_slashes(dir_copy);
> > +	char *filename, *final = NULL;
> > +
> > +	filename = find_last_dir_sep(dir);
> > +
> > +	if (filename && *(++filename))
> > +		final = xstrdup(filename);
> > +
> > +	free(dir_copy);
> > +	return final;
> > +}
> 
> Hmph, don't we have our own basename(3) lookalike that knows about
> dir-sep already?

Whoops, did not catch that. Thanks!

> > @@ -562,7 +614,7 @@ static void write_remote_refs(const struct ref *local_refs)
> >  		die("%s", err.buf);
> >  
> >  	for (r = local_refs; r; r = r->next) {
> > -		if (!r->peer_ref)
> > +		if (!r->peer_ref || ref_exists(r->peer_ref->name))
> >  			continue;
> >  		if (ref_transaction_create(t, r->peer_ref->name, r->old_oid.hash,
> >  					   0, NULL, &err))
> 
> What is this change about?

Because resumable clone is supposed to handle halting, I included the
possibility of halting after the final fetch happened and remote refs were
updated (basically anything after resumable download happens). This change
basically says "If the remote ref was not previously written, then write it."
Because the write is all-or-nothing, this means the logic should either write
all of the intended references, or none of them (because they were written
in the previous invocation).

> > +static const char *setup_and_index_pack(const char *filename)
> > +{
> > +	const char *primer_idx_path = NULL, *primer_bndl_path = NULL;
> > +	primer_idx_path = replace_extension(filename, ".pack", ".idx");
> > +	primer_bndl_path = replace_extension(filename, ".pack", ".bndl");
> > +
> > +	if (!(primer_idx_path && primer_bndl_path)) {
> > +		warning("invalid pack filename '%s', falling back to full "
> > +			"clone", filename);
> > +		return NULL;
> > +	}
> > +
> > +	if (!file_exists(primer_bndl_path)) {
> > +		if (do_index_pack(filename, primer_idx_path)) {
> > +			warning("could not index primer pack, falling back to "
> > +				"full clone");
> > +			return NULL;
> > +		}
> > +	}
> 
> Can it be another (undetected) failure mode that .bndl somehow
> already existed, but not .idx, leaving the resulting object store in
> an incosistent state?  Can do_index_pack() fail and leave .bndl
> behind to get you into such a state?

I don't think so, based on looking at builtin/index-pack.c. write_bundle_file()
happens at the very end of final(), which is long after the process created the
resulting .idx, or has died in the event one is not created. There also is no
automatic cleanup of .idx after that if the process dies somehow after .bndl
is written.

> > +static int write_bundle_refs(const char *bundle_filename)
> > +{
> > +	struct ref_transaction *t;
> > +	struct bundle_header history_tips;
> > +	const char *temp_ref_base = "resume";
> > +	struct strbuf err = STRBUF_INIT;
> > +	int i;
> > +
> > +	init_bundle_header(&history_tips, bundle_filename);
> > +	read_bundle_header(&history_tips);
> > +
> > +	t = ref_transaction_begin(&err);
> > +	for (i = 0; i < history_tips.references.nr; i++) {
> > +		struct strbuf ref_name = STRBUF_INIT;
> > +		strbuf_addf(&ref_name, "refs/temp/%s/%s/temp-%s",
> > +			    option_origin, temp_ref_base,
> > +			    sha1_to_hex(history_tips.references.list[i].sha1));
> 
> Can we do this without polluting refs/temp/ namespace?
> 
> I am imagining that you are first fetching the .pack file from
> sideways when primer service is available, running index-pack on it
> to produce the bundle, and the step after that is to run "git fetch"
> against the original remote to fill the gap between the bit-stale
> history you got in the bundle and the reality that has progressed
> since the primer pack was made, and you need a way to tell to the
> other end that you already have the history leading to these refs
> when you run "git fetch".  I think a bit better way to do so is to
> send these has ".have" while you run the "fetch".
> 
> Wouldn't it do if you add the "--advertise-bundle-tips=<bndl>"
> option to "git fetch", move the code to read the bundle header to
> it, and point the bundle's filename with the option when you spawn
> "git fetch"?

My implementation is definitely a minimum working model. I was hoping to
move to something cleaner.

Is this new option preferable to leveraging the "--reference" option you
mentioned in the earlier discussion? I thought that was a clean solution,
especially because it uses an existing option. Additionally, would there be
any use for this new fetch option outside of cloning? If so, I could see the
value--otherwise, knowing that we want to keep as much resume-specific
knowledge inside clone as possible, "--reference" with a .bndl seems better.

- Kevin

  reply	other threads:[~2016-09-28  5:49 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-09-16  0:12 [PATCH 00/11] Resumable clone Kevin Wern
2016-09-16  0:12 ` [PATCH 01/11] Resumable clone: create service git-prime-clone Kevin Wern
2016-09-16 20:53   ` Junio C Hamano
2016-09-28  4:40     ` Kevin Wern
2016-09-16  0:12 ` [PATCH 02/11] Resumable clone: add prime-clone endpoints Kevin Wern
2016-09-19 13:15   ` Duy Nguyen
2016-09-28  4:43     ` Kevin Wern
2016-09-16  0:12 ` [PATCH 03/11] pkt-line: create gentle packet_read_line functions Kevin Wern
2016-09-16 22:17   ` Junio C Hamano
2016-09-28  4:42     ` Kevin Wern
2016-09-16  0:12 ` [PATCH 04/11] Resumable clone: add prime-clone to remote-curl Kevin Wern
2016-09-19 13:52   ` Duy Nguyen
2016-09-28  6:45     ` Kevin Wern
2016-09-16  0:12 ` [PATCH 05/11] Resumable clone: add output parsing to connect.c Kevin Wern
2016-09-16  0:12 ` [PATCH 06/11] Resumable clone: implement transport_prime_clone Kevin Wern
2016-09-16  0:12 ` [PATCH 07/11] Resumable clone: add resumable download to http/curl Kevin Wern
2016-09-16 22:45   ` Junio C Hamano
2016-09-28  6:41     ` Kevin Wern
2016-09-16  0:12 ` [PATCH 08/11] Resumable clone: create transport_download_primer Kevin Wern
2016-09-16  0:12 ` [PATCH 09/11] path: add resumable marker Kevin Wern
2016-09-19 13:24   ` Duy Nguyen
2016-09-16  0:12 ` [PATCH 10/11] run command: add RUN_COMMAND_NO_STDOUT Kevin Wern
2016-09-16 23:07   ` Junio C Hamano
2016-09-18 19:22     ` Johannes Schindelin
2016-09-28  4:46     ` Kevin Wern
2016-09-28 17:54       ` Junio C Hamano
2016-09-28 18:06         ` Kevin Wern
2016-09-16  0:12 ` [PATCH 11/11] Resumable clone: implement primer logic in git-clone Kevin Wern
2016-09-16 23:32   ` Junio C Hamano
2016-09-28  5:49     ` Kevin Wern [this message]
2016-09-19 14:04   ` Duy Nguyen
2016-09-19 17:16     ` Junio C Hamano
2016-09-28  4:44     ` Kevin Wern
2016-09-16 20:47 ` [PATCH 00/11] Resumable clone Junio C Hamano
2016-09-27 21:51 ` Eric Wong
2016-09-27 22:07   ` Junio C Hamano
2016-09-28 17:32     ` Junio C Hamano
2016-09-28 18:22       ` Junio C Hamano
2016-09-28 20:46     ` Eric Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160928054951.GF3762@kwern-HP-Pavilion-dv5-Notebook-PC \
    --to=kevin.m.wern@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).