[Patch] Prevent cloning over http from spewing

git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed

* [Patch] Prevent cloning over http from spewing
@ 2009-06-02 17:42 sparse
  2009-06-03 10:21 ` Erik Faye-Lund
  2009-06-03 10:39 ` Jakub Narebski
  0 siblings, 2 replies; 23+ messages in thread
From: sparse @ 2009-06-02 17:42 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 145 bytes --]

When cloning over http git spews a bunch of hashs that don't really convey much.
The attached patch disables them unless --verbose is specified.

[-- Attachment #2: patch --]
[-- Type: text/plain, Size: 1192 bytes --]

commit 67be7a94c94100d24bcccf1248a26afc1dec8d05
Author: Jeff Muizelaar <jmuizelaar@mozilla.com>
Date:   Tue Jun 2 13:32:41 2009 -0400

    Prevent cloning over http from spewing
    
    This propogates the verbose flag to the transport and makes
    the walker verbose only if the transport is verbose.

diff --git a/builtin-clone.c b/builtin-clone.c
index 5c46496..f25d60b 100644
--- a/builtin-clone.c
+++ b/builtin-clone.c
@@ -502,8 +502,10 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 
 		if (option_quiet)
 			transport->verbose = -1;
-		else if (option_verbose)
+		else if (option_verbose) {
 			transport->progress = 1;
+			transport->verbose = 1;
+		}
 
 		if (option_upload_pack)
 			transport_set_option(transport, TRANS_OPT_UPLOADPACK,
diff --git a/transport.c b/transport.c
index 17891d5..3bb87f3 100644
--- a/transport.c
+++ b/transport.c
@@ -366,7 +366,7 @@ static int fetch_objs_via_walker(struct transport *transport,
 	walker->get_all = 1;
 	walker->get_tree = 1;
 	walker->get_history = 1;
-	walker->get_verbosely = transport->verbose >= 0;
+	walker->get_verbosely = transport->verbose > 0;
 	walker->get_recover = 0;
 
 	for (i = 0; i < nr_objs; i++)

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [Patch] Prevent cloning over http from spewing
  2009-06-02 17:42 [Patch] Prevent cloning over http from spewing sparse
@ 2009-06-03 10:21 ` Erik Faye-Lund
  2009-06-03 10:39 ` Jakub Narebski
  1 sibling, 0 replies; 23+ messages in thread
From: Erik Faye-Lund @ 2009-06-03 10:21 UTC (permalink / raw)
  To: sparse; +Cc: git

On Tue, Jun 2, 2009 at 7:42 PM,  <sparse@infidigm.net> wrote:
> When cloning over http git spews a bunch of hashs that don't really convey much.
> The attached patch disables them unless --verbose is specified.
>

We prefer patches inline on this mailing-list. See
Documentation/SubmittingPatches, "(3) Sending your patches." for more
details.

-- 
Erik "kusma" Faye-Lund
kusmabite@gmail.com
(+47) 986 59 656

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Patch] Prevent cloning over http from spewing
  2009-06-02 17:42 [Patch] Prevent cloning over http from spewing sparse
  2009-06-03 10:21 ` Erik Faye-Lund
@ 2009-06-03 10:39 ` Jakub Narebski
  2009-06-03 18:28   ` Junio C Hamano
  1 sibling, 1 reply; 23+ messages in thread
From: Jakub Narebski @ 2009-06-03 10:39 UTC (permalink / raw)
  To: sparse; +Cc: git

sparse@infidigm.net writes:

> When cloning over http git spews a bunch of hashs that don't really convey much.
> The attached patch disables them unless --verbose is specified.
> 

Please follow guidelines stated in Documentation/SubmittingPatches:
 * it is preferred to have patch inline, rather than as attachement
 * put comments between "---" and the diffstat

Also please do the following:
 * use git-format-patch output rather than that of git-show to
   generate patch to send.

Also "Prevent cloning over http from spewing" is not, in my opinion,
good description; "Make cloning over http non-verbose by default"
would be better.

Finally a question: if we turn off verbose output, do we have any kind
of progress info fot git-clone over http?

-- 
Jakub Narebski
Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Patch] Prevent cloning over http from spewing
  2009-06-03 10:39 ` Jakub Narebski
@ 2009-06-03 18:28   ` Junio C Hamano
  2009-06-03 19:10     ` Jeff King
  2009-06-05  0:17     ` Jakub Narebski
  0 siblings, 2 replies; 23+ messages in thread
From: Junio C Hamano @ 2009-06-03 18:28 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: sparse, git

Jakub Narebski <jnareb@gmail.com> writes:

> Finally a question: if we turn off verbose output, do we have any kind
> of progress info fot git-clone over http?

This was discussed before.

For example, Shawn wrote in September 2007 in $gmane/59433

    ... Few people really grok the pack-objects/index-pack output, nor do they
    care.  They just want to know two things:

      - Is git working or frozen in limbo?
      - Is git frozen because my network sucks, or git sucks?
      - When will this be done?  Please $DIETY, I want to go home!
        Make the fetch finish!

    The C based git-fetch made the http protocol almost totally silent.
    (No more got/walk messages.)  But that's actually also bad as it
    now doesn't even let you know its transferring anything at all.

You _could_ do something like this patch.  Instead of showing "walk %s"
and "got %s" lines, with occasional "Getting pack %s\n which contains %s",
it shows and recycles a single line that shows the number of packs, walk
actions and got actions.

This is a toy patch; it hiccups for too long while getting each pack, and
it does not cleanly restore the display after it finishes, but I'll leave
it to interested readers as an exercise to properly do this using the
progress API.

Currently many http related patches are in flight, so it may not be the
best time to do so, though.

 http-walker.c |   11 +++--------
 walker.c      |   21 +++++++++++++++++----
 walker.h      |   10 +++++++++-
 3 files changed, 29 insertions(+), 13 deletions(-)

diff --git a/http-walker.c b/http-walker.c
index 7321ccc..3303bc1 100644
--- a/http-walker.c
+++ b/http-walker.c
@@ -256,7 +256,7 @@ static void finish_object_request(struct object_request *obj_req)
 		move_temp_to_file(obj_req->tmpfile, obj_req->filename);
 
 	if (obj_req->rename == 0)
-		walker_say(obj_req->walker, "got %s\n", sha1_to_hex(obj_req->sha1));
+		walker_progress(obj_req->walker, WALKER_GOT);
 }
 
 static void process_object_response(void *callback_data)
@@ -733,12 +733,7 @@ static int fetch_pack(struct walker *walker, struct alt_base *repo, unsigned cha
 	if (!target)
 		return -1;
 
-	if (walker->get_verbosely) {
-		fprintf(stderr, "Getting pack %s\n",
-			sha1_to_hex(target->sha1));
-		fprintf(stderr, " which contains %s\n",
-			sha1_to_hex(sha1));
-	}
+	walker_progress(walker, WALKER_PACK);
 
 	url = xmalloc(strlen(repo->base) + 65);
 	sprintf(url, "%s/objects/pack/pack-%s.pack",
@@ -907,7 +902,7 @@ struct walker *get_http_walker(const char *url, struct remote *remote)
 {
 	char *s;
 	struct walker_data *data = xmalloc(sizeof(struct walker_data));
-	struct walker *walker = xmalloc(sizeof(struct walker));
+	struct walker *walker = xcalloc(1, sizeof(struct walker));
 
 	http_init(remote);
 
diff --git a/walker.c b/walker.c
index e57630e..aaf3e2c 100644
--- a/walker.c
+++ b/walker.c
@@ -9,10 +9,23 @@
 
 static unsigned char current_commit_sha1[20];
 
-void walker_say(struct walker *walker, const char *fmt, const char *hex)
+void walker_progress(struct walker *walker, enum walker_action action)
 {
-	if (walker->get_verbosely)
-		fprintf(stderr, fmt, hex);
+	switch (action) {
+	default:
+		break;
+	case WALKER_GOT:
+		walker->got++;
+		break;
+	case WALKER_WALK:
+		walker->walked++;
+		break;
+	case WALKER_PACK:
+		walker->got_pack++;
+		break;
+	}
+	fprintf(stderr, "pack %-5u walk %-10lu got %-10lu\r",
+		walker->got_pack, walker->walked, walker->got);
 }
 
 static void report_missing(const struct object *obj)
@@ -83,7 +96,7 @@ static int process_commit(struct walker *walker, struct commit *commit)
 
 	hashcpy(current_commit_sha1, commit->object.sha1);
 
-	walker_say(walker, "walk %s\n", sha1_to_hex(commit->object.sha1));
+	walker_progress(walker, WALKER_WALK);
 
 	if (walker->get_tree) {
 		if (process(walker, &commit->tree->object))
diff --git a/walker.h b/walker.h
index 8a149e1..5e7c1f6 100644
--- a/walker.h
+++ b/walker.h
@@ -9,6 +9,9 @@ struct walker {
 	void (*prefetch)(struct walker *, unsigned char *sha1);
 	int (*fetch)(struct walker *, unsigned char *sha1);
 	void (*cleanup)(struct walker *);
+	unsigned long walked;
+	unsigned long got;
+	unsigned got_pack;
 	int get_tree;
 	int get_history;
 	int get_all;
@@ -19,7 +22,12 @@ struct walker {
 };
 
 /* Report what we got under get_verbosely */
-void walker_say(struct walker *walker, const char *, const char *);
+enum walker_action {
+	WALKER_GOT,
+	WALKER_WALK,
+	WALKER_PACK,
+};
+void walker_progress(struct walker *, enum walker_action);
 
 /* Load pull targets from stdin */
 int walker_targets_stdin(char ***target, const char ***write_ref);

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [Patch] Prevent cloning over http from spewing
  2009-06-03 18:28   ` Junio C Hamano
@ 2009-06-03 19:10     ` Jeff King
  2009-06-03 19:15       ` Shawn O. Pearce
  2009-06-05  0:17     ` Jakub Narebski
  1 sibling, 1 reply; 23+ messages in thread
From: Jeff King @ 2009-06-03 19:10 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jakub Narebski, sparse, git

On Wed, Jun 03, 2009 at 11:28:40AM -0700, Junio C Hamano wrote:

> You _could_ do something like this patch.  Instead of showing "walk %s"
> and "got %s" lines, with occasional "Getting pack %s\n which contains %s",
> it shows and recycles a single line that shows the number of packs, walk
> actions and got actions.
> 
> This is a toy patch; it hiccups for too long while getting each pack, and
> it does not cleanly restore the display after it finishes, but I'll leave
> it to interested readers as an exercise to properly do this using the
> progress API.

This is much better, and I had roughly the same thought when I saw the
original "stop http from spewing" patch. The output is confusing, but we
need _some_ progress indicator.

Unfortunately, I don't think it's possible to give a "percent complete"
because we are fetching as we walk, meaning we don't know where the end
is until we get there (but I might be wrong, as I have never looked too
closely at the http fetch code).

So I think the best we could do is probably:

  1. summarize what we have fetched (N packs, N loose objects)
  2. show what we are currently fetching (object or pack)
  3. show the number of bytes retrieved for the current item
  4. if the server provides content-length, show the percentage
     completed for this object
  5. show the current throughput

That should give the user some eye candy to show something is happening,
and for a mostly packed repo, should come close to being an overall
progress indicator (since most of the time will be spent on one big
pack, that will be most of what the user sees, and its progress
indicator will dominate the overall time).

I suspect items 3-5 could mostly re-use the existing progress code, too.

-Peff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Patch] Prevent cloning over http from spewing
  2009-06-03 19:10     ` Jeff King
@ 2009-06-03 19:15       ` Shawn O. Pearce
  2009-06-03 19:24         ` Jeff King
  2009-06-04 12:45         ` Tay Ray Chuan
  0 siblings, 2 replies; 23+ messages in thread
From: Shawn O. Pearce @ 2009-06-03 19:15 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Jakub Narebski, sparse, git

Jeff King <peff@peff.net> wrote:
> On Wed, Jun 03, 2009 at 11:28:40AM -0700, Junio C Hamano wrote:
> > This is a toy patch; it hiccups for too long while getting each pack, and
> > it does not cleanly restore the display after it finishes, but I'll leave
> > it to interested readers as an exercise to properly do this using the
> > progress API.
> 
> This is much better, and I had roughly the same thought when I saw the
> original "stop http from spewing" patch. The output is confusing, but we
> need _some_ progress indicator.
> 
> Unfortunately, I don't think it's possible to give a "percent complete"
> because we are fetching as we walk, meaning we don't know where the end
> is until we get there (but I might be wrong, as I have never looked too
> closely at the http fetch code).

No, you are right Peff, you can't give a "percent complete" because
you don't know how much you need to fetch.

What we could do is try to organize the fetch queue by object type,
get all commits, then all trees, then blobs.  The blobs are the
bulk of the data, and by the time we hit them, we should be able
to give some estimate on progress because we have all of the ones
we need to fetch in our fetch queue.  But its only a "object count"
sort of thing, not a byte count.

When fetching a .idx or a .pack though, we should be able to show
progress... assuming the server sent us a Content-Length header.
If not, in the case of a pack, we could still show receive speed
like index-pack does.

>   1. summarize what we have fetched (N packs, N loose objects)
>   2. show what we are currently fetching (object or pack)
>   3. show the number of bytes retrieved for the current item
>   4. if the server provides content-length, show the percentage
>      completed for this object
>   5. show the current throughput

Yea.  My idea of sorting by type is also a lot more work... and
probably doesn't buy us any better output than what you just said.
:-)

-- 
Shawn.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Patch] Prevent cloning over http from spewing
  2009-06-03 19:15       ` Shawn O. Pearce
@ 2009-06-03 19:24         ` Jeff King
  2009-06-03 19:32           ` Shawn O. Pearce
  2009-06-04 12:45         ` Tay Ray Chuan
  1 sibling, 1 reply; 23+ messages in thread
From: Jeff King @ 2009-06-03 19:24 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Junio C Hamano, Jakub Narebski, sparse, git

On Wed, Jun 03, 2009 at 12:15:55PM -0700, Shawn O. Pearce wrote:

> What we could do is try to organize the fetch queue by object type,
> get all commits, then all trees, then blobs.  The blobs are the
> bulk of the data, and by the time we hit them, we should be able
> to give some estimate on progress because we have all of the ones
> we need to fetch in our fetch queue.  But its only a "object count"
> sort of thing, not a byte count.

That's clever, and I think an "object count" would be fine (after all,
that is all that git:// fetching provides). However, I'm not sure how it
would work in practice. When we follow a walk to a commit in a pack, do
we really want to try to pull _just_ that commit?

For one thing, we would need the server to support partial fetches (and
it is my assumption that we don't bother with that at all now).  I don't
know how widespread that is these days (and of course we would still
need to fall back to fetching the full pack). But even if we _could_,
would we get killed by http protocol overhead for each object? Certainly
it would be no worse than fetching a totally unpacked repo, but I kind
of assume such a fetch would be painful.

Or given that the packs should be organized by type, are you proposing
to fetch just the "commit part" as a single entity, then "tree part",
then the "blob part"? I'm a little hesitant to rely too much on what is
basically a performance heuristic for the pack organization (and god
forbid packv4 ever gets finished ;) ).

-Peff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Patch] Prevent cloning over http from spewing
  2009-06-03 19:24         ` Jeff King
@ 2009-06-03 19:32           ` Shawn O. Pearce
  2009-06-03 19:44             ` Jeff King
  0 siblings, 1 reply; 23+ messages in thread
From: Shawn O. Pearce @ 2009-06-03 19:32 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Jakub Narebski, sparse, git

Jeff King <peff@peff.net> wrote:
> On Wed, Jun 03, 2009 at 12:15:55PM -0700, Shawn O. Pearce wrote:
> 
> > What we could do is try to organize the fetch queue by object type,
> > get all commits, then all trees, then blobs.  The blobs are the
> > bulk of the data, and by the time we hit them, we should be able
> > to give some estimate on progress because we have all of the ones
> > we need to fetch in our fetch queue.  But its only a "object count"
> > sort of thing, not a byte count.
> 
> That's clever, and I think an "object count" would be fine (after all,
> that is all that git:// fetching provides). However, I'm not sure how it
> would work in practice. When we follow a walk to a commit in a pack, do
> we really want to try to pull _just_ that commit?

No, we pull the whole pack.  So the progress meter would have to
switch to do a content-length thing for the pack pull, then go back
to the object queue.

If that means we just pulled *all* of the blobs we have queued up,
great, we can probably actually whack them out of the queue once
the pack is down.  Actually, that's really smart to do, because
then we don't build up a massive list of objects when cloning a
very large repository like Gentoo.

By delaying trees/blobs, I meant delaying them for loose object
fetch only, not pack based fetch.

-- 
Shawn.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Patch] Prevent cloning over http from spewing
  2009-06-03 19:32           ` Shawn O. Pearce
@ 2009-06-03 19:44             ` Jeff King
  2009-06-03 19:52               ` Shawn O. Pearce
  0 siblings, 1 reply; 23+ messages in thread
From: Jeff King @ 2009-06-03 19:44 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Junio C Hamano, Jakub Narebski, sparse, git

On Wed, Jun 03, 2009 at 12:32:06PM -0700, Shawn O. Pearce wrote:

> > That's clever, and I think an "object count" would be fine (after all,
> > that is all that git:// fetching provides). However, I'm not sure how it
> > would work in practice. When we follow a walk to a commit in a pack, do
> > we really want to try to pull _just_ that commit?
> 
> No, we pull the whole pack.  So the progress meter would have to
> switch to do a content-length thing for the pack pull, then go back
> to the object queue.
> [...]
> By delaying trees/blobs, I meant delaying them for loose object
> fetch only, not pack based fetch.

Ah, OK, I see. I wonder if that would make a big difference in practice.
I expect repos to be fairly packed these days because of the I/O
benefits (and even if people aren't doing it manually, we auto-gc
working repos and keep pushed packs for publishing repos). So in
practice by the time you got an accurate object count, you would be more
or less done with the fetch (because you would have been grabbing the
related blobs in packs as you grabbed commits and trees).

-Peff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Patch] Prevent cloning over http from spewing
  2009-06-03 19:44             ` Jeff King
@ 2009-06-03 19:52               ` Shawn O. Pearce
  0 siblings, 0 replies; 23+ messages in thread
From: Shawn O. Pearce @ 2009-06-03 19:52 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Jakub Narebski, sparse, git

Jeff King <peff@peff.net> wrote:
> On Wed, Jun 03, 2009 at 12:32:06PM -0700, Shawn O. Pearce wrote:
> 
> > > That's clever, and I think an "object count" would be fine (after all,
> > > that is all that git:// fetching provides). However, I'm not sure how it
> > > would work in practice. When we follow a walk to a commit in a pack, do
> > > we really want to try to pull _just_ that commit?
> > 
> > No, we pull the whole pack.  So the progress meter would have to
> > switch to do a content-length thing for the pack pull, then go back
> > to the object queue.
> > [...]
> > By delaying trees/blobs, I meant delaying them for loose object
> > fetch only, not pack based fetch.
> 
> Ah, OK, I see. I wonder if that would make a big difference in practice.
> I expect repos to be fairly packed these days because of the I/O
> benefits (and even if people aren't doing it manually, we auto-gc
> working repos and keep pushed packs for publishing repos). So in
> practice by the time you got an accurate object count, you would be more
> or less done with the fetch (because you would have been grabbing the
> related blobs in packs as you grabbed commits and trees).

Good point, but not all repos are well packed.  Think of a repo that
gets pushed to a few times a day, always under 100 objects per push,
maybe Linus'.  Its a lot of loose stuff to fetch.

But, you probably are right that many of them are packed... and this
trick to try and get a more accurate progress display wouldn't be
worth the effort most of the time.  Likely not worth coding.

-- 
Shawn.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Patch] Prevent cloning over http from spewing
  2009-06-03 19:15       ` Shawn O. Pearce
  2009-06-03 19:24         ` Jeff King
@ 2009-06-04 12:45         ` Tay Ray Chuan
  2009-06-04 16:01           ` Jeff King
  2009-06-07 11:25           ` Tay Ray Chuan
  1 sibling, 2 replies; 23+ messages in thread
From: Tay Ray Chuan @ 2009-06-04 12:45 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Jeff King, Junio C Hamano, Jakub Narebski, sparse, git

Hi,

On Thu, Jun 4, 2009 at 3:15 AM, Shawn O. Pearce <spearce@spearce.org> wrote:
> No, you are right Peff, you can't give a "percent complete" because
> you don't know how much you need to fetch.
>
> What we could do is try to organize the fetch queue by object type,
> get all commits, then all trees, then blobs.  The blobs are the
> bulk of the data, and by the time we hit them, we should be able
> to give some estimate on progress because we have all of the ones
> we need to fetch in our fetch queue.  But its only a "object count"
> sort of thing, not a byte count.
>
> When fetching a .idx or a .pack though, we should be able to show
> progress... assuming the server sent us a Content-Length header.
> If not, in the case of a pack, we could still show receive speed
> like index-pack does.

I have a branch 'http-progress-indicators' at my repo
git://github.com/rctay/git.git. It contains a patch on walker.c that
updates the object total as the fetch goes along. The progress
indicator says "Processing targets:..." for lack of a better name; I'm
all for suggestions.

The branch also patches that display progress meters for the
downloading of .idx and .pack files. I also added a progress indicator
for verifying pack files in pack-check.c, because I noticed some
significant time was spent doing that without informing the user about
what was going on, but I'm not really sure if everyone else would
accept it.

So far, I've used git built using that branch to fetch the git repo,
and I also attempted the linux 2.6 kernel repo. Counting the objects
fetched was accurate in both cases (ie. matching counts of objects
fetched and objects to-be-fetched).

I plan to clean up the patches and send it in when the http
refactoring patches are finalized.

> Jeff King <peff@peff.net> wrote:
>>   1. summarize what we have fetched (N packs, N loose objects)
>>   2. show what we are currently fetching (object or pack)
>>   3. show the number of bytes retrieved for the current item
>>   4. if the server provides content-length, show the percentage
>>      completed for this object
>>   5. show the current throughput

Points 1. and 5. can probably be combined, because showing info by
type (packs and objects) isn't very easy to do. http-walker.c first
tries to fetch the raw pack; if it can't be found at the url, or at
alternate locations, it then tries to fetching packs. In other words,
it's hard to know in advance if the object is found in unpacked or
packed form.

-- 
Cheers,
Ray Chuan

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Patch] Prevent cloning over http from spewing
  2009-06-04 12:45         ` Tay Ray Chuan
@ 2009-06-04 16:01           ` Jeff King
  2009-06-07 10:31             ` Tay Ray Chuan
  2009-06-07 11:25           ` Tay Ray Chuan
  1 sibling, 1 reply; 23+ messages in thread
From: Jeff King @ 2009-06-04 16:01 UTC (permalink / raw)
  To: Tay Ray Chuan
  Cc: Shawn O. Pearce, Junio C Hamano, Jakub Narebski, sparse, git

On Thu, Jun 04, 2009 at 08:45:28PM +0800, Tay Ray Chuan wrote:

> I have a branch 'http-progress-indicators' at my repo
> git://github.com/rctay/git.git. It contains a patch on walker.c that
> updates the object total as the fetch goes along. The progress
> indicator says "Processing targets:..." for lack of a better name; I'm
> all for suggestions.

Thanks, I took a look at starting on a progress meter yesterday, but I
do think it makes sense to integrate with the work you are doing.

I tried your http-progress-indicators branch. A few comments:

  1. You still end up with a lot of lines of output. Some of those are
     "Getting pack $x" which we can probably get rid of in non-verbose
     mode. But we still get a different progress indicator line for each
     fetched item, which can add up to quite a lot. I was thinking of
     something like

        Fetching %s (got %d packs, %d loose): (%d/%d)

     with the substitutions:

       %s = "pack", "index", or "loose object"
       %d packs, %d loose = a running count of how much we've gotten
       %d/%d = current and total byte counts for what we are getting now

     and then you could keep everything on a single line. I don't think
     is possible with the current progress code (it doesn't let you
     restart the counter), but it should be easy with some tweaking.

  2. The current progress code can also do throughput display, which
     would be nice (see display_throughput in progress.[ch]).

  3. Your implementation calls your get_http_file_size, which does a
     separate HEAD request to get the content-length. Instead, do
     a "curl_easy_setopt(slot->curl, CURLOPT_PROGRESSFUNCTION, ...)"
     to set up a progress callback. Curl will call it with the total
     number of bytes (from the content-length header of the actual GET
     request) and the number of bytes currently downloaded.

-Peff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Patch] Prevent cloning over http from spewing
  2009-06-03 18:28   ` Junio C Hamano
  2009-06-03 19:10     ` Jeff King
@ 2009-06-05  0:17     ` Jakub Narebski
  1 sibling, 0 replies; 23+ messages in thread
From: Jakub Narebski @ 2009-06-05  0:17 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: sparse, git

On Wed, 3 Jun 2009, Junio C Hamano napisał:
> Jakub Narebski <jnareb@gmail.com> writes:
> 
> > Finally a question: if we turn off verbose output, do we have any kind
> > of progress info for git-clone over http?
[...]

> You _could_ do something like this patch.  Instead of showing "walk %s"
> and "got %s" lines, with occasional "Getting pack %s\n which contains %s",
> it shows and recycles a single line that shows the number of packs, walk
> actions and got actions.
> 
> This is a toy patch; it hiccups for too long while getting each pack, and
> it does not cleanly restore the display after it finishes, but I'll leave
> it to interested readers as an exercise to properly do this using the
> progress API.

I know it is toy patch only, but I wonder if we would want to output
current format output (walk/got chains) for '--verbose' option...

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Patch] Prevent cloning over http from spewing
  2009-06-04 16:01           ` Jeff King
@ 2009-06-07 10:31             ` Tay Ray Chuan
  2009-06-07 11:21               ` Tay Ray Chuan
  2009-06-08 11:54               ` Jeff King
  0 siblings, 2 replies; 23+ messages in thread
From: Tay Ray Chuan @ 2009-06-07 10:31 UTC (permalink / raw)
  To: Jeff King; +Cc: Shawn O. Pearce, Junio C Hamano, Jakub Narebski, sparse, git

Hi,

On Fri, Jun 5, 2009 at 12:01 AM, Jeff King<peff@peff.net> wrote:
> Thanks, I took a look at starting on a progress meter yesterday, but I
> do think it makes sense to integrate with the work you are doing.
>
> I tried your http-progress-indicators branch. A few comments:
>
>  1. You still end up with a lot of lines of output. Some of those are
>     "Getting pack $x" which we can probably get rid of in non-verbose
>     mode. But we still get a different progress indicator line for each
>     fetched item, which can add up to quite a lot. I was thinking of
>     something like
>
>        Fetching %s (got %d packs, %d loose): (%d/%d)
>
>     with the substitutions:
>
>       %s = "pack", "index", or "loose object"
>       %d packs, %d loose = a running count of how much we've gotten
>       %d/%d = current and total byte counts for what we are getting now
>
>     and then you could keep everything on a single line. I don't think
>     is possible with the current progress code (it doesn't let you
>     restart the counter), but it should be easy with some tweaking.

Hmm, just wondering, is this is the intended display for "-q" or "-v"?
Or should I do isatty(), like builtin-pack-objects.c does for the
"Writing objects" progress indicator?

-- 
Cheers,
Ray Chuan

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Patch] Prevent cloning over http from spewing
  2009-06-07 10:31             ` Tay Ray Chuan
@ 2009-06-07 11:21               ` Tay Ray Chuan
  2009-06-08 12:24                 ` Jeff King
  2009-06-08 11:54               ` Jeff King
  1 sibling, 1 reply; 23+ messages in thread
From: Tay Ray Chuan @ 2009-06-07 11:21 UTC (permalink / raw)
  To: Jeff King; +Cc: Shawn O. Pearce, Junio C Hamano, Jakub Narebski, sparse, git

Hi,

On Sun, Jun 7, 2009 at 6:31 PM, Tay Ray Chuan<rctay89@gmail.com> wrote:
> On Fri, Jun 5, 2009 at 12:01 AM, Jeff King<peff@peff.net> wrote:
>>
>>        Fetching %s (got %d packs, %d loose): (%d/%d)
>>
> Hmm, just wondering, is this is the intended display for "-q" or "-v"?
> Or should I do isatty(), like builtin-pack-objects.c does for the
> "Writing objects" progress indicator?

by the way, I have updated http-progress-indicators based on your suggestions.

What I have now is:

 Fetching %d objects (got %d of %d, %d alt)[, and %s]: 100%
(32077585/32077585), 30.59 MiB, done.

where

1. %d objects = number of concurrent objects being fetched, usually
around 4-5. Since objects are fetched alongside other files like packs
and pack indices, I separated this from (4).

2. got %d of %d = a count of loose objects. I haven't done counting of
packs yet, but it shouldn't be very hard.

3. %d alt = number of alternate objects. The way I'm counting them now
is very inaccurate; I may drop this if it's too complicated to do an
accurate count. I added this because some people use forked repos, and
they may wonder why after some time, the number of objects fetched
doesn't increase. (The time was spent on waiting for the server, only
for it to return a 404).

4. [, and %s] = an "optional" field that gets displayed when packs,
pack indices, etc. (everything except objects) are being fetched. The
byte counts are for this fetch, not the object fetch(es).

How about pack file verification? Some pack files are monstrous, and
can take some time to verify. Is it desirable to fit in pack file
verification into the same "Fetching..." line? Verification is a
per-file thing, so it should deserve the same treatment that "Getting
pack ..." lines got.

-- 
Cheers,
Ray Chuan

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Patch] Prevent cloning over http from spewing
  2009-06-04 12:45         ` Tay Ray Chuan
  2009-06-04 16:01           ` Jeff King
@ 2009-06-07 11:25           ` Tay Ray Chuan
  1 sibling, 0 replies; 23+ messages in thread
From: Tay Ray Chuan @ 2009-06-07 11:25 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Jeff King, Junio C Hamano, Jakub Narebski, sparse, git

Hi,

On Thu, Jun 4, 2009 at 8:45 PM, Tay Ray Chuan<rctay89@gmail.com> wrote:
> tries to fetch the raw pack; if it can't be found at the url, or at

Oops, I meant "raw object".

-- 
Cheers,
Ray Chuan

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Patch] Prevent cloning over http from spewing
  2009-06-07 10:31             ` Tay Ray Chuan
  2009-06-07 11:21               ` Tay Ray Chuan
@ 2009-06-08 11:54               ` Jeff King
  1 sibling, 0 replies; 23+ messages in thread
From: Jeff King @ 2009-06-08 11:54 UTC (permalink / raw)
  To: Tay Ray Chuan
  Cc: Shawn O. Pearce, Junio C Hamano, Jakub Narebski, sparse, git

On Sun, Jun 07, 2009 at 06:31:38PM +0800, Tay Ray Chuan wrote:

> >     something like
> >        Fetching %s (got %d packs, %d loose): (%d/%d)
> [...]
> Hmm, just wondering, is this is the intended display for "-q" or "-v"?
> Or should I do isatty(), like builtin-pack-objects.c does for the
> "Writing objects" progress indicator?

I was imagining:

  - without "-q", show progress if isatty(1).

  - with "-q", never show progress

  - with "-v", show the "getting pack" and "walk" output we show now;
    without "-v", don't show it. "-v" has no impact on the progress
    indicator. 

-Peff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Patch] Prevent cloning over http from spewing
  2009-06-07 11:21               ` Tay Ray Chuan
@ 2009-06-08 12:24                 ` Jeff King
  2009-06-10 14:03                   ` Tay Ray Chuan
  0 siblings, 1 reply; 23+ messages in thread
From: Jeff King @ 2009-06-08 12:24 UTC (permalink / raw)
  To: Tay Ray Chuan
  Cc: Shawn O. Pearce, Junio C Hamano, Jakub Narebski, sparse, git

On Sun, Jun 07, 2009 at 07:21:13PM +0800, Tay Ray Chuan wrote:

> by the way, I have updated http-progress-indicators based on your suggestions.

Thanks, I just looked at it (though sadly it does not merge into what is
in 'next' right now).

> What I have now is:
> 
>  Fetching %d objects (got %d of %d, %d alt)[, and %s]: 100%
> (32077585/32077585), 30.59 MiB, done.

I tried to fetch linux-2.6, and it left me pretty confused.

My first complaint is that it is way too long. It wrapped in my
80-column terminal, causing all sorts of visual confusion.

> 1. %d objects = number of concurrent objects being fetched, usually
> around 4-5. Since objects are fetched alongside other files like packs
> and pack indices, I separated this from (4).

This did at times say '4' for me, but just as often it said '0' (even
when stuff was obviously downloading). I hadn't thought about the fact
that we have concurrent downloads. That really makes things harder.
Though it seems like we only do one pack file at a time (so maybe that
is the reason for the '0' -- we are downloading a pack).

In fact, while watching the progress go for the linux-2.6 download, it
is really hard to tell what is going on. The "% completed" number jumps
around between multiple values, even showing what appears to be
nonsense:

  Fetching 0 objects (got 2 of 320, 0 alt) and pack:   8%
  (241096602/302989431), 229.78 MiB | 668 KiB/s
  ...
  Fetching 0 objects (got 2 of 320, 0 alt) and pack:   4%
  (270741882/302989431), 257.93 MiB | 690 KiB/s

Those are two cut-and-pastes from the same fetch. You can see it
progressing in terms of absolute numbers, but the percentage values make
no sense. The "total downloaded" and throughput numbers look roughly
correct. I don't know if this is caused by multiple simultaenous
downloads.

> 2. got %d of %d = a count of loose objects. I haven't done counting of
> packs yet, but it shouldn't be very hard.

Fetching linux-2.6, I spent a very long time on "got 2 of 320" which
really wasn't all that helpful (because almost the whole thing is in
packs). Probably a pack count would be useful there. Though I wonder if
there is some shorter way to summarize what is going on to keep the line
smaller.

But somewhat worse is that we start at '320', spend a lot of time, and
then magically it ends up at 1182387 at the end. So it is not all that
useful as a progress counter, because we don't actually know the total.
So we can show that we are progressing, but the end keeps getting
farther away. :)

> 3. %d alt = number of alternate objects. The way I'm counting them now
> is very inaccurate; I may drop this if it's too complicated to do an
> accurate count. I added this because some people use forked repos, and
> they may wonder why after some time, the number of objects fetched
> doesn't increase. (The time was spent on waiting for the server, only
> for it to return a 404).

In the name of conserving space on the line, perhaps you should just
count this as a "fetched" object and increment the fetched count by one.
The user doesn't have to care which were alt and which were not, as long
as they see a counter progressing towards completion.

> How about pack file verification? Some pack files are monstrous, and
> can take some time to verify. Is it desirable to fit in pack file
> verification into the same "Fetching..." line? Verification is a
> per-file thing, so it should deserve the same treatment that "Getting
> pack ..." lines got.

It does take a while on big packs, so I think it makes sense to show
some eye candy. You can always use the "show only after 2 seconds"
feature of the progress meter if you are concerned about too much
information for quick verifications.

It probably makes sense to actually do a stop_progress() after the
fetching and before the verifying. Otherwise you get the cruft from the
fetch on the line, but it is not covered up by the verify. Like:

  Verifying pack file: 100% (1180771/1180771), done.   0%
  (302915802/302989431), 288.72 MiB | 752 KiB/s

I wonder if you should start a newline every time we get to a new
"phase". So you might see:

 Downloading %d loose objects: Z% (X/Y), x MiB | y KiB/s, done
 Fetching pack 1 of 2: Z% (X/Y), x MiB | y KiB/s, done
 Verifying pack 1 of 2: Z% (X/Y)
 Fetching pack 2 of 2: Z% (X/Y), x MiB | y KiB/s, done
 Verifying pack 2 of 2: Z% (X/Y)

That assumes we download packs one at a time (is that right?). It does take
a couple of lines to show what is going on, but I think most repos are
only going to have a couple of packs (though in theory, you could have
more "loose objects" lines interspersed with your packs).

-Peff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Patch] Prevent cloning over http from spewing
  2009-06-08 12:24                 ` Jeff King
@ 2009-06-10 14:03                   ` Tay Ray Chuan
  2009-06-10 14:07                     ` Tay Ray Chuan
  2009-06-11 11:11                     ` Jeff King
  0 siblings, 2 replies; 23+ messages in thread
From: Tay Ray Chuan @ 2009-06-10 14:03 UTC (permalink / raw)
  To: Jeff King; +Cc: Shawn O. Pearce, Junio C Hamano, Jakub Narebski, sparse, git

Hi,

On Mon, Jun 8, 2009 at 8:24 PM, Jeff King<peff@peff.net> wrote:
> Thanks, I just looked at it (though sadly it does not merge into what is
> in 'next' right now).

thanks for taking the time to do try it out (and with linux-2.6, at that).

> My first complaint is that it is way too long. It wrapped in my
> 80-column terminal, causing all sorts of visual confusion.

The byte counts can really take up alot of space. Perhaps we should
just show the size (MiB) and completed percentage, sans byte counts?

>> 1. %d objects = number of concurrent objects being fetched, usually
>> around 4-5. Since objects are fetched alongside other files like packs
>> and pack indices, I separated this from (4).
>
> This did at times say '4' for me, but just as often it said '0' (even
> when stuff was obviously downloading). I hadn't thought about the fact
> that we have concurrent downloads. That really makes things harder.
> Though it seems like we only do one pack file at a time (so maybe that
> is the reason for the '0' -- we are downloading a pack).

Fetching of objects and packs take place separately; it doesn't mean
that when '0' objects are being fetched, we're definitely fetching
something else (eg. packs). Perhaps we should "hide" the "Fetching 0
objects" part when the number of simultaneous object fetches is 0?

> In fact, while watching the progress go for the linux-2.6 download, it
> is really hard to tell what is going on. The "% completed" number jumps
> around between multiple values, even showing what appears to be
> nonsense:
>
>  Fetching 0 objects (got 2 of 320, 0 alt) and pack:   8%
>  (241096602/302989431), 229.78 MiB | 668 KiB/s
>  ...
>  Fetching 0 objects (got 2 of 320, 0 alt) and pack:   4%
>  (270741882/302989431), 257.93 MiB | 690 KiB/s
>
> Those are two cut-and-pastes from the same fetch. You can see it
> progressing in terms of absolute numbers, but the percentage values make
> no sense. The "total downloaded" and throughput numbers look roughly
> correct. I don't know if this is caused by multiple simultaenous
> downloads.

It's likely that your guess (about simultaneous downloads) is causing
the inconsistent % completed, but then packs aren't downloaded
simultaneously. I'll have to check this again.

> Fetching linux-2.6, I spent a very long time on "got 2 of 320" which
> really wasn't all that helpful (because almost the whole thing is in
> packs). Probably a pack count would be useful there. Though I wonder if
> there is some shorter way to summarize what is going on to keep the line
> smaller.

We are of course assuming that the user is fetching from a well-packed
repo. Again, perhaps we could cut out the "Fetching 0 objects" part.

> But somewhat worse is that we start at '320', spend a lot of time, and
> then magically it ends up at 1182387 at the end. So it is not all that
> useful as a progress counter, because we don't actually know the total.
> So we can show that we are progressing, but the end keeps getting
> farther away. :)

The total number of objects (320) increases as we "walk" the commits;
sometimes we need to fetch the "walked" objects, sometimes we don't
(eg. in packs we've fetched already). There's no way to know in
advance the total; hence, the continually updating of the total. I
don't think there's it's a problem; the idea is to let the user be
sure that git is active.

>> 3. %d alt = number of alternate objects. The way I'm counting them now
>> is very inaccurate; I may drop this if it's too complicated to do an
>> accurate count. I added this because some people use forked repos, and
>> they may wonder why after some time, the number of objects fetched
>> doesn't increase. (The time was spent on waiting for the server, only
>> for it to return a 404).
>
> In the name of conserving space on the line, perhaps you should just
> count this as a "fetched" object and increment the fetched count by one.
> The user doesn't have to care which were alt and which were not, as long
> as they see a counter progressing towards completion.

Ok. I'll just drop this then. (The way I'm doing it right now isn't
very accurate: the "alt" count increases the moment git realises the
object might be found at the alternate location, not the moment the
object at the alternate location is fetched.)

> I wonder if you should start a newline every time we get to a new
> "phase". So you might see:
>
>  Downloading %d loose objects: Z% (X/Y), x MiB | y KiB/s, done
>  Fetching pack 1 of 2: Z% (X/Y), x MiB | y KiB/s, done
>  Verifying pack 1 of 2: Z% (X/Y)
>  Fetching pack 2 of 2: Z% (X/Y), x MiB | y KiB/s, done
>  Verifying pack 2 of 2: Z% (X/Y)
>
> That assumes we download packs one at a time (is that right?). It does take
> a couple of lines to show what is going on, but I think most repos are
> only going to have a couple of packs (though in theory, you could have
> more "loose objects" lines interspersed with your packs).

Yeah, we do download packs one at a time (as I said above).

-- 
Cheers,
Ray Chuan

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Patch] Prevent cloning over http from spewing
  2009-06-10 14:03                   ` Tay Ray Chuan
@ 2009-06-10 14:07                     ` Tay Ray Chuan
  2009-06-11 11:11                     ` Jeff King
  1 sibling, 0 replies; 23+ messages in thread
From: Tay Ray Chuan @ 2009-06-10 14:07 UTC (permalink / raw)
  To: Jeff King; +Cc: Shawn O. Pearce, Junio C Hamano, Jakub Narebski, sparse, git

Hi,

On Wed, Jun 10, 2009 at 10:03 PM, Tay Ray Chuan<rctay89@gmail.com> wrote:
> On Mon, Jun 8, 2009 at 8:24 PM, Jeff King<peff@peff.net> wrote:
>>  Fetching 0 objects (got 2 of 320, 0 alt) and pack:   8%
>>  (241096602/302989431), 229.78 MiB | 668 KiB/s
>>  ...
>>  Fetching 0 objects (got 2 of 320, 0 alt) and pack:   4%
>>  (270741882/302989431), 257.93 MiB | 690 KiB/s
>>
>> Those are two cut-and-pastes from the same fetch. You can see it
>> progressing in terms of absolute numbers, but the percentage values make
>> no sense. The "total downloaded" and throughput numbers look roughly
>> correct. I don't know if this is caused by multiple simultaenous
>> downloads.
>
> It's likely that your guess (about simultaneous downloads) is causing
> the inconsistent % completed, but then packs aren't downloaded
> simultaneously. I'll have to check this again.

the linux-2.6 repo has only 1 pack (and no http-alternates), so this is weird.

-- 
Cheers,
Ray Chuan

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Patch] Prevent cloning over http from spewing
  2009-06-10 14:03                   ` Tay Ray Chuan
  2009-06-10 14:07                     ` Tay Ray Chuan
@ 2009-06-11 11:11                     ` Jeff King
  2009-06-22 12:10                       ` Tay Ray Chuan
  1 sibling, 1 reply; 23+ messages in thread
From: Jeff King @ 2009-06-11 11:11 UTC (permalink / raw)
  To: Tay Ray Chuan
  Cc: Shawn O. Pearce, Junio C Hamano, Jakub Narebski, sparse, git

On Wed, Jun 10, 2009 at 10:03:10PM +0800, Tay Ray Chuan wrote:

> > My first complaint is that it is way too long. It wrapped in my
> > 80-column terminal, causing all sorts of visual confusion.
> 
> The byte counts can really take up alot of space. Perhaps we should
> just show the size (MiB) and completed percentage, sans byte counts?

I think that makes sense. Especially because the (X/Y) in git's progress
output usually refers to the number of _objects_, and there is nothing
in the output to indicate that it is actually a byte count here.

I think it will need some tweaking of the progress code to show the
percentage but not the actual byte counts, but it should be a relatively
simple change.

> Fetching of objects and packs take place separately; it doesn't mean
> that when '0' objects are being fetched, we're definitely fetching
> something else (eg. packs). Perhaps we should "hide" the "Fetching 0
> objects" part when the number of simultaneous object fetches is 0?

Yes, though I really wonder if the "fetching" number is all that useful
even when it is not zero. The _most_ important thing is to show the user
that something is happening, and we are waiting on the network. And I
think we largely show that through the "total bytes sent" and throughput
counters.

Second to that is trying to give a sense of when the task may finish.
But as we've discussed, we don't have a sense of the total number of
objects until we actually fetch them. Showing progress within packs, and
the total number of packs is somewhat useful there (though it can be
misleading -- most of the time will probably be spent on one or two of
the packs).

> The total number of objects (320) increases as we "walk" the commits;
> sometimes we need to fetch the "walked" objects, sometimes we don't
> (eg. in packs we've fetched already). There's no way to know in
> advance the total; hence, the continually updating of the total. I
> don't think there's it's a problem; the idea is to let the user be
> sure that git is active.

Right. But I think we are better off showing simple increasing numbers
(like bytes or objects transferred) than misleading or inaccurate
guesses of totals. The latter creates more frustration, I think.

> > I wonder if you should start a newline every time we get to a new
> > "phase". So you might see:
> >
> >  Downloading %d loose objects: Z% (X/Y), x MiB | y KiB/s, done
> >  Fetching pack 1 of 2: Z% (X/Y), x MiB | y KiB/s, done
> >  Verifying pack 1 of 2: Z% (X/Y)
> >  Fetching pack 2 of 2: Z% (X/Y), x MiB | y KiB/s, done
> >  Verifying pack 2 of 2: Z% (X/Y)
> >
> > That assumes we download packs one at a time (is that right?). It does take
> > a couple of lines to show what is going on, but I think most repos are
> > only going to have a couple of packs (though in theory, you could have
> > more "loose objects" lines interspersed with your packs).
> 
> Yeah, we do download packs one at a time (as I said above).

But from what you wrote elsewhere in the message, it sounds like we may
be downloading a pack _and_ a loose object at the same time. So my
suggestion doesn't quite work in that case.

> the linux-2.6 repo has only 1 pack (and no http-alternates), so this
> is weird.

Maybe we are fetching from different places:

  $ wget 2>/dev/null -O - \
  http://www.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git/objects/info/packs
  P pack-ab6a95cfd1919f6e820a8b2670403292838cfc17.pack
  P pack-ff5d2e76c6b3d4c0a5a11efd36af43934c3744df.pack
  P pack-588543ec42616a86eef47bb53dd04cd8a864d9b5.pack
  P pack-93b35ef7e596e6839c020e36edfaf8206b0f78c4.pack
  P pack-c89a5bf3b095d812bf1068cd1c84f8a07c3403c5.pack
  P pack-2cc5038b758e40a40a4590b37a8019d1ba5a65a9.pack
  P pack-9f36ce46120d8d9ee32e6394bb5857d7e548826b.pack
  P pack-8bebad1b754473489d516549632e904c0b3178a2.pack
  P pack-6554949d36a94e012da6ca6134ef62ce347b2efa.pack

-Peff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Patch] Prevent cloning over http from spewing
  2009-06-11 11:11                     ` Jeff King
@ 2009-06-22 12:10                       ` Tay Ray Chuan
  2009-07-20 15:24                         ` Tay Ray Chuan
  0 siblings, 1 reply; 23+ messages in thread
From: Tay Ray Chuan @ 2009-06-22 12:10 UTC (permalink / raw)
  To: Jeff King; +Cc: Shawn O. Pearce, Junio C Hamano, Jakub Narebski, sparse, git

Hi,

note: this time, I haven't anything to show for in my repo; sorry.

On Thu, Jun 11, 2009 at 7:11 PM, Jeff King<peff@peff.net> wrote:
> On Wed, Jun 10, 2009 at 10:03:10PM +0800, Tay Ray Chuan wrote:
>
>> > My first complaint is that it is way too long. It wrapped in my
>> > 80-column terminal, causing all sorts of visual confusion.
>>
>> The byte counts can really take up alot of space. Perhaps we should
>> just show the size (MiB) and completed percentage, sans byte counts?
>
> I think that makes sense. Especially because the (X/Y) in git's progress
> output usually refers to the number of _objects_, and there is nothing
> in the output to indicate that it is actually a byte count here.
>
> I think it will need some tweaking of the progress code to show the
> percentage but not the actual byte counts, but it should be a relatively
> simple change.

Ok.

>> Fetching of objects and packs take place separately; it doesn't mean
>> that when '0' objects are being fetched, we're definitely fetching
>> something else (eg. packs). Perhaps we should "hide" the "Fetching 0
>> objects" part when the number of simultaneous object fetches is 0?
>
> Yes, though I really wonder if the "fetching" number is all that useful
> even when it is not zero. The _most_ important thing is to show the user
> that something is happening, and we are waiting on the network. And I
> think we largely show that through the "total bytes sent" and throughput
> counters.

Ok also.

My understanding of this point now: if we're fetching non-objects
*(like packs, pack indices), we wouldn't display "Fetching X objects"
at all, regardless of X. But if we're not fetching any non-objects and
we're waiting for loose objects fetching to finish (ie. we're fetching
1 loose object or more), then we would display "Fetching X objects".

>> The total number of objects (320) increases as we "walk" the commits;
>> sometimes we need to fetch the "walked" objects, sometimes we don't
>> (eg. in packs we've fetched already). There's no way to know in
>> advance the total; hence, the continually updating of the total. I
>> don't think there's it's a problem; the idea is to let the user be
>> sure that git is active.
>
> Right. But I think we are better off showing simple increasing numbers
> (like bytes or objects transferred) than misleading or inaccurate
> guesses of totals. The latter creates more frustration, I think.

Ok.

>> Yeah, we do download packs one at a time (as I said above).
>
> But from what you wrote elsewhere in the message, it sounds like we may
> be downloading a pack _and_ a loose object at the same time. So my
> suggestion doesn't quite work in that case.

In addition to that, it isn't possible to calculate the number of pack
indices to be fetched, because we're fetching them as we go along (see
http.c::http_get_info_packs() in 'master'), so that's a potential
problem. I wonder if you (or anyone reading this) has any suggestions
on this?

I still think we can use your idea of one-fetch-one-verify per pack,
even though objects are fetched simultaneously.

Here's what a "git clone http://repo" would look like, after
incorporating points from our discussion so far:

 Fetching info/refs: Z%, x MiB | y KiB/s, done
 Fetching objects/info/packs: Z%, x MiB | y KiB/s, done
 Fetching pack index (1 of 3): Z%, x MiB | y KiB/s, done
 Fetching pack index (2 of 3): Z%, x MiB | y KiB/s, done
 Fetching pack index (3 of 3): Z%, x MiB | y KiB/s, done
 Fetching pack (1 of 3): Z%, x MiB | y KiB/s, done
 Verifying pack (1 of 3): Z%, (X/Y), done
 Fetching pack (2 of 3): Z%, x MiB | y KiB/s, done
 Verifying pack (2 of 3): Z%, (X/Y), done
 Fetching pack (3 of 3): Z%, x MiB | y KiB/s, done
 Verifying pack (3 of 3): Z%, (X/Y), done
 Fetching 2 objects, done.
 Checking out files, done.

(I made up the last line.)

>> the linux-2.6 repo has only 1 pack (and no http-alternates), so this
>> is weird.
>
> Maybe we are fetching from different places:

Thanks. I realise the weird percentage values you saw were due to
out-of-range numeric calculations, so actually, your point didn't
count, but thanks anyway.

-- 
Cheers,
Ray Chuan

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Patch] Prevent cloning over http from spewing
  2009-06-22 12:10                       ` Tay Ray Chuan
@ 2009-07-20 15:24                         ` Tay Ray Chuan
  0 siblings, 0 replies; 23+ messages in thread
From: Tay Ray Chuan @ 2009-07-20 15:24 UTC (permalink / raw)
  To: Jeff King; +Cc: Shawn O. Pearce, Junio C Hamano, Jakub Narebski, sparse, git

Hi,

On Mon, Jun 22, 2009 at 8:10 PM, Tay Ray Chuan<rctay89@gmail.com> wrote:
> note: this time, I haven't anything to show for in my repo; sorry.

I've got some stuff, as per the last email, at my git repo (branch
'http-progress-indicators' at git://github.com/rctay/git.git).

-- 
Cheers,
Ray Chuan

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2009-07-20 15:24 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-06-02 17:42 [Patch] Prevent cloning over http from spewing sparse
2009-06-03 10:21 ` Erik Faye-Lund
2009-06-03 10:39 ` Jakub Narebski
2009-06-03 18:28   ` Junio C Hamano
2009-06-03 19:10     ` Jeff King
2009-06-03 19:15       ` Shawn O. Pearce
2009-06-03 19:24         ` Jeff King
2009-06-03 19:32           ` Shawn O. Pearce
2009-06-03 19:44             ` Jeff King
2009-06-03 19:52               ` Shawn O. Pearce
2009-06-04 12:45         ` Tay Ray Chuan
2009-06-04 16:01           ` Jeff King
2009-06-07 10:31             ` Tay Ray Chuan
2009-06-07 11:21               ` Tay Ray Chuan
2009-06-08 12:24                 ` Jeff King
2009-06-10 14:03                   ` Tay Ray Chuan
2009-06-10 14:07                     ` Tay Ray Chuan
2009-06-11 11:11                     ` Jeff King
2009-06-22 12:10                       ` Tay Ray Chuan
2009-07-20 15:24                         ` Tay Ray Chuan
2009-06-08 11:54               ` Jeff King
2009-06-07 11:25           ` Tay Ray Chuan
2009-06-05  0:17     ` Jakub Narebski

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).