git@vger.kernel.org mailing list mirror (one of many)
From: Jeff King <peff@peff.net>
To: "René Scharfe" <l.s.r@web.de>
Cc: "Git List" <git@vger.kernel.org>,
	"Martin Ågren" <martin.agren@gmail.com>,
	"Christian Couder" <christian.couder@gmail.com>,
	"Junio C Hamano" <gitster@pobox.com>
Subject: Re: [PATCH] revision: introduce prepare_revision_walk_extended()
Date: Tue, 19 Dec 2017 06:49:06 -0500
Message-ID: <20171219114906.GB24558@sigill.intra.peff.net>
In-Reply-To: <39581cd0-0bfd-c8d1-642b-1245cf425ab4@web.de>

On Mon, Dec 18, 2017 at 08:18:19PM +0100, René Scharfe wrote:

> > The root of the matter is that the revision-walking code doesn't clean
> > up after itself. In every case, the caller is just saving these to clean
> > up commit marks, isn't it?
> 
> bundle also checks if the pending objects exist.

Thanks, I missed that one. So just adding a feature to clean up commit
marks wouldn't be sufficient to cover that case.

> > That sidesteps all of the memory ownership issues by just creating a
> > copy. That's less efficient, but I'd be surprised if it matters in
> > practice (we tend to do one or two revisions per process, there don't
> > tend to be a lot of pending tips, and we're really just talking about
> > copying some pointers here).
> [...]
> I don't know if there can be real-world use cases with millions of
> entries (when it would start to hurt).

I've seen repos which have tens of thousands of tags. Something like
"rev-list --all" would have tens of thousands of pending objects.
I think the upper bound is the number of objects (though in practice
it's more like the number of commits).

I'd note also that for most uses we don't need a full object_array. You
really just need a pointer to the "struct object" to wipe its flags.

So there we might waste 8 bytes per object in the worst case. But bear
in mind that the process is wasting a lot more than that per "struct
commit" that we're holding. And versus the existing scheme, it's only
for the moment until prepare_revision_walk() frees the old pending list.
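
As a minimal sketch of the pointer-only idea above: the types below are
simplified, hypothetical stand-ins rather than git's real definitions,
and save_pending_tips() and clear_tip_flags() are made-up names. It
copies just the object pointers out of a pending list before the list
is freed, then uses the copy to wipe flags. With 8-byte pointers, tens
of thousands of tips come to a few hundred kilobytes at most.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins; not git's actual structs. */
struct object {
	unsigned flags;
};

struct object_array_entry {
	struct object *item;
	char *name;	/* a full entry carries more than the pointer */
};

struct object_array {
	size_t nr;
	struct object_array_entry *objects;
};

/*
 * Copy only the "struct object" pointers out of the pending list.
 * Returns NULL (and sets *nr_out to 0) if the list is empty or the
 * allocation fails. The caller frees the result.
 */
static struct object **save_pending_tips(const struct object_array *pending,
					 size_t *nr_out)
{
	struct object **tips;
	size_t i;

	*nr_out = 0;
	if (!pending->nr)
		return NULL;
	tips = malloc(pending->nr * sizeof(*tips));
	if (!tips)
		return NULL;
	for (i = 0; i < pending->nr; i++)
		tips[i] = pending->objects[i].item;
	*nr_out = pending->nr;
	return tips;
}

/*
 * Clear the given flag bits on each saved tip. This only touches the
 * tips themselves; it does not walk to their parents.
 */
static void clear_tip_flags(struct object **tips, size_t nr, unsigned mask)
{
	size_t i;

	for (i = 0; i < nr; i++)
		tips[i]->flags &= ~mask;
}
```

A caller would save the tips before the walk, run the traversal, call
clear_tip_flags() with the marks it wants gone, and then free() the
saved array. Git's own clear_commit_marks() also follows parents, which
this sketch leaves out.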

> Why does prepare_revision_walk() clear the list of pending objects at
> all?  Assuming the list is append-only then perhaps remembering the
> last handled index would suffice.

I assume it was mostly to clean up after itself, since there's no
explicit "I'm done with the traversal" function. But as I said earlier,
I'd be surprised if a revision walk doesn't leave some allocated cruft
in rev_info these days (e.g., pathspec cruft). In practice it doesn't
matter much because we don't do arbitrary numbers of traversals in a
single process.
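
For illustration, the "remember the last handled index" alternative
quoted above might look roughly like this. It assumes the list really
is append-only. The struct pending_list type, its "handled" field and
prepare_new_entries() are all hypothetical names, not anything in
revision.c.

```c
#include <stddef.h>

/* Hypothetical stand-in; not git's actual struct. */
struct object {
	unsigned flags;
};

/* An append-only list that remembers how far it has been processed. */
struct pending_list {
	struct object **items;
	size_t nr;	/* entries appended so far */
	size_t handled;	/* entries already consumed by a prepare step */
};

/*
 * Process only the entries in [handled, nr), marking each with
 * seen_flag, then advance the index. The list itself is left intact,
 * so the caller can still reach every tip afterwards. Returns the
 * number of newly handled entries.
 */
static size_t prepare_new_entries(struct pending_list *list,
				  unsigned seen_flag)
{
	size_t i, count = 0;

	for (i = list->handled; i < list->nr; i++) {
		list->items[i]->flags |= seen_flag;
		count++;
	}
	list->handled = list->nr;
	return count;
}
```

Calling it twice in a row handles nothing the second time, and entries
appended in between are picked up by the next call. The trade-off is
that the list is never freed by the prepare step, so something else
has to release it eventually.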

-Peff


Thread overview: 32+ messages
2017-12-16 12:12 [PATCH] revision: introduce prepare_revision_walk_extended() René Scharfe
2017-12-17 10:20 ` Martin Ågren
2017-12-18 15:10 ` Jeff King
2017-12-18 19:18   ` René Scharfe
2017-12-19 11:49     ` Jeff King [this message]
2017-12-19 18:33       ` Junio C Hamano
2017-12-20 13:08         ` Jeff King
2017-12-21 18:41           ` René Scharfe
2017-12-24 14:22             ` Jeff King
2017-12-25 17:36               ` René Scharfe
2017-12-25 17:41 ` [PATCH v2 0/9] revision: get rid of the flag leak_pending René Scharfe
2017-12-25 17:43   ` [PATCH v2 1/9] commit: avoid allocation in clear_commit_marks_many() René Scharfe
2018-01-10  7:54     ` Jeff King
2017-12-25 17:44   ` [PATCH v2 2/9] commit: use clear_commit_marks_many() in remove_redundant() René Scharfe
2017-12-25 17:44   ` [PATCH v2 3/9] ref-filter: use clear_commit_marks_many() in do_merge_filter() René Scharfe
2017-12-25 17:44   ` [PATCH v2 4/9] object: add clear_commit_marks_all() René Scharfe
2018-01-10  7:58     ` Jeff King
2018-01-11 18:57       ` René Scharfe
2018-01-12 15:20         ` Jeff King
2017-12-25 17:45   ` [PATCH v2 5/9] bisect: avoid using the rev_info flag leak_pending René Scharfe
2018-01-10  8:07     ` Jeff King
2018-01-11 18:57       ` René Scharfe
2018-01-12 15:23         ` Jeff King
2017-12-25 17:46   ` [PATCH v2 6/9] bundle: " René Scharfe
2017-12-28 21:13     ` Junio C Hamano
2018-01-10  8:18     ` Jeff King
2017-12-25 17:47   ` [PATCH v2 7/9] checkout: " René Scharfe
2017-12-28 21:24     ` Junio C Hamano
2017-12-25 17:47   ` [PATCH v2 8/9] revision: remove the unused " René Scharfe
2017-12-25 17:48   ` [PATCH v2 9/9] commit: remove unused function clear_commit_marks_for_object_array() René Scharfe
2017-12-28 20:32   ` [PATCH v2 0/9] revision: get rid of the flag leak_pending Junio C Hamano
2018-01-10  8:20   ` Jeff King
