From: Jeff Hostetler <git@jeffhostetler.com>
To: Stefan Beller <sbeller@google.com>, Junio C Hamano <gitster@pobox.com>
Cc: "git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: [PATCH 0/5] Start of a journey: drop NO_THE_INDEX_COMPATIBILITY_MACROS
Date: Tue, 2 May 2017 10:05:00 -0400
Message-ID: <b2d1d2fe-1b9b-4afa-192f-267bbb5df487@jeffhostetler.com>
In-Reply-To: <CAGZ79kZkssTEdNyzYh1YYv89szvig=rn2j3DJcHxsbzdADRw-w@mail.gmail.com>



On 5/2/2017 12:17 AM, Stefan Beller wrote:
> On Mon, May 1, 2017 at 6:36 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> Stefan Beller <sbeller@google.com> writes:
>>
>>> This applies to origin/master.
>>>
>>> For better readability and understandability for newcomers it is a good idea
>>> to not offer 2 APIs doing the same thing with one being the #define of the other.
>>>
>>> In the long run we may want to drop the macros guarded by
>>> NO_THE_INDEX_COMPATIBILITY_MACROS. This converts a couple of them.

Thank you for bringing this up and making this proposal.
I started a similar effort internally last fall, but
stopped because of the footprint size.

>>
>> Why?  Why should we keep typing &the_index, when most of the time we
>> are given _the_ index and working on it?
>
> As someone knowledgeable with the code base you know that the cache_*
> and index_* functions only differ by an index argument. A newcomer may not
> know this, so they wonder why we have (A) so many functions [and which is the
> right function to use]; it is an issue of ease of use of the code base.
>
> Anything you do in submodule land today needs to spawn new processes in
> the submodule. This is cumbersome and not performant. So in the far future
> we may want to have an abstraction of a repo (B), i.e. all repository state in
> one struct/class. That way we can open a submodule in-process and perform
> the required actions without spawning a process.
>
> The road to (B) is a long road, but we have to start somewhere. And this seemed
> like a good place, by introducing a dedicated argument for the repository. In a
> follow-up in the future we may want to replace &the_index by "the_main_repo.its_index"
> and then could also run the commands on other (submodule) indexes. But more
> importantly, all these commands would operate on a repository object.
>
> In such a far future we would have functions like the cmd_* functions
> that would take a repository object instead of doing its setup discovery
> on their own.
>
> Another reason may be the current velocity (or absence of it) w.r.t. these
> functions, such that fewer merge conflicts may arise.

In addition to (eventually) allowing multiple repos to be open at
the same time for submodules, it would also help with various
multi-threading efforts.  For example, we have loops that do a
"for (k = 0; k < active_nr; k++) {...}".  There is no visual clue
in that code that it references "the_index" and therefore should
be subject to the same locking.  Granted, this is a trivial example,
but it supports the argument that the code has lots of subtle global
variables and macros that make it difficult to reason about the
code.

This step would help un-hide this.
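
To make the point concrete, here is a rough sketch.  The structs
are cut-down stand-ins rather than the real definitions in cache.h,
and the two count_entries_* helpers are made up for illustration.
The first loop gives no hint that it touches a global; the second
carries the dependency in its signature:

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-ins; the real structs in cache.h have many more fields. */
struct cache_entry {
	const char *name;
};

struct index_state {
	struct cache_entry **cache;
	unsigned int cache_nr;
};

struct index_state the_index;

/* Compatibility macros in the style of the ones being discussed. */
#define active_cache (the_index.cache)
#define active_nr (the_index.cache_nr)

/* Hypothetical helper: nothing in this body mentions the_index. */
unsigned int count_entries_implicit(void)
{
	unsigned int k, n = 0;

	for (k = 0; k < active_nr; k++)
		if (active_cache[k])
			n++;
	return n;
}

/* Hypothetical helper: the index it reads is visible to the caller. */
unsigned int count_entries_explicit(const struct index_state *istate)
{
	unsigned int k, n = 0;

	assert(istate != NULL);
	for (k = 0; k < istate->cache_nr; k++)
		if (istate->cache[k])
			n++;
	return n;
}
```

A reviewer looking only at a call to count_entries_explicit(&the_index)
can see which state is involved; the implicit version requires knowing
what the macros expand to.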

Much further in the future, we could also consider building an
improved API around the in-memory index data.  For example,
currently we have a simple array of cache_entry pointers and
the entire code base uses "for" loops like the above to iterate.
If we could hide that fact, then we could consider alternative
representations for various reasons:
() bulk alloc the cache_entries from a pool, rather than individually.
() cluster cache_entries linearly by parent directory, rather
    than linearly over the whole tree.
() efficient alternative iterator methods on the index, such as
    non-recursive breadth-first

Things like this would be difficult with the current set of
globals and macros.

Thanks,
Jeff


>
> ---
> This discussion is similar to the "free memory at the end of cmd_*" discussion,
> as it aims to make code reusable, accepting a minor drawback for it.
> Typing "the_index" reinforces the object thinking model and may get people
> to start thinking about whether they would like to declare yet another
> global variable.
>
> Thanks,
> Stefan
>
