git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Pratyush Yadav <me@yadavpratyush.com>
To: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Cc: Mark Levedahl <mlevedahl@gmail.com>,
	"brian m. carlson" <sandals@crustytoothpaste.net>,
	git <git@vger.kernel.org>,
	Christian Couder <christian.couder@gmail.com>,
	Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH] git-gui: Perform rescan on window focus-in
Date: Mon, 5 Aug 2019 01:47:21 +0530	[thread overview]
Message-ID: <ea3d442d-2bda-8c2a-0f75-c9b6e32c6816@yadavpratyush.com> (raw)
In-Reply-To: <nycvar.QRO.7.76.6.1908042058340.46@tvgsbejvaqbjf.bet>

Hi Johannes,

I sense that I'm starting to test your patience. So first off, thanks 
for the advice you've given so far. It has been really valuable. Feel 
free to let me know if you don't want me to prod you further about this, 
and I'll stop :).

Also let me know if you don't want me to cc you in other git-gui patches 
I send.


On 8/5/19 12:40 AM, Johannes Schindelin wrote:
> Hi,
> 
> On Sun, 4 Aug 2019, Pratyush Yadav wrote:
> 
>> On 8/4/19 2:04 AM, Johannes Schindelin wrote:
>>>
>>> On Sat, 3 Aug 2019, Pratyush Yadav wrote:
>>>
[...]
>>>
>>> You misunderstood what I was saying: a single setup is bad, and you
>>> can make it moderately better by testing _at least_ with a
>>> moderately-sized repository [*1*] in addition to git.git.
>>>
>>> So yes, it would still not be enough to test with, say, the git.git
>>> _and_ the Chromium repository _only_ on your setup, but if not even
>>> you can be bothered to test with more than one small repository, how
>>> can you possibly expect anybody else to add more testing?
>>
>> All right, I'll see what repos I can test.
>>
>> But my internet is pretty slow and unstable, so my clone of the
>> Chromium repo failed mid-way multiple times. I assume we need to test
>> on a large index, so is it all right if I use
>> t/perf/repos/many-files.sh to artificially generate a large repo?
> 
> Why do you ask me for permission to just try this? I feel very
> uncomfortable being put in such a position: I am not your manager or
> gate-keeper or anything.

I'm not a native speaker, so maybe I worded it incorrectly. What I meant 
by the question was not to ask permission, but to verify that this would 
be a good test for checking performance with a large repo.

So I'll rephrase the question:
Will using t/perf/repos/many-files.sh, instead of something like the 
Chromium repo, be a good enough test of performance of a rescan?

> 
>>>> [...]
>>>>> I wonder, however, whether you can think of a better method to
>>>>> figure out when to auto-refresh. Focus seems to be a low-hanging
>>>>> fruit, but as you noticed it is not very accurate. Maybe if you
>>>>> combine it with a timeout? Or maybe you can detect idle time in
>>>>> Tcl/Tk?
>>>>
>>>> Hm, I don't see a better alternative than file system watches.
>>>> Timeouts are a heuristic that can potentially be problematic.
>>>
>>> Let me stress the fact that I suggested a timeout _in addition_ to the
>>> focus event?
>>
>> Oh, my bad. I thought you suggested using timeouts exclusively.
>>
>> But I'm not sure I understand what you mean by "using timeouts in addition to
>> the focus event". My guess is that you mean we should activate a
>> refresh-on-focus-in only after git-gui has been out of focus for a certain
>> amount of time. Is my guess correct?
> 
> I am _not_ telling you what strategy you should use. You really need to
> come up with hypotheses about what tell-tales for committable outside
> changes could be easy to detect. This is your patch, and your project.

IDEs and some text editors like Atom use file system watches, and that's 
why I think they are a viable approach. But at the same time, these are 
also known to be kind of heavy-weight and slow, and I like git-gui for 
being fast and lightweight, so that might not be a good idea.

> 
> My suggestion about a time-out was to think a bit further than just mere
> Tk-provided events to detect whether the user might have changed
> anything outside of Git GUI that might make an automatic refresh
> convenient for the user.
> 
> I do _not_ want to engage in this project, it is not my pet peeve.
Understandable. We are all volunteers after all. Thanks for your help so 
far :)

> 
>>> Yes, using a timeout on its own is stupidly problematic. That's why I
>>> did not suggest that.
>>>
>>>> If you do a refresh too frequently, you hog up the user's resources
>>>> for little benefit.
>>>
>>> Indeed. You want to find a heuristic that catches most of the cases
>>> where files were changed, while at the same time not even _trying_ to
>>> refresh automatically when probably nothing changed.
>>
>> Like I said before, the best way of doing that that I can see is file system
>> watches.
> 
> That's not a heuristic.
> 
> A file system monitor is doing a lot of work in this case, for dubitable
> benefit.
> 
> Take for example git.git: Let's say that I run a specific test. It
> creates a directory under `t/`: the filesystem monitor triggers. It
> creates a repository in that directory: the filesystem monitor triggers
> _multiple times_. The test then performs all kinds of I/O, eventually
> removing the directory when all tests passed.

The monitor triggers multiple times, but we don't have to do a rescan 
every time. The callback for the trigger would just set a boolean that 
will decide whether a rescan has to be done the next time it comes back 
into focus.

> 
> Note that none of these filesystem changes correspond to anything that
> would update _anything in Git GUI during a refresh.
> 
> Of course, this is something I did not mention before because I took it
> for granted that you would always try to weigh the benefits of your
> approach to the worst possible unintended consequences.

Yes, the specific use case you mentioned would trigger an unnecessary 
refresh, but I don't think many people do these kinds of things very often.

Building the tree will generate a lot of IO, but we will only watch 
files tracked by git, so all that IO would not trigger a refresh at all.

I'm not saying file system watches are the best alternative here, I'm 
just saying it is a pretty good alternative, and the scenario you 
describe doesn't happen often enough to bother most people. One problem 
with them is, of course, added dependencies. And it's a big problem. I'd 
love to have a tcl-only alternative, but there is none at the moment.

I'll experiment with some file system watches, and see how they fare.

> 
>> But maybe we can get reasonable performance with a combination of
>> timeouts and focus events.
> 
> Please note that I would not be surprised if this heuristic _also_
> resulted in a lot of bad, unintended consequences. That's for you to
> find out.

Yes, of course. I was just thinking out loud.

> 
>>> Footnote *1*: I don't expect you to test with the largest repositories,
>>> as you are unlikely to have access to anything approaching the size of
>>> the largest Git repository on the planet:
>>> https://devblogs.microsoft.com/bharry/the-largest-git-repo-on-the-planet/
>>>
>>
>> Ah yes, I read about it a while back on Reddit. Having a huge monolithic repo
>> sounds backwards to me. Using submodules sounds like a better idea, but who am
>> I to judge. They probably have their reasons that I'm not aware of.
> 
> This statement just sounds to me as if you never have used submodules in
> any serious way. My experience is that software developers who tried to
> use submodules offer opinions that read very differently from that
> paragraph.

My last workplace had Linux, U-Boot, ARM Trusted Firmware, and a handful 
more projects under one repo with submodules. I don't remember having 
any problems, and hence why I thought it was a pretty decent idea.

> 
> Strong opinions usually do not survive contact with open-minded exposure
> to reality.

"but who am I to judge. They probably have their reasons that I'm not 
aware of."

I just put my first thoughts out about this, admitting I probably don't 
have the full picture. I see that as being open-minded.

-- 
Regards,
Pratyush Yadav

  reply	other threads:[~2019-08-04 20:17 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-07-28 15:17 [PATCH] git-gui: Perform rescan on window focus-in Pratyush Yadav
2019-07-28 21:36 ` brian m. carlson
2019-07-28 22:10   ` Pratyush Yadav
2019-07-28 22:32     ` Pratyush Yadav
2019-07-28 22:49     ` brian m. carlson
2019-07-29  2:24       ` Mark Levedahl
2019-07-29  2:26       ` Mark Levedahl
2019-07-29  2:28       ` Mark Levedahl
2019-07-29  8:15         ` Pratyush Yadav
2019-07-31 19:42           ` Johannes Schindelin
2019-08-01 21:52             ` Pratyush Yadav
2019-08-02 12:39               ` Johannes Schindelin
2019-08-02 20:00                 ` Pratyush Yadav
2019-08-03 20:34                   ` Johannes Schindelin
2019-08-04 12:53                     ` Pratyush Yadav
2019-08-04 19:10                       ` Johannes Schindelin
2019-08-04 20:17                         ` Pratyush Yadav [this message]
2019-08-02 16:47               ` Junio C Hamano
2019-08-02 20:13                 ` Pratyush Yadav
2019-08-04 18:56                   ` Johannes Schindelin
2019-07-29  5:09       ` Junio C Hamano
2019-07-29  8:44         ` Pratyush Yadav
2019-07-28 21:44 ` Pratyush Yadav

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ea3d442d-2bda-8c2a-0f75-c9b6e32c6816@yadavpratyush.com \
    --to=me@yadavpratyush.com \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=christian.couder@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=mlevedahl@gmail.com \
    --cc=sandals@crustytoothpaste.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).