git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Jeff King <peff@peff.net>
Cc: Luke Lu <git@vicaya.com>,
	Christer Weinigel <christer@weinigel.se>,
	Tom Tobin <korpios@korpios.com>,
	git@vger.kernel.org
Subject: Re: On Tabs and Spaces
Date: Wed, 17 Oct 2007 20:13:23 -0700 (PDT)	[thread overview]
Message-ID: <alpine.LFD.0.999.0710171955580.26902@woody.linux-foundation.org> (raw)
In-Reply-To: <20071018024553.GA5186@coredump.intra.peff.net>



On Wed, 17 Oct 2007, Jeff King wrote:
> 
> You have made this claim several times, and I really don't understand
> it. If I have 8 spaces, then a diff line will have either " ", "+", or
> "-" followed by 8 spaces. If I use a hard tab, then the tab will end up
> only taking up 7 spaces because of the nature of tabs.
> 
> This might matter if I'm comparing non-diff code to diff code. But in a
> diff, _everything_ is indented by exactly one space, so it all lines up.
> Is there something I'm missing?

Yes. 

You're missing the fact that some people have problems with editors.

So they add a line, and they add *that* line with the wrong kind of 
indentation. And it shows up among the other lines like this (here the 
whole patch is indented):

	diff --git a/kernel/sched.c b/kernel/sched.c
	index 92721d1..1ecb164 100644
	--- a/kernel/sched.c
	+++ b/kernel/sched.c
	@@ -127,6 +127,7 @@ static inline u32 sg_div_cpu_power(const struct sched_group *sg, u32 load)
	 static inline void sg_inc_cpu_power(struct sched_group *sg, u32 val)
	 {
	 	sg->__cpu_power += val;
	+        wrong indentation here.
	 	sg->reciprocal_cpu_power = reciprocal_value(sg->__cpu_power);
	 }
	 #endif

and so you see the fact that somebody messed up in the patch itself.

It actually more often goes the other way: somebody may have messed up 
earlier, but did so *consistently* so it wasn't obvious when looking at 
the patch. And then somebody fixes one line, and now that one fixed line 
is indented correctly but differently.

When it gets *too* bad, we just reindent the whole file, but more 
commonly when I notice it in a diff, I just edit that particular region 
or even just the diff itself in-place.

Generally, it seldom comes to even that. Doing a

	git grep '        ' -- '*.c'

(that's now eight spaces) returns quite a lot of lines, and it's generally 
not worth worrying about (not all of them are indentation - people do use 
spaces for lining things up etc - but a lot of it really is just indents 
done against the coding style).

> I was about to tell you that you're full of it, but there really is a
> slowdown:
>  [ ... ]
> It's actually about 16%.

I didn't even time it, and I called it at 20% without even counting any 
tabs. Why? Because it's inevitable!

It so happens that "grep" has a lot of really clever heuristics, so that 
it is actually better at passing over characters that it knows cannot 
start the pattern you are searching for, so timing "grep" is actually 
quite complex in the general case. So I bet that if you had grepped for 
something that started with a space, you'd probably have found a bigger 
slowdown. 

But ignore all that complexity, and it really boils down to a really 
simple principle: bigger data sets are more expensive, and "linear 
slowdown" is actually almost the best possible case. Quite often, a bigger 
data set causes *worse* than a linear slowdown.

It's very seldom the case that you grow some problem space and performance 
stays the same.

> Gah, I can't believe I've not only been sucked into a tab vs spaces
> discussion, but now I've actually wasted time doing a performance
> comparison on it.

Well, performance analysis isn't exactly a "waste". That "git grep" was 
something we spent some time trying to go fast (for example, doing the 
whole external grep tool thing because that thing is usually optimized to 
h*ll and back - so the execve() overhead is more than worth it).

And it's a real workload. Maybe others don't use "git grep" quite as much 
as I do, but I do it *all* the time. Some other people probably use ctags 
or something, I personally prefer just a fast git grep.

But the *exact* same issues will show up for "simple" things like "git 
bisect". One of the biggest costs of git bisect is actually checking out 
the source tree. If the source tree is on the order of 20% larger, what 
does that mean? 

So it doesn't matter if you have a terabyte disk. Source code size *still* 
matters.

And 20% (or 16%) is more than a lot of other optimizations can help you 
save!

> As an aside, that commit was enough to trigger a "git-gc --auto", which
> was my first experience with it. It's actually kind of annoying
> (especially since I was about to repack -a -d).

Yeah, I don't think it's wonderful, but it might even be a good thing as a 
"hey, at least you are aware of the notion of GC now" kind of introduction 
to people (who then hopefully realize that they don't actually want 
automatic GC, but rather do it once a week or something).

		Linus

  parent reply	other threads:[~2007-10-18  3:13 UTC|newest]

Thread overview: 100+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-10-16  6:45 On Tabs and Spaces Michael Witten
2007-10-16  7:04 ` Shawn O. Pearce
2007-10-16  7:27   ` Michael Witten
2007-10-16 17:20     ` Andreas Ericsson
2007-10-16 17:40       ` Sam Ravnborg
2007-10-16 23:09         ` Petr Baudis
2007-10-17  3:41           ` David
2007-10-17 11:32             ` Andy Parkins
2007-10-16  8:30 ` Adam Piatyszek
2007-10-16  9:04   ` Lars Hjemli
2007-10-16 10:16     ` Adam Piatyszek
2007-10-16 15:26       ` Jeffrey C. Ollie
2007-10-16 15:51         ` Michael Witten
2007-10-16 17:06           ` Jari Aalto
2007-10-16 19:20             ` Linus Torvalds
2007-10-16 19:36               ` Mike Hommey
2007-10-16 19:47                 ` Linus Torvalds
2007-10-16 19:51                   ` Linus Torvalds
2007-10-16 20:32                   ` Matthieu Moy
2007-10-16 20:18               ` Tom Tobin
2007-10-16 23:05                 ` Linus Torvalds
2007-10-16 23:51                   ` Christer Weinigel
2007-10-17  0:45                     ` Linus Torvalds
2007-10-17  3:08                       ` Michael Witten
2007-10-17  3:29                         ` Linus Torvalds
2007-10-17  7:17                       ` Luke Lu
2007-10-17  9:09                         ` Michael Witten
2007-10-17 10:03                           ` Luke Lu
2007-10-17 10:21                           ` Nikolai Weibull
2007-10-17 11:23                             ` Michael Witten
2007-10-17 22:02                           ` Jari Aalto
2007-10-17 22:36                             ` Andreas Ericsson
2007-10-17 23:38                               ` Jari Aalto
2007-10-18  4:31                                 ` Dmitry Torokhov
2007-10-18  8:19                                   ` Jari Aalto
2007-10-18 11:34                                     ` Petr Baudis
2007-10-18 11:39                                       ` Nikolai Weibull
2007-10-22  3:39                                         ` Miles Bader
2007-10-18  7:15                               ` David Kågedal
2007-10-18  5:42                             ` Mike Hommey
2007-10-18 10:36                               ` Jari Aalto
2007-10-17 15:53                         ` Linus Torvalds
2007-10-17 18:05                           ` Johannes Schindelin
2007-10-17 18:25                           ` Tom Tobin
2007-10-17 18:54                             ` Linus Torvalds
2007-10-17 19:33                               ` Tom Tobin
2007-10-17 19:44                                 ` Linus Torvalds
2007-10-17 19:48                                 ` Nicolas Pitre
2007-10-17 19:52                                 ` Josh England
2007-10-17 19:53                                 ` Linus Torvalds
2007-10-17 21:21                                   ` Christer Weinigel
2007-10-17 22:03                                     ` Linus Torvalds
2007-10-18  6:25                                       ` David Kastrup
2007-10-17 22:11                                     ` Johannes Schindelin
2007-10-17 23:17                                       ` Christer Weinigel
2007-10-17 23:44                                         ` Johannes Schindelin
2007-10-18  0:31                                           ` Christer Weinigel
2007-10-18  6:02                                             ` Andreas Ericsson
2007-10-18  7:12                                             ` David Kågedal
2007-10-17 23:53                                         ` Linus Torvalds
2007-10-17 20:31                                 ` David Kastrup
2007-10-17 21:08                                 ` Johannes Schindelin
2007-10-17 19:47                               ` Jan Wielemaker
2007-10-18  0:32                           ` Jeff King
2007-10-18  0:59                             ` Linus Torvalds
2007-10-18  2:45                               ` Jeff King
2007-10-18  3:03                                 ` david
2007-10-18  3:00                                   ` Jeff King
2007-10-18  3:32                                     ` Linus Torvalds
2007-10-18  4:17                                       ` Linus Torvalds
2007-10-18  3:13                                 ` Linus Torvalds [this message]
2007-10-18  3:23                                   ` Jeff King
2007-10-18  4:41                                     ` [PATCH] Add a message explaining that automatic GC is about to start koreth
2007-10-18  4:44                                       ` Steven Grimm
2007-10-18  5:01                                       ` Jeff King
2007-10-18  5:11                                         ` Shawn O. Pearce
2007-10-18 13:52                                       ` Brian Gernhardt
2007-10-18 14:16                                         ` Steven Grimm
2007-10-18 18:08                                           ` Jeff King
2007-10-19  0:16                                           ` Shawn O. Pearce
2007-10-19  1:12                                             ` [PATCH] git-gc: improve wording of --auto notification Jeff King
2007-10-19  1:24                                               ` Shawn O. Pearce
2007-10-19  1:26                                                 ` Jeff King
2007-10-18 14:21                                         ` [PATCH] Add a message explaining that automatic GC is about to start Nicolas Pitre
2007-10-18  4:52                                 ` On Tabs and Spaces Nicolas Pitre
2007-10-18  4:54                                   ` Jeff King
2007-10-18  4:55                                     ` Jeff King
2007-10-17 16:08                         ` David Kastrup
2007-10-17 17:44                           ` Nicolas Pitre
2007-10-17 19:52                             ` David Kastrup
2007-10-17 17:51                           ` Sean
2007-10-17  5:56                   ` David Kastrup
2007-10-16 20:56               ` Sam Ravnborg
2007-10-18  3:31               ` Paul Wankadia
2007-10-18  4:22                 ` Linus Torvalds
2007-10-18  4:36               ` Dmitry Potapov
2007-10-16 17:23         ` Andreas Ericsson
2007-10-16 18:34           ` Jan-Benedict Glaw
2007-10-16 18:51             ` Andreas Ericsson
2007-10-20 13:54 ` Robin Rosenberg

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.LFD.0.999.0710171955580.26902@woody.linux-foundation.org \
    --to=torvalds@linux-foundation.org \
    --cc=christer@weinigel.se \
    --cc=git@vger.kernel.org \
    --cc=git@vicaya.com \
    --cc=korpios@korpios.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).