git@vger.kernel.org mailing list mirror (one of many)
From: Junio C Hamano <junkio@cox.net>
To: git@vger.kernel.org
Subject: Re: updated design for the diff-raw format.
Date: Sat, 21 May 2005 16:17:33 -0700
Message-ID: <7vll68dv8y.fsf@assigned-by-dhcp.cox.net>
In-Reply-To: <7vwtpsdvgm.fsf@assigned-by-dhcp.cox.net> (Junio C. Hamano's message of "Sat, 21 May 2005 16:12:57 -0700")

(second of the replayed messages, with blessing from Linus)

Date: Sat, 21 May 2005 11:24:31 -0700 (PDT)
From: Linus Torvalds <torvalds@osdl.org>
To: Junio C Hamano <junkio@cox.net>
Subject: Re: [PATCH 3/3] Diff overhaul, adding the other half of copy detection.
Message-ID: <Pine.LNX.4.58.0505211107160.2206@ppc970.osdl.org>

On Sat, 21 May 2005, Junio C Hamano wrote:
> 
> Once we start to think of it this way, it becomes quite tempting
> to change the diff-raw format to actually match the above
> concept.

I agree, and I was going to suggest changing the "raw" diff output for all
the same reasons. So I think you should do it, as the old format was based
on not really knowing where this all would take us. I think your proposed
format is visually nicer, and it's obviously more flexible.

Small suggestion on termination of the thing:
 - add an "inter_name_termination" variable, which defaults to '\t' (the 
   same way "line_termination" defaults to '\n')
 - make "-z" set both "inter_name_termination" _and_ "line_termination" to 
   0.
 - make the spacing be fixed (and add a test for it, so that there is 
   never any confusion): regular spaces between the non-file-names, and 
   "inter_name_termination" before the filenames, and "line_termination" 
   after the second filename.
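
A minimal sketch of the termination rules suggested above, in Python rather
than git's own C. The function names and the field order (two modes, two
SHA1s, two names) are assumptions taken from the example lines quoted in this
message, not the actual implementation:

```python
def terminators(z_flag):
    """Return (inter_name_termination, line_termination) for the given mode."""
    # "-z" sets both terminators to NUL; otherwise TAB and newline.
    return ("\0", "\0") if z_flag else ("\t", "\n")


def format_raw_line(old_mode, new_mode, old_sha, new_sha, name1, name2,
                    inter_name_termination="\t", line_termination="\n"):
    """Format one record: fixed single spaces between the non-filename
    fields, inter_name_termination before each filename, and
    line_termination after the second filename."""
    state = " ".join([old_mode, new_mode, old_sha, new_sha])
    return (state
            + inter_name_termination + name1
            + inter_name_termination + name2
            + line_termination)


# Human-readable default output.
readable = format_raw_line("100644", "100644", "233a250...", "66818b4...",
                           "file0", "file0")

# Machine-readable output, as "-z" would produce it under this sketch.
inter, line = terminators(z_flag=True)
machine = format_raw_line("100644", "100644", "233a250...", "66818b4...",
                          "file0", "file0",
                          inter_name_termination=inter,
                          line_termination=line)
```

Keeping both terminators as parameters means the "-z" case needs no separate
formatting path, only different values.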

This has a few results:

 - the default output is perfectly readable, if long

 - "cut" (which defaults to a TAB delimiter) can directly pick up the
   three fields from a line: "state", "file1" and "file2"

 - even if you use the "readable" output (as opposed to the "-z" 
   machine-readable one), spaces in filenames are unambiguous, and we only 
   screw up on TAB and NL.

   Spaces in names are normal in many things. NL/TAB really _are_ unusual, 
   and I could imagine that some porcelain could actually disallow them 
   (and if that happens, we could support that by adding a flag to
   "update-cache" to refuse to touch such files, the same way we refuse 
   non-canonical filenames now).

 - the -z flag results in fairly unreadable output, but is at least 
   totally parseable with all filename characters allowed.
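
To illustrate the points above, here is a rough sketch of how a consumer
might parse both output styles. The function names are hypothetical, and the
record layout (a "state" field followed by two filenames) is assumed from the
description in this message:

```python
def parse_readable(text):
    """Parse the default output: one record per line, three TAB-separated
    fields ("state", "file1", "file2"), much as "cut" would see them."""
    records = []
    for line in text.split("\n"):
        if not line:
            continue
        # Spaces inside filenames are harmless here. A TAB or NL inside a
        # filename would break this split, which is the known limitation
        # of the readable format.
        state, file1, file2 = line.split("\t")
        records.append((state, file1, file2))
    return records


def parse_z(data):
    """Parse "-z" output: every field ends with NUL, three fields per
    record, so any other character is allowed inside a filename."""
    fields = data.split("\0")
    if fields and fields[-1] == "":
        fields.pop()  # drop the empty piece after the final NUL
    if len(fields) % 3 != 0:
        raise ValueError("truncated -z record")
    return [tuple(fields[i:i + 3]) for i in range(0, len(fields), 3)]


state = "100644 100644 233a250... 66818b4..."

# A filename with spaces survives the readable format.
readable_records = parse_readable(state + "\tmy file.txt\tmy file.txt\n")

# A filename with TAB and NL survives only the -z format.
z_records = parse_z(state + "\0odd\tname\n\0odd\tname\n\0")
```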

With that in place, the new format would be a lot _easier_ to parse than 
the old one, I think. And will be more flexible, and since it's a 
fixed-column format, it's actually pretty readable for humans too, as long 
as the terminal line is wide enough.

>     100644 100644 233a250... 66818b4... file0 file0
>     100755 100755 fc77389... 7b72d3d... file1 file1
>     ______ 100644 _______... 233a250... file2 file2
>     100755 ______ fc77389... _______... file3 file3
>     100644 100644 233a250... 233a250... file4 file4
> 
> Again, I am not even advocating this.  It is more like me
> still thinking aloud.

No, I think it's really good. The _one_ thing I'd do is to maybe put a 
special character at the beginning of the line, so that "diff-helper" has 
an even easier time knowing whether it should care or not. Something that 
normally wouldn't show up at the beginning of a line, like ':'.

(This would have the secondary advantage that you could run "diff-helper"
multiple times and not care whether it was already expanded or not. Right
now that is impossible: diff-helper can't know the difference between an
already-expanded diff that has a line that begins with a '+' or '-' and a
deleted/new object line).
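
A small sketch of why the leading ':' makes repeated runs safe. This is not
the real diff-helper; "fake_expand" is a hypothetical stand-in for the actual
patch generation, and the only property that matters is that its output never
starts with ':':

```python
def fake_expand(record):
    """Hypothetical expansion of one raw record into patch-like lines."""
    state, file1, file2 = record[1:].split("\t")
    return ["--- " + file1, "+++ " + file2, "-old line", "+new line"]


def diff_helper(lines, expand=fake_expand):
    """Expand only lines that start with ':'; pass everything else through
    untouched, including '+' and '-' lines of an already-expanded diff."""
    out = []
    for line in lines:
        if line.startswith(":"):
            out.extend(expand(line))
        else:
            out.append(line)
    return out


raw = [
    "some commit header text",
    ":100644 100644 233a250... 66818b4...\tfile0\tfile0",
]

once = diff_helper(raw)
twice = diff_helper(once)  # second run finds nothing left to expand
```

Because expanded output contains no ':'-prefixed lines in this sketch, the
second pass leaves it unchanged.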

		Linus





