git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* What's cooking in git.git (Jun 2016, #09; Mon, 27)
@ 2016-06-27 23:21 Junio C Hamano
  2016-06-28  8:01 ` [PATCH v3 0/3] unified auto CRLF handling, V3 tboegi
                   ` (3 more replies)
  0 siblings, 4 replies; 24+ messages in thread
From: Junio C Hamano @ 2016-06-27 23:21 UTC (permalink / raw)
  To: git

Here are the topics that have been cooking.  Commits prefixed with
'-' are only in 'pu' (proposed updates) while commits prefixed with
'+' are in 'next'.  The ones marked with '.' do not appear in any of
the integration branches, but I am still holding onto them.

The first batch for this cycle has been merged to 'master', the tip
of 'next' has been rewound and rebuilt, while a few topics got
temporarily ejected from 'next'.

You can find the changes described here in the integration branches
of the repositories listed at

    http://git-blame.blogspot.com/p/git-public-repositories.html

--------------------------------------------------
[Graduated to "master"]

* ap/git-svn-propset-doc (2016-06-15) 1 commit
  (merged to 'next' on 2016-06-22 at 5a34f7d)
 + git-svn: document the 'git svn propset' command

 "git svn propset" subcommand that was added in 2.3 days is
 documented now.


* cc/apply-introduce-state (2016-06-06) 50 commits
  (merged to 'next' on 2016-06-20 at 4f205b8)
 + builtin/apply: remove misleading comment on lock_file field
 + builtin/apply: move 'newfd' global into 'struct apply_state'
 + builtin/apply: add 'lock_file' pointer into 'struct apply_state'
 + builtin/apply: move applying patches into apply_all_patches()
 + builtin/apply: move 'state' check into check_apply_state()
 + builtin/apply: move 'symlink_changes' global into 'struct apply_state'
 + builtin/apply: move 'fn_table' global into 'struct apply_state'
 + builtin/apply: move 'state_linenr' global into 'struct apply_state'
 + builtin/apply: move 'max_change' and 'max_len' into 'struct apply_state'
 + builtin/apply: move 'ws_ignore_action' into 'struct apply_state'
 + builtin/apply: move 'ws_error_action' into 'struct apply_state'
 + builtin/apply: move 'applied_after_fixing_ws' into 'struct apply_state'
 + builtin/apply: move 'squelch_whitespace_errors' into 'struct apply_state'
 + builtin/apply: remove whitespace_option arg from set_default_whitespace_mode()
 + builtin/apply: move 'whitespace_option' into 'struct apply_state'
 + builtin/apply: move 'whitespace_error' global into 'struct apply_state'
 + builtin/apply: move 'root' global into 'struct apply_state'
 + builtin/apply: move 'p_value_known' global into 'struct apply_state'
 + builtin/apply: move 'p_value' global into 'struct apply_state'
 + builtin/apply: move 'has_include' global into 'struct apply_state'
 + builtin/apply: move 'limit_by_name' global into 'struct apply_state'
 + builtin/apply: move 'patch_input_file' global into 'struct apply_state'
 + builtin/apply: move 'apply' global into 'struct apply_state'
 + builtin/apply: move 'p_context' global into 'struct apply_state'
 + builtin/apply: move 'fake_ancestor' global into 'struct apply_state'
 + builtin/apply: move 'line_termination' global into 'struct apply_state'
 + builtin/apply: move 'unsafe_paths' global into 'struct apply_state'
 + builtin/apply: move 'no_add' global into 'struct apply_state'
 + builtin/apply: move 'threeway' global into 'struct apply_state'
 + builtin/apply: move 'summary' global into 'struct apply_state'
 + builtin/apply: move 'numstat' global into 'struct apply_state'
 + builtin/apply: move 'diffstat' global into 'struct apply_state'
 + builtin/apply: move 'cached' global into 'struct apply_state'
 + builtin/apply: move 'allow_overlap' global into 'struct apply_state'
 + builtin/apply: move 'update_index' global into 'struct apply_state'
 + builtin/apply: move 'apply_verbosely' global into 'struct apply_state'
 + builtin/apply: move 'apply_with_reject' global into 'struct apply_state'
 + builtin/apply: move 'apply_in_reverse' global into 'struct apply_state'
 + builtin/apply: move 'check_index' global into 'struct apply_state'
 + builtin/apply: move 'check' global into 'struct apply_state'
 + builtin/apply: move 'unidiff_zero' global into 'struct apply_state'
 + builtin/apply: move 'state' init into init_apply_state()
 + builtin/apply: introduce 'struct apply_state' to start libifying
 + builtin/apply: move 'read_stdin' global into cmd_apply()
 + builtin/apply: move 'options' variable into cmd_apply()
 + builtin/apply: extract line_by_line_fuzzy_match() from match_fragment()
 + builtin/apply: avoid local variable shadowing 'len' parameter
 + builtin/apply: avoid parameter shadowing 'linenr' global
 + builtin/apply: avoid parameter shadowing 'p_value' global
 + builtin/apply: make gitdiff_verify_name() return void

 Originally merged to 'next' on 2016-06-06

 The "git apply" standalone program is being libified; this is the
 first step to move many state variables into a structure that can
 be explicitly (re)initialized to make the machinery callable more
 than once.

 The next step that moves some remaining state variables into the
 structure and turns die()s into an error return that propagates up
 to the caller is not queued yet but in flight.  It would be good to
 review the above first and give the remainder of the series a solid
 base to build on.


* dn/gpg-doc (2016-06-16) 1 commit
  (merged to 'next' on 2016-06-22 at f467355)
 + Documentation: GPG capitalization

 The documentation tries to consistently spell "GPG"; when
 referring to the specific program name, "gpg" is used.


* em/newer-freebsd-shells-are-fine-with-returns (2016-06-17) 1 commit
  (merged to 'next' on 2016-06-22 at dbee33c)
 + rebase: update comment about FreeBSD /bin/sh

 Comments about misbehaving FreeBSD shells have been clarified with
 the version number (9.x and before are broken, newer ones are OK).


* et/add-chmod-x (2016-06-07) 1 commit
  (merged to 'next' on 2016-06-22 at 71d65a0)
 + add: add --chmod=+x / --chmod=-x options

 "git update-index --add --chmod=+x file" may be usable as an escape
 hatch, but not a friendly thing to force for people who do need to
 use it regularly.  "git add --chmod=+x file" can be used instead.


* jc/deref-tag (2016-06-14) 1 commit
  (merged to 'next' on 2016-06-22 at 1075713)
 + blame, line-log: do not loop around deref_tag()

 Code clean-up.


* jk/avoid-unbounded-alloca (2016-06-07) 1 commit
  (merged to 'next' on 2016-06-22 at 93feb23)
 + tree-diff: avoid alloca for large allocations

 A codepath that used alloca(3) to place an unbounded amount of data
 on the stack has been updated to avoid doing so.


* jk/fetch-prune-doc (2016-06-14) 1 commit
  (merged to 'next' on 2016-06-22 at 6563376)
 + fetch: document that pruning happens before fetching

 Minor doc update.


* lf/receive-pack-auto-gc-to-client (2016-06-06) 1 commit
  (merged to 'next' on 2016-06-22 at 92162f5)
 + receive-pack: send auto-gc output over sideband 2

 Messages that are generated by auto gc during "git push" on the
 receiving end are now passed back to the sending end in such a way
 that they are shown with "remote: " prefix to avoid confusing the
 users.


* lv/status-say-working-tree-not-directory (2016-06-09) 1 commit
  (merged to 'next' on 2016-06-22 at c65c7c1)
 + Use "working tree" instead of "working directory" for git status

 "git status" used to say "working directory" when it meant "working
 tree".


* mg/cherry-pick-multi-on-unborn (2016-06-06) 1 commit
  (merged to 'next' on 2016-06-22 at 183295b)
 + cherry-pick: allow to pick to unborn branches

 "git cherry-pick A" worked on an unborn branch, but "git
 cherry-pick A..B" didn't.


* nb/gnome-keyring-build (2016-06-14) 1 commit
  (merged to 'next' on 2016-06-22 at 0dfb90c)
 + gnome-keyring: Don't hard-code pkg-config executable

 Build improvements for gnome-keyring (in contrib/)


* pb/strbuf-read-file-doc (2016-06-14) 1 commit
  (merged to 'next' on 2016-06-22 at 10e4b4f)
 + strbuf: describe the return value of strbuf_read_file

 Minor doc update.


* pc/occurred (2016-06-10) 2 commits
  (merged to 'next' on 2016-06-22 at ce0b944)
 + config.c: fix misspelt "occurred" in an error message
 + refs.h: fix misspelt "occurred" in a comment

 Typofix.


* rj/compat-regex-size-max-fix (2016-06-06) 1 commit
  (merged to 'next' on 2016-06-22 at 376c5b1)
 + regex: fix a SIZE_MAX macro redefinition warning

 A compilation fix.


* sg/reflog-past-root (2016-06-06) 1 commit
  (merged to 'next' on 2016-06-22 at c5d4e29)
 + reflog: continue walking the reflog past root commits

 "git reflog" stopped upon seeing an entry that denotes a branch
 creation event (aka "unborn"), which made it appear as if the
 reflog was truncated.


* tb/complete-status (2016-06-10) 3 commits
  (merged to 'next' on 2016-06-22 at 44ae68f)
 + completion: add git status
 + completion: add __git_get_option_value helper
 + completion: factor out untracked file modes into a variable

 The completion script (in contrib/) learned to complete "git
 status" options.


* tr/doc-tt (2016-06-08) 4 commits
  (merged to 'next' on 2016-06-22 at dc6df3b)
 + doc: change configuration variables format
 + doc: more consistency in environment variables format
 + doc: change environment variables format
 + doc: clearer rule about formatting literals

 The documentation set has been updated so that literal commands,
 configuration variables and environment variables are consistently
 typeset in fixed-width font and bold in manpages.


* vs/prompt-avoid-unset-variable (2016-06-06) 1 commit
  (merged to 'next' on 2016-06-22 at 8bf21d3)
 + git-prompt.sh: Don't error on null ${ZSH,BASH}_VERSION, $short_sha

 The git-prompt scriptlet (in contrib/) was not friendly with those
 who uses "set -u", which has been fixed.

--------------------------------------------------
[New Topics]

* jk/test-match-signal (2016-06-24) 4 commits
 - t/lib-git-daemon: use test_match_signal
 - test_must_fail: use test_match_signal
 - t0005: use test_match_signal as appropriate
 - tests: factor portable signal check out of t0005

 The test framework learned a new helper test_match_signal to check
 an exit code from getting killed by an expected signal.

 Will merge to 'next'.


* ah/unpack-trees-advice-messages (2016-06-27) 1 commit
 - unpack-trees: fix English grammar in do-this-before-that messages

 Will merge to 'next'.


* ew/gc-auto-pack-limit-fix (2016-06-27) 1 commit
 - gc: fix off-by-one error with gc.autoPackLimit

 "gc.autoPackLimit" when set to 1 should not trigger a repacking
 when there is only one pack, but the code counted poorly and did
 so.

 Will merge to 'next'.


* nd/connect-ssh-command-config (2016-06-27) 1 commit
 - connect: read $GIT_SSH_COMMAND from config file

 A new configuration variable core.sshCommand to specify what value
 for GIT_SSH_COMMAND to use per repository.


* nd/doc-new-command (2016-06-27) 1 commit
 - new-command.txt: correct the command description file

 Will merge to 'next'.


* po/range-doc (2016-06-27) 3 commits
 - doc: give headings for the two and three dot notations
 - doc: show the actual left, right, and boundary marks
 - doc: use 'symmetric difference' consistently


* sb/submodule-parallel-fetch (2016-06-27) 2 commits
 - xwrite: poll on non-blocking FDs
 - xread: retry after poll on EAGAIN/EWOULDBLOCK

 Fix a recently introduced codepaths that are involved in parallel
 submodule operations, which gave up on reading too early, and
 could have wasted CPU while attempting to write under a corner case
 condition.

 Will merge to 'next'.


* mm/doc-tt (2016-06-27) 6 commits
 - doc: typeset HEAD and variants as litteral
 - CodingGuidelines: formatting HEAD in documentation
 - doc: typeset long options with argument as litteral
 - doc: typeset -- as litteral
 - doc: typeset long command-line options as literal
 - doc: typeset short command-line options as literal

 More mark-up updates to typeset strings that are expected to
 literally typed by the end user in fixed-width font.


* nd/fetch-ref-summary (2016-06-27) 5 commits
 - fetch: reduce duplicate in ref update status lines with placeholder
 - fetch: align all "remote -> local" output
 - fetch: change flag code for displaying tag update and deleted ref
 - fetch: refactor ref update status formatting code
 - git-fetch.txt: document fetch output

 Improve the look of the way "git fetch" reports what happened to
 each ref that was fetched.

 Still being discussed.

--------------------------------------------------
[Stalled]

* tb/convert-peek-in-index (2016-06-07) 3 commits
 - correct ce_compare_data() in a middle of a merge
 - read-cache: factor out get_sha1_from_index() helper
 - convert: unify the "auto" handling of CRLF
 (this branch is used by jh/clean-smudge-annex.)

 Needs review.


* sb/bisect (2016-04-15) 22 commits
 - SQUASH???
 - bisect: get back halfway shortcut
 - bisect: compute best bisection in compute_relevant_weights()
 - bisect: use a bottom-up traversal to find relevant weights
 - bisect: prepare for different algorithms based on find_all
 - bisect: rename count_distance() to compute_weight()
 - bisect: make total number of commits global
 - bisect: introduce distance_direction()
 - bisect: extract get_distance() function from code duplication
 - bisect: use commit instead of commit list as arguments when appropriate
 - bisect: replace clear_distance() by unique markers
 - bisect: use struct node_data array instead of int array
 - bisect: get rid of recursion in count_distance()
 - bisect: make algorithm behavior independent of DEBUG_BISECT
 - bisect: make bisect compile if DEBUG_BISECT is set
 - bisect: plug the biggest memory leak
 - bisect: add test for the bisect algorithm
 - t6030: generalize test to not rely on current implementation
 - t: use test_cmp_rev() where appropriate
 - t/test-lib-functions.sh: generalize test_cmp_rev
 - bisect: allow 'bisect run' if no good commit is known
 - bisect: write about `bisect next` in documentation

 The internal algorithm used in "git bisect" to find the next commit
 to check has been optimized greatly.

 Expecting a reroll.
 ($gmane/291163)


* sg/completion-updates (2016-02-28) 21 commits
 . completion: cache the path to the repository
 . completion: extract repository discovery from __gitdir()
 . completion: don't guard git executions with __gitdir()
 . completion: consolidate silencing errors from git commands
 . completion: don't use __gitdir() for git commands
 . completion: respect 'git -C <path>'
 . completion: fix completion after 'git -C <path>'
 . completion: don't offer commands when 'git --opt' needs an argument
 . rev-parse: add '--absolute-git-dir' option
 . completion: list short refs from a remote given as a URL
 . completion: don't list 'HEAD' when trying refs completion outside of a repo
 . completion: list refs from remote when remote's name matches a directory
 . completion: respect 'git --git-dir=<path>' when listing remote refs
 . completion: fix most spots not respecting 'git --git-dir=<path>'
 . completion: ensure that the repository path given on the command line exists
 . completion tests: add tests for the __git_refs() helper function
 . completion tests: check __gitdir()'s output in the error cases
 . completion tests: consolidate getting path of current working directory
 . completion tests: make the $cur variable local to the test helper functions
 . completion tests: don't add test cruft to the test repository
 . completion: improve __git_refs()'s in-code documentation

 Will be rerolled.
 ($gmane/287839)


* ec/annotate-deleted (2015-11-20) 1 commit
 - annotate: skip checking working tree if a revision is provided

 Usability fix for annotate-specific "<file> <rev>" syntax with deleted
 files.

 Waiting for review.


* dg/subtree-rebase-test (2016-01-19) 1 commit
 - contrib/subtree: Add a test for subtree rebase that loses commits

 Reviewed up to v5.
 Will be rerolled.
 ($gmane/284426)


* dk/gc-more-wo-pack (2016-01-13) 4 commits
 - gc: clean garbage .bitmap files from pack dir
 - t5304: ensure non-garbage files are not deleted
 - t5304: test .bitmap garbage files
 - prepare_packed_git(): find more garbage

 Follow-on to dk/gc-idx-wo-pack topic, to clean up stale
 .bitmap and .keep files.

 Waiting for a reroll.
 ($gmane/284368).


* jc/diff-b-m (2015-02-23) 5 commits
 . WIPWIP
 . WIP: diff-b-m
 - diffcore-rename: allow easier debugging
 - diffcore-rename.c: add locate_rename_src()
 - diffcore-break: allow debugging

 "git diff -B -M" produced incorrect patch when the postimage of a
 completely rewritten file is similar to the preimage of a removed
 file; such a resulting file must not be expressed as a rename from
 other place.

 The fix in this patch is broken, unfortunately.
 Will discard.

--------------------------------------------------
[Cooking]

* ak/t7800-wo-readlink (2016-06-21) 1 commit
 - t7800: readlink may not be available

 One among four invocations of readlink(1) in our test suite has
 been rewritten so that the test can run on systems without the
 command (others are in valgrind test framework and t9802).

 Will merge to 'next'.


* js/perf-on-apple (2016-06-21) 1 commit
 - perf: accommodate for MacOSX

 t/perf needs /usr/bin/time with GNU extension; the invocation of it
 is updated to "gtime" on Darwin.

 Will merge to 'next'.


* sb/t5614-modernize (2016-06-21) 1 commit
 - t5614: don't use subshells
 (this branch uses sb/clone-shallow-passthru.)

 Test clean-up.

 Will merge to 'next'.


* ao/p4-has-branch-prefix-fix (2016-06-22) 1 commit
 - git-p4: correct hasBranchPrefix verbose output

 A bug, which caused "git p4" while running under verbose mode to
 report paths that are omitted due to branch prefix incorrectly, has
 been fixed; the command said "Ignoring file outside of prefix" for
 paths that are _inside_.

 Will merge to 'next'.


* cb/t7810-test-label-fix (2016-06-21) 1 commit
 - t7810: fix duplicated test title

 Test clean-up.

 Will merge to 'next'.


* jc/t2300-setup (2016-06-22) 1 commit
 - t2300: "git --exec-path" is not usable in $PATH on Windows as-is

 Portability fix for Windows.

 Will merge to 'next'.


* jk/perf-any-version (2016-06-22) 2 commits
 - p4211: explicitly disable renames in no-rename test
 - t/perf: fix regression in testing older versions of git

 Allow t/perf framework to use the features from the most recent
 version of Git even when testing an older installed version.

 Will merge to 'next'.


* jn/preformatted-doc-url (2016-06-22) 1 commit
 - doc: git-htmldocs.googlecode.com is no more

 Will merge to 'next'.


* ex/deprecate-empty-pathspec-as-match-all (2016-06-22) 1 commit
 - pathspec: warn on empty strings as pathspec

 An empty string used as a pathspec element has always meant
 'everything matches', but it is too easy to write a script that
 finds a path to remove in $path and run 'git rm "$paht"', which
 ends up removing everything.  Start warning about this use of an
 empty string used for 'everything matches' and ask users to use a
 more explicit '.' for that instead.

 The hope is that existing users will not mind this change, and
 eventually the warning can be turned into a hard error, upgrading
 the deprecation into removal of this (mis)feature.

 Will wait for further comments for a bit before merging to 'next'.


* jk/ansi-color (2016-06-23) 7 commits
 - color: support strike-through attribute
 - color: support "italic" attribute
 - color: allow "no-" for negating attributes
 - color: refactor parse_attr
 - add skip_prefix_mem helper
 - doc: refactor description of color format
 - color: fix max-size comment

 The output coloring scheme learned two new attributes, italic and
 strike, in addition to existing bold, reverse, etc.

 Will merge to 'next'.


* nd/icase (2016-06-27) 13 commits
 - SQUASH???
 - grep.c: reuse "icase" variable
 - diffcore-pickaxe: support case insensitive match on non-ascii
 - diffcore-pickaxe: Add regcomp_or_die()
 - grep/pcre: support utf-8
 - gettext: add is_utf8_locale()
 - grep/pcre: prepare locale-dependent tables for icase matching
 - grep: rewrite an if/else condition to avoid duplicate expression
 - grep/icase: avoid kwsset when -F is specified
 - grep/icase: avoid kwsset on literal non-ascii strings
 - test-regex: expose full regcomp() to the command line
 - test-regex: isolate the bug test code
 - grep: break down an "if" stmt in preparation for next changes

 "git grep -i" has been taught to fold case in non-ascii locales
 correctly.

 Modulo minor possible nits, this round looked mostly sensible.


* mj/log-show-signature-conf (2016-06-24) 3 commits
 - log: add log.showSignature configuration variable
 - log: add "--no-show-signature" command line option
 - t4202: refactor test

 "git log" learns log.showSignature configuration variable, and a
 command line option "--no-show-signature" to countermand it.

 Will merge to 'next'.


* mg/signature-doc (2016-06-17) 4 commits
 - Documentation/technical: signed merge tag format
 - Documentation/technical: signed commit format
 - Documentation/technical: signed tag format
 - Documentation/technical: describe signature formats

 Formats of the various data (and how to validate them) where we use
 GPG signature have been documented.

 Will merge to 'next'.


* jk/string-list-static-init (2016-06-13) 2 commits
  (merged to 'next' on 2016-06-27 at 3d4b2fa)
 + use string_list initializer consistently
 + Merge branch 'jk/parseopt-string-list' into jk/string-list-static-init
 (this branch uses jk/parseopt-string-list.)

 Instead of taking advantage of a struct string_list that is
 allocated with all NULs happens to be STRING_LIST_INIT_NODUP kind,
 initialize them explicitly as such, to document their behaviour
 better.

 Will merge to 'master'.


* mh/ref-store (2016-06-20) 38 commits
 - refs: implement iteration over only per-worktree refs
 - refs: make lock generic
 - refs: add method to rename refs
 - refs: add methods to init refs db
 - refs: make delete_refs() virtual
 - refs: add method for initial ref transaction commit
 - refs: add methods for reflog
 - refs: add method iterator_begin
 - files_ref_iterator_begin(): take a ref_store argument
 - split_symref_update(): add a files_ref_store argument
 - lock_ref_sha1_basic(): add a files_ref_store argument
 - lock_ref_for_update(): add a files_ref_store argument
 - commit_ref_update(): add a files_ref_store argument
 - lock_raw_ref(): add a files_ref_store argument
 - repack_without_refs(): add a files_ref_store argument
 - refs: make peel_ref() virtual
 - refs: make create_symref() virtual
 - refs: make pack_refs() virtual
 - refs: make verify_refname_available() virtual
 - refs: make read_raw_ref() virtual
 - resolve_gitlink_ref(): rename path parameter to submodule
 - resolve_gitlink_ref(): avoid memory allocation in many cases
 - resolve_gitlink_ref(): implement using resolve_ref_recursively()
 - resolve_ref_recursively(): new function
 - read_raw_ref(): take a (struct ref_store *) argument
 - resolve_gitlink_packed_ref(): remove function
 - resolve_packed_ref(): rename function from resolve_missing_loose_ref()
 - refs: reorder definitions
 - refs: add a transaction_commit() method
 - {lock,commit,rollback}_packed_refs(): add files_ref_store arguments
 - resolve_missing_loose_ref(): add a files_ref_store argument
 - get_packed_ref(): add a files_ref_store argument
 - add_packed_ref(): add a files_ref_store argument
 - refs: create a base class "ref_store" for files_ref_store
 - refs: add a backend method structure
 - refs: rename struct ref_cache to files_ref_store
 - rename_ref_available(): add docstring
 - resolve_gitlink_ref(): eliminate temporary variable
 (this branch uses mh/ref-iterators and mh/split-under-lock; is tangled with mh/update-ref-errors.)

 The ref-store abstraction was introduced to the refs API so that we
 can plug in different backends to store references.

 Rebased on top of mh/split-under-lock.


* mh/update-ref-errors (2016-06-20) 6 commits
 - lock_ref_for_update(): avoid a symref resolution
 - lock_ref_for_update(): make error handling more uniform
 - t1404: add more tests of update-ref error handling
 - t1404: document function test_update_rejected
 - t1404: remove "prefix" argument to test_update_rejected
 - t1404: rename file to t1404-update-ref-errors.sh
 (this branch uses mh/split-under-lock; is tangled with mh/ref-iterators and mh/ref-store.)

 Error handling in the codepaths that updates refs has been
 improved.

 Rebased on top of mh/split-under-lock.


* jh/clean-smudge-annex (2016-06-22) 10 commits
 - SQUASH???
 - use smudgeToFile filter in recursive merge
 - use smudgeToFile filter in git am
 - better recovery from failure of smudgeToFile filter
 - warn on unusable smudgeToFile/cleanFromFile config
 - use smudgeToFile in git checkout etc
 - use cleanFromFile in git add
 - add smudgeToFile and cleanFromFile filter configs
 - clarify %f documentation
 - Merge branch 'tb/convert-peek-in-index' into jh/clean-smudge-annex
 (this branch uses tb/convert-peek-in-index.)

 The interface to "clean/smudge" filters require Git to feed the
 whole contents via pipe, which is suboptimal for some
 applications.  "cleanFromFile/smudgeToFile" commands are the moral
 equilvalents for these filters but they interact with the files on
 the filesystem directly.


* lc/shell-default-value-noexpand (2016-06-19) 1 commit
 - sh-setup: enclose setting of ${VAR=default} in double-quotes

 Fix unnecessarily waste in the idiomatic use of ': ${VAR=default}'
 to set the default value, without enclosing it in double quotes.

 Will merge to 'next'.


* sb/clone-shallow-passthru (2016-06-20) 1 commit
 - clone: do not let --depth imply --shallow-submodules
 (this branch is used by sb/t5614-modernize.)

 Fix an unintended regression in v2.9 that breaks "clone --depth"
 that recurses down to submodules by forcing the submodules to also
 be cloned shallowly, which many server instances that host upstream
 of the submodules are not prepared for.

 Will merge to 'next'.


* js/find-commit-subject-ignore-leading-blanks (2016-06-22) 2 commits
 - commit.c: make find_commit_subject() more robust
 - pretty: make the skip_blank_lines() function public

 A helper function that takes the contents of a commit object and
 finds its subject line did not ignore leading blank lines, as is
 commonly done by other codepaths.  Make it ignore leading blank
 lines to match.

 Will merge to 'next'.


* js/log-to-diffopt-file (2016-06-24) 10 commits
 - t4211: ensure that log respects --output=<file>
 - shortlog: respect the --output=<file> setting
 - format-patch: use stdout directly
 - format-patch: avoid freopen()
 - format-patch: explicitly switch off color when writing to files
 - shortlog: support outputting to streams other than stdout
 - graph: respect the diffopt.file setting
 - line-log: respect diffopt's configured output file stream
 - log-tree: respect diffopt's configured output file stream
 - log: prepare log/log-tree to reuse the diffopt.close_file attribute

 The commands in the "log/diff" family had an FILE* pointer in the
 data structure they pass around for a long time, but some codepaths
 used to always write to the standard output.  As a preparatory step
 to make "git format-patch" available to the internal callers, these
 codepaths have been updated to consistently write into that FILE*
 instead.


* js/mingw-parameter-less-c-functions (2016-06-20) 1 commit
 - mingw: let the build succeed with DEVELOPER=1

 Some platform-specific code had non-ANSI strict declarations of C
 functions that do not take any parameters, which has been
 corrected.

 Will merge to 'next'.


* bc/cocci (2016-06-20) 8 commits
 - merge-recursive: convert merge_recursive_generic to object_id
 - merge-recursive: convert leaf functions to use struct object_id
 - merge-recursive: convert struct merge_file_info to object_id
 - merge-recursive: convert struct stage_data to use object_id
 - diff: rename struct diff_filespec's sha1_valid member
 - diff: convert struct diff_filespec to struct object_id
 - coccinelle: apply object_id Coccinelle transformations
 - contrib/coccinelle: add basic Coccinelle transforms

 Conversion from unsigned char sha1[20] to struct object_id
 continues.


* jk/tzoffset-fix (2016-06-20) 3 commits
 - local_tzoffset: detect errors from tm_to_time_t
 - t0006: test various date formats
 - t0006: rename test-date's "show" to "relative"

 The internal code used to show local timezone offset is not
 prepared to handle timestamps beyond year 2100, and gave a
 bogus offset value to the caller.  Use a more benign looking
 +0000 instead and let "git log" going in such a case, instead
 of aborting.

 Will merge to 'next'.


* jk/add-i-diff-compact-heuristics (2016-06-16) 1 commit
  (merged to 'next' on 2016-06-27 at 568f892)
 + add--interactive: respect diff.compactionHeuristic

 "git add -i/-p" learned to honor diff.compactionHeuristic
 experimental knob, so that the user can work on the same hunk split
 as "git diff" output.

 Will merge to 'master'.


* jk/big-and-future-archive-tar (2016-06-21) 2 commits
 - archive-tar: write extended headers for far-future mtime
 - archive-tar: write extended headers for file sizes >= 8GB

 "git archive" learned to handle files that are larger than 8GB and
 commits far in the future than expressible by the traditional US-TAR
 format.

 Expecting a reroll.
 ($gmane/298119)


* jk/gpg-interface-cleanup (2016-06-17) 7 commits
 - gpg-interface: check gpg signature creation status
 - sign_buffer: use pipe_command
 - verify_signed_buffer: use pipe_command
 - run-command: add pipe_command helper
 - verify_signed_buffer: use tempfile object
 - verify_signed_buffer: drop pbuf variable
 - gpg-interface: use child_process.args

 A new run-command API function pipe_command() is introduced to
 sanely feed data to the standard input while capturing data from
 the standard output and the standard error of an external process,
 which is cumbersome to hand-roll correctly without deadlocking.

 The codepath to sign data in a prepared buffer with GPG has been
 updated to use this API to read from the status-fd to check for
 errors (instead of relying on GPG's exit status).

 Will merge to 'next'.


* lf/sideband-returns-void (2016-06-16) 2 commits
  (merged to 'next' on 2016-06-27 at 558c781)
 + upload-pack.c: make send_client_data() return void
 + sideband.c: make send_sideband() return void

 A small internal API cleanup.

 Will merge to 'master'.


* nd/graph-width-padded (2016-06-16) 2 commits
 - pretty.c: support <direction>|(<negative number>) forms
 - pretty: pass graph width to pretty formatting for use in '%>|(N)'

 "log --graph --format=" learned that "%>|(N)" specifies the width
 relative to the terminal's left edge, not relative to the area to
 draw text that is to the right of the ancestry-graph section.  It
 also now accepts negative N that means the column limit is relative
 to the right border.

 Will merge to 'next'.


* jk/bisect-show-tree (2016-06-16) 1 commit
  (merged to 'next' on 2016-06-27 at 6970f87e)
 + bisect: always call setup_revisions after init_revisions

 "git bisect" makes an internal call to "git diff-tree" when
 bisection finds the culprit, but this call did not initialize the
 data structure to pass to the diff-tree API correctly.

 Will merge to 'master'.


* jk/parseopt-string-list (2016-06-13) 3 commits
  (merged to 'next' on 2016-06-27 at 27462e6)
 + blame,shortlog: don't make local option variables static
 + interpret-trailers: don't duplicate option strings
 + parse_opt_string_list: stop allocating new strings
 (this branch is used by jk/string-list-static-init.)

 The command line argument parsing that uses OPT_STRING_LIST() often
 made a copy of the argv[] element, which was unnecessary.

 Will merge to 'master'.


* jk/repack-keep-unreachable (2016-06-14) 3 commits
 - repack: extend --keep-unreachable to loose objects
 - repack: add --keep-unreachable option
 - repack: document --unpack-unreachable option

 "git repack" learned the "--keep-unreachable" option, which sends
 loose unreachable objects to a pack instead of leaving them loose.
 This helps heuristics based on the number of loose objects
 (e.g. "gc --auto").

 Will merge to 'next'.


* lf/recv-sideband-cleanup (2016-06-13) 1 commit
 - sideband.c: refactor recv_sideband()

 Code simplification.  It however loses the atomicity of the output
 9ac13ec9 (atomic write for sideband remote messages, 2006-10-11)
 tried to add to an once-much-simpler codebase.

 Expecting a reroll.


* nd/test-lib-httpd-show-error-log-in-verbose (2016-06-13) 1 commit
  (merged to 'next' on 2016-06-27 at 9793d81)
 + lib-httpd.sh: print error.log on error

 Debugging aid.

 Will merge to 'master'.


* sb/submodule-clone-retry (2016-06-13) 2 commits
 - submodule update: continue when a clone fails
 - submodule--helper: initial clone learns retry logic

 "git submodule update" that drives many "git clone" could
 eventually hit flaky servers/network conditions on one of the
 submodules; the command learned to retry the attempt.

 Will merge to 'next'.


* jc/blame-reverse (2016-06-14) 2 commits
 - blame: dwim "blame --reverse OLD" as "blame --reverse OLD.."
 - blame: improve diagnosis for "--reverse NEW"

 It is a common mistake to say "git blame --reverse OLD path",
 expecting that the command line is dwimmed as if asking how lines
 in path in an old revision OLD have survived up to the current
 commit.

 Any supporters?  Otherwise will drop.


* km/fetch-do-not-free-remote-name (2016-06-14) 1 commit
  (merged to 'next' on 2016-06-27 at 4bc34c4)
 + builtin/fetch.c: don't free remote->name after fetch

 The ownership rule for the piece of memory that hold references to
 be fetched in "git fetch" was screwy, which has been cleaned up.

 Will merge to 'master'.


* nd/shallow-deepen (2016-06-13) 27 commits
 - fetch, upload-pack: --deepen=N extends shallow boundary by N commits
 - upload-pack: add get_reachable_list()
 - upload-pack: split check_unreachable() in two, prep for get_reachable_list()
 - t5500, t5539: tests for shallow depth excluding a ref
 - clone: define shallow clone boundary with --shallow-exclude
 - fetch: define shallow boundary with --shallow-exclude
 - upload-pack: support define shallow boundary by excluding revisions
 - refs: add expand_ref()
 - t5500, t5539: tests for shallow depth since a specific date
 - clone: define shallow clone boundary based on time with --shallow-since
 - fetch: define shallow boundary with --shallow-since
 - upload-pack: add deepen-since to cut shallow repos based on time
 - shallow.c: implement a generic shallow boundary finder based on rev-list
 - fetch-pack: use a separate flag for fetch in deepening mode
 - fetch-pack.c: mark strings for translating
 - fetch-pack: use a common function for verbose printing
 - fetch-pack: use skip_prefix() instead of starts_with()
 - upload-pack: move rev-list code out of check_non_tip()
 - upload-pack: make check_non_tip() clean things up on error
 - upload-pack: tighten number parsing at "deepen" lines
 - upload-pack: use skip_prefix() instead of starts_with()
 - upload-pack: move "unshallow" sending code out of deepen()
 - upload-pack: remove unused variable "backup"
 - upload-pack: move "shallow" sending code out of deepen()
 - upload-pack: move shallow deepen code out of receive_needs()
 - transport-helper.c: refactor set_helper_option()
 - remote-curl.c: convert fetch_git() to use argv_array

 The existing "git fetch --depth=<n>" option was hard to use
 correctly when making the history of an existing shallow clone
 deeper.  A new option, "--deepen=<n>", has been added to make this
 easier to use.  "git clone" also learned "--shallow-since=<date>"
 and "--shallow-exclude=<tag>" options to make it easier to specify
 "I am interested only in the recent N months worth of history" and
 "Give me only the history since that version".

 Rerolled.
 Needs review.  What this topic attempts to achieve is worthwhile, I
 would think.


* jk/send-pack-stdio (2016-06-10) 2 commits
  (merged to 'next' on 2016-06-27 at 2cfb0ff)
 + write_or_die: remove the unused write_or_whine() function
 + send-pack: use buffered I/O to talk to pack-objects

 Code clean-up.

 Will merge to 'master'.


* pb/commit-editmsg-path (2016-06-09) 1 commit
  (merged to 'next' on 2016-06-27 at 0f01ce1)
 + builtin/commit.c: memoize git-path for COMMIT_EDITMSG

 Code clean-up.

 Will merge to 'master'.


* jc/attr-more (2016-06-09) 8 commits
 - attr.c: outline the future plans by heavily commenting
 - attr.c: always pass check[] to collect_some_attrs()
 - attr.c: introduce empty_attr_check_elems()
 - attr.c: correct ugly hack for git_all_attrs()
 - attr.c: rename a local variable check
 - fixup! d5ad6c13
 - attr.c: pass struct git_attr_check down the callchain
 - attr.c: add push_stack() helper
 (this branch uses jc/attr; is tangled with sb/pathspec-label and sb/submodule-default-paths.)

 The beginning of long and tortuous journey to clean-up attribute
 subsystem implementation.

 Needs to be redone.


* mh/ref-iterators (2016-06-20) 13 commits
 - for_each_reflog(): reimplement using iterators
 - dir_iterator: new API for iterating over a directory tree
 - for_each_reflog(): don't abort for bad references
 - do_for_each_ref(): reimplement using reference iteration
 - refs: introduce an iterator interface
 - ref_resolves_to_object(): new function
 - entry_resolves_to_object(): rename function from ref_resolves_to_object()
 - get_ref_cache(): only create an instance if there is a submodule
 - remote rm: handle symbolic refs correctly
 - delete_refs(): add a flags argument
 - refs: use name "prefix" consistently
 - do_for_each_ref(): move docstring to the header file
 - refs: remove unnecessary "extern" keywords
 (this branch is used by mh/ref-store; uses mh/split-under-lock; is tangled with mh/update-ref-errors.)

 The API to iterate over all the refs (i.e. for_each_ref(), etc.)
 has been revamped.

 Rebased on top of mh/split-under-lock.


* ew/mboxrd-format-am (2016-06-06) 3 commits
 - am: support --patch-format=mboxrd
 - mailsplit: support unescaping mboxrd messages
 - pretty: support "mboxrd" output format

 Teach format-patch and mailsplit (hence "am") how a line that
 happens to begin with "From " in the e-mail message is quoted with
 ">", so that these lines can be restored to their original shape.

 Will merge to 'next'.


* mh/connect (2016-06-06) 10 commits
 - connect: [host:port] is legacy for ssh
 - connect: move ssh command line preparation to a separate function
 - connect: actively reject git:// urls with a user part
 - connect: change the --diag-url output to separate user and host
 - connect: make parse_connect_url() return the user part of the url as a separate value
 - connect: group CONNECT_DIAG_URL handling code
 - connect: make parse_connect_url() return separated host and port
 - connect: re-derive a host:port string from the separate host and port variables
 - connect: call get_host_and_port() earlier
 - connect: document why we sometimes call get_port after get_host_and_port

 Rewrite Git-URL parsing routine (hopefully) without changing any
 behaviour.


* va/i18n-even-more (2016-06-17) 38 commits
 - i18n: branch: mark comment when editing branch description for translation
 - i18n: unmark die messages for translation
 - i18n: submodule: escape shell variables inside eval_gettext
 - i18n: submodule: join strings marked for translation
 - i18n: init-db: join message pieces
 - i18n: remote: allow translations to reorder message
 - i18n: remote: mark URL fallback text for translation
 - i18n: standardise messages
 - i18n: sequencer: add period to error message
 - i18n: merge: change command option help to lowercase
 - i18n: merge: mark messages for translation
 - i18n: notes: mark options for translation
 - i18n: notes: mark strings for translation
 - i18n: transport-helper.c: change N_() call to _()
 - i18n: bisect: mark strings for translation
 - t5523: use test_i18ngrep for negation
 - t4153: fix negated test_i18ngrep call
 - t9003: become resilient to GETTEXT_POISON
 - tests: unpack-trees: update to use test_i18n* functions
 - tests: use test_i18n* functions to suppress false positives
 - i18n: setup: mark strings for translation
 - i18n: rebase-interactive: mark comments of squash for translation
 - i18n: rebase-interactive: mark here-doc strings for translation
 - i18n: rebase-interactive: mark strings for translation
 - i18n: git-sh-setup.sh: mark strings for translation
 - t6030: update to use test_i18ncmp
 - i18n: bisect: simplify error message for i18n
 - i18n: rebase: mark placeholder for translation
 - i18n: rebase: fix marked string to use eval_gettext variant
 - merge-octopus: use die shell function from git-sh-setup.sh
 - i18n: merge-octopus: mark messages for translation
 - i18n: sequencer: mark string for translation
 - i18n: sequencer: mark entire sentences for translation
 - i18n: transport: mark strings for translation
 - i18n: advice: internationalize message for conflicts
 - i18n: advice: mark string about detached head for translation
 - i18n: builtin/remote.c: fix mark for translation
 - Merge branch 'jc/t2300-setup' into HEAD

 More markings of messages for i18n, with updates to various tests
 to pass GETTEXT_POISON tests.

 One patch from the original submission dropped due to conflicts
 with jk/upload-pack-hook, which is still in flux.

 Will merge to 'next'.


* dk/blame-move-no-reason-for-1-line-context (2016-05-29) 1 commit
 - blame: require 0 context lines while finding moved lines with -M

 "git blame -M" missed a single line that was moved within the file.

 We may want to squash a test or two to this commit.  Volunteers?


* nd/worktree-lock (2016-06-13) 6 commits
 - worktree.c: find_worktree() search by path suffix
 - worktree: add "unlock" command
 - worktree: add "lock" command
 - worktree.c: add is_worktree_locked()
 - worktree.c: add is_main_worktree()
 - worktree.c: add find_worktree()
 (this branch uses nd/worktree-cleanup-post-head-protection.)

 "git worktree prune" protected worktrees that are marked as
 "locked" by creating a file in a known location.  "git worktree"
 command learned a dedicated command pair to create and remoev such
 a file, so that the users do not have to do this with editor.

 Is everybody happy with this version?
 If so, will merge to 'next'.


* jk/upload-pack-hook (2016-06-02) 7 commits
 - upload-pack: provide a hook for running pack-objects
 - t1308: do not get fooled by symbolic links to the source tree
 - config: add a notion of "scope"
 - config: return configset value for current_config_ functions
 - config: set up config_source for command-line config
 - git_config_parse_parameter: refactor cleanup code
 - git_config_with_options: drop "found" counting

 Allow a custom "git upload-pack" replacement to respond to
 "fetch/clone" request.

 Will merge to 'next'.


* sb/submodule-default-paths (2016-06-20) 5 commits
 - completion: clone can recurse into submodules
 - clone: add --init-submodule=<pathspec> switch
 - submodule update: add `--init-default-path` switch
 - Merge branch 'sb/pathspec-label' into sb/submodule-default-paths
 - Merge branch 'jc/attr' into sb/submodule-default-paths
 (this branch uses jc/attr and sb/pathspec-label; is tangled with jc/attr-more.)

 Allow specifying the set of submodules the user is interested in on
 the command line of "git clone" that clones the superproject.


* ep/http-curl-trace (2016-05-24) 2 commits
  (merged to 'next' on 2016-06-27 at c290515)
 + imap-send.c: introduce the GIT_TRACE_CURL enviroment variable
 + http.c: implement the GIT_TRACE_CURL environment variable

 HTTP transport gained an option to produce more detailed debugging
 trace.

 Will merge to 'master'.


* jc/attr (2016-05-25) 18 commits
 - attr: support quoting pathname patterns in C style
 - attr: expose validity check for attribute names
 - attr: add counted string version of git_attr()
 - attr: add counted string version of git_check_attr()
 - attr: retire git_check_attrs() API
 - attr: convert git_check_attrs() callers to use the new API
 - attr: convert git_all_attrs() to use "struct git_attr_check"
 - attr: (re)introduce git_check_attr() and struct git_attr_check
 - attr: rename function and struct related to checking attributes
 - attr.c: plug small leak in parse_attr_line()
 - attr.c: tighten constness around "git_attr" structure
 - attr.c: simplify macroexpand_one()
 - attr.c: mark where #if DEBUG ends more clearly
 - attr.c: complete a sentence in a comment
 - attr.c: explain the lack of attr-name syntax check in parse_attr()
 - attr.c: update a stale comment on "struct match_attr"
 - attr.c: use strchrnul() to scan for one line
 - commit.c: use strchrnul() to scan for one line
 (this branch is used by jc/attr-more, sb/pathspec-label and sb/submodule-default-paths.)

 The attributes API has been updated so that it can later be
 optimized using the knowledge of which attributes are queried.

 I wanted to polish this topic further to make the attribute
 subsystem thread-ready, but because other topics depend on this
 topic and they do not (yet) need it to be thread-ready, let's merge
 this early part together with the dependent topics to 'next', and
 back-burner the threading enhancement to another day.


* pb/bisect (2016-06-27) 9 commits
 - bisect--helper: `bisect_write` shell function in C
 - bisect--helper: `is_expected_rev` & `check_expected_revs` shell function in C
 - bisect--helper: `bisect_reset` shell function in C
 - wrapper: move is_empty_file() and rename it as is_empty_or_missing_file()
 - t6030: explicitly test for bisection cleanup
 - bisect--helper: `bisect_clean_state` shell function in C
 - bisect--helper: `write_terms` shell function in C
 - bisect: rewrite `check_term_format` shell function in C
 - bisect--helper: use OPT_CMDMODE instead of OPT_BOOL

 GSoC "bisect" topic.


* sb/pathspec-label (2016-06-03) 6 commits
 - pathspec: disable preload-index when attribute pathspec magic is in use
 - pathspec: allow escaped query values
 - pathspec: allow querying for attributes
 - pathspec: move prefix check out of the inner loop
 - pathspec: move long magic parsing out of prefix_pathspec
 - Documentation: fix a typo
 (this branch is used by sb/submodule-default-paths; uses jc/attr; is tangled with jc/attr-more.)

 The pathspec mechanism learned ":(attr:X)$pattern" pathspec magic
 to limit paths that match $pattern further by attribute settings.
 The preload-index mechanism is disabled when the new pathspec magic
 is in use (at least for now), because the attribute subsystem is
 not thread-ready.


* nd/worktree-cleanup-post-head-protection (2016-05-24) 6 commits
 - worktree: simplify prefixing paths
 - worktree: avoid 0{40}, too many zeroes, hard to read
 - worktree.c: use is_dot_or_dotdot()
 - git-worktree.txt: keep subcommand listing in alphabetical order
 - worktree.c: rewrite mark_current_worktree() to avoid strbuf
 - completion: support git-worktree
 (this branch is used by nd/worktree-lock.)

 Further preparatory clean-up for "worktree" feature.

 Will merge to 'next'.


* mh/split-under-lock (2016-06-13) 33 commits
 - lock_ref_sha1_basic(): only handle REF_NODEREF mode
 - commit_ref_update(): remove the flags parameter
 - lock_ref_for_update(): don't resolve symrefs
 - lock_ref_for_update(): don't re-read non-symbolic references
 - refs: resolve symbolic refs first
 - ref_transaction_update(): check refname_is_safe() at a minimum
 - unlock_ref(): move definition higher in the file
 - lock_ref_for_update(): new function
 - add_update(): initialize the whole ref_update
 - verify_refname_available(): adjust constness in declaration
 - refs: don't dereference on rename
 - refs: allow log-only updates
 - delete_branches(): use resolve_refdup()
 - ref_transaction_commit(): correctly report close_ref() failure
 - ref_transaction_create(): disallow recursive pruning
 - refs: make error messages more consistent
 - lock_ref_sha1_basic(): remove unneeded local variable
 - read_raw_ref(): move docstring to header file
 - read_raw_ref(): improve docstring
 - read_raw_ref(): rename symref argument to referent
 - read_raw_ref(): clear *type at start of function
 - read_raw_ref(): rename flags argument to type
 - ref_transaction_commit(): remove local variables n and updates
 - rename_ref(): remove unneeded local variable
 - commit_ref_update(): write error message to *err, not stderr
 - refname_is_safe(): insist that the refname already be normalized
 - refname_is_safe(): don't allow the empty string
 - refname_is_safe(): use skip_prefix()
 - remove_dir_recursively(): add docstring
 - safe_create_leading_directories(): improve docstring
 - read_raw_ref(): don't get confused by an empty directory
 - commit_ref(): if there is an empty dir in the way, delete it
 - t1404: demonstrate a bug resolving references
 (this branch is used by mh/ref-iterators, mh/ref-store and mh/update-ref-errors.)

 Further preparatory work on the refs API before the pluggable
 backend series can land.

 Replaced with a rerolled version.


* jc/send-email-skip-backup (2016-04-12) 1 commit
 - send-email: detect and offer to skip backup files

 A careless invocation of "git send-email directory/" after editing
 0001-change.patch with an editor often ends up sending both
 0001-change.patch and its backup file, 0001-change.patch~, causing
 embarrassment and a minor confusion.  Detect such an input and
 offer to skip the backup files when sending the patches out.

 Will merge to 'next'.
 Perhaps people will enhance it more once it gains more visibility.


* kn/ref-filter-branch-list (2016-05-17) 17 commits
 - branch: implement '--format' option
 - branch: use ref-filter printing APIs
 - branch, tag: use porcelain output
 - ref-filter: allow porcelain to translate messages in the output
 - ref-filter: add `:dir` and `:base` options for ref printing atoms
 - ref-filter: make remote_ref_atom_parser() use refname_atom_parser_internal()
 - ref-filter: introduce symref_atom_parser() and refname_atom_parser()
 - ref-filter: introduce refname_atom_parser_internal()
 - ref-filter: make "%(symref)" atom work with the ':short' modifier
 - ref-filter: add support for %(upstream:track,nobracket)
 - ref-filter: make %(upstream:track) prints "[gone]" for invalid upstreams
 - ref-filter: introduce format_ref_array_item()
 - ref-filter: move get_head_description() from branch.c
 - ref-filter: modify "%(objectname:short)" to take length
 - ref-filter: implement %(if:equals=<string>) and %(if:notequals=<string>)
 - ref-filter: include reference to 'used_atom' within 'atom_value'
 - ref-filter: implement %(if), %(then), and %(else) atoms

 The code to list branches in "git branch" has been consolidated
 with the more generic ref-filter API.

 Rerolled.
 This also really needs review.


* dt/index-helper (2016-06-27) 21 commits
 - unix-socket.c: add stub implementation when unix sockets are not supported
 - index-helper: indexhelper.exitafter config
 - trace: measure where the time is spent in the index-heavy operations
 - index-helper: optionally automatically run
 - index-helper: autorun mode
 - index-helper: don't run if already running
 - index-helper: kill mode
 - watchman: add a config option to enable the extension
 - unpack-trees: preserve index extensions
 - update-index: enable/disable watchman support
 - index-helper: use watchman to avoid refreshing index with lstat()
 - watchman: support watchman to reduce index refresh cost
 - read-cache: add watchman 'WAMA' extension
 - index-helper: log warnings
 - index-helper: add --detach
 - daemonize(): set a flag before exiting the main process
 - index-helper: add --strict
 - index-helper: new daemon for caching index and related stuff
 - pkt-line: add gentle version of packet_write
 - read-cache: allow to keep mmap'd memory after reading
 - read-cache.c: fix constness of verify_hdr()

 A new "index-helper" daemon has been introduced to give newly
 spawned Git process a quicker access to the data in the index, and
 optionally interface with the watchman daemon to further reduce the
 refresh cost.

 Expecting a reroll.
 ($gmane/298331, $gmane/298319).


* jc/bundle (2016-03-03) 6 commits
 - index-pack: --clone-bundle option
 - Merge branch 'jc/index-pack' into jc/bundle
 - bundle v3: the beginning
 - bundle: keep a copy of bundle file name in the in-core bundle header
 - bundle: plug resource leak
 - bundle doc: 'verify' is not about verifying the bundle

 The beginning of "split bundle", which could be one of the
 ingredients to allow "git clone" traffic off of the core server
 network to CDN.

 While I think it would make it easier for people to experiment and
 build on if the topic is merged to 'next', I am at the same time a
 bit reluctant to merge an unproven new topic that introduces a new
 file format, which we may end up having to support til the end of
 time.  Comments?


* jc/merge-drop-old-syntax (2015-04-29) 1 commit
 - merge: drop 'git merge <message> HEAD <commit>' syntax

 Stop supporting "git merge <message> HEAD <commit>" syntax that has
 been deprecated since October 2007, and issues a deprecation
 warning message since v2.5.0.

 It has been reported that git-gui still uses the deprecated syntax,
 which needs to be fixed before this final step can proceed.
 ($gmane/282594)

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v3 0/3] unified auto CRLF handling, V3
  2016-06-27 23:21 What's cooking in git.git (Jun 2016, #09; Mon, 27) Junio C Hamano
@ 2016-06-28  8:01 ` tboegi
  2016-06-28  8:01 ` [PATCH v3 1/3] convert: unify the "auto" handling of CRLF tboegi
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 24+ messages in thread
From: tboegi @ 2016-06-28  8:01 UTC (permalink / raw)
  To: git; +Cc: Torsten Bögershausen

From: Torsten Bögershausen <tboegi@web.de>

Unified auto CRLF handling, V3
I did the review, some minor changes: against v2:
 - SAFE_CRLF_RENORMALIZE is now 3, not 4 so that enum safe_crlf
   uses 0..3
 - V2 had unintentionally commented out the ident TC, fixed that

What more may be useful to bring this series forward?


Torsten Bögershausen (3):
  convert: unify the "auto" handling of CRLF
  read-cache: factor out get_sha1_from_index() helper
  correct ce_compare_data() in a middle of a merge

 Documentation/config.txt        | 12 +++----
 Documentation/gitattributes.txt | 15 +++++----
 builtin/apply.c                 |  3 +-
 builtin/blame.c                 |  2 +-
 cache.h                         |  4 +++
 combine-diff.c                  |  3 +-
 convert.c                       | 65 +++++++++++++++++++++++-------------
 convert.h                       | 18 +++++++---
 diff.c                          |  3 +-
 dir.c                           |  2 +-
 read-cache.c                    | 33 +++++++++++-------
 sha1_file.c                     | 12 +++++--
 t/t0025-crlf-auto.sh            |  4 +--
 t/t0027-auto-crlf.sh            | 29 ++++++++--------
 t/t6038-merge-text-auto.sh      | 74 ++++++++++++++++++++++++-----------------
 15 files changed, 170 insertions(+), 109 deletions(-)

-- 
2.0.0.rc1.6318.g0c2c796


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v3 1/3] convert: unify the "auto" handling of CRLF
  2016-06-27 23:21 What's cooking in git.git (Jun 2016, #09; Mon, 27) Junio C Hamano
  2016-06-28  8:01 ` [PATCH v3 0/3] unified auto CRLF handling, V3 tboegi
@ 2016-06-28  8:01 ` tboegi
  2016-06-28  8:01 ` [PATCH v3 2/3] read-cache: factor out get_sha1_from_index() helper tboegi
  2016-06-28  8:01 ` [PATCH v3 3/3] correct ce_compare_data() in a middle of a merge tboegi
  3 siblings, 0 replies; 24+ messages in thread
From: tboegi @ 2016-06-28  8:01 UTC (permalink / raw)
  To: git; +Cc: Torsten Bögershausen

From: Torsten Bögershausen <tboegi@web.de>

Before this change,
$ echo "* text=auto" >.gitattributes
$ echo "* eol=crlf" >>.gitattributes

would have the same effect as
$ echo "* text" >.gitattributes
$ git config core.eol crlf

Since the 'eol' attribute had higher priority than 'text=auto', this may
corrupt binary files and is not what most users expect to happen.

Make the 'eol' attribute to obey 'text=auto' and now
$ echo "* text=auto" >.gitattributes
$ echo "* eol=crlf" >>.gitattributes
behaves the same as
$ echo "* text=auto" >.gitattributes
$ git config core.eol crlf

In other words,
$ echo "* text=auto eol=crlf" >.gitattributes
has the same effect as
$ git config core.autocrlf true

and
$ echo "* text=auto eol=lf" >.gitattributes
has the same effect as
$ git config core.autocrlf input

Signed-off-by: Torsten Bögershausen <tboegi@web.de>
---
 Documentation/config.txt        | 12 +++++-------
 Documentation/gitattributes.txt | 15 +++++++++------
 convert.c                       | 42 +++++++++++++++++++++--------------------
 convert.h                       |  3 ++-
 t/t0025-crlf-auto.sh            |  4 ++--
 t/t0027-auto-crlf.sh            | 29 ++++++++++++++--------------
 t/t6038-merge-text-auto.sh      | 23 ++++++++++++++--------
 7 files changed, 69 insertions(+), 59 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 4a27ad4..62ade8b 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -389,13 +389,11 @@ file with mixed line endings would be reported by the `core.safecrlf`
 mechanism.
 
 core.autocrlf::
-	Setting this variable to "true" is almost the same as setting
-	the `text` attribute to "auto" on all files except that text
-	files are not guaranteed to be normalized: files that contain
-	`CRLF` in the repository will not be touched.  Use this
-	setting if you want to have `CRLF` line endings in your
-	working directory even though the repository does not have
-	normalized line endings.  This variable can be set to 'input',
+	Setting this variable to "true" is the same as setting
+	the `text` attribute to "auto" on all files and core.eol to "crlf".
+	Set to true if you want to have `CRLF` line endings in your
+	working directory and the repository has LF line endings.
+	This variable can be set to 'input',
 	in which case no output conversion is performed.
 
 core.symlinks::
diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index e3b1de8..d7a124b 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -115,6 +115,7 @@ text file is normalized, its line endings are converted to LF in the
 repository.  To control what line ending style is used in the working
 directory, use the `eol` attribute for a single file and the
 `core.eol` configuration variable for all text files.
+Note that `core.autocrlf` overrides `core.eol`
 
 Set::
 
@@ -130,8 +131,9 @@ Unset::
 Set to string value "auto"::
 
 	When `text` is set to "auto", the path is marked for automatic
-	end-of-line normalization.  If Git decides that the content is
-	text, its line endings are normalized to LF on checkin.
+	end-of-line conversion.  If Git decides that the content is
+	text, its line endings are converted to LF on checkin.
+	When the file has been commited with CRLF, no conversion is done.
 
 Unspecified::
 
@@ -146,7 +148,7 @@ unspecified.
 ^^^^^
 
 This attribute sets a specific line-ending style to be used in the
-working directory.  It enables end-of-line normalization without any
+working directory.  It enables end-of-line conversion without any
 content checks, effectively setting the `text` attribute.
 
 Set to string value "crlf"::
@@ -186,9 +188,10 @@ the working directory, and prevent .jpg files from being normalized
 regardless of their content.
 
 ------------------------
+*               text=auto
 *.txt		text
-*.vcproj	eol=crlf
-*.sh		eol=lf
+*.vcproj	text eol=crlf
+*.sh		text eol=lf
 *.jpg		-text
 ------------------------
 
@@ -198,7 +201,7 @@ normalization in Git.
 
 If you simply want to have CRLF line endings in your working directory
 regardless of the repository you are working with, you can set the
-config variable "core.autocrlf" without changing any attributes.
+config variable "core.autocrlf" without using any attributes.
 
 ------------------------
 [core]
diff --git a/convert.c b/convert.c
index b1614bf..67d69b5 100644
--- a/convert.c
+++ b/convert.c
@@ -176,7 +176,9 @@ static enum eol output_eol(enum crlf_action crlf_action)
 		return EOL_LF;
 	case CRLF_UNDEFINED:
 	case CRLF_AUTO_CRLF:
+		return EOL_CRLF;
 	case CRLF_AUTO_INPUT:
+		return EOL_LF;
 	case CRLF_TEXT:
 	case CRLF_AUTO:
 		/* fall through */
@@ -254,17 +256,15 @@ static int crlf_to_git(const char *path, const char *src, size_t len,
 	if (crlf_action == CRLF_AUTO || crlf_action == CRLF_AUTO_INPUT || crlf_action == CRLF_AUTO_CRLF) {
 		if (convert_is_binary(len, &stats))
 			return 0;
-
-		if (crlf_action == CRLF_AUTO_INPUT || crlf_action == CRLF_AUTO_CRLF) {
-			/*
-			 * If the file in the index has any CR in it, do not convert.
-			 * This is the new safer autocrlf handling.
-			 */
-			if (has_cr_in_index(path))
-				return 0;
-		}
+		/*
+		 * If the file in the index has any CR in it, do not convert.
+		 * This is the new safer autocrlf handling.
+		 */
+		if (checksafe == SAFE_CRLF_RENORMALIZE)
+			checksafe = SAFE_CRLF_FALSE;
+		else if (has_cr_in_index(path))
+			return 0;
 	}
-
 	check_safe_crlf(path, crlf_action, &stats, checksafe);
 
 	/* Optimization: No CRLF? Nothing to convert, regardless. */
@@ -320,12 +320,10 @@ static int crlf_to_worktree(const char *path, const char *src, size_t len,
 		return 0;
 
 	if (crlf_action == CRLF_AUTO || crlf_action == CRLF_AUTO_INPUT || crlf_action == CRLF_AUTO_CRLF) {
-		if (crlf_action == CRLF_AUTO_INPUT || crlf_action == CRLF_AUTO_CRLF) {
-			/* If we have any CR or CRLF line endings, we do not touch it */
-			/* This is the new safer autocrlf-handling */
-			if (stats.lonecr || stats.crlf )
-				return 0;
-		}
+		/* If we have any CR or CRLF line endings, we do not touch it */
+		/* This is the new safer autocrlf-handling */
+		if (stats.lonecr || stats.crlf )
+			return 0;
 
 		if (convert_is_binary(len, &stats))
 			return 0;
@@ -786,7 +784,11 @@ static void convert_attrs(struct conv_attrs *ca, const char *path)
 		ca->drv = git_path_check_convert(ccheck + 2);
 		if (ca->crlf_action != CRLF_BINARY) {
 			enum eol eol_attr = git_path_check_eol(ccheck + 3);
-			if (eol_attr == EOL_LF)
+			if (ca->crlf_action == CRLF_AUTO && eol_attr == EOL_LF)
+				ca->crlf_action = CRLF_AUTO_INPUT;
+			else if (ca->crlf_action == CRLF_AUTO && eol_attr == EOL_CRLF)
+				ca->crlf_action = CRLF_AUTO_CRLF;
+			else if (eol_attr == EOL_LF)
 				ca->crlf_action = CRLF_TEXT_INPUT;
 			else if (eol_attr == EOL_CRLF)
 				ca->crlf_action = CRLF_TEXT_CRLF;
@@ -845,9 +847,9 @@ const char *get_convert_attr_ascii(const char *path)
 	case CRLF_AUTO:
 		return "text=auto";
 	case CRLF_AUTO_CRLF:
-		return "text=auto eol=crlf"; /* This is not supported yet */
+		return "text=auto eol=crlf";
 	case CRLF_AUTO_INPUT:
-		return "text=auto eol=lf"; /* This is not supported yet */
+		return "text=auto eol=lf";
 	}
 	return "";
 }
@@ -949,7 +951,7 @@ int renormalize_buffer(const char *path, const char *src, size_t len, struct str
 		src = dst->buf;
 		len = dst->len;
 	}
-	return ret | convert_to_git(path, src, len, dst, SAFE_CRLF_FALSE);
+	return ret | convert_to_git(path, src, len, dst, SAFE_CRLF_RENORMALIZE);
 }
 
 /*****************************************************************
diff --git a/convert.h b/convert.h
index ccf436b..82871a1 100644
--- a/convert.h
+++ b/convert.h
@@ -7,7 +7,8 @@
 enum safe_crlf {
 	SAFE_CRLF_FALSE = 0,
 	SAFE_CRLF_FAIL = 1,
-	SAFE_CRLF_WARN = 2
+	SAFE_CRLF_WARN = 2,
+	SAFE_CRLF_RENORMALIZE = 3
 };
 
 extern enum safe_crlf safe_crlf;
diff --git a/t/t0025-crlf-auto.sh b/t/t0025-crlf-auto.sh
index c164b46..d0bee08 100755
--- a/t/t0025-crlf-auto.sh
+++ b/t/t0025-crlf-auto.sh
@@ -114,7 +114,7 @@ test_expect_success 'autocrlf=true does not normalize CRLF files' '
 	test -z "$LFonlydiff" -a -z "$CRLFonlydiff" -a -z "$LFwithNULdiff"
 '
 
-test_expect_success 'text=auto, autocrlf=true _does_ normalize CRLF files' '
+test_expect_success 'text=auto, autocrlf=true does not normalize CRLF files' '
 
 	rm -f .gitattributes tmp LFonly CRLFonly LFwithNUL &&
 	git config core.autocrlf true &&
@@ -126,7 +126,7 @@ test_expect_success 'text=auto, autocrlf=true _does_ normalize CRLF files' '
 	LFonlydiff=$(git diff LFonly) &&
 	CRLFonlydiff=$(git diff CRLFonly) &&
 	LFwithNULdiff=$(git diff LFwithNUL) &&
-	test -z "$LFonlydiff" -a -n "$CRLFonlydiff" -a -z "$LFwithNULdiff"
+	test -z "$LFonlydiff" -a -z "$CRLFonlydiff" -a -z "$LFwithNULdiff"
 '
 
 test_expect_success 'text=auto, autocrlf=true does not normalize binary files' '
diff --git a/t/t0027-auto-crlf.sh b/t/t0027-auto-crlf.sh
index 9372589..2860d2d 100755
--- a/t/t0027-auto-crlf.sh
+++ b/t/t0027-auto-crlf.sh
@@ -175,8 +175,8 @@ attr_ascii () {
 	text,lf)   echo "text eol=lf" ;;
 	text,crlf) echo "text eol=crlf" ;;
 	auto,)     echo "text=auto" ;;
-	auto,lf)   echo "text eol=lf" ;;
-	auto,crlf) echo "text eol=crlf" ;;
+	auto,lf)   echo "text=auto eol=lf" ;;
+	auto,crlf) echo "text=auto eol=crlf" ;;
 	lf,)       echo "text eol=lf" ;;
 	crlf,)     echo "text eol=crlf" ;;
 	,) echo "" ;;
@@ -397,10 +397,9 @@ commit_chk_wrnNNO ""      ""      false   ""        ""        ""          ""
 commit_chk_wrnNNO ""      ""      true    LF_CRLF   ""        ""          ""          ""
 commit_chk_wrnNNO ""      ""      input   ""        ""        ""          ""          ""
 
-commit_chk_wrnNNO "auto"  ""      false   "$WILC"   "$WICL"   "$WAMIX"    ""          ""
-commit_chk_wrnNNO "auto"  ""      true    LF_CRLF   ""        LF_CRLF     ""          ""
-commit_chk_wrnNNO "auto"  ""      input   ""        CRLF_LF   CRLF_LF     ""          ""
-
+commit_chk_wrnNNO "auto"  ""      false   "$WILC"   ""        ""          ""          ""
+commit_chk_wrnNNO "auto"  ""      true    LF_CRLF   ""        ""          ""          ""
+commit_chk_wrnNNO "auto"  ""      input   ""        ""        ""          ""          ""
 for crlf in true false input
 do
 	commit_chk_wrnNNO -text ""      $crlf   ""        ""        ""          ""          ""
@@ -408,8 +407,8 @@ do
 	commit_chk_wrnNNO -text crlf    $crlf   ""        ""        ""          ""          ""
 	commit_chk_wrnNNO ""    lf      $crlf   ""       CRLF_LF    CRLF_LF      ""         CRLF_LF
 	commit_chk_wrnNNO ""    crlf    $crlf   LF_CRLF   ""        LF_CRLF     LF_CRLF     ""
-	commit_chk_wrnNNO auto  lf    	$crlf   ""       CRLF_LF    CRLF_LF     ""          CRLF_LF
-	commit_chk_wrnNNO auto  crlf  	$crlf   LF_CRLF   ""        LF_CRLF     LF_CRLF     ""
+	commit_chk_wrnNNO auto  lf    	$crlf   ""        ""        ""          ""          ""
+	commit_chk_wrnNNO auto  crlf  	$crlf   LF_CRLF   ""        ""          ""          ""
 	commit_chk_wrnNNO text  lf    	$crlf   ""       CRLF_LF    CRLF_LF     ""          CRLF_LF
 	commit_chk_wrnNNO text  crlf  	$crlf   LF_CRLF   ""        LF_CRLF     LF_CRLF     ""
 done
@@ -454,9 +453,9 @@ do
 	check_in_repo_NNO -text ""     $crlf   LF  CRLF  CRLF_mix_LF  LF_mix_CR  CRLF_nul
 	check_in_repo_NNO -text lf     $crlf   LF  CRLF  CRLF_mix_LF  LF_mix_CR  CRLF_nul
 	check_in_repo_NNO -text crlf   $crlf   LF  CRLF  CRLF_mix_LF  LF_mix_CR  CRLF_nul
-	check_in_repo_NNO auto  ""     $crlf   LF  LF    LF           LF_mix_CR  CRLF_nul
-	check_in_repo_NNO auto  lf     $crlf   LF  LF    LF           LF_mix_CR  LF_nul
-	check_in_repo_NNO auto  crlf   $crlf   LF  LF    LF           LF_mix_CR  LF_nul
+	check_in_repo_NNO auto  ""     $crlf   LF  CRLF  CRLF_mix_LF  LF_mix_CR  CRLF_nul
+	check_in_repo_NNO auto  lf     $crlf   LF  CRLF  CRLF_mix_LF  LF_mix_CR  CRLF_nul
+	check_in_repo_NNO auto  crlf   $crlf   LF  CRLF  CRLF_mix_LF  LF_mix_CR  CRLF_nul
 	check_in_repo_NNO text  ""     $crlf   LF  LF    LF           LF_mix_CR  LF_nul
 	check_in_repo_NNO text  lf     $crlf   LF  LF    LF           LF_mix_CR  LF_nul
 	check_in_repo_NNO text  crlf   $crlf   LF  LF    LF           LF_mix_CR  LF_nul
@@ -509,7 +508,7 @@ do
 			checkout_files text  "$id" "crlf" "$crlf" "$ceol"  CRLF  CRLF  CRLF         CRLF_mix_CR  CRLF_nul
 			# currently the same as text, eol=XXX
 			checkout_files auto  "$id" "lf"   "$crlf" "$ceol"  LF    CRLF  CRLF_mix_LF  LF_mix_CR    LF_nul
-			checkout_files auto  "$id" "crlf" "$crlf" "$ceol"  CRLF  CRLF  CRLF         CRLF_mix_CR  CRLF_nul
+			checkout_files auto  "$id" "crlf" "$crlf" "$ceol"  CRLF  CRLF  CRLF_mix_LF  LF_mix_CR    LF_nul
 		done
 
 		# core.autocrlf false, different core.eol
@@ -517,7 +516,7 @@ do
 		# core.autocrlf true
 		checkout_files   ""    "$id" ""     true    "$ceol"  CRLF  CRLF  CRLF_mix_LF  LF_mix_CR    LF_nul
 		# text: core.autocrlf = true overrides core.eol
-		checkout_files   auto  "$id" ""     true    "$ceol"  CRLF  CRLF  CRLF         LF_mix_CR    LF_nul
+		checkout_files   auto  "$id" ""     true    "$ceol"  CRLF  CRLF  CRLF_mix_LF  LF_mix_CR    LF_nul
 		checkout_files   text  "$id" ""     true    "$ceol"  CRLF  CRLF  CRLF         CRLF_mix_CR  CRLF_nul
 		# text: core.autocrlf = input overrides core.eol
 		checkout_files   text  "$id" ""     input   "$ceol"  LF    CRLF  CRLF_mix_LF  LF_mix_CR    LF_nul
@@ -531,8 +530,8 @@ do
 	checkout_files     text  "$id" ""     false   ""       $NL   CRLF  $MIX_CRLF_LF $MIX_LF_CR   $LFNUL
 	checkout_files     text  "$id" ""     false   native   $NL   CRLF  $MIX_CRLF_LF $MIX_LF_CR   $LFNUL
 	# auto: core.autocrlf=false and core.eol unset(or native) uses native eol
-	checkout_files     auto  "$id" ""     false   ""       $NL   CRLF  $MIX_CRLF_LF LF_mix_CR    LF_nul
-	checkout_files     auto  "$id" ""     false   native   $NL   CRLF  $MIX_CRLF_LF LF_mix_CR    LF_nul
+	checkout_files     auto  "$id" ""     false   ""       $NL   CRLF  CRLF_mix_LF  LF_mix_CR    LF_nul
+	checkout_files     auto  "$id" ""     false   native   $NL   CRLF  CRLF_mix_LF  LF_mix_CR    LF_nul
 done
 
 # Should be the last test case: remove some files from the worktree
diff --git a/t/t6038-merge-text-auto.sh b/t/t6038-merge-text-auto.sh
index 85c10b0..33b77ee 100755
--- a/t/t6038-merge-text-auto.sh
+++ b/t/t6038-merge-text-auto.sh
@@ -16,6 +16,13 @@ test_description='CRLF merge conflict across text=auto change
 
 test_have_prereq SED_STRIPS_CR && SED_OPTIONS=-b
 
+compare_files () {
+	tr '\015\000' QN <"$1" >"$1".expect &&
+	tr '\015\000' QN <"$2" >"$2".actual &&
+	test_cmp "$1".expect "$2".actual &&
+	rm "$1".expect "$2".actual
+}
+
 test_expect_success setup '
 	git config core.autocrlf false &&
 
@@ -30,7 +37,7 @@ test_expect_success setup '
 	git branch side &&
 
 	echo "* text=auto" >.gitattributes &&
-	touch file &&
+	echo first line >file &&
 	git add .gitattributes file &&
 	test_tick &&
 	git commit -m "normalize file" &&
@@ -81,7 +88,7 @@ test_expect_success 'Merge after setting text=auto' '
 	rm -f .gitattributes &&
 	git reset --hard a &&
 	git merge b &&
-	test_cmp expected file
+	compare_files expected file
 '
 
 test_expect_success 'Merge addition of text=auto' '
@@ -99,7 +106,7 @@ test_expect_success 'Merge addition of text=auto' '
 	rm -f .gitattributes &&
 	git reset --hard b &&
 	git merge a &&
-	test_cmp expected file
+	compare_files  expected file
 '
 
 test_expect_success 'Detect CRLF/LF conflict after setting text=auto' '
@@ -121,7 +128,7 @@ test_expect_success 'Detect CRLF/LF conflict after setting text=auto' '
 	git reset --hard a &&
 	test_must_fail git merge b &&
 	fuzz_conflict file >file.fuzzy &&
-	test_cmp expected file.fuzzy
+	compare_files expected file.fuzzy
 '
 
 test_expect_success 'Detect LF/CRLF conflict from addition of text=auto' '
@@ -143,7 +150,7 @@ test_expect_success 'Detect LF/CRLF conflict from addition of text=auto' '
 	git reset --hard b &&
 	test_must_fail git merge a &&
 	fuzz_conflict file >file.fuzzy &&
-	test_cmp expected file.fuzzy
+	compare_files expected file.fuzzy
 '
 
 test_expect_failure 'checkout -m after setting text=auto' '
@@ -158,7 +165,7 @@ test_expect_failure 'checkout -m after setting text=auto' '
 	git reset --hard initial &&
 	git checkout a -- . &&
 	git checkout -m b &&
-	test_cmp expected file
+	compare_files expected file
 '
 
 test_expect_failure 'checkout -m addition of text=auto' '
@@ -173,7 +180,7 @@ test_expect_failure 'checkout -m addition of text=auto' '
 	git reset --hard initial &&
 	git checkout b -- . &&
 	git checkout -m a &&
-	test_cmp expected file
+	compare_files expected file
 '
 
 test_expect_failure 'cherry-pick patch from after text=auto was added' '
@@ -187,7 +194,7 @@ test_expect_failure 'cherry-pick patch from after text=auto was added' '
 	git reset --hard b &&
 	test_must_fail git cherry-pick a >err 2>&1 &&
 	grep "[Nn]othing added" err &&
-	test_cmp expected file
+	compare_files expected file
 '
 
 test_expect_success 'Test delete/normalize conflict' '
-- 
2.0.0.rc1.6318.g0c2c796


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 2/3] read-cache: factor out get_sha1_from_index() helper
  2016-06-27 23:21 What's cooking in git.git (Jun 2016, #09; Mon, 27) Junio C Hamano
  2016-06-28  8:01 ` [PATCH v3 0/3] unified auto CRLF handling, V3 tboegi
  2016-06-28  8:01 ` [PATCH v3 1/3] convert: unify the "auto" handling of CRLF tboegi
@ 2016-06-28  8:01 ` tboegi
  2016-06-28  8:01 ` [PATCH v3 3/3] correct ce_compare_data() in a middle of a merge tboegi
  3 siblings, 0 replies; 24+ messages in thread
From: tboegi @ 2016-06-28  8:01 UTC (permalink / raw)
  To: git; +Cc: Torsten Bögershausen

From: Torsten Bögershausen <tboegi@web.de>

Factor out the retrieval of the sha1 for a given path in
read_blob_data_from_index() into the function get_sha1_from_index().

This will be used in the next commit, when convert.c can do the
analyze for "text=auto" without slurping the whole blob into memory
at once.

Add a wrapper definition get_sha1_from_cache().

Signed-off-by: Torsten Bögershausen <tboegi@web.de>
---
 cache.h      |  3 +++
 read-cache.c | 29 ++++++++++++++++++-----------
 2 files changed, 21 insertions(+), 11 deletions(-)

diff --git a/cache.h b/cache.h
index b829410..bd1210a 100644
--- a/cache.h
+++ b/cache.h
@@ -379,6 +379,7 @@ extern void free_name_hash(struct index_state *istate);
 #define unmerge_cache_entry_at(at) unmerge_index_entry_at(&the_index, at)
 #define unmerge_cache(pathspec) unmerge_index(&the_index, pathspec)
 #define read_blob_data_from_cache(path, sz) read_blob_data_from_index(&the_index, (path), (sz))
+#define get_sha1_from_cache(path)  get_sha1_from_index (&the_index, (path))
 #endif
 
 enum object_type {
@@ -1008,6 +1009,8 @@ static inline void *read_sha1_file(const unsigned char *sha1, enum object_type *
 	return read_sha1_file_extended(sha1, type, size, LOOKUP_REPLACE_OBJECT);
 }
 
+const unsigned char *get_sha1_from_index(struct index_state *istate, const char *path);
+
 /*
  * This internal function is only declared here for the benefit of
  * lookup_replace_object().  Please do not call it directly.
diff --git a/read-cache.c b/read-cache.c
index d9fb78b..a3ef967 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2263,13 +2263,27 @@ int index_name_is_other(const struct index_state *istate, const char *name,
 
 void *read_blob_data_from_index(struct index_state *istate, const char *path, unsigned long *size)
 {
-	int pos, len;
+	const unsigned char *sha1;
 	unsigned long sz;
 	enum object_type type;
 	void *data;
 
-	len = strlen(path);
-	pos = index_name_pos(istate, path, len);
+	sha1 = get_sha1_from_index(istate, path);
+	if (!sha1)
+		return NULL;
+	data = read_sha1_file(sha1, &type, &sz);
+	if (!data || type != OBJ_BLOB) {
+		free(data);
+		return NULL;
+	}
+	if (size)
+		*size = sz;
+	return data;
+}
+
+const unsigned char *get_sha1_from_index(struct index_state *istate, const char *path)
+{
+	int pos = index_name_pos(istate, path, strlen(path));
 	if (pos < 0) {
 		/*
 		 * We might be in the middle of a merge, in which
@@ -2285,14 +2299,7 @@ void *read_blob_data_from_index(struct index_state *istate, const char *path, un
 	}
 	if (pos < 0)
 		return NULL;
-	data = read_sha1_file(istate->cache[pos]->sha1, &type, &sz);
-	if (!data || type != OBJ_BLOB) {
-		free(data);
-		return NULL;
-	}
-	if (size)
-		*size = sz;
-	return data;
+	return (istate->cache[pos]->sha1);
 }
 
 void stat_validity_clear(struct stat_validity *sv)
-- 
2.0.0.rc1.6318.g0c2c796


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 3/3] correct ce_compare_data() in a middle of a merge
  2016-06-27 23:21 What's cooking in git.git (Jun 2016, #09; Mon, 27) Junio C Hamano
                   ` (2 preceding siblings ...)
  2016-06-28  8:01 ` [PATCH v3 2/3] read-cache: factor out get_sha1_from_index() helper tboegi
@ 2016-06-28  8:01 ` tboegi
  2016-06-29 16:14   ` Junio C Hamano
  3 siblings, 1 reply; 24+ messages in thread
From: tboegi @ 2016-06-28  8:01 UTC (permalink / raw)
  To: git; +Cc: Torsten Bögershausen

From: Torsten Bögershausen <tboegi@web.de>

The following didn't work as expected:

 - In a middle of a merge
 - merge.renormalize is true,
 - .gitattributes = "* text=auto"
 - core.eol = crlf

Merge a blob with CRLF "first line\r\nsame line\r\n" and a blob
with LF "first line\nsame line\n".

The expected result of the merge is "first line\nsame line\n".

The content in the working tree is "first line\r\nsame line\r\n",
and ce_compare_data() should find that the content is clean and return 0.

Deep down crlf_to_git() is invoked, to check if CRLF are converted or not.

The "new safer autocrlf handling" calls blob_has_cr().

Instead of using the sha1 of the blob, (CRLF in this example),
the function get_sha1_from_index() is invoked.
get_sha1_from_index() decides to return "ours" when in the middle of
the merge, which is LF.

As a result, the CRLF in the worktree are converted into LF before
the comparison.
The contents of LF and CRLF don't match any more.

The problem is that ce_compare_data() has ce->sha1, but the sha1 is lost
on it's way into blob_has_cr().

Forwarding ce->sha1 from ce_compare_data() into crlf_to_git() makes sure
that blob_has_cr() looks at the appropriate blob.

Add a new parameter index_blob_sha1 to convert_to_git(), and forward the
sha1 from ce_compare_data() into convert_to_git(). Other callers use NULL
for index_blob_sha1, and the sha1 is determined from path
using get_sha1_from_cache(path). This is the same handling as before.

In the same spirit, forward the sha1 into would_convert_to_git().

While at it, rename has_cr_in_index() into blob_has_cr()
and replace 0 with SAFE_CRLF_FALSE.

Add a TC in t6038 to have a test coverage under Linux.

Signed-off-by: Torsten Bögershausen <tboegi@web.de>
---
 builtin/apply.c            |  3 ++-
 builtin/blame.c            |  2 +-
 cache.h                    |  1 +
 combine-diff.c             |  3 ++-
 convert.c                  | 43 ++++++++++++++++++++++++++------------
 convert.h                  | 15 ++++++++++----
 diff.c                     |  3 ++-
 dir.c                      |  2 +-
 read-cache.c               |  4 +++-
 sha1_file.c                | 12 ++++++++---
 t/t6038-merge-text-auto.sh | 51 +++++++++++++++++++++++++---------------------
 11 files changed, 90 insertions(+), 49 deletions(-)

diff --git a/builtin/apply.c b/builtin/apply.c
index 42c610e..f33d9cf 100644
--- a/builtin/apply.c
+++ b/builtin/apply.c
@@ -2142,7 +2142,8 @@ static int read_old_data(struct stat *st, const char *path, struct strbuf *buf)
 	case S_IFREG:
 		if (strbuf_read_file(buf, path, st->st_size) != st->st_size)
 			return error(_("unable to open or read %s"), path);
-		convert_to_git(path, buf->buf, buf->len, buf, 0);
+		convert_to_git(path, buf->buf, buf->len, buf,
+			       SAFE_CRLF_FALSE, NULL);
 		return 0;
 	default:
 		return -1;
diff --git a/builtin/blame.c b/builtin/blame.c
index e982fb8..6305a1e 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -2376,7 +2376,7 @@ static struct commit *fake_working_tree_commit(struct diff_options *opt,
 		if (strbuf_read(&buf, 0, 0) < 0)
 			die_errno("failed to read from stdin");
 	}
-	convert_to_git(path, buf.buf, buf.len, &buf, 0);
+	convert_to_git(path, buf.buf, buf.len, &buf, SAFE_CRLF_FALSE, NULL);
 	origin->file.ptr = buf.buf;
 	origin->file.size = buf.len;
 	pretend_sha1_file(buf.buf, buf.len, OBJ_BLOB, origin->blob_sha1);
diff --git a/cache.h b/cache.h
index bd1210a..42b74b6 100644
--- a/cache.h
+++ b/cache.h
@@ -605,6 +605,7 @@ extern int ie_modified(const struct index_state *, const struct cache_entry *, s
 
 #define HASH_WRITE_OBJECT 1
 #define HASH_FORMAT_CHECK 2
+#define HASH_USE_SHA_NOT_PATH 4
 extern int index_fd(unsigned char *sha1, int fd, struct stat *st, enum object_type type, const char *path, unsigned flags);
 extern int index_path(unsigned char *sha1, const char *path, struct stat *st, unsigned flags);
 
diff --git a/combine-diff.c b/combine-diff.c
index 0e1d4b0..c4fa884 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -1053,7 +1053,8 @@ static void show_patch_diff(struct combine_diff_path *elem, int num_parent,
 			if (is_file) {
 				struct strbuf buf = STRBUF_INIT;
 
-				if (convert_to_git(elem->path, result, len, &buf, safe_crlf)) {
+				if (convert_to_git(elem->path, result, len,
+						   &buf, safe_crlf, NULL)) {
 					free(result);
 					result = strbuf_detach(&buf, &len);
 					result_size = len;
diff --git a/convert.c b/convert.c
index 67d69b5..802ee7c 100644
--- a/convert.c
+++ b/convert.c
@@ -219,23 +219,28 @@ static void check_safe_crlf(const char *path, enum crlf_action crlf_action,
 	}
 }
 
-static int has_cr_in_index(const char *path)
+static int blob_has_cr(const unsigned char *index_blob_sha1)
 {
 	unsigned long sz;
 	void *data;
-	int has_cr;
-
-	data = read_blob_data_from_cache(path, &sz);
+	int has_cr = 0;
+	enum object_type type;
+	if (!index_blob_sha1)
+		return 0;
+	data = read_sha1_file(index_blob_sha1, &type, &sz);
 	if (!data)
 		return 0;
-	has_cr = memchr(data, '\r', sz) != NULL;
+	if (type == OBJ_BLOB)
+		has_cr = memchr(data, '\r', sz) != NULL;
+
 	free(data);
 	return has_cr;
 }
 
 static int crlf_to_git(const char *path, const char *src, size_t len,
 		       struct strbuf *buf,
-		       enum crlf_action crlf_action, enum safe_crlf checksafe)
+		       enum crlf_action crlf_action, enum safe_crlf checksafe,
+		       const unsigned char *index_blob_sha1)
 {
 	struct text_stat stats;
 	char *dst;
@@ -256,14 +261,23 @@ static int crlf_to_git(const char *path, const char *src, size_t len,
 	if (crlf_action == CRLF_AUTO || crlf_action == CRLF_AUTO_INPUT || crlf_action == CRLF_AUTO_CRLF) {
 		if (convert_is_binary(len, &stats))
 			return 0;
+
 		/*
 		 * If the file in the index has any CR in it, do not convert.
 		 * This is the new safer autocrlf handling.
 		 */
 		if (checksafe == SAFE_CRLF_RENORMALIZE)
 			checksafe = SAFE_CRLF_FALSE;
-		else if (has_cr_in_index(path))
-			return 0;
+		else {
+			/*
+			 * If the file in the index has any CR in it, do not convert.
+			 * This is the new safer autocrlf handling.
+			 */
+			if (!index_blob_sha1)
+				index_blob_sha1 = get_sha1_from_cache(path);
+			if (blob_has_cr(index_blob_sha1))
+				return 0;
+		}
 	}
 	check_safe_crlf(path, crlf_action, &stats, checksafe);
 
@@ -855,7 +869,8 @@ const char *get_convert_attr_ascii(const char *path)
 }
 
 int convert_to_git(const char *path, const char *src, size_t len,
-                   struct strbuf *dst, enum safe_crlf checksafe)
+		   struct strbuf *dst, enum safe_crlf checksafe,
+		   const unsigned char *index_blob_sha1)
 {
 	int ret = 0;
 	const char *filter = NULL;
@@ -876,7 +891,7 @@ int convert_to_git(const char *path, const char *src, size_t len,
 		src = dst->buf;
 		len = dst->len;
 	}
-	ret |= crlf_to_git(path, src, len, dst, ca.crlf_action, checksafe);
+	ret |= crlf_to_git(path, src, len, dst, ca.crlf_action, checksafe, index_blob_sha1);
 	if (ret && dst) {
 		src = dst->buf;
 		len = dst->len;
@@ -885,7 +900,8 @@ int convert_to_git(const char *path, const char *src, size_t len,
 }
 
 void convert_to_git_filter_fd(const char *path, int fd, struct strbuf *dst,
-			      enum safe_crlf checksafe)
+			      enum safe_crlf checksafe,
+			      const unsigned char *index_blob_sha1)
 {
 	struct conv_attrs ca;
 	convert_attrs(&ca, path);
@@ -896,7 +912,8 @@ void convert_to_git_filter_fd(const char *path, int fd, struct strbuf *dst,
 	if (!apply_filter(path, NULL, 0, fd, dst, ca.drv->clean))
 		die("%s: clean filter '%s' failed", path, ca.drv->name);
 
-	crlf_to_git(path, dst->buf, dst->len, dst, ca.crlf_action, checksafe);
+	crlf_to_git(path, dst->buf, dst->len, dst, ca.crlf_action,
+		    checksafe, index_blob_sha1);
 	ident_to_git(path, dst->buf, dst->len, dst, ca.ident);
 }
 
@@ -951,7 +968,7 @@ int renormalize_buffer(const char *path, const char *src, size_t len, struct str
 		src = dst->buf;
 		len = dst->len;
 	}
-	return ret | convert_to_git(path, src, len, dst, SAFE_CRLF_RENORMALIZE);
+	return ret | convert_to_git(path, src, len, dst, SAFE_CRLF_RENORMALIZE, NULL);
 }
 
 /*****************************************************************
diff --git a/convert.h b/convert.h
index 82871a1..97d4272 100644
--- a/convert.h
+++ b/convert.h
@@ -39,19 +39,26 @@ extern const char *get_convert_attr_ascii(const char *path);
 
 /* returns 1 if *dst was used */
 extern int convert_to_git(const char *path, const char *src, size_t len,
-			  struct strbuf *dst, enum safe_crlf checksafe);
+			  struct strbuf *dst, enum safe_crlf checksafe,
+			  const unsigned char *index_blob_sha1);
+
 extern int convert_to_working_tree(const char *path, const char *src,
 				   size_t len, struct strbuf *dst);
 extern int renormalize_buffer(const char *path, const char *src, size_t len,
 			      struct strbuf *dst);
-static inline int would_convert_to_git(const char *path)
+static inline int would_convert_to_git(const char *path,
+				       const unsigned char *index_blob_sha1)
 {
-	return convert_to_git(path, NULL, 0, NULL, 0);
+	return convert_to_git(path, NULL, 0, NULL, SAFE_CRLF_FALSE,
+			      index_blob_sha1);
 }
+
 /* Precondition: would_convert_to_git_filter_fd(path) == true */
 extern void convert_to_git_filter_fd(const char *path, int fd,
 				     struct strbuf *dst,
-				     enum safe_crlf checksafe);
+				     enum safe_crlf checksafe,
+				     const unsigned char *index_blob_sha1);
+
 extern int would_convert_to_git_filter_fd(const char *path);
 
 /*****************************************************************
diff --git a/diff.c b/diff.c
index 059123c..b973469 100644
--- a/diff.c
+++ b/diff.c
@@ -2800,7 +2800,8 @@ int diff_populate_filespec(struct diff_filespec *s, unsigned int flags)
 		/*
 		 * Convert from working tree format to canonical git format
 		 */
-		if (convert_to_git(s->path, s->data, s->size, &buf, crlf_warn)) {
+		if (convert_to_git(s->path, s->data, s->size, &buf,
+				   crlf_warn, NULL)) {
 			size_t size = 0;
 			munmap(s->data, s->size);
 			s->should_munmap = 0;
diff --git a/dir.c b/dir.c
index a4a9d9f..8709fc0 100644
--- a/dir.c
+++ b/dir.c
@@ -720,7 +720,7 @@ static int add_excludes(const char *fname, const char *base, int baselen,
 				 (pos = cache_name_pos(fname, strlen(fname))) >= 0 &&
 				 !ce_stage(active_cache[pos]) &&
 				 ce_uptodate(active_cache[pos]) &&
-				 !would_convert_to_git(fname))
+				 !would_convert_to_git(fname, NULL))
 				hashcpy(sha1_stat->sha1, active_cache[pos]->sha1);
 			else
 				hash_sha1_file(buf, size, "blob", sha1_stat->sha1);
diff --git a/read-cache.c b/read-cache.c
index a3ef967..c109b6d 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -163,7 +163,9 @@ static int ce_compare_data(const struct cache_entry *ce, struct stat *st)
 
 	if (fd >= 0) {
 		unsigned char sha1[20];
-		if (!index_fd(sha1, fd, st, OBJ_BLOB, ce->name, 0))
+		unsigned flags = HASH_USE_SHA_NOT_PATH;
+		memcpy(sha1, ce->sha1, sizeof(sha1));
+		if (!index_fd(sha1, fd, st, OBJ_BLOB, ce->name, flags))
 			match = hashcmp(sha1, ce->sha1);
 		/* index_fd() closed the file descriptor already */
 	}
diff --git a/sha1_file.c b/sha1_file.c
index d0f2aa0..52e5c6f 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -3275,6 +3275,7 @@ static int index_mem(unsigned char *sha1, void *buf, size_t size,
 {
 	int ret, re_allocated = 0;
 	int write_object = flags & HASH_WRITE_OBJECT;
+	const int valid_sha1 = flags & HASH_USE_SHA_NOT_PATH;
 
 	if (!type)
 		type = OBJ_BLOB;
@@ -3285,7 +3286,8 @@ static int index_mem(unsigned char *sha1, void *buf, size_t size,
 	if ((type == OBJ_BLOB) && path) {
 		struct strbuf nbuf = STRBUF_INIT;
 		if (convert_to_git(path, buf, size, &nbuf,
-				   write_object ? safe_crlf : SAFE_CRLF_FALSE)) {
+				   write_object ? safe_crlf : SAFE_CRLF_FALSE,
+				   valid_sha1 ? sha1 : NULL)) {
 			buf = strbuf_detach(&nbuf, &size);
 			re_allocated = 1;
 		}
@@ -3313,13 +3315,15 @@ static int index_stream_convert_blob(unsigned char *sha1, int fd,
 {
 	int ret;
 	const int write_object = flags & HASH_WRITE_OBJECT;
+	const int valid_sha1 = flags & HASH_USE_SHA_NOT_PATH;
 	struct strbuf sbuf = STRBUF_INIT;
 
 	assert(path);
 	assert(would_convert_to_git_filter_fd(path));
 
 	convert_to_git_filter_fd(path, fd, &sbuf,
-				 write_object ? safe_crlf : SAFE_CRLF_FALSE);
+				 write_object ? safe_crlf : SAFE_CRLF_FALSE,
+				 valid_sha1 ? sha1 : NULL);
 
 	if (write_object)
 		ret = write_sha1_file(sbuf.buf, sbuf.len, typename(OBJ_BLOB),
@@ -3396,6 +3400,8 @@ int index_fd(unsigned char *sha1, int fd, struct stat *st,
 	     enum object_type type, const char *path, unsigned flags)
 {
 	int ret;
+	const unsigned char *sha1_ce;
+	sha1_ce = flags & HASH_USE_SHA_NOT_PATH ? sha1 : NULL;
 
 	/*
 	 * Call xsize_t() only when needed to avoid potentially unnecessary
@@ -3406,7 +3412,7 @@ int index_fd(unsigned char *sha1, int fd, struct stat *st,
 	else if (!S_ISREG(st->st_mode))
 		ret = index_pipe(sha1, fd, type, path, flags);
 	else if (st->st_size <= big_file_threshold || type != OBJ_BLOB ||
-		 (path && would_convert_to_git(path)))
+		 (path && would_convert_to_git(path,sha1_ce)))
 		ret = index_core(sha1, fd, xsize_t(st->st_size), type, path,
 				 flags);
 	else
diff --git a/t/t6038-merge-text-auto.sh b/t/t6038-merge-text-auto.sh
index 33b77ee..5e8d5fa 100755
--- a/t/t6038-merge-text-auto.sh
+++ b/t/t6038-merge-text-auto.sh
@@ -91,16 +91,13 @@ test_expect_success 'Merge after setting text=auto' '
 	compare_files expected file
 '
 
-test_expect_success 'Merge addition of text=auto' '
+test_expect_success 'Merge addition of text=auto eol=LF' '
+	git config core.eol lf &&
 	cat <<-\EOF >expected &&
 	first line
 	same line
 	EOF
 
-	if test_have_prereq NATIVE_CRLF; then
-		append_cr <expected >expected.temp &&
-		mv expected.temp expected
-	fi &&
 	git config merge.renormalize true &&
 	git rm -fr . &&
 	rm -f .gitattributes &&
@@ -109,17 +106,31 @@ test_expect_success 'Merge addition of text=auto' '
 	compare_files  expected file
 '
 
+test_expect_success 'Merge addition of text=auto eol=CRLF' '
+	git config core.eol crlf &&
+	cat <<-\EOF >expected &&
+	first line
+	same line
+	EOF
+
+	append_cr <expected >expected.temp &&
+	mv expected.temp expected &&
+	git config merge.renormalize true &&
+	git rm -fr . &&
+	rm -f .gitattributes &&
+	git reset --hard b &&
+	echo >&2 "After git reset --hard b" &&
+	git ls-files -s --eol >&2 &&
+	git merge a &&
+	compare_files  expected file
+'
+
 test_expect_success 'Detect CRLF/LF conflict after setting text=auto' '
+	git config core.eol native &&
 	echo "<<<<<<<" >expected &&
-	if test_have_prereq NATIVE_CRLF; then
-		echo first line | append_cr >>expected &&
-		echo same line | append_cr >>expected &&
-		echo ======= | append_cr >>expected
-	else
-		echo first line >>expected &&
-		echo same line >>expected &&
-		echo ======= >>expected
-	fi &&
+	echo first line >>expected &&
+	echo same line >>expected &&
+	echo ======= >>expected &&
 	echo first line | append_cr >>expected &&
 	echo same line | append_cr >>expected &&
 	echo ">>>>>>>" >>expected &&
@@ -135,15 +146,9 @@ test_expect_success 'Detect LF/CRLF conflict from addition of text=auto' '
 	echo "<<<<<<<" >expected &&
 	echo first line | append_cr >>expected &&
 	echo same line | append_cr >>expected &&
-	if test_have_prereq NATIVE_CRLF; then
-		echo ======= | append_cr >>expected &&
-		echo first line | append_cr >>expected &&
-		echo same line | append_cr >>expected
-	else
-		echo ======= >>expected &&
-		echo first line >>expected &&
-		echo same line >>expected
-	fi &&
+	echo ======= >>expected &&
+	echo first line >>expected &&
+	echo same line >>expected &&
 	echo ">>>>>>>" >>expected &&
 	git config merge.renormalize false &&
 	rm -f .gitattributes &&
-- 
2.0.0.rc1.6318.g0c2c796


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 3/3] correct ce_compare_data() in a middle of a merge
  2016-06-28  8:01 ` [PATCH v3 3/3] correct ce_compare_data() in a middle of a merge tboegi
@ 2016-06-29 16:14   ` Junio C Hamano
  2016-06-30 16:52     ` Torsten Bögershausen
  0 siblings, 1 reply; 24+ messages in thread
From: Junio C Hamano @ 2016-06-29 16:14 UTC (permalink / raw)
  To: tboegi; +Cc: git

tboegi@web.de writes:

> From: Torsten Bögershausen <tboegi@web.de>
>
> The following didn't work as expected:

Sorry for being slow (not in response but in understanding), but
let's examine the expectation first.

>  - In a middle of a merge
>  - merge.renormalize is true,

gitattributes(5) tells us that this would make a "virtual check-out
and check-in of all stages", i.e. each of the three blob objects
involved is run through convert_to_working_tree(), and the result is
run through convert_to_git().  Now, what are these two operations
told to do?

>  - .gitattributes = "* text=auto"
>  - core.eol = crlf

git-config(1) tells us that a text file will be checked out with
CRLF line endings, so convert_to_working_tree() would work that
way.  Even though core.eol entry in that manual page does not tell
us anything, presumably convert_to_git() is expected to turn it back
to LF line endings.

> Merge a blob with CRLF "first line\r\nsame line\r\n" and a blob
> with LF "first line\nsame line\n".

The former, when renormalized, should (we are discussing the
expectation, not necessarily what is done by today's code) be
processed like this:

 * Pretend as if blob "first line\r\nsame line\r\n" is in the stage
   #0 of the index at the path;
 * Pretend as if the user said "git checkout" that path;
 * Pretend as if the user then said "git add" that path.

The checkout process hopefully does not blindly turn LF into CRLF,
making it "first line \r\r\nsame line\r\rn".  Instead the (virtual)
working tree file will have "first line\r\nsame line\r\n".

Then "git add" should turn that CRLF into LF, which would give us
"first line\nsame line\n", but because the "safer autocrlf" rule
prevents it from happening.  The (real) index already has CR in it
in the blob in stage #2, so the check-in process would (this is not
describing the ideal, but what is done by today's code) disable
CRLF->LF conversion.

Is that the problem you are trying to solve?

If that is the case, I do not see how "don't use stage #2 of the
real index; use the blob being renormalized instead" would help.
The blob being renormalized may have CR in it, triggering that
"safer autocrlf" rule and cause you the same trouble, no?

To me, it appears that if you consider that the "safer autocrlf" is
a sensible thing, you _must_ expect that the renormalization would
not work at all, in the scenario you presented.  Also,

> The expected result of the merge is "first line\nsame line\n".

if you expect this, to me, it appears that you need to reject the
"safer autocrlf", at least during renormalization, as a sensible
thing.  And if you _are_ to disable the "safer autocrlf" thing, then
it does not matter what is currently in the index--the conversion
can happen solely based on the data being converted and the
configuration and attribute settings.

So I still do not see why you want to pass "no do not use stage #0
or stage #2; use this blob instead".  Shouldn't you be just passing
a bit "don't do the safer autocrlf thing in this convert_to_git()
call" from renormalization codepath without doing anything else?

> The content in the working tree is "first line\r\nsame line\r\n",
> and ce_compare_data() should find that the content is clean and return 0.
>
> Deep down crlf_to_git() is invoked, to check if CRLF are converted or not.
>
> The "new safer autocrlf handling" calls blob_has_cr().
>
> Instead of using the sha1 of the blob, (CRLF in this example),
> the function get_sha1_from_index() is invoked.
> get_sha1_from_index() decides to return "ours" when in the middle of
> the merge, which is LF.
>
> As a result, the CRLF in the worktree are converted into LF before
> the comparison.
> The contents of LF and CRLF don't match any more.
>
> The problem is that ce_compare_data() has ce->sha1, but the sha1 is lost
> on it's way into blob_has_cr().
>
> Forwarding ce->sha1 from ce_compare_data() into crlf_to_git() makes sure
> that blob_has_cr() looks at the appropriate blob.
>
> Add a new parameter index_blob_sha1 to convert_to_git(), and forward the
> sha1 from ce_compare_data() into convert_to_git(). Other callers use NULL
> for index_blob_sha1, and the sha1 is determined from path
> using get_sha1_from_cache(path). This is the same handling as before.
>
> In the same spirit, forward the sha1 into would_convert_to_git().
>
> While at it, rename has_cr_in_index() into blob_has_cr()
> and replace 0 with SAFE_CRLF_FALSE.
>
> Add a TC in t6038 to have a test coverage under Linux.
>
> Signed-off-by: Torsten Bögershausen <tboegi@web.de>
> ---
>  builtin/apply.c            |  3 ++-
>  builtin/blame.c            |  2 +-
>  cache.h                    |  1 +
>  combine-diff.c             |  3 ++-
>  convert.c                  | 43 ++++++++++++++++++++++++++------------
>  convert.h                  | 15 ++++++++++----
>  diff.c                     |  3 ++-
>  dir.c                      |  2 +-
>  read-cache.c               |  4 +++-
>  sha1_file.c                | 12 ++++++++---
>  t/t6038-merge-text-auto.sh | 51 +++++++++++++++++++++++++---------------------
>  11 files changed, 90 insertions(+), 49 deletions(-)
>
> diff --git a/builtin/apply.c b/builtin/apply.c
> index 42c610e..f33d9cf 100644
> --- a/builtin/apply.c
> +++ b/builtin/apply.c
> @@ -2142,7 +2142,8 @@ static int read_old_data(struct stat *st, const char *path, struct strbuf *buf)
>  	case S_IFREG:
>  		if (strbuf_read_file(buf, path, st->st_size) != st->st_size)
>  			return error(_("unable to open or read %s"), path);
> -		convert_to_git(path, buf->buf, buf->len, buf, 0);
> +		convert_to_git(path, buf->buf, buf->len, buf,
> +			       SAFE_CRLF_FALSE, NULL);
>  		return 0;
>  	default:
>  		return -1;
> diff --git a/builtin/blame.c b/builtin/blame.c
> index e982fb8..6305a1e 100644
> --- a/builtin/blame.c
> +++ b/builtin/blame.c
> @@ -2376,7 +2376,7 @@ static struct commit *fake_working_tree_commit(struct diff_options *opt,
>  		if (strbuf_read(&buf, 0, 0) < 0)
>  			die_errno("failed to read from stdin");
>  	}
> -	convert_to_git(path, buf.buf, buf.len, &buf, 0);
> +	convert_to_git(path, buf.buf, buf.len, &buf, SAFE_CRLF_FALSE, NULL);
>  	origin->file.ptr = buf.buf;
>  	origin->file.size = buf.len;
>  	pretend_sha1_file(buf.buf, buf.len, OBJ_BLOB, origin->blob_sha1);
> diff --git a/cache.h b/cache.h
> index bd1210a..42b74b6 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -605,6 +605,7 @@ extern int ie_modified(const struct index_state *, const struct cache_entry *, s
>  
>  #define HASH_WRITE_OBJECT 1
>  #define HASH_FORMAT_CHECK 2
> +#define HASH_USE_SHA_NOT_PATH 4
>  extern int index_fd(unsigned char *sha1, int fd, struct stat *st, enum object_type type, const char *path, unsigned flags);
>  extern int index_path(unsigned char *sha1, const char *path, struct stat *st, unsigned flags);
>  
> diff --git a/combine-diff.c b/combine-diff.c
> index 0e1d4b0..c4fa884 100644
> --- a/combine-diff.c
> +++ b/combine-diff.c
> @@ -1053,7 +1053,8 @@ static void show_patch_diff(struct combine_diff_path *elem, int num_parent,
>  			if (is_file) {
>  				struct strbuf buf = STRBUF_INIT;
>  
> -				if (convert_to_git(elem->path, result, len, &buf, safe_crlf)) {
> +				if (convert_to_git(elem->path, result, len,
> +						   &buf, safe_crlf, NULL)) {
>  					free(result);
>  					result = strbuf_detach(&buf, &len);
>  					result_size = len;
> diff --git a/convert.c b/convert.c
> index 67d69b5..802ee7c 100644
> --- a/convert.c
> +++ b/convert.c
> @@ -219,23 +219,28 @@ static void check_safe_crlf(const char *path, enum crlf_action crlf_action,
>  	}
>  }
>  
> -static int has_cr_in_index(const char *path)
> +static int blob_has_cr(const unsigned char *index_blob_sha1)
>  {
>  	unsigned long sz;
>  	void *data;
> -	int has_cr;
> -
> -	data = read_blob_data_from_cache(path, &sz);
> +	int has_cr = 0;
> +	enum object_type type;
> +	if (!index_blob_sha1)
> +		return 0;
> +	data = read_sha1_file(index_blob_sha1, &type, &sz);
>  	if (!data)
>  		return 0;
> -	has_cr = memchr(data, '\r', sz) != NULL;
> +	if (type == OBJ_BLOB)
> +		has_cr = memchr(data, '\r', sz) != NULL;
> +
>  	free(data);
>  	return has_cr;
>  }
>  
>  static int crlf_to_git(const char *path, const char *src, size_t len,
>  		       struct strbuf *buf,
> -		       enum crlf_action crlf_action, enum safe_crlf checksafe)
> +		       enum crlf_action crlf_action, enum safe_crlf checksafe,
> +		       const unsigned char *index_blob_sha1)
>  {
>  	struct text_stat stats;
>  	char *dst;
> @@ -256,14 +261,23 @@ static int crlf_to_git(const char *path, const char *src, size_t len,
>  	if (crlf_action == CRLF_AUTO || crlf_action == CRLF_AUTO_INPUT || crlf_action == CRLF_AUTO_CRLF) {
>  		if (convert_is_binary(len, &stats))
>  			return 0;
> +
>  		/*
>  		 * If the file in the index has any CR in it, do not convert.
>  		 * This is the new safer autocrlf handling.
>  		 */
>  		if (checksafe == SAFE_CRLF_RENORMALIZE)
>  			checksafe = SAFE_CRLF_FALSE;
> -		else if (has_cr_in_index(path))
> -			return 0;
> +		else {
> +			/*
> +			 * If the file in the index has any CR in it, do not convert.
> +			 * This is the new safer autocrlf handling.
> +			 */
> +			if (!index_blob_sha1)
> +				index_blob_sha1 = get_sha1_from_cache(path);
> +			if (blob_has_cr(index_blob_sha1))
> +				return 0;
> +		}
>  	}
>  	check_safe_crlf(path, crlf_action, &stats, checksafe);
>  
> @@ -855,7 +869,8 @@ const char *get_convert_attr_ascii(const char *path)
>  }
>  
>  int convert_to_git(const char *path, const char *src, size_t len,
> -                   struct strbuf *dst, enum safe_crlf checksafe)
> +		   struct strbuf *dst, enum safe_crlf checksafe,
> +		   const unsigned char *index_blob_sha1)
>  {
>  	int ret = 0;
>  	const char *filter = NULL;
> @@ -876,7 +891,7 @@ int convert_to_git(const char *path, const char *src, size_t len,
>  		src = dst->buf;
>  		len = dst->len;
>  	}
> -	ret |= crlf_to_git(path, src, len, dst, ca.crlf_action, checksafe);
> +	ret |= crlf_to_git(path, src, len, dst, ca.crlf_action, checksafe, index_blob_sha1);
>  	if (ret && dst) {
>  		src = dst->buf;
>  		len = dst->len;
> @@ -885,7 +900,8 @@ int convert_to_git(const char *path, const char *src, size_t len,
>  }
>  
>  void convert_to_git_filter_fd(const char *path, int fd, struct strbuf *dst,
> -			      enum safe_crlf checksafe)
> +			      enum safe_crlf checksafe,
> +			      const unsigned char *index_blob_sha1)
>  {
>  	struct conv_attrs ca;
>  	convert_attrs(&ca, path);
> @@ -896,7 +912,8 @@ void convert_to_git_filter_fd(const char *path, int fd, struct strbuf *dst,
>  	if (!apply_filter(path, NULL, 0, fd, dst, ca.drv->clean))
>  		die("%s: clean filter '%s' failed", path, ca.drv->name);
>  
> -	crlf_to_git(path, dst->buf, dst->len, dst, ca.crlf_action, checksafe);
> +	crlf_to_git(path, dst->buf, dst->len, dst, ca.crlf_action,
> +		    checksafe, index_blob_sha1);
>  	ident_to_git(path, dst->buf, dst->len, dst, ca.ident);
>  }
>  
> @@ -951,7 +968,7 @@ int renormalize_buffer(const char *path, const char *src, size_t len, struct str
>  		src = dst->buf;
>  		len = dst->len;
>  	}
> -	return ret | convert_to_git(path, src, len, dst, SAFE_CRLF_RENORMALIZE);
> +	return ret | convert_to_git(path, src, len, dst, SAFE_CRLF_RENORMALIZE, NULL);
>  }
>  
>  /*****************************************************************
> diff --git a/convert.h b/convert.h
> index 82871a1..97d4272 100644
> --- a/convert.h
> +++ b/convert.h
> @@ -39,19 +39,26 @@ extern const char *get_convert_attr_ascii(const char *path);
>  
>  /* returns 1 if *dst was used */
>  extern int convert_to_git(const char *path, const char *src, size_t len,
> -			  struct strbuf *dst, enum safe_crlf checksafe);
> +			  struct strbuf *dst, enum safe_crlf checksafe,
> +			  const unsigned char *index_blob_sha1);
> +
>  extern int convert_to_working_tree(const char *path, const char *src,
>  				   size_t len, struct strbuf *dst);
>  extern int renormalize_buffer(const char *path, const char *src, size_t len,
>  			      struct strbuf *dst);
> -static inline int would_convert_to_git(const char *path)
> +static inline int would_convert_to_git(const char *path,
> +				       const unsigned char *index_blob_sha1)
>  {
> -	return convert_to_git(path, NULL, 0, NULL, 0);
> +	return convert_to_git(path, NULL, 0, NULL, SAFE_CRLF_FALSE,
> +			      index_blob_sha1);
>  }
> +
>  /* Precondition: would_convert_to_git_filter_fd(path) == true */
>  extern void convert_to_git_filter_fd(const char *path, int fd,
>  				     struct strbuf *dst,
> -				     enum safe_crlf checksafe);
> +				     enum safe_crlf checksafe,
> +				     const unsigned char *index_blob_sha1);
> +
>  extern int would_convert_to_git_filter_fd(const char *path);
>  
>  /*****************************************************************
> diff --git a/diff.c b/diff.c
> index 059123c..b973469 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -2800,7 +2800,8 @@ int diff_populate_filespec(struct diff_filespec *s, unsigned int flags)
>  		/*
>  		 * Convert from working tree format to canonical git format
>  		 */
> -		if (convert_to_git(s->path, s->data, s->size, &buf, crlf_warn)) {
> +		if (convert_to_git(s->path, s->data, s->size, &buf,
> +				   crlf_warn, NULL)) {
>  			size_t size = 0;
>  			munmap(s->data, s->size);
>  			s->should_munmap = 0;
> diff --git a/dir.c b/dir.c
> index a4a9d9f..8709fc0 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -720,7 +720,7 @@ static int add_excludes(const char *fname, const char *base, int baselen,
>  				 (pos = cache_name_pos(fname, strlen(fname))) >= 0 &&
>  				 !ce_stage(active_cache[pos]) &&
>  				 ce_uptodate(active_cache[pos]) &&
> -				 !would_convert_to_git(fname))
> +				 !would_convert_to_git(fname, NULL))
>  				hashcpy(sha1_stat->sha1, active_cache[pos]->sha1);
>  			else
>  				hash_sha1_file(buf, size, "blob", sha1_stat->sha1);
> diff --git a/read-cache.c b/read-cache.c
> index a3ef967..c109b6d 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -163,7 +163,9 @@ static int ce_compare_data(const struct cache_entry *ce, struct stat *st)
>  
>  	if (fd >= 0) {
>  		unsigned char sha1[20];
> -		if (!index_fd(sha1, fd, st, OBJ_BLOB, ce->name, 0))
> +		unsigned flags = HASH_USE_SHA_NOT_PATH;
> +		memcpy(sha1, ce->sha1, sizeof(sha1));
> +		if (!index_fd(sha1, fd, st, OBJ_BLOB, ce->name, flags))
>  			match = hashcmp(sha1, ce->sha1);
>  		/* index_fd() closed the file descriptor already */
>  	}
> diff --git a/sha1_file.c b/sha1_file.c
> index d0f2aa0..52e5c6f 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -3275,6 +3275,7 @@ static int index_mem(unsigned char *sha1, void *buf, size_t size,
>  {
>  	int ret, re_allocated = 0;
>  	int write_object = flags & HASH_WRITE_OBJECT;
> +	const int valid_sha1 = flags & HASH_USE_SHA_NOT_PATH;
>  
>  	if (!type)
>  		type = OBJ_BLOB;
> @@ -3285,7 +3286,8 @@ static int index_mem(unsigned char *sha1, void *buf, size_t size,
>  	if ((type == OBJ_BLOB) && path) {
>  		struct strbuf nbuf = STRBUF_INIT;
>  		if (convert_to_git(path, buf, size, &nbuf,
> -				   write_object ? safe_crlf : SAFE_CRLF_FALSE)) {
> +				   write_object ? safe_crlf : SAFE_CRLF_FALSE,
> +				   valid_sha1 ? sha1 : NULL)) {
>  			buf = strbuf_detach(&nbuf, &size);
>  			re_allocated = 1;
>  		}
> @@ -3313,13 +3315,15 @@ static int index_stream_convert_blob(unsigned char *sha1, int fd,
>  {
>  	int ret;
>  	const int write_object = flags & HASH_WRITE_OBJECT;
> +	const int valid_sha1 = flags & HASH_USE_SHA_NOT_PATH;
>  	struct strbuf sbuf = STRBUF_INIT;
>  
>  	assert(path);
>  	assert(would_convert_to_git_filter_fd(path));
>  
>  	convert_to_git_filter_fd(path, fd, &sbuf,
> -				 write_object ? safe_crlf : SAFE_CRLF_FALSE);
> +				 write_object ? safe_crlf : SAFE_CRLF_FALSE,
> +				 valid_sha1 ? sha1 : NULL);
>  
>  	if (write_object)
>  		ret = write_sha1_file(sbuf.buf, sbuf.len, typename(OBJ_BLOB),
> @@ -3396,6 +3400,8 @@ int index_fd(unsigned char *sha1, int fd, struct stat *st,
>  	     enum object_type type, const char *path, unsigned flags)
>  {
>  	int ret;
> +	const unsigned char *sha1_ce;
> +	sha1_ce = flags & HASH_USE_SHA_NOT_PATH ? sha1 : NULL;
>  
>  	/*
>  	 * Call xsize_t() only when needed to avoid potentially unnecessary
> @@ -3406,7 +3412,7 @@ int index_fd(unsigned char *sha1, int fd, struct stat *st,
>  	else if (!S_ISREG(st->st_mode))
>  		ret = index_pipe(sha1, fd, type, path, flags);
>  	else if (st->st_size <= big_file_threshold || type != OBJ_BLOB ||
> -		 (path && would_convert_to_git(path)))
> +		 (path && would_convert_to_git(path,sha1_ce)))
>  		ret = index_core(sha1, fd, xsize_t(st->st_size), type, path,
>  				 flags);
>  	else
> diff --git a/t/t6038-merge-text-auto.sh b/t/t6038-merge-text-auto.sh
> index 33b77ee..5e8d5fa 100755
> --- a/t/t6038-merge-text-auto.sh
> +++ b/t/t6038-merge-text-auto.sh
> @@ -91,16 +91,13 @@ test_expect_success 'Merge after setting text=auto' '
>  	compare_files expected file
>  '
>  
> -test_expect_success 'Merge addition of text=auto' '
> +test_expect_success 'Merge addition of text=auto eol=LF' '
> +	git config core.eol lf &&
>  	cat <<-\EOF >expected &&
>  	first line
>  	same line
>  	EOF
>  
> -	if test_have_prereq NATIVE_CRLF; then
> -		append_cr <expected >expected.temp &&
> -		mv expected.temp expected
> -	fi &&
>  	git config merge.renormalize true &&
>  	git rm -fr . &&
>  	rm -f .gitattributes &&
> @@ -109,17 +106,31 @@ test_expect_success 'Merge addition of text=auto' '
>  	compare_files  expected file
>  '
>  
> +test_expect_success 'Merge addition of text=auto eol=CRLF' '
> +	git config core.eol crlf &&
> +	cat <<-\EOF >expected &&
> +	first line
> +	same line
> +	EOF
> +
> +	append_cr <expected >expected.temp &&
> +	mv expected.temp expected &&
> +	git config merge.renormalize true &&
> +	git rm -fr . &&
> +	rm -f .gitattributes &&
> +	git reset --hard b &&
> +	echo >&2 "After git reset --hard b" &&
> +	git ls-files -s --eol >&2 &&
> +	git merge a &&
> +	compare_files  expected file
> +'
> +
>  test_expect_success 'Detect CRLF/LF conflict after setting text=auto' '
> +	git config core.eol native &&
>  	echo "<<<<<<<" >expected &&
> -	if test_have_prereq NATIVE_CRLF; then
> -		echo first line | append_cr >>expected &&
> -		echo same line | append_cr >>expected &&
> -		echo ======= | append_cr >>expected
> -	else
> -		echo first line >>expected &&
> -		echo same line >>expected &&
> -		echo ======= >>expected
> -	fi &&
> +	echo first line >>expected &&
> +	echo same line >>expected &&
> +	echo ======= >>expected &&
>  	echo first line | append_cr >>expected &&
>  	echo same line | append_cr >>expected &&
>  	echo ">>>>>>>" >>expected &&
> @@ -135,15 +146,9 @@ test_expect_success 'Detect LF/CRLF conflict from addition of text=auto' '
>  	echo "<<<<<<<" >expected &&
>  	echo first line | append_cr >>expected &&
>  	echo same line | append_cr >>expected &&
> -	if test_have_prereq NATIVE_CRLF; then
> -		echo ======= | append_cr >>expected &&
> -		echo first line | append_cr >>expected &&
> -		echo same line | append_cr >>expected
> -	else
> -		echo ======= >>expected &&
> -		echo first line >>expected &&
> -		echo same line >>expected
> -	fi &&
> +	echo ======= >>expected &&
> +	echo first line >>expected &&
> +	echo same line >>expected &&
>  	echo ">>>>>>>" >>expected &&
>  	git config merge.renormalize false &&
>  	rm -f .gitattributes &&

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 3/3] correct ce_compare_data() in a middle of a merge
  2016-06-29 16:14   ` Junio C Hamano
@ 2016-06-30 16:52     ` Torsten Bögershausen
  2016-07-01 22:11       ` Junio C Hamano
  0 siblings, 1 reply; 24+ messages in thread
From: Torsten Bögershausen @ 2016-06-30 16:52 UTC (permalink / raw)
  To: Junio C Hamano, tboegi; +Cc: git

On 29.06.16 18:14, Junio C Hamano wrote:
> tboegi@web.de writes:
> 
>> From: Torsten Bögershausen <tboegi@web.de>
>>
>> The following didn't work as expected:
> 
> Sorry for being slow (not in response but in understanding), but
> let's examine the expectation first.

Thanks for the patience.
There is one detail missing in my description:
The check if a file in the working tree is clean:

static int ce_compare_data(const struct cache_entry *ce, struct stat *st)
{
	int match = -1;
	int fd = open(ce->name, O_RDONLY);

	if (fd >= 0) {
		unsigned char sha1[20];
		unsigned flags = HASH_USE_SHA_NOT_PATH;
------------------
Remove the "new feature":

git diff  read-cache.c
diff --git a/read-cache.c b/read-cache.c
index dd98b36..1f951ea 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -170,7 +170,7 @@ static int ce_compare_data(const struct cache_entry *ce, struct stat *st)
 
        if (fd >= 0) {
                unsigned char sha1[20];
-               unsigned flags = HASH_USE_SHA_NOT_PATH;
+               unsigned flags = 0;//HASH_USE_SHA_NOT_PATH;
                memcpy(sha1, ce->sha1, sizeof(sha1));
                if (!index_fd(sha1, fd, st, OBJ_BLOB, ce->name, flags))
                        match = hashcmp(sha1, ce->sha1);

The the problem can be re-produced, and debugged with help of t6038:

not ok 5 - Merge addition of text=auto eol=CRLF


> 
>>  - In a middle of a merge
>>  - merge.renormalize is true,
> 
> gitattributes(5) tells us that this would make a "virtual check-out
> and check-in of all stages", i.e. each of the three blob objects
> involved is run through convert_to_working_tree(), and the result is
> run through convert_to_git().  Now, what are these two operations
> told to do?
> 
>>  - .gitattributes = "* text=auto"
>>  - core.eol = crlf
> 
> git-config(1) tells us that a text file will be checked out with
> CRLF line endings, so convert_to_working_tree() would work that
> way.  Even though core.eol entry in that manual page does not tell
> us anything, presumably convert_to_git() is expected to turn it back
> to LF line endings.
> 
Yes, the git-config(1) may need improvements, but that can go as a later path.

>> Merge a blob with CRLF "first line\r\nsame line\r\n" and a blob
>> with LF "first line\nsame line\n".
I may have written:
Merge a blob with CRLF "first line\r\nsame line\r\n" 
(which is checked out as
"first line\r\nsame line\r\n" 
in the working tree, and therefore clean)
and a blob with LF "first line\nsame line\n".

> 
> The former, when renormalized, should (we are discussing the
> expectation, not necessarily what is done by today's code) be
> processed like this:
> 
>  * Pretend as if blob "first line\r\nsame line\r\n" is in the stage
>    #0 of the index at the path;
>  * Pretend as if the user said "git checkout" that path;
>  * Pretend as if the user then said "git add" that path.
> 
> The checkout process hopefully does not blindly turn LF into CRLF,
> making it "first line \r\r\nsame line\r\rn".  Instead the (virtual)
> working tree file will have "first line\r\nsame line\r\n".
Yes
> 
> Then "git add" should turn that CRLF into LF, which would give us
> "first line\nsame line\n", but because the "safer autocrlf" rule
> prevents it from happening.  The (real) index already has CR in it
> in the blob in stage #2, so the check-in process would (this is not
> describing the ideal, but what is done by today's code) disable
> CRLF->LF conversion.
> 
> Is that the problem you are trying to solve?
Not sure, if that problem isn't already solved.

> 
> If that is the case, I do not see how "don't use stage #2 of the
> real index; use the blob being renormalized instead" would help.
> The blob being renormalized may have CR in it, triggering that
> "safer autocrlf" rule and cause you the same trouble, no?
> 
> To me, it appears that if you consider that the "safer autocrlf" is
> a sensible thing, you _must_ expect that the renormalization would
> not work at all, in the scenario you presented.  Also,
> 
>> The expected result of the merge is "first line\nsame line\n".
> 
> if you expect this, to me, it appears that you need to reject the
> "safer autocrlf", at least during renormalization, as a sensible
> thing.  And if you _are_ to disable the "safer autocrlf" thing, then
> it does not matter what is currently in the index--the conversion
> can happen solely based on the data being converted and the
> configuration and attribute settings.
> 
> So I still do not see why you want to pass "no do not use stage #0
> or stage #2; use this blob instead".  Shouldn't you be just passing
> a bit "don't do the safer autocrlf thing in this convert_to_git()
> call" from renormalization codepath without doing anything else?
> 

This is my understanding:
- git checkout and git add are working as expected:
  LF in blob gives CRLF in the working tree at checkout (if attr says "auto")
  CRLF in blob gives CRLF in the working tree at checkout.
  CRLF in the working tree gives CRLF in the blob at "git add", when
  the new safer CRLF handling says so.

- Merges
 First the working tree is checked, if it is clean.
 I haven't digged deep enough to follow the whole code path,
 but that is what ce_compare_data() does, and it fails in a merge.

 Next thing is that the blobs are merged, and when a blob with CRLF
 is merged with a blob with LF, there are only conflicts :-(
 (Remember that both may have CRLF in the worktree)
 And here renormalize_buffer() comes in (only when "merge.renormalize" is true):
 - convert the blob with convert_to_working_tree_internal() into the working
    tree format (but do it in memory), and back to the "blob":
 - convert_to_git(path, src, len, dst, SAFE_CRLF_RENORMALIZE, NULL);

 Note the the "new safer autocrlf" is suppressed here, because of SAFE_CRLF_RENORMALIZE.
 The result is, that both blobs are now normalized, having LF.
 After that, they are merged.

This is my understanding: there are 2 different things involved in t6038#5:
- check for a clean worktree in ce_compare_data() # fails -> this is the problem to be solved
- When merge.renormalize is true: renormalize the buffers before the blobs are merged # -> works





^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 3/3] correct ce_compare_data() in a middle of a merge
  2016-06-30 16:52     ` Torsten Bögershausen
@ 2016-07-01 22:11       ` Junio C Hamano
  2016-07-02 18:41         ` Torsten Bögershausen
  0 siblings, 1 reply; 24+ messages in thread
From: Junio C Hamano @ 2016-07-01 22:11 UTC (permalink / raw)
  To: Torsten Bögershausen; +Cc: git

Torsten Bögershausen <tboegi@web.de> writes:

>> The checkout process hopefully does not blindly turn LF into CRLF,
>> making it "first line \r\r\nsame line\r\rn".  Instead the (virtual)
>> working tree file will have "first line\r\nsame line\r\n".
> Yes
>> 
>> Then "git add" should turn that CRLF into LF, which would give us
>> "first line\nsame line\n", but because the "safer autocrlf" rule
>> prevents it from happening.  The (real) index already has CR in it
>> in the blob in stage #2, so the check-in process would (this is not
>> describing the ideal, but what is done by today's code) disable
>> CRLF->LF conversion.
>> 
>> Is that the problem you are trying to solve?
> Not sure, if that problem isn't already solved.

OK, so in short, renormalizing three blobs in the conflicted index
is done correcte because you do not allow safer-crlf kicks in during
the renormalization?

So how does ce_compare_data() judge "first line\r\nsame line\r\n"
in stage #2 and working tree (both with CRLF) not to match?  The
CRLF in working tree data, when run convert_to_git(), will be kept
as CRLF because of safer-crlf rule, and would match, no?  Or do you
disable safer-crlf for that case, too?

> This is my understanding:
> - git checkout and git add are working as expected:
>   LF in blob gives CRLF in the working tree at checkout (if attr says "auto")
>   CRLF in blob gives CRLF in the working tree at checkout.

OK.
>   CRLF in the working tree gives CRLF in the blob at "git add", when
>   the new safer CRLF handling says so.

OK.

> - Merges
>  First the working tree is checked, if it is clean.
>  I haven't digged deep enough to follow the whole code path,
>  but that is what ce_compare_data() does, and it fails in a merge.

OK, I misunderstood what you are trying to fix.  I somehow thought
that you wanted to tweak the renormalization of three blobs.

>  Next thing is that the blobs are merged, and when a blob with CRLF
>  is merged with a blob with LF, there are only conflicts :-(
>  (Remember that both may have CRLF in the worktree)
>  And here renormalize_buffer() comes in (only when "merge.renormalize" is true):
>  - convert the blob with convert_to_working_tree_internal() into the working
>     tree format (but do it in memory), and back to the "blob":
>  - convert_to_git(path, src, len, dst, SAFE_CRLF_RENORMALIZE, NULL);
>
>  Note the the "new safer autocrlf" is suppressed here, because of SAFE_CRLF_RENORMALIZE.

That makes sense.

> - check for a clean worktree in ce_compare_data() # fails -> this is the problem to be solved

But it is unclear to me how that need an extra "no, do not look at
the indexed blob, but look at this blob instead"?

The relevant part is this:

diff --git a/read-cache.c b/read-cache.c
index a3ef967..c109b6d 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -163,7 +163,9 @@ static int ce_compare_data(const struct cache_entry *ce, struct stat *st)
 
 	if (fd >= 0) {
 		unsigned char sha1[20];
-		if (!index_fd(sha1, fd, st, OBJ_BLOB, ce->name, 0))
+		unsigned flags = HASH_USE_SHA_NOT_PATH;
+		memcpy(sha1, ce->sha1, sizeof(sha1));
+		if (!index_fd(sha1, fd, st, OBJ_BLOB, ce->name, flags))
 			match = hashcmp(sha1, ce->sha1);
 		/* index_fd() closed the file descriptor already */
 	}

I do not quite see how "Don't use ce->sha1, but use sha1 given to
you" flag affects this call, when sha1 and ce->sha1 are the same due
to memcpy() just before the call?


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 3/3] correct ce_compare_data() in a middle of a merge
  2016-07-01 22:11       ` Junio C Hamano
@ 2016-07-02 18:41         ` Torsten Bögershausen
  2016-07-06 14:57           ` Junio C Hamano
  0 siblings, 1 reply; 24+ messages in thread
From: Torsten Bögershausen @ 2016-07-02 18:41 UTC (permalink / raw)
  To: Junio C Hamano, Torsten Bögershausen; +Cc: git

On 2016-07-02 00.11, Junio C Hamano wrote:
[snip]
> diff --git a/read-cache.c b/read-cache.c
> index a3ef967..c109b6d 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -163,7 +163,9 @@ static int ce_compare_data(const struct cache_entry *ce, struct stat *st)
>
>  	if (fd >= 0) {
>  		unsigned char sha1[20];
> -		if (!index_fd(sha1, fd, st, OBJ_BLOB, ce->name, 0))
> +		unsigned flags = HASH_USE_SHA_NOT_PATH;
> +		memcpy(sha1, ce->sha1, sizeof(sha1));
> +		if (!index_fd(sha1, fd, st, OBJ_BLOB, ce->name, flags))
>  			match = hashcmp(sha1, ce->sha1);
>  		/* index_fd() closed the file descriptor already */
>  	}
>
> I do not quite see how "Don't use ce->sha1, but use sha1 given to
> you" flag affects this call, when sha1 and ce->sha1 are the same due
> to memcpy() just before the call?
>

Lets step back a second. and try to explain whats going on.

Do a merge (and renormalize the files).

At some point inside the merge, Git calls read-cache.c/ce_compare_data(),
   to check if the path named "file" is clean.
   According to the tree information, the path "file" has the sha1 99b633.
   #Note:
   #sha1 99b633 is "first line\nsame line\n"

ce_compare_data() starts the work:
OK, lets run index_fd(...,ce->name,...)
# index_fd will simulate a "git add"  and return the sha1 (via the sha1 pointer)
# after the hashing.

# ce_compare_data() will later compare ce->sha1 with the result stored in
# the local sha1. That's why a sha1 is in the parameter list.
# To return the resulting hash:

ce_compare_data() calls index_fd(sha1, ..., ce->name, ...)

#Down in index_fd():

sha1_file.c/index_fd() calls index_core() (after consulting the attributes)
index_core() reads the file from the working tree into memory using
read_in_full(), and after that index_mem() calls hash_sha1_file(buf)
to calculate the sha1.

Before the hashing is done, index_mem() runs
convert_to_git(path, buf, size, &nbuf, SAFE_CRLF_FALSE)
to convert  "blobs to git internal format".


Here, convert_to_git() consults the .gitattributes (again) to find out that
crlf_action is CRLF_AUTO in our case.
The "new safer autocrlf handling" says that if a blob as any CR, don't convert
CRLFs at "git add".

convert.c/crlf_to_git() starts to do it's job:
- look at buf (It has CRLF, conversion may be needed)
- consult blob_has_cr()
   # Note: Before this patch, has_cr_in_index(path)) was used

#Again, before this patch,
has_cr_in_index(path) calls read_blob_data_from_cache(path, &sz) to read the
blob into memory.

Now read_blob_data_from_cache() is a macro, and we end up in
read_blob_data_from_index(&the_index, (path), (sz))

read-cache.c/read_blob_data_from_index() starts its work:
	pos = index_name_pos(istate, path, len);
	if (pos < 0) {
		/*
		 * We might be in the middle of a merge, in which
		 * case we would read stage #2 (ours).
		 */

# And here, and this is important to notice, "ours" is sha1 ad55e2,
# which corresponds to "first line\r\nsame line\r\n"
# On our way in the callstack the information what to check had been lost:
# blob 99b633 and nothing else.
# read_blob_data_from_index() doesn't know the sha, but makes a guess,
# derived from path.
# The guess works OK for most callers: to take "ours" in a middle
# of a merge, but not here.

#Back to crlf_to_git():
The (wrong) blob has CRs in it, so crlf_to_git() decides not to convert CRLFs,
(and this is wrong).

# And now we go all the way back in the call chain:

The content in the worktree which is "first line\r\nsame line\r\n" is hashed:
hash_sha1_file(buf, size, typename(type), sha1);
#and the result is stored in the return pointer "sha1": ad55e2

#Back to ce_compare_data():
match = hashcmp(sha1, ce->sha1);

#And here sha1=ad55e2 != ce->sha199b633

The whole merge is aborted:

    error: addinfo_cache failed for path 'file'
     file: unmerged (cbd69ec7cd12dd0989e853923867d94c8519aa52)
     file: unmerged (ad55e240aeb42e0d9a0e18d6d8b02dd82ee3e527)
     file: unmerged (99b633103c15c20cebebf821133ab526b0ff90b2)
     fatal: git write-tree failed to write a tree
#Side note: cbd69ec is another file in the working tree.

#######################

With this patch,
ce_compare_data() feeds the sha1 99b633 into
blob_has_cr(const unsigned char *index_blob_sha1).

crlf_to_git() says:
"first line\r\nsame line\r\n" in the worktree needs to be converted
into
"first line\nsame line\n" as the "new" blob.

The new (converted) blob "first line\nsame line\n"
is hashed into 99b633, so ce_compare_data() says
fine, converting "first line\r\nsame line\r\n" will give sha1 99b633,
lets continue with the merge, and do the normalization.

# Note1:
# The renormalization is outside what this patch fixes.
# But I didn't manage to construct a test case, which triggered
# "fatal: git write-tree failed to write a tree" without using
# merge.renormalize

# Note2:
# ce_compare_data() has (if I understand things right)
# no knowledge that a merge is going on
# ce_compare_data() has now knowledge that a merge
# with a merge with merge.renormalize=true is going on either.

#Note3:
# I hope that this explanation makes it more clear,
# why ce->sha1 must be forwarded into blob_has_cr() (to check the right blob)
# and that ce->name must be forwarded as path into crlf_to_git()
# (to check the .gitattributes)


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 3/3] correct ce_compare_data() in a middle of a merge
  2016-07-02 18:41         ` Torsten Bögershausen
@ 2016-07-06 14:57           ` Junio C Hamano
  2016-07-07 17:16             ` Torsten Bögershausen
  0 siblings, 1 reply; 24+ messages in thread
From: Junio C Hamano @ 2016-07-06 14:57 UTC (permalink / raw)
  To: Torsten Bögershausen; +Cc: git

Torsten Bögershausen <tboegi@web.de> writes:

> At some point inside the merge, Git calls read-cache.c/ce_compare_data(),
>   to check if the path named "file" is clean.
>   According to the tree information, the path "file" has the sha1 99b633.
>   #Note:
>   #sha1 99b633 is "first line\nsame line\n"

I thought your scenario was that our side had CRLF both in the blob
and in the working tree.  So this is a different example?  Our side
has LF in the blob that is checked out with CRLF in the working tree,
and their side has CRLF in the blob?

> ce_compare_data() starts the work:
> OK, lets run index_fd(...,ce->name,...)
> # index_fd will simulate a "git add"  and return the sha1 (via the sha1 pointer)
> # after the hashing.
>
> # ce_compare_data() will later compare ce->sha1 with the result stored in
> # the local sha1. That's why a sha1 is in the parameter list.
> # To return the resulting hash:
>
> ce_compare_data() calls index_fd(sha1, ..., ce->name, ...)
>
> #Down in index_fd():
>
> sha1_file.c/index_fd() calls index_core() (after consulting the attributes)
> index_core() reads the file from the working tree into memory using
> read_in_full(), and after that index_mem() calls hash_sha1_file(buf)
> to calculate the sha1.
>
> Before the hashing is done, index_mem() runs
> convert_to_git(path, buf, size, &nbuf, SAFE_CRLF_FALSE)
> to convert  "blobs to git internal format".
>
>
> Here, convert_to_git() consults the .gitattributes (again) to find out that
> crlf_action is CRLF_AUTO in our case.
> The "new safer autocrlf handling" says that if a blob as any CR, don't convert
> CRLFs at "git add".
>
> convert.c/crlf_to_git() starts to do it's job:
> - look at buf (It has CRLF, conversion may be needed)
> - consult blob_has_cr()
>   # Note: Before this patch, has_cr_in_index(path)) was used
>
> #Again, before this patch,
> has_cr_in_index(path) calls read_blob_data_from_cache(path, &sz) to read the
> blob into memory.
>
> Now read_blob_data_from_cache() is a macro, and we end up in
> read_blob_data_from_index(&the_index, (path), (sz))
>
> read-cache.c/read_blob_data_from_index() starts its work:
> 	pos = index_name_pos(istate, path, len);
> 	if (pos < 0) {
> 		/*
> 		 * We might be in the middle of a merge, in which
> 		 * case we would read stage #2 (ours).
> 		 */
>
> # And here, and this is important to notice, "ours" is sha1 ad55e2,
> # which corresponds to "first line\r\nsame line\r\n"

Where did 99b633 come from then?  There still is something missing
in this description.

Puzzled...

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 3/3] correct ce_compare_data() in a middle of a merge
  2016-07-06 14:57           ` Junio C Hamano
@ 2016-07-07 17:16             ` Torsten Bögershausen
  2016-07-07 18:43               ` Junio C Hamano
  0 siblings, 1 reply; 24+ messages in thread
From: Torsten Bögershausen @ 2016-07-07 17:16 UTC (permalink / raw)
  To: Junio C Hamano, Torsten Bögershausen; +Cc: git

On 2016-07-06 16.57, Junio C Hamano wrote:
> Torsten Bögershausen <tboegi@web.de> writes:
> 
>> At some point inside the merge, Git calls read-cache.c/ce_compare_data(),
>>   to check if the path named "file" is clean.
>>   According to the tree information, the path "file" has the sha1 99b633.
>>   #Note:
>>   #sha1 99b633 is "first line\nsame line\n"
> 
> I thought your scenario was that our side had CRLF both in the blob
> and in the working tree.  So this is a different example?  Our side
> has LF in the blob that is checked out with CRLF in the working tree,
> and their side has CRLF in the blob?
> 
That was probably to confuse myself, and the rest of the world,
sorry for confusion.

This is the callstack:

merge-recursive.c/add_cacheinfo(unsigned int mode, const unsigned char *sha1,
		const char *path, int stage, int refresh, int options)
{
	struct cache_entry *ce;
	ce = make_cache_entry
	if (!ce)
		return error(_("addinfo_cache failed for path '%s'"), path);
	return add_cache_entry(ce, options);
}
#Calls
read-cache.c/make_cache_entry(path=file sha1=0x99b633 stage=0)


struct cache_entry *make_cache_entry(unsigned int mode,
		const unsigned char *sha1, const char *path, int stage,
		unsigned int refresh_options)
{
        [snip]
	ret = refresh_cache_entry(ce, refresh_options);
	if (ret != ce)
		free(ce);
	return ret;
}

#Calls
refresh_cache_ent(&the_index, ce, options, NULL, NULL);

#Calls
ie_modified()

#Calls
read-cache.c/ie_match_stat()

#Calls
changed |= ce_modified_check_fs(ce, st);

#Calls
ce_compare_data(path=file sha1=0x99b633)

#Calls
index_fd(..., ..., ce->name, ...)
#Note the sha is lost here, the parameter sha is the output.

Deep down, has_cr_in_index(path) will look at ad55e2 instead,
which is wrong.


>> ce_compare_data() starts the work:
>> OK, lets run index_fd(...,ce->name,...)
>> # index_fd will simulate a "git add"  and return the sha1 (via the sha1 pointer)
>> # after the hashing.
>>
>> # ce_compare_data() will later compare ce->sha1 with the result stored in
>> # the local sha1. That's why a sha1 is in the parameter list.
>> # To return the resulting hash:
>>
>> ce_compare_data() calls index_fd(sha1, ..., ce->name, ...)
>>
>> #Down in index_fd():
>>
>> sha1_file.c/index_fd() calls index_core() (after consulting the attributes)
>> index_core() reads the file from the working tree into memory using
>> read_in_full(), and after that index_mem() calls hash_sha1_file(buf)
>> to calculate the sha1.
>>
>> Before the hashing is done, index_mem() runs
>> convert_to_git(path, buf, size, &nbuf, SAFE_CRLF_FALSE)
>> to convert  "blobs to git internal format".
>>
>>
>> Here, convert_to_git() consults the .gitattributes (again) to find out that
>> crlf_action is CRLF_AUTO in our case.
>> The "new safer autocrlf handling" says that if a blob as any CR, don't convert
>> CRLFs at "git add".
>>
>> convert.c/crlf_to_git() starts to do it's job:
>> - look at buf (It has CRLF, conversion may be needed)
>> - consult blob_has_cr()
>>   # Note: Before this patch, has_cr_in_index(path)) was used
>>
>> #Again, before this patch,
>> has_cr_in_index(path) calls read_blob_data_from_cache(path, &sz) to read the
>> blob into memory.
>>
>> Now read_blob_data_from_cache() is a macro, and we end up in
>> read_blob_data_from_index(&the_index, (path), (sz))
>>
>> read-cache.c/read_blob_data_from_index() starts its work:
>> 	pos = index_name_pos(istate, path, len);
>> 	if (pos < 0) {
>> 		/*
>> 		 * We might be in the middle of a merge, in which
>> 		 * case we would read stage #2 (ours).
>> 		 */
>>
>> # And here, and this is important to notice, "ours" is sha1 ad55e2,
>> # which corresponds to "first line\r\nsame line\r\n"
> 
> Where did 99b633 come from then?  There still is something missing
> in this description.
> 
> Puzzled...
This is an unfinished attempt for a commit message:
--------------------------------------------------
correct ce_compare_data() at the end of a merge

The following didn't work as expected:

 - At the end of a merge
 - merge.renormalize is true,
 - .gitattributes = "* text=auto"
 - core.eol = crlf

Merge a blob with CRLF "first line\r\nsame line\r\n" and a blob
with LF "first line\nsame line\n".

The expected result of the merge is "first line\nsame line\n".

The content in the working tree is "first line\r\nsame line\r\n",
and ce_compare_data() should find that the content is clean and return 0.

The following callstack does not work:
merge-recursive.c/add_cacheinfo(path=file sha1=0x99b633)
#Calls
refresh_cache_ent(&the_index, ce, options, NULL, NULL);

#Calls
ie_modified()

#Calls
read-cache.c/ie_match_stat()

#Calls
changed |= ce_modified_check_fs(ce, st);

#Calls
ce_compare_data(path=file sha1=0x99b633)

#Calls
index_fd(..., ..., ce->name, ...)
#Note the sha is lost here.

#Calls
index_core()

index_core() reads the file from the working tree into memory using
read_in_full(), and after that index_mem() calls hash_sha1_file(buf)
to calculate the sha1.
Before the hashing is done, index_mem() runs
convert_to_git(path, buf, size, &nbuf, SAFE_CRLF_FALSE)
to convert  "blobs to git internal format".

Here, convert_to_git() consults the .gitattributes (again) to find out that
crlf_action is CRLF_AUTO in our case.
The "new safer autocrlf handling" says that if a blob as any CR, don't convert
CRLFs at "git add".

convert.c/crlf_to_git() starts to do it's job:
- look at buf (It has CRLF, conversion may be needed)
- consult blob_has_cr()
  # Note: Before this patch, has_cr_in_index(path)) was used

#Before this patch,
has_cr_in_index(path)
#Calls
read_blob_data_from_cache(path, &sz) to read the blob into memory.

Now read_blob_data_from_cache() is a macro, and we end up in
read_blob_data_from_index(&the_index, (path), (sz))

read-cache.c/read_blob_data_from_index() starts its work:
	pos = index_name_pos(istate, path, len);
	if (pos < 0) {
		/*
		 * We might be in the middle of a merge, in which
		 * case we would read stage #2 (ours).
		 */

# And here, and this is important to notice, "ours" is sha1 ad55e2,
# which corresponds to "first line\r\nsame line\r\n"

The result of the check is that the blob 99b633 doesn't seem
to match the working tree, and the whold merge is aborted:
  error: addinfo_cache failed for path 'file'

Solution:
Make sure that ce_compare_data() forwards the source sha1 into crlf_to_git().

Replace has_cr_in_index(path) with blob_has_cr(sha1), and forward
the source sha1 from ce_compare_data() into blob_has_cr(sha1).

While at it, rename has_cr_in_index() into blob_has_cr()
and replace 0 with SAFE_CRLF_FALSE.

Add a TC in t6038 to have a test coverage under Linux.










^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 3/3] correct ce_compare_data() in a middle of a merge
  2016-07-07 17:16             ` Torsten Bögershausen
@ 2016-07-07 18:43               ` Junio C Hamano
  2016-07-07 22:19                 ` Junio C Hamano
                                   ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Junio C Hamano @ 2016-07-07 18:43 UTC (permalink / raw)
  To: Torsten Bögershausen; +Cc: git

Torsten Bögershausen <tboegi@web.de> writes:

> This is the callstack:
>
> merge-recursive.c/add_cacheinfo(unsigned int mode, const unsigned char *sha1,
> 		const char *path, int stage, int refresh, int options)
> {
> 	struct cache_entry *ce;
> 	ce = make_cache_entry
> 	if (!ce)
> 		return error(_("addinfo_cache failed for path '%s'"), path);
> 	return add_cache_entry(ce, options);
> }
> #Calls
> read-cache.c/make_cache_entry(path=file sha1=0x99b633 stage=0)
>
>
> struct cache_entry *make_cache_entry(unsigned int mode,
> 		const unsigned char *sha1, const char *path, int stage,
> 		unsigned int refresh_options)
> {
>         [snip]
> 	ret = refresh_cache_entry(ce, refresh_options);
> 	if (ret != ce)
> 		free(ce);
> 	return ret;
> }
>
> #Calls
> refresh_cache_ent(&the_index, ce, options, NULL, NULL);
>
> #Calls
> ie_modified()
>
> #Calls
> read-cache.c/ie_match_stat()
>
> #Calls
> changed |= ce_modified_check_fs(ce, st);
>
> #Calls
> ce_compare_data(path=file sha1=0x99b633)
>
> #Calls
> index_fd(..., ..., ce->name, ...)
> #Note the sha is lost here, the parameter sha is the output.
>
> Deep down, has_cr_in_index(path) will look at ad55e2 instead,
> which is wrong.

The call to add_cacheinfo() is made with 99b633 to record the 3-way
merge result to the index in this callchain:

 merge_trees()
 -> git_merge_trees()
 -> process_renames() # does nothing for this case
 -> process_entry()
    -> merge_content()
       -> merge_file_special_markers()
          -> merge_file_1()
             -> merge_3way()
                -> ll_merge() # correctly handles the renormalization
             -> write_sha1_file() # this is what gives us 99b633
    -> update_file() # this is fed the 99b633 write_sha1_file() computed
       -> update_file_flags()
          -> read_sha1_file() # reads 99b633
          -> convert_to_working_tree()
          -> write_in_full() # updates the working tree
          -> add_cacheinfo() # to record 99b633 at "file" in the index

So refresh() may incorrectly find that 99b633 does not match what is
in the filesystem if it looks at _any_ entry in the index.  We know
we have written the file ourselves, so by definition it should match
;-) Even though I am not sure why that should affect the overall
correctness of the program (because we have written the correct
result to both working tree and to the index), this needs to be
fixed.

I am however not convinced passing the full SHA-1 is a good
solution.  The make_cache_entry() function may be creating a cache
entry to stuff in a non-default index (i.e. not "the_index"), but
its caller does not have a way to tell it which index the resulting
cache entry will go, and calls refresh_cache_entry(), which always
assumes that the caller is interested in "the_index", so what
add_cacheinfo() does by calling make_cache_entry() feels wrong in
the first place.  Also, the call from update_file_flags() knows that
the working tree is in sync with the resulting cache entry.

Perhaps update_file_flags() should not even call add_cacheinfo()
there in the first place?  I wonder if it should just call
add_file_to_index()---after all, even though the current code
already knows that 99b633 _is_ the result it wants in the index, it
still goes to the filesystem and rehashes.

I dunno.  I really do not like that extra sha1 argument added all
over the callchain by this patch.

Or did you mean other calls to add_cacheinfo()?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 3/3] correct ce_compare_data() in a middle of a merge
  2016-07-07 18:43               ` Junio C Hamano
@ 2016-07-07 22:19                 ` Junio C Hamano
  2016-07-08  7:52                 ` Torsten Bögershausen
  2016-07-08 15:00                 ` Torsten Bögershausen
  2 siblings, 0 replies; 24+ messages in thread
From: Junio C Hamano @ 2016-07-07 22:19 UTC (permalink / raw)
  To: Torsten Bögershausen; +Cc: git

Junio C Hamano <gitster@pobox.com> writes:

> I am however not convinced passing the full SHA-1 is a good
> solution.  The make_cache_entry() function may be creating a cache
> entry to stuff in a non-default index (i.e. not "the_index"), but
> its caller does not have a way to tell it which index the resulting
> cache entry will go, and calls refresh_cache_entry(), which always
> assumes that the caller is interested in "the_index", so what
> add_cacheinfo() does by calling make_cache_entry() feels wrong in
> the first place.  Also, the call from update_file_flags() knows that
> the working tree is in sync with the resulting cache entry.
>
> Perhaps update_file_flags() should not even call add_cacheinfo()
> there in the first place?  I wonder if it should just call
> add_file_to_index()---after all, even though the current code
> already knows that 99b633 _is_ the result it wants in the index, it
> still goes to the filesystem and rehashes.
>
> I dunno.  I really do not like that extra sha1 argument added all
> over the callchain by this patch.

While the above still stands (i.e. I really want to see us find a
solution without the extra argument all over), I do not think
add_file_to_index() would work well here, as the stage #2 of the
index is contaminated with CRLF in this case, and getting rid of it
is the whole point of renormalization, I would think.  Of course,
you will get a normalized result if you merged in the other
direction, as your stage #2 would have normalized 99b633 when you
cleanly merge the CRLF version from the other side with
renormalization, and "git add" of the cleanly merged result would
keep the normalized version.

Perhaps that is an acceptable downside.  If your side is not
normalized, and if the merge result is the same as what you already
have, perhaps you deserve to keep that unnormalized result, and
you'd be better off normalizing your tree before doing a merge, if
you do not like the fact that your indexed blobs have CRLF.

> Or did you mean other calls to add_cacheinfo()?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 3/3] correct ce_compare_data() in a middle of a merge
  2016-07-07 18:43               ` Junio C Hamano
  2016-07-07 22:19                 ` Junio C Hamano
@ 2016-07-08  7:52                 ` Torsten Bögershausen
  2016-07-08 16:36                   ` Junio C Hamano
  2016-07-08 15:00                 ` Torsten Bögershausen
  2 siblings, 1 reply; 24+ messages in thread
From: Torsten Bögershausen @ 2016-07-08  7:52 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git



On 07/07/16 20:43, Junio C Hamano wrote:
> Torsten Bögershausen <tboegi@web.de> writes:
>
>> This is the callstack:
>>
>> merge-recursive.c/add_cacheinfo(unsigned int mode, const unsigned char *sha1,
>> 		const char *path, int stage, int refresh, int options)
>> {
>> 	struct cache_entry *ce;
>> 	ce = make_cache_entry
>> 	if (!ce)
>> 		return error(_("addinfo_cache failed for path '%s'"), path);
>> 	return add_cache_entry(ce, options);
>> }
>> #Calls
>> read-cache.c/make_cache_entry(path=file sha1=0x99b633 stage=0)
>>
>>
>> struct cache_entry *make_cache_entry(unsigned int mode,
>> 		const unsigned char *sha1, const char *path, int stage,
>> 		unsigned int refresh_options)
>> {
>>          [snip]
>> 	ret = refresh_cache_entry(ce, refresh_options);
>> 	if (ret != ce)
>> 		free(ce);
>> 	return ret;
>> }
>>
>> #Calls
>> refresh_cache_ent(&the_index, ce, options, NULL, NULL);
>>
>> #Calls
>> ie_modified()
>>
>> #Calls
>> read-cache.c/ie_match_stat()
>>
>> #Calls
>> changed |= ce_modified_check_fs(ce, st);
>>
>> #Calls
>> ce_compare_data(path=file sha1=0x99b633)
>>
>> #Calls
>> index_fd(..., ..., ce->name, ...)
>> #Note the sha is lost here, the parameter sha is the output.
>>
>> Deep down, has_cr_in_index(path) will look at ad55e2 instead,
>> which is wrong.
> The call to add_cacheinfo() is made with 99b633 to record the 3-way
> merge result to the index in this callchain:
>
>   merge_trees()
>   -> git_merge_trees()
>   -> process_renames() # does nothing for this case
>   -> process_entry()
>      -> merge_content()
>         -> merge_file_special_markers()
>            -> merge_file_1()
>               -> merge_3way()
>                  -> ll_merge() # correctly handles the renormalization
>               -> write_sha1_file() # this is what gives us 99b633
>      -> update_file() # this is fed the 99b633 write_sha1_file() computed
>         -> update_file_flags()
>            -> read_sha1_file() # reads 99b633
>            -> convert_to_working_tree()
>            -> write_in_full() # updates the working tree
>            -> add_cacheinfo() # to record 99b633 at "file" in the index
>
> So refresh() may incorrectly find that 99b633 does not match what is
> in the filesystem if it looks at _any_ entry in the index.  We know
> we have written the file ourselves, so by definition it should match
> ;-) Even though I am not sure why that should affect the overall
> correctness of the program (because we have written the correct
> result to both working tree and to the index), this needs to be
> fixed.
>
> I am however not convinced passing the full SHA-1 is a good
> solution.
It seems at least that this is the correct way to do it.
> The make_cache_entry() function may be creating a cache
> entry to stuff in a non-default index (i.e. not "the_index"), but
> its caller does not have a way to tell it which index the resulting
> cache entry will go, and calls refresh_cache_entry(), which always
> assumes that the caller is interested in "the_index", so what
> add_cacheinfo() does by calling make_cache_entry() feels wrong in
> the first place.  Also, the call from update_file_flags() knows that
> the working tree is in sync with the resulting cache entry.
>
> Perhaps update_file_flags() should not even call add_cacheinfo()
> there in the first place?  I wonder if it should just call
> add_file_to_index()---after all, even though the current code
> already knows that 99b633 _is_ the result it wants in the index, it
> still goes to the filesystem and rehashes.
>
> I dunno.  I really do not like that extra sha1 argument added all
> over the callchain by this patch.
>
> Or did you mean other calls to add_cacheinfo()?
I didn't mean too much - the whole call chain touches code where I am not able to
comment on details.
I'm happy to test other implementations, if someone suggests a path, so to say.





^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 3/3] correct ce_compare_data() in a middle of a merge
  2016-07-07 18:43               ` Junio C Hamano
  2016-07-07 22:19                 ` Junio C Hamano
  2016-07-08  7:52                 ` Torsten Bögershausen
@ 2016-07-08 15:00                 ` Torsten Bögershausen
  2 siblings, 0 replies; 24+ messages in thread
From: Torsten Bögershausen @ 2016-07-08 15:00 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git



On 07/07/16 20:43, Junio C Hamano wrote:
 > Torsten Bögershausen <tboegi@web.de> writes:
 >
 >> This is the call stack:
 >>
 >> merge-recursive.c/add_cacheinfo(unsigned int mode, const unsigned 
char *sha1,
 >> 		const char *path, int stage, int refresh, int options)
 >> {
 >> 	struct cache_entry *ce;
 >> 	ce = make_cache_entry
 >> 	if (!ce)
 >> 		return error(_("addinfo_cache failed for path '%s'"), path);
 >> 	return add_cache_entry(ce, options);
 >> }
 >> #Calls
 >> read-cache.c/make_cache_entry(path=file sha1=0x99b633 stage=0)
 >>
 >>
 >> struct cache_entry *make_cache_entry(unsigned int mode,
 >> 		const unsigned char *sha1, const char *path, int stage,
 >> 		unsigned int refresh_options)
 >> {
 >>         [snip]
 >> 	ret = refresh_cache_entry(ce, refresh_options);
 >> 	if (ret != ce)
 >> 		free(ce);
 >> 	return ret;
 >> }
 >>
 >> #Calls
 >> refresh_cache_ent(&the_index, ce, options, NULL, NULL);
 >>
 >> #Calls
 >> ie_modified()
 >>
 >> #Calls
 >> read-cache.c/ie_match_stat()
 >>
 >> #Calls
 >> changed |= ce_modified_check_fs(ce, st);
 >>
 >> #Calls
 >> ce_compare_data(path=file sha1=0x99b633)
 >>
 >> #Calls
 >> index_fd(..., ..., ce->name, ...)
 >> #Note the sha is lost here, the parameter sha is the output.
 >>
 >> Deep down, has_cr_in_index(path) will look at ad55e2 instead,
 >> which is wrong.
 > The call to add_cacheinfo() is made with 99b633 to record the 3-way
 > merge result to the index in this callchain:
 >
 >  merge_trees()
 >  -> git_merge_trees()
 >  -> process_renames() # does nothing for this case
 >  -> process_entry()
 >     -> merge_content()
 >        -> merge_file_special_markers()
 >           -> merge_file_1()
 >              -> merge_3way()
 >                 -> ll_merge() # correctly handles the renormalization
 >              -> write_sha1_file() # this is what gives us 99b633
 >     -> update_file() # this is fed the 99b633 write_sha1_file() computed
 >        -> update_file_flags()
 >           -> read_sha1_file() # reads 99b633
 >           -> convert_to_working_tree()
 >           -> write_in_full() # updates the working tree
 >           -> add_cacheinfo() # to record 99b633 at "file" in the index
 >
 > So refresh() may incorrectly find that 99b633 does not match what is
 > in the filesystem if it looks at _any_ entry in the index.  We know
 > we have written the file ourselves, so by definition it should match
 > ;-)

Does it, always ?
There was a lengthy discussion around January, if
git checkout -f
should always give a clean workspace.
(My suggestion was yes, please), but the conclusion
was to always check the other way around:
What would "git add" say ?
If the smudge/clean don't provide
a round trip, then the worktree should not be clean.

My understanding is, that after merge, the filters must be round trip
(and all other conversions), otherwise the merge willl fail.

In this case the merge fails because of a bug in Git.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 3/3] correct ce_compare_data() in a middle of a merge
  2016-07-08  7:52                 ` Torsten Bögershausen
@ 2016-07-08 16:36                   ` Junio C Hamano
  2016-07-08 17:13                     ` Torsten Bögershausen
  0 siblings, 1 reply; 24+ messages in thread
From: Junio C Hamano @ 2016-07-08 16:36 UTC (permalink / raw)
  To: Torsten Bögershausen; +Cc: git

Torsten Bögershausen <tboegi@web.de> writes:

>> I dunno.  I really do not like that extra sha1 argument added all
>> over the callchain by this patch.
>>
>> Or did you mean other calls to add_cacheinfo()?
>
> I didn't mean too much - the whole call chain touches code where I
> am not able to comment on details.
> I'm happy to test other implementations, if someone suggests a
> path, so to say.

I did a bit of experiment.

When 1/3 alone is applied, and then only changes for t/t6038 from
3/3 is picked, (i.e. we do not add the extra "don't look at index,
check this contents"), your "Merge addition of text=auto eol=CRLF"
test would fail.

And then with this further on top:

diff --git a/merge-recursive.c b/merge-recursive.c
index b880ae5..628c8ed 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -202,6 +202,9 @@ static int add_cacheinfo(unsigned int mode, const unsigned char *sha1,
 		const char *path, int stage, int refresh, int options)
 {
 	struct cache_entry *ce;
+
+	if (!stage)
+		remove_file_from_cache(path);
 	ce = make_cache_entry(mode, sha1 ? sha1 : null_sha1, path, stage,
 			      (refresh ? (CE_MATCH_REFRESH |
 					  CE_MATCH_IGNORE_MISSING) : 0 ));

I get the renomalized result registered to the index.

The root cause of the "bug", I think, is that the "safer crlf" is
fundamentally a flawed mechanism and does not mesh well with the
rest of the system.  If you start from an index without the path and
add a CRLF file in the working tree, you get LF blob (as expected),
but if you happen to already have CRLF file for the path in the
index, you get CRLF blob (as claimed to be "safer"), which
essentially means that with that mechanism in place, you do not know
blob with what object name you would get, given the same working
tree file with the same configuration setting.

In this codepath, we already have three stages in the index, and the
stage #2 happens to be a CRLF blob in your test.  make_cache_entry()
wants to refresh, which involves comparing the working tree file to
see if it hashes down to the same object name as ce->sha1, which is
the result of the "normalizing" merge that uses LF.  But if you
merge the other way, the stage #2 would have LF blob, and that would
not prevent the working tree file with CRLF written from the result
of the "normalizing" merge to be cleaned again to LF blob to
round-trip.

The "this is a minimum workaround" patch above disables the "safer
crlf" conversion by removing the stages that will get in the way.
When we are recording the cleanly resolved result of a merge, we
know what the resulting ce->sha1 should be, and we also know the
higher stages for the path will be removed, so there is no point
keeping the higher stages in the index and get them looked at by the
"safer crlf" codepath.


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 3/3] correct ce_compare_data() in a middle of a merge
  2016-07-08 16:36                   ` Junio C Hamano
@ 2016-07-08 17:13                     ` Torsten Bögershausen
  2016-07-08 17:25                       ` Junio C Hamano
  2016-07-08 19:01                       ` Junio C Hamano
  0 siblings, 2 replies; 24+ messages in thread
From: Torsten Bögershausen @ 2016-07-08 17:13 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, tboegi



On 07/08/2016 06:36 PM, Junio C Hamano wrote:
> Torsten Bögershausen <tboegi@web.de> writes:
>
>>> I dunno.  I really do not like that extra sha1 argument added all
>>> over the callchain by this patch.
>>>
>>> Or did you mean other calls to add_cacheinfo()?
>>
>> I didn't mean too much - the whole call chain touches code where I
>> am not able to comment on details.
>> I'm happy to test other implementations, if someone suggests a
>> path, so to say.
>
> I did a bit of experiment.
>
> When 1/3 alone is applied, and then only changes for t/t6038 from
> 3/3 is picked, (i.e. we do not add the extra "don't look at index,
> check this contents"), your "Merge addition of text=auto eol=CRLF"
> test would fail.
>
> And then with this further on top:
>
> diff --git a/merge-recursive.c b/merge-recursive.c
> index b880ae5..628c8ed 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -202,6 +202,9 @@ static int add_cacheinfo(unsigned int mode, const unsigned char *sha1,
>  		const char *path, int stage, int refresh, int options)
>  {
>  	struct cache_entry *ce;
> +
> +	if (!stage)
> +		remove_file_from_cache(path);
>  	ce = make_cache_entry(mode, sha1 ? sha1 : null_sha1, path, stage,
>  			      (refresh ? (CE_MATCH_REFRESH |
>  					  CE_MATCH_IGNORE_MISSING) : 0 ));
>
Thanks :-)
Did that experiment made it to a branch somewhere ?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 3/3] correct ce_compare_data() in a middle of a merge
  2016-07-08 17:13                     ` Torsten Bögershausen
@ 2016-07-08 17:25                       ` Junio C Hamano
  2016-07-08 17:59                         ` Junio C Hamano
  2016-07-08 19:01                       ` Junio C Hamano
  1 sibling, 1 reply; 24+ messages in thread
From: Junio C Hamano @ 2016-07-08 17:25 UTC (permalink / raw)
  To: Torsten Bögershausen; +Cc: git

Torsten Bögershausen <tboegi@web.de> writes:

>> And then with this further on top:
>>
>> diff --git a/merge-recursive.c b/merge-recursive.c
>> index b880ae5..628c8ed 100644
>> --- a/merge-recursive.c
>> +++ b/merge-recursive.c
>> @@ -202,6 +202,9 @@ static int add_cacheinfo(unsigned int mode, const unsigned char *sha1,
>>  		const char *path, int stage, int refresh, int options)
>>  {
>>  	struct cache_entry *ce;
>> +
>> +	if (!stage)
>> +		remove_file_from_cache(path);
>>  	ce = make_cache_entry(mode, sha1 ? sha1 : null_sha1, path, stage,
>>  			      (refresh ? (CE_MATCH_REFRESH |
>>  					  CE_MATCH_IGNORE_MISSING) : 0 ));
>>
> Thanks :-)
> Did that experiment made it to a branch somewhere ?

Not yet.  As I called it "experiment", it was merely to demonstrate
that there are less intrusive ways to kill the "safer crlf" we may
want to consider first before passing an extra blob object name
around.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 3/3] correct ce_compare_data() in a middle of a merge
  2016-07-08 17:25                       ` Junio C Hamano
@ 2016-07-08 17:59                         ` Junio C Hamano
  0 siblings, 0 replies; 24+ messages in thread
From: Junio C Hamano @ 2016-07-08 17:59 UTC (permalink / raw)
  To: Torsten Bögershausen; +Cc: git

Junio C Hamano <gitster@pobox.com> writes:

> Not yet.  As I called it "experiment", it was merely to demonstrate
> that there are less intrusive ways to kill the "safer crlf" we may
> want to consider first before passing an extra blob object name
> around.

Here is another approach, that probably is less intrusive.  It
contains the test updates from your 3/3, and you can apply it
directly on top of 65237284 (convert: unify the "auto" handling of
CRLF, 2016-06-28), which is the result of applying your 1/3 on top
of v2.9.0-rc0~11^2 (convert.c: ident + core.autocrlf didn't work,
2016-04-25).

Instead of letting add_cacheinfo() to refresh individual cache
entries as they are added, we just leave them as they are, which
would bypass the make_cache_entry() -> refresh_cache_entry() ->
... -> ce_compare_data() -> "safer crlf" callchain.  After we are
done all (in builtin/merge.c that directly calls merge_recursive(),
and also merge_recursive_generic()), we let refresh_cache() to take
care of the refreshing of the stat information--by the time this
happens, the stale cache entries that had CRLF in stage #2 that were
carried over before the renormalizing merge started will all be gone
and will not interfere.


 builtin/merge.c            |  3 ++-
 merge-recursive.c          | 20 +++++++++---------
 t/t6038-merge-text-auto.sh | 51 +++++++++++++++++++++++++---------------------
 3 files changed, 40 insertions(+), 34 deletions(-)

diff --git a/builtin/merge.c b/builtin/merge.c
index 101ffef..d5bf68d 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -678,7 +678,8 @@ static int try_merge_strategy(const char *strategy, struct commit_list *common,
 
 		hold_locked_index(&lock, 1);
 		clean = merge_recursive(&o, head,
-				remoteheads->item, reversed, &result);
+					remoteheads->item, reversed, &result);
+		refresh_cache(CE_MATCH_REFRESH | CE_MATCH_IGNORE_MISSING);
 		if (active_cache_changed &&
 		    write_locked_index(&the_index, &lock, COMMIT_LOCK))
 			die (_("unable to write %s"), get_index_file());
diff --git a/merge-recursive.c b/merge-recursive.c
index b880ae5..8aaf1b5 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -199,14 +199,14 @@ static void output_commit_title(struct merge_options *o, struct commit *commit)
 }
 
 static int add_cacheinfo(unsigned int mode, const unsigned char *sha1,
-		const char *path, int stage, int refresh, int options)
+			 const char *path, int stage, int options)
 {
 	struct cache_entry *ce;
-	ce = make_cache_entry(mode, sha1 ? sha1 : null_sha1, path, stage,
-			      (refresh ? (CE_MATCH_REFRESH |
-					  CE_MATCH_IGNORE_MISSING) : 0 ));
+
+	ce = make_cache_entry(mode, sha1 ? sha1 : null_sha1, path, stage, 0);
 	if (!ce)
 		return error(_("addinfo_cache failed for path '%s'"), path);
+
 	return add_cache_entry(ce, options);
 }
 
@@ -552,13 +552,13 @@ static int update_stages(const char *path, const struct diff_filespec *o,
 		if (remove_file_from_cache(path))
 			return -1;
 	if (o)
-		if (add_cacheinfo(o->mode, o->sha1, path, 1, 0, options))
+		if (add_cacheinfo(o->mode, o->sha1, path, 1, options))
 			return -1;
 	if (a)
-		if (add_cacheinfo(a->mode, a->sha1, path, 2, 0, options))
+		if (add_cacheinfo(a->mode, a->sha1, path, 2, options))
 			return -1;
 	if (b)
-		if (add_cacheinfo(b->mode, b->sha1, path, 3, 0, options))
+		if (add_cacheinfo(b->mode, b->sha1, path, 3, options))
 			return -1;
 	return 0;
 }
@@ -804,7 +804,7 @@ static void update_file_flags(struct merge_options *o,
 	}
  update_index:
 	if (update_cache)
-		add_cacheinfo(mode, sha, path, 0, update_wd, ADD_CACHE_OK_TO_ADD);
+		add_cacheinfo(mode, sha, path, 0, ADD_CACHE_OK_TO_ADD);
 }
 
 static void update_file(struct merge_options *o,
@@ -1638,8 +1638,7 @@ static int merge_content(struct merge_options *o,
 		 */
 		path_renamed_outside_HEAD = !path2 || !strcmp(path, path2);
 		if (!path_renamed_outside_HEAD) {
-			add_cacheinfo(mfi.mode, mfi.sha, path,
-				      0, (!o->call_depth), 0);
+			add_cacheinfo(mfi.mode, mfi.sha, path, 0, 0);
 			return mfi.clean;
 		}
 	} else
@@ -2019,6 +2018,7 @@ int merge_recursive_generic(struct merge_options *o,
 	hold_locked_index(lock, 1);
 	clean = merge_recursive(o, head_commit, next_commit, ca,
 			result);
+	refresh_cache(CE_MATCH_REFRESH | CE_MATCH_IGNORE_MISSING);
 	if (active_cache_changed &&
 	    write_locked_index(&the_index, lock, COMMIT_LOCK))
 		return error(_("Unable to write index."));
diff --git a/t/t6038-merge-text-auto.sh b/t/t6038-merge-text-auto.sh
index 33b77ee..5e8d5fa 100755
--- a/t/t6038-merge-text-auto.sh
+++ b/t/t6038-merge-text-auto.sh
@@ -91,16 +91,13 @@ test_expect_success 'Merge after setting text=auto' '
 	compare_files expected file
 '
 
-test_expect_success 'Merge addition of text=auto' '
+test_expect_success 'Merge addition of text=auto eol=LF' '
+	git config core.eol lf &&
 	cat <<-\EOF >expected &&
 	first line
 	same line
 	EOF
 
-	if test_have_prereq NATIVE_CRLF; then
-		append_cr <expected >expected.temp &&
-		mv expected.temp expected
-	fi &&
 	git config merge.renormalize true &&
 	git rm -fr . &&
 	rm -f .gitattributes &&
@@ -109,17 +106,31 @@ test_expect_success 'Merge addition of text=auto' '
 	compare_files  expected file
 '
 
+test_expect_success 'Merge addition of text=auto eol=CRLF' '
+	git config core.eol crlf &&
+	cat <<-\EOF >expected &&
+	first line
+	same line
+	EOF
+
+	append_cr <expected >expected.temp &&
+	mv expected.temp expected &&
+	git config merge.renormalize true &&
+	git rm -fr . &&
+	rm -f .gitattributes &&
+	git reset --hard b &&
+	echo >&2 "After git reset --hard b" &&
+	git ls-files -s --eol >&2 &&
+	git merge a &&
+	compare_files  expected file
+'
+
 test_expect_success 'Detect CRLF/LF conflict after setting text=auto' '
+	git config core.eol native &&
 	echo "<<<<<<<" >expected &&
-	if test_have_prereq NATIVE_CRLF; then
-		echo first line | append_cr >>expected &&
-		echo same line | append_cr >>expected &&
-		echo ======= | append_cr >>expected
-	else
-		echo first line >>expected &&
-		echo same line >>expected &&
-		echo ======= >>expected
-	fi &&
+	echo first line >>expected &&
+	echo same line >>expected &&
+	echo ======= >>expected &&
 	echo first line | append_cr >>expected &&
 	echo same line | append_cr >>expected &&
 	echo ">>>>>>>" >>expected &&
@@ -135,15 +146,9 @@ test_expect_success 'Detect LF/CRLF conflict from addition of text=auto' '
 	echo "<<<<<<<" >expected &&
 	echo first line | append_cr >>expected &&
 	echo same line | append_cr >>expected &&
-	if test_have_prereq NATIVE_CRLF; then
-		echo ======= | append_cr >>expected &&
-		echo first line | append_cr >>expected &&
-		echo same line | append_cr >>expected
-	else
-		echo ======= >>expected &&
-		echo first line >>expected &&
-		echo same line >>expected
-	fi &&
+	echo ======= >>expected &&
+	echo first line >>expected &&
+	echo same line >>expected &&
 	echo ">>>>>>>" >>expected &&
 	git config merge.renormalize false &&
 	rm -f .gitattributes &&




^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 3/3] correct ce_compare_data() in a middle of a merge
  2016-07-08 17:13                     ` Torsten Bögershausen
  2016-07-08 17:25                       ` Junio C Hamano
@ 2016-07-08 19:01                       ` Junio C Hamano
  2016-07-08 20:50                         ` Junio C Hamano
  1 sibling, 1 reply; 24+ messages in thread
From: Junio C Hamano @ 2016-07-08 19:01 UTC (permalink / raw)
  To: Torsten Bögershausen; +Cc: git

Torsten Bögershausen <tboegi@web.de> writes:

> Did that experiment made it to a branch somewhere ?

I've tentatively queued the following (it's the same as "Here is
another approach." I sent earlier, but with a real log message) on
'pu'.

-- >8 --
Subject: [PATCH] merge: avoid "safer crlf" during recording of merge results

When merge_recursive() decides what the correct blob object merge
result for a path should be, it uses update_file_flags() helper
function to write it out to a working tree file and then calls
add_cacheinfo().  The add_cacheinfo() function in turn calls
make_cache_entry() to create a new cache entry to replace the
higher-stage entries for the path that represents the conflict.

The make_cache_entry() function calls refresh_cache_entry() to fill
in the cached stat information.  To mark a cache entry as
up-to-date, the data is re-read from the file in the working tree,
and goes through convert_to_git() conversion to be compared with the
blob object name the new cache entry records.

It is important to note that this happens while the higher-stage
entries, which are going to be replaced with the new entry, are
still in the index.  Unfortunately, the convert_to_git() conversion
has a misguided "safer crlf" mechanism baked in, and looks at the
existing cache entry for the path to decide how to convert the
contents in the working tree file.  If our side (i.e. stage#2)
records a text blob with CRLF in it, even when the system is
configured to record LF in blobs and convert them to CRLF upon
checkout (and back to LF upon checkin), the "safer crlf" mechanism
stops us doing so.

This especially poses a problem during a renormalizing merge, where
the merge result for the path is computed by first "normalizing" the
blobs involved in the merge by using convert_to_working_tree()
followed by convert_to_git() with "safer crlf" disabled.  The merge
result that is computed correctly and fed to add_cacheinfo() via
update_file_flags() does _not_ match what refresh_cache_entry() sees
by converting the working tree file via convert_to_git().

We can work this around by not refreshing the individual cache
entries in make_cache_entry() called by add_cacheinfo().  After we
are finished mergeing, we call refresh_cache() to take care of the
refreshing of the stat information.  By the time this refreshing
happens, the stale cache entries that had CRLF in stage #2 that were
carried over before the renormalizing merge started will all be gone
and will not interfere with the correct recording of the result.

The test update was taken from a series by Torsten Bögershausen
that attempted to fix this with a different approach (which was a
lot more intrusive).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/merge.c            |  3 ++-
 merge-recursive.c          | 20 +++++++++---------
 t/t6038-merge-text-auto.sh | 51 +++++++++++++++++++++++++---------------------
 3 files changed, 40 insertions(+), 34 deletions(-)

diff --git a/builtin/merge.c b/builtin/merge.c
index 101ffef..d5bf68d 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -678,7 +678,8 @@ static int try_merge_strategy(const char *strategy, struct commit_list *common,
 
 		hold_locked_index(&lock, 1);
 		clean = merge_recursive(&o, head,
-				remoteheads->item, reversed, &result);
+					remoteheads->item, reversed, &result);
+		refresh_cache(CE_MATCH_REFRESH | CE_MATCH_IGNORE_MISSING);
 		if (active_cache_changed &&
 		    write_locked_index(&the_index, &lock, COMMIT_LOCK))
 			die (_("unable to write %s"), get_index_file());
diff --git a/merge-recursive.c b/merge-recursive.c
index b880ae5..8aaf1b5 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -199,14 +199,14 @@ static void output_commit_title(struct merge_options *o, struct commit *commit)
 }
 
 static int add_cacheinfo(unsigned int mode, const unsigned char *sha1,
-		const char *path, int stage, int refresh, int options)
+			 const char *path, int stage, int options)
 {
 	struct cache_entry *ce;
-	ce = make_cache_entry(mode, sha1 ? sha1 : null_sha1, path, stage,
-			      (refresh ? (CE_MATCH_REFRESH |
-					  CE_MATCH_IGNORE_MISSING) : 0 ));
+
+	ce = make_cache_entry(mode, sha1 ? sha1 : null_sha1, path, stage, 0);
 	if (!ce)
 		return error(_("addinfo_cache failed for path '%s'"), path);
+
 	return add_cache_entry(ce, options);
 }
 
@@ -552,13 +552,13 @@ static int update_stages(const char *path, const struct diff_filespec *o,
 		if (remove_file_from_cache(path))
 			return -1;
 	if (o)
-		if (add_cacheinfo(o->mode, o->sha1, path, 1, 0, options))
+		if (add_cacheinfo(o->mode, o->sha1, path, 1, options))
 			return -1;
 	if (a)
-		if (add_cacheinfo(a->mode, a->sha1, path, 2, 0, options))
+		if (add_cacheinfo(a->mode, a->sha1, path, 2, options))
 			return -1;
 	if (b)
-		if (add_cacheinfo(b->mode, b->sha1, path, 3, 0, options))
+		if (add_cacheinfo(b->mode, b->sha1, path, 3, options))
 			return -1;
 	return 0;
 }
@@ -804,7 +804,7 @@ static void update_file_flags(struct merge_options *o,
 	}
  update_index:
 	if (update_cache)
-		add_cacheinfo(mode, sha, path, 0, update_wd, ADD_CACHE_OK_TO_ADD);
+		add_cacheinfo(mode, sha, path, 0, ADD_CACHE_OK_TO_ADD);
 }
 
 static void update_file(struct merge_options *o,
@@ -1638,8 +1638,7 @@ static int merge_content(struct merge_options *o,
 		 */
 		path_renamed_outside_HEAD = !path2 || !strcmp(path, path2);
 		if (!path_renamed_outside_HEAD) {
-			add_cacheinfo(mfi.mode, mfi.sha, path,
-				      0, (!o->call_depth), 0);
+			add_cacheinfo(mfi.mode, mfi.sha, path, 0, 0);
 			return mfi.clean;
 		}
 	} else
@@ -2019,6 +2018,7 @@ int merge_recursive_generic(struct merge_options *o,
 	hold_locked_index(lock, 1);
 	clean = merge_recursive(o, head_commit, next_commit, ca,
 			result);
+	refresh_cache(CE_MATCH_REFRESH | CE_MATCH_IGNORE_MISSING);
 	if (active_cache_changed &&
 	    write_locked_index(&the_index, lock, COMMIT_LOCK))
 		return error(_("Unable to write index."));
diff --git a/t/t6038-merge-text-auto.sh b/t/t6038-merge-text-auto.sh
index 33b77ee..5e8d5fa 100755
--- a/t/t6038-merge-text-auto.sh
+++ b/t/t6038-merge-text-auto.sh
@@ -91,16 +91,13 @@ test_expect_success 'Merge after setting text=auto' '
 	compare_files expected file
 '
 
-test_expect_success 'Merge addition of text=auto' '
+test_expect_success 'Merge addition of text=auto eol=LF' '
+	git config core.eol lf &&
 	cat <<-\EOF >expected &&
 	first line
 	same line
 	EOF
 
-	if test_have_prereq NATIVE_CRLF; then
-		append_cr <expected >expected.temp &&
-		mv expected.temp expected
-	fi &&
 	git config merge.renormalize true &&
 	git rm -fr . &&
 	rm -f .gitattributes &&
@@ -109,17 +106,31 @@ test_expect_success 'Merge addition of text=auto' '
 	compare_files  expected file
 '
 
+test_expect_success 'Merge addition of text=auto eol=CRLF' '
+	git config core.eol crlf &&
+	cat <<-\EOF >expected &&
+	first line
+	same line
+	EOF
+
+	append_cr <expected >expected.temp &&
+	mv expected.temp expected &&
+	git config merge.renormalize true &&
+	git rm -fr . &&
+	rm -f .gitattributes &&
+	git reset --hard b &&
+	echo >&2 "After git reset --hard b" &&
+	git ls-files -s --eol >&2 &&
+	git merge a &&
+	compare_files  expected file
+'
+
 test_expect_success 'Detect CRLF/LF conflict after setting text=auto' '
+	git config core.eol native &&
 	echo "<<<<<<<" >expected &&
-	if test_have_prereq NATIVE_CRLF; then
-		echo first line | append_cr >>expected &&
-		echo same line | append_cr >>expected &&
-		echo ======= | append_cr >>expected
-	else
-		echo first line >>expected &&
-		echo same line >>expected &&
-		echo ======= >>expected
-	fi &&
+	echo first line >>expected &&
+	echo same line >>expected &&
+	echo ======= >>expected &&
 	echo first line | append_cr >>expected &&
 	echo same line | append_cr >>expected &&
 	echo ">>>>>>>" >>expected &&
@@ -135,15 +146,9 @@ test_expect_success 'Detect LF/CRLF conflict from addition of text=auto' '
 	echo "<<<<<<<" >expected &&
 	echo first line | append_cr >>expected &&
 	echo same line | append_cr >>expected &&
-	if test_have_prereq NATIVE_CRLF; then
-		echo ======= | append_cr >>expected &&
-		echo first line | append_cr >>expected &&
-		echo same line | append_cr >>expected
-	else
-		echo ======= >>expected &&
-		echo first line >>expected &&
-		echo same line >>expected
-	fi &&
+	echo ======= >>expected &&
+	echo first line >>expected &&
+	echo same line >>expected &&
 	echo ">>>>>>>" >>expected &&
 	git config merge.renormalize false &&
 	rm -f .gitattributes &&
-- 
2.9.0-512-gbea6568


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 3/3] correct ce_compare_data() in a middle of a merge
  2016-07-08 19:01                       ` Junio C Hamano
@ 2016-07-08 20:50                         ` Junio C Hamano
  2016-07-11 20:07                           ` Junio C Hamano
  0 siblings, 1 reply; 24+ messages in thread
From: Junio C Hamano @ 2016-07-08 20:50 UTC (permalink / raw)
  To: Torsten Bögershausen; +Cc: git

Junio C Hamano <gitster@pobox.com> writes:

> I've tentatively queued the following (it's the same as "Here is
> another approach." I sent earlier, but with a real log message) on
> 'pu'.

And here is yet another approach, which probably is even less
intrusive.

Whatever alternative we would end up picking, in the longer term, I
think we should just get rid of the "safer crlf" hack from the
codebase, instead of adding the conditional disabling to work it
around.  The "safe crlf" that inspects the contents being added and
decides that it is not a good idea to turn CRLF in the contents
being added into LF was sensible.  The "safer crlf" that inspects
what happens to be at the path in the index and decides that it
should not turn CRLF in the contents being added into LF, only
because the blob in the index has CRLF in it, is simply broken.  If
a project wants to make sure blobs in some paths are recorded with
CRLF (perhaps because they are test data to compare with and the
program is supposed to always produce CRLF data), they can and
should be use the gitattributes mechanism to mark those paths as
such.

And that is why I really didn't like the approach to pass extra
sha1's around, churning codepaths that are at quite a low level,
only to feed it to the "safer crlf" mechanism.  If removing "safer
crlf" is the right thing to do in the longer term (and I think it
is), we shouldn't be making a big interface changes that we'd need
to revert when it goes away.

-- >8 --
Subject: [PATCH] merge: avoid "safer crlf" during recording of merge results

When merge_recursive() decides what the correct blob object merge
result for a path should be, it uses update_file_flags() helper
function to write it out to a working tree file and then calls
add_cacheinfo().  The add_cacheinfo() function in turn calls
make_cache_entry() to create a new cache entry to replace the
higher-stage entries for the path that represents the conflict.

The make_cache_entry() function calls refresh_cache_entry() to fill
in the cached stat information.  To mark a cache entry as
up-to-date, the data is re-read from the file in the working tree,
and goes through convert_to_git() conversion to be compared with the
blob object name the new cache entry records.

It is important to note that this happens while the higher-stage
entries, which are going to be replaced with the new entry, are
still in the index.  Unfortunately, the convert_to_git() conversion
has a misguided "safer crlf" mechanism baked in, and looks at the
existing cache entry for the path to decide how to convert the
contents in the working tree file.  If our side (i.e. stage#2)
records a text blob with CRLF in it, even when the system is
configured to record LF in blobs and convert them to CRLF upon
checkout (and back to LF upon checkin), the "safer crlf" mechanism
stops us doing so.

This especially poses a problem during a renormalizing merge, where
the merge result for the path is computed by first "normalizing" the
blobs involved in the merge by using convert_to_working_tree()
followed by convert_to_git() with "safer crlf" disabled.  The merge
result that is computed correctly and fed to add_cacheinfo() via
update_file_flags() does _not_ match what refresh_cache_entry() sees
by converting the working tree file via convert_to_git().

We can work this around by not refreshing the new cache entry in
make_cache_entry() called by add_cacheinfo().  After add_cacheinfo()
adds the new entry, we can call refresh_cache_entry() on that,
knowing that addition of this new cache entry would have removed the
stale cache entries that had CRLF in stage #2 that were carried over
before the renormalizing merge started and will not interfere with
the correct recording of the result.

The test update was taken from a series by Torsten Bögershausen
that attempted to fix this with a different approach (which was a
lot more intrusive).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 cache.h                    |  1 +
 merge-recursive.c          | 17 ++++++++++++----
 read-cache.c               |  5 +----
 t/t6038-merge-text-auto.sh | 51 +++++++++++++++++++++++++---------------------
 4 files changed, 43 insertions(+), 31 deletions(-)

diff --git a/cache.h b/cache.h
index b829410..b33cb54 100644
--- a/cache.h
+++ b/cache.h
@@ -632,6 +632,7 @@ extern void fill_stat_cache_info(struct cache_entry *ce, struct stat *st);
 #define REFRESH_IGNORE_SUBMODULES	0x0010	/* ignore submodules */
 #define REFRESH_IN_PORCELAIN	0x0020	/* user friendly output, not "needs update" */
 extern int refresh_index(struct index_state *, unsigned int flags, const struct pathspec *pathspec, char *seen, const char *header_msg);
+extern struct cache_entry *refresh_cache_entry(struct cache_entry *, unsigned int);
 
 extern void update_index_if_able(struct index_state *, struct lock_file *);
 
diff --git a/merge-recursive.c b/merge-recursive.c
index b880ae5..de37e51 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -202,12 +202,21 @@ static int add_cacheinfo(unsigned int mode, const unsigned char *sha1,
 		const char *path, int stage, int refresh, int options)
 {
 	struct cache_entry *ce;
-	ce = make_cache_entry(mode, sha1 ? sha1 : null_sha1, path, stage,
-			      (refresh ? (CE_MATCH_REFRESH |
-					  CE_MATCH_IGNORE_MISSING) : 0 ));
+	int ret;
+
+	ce = make_cache_entry(mode, sha1 ? sha1 : null_sha1, path, stage, 0);
 	if (!ce)
 		return error(_("addinfo_cache failed for path '%s'"), path);
-	return add_cache_entry(ce, options);
+
+	ret = add_cache_entry(ce, options);
+	if (refresh) {
+		struct cache_entry *nce;
+
+		nce = refresh_cache_entry(ce, CE_MATCH_REFRESH | CE_MATCH_IGNORE_MISSING);
+		if (nce != ce)
+			ret = add_cache_entry(nce, options);
+	}
+	return ret;
 }
 
 static void init_tree_desc_from_tree(struct tree_desc *desc, struct tree *tree)
diff --git a/read-cache.c b/read-cache.c
index d9fb78b..6af409a 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -19,9 +19,6 @@
 #include "split-index.h"
 #include "utf8.h"
 
-static struct cache_entry *refresh_cache_entry(struct cache_entry *ce,
-					       unsigned int options);
-
 /* Mask for the name length in ce_flags in the on-disk index */
 
 #define CE_NAMEMASK  (0x0fff)
@@ -1254,7 +1251,7 @@ int refresh_index(struct index_state *istate, unsigned int flags,
 	return has_errors;
 }
 
-static struct cache_entry *refresh_cache_entry(struct cache_entry *ce,
+struct cache_entry *refresh_cache_entry(struct cache_entry *ce,
 					       unsigned int options)
 {
 	return refresh_cache_ent(&the_index, ce, options, NULL, NULL);
diff --git a/t/t6038-merge-text-auto.sh b/t/t6038-merge-text-auto.sh
index 33b77ee..5e8d5fa 100755
--- a/t/t6038-merge-text-auto.sh
+++ b/t/t6038-merge-text-auto.sh
@@ -91,16 +91,13 @@ test_expect_success 'Merge after setting text=auto' '
 	compare_files expected file
 '
 
-test_expect_success 'Merge addition of text=auto' '
+test_expect_success 'Merge addition of text=auto eol=LF' '
+	git config core.eol lf &&
 	cat <<-\EOF >expected &&
 	first line
 	same line
 	EOF
 
-	if test_have_prereq NATIVE_CRLF; then
-		append_cr <expected >expected.temp &&
-		mv expected.temp expected
-	fi &&
 	git config merge.renormalize true &&
 	git rm -fr . &&
 	rm -f .gitattributes &&
@@ -109,17 +106,31 @@ test_expect_success 'Merge addition of text=auto' '
 	compare_files  expected file
 '
 
+test_expect_success 'Merge addition of text=auto eol=CRLF' '
+	git config core.eol crlf &&
+	cat <<-\EOF >expected &&
+	first line
+	same line
+	EOF
+
+	append_cr <expected >expected.temp &&
+	mv expected.temp expected &&
+	git config merge.renormalize true &&
+	git rm -fr . &&
+	rm -f .gitattributes &&
+	git reset --hard b &&
+	echo >&2 "After git reset --hard b" &&
+	git ls-files -s --eol >&2 &&
+	git merge a &&
+	compare_files  expected file
+'
+
 test_expect_success 'Detect CRLF/LF conflict after setting text=auto' '
+	git config core.eol native &&
 	echo "<<<<<<<" >expected &&
-	if test_have_prereq NATIVE_CRLF; then
-		echo first line | append_cr >>expected &&
-		echo same line | append_cr >>expected &&
-		echo ======= | append_cr >>expected
-	else
-		echo first line >>expected &&
-		echo same line >>expected &&
-		echo ======= >>expected
-	fi &&
+	echo first line >>expected &&
+	echo same line >>expected &&
+	echo ======= >>expected &&
 	echo first line | append_cr >>expected &&
 	echo same line | append_cr >>expected &&
 	echo ">>>>>>>" >>expected &&
@@ -135,15 +146,9 @@ test_expect_success 'Detect LF/CRLF conflict from addition of text=auto' '
 	echo "<<<<<<<" >expected &&
 	echo first line | append_cr >>expected &&
 	echo same line | append_cr >>expected &&
-	if test_have_prereq NATIVE_CRLF; then
-		echo ======= | append_cr >>expected &&
-		echo first line | append_cr >>expected &&
-		echo same line | append_cr >>expected
-	else
-		echo ======= >>expected &&
-		echo first line >>expected &&
-		echo same line >>expected
-	fi &&
+	echo ======= >>expected &&
+	echo first line >>expected &&
+	echo same line >>expected &&
 	echo ">>>>>>>" >>expected &&
 	git config merge.renormalize false &&
 	rm -f .gitattributes &&
-- 
2.9.0-512-gbea6568


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 3/3] correct ce_compare_data() in a middle of a merge
  2016-07-08 20:50                         ` Junio C Hamano
@ 2016-07-11 20:07                           ` Junio C Hamano
  2016-07-12  2:23                             ` Torsten Bögershausen
  0 siblings, 1 reply; 24+ messages in thread
From: Junio C Hamano @ 2016-07-11 20:07 UTC (permalink / raw)
  To: Torsten Bögershausen; +Cc: git

Junio C Hamano <gitster@pobox.com> writes:

> Subject: [PATCH] merge: avoid "safer crlf" during recording of merge results
> ...
> We can work this around by not refreshing the new cache entry in
> make_cache_entry() called by add_cacheinfo().  After add_cacheinfo()
> adds the new entry, we can call refresh_cache_entry() on that,
> knowing that addition of this new cache entry would have removed the
> stale cache entries that had CRLF in stage #2 that were carried over
> before the renormalizing merge started and will not interfere with
> the correct recording of the result.
>
> The test update was taken from a series by Torsten Bögershausen
> that attempted to fix this with a different approach (which was a
> lot more intrusive).
>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---

How do things look at this point?  This version is what I ended up
queuing in 'pu', but I took your "Thanks" in $gmane/299120 to only
mean "Thanks for feeding some ideas to help me move forward", not
necessarily "Tnanks that looks like the right approach." yet, so
right now both topics are stalled and waiting for an action from
you.

Thanks.

>  cache.h                    |  1 +
>  merge-recursive.c          | 17 ++++++++++++----
>  read-cache.c               |  5 +----
>  t/t6038-merge-text-auto.sh | 51 +++++++++++++++++++++++++---------------------
>  4 files changed, 43 insertions(+), 31 deletions(-)

[no comment below this line; the contents kept as reference]

> diff --git a/cache.h b/cache.h
> index b829410..b33cb54 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -632,6 +632,7 @@ extern void fill_stat_cache_info(struct cache_entry *ce, struct stat *st);
>  #define REFRESH_IGNORE_SUBMODULES	0x0010	/* ignore submodules */
>  #define REFRESH_IN_PORCELAIN	0x0020	/* user friendly output, not "needs update" */
>  extern int refresh_index(struct index_state *, unsigned int flags, const struct pathspec *pathspec, char *seen, const char *header_msg);
> +extern struct cache_entry *refresh_cache_entry(struct cache_entry *, unsigned int);
>  
>  extern void update_index_if_able(struct index_state *, struct lock_file *);
>  
> diff --git a/merge-recursive.c b/merge-recursive.c
> index b880ae5..de37e51 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -202,12 +202,21 @@ static int add_cacheinfo(unsigned int mode, const unsigned char *sha1,
>  		const char *path, int stage, int refresh, int options)
>  {
>  	struct cache_entry *ce;
> -	ce = make_cache_entry(mode, sha1 ? sha1 : null_sha1, path, stage,
> -			      (refresh ? (CE_MATCH_REFRESH |
> -					  CE_MATCH_IGNORE_MISSING) : 0 ));
> +	int ret;
> +
> +	ce = make_cache_entry(mode, sha1 ? sha1 : null_sha1, path, stage, 0);
>  	if (!ce)
>  		return error(_("addinfo_cache failed for path '%s'"), path);
> -	return add_cache_entry(ce, options);
> +
> +	ret = add_cache_entry(ce, options);
> +	if (refresh) {
> +		struct cache_entry *nce;
> +
> +		nce = refresh_cache_entry(ce, CE_MATCH_REFRESH | CE_MATCH_IGNORE_MISSING);
> +		if (nce != ce)
> +			ret = add_cache_entry(nce, options);
> +	}
> +	return ret;
>  }
>  
>  static void init_tree_desc_from_tree(struct tree_desc *desc, struct tree *tree)
> diff --git a/read-cache.c b/read-cache.c
> index d9fb78b..6af409a 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -19,9 +19,6 @@
>  #include "split-index.h"
>  #include "utf8.h"
>  
> -static struct cache_entry *refresh_cache_entry(struct cache_entry *ce,
> -					       unsigned int options);
> -
>  /* Mask for the name length in ce_flags in the on-disk index */
>  
>  #define CE_NAMEMASK  (0x0fff)
> @@ -1254,7 +1251,7 @@ int refresh_index(struct index_state *istate, unsigned int flags,
>  	return has_errors;
>  }
>  
> -static struct cache_entry *refresh_cache_entry(struct cache_entry *ce,
> +struct cache_entry *refresh_cache_entry(struct cache_entry *ce,
>  					       unsigned int options)
>  {
>  	return refresh_cache_ent(&the_index, ce, options, NULL, NULL);
> diff --git a/t/t6038-merge-text-auto.sh b/t/t6038-merge-text-auto.sh
> index 33b77ee..5e8d5fa 100755
> --- a/t/t6038-merge-text-auto.sh
> +++ b/t/t6038-merge-text-auto.sh
> @@ -91,16 +91,13 @@ test_expect_success 'Merge after setting text=auto' '
>  	compare_files expected file
>  '
>  
> -test_expect_success 'Merge addition of text=auto' '
> +test_expect_success 'Merge addition of text=auto eol=LF' '
> +	git config core.eol lf &&
>  	cat <<-\EOF >expected &&
>  	first line
>  	same line
>  	EOF
>  
> -	if test_have_prereq NATIVE_CRLF; then
> -		append_cr <expected >expected.temp &&
> -		mv expected.temp expected
> -	fi &&
>  	git config merge.renormalize true &&
>  	git rm -fr . &&
>  	rm -f .gitattributes &&
> @@ -109,17 +106,31 @@ test_expect_success 'Merge addition of text=auto' '
>  	compare_files  expected file
>  '
>  
> +test_expect_success 'Merge addition of text=auto eol=CRLF' '
> +	git config core.eol crlf &&
> +	cat <<-\EOF >expected &&
> +	first line
> +	same line
> +	EOF
> +
> +	append_cr <expected >expected.temp &&
> +	mv expected.temp expected &&
> +	git config merge.renormalize true &&
> +	git rm -fr . &&
> +	rm -f .gitattributes &&
> +	git reset --hard b &&
> +	echo >&2 "After git reset --hard b" &&
> +	git ls-files -s --eol >&2 &&
> +	git merge a &&
> +	compare_files  expected file
> +'
> +
>  test_expect_success 'Detect CRLF/LF conflict after setting text=auto' '
> +	git config core.eol native &&
>  	echo "<<<<<<<" >expected &&
> -	if test_have_prereq NATIVE_CRLF; then
> -		echo first line | append_cr >>expected &&
> -		echo same line | append_cr >>expected &&
> -		echo ======= | append_cr >>expected
> -	else
> -		echo first line >>expected &&
> -		echo same line >>expected &&
> -		echo ======= >>expected
> -	fi &&
> +	echo first line >>expected &&
> +	echo same line >>expected &&
> +	echo ======= >>expected &&
>  	echo first line | append_cr >>expected &&
>  	echo same line | append_cr >>expected &&
>  	echo ">>>>>>>" >>expected &&
> @@ -135,15 +146,9 @@ test_expect_success 'Detect LF/CRLF conflict from addition of text=auto' '
>  	echo "<<<<<<<" >expected &&
>  	echo first line | append_cr >>expected &&
>  	echo same line | append_cr >>expected &&
> -	if test_have_prereq NATIVE_CRLF; then
> -		echo ======= | append_cr >>expected &&
> -		echo first line | append_cr >>expected &&
> -		echo same line | append_cr >>expected
> -	else
> -		echo ======= >>expected &&
> -		echo first line >>expected &&
> -		echo same line >>expected
> -	fi &&
> +	echo ======= >>expected &&
> +	echo first line >>expected &&
> +	echo same line >>expected &&
>  	echo ">>>>>>>" >>expected &&
>  	git config merge.renormalize false &&
>  	rm -f .gitattributes &&

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 3/3] correct ce_compare_data() in a middle of a merge
  2016-07-11 20:07                           ` Junio C Hamano
@ 2016-07-12  2:23                             ` Torsten Bögershausen
  2016-07-12 19:54                               ` Junio C Hamano
  0 siblings, 1 reply; 24+ messages in thread
From: Torsten Bögershausen @ 2016-07-12  2:23 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

>
> How do things look at this point?  This version is what I ended up
> queuing in 'pu', but I took your "Thanks" in $gmane/299120 to only
> mean "Thanks for feeding some ideas to help me move forward", not
> necessarily "Tnanks that looks like the right approach." yet, so
> right now both topics are stalled and waiting for an action from
> you.
Yes, the code looks good to me.
And the commit message does explain what is going on.

For my taste, these 3 lines don't explain too much,may be remove them ?
 > The test update was taken from a series by Torsten Bögershausen
 > that attempted to fix this with a different approach (which was a
 > lot more intrusive).
So thanks for your efforts, ack from my side.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 3/3] correct ce_compare_data() in a middle of a merge
  2016-07-12  2:23                             ` Torsten Bögershausen
@ 2016-07-12 19:54                               ` Junio C Hamano
  0 siblings, 0 replies; 24+ messages in thread
From: Junio C Hamano @ 2016-07-12 19:54 UTC (permalink / raw)
  To: Torsten Bögershausen; +Cc: git

Torsten Bögershausen <tboegi@web.de> writes:

>> How do things look at this point?  This version is what I ended up
>> queuing in 'pu', but I took your "Thanks" in $gmane/299120 to only
>> mean "Thanks for feeding some ideas to help me move forward", not
>> necessarily "Tnanks that looks like the right approach." yet, so
>> right now both topics are stalled and waiting for an action from
>> you.
> Yes, the code looks good to me.
> And the commit message does explain what is going on.
>
> For my taste, these 3 lines don't explain too much,may be remove them ?
>> The test update was taken from a series by Torsten Bögershausen
>> that attempted to fix this with a different approach (which was a
>> lot more intrusive).

OK. I wanted to make sure the resulting log message not only gives
you credit for the test portion of the change, but also for the fact
that you thought long and hard about the issue.  I'll tone it down
by removing "(which was...)" part.

> So thanks for your efforts, ack from my side.

Thanks.

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2016-07-12 19:55 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-06-27 23:21 What's cooking in git.git (Jun 2016, #09; Mon, 27) Junio C Hamano
2016-06-28  8:01 ` [PATCH v3 0/3] unified auto CRLF handling, V3 tboegi
2016-06-28  8:01 ` [PATCH v3 1/3] convert: unify the "auto" handling of CRLF tboegi
2016-06-28  8:01 ` [PATCH v3 2/3] read-cache: factor out get_sha1_from_index() helper tboegi
2016-06-28  8:01 ` [PATCH v3 3/3] correct ce_compare_data() in a middle of a merge tboegi
2016-06-29 16:14   ` Junio C Hamano
2016-06-30 16:52     ` Torsten Bögershausen
2016-07-01 22:11       ` Junio C Hamano
2016-07-02 18:41         ` Torsten Bögershausen
2016-07-06 14:57           ` Junio C Hamano
2016-07-07 17:16             ` Torsten Bögershausen
2016-07-07 18:43               ` Junio C Hamano
2016-07-07 22:19                 ` Junio C Hamano
2016-07-08  7:52                 ` Torsten Bögershausen
2016-07-08 16:36                   ` Junio C Hamano
2016-07-08 17:13                     ` Torsten Bögershausen
2016-07-08 17:25                       ` Junio C Hamano
2016-07-08 17:59                         ` Junio C Hamano
2016-07-08 19:01                       ` Junio C Hamano
2016-07-08 20:50                         ` Junio C Hamano
2016-07-11 20:07                           ` Junio C Hamano
2016-07-12  2:23                             ` Torsten Bögershausen
2016-07-12 19:54                               ` Junio C Hamano
2016-07-08 15:00                 ` Torsten Bögershausen

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).