git@vger.kernel.org mailing list mirror (one of many)
From: Christian Couder <christian.couder@gmail.com>
To: git@vger.kernel.org
Cc: "Junio C Hamano" <gitster@pobox.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Nguyen Thai Ngoc Duy" <pclouds@gmail.com>,
	"Stefan Beller" <sbeller@google.com>,
	"Eric Sunshine" <sunshine@sunshineco.com>,
	"Ramsay Jones" <ramsay@ramsayjones.plus.com>,
	"Jeff King" <peff@peff.net>,
	"Karsten Blees" <karsten.blees@gmail.com>,
	"Matthieu Moy" <Matthieu.Moy@grenoble-inp.fr>,
	"Christian Couder" <chriscool@tuxfamily.org>
Subject: [PATCH v3 00/49] libify apply and use lib in am, part 1
Date: Tue, 24 May 2016 10:10:37 +0200
Message-ID: <20160524081126.16973-1-chriscool@tuxfamily.org>

Goal
~~~~

This is a patch series about libifying `git apply` functionality, and
using this libified functionality in `git am`, so that no `git apply`
process is spawned anymore. This makes `git am` significantly faster,
so `git rebase`, when it uses the am backend, is also significantly
faster.

Previous discussions and patch series
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This has initially been discussed in the following thread:

  http://thread.gmane.org/gmane.comp.version-control.git/287236/

Then the following patch series were sent:

RFC: http://thread.gmane.org/gmane.comp.version-control.git/288489/
v1: http://thread.gmane.org/gmane.comp.version-control.git/292324/
v2: http://thread.gmane.org/gmane.comp.version-control.git/294248/

High-level view of the patches in the series
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This new patch series is built on top of the above previous work.

It contains only patches 01/94 to 50/94 from v2, as v2 contained too
many patches and it was decided to split it.

The changes since v2 are the following:

    - Patch 48/94 (builtin/apply: rename 'prefix_' parameter to
      'prefix') was squashed into 09/94 (builtin/apply: move 'state'
      init into init_apply_state()), as suggested by Junio.

    - Fields in 'struct apply_state' have been reorganized, as
      suggested by Junio.

    - clear_apply_state() has been added to free 'struct apply_state'
      resources.
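
To illustrate the pattern behind these changes, here is a minimal
sketch of moving file-scope globals into a state struct that is set up
by an init function and released by a clear function. This is not the
code from builtin/apply.c: the `toy_` names, the handful of fields
chosen, their types and the default values are all assumptions made
up for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Toy stand-in for a state struct (hypothetical). The real
 * 'struct apply_state' has many more fields.
 */
struct toy_apply_state {
	const char *prefix;   /* borrowed from the caller, not owned */
	int check;            /* was a file-scope global flag */
	int apply_in_reverse; /* was a file-scope global flag */
	int p_value;          /* was a file-scope global */
	char *root;           /* owned; freed by toy_clear_apply_state() */
};

/* Give every field a known default so no state leaks between uses. */
void toy_init_apply_state(struct toy_apply_state *state, const char *prefix)
{
	assert(state != NULL);
	memset(state, 0, sizeof(*state));
	state->prefix = prefix;
	state->p_value = 1; /* assumed default for this sketch */
}

/*
 * Store an owned copy of 'root', replacing any previous value.
 * Returns 0 on success, -1 if the allocation fails.
 */
int toy_set_root(struct toy_apply_state *state, const char *root)
{
	size_t len = strlen(root) + 1;
	char *copy = malloc(len);

	if (!copy)
		return -1;
	memcpy(copy, root, len);
	free(state->root);
	state->root = copy;
	return 0;
}

/* Release owned resources and leave the struct zeroed. */
void toy_clear_apply_state(struct toy_apply_state *state)
{
	assert(state != NULL);
	free(state->root);
	memset(state, 0, sizeof(*state));
}
```

With this shape, two callers in the same process can each hold their
own state, and setting a flag in one does not affect the other, which
is not possible while the flags are globals.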

General comments
~~~~~~~~~~~~~~~~

Sorry if this patch series is still long despite being split.

I will soon send a diff between this version and the first 50 patches
of v2 as a reply to this email.

The benefits are not just related to not creating new processes. When
`git am` launched a `git apply` process, this new process had to read
the index from disk. Then after the `git apply` process had terminated,
`git am` dropped its index and read the index from disk again to get
the index that had been modified by the `git apply` process. This was
inefficient and also prevented the split-index mechanism from
providing many of its performance benefits.

Using this series as rebase material, Duy explains it like this:

 > Without the series, the picture is not so surprising. We run git-apply
 > 80+ times, each consists of this sequence
 >
 > read index
 > write index (cache tree updates only)
 > read index again
 > optionally initialize name hash (when new entries are added, I guess)
 > read packed-refs
 > write index
 >
 > With this series, we run a single git-apply which does
 >
 > read index (and sharedindex too if in split-index mode)
 > initialize name hash
 > write index 80+ times

(See: http://thread.gmane.org/gmane.comp.version-control.git/292324/focus=292460)
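
As a back-of-the-envelope reading of Duy's description, the following
sketch counts index reads and writes for a given number of patches in
both scenarios. The struct and function names are made up for this
illustration, and the model only counts the index operations listed in
the quote above; it ignores the name hash, packed-refs and the extra
index read done by `git am` itself.

```c
#include <assert.h>

/* Number of index reads and writes in one scenario (hypothetical type). */
struct index_io {
	int reads;
	int writes;
};

/*
 * One git-apply process per patch: following the quoted sequence,
 * each run reads the index twice and writes it twice.
 */
struct index_io io_with_spawned_apply(int nr_patches)
{
	struct index_io io = { 0, 0 };
	int i;

	assert(nr_patches >= 0);
	for (i = 0; i < nr_patches; i++) {
		io.reads++;  /* read index */
		io.writes++; /* write index (cache tree updates only) */
		io.reads++;  /* read index again */
		io.writes++; /* write index */
	}
	return io;
}

/*
 * Apply code running in-process: the index is read once, then
 * written once per patch.
 */
struct index_io io_with_libified_apply(int nr_patches)
{
	struct index_io io = { 0, 0 };

	assert(nr_patches >= 0);
	io.reads = 1;
	io.writes = nr_patches;
	return io;
}
```

For 80 patches this model gives 160 reads and 160 writes when a
git-apply process is run per patch, against 1 read and 80 writes with
this series.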

Links
~~~~~

This patch series is available here:

v3: https://github.com/chriscool/git/commits/libify-apply61
v2: https://github.com/chriscool/git/commits/libify-apply-use-in-am54
v1: https://github.com/chriscool/git/commits/libify-apply-use-in-am25 

Performance numbers
~~~~~~~~~~~~~~~~~~~

So far tests have only been performed on Linux, on v1, around
mid-April. It would be interesting to test on other platforms,
especially Windows and perhaps OS X too.

  - Ævar did a huge many-hundred commit rebase on the kernel with
    untracked cache.

command: git rebase --onto 1993b17 52bef0c 29dde7c

Vanilla "next" without split index:                1m54.953s
Vanilla "next" with split index:                   1m22.476s
This series on top of "next" without split index:  1m12.034s
This series on top of "next" with split index:     0m15.678s

Ævar used his Debian laptop with SSD.

  - I tested rebasing 13 commits in Booking.com's monorepo on a Red
    Hat 6.5 server with split-index and GIT_TRACE_PERFORMANCE=1.

With Git v2.8.0, the rebase took 6.375888383 s, with the git am
command launched by the rebase command taking 3.705677431 s.

With this series on top of next, the rebase took 3.044529494 s, with
the git am command launched by the rebase command taking 0.583521168
s.
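
To make the numbers above easier to compare, here is a small sketch
that turns the timings into speedup factors. The helper names are made
up for this illustration; the inputs are the timings reported above.

```c
#include <assert.h>

/* Convert a "XmY.YYYs" style timing into seconds. */
double to_seconds(int minutes, double seconds)
{
	assert(minutes >= 0 && seconds >= 0.0);
	return minutes * 60.0 + seconds;
}

/* How many times faster 'after' is compared to 'before'. */
double speedup(double before, double after)
{
	assert(after > 0.0);
	return before / after;
}
```

For example, speedup(to_seconds(1, 54.953), to_seconds(0, 15.678)) is
about 7.3 (vanilla "next" without split index against this series with
split index), and speedup(3.705677431, 0.583521168) is about 6.35 for
the git am step of the second test.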


Christian Couder (49):
  builtin/apply: make gitdiff_verify_name() return void
  builtin/apply: avoid parameter shadowing 'p_value' global
  builtin/apply: avoid parameter shadowing 'linenr' global
  builtin/apply: avoid local variable shadowing 'len' parameter
  builtin/apply: extract line_by_line_fuzzy_match() from
    match_fragment()
  builtin/apply: move 'options' variable into cmd_apply()
  builtin/apply: move 'read_stdin' global into cmd_apply()
  builtin/apply: introduce 'struct apply_state' to start libifying
  builtin/apply: move 'state' init into init_apply_state()
  builtin/apply: move 'unidiff_zero' global into 'struct apply_state'
  builtin/apply: move 'check' global into 'struct apply_state'
  builtin/apply: move 'check_index' global into 'struct apply_state'
  builtin/apply: move 'apply_in_reverse' global into 'struct
    apply_state'
  builtin/apply: move 'apply_with_reject' global into 'struct
    apply_state'
  builtin/apply: move 'apply_verbosely' global into 'struct apply_state'
  builtin/apply: move 'update_index' global into 'struct apply_state'
  builtin/apply: move 'allow_overlap' global into 'struct apply_state'
  builtin/apply: move 'cached' global into 'struct apply_state'
  builtin/apply: move 'diffstat' global into 'struct apply_state'
  builtin/apply: move 'numstat' global into 'struct apply_state'
  builtin/apply: move 'summary' global into 'struct apply_state'
  builtin/apply: move 'threeway' global into 'struct apply_state'
  builtin/apply: move 'no_add' global into 'struct apply_state'
  builtin/apply: move 'unsafe_paths' global into 'struct apply_state'
  builtin/apply: move 'line_termination' global into 'struct
    apply_state'
  builtin/apply: move 'fake_ancestor' global into 'struct apply_state'
  builtin/apply: move 'p_context' global into 'struct apply_state'
  builtin/apply: move 'apply' global into 'struct apply_state'
  builtin/apply: move 'patch_input_file' global into 'struct
    apply_state'
  builtin/apply: move 'limit_by_name' global into 'struct apply_state'
  builtin/apply: move 'has_include' global into 'struct apply_state'
  builtin/apply: move 'p_value' global into 'struct apply_state'
  builtin/apply: move 'p_value_known' global into 'struct apply_state'
  builtin/apply: move 'root' global into 'struct apply_state'
  builtin/apply: move 'whitespace_error' global into 'struct
    apply_state'
  builtin/apply: move 'whitespace_option' into 'struct apply_state'
  builtin/apply: remove whitespace_option arg from
    set_default_whitespace_mode()
  builtin/apply: move 'squelch_whitespace_errors' into 'struct
    apply_state'
  builtin/apply: move 'applied_after_fixing_ws' into 'struct
    apply_state'
  builtin/apply: move 'ws_error_action' into 'struct apply_state'
  builtin/apply: move 'ws_ignore_action' into 'struct apply_state'
  builtin/apply: move 'max_change' and 'max_len' into 'struct
    apply_state'
  builtin/apply: move 'state_linenr' global into 'struct apply_state'
  builtin/apply: move 'fn_table' global into 'struct apply_state'
  builtin/apply: move 'symlink_changes' global into 'struct apply_state'
  builtin/apply: move 'state' check into check_apply_state()
  builtin/apply: move applying patches into apply_all_patches()
  builtin/apply: move 'lock_file' global into 'struct apply_state'
  builtin/apply: move 'newfd' global into 'struct apply_state'

 builtin/apply.c | 1432 +++++++++++++++++++++++++++++++------------------------
 1 file changed, 821 insertions(+), 611 deletions(-)

-- 
2.8.3.443.gaeee61e


