git@vger.kernel.org mailing list mirror (one of many)
From: Christian Couder <christian.couder@gmail.com>
To: git@vger.kernel.org, Christian Couder <christian.couder@gmail.com>
Cc: "Junio C Hamano" <gitster@pobox.com>, "Jeff King" <peff@peff.net>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Karsten Blees" <karsten.blees@gmail.com>,
	"Nguyen Thai Ngoc Duy" <pclouds@gmail.com>,
	"Stefan Beller" <sbeller@google.com>,
	"Matthieu Moy" <Matthieu.Moy@grenoble-inp.fr>,
	"Eric Sunshine" <sunshine@sunshineco.com>,
	"Ramsay Jones" <ramsay@ramsayjones.plus.com>,
	"Christian Couder" <chriscool@tuxfamily.org>
Subject: [PATCH v6 00/44] libify apply and use lib in am, part 2
Date: Fri, 10 Jun 2016 22:10:34 +0200
Message-ID: <20160610201118.13813-1-chriscool@tuxfamily.org>

Goal
~~~~

This is a patch series about libifying `git apply` functionality, and
using this libified functionality in `git am`, so that no `git apply`
process is spawned anymore. This makes `git am` significantly faster, so
`git rebase`, when it uses the am backend, is also significantly
faster.

Previous discussions and patch series
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This was initially discussed in the following thread:

  http://thread.gmane.org/gmane.comp.version-control.git/287236/

Then the following patch series were sent:

RFC: http://thread.gmane.org/gmane.comp.version-control.git/288489/
v1: http://thread.gmane.org/gmane.comp.version-control.git/292324/
v2: http://thread.gmane.org/gmane.comp.version-control.git/294248/
v3: http://thread.gmane.org/gmane.comp.version-control.git/295429/
v4: http://thread.gmane.org/gmane.comp.version-control.git/296350/
v5: http://thread.gmane.org/gmane.comp.version-control.git/296490/

High-level view of the patches in the series
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This new patch series is built on top of the above previous work.

More precisely, this is "part 2" of the full patch series which is
built on top of the "part 1" of the full patch series. And as the
"part 1" is now in "next", this "part 2" is built on top of "next".

  - Patches 01/44 to 33/44 were in v1 and v2.

They finish libifying the apply functionality that was in
builtin/apply.c and move it into apply.{c,h}. And they use this
libified functionality in git am so that it doesn't launch git apply
processes any more.

Following great suggestions from Eric and Junio, there are some
changes in these patches to improve on v2:

      - Many commit messages for patches that make a function return
        -1 were simplified by removing "by using error()".

      - The 'struct lockfile' instance should now be managed properly,
        as rollback_lock_file() is called in all error paths.

      - The patch that added calls to rollback_lock_file() has been
        squashed into the patch that makes apply_all_patches() return
        -1 on error. The resulting patch is 13/44.

      - 'struct apply_state' is now moved to apply.h at the beginning
        of this series.

      - Some useless braces were removed and the commit message was
        fixed in patch 05/44.

      - error_errno() is now used instead of error() in patch 03/44.
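
As a rough illustration of the pattern these patches follow (return -1
instead of die()ing, and roll back the lock in every error path), here
is a minimal standalone sketch. It does not use the real Git API: all
demo_* names and 'struct demo_lock' are hypothetical stand-ins for
error(), 'struct lockfile', rollback_lock_file() and
apply_all_patches().

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for a lock; this is NOT git's 'struct lockfile'. */
struct demo_lock {
	int held;
	int commits;
	int rollbacks;
};

/* Hypothetical stand-in for error(): print a message and return -1. */
static int demo_error(const char *fmt, ...)
{
	va_list ap;

	va_start(ap, fmt);
	fputs("error: ", stderr);
	vfprintf(stderr, fmt, ap);
	fputc('\n', stderr);
	va_end(ap);
	return -1;
}

static void demo_take_lock(struct demo_lock *lock)
{
	lock->held = 1;
}

static void demo_commit_lock(struct demo_lock *lock)
{
	lock->held = 0;
	lock->commits++;
}

static void demo_rollback_lock(struct demo_lock *lock)
{
	lock->held = 0;
	lock->rollbacks++;
}

/* Pretend to apply one patch; negative ids simulate a failure. */
static int demo_apply_one(int patch_id)
{
	if (patch_id < 0)
		return demo_error("cannot apply patch %d", patch_id);
	return 0;
}

/* Library-style function: never exits, always releases the lock. */
static int demo_apply_all(struct demo_lock *lock, const int *ids, int nr)
{
	int i;

	assert(nr >= 0);
	demo_take_lock(lock);
	for (i = 0; i < nr; i++)
		if (demo_apply_one(ids[i]) < 0)
			goto rollback;
	demo_commit_lock(lock);
	return 0;

rollback:
	demo_rollback_lock(lock);
	return -1;
}
```

The caller, not the library code, then decides what a -1 means: a
command like `git apply` could still die, while a caller like `git am`
could carry on.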

  - Patches 34/44 to 43/44 were new in v2.

They implement a way to make the libified apply silent. It is a new
feature in the libified apply functionality.

This could be in a separate series, but unfortunately using the
libified apply in "git am" unmasks the fact that "git am", ever since
it was a shell script, has been silencing the apply functionality by
redirecting file descriptors to /dev/null, and it looks like this is
not acceptable in C.

I am not yet sure that "be_silent" is a good name for the new
variable added by these patches.
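
To make the idea concrete, here is a minimal standalone sketch of
silencing through a replaceable error routine rather than through file
descriptor redirection. It is only an illustration under assumptions:
the demo_* functions and 'struct demo_apply_state' are hypothetical and
merely mimic the shape of the error routine getter/setter and of the
"be_silent" variable named in this series.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

typedef void (*demo_report_fn)(const char *fmt, va_list ap);

/* Counts messages that were actually written; only for the demo. */
static int demo_printed;

static void demo_report_stderr(const char *fmt, va_list ap)
{
	fputs("error: ", stderr);
	vfprintf(stderr, fmt, ap);
	fputc('\n', stderr);
	demo_printed++;
}

/* A routine that swallows the message instead of printing it. */
static void demo_report_mute(const char *fmt, va_list ap)
{
	(void)fmt;
	(void)ap;
}

static demo_report_fn demo_error_routine = demo_report_stderr;

static demo_report_fn demo_get_error_routine(void)
{
	return demo_error_routine;
}

static void demo_set_error_routine(demo_report_fn fn)
{
	demo_error_routine = fn;
}

static int demo_error(const char *fmt, ...)
{
	va_list ap;

	va_start(ap, fmt);
	demo_error_routine(fmt, ap);
	va_end(ap);
	return -1;
}

/* Hypothetical, reduced version of an apply state with a silence flag. */
struct demo_apply_state {
	int be_silent;
};

static int demo_apply(struct demo_apply_state *state, int should_fail)
{
	demo_report_fn saved = demo_get_error_routine();
	int res = 0;

	if (state->be_silent)
		demo_set_error_routine(demo_report_mute);
	if (should_fail)
		res = demo_error("patch does not apply");
	/* Restore the caller's routine whatever happened. */
	demo_set_error_routine(saved);
	return res;
}
```

The return value still tells the caller that applying failed; only the
message is suppressed, and the previous routine is restored before
returning so that later errors are reported as usual.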

Patch 43/44 that adds --silent to `git apply` should probably be
discarded. I plan to do it in the next version.

I had planned to perhaps add tests for this new feature, but if 43/44
is discarded it may not be needed anymore.

  - Patch 44/44 is new.

It replaces some calls to error() with calls to error_errno().

General comments
~~~~~~~~~~~~~~~~

Sorry if this patch series is still long. I can split it into two or
more series if it is preferred.

I can also send diffs between this version and the previous one, but
for now I'd rather not send them in this email, as it would make it
very long.

The benefits are not just related to not creating new processes. When
`git am` launched a `git apply` process, this new process had to read
the index from disk. Then after the `git apply` process had terminated,
`git am` dropped its index and read the index from disk to get the
index that had been modified by the `git apply` process. This was
inefficient and also prevented the split-index mechanism from providing
many performance benefits.

Using this series as rebase material, Duy explains it like this:

 > Without the series, the picture is not so surprising. We run git-apply
 > 80+ times, each consists of this sequence
 >
 > read index
 > write index (cache tree updates only)
 > read index again
 > optionally initialize name hash (when new entries are added, I guess)
 > read packed-refs
 > write index
 >
 > With this series, we run a single git-apply which does
 >
 > read index (and sharedindex too if in split-index mode)
 > initialize name hash
 > write index 80+ times

(See: http://thread.gmane.org/gmane.comp.version-control.git/292324/focus=292460)
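
The sequences quoted above can be turned into a back-of-the-envelope
count of index reads and writes. This is only a sketch: the demo_*
names are hypothetical, and the model counts nothing but the "read
index" and "write index" steps from the quote (it ignores the
packed-refs read, the name hash initialization and the shared index).

```c
#include <assert.h>

struct demo_index_io {
	long reads;
	long writes;
};

/*
 * One git-apply run per patch: each run reads the index twice and
 * writes it twice, following the first sequence in the quote.
 */
static struct demo_index_io demo_io_spawned(long nr_patches)
{
	struct demo_index_io io;

	assert(nr_patches >= 0);
	io.reads = 2 * nr_patches;
	io.writes = 2 * nr_patches;
	return io;
}

/*
 * A single in-process apply: the index is read once, then written
 * once per patch, following the second sequence in the quote.
 */
static struct demo_index_io demo_io_in_process(long nr_patches)
{
	struct demo_index_io io;

	assert(nr_patches >= 0);
	io.reads = nr_patches > 0 ? 1 : 0;
	io.writes = nr_patches;
	return io;
}
```

With 80 patches as an illustrative value, this model gives 160 reads
and 160 writes when spawning, against 1 read and 80 writes in process.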

Links
~~~~~

This patch series is available here:

https://github.com/chriscool/git/commits/libify-apply-use-in-am65

The previous versions are available there:

v1: https://github.com/chriscool/git/commits/libify-apply-use-in-am25 
v2: https://github.com/chriscool/git/commits/libify-apply-use-in-am54

Performance numbers
~~~~~~~~~~~~~~~~~~~

Tests have been performed only on Linux, using a very early version of
this series. It could be interesting to test on other platforms,
especially Windows and perhaps OS X too.

  - Around mid-April, Ævar did a huge rebase of many hundreds of
    commits on the kernel with untracked cache.

command: git rebase --onto 1993b17 52bef0c 29dde7c

Vanilla "next" without split index:                1m54.953s
Vanilla "next" with split index:                   1m22.476s
This series on top of "next" without split index:  1m12.034s
This series on top of "next" with split index:     0m15.678s

Ævar used his Debian laptop with SSD.

  - Around mid April I tested rebasing 13 commits in Booking.com's
    monorepo on a Red Hat 6.5 server with split-index and
    GIT_TRACE_PERFORMANCE=1.

With Git v2.8.0, the rebase took 6.375888383 s, with the git am
command launched by the rebase command taking 3.705677431 s.

With this series on top of next, the rebase took 3.044529494 s, with
the git am command launched by the rebase command taking 0.583521168
s.
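
For readers who want the ratios rather than the raw timings, the
numbers above can be fed into two small helpers. This is just
arithmetic on the reported figures, and the demo_* names are
hypothetical.

```c
#include <assert.h>

/* Convert a "XmY.ZZZs" style timing into seconds. */
static double demo_seconds(int minutes, double seconds)
{
	assert(minutes >= 0 && seconds >= 0.0);
	return minutes * 60.0 + seconds;
}

/* How many times faster "after" is compared to "before". */
static double demo_speedup(double before, double after)
{
	assert(after > 0.0);
	return before / after;
}
```

For example, demo_speedup(demo_seconds(1, 54.953), demo_seconds(0,
15.678)) is roughly 7.33 for the kernel rebase (vanilla "next" without
split index against this series with split index), and about 1.6
without split index on both sides. For the 13 commit rebase,
demo_speedup(6.375888383, 3.044529494) is about 2.1 for the whole
rebase and demo_speedup(3.705677431, 0.583521168) is roughly 6.35 for
the git am step alone.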

Christian Couder (44):
  apply: move 'struct apply_state' to apply.h
  builtin/apply: make apply_patch() return -1 instead of die()ing
  builtin/apply: read_patch_file() return -1 instead of die()ing
  builtin/apply: make find_header() return -1 instead of die()ing
  builtin/apply: make parse_chunk() return a negative integer on error
  builtin/apply: make parse_single_patch() return -1 on error
  builtin/apply: make parse_whitespace_option() return -1 instead of
    die()ing
  builtin/apply: make parse_ignorewhitespace_option() return -1 instead
    of die()ing
  builtin/apply: move init_apply_state() to apply.c
  apply: make init_apply_state() return -1 instead of exit()ing
  builtin/apply: make check_apply_state() return -1 instead of die()ing
  builtin/apply: move check_apply_state() to apply.c
  builtin/apply: make apply_all_patches() return -1 on error
  builtin/apply: make parse_traditional_patch() return -1 on error
  builtin/apply: make gitdiff_*() return 1 at end of header
  builtin/apply: make gitdiff_*() return -1 on error
  builtin/apply: change die_on_unsafe_path() to check_unsafe_path()
  builtin/apply: make build_fake_ancestor() return -1 on error
  builtin/apply: make remove_file() return -1 on error
  builtin/apply: make add_conflicted_stages_file() return -1 on error
  builtin/apply: make add_index_file() return -1 on error
  builtin/apply: make create_file() return -1 on error
  builtin/apply: make write_out_one_result() return -1 on error
  builtin/apply: make write_out_results() return -1 on error
  builtin/apply: make try_create_file() return -1 on error
  builtin/apply: make create_one_file() return -1 on error
  builtin/apply: rename option parsing functions
  apply: rename and move opt constants to apply.h
  Move libified code from builtin/apply.c to apply.{c,h}
  apply: make some parsing functions static again
  run-command: make dup_devnull() non static
  environment: add set_index_file()
  builtin/am: use apply api in run_apply()
  write_or_die: use warning() instead of fprintf(stderr, ...)
  apply: add 'be_silent' variable to 'struct apply_state'
  apply: make 'be_silent' incompatible with 'apply_verbosely'
  apply: don't print on stdout when be_silent is set
  usage: add set_warn_routine()
  usage: add get_error_routine() and get_warn_routine()
  apply: change error_routine when be_silent is set
  am: use be_silent in 'struct apply_state' to shut up applying patches
  run-command: make dup_devnull() static again
  builtin/apply: add a cli option for be_silent
  apply: use error_errno() where possible

 Makefile               |    1 +
 apply.c                | 4868 ++++++++++++++++++++++++++++++++++++++++++++++++
 apply.h                |  133 ++
 builtin/am.c           |   91 +-
 builtin/apply.c        | 4815 +----------------------------------------------
 cache.h                |    1 +
 environment.c          |   10 +
 git-compat-util.h      |    3 +
 run-command.c          |    2 +-
 t/t4012-diff-binary.sh |    4 +-
 t/t4254-am-corrupt.sh  |    2 +-
 usage.c                |   15 +
 write_or_die.c         |    6 +-
 13 files changed, 5132 insertions(+), 4819 deletions(-)
 create mode 100644 apply.c
 create mode 100644 apply.h

-- 
2.9.0.rc2.362.g3cd93d0

Thread overview: 50+ messages
2016-06-10 20:10 Christian Couder [this message]
2016-06-10 20:10 ` [PATCH v6 01/44] apply: move 'struct apply_state' to apply.h Christian Couder
2016-06-10 20:10 ` [PATCH v6 02/44] builtin/apply: make apply_patch() return -1 instead of die()ing Christian Couder
2016-06-10 20:10 ` [PATCH v6 03/44] builtin/apply: read_patch_file() " Christian Couder
2016-06-10 20:10 ` [PATCH v6 04/44] builtin/apply: make find_header() " Christian Couder
2016-06-10 20:10 ` [PATCH v6 05/44] builtin/apply: make parse_chunk() return a negative integer on error Christian Couder
2016-06-10 20:10 ` [PATCH v6 06/44] builtin/apply: make parse_single_patch() return -1 " Christian Couder
2016-06-10 20:10 ` [PATCH v6 07/44] builtin/apply: make parse_whitespace_option() return -1 instead of die()ing Christian Couder
2016-06-10 20:10 ` [PATCH v6 08/44] builtin/apply: make parse_ignorewhitespace_option() " Christian Couder
2016-06-10 20:10 ` [PATCH v6 09/44] builtin/apply: move init_apply_state() to apply.c Christian Couder
2016-06-10 20:10 ` [PATCH v6 10/44] apply: make init_apply_state() return -1 instead of exit()ing Christian Couder
2016-06-10 20:10 ` [PATCH v6 11/44] builtin/apply: make check_apply_state() return -1 instead of die()ing Christian Couder
2016-06-10 20:10 ` [PATCH v6 12/44] builtin/apply: move check_apply_state() to apply.c Christian Couder
2016-06-10 20:10 ` [PATCH v6 13/44] builtin/apply: make apply_all_patches() return -1 on error Christian Couder
2016-06-10 20:10 ` [PATCH v6 14/44] builtin/apply: make parse_traditional_patch() " Christian Couder
2016-06-10 20:10 ` [PATCH v6 15/44] builtin/apply: make gitdiff_*() return 1 at end of header Christian Couder
2016-06-10 20:10 ` [PATCH v6 16/44] builtin/apply: make gitdiff_*() return -1 on error Christian Couder
2016-06-10 20:10 ` [PATCH v6 17/44] builtin/apply: change die_on_unsafe_path() to check_unsafe_path() Christian Couder
2016-06-10 20:10 ` [PATCH v6 18/44] builtin/apply: make build_fake_ancestor() return -1 on error Christian Couder
2016-06-10 20:10 ` [PATCH v6 19/44] builtin/apply: make remove_file() " Christian Couder
2016-06-10 20:10 ` [PATCH v6 20/44] builtin/apply: make add_conflicted_stages_file() " Christian Couder
2016-06-10 20:10 ` [PATCH v6 21/44] builtin/apply: make add_index_file() " Christian Couder
2016-06-10 20:10 ` [PATCH v6 22/44] builtin/apply: make create_file() " Christian Couder
2016-06-10 20:10 ` [PATCH v6 23/44] builtin/apply: make write_out_one_result() " Christian Couder
2016-06-10 20:10 ` [PATCH v6 24/44] builtin/apply: make write_out_results() " Christian Couder
2016-06-10 20:10 ` [PATCH v6 25/44] builtin/apply: make try_create_file() " Christian Couder
2016-06-10 20:11 ` [PATCH v6 26/44] builtin/apply: make create_one_file() " Christian Couder
2016-06-10 20:11 ` [PATCH v6 27/44] builtin/apply: rename option parsing functions Christian Couder
2016-06-10 20:11 ` [PATCH v6 28/44] apply: rename and move opt constants to apply.h Christian Couder
2016-06-10 20:11 ` [PATCH v6 30/44] apply: make some parsing functions static again Christian Couder
2016-06-10 20:11 ` [PATCH v6 31/44] run-command: make dup_devnull() non static Christian Couder
2016-06-11  8:17   ` Johannes Sixt
2016-06-11 10:18     ` Christian Couder
2016-06-10 20:11 ` [PATCH v6 32/44] environment: add set_index_file() Christian Couder
2016-06-10 20:11 ` [PATCH v6 33/44] builtin/am: use apply api in run_apply() Christian Couder
2016-06-10 20:11 ` [PATCH v6 34/44] write_or_die: use warning() instead of fprintf(stderr, ...) Christian Couder
2016-06-10 20:11 ` [PATCH v6 35/44] apply: add 'be_silent' variable to 'struct apply_state' Christian Couder
2016-06-10 20:11 ` [PATCH v6 36/44] apply: make 'be_silent' incompatible with 'apply_verbosely' Christian Couder
2016-06-10 20:11 ` [PATCH v6 37/44] apply: don't print on stdout when be_silent is set Christian Couder
2016-06-10 20:11 ` [PATCH v6 38/44] usage: add set_warn_routine() Christian Couder
2016-06-10 20:11 ` [PATCH v6 39/44] usage: add get_error_routine() and get_warn_routine() Christian Couder
2016-06-10 20:11 ` [PATCH v6 40/44] apply: change error_routine when be_silent is set Christian Couder
2016-06-10 20:11 ` [PATCH v6 41/44] am: use be_silent in 'struct apply_state' to shut up applying patches Christian Couder
2016-06-10 22:07   ` Junio C Hamano
2016-06-11 10:07     ` Christian Couder
2016-06-10 20:11 ` [PATCH v6 42/44] run-command: make dup_devnull() static again Christian Couder
2016-06-10 20:11 ` [PATCH v6 43/44] builtin/apply: add a cli option for be_silent Christian Couder
2016-06-10 20:59   ` René Scharfe
2016-06-11 10:16     ` Christian Couder
2016-06-10 20:11 ` [PATCH v6 44/44] apply: use error_errno() where possible Christian Couder
