git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Paul Tan <pyokagan@gmail.com>
To: Git List <git@vger.kernel.org>
Cc: Junio C Hamano <gitster@pobox.com>,
	Johannes Schindelin <johannes.schindelin@gmx.de>,
	Duy Nguyen <pclouds@gmail.com>,
	Stefan Beller <sbeller@google.com>,
	sam.halliday@gmail.com, Paul Tan <pyokagan@gmail.com>
Subject: [PATCH/RFC/GSoC 00/17] A barebones git-rebase in C
Date: Sat, 12 Mar 2016 18:46:20 +0800	[thread overview]
Message-ID: <1457779597-6918-1-git-send-email-pyokagan@gmail.com> (raw)

Hi all,

Last year I rewrote git-am from shell script to C. This succeeded in speeding
up a non-interactive git-rebase by 6-7x[1], which is really handly when rebasing
multiple topic branches.

[1] http://thread.gmane.org/gmane.comp.version-control.git/271967

However, it turns out that when working on a topic branch, I frequently use
interactive rebase instead to edit and squash commits. Unfortunately, as
git-rebase--interactive.sh is still a shell script, it is a bit slower (e.g.
taking a few seconds longer compared to non-interactive rebase when rebasing
big topic branches).

The situation is much worse on Windows, as from the invocation of git rebase -i,
it takes a few seconds before the editor even pops up, and the actual
rebase proceeds at a snails pace, taking around 3 minutes for a 50-patch
series, which is a huge deal-breaker since my workflow depends on frequent
commits and squashes.

As such, this year I would like to apply for GSoC to work on a rewrite of
git-rebase to C. It is slightly hefty, as there are three backends (am, merge
and interactive), along with the git-rebase.sh script.

To get a gauge of how much code is needed for the rewrite, I explored rewriting
the scripts into C, and then extracted some bits out and polished them a bit to
make a barebones git-rebase in C, creating this patch series:

[01/17] perf: introduce performance tests for git-rebase

A simple performance test for the three rebase backends so we can compare this
C version and the shell version below.

[02/17] sha1_name: implement get_oid() and friends
[03/17] builtin-rebase: implement skeletal builtin rebase
[04/17] builtin-rebase: parse rebase arguments into a common rebase_options struct
[05/17] rebase-options: implement rebase_options_load() and rebase_options_save()

The three rebase backends (am, merge, interactive) have vastly different
capabilities, so I did not try to shoehorn them into the same interface.
However, they do share a few common options and functionality, so I introduced
the common rebase-common.c library and rebase_options struct.

In the above patches we implement the essential arguments for a rebase: the
upstream, branch_name and --onto <newbase>.

[06/17] rebase-am: introduce am backend for builtin rebase

This patch implements a barebones rebase-am backend.

[07/17] rebase-common: implement refresh_and_write_cache()
[08/17] rebase-common: let refresh_and_write_cache() take a flags argument
[09/17] rebase-common: implement cache_has_unstaged_changes()
[10/17] rebase-common: implement cache_has_uncommitted_changes()
[11/17] rebase-merge: introduce merge backend for builtin rebase

These patches implement a barebones rebase-merge backend.

[12/17] rebase-todo: introduce rebase_todo_item
[13/17] rebase-todo: introduce rebase_todo_list
[14/17] status: use rebase_todo_list
[15/17] wrapper: implement append_file()
[16/17] editor: implement git_sequence_editor() and launch_sequence_editor()
[17/17] rebase-interactive: introduce interactive backend for builtin rebase

And these patches implement a barebones rebase-interactive backend.

With these patches the performance numbers when rebasing 50 commits on the
git.git repository are, on Linux,

Before patch series:

Test                               this tree
--------------------------------------------------
3400.2: rebase --onto master^      1.10(0.84+0.06)
3402.2: rebase -m --onto master^   2.38(1.38+0.13)
3404.2: rebase -i --onto master^   3.11(1.37+0.27)

After patch series:

Test                               this tree
--------------------------------------------------
3400.2: rebase --onto master^      0.74(0.51+0.08)
3402.2: rebase -m --onto master^   1.72(1.26+0.17)
3404.2: rebase -i --onto master^   1.74(1.20+0.18)

And on Windows,

Before patch series:

Test                               this tree
----------------------------------------------------
3400.2: rebase --onto master^      10.90(0.06+0.47)
3402.2: rebase -m --onto master^   86.87(0.04+0.47)
3404.2: rebase -i --onto master^   191.65(0.09+0.44)

After patch series:

Test                               this tree
---------------------------------------------------
3400.2: rebase --onto master^      6.45(0.13+0.40)
3402.2: rebase -m --onto master^   12.32(0.13+0.40)
3404.2: rebase -i --onto master^   14.16(0.15+0.40)

(Thanks to the git-am rewrite, non-interactive rebase on Windows is already
relatively fast ;-) )

So, we have around a 1.4x-1.8x speedup for Linux users, and a 1.7x-13x speedup
for Windows users. The annoying long delay before the interactive editor is
launched on Windows is gotten rid of, which I'm very happy about :-)

On the code side, we do get some nice things with a rewrite to C. For example,
we get the rebase-todo library for parsing and writing git-rebase-todo files,
which means that wt-status.c and rebase-interactive.c can share the same
parsing code. Although not in this patch series, rebase-interactive.c can also
now share the same author-script parsing and writing code from builtin/am.c as
well.

Regards,
Paul

Paul Tan (17):
  perf: introduce performance tests for git-rebase
  sha1_name: implement get_oid() and friends
  builtin-rebase: implement skeletal builtin rebase
  builtin-rebase: parse rebase arguments into a common rebase_options
    struct
  rebase-options: implement rebase_options_load() and
    rebase_options_save()
  rebase-am: introduce am backend for builtin rebase
  rebase-common: implement refresh_and_write_cache()
  rebase-common: let refresh_and_write_cache() take a flags argument
  rebase-common: implement cache_has_unstaged_changes()
  rebase-common: implement cache_has_uncommitted_changes()
  rebase-merge: introduce merge backend for builtin rebase
  rebase-todo: introduce rebase_todo_item
  rebase-todo: introduce rebase_todo_list
  status: use rebase_todo_list
  wrapper: implement append_file()
  editor: implement git_sequence_editor() and launch_sequence_editor()
  rebase-interactive: introduce interactive backend for builtin rebase

 Makefile                           |  10 +-
 builtin.h                          |   1 +
 builtin/am.c                       |  16 +-
 builtin/pull.c                     |  41 +---
 builtin/rebase.c                   | 264 ++++++++++++++++++++++++++
 cache.h                            |   8 +
 editor.c                           |  27 ++-
 git.c                              |   1 +
 rebase-am.c                        | 110 +++++++++++
 rebase-am.h                        |  22 +++
 rebase-common.c                    | 220 ++++++++++++++++++++++
 rebase-common.h                    |  48 +++++
 rebase-interactive.c               | 375 +++++++++++++++++++++++++++++++++++++
 rebase-interactive.h               |  33 ++++
 rebase-merge.c                     | 256 +++++++++++++++++++++++++
 rebase-merge.h                     |  28 +++
 rebase-todo.c                      | 251 +++++++++++++++++++++++++
 rebase-todo.h                      |  55 ++++++
 sha1_name.c                        |  30 +++
 strbuf.h                           |   1 +
 t/perf/p3400-rebase.sh             |  25 +++
 t/perf/p3402-rebase-merge.sh       |  25 +++
 t/perf/p3404-rebase-interactive.sh |  26 +++
 wrapper.c                          |  23 +++
 wt-status.c                        | 100 +++-------
 25 files changed, 1863 insertions(+), 133 deletions(-)
 create mode 100644 builtin/rebase.c
 create mode 100644 rebase-am.c
 create mode 100644 rebase-am.h
 create mode 100644 rebase-common.c
 create mode 100644 rebase-common.h
 create mode 100644 rebase-interactive.c
 create mode 100644 rebase-interactive.h
 create mode 100644 rebase-merge.c
 create mode 100644 rebase-merge.h
 create mode 100644 rebase-todo.c
 create mode 100644 rebase-todo.h
 create mode 100755 t/perf/p3400-rebase.sh
 create mode 100755 t/perf/p3402-rebase-merge.sh
 create mode 100755 t/perf/p3404-rebase-interactive.sh

-- 
2.7.0

             reply	other threads:[~2016-03-12 10:47 UTC|newest]

Thread overview: 59+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-03-12 10:46 Paul Tan [this message]
2016-03-12 10:46 ` [PATCH/RFC/GSoC 01/17] perf: introduce performance tests for git-rebase Paul Tan
2016-03-16  7:58   ` Johannes Schindelin
2016-03-16 11:51     ` Paul Tan
2016-03-16 15:59       ` Johannes Schindelin
2016-03-18 11:01         ` Thomas Gummerer
2016-03-18 16:00           ` Johannes Schindelin
2016-03-20 14:00             ` Thomas Gummerer
2016-03-21  7:54               ` Johannes Schindelin
2016-03-12 10:46 ` [PATCH/RFC/GSoC 02/17] sha1_name: implement get_oid() and friends Paul Tan
2016-03-12 10:46 ` [PATCH/RFC/GSoC 03/17] builtin-rebase: implement skeletal builtin rebase Paul Tan
2016-03-14 18:31   ` Stefan Beller
2016-03-15  8:01     ` Johannes Schindelin
2016-03-12 10:46 ` [PATCH/RFC/GSoC 04/17] builtin-rebase: parse rebase arguments into a common rebase_options struct Paul Tan
2016-03-14 20:05   ` Stefan Beller
2016-03-15 10:54   ` Johannes Schindelin
2016-03-12 10:46 ` [PATCH/RFC/GSoC 05/17] rebase-options: implement rebase_options_load() and rebase_options_save() Paul Tan
2016-03-14 20:30   ` Stefan Beller
2016-03-16  8:04     ` Johannes Schindelin
2016-03-16 12:28       ` Paul Tan
2016-03-16 17:11         ` Johannes Schindelin
2016-03-21 14:55           ` Paul Tan
2016-03-16 12:04     ` Paul Tan
2016-03-16 17:10       ` Stefan Beller
2016-03-12 10:46 ` [PATCH/RFC/GSoC 06/17] rebase-am: introduce am backend for builtin rebase Paul Tan
2016-03-16 13:21   ` Johannes Schindelin
2016-03-12 10:46 ` [PATCH/RFC/GSoC 07/17] rebase-common: implement refresh_and_write_cache() Paul Tan
2016-03-14 21:10   ` Junio C Hamano
2016-03-16 12:56     ` Paul Tan
2016-03-12 10:46 ` [PATCH/RFC/GSoC 08/17] rebase-common: let refresh_and_write_cache() take a flags argument Paul Tan
2016-03-12 10:46 ` [PATCH/RFC/GSoC 09/17] rebase-common: implement cache_has_unstaged_changes() Paul Tan
2016-03-14 20:54   ` Johannes Schindelin
2016-03-14 21:52     ` Junio C Hamano
2016-03-15 11:51       ` Johannes Schindelin
2016-03-15 11:07     ` Duy Nguyen
2016-03-15 14:15       ` Johannes Schindelin
2016-03-12 10:46 ` [PATCH/RFC/GSoC 10/17] rebase-common: implement cache_has_uncommitted_changes() Paul Tan
2016-03-12 10:46 ` [PATCH/RFC/GSoC 11/17] rebase-merge: introduce merge backend for builtin rebase Paul Tan
2016-03-12 10:46 ` [PATCH/RFC/GSoC 12/17] rebase-todo: introduce rebase_todo_item Paul Tan
2016-03-14 13:43   ` Christian Couder
2016-03-14 20:33     ` Johannes Schindelin
2016-03-16 12:54     ` Paul Tan
2016-03-16 15:55       ` Johannes Schindelin
2016-03-12 10:46 ` [PATCH/RFC/GSoC 13/17] rebase-todo: introduce rebase_todo_list Paul Tan
2016-03-12 10:46 ` [PATCH/RFC/GSoC 14/17] status: use rebase_todo_list Paul Tan
2016-03-12 10:46 ` [PATCH/RFC/GSoC 15/17] wrapper: implement append_file() Paul Tan
2016-03-12 10:46 ` [PATCH/RFC/GSoC 16/17] editor: implement git_sequence_editor() and launch_sequence_editor() Paul Tan
2016-03-15  7:00   ` Johannes Schindelin
2016-03-16 13:06     ` Paul Tan
2016-03-16 18:21       ` Johannes Schindelin
2016-03-12 10:46 ` [PATCH/RFC/GSoC 17/17] rebase-interactive: introduce interactive backend for builtin rebase Paul Tan
2016-03-15  7:57   ` Johannes Schindelin
2016-03-15 16:48     ` Paul Tan
2016-03-15 19:45       ` Johannes Schindelin
2016-03-14 12:15 ` [PATCH/RFC/GSoC 00/17] A barebones git-rebase in C Duy Nguyen
2016-03-14 17:32   ` Stefan Beller
2016-03-14 18:43   ` Junio C Hamano
2016-03-16 12:46     ` Paul Tan
2016-03-14 20:44   ` Johannes Schindelin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1457779597-6918-1-git-send-email-pyokagan@gmail.com \
    --to=pyokagan@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=johannes.schindelin@gmx.de \
    --cc=pclouds@gmail.com \
    --cc=sam.halliday@gmail.com \
    --cc=sbeller@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).