From: Siddharth Asthana <siddharthasthana31@gmail.com>
To: git@vger.kernel.org
Cc: christian.couder@gmail.com, gitster@pobox.com,
	johncai86@gmail.com,
	Siddharth Asthana <siddharthasthana31@gmail.com>
Subject: [PATCH v2 0/2] Add mailmap mechanism in cat-file options
Date: Mon, 26 Sep 2022 16:23:41 +0530
Message-ID: <20220926105343.233296-1-siddharthasthana31@gmail.com>
In-Reply-To: <20220916205946.178925-1-siddharthasthana31@gmail.com>

Thanks a lot Junio for the review :) I have made the suggested changes.

= Description

At present, the `git-cat-file` command with the `--batch-check` and `-s`
options does not complain when the `--use-mailmap` option is given; the
latter option is just ignored. Instead, for commit/tag objects, the
command should compute the size of the object after replacing the idents
and report it. So, this patch series makes the `-s` and `--batch-check`
options of `git-cat-file` honor the mailmap when used with the
`--use-mailmap` option.
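
To illustrate why the reported size changes once idents are replaced,
here is a toy sketch. It does not call git at all: the commit content,
the names, and the mailmap entry are made up for illustration, and `sed`
merely stands in for the ident replacement that the mailmap mechanism
performs.

```shell
#!/bin/sh
# Toy illustration only; this is not how git implements mailmap support.

# Hypothetical raw content of a commit object (headers, blank line, message).
commit='tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
author Orig <orig@example.com> 1664189621 +0530
committer Orig <orig@example.com> 1664189621 +0530

initial'

# Hypothetical .mailmap entry being simulated:
#   New Name <new@example.com> <orig@example.com>
mapped=$(printf '%s\n' "$commit" |
	sed 's/Orig <orig@example.com>/New Name <new@example.com>/')

# Length in bytes of each version (the content is plain ASCII).
orig_size=${#commit}
mapped_size=${#mapped}

echo "original size:      $orig_size"
echo "size after mailmap: $mapped_size"
```

The replacement ident is three bytes longer than the original one and it
appears on both the author and the committer line, so the mapped content
is six bytes larger. That difference is what `-s` and `--batch-check`
are meant to reflect when `--use-mailmap` is given.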

In this patch series we did not want to change the fact that
'%(objectsize)' always shows the size of the original object, even when
`--use-mailmap` is set. First, we have a long-term plan to unify how the
formats for `git cat-file` and other commands work. Second, existing
formats like the "pretty formats" used by `git log` have separate
placeholders for fields that respect the mailmap and fields that do not
(%an is the author name, while %aN is the author name respecting the
mailmap).

I would like to thank my mentors, Christian Couder and John Cai, for all
of their help!
Looking forward to the reviews!

= Patch Organization

- The first patch makes the `-s` option return the updated size of the
  <commit/tag> object, after replacing the idents using the mailmap
  mechanism, when combined with the `--use-mailmap` option.
- The second patch makes the `--batch-check` option return the updated
  size of the <commit/tag> object, after replacing the idents using the
  mailmap mechanism, when combined with the `--use-mailmap` option.

= Changes in v2:

- The commit messages of both patches have been improved.
- In the second patch, we were populating the `contentp` field of the
  `object_info` structure when `--batch-check` was combined with
  `--use-mailmap`. This made us read the contents of tree and blob
  objects as well, which hurt performance. We should only read the
  contents of commit and tag objects. The second patch has been updated
  to do just that.

Siddharth Asthana (2):
  cat-file: add mailmap support to -s option
  cat-file: add mailmap support to --batch-check option

 Documentation/git-cat-file.txt |  6 +++++-
 builtin/cat-file.c             | 27 +++++++++++++++++++++++++++
 t/t4203-mailmap.sh             | 32 ++++++++++++++++++++++++++++++++
 3 files changed, 64 insertions(+), 1 deletion(-)

Range-diff against v1:
1:  513ad3b5f7 < -:  ---------- doc/cat-file: allow --use-mailmap for --batch options
2:  6f3dcce9e3 ! 1:  60cf7bc28c cat-file: add mailmap support to -s option
    @@ Metadata
      ## Commit message ##
         cat-file: add mailmap support to -s option
     
    -    Using `git cat-file --use-mailmap` with `-s` option, like the following is
    -    allowed:
    +    Even though the cat-file command with `-s` option does not complain when
    +    `--use-mailmap` option is given, the latter option is ignored. Compute
    +    the size of the object after replacing the idents and report it instead.
     
    -     git cat-file --use-mailmap -s <commit/tag object sha>
    +    In order to make `-s` option honour the mailmap mechanism we have to
    +    read the contents of the commit/tag object. Make use of the call to
    +    `oid_object_info_extended()` to get the contents of the object and store
    +    in `buf`. `buf` is later freed in the function.
     
    -    The current implementation will return the same object size irrespective
    -    of the mailmap option, which is not as useful as it could be. When we
    -    use the mailmap mechanism to replace the idents, the size of the object
    -    can change and `-s` option would be more useful if it shows the size of
    -    the changed object. This patch implements that.
    -
    -    Mentored-by: Christian Couder's avatarChristian Couder <christian.couder@gmail.com>
    -    Mentored-by: John Cai's avatarJohn Cai <johncai86@gmail.com>
    +    Mentored-by: Christian Couder <christian.couder@gmail.com>
    +    Mentored-by: John Cai <johncai86@gmail.com>
         Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
     
      ## Documentation/git-cat-file.txt ##
3:  af90241d32 ! 2:  06c74dd017 cat-file: add mailmap support to --batch-check option
    @@ Metadata
      ## Commit message ##
         cat-file: add mailmap support to --batch-check option
     
    -    Using `git cat-file --use-mailmap` with --batch-check option, like the
    -    following is allowed:
    +    Even though the cat-file command with `--batch-check` option does not
    +    complain when `--use-mailmap` option is given, the latter option is
    +    ignored. Compute the size of the object after replacing the idents and
    +    report it instead.
     
    -     git cat-file --use-mailmap -batch-check
    +    In order to make `--batch-check` option honour the mailmap mechanism we
    +    have to read the contents of the commit/tag object.
     
    -    The current implementation will return the same object size irrespective
    -    of the mailmap option, which is not as useful as it could be. When we
    -    use the mailmap mechanism to replace the idents, the size of the object
    -    can change and --batch-check option would be more useful if it shows the
    -    size of the changed object. This patch implements that.
    +    There were two ways to do it:
     
    -    Mentored-by: Christian Couder's avatarChristian Couder <christian.couder@gmail.com>
    -    Mentored-by: John Cai's avatarJohn Cai <johncai86@gmail.com>
    +    1. Make two calls to `oid_object_info_extended()`. If `--use-mailmap`
    +       option is given, the first call will get us the type of the object
    +       and second call will only be made if the object type is either a
    +       commit or tag to get the contents of the object.
    +
    +    2. Make one call to `oid_object_info_extended()` to get the type of the
    +       object. Then, if the object type is either of commit or tag, make a
    +       call to `read_object_file()` to read the contents of the object.
    +
    +    I benchmarked the following command with both the above approaches and
    +    compared against the current implementation where `--use-mailmap`
    +    option is ignored:
    +
    +    `git cat-file --use-mailmap --batch-all-objects --batch-check --buffer
    +    --unordered`
    +
    +    The results can be summarized as follows:
    +                           Time (mean ± σ)
    +    default               827.7 ms ± 104.8 ms
    +    first approach        6.197 s ± 0.093 s
    +    second approach       1.975 s ± 0.217 s
    +
    +    Since, the second approach is faster than the first one, I implemented
    +    it in this patch.
    +
    +    Mentored-by: Christian Couder <christian.couder@gmail.com>
    +    Mentored-by: John Cai <johncai86@gmail.com>
         Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
     
      ## Documentation/git-cat-file.txt ##
     @@ Documentation/git-cat-file.txt: OPTIONS
    - 	with `--use-mailmap`, `--textconv` or `--filters`. In the case of `--textconv` or
    - 	`--filters` the input lines also need to specify the path, separated by whitespace.
    - 	See the `BATCH OUTPUT` section below for details.
    -+	If used with `--use-mailmap` option, will show the size of updated object after
    -+	replacing idents using the mailmap mechanism.
    + 	`--textconv` or `--filters`, in which case the input lines also
    + 	need to specify the path, separated by whitespace.  See the
    + 	section `BATCH OUTPUT` below for details.
    ++	If used with `--use-mailmap` option, will show the size of
    ++	updated object after replacing idents using the mailmap mechanism.
      
      --batch-command::
      --batch-command=<format>::
     
      ## builtin/cat-file.c ##
    -@@ builtin/cat-file.c: static void print_object_or_die(struct batch_options *opt, struct expand_data *d
    - 
    - static void print_default_format(struct strbuf *scratch, struct expand_data *data)
    - {
    -+	if (use_mailmap && (data->type == OBJ_COMMIT || data->type == OBJ_TAG)) {
    -+		size_t s = data->size;
    -+		*data->info.contentp = replace_idents_using_mailmap((char*)*data->info.contentp, &s);
    -+		data->size = cast_size_t_to_ulong(s);
    -+	}
    -+
    - 	strbuf_addf(scratch, "%s %s %"PRIuMAX"\n", oid_to_hex(&data->oid),
    - 		    type_name(data->type),
    - 		    (uintmax_t)data->size);
     @@ builtin/cat-file.c: static void batch_object_write(const char *obj_name,
    - 			       struct packed_git *pack,
    - 			       off_t offset)
    - {
    -+	void *buf = NULL;
    -+
      	if (!data->skip_object_info) {
      		int ret;
      
     +		if (use_mailmap)
    -+			data->info.contentp = &buf;
    ++			data->info.typep = &data->type;
     +
      		if (pack)
      			ret = packed_object_info(the_repository, pack, offset,
      						 &data->info);
     @@ builtin/cat-file.c: static void batch_object_write(const char *obj_name,
    - 		print_object_or_die(opt, data);
    - 		batch_write(opt, "\n", 1);
    - 	}
    + 			fflush(stdout);
    + 			return;
    + 		}
     +
    -+	free(buf);
    - }
    ++		if (use_mailmap && (data->type == OBJ_COMMIT || data->type == OBJ_TAG)) {
    ++			size_t s = data->size;
    ++			char *buf = NULL;
    ++
    ++			buf = read_object_file(&data->oid, &data->type, &data->size);
    ++			buf = replace_idents_using_mailmap(buf, &s);
    ++			data->size = cast_size_t_to_ulong(s);
    ++
    ++			free(buf);
    ++		}
    + 	}
      
    - static void batch_one_object(const char *obj_name,
    + 	strbuf_reset(scratch);
     
      ## t/t4203-mailmap.sh ##
     @@ t/t4203-mailmap.sh: test_expect_success 'git cat-file -s returns correct size with --use-mailmap' '
-- 
2.38.0.rc1.8.g9592ff2ba4

