* [PATCH 0/5] [GSOC] [RFC] ref-filter: performance optimization
@ 2021-08-17 7:14 ZheNing Hu via GitGitGadget
2021-08-17 7:14 ` [PATCH 1/5] [GSOC] ref-filter: skip parse_object_buffer in some cases ZheNing Hu via GitGitGadget
` (4 more replies)
0 siblings, 5 replies; 6+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-08-17 7:14 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Christian Couder, Hariom Verma, Bagas Sanjaya,
Jeff King, Ævar Arnfjörð Bjarmason, Eric Sunshine,
Philip Oakley, ZheNing Hu
Last time I submitted a very long patch series:
https://lore.kernel.org/git/pull.1016.git.1628842990.gitgitgadget@gmail.com/
My mentor Christian suggested to split the performance optimization part
out, so this patch series used to optimize performance optimization in
ref-filter.
Changes in this patch series:
1. Skip parse_object_buffer(), which can reduce object content parsing.
2. Use linked list to record parsing results, which can reduce second
format string parsing.
3. Reuse final buffer, which can reduce memcpy();
4. Add a need_get_object_info flag instead of compare two object_info,
which can reduce memcmp();
5. Use ALLOC_ARRAY() instead of CALLOC_ARRAY(), which can reduce memset();
Since the relevant part of git cat-file --batch has been deleted, the
execution time of git for-each-ref is very short, I haven’t added
performance tests yet for git for-each-ref.
ZheNing Hu (5):
[GSOC] ref-filter: skip parse_object_buffer in some cases
[GSOC] ref-filter: remove second parsing in format_ref_array_item
[GSOC] ref-filter: reuse final buffer
[GSOC] ref-filter: reduce unnecessary object_info comparisons
[GSOC]: ref-filter: instead CALLOC_ARRAY to ALLOC_ARRAY
builtin/branch.c | 2 +
builtin/for-each-ref.c | 3 +-
builtin/tag.c | 2 +
builtin/verify-tag.c | 2 +
ref-filter.c | 168 ++++++++++++++++++++++++++++++++---------
ref-filter.h | 17 ++++-
6 files changed, 154 insertions(+), 40 deletions(-)
base-commit: f000ecbed922c727382651490e75014f003c89ca
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1020%2Fadlternative%2Fref-filter-opt-perf-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1020/adlternative/ref-filter-opt-perf-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1020
--
gitgitgadget
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH 1/5] [GSOC] ref-filter: skip parse_object_buffer in some cases
2021-08-17 7:14 [PATCH 0/5] [GSOC] [RFC] ref-filter: performance optimization ZheNing Hu via GitGitGadget
@ 2021-08-17 7:14 ` ZheNing Hu via GitGitGadget
2021-08-17 7:14 ` [PATCH 2/5] [GSOC] ref-filter: remove second parsing in format_ref_array_item ZheNing Hu via GitGitGadget
` (3 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-08-17 7:14 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Christian Couder, Hariom Verma, Bagas Sanjaya,
Jeff King, Ævar Arnfjörð Bjarmason, Eric Sunshine,
Philip Oakley, ZheNing Hu, ZheNing Hu
From: ZheNing Hu <adlternative@gmail.com>
When we are using some atoms such as %(raw), %(objectname),
%(objecttype), we don't need to parse the content of the object,
so we can skip parse_object_buffer() for performance optimization.
It is worth noting that in these cases, we still need to call
parse_object_buffer() for parsing:
1. The atom type is one of %(tag), %(type), %(object), %(tree),
%(numparent) or %(parent).
2. The type of the object is tag and the atom need to be
dereferenced e.g. %(*objecttype).
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
ref-filter.c | 49 +++++++++++++++++++++++++++++++++++++------------
ref-filter.h | 5 +++--
2 files changed, 40 insertions(+), 14 deletions(-)
diff --git a/ref-filter.c b/ref-filter.c
index 93ce2a6ef2e..65ba00633dc 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1009,6 +1009,20 @@ static int reject_atom(enum atom_type atom_type)
return atom_type == ATOM_REST;
}
+static int need_parse_buffer(enum atom_type atom_type) {
+ switch (atom_type) {
+ case ATOM_TAG:
+ case ATOM_TYPE:
+ case ATOM_OBJECT:
+ case ATOM_TREE:
+ case ATOM_NUMPARENT:
+ case ATOM_PARENT:
+ return 1;
+ default:
+ return 0;
+ }
+}
+
/*
* Make sure the format string is well formed, and parse out
* the used atoms.
@@ -1029,6 +1043,8 @@ int verify_ref_format(struct ref_format *format)
at = parse_ref_filter_atom(format, sp + 2, ep, &err);
if (at < 0)
die("%s", err.buf);
+ if (need_parse_buffer(used_atom[at].atom_type))
+ format->can_skip_parse_buffer = 0;
if (reject_atom(used_atom[at].atom_type))
die(_("this command reject atom %%(%.*s)"), (int)(ep - sp - 2), sp + 2);
@@ -1524,14 +1540,16 @@ static void grab_values(struct atom_value *val, int deref, struct object *obj, s
{
void *buf = data->content;
- switch (obj->type) {
+ switch (data->type) {
case OBJ_TAG:
- grab_tag_values(val, deref, obj);
+ if (obj)
+ grab_tag_values(val, deref, obj);
grab_sub_body_contents(val, deref, data);
grab_person("tagger", val, deref, buf);
break;
case OBJ_COMMIT:
- grab_commit_values(val, deref, obj);
+ if (obj)
+ grab_commit_values(val, deref, obj);
grab_sub_body_contents(val, deref, data);
grab_person("author", val, deref, buf);
grab_person("committer", val, deref, buf);
@@ -1757,14 +1775,21 @@ static int get_object(struct ref_array_item *ref, int deref, struct object **obj
BUG("Object size is less than zero.");
if (oi->info.contentp) {
- *obj = parse_object_buffer(the_repository, &oi->oid, oi->type, oi->size, oi->content, &eaten);
- if (!*obj) {
- if (!eaten)
- free(oi->content);
- return strbuf_addf_ret(err, -1, _("parse_object_buffer failed on %s for %s"),
- oid_to_hex(&oi->oid), ref->refname);
+ if (ref->can_skip_parse_buffer &&
+ ((!deref &&
+ (!need_tagged || oi->type != OBJ_TAG)) ||
+ deref)) {
+ grab_values(ref->value, deref, NULL, oi);
+ } else {
+ *obj = parse_object_buffer(the_repository, &oi->oid, oi->type, oi->size, oi->content, &eaten);
+ if (!*obj) {
+ if (!eaten)
+ free(oi->content);
+ return strbuf_addf_ret(err, -1, _("parse_object_buffer failed on %s for %s"),
+ oid_to_hex(&oi->oid), ref->refname);
+ }
+ grab_values(ref->value, deref, *obj, oi);
}
- grab_values(ref->value, deref, *obj, oi);
}
grab_common_values(ref->value, deref, oi);
@@ -1988,7 +2013,7 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
* If there is no atom that wants to know about tagged
* object, we are done.
*/
- if (!need_tagged || (obj->type != OBJ_TAG))
+ if (!need_tagged || (oi.type != OBJ_TAG))
return 0;
/*
@@ -2595,7 +2620,7 @@ int format_ref_array_item(struct ref_array_item *info,
state.quote_style = format->quote_style;
push_stack_element(&state.stack);
-
+ info->can_skip_parse_buffer = format->can_skip_parse_buffer;
for (cp = format->format; *cp && (sp = find_next(cp)); cp = ep + 1) {
struct atom_value *atomv;
int pos;
diff --git a/ref-filter.h b/ref-filter.h
index c15dee8d6b9..5bceae1dac9 100644
--- a/ref-filter.h
+++ b/ref-filter.h
@@ -40,6 +40,7 @@ struct ref_array_item {
struct object_id objectname;
const char *rest;
int flag;
+ int can_skip_parse_buffer;
unsigned int kind;
const char *symref;
struct commit *commit;
@@ -81,12 +82,12 @@ struct ref_format {
int quote_style;
int use_rest;
int use_color;
-
+ int can_skip_parse_buffer;
/* Internal state to ref-filter */
int need_color_reset_at_eol;
};
-#define REF_FORMAT_INIT { .use_color = -1 }
+#define REF_FORMAT_INIT { .use_color = -1, .can_skip_parse_buffer = 1 }
/* Macros for checking --merged and --no-merged options */
#define _OPT_MERGED_NO_MERGED(option, filter, h) \
--
gitgitgadget
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 2/5] [GSOC] ref-filter: remove second parsing in format_ref_array_item
2021-08-17 7:14 [PATCH 0/5] [GSOC] [RFC] ref-filter: performance optimization ZheNing Hu via GitGitGadget
2021-08-17 7:14 ` [PATCH 1/5] [GSOC] ref-filter: skip parse_object_buffer in some cases ZheNing Hu via GitGitGadget
@ 2021-08-17 7:14 ` ZheNing Hu via GitGitGadget
2021-08-17 7:14 ` [PATCH 3/5] [GSOC] ref-filter: reuse final buffer ZheNing Hu via GitGitGadget
` (2 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-08-17 7:14 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Christian Couder, Hariom Verma, Bagas Sanjaya,
Jeff King, Ævar Arnfjörð Bjarmason, Eric Sunshine,
Philip Oakley, ZheNing Hu, ZheNing Hu
From: ZheNing Hu <adlternative@gmail.com>
We parsed format string in verify_ref_format() and stored the parsed
atom in used_atom array. But in format_ref_array_item() we have
another round of parsing format string. This affects performance.
Introducing the struct parsed_atom_list which can save the current
atom's start and end address in format string and its index in
used_atom. All parsed_atom_list entry are linked together in the
form of linked list, and the head node of the linked list is stored
in struct ref_format. Create clear_parsed_atom_list() which can used
to clear the nodes on the linked list.
This can bring performance improvement.
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
builtin/branch.c | 2 ++
builtin/for-each-ref.c | 3 ++-
builtin/tag.c | 2 ++
builtin/verify-tag.c | 2 ++
ref-filter.c | 45 +++++++++++++++++++++++++++++++++---------
ref-filter.h | 11 +++++++++++
6 files changed, 55 insertions(+), 10 deletions(-)
diff --git a/builtin/branch.c b/builtin/branch.c
index b23b1d1752a..e361f8cc661 100644
--- a/builtin/branch.c
+++ b/builtin/branch.c
@@ -459,6 +459,7 @@ static void print_ref_list(struct ref_filter *filter, struct ref_sorting *sortin
strbuf_release(&err);
strbuf_release(&out);
ref_array_clear(&array);
+ clear_parsed_atom_list(&format->parsed_atom_head);
free(to_free);
}
@@ -678,6 +679,7 @@ int cmd_branch(int argc, const char **argv, const char *prefix)
memset(&filter, 0, sizeof(filter));
filter.kind = FILTER_REFS_BRANCHES;
filter.abbrev = -1;
+ INIT_LIST_HEAD(&format.parsed_atom_head);
if (argc == 2 && !strcmp(argv[1], "-h"))
usage_with_options(builtin_branch_usage, options);
diff --git a/builtin/for-each-ref.c b/builtin/for-each-ref.c
index 89cb6307d46..6e22d80d5b5 100644
--- a/builtin/for-each-ref.c
+++ b/builtin/for-each-ref.c
@@ -53,8 +53,8 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix)
memset(&array, 0, sizeof(array));
memset(&filter, 0, sizeof(filter));
-
format.format = "%(objectname) %(objecttype)\t%(refname)";
+ INIT_LIST_HEAD(&format.parsed_atom_head);
git_config(git_default_config, NULL);
@@ -96,6 +96,7 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix)
ref_array_clear(&array);
free_commit_list(filter.with_commit);
free_commit_list(filter.no_commit);
+ clear_parsed_atom_list(&format.parsed_atom_head);
UNLEAK(sorting);
return 0;
}
diff --git a/builtin/tag.c b/builtin/tag.c
index 452558ec957..549339cbbe4 100644
--- a/builtin/tag.c
+++ b/builtin/tag.c
@@ -78,6 +78,7 @@ static int list_tags(struct ref_filter *filter, struct ref_sorting *sorting,
strbuf_release(&output);
ref_array_clear(&array);
free(to_free);
+ clear_parsed_atom_list(&format->parsed_atom_head);
return 0;
}
@@ -493,6 +494,7 @@ int cmd_tag(int argc, const char **argv, const char *prefix)
memset(&filter, 0, sizeof(filter));
filter.lines = -1;
opt.sign = -1;
+ INIT_LIST_HEAD(&format.parsed_atom_head);
argc = parse_options(argc, argv, prefix, options, git_tag_usage, 0);
diff --git a/builtin/verify-tag.c b/builtin/verify-tag.c
index f45136a06ba..8b0a2b2587c 100644
--- a/builtin/verify-tag.c
+++ b/builtin/verify-tag.c
@@ -39,6 +39,7 @@ int cmd_verify_tag(int argc, const char **argv, const char *prefix)
OPT_END()
};
+ INIT_LIST_HEAD(&format.parsed_atom_head);
git_config(git_verify_tag_config, NULL);
argc = parse_options(argc, argv, prefix, verify_tag_options,
@@ -73,5 +74,6 @@ int cmd_verify_tag(int argc, const char **argv, const char *prefix)
if (format.format)
pretty_print_ref(name, &oid, &format);
}
+ clear_parsed_atom_list(&format.parsed_atom_head);
return had_error;
}
diff --git a/ref-filter.c b/ref-filter.c
index 65ba00633dc..76a31fb79b1 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1035,6 +1035,7 @@ int verify_ref_format(struct ref_format *format)
for (cp = format->format; *cp && (sp = find_next(cp)); ) {
struct strbuf err = STRBUF_INIT;
const char *color, *ep = strchr(sp, ')');
+ struct parsed_atom_list *e;
int at;
if (!ep)
@@ -1043,6 +1044,12 @@ int verify_ref_format(struct ref_format *format)
at = parse_ref_filter_atom(format, sp + 2, ep, &err);
if (at < 0)
die("%s", err.buf);
+ e = xmalloc(sizeof(*e));
+ e->beg = sp + 2;
+ e->end = ep;
+ e->at = at;
+ list_add_tail(&e->list, &format->parsed_atom_head);
+
if (need_parse_buffer(used_atom[at].atom_type))
format->can_skip_parse_buffer = 0;
if (reject_atom(used_atom[at].atom_type))
@@ -2615,25 +2622,31 @@ int format_ref_array_item(struct ref_array_item *info,
struct strbuf *final_buf,
struct strbuf *error_buf)
{
- const char *cp, *sp, *ep;
+ const char *cp, *sp;
+ struct list_head *item;
struct ref_formatting_state state = REF_FORMATTING_STATE_INIT;
state.quote_style = format->quote_style;
push_stack_element(&state.stack);
info->can_skip_parse_buffer = format->can_skip_parse_buffer;
- for (cp = format->format; *cp && (sp = find_next(cp)); cp = ep + 1) {
+
+ cp = format->format;
+
+ list_for_each(item, &format->parsed_atom_head) {
struct atom_value *atomv;
- int pos;
+ struct parsed_atom_list *e =
+ list_entry(item, struct parsed_atom_list, list);
- ep = strchr(sp, ')');
- if (cp < sp)
- append_literal(cp, sp, &state);
- pos = parse_ref_filter_atom(format, sp + 2, ep, error_buf);
- if (pos < 0 || get_ref_atom_value(info, pos, &atomv, error_buf) ||
+ if (cp < e->beg - 2)
+ append_literal(cp, e->beg - 2, &state);
+ if (get_ref_atom_value(info, e->at, &atomv, error_buf) ||
atomv->handler(atomv, &state, error_buf)) {
pop_stack_element(&state.stack);
return -1;
}
+ cp = e->end + 1;
+ if (!*cp)
+ break;
}
if (*cp) {
sp = cp + strlen(cp);
@@ -2681,10 +2694,12 @@ static int parse_sorting_atom(const char *atom)
* This parses an atom using a dummy ref_format, since we don't
* actually care about the formatting details.
*/
+ int res;
struct ref_format dummy = REF_FORMAT_INIT;
const char *end = atom + strlen(atom);
struct strbuf err = STRBUF_INIT;
- int res = parse_ref_filter_atom(&dummy, atom, end, &err);
+
+ res = parse_ref_filter_atom(&dummy, atom, end, &err);
if (res < 0)
die("%s", err.buf);
strbuf_release(&err);
@@ -2757,3 +2772,15 @@ int parse_opt_merge_filter(const struct option *opt, const char *arg, int unset)
return 0;
}
+
+void clear_parsed_atom_list(struct list_head *parsed_atom_head)
+{
+ struct list_head *pos, *tmp;
+ struct parsed_atom_list *item;
+
+ list_for_each_safe(pos, tmp, parsed_atom_head) {
+ item = list_entry(pos, struct parsed_atom_list, list);
+ list_del(pos);
+ free(item);
+ }
+}
diff --git a/ref-filter.h b/ref-filter.h
index 5bceae1dac9..df54836a643 100644
--- a/ref-filter.h
+++ b/ref-filter.h
@@ -72,6 +72,13 @@ struct ref_filter {
verbose;
};
+struct parsed_atom_list {
+ const char *beg;
+ const char *end;
+ int at;
+ struct list_head list;
+};
+
struct ref_format {
/*
* Set these to define the format; make sure you call
@@ -85,6 +92,7 @@ struct ref_format {
int can_skip_parse_buffer;
/* Internal state to ref-filter */
int need_color_reset_at_eol;
+ struct list_head parsed_atom_head;
};
#define REF_FORMAT_INIT { .use_color = -1, .can_skip_parse_buffer = 1 }
@@ -112,6 +120,9 @@ struct ref_format {
int filter_refs(struct ref_array *array, struct ref_filter *filter, unsigned int type);
/* Clear all memory allocated to ref_array */
void ref_array_clear(struct ref_array *array);
+/* Clear the parsed_atom_list in ref_format*/
+void clear_parsed_atom_list(struct list_head *parsed_atom_head);
+
/* Used to verify if the given format is correct and to parse out the used atoms */
int verify_ref_format(struct ref_format *format);
/* Sort the given ref_array as per the ref_sorting provided */
--
gitgitgadget
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 3/5] [GSOC] ref-filter: reuse final buffer
2021-08-17 7:14 [PATCH 0/5] [GSOC] [RFC] ref-filter: performance optimization ZheNing Hu via GitGitGadget
2021-08-17 7:14 ` [PATCH 1/5] [GSOC] ref-filter: skip parse_object_buffer in some cases ZheNing Hu via GitGitGadget
2021-08-17 7:14 ` [PATCH 2/5] [GSOC] ref-filter: remove second parsing in format_ref_array_item ZheNing Hu via GitGitGadget
@ 2021-08-17 7:14 ` ZheNing Hu via GitGitGadget
2021-08-17 7:14 ` [PATCH 4/5] [GSOC] ref-filter: reduce unnecessary object_info comparisons ZheNing Hu via GitGitGadget
2021-08-17 7:14 ` [PATCH 5/5] [GSOC]: ref-filter: instead CALLOC_ARRAY to ALLOC_ARRAY ZheNing Hu via GitGitGadget
4 siblings, 0 replies; 6+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-08-17 7:14 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Christian Couder, Hariom Verma, Bagas Sanjaya,
Jeff King, Ævar Arnfjörð Bjarmason, Eric Sunshine,
Philip Oakley, ZheNing Hu, ZheNing Hu
From: ZheNing Hu <adlternative@gmail.com>
In format_ref_array_item(), we add the object data
to ref_formatting_state, and copy the data from
ref_formatting_state to final_buf at the end. There
are huge copies of data.
Because final_buf will be cleared before every time
we call format_ref_array_item(). So we actually add
content to an empty strbuf. We can add the object's
data directly to this final_buffer instead of adding
objects' data to state.stack->output first, then
copy to final_buf.
Add a can_reuse_final_buffer flag to struct ref_format
and create can_reuse_final_buffer() to check if we are
use %(align), %(end), %(if), %(then), %(else). If not,
we can reuse the buf of finnal_buf.
Reuse the buffer address of final_buf in
format_ref_array_item(), we directly add the data to the
final buffer, return the content to final_buf at the end.
This will bring performance improvements.
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
ref-filter.c | 39 ++++++++++++++++++++++++++++++++++-----
ref-filter.h | 3 ++-
2 files changed, 36 insertions(+), 6 deletions(-)
diff --git a/ref-filter.c b/ref-filter.c
index 76a31fb79b1..7106d4c1c4c 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1023,6 +1023,19 @@ static int need_parse_buffer(enum atom_type atom_type) {
}
}
+static int can_reuse_final_buffer(enum atom_type atom_type) {
+ switch (atom_type) {
+ case ATOM_ALIGN:
+ case ATOM_END:
+ case ATOM_IF:
+ case ATOM_THEN:
+ case ATOM_ELSE:
+ return 0;
+ default:
+ return 1;
+ }
+}
+
/*
* Make sure the format string is well formed, and parse out
* the used atoms.
@@ -1054,6 +1067,7 @@ int verify_ref_format(struct ref_format *format)
format->can_skip_parse_buffer = 0;
if (reject_atom(used_atom[at].atom_type))
die(_("this command reject atom %%(%.*s)"), (int)(ep - sp - 2), sp + 2);
+ format->can_reuse_final_buffer = can_reuse_final_buffer(used_atom[at].atom_type);
if ((format->quote_style == QUOTE_PYTHON ||
format->quote_style == QUOTE_SHELL ||
@@ -2627,7 +2641,14 @@ int format_ref_array_item(struct ref_array_item *info,
struct ref_formatting_state state = REF_FORMATTING_STATE_INIT;
state.quote_style = format->quote_style;
- push_stack_element(&state.stack);
+ if (format->can_reuse_final_buffer) {
+ struct ref_formatting_stack *s = xmalloc(sizeof(struct ref_formatting_stack));
+ s->output = *final_buf;
+ s->prev = state.stack;
+ state.stack = s;
+ } else {
+ push_stack_element(&state.stack);
+ }
info->can_skip_parse_buffer = format->can_skip_parse_buffer;
cp = format->format;
@@ -2641,7 +2662,8 @@ int format_ref_array_item(struct ref_array_item *info,
append_literal(cp, e->beg - 2, &state);
if (get_ref_atom_value(info, e->at, &atomv, error_buf) ||
atomv->handler(atomv, &state, error_buf)) {
- pop_stack_element(&state.stack);
+ if (!format->can_reuse_final_buffer)
+ pop_stack_element(&state.stack);
return -1;
}
cp = e->end + 1;
@@ -2656,16 +2678,23 @@ int format_ref_array_item(struct ref_array_item *info,
struct atom_value resetv = ATOM_VALUE_INIT;
resetv.s = GIT_COLOR_RESET;
if (append_atom(&resetv, &state, error_buf)) {
- pop_stack_element(&state.stack);
+ if (!format->can_reuse_final_buffer)
+ pop_stack_element(&state.stack);
return -1;
}
}
if (state.stack->prev) {
+ assert(!format->can_reuse_final_buffer);
pop_stack_element(&state.stack);
return strbuf_addf_ret(error_buf, -1, _("format: %%(end) atom missing"));
}
- strbuf_addbuf(final_buf, &state.stack->output);
- pop_stack_element(&state.stack);
+ if(format->can_reuse_final_buffer) {
+ *final_buf = state.stack->output;
+ free(state.stack);
+ } else {
+ strbuf_addbuf(final_buf, &state.stack->output);
+ pop_stack_element(&state.stack);
+ }
return 0;
}
diff --git a/ref-filter.h b/ref-filter.h
index df54836a643..a62a14a2e43 100644
--- a/ref-filter.h
+++ b/ref-filter.h
@@ -92,10 +92,11 @@ struct ref_format {
int can_skip_parse_buffer;
/* Internal state to ref-filter */
int need_color_reset_at_eol;
+ int can_reuse_final_buffer;
struct list_head parsed_atom_head;
};
-#define REF_FORMAT_INIT { .use_color = -1, .can_skip_parse_buffer = 1 }
+#define REF_FORMAT_INIT { .use_color = -1, .can_skip_parse_buffer = 1, .can_reuse_final_buffer = 1 }
/* Macros for checking --merged and --no-merged options */
#define _OPT_MERGED_NO_MERGED(option, filter, h) \
--
gitgitgadget
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 4/5] [GSOC] ref-filter: reduce unnecessary object_info comparisons
2021-08-17 7:14 [PATCH 0/5] [GSOC] [RFC] ref-filter: performance optimization ZheNing Hu via GitGitGadget
` (2 preceding siblings ...)
2021-08-17 7:14 ` [PATCH 3/5] [GSOC] ref-filter: reuse final buffer ZheNing Hu via GitGitGadget
@ 2021-08-17 7:14 ` ZheNing Hu via GitGitGadget
2021-08-17 7:14 ` [PATCH 5/5] [GSOC]: ref-filter: instead CALLOC_ARRAY to ALLOC_ARRAY ZheNing Hu via GitGitGadget
4 siblings, 0 replies; 6+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-08-17 7:14 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Christian Couder, Hariom Verma, Bagas Sanjaya,
Jeff King, Ævar Arnfjörð Bjarmason, Eric Sunshine,
Philip Oakley, ZheNing Hu, ZheNing Hu
From: ZheNing Hu <adlternative@gmail.com>
In populate_value() we use an empty struct object_info
compare with oi and oi_deref, which determine if we can
return early, this is actually very inefficient.
So add a need_get_object_info flag in global variables,
which used to indicate whether we need call get_object()
or we can return early.
So we only need to compare only one byte instead of comparing
the entire huge object_info struct.
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
ref-filter.c | 34 +++++++++++++++++++++++-----------
1 file changed, 23 insertions(+), 11 deletions(-)
diff --git a/ref-filter.c b/ref-filter.c
index 7106d4c1c4c..35e221e7ec8 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -212,7 +212,7 @@ static struct used_atom {
char *head;
} u;
} *used_atom;
-static int used_atom_cnt, need_tagged, need_symref;
+static int used_atom_cnt, need_tagged, need_symref, need_get_object_info;
/*
* Expand string, append it to strbuf *sb, then return error code ret.
@@ -318,10 +318,13 @@ static int objecttype_atom_parser(struct ref_format *format, struct used_atom *a
{
if (arg)
return strbuf_addf_ret(err, -1, _("%%(objecttype) does not take arguments"));
- if (*atom->name == '*')
+ if (*atom->name == '*') {
oi_deref.info.typep = &oi_deref.type;
- else
+ need_get_object_info = 1;
+ } else {
oi.info.typep = &oi.type;
+ need_get_object_info = 1;
+ }
return 0;
}
@@ -330,16 +333,23 @@ static int objectsize_atom_parser(struct ref_format *format, struct used_atom *a
{
if (!arg) {
atom->u.objectsize.option = O_SIZE;
- if (*atom->name == '*')
+ if (*atom->name == '*') {
oi_deref.info.sizep = &oi_deref.size;
- else
+ need_get_object_info = 1;
+ } else {
oi.info.sizep = &oi.size;
+ need_get_object_info = 1;
+ }
} else if (!strcmp(arg, "disk")) {
atom->u.objectsize.option = O_SIZE_DISK;
- if (*atom->name == '*')
+ if (*atom->name == '*') {
oi_deref.info.disk_sizep = &oi_deref.disk_size;
- else
+ need_get_object_info = 1;
+ }
+ else {
oi.info.disk_sizep = &oi.disk_size;
+ need_get_object_info = 1;
+ }
} else
return strbuf_addf_ret(err, -1, _("unrecognized %%(objectsize) argument: %s"), arg);
return 0;
@@ -354,6 +364,7 @@ static int deltabase_atom_parser(struct ref_format *format, struct used_atom *at
oi_deref.info.delta_base_oid = &oi_deref.delta_base_oid;
else
oi.info.delta_base_oid = &oi.delta_base_oid;
+ need_get_object_info = 1;
return 0;
}
@@ -720,6 +731,7 @@ static int parse_ref_filter_atom(struct ref_format *format,
used_atom[at].type = valid_atom[i].cmp_type;
used_atom[at].source = valid_atom[i].source;
if (used_atom[at].source == SOURCE_OBJ) {
+ need_get_object_info = 1;
if (*atom == '*')
oi_deref.info.contentp = &oi_deref.content;
else
@@ -1871,7 +1883,6 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
{
struct object *obj;
int i;
- struct object_info empty = OBJECT_INFO_INIT;
CALLOC_ARRAY(ref->value, used_atom_cnt);
@@ -2019,10 +2030,11 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
oid_to_hex(&ref->objectname), ref->refname);
}
- if (need_tagged)
+ if (need_tagged) {
oi.info.contentp = &oi.content;
- if (!memcmp(&oi.info, &empty, sizeof(empty)) &&
- !memcmp(&oi_deref.info, &empty, sizeof(empty)))
+ need_get_object_info = 1;
+ }
+ if (!need_get_object_info)
return 0;
--
gitgitgadget
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 5/5] [GSOC]: ref-filter: instead CALLOC_ARRAY to ALLOC_ARRAY
2021-08-17 7:14 [PATCH 0/5] [GSOC] [RFC] ref-filter: performance optimization ZheNing Hu via GitGitGadget
` (3 preceding siblings ...)
2021-08-17 7:14 ` [PATCH 4/5] [GSOC] ref-filter: reduce unnecessary object_info comparisons ZheNing Hu via GitGitGadget
@ 2021-08-17 7:14 ` ZheNing Hu via GitGitGadget
4 siblings, 0 replies; 6+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-08-17 7:14 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Christian Couder, Hariom Verma, Bagas Sanjaya,
Jeff King, Ævar Arnfjörð Bjarmason, Eric Sunshine,
Philip Oakley, ZheNing Hu, ZheNing Hu
From: ZheNing Hu <adlternative@gmail.com>
populate_value() uses CALLOC_ARRAY() to allocate dynamic memory for
ref->value. But after that, we immediately assign values to its
members in the for loop. This shows that it is not necessary to
use CALLOC_ARRAY() for ref->value to clear the memory.
Therefore, use ALLOC_ARRAY() instead of CALLOC_ARRAY() to reduce the
overhead caused by clearing the memory. This can bring performance
optimizations.
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
ref-filter.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/ref-filter.c b/ref-filter.c
index 35e221e7ec8..fe2df82067f 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1884,7 +1884,7 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
struct object *obj;
int i;
- CALLOC_ARRAY(ref->value, used_atom_cnt);
+ ALLOC_ARRAY(ref->value, used_atom_cnt);
if (need_symref && (ref->flag & REF_ISSYMREF) && !ref->symref) {
ref->symref = resolve_refdup(ref->refname, RESOLVE_REF_READING,
@@ -1903,6 +1903,7 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
const char *refname;
struct branch *branch = NULL;
+ v->s = NULL;
v->s_size = ATOM_SIZE_UNSPECIFIED;
v->handler = append_atom;
v->atom = atom;
--
gitgitgadget
^ permalink raw reply related [flat|nested] 6+ messages in thread
end of thread, other threads:[~2021-08-17 7:15 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-17 7:14 [PATCH 0/5] [GSOC] [RFC] ref-filter: performance optimization ZheNing Hu via GitGitGadget
2021-08-17 7:14 ` [PATCH 1/5] [GSOC] ref-filter: skip parse_object_buffer in some cases ZheNing Hu via GitGitGadget
2021-08-17 7:14 ` [PATCH 2/5] [GSOC] ref-filter: remove second parsing in format_ref_array_item ZheNing Hu via GitGitGadget
2021-08-17 7:14 ` [PATCH 3/5] [GSOC] ref-filter: reuse final buffer ZheNing Hu via GitGitGadget
2021-08-17 7:14 ` [PATCH 4/5] [GSOC] ref-filter: reduce unnecessary object_info comparisons ZheNing Hu via GitGitGadget
2021-08-17 7:14 ` [PATCH 5/5] [GSOC]: ref-filter: instead CALLOC_ARRAY to ALLOC_ARRAY ZheNing Hu via GitGitGadget
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).