From: "ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Junio C Hamano <gitster@pobox.com>,
Christian Couder <christian.couder@gmail.com>,
Hariom Verma <hariom18599@gmail.com>,
Karthik Nayak <karthik.188@gmail.com>,
Felipe Contreras <felipe.contreras@gmail.com>,
Bagas Sanjaya <bagasdotme@gmail.com>, Jeff King <peff@peff.net>,
Phillip Wood <phillip.wood123@gmail.com>,
ZheNing Hu <adlternative@gmail.com>
Subject: [PATCH v2 0/2] [GSOC] ref-filter: add %(raw) atom
Date: Sun, 30 May 2021 13:01:56 +0000
Message-ID: <pull.963.v2.git.1622379718.gitgitgadget@gmail.com>
In-Reply-To: <pull.963.git.1622126603.gitgitgadget@gmail.com>

To let git cat-file --batch reuse the ref-filter logic, this series adds a
%(raw) atom to ref-filter.

Changes since v1:

1. Use a simpler memcasecmp().
2. Allow %(raw:size) to be used with --<lang> (--python, --shell, --tcl,
   --perl).
3. Remove the redundant BUG() calls in then_atom_handler().
4. Revert to the original function name grab_sub_body_contents().
5. Split the object type check in grab_sub_body_contents() out into a
   separate, preceding patch.

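
As an illustration of item 1, a case-insensitive memcmp() can be sketched as a
standalone function. This is a minimal sketch under stated assumptions, not the
code from the patch: the names memcasecmp_sketch() and memcasecmp_sketch_demo()
are hypothetical, and the bytes are read as unsigned char because the standard
tolower() is only defined for unsigned char values and EOF (Git's own tolower()
may behave differently).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical standalone case-insensitive memcmp().
 * It compares exactly n bytes and does not stop at an embedded NUL,
 * which is what distinguishes it from strcasecmp().
 */
static int memcasecmp_sketch(const void *vs1, const void *vs2, size_t n)
{
	const unsigned char *s1 = vs1;
	const unsigned char *s2 = vs2;
	const unsigned char *end = s1 + n;

	for (; s1 < end; s1++, s2++) {
		int diff = tolower(*s1) - tolower(*s2);
		if (diff)
			return diff;
	}
	return 0;
}

/* Usage: bytes after an embedded NUL still take part in the comparison. */
static void memcasecmp_sketch_demo(void)
{
	assert(memcasecmp_sketch("aB\0c", "Ab\0C", 4) == 0);
	assert(memcasecmp_sketch("a\0b", "a\0C", 3) < 0);
}
```

Walking two pointers to an end pointer avoids the index variable and the
separate upper-casing of each byte, which is where most of the v1 code went.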
ZheNing Hu (2):
[GSOC] ref-filter: add obj-type check in grab contents
[GSOC] ref-filter: add %(raw) atom
Documentation/git-for-each-ref.txt | 14 ++
ref-filter.c | 158 ++++++++++++++++++-----
t/t6300-for-each-ref.sh | 200 +++++++++++++++++++++++++++++
3 files changed, 338 insertions(+), 34 deletions(-)
base-commit: 5d5b1473453400224ebb126bf3947e0a3276bdf5
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-963%2Fadlternative%2Fref-filter-raw-atom-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-963/adlternative/ref-filter-raw-atom-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/963
Range-diff vs v1:
-: ------------ > 1: e6c26d19a3f3 [GSOC] ref-filter: add obj-type check in grab contents
1: b3848f24f2d3 ! 2: e44a2ed0db59 [GSOC] ref-filter: add %(raw) atom
@@ Commit message
The raw data of blob, tree objects may contain '\0', but most of
the logic in `ref-filter` depends on the output of the atom being
- a structured string (end with '\0').
+ text (specifically, no embedded NULs in it).
E.g. `quote_formatting()` use `strbuf_addstr()` or `*._quote_buf()`
add the data to the buffer. The raw data of a tree object is
@@ Commit message
can record raw object size, it can help us add raw object data to
the buffer or compare two buffers which contain raw object data.
- Beyond, `--format=%(raw)` should not combine with `--python`, `--shell`,
+ Beyond, `--format=%(raw)` cannot be used with `--python`, `--shell`,
`--tcl`, `--perl` because if our binary raw data is passed to a variable
in the host language, the host languages may cause escape errors.
+ Helped-by: Felipe Contreras <felipe.contreras@gmail.com>
+ Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
+ Helped-by: Junio C Hamano <gitster@pobox.com>
Based-on-patch-by: Olga Telezhnaya <olyatelezhnaya@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
@@ Documentation/git-for-each-ref.txt: and `date` to extract the named component.
+raw:size::
+ The raw data size of the object.
+
-+Note that `--format=%(raw)` should not combine with `--python`, `--shell`, `--tcl`,
++Note that `--format=%(raw)` can not be used with `--python`, `--shell`, `--tcl`,
+`--perl` because if our binary raw data is passed to a variable in the host language,
+the host languages may cause escape errors.
+
@@ ref-filter.c: static int contents_atom_parser(const struct ref_format *format, s
+static int raw_atom_parser(const struct ref_format *format, struct used_atom *atom,
+ const char *arg, struct strbuf *err)
+{
-+ if (!arg) {
++ if (!arg)
+ atom->u.raw_data.option = RAW_BARE;
-+ } else if (!strcmp(arg, "size"))
++ else if (!strcmp(arg, "size"))
+ atom->u.raw_data.option = RAW_LENGTH;
+ else
+ return strbuf_addf_ret(err, -1, _("unrecognized %%(raw) argument: %s"), arg);
@@ ref-filter.c: static int parse_ref_filter_atom(const struct ref_format *format,
return strbuf_addf_ret(err, -1, _("malformed field name: %.*s"),
(int)(ep-atom), atom);
-+ if (format->quote_style && starts_with(sp, "raw"))
-+ return strbuf_addf_ret(err, -1, _("--format=%.*s should not combine with"
+- /* Do we have the atom already used elsewhere? */
+- for (i = 0; i < used_atom_cnt; i++) {
+- int len = strlen(used_atom[i].name);
+- if (len == ep - atom && !memcmp(used_atom[i].name, atom, len))
+- return i;
+- }
+-
+ /*
+ * If the atom name has a colon, strip it and everything after
+ * it off - it specifies the format for this entry, and
+@@ ref-filter.c: static int parse_ref_filter_atom(const struct ref_format *format,
+ arg = memchr(sp, ':', ep - sp);
+ atom_len = (arg ? arg : ep) - sp;
+
++ if (format->quote_style && !strncmp(sp, "raw", 3) && !arg)
++ return strbuf_addf_ret(err, -1, _("--format=%.*s cannot be used with"
+ "--python, --shell, --tcl, --perl"), (int)(ep-atom), atom);
+
- /* Do we have the atom already used elsewhere? */
- for (i = 0; i < used_atom_cnt; i++) {
- int len = strlen(used_atom[i].name);
++ /* Do we have the atom already used elsewhere? */
++ for (i = 0; i < used_atom_cnt; i++) {
++ int len = strlen(used_atom[i].name);
++ if (len == ep - atom && !memcmp(used_atom[i].name, atom, len))
++ return i;
++ }
++
+ /* Is the atom a valid one? */
+ for (i = 0; i < ARRAY_SIZE(valid_atom); i++) {
+ int len = strlen(valid_atom[i].name);
@@ ref-filter.c: static int parse_ref_filter_atom(const struct ref_format *format,
return at;
}
@@ ref-filter.c: static int then_atom_handler(struct atom_value *atomv, struct ref_
*/
if (if_then_else->cmp_status == COMPARE_EQUAL) {
- if (!strcmp(if_then_else->str, cur->output.buf))
-+ if (!if_then_else->str)
-+ BUG("when if_then_else->cmp_status == COMPARE_EQUAL,"
-+ "if_then_else->str must not be null");
+ if (str_len == cur->output.len &&
+ !memcmp(if_then_else->str, cur->output.buf, cur->output.len))
if_then_else->condition_satisfied = 1;
} else if (if_then_else->cmp_status == COMPARE_UNEQUAL) {
- if (strcmp(if_then_else->str, cur->output.buf))
-+ if (!if_then_else->str)
-+ BUG("when if_then_else->cmp_status == COMPARE_UNEQUAL,"
-+ "if_then_else->str must not be null");
+ if (str_len != cur->output.len ||
+ memcmp(if_then_else->str, cur->output.buf, cur->output.len))
if_then_else->condition_satisfied = 1;
@@ ref-filter.c: static int end_atom_handler(struct atom_value *atomv, struct ref_f
}
strbuf_release(&s);
@@ ref-filter.c: static void append_lines(struct strbuf *out, const char *buf, unsigned long size
- }
/* See grab_values */
--static void grab_sub_body_contents(struct atom_value *val, int deref, void *buf)
-+static void grab_raw_data(struct atom_value *val, int deref, void *buf, unsigned long buf_size, struct object *obj)
+ static void grab_sub_body_contents(struct atom_value *val, int deref, void *buf,
+- struct object *obj)
++ unsigned long buf_size, struct object *obj)
{
int i;
const char *subpos = NULL, *bodypos = NULL, *sigpos = NULL;
-@@ ref-filter.c: static void grab_sub_body_contents(struct atom_value *val, int deref, void *buf)
- continue;
+@@ ref-filter.c: static void grab_sub_body_contents(struct atom_value *val, int deref, void *buf,
if (deref)
name++;
-- if (strcmp(name, "body") &&
-- !starts_with(name, "subject") &&
-- !starts_with(name, "trailers") &&
-- !starts_with(name, "contents"))
-+
+
+ if (starts_with(name, "raw")) {
+ if (atom->u.raw_data.option == RAW_BARE) {
+ v->s = xmemdupz(buf, buf_size);
+ v->s_size = buf_size;
-+ } else if (atom->u.raw_data.option == RAW_LENGTH)
++ } else if (atom->u.raw_data.option == RAW_LENGTH) {
+ v->s = xstrfmt("%"PRIuMAX, (uintmax_t)buf_size);
++ }
+ continue;
+ }
+
-+ if ((obj->type != OBJ_TAG &&
-+ obj->type != OBJ_COMMIT) ||
-+ (strcmp(name, "body") &&
-+ !starts_with(name, "subject") &&
-+ !starts_with(name, "trailers") &&
-+ !starts_with(name, "contents")))
- continue;
- if (!subpos)
- find_subpos(buf,
+ if ((obj->type != OBJ_TAG &&
+ obj->type != OBJ_COMMIT) ||
+ (strcmp(name, "body") &&
@@ ref-filter.c: static void fill_missing_values(struct atom_value *val)
* pointed at by the ref itself; otherwise it is the object the
* ref (which is a tag) refers to.
@@ ref-filter.c: static void fill_missing_values(struct atom_value *val)
switch (obj->type) {
case OBJ_TAG:
grab_tag_values(val, deref, obj);
-- grab_sub_body_contents(val, deref, buf);
-+ grab_raw_data(val, deref, buf, buf_size, obj);
+- grab_sub_body_contents(val, deref, buf, obj);
++ grab_sub_body_contents(val, deref, buf, buf_size, obj);
grab_person("tagger", val, deref, buf);
break;
case OBJ_COMMIT:
grab_commit_values(val, deref, obj);
-- grab_sub_body_contents(val, deref, buf);
-+ grab_raw_data(val, deref, buf, buf_size, obj);
+- grab_sub_body_contents(val, deref, buf, obj);
++ grab_sub_body_contents(val, deref, buf, buf_size, obj);
grab_person("author", val, deref, buf);
grab_person("committer", val, deref, buf);
break;
case OBJ_TREE:
/* grab_tree_values(val, deref, obj, buf, sz); */
-+ grab_raw_data(val, deref, buf, buf_size, obj);
++ grab_sub_body_contents(val, deref, buf, buf_size, obj);
break;
case OBJ_BLOB:
/* grab_blob_values(val, deref, obj, buf, sz); */
-+ grab_raw_data(val, deref, buf, buf_size, obj);
++ grab_sub_body_contents(val, deref, buf, buf_size, obj);
break;
default:
die("Eh? Object of type %d?", obj->type);
@@ ref-filter.c: static int compare_detached_head(struct ref_array_item *a, struct
+static int memcasecmp(const void *vs1, const void *vs2, size_t n)
+{
-+ size_t i;
-+ const char *s1 = (const char *)vs1;
-+ const char *s2 = (const char *)vs2;
++ const char *s1 = (const void *)vs1;
++ const char *s2 = (const void *)vs2;
++ const char *end = s1 + n;
+
-+ for (i = 0; i < n; i++) {
-+ unsigned char u1 = s1[i];
-+ unsigned char u2 = s2[i];
-+ int U1 = toupper (u1);
-+ int U2 = toupper (u2);
-+ int diff = (UCHAR_MAX <= INT_MAX ? U1 - U2
-+ : U1 < U2 ? -1 : U2 < U1);
++ for (; s1 < end; s1++, s2++) {
++ int diff = tolower(*s1) - tolower(*s2);
+ if (diff)
+ return diff;
+ }
@@ ref-filter.c: static int compare_detached_head(struct ref_array_item *a, struct
static int cmp_ref_sorting(struct ref_sorting *s, struct ref_array_item *a, struct ref_array_item *b)
{
struct atom_value *va, *vb;
-@@ ref-filter.c: static int cmp_ref_sorting(struct ref_sorting *s, struct ref_array_item *a, stru
- int cmp_detached_head = 0;
- cmp_type cmp_type = used_atom[s->atom].type;
- struct strbuf err = STRBUF_INIT;
-+ size_t slen = 0;
-
- if (get_ref_atom_value(a, s->atom, &va, &err))
- die("%s", err.buf);
@@ ref-filter.c: static int cmp_ref_sorting(struct ref_sorting *s, struct ref_array_item *a, stru
} else if (s->sort_flags & REF_SORTING_VERSION) {
cmp = versioncmp(va->s, vb->s);
@@ ref-filter.c: static int cmp_ref_sorting(struct ref_sorting *s, struct ref_array
+ int (*cmp_fn)(const void *, const void *, size_t);
+ cmp_fn = s->sort_flags & REF_SORTING_ICASE
+ ? memcasecmp : memcmp;
++ size_t a_size = va->s_size == ATOM_VALUE_S_SIZE_INIT ?
++ strlen(va->s) : va->s_size;
++ size_t b_size = vb->s_size == ATOM_VALUE_S_SIZE_INIT ?
++ strlen(vb->s) : vb->s_size;
+
-+ if (va->s_size != ATOM_VALUE_S_SIZE_INIT &&
-+ vb->s_size != ATOM_VALUE_S_SIZE_INIT) {
-+ cmp = cmp_fn(va->s, vb->s, va->s_size > vb->s_size ?
-+ vb->s_size : va->s_size);
-+ } else if (va->s_size == ATOM_VALUE_S_SIZE_INIT) {
-+ slen = strlen(va->s);
-+ cmp = cmp_fn(va->s, vb->s, slen > vb->s_size ?
-+ vb->s_size : slen);
-+ } else {
-+ slen = strlen(vb->s);
-+ cmp = cmp_fn(va->s, vb->s, slen > va->s_size ?
-+ slen : va->s_size);
++ cmp = cmp_fn(va->s, vb->s, b_size > a_size ?
++ a_size : b_size);
++ if (!cmp) {
++ if (a_size > b_size)
++ cmp = 1;
++ else if (a_size < b_size)
++ cmp = -1;
+ }
-+ cmp = cmp ? cmp : va->s_size - vb->s_size;
+ }
} else {
if (va->value < vb->value)
@@ t/t6300-for-each-ref.sh: test_atom refs/myblobs/first contents:body ""
+ refs/myblobs/first not empty
+ EOF
+ git for-each-ref --format="%(refname) %(if)%(raw)%(then)not empty%(else)empty%(end)" \
-+ refs/myblobs/ >actual &&
++ refs/myblobs/ >actual &&
+ test_cmp expected actual
+'
+
@@ t/t6300-for-each-ref.sh: test_atom refs/myblobs/first contents:body ""
+ test_must_fail git for-each-ref --format="%(raw)" --sort=raw --shell
+'
+
++test_expect_success '%(raw:size) with --shell' '
++ git for-each-ref --format="%(raw:size)" | while read line
++ do
++ echo "'\''$line'\''" >>expect
++ done &&
++ git for-each-ref --format="%(raw:size)" --shell >actual &&
++ test_cmp expect actual
++'
++
+test_expect_success 'for-each-ref --format compare with cat-file --batch' '
+ git rev-parse refs/mytrees/first | git cat-file --batch >expected &&
+ git for-each-ref --format="%(objectname) %(objecttype) %(objectsize)
2: aa6d73f3e526 < -: ------------ [GSOC] ref-filter: add %(header) atom
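
The size-aware comparison that the range-diff above introduces in
cmp_ref_sorting() can be looked at in isolation. The following is a minimal
sketch, not the patch itself: compare_buffers_sketch() and its demo function
are hypothetical names, and the sizes are passed in explicitly rather than
taken from struct atom_value. It compares the common prefix with memcmp() and
falls back to the lengths only when that prefix is identical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: order two byte buffers that may contain NULs. */
static int compare_buffers_sketch(const void *a, size_t a_size,
				  const void *b, size_t b_size)
{
	size_t min_size = a_size < b_size ? a_size : b_size;
	int cmp = memcmp(a, b, min_size);

	if (cmp)
		return cmp;
	/* Same prefix: the shorter buffer sorts first. */
	if (a_size > b_size)
		return 1;
	if (a_size < b_size)
		return -1;
	return 0;
}

static void compare_buffers_sketch_demo(void)
{
	/* strcmp() stops at the first NUL and reports the buffers as equal. */
	assert(strcmp("tree\0a", "tree\0b") == 0);
	/* The size-aware comparison sees the byte after the NUL. */
	assert(compare_buffers_sketch("tree\0a", 6, "tree\0b", 6) < 0);
	/* A buffer that is a strict prefix of the other sorts first. */
	assert(compare_buffers_sketch("ab", 2, "abc", 3) < 0);
}
```

Returning 1 or -1 explicitly, instead of subtracting the two sizes, avoids
converting an unsigned difference to int.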
--
gitgitgadget