git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Christian Couder <christian.couder@gmail.com>
To: ZheNing Hu <adlternative@gmail.com>
Cc: ZheNing Hu via GitGitGadget <gitgitgadget@gmail.com>,
	git <git@vger.kernel.org>, Junio C Hamano <gitster@pobox.com>,
	Hariom Verma <hariom18599@gmail.com>
Subject: Re: [PATCH] [GSOC] ref-filter: add contents:raw atom
Date: Fri, 21 May 2021 11:10:05 +0200	[thread overview]
Message-ID: <CAP8UFD1yLS9UBs0r6_GjB-K6prW7GNxR+z445HJe8cR4HYLewA@mail.gmail.com> (raw)
In-Reply-To: <CAOLTT8S_Bu1PG+-gVK_6iUx--YrMx2hxDCTa=5sW6UJv9Oz_0Q@mail.gmail.com>

On Fri, May 21, 2021 at 6:44 AM ZheNing Hu <adlternative@gmail.com> wrote:
>
> Christian Couder <christian.couder@gmail.com> 于2021年5月21日周五 上午12:20写道:
> >
> > On Thu, May 20, 2021 at 10:49 AM ZheNing Hu via GitGitGadget
> > <gitgitgadget@gmail.com> wrote:
> > >
> > > From: ZheNing Hu <adlternative@gmail.com>
> > >
> > > Add new formatting option %(contents:raw), which will
> > > print all the object contents without any changes.
> >
> > Maybe you could tell how it would be different from %(contents), or in
> > other words what are the changes that %(contents) makes.
> >
> > Isn't %(contents) only printing the commit message or the tag message
> > of a commit or a tag respectively? If %(contents:raw) would print all
> > the object contents, it could feel strange that it is actually
> > printing more than %(contents).
>
> Okay, some explanations are indeed missing here:

By the way I forgot to say that your patch might want to also properly
document %(contents:raw). If the doc was updated as part of the patch,
maybe the whole patch would be easier to understand.

> %(contents) will discard the metadata part of the object file,

It's not clear what you mean when you talk about metadata in objects.
Are you taking about only the headers, like the "tree XXX", "parent
YYY", etc lines in commits, and the "object OOO", "type TTT", etc
lines in tags? Or does it also includes the lines in tree objects that
reference other trees or blobs?

> and only print the data contents part of it. %(contents:raw)
> can will not discard the metadata part of the object file, this

s/can//

>  means that it can print the "raw" content of an object.

The above seems to be changing the definition of %(contents) (as
previously it was only the commit message of a commit or the tag
message a tag) to explain %(contents:raw)...

> In addition, %(contents:raw) can support print commit, blob,
> tag, tree objects contents which %(contents) can only support
> commit,tag objects.

...but this is saying that the definition of %(contents) actually
doesn't change. This doesn't seem logical to me.

If %(contents:raw) can support any kind of object (commit, blob, tag
or tree) then it would be logical that %(contents) also support any
kind of object.

So if you really want to define %(contents:raw) as the raw object
contents, you might want to consider a preparatory patch that would
first change the definition of %(contents) to be the object contents
without the headers. This would keep the same output for any commit or
tag. But for blobs and trees it would print the whole content of the
object in the same way as `git cat-file -p` does, instead of nothing.

I think this acceptable as refs rarely point to something other than
commits, so people are not likely to rely on the fact that %(contents)
currently prints nothing for blobs and trees.

> E.g:
>
> git for-each-ref --format=%(contents:raw) refs/heads/foo
>
> will have the same output as:
>
> git rev-parse refs/heads/foo | git cat-file --batch

Ok, I think that would indeed be useful.

> > Also is %(contents:raw) supposed to print something for a blob or a
> > tree, while I guess %(contents) is printing nothing for them?
>
> Now my thoughts are:
> Let %(contents) learn to print four kinds of objects.
> and then let %(contents:raw) learn to print metadata.
> I will split it into two patches.

Yeah, great!

> > > It will help further to migrate all cat-file formatting
> > > logic from cat-file to ref-filter.
> > >
> > > Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> >
> > It looks like you rewrote the patch nearly completely, but if you
> > based your patch on, or got inspired by, Olga's work, it might be nice
> > to acknowledge this using a trailer (for example "Based-on-patch-by:
> > ..." or "Helped-by:...").
>
> Okay, "Based-on-patch-by" would be more appropriate here.

Nice!

> > > @@ -1312,6 +1315,13 @@ static void grab_sub_body_contents(struct atom_value *val, int deref, void *buf)
> > >                     !starts_with(name, "trailers") &&
> > >                     !starts_with(name, "contents"))
> > >                         continue;
> > > +               if (atom->u.contents.option == C_RAW) {
> > > +                       v->s = xmemdupz(buf, buf_size);
> > > +                       continue;
> > > +               }
> > > +               if (object_type != OBJ_TAG && object_type != OBJ_COMMIT)
> > > +                       continue;
> >
> > When seeing the 2 lines above, I am guessing that, before this patch,
> > grab_sub_body_contents() couldn't actually be called when object_type
> > was OBJ_BLOB or OBJ_TREE, but you have made other changes elsewhere so
> > that now it can. As only the atom->u.contents.option == C_RAW case is
> > relevant in this case, you added this check. Let's see if I am
> > right...
> >
> > >                 if (!subpos)
> > >                         find_subpos(buf,
> > >                                     &subpos, &sublen,
> > > @@ -1374,25 +1384,30 @@ static void fill_missing_values(struct atom_value *val)
> > >   * pointed at by the ref itself; otherwise it is the object the
> > >   * ref (which is a tag) refers to.
> > >   */
> > > -static void grab_values(struct atom_value *val, int deref, struct object *obj, void *buf)
> > > +static void grab_values(struct atom_value *val, int deref, struct object *obj, struct expand_data *data)
> > >  {
> > > +       void *buf = data->content;
> > > +       unsigned long buf_size = data->size;
> > > +
> > >         switch (obj->type) {
> > >         case OBJ_TAG:
> > >                 grab_tag_values(val, deref, obj);
> > > -               grab_sub_body_contents(val, deref, buf);
> > > +               grab_sub_body_contents(val, deref, buf, buf_size, obj->type);
> > >                 grab_person("tagger", val, deref, buf);
> > >                 break;
> > >         case OBJ_COMMIT:
> > >                 grab_commit_values(val, deref, obj);
> > > -               grab_sub_body_contents(val, deref, buf);
> > > +               grab_sub_body_contents(val, deref, buf, buf_size, obj->type);
> > >                 grab_person("author", val, deref, buf);
> > >                 grab_person("committer", val, deref, buf);
> > >                 break;
> > >         case OBJ_TREE:
> > >                 /* grab_tree_values(val, deref, obj, buf, sz); */
> > > +               grab_sub_body_contents(val, deref, buf, buf_size, obj->type);
> > >                 break;
> > >         case OBJ_BLOB:
> > >                 /* grab_blob_values(val, deref, obj, buf, sz); */
> > > +               grab_sub_body_contents(val, deref, buf, buf_size, obj->type);
> >
> > ...ok, I was right above. The issue now is that I wonder if
> > grab_sub_body_contents() is still a good name for a function that can
> > be called for a blob or a tree which does not really have a body.
>
> Makes sense, It might be better to use the new name: grab_contents().

Yeah.

  reply	other threads:[~2021-05-21  9:10 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-20  8:49 [PATCH] [GSOC] ref-filter: add contents:raw atom ZheNing Hu via GitGitGadget
2021-05-20 11:29 ` Hariom verma
2021-05-20 14:34   ` ZheNing Hu
2021-05-20 16:20 ` Christian Couder
2021-05-21  4:43   ` ZheNing Hu
2021-05-21  9:10     ` Christian Couder [this message]
2021-05-21 13:12       ` ZheNing Hu
2021-05-21 13:39         ` ZheNing Hu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAP8UFD1yLS9UBs0r6_GjB-K6prW7GNxR+z445HJe8cR4HYLewA@mail.gmail.com \
    --to=christian.couder@gmail.com \
    --cc=adlternative@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=hariom18599@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).