From: ZheNing Hu <adlternative@gmail.com>
To: Git List <git@vger.kernel.org>
Cc: Junio C Hamano <gitster@pobox.com>,
Christian Couder <christian.couder@gmail.com>,
Hariom verma <hariom18599@gmail.com>,
Felipe Contreras <felipe.contreras@gmail.com>,
Phillip Wood <phillip.wood123@gmail.com>
Subject: [GSOC] Git Blog 2
Date: Sun, 30 May 2021 22:35:56 +0800 [thread overview]
Message-ID: <CAOLTT8T0XtmpH3cHDBOfcX5nmCzfyrKby4TrFkHtA9H3dDN63A@mail.gmail.com> (raw)
My second week blog finished:
The web version is here:
https://adlternative.github.io/GSOC-Git-Blog-2/
-------
## Week2: learning the slang of a new city
### What happened this week
- In [[PATCH 1/2] [GSOC] ref-filter: add %(raw)
atom](https://lore.kernel.org/git/b3848f24f2d3f91fc96f20b5a08cbfbe721acbd6.1622126603.git.gitgitgadget@gmail.com/),
I made a license-related mistake this week. When I was
implementing `%(raw)` atom for ref-filter, I noticed that
`glibc` did not provide us with `memcasecmp()` which
can be used to compare two pieces of memory and
ignore case, so I found `memcasecmp()` implemented
by `gnulib` on the Internet, and copy it to git to use.
But unfortunately, I should not copy it so "conveniently".
Git use `gpl-v2` and `gunlib` use `gpl-v3`, they are
incompatible. Since I used to write code for my own use,
I am not very sensitive to licenses problems. Thanks to
`Felipe Contreras` for correcting me. Therefore, from today
onwards, I will be more careful about the license.
- On the other hand, I realized that clean code is also a
very important thing. In `cmp_ref_sorting()`, I want to use
`memcmp()/memcasecmp()` to compare two strings.
BAD VERSION:
```c
int (*cmp_fn)(const void *, const void *, size_t);
cmp_fn = s->sort_flags & REF_SORTING_ICASE
? memcasecmp : memcmp;
if (va->s_size != ATOM_VALUE_S_SIZE_INIT &&
vb->s_size != ATOM_VALUE_S_SIZE_INIT) {
cmp = cmp_fn(va->s, vb->s, va->s_size
> vb->s_size ?
vb->s_size : va->s_size);
} else if (va->s_size == ATOM_VALUE_S_SIZE_INIT) {
slen = strlen(va->s);
cmp = cmp_fn(va->s, vb->s, slen > vb->s_size ?
vb->s_size : slen);
} else {
slen = strlen(vb->s);
cmp = cmp_fn(va->s, vb->s, slen > va->s_size ?
slen : va->s_size);
}
cmp = cmp ? cmp : va->s_size - vb->s_size;
}
```
It's complicated and buggy.
GOOD VERSION:
```c
int (*cmp_fn)(const void *, const void *, size_t);
cmp_fn = s->sort_flags & REF_SORTING_ICASE
? memcasecmp : memcmp;
size_t a_size = va->s_size == ATOM_VALUE_S_SIZE_INIT ?
strlen(va->s) : va->s_size;
size_t b_size = vb->s_size == ATOM_VALUE_S_SIZE_INIT ?
strlen(vb->s) : vb->s_size;
cmp = cmp_fn(va->s, vb->s, b_size > a_size ?
a_size : b_size);
if (!cmp) {
if (a_size > b_size)
cmp = 1;
else if (a_size < b_size)
cmp = -1;
}
```
It's relatively refreshing a lot.
So how to cultivate a good coding style? As
`Felipe Contreras` said: "It's like learning the
slang of a new city; it takes a while."
### What's the next step
There are still some flaws in the %(raw)
implementation, but let's look ahead and see what
we can do. We check the atoms with `verify_ref_format()`
and pass object oid and grub corresponding object data
through `format_ref_array_item()`:
|Git command|Atoms|
|-|-|
|`git cat-file --batch-check` | `%(objectname) %(objecttype) %(objectsize)`|
|`git cat-file --batch --symlink`| `%(objectname) %(objecttype) %(objectsize)`|
|`git cat-file --batch` | `%(objectname) %(objecttype) %(objectsize)\n%(raw)`|
|`git cat-file --batch --textconv` | `%(objectname) %(objecttype)
%(objectsize)\n%(raw:textconv)`|
|`git cat-file --batch --filter` | `%(objectname) %(objecttype)
%(objectsize)\n%(raw:filter)`|
|`git cat-file --batch="%(rest)"` | `%(rest)`|
No additional operation is required in
`git cat-file --batch --symlink`, because
`get_oid_with_context(...,GET_OID_FOLLOW_SYMLINKS,...)`
can help us do that.
I have leave the rough implementation here:
[adlternative:cat-file-temp](https://github.com/gitgitgadget/git/compare/master...adlternative:cat-file-temp).
its performance is 25% worse than before.
Rather than posting it to the mailing list, because I still
need to implement the previous dependencies step by step...
Thanks!
--
ZheNing Hu
next reply other threads:[~2021-05-30 14:36 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-05-30 14:35 ZheNing Hu [this message]
2021-05-30 15:53 ` [GSOC] Git Blog 2 Felipe Contreras
2021-05-31 11:04 ` ZheNing Hu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAOLTT8T0XtmpH3cHDBOfcX5nmCzfyrKby4TrFkHtA9H3dDN63A@mail.gmail.com \
--to=adlternative@gmail.com \
--cc=christian.couder@gmail.com \
--cc=felipe.contreras@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=hariom18599@gmail.com \
--cc=phillip.wood123@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).