git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [GSoC] Git Blog 5
@ 2021-06-20 16:53 ZheNing Hu
  2021-06-21 12:07 ` Christian Couder
  0 siblings, 1 reply; 3+ messages in thread
From: ZheNing Hu @ 2021-06-20 16:53 UTC (permalink / raw)
  To: Git List
  Cc: Junio C Hamano, Hariom verma,
	Ævar Arnfjörð Bjarmason, Christian Couder

My fifth week blog finished:
The web version is here:
https://adlternative.github.io/GSOC-Git-Blog-5/

## Week5: Tempting apple

This week, I spent a lot of time working on the digital circuit
course design of the school. So this week's patches for git
was completed in a hurry. This week `Ævar Arnfjörð Bjarmason`
gave a lot of useful suggestions for the patch I wrote earlier.
Some are related to code style improvements, and some are
better design ideas.

Before, I wanted to use a `<s, s_size>` style string in `atom_value`
to help copy and compare data containing '\0'.

Like this:

```c
struct atom_value {
const char *s;
size_t s_size;
...
};
```

But `Ævar Arnfjörð Bjarmason` thinks it is more reasonable to
use `strbuf` instead of `<s, s_size>`.

Like this:

```c
struct atom_value {
struct strbuf s;
...
};
```

Since the `strbuf` API has a natural `<s, s_size>`, we can add data
that may contain '\0' through `strbuf_add()`, `strbuf_addbuf()`, and
`strbuf_addf()` can also be used to fill strbuf with format string. Use
 `strbuf_addstr()` to replace `xstrdup()`, `strbuf_add()` to replace
 `xmemdupz()`, this is indeed a very tempting choice.

But in my actual refactoring process, this is not very easy to achieve.

For example, the original interface is like this,

```c
v->s = copy_email(wholine, &used_atom[i]);
```

`v->s` will be filled with the data dynamically allocated by `copy_email()`.

```c
static const char *copy_email(const char *buf, struct used_atom *atom)
{
...
if (!eoemail)
return xstrdup("");
return xmemdupz(email, eoemail - email);
}
```

Then if we want use `strbuf` type `v->s`, we should change the
 `copy_email()` interface parameters and return value.

```c
static void copy_email(struct strbuf *str, const char *buf, struct
used_atom *atom)
{
...
if (!eoemail)
return;
return strbuf_add(str, email, eoemail - email);
}
```

Then the caller can do:

```c
copy_email(&v->s, wholine, &used_atom[i]);
```

This is in line with our expectations.

But something like `fill_remote_ref_details()`, things gradually
become complicated and difficult. `fill_remote_ref_details()`.
Just consider the `show_ref()` called in `fill_remote_ref_details()`,
`show_ref()` may call `shorten_unambiguous_ref()` internally, and
then another function is called internally in `shorten_unambiguous_ref()`...
This makes us fail the method of passing `v->s` in parameters like
`copy_email()` does. Another way of thinking: what if I can "attach"
the data directly? Using `strbuf_attach()` may be a viable option, but...

```c
size_t len;
void *data = shorten_unambiguous_ref(refname, warn_ambiguous_refs);
len = strlen(data);
strbuf_attach(&str, data, len, len);
```

...we need to get the length of the data, but this is not easy to do,
`strlen()` can only be used on data that does not contain '\0',
and we are not sure whether a function like `shorten_unambiguous_ref()`
will return a `NULL`.

Well, this is one of the reasons why I cannot move on.

On the other hand, look at the following piece of code, it appears
in `populate_value()`.

```c
for (i = 0; i < used_atom_cnt; i++) {
struct atom_value *v = &ref->value[i];
if (v->s == NULL && used_atom[i].source == SOURCE_NONE)
return strbuf_addf_ret(err, -1, _("missing object %s for %s"),
oid_to_hex(&ref->objectname), ref->refname);
}
```

We need to determine whether `v->s` equals to `NULL`. We can use
c-style strings to easily distinguish between empty strings(`xstrdup("")`)
and `NULL`, but if we use strbuf, it is not easy to distinguish, because
an empty strbuf has the following characteristics: `s.buf == strbuf_slopbuf`
or `sb->buf[0] == '\0'`. It can be said that we shouldn’t even use `NULL`
to assign to `s.buf`.

So in the end, I rejected this seemingly very attractive solution, and use the
previous strategy: `<s, s_size>`, unless someone can think of a better
solution later `;-)`.

* FootNote:
1. Why is there no `strbuf_empty()` in `strbuf` API? I think this may be
a very important thing.

```c
#define strbuf_empty(sb) \
(sb->buf == strbuf_slopbuf) ? \
(!strbuf_slopbuf[0]) : \
(sb->buf[0] == '\0')
```

2. Another thing worth mentioning is: I will have the school final exam
between July 7th and July 14th, I may be busy during this period.

Thanks for Git community, reviewers and mentors.
--
ZheNing Hu

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [GSoC] Git Blog 5
  2021-06-20 16:53 [GSoC] Git Blog 5 ZheNing Hu
@ 2021-06-21 12:07 ` Christian Couder
  2021-06-21 13:17   ` ZheNing Hu
  0 siblings, 1 reply; 3+ messages in thread
From: Christian Couder @ 2021-06-21 12:07 UTC (permalink / raw)
  To: ZheNing Hu
  Cc: Git List, Junio C Hamano, Hariom verma,
	Ævar Arnfjörð Bjarmason

On Sun, Jun 20, 2021 at 6:53 PM ZheNing Hu <adlternative@gmail.com> wrote:
>
> My fifth week blog finished:
> The web version is here:
> https://adlternative.github.io/GSOC-Git-Blog-5/

Great, thanks!

> ## Week5: Tempting apple
>
> This week, I spent a lot of time working on the digital circuit
> course design of the school. So this week's patches for git
> was completed in a hurry. This week `Ævar Arnfjörð Bjarmason`

s/was/were/

> gave a lot of useful suggestions for the patch I wrote earlier.

Maybe a link could help.

> Some are related to code style improvements, and some are
> better design ideas.

[...]

> But something like `fill_remote_ref_details()`, things gradually
> become complicated and difficult. `fill_remote_ref_details()`.

I think you can remove the above `fill_remote_ref_details()` as you
are talking about it later.

> Just consider the `show_ref()` called in `fill_remote_ref_details()`,
> `show_ref()` may call `shorten_unambiguous_ref()` internally, and
> then another function is called internally in `shorten_unambiguous_ref()`...

> Well, this is one of the reasons why I cannot move on.
>
> On the other hand, look at the following piece of code, it appears
> in `populate_value()`.
>
> ```c
> for (i = 0; i < used_atom_cnt; i++) {
> struct atom_value *v = &ref->value[i];
> if (v->s == NULL && used_atom[i].source == SOURCE_NONE)
> return strbuf_addf_ret(err, -1, _("missing object %s for %s"),
> oid_to_hex(&ref->objectname), ref->refname);
> }
> ```
>
> We need to determine whether `v->s` equals to `NULL`. We can use

s/equals to/equals/

> c-style strings to easily distinguish between empty strings(`xstrdup("")`)
> and `NULL`, but if we use strbuf, it is not easy to distinguish, because
> an empty strbuf has the following characteristics: `s.buf == strbuf_slopbuf`
> or `sb->buf[0] == '\0'`. It can be said that we shouldn’t even use `NULL`
> to assign to `s.buf`.
>
> So in the end, I rejected this seemingly very attractive solution, and use the
> previous strategy: `<s, s_size>`, unless someone can think of a better
> solution later `;-)`.
>
> * FootNote:
> 1. Why is there no `strbuf_empty()` in `strbuf` API? I think this may be
> a very important thing.

There is a 'size_t len' field in 'struct strbuf', so doing `buf->len
== 0` works.

> 2. Another thing worth mentioning is: I will have the school final exam
> between July 7th and July 14th, I may be busy during this period.

Ok, thanks for telling us in advance.

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [GSoC] Git Blog 5
  2021-06-21 12:07 ` Christian Couder
@ 2021-06-21 13:17   ` ZheNing Hu
  0 siblings, 0 replies; 3+ messages in thread
From: ZheNing Hu @ 2021-06-21 13:17 UTC (permalink / raw)
  To: Christian Couder
  Cc: Git List, Junio C Hamano, Hariom verma,
	Ævar Arnfjörð Bjarmason

Christian Couder <christian.couder@gmail.com> 于2021年6月21日周一 下午8:07写道:
>
> On Sun, Jun 20, 2021 at 6:53 PM ZheNing Hu <adlternative@gmail.com> wrote:
> >
> > My fifth week blog finished:
> > The web version is here:
> > https://adlternative.github.io/GSOC-Git-Blog-5/
>
> Great, thanks!
>
> > ## Week5: Tempting apple
> >
> > This week, I spent a lot of time working on the digital circuit
> > course design of the school. So this week's patches for git
> > was completed in a hurry. This week `Ævar Arnfjörð Bjarmason`
>
> s/was/were/
>
> > gave a lot of useful suggestions for the patch I wrote earlier.
>
> Maybe a link could help.
>
> > Some are related to code style improvements, and some are
> > better design ideas.
>
> [...]
>
> > But something like `fill_remote_ref_details()`, things gradually
> > become complicated and difficult. `fill_remote_ref_details()`.
>
> I think you can remove the above `fill_remote_ref_details()` as you
> are talking about it later.
>

OK.

> > * FootNote:
> > 1. Why is there no `strbuf_empty()` in `strbuf` API? I think this may be
> > a very important thing.
>
> There is a 'size_t len' field in 'struct strbuf', so doing `buf->len
> == 0` works.
>

Really? So I think for reader who is learning strbuf api for the first time,
the characteristics of empty strbuf are not so obvious. For example,
`strbuf_slopbuf` may mean "empty". In any case, adding a macro
is absolutely reasonable and will not affect performance.

#define strbuf_empty(sb) (sb->len == 0)

> > 2. Another thing worth mentioning is: I will have the school final exam
> > between July 7th and July 14th, I may be busy during this period.
>
> Ok, thanks for telling us in advance.

Correct the time deviation, the examination time is from July 1st to July 12th.

--
ZheNing Hu

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-06-21 13:18 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-20 16:53 [GSoC] Git Blog 5 ZheNing Hu
2021-06-21 12:07 ` Christian Couder
2021-06-21 13:17   ` ZheNing Hu

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).