git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Abhishek Kumar <abhishekkumar8222@gmail.com>
To: jnareb@gmail.com
Cc: stolee@gmail.com, git@vger.kernel.org
Subject: Re: [PATCH v3 1/4] alloc: introduce parsed_commits_count
Date: Sun, 14 Jun 2020 11:39:38 +0530	[thread overview]
Message-ID: <20200614060938.GA3024@Abhishek-Arch> (raw)

On 13.06.2020, Jakub Narębski wrote:

> On 12.06.2020, Abhishek Kumar wrote:
> 
> > Commit slab relies on uniqueness of commit->index to access data. As
> > submodules are repositories on their own, alloc_commit_index() (which
> > depends on repository->parsed_objects->commit_count) no longer
> > returns unique values.
> > 
> > This would break tests once we move `generation` and `graph_pos` into
> > a
> > commit slab, as commits of supermodule and submodule can have the same
> > index but must have different graph positions.
> 
> First, commits of supermodule and of submodule are in different graphs,
> so I don't see why they have to be in the same serialized commit-graph
> file.
> 

I should have been clearer in the commit message:

commit->index = ith commit allocated for the repository.

As both supermodule and submodule are repositories, they can have
commits with the same index.

This means that were both the ith commit allocated for supermodule and
submodule respectively.

However, the commit-slab code treats them identically (because of the
same index) and they (incorrectly) have same generation number and graph
position.

Switching to a global counter instead of per-repo counter avoids having
two commits with the same index in the first place.

> Second, Git stores many different types of information on slab already.
> How comes that we have not had any problems till now?  
> 
> There is contains_cache, commit_seen, indegree_slab, author_date_slab,
> commit_base, commit_pos, bloom_filter_slab, buffer_slab,
> commit_rev_name,
> commit_names, commit_name_slab, saved_parents, blame_suspects,
> commit_todo_item.
> 

That's a fair point - It seems rather unlikely that we had not faced
this problem yet.

> > 
> > Let's introduce a counter variable, `parsed_commits_count` to keep
> > track
> > of parsed commits so far.
> 
> All right, thought it might be worth mentioning that it is a global
> variable, or rather a static variable in a function.
> 

Alright, noted.

> > 
> > 
> > Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com>
> > ---
> > 
> > CI Build for the failing tests:
> > https://travis-ci.com/github/abhishekkumar2718/git/jobs/345413840
> 
> The failed tests are, from what I see:
> - t4060-diff-submodule-option-diff-format.sh
> - t4041-diff-submodule-option.sh
> - t4059-diff-submodule-not-initialized.sh
> 
> 
> > 
> >   alloc.c | 4 +++-
> >   1 file changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/alloc.c b/alloc.c
> > index 1c64c4dd16..29f0e3aa80 100644
> > --- a/alloc.c
> > +++ b/alloc.c
> > @@ -101,7 +101,9 @@ void *alloc_object_node(struct repository *r)
> >   
> >   static unsigned int alloc_commit_index(struct repository *r)
> >   {
> > -     return r->parsed_objects->commit_count++;
> > +     static unsigned int parsed_commits_count = 0;
> > +     r->parsed_objects->commit_count++;
> 
> Do we use r->parsed_objects->commit_count anywhere?

We don't use it anywhere else. 14ba97f8 (alloc: allow arbitary
repositories for alloc functions, 2019-05-15) made the count per
repository but it's only been used to assign commit index.

Junio suggested that we could remove commit_count from the struct and
clean up function calls a bit.

> 
> > +     return parsed_commits_count++;
> 
> Does it matter that it is not thread safe, because it is not atomic?
> Shouldn't it be
> 
>   +     static _Atomic unsigned int parsed_commits_count = 0;
> 
>   or
> 
>     +   static _Atomic unsigned int parsed_commits_count =
>     ATOMIC_VAR_INIT(0);
> 
>     (If it is allowed).
> 
>     >   }
>     >   
>     >   void init_commit_node(struct repository *r, struct commit *c)
>     > 

Thanks
Abhishek

             reply	other threads:[~2020-06-14  6:11 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-14  6:09 Abhishek Kumar [this message]
  -- strict thread matches above, loose matches on Subject: below --
2020-06-12 18:40 [GSoC Patch v3 0/4] Move generation, graph_pos to a slab Abhishek Kumar
2020-06-12 18:40 ` [PATCH v3 1/4] alloc: introduce parsed_commits_count Abhishek Kumar
2020-06-12 20:08   ` Junio C Hamano
2020-06-12 22:00     ` Junio C Hamano
2020-06-12 22:20       ` Junio C Hamano
2020-06-12 22:52   ` Junio C Hamano
2020-06-13 18:57     ` Abhishek Kumar
2020-06-12 23:16   ` Jakub Narębski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200614060938.GA3024@Abhishek-Arch \
    --to=abhishekkumar8222@gmail.com \
    --cc=8b0facef-c85d-c25c-d49d-2bc1a3836e77@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=jnareb@gmail.com \
    --cc=stolee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).