git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Michal Suchánek" <msuchanek@suse.de>
To: "Eric S. Raymond" <esr@thyrsus.com>
Cc: "Jakub Narebski" <jnareb@gmail.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Derrick Stolee" <stolee@gmail.com>,
	git@vger.kernel.org
Subject: Re: Finer timestamps and serialization in git
Date: Mon, 20 May 2019 16:41:34 +0200	[thread overview]
Message-ID: <20190520164134.6b35b9f9@kitsune.suse.cz> (raw)
In-Reply-To: <20190520141417.GA83559@thyrsus.com>

On Mon, 20 May 2019 10:14:17 -0400
"Eric S. Raymond" <esr@thyrsus.com> wrote:

> Jakub Narebski <jnareb@gmail.com>:
> > > What "commits that follow it?" By hypothesis, the incoming commit's
> > > timestamp is bumped (if it's bumped) when it's first added to a branch
> > > or branches, before there are following commits in the DAG.  
> > 
> > Errr... the main problem is with distributed nature of Git, i.e. when
> > two repositories create different commits with the same
> > committer+timestamp value.  You receive commits on fetch or push, and
> > you receive many commits at once.
> > 
> > Say you have two repositories, and the history looks like this:
> > 
> >  repo A:   1<---2<---a<---x<---c<---d      <- master
> > 
> >  repo B:   1<---2<---X<---3<---4           <- master
> > 
> > When you push from repo A to repo B, or fetch in repo B from repo A you
> > would get the following DAG of revisions
> > 
> >  repo B:   1<---2<---X<---3<---4           <- master
> >                  \
> >                   \--a<---x<---c<---d      <- repo_A/master
> > 
> > Now let's assume that commits X and x have the came committer and the
> > same fractional timestamp, while being different commits.  Then you
> > would need to bump timestamp of 'x', changing the commit.  This means
> > that 'c' needs to be rewritten too, and 'd' also:
> > 
> >  repo B:   1<---2<---X<---3<---4           <- master
> >                  \
> >                   \--a<---x'<--c'<--d'     <- repo_A/master  
> 
> Of course that's true.  But you were talking as though all those commits
> have to be modified *after they're in the DAG*, and that's not the case.
> If any timestamp has to be modified, it only has to happen *once*, at the
> time its commit enters the repo.

And that's where you get it wrong. Git is *distributed*. There is more
than one repository. Each repository has its own DAG that is completely
unrelated to the other repositories and their DAGs. So when you take
your history and push it to another repository and the timestamps
change as the result what ends up in the other repository is not the
history you pushed. So the repositories diverge and you no longer know
what is what.

> 
> Actually, in the normal case only x would need to be modified. The only
> way c would need to be modified is if bumping x's timestamp caused an
> actual collision with c's.
> 
> I don't see any conceptual problem with this.  You appear to me to be
> confusing two issues.  Yes, bumping timestamps would mean that all
> hashes downstream in the Merkle tree would be generated differently,
> even when there's no timestamp collision, but so what?  The hash of a
> commit isn't portable to begin with - it can't be, because AFAIK
> there's no guarantee that the ancestry parts of the DAG in two
> repositories where copies of it live contain all the same commits and
> topo relationships.

If you push form one repository to another repository now you get exact
same history with exact same hashes. So the hashes are portable across
repositories that share history. With your proposed change hashes can
be modified on push/pull so repositories no longer share history and
hashes become non-portable. That's why it is a bad idea.

The commits are currently identified by the hash so it must not change
during push/pull. Changing the identifier to something else (eg content
has without (some) metadata) might be useful to make the identifier
more stable but will bring other problems when you need two
different identifiers for the same content to include it in two
unrelated histories.

Thanks

Michal

  reply	other threads:[~2019-05-20 14:41 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-15 19:16 Finer timestamps and serialization in git Eric S. Raymond
2019-05-15 20:16 ` Derrick Stolee
2019-05-15 20:28   ` Jason Pyeron
2019-05-15 21:14     ` Derrick Stolee
2019-05-15 22:07       ` Ævar Arnfjörð Bjarmason
2019-05-16  0:28       ` Eric S. Raymond
2019-05-16  1:25         ` Derrick Stolee
2019-05-20 15:05           ` Michal Suchánek
2019-05-20 16:36             ` Eric S. Raymond
2019-05-20 17:22               ` Derrick Stolee
2019-05-20 21:32                 ` Eric S. Raymond
2019-05-15 23:40     ` Eric S. Raymond
2019-05-19  0:16       ` Philip Oakley
2019-05-19  4:09         ` Eric S. Raymond
2019-05-19 10:07           ` Philip Oakley
2019-05-15 23:32   ` Eric S. Raymond
2019-05-16  1:14     ` Derrick Stolee
2019-05-16  9:50     ` Ævar Arnfjörð Bjarmason
2019-05-19 23:15       ` Jakub Narebski
2019-05-20  0:45         ` Eric S. Raymond
2019-05-20  9:43           ` Jakub Narebski
2019-05-20 10:08             ` Ævar Arnfjörð Bjarmason
2019-05-20 12:40             ` Jeff King
2019-05-20 14:14             ` Eric S. Raymond
2019-05-20 14:41               ` Michal Suchánek [this message]
2019-05-20 22:18                 ` Philip Oakley
2019-05-20 21:38               ` Elijah Newren
2019-05-20 23:12                 ` Eric S. Raymond
2019-05-21  0:08               ` Jakub Narebski
2019-05-21  1:05                 ` Eric S. Raymond
2019-05-15 20:20 ` Ævar Arnfjörð Bjarmason
2019-05-16  0:35   ` Eric S. Raymond
2019-05-16  4:14   ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190520164134.6b35b9f9@kitsune.suse.cz \
    --to=msuchanek@suse.de \
    --cc=avarab@gmail.com \
    --cc=esr@thyrsus.com \
    --cc=git@vger.kernel.org \
    --cc=jnareb@gmail.com \
    --cc=stolee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).