erratic behavior commit --allow-empty

git@vger.kernel.org mailing list mirror (one of many)

* erratic behavior commit --allow-empty
@ 2012-10-02  7:51 Angelo Borsotti
  2012-10-02  8:26 ` Johannes Sixt
  0 siblings, 1 reply; 53+ messages in thread
From: Angelo Borsotti @ 2012-10-02  7:51 UTC (permalink / raw)
  To: git

Hi

I have noticed erratic behavior in git commit --allow-empty: sometimes
it creates a new commit, and sometimes it does not.
I ran the following script, emptycommit, twice:

#!/bin/bash
set -x
rm -rf local
mkdir local
cd local
git init
echo "aaa" >f1
git add f1
git commit -m A
git checkout --orphan feature
git commit -m A --allow-empty
git rev-list --all --pretty=oneline

This is the log of the first execution:

$ emptycommit
+ rm -rf local
+ mkdir local
+ cd local
+ git init
Initialized empty Git repository in d:/gtest/local/.git/
+ echo aaa
+ git add f1
warning: LF will be replaced by CRLF in f1.
The file will have its original line endings in your working directory.
+ git commit -m A
[master (root-commit) 07e7d37] A
warning: LF will be replaced by CRLF in f1.
The file will have its original line endings in your working directory.
 1 file changed, 1 insertion(+)
 create mode 100644 f1
+ git checkout --orphan feature
Switched to a new branch 'feature'
+ git commit -m A --allow-empty
[feature (root-commit) 2297c4e] A
warning: LF will be replaced by CRLF in f1.
The file will have its original line endings in your working directory.
 1 file changed, 1 insertion(+)
 create mode 100644 f1
+ git rev-list --all --pretty=oneline
2297c4e34ec27f3cdeca8c0dcdcd61b4a079f411 A
07e7d379c2339ed375ed4903f6196d627367b7bf A

>>>>> note that git commit -m A --allow-empty creates a commit

This is the log of the second execution:

$ emptycommit
+ rm -rf local
+ mkdir local
+ cd local
+ git init
Initialized empty Git repository in d:/gtest/local/.git/
+ echo aaa
+ git add f1
warning: LF will be replaced by CRLF in f1.
The file will have its original line endings in your working directory.
+ git commit -m A
[master (root-commit) 1b86218] A
warning: LF will be replaced by CRLF in f1.
The file will have its original line endings in your working directory.
 1 file changed, 1 insertion(+)
 create mode 100644 f1
+ git checkout --orphan feature
Switched to a new branch 'feature'
+ git commit -m A --allow-empty
[feature (root-commit) 1b86218] A
warning: LF will be replaced by CRLF in f1.
The file will have its original line endings in your working directory.
 1 file changed, 1 insertion(+)
 create mode 100644 f1
+ git rev-list --all --pretty=oneline
1b8621851f6ae2943347da655661e9d5dc978208 A

>>>>> note that git commit -m A --allow-empty DOES NOT create a commit

The script has been run on Windows 7 with git version 1.7.11.msysgit.1

-Angelo

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: erratic behavior commit --allow-empty
  2012-10-02  7:51 erratic behavior commit --allow-empty Angelo Borsotti
@ 2012-10-02  8:26 ` Johannes Sixt
  2012-10-02  8:49   ` Angelo Borsotti
                     ` (2 more replies)
  0 siblings, 3 replies; 53+ messages in thread
From: Johannes Sixt @ 2012-10-02  8:26 UTC (permalink / raw)
  To: Angelo Borsotti; +Cc: git

On 10/2/2012 9:51, Angelo Borsotti wrote:
> This is the log of the second execution:
> 
> $ emptycommit
> + rm -rf local
> + mkdir local
> + cd local
> + git init
> Initialized empty Git repository in d:/gtest/local/.git/
> + echo aaa
> + git add f1
> warning: LF will be replaced by CRLF in f1.
> The file will have its original line endings in your working directory.
> + git commit -m A
> [master (root-commit) 1b86218] A
> warning: LF will be replaced by CRLF in f1.
> The file will have its original line endings in your working directory.
>  1 file changed, 1 insertion(+)
>  create mode 100644 f1
> + git checkout --orphan feature
> Switched to a new branch 'feature'
> + git commit -m A --allow-empty
> [feature (root-commit) 1b86218] A
> warning: LF will be replaced by CRLF in f1.
> The file will have its original line endings in your working directory.
>  1 file changed, 1 insertion(+)
>  create mode 100644 f1
> + git rev-list --all --pretty=oneline
> 1b8621851f6ae2943347da655661e9d5dc978208 A
> 
>>>>>> note that git commit -m A --allow-empty DOES NOT create a commit

Note that git commit -m A --allow-empty *DID* create a commit. It just
received the same name (SHA1) as the commit you created before it,
because it had exactly the same contents (files, parents, author, committer,
and timestamps). Evidently, your script ran fast enough that the two
commits happened in the same second.
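
A minimal sketch of the point above, not part of the original message: if every input to a commit is pinned (tree, parents, author, committer, both timestamps, message), the object name repeats, and moving one timestamp by a single second changes it. The identity and timestamp values are arbitrary placeholders, and the script assumes git is on the PATH.

```shell
#!/bin/sh
# Sketch: two commits built from identical inputs receive the same object name.
set -e
repo=$(mktemp -d)          # throwaway repository
cd "$repo"
git init -q

# Pin identity and timestamps so nothing depends on the wall clock.
# All of these values are arbitrary placeholders (assumptions).
export GIT_AUTHOR_NAME='Demo User' GIT_AUTHOR_EMAIL='demo@example.com'
export GIT_COMMITTER_NAME='Demo User' GIT_COMMITTER_EMAIL='demo@example.com'
export GIT_AUTHOR_DATE='1349164260 +0000'
export GIT_COMMITTER_DATE='1349164260 +0000'

echo aaa >f1
git add f1
tree=$(git write-tree)                   # tree object for the staged content

first=$(git commit-tree -m A "$tree")    # parentless commit, message "A"
second=$(git commit-tree -m A "$tree")   # exactly the same inputs again

# Move only the committer timestamp forward by one second.
GIT_COMMITTER_DATE='1349164261 +0000'
third=$(git commit-tree -m A "$tree")

echo "first:  $first"
echo "second: $second"
echo "third:  $third"
```

The first two names should be equal and the third different, which is the same effect the original script hits by chance when both commits fall within one clock second.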

-- Hannes


* Re: erratic behavior commit --allow-empty
  2012-10-02  8:26 ` Johannes Sixt
@ 2012-10-02  8:49   ` Angelo Borsotti
  2012-10-02 17:27   ` Junio C Hamano
  2013-01-12 18:30   ` Jan Engelhardt
  2 siblings, 0 replies; 53+ messages in thread
From: Angelo Borsotti @ 2012-10-02  8:49 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: git

Hi

Having such a time-dependent behavior is not nice. It means that the user
must know about it, and either wait patiently before issuing the command
or, in a script, add a sleep before the command.
The choice is then between adding a warning to the man page ("please wait
at least a second before executing the command") or adding a sleep inside
the command itself.
Obviously, the second alternative looks much more appealing.

Thank you
-Angelo


* Re: erratic behavior commit --allow-empty
  2012-10-02  8:26 ` Johannes Sixt
  2012-10-02  8:49   ` Angelo Borsotti
@ 2012-10-02 17:27   ` Junio C Hamano
  2012-10-02 19:34     ` Angelo Borsotti
  2013-01-12 18:30   ` Jan Engelhardt
  2 siblings, 1 reply; 53+ messages in thread
From: Junio C Hamano @ 2012-10-02 17:27 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Angelo Borsotti, git

Johannes Sixt <j.sixt@viscovery.net> writes:

> Note that git commit -m A --allow-empty *DID* create a commit. Only, that
> it received the same name (SHA1) as the commit you created before it
> because it had the exact same contents (files, parents, author, committer,
> and timestamps). Obviously, your script was executed sufficiently fast
> that the two commits happened in the same second.

Correct.

And this does not have anything to do with --allow-empty.  You can
"reset --soft HEAD^" immediately after committing a change and redo
it to get the same effect.  If you commit the same state with the
same history with the same message as the same person at the same
time, you will reliably get the same commit object.

And that is a fundamental property called reproducibility.  There is
nothing to be alarmed about in this exercise.
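
To illustrate why identical inputs must give an identical object (a sketch, not part of the original message): in a repository that uses SHA-1 object names, a commit's name is the SHA-1 of the header "commit <size>" plus a NUL byte, followed by the commit text. Nothing outside that text can influence the name. The sketch assumes `sha1sum` from GNU coreutils is available, that the repository uses the SHA-1 object format, and uses placeholder identity values.

```shell
#!/bin/sh
# Sketch: recompute a commit's object name from its raw content.
set -e
repo=$(mktemp -d)          # throwaway repository
cd "$repo"
git init -q

# Placeholder identity (assumption), so the script works without global config.
export GIT_AUTHOR_NAME='Demo User' GIT_AUTHOR_EMAIL='demo@example.com'
export GIT_COMMITTER_NAME='Demo User' GIT_COMMITTER_EMAIL='demo@example.com'

echo foo >file
git add file
git commit -q -m second

# Everything that goes into the name: tree, author, committer, message.
git cat-file -p HEAD

# Hash "commit <size>\0<content>" by hand and compare with what git reports.
size=$(git cat-file -s HEAD)
recomputed=$( { printf 'commit %d\0' "$size"; git cat-file commit HEAD; } \
    | sha1sum | cut -d' ' -f1 )
actual=$(git rev-parse HEAD)

echo "recomputed: $recomputed"
echo "actual:     $actual"
```

Because the timestamps inside the commit text have one-second resolution, redoing the same commit within the same second produces byte-identical text and therefore the same name.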


* Re: erratic behavior commit --allow-empty
  2012-10-02 17:27   ` Junio C Hamano
@ 2012-10-02 19:34     ` Angelo Borsotti
  2012-10-02 19:56       ` Junio C Hamano
  2012-10-03 12:59       ` Phil Hord
  0 siblings, 2 replies; 53+ messages in thread
From: Angelo Borsotti @ 2012-10-02 19:34 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Johannes Sixt, git

Hi Junio,

if I put on the implementor's hat, I would agree with you: that command
after all behaves as implemented.
However, if I put on the user's hat I would reason differently. What I
need are predictable commands, and this one by all means is not. This is
because the time at which a command is executed (more precisely, the
statement in it that reads the system clock) is not predictable. So, even
if an implementor thinks that this behavior is reliable, a user thinks
that it is not predictable. Actually, I called that command from within a
script, and thus I had no way of knowing whether it would be executed
within 1 second of the last commit.
Read also the paragraph in the man page that describes it:

"Usually recording a commit that has the exact same tree as its sole
parent commit is a mistake, and the command prevents you from making
such a commit. This option bypasses the safety, and is primarily for
use by foreign SCM interface scripts."

I cannot find any clue in it that lets me know that it does not create
a commit if the time is within the same second as the other commit.

My suggestion is either to include a sleep in the command so as to
guarantee that a commit is created, or to remove the option.

-Angelo


* Re: erratic behavior commit --allow-empty
  2012-10-02 19:34     ` Angelo Borsotti
@ 2012-10-02 19:56       ` Junio C Hamano
  2012-10-02 21:56         ` Angelo Borsotti
  2012-10-03 12:59       ` Phil Hord
  1 sibling, 1 reply; 53+ messages in thread
From: Junio C Hamano @ 2012-10-02 19:56 UTC (permalink / raw)
  To: Angelo Borsotti; +Cc: Johannes Sixt, git

Angelo Borsotti <angelo.borsotti@gmail.com> writes:

> "Usually recording a commit that has the exact same tree as its sole
> parent commit is a mistake, and the command prevents you from making
> such a commit. This option bypasses the safety, and is primarily for
> use by foreign SCM interface scripts."
>
> I cannot find any clue in it that lets me know that it does not
> create a commit if the time is within the same second as the other
> commit.

It does create one; it just is the same one you already happen to have,
when you record the same state on top of the same history as the
same person at the same time.

> My suggestion is either to include a sleep in the command so as to
> guarantee that a commit is created, or to remove the option.

And how would it help to insert a sleep for 1 second (or 1 year
for that matter)?  As you said, it reads from the system clock, and
there are millions of systems in the world that have Git installed.
You may record the same state on top of the same history as the same
person on two different machines 5 minutes in wallclock time in
between doing so.  These two machines may end up creating the same
commit because one of them had a clock skewed by 5 minutes.

What problem are you really trying to solve?  You mentioned
importing from the foreign SCM, but in that case, you would be
building commits on top of other commits, and some commits may
not have any change recorded in them, i.e. you could validly
have (as always, time flows from left to right)

	---o---o---o---o---A---B---C

where the difference between A and B is nothing, and the difference
between B and C is nothing.  You may be a script that records these
commits in rapid succession.

When you create B and C, you may be recording the same state as the
same person with the same timestamp. *BUT* you are not recording
these two commits on top of the same history.  B is done on top of
the history leading to A, but C is done on top of the history
leading to B.  They will get different commit object names.
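
The A---B---C case can be sketched as follows (an illustration, not part of the original message). All three commits record the same tree, message, identity, and timestamp; only the parent differs. The identity and timestamp values are arbitrary placeholders, and the empty tree stands in for "no change recorded".

```shell
#!/bin/sh
# Sketch: same state, same person, same second, but different history.
set -e
repo=$(mktemp -d)          # throwaway repository
cd "$repo"
git init -q

# Pinned placeholder identity and timestamps (assumptions).
export GIT_AUTHOR_NAME='Demo User' GIT_AUTHOR_EMAIL='demo@example.com'
export GIT_COMMITTER_NAME='Demo User' GIT_COMMITTER_EMAIL='demo@example.com'
export GIT_AUTHOR_DATE='1349164260 +0000'
export GIT_COMMITTER_DATE='1349164260 +0000'

tree=$(git write-tree)     # the index is empty, so this is the empty tree

A=$(git commit-tree -m same "$tree")           # no parent
B=$(git commit-tree -p "$A" -m same "$tree")   # on top of A, no change
C=$(git commit-tree -p "$B" -m same "$tree")   # on top of B, no change

echo "A: $A"
echo "B: $B"
echo "C: $C"
```

The three names differ because the parent line is part of each commit's content, even though the trees are identical.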

So what problem are you trying to solve?

You also did not seem to have read what I wrote, or deliberately
ignored it (in which case I am wasting even more time writing this,
so I'll stop).

This does not have anything to do with "--allow-empty"; removing
"the option" would not help anything, either.  Run the following on
a fast-enough machine.

    git init
    >file
    git add file
    git commit -m initial
    echo foo >file
    git add file
    git commit -a -m second
    H1=$(git rev-parse HEAD)
    git reset --soft HEAD^
    git commit -a -m second
    H2=$(git rev-parse HEAD)
    if test "$H1" = "$H2"
    then
        echo I was quick enough
    else
        echo I was not quick enough
    fi


* Re: erratic behavior commit --allow-empty
  2012-10-02 19:56       ` Junio C Hamano
@ 2012-10-02 21:56         ` Angelo Borsotti
  2012-10-03  2:10           ` PJ Weisberg
                             ` (2 more replies)
  0 siblings, 3 replies; 53+ messages in thread
From: Angelo Borsotti @ 2012-10-02 21:56 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Johannes Sixt, git

Hi Junio,

> It does create one; it just is the same one you already happen to have,
> when you record the same state on top of the same history as the
> same person at the same time.
>

No, it does not create one: as you can see from the trace of the execution
of my script, the sha of the commit is the same as that of the other,
which means
that in the .git/objects there is only one such commit object, and not two with
the same sha. The meaning of the word "create" is to bring into being something
that did not exist before. There is no "creation" if the object already exists.

>
> And how would it help to insert a sleep for 1 second (or 1 year
> for that matter)?  As you said, it reads from the system clock, and
> there are millions of systems in the world that have Git installed.
> You may record the same state on top of the same history as the same
> person on two different machines 5 minutes in wallclock time in
> between doing so.  These two machines may end up creating the same
> commit because one of them had a clock skewed by 5 minutes.

I understood that the command does not create a new commit if all its data,
i.e. tree, committer, ... and date, are the same, with the date represented
at 1-second precision. Sleeping for 1 second guarantees that no commit in
the repo has the same time as the time after the sleep, i.e. that the
command creates a (new) commit.

>
> What problem are you really trying to solve?  You mentioned
> importing from the foreign SCM,

I quoted a piece of the man page of git commit, which states that
--allow-empty bypasses the safety check that prevents making a new commit.
That piece incidentally states that it is "primarily" used by foreign SCM
interface scripts. But of course it can be used in any script that needs
to build a commit on top of another.

>
> You also did not seem to have read what I wrote, or deliberately
> ignored it (in which case I am wasting even more time writing this,
> so I'll stop).

I did not deliberately ignore what you wrote. I might have missed some
point though.

> This does not have anything to do with "--allow-empty"; removing
> "the option" would not help anything, either.

I am reporting a problem with --allow-empty, so why do you say that this
does not have anything to do with it?
Removing the option removes a behavior that is not predictable.
Often it is better to remove a feature that turns out to be inconsistent
than to leave it in the software. Of course a much better avenue is to
make it consistent.

> Run the following on a fast-enough machine.
>
I did, and most of the time I got "I was quick enough" and
sometimes "I was not quick enough", which is the same kind of behavior
as in my script.

The problem I am trying to solve is to push only the source files to a
remote server, while keeping both sources and binaries in the local repo.
To do it, I keep an orphan branch, say "sources". When I make a commit on
the master branch, I also make a commit on the sources one after having
un-staged (git rm --cached) the binaries.
The script that does this must also cope with the particular case in
which in the commit on the master branch there are no sources. Basically
the script does:

# this is the commit on the master branch
git init
echo "aaa" >f1
git add f1
git commit -m A

# this is the piece of the script that builds the sources branch
git checkout --orphan sources
# git rm --cached ...   remove binaries, if any
git commit -m A --allow-empty
git rev-list --all --pretty=oneline

When there are binaries in the commit A, they are removed, and the tree
for the second git commit is then different, and the commit is actually
created.
When there are no binaries (as in the script above, in which the removal
is commented out), the second git commit would not create any new commit,
and I would not have an orphan branch. Hence the --allow-empty to force it
to create a new commit.
Unfortunately, it creates a new commit only if the system clock changes
the seconds of the system time between the two git commits.
If you insert a "sleep 1" before the second git commit, the commit is
really created.

I spent many hours tracking down this time-dependent error ...

-Angelo


* Re: erratic behavior commit --allow-empty
  2012-10-02 21:56         ` Angelo Borsotti
@ 2012-10-03  2:10           ` PJ Weisberg
  2012-10-03  5:37           ` Johannes Sixt
  2012-10-03  7:29           ` Philip Oakley
  2 siblings, 0 replies; 53+ messages in thread
From: PJ Weisberg @ 2012-10-03  2:10 UTC (permalink / raw)
  To: Angelo Borsotti; +Cc: Junio C Hamano, Johannes Sixt, git

On Tue, Oct 2, 2012 at 2:56 PM, Angelo Borsotti
<angelo.borsotti@gmail.com> wrote:
> Hi Junio,
>
>> It does create one; it just is the same one you already happen to have,
>> when you record the same state on top of the same history as the
>> same person at the same time.
>>
>
> No, it does not create one: as you can see from the trace of the execution
> of my script, the sha of the commit is the same as that of the other,
> which means
> that in the .git/objects there is only one such commit object, and not two with
> the same sha. The meaning of the word "create" is to bring into being something
> that did not exist before. There is no "creation" if the object already exists.

It's also impossible to create two identical files in Git.  If you
try, you'll find that they both have the same SHA1, and thus are
represented by the same object in .git/objects.

You have a script that creates two commits that are identical in every
way.  What practical difference does it make whether they're
represented by one object or two?

-PJ

Gehm's Corollary to Clark's Law: Any technology distinguishable from
magic is insufficiently advanced.


* Re: erratic behavior commit --allow-empty
  2012-10-02 21:56         ` Angelo Borsotti
  2012-10-03  2:10           ` PJ Weisberg
@ 2012-10-03  5:37           ` Johannes Sixt
  2012-10-03  6:22             ` Angelo Borsotti
  2012-10-03  7:29           ` Philip Oakley
  2 siblings, 1 reply; 53+ messages in thread
From: Johannes Sixt @ 2012-10-03  5:37 UTC (permalink / raw)
  To: Angelo Borsotti; +Cc: Junio C Hamano, git

On 10/2/2012 23:56, Angelo Borsotti wrote:
> The problem I am trying to solve is to push to a remote server the
> source files only,
> while keeping in the local repo both sources and binaries. To do it, I
> keep an orphan
> branch, [...] 
> 
> # this is the commit on the master branch
> git init
> echo "aaa" >f1
> git add f1
> git commit -m A
> 
> # this is the piece of the script that builds the sources branch
> git checkout --orphan sources
> # git rm --cached ...   remove binaries, if any"
> git commit -m A --allow-empty
> git rev-list --all --pretty=oneline
> 
> When there are binaries in the commit A, they are removed, and the
> tree for the second
> git commit is then different, and the commit is actually created.
> When there are no binaries (as in the script above, in which the
> removal is commented out),
> the second git commit would not create any new commit, and I would not
> have an orphan
> branch. Thence the --allow-empty to force it to create a new commit.
> Unfortunately, it creates a new commit only if the system clock
> changes the seconds of
> the system time between the two git commits.

But the existing-and-not-created-commit has exactly the content that you
wanted. What's the point in insisting that it is different from any other
commit?

-- Hannes


* Re: erratic behavior commit --allow-empty
  2012-10-03  5:37           ` Johannes Sixt
@ 2012-10-03  6:22             ` Angelo Borsotti
  2012-10-03  6:27               ` Johannes Sixt
  0 siblings, 1 reply; 53+ messages in thread
From: Angelo Borsotti @ 2012-10-03  6:22 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Junio C Hamano, git

Hi PJ and Hannes,

try to run the last script that I posted, with and without a sleep 1
before the last commit:

git init
echo "aaa" >f1
git add f1
git commit -m A
git checkout --orphan sources
git commit -m A --allow-empty

and

git init
echo "aaa" >f1
git add f1
git commit -m A
git checkout --orphan sources
sleep 1
git commit -m A --allow-empty

In the first one, no new commit is created, and the "sources" branch
is not orphan (you can easily see it with the git gui).
In the second one, a new commit is created, and the "sources" branch
is orphan, as expected.

-Angelo


* Re: erratic behavior commit --allow-empty
  2012-10-03  6:22             ` Angelo Borsotti
@ 2012-10-03  6:27               ` Johannes Sixt
       [not found]                 ` <CAB9Jk9AgtNQfWDr31CWbXf2ag=11du-aruu-0+nOZ3KaaG9=og@mail.gmail.com>
  0 siblings, 1 reply; 53+ messages in thread
From: Johannes Sixt @ 2012-10-03  6:27 UTC (permalink / raw)
  To: Angelo Borsotti; +Cc: Junio C Hamano, git

Not answering questions does not help anyone.

My question was: What is the point in insisting that there is a *really*
new commit when the one commit that already existed has exactly the
content that you wanted?

-- Hannes


* Re: erratic behavior commit --allow-empty
       [not found]                 ` <CAB9Jk9AgtNQfWDr31CWbXf2ag=11du-aruu-0+nOZ3KaaG9=og@mail.gmail.com>
@ 2012-10-03  7:12                   ` Johannes Sixt
  2012-10-03  7:35                     ` Angelo Borsotti
  2012-10-03 20:49                     ` Junio C Hamano
  0 siblings, 2 replies; 53+ messages in thread
From: Johannes Sixt @ 2012-10-03  7:12 UTC (permalink / raw)
  To: Angelo Borsotti; +Cc: Junio C Hamano, Git Mailing List

Cc restored; please reply to all.

On 10/3/2012 8:32, Angelo Borsotti wrote:
> Hi Hannes,
> 
> well, I thought I replied to your question:
> 
>    "What is the point in insisting that there is a *really*
>    new commit when the one commit that already existed has exactly the
>    content that you wanted?"
> 
> I wanted to create an orphan branch. I did it with a git checkout
> --orphan sources.
> This command alone does not create a branch; it needs a commit to be done on
> it, but a "real" one. If it is not a "real" one, the branch is
> created, but it is not an
> orphan one.

When you do 'git checkout --orphan sources', you request (nothing more and
nothing less than) that the next commit you make on the new branch
"sources" does not have a parent. But this is exactly what happens: The
next commit you make does not have a parent.

Perhaps you are confused by the fact that the commit you made first does
not have a parent, either. But that is just a "side effect" that it
happened to be the very first commit that you made after 'git init'.

IOW, the second commit that you made has all properties that you
requested. (It just so happens that it is exactly identical to the first
commit you made.) Your case does not demonstrate a bug in git.

Why don't you use a different commit message to ensure that there is a
difference between the commits?
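
A sketch of this suggestion, not part of the original message: with the clock pinned to simulate both commits landing in the same second, a different message on the orphan branch is enough to produce a distinct, parentless commit. The branch name follows the thread; the identity, timestamp, and message text are arbitrary placeholders.

```shell
#!/bin/sh
# Sketch: a distinct message yields a distinct root commit on the orphan branch.
set -e
repo=$(mktemp -d)          # throwaway repository
cd "$repo"
git init -q

# Pinned placeholder identity and timestamps (assumptions), simulating
# two commits made within the same second.
export GIT_AUTHOR_NAME='Demo User' GIT_AUTHOR_EMAIL='demo@example.com'
export GIT_COMMITTER_NAME='Demo User' GIT_COMMITTER_EMAIL='demo@example.com'
export GIT_AUTHOR_DATE='1349164260 +0000'
export GIT_COMMITTER_DATE='1349164260 +0000'

echo aaa >f1
git add f1
git commit -q -m A
first_id=$(git rev-parse HEAD)

git checkout -q --orphan sources
git commit -q --allow-empty -m 'A (sources branch)'
sources_id=$(git rev-parse HEAD)

# For a root commit, rev-list --parents prints only the commit's own name.
parents_line=$(git rev-list --parents -n 1 "$sources_id")

echo "first branch: $first_id"
echo "sources:      $sources_id"
echo "parents line: $parents_line"
```

No sleep is needed: the message is part of the commit content, so changing it changes the object name regardless of timing.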

-- Hannes


* Re: erratic behavior commit --allow-empty
  2012-10-02 21:56         ` Angelo Borsotti
  2012-10-03  2:10           ` PJ Weisberg
  2012-10-03  5:37           ` Johannes Sixt
@ 2012-10-03  7:29           ` Philip Oakley
  2012-10-03  7:45             ` Angelo Borsotti
  2 siblings, 1 reply; 53+ messages in thread
From: Philip Oakley @ 2012-10-03  7:29 UTC (permalink / raw)
  To: Angelo Borsotti, Junio C Hamano; +Cc: Johannes Sixt, git

From: "Angelo Borsotti" <angelo.borsotti@gmail.com>
> Hi Junio,
>
>> It does create one; it just is the same one you already happen to 
>> have,
>> when you record the same state on top of the same history as the
>> same person at the same time.
>>
>
> No, it does not create one:

Angelo
This is a semantics problem. It is like the confusion as to whether zero 
is a natural number that can be used in counting.

In this case we have created two commits. However they are, by design 
and definition, identical to each other for this case of identical 
content and identical administration fields. They cannot be 
distinguished.

So when the file system is asked to 'write' the second commit, it (the 
file system in conjunction with the git code) does a no-op, and reports 
'done'.

It is a common (systems) engineering problem. Software engineering 
usually allows an empty subroutine to exist, while physical engineering 
wouldn't. Git cannot have two unique but identical commits (a 
contradiction in terms).

Normally git will create a new (different & unique) commit for each and 
every commit, but in this special case a second identical commit was 
'created', but the uniqueness requirement means it _is_ the same as the 
first commit.

> as you can see from the trace of the execution
> of my script, the sha of the commit is the same as that of the other,
> which means
> that in the .git/objects there is only one such commit object, and not 
> two with
> the same sha. The meaning of the word "create" is to bring into being 
> something
> that did not exist before. There is no "creation" if the object 
> already exists.
>
>>
>> And how would it help what to insert a sleep for 1 second (or 1 year
>> for that matter)?  As you said, it reads from the system clock, and
>> there are millions of systems in the world that have Git installed.
>> You may record the same state on top of the same history as the same
>> person on two different machines 5 minutes in wallclock time in
>> between doing so.  These two machines may end up creating the same
>> commit because one of them had a clock skewed by 5 minutes.
>
> I understood that the command does not create a new commit if all its 
> data, i.e.
> tree, committer, ... and date are the same, representing the date with 
> 1 second
> precision. Sleeping for 1 second guarantees that there is no commit in 
> the repo
> that has the same time as the time after the sleep, i.e. that the
> command creates
> a (new) commit.
>
>>
>> What problem are you really trying to solve?  You mentioned
>> importing from the foreign SCM,
>
> I quoted a piece of the man page of git commit, that states that
> --allow-empty bypasses
> the safety check that prevents to make a new commit. That piece
> incidentally states
> that it is "primarily" used by foreign SCM interface scripts. But of
> course it can be used
> in any script that needs to build a commit on top of another.
>
>>
>> You also did not seem to have read what I wrote, or deliberately
>> ignored it (in which case I am wasting even more time writing this,
>> so I'll stop).
>
> I did not deliberately ignore what you wrote. I might have missed some
> point though.
>
>> This does not have anything to do with "--allow-empty"; removing
>> "the option" would not help anything, either.
>
> I am reporting a problem with --allow-empty, so why you say that this
> does not have
> anything to do with it?
> Removing the option removes a behavior that is not predictable.
> Often it is better to remove a feature that turns out to be
> inconsistent than to leave it
> in the software. Of course a much better avenue is to make it 
> consistent.
>
>> Run the following on a fast-enough machine.
>>
> I did, and obtained most of the times "I was quick enough" and
> sometimes "I was not quick enough", which is the same kind of behavior
> of my script.
>
> The problem I am trying to solve is to push to a remote server the
> source files only,
> while keeping in the local repo both sources and binaries. To do it, I
> keep an orphan
> branch, say "sources". When I make a commit on the master branch, I 
> make also a
> commit on the sources one after having un-staged (git rm --cached) the 
> binaries.
> The script that does this must cope also with the particular case in
> which in the commit
> on the master branch there are no sources. Basically the script does:
>
> # this is the commit on the master branch
> git init
> echo "aaa" >f1
> git add f1
> git commit -m A
>
> # this is the piece of the script that builds the sources branch
> git checkout --orphan sources
> # git rm --cached ...   remove binaries, if any"
> git commit -m A --allow-empty
> git rev-list --all --pretty=oneline
>
> When there are binaries in the commit A, they are removed, and the
> tree for the second
> git commit is then different, and the commit is actually created.
> When there are no binaries (as in the script above, in which the
> removal is commented out),
> the second git commit would not create any new commit, and I would not
> have an orphan
> branch. Thence the --allow-empty to force it to create a new commit.
> Unfortunately, it creates a new commit only if the system clock
> changes the seconds of
> the system time between the two git commits.
> If you insert a "sleep 1" before the second git commit, the commit is
> really created.
>
> I spent many hours to spot this time-dependent error ....
>
> -Angelo


* Re: erratic behavior commit --allow-empty
  2012-10-03  7:12                   ` Johannes Sixt
@ 2012-10-03  7:35                     ` Angelo Borsotti
  2012-10-03 20:49                     ` Junio C Hamano
  1 sibling, 0 replies; 53+ messages in thread
From: Angelo Borsotti @ 2012-10-03  7:35 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Junio C Hamano, Git Mailing List

Hi Hannes,

>
> Perhaps you are confused by the fact that the commit you made first does
> not have a parent, either. But that is just a "side effect" that it
> happened to be the very first commit that you made after 'git init'.

Well, I know that, and this is why I added --allow-empty. The man page of
git commit says "This option bypasses the safety, ...", so I thought that it
would unconditionally create a brand new commit.

> Your case does not demonstrate a bug in git.

The bug is that git commit --allow-empty does a different action depending
on whether the system clock has changed its seconds right before the command.
This is a time-dependent behavior, and it is very harmful. Our applications
must never behave differently depending on the time they are run or on the
processor speed. It is an issue of correctness and robustness of software.
To have a predictable behavior, i.e. to create a brand new commit with
git commit --allow-empty, the command in a script must ALWAYS be preceded by
a sleep 1 so as to make sure that the date and time it will use are
different from any other commit's.
But then it would be a lot better to embed such a sleep in the command.
If that is not possible, then the users must be warned in the man page
that the command may sometimes not create a brand new commit, and that a
user who wants one should change something in the commit, e.g. the message.

>
> Why don't you use a different commit message to ensure that there is a
> difference between the commits?
>

This is what I eventually did to force the creation of a brand new commit.

-Angelo


* Re: erratic behavior commit --allow-empty
  2012-10-03  7:29           ` Philip Oakley
@ 2012-10-03  7:45             ` Angelo Borsotti
  2012-10-03  8:04               ` Matthieu Moy
  2012-10-03 10:12               ` Andreas Schwab
  0 siblings, 2 replies; 53+ messages in thread
From: Angelo Borsotti @ 2012-10-03  7:45 UTC (permalink / raw)
  To: Philip Oakley; +Cc: Junio C Hamano, Johannes Sixt, git

In reply to Philip,

I understand what the implementation does, but I am stating that it is
not what the user (by reading the man page) expects.
The user adds --allow-empty to get a different & unique commit; that seems
to be the purpose of the option.
Unfortunately, the user gets that only sometimes, depending on the exact
instant in time the command is executed, which is out of his/her control.
I think that you would agree with me that this is not a nice
behaviour. How could a user ever use a command that is not predictable?
If it is not possible to change the implementation, at least warn the
user in the man page.

-Angelo


* Re: erratic behavior commit --allow-empty
  2012-10-03  7:45             ` Angelo Borsotti
@ 2012-10-03  8:04               ` Matthieu Moy
  2012-10-03  8:24                 ` Angelo Borsotti
  2012-10-03 10:12               ` Andreas Schwab
  1 sibling, 1 reply; 53+ messages in thread
From: Matthieu Moy @ 2012-10-03  8:04 UTC (permalink / raw)
  To: Angelo Borsotti; +Cc: Philip Oakley, Junio C Hamano, Johannes Sixt, git

Angelo Borsotti <angelo.borsotti@gmail.com> writes:

> I think that you would agree with me that this is not a nice
> behaviour.

This is fundamentally how Git works. You probably didn't notice it, but
if you do

echo 'some content' > file1.txt
git add file1.txt
git commit -m "file1"

echo 'some content' > file2.txt
git add file2.txt
git commit -m "file2"

Then the second commit does not "create" a new blob object for
file2.txt, because it has the same content as an existing one. But the
point is: you really don't care, or indeed, you care about sharing the
blob objects to save disk space.
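
As a minimal sketch of that sharing (the temporary directory is only an
assumption made for illustration, and git hash-object merely computes the
ID here without storing anything), both files should hash to the same
blob ID:

```shell
#!/bin/sh
# Work in a scratch directory so nothing in a real repository is touched.
dir=$(mktemp -d) && cd "$dir" || exit 1

echo 'some content' > file1.txt
echo 'some content' > file2.txt

# Compute the blob ID of each file; the file name is not part of the input.
blob1=$(git hash-object file1.txt)
blob2=$(git hash-object file2.txt)

echo "file1.txt: $blob1"
echo "file2.txt: $blob2"
```

The two printed IDs are equal because a blob ID is derived from the file
content, not from the file name.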

> How could a user ever use a command that is not predictable?

It is predictable: give it twice the same inputs in the same conditions,
and it will yield the same output.

You still didn't tell us where the problem was. You are unhappy with
having twice the same sha1 for the same object, but what concrete bad
consequence does this have? (except for saving bandwidth in addition to
disk space when trying to push your commit)

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


* Re: erratic behavior commit --allow-empty
  2012-10-03  8:04               ` Matthieu Moy
@ 2012-10-03  8:24                 ` Angelo Borsotti
  2012-10-03 11:07                   ` Matthieu Moy
  2012-10-03 12:25                   ` Tomas Carnecky
  0 siblings, 2 replies; 53+ messages in thread
From: Angelo Borsotti @ 2012-10-03  8:24 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Philip Oakley, Junio C Hamano, Johannes Sixt, git

Hi Matthieu,

> Then the second commit does not "create" a new blob object for
> file2.txt, because it has the same content as an existing one. But the
> point is: you really don't care, or indeed, you care about sharing the
> blob objects to save disk space.

That is fine, and it is well documented.

> It is predictible: give it twice the same inputs in the same conditions,
> and it will yield the same output.

Well, I have some difficulty hitting the return key while watching the system
clock at the same time so as to make sure that the command is executed
before the seconds change. So, in theory it would be predictable, but not
in practice. Note that commands must be predictable for the user that writes
them, i.e. the user must be able to figure out what the result is. Which is
certainly not the case here.

>
> You still didn't tell us where the problem was.

I described it a few mails above. I wanted to create an orphan branch. The
command to create it is git checkout --orphan. However, the branch is not
actually created until a commit is done on it. Then I did such a commit
(all this is placed in a script to be used by my developers), but if there
are no changes, git commit does not create a new one. To force it to create
a brand new one I added --allow-empty to it, because the man page stated
that it would bypass the check that prevents making a new one. Then I
discovered that sometimes --allow-empty does not behave as expected.


-Angelo


* Re: erratic behavior commit --allow-empty
  2012-10-03  7:45             ` Angelo Borsotti
  2012-10-03  8:04               ` Matthieu Moy
@ 2012-10-03 10:12               ` Andreas Schwab
  2012-10-03 11:37                 ` Angelo Borsotti
  1 sibling, 1 reply; 53+ messages in thread
From: Andreas Schwab @ 2012-10-03 10:12 UTC (permalink / raw)
  To: Angelo Borsotti; +Cc: Philip Oakley, Junio C Hamano, Johannes Sixt, git

Angelo Borsotti <angelo.borsotti@gmail.com> writes:

> The user adds --allow-empty to have a different & unique commit

Where does the manual say that --allow-empty implies a different and
unique commit?

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


* Re: erratic behavior commit --allow-empty
  2012-10-03  8:24                 ` Angelo Borsotti
@ 2012-10-03 11:07                   ` Matthieu Moy
  2012-10-03 11:52                     ` Angelo Borsotti
  2012-10-03 12:25                   ` Tomas Carnecky
  1 sibling, 1 reply; 53+ messages in thread
From: Matthieu Moy @ 2012-10-03 11:07 UTC (permalink / raw)
  To: Angelo Borsotti; +Cc: Philip Oakley, Junio C Hamano, Johannes Sixt, git

Angelo Borsotti <angelo.borsotti@gmail.com> writes:

>> You still didn't tell us where the problem was.
>
> I described it few mails above. I wanted to create an orphan branch.

And you did. The branch happens to point to the same commit as another
existing branch, but this is a very common situation. Try this:

# do arbitrary hacking and commit on branch master
git checkout -b new-branch
gitk

You will see branches "master" and "new-branch" pointing to the same
commit (but your HEAD points to new-branch, as "git branch" will tell
you).

You still did not describe a _problem_. Up to now, the only "problem" I
see is that you have twice the same sha1 showing up, but you did not
describe something concrete that you wanted to do and that did not work.

> However, the branch is not actually created until a commit is done on
> it.

Right, but the definition of "done" in your sentence includes "reusing
an object in the object database".

I just tried this:

rm -fr test
git init test
cd test
date > foo.txt
git add .
git commit --allow-empty -m foo
git checkout --orphan new-branch
git commit --allow-empty -m foo

I ended up with a branch "master" and a branch "new-branch", both
pointing to the same commit. The new branch _is_ created.

(BTW, --allow-empty is useless here as you have no parent)

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


* Re: erratic behavior commit --allow-empty
  2012-10-03 10:12               ` Andreas Schwab
@ 2012-10-03 11:37                 ` Angelo Borsotti
  2012-10-03 13:44                   ` Andreas Schwab
  0 siblings, 1 reply; 53+ messages in thread
From: Angelo Borsotti @ 2012-10-03 11:37 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Philip Oakley, Junio C Hamano, Johannes Sixt, git

Hi Andreas,

>
> Where does the manual say that --allow-empty implies a different and
> unique commit?
>

In the git commit man page:

"--allow-empty

    Usually recording a commit that has the exact same tree as its
sole parent commit is a mistake, and the command prevents you from
making such a commit. This option bypasses the safety, and is
primarily for use by foreign SCM interface scripts."

By reading: "the command prevents" I understand that a new commit is
not created, and "This option bypasses" that it is instead created.

Perhaps my reading was a bit straightforward, but a man page is not a
sort of ancient holy writing in which the reader has to sift every word
to find hidden meanings; it should be something clear and plain.

-Angelo


* Re: erratic behavior commit --allow-empty
  2012-10-03 11:07                   ` Matthieu Moy
@ 2012-10-03 11:52                     ` Angelo Borsotti
       [not found]                       ` <CABURp0oHez6j8+FPG8Zm52TGVyC1XwWhE55TBDrXRGFrW6kWww@mail.gmail.com>
                                         ` (2 more replies)
  0 siblings, 3 replies; 53+ messages in thread
From: Angelo Borsotti @ 2012-10-03 11:52 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Philip Oakley, Junio C Hamano, Johannes Sixt, git

Hi

>>> You still didn't tell us where the problem was.

I thought I did, but here it is: I have private and public
repositories. In the private ones the developers keep both the sources
and the binaries. In the public ones they keep only the sources. They
do not want the binaries there because binaries are very large and
require much time to be pushed. Besides that, they are not even needed
because they must be rebuilt anyway.
To push the sources only, they keep in the private repositories an
orphan branch in which commits are done by taking the relevant commits in
the (say) master branch and removing the binaries from the index.
Pushing the master branch directly would also push the binaries even
if they were removed from its index (the history gets pushed): hence
the need for an orphan branch. Scripts have been provided to do this
easily and safely. Now, it could happen that a developer does not have
binaries (yet), but wants to push all the same. The script has to take
care of this special case, in which no binaries are removed, but a
commit on the orphan branch is done all the same. And here is the
problem, since git commit does not produce a brand new, different &
unique commit every time, making then the orphan branch point to
the master one, i.e. becoming a non-orphan one.

> I ended up with a branch "master" and a branch "new-branch", both
> pointing to the same commit. The new branch _is_ created.
>

Exactly, it is created, but it is not an orphan ... or more precisely,
it is sometimes, depending on how fast you are to enter the second
commit command. This time-dependent behaviour is what I am talking
about.

-Angelo


* Re: erratic behavior commit --allow-empty
  2012-10-03  8:24                 ` Angelo Borsotti
  2012-10-03 11:07                   ` Matthieu Moy
@ 2012-10-03 12:25                   ` Tomas Carnecky
  2012-10-03 13:08                     ` Angelo Borsotti
  1 sibling, 1 reply; 53+ messages in thread
From: Tomas Carnecky @ 2012-10-03 12:25 UTC (permalink / raw)
  To: Angelo Borsotti, Matthieu Moy
  Cc: Philip Oakley, Junio C Hamano, Johannes Sixt, git

On Wed, 03 Oct 2012 10:24:00 +0200, Angelo Borsotti <angelo.borsotti@gmail.com> wrote:
> create a new one. To force it to create a brand new one I added
> --allow-empty to it
> because the man page stated that it would bypass the check that prevents to make
> a new one. The I discovered that sometimes --allow-empty does not behave as
> expected.

The documentation only states that it will skip the 'same tree as parent'
check, not that it will *always* create a new commit.


* Re: erratic behavior commit --allow-empty
  2012-10-02 19:34     ` Angelo Borsotti
  2012-10-02 19:56       ` Junio C Hamano
@ 2012-10-03 12:59       ` Phil Hord
  2012-10-03 14:25         ` Angelo Borsotti
  1 sibling, 1 reply; 53+ messages in thread
From: Phil Hord @ 2012-10-03 12:59 UTC (permalink / raw)
  To: Angelo Borsotti; +Cc: Junio C Hamano, Johannes Sixt, git

On Tue, Oct 2, 2012 at 3:34 PM, Angelo Borsotti
<angelo.borsotti@gmail.com> wrote:
>
> "Usually recording a commit that has the exact same tree as its sole
> parent commit is a mistake, and the command prevents you from making
> such a commit. This option bypasses the safety, and is primarily for
> use by foreign SCM interface scripts."

Perhaps the confusion arises from the meaning of "the safety".  In
this case, the safety mechanism in place is to prevent you from
creating a child commit which has the same "tree" contents (working
directory) as the parent commit.  It will not be the same commit
because it has different parent(s) than its parent commit; but the
tree (working directory) is the same and git normally prevents you
from doing this because normally this is an accident, a mistake.

--allow-empty tells git you intend to do this and so it should bypass
this "no changed files" safety mechanism.  It is not a safety to
prevent you creating a new commit with the exact same sha1; the safety
is concerned only with the exact same "working directory" file
contents.
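
A rough sketch of what the safety does and does not cover, assuming a
throwaway repository and a made-up identity (neither comes from this
thread): the plain commit is expected to be refused, while the
--allow-empty commit is recorded with the same tree as its parent.

```shell
#!/bin/sh
# Scratch repository and a hypothetical identity, for illustration only.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
export GIT_AUTHOR_NAME='A U Thor' GIT_AUTHOR_EMAIL='author@example.com'
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

echo aaa > f1
git add f1
git commit -q -m A

# Without the option, a commit whose tree equals its parent's is refused.
if git commit -q -m B >/dev/null 2>&1; then
    plain=created
else
    plain=refused
fi

# With the option, the commit is recorded even though no file changed.
git commit -q --allow-empty -m B

parent_tree=$(git rev-parse 'HEAD~1^{tree}')
child_tree=$(git rev-parse 'HEAD^{tree}')
count=$(git rev-list --count HEAD)

echo "plain commit: $plain"
echo "parent tree:  $parent_tree"
echo "child tree:   $child_tree"
echo "commits:      $count"
```

Here the second commit differs from the first in both message and parent,
so it gets its own ID no matter how fast the script runs.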

Can you suggest a rewrite of this description which would make it more clear?

Phil


* Re: erratic behavior commit --allow-empty
  2012-10-03 12:25                   ` Tomas Carnecky
@ 2012-10-03 13:08                     ` Angelo Borsotti
  0 siblings, 0 replies; 53+ messages in thread
From: Angelo Borsotti @ 2012-10-03 13:08 UTC (permalink / raw)
  To: Tomas Carnecky
  Cc: Matthieu Moy, Philip Oakley, Junio C Hamano, Johannes Sixt, git

Hi Tomas,

> The documentation only states that it will skip the 'same tree as parent'
> check, not that it will *always* create a new commit.

Ok, understood: you believe that the documentation is clear, and I
believe that it is somehow not.
I would prefer to have it plainer.

But that is not the whole story. The behavior of the command remains
time-dependent, so that a user cannot reliably predict its result. I think
that this is an ill-specified option.
I would not insist on removing it (although that would be the correct
solution), but I would at least warn the user about this possibly
unexpected behavior.

-Angelo


* Re: erratic behavior commit --allow-empty
       [not found]                       ` <CABURp0oHez6j8+FPG8Zm52TGVyC1XwWhE55TBDrXRGFrW6kWww@mail.gmail.com>
@ 2012-10-03 13:35                         ` Angelo Borsotti
  2012-10-03 14:15                           ` Phil Hord
  0 siblings, 1 reply; 53+ messages in thread
From: Angelo Borsotti @ 2012-10-03 13:35 UTC (permalink / raw)
  To: Phil Hord; +Cc: git

Hi Phil

>
> I think what you are missing here is that the script does _not_ have
> to take care for this special case.  The script can do the same thing
> it does for all the other cases and it will work just fine.  This is
> because your goal, as I understand it, is this:
>
> A. Take this branch,
> B. Copy it but remove the binaries,
> C. Push it to the remote (with no binaries)
>
> If the branch has no binaries to begin with, then B is a no-op.  Your
> insistence that the new commits get unique SHA1's is unnecessary and
> is what is causing your trouble.

Suppose the branch has binaries. Then the only way to avoid pushing
them is to create an orphan branch (one that has no parents),
otherwise git push will also upload the parent with its binaries.
This is why there is a need to make the script perform different
actions depending on the presence of the binaries. In the attempt to
make the script handle both cases in a simple way I tried to make an
empty commit, and discovered the time-dependent behavior of it.

>
> Consider this analogous operation:
>
> A. Take this file,
> B. Remove every line that does not contain foo,
> C. Cat the result to the console (with only foo lines)
>

This example differs from the commit one in that the user has to cope
with data that s/he can fully control (the contents of files), while
in the other s/he has to cope with the passing of time, which s/he
cannot control. So, taking the files I can predict the result, but
taking the commits, I cannot because I do not know exactly when they
will actually be run. Time is a sort of independent variable that I
know only approximately (or very approximately when the commands are
embedded in scripts).

>
> It seems to those more familiar with git that you are saying that this
> is "the problem", that the operation did not work because the results
> are not unique each time.

Exactly.

>
> But if you ignore the SHA1 of the commits and just rely on the branch
> names, I think you will be happier.  This is because two branches can
> refer to the same SHA1 commit without causing any problem.  You may
> find that sometimes when you push there is no update applied to the
> server.  But this is not a mistake.  It is simply that the server
> already has the same contents as you are pushing, even though your
> local branch name is different than it was before.

Actually I ignore the SHA1 of the commits, and rely on the branch
names. I have topic branches and /src/topic branches. Developers push
when they have something new. Of course the scripts must take care of
the case in which they are called and there is nothing to push, but that
is not a big problem.
I eventually found a workaround, which is to change the commit
message, thereby forcing git commit to create a brand new commit.

> I think when you say "orphan" you mean it has a different SHA1 than
> any other commit.  But this is not what "orphan" means.

No, I mean that it has no parents.

Actually, in the special case in which there are no binaries, I could
create a branch that points to the same commit as the branch that it
is mirroring, and push it. However, this has two disadvantages: 1.
that it will not be an orphan while in the more general case it is,
and 2, that the history of commits will be pushed to the remote
server, while in the general case (with an orphan) it will not. I
preferred to have a unique branch topology so as to make the picture
as simple as possible for the developers.

Note that eventually I solved the problem with a tweak. I still
believe that the git commit command does not behave properly, and that
changing nothing (implementation or documentation) leaves a drifting
mine on which someone (or even myself) will stumble sooner or later. I
am spending time to write all this because I care for git and I would
really like to see it improve over time by removing weak spots, and I
believe that you want the same.

-Angelo
>
> Phil


* Re: erratic behavior commit --allow-empty
  2012-10-03 11:37                 ` Angelo Borsotti
@ 2012-10-03 13:44                   ` Andreas Schwab
  2012-10-03 14:37                     ` Angelo Borsotti
  0 siblings, 1 reply; 53+ messages in thread
From: Andreas Schwab @ 2012-10-03 13:44 UTC (permalink / raw)
  To: Angelo Borsotti; +Cc: Philip Oakley, Junio C Hamano, Johannes Sixt, git

Angelo Borsotti <angelo.borsotti@gmail.com> writes:

> By reading: "the command prevents" I understand that a new commit is
> not created, and "This option bypasses" that it is instead created.

But where does it say "different and unique"?

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


* Re: erratic behavior commit --allow-empty
  2012-10-03 11:52                     ` Angelo Borsotti
       [not found]                       ` <CABURp0oHez6j8+FPG8Zm52TGVyC1XwWhE55TBDrXRGFrW6kWww@mail.gmail.com>
@ 2012-10-03 13:57                       ` Matthieu Moy
  2012-10-03 14:46                         ` Angelo Borsotti
  2012-10-03 22:32                       ` Philip Oakley
  2 siblings, 1 reply; 53+ messages in thread
From: Matthieu Moy @ 2012-10-03 13:57 UTC (permalink / raw)
  To: Angelo Borsotti; +Cc: Philip Oakley, Junio C Hamano, Johannes Sixt, git

Angelo Borsotti <angelo.borsotti@gmail.com> writes:

> [...] making then the orphan branch point to the master one, i.e.
> becoming a non-orphan one.

I understand both parts of the sentence, but not the "i.e.".

And I still don't see a concrete problem. "two branches point to the
same commit" is not a problem, it's an observation. I have branches
pointing to the same commit all the time.

>> I ended up with a branch "master" and a branch "new-branch", both
>> pointing to the same commit. The new branch _is_ created.
>
> Exactly, it is created, but it is not an orphan ... or more precisely,
> it is sometimes, depending on how fast you are to enter the second
> commit command. This time-dependent behaviour is what I am talking
> about.

You don't understand what an orphan branch is.

What "git checkout --orphan && git commit" does is that it creates a
commit that doesn't have a parent (hence the name orphan, btw). It does in
your case. You _do_ create an orphan commit regardless of the timing.
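
Here is a sketch of that claim with the clock taken out of the picture:
the dates are pinned through the GIT_AUTHOR_DATE and GIT_COMMITTER_DATE
environment variables so that the "same second" case is reproduced on
every run. The identity, the timestamp and the temporary repository are
assumptions for illustration only.

```shell
#!/bin/sh
# Scratch repository with a hypothetical identity and a pinned timestamp.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
export GIT_AUTHOR_NAME='A U Thor' GIT_AUTHOR_EMAIL='author@example.com'
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
export GIT_AUTHOR_DATE='1349274300 +0000' GIT_COMMITTER_DATE='1349274300 +0000'

echo aaa > f1
git add f1
git commit -q -m A
first=$(git rev-parse HEAD)

# Start an orphan branch and commit the same content with the same message.
git checkout -q --orphan feature
git commit -q -m A --allow-empty
second=$(git rev-parse HEAD)

# "rev-list --parents" prints the commit ID followed by its parent IDs;
# for a parentless commit only the commit's own ID is printed.
parents_line=$(git rev-list --parents -n 1 HEAD)

echo "first commit:  $first"
echo "orphan commit: $second"
echo "parents line:  $parents_line"
```

Both branches end up at the same commit ID, and that commit still has no
parent, so the new branch is an orphan in either case.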

The fact that another branch points to the same commit is a different
matter, and you still didn't explain why this was problematic.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


* Re: erratic behavior commit --allow-empty
  2012-10-03 13:35                         ` Angelo Borsotti
@ 2012-10-03 14:15                           ` Phil Hord
  0 siblings, 0 replies; 53+ messages in thread
From: Phil Hord @ 2012-10-03 14:15 UTC (permalink / raw)
  To: Angelo Borsotti; +Cc: git

On Wed, Oct 3, 2012 at 9:35 AM, Angelo Borsotti
<angelo.borsotti@gmail.com> wrote:
> Hi Phil
>
>>
>> I think what you are missing here is that the script does _not_ have
>> to take care for this special case.  The script can do the same thing
>> it does for all the other cases and it will work just fine.  This is
>> because your goal, as I understand it, is this:
>>
>> A. Take this branch,
>> B. Copy it but remove the binaries,
>> C. Push it to the remote (with no binaries)
>>
>> If the branch has no binaries to begin with, then B is a no-op.  Your
>> insistence that the new commits get unique SHA1's is unnecessary and
>> is what is causing your trouble.
>
> Suppose the branch has binaries. Then the only way to avoid to push
> them is to create an orphan branch (one that has no parents),
> otherwise git push will upload also the parent with its binaries.

This is true only if the root commit also has binaries.  Otherwise it
is fine to push a branch with the common ancestor.

Suppose A does not have binaries but B and C do.

A---B---C

Now we need to make a new branch ending at C' which has no binaries:
A---B---C
 \
  ---B'---C'

A already has no binaries, so we did not need to make an A'.  Now we
can push C' to the server and no binaries will be pushed.  That is
because the server will receive only these commits:

A---B'---C'
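
The graph above can be reproduced roughly as follows. This is only a
sketch: the file names main.c and app.bin, the branch name src and the
identity are hypothetical, and B' and C' are created by hand here rather
than by any script from this thread.

```shell
#!/bin/sh
# Scratch repository and a hypothetical identity, for illustration only.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
export GIT_AUTHOR_NAME='A U Thor' GIT_AUTHOR_EMAIL='author@example.com'
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

# A: sources only.  B and C: sources plus a binary.
echo 'int main(void) { return 0; }' > main.c
git add main.c && git commit -q -m A
echo 'binary v1' > app.bin
git add app.bin && git commit -q -m B
echo 'binary v2' > app.bin
git add app.bin && git commit -q -m C

# B' and C': binary-free commits that fork from A.
git checkout -q -b src HEAD~2
echo '/* change from B */' >> main.c
git add main.c && git commit -q -m "B'"
echo '/* change from C */' >> main.c
git add main.c && git commit -q -m "C'"

# Count the commits reachable from src and look for app.bin among all
# objects reachable from it.
commits=$(git rev-list --count src)
if git rev-list --objects src | grep -q 'app\.bin'; then
    has_binary=yes
else
    has_binary=no
fi

echo "commits reachable from src: $commits"
echo "app.bin reachable from src: $has_binary"
```

Only A, B' and C' are reachable from src, so pushing that branch has no
reason to transfer the binaries stored in B and C.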


> This is why there is a need to make the script perform different
> actions depending on the presence of the binaries. In the attempt to
> make the script handle both cases in a simple way I tried to make an
> empty commit, and discovered the time-dependent behavior of it.

Every commit is time-dependent.  You tried to make a _unique_ empty
commit, and this is where you ran into trouble.  I think your
uniqueness constraint is overkill.


>> Consider this analogous operation:
>>
>> A. Take this file,
>> B. Remove every line that does not contain foo,
>> C. Cat the result to the console (with only foo lines)
>>
>
> This example differs from the commit one in that the user has to cope
> with data that s/he can fully control (the contents of files), while
> in the other s/he has to cope with the passing of time, which s/he
> cannot control. So, taking the files I can predict the result, but
> taking the commits, I cannot because I do not know exactly when they
> will actually be run. Time is a sort of independent variable that I
> know only approximately (or very approximately when the commands are
> embedded in scripts).

You need not be concerned with the time on the commit, nor the
uniqueness of the SHA1.


>> It seems to those more familiar with git that you are saying that this
>> is "the problem", that the operation did not work because the results
>> are not unique each time.
>
> Exactly.
>
>>
>> But if you ignore the SHA1 of the commits and just rely on the branch
>> names, I think you will be happier.  This is because two branches can
>> refer to the same SHA1 commit without causing any problem.  You may
>> find that sometimes when you push there is no update applied to the
>> server.  But this is not a mistake.  It is simply that the server
>> already has the same contents as you are pushing, even though your
>> local branch name is different than it was before.
>
> Actually I ignore the SHA1 of the commits, and rely on the branch
> names I have topic branches and /src/topic branches. Developers push
> when they have something new. Of course the scripts must take care of
> when they are called and there is nothing to push, but that is not a
> big problem.
> I eventually found a workaround, which is to change the commit
> message, forcing then git commit to create a brand new commit.

Doesn't this force git always to push new commits even though the
contents match commits already on the server?

>> I think when you say "orphan" you mean it has a different SHA1 than
>> any other commit.  But this is not what "orphan" means.
>
> No, I mean that it has no parents.
>
> Actually, in the special case in which there are no binaries, I could
> create a branch that points to the same commit as the branch that it
> is mirroring, and push it. However, this has two disadvantages: 1.
> that it will not be an orphan while in the more general case it is,
> and 2, that the history of commits will be pushed to the remote
> server, while in the general case (with an orphan) it will not. I
> preferred to have a unique branch topology so as to make the picture
> as simple as possible for the developers.

It seems to me that you are creating unnecessary work for the server
and for your scripts.  But perhaps I do not fully understand your use
case.

> Note that eventually I solved the problem with a tweak. I still
> believe that the git commit command does not behave properly, and that
> changing nothing (implementation or documentation) leaves a drifting
> mine on which someone (or even myself) will stumble sooner or later. I
> am spending time to write all this because I care for git and I would
> really see it improving over time removing weak spots, and believe
> that you do the same.

You may suggest improvements to the documentation.  But be careful to
understand the existing documentation completely before you do.

Thanks for helping.

Phil


* Re: erratic behavior commit --allow-empty
  2012-10-03 12:59       ` Phil Hord
@ 2012-10-03 14:25         ` Angelo Borsotti
  2012-10-03 16:06           ` PJ Weisberg
  0 siblings, 1 reply; 53+ messages in thread
From: Angelo Borsotti @ 2012-10-03 14:25 UTC (permalink / raw)
  To: Phil Hord; +Cc: Junio C Hamano, Johannes Sixt, git

Hi Phil,

> Perhaps the confusion arises from the the meaning of "the safety".  In
> this case, the safety mechanism in place is to prevent you from
> creating a child commit which has the same "tree" contents (working
> directory) as the parent commit.  It will not be the same commit
> because it has different parent(s) than its parent commit; but the
> tree (working directory) is the same and git normally prevents you
> from doing this because normally this is an accident, a mistake.
>
> --allow-empty tells git you intend to do this and so it should bypass
> this "no changed files" safety mechanism.  It is not a safety to
> prevent you creating a new commit with the exact same sha1; the safety
> is concerned only with the exact same "working directory" file
> contents.
>
> Can you suggest a rewrite of this description which would make it more clear?

Instead of:

"Usually recording a commit that has the exact same tree as its sole
parent commit is a mistake, and the command prevents you from making
such a commit. This option bypasses the safety, and is primarily for
use by foreign SCM interface scripts."

I would suggest:

"Usually recording a commit that has the exact same tree as its sole
parent commit is not allowed, and the command prevents you from making
such a commit. This option allows to disregard this condition, thereby
making a commit even when the trees are the same. Note that when the
tree, author, parents, message and date (with the precision of one
second) are the same as those of an existing commit object, no new
commit object is created, and the identity of the existing one is
returned."
>
> Phil
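
The one-second granularity mentioned in the suggested text can be
sketched with the plumbing command git commit-tree. The identity and the
two timestamps below are assumptions chosen for illustration, and the
empty tree stands in for any unchanged tree:

```shell
#!/bin/sh
# Scratch repository and a hypothetical identity, for illustration only.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
export GIT_AUTHOR_NAME='A U Thor' GIT_AUTHOR_EMAIL='author@example.com'
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

# Create the empty tree so that every commit below records the same tree.
tree=$(git mktree </dev/null)

# Same tree, identity, message, (no) parents and date, twice.
export GIT_AUTHOR_DATE='1349274300 +0000' GIT_COMMITTER_DATE='1349274300 +0000'
same1=$(echo A | git commit-tree "$tree")
same2=$(echo A | git commit-tree "$tree")

# One second later, everything else unchanged.
export GIT_AUTHOR_DATE='1349274301 +0000' GIT_COMMITTER_DATE='1349274301 +0000'
later=$(echo A | git commit-tree "$tree")

echo "same second, 1st: $same1"
echo "same second, 2nd: $same2"
echo "one second later: $later"
```

The first two IDs are identical and the third differs, which suggests
that a script needing a distinct commit can vary any one of these inputs
(the message, for instance) instead of sleeping.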


* Re: erratic behavior commit --allow-empty
  2012-10-03 13:44                   ` Andreas Schwab
@ 2012-10-03 14:37                     ` Angelo Borsotti
  2012-10-03 16:44                       ` Andreas Schwab
  0 siblings, 1 reply; 53+ messages in thread
From: Angelo Borsotti @ 2012-10-03 14:37 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Philip Oakley, Junio C Hamano, Johannes Sixt, git

Hi Andreas,

> But where does it say "different and unique"?

It does not, but it says: "Usually recording a commit that has the
exact same tree as its sole parent commit is a mistake, and the
command prevents you from making such a commit.", followed by "This
option bypasses the safety ...", leading one to think that the option
negates that "prevents" above.
I do understand that by reading very carefully each word of these
sentences one can eventually figure out that the option removes the
check on the tree only, and that all the others remain, including the
one on the identity of the time. However, it does not say that the
time must be equal with the approximation of one second. Apart from
this detail, it does not state plainly that no commit object is
created.

-Angelo


* Re: erratic behavior commit --allow-empty
  2012-10-03 13:57                       ` Matthieu Moy
@ 2012-10-03 14:46                         ` Angelo Borsotti
  2012-10-03 14:52                           ` Matthieu Moy
  0 siblings, 1 reply; 53+ messages in thread
From: Angelo Borsotti @ 2012-10-03 14:46 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Philip Oakley, Junio C Hamano, Johannes Sixt, git

Hi Matthieu,

>
> You don't understand what an orphan branch is.

I do not think so. I wanted to create a branch with a commit that has no parent,
and I think that this is called "orphan branch".

I wanted also to have another branch, pointing to a different commit,
the difference being that this contains binaries, and the other does not.
So, having two references pointing to the same commit is not a problem for me,
but it is not the solution either.

-Angelo


* Re: erratic behavior commit --allow-empty
  2012-10-03 14:46                         ` Angelo Borsotti
@ 2012-10-03 14:52                           ` Matthieu Moy
  0 siblings, 0 replies; 53+ messages in thread
From: Matthieu Moy @ 2012-10-03 14:52 UTC (permalink / raw)
  To: Angelo Borsotti; +Cc: Philip Oakley, Junio C Hamano, Johannes Sixt, git

Angelo Borsotti <angelo.borsotti@gmail.com> writes:

> Hi Matthiew,
>
>>
>> You don't understand what an orphan branch is.
>
> I do not think so. I wanted to create a branch with a commit that has no parent,
> and I think that this is called "orphan branch".

Yes, and this is what you did.

> I wanted also to have another branch, pointing to a different commit,
> the difference
> being that this contains binaries, and the other does not.

If they contain different content, they will be different commits, with
different sha1.

> So, having two references pointing to the same commit is not a problem
> for me,

So, you have no problem.

End of discussion for me, sorry.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: erratic behavior commit --allow-empty
  2012-10-03 14:25         ` Angelo Borsotti
@ 2012-10-03 16:06           ` PJ Weisberg
  2012-10-03 17:34             ` Angelo Borsotti
  0 siblings, 1 reply; 53+ messages in thread
From: PJ Weisberg @ 2012-10-03 16:06 UTC (permalink / raw)
  To: Angelo Borsotti; +Cc: Phil Hord, Junio C Hamano, Johannes Sixt, git

On Wed, Oct 3, 2012 at 7:25 AM, Angelo Borsotti
<angelo.borsotti@gmail.com> wrote:
> Hi Phil,
>
>> Perhaps the confusion arises from the meaning of "the safety".  In
>> this case, the safety mechanism in place is to prevent you from
>> creating a child commit which has the same "tree" contents (working
>> directory) as the parent commit.  It will not be the same commit
>> because it has different parent(s) than its parent commit; but the
>> tree (working directory) is the same and git normally prevents you
>> from doing this because normally this is an accident, a mistake.
>>
>> --allow-empty tells git you intend to do this and so it should bypass
>> this "no changed files" safety mechanism.  It is not a safety to
>> prevent you creating a new commit with the exact same sha1; the safety
>> is concerned only with the exact same "working directory" file
>> contents.
>>
>> Can you suggest a rewrite of this description which would make it more clear?
>
> Instead of:
>
> "Usually recording a commit that has the exact same tree as its sole
> parent commit is a mistake, and the command prevents you from making
> such a commit. This option bypasses the safety, and is primarily for
> use by foreign SCM interface scripts."
>
> I would suggest:
>
> "Usually recording a commit that has the exact same tree as its sole
> parent commit is not allowed, and the command prevents you from making
> such a commit. This option allows to disregard this condition, thereby
> making a commit even when the trees are the same. Note that when the
> tree, author, parents, message and date (with the precision of one
> second) are the same as those of an existing commit object, no new
> commit object is created, and the identity of the existing one is
> returned."

But that's true of 'git commit' generally; it has nothing to do with
--allow-empty.

-PJ

Gehm's Corollary to Clark's Law: Any technology distinguishable from
magic is insufficiently advanced.

^ permalink raw reply	[flat|nested] 53+ messages in thread
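
PJ's point, that identical inputs give the same commit whether or not
--allow-empty is passed, can be checked with a small script. This is a
sketch that is not part of the thread: the identity and the fixed
timestamp are made-up values, pinned through git's standard environment
variables so that the clock cannot influence the result.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# Hypothetical identity and a fixed timestamp, chosen only for this demo.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
export GIT_AUTHOR_DATE='1349265600 +0000' GIT_COMMITTER_DATE='1349265600 +0000'

echo aaa >f1
git add f1
git commit -q -m A
first=$(git rev-parse HEAD)

# New root commit on an orphan branch: same tree, no parent, same
# message, same author and committer lines. No --allow-empty is used.
git checkout -q --orphan feature
git commit -q -m A
second=$(git rev-parse HEAD)

echo "first:  $first"
echo "second: $second"
```

With the dates pinned, both branches end up at the same commit id. Without
the pinned dates the outcome depends on whether the two commits fall in the
same second, which is the behaviour reported at the start of the thread.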

* Re: erratic behavior commit --allow-empty
  2012-10-03 14:37                     ` Angelo Borsotti
@ 2012-10-03 16:44                       ` Andreas Schwab
  2012-10-03 17:37                         ` Angelo Borsotti
  0 siblings, 1 reply; 53+ messages in thread
From: Andreas Schwab @ 2012-10-03 16:44 UTC (permalink / raw)
  To: Angelo Borsotti; +Cc: Philip Oakley, Junio C Hamano, Johannes Sixt, git

Angelo Borsotti <angelo.borsotti@gmail.com> writes:

> it does not state plainly that no commit object is created.

But the commit object _is_ created, it just doesn't have a unique name.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: erratic behavior commit --allow-empty
  2012-10-03 16:06           ` PJ Weisberg
@ 2012-10-03 17:34             ` Angelo Borsotti
  2012-10-03 19:05               ` Andreas Schwab
                                 ` (2 more replies)
  0 siblings, 3 replies; 53+ messages in thread
From: Angelo Borsotti @ 2012-10-03 17:34 UTC (permalink / raw)
  To: PJ Weisberg; +Cc: Phil Hord, Junio C Hamano, Johannes Sixt, git

Hi PJ,

take a git commit without --allow-empty: if the trees are equal, it
creates no commit, and if the trees are different it creates one.
Take then a git commit --allow-empty: if the trees are equal it may
create a commit or not depending on the parent, message, author and
date; if the trees are different it creates a commit.
So, the statement does not apply to commits in general.

-Angelo

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: erratic behavior commit --allow-empty
  2012-10-03 16:44                       ` Andreas Schwab
@ 2012-10-03 17:37                         ` Angelo Borsotti
  2012-10-03 19:03                           ` Andreas Schwab
  0 siblings, 1 reply; 53+ messages in thread
From: Angelo Borsotti @ 2012-10-03 17:37 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Philip Oakley, Junio C Hamano, Johannes Sixt, git

Hi Andreas,

> But the commit object _is_ created, it just doesn't have a unique name.

The command may internally create the commit object, compute its sha and then
seeing that there is already one in the repo with the same sha, throw it away.
But this is an implementation detail. The net result for the user is
that after the command there are no new objects.

-Angelo

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: erratic behavior commit --allow-empty
  2012-10-03 17:37                         ` Angelo Borsotti
@ 2012-10-03 19:03                           ` Andreas Schwab
  2012-10-03 19:11                             ` Angelo Borsotti
  0 siblings, 1 reply; 53+ messages in thread
From: Andreas Schwab @ 2012-10-03 19:03 UTC (permalink / raw)
  To: Angelo Borsotti; +Cc: Philip Oakley, Junio C Hamano, Johannes Sixt, git

Angelo Borsotti <angelo.borsotti@gmail.com> writes:

> that after the command there are no new objects.

That is an uninteresting implementation detail.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: erratic behavior commit --allow-empty
  2012-10-03 17:34             ` Angelo Borsotti
@ 2012-10-03 19:05               ` Andreas Schwab
  2012-10-03 19:43               ` PJ Weisberg
  2012-10-05  8:15               ` Lars Noschinski
  2 siblings, 0 replies; 53+ messages in thread
From: Andreas Schwab @ 2012-10-03 19:05 UTC (permalink / raw)
  To: Angelo Borsotti
  Cc: PJ Weisberg, Phil Hord, Junio C Hamano, Johannes Sixt, git

Angelo Borsotti <angelo.borsotti@gmail.com> writes:

> Take then a git commit --allow-empty: if the trees are equal it may
> create a commit or not depending on the parent, message, author and
> date; if the trees are different it creates a commit.

The commit is _always_ created, with a name depending on the parent,
message, author and date.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 53+ messages in thread
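
The statement that a commit's name depends on its parent, message, author
and date can be illustrated by re-hashing the raw commit text. The
following is a sketch with a made-up identity and timestamp; it uses
git cat-file and git hash-object only to show that the id is a function
of the commit's bytes.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# Hypothetical identity and a fixed timestamp, chosen only for this demo.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
export GIT_AUTHOR_DATE='1349265600 +0000' GIT_COMMITTER_DATE='1349265600 +0000'

echo aaa >f1
git add f1
git commit -q -m A
head_id=$(git rev-parse HEAD)

# Hash the unchanged commit text again: same bytes, same name.
rehashed=$(git cat-file commit HEAD | git hash-object -t commit --stdin)

# Move the author and committer timestamps forward by one second:
# different bytes, different name. Nothing is written to the repository.
shifted=$(git cat-file commit HEAD |
    sed 's/ 1349265600 / 1349265601 /' |
    git hash-object -t commit --stdin)

echo "HEAD:     $head_id"
echo "rehashed: $rehashed"
echo "shifted:  $shifted"
```

The first two ids match and the third differs, which is consistent with
the one-second granularity discussed in the thread.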

* Re: erratic behavior commit --allow-empty
  2012-10-03 19:03                           ` Andreas Schwab
@ 2012-10-03 19:11                             ` Angelo Borsotti
  2012-10-03 20:30                               ` Andreas Schwab
  0 siblings, 1 reply; 53+ messages in thread
From: Angelo Borsotti @ 2012-10-03 19:11 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Philip Oakley, Junio C Hamano, Johannes Sixt, git

Hi Andreas,

as a user, and owner of a repository I do care about the objects that are in it.
I do not care about the way they are named, be it numbers or sha's, but for
sure about their existence.
So, for me it is important if a command creates a new commit or not.

> The commit is _always_ created, with a name depending on the parent,
> message, author and date.

I do not understand this: I have produced several examples that show that
it is not created, i.e. that the very same objects are present in the repository
after the command execution as they were before it.
It is possible, though, that you use the word "create" with a different meaning.
Most dictionaries state: "to cause to come into existence", i.e. before creation
the thing does not exist, and after creation it does.

-Angelo

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: erratic behavior commit --allow-empty
  2012-10-03 17:34             ` Angelo Borsotti
  2012-10-03 19:05               ` Andreas Schwab
@ 2012-10-03 19:43               ` PJ Weisberg
  2012-10-05  8:15               ` Lars Noschinski
  2 siblings, 0 replies; 53+ messages in thread
From: PJ Weisberg @ 2012-10-03 19:43 UTC (permalink / raw)
  To: Angelo Borsotti; +Cc: Phil Hord, Junio C Hamano, Johannes Sixt, git

On Wed, Oct 3, 2012 at 10:34 AM, Angelo Borsotti
<angelo.borsotti@gmail.com> wrote:
> Hi PJ,
>
> take a git commit without --allow-empty: if the trees are equal, it
> creates no commit, and if the trees are different it creates one.
> Take then a git commit --allow-empty: if the trees are equal it may
> create a commit or not depending on the parent, message, author and
> date; if the trees are different it creates a commit.
> So, the statement does not apply to commits in general.

But that same thing applies to git commit without --allow-empty.  If
you create the same object twice then only one copy is stored,
regardless of how you create it.  In fact, the commits you were
creating in your example were orphans, so --allow-empty couldn't have
had an effect on them in any case.

-PJ

Gehm's Corollary to Clark's Law: Any technology distinguishable from
magic is insufficiently advanced.

^ permalink raw reply	[flat|nested] 53+ messages in thread
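
The remark that creating the same object twice stores only one copy holds
for every object type, not only for commits. A minimal sketch with a blob,
assuming the default loose-object storage of a freshly initialized
repository:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# Write the same content into the object database twice.
id1=$(printf 'aaa\n' | git hash-object -w --stdin)
count1=$(find .git/objects -type f | wc -l)

id2=$(printf 'aaa\n' | git hash-object -w --stdin)
count2=$(find .git/objects -type f | wc -l)

echo "ids:    $id1 $id2"
echo "counts: $count1 $count2"
```

Both writes report the same id, and the number of files under .git/objects
does not grow on the second write.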

* Re: erratic behavior commit --allow-empty
  2012-10-03 19:11                             ` Angelo Borsotti
@ 2012-10-03 20:30                               ` Andreas Schwab
  0 siblings, 0 replies; 53+ messages in thread
From: Andreas Schwab @ 2012-10-03 20:30 UTC (permalink / raw)
  To: Angelo Borsotti; +Cc: Philip Oakley, Junio C Hamano, Johannes Sixt, git

Angelo Borsotti <angelo.borsotti@gmail.com> writes:

> as a user, and owner of a repository I do care about the objects that are in it.

There is no need to care.

> I do not understand this: I have produced several examples that show that
> it is not created, i.e. that the very same objects are present in the repository
> after the command execution as they were before it.

That is just an implementation detail.  All you need to know is that a
ref has been created or modified.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: erratic behavior commit --allow-empty
  2012-10-03  7:12                   ` Johannes Sixt
  2012-10-03  7:35                     ` Angelo Borsotti
@ 2012-10-03 20:49                     ` Junio C Hamano
  1 sibling, 0 replies; 53+ messages in thread
From: Junio C Hamano @ 2012-10-03 20:49 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Angelo Borsotti, Git Mailing List

Johannes Sixt <j.sixt@viscovery.net> writes:

> Why don't you use a different commit message to ensure that there is a
> difference between the commits?

That sounds like a workaround, and unnecessary one at that, as it is
entirely unclear why there _needs_ to be a different commit.

Perhaps OP fears that the orphan branch "foo" in his example,
because it happens to point at the same commit object as the
"master", will not stay the same and follow along the advancement of
"master" if some new commits are added to it, and that is the reason
he wants a different commit?

Of course, starting from "master" and "foo" pointing at the same
commit (or different commit, for that matter), "foo" won't change if
you commit on "master", so that fear is unnecessary.

^ permalink raw reply	[flat|nested] 53+ messages in thread
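
The last paragraph above can be verified directly. In this sketch the
second branch is created with git branch rather than as an orphan, which
gives the same starting situation: two refs pointing at one commit. The
identity values are made up for the demo.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# Hypothetical identity, needed only so that git commit can run.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

echo aaa >f1
git add f1
git commit -q -m A

# "foo" now points at the same commit as the current branch.
git branch foo
foo_before=$(git rev-parse foo)

# Advance the current branch only.
echo bbb >f2
git add f2
git commit -q -m B

foo_after=$(git rev-parse foo)
tip=$(git rev-parse HEAD)

echo "foo before: $foo_before"
echo "foo after:  $foo_after"
echo "tip:        $tip"
```

The ref "foo" stays where it was, while the checked-out branch moves on to
the new commit.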

* Re: erratic behavior commit --allow-empty
  2012-10-03 11:52                     ` Angelo Borsotti
       [not found]                       ` <CABURp0oHez6j8+FPG8Zm52TGVyC1XwWhE55TBDrXRGFrW6kWww@mail.gmail.com>
  2012-10-03 13:57                       ` Matthieu Moy
@ 2012-10-03 22:32                       ` Philip Oakley
  2012-10-04  7:07                         ` Angelo Borsotti
  2 siblings, 1 reply; 53+ messages in thread
From: Philip Oakley @ 2012-10-03 22:32 UTC (permalink / raw)
  To: Angelo Borsotti; +Cc: Junio C Hamano, Johannes Sixt, git, Matthieu Moy

From: "Angelo Borsotti" <angelo.borsotti@gmail.com>
Sent: Wednesday, October 03, 2012 12:52 PM
> Hi
>
>>>> You still didn't tell us where the problem was.
>

I've split up the explanation of the problem you have seen, to see if I
can understand where the 'missing' aspect is within the extended
discussions.

> I thought I did, but here it is:

> I have private and a public
> repositories. In the private ones the developers keep both the sources
> and the binaries. In the public ones they keep only the sources. They
> do not want the binaries there because binaries are very large and
> require much time to be pushed. Besides that, they are not even needed
> because they must be rebuilt anyway.

> To push the sources only, they keep in the private repositories an
> orphan branch in which commits are done taking the relevant commits in
> the (say) master branch and removing the binaries from the index.

> Pushing directly the master branch would push also the binaries even
> if they were removed from its index (the  history gets pushed): thence
> the need for an orphan branch.

> Scripts have been provided to do this
> easily and safely. Now, it could happen that a developer does not have
> (yet) binaries, but want to push all the same.

> The script has to take
> care for this special case, in which no binaries are removed, but a
> commit on the orphan branch is done all the same.

> And here is the
> problem since git commit does not produce a brand new, different &
> unique commit all the times, making then the orphan branch point to
> the master one, i.e. becoming a non-orphan one.

What isn't clear is how the master branch is created and maintained at 
this point.

Does the script create it afresh each time, so that it is also, 
implicitly, an --orphan branch?

>
>> I ended up with a branch "master" and a branch "new-branch", both
>> pointing to the same commit. The new branch _is_ created.
>>
In such a case (a new master being created every time the script runs), 
then you can suffer the situation you describe where you have a common 
sentinel commit being used for both branches, even though you thought 
they were orphaned from each other. - a very special case.

However one has to ask how the rest of the script would work in such 
situations with such a truncated master branch.

If the master branch has a true history, then you would get different 
commits being created on the two branches because the parents would be 
different.

Or finally, you have a truly special test (initialisation) case when you 
are starting master (which will later grow) and comparing it to the very 
first test case of the --orphan branch and in that special case you 
could get a common commit. But that is a one off special case, and would 
not recur in practice.

Can you say more about the script?

> Exactly, it is created, but it is not an orphan ... or more precisely,
> it is sometimes, depending on how fast you are to enter the second
> commit command. This time-dependent behaviour is what I am talking
> about.
>
> -Angelo
> --

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: erratic behavior commit --allow-empty
  2012-10-03 22:32                       ` Philip Oakley
@ 2012-10-04  7:07                         ` Angelo Borsotti
  2012-10-04 13:24                           ` Phil Hord
  2012-10-04 21:17                           ` Philip Oakley
  0 siblings, 2 replies; 53+ messages in thread
From: Angelo Borsotti @ 2012-10-04  7:07 UTC (permalink / raw)
  To: Philip Oakley; +Cc: Junio C Hamano, Johannes Sixt, git, Matthieu Moy

Hi Philip and all,

let me explain in full what is the problem that I tried to solve, and
how along the way I stumbled in something that seems to me a git bug
(at least a documentation one).

There is an R&D team developing software using a workflow that is
similar to the integration-manager one (the one described by Scott
Chacon in chapter 5 of Pro Git).
Developers implement features using a local repository hosted on their
workstations, and when finished push on a server; integrators pull
from it and put all the contributions together.
Since integrators rebuild always the software after merging all
contribution, there is no need for the developers to push the
binaries. Not pushing them speeds up uploading.
In order to make life simpler and safer, scripts are provided to
perform the pushing, pulling, etc. operations. So, most of the git
commands shown below are actually run from within scripts.
The development of each feature is done in a dedicated topic branch,
and the commits done in it contain both the sources and the binaries
(to allow to recover fully a previous snapshot when a later change
broke a previous one). When pushing, there are these needs:

      1. push the sources only
      2. push only the last commit of the topic branch (not the whole history)

A note on point 2: the integrators are not interested in seeing all
the commits that developers did while implementing their features.
Having all the history makes their repositories cluttered.

In order to avoid pushing all the history, orphan branches are used to
parallel the topic ones.
When pushing, first a commit is done on the topic branch, and then a
snapshot is created in the parallel branch with the same files,
binaries removed. The general case is:

     source branch                     D'
                                       :
     topic branch        A----B----C---D

In the picture, the developer made 4 commits, and pushed the sources
of the last one, D.
A D' is created on the source branch (the relationship with D is
indicated with a dotted line).
The push script must cope with all the cases that may occur:

     1.  the general one (the one in the previous figure)
     2.  none of the commits in the topic branch with binaries (i.e. D
and D' with the same tree)
     3.  push done immediately after the first commit (A)
     4.  a push done after another

The script:

     1.  creates the source branch if it does not exist yet (git
checkout --orphan),
          otherwise makes HEAD point to it
     2.  sets a .git/info/exclude file that excludes the binaries
     3.  removes the binaries from the index (git rm)
     4.  creates a commit on the source branch
     5.  pushes it
     6.  restores the HEAD and index as they were before

The operation that caused problems was nr. 4. In all the cases
enlisted above, a git commit creates a brand new and unique commit
because either it has a parent that is different from that of any
other commit, or because its tree is different. All, except case nr 3
when there are no binaries:

     source branch       A'
                         :
     topic branch        A

In this case the parent is the same as that of A, i.e. none, and also
the tree is the same. In order to try to force the creation of a brand
new and unique commit even when the trees are the same --allow-empty
has been used, but this did not avail because git commit creates a
brand new one only when the seconds of the system clock have ticked
before it.

Some of you have suggested to create an A' that is not orphan in such
a case, which is a workaround, and some others to change the message
in it, and this is another. I choose the latter because it allows to
keep the source branch orphan in all cases. So, there are workarounds,
and the script has eventually been implemented and tested, but the
unexpected, time-dependent behavior of git commit is there and someone
could stumble on it sooner or later.

-Angelo

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: erratic behavior commit --allow-empty
  2012-10-04  7:07                         ` Angelo Borsotti
@ 2012-10-04 13:24                           ` Phil Hord
  2012-10-04 19:00                             ` Angelo Borsotti
  2012-10-04 21:17                           ` Philip Oakley
  1 sibling, 1 reply; 53+ messages in thread
From: Phil Hord @ 2012-10-04 13:24 UTC (permalink / raw)
  To: Angelo Borsotti
  Cc: Philip Oakley, Junio C Hamano, Johannes Sixt, git, Matthieu Moy

On Thu, Oct 4, 2012 at 3:07 AM, Angelo Borsotti
<angelo.borsotti@gmail.com> wrote:
...
> The operation that caused problems was nr. 4. In all the cases
> enlisted above, a git commit creates a brand new and unique commit
> because either it has a parent that is different from that of any
> other commit, or because its tree is different. All, except case nr 3
> when there are no binaries:
>
>      source branch       A'
>                          :
>      topic branch        A
>
> In this case the parent is the same as that of A, i.e. none, and also
> the tree is the same.

And why is this a problem?

Is there a process or person watching the server for a new commit?

Is it not enough to notice that the pushed-to branch has a new HEAD?

Phil

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: erratic behavior commit --allow-empty
  2012-10-04 13:24                           ` Phil Hord
@ 2012-10-04 19:00                             ` Angelo Borsotti
  0 siblings, 0 replies; 53+ messages in thread
From: Angelo Borsotti @ 2012-10-04 19:00 UTC (permalink / raw)
  To: Phil Hord; +Cc: Philip Oakley, Junio C Hamano, Johannes Sixt, git, Matthieu Moy

Hi Phil,

>
> And why is this a problem?
>
> Is there a process or person watching the server for a new commit?
>
> Is it not enough to notice that the pushed-to branch has a new HEAD?
>

Yes, the developers use the git gui to see the graph of branches and commits.
The simpler and more uniform it is, the better.

-Angelo

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: erratic behavior commit --allow-empty
  2012-10-04  7:07                         ` Angelo Borsotti
  2012-10-04 13:24                           ` Phil Hord
@ 2012-10-04 21:17                           ` Philip Oakley
  2012-10-04 22:09                             ` Angelo Borsotti
  1 sibling, 1 reply; 53+ messages in thread
From: Philip Oakley @ 2012-10-04 21:17 UTC (permalink / raw)
  To: Angelo Borsotti; +Cc: Junio C Hamano, Johannes Sixt, git, Matthieu Moy

From: "Angelo Borsotti" <angelo.borsotti@gmail.com>
Sent: Thursday, October 04, 2012 8:07 AM
> Hi Philip and all,
>
> let me explain in full what is the problem that I tried to solve, and
> how along the way I stumbled in something that seems to me a git bug
> (at least a documentation one).
>
> There is an R&D team developing software using a workflow that is
> similar to the integration-manager one (the one described by Scott
> Chacon in chapter 5 of Pro Git).

This has the developers having a full copy/history of the integrators 
relevant branches, so that when the pull of the developers branch occurs 
there is a proper link to the integrators history.

> Developers implement features using a local repository hosted on their
> workstations, and when finished push on a server; integrators pull
> from it and put all the contributions together.
> Since integrators rebuild always the software after merging all
> contribution, there is no need for the developers to push the
> binaries. Not pushing them speeds up uploading.
> In order to make life simpler and safer, scripts are provided to
> perform the pushing, pulling, etc. operations. So, most of the git
> commands shown below are actually run from within scripts.
> The development of each feature is done in a dedicated topic branch,
> and the commits done in it contain both the sources and the binaries
> (to allow to recover fully a previous snapshot when a later change
> broke a previous one). When pushing, there are these needs:
>
>      1. push the sources only
>      2. push only the last commit of the topic branch (not the whole 
> history)
>
> A note on point 2: the integrators are not interested in seeing all
> the commits that developers did while implementing their features.
> Having all the history makes their repositories cluttered.
>
> In order to avoid pushing all the history, orphan branches are used to
> parallel the topic ones.

There are other ways to create a branch which has all the developers 
feature history removed, rather than using an --orphan, which removes the 
integrators history as well.

> When pushing, first a commit is done on the topic branch, and then a
> snapshot is created in the parallel branch with the same files,
> binaries removed. The general case is:
>
>     source branch                     D'
>                                       :
>     topic branch        A----B----C---D
>
> In the picture, the developer made 4 commits, and pushed the sources
> of the last one, D.
> A D' is created on the source branch (the relationship with D is
> indicated with a dotted line).

The disconnection of the D' source branch makes it sound like you have a 
second SCM system that you have to put stuff into, which is independent 
of the development teams git repos. I have this [hassle] at my 
$dayjob -one almost has to hide git from the powers-that-be.

> The push script must cope with all the cases that may occur:
>
>     1.  the general one (the one in the previous figure)
>     2.  none of the commits in the topic branch with binaries (i.e. D
> and D' with the same tree)
>     3.  push done immediately after the first commit (A)
>     4.  a push done after another
>
> The script:
>
>     1.  creates the source branch if it does not exist yet (git
> checkout --orphan),
>          otherwise makes HEAD point to it
>     2.  sets a .git/info/exclude file that excludes the binaries
>     3.  removes the binaries from the index (git rm)
>     4.  creates a commit on the source branch
>     5.  pushes it
>     6.  restores the HEAD and index as they were before
>
> The operation that caused problems was nr. 4. In all the cases
> enlisted above, a git commit creates a brand new and unique commit
> because either it has a parent that is different from that of any
> other commit, or because its tree is different. All, except case nr 3
> when there are no binaries:
>
>     source branch       A'
>                         :
>     topic branch        A
>
> In this case the parent is the same as that of A, i.e. none, and also
> the tree is the same.
True.

> In order to try to force the creation of a brand
> new and unique commit even when the trees are the same --allow-empty
> has been used, but this did not avail because

It was --orphan,  --allow-empty (a common tree), the --root commit, and 
scripted with both branches using the same clock tick...

> git commit creates a
> brand new one only when the seconds of the system clock have ticked
> before it.
>
> Some of you have suggested to create an A' that is not orphan in such
> a case, which is a workaround, and some others to change the message
> in it, and this is another. I choose the latter

A reasonable solution. You can also create a sentinel (--root) commit 
for any time that you need to create the source branch, just so it (the 
real source code commit) has a different parent when on source branch to 
that on the binaries branch.

However, personally, I'd have wanted the source branch to show real 
history and actually match with the integrators repo history, but no 
doubt local conditions & politics have their influence.

> because it allows to
> keep the source branch orphan in all cases. So, there are workarounds,
> and the script has eventually been implemented and tested, but the
> unexpected, time-dependent behavior of git commit is there and someone
> could stumble on it sooner or later.
>
> -Angelo

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: erratic behavior commit --allow-empty
  2012-10-04 21:17                           ` Philip Oakley
@ 2012-10-04 22:09                             ` Angelo Borsotti
  2012-10-04 22:42                               ` Philip Oakley
  0 siblings, 1 reply; 53+ messages in thread
From: Angelo Borsotti @ 2012-10-04 22:09 UTC (permalink / raw)
  To: Philip Oakley; +Cc: Junio C Hamano, Johannes Sixt, git, Matthieu Moy

Hi Philip,

> This has the developers having a full copy/history of the integrators
> relevant branches, so that when the pull of the developers branch occurs
> there is a proper link to the integrators history.

True.
>
> There are other ways to create a branch which has all the developers feature
> history removed, rather than using an --orphan, which removes the integrators
> history as well.

The topic branches are populated only by the developers. The integrators merge
all the topic branches into branches dedicated to the integration. In case of
need, the developers can pull these (with all the integrators' history).

>
> The disconnection of the D' source branch makes it sound like you have a
> second SCM system that you have to put stuff into, which is independent of
> the development teams git repos. I have this [hassle] at my $dayjob -one
> almost has to hide git from the powers-that-be.

Well, there is another way to see this: think to a distributed SCM in
which there are some parts of the contents that are shared and some
that are not.
The technique to use disconnected branches is only a way of implementing this.
If, say, git push had an option to filter out the binaries there would
be no need for disconnected branches.

>
> A reasonable solution. You can also create a sentinel (--root) commit for
> any time that you need to create the source branch, just so it (the real
> source code commit) has a different parent when on source branch to that on
> the binaries branch.

Do you mean I could create an empty root commit to be used as parent for the
real source commit? Or that there is some --root option to be used?

-Angelo

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: erratic behavior commit --allow-empty
  2012-10-04 22:09                             ` Angelo Borsotti
@ 2012-10-04 22:42                               ` Philip Oakley
  2012-10-04 23:10                                 ` Angelo Borsotti
  0 siblings, 1 reply; 53+ messages in thread
From: Philip Oakley @ 2012-10-04 22:42 UTC (permalink / raw)
  To: Angelo Borsotti; +Cc: Junio C Hamano, Johannes Sixt, git, Matthieu Moy

From: "Angelo Borsotti" <angelo.borsotti@gmail.com>
Sent: Thursday, October 04, 2012 11:09 PM

>>
>> A reasonable solution. You can also create a sentinel (--root) commit
>> for any time that you need to create the source branch, just so it (the
>> real source code commit) has a different parent when on source branch to
>> that on the binaries branch.
>
> Do you mean I could create an empty root commit to be used as parent
> for the real source commit? Or that there is some --root option to be used?

I was using "--root" in a colloquial way. It is used in some other
commands when the very first commit is to be included in its operation.

At the point where you do the 'git checkout --orphan  <new_branch>
<start_point>' you could have separate start points ready for the source
branch and the binaries branch, and immediately do a 'git commit' to
create the unique sentinel commit before you re-checkout the developers
latest and greatest (with --force), and then do your commits on the 
source branch as before.

Another technique could be to simply switch to the sources branch, and 
then use a 'git clean -x' with an updated .gitignore ('reset' the file 
from the source branch)[or use the exclude file] to remove those now 
ignored binaries, before doing the commit.

Philip

^ permalink raw reply	[flat|nested] 53+ messages in thread
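
The sentinel idea can be sketched as follows. This is one possible
variant, not the exact command sequence from the message above: the
orphan branch first receives an empty root commit, and the source
snapshot is committed on top of it. Identity, timestamp, file and branch
names are made up, and the timestamp is pinned to reproduce the
same-second situation deterministically.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# Hypothetical identity and a fixed timestamp, chosen only for this demo.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
export GIT_AUTHOR_DATE='1349265600 +0000' GIT_COMMITTER_DATE='1349265600 +0000'

echo aaa >f1
git add f1
git commit -q -m A
topic_id=$(git rev-parse HEAD)

# Start the orphan branch with an empty sentinel root commit.
git checkout -q --orphan source
git rm -q -r --cached .
git commit -q --allow-empty -m sentinel

# Commit the same file with the same message and timestamps. The tree
# matches the topic commit, but the parent is now the sentinel.
git add f1
git commit -q -m A
source_id=$(git rev-parse HEAD)

echo "topic:  $topic_id"
echo "source: $source_id"
```

The two commits share a tree but get different ids because their parents
differ, and the source branch still has no history in common with the
topic branch.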

* Re: erratic behavior commit --allow-empty
  2012-10-04 22:42                               ` Philip Oakley
@ 2012-10-04 23:10                                 ` Angelo Borsotti
  0 siblings, 0 replies; 53+ messages in thread
From: Angelo Borsotti @ 2012-10-04 23:10 UTC (permalink / raw)
  To: Philip Oakley; +Cc: Junio C Hamano, Johannes Sixt, git, Matthieu Moy

Hi Phil,

>
> Another technique could be to simply switch to the sources branch, and then
> use a 'git clean -x' with an updated .gitignore ('reset' the file from the
> source branch)[or use the exclude file] to remove those now ignored
> binaries, before doing the commit.
>

Actually, the first time I do a git checkout --orphan to create the
branch, and on subsequent runs a git symbolic-ref HEAD to switch to
it. Then I set a proper exclude file and run list=`git ls-files -c -i
--exclude-standard` to get the paths of the files to remove from the
index, and remove them with git rm --cached. At that point everything
is ready for a git commit. After committing, I restore HEAD and the
index as they were before.
This keeps the work tree pristine (no files are removed from it or
checked out into it from the repo), which makes the script quite fast.
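
The steps above could look roughly like the following sketch. It is not
the actual script: the file names, the '*.o' exclude pattern, the branch
name "sources" and the placeholder identity are made up, 'git reset' is
only one assumed way of restoring the index, and symbolic-ref is used even
on the first run (leaving HEAD unborn) instead of checkout --orphan.

```shell
#!/bin/bash
# Sketch: commit a "sources only" snapshot on another branch without
# touching the work tree.
set -e
work=$(mktemp -d)
cd "$work"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo 'int main(void) { return 0; }' > main.c
echo 'pretend object code' > main.o
git add main.c main.o
git commit -q -m "developer commit: sources and binaries"

orig_head=$(git symbolic-ref HEAD)         # remember the current branch

# Switch HEAD to the sources branch; index and work tree stay as they are.
git symbolic-ref HEAD refs/heads/sources

# Exclude file naming what must not end up on the sources branch.
mkdir -p .git/info
echo '*.o' >> .git/info/exclude

# Tracked (cached) paths that match the exclude patterns.
list=$(git ls-files -c -i --exclude-standard)
if [ -n "$list" ]; then
    # Unquoted on purpose: assumes no whitespace in path names.
    git rm -q --cached -- $list
fi
git commit -q -m "sources only"

# Restore HEAD and the index; the work tree was never modified.
git symbolic-ref HEAD "$orig_head"
git reset -q

git ls-tree -r --name-only sources
```

With paths that may contain whitespace, the list would need to be passed
through 'git ls-files -z' and a NUL-aware consumer instead of a plain
shell variable.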

-Angelo

* Re: erratic behavior commit --allow-empty
  2012-10-03 17:34             ` Angelo Borsotti
  2012-10-03 19:05               ` Andreas Schwab
  2012-10-03 19:43               ` PJ Weisberg
@ 2012-10-05  8:15               ` Lars Noschinski
  2 siblings, 0 replies; 53+ messages in thread
From: Lars Noschinski @ 2012-10-05  8:15 UTC (permalink / raw)
  To: git

Angelo Borsotti <angelo.borsotti <at> gmail.com> writes:
> take a git commit without --allow-empty: if the trees are equal, it
> creates no commit, and if the trees are different it creates one.
> Take then a git commit --allow-empty: if the trees are equal it may
> create a commit or not depending on the parent, message, author and
> date; if the trees are different it creates a commit.
> So, the statement does not apply to commits in general.

It does (as already shown to you). The ID of a commit object depends on
the author, the time, the tree, and the commit message (did I forget
something?). If all these are equal, no new physical object will be
created.
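
To make this concrete, here is a small sketch that computes a commit ID by
hand, without calling git. A commit ID is the SHA-1 of the header
"commit <size>\0" followed by the commit text, which also contains the
parent and committer lines mentioned earlier in the thread (a root commit
simply has no parent line). The identity, the timestamps and the helper
name commit_id are made up for illustration; the tree is the well-known
ID of the empty tree. GNU sha1sum is assumed.

```shell
#!/bin/bash
ident="A U Thor <author@example.com>"      # placeholder identity

# commit_id TREE TIMESTAMP MESSAGE
# Prints the ID Git would give a root commit with these contents.
commit_id() {
    local body
    body=$(printf 'tree %s\nauthor %s %s +0000\ncommitter %s %s +0000\n\n%s' \
        "$1" "$ident" "$2" "$ident" "$2" "$3")
    # $(...) cannot hold the final newline, so add it back when hashing.
    # ${#body} equals the byte count because the text is plain ASCII.
    { printf 'commit %d\0' $(( ${#body} + 1 )); printf '%s\n' "$body"; } |
        sha1sum | cut -d' ' -f1
}

empty_tree=4b825dc642cb6eb9a060e54bf8d69288fbee4904

first=$(commit_id "$empty_tree" 1349424900 A)
same_second=$(commit_id "$empty_tree" 1349424900 A)
next_second=$(commit_id "$empty_tree" 1349424901 A)

echo "first:       $first"
echo "same second: $same_second"
echo "next second: $next_second"
```

The first two IDs are identical because every input is identical; moving
the timestamp by one second is enough to change the third.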

Independent of this: If you are on a branch "foo" pointing to a commit A
and successfully do a commit (with --allow-empty or not), "foo" will
afterwards point to a commit B different from A. So, a successful
"git commit (--allow-empty)" will always add a commit to the branch
you are on.
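
A quick way to check this, sketched with a throwaway repository and a
placeholder identity (both assumptions): even with the message, author and
timestamps pinned to the same values, the second commit gets a new ID
because its parent is the first one.

```shell
#!/bin/bash
set -e
work=$(mktemp -d)
cd "$work"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

# Pin both timestamps so only the parent can differ between the commits.
export GIT_AUTHOR_DATE="1349424900 +0000"
export GIT_COMMITTER_DATE="1349424900 +0000"

git commit -q --allow-empty -m A
a=$(git rev-parse HEAD)

git commit -q --allow-empty -m A
b=$(git rev-parse HEAD)
parent=$(git rev-parse 'HEAD^')

echo "A: $a"
echo "B: $b (parent $parent)"
```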

  -- Lars.

* Re: erratic behavior commit --allow-empty
  2012-10-02  8:26 ` Johannes Sixt
  2012-10-02  8:49   ` Angelo Borsotti
  2012-10-02 17:27   ` Junio C Hamano
@ 2013-01-12 18:30   ` Jan Engelhardt
  2013-01-16 12:26     ` Joachim Schmitz
  2 siblings, 1 reply; 53+ messages in thread
From: Jan Engelhardt @ 2013-01-12 18:30 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Angelo Borsotti, git


On Tuesday 2012-10-02 10:26, Johannes Sixt wrote:
>
>Note that git commit -m A --allow-empty *DID* create a commit. Only, that
>it received the same name (SHA1) as the commit you created before it
>because it had the exact same contents (files, parents, author, committer,
>and timestamps). Obviously, your script was executed sufficiently fast
>that the two commits happened in the same second.

What about introducing nanosecond-granular timestamps into Git?

* Re: erratic behavior commit --allow-empty
  2013-01-12 18:30   ` Jan Engelhardt
@ 2013-01-16 12:26     ` Joachim Schmitz
  0 siblings, 0 replies; 53+ messages in thread
From: Joachim Schmitz @ 2013-01-16 12:26 UTC (permalink / raw)
  To: git

Jan Engelhardt wrote:
> On Tuesday 2012-10-02 10:26, Johannes Sixt wrote:
>>
>> Note that git commit -m A --allow-empty *DID* create a commit. Only,
>> that it received the same name (SHA1) as the commit you created
>> before it because it had the exact same contents (files, parents,
>> author, committer, and timestamps). Obviously, your script was
>> executed sufficiently fast that the two commits happened in the same
>> second.
>
> What about introducing nanosecond-granular timestamps into Git?

Not every platform supported by git has nanosecond clock resolution.

Bye, Jojo 

end of thread, other threads:[~2013-01-16 12:26 UTC | newest]

Thread overview: 53+ messages
2012-10-02  7:51 erratic behavior commit --allow-empty Angelo Borsotti
2012-10-02  8:26 ` Johannes Sixt
2012-10-02  8:49   ` Angelo Borsotti
2012-10-02 17:27   ` Junio C Hamano
2012-10-02 19:34     ` Angelo Borsotti
2012-10-02 19:56       ` Junio C Hamano
2012-10-02 21:56         ` Angelo Borsotti
2012-10-03  2:10           ` PJ Weisberg
2012-10-03  5:37           ` Johannes Sixt
2012-10-03  6:22             ` Angelo Borsotti
2012-10-03  6:27               ` Johannes Sixt
     [not found]                 ` <CAB9Jk9AgtNQfWDr31CWbXf2ag=11du-aruu-0+nOZ3KaaG9=og@mail.gmail.com>
2012-10-03  7:12                   ` Johannes Sixt
2012-10-03  7:35                     ` Angelo Borsotti
2012-10-03 20:49                     ` Junio C Hamano
2012-10-03  7:29           ` Philip Oakley
2012-10-03  7:45             ` Angelo Borsotti
2012-10-03  8:04               ` Matthieu Moy
2012-10-03  8:24                 ` Angelo Borsotti
2012-10-03 11:07                   ` Matthieu Moy
2012-10-03 11:52                     ` Angelo Borsotti
     [not found]                       ` <CABURp0oHez6j8+FPG8Zm52TGVyC1XwWhE55TBDrXRGFrW6kWww@mail.gmail.com>
2012-10-03 13:35                         ` Angelo Borsotti
2012-10-03 14:15                           ` Phil Hord
2012-10-03 13:57                       ` Matthieu Moy
2012-10-03 14:46                         ` Angelo Borsotti
2012-10-03 14:52                           ` Matthieu Moy
2012-10-03 22:32                       ` Philip Oakley
2012-10-04  7:07                         ` Angelo Borsotti
2012-10-04 13:24                           ` Phil Hord
2012-10-04 19:00                             ` Angelo Borsotti
2012-10-04 21:17                           ` Philip Oakley
2012-10-04 22:09                             ` Angelo Borsotti
2012-10-04 22:42                               ` Philip Oakley
2012-10-04 23:10                                 ` Angelo Borsotti
2012-10-03 12:25                   ` Tomas Carnecky
2012-10-03 13:08                     ` Angelo Borsotti
2012-10-03 10:12               ` Andreas Schwab
2012-10-03 11:37                 ` Angelo Borsotti
2012-10-03 13:44                   ` Andreas Schwab
2012-10-03 14:37                     ` Angelo Borsotti
2012-10-03 16:44                       ` Andreas Schwab
2012-10-03 17:37                         ` Angelo Borsotti
2012-10-03 19:03                           ` Andreas Schwab
2012-10-03 19:11                             ` Angelo Borsotti
2012-10-03 20:30                               ` Andreas Schwab
2012-10-03 12:59       ` Phil Hord
2012-10-03 14:25         ` Angelo Borsotti
2012-10-03 16:06           ` PJ Weisberg
2012-10-03 17:34             ` Angelo Borsotti
2012-10-03 19:05               ` Andreas Schwab
2012-10-03 19:43               ` PJ Weisberg
2012-10-05  8:15               ` Lars Noschinski
2013-01-12 18:30   ` Jan Engelhardt
2013-01-16 12:26     ` Joachim Schmitz
