git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* Git v2.11.0 breaks max depth nested alternates
@ 2016-12-04  0:24 Kyle J. McKay
  2016-12-04  4:55 ` Jeff King
  2016-12-04 11:22 ` Philip Oakley
  0 siblings, 2 replies; 7+ messages in thread
From: Kyle J. McKay @ 2016-12-04  0:24 UTC (permalink / raw)
  To: Jeff King, Junio C Hamano; +Cc: Git mailing list

The recent addition of pre-receive quarantining breaks nested  
alternates that are already at the maximum alternates nesting depth.

In the file sha1_file.c in the function link_alt_odb_entries we have  
this:

 > if (depth > 5) {
 >         error("%s: ignoring alternate object stores, nesting too deep.",
 >                         relative_base);
 >         return;
 > }

When the incoming quarantine takes place the current objects directory  
is demoted to an alternate thereby increasing its depth (and any  
alternates it references) by one and causing any object store that was  
previously at the maximum nesting depth to be ignored courtesy of the  
above hard-coded maximum depth.

If the incoming push happens to need access to some of those objects  
to perhaps "--fix-thin" its pack it will crash and burn.

Originally I was not going to include a patch to fix this, but simply  
suggest that the expeditious fix is to just allow one additional  
alternates nesting depth level during quarantine operations.

However, it was so simple, I have included the patch below :)

I have verified that where a push with Git v2.10.2 succeeds and a push  
with Git v2.11.0 to the same repository fails because of this problem  
that the below patch does indeed correct the issue and allow the push  
to succeed.

Cheers,

Kyle

-- 8< --
Subject: [PATCH] receive-pack: increase max alternates depth during quarantine

Ever since 722ff7f876 (receive-pack: quarantine objects until
pre-receive accepts, 2016-10-03, v2.11.0), Git has been quarantining
objects and packs received during an incoming push into a separate
objects directory and using the alternates mechanism to make them
available until they are either accepted and moved into the main
objects directory or rejected and discarded.

Unfortunately this has the side effect of increasing the alternates
nesting depth level by one for all pre-existing alternates.

If a repository is already at the maximum alternates nesting depth,
then this quarantining operation can temporarily push it over making
the incoming push fail.

To prevent the failure we simply increase the allowed alternates
nesting depth by one whenever a quarantine operation is in effect.

Signed-off-by: Kyle J. McKay <mackyle@gmail.com>
---

Notes:
    Some alternates nesting depth background:
    
    If base/fork0/fork1/fork2/fork3/fork4/fork5 represents
    seven git repositories where base.git has no alternates,
    fork0.git has base.git as an alternate, fork1.git has
    fork0.git as an alternate and so on where fork5.git has
    only fork4.git as an alternate, then fork5.git is at
    the maximum allowed depth of 5.  git fsck --strict --full
    works without complaint on fork5.git.
    
    However, in base/fork0/fork1/fork2/fork3/fork4/fork5/fork6,
    an fsck --strict --full of fork6.git will generate complaints
    and any objects/packs present in base.git will be ignored.

 cache.h       | 1 +
 common-main.c | 3 +++
 environment.c | 1 +
 sha1_file.c   | 2 +-
 4 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/cache.h b/cache.h
index a50a61a1..25c17c29 100644
--- a/cache.h
+++ b/cache.h
@@ -676,6 +676,7 @@ extern size_t packed_git_limit;
 extern size_t delta_base_cache_limit;
 extern unsigned long big_file_threshold;
 extern unsigned long pack_size_limit_cfg;
+extern int alt_odb_max_depth;
 
 /*
  * Accessors for the core.sharedrepository config which lazy-load the value
diff --git a/common-main.c b/common-main.c
index c654f955..9f747491 100644
--- a/common-main.c
+++ b/common-main.c
@@ -37,5 +37,8 @@ int main(int argc, const char **argv)
 
 	restore_sigpipe_to_default();
 
+	if (getenv(GIT_QUARANTINE_ENVIRONMENT))
+		alt_odb_max_depth++;
+
 	return cmd_main(argc, argv);
 }
diff --git a/environment.c b/environment.c
index 0935ec69..32e11f70 100644
--- a/environment.c
+++ b/environment.c
@@ -64,6 +64,7 @@ int merge_log_config = -1;
 int precomposed_unicode = -1; /* see probe_utf8_pathname_composition() */
 unsigned long pack_size_limit_cfg;
 enum hide_dotfiles_type hide_dotfiles = HIDE_DOTFILES_DOTGITONLY;
+int alt_odb_max_depth = 5;
 
 #ifndef PROTECT_HFS_DEFAULT
 #define PROTECT_HFS_DEFAULT 0
diff --git a/sha1_file.c b/sha1_file.c
index 9c86d192..15b8432e 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -337,7 +337,7 @@ static void link_alt_odb_entries(const char *alt, int len, int sep,
 	int i;
 	struct strbuf objdirbuf = STRBUF_INIT;
 
-	if (depth > 5) {
+	if (depth > alt_odb_max_depth) {
 		error("%s: ignoring alternate object stores, nesting too deep.",
 				relative_base);
 		return;
---

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: Git v2.11.0 breaks max depth nested alternates
  2016-12-04  0:24 Git v2.11.0 breaks max depth nested alternates Kyle J. McKay
@ 2016-12-04  4:55 ` Jeff King
  2016-12-04  9:37   ` Kyle J. McKay
  2016-12-04 11:22 ` Philip Oakley
  1 sibling, 1 reply; 7+ messages in thread
From: Jeff King @ 2016-12-04  4:55 UTC (permalink / raw)
  To: Kyle J. McKay; +Cc: Junio C Hamano, Git mailing list

On Sat, Dec 03, 2016 at 04:24:02PM -0800, Kyle J. McKay wrote:

> When the incoming quarantine takes place the current objects directory  
> is demoted to an alternate thereby increasing its depth (and any  
> alternates it references) by one and causing any object store that was  
> previously at the maximum nesting depth to be ignored courtesy of the  
> above hard-coded maximum depth.
> 
> If the incoming push happens to need access to some of those objects  
> to perhaps "--fix-thin" its pack it will crash and burn.

Yep, that makes sense. I didn't really worry about this because the
existing "5" is totally arbitrary, and meant to be so high that nobody
reaches it (it's just there to break cycles).

So I do think this is worth dealing with, but I'm also curious why
you're hitting the depth-5 limit. I'm guessing it has to do with hosting
a hierarchy of related repos. But is your system then always in danger
of busting the 5-limit if people create too deep a repository hierarchy?

Specifically, I'm wondering if it would be sufficient to just bump it to
6. Or 100.

Of course any static bump runs into the funny case where a repo
_usually_ works, but fails when pushed to. Which is kind of nasty and
unintuitive. And your patch fixes that, and we can leave the idea of
bumping the static depth number as an orthogonal issue (that personally,
I do not care about much about either way).

> diff --git a/common-main.c b/common-main.c
> index c654f955..9f747491 100644
> --- a/common-main.c
> +++ b/common-main.c
> @@ -37,5 +37,8 @@ int main(int argc, const char **argv)
>  
>  	restore_sigpipe_to_default();
>  
> +	if (getenv(GIT_QUARANTINE_ENVIRONMENT))
> +		alt_odb_max_depth++;
> +
>  	return cmd_main(argc, argv);

After reading your problem description, my initial thought was to
increment the counter when we allocate the tmp-objdir, and decrement
when it is destroyed. Because the parent receive-pack process adds it to
its alternates, too. But:

  1. Receive-pack doesn't care; it adds the tmp-objdir as an alternate,
     rather than adding it as its main object dir and bumping down the
     main one.

  2. There would have to be some way of communicating to sub-processes
     that they should bump their max-depth by one.

You've basically used the quarantine-path variable as the
inter-process flag for (2). Which feels a little funny, because its
value is unrelated to the alt-odb setup. But it is a reliable signal, so
there's a certain elegance. It's probably the best option, given that
the alternative is a specific variable to say "hey, bump your
max-alt-odb-depth by one". That's pretty ugly, too. :)

-Peff

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Git v2.11.0 breaks max depth nested alternates
  2016-12-04  4:55 ` Jeff King
@ 2016-12-04  9:37   ` Kyle J. McKay
  2016-12-05  7:14     ` Jeff King
  0 siblings, 1 reply; 7+ messages in thread
From: Kyle J. McKay @ 2016-12-04  9:37 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Git mailing list

On Dec 3, 2016, at 20:55, Jeff King wrote:

> So I do think this is worth dealing with, but I'm also curious why
> you're hitting the depth-5 limit. I'm guessing it has to do with  
> hosting
> a hierarchy of related repos. But is your system then always in danger
> of busting the 5-limit if people create too deep a repository  
> hierarchy?

No we check for the limit.  Anything at the limit gets broken by the  
quarantine change though.

> Specifically, I'm wondering if it would be sufficient to just bump  
> it to
> 6. Or 100.

Well, if we left the current limit in place, but as you say:

> Of course any static bump runs into the funny case where a repo
> _usually_ works, but fails when pushed to. Which is kind of nasty and
> unintuitive. And your patch fixes that,

Yes.  That's not nice, hence the patch.  Without the fix, pushing  
might work sometimes until you actually need to access cut-off objects  
at pre-receive time.  So you might be able to push sometimes and  
sometimes it breaks.

> and we can leave the idea of
> bumping the static depth number as an orthogonal issue (that  
> personally,
> I do not care about much about either way).

The patch is a step on that road.  It doesn't go that far but all it  
would take is connecting the introduced variable to a config item.   
But you still need to bump it by 1 during quarantine operations.  Such  
support would even allow alternates to be disallowed (except during  
quarantine).  I wonder if there's an opportunity for further pack  
operation optimizations in such a case (you know there are no  
alternates because they're not allowed)?

>> diff --git a/common-main.c b/common-main.c
>> index c654f955..9f747491 100644
>> --- a/common-main.c
>> +++ b/common-main.c
>> @@ -37,5 +37,8 @@ int main(int argc, const char **argv)
>>
>> 	restore_sigpipe_to_default();
>>
>> +	if (getenv(GIT_QUARANTINE_ENVIRONMENT))
>> +		alt_odb_max_depth++;
>> +
>> 	return cmd_main(argc, argv);
>
> After reading your problem description, my initial thought was to
> increment the counter when we allocate the tmp-objdir, and decrement
> when it is destroyed. Because the parent receive-pack process adds  
> it to
> its alternates, too. But:
>
>  1. Receive-pack doesn't care; it adds the tmp-objdir as an alternate,
>     rather than adding it as its main object dir and bumping down the
>     main one.
>
>  2. There would have to be some way of communicating to sub-processes
>     that they should bump their max-depth by one.

All true.  And I had similar thoughts.  Perhaps we should add your  
comments to the patch description?  There seems to be a trend towards  
having longer patch descriptions these days... ;)

> You've basically used the quarantine-path variable as the
> inter-process flag for (2). Which feels a little funny, because its
> value is unrelated to the alt-odb setup. But it is a reliable  
> signal, so
> there's a certain elegance. It's probably the best option, given that
> the alternative is a specific variable to say "hey, bump your
> max-alt-odb-depth by one". That's pretty ugly, too. :)

You took the words right out of my mouth...   I guess I need to work  
on doing a better job of dumping my stream-of-thoughts that go into a  
patch into the emails to the list.

Most all of your comments could be dumped into the patch description  
as-is to pimp it out some.  I have no objection to that, even adding  
an "Additional-analysis-by:" (or similar) credit line too.  :)

--Kyle

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Git v2.11.0 breaks max depth nested alternates
  2016-12-04  0:24 Git v2.11.0 breaks max depth nested alternates Kyle J. McKay
  2016-12-04  4:55 ` Jeff King
@ 2016-12-04 11:22 ` Philip Oakley
  2016-12-05  7:18   ` Jeff King
  1 sibling, 1 reply; 7+ messages in thread
From: Philip Oakley @ 2016-12-04 11:22 UTC (permalink / raw)
  To: Kyle J. McKay, Jeff King, Junio C Hamano; +Cc: Git mailing list

From: "Kyle J. McKay" <mackyle@gmail.com>
Sent: Sunday, December 04, 2016 12:24 AM
> The recent addition of pre-receive quarantining breaks nested
> alternates that are already at the maximum alternates nesting depth.
>
> In the file sha1_file.c in the function link_alt_odb_entries we have
> this:
>
> > if (depth > 5) {
> >         error("%s: ignoring alternate object stores, nesting too deep.",
> >                         relative_base);
> >         return;
> > }
>
> When the incoming quarantine takes place the current objects directory
> is demoted to an alternate thereby increasing its depth (and any
> alternates it references) by one and causing any object store that was
> previously at the maximum nesting depth to be ignored courtesy of the
> above hard-coded maximum depth.
>
> If the incoming push happens to need access to some of those objects
> to perhaps "--fix-thin" its pack it will crash and burn.
>
> Originally I was not going to include a patch to fix this, but simply
> suggest that the expeditious fix is to just allow one additional
> alternates nesting depth level during quarantine operations.
>
> However, it was so simple, I have included the patch below :)
>
> I have verified that where a push with Git v2.10.2 succeeds and a push
> with Git v2.11.0 to the same repository fails because of this problem
> that the below patch does indeed correct the issue and allow the push
> to succeed.
>
> Cheers,
>
> Kyle
>
> -- 8< --
> Subject: [PATCH] receive-pack: increase max alternates depth during 
> quarantine
>
> Ever since 722ff7f876 (receive-pack: quarantine objects until
> pre-receive accepts, 2016-10-03, v2.11.0), Git has been quarantining
> objects and packs received during an incoming push into a separate
> objects directory and using the alternates mechanism to make them
> available until they are either accepted and moved into the main
> objects directory or rejected and discarded.

Is there a step here that after the accepted/rejected stage, it should then 
decrement the limit back to its original value. The problem description 
suggests that might be the case.
--
Philip

>
> Unfortunately this has the side effect of increasing the alternates
> nesting depth level by one for all pre-existing alternates.
>
> If a repository is already at the maximum alternates nesting depth,
> then this quarantining operation can temporarily push it over making
> the incoming push fail.
>
> To prevent the failure we simply increase the allowed alternates
> nesting depth by one whenever a quarantine operation is in effect.
>
> Signed-off-by: Kyle J. McKay <mackyle@gmail.com>
> ---
>
> Notes:
>    Some alternates nesting depth background:
>
>    If base/fork0/fork1/fork2/fork3/fork4/fork5 represents
>    seven git repositories where base.git has no alternates,
>    fork0.git has base.git as an alternate, fork1.git has
>    fork0.git as an alternate and so on where fork5.git has
>    only fork4.git as an alternate, then fork5.git is at
>    the maximum allowed depth of 5.  git fsck --strict --full
>    works without complaint on fork5.git.
>
>    However, in base/fork0/fork1/fork2/fork3/fork4/fork5/fork6,
>    an fsck --strict --full of fork6.git will generate complaints
>    and any objects/packs present in base.git will be ignored.
>
> cache.h       | 1 +
> common-main.c | 3 +++
> environment.c | 1 +
> sha1_file.c   | 2 +-
> 4 files changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/cache.h b/cache.h
> index a50a61a1..25c17c29 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -676,6 +676,7 @@ extern size_t packed_git_limit;
> extern size_t delta_base_cache_limit;
> extern unsigned long big_file_threshold;
> extern unsigned long pack_size_limit_cfg;
> +extern int alt_odb_max_depth;
>
> /*
>  * Accessors for the core.sharedrepository config which lazy-load the 
> value
> diff --git a/common-main.c b/common-main.c
> index c654f955..9f747491 100644
> --- a/common-main.c
> +++ b/common-main.c
> @@ -37,5 +37,8 @@ int main(int argc, const char **argv)
>
>  restore_sigpipe_to_default();
>
> + if (getenv(GIT_QUARANTINE_ENVIRONMENT))
> + alt_odb_max_depth++;
> +
>  return cmd_main(argc, argv);
> }
> diff --git a/environment.c b/environment.c
> index 0935ec69..32e11f70 100644
> --- a/environment.c
> +++ b/environment.c
> @@ -64,6 +64,7 @@ int merge_log_config = -1;
> int precomposed_unicode = -1; /* see probe_utf8_pathname_composition() */
> unsigned long pack_size_limit_cfg;
> enum hide_dotfiles_type hide_dotfiles = HIDE_DOTFILES_DOTGITONLY;
> +int alt_odb_max_depth = 5;
>
> #ifndef PROTECT_HFS_DEFAULT
> #define PROTECT_HFS_DEFAULT 0
> diff --git a/sha1_file.c b/sha1_file.c
> index 9c86d192..15b8432e 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -337,7 +337,7 @@ static void link_alt_odb_entries(const char *alt, int 
> len, int sep,
>  int i;
>  struct strbuf objdirbuf = STRBUF_INIT;
>
> - if (depth > 5) {
> + if (depth > alt_odb_max_depth) {
>  error("%s: ignoring alternate object stores, nesting too deep.",
>  relative_base);
>  return;
> ---
> 


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Git v2.11.0 breaks max depth nested alternates
  2016-12-04  9:37   ` Kyle J. McKay
@ 2016-12-05  7:14     ` Jeff King
  0 siblings, 0 replies; 7+ messages in thread
From: Jeff King @ 2016-12-05  7:14 UTC (permalink / raw)
  To: Kyle J. McKay; +Cc: Junio C Hamano, Git mailing list

On Sun, Dec 04, 2016 at 01:37:00AM -0800, Kyle J. McKay wrote:

> On Dec 3, 2016, at 20:55, Jeff King wrote:
> 
> > So I do think this is worth dealing with, but I'm also curious why
> > you're hitting the depth-5 limit. I'm guessing it has to do with hosting
> > a hierarchy of related repos. But is your system then always in danger
> > of busting the 5-limit if people create too deep a repository hierarchy?
> 
> No we check for the limit.  Anything at the limit gets broken by the
> quarantine change though.

OK. So the limit is an issue for your system, but one that you're able
to deal gracefully with (and the quarantine change makes that a lot
harder). I buy that line of reasoning.

> The patch is a step on that road.  It doesn't go that far but all it would
> take is connecting the introduced variable to a config item.  But you still
> need to bump it by 1 during quarantine operations.  Such support would even
> allow alternates to be disallowed (except during quarantine).  I wonder if
> there's an opportunity for further pack operation optimizations in such a
> case (you know there are no alternates because they're not allowed)?

I doubt it. We look at the list of alternates early on, and in most
cases there aren't any. So any optimization there can be done already at
that point.

The only optimization I know if in that area is 56dfeb626 (pack-objects:
compute local/ignore_pack_keep early, 2016-07-29), which works already.

> All true.  And I had similar thoughts.  Perhaps we should add your comments
> to the patch description?  There seems to be a trend towards having longer
> patch descriptions these days... ;)

Feel free to pick out anything that's useful and add it in verbatim or
rephrased, whichever is more convenient.

> You took the words right out of my mouth...   I guess I need to work on
> doing a better job of dumping my stream-of-thoughts that go into a patch
> into the emails to the list.

It's a lot easier when you're the reviewer, because you don't start
reading through the commit-message with a full understanding of the
problem yet. :)

> Most all of your comments could be dumped into the patch description as-is
> to pimp it out some.  I have no objection to that, even adding an
> "Additional-analysis-by:" (or similar) credit line too.  :)

Sure. I don't really need credit, or even just "reviewed-by" is fine.
Talking and generating a shared understanding of the problem is part of
the review process.

-Peff

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Git v2.11.0 breaks max depth nested alternates
  2016-12-04 11:22 ` Philip Oakley
@ 2016-12-05  7:18   ` Jeff King
  2016-12-05 12:07     ` Philip Oakley
  0 siblings, 1 reply; 7+ messages in thread
From: Jeff King @ 2016-12-05  7:18 UTC (permalink / raw)
  To: Philip Oakley; +Cc: Kyle J. McKay, Junio C Hamano, Git mailing list

On Sun, Dec 04, 2016 at 11:22:52AM -0000, Philip Oakley wrote:

> > Ever since 722ff7f876 (receive-pack: quarantine objects until
> > pre-receive accepts, 2016-10-03, v2.11.0), Git has been quarantining
> > objects and packs received during an incoming push into a separate
> > objects directory and using the alternates mechanism to make them
> > available until they are either accepted and moved into the main
> > objects directory or rejected and discarded.
> 
> Is there a step here that after the accepted/rejected stage, it should then
> decrement the limit back to its original value. The problem description
> suggests that might be the case.

No. I thought that at first, too, but this increment happens in the
sub-process which is using the extra level of alternates for its entire
lifetime. So it "resets" it by exiting, and the parent process never
increments its internal value at all.

-Peff

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Git v2.11.0 breaks max depth nested alternates
  2016-12-05  7:18   ` Jeff King
@ 2016-12-05 12:07     ` Philip Oakley
  0 siblings, 0 replies; 7+ messages in thread
From: Philip Oakley @ 2016-12-05 12:07 UTC (permalink / raw)
  To: Jeff King; +Cc: Kyle J. McKay, Junio C Hamano, Git mailing list

From: "Jeff King" <peff@peff.net>
> On Sun, Dec 04, 2016 at 11:22:52AM -0000, Philip Oakley wrote:
>
>> > Ever since 722ff7f876 (receive-pack: quarantine objects until
>> > pre-receive accepts, 2016-10-03, v2.11.0), Git has been quarantining
>> > objects and packs received during an incoming push into a separate
>> > objects directory and using the alternates mechanism to make them
>> > available until they are either accepted and moved into the main
>> > objects directory or rejected and discarded.
>>
>> Is there a step here that after the accepted/rejected stage, it should 
>> then
>> decrement the limit back to its original value. The problem description
>> suggests that might be the case.
>
> No. I thought that at first, too, but this increment happens in the
> sub-process which is using the extra level of alternates for its entire
> lifetime. So it "resets" it by exiting, and the parent process never
> increments its internal value at all.
>
Thanks for the clarification.
--
Philip 


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2016-12-05 12:09 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-04  0:24 Git v2.11.0 breaks max depth nested alternates Kyle J. McKay
2016-12-04  4:55 ` Jeff King
2016-12-04  9:37   ` Kyle J. McKay
2016-12-05  7:14     ` Jeff King
2016-12-04 11:22 ` Philip Oakley
2016-12-05  7:18   ` Jeff King
2016-12-05 12:07     ` Philip Oakley

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).