git@vger.kernel.org mailing list mirror (one of many)
* Diagnosing stray/stale .keep files -- explore what is in a pack?
@ 2014-01-14 14:54 Martin Langhoff
  2014-01-14 17:10 ` Martin Langhoff
  0 siblings, 1 reply; 10+ messages in thread
From: Martin Langhoff @ 2014-01-14 14:54 UTC
  To: Git Mailing List

hi folks,

I have a git server which gets pushes of data (not code) from a couple
hundred VMs every hour. Every round of pushes leaves two stray .keep
files, so I am guessing two clients are having problems completing the
push. The contents being pushed are reports of a puppet run.

Is there a handy way to list the blobs in a pack, so I can feed them
to git-cat-file and see what's in there? I'm sure that'll help me
narrow down on the issue.

Are there other ways to try to diagnose this?

Does the server-side record anything if a push fails? There are a
number of problems I am familiar with, and they always require
collaboration from the "client" side to spot and diagnose:

 - a ref is not up to date and the server rejects the non-ff push
 - perms issues over objects or refs
 - ENOSPC
 - ... catchable signals (SIGTERM?)

AFAIK it doesn't, and maybe it should, even if it's as simple
as trying to spawn a pipe to "/usr/bin/logger -t git-server" and
attach it to stderr...

This has veered a bit off topic, but I think it's important for large
git server installations.

cheers,


m
-- 
 martin.langhoff@gmail.com
 -  ask interesting questions
 - don't get distracted with shiny stuff  - working code first
 ~ http://docs.moodle.org/en/User:Martin_Langhoff


* Re: Diagnosing stray/stale .keep files -- explore what is in a pack?
  2014-01-14 14:54 Diagnosing stray/stale .keep files -- explore what is in a pack? Martin Langhoff
@ 2014-01-14 17:10 ` Martin Langhoff
  2014-01-14 19:36   ` Martin Fick
  0 siblings, 1 reply; 10+ messages in thread
From: Martin Langhoff @ 2014-01-14 17:10 UTC
  To: Git Mailing List

On Tue, Jan 14, 2014 at 9:54 AM, Martin Langhoff
<martin.langhoff@gmail.com> wrote:
> Is there a handy way to list the blobs in a pack, so I can feed them
> to git-cat-file and see what's in there? I'm sure that'll help me
> narrow down on the issue.

git show-index < /var/lib/ppg/reports.git/objects/pack/pack-22748bcca7f50a3a49aa4aed61444bf9c4ced685.idx |
  cut -d' ' -f2 |
  xargs -I HASH git --git-dir /var/lib/ppg/reports.git/ unpack-file HASH
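
A variant sketch, for illustration only: since unpack-file only accepts blobs, the object list can be filtered by type first. The helper name `blobs_only` and the `$idx` variable are hypothetical; the filter assumes the "SHA TYPE SIZE ..." line layout printed by `git verify-pack -v` and by `git cat-file --batch-check`.

```shell
#!/bin/sh
# blobs_only: read "SHA TYPE SIZE ..." lines on stdin and print the SHA
# of every line whose second field is "blob". Summary lines such as
# "non delta: N objects" or "<pack>: ok" are skipped by the same test.
blobs_only() {
    awk '$2 == "blob" { print $1 }'
}

# Hypothetical usage, run inside the repository (or with --git-dir);
# "$idx" stands for the path to a pack-*.idx file:
#   git verify-pack -v "$idx" | blobs_only |
#       while read -r sha; do git cat-file -p "$sha"; done
```

Filtering on the type column avoids the errors unpack-file reports for commits and trees in the pack.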

After a bit of looking at the output, clearly I have two clients, out
of the many that connect here, that have the problem. I will be
looking into those clients to see what's the problem.

In my use case, clients push to their own head. Looking at refs/heads
shows that there are stale .lock files there. Hmmm.

This is on git 1.7.1 (RHEL and CentOS clients).

cheers,


m


* Re: Diagnosing stray/stale .keep files -- explore what is in a pack?
  2014-01-14 17:10 ` Martin Langhoff
@ 2014-01-14 19:36   ` Martin Fick
  2014-01-14 19:42     ` Martin Langhoff
  0 siblings, 1 reply; 10+ messages in thread
From: Martin Fick @ 2014-01-14 19:36 UTC
  To: Martin Langhoff; +Cc: Git Mailing List

Perhaps the receiving process is dying hard and leaving 
stuff behind?  Out-of-memory, out of disk space?

-Martin

On Tuesday, January 14, 2014 10:10:31 am Martin Langhoff 
wrote:
> On Tue, Jan 14, 2014 at 9:54 AM, Martin Langhoff
> 
> <martin.langhoff@gmail.com> wrote:
> > Is there a handy way to list the blobs in a pack, so I
> > can feed them to git-cat-file and see what's in there?
> > I'm sure that'll help me narrow down on the issue.
> 
> git show-index < /var/lib/ppg/reports.git/objects/pack/pack-22748bcca7f50a3a49aa4aed61444bf9c4ced685.idx |
>   cut -d' ' -f2 |
>   xargs -I HASH git --git-dir /var/lib/ppg/reports.git/ unpack-file HASH
> 
> After a bit of looking at the output, clearly I have two
> clients, out of the many that connect here, that have
> the problem. I will be looking into those clients to see
> what's the problem.
> 
> In my use case, clients push to their own head. Looking
> at refs/heads shows that there are stale .lock files
> there. Hmmm.
> 
> This is on git 1.7.1 (RHEL and CentOS clients).
> 
> cheers,
> 
> 
> m

-- 
The Qualcomm Innovation Center, Inc. is a member of Code 
Aurora Forum, hosted by The Linux Foundation
 


* Re: Diagnosing stray/stale .keep files -- explore what is in a pack?
  2014-01-14 19:36   ` Martin Fick
@ 2014-01-14 19:42     ` Martin Langhoff
  2014-01-15  9:12       ` Jeff King
  0 siblings, 1 reply; 10+ messages in thread
From: Martin Langhoff @ 2014-01-14 19:42 UTC
  To: Martin Fick; +Cc: Git Mailing List

 On Tue, Jan 14, 2014 at 2:36 PM, Martin Fick <mfick@codeaurora.org> wrote:
> Perhaps the receiving process is dying hard and leaving
> stuff behind?  Out-of-memory, out of disk space?

Yes, that's my guess as well. This server had gc misconfigured, so it
hit ENOSPC a few weeks ago.

It is likely that the .lock files were left behind back then, and
since then the clients pushing to these refs were transferring their
whole history and still failing to update the ref, leading to rapid
repo growth.

So my situation is diagnosed and solved; I am still unhappy that it
took so much work and expertise; mainly because git isn't logging
anywhere. See my "Error logging for git over ssh?" message...

thanks,




m


* Re: Diagnosing stray/stale .keep files -- explore what is in a pack?
  2014-01-14 19:42     ` Martin Langhoff
@ 2014-01-15  9:12       ` Jeff King
  2014-01-15 13:42         ` Martin Langhoff
  0 siblings, 1 reply; 10+ messages in thread
From: Jeff King @ 2014-01-15  9:12 UTC
  To: Martin Langhoff; +Cc: Martin Fick, Git Mailing List

On Tue, Jan 14, 2014 at 02:42:09PM -0500, Martin Langhoff wrote:

>  On Tue, Jan 14, 2014 at 2:36 PM, Martin Fick <mfick@codeaurora.org> wrote:
> > Perhaps the receiving process is dying hard and leaving
> > stuff behind?  Out-of-memory, out of disk space?
> 
> Yes, that's my guess as well. This server had gc misconfigured, so it
> hit ENOSPC a few weeks ago.
> 
> It is likely that the .lock files were left behind back then, and
> since then the clients pushing to these refs were transferring their
> whole history and still failing to update the ref, leading to rapid
> repo growth.

We see these occasionally at GitHub, too. I haven't yet figured out a
definite cause, though whatever it is, it's relatively rare.

I think the ".keep" files and the ".lock" files are in two separate
boats, though.

index-pack creates the .keep files as a "lock" between the time it
moves them into place and when receive-pack updates the refs (so that a
simultaneous prune does not think they should be removed). Receive-pack
then updates the refs and removes the ".keep" file. However, in the
interim code, we are just updating the refs, and are careful to return
any errors rather than calling die() (so if ENOSPC prevented ref write,
that would not cause this). So for us to leave a .keep there, it is
probably one of:

  1. A few generic library functions, like xmalloc, can cause us to die.
     This should be very rare, though.

  2. We tried to unlink the keep-file, but couldn't (could ENOSPC
     prevent a deletion? I suspect it depends on the filesystem).

  3. We were killed by signal (or system crash).

Fetch-pack also will create .keep files, and it is much less careful
during the time the file exists.  However, busy servers tend to be
receiving pushes, not initiating fetches.

Actual ".lock" files are added to a signal/atexit handler that cleans
them up automatically on program exit. So those really should be caused
by system crash (or "kill -9"), and that has generally been our
experience at GitHub. But again, if ENOSPC could prevent deletion on
your filesystem, it could be related. But there is not much git can do
to clean up if unlink() fails us.
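
To illustrate the ".lock" behaviour described above, here is a minimal shell sketch. It is not git's implementation (git does this in C); `acquire_lock`, `release_lock` and `with_lock` are hypothetical names. It shows exclusive creation, cleanup from an exit trap, and by implication why "kill -9" leaves the file behind: SIGKILL cannot be trapped, so the cleanup never runs.

```shell
#!/bin/sh
# acquire_lock PATH: create PATH.lock exclusively; fail if it already exists.
acquire_lock() {
    lock="$1.lock"
    # With noclobber (set -C), '>' refuses to overwrite an existing file,
    # which gives O_EXCL-style creation. The subshell keeps the option local.
    if ( set -C; : > "$lock" ) 2>/dev/null; then
        return 0
    fi
    echo "unable to create $lock: file exists" >&2
    return 1
}

# release_lock PATH: remove PATH.lock.
release_lock() {
    rm -f "$1.lock"
}

# with_lock PATH CMD [ARGS...]: run CMD while holding PATH.lock.
# The EXIT trap plays the role of an atexit handler; INT and TERM are
# turned into a normal exit so the EXIT trap still fires.
with_lock() {
    target=$1
    shift
    acquire_lock "$target" || return 1
    (
        trap 'release_lock "$target"' EXIT
        trap 'exit 1' INT TERM
        "$@"
    )
}
```

A second `acquire_lock` on the same path fails until the first holder releases it, so a lock left by a crashed process blocks every later writer until someone removes it by hand.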

-Peff


* Re: Diagnosing stray/stale .keep files -- explore what is in a pack?
  2014-01-15  9:12       ` Jeff King
@ 2014-01-15 13:42         ` Martin Langhoff
  2014-01-15 17:49           ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: Martin Langhoff @ 2014-01-15 13:42 UTC
  To: Jeff King; +Cc: Martin Fick, Git Mailing List

On Wed, Jan 15, 2014 at 4:12 AM, Jeff King <peff@peff.net> wrote:
> We see these occasionally at GitHub, too. I haven't yet figured out a
> definite cause, though whatever it is, it's relatively rare.

Do you have a cleanup script to safely get rid of stale .keep and
.lock files? I wonder what other stale bits merit a cleanup...

We could draft a 'git-repo-clean' that works akin to git clean (i.e.:
only reports by default), or add it to gc.
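
A rough sketch of what the report-by-default behaviour could look like for .lock files. The function name `stale_locks`, the `-f` switch and the one-day default age are all assumptions made for illustration; nothing like this exists in git.

```shell
#!/bin/sh
# stale_locks [-f] REPO [MINUTES]
# Without -f: print *.lock files under REPO older than MINUTES (report only).
# With -f:    delete them instead.
stale_locks() {
    force=no
    if [ "$1" = "-f" ]; then
        force=yes
        shift
    fi
    repo=$1
    minutes=${2:-1440}   # assumed default: one day
    if [ "$force" = yes ]; then
        find "$repo" -type f -name '*.lock' -mmin "+$minutes" -exec rm -f {} +
    else
        find "$repo" -type f -name '*.lock' -mmin "+$minutes" -print
    fi
}
```

Running `stale_locks /path/to/repo.git` first and reviewing the list before adding `-f` mirrors the `git clean -n` habit.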

cheers,



m


* Re: Diagnosing stray/stale .keep files -- explore what is in a pack?
  2014-01-15 13:42         ` Martin Langhoff
@ 2014-01-15 17:49           ` Junio C Hamano
  2014-01-15 23:50             ` Martin Langhoff
  0 siblings, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2014-01-15 17:49 UTC
  To: Martin Langhoff; +Cc: Jeff King, Martin Fick, Git Mailing List

Martin Langhoff <martin.langhoff@gmail.com> writes:

> On Wed, Jan 15, 2014 at 4:12 AM, Jeff King <peff@peff.net> wrote:
>> We see these occasionally at GitHub, too. I haven't yet figured out a
>> definite cause, though whatever it is, it's relatively rare.
>
> Do you have a cleanup script to safely get rid of stale .keep and
> .lock files? I wonder what other stale bits merit a cleanup...

As long as we can reliably determine that it is safe to do so
without risking races, automatically cleaning .lock files is a good
thing to do.

Cleaning .keep files needs the same care and a bit more, though.
You of course have to be sure that no other concurrent process is in
the middle of doing something, but you also need to be sure that the
".keep" file is not a marker created by the end user to say "keep
this pack, do not subject its contents to repacking" after a careful
repacking of the stable part of the history.


* Re: Diagnosing stray/stale .keep files -- explore what is in a pack?
  2014-01-15 17:49           ` Junio C Hamano
@ 2014-01-15 23:50             ` Martin Langhoff
  2014-01-16  1:14               ` Duy Nguyen
  2014-01-21  5:19               ` Jeff King
  0 siblings, 2 replies; 10+ messages in thread
From: Martin Langhoff @ 2014-01-15 23:50 UTC
  To: Junio C Hamano; +Cc: Jeff King, Martin Fick, Git Mailing List

On Wed, Jan 15, 2014 at 12:49 PM, Junio C Hamano <gitster@pobox.com> wrote:
> As long as we can reliably determine that it is safe to do so
> without risking races, automatically cleaning .lock files is a good
> thing to do.

If the .lock file is a day old, it seems to me that it should be safe
to call it stale.

Can anyone "take the lock" if there is already a lock file?

> Cleaning .keep files needs the same care and a bit more, though.
> You of course have to be sure that no other concurrent process is in
> the middle of doing something, but you also need to be sure that the
> ".keep" file is not a marker created by the end user to say "keep
> this pack, do not subject its contents to repacking" after a careful
> repacking of the stable part of the history.

For the keep files, I already drafted a script that looks inside the
keep file. If it reads 'receive-pack [pid] [host]', it checks whether
the hostname matches, and if so whether the pid matches a running
process.

Only if the host matches and the pid is dead do we call it stale.

Seems fairly conservative to me. Are there scenarios where we think
this can misfire?
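
For readers who want something concrete, the check described above could be sketched as follows. This is not the script mentioned in this message; `keep_is_stale` is a hypothetical name, and the parser assumes the first line of the keep file looks like "receive-pack PID on HOST" (it also tolerates the word "on" being absent).

```shell
#!/bin/sh
# keep_is_stale FILE: succeed only if FILE was written by receive-pack
# on this host and the recorded pid is no longer running.
# Anything else (empty file, free-form user marker, other host) is kept.
keep_is_stale() {
    keep=$1
    [ -f "$keep" ] || return 1
    tool='' pid='' word='' host=''
    # read fails on a missing trailing newline but still fills the variables.
    read -r tool pid word host < "$keep" || true
    # Accept both "receive-pack PID on HOST" and "receive-pack PID HOST".
    if [ "$word" != on ]; then
        host=$word
    fi
    [ "$tool" = receive-pack ] || return 1
    case $pid in
        ''|*[!0-9]*) return 1 ;;   # pid must be all digits
    esac
    [ "$host" = "$(uname -n)" ] || return 1
    # kill -0 only probes the pid. Caveat: it also fails for a live process
    # owned by another user, so run this as the user that owns the git
    # processes (or as root).
    if kill -0 "$pid" 2>/dev/null; then
        return 1
    fi
    return 0
}
```

Pid reuse is the remaining gap: if an unrelated process has since been given the same pid, the keep file is treated as live, which errs on the safe side.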




m


* Re: Diagnosing stray/stale .keep files -- explore what is in a pack?
  2014-01-15 23:50             ` Martin Langhoff
@ 2014-01-16  1:14               ` Duy Nguyen
  2014-01-21  5:19               ` Jeff King
  1 sibling, 0 replies; 10+ messages in thread
From: Duy Nguyen @ 2014-01-16  1:14 UTC
  To: Martin Langhoff; +Cc: Junio C Hamano, Jeff King, Martin Fick, Git Mailing List

On Thu, Jan 16, 2014 at 6:50 AM, Martin Langhoff
<martin.langhoff@gmail.com> wrote:
> On Wed, Jan 15, 2014 at 12:49 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> As long as we can reliably determine that it is safe to do so
>> without risking races, automatically cleaning .lock files is a good
>> thing to do.
>
> If the .lock file is a day old, it seems to me that it should be safe
> to call it stale.

Perhaps report those stale locks (and stale .keep files as well, if you
can detect them) as garbage in count-objects too.
-- 
Duy


* Re: Diagnosing stray/stale .keep files -- explore what is in a pack?
  2014-01-15 23:50             ` Martin Langhoff
  2014-01-16  1:14               ` Duy Nguyen
@ 2014-01-21  5:19               ` Jeff King
  1 sibling, 0 replies; 10+ messages in thread
From: Jeff King @ 2014-01-21  5:19 UTC
  To: Martin Langhoff; +Cc: Junio C Hamano, Martin Fick, Git Mailing List

On Wed, Jan 15, 2014 at 06:50:33PM -0500, Martin Langhoff wrote:

> On Wed, Jan 15, 2014 at 12:49 PM, Junio C Hamano <gitster@pobox.com> wrote:
> > As long as we can reliably determine that it is safe to do so
> > without risking races, automatically cleaning .lock files is a good
> > thing to do.
> 
> If the .lock file is a day old, it seems to me that it should be safe
> to call it stale.

Probably. The way our "lease" system works, nobody should be
holding a ref lock for more than a few milliseconds.

That being said, we do lock other things, like the index. Generally I
think the index lock should be quick, too. And similar for config file
rewrites, and shallow files. And rerere files, it looks like. My, "git
grep commit_lock_file" turns up a lot of hits. :)

So I think all of the existing uses are fine, and I suppose that most
new cases should be fine, too, because git processes tend not to last a
long time.

You asked earlier if I had a script for cleaning locks. No code worth
sharing, but I'll give an outline of what we do at GitHub. We basically
do:

  find . -name '*.lock' -mmin +60 -print0 | xargs -0 rm

I.e., we give only an hour.  For keep files, we give a day (since things
like hooks may run for a while under the lock, though a day is probably
excessive). And we check that it begins with "^receive-pack".
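
The keep-file half of that outline might look like the report-only sketch below. It is not GitHub's actual code; `list_stale_keeps` is a hypothetical name, the 1440-minute threshold is the "day" mentioned above, and the standard objects/pack layout of a bare repository is assumed.

```shell
#!/bin/sh
# list_stale_keeps REPO: print .keep files under REPO/objects/pack that are
# more than a day old and whose content begins with "receive-pack".
# Nothing is deleted; the caller decides what to do with the list.
list_stale_keeps() {
    repo=$1
    find "$repo/objects/pack" -type f -name '*.keep' -mmin +1440 \
        -exec grep -l '^receive-pack' {} +
}
```

Keep files created by hand (typically empty or free-form) fail the grep and are never listed, which addresses the concern about user-created markers raised earlier in the thread.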

As far as I know, neither of these has ever caused any problems. Of
course, any problems might not be immediately obvious.

> Can anyone "take the lock" if there is already a lock file?

Git never takes an existing lock. It expects you to clean it up
yourself.

> For the keep files, I already drafted a script that looks inside the
> keep file, if it reads 'receive-pack [pid] [host]' it checks whether
> the hostname matches, and if so whether the pid matches a running
> process.
> 
> Only if the host matches and the pid is dead we call it stale.

That sounds reasonable.

> Seems fairly conservative to me. Are there scenarios where we think
> this can misfire?

I cannot think of any.

-Peff

