git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Michael Haggerty <mhagger@alum.mit.edu>
To: Brian Gesiak <modocache@gmail.com>
Cc: git@vger.kernel.org, Jeff King <peff@peff.net>
Subject: Re: [GSoC14][RFC] Proposal Draft: Refactor tempfile handling
Date: Tue, 11 Mar 2014 17:27:05 +0100	[thread overview]
Message-ID: <531F3959.4060608@alum.mit.edu> (raw)
In-Reply-To: <CAN7MxmVQuk96dmXfxZ5kRZPTXNwpz2RY=y8HyqX4mZzrZUVbNg@mail.gmail.com>

On 03/01/2014 10:04 PM, Brian Gesiak wrote:
> Hello all,
> 
> My name is Brian Gesiak. I'm a research student at the University of
> Tokyo, and I'm hoping to participate in this year's Google Summer of
> Code by contributing to Git. I'm a longtime user, first-time
> contributor--some of you may have noticed my "microproject"
> patches.[1][2]
> 
> I'd like to gather some information on one of the GSoC ideas posted on
> the ideas page. Namely, I'm interested in refactoring the way
> tempfiles are cleaned up.
> 
> The ideas page points out that while lock files are closed and
> unlinked[3] when the program exits[4], object pack files implement
> their own brand of temp file creation and deletion. This
> implementation doesn't share the same guarantees as lock files--it is
> possible that the program terminates before the temp file is
> unlinked.[5]
> 
> Lock file references are stored in a linked list. When the program
> exits, this list is traversed and each file is closed and unlinked. It
> seems to me that this mechanism is appropriate for temp files in
> general, not just lock files. Thus, my proposal would be to extract
> this logic into a separate module--tempfile.h, perhaps. Lock and
> object files would share the tempfile implementation.
> 
> That is, both object and lock temp files would be stored in a linked
> list, and all of these would be deleted at program exit.
> 
> I'm very enthused about this project--I think it has it all:
> 
> - Tangible benefits for the end-user
> - Reduced complexity in the codebase
> - Ambitious enough to be interesting
> - Small enough to realistically be completed in a summer
> 
> Please let me know if this seems like it would make for an interesting
> proposal, or if perhaps there is something I am overlooking. Any
> feedback at all would be appreciated. Thank you!

Hi Brian,

Thanks for your proposal.  I have a technical point that I think your
proposal should address:

Currently the linked list of lockfiles only grows, never shrinks.  Once
an object has been linked into the list, there is no way to remove it
again even after the lock has been released.  So if a lock needs to be
created dynamically at a random place in the code, its memory is
unavoidably leaked.

This hasn't been much of a problem in the past because (1) the number of
locks acquired/released during a Git invocation is reasonable, and (2) a
lock object (even if it is already in the list) can be reused after the
lock has been released.  So there are many lock callsites that define
one static lock instance and use it over and over again.

But I have a feeling that if we want to use a similar mechanism to
handle all temporary files (of which there can be more), then it would
be a good idea to lift this limitation.  It will require some care,
though, to make sure that record removal is done in a way that is
threadsafe and safe in the event of all expected kinds of process death.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

  parent reply	other threads:[~2014-03-11 16:27 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-03-01 21:04 [GSoC14][RFC] Proposal Draft: Refactor tempfile handling Brian Gesiak
2014-03-03 22:42 ` Jeff King
2014-03-08 17:04   ` Brian Gesiak
2014-03-11  1:33     ` Jeff King
2014-03-11 16:27 ` Michael Haggerty [this message]
2014-03-11 18:05   ` Jeff King
2014-03-12 12:26     ` Brian Gesiak

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=531F3959.4060608@alum.mit.edu \
    --to=mhagger@alum.mit.edu \
    --cc=git@vger.kernel.org \
    --cc=modocache@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).