git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Shawn O. Pearce" <spearce@spearce.org>
To: mr.gaffo@gmail.com
Cc: git@vger.kernel.org
Subject: Re: [PATCH JGit 3/5] Implemented directory based info cache for objects/info/packs.
Date: Thu, 8 Oct 2009 10:00:19 -0700	[thread overview]
Message-ID: <20091008170019.GF9261@spearce.org> (raw)
In-Reply-To: <1253062116-13830-4-git-send-email-mr.gaffo@gmail.com>

mr.gaffo@gmail.com wrote:
> Add abstract method for updating the object db's info cache to directory.
> 
> Implemented passthrough on Alternate for the update of infocache.
> 
> Added utility that generates the contents of the objects/info/packs
> file as a string from a list of PackFiles.
> 
> Added implementation from ObjectDirectory on down
> for creating the info cache.
> 
> Added test for creating the info cache

Reading this message gave me the funny feeling that I'm going to
see a lot of unrelated code mashed into one patch.  There doesn't
seem to be a general theme to the commit, at least as far as the
message is concerned.

[a bit later...] After reading the code itself, I agree with the
original guess on the commit message, there is too much happening
in this one patch that is unrelated, and there are several problems
lurking that are harder to spot because its a mash of changes.
Please try to break it down to more focused commits.
 
> diff --git a/org.spearce.jgit.test/tst/org/spearce/jgit/lib/CachedPacksInfoFileContentsGeneratorTest.java b/org.spearce.jgit.test/tst/org/spearce/jgit/lib/CachedPacksInfoFileContentsGeneratorTest.java
> new file mode 100644
> index 0000000..bea0b70
> --- /dev/null
> +++ b/org.spearce.jgit.test/tst/org/spearce/jgit/lib/CachedPacksInfoFileContentsGeneratorTest.java
> @@ -0,0 +1,74 @@

FYI, our new package space is org.eclipse.jgit and any new changes
need to take that into account.

> + * - Neither the name of the Git Development Community nor the

Also FYI, since our move to the Eclipse Foundation the generic term
"Git Development Community" has been replaced in the license header
by "Eclipse Foundation, Inc.", otherwise the license header remains
as-is and copyright is still attributed to the contributor.

> +public class CachedPacksInfoFileContentsGeneratorTest extends TestCase {
...
> +	public void testGettingPacksContentsMultiplePacks() throws Exception {
> +		List<PackFile> packs = new ArrayList<PackFile>();
> +		packs.add(new PackFile(TEST_IDX, TEST_PACK));
> +		packs.add(new PackFile(TEST_IDX, TEST_PACK));
> +		packs.add(new PackFile(TEST_IDX, TEST_PACK));

I think we should be testing multiple names here, to ensure the
generator didn't do something stupid like reuse the same array index
for pulling the name while looping for the size of the input list.

> diff --git a/org.spearce.jgit.test/tst/org/spearce/jgit/lib/ObjectDirectoryTest.java b/org.spearce.jgit.test/tst/org/spearce/jgit/lib/ObjectDirectoryTest.java
> @@ -54,9 +54,9 @@
>  
>  	@Override
>  	protected void setUp() throws Exception {
> -		testDir = new File(new File(System.getProperty("java.io.tmpdir")), UUID.randomUUID().toString());
> +		testDir = JGitTestUtil.generateTempDirectoryFileObject();

Yea, I'm confused about this hunk.  It isn't strictly necessary
for the new feature this commit is adding.  Pull this into its own
commit, before the new feature, so you can take advantage of the
refactoring in your new tests, but you also don't muddle the new
feature addition with old code refactoring.

> diff --git a/org.spearce.jgit.test/tst/org/spearce/jgit/util/JGitTestUtil.java b/org.spearce.jgit.test/tst/org/spearce/jgit/util/JGitTestUtil.java
> @@ -74,7 +76,8 @@ public static File getTestResourceFile(final String fileName) {
>  		}
>  	}
>  
> -	public static void copyFile(final File fromFile, final File toFile) throws IOException {
> +	public static void copyFile(final File fromFile, final File toFile)
> +			throws IOException {
>  		InputStream in = new FileInputStream(fromFile);
>  		OutputStream out = new FileOutputStream(toFile);

Unnecessary reformatting hunk; we try to avoid these when possible.

> @@ -87,6 +90,21 @@ public static void copyFile(final File fromFile, final File toFile) throws IOExc
>  		out.close();
>  	}
>  
> +	public static String readFileAsString(final File file)
> +			throws java.io.IOException {

This method already exists in some form in the RepositoryTest class.
Can we instead make a commit to refactor it out of there here?

> +		StringBuilder fileData = new StringBuilder(1000);
> +		BufferedReader reader = new BufferedReader(new FileReader(file));

We should be more specific about our encoding and not rely on the
platform default.

> +		char[] buf = new char[1024];
> +		int numRead = 0;
> +		while ((numRead = reader.read(buf)) != -1) {
> +			String readData = String.valueOf(buf, 0, numRead);
> +			fileData.append(readData);
> +			buf = new char[1024];

There is no need to reallocate the buffer, String.valueOf is required
to copy the array to ensure the returned String is immutable.
Worse, you don't need to convert the char array to string, there is
a method on StringBuilder to append a char[] taking char[],int,int.

> @@ -136,4 +154,8 @@ private static void reportDeleteFailure(final String name,
> +
> +	public static File generateTempDirectoryFileObject() {
> +		return new File(new File(System.getProperty("java.io.tmpdir")), UUID.randomUUID().toString());
> +	}

Please generate temporary directories under the same area that
RepositoryTest produces them, which makes it easier to cleanup,
and ensures we aren't treading out of our build area.

> diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/AlternateRepositoryDatabase.java b/org.spearce.jgit/src/org/spearce/jgit/lib/AlternateRepositoryDatabase.java
> index 68ad488..70ce505 100644
> --- a/org.spearce.jgit/src/org/spearce/jgit/lib/AlternateRepositoryDatabase.java
> +++ b/org.spearce.jgit/src/org/spearce/jgit/lib/AlternateRepositoryDatabase.java
> @@ -130,4 +130,9 @@ protected void closeAlternates(final ObjectDatabase[] alt) {
>  	public List<PackFile> listLocalPacks() {
>  		return odb.listLocalPacks();
>  	}
> +
> +	@Override
> +	public void updateInfoCache() throws IOException {
> +		odb.updateInfoCache();
> +	}
>  }
> diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/CachedPacksInfoFileContentsGenerator.java b/org.spearce.jgit/src/org/spearce/jgit/lib/CachedPacksInfoFileContentsGenerator.java
> + * This file is used to generate the contents of the file system
> + * based pack file cache used by the dumb git-http client protocol.
> + * @author mike

We don't record @author annotations.

> +public class CachedPacksInfoFileContentsGenerator {
> +
> +	private List<PackFile> packs;
> +
> +	public CachedPacksInfoFileContentsGenerator(List<PackFile> packs) {
> +		this.packs = packs;
> +	}
> +	
> +	public String generateContents(){

Style nit: space between () and {.

I also wonder about the value of an instance for this class, if
all it does is produce one return value as a string, why not just
use a static method?

> diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/Constants.java b/org.spearce.jgit/src/org/spearce/jgit/lib/Constants.java
> index 9afea67..2d78dda 100644
> --- a/org.spearce.jgit/src/org/spearce/jgit/lib/Constants.java
> +++ b/org.spearce.jgit/src/org/spearce/jgit/lib/Constants.java
> @@ -224,6 +224,9 @@
>  
>  	/** Info refs folder */
>  	public static final String INFO_REFS = "info/refs";
> +	
> +	/** cached packs file */
> +	public static final String CACHED_PACKS_FILE = "info/packs"; 

I think you should denote in the comment that this is relative to
the objects directory, and not the repository.  The INFO_REFS file
right above you is relative to the repository and something looks
wrong seeing these together in context.
  
> diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/ObjectDatabase.java b/org.spearce.jgit/src/org/spearce/jgit/lib/ObjectDatabase.java
> index 722c802..5ded7bb 100644
> --- a/org.spearce.jgit/src/org/spearce/jgit/lib/ObjectDatabase.java
> +++ b/org.spearce.jgit/src/org/spearce/jgit/lib/ObjectDatabase.java
> @@ -75,6 +75,14 @@ protected ObjectDatabase() {
>  	public abstract List<PackFile> listLocalPacks();
>  	
>  	/**
> +	 * Creates the caches that are typically done by 
> +	 * update-server-info, namely objects/info/packs and 
> +	 * info/refs

Why is the object database updating info/refs?  The info/refs file
has nothing to do with the raw object storage.

> +	 * @throws IOException 
> +	 */
> +	public abstract void updateInfoCache() throws IOException;

I'm not sure I'm happy with this being on ObjectDatabase but there
may not be any other choice.  We'd like to eventually have other
types of ObjectDatabase that don't even store packs, in such a
database updating the info cache makes no sense.  Asking them to
implement the operation is silly.  But... we have no other way to
easily signal the database that it should do this update.

Hmmph.  I wonder if it might be better to configure the
ObjectDirectory at creation time to automatically maintain
the info/packs file during openPack, and put this method on
ObjectDirectory for explicit invocation in case someone has made
edits outside of JGit and wants to force the cache to be current
again.

I'm not sold either way yet.  I'm still open on the approach.

> diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/ObjectDirectory.java b/org.spearce.jgit/src/org/spearce/jgit/lib/ObjectDirectory.java
> index cbe132d..f4251c1 100644
> --- a/org.spearce.jgit/src/org/spearce/jgit/lib/ObjectDirectory.java
> +++ b/org.spearce.jgit/src/org/spearce/jgit/lib/ObjectDirectory.java
> @@ -514,4 +514,9 @@ boolean tryAgain(final long currLastModified) {
>  		tryAgain1();
>  		return new ArrayList<PackFile>(Arrays.asList(packList.get().packs));
>  	}
> +
> +	@Override
> +	public void updateInfoCache() throws IOException {
> +		new UpdateDirectoryBasedPacksInfoCache(this.listLocalPacks(), new File(this.getDirectory(), Constants.CACHED_PACKS_FILE)).execute();

Style-nit: Line is far too long, we try to wrap at around 80
characters but sometimes allow a line to go longer if its only
longer by 1 or 2 characters and the wrapped result would be harder
to read than the unwrapped result.  This is out at 140, too many
over the limit.

> diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/UpdateDirectoryBasedPacksInfoCache.java b/org.spearce.jgit/src/org/spearce/jgit/lib/UpdateDirectoryBasedPacksInfoCache.java
> +public class UpdateDirectoryBasedPacksInfoCache {
> +	public void execute() throws IOException {
> +		String packsContents = new CachedPacksInfoFileContentsGenerator(packsList).generateContents();
> +		FileOutputStream fos = new FileOutputStream(infoPacksFile);
> +		fos.write(packsContents.getBytes());
> +		fos.close();

I'm not sure why this 4 line method requires its own top level
class and being able to create an instance.  Code bloat?

You need to specify the character encoding here and not rely on
the platform default.

I'm not sure how C Git handles this update, but we probably should be
safer about it and use the LockFile class so the update is going to
be transactional.  Here you are truncating a live file, which means
readers could see an empty content if they happen to request the file
from the web server at the wrong moment.  LockFile does the writing
to a temporary file and then renames it over, which on a POSIX file
system means there is no way someone can see a partial update.

> diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/UpdateDirectoryInfoCache.java b/org.spearce.jgit/src/org/spearce/jgit/lib/UpdateDirectoryInfoCache.java
> +public class UpdateDirectoryInfoCache {
> +	public void execute() throws IOException {
> +		String packsContents = new CachedPacksInfoFileContentsGenerator(packsList).generateContents();
> +		FileOutputStream fos = new FileOutputStream(infoPacksFile);
> +		fos.write(packsContents.getBytes());
> +		fos.close();

Uh, isn't this the same as UpdateDirectoryBasedPacksInfoCache?
Two top level classes for a 4 line method?

> diff --git a/org.spearce.jgit/src/org/spearce/jgit/transport/ReceivePack.java b/org.spearce.jgit/src/org/spearce/jgit/transport/ReceivePack.java
> +	private void updateObjectInfoCache() {
> +		try{

Style-nit: space after try

> +			getRepository().getObjectDatabase().updateInfoCache();
> +		} 
> +		catch (IOException e){

Style-nit: space between ) and {

> +			sendMessage("error updating server info: " + e.getMessage());

-- 
Shawn.

  parent reply	other threads:[~2009-10-08 17:05 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-09-16  0:48 [PATCH JGit] Adding update-server-info functionality try2 mr.gaffo
2009-09-16  0:48 ` [PATCH JGit 1/5] adding tests for ObjectDirectory mr.gaffo
2009-09-16  0:48   ` [PATCH JGit 2/5] Create abstract method on ObjectDatabase for accessing the list of local pack files mr.gaffo
2009-09-16  0:48     ` [PATCH JGit 3/5] Implemented directory based info cache for objects/info/packs mr.gaffo
2009-09-16  0:48       ` [PATCH JGit 4/5] Adding in a InfoDatabase like ObjectDatabase and and implementation based upon a directory mr.gaffo
2009-09-16  0:48         ` [PATCH JGit 5/5] added tests for the file based info cache update and made pass mr.gaffo
2009-10-08 17:12           ` Shawn O. Pearce
2009-10-08 17:05         ` [PATCH JGit 4/5] Adding in a InfoDatabase like ObjectDatabase and and implementation based upon a directory Shawn O. Pearce
2009-10-08 17:00       ` Shawn O. Pearce [this message]
2009-09-21 19:40     ` [PATCH JGit 2/5] Create abstract method on ObjectDatabase for accessing the list of local pack files Shawn O. Pearce
2009-09-21 19:51       ` Michael Gaffney
2009-09-21 19:30   ` [PATCH JGit 1/5] adding tests for ObjectDirectory Shawn O. Pearce

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20091008170019.GF9261@spearce.org \
    --to=spearce@spearce.org \
    --cc=git@vger.kernel.org \
    --cc=mr.gaffo@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).