From: Marek Zawirski <marek.zawirski@gmail.com>
To: robin.rosenberg@dewire.com, spearce@spearce.org
Cc: git@vger.kernel.org, Marek Zawirski <marek.zawirski@gmail.com>
Subject: [EGIT PATCH 13/20] Raw-data operations in ObjectLoaders and PackFile
Date: Sun, 15 Jun 2008 23:45:42 +0200
Message-ID: <1213566349-25395-14-git-send-email-marek.zawirski@gmail.com>
In-Reply-To: <1213566349-25395-13-git-send-email-marek.zawirski@gmail.com>
Expose operations on raw (storage-specific) data in ObjectLoader and its
subclasses:
- getRawType(), giving access to the object type as recorded at the
PackFile header level
- getRawSize(), giving access to the object size as recorded at the
PackFile header level
- getDeltaBase(), determining the delta base, if applicable
- copyRawData(), allowing direct copying of raw (compressed or deltified)
object data, if possible
+ helper fields and methods in ObjectLoaders
+ helper methods and the core copy engine in PackFile
The new operations do not introduce any significant performance overhead
when not used.
Signed-off-by: Marek Zawirski <marek.zawirski@gmail.com>
---
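A note on the checksum path in copyRawData(): when the pack index carries
CRC32 values, the raw bytes are streamed out through a CheckedOutputStream
and the resulting checksum is compared with the index entry, so nothing has
to be inflated. The stand-alone sketch below illustrates only that idea using
the JDK classes. RawCopySketch, copyWithCrc() and the byte arrays standing in
for the pack are hypothetical names made up for illustration, not jgit API.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.CRC32;
import java.util.zip.CheckedOutputStream;

/** Hypothetical stand-alone illustration; not part of jgit. */
public class RawCopySketch {
	/**
	 * Copy payload to out in buf-sized chunks and return the CRC32 computed
	 * over header + payload. The header is checksummed but never written.
	 */
	static long copyWithCrc(final byte[] header, final byte[] payload,
			final OutputStream out, final byte[] buf) throws IOException {
		final CRC32 crc = new CRC32();
		crc.update(header, 0, header.length);
		// Every byte written through crcOut also updates crc.
		final CheckedOutputStream crcOut = new CheckedOutputStream(out, crc);
		int pos = 0;
		while (pos < payload.length) {
			final int n = Math.min(buf.length, payload.length - pos);
			System.arraycopy(payload, pos, buf, 0, n);
			crcOut.write(buf, 0, n);
			pos += n;
		}
		crcOut.flush();
		return crc.getValue();
	}

	public static void main(final String[] args) throws IOException {
		// Stand-ins for an object header and its compressed data.
		final byte[] header = { (byte) 0x95, 0x0e };
		final byte[] payload = "compressed-bytes".getBytes("US-ASCII");

		// In the patch the expected value comes from the pack index;
		// here it is simply computed over the same bytes.
		final CRC32 expected = new CRC32();
		expected.update(header, 0, header.length);
		expected.update(payload, 0, payload.length);

		final ByteArrayOutputStream out = new ByteArrayOutputStream();
		final long computed = copyWithCrc(header, payload, out, new byte[4]);
		if (computed != expected.getValue())
			throw new IOException("CRC32 mismatch");
		System.out.println("copied " + out.size() + " bytes, CRC32 ok");
	}
}
```

The sketch assumes a non-empty copy buffer. As in the patch, the checksum
covers the object header as well as the data, while only the data reaches the
output stream.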
.../jgit/lib/DeltaOfsPackedObjectLoader.java | 21 ++++++-
.../spearce/jgit/lib/DeltaPackedObjectLoader.java | 9 ++-
.../jgit/lib/DeltaRefPackedObjectLoader.java | 15 ++++-
.../src/org/spearce/jgit/lib/ObjectLoader.java | 24 +++++++
.../src/org/spearce/jgit/lib/PackFile.java | 65 ++++++++++++++++++-
.../org/spearce/jgit/lib/PackedObjectLoader.java | 44 +++++++++++++-
.../org/spearce/jgit/lib/UnpackedObjectLoader.java | 10 +++
.../spearce/jgit/lib/WholePackedObjectLoader.java | 20 ++++++-
8 files changed, 194 insertions(+), 14 deletions(-)
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/DeltaOfsPackedObjectLoader.java b/org.spearce.jgit/src/org/spearce/jgit/lib/DeltaOfsPackedObjectLoader.java
index edbeef9..5c9fb00 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/DeltaOfsPackedObjectLoader.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/DeltaOfsPackedObjectLoader.java
@@ -40,17 +40,34 @@ package org.spearce.jgit.lib;
import java.io.IOException;
+import org.spearce.jgit.errors.CorruptObjectException;
+
/** Reads a deltified object which uses an offset to find its base. */
class DeltaOfsPackedObjectLoader extends DeltaPackedObjectLoader {
private final long deltaBase;
DeltaOfsPackedObjectLoader(final WindowCursor curs, final PackFile pr,
- final long offset, final int deltaSz, final long base) {
- super(curs, pr, offset, deltaSz);
+ final long dataOffset, final long objectOffset, final int deltaSz,
+ final long base) {
+ super(curs, pr, dataOffset, objectOffset, deltaSz);
deltaBase = base;
}
protected PackedObjectLoader getBaseLoader() throws IOException {
return pack.resolveBase(curs, deltaBase);
}
+
+ @Override
+ public int getRawType() {
+ return Constants.OBJ_OFS_DELTA;
+ }
+
+ @Override
+ public ObjectId getDeltaBase() throws IOException {
+ final ObjectId id = pack.findObjectForOffset(deltaBase);
+ if (id == null)
+ throw new CorruptObjectException(
+ "Offset-written delta base for object not found in a pack");
+ return id;
+ }
}
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/DeltaPackedObjectLoader.java b/org.spearce.jgit/src/org/spearce/jgit/lib/DeltaPackedObjectLoader.java
index 4813572..e73f8e5 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/DeltaPackedObjectLoader.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/DeltaPackedObjectLoader.java
@@ -50,8 +50,8 @@ abstract class DeltaPackedObjectLoader extends PackedObjectLoader {
private final int deltaSize;
DeltaPackedObjectLoader(final WindowCursor curs, final PackFile pr,
- final long offset, final int deltaSz) {
- super(curs, pr, offset);
+ final long dataOffset, final long objectOffset, final int deltaSz) {
+ super(curs, pr, dataOffset, objectOffset);
objectType = -1;
deltaSize = deltaSz;
}
@@ -98,6 +98,11 @@ abstract class DeltaPackedObjectLoader extends PackedObjectLoader {
}
}
+ @Override
+ public long getRawSize() {
+ return deltaSize;
+ }
+
/**
* @return the object loader for the base object
* @throws IOException
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/DeltaRefPackedObjectLoader.java b/org.spearce.jgit/src/org/spearce/jgit/lib/DeltaRefPackedObjectLoader.java
index fb87abc..042d3a8 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/DeltaRefPackedObjectLoader.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/DeltaRefPackedObjectLoader.java
@@ -47,8 +47,9 @@ class DeltaRefPackedObjectLoader extends DeltaPackedObjectLoader {
private final ObjectId deltaBase;
DeltaRefPackedObjectLoader(final WindowCursor curs, final PackFile pr,
- final long offset, final int deltaSz, final ObjectId base) {
- super(curs, pr, offset, deltaSz);
+ final long dataOffset, final long objectOffset, final int deltaSz,
+ final ObjectId base) {
+ super(curs, pr, dataOffset, objectOffset, deltaSz);
deltaBase = base;
}
@@ -58,4 +59,14 @@ class DeltaRefPackedObjectLoader extends DeltaPackedObjectLoader {
throw new MissingObjectException(deltaBase, "delta base");
return or;
}
+
+ @Override
+ public int getRawType() throws IOException {
+ return Constants.OBJ_REF_DELTA;
+ }
+
+ @Override
+ public ObjectId getDeltaBase() throws IOException {
+ return deltaBase;
+ }
}
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/ObjectLoader.java b/org.spearce.jgit/src/org/spearce/jgit/lib/ObjectLoader.java
index 3a96dd1..5282491 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/ObjectLoader.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/ObjectLoader.java
@@ -66,6 +66,13 @@ public abstract class ObjectLoader {
}
/**
+ * @return true if id of loaded object is already known, false otherwise.
+ */
+ protected boolean hasComputedId() {
+ return objectId != null;
+ }
+
+ /**
* Set the SHA-1 id of the object handled by this loader
*
* @param id
@@ -113,4 +120,21 @@ public abstract class ObjectLoader {
* the object cannot be read.
*/
public abstract byte[] getCachedBytes() throws IOException;
+
+ /**
+ * @return raw object type from object header, as stored in storage (pack,
+ * loose file). This may be different from {@link #getType()} result
+ * for packs (see {@link Constants}).
+ * @throws IOException
+ * when type cannot be read from the object header.
+ */
+ public abstract int getRawType() throws IOException;
+
+ /**
+ * @return raw size of object from object header (pack, loose file).
+ * Interpretation of this value depends on {@link #getRawType()}.
+ * @throws IOException
+ * when raw size cannot be read from the object header.
+ */
+ public abstract long getRawSize() throws IOException;
}
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/PackFile.java b/org.spearce.jgit/src/org/spearce/jgit/lib/PackFile.java
index 3880966..9992615 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/PackFile.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/PackFile.java
@@ -38,11 +38,16 @@
package org.spearce.jgit.lib;
+import java.io.EOFException;
import java.io.File;
import java.io.IOException;
+import java.io.OutputStream;
import java.util.Iterator;
+import java.util.zip.CRC32;
+import java.util.zip.CheckedOutputStream;
import java.util.zip.DataFormatException;
+import org.spearce.jgit.errors.CorruptObjectException;
import org.spearce.jgit.util.NB;
/**
@@ -201,6 +206,52 @@ public class PackFile implements Iterable<PackIndex.MutableEntry> {
return dstbuf;
}
+ final void copyRawData(final PackedObjectLoader loader,
+ final OutputStream out, final byte buf[]) throws IOException {
+ final long objectOffset = loader.objectOffset;
+ final long dataOffset = loader.dataOffset;
+ final int cnt = (int) (findEndOffset(objectOffset) - dataOffset);
+ final WindowCursor curs = loader.curs;
+
+ if (idx.hasCRC32Support()) {
+ final CRC32 crc = new CRC32();
+ int headerCnt = (int) (dataOffset - objectOffset);
+ while (headerCnt > 0) {
+ int toRead = Math.min(headerCnt, buf.length);
+ int read = pack.read(dataOffset - headerCnt, buf, 0, toRead, curs);
+ if (read != toRead)
+ throw new EOFException();
+ crc.update(buf, 0, read);
+ headerCnt -= toRead;
+ }
+ final CheckedOutputStream crcOut = new CheckedOutputStream(out, crc);
+ pack.copyToStream(dataOffset, buf, cnt, crcOut, curs);
+ final long computed = crc.getValue();
+
+ ObjectId id;
+ if (loader.hasComputedId())
+ id = loader.getId();
+ else
+ id = findObjectForOffset(objectOffset);
+ final long expected = idx.findCRC32(id);
+ if (computed != expected)
+ throw new CorruptObjectException(id,
+ "Possible data corruption - CRC32 of raw pack data (object offset "
+ + objectOffset
+ + ") mismatch CRC32 from pack index");
+ } else {
+ pack.copyToStream(dataOffset, buf, cnt, out, curs);
+
+ // read to verify against Adler32 zlib checksum
+ loader.getCachedBytes();
+ }
+ }
+
+ boolean supportsFastCopyRawData() {
+ return idx.hasCRC32Support();
+ }
+
+
private void readPackHeader() throws IOException {
final WindowCursor curs = new WindowCursor();
long position = 0;
@@ -252,8 +303,8 @@ public class PackFile implements Iterable<PackIndex.MutableEntry> {
case Constants.OBJ_TREE:
case Constants.OBJ_BLOB:
case Constants.OBJ_TAG:
- return new WholePackedObjectLoader(curs, this, pos, typeCode,
- (int) dataSize);
+ return new WholePackedObjectLoader(curs, this, pos, objOffset,
+ typeCode, (int) dataSize);
case Constants.OBJ_OFS_DELTA: {
pack.readFully(pos, ib, curs);
@@ -267,18 +318,24 @@ public class PackFile implements Iterable<PackIndex.MutableEntry> {
ofs += (c & 127);
}
return new DeltaOfsPackedObjectLoader(curs, this, pos + p,
- (int) dataSize, objOffset - ofs);
+ objOffset, (int) dataSize, objOffset - ofs);
}
case Constants.OBJ_REF_DELTA: {
pack.readFully(pos, ib, curs);
return new DeltaRefPackedObjectLoader(curs, this, pos + ib.length,
- (int) dataSize, ObjectId.fromRaw(ib));
+ objOffset, (int) dataSize, ObjectId.fromRaw(ib));
}
default:
throw new IOException("Unknown object type " + typeCode + ".");
}
}
+ private long findEndOffset(final long startOffset)
+ throws CorruptObjectException {
+ final long maxOffset = pack.length() - Constants.OBJECT_ID_LENGTH;
+ return getReverseIdx().findNextOffset(startOffset, maxOffset);
+ }
+
private PackReverseIndex getReverseIdx() {
if (reverseIdx == null)
reverseIdx = new PackReverseIndex(idx);
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/PackedObjectLoader.java b/org.spearce.jgit/src/org/spearce/jgit/lib/PackedObjectLoader.java
index 43d43e6..b433609 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/PackedObjectLoader.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/PackedObjectLoader.java
@@ -39,6 +39,7 @@
package org.spearce.jgit.lib;
import java.io.IOException;
+import java.io.OutputStream;
/**
* Base class for a set of object loader classes for packed objects.
@@ -50,15 +51,18 @@ abstract class PackedObjectLoader extends ObjectLoader {
protected final long dataOffset;
+ protected final long objectOffset;
+
protected int objectType;
protected int objectSize;
PackedObjectLoader(final WindowCursor c, final PackFile pr,
- final long offset) {
+ final long dataOffset, final long objectOffset) {
curs = c;
pack = pr;
- dataOffset = offset;
+ this.dataOffset = dataOffset;
+ this.objectOffset = objectOffset;
}
public int getType() throws IOException {
@@ -82,4 +86,40 @@ abstract class PackedObjectLoader extends ObjectLoader {
System.arraycopy(data, 0, copy, 0, data.length);
return data;
}
+
+ /**
+ * Copy raw object representation from storage to provided output stream.
+ * <p>
+ * Copied data doesn't include object header. User must provide temporary
+ * buffer used during copying by underlying I/O layer.
+ * </p>
+ *
+ * @param out
+ * output stream where data is copied. No buffering is guaranteed.
+ * @param buf
+ * temporary buffer used during copying. Recommended size is at
+ * least a few kB.
+ * @throws IOException
+ * when the object cannot be read.
+ */
+ public void copyRawData(OutputStream out, byte buf[]) throws IOException {
+ pack.copyRawData(this, out, buf);
+ }
+
+ /**
+ * @return true if this loader is capable of fast raw-data copying based on
+ * compressed data checksum; false if raw-data copying needs
+ * uncompressing and compressing data
+ */
+ public boolean supportsFastCopyRawData() {
+ return pack.supportsFastCopyRawData();
+ }
+
+ /**
+ * @return id of delta base object for this object representation. null if
+ * object is not stored as delta.
+ * @throws IOException
+ * when delta base cannot be read.
+ */
+ public abstract ObjectId getDeltaBase() throws IOException;
}
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/UnpackedObjectLoader.java b/org.spearce.jgit/src/org/spearce/jgit/lib/UnpackedObjectLoader.java
index a5c484b..65072f0 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/UnpackedObjectLoader.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/UnpackedObjectLoader.java
@@ -208,4 +208,14 @@ public class UnpackedObjectLoader extends ObjectLoader {
public byte[] getCachedBytes() throws IOException {
return bytes;
}
+
+ @Override
+ public int getRawType() {
+ return objectType;
+ }
+
+ @Override
+ public long getRawSize() {
+ return objectSize;
+ }
}
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/WholePackedObjectLoader.java b/org.spearce.jgit/src/org/spearce/jgit/lib/WholePackedObjectLoader.java
index e54fba6..7185df5 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/WholePackedObjectLoader.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/WholePackedObjectLoader.java
@@ -47,8 +47,9 @@ class WholePackedObjectLoader extends PackedObjectLoader {
private static final int OBJ_COMMIT = Constants.OBJ_COMMIT;
WholePackedObjectLoader(final WindowCursor curs, final PackFile pr,
- final long offset, final int type, final int size) {
- super(curs, pr, offset);
+ final long dataOffset, final long objectOffset, final int type,
+ final int size) {
+ super(curs, pr, dataOffset, objectOffset);
objectType = type;
objectSize = size;
}
@@ -76,4 +77,19 @@ class WholePackedObjectLoader extends PackedObjectLoader {
throw coe;
}
}
+
+ @Override
+ public int getRawType() {
+ return objectType;
+ }
+
+ @Override
+ public long getRawSize() {
+ return objectSize;
+ }
+
+ @Override
+ public ObjectId getDeltaBase() {
+ return null;
+ }
}
--
1.5.5.1
Thread overview: 29+ messages
2008-06-15 21:45 [EGIT PATCH 00/20] PackWriter, first usable attempt Marek Zawirski
2008-06-15 21:45 ` [EGIT PATCH 01/20] Fix typo in PackIndexV2 Marek Zawirski
2008-06-15 21:45 ` [EGIT PATCH 02/20] Integer versions of copyRawTo() and fromRaw() in ObjectId Marek Zawirski
2008-06-15 21:45 ` [EGIT PATCH 03/20] Add openObjectInAllPacks() to Repository, exposing packed objects storage Marek Zawirski
2008-06-15 21:45 ` [EGIT PATCH 04/20] WindowedFile fragments copying: copyToStream() Marek Zawirski
2008-06-15 21:45 ` [EGIT PATCH 05/20] Reverse pack index implementation: PackReverseIndex Marek Zawirski
2008-06-15 21:45 ` [EGIT PATCH 06/20] Tests for PackReverseIndex Marek Zawirski
2008-06-15 21:45 ` [EGIT PATCH 07/20] Refactor PackIndexV2 - extract binarySearchLevelTwo() Marek Zawirski
2008-06-15 21:45 ` [EGIT PATCH 08/20] CRC32 support for PackIndex Marek Zawirski
2008-06-15 21:45 ` [EGIT PATCH 09/20] CRC32 PackIndex tests Marek Zawirski
2008-06-15 21:45 ` [EGIT PATCH 10/20] Format PackedObjectLoader class Marek Zawirski
2008-06-15 21:45 ` [EGIT PATCH 11/20] Format UnpackedObjectLoader class Marek Zawirski
2008-06-15 21:45 ` [EGIT PATCH 12/20] Format DeltaOfsPackedObjectLoader class Marek Zawirski
2008-06-15 21:45 ` Marek Zawirski [this message]
2008-06-15 21:45 ` [EGIT PATCH 14/20] Add hasRevSort() in RevWalk for faster sorting strategy checking Marek Zawirski
2008-06-15 21:45 ` [EGIT PATCH 15/20] Refactor getRevSort() calls to hasRevSort() Marek Zawirski
2008-06-15 21:45 ` [EGIT PATCH 16/20] Support for RevSort.BOUNDARY in ObjectWalk Marek Zawirski
2008-06-15 21:45 ` [EGIT PATCH 17/20] Rename confusing objects field " Marek Zawirski
2008-06-15 21:45 ` [EGIT PATCH 18/20] New CountingOutputStream class - stream decorator Marek Zawirski
2008-06-15 21:45 ` [EGIT PATCH 19/20] Simplified implementation of pack creation: PackWriter Marek Zawirski
2008-06-15 21:45 ` [EGIT PATCH 20/20] PackWriter test suite Marek Zawirski
2008-06-17 21:28 ` [EGIT PATCH 21/20] Make isBetterDeltaReuseLoader() static in PackWriter Marek Zawirski
2008-06-17 22:07 ` Robin Rosenberg
2008-06-19 16:26 ` Marek Zawirski
2008-06-16 4:06 ` [EGIT PATCH 05/20] Reverse pack index implementation: PackReverseIndex Shawn O. Pearce
2008-06-16 16:27 ` Marek Zawirski
2008-06-17 2:02 ` Shawn O. Pearce
2008-06-16 5:19 ` [EGIT PATCH 00/20] PackWriter, first usable attempt Shawn O. Pearce
2008-06-16 16:37 ` Marek Zawirski