git@vger.kernel.org mailing list mirror (one of many)
From: Nguyen Thai Ngoc Duy <pclouds@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, Nicolas Pitre <nico@fluxnic.net>
Subject: Re: [PATCH v2] pack-objects: use streaming interface for reading large loose blobs
Date: Wed, 16 May 2012 14:09:32 +0700
Message-ID: <CACsJy8DzdFORUMy7p_eVotr=HdkMX10uXy25H=05TBDjOmi4yw@mail.gmail.com>
In-Reply-To: <7vhavhforl.fsf@alter.siamese.dyndns.org>

On Tue, May 15, 2012 at 10:27 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Nguyen Thai Ngoc Duy <pclouds@gmail.com> writes:
>
>> On Tue, May 15, 2012 at 2:43 AM, Junio C Hamano <gitster@pobox.com> wrote:
>>> Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:
>>>
>>>> git usually streams large blobs directly to packs. But there are cases
>>>> diff --git a/t/t1050-large.sh b/t/t1050-large.sh
>>>> index 55ed955..7fbd2e1 100755
>>>> --- a/t/t1050-large.sh
>>>> +++ b/t/t1050-large.sh
>>>> @@ -134,6 +134,22 @@ test_expect_success 'repack' '
>>>>       git repack -ad
>>>>  '
>>>>
>>>> +test_expect_success 'pack-objects with large loose object' '
>>>> +     echo Z | dd of=large4 bs=1k seek=2000 &&
>>>> +     OBJ=9f36d94e145816ec642592c09cc8e601d83af157 &&
>>>> +     P=.git/objects/9f/36d94e145816ec642592c09cc8e601d83af157 &&
>>>
>>> I do not think you need these hardcoded constants; you will run
>>> hash-object later, no?
>>>
>>> Also, relying on $P to exist after hash-object -w returns is somewhat
>>> flaky, no?
>>
>> I need it to be a loose object to test this code path.
>
> No you don't.  You only need it to be something istream_read() will read
> from, iow, it could come from a base representation in a packfile.

No, an in-pack object sets to_reuse to 1, which takes a completely
different code path. write_large_blob_data() is only called when
to_reuse == 0.
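
To make the dispatch concrete, here is a minimal sketch of the decision
described above. It is not the pack-objects code: struct entry,
enum path and choose_path() are hypothetical names, and the "size >
threshold" comparison is an assumption based on the direction this
series takes. Only to_reuse and the "streaming only when to_reuse == 0"
rule come from the discussion.

```c
/* Hypothetical, simplified stand-in for pack-objects' per-object state. */
struct entry {
	int in_pack;         /* 1 if the object already lives in a packfile */
	unsigned long size;  /* uncompressed object size */
};

enum path {
	PATH_REUSE,    /* copy the existing in-pack representation */
	PATH_STREAM,   /* large loose blob: the streaming writer */
	PATH_IN_CORE   /* everything else: read fully, then compress */
};

static enum path choose_path(const struct entry *e,
			     unsigned long big_file_threshold)
{
	/* Assumption, per the text above: in-pack objects get to_reuse = 1. */
	int to_reuse = e->in_pack;

	if (to_reuse)
		return PATH_REUSE;   /* the streaming writer is never reached */
	if (e->size > big_file_threshold)
		return PATH_STREAM;  /* only reachable when to_reuse == 0 */
	return PATH_IN_CORE;
}
```

With this shape, an object that is both large and already packed never
reaches the streaming branch, which is why the test wants the object to
be loose.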

>>> In any case, the patch when applied on top of cd07cc5 (Update draft
>>> release notes to 1.7.11 (11th batch), 2012-05-11) does not pass this part
>>> of the test on my box.
>>
>> Interesting. It passes for me (same base). I assume rm failed?
>
> No, reading the resulting pack dies with an error message that says the
> object could not be read at offset 12, implying that the pack writer wrote
> something bogus.

I'm still unable to reproduce that, but I think I found the problem.
In the streaming code path, I set datalen = <uncompressed size>, but
write_object() returns "hdrlen + (wrong) datalen". Patches will follow
in a couple of hours.
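
The accounting mistake can be illustrated with a small self-contained
sketch. Everything here is hypothetical: toy_compress() is a run-length
encoder standing in for deflate so that no zlib is needed, the 2-byte
header is made up, and write_entry() only mimics the idea that the
writer's return value is the number of bytes the entry occupies in the
pack. The "buggy" flag reproduces the error described above, returning
the uncompressed size instead of the bytes actually written.

```c
#include <stddef.h>

/*
 * Toy run-length encoder standing in for deflate (an assumption, so the
 * sketch has no external dependencies). Emits (count, byte) pairs and
 * returns the number of bytes written to out.
 */
static size_t toy_compress(const unsigned char *in, size_t len,
			   unsigned char *out)
{
	size_t i = 0, o = 0;

	while (i < len) {
		unsigned char c = in[i];
		size_t run = 1;

		while (i + run < len && in[i + run] == c && run < 255)
			run++;
		out[o++] = (unsigned char)run;
		out[o++] = c;
		i += run;
	}
	return o;
}

/*
 * Hypothetical writer: stores a made-up 2-byte header plus the compressed
 * data at pack + offset, and returns how many bytes the entry occupies.
 * The caller adds the return value to offset to place the next entry.
 */
static size_t write_entry(unsigned char *pack, size_t offset,
			  const unsigned char *data, size_t size, int buggy)
{
	const size_t hdrlen = 2;
	size_t datalen;

	pack[offset] = 'H';
	pack[offset + 1] = (unsigned char)(size & 0xff);
	datalen = toy_compress(data, size, pack + offset + hdrlen);
	if (buggy)
		datalen = size; /* the mistake: uncompressed size, not bytes written */
	return hdrlen + datalen;
}
```

If the returned length disagrees with what was really written, every
offset recorded after that entry is off, which would be consistent with
a reader failing to parse an object in the resulting pack.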
-- 
Duy


Thread overview: 16+ messages
2012-05-12 10:26 [PATCH] pack-objects: use streaming interface for reading large loose blobs Nguyễn Thái Ngọc Duy
2012-05-12 16:51 ` Nicolas Pitre
2012-05-13  4:37   ` [PATCH v2] " Nguyễn Thái Ngọc Duy
2012-05-14 15:56     ` Junio C Hamano
2012-05-14 19:43     ` Junio C Hamano
2012-05-15 11:18       ` Nguyen Thai Ngoc Duy
2012-05-15 15:27         ` Junio C Hamano
2012-05-16  7:09           ` Nguyen Thai Ngoc Duy [this message]
2012-05-16 12:02 ` [PATCH v2 1/4] streaming: allow to call close_istream(NULL); Nguyễn Thái Ngọc Duy
2012-05-16 12:02   ` [PATCH v2 2/4] pack-objects, streaming: turn "xx >= big_file_threshold" to ".. > .." Nguyễn Thái Ngọc Duy
2012-05-18 21:05     ` Junio C Hamano
2012-05-16 12:02   ` [PATCH v2 3/4] pack-objects: refactor write_object() Nguyễn Thái Ngọc Duy
2012-05-18 21:16     ` Junio C Hamano
2012-05-19  2:43     ` Nicolas Pitre
2012-05-16 12:02   ` [PATCH v2 4/4] pack-objects: use streaming interface for reading large loose blobs Nguyễn Thái Ngọc Duy
2012-05-18 21:02   ` [PATCH v2 1/4] streaming: allow to call close_istream(NULL); Junio C Hamano
