From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on dcvr.yhbt.net X-Spam-Level: X-Spam-ASN: AS31976 209.132.180.0/23 X-Spam-Status: No, score=-11.1 required=3.0 tests=AWL,BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI,T_DKIMWL_WL_MED,USER_IN_DEF_DKIM_WL shortcircuit=no autolearn=ham autolearn_force=no version=3.4.0 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by dcvr.yhbt.net (Postfix) with ESMTP id 122FB2023D for ; Wed, 2 May 2018 20:30:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751344AbeEBUaL (ORCPT ); Wed, 2 May 2018 16:30:11 -0400 Received: from mail-pg0-f41.google.com ([74.125.83.41]:46597 "EHLO mail-pg0-f41.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750970AbeEBUaJ (ORCPT ); Wed, 2 May 2018 16:30:09 -0400 Received: by mail-pg0-f41.google.com with SMTP id z4-v6so11443014pgu.13 for ; Wed, 02 May 2018 13:30:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:from:to:cc:subject:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=/EG+Ltf0XwN7byEI2DXNSCHa2vP2KbjacmEJTbRPTOM=; b=ISG46dtDY8fGUZHhWZR82287kUWUTeCgBAQGUdVtofbB2TxiH/c3YmAOmvyOpGvGQv BBKXXKRnEo9GOq2DTJQHBV6/jzAjinuOUeEZD2PDayqqqqaxsnAgIIks2XDU8dfyIGV2 31A2MJvwAB/hnhKjy+PsE0sNe+7FVPTea9oFTxfEqHhSo/K+59YplRidPRU4NsuYGCSn Oh66xSLWgE9eqGlYGktaBSioM7+X9oT2swnQru79FMbUoJGRpSFD6RPysObhoOOBy2f4 IxYFIGbSAvgwPMUZL3Xl60QAv4FDZSDsuck7ite7d1H33DetOl2af2bsWo2wA4KNfd1j RJ9w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=/EG+Ltf0XwN7byEI2DXNSCHa2vP2KbjacmEJTbRPTOM=; b=ot36RhfgEbcKiXbCnjtdeXfPTUHPJeTxRbHQH9ehXPqzbmxuD5b89rbvV5BoTQhX5T GwsUmcEdd4VZt7yyKHb1C7gYoz1vzdT2taFaYQ9fLocaxoWBeYijul4IuunW5zuTxwcO EXoklvwVBaWU2I2/MAdMZt3qP3q4omAGqdjQqLt63jITGllYE9K9iQv+dVVv9HA4qbGZ k6wTeI1SBMKTOG6G4tEHpgx8RemW1gn8o+qBIhaOJE0U7QUOI+Ey/ns9t3X1QGxHyczU kgUclvSVpvxZ4gsypQEyuQD8CXxvQpk+cJ+JPZ2cai0ts0BdRXk/28UDVsL6VvyKsgzP hMYg== X-Gm-Message-State: ALQs6tDaBMlaJ2IdA5N/nVgRbOq9RBJfKGCKGETHlgQoW7iRG5b5U6zQ jhCGLZnNRRce+prfoILaMzYVSA== X-Google-Smtp-Source: AB8JxZqIatJJhp1nqDuy6acMmYfD6rToZGb1VPstq+fz8QtfFdMmQZGQ/LKRNjNRixClXbvTQq4m3A== X-Received: by 2002:a65:4844:: with SMTP id i4-v6mr17079021pgs.191.1525293007991; Wed, 02 May 2018 13:30:07 -0700 (PDT) Received: from twelve3.svl.corp.google.com ([2620:0:100e:422:ffac:c1d4:4bf7:bb93]) by smtp.gmail.com with ESMTPSA id u188sm20140872pfb.84.2018.05.02.13.30.05 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 02 May 2018 13:30:06 -0700 (PDT) Date: Wed, 2 May 2018 13:30:05 -0700 From: Jonathan Tan To: Stefan Beller Cc: git@vger.kernel.org, jamill@microsoft.com, Jonathan Nieder Subject: Re: [PATCH 01/13] repository: introduce object parser field Message-Id: <20180502133005.668e9eefda445f37ba808b96@google.com> In-Reply-To: <20180501213403.14643-2-sbeller@google.com> References: <20180501213403.14643-1-sbeller@google.com> <20180501213403.14643-2-sbeller@google.com> X-Mailer: Sylpheed 3.6.0 (GTK+ 2.24.31; x86_64-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org On Tue, 1 May 2018 14:33:51 -0700 Stefan Beller wrote: > Git's object access code can be thought of as containing two layers: > the raw object store provides access to raw object content, while the > higher level obj_hash code parses raw objects and keeps track of > parenthood and other object relationships using 'struct object'. > Keeping these layers separate should make it easier to find relevant > functions and to change the implementation of one without having to > touch the other. I only understood this after reading the code below. I think this paragraph can be removed; I don't see its relevance - of course we need to store metadata about how to load objects somewhere, and caching objects we have already loaded is a good idea: and the metadata and cache are two separate things before and after this patch anyway. > Add an object_parser field to 'struct repository' to prepare obj_hash > to be handled per repository. Callers still only use the_repository > for now --- later patches will adapt them to handle arbitrary > repositories. I think this is better reworded as: Convert the existing global cache for parsed objects (obj_hash) into repository-specific parsed object caches. Existing code that uses obj_hash are modified to use the parsed object cache of the_repository; future patches will use the parsed object caches of other repositories. > +struct object_parser *object_parser_new(void) > +{ > + struct object_parser *o = xmalloc(sizeof(*o)); > + memset(o, 0, sizeof(*o)); > + return o; > +} I'm not sure that I like this style of code (I prefer the strbuf style with _INIT and release(), which I think is more flexible) but I don't feel too strongly about it. > +struct object_parser { > + struct object **obj_hash; > + int nr_objs, obj_hash_size; > +}; object_parser is probably a bad name. I think Duy had some good suggestions in [1]. [1] https://public-inbox.org/git/CACsJy8CgX6BME=EhiDBfMRzBOYDBNHE6Ouxv4fZC-GW7Rsi81Q@mail.gmail.com/ > /* > - * Holds any information related to accessing the raw object content. > + * Holds any information needed to retrieve the raw content > + * of objects. The object_parser uses this to get object > + * content which it then parses. > */ > struct raw_object_store *objects; I think the additional sentence ("The object_parser uses this...") is unnecessary and confusing, especially if its identity is going to be one of storage (like "parsed_objects" implies). > + /* > + * State for the object parser. This owns all parsed objects > + * (struct object) so callers do not have to manage their > + * lifetime. > + */ > + struct object_parser *parsed_objects; Even after all the commits in this patch set, this does not store any state for parsing. Maybe just document as: All objects in this repository that have been parsed. This structure owns all objects it references, so users of "struct object *" generally do not need to free them; instead, when a repository is no longer used, call object_parser_clear() on this structure. (And maybe say that the freeing method on struct repository will automatically call object_parser_clear().)