From mboxrd@z Thu Jan 1 00:00:00 1970 From: Yann Dirson Subject: Re: [PATCH 1/6] Introduce sorted-array binary-search function. Date: Thu, 30 Dec 2010 01:40:27 +0100 Message-ID: <20101230004027.GB6639@home.lan> References: <1291848695-24601-1-git-send-email-ydirson@altern.org> <1291848695-24601-2-git-send-email-ydirson@altern.org> <7vwrnhb6tm.fsf@alter.siamese.dyndns.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: git@vger.kernel.org To: Junio C Hamano X-From: git-owner@vger.kernel.org Thu Dec 30 01:40:41 2010 Return-path: Envelope-to: gcvg-git-2@lo.gmane.org Received: from vger.kernel.org ([209.132.180.67]) by lo.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1PY6ZM-0004zA-Vn for gcvg-git-2@lo.gmane.org; Thu, 30 Dec 2010 01:40:41 +0100 Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754224Ab0L3Akg (ORCPT ); Wed, 29 Dec 2010 19:40:36 -0500 Received: from smtp5-g21.free.fr ([212.27.42.5]:42758 "EHLO smtp5-g21.free.fr" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753182Ab0L3Akf (ORCPT ); Wed, 29 Dec 2010 19:40:35 -0500 Received: from home.lan (unknown [81.57.214.146]) by smtp5-g21.free.fr (Postfix) with ESMTP id C7F90D4807B; Thu, 30 Dec 2010 01:40:28 +0100 (CET) Received: from yann by home.lan with local (Exim 4.72) (envelope-from ) id 1PY6Z9-0002dV-L8; Thu, 30 Dec 2010 01:40:27 +0100 Content-Disposition: inline In-Reply-To: <7vwrnhb6tm.fsf@alter.siamese.dyndns.org> User-Agent: Mutt/1.5.20 (2009-06-14) Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Archived-At: On Fri, Dec 10, 2010 at 02:29:09PM -0800, Junio C Hamano wrote: > Yann Dirson writes: > > > +Suffix meanings are as follows: > > + > > +`check`:: > > +... > > +* those defining the generic algorithms > > Yuck. > > All of these feel way overengineered and at the same time too rigid and > brittle. I confess a natural tendency towards overengineering stuff, but more iterations usually help :). Indeed, most of the overengineered-looking features added in the latest iterations are things I already needed for the dir-rename series. About rigidity, well, it is already less rigid than a couple of the hardcoded implementations we have throughout the source, and as such could be seen as an iterative step towards someting better - as I mentionned, my primary intent was to just get things useful for that dir-rename series, and at least on this point, I think the approach in this series is not so bad. It also looks generic enough to get the string-list lookup/insert funcs use it, and this again does not look so bad to me, compared to the stretching of the string-list API itself that would be required to make it useful to the dir-rename stuff. As for the "brittle" aspect, I'm not sure there is anything unfixable. At the very least, more parentheses around the expnasion of macro args would help making things more robust. > I have a suspicion that the "convenience" macros that generate many > functions and definitions are the main culprit. Hm. I was reluctant at first to use so many macros for an API, fearing it would confuse people (that resulted in the v5 API: genericity made it too verbose to be useful). Then I saw how many entry-points the linked-list API of the kernel has: that made me reconsider, and I found the result much more usable, although I had to write more doc. > For example, why do all the functions generated by a "convenience" > macro must share the same MAYBESTATIC? The intention was that MAYBESTATIC only gets applied to the explicitely-requested function, and that the transparently-generated ones get "static". Only declare_sorted_array_search_check() does not conform to that, that's a bug. That also ought to be documented in the API. > "binsearch" takes a comparison function pointer, and always > picks the midpoint, but what is the performance implication if we wanted > to use sorted-array.h to rewrite say sha1-lookup.c? As I noted in the cover letter, I consider sha1-lookup.c too special. What it is not doing is not a conventional binary search, and this looks out of scope. > How can an API user who wants to use > declare_sorted_array_insert_checkbook() easily figure out what other > macros fromt this family can be used without getting the same thing > generated twice? I have tried to make this explicit in the API reference, but things can surely be made more clear by including examples. > If somebody wanted to have a sorted array in a > struct, it may be tempting to use declare_sorted_array() with an empty > MAYBESTATIC inside struct's field declaration (even when the struct itself > is static---which leaves a queasy feeling, but that is a separate issue), > and the _current_ macro definition of declare_sorted_array() may allow > such a usage work perfectly fine, but how can such an API user be rest > assured it won't break in later revisions of these macros? That would only work by side-effect. We can either decide to document it and make it part of the API if we feel it's worth it, or explicitely state it is not supported (or enforce it: defining _nr and _alloc with =0 initializers would be a fairly good way of catching such attempts). > In addition, these macros in this patch are almost unreadable, but that > probably is mostly a fault of C's macro, not yours. Yes. When writing those I sorely missed the readability of C++ templates - yuck :) -- Yann