From: Carlos O'Donell via Libc-alpha
Reply-To: Carlos O'Donell
To: Siddhesh Poyarekar, libc-alpha@sourceware.org
Cc: joseph@codesourcery.com
Subject: Re: [PATCH v4 1/2] Port shared code information from the wiki
Date: Mon, 30 Aug 2021 17:20:23 -0400
Organization: Red Hat
List-Id: Libc-alpha mailing list
References: <20210823025410.590471-1-siddhesh@sourceware.org> <20210823025410.590471-2-siddhesh@sourceware.org>
In-Reply-To: <20210823025410.590471-2-siddhesh@sourceware.org>

On 8/22/21 10:54 PM, Siddhesh Poyarekar via Libc-alpha wrote:
> Since the shared code now has special status with respect to
> copyrights, port them into a more structured format in the source tree
> and add a python function that parses and returns a dictionary with
> the information.

Perfect. Thanks for doing this, I think a more structured approach like this
is important.

Please send v5 with mktime-internal.h fix.

> I need this to exclude these files from the Contributed-by changes and
> I reckon it would be useful to know these files for future tooling.
> ---
>  SHARED-FILES                 | 206 +++++++++++++++++++++++++++++++++++
>  scripts/glibc_shared_code.py |  70 ++++++++++++
>  2 files changed, 276 insertions(+)
>  create mode 100644 SHARED-FILES
>  create mode 100644 scripts/glibc_shared_code.py
>
> diff --git a/SHARED-FILES b/SHARED-FILES
> new file mode 100644
> index 0000000000..d1c4fc4eeb
> --- /dev/null
> +++ b/SHARED-FILES
> @@ -0,0 +1,206 @@
> +# Files shared with other projects. Pass a file path to the
> +# get_glibc_shared_code() function in the python library
> +# scripts/glibc_shared_code.py to get a dict object with this information. See
> +# the library sources for more information.
> +
> +# The headers on most of these files indicate that glibc is the canonical
> +# source for these files, although in many cases there seem to be useful
> +# changes in the gnulib versions that could be merged back in. Not all gnulib
> +# files contain such a header and it is not always consistent in its format, so
> +# it would be useful to make sure that all gnulib files that are using glibc as
> +# upstream have a greppable header.
> +#
> +# These files are quite hard to find without a header to grep for and each file
> +# has to be compared manually so this list is likely incomplete or may contain
> +# errors.
> +gnulib:
> +  argp/argp-ba.c
> +  argp/argp-ba.c
> +  argp/argp-eexst.c
> +  argp/argp-fmtstream.c
> +  argp/argp-fmtstream.h
> +  argp/argp-fs-xinl.c
> +  argp/argp-help.c
> +  argp/argp-namefrob.h
> +  argp/argp-parse.c
> +  argp/argp-pv.c
> +  argp/argp-pvh.c
> +  argp/argp-xinl.c
> +  argp/argp.h
> +  crypt/md5.c
> +  crypt/md5.h
> +  dirent/alphasort.c
> +  dirent/scandir.c
> +  locale/programs/3level.h
> +  # Merged from gnulib 2014-6-23
> +  malloc/obstack.c
> +  # Merged from gnulib 2014-6-23
> +  malloc/obstack.h
> +  # Merged from gnulib 2014-07-10
> +  misc/error.c
> +  misc/error.h
> +  misc/getpass.c
> +  misc/mkdtemp.c
> +  posix/fnmatch_loop.c
> +  # Intended to be the same. Gnulib copy contains glibc changes.
> +  posix/getopt.c
> +  # Intended to be the same. Gnulib copy contains glibc changes.
> +  posix/getopt1.c
> +  # Intended to be the same. Gnulib copy contains glibc changes.
> +  posix/getopt_int.h
> +  posix/glob.c
> +  posix/regcomp.c
> +  posix/regex.c
> +  posix/regex.h
> +  posix/regex_internal.c
> +  posix/regex_internal.h
> +  posix/regexec.c
> +  posix/spawn.c
> +  posix/spawn_faction_addclose.c
> +  posix/spawn_faction_adddup2.c
> +  posix/spawn_faction_addopen.c
> +  posix/spawn_faction_destroy.c
> +  posix/spawn_faction_init.c
> +  posix/spawn_int.h
> +  posix/spawnattr_destroy.c
> +  posix/spawnattr_getdefault.c
> +  posix/spawnattr_getflags.c
> +  posix/spawnattr_getpgroup.c
> +  posix/spawnattr_getschedparam.c
> +  posix/spawnattr_getschedpolicy.c
> +  posix/spawnattr_getsigmask.c
> +  posix/spawnattr_init.c
> +  posix/spawnattr_setdefault.c
> +  posix/spawnattr_setflags.c
> +  posix/spawnattr_setpgroup.c
> +  posix/spawnattr_setschedparam.c
> +  posix/spawnattr_setschedpolicy.c
> +  posix/spawnattr_setsigmask.c
> +  posix/spawnp.c
> +  stdlib/atoll.c
> +  stdlib/getsubopt.c
> +  stdlib/setenv.c
> +  stdlib/strtoll.c
> +  stdlib/strtoul.c
> +  # Merged from gnulib 2014-6-26, needs merge back
> +  string/memchr.c
> +  string/memcmp.c
> +  string/memmem.c
> +  string/mempcpy.c
> +  string/memrchr.c
> +  string/rawmemchr.c
> +  string/stpcpy.c
> +  string/stpncpy.c
> +  string/str-two-way.h
> +  string/strcasestr.c
> +  string/strcspn.c
> +  string/strdup.c
> +  string/strndup.c
> +  string/strpbrk.c
> +  string/strsignal.c
> +  string/strstr.c
> +  string/strtok_r.c
> +  string/strverscmp.c
> +  sysdeps/generic/pty-private.h
> +  sysdeps/generic/siglist.h
> +  sysdeps/posix/euidaccess.c
> +  sysdeps/posix/gai_strerror.c
> +  sysdeps/posix/getcwd.c
> +  sysdeps/posix/pwrite.c
> +  sysdeps/posix/spawni.c
> +  # Merged from gnulib 2014-6-23
> +  sysdeps/posix/tempname.c
> +  # Merged from gnulib 2014-6-27
> +  time/mktime.c
> +  time/strptime.c
> +  time/timegm.c

Missing: time/mktime-internal.h

I cross-checked with gnulib's srclist.txt.

> +
> +# The last merge was 2014-12-11 and merged gettext 0.19.3 into glibc with a
> +# patch submitted to the gettext mailing list for changes that could be merged
> +# back.
> +#
> +# This commit was omitted from the merge as it does not appear to be compatible
> +# with how glibc expects things to work:
> +#
> +# commit 279b57fc367251666f00e8e2b599b83703451afb
> +# Author: Bruno Haible
> +# Date: Fri Jun 14 12:03:49 2002 +0000
> +#
> +# Make absolute pathnames inside $LANGUAGE work.
> +gettext:
> +  intl/bindtextdom.c
> +  intl/dcgettext.c
> +  intl/dcigettext.c
> +  intl/dcngettext.c
> +  intl/dgettext.c
> +  intl/dngettext.c
> +  intl/explodename.c
> +  intl/finddomain.c
> +  intl/gettext.c
> +  intl/gettextP.h
> +  intl/gmo.h
> +  intl/hash-string.c
> +  intl/hash-string.h
> +  intl/l10nflist.c
> +  intl/loadinfo.h
> +  intl/loadmsgcat.c
> +  intl/locale.alias
> +  intl/localealias.c
> +  intl/ngettext.c
> +  intl/plural-exp.c
> +  intl/plural-exp.h
> +  intl/plural.y
> +  intl/textdomain.c
> +
> +# The following files are shared with the upstream Unicode project and must be
> +# updated regularly to stay in sync with the upstream unicode releases.
> +#
> +# Merged from Unicode 13.0.0 release.
> +unicode:
> +  localedata/unicode-gen/UnicodeData.txt

OK.

> +  localedata/unicode-gen/unicode-license.txt

This is correct, but I see we don't download this file with each update.
I've reached out to Steve Loomis to see how CLDR got this file too.
I think we should be downloading it from Unicode when we do an update.

> +  localedata/unicode-gen/DerivedCoreProperties.txt
> +  localedata/unicode-gen/EastAsianWidth.txt
> +  localedata/unicode-gen/PropList.txt

OK.

> +
> +# The following files are shared with the upstream tzcode project and must be
> +# updated regularly to stay in sync with the upstream releases.
> +#
> +# Update from tzcode 2017b.
> +# Latest is 2018g:
> +# https://mm.icann.org/pipermail/tz-announce/2018-October/000052.html
> +tzcode:
> +  timezone/private.h
> +  timezone/tzfile.h
> +  timezone/zdump.c
> +  timezone/zic.c
> +  timezone/tzselect.ksh
> +
> +# The following files are shared with the upstream tzdata project but is not
> +# synchronized regularly. The data files themselves are used only for testing
> +# purposes and their data is never used to generate any output. We synchronize
> +# them only to stay on top of newer data that might help with testing.
> +#
> +# Currently synced to 2009i. Latest is 2018g.
> +# https://mm.icann.org/pipermail/tz-announce/2018-October/000052.html
> +tzdata:
> +  timezone/africa
> +  timezone/antarctica
> +  timezone/asia
> +  timezone/australasia
> +  timezone/europe
> +  timezone/northamerica
> +  timezone/southamerica
> +  timezone/pacificnew
> +  timezone/etcetera
> +  timezone/factory
> +  timezone/backward
> +  timezone/systemv
> +  timezone/solar87
> +  timezone/solar88
> +  timezone/solar89
> +  timezone/iso3166.tab
> +  timezone/zone.tab
> +  timezone/leapseconds
> +  # This is yearistype.sh in the parent project
> +  timezone/yearistype
> diff --git a/scripts/glibc_shared_code.py b/scripts/glibc_shared_code.py
> new file mode 100644
> index 0000000000..873a26117f
> --- /dev/null
> +++ b/scripts/glibc_shared_code.py
> @@ -0,0 +1,70 @@
> +#!/usr/bin/python
> +# Copyright (C) 2021 Free Software Foundation, Inc.
> +# This file is part of the GNU C Library.
> +#
> +# The GNU C Library is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU Lesser General Public
> +# License as published by the Free Software Foundation; either
> +# version 2.1 of the License, or (at your option) any later version.
> +#
> +# The GNU C Library is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> +# Lesser General Public License for more details.
> +#
> +# You should have received a copy of the GNU Lesser General Public
> +# License along with the GNU C Library; if not, see
> +# .
> +
> +def get_glibc_shared_code(path):
> +    """ Get glibc shared code information from a file
> +
> +    The input file must have project names in their own line ending with a colon
> +    and all shared files in the project on their own lines following the project
> +    name. Whitespaces are ignored. Lines with # as the first non-whitespace
> +    character are ignored.
> +
> +    Args:
> +        path: The path to file containing shared code information.
> +
> +    Returns:
> +        A dictionary with project names as key and lists of files as values.
> +    """
> +
> +    projects = {}
> +    with open(path, 'r') as f:
> +        for line in f.readlines():
> +            line = line.strip()
> +            if len(line) == 0 or line[0] == '#':
> +                continue
> +            if line[-1] == ':':
> +                cur = line[:-1]
> +                projects[cur] = []
> +            else:
> +                projects[cur].append(line)
> +
> +    return projects
> +
> +# Function testing.
> +import sys
> +from os import EX_NOINPUT
> +from os.path import exists
> +from pprint import *
> +
> +if __name__ == '__main__':
> +    if len(sys.argv) != 2:
> +        print('Usage: %s ' % sys.argv[0])
> +        print('Run this script from the base glibc source directory')
> +        sys.exit(EX_NOINPUT)
> +
> +    print('Testing get_glibc_shared_code with %s:\n' % sys.argv[1])
> +    r = get_glibc_shared_code(sys.argv[1])
> +    errors = False
> +    for k in r.keys():
> +        for f in r[k]:
> +            if not exists(f):
> +                print('%s does not exist' % f)
> +                errors = True
> +
> +    if not errors:
> +        pprint(r)
>

OK.

--
Cheers,
Carlos.