From: Siddhesh Poyarekar
To: libc-alpha@sourceware.org
Subject: [PATCH] gitlog-to-changelog: Drop scripts in favour of gnulib version
Date: Fri, 17 Jan 2020 09:28:23 +0530
Message-Id: <20200117035823.90813-1-siddhesh@sourceware.org>

The ChangeLog automation scripts have been part of gnulib as
vcs-to-changelog for a while now, since other projects expressed a
desire to use and extend them. To avoid duplicating code, drop the
glibc version of gitlog-to-changelog and use the gnulib one directly.

The only file that remains is vcstocl_quirks.py, which specifies
properties and quirks of the glibc project source code. This patch
also drops the shebang at the start of vcstocl_quirks.py since the file
is not intended to be directly executable.
---
 scripts/gitlog_to_changelog.py                | 138 ---
 scripts/vcs_to_changelog/frontend_c.py        | 827 ------------------
 scripts/vcs_to_changelog/misc_util.py         |  51 --
 scripts/vcs_to_changelog/vcs_git.py           | 164 ----
 .../{vcs_to_changelog => }/vcstocl_quirks.py  |   1 -
 5 files changed, 1181 deletions(-)
 delete mode 100755 scripts/gitlog_to_changelog.py
 delete mode 100644 scripts/vcs_to_changelog/frontend_c.py
 delete mode 100644 scripts/vcs_to_changelog/misc_util.py
 delete mode 100644 scripts/vcs_to_changelog/vcs_git.py
 rename scripts/{vcs_to_changelog => }/vcstocl_quirks.py (99%)

diff --git a/scripts/gitlog_to_changelog.py b/scripts/gitlog_to_changelog.py
deleted file mode 100755
index b7920aaf99..0000000000
--- a/scripts/gitlog_to_changelog.py
+++ /dev/null
@@ -1,138 +0,0 @@
-#!/usr/bin/python3
-# Main VCSToChangeLog script.
-# Copyright (C) 2019-2020 Free Software Foundation, Inc.
-#
-# This program is free software: you can redistribute it and/or modify
-# it under the terms of the GNU General Public License as published by
-# the Free Software Foundation; either version 3 of the License, or
-# (at your option) any later version.
-#
-# This program is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-# GNU General Public License for more details.
-#
-# You should have received a copy of the GNU General Public License
-# along with this program. If not, see .
-
-''' Generate a ChangeLog style output based on a VCS log.
-
-This script takes two revisions as input and generates a ChangeLog style output
-for all revisions between the two revisions.
-
-This script is intended to be executed from the project parent directory.
-
-The vcs_to_changelog directory has a file vcstocl_quirks.py that defines a
-function called get_project_quirks that returns a object of class type
-ProjectQuirks or a subclass of the same. The definition of the ProjectQuirks
-class is below and it specifies the properties that the project must set to
-ensure correct parsing of its contents.
-
-Among other things, ProjectQurks specifies the VCS to read from; the default is
-assumed to be git. The script then studies the VCS log and for each change,
-list out the nature of changes in the constituent files.
-
-Each file type may have parser frontends that can read files and construct
-objects that may be compared to determine the minimal changes that occured in
-each revision. For files that do not have parsers, we may only know the nature
-of changes at the top level depending on the information that the VCS stores.
-
-The parser frontend must have a compare() method that takes the old and new
-files as arrays of strings and prints the output in ChangeLog format.
-
-Currently implemented VCS:
-
-    git
-
-Currently implemented frontends:
-
-    C
-'''
-import sys
-import os
-import re
-import argparse
-from vcs_to_changelog.misc_util import *
-from vcs_to_changelog import frontend_c
-from vcs_to_changelog.vcs_git import *
-
-debug = DebugUtil(False)
-
-class ProjectQuirks:
-    # This is a list of regex substitutions for C/C++ macros that are known to
-    # break parsing of the C programs. Each member of this list is a dict with
-    # the key 'orig' having the regex and 'sub' having the substitution of the
-    # regex.
-    MACRO_QUIRKS = []
-
-    # This is a list of macro definitions that are extensively used and are
-    # known to break parsing due to some characteristic, mainly the lack of a
-    # semicolon at the end.
-    C_MACROS = []
-
-    # The repo type, defaults to git.
-    repo = 'git'
-
-    # List of files to ignore either because they are not needed (such as the
-    # ChangeLog) or because they are non-parseable. For example, glibc has a
-    # header file that is only assembly code, which breaks the C parser.
-    IGNORE_LIST = ['ChangeLog']
-
-
-# Load quirks file. We assume that the script is run from the top level source
-# directory.
-sys.path.append('/'.join([os.getcwd(), 'scripts', 'vcs_to_changelog']))
-try:
-    from vcstocl_quirks import *
-    project_quirks = get_project_quirks(debug)
-except:
-    project_quirks = ProjectQuirks()
-
-def analyze_diff(filename, oldfile, newfile, frontends):
-    ''' Parse the output of the old and new files and print the difference.
-
-    For input files OLDFILE and NEWFILE with name FILENAME, generate reduced
-    trees for them and compare them. We limit our comparison to only C source
-    files.
-    '''
-    name, ext = os.path.splitext(filename)
-
-    if not ext in frontends.keys():
-        return None
-    else:
-        frontend = frontends[ext]
-        frontend.compare(oldfile, newfile)
-
-
-def main(repo, frontends, refs):
-    ''' ChangeLog Generator Entry Point.
-    '''
-    commits = repo.list_commits(args.refs)
-    for commit in commits:
-        repo.list_changes(commit, frontends)
-
-
-if __name__ == '__main__':
-    parser = argparse.ArgumentParser()
-
-    parser.add_argument('refs', metavar='ref', type=str, nargs=2,
-                        help='Refs to print ChangeLog entries between')
-
-    parser.add_argument('-d', '--debug', required=False, action='store_true',
-                        help='Run the file parser debugger.')
-
-    args = parser.parse_args()
-
-    debug.debug = args.debug
-
-    if len(args.refs) < 2:
-        debug.eprint('Two refs needed to get a ChangeLog.')
-        sys.exit(os.EX_USAGE)
-
-    REPO = {'git': GitRepo(project_quirks.IGNORE_LIST, debug)}
-
-    fe_c = frontend_c.Frontend(project_quirks, debug)
-    FRONTENDS = {'.c': fe_c,
-                 '.h': fe_c}
-
-    main(REPO[project_quirks.repo], FRONTENDS, args.refs)
diff --git a/scripts/vcs_to_changelog/frontend_c.py b/scripts/vcs_to_changelog/frontend_c.py
deleted file mode 100644
index 8e37c5fa47..0000000000
--- a/scripts/vcs_to_changelog/frontend_c.py
+++ /dev/null
@@ -1,827 +0,0 @@
-#!/usr/bin/python3
-# The C Parser.
-# Copyright (C) 2019-2020 Free Software Foundation, Inc.
-#
-# This program is free software: you can redistribute it and/or modify
-# it under the terms of the GNU General Public License as published by
-# the Free Software Foundation; either version 3 of the License, or
-# (at your option) any later version.
-#
-# This program is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-# GNU General Public License for more details.
-#
-# You should have received a copy of the GNU General Public License
-# along with this program. If not, see .
-
-from enum import Enum
-import re
-from vcs_to_changelog.misc_util import *
-
-class block_flags(Enum):
-    ''' Flags for the code block.
-    '''
-    else_block = 1
-    macro_defined = 2
-    macro_redefined = 3
-
-
-class block_type(Enum):
-    ''' Type of code block.
-    '''
-    file = 1
-    macro_cond = 2
-    macro_def = 3
-    macro_undef = 4
-    macro_include = 5
-    macro_info = 6
-    decl = 7
-    func = 8
-    composite = 9
-    macrocall = 10
-    fndecl = 11
-    assign = 12
-    struct = 13
-    union = 14
-    enum = 15
-
-# A dictionary describing what each action (add, modify, delete) show up as in
-# the ChangeLog output.
-actions = {0:{'new': 'New', 'mod': 'Modified', 'del': 'Remove'},
-           block_type.file:{'new': 'New file', 'mod': 'Modified file',
-                            'del': 'Remove file'},
-           block_type.macro_cond:{'new': 'New', 'mod': 'Modified',
-                                  'del': 'Remove'},
-           block_type.macro_def:{'new': 'New', 'mod': 'Modified',
-                                 'del': 'Remove'},
-           block_type.macro_include:{'new': 'Include file', 'mod': 'Modified',
-                                     'del': 'Remove include'},
-           block_type.macro_info:{'new': 'New preprocessor message',
-                                  'mod': 'Modified', 'del': 'Remove'},
-           block_type.decl:{'new': 'New', 'mod': 'Modified', 'del': 'Remove'},
-           block_type.func:{'new': 'New function', 'mod': 'Modified function',
-                            'del': 'Remove function'},
-           block_type.composite:{'new': 'New', 'mod': 'Modified',
-                                 'del': 'Remove'},
-           block_type.struct:{'new': 'New struct', 'mod': 'Modified struct',
-                              'del': 'Remove struct'},
-           block_type.union:{'new': 'New union', 'mod': 'Modified union',
-                             'del': 'Remove union'},
-           block_type.enum:{'new': 'New enum', 'mod': 'Modified enum',
-                            'del': 'Remove enum'},
-           block_type.macrocall:{'new': 'New', 'mod': 'Modified',
-                                 'del': 'Remove'},
-           block_type.fndecl:{'new': 'New function', 'mod': 'Modified',
-                              'del': 'Remove'},
-           block_type.assign:{'new': 'New', 'mod': 'Modified', 'del': 'Remove'}}
-
-def new_block(name, type, contents, parent, flags = 0):
-    ''' Create a new code block with the parent as PARENT.
-
-    The code block is a basic structure around which the tree representation of
-    the source code is built. It has the following attributes:
-
-    - name: A name to refer it by in the ChangeLog
-    - type: Any one of the following types in BLOCK_TYPE.
-    - contents: The contents of the block. For a block of types file or
-      macro_cond, this would be a list of blocks that it nests. For other types
-      it is a list with a single string specifying its contents.
-    - parent: This is the parent of the current block, useful in setting up
-      #elif or #else blocks in the tree.
-    - flags: A special field to indicate some properties of the block. See
-      BLOCK_FLAGS for values.
-    '''
-    block = {}
-    block['matched'] = False
-    block['name'] = name
-    block['type'] = type
-    block['contents'] = contents
-    block['parent'] = parent
-    if parent:
-        parent['contents'].append(block)
-
-    block['flags'] = flags
-    block['actions'] = actions[type]
-
-    return block
-
-
-class ExprParser:
-    ''' Parent class of all of the C expression parsers.
-
-    It is necessary that the children override the parse_line() method.
-    '''
-    ATTRIBUTE = r'(((__attribute__\s*\(\([^;]+\)\))|(asm\s*\([?)]+\)))\s*)*'
-
-    def __init__(self, project_quirks, debug):
-        self.project_quirks = project_quirks
-        self.debug = debug
-
-    def fast_forward_scope(self, cur, op, loc):
-        ''' Consume lines in a code block.
-
-        Consume all lines of a block of code such as a composite type declaration or
-        a function declaration.
-
-        - CUR is the string to consume this expression from
-        - OP is the string array for the file
-        - LOC is the first unread location in CUR
-
-        - Returns: The next location to be read in the array as well as the updated
-          value of CUR, which will now have the body of the function or composite
-          type.
-        '''
-        nesting = cur.count('{') - cur.count('}')
-        while nesting > 0 and loc < len(op):
-            cur = cur + ' ' + op[loc]
-
-            nesting = nesting + op[loc].count('{')
-            nesting = nesting - op[loc].count('}')
-            loc = loc + 1
-
-        return (cur, loc)
-
-    def parse_line(self, cur, op, loc, code, macros):
-        ''' The parse method should always be overridden by the child.
-        '''
-        raise
-
-
-class FuncParser(ExprParser):
-    REGEX = re.compile(ExprParser.ATTRIBUTE + r'\s*(\w+)\s*\([^(][^{]+\)\s*{')
-
-    def parse_line(self, cur, op, loc, code, macros):
-        ''' Parse a function.
-
-        Match a function definition.
-
-        - CUR is the string to consume this expression from
-        - OP is the string array for the file
-        - LOC is the first unread location in CUR
-        - CODE is the block to which we add this
-
-        - Returns: The next location to be read in the array.
-        '''
-        found = re.search(self.REGEX, cur)
-        if not found:
-            return cur, loc
-
-        name = found.group(5)
-        self.debug.print('FOUND FUNC: %s' % name)
-
-        # Consume everything up to the ending brace of the function.
-        (cur, loc) = self.fast_forward_scope(cur, op, loc)
-
-        new_block(name, block_type.func, [cur], code)
-
-        return '', loc
-
-
-class CompositeParser(ExprParser):
-    # Composite types such as structs and unions.
-    REGEX = re.compile(r'(struct|union|enum)\s*(\w*)\s*{')
-
-    def parse_line(self, cur, op, loc, code, macros):
-        ''' Parse a composite type.
-
-        Match declaration of a composite type such as a sruct or a union..
-
-        - CUR is the string to consume this expression from
-        - OP is the string array for the file
-        - LOC is the first unread location in CUR
-        - CODE is the block to which we add this
-
-        - Returns: The next location to be read in the array.
-        '''
-        found = re.search(self.REGEX, cur)
-        if not found:
-            return cur, loc
-
-        # Lap up all of the struct definition.
-        (cur, loc) = self.fast_forward_scope(cur, op, loc)
-
-        name = found.group(2)
-
-        if not name:
-            if 'typedef' in cur:
-                name = re.sub(r'.*}\s*(\w+);$', r'\1', cur)
-            else:
-                name= ''
-
-        ctype = found.group(1)
-
-        if ctype == 'struct':
-            blocktype = block_type.struct
-        if ctype == 'enum':
-            blocktype = block_type.enum
-        if ctype == 'union':
-            blocktype = block_type.union
-
-        new_block(name, block_type.composite, [cur], code)
-
-        return '', loc
-
-
-class AssignParser(ExprParser):
-    # Static assignments.
-    REGEX = re.compile(r'(\w+)\s*(\[[^\]]*\])*\s*([^\s]*attribute[\s\w()]+)?\s*=')
-
-    def parse_line(self, cur, op, loc, code, macros):
-        ''' Parse an assignment statement.
-
-        This includes array assignments.
-
-        - CUR is the string to consume this expression from
-        - OP is the string array for the file
-        - LOC is the first unread location in CUR
-        - CODE is the block to which we add this
-
-        - Returns: The next location to be read in the array.
-        '''
-        found = re.search(self.REGEX, cur)
-        if not found:
-            return cur, loc
-
-        name = found.group(1)
-        self.debug.print('FOUND ASSIGN: %s' % name)
-        # Lap up everything up to semicolon.
-        while ';' not in cur and loc < len(op):
-            cur = op[loc]
-            loc = loc + 1
-
-        new_block(name, block_type.assign, [cur], code)
-
-        return '', loc
-
-
-class DeclParser(ExprParser):
-    # Function pointer typedefs.
-    TYPEDEF_FN_RE = re.compile(r'\(\*(\w+)\)\s*\([^)]+\);')
-
-    # Simple decls.
-    DECL_RE = re.compile(r'(\w+)(\[\w*\])*\s*' + ExprParser.ATTRIBUTE + ';')
-
-    # __typeof decls.
-    TYPEOF_RE = re.compile(r'__typeof\s*\([\w\s]+\)\s*(\w+)\s*' + \
-                           ExprParser.ATTRIBUTE + ';')
-
-    # Function Declarations.
-    FNDECL_RE = re.compile(r'\s*(\w+)\s*\([^\(][^;]*\)\s*' +
-                           ExprParser.ATTRIBUTE + ';')
-
-    def __init__(self, regex, blocktype, project_quirks, debug):
-        # The regex for the current instance.
-        self.REGEX = regex
-        self.blocktype = blocktype
-        super().__init__(project_quirks, debug)
-
-    def parse_line(self, cur, op, loc, code, macros):
-        ''' Parse a top level declaration.
-
-        All types of declarations except function declarations.
-
-        - CUR is the string to consume this expression from
-        - OP is the string array for the file
-        - LOC is the first unread location in CUR
-        - CODE is the block to which we add this function
-
-        - Returns: The next location to be read in the array.
-        '''
-        found = re.search(self.REGEX, cur)
-        if not found:
-            return cur, loc
-
-        # The name is the first group for all of the above regexes. This is a
-        # coincidence, so care must be taken if regexes are added or changed to
-        # ensure that this is true.
-        name = found.group(1)
-
-        self.debug.print('FOUND DECL: %s' % name)
-        new_block(name, self.blocktype, [cur], code)
-
-        return '', loc
-
-
-class MacroParser(ExprParser):
-    # The macrocall_re peeks into the next line to ensure that it doesn't
-    # eat up a FUNC by accident. The func_re regex is also quite crude and
-    # only intends to ensure that the function name gets picked up
-    # correctly.
-    MACROCALL_RE = re.compile(r'(\w+)\s*(\(.*\))*$')
-
-    def parse_line(self, cur, op, loc, code, macros):
-        ''' Parse a macro call.
-
-        Match a symbol hack macro calls that get added without semicolons.
-
-        - CUR is the string to consume this expression from
-        - OP is the string array for the file
-        - LOC is the first unread location in CUR
-        - CODE is the block to which we add this
-        - MACROS is the regex match object.
-
-        - Returns: The next location to be read in the array.
-        '''
-
-        # First we have the macros for symbol hacks and all macros we identified so
-        # far.
-        if cur.count('(') != cur.count(')'):
-            return cur, loc
-        if loc < len(op) and '{' in op[loc]:
-            return cur, loc
-
-        found = re.search(self.MACROCALL_RE, cur)
-        if found:
-            sym = found.group(1)
-            name = found.group(2)
-            if sym in macros or self.project_quirks and \
-                    sym in self.project_quirks.C_MACROS:
-                self.debug.print('FOUND MACROCALL: %s (%s)' % (sym, name))
-                new_block(sym, block_type.macrocall, [cur], code)
-                return '', loc
-
-        # Next, there could be macros that get called right inside their #ifdef, but
-        # without the semi-colon.
-        if cur.strip() == code['name'].strip():
-            self.debug.print('FOUND MACROCALL (without brackets): %s' % (cur))
-            new_block(cur, block_type.macrocall, [cur], code)
-            return '',loc
-
-        return cur, loc
-
-
-class Frontend:
-    ''' The C Frontend implementation.
-    '''
-    KNOWN_MACROS = []
-
-    def __init__(self, project_quirks, debug):
-        self.op = []
-        self.debug = debug
-        self.project_quirks = project_quirks
-
-        self.c_expr_parsers = [
-                CompositeParser(project_quirks, debug),
-                AssignParser(project_quirks, debug),
-                DeclParser(DeclParser.TYPEOF_RE, block_type.decl,
-                           project_quirks, debug),
-                DeclParser(DeclParser.TYPEDEF_FN_RE, block_type.decl,
-                           project_quirks, debug),
-                DeclParser(DeclParser.FNDECL_RE, block_type.fndecl,
-                           project_quirks, debug),
-                FuncParser(project_quirks, debug),
-                DeclParser(DeclParser.DECL_RE, block_type.decl, project_quirks,
-                           debug),
-                MacroParser(project_quirks, debug)]
-
-
-    def remove_extern_c(self):
-        ''' Process extern "C"/"C++" block nesting.
-
-        The extern "C" nesting does not add much value so it's safe to almost always
-        drop it. Also drop extern "C++"
-        '''
-        new_op = []
-        nesting = 0
-        extern_nesting = 0
-        for l in self.op:
-            if '{' in l:
-                nesting = nesting + 1
-            if re.match(r'extern\s*"C"\s*{', l):
-                extern_nesting = nesting
-                continue
-            if '}' in l:
-                nesting = nesting - 1
-                if nesting < extern_nesting:
-                    extern_nesting = 0
-                    continue
-            new_op.append(l)
-
-        # Now drop all extern C++ blocks.
-        self.op = new_op
-        new_op = []
-        nesting = 0
-        extern_nesting = 0
-        in_cpp = False
-        for l in self.op:
-            if re.match(r'extern\s*"C\+\+"\s*{', l):
-                nesting = nesting + 1
-                in_cpp = True
-
-            if in_cpp:
-                if '{' in l:
-                    nesting = nesting + 1
-                if '}' in l:
-                    nesting = nesting - 1
-            if nesting == 0:
-                new_op.append(l)
-
-        self.op = new_op
-
-
-    def remove_comments(self, op):
-        ''' Remove comments.
-
-        Return OP by removing all comments from it.
-        '''
-        self.debug.print('REMOVE COMMENTS')
-
-        sep='\n'
-        opstr = sep.join(op)
-        opstr = re.sub(r'/\*.*?\*/', r'', opstr, flags=re.MULTILINE | re.DOTALL)
-        opstr = re.sub(r'\\\n', r' ', opstr, flags=re.MULTILINE | re.DOTALL)
-        new_op = list(filter(None, opstr.split(sep)))
-
-        return new_op
-
-
-    def normalize_condition(self, name):
-        ''' Make some minor transformations on macro conditions to make them more
-        readable.
-        '''
-        # Negation with a redundant bracket.
-        name = re.sub(r'!\s*\(\s*(\w+)\s*\)', r'! \1', name)
-        # Pull in negation of equality.
-        name = re.sub(r'!\s*\(\s*(\w+)\s*==\s*(\w+)\)', r'\1 != \2', name)
-        # Pull in negation of inequality.
-        name = re.sub(r'!\s*\(\s*(\w+)\s*!=\s*(\w+)\)', r'\1 == \2', name)
-        # Fix simple double negation.
-        name = re.sub(r'!\s*\(\s*!\s*(\w+)\s*\)', r'\1', name)
-        # Similar, but nesting a complex expression. Because of the greedy match,
-        # this matches only the outermost brackets.
-        name = re.sub(r'!\s*\(\s*!\s*\((.*)\)\s*\)$', r'\1', name)
-        return name
-
-
-    def parse_preprocessor(self, loc, code, start = ''):
-        ''' Parse a preprocessor directive.
-
-        In case a preprocessor condition (i.e. if/elif/else), create a new code
-        block to nest code into and in other cases, identify and add entities suchas
-        include files, defines, etc.
-
-        - OP is the string array for the file
-        - LOC is the first unread location in CUR
-        - CODE is the block to which we add this function
-        - START is the string that should continue to be expanded in case we step
-          into a new macro scope.
-
-        - Returns: The next location to be read in the array.
-        '''
-        cur = self.op[loc]
-        loc = loc + 1
-        endblock = False
-
-        self.debug.print('PARSE_MACRO: %s' % cur)
-
-        # Remove the # and strip spaces again.
-        cur = cur[1:].strip()
-
-        # Include file.
-        if cur.find('include') == 0:
-            m = re.search(r'include\s*["<]?([^">]+)[">]?', cur)
-            new_block(m.group(1), block_type.macro_include, [cur], code)
-
-        # Macro definition.
-        if cur.find('define') == 0:
-            m = re.search(r'define\s+([a-zA-Z0-9_]+)', cur)
-            name = m.group(1)
-            exists = False
-            # Find out if this is a redefinition.
-            for c in code['contents']:
-                if c['name'] == name and c['type'] == block_type.macro_def:
-                    c['flags'] = block_flags.macro_redefined
-                    exists = True
-                    break
-            if not exists:
-                new_block(m.group(1), block_type.macro_def, [cur], code,
-                          block_flags.macro_defined)
-                # Add macros as we encounter them.
-                self.KNOWN_MACROS.append(m.group(1))
-
-        # Macro undef.
-        if cur.find('undef') == 0:
-            m = re.search(r'undef\s+([a-zA-Z0-9_]+)', cur)
-            new_block(m.group(1), block_type.macro_def, [cur], code)
-
-        # #error and #warning macros.
-        if cur.find('error') == 0 or cur.find('warning') == 0:
-            m = re.search(r'(error|warning)\s+"?(.*)"?', cur)
-            if m:
-                name = m.group(2)
-            else:
-                name = ''
-            new_block(name, block_type.macro_info, [cur], code)
-
-        # Start of an #if or #ifdef block.
-        elif cur.find('if') == 0:
-            rem = re.sub(r'ifndef', r'!', cur).strip()
-            rem = re.sub(r'(ifdef|defined|if)', r'', rem).strip()
-            rem = self.normalize_condition(rem)
-            ifdef = new_block(rem, block_type.macro_cond, [], code)
-            ifdef['headcond'] = ifdef
-            ifdef['start'] = start
-            loc = self.parse_line(loc, ifdef, start)
-
-        # End the previous #if/#elif and begin a new block.
-        elif cur.find('elif') == 0 and code['parent']:
-            rem = self.normalize_condition(re.sub(r'(elif|defined)', r'', cur).strip())
-            # The #else and #elif blocks should go into the current block's parent.
-            ifdef = new_block(rem, block_type.macro_cond, [], code['parent'])
-            ifdef['headcond'] = code['headcond']
-            loc = self.parse_line(loc, ifdef, code['headcond']['start'])
-            endblock = True
-
-        # End the previous #if/#elif and begin a new block.
-        elif cur.find('else') == 0 and code['parent']:
-            name = self.normalize_condition('!(' + code['name'] + ')')
-            ifdef = new_block(name, block_type.macro_cond, [], code['parent'],
-                              block_flags.else_block)
-            ifdef['headcond'] = code['headcond']
-            loc = self.parse_line(loc, ifdef, code['headcond']['start'])
-            endblock = True
-
-        elif cur.find('endif') == 0 and code['parent']:
-            # Insert an empty else block if there isn't one.
-            if code['flags'] != block_flags.else_block:
-                name = self.normalize_condition('!(' + code['name'] + ')')
-                ifdef = new_block(name, block_type.macro_cond, [], code['parent'],
-                                  block_flags.else_block)
-                ifdef['headcond'] = code['headcond']
-                loc = self.parse_line(loc - 1, ifdef, code['headcond']['start'])
-            endblock = True
-
-        return (loc, endblock)
-
-
-    def parse_c_expr(self, cur, loc, code):
-        ''' Parse a C expression.
-
-        CUR is the string to be parsed, which continues to grow until a match is
-        found. OP is the string array and LOC is the first unread location in the
-        string array. CODE is the block in which any identified expressions should
-        be added.
-        '''
-        self.debug.print('PARSING: %s' % cur)
-
-        for p in self.c_expr_parsers:
-            cur, loc = p.parse_line(cur, self.op, loc, code, self.KNOWN_MACROS)
-            if not cur:
-                break
-
-        return cur, loc
-
-
-    def expand_problematic_macros(self, cur):
-        ''' Replace problem macros with their substitutes in CUR.
-        '''
-        for p in self.project_quirks.MACRO_QUIRKS:
-            cur = re.sub(p['orig'], p['sub'], cur)
-
-        return cur
-
-
-    def parse_line(self, loc, code, start = ''):
-        '''
-        Parse the file line by line. The function assumes a mostly GNU coding
-        standard compliant input so it might barf with anything that is eligible for
-        the Obfuscated C code contest.
-
-        The basic idea of the parser is to identify macro conditional scopes and
-        definitions, includes, etc. and then parse the remaining C code in the
-        context of those macro scopes. The parser does not try to understand the
-        semantics of the code or even validate its syntax. It only records high
-        level symbols in the source and makes a tree structure to indicate the
-        declaration/definition of those symbols and their scope in the macro
-        definitions.
-
-        OP is the string array.
-        LOC is the first unparsed line.
-        CODE is the block scope within which the parsing is currently going on.
-        START is the string with which this parsing should start.
-        '''
-        cur = start
-        endblock = False
-        saved_cur = ''
-        saved_loc = 0
-        endblock_loc = loc
-
-        while loc < len(self.op):
-            nextline = self.op[loc]
-
-            # Macros.
-            if nextline[0] == '#':
-                (loc, endblock) = self.parse_preprocessor(loc, code, cur)
-                if endblock:
-                    endblock_loc = loc
-            # Rest of C Code.
-            else:
-                cur = cur + ' ' + nextline
-                cur = self.expand_problematic_macros(cur).strip()
-                cur, loc = self.parse_c_expr(cur, loc + 1, code)
-
-            if endblock and not cur:
-                # If we are returning from the first #if block, we want to proceed
-                # beyond the current block, not repeat it for any preceding blocks.
-                if code['headcond'] == code:
-                    return loc
-                else:
-                    return endblock_loc
-
-        return loc
-
-    def drop_empty_blocks(self, tree):
-        ''' Drop empty macro conditional blocks.
-        '''
-        newcontents = []
-
-        for x in tree['contents']:
-            if x['type'] != block_type.macro_cond or len(x['contents']) > 0:
-                newcontents.append(x)
-
-        for t in newcontents:
-            if t['type'] == block_type.macro_cond:
-                self.drop_empty_blocks(t)
-
-        tree['contents'] = newcontents
-
-
-    def consolidate_tree_blocks(self, tree):
-        ''' Consolidate common macro conditional blocks.
-
-        Get macro conditional blocks at the same level but scatterred across the
-        file together into a single common block to allow for better comparison.
-        '''
-        # Nothing to do for non-nesting blocks.
-        if tree['type'] != block_type.macro_cond \
-                and tree['type'] != block_type.file:
-            return
-
-        # Now for nesting blocks, get the list of unique condition names and
-        # consolidate code under them. The result also bunches up all the
-        # conditions at the top.
-        newcontents = []
-
-        macros = [x for x in tree['contents'] \
-                  if x['type'] == block_type.macro_cond]
-        macro_names = sorted(set([x['name'] for x in macros]))
-        for m in macro_names:
-            nc = [x['contents'] for x in tree['contents'] if x['name'] == m \
-                  and x['type'] == block_type.macro_cond]
-            b = new_block(m, block_type.macro_cond, sum(nc, []), tree)
-            self.consolidate_tree_blocks(b)
-            newcontents.append(b)
-
-        newcontents.extend([x for x in tree['contents'] \
-                            if x['type'] != block_type.macro_cond])
-
-        tree['contents'] = newcontents
-
-
-    def compact_tree(self, tree):
-        ''' Try to reduce the tree to its minimal form.
-
-        A source code tree in its simplest form may have a lot of duplicated
-        information that may be difficult to compare and come up with a minimal
-        difference.
-        '''
-
-        # First, drop all empty blocks.
-        self.drop_empty_blocks(tree)
-
-        # Macro conditions that nest the entire file aren't very interesting. This
-        # should take care of the header guards.
-        if tree['type'] == block_type.file \
-                and len(tree['contents']) == 1 \
-                and tree['contents'][0]['type'] == block_type.macro_cond:
-            tree['contents'] = tree['contents'][0]['contents']
-
-        # Finally consolidate all macro conditional blocks.
-        self.consolidate_tree_blocks(tree)
-
-
-    def parse(self, op):
-        ''' File parser.
-
-        Parse the input array of lines OP and generate a tree structure to
-        represent the file. This tree structure is then used for comparison between
-        the old and new file.
-        '''
-        self.KNOWN_MACROS = []
-        tree = new_block('', block_type.file, [], None)
-        self.op = self.remove_comments(op)
-        self.remove_extern_c()
-        self.op = [re.sub(r'#\s+', '#', x) for x in self.op]
-        self.parse_line(0, tree)
-        self.compact_tree(tree)
-        self.dump_tree(tree, 0)
-
-        return tree
-
-
-    def print_change(self, tree, action, prologue = ''):
-        ''' Print the nature of the differences found in the tree compared to the
-        other tree. TREE is the tree that changed, action is what the change was
-        (Added, Removed, Modified) and prologue specifies the macro scope the change
-        is in. The function calls itself recursively for all macro condition tree
-        nodes.
-        '''
-
-        if tree['type'] != block_type.macro_cond:
-            print('\t%s(%s): %s.' % (prologue, tree['name'], action))
-            return
-
-        prologue = '%s[%s]' % (prologue, tree['name'])
-        for t in tree['contents']:
-            if t['type'] == block_type.macro_cond:
-                self.print_change(t, action, prologue)
-            else:
-                print('\t%s(%s): %s.' % (prologue, t['name'], action))
-
-
-    def compare_trees(self, left, right, prologue = ''):
-        ''' Compare two trees and print the difference.
-
-        This routine is the entry point to compare two trees and print out their
-        differences. LEFT and RIGHT will always have the same name and type,
-        starting with block_type.file and '' at the top level.
-        '''
-
-        if left['type'] == block_type.macro_cond or left['type'] == block_type.file:
-
-            if left['type'] == block_type.macro_cond:
-                prologue = '%s[%s]' % (prologue, left['name'])
-
-            # Make sure that everything in the left tree exists in the right tree.
-            for cl in left['contents']:
-                found = False
-                for cr in right['contents']:
-                    if not cl['matched'] and not cr['matched'] and \
-                            cl['name'] == cr['name'] and cl['type'] == cr['type']:
-                        cl['matched'] = cr['matched'] = True
-                        self.compare_trees(cl, cr, prologue)
-                        found = True
-                        break
-                if not found:
-                    self.print_change(cl, cl['actions']['del'], prologue)
-
-            # ... and vice versa. This time we only need to look at unmatched
-            # contents.
-            for cr in right['contents']:
-                if not cr['matched']:
-                    self.print_change(cr, cr['actions']['new'], prologue)
-        else:
-            if left['contents'] != right['contents']:
-                self.print_change(left, left['actions']['mod'], prologue)
-
-
-    def dump_tree(self, tree, indent):
-        ''' Print the entire tree.
-        '''
-        if not self.debug.debug:
-            return
-
-        if tree['type'] == block_type.macro_cond or tree['type'] == block_type.file:
-            print('%sScope: %s' % (' ' * indent, tree['name']))
-            for c in tree['contents']:
-                self.dump_tree(c, indent + 4)
-            print('%sEndScope: %s' % (' ' * indent, tree['name']))
-        else:
-            if tree['type'] == block_type.func:
-                print('%sFUNC: %s' % (' ' * indent, tree['name']))
-            elif tree['type'] == block_type.composite:
-                print('%sCOMPOSITE: %s' % (' ' * indent, tree['name']))
-            elif tree['type'] == block_type.assign:
-                print('%sASSIGN: %s' % (' ' * indent, tree['name']))
-            elif tree['type'] == block_type.fndecl:
-                print('%sFNDECL: %s' % (' ' * indent, tree['name']))
-            elif tree['type'] == block_type.decl:
-                print('%sDECL: %s' % (' ' * indent, tree['name']))
-            elif tree['type'] == block_type.macrocall:
-                print('%sMACROCALL: %s' % (' ' * indent, tree['name']))
-            elif tree['type'] == block_type.macro_def:
-                print('%sDEFINE: %s' % (' ' * indent, tree['name']))
-            elif tree['type'] == block_type.macro_include:
-                print('%sINCLUDE: %s' % (' ' * indent, tree['name']))
-            elif tree['type'] == block_type.macro_undef:
-                print('%sUNDEF: %s' % (' ' * indent, tree['name']))
-            else:
-                print('%sMACRO LEAF: %s' % (' ' * indent, tree['name']))
-
-
-    def compare(self, oldfile, newfile):
-        ''' Entry point for the C backend.
-
-        Parse the two files into trees and compare them. Print the result of the
-        comparison in the ChangeLog-like format.
-        '''
-        self.debug.print('LEFT TREE')
-        self.debug.print('-' * 80)
-        left = self.parse(oldfile)
-
-        self.debug.print('RIGHT TREE')
-        self.debug.print('-' * 80)
-        right = self.parse(newfile)
-
-        self.compare_trees(left, right)
diff --git a/scripts/vcs_to_changelog/misc_util.py b/scripts/vcs_to_changelog/misc_util.py
deleted file mode 100644
index cce68ba71d..0000000000
--- a/scripts/vcs_to_changelog/misc_util.py
+++ /dev/null
@@ -1,51 +0,0 @@
-# General Utility functions.
-# Copyright (C) 2019-2020 Free Software Foundation, Inc.
-#
-# This program is free software: you can redistribute it and/or modify
-# it under the terms of the GNU General Public License as published by
-# the Free Software Foundation; either version 3 of the License, or
-# (at your option) any later version.
-#
-# This program is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-# GNU General Public License for more details.
-#
-# You should have received a copy of the GNU General Public License
-# along with this program. If not, see <https://www.gnu.org/licenses/>.
-
-import sys
-
-class DebugUtil:
-    debug = False
-    def __init__(self, debug):
-        self.debug = debug
-
-    def eprint(self, *args, **kwargs):
-        ''' Print to stderr.
-        '''
-        print(*args, file=sys.stderr, **kwargs)
-
-
-    def print(self, *args, **kwargs):
-        ''' Convenience function to print diagnostic information in the program.
-        '''
-        if self.debug:
-            self.eprint(*args, **kwargs)
-
-
-def decode(string):
-    ''' Attempt to decode a string.
-
-    Decode a string read from the source file. The multiple attempts are needed
-    due to the presence of the page break characters and some tests in locales.
-    '''
-    codecs = ['utf8', 'cp1252']
-
-    for i in codecs:
-        try:
-            return string.decode(i)
-        except UnicodeDecodeError:
-            pass
-
-    DebugUtil.eprint('Failed to decode: %s' % string)
diff --git a/scripts/vcs_to_changelog/vcs_git.py b/scripts/vcs_to_changelog/vcs_git.py
deleted file mode 100644
index 575d6d987e..0000000000
--- a/scripts/vcs_to_changelog/vcs_git.py
+++ /dev/null
@@ -1,164 +0,0 @@
-# Git repo support.
-# Copyright (C) 2019-2020 Free Software Foundation, Inc.
-#
-# This program is free software: you can redistribute it and/or modify
-# it under the terms of the GNU General Public License as published by
-# the Free Software Foundation; either version 3 of the License, or
-# (at your option) any later version.
-#
-# This program is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-# GNU General Public License for more details.
-#
-# You should have received a copy of the GNU General Public License
-# along with this program. If not, see <https://www.gnu.org/licenses/>.
-
-from gitlog_to_changelog import analyze_diff
-import subprocess
-import re
-from misc_util import *
-
-class GitRepo:
-    def __init__(self, ignore_list, debug):
-        self.ignore_list = ignore_list
-        self.debug = debug
-
-
-    def exec_git_cmd(self, args):
-        ''' Execute a git command and return its result as a list of strings.
-        '''
-        args.insert(0, 'git')
-        self.debug.print(args)
-        proc = subprocess.Popen(args, stdout=subprocess.PIPE)
-
-        # Clean up the output by removing trailing spaces, newlines and dropping
-        # blank lines.
-        op = [decode(x[:-1]).strip() for x in proc.stdout]
-        op = [re.sub(r'[\s\f]+', ' ', x) for x in op]
-        op = [x for x in op if x]
-        return op
-
-
-    def list_changes(self, commit, frontends):
-        ''' List changes in a single commit.
-
-        For the input commit id COMMIT, identify the files that have changed and the
-        nature of their changes. Print commit information in the ChangeLog format,
-        calling into helper functions as necessary.
-        '''
-
-        op = self.exec_git_cmd(['show', '--pretty=fuller', '--date=short',
-                                '--raw', commit])
-        authors = []
-        date = ''
-        merge = False
-        copyright_exempt=''
-        subject= ''
-
-        for l in op:
-            if l.lower().find('copyright-paperwork-exempt:') == 0 \
-                    and 'yes' in l.lower():
-                copyright_exempt=' (tiny change)'
-            elif l.lower().find('co-authored-by:') == 0 or \
-                    l.find('Author:') == 0:
-                author = l.split(':')[1]
-                author = re.sub(r'([^ ]*)\s*(<.*)', r'\1 \2', author.strip())
-                authors.append(author)
-            elif l.find('CommitDate:') == 0:
-                date = l[11:].strip()
-            elif l.find('Merge:') == 0:
-                merge = True
-            elif not subject and date:
-                subject = l.strip()
-
-        # Find raw commit information for all non-ChangeLog files.
-        op = [x[1:] for x in op if len(x) > 0 and re.match(r'^:[0-9]+', x)]
-
-        # Skip all ignored files.
-        for ign in self.ignore_list:
-            op = [x for x in op if ign not in x]
-
-        # It was only the ChangeLog, ignore.
-        if len(op) == 0:
-            return
-
-        print('%s %s' % (date, authors[0]))
-
-        if (len(authors) > 1):
-            authors = authors[1:]
-            for author in authors:
-                print(' %s' % author)
-
-        print()
-
-        if merge:
-            print('\t MERGE COMMIT: %s\n' % commit)
-            return
-
-        print('\tCOMMIT%s: %s\n\t%s\n' % (copyright_exempt, commit, subject))
-
-        # Changes across a large number of files are typically mechanical (URL
-        # updates, copyright notice changes, etc.) and likely not interesting
-        # enough to produce a detailed ChangeLog entry.
-        if len(op) > 100:
-            print('\t* Suppressing diff as too many files differ.\n')
-            return
-
-        # Each of these lines has a space separated format like so:
-        # :<OLD MODE> <NEW MODE> <OLD REF> <NEW REF> <OPERATION> <FILE1> <FILE2>
-        #
-        # where OPERATION can be one of the following:
-        # A: File added
-        # D: File removed
-        # M[0-9]{3}: File modified
-        # R[0-9]{3}: File renamed, with the 3 digit number following it indicating
-        # what percentage of the file is intact.
-        # C[0-9]{3}: File copied. Same semantics as R.
-        # T: The permission bits of the file changed
-        # U: Unmerged. We should not encounter this, so we ignore it/
-        # X, or anything else: Most likely a bug. Report it.
-        #
-        # FILE2 is set only when OPERATION is R or C, to indicate the new file name.
-        #
-        # Also note that merge commits have a different format here, with three
-        # entries each for the modes and refs, but we don't bother with it for now.
-        #
-        # For more details: https://git-scm.com/docs/diff-format
-        for f in op:
-            data = f.split()
-            if data[4] == 'A':
-                print('\t* %s: New file.' % data[5])
-            elif data[4] == 'D':
-                print('\t* %s: Delete file.' % data[5])
-            elif data[4] == 'T':
-                print('\t* %s: Changed file permission bits from %s to %s' % \
-                        (data[5], data[0], data[1]))
-            elif data[4][0] == 'M':
-                print('\t* %s: Modified.' % data[5])
-                analyze_diff(data[5],
-                        self.exec_git_cmd(['show', data[2]]),
-                        self.exec_git_cmd(['show', data[3]]), frontends)
-            elif data[4][0] == 'R' or data[4][0] == 'C':
-                change = int(data[4][1:])
-                print('\t* %s: Move to...' % data[5])
-                print('\t* %s: ... here.' % data[6])
-                if change < 100:
-                    analyze_diff(data[6],
-                            self.exec_git_cmd(['show', data[2]]),
-                            self.exec_git_cmd(['show', data[3]]), frontends)
-            # We should never encounter this, so ignore for now.
-            elif data[4] == 'U':
-                pass
-            else:
-                eprint('%s: Unknown line format %s' % (commit, data[4]))
-                sys.exit(42)
-
-        print('')
-
-
-    def list_commits(self, revs):
-        ''' List commit IDs between the two revs in the REVS list.
-        '''
-        ref = revs[0] + '..' + revs[1]
-        return self.exec_git_cmd(['log', '--pretty=%H', ref])
diff --git a/scripts/vcs_to_changelog/vcstocl_quirks.py b/scripts/vcstocl_quirks.py
similarity index 99%
rename from scripts/vcs_to_changelog/vcstocl_quirks.py
rename to scripts/vcstocl_quirks.py
index 0e611ffd7d..d73a586317 100644
--- a/scripts/vcs_to_changelog/vcstocl_quirks.py
+++ b/scripts/vcstocl_quirks.py
@@ -1,4 +1,3 @@
-#!/usr/bin/python3
 # VCSToChangeLog Quirks for the GNU C Library.
 
 # Copyright (C) 2019-2020 Free Software Foundation, Inc.
-- 
2.24.1