From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 13 Jul 2021 08:04:36 -0700
From: "H.J. Lu via Libc-alpha"
Reply-To: "H.J. Lu"
To: "Wang, Pengfei"
Cc: "llvm-dev@lists.llvm.org", GNU C Library, GCC Patches, IA32 System V Application Binary Interface, Joseph Myers
Subject: Re: [llvm-dev] [PATCH] Add optional _Float16 support
References: <20210701210537.51272-1-hjl.tools@gmail.com>
List-Id: Libc-alpha mailing list

On Tue, Jul 13, 2021 at 7:48 AM Wang, Pengfei wrote:
>
> Hi H.J.,
>
> Our LLVM implementation currently use %xmm0 for both _Complex's real part and imaginary part. Do we have special reason to use two registers?
> We are using one register on X64. Considering the performance, especially the register pressure, should it be better to use one register for _Complex _Float16 on 32 bits target?

The x86-64 psABI is unrelated to the i386 psABI. Using a pair of registers
is more natural for complex _Float16.
Since it is only used for function return values, I don't think there is
a register pressure issue.

> Thanks
> Pengfei
>
> -----Original Message-----
> From: H.J. Lu
> Sent: Tuesday, July 13, 2021 10:26 PM
> To: Wang, Pengfei; llvm-dev@lists.llvm.org
> Cc: Joseph Myers; GCC Patches; GNU C Library; IA32 System V Application Binary Interface
> Subject: Re: [llvm-dev] [PATCH] Add optional _Float16 support
>
> On Mon, Jul 12, 2021 at 8:59 PM Wang, Pengfei wrote:
> >
> > > Return _Float16 and _Complex _Float16 values in %xmm0/%xmm1 registers.
> >
> > Can you please explain the behavior here? Is there difference between
> > _Float16 and _Complex _Float16 when return? I.e.,
> > 1, In which case will _Float16 values return in both %xmm0 and %xmm1?
> > 2, For a single _Float16 value, are both real part and imaginary part returned in %xmm0? Or returned in %xmm0 and %xmm1 respectively?
>
> Here is the v2 patch to add the missing _Float16 bits. The PDF file is at
>
> https://gitlab.com/x86-psABIs/i386-ABI/-/wikis/Intel386-psABI
>
> > Thanks
> > Pengfei
> >
> > -----Original Message-----
> > From: llvm-dev On Behalf Of H.J. Lu via llvm-dev
> > Sent: Friday, July 2, 2021 6:28 AM
> > To: Joseph Myers
> > Cc: llvm-dev@lists.llvm.org; GCC Patches; GNU C Library; IA32 System V Application Binary Interface
> > Subject: Re: [llvm-dev] [PATCH] Add optional _Float16 support
> >
> > On Thu, Jul 1, 2021 at 3:10 PM Joseph Myers wrote:
> > >
> > > On Thu, 1 Jul 2021, H.J. Lu via Gcc-patches wrote:
> > >
> > > > 2. Return _Float16 and _Complex _Float16 values in %xmm0/%xmm1 registers.
> > >
> > > That restricts use of _Float16 to processors with SSE. Is that what
> > > we want in the ABI, or should _Float16 be available with base 32-bit
> > > x86 architecture features only, much like _Float128 and the decimal
> > > FP types
> >
> > Yes, _Float16 requires XMM registers.
> >
> > > are?
> > > (If it is restricted to SSE, we can of course ensure relevant
> > > libgcc functions are built with SSE enabled, and likewise in glibc
> > > if that gains
> > > _Float16 functions, though maybe with some extra complications to
> > > get relevant testcases to run whenever possible.)
> > >
> >
> > _Float16 functions in libgcc should be compiled with SSE enabled.
> >
> > BTW, _Float16 software emulation may require more than just SSE since we need to do _Float16 load and store with XMM registers.
> > There is no 16bit load/store for XMM registers without AVX512FP16.
> >
> > --
> > H.J.
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev@lists.llvm.org
> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >
>
> --
> H.J.

--
H.J.
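
To illustrate the software-emulation remark in the quoted text: the following is only a rough sketch, not what libgcc or glibc actually does. It assumes the 16-bit value is loaded through an ordinary integer load and decoded as IEEE 754 binary16 with integer operations, so no XMM load/store is involved. The names load_half_bits and half_bits_to_float are hypothetical and exist only for this example.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fetch the raw 16 bits of a half-precision value
   with a plain integer load (no XMM register involved). */
static uint16_t load_half_bits(const void *p)
{
    uint16_t h;
    memcpy(&h, p, sizeof h);
    return h;
}

/* Hypothetical helper: decode IEEE 754 binary16 bits into a float using
   integer operations only. Every binary16 value is exactly representable
   as a float, so no rounding is needed in this direction. */
static float half_bits_to_float(uint16_t h)
{
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1Fu;   /* 5-bit biased exponent */
    uint32_t man  = h & 0x3FFu;          /* 10-bit fraction */
    uint32_t bits;
    float f;

    if (exp == 0) {
        if (man == 0) {
            bits = sign;                 /* signed zero */
        } else {
            /* Subnormal half: shift the fraction up until the implicit
               leading 1 appears, adjusting the exponent to match. */
            int shift = 0;
            while ((man & 0x400u) == 0) {
                man <<= 1;
                shift++;
            }
            man &= 0x3FFu;
            bits = sign | ((uint32_t)(113 - shift) << 23) | (man << 13);
        }
    } else if (exp == 0x1Fu) {
        bits = sign | 0x7F800000u | (man << 13);   /* infinity or NaN */
    } else {
        /* Normal number: rebias the exponent from 15 to 127. */
        bits = sign | ((exp + 112u) << 23) | (man << 13);
    }

    memcpy(&f, &bits, sizeof f);
    return f;
}
```

With this sketch, half_bits_to_float(0x3C00) yields 1.0f and half_bits_to_float(0xC000) yields -2.0f. The opposite direction (float to binary16) additionally needs rounding and is left out here.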