From: "H.J. Lu via Libc-alpha"
Reply-To: "H.J. Lu"
To: liqingqing
Cc: Hushiyuan, "libc-alpha@sourceware.org"
Date: Thu, 28 May 2020 04:56:52 -0700
Subject: Re: [PATCH] x86: Add thresholds for "rep movsb/stosb" to tunables
References: <15ec783d-46f5-0166-aee9-f1d16a58ca83@huawei.com>
List-Id: Libc-alpha mailing list

On Fri, May 22, 2020 at 9:37 PM H.J. Lu wrote:
>
> On Fri, May 22, 2020 at 9:10 PM liqingqing wrote:
> >
> > Commit 830566307f038387ca0af3fd327706a8d1a2f595 optimized the implementation of memset
> > and set the default value of the macro REP_STOSB_THRESHOLD to 2KB. When the input length is less than 2KB, the data flow is the same; when it is larger than 2KB,
> > memset uses STOSB instead of MOVQ.
> >
> > But when I tested this API on an x86_64 platform,
> > I found that this default value is not appropriate for some input lengths.
> > Here are the environment and results:
> >
> > test suite: libMicro-0.4.0
> > ./memset -E -C 200 -L -S -W -N "memset_4k" -s 4k -I 250
> > ./memset -E -C 200 -L -S -W -N "memset_4k_uc" -s 4k -u -I 400
> > ./memset -E -C 200 -L -S -W -N "memset_1m" -s 1m -I 200000
> > ./memset -E -C 200 -L -S -W -N "memset_10m" -s 10m -I 2000000
> >
> > hardware platform:
> > Intel(R) Xeon(R) Gold 6266C CPU @ 3.00GHz
> > L1d cache: 32KB
> > L1i cache: 32KB
> > L2 cache: 1MB
> > L3 cache: 60MB
> >
> > The result is that when the input length is between the processor's L1 data cache size and L2 cache size, REP_STOSB_THRESHOLD=2KB reduces performance.
> >
> >                 before this commit      after this commit
> >                 cycle                   cycle
> > memset_4k       249                     96
> > memset_10k      657                     185
> > memset_36k      2773                    3767
> > memset_100k     7594                    10002
> > memset_500k     37678                   52149
> > memset_1m       86780                   108044
> > memset_10m      1307238                 1148994
> >
> >                 before this commit      after this commit
> >                 MLC cache miss(10sec)   MLC cache miss(10sec)
> > memset_4k       1,09,33,823             1,01,79,270
> > memset_10k      1,23,78,958             1,05,41,087
> > memset_36k      3,61,64,244             4,07,22,429
> > memset_100k     8,25,33,052             9,31,81,253
> > memset_500k     37,32,55,449            43,56,70,395
> > memset_1m       75,16,28,239            88,29,90,237
> > memset_10m      9,36,61,67,397          8,96,69,49,522
> >
> >
> > Although REP_STOSB_THRESHOLD can be modified at build time with -DREP_STOSB_THRESHOLD=xxx,
> > I think the default value is not the best choice, because I think most processors' L2 caches are larger than 2KB, so I am submitting the patch below:
> >
> >
> >
> > From 44314a556239a7524b5a6451025737c1bdbb1cd0 Mon Sep 17 00:00:00 2001
> > From: liqingqing
> > Date: Thu, 21 May 2020 11:23:06 +0800
> > Subject: [PATCH] update REP_STOSB_THRESHOLD's default value from 2k to 1M
> >
> > The current value of the macro REP_STOSB_THRESHOLD reduces memset performance when the input length is between the processor's L1 data cache size and L2 cache size,
> > so update the default value to eliminate the regression.
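
For reference, the size check under discussion can be sketched roughly as
below. This is a simplified illustration and not the glibc code: the enum
and function names are hypothetical, the small-size cases are omitted, and
it assumes that only lengths strictly above the threshold take the
"rep stosb" path, as described in the quoted report.

```c
#include <stddef.h>

/* Hypothetical names for this sketch; these are not glibc identifiers.  */
enum memset_path { PATH_VECTOR_STORES, PATH_REP_STOSB };

/* Assumption: lengths strictly above the threshold use "rep stosb";
   everything else stays on the ordinary store loop.  The real
   implementation has additional small-size cases not modelled here.  */
static enum memset_path
choose_memset_path (size_t n, size_t rep_stosb_threshold)
{
  return n > rep_stosb_threshold ? PATH_REP_STOSB : PATH_VECTOR_STORES;
}
```

With a 2KB threshold, the 36k case from the table above selects
PATH_REP_STOSB; with a 1MB threshold it selects PATH_VECTOR_STORES, while
the 10m case selects PATH_REP_STOSB under either threshold.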
> >
> >
>
> There is no single threshold value which is good for all workloads.
> I don't think we should change REP_STOSB_THRESHOLD to 1MB.
> On the other hand, the fixed threshold isn't flexible.  Please try this
> patch to see if you can set the threshold for your specific workload.
>

Any comments, objections?

https://sourceware.org/pipermail/libc-alpha/2020-May/114281.html

-- 
H.J.