Skip to content

Perl benchmark shows performance decline after upgrading perl 5.30.1 from perl 5.28.2 #17558

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
dapeng-mi opened this issue Feb 12, 2020 · 22 comments

Comments

@dapeng-mi
Copy link

dapeng-mi commented Feb 12, 2020

Module:

Description

We upgraded perl from 5.28.2 to 5.30.1 on Clear Linux. But we found there were performance decline (3% ~ 10%) on Perl benchmark (Pod2html and Interpreter). Here are detailed test results.

Perl Version Pod2html Interpreter
perl-5.28.2 0.10765260/0.43% 0.00106407/0.19%
perl-5.30.1 0.11623903/0.52% 0.00110267/0.41%

Steps to Reproduce

  1. Download and install phoronix-test-suite from this link (https://www.phoronix-test-suite.com)
  2. Run command "phoronix-test-suite run perl-benchmark-1.0.1" and select "3" to test all options (Pod2html and Interpreter).
  3. After test finishes, record the test results.

Expected behavior

Perl configuration

Perl Info
Summary of my perl5 (revision 5 version 30 subversion 1) configuration:

  Platform:
    osname=linux
    osvers=5.3.11-869.native
    archname=x86_64-linux-thread-multi
    uname='linux bc7d1fc894be44529b787f45b9a1fcde 5.3.11-869.native #1 smp thu nov 14 06:38:22 pst 2019 x86_64 gnulinux '
    config_args='--build=x86_64-generic-linux-gnu --host=x86_64-generic-linux-gnu --target=x86_64-clr-linux-gnu --program-prefix= --prefix=/usr --exec-prefix=/usr --bindir=/usr/bin --sbindir=/usr/bin --sysconfdir=/etc --datadir=/usr/share --includedir=/usr/include --libdir=/usr/lib64 --libexecdir=/usr/libexec --localstatedir=/var --sharedstatedir=/usr/com --mandir=/usr/share/man --infodir=/usr/share/info --disable-static -d -e -r -Dprefix=/usr -Dsiteprefix=/usr/local -Dvendorprefix=/usr -Dinstallman1dir=/usr/share/man/man1 -Dinstallman3dir=/usr/share/man/man3 -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3 -Dusethreads -Duseshrplib -Adefine:d_procselfexe -Adefine:procselfexe="/proc/self/exe" -Adefine:ccflags=-O2 -g -feliminate-unused-debug-types -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=32 -Wformat -Wformat-security -Wl,-z,now,-z,relro,-z,max-page-size=0x1000,-z,separate-code -Wno-error -ftree-vectorize -ftree-slp-vectorize -Wl,--enable-new-dtags -fcf-protection=return -Wl,--build-id=sha1 -Wa,-mbranches-within-32B-boundaries -m64 -march=westmere -mtune=haswell -fasynchronous-unwind-tables -fno-omit-frame-pointer -Wp,-D_REENTRANT -O3 -ffat-lto-objects -flto=4  -fPIC -Adefine:ldflags= -Wl,--build-id=sha1 -Adefine:lddflags= -Wl,--build-id=sha1 -U d_off64_t'
    hint=recommended
    useposix=true
    d_sigaction=define
    useithreads=define
    usemultiplicity=define
    use64bitint=define
    use64bitall=define
    uselongdouble=undef
    usemymalloc=n
    default_inc_excludes_dot=define
    bincompat5005=undef
  Compiler:
    cc='cc'
    ccflags ='-D_REENTRANT -D_GNU_SOURCE -O2 -g -feliminate-unused-debug-types -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=32 -Wformat -Wformat-security -Wl,-z,now,-z,relro,-z,max-page-size=0x1000,-z,separate-code -Wno-error -ftree-vectorize -ftree-slp-vectorize -Wl,--enable-new-dtags -fcf-protection=return -Wl,--build-id=sha1 -Wa,-mbranches-within-32B-boundaries -m64 -march=westmere -mtune=haswell -fasynchronous-unwind-tables -fno-omit-frame-pointer -Wp,-D_REENTRANT -O3 -ffat-lto-objects -flto=4 -fPIC -fwrapv -fno-strict-aliasing -fstack-protector-strong -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64'
    optimize='-O2'
    cppflags='-D_REENTRANT -D_GNU_SOURCE -O2 -g -feliminate-unused-debug-types -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=32 -Wformat -Wformat-security -Wl,-z,now,-z,relro,-z,max-page-size=0x1000,-z,separate-code -Wno-error -ftree-vectorize -ftree-slp-vectorize -Wl,--enable-new-dtags -fcf-protection=return -Wl,--build-id=sha1 -Wa,-mbranches-within-32B-boundaries -m64 -march=westmere -mtune=haswell -fasynchronous-unwind-tables -fno-omit-frame-pointer -Wp,-D_REENTRANT -O3 -ffat-lto-objects -flto=4 -fPIC -fwrapv -fno-strict-aliasing -fstack-protector-strong'
    ccversion=''
    gccversion='9.2.1 20200111 gcc-9-branch@280154'
    gccosandvers=''
    intsize=4
    longsize=8
    ptrsize=8
    doublesize=8
    byteorder=12345678
    doublekind=3
    d_longlong=define
    longlongsize=8
    d_longdbl=define
    longdblsize=16
    longdblkind=3
    ivtype='long'
    ivsize=8
    nvtype='double'
    nvsize=8
    Off_t='off_t'
    lseeksize=8
    alignbytes=8
    prototype=define
  Linker and Libraries:
    ld='cc'
    ldflags =' -Wl,--build-id=sha1 -fstack-protector-strong'
    libpth=/usr/lib64/gcc/x86_64-generic-linux/9/include-fixed /usr/lib /lib/../lib64 /usr/lib/../lib64 /lib /lib64 /usr/lib64
    libs=-lpthread -lnsl -lgdbm -ldl -lm -lcrypt -lutil -lc
    perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
    libc=libc-2.30.so
    so=so
    useshrplib=true
    libperl=libperl.so
    gnulibc_version='2.30'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs
    dlext=so
    d_dlsymun=undef
    ccdlflags='-Wl,-E -Wl,-rpath,/usr/lib/perl5/5.30.1/x86_64-linux-thread-multi/CORE'
    cccdlflags='-fPIC'
    lddlflags='-shared -O2 -fstack-protector-strong'


Characteristics of this binary (from libperl):
  Compile-time options:
    HAS_TIMES
    MULTIPLICITY
    PERLIO_LAYERS
    PERL_COPY_ON_WRITE
    PERL_DONT_CREATE_GVSV
    PERL_IMPLICIT_CONTEXT
    PERL_MALLOC_WRAP
    PERL_OP_PARENT
    PERL_PRESERVE_IVUV
    USE_64_BIT_ALL
    USE_64_BIT_INT
    USE_ITHREADS
    USE_LARGE_FILES
    USE_LOCALE
    USE_LOCALE_COLLATE
    USE_LOCALE_CTYPE
    USE_LOCALE_NUMERIC
    USE_LOCALE_TIME
    USE_PERLIO
    USE_PERL_ATOF
    USE_REENTRANT_API
    USE_THREAD_SAFE_LOCALE
  Built under linux
  Compiled at Jan  6 2020 22:15:26
  @INC:
    /usr/local/lib/perl5/site_perl/5.30.1/x86_64-linux-thread-multi
    /usr/local/lib/perl5/site_perl/5.30.1
    /usr/lib/perl5/vendor_perl/5.30.1/x86_64-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.30.1
    /usr/lib/perl5/5.30.1/x86_64-linux-thread-multi
    /usr/lib/perl5/5.30.1
    /usr/local/lib/perl5/site_perl
    /usr/lib/perl5/vendor_perl
@richardleach
Copy link
Contributor

The phoronix-test-suite - https://github.com/phoronix-test-suite/phoronix-test-suite/tree/master/ob-cache/test-profiles/pts/perl-benchmark-1.0.1 - looks like a wrapper around Gisle Aas' https://github.com/gisle/perlbench.

It also only looks to run each test 3 times, so potential for system noise to make a big difference.

Having a look at what the tests are doing to see if there's an easy route to minimal test cases.

@richardleach
Copy link
Contributor

Notes:

The Pod2html test (benchmarks/app/podhtml.b) essentially uses static copies of Pod::HTML & perlfunc.pod to run this in a loop:

pod2html("--infile" => "benchmarks/app/perlfunc.pod", "--outfile" => "/dev/null");

The Interpreter test (./benchmarks/startup/noprog.t) runs this in a loop:

system $^X, "-e", "1";

Next, I'm going to:

  1. Look at how the perl-benchmark gets set up within phoronix-test-suite
  2. Run the current version of Perlbench from CPAN compare v5.28.2 and v5.30.1 directly

@jkeenan
Copy link
Contributor

jkeenan commented Feb 16, 2020

[snip]

Next, I'm going to:

1. Look at how the _perl-benchmark_ gets set up within phoronix-test-suite

2. Run the current version of Perlbench from CPAN compare v5.28.2 and v5.30.1 directly

Thank you for undertaking this systematic investigation.

@richardleach
Copy link
Contributor

Thank you for undertaking this systematic investigation.

It's a relaxingly nerdy way to wind down. ;-)

@richardleach
Copy link
Contributor

phoronix-test-suite looks to set up a shell script that calls:

perl perlbench-runtests [benchmarks to test]

(This will execute the benchmarks using whichever perl is in $^X .)

phoronix-test-suite then reads the Avg: time out of the resulting .pb file.

That seems legit.

Don't know why phoronix-test-suite only uses 2 out of ~40 tests in Perlbench tho!

@richardleach
Copy link
Contributor

richardleach commented Feb 16, 2020

Did a few runs using the latest Perlbench in a Debian VM against fresh 5.28.2 and 5.30.1 builds. The relative rankings of the two builds across the ~40 tests were pretty much all over the place. On one run, 5.28.2 might be ahead on a test by 1-10%, but on a different run 5.30.1 was ahead instead.

IIRC this is not unusual behaviour for Perlbench, because of how it takes its measurements. (Reading the clock, rather than using e.g. cachegrind or perf. )

The "Interpreter" test did perform better on 5.28.2 across all runs, but only by 1-3%. That:

The benchmarks are always swamped by noise. The outputs of individual
tests within perlbench can easily vary by 10% on slightly different perl
builds just by the changed alignment of bits of the object code across
cache line boundaries.
  • Could also result from differences between the builds. For example, I just ran "./Configure -des" for both my builds, but somehow 5.30.1 was built with -D_FORTIFY_SOURCE=2 and 5.28.2 was not. (I think this was down to this commit or one similar: 9f4e630.)

In summary, the phoronix-test-suite results by themselves don't seem to clearly point to a regression.

Is either test case suitable for running with Porting/bench.pl?

@dapeng-mi
Copy link
Author

Phoronix-test-suite would run three times for each indicator by default, but if deviation exceeds the default threshold, phoronix-test-suite would continue to run more times until the deviation meets the threshold.
From my side, the regression can be reproduced stably, the Pod2html regression on i7-8700 is ~10% and on i9-9900K is ~6%.

@richardleach
Copy link
Contributor

Thanks. Approximately how long does phoronix-test-suite run for?
Please could you include the perl -V output for your 5.28.2 build as well?

@dapeng-mi
Copy link
Author

dapeng-mi commented Feb 18, 2020

On my side, generally Pod2html needs to run near 15 times to reach the default deviation threshold and takes near 15 minutes to finish the test. The same configuration and compiling options are used to build these two version perls.

Here are the output of perl 5.28.2.
Summary of my perl5 (revision 5 version 28 subversion 2) configuration:

Platform:
osname=linux
osvers=5.3.11-869.native
archname=x86_64-linux-thread-multi
uname='linux 07cd155c8bf447d99259a955d5500cff 5.3.11-869.native #1 smp thu nov 14 06:38:22 pst 2019 x86_64 gnulinux '
config_args='--build=x86_64-generic-linux-gnu --host=x86_64-generic-linux-gnu --target=x86_64-clr-linux-gnu --program-prefix= --prefix=/usr --exec-prefix=/usr --bindir=/usr/bin --sbindir=/usr/bin --sysconfdir=/etc --datadir=/usr/share --includedir=/usr/include --libdir=/usr/lib64 --libexecdir=/usr/libexec --localstatedir=/var --sharedstatedir=/usr/com --mandir=/usr/share/man --infodir=/usr/share/info --disable-static -d -e -r -Dprefix=/usr -Dsiteprefix=/usr/local -Dvendorprefix=/usr -Dinstallman1dir=/usr/share/man/man1 -Dinstallman3dir=/usr/share/man/man3 -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3 -Dusethreads -Duseshrplib -Adefine:d_procselfexe -Adefine:procselfexe="/proc/self/exe" -Adefine:ccflags=-O2 -g -feliminate-unused-debug-types -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=32 -Wformat -Wformat-security -Wl,-z,now,-z,relro,-z,max-page-size=0x1000,-z,separate-code -Wno-error -ftree-vectorize -ftree-slp-vectorize -Wl,--enable-new-dtags -fcf-protection=return -Wl,--build-id=sha1 -Wa,-mbranches-within-32B-boundaries -m64 -march=westmere -mtune=haswell -fasynchronous-unwind-tables -fno-omit-frame-pointer -Wp,-D_REENTRANT -O3 -ffat-lto-objects -flto=4 -fPIC -Adefine:ldflags= -Wl,--build-id=sha1 -Adefine:lddflags= -Wl,--build-id=sha1 -U d_off64_t'
hint=recommended
useposix=true
d_sigaction=define
useithreads=define
usemultiplicity=define
use64bitint=define
use64bitall=define
uselongdouble=undef
usemymalloc=n
default_inc_excludes_dot=define
bincompat5005=undef
Compiler:
cc='cc'
ccflags ='-D_REENTRANT -D_GNU_SOURCE -O2 -g -feliminate-unused-debug-types -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=32 -Wformat -Wformat-security -Wl,-z,now,-z,relro,-z,max-page-size=0x1000,-z,separate-code -Wno-error -ftree-vectorize -ftree-slp-vectorize -Wl,--enable-new-dtags -fcf-protection=return -Wl,--build-id=sha1 -Wa,-mbranches-within-32B-boundaries -m64 -march=westmere -mtune=haswell -fasynchronous-unwind-tables -fno-omit-frame-pointer -Wp,-D_REENTRANT -O3 -ffat-lto-objects -flto=4 -fPIC -fwrapv -fno-strict-aliasing -fstack-protector-strong -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64'
optimize='-O2'
cppflags='-D_REENTRANT -D_GNU_SOURCE -O2 -g -feliminate-unused-debug-types -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=32 -Wformat -Wformat-security -Wl,-z,now,-z,relro,-z,max-page-size=0x1000,-z,separate-code -Wno-error -ftree-vectorize -ftree-slp-vectorize -Wl,--enable-new-dtags -fcf-protection=return -Wl,--build-id=sha1 -Wa,-mbranches-within-32B-boundaries -m64 -march=westmere -mtune=haswell -fasynchronous-unwind-tables -fno-omit-frame-pointer -Wp,-D_REENTRANT -O3 -ffat-lto-objects -flto=4 -fPIC -fwrapv -fno-strict-aliasing -fstack-protector-strong'
ccversion=''
gccversion='9.2.1 20191210 gcc-9-branch@279166'
gccosandvers=''
intsize=4
longsize=8
ptrsize=8
doublesize=8
byteorder=12345678
doublekind=3
d_longlong=define
longlongsize=8
d_longdbl=define
longdblsize=16
longdblkind=3
ivtype='long'
ivsize=8
nvtype='double'
nvsize=8
Off_t='off_t'
lseeksize=8
alignbytes=8
prototype=define
Linker and Libraries:
ld='cc'
ldflags =' -Wl,--build-id=sha1 -fstack-protector-strong'
libpth=/usr/lib64/gcc/x86_64-generic-linux/9/include-fixed /usr/lib /lib/../lib64 /usr/lib/../lib64 /lib /lib64 /usr/lib64
libs=-lpthread -lnsl -lgdbm -ldl -lm -lcrypt -lutil -lc
perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
libc=libc-2.30.so
so=so
useshrplib=true
libperl=libperl.so
gnulibc_version='2.30'
Dynamic Linking:
dlsrc=dl_dlopen.xs
dlext=so
d_dlsymun=undef
ccdlflags='-Wl,-E -Wl,-rpath,/usr/lib/perl5/5.28.2/x86_64-linux-thread-multi/CORE'
cccdlflags='-fPIC'
lddlflags='-shared -O2 -fstack-protector-strong'

Characteristics of this binary (from libperl):
Compile-time options:
HAS_TIMES
MULTIPLICITY
PERLIO_LAYERS
PERL_COPY_ON_WRITE
PERL_DONT_CREATE_GVSV
PERL_IMPLICIT_CONTEXT
PERL_MALLOC_WRAP
PERL_OP_PARENT
PERL_PRESERVE_IVUV
USE_64_BIT_ALL
USE_64_BIT_INT
USE_ITHREADS
USE_LARGE_FILES
USE_LOCALE
USE_LOCALE_COLLATE
USE_LOCALE_CTYPE
USE_LOCALE_NUMERIC
USE_LOCALE_TIME
USE_PERLIO
USE_PERL_ATOF
USE_REENTRANT_API
Built under linux
Compiled at Dec 12 2019 00:24:27
@inc:
/usr/local/lib/perl5/site_perl/5.28.2/x86_64-linux-thread-multi
/usr/local/lib/perl5/site_perl/5.28.2
/usr/lib/perl5/vendor_perl/5.28.2/x86_64-linux-thread-multi
/usr/lib/perl5/vendor_perl/5.28.2
/usr/lib/perl5/5.28.2/x86_64-linux-thread-multi
/usr/lib/perl5/5.28.2

@richardleach
Copy link
Contributor

@dapeng-mi Thanks for that detail.

@ everyone: Suggestions on how to move forward with this? Results from some quick perf experiments below. (Disclaimer: I've never used perf before.) I don't mind building every 5.29.x release and running the same perf, if that would be worthwhile...

/5282# perf stat /gh17558/5282/perl5/perl -I/gh17558/5282/perl5/lib -MPod::Html -MFile::Spec -e 'pod2html("--infile" => "/gh17558/perlfunc.pod", "--outfile" => File::Spec->devnull, "--quiet" )'

 Performance counter stats for '/gh17558/5282/perl5/perl -I/gh17558/5282/perl5/lib -MPod::Html -MFile::Spec -e pod2html("--infile" => "/gh17558/perlfunc.pod", "--outfile" => File::Spec->devnull, "--quiet" )':

            384.48 msec task-clock                #    0.992 CPUs utilized          
                28      context-switches          #    0.073 K/sec                  
                 0      cpu-migrations            #    0.000 K/sec                  
             3,714      page-faults               #    0.010 M/sec                  
     1,072,682,917      cycles                    #    2.790 GHz                    
     1,745,919,221      instructions              #    1.63  insn per cycle         
       356,790,778      branches                  #  927.986 M/sec                  
         6,341,590      branch-misses             #    1.78% of all branches        

       0.387540624 seconds time elapsed

       0.357445000 seconds user
       0.027801000 seconds sys

/5282# perf stat /gh17558/5282/perl5/perl -I/gh17558/5282/perl5/lib -MPod::Html -MFile::Spec -e 'pod2html("--infile" => "/gh17558/perlfunc.pod", "--outfile" => File::Spec->devnull, "--quiet" ) for ( 0 .. 999 )'

 Performance counter stats for '/gh17558/5282/perl5/perl -I/gh17558/5282/perl5/lib -MPod::Html -MFile::Spec -e pod2html("--infile" => "/gh17558/perlfunc.pod", "--outfile" => File::Spec->devnull, "--quiet" ) for ( 0 .. 999 )':

        354,431.68 msec task-clock                #    0.997 CPUs utilized          
            10,461      context-switches          #    0.030 K/sec                  
               101      cpu-migrations            #    0.000 K/sec                  
           321,077      page-faults               #    0.906 K/sec                  
   974,256,551,546      cycles                    #    2.749 GHz                    
 1,601,626,624,150      instructions              #    1.64  insn per cycle         
   326,049,339,752      branches                  #  919.922 M/sec                  
     5,020,400,433      branch-misses             #    1.54% of all branches        

     355.414378728 seconds time elapsed

     352.627838000 seconds user
       1.728528000 seconds sys

/5301# perf stat /gh17558/5301/perl5/perl -I/gh17558/5301/perl5/lib -MPod::Html -MFile::Spec -e 'pod2html("--infile" => "/gh17558/perlfunc.pod", "--outfile" => File::Spec->devnull, "--quiet" )'

 Performance counter stats for '/gh17558/5301/perl5/perl -I/gh17558/5301/perl5/lib -MPod::Html -MFile::Spec -e pod2html("--infile" => "/gh17558/perlfunc.pod", "--outfile" => File::Spec->devnull, "--quiet" )':

            432.09 msec task-clock                #    0.990 CPUs utilized          
                26      context-switches          #    0.060 K/sec                  
                 0      cpu-migrations            #    0.000 K/sec                  
             3,730      page-faults               #    0.009 M/sec                  
     1,164,654,318      cycles                    #    2.695 GHz                    
     1,780,088,766      instructions              #    1.53  insn per cycle         
       363,935,095      branches                  #  842.272 M/sec                  
         7,029,140      branch-misses             #    1.93% of all branches        

       0.436268922 seconds time elapsed

       0.416828000 seconds user
       0.015729000 seconds sys

/5301# perf stat /gh17558/5301/perl5/perl -I/gh17558/5301/perl5/lib -MPod::Html -MFile::Spec -e 'pod2html("--infile" => "/gh17558/perlfunc.pod", "--outfile" => File::Spec->devnull, "--quiet" ) for (0 .. 999)'

 Performance counter stats for '/gh17558/5301/perl5/perl -I/gh17558/5301/perl5/lib -MPod::Html -MFile::Spec -e pod2html("--infile" => "/gh17558/perlfunc.pod", "--outfile" => File::Spec->devnull, "--quiet" ) for (0 .. 999)':

        362,237.18 msec task-clock                #    0.998 CPUs utilized          
            10,016      context-switches          #    0.028 K/sec                  
               102      cpu-migrations            #    0.000 K/sec                  
           205,705      page-faults               #    0.568 K/sec                  
 1,002,808,465,214      cycles                    #    2.768 GHz                    
 1,631,358,178,037      instructions              #    1.63  insn per cycle         
   332,252,149,201      branches                  #  917.223 M/sec                  
     5,060,947,906      branch-misses             #    1.52% of all branches        

     362.881463506 seconds time elapsed

     361.196604000 seconds user
       0.974318000 seconds sys

@iabyn
Copy link
Contributor

iabyn commented Feb 18, 2020 via email

@richardleach
Copy link
Contributor

pod2html_misc_perf.txt - some numbers for the past two release cycles, plus the .0 stable releases back to 5.20.

Cycles, instructions, branches all vary. Haven't had time to look in any detail.

One thing that jumps out though is a massive jump in cycles, instructions, and branches from v5.31.9 to v5.31.10.

@iabyn
Copy link
Contributor

iabyn commented Mar 21, 2020 via email

@khwilliamson
Copy link
Contributor

khwilliamson commented Mar 21, 2020 via email

@iabyn
Copy link
Contributor

iabyn commented Mar 21, 2020 via email

@richardleach
Copy link
Contributor

I see almost identical (to each other) results for 5.31.9 and 5.31.10. Are you sure your .9 and .10 perl binaries are built in the same way as each other as regards to threading, DEBUGGING, compiler etc?

Yeah, I built them all in the same way (./Configure -des -Dusedevel && make, with a make clean in-between), but just reran for blead and it didn't show that jump:

 Performance counter stats for './perl -Ilib -MPod::Html -MFile::Spec -e pod2html("--infile" => "../perlfunc.pod","--outfile" => File::Spec->devnull, "--quiet")':

            635.77 msec task-clock                #    0.997 CPUs utilized          
                 7      context-switches          #    0.011 K/sec                  
                 1      cpu-migrations            #    0.002 K/sec                  
             4,491      page-faults               #    0.007 M/sec                  
     1,309,623,366      cycles                    #    2.060 GHz                    
     1,850,029,406      instructions              #    1.41  insn per cycle         
       375,064,232      branches                  #  589.940 M/sec                  
         7,713,324      branch-misses             #    2.06% of all branches        

       0.637361018 seconds time elapsed

       0.591935000 seconds user
       0.045998000 seconds sys

Nothing like repeatability, eh? ;-)

@richardleach
Copy link
Contributor

And again for v5.30.10, back in the same ballpark:

 Performance counter stats for './perl -Ilib -MPod::Html -MFile::Spec -e pod2html("--infile" => "../perlfunc.pod","--outfile" => File::Spec->devnull, "--quiet")':

            622.37 msec task-clock                #    0.998 CPUs utilized          
                 8      context-switches          #    0.013 K/sec                  
                 1      cpu-migrations            #    0.002 K/sec                  
             4,497      page-faults               #    0.007 M/sec                  
     1,287,636,701      cycles                    #    2.069 GHz                    
     1,848,940,119      instructions              #    1.44  insn per cycle         
       374,712,223      branches                  #  602.072 M/sec                  
         7,708,676      branch-misses             #    2.06% of all branches        

       0.623594041 seconds time elapsed

       0.601463000 seconds user
       0.022481000 seconds sys

Th only thing I've ever been able to conclude from test suite timings is that I can't conclude anything from them.

This seems very true.

It might be useful if we could produce box and whisker plots for benchmarks, so we're not mislead by outliers. (Using counter-based data from perlbench or perf.) Wonder if Perl::Formance can/could do this?

@iabyn
Copy link
Contributor

iabyn commented Mar 21, 2020 via email

@richardleach
Copy link
Contributor

1,000x pod2html iterations for 5.28.2, 5.30.1, 5.31.10 below. Similar story.

v5.31.10

 Performance counter stats for './perl -Ilib -MPod::Html -MFile::Spec -e pod2html("--infile" => "../perlfunc.pod","--outfile" => File::Spec->devnull, "--quiet") for (0..999)':

        454,670.34 msec task-clock                #    0.999 CPUs utilized          
             3,078      context-switches          #    0.007 K/sec                  
                13      cpu-migrations            #    0.000 K/sec                  
           205,704      page-faults               #    0.452 K/sec                  
 1,082,806,460,200      cycles                    #    2.382 GHz                    
 1,687,579,076,989      instructions              #    1.56  insn per cycle         
   340,128,910,670      branches                  #  748.078 M/sec                  
     5,416,255,880      branch-misses             #    1.59% of all branches        

     454.920092899 seconds time elapsed

     453.228837000 seconds user
       1.411193000 seconds sys

==========================
v5.30.1

 Performance counter stats for './perl -Ilib -MPod::Html -MFile::Spec -e pod2html("--infile" => "../perlfunc.pod","--outfile" => File::Spec->devnull, "--quiet") for (0..999)':

        408,737.90 msec task-clock                #    0.999 CPUs utilized          
             2,577      context-switches          #    0.006 K/sec                  
                14      cpu-migrations            #    0.000 K/sec                  
           202,101      page-faults               #    0.494 K/sec                  
   996,591,259,416      cycles                    #    2.438 GHz                    
 1,634,068,853,879      instructions              #    1.64  insn per cycle         
   333,080,367,811      branches                  #  814.900 M/sec                  
     5,362,305,842      branch-misses             #    1.61% of all branches        

     408.947508213 seconds time elapsed

     407.643432000 seconds user
       1.082093000 seconds sys

==========================
v5.28.2

 Performance counter stats for './perl -Ilib -MPod::Html -MFile::Spec -e pod2html("--infile" => "../perlfunc.pod","--outfile" => File::Spec->devnull, "--quiet") for (0..999)':

        468,584.42 msec task-clock                #    0.999 CPUs utilized          
             3,361      context-switches          #    0.007 K/sec                  
                27      cpu-migrations            #    0.000 K/sec                  
           322,632      page-faults               #    0.689 K/sec                  
 1,027,909,571,067      cycles                    #    2.194 GHz                    
 1,605,057,470,043      instructions              #    1.56  insn per cycle         
   327,106,521,040      branches                  #  698.074 M/sec                  
     5,109,485,964      branch-misses             #    1.56% of all branches        

     468.908419310 seconds time elapsed

     466.658571000 seconds user
       1.898568000 seconds sys

@richardleach
Copy link
Contributor

@dapeng-mi If you're able to identify a specific commit as the cause of a pod2html regression, that would give something specific to examine. Otherwise it looks like test noise is swamping any signal.

@khwilliamson
Copy link
Contributor

khwilliamson commented Apr 14, 2020 via email

@richardleach
Copy link
Contributor

This issue hasn't turned up a minimally reproduceable test case, or even a general suspicion of "something in X part of core slowed down". It does seem likely that noise and/or the standard small % variability of performance across builds due to factors outside of our control might have been what was happening.

I'll close this ticket now. If there are any specific slowdowns that people want looking at, please could you raise a new issue and include your detailed test case(s) there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants