Mirror of https://gitlab.rtems.org/rtems/rtos/rtems.git (synced 2025-12-12 10:59:18 +08:00)
testsuites/benchmarks/dhrystone: Convert documents to single README.md
Committed by: Joel Sherrill
Parent: 52a9fdec5c
Commit: cfdd40935d
@@ -1,3 +1,88 @@
C README
========
This "shar" file contains the documentation for the
electronic mail distribution of the Dhrystone benchmark (C version 2.1);
a companion "shar" file contains the source code.
(Because of mail length restrictions for some mailers, I have
split the distribution in two parts.)

For versions in other languages, see the other "shar" files.

Files containing the C version (*.h: Header File, *.c: C Modules)

dhry.h
dhry_1.c
dhry_2.c

The file RATIONALE contains the article

"Dhrystone Benchmark: Rationale for Version 2 and Measurement Rules"

which has been published, together with the C source code (Version 2.0),
in SIGPLAN Notices vol. 23, no. 8 (Aug. 1988), pp. 49-62.
This article explains all changes that have been made for Version 2,
compared with the version of the original publication
in Communications of the ACM vol. 27, no. 10 (Oct. 1984), pp. 1013-1030.
It also contains "ground rules" for benchmarking with Dhrystone
which should be followed by everyone who uses the program and publishes
Dhrystone results.

Compared with the Version 2.0 published in SIGPLAN Notices, Version 2.1
contains a few corrections that have been made after Version 2.0 was
distributed over the UNIX network Usenet. These small differences between
Version 2.0 and 2.1 should not affect execution time measurements.
For those who want to compare the exact contents of both versions,
the file "dhry_c.dif" contains the differences between the two versions,
as generated by a file comparison of the corresponding files with the
UNIX utility "diff".

The file VARIATIONS contains the article

"Understanding Variations in Dhrystone Performance"

which has been published in Microprocessor Report, May 1989
(Editor: M. Slater), pp. 16-17. It describes the points that users
should know if C Dhrystone results are compared.

Recipients of this shar file who perform measurements are asked
to send measurement results to the author and/or to Rick Richardson.
Rick Richardson publishes regularly Dhrystone results on the UNIX network
Usenet. For submissions of results to him (preferably by electronic mail,
see address in the program header), he has provided a form which is contained
in the file "submit.frm".


The following files are contained in other "shar" files:

Files containing the Ada version (*.s: Specifications, *.b: Bodies):

d_global.s
d_main.b
d_pack_1.b
d_pack_1.s
d_pack_2.b
d_pack_2.s

File containing the Pascal version:

dhry.p


February 22, 1990

Reinhold P. Weicker
Siemens AG, AUT E 51
Postfach 3220
D-8520 Erlangen
Germany (West)

Phone: [xxx-49]-9131-7-20330 (8-17 Central European Time)
UUCP: ..!mcsun!unido!estevax!weicker


Rationale
=========



Dhrystone Benchmark: Rationale for Version 2 and Measurement Rules
@@ -359,3 +444,161 @@ comments on earlier versions of the benchmark.
Brian W. Kernighan and Dennis M. Ritchie: The C Programming Language.
Prentice-Hall, Englewood Cliffs (NJ) 1978


Variations
==========
Understanding Variations in Dhrystone Performance



By Reinhold P. Weicker, Siemens AG, AUT E 51, Erlangen



April 1989


This article has appeared in:


Microprocessor Report, May 1989 (Editor: M. Slater), pp. 16-17

Microprocessor manufacturers tend to credit all the performance measured by
benchmarks to the speed of their processors; they often don't even mention the
programming language and compiler used. In their detailed documents, usually
called "performance brief" or "performance report," they usually do give more
details. However, these details are often lost in the press releases and other
marketing statements. For serious performance evaluation, it is necessary to
study the code generated by the various compilers.

Dhrystone was originally published in Ada (Communications of the ACM, Oct.
1984). However, since good Ada compilers were rare at this time and, together
with UNIX, C became more and more popular, the C version of Dhrystone is the
one now mainly used in industry. There are "official" versions 2.1 for Ada,
Pascal, and C, which are as close together as the languages' semantic
differences permit.

Dhrystone contains two statements where the programming language and its
translation play a major part in the execution time measured by the benchmark:

o String assignment (in procedure Proc_0 / main)
o String comparison (in function Func_2)

In Ada and Pascal, strings are arrays of characters where the length of the
string is part of the type information known at compile time. In C, strings
are also arrays of characters, but there are no operators defined in the
language for assignment and comparison of strings. Instead, functions
"strcpy" and "strcmp" are used. These functions are defined for strings of
arbitrary length, and make use of the fact that strings in C have to end with
a terminating null byte. For general-purpose calls to these functions, the
implementor can assume nothing about the length and the alignment of the
strings involved.
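
To make the general-purpose case concrete, here is a minimal sketch of
byte-at-a-time versions of the two functions. The names `naive_strcpy` and
`naive_strcmp` are hypothetical and are not taken from Dhrystone or from any
C library; real library versions are usually hand-tuned. The point is only
that the terminating null byte is the sole stop condition.

```c
#include <assert.h>

/* Hypothetical byte-at-a-time copy. Nothing is assumed about the length
   or alignment of either string; the loop ends at the null byte. */
char *naive_strcpy(char *dst, const char *src)
{
    char *ret = dst;

    assert(dst != 0 && src != 0);
    while ((*dst++ = *src++) != '\0') {
        /* one byte per iteration, terminator included */
    }
    return ret;
}

/* Hypothetical byte-at-a-time comparison. Returns a value less than,
   equal to, or greater than zero, like strcmp. */
int naive_strcmp(const char *a, const char *b)
{
    assert(a != 0 && b != 0);
    while (*a != '\0' && *a == *b) {
        a++;
        b++;
    }
    /* compare as unsigned char, as the C standard specifies for strcmp */
    return (int)(unsigned char)*a - (int)(unsigned char)*b;
}
```

Every iteration loads, tests, and stores a single byte, which is why the
length of the strings matters so much for the time spent here.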

The C version of Dhrystone spends a relatively large amount of time in these
two functions. Some time ago, I made measurements on a VAX 11/785 with the
Berkeley UNIX (4.2) compilers (often-used compilers, but certainly not the
most advanced). In the C version, 23% of the time was spent in the string
functions; in the Pascal version, only 10%. On good RISC machines (where less
time is spent in the procedure calling sequence than on a VAX) and with better
optimizing compilers, the percentage is higher; MIPS has reported 34% for an
R3000. Because of this effect, Pascal and Ada Dhrystone results are usually
better than C results (except when the optimization quality of the C compiler
is considerably better than that of the other compilers).

Several people have noted that the string operations are over-represented in
Dhrystone, mainly because the strings occurring in Dhrystone are longer than
average strings. I admit that this is true, and have said so in my SIGPLAN
Notices paper (Aug. 1988); however, I didn't want to generate confusion by
changing the string lengths from version 1 to version 2.

Even if they are somewhat over-represented in Dhrystone, string operations are
frequent enough that it makes sense to implement them in the most efficient
way possible, not only for benchmarking purposes. This means that they can
and should be written in assembly language code. ANSI C also explicitly allows
the string functions to be implemented as macros, i.e. by inline code.

There is also a third way to speed up the "strcpy" statement in Dhrystone: For
this particular "strcpy" statement, the source of the assignment is a string
constant. Therefore, in contrast to calls to "strcpy" in the general case, the
compiler knows the length and alignment of the strings involved at compile
time and can generate code in the same efficient way as a Pascal compiler
(word instructions instead of byte instructions).
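
As a hedged illustration of this third option: when the source is a string
literal, `sizeof` applied to the literal (its length plus the terminator) is a
compile-time constant, so the assignment can be expressed as a fixed-size
`memcpy` that a compiler is free to expand into word moves. The 31-byte
`Str_30` type and the literal below are assumptions about what the benchmark
uses, and `COPY_LITERAL`, `assign_general`, and `assign_fixed` are
hypothetical names introduced only for this sketch.

```c
#include <assert.h>
#include <string.h>

/* Assumed buffer type: 30 characters plus the terminating null byte. */
typedef char Str_30[31];

/* Hypothetical helper macro. Valid only when `lit` is a string literal,
   because sizeof(lit) is then the literal's size including the '\0'. */
#define COPY_LITERAL(dst, lit) memcpy((dst), (lit), sizeof(lit))

/* Written as a general call: the callee sees only two pointers. */
void assign_general(Str_30 dst)
{
    strcpy(dst, "DHRYSTONE PROGRAM, 1'ST STRING");
}

/* Fixed-length form: the byte count is known at compile time. */
void assign_fixed(Str_30 dst)
{
    COPY_LITERAL(dst, "DHRYSTONE PROGRAM, 1'ST STRING");
}
```

Both functions leave the same bytes in the destination; they differ only in
how much the generated code is told about the length in advance.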

This is not allowed in the case of the "strcmp" call: Here, the addresses are
formal procedure parameters, and no assumptions can be made about the length
or alignment of the strings. Any such assumptions would indicate an incorrect
implementation. They might work for Dhrystone, where the strings are in fact
word-aligned with typical compilers, but other programs would deliver
incorrect results.

So, for an apple-to-apple comparison between processors, and not between
several possible (legal or illegal) degrees of compiler optimization, one
should check that the systems are comparable with respect to the following
three points:

(1) String functions in assembly language vs. in C

Frequently used functions such as the string functions can and should be
written in assembly language, and all serious C language systems known
to me do this. (I list this point for completeness only.) Note that
processors with an instruction that checks a word for a null byte (such
as AMD's 29000 and Intel's 80960) have an advantage here. (This
advantage decreases relatively if optimization (3) is applied.) Due to
the length of the strings involved in Dhrystone, this advantage may be
considered too high in perspective, but it is certainly legal to use
such instructions - after all, these situations are what they were
invented for.

(2) String function code inline vs. as library functions.

ANSI C has created a new situation, compared with the older
Kernighan/Ritchie C. In the original C, the definition of the string
function was not part of the language. Now it is, and inlining is
explicitly allowed. I probably should have stated more clearly in my
SIGPLAN Notices paper that the rule "No procedure inlining for
Dhrystone" referred to the user level procedures only and not to the
library routines.

(3) Fixed-length and alignment assumptions for the strings

Compilers should be allowed to optimize in these cases if (and only if)
it is safe to do so. For Dhrystone, this is the "strcpy" statement, but
not the "strcmp" statement (unless, of course, the "strcmp" code
explicitly checks the alignment at execution time and branches
accordingly). A "Dhrystone switch" for the compiler that causes the
generation of code that may not work under certain circumstances is
certainly inappropriate for comparisons. It has been reported in Usenet
that some C compilers provide such a compiler option; since I don't have
access to all C compilers involved, I cannot verify this.

If the fixed-length and word-alignment assumption can be used, a wide
bus that permits fast multi-word load instructions certainly does help;
however, this fact by itself should not make a really big difference.
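
Points (1) and (3) can be sketched together in portable C. The sketch below is
an illustration under stated assumptions, not the code of any particular
library: `has_zero_byte` emulates in software what a "check a word for a null
byte" instruction does, and `aligned_strcmp` (a hypothetical name) checks the
alignment at execution time and branches accordingly. A 32-bit word is
assumed. Note that the fast path may load up to three bytes beyond the
terminator; those bytes lie in the same aligned word, which is what real
implementations rely on, but strictly portable C does not guarantee that such
a read is permitted.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Nonzero if any of the four bytes in w is zero (well-known bit trick). */
int has_zero_byte(uint32_t w)
{
    return ((w - 0x01010101u) & ~w & 0x80808080u) != 0;
}

/* Hypothetical strcmp that tests alignment at run time and branches. */
int aligned_strcmp(const char *a, const char *b)
{
    assert(a != 0 && b != 0);

    /* Fast path only when both pointers are 4-byte aligned. */
    if ((((uintptr_t)a | (uintptr_t)b) & 3u) == 0) {
        for (;;) {
            uint32_t wa, wb;

            memcpy(&wa, a, sizeof wa);  /* aligned 4-byte loads */
            memcpy(&wb, b, sizeof wb);
            if (wa != wb || has_zero_byte(wa))
                break;                  /* resolve this word bytewise */
            a += 4;
            b += 4;
        }
    }

    /* Safe path: used for unaligned input and for the final word. */
    while (*a != '\0' && *a == *b) {
        a++;
        b++;
    }
    return (int)(unsigned char)*a - (int)(unsigned char)*b;
}
```

Without the alignment test, the word loads would be the kind of unsafe
assumption that point (3) rules out for the "strcmp" statement.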

A check of these points - something that is necessary for a thorough
evaluation and comparison of the Dhrystone performance claims - requires
object code listings as well as listings for the string functions (strcpy,
strcmp) that are possibly called by the program.

I don't pretend that Dhrystone is a perfect tool to measure the integer
performance of microprocessors. The more it is used and discussed, the more I
myself learn about aspects that I hadn't noticed yet when I wrote the program.
And of course, the very success of a benchmark program is a danger in that
people may tune their compilers and/or hardware to it, and with this action
make it less useful.

Whetstone and Linpack have their critical points also: The Whetstone rating
depends heavily on the speed of the mathematical functions (sine, sqrt, ...),
and Linpack is sensitive to data alignment for some cache configurations.

Introduction of a standard set of public domain benchmark software (something
the SPEC effort attempts) is certainly a worthwhile thing. In the meantime,
people will continue to use whatever is available and widely distributed, and
Dhrystone ratings are probably still better than MIPS ratings if these are -
as often in industry - based on no reproducible derivation. However, any
serious performance evaluation requires more than just a comparison of raw
numbers; one has to make sure that the numbers have been obtained in a
comparable way.
@@ -1,78 +0,0 @@
@@ -1,157 +0,0 @@