GPROF(1) User Commands GPROF(1)
NAME
gprof - display call-graph profile data
SYNOPSIS
gprof [-abcCDlsz] [-e function-name] [-E function-name]
      [-f function-name] [-F function-name]
      [image-file [profile-file...]] [-n number of functions]
DESCRIPTION
The gprof utility produces an execution profile of a program. The effect
of called routines is incorporated in the profile of each caller. The
profile data is taken from the call graph profile file that is created by
programs compiled with the -xpg option of cc(1), or by the -pg option
with other compilers, or by setting the LD_PROFILE environment variable
for shared objects. See ld.so.1(1). These compiler options also link in
versions of the library routines which are compiled for profiling. The
symbol table in the executable image file image-file (a.out by default)
is read and correlated with the call graph profile file profile-file
(gmon.out by default).
First, execution times for each routine are propagated along the edges of
the call graph. Cycles are discovered, and calls into a cycle are made to
share the time of the cycle. The first listing shows the functions
sorted according to the time they represent, including the time of their
call graph descendants. Below each function entry is shown its (direct)
call-graph children and how their times are propagated to this function.
A similar display above the function shows how this function's time and
the time of its descendants are propagated to its (direct) call-graph
parents.
Cycles are also shown, with an entry for the cycle as a whole and a
listing of the members of the cycle and their contributions to the time
and call counts of the cycle.
Next, a flat profile is given. This listing gives the total execution
times and call counts for each of the functions in the program, sorted by
decreasing time. Finally, an index is given, which shows the
correspondence between function names and call-graph profile index
numbers.
A single function may be split into subfunctions for profiling by means
of the MARK macro. See prof(7).
Beware of quantization errors. The granularity of the sampling is shown,
but remains statistical at best. It is assumed that the time for each
execution of a function can be expressed by the total time for the
function divided by the number of times the function is called. Thus the
time propagated along the call-graph arcs to parents of that function is
directly proportional to the number of times that arc is traversed.
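As a rough illustration of the proportionality assumption described above,
the following Python sketch divides a function's total time among its
parents according to how often each arc was traversed. The helper name
propagate and the sample figures are hypothetical; this is not gprof's own
code, only the arithmetic the paragraph describes.

```python
def propagate(total_time, arc_counts):
    """Split total_time among callers in proportion to arc traversal counts.

    total_time: time attributed to the callee (self plus descendants).
    arc_counts: mapping of caller name -> number of calls on that arc.
    """
    total_calls = sum(arc_counts.values())
    if total_calls == 0:
        # No recorded calls, so there is nothing to propagate.
        return {caller: 0.0 for caller in arc_counts}
    # Assumed average cost of one call: total time / number of calls.
    per_call = total_time / total_calls
    return {caller: per_call * n for caller, n in arc_counts.items()}


# Hypothetical data: a callee with 6.0 time units, called once from A
# and twice from B.
shares = propagate(6.0, {"A": 1, "B": 2})
print(shares)
```

Under this model a parent that makes one expensive call and a parent that
makes one cheap call are charged the same amount, which is why the
propagated times should be read as estimates.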
The profiled program must call exit(2) or return normally for the
profiling information to be saved in the gmon.out file.
OPTIONS
The following options are supported:
-a Suppress printing statically declared functions. If
this option is given, all relevant information about
the static function (for instance, time samples, calls
to other functions, calls from other functions)
belongs to the function loaded just before the static
function in the a.out file.
-b Brief. Suppress descriptions of each field in the
profile.
-c Discover the static call-graph of the program by a
heuristic which examines the text space of the object
file. Static-only parents or children are indicated
with call counts of 0. Note that for dynamically
linked executables, the linked shared objects' text
segments are not examined.
-C Demangle symbol names before printing them out.
-D Produce a profile file gmon.sum that represents the
difference of the profile information in all specified
profile files. This summary profile file may be given
to subsequent executions of gprof (also with -D) to
summarize profile data across several runs of an
a.out file. See also the -s option.
As an example, suppose function A calls function B
n times in profile file gmon.sum, and m times in profile
file gmon.out. With -D, a new gmon.sum file will be
created showing the number of calls from A to B as n-m.
-e function-name Suppress printing the graph profile entry for routine
function-name and all its descendants (unless they
have other ancestors that are not suppressed). More
than one -e option may be given. Only one
function-name may be given with each -e option.
-E function-name Suppress printing the graph profile entry for routine
function-name (and its descendants) as with -e, above, and
also exclude the time spent in function-name (and its
descendants) from the total and percentage time
computations. More than one -E option may be given.
For example: -E mcount -E mcleanup is the default.
-f function-name Print the graph profile entry only for routine
function-name and its descendants. More than one
-f option may be given. Only one function-name may be
given with each -f option.
-F function-name Print the graph profile entry only for routine
function-name and its descendants (as with -f, above) and
also use only the times of the printed routines in
total time and percentage computations. More than one
-F option may be given. Only one function-name may be
given with each -F option. The -F option overrides
the -E option.
-l Suppress the reporting of graph profile entries for
all local symbols. This option is equivalent to
placing all of the local symbols for the
specified executable image on the -E exclusion list.
-n Limits the size of flat and graph profile listings to
the top n offending functions.
-s Produce a profile file gmon.sum which represents the
sum of the profile information in all of the specified
profile files. This summary profile file may be given
to subsequent executions of gprof (also with -s) to
accumulate profile data across several runs of an
a.out file. See also the -D option.
-z Display routines which have zero usage (as indicated
by call counts and accumulated time). This is useful
in conjunction with the -c option for discovering
which routines were never called. Note that this has
restricted use for dynamically linked executables,
since shared object text space will not be examined by
the -c option.
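The effect of the -s and -D options on call counts can be pictured with
the short Python sketch below. It assumes, purely for illustration, that
each profile file has already been reduced to a dictionary keyed by
(caller, callee) arc; the helper name combine and the counts are
hypothetical, and real profile files also carry sampling data that this
sketch ignores.

```python
def combine(sum_arcs, out_arcs, difference=False):
    """Combine per-arc call counts from two profiles.

    sum_arcs: counts from the existing summary (gmon.sum in the text).
    out_arcs: counts from a further run (gmon.out in the text).
    difference: False models -s (sum), True models -D (difference).
    """
    sign = -1 if difference else 1
    result = dict(sum_arcs)
    for arc, count in out_arcs.items():
        # Arcs missing from the summary start from a count of zero.
        result[arc] = result.get(arc, 0) + sign * count
    return result


# Hypothetical counts: A calls B n=7 times in gmon.sum, m=3 in gmon.out.
gmon_sum = {("A", "B"): 7}
gmon_out = {("A", "B"): 3}

summed = combine(gmon_sum, gmon_out)                   # models -s
diffed = combine(gmon_sum, gmon_out, difference=True)  # models -D
print(summed, diffed)
```

With these figures the summed count for the arc from A to B is n+m and the
difference is n-m, matching the example given for -D.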
ENVIRONMENT VARIABLES
PROFDIR If this environment variable contains a value, place profiling
output within that directory, in a file named pid.programname.
pid is the process ID and programname is the name of the
program being profiled, as determined by removing any path
prefix from the argv[0] with which the program was called. If
the variable contains a null value, no profiling output is
produced. If the variable is not set, profiling output is
placed in the file gmon.out.
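The PROFDIR rules above amount to a three-way choice, sketched here in
Python. The function profile_path is a hypothetical helper written for
this illustration; it models the naming rule only and is not part of the
profiling runtime. An unset variable is represented by None.

```python
import posixpath


def profile_path(profdir, pid, argv0):
    """Return the profile output path implied by PROFDIR, or None.

    profdir: value of PROFDIR, or None when the variable is not set.
    pid: process ID of the profiled program.
    argv0: the argv[0] with which the program was called.
    """
    if profdir is None:
        # Variable not set: output goes to gmon.out.
        return "gmon.out"
    if profdir == "":
        # Null value: no profiling output is produced.
        return None
    # Strip any path prefix from argv[0] to get programname.
    programname = posixpath.basename(argv0)
    return posixpath.join(profdir, f"{pid}.{programname}")


# Hypothetical directory, process ID, and program path.
print(profile_path("/tmp/prof", 1234, "/opt/bin/myprog"))
```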
FILES
a.out executable file containing namelist
gmon.out dynamic call-graph and profile
gmon.sum summarized dynamic call-graph and profile
$PROFDIR/pid.programname
SEE ALSO
cc(1), ld.so.1(1), exit(2), pcsample(2), profil(2), malloc(3C),
monitor(3C), malloc(3MALLOC), attributes(7), prof(7)
Graham, S.L., Kessler, P.B., McKusick, M.K., gprof: A Call Graph
Execution Profiler, Proceedings of the SIGPLAN '82 Symposium on Compiler
Construction, SIGPLAN Notices, Vol. 17, No. 6, pp. 120-126, June 1982.
Linker and Libraries Guide
NOTES
If the executable image has been stripped and does not have the .symtab
symbol table, gprof reads the global dynamic symbol tables .dynsym and
.SUNW_ldynsym, if present. The symbols in the dynamic symbol tables are
a subset of the symbols that are found in .symtab. The .dynsym symbol
table contains the global symbols used by the runtime linker.
.SUNW_ldynsym augments the information in .dynsym with local function
symbols. In the case where .dynsym is found and .SUNW_ldynsym is not,
only the information for the global symbols is available. Without local
symbols, the behavior is as described for the -a option.
LD_LIBRARY_PATH must not contain /usr/lib as a component when compiling a
program for profiling. If LD_LIBRARY_PATH contains /usr/lib, the
program will not be linked correctly with the profiling versions of the
system libraries in /usr/lib/libp.
The times reported in successive identical runs may show variances
because of varying cache-hit ratios that result from sharing the cache
with other processes. Even if a program seems to be the only one using
the machine, hidden background or asynchronous processes may blur the
data. In rare cases, the clock ticks initiating recording of the program
counter may beat with loops in a program, grossly distorting
measurements. Call counts are always recorded precisely, however.
Only programs that call exit or return from main are guaranteed to
produce a profile file, unless a final call to monitor is explicitly
coded.
Functions such as mcount(), _mcount(), moncontrol(), _moncontrol(),
monitor(), and _monitor() may appear in the gprof report. These
functions are part of the profiling implementation and thus account for
some amount of the runtime overhead. Since these functions are not
present in an unprofiled application, time accumulated and call counts
for these functions may be ignored when evaluating the performance of an
application.
64-bit profiling
64-bit profiling may be used freely with dynamically linked executables,
and profiling information is collected for the shared objects if the
objects are compiled for profiling. Care must be taken when interpreting
the profile output, since it is possible for symbols from different shared
objects to have the same name. If name duplication occurs in the profile
output, the module id prefix before the symbol name in the symbol index
listing can be used to identify the appropriate module for the symbol.
When using the -s or -D option to sum multiple profile files, care must be
taken not to mix 32-bit profile files with 64-bit profile files.
32-bit profiling
32-bit profiling may be used with dynamically linked executables, but
care must be taken. In 32-bit profiling, shared objects cannot be
profiled with gprof. Thus, when a profiled, dynamically linked program is
executed, only the main portion of the image is sampled. This means that
all time spent outside of the main object, that is, time spent in a
shared object, will not be included in the profile summary; the total
time reported for the program may be less than the total time used by the
program.
Because the time spent in a shared object cannot be accounted for, the
use of shared objects should be minimized whenever a program is profiled
with gprof. To get profile information on the functions of a library,
link the program to the profiled version of the library (or to the
standard archive version if no profiling version is available) instead
of the shared object. Versions of profiled libraries
may be supplied with the system in the /usr/lib/libp directory. Refer to
compiler driver documentation on profiling.
Consider an extreme case. A profiled program dynamically linked with the
shared C library spends 100 units of time in some libc routine, say,
malloc(). Suppose malloc() is called only from routine B and B consumes
only 1 unit of time. Suppose further that routine A consumes 10 units of
time, more than any other routine in the main (profiled) portion of the
image. In this case, gprof will conclude that most of the time is being
spent in A and almost no time is being spent in B. From this it will be
almost impossible to tell that the greatest improvement can be made by
looking at routine B and not routine A. The value of the profiler in
this case is severely degraded; the solution is to use archives as much
as possible for profiling.
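The arithmetic behind this extreme case can be worked through with a
small Python sketch. It uses the figures from the paragraph above and
assumes, for simplicity, that A and B are the only routines in the main
portion of the image; the variable names are hypothetical.

```python
# Time actually consumed by each routine, in the units used above.
actual = {"malloc": 100, "A": 10, "B": 1}

# Routines living in a shared object, which 32-bit profiling cannot sample.
shared_objects = {"malloc"}

# Only time spent in the main (profiled) portion is visible to the profiler.
visible = {f: t for f, t in actual.items() if f not in shared_objects}
visible_total = sum(visible.values())

# Percentage of the reported total attributed to each visible routine.
reported = {f: 100.0 * t / visible_total for f, t in visible.items()}

# Real cost of B once the unseen time in malloc() is charged to its caller.
true_b = actual["B"] + actual["malloc"]

for name, pct in reported.items():
    print(f"{name}: {pct:.1f}% of reported time")
print(f"B including malloc(): {true_b} of {sum(actual.values())} units")
```

Under these assumptions A appears to account for roughly nine tenths of
the reported time, even though B and the malloc() calls it makes are
responsible for 101 of the 111 units actually consumed.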
BUGS
Parents which are not themselves profiled will have the time of their
profiled children propagated to them, but they will appear to be
spontaneously invoked in the call-graph listing, and will not have their
time propagated further. Similarly, signal catchers, even though
profiled, will appear to be spontaneous (although for more obscure
reasons). Any profiled children of signal catchers should have their
times propagated properly, unless the signal catcher was invoked during
the execution of the profiling routine, in which case all is lost.
April 10, 2023
GPROF(1)