Skip site navigation (1)Skip section navigation (2)

FreeBSD Manual Pages

  
 
  

home | help
fprof(3)		   Erlang Module Definition		      fprof(3)

NAME
       fprof  -	 A Time	Profiling Tool using trace to file for minimal runtime
       performance impact.

DESCRIPTION
       This module is used to profile a	program	to find	out how	the  execution
       time is used. Trace to file is used to minimize runtime performance im-
       pact.

       The fprof module	uses tracing to	collect	profiling data,	hence there is
       no  need	 for special compilation of any	module to be profiled. When it
       starts tracing, fprof will erase	all previous tracing in	the  node  and
       set the necessary trace flags on	the profiling target processes as well
       as local	call trace on all functions in all loaded modules and all mod-
       ules  to	 be loaded. fprof erases all tracing in	the node when it stops
       tracing.

       fprof presents both own time i.e	how much time a	function has used  for
       its own execution, and accumulated time i.e including called functions.
       All presented times are collected using trace timestamps.  fprof	 tries
       to  collect  cpu	 time  timestamps, if the host machine OS supports it.
       Therefore the times may be wallclock times and OS scheduling will  ran-
       domly strike all	called functions in a presumably fair way.

       If,  however, the profiling time	is short, and the host machine OS does
       not support high	resolution cpu time measurements, some few OS schedul-
       ings may	show up	as ridiculously	long execution times for functions do-
       ing practically nothing.	An example of a	function  more	or  less  just
       composing a tuple in about 100 times the	normal execution time has been
       seen, and when the tracing was repeated,	the execution time became nor-
       mal.

       Profiling is essentially	done in	3 steps:

	 1:
	   Tracing; to file, as	mentioned in the previous paragraph. The trace
	   contains entries for	function calls,	returns	to  function,  process
	   scheduling,	other process related (spawn, etc) events, and garbage
	   collection. All trace entries are timestamped.

	 2:
	   Profiling; the trace	file is	read, the execution call stack is sim-
	   ulated,  and	raw profile data is calculated from the	simulated call
	   stack and the trace timestamps. The profile data is stored  in  the
	   fprof  server  state. During	this step the trace data may be	dumped
	   in text format to file or console.

	 3:
	   Analysing; the raw profile data is sorted, filtered and  dumped  in
	   text	 format	either to file or console. The text format intended to
	   be both readable for	a human	reader,	as well	as parsable  with  the
	   standard erlang parsing tools.

       Since  fprof uses trace to file,	the runtime performance	degradation is
       minimized, but still far	from negligible, especially for	programs  that
       use  the	 filesystem  heavily  by themselves. Where you place the trace
       file is also important, e.g on Solaris /tmp is usually  a  good	choice
       since  it  is  essentially  a RAM disk, while any NFS (network) mounted
       disk is a bad idea.

       fprof can also skip the file step and trace to a	 tracer	 process  that
       does the	profiling in runtime.

EXPORTS
       start() -> {ok, Pid} | {error, {already_started,	Pid}}

	      Types:

		 Pid = pid()

	      Starts the fprof server.

	      Note  that  it seldom needs to be	started	explicitly since it is
	      automatically started by	the  functions	that  need  a  running
	      server.

       stop() -> ok

	      Same as stop(normal).

       stop(Reason) -> ok

	      Types:

		 Reason	= term()

	      Stops the	fprof server.

	      The  supplied  Reason  becomes  the  exit	 reason	for the	server
	      process. Default Any Reason other	than kill sends	a  request  to
	      the server and waits for it to clean up, reply and exit. If Rea-
	      son is kill, the server is bluntly killed.

	      If the fprof server is not running, this function	returns	 imme-
	      diately with the same return value.

	  Note:
	      When  the	fprof server is	stopped	the collected raw profile data
	      is lost.

       apply(Func, Args) -> term()

	      Types:

		 Func =	function() | {Module ::	module(), Function :: atom()}
		 Args =	[term()]

	      Same as apply(Func, Args,	[]).

       apply(Module, Function, Args) ->	term()

	      Types:

		 Module	= module()
		 Function = atom()
		 Args =	[term()]

	      Same as apply({Module, Function},	Args, []).

       apply(Func, Args, OptionList) ->	term()

	      Types:

		 Func =	function() | {Module ::	module(), Function :: atom()}
		 Args =	[term()]
		 OptionList = [Option]
		 Option	= apply_option()
		 apply_option()	=
		     continue |
		     {procs, PidList ::	[pid()]} |
		     start |
		     (TraceStartOption :: trace_option())
		 trace_option()	=
		     cpu_time |
		     {cpu_time,	boolean()} |
		     file |
		     {file, Filename ::	file:filename()} |
		     {procs, PidSpec ::	pid_spec()} |
		     {procs, [PidSpec :: pid_spec()]} |
		     start | stop |
		     {tracer, Tracer ::	pid() |	port()}	|
		     verbose |
		     {verbose, boolean()}
		 pid_spec() = pid() | atom()

	      Calls erlang:apply(Func, Args) surrounded	by trace([start, ...])
	      and trace(stop).

	      Some  effort  is	made  to keep the trace	clean from unnecessary
	      trace messages; tracing is started and stopped  from  a  spawned
	      process  while  the  erlang:apply/2  call	is made	in the current
	      process, only surrounded by receive and send statements  towards
	      the  trace  starting  process.  The trace	starting process exits
	      when not needed any more.

	      The TraceStartOption is any option allowed for trace/1. The  op-
	      tions  [start,  {procs, [self() |	PidList]} | OptList] are given
	      to trace/1, where	OptList	is OptionList with continue, start and
	      {procs, _} options removed.

	      The  continue option inhibits the	call to	trace(stop) and	leaves
	      it up to the caller to stop tracing at a suitable	time.

       apply(Module, Function, Args, OptionList) -> term()

	      Types:

		 Module	= module()
		 Function = atom()
		 Args =	[term()]
		 OptionList = [Option]
		 Option	= apply_option()
		 apply_option()	=
		     continue |
		     {procs, PidList ::	[pid()]} |
		     start |
		     (TraceStartOption :: trace_option())
		 trace_option()	=
		     cpu_time |
		     {cpu_time,	boolean()} |
		     file |
		     {file, Filename ::	file:filename()} |
		     {procs, PidSpec ::	pid_spec()} |
		     {procs, [PidSpec :: pid_spec()]} |
		     start | stop |
		     {tracer, Tracer ::	pid() |	port()}	|
		     verbose |
		     {verbose, boolean()}
		 pid_spec() = pid() | atom()

	      Same as apply({Module, Function},	Args, OptionList).

	      OptionList is an option list allowed for apply/3.

       trace(OptionName	:: start, Filename) ->
		ok | {error, Reason} | {'EXIT',	ServerPid, Reason}

	      Types:

		 Filename = file:filename()
		 ServerPid = pid()
		 Reason	= term()

	      Same as trace([start, {file, Filename}]).

       trace(OptionName	:: verbose, Filename) ->
		ok | {error, Reason} | {'EXIT',	ServerPid, Reason}

	      Types:

		 Filename = file:filename()
		 ServerPid = pid()
		 Reason	= term()

	      Same as trace([start, verbose, {file, Filename}]).

       trace(OptionName, OptionValue) ->
		ok | {error, Reason} | {'EXIT',	ServerPid, Reason}

	      Types:

		 OptionName = atom()
		 OptionValue = term()
		 ServerPid = pid()
		 Reason	= term()

	      Same as trace([{OptionName, OptionValue}]).

       trace(Option :: verbose)	->
		ok | {error, Reason} | {'EXIT',	ServerPid, Reason}

	      Types:

		 ServerPid = pid()
		 Reason	= term()

	      Same as trace([start, verbose]).

       trace(OptionName) ->
		ok | {error, Reason} | {'EXIT',	ServerPid, Reason}

	      Types:

		 OptionName = atom()
		 ServerPid = pid()
		 Reason	= term()

	      Same as trace([OptionName]).

       trace(Option :: {OptionName, OptionValue}) ->
		ok | {error, Reason} | {'EXIT',	ServerPid, Reason}

	      Types:

		 OptionName = atom()
		 OptionValue = term()
		 ServerPid = pid()
		 Reason	= term()

	      Same as trace([{OptionName, OptionValue}]).

       trace(OptionList) ->
		ok | {error, Reason} | {'EXIT',	ServerPid, Reason}

	      Types:

		 OptionList = [Option]
		 Option	= trace_option()
		 ServerPid = pid()
		 Reason	= term()
		 trace_option()	=
		     cpu_time |
		     {cpu_time,	boolean()} |
		     file |
		     {file, Filename ::	file:filename()} |
		     {procs, PidSpec ::	pid_spec()} |
		     {procs, [PidSpec :: pid_spec()]} |
		     start | stop |
		     {tracer, Tracer ::	pid() |	port()}	|
		     verbose |
		     {verbose, boolean()}
		 pid_spec() = pid() | atom()

	      Starts or	stops tracing.

	      PidSpec and Tracer are used in  calls  to	 erlang:trace(PidSpec,
	      true,  [{tracer, Tracer} | Flags]), and Filename is used to call
	      dbg:trace_port(file, Filename). Please  see  erlang:trace/3  and
	      dbg:trace_port/2.

	      Option description:

		stop:
		  Stops	 a running fprof trace and clears all tracing from the
		  node.	Either option stop or start must be specified, but not
		  both.

		start:
		  Clears  all  tracing	from  the  node	and starts a new fprof
		  trace. Either	option start or	stop must  be  specified,  but
		  not both.

		verbose	| {verbose, boolean()}:
		  The options verbose or {verbose, true} adds some trace flags
		  that fprof does not need, but	that may  be  interesting  for
		  general debugging purposes. This option is only allowed with
		  the start option.

		cpu_time | {cpu_time, boolean()}:
		  The options cpu_time or {cpu_time,  true}  makes  the	 time-
		  stamps in the	trace be in CPU	time instead of	wallclock time
		  which	is the default.	This option is only allowed  with  the
		  start	option.

	    Warning:
		Getting	 correct  values out of	cpu_time can be	difficult. The
		best way to get	correct	values is to run using a single	sched-
		uler  and bind that scheduler to a specific CPU, i.e. erl +S 1
		+sbt db.

		{procs,	PidSpec} | {procs, [PidSpec]}:
		  Specifies which processes that shall be traced. If this  op-
		  tion	is  not	given, the calling process is traced. All pro-
		  cesses spawned by the	traced processes are also traced. This
		  option is only allowed with the start	option.

		file | {file, Filename}:
		  Specifies  the  filename of the trace. If the	option file is
		  given,  or  none  of	these  options	are  given,  the  file
		  "fprof.trace"	 is used. This option is only allowed with the
		  start	option,	but not	with the {tracer, Tracer} option.

		{tracer, Tracer}:
		  Specifies that trace to process or port shall	 be  done  in-
		  stead	of trace to file. This option is only allowed with the
		  start	option,	but not	with the {file,	Filename} option.

       profile() -> ok | {error, Reason} | {'EXIT', ServerPid, Reason}

	      Types:

		 ServerPid = pid()
		 Reason	= term()

	      Same as profile([]).

       profile(OptionName, OptionValue)	->
		  ok |
		  {ok, Tracer} |
		  {error, Reason} |
		  {'EXIT', ServerPid, Reason}

	      Types:

		 OptionName = atom()
		 OptionValue = term()
		 Tracer	= ServerPid = pid()
		 Reason	= term()

	      Same as profile([{OptionName, OptionValue}]).

       profile(OptionName) ->
		  ok |
		  {ok, Tracer} |
		  {error, Reason} |
		  {'EXIT', ServerPid, Reason}

	      Types:

		 OptionName = atom()
		 Tracer	= ServerPid = pid()
		 Reason	= term()

	      Same as profile([OptionName]).

       profile(Option :: {OptionName, OptionValue}) ->
		  ok |
		  {ok, Tracer} |
		  {error, Reason} |
		  {'EXIT', ServerPid, Reason}

	      Types:

		 OptionName = atom()
		 OptionValue = term()
		 Tracer	= ServerPid = pid()
		 Reason	= term()

	      Same as profile([{OptionName, OptionValue}]).

       profile(OptionList) ->
		  ok |
		  {ok, Tracer} |
		  {error, Reason} |
		  {'EXIT', ServerPid, Reason}

	      Types:

		 OptionList = [Option]
		 Option	= profile_option()
		 Tracer	= ServerPid = pid()
		 Reason	= term()
		 profile_option() =
		     append | dump |
		     {dump, pid() | (Dump :: (Dumpfile :: file:filename() | []))} |
		     file |
		     {file, Filename ::	file:filename()} |
		     start | stop

	      Compiles a trace into raw	profile	data held by the fprof server.

	      Dumpfile is used to call file:open/2, and	Filename  is  used  to
	      call  dbg:trace_port(file, Filename). Please see file:open/2 and
	      dbg:trace_port/2.

	      Option description:

		file | {file, Filename}:
		  Reads	the file Filename and creates raw profile data that is
		  stored  in  RAM  by  the fprof server. If the	option file is
		  given,  or  none  of	these  options	are  given,  the  file
		  "fprof.trace"	 is  read. The call will return	when the whole
		  trace	has been read with the return value ok if  successful.
		  This option is not allowed with the start or stop options.

		dump | {dump, Dump}:
		  Specifies  the  destination for the trace text dump. If this
		  option is not	given, no dump is generated, if	it is dump the
		  destination will be the caller's group leader, otherwise the
		  destination Dump is either the pid of	an  I/O	 device	 or  a
		  filename. And, finally, if the filename is []	- "fprof.dump"
		  is used instead. This	option is not allowed  with  the  stop
		  option.

		append:
		  Causes the trace text	dump to	be appended to the destination
		  file.	This option is only allowed with the {dump,  Dumpfile}
		  option.

		start:
		  Starts a tracer process that profiles	trace data in runtime.
		  The call will	return immediately with	the return value  {ok,
		  Tracer}  if  successful. This	option is not allowed with the
		  stop,	file or	{file, Filename} options.

		stop:
		  Stops	the tracer process that	profiles trace	data  in  run-
		  time.	 The return value will be value	ok if successful. This
		  option is not	allowed	with the start,	file or	 {file,	 File-
		  name}	options.

       analyse() -> ok | {error, Reason} | {'EXIT', ServerPid, Reason}

	      Types:

		 ServerPid = pid()
		 Reason	= term()

	      Same as analyse([]).

       analyse(OptionName, OptionValue)	->
		  ok | {error, Reason} | {'EXIT', ServerPid, Reason}

	      Types:

		 OptionName = atom()
		 OptionValue = term()
		 ServerPid = pid()
		 Reason	= term()

	      Same as analyse([{OptionName, OptionValue}]).

       analyse(OptionName) ->
		  ok | {error, Reason} | {'EXIT', ServerPid, Reason}

	      Types:

		 OptionName = atom()
		 ServerPid = pid()
		 Reason	= term()

	      Same as analyse([OptionName]).

       analyse(Option :: {OptionName, OptionValue}) ->
		  ok | {error, Reason} | {'EXIT', ServerPid, Reason}

	      Types:

		 OptionName = atom()
		 OptionValue = term()
		 ServerPid = pid()
		 Reason	= term()

	      Same as analyse([{OptionName, OptionValue}]).

       analyse(OptionList) ->
		  ok | {error, Reason} | {'EXIT', ServerPid, Reason}

	      Types:

		 OptionList = [Option]
		 Option	= analyse_option()
		 ServerPid = pid()
		 Reason	= term()
		 analyse_option() =
		     append | callers |
		     {callers, boolean()} |
		     {cols, Cols :: integer() >= 0} |
		     dest |
		     {dest, Dest :: pid() | (Destfile :: file:filename())} |
		     details |
		     {details, boolean()} |
		     no_callers	| no_details |
		     {sort, SortSpec ::	acc | own} |
		     totals |
		     {totals, boolean()}

	      Analyses	raw  profile data in the fprof server. If called while
	      there is no raw profile data available, {error,  no_profile}  is
	      returned.

	      Destfile is used to call file:open/2.

	      Option description:

		dest | {dest, Dest}:
		  Specifies  the  destination for the analysis.	If this	option
		  is not given or it is	dest,  the  destination	 will  be  the
		  caller's group leader, otherwise the destination Dest	is ei-
		  ther the pid() of an I/O device or a filename. And, finally,
		  if the filename is []	- "fprof.analysis" is used instead.

		append:
		  Causes  the analysis to be appended to the destination file.
		  This option is only allowed with the	{dest,	Destfile}  op-
		  tion.

		{cols, Cols}:
		  Specifies  the  number  of  columns in the analysis text. If
		  this option is not given the number of columns is set	to 80.

		callers	| {callers, true}:
		  Prints callers and called information	in the analysis.  This
		  is the default.

		{callers, false} | no_callers:
		  Suppresses the printing of callers and called	information in
		  the analysis.

		{sort, SortSpec}:
		  Specifies if the analysis should be sorted according to  the
		  ACC  column,	which  is  the default,	or the OWN column. See
		  Analysis Format below.

		totals | {totals, true}:
		  Includes a section containing	call statistics	for all	 calls
		  regardless of	process, in the	analysis.

		{totals, false}:
		  Supresses  the  totals section in the	analysis, which	is the
		  default.

		details	| {details, true}:
		  Prints call statistics for each  process  in	the  analysis.
		  This is the default.

		{details, false} | no_details:
		  Suppresses  the  call	 statistics  for each process from the
		  analysis.

ANALYSIS FORMAT
       This section describes the output format	of the	analyse	 command.  See
       analyse/0.

       The  format is parsable with the	standard Erlang	parsing	tools erl_scan
       and erl_parse, file:consult/1 or	io:read/2. The parse format is not ex-
       plained here - it should	be easy	for the	interested to try it out. Note
       that some flags to analyse/1 will affect	the format.

       The following example was run on	OTP/R8 on Solaris 8, all OTP internals
       in this example are very	version	dependent.

       As  an example, we will use the following function, that	you may	recog-
       nise as	a  slightly  modified  benchmark  function  from  the  manpage
       file(3):

       -module(foo).
       -export([create_file_slow/2]).

       create_file_slow(Name, N) when is_integer(N), N >= 0 ->
	   {ok,	FD} =
	       file:open(Name, [raw, write, delayed_write, binary]),
	   if N	> 256 ->
		   ok =	file:write(FD,
				   lists:map(fun (X) ->	<<X:32/unsigned>> end,
				   lists:seq(0,	255))),
		   ok =	create_file_slow(FD, 256, N);
	      true ->
		   ok =	create_file_slow(FD, 0,	N)
	   end,
	   ok =	file:close(FD).

       create_file_slow(FD, M, M) ->
	   ok;
       create_file_slow(FD, M, N) ->
	   ok =	file:write(FD, <<M:32/unsigned>>),
	   create_file_slow(FD,	M+1, N).

       Let us have a look at the printout after	running:

       1> fprof:apply(foo, create_file_slow, [junk, 1024]).
       2> fprof:profile().
       3> fprof:analyse().

       The printout starts with:

       %% Analysis results:
       {  analysis_options,
	[{callers, true},
	 {sort,	acc},
	 {totals, false},
	 {details, true}]}.

       %				       CNT	 ACC	   OWN
       [{ totals,			      9627, 1691.119, 1659.074}].  %%%

       The  CNT	column shows the total number of function calls	that was found
       in the trace. In	the ACC	column is the total time  of  the  trace  from
       first timestamp to last.	And in the OWN column is the sum of the	execu-
       tion time in functions found in the trace, not including	 called	 func-
       tions. In this case it is very close to the ACC time since the emulator
       had practically nothing else to do than to execute our test program.

       All time	values in the printout are in milliseconds.

       The printout continues:

       %				       CNT	 ACC	   OWN
       [{ "<0.28.0>",			      9627,undefined, 1659.074}].   %%

       This is the printout header of one process. The printout	contains  only
       this  one process since we did fprof:apply/3 which traces only the cur-
       rent process. Therefore the CNT and OWN columns perfectly  matches  the
       totals  above.  The ACC column is undefined since summing the ACC times
       of all calls in the process makes no sense - you	 would	get  something
       like the	ACC value from totals above multiplied by the average depth of
       the call	stack, or something.

       All paragraphs up to the	next process  header  only  concerns  function
       calls within this process.

       Now we come to something	more interesting:

       {[{undefined,				 0, 1691.076,	 0.030}],
	{ {fprof,apply_start_stop,4},		 0, 1691.076,	 0.030},     %
	[{{foo,create_file_slow,2},		 1, 1691.046,	 0.103},
	 {suspend,				 1,    0.000,	 0.000}]}.

       {[{{fprof,apply_start_stop,4},		 1, 1691.046,	 0.103}],
	{ {foo,create_file_slow,2},		 1, 1691.046,	 0.103},     %
	[{{file,close,1},			 1, 1398.873,	 0.019},
	 {{foo,create_file_slow,3},		 1,  249.678,	 0.029},
	 {{file,open,2},			 1,   20.778,	 0.055},
	 {{lists,map,2},			 1,   16.590,	 0.043},
	 {{lists,seq,2},			 1,    4.708,	 0.017},
	 {{file,write,2},			 1,    0.316,	 0.021}]}.

       The  printout  consists of one paragraph	per called function. The func-
       tion marked with	'%' is the  one	 the  paragraph	 concerns  -  foo:cre-
       ate_file_slow/2.	 Above the marked function are the calling functions -
       those that has called the marked, and below are	those  called  by  the
       marked function.

       The  paragraphs	are  per default sorted	in decreasing order of the ACC
       column for the marked function. The calling list	and called list	within
       one  paragraph are also per default sorted in decreasing	order of their
       ACC column.

       The columns are:	CNT - the  number  of  times  the  function  has  been
       called,	ACC  -	the  time spent	in the function	including called func-
       tions, and OWN -	the time spent in the function	not  including	called
       functions.

       The  rows  for  the calling functions contain statistics	for the	marked
       function	with the constraint that only the occasions when  a  call  was
       made from the row's function to the marked function are accounted for.

       The  row	for the	marked function	simply contains	the sum	of all calling
       rows.

       The rows	for the	called functions contains  statistics  for  the	 row's
       function	 with  the  constraint that only the occasions when a call was
       made from the marked to the row's function are accounted	for.

       So, we see that foo:create_file_slow/2 used very	little	time  for  its
       own  execution. It spent	most of	its time in file:close/1. The function
       foo:create_file_slow/3 that writes 3/4 of the file contents is the sec-
       ond biggest time	thief.

       We  also	 see that the call to file:write/2 that	writes 1/4 of the file
       contents	takes very little time in itself. What takes time is to	 build
       the data	(lists:seq/2 and lists:map/2).

       The function 'undefined'	that has called	fprof:apply_start_stop/4 is an
       unknown function	because	that call was not recorded in  the  trace.  It
       was   only   recorded   that  the  execution  returned  from  fprof:ap-
       ply_start_stop/4	to some	other function above in	 the  call  stack,  or
       that the	process	exited from there.

       Let us continue down the	printout to find:

       {[{{foo,create_file_slow,2},		 1,  249.678,	 0.029},
	 {{foo,create_file_slow,3},	       768,    0.000,	23.294}],
	{ {foo,create_file_slow,3},	       769,  249.678,	23.323},     %
	[{{file,write,2},		       768,  220.314,	14.539},
	 {suspend,				57,    6.041,	 0.000},
	 {{foo,create_file_slow,3},	       768,    0.000,	23.294}]}.

       If  you	compare	 with  the  code you will see there also that foo:cre-
       ate_file_slow/3 was called only from foo:create_file_slow/2 and itself,
       and called only file:write/2, note the number of	calls to file:write/2.
       But here	we see that suspend was	called a few times. This is  a	pseudo
       function	 that indicates	that the process was suspended while executing
       in foo:create_file_slow/3,  and	since  there  is  no  receive  or  er-
       lang:yield/0  in	the code, it must be Erlang scheduling suspensions, or
       the trace file driver compensating  for	large  file  write  operations
       (these  are regarded as a schedule out followed by a schedule in	to the
       same process).

       Let us find the suspend entry:

       {[{{file,write,2},			53,    6.281,	 0.000},
	 {{foo,create_file_slow,3},		57,    6.041,	 0.000},
	 {{prim_file,drv_command,4},		50,    4.582,	 0.000},
	 {{prim_file,drv_get_response,1},	34,    2.986,	 0.000},
	 {{lists,map,2},			10,    2.104,	 0.000},
	 {{prim_file,write,2},			17,    1.852,	 0.000},
	 {{erlang,port_command,2},		15,    1.713,	 0.000},
	 {{prim_file,drv_command,2},		22,    1.482,	 0.000},
	 {{prim_file,translate_response,2},	11,    1.441,	 0.000},
	 {{prim_file,'-drv_command/2-fun-0-',1},  15,	 1.340,	   0.000},
	 {{lists,seq,4},			 3,    0.880,	 0.000},
	 {{foo,'-create_file_slow/2-fun-0-',1},	  5,	0.523,	  0.000},
	 {{erlang,bump_reductions,1},		 4,    0.503,	 0.000},
	 {{prim_file,open_int_setopts,3},	 1,    0.165,	 0.000},
	 {{prim_file,i32,4},			 1,    0.109,	 0.000},
	 {{fprof,apply_start_stop,4},		 1,    0.000,	 0.000}],
	{ suspend,			       299,   32.002,	 0.000},     %
	[ ]}.

       We find no particulary long suspend times, so no	function seems to have
       waited  in  a receive statement.	Actually, prim_file:drv_command/4 con-
       tains a receive statement, but in this test program, the	 message  lies
       in the process receive buffer when the receive statement	is entered. We
       also see	that the total suspend time for	the test run is	small.

       The suspend pseudo function has got an OWN time of  zero.  This	is  to
       prevent	the  process total OWN time from including time	in suspension.
       Whether suspend time is really ACC or OWN time is more of a philosophi-
       cal question.

       Now we look at another interesting pseudo function, garbage_collect:

       {[{{prim_file,drv_command,4},		25,    0.873,	 0.873},
	 {{prim_file,write,2},			16,    0.692,	 0.692},
	 {{lists,map,2},			 2,    0.195,	 0.195}],
	{ garbage_collect,			43,    1.760,	 1.760},     %
	[ ]}.

       Here  we	 see that no function distinguishes itself considerably, which
       is very normal.

       The garbage_collect pseudo function has not got an  OWN	time  of  zero
       like suspend, instead it	is equal to the	ACC time.

       Garbage	collect	 often	occurs while a process is suspended, but fprof
       hides this fact by pretending that the suspended	function was first un-
       suspended and then garbage collected. Otherwise the printout would show
       garbage_collect being called from suspend but not which	function  that
       might have caused the garbage collection.

       Let us now get back to the test code:

       {[{{foo,create_file_slow,3},	       768,  220.314,	14.539},
	 {{foo,create_file_slow,2},		 1,    0.316,	 0.021}],
	{ {file,write,2},		       769,  220.630,	14.560},     %
	[{{prim_file,write,2},		       769,  199.789,	22.573},
	 {suspend,				53,    6.281,	 0.000}]}.

       Not  unexpectedly,  we  see  that file:write/2 was called from foo:cre-
       ate_file_slow/3 and foo:create_file_slow/2. The number of calls in each
       case  as	 well as the used time are also	just confirms the previous re-
       sults.

       We see that file:write/2	only calls prim_file:write/2, but let  us  re-
       frain from digging into the internals of	the kernel application.

       But,  if	 we nevertheless do dig	down we	find the call to the linked in
       driver that does	the file operations towards the	host operating system:

       {[{{prim_file,drv_command,4},	       772, 1458.356, 1456.643}],
	{ {erlang,port_command,2},	       772, 1458.356, 1456.643},     %
	[{suspend,				15,    1.713,	 0.000}]}.

       This is 86 % of the total run time, and as we  saw  before  it  is  the
       close  operation	the absolutely biggest contributor. We find a compari-
       son ratio a little bit up in the	call stack:

       {[{{prim_file,close,1},			 1, 1398.748,	 0.024},
	 {{prim_file,write,2},		       769,  174.672,	12.810},
	 {{prim_file,open_int,4},		 1,   19.755,	 0.017},
	 {{prim_file,open_int_setopts,3},	 1,    0.147,	 0.016}],
	{ {prim_file,drv_command,2},	       772, 1593.322,	12.867},     %
	[{{prim_file,drv_command,4},	       772, 1578.973,	27.265},
	 {suspend,				22,    1.482,	 0.000}]}.

       The time	for file operations in the linked in driver distributes	itself
       as  1 % for open, 11 % for write	and 87 % for close. All	data is	proba-
       bly buffered in the operating system until the close.

       The  unsleeping	reader	 may   notice	that   the   ACC   times   for
       prim_file:drv_command/2	and  prim_file:drv_command/4  is not equal be-
       tween the paragraphs above, even	though it  is  easy  to	 believe  that
       prim_file:drv_command/2 is just a passthrough function.

       The  missing  time can be found in the paragraph	for prim_file:drv_com-
       mand/4 where it is evident that	not  only  prim_file:drv_command/2  is
       called but also a fun:

       {[{{prim_file,drv_command,2},	       772, 1578.973,	27.265}],
	{ {prim_file,drv_command,4},	       772, 1578.973,	27.265},     %
	[{{erlang,port_command,2},	       772, 1458.356, 1456.643},
	 {{prim_file,'-drv_command/2-fun-0-',1}, 772,	87.897,	  12.736},
	 {suspend,				50,    4.582,	 0.000},
	 {garbage_collect,			25,    0.873,	 0.873}]}.

       And  some  more	missing	 time  can  be	explained  by  the  fact  that
       prim_file:open_int/4 both  calls	 prim_file:drv_command/2  directly  as
       well  as	 through  prim_file:open_int_setopts/3,	 which complicates the
       picture.

       {[{{prim_file,open,2},			 1,   20.309,	 0.029},
	 {{prim_file,open_int,4},		 1,    0.000,	 0.057}],
	{ {prim_file,open_int,4},		 2,   20.309,	 0.086},     %
	[{{prim_file,drv_command,2},		 1,   19.755,	 0.017},
	 {{prim_file,open_int_setopts,3},	 1,    0.360,	 0.032},
	 {{prim_file,drv_open,2},		 1,    0.071,	 0.030},
	 {{erlang,list_to_binary,1},		 1,    0.020,	 0.020},
	 {{prim_file,i32,1},			 1,    0.017,	 0.017},
	 {{prim_file,open_int,4},		 1,    0.000,	 0.057}]}.
       {[{{prim_file,open_int,4},		 1,    0.360,	 0.032},
	 {{prim_file,open_int_setopts,3},	 1,    0.000,	 0.016}],
	{ {prim_file,open_int_setopts,3},	 2,    0.360,	 0.048},     %
	[{suspend,				 1,    0.165,	 0.000},
	 {{prim_file,drv_command,2},		 1,    0.147,	 0.016},
	 {{prim_file,open_int_setopts,3},	 1,    0.000,	 0.016}]}.

NOTES
       The actual supervision of execution times is in itself a	CPU  intensive
       activity.  A  message  is  written on the trace file for	every function
       call that is made by the	profiled code.

       The ACC time calculation	is sometimes difficult to make correct,	 since
       it  is difficult	to define. This	happens	especially when	a function oc-
       curs in several instances in the	call stack, for	example	by calling it-
       self  perhaps  through other functions and perhaps even non-tail	recur-
       sively.

       To produce sensible results, fprof tries	not  to	 charge	 any  function
       more  than once for ACC time. The instance highest up (with longest du-
       ration) in the call stack is chosen.

       Sometimes a function may	unexpectedly waste a lot (some 10 ms  or  more
       depending  on  host  machine  OS) of OWN	(and ACC) time,	even functions
       that do practically nothing at all. The problem may be that the OS  has
       chosen  to  schedule out	the Erlang runtime system process for a	while,
       and if the OS does not support high resolution  cpu  time  measurements
       fprof  will use wallclock time for its calculations, and	it will	appear
       as functions randomly burn virtual machine time.

SEE ALSO
       dbg(3), eprof(3), erlang(3), io(3), Tools User's	Guide

Ericsson AB			  tools	3.5.1			      fprof(3)

NAME | DESCRIPTION | EXPORTS | ANALYSIS FORMAT | NOTES | SEE ALSO

Want to link to this manual page? Use this URL:
<https://www.freebsd.org/cgi/man.cgi?query=fprof&sektion=3&manpath=FreeBSD+13.0-RELEASE+and+Ports>

home | help