fprof(3)		   Erlang Module Definition		      fprof(3)

NAME
       fprof  -	 A Time	Profiling Tool using trace to file for minimal runtime
       performance impact.

DESCRIPTION
       This module is used to profile a	program	to find	out how	the  execution
       time is used. Trace to file is used to minimize runtime performance im-
       pact.

       The fprof module	uses tracing to	collect	profiling data,	hence there is
       no  need	 for special compilation of any	module to be profiled. When it
       starts tracing, fprof will erase	all previous tracing in	the  node  and
       set the necessary trace flags on	the profiling target processes as well
       as local	call trace on all functions in all loaded modules and all mod-
       ules  to	 be loaded. fprof erases all tracing in	the node when it stops
       tracing.

       fprof presents both own time, i.e. how much time a function has used
       for its own execution, and accumulated time, i.e. including called
       functions. All presented times are collected using trace timestamps.
       fprof tries to collect CPU time timestamps if the host machine OS
       supports it; otherwise the times are wallclock times, and OS
       scheduling will randomly strike all called functions in a presumably
       fair way.

       If, however, the profiling time is short and the host machine OS does
       not support high-resolution CPU time measurements, a few OS
       schedulings may show up as unreasonably long execution times for
       functions doing practically nothing. For example, a function doing
       little more than composing a tuple has been seen to take about 100
       times its normal execution time; when the tracing was repeated, the
       execution time became normal.

       Profiling is essentially	done in	3 steps:

	 1:
	   Tracing; to file, as	mentioned in the previous paragraph. The trace
	   contains entries for	function calls,	returns	to  function,  process
	   scheduling,	other process related (spawn, etc) events, and garbage
	   collection. All trace entries are timestamped.

	 2:
	   Profiling; the trace	file is	read, the execution call stack is sim-
	   ulated,  and	raw profile data is calculated from the	simulated call
	   stack and the trace timestamps. The profile data is stored  in  the
	   fprof  server  state. During	this step the trace data may be	dumped
	   in text format to file or console.

         3:
           Analysing; the raw profile data is sorted, filtered and dumped in
           text format either to file or console. The text format is
           intended to be both readable for a human reader and parsable with
           the standard Erlang parsing tools.

       Since fprof uses trace to file, the runtime performance degradation
       is minimized, but still far from negligible, especially for programs
       that heavily use the filesystem themselves. Where you place the trace
       file is also important; on Solaris, for example, /tmp is usually a
       good choice since it is essentially a RAM disk, while any NFS
       (network) mounted disk is a bad idea.

       fprof can also skip the file step and trace to a	 tracer	 process  that
       does the	profiling in runtime.
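
       Assembled from the functions documented below, a minimal end-to-end
       session could look like the following sketch; my_mod:my_benchmark/0
       is a placeholder for the code to be profiled, and the file names
       match the defaults mentioned in this manual:

       %% Step 1: trace to file while running the code to be profiled.
       ok = fprof:trace([start, {file, "fprof.trace"}]),
       my_mod:my_benchmark(),
       ok = fprof:trace(stop),

       %% Step 2: read the trace file and build raw profile data in the
       %% fprof server.
       ok = fprof:profile({file, "fprof.trace"}),

       %% Step 3: analyse the raw profile data and write the result as text.
       ok = fprof:analyse({dest, "fprof.analysis"}).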

EXPORTS
       start() -> {ok, Pid} | {error, {already_started,	Pid}}

	      Types:

		 Pid = pid()

	      Starts the fprof server.

	      Note  that  it seldom needs to be	started	explicitly since it is
	      automatically started by	the  functions	that  need  a  running
	      server.

       stop() -> ok

	      Same as stop(normal).

       stop(Reason) -> ok

	      Types:

		 Reason	= term()

	      Stops the	fprof server.

	      The supplied Reason becomes the exit reason for the server
	      process; the default (from stop/0) is normal. Any Reason other
	      than kill sends a request to the server and waits for it to
	      clean up, reply and exit. If Reason is kill, the server is
	      bluntly killed.

	      If the fprof server is not running, this function	returns	 imme-
	      diately with the same return value.

	  Note:
	      When  the	fprof server is	stopped	the collected raw profile data
	      is lost.

       apply(Func, Args) -> term()

	      Types:

		 Func =	function() | {Module, Function}
		 Args =	[term()]
		 Module	= atom()
		 Function = atom()

	      Same as apply(Func, Args,	[]).

       apply(Module, Function, Args) ->	term()

	      Types:

		 Args =	[term()]
		 Module	= atom()
		 Function = atom()

	      Same as apply({Module, Function},	Args, []).

       apply(Func, Args, OptionList) ->	term()

	      Types:

		 Func =	function() | {Module, Function}
		 Args =	[term()]
		 OptionList = [Option]
		 Module	= atom()
		 Function = atom()
		 Option = continue | start | {procs, PidList} |
		 TraceStartOption

	      Calls erlang:apply(Func, Args) surrounded	by trace([start, ...])
	      and trace(stop).

	      Some effort is made to keep the trace clean from unnecessary
	      trace messages; tracing is started and stopped from a spawned
	      process while the erlang:apply/2 call is made in the current
	      process, surrounded only by receive and send statements
	      towards the trace-starting process. The trace-starting process
	      exits when it is no longer needed.

	      The  TraceStartOption is any option allowed for trace/1. The op-
	      tions [start, {procs, [self() | PidList]}	| OptList]  are	 given
	      to trace/1, where	OptList	is OptionList with continue, start and
	      {procs, _} options removed.

	      The continue option inhibits the call to trace(stop) and	leaves
	      it up to the caller to stop tracing at a suitable	time.
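
	      As a small sketch (the trace file name below is an arbitrary
	      choice, not an API default), profiling a single call and
	      writing the trace to a non-default file could look like:

	      %% Profile one call to lists:sort/1; the result of the call is
	      %% returned as usual and the trace ends up in "sort.trace".
	      Sorted = fprof:apply({lists, sort}, [[3, 1, 2]],
	                           [{file, "sort.trace"}]).

	      The returned Sorted is the ordinary result of the call; the
	      trace file can then be given to profile/1.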

       apply(Module, Function, Args, OptionList) -> term()

	      Types:

		 Module	= atom()
		 Function = atom()
		 Args =	[term()]

	      Same as apply({Module, Function},	Args, OptionList).

	      OptionList is an option list allowed for apply/3.

       trace(start,  Filename)	->  ok | {error, Reason} | {'EXIT', ServerPid,
       Reason}

	      Types:

		 Reason	= term()

	      Same as trace([start, {file, Filename}]).

       trace(verbose, Filename)	-> ok |	{error,	Reason}	| {'EXIT',  ServerPid,
       Reason}

	      Types:

		 Reason	= term()

	      Same as trace([start, verbose, {file, Filename}]).

       trace(OptionName,  OptionValue)	->  ok	|  {error,  Reason} | {'EXIT',
       ServerPid, Reason}

	      Types:

		 OptionName = atom()
		 OptionValue = term()
		 Reason	= term()

	      Same as trace([{OptionName, OptionValue}]).

       trace(verbose) -> ok | {error, Reason} |	{'EXIT', ServerPid, Reason}

	      Types:

		 Reason	= term()

	      Same as trace([start, verbose]).

       trace(OptionName) -> ok | {error, Reason} | {'EXIT', ServerPid, Reason}

	      Types:

		 OptionName = atom()
		 Reason	= term()

	      Same as trace([OptionName]).

       trace({OptionName, OptionValue})	-> ok |	 {error,  Reason}  |  {'EXIT',
       ServerPid, Reason}

	      Types:

		 OptionName = atom()
		 OptionValue = term()
		 Reason	= term()

	      Same as trace([{OptionName, OptionValue}]).

       trace([Option]) -> ok | {error, Reason} | {'EXIT', ServerPid, Reason}

	      Types:

		 Option	= start	| stop | {procs, PidSpec} | {procs, [PidSpec]}
		 | verbose | {verbose, bool()} | file  |  {file,  Filename}  |
		 {tracer, Tracer}
		 PidSpec = pid() | atom()
		 Tracer	= pid()	| port()
		 Reason	= term()

	      Starts or	stops tracing.

	      PidSpec  and  Tracer  are	used in	calls to erlang:trace(PidSpec,
	      true, [{tracer, Tracer} |	Flags]), and Filename is used to  call
	      dbg:trace_port(file, Filename). Please see the appropriate docu-
	      mentation.

	      Option description:

		stop:
		  Stops	a running fprof	trace and clears all tracing from  the
		  node.	Either option stop or start must be specified, but not
		  both.

		start:
		  Clears all tracing from the node  and	 starts	 a  new	 fprof
		  trace.  Either  option  start	or stop	must be	specified, but
		  not both.

		verbose| {verbose, bool()}:
		  The options verbose and {verbose, true} add some trace
		  flags that fprof does not need, but that may be interesting
		  for general debugging purposes. This option is only allowed
		  with the start option.

		cpu_time| {cpu_time, bool()}:
		  The options cpu_time and {cpu_time, true} make the
		  timestamps in the trace be in CPU time instead of wallclock
		  time, which is the default. This option is only allowed
		  with the start option.

	    Warning:
		Getting	correct	values out of cpu_time can be  difficult.  The
		best way to get	correct	values is to run using a single	sched-
		uler and bind that scheduler to	a specific CPU,	i.e. erl +S  1
		+sbt db.

		{procs,	PidSpec}| {procs, [PidSpec]}:
		  Specifies which processes shall be traced. If this option
		  is not given, the calling process is traced. All processes
		  spawned by the traced processes are also traced. This
		  option is only allowed with the start option.

		file| {file, Filename}:
		  Specifies the	filename of the	trace. If the option  file  is
		  given,  or  none  of	these  options	are  given,  the  file
		  "fprof.trace"	is used. This option is	only allowed with  the
		  start	option,	but not	with the {tracer, Tracer} option.

		{tracer, Tracer}:
		  Specifies that tracing shall be done to a process or port
		  instead of to a file. This option is only allowed with the
		  start option, but not with the {file, Filename} option.
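
	      For runtime profiling without an intermediate trace file, the
	      tracer returned by profile(start) (see profile/1 below) can be
	      given in the {tracer, Tracer} option. A minimal sketch, with
	      my_mod:my_benchmark/0 as a placeholder for the profiled code:

	      %% Start a runtime profiling tracer and trace to it.
	      {ok, Tracer} = fprof:profile(start),
	      ok = fprof:trace([start, {tracer, Tracer}]),
	      my_mod:my_benchmark(),
	      ok = fprof:trace(stop),
	      ok = fprof:profile(stop),
	      ok = fprof:analyse([]).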

       profile() -> ok | {error, Reason} | {'EXIT', ServerPid, Reason}

	      Types:

		 Reason	= term()

	      Same as profile([]).

       profile(OptionName,  OptionValue)  ->  ok  | {error, Reason} | {'EXIT',
       ServerPid, Reason}

	      Types:

		 OptionName = atom()
		 OptionValue = term()
		 Reason	= term()

	      Same as profile([{OptionName, OptionValue}]).

       profile(OptionName) -> ok | {error, Reason} | {'EXIT', ServerPid,
       Reason}

	      Types:

		 OptionName = atom()
		 Reason	= term()

	      Same as profile([OptionName]).

       profile({OptionName,  OptionValue})  -> ok | {error, Reason} | {'EXIT',
       ServerPid, Reason}

	      Types:

		 OptionName = atom()
		 OptionValue = term()
		 Reason	= term()

	      Same as profile([{OptionName, OptionValue}]).

       profile([Option]) -> ok | {ok, Tracer} |	 {error,  Reason}  |  {'EXIT',
       ServerPid, Reason}

	      Types:

		 Option = file | {file, Filename} | dump | {dump, Dump} |
		 append | start | stop
		 Dump =	pid() |	Dumpfile | []
		 Tracer	= pid()
		 Reason	= term()

	      Compiles a trace into raw	profile	data held by the fprof server.

	      Dumpfile is used to call file:open/2, and	Filename  is  used  to
	      call  dbg:trace_port(file, Filename). Please see the appropriate
	      documentation.

	      Option description:

		file| {file, Filename}:
		  Reads	the file Filename and creates raw profile data that is
		  stored  in  RAM  by  the fprof server. If the	option file is
		  given,  or  none  of	these  options	are  given,  the  file
		  "fprof.trace"	 is  read. The call will return	when the whole
		  trace	has been read with the return value ok if  successful.
		  This option is not allowed with the start or stop options.

		dump| {dump, Dump}:
		  Specifies the destination for the trace text dump. If this
		  option is not given, no dump is generated. If it is dump,
		  the destination will be the caller's group leader;
		  otherwise the destination Dump is either the pid of an I/O
		  device or a filename. Finally, if the filename is [], the
		  file "fprof.dump" is used instead. This option is not
		  allowed with the stop option.

		append:
		  Causes the trace text	dump to	be appended to the destination
		  file.	This option is only allowed with the {dump,  Dumpfile}
		  option.

		start:
		  Starts a tracer process that profiles	trace data in runtime.
		  The call will	return immediately with	the return value  {ok,
		  Tracer}  if  successful. This	option is not allowed with the
		  stop,	file or	{file, Filename} options.

		stop:
		  Stops	the tracer process that	profiles trace	data  in  run-
		  time.	 The return value will be value	ok if successful. This
		  option is not	allowed	with the start,	file or	 {file,	 File-
		  name}	options.
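
	      As a small sketch using the default trace file name, compiling
	      a previously written trace into raw profile data while also
	      dumping the trace in text form could look like:

	      %% Read "fprof.trace" into raw profile data held by the fprof
	      %% server, and dump the trace entries as text to "fprof.dump".
	      ok = fprof:profile([{file, "fprof.trace"},
	                          {dump, "fprof.dump"}]).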

       analyse() -> ok | {error, Reason} | {'EXIT', ServerPid, Reason}

	      Types:

		 Reason	= term()

	      Same as analyse([]).

       analyse(OptionName,  OptionValue)  ->  ok  | {error, Reason} | {'EXIT',
       ServerPid, Reason}

	      Types:

		 OptionName = atom()
		 OptionValue = term()
		 Reason	= term()

	      Same as analyse([{OptionName, OptionValue}]).

       analyse(OptionName) -> ok | {error, Reason} | {'EXIT', ServerPid,
       Reason}

	      Types:

		 OptionName = atom()
		 Reason	= term()

	      Same as analyse([OptionName]).

       analyse({OptionName,  OptionValue})  -> ok | {error, Reason} | {'EXIT',
       ServerPid, Reason}

	      Types:

		 OptionName = atom()
		 OptionValue = term()
		 Reason	= term()

	      Same as analyse([{OptionName, OptionValue}]).

       analyse([Option]) -> ok | {error, Reason} | {'EXIT', ServerPid, Reason}

	      Types:

		 Option = dest | {dest, Dest} | append | {cols, Cols} |
		 callers | {callers, bool()} | no_callers | {sort, SortSpec}
		 | totals | {totals, bool()} | details | {details, bool()} |
		 no_details
		 Dest =	pid() |	Destfile
		 Cols =	integer() >= 80
		 SortSpec = acc	| own
		 Reason	= term()

	      Analyses	raw  profile data in the fprof server. If called while
	      there is no raw profile data available, {error,  no_profile}  is
	      returned.

	      Destfile is used to call file:open/2. Please see the appropriate
	      documentation.

	      Option description:

		dest| {dest, Dest}:
		  Specifies the destination for the analysis. If this option
		  is not given or is dest, the destination will be the
		  caller's group leader; otherwise the destination Dest is
		  either the pid() of an I/O device or a filename. Finally,
		  if the filename is [], the file "fprof.analysis" is used
		  instead.

		append:
		  Causes the analysis to be appended to	the destination	 file.
		  This	option	is  only allowed with the {dest, Destfile} op-
		  tion.

		{cols, Cols}:
		  Specifies the	number of columns in  the  analysis  text.  If
		  this option is not given the number of columns is set	to 80.

		callers| {callers, true}:
		  Prints  callers and called information in the	analysis. This
		  is the default.

		{callers, false}| no_callers:
		  Suppresses the printing of callers and called	information in
		  the analysis.

		{sort, SortSpec}:
		  Specifies  if	the analysis should be sorted according	to the
		  ACC column, which is the default, or	the  OWN  column.  See
		  Analysis Format below.

		totals|	{totals, true}:
		  Includes  a section containing call statistics for all calls
		  regardless of	process, in the	analysis.

		{totals, false}:
		  Suppresses the totals section in the analysis, which is
		  the default.

		details| {details, true}:
		  Prints  call	statistics  for	 each process in the analysis.
		  This is the default.

		{details, false}| no_details:
		  Suppresses the call statistics for  each  process  from  the
		  analysis.
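
	      A sketch combining several of the options above (the
	      destination file name and column width are arbitrary choices):

	      %% Write a wider analysis sorted on the OWN column, with a
	      %% totals section but without caller/called information.
	      ok = fprof:analyse([{dest, "fprof.analysis"},
	                          {cols, 120},
	                          {sort, own},
	                          totals,
	                          no_callers]).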

ANALYSIS FORMAT
       This section describes the output format of the analyse command. See
       analyse/0.

       The format is parsable with the standard Erlang parsing tools
       erl_scan and erl_parse, file:consult/1 or io:read/2. The parse format
       is not explained here - it should be easy for the interested reader
       to try it out. Note that some flags to analyse/1 will affect the
       format.
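
       For example, an analysis written to a destination file (here assumed
       to be named "fprof.analysis") can be read back as a list of Erlang
       terms:

       %% Each analysis paragraph becomes one term in Terms.
       {ok, Terms} = file:consult("fprof.analysis").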

       The following example was run on OTP/R8 on Solaris 8; all OTP
       internals in this example are very version dependent.

       As an example, we will use the following function, which you may
       recognise as a slightly modified benchmark function from the manpage
       file(3):

       -module(foo).
       -export([create_file_slow/2]).

       create_file_slow(Name, N) when is_integer(N), N >= 0 ->
           {ok, FD} =
               file:open(Name, [raw, write, delayed_write, binary]),
           if N > 256 ->
                   %% Write the first 256 words in one call ...
                   ok = file:write(FD,
                                   lists:map(fun (X) -> <<X:32/unsigned>> end,
                                             lists:seq(0, 255))),
                   %% ... and the rest one word at a time.
                   ok = create_file_slow(FD, 256, N);
              true ->
                   ok = create_file_slow(FD, 0, N)
           end,
           ok = file:close(FD).

       create_file_slow(_FD, M, M) ->
           ok;
       create_file_slow(FD, M, N) ->
           ok = file:write(FD, <<M:32/unsigned>>),
           create_file_slow(FD, M+1, N).

       Let us have a look at the printout after	running:

       1> fprof:apply(foo, create_file_slow, [junk, 1024]).
       2> fprof:profile().
       3> fprof:analyse().

       The printout starts with:

       %% Analysis results:
       {  analysis_options,
	[{callers, true},
	 {sort,	acc},
	 {totals, false},
	 {details, true}]}.

       %				       CNT	 ACC	   OWN
       [{ totals,			      9627, 1691.119, 1659.074}].  %%%

       The CNT column shows the total number of function calls that were
       found in the trace. The ACC column shows the total time of the trace,
       from the first timestamp to the last. The OWN column shows the sum of
       the execution time in functions found in the trace, not including
       called functions. In this case it is very close to the ACC time since
       the emulator had practically nothing else to do than to execute our
       test program.

       All time	values in the printout are in milliseconds.

       The printout continues:

       %				       CNT	 ACC	   OWN
       [{ "<0.28.0>",			      9627,undefined, 1659.074}].   %%

       This is the printout header of one process. The printout contains
       only this one process since we did fprof:apply/3, which traces only
       the current process. Therefore the CNT and OWN columns perfectly
       match the totals above. The ACC column is undefined since summing the
       ACC times of all calls in the process makes no sense - you would get
       something like the ACC value from totals above multiplied by the
       average depth of the call stack.

       All paragraphs up to the next process header concern only function
       calls within this process.

       Now we come to something	more interesting:

       {[{undefined,				 0, 1691.076,	 0.030}],
	{ {fprof,apply_start_stop,4},		 0, 1691.076,	 0.030},     %
	[{{foo,create_file_slow,2},		 1, 1691.046,	 0.103},
	 {suspend,				 1,    0.000,	 0.000}]}.

       {[{{fprof,apply_start_stop,4},		 1, 1691.046,	 0.103}],
	{ {foo,create_file_slow,2},		 1, 1691.046,	 0.103},     %
	[{{file,close,1},			 1, 1398.873,	 0.019},
	 {{foo,create_file_slow,3},		 1,  249.678,	 0.029},
	 {{file,open,2},			 1,   20.778,	 0.055},
	 {{lists,map,2},			 1,   16.590,	 0.043},
	 {{lists,seq,2},			 1,    4.708,	 0.017},
	 {{file,write,2},			 1,    0.316,	 0.021}]}.

       The printout consists of one paragraph per called function. The
       function marked with '%' is the one the paragraph concerns -
       foo:create_file_slow/2. Above the marked function are the calling
       functions - those that have called the marked function - and below
       are those called by the marked function.

       The paragraphs are by default sorted in decreasing order of the ACC
       column for the marked function. The calling list and called list
       within one paragraph are also by default sorted in decreasing order
       of their ACC column.

       The  columns  are:  CNT	-  the	number	of times the function has been
       called, ACC - the time spent in the  function  including	 called	 func-
       tions,  and  OWN	 - the time spent in the function not including	called
       functions.

       The rows	for the	calling	functions contain statistics  for  the	marked
       function	 with  the  constraint that only the occasions when a call was
       made from the row's function to the marked function are accounted for.

       The row for the marked function simply contains the sum of all  calling
       rows.

       The rows for the called functions contain statistics for the row's
       function with the constraint that only the occasions when a call was
       made from the marked function to the row's function are accounted
       for.

       So,  we	see  that foo:create_file_slow/2 used very little time for its
       own execution. It spent most of its time	in file:close/1. The  function
       foo:create_file_slow/3 that writes 3/4 of the file contents is the sec-
       ond biggest time	thief.

       We also see that	the call to file:write/2 that writes 1/4 of  the  file
       contents	 takes very little time	in itself. What	takes time is to build
       the data	(lists:seq/2 and lists:map/2).

       The function 'undefined'	that has called	fprof:apply_start_stop/4 is an
       unknown	function  because  that	call was not recorded in the trace. It
       was  only  recorded  that  the  execution   returned   from   fprof:ap-
       ply_start_stop/4	 to  some  other  function above in the	call stack, or
       that the	process	exited from there.

       Let us continue down the	printout to find:

       {[{{foo,create_file_slow,2},		 1,  249.678,	 0.029},
	 {{foo,create_file_slow,3},	       768,    0.000,	23.294}],
	{ {foo,create_file_slow,3},	       769,  249.678,	23.323},     %
	[{{file,write,2},		       768,  220.314,	14.539},
	 {suspend,				57,    6.041,	 0.000},
	 {{foo,create_file_slow,3},	       768,    0.000,	23.294}]}.

       If you compare with the code, you will also see that
       foo:create_file_slow/3 was called only from foo:create_file_slow/2
       and itself, and that it called only file:write/2; note the number of
       calls to file:write/2. But here we see that suspend was called a few
       times. This is a pseudo function that indicates that the process was
       suspended while executing in foo:create_file_slow/3, and since there
       is no receive or erlang:yield/0 in the code, it must be Erlang
       scheduling suspensions, or the trace file driver compensating for
       large file write operations (these are regarded as a schedule out
       followed by a schedule in to the same process).

       Let us find the suspend entry:

       {[{{file,write,2},			53,    6.281,	 0.000},
	 {{foo,create_file_slow,3},		57,    6.041,	 0.000},
	 {{prim_file,drv_command,4},		50,    4.582,	 0.000},
	 {{prim_file,drv_get_response,1},	34,    2.986,	 0.000},
	 {{lists,map,2},			10,    2.104,	 0.000},
	 {{prim_file,write,2},			17,    1.852,	 0.000},
	 {{erlang,port_command,2},		15,    1.713,	 0.000},
	 {{prim_file,drv_command,2},		22,    1.482,	 0.000},
	 {{prim_file,translate_response,2},	11,    1.441,	 0.000},
	 {{prim_file,'-drv_command/2-fun-0-',1},  15,	 1.340,	   0.000},
	 {{lists,seq,4},			 3,    0.880,	 0.000},
	 {{foo,'-create_file_slow/2-fun-0-',1},	  5,	0.523,	  0.000},
	 {{erlang,bump_reductions,1},		 4,    0.503,	 0.000},
	 {{prim_file,open_int_setopts,3},	 1,    0.165,	 0.000},
	 {{prim_file,i32,4},			 1,    0.109,	 0.000},
	 {{fprof,apply_start_stop,4},		 1,    0.000,	 0.000}],
	{ suspend,			       299,   32.002,	 0.000},     %
	[ ]}.

       We find no particularly long suspend times, so no function seems to
       have waited in a receive statement. Actually, prim_file:drv_command/4
       contains a receive statement, but in this test program the message
       lies in the process receive buffer when the receive statement is
       entered. We also see that the total suspend time for the test run is
       small.

       The suspend pseudo function has an OWN time of zero. This is to
       prevent the process total OWN time from including time in suspension.
       Whether suspend time is really ACC or OWN time is more of a
       philosophical question.

       Now we look at another interesting pseudo function, garbage_collect:

       {[{{prim_file,drv_command,4},		25,    0.873,	 0.873},
	 {{prim_file,write,2},			16,    0.692,	 0.692},
	 {{lists,map,2},			 2,    0.195,	 0.195}],
	{ garbage_collect,			43,    1.760,	 1.760},     %
	[ ]}.

       Here we see that no function distinguishes itself considerably, which
       is very normal.

       The garbage_collect pseudo function does not have an OWN time of zero
       like suspend; instead its OWN time is equal to its ACC time.

       Garbage collection often occurs while a process is suspended, but
       fprof hides this fact by pretending that the suspended function was
       first unsuspended and then garbage collected. Otherwise the printout
       would show garbage_collect being called from suspend, but not which
       function might have caused the garbage collection.

       Let us now get back to the test code:

       {[{{foo,create_file_slow,3},	       768,  220.314,	14.539},
	 {{foo,create_file_slow,2},		 1,    0.316,	 0.021}],
	{ {file,write,2},		       769,  220.630,	14.560},     %
	[{{prim_file,write,2},		       769,  199.789,	22.573},
	 {suspend,				53,    6.281,	 0.000}]}.

       Not unexpectedly, we see that file:write/2 was called from
       foo:create_file_slow/3 and foo:create_file_slow/2. The number of
       calls in each case, as well as the time used, simply confirms the
       previous results.

       We  see	that file:write/2 only calls prim_file:write/2,	but let	us re-
       frain from digging into the internals of	the kernel application.

       But, if we nevertheless do dig down we find the call to the  linked  in
       driver that does	the file operations towards the	host operating system:

       {[{{prim_file,drv_command,4},	       772, 1458.356, 1456.643}],
	{ {erlang,port_command,2},	       772, 1458.356, 1456.643},     %
	[{suspend,				15,    1.713,	 0.000}]}.

       This is 86 % of the total run time, and as we saw before, the close
       operation is by far the biggest contributor. We find a comparable
       ratio a little higher up in the call stack:

       {[{{prim_file,close,1},			 1, 1398.748,	 0.024},
	 {{prim_file,write,2},		       769,  174.672,	12.810},
	 {{prim_file,open_int,4},		 1,   19.755,	 0.017},
	 {{prim_file,open_int_setopts,3},	 1,    0.147,	 0.016}],
	{ {prim_file,drv_command,2},	       772, 1593.322,	12.867},     %
	[{{prim_file,drv_command,4},	       772, 1578.973,	27.265},
	 {suspend,				22,    1.482,	 0.000}]}.

       The time for file operations in the linked-in driver is distributed
       as 1 % for open, 11 % for write and 87 % for close. All data is
       probably buffered in the operating system until the close.

       The unsleeping reader may notice that the ACC times for
       prim_file:drv_command/2 and prim_file:drv_command/4 are not equal
       between the paragraphs above, even though it is easy to believe that
       prim_file:drv_command/2 is just a passthrough function.

       The missing time	can be found in	the paragraph  for  prim_file:drv_com-
       mand/4  where  it  is  evident that not only prim_file:drv_command/2 is
       called but also a fun:

       {[{{prim_file,drv_command,2},	       772, 1578.973,	27.265}],
	{ {prim_file,drv_command,4},	       772, 1578.973,	27.265},     %
	[{{erlang,port_command,2},	       772, 1458.356, 1456.643},
	 {{prim_file,'-drv_command/2-fun-0-',1}, 772,	87.897,	  12.736},
	 {suspend,				50,    4.582,	 0.000},
	 {garbage_collect,			25,    0.873,	 0.873}]}.

       And some more missing time can be explained by the fact that
       prim_file:open_int/4 calls prim_file:drv_command/2 both directly and
       through prim_file:open_int_setopts/3, which complicates the picture.

       {[{{prim_file,open,2},			 1,   20.309,	 0.029},
	 {{prim_file,open_int,4},		 1,    0.000,	 0.057}],
	{ {prim_file,open_int,4},		 2,   20.309,	 0.086},     %
	[{{prim_file,drv_command,2},		 1,   19.755,	 0.017},
	 {{prim_file,open_int_setopts,3},	 1,    0.360,	 0.032},
	 {{prim_file,drv_open,2},		 1,    0.071,	 0.030},
	 {{erlang,list_to_binary,1},		 1,    0.020,	 0.020},
	 {{prim_file,i32,1},			 1,    0.017,	 0.017},
	 {{prim_file,open_int,4},		 1,    0.000,	 0.057}]}.
       {[{{prim_file,open_int,4},		 1,    0.360,	 0.032},
	 {{prim_file,open_int_setopts,3},	 1,    0.000,	 0.016}],
	{ {prim_file,open_int_setopts,3},	 2,    0.360,	 0.048},     %
	[{suspend,				 1,    0.165,	 0.000},
	 {{prim_file,drv_command,2},		 1,    0.147,	 0.016},
	 {{prim_file,open_int_setopts,3},	 1,    0.000,	 0.016}]}.

NOTES
       The actual supervision of execution times is in itself a
       CPU-intensive activity. A message is written to the trace file for
       every function call that is made by the profiled code.

       The ACC time calculation is sometimes difficult to make correct,
       since it is difficult to define. This happens especially when a
       function occurs several times in the call stack, for example by
       calling itself, perhaps through other functions and perhaps even
       non-tail-recursively.

       To  produce  sensible  results,	fprof tries not	to charge any function
       more than once for ACC time. The	instance highest up (with longest  du-
       ration) in the call stack is chosen.

       Sometimes a function may unexpectedly waste a lot (some 10 ms or
       more, depending on the host machine OS) of OWN (and ACC) time, even
       functions that do practically nothing at all. The problem may be that
       the OS has chosen to schedule out the Erlang runtime system process
       for a while, and if the OS does not support high-resolution CPU time
       measurements fprof will use wallclock time for its calculations, and
       it will appear as if functions randomly burn virtual machine time.

SEE ALSO
       dbg(3), eprof(3), erlang(3), io(3), Tools User's	Guide

Ericsson AB			   tools 3.1			      fprof(3)
