CRASH(8)                OpenBSD System Manager's Manual               CRASH(8)

NAME
     crash -- system failure and diagnosis

DESCRIPTION
     This section explains what	happens	when the system	crashes	and (very
     briefly) how to analyze crash dumps.

     When the system crashes voluntarily it prints a message of	the form

	   panic: why i	gave up	the ghost

     on	the console and	enters the kernel debugger, ddb(4).

     If	you wish to report this	panic, you should include the output of	the ps
     and trace commands.  Unless the `ddb.log' sysctl has been disabled, any-
     thing output to screen will be appended to	the system message buffer,
     from where	it may be possible to retrieve it through the dmesg(8) command
     after a warm reboot.  If the debugger command boot	dump is	entered, or if
     the debugger was not compiled into	the kernel, or the debugger was	dis-
     abled with	sysctl(8), then	the system dumps the contents of physical mem-
     ory onto a	mass storage peripheral	device.	 The particular	device used is
     determined	by the `dumps on' directive in the config(8) file used to
     build the kernel.

     After the dump has	been written, the system then invokes the automatic
     reboot procedure as described in reboot(8).  If auto-reboot is disabled
     (in a machine-dependent way) the system will simply halt at this point.

     Upon rebooting, and unless	some unexpected	inconsistency is encountered
     in	the state of the file systems due to hardware or software failure, the
     system will copy the previously written dump into /var/crash using
     savecore(8), before resuming multi-user operations.

   Causes of system failure
     The system	has a large number of internal consistency checks; if one of
     these fails, then it will panic with a very short message indicating
     which one failed.	In many	instances, this	will be	the name of the	rou-
     tine which	detected the error, or a two-word description of the inconsis-
     tency.  A full understanding of most panic	messages requires perusal of
     the source	code for the system.

     The most common cause of system failures is hardware failure (e.g., bad
     memory), which can manifest itself in different ways.  Here are the
     most likely messages, with some hints as to causes.  Left unstated in
     all cases is the possibility that a hardware or software error produced
     the message in some unexpected way.

     no	init
	     This panic	message	indicates filesystem problems, and reboots are
	     likely to be futile.  Late	in the bootstrap procedure, the	system
	     was unable	to locate and execute the initialization process,
	     init(8).  The root	filesystem is incorrect	or has been corrupted,
	     or	the mode or type of /sbin/init forbids execution.

     trap type %d, code=%x, pc=%x
             An unexpected trap has occurred within the system; the trap types
             are machine-dependent and can be found listed in
	     /sys/arch/ARCH/include/trap.h.

             The code is the referenced address, and the pc is the program
             counter at the time of the fault.  Hardware flakiness
	     will sometimes generate this panic, but if	the cause is a kernel
	     bug, the kernel debugger ddb(4) can be used to locate the in-
	     struction and subroutine inside the kernel	corresponding to the
	     PC	value.	If that	is insufficient	to suggest the nature of the
	     problem, more detailed examination	of the system status at	the
	     time of the trap usually can produce an explanation.

     init died
	     The system	initialization process has exited.  This is bad	news,
	     as	no new users will then be able to log in.  Rebooting is	the
	     only fix, so the system just does it right	away.

     out of mbufs: map full
	     The network has exhausted its private page	map for	network	buf-
	     fers.  This usually indicates that	buffers	are being lost,	and
	     rather than allow the system to slowly degrade, it	reboots	imme-
	     diately.  The map may be made larger if necessary.

     That completes the	list of	panic types you	are likely to see.

   Analyzing a dump
     When the system crashes it	writes (or at least attempts to	write) an im-
     age of memory, including the kernel image,	onto the dump device.  On re-
     boot, the kernel image and	memory image are separated and preserved in
     the directory /var/crash.
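
     A dump is only useful if both files of a pair are present.  The
     following is a minimal sh(1) sketch for listing the complete pairs.  It
     assumes the bsd.N and bsd.N.core naming used in the examples in this
     section; the function name list_dumps is hypothetical and not part of
     the system.

```shell
# list_dumps [DIR]: print the number N of every bsd.N in DIR that has a
# matching bsd.N.core.  DIR defaults to /var/crash.
# Hypothetical helper, not shipped with the system.
list_dumps() {
    dir=${1:-/var/crash}
    for kernel in "$dir"/bsd.[0-9]*; do
        case $kernel in
            *.core) continue ;;         # memory images are checked below
        esac
        # Skip an unmatched glob and kernels without a memory image.
        [ -f "$kernel" ] && [ -f "$kernel.core" ] || continue
        echo "${kernel##*/bsd.}"
    done
}
```

     Running list_dumps with no argument inspects /var/crash and prints one
     dump number per line, or nothing if no complete pair exists.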

     To	analyze	the kernel and memory images preserved as bsd.0	and
     bsd.0.core, you should run	gdb(1),	loading	in the images with the follow-
     ing commands:

	   # gdb
	   GNU gdb 6.3
	   Copyright 2004 Free Software	Foundation, Inc.
	   GDB is free software, covered by the	GNU General Public License, and	you are
	   welcome to change it	and/or distribute copies of it under certain conditions.
	   Type	"show copying" to see the conditions.
	   There is absolutely no warranty for GDB.  Type "show	warranty" for details.
	   This	GDB was	configured as "i386-unknown-openbsd4.6".
	   (gdb) file /var/crash/bsd.0
	   Reading symbols from	/var/crash/bsd.0...(no debugging symbols found)...done.
	   (gdb) target	kvm /var/crash/bsd.0.core

     [Note that	the "kvm" target is currently only supported by	gdb(1) on some
     architectures.]

     After this, you can use the where command to show a trace of the
     procedure calls that led to the crash.
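
     To capture such a session non-interactively, for example to attach it
     to a report, the same commands can be written to a command file and run
     in batch mode.  This is a minimal sketch, assuming that the -batch and
     -x options of gdb(1) are available in the installed version; gdb_script
     is a hypothetical helper name and /tmp/crash.gdb an arbitrary path.

```shell
# gdb_script [N] [DIR]: print the gdb commands that load dump N from DIR
# and show the backtrace.  N defaults to 0, DIR to /var/crash.
# Hypothetical helper, not shipped with the system.
gdb_script() {
    n=${1:-0}
    dir=${2:-/var/crash}
    printf '%s\n' \
        "file $dir/bsd.$n" \
        "target kvm $dir/bsd.$n.core" \
        "where"
}

# Possible use (requires a gdb with the "kvm" target):
#   gdb_script 0 > /tmp/crash.gdb
#   gdb -batch -x /tmp/crash.gdb > /tmp/crash-session.txt 2>&1
```

     The helper only prints text, so its output can be reviewed or extended
     with further commands before it is handed to gdb(1).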

     For custom-built kernels, you should use bsd.gdb instead of bsd, thus al-
     lowing gdb(1) to show symbolic names for addresses	and line numbers from
     the source.

     Analyzing saved system images is sometimes called post-mortem debugging.
     There is a class of analysis tools designed to work on both live systems
     and saved images; most of them are linked with the kvm(3) library and
     share option flags to specify the kernel and memory image.  These tools
     typically take the following flags:

     -M	core
             Takes a memory image as an argument.  Normally this is an image
             produced by savecore(8), but it can also be /dev/mem if you are
             looking at the live system.

     -N	system
             Takes a kernel system image as an argument.  This is where the
             symbolic information is obtained, which means the image cannot
             be stripped.  In some cases, using a bsd.gdb version of the
             kernel can assist even more.

     The following commands understand these options: fstat(1),	netstat(1),
     nfsstat(1), ps(1),	w(1), dmesg(8),	iostat(8), kgmon(8), pstat(8),
     trpt(8), vmstat(8) and many others.  There are exceptions, however.  For
     instance, ipcs(1) uses -C in place of -M.

     Examples of use:

	   # ps	-N /var/crash/bsd.0 -M /var/crash/bsd.0.core -O	paddr

     The -O paddr option prints	each process' struct proc address.  This is
     very useful information if	you are	analyzing process contexts in gdb(1).

	   # vmstat -N /var/crash/bsd.0	-M /var/crash/bsd.0.core -m

     This analyzes memory allocations at the time of the crash.	 Perhaps some
     resource was starving the system?
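
     Since the same pair of flags is repeated for every tool, it can be
     convenient to generate it.  The following is a minimal sketch; the
     function name kvm_flags is hypothetical, and it assumes that the paths
     contain no whitespace, because the result is meant to be used unquoted.

```shell
# kvm_flags [N] [DIR]: print the -N and -M flags for dump N in DIR.
# N defaults to 0, DIR to /var/crash.
# Hypothetical helper, not shipped with the system.
kvm_flags() {
    n=${1:-0}
    dir=${2:-/var/crash}
    printf '%s\n' "-N $dir/bsd.$n -M $dir/bsd.$n.core"
}

# Possible use, equivalent to the examples above:
#   ps $(kvm_flags 0) -O paddr
#   vmstat $(kvm_flags 0) -m
```

     Tools that name these flags differently, such as ipcs(1), need their
     arguments spelled out by hand.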

   Analyzing a live kernel
     Like the tools mentioned above, gdb(1) can	be used	to analyze a live sys-
     tem as well.  This	can be accomplished by not specifying a	crash dump
     when selecting the	"kvm" target:

	   (gdb) target	kvm

     It	is possible to inspect processes that entered the kernel by specifying
     a process'	struct proc address to the kvm proc command:

	   (gdb) kvm proc 0xd69dada0
	   #0  0xd0355d91 in sleep_finish (sls=0x0, do_sleep=0)
	       at ../../../../kern/kern_synch.c:217
	   217			   mi_switch();

     After this, the where command will	show a trace of	procedure calls, right
     back to where the selected	process	entered	the kernel.

CRASH LOCATION DETERMINATION
     The following example should make it easier for a novice kernel developer
     to	find out where the kernel crashed.

     First, in ddb(4) find the function	that caused the	crash.	It is either
     the function at the top of	the traceback or the function under the	call
     to	panic()	or uvm_fault().

     The point of the crash usually looks something like "function+0x4711".

     Find the function in the sources; let's say that the function is in
     "foo.c".

     Go	to the kernel build directory, e.g., /sys/arch/ARCH/compile/GENERIC,
     and do the	following:

	   # objdump -S	foo.o |	less

     Find the function in the output.  The function will look something	like
     this:

	   0: 17 47 11 42	  foo %x, bar, %y
	   4: foo bar		  allan	%kaka
	   8: XXXX		  boink	%bloyt
	   etc.

     The first number is the offset.  Find the offset that you got in the ddb
     trace (in this case it's 4711; objdump prints offsets in hexadecimal
     without the 0x prefix).

     When reporting data collected in this way,	include	~20 lines before and
     ~10 lines after the offset	from the objdump output	in the crash report,
     as	well as	the output of ddb(4)'s "show registers"	command.  It's impor-
     tant that the output from objdump includes	at least two or	three lines of
     C code.
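
     The lookup described above can be sketched in sh(1) as follows.  Both
     function names are hypothetical.  The sketch assumes that each
     instruction line in the listing begins with optional blanks, the
     hexadecimal offset and a colon, and that the offset passed in is the
     one shown in the listing; if the function does not start at 0 in the
     listing, its start address has to be added to the ddb offset first.
     It also assumes a grep(1) that supports the -A and -B context options.

```shell
# split_location "function+0x4711": print "function 0x4711".
# Hypothetical helper, not shipped with the system.
split_location() {
    loc=$1
    case $loc in
        *+*) echo "${loc%+*} ${loc##*+}" ;;
        *)   echo "$loc 0x0" ;;         # no offset given
    esac
}

# objdump_context 0xOFFSET: read a listing on standard input and print
# the line for OFFSET with up to 20 lines before and 10 lines after it.
objdump_context() {
    addr=$(printf '%x' "$(($1))")       # 0x4711 becomes 4711
    grep -B 20 -A 10 -E "^[[:space:]]*0*${addr}:"
}

# Possible use, in the kernel build directory:
#   objdump -S foo.o | objdump_context 0x4711
```

     Check that the extracted lines still include two or three lines of C
     code before putting them in a report; if they do not, widen the
     context by hand.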

REPORTING
     If	you are	sure you have found a reproducible software bug	in the kernel,
     and need help in further diagnosis, or already have a fix,	use sendbug(1)
     to	send the developers a detailed description including the entire	ses-
     sion from gdb(1).

SEE ALSO
     gdb(1), sendbug(1), ddb(4), reboot(8), savecore(8)

OpenBSD 6.9                    November 29, 2016                   OpenBSD 6.9
