MAKEPP_TUTORIAL_COMPILATION(1)	    Makepp	MAKEPP_TUTORIAL_COMPILATION(1)

NAME
       makepp_tutorial_compilation -- Unix compilation commands

DESCRIPTION
       Skip this manual page if you have a good grasp of what the
       compilation commands do.

       I find that distressingly few people seem to be taught in their
       programming classes how to go about compiling programs once they've
       written them.  Novices rely either on a single memorized	command, or
       else on the builtin rules in make.  I have been surprised by extremely
       computer	literate people	who learned to compile without optimization
       because they simply never were told how important it is.	 Rudimentary
       knowledge of how	compilation commands work may make your	programs run
       twice as	fast or	more, so it's worth at least five minutes.  This page
       describes just about everything you'll need to know to compile C	or C++
       programs	on just	about any variant of Unix.

       The examples will be mostly for C, since	C++ compilation	is identical
       except that the name of the compiler is different.  Suppose you're
       compiling source	code in	a file called "xyz.c" and you want to build a
       program called "xyz".  What must	happen?

       You may know that you can build your program in one step, using a
       command like this:

	   cc -g xyz.c -o xyz

       This will work, but it conceals a two-step process that you must
       understand if you are writing makefiles.	 (Actually, there are more
       than two	steps, but you only have to understand two of them.)  For a
       program of more than one	module,	the two	steps are usually explicitly
       separated.

   Compilation
       The first step is the translation of your C or C++ source code into a
       binary file called an object file.  Object files	usually	have an
       extension of ".o". (For some more recent	projects, ".lo"	is also	used
       for a slightly different	kind of	object file.)

       The command to produce an object	file on	Unix looks something like
       this:

	   cc -g -c xyz.c -o xyz.o

       "cc" is the C compiler.	Sometimes alternate C compilers	are used; a
       very common one is called "gcc".	 A common C++ compiler is the GNU
       compiler, usually called	"g++".	Virtually all C	and C++	compilers on
       Unix have the same syntax for the rest of the command (at least for
       basic operations), so the only difference would be the first word.

       We'll explain what the "-g" option does later.

       The "-c"	option tells the C compiler to produce a ".o" file as output.
       (If you don't specify "-c", then	it performs the	second compilation
       step automatically.)

       The "-o xyz.o" option tells the compiler	what the name of the object
       file is.	 You can omit this, as long as the name	of the object file is
       the same	as the name of the source file except for the ".o" extension.

       For the most part, the order of the options and the file	names does not
       matter.	One important exception	is that	the output file	must
       immediately follow "-o".

   Linking
       The second step of building a program is	called linking.	 An object
       file cannot be run directly; it's an intermediate form that must	be
       linked to other components in order to produce a	program.  Other
       components might	include:

       o   Libraries.  A library, roughly speaking, is a collection of object
	   modules that	are included as	necessary.  For	example, if your
	   program calls the "printf" function,	then the definition of the
	   "printf" function must be included from the system C	library.  Some
	   libraries are automatically linked into your	program	(e.g., the one
	   containing "printf")	so you never need to worry about them.

       o   Object files	derived	from other source files	in your	program.  If
	   you write your program so that it actually has several source
	   files, normally you would compile each source file to a separate
	   object file and then	link them all together.

       The linker is the program responsible for taking	a collection of	object
       files and libraries and linking them together to	produce	an executable
       file.  The executable file is the program you actually run.

       The command to link the program looks something like this:

	   cc -g xyz.o -o xyz

       It may seem odd,	but we usually run the same program ("cc") to perform
       the linking.  What happens under	the surface is that the	"cc" program
       immediately passes off control to a different program (the linker,
       sometimes called	the loader, or "ld") after adding a number of complex
       pieces of information to	the command line.  For example,	"cc" tells
       "ld" where the system library is	that includes the definition of
       functions like "printf".	 Until you start writing shared	libraries, you
       usually do not need to deal directly with "ld".

       If you do not specify "-o xyz", then the	output file will be called
       "a.out",	which seems to me to be	a completely useless and confusing
       convention.  So always specify "-o" on the linking step.

       If your program has more	than one object	file, you should specify all
       the object files	on the link command.
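
       To see why every object file has to be named, here is a minimal
       sketch of two modules.  The function names "abc_sum" and "xyz_total"
       and the header name "abc.h" are made up for illustration; the three
       parts are shown in one listing, but the comments mark which file
       each part would live in.

```c
#include <assert.h>
#include <stddef.h>

/* --- would live in a header such as abc.h (hypothetical name) --- */
/* Declaration only: enough for the compiler to compile callers. */
int abc_sum(const int *values, size_t count);

/* --- would live in xyz.c --- */
/* Compiled with -c, this leaves an unresolved reference to abc_sum in xyz.o. */
int xyz_total(void)
{
    static const int sample[] = { 1, 2, 3, 4 };
    return abc_sum(sample, sizeof sample / sizeof sample[0]);
}

/* --- would live in abc.c --- */
/* Definition: the linker matches the reference in xyz.o to this code in abc.o. */
int abc_sum(const int *values, size_t count)
{
    int total = 0;

    assert(values != NULL || count == 0);
    for (size_t i = 0; i < count; i++)
        total += values[i];
    return total;
}
```

       If the parts were split into separate files as marked, leaving
       "abc.o" off the link command would typically make the linker
       complain that "abc_sum" is undefined, because "xyz.o" only contains
       a reference to it.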

   Why you need	to separate the	steps
       Why not just use	the simple, one-step command, like this:

	   cc -g xyz.c -o xyz

       instead of the more complicated two-stage compilation

	   cc -g -c xyz.c -o xyz.o
	   cc -g xyz.o -o xyz

       if internally the first is converted into the second?  The difference
       is important only if there is more than one module in your program.
       Suppose we have an additional module, "abc.c".  Now our compilation
       looks like this:

	   # One-stage command.
	   cc -g xyz.c abc.c -o	xyz

       or

	   # Two-stage command.
	   cc -g -c xyz.c -o xyz.o
	   cc -g -c abc.c -o abc.o
	   cc -g xyz.o abc.o -o	xyz

       The first method, of course, is converted internally into the second
       method.	This means that	both "xyz.c" and "abc.c" are recompiled	each
       time the	command	is run.	 But if	you only changed "xyz.c", there's no
       need to recompile "abc.c", so the second line of the two-stage commands
       does not need to be run.  This can make a huge difference in
       compilation time, especially if you have	many modules.  For this
       reason, virtually all makefiles keep the	two compilation	steps
       separate.
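
       The decision a build tool makes for each object file can be sketched
       roughly as follows.  This is only an illustration of the traditional
       timestamp comparison, assuming a POSIX system with stat(); the
       function names "is_stale" and "needs_recompile" are invented here,
       and real build tools may use more elaborate checks.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>
#include <time.h>

/* Pure decision: an object file is stale if it is older than its source. */
int is_stale(time_t source_mtime, time_t object_mtime)
{
    return object_mtime < source_mtime;
}

/*
 * Returns 1 if `object` must be rebuilt from `source`, 0 if it is up to
 * date, and -1 if the source file itself cannot be examined.
 */
int needs_recompile(const char *source, const char *object)
{
    struct stat src, obj;

    assert(source != NULL && object != NULL);
    if (stat(source, &src) != 0)
        return -1;              /* no source: nothing to compile */
    if (stat(object, &obj) != 0)
        return 1;               /* no object file yet: must compile */
    return is_stale(src.st_mtime, obj.st_mtime);
}
```

       With the two-stage commands, such a check can be made separately for
       "xyz.o" and "abc.o", so only the module whose source changed is
       recompiled before the link step is rerun.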

       That's pretty much the basics, but there	are a few more little details
       you really should know about.

   Debugging vs. optimization
       Usually, programmers compile a program either for debugging or for
       speed.  Compilation for speed is	called optimization; compiling with
       optimization can	make your code run up to 5 times faster	or more,
       depending on your code, your processor, and your	compiler.

       With such dramatic gains	possible, why would you	ever not want to
       optimize?  The most important answer is that optimization makes use of
       a debugger much more difficult (sometimes impossible).  (If you don't
       know anything about a debugger, it's time to learn.  The	half hour or
       hour you'll spend learning the basics will be repaid many, many times
       over in the time	you'll save later when debugging.  I'd recommend
       starting	with a GUI debugger like "kdbg", "ddd",	or "gdb" run from
       within emacs (see the info pages	on gdb for instructions	on how to do
       this).)	Optimization reorders and combines statements, removes
       unnecessary temporary variables,	and generally rearranges your code so
       that it's very tough to follow inside a debugger.  The usual procedure
       is to write your	code, compile it without optimization, debug it, and
       then turn on optimization.
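
       As a small illustration of what gets rearranged, consider a function
       written with an explicit temporary.  The function "sum_of_squares"
       is a made-up example, not something from a real program.

```c
#include <assert.h>

/* Hypothetical example: sum of the squares of 1..n, written with a temporary. */
long sum_of_squares(int n)
{
    long total = 0;

    assert(n >= 0);
    for (int i = 1; i <= n; i++) {
        long square = (long)i * i;  /* temporary the optimizer may eliminate */
        total += square;
    }
    return total;
}
```

       Compiled without optimization, a debugger can usually stop on each
       line of the loop and print "square".  Compiled with optimization,
       the compiler is free to keep the value only in a register, fold it
       into the addition, or restructure the loop, so the debugger may not
       be able to show it or may appear to jump between lines.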

       In order	for the	debugger to work, the compiler has to cooperate	not
       only by not optimizing, but also	by putting information about the names
       of the symbols into the object file so the debugger knows what things
       are called.  This is what the "-g" compilation option does.

       If you're done debugging, and you want to optimize your code, simply
       replace "-g" with "-O".	For many compilers, you	can specify increasing
       levels of optimization by appending a number after "-O".	 You may also
       be able to specify other	options	that increase the speed	under some
       circumstances (possibly trading off with	increased memory usage).  See
       your compiler's man page	for details.  For example, here	is an
       optimizing compile command that I use frequently	with the "gcc"
       compiler:

	   gcc -O6 -malign-double -c xyz.c -o xyz.o

       You may have to experiment with different optimization options for the
       absolute	best performance.  You may need	different options for
       different pieces of code.  Generally speaking, a simple optimization
       flag like "-O6" usually produces pretty good results with "gcc",
       which treats any level above 3 the same as "-O3"; other compilers
       may accept only lower numbers.

       Warning:	on rare	occasions, your	program	doesn't	actually do exactly
       the same	thing when it is compiled with optimization.  This may be due
       to (1) an invalid assumption you	made in	your code that was harmless
       without optimization, but causes	problems because the compiler takes
       the liberty of rearranging things when you optimize; or (2) sadly,
       compilers have bugs too,	including bugs in their	optimizers.  For a
       stable compiler like "gcc" on a common platform like a Pentium,
       optimization bugs are seldom a problem (as of the year 2000--there were
       problems	a few years ago).
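
       A classic example of case (1) is an overflow check that relies on
       signed integer arithmetic wrapping around.  The sketch below is my
       own illustration, with invented function names; it contrasts the
       fragile form with one that does not depend on overflow at all.

```c
#include <assert.h>
#include <limits.h>

/*
 * Fragile: for b > 0 this relies on a + b wrapping around on overflow.
 * Signed overflow is undefined in C, so an optimizer may assume it never
 * happens and simplify the test to b < 0, which is never true here.
 */
int add_overflows_fragile(int a, int b)
{
    return a + b < a;
}

/* Robust: compares against INT_MAX without ever overflowing (b must be >= 0). */
int add_overflows(int a, int b)
{
    assert(b >= 0);
    return a > INT_MAX - b;
}
```

       The fragile version often appears to work when compiled with "-g"
       alone and may stop detecting overflow once "-O" is turned on; the
       robust version behaves the same either way.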

       If you don't specify either "-g"	or "-O"	in your	compilation command,
       the resulting object file is suitable neither for debugging nor for
       running fast.  For some reason, this is the default.  So	always specify
       either "-g" or "-O".

       On some systems,	you must supply	"-g" on	both the compilation and
       linking steps; on others	(e.g. Linux), it needs to be supplied only on
       the compilation step.  On some systems, "-O" actually does something
       different in the	linking	phase, while on	others,	it has no effect.  In
       any case, it's always harmless to supply	"-g" or	"-O" for both
       commands.

   Warnings
       Most compilers are capable of catching a	number of common programming
       errors (e.g., forgetting	to return a value from a function that's
       supposed	to return a value).  Usually, you'll want to turn on warnings.
       How you do this depends on your compiler	(see the man page), but	with
       the "gcc" compiler, I usually use something like	this:

	   gcc -g -Wall	-c xyz.c -o xyz.o

       (Sometimes I also add "-Wno-uninitialized" after	"-Wall"	because	of a
       warning that is usually wrong that crops	up when	optimizing.)

       These warnings have saved me many, many hours of debugging.
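
       Here is a sketch of the kind of mistake mentioned above, a path
       through a function that forgets to return a value.  The function
       "find_first" is a hypothetical helper; it is shown in its corrected
       form, with a comment marking the line that is easy to leave out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: index of the first element equal to key, or -1. */
int find_first(const int *values, size_t count, int key)
{
    assert(values != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (values[i] == key)
            return (int)i;
    }
    /*
     * Easy to forget: without this line the "not found" path returns no
     * value at all.  gcc -Wall flags that (its message is along the lines
     * of "control reaches end of non-void function").
     */
    return -1;
}
```

       Without warnings turned on, the version missing the final return
       may compile silently and hand back garbage only when the key is not
       found, which is exactly the sort of bug that takes hours to track
       down.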

   Other useful	compilation options
       Often, necessary	include	files are stored in some directory other than
       the current directory or	the system include directory (/usr/include).
       This frequently happens when you	are using a library that comes with
       include files to	define the functions or	classes.

       Suppose,	for example, you are writing an	application that uses the Qt
       libraries.  You've installed a local copy of the	Qt library in
       /home/users/joe/qt, which means that the	include	files are stored in
       the directory /home/users/joe/qt/include.  In your code,	you want to be
       able to do things like this:

	   #include <qwidget.h>

       instead of

	   #include "/home/users/joe/qt/include/qwidget.h"

       You can tell the	compiler to look for include files in a	different
       directory by using the "-I" compilation option:

	   g++ -I/home/users/joe/qt/include -g -c mywidget.cpp -o mywidget.o

       There is	usually	no space between the "-I" and the directory name.

       When the	C++ compiler is	looking	for the	file qwidget.h,	it will	look
       in /home/users/joe/qt/include before looking in the system include
       directory.  You can specify as many "-I"	options	as you want.

   Using libraries
       You will	often have to tell the linker to link with specific external
       libraries, if you are calling any functions that	aren't part of the
       standard	C library.  The	"-l" (lowercase	L) option says to link with a
       specific	library:

	   cc -g xyz.o -o xyz -lm

       "-lm" says to link with the system math library,	which you will need if
       you are using functions like "sqrt".

       Beware: if you specify more than	one "-l" option, the order can make a
       difference on some systems.  If the linker reports undefined symbols
       when you	know you have included the library that	defines	them, you
       might try moving	that library to	the end	of the command line, or	even
       including it a second time at the end of	the command line.

       Sometimes the libraries you will	need are not stored in the default
       place for system	libraries.  "-labc" searches for a file	called
       libabc.a	or libabc.so or	libabc.sa in the system	library	directories
       (/usr/lib and usually a few other places	too, depending on what kind of
       Unix you're running).  The "-L" option specifies	an additional
       directory to search for libraries.  To take the above example again,
       suppose you've installed	the Qt libraries in /home/users/joe/qt,	which
       means that the library files are	in /home/users/joe/qt/lib.  Your link
       step for	your program might look	something like this:

	   g++ -g test_mywidget.o mywidget.o -o	test_mywidget -L/home/users/joe/qt/lib -lqt

       (On some	systems, if you	link in	Qt you will need to add	other
       libraries as well (e.g.,	"-L/usr/X11R6/lib -lX11	-lXext").  What	you
       need to do will depend on your system.)

       Note that there is no space between "-L"	and the	directory name.	 The
       "-L" option usually goes	before any "-l"	options	it's supposed to
       affect.

       How do you know which libraries you need?  In general, this is a	hard
       question, and varies depending on what kind of Unix you are running.
       The documentation for the functions or classes you are using should say
       what libraries you need to link with.  If you are using functions or
       classes from an external package, there is usually a library you need
       to link with; if that library is a file called "libabc.a",
       "libabc.so", or "libabc.sa", the option you need to add is "-labc".

   Some	other confusing	things
       You may have noticed that it is possible	to specify options which
       normally	apply to compilation on	the linking step, and options which
       normally	apply to linking on the	compilation step.  For example,	the
       following commands are valid:

	   cc -g -L/usr/X11R6/lib -c xyz.c -o xyz.o
	   cc -g -I/somewhere/include xyz.o -o xyz

       The irrelevant options are ignored; the above commands are exactly
       equivalent to this:

	   cc -g -c xyz.c -o xyz.o
	   cc -g xyz.o -o xyz

perl v5.24.1			  2012-02-07	MAKEPP_TUTORIAL_COMPILATION(1)
