LOCCOUNT(1)							   LOCCOUNT(1)

NAME
       loccount	- count	lines of code in a source tree and perform cost
       estimation

SYNOPSIS
       loccount [-cdegijlnsuV?] [-x pathlist] file-or-dir...

DESCRIPTION
       This program counts physical source lines of code (SLOC)	and logical
       lines of	code (LLOC) in one or more files or directories	given on the
       command line.

       A line of code is counted in SLOC if it includes	non-whitespace
       characters outside the scope of a comment. LLOC is counted by tallying
       SLOCs with statement-terminating	punctuation.

       LLOC reporting is not available in all supported	languages, as the
       concept may not fit the language's syntax (e.g. the Lisp family) or its
       line-termination rules would require full parsing. In these cases LLOC
       will always be reported as 0. On	the other hand,	LLOC reporting is
       reliably	consistent in languages	with C-like statement termination by
       semicolon.

       These definitions are simplistic	and arguably lead to undercounting if
       LLOC is being used as a complexity measure; the author considers	it a
       particular problem that most C macro definitions	won't be counted.
       However,	they have the advantage	that they improve comparability	of
       results across broad swathes of different languages.

       Certain kinds of	syntactic errors in source code	- notably unbalanced
       comment and string literal delimiters - make this program likely	to
       produce wrong counts and	spurious errors.

       It is advisable to run "make clean" or equivalent in your source
       directory before	running	this program, though it	knows how to detect
       some common kinds of generated files (such as yacc and lex output and
       manual pages or HTML generated by asciidoc) and will ignore them.

       Optionally, this	program	can perform a cost-to-replicate	estimation
       using the COCOMO	I and (if LLOC count is	nonzero) COCOMO	II models. It
       uses the	"organic" profile of COCOMO, which is generally	appropriate
       for open-source projects.
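
       As a rough illustration of the COCOMO I estimate, the sketch below
       uses the textbook Basic COCOMO coefficients for the organic profile
       (2.4 and 1.05 for effort, 2.5 and 0.38 for schedule). Whether loccount
       uses exactly these numbers is an assumption, and the salary and
       overhead parameters are hypothetical inputs, not loccount's values.

```python
def cocomo_organic(sloc: int) -> tuple[float, float]:
    """Return (effort in person-months, schedule in months).

    Textbook Basic COCOMO, organic mode; assumed, not taken from loccount.
    """
    ksloc = sloc / 1000.0
    effort = 2.4 * ksloc ** 1.05      # person-months
    schedule = 2.5 * effort ** 0.38   # calendar months
    return effort, schedule


def cost_to_replicate(sloc: int, annual_salary: float, overhead: float) -> float:
    """Convert effort to money; both rate parameters are hypothetical."""
    effort, _ = cocomo_organic(sloc)
    return effort / 12.0 * annual_salary * overhead


effort, schedule = cocomo_organic(10_000)
print(f"{effort:.1f} person-months over {schedule:.1f} months")
```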

       SLOC/LLOC figures should	be used	with caution. While they do predict
       project costs and defect	incidence reasonably well, they	are not
       appropriate for use as productivity measures; good code is often	less
       bulky than bad code. Comparing SLOC across languages from different
       families (for example, Algol-descended vs. Lisp-descended) is also
       dubious, as these can have greatly differing complexity per line.

       With these qualifications, SLOC/LLOC does have some other uses. It is
       quite effective for tracking changes in complexity and attack surface
       as a codebase evolves over time.

       All languages in	common use on Unix-like	operating systems are
       supported. For a	full list of supported languages, run "loccount	-s";
       "loccount -l" lists languages for which LLOC computation	is available.

       The program also	emits counts for build recipes - Makefiles, autoconf
       specifications, scons recipes, and waf scripts. Generated Makefiles are
       recognized and ignored. An installed copy of waf	and any	waf build
       directory is ignored, but a wscript file	is not.

       Counts for the configuration languages JSON, YAML, TOML,	and INI	are
       reported.

       The program emits counts	for well-known documentation markups as	well,
       including man-page, asciidoc, Markdown, TeX, and others. There is no
       equivalent of LLOC for these. The -n option disables this feature.

       PostScript is a special case. It	is usually generated from some other
       markup and thus not source code, but not always. This program looks for
       "!PS-Adobe" early in the file as an indication that it was generated,
       and ignores such	files.
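
       A check of this kind might look like the sketch below. The function
       name and the 1024-byte window are assumptions; the text only says the
       marker is looked for "early" in the file.

```python
def looks_generated_postscript(data: bytes, window: int = 1024) -> bool:
    """Return True if the marker appears in the first `window` bytes.

    `window` is a hypothetical cutoff for "early in the file".
    """
    return b"!PS-Adobe" in data[:window]
```

       A hand-written file beginning with a bare "%!" line would not match
       and would therefore still be counted.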

       Languages are recognized	by file	extension or filename pattern;
       executable filenames without an extension are mined for #! lines
       identifying an interpreter. Files that cannot be	classified in this way
       are skipped, but	a list of files	skipped	in this	way is available with
       the -u option.
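
       The recognition order described above can be sketched as follows. The
       tables are tiny hypothetical stand-ins for loccount's real ones,
       filename patterns are omitted, and the executable-bit check is left
       out so that the function stays pure.

```python
import os
from typing import Optional

# Hypothetical, heavily abbreviated lookup tables.
EXTENSIONS = {".c": "C", ".py": "Python", ".go": "Go", ".sh": "shell"}
INTERPRETERS = {"python": "Python", "python3": "Python",
                "sh": "shell", "bash": "shell"}


def classify(path: str, first_line: str = "") -> Optional[str]:
    """Return a language name, or None if the file cannot be classified."""
    ext = os.path.splitext(os.path.basename(path))[1]
    if ext:
        return EXTENSIONS.get(ext)
    # No extension: fall back to the interpreter named on the #! line.
    if first_line.startswith("#!"):
        words = first_line[2:].split()
        if words:
            interp = os.path.basename(words[0])
            if interp == "env" and len(words) > 1:
                interp = os.path.basename(words[1])  # "#!/usr/bin/env python3"
            return INTERPRETERS.get(interp)
    return None
```

       A path for which this returns None corresponds to what -u would list.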

       Some file types are identified and silently skipped without being
       reported	by -u; these include symlinks, .o, .a, and .so object files,
       various kinds of	image and audio	files, and the .pyc/.pyo files
       produced	by the Python interpreter. All files and directories named
       with a leading dot are also silently skipped (in	particular, this
       ignores metadata	associated with	version-control	systems).
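
       A sketch of the silent-skip rule, under stated assumptions: paths use
       "/" as the separator, the extension set lists only the object and
       bytecode types named above (image and audio extensions are not
       enumerated in this page and are omitted), and the function name is
       hypothetical.

```python
import os

# Only the extensions explicitly named in the text above.
SKIPPED_EXTENSIONS = {".o", ".a", ".so", ".pyc", ".pyo"}


def silently_skipped(path: str) -> bool:
    """Return True if the path would be skipped without a -u report."""
    # Ignore the "." and ".." components themselves; they are not dotfiles.
    parts = [p for p in path.split("/") if p not in ("", ".", "..")]
    if any(p.startswith(".") for p in parts):
        return True  # dotfile or inside a dot-directory such as .git
    if os.path.islink(path):
        return True  # symlinks are never followed or counted
    return os.path.splitext(path)[1] in SKIPPED_EXTENSIONS
```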

LIMITATIONS
       There are some sources of error and confusion that no amount of clever
       code in this program can	abolish.

       One has to do with comment nesting in Pascal. ISO 7185:1990, the
       standard for the language, specifies that comments do not nest; however,
       important historical and	current	Pascal compilers support comment
       nesting.	This program assumes that if a block comment start is within
       the scope of a block comment, the programmer is working with such a
       compiler	and did	that deliberately.
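
       The nesting assumption amounts to tracking comment depth rather than a
       single in-comment flag. The following is a minimal sketch, not
       loccount's code: the function name is hypothetical, string literals
       are ignored, and { } and (* *) are treated as interchangeable
       delimiters.

```python
def strip_pascal_comments(source: str) -> str:
    """Remove block comments, honouring nesting; keep newlines."""
    out = []
    depth = 0  # current comment nesting level
    i = 0
    while i < len(source):
        if source.startswith("(*", i):
            depth += 1
            i += 2
        elif source[i] == "{":
            depth += 1
            i += 1
        elif depth > 0 and source.startswith("*)", i):
            depth -= 1
            i += 2
        elif depth > 0 and source[i] == "}":
            depth -= 1
            i += 1
        else:
            # Keep code, and keep newlines so line numbers stay stable.
            if depth == 0 or source[i] == "\n":
                out.append(source[i])
            i += 1
    return "".join(out)
```

       A scanner following ISO 7185 strictly would end the comment at the
       first closing delimiter and count the remainder of an outer comment
       as code.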

       Python detection	is slightly flaky. Anything with a .py extension will
       be classified simply as "Python", not distinguishing between Python 2
       and Python 3. Python files without an extension will be correctly
       detected	only when they have a hashbang line containing "python"	or
       "python3"; end-of-lifed versions	such as	2 and 1.5 won't	be picked up.

       There is a conflict among Objective-C, MATLAB, MUMPS, and nroff/troff
       over the	extensions .m and .mm; this may	lead to	misidentification of
       files with these	extensions. To avoid problems, ensure that every
       MATLAB file contains at least one %-led winged comment or %{-led	block
       comment.

       What is reported	as "ML"	includes its dialects Caml and Ocaml, which
       are not readily distinguishable,	but unlikely to	be mixed in the	same
       source tree. Standard ML	and Concurrent ML have distinguishing file
       extensions and can therefore be reported	separately (as "SML" and "CML"
       respectively).

       The syntax of Algol 60 was not carefully	specified. Variants in which
       keywords are distinguished from variable and function names by either
       being uppercase or being quoted like strings exist. This program assumes
       an Algol	dialect	with all-caps unquoted keywords. The sticking point
       here is that COMMENT (uppercase,	no quotes) is used to recognize
       comments.

       This program assumes that Lisp and Scheme interpret backslash as	C
       does, that is as	an escape for a	following string delimiter. While this
       is true in Common Lisp, Scheme, Emacs Lisp, and Guile, it may not be
       true in other Lisp dialects.

       Manual pages sometimes have idiosyncratic extensions (that is, other
       than ".man" or a	single section digit) which this program will not
       recognize. Older	manual pages sometimes abuse nroff to achieve
       commenting in ways this program does not	recognize, resulting in	some
       overcounting of source lines.

       The language attribution	"shell"	includes bash, dash, ksh, and other
       similar variants	descended from the Bourne shell.

       ECMAScript6/es6 files with a .js	extension will be reported as
       Javascript.

OPTIONS
       -?
	   Display usage summary and quit.

       -c
	   Report COCOMO cost estimates. Use the coefficients for the
	   "organic" project type, which fits most open-source projects. An
	   EAF of 1.0 is assumed.

       -d n
	   Set debug level. At > 0, displays various progress messages.	Mainly
	   of interest to loccount developers.

       -e
	   Show	the association	between	languages and file extensions.

       -g
	   List	files normally excluded	by the autogeneration filter; do not
	   emit	line counts.

       -i
	   Report file path, line count, and type for each individual path.

       -j
	   Dump	SLOC and LLOC counts as	self-describing	JSON records for
	   postprocessing.

       -l
	   List	languages for which we can report LLOC and exit. Combine with
	   -i to list languages	one per	line.

       -n
	   Do not tally	documentation SLOC.

       -s
	   List	languages for which we can report SLOC and exit.

       -u
	   List	paths of files that could not be classified into a known
	   source type or as autogenerated.

       -x prefix
	   Ignore paths	matching the specified Go regular expression.

       -V
	   Show	program	version	and exit.

       Arguments following options may be either directories or	files.
       Directories are recursed	into. The report is generated on all paths
       specified on the	command	line.

EXIT VALUES
       Normally 0. The exit value is 1 in -s or -e mode if a non-duplication
       check on file extensions or hashbangs fails.

HISTORY	AND COMPATIBILITY
       The algorithms in this code originated with David A. Wheeler's
       sloccount utility, version 2.26 of 2004.	It is, however,	faster than
       sloccount, and handles many languages that sloccount does not.

       Generally it will produce identical SLOC	figures	to sloccount for a
       language	supported by both tools; the differences in whole-tree reports
       will mainly be due to better detection of some files sloccount left
       unclassified. Notably, for individual C and Perl	files you can expect
       both tools to produce identical SLOC. However, Python counts are
       different, because sloccount does not recognize and ignore single-quote
       multiline literals.

       A few of	sloccount's tests have been simplified in cases	where the
       complexity came from a rare or edge case	that the author	judges to have
       become extinct since 2004.

       The reporting formats of	loccount 2.x are substantially different from
       those in	the 1.x	versions due to	absence	of any LLOC fields in 1.x.

       The base	salary used for	cost estimation	will differ between these
       tools depending on time of last release.

BUGS
       Eiffel indexing comments	are counted as code, not text. (This is
       arguably	a feature.)

       Literate	Haskell	(.lhs) is not supported. (This is a regression from
       sloccount.)

       LLOC counts in languages	that use a semicolon as	an Algol-like
       statement separator, rather than	a terminator, will be a	bit low. This
       group includes Pascal, Modula, Oberon, and Perl.
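
       The undercount is easy to see with a toy counter that tallies lines
       ending in a semicolon. The counter and both snippets are illustrative
       assumptions, not loccount's algorithm; the point is only that the last
       statement before "end" carries no semicolon when it is a separator.

```python
def lines_ending_with_semicolon(source: str) -> int:
    """Count lines whose last non-blank character is a semicolon."""
    return sum(1 for line in source.splitlines()
               if line.rstrip().endswith(";"))


pascal = """\
begin
  x := 1;
  y := 2
end.
"""

c_like = """\
{
  x = 1;
  y = 2;
}
"""

print(lines_ending_with_semicolon(pascal))  # 1, though there are 2 statements
print(lines_ending_with_semicolon(c_like))  # 2
```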

       Dylan LLOC will be a bit high due to its use of semicolon as a
       terminator for classes and methods as well as statements.

       If a Factor program defines words containing embedded ! or ", loccount
       will be confused.

       Fantom documentation comments (led with **) are counted as code.

       Comment detection in Forth can be confused by tabs or unusual
       whitespace following a \ or (, or by strings containing unbalanced
       parens.

REPORTING BUGS
       Report bugs to Eric S. Raymond <esr@thyrsus.com>.

				  08/10/2020			   LOCCOUNT(1)
