
LOCATEDB(5)							   LOCATEDB(5)

NAME
       locatedb - front-compressed file name database

DESCRIPTION
       This manual page documents the format of file name databases for the
       GNU version of locate.  The file name databases contain lists of files
       that were in particular directory trees when the databases were last
       updated.

       There can be multiple databases.	  Users	 can  select  which  databases
       locate  searches	 using an environment variable or command line option;
       see locate(1).  The system administrator	can choose the	file  name  of
       the  default  database,	the  frequency	with  which  the databases are
       updated,	and the	directories for	which they contain entries.  Normally,
       file name databases are updated by running the updatedb program period-
       ically, typically nightly; see updatedb(1).

GNU LOCATE02 database format
       This is the default format of  databases	 produced  by  updatedb.   The
       updatedb	 program  runs frcode to compress the list of file names using
       front-compression, which	reduces	the database size by a factor of 4  to
       5.  Front-compression (also known as incremental encoding) works as
       follows.

       The database entries are	a sorted list (case-insensitively, for	users'
       convenience).   Since the list is sorted, each entry is likely to share
       a prefix	(initial string) with the previous entry.  Each	database entry
       begins with a signed offset-differential count byte, which is the
       additional number of characters of prefix of the	preceding entry	to use
       beyond the number that the preceding entry is using of its predecessor.
       (The counts can be negative.)  Following	the count is a null-terminated
       ASCII remainder -- the part of the name that follows the	shared prefix.

       If the offset-differential count	is larger than	can  be	 stored	 in  a
       signed byte (+/-127), the byte has the value 0x80 (binary 10000000) and
       the actual count	follows	in a 2-byte word, with	the  high  byte	 first
       (network	 byte  order).	 This count can	also be	negative (the sign bit
       being in	the first of the two bytes).
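
       The count encoding described above can be sketched in Python.  This
       is an illustration only, not code from findutils; the function names
       encode_count and decode_count are hypothetical.

```python
import struct


def encode_count(diff):
    """Encode an offset-differential count (hypothetical helper)."""
    if -127 <= diff <= 127:
        # Fits in one signed byte.
        return struct.pack("b", diff)
    # Escape byte 0x80, then a signed 2-byte word, high byte first.
    return b"\x80" + struct.pack(">h", diff)


def decode_count(buf, pos):
    """Return (count, next_position) for the count stored at buf[pos]."""
    if buf[pos] == 0x80:
        (diff,) = struct.unpack_from(">h", buf, pos + 1)
        return diff, pos + 3
    (diff,) = struct.unpack_from("b", buf, pos)
    return diff, pos + 1
```

       For instance, a count of 200 does not fit in a signed byte, so it is
       written as the three bytes 0x80 0x00 0xc8.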

       Every database begins with a dummy entry	for a file called  `LOCATE02',
       which  locate  checks for to ensure that	the database file has the cor-
       rect format; it ignores the entry in doing the search.
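
       A reader for this format might look like the following Python
       sketch.  It assumes the whole database is already in memory as a
       bytes object; the name read_locate02 is hypothetical, and the real
       locate program may perform additional checks.

```python
import struct


def read_locate02(data):
    """Yield file names (as bytes) from a LOCATE02-format byte string."""
    pos = 0
    prefix_len = 0      # running length of the prefix shared with prev
    prev = b""
    first = True
    while pos < len(data):
        # Read the offset-differential count (1 byte, or 0x80 + 2 bytes).
        if data[pos] == 0x80:
            (diff,) = struct.unpack_from(">h", data, pos + 1)
            pos += 3
        else:
            (diff,) = struct.unpack_from("b", data, pos)
            pos += 1
        prefix_len += diff
        # The remainder runs up to the terminating null byte.
        end = data.index(b"\0", pos)
        name = prev[:prefix_len] + data[pos:end]
        pos = end + 1
        if first:
            # The dummy entry only identifies the format.
            if name != b"LOCATE02":
                raise ValueError("not a LOCATE02 database")
            first = False
        else:
            yield name
        prev = name
```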

       Databases cannot be concatenated together, even if the first (dummy)
       entry  is trimmed from all but the first	database.  This	is because the
       offset-differential count in the	first entry of the second and  follow-
       ing databases will be wrong.

       In the future, the data within the locate database may not be sorted in
       any particular order.  To obtain	sorted results,	 pipe  the  output  of
       locate through sort -f.

slocate	database format
       The  slocate  program  uses a database format similar to, but not quite
       the same	as, GNU	locate.	 The first byte	of the database	specifies  its
       security	 level.	  If the security level	is 0, slocate will read, match
       and print filenames on the basis	of the	information  in	 the  database
       only.   However,	if the security	level byte is 1, slocate omits entries
       from its	output if the invoking user is unable  to  access  them.   The
       second  byte  of	 the database is zero.	The second byte	is followed by
       the first database entry.  The first entry in the database is not  pre-
       ceded  by any differential count	or dummy entry.	 Instead the differen-
       tial count for the first	item is	assumed	to be zero.

       Starting	with the second	entry (if any) in the database,	data is	inter-
       preted as for the GNU LOCATE02 format.
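
       The layout above can be illustrated with a short Python sketch.
       The name read_slocate is hypothetical.  The sketch returns the
       first byte unchanged and does not perform the access checks that
       slocate applies at security level 1.

```python
import struct


def read_slocate(data):
    """Return (security_byte, names) for an slocate-style byte string."""
    if len(data) < 2 or data[1] != 0:
        raise ValueError("second byte of the database must be zero")
    security = data[0]
    names = []
    pos = 2
    prefix_len = 0
    prev = b""
    first = True
    while pos < len(data):
        if first:
            # The first entry has no count; it is assumed to be zero.
            diff = 0
            first = False
        elif data[pos] == 0x80:
            (diff,) = struct.unpack_from(">h", data, pos + 1)
            pos += 3
        else:
            (diff,) = struct.unpack_from("b", data, pos)
            pos += 1
        prefix_len += diff
        end = data.index(b"\0", pos)
        prev = prev[:prefix_len] + data[pos:end]
        names.append(prev)
        pos = end + 1
    return security, names
```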

Old Locate Database format
       There is	also an	old database format, used by Unix locate and find pro-
       grams and earlier releases of the GNU  ones.   updatedb	runs  programs
       called bigram and code to produce old-format databases.	The old	format
       differs from the	above description in the following ways.   Instead  of
       each  entry  starting with an offset-differential count byte and	ending
       with a null, byte values	from 0 through 28 indicate offset-differential
       counts from -14 through 14.  The	byte value indicating that a long off-
       set-differential	count follows is 0x1e (30), not	0x80.  The long	counts
       are  stored  in	host byte order, which is not necessarily network byte
       order, and host integer word size, which	is usually 4 bytes.  They also
       represent a count 14 less than their value.  The	database lines have no
       termination byte; the start of the next line is indicated by its	 first
       byte having a value <= 30.

       In  addition,  instead of starting with a dummy entry, the old database
       format starts with a 256	byte table  containing	the  128  most	common
       bigrams in the file list.  A bigram is a	pair of	adjacent bytes.	 Bytes
       in the database that have the high bit set are indexes (with  the  high
       bit cleared) into the bigram table.  The	bigram and offset-differential
       count coding makes these	databases 20-25% smaller than the new  format,
       but makes them not 8-bit	clean.	Any byte in a file name	that is	in the
       ranges used for the special codes is replaced  in  the  database	 by  a
       question	 mark, which not coincidentally	is the shell wildcard to match
       a single	character.
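
       As a rough illustration of the old format, the Python sketch below
       decodes a database held in memory.  It follows only the description
       given here and has not been checked against databases written by
       bigram and code.  The name read_old_locate and the int_format
       parameter are hypothetical; "=i" assumes a 4-byte integer in host
       byte order.  Byte value 29 is rejected because its meaning is not
       stated above.

```python
import struct


def read_old_locate(data, int_format="=i"):
    """Return the file names (as bytes) from an old-format byte string."""
    bigrams = data[:256]            # 128 two-byte entries
    int_size = struct.calcsize(int_format)
    pos = 256
    prefix_len = 0
    prev = b""
    names = []
    while pos < len(data):
        code = data[pos]
        pos += 1
        if code == 30:
            # Long count: host-order integer, stored 14 above the count.
            (raw,) = struct.unpack_from(int_format, data, pos)
            pos += int_size
            diff = raw - 14
        elif code <= 28:
            # Byte values 0..28 stand for counts -14..14.
            diff = code - 14
        else:
            raise ValueError("unexpected count byte %d" % code)
        prefix_len += diff
        tail = bytearray()
        # A byte <= 30 marks the start of the next entry.
        while pos < len(data) and data[pos] > 30:
            b = data[pos]
            pos += 1
            if b & 0x80:
                i = (b & 0x7F) * 2  # index into the bigram table
                tail += bigrams[i:i + 2]
            else:
                tail.append(b)
        prev = prev[:prefix_len] + bytes(tail)
        names.append(prev)
    return names
```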

EXAMPLE
       Input to frcode:
       /usr/src
       /usr/src/cmd/aardvark.c
       /usr/src/cmd/armadillo.c
       /usr/tmp/zoo

       Length of the longest prefix of the preceding entry to share:
       0 /usr/src
       8 /cmd/aardvark.c
       14 rmadillo.c
       5 tmp/zoo

       Output from frcode, with	trailing nulls changed to newlines  and	 count
       bytes made printable:
       0 LOCATE02
       0 /usr/src
       8 /cmd/aardvark.c
       6 rmadillo.c
       -9 tmp/zoo

       (6 = 14 - 8, and	-9 = 5 - 14)
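
       The counts in this example can be reproduced with a small Python
       sketch of front-compression.  It is not the frcode source; the
       names common_prefix_len and frcode_sketch are hypothetical, and
       the input is assumed to be already sorted.

```python
import struct


def common_prefix_len(a, b):
    """Length of the longest common prefix of two byte strings."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def frcode_sketch(names):
    """Return (encoded_bytes, counts) for a sorted list of byte strings."""
    out = bytearray()
    counts = []
    prev = b""
    prev_shared = 0
    # The dummy LOCATE02 entry always comes first.
    for name in [b"LOCATE02"] + list(names):
        shared = common_prefix_len(prev, name)
        diff = shared - prev_shared     # offset-differential count
        counts.append(diff)
        if -127 <= diff <= 127:
            out += struct.pack("b", diff)
        else:
            out += b"\x80" + struct.pack(">h", diff)
        out += name[shared:] + b"\0"
        prev, prev_shared = name, shared
    return bytes(out), counts
```

       Given the four input names above, the second return value is
       [0, 0, 8, 6, -9], matching the printable counts shown.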

SEE ALSO
       find(1), locate(1), locatedb(5), xargs(1), Finding Files (on-line in
       Info, or printed)

BUGS
       The best way to report a bug is to use the form at http://savan-
       The reason for this is that you
       will then be able to track progress in fixing the problem.   Other com-
       ments about locate(1) and about the findutils package in	general	can be
       sent to the bug-findutils mailing list.	To join	the list,  send	 email

