MU-INDEX(1)		    General Commands Manual		   MU-INDEX(1)

NAME
       mu index	- index	e-mail messages	stored in Maildirs

SYNOPSIS
       mu index	[options]

DESCRIPTION
       mu  index is the	mu command for scanning	the contents of	Maildir	direc-
       tories and storing the results in a Xapian database. The	data can  then
       be queried using	mu-find(1).

       index understands Maildirs as defined by Daniel Bernstein for qmail(7).
       In addition, it understands recursive Maildirs (Maildirs within
       Maildirs) and Maildir++. It can also deal with VFAT-based Maildirs,
       which use '!' as the separator instead of ':'.

       E-mail messages which are not stored in something resembling a  maildir
       leaf-directory  (cur and	new) are ignored, as are the cache directories
       for notmuch and gnus, and any dot-directory.

       Symlinks	are not	followed.

       If there	is a file called .noindex in a directory, the contents of that
       directory  and  all  of its subdirectories will be ignored. This	can be
       useful to exclude certain directories from the  indexing	 process,  for
       example directories with	spam-messages.

       If there is a file called .noupdate in a directory, the contents of
       that directory and all of its subdirectories will be ignored, unless we
       do a full rebuild (with --rebuild). This can be useful to speed things
       up if you have some maildirs that never change. Note that you can
       still search for these messages; this only affects updating the
       database.
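
       As a minimal sketch of the two marker files described above, the
       following shell helpers create them in a given directory. The function
       names mark_noindex and mark_noupdate are hypothetical and not part of
       mu; only the file names .noindex and .noupdate come from this page.

```shell
# Sketch: mark Maildir subtrees with the marker files described above.
# mark_noindex and mark_noupdate are hypothetical helpers, not mu commands.

# Exclude a directory (and all of its subdirectories) from indexing.
mark_noindex() {
    touch "$1/.noindex"
}

# Skip a directory on normal runs; a full --rebuild still scans it.
mark_noupdate() {
    touch "$1/.noupdate"
}
```

       For example, mark_noindex ~/Maildir/Spam would keep a spam folder out
       of the database, assuming that folder exists.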

       There is also the --lazy-check option, which can greatly speed up
       indexing; see below for details.

       The  first  run of mu index may take a few minutes if you have a	lot of
       mail (tens of thousands of messages).  Fortunately, such	 a  full  scan
       needs  to  be  done  only  once;	 after	that  it suffices to index the
       changes,	 which	goes  much  faster.  See  the  'Note  on   performance
       (i,ii,iii)' below for more information.

       The optional 'phase two' of the indexing process is the removal of
       messages from the database for which there is no longer a corresponding
       file in the Maildir. If you do not want this, you can use -n,
       --nocleanup.

       When mu index catches one of the signals SIGINT, SIGHUP or SIGTERM
       (e.g., when you press Ctrl-C during the indexing process), it tries to
       shut down gracefully: it tries to save and commit data, close the
       database, and so on. If it receives another signal (e.g., when you
       press Ctrl-C once more), mu index will terminate immediately.

OPTIONS
       Note, some of the general options are described in the  mu(1)  man-page
       and not here, as	they apply to multiple mu commands.

       -m, --maildir=_maildir_
	      starts  searching	at _maildir_. By default, mu uses whatever the
	      MAILDIR environment variable is set to; if it  is	 not  set,  it
	      tries ~/Maildir. See the note on mixing sub-maildirs below.

       --my-address=_my-email-address_
	      specifies	that some e-mail address is 'my-address' (--my-address
	      can be used multiple times). This	is used	by mu cfind -- any  e-
	      mail address found in the	address	fields of a message which also
	      has _my-email-address_ in	one of its address fields  is  consid-
	      ered a personal e-mail address. This allows you, for example, to
	      filter out (mu cfind --personal)	addresses  which  were	merely
	      seen in mailing list messages.

       --lazy-check
              in lazy-check mode, mu does not consider messages for which the
              time-stamp (ctime) of the directory they reside in has not
              changed since the previous indexing run. This is much faster
              than the non-lazy check, but won't update messages that have
              changed (rather than having been added or removed), since merely
              editing a message does not update the directory time-stamp. Of
              course, you can run mu index occasionally without --lazy-check,
              to pick up such messages.
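
              The directory test behind lazy-check can be illustrated with a
              small shell sketch. This is not mu's implementation: the helper
              name dir_changed_since and the idea of a stamp file touched
              after each run are assumptions made for illustration, and the
              -cnewer test is available in GNU and BSD find but is not part
              of POSIX.

```shell
# Sketch of the lazy-check idea: has a directory's ctime changed since the
# previous run? "stamp" is a hypothetical file touched after each run.
dir_changed_since() {
    stamp=$1
    dir=$2
    # -prune keeps find from descending, so only the directory itself is
    # tested; it is printed when its ctime is newer than the stamp's mtime.
    [ -n "$(find "$dir" -prune -cnewer "$stamp")" ]
}
```

              Adding or removing a message file changes the directory's
              time-stamp, so the helper reports a change; editing a message
              in place does not, which is why such edits are missed.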

       --nocleanup
	      disables the database cleanup that mu does by default after  in-
	      dexing.

       --rebuild
              clear all messages from the database before indexing. --rebuild
              guarantees that after the indexing has finished, there are no
              'old' messages in the database anymore, which is not true with
              --reindex when indexing only part of the messages (using
              --maildir). For this reason, it is necessary to run mu index
              --rebuild when there is an upgrade in the database format. mu
              index will issue a warning about this.

       --autoupgrade
	      automatically  use -y, --empty when mu notices that the database
	      version is not up-to-date.  This	option	is  for	 use  in  cron
	      scripts  and  the	 like, so they won't require any user interac-
	      tion, even when mu introduces a new database version.

       --xbatchsize=_batch size_
	      set the maximum number of	messages to process in a single	Xapian
	      transaction. In practice,	this option is only useful if you find
	      that mu is running out of	memory while indexing; in  that	 case,
	      you can set the batch size to (for example) 1000,	which will re-
	      duce memory consumption, but also	substantially reduce  the  in-
	      dexing performance.

       --max-msg-size=_max msg size_
              set the maximum size (in bytes) for messages. The default
              maximum (currently 500 MB) should be enough in most cases, but
              if you encounter warnings from mu about ignoring messages
              because they are too big, you may want to increase this. Note
              that the reason for having a maximum size is that big messages
              require big memory allocations, which may lead to problems.

	      NOTE: It is not recommended to  mix  maildirs  and  sub-maildirs
	      within  the  hierarchy  in  the same database; for example, it's
	      better  not  to  index  both  with   --maildir=~/MyMaildir   and
	      --maildir=~/MyMaildir/foo,  as  this  may	lead to	unexpected re-
	      sults when searching with	the 'maildir:' search  parameter  (see
	      below).

   A note on performance (i)
       As a non-scientific benchmark, a	simple test on the author's machine (a
       Thinkpad	X61s laptop using Linux	2.6.35 and an ext3 file	 system)  with
       no existing database, and a maildir with	27273 messages:

	$ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
	$ time mu index	--quiet
	66,65s user 6,05s system 27% cpu 4:24,20 total
       (about 103 messages per second)

       A  second run, which is the more	typical	use case when there is a data-
       base already, goes much faster:

	$ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
	$ time mu index	--quiet
	0,48s user 0,76s system	10% cpu	11,796 total
       (more than 56818 messages per second of user CPU time, or about 2312
       per second of wall-clock time)

       Note that each test flushes the caches first; a more  common  use  case
       might  be to run	mu index when new mail has arrived; the	cache may stay
       quite 'warm' in that case:

	$ time mu index	--quiet
	0,33s user 0,40s system	80% cpu	0,905 total
       which is	more than 30000	messages per second.

   A note on performance (ii)
       As of June 2012, we did the same non-scientific benchmark, this time
       with  an	Intel i5-2500 CPU @ 3.30GHz, an	ext4 file system and a maildir
       with 22589 messages. We start without an	existing database.

	$ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
	$ time mu index	--quiet
	27,79s user 2,17s system 48% cpu 1:01,47 total
       (about 813 messages per second of user CPU time, or about 367 per
       second of wall-clock time)

       A second	run, which is the more typical use case	when there is a	 data-
       base already, goes much faster:

	$ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
	$ time mu index	--quiet
	0,13s user 0,30s system	19% cpu	2,162 total
       (more than 173000 messages per second of user CPU time, or about 10448
       per second of wall-clock time)

   A note on performance (iii)
       As of July 2016, we did the same non-scientific benchmark, again with
       the Intel i5-2500 CPU @ 3.30GHz,	an ext4	file system.  This  time,  the
       maildir contains	72525 messages.

	$ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
	$ time mu index	--quiet
	40,34s user 2,56s system 64% cpu 1:06,17 total
       (about 1099 messages per	second).

       As shown, mu has been getting faster with each release, even with
       relatively expensive new features such as text normalization (for
       case-insensitive/accent-insensitive matching). The profiles are now
       dominated by operations in the Xapian database.
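
       The per-second figures in these notes are a message count divided by
       an elapsed time. A minimal sketch of that arithmetic follows; the
       helper name msgs_per_sec is hypothetical, and times printed as
       4:24,20 (minutes, seconds, decimal comma) have to be converted by
       hand to seconds with a decimal point (264.20) first.

```shell
# Sketch: messages per second from a message count and elapsed seconds.
# msgs_per_sec is a hypothetical helper, not part of mu.
msgs_per_sec() {
    # LC_ALL=C makes awk read "264.20" with a decimal point in any locale.
    LC_ALL=C awk -v n="$1" -v t="$2" 'BEGIN { printf "%.0f\n", n / t }'
}

msgs_per_sec 27273 264.20   # note (i), first scan, wall-clock time
msgs_per_sec 27273 0.905    # note (i), warm cache, wall-clock time
```

       With the numbers from note (i), the two calls print 103 and 30136.
       Dividing by user CPU time instead of wall-clock time gives much
       larger figures, so it matters which of the two a comparison uses.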

FILES
       By default, mu index stores its message database	in  ~/.mu/xapian;  the
       database	 has an	embedded version number, and mu	will automatically up-
       date it when it notices a different version. This allows	for  automatic
       updating	 of  mu-versions,  without the need to clear out any old data-
       bases.

       However,	note that versions of mu before	0.7 used a  different  scheme,
       which  puts  the	 database in ~/.mu/xapian-_version_. These older data-
       bases can safely	be deleted. Starting from  version  0.7,  this	manual
       cleanup should no longer	be needed.

       mu stores logs of its operations and queries in _muhome_/mu.log (by
       default, this is ~/.mu/mu.log). Upon startup, mu checks the size of
       this log file. If it exceeds 1 MB, mu moves it to ~/.mu/mu.log.old,
       overwriting any existing file of that name, and starts with an empty
       log file. This scheme allows for continued use of mu without the need
       for any manual maintenance of log files.
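
       The rotation scheme can be pictured with the following shell sketch.
       It is an illustration only, not mu's own code: the helper name
       rotate_log is hypothetical, and reading "1 MB" as 1000000 bytes is an
       assumption.

```shell
# Sketch of the log rotation described above; not mu's actual code.
# Assumption: "1 MB" is taken to mean 1000000 bytes.
rotate_log() {
    log=$1
    [ -f "$log" ] || return 0
    # Arithmetic expansion strips the padding some wc versions print.
    size=$(($(wc -c < "$log")))
    if [ "$size" -gt 1000000 ]; then
        mv -f "$log" "$log.old"   # overwrites any earlier .old file
        : > "$log"                # start again with an empty log
    fi
}
```

       Because the single .old file is overwritten each time, at most two
       log files exist at once under this scheme.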

ENVIRONMENT
       mu index	uses MAILDIR to	find the user's	Maildir	if  it	has  not  been
       specified  explicitly  with --maildir=_maildir_.	If MAILDIR is not set,
       mu index	will try ~/Maildir.
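
       The lookup order can be sketched in shell as follows. The helper
       resolve_maildir is hypothetical and only mirrors the order described
       here (explicit option, then MAILDIR, then ~/Maildir); unlike the text
       above, the ${MAILDIR:-...} expansion also treats an empty MAILDIR as
       unset.

```shell
# Sketch of the documented lookup order: --maildir, then MAILDIR, then
# ~/Maildir. resolve_maildir is a hypothetical helper, not a mu command.
resolve_maildir() {
    if [ -n "${1:-}" ]; then
        printf '%s\n' "$1"                          # explicit --maildir value
    else
        printf '%s\n' "${MAILDIR:-$HOME/Maildir}"   # environment, then default
    fi
}
```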

RETURN VALUE
       mu index returns 0 upon successful completion; any other number
       greater than 0 signals an error.
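
       A script can act on this return value in the usual shell way. The
       sketch below uses a hypothetical wrapper, run_and_report, that takes
       the command to run as its arguments, so it can be tried without mu
       installed.

```shell
# Sketch: act on the exit status of an indexing command.
# run_and_report is a hypothetical wrapper, not part of mu.
run_and_report() {
    if "$@"; then
        echo "indexing succeeded"
    else
        status=$?   # exit status of the command that just failed
        echo "indexing failed with status $status" >&2
        return "$status"
    fi
}
```

       A typical call would be run_and_report mu index --quiet, for example
       from a cron script.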

BUGS
       Please report bugs if you find them: https://github.com/djcb/mu/issues

AUTHOR
       Dirk-Jan	C. Binnema <djcb@djcbsoftware.nl>

SEE ALSO
       maildir(5), mu(1), mu-find(1), mu-cfind(1)

User Manuals			   July	2016			   MU-INDEX(1)
