COROSYNC_CONF(5)  Corosync Cluster Engine Programmer's Manual COROSYNC_CONF(5)

       corosync.conf - corosync	executive configuration	file


       The corosync.conf instructs the corosync	executive about	various	param-
       eters needed to control the corosync executive.	Empty lines and	 lines
       starting	with # character are ignored.  The configuration file consists
       of bracketed top	level directives.  The possible	directive choices are:

       totem { }
	      This top level directive contains	configuration options for  the
	      totem protocol.

       logging { }
	      This top level directive contains configuration options for
	      logging.

       quorum {	}
	      This top level directive contains configuration options for
	      quorum.

       nodelist	{ }
	      This  top	 level	directive  contains  configuration options for
	      nodes in cluster.

       system {	}
	      This top level directive contains	configuration options  related
	      to system.

       resources { }
	      This top level directive contains configuration options for
	      resources.

       The interface sub-directive of totem is optional for UDP and knet
       transports.

       For  knet,  multiple  interface	subsections define parameters for each
       knet link on the	system.

       For UDPU	an interface section is	not needed and it is recommended  that
       the nodelist is used to define cluster nodes.

       linknumber
	      This specifies the link number for the interface.  When using
	      the knet protocol, each interface	should specify	separate  link
	      numbers  to  uniquely  identify to the membership	protocol which
	      interface	to use for which link.	The linknumber must  start  at
	      0. For UDP the only supported linknumber is 0.

	      This  specifies  the  priority for the link when knet is used in
	      'passive'	mode. (see link_mode below)

       knet_ping_interval
	      This specifies the interval between knet link pings.
	      knet_ping_interval and knet_ping_timeout are a pair: if one is
	      specified, the other should be too; otherwise one will be
	      calculated from the token timeout and one will be taken from the
	      config file.  (default is token timeout / (knet_pong_count*2))
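
       The default interval quoted above is plain arithmetic.  A minimal
       Python sketch, assuming the token timeout is given in milliseconds; the
       function name and the sample knet_pong_count value are illustrative
       and are not taken from corosync:

```python
def default_knet_ping_interval(token_ms: float, knet_pong_count: int) -> float:
    """Default ping interval: token timeout / (knet_pong_count * 2)."""
    if knet_pong_count <= 0:
        raise ValueError("knet_pong_count must be positive")
    return token_ms / (knet_pong_count * 2)


# Illustrative inputs only: a 1000 ms token and an assumed pong count of 2.
print(default_knet_ping_interval(1000, 2))
```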

       knet_ping_timeout
	      If no ping is received within this time, the knet link is
	      declared dead.  knet_ping_interval and knet_ping_timeout are a
	      pair: if one is specified, the other should be too; otherwise
	      one will be calculated from the token timeout and one will be
	      taken from the config file.  (default is token timeout /

	      How  many	 values	 of  latency are used to calculate the average
	      link latency. (default 2048 samples)

       knet_pong_count
	      How many valid ping/pongs before a link is marked UP.  (default

	      Which  IP	 transport knet	should use. valid values are "sctp" or
	      "udp". (default: udp)

       bindnetaddr (udp	only)
	      This specifies the network address the corosync executive	should
	      bind to when using udp.

	      bindnetaddr (udp only) should be an IP address configured	on the
	      system, or a network address.

	      For example, set bindnetaddr either to the local interface's own
	      IP address or to the network address obtained by applying the
	      interface's netmask to that address.
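
       To make the two accepted forms concrete, here is a small Python sketch
       that derives both candidates from an interface address and netmask.
       The addresses and the helper name are hypothetical and chosen only for
       illustration:

```python
import ipaddress


def bindnetaddr_candidates(address: str, netmask: str) -> tuple[str, str]:
    """Return (interface address, network address) for an IPv4 interface."""
    iface = ipaddress.IPv4Interface(f"{address}/{netmask}")
    return str(iface.ip), str(iface.network.network_address)


# Hypothetical interface settings, not taken from the manual page.
print(bindnetaddr_candidates("192.168.5.92", "255.255.255.0"))
print(bindnetaddr_candidates("192.168.5.92", "255.255.255.192"))
```

       Either element of the returned pair would be a plausible bindnetaddr
       value for that interface under the rule described above.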

	      This may also be an IPv6 address, in which case IPv6 networking
	      will be used.  In	this case, the exact address must be specified
	      and there	is no automatic	selection  of  the  network  interface
	      within a specific	subnet as with IPv4.

	      If IPv6 networking is used, the nodeid field in nodelist must be
	      specified.

       broadcast (udp only)
	      This is optional and can be set to yes.  If it is	 set  to  yes,
	      the  broadcast  address will be used for communication.  If this
	      option is	set, mcastaddr should not be set.

       mcastaddr (udp only)
	      This is the multicast address used by corosync  executive.   The
	      default  should work for most networks, but the network adminis-
	      trator should be queried	about  a  multicast  address  to  use.
	      Avoid 224.x.x.x because this is a	"config" multicast address.

	      This may also be an IPv6 multicast address, in which case IPv6
	      networking will be used.	If IPv6	networking is used, the	nodeid
	      field in nodelist	must be	specified.

	      It's  not	necessary to use this option if	cluster_name option is
	      used. If both options are	used, mcastaddr	has higher priority.

       mcastport (udp only)
	      This specifies the UDP port number.  It is possible to use the
	      same multicast address on a network with the corosync services
	      configured for different UDP ports.  Please note that corosync
	      uses two UDP ports: mcastport (for mcast receives) and
	      mcastport - 1 (for mcast sends).  If you have multiple clusters
	      on the same network using the same mcastaddr, please configure
	      the mcastports with a gap.
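
       The need for a gap follows from each cluster occupying two adjacent
       ports.  A minimal Python sketch of that check; the function names and
       port numbers are illustrative assumptions:

```python
def ports_used(mcastport: int) -> set[int]:
    """UDP ports one cluster occupies: mcastport (receive), mcastport - 1 (send)."""
    return {mcastport, mcastport - 1}


def clusters_conflict(mcastport_a: int, mcastport_b: int) -> bool:
    """True if two clusters sharing an mcastaddr would share a UDP port."""
    return bool(ports_used(mcastport_a) & ports_used(mcastport_b))


# Port numbers are examples only.
print(clusters_conflict(5405, 5406))  # adjacent mcastports share a port
print(clusters_conflict(5405, 5407))  # a gap of two keeps the port sets apart
```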

       ttl (udp	only)
	      This specifies the Time To Live (TTL). If	you run	 your  cluster
	      on  a  routed network then the default of	"1" will be too	small.
	      This option provides a way to increase this up to	255. The valid
	      range is 0..255.

       Within  the  totem  directive, there are	seven configuration options of
       which one is required, five are optional, and one is required when IPV6
       is  configured  in  the interface subdirective.	The required directive
       controls	the version of the totem configuration.	 The  optional	option
       unless  using  IPV6 directive controls identification of	the processor.
       The optional options control secrecy and	 authentication,  the  network
       mode of operation and maximum network MTU field.

       version
	      This specifies the version of the configuration file.  Currently
	      the only valid version for this directive	is 2.

       clear_node_high_bit
	      This configuration option is optional and is only relevant when
	      no nodeid is specified.  Some corosync clients require a signed
	      32 bit nodeid that is greater than zero; however, by default
	      corosync uses all 32 bits of the IPv4 address space when
	      generating a nodeid.  Set this option to yes to force the high
	      bit to be zero and therefore ensure the nodeid is a positive
	      signed 32 bit integer.

	      WARNING: The cluster's behavior is undefined if this option is
	      enabled on only a subset of the cluster (for example during a
	      rolling upgrade).
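
       The effect of clearing the high bit can be sketched in Python.  The
       sample address and the big-endian interpretation of the address are
       assumptions made for illustration; the byte order corosync actually
       uses when deriving a nodeid is not stated in this page:

```python
import ipaddress


def clear_high_bit(nodeid: int) -> int:
    """Force bit 31 to zero so the value is a non-negative signed 32-bit int."""
    return nodeid & 0x7FFFFFFF


# Assumption: treat the IPv4 address as a big-endian 32-bit integer.
raw = int(ipaddress.IPv4Address("192.168.1.10"))
print(raw, clear_high_bit(raw))
```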

	      This specifies which cryptographic library  should  be  used  by
	      knet. Options are	nss and	openssl.

	      The default is nss.

       crypto_hash
	      This specifies which HMAC authentication should be used to
	      authenticate all messages.  Valid values are none (no
	      authentication), md5, sha1, sha256, sha384 and sha512.
	      Encrypted transmission is only supported for the knet transport.

	      The default is none.

       crypto_cipher
	      This specifies which cipher should be used to encrypt all
	      messages.  Valid values are none (no encryption), aes256,
	      aes192, aes128 and 3des.  Enabling crypto_cipher also requires
	      enabling crypto_hash.  Encrypted transmission is only supported
	      for the knet transport.

	      The default is none.

       keyfile
	      This specifies the fully qualified path to the shared key used
	      to authenticate and encrypt data used within the Totem protocol.

	      The default is /etc/corosync/authkey.

       key    Shared key stored	in configuration instead of authkey file. This
	      option has lower precedence than keyfile	option	so  it's  used
	      only  when  keyfile  is not specified.  Using this option	is not
	      recommended for security reasons.

       link_mode
	      This specifies the Kronosnet mode, which may be passive, active,
	      or rr (round-robin).  passive: the active link with the lowest
	      priority will be used.  If one or more links share the same
	      priority, the one with the lowest link ID will be used.  active:
	      all active links will be used simultaneously to send traffic;
	      link priority is ignored.  rr: round-robin policy; each packet
	      will be sent to the next active link in order.

	      If only one interface directive is specified, passive  is	 auto-
	      matically	chosen.

	      The  maximum number of interface directives that is allowed with
	      Kronosnet	is 8. For other	transports it is 1.

       netmtu This specifies the network maximum transmit unit.  Setting this
	      value beyond 1500, the regular frame MTU, requires Ethernet
	      devices that support large (also called jumbo) frames.  If any
	      device in the network doesn't support large frames, the protocol
	      will not operate properly.  The hosts must also have their MTU
	      size changed from 1500 to whatever frame size is specified here.

	      Please  note  while some NICs or switches	claim large frame sup-
	      port, they support 9000 MTU as the maximum frame size  including
	      the  IP  header.	 Setting the netmtu and	host MTUs to 9000 will
	      cause totem to use the full 9000 bytes of	the frame.  Then Linux
	      will add an 18 byte header, moving the full frame size to 9018.
	      As a result some hardware	will not operate  properly  with  this
	      size  of data.  A	netmtu of 8982 seems to	work for the few large
	      frame devices that have been tested.  Some  manufacturers	 claim
	      large  frame  support  when  in fact they	support	frame sizes of
	      4500 bytes.
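
       The numbers in the paragraph above are related by the 18 byte header.
       A small Python sketch of that arithmetic; the constant and function
       names are illustrative, and the header size is simply the figure
       quoted in the text:

```python
LINK_HEADER_BYTES = 18  # per-frame header size quoted in the text above


def frame_size(netmtu: int) -> int:
    """Full frame size on the wire for a given netmtu."""
    return netmtu + LINK_HEADER_BYTES


def largest_netmtu_for(max_frame: int) -> int:
    """Largest netmtu whose full frame still fits in max_frame bytes."""
    return max_frame - LINK_HEADER_BYTES


print(frame_size(9000))          # exceeds a 9000 byte hardware limit
print(largest_netmtu_for(9000))  # the 8982 value mentioned above
```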

	      When sending multicast traffic, if the network frequently	recon-
	      figures,	chances	 are  that  some device	in the network doesn't
	      support large frames.

	      Choose hardware carefully if intending to use large frame
	      support.

	      The default is 1500.

       transport
	      This directive controls the transport mechanism used.  The
	      default is knet.  The transport type can also be set to udpu or
	      udp.  Only knet allows crypto or multiple interfaces per node.

       cluster_name
	      This specifies the name of the cluster.  It is used to
	      automatically generate the multicast address.

       config_version
	      This specifies the version of the config file.  It is converted
	      to an unsigned 64-bit integer.  The default is 0.  The option is
	      used to prevent nodes with an out-of-date configuration from
	      joining.  If the value is not 0 and the node is moving for the
	      first time from single-node membership to multiple-node
	      membership (only the first time; a join after a split does not
	      follow this rule), the config_versions of the other nodes are
	      collected.  If the current node's config_version is not equal to
	      the highest of the collected versions, corosync is terminated.
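
       The rule above can be read as the following Python sketch.  It follows
       the text literally; the function name is hypothetical, and treating an
       empty collection as "nothing to compare" is an assumption of this
       sketch rather than documented behaviour:

```python
def may_stay_running(local_version: int, collected_versions: list[int]) -> bool:
    """Return False where the text says corosync is terminated."""
    if local_version == 0:
        return True  # 0 means the check is not applied
    if not collected_versions:
        return True  # assumption: nothing collected, nothing to compare
    return local_version == max(collected_versions)


print(may_stay_running(3, [3, 2]))
print(may_stay_running(2, [3, 2]))
```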

	      For  udp or udpu,	this specifies version of IP to	use for	commu-
	      nication.	 The value can be one of ipv4 or ipv6. Default (if un-
	      specified) is ipv4.  This	does not apply to knet where both ipv4
	      and ipv6 address can be used, provided they  are	consistent  on
	      each link.

	      Within  the totem	directive, there are several configuration op-
	      tions which are used to control the operation of	the  protocol.
	      It  is  generally	 not recommended to change any of these	values
	      without proper guidance and sufficient testing.	Some  networks
	      may  require larger values if suffering from frequent reconfigu-
	      rations.	Some applications may require faster failure detection
	      times which can be achieved by reducing the token	timeout.

       token  This timeout is used directly or as a base for the real token
	      timeout calculation (explained in the token_coefficient
	      section).  The token timeout specifies the time in milliseconds
	      until a token loss is declared after not receiving a token.
	      This is the time spent detecting a failure of a processor in the
	      current configuration.  Reforming a new configuration takes
	      about 50 milliseconds in addition to this timeout.

	      The real token timeout used by totem can be read from the cmap
	      key runtime.config.totem.token.

	      The default is 1000 milliseconds.

	      Specifies	the interval between warnings that the token  has  not
	      been  received.	The value is a percentage of the token timeout
	      and can be set to	0 to disable warnings.

	      The default is 75%.

       token_coefficient
	      This value is used only when the nodelist section is specified
	      and contains at least 3 nodes.  If so, the real token timeout is
	      computed as token + (number_of_nodes - 2) * token_coefficient.
	      This allows the cluster to scale without manually changing the
	      token timeout every time a new node is added.  This value can be
	      set to 0, which effectively disables this feature.

	      The default is 650 milliseconds.
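
       As a worked example of the formula above, the following Python sketch
       computes the real token timeout.  The function name is illustrative;
       the 650 ms default is the token_coefficient default stated above:

```python
def real_token_timeout(token_ms: int, nodes: int,
                       token_coefficient_ms: int = 650) -> int:
    """token + (number_of_nodes - 2) * token_coefficient, for 3+ nodes."""
    if nodes < 3:
        return token_ms  # the coefficient only applies from 3 nodes upward
    return token_ms + (nodes - 2) * token_coefficient_ms


for n in (2, 3, 5):
    print(n, real_token_timeout(1000, n))
```

       With a token of 1000 ms, a five-node nodelist gives 1000 + 3 * 650 =
       2950 ms.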

	      This timeout specifies in milliseconds how long to wait for a
	      token before the token is retransmitted.  This will be
	      automatically calculated if token is modified.  It is not
	      recommended to alter this value without guidance from the
	      corosync community.

	      The default is 238 milliseconds.

	      The (optional) type of compression used by Kronosnet.  The
	      values available depend on the build and also on the available
	      libraries.  Typically zlib and lz4 will be available, but bzip2
	      and others could also be allowed.  The default is 'none'.

	      Tells knet to NOT	compress any packets that are smaller than the
	      value indicated. Default 100 bytes.

	      Set to 0 to reset to the default.  Set to 1 to compress
	      everything.

	      Many compression libraries allow tuning of  compression  parame-
	      ters.  For  example  0 or	1 ... 9	are commonly used to determine
	      the level	of compression.	This value is passed unmodified	to the
	      compression  library  so	it  is	recommended to consult the li-
	      brary's documentation for	more detailed information.

       hold   This timeout specifies in	milliseconds how long the token	should
	      be  held	by  the	 representative	when the protocol is under low
	      utilization.   It	is not recommended to alter this value without
	      guidance from the	corosync community.

	      The default is 180 milliseconds.

	      This  value  identifies how many token retransmits should	be at-
	      tempted before forming a new configuration.  If  this  value  is
	      set,  retransmit	and hold will be automatically calculated from
	      retransmits_before_loss and token.

	      The default is 4 retransmissions.

       join   This timeout specifies in	milliseconds how long to wait for join
	      messages in the membership protocol.

	      The default is 50	milliseconds.

       send_join
	      This timeout specifies in milliseconds an upper range between 0
	      and send_join to wait before sending a join message.  For
	      configurations with fewer than 32 nodes, this parameter is not
	      necessary.  For larger rings, this parameter is necessary to
	      ensure the NIC is not overflowed with join messages on formation
	      of a new ring.  A reasonable value for large rings (128 nodes)
	      would be 80 msec.  Other timer values must also change if this
	      value is changed.  Seek advice from the corosync mailing list if
	      trying to run larger configurations.

	      The default is 0 milliseconds.

       consensus
	      This timeout specifies in milliseconds how long to wait for
	      consensus to be achieved before starting a new round of
	      membership configuration.  The minimum value for consensus must
	      be 1.2 * token.  This value will be automatically calculated at
	      1.2 * token if the user doesn't specify a consensus value.

	      For  two node clusters, a	consensus larger than the join timeout
	      but less than token is safe.  For	three node or larger clusters,
	      consensus	 should	 be larger than	token.	There is an increasing
	      risk of odd membership changes, which  still  guarantee  virtual
	      synchrony,  as node count	grows if consensus is less than	token.

	      The default is 1200 milliseconds.

       merge  This  timeout  specifies in milliseconds how long	to wait	before
	      checking for a partition when  no	 multicast  traffic  is	 being
	      sent.   If  multicast traffic is being sent, the merge detection
	      happens automatically as a function of the protocol.

	      The default is 200 milliseconds.

	      This timeout specifies in	milliseconds how long to  wait	before
	      checking that a network interface is back up after it has been
	      downed.

	      The default is 1000 milliseconds.

	      This constant specifies how many rotations of the	token  without
	      receiving	 any  of the messages when messages should be received
	      may occur	before a new configuration is formed.

	      The default is 2500 failures to receive a	message.

	      This constant specifies how many rotations of the	token  without
	      any multicast traffic should occur before the hold timer is
	      started.

	      The default is 30	rotations.

	      [HeartBeating mechanism] Configures  the	optional  HeartBeating
	      mechanism	for faster failure detection. Keep in mind that	engag-
	      ing this mechanism in lossy networks  could  cause  faulty  loss
	      declaration as the mechanism relies on the network for
	      heartbeating.

	      So as a rule of thumb, use this mechanism if you require
	      improved failure detection in low to medium utilized networks.

	      This constant specifies the number of heartbeat failures the
	      system should tolerate before declaring heartbeat failure, e.g.
	      3.  If this value is not set or is 0, the heartbeat mechanism is
	      not engaged in the system and token rotation is the method of
	      failure detection.

	      The default is 0 (disabled).

	      [HeartBeating mechanism] This constant specifies in milliseconds
	      the approximate delay that your network takes to transport one
	      packet from one machine to another.  This value is to be set by
	      system engineers; please don't change it if you are not sure, as
	      this affects the failure detection mechanism using heartbeat.

	      The default is 50	milliseconds.

       window_size
	      This constant specifies the maximum number of messages that may
	      be sent on one token rotation.  If all processors perform
	      equally well, this value could be large (300), which would
	      introduce higher latency from origination to delivery for very
	      large rings.  To reduce latency in large rings (16+), the
	      defaults are a safe compromise.  If one or more slow processors
	      are present among fast processors, window_size should be no
	      larger than 256000 / netmtu to avoid overflow of the kernel
	      receive buffers.  The user is notified of this by the display of
	      a retransmit list in the notification logs.  There is no loss of
	      data, but performance is reduced when these errors occur.

	      The default is 50	messages.
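
       The 256000 / netmtu bound mentioned above can be evaluated with a
       short Python sketch.  The function name is illustrative, and rounding
       down to a whole number of messages is an assumption of this sketch:

```python
def message_bound(netmtu: int = 1500) -> int:
    """Upper bound on messages derived from 256000 / netmtu, rounded down."""
    if netmtu <= 0:
        raise ValueError("netmtu must be positive")
    return 256000 // netmtu


print(message_bound())      # with the default netmtu of 1500
print(message_bound(8982))  # with the large-frame netmtu discussed earlier
```

       With the default netmtu of 1500 the bound is 170, so the default of 50
       messages is well inside it.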

       max_messages
	      This constant specifies the maximum number of messages that may
	      be sent by one processor on receipt of the token.  The
	      max_messages parameter is limited to 256000 / netmtu to prevent
	      overflow of the kernel transmit buffers.

	      The default is 17	messages.

	      This constant defines the maximum number of times on receipt of
	      a token a message is checked for retransmission before a
	      retransmission occurs.  This parameter is useful to modify for
	      switches that delay multicast packets compared to unicast
	      packets.  The default setting works well for nearly all modern
	      switches.

	      The default is 5 messages.

	      How  often  the knet PMTUd runs to look for network MTU changes.
	      Value in seconds,	default: 30

       Within the logging directive, there are several	configuration  options
       which are all optional.

       The following 4 options are valid only for the top level logging
       directive:

	      This specifies that a timestamp is placed	on all	log  messages.
	      It  can be one of	off (no	timestamp), on (second precision time-
	      stamp) or	hires (millisecond precision  timestamp	 -  only  when
	      supported	by LibQB).

	      The default is hires (or on if hires is not supported).

	      This specifies that file and line	should be printed.

	      The default is off.

	      This specifies that the code function name should	be printed.

	      The default is off.

	      This specifies that blackbox functionality should	be enabled.

	      The default is on.

       The  following  options	are valid both for top level logging directive
       and they	can be overridden in logger_subsys entries.



       to_stderr
       to_logfile
       to_syslog
	      These specify the destination of logging output. Any combination
	      of these options may be specified. Valid options are yes and no.

	      The default is syslog and	stderr.

	      Please note, if you are using to_logfile and want to rotate the
	      file, use logrotate(8) with the option copytruncate, e.g.:
	      /var/log/corosync.log {
		   rotate 7
		   copytruncate
	      }

	      If the to_logfile directive is set to yes, this option specifies
	      the pathname of the log file.

	      No default.

	      This  specifies the logfile priority for this particular subsys-
	      tem. Ignored if debug is on.  Possible values are: alert,	 crit,
	      debug (same as debug = on), emerg, err, info, notice, warning.

	      The default is: info.

	      This specifies the syslog facility type that will be used for
	      any messages sent to syslog.  Options are daemon, local0,
	      local1, local2, local3, local4, local5, local6 and local7.

	      The default is daemon.

	      This  specifies  the syslog level	for this particular subsystem.
	      Ignored if debug is on.  Possible	values are: alert, crit, debug
	      (same as debug = on), emerg, err,	info, notice, warning.

	      The default is: info.

       debug  This specifies whether debug output is logged for this
	      particular logger.  It can also be set to trace, which is the
	      highest level of debug information.

	      The default is off.

       Within the logging directive, logger_subsys directives are optional.

       Within  the  logger_subsys sub-directive, all of	the above logging con-
       figuration options are valid and	can be used to	override  the  default
       settings.   The subsys entry, described below, is mandatory to identify
       the subsystem.

       subsys This specifies the subsystem identity (name) for	which  logging
	      is  specified.  This  is	the  name  used	 by  a	service	in the
	      log_init() call. E.g. 'CPG'. This	directive is required.

       Within the quorum directive it is possible to specify the quorum	 algo-
       rithm to	use with the

	      directive.  At  the  time	of writing only	corosync_votequorum is
	      supported.  See votequorum(5) for	configuration options.

       Within the nodelist directive it is possible to specify specific
       information about the nodes in the cluster.  The directive can contain
       only the node sub-directive, which specifies every node that should be
       a member of the membership, and where non-default options are needed.
       Every node must have at least the ring0_addr field filled.

       Every node that should be a member of the membership must be specified.

       Possible	options	are:

       ringX_addr
	      This specifies the IP address or network hostname of the
	      particular node.  X is a link number.

       nodeid This  configuration option is required for each node for Kronos-
	      net mode.	 It is a 32 bit	value specifying the  node  identifier
	      delivered	to the cluster membership service. The node identifier
	      value of zero is reserved	and should not be  used.  If  knet  is
	      set, this	field must be set.

       name   This option is used mainly with the knet transport to identify
	      the local node.  It is also used by client software (pacemaker).
	      The algorithm for identifying the local node is as follows:

	      1.     Look up $HOSTNAME in the nodelist.

	      2.     If this fails, strip the domain name from $HOSTNAME and
		     look that up in the nodelist.

	      3.     If this fails, look in the nodelist for a fully-qualified
		     name whose short version matches the short version of
		     $HOSTNAME.

	      4.     If all this fails, search the interfaces list for an
		     address that matches a name in the nodelist.
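
       The four lookup steps above can be sketched in Python as follows.  This
       is an illustration, not corosync's implementation: the function names
       are hypothetical, the nodelist is reduced to a list of name strings,
       and step 4 is interpreted as matching a nodelist name that is
       literally one of the local interface addresses:

```python
from typing import Iterable, Optional


def short_name(name: str) -> str:
    """Strip the domain part: 'a.example.org' -> 'a'."""
    return name.split(".", 1)[0]


def find_local_node(hostname: str, names: list[str],
                    local_addresses: Iterable[str] = ()) -> Optional[str]:
    # 1. Exact match on the host name.
    if hostname in names:
        return hostname
    # 2. Match on the host name with its domain stripped.
    short = short_name(hostname)
    if short in names:
        return short
    # 3. A fully-qualified nodelist name whose short form matches.
    for name in names:
        if "." in name and short_name(name) == short:
            return name
    # 4. Interpretation: a nodelist name that equals a local address.
    addresses = set(local_addresses)
    for name in names:
        if name in addresses:
            return name
    return None


print(find_local_node("node1.example.org", ["node1", "node2"]))
```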

       Within the system directive it is possible to specify system options.

       Possible	options	are:

	      This specifies the type of IPC to use.  It can be one of native
	      (default), shm and socket.  Native means one of shm or socket,
	      depending on what is supported by the OS.  On systems with
	      support for both, SHM is selected.  SHM is generally faster, but
	      needs to allocate a ring buffer file in /dev/shm.

       sched_rr
	      Should be set to yes (default) if corosync should try to set
	      round robin realtime scheduling with maximal priority to itself.
	      When setting of the scheduler fails, corosync falls back to
	      setting maximal priority.

	      Set priority of corosync process.  Valid only when sched_rr is
	      set to no.  Can be either a numeric value with similar meaning
	      as nice(1), or max / min meaning maximal / minimal priority (so
	      minimal / maximal nice value).

	      Should  be  set  to yes (default)	if corosync should try to move
	      itself to	root cgroup. This feature is available only  for  sys-
	      tems  with  cgroups  with	 RT  sched  enabled  (Linux  with CON-
	      FIG_RT_GROUP_SCHED kernel	option).

	      Existing directory where corosync	should	chdir  into.  Corosync
	      stores important state files and blackboxes there.

	      The default is /var/lib/corosync.

       Within the resources directive it is possible to specify options for
       resources.

       Possible	option is:

	      (Valid only if Corosync was compiled with	watchdog support.)
	      Watchdog device to use, for example  /dev/watchdog.   If	unset,
	      empty or "off", no watchdog is used.

	      In  a  cluster with properly configured power fencing a watchdog
	      provides no additional value.  On	the other hand,	slow  watchdog
	      communication may	incur multi-second delays in the Corosync main
	      loop, potentially	breaking down membership.  IPMI	watchdogs  are
	      particularly   notorious	 in   this  regard:  read  about  kip-
	      mid_max_busy_us in IPMI.txt in the Linux kernel documentation.

       For example, suppose you want to add a node with nodeid 3.  The node
       has the name NEW (in DNS or /etc/hosts) and is not currently running
       corosync.  The current corosync.conf nodelist looks like this:

	      nodelist {
		  node {
		      nodeid: 1
		      name: node1
		  }
		  node {
		      nodeid: 2
		      name: node2
		  }
	      }


       Add a new entry for the node below the  existing	 nodes.	 Node  entries
       don't  have  to	be in nodeid order, but	it will	help keep you sane. So
       the nodelist now	looks like this:

	      nodelist {
		  node {
		      nodeid: 1
		      name: node1
		  }
		  node {
		      nodeid: 2
		      name: node2
		  }
		  node {
		      nodeid: 3
		      name: NEW
		  }
	      }


       This file must then be copied onto all three nodes -  the existing  two
       nodes,  and  the	 new one.  On one of the existing corosync nodes, tell
       corosync	to re-read the updated config file into	memory:

	      corosync-cfgtool -R

       This command only needs to be run on one	node in	the cluster.  You  may
       then  start corosync on the NEW node and	it should join the cluster. If
       this doesn't work as expected then check	the communications between all
       three  nodes  is	 working,  and check the syslog	files on all nodes for
       more information.  It's important to note that the key bit of
       information about a node failing to join might be on a different node
       than you expect.

       This is the reverse procedure to	'Adding	a node'	above. First you  need
       to shut down the	node you will be removing from the cluster.

	      corosync-cfgtool -H

       Then delete that node's stanza from the nodelist in corosync.conf and
       finally update corosync on the remaining nodes by running

	      corosync-cfgtool -R

       on one of them.

	      The corosync executive configuration file.

       corosync_overview(7), votequorum(5), corosync-qdevice(8), logrotate(8)

corosync Man Page		  2018-11-13		      COROSYNC_CONF(5)

