fi_domain(3)			   #VERSION#			  fi_domain(3)

NAME
       fi_domain - Open	a fabric access	domain

SYNOPSIS
	      #include <rdma/fabric.h>

	      #include <rdma/fi_domain.h>

	      int fi_domain(struct fid_fabric *fabric, struct fi_info *info,
		  struct fid_domain **domain, void *context);

	      int fi_close(struct fid *domain);

	      int fi_domain_bind(struct	fid_domain *domain, struct fid *eq,
		  uint64_t flags);

	      int fi_open_ops(struct fid *domain, const	char *name, uint64_t flags,
		  void **ops, void *context);

	      int fi_set_ops(struct fid	*domain, const char *name, uint64_t flags,
		  void *ops, void *context);

ARGUMENTS
       fabric Opened fabric in which to open the access domain.

       info   Fabric   information,  including	domain	capabilities  and  at-
	      tributes.

       domain An opened	access domain.

       context
	      User specified context associated	with the domain.  This context
	      is  returned  as	part of	any asynchronous event associated with
	      the domain.

       eq     Event queue for asynchronous operations initiated	on the domain.

       name   Name associated with an interface.

       ops    Fabric interface operations.

DESCRIPTION
       An access domain	typically refers to a physical or virtual NIC or hard-
       ware  port;  however, a domain may span across multiple hardware	compo-
       nents for fail-over or data striping purposes.  A  domain  defines  the
       boundary	 for  associating  different  resources	 together.  Fabric re-
       sources belonging to the	same domain may	share resources.

   fi_domain
       Opens a fabric access domain, also referred to as  a  resource  domain.
       Fabric  domains are identified by a name.  The properties of the	opened
       domain are specified using the info parameter.

   fi_open_ops
       fi_open_ops is used to open provider specific interfaces.  Provider in-
       terfaces	 may be	used to	access low-level resources and operations that
       are specific to the opened resource domain.  The	details	of domain  in-
       terfaces	are outside the	scope of this documentation.

   fi_set_ops
       fi_set_ops  assigns callbacks that a provider should invoke in place of
       performing selected tasks.  This	allows users to	modify	or  control  a
       provider's default behavior.  Conceptually, it allows the user to hook
       specific functions used by a provider and replace them with their own.

       The operations being modified are identified using a well-known charac-
       ter string, passed as the name parameter.  The format of	the ops	param-
       eter is dependent upon the name value.  The ops parameter  will	refer-
       ence  a	structure  containing the callbacks and	other fields needed by
       the provider to invoke the user's functions.

       If a provider accepts the override, it will return FI_SUCCESS.  If  the
       override	 is  unknown  or  not  supported,  the	provider  will	return
       -FI_ENOSYS.  Overrides should be	set prior to allocating	 resources  on
       the domain.

       The  following  fi_set_ops operations and corresponding callback	struc-
       tures are defined.

       FI_SET_OPS_HMEM_OVERRIDE	- Heterogeneous	Memory Overrides

       HMEM override allows  users  to	override  HMEM	related	 operations  a
       provider	 may perform.  Currently, the scope of the HMEM	override is to
       allow a user to define the memory movement functions a provider	should
       use  when  accessing  a	user buffer.  The user-defined memory movement
       functions need to account for all the  different	 HMEM  iface  types  a
       provider	may encounter.

       All objects allocated against a domain will inherit this	override.

       The following is	the HMEM override operation name and structure.

	      #define FI_SET_OPS_HMEM_OVERRIDE "hmem_override_ops"

	      struct fi_hmem_override_ops {
		  size_t  size;

                  ssize_t (*copy_from_hmem_iov)(void *dest, size_t size,
                      enum fi_hmem_iface iface, uint64_t device,
                      const struct iovec *hmem_iov, size_t hmem_iov_count,
                      uint64_t hmem_iov_offset);

                  ssize_t (*copy_to_hmem_iov)(enum fi_hmem_iface iface,
                      uint64_t device, const struct iovec *hmem_iov,
                      size_t hmem_iov_count, uint64_t hmem_iov_offset,
                      const void *src, size_t size);
	      };

       All  fields  in struct fi_hmem_override_ops must	be set (non-null) to a
       valid value.

       size   This should be set to the	 sizeof(struct	fi_hmem_override_ops).
	      The  size	 field	is used	for forward and	backward compatibility
	      purposes.

       copy_from_hmem_iov
	      Copy data	from the device/hmem to	host  memory.	This  function
	      should  return  a	 negative  fi_errno on error, or the number of
	      bytes copied on success.

       copy_to_hmem_iov
	      Copy data	from host memory to the	 device/hmem.	This  function
	      should  return  a	 negative  fi_errno on error, or the number of
	      bytes copied on success.
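
       As an illustration of the callback contract described above, the
       following is a minimal sketch of the two copy functions for plain host
       memory.  The demo_ names, the enum values, and the helper are
       assumptions made for this example and are not part of libfabric; the
       sketch also assumes that hmem_iov_offset is a byte offset into the
       logical concatenation of the iov segments, and it uses -ENOSYS as a
       stand-in error for interfaces it does not handle.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Stand-in for enum fi_hmem_iface; names and values are illustrative. */
enum demo_hmem_iface { DEMO_HMEM_SYSTEM, DEMO_HMEM_DEVICE };

/* Stand-in mirroring the layout of struct fi_hmem_override_ops. */
struct demo_hmem_override_ops {
    size_t size;
    ssize_t (*copy_from_hmem_iov)(void *dest, size_t size,
        enum demo_hmem_iface iface, uint64_t device,
        const struct iovec *hmem_iov, size_t hmem_iov_count,
        uint64_t hmem_iov_offset);
    ssize_t (*copy_to_hmem_iov)(enum demo_hmem_iface iface, uint64_t device,
        const struct iovec *hmem_iov, size_t hmem_iov_count,
        uint64_t hmem_iov_offset, const void *src, size_t size);
};

/* Skip offset bytes of the iov, then move up to size bytes.
 * to_iov selects the direction: buf -> iov (1) or iov -> buf (0). */
static ssize_t demo_iov_copy(const struct iovec *iov, size_t count,
                             uint64_t offset, void *buf, size_t size,
                             int to_iov)
{
    size_t done = 0;

    assert(iov != NULL || count == 0);
    for (size_t i = 0; i < count && done < size; i++) {
        size_t len = iov[i].iov_len;

        if (offset >= len) {            /* segment lies before the offset */
            offset -= len;
            continue;
        }
        char *seg = (char *)iov[i].iov_base + offset;
        size_t n = len - (size_t)offset;
        if (n > size - done)
            n = size - done;
        if (to_iov)
            memcpy(seg, (char *)buf + done, n);
        else
            memcpy((char *)buf + done, seg, n);
        done += n;
        offset = 0;
    }
    return (ssize_t)done;               /* bytes copied */
}

ssize_t demo_copy_from_hmem_iov(void *dest, size_t size,
    enum demo_hmem_iface iface, uint64_t device,
    const struct iovec *hmem_iov, size_t hmem_iov_count,
    uint64_t hmem_iov_offset)
{
    (void)device;
    if (iface != DEMO_HMEM_SYSTEM)
        return -ENOSYS;                 /* only host memory is handled */
    return demo_iov_copy(hmem_iov, hmem_iov_count, hmem_iov_offset,
                         dest, size, 0);
}

ssize_t demo_copy_to_hmem_iov(enum demo_hmem_iface iface, uint64_t device,
    const struct iovec *hmem_iov, size_t hmem_iov_count,
    uint64_t hmem_iov_offset, const void *src, size_t size)
{
    (void)device;
    if (iface != DEMO_HMEM_SYSTEM)
        return -ENOSYS;
    return demo_iov_copy(hmem_iov, hmem_iov_count, hmem_iov_offset,
                         (void *)src, size, 1);
}

/* Every field is set, as the override structure requires. */
const struct demo_hmem_override_ops demo_ops = {
    .size = sizeof(struct demo_hmem_override_ops),
    .copy_from_hmem_iov = demo_copy_from_hmem_iov,
    .copy_to_hmem_iov = demo_copy_to_hmem_iov,
};
```

       With libfabric itself, a structure of the corresponding real type
       would be passed to fi_set_ops under the FI_SET_OPS_HMEM_OVERRIDE name
       before any resources are allocated on the domain.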

   fi_domain_bind
       Associates an event queue with the domain.  An event queue bound	 to  a
       domain  will  be	 the  default  EQ associated with asynchronous control
       events that occur on the	domain or active endpoints allocated on	a  do-
       main.   This  includes  CM  events.  Endpoints may direct their control
       events to alternate EQs by binding directly with	the EQ.

       Binding an event	queue to a domain with the  FI_REG_MR  flag  indicates
       that  the  provider  should  perform all	memory registration operations
       asynchronously, with the	completion reported through the	 event	queue.
       If  an  event queue is not bound	to the domain with the FI_REG_MR flag,
       then memory registration	requests complete synchronously.

       See fi_av_bind(3), fi_ep_bind(3),  fi_mr_bind(3),  fi_pep_bind(3),  and
       fi_scalable_ep_bind(3) for more information.

   fi_close
       The  fi_close  call  is used to release all resources associated	with a
       domain or interface.  All objects associated  with  the	opened	domain
       must be released	prior to calling fi_close, otherwise the call will re-
       turn -FI_EBUSY.
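
       The close ordering rule can be pictured with a small dependency-count
       model.  The types and functions below are hypothetical stand-ins and
       not libfabric interfaces, and -EBUSY stands in for -FI_EBUSY; the
       sketch only shows why objects opened on a domain have to be closed
       before the domain itself, not how any provider tracks them.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a fabric object such as a domain or endpoint. */
struct demo_obj {
    struct demo_obj *parent;   /* object this one was opened on, or NULL */
    int children;              /* objects still open against this one */
    int is_open;
};

void demo_open(struct demo_obj *obj, struct demo_obj *parent)
{
    assert(obj != NULL);
    obj->parent = parent;
    obj->children = 0;
    obj->is_open = 1;
    if (parent)
        parent->children++;
}

/* Closing fails with a busy error while dependants remain open. */
int demo_close(struct demo_obj *obj)
{
    assert(obj != NULL);
    if (!obj->is_open)
        return -EINVAL;
    if (obj->children > 0)
        return -EBUSY;         /* stand-in for -FI_EBUSY */
    if (obj->parent)
        obj->parent->children--;
    obj->is_open = 0;
    return 0;
}
```

       In this model, releasing objects in the reverse order of their
       creation always succeeds, which is the usual teardown pattern.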

DOMAIN ATTRIBUTES
       The fi_domain_attr structure defines the	set of	attributes  associated
       with a domain.

	      struct fi_domain_attr {
		  struct fid_domain	*domain;
		  char			*name;
		  enum fi_threading	threading;
		  enum fi_progress	control_progress;
		  enum fi_progress	data_progress;
		  enum fi_resource_mgmt	resource_mgmt;
		  enum fi_av_type	av_type;
		  int			mr_mode;
		  size_t		mr_key_size;
		  size_t		cq_data_size;
		  size_t		cq_cnt;
		  size_t		ep_cnt;
		  size_t		tx_ctx_cnt;
		  size_t		rx_ctx_cnt;
		  size_t		max_ep_tx_ctx;
		  size_t		max_ep_rx_ctx;
		  size_t		max_ep_stx_ctx;
		  size_t		max_ep_srx_ctx;
		  size_t		cntr_cnt;
		  size_t		mr_iov_limit;
		  uint64_t		caps;
		  uint64_t		mode;
		  uint8_t		*auth_key;
		  size_t		auth_key_size;
		  size_t		max_err_data;
		  size_t		mr_cnt;
		  uint32_t		tclass;
	      };

   domain
       On  input  to  fi_getinfo,  a user may set this to an opened domain in-
       stance to restrict output to the	given domain.  On output from  fi_get-
       info,  if  no domain was	specified, but the user	has an opened instance
       of the named domain, this will reference	the first opened instance.  If
       no instance has been opened, this field will be NULL.

       The  domain  instance  returned by fi_getinfo should only be considered
       valid if	the application	does not close any domain instances  from  an-
       other thread while fi_getinfo is	being processed.

   Name
       The name	of the access domain.

   Multi-threading Support (threading)
       The threading model specifies the level of serialization	required of an
       application when	using the libfabric data transfer interfaces.  Control
       interfaces  are	always	considered thread safe,	and may	be accessed by
       multiple threads.  Applications that can guarantee serialization in
       their access to provider allocated resources and interfaces enable a
       provider to eliminate lower-level locks.

       FI_THREAD_COMPLETION
              The completion threading model is intended for providers that
              make use of manual progress.  Applications must serialize access
              to all objects that are associated through a shared completion
              structure.  This includes endpoint, transmit context, receive
              context, completion queue, counter, wait set, and poll set
              objects.

       For example, threads must serialize access to an	endpoint and its bound
       completion queue(s) and/or counters.  Access to	endpoints  that	 share
       the same	completion queue must also be serialized.

       The   use   of	FI_THREAD_COMPLETION  can  increase  parallelism  over
       FI_THREAD_SAFE, but requires the	use of isolated	resources.

       FI_THREAD_DOMAIN
	      A	domain serialization model requires applications to  serialize
	      access to	all objects belonging to a domain.

       FI_THREAD_ENDPOINT
	      The  endpoint  threading	model is similar to FI_THREAD_FID, but
	      with the added restriction that serialization is	required  when
	      accessing	 the  same endpoint, even if multiple transmit and re-
	      ceive contexts are used.	Conceptually, FI_THREAD_ENDPOINT  maps
	      well to providers	that implement fabric services in hardware but
	      use a single command queue to access different data flows.

       FI_THREAD_FID
	      A	fabric descriptor (FID)	serialization model requires  applica-
	      tions to serialize access	to individual fabric resources associ-
	      ated with	data transfer operations  and  completions.   Multiple
	      threads  must  be	 serialized  when accessing the	same endpoint,
	      transmit context,	receive	context,  completion  queue,  counter,
	      wait  set,  or  poll  set.   Serialization  is  required only by
	      threads accessing	the same object.

       For example, one	thread may be initiating a data	transfer  on  an  end-
       point,  while  another  thread reads from a completion queue associated
       with the	endpoint.

       Serialization of endpoint access is only required when accessing the
       same endpoint data flow.  Multiple threads may initiate transfers on
       different transmit contexts of the same endpoint without serializing,
       and no serialization is required between the submission of data
       transmit requests and data receive operations.

       In general, FI_THREAD_FID allows	the provider to	be implemented without
       needing	internal  locking when handling	data transfers.	 Conceptually,
       FI_THREAD_FID maps well to providers that implement fabric services  in
       hardware	and provide separate command queues to different data flows.

       FI_THREAD_SAFE
	      A	thread safe serialization model	allows a multi-threaded	appli-
	      cation to	access any allocated resources through	any  interface
	      without  restriction.   All  providers  are  required to support
	      FI_THREAD_SAFE.

       FI_THREAD_UNSPEC
	      This value indicates that	no threading model has	been  defined.
	      It  may  be  used	 on  input hints to the	fi_getinfo call.  When
	      specified, providers will	return a threading model  that	allows
	      for the greatest level of	parallelism.
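
       The differences between the threading models can be summarized as a
       predicate that tells whether two threads touching two objects must be
       serialized by the application.  This is a simplified sketch: the enum,
       the structure, and the idea of describing each object by integer
       handles are assumptions of the example, and each object is assumed to
       be associated with at most one completion structure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-ins for the FI_THREAD_* values. */
enum demo_threading {
    DEMO_THREAD_SAFE,
    DEMO_THREAD_FID,
    DEMO_THREAD_ENDPOINT,
    DEMO_THREAD_COMPLETION,
    DEMO_THREAD_DOMAIN
};

/* Hypothetical description of one fabric object. */
struct demo_res {
    int id;          /* unique per object */
    int endpoint;    /* owning endpoint, or -1 if not endpoint related */
    int completion;  /* CQ or counter it is (or is bound to), or -1 */
    int domain;      /* owning domain */
};

/* True if the application must serialize concurrent access to a and b. */
bool demo_must_serialize(enum demo_threading model,
                         const struct demo_res *a, const struct demo_res *b)
{
    assert(a != NULL && b != NULL);
    bool same_obj = a->id == b->id;

    switch (model) {
    case DEMO_THREAD_SAFE:
        return false;
    case DEMO_THREAD_FID:
        return same_obj;
    case DEMO_THREAD_ENDPOINT:
        return same_obj ||
               (a->endpoint >= 0 && a->endpoint == b->endpoint);
    case DEMO_THREAD_COMPLETION:
        return same_obj ||
               (a->completion >= 0 && a->completion == b->completion);
    case DEMO_THREAD_DOMAIN:
        return a->domain == b->domain;
    }
    return true;     /* unknown model: be conservative */
}
```

       For instance, two transmit contexts of one endpoint need no
       serialization under the FID model, but do under the endpoint model.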

   Progress Models (control_progress / data_progress)
       Progress	 is  the  ability of the underlying implementation to complete
       processing of an	asynchronous request.  In many cases,  the  processing
       of an asynchronous request requires the use of the host processor.  For
       example,	a received message may need to be  matched  with  the  correct
       buffer,	or a timed out request may need	to be retransmitted.  For per-
       formance	reasons, it may	be undesirable for the provider	to allocate  a
       thread  for  this  purpose,  which  will	 compete  with the application
       threads.

       Control progress	indicates the method that the provider	uses  to  make
       progress	 on  asynchronous  control operations.	Control	operations are
       functions which do not directly involve the transfer of application da-
       ta  between  endpoints.	 They include address vector, memory registra-
       tion, and connection management routines.

       Data progress indicates the method  that	 the  provider	uses  to  make
       progress	 on  data  transfer  operations.  This includes	message	queue,
       RMA, tagged messaging, and atomic operations, along with	their  comple-
       tion processing.

       Progress	 frequently  requires action being taken at both the transmit-
       ting and	receiving sides	of an operation.  This is often	a  requirement
       for  reliable  transfers, as a result of	retry and acknowledgement pro-
       cessing.

       To balance between performance and ease of use, two progress models are
       defined.

       FI_PROGRESS_AUTO
	      This  progress  model indicates that the provider	will make for-
	      ward progress on an asynchronous operation without  further  in-
	      tervention by the	application.  When FI_PROGRESS_AUTO is provid-
	      ed as output to fi_getinfo in the	absence	of any progress	hints,
	      it often indicates that the desired functionality	is implemented
	      by the provider hardware or is a standard	service	of the operat-
	      ing system.

       All  providers are required to support FI_PROGRESS_AUTO.	 However, if a
       provider	does not natively support automatic progress, forcing the  use
       of  FI_PROGRESS_AUTO  may  result  in threads being allocated below the
       fabric interfaces.

       FI_PROGRESS_MANUAL
	      This progress model indicates that the provider requires the use
	      of  an  application  thread to complete an asynchronous request.
	      When manual progress is set, the provider	will  attempt  to  ad-
	      vance an asynchronous operation forward when the application at-
	      tempts to	wait on	or read	an event queue,	completion  queue,  or
	      counter	where	the  completed	operation  will	 be  reported.
	      Progress also occurs when	the application	processes  a  poll  or
	      wait  set	 that has been associated with the event or completion
	      queue.

       Only wait operations defined by the fabric interface will result	in  an
       operation  progressing.	 Operating  system or external wait functions,
       such as select, poll, or	pthread	routines, cannot.

       Manual progress requirements not	only apply to endpoints	that  initiate
       transmit	 operations,  but  also	to endpoints that may be the target of
       such operations.	 This holds true even if the target endpoint will  not
       generate	 completion  events  for the operations.  For example, an end-
       point that acts purely as the target of RMA or atomic  operations  that
       uses  manual  progress may still	need application assistance to process
       received	operations.

       FI_PROGRESS_UNSPEC
	      This value indicates that	no progress model  has	been  defined.
	      It may be	used on	input hints to the fi_getinfo call.
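
       The effect of manual progress can be shown with a toy model in which an
       operation only advances while the application is reading the completion
       queue.  Everything below is hypothetical and unrelated to the real
       libfabric structures; the number of provider steps per operation is an
       invented parameter, and -EAGAIN stands in for -FI_EAGAIN.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical operation that needs provider work before it completes. */
struct demo_op {
    int steps_left;   /* provider work still outstanding */
    int reported;     /* completion already handed to the application */
};

/* Toy CQ read: the provider borrows the caller's thread to make progress. */
int demo_cq_read(struct demo_op *op)
{
    assert(op != NULL);
    if (op->steps_left > 0)
        op->steps_left--;
    if (op->steps_left > 0 || op->reported)
        return -EAGAIN;           /* nothing to report yet */
    op->reported = 1;
    return 1;                     /* one completion read */
}

/* Poll until the completion arrives; returns the number of reads used. */
int demo_poll_until_done(struct demo_op *op, int max_polls)
{
    for (int polls = 1; polls <= max_polls; polls++) {
        if (demo_cq_read(op) == 1)
            return polls;
    }
    return -ETIMEDOUT;
}
```

       An application that instead blocks in select or poll never calls into
       the read function, so in this model steps_left would never change.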

   Resource Management (resource_mgmt)
       Resource	 management  (RM)  is provider and protocol support to protect
       against overrunning local and remote resources.	 This  includes	 local
       and  remote transmit contexts, receive contexts,	completion queues, and
       source and target data buffers.

       When enabled, applications are given some level of  protection  against
       overrunning  provider  queues  and local	and remote data	buffers.  Such
       support may be built directly into the hardware and/or  network	proto-
       col,  but may also require that checks be enabled in the	provider soft-
       ware.  By disabling resource management,	an application assumes all re-
       sponsibility for	preventing queue and buffer overruns, but doing	so may
       allow a provider	to eliminate internal synchronization calls,  such  as
       atomic variables	or locks.

       It  should  be  noted that even if resource management is disabled, the
       provider	implementation and protocol may	still provide  some  level  of
       protection  against  overruns.  However,	such protection	is not guaran-
       teed.  The following values for resource	management are defined.

       FI_RM_DISABLED
	      The provider is free to select an	 implementation	 and  protocol
	      that  does  not protect against resource overruns.  The applica-
	      tion is responsible for resource protection.

       FI_RM_ENABLED
	      Resource management is enabled for this provider domain.

       FI_RM_UNSPEC
	      This value indicates that	no resource management model has  been
	      defined.	It may be used on input	hints to the fi_getinfo	call.

       The behavior of the various resource management options depends on
       whether the endpoint is reliable or unreliable, as well as provider and
       protocol specific implementation details, as shown in the following
       table.  The table assumes that all peers enable or disable RM in the
       same way.

       Resource        DGRAM EP-no RM    DGRAM EP-with RM  RDM/MSG EP-no RM   RDM/MSG EP-with RM
       -----------------------------------------------------------------------------------------
       Tx Ctx          undefined error   EAGAIN            undefined error    EAGAIN
       Rx Ctx          undefined error   EAGAIN            undefined error    EAGAIN
       Tx CQ           undefined error   EAGAIN            undefined error    EAGAIN
       Rx CQ           undefined error   EAGAIN            undefined error    EAGAIN
       Target EP       dropped           dropped           transmit error     retried
       No Rx Buffer    dropped           dropped           transmit error     retried
       Rx Buf Overrun  truncate or drop  truncate or drop  truncate or error  truncate or error
       Unmatched RMA   not applicable    not applicable    transmit error     transmit error
       RMA Overrun     not applicable    not applicable    transmit error     transmit error

       The resource column indicates the resource being	 accessed  by  a  data
       transfer	operation.

       Tx Ctx /	Rx Ctx
	      Refers to	the transmit/receive contexts when a data transfer op-
	      eration is submitted.  When RM is	enabled, attempting to	submit
	      a	 request will fail if the context is full.  If RM is disabled,
	      an undefined error (provider specific) will occur.  Such	errors
	      should be	considered fatal to the	context, and applications must
	      take steps to avoid queue	overruns.

       Tx CQ / Rx CQ
	      Refers to	the completion queue associated	with the Tx or Rx con-
	      text when	a local	operation completes.  When RM is disabled, ap-
	      plications must take care	to ensure that	completion  queues  do
	      not  get overrun.	 When an overrun occurs, an undefined, but fa-
	      tal, error will occur affecting all  endpoints  associated  with
	      the CQ.  Overruns	can be avoided by sizing the CQs appropriately
	      or by deferring the posting of a data transfer operation	unless
	      CQ  space	 is available to store its completion.	When RM	is en-
	      abled, providers may use	different  mechanisms  to  prevent  CQ
	      overruns.	  This	includes  failing  (returning  -FI_EAGAIN) the
	      posting of operations that could result in CQ overruns,  or  in-
	      ternally retrying	requests (which	will be	hidden from the	appli-
	      cation).	See notes at the end of	this section regarding CQ  re-
	      source management	restrictions.

       Target EP / No Rx Buffer
	      Target  EP refers	to resources associated	with the endpoint that
	      is the target of a transmit operation.  This includes the	target
	      endpoint's  receive  queue,  posted  receive buffers (no Rx buf-
	      fers), the receive side  completion  queue,  and	other  related
	      packet  processing queues.  The defined behavior is that seen by
	      the initiator of a request.  For FI_EP_DGRAM endpoints,  if  the
	      target  EP  queues  are  unable to accept	incoming messages, re-
	      ceived messages will be dropped.	For reliable endpoints,	if  RM
	      is  disabled,  the transmit operation will complete in error.  A
	      provider may choose to return an error completion	with the error
	      code  FI_ENORX for that transmit operation so that it can	be re-
	      tried.  If RM is enabled,	the provider will internally retry the
	      operation.

       Rx Buffer Overrun
	      This  refers to buffers posted to	receive	incoming tagged	or un-
	      tagged messages, with the	behavior defined from the viewpoint of
	      the  sender.   The  behavior for handling	received messages that
	      are larger than the  buffers  provided  by  the  application  is
	      provider	specific.   Providers  may either truncate the message
	      and report a successful completion, or fail the operation.   For
	      datagram	endpoints, failed sends	will result in the message be-
	      ing dropped.  For	reliable endpoints, send operations  may  com-
	      plete  successfully, yet be truncated at the receive side.  This
	      can occur	when the target	side buffers received  data  until  an
              application buffer is made available.  The completion status may
              also be dependent upon the completion model selected by the
              application (e.g. FI_DELIVERY_COMPLETE versus
              FI_TRANSMIT_COMPLETE).

       Unmatched RMA / RMA Overrun
	      Unmatched	RMA and	RMA overruns deal with the processing  of  RMA
	      and  atomic  operations.	Unlike send operations,	RMA operations
	      that attempt to access a memory address that is either not  reg-
	      istered for such operations, or attempt to access	outside	of the
	      target memory region will	fail, resulting	in a transmit error.
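
       When RM is enabled, a full transmit context is reported as a retryable
       failure rather than an undefined error.  The following sketch models
       that behavior with a bounded queue and shows the usual response, which
       is to reap a completion and retry.  The queue type and functions are
       invented for the example, and -EAGAIN stands in for -FI_EAGAIN.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical transmit context with a fixed number of slots. */
struct demo_txq {
    int depth;   /* total slots */
    int used;    /* slots holding operations that have not been reaped */
};

/* Post one operation; fails with -EAGAIN when the context is full. */
int demo_send(struct demo_txq *q)
{
    assert(q != NULL);
    if (q->used == q->depth)
        return -EAGAIN;
    q->used++;
    return 0;
}

/* Reap one completion, freeing its slot. */
int demo_reap(struct demo_txq *q)
{
    assert(q != NULL);
    if (q->used == 0)
        return -EAGAIN;
    q->used--;
    return 1;
}

/* Post count operations, reaping whenever the context is full.
 * Returns how many times a post had to be retried. */
int demo_send_all(struct demo_txq *q, int count)
{
    int retries = 0;

    for (int sent = 0; sent < count; ) {
        int ret = demo_send(q);

        if (ret == 0) {
            sent++;
            continue;
        }
        if (ret != -EAGAIN)
            return ret;        /* anything else is not retryable */
        retries++;
        demo_reap(q);          /* make room, then try again */
    }
    return retries;
}
```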

       When a resource management error	occurs on an endpoint, the endpoint is
       transitioned  into a disabled state.  Any operations which have not al-
       ready completed will fail and be	discarded.   For  connectionless  end-
       points,	the endpoint must be re-enabled	before it will accept new data
       transfer	operations.  For connected endpoints, the connection  is  torn
       down and	must be	re-established.

       There is one notable restriction on the protections offered by resource
       management.  This occurs when resource management is enabled on an
       endpoint that has been bound to completion queue(s) using the
       FI_SELECTIVE_COMPLETION flag.  Operations posted to such an endpoint
       may specify that a successful completion should not generate an entry
       on the corresponding completion queue (i.e. the operation leaves the
       FI_COMPLETION flag unset).  In such situations, the provider is not
       required
       to reserve an entry in the completion queue to handle  the  case	 where
       the  operation  fails  and does generate	a CQ entry, which would	effec-
       tively require tracking the operation to	completion.  Applications con-
       cerned  with  avoiding CQ overruns in the occurrence of errors must en-
       sure that there is sufficient space in the CQ to	report	failed	opera-
       tions.  This can	typically be achieved by sizing	the CQ to at least the
       same size as the	endpoint queue(s) that are bound to it.
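
       The sizing rule above amounts to simple arithmetic.  The sketch below
       reads it as summing the transmit and receive queue sizes of every
       endpoint bound to the CQ, which is one conservative interpretation;
       the structure and function names are assumptions of the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-endpoint queue sizes for endpoints bound to one CQ. */
struct demo_ep_sizes {
    size_t tx_size;
    size_t rx_size;
};

/* Smallest CQ size that can hold an error entry for every operation
 * that may be outstanding on the bound endpoints. */
size_t demo_min_cq_size(const struct demo_ep_sizes *eps, size_t ep_count)
{
    size_t total = 0;

    assert(eps != NULL || ep_count == 0);
    for (size_t i = 0; i < ep_count; i++)
        total += eps[i].tx_size + eps[i].rx_size;
    return total;
}
```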

   AV Type (av_type)
       Specifies the type of address vectors that are usable with this domain.
       For  additional details on AV type, see fi_av(3).  The following	values
       may be specified.

       FI_AV_MAP
	      Only address vectors of type AV map are requested	or supported.

       FI_AV_TABLE
	      Only address vectors of type AV index are	requested or  support-
	      ed.

       FI_AV_UNSPEC
	      Any address vector format	is requested and supported.

       Address	vectors	 are  only used	by connectionless endpoints.  Applica-
       tions that require the use of a specific	type of	address	vector	should
       set  the	 domain	 attribute av_type to the necessary value when calling
       fi_getinfo.  The	value FI_AV_UNSPEC may be used to  indicate  that  the
       provider	 can  support  either  address vector format.  In this case, a
       provider	may return FI_AV_UNSPEC	to indicate that either	format is sup-
       portable, or may	return another AV type to indicate the optimal AV type
       supported by this domain.

   Memory Registration Mode (mr_mode)
       Defines memory registration specific mode bits used with	 this  domain.
       Full details on MR mode options are available in	fi_mr(3).  The follow-
       ing values may be specified.

       FI_MR_ALLOCATED
	      Indicates	that memory registration occurs	on allocated data buf-
	      fers,  and  physical pages must back all virtual addresses being
	      registered.

       FI_MR_COLLECTIVE
	      Requires data buffers passed to collective operations be explic-
	      itly  registered	for collective operations using	the FI_COLLEC-
	      TIVE flag.

       FI_MR_ENDPOINT
	      Memory registration occurs at the	endpoint  level,  rather  than
	      domain.

       FI_MR_LOCAL
	      The  provider  is	 optimized around having applications register
	      memory for locally accessed data buffers.	 Data buffers used  in
	      send and receive operations and as the source buffer for RMA and
	      atomic operations	must be	registered by the application for  ac-
	      cess domains opened with this capability.

       FI_MR_MMU_NOTIFY
	      Indicates	 that the application is responsible for notifying the
	      provider when the	page tables referencing	 a  registered	memory
	      region may have been updated.

       FI_MR_PROV_KEY
	      Memory  registration  keys  are  selected	 and  returned	by the
	      provider.

       FI_MR_RAW
	      The provider requires additional setup as	part of	 their	memory
	      registration  process.   This mode is required by	providers that
	      use a memory key that is larger than 64-bits.

       FI_MR_RMA_EVENT
	      Indicates	that the memory	 regions  associated  with  completion
	      counters	must  be  explicitly  enabled after being bound	to any
	      counter.

       FI_MR_UNSPEC
	      Defined for compatibility	- library versions  1.4	 and  earlier.
	      Setting  mr_mode	to 0 indicates that FI_MR_BASIC	or FI_MR_SCAL-
	      ABLE are requested and supported.

       FI_MR_VIRT_ADDR
	      Registered memory	regions	are referenced by peers	using the vir-
	      tual  address  of	 the  registered  memory region, rather	than a
	      0-based offset.

       FI_MR_BASIC
	      Defined for compatibility	- library versions  1.4	 and  earlier.
	      Only  basic memory registration operations are requested or sup-
	      ported.	This  mode  is	equivalent  to	the   FI_MR_VIRT_ADDR,
	      FI_MR_ALLOCATED, and FI_MR_PROV_KEY flags	being set in later li-
	      brary versions.  This flag may not be used in  conjunction  with
	      other mr_mode bits.

       FI_MR_SCALABLE
	      Defined  for  compatibility  - library versions 1.4 and earlier.
	      Only scalable memory registration	operations  are	 requested  or
	      supported.   Scalable registration uses offset based addressing,
	      with application selectable memory keys.	For  library  versions
	      1.5  and	later, this is the default if no mr_mode bits are set.
	      This flag	may not	be used	 in  conjunction  with	other  mr_mode
	      bits.

       Buffers	used  in  data	transfer  operations may require notifying the
       provider	of their use before a data transfer can	 occur.	  The  mr_mode
       field  indicates	 the type of memory registration that is required, and
       when registration is necessary.	Applications that require the use of a
       specific	 registration  mode should set the domain attribute mr_mode to
       the necessary value when	calling	fi_getinfo.   The  value  FI_MR_UNSPEC
       may be used to indicate support for any registration mode.
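
       The relationship between the compatibility values and the individual
       mode bits can be written down as two small helpers.  This is a sketch
       only: the DEMO_ constants use made-up bit positions rather than the
       values from the libfabric headers, and treating FI_MR_SCALABLE as
       equivalent to no bits set follows the description of the default
       given above.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit positions; real values come from the libfabric headers. */
#define DEMO_MR_BASIC      (1 << 0)
#define DEMO_MR_SCALABLE   (1 << 1)
#define DEMO_MR_LOCAL      (1 << 2)
#define DEMO_MR_VIRT_ADDR  (1 << 4)
#define DEMO_MR_ALLOCATED  (1 << 5)
#define DEMO_MR_PROV_KEY   (1 << 6)

/* The compatibility values may not be combined with any other bit. */
bool demo_mr_mode_valid(int mr_mode)
{
    int legacy = mr_mode & (DEMO_MR_BASIC | DEMO_MR_SCALABLE);

    if (legacy == 0)
        return true;
    return mr_mode == DEMO_MR_BASIC || mr_mode == DEMO_MR_SCALABLE;
}

/* Rewrite a compatibility value as the equivalent set of mode bits. */
int demo_mr_mode_expand(int mr_mode)
{
    assert(demo_mr_mode_valid(mr_mode));
    if (mr_mode == DEMO_MR_BASIC)
        return DEMO_MR_VIRT_ADDR | DEMO_MR_ALLOCATED | DEMO_MR_PROV_KEY;
    if (mr_mode == DEMO_MR_SCALABLE)
        return 0;              /* scalable is the default with no bits set */
    return mr_mode;
}
```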

   MR Key Size (mr_key_size)
       Size  of	 the  memory region remote access key, in bytes.  Applications
       that request their own MR key must select  a  value  within  the	 range
       specified by this value.  Key sizes larger than 8 bytes require using
       the FI_MR_RAW mr_mode bit.
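
       An application that selects its own keys can check a candidate against
       the reported key size before registering memory.  The helper below is
       a hypothetical sketch for keys that fit in 64 bits; it is not a
       libfabric function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if key can be represented in mr_key_size bytes. */
bool demo_key_fits(uint64_t key, size_t mr_key_size)
{
    if (mr_key_size >= sizeof(key))
        return true;                       /* any 64-bit key fits */
    return (key >> (8 * mr_key_size)) == 0;
}
```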

   CQ Data Size	(cq_data_size)
       Applications may	include	a small	message	with a data transfer  that  is
       placed  directly	into a remote completion queue as part of a completion
       event.  This is referred	to as remote CQ	data (sometimes	referred to as
       immediate  data).   This	 field	indicates the number of	bytes that the
       provider supports for remote CQ data.  If supported (a non-zero value
       is returned), the size of remote CQ data must be at least 4 bytes.

   Completion Queue Count (cq_cnt)
       The  optimal number of completion queues	supported by the domain, rela-
       tive to any specified or	default	CQ attributes.	The cq_cnt  value  may
       be a fixed value	of the maximum number of CQs supported by the underly-
       ing hardware, or	may be a dynamic  value,  based	 on  the  default  at-
       tributes	of an allocated	CQ, such as the	CQ size	and data format.

   Endpoint Count (ep_cnt)
       The  total number of endpoints supported	by the domain, relative	to any
       specified or default endpoint attributes.  The ep_cnt value  may	 be  a
       fixed  value of the maximum number of endpoints supported by the	under-
       lying hardware, or may be a dynamic value, based	 on  the  default  at-
       tributes	 of  an	 allocated endpoint, such as the endpoint capabilities
       and size.  The endpoint count is	the number  of	addressable  endpoints
       supported by the	provider.  Providers return capability limits based on
       configured hardware maximum capabilities.  Providers cannot predict all
       possible system limitations without a posteriori knowledge acquired
       during runtime that will further limit these hardware maximums (e.g.
       application memory consumption, FD usage, etc.).

   Transmit Context Count (tx_ctx_cnt)
       The  number  of	outbound  command  queues  optimally  supported	by the
       provider.  For a	low-level provider, this represents the	number of com-
       mand  queues to the hardware and/or the number of parallel transmit en-
       gines effectively supported by the hardware and	caches.	  Applications
       which allocate more transmit contexts than this value will end up shar-
       ing underlying resources.  By default, there is a single	transmit  con-
       text  associated	with each endpoint, but	in an advanced usage model, an
       endpoint	may be configured with multiple	transmit contexts.

   Receive Context Count (rx_ctx_cnt)
       The number of inbound processing	 queues	 optimally  supported  by  the
       provider.  For a low-level provider, this represents the number of
       hardware queues that can be effectively utilized for processing incoming
       packets.	  Applications	which allocate more receive contexts than this
       value will end up sharing underlying resources.	By default,  a	single
       receive	context	 is  associated	with each endpoint, but	in an advanced
       usage model, an endpoint	may be configured with multiple	 receive  con-
       texts.

   Maximum Endpoint Transmit Context (max_ep_tx_ctx)
       The  maximum number of transmit contexts	that may be associated with an
       endpoint.
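
       One way an application might combine tx_ctx_cnt and max_ep_tx_ctx when
       deciding how many transmit contexts to give an endpoint is sketched
       below.  The policy of one context per thread, and the function itself,
       are assumptions of the example.  Note that max_ep_tx_ctx is a hard
       limit, whereas exceeding tx_ctx_cnt only leads to shared underlying
       resources.

```c
#include <assert.h>
#include <stddef.h>

static size_t demo_min(size_t a, size_t b)
{
    return a < b ? a : b;
}

/* Aim for one transmit context per thread, without exceeding either the
 * optimal count for the domain or the per-endpoint maximum. */
size_t demo_pick_tx_ctx(size_t threads, size_t tx_ctx_cnt,
                        size_t max_ep_tx_ctx)
{
    size_t n = demo_min(threads, demo_min(tx_ctx_cnt, max_ep_tx_ctx));

    return n ? n : 1;   /* an endpoint has a single context by default */
}
```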

   Maximum Endpoint Receive Context (max_ep_rx_ctx)
       The maximum number of receive contexts that may be associated  with  an
       endpoint.

   Maximum Sharing of Transmit Context (max_ep_stx_ctx)
       The  maximum  number  of	endpoints that may be associated with a	shared
       transmit	context.

   Maximum Sharing of Receive Context (max_ep_srx_ctx)
       The maximum number of endpoints that may	be associated  with  a	shared
       receive context.

   Counter Count (cntr_cnt)
       The optimal number of completion	counters supported by the domain.  The
       cntr_cnt value may be a fixed value of the maximum number of counters
       supported  by the underlying hardware, or may be	a dynamic value, based
       on the default attributes of the	domain.

   MR IOV Limit	(mr_iov_limit)
       This is the maximum number of IO	vectors	(scatter-gather	elements) that
       a single	memory registration operation may reference.

   Capabilities	(caps)
       Domain  level  capabilities.  Domain capabilities indicate domain level
       features	that are supported by the provider.

       FI_LOCAL_COMM
	      At a conceptual level, this field	indicates that the  underlying
	      device supports loopback communication.  More specifically, this
	      field indicates that an endpoint may communicate with other end-
	      points that are allocated	from the same underlying named domain.
	      If this field is not set,	an application may need	to use an  al-
	      ternate  domain or mechanism (e.g. shared	memory)	to communicate
	      with peers that execute on the same node.

       FI_REMOTE_COMM
	      This field indicates that	the underlying provider	supports  com-
	      munication  with	nodes that are reachable over the network.  If
	      this field is not	set, then the provider only supports  communi-
	      cation  between  processes  that	execute	 on  the same node - a
	      shared memory provider, for example.

       FI_SHARED_AV
	      Indicates	that the domain	supports the ability to	share  address
	      vectors  among multiple processes	using the named	address	vector
	      feature.

       See fi_getinfo(3) for a discussion on primary versus secondary capabil-
       ities.  All domain capabilities are considered secondary	capabilities.
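
       As an illustration of how an application might act on these bits,
       the sketch below tests a capability mask for local and remote
       reachability.  The DEMO_ bit values and the struct are placeholders
       chosen so that the example compiles without libfabric; real code
       would test FI_LOCAL_COMM and FI_REMOTE_COMM from <rdma/fabric.h>
       against the caps field of the domain attributes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder bit values, NOT the real libfabric definitions. */
#define DEMO_LOCAL_COMM  (UINT64_C(1) << 0)
#define DEMO_REMOTE_COMM (UINT64_C(1) << 1)

/* Hypothetical stand-in for the caps field of struct fi_domain_attr. */
struct demo_domain_attr {
    uint64_t caps;
};

/* True when peers on the same node cannot be reached through this
 * domain, so an alternate domain or mechanism (e.g. shared memory)
 * may be needed. */
bool demo_needs_local_fallback(const struct demo_domain_attr *attr)
{
    assert(attr != NULL);
    return (attr->caps & DEMO_LOCAL_COMM) == 0;
}

/* True when the domain only reaches processes on the same node,
 * as a shared memory provider would. */
bool demo_is_node_local_only(const struct demo_domain_attr *attr)
{
    assert(attr != NULL);
    return (attr->caps & DEMO_REMOTE_COMM) == 0;
}
```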

   mode
       The operational mode bits related to using the domain.

       FI_RESTRICTED_COMP
	      This  bit	indicates that the domain limits completion queues and
	      counters to only be used with endpoints, transmit	contexts,  and
	      receive contexts that have the same set of capability flags.

   Default authorization key (auth_key)
       The default authorization key to associate with endpoints and memory
       registrations created within the domain.  This field is ignored unless
       the fabric is opened with API version 1.5 or greater.

   Default authorization key length (auth_key_size)
       The  length  in	bytes of the default authorization key for the domain.
       If set to 0, then no authorization key will  be	associated  with  end-
       points and memory registrations created within the domain unless	speci-
       fied in the endpoint or memory registration attributes.	This field  is
       ignored unless the fabric is opened with	API version 1.5	or greater.
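
       The relationship between the domain default and a key given in the
       endpoint or memory registration attributes can be sketched as
       follows.  The types and function are hypothetical, and the sketch
       assumes that a key specified on the endpoint or memory registration
       takes precedence over the domain default.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pairing of a key pointer and its length in bytes,
 * mirroring the auth_key / auth_key_size fields described above. */
struct demo_key {
    const uint8_t *data;
    size_t size;        /* 0 means "no key" */
};

/* Key in effect for a new endpoint or memory registration: the key
 * from its own attributes when one is given, otherwise the domain
 * default.  A result with size 0 means no key is associated. */
struct demo_key demo_effective_key(struct demo_key domain_default,
                                   struct demo_key specific)
{
    /* A non-empty key must point at actual key bytes. */
    assert(specific.size == 0 || specific.data != NULL);
    assert(domain_default.size == 0 || domain_default.data != NULL);

    if (specific.size > 0)
        return specific;
    return domain_default;
}
```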

   Max Error Data Size (max_err_data)
       The maximum amount of error data, in bytes, that may be returned as
       part of a completion or event queue error.  This	value  corresponds  to
       the   err_data_size   field   in	  struct  fi_cq_err_entry  and	struct
       fi_eq_err_entry.

   Memory Regions Count	(mr_cnt)
       The optimal number of memory regions supported by the domain,  or  end-
       point if	the mr_mode FI_MR_ENDPOINT bit has been	set.  The mr_cnt value
       may be a	fixed value of the maximum number of MRs supported by the  un-
       derlying	 hardware, or may be a dynamic value, based on the default at-
       tributes	of the domain,	such  as  the  supported  memory  registration
       modes.	Applications can set the mr_cnt	on input to fi_getinfo,	in or-
       der to indicate their memory registration requirements.	Doing  so  may
       allow  the provider to optimize any memory registration cache or	lookup
       tables.

   Traffic Class (tclass)
       This specifies the default traffic class that will be associated with
       any endpoints created within the domain.  See fi_endpoint(3) for
       additional information.

RETURN VALUE
       Returns 0 on success.  On error,	a negative value corresponding to fab-
       ric  errno is returned.	Fabric errno values are	defined	in rdma/fi_er-
       rno.h.
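
       A minimal sketch of the calling convention: a negative return value
       is negated to obtain the positive fabric errno code.  The helper
       names are hypothetical.  The sketch uses strerror(3) as a stand-in
       so that it compiles without libfabric, which assumes the code
       coincides with a system errno value; real code would typically
       pass the code to fi_strerror from <rdma/fi_errno.h> instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Positive error code for a fabric-style return value; 0 on success. */
int demo_error_code(int ret)
{
    return ret < 0 ? -ret : 0;
}

/* Report a failed call on stderr.  Returns 0 when ret indicates
 * success and -1 when ret is a negative error value. */
int demo_check(const char *what, int ret)
{
    assert(what != NULL);
    if (ret >= 0)
        return 0;

    /* strerror() is a stand-in for fi_strerror() in this sketch. */
    fprintf(stderr, "%s failed: %s (%d)\n",
            what, strerror(demo_error_code(ret)), demo_error_code(ret));
    return -1;
}
```

       A caller would wrap each call, for example
       demo_check("fi_domain", ret), and stop setup on a non-zero result.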

NOTES
       Users should call fi_close to release all resources  allocated  to  the
       fabric domain.

       The following fabric resources are associated with domains: active end-
       points, memory regions, completion event	queues,	and address vectors.

       Domain attributes reflect the limitations and capabilities of  the  un-
       derlying	hardware and/or	software provider.  They do not	reflect	system
       limitations, such as the	number of physical pages that  an  application
       may  pin	 or  number of file descriptors	that the application may open.
       As a result, the reported maximums may not be achievable, even on
       lightly loaded systems, without an administrator configuring system
       resources appropriately for the installed provider(s).

SEE ALSO
       fi_getinfo(3), fi_endpoint(3), fi_av(3),	fi_ep(3), fi_eq(3), fi_mr(3)

AUTHORS
       OpenFabrics.

Libfabric Programmer's Manual	  2021-10-07			  fi_domain(3)
