NETMAP(4)              FreeBSD Kernel Interfaces Manual              NETMAP(4)

     netmap - a framework for fast packet I/O
     VALE - a fast VirtuAl Local Ethernet using the netmap API
     netmap pipes - a shared memory packet transport channel

     device netmap

     netmap is a framework for extremely fast and efficient packet I/O for
     both userspace and kernel clients.  It runs on FreeBSD and Linux, and
     includes VALE, a very fast and modular in-kernel software
     switch/dataplane, and netmap pipes, a shared memory packet transport
     channel.  All these are accessed interchangeably with the same API.

     netmap, VALE and netmap pipes are at least one order of magnitude faster
     than standard OS mechanisms (sockets, bpf, tun/tap interfaces, native
     switches, pipes), reaching 14.88 million packets per second (Mpps) with
     much less than one core on a 10 Gbit NIC, about 20 Mpps per core for VALE
     ports, and over 100 Mpps for netmap pipes.
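
     The 14.88 Mpps figure corresponds to the line rate of a 10 Gbit link
     carrying minimum-size Ethernet frames.  The arithmetic can be sketched
     as follows, assuming standard Ethernet framing (64-byte minimum frame
     including CRC, plus 8 bytes of preamble and 12 bytes of inter-frame
     gap).  The name line_rate_pps is hypothetical and not part of netmap.

```c
#include <stdint.h>

/* Per-frame overhead on the wire: 8-byte preamble/SFD + 12-byte inter-frame gap. */
#define ETH_WIRE_OVERHEAD 20

/*
 * Packets per second at line rate for frames of frame_len bytes
 * (frame_len includes the 4-byte CRC).  Hypothetical helper, for
 * illustration only.
 */
static uint64_t
line_rate_pps(uint64_t link_bps, uint32_t frame_len)
{
    uint64_t bits_per_frame = (uint64_t)(frame_len + ETH_WIRE_OVERHEAD) * 8;

    return link_bps / bits_per_frame;
}
```

     With a 10 Gbit/s link and 64-byte frames this yields 14880952 packets
     per second, i.e. about 14.88 Mpps.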

     Userspace clients can dynamically switch NICs into netmap mode and send
     and receive raw packets through memory mapped buffers.  Similarly, VALE
     switch instances and ports, and netmap pipes can be created dynamically,
     providing high speed packet I/O between processes, virtual machines, NICs
     and the host stack.

     netmap supports both non-blocking I/O through ioctl(2), and
     synchronization and blocking I/O through a file descriptor and standard
     OS mechanisms such as select(2), poll(2), epoll(2) and kqueue(2).  VALE
     and netmap pipes are implemented by a single kernel module, which also
     emulates the netmap API over standard drivers for devices without native
     netmap support.  For best performance, netmap requires explicit support
     in device drivers.

     In the rest of this (long) manual page we document various aspects of the
     netmap and VALE architecture, features and usage.

     netmap supports raw packet I/O through a port, which can be connected to
     a physical interface (NIC), to the host stack, or to a VALE switch.
     Ports use preallocated circular queues of buffers (rings) residing in an
     mmapped region.  There is one ring for each transmit/receive queue of a
     NIC or virtual port.  An additional ring pair connects to the host stack.

     After binding a file descriptor to a port, a netmap client can send or
     receive packets in batches through the rings, and possibly implement
     zero-copy forwarding between ports.

     All NICs operating in netmap mode use the same memory region, accessible
     to all processes that own /dev/netmap file descriptors bound to NICs.
     Independent VALE and netmap pipe ports by default use separate memory
     regions, but can be independently configured to share memory.

     The following section describes the system calls to create and control
     netmap ports (including VALE and netmap pipe ports).  Simpler, higher
     level functions are described in section LIBRARIES.

     Ports and rings are created and controlled through a file descriptor,
     created by opening a special device
           fd = open("/dev/netmap", O_RDWR);
     and then bound to a specific port with an
           ioctl(fd, NIOCREGIF, (struct nmreq *)arg);

     netmap has multiple modes of operation controlled by the struct nmreq
     argument.  arg.nr_name specifies the port name, as follows:

     OS network interface name (e.g. 'em0', 'eth1', ...)
           the data path of the NIC is disconnected from the host stack, and
           the file descriptor is bound to the NIC (one or all queues), or to
           the host stack;

     valeXXX:YYY (arbitrary XXX and YYY)
           the file descriptor is bound to port YYY of a VALE switch called
           XXX, both dynamically created if necessary.  The string cannot
           exceed IFNAMSIZ characters, and YYY cannot be the name of any
           existing OS network interface.

     On return, arg indicates the size of the shared memory region, and the
     number, size and location of all the netmap data structures, which can be
     accessed by mmapping the memory
           char *mem = mmap(0, arg.nr_memsize, PROT_READ | PROT_WRITE,
               MAP_SHARED, fd, 0);

     Non-blocking I/O is done with special ioctl(2) calls; select(2) and
     poll(2) on the file descriptor permit blocking I/O.  epoll(2) and
     kqueue(2) are not supported on netmap file descriptors.

     While a NIC is in netmap mode, the OS will still believe the interface is
     up and running.  OS-generated packets for that NIC end up in a netmap
     ring, and another ring is used to send packets into the OS network stack.
     A close(2) on the file descriptor removes the binding, and returns the
     NIC to normal mode (reconnecting the data path to the host stack), or
     destroys the virtual port.

     The data structures in the mmapped memory region are detailed in
     sys/net/netmap.h, which is the ultimate reference for the netmap API. The
     main structures and fields are indicated below:

     struct netmap_if (one per interface)

          struct netmap_if {
              const uint32_t   ni_flags;      /* properties              */
              const uint32_t   ni_tx_rings;   /* NIC tx rings            */
              const uint32_t   ni_rx_rings;   /* NIC rx rings            */
              uint32_t         ni_bufs_head;  /* head of extra bufs list */
          };

          Indicates the number of available rings (struct netmap_rings) and
          their position in the mmapped region.  The number of tx and rx rings
          (ni_tx_rings, ni_rx_rings) normally depends on the hardware.  NICs
          also have an extra tx/rx ring pair connected to the host stack.
          NIOCREGIF can also request additional unbound buffers in the same
          memory space, to be used as temporary storage for packets.
          ni_bufs_head contains the index of the first of these free buffers,
          which are connected in a list (the first uint32_t of each buffer
          being the index of the next buffer in the list).  A 0 indicates the
          end of the list.
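
          The list of extra buffers can be walked as in the sketch below.
          It uses a simplified, hypothetical model of the buffer area
          (struct buf_pool, where buffer idx starts at base + idx *
          buf_size); real code would obtain each buffer address with
          NETMAP_BUF() instead.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for the buffer area of the mmapped region:
 * buffer number idx starts at base + idx * buf_size.
 */
struct buf_pool {
    char     *base;
    uint32_t  buf_size;
};

/* Read the "next" index stored in the first uint32_t of buffer idx. */
static uint32_t
extra_buf_next(const struct buf_pool *p, uint32_t idx)
{
    uint32_t next;

    memcpy(&next, p->base + (size_t)idx * p->buf_size, sizeof(next));
    return next;
}

/* Count the buffers in the list starting at head; index 0 ends the list. */
static uint32_t
extra_buf_count(const struct buf_pool *p, uint32_t head)
{
    uint32_t n = 0;

    for (uint32_t idx = head; idx != 0; idx = extra_buf_next(p, idx))
        n++;
    return n;
}
```

          Calling extra_buf_count() with the value of ni_bufs_head would
          report how many extra buffers were actually granted.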

     struct netmap_ring (one per ring)

          struct netmap_ring {
              const uint32_t num_slots;   /* slots in each ring            */
              const uint32_t nr_buf_size; /* size of each buffer           */
              uint32_t       head;        /* (u) first buf owned by user   */
              uint32_t       cur;         /* (u) wakeup position           */
              const uint32_t tail;        /* (k) first buf owned by kernel */
              uint32_t       flags;
              struct timeval ts;          /* (k) time of last rxsync()     */
              struct netmap_slot slot[0]; /* array of slots                */
          };

          Implements transmit and receive rings, with read/write pointers,
          metadata and an array of slots describing the buffers.

     struct netmap_slot (one per buffer)

          struct netmap_slot {
              uint32_t buf_idx;           /* buffer index                 */
              uint16_t len;               /* packet length                */
              uint16_t flags;             /* buf changed, etc.            */
              uint64_t ptr;               /* address for indirect buffers */
          };

          Describes a packet buffer, which normally is identified by an index
          and resides in the mmapped region.

     packet buffers
          Fixed size (normally 2 KB) packet buffers allocated by the kernel.

     The offset of the struct netmap_if in the mmapped region is indicated by
     the nr_offset field in the structure returned by NIOCREGIF.  From there,
     all other objects are reachable through relative references (offsets or
     indexes).  Macros and functions in <net/netmap_user.h> help convert
     them into actual pointers:

           struct netmap_if *nifp = NETMAP_IF(mem, arg.nr_offset);
           struct netmap_ring *txr = NETMAP_TXRING(nifp, ring_index);
           struct netmap_ring *rxr = NETMAP_RXRING(nifp, ring_index);

           char *buf = NETMAP_BUF(ring, buffer_index);
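
     Because all references are relative, the same region can be mapped at
     different addresses in different processes.  The sketch below shows how
     a buffer index might be turned into a pointer, in the spirit of
     NETMAP_BUF().  struct buf_ring and its buf_ofs field are a simplified
     model assumed here for illustration; <net/netmap_user.h> remains the
     reference for the real macro.

```c
#include <stdint.h>

/* Simplified stand-in for struct netmap_ring; only the fields used here. */
struct buf_ring {
    int64_t  buf_ofs;      /* offset from the ring to buffer 0 (assumed field) */
    uint32_t nr_buf_size;  /* size of each buffer */
};

/* Address of buffer idx, computed relative to the ring itself. */
static char *
model_buf(struct buf_ring *ring, uint32_t idx)
{
    return (char *)ring + ring->buf_ofs + (int64_t)idx * ring->nr_buf_size;
}
```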

     Rings are circular queues of packets with three indexes/pointers (head,
     cur, tail); one slot is always kept empty.  The ring size (num_slots)
     should not be assumed to be a power of two.
     (NOTE: older versions of netmap used head/count format to indicate the
     content of a ring).

     head is the first slot available to userspace;
     cur is the wakeup point: select/poll will unblock when tail passes cur;
     tail is the first slot reserved to the kernel.

     Slot indexes MUST only move forward; for convenience, the function
           nm_ring_next(ring, index)
     returns the next index modulo the ring size.
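
     The index arithmetic can be sketched as follows.  This is a standalone
     model, not the code from <net/netmap_user.h>: struct idx_ring and the
     model_* names are hypothetical, and only mirror the num_slots, head,
     cur and tail fields described above.

```c
#include <stdint.h>

/* Simplified stand-in for the index fields of struct netmap_ring. */
struct idx_ring {
    uint32_t num_slots;
    uint32_t head, cur, tail;
};

/* Next slot index, wrapping at num_slots (which need not be a power of two). */
static uint32_t
model_ring_next(const struct idx_ring *r, uint32_t i)
{
    return (i + 1 == r->num_slots) ? 0 : i + 1;
}

/* Number of slots in the range cur ... tail-1, accounting for wraparound. */
static uint32_t
model_ring_space(const struct idx_ring *r)
{
    int64_t n = (int64_t)r->tail - (int64_t)r->cur;

    if (n < 0)
        n += r->num_slots;
    return (uint32_t)n;
}
```

     A comparison is used instead of a bit mask because the ring size is not
     guaranteed to be a power of two.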

     head and cur are only modified by the user program; tail is only modified
     by the kernel.  The kernel only reads/writes the struct netmap_ring slots
     and buffers during the execution of a netmap-related system call.  The
     only exceptions are slots (and buffers) in the range tail ... head-1,
     which are explicitly assigned to the kernel.

     On transmit rings, after a netmap system call, slots in the range
     head ... tail-1 are available for transmission.  User code should fill
     the slots sequentially and advance head and cur past slots ready to
     transmit.  cur may be moved further ahead if the user code needs more
     slots before further transmissions (see SCATTER GATHER I/O).

     At the next NIOCTXSYNC/select()/poll(), slots up to head-1 are pushed to
     the port, and tail may advance if further slots have become available.
     Below is an example of the evolution of a TX ring:

         after the syscall, slots between cur and tail are (a)vailable
                   head=cur   tail
                    |          |
                    v          v
          TX  [.....aaaaaaaaaaa.............]

         user creates new packets to (T)ransmit
                     head=cur tail
                         |     |
                         v     v
          TX  [.....TTTTTaaaaaa.............]

         NIOCTXSYNC/poll()/select() sends packets and reports new slots
                     head=cur      tail
                         |          |
                         v          v
          TX  [..........aaaaaaaaaaa........]

     select() and poll() will block if there is no space in the ring, i.e.
           ring->cur == ring->tail
     and return when new slots have become available.

     High speed applications may want to amortize the cost of system calls by
     preparing as many packets as possible before issuing them.

     A transmit ring with pending transmissions has
           ring->head != ring->tail + 1 (modulo the ring size).
     The function int nm_tx_pending(ring) implements this test.
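
     The test follows from the rule that one slot is always kept empty: once
     every transmission has completed, tail sits exactly one slot behind
     head.  A minimal standalone sketch of the check, using a hypothetical
     struct tx_ring_model rather than the real struct netmap_ring:

```c
#include <stdint.h>

/* Simplified stand-in for the index fields of a transmit ring. */
struct tx_ring_model {
    uint32_t num_slots;
    uint32_t head, tail;
};

/* Nonzero if some slots handed to the kernel have not been sent yet. */
static int
model_tx_pending(const struct tx_ring_model *r)
{
    uint32_t after_tail = (r->tail + 1 == r->num_slots) ? 0 : r->tail + 1;

    return r->head != after_tail;
}
```

     A program might loop on NIOCTXSYNC until such a test reports no pending
     transmissions before closing the file descriptor.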

     On receive rings, after a netmap system call, the slots in the range
     head ... tail-1 contain received packets.  User code should process them
     and advance head and cur past slots it wants to return to the kernel.
     cur may be moved further ahead if the user code wants to wait for more
     packets without returning all the previous slots to the kernel.

     At the next NIOCRXSYNC/select()/poll(), slots up to head-1 are returned
     to the kernel for further receives, and tail may advance to report new
     incoming packets.
     Below is an example of the evolution of an RX ring:

         after the syscall, there are some (h)eld and some (R)eceived slots
                head  cur     tail
                 |     |       |
                 v     v       v
          RX  [..hhhhhhRRRRRRRR..........]

         user advances head and cur, releasing some slots and holding others
                    head cur  tail
                      |  |     |
                      v  v     v
          RX  [..*****hhhRRRRRR...........]

         NIOCRXSYNC/poll()/select() recovers slots and reports new packets
                    head cur        tail
                      |  |           |
                      v  v           v
          RX  [.......hhhRRRRRRRRRRRR....]
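
     The receive-side bookkeeping can be sketched as below for the simple
     case in which the program releases every slot it has processed.  The
     structures are a hypothetical, fixed-size model of struct netmap_ring
     and struct netmap_slot, so that the example is self-contained.

```c
#include <stdint.h>

#define MODEL_SLOTS 8

/* Simplified stand-ins for struct netmap_slot and struct netmap_ring. */
struct rx_slot_model {
    uint32_t buf_idx;
    uint16_t len;
    uint16_t flags;
};

struct rx_ring_model {
    uint32_t num_slots;
    uint32_t head, cur, tail;
    struct rx_slot_model slot[MODEL_SLOTS];
};

/*
 * Process every received slot (head ... tail-1), then release them all
 * by moving head and cur up to tail.  Returns the number of bytes seen.
 */
static uint64_t
model_rx_drain(struct rx_ring_model *r)
{
    uint64_t bytes = 0;
    uint32_t i = r->head;

    while (i != r->tail) {
        bytes += r->slot[i].len;                 /* "process" the packet */
        i = (i + 1 == r->num_slots) ? 0 : i + 1; /* indexes only move forward */
    }
    r->head = r->cur = i;  /* slots go back to the kernel at the next sync */
    return bytes;
}
```

     To hold some slots, as in the diagram above, a program would stop
     advancing head at the first slot it still needs while letting cur move
     on.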

     Normally, packets should be stored in the netmap-allocated buffers
     assigned to slots when ports are bound to a file descriptor.  One packet
     is fully contained in a single buffer.

     The following flags affect slot and buffer processing:

     NS_BUF_CHANGED
          it MUST be used when the buf_idx in the slot is changed.  This can
          be used to implement zero-copy forwarding, see ZERO-COPY FORWARDING.

     NS_REPORT
          reports when this buffer has been transmitted.  Normally, netmap
          notifies transmit completions in batches, hence signals can be
          delayed indefinitely.  This flag helps detect when packets have
          been sent and a file descriptor can be closed.

     NS_FORWARD
          When a ring is in 'transparent' mode (see TRANSPARENT MODE), packets
          marked with this flag are forwarded to the other endpoint at the
          next system call, thus restoring (in a selective way) the connection
          between a NIC and the host stack.

     NS_NO_LEARN
          tells the forwarding code that the SRC MAC address for this packet
          must not be used in the learning bridge code.

     NS_INDIRECT
          indicates that the packet's payload is in a user-supplied buffer,
          whose user virtual address is in the 'ptr' field of the slot.  The
          size can reach 65535 bytes.
          This is only supported on the transmit ring of VALE ports, and it
          helps reduce data copies in the interconnection of virtual
          machines.

     NS_MOREFRAG
          indicates that the packet continues with subsequent buffers; the
          last buffer in a packet must have the flag clear.

     Packets can span multiple slots if the NS_MOREFRAG flag is set in all but
     the last slot.  The maximum length of a chain is 64 buffers.  This is
     normally used with VALE ports when connecting virtual machines, as they
     generate large TSO segments that are not split unless they reach a
     physical device.

     NOTE: The length field always refers to the individual fragment; there is
     no place with the total length of a packet.

     On receive rings the macro NS_RFRAGS(slot) indicates the remaining number
     of slots for this packet, including the current one.  Slots with a value
     greater than 1 also have NS_MOREFRAG set.
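
     Since no field carries the total length, a client has to add up the
     fragments itself.  A minimal sketch, using a hypothetical slot model
     and a locally defined MODEL_NS_MOREFRAG whose value is an assumption;
     real code should use NS_MOREFRAG from the netmap headers.

```c
#include <stdint.h>

/* Assumed value, for illustration; use NS_MOREFRAG from <net/netmap.h>. */
#define MODEL_NS_MOREFRAG 0x0020

/* Simplified stand-in for struct netmap_slot. */
struct frag_slot_model {
    uint32_t buf_idx;
    uint16_t len;
    uint16_t flags;
};

/*
 * Sum the fragment lengths of the packet starting at slot first.
 * avail is the number of valid slots from first onward.  Stores the
 * number of slots used in *nfrags and returns the total length, or
 * returns 0 with *nfrags == 0 if the packet does not end within avail
 * slots.
 */
static uint32_t
model_pkt_len(const struct frag_slot_model *slot, uint32_t num_slots,
    uint32_t first, uint32_t avail, uint32_t *nfrags)
{
    uint32_t total = 0, i = first;

    for (uint32_t n = 1; n <= avail; n++) {
        total += slot[i].len;
        if (!(slot[i].flags & MODEL_NS_MOREFRAG)) {
            *nfrags = n;      /* last fragment: flag is clear */
            return total;
        }
        i = (i + 1 == num_slots) ? 0 : i + 1;
    }
    *nfrags = 0;              /* chain continues past the available slots */
    return 0;
}
```

     Returning 0 for an incomplete chain lets the caller leave head
     untouched and wait for the remaining fragments.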

     netmap uses two ioctls (NIOCTXSYNC, NIOCRXSYNC) for non-blocking I/O.
     They take no argument.  Two more ioctls (NIOCGINFO, NIOCREGIF) are used
     to query and configure ports, with the following argument:

     struct nmreq {
         char      nr_name[IFNAMSIZ]; /* (i) port name                  */
         uint32_t  nr_version;        /* (i) API version                */
         uint32_t  nr_offset;         /* (o) nifp offset in mmap region */
         uint32_t  nr_memsize;        /* (o) size of the mmap region    */
         uint32_t  nr_tx_slots;       /* (i/o) slots in tx rings        */
         uint32_t  nr_rx_slots;       /* (i/o) slots in rx rings        */
         uint16_t  nr_tx_rings;       /* (i/o) number of tx rings       */
         uint16_t  nr_rx_rings;       /* (i/o) number of rx rings       */
         uint16_t  nr_ringid;         /* (i/o) ring(s) we care about    */
         uint16_t  nr_cmd;            /* (i) special command            */
         uint16_t  nr_arg1;           /* (i/o) extra arguments          */
         uint16_t  nr_arg2;           /* (i/o) extra arguments          */
         uint32_t  nr_arg3;           /* (i/o) extra arguments          */
         uint32_t  nr_flags;          /* (i/o) open mode                */
     };

     A file descriptor obtained through /dev/netmap also supports the ioctls
     supported by network devices, see netintro(4).

     NIOCGINFO
           returns EINVAL if the named port does not support netmap.
           Otherwise, it returns 0 and (advisory) information about the port.
           Note that all the information below can change before the interface
           is actually put in netmap mode.

           nr_memsize
               indicates the size of the netmap memory region. NICs in netmap
               mode all share the same memory region, whereas VALE ports have
               independent regions for each port.

           nr_tx_slots, nr_rx_slots
               indicate the size of transmit and receive rings.

           nr_tx_rings, nr_rx_rings
               indicate the number of transmit and receive rings.  Both ring
               number and sizes may be configured at runtime using interface-
               specific functions (e.g. ethtool).

     NIOCREGIF
           binds the port named in nr_name to the file descriptor. For a
           physical device this also switches it into netmap mode,
           disconnecting it from the host stack.  Multiple file descriptors
           can be bound to the same port, with proper synchronization left to
           the user.

           NIOCREGIF can also bind a file descriptor to one endpoint of a
           netmap pipe, consisting of two netmap ports with a crossover
           connection.  A netmap pipe shares the same memory space as the
           parent port, and is meant to enable configurations where a master
           process acts as a dispatcher towards slave processes.

           To enable this function, the nr_arg1 field of the structure can be
           used as a hint to the kernel to indicate how many pipes we expect
           to use, and reserve extra space in the memory region.

           On return, it gives the same info as NIOCGINFO, with nr_ringid and
           nr_flags indicating the identity of the rings controlled through
           the file descriptor.

           nr_flags and nr_ringid select which rings are controlled through this
           file descriptor.  Possible values of nr_flags are indicated below,
           together with the naming schemes that application libraries (such
           as the nm_open indicated below) can use to indicate the specific
           set of rings.  In the example below, "netmap:foo" is any valid
           netmap port name.

           NR_REG_ALL_NIC netmap:foo
                  (default) all hardware ring pairs

           NR_REG_SW netmap:foo^
                  the ``host rings'', connecting to the host stack.

           NR_REG_NIC_SW netmap:foo+
                  all hardware rings and the host rings

           NR_REG_ONE_NIC netmap:foo-i
                  only the i-th hardware ring pair, where the number is in
                  nr_ringid;

           NR_REG_PIPE_MASTER netmap:foo{i
                  the master side of the netmap pipe whose identifier (i) is
                  in nr_ringid;

           NR_REG_PIPE_SLAVE netmap:foo}i
                  the slave side of the netmap pipe whose identifier (i) is in
                  nr_ringid.

                  The identifier of a pipe must be treated as part of the pipe
                  name, and does not need to be sequential. On return the pipe
                  will only have a single ring pair with index 0, irrespective
                  of the value of i.
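
           The naming scheme above can be illustrated with a small parser
           sketch.  It is not the parser used by nm_open(): the function
           model_parse_port and the MODEL_* constants are hypothetical, and
           the sketch assumes that the interface name itself contains none
           of the characters ^ + - { }.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical counterparts of the NR_REG_* modes listed above. */
enum model_mode {
    MODEL_ALL_NIC, MODEL_SW, MODEL_NIC_SW,
    MODEL_ONE_NIC, MODEL_PIPE_MASTER, MODEL_PIPE_SLAVE,
    MODEL_BAD
};

/*
 * Classify a "netmap:..." port name by its suffix and store the numeric
 * ring or pipe identifier (0 when the suffix carries none) in *id.
 */
static enum model_mode
model_parse_port(const char *name, uint16_t *id)
{
    const char *p;
    char *end;
    enum model_mode mode;
    unsigned long v;

    *id = 0;
    if (strncmp(name, "netmap:", 7) != 0)
        return MODEL_BAD;
    p = name + 7 + strcspn(name + 7, "^+-{}");  /* first suffix character */
    if (p == name + 7)
        return MODEL_BAD;                       /* empty interface name */
    switch (*p) {
    case '\0': return MODEL_ALL_NIC;
    case '^':  return p[1] == '\0' ? MODEL_SW : MODEL_BAD;
    case '+':  return p[1] == '\0' ? MODEL_NIC_SW : MODEL_BAD;
    case '-':  mode = MODEL_ONE_NIC; break;
    case '{':  mode = MODEL_PIPE_MASTER; break;
    default:   mode = MODEL_PIPE_SLAVE; break;  /* '}' */
    }
    if (p[1] < '0' || p[1] > '9')
        return MODEL_BAD;                       /* a number must follow */
    v = strtoul(p + 1, &end, 10);
    if (*end != '\0' || v > UINT16_MAX)
        return MODEL_BAD;
    *id = (uint16_t)v;
    return mode;
}
```

           For example, "netmap:foo-3" is classified as a single hardware
           ring pair with identifier 3, and "netmap:foo{12" as the master
           side of pipe 12.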

           By default, a poll(2) or select(2) call pushes out any pending
           packets on the transmit ring, even if no write events are
           specified.  The feature can be disabled by or-ing NETMAP_NO_TX_POLL
           to the value written to nr_ringid. When this feature is used,
           packets are transmitted only when ioctl(NIOCTXSYNC) is called, or
           when select()/poll() are called with a write event (POLLOUT/wfdset)
           or a full ring.

           When registering a virtual interface that is dynamically created to
           a vale(4) switch, we can specify the desired number of rings (1 by
           default, and currently up to 16) on it using nr_tx_rings and
           nr_rx_rings fields.

     NIOCTXSYNC
           tells the hardware of new packets to transmit, and updates the
           number of slots available for transmission.

     NIOCRXSYNC
           tells the hardware of consumed packets, and asks for newly
           available packets.

     select(2) and poll(2) on a netmap file descriptor process rings as
     indicated in TRANSMIT RINGS and RECEIVE RINGS, respectively when write
     (POLLOUT) and read (POLLIN) events are requested.  Both block if no slots
     are available in the ring (ring->cur == ring->tail).  Depending on the
     platform, epoll(2) and kqueue(2) are supported too.

     Packets in transmit rings are normally pushed out (and buffers reclaimed)
     even without requesting write events. Passing the NETMAP_NO_TX_POLL flag
     to NIOCREGIF disables this feature.  By default, receive rings are
     processed only if read events are requested. Passing the
     NETMAP_DO_RX_POLL flag to NIOCREGIF updates receive rings even without
     read events. Note that on epoll and kqueue, NETMAP_NO_TX_POLL and
     NETMAP_DO_RX_POLL only have an effect when some event is posted for the
     file descriptor.

     The netmap API is supposed to be used directly, both because of its
     simplicity and for efficient integration with applications.

     For convenience, the <net/netmap_user.h> header provides a few macros and
     functions to ease creating a file descriptor and doing I/O with a netmap
     port. These are loosely modeled after the pcap(3) API, to ease porting of
     libpcap-based applications to netmap.  To use these extra functions,
     programs should
           #define NETMAP_WITH_LIBS
           #include <net/netmap_user.h>

     The following functions are available:

     struct nm_desc * nm_open(const char *ifname, const struct nmreq *req,
            uint64_t flags, const struct nm_desc *arg)
            similar to pcap_open, binds a file descriptor to a port.

            ifname
                 is a port name, in the form "netmap:XXX" for a NIC and
                 "valeXXX:YYY" for a VALE port.

            req
                 provides the initial values for the argument to the NIOCREGIF
                 ioctl.  The nr_flags and nr_ringid values are overwritten by
                 parsing ifname and flags, and other fields can be overridden
                 through the other two arguments.

            arg
                 points to a struct nm_desc containing arguments (e.g. from a
                 previously open file descriptor) that should override the
                 defaults.  The fields are used as described below.

            flags
                 can be set to a combination of the following flags:
                 NETMAP_NO_TX_POLL, NETMAP_DO_RX_POLL (copied into nr_ringid);
                 NM_OPEN_NO_MMAP (if arg points to the same memory region,
                 avoids the mmap and uses the values from it); NM_OPEN_IFNAME
                 (ignores ifname and uses the values in arg); NM_OPEN_ARG1,
                 NM_OPEN_ARG2, NM_OPEN_ARG3 (uses the fields from arg);
                 NM_OPEN_RING_CFG (uses the ring number and sizes from arg).

     int nm_close(struct nm_desc *d)
            closes the file descriptor, unmaps memory, frees resources.

     int nm_inject(struct nm_desc *d, const void *buf, size_t size)
            similar to pcap_inject(), pushes a packet to a ring, returns the
            size of the packet if successful, or 0 on error;

     int nm_dispatch(struct nm_desc *d, int cnt, nm_cb_t cb, u_char *arg)
            similar to pcap_dispatch(), applies a callback to incoming packets

     u_char * nm_nextpkt(struct nm_desc *d, struct nm_pkthdr *hdr)
            similar to pcap_next(), fetches the next packet

     netmap natively supports the following devices:

     On FreeBSD: em(4), igb(4), ixgbe(4), lem(4), re(4).

     On Linux e1000(4), e1000e(4), igb(4), ixgbe(4), mlx4(4), forcedeth(4),

     NICs without native support can still be used in netmap mode through
     emulation. Performance is inferior to native netmap mode but still
     significantly higher than sockets, and approaching that of in-kernel
     solutions such as Linux's pktgen.

     Emulation is also available for devices with native netmap support, which
     can be used for testing or performance comparison.  The sysctl variable
     dev.netmap.admode globally controls how netmap mode is implemented.

     Some aspects of the operation of netmap are controlled through sysctl
     variables on FreeBSD (dev.netmap.*) and module parameters on Linux

     dev.netmap.admode: 0
             Controls the use of native or emulated adapter mode.  0 uses the
             best available option, 1 forces native and fails if not
             available, 2 forces emulated hence never fails.

     dev.netmap.generic_ringsize: 1024
             Ring size used for emulated netmap mode

     dev.netmap.generic_mit: 100000
             Controls interrupt moderation for emulated mode

     dev.netmap.mmap_unreg: 0

     dev.netmap.fwd: 0
             Forces NS_FORWARD mode

     dev.netmap.flags: 0

     dev.netmap.txsync_retry: 2

     dev.netmap.no_pendintr: 1
             Forces recovery of transmit buffers on system calls

     dev.netmap.mitigate: 1
             Propagates interrupt mitigation to user processes

     dev.netmap.no_timestamp: 0
             Disables the update of the timestamp in the netmap ring

     dev.netmap.verbose: 0
             Verbose kernel messages

     dev.netmap.buf_num: 163840

     dev.netmap.buf_size: 2048

     dev.netmap.ring_num: 200

     dev.netmap.ring_size: 36864

     dev.netmap.if_num: 100

     dev.netmap.if_size: 1024
             Sizes and number of objects (netmap_if, netmap_ring, buffers) for
             the global memory region. The only parameter worth modifying is
             dev.netmap.buf_num as it impacts the total amount of memory used
             by netmap.

     dev.netmap.buf_curr_num: 0

     dev.netmap.buf_curr_size: 0

     dev.netmap.ring_curr_num: 0

     dev.netmap.ring_curr_size: 0

     dev.netmap.if_curr_num: 0

     dev.netmap.if_curr_size: 0
             Actual values in use.

     dev.netmap.bridge_batch: 1024
             Batch size used when moving packets across a VALE switch. Values
             above 64 generally guarantee good performance.

     netmap uses select(2), poll(2), epoll and kqueue to wake up processes
     when significant events occur, and mmap(2) to map memory.  ioctl(2) is
     used to configure ports and VALE switches.

     Applications may need to create threads and bind them to specific cores
     to improve performance, using standard OS primitives, see pthread(3).  In
     particular, pthread_setaffinity_np(3) may be of use.

     No matter how fast the CPU and OS are, achieving line rate on 10G and
     faster interfaces requires hardware with sufficient performance.  Several
     NICs are unable to sustain line rate with small packet sizes.
     Insufficient PCIe or memory bandwidth can also cause reduced performance.

     Another frequent reason for low performance is the use of flow control on
     the link: a slow receiver can limit the transmit speed.  Be sure to
     disable flow control when running high speed experiments.

     netmap is orthogonal to some NIC features such as multiqueue, schedulers,
     packet filters.

     Multiple transmit and receive rings are supported natively and can be
     configured with ordinary OS tools, such as ethtool or device-specific
     sysctl variables.  The same goes for Receive Packet Steering (RPS) and
     filtering of incoming traffic.

     netmap does not use features such as checksum offloading, TCP
     segmentation offloading, encryption, VLAN encapsulation/decapsulation,
     etc.  When using netmap to exchange packets with the host stack, make
     sure to disable these features.

     netmap comes with a few programs that can be used for testing or simple
     applications.  See the examples/ directory in netmap distributions, or
     tools/tools/netmap/ directory in FreeBSD distributions.

     pkt-gen is a general purpose traffic source/sink.

     As an example
           pkt-gen -i ix0 -f tx -l 60
     can generate an infinite stream of minimum size packets, and
           pkt-gen -i ix0 -f rx
     is a traffic sink.  Both print traffic statistics, to help monitor how
     the system performs.

     pkt-gen has many options that can be used to set packet sizes, addresses,
     rates, and use multiple send/receive threads and cores.

     bridge is another test program which interconnects two netmap ports. It
     can be used for transparent forwarding between interfaces, as in
           bridge -i ix0 -i ix1
     or even connect the NIC to the host stack using netmap
           bridge -i ix0 -i ix0

     The following code implements a traffic generator

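     #include <sys/ioctl.h>
     #include <sys/mman.h>
     #include <fcntl.h>
     #include <poll.h>
     #include <string.h>
     #include <strings.h>
     #include <net/netmap_user.h>

     void sender(void)
     {
         struct netmap_if *nifp;
         struct netmap_ring *ring;
         struct nmreq nmr;
         struct pollfd fds;
         char *p, *buf;
         uint32_t i;
         int fd;

         /* bind the file descriptor to all hardware rings of ix0 */
         fd = open("/dev/netmap", O_RDWR);
         bzero(&nmr, sizeof(nmr));
         strcpy(nmr.nr_name, "ix0");
         nmr.nr_version = NETMAP_API;
         ioctl(fd, NIOCREGIF, &nmr);
         /* map the shared region and locate the first transmit ring */
         p = mmap(0, nmr.nr_memsize, PROT_READ | PROT_WRITE,
             MAP_SHARED, fd, 0);
         nifp = NETMAP_IF(p, nmr.nr_offset);
         ring = NETMAP_TXRING(nifp, 0);
         fds.fd = fd;
         fds.events = POLLOUT;
         for (;;) {
             poll(&fds, 1, -1);      /* wait for free slots */
             while (!nm_ring_empty(ring)) {
                 i = ring->cur;
                 buf = NETMAP_BUF(ring, ring->slot[i].buf_idx);
                 ... prepare packet in buf ...
                 ring->slot[i].len = ... packet length ...
                 ring->head = ring->cur = nm_ring_next(ring, i);
             }
         }
     }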

     A simple receiver can be implemented using the helper functions
     #define NETMAP_WITH_LIBS
     #include <poll.h>
     #include <net/netmap_user.h>

     void receiver(void)
     {
         struct nm_desc *d;
         struct pollfd fds;
         u_char *buf;
         struct nm_pkthdr h;

         d = nm_open("netmap:ix0", NULL, 0, 0);
         fds.fd = NETMAP_FD(d);
         fds.events = POLLIN;
         for (;;) {
             poll(&fds, 1, -1);      /* wait for incoming packets */
             while ( (buf = nm_nextpkt(d, &h)) )
                 consume_pkt(buf, h.len);
         }
     }

     Since physical interfaces share the same memory region, it is possible to
     do packet forwarding between ports by swapping buffers. The buffer from the
     transmit ring is used to replenish the receive ring:
         uint32_t tmp;
         struct netmap_slot *src, *dst;
         src = &rxr->slot[rxr->cur];
         dst = &txr->slot[txr->cur];
         tmp = dst->buf_idx;
         dst->buf_idx = src->buf_idx;
         dst->len = src->len;
         dst->flags = NS_BUF_CHANGED;
         src->buf_idx = tmp;
         src->flags = NS_BUF_CHANGED;
         rxr->head = rxr->cur = nm_ring_next(rxr, rxr->cur);
         txr->head = txr->cur = nm_ring_next(txr, txr->cur);

     The host stack is for all practical purposes just a regular ring pair,
     which you can access with the netmap API, e.g. with
           nm_open("netmap:eth0^", ...);
     All packets that the host would send to an interface in netmap mode end
     up in the RX ring, whereas all packets queued to the TX ring are sent
     up to the host stack.

     A simple way to test the performance of a VALE switch is to attach a
     sender and a receiver to it, e.g. running the following in two different
     terminals:
           pkt-gen -i vale1:a -f rx # receiver
           pkt-gen -i vale1:b -f tx # sender
     The same example can be used to test netmap pipes, by simply changing
     port names, e.g.
           pkt-gen -i vale:x{3 -f rx # receiver on the master side
           pkt-gen -i vale:x}3 -f tx # sender on the slave side

     The following command attaches an interface and the host stack to a
     VALE switch:
           vale-ctl -h vale2:em0
     Other netmap clients attached to the same switch can now communicate with
     the network card or the host.


     Luigi Rizzo, Revisiting network I/O APIs: the netmap framework,
     Communications of the ACM, 55 (3), pp.45-51, March 2012

     Luigi Rizzo, netmap: a novel framework for fast packet I/O, Usenix
     ATC'12, June 2012, Boston

     Luigi Rizzo, Giuseppe Lettieri, VALE, a switched ethernet for virtual
     machines, ACM CoNEXT'12, December 2012, Nice

     Luigi Rizzo, Giuseppe Lettieri, Vincenzo Maffione, Speeding up packet I/O
     in virtual machines, ACM/IEEE ANCS'13, October 2013, San Jose

     The netmap framework was originally designed and implemented at the
     Universita` di Pisa in 2011 by Luigi Rizzo, and further extended with
     help from Matteo Landi, Gaetano Catalli, Giuseppe Lettieri, Vincenzo

     netmap and VALE have been funded by the European Commission within FP7
     Projects CHANGE (257422) and OPENLAB (287581).

FreeBSD 11.0-PRERELEASE        February 13, 2014       FreeBSD 11.0-PRERELEASE

