TCP(4)                 FreeBSD Kernel Interfaces Manual                 TCP(4)

     tcp - Internet Transmission Control Protocol

     #include <sys/types.h>
     #include <sys/socket.h>
     #include <netinet/in.h>
     #include <netinet/tcp.h>

     socket(AF_INET, SOCK_STREAM, 0);

     The TCP protocol provides reliable, flow-controlled, two-way transmission
     of data.  It is a byte-stream protocol used to support the SOCK_STREAM
     abstraction.  TCP uses the standard Internet address format and, in
     addition, provides a per-host collection of ``port addresses''.  Thus,
     each address is composed of an Internet address specifying the host and
     network, with a specific TCP port on the host identifying the peer
     entity.

     Sockets utilizing the TCP protocol are either ``active'' or ``passive''.
     Active sockets initiate connections to passive sockets.  By default, TCP
     sockets are created active; to create a passive socket, the listen(2)
     system call must be used after binding the socket with the bind(2) system
     call.  Only passive sockets may use the accept(2) call to accept incoming
     connections.  Only active sockets may use the connect(2) call to initiate
     connections.

     Passive sockets may ``underspecify'' their location to match incoming
     connection requests from multiple networks.  This technique, termed
     ``wildcard addressing'', allows a single server to provide service to
     clients on multiple networks.  To create a socket which listens on all
     networks, the Internet address INADDR_ANY must be bound.  The TCP port
     may still be specified at this time; if the port is not specified, the
     system will assign one.  Once a connection has been established, the
     socket's address is fixed by the peer entity's location.  The address
     assigned to the socket is the address associated with the network
     interface through which packets are being transmitted and received.
     Normally, this address corresponds to the peer entity's network.
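
     The passive, wildcard-addressed socket described above can be sketched
     roughly as follows.  This is a minimal illustration, not part of the
     original manual: the function name listen_any() and the backlog of 5 are
     assumptions chosen for the example.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <string.h>
#include <unistd.h>

/*
 * Create a passive TCP socket bound to INADDR_ANY (hypothetical helper).
 * Passing port 0 lets the system assign a port; the port actually bound
 * is returned through *port_out in host byte order.
 * Returns the socket descriptor, or -1 on error.
 */
int
listen_any(unsigned short port, unsigned short *port_out)
{
	struct sockaddr_in sin;
	socklen_t len = sizeof(sin);
	int s;

	if ((s = socket(AF_INET, SOCK_STREAM, 0)) == -1)
		return (-1);

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = htonl(INADDR_ANY);	/* wildcard address */
	sin.sin_port = htons(port);			/* 0: system assigns */

	/* bind(2) first, then listen(2) turns the socket passive. */
	if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) == -1 ||
	    listen(s, 5) == -1 ||
	    getsockname(s, (struct sockaddr *)&sin, &len) == -1) {
		close(s);
		return (-1);
	}
	if (port_out != NULL)
		*port_out = ntohs(sin.sin_port);
	return (s);
}
```

     A caller would then pass the returned descriptor to accept(2); calling
     listen_any(0, &port) leaves the system-assigned port in port.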

     TCP supports a number of socket options which can be set with
     setsockopt(2) and tested with getsockopt(2):

     TCP_INFO            Information about a socket's underlying TCP session
                         may be retrieved by passing the read-only option
                         TCP_INFO to getsockopt(2).  It accepts a single
                         argument: a pointer to an instance of struct

                         This API is subject to change; consult the source to
                         determine which fields are currently filled out by
                         this option.  FreeBSD-specific additions include send
                         window size, receive window size, and bandwidth-
                         controlled window space.

     TCP_CONGESTION      Select or query the congestion control algorithm that
                         TCP will use for the connection.  See mod_cc(4) for
                         details.

     TCP_KEEPINIT        This setsockopt(2) option accepts a per-socket
                         timeout argument of u_int in seconds, for new, non-
                         established TCP connections.  For the global default
                         in milliseconds see keepinit in the MIB Variables
                         section further down.

     TCP_KEEPIDLE        This setsockopt(2) option accepts an argument of
                         u_int for the amount of time, in seconds, that the
                         connection must be idle before keepalive probes (if
                         enabled) are sent for the connection of this socket.
                         If set on a listening socket, the value is inherited
                         by the newly created socket upon accept(2).  For the
                         global default in milliseconds see keepidle in the
                         MIB Variables section further down.

     TCP_KEEPINTVL       This setsockopt(2) option accepts an argument of
                         u_int to set the per-socket interval, in seconds,
                         between keepalive probes sent to a peer.  If set on a
                         listening socket, the value is inherited by the newly
                         created socket upon accept(2).  For the global
                         default in milliseconds see keepintvl in the MIB
                         Variables section further down.

     TCP_KEEPCNT         This setsockopt(2) option accepts an argument of
                         u_int and allows a per-socket tuning of the number of
                         probes sent, with no response, before the connection
                         will be dropped.  If set on a listening socket, the
                         value is inherited by the newly created socket upon
                         accept(2).  For the global default see keepcnt in
                         the MIB Variables section further down.

     TCP_NODELAY         Under most circumstances, TCP sends data when it is
                         presented; when outstanding data has not yet been
                         acknowledged, it gathers small amounts of output to
                         be sent in a single packet once an acknowledgement is
                         received.  For a small number of clients, such as
                         window systems that send a stream of mouse events
                         which receive no replies, this packetization may
                         cause significant delays.  The boolean option
                         TCP_NODELAY defeats this algorithm.

     TCP_MAXSEG          By default, a sender- and receiver-TCP will negotiate
                         among themselves to determine the maximum segment
                         size to be used for each connection.  The TCP_MAXSEG
                         option allows the user to determine the result of
                         this negotiation, and to reduce it if desired.

     TCP_NOOPT           TCP usually sends a number of options in each packet,
                         corresponding to various TCP extensions which are
                         provided in this implementation.  The boolean option
                         TCP_NOOPT is provided to disable TCP option use on a
                         per-connection basis.

     TCP_NOPUSH          By convention, the sender-TCP will set the ``push''
                         bit, and begin transmission immediately (if
                         permitted) at the end of every user call to write(2)
                         or writev(2).  When this option is set to a non-zero
                         value, TCP will delay sending any data at all until
                         either the socket is closed, or the internal send
                         buffer is filled.

     TCP_MD5SIG          This option enables the use of MD5 digests (also
                         known as TCP-MD5) on writes to the specified socket.
                         Outgoing traffic is digested; digests on incoming
                         traffic are verified if the
                         net.inet.tcp.signature_verify_input sysctl is
                         nonzero.  The current default behavior for the system
                         is to respond to a system advertising this option
                         with TCP-MD5; this may change.

                         One common use for this in a FreeBSD router
                         deployment is to enable FreeBSD-based routers to interwork
                         with Cisco equipment at peering points.  Support for
                         this feature conforms to RFC 2385.  Only IPv4
                         (AF_INET) sessions are supported.

                         In order for this option to function correctly, it is
                         necessary for the administrator to add a tcp-md5 key
                         entry to the system's security associations database
                         (SADB) using the setkey(8) utility.  This entry must
                         have an SPI of 0x1000 and can therefore only be
                         specified on a per-host basis at this time.

                         If an SADB entry cannot be found for the destination,
                         the outgoing traffic will have an invalid digest
                         option prepended, and the following error message
                         will be visible on the system console:
                         tcp_signature_compute: SADB lookup failed for

     The option level for the setsockopt(2) call is the protocol number for
     TCP, available from getprotobyname(3), or IPPROTO_TCP.  All options are
     declared in <netinet/tcp.h>.
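
     As a rough sketch of how the options above are applied, the following
     helpers set TCP_NODELAY and the per-socket keepalive options.  The
     function names tcp_level(), set_nodelay() and set_keepalive() are
     assumptions made for illustration; SO_KEEPALIVE is a socket-level option
     and is enabled here so that the keepalive timers take effect.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <netdb.h>
#include <stddef.h>

/* Look up the option level by name, falling back to IPPROTO_TCP. */
int
tcp_level(void)
{
	struct protoent *pe = getprotobyname("tcp");

	return (pe != NULL ? pe->p_proto : IPPROTO_TCP);
}

/* Turn on the boolean TCP_NODELAY option for socket s. */
int
set_nodelay(int s)
{
	int on = 1;

	return (setsockopt(s, tcp_level(), TCP_NODELAY, &on, sizeof(on)));
}

/*
 * Enable keepalive probes on socket s and set the idle time and probe
 * interval (both in seconds) and the probe count.
 * Returns 0 on success, -1 if any setsockopt(2) call fails.
 */
int
set_keepalive(int s, unsigned int idle, unsigned int intvl, unsigned int cnt)
{
	int on = 1, level = tcp_level();

	if (setsockopt(s, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) == -1 ||
	    setsockopt(s, level, TCP_KEEPIDLE, &idle, sizeof(idle)) == -1 ||
	    setsockopt(s, level, TCP_KEEPINTVL, &intvl, sizeof(intvl)) == -1 ||
	    setsockopt(s, level, TCP_KEEPCNT, &cnt, sizeof(cnt)) == -1)
		return (-1);
	return (0);
}
```

     For instance, set_keepalive(s, 60, 10, 5) asks for probing to start
     after 60 idle seconds, with up to 5 probes sent 10 seconds apart; the
     values can be read back with getsockopt(2).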

     Options at the IP transport level may be used with TCP; see ip(4).
     Incoming connection requests that are source-routed are noted, and the
     reverse source route is used in responding.

     The default congestion control algorithm for TCP is cc_newreno(4).  Other
     congestion control algorithms can be made available using the mod_cc(4)
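
     The congestion control algorithm of a socket can be queried or selected
     with the TCP_CONGESTION option.  The sketch below assumes the algorithm
     is passed as a short name string (such as "newreno"); the helper names
     get_cc_algo() and set_cc_algo() are hypothetical.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <string.h>

/*
 * Copy the name of the congestion control algorithm used by socket s
 * into buf as a NUL-terminated string.  Returns 0 on success, -1 on error.
 */
int
get_cc_algo(int s, char *buf, size_t buflen)
{
	socklen_t len;

	if (buflen == 0)
		return (-1);
	memset(buf, 0, buflen);
	len = (socklen_t)(buflen - 1);	/* keep room for the terminator */
	return (getsockopt(s, IPPROTO_TCP, TCP_CONGESTION, buf, &len));
}

/*
 * Select the congestion control algorithm for socket s by name.  The
 * call is expected to fail if the named algorithm is not available.
 */
int
set_cc_algo(int s, const char *name)
{
	return (setsockopt(s, IPPROTO_TCP, TCP_CONGESTION, name,
	    (socklen_t)strlen(name)));
}
```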

   MIB Variables
     The TCP protocol implements a number of variables in the net.inet.tcp
     branch of the sysctl(3) MIB.
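
     A variable in this branch can be read by name with sysctlbyname(3).  The
     sketch below assumes an integer-valued variable such as
     net.inet.tcp.mssdflt; the helper name tcp_mib_int() is hypothetical, and
     on systems other than FreeBSD the function simply reports failure.

```c
#include <sys/types.h>
#include <stddef.h>
#if defined(__FreeBSD__)
#include <sys/sysctl.h>
#endif

/*
 * Read an integer-valued MIB variable, e.g. "net.inet.tcp.mssdflt".
 * Returns 0 and stores the value in *val on success, -1 on failure.
 */
int
tcp_mib_int(const char *name, int *val)
{
#if defined(__FreeBSD__)
	size_t len = sizeof(*val);

	/* No new value is supplied, so this is a read-only query. */
	return (sysctlbyname(name, val, &len, NULL, 0));
#else
	/* Assumption: only FreeBSD is handled in this sketch. */
	(void)name;
	(void)val;
	return (-1);
#endif
}
```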

     TCPCTL_DO_RFC1323      (rfc1323) Implement the window scaling and
                            timestamp options of RFC 1323 (default is true).

     TCPCTL_MSSDFLT         (mssdflt) The default value used for the maximum
                            segment size (``MSS'') when no advice to the
                            contrary is received from MSS negotiation.

     TCPCTL_SENDSPACE       (sendspace) Maximum TCP send window.

     TCPCTL_RECVSPACE       (recvspace) Maximum TCP receive window.

     log_in_vain            Log any connection attempts to ports where there
                            is not a socket accepting connections.  The value
                            of 1 limits the logging to SYN (connection
                            establishment) packets only.  That of 2 results in
                            any TCP packets to closed ports being logged.  Any
                            value unlisted above disables the logging (default
                            is 0, i.e., the logging is disabled).

     msl                    The Maximum Segment Lifetime, in milliseconds, for
                            a packet.

     keepinit               Timeout, in milliseconds, for new, non-established
                            TCP connections.  The default is 75000 msec.

     keepidle               Amount of time, in milliseconds, that the
                            connection must be idle before keepalive probes
                            (if enabled) are sent.  The default is 7200000
                            msec (2 hours).

     keepintvl              The interval, in milliseconds, between keepalive
                            probes sent to remote machines, when no response
                            is received on a keepidle probe.  The default is
                            75000 msec.

     keepcnt                Number of probes sent, with no response, before a
                            connection is dropped.  The default is 8 packets.

     always_keepalive       Assume that SO_KEEPALIVE is set on all TCP
                            connections; the kernel will periodically send a
                            packet to the remote host to verify the connection
                            is still up.

     icmp_may_rst           Certain ICMP unreachable messages may abort
                            connections in SYN-SENT state.

     do_tcpdrain            Flush packets in the TCP reassembly queue if the
                            system is low on mbufs.

     blackhole              If enabled, disable sending of RST when a
                            connection is attempted to a port where there is
                            not a socket accepting connections.  See

     delayed_ack            Delay ACK to try and piggyback it onto a data
                            packet.

     delacktime             Maximum amount of time, in milliseconds, before a
                            delayed ACK is sent.

     path_mtu_discovery     Enable Path MTU Discovery.

     tcbhashsize            Size of the TCP control-block hash table (read-
                            only).  This may be tuned using the kernel option
                            TCBHASHSIZE or by setting net.inet.tcp.tcbhashsize
                            in the loader(8).

     pcbcount               Number of active protocol control blocks (read-
                            only).

     syncookies             Determines whether or not SYN cookies should be
                            generated for outbound SYN-ACK packets.  SYN
                            cookies are a great help during SYN flood attacks,
                            and are enabled by default.  (See syncookies(4).)

     isn_reseed_interval    The interval (in seconds) specifying how often the
                            secret data used in RFC 1948 initial sequence
                            number calculations should be reseeded.  By
                            default, this variable is set to zero, indicating
                            that no reseeding will occur.  Reseeding should
                            not be necessary, and will break TIME_WAIT
                            recycling for a few minutes.

     rexmit_min, rexmit_slop
                            Adjust the retransmit timer calculation for TCP.
                            The slop is typically added to the raw calculation
                            to take into account occasional variances that the
                            SRTT (smoothed round-trip time) is unable to
                            accommodate, while the minimum specifies an
                            absolute minimum.  While a number of TCP RFCs
                            suggest a 1 second minimum, these RFCs tend to
                            focus on streaming behavior, and fail to deal with
                            the fact that a 1 second minimum has severe
                            detrimental effects over lossy interactive
                            connections, such as an 802.11b wireless link, and
                            over very fast but lossy connections for those
                            cases not covered by the fast retransmit code.
                            For this reason, we use 200ms of slop and a near-0
                            minimum, which gives us an effective minimum of
                            200ms (similar to Linux).

     rfc3042                Enable the Limited Transmit algorithm as described
                            in RFC 3042.  It helps avoid timeouts on lossy
                            links and also when the congestion window is
                            small, as happens on short transfers.

     rfc3390                Enable support for RFC 3390, which allows for a
                            variable-sized starting congestion window on new
                            connections, depending on the maximum segment
                            size.  This helps throughput in general, but
                            particularly affects short transfers and high-
                            bandwidth large propagation-delay connections.

     sack.enable            Enable support for RFC 2018, TCP Selective
                            Acknowledgment option, which allows the receiver
                            to inform the sender about all successfully
                            arrived segments, allowing the sender to
                            retransmit the missing segments only.

     sack.maxholes          Maximum number of SACK holes per connection.
                            Defaults to 128.

     sack.globalmaxholes    Maximum number of SACK holes per system, across
                            all connections.  Defaults to 65536.

     maxtcptw               When a TCP connection enters the TIME_WAIT state,
                            its associated socket structure is freed, since it
                            is of negligible size and use, and a new structure
                            is allocated to contain a minimal amount of
                            information necessary for sustaining a connection
                            in this state, called the compressed TCP TIME_WAIT
                            state.  Since this structure is smaller than a
                            socket structure, it can save a significant amount
                            of system memory.  The net.inet.tcp.maxtcptw MIB
                            variable controls the maximum number of these
                            structures allocated.  By default, it is
                            initialized to kern.ipc.maxsockets / 5.

     nolocaltimewait        Suppress creation of compressed TCP TIME_WAIT
                            states for connections in which both endpoints are
                            local.
     fast_finwait2_recycle  Recycle TCP FIN_WAIT_2 connections faster when the
                            socket is marked as SBS_CANTRCVMORE (no user
                            process has the socket open, data received on the
                            socket cannot be read).  The timeout used here is

     finwait2_timeout       Timeout to use for fast recycling of TCP
                            FIN_WAIT_2 connections.  Defaults to 60 seconds.

     ecn.enable             Enable support for TCP Explicit Congestion
                            Notification (ECN).  ECN allows a TCP sender to
                            reduce the transmission rate in order to avoid
                            packet drops.

     ecn.maxretries         Number of retries (SYN or SYN/ACK retransmits)
                            before disabling ECN on a specific connection.
                            This is needed to help with connection
                            establishment when a broken firewall is in the
                            network path.

                            Turn on automatic path MTU blackhole detection.
                            In case of retransmits, the OS will lower the MSS
                            to check whether it is an MTU problem.  If the
                            current MSS is greater than the configured value
                            to try, it will be set to the configured value;
                            otherwise, the MSS will be
                            set to default values (net.inet.tcp.mssdflt and

     pmtud_blackhole_mss    MSS to try for IPv4 if PMTU blackhole detection is
                            turned on.

     v6pmtud_blackhole_mss  MSS to try for IPv6 if PMTU blackhole detection is
                            turned on.

                            Number of times configured values were used in an
                            attempt to downshift.

                            Number of times default MSS was used in an attempt
                            to downshift.

                            Number of connections for which retransmits
                            continued even after MSS downshift.
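
     As a worked example of how the keepidle, keepintvl and keepcnt variables
     listed above interact, the following sketch estimates how long an
     unresponsive idle connection survives before it is dropped.  It assumes
     the connection is dropped once it has been idle for keepidle plus
     keepcnt probe intervals; the function name keepalive_drop_ms() is
     hypothetical.

```c
/*
 * Estimate the idle time, in milliseconds, after which an unresponsive
 * connection is dropped: the initial idle period plus one probe interval
 * for each unanswered probe.
 */
unsigned long
keepalive_drop_ms(unsigned long keepidle_ms, unsigned long keepintvl_ms,
    unsigned long keepcnt)
{
	return (keepidle_ms + keepcnt * keepintvl_ms);
}
```

     With the defaults given above (7200000 msec idle, 75000 msec interval,
     8 probes) this yields 7200000 + 8 * 75000 = 7800000 msec, i.e. 2 hours
     and 10 minutes.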

     A socket operation may fail with one of the following errors returned:

     [EISCONN]          when trying to establish a connection on a socket
                        which already has one;

     [ENOBUFS]          when the system runs out of memory for an internal
                        data structure;

     [ETIMEDOUT]        when a connection was dropped due to excessive
                        retransmissions;

     [ECONNRESET]       when the remote peer forces the connection to be
                        closed;

     [ECONNREFUSED]     when the remote peer actively refuses connection
                        establishment (usually because no process is listening
                        to the port);

     [EADDRINUSE]       when an attempt is made to create a socket with a port
                        which has already been allocated;

     [EADDRNOTAVAIL]    when an attempt is made to create a socket with a
                        network address for which no network interface exists;

     [EAFNOSUPPORT]     when an attempt is made to bind or connect a socket to
                        a multicast address.
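
     The errors above are reported through errno by the failing system call.
     As a minimal sketch, the hypothetical helper bind_tcp_port() below
     returns the errno value from a failed bind(2), so a caller can
     distinguish, for example, [EADDRINUSE] from [EADDRNOTAVAIL].

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

/*
 * Bind a new TCP socket to INADDR_ANY and the given port (host byte
 * order).  On success returns 0 and stores the descriptor in *sock_out;
 * on failure returns the errno value set by socket(2) or bind(2).
 */
int
bind_tcp_port(unsigned short port, int *sock_out)
{
	struct sockaddr_in sin;
	int error, s;

	if ((s = socket(AF_INET, SOCK_STREAM, 0)) == -1)
		return (errno);

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = htonl(INADDR_ANY);
	sin.sin_port = htons(port);

	if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) == -1) {
		error = errno;		/* save before close(2) can change it */
		close(s);
		return (error);
	}
	*sock_out = s;
	return (0);
}
```

     Binding a second socket to a port that is still held by the first is
     expected to return EADDRINUSE, unless address reuse options from
     setsockopt(2) have been applied.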

     getsockopt(2), socket(2), sysctl(3), blackhole(4), inet(4), intro(4),
     ip(4), mod_cc(4), siftr(4), syncache(4), setkey(8)

     V. Jacobson, R. Braden, and D. Borman, TCP Extensions for High
     Performance, RFC 1323.

     A. Heffernan, Protection of BGP Sessions via the TCP MD5 Signature
     Option, RFC 2385.

     K. Ramakrishnan, S. Floyd, and D. Black, The Addition of Explicit
     Congestion Notification (ECN) to IP, RFC 3168.

     The TCP protocol appeared in 4.2BSD.  The RFC 1323 extensions for window
     scaling and timestamps were added in 4.4BSD.  The TCP_INFO option was
     introduced in Linux 2.6 and is subject to change.

FreeBSD 11.0-PRERELEASE        October 13, 2014        FreeBSD 11.0-PRERELEASE

