TCP(4)                 FreeBSD Kernel Interfaces Manual                 TCP(4)

     tcp - Internet Transmission Control Protocol

     #include <sys/types.h>
     #include <sys/socket.h>
     #include <netinet/in.h>

     socket(AF_INET, SOCK_STREAM, 0);

     The TCP protocol provides reliable, flow-controlled, two-way transmission
     of data.  It is a byte-stream protocol used to support the SOCK_STREAM
     abstraction.  TCP uses the standard Internet address format and, in
     addition, provides a per-host collection of ``port addresses''.  Thus,
     each address is composed of an Internet address specifying the host and
     network, with a specific TCP port on the host identifying the peer
     entity.

     Sockets utilizing the tcp protocol are either ``active'' or ``passive''.
     Active sockets initiate connections to passive sockets.  By default TCP
     sockets are created active; to create a passive socket the listen(2)
     system call must be used after binding the socket with the bind(2) system
     call.  Only passive sockets may use the accept(2) call to accept incoming
     connections.  Only active sockets may use the connect(2) call to initiate
     connections.  TCP also supports a more datagram-like mode, called
     Transaction TCP, which is described in ttcp(4).
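
     The active and passive roles described above can be sketched as
     follows.  This is a minimal illustration, not part of the interface:
     the helper names make_passive() and make_active() are hypothetical,
     and the loopback address and the backlog of 5 are assumptions chosen
     for the example.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a passive socket on 127.0.0.1.
 * Port 0 asks the system to choose a port; the chosen port is
 * returned in *port (host byte order).  Returns the descriptor or -1.
 */
int
make_passive(unsigned short *port)
{
	struct sockaddr_in sin;
	socklen_t len = sizeof(sin);
	int s = socket(AF_INET, SOCK_STREAM, 0);

	if (s == -1)
		return (-1);
	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	sin.sin_port = htons(0);
	/* bind(2) first, then listen(2) turns the socket passive. */
	if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) == -1 ||
	    listen(s, 5) == -1 ||
	    getsockname(s, (struct sockaddr *)&sin, &len) == -1) {
		close(s);
		return (-1);
	}
	*port = ntohs(sin.sin_port);
	return (s);
}

/*
 * Hypothetical helper: create an active socket and connect(2) it to
 * 127.0.0.1 on the given port.  Returns the descriptor or -1.
 */
int
make_active(unsigned short port)
{
	struct sockaddr_in sin;
	int s = socket(AF_INET, SOCK_STREAM, 0);

	if (s == -1)
		return (-1);
	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	sin.sin_port = htons(port);
	if (connect(s, (struct sockaddr *)&sin, sizeof(sin)) == -1) {
		close(s);
		return (-1);
	}
	return (s);
}
```

     A server would call make_passive() once and then accept(2) on the
     returned descriptor; each client would call make_active() with the
     port the server obtained.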

     Passive sockets may ``underspecify'' their location to match incoming
     connection requests from multiple networks.  This technique, termed
     ``wildcard addressing'', allows a single server to provide service to
     clients on multiple networks.  To create a socket which listens on all
     networks, the Internet address INADDR_ANY must be bound.  The TCP port
     may still be specified at this time; if the port is not specified the
     system will assign one.  Once a connection has been established the
     socket's address is fixed by the peer entity's location.  The address
     assigned to the socket is the address associated with the network
     interface through which packets are being transmitted and received.
     Normally this address corresponds to the peer entity's network.
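
     Wildcard addressing can be illustrated with the sketch below.  The
     helper names wildcard_listener() and local_address() are hypothetical
     and exist only for this example.  The listener is bound to INADDR_ANY
     with port 0, so the system assigns the port; local_address() uses
     getsockname(2) to show which address a socket currently has.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: listen on all networks (INADDR_ANY) and let the
 * system assign the port.  The port is returned in *port (host byte
 * order).  Returns the listening descriptor or -1.
 */
int
wildcard_listener(unsigned short *port)
{
	struct sockaddr_in sin;
	socklen_t len = sizeof(sin);
	int s = socket(AF_INET, SOCK_STREAM, 0);

	if (s == -1)
		return (-1);
	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = htonl(INADDR_ANY);	/* underspecified address */
	sin.sin_port = htons(0);			/* system picks the port */
	if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) == -1 ||
	    listen(s, 5) == -1 ||
	    getsockname(s, (struct sockaddr *)&sin, &len) == -1) {
		close(s);
		return (-1);
	}
	*port = ntohs(sin.sin_port);
	return (s);
}

/*
 * Hypothetical helper: fetch the local Internet address of a socket.
 * Returns 0 on success, -1 on error.
 */
int
local_address(int s, struct in_addr *addr)
{
	struct sockaddr_in sin;
	socklen_t len = sizeof(sin);

	if (getsockname(s, (struct sockaddr *)&sin, &len) == -1)
		return (-1);
	*addr = sin.sin_addr;
	return (0);
}
```

     On the listening socket local_address() reports INADDR_ANY.  On a
     descriptor returned by accept(2) it reports the specific address the
     peer connected to, which is the sense in which the address is fixed
     once the connection is established.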

     TCP supports a number of socket options which can be set with
     setsockopt(2) and tested with getsockopt(2):

     TCP_NODELAY   Under most circumstances, TCP sends data when it is
                   presented; when outstanding data has not yet been
                   acknowledged, it gathers small amounts of output to be sent
                   in a single packet once an acknowledgement is received.
                   For a small number of clients, such as window systems that
                   send a stream of mouse events which receive no replies,
                   this packetization may cause significant delays.  The
                   boolean option TCP_NODELAY defeats this algorithm.

     TCP_MAXSEG    By default, a sender- and receiver-TCP will negotiate among
                   themselves to determine the maximum segment size to be used
                   for each connection.  The TCP_MAXSEG option allows the user
                   to determine the result of this negotiation, and to reduce
                   it if desired.

     TCP_NOOPT     TCP usually sends a number of options in each packet,
                   corresponding to various TCP extensions which are provided
                   in this implementation.  The boolean option TCP_NOOPT is
                   provided to disable TCP option use on a per-connection
                   basis.

     TCP_NOPUSH    By convention, the sender-TCP will set the ``push'' bit and
                   begin transmission immediately (if permitted) at the end of
                   every user call to write(2) or writev(2).  The TCP_NOPUSH
                   option is provided to allow servers to easily make use of
                   Transaction TCP (see ttcp(4)).  When the option is set to a
                   non-zero value, TCP will delay sending any data at all
                   until either the socket is closed, or the internal send
                   buffer is filled.

     The option level for the setsockopt(2) call is the protocol number for
     TCP, available from getprotobyname(3), or IPPROTO_TCP.  All options are
     declared in <netinet/tcp.h>.
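
     As an illustration of the option level and the boolean option
     convention, the sketch below sets and reads TCP_NODELAY and reads
     TCP_MAXSEG using IPPROTO_TCP as the level.  The wrapper names
     set_nodelay(), get_nodelay() and get_maxseg() are hypothetical; only
     setsockopt(2), getsockopt(2) and the option names come from this
     page.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* Hypothetical wrapper: enable (on != 0) or disable TCP_NODELAY. */
int
set_nodelay(int s, int on)
{
	return (setsockopt(s, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on)));
}

/* Hypothetical wrapper: returns 1 if TCP_NODELAY is set, 0 if not, -1 on error. */
int
get_nodelay(int s)
{
	int value = 0;
	socklen_t len = sizeof(value);

	if (getsockopt(s, IPPROTO_TCP, TCP_NODELAY, &value, &len) == -1)
		return (-1);
	return (value != 0);
}

/* Hypothetical wrapper: returns the current maximum segment size, or -1 on error. */
int
get_maxseg(int s)
{
	int mss = 0;
	socklen_t len = sizeof(mss);

	if (getsockopt(s, IPPROTO_TCP, TCP_MAXSEG, &mss, &len) == -1)
		return (-1);
	return (mss);
}
```

     The value reported for TCP_MAXSEG before a connection exists is a
     system default and may differ between systems.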

     Options at the IP transport level may be used with TCP; see ip(4).
     Incoming connection requests that are source-routed are noted, and the
     reverse source route is used in responding.

     The tcp protocol implements a number of variables in the net.inet branch
     of the sysctl(3) MIB.

     TCPCTL_DO_RFC1323  (tcp.rfc1323) Implement the window scaling and
                        timestamp options of RFC 1323 (default true).

     TCPCTL_DO_RFC1644  (tcp.rfc1644) Implement Transaction TCP, as described
                        in RFC 1644.

     TCPCTL_MSSDFLT     (tcp.mssdflt) The default value used for the maximum
                        segment size (``MSS'') when no advice to the contrary
                        is received from MSS negotiation.

     TCPCTL_SENDSPACE   (tcp.sendspace) Maximum TCP send window.

     TCPCTL_RECVSPACE   (tcp.recvspace) Maximum TCP receive window.

     tcp.log_in_vain    Log any connection attempts to ports where there is
                        not a socket accepting connections.  The value of 1
                        limits the logging to SYN (connection establishment)
                        packets only.  That of 2 results in any TCP packets to
                        closed ports being logged.  Any value unlisted above
                        disables the logging (default is 0, i.e., the logging
                        is disabled).

                        The number of packets allowed to be in-flight during
                        the TCP slow-start phase on a non-local network.

                        The number of packets allowed to be in-flight during
                        the TCP slow-start phase to local machines in the same
                        subnet.

     tcp.msl            The Maximum Segment Lifetime, in milliseconds, for a
                        packet.

     tcp.keepinit       Timeout, in milliseconds, for new, non-established TCP
                        connections.

     tcp.keepidle       Amount of time, in milliseconds, that the connection
                        must be idle before keepalive probes (if enabled) are
                        sent.

     tcp.keepintvl      The interval, in milliseconds, between keepalive
                        probes sent to remote machines.  After TCPTV_KEEPCNT
                        (default 8) probes are sent, with no response, the
                        connection is dropped.

                        Assume that SO_KEEPALIVE is set on all TCP
                        connections; the kernel will periodically send a
                        packet to the remote host to verify the connection is
                        still up.

     tcp.icmp_may_rst   Certain ICMP unreachable messages may abort
                        connections in SYN-SENT state.

     tcp.do_tcpdrain    Flush packets in the TCP reassembly queue if the
                        system is low on mbufs.

     tcp.blackhole      If enabled, disable sending of RST when a connection
                        is attempted to a port where there is not a socket
                        accepting connections.  See blackhole(4).

     tcp.delayed_ack    Delay ACK to try to piggyback it onto a data packet.

     tcp.delacktime     Maximum amount of time, in milliseconds, before a
                        delayed ACK is sent.

     tcp.newreno        Enable TCP NewReno Fast Recovery algorithm, as
                        described in RFC 2582.

                        Enable Path MTU Discovery.

     tcp.tcbhashsize    Size of the TCP control-block hashtable (read-only).
                        This may be tuned using the kernel option TCBHASHSIZE
                        or by setting net.inet.tcp.tcbhashsize in the
                        loader(8).

     tcp.pcbcount       Number of active process control blocks (read-only).

     tcp.syncookies     Determines whether or not syn cookies should be
                        generated for outbound syn-ack packets.  Syn cookies
                        are a great help during syn flood attacks, and are
                        enabled by default.

                        The interval (in seconds) specifying how often the
                        secret data used in RFC 1948 initial sequence number
                        calculations should be reseeded.  By default, this
                        variable is set to zero, indicating that no reseeding
                        will occur.  Reseeding should not be necessary, and
                        will break TIME_WAIT recycling for a few minutes.

                        Adjust the retransmit timer calculation for TCP.  The
                        slop is typically added to the raw calculation to take
                        into account occasional variances that the SRTT
                        (smoothed round trip time) is unable to accommodate,
                        while the minimum specifies an absolute minimum.
                        While a number of TCP RFCs suggest a 1 second minimum
                        these RFCs tend to focus on streaming behavior and
                        fail to deal with the fact that a 1 second minimum has
                        severe detrimental effects over lossy interactive
                        connections, such as an 802.11b wireless link, and over
                        very fast but lossy connections for those cases not
                        covered by the fast retransmit code.  For this reason
                        we use 200ms of slop and a near-0 minimum, which gives
                        us an effective minimum of 200ms (similar to Linux).

                        Enable TCP bandwidth delay product limiting.  An
                        attempt will be made to calculate the bandwidth delay
                        product for each individual TCP connection and limit
                        the amount of inflight data being transmitted to avoid
                        building up unnecessary packets in the network.  This
                        option is recommended if you are serving a lot of data
                        over connections with high bandwidth-delay products,
                        such as modems, GigE links, and fast long-haul WANs,
                        and/or you have configured your machine to accommodate
                        large TCP windows.  In such situations, without this
                        option, you may experience high interactive latencies
                        or packet loss due to the overloading of intermediate
                        routers and switches.  Note that bandwidth delay
                        product limiting only affects the transmit side of a
                        TCP connection.

                        Enable debugging for the bandwidth delay product
                        algorithm.  This may default to on (1) so if you
                        enable the algorithm you should probably also disable
                        debugging by setting this variable to 0.

     tcp.inflight_min   This puts a lower bound on the bandwidth delay
                        product window, in bytes.  A value of 1024 is
                        typically used for debugging.  6000-16000 is more
                        typical in a production installation.  Setting this
                        value too low may result in slow ramp-up times for
                        bursty connections.  Setting this value too high
                        effectively disables the algorithm.

     tcp.inflight_max   This puts an upper bound on the bandwidth delay
                        product window, in bytes.  This value should not
                        generally be modified but may be used to set a global
                        per-connection limit on queued data, potentially
                        allowing you to intentionally set a less than optimum
                        limit to smooth data flow over a network while still
                        being able to specify huge internal TCP buffers.
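
     The variables above can be read from a program by name.  The sketch
     below is one possible approach, under the assumption that the system
     provides sysctlbyname(3) (as FreeBSD does); the wrapper name
     read_sysctl_int() is hypothetical, and on other systems the wrapper
     simply reports failure.  It also assumes the variable is
     integer-sized, which may not hold for every entry in the list.

```c
#include <sys/types.h>
#include <stddef.h>
#if defined(__FreeBSD__) || defined(__APPLE__)
#include <sys/sysctl.h>
#endif

/*
 * Hypothetical wrapper: read an integer-valued MIB variable by its full
 * name, for example "net.inet.tcp.rfc1323".  Returns 0 on success with
 * the value stored in *value, or -1 on error or on systems where
 * sysctlbyname(3) is assumed to be unavailable.
 */
int
read_sysctl_int(const char *name, int *value)
{
#if defined(__FreeBSD__) || defined(__APPLE__)
	size_t len = sizeof(*value);

	/* No new value is supplied, so this call only reads. */
	return (sysctlbyname(name, value, &len, NULL, 0));
#else
	(void)name;
	(void)value;
	return (-1);
#endif
}
```

     From the shell, the same variables can be inspected with sysctl(8),
     for example ``sysctl net.inet.tcp.rfc1323''.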

     A socket operation may fail with one of the following errors returned:

     [EISCONN]          when trying to establish a connection on a socket
                        which already has one;

     [ENOBUFS]          when the system runs out of memory for an internal
                        data structure;

     [ETIMEDOUT]        when a connection was dropped due to excessive
                        retransmissions;

     [ECONNRESET]       when the remote peer forces the connection to be
                        closed;

     [ECONNREFUSED]     when the remote peer actively refuses connection
                        establishment (usually because no process is listening
                        to the port);

     [EADDRINUSE]       when an attempt is made to create a socket with a port
                        which has already been allocated;

     [EADDRNOTAVAIL]    when an attempt is made to create a socket with a
                        network address for which no network interface exists;

     [EAFNOSUPPORT]     when an attempt is made to bind or connect a socket to
                        a multicast address.
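
     Two of the errors above are easy to provoke on the loopback
     interface, as the sketch below shows.  The helper names
     bind_loopback() and connect_errno() are hypothetical.  The example
     assumes that tcp.blackhole is disabled, so that a connection attempt
     to a port with no listener is actively refused.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: bind a new TCP socket to 127.0.0.1 on a
 * system-chosen port and, if do_listen is non-zero, make it passive.
 * The port is returned in *port (host byte order).
 * Returns the descriptor or -1.
 */
int
bind_loopback(int do_listen, unsigned short *port)
{
	struct sockaddr_in sin;
	socklen_t len = sizeof(sin);
	int s = socket(AF_INET, SOCK_STREAM, 0);

	if (s == -1)
		return (-1);
	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) == -1 ||
	    (do_listen && listen(s, 5) == -1) ||
	    getsockname(s, (struct sockaddr *)&sin, &len) == -1) {
		close(s);
		return (-1);
	}
	*port = ntohs(sin.sin_port);
	return (s);
}

/*
 * Hypothetical helper: connect(2) socket s to 127.0.0.1 on the given
 * port.  Returns 0 on success, otherwise the errno value from connect(2).
 */
int
connect_errno(int s, unsigned short port)
{
	struct sockaddr_in sin;

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	sin.sin_port = htons(port);
	if (connect(s, (struct sockaddr *)&sin, sizeof(sin)) == -1)
		return (errno);
	return (0);
}
```

     Calling connect_errno() against a port that is bound but not
     listening is expected to yield ECONNREFUSED; calling it a second time
     on a socket that is already connected is expected to yield EISCONN.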

     getsockopt(2), socket(2), sysctl(3), blackhole(4), inet(4), intro(4),
     ip(4), ttcp(4)

     V. Jacobson, R. Braden, and D. Borman, TCP Extensions for High
     Performance, RFC 1323.

     R. Braden, T/TCP - TCP Extensions for Transactions, RFC 1644.

     The tcp protocol appeared in 4.2BSD.  The RFC 1323 extensions for window
     scaling and timestamps were added in 4.4BSD.

FreeBSD 11.0-PRERELEASE        February 14, 1995       FreeBSD 11.0-PRERELEASE

