KHTTP_PARSE(3)	       FreeBSD Library Functions Manual		KHTTP_PARSE(3)

NAME
     khttp_parse, khttp_parsex -- parse	a CGI instance for kcgi

LIBRARY
     library "libkcgi"

SYNOPSIS
     #include <sys/types.h>
     #include <stdarg.h>
     #include <stdint.h>
     #include <kcgi.h>

     enum kcgi_err
     khttp_parse(struct	kreq *req, const struct	kvalid *keys, size_t keysz,
	 const char *const *pages, size_t pagesz, size_t defpage);

     enum kcgi_err
     khttp_parsex(struct kreq *req, const struct kmimemap *suffixes,
	 const char *const *mimes, size_t mimesz, const	struct kvalid *keys,
	 size_t	keysz, const char *const *pages, size_t	pagesz,
	 size_t	defmime, size_t	defpage, void *arg,
	 void (*argfree)(void *arg), unsigned int debugging,
	 const struct kopts *opts);

     extern const char *const kmimetypes[KMIME__MAX];
     extern const char *const khttps[KHTTP__MAX];
     extern const char *const kschemes[KSCHEME__MAX];
     extern const char *const kmethods[KMETHOD__MAX];
     extern const struct kmimemap ksuffixmap[];
     extern const char *const ksuffixes[KMIME__MAX];

DESCRIPTION
     The khttp_parse() and khttp_parsex() functions parse and validate input
     and the HTTP environment (compression, paths, MIME	types, and so on).
     They are the central functions in the kcgi(3) library, parsing and	vali-
     dating key-value form (query string, message body,	cookie)	data and
     opaque message bodies.

     They must be matched by khttp_free(3) if and only if the return value is
     KCGI_OK.  Otherwise, resources are	internally freed.

     The collective arguments are as follows:

     arg     A pointer to private application data.  It	is not touched unless
	     argfree is	provided.

     argfree
	     Function invoked with arg by the child process starting to	parse
	     untrusted network data.  This makes sure that no unnecessary data
	     is	leaked into the	child.

     debugging
	     This bit-field enables debugging of the underlying	parse and/or
	     write routines.  It may have KREQ_DEBUG_WRITE for writes and
	     KREQ_DEBUG_READ_BODY for the pre-parsed body.  Debugging messages
	     to	kutil_info(3) consist of the process ID	followed by "-tx" or
	     "-rx" for writing or reading, a colon and space, then the logged
	     data.  A newline will flush the existing line, as will reaching
	     80 characters.  If flushed at 80 characters and not a newline, an
	     ellipsis will follow the line.  The total logged bytes will be
	     emitted at	the end	of all reads or	writes.

     defmime
	     If	no MIME	type is	specified (that	is, there's no suffix to the
	     page request), use	this index in the mimes	array.

     defpage
	     If	no page	was specified (e.g., the default landing page),	this
	     is	provided as the	requested page index.

     keys    An	optional array of input	and validation fields or NULL.

     keysz   The number	of elements in keys.

     mimesz  The number	of elements in mimes.  Also the	MIME index used	if no
	     MIME type was matched.  This differs from defmime,	which is used
	     if	there is no MIME suffix	at all.

     mimes   An	array of MIME types (e.g., "text/html"), mapped	into a MIME
	     index during MIME body parsing.  This relates both	to pages and
	     input fields with a body type.  Any array should include at least
	     text/plain, as this is the	default	content	type for MIME docu-
	     ments.

     opts    Tunable options regarding socket buffer sizes and so on.  If set
	     to	NULL, meaningful defaults are used.

     pages   An	array of recognised pathnames.	When pathnames are parsed,
	     they're matched to	indices	in this	array.

     pagesz  The number of pages in pages.  Also the page index used if the
	     requested page was not in pages.

     req     This structure is cleared and filled with input fields and	HTTP
	     context parsed from the CGI environment.  It is the main struc-
	     ture carried around in a kcgi(3) application.

     suffixes
	     Define the	MIME type (suffix) mapping.

     The first form, khttp_parse(), is for applications	using the system-
     recognised	MIME types.  This should work well enough for most applica-
     tions.  It	is equivalent to invoking the second form, khttp_parsex(), as
     follows:

	   khttp_parsex(req, ksuffixmap,
	     kmimetypes, KMIME__MAX, keys, keysz,
	     pages, pagesz, KMIME_TEXT_HTML,
	     defpage, NULL, NULL, 0, NULL);

   Types
     A struct kreq object is filled in by khttp_parse()	and khttp_parsex().
     It	consists of the	following fields:

     void *arg
	     Private application data.	This is	set during khttp_parse().

     enum kauth	auth
	     Type of "managed" HTTP authorisation performed by the web server
	     according to the AUTH_TYPE	header variable, if any.  This is
	     KAUTH_DIGEST for the AUTH_TYPE "digest", KAUTH_BASIC for "basic",
	     KAUTH_UNKNOWN for other values of AUTH_TYPE, or KAUTH_NONE	if
	     AUTH_TYPE is not set.  See	the rawauth field for raw authorisa-
	     tion requests.

     struct kpair **cookiemap
	     An	array of keysz singly linked lists of elements of the cookies
	     array.  If	cookie->key is equal to	one of the entries of keys and
	     cookie->state is KPAIR_VALID or KPAIR_UNCHECKED, the cookie is
	     added to the list cookiemap[cookie->keypos].  Empty lists are
	     NULL.  If a list contains more than one cookie, cookie->next
	     points to the next	cookie.	 For the last cookie in	a list,
	     cookie->next is NULL.

     struct kpair **cookienmap
	     Similar to	cookiemap, except that it contains the cookies where
	     cookie->state is KPAIR_INVALID.

     struct kpair *cookies
	     Key-value pairs read from request cookies found in	the
	     HTTP_COOKIE header	variable, or NULL if cookiesz is 0.  See
	     fields for	key-value pairs	from the request query string or mes-
	     sage body.

     size_t cookiesz
	     The size of the cookies array.

     struct kpair **fieldmap
	     Similar to	cookiemap, except that the lists contain elements of
	     the fields	array.

     struct kpair **fieldnmap
	     Similar to	fieldmap, except that it contains the fields where
	     field->state is KPAIR_INVALID.

     struct kpair *fields
	     Key-value pairs read from the QUERY_STRING	header variable	and
	     from the message body, or NULL if fieldsz is 0.  See cookies for
	     key-value pairs from request cookies.

     size_t fieldsz
	     The number	of elements in the fields array.

     char *fullpath
	     The full requested	path as	contained in the PATH_INFO header
	     variable.	For example, requesting
	     "https://bsd.lv/app.cgi/dir/file.html?q=v", where "app.cgi" is
	     the CGI program, this value would be /dir/file.html.  It is not
	     guaranteed	to start with a	slash and it may be an empty string.

     char *host
	     The host name received in the HTTP_HOST header variable.  When
	     using name-based virtual hosting, this is typically the virtual
	     host name specified by the	client in the HTTP request, and	it
	     should not	be confused with the canonical DNS name	of the host
	     running the web server.  For example, a request to
	     "https://bsd.lv/app.cgi/file" would have a	host of	"bsd.lv".  If
	     HTTP_HOST is not defined, host is set to "localhost".

     struct kdata *kdata
	     Internal data.  Should not	be touched.

     const struct kvalid *keys
	     Value passed to khttp_parse().

     size_t keysz
	     Value passed to khttp_parse().

     enum kmethod method
	     The KMETHOD_ACL, KMETHOD_CONNECT, KMETHOD_COPY, KMETHOD_DELETE,
	     KMETHOD_GET, KMETHOD_HEAD,	KMETHOD_LOCK, KMETHOD_MKCALENDAR,
	     KMETHOD_MKCOL, KMETHOD_MOVE, KMETHOD_OPTIONS, KMETHOD_POST,
	     KMETHOD_PROPFIND, KMETHOD_PROPPATCH, KMETHOD_PUT, KMETHOD_REPORT,
	     KMETHOD_TRACE, or KMETHOD_UNLOCK submission method	obtained from
	     the REQUEST_METHOD	header variable.  If an	unknown	method was re-
	     quested, KMETHOD__MAX is used.  If	no method was specified, the
	     default is	KMETHOD_GET.

	     Applications will usually accept only KMETHOD_GET and
	     KMETHOD_POST, so be sure to emit a	KHTTP_405 status for undesired
	     methods.

     size_t mime
	     The MIME type of the requested file as determined by its suffix
	     matched to the suffixes map passed to khttp_parsex() or the
	     default ksuffixmap if using khttp_parse().  This defaults to the
	     mimesz value passed to khttp_parsex() or the default KMIME__MAX
	     if using khttp_parse() when no suffix is specified or when the
	     suffix is specified but not known.

     size_t page
	     The page index found by looking up	pagename in the	pages array.
	     If	pagename is not	found in pages,	pagesz is used;	if pagename is
	     empty, defpage is used.

     char *pagename
	     The first component of fullpath or	an empty string	if there is
	     none.  It is compared to the elements of the pages	array to de-
	     termine which page	it corresponds to.  For	example, for a
	     fullpath of "/dir/file.html" this component corresponds to	dir.
	     For "/file.html", it's file.

     char *path
	     The middle	part of	fullpath, after	stripping pagename/ at the be-
	     ginning and .suffix at the	end, or	an empty string	if there is
	     none.  For	example, if the	fullpath is bar/baz.html, this compo-
	     nent is baz.

     char *pname
	     The script	name received in the SCRIPT_NAME header	variable.  For
	     example, for a request to a CGI program /var/www/cgi-bin/app.cgi
	     mapped by the web server from "https://bsd.lv/app.cgi/file", this
	     would be app.cgi.	This may not reflect a file system entity and
	     it	may be an empty	string.

     uint16_t port
	     The server's receiving TCP	port according to the SERVER_PORT
	     header variable, or 80 if that is not defined or an invalid num-
	     ber.

     struct khttpauth rawauth
	     The raw authorization request according to	the HTTP_AUTHORIZATION
	     header variable passed by the web server.	Some web servers, for
	     example Apache, do	not set	HTTP_AUTHORIZATION by default.

     char *remote
	     The string	form of	the client's IPv4 or IPv6 address taken	from
	     the REMOTE_ADDR header variable, or "127.0.0.1" if	that is	not
	     defined.  The address format of the string	is not checked.

     struct khead *reqmap[KREQU__MAX]
	     Mapping of	enum krequ enumeration values to reqs parsed from the
	     input stream.

     struct khead *reqs
	     List of all HTTP request headers, known via enum krequ and	not
	     known, parsed from	the input stream, or NULL if reqsz is 0.

     size_t reqsz
	     Number of request headers in reqs.

     enum kscheme scheme
	     The access	scheme according to the	HTTPS header variable, either
	     KSCHEME_HTTPS if HTTPS is set and equal to	the string "on"	or
	     KSCHEME_HTTP otherwise.

     char *suffix
	     The suffix	part of	the last component of fullpath or an empty
	     string if there is	none.  For example, if the fullpath is
	     /bar/baz.html, this component is html.  See the mime field	for
	     the MIME type parsed from the suffix.
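
     The relationship between fullpath, pagename, path, suffix, and page
     described above can be sketched in plain C.  This is only an
     illustration of the documented results, not the code used by kcgi(3):
     struct split_path, split_fullpath(), and lookup_page() are hypothetical
     names, and the fixed buffer sizes are arbitrary.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result type; kcgi stores these values in struct kreq. */
struct split_path {
	char pagename[64];
	char path[256];
	char suffix[32];
};

/* Copy at most dstsz-1 bytes of src into dst and NUL-terminate it. */
static void
copy_part(char *dst, size_t dstsz, const char *src, size_t n)
{
	if (n >= dstsz)
		n = dstsz - 1;
	memcpy(dst, src, n);
	dst[n] = '\0';
}

/* Split a PATH_INFO-style string into pagename, path, and suffix. */
static void
split_fullpath(const char *fullpath, struct split_path *out)
{
	const char *p = fullpath, *end, *dot, *slash;

	if (*p == '/')		/* the leading slash is optional */
		p++;
	end = p + strlen(p);

	/* The suffix follows the last '.' of the last component. */
	dot = strrchr(p, '.');
	if (dot != NULL && strchr(dot, '/') == NULL) {
		copy_part(out->suffix, sizeof(out->suffix),
		    dot + 1, (size_t)(end - dot - 1));
		end = dot;
	} else
		out->suffix[0] = '\0';

	/* pagename is the first component; path is whatever remains. */
	slash = memchr(p, '/', (size_t)(end - p));
	if (slash == NULL) {
		copy_part(out->pagename, sizeof(out->pagename),
		    p, (size_t)(end - p));
		out->path[0] = '\0';
	} else {
		copy_part(out->pagename, sizeof(out->pagename),
		    p, (size_t)(slash - p));
		copy_part(out->path, sizeof(out->path),
		    slash + 1, (size_t)(end - slash - 1));
	}
}

/* Page index: defpage if empty, pagesz if unknown, else the position. */
static size_t
lookup_page(const char *pagename, const char *const *pages,
    size_t pagesz, size_t defpage)
{
	size_t i;

	if (pagename[0] == '\0')
		return defpage;
	for (i = 0; i < pagesz; i++)
		if (strcmp(pages[i], pagename) == 0)
			return i;
	return pagesz;
}
```

     With this sketch, "/dir/file.html" splits into the pagename dir, the
     path file, and the suffix html, matching the examples given for those
     fields.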

     The application may optionally define keys	provided to khttp_parse() and
     khttp_parsex() as an array	of struct kvalid.  This	structure is central
     to	the validation of input	data.  It consists of the following fields:

     const char	*name
	     The field name, i.e., how it appears in the HTML form input name.
	     This cannot be NULL.  If the field	name is	an empty string	and
	     the HTTP message consists of an opaque body (and not key-value
	     pairs), then that field will be used to validate the HTTP message
	     body.  This is useful for KMETHOD_PUT style requests.

     int (*)(struct kpair *) valid
	     A validation function returning non-zero if parsing and valida-
	     tion succeed or 0 otherwise.  If it is NULL, then no validation
	     is	performed, the data is considered as valid, and	it is bucketed
	     into cookiemap or fieldmap	as such.

	     User-defined valid	functions usually set the type and parsed
	     fields in the key-value pair.  When working with binary data or
	     with a key	that can take different	data types, it is acceptable
	     for a validation function to set the type to KPAIR__MAX and for
	     the application to	ignore the parsed field	and to work directly
	     with val and valsz.

	     The validation function is	allowed	to allocate new	memory for
	     val: if the val pointer changes during validation,	the memory
	     pointed to	after validation will be freed with free(3) after the
	     data is passed out	of the sandbox.

	     These functions are invoked from within a system-specific sandbox
	     that may not allow	some system calls, for example opening files
	     or	sockets.  In other words, validation functions should only do
	     pure computation.
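
     A validation function in the spirit of the valid field might look like
     the following sketch.  To stay self-contained it uses a stand-in
     structure, struct sketch_pair, that mirrors only the val, valsz, type,
     and parsed members described in this manual; a real application would
     include <kcgi.h> and accept a struct kpair * instead.  The function
     name valid_age() and the accepted range are assumptions made for the
     example.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins for enum kpairtype and struct kpair (illustration only). */
enum sketch_type { SKETCH_INTEGER, SKETCH_TYPE_MAX };

struct sketch_pair {
	char *val;		/* input bytes; val[valsz] is NUL */
	size_t valsz;		/* length of val in bytes */
	enum sketch_type type;	/* set by the validator */
	int64_t parsed_i;	/* stands in for parsed.i */
};

/*
 * Return non-zero if val is a decimal integer in [1, 150], 0 otherwise.
 * Pure computation only: no files, sockets, or other system resources.
 */
static int
valid_age(struct sketch_pair *p)
{
	char *ep;
	long long v;

	/* Reject empty input and input with embedded NUL bytes. */
	if (p->valsz == 0 || strlen(p->val) != p->valsz)
		return 0;

	errno = 0;
	v = strtoll(p->val, &ep, 10);
	if (errno != 0 || ep == p->val || *ep != '\0')
		return 0;
	if (v < 1 || v > 150)
		return 0;

	/* On success, record the type and the parsed value. */
	p->type = SKETCH_INTEGER;
	p->parsed_i = v;
	return 1;
}
```

     The check against valsz matters because val may contain NUL bytes, so
     the string length alone does not prove the whole value was examined.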

     The struct	kpair structure	presents the user with fields parsed from in-
     put and (possibly)	matched	to the keys variable passed to khttp_parse()
     and khttp_parsex().  It is	also passed to the validation function to be
     filled in.	 In this case, the MIME-related	fields are already filled in
     and may be	examined to determine the method of validation.	 This is use-
     ful when validating opaque	message	bodies.

     char *ctype
	     The value's MIME content type (e.g., image/jpeg), or an empty
	     string if not defined.

     size_t ctypepos
	     If ctype is not NULL, it is looked up in the mimes parameter
	     passed to khttp_parsex() or kmimetypes if using khttp_parse().
	     If found, it is set to the appropriate index.  Otherwise, it's
	     mimesz.

     char *file
	     The value's MIME source filename or an empty string if not	de-
	     fined.

     char *key
	     The NUL-terminated	key (input) name.  If the HTTP message body is
	     opaque (e.g., KMETHOD_PUT), then an empty-string key is cooked
	     up.  The key may contain an arbitrary sequence of non-NUL bytes,
	     even non-ASCII bytes, control characters, and shell metacharac-
	     ters.

     size_t keypos
	     If	found in the keys array	passed to khttp_parse(), the index of
	     the matching key.	Otherwise keysz.

     struct kpair *next
	     In	a cookie or field map, next points to the next parsed key-
	     value pair	with the same key name.	 This occurs most often	in
	     HTML checkbox forms, where	many fields may	have the same name.

     union parsed parsed
	     The parsed, validated value.  This may be i, a 64-bit signed
	     integer; s, a NUL-terminated character string; or d, a
	     double-precision floating-point number.  This is intentionally
	     basic because the resulting data must be reliably passed from
	     the parsing context back into the web application.

     enum kpairstate state
	     The validation state: KPAIR_VALID if the pair was successfully
	     validated by a validation function, KPAIR_INVALID if a validation
	     function was invoked but failed, or KPAIR_UNCHECKED if no valida-
	     tion function is defined for this key.

     enum kpairtype type
	     If parsed, the type of data in parsed, otherwise KPAIR__MAX.

     char *val
	     The (input) value,	which may contain an arbitrary sequence	of
	     bytes, even NUL bytes, non-ASCII bytes, control characters, and
	     shell metacharacters.  The	byte following the end of the array,
	     val[valsz], is always guaranteed to be NUL.  The validation func-
	     tion may modify the contents.  For	example, for integer numbers
	     and e-mail addresses, trailing whitespace may be replaced with NUL
	     bytes.

     size_t valsz
	     The length	of the val buffer in bytes.  It	is not a string
	     length.

     char *xcode
	     The value's MIME content transfer encoding	(e.g., base64),	or an
	     empty string if not defined.
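
     The bucketing of parsed pairs into cookiemap, cookienmap, fieldmap, and
     fieldnmap by keypos and state, and the chaining of same-named pairs
     through next, can be sketched as follows.  The types and functions here
     (struct sketch_kv, bucket_pairs(), bucket_len()) are hypothetical
     stand-ins, not part of kcgi(3).  The order of pairs within a list is
     not specified above; this sketch happens to preserve input order.

```c
#include <stddef.h>

/* Stand-ins for enum kpairstate and struct kpair (illustration only). */
enum sketch_state { SKETCH_VALID, SKETCH_INVALID, SKETCH_UNCHECKED };

struct sketch_kv {
	const char *key;
	const char *val;
	size_t keypos;		/* index into keys, or keysz if unknown */
	enum sketch_state state;
	struct sketch_kv *next;	/* next pair with the same key */
};

/*
 * Fill map (valid or unchecked pairs) and nmap (invalid pairs), each an
 * array of keysz list heads.  Pairs whose key is not in keys are skipped.
 */
static void
bucket_pairs(struct sketch_kv *pairs, size_t pairsz,
    struct sketch_kv **map, struct sketch_kv **nmap, size_t keysz)
{
	struct sketch_kv **head;
	size_t i;

	for (i = 0; i < keysz; i++)
		map[i] = nmap[i] = NULL;	/* empty lists are NULL */

	/* Walk backwards and prepend, so lists end up in input order. */
	for (i = pairsz; i-- > 0; ) {
		pairs[i].next = NULL;
		if (pairs[i].keypos >= keysz)
			continue;
		head = pairs[i].state == SKETCH_INVALID ?
		    &nmap[pairs[i].keypos] : &map[pairs[i].keypos];
		pairs[i].next = *head;
		*head = &pairs[i];
	}
}

/* Count the pairs in one bucket, as for a multi-valued checkbox field. */
static size_t
bucket_len(const struct sketch_kv *p)
{
	size_t n = 0;

	for (; p != NULL; p = p->next)
		n++;
	return n;
}
```

     An application reads a bucket the same way bucket_len() does: start at
     the list head for the key index of interest and follow next until it is
     NULL.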

     The struct	khttpauth structure holds authorisation	data if	passed by the
     server.  The specific fields are as follows.

     enum kauth	type
	     If	no data	was passed by the server, the type value is
	     KAUTH_NONE.  Otherwise it's KAUTH_BASIC or	KAUTH_DIGEST, with
	     KAUTH_UNKNOWN if the authorisation	type was not recognised.

     int authorised
	     For KAUTH_BASIC or	KAUTH_DIGEST authorisation, this field indi-
	     cates whether all required	values were specified.

     char *digest
	     An	MD5 digest of REQUEST_METHOD, SCRIPT_NAME, PATH_INFO, header
	     variables and the request body.  It is not	a NUL-terminated
	     string, but an array of exactly MD5_DIGEST_LENGTH bytes.  Only
	     filled in when HTTP_AUTHORIZATION is "digest" and authorised is
	     non-zero.	Otherwise, it remains NULL.  Used in
	     khttpdigest_validatehash(3).

     d	     An	anonymous union	containing parsed fields per type: struct
	     khttpbasic	basic for KAUTH_BASIC or struct	khttpdigest digest for
	     KAUTH_DIGEST.

     If	the field for an HTTP authorisation request is KAUTH_BASIC, it will
     consist of	the following for its parsed entities in its struct khttpbasic
     structure:

     response
	     The hashed	and encoded response string.

     If	the field for an HTTP authorisation request is KAUTH_DIGEST, it	will
     consist of	the following in its struct khttpdigest	structure:

     alg     The encoding algorithm, parsed from the possible MD5 or MD5-Sess
	     values.

     qop     The quality of protection algorithm, which	may be unspecified,
	     Auth or Auth-Int.

     user    The user coordinating the request.

     uri     The URI for which the request is designated.  (This must match
	     the request URI).

     realm   The request realm.

     nonce   The server-generated nonce	value.

     cnonce  The (optional) client-generated nonce value.

     response
	     The hashed and encoded response string, which entangles fields
	     depending on algorithm and quality of protection.

     count   The (optional) cnonce counter.

     opaque  The (optional) opaque string requested by the server.

     The struct	kopts structure	consists of tunables for network performance.
     You probably don't	want to	use these unless you really know what you're
     doing!

     sndbufsz
	     The size of the output buffer.  The output	buffer is a heap-allo-
	     cated region into which writes (via khttp_write(3)	and
	     khttp_head(3)) are	buffered instead of being flushed directly to
	     the wire.	The buffer is flushed when it is full, when the	HTTP
	     headers are flushed, and when khttp_free(3) is invoked.  If the
	     buffer size is zero, writes are flushed immediately to the	wire.
	     If	the buffer size	is less	than zero, it is filled	with a mean-
	     ingful default.
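
     The buffering behaviour described for sndbufsz can be pictured with the
     following sketch.  It is not the kcgi(3) implementation: struct
     sketch_out, sketch_write(), and sketch_flush() are hypothetical, the
     "wire" is reduced to two counters, and the choice to flush exactly when
     the buffer becomes full is an assumption of the example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical output state; bufsz must not exceed sizeof(buf). */
struct sketch_out {
	char buf[16];
	size_t bufsz;	/* usable buffer size; 0 means unbuffered */
	size_t used;	/* bytes currently buffered */
	size_t flushes;	/* number of writes that reached the wire */
	size_t wired;	/* total bytes that reached the wire */
};

/* Push any buffered bytes to the (simulated) wire. */
static void
sketch_flush(struct sketch_out *o)
{
	if (o->used == 0)
		return;
	o->wired += o->used;
	o->flushes++;
	o->used = 0;
}

/* Buffer sz bytes of data, flushing whenever the buffer fills up. */
static void
sketch_write(struct sketch_out *o, const char *data, size_t sz)
{
	size_t n;

	if (o->bufsz == 0) {
		/* A zero-sized buffer sends every write immediately. */
		if (sz > 0) {
			o->wired += sz;
			o->flushes++;
		}
		return;
	}
	while (sz > 0) {
		n = o->bufsz - o->used;
		if (n > sz)
			n = sz;
		memcpy(o->buf + o->used, data, n);
		o->used += n;
		data += n;
		sz -= n;
		if (o->used == o->bufsz)
			sketch_flush(o);
	}
}
```

     A final sketch_flush() plays the role that khttp_free(3) has in the
     description above: it sends whatever is still buffered.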

     Lastly, the struct	khead structure	holds parsed HTTP headers.

     key     Holds the HTTP header name.  This is not the CGI header name
	     (e.g., HTTP_COOKIE), but the reconstituted HTTP name (e.g.,
	     Cookie).

     val     The opaque	header value, which may	be an empty string.

   Variables
     A number of variables are defined in <kcgi.h> to simplify invocations of
     the khttp_parse() family.  Applications are strongly encouraged to use
     these variables (and associated enumerations) in khttp_parse() instead of
     overriding them with hand-rolled sets in khttp_parsex().

     kmimetypes
	     Indexed list of common MIME types, for example, "text/html" and
	     "application/json".  Corresponds to enum kmime.

     khttps  Indexed list of HTTP status codes and identifiers, for example,
	     "200 OK".  Corresponds to enum khttp.

     kschemes
	     Indexed list of URL schemes, for example, "https" or "ftp".  Cor-
	     responds to enum kscheme.

     kmethods
	     Indexed list of HTTP methods, for example,	"GET" and "POST".
	     Corresponds to enum kmethod.

     ksuffixmap
	     Map of MIME types defined in enum kmime to	possible suffixes.
	     This array	is terminated with a MIME type of KMIME__MAX and name
	     NULL.

     ksuffixes
	     Indexed list of canonical suffixes	for MIME types corresponding
	     to	enum kmime.  This may be a NULL	pointer	for types that have no
	     canonical suffix, for example, "application/octet-stream".
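
     A suffix map of the shape described for ksuffixmap, terminated by an
     entry whose name is NULL, can be searched as in the sketch below.  The
     structure, table, and function names are stand-ins invented for the
     example, the three MIME indices are arbitrary, and the comparison is
     case-sensitive purely for simplicity.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_MIME_MAX 3	/* stand-in for KMIME__MAX */

/* Stand-in for struct kmimemap: one suffix mapped to one MIME index. */
struct sketch_mimemap {
	const char *name;	/* suffix, or NULL in the terminator */
	size_t mime;		/* MIME index for this suffix */
};

/* Several suffixes may map to the same MIME index. */
static const struct sketch_mimemap sketch_suffixmap[] = {
	{ "html", 0 },
	{ "htm", 0 },
	{ "json", 1 },
	{ "txt", 2 },
	{ NULL, SKETCH_MIME_MAX }	/* terminator */
};

/* Return the MIME index for suffix, or mimesz if it is not in the map. */
static size_t
suffix_to_mime(const struct sketch_mimemap *map, const char *suffix,
    size_t mimesz)
{
	for (; map->name != NULL; map++)
		if (strcmp(map->name, suffix) == 0)
			return map->mime;
	return mimesz;
}
```

     Because the map is terminated by a sentinel, no separate length
     argument is needed; mimesz is passed only as the "not matched" result.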

RETURN VALUES
     khttp_parse() and khttp_parsex() return an	error code:

     KCGI_OK
	  Success (not an error).

     KCGI_ENOMEM
	  Memory failure.  This	can occur in many places: spawning a child,
	  allocating memory, creating sockets, etc.

     KCGI_ENFILE
	  Could	not allocate file descriptors.

     KCGI_EAGAIN
	  Could	not spawn a child.

     KCGI_FORM
	  Malformed data between parent	and child whilst parsing an HTTP re-
	  quest.  (Internal system error.)

     KCGI_SYSTEM
	  Opaque operating system error.

     On	failure, the calling application should	terminate as soon as possible.
     Applications should not try to write an HTTP 505 error or similar,	but
     allow the web server to handle the	empty CGI response on its own.

SEE ALSO
     kcgi(3), khttp_free(3)

AUTHORS
     The khttp_parse() and khttp_parsex() functions were written by Kristaps
     Dzonsons <kristaps@bsd.lv>.

FreeBSD	13.0			 July 21, 2020			  FreeBSD 13.0
