Sah::FAQ(3)	      User Contributed Perl Documentation	   Sah::FAQ(3)

       Sah::FAQ	- Frequently asked questions

       This document describes version 0.9.49 of Sah::FAQ (from	Perl
       distribution Sah), released on 2020-02-11.

       Why use a schema	(a.k.a "Turing tarpit")? Why not use pure Perl?

       A schema language is a specialized language (DSL) that should be more
       concise to write than equivalent Perl code for common validation tasks.
       Its goal is never to be as powerful as Perl.

       90% of the time, my schemas are variations of these simple cases:

	["str",   {"len_between": [1, 10], "match": "some regex"}]
	["str",   {"in": ["a", "b", "c", ...]}]
	["array", {"of": "some_other_type"}]
	["hash",  {"keys": {"key1": "some schema", ...}, "req_keys": [...], ...}]

       and writing schemas is faster and less tedious/error-prone than writing
       equivalent Perl code, plus Data::Sah can	generate JavaScript code and
       human description text for me. For more complex validation I stay with
       Sah until it starts to get unwieldy. It usually can go pretty far since
       I can add functions and custom clauses to its types; it's for the very
       complex and dynamic validation needs that I go pure Perl. Your mileage
       may vary.

       What does "Sah" mean?

       Sah is an Indonesian word meaning "valid" or "legal". It was picked
       because it is short.

       The previous incarnation of this module used the namespace
       Data::Schema, started in 2009 and deprecated in 2011 in favor of "Sah".

   Comparison to other schema languages	and type systems
       Comparison to JSON schema?

       o   JSON	schema limits its type system to that supported	by

       o   JSON	schema's syntax	is simpler.

	   Its metaschema (schema for the schema) is only about	130 lines.
	   There are no	shortcut forms.

       o   JSON	schema's features are more limited.

	   No expressions, no functions.

       Comparison to Data::Rx?


       Comparison to Data::FormValidator (DFV)?


       Comparison to Moose types?


       Why is "req" not enabled by default?

       I am following SQL's behavior. A	type declaration like:


       in SQL means "NULL" is allowed, while:


       means "NULL" is not allowed. The	above is equivalent to specifying this
       in Sah:


       One could argue that setting "req" to 1 by default is safer or more
       convenient, and that "int" should mean "["int", "req", 1]" while
       something like "int?" should mean "["int", "req", 0]". But this is
       simply a design choice, and each option has its pros and cons.
       Nullable by default can also be convenient in some cases, such as
       when specifying program options where most of the options are
       optional.
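
       The nullable-by-default rule can be modeled in a few lines. The
       following is a minimal Python sketch, not Data::Sah's implementation:
       the function name check_int and its req parameter are made up for
       illustration, and Python's None stands in for Perl's undef.

```python
def check_int(value, req=False):
    """Model how "int" (req=False) and "int*" (req=True) treat undef."""
    if value is None:
        # An undefined value passes unless the "req" clause is set.
        return not req
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


print(check_int(None))            # plain "int": undef is allowed
print(check_int(None, req=True))  # "int*": undef is rejected
```

       With req off by default, the shortest spelling ("int") is the
       permissive one, which mirrors the SQL behavior described above.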

       How about adding	a "default_req"	configuration in "Data::Sah" then?

       In general I am against compiler configuration that changes language
       behavior (think PHP's "register_globals" or "magic_quotes_*" settings).
       In this case, it would give a simple schema like "int" an ambiguous
       meaning (is an undefined value allowed or not? It depends on the
       compiler configuration).

       Why "int" instead of "integer"? Why "req" instead of "required"?	"str"
       instead of "string"? Etc.

       This is also a design choice. To	be consistent, either we abbreviate or
       we don't. Although there	is very	little reason to abbreviate when it
       comes to	disk/memory size (compared to the eras of early	Unix or	C
       language), there	are other limited resources to consider: source	code
       column width (usually still around 80 characters	in many	best
       practices) and developer's time/energy (typing more takes more time and

       I want to make it possible for short schemas to be specified on a
       single line.  For example compare:

	[integer => {required => 1, minimum => 0, maximum => 100, divisible_by => 2}]

       with:

	[int =>	{req=>1, min=>0, max=>100, div_by=>2}]

       The latter is not much less readable than the former, but it is less
       tedious to type, especially if you write lots of schemas.

       Therefore, the decision is to use commonly used (and unambiguous)
       abbreviations for type and clause names.

       How to express "not-something"? Why isn't there a "not" or "not_in"

       There are generally no "not_CLAUSE" clauses. Instead, a generic
       "!CLAUSE" syntax	is provided. Examples:

	// an integer that is not 0
	["int",	{"!is":	0}]

	// a username that is not one of the forbidden/reserved	ones
	["str",	{"!in":	["root", "admin", "superuser"]}]
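
       The idea of one generic negation prefix, instead of a separate
       "not_CLAUSE" for every clause, can be sketched as follows. This is a
       toy Python model under stated assumptions, not how Data::Sah compiles
       schemas: the CLAUSES table and check_clauses function are
       hypothetical, and only "is" and "in" are modeled.

```python
# Hypothetical clause table: clause name -> predicate(value, argument).
CLAUSES = {
    "is": lambda value, arg: value == arg,
    "in": lambda value, arg: value in arg,
}


def check_clauses(value, clause_set):
    """Return True if the value satisfies every clause in the clause set."""
    for name, arg in clause_set.items():
        negate = name.startswith("!")
        clause = name[1:] if negate else name
        result = CLAUSES[clause](value, arg)
        # A "!" prefix flips the outcome of whichever clause follows it.
        if result == negate:
            return False
    return True


print(check_clauses(0, {"!is": 0}))
print(check_clauses("alice", {"!in": ["root", "admin", "superuser"]}))
```

       Because the prefix is handled in one place, any clause added to the
       table can be negated without extra work.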

       How to state "in" as well as "!in" in the same clause set?

       You can't do this since it will cause a conflict:

	["str", {"in": ["a","b","c"], "!in": ["x","y","z"]}]

       However,	you can	do this:

	["str", {"clset&": [{"in": ["a","b","c"]}, {"!in": ["x","y","z"]}]}]

       How to express mutual failure ("if A fails, B must also fail")?

       You can use the "if" clause and negate both conditions. For example:

	"if": [{"!div_by": 2}, {"!div_by": 5}]
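
       Read as an implication, the clause above says "if the value is not
       divisible by 2, then it must not be divisible by 5". A minimal Python
       sketch of that logic follows, assuming "if" takes a [condition,
       consequence] pair; the function names are hypothetical.

```python
def div_by(value, n):
    return value % n == 0


def mutual_failure(value):
    """Model of "if": [{"!div_by": 2}, {"!div_by": 5}]."""
    condition = not div_by(value, 2)    # A fails
    consequence = not div_by(value, 5)  # B must fail too
    # An implication holds when the condition is false or the consequence is true.
    return (not condition) or consequence


for n in (3, 4, 10, 15):
    print(n, mutual_failure(n))
```

       Only values such as 15 are rejected: they fail the first test
       (div_by 2) yet pass the second (div_by 5).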

       How about "len_in" clause for str? Or "values_uniq" for hash? Or
       perhaps "len_div_by"? Or	some other clauses that	test a
       property/transform of a value?

       Except for some commonly	used cases like	"len_between", "min_len",
       "max_len", "allowed_keys", "forbidden_keys", to validate	a certain
       property	of the value (instead of the raw value itself),	you can	use
       the generic "prop" clause:

	// check hash values are unique
	["hash", {"prop": ["values", ["array", {"uniq":1}]]}]
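
       Conceptually, "prop" extracts a property of the data and validates
       that property against another schema. The Python sketch below models
       the example above under stated assumptions: PROPERTIES, is_uniq and
       check_prop are hypothetical names, and only the "values" property is
       covered.

```python
# Hypothetical property extractors; only "values" is modeled here.
PROPERTIES = {
    "values": lambda mapping: list(mapping.values()),
}


def is_uniq(items):
    # Compare each item with the ones before it, so unhashable items work too.
    return all(item not in items[:i] for i, item in enumerate(items))


def check_prop(data, prop_name, validator):
    """Extract a property of the data, then validate the property itself."""
    return validator(PROPERTIES[prop_name](data))


print(check_prop({"a": 1, "b": 2}, "values", is_uniq))
print(check_prop({"a": 1, "b": 1}, "values", is_uniq))
```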

       General advice when writing schemas?

       o   Avoid "any" or "all"	if you know that data is of a certain type

	   For performance and ease of reflection, it is better to create a
	   custom clause than to use the "any" type, especially with a long
	   list of alternatives. An example:

	    // dns_record is either a_record, mx_record, ns_record, cname_record, ...
	    ["any", "of", [

	    // base_record
	    ["hash", "keys", {
		"owner": "str*",
		"ttl": "int",

	    // a_record
	    ["base_record", "merge.normal.keys", {
		"type":	["str*", "is", "A"],
		"address": "str*"

	    // mx_record
	    ["base_record", "merge.normal.keys", {
		"type":	["str*", "is", "MX"],
		"host":	"str*",
		"prio":	"int"


	   As the declarations above show, every record is a hash, so it is
	   better to declare "dns_record" as a "hash" instead of an "any". But
	   we still need to select a different schema based on the "type" key.
	   We can develop a custom clause like this:

	    ["hash", "select_schema_on_key", ["type", {
		"A": "a_record",
		"MX": "mx_record",
		"NS": "ns_record",
		"CNAME": "cname_record",

	   This	will be	faster.
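
       The speed-up comes from dispatching on the "type" key once instead of
       trying each alternative in turn. Below is a rough Python sketch of
       that dispatch. "select_schema_on_key" is the custom clause proposed
       above, not a built-in, and the per-record validators here are
       simplified stand-ins that check a single field each.

```python
def validate_a_record(record):
    # Stand-in for the a_record schema: only "address" is checked.
    return isinstance(record.get("address"), str)


def validate_mx_record(record):
    # Stand-in for the mx_record schema: only "host" is checked.
    return isinstance(record.get("host"), str)


# Hypothetical mapping from the value of "type" to a validator.
SCHEMA_BY_TYPE = {
    "A": validate_a_record,
    "MX": validate_mx_record,
}


def validate_dns_record(record):
    """Pick one validator from the "type" key, then run only that one."""
    if not isinstance(record, dict):
        return False
    record_type = record.get("type")
    if not isinstance(record_type, str):
        return False
    validator = SCHEMA_BY_TYPE.get(record_type)
    return validator is not None and validator(record)


print(validate_dns_record({"type": "A", "address": "192.0.2.1"}))
print(validate_dns_record({"type": "MX", "address": "192.0.2.1"}))
```

       With an "any" type, a failing record would be checked against every
       alternative before being rejected; here it is checked against at most
       one.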

       How does	Sah check allowed/unallowed keys?

       If the "keys" clause is specified, then by default only keys defined
       in the "keys" clause are allowed, unless the ".restrict" attribute is
       set to false, in which case the clause places no restriction on
       allowed keys. The same applies to "re_keys".

       If "allowed_keys" and/or	"allowed_keys_re" clause is specified, then
       only keys matching those	clauses	are allowed. This is in	addition to
       restriction placed by other clauses, of course.

       How do I	specify	schemas	for some keys, but still allow some other

       Set the ".restrict" attribute for "keys"	or "re_keys" to	false.

	["hash", {
	    "keys": {"a": "int", "b": "int"},
	    "keys.restrict": 0,
	    "allowed_keys": ["a", "b", "c", "d", "e"]
	}]

       The above schema	allows keys "a,	b, c, d, e" and	specifies values for
       "a, b".	Another	example:

	["hash", {
	    "keys": {"a": "int", "b": "int"},
	    "keys.restrict": 0,
	    "allowed_keys_re": "^[ab_]"
	}]

       The above schema	specifies values for "a, b" but	still allows other
       keys beginning with an underscore.
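
       How these key restrictions combine can be sketched in Python. This
       models only the key-name checks described above (not the value
       schemas), and the function and parameter names are hypothetical
       rather than part of Data::Sah.

```python
import re


def check_hash_keys(data, keys, restrict=True,
                    allowed_keys=None, allowed_keys_re=None):
    """Return True if every key in data passes all active restrictions."""
    for key in data:
        # "keys" limits the allowed keys unless "keys.restrict" is false.
        if restrict and key not in keys:
            return False
        # "allowed_keys" and "allowed_keys_re" apply in addition.
        if allowed_keys is not None and key not in allowed_keys:
            return False
        if allowed_keys_re is not None and not re.search(allowed_keys_re, key):
            return False
    return True


data = {"a": 1, "c": 2}
print(check_hash_keys(data, keys={"a", "b"}))
print(check_hash_keys(data, keys={"a", "b"}, restrict=False,
                      allowed_keys=["a", "b", "c", "d", "e"]))
```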

       What is the difference between the "keys" and "req_keys"	clauses?

       "req_keys" requires keys to exist, but their values are governed by
       the schemas in "keys" or "re_keys". Here are the four possible
       combinations, each with its schema:

       To require a hash key to	exist, but its value can be undef:

	["hash", "keys", {"a": "int"}, "req_keys", ["a"]]

       To allow	a hash key to not exist, but when it exists it must not	be

	["hash", "keys", {"a": "int*"}]

       To allow a hash key to not exist, or its value to be undef when it
       exists:

	["hash", "keys", {"a": "int"}]

       To require a hash key to exist and its value to not be undef:

	["hash", "keys", {"a": "int*"}, "req_keys", ["a"]]
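
       The four combinations can be tabulated with a small Python model.
       This is an illustrative sketch, not Data::Sah's implementation: the
       names is_int and check_key_a are hypothetical, None stands in for
       undef, value_req corresponds to "int*" versus "int", and key_required
       corresponds to listing "a" in "req_keys".

```python
def is_int(value, req):
    if value is None:
        return not req
    return isinstance(value, int) and not isinstance(value, bool)


def check_key_a(data, value_req, key_required):
    """Check key "a": existence via req_keys, value via the keys schema."""
    if "a" not in data:
        return not key_required
    return is_int(data["a"], value_req)


samples = [{}, {"a": None}, {"a": 1}]
for value_req in (False, True):
    for key_required in (False, True):
        results = [check_key_a(d, value_req, key_required) for d in samples]
        print(value_req, key_required, results)
```

       Existence of the key and definedness of its value are independent
       checks, which is why all four combinations are expressible.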

       Merging and hash	keys?

       XXX (Turn off hash merging using	the '' Data::ModeMerge options key.

       Please visit the	project's homepage at

       Source repository is at <>.

       Please report any bugs or feature requests on the bugtracker website

       When submitting a bug or	request, please	include	a test-file or a patch
       to an existing test-file	that illustrates the bug or desired feature.

       perlancar <>

       This software is	copyright (c) 2020, 2019, 2017,	2016, 2015, 2014,
       2013, 2012 by

       This is free software; you can redistribute it and/or modify it under
       the same	terms as the Perl 5 programming	language system	itself.

perl v5.32.1			  2020-02-11			   Sah::FAQ(3)

