Validating API payloads with JSON Schema
How JSON Schema replaces hand-written payload validation with a declarative contract, the keywords that matter, and the format gotcha that bites people.
Hand-written validation rots. You start with if (!body.email) return 400,
then add a length check, then a regex, then a branch for the optional
field someone added last quarter. The checks drift from the documentation,
the error messages are inconsistent, and the cases nobody thought of
(age: -1, role: "suuper-admin", a stray isAdmin: true the client
should never have sent) sail straight through. JSON Schema replaces all
of that with a declarative contract: a JSON document that describes the
shape another JSON document must take, checked by a validator instead of
by code you maintain by hand.
A worked schema
Here is a schema for a user-creation payload.
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "email": { "type": "string", "format": "email" },
    "username": { "type": "string", "minLength": 3, "pattern": "^[a-z0-9_]+$" },
    "age": { "type": "integer", "minimum": 13 },
    "role": { "type": "string", "enum": ["reader", "author", "admin"] },
    "tags": {
      "type": "array",
      "items": { "type": "string" },
      "maxItems": 10
    }
  },
  "required": ["email", "username", "role"],
  "additionalProperties": false
}
An instance that passes:
{
  "email": "lin@example.com",
  "username": "lin_99",
  "age": 27,
  "role": "author",
  "tags": ["ml", "rust"]
}
An instance that fails:
{
  "email": "li@example.com",
  "username": "Li",
  "age": 11,
  "role": "superuser",
  "isAdmin": true
}
A validator configured to report every failure (many stop at the first one by default) emits one error per violated keyword, each pointing at the exact location in the instance. The shape varies by library, but the substance is consistent:
/username : "Li" is too short (minLength 3)
/username : "Li" does not match pattern "^[a-z0-9_]+$"
/age : 11 is less than the minimum of 13
/role : "superuser" is not one of ["reader","author","admin"]
/ : additional property "isAdmin" is not allowed
That is five distinct, located errors from one declarative document. The equivalent hand-written code is a few dozen lines that you have to keep in sync with the schema in your docs forever.
The keywords, grouped
JSON Schema has a lot of keywords, but they fall into three jobs.
Structure describes the shape. type constrains the JSON type
(object, array, string, number, integer, boolean, null).
properties maps keys to subschemas. required lists the keys that
must be present — note it does not imply they are non-null, only present.
items applies a subschema to every element of an array.
additionalProperties controls whether keys outside properties are
allowed.
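The difference between present and non-null is easy to miss, so here is a minimal sketch of it. The nickname field is hypothetical and exists only for this illustration.

```json
{
  "type": "object",
  "properties": {
    "nickname": { "type": ["string", "null"] }
  },
  "required": ["nickname"]
}
```

Against this schema, { "nickname": null } should validate, because the key is present and type allows null, while {} should fail on required. If null must be rejected, that is the job of type (narrow it to "string"), not of required.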
Value constraints narrow individual values. minimum/maximum
(and the exclusive variants) bound numbers; minLength/maxLength
bound strings; minItems/maxItems bound arrays. pattern applies a
regular expression to a string. enum restricts a value to a fixed
list; const pins it to exactly one value, which is handy for
discriminator fields like "type": { "const": "user.created" }.
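As a sketch of several value constraints working together, the fragment below pins a discriminator with const and bounds a number and a string. The discountPercent and displayName fields are made up for the example.

```json
{
  "type": "object",
  "properties": {
    "type": { "const": "user.created" },
    "discountPercent": { "type": "number", "exclusiveMinimum": 0, "maximum": 100 },
    "displayName": { "type": "string", "minLength": 1, "maxLength": 64 }
  },
  "required": ["type"]
}
```

Here a discountPercent of 0 is rejected (the lower bound is exclusive) and 100 is accepted (the upper bound is inclusive). The numeric form of exclusiveMinimum shown here applies to draft-06 and later; in draft-04 it was a boolean that modified minimum, so check which draft your validator is running.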
Composition combines subschemas. allOf requires every branch to
match, anyOf at least one, oneOf exactly one. $ref points at a
named subschema so you can define a shape once and reuse it; $defs
is the conventional place to keep those definitions. A reusable address
block looks like this:
{
  "type": "object",
  "properties": {
    "billing": { "$ref": "#/$defs/address" },
    "shipping": { "$ref": "#/$defs/address" }
  },
  "$defs": {
    "address": {
      "type": "object",
      "properties": {
        "street": { "type": "string" },
        "country": { "type": "string", "pattern": "^[A-Z]{2}$" }
      },
      "required": ["street", "country"]
    }
  }
}
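To see the shared definition applied in two places, consider this illustrative instance (the street values are invented):

```json
{
  "billing": { "street": "1 Main St", "country": "US" },
  "shipping": { "street": "2 Side St", "country": "usa" }
}
```

The billing object satisfies the address subschema, while shipping should be reported with a pattern failure located at /shipping/country, since "usa" is not two uppercase letters. Note that the schema as written does not list billing or shipping under required, so an empty object would also validate.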
oneOf is the usual tool for tagged unions — a payload that is either an
email event or an SMS event, distinguished by a const channel field —
but be aware that oneOf enforces exactly one match, so overlapping
branches produce confusing "matches more than one" errors. When branches
are genuinely mutually exclusive on a discriminator, oneOf is correct;
when they merely overlap, anyOf is usually what you meant.
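A tagged union along those lines might look like the sketch below. Only the channel discriminator comes from the description above; the to, subject, phone, and body fields are hypothetical.

```json
{
  "oneOf": [
    {
      "type": "object",
      "properties": {
        "channel": { "const": "email" },
        "to": { "type": "string" },
        "subject": { "type": "string" }
      },
      "required": ["channel", "to", "subject"]
    },
    {
      "type": "object",
      "properties": {
        "channel": { "const": "sms" },
        "phone": { "type": "string" },
        "body": { "type": "string" }
      },
      "required": ["channel", "phone", "body"]
    }
  ]
}
```

Listing channel under required in every branch is what keeps the branches mutually exclusive. Without it, a payload that omits channel is not constrained by either const and could match more than one branch, which is exactly the confusing oneOf failure described above.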
The format gotcha
format is the keyword people misread most often. Writing
"format": "email" looks like it rejects malformed addresses, and
"format": "date-time" looks like it rejects bad timestamps. In many
validators, by default, they do not. format is an annotation — a
hint about the intended semantics — and assertion is off unless you
explicitly turn it on.
So a schema with "format": "email" may happily accept
"not an email" because the validator records the format as metadata
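and moves on. To make format actually constrain the value you have to
enable format assertion. In Ajv v8 that means registering format
implementations (the ajv-formats package supplies them) and leaving the
validateFormats option on; in 2020-12 it means a meta-schema that
declares the format-assertion vocabulary; elsewhere, look for the
equivalent option in your language's library. Behavior differs across implementations and drafts,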
so the safe assumption is "format does nothing until I prove it does
in my validator." When you need a guarantee, back the format up with a
pattern, which is always asserted.
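One way to back a format up with a pattern is sketched below. The regular expression is a deliberately loose approximation (something, an @, something, a dot, something), not a full email grammar, so treat it as an assumption to tune for your own rules.

```json
{
  "type": "string",
  "format": "email",
  "pattern": "^[^@\\s]+@[^@\\s]+\\.[^@\\s]+$"
}
```

With this subschema, "not an email" fails on pattern whether or not the validator asserts format. The ^ and $ anchors matter: pattern matches anywhere in the string unless you anchor it, so an unanchored version would accept any value that merely contains something email-shaped.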
Strictness with additionalProperties
By default, JSON Schema ignores keys you did not mention. A payload
carrying an unexpected isAdmin: true validates fine unless you say
otherwise. Setting "additionalProperties": false rejects any key not
named in properties (or matched by patternProperties, if you use it), which is what turns the schema into a closed
contract and catches typos like usrename or fields a client should
never be sending.
The trade-off is forward compatibility. A strict schema rejects the extra fields a future client version might add, so a v2 client talking to a v1 server breaks on a property the server would have safely ignored. For internal APIs where you control both ends, strict is the right default — unknown keys are bugs. For public APIs that need to tolerate clients ahead of the server, leaving it open (or scoping strictness to the fields you genuinely own) keeps you from breaking on benign additions.
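One possible reading of scoped strictness is to leave the top level open and close only a nested object whose shape you fully control. The profile object and its fields below are hypothetical.

```json
{
  "type": "object",
  "properties": {
    "username": { "type": "string" },
    "profile": {
      "type": "object",
      "properties": {
        "displayName": { "type": "string" },
        "timezone": { "type": "string" }
      },
      "additionalProperties": false
    }
  },
  "required": ["username"]
}
```

Under this sketch, an unknown top-level key sent by a newer client is tolerated, while a misspelled key inside profile is still rejected. Whether that split is right depends on which parts of the payload you expect to evolve independently of the server.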
Where it pays off
The declarative form earns its keep wherever the same shape is described in more than one place.
- API boundaries. Validate the request before your handler runs and the response before it ships. The schema becomes the single source of truth, and the 400 you return is generated from it rather than hand-assembled.
- Config files. A schema over your service config catches a misspelled key or an out-of-range timeout at startup instead of at 3 a.m. This applies whether the config is JSON or YAML, since they describe the same data model — see JSON vs YAML for where the two diverge.
- OpenAPI. OpenAPI embeds JSON Schema for its request and response bodies (3.1 aligns with 2020-12, while 3.0 uses a modified subset of an earlier draft), so the schema you write can double as that part of your API spec.
- Code generation. Many toolchains generate types or client code directly from a schema, keeping the model and the code in lockstep.
- Editor support. Point an editor at a schema and you get inline
validation and autocomplete while you hand-edit a config: the same
experience that makes tsconfig.json and package.json self-checking.
Drafts and what your validator supports
JSON Schema is versioned by draft. 2020-12 is the latest and the one
to target for new work; draft-07 remains extremely common because a
lot of tooling settled there and never moved; draft-04 still turns up
in older OpenAPI 2.0 stacks. The keywords above are stable across recent
drafts, but details — how $ref resolves, whether format asserts,
the spelling of $defs versus the older definitions — changed between
versions. The $schema declaration at the top of your document names the
draft; honor it, and confirm your validator actually implements that
draft rather than assuming it does. Mismatches here produce schemas that
silently validate less than you think.
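To illustrate the spelling difference, here is a draft-07 sketch of a reusable address definition, with a single billing property for brevity. The same idea in 2020-12 would use $defs and a reference of #/$defs/address.

```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "billing": { "$ref": "#/definitions/address" }
  },
  "definitions": {
    "address": {
      "type": "object",
      "properties": {
        "street": { "type": "string" },
        "country": { "type": "string", "pattern": "^[A-Z]{2}$" }
      },
      "required": ["street", "country"]
    }
  }
}
```

A $ref is just a pointer into the document, so either key name resolves as long as the reference and the container agree. The risk is mixing them: a reference to #/$defs/address in a document that stores the shape under definitions points at nothing, and how loudly that fails depends on the validator.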
Write the schema first, validate against real payloads early, and let the contract live next to the code it guards. To draft and test a schema against sample documents in the browser, our JSON Schema validator checks an instance against a schema and reports each error with its location — and if you just need to clean up a payload before pasting it in, the JSON formatter will pretty-print and lint it first.