Validating API payloads with JSON Schema

How JSON Schema replaces hand-written payload validation with a declarative contract, the keywords that matter, and the format gotcha that bites people.

Hand-written validation rots. You start with if (!body.email) return 400, then add a length check, then a regex, then a branch for the optional field someone added last quarter. The checks drift from the documentation, the error messages are inconsistent, and the cases nobody thought of (age: -1, role: "suuper-admin", a stray isAdmin: true the client should never have sent) sail straight through. JSON Schema replaces all of that with a declarative contract: a JSON document that describes the shape another JSON document must take, checked by a validator instead of by code you maintain by hand.

A worked schema

Here is a schema for a user-creation payload.

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "email": { "type": "string", "format": "email" },
    "username": { "type": "string", "minLength": 3, "pattern": "^[a-z0-9_]+$" },
    "age": { "type": "integer", "minimum": 13 },
    "role": { "type": "string", "enum": ["reader", "author", "admin"] },
    "tags": {
      "type": "array",
      "items": { "type": "string" },
      "maxItems": 10
    }
  },
  "required": ["email", "username", "role"],
  "additionalProperties": false
}

An instance that passes:

{
  "email": "lin@example.com",
  "username": "lin_99",
  "age": 27,
  "role": "author",
  "tags": ["ml", "rust"]
}

An instance that fails:

{
  "email": "li@example.com",
  "username": "Li",
  "age": 11,
  "role": "superuser",
  "isAdmin": true
}

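A validator set to collect every failure emits one error per violated keyword, each pointing at the exact location in the instance. Some validators stop at the first error unless told otherwise (Ajv, for example, only reports everything when its allErrors option is on). The shape varies by library, but the substance is consistent: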

/username  : "Li" is too short (minLength 3)
/username  : "Li" does not match pattern "^[a-z0-9_]+$"
/age       : 11 is less than the minimum of 13
/role      : "superuser" is not one of ["reader","author","admin"]
/          : additional property "isAdmin" is not allowed

That is five distinct, located errors from one declarative document. The equivalent hand-written code is a few dozen lines that you have to keep in sync with the schema in your docs forever.
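To make the "one error per violated keyword" idea concrete, here is a deliberately tiny validator sketch in Python. It is not a real JSON Schema implementation and not any library's API: the `validate` function, its message wording, and the `li@example.com` address are illustrative assumptions. It only understands the handful of keywords used in the schema above and ignores format entirely.

```python
import re

# Python checks for JSON types. bool is excluded from the numeric types because
# Python treats True/False as ints. (A full validator would also accept 1.0 as
# an integer; this sketch does not.)
TYPE_CHECKS = {
    "object": lambda v: isinstance(v, dict),
    "array": lambda v: isinstance(v, list),
    "string": lambda v: isinstance(v, str),
    "boolean": lambda v: isinstance(v, bool),
    "null": lambda v: v is None,
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
}


def validate(schema, value, path=""):
    """Return (location, message) pairs, one per violated keyword."""
    loc = path or "/"
    expected = schema.get("type")
    if expected in TYPE_CHECKS and not TYPE_CHECKS[expected](value):
        # The remaining keywords assume the right type, so stop here.
        return [(loc, f"{value!r} is not of type {expected}")]
    errors = []
    if isinstance(value, str):
        if len(value) < schema.get("minLength", 0):
            errors.append((loc, f"{value!r} is too short (minLength {schema['minLength']})"))
        if "pattern" in schema and not re.search(schema["pattern"], value):
            errors.append((loc, f"{value!r} does not match pattern {schema['pattern']!r}"))
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        if "minimum" in schema and value < schema["minimum"]:
            errors.append((loc, f"{value} is less than the minimum of {schema['minimum']}"))
    if "enum" in schema and value not in schema["enum"]:
        errors.append((loc, f"{value!r} is not one of {schema['enum']}"))
    if isinstance(value, list):
        if "maxItems" in schema and len(value) > schema["maxItems"]:
            errors.append((loc, f"more than {schema['maxItems']} items"))
        for index, item in enumerate(value):
            errors.extend(validate(schema.get("items", {}), item, f"{path}/{index}"))
    if isinstance(value, dict):
        props = schema.get("properties", {})
        for key in schema.get("required", []):
            if key not in value:
                errors.append((loc, f"missing required property {key!r}"))
        for key, subschema in props.items():
            if key in value:
                errors.extend(validate(subschema, value[key], f"{path}/{key}"))
        if schema.get("additionalProperties") is False:
            for key in value:
                if key not in props:
                    errors.append((loc, f"additional property {key!r} is not allowed"))
    return errors


USER_SCHEMA = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "type": "object",
    "properties": {
        "email": {"type": "string", "format": "email"},  # format is ignored here
        "username": {"type": "string", "minLength": 3, "pattern": "^[a-z0-9_]+$"},
        "age": {"type": "integer", "minimum": 13},
        "role": {"type": "string", "enum": ["reader", "author", "admin"]},
        "tags": {"type": "array", "items": {"type": "string"}, "maxItems": 10},
    },
    "required": ["email", "username", "role"],
    "additionalProperties": False,
}

GOOD_PAYLOAD = {"email": "lin@example.com", "username": "lin_99", "age": 27,
                "role": "author", "tags": ["ml", "rust"]}
BAD_PAYLOAD = {"email": "li@example.com", "username": "Li", "age": 11,
               "role": "superuser", "isAdmin": True}

for location, message in validate(USER_SCHEMA, BAD_PAYLOAD):
    print(f"{location:<10}: {message}")
```

Even at this size the sketch is longer than the schema it interprets, which is the argument for letting a maintained validator do this work.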

The keywords, grouped

JSON Schema has a lot of keywords, but they fall into three jobs.

Structure describes the shape. type constrains the JSON type (object, array, string, number, integer, boolean, null). properties maps keys to subschemas. required lists the keys that must be present; note that it only requires presence, so a key whose value is null still satisfies it. items applies a subschema to every element of an array (in 2020-12, every element not already covered by prefixItems). additionalProperties controls whether keys not matched by properties or patternProperties are allowed.
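The presence-versus-null distinction is easy to check by hand. The sketch below models only the required keyword in plain Python; `missing_required` is a hypothetical helper written for this illustration, not a function from any validator.

```python
import json


def missing_required(schema, instance):
    """Keys listed under `required` that are absent from the instance."""
    return [key for key in schema.get("required", []) if key not in instance]


SCHEMA = {"type": "object", "required": ["email"]}

null_email = json.loads('{"email": null}')  # key present, value null
no_email = json.loads("{}")                 # key absent

print(missing_required(SCHEMA, null_email))
print(missing_required(SCHEMA, no_email))
```

The null payload satisfies required because the key is there. Rejecting it takes a type constraint on the property itself, such as "email": { "type": "string" }.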

Value constraints narrow individual values. minimum/maximum (and the exclusive variants) bound numbers; minLength/maxLength bound strings; minItems/maxItems bound arrays. pattern applies a regular expression to a string. enum restricts a value to a fixed list; const pins it to exactly one value, which is handy for discriminator fields like "type": { "const": "user.created" }.
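One detail of pattern worth seeing in code: JSON Schema does not anchor the expression for you, so a pattern without ^ and $ passes if it matches anywhere in the string. The sketch below approximates that with Python's `re.search`. Python's regex dialect is not identical to the ECMAScript dialect JSON Schema recommends, so treat it as an illustration of anchoring only; `matches_pattern` is a hypothetical helper.

```python
import re


def matches_pattern(pattern, value):
    """Approximate `pattern`: succeed if the regex matches anywhere in value."""
    return re.search(pattern, value) is not None


unanchored = "[a-z0-9_]+"
anchored = "^[a-z0-9_]+$"

print(matches_pattern(unanchored, "Lin 99!"))  # finds "in" inside the string
print(matches_pattern(anchored, "Lin 99!"))
print(matches_pattern(anchored, "lin_99"))
```

That is why the username pattern in the worked schema is written with both anchors.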

Composition combines subschemas. allOf requires every branch to match, anyOf at least one, oneOf exactly one. $ref points at a named subschema so you can define a shape once and reuse it; $defs is the conventional place to keep those definitions. A reusable address block looks like this:

{
  "type": "object",
  "properties": {
    "billing": { "$ref": "#/$defs/address" },
    "shipping": { "$ref": "#/$defs/address" }
  },
  "$defs": {
    "address": {
      "type": "object",
      "properties": {
        "street": { "type": "string" },
        "country": { "type": "string", "pattern": "^[A-Z]{2}$" }
      },
      "required": ["street", "country"]
    }
  }
}
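A reference like #/$defs/address is a JSON Pointer into the same document, and resolving one is a short walk down the tree. The sketch below handles only same-document pointer references (no remote URIs, no $anchor, no $id rebasing); `resolve_local_ref` is a hypothetical helper, not a library call.

```python
from urllib.parse import unquote


def resolve_local_ref(root, ref):
    """Resolve a same-document JSON Pointer reference such as '#/$defs/address'."""
    if not ref.startswith("#"):
        raise ValueError("only same-document references are handled in this sketch")
    pointer = unquote(ref[1:])  # fragments may be percent-encoded
    if pointer == "":
        return root             # '#' alone refers to the whole document
    if not pointer.startswith("/"):
        raise ValueError("plain-name anchors are not handled in this sketch")
    node = root
    for raw in pointer.split("/")[1:]:
        # JSON Pointer escapes: ~1 means '/', ~0 means '~' (decode in that order).
        token = raw.replace("~1", "/").replace("~0", "~")
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node


SCHEMA = {
    "type": "object",
    "properties": {
        "billing": {"$ref": "#/$defs/address"},
        "shipping": {"$ref": "#/$defs/address"},
    },
    "$defs": {
        "address": {
            "type": "object",
            "properties": {
                "street": {"type": "string"},
                "country": {"type": "string", "pattern": "^[A-Z]{2}$"},
            },
            "required": ["street", "country"],
        }
    },
}

billing = resolve_local_ref(SCHEMA, SCHEMA["properties"]["billing"]["$ref"])
shipping = resolve_local_ref(SCHEMA, SCHEMA["properties"]["shipping"]["$ref"])
print(billing["required"])
```

Both properties resolve to the same definition object, which is the point: one edit to address changes billing and shipping together.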

oneOf is the usual tool for tagged unions — a payload that is either an email event or an SMS event, distinguished by a const channel field — but be aware that oneOf enforces exactly one match, so overlapping branches produce confusing "matches more than one" errors. When branches are genuinely mutually exclusive on a discriminator, oneOf is correct; when they merely overlap, anyOf is usually what you meant.
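The difference comes down to counting matching branches. In the sketch below each branch is a plain Python predicate standing in for "this subschema validates"; the field names address and phone, and every function name, are assumptions made up for the illustration.

```python
def has_address(payload):
    return isinstance(payload.get("address"), str)


def has_phone(payload):
    return isinstance(payload.get("phone"), str)


def is_email_event(payload):
    # Branch pinned by a discriminator, like "channel": { "const": "email" }.
    return payload.get("channel") == "email" and has_address(payload)


def is_sms_event(payload):
    return payload.get("channel") == "sms" and has_phone(payload)


def one_of(branches, payload):
    """oneOf: exactly one branch matches."""
    return sum(1 for branch in branches if branch(payload)) == 1


def any_of(branches, payload):
    """anyOf: at least one branch matches."""
    return any(branch(payload) for branch in branches)


overlapping = [has_address, has_phone]    # nothing makes these exclusive
tagged = [is_email_event, is_sms_event]   # the channel value can only be one thing

payload = {"channel": "email", "address": "lin@example.com", "phone": "+15550100"}

print(one_of(overlapping, payload))  # two branches match, so oneOf fails
print(any_of(overlapping, payload))
print(one_of(tagged, payload))
```

With the discriminator in place the same payload matches exactly one branch, so oneOf behaves the way a tagged union should.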

The format gotcha

format is the keyword people misread most often. Writing "format": "email" looks like it rejects malformed addresses, and "format": "date-time" looks like it rejects bad timestamps. In many validators, by default, they do not. format is an annotation — a hint about the intended semantics — and assertion is off unless you explicitly turn it on.

So a schema with "format": "email" may happily accept "not an email" because the validator records the format as metadata and moves on. To make format actually constrain the value you have to enable format assertion. In Ajv that means registering the formats (typically through the ajv-formats plugin) and leaving the validateFormats option on; in 2020-12 it means the format-assertion vocabulary; elsewhere it is the equivalent option in your language's library. Behavior differs across implementations and drafts, so the safe assumption is "format does nothing until I prove it does in my validator." When you need a guarantee, back the format up with a pattern, which is always asserted.
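The toy model below shows the two behaviors side by side. It is not how any particular library is implemented: `check_string`, the `assert_format` switch, and the deliberately loose `EMAIL_PATTERN` are all assumptions for illustration.

```python
import re

# A loose stand-in for an email check; real format checkers are stricter.
EMAIL_PATTERN = r"^[^@\s]+@[^@\s]+\.[^@\s]+$"


def check_string(schema, value, assert_format=False):
    """Return the names of the keywords the value violates."""
    violated = []
    # `pattern` is an assertion: it is always checked.
    if "pattern" in schema and not re.search(schema["pattern"], value):
        violated.append("pattern")
    # `format` is only checked when assertion has been switched on.
    if assert_format and schema.get("format") == "email":
        if not re.search(EMAIL_PATTERN, value):
            violated.append("format")
    return violated


format_only = {"type": "string", "format": "email"}
format_and_pattern = {"type": "string", "format": "email", "pattern": EMAIL_PATTERN}

print(check_string(format_only, "not an email"))                      # annotation only
print(check_string(format_only, "not an email", assert_format=True))  # assertion on
print(check_string(format_and_pattern, "not an email"))               # pattern catches it
```

The third call is the defensive option: the pattern rejects the value even when the validator treats format as an annotation.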

Strictness with additionalProperties

By default, JSON Schema ignores keys you did not mention. A payload carrying an unexpected isAdmin: true validates fine unless you say otherwise. Setting "additionalProperties": false rejects any key not named in properties, which is what turns the schema into a closed contract and catches typos like usrename or fields a client should never be sending.

The trade-off is forward compatibility. A strict schema rejects the extra fields a future client version might add, so a v2 client talking to a v1 server breaks on a property the server would have safely ignored. For internal APIs where you control both ends, strict is the right default — unknown keys are bugs. For public APIs that need to tolerate clients ahead of the server, leaving it open (or scoping strictness to the fields you genuinely own) keeps you from breaking on benign additions.
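Both sides of that trade-off fit in a few lines. The sketch models only properties and additionalProperties; `unknown_keys` and `accepts` are hypothetical helpers, and the locale field stands in for whatever a future client might add.

```python
def unknown_keys(schema, payload):
    """Keys in the payload that the schema's `properties` does not name."""
    known = schema.get("properties", {})
    return [key for key in payload if key not in known]


def accepts(schema, payload):
    """Model only the additionalProperties decision, nothing else."""
    if schema.get("additionalProperties") is False:
        return not unknown_keys(schema, payload)
    return True


PROPERTIES = {"email": {"type": "string"}, "username": {"type": "string"}}
STRICT = {"type": "object", "properties": PROPERTIES, "additionalProperties": False}
OPEN = {"type": "object", "properties": PROPERTIES}

v2_payload = {"email": "lin@example.com", "username": "lin_99", "locale": "en-GB"}
typo_payload = {"email": "lin@example.com", "usrename": "lin_99"}

print(accepts(STRICT, v2_payload), accepts(OPEN, v2_payload))
print(accepts(STRICT, typo_payload), accepts(OPEN, typo_payload))
```

The strict schema rejects both payloads and the open one accepts both, so the forward compatibility of the open schema comes at the cost of letting the typo through.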

Where it pays off

The declarative form earns its keep wherever the same shape is described in more than one place.

  • API boundaries. Validate the request before your handler runs and the response before it ships. The schema becomes the single source of truth, and the 400 you return is generated from it rather than hand-assembled.
  • Config files. A schema over your service config catches a misspelled key or an out-of-range timeout at startup instead of at 3 a.m. This applies whether the config is JSON or YAML, since they describe the same data model — see JSON vs YAML for where the two diverge.
  • OpenAPI. OpenAPI describes its request and response bodies with JSON Schema (3.1 uses the 2020-12 dialect; 3.0 uses an extended subset of an earlier draft), so the schema you write doubles as your API spec.
  • Code generation. Many toolchains generate types or client code directly from a schema, keeping the model and the code in lockstep.
  • Editor support. Point an editor at a schema and you get inline validation and autocomplete while you hand-edit a config — the same experience that makes tsconfig.json and package.json self-checking.
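For the API-boundary case, "the 400 is generated from the schema" can be as simple as reshaping the validator's located errors into a response body. The sketch below assumes the validator hands back (location, message) pairs; the `bad_request` helper and the body layout are illustrative choices, not a standard.

```python
import json


def bad_request(errors):
    """Turn located validation errors into a (status, JSON body) pair."""
    body = {
        "error": "validation_failed",
        "details": [{"path": location, "message": message}
                    for location, message in errors],
    }
    return 400, json.dumps(body)


# Hypothetical validator output for a rejected payload.
errors = [
    ("/age", "11 is less than the minimum of 13"),
    ("/", 'additional property "isAdmin" is not allowed'),
]

status, body = bad_request(errors)
print(status)
print(body)
```

Because every message originates from the schema, two endpoints sharing a schema return the same error shape without extra work.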

Drafts and what your validator supports

JSON Schema is versioned by draft. 2020-12 is the latest and the one to target for new work; draft-07 remains extremely common because a lot of tooling settled there and never moved; draft-04 still turns up in older OpenAPI 2.0 stacks. The keywords above are stable across recent drafts, but details — how $ref resolves, whether format asserts, the spelling of $defs versus the older definitions — changed between versions. The $schema declaration at the top of your document names the draft; honor it, and confirm your validator actually implements that draft rather than assuming it does. Mismatches here produce schemas that silently validate less than you think.
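A cheap guard against draft mismatches is to read $schema yourself and refuse to proceed when your validator does not implement the declared draft. In the sketch below the meta-schema URIs are the published ones, while `declared_draft`, `require_supported`, and the set of supported drafts are assumptions standing in for whatever your validator documents.

```python
# Published meta-schema URIs, stored without the trailing '#' older drafts use.
KNOWN_DRAFTS = {
    "https://json-schema.org/draft/2020-12/schema": "2020-12",
    "https://json-schema.org/draft/2019-09/schema": "2019-09",
    "http://json-schema.org/draft-07/schema": "draft-07",
    "http://json-schema.org/draft-04/schema": "draft-04",
}


def declared_draft(schema):
    """Return the draft named by $schema, or None if absent or unrecognized."""
    uri = schema.get("$schema")
    if not isinstance(uri, str):
        return None
    return KNOWN_DRAFTS.get(uri.rstrip("#"))


def require_supported(schema, supported):
    """Raise ValueError unless the schema declares a draft in `supported`."""
    draft = declared_draft(schema)
    if draft is None:
        raise ValueError("schema does not declare a recognized $schema")
    if draft not in supported:
        raise ValueError(f"schema targets {draft}, validator supports {sorted(supported)}")
    return draft


schema_2020 = {"$schema": "https://json-schema.org/draft/2020-12/schema", "type": "object"}
schema_07 = {"$schema": "http://json-schema.org/draft-07/schema#", "type": "object"}

print(require_supported(schema_07, {"draft-07"}))
print(declared_draft(schema_2020))
```

Failing loudly at load time is preferable to a validator that quietly skips keywords it does not recognize.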

Write the schema first, validate against real payloads early, and let the contract live next to the code it guards. To draft and test a schema against sample documents in the browser, our JSON Schema validator checks an instance against a schema and reports each error with its location — and if you just need to clean up a payload before pasting it in, the JSON formatter will pretty-print and lint it first.