JSON vs YAML: when to use which, and the footguns of each
JSON and YAML model the same data, but their failure modes differ sharply: YAML's type coercion and whitespace traps versus JSON's missing comments and verbosity.
JSON and YAML describe the same shapes — maps, lists, and scalars — and the choice between them is usually framed as a style preference. It is not. They have different failure modes, and those failures land in different places: JSON's bite at edit time, YAML's bite silently at parse time. This post covers where each format earns its keep, the side-by-side mechanics, and the footguns that turn a working config into a 2 a.m. incident.
The relationship
YAML 1.2 is, for practical purposes, a superset of JSON. Any valid JSON document is also a valid YAML document, because YAML's spec adopted JSON's syntax as a subset. A YAML parser will read this verbatim:
{"service": "api", "replicas": 3, "ports": [80, 443]}
That matters more than it sounds. It means YAML inherits JSON's data model entirely — there is nothing JSON can represent that YAML cannot. The differences are not in what the formats can hold but in how humans write them and how parsers interpret the ambiguity that YAML's looser grammar introduces.
Side by side
The same document, first in JSON:
{
  "service": "api",
  "replicas": 3,
  "image": "registry.example.com/api:1.4.0",
  "ports": [80, 443],
  "env": {
    "LOG_LEVEL": "info",
    "REGION": "us-east-1"
  }
}
And in YAML:
# deployment config for the api service
service: api
replicas: 3
image: registry.example.com/api:1.4.0
ports:
- 80
- 443
env:
  LOG_LEVEL: info
  REGION: us-east-1
The YAML version drops the braces, the brackets, the quotes around keys and most strings, the commas, and adds a comment. Structure is carried by indentation instead of punctuation. For a human editing a config by hand this is genuinely more pleasant. The catch is that every piece of that convenience is also a place where a parser has to guess, and the guesses are where things go wrong.
Feature comparison
| Feature | JSON | YAML |
|---|---|---|
| Comments | No (spec forbids) | Yes (#) |
| Trailing commas | No (spec forbids) | N/A (no commas) |
| Multiline strings | Escaped \n only | Block scalars (`\|`, `>`) |
| Anchors / aliases | No | Yes (&, *, <<) |
| Parsing strictness | High — one grammar | Low — context-sensitive, version-dependent |
| Type inference | None (explicit) | Aggressive (implicit) |
| Ubiquity | Universal | Wide but uneven |
| Attack surface | Minimal | Anchors, custom tags, deep nesting |
YAML's footguns
This is the part worth memorizing, because the failures are quiet.
The Norway problem
YAML 1.1 treats a long list of bare words as booleans: yes, no,
on, off, true, false, y, n. Many widely deployed parsers
still default to 1.1 behavior. So this:
country: NO
parses NO as the boolean false, not the string "NO" (Norway's
country code — hence the name). The same class of bug:
version: 1.0 # parses as the float 1.0, drops the trailing zero
build: 1.20 # becomes 1.2
zip: 02134 # leading zero: octal in 1.1, or stripped to 2134
git_sha: 1234567 # an all-digit SHA prefix becomes an integer
mac: 12:34:56 # 1.1 reads colons as base-60 (sexagesimal)
enabled: off # the string "off" becomes false
A ZIP code, a git SHA, a version string, a MAC address, a phone
number, a serial — anything that looks numeric but is semantically a
string is a candidate for silent coercion. The fix is mechanical:
quote it. country: "NO", version: "1.0", zip: "02134". Quoting
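forces the scalar to a string regardless of parser version. Beyond
that, prefer a parser configured for YAML 1.2 / "core schema," which
narrows the boolean set to true/false only and drops the sexagesimal
and leading-zero octal forms. Note that 1.2 still reads an unquoted
02134 as the integer 2134, so the quoting rule applies either way.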
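
To make the coercion rules concrete, here is a minimal Python sketch of how a YAML 1.1-style resolver might type an unquoted scalar. The function name `resolve_yaml11` and the regular expressions are illustrative assumptions: they approximate the 1.1 rules rather than reproduce the spec's full patterns, and real parsers differ in the details.

```python
import re

# Simplified approximations of YAML 1.1 implicit typing (not the full spec).
BOOL_TRUE = {"y", "yes", "on", "true"}
BOOL_FALSE = {"n", "no", "off", "false"}
OCTAL_RE = re.compile(r"[-+]?0[0-7]+")
INT_RE = re.compile(r"[-+]?(0|[1-9][0-9]*)")
FLOAT_RE = re.compile(r"[-+]?([0-9]*\.[0-9]+|[0-9]+\.[0-9]*)")
SEXAGESIMAL_RE = re.compile(r"[-+]?[1-9][0-9]*(:[0-5]?[0-9])+")


def resolve_yaml11(scalar: str):
    """Guess the value a 1.1-style parser would produce for an unquoted scalar."""
    lowered = scalar.lower()  # simplification: the spec lists specific casings
    if lowered in BOOL_TRUE:
        return True
    if lowered in BOOL_FALSE:
        return False
    if OCTAL_RE.fullmatch(scalar):
        return int(scalar, 8)  # a leading zero means base 8
    if INT_RE.fullmatch(scalar):
        return int(scalar)
    if FLOAT_RE.fullmatch(scalar):
        return float(scalar)
    if SEXAGESIMAL_RE.fullmatch(scalar):
        total = 0
        for part in scalar.lstrip("+-").split(":"):
            total = total * 60 + int(part)  # base-60 digits
        return -total if scalar.startswith("-") else total
    return scalar  # everything else stays a string


for raw in ["NO", "1.20", "02134", "1234567", "12:34:56", "api"]:
    print(f"{raw!r:12} -> {resolve_yaml11(raw)!r}")
```

Under these simplified rules, only `api` comes back as a string. Every other value in the list is silently re-typed, which is why the defense is to quote rather than to hope.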
Significant whitespace
Indentation is structure, so indentation errors are structural errors. Tabs are not allowed for indentation in YAML at all: a stray tab from an editor that didn't convert it is a hard parse error, and the error message does not always point at the offending line. The quieter failures are the ones that still parse. Copy-pasting a block into a different indentation context shifts its meaning, and a list item that loses two spaces silently becomes a sibling of its former parent rather than a child. Neither is caught by the document being "valid"; it parses fine, just into a different document than you meant.
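
Tab indentation is cheap to catch before a parser ever sees the file. The sketch below is one possible pre-commit check in Python; the function name `find_tab_indents` is hypothetical, and it only looks at leading whitespace, so tabs inside values are left alone.

```python
def find_tab_indents(text: str) -> list[int]:
    """Return 1-based line numbers whose leading whitespace contains a tab."""
    bad_lines = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Slice off everything after the leading run of spaces and tabs.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            bad_lines.append(number)
    return bad_lines


sample = "env:\n  LOG_LEVEL: info\n\tREGION: us-east-1\n"
print(find_tab_indents(sample))  # [3]
```

This does nothing for the mis-indented-but-valid case, which needs a schema or a review of the parsed structure rather than a text scan.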
Anchors, aliases, and the merge key
YAML lets you define a node once and reuse it:
defaults: &defaults
  retries: 3
  timeout: 30
prod:
  <<: *defaults
  timeout: 60
&defaults anchors the map, *defaults references it, and <<
merges it in. This is genuinely useful for DRY config — and it is also
a readability cliff once a file leans on it heavily, because the
effective value of a key now lives somewhere else in the document.
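
In plain dictionary terms, the merge above behaves roughly like the Python sketch below. It models the result only and does not parse any YAML: keys written directly under prod win over keys pulled in through the merge.

```python
defaults = {"retries": 3, "timeout": 30}

# Merged keys come first; keys written explicitly under prod override them.
prod = {**defaults, "timeout": 60}

print(prod)  # {'retries': 3, 'timeout': 60}
```

A reader who sees only the prod block in the YAML file has no way to know that retries is set at all, which is the readability cost in miniature.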
Worse, nested aliases enable the "billion laughs" denial-of-service
attack: a handful of anchors that each reference the previous one
several times expand exponentially and exhaust memory when the parser
expands them. Parsers that don't cap expansion are vulnerable. If you
accept YAML from untrusted sources, use a parser with alias-expansion
limits and consider disabling anchors entirely. A "safe" loader that
only restricts tags does not necessarily cap expansion, so check what
yours actually enforces.
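
The arithmetic behind the attack is easy to sketch. The Python below models nested aliases as shared list references and counts how many scalars the document would contain if every alias were copied out. The helper names and the `MAX_EXPANDED` threshold are hypothetical; a real parser would enforce its own limit internally.

```python
def build_nested(levels: int, fanout: int) -> list:
    """Model nested aliases: each level holds `fanout` references to the previous one."""
    node = ["lol"] * fanout
    for _ in range(levels - 1):
        node = [node] * fanout  # shared references, like repeating *alias
    return node


def expanded_size(node, cache=None) -> int:
    """Count scalars after full alias expansion, without actually expanding."""
    if cache is None:
        cache = {}
    if not isinstance(node, list):
        return 1
    key = id(node)
    if key not in cache:  # each shared node is measured only once
        cache[key] = sum(expanded_size(child, cache) for child in node)
    return cache[key]


MAX_EXPANDED = 1_000_000  # hypothetical limit for untrusted input

doc = build_nested(levels=9, fanout=9)
size = expanded_size(doc)
print(size)  # 387420489
print("reject" if size > MAX_EXPANDED else "accept")  # reject
```

Nine anchors with nine references each is well under a hundred lines of YAML, yet the expanded form holds 9^9 scalars. Measuring before expanding is what lets a limit be enforced cheaply.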
Type coercion, generally
The unifying theme is that YAML tries to be helpful by inferring
types, and inference is a guess. The defenses are the same in every
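case: quote anything that is conceptually a string, and load with a
strict 1.2 parser instead of the permissive default. In Python, note
that PyYAML's yaml.safe_load only blocks arbitrary object construction
compared with an unrestricted yaml.load; it still applies 1.1-style
resolution, so an unquoted NO still loads as False. A library that
defaults to YAML 1.2, such as ruamel.yaml, avoids that. In other
ecosystems, look for the schema or strict mode the library exposes.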
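
The quoting rule also applies when a script writes the YAML. One low-tech approach, sketched below in Python, is to render every string value as a JSON string literal, on the assumption that the consuming parser accepts JSON-style double-quoted scalars (which follows from YAML 1.2 being a JSON superset). The helper name `emit_string_scalar` is hypothetical.

```python
import json


def emit_string_scalar(value: str) -> str:
    """Render a string as a double-quoted scalar so no parser re-types it."""
    return json.dumps(value)


for key, value in [("country", "NO"), ("version", "1.0"), ("zip", "02134")]:
    print(f"{key}: {emit_string_scalar(value)}")
# country: "NO"
# version: "1.0"
# zip: "02134"
```

Quoting everything is noisier than quoting selectively, but it removes the need to predict which bare words a given parser treats as special.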
JSON's limits
JSON's strictness is the reason it almost never surprises you at parse time — but the same strictness makes it a poor format for humans to maintain.
- No comments. The spec has none, full stop. This is why JSONC (VS Code settings), JSON5, and the // comment hacks exist. A config you can't annotate is a config future-you can't safely change.
- No trailing commas. Add a line to an array or object and you must remember to add a comma to the line above. This is the single most common hand-edited-JSON error.
- Verbose for hand editing. Braces, brackets, and quoted keys are noise when a human is the editor. At config scale it adds up.
- No native date type. Dates are strings by convention, and the format carries no metadata about how to interpret them.
None of these matter for machine-to-machine traffic, where nobody is hand-editing the payload. They matter enormously for files a person opens in an editor every week.
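
The strictness is easy to see with a standard-library parser. The sketch below feeds Python's json module a valid document and three common hand-editing slips; the sample strings are made up for illustration.

```python
import json

candidates = {
    "valid": '{"ports": [80, 443]}',
    "trailing comma": '{"ports": [80, 443,]}',
    "comment": '{"ports": [80, 443] // http and https\n}',
    "leading zero": '{"zip": 02134}',
}


def accepted(text: str) -> bool:
    """Return True if the text parses as JSON, False if the parser rejects it."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


results = {name: accepted(text) for name, text in candidates.items()}
print(results)
# {'valid': True, 'trailing comma': False, 'comment': False, 'leading zero': False}
```

Every failure here is loud and happens at load time. That is the trade the format makes: more friction while editing, fewer surprises afterwards.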
When to use which
The split follows the failure modes. Use JSON wherever a machine produces and consumes the data and a human rarely touches it: API request and response bodies, interchange between services, log records, anything serialized programmatically. Strictness is a feature there — you want exactly one interpretation, and you don't need comments because the schema is the documentation.
Use YAML for configuration that humans edit by hand: CI pipelines, Kubernetes manifests, Ansible playbooks, application config. Comments, multiline strings, and the lighter syntax pay off precisely where a person is in the loop.
The uncomfortable caveat is that YAML's footguns bite hardest in
exactly this case. The human-edited config files where YAML shines are
the same files where an unquoted NO, a tab, or a mis-indented list
item slips through review and ships. So the discipline that makes YAML
safe — quote your stringy scalars, lint indentation in CI, validate
against a schema — is non-negotiable for the workloads YAML is best at.
That validation step is worth building into the pipeline: a schema
catches the coercion bugs a yaml.load will never warn you about. We
cover the mechanics in validating with JSON Schema, which works
against either format once parsed.
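
As a rough illustration of what a schema buys you, the Python sketch below checks a parsed document against expected types. It is a hand-rolled stand-in for a real JSON Schema validator, not a replacement for one; `EXPECTED_TYPES` and `type_errors` are hypothetical names, and the `parsed` dict is what a 1.1-style loader would typically return for an unquoted NO and 1.0.

```python
# Hypothetical expectations for a small config document.
EXPECTED_TYPES = {"country": str, "version": str, "replicas": int, "enabled": bool}


def type_errors(doc: dict, expected: dict) -> list[str]:
    """List every key that is missing or whose parsed type differs from the expectation."""
    errors = []
    for key, want in expected.items():
        if key not in doc:
            errors.append(f"{key}: missing")
            continue
        value = doc[key]
        # bool is a subclass of int in Python, so compare exact types.
        if type(value) is not want:
            errors.append(f"{key}: expected {want.__name__}, got {type(value).__name__}")
    return errors


parsed = {"country": False, "version": 1.0, "replicas": 3, "enabled": True}
print(type_errors(parsed, EXPECTED_TYPES))
# ['country: expected str, got bool', 'version: expected str, got float']
```

The parser raised nothing for either value; only the type check after parsing turns the silent coercion into a failing build.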
If you're moving a document between the two — porting a JSON API fixture into a YAML config, or flattening a manifest back to JSON for a tool that demands it — our JSON ⇄ YAML converter handles the round-trip and preserves the structure, and the JSON formatter will lint and pretty-print the JSON side before you commit it.