YAML vs JSON

JSON is a wire format. YAML is a configuration format. Here's when each one is the right choice — and the YAML footguns to memorize.


YAML and JSON can represent the same data (YAML 1.2 was designed so that JSON documents parse as YAML), but they exist for different reasons. JSON is a wire format. YAML is a configuration format. Using one where the other belongs is why you've probably seen a Kubernetes manifest with a trailing colon that took an hour to debug, or an API response wrapped in three-space indentation for no reason.

The short version

  • JSON — when a machine writes it and a machine reads it.
  • YAML — when a human writes it and a machine reads it.

If neither party is a human, JSON wins on speed, tooling, and unambiguity. If a human maintains the file by hand, YAML wins on readability, comments, and skimmability.

Where JSON is the right call

  • HTTP request and response bodies.
  • Data stored in a database column.
  • Message-queue payloads.
  • Anything generated programmatically and consumed programmatically.
  • Any format where you want unambiguous parsing across every language.

JSON parsers exist in every language, and they agree on how to read almost any document you'll meet in practice; the exceptions are edge cases such as very large numbers and duplicate keys. That's the whole value proposition.

Where YAML is the right call

  • Kubernetes manifests, Helm values files.
  • GitHub Actions, GitLab CI, CircleCI workflows.
  • Docker Compose files.
  • Ansible playbooks.
  • App config that a human edits and a machine reads.

The winning features here are inline comments, multi-line strings that don't need \n escaping, and less visual noise. When you're reading a 400-line CI file, that adds up.

YAML's classic footguns

Most YAML bugs come from the same handful of gotchas. If you use YAML, memorize these:

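  • The Norway problem. In YAML 1.1, unquoted NO, no, off, and false all become the boolean false. The 1.1 spec also lists N, though some parsers (PyYAML, for one) leave single letters as strings. If you have a list of country codes and one of them is Norway, quote every value.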
  • Version numbers eaten by numeric parsing. version: 1.10 parses as the number 1.1, not the string "1.10". Quote it: version: "1.10".
  • Leading zeros. id: 0755 parses as octal (493 decimal) in YAML 1.1. YAML 1.2 fixes this but many parsers still ship 1.1.
  • Time-like strings. 12:30 becomes the integer 750 (12 * 60 + 30) in YAML 1.1, which reads colon-separated digits as base 60; 5:30:00 becomes 19800 (5 * 3600 + 30 * 60). Quote it.
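  • Indentation drift. YAML forbids tabs for indentation, so a stray tab is a parse error. The quieter failure is a block that ends up one level too deep or too shallow after you swap between 2- and 4-space indentation: the file still parses, but the structure has changed.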

Rule of thumb: quote every string that looks like it could be interpreted as something else. Belt and braces.
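The rule of thumb can be turned into a rough lint check. The sketch below is a hypothetical helper, not part of any YAML library: needs_quoting and its patterns are assumptions that approximate the YAML 1.1 resolution rules for booleans, nulls, and numbers. It deliberately errs on the side of flagging too much, and it is not exhaustive (hex and timestamps are left out).

```python
import re

# Words a YAML 1.1 parser may resolve to a boolean or null.
# Compared case-insensitively, which is stricter than the spec requires.
BOOL_WORDS = {"y", "n", "yes", "no", "true", "false", "on", "off"}
NULL_WORDS = {"", "~", "null"}

# Approximations of the YAML 1.1 numeric forms (assumed, not exhaustive).
NUMERIC_PATTERNS = [
    re.compile(r"[-+]?(0|[1-9][0-9_]*)"),                         # decimal integer
    re.compile(r"[-+]?0[0-7_]+"),                                 # octal, e.g. 0755
    re.compile(r"[-+]?[0-9][0-9_]*\.[0-9_]*([eE][-+]?[0-9]+)?"),  # float, e.g. 1.10
    re.compile(r"[-+]?[1-9][0-9_]*(:[0-5]?[0-9])+"),              # base 60, e.g. 12:30
]


def needs_quoting(value: str) -> bool:
    """Return True if a YAML 1.1 parser might not read `value` as a plain string."""
    if value.lower() in BOOL_WORDS | NULL_WORDS:
        return True
    return any(pattern.fullmatch(value) for pattern in NUMERIC_PATTERNS)


# The footguns from the list above are flagged; ordinary strings are not.
flagged = [v for v in ["NO", "SE", "1.10", "0755", "12:30", "v1.10"] if needs_quoting(v)]
```

Here flagged ends up holding NO, 1.10, 0755, and 12:30, while SE and v1.10 pass through. Quoting everything is still the simpler policy; a check like this is mainly useful for auditing files that already exist.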

JSON's rough edges

  • No comments. There is no way to comment JSON. Use _comment keys, use JSONC (with a tolerant parser), or switch to YAML for the file.
  • Trailing commas are illegal. Copy-pasting a block and forgetting to remove the trailing comma breaks the whole document.
  • Everything is on one visual line if you don't format it. The JSON formatter exists for exactly this reason.
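  • Integers lose precision above 2⁵³ in many consumers. The JSON grammar puts no limit on number size, but JavaScript and other parsers that store numbers as 64-bit floats can't represent 64-bit IDs exactly. Serialize large IDs as strings.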
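Two of these rough edges are easy to reproduce with Python's standard json module. This is a small sketch rather than a test suite: parse_like_javascript and is_valid_json are illustrative names, and passing parse_int=float is an assumption used here to imitate a consumer that stores every number as a 64-bit float.

```python
import json


def parse_like_javascript(text: str):
    """Parse JSON but store integers as 64-bit floats, imitating a double-based consumer."""
    return json.loads(text, parse_int=float)


def is_valid_json(text: str) -> bool:
    """Return True if the standard-library parser accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


big_id = 2**53 + 1  # first integer a 64-bit float cannot represent exactly

# Sent as a number, the ID is silently rounded to a neighbouring value.
as_number = parse_like_javascript(json.dumps({"id": big_id}))
number_survived = int(as_number["id"]) == big_id

# Sent as a string, the ID arrives intact.
as_string = parse_like_javascript(json.dumps({"id": str(big_id)}))
string_survived = int(as_string["id"]) == big_id

# A trailing comma makes the whole document invalid.
trailing_comma_ok = is_valid_json('{"a": 1,}')
```

With these inputs, number_survived is False, string_survived is True, and trailing_comma_ok is False.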

Converting between them

Conversion is lossless in one direction, because YAML can represent any JSON document, and mostly lossless in the other. Going from YAML to JSON drops comments (JSON has nowhere to put them) and expands YAML-only features like anchors and aliases into plain copies. In practice, our converters cover both directions:

  • YAML to JSON — for shipping a config into an API.
  • JSON to YAML — for turning an API response into an editable config.
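To make the JSON-to-YAML direction concrete, here is a minimal standard-library sketch. It is not how the converters above are implemented; to_yaml is a hypothetical function that handles only JSON-compatible values (dicts, lists, strings, numbers, booleans, None). It leans on the fact that a JSON string literal is also a valid YAML double-quoted scalar, so every key and string comes out quoted.

```python
import json


def to_yaml(value, indent=0):
    """Render JSON-compatible data as block-style YAML lines, quoting every string."""
    pad = "  " * indent
    if isinstance(value, dict) and value:
        lines = []
        for key, item in value.items():
            if isinstance(item, (dict, list)) and item:
                # Non-empty containers go on the following lines, one level deeper.
                lines.append(f"{pad}{json.dumps(str(key))}:")
                lines.extend(to_yaml(item, indent + 1))
            else:
                lines.append(f"{pad}{json.dumps(str(key))}: {json.dumps(item)}")
        return lines
    if isinstance(value, list) and value:
        lines = []
        for item in value:
            if isinstance(item, (dict, list)) and item:
                # Start the nested block on the same line as the dash.
                nested = to_yaml(item, indent + 1)
                lines.append(f"{pad}- {nested[0].lstrip()}")
                lines.extend(nested[1:])
            else:
                lines.append(f"{pad}- {json.dumps(item)}")
        return lines
    # Scalars and empty containers: the JSON literal is already valid YAML.
    return [f"{pad}{json.dumps(value)}"]


api_response = json.loads('{"country": "NO", "version": "1.10", "ports": [80, 443]}')
yaml_text = "\n".join(to_yaml(api_response))
```

Because the strings stay quoted, NO and 1.10 come back as strings when the file is read again, which is the quoting advice from the footguns section applied mechanically. A real converter would usually emit unquoted scalars where that is safe, so treat this output as the cautious end of the range.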

What about TOML?

TOML is the third option — used by pyproject.toml, Cargo.toml, Hugo config. It's more structured than YAML (no significant indentation, no Norway problem) and more readable than JSON. If you're picking a config format for a new tool today, TOML is worth a look. For any format that already has an ecosystem (Kubernetes, GitHub Actions, Compose), just use the one they expect.

Bottom line

Don't pick a format based on taste. Pick it based on who's writing the file. Humans get YAML (with everything quoted). Machines get JSON. If you have to move between them, that's a two-second conversion — not a reason to standardize on the wrong format.
