JSON Schema Generator
Generate JSON Schema from sample JSON, choose draft and inference rules, and review property ledgers before copying a validator-ready draft.
Introduction:
JSON moves through many systems before anyone notices that a field changed shape. A payment webhook may add a new status, an API response may omit a nested object, or a configuration file may start accepting an array where a single string used to appear. JSON Schema gives that moving data a written contract: the expected keys, value types, array shapes, allowed extras, and descriptive hints that validators and humans can inspect.
A schema is most useful when it sits between examples and enforcement. Examples show what already happened in real payloads. A schema states what should be accepted next time. The hard part is that one sample can prove a field exists in that sample, but it cannot prove every future value, every optional field, or every business rule. Sample-derived schemas are therefore best treated as a first draft that shortens the blank-page work, not as a finished contract.
| Sample clue | Helpful signal | Common overread |
|---|---|---|
| "quantity": 2 | The value is numeric and has no fractional part in the sample. | Assuming all valid quantities are positive integers without checking the real rules. |
| "status": "paid" | The field may belong to a small set of status strings. | Locking an enum before collecting cancelled, refunded, or pending cases. |
| A key appears in every object | The field looks mandatory for the observed examples. | Rejecting partial updates, error responses, or older records that legitimately omit it. |
| A string looks like a date or email address | A format annotation may help validators and documentation. | Believing the annotation proves the address exists or that every validator will enforce it. |
Different teams use JSON Schema for different jobs. API producers use it to publish request or response expectations. Consumers use it to catch breaking payload changes before code runs. Data engineers use it to document event streams and files whose shape is easy to lose in a large system. In all of those cases, the draft version matters because the same idea can be written with different keywords across drafts.
The safest workflow starts with representative examples, then adds human judgment. Decide which fields are truly required, whether extra object keys should be accepted, how nulls should behave, and which small value sets are real business domains. After that, test the schema with valid examples, expected failures, and future-looking cases that were not present in the original sample.
How to Use This Tool:
Use the generator to turn one representative JSON value into a reviewable schema, then inspect the generated rules before sharing or enforcing them.
- Enter a Schema title that describes the payload, event, object, or configuration file. The title is placed in the root schema and used for downloaded filenames.
- Choose Schema draft. Draft 2020-12 is the newest option offered here; choose Draft 2019-09 or Draft 7 when the validator that will read the schema requires that draft.
- Paste one valid JSON value into Sample JSON, drag text into the textarea, browse for a JSON/TXT file, or use Load sample for a starting example. Use Format after pasting if you want the source cleaned up before review.
- Set Required properties. The Require keys present in every sampled object option is safer for arrays of similar objects, because a key that is missing from even one object stays optional.
- Set Unknown object keys. Leaving additionalProperties omitted keeps validator defaults open, while setting it to false creates a closed object contract that rejects unlisted keys.
- Open Advanced when the sample needs different array or null behavior. Use merged arrays for lists of similar records, tuple arrays for fixed positions, nullable null handling for fields that can be null, and strict null handling when null should stand alone as its own type.
- Turn on Detect string formats, Include observed bounds, Include sample examples, or Add description stubs only when those annotations help the next reviewer. Sample examples can copy real values into the schema, so review them before sharing.
- Read Generated Schema, then check Property Ledger and Inference Notes. If the source cannot be parsed, fix the line or column named in the alert before trusting any output.
Interpreting Results:
The most important result is not the formatted schema alone. Use Property Ledger to see each observed path, inferred type, required status, constraints, and sample evidence. Use Inference Notes to find places where the generator needs human review, such as inferred enums, mixed types, ignored nulls, observed bounds, empty arrays, or copied examples.
A generated schema can be too strict or too loose. Required keys, closed objects, tuple arrays, sample-derived bounds, and small enums can all reject valid future data when the input sample is narrow. Loose settings can accept data that looks structurally valid but violates business rules that never appear in JSON alone.
- Trust type and path discovery as a starting map of the sample, not as proof of the final contract.
- Check every Review-level and Warning-level note before using the schema with production validators.
- Test the copied Generated Schema with at least one valid payload, one intentionally invalid payload, and one realistic edge case.
- Treat format, examples, and description text as documentation unless your validator is configured to enforce or use those annotations.
Technical Details:
JSON Schema separates structural assertions from descriptive annotations. Assertions such as type, required, enum, minimum, maxLength, items, and additionalProperties can change whether an instance validates. Annotations such as title, description, examples, and common format usage help documentation and tooling, but they do not always fail invalid data unless the chosen validator treats them as assertions.
Sample inference works by walking one parsed JSON value and grouping observed values by path. Object keys become property candidates. Arrays are either combined into one item schema or described by position. Primitive values become JSON Schema types, with integers separated from decimal numbers because the validation vocabulary includes an integer type for numbers with no fractional part.
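The walk described above can be sketched in a few lines of Python. This is a minimal illustration, not the generator's actual code: the function name infer_schema is hypothetical, arrays are simplified by listing distinct item schemas under anyOf rather than merging objects property by property, and a float with no fractional part (such as 2.0) is assumed to count as an integer.

```python
import json
from typing import Any


def infer_schema(value: Any) -> dict:
    """Infer a minimal schema fragment from one parsed JSON value."""
    if value is None:
        return {"type": "null"}
    # bool must be tested before int: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "integer"}
    if isinstance(value, float):
        # Assumption: numbers without a fractional part count as integers.
        return {"type": "integer" if value.is_integer() else "number"}
    if isinstance(value, str):
        return {"type": "string"}
    if isinstance(value, list):
        # Simplified merge: keep each distinct item schema once.
        distinct: list = []
        for item in value:
            schema = infer_schema(item)
            if schema not in distinct:
                distinct.append(schema)
        if not distinct:
            return {"type": "array"}  # an empty array says nothing about items
        items = distinct[0] if len(distinct) == 1 else {"anyOf": distinct}
        return {"type": "array", "items": items}
    if isinstance(value, dict):
        return {
            "type": "object",
            "properties": {key: infer_schema(sub) for key, sub in value.items()},
        }
    raise TypeError(f"not a JSON value: {type(value).__name__}")


sample = json.loads('{"id": 7, "total": 19.99, "tags": ["new", "gift"], "coupon": null}')
print(json.dumps(infer_schema(sample), indent=2))
```

The sketch deliberately omits required, bounds, enums, and formats, so the output shows only the path and type discovery step.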
Transformation Core:
| Inference area | Generated keyword pattern | Review question |
|---|---|---|
| Draft target | The root $schema URI follows the selected draft, with optional $id when supplied. | Will the downstream validator read the same draft and keyword meanings? |
| Objects | Observed keys become properties; required is added from the selected required-property policy. | Are missing keys really invalid, or did the sample only show one response shape? |
| Unknown keys | additionalProperties is omitted, set to false, or set to true. | Should new producer fields be allowed without breaking consumers? |
| Arrays | Merge mode creates one items schema from all items; tuple mode creates position-specific schemas. | Is the array a list of similar records or a fixed-position value such as coordinates? |
| Nulls | Nullable mode adds null to another type, strict mode keeps null as its own type, and ignore mode drops null when other values exist. | Does null mean an allowed empty value, an unknown value, or a missing-data accident? |
| Mixed non-null types | Different observed types at the same path become anyOf. | Is a mixed field intentional, or should separate schemas or a discriminator be used? |
| Small repeated values | Enum detection can add enum for small repeated string or primitive sets under the selected limit. | Is the value set truly finite, or can new values appear later? |
| Observed bounds | Optional bounds add minLength, maxLength, minimum, maximum, minItems, and maxItems. | Do sample minima and maxima reflect real limits, or only the current examples? |
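The observed-bounds row can be illustrated with a short Python sketch. The helper name observed_bounds is hypothetical and the logic is an assumption about how such bounds could be derived; it only shows why the resulting limits describe the sample and nothing more.

```python
from typing import Any


def observed_bounds(values: list[Any]) -> dict:
    """Return bound keywords for samples that all share one JSON type."""
    if not values:
        return {}
    if all(isinstance(v, str) for v in values):
        lengths = [len(v) for v in values]
        return {"minLength": min(lengths), "maxLength": max(lengths)}
    if all(isinstance(v, list) for v in values):
        sizes = [len(v) for v in values]
        return {"minItems": min(sizes), "maxItems": max(sizes)}
    # Exclude bool explicitly, because Python treats True and False as ints.
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        return {"minimum": min(values), "maximum": max(values)}
    return {}  # mixed or unsupported types: emit no bounds


print(observed_bounds(["ab", "abcd"]))  # {'minLength': 2, 'maxLength': 4}
print(observed_bounds([3, 1.5, 10]))    # {'minimum': 1.5, 'maximum': 10}
```

Two sampled strings of length 2 and 4 would reject a valid five-character value, which is why these bounds belong on the review list.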
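Tuple arrays are the clearest draft-sensitive case. In Draft 2020-12 and Draft 2019-09 output from this generator, position-specific tuple entries use prefixItems with items: false to reject additional positions. In Draft 7 output, the same intent uses array-form items with additionalItems: false. Note that prefixItems was introduced in the 2020-12 specification; Draft 2019-09 itself still defines tuples with array-form items and additionalItems, so confirm that a strict 2019-09 validator handles the generated tuple as intended.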
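The two spellings of a closed tuple can be compared side by side. The following Python sketch uses a hypothetical helper, tuple_schema, and hypothetical draft labels ("2020-12" and "draft-07"); it covers only those two drafts and is not the generator's implementation.

```python
def tuple_schema(position_schemas: list[dict], draft: str) -> dict:
    """Build a closed tuple schema for the given (hypothetical) draft label."""
    if draft == "2020-12":
        return {
            "type": "array",
            "prefixItems": position_schemas,  # one schema per position
            "items": False,                   # reject extra positions
        }
    if draft == "draft-07":
        return {
            "type": "array",
            "items": position_schemas,        # array form means positional
            "additionalItems": False,         # reject extra positions
        }
    raise ValueError(f"unsupported draft label: {draft}")


coordinate = [{"type": "number"}, {"type": "number"}]
print(tuple_schema(coordinate, "2020-12"))
print(tuple_schema(coordinate, "draft-07"))
```

A validator that reads the wrong draft may interpret items differently, which is why the draft choice should follow the validator rather than the other way around.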
String format detection checks common shapes such as email addresses, URIs, date-time strings, dates, times, UUIDs, IPv4 addresses, and IPv6 addresses. A shape match does not prove ownership, reachability, timezone intent, or business validity. JSON Schema Draft 2020-12 also separates format annotation from format assertion, so validators can differ in how strongly they enforce format.
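A shape check of this kind might look like the Python sketch below. The function guess_format and both regular expressions are illustrative assumptions that cover only a few of the listed formats; they test shape alone, so a string such as 2024-13-45 still looks like a date.

```python
import ipaddress
import re
import uuid
from typing import Optional

# Deliberately loose patterns: they check shape, not validity.
DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def guess_format(text: str) -> Optional[str]:
    """Return a JSON Schema format name when the string has a known shape."""
    if DATE_RE.fullmatch(text):
        return "date"
    if EMAIL_RE.fullmatch(text):
        return "email"
    try:
        # Accept only the canonical hyphenated form of a UUID.
        if str(uuid.UUID(text)) == text.lower():
            return "uuid"
    except ValueError:
        pass
    try:
        address = ipaddress.ip_address(text)
    except ValueError:
        return None
    return "ipv4" if address.version == 4 else "ipv6"


for candidate in ["ada@example.com", "2024-05-01", "192.168.0.1", "::1", "hello"]:
    print(candidate, "->", guess_format(candidate))
```

Whether a validator later rejects a value that fails the format depends on its configuration, so the detected name is best read as a documentation hint.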
Review Boundaries:
| Schema signal | What it supports | What still needs a human decision |
|---|---|---|
| required | A key appeared often enough under the selected policy. | Whether create, update, error, or legacy payloads may omit it. |
| enum | The sample contains a small repeated set of primitive values. | Whether the value set is governed by product rules outside the sample. |
| anyOf | More than one non-null type appeared at the same path. | Whether mixed types are allowed or indicate inconsistent upstream data. |
| additionalProperties: false | Unknown object keys should be rejected by validators that honor the keyword. | Whether rejecting forward-compatible fields is acceptable for this contract. |
| Bounds from examples | The observed length, numeric, or item-count range is documented. | Whether real limits come from product rules, database limits, or protocol specs. |
Generated schemas do not infer every JSON Schema feature. Conditional rules, cross-field dependencies, references to shared definitions, regular-expression property names, unevaluated properties, and semantic constraints such as "end date must be after start date" require manual design. Add those rules after the sample-derived structure is correct.
Advanced Tips:
- Match Schema draft to the validator that will read the output. Tuple arrays use prefixItems in newer drafts, while Draft 7 uses array-form items with additionalItems: false.
- Use Require keys present in every sampled object when an array contains several records. It avoids marking a field required just because one object happened to include it.
- Set Unknown object keys to false only when forward-compatible producer changes should fail validation. Leaving the keyword omitted is usually safer while the contract is still being drafted.
- Switch Array inference to tuple mode only for fixed-position values such as coordinates, RGB triples, or protocol rows. Keep merged arrays for ordinary lists of similar objects.
- Keep Enum detection off for open text fields and raise Enum value limit only for real status or code domains. Small samples can make ordinary strings look finite.
- Review privacy before enabling Include sample examples or copying the full JSON report. Example annotations and exported ledgers can carry real IDs, emails, tokens, or customer values.
Privacy Notes:
The generator reads pasted text and selected files in the browser for schema creation. That keeps sample parsing local to the page, but the schema you copy or download may still contain sensitive values if Include sample examples is on.
- Remove real customer data, tokens, internal IDs, and secrets before saving or sharing a schema.
- Use description stubs as placeholders only; replace them with reviewed wording before publishing a contract.
- Remember that copied JSON output includes settings, stats, the schema, the property ledger, and inference notes.
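If a schema was generated with sample examples enabled, one way to scrub it before sharing is to drop every examples annotation. The Python sketch below is an assumption about how that could be done outside the tool; strip_examples is a hypothetical helper, and it does not touch enum, const, or default values, which can also carry real data.

```python
from typing import Any


def strip_examples(schema: Any) -> Any:
    """Return a copy of a schema with every `examples` annotation removed."""
    if isinstance(schema, list):
        return [strip_examples(item) for item in schema]
    if not isinstance(schema, dict):
        return schema
    cleaned = {}
    for key, value in schema.items():
        if key == "examples":
            continue  # drop the annotation and its copied sample values
        if key == "properties" and isinstance(value, dict):
            # Keys here are property names, not keywords, so keep all of
            # them, even a property that happens to be called "examples".
            cleaned[key] = {name: strip_examples(sub) for name, sub in value.items()}
        else:
            cleaned[key] = strip_examples(value)
    return cleaned


schema = {
    "type": "object",
    "properties": {
        "email": {"type": "string", "examples": ["ada@example.com"]},
    },
}
print(strip_examples(schema))
# {'type': 'object', 'properties': {'email': {'type': 'string'}}}
```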
Worked Examples:
An order event with id, type, created_at, customer.email, and two items entries produces a Generated Schema with an object root and nested properties. With Require keys present in every sampled object, a key that appears in both item objects is marked required, while a key missing from one item stays optional in Property Ledger.
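The every-object policy in this example amounts to a set intersection over the keys of the sampled objects. A minimal Python sketch, with a hypothetical function name and hypothetical item fields (sku, quantity, gift_note):

```python
def required_in_every_object(objects: list[dict]) -> list[str]:
    """Return keys present in all sampled objects, sorted for stable output."""
    if not objects:
        return []
    common = set(objects[0])
    for obj in objects[1:]:
        common &= set(obj)  # a key missing from any object drops out
    return sorted(common)


items = [
    {"sku": "A-1", "quantity": 2, "gift_note": "Happy birthday"},
    {"sku": "B-7", "quantity": 1},
]
print(required_in_every_object(items))  # ['quantity', 'sku']
```

Here gift_note stays optional because the second item omits it, while sku and quantity are candidates for required.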
A coordinate value such as [40.7128, -74.0060] reads better with tuple array inference. In Draft 2020-12, Generated Schema uses prefixItems for the first and second positions and items: false for extra positions. If the same sample is treated as a merged list, both positions are blended into one item schema and the latitude/longitude meaning is lost.
A status sample that repeats "created", "paid", and "paid" may produce an enum when enum detection is enabled. Inference Notes flags the enum for review because future values such as "cancelled" or "refunded" may be valid even though they were absent from the sample.
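One plausible enum heuristic, sketched in Python: propose an enum only when at least one value repeats and the number of distinct values stays within a limit. The function name, the default limit of 5, and the exact rule are assumptions for illustration, not the generator's documented behavior.

```python
from typing import Optional


def detect_enum(values: list[str], limit: int = 5) -> Optional[list[str]]:
    """Return an enum candidate when values repeat and stay within the limit."""
    distinct = list(dict.fromkeys(values))  # de-duplicate, keep first-seen order
    has_repeat = len(distinct) < len(values)
    if has_repeat and len(distinct) <= limit:
        return distinct
    return None


print(detect_enum(["created", "paid", "paid"]))  # ['created', 'paid']
print(detect_enum(["alice", "bob", "carol"]))    # None
```

The first call shows the risk named above: the candidate enum would reject "cancelled" or "refunded" simply because the sample never contained them.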
A pasted object with a trailing comma fails before schema generation. The alert names the parse problem, and Generated Schema shows an error object instead of a usable contract. Fix the JSON source first, then recheck Property Ledger and Inference Notes after the sample parses cleanly.
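The same kind of failure can be reproduced with Python's standard json module, which reports a line and column for the first parse problem. This is an illustrative sketch with a hypothetical helper name; the tool runs in the browser, and its alert wording will differ.

```python
import json
from typing import Any, Optional, Tuple


def parse_sample(text: str) -> Tuple[Any, Optional[str]]:
    """Parse one JSON value; return (value, None) or (None, problem)."""
    try:
        return json.loads(text), None
    except json.JSONDecodeError as err:
        # lineno and colno are 1-based positions of the reported problem.
        return None, f"line {err.lineno}, column {err.colno}: {err.msg}"


value, problem = parse_sample('{"id": 1,}')  # trailing comma is not valid JSON
print(problem)
```

The exact message and column depend on the parser and its version, so treat the position as a pointer to where to start looking rather than a precise diagnosis.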
FAQ:
Can one sample produce a complete JSON Schema?
No. One sample can reveal paths, types, and obvious repeated patterns, but final required fields, enums, bounds, and extra-key rules should be checked against more examples and the validator that will enforce the schema.
Why did the schema include anyOf?
anyOf appears when different non-null types are observed at the same path. Review it carefully because the real contract may need one allowed type, separate schemas, or a clearer discriminator field.
Should I close objects with additionalProperties: false?
Use it when unlisted object keys should fail validation. Leave it omitted while drafting or when producers may add fields that consumers should ignore safely.
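The effect of closing an object can be previewed without a full validator. The Python sketch below is a narrow check written for illustration: unknown_keys is a hypothetical helper that looks only at properties and a literal additionalProperties: false, and it ignores patternProperties and schema-valued additionalProperties.

```python
def unknown_keys(instance: dict, schema: dict) -> list[str]:
    """List instance keys that a closed object schema would reject."""
    if schema.get("additionalProperties", True) is not False:
        return []  # keyword omitted, true, or a schema: not checked here
    declared = set(schema.get("properties", {}))
    return sorted(key for key in instance if key not in declared)


closed = {
    "type": "object",
    "properties": {"id": {"type": "integer"}},
    "additionalProperties": False,
}
open_schema = {"type": "object", "properties": {"id": {"type": "integer"}}}
payload = {"id": 1, "trace_id": "abc"}

print(unknown_keys(payload, closed))       # ['trace_id']
print(unknown_keys(payload, open_schema))  # []
```

A producer adding trace_id would break consumers of the closed schema but pass unnoticed under the open one.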
Why are dates and emails still strings?
JSON has strings, numbers, booleans, arrays, objects, and null. Date-like and email-like values remain strings; with format detection on, they can receive format annotations that some validators enforce only when configured to do so.
What should I do when the tool says the sample JSON is invalid?
Fix the parse error named in the alert before reading the schema. Common causes include trailing commas, unquoted object keys, single-quoted strings, comments, or pasting several JSON values instead of one valid value.
Glossary:
- JSON Schema: A JSON-based contract for describing and validating JSON values.
- Schema draft: A version of the JSON Schema specification that controls keyword names and behavior.
- required: An object keyword listing properties that must be present for validation to pass.
- additionalProperties: An object keyword that controls keys not listed in properties.
- prefixItems: A tuple-array keyword for position-specific item schemas in newer drafts.
- anyOf: A composition keyword that accepts a value when it matches at least one listed schema.
- format: A string annotation for shapes such as email, URI, date-time, UUID, IPv4, and IPv6.
References:
- JSON Schema Draft 2020-12, JSON Schema, 16 June 2022.
- JSON Schema Validation: A Vocabulary for Structural Validation of JSON, JSON Schema, 16 June 2022.
- JSON Schema Array Reference, JSON Schema.
- JSON Schema Object Reference, JSON Schema.
- JSON Schema Type-specific Keywords, JSON Schema.