Test Data Generator inputs
Define one field per line as name:type:options. Types include sequence, uuid, name, email, city, date, integer, number, money, boolean, choice, sku, url, and word.
Use enough rows to test pagination, import validation, and common empty-state handling.
Choose the file shape you want to paste into tests, fixtures, seeds, or import tools.
Use the destination fixture, seed, or staging table name.
Change this when you want a different but repeatable dataset.
PostgreSQL and SQLite use double quotes; MySQL uses backticks.
Use fixed dates so repeated seeds stay stable over time.
Use reserved or internal test domains rather than production customer domains.
Keep at 0 for clean fixtures; use small rates to test optional-field handling.
Field-level options such as id:sequence:start=1001,step=5 override this.
Only affects JSON array output. NDJSON stays one JSON object per line.
Include CSV header row
Only affects CSV output. Other formats always include field names.

Introduction

Synthetic test data replaces copied production rows with fabricated records that keep the shape a system expects. Fixtures, seed rows, import samples, and staging records need enough variety to exercise parsers, forms, validators, and pagination without dragging real customer data into routine development work.

A useful test dataset is predictable enough to repeat and varied enough to expose ordinary defects. Row counts need to match the workflow being tested, field names need to match the importer or database table, and values need to resemble the type of data the code handles. Reproducibility matters because a failing test is easier to debug when the same recipe and seed recreate the same row values.

Figure: Field recipe, seed, and row count flowing into a repeatable synthetic dataset with quality checks.

Synthetic data is not the same as anonymized production data. A hand-written recipe can avoid real people, emails, and accounts, but copied business values, rare categories, or sensitive examples can still leak meaning if they were taken from a real system. Treat fixture recipes as code: review them, keep them small enough to understand, and use reserved domains or internal test domains for generated contact fields.

The goal is fit-for-purpose coverage, not perfect realism. A checkout import smoke test may need twelve customer rows with dates, plan names, money values, and blank optional fields. A pagination test may need a larger row count. A database seed may need SQL inserts with stable identifiers. The same generator settings should let the team recreate those rows whenever the test needs to run again.

Technical Details:

A recipe-driven test dataset starts with a compact field list. Each non-comment recipe line names a field, a value type, and optional type settings in the form name:type:options. Blank lines and lines beginning with # are ignored, so a recipe can stay readable while still producing a rectangular dataset.
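
The line format above can be illustrated with a minimal Python sketch. It is not the generator's own parser: the `FieldSpec` and `parse_recipe` names are hypothetical, and splitting on the first two colons only is an assumption made so that option text may itself contain colons.

```python
from typing import List, NamedTuple


class FieldSpec(NamedTuple):
    name: str
    type: str
    options: str


def parse_recipe(text: str) -> List[FieldSpec]:
    """Parse name:type:options lines, skipping blank lines and # comments."""
    fields = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # ignored, as described above
        # Assumption: split on the first two colons only.
        parts = line.split(":", 2)
        name = parts[0].strip()
        type_ = parts[1].strip() if len(parts) > 1 else ""
        options = parts[2].strip() if len(parts) > 2 else ""
        fields.append(FieldSpec(name, type_, options))
    return fields


recipe = """
# customer fixture
customer_id:sequence:start=1001
email:email

plan:choice:Starter|Team|Enterprise
"""
for field in parse_recipe(recipe):
    print(field)
```

Each active line yields exactly one field, which is what keeps the resulting dataset rectangular.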

Repeatability comes from a seeded pseudo-random sequence. The generated rows are tied to the seed text, the recipe source lines, and the row count. Changing any of those inputs changes the generated values; keeping all three the same recreates the same dataset. That behavior is useful for regression tests because a failure can be reproduced without saving a copy of the generated rows.
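
One way to picture this behavior is to hash the three inputs into a single number that seeds a pseudo-random generator. The sketch below assumes SHA-256 and Python's `random.Random`; the tool's actual hashing and generator are not described here, and `make_rng` and `sample` are hypothetical names.

```python
import hashlib
import random
from typing import List


def make_rng(seed: str, recipe: str, rows: int) -> random.Random:
    """Derive a deterministic generator from seed text, recipe source, and row count."""
    # Assumption: join the inputs with a separator that cannot occur in normal text.
    material = "\x1f".join([seed, recipe, str(rows)]).encode("utf-8")
    digest = hashlib.sha256(material).digest()
    return random.Random(int.from_bytes(digest[:8], "big"))


def sample(seed: str, recipe: str, rows: int) -> List[int]:
    """Generate one illustrative integer per row from the derived generator."""
    rng = make_rng(seed, recipe, rows)
    return [rng.randrange(10**9) for _ in range(rows)]


recipe = "id:sequence\nscore:integer"
first = sample("checkout-fixture-v1", recipe, 20)
again = sample("checkout-fixture-v1", recipe, 20)
other = sample("checkout-fixture-v2", recipe, 20)
print(first == again)  # same inputs recreate the same values
print(first == other)  # a different seed gives a different stream
```

Because the rows can be rebuilt from the seed, recipe, and row count, a test only needs to record those three inputs rather than the generated data.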

Generation Core:

How the test data generator turns recipe settings into output rows

| Stage | Rule | Result to Check |
| --- | --- | --- |
| Recipe parsing | Each active line is split into field name, type, and options. Duplicate names are renamed with a numeric suffix, and unsafe field-name characters become underscores. | Field Recipe shows the normalized field name, type badge, options, first-row example, and any note for the current rows. |
| Type selection | Known types are used directly. Aliases such as int, currency, and bool are mapped to supported types. Some missing or unknown types are inferred from field names. | Quality Ledger adds review rows when a field name is normalized, a duplicate is renamed, or a type is inferred. |
| Row generation | The seeded sequence creates values for 1 to 500 rows. Non-sequence fields can receive blanks from the global blank rate or from a field-level blank= option. | The summary reports row count, field count, blank count, selected format, and seed. |
| Output shaping | The same generated row objects can be rendered as CSV, JSON array, newline-delimited JSON, or SQL INSERT statements. | Synthetic Dataset is the copy-ready dataset, and JSON shows the full run payload with params, fields, rows, warnings, and errors. |
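
The type-selection stage can be sketched as a lookup with an alias table. The three aliases below are the ones named in the table; the supported set is taken from the input hint and may be incomplete, lowercasing is an assumption, and name-based inference is left as a placeholder because its rules are not described here. `resolve_type` is a hypothetical name.

```python
from typing import Optional, Tuple

# Assumption: type names as listed in the input hint; the real list may be longer.
SUPPORTED = {
    "sequence", "uuid", "name", "email", "city", "date", "integer",
    "number", "money", "boolean", "choice", "sku", "url", "word",
}
ALIASES = {"int": "integer", "currency": "money", "bool": "boolean"}


def resolve_type(raw: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (type, note); the note is set for a mapped alias or an unknown type."""
    key = raw.strip().lower()
    if key in SUPPORTED:
        return key, None
    if key in ALIASES:
        return ALIASES[key], f"alias {key!r} mapped to {ALIASES[key]!r}"
    # Placeholder: inference from the field name would happen at this point.
    return None, f"unknown type {raw!r}"


print(resolve_type("uuid"))
print(resolve_type("currency"))
print(resolve_type("blob"))
```

Returning a note alongside the type mirrors the idea that mapped or inferred types should surface as review rows instead of passing silently.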

The supported value types cover common fixture needs rather than full domain modeling. Sequence and UUID values are useful for identifiers; name, first name, last name, email, phone, company, city, country, URL, and IPv4 values cover typical demo records; date and datetime values use the configured date range; integer, number, money, boolean, choice, word, sentence, and SKU values cover structured fields and simple labels.

Options and Boundaries:

Input and validation boundaries for generated test data

| Area | Accepted Range or Pattern | Practical Effect |
| --- | --- | --- |
| Rows | 1 to 500 | Small enough for browser-side generation while still covering pagination, import, and fixture cases. |
| Fields | At least one active line, up to 24 parsed fields | Empty recipes and oversized recipes block output until the field list is fixed. |
| Recipe file | Plain text-style recipe files up to 128 KB | Loaded file text replaces the recipe editor content for browser-side parsing. |
| Blank rate | 0% to 100%, excluding sequence fields | Blank values become empty CSV cells, null in JSON and NDJSON, and NULL in SQL. |
| Numeric decimals | 0 to 6 for number and money fields | Money defaults to two decimals, and min/max options can be written per field. |
| Sequence start | Default start or field-level start= with optional step= | Stable identifiers can begin at a known value such as 1001 or advance by a custom step. |
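
The limits in the table can be expressed as a small pre-flight check. This is a sketch of the idea and not the tool's validator: `check_limits` and the constant names are hypothetical, and most of the message wording is invented for illustration.

```python
from typing import List

MAX_ROWS = 500
MAX_FIELDS = 24
MAX_RECIPE_BYTES = 128 * 1024
MAX_DECIMALS = 6


def check_limits(rows: int, field_count: int,
                 recipe_bytes: int = 0, decimals: int = 2) -> List[str]:
    """Return a list of blocking problems; an empty list means the inputs fit."""
    errors = []
    if not 1 <= rows <= MAX_ROWS:
        errors.append(f"Rows must be between 1 and {MAX_ROWS}.")
    if field_count < 1:
        errors.append("Add at least one field line before generating data.")
    elif field_count > MAX_FIELDS:
        errors.append(f"Reduce the recipe to {MAX_FIELDS} fields or fewer.")
    if recipe_bytes > MAX_RECIPE_BYTES:
        errors.append("Recipe file is larger than 128 KB.")
    if not 0 <= decimals <= MAX_DECIMALS:
        errors.append(f"Decimals must be between 0 and {MAX_DECIMALS}.")
    return errors


print(check_limits(rows=12, field_count=5))
print(check_limits(rows=501, field_count=0))
```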

SQL output is value-oriented. It writes INSERT statements for the chosen table name, quotes identifiers for PostgreSQL and SQLite with double quotes, and uses backticks for MySQL. It does not infer database column types, create tables, define indexes, or validate foreign-key relationships.
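
A value-oriented INSERT renderer can be sketched in a few lines. The quoting characters follow the description above; doubling embedded quote characters is the standard escaping rule, while the `TRUE`/`FALSE` boolean literals and the function names are assumptions for illustration.

```python
from typing import Any, Dict


def quote_ident(name: str, dialect: str) -> str:
    """Quote an identifier: backticks for MySQL, double quotes otherwise."""
    if dialect == "mysql":
        return "`" + name.replace("`", "``") + "`"
    return '"' + name.replace('"', '""') + '"'


def sql_literal(value: Any) -> str:
    """Render one value; None becomes NULL and single quotes are doubled."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):
        return "TRUE" if value else "FALSE"  # assumption: boolean literal style
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"


def insert_statement(table: str, row: Dict[str, Any], dialect: str) -> str:
    """Build a single INSERT statement for one generated row."""
    cols = ", ".join(quote_ident(c, dialect) for c in row)
    vals = ", ".join(sql_literal(v) for v in row.values())
    return f"INSERT INTO {quote_ident(table, dialect)} ({cols}) VALUES ({vals});"


row = {"id": 5000, "company": "O'Neil Ltd", "support_score": None}
print(insert_statement("customers", row, "postgresql"))
print(insert_statement("customers", row, "mysql"))
```

Nothing in the sketch knows the column types of the target table, which matches the limitation stated above: the statements carry values only.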

The generator creates fabricated values from built-in lists and recipe options. It does not train on production rows, measure statistical similarity, or provide a privacy guarantee. When a fixture must represent a regulated dataset, review the recipe and outputs with the same care as any other test asset.

Everyday Use & Decision Guide:

Start with a short Field recipe that mirrors the importer, API payload, or table you need to test. Use sequence for stable IDs, choice for status or plan values, money for currency-like fields, and blank= only where missing data is a real case.

Keep Rows small for unit tests and review snapshots. Raise it when the workflow needs pagination, sorting, batch validation, or import progress checks. The 500-row limit keeps generation light enough to run in the page, so large-scale load testing still belongs in a dedicated generator or fixture pipeline.

  • Use a memorable Seed such as checkout-fixture-v1 when the same rows need to appear in CI, review notes, and local debugging.
  • Choose CSV for spreadsheet-style import tests, JSON array for API fixtures, NDJSON for log-like streams, and SQL INSERT for database seed scripts.
  • Use Email domain with a reserved or internal test domain so generated email and URL values do not point at real customer domains.
  • Check Quality Ledger after normalizing a recipe. Review rows are often caused by duplicate names, inferred types, or blank counts that are higher than expected.
  • Open Field Mix Chart when a recipe has many fields and you want a quick view of whether IDs, dates, numeric values, choices, and text fields are balanced enough for the test.
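
The difference between the text formats in the list above is easiest to see on the same two rows. The sketch below uses only Python's standard `csv` and `json` modules; the sample rows, the `example.test` domain, and the function names are illustrative assumptions and not output from the tool.

```python
import csv
import io
import json
from typing import Any, Dict, List

Row = Dict[str, Any]

rows: List[Row] = [
    {"id": 1001, "email": "a@example.test", "plan": "Team"},
    {"id": 1002, "email": None, "plan": "Starter"},  # None stands for a blank
]


def to_csv(data: List[Row], header: bool = True) -> str:
    """CSV: blanks become empty cells; the header row is optional."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(data[0]), lineterminator="\n")
    if header:
        writer.writeheader()
    writer.writerows(data)
    return buf.getvalue()


def to_json_array(data: List[Row]) -> str:
    """JSON array: one document holding every row; blanks become null."""
    return json.dumps(data, indent=2)


def to_ndjson(data: List[Row]) -> str:
    """NDJSON: one compact JSON object per line."""
    return "\n".join(json.dumps(row) for row in data)


print(to_csv(rows))
print(to_ndjson(rows))
```

A consumer that streams records line by line can read the NDJSON form without loading the whole file, while the JSON array suits fixtures that are loaded in one call.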

A generated dataset can look tidy while still missing an edge case. If the target importer rejects a nullable date, duplicate email, long SKU, or unknown status, add that case deliberately to the recipe or keep a separate hand-written fixture for the failure path.

Before copying results into a test suite, compare Synthetic Dataset with Field Recipe and Quality Ledger. The dataset shows the rows; the ledgers explain how the rows were made and what was normalized along the way.

Step-by-Step Guide:

Build the fixture from the field list first, then choose the output shape that matches the consumer.

  1. Enter one line per field in Field recipe. A line such as customer_id:sequence:start=1001 should appear in Field Recipe as a sequence field with a first-row example of 1001.
  2. Set Rows between 1 and 500. The summary should report the generated row count once the recipe has at least one valid field.
  3. Choose Output format. If you select SQL INSERT, fill SQL table name and choose the SQL dialect in Advanced before reviewing the statements.
  4. Set Seed, Date range, Email domain, Global blank rate, and Default sequence start when the defaults do not match the test case.
  5. Use Normalize when a recipe came from notes or a schema draft. If the input alert says Recipe normalized, review the listed warnings before trusting field names.
  6. Open Quality Ledger. If it reports Fix the field recipe, add a field line or reduce the field count to 24 or fewer before using the output.
  7. Copy from Synthetic Dataset only after the first row examples, blank count, repeatability seed, and artifact format match the intended fixture.
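
The first-row expectation in step 1 follows from how sequence options are read. A minimal sketch, assuming comma-separated key=value options, a default step of 1, and a default start of 1 when neither the field nor the Default sequence start setting supplies one; `parse_options` and `sequence_values` are hypothetical names.

```python
from typing import Dict, List


def parse_options(options: str) -> Dict[str, str]:
    """Turn 'start=1001,step=5' into {'start': '1001', 'step': '5'}."""
    pairs = (item.split("=", 1) for item in options.split(",") if "=" in item)
    return {key.strip(): value.strip() for key, value in pairs}


def sequence_values(options: str, rows: int, default_start: int = 1) -> List[int]:
    """Field-level start= and step= override the defaults."""
    opts = parse_options(options)
    start = int(opts.get("start", default_start))
    step = int(opts.get("step", 1))  # assumption: step defaults to 1
    return [start + i * step for i in range(rows)]


print(sequence_values("start=1001", 3))         # [1001, 1002, 1003]
print(sequence_values("start=1001,step=5", 3))  # [1001, 1006, 1011]
print(sequence_values("", 3, default_start=50)) # [50, 51, 52]
```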

Interpreting Results:

Synthetic dataset ready means the recipe parsed, the row count stayed inside the limit, and an output string was produced for the selected format. It does not mean the data matches every business rule in the system that will consume it.

The main confidence check is the Quality Ledger. Row volume tells you whether the row count fits the test size, Field coverage summarizes parsed field types, Repeatability shows the active seed, Blank cells reports nullable output, and Artifact format confirms the rendered shape.

A clean ledger does not prove realism. It means the generator found no blocking recipe error and no normalization warning for the current inputs. If the target system has stricter rules, test those rules directly with deliberate values and confirm that imported rows still pass downstream validation.

Worked Examples:

Customer import smoke test:

A recipe with customer_id:sequence:start=1001, email:email, plan:choice:Starter|Team|Enterprise, monthly_spend:money:min=12,max=480,decimals=2, and signup_date:date can generate a twelve-row CSV. Quality Ledger should show 12 rows, a seeded repeatability row, and the CSV artifact format. The result is a compact fixture for import parsing and UI review, not proof that production spending patterns are represented.

SQL seed with nullable fields:

For a staging table named customers, choose SQL INSERT, set the dialect, and use fields such as id:sequence:start=5000, company:company, support_score:integer:min=1,max=100,blank=15, and active:boolean. When blank values occur, Blank cells reports the count and the SQL output writes NULL for those cells.

Recipe cleanup before copying:

If two recipe lines both use the field name email, the parser keeps both fields by renaming the duplicate with a numeric suffix. Quality Ledger adds a Recipe warning, and Field Recipe shows the normalized field names. Rename the fields yourself when the importer expects exact column names.

Blocked field list:

A pasted recipe with no active field lines changes the summary to Recipe needs attention and reports Add at least one field line before generating data. A very wide recipe with more than 24 parsed fields also blocks output. Reduce the field list or split the test into smaller fixtures, then confirm that Synthetic dataset ready returns.

FAQ:

Why did the same seed create different rows?

The seed is combined with the recipe source and row count. Keep Field recipe, Rows, and Seed unchanged when you need the exact same dataset again.

Why did my field name change?

Field names are normalized to safe identifier-style names. Spaces and punctuation become underscores, names that start unsafely are adjusted, and duplicate names receive numeric suffixes.
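
The renaming rules can be approximated as follows. This is a sketch under assumptions: the exact set of safe characters, the leading-underscore fix for names that start with a digit, and the `_2` style of suffix are illustrative choices, and `normalize_name` and `dedupe` are hypothetical names.

```python
import re
from typing import List


def normalize_name(raw: str) -> str:
    """Replace unsafe characters with underscores and fix an unsafe first character."""
    name = re.sub(r"[^A-Za-z0-9_]", "_", raw.strip())
    if not name or name[0].isdigit():
        name = "_" + name  # assumption: how an unsafe start is adjusted
    return name


def dedupe(names: List[str]) -> List[str]:
    """Give repeated names a numeric suffix so every column name is unique."""
    used = set()
    result = []
    for name in names:
        candidate, n = name, 1
        while candidate in used:
            n += 1
            candidate = f"{name}_{n}"
        used.add(candidate)
        result.append(candidate)
    return result


raw_names = ["signup date", "email", "email", "2fa-enabled"]
print(dedupe([normalize_name(n) for n in raw_names]))
# ['signup_date', 'email', 'email_2', '_2fa_enabled']
```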

Can the SQL output create my table?

No. SQL mode writes INSERT statements for generated rows. It does not create tables, infer column types, add constraints, or check relationships against an existing database.

What does the blank rate affect?

The global blank rate applies to non-sequence fields unless a field line sets blank=. Blanks appear as empty CSV cells, null JSON values, and SQL NULL literals.
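
The precedence described in this answer can be sketched as two small functions. The names are hypothetical, options are assumed to be already parsed into a dictionary, and `None` stands in for a blank before the row is rendered to a format.

```python
import random
from typing import Any, Dict


def blank_chance(field_type: str, options: Dict[str, str], global_rate: float) -> float:
    """Return the blank probability (0.0 to 1.0) for one field."""
    if field_type == "sequence":
        return 0.0  # sequence fields are never blanked
    percent = float(options.get("blank", global_rate))  # field-level blank= wins
    return max(0.0, min(100.0, percent)) / 100.0


def maybe_blank(value: Any, chance: float, rng: random.Random) -> Any:
    """Replace the value with None when the seeded draw falls under the chance."""
    return None if rng.random() < chance else value


rng = random.Random(42)  # assumption: any seeded generator would do here
chance = blank_chance("integer", {"blank": "15"}, global_rate=0)
scores = [maybe_blank(score, chance, rng) for score in range(1, 21)]
print(chance)
print(scores.count(None), "blank cells out of", len(scores))
```

Drawing from the same seeded generator that produces the values keeps the position of blank cells repeatable along with the rest of the dataset.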

Does recipe file content get uploaded for generation?

Recipe files are read into the page for browser-side parsing, and generated values are produced in the page. The generation path described here does not include a server upload step for recipe text or generated rows.

Glossary:

Field recipe
A line-based description of generated fields using field name, type, and optional settings.
Seed
Text used to make generated rows repeatable when the recipe and row count stay the same.
Fixture
A small dataset used by tests, demos, imports, or staging workflows.
NDJSON
Newline-delimited JSON, where each row is written as one JSON object per line.
Blank rate
The percentage chance that a non-sequence generated field becomes blank for a row.
Quality Ledger
The result table that reports row volume, field coverage, repeatability, blank cells, format, warnings, and errors.
