Grok patterns turn a raw log line into named fields that downstream search, alerting, and dashboards can understand. That matters because a line that only looks readable to a person is still opaque to an ingest pipeline until timestamps, status codes, paths, host names, and message text are broken into stable captures. This generator drafts that parsing pattern from one representative log entry and shows how much of the result maps cleanly to Filebeat-style field naming.
The package is aimed at the practical middle ground between hand-writing every token and relying on a parser you cannot inspect. You paste a canonical line, pick a known preset or stay on the custom path, and the tool returns a Grok pattern, field mapping, confidence label, and match feedback that you can review before sending anything into production.
A common case is a familiar access or system log that repeats all day with only a few values changing. An Apache combined line with a client address, request method, path, status code, byte count, referer, and user agent is a strong fit because the structure is stable enough for the preset engine to map directly to fields such as source.ip, http.request.method, and http.response.status_code.
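To make that concrete, here is a typical Apache combined line followed by an ECS-style Grok pattern of the kind this generator drafts. The exact pattern and field names the tool emits may differ; this sketch is illustrative only:

```
192.168.1.10 - frank [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326 "http://example.com/start" "Mozilla/4.08"

%{IPORHOST:source.ip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] "%{WORD:http.request.method} %{URIPATHPARAM:url.path} HTTP/%{NUMBER:http.version}" %{NUMBER:http.response.status_code:int} %{NUMBER:http.response.body.bytes:int} "%{DATA:http.request.referrer}" "%{DATA:user_agent.original}"
```

Each `%{PATTERN:field}` fragment both constrains what the token may look like and names the capture that downstream search and dashboards will see.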
The important limit is that a promising-looking pattern is not the same thing as a reliable parser. This package analyzes only the first non-empty line you paste, so a high field count or a tidy preview does not prove that sibling lines from the same source will still match. Treat the result as a visible draft, then confirm the Matches badge, the Known field coverage row, and the captured fields before you trust it.
If your source already looks like Apache, Nginx, HAProxy, IIS, Postgres CSV, or RFC 3164 syslog, start with the matching Preset. Presets are the fastest path because they carry fixed capture order, known field names, and a strong confidence score when the sample truly fits.
The custom path is better when your line mixes positional text with small structured fragments such as level=INFO, status=200, or request_id=abc123. In that case, choose one line that includes every part you care about, not the shortest or cleanest line in the file. This package ignores later pasted lines, so your first line needs to contain the full grammar you want the pattern to keep.
- **Normalize field names** on when you want recognizable targets such as log.level or url.path; turn it off only when you need raw source names preserved.
- **Field prefix** when the captures are specific to one dataset and you want to avoid collisions with existing event fields.
- **Detect key=value pairs** on for app logs and ingress lines, but off for strictly positional formats where equals signs are rare noise.
- **Timestamp hint** when the parser could reasonably mistake a date token for ordinary text. A good hint prevents the rest of the pattern from drifting.

A strong fit for this tool is a single-line format that repeats with predictable separators. A weak fit is a multiline stack trace, a source that changes shape between lines, or a line whose important content is hidden inside nested quoting.
The stop-and-verify trigger is simple: if the summary says Matches 0/1, or if Known field coverage looks respectable but the sample still fails, do not copy the pattern into Filebeat yet. Rebuild from the same line with clearer quoting, a better timestamp hint, or one less noisy fragment, then confirm that the field mapping and match badge agree with each other.
A Grok expression is a regular-expression template built from named pattern fragments. In Elastic ingest pipelines, the grok processor applies those fragments to an event field and emits captured values under chosen names. This package follows that model directly: the optional pipeline snippet writes one grok processor, targets the configured match field, and can append an event.module setter when you provide a module tag.
The package has two parsing paths. The preset path uses hard-coded sample definitions for nine common log families and maps each capture to a known field base. The custom path is heuristic: it searches the line for a timestamp, a standalone log level, quoted or unquoted key=value fragments, key: value fragments, and loose tokens such as methods, status codes, host names, addresses, URL paths, durations, and a trailing message segment. Literal separators between those captures are escaped and the final expression is anchored with ^ and $.
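The key=value step of that heuristic search can be sketched in a few lines. This is a simplified illustration, not the package's actual code; the real heuristics also handle key: value fragments, nested quoting, and loose positional tokens.

```javascript
// Hedged sketch of key=value detection on a single log line.
// Matches key="quoted value" (quotes stripped) or key=bareword.
function detectKeyValuePairs(line) {
  const pairs = [];
  const re = /(\w+)=("(?:[^"\\]|\\.)*"|\S+)/g;
  let match;
  while ((match = re.exec(line)) !== null) {
    pairs.push({ key: match[1], value: match[2].replace(/^"|"$/g, '') });
  }
  return pairs;
}

const pairs = detectKeyValuePairs('level=INFO status=200 msg="slow request"');
console.log(pairs);
// [ { key: 'level', value: 'INFO' },
//   { key: 'status', value: '200' },
//   { key: 'msg', value: 'slow request' } ]
```

Once fragments like these are found, the literal text between them is what gets escaped and anchored, which is why punctuation drift between lines can break an otherwise sound pattern.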
Field naming is biased toward Elastic Common Schema (ECS). Known keys such as status, method, path, url, and pid map into stable ECS-style targets, while unknown keys can be normalized into custom.*. The package then recompiles the Grok output into a JavaScript regular expression, runs it against the current line, and records both naming quality and actual match quality. That is why Known field coverage and the Matches badge can disagree.
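The recompile-and-test step can be illustrated with a toy Grok dictionary. This is a minimal sketch under simplifying assumptions (tiny pattern table, dots in field names mapped to underscores because JavaScript group names cannot contain dots); the package's real expansion is more complete.

```javascript
// Expand a tiny Grok dictionary into a JavaScript RegExp, then test it
// against one line -- the same check that drives the Matches badge.
const GROK = {
  IP: '(?:\\d{1,3}\\.){3}\\d{1,3}',
  WORD: '\\w+',
  NUMBER: '\\d+(?:\\.\\d+)?',
  GREEDYDATA: '.*',
};

function grokToRegExp(pattern) {
  // %{TOKEN:field.name} or %{TOKEN:field.name:type} -> named group
  const body = pattern.replace(/%\{(\w+):([\w.]+)(?::\w+)?\}/g,
    (_, token, field) => `(?<${field.replace(/\./g, '_')}>${GROK[token]})`);
  return new RegExp(`^${body}$`);
}

const re = grokToRegExp(
  '%{IP:source.ip} - %{WORD:http.request.method} %{NUMBER:http.response.status_code}');
const m = '10.0.0.5 - GET 200'.match(re);
console.log(m.groups);
// { source_ip: '10.0.0.5',
//   http_request_method: 'GET',
//   http_response_status_code: '200' }
```

A pattern can name every field sensibly (high Known field coverage) and still fail this match when one literal separator in the expanded expression disagrees with the line.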
The custom path looks for structured fragments such as status=200, path=/index.html, or request: "GET / HTTP/1.1", and captures any remaining tail as message. It then applies the field prefix, deduplicates names, escapes literal separators, and validates the final expression. A short technical bridge helps here: the output is not just a string builder. Small changes to quoting, brackets, or token order can change both the generated pattern and the match result even when the same high-level fields are still visible.
| Token family | Typical trigger in this package | Representative target | Notes |
|---|---|---|---|
| TIMESTAMP_ISO8601, HTTPDATE, SYSLOGTIMESTAMP, NUMBER:long | Date-like prefix or epoch token | @timestamp | Timestamp hint biases detection when several shapes are plausible. |
| LOGLEVEL | Standalone severity or mapped key | log.level | Skipped if it is already part of a structured pair. |
| WORD, NUMBER, IP, HOSTNAME | Loose positional token classification | http.request.method, http.response.status_code, source.ip, host.name | Number suffixes become :int or :float when inference is enabled. |
| URIPATH, URIPATHPARAM, URI | Paths, paths with queries, or full URLs | url.path, url.original, http.referer | Quoted referers and request paths are preserved when possible. |
| DATA, QUOTEDSTRING, GREEDYDATA | Quoted values, generic text, or a message tail | message, user_agent.original, custom segments | These are the first places to inspect when match coverage looks weaker than field coverage. |
| Result field | What it means | How to use it |
|---|---|---|
| Confidence score | Heuristic confidence label from the generated pattern path | Use it as a drafting hint, not a deployment decision on its own. |
| Known field coverage | Share of captures that map to recognized field concepts | High coverage suggests naming quality, not guaranteed match stability. |
| Matches and Failures | Whether the compiled pattern matched the analyzed line | This is the go or stop signal before copying the pattern into a pipeline. |
| Literal characters escaped | Count of separators that had to be escaped into the final expression | A rising count often means the line depends heavily on exact punctuation. |
Parsing runs in the browser; the package does not define a server-side parser or lambda.mjs endpoint. Normal page delivery still loads site assets and a chart library, but pattern generation and sample matching are handled locally by the page code.
Use one deliberate pass from sample selection to match verification. The goal is not merely to produce a Grok string, but to end with a pattern whose field names and match result agree.
1. Preset first. If your source looks like Apache, Nginx, HAProxy, IIS, Postgres CSV, or syslog, pick that preset. Otherwise stay on Custom and paste one representative line into Sample log line.
2. Add a Timestamp hint only when the date token is ambiguous. A correct hint should make the preview start with a timestamp capture instead of leaving the first token as generic text.
3. Check that Normalize field names, Detect key=value pairs, and Infer numeric & boolean types fit your source. If the line contains business-specific fields, add a Field prefix before reviewing the mapping.
4. Review the Grok Pattern preview and then open Pattern Metrics. If Pattern error appears, or if the summary shows Matches 0/1, simplify the sample, fix the quoting, or change the timestamp hint before going further.
5. Open Field Mapping and inspect the emitted field names, token types, and example values. Rename any awkward capture there if needed; the package updates the underlying pattern when a field rename is accepted.
6. Enable Include ingest pipeline JSON, set Match field name, and optionally supply Pipeline ID or Module tag. The resulting JSON should show one grok processor and, when used, one event.module setter.
7. Finish by replacing the sample with a second real line from the same log source and checking whether Matches, the field mapping, and the final pattern remain coherent. That last substitution is the quickest way to catch brittle separators before deployment.
The most important trio is Matches, Known field coverage, and the field mapping itself. A pattern with many captures but Matches 0/1 is still a failing parser. A pattern with Matches 1/1 but poor field names may still work, but it will produce harder-to-use data.
Known field coverage does not mean coverage across a whole log file. It only means that the current capture set maps well to familiar concepts. High coverage does not prove that rare variants, multiline events, or slightly different separators will still match.
If you see Confidence score High together with Matches 1/1, you likely have a strong starting point.

Apache access logs with a preset. Start with the Apache combined preset sample. The tool builds an 11-segment pattern, reports Confidence score as High, sets Known field coverage to 100%, and shows Matches 1/1. In the field mapping you should see captures such as source.ip, http.request.method, url.path, and http.response.status_code. That is a strong signal that the preset grammar and field naming agree with the sample.
A custom app line that still needs cleanup. Using the built-in custom sample with level=INFO, message="Started request", duration=35, and status=200, the heuristic path drafts eight segments and reports Known field coverage at 75%. Even so, the summary can still show Matches 0/1 and Coverage 0% because the final expression modeled a separator incorrectly. The practical reading is that the field ideas are useful, but the literal grammar still needs work before the pattern is safe to reuse.
Turning a good preset into a pipeline snippet. After a successful Apache preset run, enable Include ingest pipeline JSON, set Match field name to event.original, add a Module tag of apache, and give the pipeline the ID apache-access. The resulting JSON contains one grok processor pointed at event.original plus a second processor that sets event.module. It is ready for review in an ingest-pipeline workflow, but it still deserves one more real-log substitution before deployment.
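The shape of that snippet should resemble the following. This is an illustrative sketch of an Elastic ingest pipeline body, not the tool's verbatim output; the placeholder stands in for the generated pattern:

```json
{
  "description": "apache-access (drafted by the Grok generator)",
  "processors": [
    {
      "grok": {
        "field": "event.original",
        "patterns": ["<generated Grok pattern goes here>"]
      }
    },
    {
      "set": {
        "field": "event.module",
        "value": "apache"
      }
    }
  ]
}
```

The grok processor does the parsing; the set processor only runs because a Module tag was supplied.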
The package does not define a backend parsing endpoint or lambda.mjs handler. Pattern generation and sample matching happen in the browser, although the page still loads its normal site assets and a chart library.
Known field coverage only reflects naming quality for the generated captures. The sample can still fail if one escaped separator, quote boundary, or bracket token in the final pattern does not match the line exactly.
Only the first non-empty line is analyzed. The page shows a warning that extra lines were ignored, so choose the line that best represents the full log structure you want to capture.
Start with Normalize field names if you want ECS-style targets. If that is still not right, edit the names directly in Field Mapping. The package blocks empty names, whitespace, duplicate names, and colons.
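Those rename rules are simple enough to express directly. The function below is a hedged sketch of the validation described above; the names and messages are illustrative, not the package's actual code.

```javascript
// Validate a proposed capture name against the documented rules:
// no empty names, no whitespace, no colons, no duplicates.
function validateFieldName(name, existingNames) {
  if (!name || /\s/.test(name)) return 'Name must be non-empty with no whitespace';
  if (name.includes(':')) return 'Colons conflict with Grok type suffixes';
  if (existingNames.includes(name)) return 'Duplicate field name';
  return null; // valid
}

validateFieldName('log.level', ['message']); // null (accepted)
validateFieldName('log level', []);          // rejected: whitespace
validateFieldName('status:int', []);         // rejected: colon
validateFieldName('message', ['message']);   // rejected: duplicate
```

Colons are blocked because Grok already uses them to separate the pattern name, field name, and type suffix inside a capture.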