Secret Pattern Sample Checker
Scan config, log, and ticket samples for secret-like values with masked findings, provider-pattern hits, entropy scoring, and rotation cues.
Credential leaks often start as ordinary text. A copied environment file, CI log, HTTP header, Terraform variable, chat excerpt, or support ticket can expose enough material for someone to authenticate, call an API, sign a request, or reach a private system. The risky string is not always a password. A private-key boundary, token prefix, bearer header, webhook URL, connection string, or long value beside a sensitive variable name can each carry access.
Secret pattern review is a pre-share and pre-commit triage step. It looks for two kinds of evidence. Known shapes catch formats with strong public clues, such as AWS access key IDs, GitHub tokens, Slack webhooks, Stripe secret keys, PEM private keys, credentialed URLs, and authorization headers. Contextual assignment checks handle messier text where a name such as client_secret, api_key, password, or session_cookie is paired with a value that looks generated.
The result is not simply "secret" or "safe." Real provider tokens and private keys usually deserve immediate containment, but fixtures, redacted examples, secret-manager references, and training snippets can resemble credentials. A useful review keeps enough line-level evidence to explain why a value was suspicious while masking the exact secret before evidence leaves a controlled review.
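As a rough illustration of masking a value while keeping enough shape to recognize it, here is a minimal Python sketch. The function name `mask_value`, the `keep` parameter, and the prefix-plus-suffix layout are assumptions made for this example; the checker's actual masking format is not documented here.

```python
def mask_value(value: str, keep: int = 4) -> str:
    """Hide the middle of a value, keeping `keep` characters at each end.

    Hypothetical helper: the visible prefix and suffix lengths are an
    assumption, not the checker's documented behavior.
    """
    # Short values are masked entirely so nothing useful leaks.
    if len(value) <= keep * 2:
        return "*" * len(value)
    hidden = len(value) - keep * 2
    return f"{value[:keep]}{'*' * hidden}{value[-keep:]}"
```

Keeping a short prefix usually preserves the provider clue (for example an `AKIA` prefix) without exposing the usable secret.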
| Situation | Why it matters | Usual caution |
|---|---|---|
| Support or incident snippets | Headers, stack traces, request dumps, and logs can include bearer tokens or connection URLs. | Mask values before sharing outside the controlled review. |
| Config and environment samples | Variable names often reveal which values are meant to be secret even when the format is generic. | Confirm whether suspicious assignments are real credentials, references, or fixtures. |
| Repository and CI changes | One pasted key can spread through review comments, logs, branches, forks, and cached artifacts. | Rotation or revocation usually matters more than deleting only the visible copy. |
A local pattern check cannot prove that every secret has been found. It covers visible text and the selected rule set. Binary files, screenshots, encrypted blobs, unusual encodings, uncommon provider formats, and values split across lines still need ordinary secure-handling discipline.
How to Use This Tool:
Use the checker as a pre-share or pre-commit review for text samples that may contain credentials or credential-like values.
- Set Source label to the file name, ticket ID, deployment note, or sample name you want shown in the results. Avoid labels such as demo or placeholder when the material is real, because sample wording can lower confidence for contextual assignment rows.
- Choose Scan profile. Strict, Balanced, and Broad change the cutoff for generic assignment findings. Direct private-key, provider-token, authorization-header, credentialed-URL, and JWT-like patterns still run in every profile.
- Paste text into Sample text, drop a supported text file onto the textarea, or use Browse file. Browser-loaded files are capped at 1 MiB, and very large pasted samples show a warning asking you to trim the excerpt.
- Use Leak sample and Clean sample only to learn the expected rows before reviewing real material.
- Keep Mask values enabled for normal review. It masks finding values in tables and JSON, while the textarea and Copy source still contain the exact text you pasted or loaded.
- Open Advanced to tune Entropy threshold, Finding row limit, or Generic assignments. Raise the entropy threshold or switch to Strict when the review is too noisy; lower the threshold or use Broad during a migration sweep where weaker cues should stay visible.
- Read Triage Brief for the overall posture, then inspect Finding Ledger for line-level evidence and next actions. Use Pattern Coverage to see which supported families were considered, and use Secret Exposure Mix Chart for a severity count overview.
- If the checker reports "Paste or load text before reviewing secret-like patterns," add actual sample text first. If expected rows disappear, check whether Generic assignments is off, the profile cutoff is too strict, masking hid the value display, or the same span was already captured by a direct pattern.
Interpreting Results:
Review critical and high findings before spending time on lower-confidence rows. A private key block, recognized provider token, credentialed URL, or authorization header from real material should be treated as exposed until the owner confirms revocation, rotation, or invalidation.
- Triage Brief reports line count, character count, total findings, visible row count, masking state, selected profile, entropy threshold, row limit, and the immediate follow-up posture. It also makes clear that provider validation is not performed.
- Finding Ledger is the decision table: line, pattern, severity, confidence, identifier, value display, evidence, and next action.
- Pattern Coverage lists supported patterns with hit counts, including zero-hit families so you can see what the scan considered.
- Secret Exposure Mix Chart counts findings by severity. It helps summarize a review, but it does not replace reading the highest-risk rows.
- An empty result does not prove the text is safe. It means only that the supported rules and the selected generic-assignment policy did not trigger on the sample that was scanned.
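The severity overview can be pictured as a simple tally. The sketch below assumes findings are dictionaries with a `severity` field and that the levels are critical, high, medium, and low; both the field name and the function names are hypothetical.

```python
from collections import Counter

# Assumed severity levels, ordered from most to least urgent.
SEVERITY_ORDER = ("critical", "high", "medium", "low")


def severity_mix(findings):
    """Count findings per severity, including zero-count levels."""
    counts = Counter(f["severity"] for f in findings)
    return {level: counts.get(level, 0) for level in SEVERITY_ORDER}


def highest_risk_first(findings):
    """Sort findings so critical and high rows are read before the rest."""
    rank = {level: i for i, level in enumerate(SEVERITY_ORDER)}
    return sorted(findings, key=lambda f: rank[f["severity"]])
```

The tally summarizes a review, while the sorted list reflects the advice to read critical and high rows first.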
Technical Details:
Secret-pattern triage combines direct recognition with contextual scoring. Direct rules recognize visible formats with strong structural clues: private-key block delimiters, service-specific prefixes, credential-bearing URLs, authorization headers, and three-part JWT-like strings. Assignment scoring handles the messier case where a sensitive key name is paired with a value that has the length, entropy, and character mix expected from a generated credential.
Direct pattern matches are always included because the visible shape carries most of the evidence. Generic assignment findings are profile-gated because names such as token, secret, and password can also appear beside harmless fixtures, placeholders, or secret-manager references. Duplicate rows are avoided when a direct pattern has already captured the same value span.
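One way to picture the duplicate-avoidance step is a span overlap check. In this sketch the `Finding` tuple, its fields, and the rule "drop a generic row when it overlaps any direct span" are assumptions; the checker may compare spans differently.

```python
from typing import List, NamedTuple


class Finding(NamedTuple):
    start: int    # character offset where the matched value begins
    end: int      # offset just past the matched value
    pattern: str  # rule name (hypothetical labels)
    direct: bool  # True for direct patterns, False for generic assignments


def drop_covered_generic(findings: List[Finding]) -> List[Finding]:
    """Keep every direct finding; drop generic ones that overlap a direct span."""
    direct_spans = [(f.start, f.end) for f in findings if f.direct]
    kept = []
    for f in findings:
        overlaps = any(s < f.end and f.start < e for s, e in direct_spans)
        if not f.direct and overlaps:
            continue  # already reported by a stronger direct rule
        kept.append(f)
    return kept
```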
Rule Core
| Rule family | Recognized signals | Default severity and confidence |
|---|---|---|
| Cryptographic key | PEM private-key boundaries for RSA, DSA, EC, OpenSSH, PGP, and generic private keys. | Critical, high confidence. |
| Cloud and provider token | AWS access key IDs, GitHub classic and fine-grained tokens, GitLab tokens, Stripe keys, Slack tokens and webhooks, Google API keys, and npm tokens. | High, high confidence. |
| HTTP credential | Authorization: Basic and Authorization: Bearer values with long token-like material. | High, medium confidence. |
| Connection string | Credentialed PostgreSQL, MySQL, MongoDB, Redis, AMQP, SMTP, FTP, and related URLs with user and password material before the host. | High, high confidence. |
| Signed token | JWT-like values with three Base64URL sections and a common encoded JSON header prefix. | Medium, medium confidence. |
| Generic assignment | Sensitive assignment names paired with non-space values that pass the selected profile cutoff. | High, medium, or low based on the final score after key-name and sample-cue adjustments. |
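To make the direct rule families more concrete, the following Python sketch matches four of them line by line. The regular expressions are simplified assumptions written for this example (the 16-character minimum for header tokens, for instance, is arbitrary); they are not the checker's actual rules and cover far fewer formats.

```python
import re

# Hypothetical, simplified patterns for four families from the table.
DIRECT_PATTERNS = {
    "pem_private_key": re.compile(
        r"-----BEGIN (?:RSA |DSA |EC |OPENSSH |PGP )?PRIVATE KEY(?: BLOCK)?-----"
    ),
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "credentialed_url": re.compile(
        r"\b[a-z][a-z0-9+.-]*://[^\s/:@]+:[^\s/@]+@[^\s/]+"
    ),
    "authorization_header": re.compile(
        r"\bAuthorization:\s*(?:Basic|Bearer)\s+[A-Za-z0-9._~+/=-]{16,}",
        re.IGNORECASE,
    ),
}


def scan_direct(text):
    """Return (line_number, family) pairs for every direct pattern hit."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for family, pattern in DIRECT_PATTERNS.items():
            if pattern.search(line):
                hits.append((line_no, family))
    return hits
```

Because these rules rely on visible structure alone, they would run regardless of the scan profile, which matches the description of direct patterns above.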
Formula Core
Generic assignment scoring uses Shannon entropy as one input. Entropy measures how spread out the characters are in a value, expressed as bits per character. High entropy alone is not enough; the key name must also look sensitive.
$$H = -\sum_{i=1}^{m} p_i \log_2 p_i$$
Here H is entropy, m is the count of distinct characters, and p_i is each character's share of the value. Entropy is not proof of secrecy. It is one cue among key name, value length, character classes, sample wording, and profile cutoff.
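The entropy measure can be computed directly from character counts. This is a minimal sketch of standard Shannon entropy in bits per character; the function name `shannon_entropy` is chosen for this example.

```python
import math
from collections import Counter


def shannon_entropy(value: str) -> float:
    """Shannon entropy of a string in bits per character."""
    if not value:
        return 0.0
    n = len(value)
    # Each distinct character contributes -p * log2(p), where p is its share.
    return -sum((c / n) * math.log2(c / n) for c in Counter(value).values())
```

A value made of one repeated character scores 0, while four equally frequent characters score 2 bits per character.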
$$S = 18 + K + L + E + C - P$$
S is the generic assignment score. K is key-name weight, L is length weight, E is entropy weight, C is character-class weight, and P is the sample-wording penalty, subtracted as a positive number of points. The base value is 18 before the additive weights and penalties are applied.
| Score input | Boundary | Points or effect |
|---|---|---|
| Key-name cue | Private-key, client-secret, secret-access-key, or connection-string names. | +28. |
| Key-name cue | Password, passwd, pwd, or secret names. | +24. |
| Key-name cue | Token, API key, access key, or credential names. | +21. |
| Key-name cue | Auth, session, or cookie names. | +17. |
| Value length | >= 8, 12, 20, 32, or 48 characters. | +4, +8, +13, +18, or +22. |
| Entropy | >= threshold - 0.45, >= threshold, or >= threshold + 0.6. | +8, +17, or +24. |
| Character classes | Two, three, or four classes across lowercase, uppercase, digits, and symbols. | +4, +8, or +12. |
| Sample cue | Words such as example, sample, demo, dummy, fake, test, placeholder, changeme, your-token, or your-key. | -12 when in the value, or -8 when the cue appears only in the line or source label. |
The entropy threshold defaults to 3.7 bits per character and can be adjusted from 2.5 to 5.5. Profile cutoffs are 70 for Strict, 48 for Balanced, and 32 for Broad. After a generic assignment is included, scores of 82 or more are high confidence, scores of 58 to 81 are medium confidence, and lower included scores are low confidence.
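The profile gate and confidence bands can be sketched as a small lookup. The cutoffs and bands come from the paragraph above; treating the cutoff as inclusive (a score equal to the cutoff is kept) is an assumption, as are the names `PROFILE_CUTOFFS` and `classify`.

```python
from typing import Optional

PROFILE_CUTOFFS = {"strict": 70, "balanced": 48, "broad": 32}


def classify(score: int, profile: str) -> Optional[str]:
    """Return a confidence label, or None when the profile filters the row out."""
    # Assumption: a score exactly at the cutoff is included.
    if score < PROFILE_CUTOFFS[profile]:
        return None
    if score >= 82:
        return "high"
    if score >= 58:
        return "medium"
    return "low"
```

Under this sketch a score of 60 is dropped by Strict but kept by Balanced as a medium-confidence row.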
For example, API_TOKEN with a 40-character mixed value at 4.3 bits per character and no sample cue scores 89 with the default threshold: 18 base + 21 key cue + 18 length + 24 entropy + 8 character-class weight. That clears every profile and becomes high confidence. By contrast, password=changeme usually stays below Balanced because the value is short, low entropy, and contains a sample cue, though Broad may still keep it as a low-confidence review row.
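The scoring table can be expressed as a short Python sketch. The weights are taken from the table, but the key-name matching (substring checks after normalizing hyphens to underscores), the treatment of any non-alphanumeric character as a symbol, and every function name are assumptions made for illustration.

```python
import math
from collections import Counter

BASE = 18
KEY_TIERS = [  # checked in order; first match wins (assumed matching rule)
    (("private_key", "client_secret", "secret_access_key", "connection_string"), 28),
    (("password", "passwd", "pwd", "secret"), 24),
    (("token", "api_key", "access_key", "credential"), 21),
    (("auth", "session", "cookie"), 17),
]
SAMPLE_CUES = ("example", "sample", "demo", "dummy", "fake", "test",
               "placeholder", "changeme", "your-token", "your-key")


def entropy(value):
    n = len(value)
    return -sum((c / n) * math.log2(c / n) for c in Counter(value).values()) if n else 0.0


def key_weight(name):
    norm = name.lower().replace("-", "_")
    return next((w for cues, w in KEY_TIERS if any(c in norm for c in cues)), 0)


def length_weight(n):
    return next((w for lo, w in ((48, 22), (32, 18), (20, 13), (12, 8), (8, 4)) if n >= lo), 0)


def entropy_weight(h, threshold):
    if h >= threshold + 0.6:
        return 24
    if h >= threshold:
        return 17
    return 8 if h >= threshold - 0.45 else 0


def class_weight(value):
    classes = sum([
        any(ch.islower() for ch in value), any(ch.isupper() for ch in value),
        any(ch.isdigit() for ch in value), any(not ch.isalnum() for ch in value),
    ])
    return {2: 4, 3: 8, 4: 12}.get(classes, 0)


def sample_penalty(value, context=""):
    if any(cue in value.lower() for cue in SAMPLE_CUES):
        return 12
    return 8 if any(cue in context.lower() for cue in SAMPLE_CUES) else 0


def score_assignment(name, value, threshold=3.7, context=""):
    """Generic assignment score: base + K + L + E + C - P."""
    return (BASE + key_weight(name) + length_weight(len(value))
            + entropy_weight(entropy(value), threshold)
            + class_weight(value) - sample_penalty(value, context))
```

With a made-up 40-character value that mixes lowercase, uppercase, and digits and has high entropy, `score_assignment("API_TOKEN", value)` lands on the same 89 as the example above. In this sketch, `password=changeme` scores 34, which sits under the Balanced cutoff and just above Broad.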
Privacy and Responsible Use:
The scan runs in your browser against the pasted text or local file content. The checker does not call providers, validate token status, inspect repository history, or upload the sample for server-side analysis.
- Keep Mask values on when copying rows, exporting evidence, or taking screenshots for people who do not need exact secrets.
- Remember that the textarea and Copy source action still contain the exact original text.
- Treat real critical and high findings as exposed until the owning service confirms revocation, rotation, or invalidation.
- Do not paste live secrets into unrelated chat, issue, documentation, or AI systems just to prove a finding.
Worked Examples:
A deployment note that includes AWS_ACCESS_KEY_ID=AKIA... should produce an AWS access key ID row in Finding Ledger with high severity and high confidence. If the note came from a real account, check for the paired secret access key, rotate the credential, and review recent IAM activity rather than only deleting the visible note.
A copied request header such as Authorization: Bearer eyJ... can appear as an HTTP Bearer token or JWT-like token finding depending on the exact shape. The result does not decode claims, verify signatures, or check expiry, so the next step is owner-side inspection and invalidation if the token came from real traffic.
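A shape-only check for JWT-like values might look like the sketch below. It tests for three Base64URL sections and the common `eyJ` header prefix without decoding claims or verifying signatures. The pattern and the name `looks_jwt_like` are assumptions, and the token in the usage note is made up.

```python
import re

# Three Base64URL sections separated by dots; the first starts with "eyJ",
# the Base64 encoding of the opening '{"' of a JSON header.
JWT_LIKE = re.compile(r"eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+")


def looks_jwt_like(token: str) -> bool:
    """Shape check only: no decoding, signature verification, or expiry check."""
    return JWT_LIKE.fullmatch(token) is not None
```

For instance, `looks_jwt_like("eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiIxMjMifQ.c2lnbmF0dXJl")` is true, while an opaque bearer token without dots is not.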
A migration sweep with client_secret=sample-client-secret-9f8e7d6c5b4a3 may pass Broad or Balanced because the key name is sensitive and the value is long enough, while the sample wording lowers confidence. Keep the row as a review cue, but verify whether the value is a fixture before starting rotation work.
An empty paste or oversized local file is a setup problem, not a clean result. Add actual text when the input warning appears, or trim a file over 1 MiB to the relevant excerpt before using Browse file.
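The setup checks can be sketched as a guard that runs before any scanning. The empty-input message is the one quoted earlier on this page; the oversized-file wording, the UTF-8 byte measurement, and the name `check_input` are assumptions for this example.

```python
from typing import Optional

MAX_FILE_BYTES = 1024 * 1024  # 1 MiB cap for browser-loaded files


def check_input(text: str, from_file: bool = False) -> Optional[str]:
    """Return a setup message, or None when the input is ready to scan."""
    if not text.strip():
        return "Paste or load text before reviewing secret-like patterns."
    # Assumption: size is measured as UTF-8 bytes.
    if from_file and len(text.encode("utf-8")) > MAX_FILE_BYTES:
        return "File exceeds 1 MiB; trim it to the relevant excerpt."  # hypothetical wording
    return None
```

Returning a message instead of an empty finding list keeps a setup problem from being mistaken for a clean result.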
FAQ:
Does a finding prove the secret still works?
No. The checker reports visible patterns and scoring cues. Token activity, scope, revocation, expiry, and rotation status must be confirmed with the provider or owning service.
Why did a placeholder get flagged?
Sensitive names such as password or token can combine with length, entropy, and mixed characters to clear the generic assignment cutoff. Sample wording lowers the score, but it does not always remove the row.
Why did a real secret not appear?
The scanner covers supported visible patterns plus selected generic assignment rules. Unsupported token formats, images, encrypted blobs, unusual encodings, and values below the profile cutoff may not appear.
What changes when I switch profiles?
The profile changes the generic assignment cutoff: Strict uses 70, Balanced uses 48, and Broad uses 32. Direct pattern matches still run in every profile.
Can I share masked findings safely?
Masked findings reduce accidental disclosure, but they can still reveal service type, line context, and incident clues. Share them only with people who need the review details.