Description
TruffleHog Version
trufflehog 3.92.4
Trace Output
Not applicable - the behavior is deterministic and reproducible without trace logs. The issue is that certain files are simply not scanned/reported by the `github` scanner, while the `filesystem` scanner finds them correctly.
Expected Behavior
When scanning a GitHub repository, TruffleHog should detect secrets in all files that contain them, including files that were created by copying/renaming content from other files in subsequent commits.
Actual Behavior
The trufflehog github scanner only reports the secret in the original file from the first commit where it was introduced. When the same secret is copied into new files (e.g., when splitting a file into multiple files), those new files are not scanned/reported.
The trufflehog filesystem scanner on the same cloned repository does detect the secrets in the new files correctly.
| Scanner | Original file (deleted) | New file _DE | New file _FR |
|---|---|---|---|
| trufflehog github | ✅ Found (in history) | ❌ Not found | ❌ Not found |
| trufflehog filesystem | N/A (deleted) | ✅ Found | ✅ Found |
Steps to Reproduce
- Have a repository with a file containing a verified secret (e.g., `fileA.ipynb` with a Databricks token)
- In a later commit, create new files by copying content from the original file (e.g., `fileA_DE.ipynb`, `fileA_FR.ipynb`) - these new files contain the same secret
- Optionally delete the original file
- Run `trufflehog github --repo=<repo_url> --token=<token> --json`
- Observe: only the original file is reported, not the new files
- Clone the repo and run `trufflehog filesystem <path> --json`
- Observe: the new files (`_DE`, `_FR`) are correctly reported
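
The comparison in the steps above can also be checked mechanically. The following is a minimal sketch, assuming the output of each scan was saved with one JSON object per line, that each finding stores its path under `SourceMetadata.Data.<source>.file`, and that the secret is under `Raw`. Those field names are assumptions about the `--json` output, and the helper names `findings_by_file` and `missed_by_github` are hypothetical.

```python
import json


def findings_by_file(json_lines):
    """Map each reported file path to the set of raw secrets found in it.

    Assumes one JSON object per line, with the path stored under
    SourceMetadata.Data.<source name>.file and the secret under Raw.
    Both field names are assumptions about the --json output format.
    """
    result = {}
    for line in json_lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip blank lines and non-JSON output
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that only look like JSON
        data = record.get("SourceMetadata", {}).get("Data", {})
        for source_fields in data.values():
            if not isinstance(source_fields, dict):
                continue
            path = source_fields.get("file")
            if path:
                result.setdefault(path, set()).add(record.get("Raw"))
    return result


def missed_by_github(github_lines, filesystem_lines, clone_root=""):
    """Return files reported by the filesystem scan but not by the github scan.

    clone_root is stripped from filesystem paths so that they can be
    compared with the repository-relative paths of the github scan.
    """
    github_files = set(findings_by_file(github_lines))
    missed = set()
    for path in findings_by_file(filesystem_lines):
        if clone_root and path.startswith(clone_root):
            path = path[len(clone_root):].lstrip("/")
        if path not in github_files:
            missed.add(path)
    return sorted(missed)
```

With the two outputs saved to files (for example `github.json` and `filesystem.json`, names chosen here for illustration), passing the two open file handles and the clone path to `missed_by_github` should list exactly the `_DE` and `_FR` files if the behavior described in this issue is present.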
Environment
- OS: macOS Darwin 25.2.0
- Version: trufflehog 3.92.4
Additional Context
This is a security concern because:
- The original file may be deleted (no longer on active branches)
- The new files with the secret are on active branches but not reported
- Users may think the secret exposure is "history only" when it's actually still present in the codebase
The github scanner seems to deduplicate based on the secret value, skipping files if the same secret was already found in a previous commit - even if the new files are currently on the active branch and the original file no longer exists.
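
To make the suspected behavior concrete, here is a toy sketch of how the choice of deduplication key changes which findings get reported. It illustrates the hypothesis only and is not TruffleHog's actual implementation; the `Finding` type and the `report` function are hypothetical.

```python
from typing import NamedTuple


class Finding(NamedTuple):
    commit: int  # position of the commit in scan order
    file: str
    secret: str


def report(findings, key):
    """Return findings in scan order, keeping only the first one per dedup key."""
    seen = set()
    reported = []
    for finding in sorted(findings, key=lambda f: f.commit):
        dedup_key = key(finding)
        if dedup_key in seen:
            continue  # treated as a duplicate and dropped
        seen.add(dedup_key)
        reported.append(finding)
    return reported


findings = [
    Finding(1, "fileA.ipynb", "token"),     # original file, later deleted
    Finding(2, "fileA_DE.ipynb", "token"),  # copy on the active branch
    Finding(2, "fileA_FR.ipynb", "token"),  # copy on the active branch
]

# Keyed on the secret value alone: only the first occurrence survives.
by_secret = report(findings, key=lambda f: f.secret)

# Keyed on secret and file: every file containing the secret is reported.
by_secret_and_file = report(findings, key=lambda f: (f.secret, f.file))
```

Keying on the secret value alone reproduces the `trufflehog github` row of the table above (only `fileA.ipynb` is reported), whereas keying on the secret and file together would report all three files.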
References
- Trufflehog doesn't detect multiple occurrences of a secret. #4535 (related but different - same file, multiple occurrences)