A high-performance CLI tool and Python library for detecting open source components and security threats in binaries through semantic signature matching. Specialized for analyzing mobile apps (APK/IPA), Java archives, ML models, and source code to identify OSS components, their licenses, and potential security risks.
- Binary Component Detection: Identify 188+ OSS components in compiled binaries using semantic signatures
- ML Model Security Analysis: Comprehensive security scanning with MITRE ATT&CK mapping
- Multi-Format Support: APK/IPA, JAR/WAR, ELF/PE/Mach-O, ML models (pickle, ONNX, SafeTensors)
- SEMCL.ONE Integration: Works seamlessly with osslili, purl2notices, and other ecosystem tools
pip install binarysniffer

For development:
git clone https://github.com/SemClone/binarysniffer.git
cd binarysniffer
pip install -e .

With performance extras:
pip install binarysniffer[fast]

# Analyze a binary file
binarysniffer analyze /path/to/binary
# ML model security scan
binarysniffer ml-scan model.pkl --deep
# Generate SBOM
binarysniffer analyze app.apk --format cyclonedx -o sbom.json

# Basic analysis
binarysniffer analyze app.apk
# ML model security analysis
binarysniffer ml-scan model.pkl --risk-threshold 0.5
# Directory scanning with recursion
binarysniffer analyze /path/to/project -r
# Generate CycloneDX SBOM
binarysniffer analyze app.jar --format sbom -o app-sbom.json
# Extract package inventory
binarysniffer inventory app.apk --with-hashes -o inventory.json

from binarysniffer import EnhancedBinarySniffer
# Initialize analyzer
sniffer = EnhancedBinarySniffer()
# Analyze a file
result = sniffer.analyze_file("app.apk")
for match in result.matches:
    print(f"{match.component} - {match.confidence:.2%}")
    print(f"License: {match.license}")
# ML security analysis
from binarysniffer.ml_security import MLSecurityAnalyzer
analyzer = MLSecurityAnalyzer()
risks = analyzer.analyze_model("model.pkl")

- Advanced format support (ELF, PE, Mach-O) via LIEF
- Android DEX bytecode analysis
- Static library (.a) support
- Symbol and import extraction
- Mobile apps (APK, IPA)
- Java archives (JAR, WAR)
- Python packages (wheel, egg)
- Linux packages (DEB, RPM)
- Extended formats (7z, RAR, Zstandard)
- Safe pickle file analysis
- ONNX and SafeTensors validation
- PyTorch/TensorFlow native formats
- 100% detection rate on known exploits
- SARIF output for CI/CD integration
- 188 OSS components covered
- 1,400+ high-quality signatures
- Automatic license detection
- Security severity classification
BinarySniffer is a core component of the SEMCL.ONE ecosystem:
- Complements osslili for source code license detection
- Works with purl2notices for comprehensive attribution
- Integrates with ospac for policy evaluation
- Supports upmex for package metadata extraction
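The configuration file shown next is plain JSON, so it can be read and sanity-checked with the standard library. The sketch below is an illustration only: `load_config` and `DEFAULTS` are hypothetical names, the fallback values are copied from the example file rather than taken from BinarySniffer's documented defaults, and the validation rules are assumptions.

```python
import json
import tempfile
from pathlib import Path

# Assumed fallback values, mirroring the example config; not confirmed defaults.
DEFAULTS = {
    "signature_sources": [],
    "min_confidence": 0.5,
    "parallel_workers": 4,
    "auto_update": True,
}


def load_config(path: Path) -> dict:
    """Merge a JSON config file over DEFAULTS and validate two numeric fields."""
    config = dict(DEFAULTS)
    if path.exists():
        with path.open(encoding="utf-8") as handle:
            config.update(json.load(handle))
    if not 0.0 <= config["min_confidence"] <= 1.0:
        raise ValueError("min_confidence must be between 0 and 1")
    if config["parallel_workers"] < 1:
        raise ValueError("parallel_workers must be at least 1")
    return config


# Demonstration against a temporary directory instead of ~/.binarysniffer.
with tempfile.TemporaryDirectory() as tmp:
    cfg_path = Path(tmp) / "config.json"
    cfg_path.write_text(json.dumps({"min_confidence": 0.7}), encoding="utf-8")
    loaded = load_config(cfg_path)                    # file overrides one key
    missing = load_config(Path(tmp) / "absent.json")  # falls back to DEFAULTS

print(loaded["min_confidence"], loaded["parallel_workers"])
```

Keys absent from the file keep their fallback values, which makes it easy to override a single setting such as `min_confidence`.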
# ~/.binarysniffer/config.json
{
  "signature_sources": [
    "https://signatures.binarysniffer.io/core.xmdb"
  ],
  "min_confidence": 0.5,
  "parallel_workers": 4,
  "auto_update": true
}

- User Guide - Comprehensive usage examples
- API Reference - Python API documentation
- ML Security - ML model security analysis
- Signature Management - Creating and managing signatures
- Architecture - System design and internals
- TLSH Fuzzy Matching - Detecting modified components
- Creating Signatures - Contributing new signatures
- Installation Guide - Platform-specific setup
- Package Verification - Archive analysis
We welcome contributions! Please see CONTRIBUTING.md for details on:
- Code of conduct
- Development setup
- Submitting pull requests
- Signature contributions
For support and questions:
- GitHub Issues - Bug reports and feature requests
- Documentation - Complete project documentation
- SEMCL.ONE Community - Ecosystem support and discussions
Apache License 2.0 - see LICENSE file for details.
See AUTHORS.md for a list of contributors.
Part of the SEMCL.ONE ecosystem for comprehensive OSS compliance and code analysis.