A Python library and CLI for extracting and refanging defanged Indicators of Compromise (IOCs) from text.
iocextract is a Python library and command-line tool for extracting Indicators of Compromise (IOCs) from text, specifically designed to handle 'defanged' IOCs that have been obfuscated to prevent accidental execution. It solves the problem of existing regex-based tools missing IOCs disguised with techniques like bracketed periods (`[.]`) or altered protocols (`hxxp://`), enabling security analysts to automatically collect and refang threat data from sources like tweets, blogs, and reports.
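To make the defanging idea concrete, here is a minimal standard-library sketch of refanging. It is not iocextract's implementation: the `refang` function and its rule list are hypothetical and cover only the two techniques named above plus a bracketed colon, whereas the library recognizes many more variants.

```python
import re

# Illustrative substitutions only (an assumption for this sketch);
# real-world defanging has many more variants than these three.
_REFANG_RULES = [
    (re.compile(r"^hxxp", re.IGNORECASE), "http"),  # hxxp:// or hxxps:// -> http(s)://
    (re.compile(r"\[\.\]|\(\.\)"), "."),            # [.] or (.) -> .
    (re.compile(r"\[:\]"), ":"),                    # [:] -> :
]


def refang(ioc: str) -> str:
    """Return a usable form of a defanged indicator (hypothetical helper)."""
    for pattern, replacement in _REFANG_RULES:
        ioc = pattern.sub(replacement, ioc)
    return ioc


if __name__ == "__main__":
    print(refang("hxxp://example[.]com/path"))
    print(refang("192[.]168[.]1[.]1"))
```

A plain URL regex would miss both inputs in their defanged form, which is the gap the library is meant to close.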
It is aimed at security analysts, threat intelligence researchers, and incident responders who need to process unstructured text containing obfuscated IOCs, particularly those working with social media threat feeds, malware reports, or log analysis.
Developers choose iocextract because it detects and refangs a wide variety of defanging techniques out of the box, supports custom regex patterns, and uses an iterator-based design suited to large datasets. This makes it a robust alternative to generic regex tools for security use cases.
Defanged Indicator of Compromise (IOC) Extractor.
Excels at detecting obfuscated IOCs using techniques like bracketed periods or altered protocols, as shown in extensive support tables for IPs, emails, and URLs, saving analysts from manual conversion.
Handles hex, URL, and base64-encoded IOCs in addition to defanging, making it versatile for various threat intelligence sources where IOCs are often disguised.
Returns iterators instead of lists, allowing low-memory extraction from massive text corpora, which matters when processing feeds or logs too large to hold every match in memory at once.
Allows users to add their own regex patterns via a file or code, enhancing flexibility for specialized extraction needs, as detailed in the custom regex section.
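The iterator-based design and custom patterns described above can be sketched together with the standard library. This is an illustration of the approach, not the library's API: `extract_lazy`, `DEFANGED_IPV4`, and the `INC-` ticket pattern are hypothetical names introduced for this example.

```python
import re
from typing import Iterable, Iterator

# Hypothetical pattern: plain or bracket-defanged IPv4 addresses.
# It does not validate that each octet is in the 0-255 range.
DEFANGED_IPV4 = r"\b\d{1,3}(?:\[?\.\]?\d{1,3}){3}\b"


def extract_lazy(lines: Iterable[str], patterns: Iterable[str]) -> Iterator[str]:
    """Yield matches one at a time instead of building a full list."""
    compiled = [re.compile(p) for p in patterns]
    for line in lines:
        for pattern in compiled:
            for match in pattern.finditer(line):
                yield match.group(0)


if __name__ == "__main__":
    feed = ["seen 10[.]0[.]0[.]5 and 8.8.8.8", "ticket INC-1234"]
    # The second pattern stands in for a user-supplied custom regex.
    for ioc in extract_lazy(feed, [DEFANGED_IPV4, r"INC-\d{4}"]):
        print(ioc)
```

Because `lines` can itself be a file object or another generator, only the current line and match need to be in memory at any moment.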
Overlapping regex patterns can cause the same IOC to be extracted multiple times, as noted in the library examples, requiring manual deduplication or post-processing.
Not optimized for binary data or structured formats; extracting from HTML requires additional tools like Beautiful Soup, as admitted in the FAQ, leading to extra setup.
Installation on Windows can be problematic because of the dependency on the `regex` package, which may require installing a wheel manually as described in the README.
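The duplicate extractions caused by overlapping patterns can be handled with a small post-processing step. The sketch below is one possible approach and assumes the extracted IOCs are hashable values such as strings; `unique` is a hypothetical helper, not part of the library.

```python
from typing import Hashable, Iterable, Iterator, TypeVar

T = TypeVar("T", bound=Hashable)


def unique(iocs: Iterable[T]) -> Iterator[T]:
    """Yield each IOC the first time it appears, preserving input order."""
    seen = set()
    for ioc in iocs:
        if ioc not in seen:
            seen.add(ioc)
            yield ioc


if __name__ == "__main__":
    raw = ["http://example.com", "example.com", "http://example.com"]
    print(list(unique(raw)))
```

The helper stays lazy, so it composes with an iterator of results, but the `seen` set grows with the number of distinct IOCs.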
MISP (core software) - Open Source Threat Intelligence and Sharing Platform
Extract and aggregate threat intelligence.
Tool to gather Threat Intelligence indicators from publicly available sources