Open-Awesome

© 2026 Open-Awesome. Curated for the developer elite.


iocextract

GPL-2.0 · Python · v1.16.1

A Python library and CLI for extracting and refanging defanged Indicators of Compromise (IOCs) from text.

578 stars · 92 forks · 0 contributors

What is iocextract?

iocextract is a Python library and command-line tool for extracting Indicators of Compromise (IOCs) from text. It is designed to handle 'defanged' IOCs, which authors deliberately obfuscate so that a reader cannot click or resolve them by accident. General-purpose regex extractors tend to miss IOCs disguised with techniques like bracketed periods (`[.]`) or altered protocols (`hxxp://`). iocextract detects these forms and can refang them, so security analysts can automatically collect threat data from sources like tweets, blogs, and reports.
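To make "refanging" concrete, here is a minimal sketch using only the standard library. It covers a handful of defanging styles and is an illustration of the idea, not iocextract's actual rule set; the `refang` helper and its rule table are hypothetical names introduced for this example.

```python
import re

# Hypothetical rule table: a tiny subset of defanging styles, chosen for illustration.
_REFANG_RULES = [
    # hxxp:// -> http:// and hxxps:// -> https:// (only at the start of the indicator)
    (re.compile(r"^hxxp", re.IGNORECASE), "http"),
    # [.]  (.)  [dot]  ->  .
    (re.compile(r"\[\.\]|\(\.\)|\[dot\]", re.IGNORECASE), "."),
    # [:] -> :
    (re.compile(r"\[:\]"), ":"),
]


def refang(indicator: str) -> str:
    """Return the indicator with the defanging styles above undone."""
    for pattern, replacement in _REFANG_RULES:
        indicator = pattern.sub(replacement, indicator)
    return indicator


if __name__ == "__main__":
    print(refang("hxxp://example[.]com/path"))
    print(refang("10[.]0[.]0[.]1"))
```

A real extractor has to cope with many more variants than this, which is the gap a dedicated library is meant to fill.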

Target Audience

Security analysts, threat intelligence researchers, and incident responders who need to process unstructured text containing obfuscated IOCs, particularly those working with social media threat feeds, malware reports, or log analysis.

Value Proposition

Developers choose iocextract for its ability to detect and refang a wide variety of defanging techniques out of the box, its support for custom regex patterns, and its iterator-based design for large inputs. Together these make it a stronger fit than generic regex tools for security use cases.

Overview

Defanged Indicator of Compromise (IOC) Extractor.

Use Cases

Best For

  • Extracting defanged URLs and IPs from Twitter or blog posts
  • Automating IOC collection from threat intelligence feeds
  • Processing malware analysis reports with obfuscated indicators
  • Refanging IOCs for integration into security tools and databases
  • Building custom threat ingestion pipelines with Python
  • Analyzing logs or emails containing disguised malicious links

Not Ideal For

  • Extracting non-defanged IOCs from binary executables or large raw datasets where tools like Cacador are more efficient
  • Processing structured data like HTML or XML without prior text extraction, as it may yield noisy results requiring additional preprocessing
  • Environments requiring zero false positives or exact IOC boundaries, since regex overlaps can cause duplicate extractions
  • Teams needing real-time, high-performance IOC extraction from network streams or live logs without defanging

Pros & Cons

Pros

Defanged IOC Specialization

Excels at detecting obfuscated IOCs using techniques like bracketed periods or altered protocols, as shown in extensive support tables for IPs, emails, and URLs, saving analysts from manual conversion.

Multiple Encoding Support

Handles hex, URL, and base64-encoded IOCs in addition to defanging, making it versatile for various threat intelligence sources where IOCs are often disguised.
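As a rough illustration of what decoding an encoded indicator involves, the sketch below tries the three encodings with standard-library calls. The `decode_candidates` helper is hypothetical and does not reflect how iocextract detects or decodes these forms internally.

```python
import base64
from urllib.parse import unquote


def decode_candidates(token: str) -> dict:
    """Try URL, hex, and base64 decoding; keep the decodings that succeed as ASCII."""
    results = {"url": unquote(token)}  # percent-decoding never fails; it may be a no-op
    try:
        results["hex"] = bytes.fromhex(token).decode("ascii")
    except ValueError:  # not valid hex, or decoded bytes are not ASCII
        pass
    try:
        results["base64"] = base64.b64decode(token, validate=True).decode("ascii")
    except ValueError:  # not valid base64, or decoded bytes are not ASCII
        pass
    return results


if __name__ == "__main__":
    print(decode_candidates("http%3A%2F%2Fexample.com%2Fa"))
    print(decode_candidates("687474703a2f2f6578616d706c652e636f6d"))
```

Note that a token can be valid under more than one encoding (a hex string uses only characters from the base64 alphabet), so a caller still has to decide which decoding is plausible.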

Efficient Large-Input Processing

Returns iterators instead of lists, allowing low-memory extraction from massive text corpora, which is crucial for processing feeds or logs without performance hits.
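The memory benefit of an iterator-based design can be sketched with a plain generator. The pattern and the `iter_ipv4_like` function below are assumptions made for this example; they are far looser than a production IP matcher and are not iocextract's implementation.

```python
import re
from typing import Iterable, Iterator

# Illustrative pattern: four 1-3 digit groups separated by "." or "[.]".
# It does not validate octet ranges.
IPV4_LIKE = re.compile(r"\b\d{1,3}(?:\[?\.\]?\d{1,3}){3}\b")


def iter_ipv4_like(lines: Iterable[str]) -> Iterator[str]:
    """Yield IPv4-looking strings one at a time, never building a full list."""
    for line in lines:
        for match in IPV4_LIKE.finditer(line):
            yield match.group(0)


if __name__ == "__main__":
    feed = ["seen 10[.]0[.]0[.]1 and 192.168.1.5", "nothing here"]
    for hit in iter_ipv4_like(feed):
        print(hit)
```

Because `lines` can itself be lazy (an open file object, for instance), only one line and one match need to be held in memory at a time.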

Custom Regex Integration

Allows users to add their own regex patterns via a file or code, enhancing flexibility for specialized extraction needs, as detailed in the custom regex section.
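The general shape of custom-pattern extraction can be sketched as follows. This is a standard-library illustration, not the library's own function: `iter_custom_matches` is a hypothetical name, and the rule that each pattern carries exactly one capture group is an assumption adopted here to keep the output unambiguous.

```python
import re
from typing import Iterable, Iterator


def iter_custom_matches(text: str, patterns: Iterable[str]) -> Iterator[str]:
    """Yield the single captured group for every match of every pattern."""
    for raw in patterns:
        compiled = re.compile(raw)
        if compiled.groups != 1:
            # Assumption for this sketch: one capture group marks the indicator.
            raise ValueError(f"pattern must have exactly one capture group: {raw!r}")
        for match in compiled.finditer(text):
            yield match.group(1)


if __name__ == "__main__":
    # Patterns could equally be read from a file, one per line.
    cve_pattern = r"\b(CVE-\d{4}-\d{4,7})\b"
    report = "Exploits CVE-2021-44228, previously CVE-2017-0144."
    print(list(iter_custom_matches(report, [cve_pattern])))
```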

Cons

Duplicate Extraction Issues

Overlapping regex patterns can cause the same IOC to be extracted multiple times, as noted in the library examples, requiring manual deduplication or post-processing.
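One simple way to handle the post-processing is an order-preserving deduplication wrapper around whatever iterator the extraction step returns. The `unique` helper is a hypothetical name for this sketch.

```python
from typing import Hashable, Iterable, Iterator, TypeVar

T = TypeVar("T", bound=Hashable)


def unique(items: Iterable[T]) -> Iterator[T]:
    """Yield each item the first time it appears, preserving input order."""
    seen = set()
    for item in items:
        if item not in seen:
            seen.add(item)
            yield item


if __name__ == "__main__":
    extracted = ["http://a.example", "a.example", "http://a.example"]
    print(list(unique(extracted)))
```

This keeps the streaming behavior of the upstream iterator, at the cost of a `seen` set that grows with the number of distinct indicators. It removes exact duplicates only; an IOC that is also matched as a substring of a longer one still needs separate handling.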

Limited to Text-Based Input

Not optimized for binary data or structured formats; extracting from HTML requires additional tools like Beautiful Soup, as admitted in the FAQ, leading to extra setup.
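The preprocessing step amounts to reducing markup to visible text before running extraction. The sketch below uses the standard library's `html.parser` in place of Beautiful Soup so that it has no third-party dependencies; `html_to_text` and `_TextCollector` are hypothetical names, and real-world HTML may call for a more forgiving parser.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html: str) -> str:
    """Return the visible text of an HTML document, one chunk per line."""
    parser = _TextCollector()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


if __name__ == "__main__":
    page = "<p>C2 at hxxp://bad[.]example</p><script>var x = 1;</script>"
    print(html_to_text(page))
```

The resulting plain text can then be passed to the extraction step, which avoids matching on attribute values and script contents.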

Windows Installation Hurdles

Installation on Windows can be problematic due to dependencies on the regex library, necessitating manual wheel installation as per the README, which adds complexity.

Quick Stats

Stars: 578
Forks: 92
Contributors: 0
Open Issues: 2
Last commit: 1 year ago
Created: 2018

Tags

#threat-sharing #library #python-library #regex #osint #security-automation #ioc-extraction #ioc #cli-tool #indicators-of-compromise #malware-analysis #threat-intelligence #threatintel #cybersecurity #incident-response #malware-research

Built With

Python
regex

Included in

Malware Analysis · 13.6k

Related Projects

MISP

MISP (core software) - Open Source Threat Intelligence and Sharing Platform

Stars: 6,289
Forks: 1,582
Last commit: 4 days ago

ThreatIngestor

Extract and aggregate threat intelligence.

Stars: 910
Forks: 135
Last commit: 2 years ago

Combine

Tool to gather Threat Intelligence indicators from publicly available sources

Stars: 657
Forks: 172
Last commit: 7 years ago