A curated dataset of packed and unpacked PE executables for training machine learning models to detect packing.
Dataset of packed PE samples
A repository of LIVE malware for your own joy and pleasure. theZoo is a project created to make malware analysis open and available to the public.
Malware samples, analysis exercises and other interesting resources.
Elastic Malware Benchmark for Empowering Researchers
EMBER2024 is an updated malware dataset designed for researchers to explore a variety of classification tasks, including malicious/benign detection, malware family classification, and behavior prediction. It provides raw features and multiple label types for 3.2 million files, enabling holistic evaluation of machine learning models in cybersecurity.

## Key Features

- **Multi-File Type Support**: Includes Win32, Win64, .NET, APK, ELF, and PDF files for cross-platform analysis.
- **Temporal Split**: Training and test sets are separated by time to simulate detection of newer malware.
- **Challenge Set**: Contains 6,315 evasive malicious files initially undetected by antivirus products.
- **Feature Version 3**: Re-implemented feature vector format using the stable pefile library, with additions like DOS header and Authenticode signature features.
- **Extended Labels**: Seven types of labels and tags support diverse classification tasks beyond simple detection.
- **Capa Integration**: Includes malware behavior analysis results (ATT&CK techniques, MBC behaviors) for Win32, Win64, .NET, and ELF files.

## Philosophy

EMBER2024 aims to provide a comprehensive, realistic benchmark that reflects the evolving malware landscape, enabling robust evaluation of classifier performance on novel and evasive threats.
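
As a rough illustration of the temporal split listed under Key Features, the sketch below separates samples by a cutoff date so that every test sample is newer than every training sample. It does not use the EMBER2024 tooling: the record layout, the `first_seen` field, the `temporal_split` helper, and the cutoff date are all hypothetical and chosen only for the example.

```python
from datetime import date

# Hypothetical records: (sample_id, first_seen, label), where label 1 = malicious
# and 0 = benign. These are made-up values, not EMBER2024 data.
samples = [
    ("a1", date(2023, 10, 2), 0),
    ("b2", date(2024, 1, 15), 1),
    ("c3", date(2024, 6, 30), 1),
    ("d4", date(2024, 9, 1), 0),
    ("e5", date(2024, 11, 20), 1),
]


def temporal_split(records, cutoff):
    """Put records first seen before `cutoff` in train and the rest in test."""
    train = [r for r in records if r[1] < cutoff]
    test = [r for r in records if r[1] >= cutoff]
    return train, test


# Assumed cutoff date, picked arbitrarily for this example.
train, test = temporal_split(samples, cutoff=date(2024, 9, 1))
print(len(train), "training samples,", len(test), "test samples")
```

Unlike a random split, this keeps samples that appear after the cutoff out of training, which is what lets the test set stand in for newer malware.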