A fast, memory-optimized C tool to remove duplicates from massive wordlists while preserving order, designed for password cracking.
Duplicut is a high-performance, memory-optimized command-line tool written in C that removes duplicate lines from massive wordlists without sorting them. It solves the specific problem in password cracking where wordlist order must be preserved to keep the most probable passwords at the front for efficient cracking, while still handling files larger than available RAM.
Duplicut is aimed at security researchers, penetration testers, and password cracking enthusiasts who work with large, combined wordlists and need efficient deduplication without losing the strategic order of passwords.
Developers choose Duplicut because it combines order preservation with the ability to process wordlists that exceed system memory, using optimized C code and multithreading for speed. This addresses a gap left by general-purpose deduplication tools.
Remove duplicates from MASSIVE wordlist, without sorting it (for dictionary-based password cracking)
Can process wordlists larger than available RAM by splitting them into virtual chunks, so multi-gigabyte files can be handled even when they do not fit in memory.
Deduplicates without altering line order, which is critical for password cracking efficiency where probable passwords must remain at the front.
Uses compressed hashmap items and tagged pointers to minimize memory footprint, as described in the project's technical implementation notes.
Leverages multiple threads for faster processing on modern hardware, speeding up deduplication for performance-intensive workflows.
Includes options to filter by line length, ASCII printable characters, and case conversion, adding utility beyond basic deduplication for tailored wordlist preparation.
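The order-preserving deduplication described above can be illustrated with a small in-memory sketch. This is not Duplicut's actual code: the function name `dedup_keep_order`, the FNV-1a hash, and the linear-probing table are all assumptions chosen for brevity, and the sketch ignores chunking, compressed items, and threading.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* FNV-1a string hash: a simple stand-in, not necessarily what Duplicut uses. */
static uint64_t fnv1a(const char *s)
{
    uint64_t h = 14695981039346656037ULL;
    for (; *s; s++) {
        h ^= (unsigned char)*s;
        h *= 1099511628211ULL;
    }
    return h;
}

/*
 * Hypothetical helper: removes duplicate strings from lines[0..n) in place,
 * keeping the first occurrence of each line and the original relative order.
 * Returns the new count, or n (array unchanged) if allocation fails.
 */
size_t dedup_keep_order(const char **lines, size_t n)
{
    size_t cap = 16;
    while (cap < n * 2)          /* power of two, load factor <= 0.5 */
        cap *= 2;

    const char **table = calloc(cap, sizeof *table);
    if (table == NULL)
        return n;

    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        assert(lines[i] != NULL);
        size_t slot = (size_t)fnv1a(lines[i]) & (cap - 1);
        /* Linear probing: stop at an empty slot or at an equal string. */
        while (table[slot] != NULL && strcmp(table[slot], lines[i]) != 0)
            slot = (slot + 1) & (cap - 1);
        if (table[slot] == NULL) {   /* first time this line is seen */
            table[slot] = lines[i];
            lines[out++] = lines[i]; /* compact toward the front, order kept */
        }
    }
    free(table);
    return out;
}
```

Calling `dedup_keep_order` on an array of line pointers compacts the unique lines to the front and returns how many remain. Because every line stays in memory alongside the table, this approach alone would not scale to files larger than RAM, which is the case the chunked design is meant to handle.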
The --line-max-size option cannot exceed 4095 characters, which may be restrictive for use cases with very long lines or non-standard data.
Using the --dupfile option to save duplicates slows down processing, as the README notes, making it a poor fit for time-sensitive runs.
Optimized for ASCII text and password cracking; lacks built-in support for non-ASCII encodings or binary files, limiting general-purpose applicability.
Requires compilation from C source, which can be a hurdle for users without development tools or those seeking quick, package-based installation.
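The line-length limit and printable-ASCII filtering mentioned above can be sketched as a pair of small helpers. The names `line_is_acceptable` and `line_to_lower` are hypothetical, and the sketch assumes that length is counted in bytes and that over-long or non-printable lines are simply rejected; Duplicut's real behavior may differ in detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_LINE_SIZE 4095  /* upper bound stated for --line-max-size */

/*
 * Hypothetical helper: returns true if the line is at most max_size bytes
 * long and consists only of printable ASCII (0x20 through 0x7E).
 */
bool line_is_acceptable(const char *line, size_t max_size)
{
    assert(line != NULL);
    assert(max_size <= MAX_LINE_SIZE);

    size_t len = strlen(line);
    if (len > max_size)
        return false;

    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)line[i];
        if (c < 0x20 || c > 0x7E)   /* control byte or non-ASCII byte */
            return false;
    }
    return true;
}

/* Hypothetical helper: converts ASCII uppercase letters to lowercase in place. */
void line_to_lower(char *line)
{
    for (; *line; line++) {
        if (*line >= 'A' && *line <= 'Z')
            *line = (char)(*line - 'A' + 'a');
    }
}
```

Under these assumptions, a UTF-8 word such as "café" is rejected by the printable check because its accented character is encoded as bytes above 0x7E, which illustrates why the ASCII focus limits use with non-ASCII wordlists.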
Common User Passwords Profiler (CUPP)
Up to 100x faster strings for C, C++, CUDA, Python, Rust, Swift, JS, & Go, leveraging NEON, AVX2, AVX-512, SVE, GPGPU, & SWAR to accelerate search, hashing, sorting, edit distances, sketches, and memory ops 🦖
Mentalist is a graphical tool for custom wordlist generation. It utilizes common human paradigms for constructing passwords and can output the full wordlist as well as rules compatible with Hashcat and John the Ripper.
Generate smart and powerful wordlists