Java-based web archive deduplication tool that identifies duplicate records in WARC files and converts them to reference records.
Web archive deduplication tools
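To illustrate the idea in the description, here is a minimal sketch of digest-based deduplication. It is not this tool's actual code or API: the `DedupSketch` and `Rec` names are hypothetical, `Rec` is a simplified in-memory stand-in for a WARC record, and keying duplicates on a SHA-1 digest of the payload alone is an assumption. The reference record is modeled as a `revisit` entry that points back at the first record with the same payload, which is how the WARC format generally expresses this.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DedupSketch {

    /** Hypothetical, simplified stand-in for a WARC record (not a real library type). */
    static final class Rec {
        final String id;        // record identifier
        final String type;      // "response" or "revisit"
        final String url;       // captured URL
        final byte[] payload;   // captured body; empty for revisit entries
        final String refersTo;  // id of the original record, or null

        Rec(String id, String type, String url, byte[] payload, String refersTo) {
            this.id = id;
            this.type = type;
            this.url = url;
            this.payload = payload;
            this.refersTo = refersTo;
        }
    }

    /** Hex-encoded SHA-1 digest of the payload bytes. */
    static String sha1Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(data)) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    /**
     * Keeps the first record seen for each payload digest and replaces later
     * records with the same digest by a payload-free "revisit" entry that
     * refers back to the first one. Assumption: digest equality alone decides
     * whether two records are duplicates.
     */
    static List<Rec> deduplicate(List<Rec> records) {
        Map<String, String> firstSeen = new HashMap<>(); // digest -> id of first record
        List<Rec> out = new ArrayList<>();
        for (Rec r : records) {
            String original = firstSeen.putIfAbsent(sha1Hex(r.payload), r.id);
            if (original == null) {
                out.add(r); // first occurrence: keep as-is
            } else {
                out.add(new Rec(r.id, "revisit", r.url, new byte[0], original));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Rec> input = List.of(
            new Rec("r1", "response", "http://example.org/a", "hello".getBytes(StandardCharsets.UTF_8), null),
            new Rec("r2", "response", "http://example.org/b", "hello".getBytes(StandardCharsets.UTF_8), null),
            new Rec("r3", "response", "http://example.org/c", "world".getBytes(StandardCharsets.UTF_8), null));
        for (Rec r : deduplicate(input)) {
            System.out.println(r.id + " " + r.type + " refersTo=" + r.refersTo);
        }
    }
}
```

A production tool would read and write actual WARC files and would likely take the URL, capture date, or an existing index into account when deciding what counts as a duplicate; the sketch leaves all of that out.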