A fast, multi-threaded file system indexer and search tool with a web interface, supporting text/metadata extraction, thumbnails, OCR, and incremental scanning.
sist2 is a command-line and web-based file system indexer and search engine. It scans directories, extracts text and metadata from a wide variety of file formats, and builds a searchable index that can be queried through a web interface. It addresses the problem of quickly finding content across large local file collections without relying on cloud services.
sist2 is aimed at developers, sysadmins, and power users who need to index and search personal or organizational file archives, media libraries, or document repositories on their own hardware.
Developers choose sist2 for its exceptional speed, low resource consumption, extensive file format support, and the flexibility to run entirely offline with either a heavyweight (Elasticsearch) or lightweight (SQLite) backend.
Lightning-fast file system indexer and search tool
Extracts text, metadata, and thumbnails from over 20 file types including PDFs, images, audio, video, and archives, as detailed in the format support table, making it versatile for mixed media collections.
Uses multi-threaded scanning with a low memory footprint, enabling fast indexing of large directories without significant resource drain, as emphasized in the README.
Supports both Elasticsearch for full-featured search and SQLite for lightweight setups, allowing users to choose based on resource constraints and feature needs.
Indexes only new or modified files on subsequent runs, reducing scan times and system load for ongoing updates, which is core to its design philosophy.
Provides a mobile-friendly UI for managing scans, browsing results, and manual tagging, enhancing accessibility without requiring cloud services.
The README warns that the project is in early development, so breaking changes, bugs, and incomplete features are possible, making it unsuitable for critical deployments.
Full web-based job management and scheduling are available only with the Docker Compose setup, limiting functionality for users who prefer native executables or other container systems.
Archive files are scanned sequentially by a single thread, and support for seek-dependent formats like GIFs is limited, which can bottleneck performance on I/O-heavy systems.
When using the SQLite backend, features like fuzzy search, real-time media type updates, and efficient embeddings search are missing, as noted in the search backends comparison table.
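The incremental scanning point above (only new or modified files are indexed on later runs) can be illustrated with a minimal sketch. This is not sist2's actual implementation: it assumes that change detection compares each file's modification time and size against the values recorded on the previous run, and the function name `incremental_scan` and the in-memory `previous` dictionary are hypothetical.

```python
import os


def incremental_scan(root, previous):
    """Return (changed_paths, current_index) for all files under root.

    `previous` maps path -> (mtime_ns, size) as recorded by the last run.
    A file counts as changed when it is new or its signature differs.
    This signature scheme is an assumption made for illustration only.
    """
    current = {}
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            signature = (st.st_mtime_ns, st.st_size)
            current[path] = signature
            # New files have no previous entry; modified files differ.
            if previous.get(path) != signature:
                changed.append(path)
    return sorted(changed), current
```

On a first run, passing an empty dictionary as `previous` marks every file as changed; passing the returned index to the next run leaves only new or modified files to process. Handling of deleted files is omitted from this sketch.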
Easy to use open source fast database for search | Good alternative to Elasticsearch | Drop-in replacement for E in the ELK stack
A self-hosted, ad-free, privacy-respecting metasearch engine
An open source alternative to searx which provides a modern-looking, lightning-fast, privacy-respecting, secure meta search engine
The open source Meme Search Engine and Finder. Free and built to self-host locally with Python, Ruby, and Docker.