Open-Awesome

© 2026 Open-Awesome. Curated for the developer elite.


sist2

GPL-3.0 · C · Self-Hosted

A fast, multi-threaded file system indexer and search tool with a web interface, supporting text/metadata extraction, thumbnails, OCR, and incremental scanning.

1.3k stars · 77 forks · 0 contributors

What is sist2?

sist2 is a command-line and web-based file system indexer and search engine. It scans directories, extracts text and metadata from a wide variety of file formats, and builds a searchable index that can be queried through a fast web interface. It solves the problem of quickly finding content across large, local file collections without relying on cloud services.

Target Audience

Developers, sysadmins, and power users who need to index and search personal or organizational file archives, media libraries, or document repositories on their own hardware.

Value Proposition

sist2 offers fast multi-threaded scanning, low resource consumption, extensive file format support, and the flexibility to run entirely offline with either a heavyweight (Elasticsearch) or lightweight (SQLite) backend.

Overview

Lightning-fast file system indexer and search tool

Use Cases

Best For

  • Indexing and searching personal document archives
  • Creating a searchable media library (photos, videos, audio)
  • Building a self-hosted alternative to Spotlight or Windows Search
  • Adding full-text search to local development file collections
  • Performing OCR on scanned documents or ebooks for searchability
  • Analyzing and tagging files with custom metadata via user scripts

Not Ideal For

  • Production environments requiring stable, mature software with long-term support guarantees
  • Teams needing real-time, collaborative tagging and indexing with multiple concurrent users
  • Non-Docker users who want built-in web-based job management and scheduling
  • Applications demanding advanced search features like fuzzy matching without relying on Elasticsearch

Pros & Cons

Pros

Extensive Format Support

Extracts text, metadata, and thumbnails from over 20 file types including PDFs, images, audio, video, and archives, as detailed in the format support table, making it versatile for mixed media collections.

High Performance Scanning

Uses multi-threaded scanning with low memory footprint, enabling fast indexing of large directories without significant resource drain, as emphasized in the README.
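
As a rough illustration of the multi-threaded scanning idea described above, the sketch below walks a directory on one thread and hands each file to a pool of workers. It is not sist2's implementation (sist2 is written in C); the names `describe`, `walk_files`, and `scan` are hypothetical, and the per-file work is reduced to reading basic metadata.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def describe(path):
    """Collect basic metadata for one file (stand-in for real parsing work)."""
    st = os.stat(path, follow_symlinks=False)
    return {"path": path, "size": st.st_size, "mtime": int(st.st_mtime)}


def walk_files(root):
    """Yield the path of every file found under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            yield os.path.join(dirpath, name)


def scan(root, threads=4):
    """Walk root on the calling thread and process files on worker threads."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # pool.map returns results in the same order the paths were submitted.
        return list(pool.map(describe, walk_files(root)))
```

Calling `scan("/some/directory", threads=8)` returns one dictionary per file. In a real indexer the worker function would do the expensive part (text extraction, thumbnails, OCR), which is where parallelism tends to pay off.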

Flexible Search Backends

Supports both Elasticsearch for full-featured search and SQLite for lightweight setups, allowing users to choose based on resource constraints and feature needs.

Incremental Efficiency

Only indexes new or modified files on subsequent runs, reducing scan times and system load for ongoing updates, which is core to its design philosophy.
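
One common way to implement this kind of incremental scan is to compare each file's modification time against a snapshot saved by the previous run. The sketch below assumes that approach; it is not taken from sist2's source, and `incremental_scan` and its snapshot format are hypothetical.

```python
import os


def incremental_scan(root, previous):
    """Compare files under root against a previous {path: mtime_ns} snapshot.

    Returns (to_index, deleted, snapshot): paths that are new or modified,
    paths that disappeared since the previous run, and the new snapshot.
    """
    snapshot = {}
    to_index = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            mtime = os.stat(path).st_mtime_ns
            snapshot[path] = mtime
            # New files are missing from `previous`; modified ones differ.
            if previous.get(path) != mtime:
                to_index.append(path)
    deleted = sorted(set(previous) - set(snapshot))
    return sorted(to_index), deleted, snapshot
```

On the first run, pass an empty dictionary as `previous` so every file is indexed; on later runs, pass the snapshot returned by the run before. Only the paths in `to_index` need to be parsed again.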

Modern Web Interface

Provides a mobile-friendly UI for managing scans, browsing results, and manual tagging, enhancing accessibility without requiring cloud services.

Cons

Early Development Risks

The README warns that the project is in early development, so breaking changes, bugs, and incomplete features are possible, making it unsuitable for critical deployments.

Docker Dependency for Management

Full web-based job management and scheduling are only available with Docker Compose setup, limiting functionality for users who prefer native executables or other container systems.

Archive Scanning Limitations

Archive files are scanned sequentially by a single thread, and support for seek-dependent formats like GIFs is limited, which can bottleneck performance on I/O-heavy systems.

SQLite Feature Gaps

When using the SQLite backend, features like fuzzy search, real-time media type updates, and efficient embeddings search are missing, as noted in the search backends comparison table.
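
To illustrate why fuzzy search is harder on a lightweight backend, the sketch below uses SQLite's FTS5 extension through Python's `sqlite3` module. It assumes an SQLite build with FTS5 enabled; the table layout and the `build_index` and `search` helpers are hypothetical and do not reflect sist2's actual schema.

```python
import sqlite3


def build_index(docs):
    """Create an in-memory FTS5 table from (path, content) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE docs USING fts5(path, content)")
    conn.executemany("INSERT INTO docs (path, content) VALUES (?, ?)", docs)
    return conn


def search(conn, query):
    """Return paths of documents matching an FTS5 query, best match first."""
    rows = conn.execute(
        "SELECT path FROM docs WHERE docs MATCH ? ORDER BY rank", (query,)
    )
    return [row[0] for row in rows]
```

With this setup, a search for `invoice` finds a document containing "Invoice for March", but the misspelling `invoce` matches nothing, because FTS5 matches whole tokens (or explicit prefixes) rather than approximate spellings.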


Quick Stats

Stars: 1,264
Forks: 77
Contributors: 0
Open Issues: 79
Last commit: 10 months ago
Created: 2019

Tags

#metadata-extraction #c #vuejs #docker #full-text-search #sqlite #ocr #web-interface #elasticsearch

Built With

  • SQLite
  • Elasticsearch
  • Vue.js
  • libarchive
  • CMake
  • Python
  • FFmpeg
  • Docker
  • Tesseract
  • C++

Included in

Self Hosted · 284.1k

Related Projects

Manticore Search

Easy to use open source fast database for search | Good alternative to Elasticsearch | Drop-in replacement for E in the ELK stack

Stars: 11,803
Forks: 625
Last commit: 1 day ago

Whoogle

A self-hosted, ad-free, privacy-respecting metasearch engine

Stars: 11,522
Forks: 1,042
Last commit: 4 days ago

Websurfx

An open source alternative to searx which provides a modern-looking, lightning-fast, privacy-respecting, secure meta search engine

Stars: 1,112
Forks: 128
Last commit: 10 days ago

Meme Search

The open source Meme Search Engine and Finder. Free and built to self-host locally with Python, Ruby, and Docker.

Stars: 671
Forks: 25
Last commit: 20 days ago