An LLM-powered web honeypot that dynamically crafts realistic HTTP responses to mimic various applications and detect malicious traffic.
Galah is an LLM-powered web honeypot designed to mimic various web applications by dynamically generating realistic HTTP responses to incoming requests. It solves the problem of static, easily detectable honeypots by using large language models to craft contextually appropriate replies, helping security teams detect and analyze malicious web traffic. The project includes features like response caching and Suricata rule matching for enhanced efficiency and threat detection.
Galah is aimed at security researchers, penetration testers, and blue team defenders who need to deploy deceptive systems to monitor attack patterns and gather threat intelligence on web-based exploits.
Developers choose Galah over traditional honeypots because it dynamically adapts to any HTTP request without manual configuration, supports multiple LLM providers for flexibility, and reduces operational costs through intelligent response caching.
Galah: An LLM-powered web honeypot.
Uses LLMs to craft context-aware HTTP headers and bodies for any request, eliminating manual emulation of web apps; examples include mimicking TP-LINK routers with SOAP responses.
Integrates with OpenAI, GoogleAI, Anthropic, Cohere, GCP Vertex AI, and Ollama, offering flexibility in model choice and cost management as shown in the provider options.
Caches generated responses per port for a configurable duration (set with the cache-duration flag), reducing repeated LLM API calls and lowering operating costs.
Optional Suricata rule matching inspects HTTP requests for known attack patterns, logging matches for forensic analysis, as demonstrated in the JSON event log with SID 2061623.
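The per-port response caching mentioned above can be illustrated with a small time-bounded cache. This is a minimal sketch and not Galah's actual code: the `ResponseCache` class, the key layout (port plus a hash of method, URI, and body), the `respond` helper, and the injectable clock are all assumptions made for illustration.

```python
import hashlib
import time
from typing import Callable, Dict, Optional, Tuple


class ResponseCache:
    """Hypothetical time-bounded cache keyed by port and request fingerprint."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._store: Dict[str, Tuple[float, str]] = {}

    @staticmethod
    def key(port: int, method: str, uri: str, body: bytes = b"") -> str:
        # Assumed key layout: the same request on a different port is a
        # different entry, which is what "per port" caching implies.
        fingerprint = hashlib.sha256(
            method.encode() + b"\n" + uri.encode() + b"\n" + body
        ).hexdigest()
        return f"{port}:{fingerprint}"

    def get(self, key: str) -> Optional[str]:
        entry = self._store.get(key)
        if entry is None:
            return None
        stored_at, response = entry
        if self.clock() - stored_at >= self.ttl:
            del self._store[key]  # entry expired, drop it
            return None
        return response

    def put(self, key: str, response: str) -> None:
        self._store[key] = (self.clock(), response)


def respond(cache: ResponseCache, port: int, method: str, uri: str,
            generate: Callable[[], str]) -> str:
    """Return a cached response, calling generate() (the LLM) only on a miss."""
    k = cache.key(port, method, uri)
    cached = cache.get(k)
    if cached is not None:
        return cached
    response = generate()
    cache.put(k, response)
    return response
```

With a cache like this, repeated scans of the same path on the same port cost one LLM call per cache lifetime, while the same path on another port is generated separately.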
Admits in the README that the honeypot may be identifiable via network fingerprinting, prolonged response times from LLM calls, and non-standard outputs, reducing effectiveness for covert operations.
Suricata rule matching has limited keyword support and poor PCRE handling, compromising advanced threat detection capabilities, as noted in the documentation.
Relies on external LLM APIs that can lead to high expenses or denial-of-wallet attacks if usage limits aren't set, requiring active management beyond the caching feature.
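One way to limit denial-of-wallet exposure is a local call budget placed in front of the LLM client, in addition to usage limits set with the provider. The sketch below is an assumption-laden illustration, not something taken from Galah: `CallBudget`, `guarded_generate`, the fixed-window policy, and the static fallback response are all hypothetical.

```python
import time
from typing import Callable

# Hypothetical static reply served when the budget is exhausted.
FALLBACK = "HTTP/1.1 500 Internal Server Error\r\nContent-Length: 0\r\n\r\n"


class CallBudget:
    """Hypothetical fixed-window limit on the number of LLM calls."""

    def __init__(self, max_calls: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_calls = max_calls
        self.window = window_seconds
        self.clock = clock  # injectable for testing
        self.window_start = clock()
        self.used = 0

    def allow(self) -> bool:
        now = self.clock()
        if now - self.window_start >= self.window:
            # A new window begins: reset the counter.
            self.window_start = now
            self.used = 0
        if self.used >= self.max_calls:
            return False
        self.used += 1
        return True


def guarded_generate(budget: CallBudget, generate: Callable[[], str],
                     fallback: str = FALLBACK) -> str:
    """Call the LLM only while the budget allows it; otherwise serve the fallback."""
    if not budget.allow():
        return fallback
    return generate()
```

A fixed window is the simplest policy to reason about; a token bucket would smooth bursts better, at the cost of slightly more state.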