Wrap the Gemini CLI as an OpenAI-compatible API service to use the free Gemini Pro model via standard API calls.
Gemini CLI Proxy is a Python service that wraps Google's official Gemini CLI, exposing it as an OpenAI-compatible HTTP API server. It solves the problem of accessing the free Gemini 2.5 Pro model programmatically, enabling developers to use standard OpenAI client libraries and tools with Gemini.
Developers and hobbyists building AI-powered applications who want to use Google's Gemini model via a standard API interface without paying for commercial API services.
It provides a free, self-hosted alternative to paid LLM APIs by leveraging Google's Gemini CLI, with full OpenAI API compatibility for easy integration into existing workflows and tools like Cherry Studio.
Wrap Gemini CLI as an OpenAI-compatible API service, allowing you to enjoy the free Gemini Pro model through API!
Open-Awesome is built by the community, for the community. Submit a project, suggest an awesome list, or help improve the catalog on GitHub.
Implements the /v1/chat/completions endpoint exactly, allowing drop-in replacement with existing OpenAI client libraries, as demonstrated in the Python and Cherry Studio examples.
Zero-configuration startup via uvx or standard Python tooling gets the service running in minutes, with clear commands provided in the README.
Leverages the free Gemini 2.5 Pro model through Google's CLI, offering a no-cost alternative to paid LLM APIs for development and prototyping.
Allows customization of host, port, rate limits, and concurrency via command-line arguments, adapting to different local or network environments.
Requires manual proxy configuration in certain regions to access Google services, leading to frequent timeouts and connectivity issues, as admitted in the FAQ.
Depends on Google's Gemini CLI, which adds complexity and potential breakage if the CLI changes or has bugs, since this proxy merely wraps it.
Only supports basic chat completions; lacks advanced OpenAI API features like streaming responses, function calling, or fine-tuning, limiting its use in sophisticated applications.
Uses a dummy API key for all requests, providing no real security, which makes it unsuitable for multi-user or exposed deployments without additional layers.