A model-definition framework for state-of-the-art machine learning models across text, vision, audio, and multimodal tasks.
Connects ChatGPT with visual foundation models to enable sending and receiving images during chat interactions.
Open-source AI orchestration framework for building context-engineered, production-ready LLM applications in Python.
A collection of hands-on tutorials and practical examples for using Google's Gemini API across text, image, video, audio, and robotics applications.
A Python library for language-vision intelligence research, providing unified access to state-of-the-art models, datasets, and tasks.
An automated machine learning library that trains and deploys high-accuracy models for tabular, text, image, and time series data with minimal code.
An open-source embedded retrieval library for multimodal AI, offering fast vector search, SQL, and full-text search.
A fast, flexible, and hardware-aware LLM inference engine with zero-config support for any Hugging Face model.
A terminal-based AI assistant that analyzes code, automates workflows, and executes tasks using natural language commands.
A JAX library for rapid prototyping of large-scale attention-based vision models across images, video, audio, and multimodal data.
An open-source framework for building multimodal AI systems that enable large language models to understand and chat about videos and images.
A desktop AI assistant and universal MCP client that works with any LLM provider, offering chat, image/video generation, and system-wide productivity tools.
Open-Awesome is built by the community, for the community. Submit a project, suggest an awesome list, or help improve the catalog on GitHub.