A Python library that provides reliable, validated JSON outputs from any LLM using Pydantic models.
Instructor is a Python library that enables developers to extract structured, validated JSON data from any large language model (LLM). It addresses unreliable and unstructured LLM outputs by using Pydantic models to define schemas and by handling validation, retries, and parsing automatically.
Instructor is aimed at developers and data scientists building AI applications that need reliable structured data extraction from LLMs, for example data parsing, content analysis, or automation workflows.
Developers choose Instructor for its simplicity, type safety, and provider-agnostic design. It eliminates the need for manual JSON schema writing, error handling, and retry logic, offering a battle-tested solution focused solely on structured extraction.
Structured outputs for LLMs
Unifies access to OpenAI, Anthropic, Google, Ollama, and other LLMs through a single interface, simplifying multi-provider setups as shown in the README code examples.
Leverages Pydantic for type safety and automatically retries failed extractions with error feedback, reducing manual error handling and improving reliability.
Supports real-time streaming of incomplete structured data using the Partial wrapper, enabling progressive updates for responsive applications.
Offers official libraries in Python, TypeScript, Ruby, Go, Elixir, and Rust, making it accessible across different tech stacks without logic rewrites.
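The validate-and-retry behaviour described above can be illustrated with a small standard-library sketch. This is not Instructor's implementation or API: the `User` dataclass stands in for a Pydantic model, and `validate_user`, `extract_with_retries`, and `FakeModel` are hypothetical names invented for this example. The stub model returns an invalid reply first so that the error-feedback path is exercised.

```python
import json
from dataclasses import dataclass


@dataclass
class User:
    """Target schema (a stand-in for a Pydantic model)."""
    name: str
    age: int


def validate_user(raw: str) -> User:
    """Parse raw JSON text and check field types; raise ValueError on any problem."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"invalid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if not isinstance(data.get("name"), str):
        raise ValueError("field 'name' must be a string")
    age = data.get("age")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(age, int) or isinstance(age, bool):
        raise ValueError("field 'age' must be an integer")
    return User(name=data["name"], age=age)


def extract_with_retries(call_model, prompt: str, max_retries: int = 2) -> User:
    """Call the model, validate the reply, and re-ask with the error on failure."""
    messages = [prompt]
    last_error = None
    for _ in range(max_retries + 1):
        reply = call_model(messages)
        try:
            return validate_user(reply)
        except ValueError as exc:
            last_error = exc
            # Feed the validation error back so the next attempt can correct it.
            messages = messages + [reply, f"Validation failed: {exc}. Please fix."]
    raise RuntimeError(f"gave up after {max_retries + 1} attempts: {last_error}")


class FakeModel:
    """Hypothetical stub: returns a badly typed reply first, then a valid one."""

    def __init__(self):
        self.calls = 0

    def __call__(self, messages):
        self.calls += 1
        if self.calls == 1:
            return '{"name": "John", "age": "twenty-five"}'
        return '{"name": "John", "age": 25}'


model = FakeModel()
user = extract_with_retries(model, "John is 25 years old")
print(user, model.calls)
```

In this sketch the first reply fails the type check on `age`, the error text is appended to the conversation, and the second call succeeds. With a real provider the second attempt is not guaranteed to succeed, which is why the loop is bounded by `max_retries`.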
Focused solely on data extraction and lacks features for agentic workflows, tool execution, or state management, as acknowledged in the README's comparison with PydanticAI.
Requires Pydantic for model definitions, which can be a barrier for teams using alternative validation libraries or seeking ultra-lightweight solutions.
Automatic retries on validation failures can increase response times, potentially affecting performance in latency-sensitive applications like real-time chatbots.
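To make the latency concern concrete, here is a rough back-of-the-envelope sketch. It assumes each model call takes about the same time and ignores the longer prompts that retries produce, so treat it as an approximate upper bound. The function names and the timing figures are illustrative assumptions, not measurements or part of any library.

```python
def worst_case_latency(per_call_seconds: float, max_retries: int) -> float:
    """Upper bound when every attempt fails validation except the last one."""
    return per_call_seconds * (max_retries + 1)


def max_retries_within(budget_seconds: float, per_call_seconds: float) -> int:
    """Largest retry count whose worst case still fits inside the time budget."""
    attempts = int(budget_seconds // per_call_seconds)
    return max(0, attempts - 1)


# Assumed figures: 1.5 s per call and a 4 s response budget.
print(worst_case_latency(1.5, 3))    # 4 attempts in the worst case
print(max_retries_within(4.0, 1.5))  # retries that fit in the budget
```

For a latency-sensitive application, a retry limit chosen this way trades some extraction reliability for a predictable ceiling on response time.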