An open-source suite featuring financial large language models (FinMA), instruction datasets (FIT), and evaluation benchmarks (FinBen) for financial AI.
PIXIU is an open-source project that provides a comprehensive suite for financial artificial intelligence, including the first financial large language models (FinMA), a multi-task instruction dataset (FIT), and a holistic evaluation benchmark (FinBen). It addresses the need for specialized, transparent tools to develop, fine-tune, and assess LLMs on financial tasks like sentiment analysis, question answering, and stock prediction.
Intended for AI researchers, data scientists, and financial technology developers who apply large language models to financial domains such as quantitative analysis, risk assessment, and financial NLP.
PIXIU offers a fully open-source, holistic framework with specialized financial LLMs, diverse instruction data, and a rigorous multi-task benchmark, enabling reproducible research and development in financial AI without reliance on proprietary models.
This repository introduces PIXIU, an open-source resource featuring the first financial large language models (LLMs), instruction tuning data, and evaluation benchmarks to holistically assess financial LLMs. Our goal is to continually push forward the open-source development of financial artificial intelligence (AI).
FinBen covers over 30 diverse tasks, including sentiment analysis, QA, and stock prediction, with per-task performance metrics reported in the project's tasks table and leaderboard.
Supports English, Chinese, and Spanish financial texts and combines text, tables, and time-series data such as stock prices for realistic scenarios, as listed in the project's key features.
Provides all resources openly, including FinMA models, FIT datasets, and FinBen benchmarks, encouraging reproducible research and avoiding vendor lock-in.
Developed by multiple universities, with papers published in venues like NeurIPS, which lends credibility and suggests ongoing updates, as noted in the citations and institution logos.
Requires Docker installation, manual BART checkpoint downloads, and configuration of Huggingface models, making initial deployment cumbersome for non-experts.
Labeled as v0.1 with some guides like 'How to fine-tune' marked 'coming soon', indicating potential bugs and lack of polished documentation.
FinMA models come in 7B and 30B parameter sizes, which require substantial GPU memory and compute and may be prohibitive for resource-constrained teams.
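To put the memory requirement in rough numbers, here is a minimal back-of-the-envelope sketch. It assumes nominal parameter counts of exactly 7 billion and 30 billion and counts only the model weights. Real usage is higher because of activations, the KV cache, and framework overhead. The function name, the precision table, and the precisions themselves are illustrative assumptions, not something the PIXIU project specifies.

```python
# Weight-only memory estimate: parameters x bytes per parameter.
# Assumption: nominal parameter counts; real checkpoints differ slightly.
MODEL_SIZES = {"7B": 7e9, "30B": 30e9}

# Assumption: common storage precisions, chosen for illustration only.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def estimate_weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Return the memory needed for the weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    for model_name, num_params in MODEL_SIZES.items():
        for precision, width in BYTES_PER_PARAM.items():
            gb = estimate_weight_memory_gb(num_params, width)
            print(f"{model_name} @ {precision}: ~{gb:.1f} GB for weights")
```

Under these assumptions, half precision needs about 14 GB for the 7B model and about 60 GB for the 30B model before any runtime overhead, so the larger model generally will not fit on a single consumer GPU without quantization or sharding.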