A powerful Python library for data manipulation and analysis, providing fast, flexible data structures.
A powerful Python library for data analysis and manipulation, providing fast, flexible data structures.
A comprehensive guide to Python's essential data science libraries, available as free Jupyter notebooks.
A 10-week, 20-lesson curriculum teaching data science fundamentals through project-based learning and quizzes.
A comprehensive collection of data science Python notebooks covering deep learning, machine learning, big data, visualization, and essential tools.
Companion materials and IPython notebooks for the 'Python for Data Analysis' book, covering pandas, NumPy, and data science workflows.
A Python library that enables conversational data analysis on SQL, CSV, and Parquet files using LLMs and RAG.
A Python library for downloading market data from Yahoo! Finance's public API.
An open-source Python quantitative trading system supporting stocks, options, futures, and cryptocurrencies with integrated machine learning.
A Python utility for crawling, cleaning, and storing historical financial data for Chinese stocks and futures.
A Python visualization library based on matplotlib for creating attractive statistical graphics with a high-level interface.
Generate comprehensive data quality profiling and exploratory data analysis reports for Pandas and Spark DataFrames with a single line of code.
Generate comprehensive data quality profiles and exploratory data analysis reports for Pandas and Spark DataFrames with a single line of code.
A drop-in replacement for pandas that scales data analysis workflows to use all CPU cores and handle out-of-memory datasets.
A GPU-accelerated DataFrame library for tabular data processing, part of the RAPIDS data science suite.
A portable Python dataframe library that compiles to SQL and works with over 20 backends for unified data manipulation.
A curated collection of Python tutorials and resources for data science, machine learning, and natural language processing.
A Python library that automates data visualization and exploration for pandas DataFrames in Jupyter notebooks.
A Python library that extends pandas to work with geographic data, enabling spatial operations and analysis.
A modular container build system providing the latest AI/ML packages for NVIDIA Jetson and JetPack-L4T.
A flexible and expressive API for performing statistical data validation on dataframe-like objects.
A Python library for visualizing missing data in pandas DataFrames using matrix, bar, heatmap, and dendrogram plots.
A Python package for working with labeled multi-dimensional arrays, inspired by pandas and tailored for scientific data.
A Python library that simplifies data integration between pandas and AWS services like Athena, S3, Redshift, and more.
A collection of handwritten notes, notebooks, and resources for Andrew Ng's Deep Learning Specialization on Coursera.
Koalas provides the pandas DataFrame API on Apache Spark, enabling data scientists to work with big data using familiar pandas syntax.
A Python library for automated exploratory data analysis (EDA) with high-density visualizations and target analysis in two lines of code.
A high-performance datastore optimized for storing and retrieving time series and tick data.
A language and runtime that optimizes performance of data-intensive applications by lazily building and optimizing computations across libraries.
A Python package that automatically accelerates pandas and Modin DataFrame apply operations by choosing the fastest available method.
A Python library for building Flask-based dashboards with React components using Jinja templating.
A Python library implementing over 80 financial technical indicators using Pandas for trading analysis.
An open-source Python library for low-code data preparation, offering fast EDA, data cleaning, and collection from APIs and databases.
A Python library for feature engineering and selection with scikit-learn compatible transformers.
Automatically visualize any dataset with a single line of code, including data quality assessment and fixes.