A Python library for agile data preparation workflows that works with pandas, Dask, cuDF, Dask-cuDF, Vaex, and PySpark.
A Python library providing clean, chainable functions for data cleaning and manipulation with pandas DataFrames.
A Python library for calculating common financial risk and performance metrics used in quantitative finance.
A pandas DataFrame wrapper for calculating over 70 stock market indicators and statistics with inline column access.
A Python plotting library that generates interactive D3.js visualizations from pandas DataFrames.
A visual, low-code data preparation tool that generates Python code for ETL, reporting, and AI-assisted workflows.
Query pandas DataFrames using SQL syntax, similar to sqldf in R.
A lightweight and intuitive Go library for data manipulation, statistics, and machine learning using DataFrames.
A curated collection of 500+ resources for data analysis and data science, covering Python, SQL, ML, visualization, roadmaps, and interview prep.
A collection of Jupyter notebooks for financial economics, providing high-level APIs to retrieve, analyze, and visualize economic data from sources like FRED.
A cleaned and normalized time series dataset of global COVID-19 confirmed cases, deaths, and recoveries, updated daily.
Display pandas and Polars DataFrames as interactive, sortable, and searchable DataTables in Jupyter notebooks and Python applications.
A collection of IPython notebooks demonstrating data analysis and machine learning techniques on security datasets.
A Python library that brings R's dplyr data manipulation syntax to pandas DataFrames using a pipe operator.
A Python library that automates the tedious parts of exploratory data analysis with cleaning, feature engineering, visualization, and versioning.
A Python library for comparing pandas, Polars, Spark, and Snowpark DataFrames with detailed reporting and flexible matching.
A modular multi-modal transactional database for AI and semantic search, replacing MongoDB, Neo4j, and Elastic with a single ACID solution.
A tutorial series comparing how to implement data science concepts and build applications in both Python and R ecosystems.
A VS Code extension for visually exploring, cleaning, and transforming tabular data with automatic pandas code generation.
An engine for ML/data tracking, visualization, explainability, drift detection, and dashboards, integrated with Polyaxon.
A lightweight Python tool for generating rich summary statistics of pandas and Polars DataFrames directly in the console.
An overlay companion for pandas that provides real-time hints and tips to improve data analysis code.
A Python library that provides a pandas-like API on top of Apache Spark DataFrames for distributed data analysis.
A Python devkit for loading, exploring, and manipulating PandaSet, a large-scale autonomous driving dataset with LiDAR, camera, and annotation data.
A comprehensive scientific computing and AI/ML library in pure Rust, offering SciPy-compatible APIs with claimed 10-100x performance gains.
A Python toolkit for text-focused data science on medium-sized datasets, bridging in-memory and cluster-scale processing.
A Python package for automated univariate and bivariate data analysis and visualization to streamline machine learning workflows.
A Clojure library providing DataFrames and arrays through Python's pandas and NumPy.
A Python library for blazing-fast, memory-efficient genomics data operations using DataFrames.
A Python package for processing and normalizing high-dimensional morphological feature data from high-throughput cell imaging experiments.
A Python histogram library offering updateable, semantic histogram objects with multiple visualization backends and data source support.
A Python library implementing fairness-aware machine learning algorithms for measuring and mitigating discrimination in predictive models.
An open-source data pipeline that aggregates and standardizes heterogeneous public COVID-19 data from multiple global sources.
A pandas-based Python library for calculating weighted statistics like means, medians, standard deviations, and distributions.
A lightweight Python parser for EDI 835 Health Care Claim Payment and Remittance Advice files.