Open-Awesome

© 2026 Open-Awesome. Curated for the developer elite.

Data Analysis

283 projects

Showing 36 of 283 projects

visualize_ML
Language: Python

A Python package for automated univariate and bivariate data analysis and visualization to streamline machine learning workflows.

#statistical-analysis #statisics #feature-selection
Stars: 205
Forks: 29
Last commit: 9 years ago

rlist
Language: R

An R package providing a toolbox of pipeline-friendly functions for manipulating and querying non-tabular data stored in list objects.

#functional-programming #r-package #r-language
Stars: 204
Forks: 29
Last commit: 3 years ago

HASS-data-detective
Language: Python

A Python package for exploring and analyzing data from your Home Assistant database.

#iot #home-automation #data-science
Stars: 204
Forks: 39
Last commit: 2 months ago

shib
Language: JavaScript

A web client for SQL-like query engines including Hive, Presto, and BigQuery, written in Node.js.

#query-engine #hive #presto
Stars: 199
Forks: 56
Last commit: 9 years ago

go-ml
Language: Go

A Go library implementing essential machine learning algorithms including linear regression, logistic regression, and neural networks.

#logistic-regression #neural-networks #go-library
Stars: 199
Forks: 25
Last commit: 9 years ago

Open Data
Language: R

An archived R package for accessing open data from various government and scientific sources.

#r-package #data-science #scientific-data
Stars: 198
Forks: 46
Last commit: 4 years ago

Panthera
Language: Clojure

A Clojure library providing data frames and arrays through Python's pandas and NumPy.

#array #data-science #dataframe
Stars: 191
Forks: 15
Last commit: 6 years ago

geoblaze
Language: JavaScript

A blazing fast JavaScript raster processing engine for analyzing GeoTIFFs in browsers and Node.js.

#statistics #geospatial #raster
Stars: 190
Forks: 28
Last commit: 1 year ago

Biological Image Analysis

A curated list of software, tools, pipelines, and plugins for image analysis in biological research.

#image-analysis #bioimage-analysis #microscopy
Stars: 186
Forks: 28
Last commit: 2 months ago

Proseg
Language: Rust

A probabilistic cell segmentation method for spatial transcriptomics data from platforms like Xenium, CosMx, MERSCOPE, and Visium HD.

#probabilistic-modeling #single-cell-analysis #computational-biology
Stars: 173
Forks: 18
Last commit: 21 days ago

pipeR
Language: R

An R package providing multiple pipeline styles (operator, object, function) for readable function chaining and data transformation.

#workflow-tools #r-package #r-language
Stars: 172
Forks: 39
Last commit: 9 years ago

tidytransit
Language: R

An R package for reading, analyzing, and visualizing public transit data in GTFS format using tidyverse and sf.

#simple-features #transportation-planning #r-package
Stars: 171
Forks: 22
Last commit: 1 month ago

RDataSets
Language: R

A Julia package providing easy access to 700+ standard R datasets for data analysis and statistical learning.

#julia #data-science #statistics
Stars: 166
Forks: 54
Last commit: 1 month ago

SSTable Tools
Language: Java

A toolkit for parsing, creating, and analyzing Cassandra 3.x SSTables, including an interactive CQL shell.

#cqlsh #dse #cli-tool
Stars: 164
Forks: 33
Last commit: 9 years ago

karoo_gp
Language: Python

A genetic programming platform for Python with TensorFlow for fast CPU and GPU symbolic regression and classification.

#scientific-computing #gpu-acceleration #classification
Stars: 163
Forks: 62
Last commit: 3 years ago

nbflow
Language: Python

A tool for creating one-button reproducible workflows with Jupyter Notebook and SCons.

#workflow-automation #reproducible-research #python
Stars: 161
Forks: 18
Last commit: 7 years ago

Archives Unleashed Toolkit
Language: Scala

An open-source toolkit for analyzing web archives at scale using Apache Spark.

#apache-spark #web-archives #cultural-heritage
Stars: 158
Forks: 34
Last commit: 6 months ago

dadbod-grip.nvim
Language: Lua

A Neovim plugin that lets you edit database tables like Vim buffers with live SQL preview, transaction undo, and cross-database federation.

#ai #database #vim-buffers
Stars: 152
Forks: 3
Last commit: 1 month ago

ghql
Language: R

A GraphQL client for R that lets you query and interact with GraphQL APIs from the R programming language.

#data-querying #r-package #graphql
Stars: 149
Forks: 13
Last commit: 3 months ago

Matft
Language: Swift

A NumPy-like multi-dimensional array library for Swift with support for complex numbers and image processing.

#ndimensional-arrays #scientific-computing #signal-processing
Stars: 145
Forks: 23
Last commit: 1 month ago

utah
Language: Rust

A Rust crate for type-conscious, tabular data manipulation with an expressive, functional interface.

#functional-programming #dataframe #csv-parsing
Stars: 145
Forks: 14
Last commit: 7 years ago

Stats
Language: Julia

A convenience meta-package that loads essential Julia packages for statistics with a single import.

#julia #meta-package #data-science
Stars: 143
Forks: 14
Last commit: 3 years ago

splits-io
Language: Ruby

A speedrun data store, analysis engine, and racing platform for sharing and analyzing split-by-split run history.

#highcharts #rails #redis
Stars: 143
Forks: 32
Last commit: 1 year ago

gtfs-via-postgres
Language: JavaScript

A tool that imports GTFS Schedule data into PostgreSQL for efficient querying and analysis, with support for GraphQL and REST APIs.

#postgis #postgres #gtfs-schedule
Stars: 141
Forks: 21
Last commit: 3 months ago

elusion
Language: Rust

A Rust DataFrame and data engineering library with PySpark/SQL-like syntax, built for business data pipelines with Microsoft stack integration.

#pyspark-alternative #sql-like #data-science
Stars: 141
Forks: 4
Last commit: 2 months ago

statistics
Language: Elixir

Statistical functions and distributions for Elixir, including descriptive statistics and probability distributions.

#scientific-computing #elixir #statistics
Stars: 141
Forks: 30
Last commit: 2 years ago

RMariaDB
Language: R

A DBI-compliant R interface to MariaDB and MySQL databases, designed as a modern replacement for RMySQL.

#database-driver #database #r-package
Stars: 137
Forks: 41
Last commit: 15 days ago

bigmemory
Language: C++

An R package for creating, storing, and manipulating massive matrices using shared memory and memory-mapped files.

#parallel-computing #r-package #shared-memory
Stars: 133
Forks: 25
Last commit: 1 day ago

ipychart
Language: Python

A Python library that brings Chart.js interactive charts to Jupyter notebooks with a familiar API.

#python-library #data-science #jupyter
Stars: 131
Forks: 11
Last commit: 1 year ago

DuckDB
Language: C++

A native Swift API for DuckDB, providing a modern interface for high-performance analytical database operations across Apple, Linux, and Windows.

#sql-database #embedded-database #analytical-database
Stars: 130
Forks: 40
Last commit: 19 days ago

windML

A Python framework for accessing wind data sources and performing renewable energy forecasting and prediction.

#scientific-computing #renewable-energy #python
Stars: 130
Forks: 42
Last commit: 2 years ago

RSiteCatalyst
Language: R

An R package for accessing the Adobe Analytics Reporting API v1.4 to retrieve web analytics data programmatically.

#reporting #r-package #marketing-analytics
Stars: 127
Forks: 38
Last commit: 6 years ago

biocaml
Language: OCaml

A high-performance, user-friendly OCaml library for bioinformatics applications.

#ocaml-library #scientific-computing #lgpl-licensed
Stars: 125
Forks: 21
Last commit: 6 months ago

dnddata
Language: R

A weekly updated dataset of Dungeons & Dragons characters submitted to character sheet web applications, with over 7,900 entries and standardized fields.

#tsv-data #5e #ogan-dnd
Stars: 122
Forks: 20
Last commit: 3 years ago

RHive
Language: R

An R extension for distributed computing using Apache Hive, enabling HQL queries in R and R functions in Hive.

#cluster-computing #apache-hive #rserve
Stars: 122
Forks: 62
Last commit: 9 years ago

PyDMD
Language: Python

A Python package for Dynamic Mode Decomposition, providing tools to extract spatiotemporal coherent structures from time-varying datasets.

#scientific-computing #koopman-operator #spatiotemporal-analysis
Stars: 122
Forks: 9
Last commit: 1 year ago

Page 7 of 8

Related Tags

#Python (89)
#Data Science (79)
#Data Visualization (74)
#Machine Learning (63)
#Statistics (44)
#R Package (42)
#R (36)
#Scientific Computing (32)
#Pandas (30)
#Data Manipulation (26)
#Sql (26)
#Dataframe (22)