Showing 36 of 47 projects
Bare-bones NumPy implementations of machine learning models and algorithms with a focus on accessibility and transparency.
A curated repository of resources, tutorials, libraries, and tools for learning and applying data science to real-world problems.
A fast and elegant scraping and crawling framework for Go, designed for extracting structured data from websites.
A fast, distributed gradient boosting framework based on decision tree algorithms for ranking, classification, and other ML tasks.
A fast, distributed gradient boosting framework based on decision tree algorithms for ranking, classification, and other machine learning tasks.
A Python library for topic modeling, document indexing, and similarity retrieval with large corpora.
A Python library for topic modeling, document indexing, and similarity retrieval with large text corpora.
A unified Python framework for machine learning with time series, offering scikit-learn-compatible tools for forecasting, classification, clustering, and more.
A Python web mining module with tools for scraping, NLP, machine learning, network analysis, and visualization.
A Python library providing extensions and utilities for data science and machine learning tasks.
A curated collection of Python libraries, tutorials, and tools for data science, from data wrangling to machine learning and visualization.
A comprehensive collection of tutorials, examples, and resources for understanding and solving machine learning and pattern classification problems.
A curated collection of academic papers on data mining and machine learning techniques for fraud detection across various domains.
A fast Support Vector Machine (SVM) library that leverages GPUs and multi-core CPUs for high-performance machine learning.
An open-source Python repository providing around 40 feature selection algorithms for machine learning applications.
A comprehensive Python library for generating and analyzing multi-class confusion matrices with extensive statistical metrics.
A Ruby library implementing the ID3 algorithm for decision tree learning with support for continuous and discrete datasets.
A suite of high-performance command line tools for filtering, summarizing, joining, and manipulating large tabular data files.
A Java/Groovy/JavaFX data visualization tool for ETL, machine learning, and publishing web visualizations.
A Python library with fast C implementations for computing Dynamic Time Warping and other time series distances.
A flexible Python framework for fast network flow data analysis, offering encrypted application identification, statistical feature extraction, and extensibility via plugins.
A high-level web crawling and scraping framework for Elixir, designed for data extraction and processing.
A curated collection of datasets, APIs, and tools for applying artificial intelligence and data mining to video games.
A Ruby gem for web scraping that extracts titles, meta tags, links, images, and structured data from URLs.
A Go web scraping framework that extracts structured data from websites using CSS selectors, including JavaScript-rendered pages.
A Python library providing a comprehensive collection of graph sampling algorithms for NetworkX and NetworKit.
A high-performance data profiler for discovering and validating complex patterns in datasets, enabling data cleaning and quality analysis.
A high-performance data profiler for discovering and validating complex patterns like functional dependencies, inclusion dependencies, and association rules.
A comprehensive suite of Java NLP libraries and tools for text annotation, feature extraction, and language processing tasks.
A Python library for interpretable text classification using the SS3 model, with built-in visualization tools for explainable AI.
A command-line tool to fetch and gather data from software repositories and development platforms using modular backends.
A Python library providing comprehensive metrics for fair and thorough evaluation of recommender systems.
A CPU- and GPU-accelerated matrix library optimized for high-performance data mining operations.