A curated collection of papers and articles from companies sharing real-world data science and machine learning applications in production.
Applied ML is a curated GitHub repository that aggregates papers, articles, and blog posts from major tech companies about their real-world applications of data science and machine learning in production. It helps practitioners understand how ML projects are implemented at scale, covering problem framing, technique selection, scientific rationale, and business outcomes. The collection addresses the gap between academic theory and industrial practice by providing concrete examples.
It is aimed at data scientists, ML engineers, researchers, and technical leaders who want to learn from documented industry experience when designing, implementing, and scaling their own ML systems. It is particularly valuable for those moving models from research to production.
It offers a centralized, organized, and vetted source of practical ML knowledge from top companies, saving practitioners the time of searching for quality case studies. The focus on production details, including failures and ROI, provides insights often missing from academic papers or generic tutorials.
📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.
Aggregates case studies from top companies like Google and Netflix, focusing on practical implementations beyond academic theory, as evidenced by the detailed breakdowns in categories like Recommendation and Feature Stores.
Organized into 30+ categories spanning Data Quality to MLOps, providing diverse examples from e-commerce, social media, finance, and healthcare, as listed in the README's table of contents.
Emphasizes the 'how', 'what', and 'why' of ML in production, including techniques that worked or didn't and ROI metrics, helping users learn from documented successes and failures.
Saves practitioners from scouring the internet by vetting and linking to quality articles, blogs, and papers from leading tech companies, as highlighted in the project's philosophy.
The repository is solely a collection of external links without synthesized summaries or critical commentary, limiting its value as a standalone learning tool beyond curation.
As a static list, it may not be frequently updated, risking outdated links or missing recent advancements, and lacks mechanisms for community-driven validation or updates.
Relies on external sources that can range from deep technical blogs to marketing pieces, so users must independently assess the credibility and depth of each linked resource.
Offers no search functionality, filtering, or discussion forums, making it less suitable for dynamic exploration or peer interaction compared to platforms like GitHub Discussions.