An alphabetical list of free and public domain text datasets for Natural Language Processing (NLP) tasks.
A collaboratively maintained, reverse-chronological list of datasets and corpora for natural language processing tasks.
An R package for the quantitative analysis of textual data, providing comprehensive tools for natural language processing and text management.
A minimal 200-line implementation of a sequence-to-sequence chatbot using TensorLayer and TensorFlow.
A dataset of millions of news articles labeled by credibility type for training fake news detection algorithms.
A corpus of academic papers about COVID-19 and related coronavirus research for text mining and NLP.
A C++ and Python library for efficient extraction and analysis of n-grams, skipgrams, and flexgrams from large corpora.
An enriched dataset for Natural Language Generation research, providing intermediate representations for pipeline tasks like lexicalization and aggregation.