Corpus

8 projects


nlp-datasets

An alphabetical list of free and public domain text datasets for Natural Language Processing (NLP) tasks.

#data-curation #nlp-resources #multilingual

Stars: 6.0k

Forks: 989

Last commit: 3 years ago

karthinkncode's Datasets for Natural Language Processing

A collaboratively maintained, reverse-chronological list of datasets and corpora for natural language processing tasks.

#ai-training-data #nlp-research #research-tools

Stars: 918

Forks: 249

Last commit: 6 years ago

quanteda (R)

An R package for the quantitative analysis of textual data, providing comprehensive tools for natural language processing and text management.

#computational-linguistics #parallel-computing #r-package

Stars: 884

Forks: 191

Last commit: 5 days ago

Seq2seq-Chatbot (Python)

A minimal 200-line implementation of a sequence-to-sequence chatbot using TensorLayer and TensorFlow.

#chat #educational #tensorlayer

Stars: 840

Forks: 309

Last commit: 4 years ago

FakeNewsCorpus

A dataset of millions of news articles labeled by credibility type for training fake news detection algorithms.

#database #data-scraping #text-corpus

Stars: 413

Forks: 98

Last commit: 6 years ago

CORD-19

A corpus of academic papers about COVID-19 and related coronavirus research for text mining and NLP.

#document-embeddings #semantic-scholar #natural-language-processing

Stars: 186

Forks: 23

Last commit: 1 year ago

colibri-core (C++)

A C++ and Python library for efficient extraction and analysis of n-grams, skipgrams, and flexgrams from large corpora.

#c-plus-plus-library #computational-linguistics #pattern-modeling

Stars: 130

Forks: 20

Last commit: 4 months ago

WebNLG (Python)

An enriched dataset for Natural Language Generation research, providing intermediate representations for pipeline tasks like lexicalization and aggregation.

#pipeline-architecture #nlp-research #data-to-text

Stars: 70

Forks: 22

Last commit: 5 years ago

Community-curated · Updated weekly · 100% open source

Found a gem we're missing?

Open-Awesome is built by the community, for the community. Submit a project, suggest an awesome list, or help improve the catalog on GitHub.

