
QANet

MIT license · Python

A TensorFlow implementation of QANet for machine reading comprehension on the SQuAD dataset.

GitHub
985 stars · 298 forks · 0 contributors

What is QANet?

QANet is a TensorFlow implementation of the QANet neural network architecture designed for machine reading comprehension. It solves the task of answering questions based on a provided text passage, specifically trained and evaluated on the Stanford Question Answering Dataset (SQuAD). The model replaces traditional recurrent layers with convolutional and self-attention mechanisms to improve training speed and performance.
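
To make the architecture concrete, here is a minimal TF2/Keras sketch of one QANet-style encoder block: layer norm, depthwise separable convolutions, self-attention, and a feed-forward layer, each wrapped in a residual connection. This is illustrative, not the repository's code (the repo targets TensorFlow 1.x and Python 2.7); the layer counts and dimensions are assumptions, and positional encodings and masking are omitted for brevity.

```python
import tensorflow as tf
from tensorflow.keras import layers

def encoder_block(x, num_convs=4, kernel_size=7, d_model=96, num_heads=1):
    """One QANet-style encoder block over inputs already projected to d_model.
    d_model=96 and num_heads=1 mirror this repo's reduced settings
    (the paper uses 128 and 8)."""
    # Convolution sub-layers: layer norm -> depthwise separable conv -> residual.
    for _ in range(num_convs):
        res = x
        x = layers.LayerNormalization()(x)
        x = layers.SeparableConv1D(d_model, kernel_size, padding="same",
                                   activation="relu")(x)
        x = x + res
    # Self-attention sub-layer.
    res = x
    x = layers.LayerNormalization()(x)
    x = layers.MultiHeadAttention(num_heads=num_heads, key_dim=d_model)(x, x)
    x = x + res
    # Position-wise feed-forward sub-layer.
    res = x
    x = layers.LayerNormalization()(x)
    x = layers.Dense(d_model, activation="relu")(x)
    x = layers.Dense(d_model)(x)
    return x + res

# Example: encode a batch of 300-token passages already embedded to d_model.
inputs = tf.keras.Input(shape=(300, 96))
model = tf.keras.Model(inputs, encoder_block(inputs))
model.summary()  # (None, 300, 96) -> (None, 300, 96)
```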

Target Audience

Machine learning researchers and developers working on natural language processing, particularly those focused on question answering, reading comprehension tasks, or experimenting with hybrid convolutional-attention models.

Value Proposition

Developers choose this implementation for a practical, open-source TensorFlow version of QANet that includes training pipelines, an interactive demo, and documented adaptations for hardware constraints, offering an accessible starting point for SQuAD-based projects.

Overview

A TensorFlow implementation of QANet for machine reading comprehension.

Use Cases

Best For

  • Implementing reading comprehension models for academic research
  • Experimenting with convolutional self-attention architectures in NLP
  • Building question-answering systems on the SQuAD dataset
  • Learning TensorFlow workflows for NLP model training and evaluation
  • Developing interactive demos for NLP models
  • Comparing performance of different neural network designs on QA tasks

Not Ideal For

  • Production deployments requiring state-of-the-art accuracy on SQuAD
  • Teams needing full multi-head attention and original model dimensions without modifications
  • Developers with limited GPU memory (less than 12GB) or seeking plug-and-play models

Pros & Cons

Pros

Efficient Convolutional-Attention Design

Replaces RNNs with depthwise separable convolutions and self-attention layers for faster training and inference, matching the efficiency goals described in the paper.
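
As a back-of-the-envelope illustration of the efficiency claim, the sketch below compares the weight counts of a standard 1-D convolution and its depthwise separable counterpart. The channel and kernel sizes are assumptions chosen to match the repo's hidden size, and biases are ignored.

```python
# Parameter-count comparison for one convolutional layer (illustrative).
d, k = 96, 7                       # channels (repo's hidden size), kernel width
standard = k * d * d               # full 1-D conv:               64,512 weights
separable = k * d + d * d          # depthwise + pointwise conv:   9,888 weights
print(standard, separable, round(standard / separable, 1))  # ~6.5x fewer weights
```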

Integrated Training Pipeline

Includes scripts for data preprocessing, training, testing, and an interactive demo server (the workflow is adapted from R-Net), with each stage selectable via config.py, as sketched below.
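
A hedged sketch of the dispatch pattern this implies: a single config.py entry point that switches on a mode flag. The mode names below follow the workflow described above (preprocess, train, test, demo), but the exact flag and function names in the repo may differ.

```python
# Illustrative pipeline driver; not the repo's actual config.py.
import argparse

def main():
    parser = argparse.ArgumentParser(description="QANet-style pipeline driver")
    parser.add_argument("--mode", choices=["prepro", "train", "test", "demo"],
                        required=True, help="pipeline stage to run")
    args = parser.parse_args()

    # Each branch would call into the corresponding module; these bodies
    # are placeholders, not the repo's API.
    if args.mode == "prepro":
        print("download SQuAD + GloVe, tokenize with spaCy, write records")
    elif args.mode == "train":
        print("build the model graph and run the training loop")
    elif args.mode == "test":
        print("restore the best checkpoint and report EM/F1 on dev")
    else:  # demo
        print("start the interactive demo server")

if __name__ == "__main__":
    main()  # e.g. python config.py --mode train
```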

Transparent Implementation Details

Provides a results table comparing performance with the original paper and explains adaptations like reduced hidden size and single-head attention due to GPU constraints.

Robust Regularization Techniques

Employs dropout, stochastic depth (layer dropout), and an exponential moving average of the weights to stabilize training and prevent overfitting, as detailed in the implementation section.
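
For readers unfamiliar with the latter two techniques, here is a minimal sketch of stochastic depth (randomly dropping whole residual sub-layers during training) and a weight EMA via TensorFlow's tf.train.ExponentialMovingAverage. The survival probability, decay, and variable shapes are assumptions, not the repo's settings.

```python
import tensorflow as tf

def stochastic_depth(x, sublayer_out, survival_prob, training):
    """Residual add that skips the sub-layer with probability 1 - survival_prob.
    Scaling by survival_prob at train time keeps the expected output unchanged."""
    if not training:
        return x + sublayer_out  # every sub-layer is kept at inference
    keep = tf.random.uniform([]) < survival_prob
    return tf.cond(keep, lambda: x + sublayer_out / survival_prob, lambda: x)

# Exponential moving average of a trainable weight; the smoothed shadow
# value is what gets restored for evaluation.
var = tf.Variable(1.0)
ema = tf.train.ExponentialMovingAverage(decay=0.999)
ema.apply([var])                   # create the shadow variable
var.assign(2.0)
ema.apply([var])                   # update the shadow toward the new value
print(ema.average(var).numpy())    # smoothed weight, between 1.0 and 2.0
```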

Cons

Performance Trade-offs from Hardware

Uses single-head attention and a hidden size of 96, instead of the paper's 8 heads and hidden size of 128, due to GPU memory limits; this yields lower EM/F1 scores (70.8/80.1 vs. the paper's 73.6/82.7), as acknowledged in the README.

Incomplete Feature Implementation

The TODO list still includes items such as data augmentation and training with the paper's full hyperparameters, leaving the implementation short of the original QANet architecture's capability.

Outdated Dependencies and Setup

Requires Python >= 2.7 and pinned legacy libraries such as spacy==2.0.9, which can cause compatibility problems on modern systems and complicate setup.

Quick Stats

Stars: 985
Forks: 298
Contributors: 0
Open issues: 21
Last commit: 8 years ago
Created: 2017

Tags

#squad #deep-learning #neural-networks #question-answering #natural-language-processing #cnn #tensorflow #reading-comprehension #squad-dataset #machine-learning #nlp #machine-comprehension

Built With

TensorFlow
spaCy
Python
NumPy
Docker

Included in

Question Answering (767 projects)

Related Projects

BERT

TensorFlow code and pre-trained models for BERT

Stars: 40,007
Forks: 9,716
Last commit: 1 year ago

BiDAF

The Bi-directional Attention Flow (BiDAF) network is a multi-stage hierarchical process that represents context at different levels of granularity and uses a bi-directional attention flow mechanism to achieve a query-aware context representation without early summarization.

Stars: 1,542
Forks: 670
Last commit: 2 years ago