A method for steering the topic and attributes of GPT-2 language models without fine-tuning, enabling controlled text generation.
PPLM (Plug and Play Language Models) is a research implementation of controlled text generation that steers the output of GPT-2 toward specific topics or attributes. Rather than fine-tuning the base model, it uses small plug-in attribute models (bag-of-words lists or lightweight discriminators) whose gradients perturb the language model's activations during decoding. This enables control over aspects such as topic relevance or sentiment while largely preserving the fluency of the pre-trained language model.
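The following is a heavily simplified PyTorch sketch of the idea, not the repository's run_pplm.py: at each decoding step, the cached key/value history is nudged along the gradient of a bag-of-words attribute loss before the next token is chosen. The topic word list, step size, greedy decoding, and prompt are arbitrary illustrative choices; the KL penalty and distribution fusion used by the actual implementation are omitted; and it assumes a 🤗 Transformers version that still accepts the legacy tuple format for past_key_values.

```python
# Illustrative sketch of PPLM-style steering (not the repository's run_pplm.py).
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()
for p in model.parameters():  # gradients are only needed w.r.t. the cached history
    p.requires_grad_(False)

# Hypothetical topic word list; the repository ships curated bag-of-words files.
bow_ids = [tokenizer.encode(" " + w)[0] for w in ["science", "physics", "chemistry"]]

def generate_steered(prompt, steps=30, stepsize=0.02, num_iterations=3):
    input_ids = tokenizer.encode(prompt, return_tensors="pt")  # needs >= 2 tokens
    past = model(input_ids[:, :-1], use_cache=True).past_key_values
    last, generated = input_ids[:, -1:], input_ids
    for _ in range(steps):
        for _ in range(num_iterations):
            # Turn the cached keys/values into leaf tensors so they can be perturbed.
            past = tuple(tuple(t.detach().requires_grad_(True) for t in kv) for kv in past)
            logits = model(last, past_key_values=past).logits[:, -1, :]
            probs = torch.softmax(logits, dim=-1)
            # Attribute loss: negative log-probability mass on the topic words.
            loss = -torch.log(probs[:, bow_ids].sum())
            loss.backward()
            with torch.no_grad():
                past = tuple(tuple(t - stepsize * t.grad for t in kv) for kv in past)
        # Decode greedily from the perturbed history (the real implementation samples,
        # fuses perturbed and unperturbed distributions, and adds a KL penalty).
        with torch.no_grad():
            out = model(last, past_key_values=past, use_cache=True)
            past = out.past_key_values
            last = out.logits[:, -1, :].argmax(dim=-1, keepdim=True)
        generated = torch.cat([generated, last], dim=-1)
    return tokenizer.decode(generated[0])

print(generate_steered("The issue focused on"))
```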
NLP researchers and machine learning practitioners who need to experiment with controlled text generation, particularly those working with large language models like GPT-2 and seeking methods to steer outputs without retraining.
Developers choose PPLM because it offers a flexible approach to controlled text generation that requires no fine-tuning of the base model, letting them leverage a large pre-trained language model without the computational cost of retraining it. Its distinguishing feature is the ability to plug in one or more attribute models, which can be combined for multiple steering objectives, while the original model's capabilities are preserved.
Plug and Play Language Model implementation. It allows steering the topic and attributes of GPT-2 models.
Uses pre-trained GPT-2 as-is, avoiding costly retraining; the README highlights this minimal-overhead philosophy.
Supports both bag-of-words and discriminator models, allowing combined control over topics and attributes like sentiment.
Integrated into Hugging Face Transformers and accompanied by Colab notebooks, making it accessible for experimentation and for use as a baseline.
Offers adjustable parameters such as stepsize and KL scale, letting users trade off control strength against text quality, per the README examples (see the example commands below).
The README warns that optimal parameters differ from those reported in the paper and require manual tuning, so consistent results are non-trivial to obtain.
Relies on GPT-2, which is outdated compared to newer models like GPT-3 or GPT-4, limiting state-of-the-art capabilities.
Only includes basic discriminators (e.g., for sentiment); creating new attribute models requires custom training and integration.
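For reference, the README documents example invocations for both control modes. The flag values below are adapted from those examples and may differ between repository versions, so treat them as starting points rather than recommended settings.

```bash
# Topic steering with a bag-of-words attribute model
python run_pplm.py -B military --cond_text "The potential" --length 50 \
    --gamma 1.5 --num_iterations 3 --num_samples 10 --stepsize 0.03 \
    --window_length 5 --kl_scale 0.01 --gm_scale 0.99 --sample

# Sentiment steering with the bundled discriminator
python run_pplm.py -D sentiment --class_label 2 --cond_text "My dog died" \
    --length 50 --gamma 1.0 --num_iterations 10 --num_samples 10 \
    --stepsize 0.04 --kl_scale 0.01 --gm_scale 0.95 --sample
```

Larger --stepsize values push generation harder toward the attribute at the cost of fluency, while --kl_scale and --gm_scale pull the perturbed distribution back toward the unmodified model.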
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code.
Toolkit for Machine Learning, Natural Language Processing, and Text Generation, in TensorFlow. This is part of the CASL project: http://casl-project.ai/
A robust Python tool for text-based AI training and generation using GPT-2.