A Python library that extends OpenAI's Whisper to provide accurate word-level timestamps and confidence scores for multilingual speech recognition.
whisper-timestamped is a Python library that enhances OpenAI's Whisper speech recognition model by adding precise word-level timestamps and confidence scores to transcriptions. It solves the problem of accurately aligning transcribed text to the audio timeline, which is essential for creating subtitles, analyzing speech patterns, or editing video content. The library applies Dynamic Time Warping to Whisper's cross-attention weights to infer word boundaries, without requiring additional language-specific models.
Developers and researchers working on applications that require precise audio-text alignment, such as subtitle generation, video editing tools, speech analytics platforms, or any project leveraging Whisper for transcription with timing metadata.
It provides a robust, efficient, and language-agnostic method for word-level timestamping directly integrated with Whisper, avoiding the overhead and limitations of alternative approaches like wav2vec models. The inclusion of confidence scores and optional Voice Activity Detection further enhances transcription reliability.
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
Uses Dynamic Time Warping on Whisper's cross-attention weights to predict precise start and end times for each word; as the README's comparison with alternatives highlights, this avoids the need for separate language-specific alignment models.
Assigns a confidence score to every word and segment, and emits these reliability metrics in the JSON output alongside each word's timestamps, as shown in the README's example output.
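Word entries are nested under each segment in the JSON output. A sketch of pulling out low-confidence words from a result shaped like the README's example (the numeric values here are illustrative, and `low_confidence_words` is a hypothetical helper, not part of the library):

```python
# Excerpt shaped like whisper-timestamped's JSON output (values illustrative).
result = {
    "text": " Bonjour! Est-ce que vous allez bien?",
    "segments": [
        {
            "id": 0,
            "start": 0.29,
            "end": 1.28,
            "text": " Bonjour!",
            "confidence": 0.51,
            "words": [
                {"text": "Bonjour!", "start": 0.29, "end": 0.57, "confidence": 0.26},
            ],
        }
    ],
}

def low_confidence_words(result, threshold=0.5):
    """Flatten segments and return words whose confidence falls below threshold."""
    return [
        (w["text"], w["start"], w["end"], w["confidence"])
        for seg in result["segments"]
        for w in seg["words"]
        if w["confidence"] < threshold
    ]

print(low_confidence_words(result))  # [('Bonjour!', 0.29, 0.57, 0.26)]
```

Flagging such words for human review is a common way to use the scores in subtitle workflows.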
Optional VAD pre-processing removes non-speech segments to reduce hallucinations, with support for multiple methods like silero and auditok, demonstrated in the plot examples.
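The idea behind VAD pre-processing can be illustrated with a crude energy-based sketch; this is only the concept, not the library's silero or auditok backends, which use learned models rather than a fixed threshold (the function name and threshold are made up):

```python
import numpy as np

def crude_vad(samples, sr=16000, frame_ms=30, threshold=0.02):
    """Conceptual VAD: return (start, end) sample ranges whose frames
    exceed an RMS-energy threshold; everything else is treated as silence."""
    frame = int(sr * frame_ms / 1000)
    ranges, active_start = [], None
    for i in range(0, len(samples) - frame + 1, frame):
        rms = np.sqrt(np.mean(samples[i:i + frame] ** 2))
        if rms >= threshold and active_start is None:
            active_start = i
        elif rms < threshold and active_start is not None:
            ranges.append((active_start, i))
            active_start = None
    if active_start is not None:
        ranges.append((active_start, len(samples)))
    return ranges

# 1 s of silence, 1 s of noise standing in for speech, 1 s of silence.
rng = np.random.default_rng(0)
sr = 16000
sig = np.concatenate([np.zeros(sr), 0.1 * rng.standard_normal(sr), np.zeros(sr)])
print(crude_vad(sig, sr))  # one range near (16000, 32000)
```

Feeding only the detected ranges to the recognizer is what suppresses hallucinations on long silent stretches.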
Leverages Whisper's native multilingual capabilities without additional per-language models, ensuring broad support while maintaining memory efficiency for long audio files.
Supports JSON, CSV, SRT, VTT, and TSV with integrated word timestamps, making it easy to generate subtitles and other aligned outputs for diverse applications.
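The per-word timestamps map directly onto subtitle cues. A sketch of rendering one SRT cue per word (`srt_time` and `words_to_srt` are hypothetical helpers for illustration; the library can write SRT itself):

```python
def srt_time(t):
    """Format seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = round(t * 1000)
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def words_to_srt(words):
    """Render one numbered SRT cue per word dict ({'text', 'start', 'end'})."""
    cues = []
    for i, w in enumerate(words, 1):
        cues.append(f"{i}\n{srt_time(w['start'])} --> {srt_time(w['end'])}\n{w['text']}\n")
    return "\n".join(cues)

print(words_to_srt([{"text": "Bonjour!", "start": 0.29, "end": 0.57}]))
```

In practice one would group words into phrase-length cues rather than one cue per word, but the timestamp plumbing is the same.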
The README includes a disclaimer that the approach may 'significantly impact performance': the DTW post-processing adds computational cost on top of base Whisper inference, reducing transcription speed.
Requires managing several optional dependencies (e.g., matplotlib for plotting, torchaudio and onnxruntime for VAD), and CPU-only installation involves extra steps, which increases setup complexity.
Default decoding options favor efficiency over accuracy by disabling beam search and temperature sampling, so users must manually enable accurate mode for best transcription quality.
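As a sketch, the accurate preset roughly corresponds to passing beam-search and temperature-fallback options through to Whisper's decoder (the parameter names below are openai-whisper's `transcribe()` options; `ACCURATE` is just an illustrative name, and the exact preset should be checked against the CLI's `--accurate` flag):

```python
# Decoding options approximating the "accurate" preset: beam search
# plus a ladder of fallback temperatures instead of greedy decoding.
ACCURATE = {
    "beam_size": 5,
    "best_of": 5,
    "temperature": tuple(t / 10 for t in range(0, 11, 2)),  # 0.0 .. 1.0 in 0.2 steps
}

# Would be spliced into the call made earlier, e.g.:
# result = whisper.transcribe(model, audio, **ACCURATE)
print(ACCURATE["temperature"])  # (0.0, 0.2, 0.4, 0.6, 0.8, 1.0)
```

The trade-off is the usual one: beam search and retries at higher temperatures cost extra decoding passes in exchange for fewer transcription errors.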
Described as 'intended for experimental purposes,' which may imply less stability or more frequent breaking changes than mature speech recognition tools.