Question 1

How do I set up Sei on a local machine without a cluster?

Accepted Answer

Follow the Anaconda environment setup in the README, and download the large model and resource files from Zenodo before running anything. The prediction scripts are optimized for GPU nodes, so CPU-only runs are possible but slower.

Question 2

Sei vs Enformer: which is better for variant effect prediction?

Accepted Answer

Neither is better across the board. Sei focuses on interpretable sequence classes and chromatin profiles across many cell types, which makes it preferable for regulatory annotation and for variant scoring with biological groupings. Enformer is known for long-range context modeling, so it may do better at capturing distal effects.

Question 3

Can Sei be used for mouse genome analysis?

Accepted Answer

Not directly; the pre-trained model is for human genomes (hg19/hg38). You would need to retrain on mouse data using the provided training scripts, which requires GPU resources and custom dataset preparation.

Question 4

What do the Sei sequence class scores mean biologically?

Accepted Answer

Scores represent regulatory activities like enhancer or promoter strength, derived from chromatin profile projections. The README includes a table mapping classes to groups (e.g., E for Enhancer), but interpreting specific values requires domain knowledge from the manuscript.
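As a rough illustration of the class-to-group lookup described above, the sketch below strips the letter prefix from a class label and maps it to a group name. Only "E for Enhancer" comes from the answer; the "P" entry, the label format (a letter prefix followed by an optional number, such as "E1") and the example scores are assumptions made for illustration. The README table remains the authoritative mapping.

```python
import re

# Partial, illustrative mapping. "E" is taken from the answer above;
# "P" is an assumption. Fill in the rest from the README table.
GROUP_NAMES = {
    "E": "Enhancer",
    "P": "Promoter",
}


def class_group(label):
    """Return the group name for a class label such as 'E1' (assumed format)."""
    match = re.match(r"[A-Za-z]+", label.strip())
    if not match:
        return "Unknown"
    prefix = match.group(0).upper()
    return GROUP_NAMES.get(prefix, f"Unlisted group '{prefix}'")


def strongest_per_group(scores):
    """For each group, keep the score with the largest absolute value."""
    best = {}
    for label, value in scores.items():
        group = class_group(label)
        if group not in best or abs(value) > abs(best[group]):
            best[group] = value
    return best


if __name__ == "__main__":
    # Made-up scores, not real Sei output.
    example = {"E1": 0.2, "E2": -1.5, "P": 0.7}
    for group, value in strongest_per_group(example).items():
        print(f"{group}: {value:+.2f}")
```

Grouping this way only tells you which kind of regulatory activity changed the most; what a given magnitude means still depends on the manuscript.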

Question 5

How much GPU memory is needed to run Sei predictions?

Accepted Answer

The README doesn't specify exact requirements. Since Sei runs deep learning models in PyTorch and the README recommends GPU nodes, expect to need several GB of VRAM, and more for large VCF files or for training.
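For a sense of scale, the input batch alone can be sized by hand. The sketch below assumes one-hot encoded DNA (4 channels), a 4,096 bp input window and float32 values. These numbers are assumptions for illustration, not figures from the README. Model weights and intermediate activations will use far more memory than the inputs, so treat the result as a lower bound only.

```python
def input_batch_bytes(batch_size, seq_len=4096, channels=4, bytes_per_value=4):
    """Bytes needed to hold one batch of one-hot encoded sequences.

    seq_len, channels and bytes_per_value are assumed defaults
    (4,096 bp window, A/C/G/T channels, float32).
    """
    return batch_size * channels * seq_len * bytes_per_value


def to_mib(n_bytes):
    """Convert bytes to mebibytes."""
    return n_bytes / (1024 ** 2)


if __name__ == "__main__":
    for batch_size in (16, 64, 256):
        size = to_mib(input_batch_bytes(batch_size))
        print(f"batch of {batch_size}: {size:.1f} MiB of input")
```

If memory runs out in practice, lowering the batch size is usually the first thing to try.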

Question 6

How do I troubleshoot Sei installation errors with Selene?

Accepted Answer

Make sure a Selene version newer than 0.5.0 is installed, as noted in the training section, and check that your Python version is compatible (3.6+). Other common issues may involve dependency conflicts; refer to Selene's documentation or GitHub issues for help.
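A quick way to confirm the version requirement is to ask the active environment what is installed. The sketch below is a minimal check, assuming Selene is installed under the distribution name `selene-sdk` (an assumption; adjust if your environment uses a different name). It relies on `importlib.metadata`, which needs Python 3.8 or later, and the helper names are hypothetical.

```python
from importlib import metadata


def version_tuple(version):
    """Turn '0.5.3' into (0, 5, 3), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_newer_than(installed, minimum="0.5.0"):
    """True if the installed version is strictly greater than the minimum."""
    return version_tuple(installed) > version_tuple(minimum)


def check_selene(package="selene-sdk"):
    """Report whether the (assumed) Selene distribution meets the requirement."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return f"{package} is not installed in this environment"
    status = "OK" if is_newer_than(installed) else "too old, need >0.5.0"
    return f"{package} {installed}: {status}"


if __name__ == "__main__":
    print(check_selene())
```

Run it inside the same Anaconda environment you use for Sei; a "not installed" result often means the wrong environment is active.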

Question 7

Is Sei suitable for integrating with genome-wide association studies (GWAS)?

Accepted Answer

Yes. Its variant effect scores and pre-computed genome annotations can help prioritize non-coding variants from GWAS, but scoring large datasets is computationally expensive and requires careful planning and cluster resources.
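Once variants have been scored, prioritization can be as simple as ranking them by their strongest predicted effect. The sketch below assumes a tab-separated table with one row per variant and one column per sequence class. The column names, variant IDs and scores are all made up for illustration and do not reflect Sei's actual output format.

```python
import csv
import io

# Made-up example table; not real Sei output.
EXAMPLE_TSV = "\n".join([
    "variant_id\tE1\tP",
    "var_a\t0.10\t-0.05",
    "var_b\t-2.30\t0.40",
    "var_c\t0.20\t1.10",
])


def rank_variants(tsv_text, id_column="variant_id", top_n=None):
    """Rank variants by their largest absolute score across all classes.

    Returns a list of (variant_id, class_name, score) tuples,
    strongest effect first.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    ranked = []
    for row in reader:
        scores = {
            name: float(value)
            for name, value in row.items()
            if name != id_column
        }
        top_class = max(scores, key=lambda name: abs(scores[name]))
        ranked.append((row[id_column], top_class, scores[top_class]))
    ranked.sort(key=lambda item: abs(item[2]), reverse=True)
    return ranked if top_n is None else ranked[:top_n]


if __name__ == "__main__":
    for variant_id, class_name, score in rank_variants(EXAMPLE_TSV):
        print(f"{variant_id}\t{class_name}\t{score:+.2f}")
```

Ranking on absolute value treats gains and losses of activity alike; keep the sign if the direction of effect matters for your analysis.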
