A suite of tools (wham and whamg) for sensitive and accurate structural variant detection and association testing from genomic sequencing data.
wham is a bioinformatics software suite for detecting structural variants (SVs) such as deletions, duplications, and inversions from whole-genome sequencing data. It targets large genomic rearrangements, which standard variant callers struggle to detect, and includes statistical tests for finding SVs associated with diseases. The suite offers two tools: whamg for general SV discovery and wham for sensitive detection and association testing.
Bioinformaticians, computational biologists, and genomics researchers working with whole-genome sequencing data who need to identify structural variants for population studies, disease research, or clinical genomics.
Developers choose wham because it combines two approaches in one suite: whamg for accurate discovery and wham for sensitive detection and association testing. Both tools are pipeline-friendly, output standard VCF, and integrate with platforms like gkno and bcbio-nextgen.
Structural variant detection and association testing
whamg is optimized for high-accuracy detection of structural variants like deletions and inversions, with the README recommending it for most studies over the more error-prone wham.
Generates VCF 4.2 files with comprehensive INFO fields (e.g., A, CW, SVTYPE), ensuring compatibility with downstream annotation and analysis tools in genomics pipelines.
Can process multiple BAM files simultaneously for family or cohort analysis, improving variant consistency and reducing batch effects in population studies.
Outputs graph structures for visualizing complex SVs, which aids in debugging and manual inspection, though it's not recommended for whole-genome runs due to file size.
Users must implement custom filtering strategies based on metrics like total support (A) and SV type weights (CW), as the README notes a filtering script is 'in the pipe' and not yet available.
Output lacks critical fields such as genotype (GT) and depth (DP); the README states that genotyping is still being developed, which limits immediate use in genotyping analyses.
Strictly requires BWA-MEM aligned BAMs with specific flags (e.g., -R for read groups), reducing flexibility and increasing setup complexity for datasets aligned with other tools.
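Because no filtering script ships with the suite, the custom filtering on total support (A) and SV type weights (CW) mentioned above might look something like the following. This is a minimal sketch, not the authors' method: the function names, the thresholds `min_support=5` and `min_weight=0.5`, and the assumption that CW is a comma-separated list of per-type weights are all illustrative and should be checked against the header of your own whamg VCF.

```python
# Sketch only: thresholds and the CW layout are assumptions,
# not values taken from the wham README.

def parse_info(info_field):
    """Turn a VCF INFO string such as 'A=12;CW=0.9,0.1;SVTYPE=DEL' into a dict."""
    info = {}
    for entry in info_field.split(";"):
        if not entry:
            continue
        key, sep, value = entry.partition("=")
        info[key] = value if sep else True  # flag entries carry no value
    return info


def passes_filter(vcf_line, min_support=5, min_weight=0.5):
    """Keep a record when A >= min_support and the largest CW value >= min_weight."""
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < 8:
        return False  # not a complete VCF record
    info = parse_info(fields[7])  # INFO is the 8th VCF column
    try:
        support = int(info["A"])
        weights = [float(w) for w in info["CW"].split(",")]
    except (KeyError, ValueError, AttributeError):
        return False  # missing or malformed A/CW: drop the record
    return support >= min_support and max(weights) >= min_weight


def filter_vcf(lines, **thresholds):
    """Yield header lines unchanged and only the records that pass."""
    for line in lines:
        if line.startswith("#") or passes_filter(line, **thresholds):
            yield line
```

`filter_vcf` accepts any iterable of lines, so it can wrap an open file handle and stream results to standard output or another file. Suitable thresholds depend on sequencing depth and the number of samples called jointly, so treat the defaults as placeholders.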