A research-oriented, transparent, and extensible variant interpretation pipeline implementing ACMG/AMP 2015 & 2023 guidelines.
Download the latest Windows executable from Google Drive: a ready-to-use standalone .exe, no Python installation required.
ACMG Assistant is a variant classification tool implementing the ACMG/AMP 2015 and 2023 guidelines. It combines automatic data retrieval from public APIs with structured interactive evidence collection to produce transparent, reproducible classifications.
Automatic evaluation:
- Population frequency criteria (BA1, BS1, PM2) via gnomAD, ExAC, TOPMed
- Computational/in-silico criteria (PP3, BP4) via multi-source predictor aggregation
- Functional domain criteria (PM1) via CancerHotspots and UniProt
- Phenotype matching (PP4, BP5) via HPO ontology similarity
Interactive evaluation:
- Literature-based criteria (PS3/BS3, PS4, PP1/BS4, PS1/PM5, PP5/BP6) through structured prompts
| Use Case | Suitability |
|---|---|
| Educational use | ✅ Understanding ACMG classification logic |
| Research pipelines | ✅ Reproducible, transparent variant interpretation |
| Pre-screening variants | ✅ Workflow augmentation before expert review |
| Clinical decision-making | ❌ Not intended; requires expert validation |
- ✅ Thresholds, weights, scoring formulas → Defined locally
- ✅ ACMG evidence combination rules → Defined locally
- ❌ Predictor scores (REVEL, CADD, etc.) → Must be fetched from APIs
- ❌ Population allele frequencies → Must be fetched from APIs
- ❌ Functional domains, hotspots → Must be fetched from APIs
- ❌ Gene-specific rules → Must be fetched from APIs or entered by user
All factual variant-level data must come from one of:
- External APIs (gnomAD, ClinVar, UniProt, CancerHotspots, myvariant.info, etc.)
- User input (interactive evidence collection for literature-based criteria)
- Validated cache (previously fetched and validated API responses)
The local codebase NEVER fabricates biological values; it only interprets them.
Criteria that require literature review are handled through structured interactive prompts:
| Criterion | What User Provides |
|---|---|
| PS3 / BS3 | Functional study details, assay type, quality level |
| PS4 | Case-control counts, odds ratio data |
| PP1 / BS4 | Segregation data, LOD scores, family structure |
| PS1 / PM5 | Prior variant at same codon, ClinVar status |
| PP5 / BP6 | External lab assertions, submission quality |
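
The case-control counts entered for PS4 are typically summarized as an odds ratio with a confidence interval. The following is a minimal sketch of that arithmetic, not the tool's own implementation: the function name `odds_ratio_ci`, the Woolf (log-scale) interval, and the 0.5 continuity correction for empty cells are all assumptions made for illustration.

```python
import math


def odds_ratio_ci(case_carriers, case_total, control_carriers, control_total, z=1.96):
    """Return (odds_ratio, lower, upper) from case-control carrier counts.

    Hypothetical helper: uses a Woolf log-scale interval and adds 0.5 to
    every cell when any cell is zero (Haldane-Anscombe correction).
    """
    a = case_carriers
    b = case_total - case_carriers
    c = control_carriers
    d = control_total - control_carriers
    if min(a, b, c, d) < 0:
        raise ValueError("counts must be non-negative and carriers <= total")
    if 0 in (a, b, c, d):
        # Avoid division by zero; this correction is an assumption of the sketch.
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    odds_ratio = (a * d) / (b * c)
    standard_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * standard_error)
    upper = math.exp(math.log(odds_ratio) + z * standard_error)
    return odds_ratio, lower, upper


# 10 of 100 cases and 2 of 200 controls carry the variant.
ratio, low, high = odds_ratio_ci(10, 100, 2, 200)
print(f"OR = {ratio:.1f} (95% CI {low:.2f} to {high:.2f})")
```

Whether a given odds ratio is enough to apply PS4 is a judgment the sketch deliberately leaves out.
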
The PhenotypeMatcher provides algorithmic phenotype-to-gene matching for PP4/BP5 evidence:
- Uses HPO (Human Phenotype Ontology) term similarity
- Computes reproducible Jaccard and Information Content (IC) based scores
- Serves as a screening aid, not a replacement for clinical phenotyping by domain experts
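
To make the two score families concrete, here is a small sketch of a plain Jaccard score and one possible IC-weighted variant over sets of HPO term IDs. It is an illustration under assumptions, not the PhenotypeMatcher code: the function names are hypothetical, the term frequencies are toy values, and ancestor expansion of the term sets is assumed to have happened already.

```python
import math


def jaccard(patient_terms, gene_terms):
    """Shared terms divided by all terms seen in either set."""
    patient, gene = set(patient_terms), set(gene_terms)
    union = patient | gene
    if not union:
        return 0.0
    return len(patient & gene) / len(union)


def information_content(term_frequency):
    """IC(t) = -ln(p(t)), where p(t) is the annotation frequency of term t."""
    return {t: -math.log(p) for t, p in term_frequency.items() if 0 < p <= 1}


def ic_weighted_jaccard(patient_terms, gene_terms, ic):
    """Jaccard where each term counts by its IC, so rare terms weigh more."""
    patient, gene = set(patient_terms), set(gene_terms)
    union_weight = sum(ic.get(t, 0.0) for t in patient | gene)
    if union_weight == 0:
        return 0.0
    return sum(ic.get(t, 0.0) for t in patient & gene) / union_weight


# Toy frequencies (made up for the example, not taken from HPO annotations).
ic = information_content({
    "HP:0001250": 0.01,   # Seizure
    "HP:0001263": 0.50,   # Global developmental delay
    "HP:0000252": 0.10,   # Microcephaly
})
patient = {"HP:0001250", "HP:0001263"}
gene = {"HP:0001250", "HP:0000252"}
print(jaccard(patient, gene), ic_weighted_jaccard(patient, gene, ic))
```

Both functions are deterministic for a fixed ontology snapshot and frequency table, which is what makes the scores reproducible.
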
```
┌─────────────────────────────────────────────────────────────────────────────┐
│                              CLI / Entry Point                              │
│                             (acmg_assistant.py)                             │
└─────────────────────────────────────────────────────────────────────────────┘
                                       │
                                       ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                              EvidenceEvaluator                              │
│                       (Central orchestration engine)                        │
│                                                                             │
│  ┌───────────────────────────────────────────────────────────────────────┐  │
│  │                        _fetch_external_data()                         │  │
│  │           "Fetch once, interpret many": pre-loads all data            │  │
│  └───────────────────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────────────┘
         │                   │                   │                   │
         ▼                   ▼                   ▼                   ▼
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│  PredictorAPI   │ │  PopulationAPI  │ │  GeneSpecific   │ │   Interactive   │
│     Client      │ │     Client      │ │      Rules      │ │    Evidence     │
│                 │ │                 │ │                 │ │    Collector    │
│ • myvariant.info│ │ • gnomAD GraphQL│ │ • CancerHotspots│ │                 │
│ • AlphaMissense │ │ • ExAC REST     │ │ • UniProt       │ │ • PS3/BS3       │
│ • CADD API      │ │ • TOPMed        │ │ • ClinGen       │ │ • PS4           │
│ • VEP           │ │                 │ │                 │ │ • PP1/BS4       │
│                 │ │                 │ │                 │ │ • PS1/PM5       │
│ Multi-source    │ │ Multi-source    │ │ PM1 external    │ │ • PP5/BP6       │
│ priority-based  │ │ gnomAD-first    │ │ hotspots/domains│ │                 │
└────────┬────────┘ └────────┬────────┘ └────────┬────────┘ └────────┬────────┘
         │                   │                   │                   │
         └───────────────────┴─────────┬─────────┴───────────────────┘
                                       │
                                       ▼
                    ┌─────────────────────────────────────┐
                    │             ResultCache             │
                    │      (Strict validation layer)      │
                    │                                     │
                    │  • Validates all entries before use │
                    │  • Rejects invalid/corrupted data   │
                    │  • TTL-based expiration             │
                    │  • Thread-safe file storage         │
                    │                                     │
                    │ "Cache is optimization, NOT truth"  │
                    └─────────────────────────────────────┘
                                       │
                                       ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                          Pure Interpretation Layer                          │
│                                                                             │
│   ┌───────────────────┐    ┌───────────────────┐    ┌───────────────────┐   │
│   │   MissenseEval    │    │  PopulationAnal   │    │ PhenotypeMatcher  │   │
│   │                   │    │                   │    │                   │   │
│   │  Composite score  │    │   AF thresholds   │    │  HPO similarity   │   │
│   │    → PP3 / BP4    │    │   → BA1/BS1/PM2   │    │    → PP4 / BP5    │   │
│   └───────────────────┘    └───────────────────┘    └───────────────────┘   │
│                                                                             │
│     "Data missing → safely degrade evidence strength, never fabricate"      │
└─────────────────────────────────────────────────────────────────────────────┘
                                       │
                                       ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                               ACMGClassifier                                │
│                                                                             │
│    Merges all evidence (automatic + interactive) → Final classification     │
│        Pathogenic / Likely Pathogenic / VUS / Likely Benign / Benign        │
└─────────────────────────────────────────────────────────────────────────────┘
```
- Fetch Once, Interpret Many: External data is fetched at the start; evaluators are pure interpreters
- Cache is Optimization, Not Truth: Invalid cache entries are rejected, not trusted
- Graceful Degradation: Missing data → reduced evidence strength, never fabricated values
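
The principles above can be sketched with a pure population-frequency interpreter that receives already-fetched data and returns no criterion at all when that data is missing. This is an illustrative sketch, not the project's PopulationAnalyzer: the function name is hypothetical, and while the 5% BA1 cutoff follows the ACMG/AMP 2015 guideline, the BS1 and PM2 cutoffs are placeholder assumptions that in practice depend on the gene and disorder.

```python
from typing import Optional

BA1_AF = 0.05     # ACMG/AMP 2015 stand-alone benign cutoff (allele frequency > 5%)
BS1_AF = 0.01     # hypothetical placeholder; really disorder-specific
PM2_AF = 0.0001   # hypothetical placeholder; really disorder-specific


def evaluate_population(allele_frequency: Optional[float]) -> Optional[str]:
    """Interpret a pre-fetched allele frequency; never invent one.

    None means "the fetch produced no data" and yields no criterion.
    0.0 means "looked up and not observed", which is real data.
    """
    if allele_frequency is None:
        return None  # graceful degradation: no evidence, no forced call
    if not 0.0 <= allele_frequency <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    if allele_frequency > BA1_AF:
        return "BA1"
    if allele_frequency > BS1_AF:
        return "BS1"
    if allele_frequency < PM2_AF:
        return "PM2"
    return None


for af in (None, 0.2, 0.02, 0.0, 0.005):
    print(af, "->", evaluate_population(af))
```

Keeping "no data" (`None`) distinct from "observed frequency of zero" is the point of the design: only the latter can support PM2.
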
ACMG Assistant uses a strict, validated caching layer to minimize redundant API calls while ensuring data integrity.
```python
CacheKey(
    category='predictor' | 'population',
    source='dbNSFP' | 'gnomAD_GraphQL' | ...,
    variant_id='GRCh38:17-7674234-G-A',
    version='v4.0'  # optional
)
```

| Data Type | Default TTL | Rationale |
|---|---|---|
| Predictor scores | 7 days | Scores rarely change |
| Population data | 30 days | gnomAD updates infrequently |
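
As a rough illustration of how a key like the one above and the default TTLs could fit together, the sketch below models the key as a frozen dataclass and checks expiry per category. The field names and TTL values come from this section; the hashing scheme for file names and the `is_expired` helper are assumptions, not the actual ResultCache internals.

```python
import hashlib
import time
from dataclasses import dataclass
from typing import Optional

# Default TTLs from the table above, in seconds.
TTL_SECONDS = {
    "predictor": 7 * 24 * 3600,
    "population": 30 * 24 * 3600,
}


@dataclass(frozen=True)
class CacheKey:
    category: str
    source: str
    variant_id: str
    version: Optional[str] = None

    def filename(self) -> str:
        """Stable file name derived from all key fields (hypothetical scheme)."""
        raw = "|".join([self.category, self.source, self.variant_id, self.version or ""])
        return hashlib.sha256(raw.encode("utf-8")).hexdigest() + ".json"


def is_expired(key: CacheKey, stored_at: float, now: Optional[float] = None) -> bool:
    """True when the entry is older than the TTL of its category."""
    now = time.time() if now is None else now
    return now - stored_at > TTL_SECONDS[key.category]


key = CacheKey("predictor", "dbNSFP", "GRCh38:17-7674234-G-A", "v4.0")
print(key.filename())
print(is_expired(key, stored_at=time.time() - 8 * 24 * 3600))
```
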
All cached entries are validated before use:
- Predictor scores: Must be within valid ranges (REVEL ∈ [0,1], CADD ∈ [0,60], etc.)
- Population stats: AF ∈ [0,1], AC ≤ AN, no negative values
- Invalid entries: Automatically invalidated and re-fetched
- JSON decode errors → Cache miss (file removed)
- Hash mismatch → Cache miss (file removed)
- Expired entries → Cache miss (file removed)
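
The validation and cache-miss rules listed above might be combined along the following lines. This is a sketch under assumptions rather than the real `cache.py`: the on-disk layout (`payload` plus a `sha256` digest), the function names, and the choice to pass the validator in as a callable are all hypothetical. Expiry is left out to keep the example short.

```python
import hashlib
import json
import os

# Ranges named in this section; other predictors would need their own entries.
VALID_RANGES = {"REVEL": (0.0, 1.0), "CADD": (0.0, 60.0)}


def is_valid_predictor(name, score):
    bounds = VALID_RANGES.get(name)
    if bounds is None or isinstance(score, bool) or not isinstance(score, (int, float)):
        return False
    low, high = bounds
    return low <= score <= high


def is_valid_population(af, ac, an):
    """AF in [0, 1], AC <= AN, and no negative counts."""
    return 0.0 <= af <= 1.0 and 0 <= ac <= an


def load_entry(path, validate):
    """Return the cached payload, or None (a cache miss) if it cannot be trusted."""
    try:
        with open(path, encoding="utf-8") as handle:
            entry = json.load(handle)
        payload, expected = entry["payload"], entry["sha256"]
    except FileNotFoundError:
        return None
    except (json.JSONDecodeError, KeyError, TypeError):
        os.remove(path)  # unreadable or malformed entry: drop the file
        return None
    digest = hashlib.sha256(json.dumps(payload, sort_keys=True).encode("utf-8")).hexdigest()
    if digest != expected or not validate(payload):
        os.remove(path)  # hash mismatch or out-of-range values: drop the file
        return None
    return payload
```

A caller that gets `None` back simply re-fetches from the API, which is how an invalid entry ends up being replaced instead of trusted.
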
ResultCache uses threading.RLock for safe concurrent access.
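
A reentrant lock matters when one locked method calls another locked method on the same object. The sketch below shows that pattern with a deliberately tiny in-memory cache; the class and method names are hypothetical, and the real ResultCache stores files rather than a dictionary.

```python
import threading


class TinyCache:
    """In-memory stand-in used only to illustrate RLock reentrancy."""

    def __init__(self):
        self._lock = threading.RLock()
        self._entries = {}

    def get(self, key):
        with self._lock:
            return self._entries.get(key)

    def set(self, key, value):
        with self._lock:
            self._entries[key] = value

    def get_or_fetch(self, key, fetch):
        # The same thread re-acquires the lock inside get() and set().
        # With RLock this is fine; a plain Lock would deadlock here.
        with self._lock:
            value = self.get(key)
            if value is None:
                value = fetch()
                self.set(key, value)
            return value


cache = TinyCache()
print(cache.get_or_fetch("REVEL:17-7674234-G-A", lambda: 0.93))
```

Holding the lock across the whole check-then-fetch sequence also means concurrent callers asking for the same key trigger a single fetch, at the cost of serializing fetches.
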
Default: src/api_cache/ (organized by category/source)
| Evidence | Source | Type | Details |
|---|---|---|---|
| BA1 / BS1 / PM2 | Population data | Automatic | gnomAD, ExAC allele frequencies |
| PP3 / BP4 | In-silico predictors | Automatic | Multi-source: REVEL, CADD, AlphaMissense, SIFT, PolyPhen2, etc. |
| PM1 | Functional domains | Automatic | CancerHotspots API + UniProt domains |
| PP4 / BP5 | Phenotype matching | Automatic | HPO ontology similarity scoring |
| PS3 / BS3 | Functional studies | Interactive | User enters assay details, quality |
| PS4 | Case-control data | Interactive | User enters case/control counts |
| PP1 / BS4 | Segregation | Interactive | User enters pedigree, LOD scores |
| PS1 / PM5 | Prior variants | Interactive | User enters codon-based prior data |
| PP5 / BP6 | External assertions | Interactive | User enters lab submissions |
| PVS1 | Null variants | Automatic | LOF in haploinsufficient genes |
| PS2 / PM6 | De novo | Interactive | User confirms parental testing |
| PM3 / BP2 | In-trans/cis | Interactive | User enters phase data |
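
Once the criteria in the table have been evaluated, they are merged by strength into one of the five classes. The sketch below encodes the combining rules of the ACMG/AMP 2015 guideline (Richards et al.) as a standalone function. It is not the project's ACMGClassifier: the function name and strength labels are hypothetical, the 2023 mode is not covered, and contradictory evidence is resolved to VUS as the 2015 text prescribes, without any gene-specific overrides.

```python
def classify_2015(pathogenic, benign):
    """Combine met criteria, given as lists of strength labels.

    pathogenic labels: 'very_strong', 'strong', 'moderate', 'supporting'
    benign labels:     'stand_alone', 'strong', 'supporting'
    """
    pvs = pathogenic.count("very_strong")
    ps = pathogenic.count("strong")
    pm = pathogenic.count("moderate")
    pp = pathogenic.count("supporting")
    ba = benign.count("stand_alone")
    bs = benign.count("strong")
    bp = benign.count("supporting")

    is_pathogenic = (
        (pvs >= 1 and (ps >= 1 or pm >= 2 or (pm >= 1 and pp >= 1) or pp >= 2))
        or ps >= 2
        or (ps >= 1 and (pm >= 3 or (pm >= 2 and pp >= 2) or (pm >= 1 and pp >= 4)))
    )
    is_likely_pathogenic = (
        (pvs >= 1 and pm >= 1)
        or (ps >= 1 and pm >= 1)
        or (ps >= 1 and pp >= 2)
        or pm >= 3
        or (pm >= 2 and pp >= 2)
        or (pm >= 1 and pp >= 4)
    )
    is_benign = ba >= 1 or bs >= 2
    is_likely_benign = (bs >= 1 and bp >= 1) or bp >= 2

    # Contradictory evidence falls back to uncertain significance.
    if (is_pathogenic or is_likely_pathogenic) and (is_benign or is_likely_benign):
        return "VUS"
    if is_pathogenic:
        return "Pathogenic"
    if is_likely_pathogenic:
        return "Likely Pathogenic"
    if is_benign:
        return "Benign"
    if is_likely_benign:
        return "Likely Benign"
    return "VUS"


print(classify_2015(["very_strong", "moderate"], []))
print(classify_2015([], ["stand_alone"]))
```
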
- NOT a Clinical Decision-Making System
  - All evidence must be reviewed by a qualified clinical geneticist
  - Classifications are suggestions, not diagnoses
- API Dependency
  - Network failures or API downtime may limit automatic criteria
  - Missing data results in reduced evidence, never forced calls
- Interactive Evidence Accuracy
  - PS3, PS4, PP1, etc. rely on truthful user input
- Missense Composite Score
  - The PP3/BP4 composite score is a research approximation
  - It is NOT the validated VAMPP score, only inspired by similar methodology
  - Clinical labs should use their own validated thresholds
- Phenotype Matching
  - HPO similarity is approximate and algorithmic
  - NOT a replacement for clinical phenotyping by experts
  - Low sensitivity for rare/novel phenotypes
- Cache Validity
  - Cache is an optimization layer, not a source of truth
  - Stale cache may return outdated scores if TTL not managed

This tool does NOT:
- ❌ Replace expert clinical judgment
- ❌ Guarantee 100% accuracy
- ❌ Provide legally binding classifications
- ❌ Automatically retrieve all possible data sources
- ❌ Handle structural variants (current focus: SNVs, indels)
Latest Release: December 2025
| Feature | Description |
|---|---|
| Multi-source predictor system | Fetches from myvariant.info, AlphaMissense API, CADD API with source priority |
| Multi-source population AF | gnomAD GraphQL (primary), ExAC, TOPMed fallbacks |
| Strict validated caching | ResultCache with validation, TTL, thread safety |
| PM1 via external hotspots | CancerHotspots API + UniProt functional domains |
| Phenotype matcher overhaul | HPO-based similarity with IC weighting |
| Interactive evidence subsystem | Structured prompts for PS3/BS3, PS4, PP1/BS4, PS1/PM5, PP5/BP6 |
| 198 tests passing | Comprehensive test coverage |
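
The "source priority" and "fallback" behaviour named in the table can be pictured as trying sources in a fixed order and keeping the first usable answer. The sketch below uses stub fetchers instead of real network calls; `first_available` and the stubs are hypothetical names, and treating any exception as "source unavailable" is a simplifying assumption.

```python
def first_available(sources):
    """Try (name, fetch) pairs in priority order; return (value, name).

    Returns (None, None) when every source fails or has no data, so the
    caller can degrade evidence instead of inventing a value.
    """
    for name, fetch in sources:
        try:
            value = fetch()
        except Exception:
            continue  # source unreachable: fall through to the next one
        if value is not None:
            return value, name
    return None, None


def gnomad_stub():
    raise ConnectionError("simulated outage")


def exac_stub():
    return 0.001  # made-up allele frequency for the example


def topmed_stub():
    return 0.002  # never reached, because ExAC already answered


value, source = first_available([
    ("gnomAD", gnomad_stub),
    ("ExAC", exac_stub),
    ("TOPMed", topmed_stub),
])
print(value, source)
```

Returning the source name alongside the value keeps the provenance of each number visible in the final report.
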
- v3.5.0: Gene-specific rules, enhanced API integration
- v3.3.0: Statistical framework (Fisher's exact, LOD scoring)
- v3.0.0: Initial ACMG 2023 support
- v2.x: Core ACMG 2015 implementation
```bash
# Clone the repository
git clone https://github.com/Bilmem2/ACMG_Assistant
cd ACMG_Assistant

# Create virtual environment (optional but recommended)
python -m venv venv
venv\Scripts\activate        # Windows
# source venv/bin/activate   # Linux/Mac

# Install dependencies
pip install -r requirements.txt

# Run from src directory
cd src
python acmg_assistant.py
```

```bash
# Standard mode (ACMG 2015)
python acmg_assistant.py

# ACMG 2023 mode
python acmg_assistant.py --acmg-2023

# Test mode (mock data, no API calls)
python acmg_assistant.py --test

# Show version
python acmg_assistant.py --version
```

The tool requires internet access for:
- gnomAD (population frequencies)
- myvariant.info (predictor scores via dbNSFP)
- CancerHotspots (PM1 hotspot detection)
- UniProt (functional domains)
- ClinVar (external assertions)
Offline mode uses cached data only.
Tested on Ubuntu 22.04 LTS:
```bash
git clone https://github.com/Bilmem2/ACMG_Assistant
cd ACMG_Assistant
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
cd src && python3 acmg_assistant.py
```

ACMG Assistant is distributed as a standalone Windows .exe for users without Python:
- Download and extract the ZIP file
- Run ACMG_Assistant.exe
- Follow the interactive prompts
- First startup may be slow
- Internet required for API calls
- Cache persists between runs in the same directory
- No Python installation needed
```bash
# Install build dependencies
pip install -r requirements_build.txt

# Build executable
python build_executable_new.py

# Output: dist/ACMG_Assistant.exe
```

ACMG Assistant is available as a Docker container for cross-platform deployment:

```bash
# Pull from GitHub Container Registry
docker pull ghcr.io/bilmem2/acmg_assistant:latest

# Or from Quay.io
docker pull quay.io/bilmem2/acmg_assistant:latest

# Run interactively
docker run -it --rm ghcr.io/bilmem2/acmg_assistant:latest

# With persistent cache
docker run -it --rm -v ./cache:/app/cache ghcr.io/bilmem2/acmg_assistant:latest
```

See Dockerfile for build details.
```
ACMG_Assistant/
├── src/
│   ├── acmg_assistant.py                     # Main CLI entry point
│   ├── config/
│   │   ├── __init__.py
│   │   ├── constants.py                      # Thresholds, API settings
│   │   ├── predictors.py                     # PredictorScore, PopulationStats dataclasses
│   │   └── version.py                        # Version metadata
│   ├── core/
│   │   ├── __init__.py
│   │   ├── acmg_classifier.py                # Final classification engine
│   │   ├── evidence_evaluator.py             # Central orchestration
│   │   ├── variant_data.py                   # VariantData dataclass
│   │   ├── missense_evaluator.py             # PP3/BP4 composite scoring
│   │   ├── population_analyzer.py            # BA1/BS1/PM2 evaluation
│   │   ├── gene_specific_rules.py            # Gene-specific PM1, thresholds
│   │   ├── phenotype_matcher.py              # PP4/BP5 HPO matching
│   │   └── functional_studies_evaluator.py   # PS3/BS3 evaluation
│   └── utils/
│       ├── __init__.py
│       ├── api_client.py                     # ClinVar, Ensembl clients
│       ├── predictor_api_client.py           # Multi-source predictor/population
│       ├── cache.py                          # ResultCache with validation
│       ├── input_handler.py                  # Interactive evidence collection
│       ├── report_generator.py               # Report output
│       └── validators.py                     # Input validation
├── tests/
│   ├── test_acmg_classifier.py               # 20 tests
│   ├── test_gene_specific_pm1.py             # 31 tests
│   ├── test_interactive_evidence.py          # 56 tests
│   ├── test_predictor_population_api.py      # 40 tests
│   └── test_cache_and_validation.py          # 51 tests
├── data/
│   ├── gene_rules/                           # Gene-specific configuration
│   └── domain_annotations/                   # Functional domain data
├── requirements.txt                          # Runtime dependencies
├── requirements_build.txt                    # Build dependencies
├── pyproject.toml                            # Project configuration
└── README.md                                 # This file
```
If you use this tool in your research, please cite:
ACMG Variant Classification Assistant
https://doi.org/10.5281/zenodo.15831866
This tool uses a VAMPP-score-inspired metascore approach. If you use this methodology, please also cite:
Eylul Aydin, Berk Ergun, et al. "A New Era in Missense Variant Analysis: Statistical Insights and the Introduction of VAMPP-Score for Pathogenicity Assessment." bioRxiv (2024). DOI: 10.1101/2024.07.11.602867
Richards S, et al. "Standards and guidelines for the interpretation of sequence variants." Genet Med. 2015;17(5):405-424.
Plon SE, et al. "Sequence variant classification and reporting: recommendations for improving the interpretation of cancer susceptibility genetic test results." Hum Mutat. 2008.
- Author: Can Sevilmiş
- Email: cansevilmiss@gmail.com
- LinkedIn: cansevilmiss
- GitHub: Bilmem2/ACMG_Assistant
