A Python client for the VarSome API — annotate genetic variants against gnomAD, ClinVar, and many other databases via a simple command-line interface or a Python library.
This is a new major version (1.x) that requires Python ≥ 3.11.
If you need compatibility with Python 3.10 or earlier, use the previous release:
- varsome_api_run: look up one or more variants, genes, or CNVs and receive the full JSON annotation response.
- varsome_api_annotate_vcf: read a VCF file, annotate every variant via the VarSome API, and write an annotated output VCF.
- VarSomeAPIClient: a Python class for integrating variant, gene, and CNV annotation directly into your own code (synchronous and async interfaces).
- VCFAnnotator: a customisable VCF annotation pipeline class for use in your own Python projects.
If you only want to run varsome_api_run or varsome_api_annotate_vcf,
the Docker image is the recommended approach — it ships with all system
dependencies pre-installed and requires no local build toolchain:

    docker pull ghcr.io/saphetor/varsome-api-client-python:1

See the Docker Guide for full usage instructions.
If you only need VarSomeAPIClient for variant lookup in your own code and
do not require VCF reading/writing, install without extras:

    pip install git+https://github.com/saphetor/varsome-api-client-python.git

Or with Poetry:

    poetry add git+https://github.com/saphetor/varsome-api-client-python.git

varsome_api_annotate_vcf and VCFAnnotator depend on pysam, which requires several C build libraries. Install the vcf extra to include pysam:

    pip install "varsome_api[vcf] @ git+https://github.com/saphetor/varsome-api-client-python.git"

Or with Poetry:

    poetry add "git+https://github.com/saphetor/varsome-api-client-python.git[vcf]"

Build requirements for pysam. The following system libraries must be present before pip can compile pysam:

| Library | Ubuntu/Debian | macOS (Homebrew) |
|---|---|---|
| zlib | zlib1g-dev | zlib |
| bzip2 | libbz2-dev | bzip2 |
| lzma | liblzma-dev | xz |
| libcurl | libcurl4-openssl-dev | curl |
| OpenSSL | libssl-dev | openssl |
| libdeflate | libdeflate-dev | libdeflate |
| build tools | build-essential | Xcode CLT |

If installing these is inconvenient, use the Docker image instead; it handles all of this for you.
After installation, the varsome_api_run and varsome_api_annotate_vcf commands
will be available in your PATH.
Requires Python ≥ 3.11, < 3.15.
| Server | URL | Notes |
|---|---|---|
| Live | https://api.varsome.com | Default |
| Stable | https://stable-api.varsome.com | Kept frozen according to schedule |
| Staging | https://staging-api.varsome.com | Test environment, throttled |
For more information on the different servers, read here.
Use the -u flag to select a non-default server.
Note: The staging environment is intended for evaluation only. It may contain a partial dataset, is throttled, and may produce different results from production.
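As an illustration of server selection, the sketch below assembles a varsome_api_run argument list in Python and adds -u only when a non-default server is chosen. The helper name build_run_command and the SERVERS mapping are hypothetical; only the flags and URLs come from this README, and the sketch assumes the CLI accepts flags in any order.

```python
import shlex

# Server URLs copied from the table above.
SERVERS = {
    "live": "https://api.varsome.com",
    "stable": "https://stable-api.varsome.com",
    "staging": "https://staging-api.varsome.com",
}


def build_run_command(api_key, queries, server="live", genome="hg19"):
    """Return an argv list for varsome_api_run (hypothetical helper)."""
    argv = ["varsome_api_run", "-g", genome, "-k", api_key, "-q", *queries]
    if server != "live":
        # -u is only needed when the default (live) server is not wanted.
        argv += ["-u", SERVERS[server]]
    return argv


cmd = build_run_command("YOUR_API_KEY", ["chr7-140453136-A-T"], server="staging")
print(shlex.join(cmd))  # shell-quoted form, handy for logging
```

The resulting list can be handed to subprocess.run(cmd) once the CLI is installed; nothing is executed here.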

    # Annotate a single variant
    varsome_api_run -g hg19 -k YOUR_API_KEY -q 'chr7-140453136-A-T' -p add-ACMG-annotation=1

    # Annotate several variants with selected source databases
    varsome_api_run -g hg19 -k YOUR_API_KEY \
      -q 'chr7-140453136-A-T' 'chr19:20082943:1:G' \
      -p add-source-databases=gnomad-exomes,refseq-transcripts

    # Read variants from a file and write the results to a file
    varsome_api_run -g hg19 -k YOUR_API_KEY -i variants.txt -o annotations.json -p add-ACMG-annotation=1

    # Look up genes
    varsome_api_run -y genes -g hg19 -k YOUR_API_KEY -q BRCA1 TP53

    # Read genes from a file
    varsome_api_run -y genes -g hg19 -k YOUR_API_KEY -i genes.txt -o gene_annotations.json

    # Look up CNVs
    varsome_api_run -y cnvs -g hg19 -k YOUR_API_KEY -q 'chr1:122:5235:DEL' 'chr1:100:1254:DUP'

    # Read CNVs from a file
    varsome_api_run -y cnvs -g hg19 -k YOUR_API_KEY -i cnvs.txt -o cnv_annotations.json

Output defaults to stdout. Use -o to write to a file.
The output is always written in JSON Lines format —
one JSON object per line — regardless of whether you write to a file or stdout.
See Output format: JSON Lines
for details and migration guidance.

    varsome_api_annotate_vcf -g hg19 -k YOUR_API_KEY -i input.vcf -o annotated.vcf -p add-ACMG-annotation=1

VCF annotation limitation: varsome_api_annotate_vcf supports SNPs and small indels (up to 200 bp). Remove any variants outside these criteria before running.
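One way to pre-filter a VCF before annotation is sketched below in plain Python, without pysam. It is not part of this package. It assumes that "up to 200 bp" means every REF and ALT allele is at most 200 bases long, and it drops symbolic alleles such as `<DEL>`, breakends, and records with no ALT allele; the tool's exact rule is not stated here, so adjust the check as needed. The function names are hypothetical.

```python
MAX_ALLELE_LEN = 200  # assumed reading of "up to 200 bp"


def is_supported(ref, alt):
    """True when REF and every ALT allele is a plain base string within the limit."""
    alleles = [ref, *alt.split(",")]  # ALT may list several alleles
    return all(
        0 < len(a) <= MAX_ALLELE_LEN and set(a.upper()) <= set("ACGTN")
        for a in alleles
    )


def filter_vcf_lines(lines):
    """Yield header lines unchanged and only the supported variant records."""
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        # VCF columns: CHROM, POS, ID, REF, ALT, ...
        if len(fields) >= 5 and is_supported(fields[3], fields[4]):
            yield line


sample = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr7\t140453136\t.\tA\tT\t.\t.\t.\n",
    "chr1\t100\t.\tA\t<DEL>\t.\t.\t.\n",
    "chr1\t200\t.\tA\t" + "C" * 201 + "\t.\t.\t.\n",
]
kept = list(filter_vcf_lines(sample))
print(len(kept), "of", len(sample), "lines kept")
```

For a real file, pass an open text handle to filter_vcf_lines and write the yielded lines to a new VCF.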
| Flag | Description | Default |
|---|---|---|
| -k | API key (required) | (none) |
| -g | Reference genome: hg19 or hg38 | hg19 |
| -y | Query type: variants, genes, or cnvs | variants |
| -p | Request parameters as key=value pairs | add-ACMG-annotation=1 |
| -u | API server URL | https://api.varsome.com |
| -t | Max concurrent requests (1–20) | 5 |
| -m | Max items per batch request | 100 |
| -v / --verbose | Enable debug-level logging | off |

When using -p to specify request parameters, separate multiple parameters with spaces:

    varsome_api_run -g hg19 -k YOUR_API_KEY -q 'chr7-140453136-A-T' \
      -p add-ACMG-annotation=1 add-source-databases=gnomad-exomes,refseq-transcripts

Using the -p flag with the reference annotation command (varsome_api_annotate_vcf) and parameters other than the default add-ACMG-annotation=1 will not produce the expected results, because varsome_api_annotate_vcf is designed to work with a specific set of parameters. For VCF annotation, keep the default parameters, or consult the documentation on extending VCFAnnotator for custom annotation pipelines.

Batch limits: The -m parameter specifies the maximum number of items
(variants or genes) per batch request. The API enforces per-environment limits:
- Live / Stable: Variants: 200, Genes: 100
- Staging: Variants: 50, Genes: 10
If you exceed the environment's limit, the API will return an error. Adjust -m
accordingly. CNV queries do not support batching and are always sent individually.
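The limits above can be expressed as a small batching helper. This is only a sketch of the arithmetic, not the client's internal logic: the names BATCH_LIMITS and batches are hypothetical, and requested stands in for the value passed to -m.

```python
from itertools import islice

# Per-environment batch limits quoted above.
BATCH_LIMITS = {
    "live": {"variants": 200, "genes": 100},
    "stable": {"variants": 200, "genes": 100},
    "staging": {"variants": 50, "genes": 10},
}


def batches(items, environment="live", query_type="variants", requested=100):
    """Yield lists no larger than the requested size or the environment limit."""
    if query_type == "cnvs":
        size = 1  # CNV queries are always sent individually
    else:
        size = min(requested, BATCH_LIMITS[environment][query_type])
    iterator = iter(items)
    # islice returns an empty list once the iterator is exhausted, ending the loop.
    while chunk := list(islice(iterator, size)):
        yield chunk


genes = [f"GENE{i}" for i in range(25)]
for batch in batches(genes, environment="staging", query_type="genes"):
    print(len(batch))
```

With 25 genes on staging, the requested size of 100 is capped at the staging gene limit of 10, so the helper yields batches of 10, 10, and 5.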
Run any tool with --help for the full option reference.
varsome_api_run v1.x writes all output — to a file or to stdout — in
JSON Lines (JSONL) format: one self-contained JSON
object per line, with no surrounding array wrapper.
This is a breaking change from v0.x, which wrote the output file as a single JSON array.
The old output file looked like this:

    [
      {"chromosome": "7", "pos": 140453136, "ref": "A", "alt": "T", ...},
      {"chromosome": "19", "pos": 20082943, "ref": "1", "alt": "G", ...}
    ]

Users would load the entire file at once and iterate the resulting list:

    # v0.x: old approach
    import json

    with open("annotations.json") as f:
        annotations = json.load(f)  # parses the whole file as a JSON array

    for annotation in annotations:
        print(annotation["chromosome"], annotation["pos"])

The new output file looks like this:

    {"alt": "T", "chromosome": "7", "pos": 140453136, "ref": "A", ...}
    {"alt": "G", "chromosome": "19", "pos": 20082943, "ref": "1", ...}

Each line is an independent, complete JSON object. Read the file line by line and parse each line separately:

    # v1.x: new approach
    import json

    with open("annotations.jsonl") as f:
        for line in f:
            annotation = json.loads(line)  # parse one object at a time
            print(annotation["chromosome"], annotation["pos"])

⚠️ json.load(f) will fail on a JSONL file because the file as a whole is not valid JSON. Always use json.loads(line) inside a loop.
The JSONL format allows results to be streamed and written as they arrive from the API, keeping memory usage constant regardless of how many variants are annotated. The old array format required buffering all results in memory before writing, which was impractical for large variant sets.
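If you still have v0.x output files, a one-off conversion may be easier than maintaining two readers. The sketch below uses only the standard library and is not part of this package. The function names and file paths are hypothetical, and it assumes the old file is a single JSON array as described above.

```python
import json


def array_to_jsonl(array_path, jsonl_path):
    """Rewrite a v0.x JSON-array file as JSON Lines; return the record count."""
    with open(array_path, encoding="utf-8") as src:
        records = json.load(src)  # the old format is one JSON array
    with open(jsonl_path, "w", encoding="utf-8") as dst:
        for record in records:
            dst.write(json.dumps(record) + "\n")
    return len(records)


def iter_jsonl(path):
    """Yield one parsed object per non-blank line of a JSONL file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.strip():  # tolerate a trailing blank line
                yield json.loads(line)
```

Because iter_jsonl is a generator, it keeps only one record in memory at a time, which matches the streaming rationale above. The conversion step itself still loads the old array in full.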
| Document | Description |
|---|---|
| Developer Guide | Using VarSomeAPIClient and VCFAnnotator in your Python code |
| Docker Guide | Running the tools via the pre-built Docker image or building your own |
Contact support to register for an API key.
An API key is required for all CLI operations and for batch lookups.
Single-variant lookups via VarSomeAPIClient do not require a key, but will be throttled.
See api.varsome.com for available request parameters and
the full response schema. The OpenAPI specification is available at
https://api.varsome.com/openapi/variants/.
See the Developer Guide for instructions on cloning the repository, setting up a development environment, and running the test suite.