VarSome API Client

A Python client for the VarSome API — annotate genetic variants against gnomAD, ClinVar, and many other databases via a simple command-line interface or a Python library.

⚠️ Legacy version notice

This is a new major version (1.x) that requires Python ≥ 3.11.

If you need compatibility with Python 3.10 or earlier, use the previous release:

v0.0.3 — Python ≤ 3.10 compatible

What this library provides

varsome_api_run — look up one or more variants, genes, or CNVs and receive the full JSON annotation response.
varsome_api_annotate_vcf — read a VCF file, annotate every variant via the VarSome API, and write an annotated output VCF.
VarSomeAPIClient — a Python class for integrating variant, gene, and CNV annotation directly into your own code (synchronous and async interfaces).
VCFAnnotator — a customisable VCF annotation pipeline class for use in your own Python projects.

Installation

End users — CLI tools

If you only want to run varsome_api_run or varsome_api_annotate_vcf, the Docker image is the recommended approach — it ships with all system dependencies pre-installed and requires no local build toolchain:

docker pull ghcr.io/saphetor/varsome-api-client-python:1

See the Docker Guide for full usage instructions.

Python install — core library (no VCF support)

If you only need VarSomeAPIClient for variant lookup in your own code and do not require VCF reading/writing, install without extras:

pip install git+https://github.com/saphetor/varsome-api-client-python.git

Or with Poetry:

poetry add git+https://github.com/saphetor/varsome-api-client-python.git

Python install — with VCF support (`[vcf]` extra)

varsome_api_annotate_vcf and VCFAnnotator depend on pysam, which requires several C build libraries. Install the vcf extra to include pysam:

pip install "varsome_api[vcf] @ git+https://github.com/saphetor/varsome-api-client-python.git"

Or with Poetry:

poetry add "git+https://github.com/saphetor/varsome-api-client-python.git[vcf]"

Build requirements for pysam — the following system libraries must be present before pip can compile pysam:

Library	Ubuntu/Debian	macOS (Homebrew)
zlib	`zlib1g-dev`	`zlib`
bzip2	`libbz2-dev`	`bzip2`
lzma	`liblzma-dev`	`xz`
libcurl	`libcurl4-openssl-dev`	`curl`
OpenSSL	`libssl-dev`	`openssl`
libdeflate	`libdeflate-dev`	`libdeflate`
build tools	`build-essential`	Xcode CLT

If installing these is inconvenient, use the Docker image instead — it handles all of this for you.

After installation, the varsome_api_run and varsome_api_annotate_vcf commands will be available in your PATH.

Requires Python ≥ 3.11, < 3.15.
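The version range above can be checked before installing. The following is a minimal sketch, not part of the library: the helper name `python_supported` is hypothetical, and the bounds are simply the ones stated in this README.

```python
import sys

# Supported range taken from the requirement stated above: >= 3.11 and < 3.15.
MIN_VERSION = (3, 11)
MAX_VERSION_EXCLUSIVE = (3, 15)


def python_supported(version=None):
    """Return True if the (major, minor) version falls in the supported range."""
    if version is None:
        version = sys.version_info[:2]
    major_minor = tuple(version[:2])
    return MIN_VERSION <= major_minor < MAX_VERSION_EXCLUSIVE


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "supported" if python_supported() else "not supported"
    print(f"Python {current} is {status} by the 1.x client")
```

If the check reports an unsupported interpreter, the Docker image or the v0.0.3 release mentioned earlier are the alternatives this README describes.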

API servers

Server	URL	Notes
Live	`https://api.varsome.com`	Default
Stable	`https://stable-api.varsome.com`	Kept frozen according to schedule
Staging	`https://staging-api.varsome.com`	Test environment, throttled

For more information on the different servers, read here.

Use the -u flag to select a non-default server.

Note: The staging environment is intended for evaluation only. It may contain a partial dataset, is throttled, and may produce different results from production.
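When scripting around the CLI, it can help to keep the server URLs in one place. The sketch below is illustrative only: the dictionary keys and the `server_flag` helper are hypothetical names, while the URLs and the `-u` flag come from the table above.

```python
# URLs copied from the server table above; the dictionary keys are illustrative names.
API_SERVERS = {
    "live": "https://api.varsome.com",
    "stable": "https://stable-api.varsome.com",
    "staging": "https://staging-api.varsome.com",
}
DEFAULT_SERVER = "live"


def server_flag(name=DEFAULT_SERVER):
    """Return the extra CLI arguments needed to target the named server."""
    key = name.lower()
    if key not in API_SERVERS:
        raise ValueError(f"unknown server {name!r}; expected one of {sorted(API_SERVERS)}")
    # The live server is the default, so no -u flag is needed for it.
    return [] if key == DEFAULT_SERVER else ["-u", API_SERVERS[key]]
```

For example, `server_flag("stable")` yields the two tokens to append to a `varsome_api_run` argument list, and `server_flag()` yields an empty list.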

Quick-start: command-line tools

Annotate a single variant

varsome_api_run -g hg19 -k YOUR_API_KEY -q 'chr7-140453136-A-T' -p add-ACMG-annotation=1

Annotate multiple variants in one call

varsome_api_run -g hg19 -k YOUR_API_KEY \
  -q 'chr7-140453136-A-T' 'chr19:20082943:1:G' \
  -p add-source-databases=gnomad-exomes,refseq-transcripts

Annotate variants from a text file (one variant per line)

varsome_api_run -g hg19 -k YOUR_API_KEY -i variants.txt -o annotations.json -p add-ACMG-annotation=1

Look up gene information

varsome_api_run -y genes -g hg19 -k YOUR_API_KEY -q BRCA1 TP53

varsome_api_run -y genes -g hg19 -k YOUR_API_KEY -i genes.txt -o gene_annotations.json

Look up CNV information

varsome_api_run -y cnvs -g hg19 -k YOUR_API_KEY -q 'chr1:122:5235:DEL' 'chr1:100:1254:DUP'

varsome_api_run -y cnvs -g hg19 -k YOUR_API_KEY -i cnvs.txt -o cnv_annotations.json

Output defaults to stdout. Use -o to write to a file. The output is always written in JSON Lines format — one JSON object per line — regardless of whether you write to a file or stdout. See Output format: JSON Lines for details and migration guidance.

Annotate a VCF file

varsome_api_annotate_vcf -g hg19 -k YOUR_API_KEY -i input.vcf -o annotated.vcf -p add-ACMG-annotation=1

VCF annotation limitation: varsome_api_annotate_vcf supports SNPs and small indels (up to 200 bp). Remove any variants outside these criteria before running.
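One way to pre-filter a VCF is sketched below in plain Python, without pysam. It rests on an assumption that is not confirmed by this README: "up to 200 bp" is interpreted as the longest REF or ALT allele being at most 200 bases. Symbolic alleles such as `<DEL>`, breakends, `*`, and `.` are dropped as well. The function names are hypothetical.

```python
import re

MAX_ALLELE_LENGTH = 200  # assumption: "up to 200 bp" means the longest allele length
PLAIN_ALLELE = re.compile(r"[ACGTNacgtn]+")


def is_supported_record(line):
    """Return True for a VCF data line that looks like a SNP or small indel."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) < 5:
        return False  # malformed line: fewer than the CHROM..ALT columns
    alleles = [fields[3], *fields[4].split(",")]
    # Symbolic alleles (<DEL>), breakends, "*" and "." fail the plain-base check.
    if not all(PLAIN_ALLELE.fullmatch(allele) for allele in alleles):
        return False
    return max(len(allele) for allele in alleles) <= MAX_ALLELE_LENGTH


def filter_vcf_lines(lines):
    """Yield header lines unchanged and only the supported data lines."""
    for line in lines:
        if line.startswith("#") or is_supported_record(line):
            yield line
```

A typical use would be to open the input VCF for reading and a second file for writing, then pass the input handle to `filter_vcf_lines` and write out each yielded line. Compressed or indexed VCFs would need a different reader.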

Common CLI flags

Flag	Description	Default
`-k`	API key (required)	—
`-g`	Reference genome: `hg19` or `hg38`	`hg19`
`-y`	Query type: `variants`, `genes`, or `cnvs`	`variants`
`-p`	Request parameters as `key=value` pairs	`add-ACMG-annotation=1`
`-u`	API server URL	`https://api.varsome.com`
`-t`	Max concurrent requests (1–20)	`5`
`-m`	Max items per batch request	`100`
`-v` / `--verbose`	Enable debug-level logging	off

When using -p to specify request parameters, separate multiple parameters with spaces:

varsome_api_run -g hg19 -k YOUR_API_KEY -q 'chr7-140453136-A-T' \
  -p add-ACMG-annotation=1 add-source-databases=gnomad-exomes,refseq-transcripts
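If the command is launched from Python, the same space-separated `-p` convention can be produced from a dictionary. This is a rough sketch: `build_run_command` is a hypothetical helper, not part of the library, and it uses only the flags shown in this README.

```python
import shlex


def build_run_command(api_key, queries, params=None, genome="hg19"):
    """Assemble a varsome_api_run argument list suitable for subprocess.run()."""
    command = ["varsome_api_run", "-g", genome, "-k", api_key, "-q", *queries]
    if params:
        # Each parameter becomes its own key=value token after a single -p.
        command.append("-p")
        command.extend(f"{key}={value}" for key, value in params.items())
    return command


command = build_run_command(
    "YOUR_API_KEY",
    ["chr7-140453136-A-T"],
    {
        "add-ACMG-annotation": 1,
        "add-source-databases": "gnomad-exomes,refseq-transcripts",
    },
)
# shlex.join renders the list as a shell-safe string for logging or copy-paste.
print(shlex.join(command))
```

Passing the list form directly to `subprocess.run` avoids shell quoting issues with variant identifiers.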

Passing -p to varsome_api_annotate_vcf with parameters other than the default add-ACMG-annotation=1 will not produce the expected results, because the tool is designed to work with a specific set of parameters. For VCF annotation, keep the default parameters, or consult the documentation on extending VCFAnnotator for custom annotation pipelines.

Batch limits: The -m parameter specifies the maximum number of items (variants or genes) per batch request. The API enforces per-environment limits:

Live / Stable: Variants: 200, Genes: 100
Staging: Variants: 50, Genes: 10

If you exceed the environment's limit, the API will return an error. Adjust -m accordingly. CNV queries do not support batching and are always sent individually.
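The limits above can be encoded so that a script never requests a batch larger than the environment allows. This is a small sketch with hypothetical helper names; the numbers are the ones listed above, and the default of 100 mirrors the documented `-m` default.

```python
# Per-environment limits copied from the list above; CNVs are not batched.
BATCH_LIMITS = {
    "live": {"variants": 200, "genes": 100},
    "stable": {"variants": 200, "genes": 100},
    "staging": {"variants": 50, "genes": 10},
}


def safe_batch_size(environment, query_type, requested=100):
    """Clamp a requested -m value to the limit for the environment."""
    if query_type == "cnvs":
        return 1  # CNV queries are always sent individually
    limit = BATCH_LIMITS[environment][query_type]
    return max(1, min(requested, limit))


def split_into_batches(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]
```

For instance, `safe_batch_size("staging", "genes")` gives the value to pass to `-m` when testing gene lookups against the staging server.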

Run any tool with --help for the full option reference.

Output format: JSON Lines (breaking change from v0.x)

varsome_api_run v1.x writes all output — to a file or to stdout — in JSON Lines (JSONL) format: one self-contained JSON object per line, with no surrounding array wrapper.

This is a breaking change from v0.x, which wrote the output file as a single JSON array.

v0.x — old format (JSON array)

The old output file looked like this:

[
  {"chromosome": "7", "pos": 140453136, "ref": "A", "alt": "T", ...},
  {"chromosome": "19", "pos": 20082943, "ref": "1", "alt": "G", ...}
]

Users would load the entire file at once and iterate the resulting list:

# v0.x — old approach
import json

with open("annotations.json") as f:
    annotations = json.load(f)  # parses the whole file as a JSON array

for annotation in annotations:
    print(annotation["chromosome"], annotation["pos"])

v1.x — new format (JSON Lines)

The new output file looks like this:

{"alt": "T", "chromosome": "7", "pos": 140453136, "ref": "A", ...}
{"alt": "G", "chromosome": "19", "pos": 20082943, "ref": "1", ...}

Each line is an independent, complete JSON object. Read the file line by line and parse each line separately:

# v1.x — new approach
import json

with open("annotations.jsonl") as f:
    for line in f:
        annotation = json.loads(line)  # parse one object at a time
        print(annotation["chromosome"], annotation["pos"])

⚠️ json.load(f) fails on a JSONL file that contains more than one record, because the file as a whole is not a single valid JSON document. Always use json.loads(line) inside a loop.
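During migration, a script may need to read both old and new output files. The sketch below is one possible approach and not part of the library: the hypothetical `iter_annotations` helper peeks at the first non-whitespace character and assumes that a leading `[` means a v0.x array, while anything else is treated as JSON Lines.

```python
import json


def iter_annotations(path):
    """Yield annotation dicts from a v0.x JSON array file or a v1.x JSONL file."""
    with open(path, encoding="utf-8") as handle:
        # Find the first non-whitespace character to guess the format.
        first = handle.read(1)
        while first and first.isspace():
            first = handle.read(1)
        handle.seek(0)
        if first == "[":
            # v0.x: the whole file is one JSON array.
            yield from json.load(handle)
        else:
            # v1.x: one JSON object per line; skip blank lines.
            for line in handle:
                if line.strip():
                    yield json.loads(line)
```

Downstream code can then iterate over `iter_annotations("annotations.json")` regardless of which client version produced the file. The array branch still loads the whole file into memory, so it only suits files small enough for that.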

Why the change?

The JSONL format allows results to be streamed and written as they arrive from the API, keeping memory usage constant regardless of how many variants are annotated. The old array format required buffering all results in memory before writing, which was impractical for large variant sets.
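The streaming idea can be illustrated with a generator and a writer that emits one line per result. This is only a sketch of the general technique, not the client's actual implementation: `fake_results` stands in for responses arriving from the API, and sorting keys is an assumption made here to mirror the sample output shown earlier.

```python
import io
import json


def write_jsonl(results, handle):
    """Write each result as one JSON line as soon as it is produced."""
    count = 0
    for result in results:
        # Only the current result is held in memory at any point.
        handle.write(json.dumps(result, sort_keys=True) + "\n")
        count += 1
    return count


def fake_results():
    """Stand-in generator for API responses arriving one at a time."""
    yield {"chromosome": "7", "pos": 140453136, "ref": "A", "alt": "T"}
    yield {"chromosome": "19", "pos": 20082943, "ref": "1", "alt": "G"}


buffer = io.StringIO()
written = write_jsonl(fake_results(), buffer)
```

Because the writer never collects results into a list, memory use depends on the size of a single record, not on the number of records.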

Documentation

Document	Description
Developer Guide	Using `VarSomeAPIClient` and `VCFAnnotator` in your Python code
Docker Guide	Running the tools via the pre-built Docker image or building your own

How to get an API key

Contact support to register for an API key.

An API key is required for all CLI operations and for batch lookups. Single-variant lookups via VarSomeAPIClient do not require a key, but will be throttled.

API documentation

See api.varsome.com for available request parameters and the full response schema. The OpenAPI specification is available at https://api.varsome.com/openapi/variants/.

Contributing & running the tests

See the Developer Guide for instructions on cloning the repository, setting up a development environment, and running the test suite.
