Introduction

This document contains the specification for the UM data API, used by LocusZoom and other tools under development at the Center for Statistical Genetics, University of Michigan.

Production or development API

Simply replace api with api_internal_dev in any of the URLs below.

# Production API
curl "http://portaldev.sph.umich.edu/api/v1/statistic/single/"

# Development API
curl "http://portaldev.sph.umich.edu/api_internal_dev/v1/statistic/single/"

Common parameters

To retrieve data from the available resources, send HTTP GET requests with the optional parameters listed in the table below. The parameters and their formats follow best practices from the OData and JAX-RS specifications [1,2].

Parameter Type Description
page integer Page number if pagination is requested
limit integer Maximum page size
filter string Specifies filtering options
sort string List of fields that will be used to sort the collection
fields string List of fields that will be included

filter

The filter parameter restricts the returned entries of a resource using logical expressions. A logical expression is a combination of resource field names, operators, and literals. The tables below list the available literals and operators, respectively.

Literal Description
'a string' Variable-length character string.
0.73, -0.73 Floating point number.
12, -12 Integer number.

Operator Description Example
eq = filter=analysis eq 1, filter=variant eq 'rs1234567'
gt > filter=refAlleleFreq gt 0.01
lt < filter=pvalue lt 0.000000005
ge >= filter=position ge 10000
le <= filter=position le 20000
in (member of list) filter=chromosome in '1','2','3','16'
and & filter=position ge 10000 and position le 20000

Depending on the requirements, a given resource and field may support only a subset of these operators.
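As an illustration, a filter expression can be assembled from individual conditions and URL-encoded with the Python standard library. This is a minimal sketch: the helper names quote_literal and build_filter are hypothetical and not part of the API, and it assumes string literals take straight single quotes, as in the curl examples.

```python
from urllib.parse import urlencode

def quote_literal(value):
    """Render a Python value as a filter literal (strings get single quotes).

    Numbers are rendered with str(); very small floats may come out in
    exponent notation, and this document does not say whether the server
    accepts that form.
    """
    if isinstance(value, str):
        return "'" + value + "'"
    return str(value)

def build_filter(conditions):
    """Join (field, operator, value) triples with 'and'.

    For the 'in' operator, value is a list of literals.
    """
    parts = []
    for field, op, value in conditions:
        if op == "in":
            rendered = ",".join(quote_literal(v) for v in value)
        else:
            rendered = quote_literal(value)
        parts.append(f"{field} {op} {rendered}")
    return " and ".join(parts)

region_filter = build_filter([
    ("analysis", "in", [1]),
    ("chromosome", "in", ["12"]),
    ("position", "ge", 10001),
    ("position", "le", 20001),
])

# urlencode percent-encodes the quotes and turns spaces into '+'
query_string = urlencode({"filter": region_filter})
```

The resulting query_string can be appended to a resource URL after a "?" character.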

fields

The fields parameter allows projection of resource’s fields. The projection is specified as a comma separated list of resource’s fields. For example, to select only analysis and trait fields from the /statistic/single resource, the corresponding GET request must have fields=analysis,trait.

Each resource has its own set of fields, listed under the API endpoints section.

sort

The sort parameter orders the results by one or more resource fields, given as a comma-separated list. A - character before a field name requests descending order; for example, sort=position,-pvalue sorts by ascending position and then by descending p-value.
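The fields and sort parameters are both comma-separated lists, so they can be built the same way. The sketch below uses a hypothetical helper, build_query, that is not part of the API. Note that the endpoint sections in this document currently list sorting as not yet implemented, so the sort part is illustrative only.

```python
from urllib.parse import urlencode

def build_query(fields=None, sort=None, descending=()):
    """Build a query string from lists of field names.

    fields: fields to include in the response (projection).
    sort: fields to order by, in priority order.
    descending: subset of `sort` that should be ordered descending.
    """
    params = {}
    if fields:
        params["fields"] = ",".join(fields)
    if sort:
        # A leading '-' marks a field as descending
        params["sort"] = ",".join(
            ("-" + name) if name in descending else name for name in sort
        )
    return urlencode(params)

query = build_query(
    fields=["variant", "position", "pvalue"],
    sort=["position", "pvalue"],
    descending={"pvalue"},
)
```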

Response status codes

Code Message Description
200 JSON with results Success
400 Incorrect syntax in filter parameter Server unable to parse filter
400 Incorrect syntax in fields parameter Server unable to parse the fields parameter
501 Unsupported data type for the xyz field in the filter parameter Server successfully parsed the filter parameter, but the xyz field’s data type didn’t match the provided literal’s type
501 Unsupported operation for the field in the filter parameter Server successfully parsed the filter parameter, but the resource doesn’t support the specified operation with the field
501 Unsupported field in the filter parameter Server successfully parsed the filter parameter, but at least one of the specified field names is not present in the corresponding resource
501 Unsupported field in the fields parameter Server successfully parsed the fields parameter, but at least one of the specified field names is not present in the corresponding resource
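A client may want to distinguish the two failure classes in the table: 400 for parameters the server could not parse, and 501 for parameters that parsed but are not supported by the resource. The following is a minimal sketch; ApiError and check_status are hypothetical names introduced for illustration.

```python
class ApiError(Exception):
    """Hypothetical exception type for non-200 responses."""

def check_status(status_code, message=""):
    """Raise ApiError for any status code other than 200."""
    if status_code == 200:
        return
    if status_code == 400:
        # The filter or fields parameter could not be parsed
        raise ApiError(f"Malformed request (400): {message}")
    if status_code == 501:
        # Parsed successfully, but a field, type, or operation is unsupported
        raise ApiError(f"Unsupported field, type, or operation (501): {message}")
    raise ApiError(f"Unexpected status {status_code}: {message}")
```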

Response JSON

All responses to HTTP GET requests use the JSON data format. The returned object must contain two mandatory fields: “data” and “lastPage”.

Example JSON response:

{
  "data": "result JSON here",
  "lastPage": "integer here"
}
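When results are paginated, a client can walk through the pages using the page and limit parameters together with “lastPage”. The sketch below assumes that “lastPage” holds the number of the final page and is null when there are no further pages; this document does not state that explicitly. The fetch function is injected so the loop can be shown without a network call, and fake_fetch is a stand-in for a real HTTP request.

```python
def fetch_all_pages(fetch_page, limit=100):
    """Collect the "data" object of every page.

    fetch_page(page, limit) must return the parsed JSON response as a dict.
    """
    pages = []
    page = 1
    while True:
        body = fetch_page(page, limit)
        pages.append(body["data"])
        last_page = body.get("lastPage")
        # Assumption: null/None means there are no further pages
        if last_page is None or page >= int(last_page):
            break
        page += 1
    return pages

def fake_fetch(page, limit):
    """Stand-in for an HTTP call; pretends the collection has 3 pages."""
    return {"data": {"position": [page * 10]}, "lastPage": 3}

all_pages = fetch_all_pages(fake_fetch)
```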

Overview of API endpoints

Relative Resource URI Description
/statistic/single/ Collection of all available studies that have single variant association results.
/statistic/single/results/ Collection of all single variant association results.
/statistic/pair/LD/ Collection of all datasets that have available linkage disequilibrium information or that can be used to compute linkage disequilibrium.
/statistic/pair/LD/results/ Collection of pair-wise linkage disequilibrium coefficients between all variants.
/statistic/pair/ScoreCov/ Collection of all datasets that have available covariance matrices between single variant score test statistics.
/statistic/pair/ScoreCov/results/ Collection of covariance values between all single variant score test statistics.
/annotation/variant/ Collection of all available single variant annotations.
/annotation/interval/ Collection of all available genome interval annotation sources (such as GENCODE).
/annotation/interval/results/ Collection of all available genome interval annotations.
/annotation/genes/sources/ Collection of all available gene annotation resources.
/annotation/genes/ Collection of all annotated genes.
/annotation/genes/names/ Collection of all gene/transcript/exon names.

API endpoints

Single variant statistics

API endpoints for retrieving association statistics on single variants.

List all available datasets/resources

GET /statistic/single/

curl "http://portaldev.sph.umich.edu/api/v1/statistic/single/"
import requests

response = requests.get("http://portaldev.sph.umich.edu/api/v1/statistic/single/")
json = response.json()

The JSON response will look like:

{
  "data": {
    "analysis": [1,2,3],
    "study": ["METSIM","FUSION","FUSION"],
    "trait": ["T2D","T2D","fasting insulin"],
    "tech": ["Illumina300K","Exome chip","Illumina 1M"],
    "build": ["b36","b37","b37"],
    "imputed": ["1000G","NA","HapMap"]
  },
  "lastPage": null
}
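The “data” object is column-oriented: each field maps to a list, and the values at the same index belong to the same record. A short sketch of turning it into one dictionary per record follows; columns_to_rows is a hypothetical helper, and the sample data is a subset of the response above.

```python
def columns_to_rows(data):
    """Turn {"field": [v1, v2, ...], ...} into [{"field": v1, ...}, ...]."""
    names = list(data)
    return [dict(zip(names, values)) for values in zip(*data.values())]

response = {
    "data": {
        "analysis": [1, 2, 3],
        "study": ["METSIM", "FUSION", "FUSION"],
        "trait": ["T2D", "T2D", "fasting insulin"],
    },
    "lastPage": None,
}

rows = columns_to_rows(response["data"])
```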

FIELDS

Field Description
analysis Analysis unique identifier
study Study name
trait Trait name
tech Genotyping/sequencing technology
build Genome build
imputed Reference panel used if data was imputed

FILTERS

Filter Description
analysis in 1,2,… Selects set of analyses by unique ID

SORT

Not yet implemented

Retrieve results

GET /statistic/single/results/

Example: retrieve all association results in the FUSION study for T2D (analysis ID 1)

curl -G "http://portaldev.sph.umich.edu/api/v1/statistic/single/results/" \
  --data-urlencode "page=1" \
  --data-urlencode "limit=100" \
  --data-urlencode "filter=analysis in 1"
{
  "data": {
    "analysis": [1,1,1],
    "variant": ["chr4:1_A/G","chr4:2_C/T","chr4:3900_C/T"],
    "chromosome": ["4","4","4"],
    "position": [1,2,3900],
    "pvalue": [0.6,0.01,0.000043],
    "scoreStat": [0.2,5.4,3.6]
  },
  "lastPage": null
}

Example: Retrieve association results from region 12:10001-20001 from the FUSION study for trait T2D. Include only variant name, position, and p-value columns. Sort by the position and p-value columns.

curl -G "https://portaldev.sph.umich.edu/api/v1/statistic/single/results/" \
  --data-urlencode "page=1" \
  --data-urlencode "limit=100" \
  --data-urlencode "filter=analysis in 1 and chromosome in '12' and position ge 10001 and position le 20001" \
  --data-urlencode "fields=variant,position,pvalue" \
  --data-urlencode "sort=position,pvalue"
{
  "data": {
    "variant": ["chr12:10001_A/G","chr12:10002_C/T","chr12:20000_G/T"],
    "position": [10001,10002,20000],
    "pvalue": [0.001,0.5,0.03]
  },
  "lastPage": null
}
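Because the columns are aligned by index, picking out a single record, such as the most significant variant in a region, only requires finding the index of the smallest p-value. This sketch uses the example response above; top_hit is a hypothetical helper.

```python
def top_hit(data):
    """Return the record with the smallest p-value as a dict."""
    pvalues = data["pvalue"]
    idx = min(range(len(pvalues)), key=pvalues.__getitem__)
    return {name: values[idx] for name, values in data.items()}

data = {
    "variant": ["chr12:10001_A/G", "chr12:10002_C/T", "chr12:20000_G/T"],
    "position": [10001, 10002, 20000],
    "pvalue": [0.001, 0.5, 0.03],
}

best = top_hit(data)
```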

FIELDS

Field Description
analysis Analysis unique identifier
variant Variant unique name
chromosome Chromosome
position Position in base pairs
samples Number of samples in the model
refAllele Reference allele
altAllele Alternate allele
effectAllele Effect allele
effectAlleleFreq Effect allele frequency
effectAlleleCount Effect allele count
refGenotypeCount Number of homozygous genotypes with reference allele
hetGenotypeCount Number of heterozygous genotypes
altGenotypeCount Number of homozygous genotypes with alternate allele
effect Effect size
effectStdErr Effect size standard error
scoreStat Score statistic
pvalue P-value

FILTERS

Filter Description
analysis in 1, 2 Select analysis by a unique identifier
chromosome in '1', '22', 'X' Select chromosomes by name.
position ge 10000 Start position in base-pairs of the interval of interest.
position le 60000 End position in base-pairs of the interval of interest.

SORT

Not yet implemented

Linkage disequilibrium

List all datasets/resources

GET /statistic/pair/LD/

Example: list all available reference panels

curl "http://portaldev.sph.umich.edu/api/v1/statistic/pair/LD/"
{
  "data": {
    "reference": [1,2,3,4],
    "panel": ["1000G","1000G","1000G","HapMap"],
    "population": ["JPT", "YRI", "EUR", "JPT"],
    "build": ["b37","b37","b37","b36"],
    "version": ["3","3","3","2"]
  },
  "lastPage": null
}

FIELDS

Field Description
reference Reference panel unique identifier
panel Reference panel name
population Population name
build Genome build
version Reference panel version

FILTERS

Filter Description
reference in 1, 2 Select reference by a unique identifier.

SORT

Not yet implemented

Retrieve results

GET /statistic/pair/LD/results/

Example: Retrieve all pair-wise LD D’ values between SNPs in the 12:10001-20001 region using 1000G EUR build 37 version 3 reference panel. Don’t sort the results. Retrieve only variant1, variant2 and value fields. Split results into pages of size 100. Start with the first page.

curl -G "http://portaldev.sph.umich.edu/api/v1/statistic/pair/LD/results/" \
  --data-urlencode "page=1" \
  --data-urlencode "limit=100" \
  --data-urlencode "filter=reference in 3 and chromosome1 in '12' and position1 ge 10001 and position1 le 20001 and chromosome2 in '12' and position2 ge 10001 and position2 le 20001 and type in 'dprime'" \
  --data-urlencode "fields=variant1,variant2,value"
{
  "data": {
    "variant1": ["12:10001", "12:10001", "12:10002"],
    "variant2": ["12:10002", "12:10003", "12:10003"],
    "value": [1.00, 0.78, 1.00]
  },
  "lastPage": 12
}
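A pair-wise response like the one above can be reshaped into a lookup table keyed by variant pair. The sketch assumes the LD value does not depend on the order of the two variants, so each pair is stored under an unordered key; ld_lookup is a hypothetical helper.

```python
def ld_lookup(data):
    """Map each unordered variant pair to its LD value."""
    return {
        frozenset((v1, v2)): value
        for v1, v2, value in zip(data["variant1"], data["variant2"], data["value"])
    }

data = {
    "variant1": ["12:10001", "12:10001", "12:10002"],
    "variant2": ["12:10002", "12:10003", "12:10003"],
    "value": [1.00, 0.78, 1.00],
}

lookup = ld_lookup(data)
# The key order does not matter
dprime = lookup[frozenset(("12:10003", "12:10001"))]
```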

Example: Retrieve pair-wise D’ LD values between SNP 12:10023 and all SNPs in the 12:10001-20001 region using 1000G EUR build 37 version 3 reference panel. Retrieve only variant2 and value columns. Split the results into pages of size 100. Start with the first page.

curl -G "http://portaldev.sph.umich.edu/api/v1/statistic/pair/LD/results/" \
  --data-urlencode "page=1" \
  --data-urlencode "limit=100" \
  --data-urlencode "filter=reference in 3 and variant1 in '12:10023' and chromosome2 in '12' and position2 ge 10001 and position2 le 20001 and type in 'dprime'" \
  --data-urlencode "fields=variant2,value"
{
  "data": {
    "variant2": ["12:10001", "12:10002", "12:10003"],
    "value": [1.00, 1.00, 0.98]
  },
  "lastPage": 10
}

This endpoint calculates LD values between pairs of variants on the fly rather than reading precomputed values. For regions of about 1 Mb, responses should be nearly instant.

This endpoint only uses pre-existing reference panels, such as the 1000 Genomes panels.

FIELDS

Field Description
reference Reference panel unique identifier
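variant1 First variant name in chr:pos_ref/alt format
chromosome1 Chromosome of the first variant
position1 Position of the first variant in base pairs
variant2 Second variant name in chr:pos_ref/alt format
chromosome2 Chromosome of the second variant
position2 Position of the second variant in base pairs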
value LD value
type LD type: dprime, rsquare

FILTERS

Filter Description
reference in 1, 2 Select reference by unique identifier.
variant1 in '12:1000', '12:1001' Select first variant by unique name.
chromosome1 in '1', '2' Select chromosome for the first variant.
position1 ge 1000 and position1 le 2000 Specify position range (in base pairs) for the first variant.
variant2 in '12:1000', '12:1001' Select second variant by unique name.
chromosome2 in '1', '2' Select chromosome for the second variant.
position2 ge 1000 and position2 le 2000 Specify position range (in base pairs) for the second variant.
type in 'dprime', 'rsquare' Select type of LD coefficient.

SORT

Not yet implemented

Covariance

List all datasets/resources

GET /statistic/pair/ScoreCov/

Example: Get all available studies that have covariance matrices of score statistics.

curl "http://portaldev.sph.umich.edu/api/v1/statistic/pair/ScoreCov/"
{
  "data": {
    "analysis": [1, 2, 3],
    "study": ["FUSION", "FUSION", "MAGIC"],
    "trait": ["t2d", "t2d", "fasting insulin"],
    "tech": ["Illumina300K", "Exome-chip", "Illumina300K"],
    "build": ["b37", "b37", "b36"],
    "imputed": ["1000G", "NA", "HapMap"]
  },
  "lastPage": null
}

Example: Get information about the covariance matrix available for the analysis with id equal to 1.

curl -G "http://portaldev.sph.umich.edu/api/v1/statistic/pair/ScoreCov/" --data-urlencode "filter=analysis in 1"
{
  "data": {
    "analysis": [1],
    "study": ["FUSION"],
    "trait": ["t2d"],
    "tech": ["Illumina300K"],
    "build": ["b37"],
    "imputed": ["1000G"]
  },
  "lastPage": null
}

FIELDS

Field Description
analysis Analysis unique name.
study Study name.
trait Trait name.
tech Genotyping/sequencing technology.
build Genome build version.
imputed Reference panel used for imputation.

FILTERS

Filter Description
analysis in 1, 2 Select analyses that have covariance information available by unique identifier.

SORT

Not yet implemented

Retrieve results

GET /statistic/pair/ScoreCov/results/

Example: Retrieve covariance of score statistics between all SNPs in the 12:10001-20001 region from T2D association results in the FUSION study. Retrieve only variant1, variant2 and value fields. Split results into pages of size 100. Start with the first page.

curl -G "http://portaldev.sph.umich.edu/api/v1/statistic/pair/ScoreCov/results/" \
  --data-urlencode "page=1" \
  --data-urlencode "limit=100" \
  --data-urlencode "filter=analysis in 1 and chromosome1 in '12' and position1 ge 10001 and position1 le 20001 and chromosome2 in '12' and position2 ge 10001 and position2 le 20001" \
  --data-urlencode "fields=variant1,variant2,value"
{
  "data": {
    "variant1": ["12:10001", "12:10001", "12:10002"],
    "variant2": ["12:10002", "12:10003", "12:10003"],
    "value": [0.30, 0.43, 0.12]
  },
  "lastPage": 12
}

Example: Retrieve covariance of score statistics between 12:10024 SNP and all SNPs in the 12:10001-20001 region from T2D association results in the FUSION study. Retrieve only variant2 and value fields. Split results into pages of size 100. Start with the first page.

curl -G "http://portaldev.sph.umich.edu/api_internal_dev/v1/statistic/pair/ScoreCov/results/" \
  --data-urlencode "page=1" \
  --data-urlencode "limit=100" \
  --data-urlencode "filter=analysis in 1 and variant1 in '12:10024' and chromosome2 in '12' and position2 ge 10001 and position2 le 20001" \
  --data-urlencode "fields=variant2,value"
{
  "data": {
    "variant2": ["12:10001", "12:10002", "12:10003"],
    "value": [0.55, 0.12, 0.77]
  },
  "lastPage": 10
}

Example: Extract covariance between all markers within the region chr1:762320-862320

curl -G "http://portaldev.sph.umich.edu/api_internal_dev/v1/statistic/pair/ScoreCov/results/" --data-urlencode "filter=analysis in 4 and chromosome1 in '1' and position1 ge 762320 and position1 le 862320 and chromosome2 in '1' and position2 ge 762320 and position2 le 862320"
import requests

# Pass the filter through params so that requests URL-encodes it
url = "http://portaldev.sph.umich.edu/api_internal_dev/v1/statistic/pair/ScoreCov/results/"
params = {"filter": "analysis in 4 and chromosome1 in '1' and position1 ge 762320 and position1 le 862320 and chromosome2 in '1' and position2 ge 762320 and position2 le 862320"}

resp = requests.get(url, params=params)

data = resp.json()["data"]
{
  "data": {
    "variant_name1": ["1:762320_C/T","1:762320_C/T","1:861349_C/T"],
    "chromosome1": ["1","1","1"],
    "position1": [762320,762320,861349],
    "variant_name2": ["1:762320_C/T","1:861349_C/T","1:861349_C/T"],
    "chromosome2": ["1","1","1"],
    "position2": [762320,861349,861349],
    "statistic": [0.00060542,-4.39597E-7,0.00110772]
  },
  "lastPage": null
}
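The example response lists each pair of variants once, including each variant paired with itself. Under the assumption that the server always returns pairs this way, and because covariance is symmetric, the full matrix can be rebuilt by mirroring each value. The sketch below uses the field names from the example response (variant_name1, variant_name2, statistic), which differ from the names in the FIELDS table; covariance_matrix is a hypothetical helper.

```python
def covariance_matrix(data):
    """Rebuild a symmetric matrix from pair-wise covariance records.

    Returns (variants, matrix), where matrix[i][j] is the covariance
    between variants[i] and variants[j], or None if the pair is missing.
    """
    variants = []
    for name in data["variant_name1"] + data["variant_name2"]:
        if name not in variants:
            variants.append(name)
    index = {name: i for i, name in enumerate(variants)}

    n = len(variants)
    matrix = [[None] * n for _ in range(n)]
    pairs = zip(data["variant_name1"], data["variant_name2"], data["statistic"])
    for v1, v2, stat in pairs:
        i, j = index[v1], index[v2]
        matrix[i][j] = stat
        matrix[j][i] = stat  # mirror across the diagonal
    return variants, matrix

data = {
    "variant_name1": ["1:762320_C/T", "1:762320_C/T", "1:861349_C/T"],
    "variant_name2": ["1:762320_C/T", "1:861349_C/T", "1:861349_C/T"],
    "statistic": [0.00060542, -4.39597e-07, 0.00110772],
}

variants, matrix = covariance_matrix(data)
```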

FIELDS

Field Description
analysis Analysis unique name.
variant1 Variant unique name.
chromosome1 Chromosome name.
position1 Position in base-pairs.
variant2 Variant unique name.
chromosome2 Chromosome name.
position2 Position in base-pairs.
value Covariance value.

FILTERS

Filter Description
analysis in 1, 2 Select analysis by a unique identifier.
variant1 in '12:1001', '12:1002' Select first variant by unique name.
chromosome1 in '1', '2', 'X' Select chromosome for the first variant by name.
position1 ge 1000 and position1 le 2000 Select position range (in base pairs) for the first variant.
variant2 in '12:1001', '12:1002' Select second variant by unique name.
chromosome2 in '1', '2', 'X' Select chromosome for the second variant by name.
position2 ge 1000 and position2 le 2000 Select position range (in base pairs) for the second variant.

SORT

Not yet implemented

Recombination

Get recombination sources

GET /annotation/recomb/

FIELDS

Field Description
id Recombination rate map unique identifier
resource Recombination rate map (e.g. hapmap)
build Genome build for recombination rate positions

FILTERS

SORT

Retrieve recombination rates

GET /annotation/recomb/results/

FIELDS

FILTERS

SORT

Interval annotations

These are annotations that span intervals of the genome, such as enhancers and transcription factor binding sites (TFBSs).

List all datasets/resources

GET /annotation/interval/

Example: Retrieve a list of all available interval annotation resources.

curl "http://portaldev.sph.umich.edu/api/v1/annotation/interval/"
{
  "data": {
    "resource": [1, 2],
    "name": ["GENCODE", "GENCODE"],
    "version": ["23", "19"],
    "build": ["b38", "b37"],
    "description": ["Full gene annotation", "Basic gene annotation"]
  },
  "lastPage": null
}

Example: Retrieve information about the interval annotation resource with id equal to 2.

curl -G "http://portaldev.sph.umich.edu/api/v1/annotation/interval/" --data-urlencode "filter=resource in 2"
{
  "data": {
    "resource": [2],
    "name": ["GENCODE"],
    "version": ["19"],
    "build": ["b37"],
    "description": ["Basic gene annotation"]
  },
  "lastPage": null
}

FIELDS

Field Description
resource Annotation resource unique id.
name Annotation resource name.
version Annotation resource version.
build Genome build version.
description Short description of the resource.

FILTERS

Filter Description
resource in 1, 2 Selects interval annotation resource by a unique identifier.

SORT

Not yet implemented

Retrieve interval annotations / genes

GET /annotation/interval/results/

Example: Retrieve annotations for genes in the window chr20:10000-20000 using the annotation resource with id equal to 2. Return only the id, name, strand, chromosome, start and end fields.

curl -G "http://portaldev.sph.umich.edu/api/v1/annotation/interval/results/" \
  --data-urlencode "filter=resource in 2 and type in 'gene' and chromosome in '20' and start ge 10000 and end le 20000" \
  --data-urlencode "fields=id,name,strand,chromosome,start,end"
{
  "data": {
    "id": ["ENSG00000223972.4"],
    "name": ["DDX11L1"],
    "strand": ["+"],
    "chromosome": ["20"],
    "start": [11869],
    "end": [14412]
  },
  "lastPage": null
}

Example: Retrieve all transcripts for gene ENSG00000223972.4 using the annotation resource with id equal to 2. Return only the id, name, chromosome, start and end fields.

curl -G "http://portaldev.sph.umich.edu/api/v1/annotation/interval/results/" \
  --data-urlencode "filter=resource in 2 and type in 'transcript' and parent in 'ENSG00000223972.4'" \
  --data-urlencode "fields=id,name,chromosome,start,end"
{
  "data": {
    "id": ["ENST00000456328.2", "ENST00000515242.2"],
    "name": ["DDX11L1-002", "DDX11L1-201"],
    "chromosome": ["20", "20"],
    "start": [11869, 11872],
    "end": [14409, 14412]
  },
  "lastPage": null
}

FIELDS

Field Description
resource Annotation resource unique id.
type Interval type: “gene”, “transcript”, “exon”.
id Unique public interval identifier (if available) such as gene id ENSGXXXXXXXXXXX.X, transcript id ENSTXXXXXXXXXXX.X, and exon id ENSEXXXXXXXXXXX.
name Interval name (if available) such as gene name.
parent Parent interval public identifier. For transcripts it will be the corresponding gene public identifier, while for exons it will be the corresponding transcript public identifier.
strand Strand “+” or “-”.
chromosome Chromosome name.
start Interval start position in base-pairs.
end Interval end position in base-pairs.
annotation Variable length text with interval annotation in key:value format (possibly JSON).

FILTERS

Filter Description
resource in 1, 2 Select interval annotation resource by a unique identifier.
chromosome in '1', '2', 'X' Select chromosome by name.
type in 'gene', 'transcript' Select interval type by name.
id in 'ENSGXXXXXXXXXXX.X', 'ENSTXXXXXXXXXXX.X' Select intervals by id.
name in 'PCSK9' Select intervals by name.
parent in 'ENSTXXXXXXXXXXX.X' Select intervals by their parent interval id.
start ge 10000 and start le 20000 Select interval if its start position falls into the specified interval.
end ge 10000 and end le 20000 Select interval if its end position falls into the specified interval.

SORT

Not yet implemented

Genes

List all possible sources of gene annotations

Currently we only include ENSEMBL/GENCODE.

GET /annotation/genes/sources/

Example: retrieve all gene annotation sources

curl "http://portaldev.sph.umich.edu/api/v1/annotation/genes/sources/"
{
  "data": [
    {
      "source_id" : 1,
      "source_name" : "gencode",
      "version" : "Release_23",
      "build" : "GRCh38.p3"
    },
    {
      "source_id" : 2,
      "source_name" : "gencode",
      "version" : "Release_22",
      "build" : "GRCh38.p3"
    }
  ]
}

FIELDS

Field Description
source_id Annotation resource unique id.
source_name Annotation resource name.
version Annotation resource version.
build Annotation resource genome build.

Retrieve gene information

GET /annotation/genes/

Retrieve all gene annotation data.

curl "http://portaldev.sph.umich.edu/api/v1/annotation/genes/"
{
  "data": [
  {
    "gene_id": "ENSG00000223972.5",
    "gene_name": "DDX11L1",
    "chromosome": "chr1",
    "start": "11869",
    "end": "14409",
    "strand": "+",
    "transcripts": [
    {
      "transcript_id": "ENST00000456328.2",
      "transcript_name": "DDX11L1-002",
      "start": "11869",
      "end": "14409",
      "exons": [
        { "exon_id": "ENSE00002234944.1", "start": "11869", "end": "12227" },
        { "exon_id": "ENSE00003582793.1", "start": "12613", "end": "12721" },
        { "exon_id": "ENSE00002312635.1", "start": "13221", "end": "14409" }
      ]
    },
    {
      "transcript_id": "ENST00000450305.2",
      "transcript_name": "DDX11L1-001",
      "start": "12010",
      "end":"13670",
      "exons": [
        { "exon_id": "ENSE00001948541.1", "start": "12010", "end": "12057" },
        { "exon_id": "ENSE00001671638.2", "start": "12179", "end": "12227" },
        { "exon_id": "ENSE00001758273.2", "start": "12613", "end": "12697" },
        { "exon_id": "ENSE00001799933.2", "start": "12975", "end": "13052" },
        { "exon_id": "ENSE00001746346.2", "start": "13221", "end": "13374" },
        { "exon_id": "ENSE00001863096.1", "start": "13453", "end": "13670" }
      ]
    }
    ]
  }
  ],
  "lastPage": null
}
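Unlike the column-oriented responses of the statistic endpoints, this response is nested: each gene holds a list of transcripts, and each transcript holds a list of exons. As a sketch of walking that structure, the hypothetical helper below sums exon lengths per transcript. It assumes positions are inclusive and arrive as strings, as in the example; the sample data is a subset of the response above.

```python
def exon_lengths(gene):
    """Return {transcript_id: total exon length in base pairs}."""
    lengths = {}
    for transcript in gene["transcripts"]:
        total = 0
        for exon in transcript["exons"]:
            # Positions are strings in the example; assume inclusive coordinates
            total += int(exon["end"]) - int(exon["start"]) + 1
        lengths[transcript["transcript_id"]] = total
    return lengths

gene = {
    "gene_id": "ENSG00000223972.5",
    "gene_name": "DDX11L1",
    "transcripts": [
        {
            "transcript_id": "ENST00000456328.2",
            "exons": [
                {"exon_id": "ENSE00002234944.1", "start": "11869", "end": "12227"},
                {"exon_id": "ENSE00003582793.1", "start": "12613", "end": "12721"},
                {"exon_id": "ENSE00002312635.1", "start": "13221", "end": "14409"},
            ],
        }
    ],
}

lengths = exon_lengths(gene)
```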

FIELDS

Field Description
source Genes annotation resource id.
name Gene name (non-unique).
id Gene unique id.
chrom Chromosome name.
region-start Gene start position.
region-end Gene end position.
strand Gene strand

FILTERS

Filter Description
source in 1, 2 Selects gene annotation source by a unique identifier.
name in 'APOE', 'TCF7L2' Selects gene annotations by non-unique display name(s).
id in 'ENSG00000223972.5', 'ENSG00000227232.5' Selects gene annotations by unique gene ID(s).
chrom eq 'chr20' Selects gene annotations that lie on the given chromosome.
start ge 20000000 Selects gene annotations with start positions greater than or equal to the given value.
end le 20100000 Selects gene annotations with end positions less than or equal to the given value.

SORT

Not yet implemented

Retrieve gene names

GET /annotation/genes/names/

Example: retrieve all searchable regions that begin with the string ‘TCF’.

curl -G "http://portaldev.sph.umich.edu/api/v1/annotation/genes/names/" --data-urlencode "filter=name startswith 'TCF'"
{
  "data": [
  {
    "region" : "TCF7",
    "label" : "gene_name"
  },
  {
    "region" : "TCF7-001",
    "label" : "transcript_name"
  },
  {
    "region" : "TCF7-002",
    "label" : "transcript_name"
  }
  ]
}
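A search box or autocomplete widget might want these matches grouped by type. The following sketch groups the entries of the example response by their label; group_by_label is a hypothetical helper.

```python
from collections import defaultdict

def group_by_label(entries):
    """Group region names by their label, preserving response order."""
    grouped = defaultdict(list)
    for entry in entries:
        grouped[entry["label"]].append(entry["region"])
    return dict(grouped)

entries = [
    {"region": "TCF7", "label": "gene_name"},
    {"region": "TCF7-001", "label": "transcript_name"},
    {"region": "TCF7-002", "label": "transcript_name"},
]

grouped = group_by_label(entries)
```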

FIELDS

Field Description
region Gene/transcript/exon name.
label Type of region: “gene_name”, “transcript_name”, “exon_name”.

FILTERS

Filter Description
name startswith 'TCF', 'ENSG00001', 'ENSG' Selects all gene_id, gene_name, transcript_id, transcript_name, and exon_id values that start with the given string.

SORT

Not yet implemented