Introduction
This document contains the specification for the UM data API, used by LocusZoom and other tools under development at the Center for Statistical Genetics, University of Michigan.
Our API naturally evolves over time as key data is revised. Most annotations (genes, recombination, LD) now support both build GRCh37 and build GRCh38. We encourage you to explore the provided metadata endpoints to find the newest and best annotations that match your data.
Production or development API
Simply replace api
with api_internal_dev
in any of the URLs below.
# Production API
curl "https://portaldev.sph.umich.edu/api/v1/statistic/single/"
# Development API
curl "https://portaldev.sph.umich.edu/api_internal_dev/v1/statistic/single/"
Common parameters
To retrieve data from available resources, the HTTP GET requests are used with the optional parameters listed in the table below. The list of parameters and their format is based on the best practices from OData and JAX-RS specifications 1,2.
Parameter | Type | Description |
---|---|---|
page | integer | Page number if pagination is requested |
limit | integer | Maximum page size |
filter | string | Specifies filtering options |
sort | string | List of fields that will be used to sort the collection |
fields | string | List of fields that will be included |
filter
The filter parameter allows elimination of redundant resource’s entries using logical expressions. The logical expression is the combination of resource field names, operators and literals. The tables below list available literals and operators correspondingly.
Literal | Description |
---|---|
'a string' | Variable length character string. |
0.73, -0.73 | Floating point number. |
12, -12 | Integer number. |
Operator | Description | Example |
---|---|---|
eq | = | filter=analysis eq 1 |
filter=variant eq 'rs1234567' | ||
gt | > | filter=refAlleleFreq gt 0.01 |
lt | < | filter=pvalue lt 0.000000005 |
ge | >= | filter=position ge 10000 |
le | <= | filter=position le 20000 |
in | filter=chromosome in '1','2','3','16' | |
and | & | filter=position ge 10000 and position le 20000 |
Depending on the requirements, only part of the operators may be supported for a particular resource and its field.
fields
The fields parameter allows projection of resource’s fields. The projection is specified as a comma separated list of resource’s fields. For example, to select only analysis and trait fields from the /statistic/single resource, the corresponding GET request must have fields=analysis,trait.
Each request has its own set of fields (specified under the API endpoints section.)
sort
The sort parameter allows ordering of the results based on one or multiple resource’s fields. The fields are provided in a comma separated list. The -
character before the field name corresponds to the descending order.
Response status codes
Code | Message | Description |
---|---|---|
200 | JSON with results | Success |
400 | Incorrect syntax in filter parameter | Server unable to parse filter |
400 | Incorrect syntax in fields parameter | Serve unable to parse the fields parameter |
501 | Unsupported data type for the xyz field in the filter parameter |
Server successfully parsed the filter parameter, but the xyz field's data type didn't match the provided literal's type |
501 | Unsupported operation for the |
Server successfully parsed the filter parameter, but the resource doesn’t support the specified operation with the |
501 | Unsupported field in the filter parameter | Server successfully parsed the filter parameter, but at least one of the specified field names is not present in the corresponding resource |
501 | Unsupported field in the fields parameter | Server successfully parsed the filter parameter, but at least one of the specified field names is not present in the corresponding resource |
Response JSON
All responses from HTTP GET requests are represented using JSON data format. The returned object must have two mandatory "data" and "lastPage" fields.
Example JSON response:
{
"data": "result JSON here",
"lastPage": "integer here"
}
Overview of API endpoints
Relative Resource URI | Description |
---|---|
/statistic/single/ | Collection of all available studies that have single variant association results. |
/statistic/single/results/ | Collection of all single variant association results. |
/statistic/phewas/ | Return all available association statistics given a variant. |
/statistic/pair/LD/results/ | Collection of pair-wise linkage disequilibrium coefficients between all variants. |
/annotation/recomb/ | Recombination rates |
/annotation/variant/ | Collection of all available single variant annotations. |
/annotation/snps/ | List all dbSNP datasets |
/annotation/snps/results/ | Query by rsid and find chrom/pos/ref/alt, or vice versa. |
/annotation/omnisearch/ | Search for genomic coordinates given a rsID, gene, transcript, etc. |
/annotation/intervals/ | Collection of all available genome interval annotation sources (such as GENCODE). |
/annotation/intervals/results/ | Collection of all available genome interval annotations. |
/annotation/genes/sources/ | Collection of all available gene annotation resources. |
/annotation/genes/ | Collection of all annotated genes. |
/annotation/gwascatalog/ | Collection of GWAS catalogs |
/annotation/gwascatalog/results/ | Collection of GWAS catalogs |
API endpoints
Single variant statistics
API endpoints for retrieving association statistics on single variants.
List all available datasets/resources
GET /statistic/single/
curl "https://portaldev.sph.umich.edu/api/v1/statistic/single/"
import requests
response = requests.get("https://portaldev.sph.umich.edu/api/v1/statistic/single/")
json = response.json()
The JSON response will look like:
{
"data": {
"analysis": [1, 2, 3],
"build": ["GRCh37", "GRCh37", "GRCh37"],
"date": ["2010-01-17", "2010-01-17", "2010-01-17"],
"first_author": ["Fritsche LG", "Welch R", "Willer CJ"],
"last_author": ["Willer CJ", "Abecasis GR", "Mohlke JL"],
"study": ["METSIM", "FUSION", "FUSION"],
"trait": ["T2D", "T2D", "fasting insulin"],
"tech": ["Illumina300K", "Exome chip", "Illumina 1M"],
"imputed": ["1000G", "NA", "HapMap"]
},
"lastPage": null
}
FIELDS
Field | Description |
---|---|
id | Analysis unique identifier |
analysis | Human-readable analysis label |
study | Study name |
trait | Trait name |
tech | Genotyping/sequencing technology |
build | Genome build |
imputed | Reference panel used if data was imputed |
FILTERS
Filter | Description |
---|---|
id in 1,2,... | Selects set of analyses by unique ID |
SORT
Not yet implemented
Retrieve results
GET /statistic/single/results/
Example: retrieve all association results in the FUSION study for T2D (analysis ID 1)
curl -G "https://portaldev.sph.umich.edu/api/v1/statistic/single/results/" --data-urlencode "page=1" --data-urlencode "limit=100" --data-urlencode "filter=analysis in '99'"
{
"data": {
"analysis": [1, 1, 1],
"beta": [null, null, null],
"chromosome": ["4", "4", "4"],
"log_pvalue": [0.22, 2, 4.37],
"position": [1, 2, 3900],
"ref_allele": ["A", "C", "C"],
"ref_allele_freq": [null, null, null],
"score_test_stat": [0.2, 5.4, 3.6],
"se": [null, null, null],
"variant": ["4:1_A/G", "4:2_C/T", "4:3900_C/T"]
},
"lastPage": null
}
Example: Retrieve association results from region 12:10001-20001 from the FUSION study for trait T2D. Include only variant name, position, and p-value columns. Sort by the position and p-value columns.
curl -G "https://portaldev.sph.umich.edu/api/v1/statistic/single/results/" --data-urlencode "page=1" --data-urlencode "limit=100" --data-urlencode "filter=analysis in 1 and chromosome in '12' and position ge 10001 and position le 20001" --data-urlencode "fields=variant, position, log_pvalue" --data-urlencode "sort=log_pvalue"
{
"data": {
"variant": ["12:10001_A/G", "12:10002_C/T", "12:20000_G/T"],
"position": [10001, 10002, 20000],
"log_pvalue": [0.001, 0.03, 0.5]
},
"lastPage": null
}
FIELDS
Field | Description |
---|---|
analysis | Analysis unique identifier |
beta | Effect size |
chromosome | Chromosome |
log_pvalue | -log10 p-value |
position | Position in base pairs |
ref_allele | Reference allele |
ref_allele_freq | Reference allele frequency |
score_test_stat | Score statistic |
se | Effect size standard error |
variant | Variant unique name (A string in the scheme {chrom}:{pos}_{ref}/{alt}) |
FILTERS
Filter | Description |
---|---|
analysis in 1, 2 | Select analysis by a unique identifier |
chromosome in '1', '22', 'X' | Select chromosomes by name. |
position ge 10000 | Start position in base-pairs of the interval of interest. |
position le 60000 | End position in base-pairs of the interval of interest. |
SORT
Add &sort=field1,field2
to your URL. If the field is not present it will have no effect.
PheWAS: all available results for a given variant
GET /statistic/phewas/
# We're using format=objects here as it's probably the preferred way to retrieve the data.
# The standard data frame / array of arrays layout is also available if you remove format=objects.
curl -G "https://portaldev.sph.umich.edu/api/v1/statistic/phewas/?build=GRCh37&format=objects" --data-urlencode "filter=variant eq '10:114758349_C/T'"
The JSON response will look like:
{
"data": [
{
"id": 45,
"trait_group": "Metabolic disease",
"trait_label": "Type 2 diabetes",
"log_pvalue": 107.032,
"variant": "10:114758349_C/T",
"chromosome": "10",
"position": 114758349,
"build": "GRCh37",
"beta": null,
"ref_allele": "C",
"ref_allele_freq": null,
"score_test_stat": null,
"se": null,
"study": "DIAGRAM",
"description": "DIAGRAM 1000G T2D meta-analysis",
"tech": null,
"pmid": "28566273",
"trait": "T2D"
}
],
"lastPage": null,
"meta": {
"build": [
"GRCh37"
]
}
}
FIELDS
Field | Description | Must exist in response for PheWAS module |
---|---|---|
id | Unique identifier for each dataset | Yes |
beta | Effect size | |
build | Genome build | |
chromosome | Chromosome for variant | |
description | Description of analysis this dataset represents | |
log_pvalue | -log10 p-value | Yes |
pmid | pmid | PubMed ID for paper if this dataset is published |
position | Position | |
study | Study, consortium, or group that generated this analysis | |
tech | Genotyping/sequencing technology | |
ref_allele | Reference allele | |
ref_allele_freq | Reference allele frequency | |
score_test_stat | Score statistic | |
se | Effect size standard error | |
study | Study name | |
trait | Trait code. Example: "T2D" | |
trait_label | Longer description of trait, e.g. "Type 2 diabetes" | Yes |
trait_group | Arbitrary grouping/category the trait belongs to, e.g. "metabolic diseases" | Yes |
variant | Variant unique name (A string in the scheme {chrom}:{pos}_{ref}/{alt}) |
PARAMETERS
Param | Description |
---|---|
build | Genome build for the requested variant. For example 'GRCh37' or 'GRCh38'. Trailing version (e.g. p13.3) will not be present. |
format | Format of the response. Our API server supports two formats - the default is an array of arrays, and the optional objects format returns an array of JSON objects. LocusZoom.js will only generate requests that use format=objects . |
FILTERS
Filter | Description |
---|---|
variant eq 'X' | Select results for this variant. Variant should be in chr:pos_ref/alt format. |
META
Response will contain a meta
object, with the following attributes:
Attribute | Value |
---|---|
build | Array of genome build(s) that were requested. Records returned will be only for these builds. This will typically only be 1 build. In the future we may begin upconverting variants to other builds. |
SORT
Not yet implemented
Linkage disequilibrium
The PortalDev API endpoint has been deprecated. We encourage you to explore the new Michigan LDServer. The interactive "LD playground" tool provides a concise overview of possible options. For many practical applications (such as LocusZoom plots), the "variant correlations" feature is recommended.
Retrieve results
Although the endpoint documented below still exists, it is deprecated and may be removed in the future. The documentation for this old endpoint is not maintained and is not guaranteed to be accurate.
GET /statistic/pair/LD/results/
Example: Retrieve all pair-wise LD D’ values between SNPs in the 12:10001-20001 region using 1000G EUR build 37 version 3 reference panel. Don’t sort the results. Retrieve only variant1, variant2 and value fields. Split results into pages of size 100. Start with the first page.
curl -G "https://portaldev.sph.umich.edu/api/v1/statistic/pair/LD/results/" --data-urlencode "page=1" --data-urlencode "limit=100" --data-urlencode "filter=reference in 1 and chromosome1 in '12' and position1 ge 10001 and position1 le 20001 and chromosome2 in '12' and position2 ge 10001 and position2 le 20001 and type in 'dprime'" --data-urlencode "fields=variant2,variant2,value"
{
"data": {
"variant1": ["12:10001", "12:10001", "12:10002"],
"variant2": ["12:10002", "12:10003", "12:10003"],
"value": [1.00, 0.78, 1.00]
},
"lastPage": 12
}
Example: Retrieve pair-wise D’ LD values between SNP 12:10023 and all SNPs in the 12:10001-20001 region using 1000G EUR build 37 version 3 reference panel. Retrieve only variant2 and value columns. Split the results into pages of size 100. Start with the first page.
curl -G "https://portaldev.sph.umich.edu/api/v1/statistic/pair/LD/results/" --data-urlencode "page=1&limit=100&filter=reference in 3 and variant1 in '12:10023' and chromosome2 in '12' and position ge 10001 and position le 20001 and type in 'dprime'&fields=variant2,value"
{
"data": {
"id2": ["12:10001", "12:10002", "12:10003"],
"value": [1.00, 1.00, 0.98]
},
"lastPage": 10
}
This API endpoint calculates LD values between pairs of variants on the fly (not precomputed). For regions of 1 MB, it should be nearly instant.
This endpoint only uses pre-existing reference panels, such as the 1000 Genomes panels.
FIELDS
Field | Description |
---|---|
reference | Reference panel unique identifier |
variant1 | Variant name in chr:pos_ref/alt format |
chromosome1 | Chromosome |
position1 | Position in base pairs |
variant2 | |
chromosome2 | |
position2 | |
value | LD value |
type | LD type: dprime, rsquare |
FILTERS
Filter | Description |
---|---|
reference in 1, 2 | Select reference by unique identifier. |
variant1 in '12:1000', '12:1001' | Select first variant by unique name. |
chromosome1 in '1', '2' | Select chromosome for the first variant. |
position1 ge 1000 position1 le 2000 |
Specify positions range (in base-pairs) for the first variant. |
variant2 | Select second variant by unique name. |
chromosome2 in '1', '2' | Select chromosome for the second variant. |
position2 ge 1000 position2 le 2000 |
Specify positions range (in base-pairs) for the second variant. |
type in 'dprime', 'rsquare' | Select type of LD coefficient. |
SORT
Not yet implemented
Recombination
Get recombination sources
GET /annotation/recomb/
FIELDS
Field | Description |
---|---|
id | Recombination rate map unique identifier |
name | Recombination rate map (e.g. hapmap) |
build | Genome build for recombination rate positions |
version | Version string for this recombination map (usually a date) |
FILTERS
Filter | Description |
---|---|
id in 1 | Select recombination rate by identifier |
SORT
Add &sort=field1,field2
to your URL.
Retrieve recombination rates
GET /annotation/recomb/results/
Example: Retrieve recombination rates within a specific interval for a given dataset
curl -G "https://portaldev.sph.umich.edu/api/v1/annotation/recomb/results/" --data-urlencode "filter=id in 15 and chromosome eq '21' and position gt 10406989 and position lt 10906989"
{
"data": {
"chromosome": [
"21",
"21",
"21"
],
"id": [
15,
15,
15
],
"pos_cm": [
0.0,
0.052685,
0.052781
],
"position": [
10865933,
10906723,
10906915
],
"recomb_rate": [
1.29162,
0.496586,
0.424224
]
},
"lastPage": null
}
FIELDS
Field | Description |
---|---|
id | Recombination rate map unique identifier |
chromosome | Chromosome |
position | Genomic position (bp) |
pos_cm | Genetic position (cM) |
recomb_rate | Recombination rate |
If no ID is specified in the filter string, the best recommended recombination rate source will be chosen. This is currently HapMap Phase 2. The build
parameter must also be specified.
FILTERS
PARAMETERS
Param | Description |
---|---|
build | Explicitly set the genome build for this endpoint. This affects how the recommended recombination rate source is selected when no ID is present in the filter string. Acceptable builds are 'GRCh37', 'GRCh38'. |
SORT
Data can be sorted on any field by adding &sort=field1,field2
onto your URL.
Search endpoints
Omnisearch
Search for genomic coordinates given a rsID, gene, transcript, etc. The following example search formats are supported:
- chr:position
- chr:start-stop
- chr:position+offset (-> chr:position-offset - chr:position+offset)
- rs00001
- rs00001+offset
- gene symbol names
- transcript names
Positions and offsets may have commas and use K and M suffixes.
GET /annotation/omnisearch/
Example: Find gene positions by gene name
curl -G "https://portaldev.sph.umich.edu/api/v1/annotation/omnisearch/" --data-urlencode "q=TCF7L2" --data-urlencode "build=GRCh37"
{
"build": "grch37",
"data": [
{
"chrom": "10",
"end": 114927437,
"gene_id": "ENSG00000148737.11",
"gene_name": "TCF7L2",
"start": 114710009,
"term": "TCF7L2",
"type": "other"
}
]
}
FIELDS
Field | Description |
---|---|
chrom | The chromosome |
start | The start genomic position |
end | The end genomic position |
term | The term used as the query |
type | The type of query (egene, region, rs, other), as predicted by the parser |
Additional fields may be returned depending on the query type.
QUERY PARAMS
Param | Description |
---|---|
q | A string value to search for |
build | A genome build identifier (GRCh37, GRCh38) |
Interval annotations
These would be annotations that span intervals of the genome, such as enhancers, TFBSs, etc.
List all datasets/resources
GET /annotation/intervals/
Example: Retrieve a list of all available interval annotation resources.
curl "https://portaldev.sph.umich.edu/api/v1/annotation/intervals/"
{
"data": {
"assay": [
"ChIP-seq",
"ChIP-seq",
"ChIP-seq",
"ChIP-seq"
],
"build": [
"GRCh37",
"GRCh37",
"GRCh37",
"GRCh37"
],
"cell_line": [
null,
null,
"GM12878",
"K562"
],
"description": [
"Pancreatic islet chromHMM calls from Parker 2013",
"Pancreatic islet stretch enhancers from Parker 2013",
"Chromatin State Segmentation by HMM from ENCODE/Broad",
"Chromatin State Segmentation by HMM from ENCODE/Broad"
],
"histone": [
null,
null,
null,
null
],
"id": [
16,
17,
18,
19
],
"pmid": [
"24127591",
"24127591",
"21441907",
"21441907"
],
"protein": [
null,
null,
null,
null
],
"study": [
"Parker 2013",
"Parker 2013",
"ENCODE",
"ENCODE"
],
"tissue": [
"pancreatic_islet",
"pancreatic_islet",
null,
null
],
"type": [
"chromHMM",
"stretch_enhancers",
"chromHMM",
"chromHMM"
],
"url": [
"http://research.nhgri.nih.gov/manuscripts/Collins/islet_chromatin/",
"http://research.nhgri.nih.gov/manuscripts/Collins/islet_chromatin/",
"http://genome.ucsc.edu/cgi-bin/hgFileUi?db=hg19&g=wgEncodeBroadHmm",
"http://genome.ucsc.edu/cgi-bin/hgFileUi?db=hg19&g=wgEncodeBroadHmm"
],
"version": [
"2013-12-10",
"2013-12-10",
"2012-04",
"2012-04"
]
},
"lastPage": null
}
Example: Retrieve information about the interval annotation resource with id equal to 16.
curl -G "https://portaldev.sph.umich.edu/api/v1/annotation/intervals/" --data-urlencode "filter=id in 16"
{
"data": {
"assay": [
"ChIP-seq"
],
"build": [
"b37"
],
"cell_line": [
null
],
"description": [
"Pancreatic islet chromHMM calls from Parker 2013"
],
"histone": [
null
],
"id": [
16
],
"pmid": [
"24127591"
],
"protein": [
null
],
"study": [
"Parker 2013"
],
"tissue": [
"pancreatic_islet"
],
"type": [
"chromHMM"
],
"url": [
"http://research.nhgri.nih.gov/manuscripts/Collins/islet_chromatin/"
],
"version": [
"2013-12-10"
]
},
"lastPage": null
}
FIELDS
Field | Description |
---|---|
id | Unique identifier for interval dataset |
study | Name of study (ENCODE, FUSION, etc.) |
build | Genome build to which these intervals are anchored |
type | Dataset type (chromHMM calls, stretch enhancers, etc.) |
version | Version string, usually a date |
description | Long description of dataset |
assay | Assay used to generate intervals (ChIP-seq, ATAC-seq, etc.) |
cell_line | Name of cell line in which these genomic intervals were discovered in |
tissue | Name of tissue in which these genomic intervals were discovered in |
histone | If the dataset was ChIP-seq for a particular histone, this will be the name of the histone mark |
protein | If the dataset was ChIP-seq for a particular TF/DNA-binding protein, this will be the protein (ENSEMBL ID) |
pmid | PubMed ID for paper if this dataset is published |
url | URL that contains information about the dataset and/or the original downloaded files |
FILTERS
Filter | Description |
---|---|
id in 1, 2 | Selects interval annotation resource by a unique identifier. |
SORT
Sort on any field using sort=field1,field2
.
Retrieve interval annotations
GET /annotation/intervals/results/
Retrieve annotations from dataset with id 16, on chromosome 2, with start positions < 19001
curl -G https://portaldev.sph.umich.edu/api/v1/annotation/intervals/results/ --data-urlencode "filter=id in 19 and chromosome eq '10' and start le 115067678 and end ge 114550452"
{
"data": {
"chromosome": ["10", "10", "10", "10"],
"end": [114574010, 114574210, 114575010, 114575210],
"id": [19, 19, 19, 19],
"public_id": [null, null, null, null],
"start": [114516210, 114574010, 114574210, 114575010],
"state_id": [13, 7, 13, 7],
"state_name": [
"Heterochromatin / low signal",
"Insulator",
"Heterochromatin / low signal",
"Insulator"
],
"strand": [
null,
null,
null,
null
]
},
"lastPage": null
}
FIELDS
Field | Description |
---|---|
id | Interval dataset identifier |
state_id | A (numeric) state identifier for this annotation, such as determined by ChromHMM. (if applicable) |
state_name | A human-readable state name that generally corresponds to an entry in state_id. (if applicable) |
public_id | Public/other database ID for this interval (if applicable) |
chromosome | Chromosome |
start | Start of interval (in bp) |
end | End of interval (in bp) |
strand | DNA strand that the interval is annotated to (if applicable) |
FILTERS
Filter | Description |
---|---|
id in 1, 2 | Select interval annotation resource by a unique identifier. |
chromosome in '1', '2', 'X' | Select chromosome by name. |
start ge 10000 start le 20000 |
Select interval if its start position falls into the specified interval. |
end ge 10000 end le 20000 |
Select interval if its end position falls into the specified interval. |
SORT
Sort on any field by adding sort=field1,field2
to the URL.
FORMATS
The default format returns JSON where each key is a column name, and the value is an array of values (one per row entry.)
An alternative format returns each row as an object itself. Add format=objects
to the URL for this.
Genes
List all possible sources of gene annotations
Currently we only include ENSEMBL/GENCODE.
GET /annotation/genes/sources/
Example: retrieve all gene annotation sources
curl "https://portaldev.sph.umich.edu/api/v1/annotation/genes/sources/?format=objects"
{
"data": [
{
"genome_build": "GRCh38",
"id": 1,
"organism": "human",
"source": "gencode",
"taxid": 9606,
"version": "27"
},
{
"genome_build": "GRCh37",
"id": 2,
"organism": "human",
"source": "gencode",
"taxid": 9606,
"version": "19"
},
{
"genome_build": "GRCh37",
"id": 3,
"organism": "human",
"source": "gencode",
"taxid": 9606,
"version": "27"
}
],
"lastPage": null
}
FIELDS
Field | Description |
---|---|
id | Annotation resource unique id. |
genome_build | Annotation resource genome build. |
organism | |
source | Annotation resource name. |
taxid | |
version | Annotation resource version. |
Retrieve gene information
GET /annotation/genes/
Retrieve all gene annotation data.
curl -G "https://portaldev.sph.umich.edu/api/v1/annotation/genes/" --data-urlencode "filter=source in 3 and chrom eq '10' and start le 115067678 and end ge 114550452"
{
"data": [
{
"chrom": "10",
"end": 114578503,
"exons": [
{
"chrom": "10",
"end": 114207225,
"exon_id": "ENSE00001449955.2_1",
"start": 114206756,
"strand": "+"
},
{
"chrom": "10",
"end": 114207225,
"exon_id": "ENSE00001882813.1_1",
"start": 114206757,
"strand": "+"
}
],
"gene_id": "ENSG00000151532.13_2",
"gene_name": "VTI1A",
"start": 114206756,
"strand": "+",
"transcripts": [
{
"chrom": "10",
"end": 114210484,
"exons": [
{
"chrom": "10",
"end": 114207225,
"exon_id": "ENSE00001449955.2_1",
"start": 114206756,
"strand": "+"
}
],
"start": 114206992,
"strand": "+",
"transcript_id": "ENST00000489142.5_1"
}
]
}
],
"lastPage": null
}
FIELDS
Field | Description |
---|---|
source | Genes annotation resource id (used for queries) |
gene_name | Gene name (non-unique). |
gene_id | Gene unique id. |
chrom | Chromosome name. |
start | Gene start position. |
end | Gene end position. |
strand | Gene strand |
transcripts | A nested object defining available transcripts, and each exon within each transcript |
If no source is specified in the filter string, the best recommended gene source will be chosen. This is currently the latest version of GENCODE. The build
parameter must also be specified.
FILTERS
Filter | Description |
---|---|
source in 1, 2 | Selects gene annotation source by a unique identifier. |
gene_name in 'APOE', 'TCF7L2' | Selects gene annotation by non-unique display name(s). |
gene_id in 'ENSG00000223972.5', 'ENSG00000227232.5' | Selects gene annotation by unique gene ID(s). |
chrom eq 'chr20' | Selects gene annotation that lie within a chromosome. |
start ge 20000000 | Selects gene annotation with start positions greater than a certain value. |
end le 20100000 | Selects gene annotation with end positions less than a certain value. |
PARAMETERS
Param | Description |
---|---|
build | Explicitly set the genome build for this endpoint. This affects how the recommended gene source is selected when no ID is present in the filter string. Acceptable builds are 'GRCh37', 'GRCh38'. |
SORT
Not yet implemented
GWAS Catalogs
List all available GWAS catalogs
We currently support the EBI GWAS catalog, and the UK BioBank GWAS hits.
GET /annotation/gwascatalog/
Example: retrieve all GWAS catalogs
curl "https://portaldev.sph.umich.edu/api/v1/annotation/gwascatalog/"
{
"data": {
"catalog_version": [
"e91_r2018-03-13",
"e91_r2018-03-13"
],
"date_inserted": [
"2018-03-18T17:20:40-04:00",
"2018-03-18T17:20:40-04:00"
],
"genome_build": [
"GRCh38",
"GRCh37"
],
"id": [
1,
2
],
"name": [
"EBI GWAS Catalog",
"EBI GWAS Catalog"
]
},
"lastPage": null
}
Or alternatively in object mode:
curl "https://portaldev.sph.umich.edu/api/v1/annotation/gwascatalog/?format=objects"
{
"data": [
{
"catalog_version": "e91_r2018-03-13",
"date_inserted": "2018-03-18T17:20:40-04:00",
"genome_build": "GRCh38",
"id": 1,
"name": "EBI GWAS Catalog"
},
{
"catalog_version": "e91_r2018-03-13",
"date_inserted": "2018-03-18T17:20:40-04:00",
"genome_build": "GRCh37",
"id": 2,
"name": "EBI GWAS Catalog"
}
],
"lastPage": null
}
FIELDS
Field | Description |
---|---|
id | Unique ID assigned to each GWAS catalog |
name | Name of the catalog, e.g. "EBI" or "UKBB" |
genome_build | Positions in the catalog are anchored to this build |
catalog_version | Version of the GWAS catalog (varies by catalog) |
date_inserted | Date the GWAS catalog was inserted into the database |
Retrieve variants from one or multiple GWAS catalogs
GET /annotation/gwascatalog/results/
Retrieve all known disease/trait associated variants within a genomic region for a specific catalog
Understanding the format is easier in object mode, so we use that below.
curl -G "https://portaldev.sph.umich.edu/api/v1/annotation/gwascatalog/results/?format=objects" --data-urlencode "filter=id eq 1 and chrom eq '10' and pos le 112998595 and pos ge 112998585"
{
"data": [
{
"alt": "T",
"chrom": "10",
"first_author": "Sladek R",
"genes": "TCF7L2",
"id": 1,
"log_pvalue": 33.7,
"or_beta": 1.65,
"pmid": "17293876",
"pos": 112998590,
"pubdate": "2007-02-11",
"ref": "C",
"risk_allele": "T",
"risk_frq": 0.3,
"rsid": "rs7903146",
"study": "A genome-wide association study identifies novel risk loci for type 2 diabetes.",
"trait": "Type 2 diabetes",
"trait_group": "Type 2 diabetes",
"variant": "10:112998590_C/T"
},
{
...
}
]
}
One record is returned per variant * trait * pmid. The same variant <--> trait association can be reported in multiple publications.
Retrieve associations for a specific variant
You should use a catalog that is anchored to the same genome build as your variant (since it contains a position.) For example,
10:112998590_C/T
is rs7903146 in GRCh38, but10:114758349_C/T
in GRCh37. In this example, assume the GWAS catalog with ID 1 is a GRCh38 catalog.
curl -G "https://portaldev.sph.umich.edu/api/v1/annotation/gwascatalog/results/?format=objects" --data-urlencode "filter=id eq 1 and variant eq '10:112998590_C/T'"
You can also retrieve by rsID instead of a variant:
curl -G "https://portaldev.sph.umich.edu/api/v1/annotation/gwascatalog/results/?format=objects" --data-urlencode "filter=id eq 1 and rsid eq 'rs7903146'"
FIELDS
Field | Description |
---|---|
id | GWAS catalog ID |
alt | Alternate allele |
chrom | Chromosome |
first_author | First author of the publication reporting this association |
log_pvalue | -log10 p-value for association between variant and trait |
or_beta | Effect size (or odds ratio if binary trait) |
pmid | PubMed ID for the publication reporting this association |
pos | Position |
pubdate | Publication date (YYYY-MM-DD) |
ref | Reference allele |
risk_allele | Specifies allele for effect direction and risk frequency |
risk_frq | Frequency of risk allele |
rsid | rsID of the variant |
study | A human-readable description of the study |
trait | Name of the trait/phenotype/disease |
trait_group | Grouping of traits as defined by the catalog |
variant | Variant in chr:pos_ref/alt format |
If no ID is specified in the filter string, the best recommended GWAS catalog will be chosen. This is currently the latest version of the EBI GWAS catalog. The build
parameter must also be specified.
FILTERS
Filter | Description |
---|---|
id in 1, 3, 6 | Selects GWAS catalogs by their IDs |
chrom eq '6' | Select only variants on a particular chromosome |
pos ge 1 | Select only variants with position greater than or equal to a value |
pos le 10 | Select only variants with position less than or equal to a value |
pos gt 1 | Select only variants with position greater than a value |
pos lt 10 | Select only variants with position less than a value |
variant eq '10:112998590_C/T' | Select a particular variant |
rsid eq 'rs7903146' | Select a variant by rsID |
PARAMETERS
Param | Description |
---|---|
variant_format | Default variant format is EPACTS style, e.g. 'chr:pos_ref/alt'. Specify variant_format='colons' to get variants of the form 'chr:pos:ref:alt'. |
decompose | Decompose multiallelic variants into separate entries, one per every combination of REF/ALT alleles. This is a boolean parameter and can be turned on with any value, e.g. decompose=1 or decompose=true. |
build | Explicitly set the genome build for this endpoint. This affects how the recommended gene source is selected when no ID is present in the filter string. Acceptable builds are 'GRCh37', 'GRCh38'. |
SORT
Return sorted results by including the sort=field
parameter. Probably the most common would be to sort by log p-value, for example sort=log_pvalue
.