    Introduction

    This document contains the specification for the UM data API, used by LocusZoom and other tools under development at the Center for Statistical Genetics, University of Michigan.

    Our API naturally evolves over time as key data is revised. Most annotations (genes, recombination, LD) now support both build GRCh37 and build GRCh38. We encourage you to explore the provided metadata endpoints to find the newest and best annotations that match your data.

    Production or development API

    To use the development API instead of the production API, replace api with api_internal_dev in any of the URLs below.

    # Production API
    curl "https://portaldev.sph.umich.edu/api/v1/statistic/single/"
    
    # Development API
    curl "https://portaldev.sph.umich.edu/api_internal_dev/v1/statistic/single/"
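
    The substitution can also be wrapped in a small helper. This is a minimal sketch that assumes the host and path layout shown in the two curl commands above; the function name api_url and its dev flag are hypothetical and not part of the API.

```python
# Hypothetical helper: build a URL for the production or development API.
HOST = "https://portaldev.sph.umich.edu"


def api_url(resource: str, dev: bool = False) -> str:
    """Return the full URL for a relative resource such as '/statistic/single/'."""
    prefix = "api_internal_dev" if dev else "api"
    return f"{HOST}/{prefix}/v1/{resource.lstrip('/')}"


print(api_url("/statistic/single/"))
print(api_url("/statistic/single/", dev=True))
```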
    

    Common parameters

    Data is retrieved from the available resources with HTTP GET requests, which accept the optional parameters listed in the table below. The parameter names and formats follow best practices from the OData and JAX-RS specifications 1,2.

    Parameter Type Description
    page integer Page number if pagination is requested
    limit integer Maximum page size
    filter string Specifies filtering options
    sort string List of fields that will be used to sort the collection
    fields string List of fields that will be included
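
    As an illustration of how these parameters combine into one request URL, here is a minimal sketch using only the Python standard library. The helper name build_url and its keyword arguments are hypothetical; percent-encoding spaces as %20 is an assumption chosen to mirror what curl's --data-urlencode produces.

```python
from urllib.parse import quote, urlencode

BASE = "https://portaldev.sph.umich.edu/api/v1"  # production API


def build_url(resource, page=None, limit=None, filter_expr=None,
              sort=None, fields=None):
    """Build a GET URL, leaving out any common parameter that is not set."""
    params = {"page": page, "limit": limit, "filter": filter_expr,
              "sort": sort, "fields": fields}
    present = {k: v for k, v in params.items() if v is not None}
    # quote_via=quote encodes spaces as %20 rather than '+'.
    query = urlencode(present, quote_via=quote)
    return f"{BASE}{resource}" + (f"?{query}" if query else "")


url = build_url("/statistic/single/results/", page=1, limit=100,
                filter_expr="analysis in 1 and chromosome in '12'",
                fields="variant,position,log_pvalue")
print(url)
```

    The resulting string can be passed directly to an HTTP client such as requests.get(url).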

    filter

    The filter parameter restricts the returned entries of a resource using a logical expression. A logical expression is a combination of resource field names, operators, and literals. The tables below list the available literals and operators, respectively.

    Literal        Description
    'a string'     Variable length character string.
    0.73, -0.73    Floating point number.
    12, -12        Integer number.

    Operator    Description    Example
    eq          =              filter=analysis eq 1
                               filter=variant eq 'rs1234567'
    gt          >              filter=refAlleleFreq gt 0.01
    lt          <              filter=pvalue lt 0.000000005
    ge          >=             filter=position ge 10000
    le          <=             filter=position le 20000
    in                         filter=chromosome in '1','2','3','16'
    and         &              filter=position ge 10000 and position le 20000

    Depending on the resource, only a subset of these operators may be supported for a given field.
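
    Filter expressions can be assembled programmatically from the literals and operators listed above. The sketch below is one possible approach; the helper names literal, condition and all_of are hypothetical, and the code does not escape quotes inside strings or control how floating point numbers are rendered.

```python
def literal(value):
    """Render a Python value as a filter literal; strings are single-quoted."""
    if isinstance(value, str):
        return f"'{value}'"
    return str(value)


def condition(field, op, value):
    """Render one comparison; for 'in', value is a list of alternatives."""
    if op == "in":
        return f"{field} in {','.join(literal(v) for v in value)}"
    return f"{field} {op} {literal(value)}"


def all_of(*conditions):
    """Join conditions with the 'and' operator."""
    return " and ".join(conditions)


expr = all_of(
    condition("chromosome", "in", ["12"]),
    condition("position", "ge", 10000),
    condition("position", "le", 20000),
)
print(expr)  # chromosome in '12' and position ge 10000 and position le 20000
```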

    fields

    The fields parameter selects which fields of a resource are returned. It is specified as a comma-separated list of field names. For example, to select only the analysis and trait fields from the /statistic/single resource, the GET request must include fields=analysis,trait.

    Each resource has its own set of fields (specified under the API endpoints section).

    sort

    The sort parameter orders the results by one or more resource fields, given as a comma-separated list. A - character before a field name requests descending order for that field.
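
    A short sketch of how a sort value might be composed, including the - prefix for descending order. The helper name sort_param and its (field, descending) tuple convention are hypothetical.

```python
def sort_param(*keys):
    """Build a sort value; each key is a field name or a (field, descending) pair."""
    parts = []
    for key in keys:
        field, descending = key if isinstance(key, tuple) else (key, False)
        parts.append(f"-{field}" if descending else field)
    return ",".join(parts)


print(sort_param("position", ("log_pvalue", True)))  # position,-log_pvalue
```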

    Response status codes

    Code Message Description
    200 JSON with results Success
    400 Incorrect syntax in filter parameter Server unable to parse the filter parameter
    400 Incorrect syntax in fields parameter Server unable to parse the fields parameter
    501 Unsupported data type for the xyz field in the filter parameter Server successfully parsed the filter parameter, but the xyz field's data type didn't match the provided literal's type
    501 Unsupported operation for the field in the filter parameter Server successfully parsed the filter parameter, but the resource doesn’t support the specified operation with the field
    501 Unsupported field in the filter parameter Server successfully parsed the filter parameter, but at least one of the specified field names is not present in the corresponding resource
    501 Unsupported field in the fields parameter Server successfully parsed the fields parameter, but at least one of the specified field names is not present in the corresponding resource
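
    Client code may want to turn these codes into exceptions. The following is a sketch only: the ApiError class and check_status function are hypothetical, and the wording of the error text is illustrative rather than what the server returns.

```python
class ApiError(Exception):
    """Raised for any non-200 response (hypothetical wrapper, not part of the API)."""


def check_status(code, message=""):
    """Return None on success; raise ApiError for the documented error codes."""
    if code == 200:
        return None
    if code == 400:
        raise ApiError(f"Could not parse filter or fields parameter: {message}")
    if code == 501:
        raise ApiError(f"Unsupported field, operation, or data type: {message}")
    raise ApiError(f"Unexpected status {code}: {message}")
```

    With the requests library, this could be called as check_status(response.status_code, response.text) before decoding the body.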

    Response JSON

    All responses to HTTP GET requests are returned as JSON. The returned object always contains two mandatory fields, "data" and "lastPage".

    Example JSON response:

    {
      "data": "result JSON here",
      "lastPage": "integer here"
    }
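
    The two fields can be used to walk through paginated results. The sketch below rests on assumptions that this document does not state explicitly: that pages are numbered from 1, that "lastPage" holds the number of the final page, and that it is null when everything fits in one response. The function fetch_all and the fake_pages stand-in are hypothetical; a real caller would supply a function that performs the HTTP request.

```python
def fetch_all(fetch_page):
    """Collect every page of a column-oriented response into one dict.

    fetch_page(page) must return the decoded JSON body for that page.
    """
    merged = {}
    page = 1
    while True:
        body = fetch_page(page)
        for column, values in body["data"].items():
            merged.setdefault(column, []).extend(values)
        last = body["lastPage"]
        # Assumption: null means "no further pages".
        if last is None or page >= last:
            return merged
        page += 1


# Stand-in for real HTTP calls, so the sketch runs offline.
fake_pages = {
    1: {"data": {"position": [1, 2]}, "lastPage": 2},
    2: {"data": {"position": [3900]}, "lastPage": 2},
}
print(fetch_all(fake_pages.__getitem__))  # {'position': [1, 2, 3900]}
```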
    

    Overview of API endpoints

    Relative Resource URI Description
    /statistic/single/ Collection of all available studies that have single variant association results.
    /statistic/single/results/ Collection of all single variant association results.
    /statistic/phewas/ Return all available association statistics given a variant.
    /statistic/pair/LD/results/ Collection of pair-wise linkage disequilibrium coefficients between all variants.
    /annotation/recomb/ Recombination rates
    /annotation/variant/ Collection of all available single variant annotations.
    /annotation/snps/ List all dbSNP datasets
    /annotation/snps/results/ Query by rsid and find chrom/pos/ref/alt, or vice versa.
    /annotation/omnisearch/ Search for genomic coordinates given a rsID, gene, transcript, etc.
    /annotation/intervals/ Collection of all available genome interval annotation sources (such as GENCODE).
    /annotation/intervals/results/ Collection of all available genome interval annotations.
    /annotation/genes/sources/ Collection of all available gene annotation resources.
    /annotation/genes/ Collection of all annotated genes.
    /annotation/gwascatalog/ Collection of all available GWAS catalogs.
    /annotation/gwascatalog/results/ Collection of variants from one or multiple GWAS catalogs.

    API endpoints

    Single variant statistics

    API endpoints for retrieving association statistics on single variants.

    List all available datasets/resources

    GET /statistic/single/

    curl "https://portaldev.sph.umich.edu/api/v1/statistic/single/"
    
    import requests
    
    response = requests.get("https://portaldev.sph.umich.edu/api/v1/statistic/single/")
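    response.raise_for_status()  # fail early on 4xx/5xx responses
    data = response.json()  # named data to avoid shadowing the standard json module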
    

    The JSON response will look like:

    {
      "data": {
        "analysis": [1, 2, 3],
        "build": ["GRCh37", "GRCh37", "GRCh37"],
        "date": ["2010-01-17", "2010-01-17", "2010-01-17"],
        "first_author": ["Fritsche LG", "Welch R", "Willer CJ"],
        "last_author": ["Willer CJ", "Abecasis GR", "Mohlke JL"],
        "study": ["METSIM", "FUSION", "FUSION"],
        "trait": ["T2D", "T2D", "fasting insulin"],
        "tech": ["Illumina300K", "Exome chip", "Illumina 1M"],
        "imputed": ["1000G", "NA", "HapMap"]
      },
      "lastPage": null
    }
    

    FIELDS

    Field Description
    id Analysis unique identifier
    analysis Human-readable analysis label
    study Study name
    trait Trait name
    tech Genotyping/sequencing technology
    build Genome build
    imputed Reference panel used if data was imputed
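
    Because the default response is column-oriented (one array per field), it is often convenient to convert it to one record per row. A minimal sketch, using a subset of the example response above; the function name columns_to_rows is hypothetical.

```python
def columns_to_rows(data):
    """Turn {"col": [v1, v2], ...} into [{"col": v1, ...}, {"col": v2, ...}]."""
    names = list(data)
    return [dict(zip(names, values)) for values in zip(*data.values())]


data = {
    "analysis": [1, 2, 3],
    "study": ["METSIM", "FUSION", "FUSION"],
    "trait": ["T2D", "T2D", "fasting insulin"],
}
rows = columns_to_rows(data)
print(rows[1])  # {'analysis': 2, 'study': 'FUSION', 'trait': 'T2D'}
```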

    FILTERS

    Filter Description
    id in 1,2,... Selects set of analyses by unique ID

    SORT

    Not yet implemented

    Retrieve results

    GET /statistic/single/results/

    Example: retrieve all association results in the FUSION study for T2D (analysis ID 1)

    curl -G "https://portaldev.sph.umich.edu/api/v1/statistic/single/results/" --data-urlencode "page=1" --data-urlencode "limit=100" --data-urlencode "filter=analysis in 1"
    
    {
      "data": {
        "analysis": [1, 1, 1],
        "beta": [null, null, null],
        "chromosome": ["4", "4", "4"],
        "log_pvalue": [0.22, 2, 4.37],
        "position": [1, 2, 3900],
        "ref_allele": ["A", "C", "C"],
        "ref_allele_freq": [null, null, null],
        "score_test_stat": [0.2, 5.4, 3.6],
        "se": [null, null, null],
        "variant": ["4:1_A/G", "4:2_C/T", "4:3900_C/T"]
      },
      "lastPage": null
    }
    

    Example: Retrieve association results from region 12:10001-20001 from the FUSION study for trait T2D. Include only variant name, position, and p-value columns. Sort by the position and p-value columns.

    curl -G "https://portaldev.sph.umich.edu/api/v1/statistic/single/results/" --data-urlencode "page=1" --data-urlencode "limit=100" --data-urlencode "filter=analysis in 1 and chromosome in '12' and position ge 10001 and position le 20001" --data-urlencode "fields=variant,position,log_pvalue" --data-urlencode "sort=position,log_pvalue"
    
    {
      "data": {
        "variant": ["12:10001_A/G", "12:10002_C/T", "12:20000_G/T"],
        "position": [10001, 10002, 20000],
        "log_pvalue": [0.001, 0.03, 0.5]
      },
      "lastPage": null
    }
    

    FIELDS

    Field Description
    analysis Analysis unique identifier
    beta Effect size
    chromosome Chromosome
    log_pvalue -log10 p-value
    position Position in base pairs
    ref_allele Reference allele
    ref_allele_freq Reference allele frequency
    score_test_stat Score statistic
    se Effect size standard error
    variant Variant unique name (A string in the scheme {chrom}:{pos}_{ref}/{alt})
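
    Two of these fields usually need decoding on the client side: variant follows the {chrom}:{pos}_{ref}/{alt} scheme, and log_pvalue is stored as -log10 of the p-value. The sketch below shows one way to unpack both; the names VARIANT_RE, parse_variant and pvalue are hypothetical.

```python
import re

# Matches the {chrom}:{pos}_{ref}/{alt} scheme described in the table above.
VARIANT_RE = re.compile(r"^(?P<chrom>[^:]+):(?P<pos>\d+)_(?P<ref>[^/]+)/(?P<alt>.+)$")


def parse_variant(name):
    """Split a variant name into (chrom, pos, ref, alt)."""
    match = VARIANT_RE.match(name)
    if match is None:
        raise ValueError(f"not in chrom:pos_ref/alt form: {name!r}")
    return match["chrom"], int(match["pos"]), match["ref"], match["alt"]


def pvalue(log_pvalue):
    """Invert the -log10 transform used by the log_pvalue field."""
    return 10.0 ** (-log_pvalue)


print(parse_variant("4:3900_C/T"))  # ('4', 3900, 'C', 'T')
print(pvalue(2))
```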

    FILTERS

    Filter Description
    analysis in 1, 2 Select analysis by a unique identifier
    chromosome in '1', '22', 'X' Select chromosomes by name.
    position ge 10000 Start position in base-pairs of the interval of interest.
    position le 60000 End position in base-pairs of the interval of interest.

    SORT

    Add &sort=field1,field2 to your URL. If the field is not present it will have no effect.

    PheWAS: all available results for a given variant

    GET /statistic/phewas/

    # We're using format=objects here as it's probably the preferred way to retrieve the data.
    # The standard data frame / array of arrays layout is also available if you remove format=objects.
    curl -G "https://portaldev.sph.umich.edu/api/v1/statistic/phewas/?build=GRCh37&format=objects" --data-urlencode "filter=variant eq '10:114758349_C/T'"
    

    The JSON response will look like:

    {
      "data": [
        {
          "id": 45,
          "trait_group": "Metabolic disease",
          "trait_label": "Type 2 diabetes",
          "log_pvalue": 107.032,
          "variant": "10:114758349_C/T",
          "chromosome": "10",
          "position": 114758349,
          "build": "GRCh37",
          "beta": null,
          "ref_allele": "C",
          "ref_allele_freq": null,
          "score_test_stat": null,
          "se": null,
          "study": "DIAGRAM",
          "description": "DIAGRAM 1000G T2D meta-analysis",
          "tech": null,
          "pmid": "28566273",
          "trait": "T2D"
        }
      ],
      "lastPage": null,
      "meta": {
        "build": [
          "GRCh37"
        ]
      }
    }
    

    FIELDS

    Field Description Must exist in response for PheWAS module
    id Unique identifier for each dataset Yes
    beta Effect size
    build Genome build
    chromosome Chromosome for variant
    description Description of analysis this dataset represents
    log_pvalue -log10 p-value Yes
    pmid PubMed ID for paper if this dataset is published
    position Position
    study Study, consortium, or group that generated this analysis
    tech Genotyping/sequencing technology
    ref_allele Reference allele
    ref_allele_freq Reference allele frequency
    score_test_stat Score statistic
    se Effect size standard error
    trait Trait code. Example: "T2D"
    trait_label Longer description of trait, e.g. "Type 2 diabetes" Yes
    trait_group Arbitrary grouping/category the trait belongs to, e.g. "metabolic diseases" Yes
    variant Variant unique name (A string in the scheme {chrom}:{pos}_{ref}/{alt})
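
    With format=objects, each record carries trait_group and log_pvalue, which makes it straightforward to summarize a PheWAS response. As a sketch, the hypothetical function top_hit_per_group below keeps the strongest association in each trait group. Only the first sample record is taken (abridged) from the example response; the other two are made-up placeholders.

```python
def top_hit_per_group(records):
    """Map each trait_group to its record with the largest log_pvalue."""
    best = {}
    for rec in records:
        group = rec["trait_group"]
        if group not in best or rec["log_pvalue"] > best[group]["log_pvalue"]:
            best[group] = rec
    return best


records = [
    {"id": 45, "trait_group": "Metabolic disease",
     "trait_label": "Type 2 diabetes", "log_pvalue": 107.032},
    # Placeholder records for illustration only (not real API output).
    {"id": 900, "trait_group": "Metabolic disease",
     "trait_label": "placeholder trait A", "log_pvalue": 3.1},
    {"id": 901, "trait_group": "placeholder group",
     "trait_label": "placeholder trait B", "log_pvalue": 1.2},
]
for group, rec in top_hit_per_group(records).items():
    print(group, rec["trait_label"], rec["log_pvalue"])
```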

    PARAMETERS

    Param Description
    build Genome build for the requested variant. For example 'GRCh37' or 'GRCh38'. Trailing version (e.g. p13.3) will not be present.
    format Format of the response. Our API server supports two formats - the default is an array of arrays, and the optional objects format returns an array of JSON objects. LocusZoom.js will only generate requests that use format=objects.

    FILTERS

    Filter Description
    variant eq 'X' Select results for this variant. Variant should be in chr:pos_ref/alt format.

    META

    Response will contain a meta object, with the following attributes:

    Attribute Value
    build Array of genome build(s) that were requested. Records returned will be only for these builds. This will typically only be 1 build. In the future we may begin upconverting variants to other builds.

    SORT

    Not yet implemented

    Linkage disequilibrium

    The PortalDev API endpoint has been deprecated. We encourage you to explore the new Michigan LDServer. The interactive "LD playground" tool provides a concise overview of possible options. For many practical applications (such as LocusZoom plots), the "variant correlations" feature is recommended.

    Retrieve results

    Although the endpoint documented below still exists, it is deprecated and may be removed in the future. The documentation for this old endpoint is not maintained and is not guaranteed to be accurate.

    GET /statistic/pair/LD/results/

    Example: Retrieve all pair-wise LD D’ values between SNPs in the 12:10001-20001 region using 1000G EUR build 37 version 3 reference panel. Don’t sort the results. Retrieve only variant1, variant2 and value fields. Split results into pages of size 100. Start with the first page.

    curl -G "https://portaldev.sph.umich.edu/api/v1/statistic/pair/LD/results/" --data-urlencode "page=1" --data-urlencode "limit=100" --data-urlencode "filter=reference in 1 and chromosome1 in '12' and position1 ge 10001 and position1 le 20001 and chromosome2 in '12' and position2 ge 10001 and position2 le 20001 and type in 'dprime'" --data-urlencode "fields=variant1,variant2,value"
    
    {
      "data": {
        "variant1": ["12:10001", "12:10001", "12:10002"],
        "variant2": ["12:10002", "12:10003", "12:10003"],
        "value": [1.00, 0.78, 1.00]
      },
      "lastPage": 12
    }
    

    Example: Retrieve pair-wise D’ LD values between SNP 12:10023 and all SNPs in the 12:10001-20001 region using 1000G EUR build 37 version 3 reference panel. Retrieve only variant2 and value columns. Split the results into pages of size 100. Start with the first page.

    curl -G "https://portaldev.sph.umich.edu/api/v1/statistic/pair/LD/results/" --data-urlencode "page=1" --data-urlencode "limit=100" --data-urlencode "filter=reference in 3 and variant1 in '12:10023' and chromosome2 in '12' and position2 ge 10001 and position2 le 20001 and type in 'dprime'" --data-urlencode "fields=variant2,value"
    
    {
      "data": {
        "id2": ["12:10001", "12:10002", "12:10003"],
        "value": [1.00, 1.00, 0.98]
      },
      "lastPage": 10
    }
    

    This API endpoint calculates LD values between pairs of variants on the fly; they are not precomputed. For regions of 1 Mb, it should be nearly instant.

    This endpoint only uses pre-existing reference panels, such as the 1000 Genomes panels.

    FIELDS

    Field Description
    reference Reference panel unique identifier
    variant1 Variant name in chr:pos_ref/alt format
    chromosome1 Chromosome
    position1 Position in base pairs
    variant2
    chromosome2
    position2
    value LD value
    type LD type: dprime, rsquare

    FILTERS

    Filter Description
    reference in 1, 2 Select reference by unique identifier.
    variant1 in '12:1000', '12:1001' Select first variant by unique name.
    chromosome1 in '1', '2' Select chromosome for the first variant.
    position1 ge 1000, position1 le 2000 Specify position range (in base pairs) for the first variant.
    variant2 Select second variant by unique name.
    chromosome2 in '1', '2' Select chromosome for the second variant.
    position2 ge 1000, position2 le 2000 Specify position range (in base pairs) for the second variant.
    type in 'dprime', 'rsquare' Select type of LD coefficient.

    SORT

    Not yet implemented

    Recombination

    Get recombination sources

    GET /annotation/recomb/

    FIELDS

    Field Description
    id Recombination rate map unique identifier
    name Recombination rate map (e.g. hapmap)
    build Genome build for recombination rate positions
    version Version string for this recombination map (usually a date)

    FILTERS

    Filter Description
    id in 1 Select recombination rate by identifier

    SORT

    Add &sort=field1,field2 to your URL.

    Retrieve recombination rates

    GET /annotation/recomb/results/

    Example: Retrieve recombination rates within a specific interval for a given dataset

    curl -G "https://portaldev.sph.umich.edu/api/v1/annotation/recomb/results/" --data-urlencode "filter=id in 15 and chromosome eq '21' and position gt 10406989 and position lt 10906989"
    
    {
      "data": {
        "chromosome": [
          "21",
          "21",
          "21"
        ],
        "id": [
          15,
          15,
          15
        ],
        "pos_cm": [
          0.0,
          0.052685,
          0.052781
        ],
        "position": [
          10865933,
          10906723,
          10906915
        ],
        "recomb_rate": [
          1.29162,
          0.496586,
          0.424224
        ]
      },
      "lastPage": null
    }
    

    FIELDS

    Field Description
    id Recombination rate map unique identifier
    chromosome Chromosome
    position Genomic position (bp)
    pos_cm Genetic position (cM)
    recomb_rate Recombination rate

    FILTERS

    SORT

    Data can be sorted on any field by adding &sort=field1,field2 onto your URL.

    Search endpoints

    Omnisearch

    Search for genomic coordinates given a rsID, gene, transcript, etc. The following example search formats are supported:

    Positions and offsets may have commas and use K and M suffixes.

    GET /annotation/omnisearch/

    Example: Find gene positions by gene name

    curl -G "https://portaldev.sph.umich.edu/api/v1/annotation/omnisearch/"  --data-urlencode "q=TCF7L2" --data-urlencode "build=GRCh37"
    
    {
      "build": "grch37", 
      "data": [
        {
          "chrom": "10", 
          "end": 114927437, 
          "gene_id": "ENSG00000148737.11", 
          "gene_name": "TCF7L2", 
          "start": 114710009, 
          "term": "TCF7L2", 
          "type": "other"
        }
      ]
    }
    

    FIELDS

    Field Description
    chrom The chromosome
    start The start genomic position
    end The end genomic position
    term The term used as the query
    type The type of query (egene, region, rs, other), as predicted by the parser

    Additional fields may be returned depending on the query type.

    QUERY PARAMS

    Param Description
    q A string value to search for
    build A genome build identifier (GRCh37, GRCh38)
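
    A common next step is to turn a search hit into a region string for use elsewhere, for example as a plot range. A minimal sketch using the example response above; the function name to_region and the chrom:start-end output format are assumptions for illustration.

```python
def to_region(hit):
    """Format one omnisearch hit as 'chrom:start-end'."""
    return f"{hit['chrom']}:{hit['start']}-{hit['end']}"


# Abridged from the example response above.
response = {
    "build": "grch37",
    "data": [
        {"chrom": "10", "start": 114710009, "end": 114927437,
         "gene_name": "TCF7L2", "term": "TCF7L2", "type": "other"},
    ],
}
print([to_region(hit) for hit in response["data"]])  # ['10:114710009-114927437']
```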

    Interval annotations

    These are annotations that span intervals of the genome, such as enhancers and transcription factor binding sites (TFBSs).

    List all datasets/resources

    GET /annotation/intervals/

    Example: Retrieve a list of all available interval annotation resources.

    curl "https://portaldev.sph.umich.edu/api/v1/annotation/intervals/"
    
    {
      "data": {
        "assay": [
          "ChIP-seq",
          "ChIP-seq",
          "ChIP-seq",
          "ChIP-seq"
        ],
        "build": [
          "GRCh37",
          "GRCh37",
          "GRCh37",
          "GRCh37"
        ],
        "cell_line": [
          null,
          null,
          "GM12878",
          "K562"
        ],
        "description": [
          "Pancreatic islet chromHMM calls from Parker 2013",
          "Pancreatic islet stretch enhancers from Parker 2013",
          "Chromatin State Segmentation by HMM from ENCODE/Broad",
          "Chromatin State Segmentation by HMM from ENCODE/Broad"
        ],
        "histone": [
          null,
          null,
          null,
          null
        ],
        "id": [
          16,
          17,
          18,
          19
        ],
        "pmid": [
          "24127591",
          "24127591",
          "21441907",
          "21441907"
        ],
        "protein": [
          null,
          null,
          null,
          null
        ],
        "study": [
          "Parker 2013",
          "Parker 2013",
          "ENCODE",
          "ENCODE"
        ],
        "tissue": [
          "pancreatic_islet",
          "pancreatic_islet",
          null,
          null
        ],
        "type": [
          "chromHMM",
          "stretch_enhancers",
          "chromHMM",
          "chromHMM"
        ],
        "url": [
          "http://research.nhgri.nih.gov/manuscripts/Collins/islet_chromatin/",
          "http://research.nhgri.nih.gov/manuscripts/Collins/islet_chromatin/",
          "http://genome.ucsc.edu/cgi-bin/hgFileUi?db=hg19&g=wgEncodeBroadHmm",
          "http://genome.ucsc.edu/cgi-bin/hgFileUi?db=hg19&g=wgEncodeBroadHmm"
        ],
        "version": [
          "2013-12-10",
          "2013-12-10",
          "2012-04",
          "2012-04"
        ]
      },
      "lastPage": null
    }
    

    Example: Retrieve information about the interval annotation resource with id equal to 16.

    curl -G "https://portaldev.sph.umich.edu/api/v1/annotation/intervals/" --data-urlencode "filter=id in 16"
    
    {
      "data": {
        "assay": [
          "ChIP-seq"
        ],
        "build": [
          "b37"
        ],
        "cell_line": [
          null
        ],
        "description": [
          "Pancreatic islet chromHMM calls from Parker 2013"
        ],
        "histone": [
          null
        ],
        "id": [
          16
        ],
        "pmid": [
          "24127591"
        ],
        "protein": [
          null
        ],
        "study": [
          "Parker 2013"
        ],
        "tissue": [
          "pancreatic_islet"
        ],
        "type": [
          "chromHMM"
        ],
        "url": [
          "http://research.nhgri.nih.gov/manuscripts/Collins/islet_chromatin/"
        ],
        "version": [
          "2013-12-10"
        ]
      },
      "lastPage": null
    }
    

    FIELDS

    Field Description
    id Unique identifier for interval dataset
    study Name of study (ENCODE, FUSION, etc.)
    build Genome build to which these intervals are anchored
    type Dataset type (chromHMM calls, stretch enhancers, etc.)
    version Version string, usually a date
    description Long description of dataset
    assay Assay used to generate intervals (ChIP-seq, ATAC-seq, etc.)
    cell_line Name of the cell line in which these genomic intervals were discovered
    tissue Name of the tissue in which these genomic intervals were discovered
    histone If the dataset was ChIP-seq for a particular histone, this will be the name of the histone mark
    protein If the dataset was ChIP-seq for a particular TF/DNA-binding protein, this will be the protein (ENSEMBL ID)
    pmid PubMed ID for paper if this dataset is published
    url URL that contains information about the dataset and/or the original downloaded files

    FILTERS

    Filter Description
    id in 1, 2 Selects interval annotation resource by a unique identifier.

    SORT

    Sort on any field using sort=field1,field2.

    Retrieve interval annotations

    GET /annotation/intervals/results/

    Example: Retrieve annotations from the dataset with id 19, on chromosome 10, that overlap the region 114550452-115067678.

    curl -G https://portaldev.sph.umich.edu/api/v1/annotation/intervals/results/ --data-urlencode "filter=id in 19 and chromosome eq '10' and start le 115067678 and end ge 114550452"
    
    {
      "data": {
        "chromosome": ["10", "10", "10", "10"],
        "end": [114574010, 114574210, 114575010, 114575210],
        "id": [19, 19, 19, 19],
        "public_id": [null, null, null, null],
        "start": [114516210, 114574010, 114574210, 114575010],
        "state_id": [13, 7, 13, 7],
        "state_name": [
          "Heterochromatin / low signal",
          "Insulator",
          "Heterochromatin / low signal",
          "Insulator"
        ],
        "strand": [
          null,
          null,
          null,
          null
        ]
      },
      "lastPage": null
    }
    

    FIELDS

    Field Description
    id Interval dataset identifier
    state_id A (numeric) state identifier for this annotation, such as determined by ChromHMM. (if applicable)
    state_name A human-readable state name that generally corresponds to an entry in state_id. (if applicable)
    public_id Public/other database ID for this interval (if applicable)
    chromosome Chromosome
    start Start of interval (in bp)
    end End of interval (in bp)
    strand DNA strand that the interval is annotated to (if applicable)

    FILTERS

    Filter Description
    id in 1, 2 Select interval annotation resource by a unique identifier.
    chromosome in '1', '2', 'X' Select chromosome by name.
    start ge 10000, start le 20000 Select interval if its start position falls into the specified interval.
    end ge 10000, end le 20000 Select interval if its end position falls into the specified interval.
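
    The example request for this endpoint combines "start le" with "end ge" to select every interval that overlaps a region, rather than only intervals that start or end inside it. The sketch below builds that filter and shows the equivalent overlap test; the function names overlap_filter and overlaps are hypothetical.

```python
def overlap_filter(dataset_id, chrom, region_start, region_end):
    """Filter for intervals overlapping [region_start, region_end]."""
    return (f"id in {dataset_id} and chromosome eq '{chrom}' "
            f"and start le {region_end} and end ge {region_start}")


def overlaps(start, end, region_start, region_end):
    """An interval overlaps a region if it starts before the region ends
    and ends after the region starts."""
    return start <= region_end and end >= region_start


print(overlap_filter(19, "10", 114550452, 115067678))
# First interval of the example response: it starts before the region
# but still overlaps it.
print(overlaps(114516210, 114574010, 114550452, 115067678))  # True
```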

    SORT

    Sort on any field by adding sort=field1,field2 to the URL.

    FORMATS

    The default format returns JSON where each key is a column name, and the value is an array of values (one per row entry.)

    An alternative format returns each row as an object itself. Add format=objects to the URL for this.

    Genes

    List all possible sources of gene annotations

    Currently we only include ENSEMBL/GENCODE.

    GET /annotation/genes/sources/

    Example: retrieve all gene annotation sources

    curl "https://portaldev.sph.umich.edu/api/v1/annotation/genes/sources/?format=objects"
    
    {
      "data": [
        {
          "genome_build": "GRCh38", 
          "id": 1, 
          "organism": "human", 
          "source": "gencode", 
          "taxid": 9606, 
          "version": "27"
        }, 
        {
          "genome_build": "GRCh37", 
          "id": 2, 
          "organism": "human", 
          "source": "gencode", 
          "taxid": 9606, 
          "version": "19"
        }, 
        {
          "genome_build": "GRCh37", 
          "id": 3, 
          "organism": "human", 
          "source": "gencode", 
          "taxid": 9606, 
          "version": "27"
        }
      ],
      "lastPage": null
    }
    

    FIELDS

    Field Description
    id Annotation resource unique id.
    genome_build Annotation resource genome build.
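    organism Organism the annotation applies to (e.g. human).
    source Annotation resource name.
    taxid NCBI Taxonomy identifier of the organism (e.g. 9606 for human).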
    version Annotation resource version.

    Retrieve gene information

    GET /annotation/genes/

    Example: Retrieve gene annotations from source 3, on chromosome 10, that overlap the region 114550452-115067678.

    curl -G "https://portaldev.sph.umich.edu/api/v1/annotation/genes/" --data-urlencode "filter=source in 3 and chrom eq '10' and start le 115067678 and end ge 114550452"
    
    {
      "data": [
        {
          "chrom": "10", 
          "end": 114578503, 
          "exons": [
            {
              "chrom": "10", 
              "end": 114207225, 
              "exon_id": "ENSE00001449955.2_1", 
              "start": 114206756, 
              "strand": "+"
            }, 
            {
              "chrom": "10", 
              "end": 114207225, 
              "exon_id": "ENSE00001882813.1_1", 
              "start": 114206757, 
              "strand": "+"
            }
          ], 
          "gene_id": "ENSG00000151532.13_2", 
          "gene_name": "VTI1A", 
          "start": 114206756, 
          "strand": "+", 
          "transcripts": [
            {
              "chrom": "10", 
              "end": 114210484, 
              "exons": [
                {
                  "chrom": "10", 
                  "end": 114207225, 
                  "exon_id": "ENSE00001449955.2_1", 
                  "start": 114206756, 
                  "strand": "+"
                }
              ], 
              "start": 114206992, 
              "strand": "+", 
              "transcript_id": "ENST00000489142.5_1"
            }
          ]
        }
      ], 
      "lastPage": null
    }
    

    FIELDS

    Field Description
    source Genes annotation resource id (used for queries)
    gene_name Gene name (non-unique).
    gene_id Gene unique id.
    chrom Chromosome name.
    start Gene start position.
    end Gene end position.
    strand Gene strand
    transcripts A nested object defining available transcripts, and each exon within each transcript
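
    Since transcripts and exons are nested inside each gene object, client code typically walks the structure rather than reading flat columns. A minimal sketch that counts exons per transcript, using an abridged copy of the example response; the function name exon_counts is hypothetical.

```python
def exon_counts(gene):
    """Map each transcript_id of one gene object to its number of exons."""
    return {t["transcript_id"]: len(t["exons"]) for t in gene["transcripts"]}


# Abridged from the example response above.
gene = {
    "gene_name": "VTI1A",
    "transcripts": [
        {
            "transcript_id": "ENST00000489142.5_1",
            "exons": [
                {"exon_id": "ENSE00001449955.2_1",
                 "start": 114206756, "end": 114207225},
            ],
        },
    ],
}
print(exon_counts(gene))  # {'ENST00000489142.5_1': 1}
```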

    FILTERS

    Filter Description
    source in 1, 2 Selects gene annotation source by a unique identifier.
    gene_name in 'APOE', 'TCF7L2' Selects gene annotation by non-unique display name(s).
    gene_id in 'ENSG00000223972.5', 'ENSG00000227232.5' Selects gene annotation by unique gene ID(s).
    chrom eq 'chr20' Selects gene annotations that lie on a given chromosome.
    start ge 20000000 Selects gene annotations with start positions greater than or equal to a value.
    end le 20100000 Selects gene annotations with end positions less than or equal to a value.

    SORT

    Not yet implemented

    GWAS Catalogs

    List all available GWAS catalogs

    We currently support the EBI GWAS catalog and the UK Biobank GWAS hits.

    GET /annotation/gwascatalog/

    Example: retrieve all GWAS catalogs

    curl "https://portaldev.sph.umich.edu/api/v1/annotation/gwascatalog/"
    
    {
      "data": {
        "catalog_version": [
          "e91_r2018-03-13",
          "e91_r2018-03-13"
        ],
        "date_inserted": [
          "2018-03-18T17:20:40-04:00",
          "2018-03-18T17:20:40-04:00"
        ],
        "genome_build": [
          "GRCh38",
          "GRCh37"
        ],
        "id": [
          1,
          2
        ],
        "name": [
          "EBI GWAS Catalog",
          "EBI GWAS Catalog"
        ]
      },
      "lastPage": null
    }
    

    Or alternatively in object mode:

    curl "https://portaldev.sph.umich.edu/api/v1/annotation/gwascatalog/?format=objects"
    
    {
      "data": [
        {
          "catalog_version": "e91_r2018-03-13",
          "date_inserted": "2018-03-18T17:20:40-04:00",
          "genome_build": "GRCh38",
          "id": 1,
          "name": "EBI GWAS Catalog"
        },
        {
          "catalog_version": "e91_r2018-03-13",
          "date_inserted": "2018-03-18T17:20:40-04:00",
          "genome_build": "GRCh37",
          "id": 2,
          "name": "EBI GWAS Catalog"
        }
      ],
      "lastPage": null
    }
    

    FIELDS

    Field Description
    id Unique ID assigned to each GWAS catalog
    name Name of the catalog, e.g. "EBI" or "UKBB"
    genome_build Positions in the catalog are anchored to this build
    catalog_version Version of the GWAS catalog (varies by catalog)
    date_inserted Date the GWAS catalog was inserted into the database

    Retrieve variants from one or multiple GWAS catalogs

    GET /annotation/gwascatalog/results/

    Retrieve all known disease/trait associated variants within a genomic region for a specific catalog

    Understanding the format is easier in object mode, so we use that below.

    curl -G "https://portaldev.sph.umich.edu/api/v1/annotation/gwascatalog/results/?format=objects" --data-urlencode "filter=id eq 1 and chrom eq '10' and pos le 112998595 and pos ge 112998585"
    
    {
      "data": [
        {
          "alt": "T",
          "chrom": "10",
          "first_author": "Sladek R",
          "genes": "TCF7L2",
          "id": 1,
          "log_pvalue": 33.7,
          "or_beta": 1.65,
          "pmid": "17293876",
          "pos": 112998590,
          "pubdate": "2007-02-11",
          "ref": "C",
          "risk_allele": "T",
          "risk_frq": 0.3,
          "rsid": "rs7903146",
          "study": "A genome-wide association study identifies novel risk loci for type 2 diabetes.",
          "trait": "Type 2 diabetes",
          "trait_group": "Type 2 diabetes",
          "variant": "10:112998590_C/T"
        },
        {
          ...
        }
      ]
    }
    

    One record is returned per variant * trait * pmid. The same variant <--> trait association can be reported in multiple publications.
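
    Because the same association can appear once per publication, it can help to group records by variant and trait. A sketch under that reading; the function name pmids_per_association is hypothetical, the first record is abridged from the example response, and the second is a made-up placeholder.

```python
from collections import defaultdict


def pmids_per_association(records):
    """Group PubMed IDs by (variant, trait) association."""
    grouped = defaultdict(set)
    for rec in records:
        grouped[(rec["variant"], rec["trait"])].add(rec["pmid"])
    return {key: sorted(pmids) for key, pmids in grouped.items()}


records = [
    {"variant": "10:112998590_C/T", "trait": "Type 2 diabetes",
     "pmid": "17293876"},
    # Placeholder record with a made-up pmid, standing in for a second paper.
    {"variant": "10:112998590_C/T", "trait": "Type 2 diabetes",
     "pmid": "00000000"},
]
print(pmids_per_association(records))
```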

    Retrieve associations for a specific variant

    You should use a catalog that is anchored to the same genome build as your variant (since it contains a position.) For example, 10:112998590_C/T is rs7903146 in GRCh38, but 10:114758349_C/T in GRCh37. In this example, assume the GWAS catalog with ID 1 is a GRCh38 catalog.

    curl -G "https://portaldev.sph.umich.edu/api/v1/annotation/gwascatalog/results/?format=objects" --data-urlencode "filter=id eq 1 and variant eq '10:112998590_C/T'"
    

    You can also retrieve by rsID instead of a variant:

    curl -G "https://portaldev.sph.umich.edu/api/v1/annotation/gwascatalog/results/?format=objects" --data-urlencode "filter=id eq 1 and rsid eq 'rs7903146'"
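
    Since the catalog must be anchored to the same build as the variant, the catalog ID can be looked up from the /annotation/gwascatalog/ listing before querying results. A minimal sketch using the object-mode example response shown earlier; the function name catalog_ids_for_build is hypothetical.

```python
def catalog_ids_for_build(catalogs, build):
    """Return the IDs of all catalogs anchored to the given genome build."""
    return [c["id"] for c in catalogs if c["genome_build"] == build]


# The "data" array from the object-mode catalog listing, abridged.
catalogs = [
    {"id": 1, "genome_build": "GRCh38", "name": "EBI GWAS Catalog"},
    {"id": 2, "genome_build": "GRCh37", "name": "EBI GWAS Catalog"},
]
print(catalog_ids_for_build(catalogs, "GRCh37"))  # [2]
```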
    

    FIELDS

    Field Description
    id GWAS catalog ID
    alt Alternate allele
    chrom Chromosome
    first_author First author of the publication reporting this association
    log_pvalue -log10 p-value for association between variant and trait
    or_beta Effect size (or odds ratio if binary trait)
    pmid PubMed ID for the publication reporting this association
    pos Position
    pubdate Publication date (YYYY-MM-DD)
    ref Reference allele
    risk_allele Specifies allele for effect direction and risk frequency
    risk_frq Frequency of risk allele
    rsid rsID of the variant
    study A human-readable description of the study
    trait Name of the trait/phenotype/disease
    trait_group Grouping of traits as defined by the catalog
    variant Variant in chr:pos_ref/alt format

    FILTERS

    Filter Description
    id in 1, 3, 6 Selects GWAS catalogs by their IDs
    chrom eq '6' Select only variants on a particular chromosome
    pos ge 1 Select only variants with position greater than or equal to a value
    pos le 10 Select only variants with position less than or equal to a value
    pos gt 1 Select only variants with position greater than a value
    pos lt 10 Select only variants with position less than a value
    variant eq '10:112998590_C/T' Select a particular variant
    rsid eq 'rs7903146' Select a variant by rsID

    SORT

    Return sorted results by including the sort=field parameter. The most common use is probably sorting by log p-value, for example sort=log_pvalue.