
DAS in Ensembl

The Ensembl DAS reference server

Ensembl provides a DAS reference server which gives access to genomic sequences, the latest Ensembl gene predictions and, for some species, karyotypes and ditags. The sources currently served from the Ensembl DAS reference server are listed in XML documents at:

http://www.ensembl.org/das/dsn (listing generated from the standard DAS data source name (DSN) request)
http://www.ensembl.org/das/sources (a listing with extended information)

Example requests

DAS request URLs have a specific format:

protocol://site-prefix/das/data-source/command?arguments

For example:

http://www.ensembl.org/das/Homo_sapiens.NCBI36.transcript/features?segment=13:31787617,31871806
Requests all transcript features (in practice, exons) in the region [31787617,31871806] on human chromosome 13 (this is where the gene BRCA2 is located in the NCBI 36 assembly).
http://www.ensembl.org/das/Gallus_gallus.WASHUC1.reference/sequence?segment=1:1,1000
Request the first 1000 bp of the first chicken chromosome.
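
The URL format above can be assembled programmatically. The following Python sketch is illustrative only: the helper names build_das_url and segment are hypothetical and not part of any Ensembl or DAS library, and it assumes the arguments are simple key=value pairs as in the two examples.

```python
from urllib.parse import urlencode


def segment(reference, start, end):
    """Format a DAS segment argument as reference:start,end."""
    return f"{reference}:{start},{end}"


def build_das_url(site_prefix, data_source, command, protocol="http", **arguments):
    """Assemble protocol://site-prefix/das/data-source/command?arguments."""
    url = f"{protocol}://{site_prefix}/das/{data_source}/{command}"
    if arguments:
        # Keep ':' and ',' unescaped so the segment stays in its usual form.
        url += "?" + urlencode(arguments, safe=":,")
    return url


# Rebuild the two example requests shown above.
brca2_url = build_das_url(
    "www.ensembl.org",
    "Homo_sapiens.NCBI36.transcript",
    "features",
    segment=segment("13", 31787617, 31871806),
)
chicken_url = build_das_url(
    "www.ensembl.org",
    "Gallus_gallus.WASHUC1.reference",
    "sequence",
    segment=segment("1", 1, 1000),
)
print(brca2_url)
print(chicken_url)
```

The two printed URLs should match the example requests listed above.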

DAS stylesheet support in Ensembl

Ensembl supports the DAS "stylesheet" request, which lets a DAS source control how its features are displayed. A PDF document that describes the stylesheet support is available here:

Ensembl Stylesheet Support

DAS clients in Ensembl and semantic extensions to the DAS specification

DAS clients are built into several Ensembl displays to enable incorporation of third-party annotation data by "attaching a DAS source". Ensembl uses DAS for showing external third-party data and data from in-house DAS sources. The help pages for DASConfView explain how to attach a DAS data source to an Ensembl display.

Some Ensembl displays introduce semantic extensions to the DAS specification. This is explained below.

ContigView and CytoView

A genomic DAS source serves data in the format specified in the DAS/1.5 specification. In Ensembl, genomic DAS sources can be attached to ContigView and to CytoView (see examples in human) if the annotation is in chromosomal coordinates, on contigs or on any other top-level assembly structure known to Ensembl for that species. For genomic DAS sources the assembly version is important: if the source was annotated on a different assembly version from the one being displayed (e.g. on the NCBI 36 human assembly), its features may appear offset in ContigView.

Please note that for most species, Ensembl chromosomes are named "1", "2", "X", "Y" rather than "chr1", "chr2", "chrX", "chrY". The exceptions are the Drosophila melanogaster, Anopheles gambiae, Caenorhabditis elegans, and Saccharomyces cerevisiae genomes, which use other naming conventions.
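
If a DAS source was prepared with "chr"-prefixed names, the segment names need converting before they will match Ensembl's. A minimal Python sketch follows; the function name is hypothetical, and it only covers the common case described above. The four exception species would need their own mapping, which is not attempted here.

```python
def to_ensembl_chromosome_name(name):
    """Strip a leading 'chr' prefix, e.g. 'chr13' -> '13' and 'chrX' -> 'X'.

    Names without the prefix are returned unchanged. This does not handle
    the species that use other naming conventions.
    """
    if len(name) > 3 and name.lower().startswith("chr"):
        return name[3:]
    return name


for original in ("chr1", "chrX", "13"):
    print(original, "->", to_ensembl_chromosome_name(original))
```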

ProtView

Proteomic DAS sources (or ProteinDAS sources) may be attached to ProtView which displays protein information (see example in human).

A proteomic DAS client and server exchange protein annotations using the DAS protocol. Genomic DAS sources annotate DNA sequence from a genomic assembly, whereas proteomic DAS sources annotate proteins. The extension of the DAS protocol to proteins is only semantic: the requests and replies between client and server use exactly the same specification as in the genomic case. Sections of the XML reply that are not relevant to proteomic data (e.g. "phase" and "strand") are simply ignored by the proteomic DAS client.

In most cases, the UniProt peptide sequences are used as the common reference.

There is an official UniProt reference server which is not maintained by the Ensembl group. It serves the latest UniProt data from the UniProt group at the EBI, including peptide sequences:

Example requests would be:

The UniProt DAS reference server is able to respond to any of the types of identifiers listed on the UniProt DAS information page.

Proteomic DAS annotations can be browsed from Ensembl ProtView pages, e.g.

The ProtView display of Ensembl is able to display non-positional features in the same way as GeneView (see below).

GeneView

DAS sources serving non-positional features (sometimes also called GeneDAS sources) can be attached to GeneView (see example for human) which displays gene information. Non-positional feature DAS is a semantic extension to the DAS protocol. It allows exchange of annotations tied to identifiers, such as a HUGO gene name or an Ensembl gene ID, rather than to a segment of reference sequence. Positional data is irrelevant and the annotation applies to the entire entity referenced by the request. Non-positional annotations are defined as having 'start' and 'end' attributes set to 0 (zero) and the annotation itself is carried in the 'note' attribute.
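
A client can separate non-positional from positional features by checking the start and end values in a features reply. The Python sketch below is illustrative only. It assumes the DAS/1.5 features layout, in which each FEATURE element carries START, END and NOTE child elements, and the sample reply, its identifiers and its notes are invented for the example rather than taken from a real server.

```python
import xml.etree.ElementTree as ET

# Invented sample reply; a real reply would come from a DAS features request.
SAMPLE_REPLY = """<DASGFF>
  <GFF version="1.0" href="http://example.org/das/demo/features">
    <SEGMENT id="ENSG00000139618" start="1" stop="1">
      <FEATURE id="f1" label="summary">
        <TYPE id="description">description</TYPE>
        <START>0</START>
        <END>0</END>
        <NOTE>Hypothetical gene-level annotation</NOTE>
      </FEATURE>
      <FEATURE id="f2" label="domain">
        <TYPE id="domain">domain</TYPE>
        <START>120</START>
        <END>240</END>
        <NOTE>Hypothetical positional annotation</NOTE>
      </FEATURE>
    </SEGMENT>
  </GFF>
</DASGFF>"""


def split_features(xml_text):
    """Return (non_positional, positional) lists of feature records."""
    root = ET.fromstring(xml_text)
    non_positional, positional = [], []
    for feature in root.iter("FEATURE"):
        # A missing or empty START/END is treated as 0.
        start = int(feature.findtext("START") or 0)
        end = int(feature.findtext("END") or 0)
        record = {
            "id": feature.get("id"),
            "start": start,
            "end": end,
            "note": feature.findtext("NOTE") or "",
        }
        # Start and end both 0 marks an annotation of the whole entity.
        if start == 0 and end == 0:
            non_positional.append(record)
        else:
            positional.append(record)
    return non_positional, positional


non_positional, positional = split_features(SAMPLE_REPLY)
for record in non_positional:
    print(record["id"], record["note"])
```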

Among others, Ensembl currently uses the UniProt proteomic DAS reference server (see above) in a GeneDAS capacity. An example DAS request with both non-positional and proteomic DAS annotations is:

The corresponding Ensembl GeneView page is:

URL Auto-Configuration

DAS sources can be automatically attached and configured while following links to Ensembl data displays.

Mandatory Parameters

URLs for automatic DAS source attachment must contain the following parameters:

url
The URL to the DAS server ending with '/das'.
dsn
The DAS source name as defined in the DAS server.
name
The DAS source name as it will appear on Ensembl data displays.
type
The mapping type, which defines the coordinate system to use for fetching features.

Possible values for the type are:

hugo
The gene symbol as assigned by the HUGO Gene Nomenclature Committee (HGNC) e.g. BRCA2.
markersymbol
The gene symbol as assigned by Mouse Genome Informatics (MGI) e.g. Brca2.
mgi
MGI IDs corresponding to the Mouse Genome Informatics database e.g. MGI109337 (note that the colon which is usually part of the MGI identifier is removed).
ensembl_gene
Ensembl Gene ID e.g. ENSG00000139618.
ensembl_location_chromosome
Ensembl Chromosome Location e.g. 13:31787617,31871805 (analogous to this there are also ensembl_location_contig, ensembl_location_scaffold, and ensembl_location_clone).
ensembl_peptide
Ensembl Peptide ID e.g. ENSP00000267071.
ensembl_transcript
Ensembl Transcript ID e.g. ENST00000267071.
uniprot/swissprot
UniProt/Swiss-Prot Name e.g. BRCA2_HUMAN.
uniprot/swissprot_acc
UniProt/Swiss-Prot Accession number e.g. P51587.
entrezgene
Entrez Gene ID e.g. 675.
ipi_acc
International Protein Index (IPI) accession number e.g. IPI00015171.
ipi_id
International Protein Index (IPI) name e.g. IPI00015171.4.

Optional Parameters

active
Set active=1 to not only attach the DAS source but also automatically switch it on for display.
stylesheet
Set stylesheet=y to use a source stylesheet to display features.
score
Set score=h to display data as a histogram.
label
Set label=TEXT to define a name for the DAS track in ContigView and CytoView.
caption
Set caption=TEXT to define a name of the DAS source as it will appear in the DAS menu.
group
Set group=y to group features.
color
Set color=COLOUR to use this colour for feature display.
strand
Set strand=b|f|r to display features on both strands, on the forward strand only, or on the reverse strand only, respectively.
depth
Set depth=INT for the number of rows to be used by this DAS track in ContigView. Set to 10000 for unlimited.
labelflag
Set labelflag=n|o|u to display no feature labels at all or to display them over or under the feature, respectively.

Example URL

The following example URL attaches and activates a DAS track in Ensembl ContigView that shows the results of mapping Ensembl Peptide IDs to PDB protein structures.

http://www.ensembl.org/Homo_sapiens/contigview?peptide=ENSP00000354398&add_das_source=(url=http://das.sanger.ac.uk/das+dsn=ensppdbmapping+type=ensembl_peptide+name=ENSP_PDB_mapping+active=1)
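
A link of this kind can be generated from the parameters described above. The Python sketch below is illustrative only: the function name build_autoconfig_url is hypothetical, the parameter order simply mirrors the example URL, and values are assumed to contain no spaces or '+' characters, since no escaping is attempted.

```python
# Mandatory parameters, in the order used by the example URL.
MANDATORY = ("url", "dsn", "type", "name")


def build_autoconfig_url(display_url, **params):
    """Append an add_das_source=(...) specification to an Ensembl display URL."""
    missing = [key for key in MANDATORY if key not in params]
    if missing:
        raise ValueError("missing mandatory parameters: " + ", ".join(missing))
    if not params["url"].endswith("/das"):
        raise ValueError("the DAS server URL must end with '/das'")
    # Mandatory parameters first, then optional ones in the order given.
    ordered = list(MANDATORY) + [key for key in params if key not in MANDATORY]
    spec = "+".join(f"{key}={params[key]}" for key in ordered)
    separator = "&" if "?" in display_url else "?"
    return f"{display_url}{separator}add_das_source=({spec})"


example_url = build_autoconfig_url(
    "http://www.ensembl.org/Homo_sapiens/contigview?peptide=ENSP00000354398",
    url="http://das.sanger.ac.uk/das",
    dsn="ensppdbmapping",
    type="ensembl_peptide",
    name="ENSP_PDB_mapping",
    active=1,
)
print(example_url)
```

With these inputs the printed URL should be identical to the example URL above. Optional parameters such as stylesheet=y or strand=b can be passed in the same way as active=1.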

 

© 2024 Inserm. Hosted by genouest.org. This product includes software developed by Ensembl.

                
GermOnline based on Ensembl release 50 - Jul 2008