Synopsis

Introduction

Ensembl 'ContigView' is the principal data visualisation tool for genome sequence annotation information. It provides a high level view of the contig sequences that form the genome sequence assembly, and of genes and other features that have been placed on it.

'ContigView' can be customised to suit you. More information can be added or the displays can be simplified to make browsing faster. Please look at the 'Menu Bar' section for details.

The page is split into four sections representing different levels of zooming into the chromosome:

  1. The entire chromosome or a scaffold as an alternative sequence block for species with pre-chromosome assemblies.
  2. An 'Overview' panel displaying a chromosome region of up to 1 Mb.
  3. The main 'Detailed View' panel showing a broad range of features.
  4. A 'Basepair View' panel showing, for a small assembly region of up to 500 bases, the actual sequence, six-frame translations and restriction endonuclease recognition sites.

The red boxes in the 'Chromosome', 'Overview' and 'Detailed View' panels represent the regions shown at higher magnification in the panel immediately below. The absolute base pair location of the region displayed in 'Detailed View' is indicated in the 'Navigation Bar' at the top of the panel. You can use this bar to navigate along any chromosome by entering a new chromosome and location. Physical map locations may be specified directly by entering base pair coordinates or numbers with 'kb' or 'Mb' as a suffix.
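As an illustration of the 'kb' and 'Mb' suffixes, the following minimal Python sketch converts such a location string into a base pair coordinate. It is not the parser used by the 'Navigation Bar'; the function name `parse_position`, the case-insensitive matching and the acceptance of decimal values are assumptions made for this example.

```python
import re

# Multipliers for the suffixes mentioned in the text. Treating the suffix
# case-insensitively is an assumption of this sketch.
_UNIT = {"": 1, "kb": 1_000, "mb": 1_000_000}


def parse_position(text):
    """Convert a string such as '12345', '250kb' or '1.5Mb' to base pairs."""
    match = re.fullmatch(r"\s*([0-9]*\.?[0-9]+)\s*(kb|mb)?\s*", text, re.IGNORECASE)
    if match is None:
        raise ValueError(f"cannot parse position: {text!r}")
    number = float(match.group(1))
    unit = (match.group(2) or "").lower()
    return int(round(number * _UNIT[unit]))
```

With these assumptions, `parse_position("1.5Mb")` and `parse_position("1500kb")` describe the same coordinate.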

Chromosome

The 'Chromosome' panel displays an ideogram of an entire chromosome together with its cytogenetic banding pattern. Maps of cytogenetic bands to the genome sequence allow for rather crude orientation and are not available for all species. For all those species with genome sequence assemblies in a pre-chromosome stage, Ensembl displays other 'top-level' sequence entities such as 'scaffolds'.

The red box illustrates the extent of the region displayed in the 'Overview' panel below and can be moved by clicking anywhere on the chromosome. The 'Chromosome' display can be turned on or off using the plus [+] or minus [-] boxes, respectively.

Overview

The 'Overview' panel displays a larger section of a chromosome together with its basic annotation. Usually the range is set to 1 Mb but can be smaller for species with genomes of higher density.

The panel displays the following information:

The red box illustrates the extent of the region displayed in the subsequent 'Detailed View' panel below. You may click anywhere in the 'Overview' panel to re-centre the red box at that point on the contig map. The 'Detailed View' display below will change accordingly. Except for re-centring the display, contigs and genes are not clickable in the 'Overview' display, but they are selectable in the 'Detailed View' panel below. The 'Overview' display can be turned on or off using the plus [+] or minus [-] button, respectively.

Detailed View

The third panel, 'Detailed View', shows smaller regions of chromosomes and provides more detailed insight into genome annotation. Features are annotated in tracks along the genome sequence assembly in its standard notation from the p-telomere to the q-telomere. The genomic DNA sequence is generally assembled from smaller sequence-level entities (BAC clones, whole genome sequencing scaffolds or contig sequences in general), which are represented by alternating dark and light blue blocks. Colour-coded features above the contigs are positioned on the forward strand, while those below are on the reverse strand.

The entire 'Detailed View' display panel can be turned on or off using the plus [+] or minus [-] button, respectively.

The 'Menu Bar' and the 'Navigation Bar' on top of the 'Detailed View' panel are the main tools to customise this display. The 'Menu Bar' provides a set of pull-down menus; opening a menu lets you select options via check boxes. Changes take effect when you click the 'Close menu' option at the bottom of the menu. The following menus are available:

The 'Navigation Bar' and the 'Menu Bar' on top of the 'Detailed View' panel are the main tools to customise this display. The following navigation functions are available:

Feature Tracks

Feature tracks are named at the left side of the 'Detailed View' panel. Clicking a track name will directly link to a description in Ensembl 'HelpView'. Black track names represent Ensembl-internal feature tracks, while blue names indicate tracks served via the Distributed Annotation System from external DAS sources. Tracks may be turned on or off and customized to suit your requirements. Pointing the mouse to a feature will bring up a pop-up window showing the feature identifier together with links to more detailed information whenever available. Pop-up menus can be turned off by un-checking 'show pop-up menus' in the 'Options' pull-down menu. A single click on most features will take you to an appropriate page with more information on that particular feature, unless the '... pop-up on click' option from the Options menu has been selected. (See customizing the display for more details.)

DNA:contigs

The 'DNA (contigs)' track shows a representation of the genomic sequence assembly. Alternating light and dark blue blocks represent individual contig sequences in the genome sequence assembly. Small arrows near sequence identifiers represent the relative orientation of a particular contig sequence within the genome assembly in standard notation. Where no blue contig is shown, there is a gap in the assembly.

Pointing at a contig sequence representation in 'Detailed View' will display a pop-up menu with the complete Ensembl sequence identifier (e.g. AC120349.5.1.183055) at the top. Sequence identifiers regularly include an EMBL accession number whenever available, as well as a sequence version, a start and an end coordinate ([EMBL accession number].[sequence version].[start].[end]). Since Ensembl is designed to use several coordinate systems like 'contigs', 'clones', 'supercontigs', 'scaffolds', 'chunks' or 'chromosomes' in parallel, corresponding sequence regions in other coordinate systems will be listed. Links in the pop-up windows allow for export of the sequence region or for centring the 'Detailed View' panel on a particular sequence region. For BAC clones, Ensembl will provide an "EMBL source file" link to the underlying sequence database record in the pop-up window.
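The identifier layout described above, [EMBL accession number].[sequence version].[start].[end], can be taken apart with a few lines of Python. This is a sketch for identifiers that follow exactly this four-part pattern; the names `ContigId` and `parse_contig_id` are hypothetical, and identifiers without an EMBL accession number may be laid out differently.

```python
from typing import NamedTuple


class ContigId(NamedTuple):
    accession: str
    version: int
    start: int
    end: int


def parse_contig_id(identifier):
    """Split '[accession].[version].[start].[end]' into its four parts."""
    # Split from the right so that the last three fields are always
    # version, start and end. Fewer than four fields raises ValueError.
    accession, version, start, end = identifier.rsplit(".", 3)
    return ContigId(accession, int(version), int(start), int(end))
```

For the example identifier in the text, AC120349.5.1.183055, this gives accession AC120349, version 5 and a region running from base 1 to base 183055.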

Clicking on a contig sequence representation in the 'Detailed View' track will immediately centre on the sequence region.

Transcripts and Genes

'Detailed View' does not display genes as such but rather their individual transcripts. Transcripts shown above the DNA:contig bar are transcribed in the forward direction (left to right), while transcripts shown below the bar are transcribed in the reverse direction (right to left). Ensembl considers genes as a collection of exons, which may form several transcripts. A coloured box represents each exon, while angled lines represent introns joining all exons in a transcript. 5' and 3' untranslated regions of the transcript (UTRs) are shown as coloured outlines, while the predicted coding regions are shown in solid colour. This distinction is seen best when viewing relatively small regions of a chromosome.

If there are several transcripts displayed on different lines at the same point on the sequence, then that gene has been assessed as producing multiple alternatively spliced transcripts.

Several transcript types are available within the Ensembl system:

Protein Homology Evidence

Protein homology evidence tracks display protein sequence entries from various databases aligned against the genome sequence. They provide evidence for Ensembl gene predictions. The presence of an entry in an evidence track shows that it has significant homology with at least one of the exons displayed in an Ensembl or GENSCAN transcript. The data sets displayed differ for the different Ensembl species.

Pointing at a protein sequence representation in 'Detailed View' will display a pop-up menu with the external database accession number and a clickable link to a UniProt/Swiss-Prot or UniProt/TrEMBL display of this entry. The same database record is also reached by directly clicking on the feature.

Please note that a maximum of seven entries is displayed in any one position, although more entries may have been mapped to this location. (All protein entries mapped to a certain genome position can be retrieved from the 'protein_align_feature' tables in species-specific Ensembl 'core' databases.) Those entries that were actually used during the building of an Ensembl transcript can be seen in more detail by examining the 'Supporting Evidence' section on Ensembl 'ExonView' pages.

mRNA Homology Evidence

These tracks show evidence for Ensembl gene predictions, taken from mRNA sequence entries in databases. The presence of an entry in an evidence track shows that it has significant homology with at least one of the exons displayed in an Ensembl or GENSCAN transcript (except for the ESTs track and the human cDNA track, which show all above-threshold hits to the assembly; see below). The data sets that were used for the gene predictions and that are displayed in 'ContigView' differ for the different Ensembl species.

Pointing at a cDNA sequence representation in 'Detailed View' will display a pop-up menu with the external database accession number and a clickable link to an EMBL display of this entry. The same database record is also reached by directly clicking on the feature.

Please note that a maximum of seven entries is displayed in any one position, although more entries may have been mapped to this location. (All mRNA entries mapped to a certain genome position can be retrieved from the 'dna_align_feature' tables in species-specific Ensembl 'core' databases.) Those entries that were actually used during the building of an Ensembl transcript can be seen in more detail by examining the 'Supporting Evidence' section on Ensembl 'ExonView' pages.

EST Homology Evidence

tRNA

Indicates the location of tRNA genes, predicted by the tRNAscan-SE program.

Todd M. Lowe and Sean R. Eddy
tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence.
Nucleic Acids Res. 1997 Mar 1;25(5):955-964

Warning: In the mouse and rat genomes, tRNAscan-SE is currently unable to distinguish reliably between functional tRNAs, pseudogenes and tRNA-related SINE repeats. The tRNA track in these species will therefore show a mixture of these elements.

Rfam

Indicates non-coding RNA families from Rfam (RNA families database of alignments and CMs).

Rfam: an RNA family database.
Sam Griffiths-Jones, Alex Bateman, Mhairi Marshall, Ajay Khanna and Sean R. Eddy.
Nucleic Acids Research, 2003, 31, 1, 439-441.

Eponine

Eponine is an algorithm that predicts transcription start sites in mammalian genomic DNA, based on the linear combination of weight matrices. Although Ensembl annotates Eponine transcription start site (TSS) predictions on the genomic sequence, its predictions are presently not used to artificially extend Ensembl gene model predictions beyond solid biological evidence.

Thomas A. Down and Tim J. P. Hubbard
Computational detection and location of transcription start sites in mammalian genomic DNA.
Genome Res. 2002 Mar;12(3):458-461.

First Exon Finder

The First Exon Finder (FirstEF) is a 5' terminal exon and promoter prediction program. It implements a decision tree based on discriminant functions that can recognise structural and compositional features such as CpG islands, promoter regions and first splice-donor sites. The probabilistic models are optimised to find potential first donor sites and CpG-related and non-CpG-related promoter regions based on discriminant analysis. For every potential first donor site (GT) and upstream promoter region, FirstEF decides whether or not the intermediate region can be a potential first exon, based on a set of quadratic discriminant functions.

Ramana V. Davuluri, Ivo Grosse and Michael Q. Zhang
Computational identification of promoters and first exons in the human genome.
Nat Genet. 2001 Dec;29(4):412-417.
doi:10.1038/ng780

Microarray Probe Sets

Ensembl annotates microarray probe sets on the genome sequences if manufacturers have disclosed the individual probe set sequences for a particular microarray. The mapping process is a two-step procedure outlined in the Microarray Probe Set Mapping document.

Whole Genome Similarity Matches

Tracks displaying whole genome similarity matches to other genomes in Ensembl are available from the 'Compara' menu. The track names include a four-letter abbreviation of the systematic species name and the method used for characterising the whole genome similarity matches.

An overview document lists species pairs and methods involved in the comparison.

The following tracks are available:

Clicking the red plus [+] or minus [-] box to the left of the track toggles whole genome alignment tracks between expanded and collapsed display, respectively.

When the matches are shown as expanded, individual high scoring pairs (HSPs) of identical orientation are joined by horizontal lines and the minus [-] box is shown. Pointing at the track produces pop-up windows with the coordinates of the assembly segment from the matching species, the relative orientation and a link to see that segment in 'ContigView'.

When the matches are shown as collapsed, individual high scoring pairs (HSPs) on the same region of the chromosome or scaffold are not joined by horizontal lines and the plus [+] button is shown. Pointing at a hit provides a pop-up window with links to the pairwise alignment in Ensembl 'AlignView', a dot matrix display of the aligned region from the two species in 'DotterView' and an option to display genomic regions from both species simultaneously in 'MultiContigView'. Alternatively, you can display the alignments in 'AlignSliceView' by selecting the "View alignment with..." option in the left-hand side menu. There is also a link to the corresponding 'ContigView' display for the other species.

N.B. All similarity matches are strand-independent tracks and are therefore displayed at the top of the 'Detailed View' panel. For more information about regions of conserved synteny, consult Ensembl 'SyntenyView'.

Markers

Mouse-over will show the marker identifier and an option to view details and synonyms in 'MarkerView'. Note that only a subset of the markers stored in the Ensembl databases is displayed. Information about other markers may be found via the text search box near the top of almost any Ensembl page. For detailed instructions see the Ensembl 'TextView' page.

Quantitative Trait Loci (QTL)

Only a preliminary mapping of Rattus norvegicus Quantitative Trait Loci (QTLs) is available at present. Loci are mapped onto the genome sequence assembly via mapping of QTL-defining sequence tagged sites (STS) markers. Because some markers could not be mapped to the assembly by the Ensembl analysis and annotation pipeline, some QTLs may be represented by just one marker, while others are not shown at all. The loci are annotated as red blocks, with the name of the trait displayed on the block if there is enough space. Where only one of the defining markers could be mapped, a red block of arbitrary size 1 Mb is drawn around it. Pointing to a block produces a pop-up menu with the name of the trait and a link to either the Rat Genome Database (RGD) or the RatMap resource for further information.

CpG Islands

Indicates a CpG island. Pointing with the mouse over an island provides the score and the location as additional information. The Ensembl analysis and annotation pipeline uses the cpg program for the definition of CpG islands. This programme was developed by Gos Micklem and is essentially identical to the newcpgreport programme in the EMBOSS package.

For the inclusion of CpG islands into the Ensembl database we require a minimum length of 1000 bp, a minimal observed/expected ratio of 0.6 and a minimal GC content of 50%, for human. For other species, the length cutoff may be different.
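The three human thresholds above can be expressed as a simple filter. The sketch below is not the cpg program. It uses the commonly quoted definition of the observed/expected ratio, (number of CpG dinucleotides x sequence length) / (number of C x number of G), which is an assumption here because the text does not spell out the formula. The function names are hypothetical.

```python
def cpg_stats(sequence):
    """Return (length, GC fraction, observed/expected CpG ratio)."""
    seq = sequence.upper()
    length = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # 'CG' cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / length if length else 0.0
    # Assumed definition: observed CpG count over the count expected
    # from the C and G frequencies alone.
    obs_exp = (cpg * length) / (c * g) if c and g else 0.0
    return length, gc_fraction, obs_exp


def passes_human_thresholds(sequence, min_length=1000, min_obs_exp=0.6, min_gc=0.5):
    """Apply the length, observed/expected and GC cut-offs quoted for human."""
    length, gc_fraction, obs_exp = cpg_stats(sequence)
    return length >= min_length and obs_exp >= min_obs_exp and gc_fraction >= min_gc
```

For other species, the `min_length` argument would be changed to the appropriate cut-off.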

Regulatory Features

The regulatory build provides a single "best guess" set of regulatory elements. These elements are based on the information contained within the ensembl-functional genomics database.

Regulatory features are built as a composite set of annotations based on co-occurrence analysis and classification of multiple genome-wide epigenomic data sets. Each feature was built on some or all of the following data sets:

Anchor/Focus Sets                               Data type    Source
DNase1 Hypersensitivity site                    ChIP-Seq     1
CCCTC-binding factor (CTCF)                     ChIP-Chip*   2
Histone 3 Lysine 4 Tri-Methylation (H3K4me3)    ChIP-Chip    3

Supporting Sets                                 Data type    Source
H4K20me3                                        ChIP-Chip    3
H3K27me3                                        ChIP-Chip    3
H3K36me3                                        ChIP-Chip    3
H3K79me3                                        ChIP-Chip    3
H3K9me3                                         ChIP-Chip    3

The anchor/focus sets were chosen to define a set of regions as potentially regulatory and for their previously known specific properties, including DNaseI as a marker of open chromatin, the association of H3K4me3 with active promoters, and the association of CTCF with "insulator regions".

In short, the Regulatory Build process performs an overlap analysis on each anchor/focus set with respect to each other and each of the supporting sets. The results of this analysis were then combined into one 'RegulatoryFeature' set, merging constituent feature boundaries up to a maximum of 4 kb and integrating information on proximity (<2.5 kb) to transcription start and end sites. The 4 kb limit was chosen to avoid chaining of large regulatory regions (e.g. the Hox cluster) in an attempt to provide more granularity over regions of interest. On breach of the 4 kb maximum length limit, features were broken down into H3K4me3 elements, or CTCF elements where appropriate.

The composite regulatory features were then classified by the patterns of data sets observed across each regulatory feature. Some preliminary analysis has identified the following combinations that are common and strongly associated with other annotated features in Ensembl:

We have used these patterns as the basis of the annotations used in this version of the Ensembl Regulatory Build.

Data source citations:

1. Genome-wide identification of DNaseI hypersensitive sites was performed by Greg Crawford and Terry Furey (Duke University) using a whole genome DNase-sequencing protocol (Crawford et al., Genome Research 2006).
DNase-sequencing was performed using the Illumina (Solexa) sequencing by synthesis method from a DNase treated library generated from the GM06990 cell line (Crawford and Furey, unpublished). A Parzen density estimator used density of sequences in regions to generate scores indicating the presence of DNaseI hypersensitive sites.

2. Kim, T.H.; Abdullaev, Z.K.; Smith, A.D.; Ching, K.A.; Loukinov, D.I.; Green, R.D.; Zhang, M.Q.; Lobanenkov, V.V. & Ren, B.
Analysis of the vertebrate insulator protein CTCF-binding sites in the human genome.
Cell, 2007, 128, 1231-1245

3. Hirst, M; Hurd, P.J.; Bainbridge, M.; Robertson, G.; Kirmizis, A.; Nelson, C.; Zhao, Y.; Zeng, T.; Pandoh, P.; Tam, A.; Prabhu, A.; Dhalla, N.; Sa, D.; Delaney, A.; Bilenky, M.; Jones, S.; Kouzarides, T.; Marra, M. (In preparation)

* The CTCF data was processed with the Nessie HMM (Flicek, unpublished).

CTCF

Enriched sites were identified by the nessie algorithm for ChIP-chip data analysis (Flicek, unpublished). For this analysis, nessie uses a two-state hidden Markov Model.

Raw data from tiling array experiments are normalised and displayed as simple wiggle tracks. These data are supplied to support, and give a visual reference for, the associated annotated features track. The default normalisation uses the VSN (Variance Stabilisation Normalisation) package from Bioconductor, which performs a generalised log transformation. The transformed score roughly equates to the difference between the control and experimental values at low signal and changes smoothly to the (log) ratio between the values at high, i.e. significant, signal. This minimises anomalies arising from low-signal pairs giving high ratio scores.
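The behaviour of a generalised log transformation at low and high signal can be checked numerically. The sketch below uses the inverse hyperbolic sine, one common form of the generalised log, with a fixed scale parameter `c`. It does not reproduce the VSN package, which estimates its calibration parameters from the data; the function names and the value of `c` are assumptions made for this example.

```python
import math


def glog(x, c=1.0):
    """Generalised log as asinh(x / c): near-linear for x << c, log-like for x >> c."""
    return math.asinh(x / c)


def glog_difference(experiment, control, c=1.0):
    """Score for one probe: difference of the transformed intensities."""
    return glog(experiment, c) - glog(control, c)


# High signal: the score approaches the log of the ratio (here log 2).
high = glog_difference(2000.0, 1000.0)

# Low signal: the same two-fold ratio gives a score close to the plain
# difference of the values (0.01), not log 2.
low = glog_difference(0.02, 0.01)
```

A two-fold change between two very weak intensities therefore contributes a much smaller score than the same fold change between two strong intensities.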

The CTCF data source is:

Kim, T.H.; Abdullaev, Z.K.; Smith, A.D.; Ching, K.A.; Loukinov, D.I.; Green, R.D.; Zhang, M.Q.; Lobanenkov, V.V. & Ren, B.
Analysis of the vertebrate insulator protein CTCF-binding sites in the human genome.
Cell, 2007, 128, 1231-1245

cisRED/miRanda/FlyReg

For human, this track displays regulatory features imported from the cisRED database and microRNA regulatory features resulting from the miRanda analysis performed by Anton Enright's group at the Wellcome Trust Sanger Institute.

Those regions of the genome that were subjected to cisRED analysis are indicated in the track 'cisRED search regions'.

For fly this track displays regulatory features based on a curated set of transcription factor binding sites imported from the Drosophila DNase I Footprint Database and a set of 120 likely transcription factor binding sites identified from a large set of Drosophila promoter regions, using the Tiffin pipeline.

Many of these motifs could be correlated to patterns of embryonic gene expression. Regulatory regions on the Drosophila melanogaster genome are predicted using a phylogenetic HMM, then scanned using the motif set to identify probable transcription factor binding sites.

Single Nucleotide Polymorphisms (SNPs)

Single Nucleotide Polymorphisms and other sequence variations are mapped to the genome sequence. Small insertions and deletions (in-dels) are annotated with small triangles, while SNPs are represented by vertical bars and colour-coded as follows:

When zoomed in on a small region, as in the 'Basepair View' panel, the ambiguity code for the SNP is displayed.
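An ambiguity code is the single IUPAC letter that stands for the set of bases observed at a position. The following sketch maps an allele string to that letter. The 'A/G' allele notation and the function name `ambiguity_code` are assumptions for this example, and in-dels or multi-base alleles are deliberately rejected.

```python
# IUPAC nucleotide ambiguity codes, keyed by the set of bases they represent.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def ambiguity_code(alleles):
    """Return the IUPAC letter for an allele string such as 'A/G'."""
    bases = frozenset(allele.upper() for allele in alleles.split("/"))
    try:
        return IUPAC[bases]
    except KeyError:
        # Raised for in-dels ('-') and alleles longer than one base.
        raise ValueError(f"no single-letter code for alleles {alleles!r}") from None
```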

Pointing at a genetic variation representation in 'Detailed View' or 'Basepair View' will display a pop-up menu with the SNP identifier at the top and clickable 'SNP properties' link to Ensembl 'SNPView'. Depending on the source of the variation data, a summary about variation properties and links to HGBASE data, TSC-CSHL data and dbSNP data are available. Ensembl 'SNPView' is also reached by directly clicking on the feature.

Please note that the alleles and ambiguity codes for genetic variations shown in 'ContigView' and 'SNPView' are identical to those in the NCBI dbSNP entry. To see the alleles and effects as appropriate to a transcript or protein, look at the SNP information in 'TransView' and 'ProteinView'.

Repeats

Repeats - Repetitive sequence regions of all classes are annotated in this track. Most tracks are characterised by running the RepeatMasker program, while the Tandem Repeats Finder generates the 'tandem repeats' track. Mouse-over on an individual repeat element brings up additional information. Tracks may be switched on or off using the 'Repeats' pull-down menu on the 'menu bar'.

Repeat Sub-Classes - Sub-classes of repeats like Dust, LTRs, Low complexity regions, simple repeats, RNA repeats can be selected from the 'Repeats' pull-down menu on the 'menu bar'.

Tile Path and other Clone Sets

Tile path - The tile path track in human Ensembl shows the location of BAC clones within the current genome sequence assembly. Clones for which fluorescence in situ hybridisation (FISH) mapping information is available are marked with a black triangle in the top left corner. Where a clone is shown in outline, the mapping of the clone to the sequence assembly is problematic and the true length is not displayed. Mouse-over brings up information about a particular clone. More information about clones is available from Ensembl 'CytoView', which can be reached via the 'Jump to...' menu on the menu bar above the 'Detailed View' panel.

Acc clones - The accessioned clones track displays BAC clones for which some sequence has been deposited in the nucleotide sequence databases. Coloured bars represent BAC clones and pointing at them displays a pop-up window with more information about the clone, including its EMBL nucleotide sequence database accession number, sequencing status and estimated length. The segment of the sequence assembly to which the clone has been mapped can be exported by going to ExportView, selecting 'Sequence ID' in the Feature section, and entering the international clone name or its accession number. To retrieve the actual sequence deposited in the sequence databases, use EMBL or GenBank directly. Note that these clones represent only a small proportion of the BAC clones positioned on the complete FPC-based clone map. To view other clones, go to Ensembl CytoView using the 'Jump to' pull-down menu on the menu bar above the 'Detailed View' panel.

Vertical black lines at the left or right ends of a clone indicate that the BAC end sequence of that end has been matched to this point on the assembly. The length of the black line underneath a clone indicates its length estimated by fingerprinting. The line length may differ from the length of the coloured bar, because the clone lengths as shown by bars have been adjusted so that the entire FPC map will fit around the points at which clones on the map have been matched to the assembly. If the BAC is represented by an outline instead of a coloured bar, the adjusted clone length is unrealistically large and should be treated with caution. The different length estimates are shown in the pop-up window list.

Fosmid Map - For mouse, end sequences of WIBR-1 fosmid library clones have been mapped to the genome sequence.

Gaps

Shows gaps in the current sequence assembly. Where possible, gaps are categorised as:

Pointing with the mouse shows the category, and the size and position of the gap in the assembly.

%GC

The plot shows the relative content of the nucleotides G+C along the genome sequence. The horizontal red line indicates 50% G+C.
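A plot of this kind can be derived from the sequence by computing the G+C percentage in consecutive windows. The sketch below shows the idea. The window size, the use of non-overlapping windows and the function name are assumptions, since the text does not say how the track is calculated.

```python
def gc_percent_windows(sequence, window=100):
    """Return the percentage of G+C in consecutive, non-overlapping windows."""
    seq = sequence.upper()
    values = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]  # the last window may be shorter
        gc = chunk.count("G") + chunk.count("C")
        # Every character, including any N, counts towards the window length.
        values.append(100.0 * gc / len(chunk))
    return values
```

Values above 50.0 would be drawn above the horizontal red line, values below 50.0 beneath it.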

DAS Sources

The Distributed Annotation System (DAS) provides a way of displaying external annotation.

A pre-configured, species-specific set of external data sources is available from the 'DAS sources' pull-down menu. Tick to select individual sources. The selected tracks will then be displayed after the menu has been closed. Brief descriptions for pre-configured DAS sources are available from the 'Apropos: Genome DAS' document.

In addition to the pre-configured sources it is also possible to add DAS sources from external DAS servers via the 'Manage sources' option in the 'DAS Sources' menu.

You can also upload your own data sets into a DAS server provided by Ensembl using the 'Upload data' menu item.

Generally, all DAS tracks in Ensembl 'ContigView' and 'CytoView' panels have blue name labels. Pointing to features in most DAS tracks will produce a pop-up menu showing an identifier, and one or more links to view an associated sequence in FASTA format if appropriate. Ensembl 'FASTAView' pages also provide, where possible, a brief description of the data source and a link to a web page from the group that provided the data for the DAS track.

URL Sources

Ensembl allows for attachment and display of smaller user data sets via the simple URL source mechanism. Data sets obeying simple formatting rules are generally placed within user directories that are exported via web servers. URLs corresponding to these data files can be attached from the "URL-based data" option in the 'DAS Sources' menu. Ensembl will then query the corresponding third-party web server before rendering the page.

Detailed data set formatting rules can be found in the corresponding help page.

Basepair View

The fourth panel is used to show features on a small segment of the assembly. By default, 'Basepair View' shows a region of 100 bp, taken from the centre of the 'Detailed View' panel. The display can be zoomed in or out, and moved left or right, using navigation controls similar to those of the 'Detailed View' panel. Clicking the plus or minus buttons zooms into or out of the region by a factor of approximately 2. These buttons allow zooming from as little as 1 bp up to 500 bp. The zooming ladder restricts or expands the field of view to a scale suitable for viewing any feature of interest. Individual steps of the ladder represent 25, 50, 100, 200, 300 and 500 bp of sequence, respectively. Regions larger than 500 bp can only be seen in the 'Detailed View' panel.

The 'Basepair View' panel can be turned on or off using the plus [+] or minus [-] button. The tracks can be switched off or on using the 'Options' pull-down menu on the gold bar of the 'Detailed View' panel. 'Basepair View' displays some of the tracks from 'Detailed View' (DNA (contigs), Transcripts and Genes, repeats and tile path) plus the following additional features:

Sequence

The forward strand of the genomic sequence is displayed above the DNA (contigs) track, the reverse strand below. Each nucleotide has a different background colour, to make it easier to visualise runs of bases:

The IUPAC single letter codes for the nucleotides are shown when there is room for display.

Amino acids

Raw translations of the assembled genome sequence. All three possible reading frames of each strand are shown: above the DNA (contigs) bar for the forward strand, below it for the reverse strand. Each amino acid has a different background colour, and amino acids with related physico-chemical properties have related shades:

The IUPAC single letter codes for the amino acids are shown when there is room for display.
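The raw translations can be reproduced for any short stretch of sequence. The sketch below translates the three forward frames and the three frames of the reverse complement using the standard genetic code. The use of the standard code, the frame numbering relative to the start of the supplied string and all function names are assumptions made for this example.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons; unknown codons (e.g. containing N) become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Return (forward frames, reverse frames), three translations each."""
    forward = [translate(seq[offset:]) for offset in range(3)]
    rc = reverse_complement(seq)
    reverse = [translate(rc[offset:]) for offset in range(3)]
    return forward, reverse
```

Stop codons appear as '*' in the output, which is also where the 'Start/Stop Codons' information relates to the raw translation.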

Start/Stop Codons

The positions of potential start and stop codons are displayed only when a region of less than 50 kb is shown in 'Detailed View'.

Restriction Enzymes

This track shows the sequence of potential restriction endonuclease cleavage sites together with the name of the enzyme. Enzymes that cut the DNA strands in a staggered fashion to produce 'sticky ends' are drawn in blue. Sites in green denote enzymes that cut the DNA strands at the same point to produce 'blunt ends'. Red vertical lines mark the expected cleavage positions, joined by horizontal lines to the recognition site where the cleavage site does not overlap the recognition site. Pointing to a site produces a pop-up window with the name of the enzyme and its general recognition sequence.
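The distinction between 'sticky' and 'blunt' cutters can be sketched for a handful of well-known enzymes with palindromic recognition sites. This is an illustration only, not the method used to build the track. The small enzyme table, the 0-based coordinates and the function name `find_sites` are assumptions for this example.

```python
# name -> (recognition sequence, cut position within the site on the top strand)
# Only palindromic sites are listed, so the bottom-strand cut can be derived.
ENZYMES = {
    "EcoRI": ("GAATTC", 1),  # G^AATTC
    "BamHI": ("GGATCC", 1),  # G^GATCC
    "SmaI": ("CCCGGG", 3),   # CCC^GGG
    "EcoRV": ("GATATC", 3),  # GAT^ATC
}


def find_sites(sequence, enzymes=ENZYMES):
    """Return sorted (site start, enzyme, end type, top cut, bottom cut) tuples.

    Coordinates are 0-based; a cut position is the index of the first base
    after the cut, expressed on the top strand.
    """
    seq = sequence.upper()
    hits = []
    for name, (site, top_cut) in enzymes.items():
        bottom_cut = len(site) - top_cut  # valid for palindromic sites only
        kind = "blunt" if top_cut == bottom_cut else "sticky"
        start = seq.find(site)
        while start != -1:
            hits.append((start, name, kind, start + top_cut, start + bottom_cut))
            start = seq.find(site, start + 1)
    return sorted(hits)
```

In the terms of the track description, hits labelled "sticky" correspond to the sites drawn in blue and hits labelled "blunt" to those drawn in green.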

Pop-up Menus

For most features in the 'Detailed View' panel, you can display extra information and links with the mouse pointer. Putting the mouse pointer over a feature ("mouse-over") brings up a pop-up menu window. The top menu item is a name or database identifier in bold text. Below this may be one or more information points (text in grey and not clickable), and one or more clickable links (text in colour).

So that the display does not become cluttered, the pop-up text windows stay on the screen for only about 6 seconds. Click on the X at top-right or the title bar to close the menu window immediately. The window will also disappear if you move the pointer onto a different feature. To re-display, move the pointer off the feature and then point again. To click on a displayed link, move the mouse pointer down the menu - the hand link symbol will appear when you are over a clickable link.

Hints: Does the pop-up window disappear when you try to move the pointer onto it? You have probably moved through another feature without realising it. Try moving slowly onto a feature and stop moving when the text appears. If you keep having problems, zoom in on the region you are exploring, so that features are not so close together. You can also click directly on most features to go straight to a default link.

The pop-up menus can be turned off by unchecking the 'show popup menus' option on the 'Options' pull-down menu in the 'Detailed View' 'Menu Bar'. This may speed your browsing. You will still be able to click on a feature to go to an appropriate information page. An alternative way of using the pop-up windows is also available. To have pop-up menus appearing only when clicking on a feature, check the '... popup on click' option in the 'Options' menu. However, if you plan to use Ensembl regularly it is well worth getting used to the default behaviour of the pop-up menus.


The search box at the top of the page allows you to search for any identifier present in Ensembl. For detailed instructions see the Ensembl 'TextView' page.