GenomeView is a stand-alone sequence browser specifically designed to visualize and manipulate a multitude of genomics data interactively. GenomeView enables users to dynamically browse high volumes of aligned short read data, with dynamic navigation and semantic zooming, from the whole genome level to the single nucleotide. At the same time, the tool enables visualization of whole genome alignments of dozens of genomes relative to a reference sequence. GenomeView is unique in its capability to interactively handle huge data sets consisting of dozens of aligned genomes, thousands of annotation features and millions of mapped short reads, both as a viewer and as an editor.
System requirements
Java 1.6u10+ is required to run the application. You can get a recent version from http://www.java.com. It is recommended that you have 1 GB of memory and a dual-core or better processor, but GenomeView will work with less.
Installation
The most straightforward way to start GenomeView is with Java Web Start.
Clicking the Web Start button will immediately launch the application.
Local installation
You can download the latest version from http://sourceforge.net/projects/genomeview/
Unpack the zip file to a directory and start GenomeView from that directory with
java -Xmx1g -jar genomeview-<version>.jar
where <version> is replaced by the appropriate version number.
This is a compatibility matrix of GenomeView with various OS/browser combinations. The Web Start version matrix only has the operating system as it is independent of the browser. The Applet version of GenomeView has many more combinations, not all of which work equally well.
A green mark means we have confirmed this combination to work. A red mark means we could not get it to work. Blanks in the table are combinations we have not been able to test; your help in filling the blanks is appreciated.
GenomeView requires at least Java 1.6u10. This is a higher version than the one available by default on some platforms. In those cases you need to update Java first.
Only major names are included as we have not yet received reports of any recent unsupported versions or minor versions. If you come across any, please let us know.
| Operating System | Status |
|---|---|
| Windows | ![]() |
| Linux | ![]() |
| OS X | ![]() |
There is a table per operating system as there are typically many flavors of each OS. If a version is missing, feel free to contact us and we'll add it. We try only to include major versions, but if some minor versions behave differently they will be included. For example Firefox 3.6.14 broke applet support completely.
| | Windows XP | Windows Vista | Windows 7 |
|---|---|---|---|
| Firefox 3.6.15+ | ![]() | | |
| Firefox 4 | ![]() | | |
| Firefox 5 | ![]() | ![]() | |
| Firefox 6 | ![]() | ![]() | ![]() |
| Firefox 7 | ![]() | ![]() | ![]() |
| Firefox 8 | ![]() | ![]() | ![]() |
| Chrome 12 | | | |
| Chrome 13 | ![]() | ![]() | |
| Internet Explorer 7 | | | |
| Internet Explorer 8 | ![]() | | |
| Internet Explorer 9 | ![]() | | |
| Safari 5.0 | ![]() | | |
| Safari 5.1 | ![]() | | |
| Opera 11 | ![]() | ![]() | |
| | Ubuntu 10.04 Lucid Lynx | Ubuntu 11.04 Natty Narwhal |
|---|---|---|
| Firefox 3.6.15+ | ![]() | |
| Firefox 4 | ![]() | |
| Firefox 5 | ![]() | |
| Chrome 12 | ![]() | |
| Chrome 13 | | |
| Internet Explorer 7 | | |
| Internet Explorer 8 | | |
| Internet Explorer 9 | | |
| Safari 5 | | |
| Opera 11 | ![]() | |
| | OS X 10.5.8 Leopard | OS X 10.6.8 Snow Leopard | OS X 10.7 Lion |
|---|---|---|---|
| Firefox 3.6.15+ | ![]() | | |
| Firefox 4 | * | | |
| Firefox 5 | * | | |
| Chrome 12 | * | | |
| Chrome 13 | * | | |
| Internet Explorer 7 | | | |
| Internet Explorer 8 | | | |
| Internet Explorer 9 | | | |
| Safari 5.0 | ![]() | ![]() | |
| Safari 5.1 | ![]() | * | |
| Opera 11 | | | |
*On Mac OS X, some browsers need a work-around to make the applet visible; the work-around is described in the instructions for Mac OS X testers below.
We need help to fill in the blanks.
To test the applet please go to http://genomeview.org/start/applet.html. If it starts GenomeView, you can load data from one of the demo instances, and the data is shown, then the GenomeView applet works on that OS/browser combination.
Send your experiences to support@genomeview.org and they will be included in the table.
Make sure you provide detailed version information for both the browser you tested and your operating system as well as your exact Java version.
If your experiences do not match the table, let us know and we'll investigate further.
For Mac OS X testers: if the regular test fails, please test the following work-around and report those results as well:
Press CMD+Shift and the top bar of the applet appears in the browser. Drag the applet out of the browser. It should now paint correctly.
Your name will appear here if you contribute to the compatibility matrix
Peter Sisk
Jon Goldberg
Ken Heyndrickx
Bram Verhelst
Michiel Van Bel
Navigation can either be done with the buttons in the toolbar (arrows and magnifying glasses), with the navigator panel (ruler at the top with a block box), with the mouse or with your keyboard.
Navigator
Drag the edges to zoom in/out. The cursor changes to a resize icon when you are at an edge. Drag the box or the half-circle handles to move to another place in the genome. Click somewhere on the ruler to move to that place.
Mouse
Keyboard
Navigation
You can use the arrows on the keypad of your keyboard to move around in the evidence and structure panels. The + and - sign buttons can be used to zoom in and out.
Various keys
Structure and evidence panel mouse actions
Structure panel specific actions
Evidence panel specific actions
Note: dragging is holding the mouse button and moving the mouse.
Sources
You can load data from files from your computer, or you can load data from a file that lives on the internet by its URL.
File formats
GenomeView supports a whole list of file formats. See the data format page for a complete list. GenomeView tries to limit the listed files to the supported data formats. However, there are many extensions in use and your files may be hidden. In that case, select "all files" from the drop-down list.
Some file formats can/need to be preprocessed for optimal performance.
Opening a file
Steps to open a file:
File > Load data > Local file > pick your file
Note that you can select multiple files at once.
There are two sample data files attached to this page you could use. One file contains genomic sequence, the other one annotation for this sequence. The data represents the mitochondrial DNA of C. elegans (WS200).
The videos below show how to open data from a local file and from a URL.
Watch the video full screen for best quality
Loading data from a URL
| Attachment | Size |
|---|---|
| CHROMOSOME_MtDNA.fasta | 13.75 KB |
| MtDNA.gff | 3.35 KB |
Currently several commonly used formats are recognized by GenomeView. GenomeView uses the identifiers present in each format to link different sources, so make sure that the identifiers match (case-sensitive).
Make sure you index your large files: reference genome, NGS data sets (SAM / BAM), annotation and read coverage plots. While this is not required, it will speed up browsing and loading data, and it will significantly reduce the amount of memory you need.
| Data type | File format | Index* | Max size** (unindexed***) | Max size** (indexed) | Comments |
|---|---|---|---|---|---|
| Reference sequence | fasta ¤ | Recommended | 50 MB | unlimited | GenomeView will automagically create an index for you if you don't have one. |
| Reference sequence | embl, genbank | Not possible | 50 MB | -- | EMBL and genbank are mixed file formats that can contain both annotation and reference sequence at the same time. |
| Annotation | gff ¤ | Not recommended | 50 MB | unlimited | |
| Annotation | embl, genbank | Not possible | 50 MB | -- | EMBL and genbank are mixed file formats that can contain both annotation and reference sequence at the same time. |
| Annotation | bed | Not possible | 50 MB or less | -- | By default data from a bed file is added to the CDS track. If you want it in a different track, you have to add a line at the top of the file: 'track name=Track_name'. No white-space is allowed in the track name. |
| Annotation | ptt, tbl | Not possible | 50 MB or less | -- | Other standard annotation formats GenomeView understands. |
| Annotation | various formats | Not possible | 50 MB or less | -- | GenomeView can directly parse the output of the following programs: Blast, GeneMark, TransTermHP, FindPeaks, MaqSNP, tRNA-scan. |
| Multiple genome alignment | maf ¤ | Recommended | 100 MB | unlimited | GenomeView will prompt you to create a compressed maf file and index it for you if you're trying to load an unindexed maf file. MAF is the recommended file format for whole genome alignment of large/complex genomes. |
| Multiple genome alignment | multi-fasta ¤ | Not possible | 100 MB | -- | Recommended for small/simple genomes with a near 1:1 relationship. |
| Multiple genome alignment | aln, ClustalW | Not possible | 100 MB | -- | |
| Sequence read alignment | bam ¤ | Required | -- | unlimited | GenomeView will prompt you if there is no index and will create one for you. |
| Sequence read alignment | MAQ, MapView, BroadSolexa | Not possible | 100 MB | -- | |
| Read coverage summary | tdf ¤ | Native | unlimited | unlimited | TDF files can be created with the tdformat tool that is available for download. |
| Read coverage summary | bigwig | Native | unlimited | unlimited | This format can be used for any wig file, not just read coverage. |
| Read coverage summary | pileup | Required | -- | unlimited | The pileup format becomes slow when you have extreme read depth (>5000 x coverage). |
| Read coverage summary | wig | Not possible | 50 MB | -- | We strongly recommend converting your wig files to bigwig or TDF. GenomeView can automatically convert wig files to TDF. Caveats: 'track' information should all be on a single line; 'browser' lines will be ignored as they are specific to the UCSC Genome Browser. WIG files need to be sorted by chromosome and by genomic coordinate within the chromosome. BedGraph as well as Wiggle_0 format is supported. For the wiggle_0 type, both variableStep and fixedStep should work. |
| Allele diversity summary | pileup ¤ | Required | -- | unlimited | The pileup format becomes slow when you have extreme read depth (>5000 x coverage). |
* Indicates whether this file format can/should be indexed.
** Recommended maximum file size. The first value is without index, the second with index. These values are only guidelines. When loading multiple data sets, you should add the sizes.
*** Unindexed data files can be gzip compressed.
¤ Recommended file format for this data type.
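For BED files that are loaded directly, the track line mentioned in the Comments column can be added with a small shell function. This is only a sketch: the function name add_bed_track_name and the file names are hypothetical, and it assumes the file does not already start with a track line.

```shell
# Hypothetical helper (not part of GenomeView): write a copy of a BED file
# to stdout with a 'track name=...' line at the top.
add_bed_track_name() {
    name="$1"
    bed="$2"
    case "$name" in
        '' | *[[:space:]]*)
            # No white-space is allowed in the track name
            echo "track name must be non-empty and free of white-space" >&2
            return 1
            ;;
    esac
    printf 'track name=%s\n' "$name"
    cat "$bed"
}
```

It could be used as: add_bed_track_name My_genes input.bed > named.bed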
(Modified) annotations can be saved as either GFF or EMBL.
All data that is loaded can be exported in its original format. This will not include modifications.
To be able to easily handle large vertebrate reference genomes, it is required that they are indexed. This can be done with the faidx command from the samtools package.
If you are also preparing HTS data sets in the BAM format, this step will also be part of that procedure, so either you move right to the short read preparation page or you can skip the step there whenever you're ready.
To index a fasta file you run
samtools faidx reference.fasta
Attention
If your file is named reference.fasta, GenomeView will search for reference.fasta.fai in the same directory. If you want to be able to load large files, make sure those two files are correctly named and in the same folder.
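The naming rule can be checked from the command line before loading a large reference. The function below is a minimal sketch with a hypothetical name (has_fasta_index); it only tests that both files exist side by side and does not validate their contents.

```shell
# Hypothetical helper (not part of GenomeView or samtools): succeed only if
# the FASTA file and its .fai index are both present in the same directory.
has_fasta_index() {
    fasta="$1"
    # The index is expected under the full FASTA file name plus '.fai'
    [ -f "$fasta" ] && [ -f "$fasta.fai" ]
}
```

For example, has_fasta_index reference.fasta || samtools faidx reference.fasta would create the index only when it is missing.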
Large feature files need to be indexed before you can use them properly in GenomeView.
The definition of large is not strict in the sense that it depends on both the real size of the file, as well as the number of features in the file.
Recommendations:
Instructions:
To index a file, you need to pre-process it with tabix, much like is done with pile-up files.
Tabix can be downloaded from the tabix download page.
For BED formatted files:
sort -k1,1 -k2,2n input.bed | bgzip -c > compressed.bed.gz
tabix -p bed compressed.bed.gz
Note that indexing will not work with BED files that have a UCSC header ("track name=blah").
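One way to deal with such a header is to drop it before sorting. The sketch below uses a hypothetical function name (strip_and_sort_bed); it removes UCSC 'track' lines and, as an extra assumption, 'browser' lines, and then applies the same sort keys as the command above. Without the track line the data ends up in the default track.

```shell
# Hypothetical helper: remove UCSC header lines from a BED file and sort the
# remaining records by sequence name, then numerically by start position.
strip_and_sort_bed() {
    grep -v -e '^track[[:space:]]' -e '^browser[[:space:]]' "$1" |
        sort -k1,1 -k2,2n
}
```

Its output can be piped into bgzip in place of the plain sort, for example: strip_and_sort_bed input.bed | bgzip -c > compressed.bed.gz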
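For GFF formatted files (the -T option sets the directory for temporary sort files; replace /group/tmp with a directory that exists on your system):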
sort -T /group/tmp -k1,1 -k4,4n input.gff | bgzip -c > compressed.gff.gz
tabix -p gff compressed.gff.gz
In both cases, you will get two new files: (1) a gz file and (2) a tbi file.
Load the gz file in GenomeView.
Caveat:
The structure of genes will be lost when indexing gff files.
The best format to present short read alignments to GenomeView is the SAM/BAM format, which is emerging as the standard.
There are a number of tools available to convert the output from numerous aligners to SAM on the SAMtools website.
Once you have a SAM file you need to convert it to BAM and index it. Let us suppose you have a reference sequence called 'reference.fasta' and a read alignment in SAM format called 'alignment.sam'.
Steps to get from the various aligner formats to the SAM format are available on the SAMtools website.
Steps to go from SAM to indexed BAM.
(will create reference.fasta.fai for the next step)
samtools faidx reference.fasta
samtools view -bS -t reference.fasta.fai alignment.sam -o alignment.bam
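(will create sorted.bam; this is the samtools 0.1.x syntax, samtools 1.3 and later expect samtools sort -o sorted.bam alignment.bam instead)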
samtools sort alignment.bam sorted
(will create sorted.bam.bai, which is read by GenomeView together with the bam file)
samtools index sorted.bam
If you are looking for assistance to load your BAM file, see the short read alignment preparation page
Please see the description of the pile-up track for more information on what can be done with the pile-up track.
There are three formats supported for pileups. The first one is generated with a specific tool that is available from this page. The second one can be generated by samtools, the final one is a simple tab delimited file format. All are explained below, links to samtools and tabix can be found at the bottom of this page.
Important: TDF should not be indexed. The samtools pileup and tab delimited format MUST be indexed before GenomeView understands them.
TDF is a tiled data format which contains the coverage plot, as well as multiple resolution summaries which allows fast retrieval at any scale.
Download the latest version of tdformat, a small program to generate TDF files from BAM files. The BAM file has to be indexed, i.e. there has to be a BAI file as well.
Once you've downloaded and extracted the program (you need at least the lib folder and the tdformat jar file) you can invoke it with the following commands:
java -Xmx1g -jar tdformat-1576.jar <path to your BAM file>
Replace 1576 with the version number of the file you downloaded.
For large genomes, mammalian genomes for example, you may need to increase the memory allotment for the program:
java -Xmx4g -jar tdformat-1576.jar <path to your BAM file>
The TDF format does not have to be indexed.
Note: file name extension should contain .pileup
The first step to be able to browse a pileup is to generate one from your BAM file.
samtools pileup -f reference.fasta sorted.bam >sorted.pileup
As you run this command, you'll see that the generated file can be huge, even for small BAM files.
To be able to browse it in GenomeView, it needs to be indexed with tabix, a tool that is also available from the SAMtools web page.
sort -k1,1 -k2,2n sorted.pileup | bgzip -c > compressed.pileup.bgz
tabix -s 1 -b 2 -e 2 compressed.pileup.bgz
The file should be organized in four columns.
The first column holds the identifier of the sequence, the second column contains the genomic position, the third column contains the number of reads on the forward strand, the final column contains the number of reads on the reverse strand.
Example:
chr1 11 46 43
chr1 12 47 50
chr1 13 48 61
chr1 14 53 79
Note that the columns are separated by tabs, with exactly one tab between each pair of columns.
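Because a stray space instead of a tab is easy to miss, it can help to check the file before indexing it. The function below is a sketch with a hypothetical name (check_coverage_file). It assumes that position and both read counts are non-negative integers, which matches the example but is otherwise an assumption.

```shell
# Hypothetical helper: succeed only if every line has exactly four
# tab-separated columns and columns 2 to 4 are non-negative integers.
# Problems are reported on stderr.
check_coverage_file() {
    awk -F '\t' '
        NF != 4 || $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ || $4 !~ /^[0-9]+$/ {
            printf "line %d: expected id, position, forward, reverse separated by tabs\n", NR
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    ' "$1" >&2
}
```

A file that passes the check can then be sorted, compressed and indexed.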
Once you have such a file, you can again index it for faster access and shorter download times.
sort -T . -k1,1 -k2,2n filename | bgzip -c > filename.bgz
tabix -s 1 -b 2 -e 2 filename.bgz
This page introduces the different components of GenomeView and explains a number of naming conventions we use throughout the documentation.
Components of GenomeView
The GenomeView GUI is divided into two columns. The left side is a graphical representation of the data, while on the right side you can find additional information, controllers and options in the form of tables.
Visual description of the user interface
All visualizations in GenomeView are organized into tracks. A track typically holds one particular type of data or one particular data set. There can be multiple tracks of each type.
When loading new data, a new track is added.
On the right side of the window there is an overview of all tracks that are currently available.

You can reorder the tracks by dragging them up and down in this table, hide them by clicking the eye icon or remove them with the garbage bin icon.
Gene structure track
This track shows a number of things, some of which are only visible when you are sufficiently zoomed in.
Things to know about this track:
The feature track can display a multitude of annotation information, supplied as GFF or BED files. Features like CDS, RNA, SNP, etc... are displayed as rectangles in different colors. A triangle on one side can indicate the strand. When zoomed in enough, feature names are displayed when available.

Short reads are displayed in the Short read track as colored boxes that are in some cases connected with pink lines. The pictures below should give you an idea of the meaning of the various visual cues.
Short read track, zooming in from left to right
Default color scheme
| Color | Description |
|---|---|
| Green | Read mapped to the forward strand from a sense fragment in a PE library or from a single end library |
| Blue | Read mapped to the reverse strand from a sense fragment in a PE library or from a single end library |
| Cyan | Read mapped to the reverse strand from an anti-sense fragment in a PE library |
| Orange | Read mapped to the forward strand from an anti-sense fragment in a PE library |
| Yellow | Mismatch between the read and the reference, the read nucleotide will be shown when zoomed in |
| Red | Gap/deletion in the read |
| Black | Insertion in the read. Hover over them to see inserted bases. |
| Gray | Insertion in the read that is a multiple of 3. Hover over them to see inserted bases. |
| Purple/Pink | Connection between two reads from a paired-end library (thin line), or connection between parts of a single read aligned over a splice junction (thick line). Both the PE connections and splice junctions ones will be shown simultaneously in data sets that have that information. |
Note that some older alignment software does not include the correct information in the BAM file and that the color scheme may be off for those files. Use common sense when interpreting results!
Overview of visual cues in the short read track
Hovering over reads shows detailed information about the read
The pile-up track can consist of two parts. The first one, the coverage plot, is always present; the second, the SNP plot, is only displayed if the loaded data set has the required information.
Typically coverage-only data files are TDF files, while coverage+SNP files are prepared using samtools pileup. More information is available on the page about preparing pile-ups.
Multi-fasta/ClustalW multiple alignment
Multi-fasta data can be displayed on three zoom levels.

MAF formatted multiple alignment
Details on the MAF format
Demo video showing the multiple alignment track.
Watch video full screen in HD mode for best quality, the video contains no sound
Multiple alignments can be displayed in three zoom levels.
The most detailed level shows mismatches and gaps for each alignment. Hovering over the track displays the names of the species on the left.

On the middle level, we can still hover over the track to see the species. An alignment on the forward strand is drawn in green, one on the reverse strand in blue.

When we zoom even further out, the alignments are displayed in gray. The more species align to a certain part of the reference sequence, the longer the gray line. Individual species are not displayed anymore.

Zooming out further than this will not display alignments anymore, for performance reasons.
Color key:
| Color | Meaning |
|---|---|
| Gray | Mismatch with reference |
| Red | Gap in alignment |
| Green | Alignment to forward strand |
| Blue | Alignment to reverse strand |
No screenshots or description available yet.
Plugins are the basic extension mechanism to create new functionality for GenomeView, without modifying the core application.
Plugins are placed in the plugin subdirectory of the .genomeview directory. You should have run GenomeView at least once for these directories to exist.
We provide a number of demo instances of GenomeView that come with data preloaded.
Furthermore we provide a significant number of pre-loaded GenomeView instances through the Genome Explorer.
Here is a list of a couple of places where you can find data that may be of interest. This data will generally require some processing to get it into one of the standard file formats.
If you would like any of the genomes at UCSC or Ensembl included in the Genome Explorer, drop us an e-mail, we have scripts to automate the download and data massaging.
UCSC Genome Browser downloads
Ensembl downloads
The Sequence Read Archive (SRA) is a repository that stores raw sequencing data from next-generation sequencing platforms.
EBI Sequence Read Archive is the European cousin of the SRA.
The UCSC data repository has a large number of whole genome multiple alignments. Look under the heading 'Multiple Alignments' for each species.
Some examples:
http://hgdownload.cse.ucsc.edu/goldenPath/hg19/multiz46way/maf/ hg19 aligned to 45 vertebrates
http://hgdownload.cse.ucsc.edu/goldenPath/hg18/multiz44way/maf/ hg18 aligned to 43 vertebrates
http://hgdownload.cse.ucsc.edu/goldenPath/hg18/multiz28way/maf/ hg18 aligned to 27 vertebrates
http://hgdownload.cse.ucsc.edu/goldenPath/dm2/multiz15way/ dm3 aligned to 14 other insects