Visualize your genome assemblies

Bandage (a Bioinformatics Application for Navigating De novo Assembly Graphs Easily) is a program that creates interactive visualisations of sequence assemblies. When you assemble a genome with your favorite assembler, it typically builds an assembly graph, from which the contigs are then derived.

How can visualizing your assemblies help? A simple example is given by Ryan Wick (reprinted with permission): Imagine a bacterial genome that contains a single repeated element in two separate places (red) in the chromosome.

A researcher (who does not yet know the structure of the genome) sequences it, and the resulting 100 bp reads are assembled with a de novo assembler.

Because the repeated element is longer than the sequencing reads, the assembler cannot reconstruct the original genome as a single contig. Instead, it produces three contigs: one for the repeated sequence (even though it occurs twice) and one for each sequence between the repeated elements.
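To make the collapse of the repeat concrete, here is a minimal toy sketch in Python. It assumes a de Bruijn-style view of assembly in which the k-mer size stands in for the read length, which is a simplification and not necessarily how any particular assembler works. The genome string, the 6 bp repeat and the function names are all made up for illustration.

```python
def circular_kmers(genome, k):
    """Return every k-mer of a circular genome, one per start position."""
    doubled = genome + genome[:k - 1]
    return [doubled[i:i + k] for i in range(len(genome))]


def branching_nodes(genome, k):
    """Count (k-1)-mers that are followed by more than one distinct base.

    Each such node is a point where the graph forks, so a contig has to end.
    """
    successors = {}
    for kmer in circular_kmers(genome, k):
        successors.setdefault(kmer[:-1], set()).add(kmer[-1])
    return sum(1 for nxt in successors.values() if len(nxt) > 1)


# Hypothetical circular genome: unique1 + repeat + unique2 + repeat.
repeat = "GGGGGG"
genome = "AACCTT" + repeat + "TCTACA" + repeat

for k in (4, 8):
    print(f"k={k}: {branching_nodes(genome, k)} branching node(s)")
```

With a k-mer shorter than the repeat the graph forks, so the assembly breaks into several contigs. Once the k-mer is long enough to span the whole repeat plus a flanking base, the forks disappear in this toy genome.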

Given only the contigs, the relationship between these sequences is not clear. However, the assembly graph contains additional information which is made apparent in Bandage.

Two main genome structures can produce this type of graph: either two separate circular sequences that share a region in common, or a single larger circular sequence with an element that occurs twice.

Additional knowledge, such as information on the approximate size of the bacterial chromosome, can help the researcher to rule out the first alternative. In this way, Bandage has assisted in turning a fragmented assembly of three contigs into a completed genome of one sequence.
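The reasoning above can be sketched in a few lines of Python. This is only an illustration: the contig names (A, B, R), their lengths, the expected chromosome size and the 5% tolerance are all invented, and Bandage itself does not run code like this. The sketch encodes the three-contig graph, checks that both interpretations are valid walks through it, and then compares their sizes against the assumed prior knowledge.

```python
# Hypothetical contig lengths in bp: two unique contigs (A, B) and the repeat (R).
contig_lengths = {"A": 2_400_000, "B": 2_500_000, "R": 5_000}

# Links in the assembly graph: the repeat connects to both unique contigs.
links = {("A", "R"), ("R", "A"), ("B", "R"), ("R", "B")}


def is_valid_cycle(path, links):
    """True if every consecutive pair of contigs (wrapping around) is linked."""
    return all(
        (path[i], path[(i + 1) % len(path)]) in links for i in range(len(path))
    )


def replicon_sizes(paths, lengths):
    """Total length of each circular sequence in a hypothesis."""
    return [sum(lengths[c] for c in path) for path in paths]


def matches_expected(paths, lengths, expected, tolerance=0.05):
    """True if some circular sequence is within `tolerance` of the expected size."""
    return any(
        abs(size - expected) <= tolerance * expected
        for size in replicon_sizes(paths, lengths)
    )


# The two structures that can explain the graph.
hypotheses = {
    "one circular chromosome": [["A", "R", "B", "R"]],
    "two circular sequences": [["A", "R"], ["B", "R"]],
}

expected_size = 4_900_000  # assumed prior knowledge about the chromosome

for name, paths in hypotheses.items():
    assert all(is_valid_cycle(p, links) for p in paths)
    ok = matches_expected(paths, contig_lengths, expected_size)
    print(f"{name}: sizes={replicon_sizes(paths, contig_lengths)} plausible={ok}")
```

With these made-up numbers, both hypotheses are consistent with the graph, but only the single chromosome that passes through the repeat twice comes close to the expected size.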

If this seems interesting to you, check out Bandage’s website.

Reference

Wick R.R., Schultz M.B., Zobel J. & Holt K.E. (2015). Bandage: interactive visualization of de novo genome assemblies. Bioinformatics, 31(20), 3350-3352.
