Date: 24.10.2020

A new version of the RepeatExplorer pipeline introduced in Nature Protocols

RepeatExplorer is a computational pipeline designed to identify and quantify repetitive sequences in eukaryotic genomes. The pipeline was developed by scientists from the Laboratory of Molecular Cytogenetics of the Biology Centre seven years ago, and since then it has become widely adopted for repetitive DNA analysis in plant and animal genomes. Its new version, featuring several major improvements and associated workflows, has been published in the current issue of the prestigious journal Nature Protocols.

Repetitive DNA sequences constitute significant parts of most plant and animal genomes. Due to their high abundance and sequence variability, they are inherently difficult to characterize. RepeatExplorer employs unique algorithms that overcome these difficulties and perform repeat characterization directly from unassembled short sequence reads generated from the genome of interest. To date, the pipeline has been used in hundreds of scientific projects, ranging from repeat characterization in a single species and comparison of repeats between multiple genomes, to investigations of centromeres and sex chromosomes, and evolutionary studies.

The new version of the pipeline, RepeatExplorer2, includes major improvements to existing programs and introduces several novel tools. The most significant is an automatic repeat classification module utilizing REXdb, a novel database of retrotransposon-coded protein domains. There are also specific tools for analyzing satellite repeats and data from ChIP-seq experiments.

Thanks to the active involvement of the Biology Centre in the ELIXIR-CZ infrastructure (https://www.elixir-czech.cz/), all RepeatExplorer versions and associated computational tools are available to the public on a dedicated server (https://repeatexplorer-elixir.cerit-sc.cz/). The server provides access to the tools via the user-friendly Galaxy platform (https://galaxyproject.org/) and supplies the computational resources needed to run the pipeline.


Reference

Novák, P., Neumann, P., Macas, J. (2020) - Global analysis of repetitive DNA from unassembled sequence reads using RepeatExplorer2. Nature Protocols, http://dx.doi.org/10.1038/s41596-020-0400-y.

Free link to the full text: https://rdcu.be/b80Gr
