RepeatModeler
RepeatModeler is a de novo transposable element family identification and modeling package. It is built around three core programs (RECON, RepeatScout, and LtrHarvest/Ltr_retriever) that identify repeat element boundaries and family relationships from sequence data.
As input, RepeatModeler takes a sequence database, which BuildDatabase creates
from an input sequence file. It produces three primary output files:
families.fa, families.stk, and rmod.log. families.fa contains the consensus
sequences, families.stk contains the seed alignments, and rmod.log contains a
summarized log of the run. Note that this program generates A LOT of temporary
files that ARE NOT cleaned up between runs, so you will have to clean up your
directory manually between runs. Runtimes are also long, so plan your resource
request accordingly. This module also loads BuildDatabase and
RepeatClassifier.
For additional information and examples, see the documentation.
ml biocontainers repeatmodeler
BuildDatabase -name fish fish.fa
RepeatModeler -database fish -threads # -LTRStruct
Parallel Capabilities: single core by default; multithreading options supported.
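
Because temporary files are not cleaned up between runs, one option is to run each job in its own throwaway directory and keep only the primary outputs. The following is a minimal sketch of that idea, not an official workflow. The variable names (GENOME, DB, THREADS, DRY_RUN), the helper functions (run, collect_outputs), the rmodeler_run directory prefix, and the thread count of 4 are all hypothetical; THREADS stands in for the # placeholder in the usage lines above. The sketch also assumes that the database and the output files are written to the directory the commands are launched from. By default it only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch only: run RepeatModeler inside a throwaway directory, keep the three
# primary outputs, then delete everything else.
set -eu

# Hypothetical settings; override them from the environment.
GENOME=${GENOME:-fish.fa}   # input FASTA (file name from the example above)
DB=${DB:-fish}              # database name given to BuildDatabase
THREADS=${THREADS:-4}       # assumed value; match your resource request
DRY_RUN=${DRY_RUN:-1}       # 1 = only print the commands, 0 = execute them

# Print a command, then execute it unless DRY_RUN is 1.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" != "1" ]; then
        "$@"
    fi
}

# Copy the primary outputs from directory $1 to directory $2.
# Globs are used in case the files carry a prefix such as the database name.
collect_outputs() {
    src=$1
    dest=$2
    mkdir -p "$dest"
    for f in "$src"/*families.fa "$src"/*families.stk "$src"/*rmod.log; do
        if [ -e "$f" ]; then
            cp "$f" "$dest"/
        fi
    done
}

main() {
    # Resolve the FASTA path before changing directory.
    genome_abs=$(cd "$(dirname "$GENOME")" && pwd)/$(basename "$GENOME")
    workdir=$(mktemp -d "$PWD/rmodeler_run.XXXXXX")
    (
        cd "$workdir"
        run BuildDatabase -name "$DB" "$genome_abs"
        run RepeatModeler -database "$DB" -threads "$THREADS" -LTRStruct
    )
    collect_outputs "$workdir" "$PWD/${DB}_results"
    rm -rf "$workdir"   # removes the database and all temporary files
}

main
```

Note that the final rm -rf also deletes the database created by BuildDatabase, so it would need to be rebuilt for a later run. If you want to keep it, copy it out alongside the outputs before the cleanup step.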