Quick Start

This section provides a quick guide to running the run_dbcan tool suite with example data and explains the output files generated.

1. Running Example Data

To run the dbCAN tool suite on the Escherichia coli K-12 strain MG1655 example data, use the following command. The input file EscheriaColiK12MG1655.fna contains the complete genome DNA sequence in FASTA format, prok specifies that the organism is a prokaryote, and --out_dir names the directory that receives the output files.

run_dbcan EscheriaColiK12MG1655.fna prok --out_dir output_EscheriaColiK12MG1655
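
If the command needs to be launched from a script, for example to process several genomes in turn, it can be wrapped as in the minimal Python sketch below. The helper names build_command and run are hypothetical and not part of run_dbcan; the sketch only assumes that run_dbcan is installed and on the PATH, and it uses no arguments beyond those shown in the command above.

```python
import shutil
import subprocess


def build_command(fasta, mode, out_dir):
    """Return the argument list for the run_dbcan call shown above."""
    # Only the positional arguments and --out_dir from this page are used.
    return ["run_dbcan", fasta, mode, "--out_dir", out_dir]


def run(fasta, mode, out_dir):
    """Run the command and raise an error if it exits with a non-zero status."""
    # Assumption: run_dbcan is installed and available on the PATH.
    if shutil.which("run_dbcan") is None:
        raise FileNotFoundError("run_dbcan was not found on PATH")
    subprocess.run(build_command(fasta, mode, out_dir), check=True)


cmd = build_command(
    "EscheriaColiK12MG1655.fna", "prok", "output_EscheriaColiK12MG1655"
)
print(" ".join(cmd))
```

Calling run(...) with the same three arguments would start the actual analysis; the sketch itself only builds and prints the command.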

2. Understanding the Output

After running the tool, several output files are generated in output_EscheriaColiK12MG1655, each with specific information:

uniInput: The unified input file for subsequent tools; Prodigal creates it when the input is a nucleotide sequence.
dbsub.out: Output from the dbCAN_sub run.
diamond.out: Results from the Diamond BLAST search.
hmmer.out: Output from the HMMER run.
tf.out: Diamond BLAST output predicting Transcription Factors (TFs) for CGCFinder.
tc.out: Diamond BLAST output predicting Transporter Classifications (TCs) for CGCFinder.
cgc.gff: GFF input file for CGCFinder.
cgc.out: Output from the CGCFinder run.
cgc_standard.out: Simplified version of cgc.out, containing columns like CGC_id, Type, Contig_id, Gene_id, Start, End, Strand, and Annotation.
overview.txt: Summarizes CAZyme predictions across tools, including SignalP results.
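
As a rough illustration of working with these files, the Python sketch below checks which of the files listed above are present in an output directory and groups the rows of a cgc_standard.out-like table by CGC_id. The file names and column names come from the list above. The tab delimiter, the presence of a header row, and the sample rows are assumptions made for illustration, not real run_dbcan output, and the helper names missing_outputs and read_cgc_standard are hypothetical.

```python
import csv
import io
from collections import defaultdict
from pathlib import Path

# File names as listed in this section.
EXPECTED_OUTPUTS = [
    "uniInput", "dbsub.out", "diamond.out", "hmmer.out", "tf.out",
    "tc.out", "cgc.gff", "cgc.out", "cgc_standard.out", "overview.txt",
]

# Column names as listed for cgc_standard.out.
COLUMNS = ["CGC_id", "Type", "Contig_id", "Gene_id",
           "Start", "End", "Strand", "Annotation"]


def missing_outputs(out_dir):
    """Return the expected file names that are absent from out_dir."""
    return [name for name in EXPECTED_OUTPUTS
            if not (Path(out_dir) / name).exists()]


def read_cgc_standard(handle):
    """Group table rows by CGC_id.

    Assumption: the table is tab-separated and starts with a header row.
    """
    clusters = defaultdict(list)
    for row in csv.DictReader(handle, delimiter="\t"):
        clusters[row["CGC_id"]].append(row)
    return dict(clusters)


# Made-up rows for illustration only.
rows = [
    ["CGC1", "CAZyme", "contig_1", "gene_10", "1000", "2500", "+", "example_CAZyme"],
    ["CGC1", "TC", "contig_1", "gene_11", "2600", "3900", "+", "example_TC"],
    ["CGC2", "TF", "contig_1", "gene_40", "9000", "9800", "-", "example_TF"],
]
sample = "\n".join("\t".join(r) for r in [COLUMNS] + rows) + "\n"

clusters = read_cgc_standard(io.StringIO(sample))
for cgc_id, genes in clusters.items():
    print(cgc_id, [gene["Type"] for gene in genes])
```

To apply the same functions to a real run, pass the output directory to missing_outputs and an open file handle for cgc_standard.out to read_cgc_standard, after confirming that the file layout matches the assumptions stated above.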
