cdhit_454: Identify artificial duplicates from metagenomic samples


454 metagenomic reads file.
    Upload a FASTA or FASTQ file of 454 metagenomic reads:

Sequence Identity Parameters
Sequence identity cut-off.

Algorithm Parameters
-g: sequence is clustered to the first cluster that meets the threshold (No / Yes).
-b: bandwidth of alignment.
-D: max size per indel.
-n: word_length.
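
The options above correspond to command-line flags, so a form submission can be pictured as an argument list. The following is a minimal sketch only: the executable name `cd-hit-454`, the `-i`/`-o` input and output flags, the `-c` flag for the identity cut-off, the file names, and the numeric values are assumptions for illustration and are not stated on this form. The helper `build_command` is hypothetical. The sketch assembles the arguments and prints them; it does not run the program.

```python
import shlex


def build_command(reads_path, output_path, identity=None, g=None,
                  band_width=None, max_indel=None, word_length=None):
    """Assemble an argument list for a clustering run.

    Options left as None are omitted so that the tool's own defaults apply.
    Executable name and the -i/-o/-c flags are assumptions (not on the form).
    """
    argv = ["cd-hit-454", "-i", reads_path, "-o", output_path]
    if identity is not None:
        argv += ["-c", str(identity)]      # sequence identity cut-off (assumed flag)
    if g is not None:
        # Passed through unchanged; how the form's No/Yes choice maps to a
        # numeric value is not stated on the form.
        argv += ["-g", str(g)]
    if band_width is not None:
        argv += ["-b", str(band_width)]    # bandwidth of alignment
    if max_indel is not None:
        argv += ["-D", str(max_indel)]     # max size per indel
    if word_length is not None:
        argv += ["-n", str(word_length)]   # word length
    return argv


# Illustrative values only; they are not recommendations.
cmd = build_command("reads.fna", "reads.nr", identity=0.98,
                    band_width=10, max_indel=1, word_length=10)
print(shlex.join(cmd))
```

Building a list rather than a single string keeps each value a separate argument, which avoids quoting problems if a file name contains spaces.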

Alignment Coverage Parameters
-aL: minimal alignment coverage (fraction) for the longer sequence
-AL: maximum unaligned part (amino acids/bases) for the longer sequence
-aS: minimal alignment coverage (fraction) for the shorter sequence
-AS: maximum unaligned part (amino acids/bases) for the shorter sequence
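
The two kinds of coverage limit can be read as simple arithmetic on one sequence: the fraction options bound aligned length divided by sequence length, and the unaligned-part options bound sequence length minus aligned length. The sketch below illustrates that reading only; it is not the program's code, and the function name `passes_coverage` and its arguments are hypothetical. The same check applies to the longer sequence (-aL, -AL) and the shorter one (-aS, -AS).

```python
def passes_coverage(aligned_len, seq_len, min_fraction=0.0, max_unaligned=None):
    """Check one sequence against a coverage fraction and an unaligned-length cap.

    aligned_len   -- number of positions of this sequence covered by the alignment
    seq_len       -- full length of this sequence
    min_fraction  -- minimal alignment coverage, as with -aL / -aS
    max_unaligned -- maximum unaligned part, as with -AL / -AS (None = no cap)
    """
    if seq_len <= 0 or not 0 <= aligned_len <= seq_len:
        raise ValueError("need 0 <= aligned_len <= seq_len and seq_len > 0")
    coverage = aligned_len / seq_len
    if coverage < min_fraction:
        return False
    if max_unaligned is not None and seq_len - aligned_len > max_unaligned:
        return False
    return True


# A 400-base sequence with 350 bases aligned: coverage is 0.875, unaligned part is 50.
print(passes_coverage(350, 400, min_fraction=0.9))    # fraction limit not met
print(passes_coverage(350, 400, max_unaligned=60))    # 50 unaligned bases is within 60
```

A fraction limit scales with read length, while an absolute cap does not, so the two can disagree on the same alignment, as in the example.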

Mail address for job checking.
Give your mail address:
    


Reference:
  1. Beifang Niu, Limin Fu, Shulei Sun and Weizhong Li. Artificial and natural duplicates in pyrosequencing reads of metagenomic data. BMC Bioinformatics 2010 11:187 doi:10.1186/1471-2105-11-187.
Contact: Beifang Niu & Weizhong Li