metagenome~clustering_pfam-annotation

Clusters assembled contigs by their abundance patterns across samples, then annotates each cluster with candidate species by aggregating the Pfam domains present in it.

input_1: Assembled contigs (FASTA)

input_1/covid-assembled.fasta

>ERR5697277_1_(paired)_contig_1
TTGTCAGGGTAATAAACACCACGTGTGAAAGAATTAGTGTATGCAGGGGGTAATTGAGTT
CTGGTTGTAAGATTAACACACTGACTAGAGACTAGTGGCAATAAAACAAGAAAAACAAAC
ATTGTTCGTTTAGTTGTTAACAAGAACATCACTAGAAATAACAACTCTGTTGTTTTCTCT
AATTATAAGTCTACCTTTACTAAGAAGAGATAAAATCATATCATTGATTTGACCTTCTTT
TAAAGACATAACAGCAGTACCCCTTAATTTAAGGGGAAATTTACTCATGTCAAATAAAGA
ATAGGAAGACAACTGAATTGGATTTGTATTCCTCCAAAATATGTAATTTGCATGCATGAC
ATAACCATCTATTTGTTCGCGTGGTTTGCCAAGATAATTACATCCAATTAAAAATGCTTC
AGATGATGACGCATTCACATTAGTAACAAAGGCTGTCCACCATGCGAAGTGTCCCATGAG
CTTATAAAGATCAGCATTCCAAGAATGTTCTGTTATCTTTATAGCCACGGAACCTCCAAG

input_2: Paired-end FASTQ (.gz)

input_2/ERR5697277_1.fastq.gz

input_2/ERR5697277_2.fastq.gz

input_2/ERR5707536_1.fastq.gz

input_2/ERR5707536_2.fastq.gz

input_2/SRR14280680_1.fastq.gz

input_2/SRR14280680_2.fastq.gz
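The paired-end files above follow a `<run>_1.fastq.gz` / `<run>_2.fastq.gz` naming pattern, and the abundance columns in the output are labelled with the `_1` file of each pair. As a minimal sketch (the function name `pair_fastq` is hypothetical, and the assumption that mates are matched purely by this naming pattern is inferred from the listing, not confirmed by the tool), the files can be grouped into pairs like this:

```python
import re
from collections import defaultdict

def pair_fastq(names):
    """Group FASTQ file names into (mate 1, mate 2) pairs keyed by run accession.

    Assumes the <run>_1.fastq(.gz) / <run>_2.fastq(.gz) naming seen in input_2/.
    """
    pattern = re.compile(r"^(?P<run>.+)_(?P<mate>[12])\.fastq(\.gz)?$")
    mates = defaultdict(dict)
    for name in names:
        match = pattern.match(name)
        if match:
            mates[match.group("run")][match.group("mate")] = name
    # Keep only runs for which both mates are present.
    return {
        run: (found["1"], found["2"])
        for run, found in sorted(mates.items())
        if "1" in found and "2" in found
    }

files = [
    "ERR5697277_1.fastq.gz", "ERR5697277_2.fastq.gz",
    "ERR5707536_1.fastq.gz", "ERR5707536_2.fastq.gz",
    "SRR14280680_1.fastq.gz", "SRR14280680_2.fastq.gz",
]
pairs = pair_fastq(files)
for run, (mate1, mate2) in pairs.items():
    print(run, mate1, mate2)
```

Each key in `pairs` corresponds to one sample, so three pairs here match the three abundance columns in OUTPUT.tsv.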

input_3: Single-end FASTQ (.gz)

Options

-c 4 -m 100 input_1/covid-assembled.fasta input_2/ input_3/

Output

OUTPUT.tsv

id	ERR5697277_1.fastq.gz	ERR5707536_1.fastq.gz	SRR14280680_1.fastq.gz	cluster_id	cluster size	Pfam annotation 1	Pfam annotation 2	Pfam annotation 3
ERR5697277_1_(paired)_contig_1	2458.29	1099.59	263.416	1	33602	Viruses;ssRNA viruses;ssRNA positive-strand viruses, no DNA stage;Nidovirales;Coronaviridae;Coronavirinae;Betacoronavirus;Severe acute respiratory syndrome-related coronavirus;SARS coronavirus;SARS_coronavirus_taxid227859:0.9722591297859078	Viruses;ssRNA viruses;ssRNA positive-strand viruses, no DNA stage;Nidovirales;Coronaviridae;Coronavirinae;unclassified coronaviruses;Bat coronavirus BM48-31/BGR/2008;Bat_coronavirus_BM48-31_BGR_2008_taxid864596:0.9570711176584975	Viruses;ssRNA viruses;ssRNA positive-strand viruses, no DNA stage;Nidovirales;Coronaviridae;Coronavirinae;Betacoronavirus;Rousettus bat coronavirus HKU9;Rousettus_bat_coronavirus_HKU9_taxid694006:0.7272540125865317
ERR5697277_1_(paired)_contig_11	0.372654	0	0	1	33602	Viruses;ssRNA viruses;ssRNA positive-strand viruses, no DNA stage;Nidovirales;Coronaviridae;Coronavirinae;Betacoronavirus;Severe acute respiratory syndrome-related coronavirus;SARS coronavirus;SARS_coronavirus_taxid227859:0.9722591297859078	Viruses;ssRNA viruses;ssRNA positive-strand viruses, no DNA stage;Nidovirales;Coronaviridae;Coronavirinae;unclassified coronaviruses;Bat coronavirus BM48-31/BGR/2008;Bat_coronavirus_BM48-31_BGR_2008_taxid864596:0.9570711176584975	Viruses;ssRNA viruses;ssRNA positive-strand viruses, no DNA stage;Nidovirales;Coronaviridae;Coronavirinae;Betacoronavirus;Rousettus bat coronavirus HKU9;Rousettus_bat_coronavirus_HKU9_taxid694006:0.7272540125865317
ERR5697277_1_(paired)_contig_12	1.32468	0.657143	0	1	33602	Viruses;ssRNA viruses;ssRNA positive-strand viruses, no DNA stage;Nidovirales;Coronaviridae;Coronavirinae;Betacoronavirus;Severe acute respiratory syndrome-related coronavirus;SARS coronavirus;SARS_coronavirus_taxid227859:0.9722591297859078	Viruses;ssRNA viruses;ssRNA positive-strand viruses, no DNA stage;Nidovirales;Coronaviridae;Coronavirinae;unclassified coronaviruses;Bat coronavirus BM48-31/BGR/2008;Bat_coronavirus_BM48-31_BGR_2008_taxid864596:0.9570711176584975	Viruses;ssRNA viruses;ssRNA positive-strand viruses, no DNA stage;Nidovirales;Coronaviridae;Coronavirinae;Betacoronavirus;Rousettus bat coronavirus HKU9;Rousettus_bat_coronavirus_HKU9_taxid694006:0.7272540125865317
ERR5697277_1_(paired)_contig_13	10.6366	0.423944	0	1	33602	Viruses;ssRNA viruses;ssRNA positive-strand viruses, no DNA stage;Nidovirales;Coronaviridae;Coronavirinae;Betacoronavirus;Severe acute respiratory syndrome-related coronavirus;SARS coronavirus;SARS_coronavirus_taxid227859:0.9722591297859078	Viruses;ssRNA viruses;ssRNA positive-strand viruses, no DNA stage;Nidovirales;Coronaviridae;Coronavirinae;unclassified coronaviruses;Bat coronavirus BM48-31/BGR/2008;Bat_coronavirus_BM48-31_BGR_2008_taxid864596:0.9570711176584975	Viruses;ssRNA viruses;ssRNA positive-strand viruses, no DNA stage;Nidovirales;Coronavirid
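
Each row of OUTPUT.tsv holds a contig id, one abundance value per paired-end sample, the cluster_id and cluster size, and up to three annotations, each a semicolon-separated lineage followed by a colon and a number. The following is a minimal Python sketch for reading such a file and grouping contigs by cluster. The function names are hypothetical, the trailing number is treated as a score by assumption, and the embedded sample is a shortened stand-in (fewer columns, abbreviated lineage) rather than real output.

```python
import csv
import io
from collections import defaultdict

def parse_annotation(field):
    """Split 'rank;rank;...;name:number' into (list of ranks, float)."""
    # Split on the last colon only, so colons inside names would be preserved.
    lineage, _, score = field.rpartition(":")
    return lineage.split(";"), float(score)

def read_clusters(handle):
    """Group the rows of a tab-separated output table by their cluster_id."""
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    clusters = defaultdict(list)
    for row in reader:
        clusters[row["cluster_id"]].append(row)
    return clusters

# Shortened stand-in for OUTPUT.tsv: one sample column, one annotation column,
# and an abbreviated lineage.
sample = (
    "id\tERR5697277_1.fastq.gz\tcluster_id\tcluster size\tPfam annotation 1\n"
    "ERR5697277_1_(paired)_contig_1\t2458.29\t1\t33602\t"
    "Viruses;Coronaviridae;SARS_coronavirus_taxid227859:0.9722591297859078\n"
    "ERR5697277_1_(paired)_contig_11\t0.372654\t1\t33602\t"
    "Viruses;Coronaviridae;SARS_coronavirus_taxid227859:0.9722591297859078\n"
)

clusters = read_clusters(io.StringIO(sample))
for cluster_id, rows in clusters.items():
    ranks, score = parse_annotation(rows[0]["Pfam annotation 1"])
    print(cluster_id, len(rows), ranks[-1], score)
```

To read the real file, replace `io.StringIO(sample)` with `open("OUTPUT.tsv", newline="")`.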
