simple statistics of FASTA/Q files
QC~seqkit -c 8 -m 32 input_1/
file format type num_seqs sum_len min_len avg_len max_len Q1 Q2 Q3 sum_gap N50 Q20(%) Q30(%)
input_1//ecoli.fasta FASTA DNA 1 4,641,652 4,641,652 4,641,652 4,641,652 0 4,641,652 0 0 4,641,652 0 0
input_1//fastq_runid_4de97058536f00a50a5594d603041572795f8954_0.fastq FASTQ DNA 4,000 25,135,131 5 6,283.8 107,136 991 2,887 7,672 0 13,646 45.11 25.74
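The table above is whitespace-delimited with comma-grouped numbers, so it can be loaded for downstream checks with a few lines of Python. This is a minimal sketch, not part of the pipeline: the function name `parse_stats` and the `SAMPLE` constant are hypothetical, and it assumes that file paths contain no spaces and that every column after `type` is numeric.

```python
def parse_stats(text):
    """Parse whitespace-delimited stats output into a list of dicts."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    records = []
    for line in lines[1:]:
        fields = line.split()
        if len(fields) != len(header):
            raise ValueError(
                f"expected {len(header)} fields, got {len(fields)}: {line!r}"
            )
        row = dict(zip(header, fields))
        # Assumption: columns after file, format and type are numeric.
        # Strip the thousands separators before converting.
        for key in header[3:]:
            value = row[key].replace(",", "")
            row[key] = float(value) if "." in value else int(value)
        records.append(row)
    return records


# Sample copied from the output shown above.
SAMPLE = """\
file format type num_seqs sum_len min_len avg_len max_len Q1 Q2 Q3 sum_gap N50 Q20(%) Q30(%)
input_1//ecoli.fasta FASTA DNA 1 4,641,652 4,641,652 4,641,652 4,641,652 0 4,641,652 0 0 4,641,652 0 0
input_1//fastq_runid_4de97058536f00a50a5594d603041572795f8954_0.fastq FASTQ DNA 4,000 25,135,131 5 6,283.8 107,136 991 2,887 7,672 0 13,646 45.11 25.74
"""

for rec in parse_stats(SAMPLE):
    print(rec["file"], rec["num_seqs"], rec["N50"])
```

Keeping the header row as the source of the dictionary keys means the sketch does not need to change if a column is added or removed in another version of the tool.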
pp QC~seqkit -c 8 -m 32 input_1/
Checking the realpath of input files.
0 input_1/
1 /suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/QC~seqkit/input_1/ecoli.fasta
1 /suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/QC~seqkit/input_1/fastq_runid_4de97058536f00a50a5594d603041572795f8954_0.fastq
/home/yoshitake.kazutoshi/files/m256y -> /suikou/files/m256y/yoshitake.kazutoshi/work
/home/yoshitake.kazutoshi/files/m256y/pp-dev -> /suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev
/home/yoshitake.kazutoshi/files/m256y/pp-dev/yoshitake -> /suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev/yoshitake
/home/yoshitake.kazutoshi/files/m256y/pp-dev/yoshitake/test -> /suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev/yoshitake/test
/home/yoshitake.kazutoshi/files/m256y/pp-dev/yoshitake/test/QC~seqkit -> /suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/QC~seqkit
/suikou/files/m768/yoshitake.kazutoshi/work/ecoli
/suikou/files/m256y/yoshitake.kazutoshi/work
/suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev
/suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev/yoshitake
/suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev/yoshitake/test
/suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/QC~seqkit
centos:centos6
quay.io/biocontainers/seqkit:0.12.1--0
using docker
file format type num_seqs sum_len min_len avg_len max_len Q1 Q2 Q3 sum_gap N50 Q20(%) Q30(%)
input_1//ecoli.fasta FASTA DNA 1 4,641,652 4,641,652 4,641,652 4,641,652 0 4,641,652 0 0 4,641,652 0 0
input_1//fastq_runid_4de97058536f00a50a5594d603041572795f8954_0.fastq FASTQ DNA 4,000 25,135,131 5 6,283.8 107,136 991 2,887 7,672 0 13,646 45.11 25.74

 1. file      input file, "-" for STDIN
 2. format    FASTA or FASTQ
 3. type      DNA, RNA, Protein or Unlimit
 4. num_seqs  number of sequences
 5. sum_len   number of bases or residues, with gaps or spaces counted
 6. min_len   minimal sequence length, with gaps or spaces counted
 7. avg_len   average sequence length, with gaps or spaces counted
 8. max_len   maximal sequence length, with gaps or spaces counted
 9. Q1        first quartile of sequence length, with gaps or spaces counted
10. Q2        median of sequence length, with gaps or spaces counted
11. Q3        third quartile of sequence length, with gaps or spaces counted
12. sum_gap   number of gaps
13. N50       N50. https://en.wikipedia.org/wiki/N50,_L50,_and_related_statistics#N50
14. Q20(%)    percentage of bases with the quality score greater than 20
15. Q30(%)    percentage of bases with the quality score greater than 30
16. GC(%)     percentage of GC content

PID: 1838626
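To make the N50 and Q20(%)/Q30(%) definitions in the legend concrete, the sketch below computes them from plain Python lists. It is an illustration under stated assumptions, not the tool's implementation: the function names `n50` and `percent_above` are hypothetical, quality strings are assumed to be Phred+33 encoded, and the comparison is strictly "greater than" because that is how the legend words it (the tool itself may treat the boundary differently).

```python
def n50(lengths):
    """Return the length L such that sequences of length >= L
    together cover at least half of all bases."""
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty input


def percent_above(quality_strings, threshold, offset=33):
    """Percentage of bases whose Phred score is strictly greater
    than `threshold`. Assumes Phred+33 encoding by default."""
    total = 0
    passed = 0
    for qual in quality_strings:
        for ch in qual:
            total += 1
            if ord(ch) - offset > threshold:
                passed += 1
    return 100.0 * passed / total if total else 0.0


# Toy inputs, not taken from the files above.
toy_lengths = [2, 3, 4, 5, 6]
toy_quals = ["II5+"]  # Phred scores 40, 40, 20, 10

print("N50:", n50(toy_lengths))
print("Q20(%):", percent_above(toy_quals, 20))
print("Q30(%):", percent_above(toy_quals, 30))
```

For a single-sequence file such as ecoli.fasta, this definition gives an N50 equal to the sequence length, which agrees with the 4,641,652 reported in the table.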