assemble~flye

Fast and accurate de novo assembler for single-molecule sequencing reads

input_1:FASTQ

input_1/SRR27458461.fastq.gz

Command

assemble~flye -c 8 -m 32 input_1/
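
According to the log further down, the wrapper script locates the reads by piping `find input_1//` into `grep -E '[.]f(ast|)q([.]gz|)$'`. The sketch below mirrors that filter in Python to show which file names it would accept. It is an illustration only, not the wrapper's code: the function name `select_reads` and every file name except `SRR27458461.fastq.gz` are hypothetical.

```python
import re

# Same extended regular expression that the log shows being passed to `grep -E`.
READ_PATTERN = re.compile(r"[.]f(ast|)q([.]gz|)$")


def select_reads(paths):
    """Keep the paths grep would keep (the pattern may match anywhere in the line)."""
    return [p for p in paths if READ_PATTERN.search(p)]


# Only the first entry comes from the log; the others are made-up examples.
candidates = [
    "input_1//SRR27458461.fastq.gz",
    "input_1//reads.fq",
    "input_1//reads.fasta",
    "input_1//notes.txt",
]

selected = select_reads(candidates)
print(selected)
```

The pattern accepts `.fastq`, `.fq`, `.fastq.gz` and `.fq.gz`, so FASTA files placed in the input directory would be ignored by this step.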

Output

output/assembly.fasta

>contig_1
ACGGCGGCCAATTCTTCTGCTCCTGCGATCCGCACTGTCGCCGGTGTTGTTTGCAAGATA
ACTGGAGCATTAGCCTCTTCTGCCGCTTCAATTACAGCTCGTATTGTCTCTAAGTTGTAG
ACATTGCACCCTAACACTGCGTATTTCCCTTCGCAGGCTGCGGCCAACAGAGATGTCGTC
GTGACCAATGGCATGAATTAATTCCTCCCGACCCACATAGTCGGATTTCTTACCGACCGG
ATTTCCACGAAGTTGAATTGGAAAGCTATAAGCACTAGTGCCGAGCAATTCGATGATCGC
CGCCCCCAGCAGCACTTGGCATGACGCCCAACCTATTCTCAATGTTGCCTCAGGGAGTAA
AGGTTGACCCACGTGTCCTTGGGTCGATTCTAGGCTCATCGGTCGACGGCTAGAACCCTT
GCGCCAGTACATCAGATACATGCTCAACACCGGTTAATAGCGCAACAGAACATGTGGAAA
GAGGTGGACGTAACCGCGAGCAGCACGTGGGTTCACGTAACCTACGACGTCGCCTCAGTT
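
The excerpt above is only the beginning of `contig_1`. To check a complete `assembly.fasta` against the "Assembly statistics" block in the log (total length, number of fragments, N50), something like the following sketch could be used. The helper names `read_fasta_lengths` and `n50` are hypothetical, and the toy records are invented so that the example runs without the real file.

```python
def read_fasta_lengths(lines):
    """Map each FASTA record name to its sequence length."""
    lengths = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The record name is the first word after '>'.
            name = line[1:].split()[0]
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def n50(lengths):
    """Smallest length L such that records of length >= L cover half the total."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0


# Invented toy records, not taken from the real assembly.
toy = [">contig_a", "ACGTACGT", "ACGT", ">contig_b", "ACGTAC", ">contig_c", "AC"]
toy_lengths = read_fasta_lengths(toy)
toy_total = sum(toy_lengths.values())
toy_n50 = n50(toy_lengths.values())
print(toy_lengths, toy_total, toy_n50)
```

For the real output, passing an open file handle for `output/assembly.fasta` to `read_fasta_lengths` should, if the log is accurate, yield a single record of 3002544 bases.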

Log

pp assemble~flye -c 8 -m 32 input_1/
PID: 361067
/home/yoshitake.kazutoshi/work/pp-dev/yoshitake/PortablePipeline/PortablePipeline/scripts/pp 'assemble~flye' -c 8 -m 32 input_1/
Checking the realpath of input files.
1
script: /data/yoshitake.kazutoshi/work/pp-dev/yoshitake/PortablePipeline/PortablePipeline/scripts/assemble~flye
Containers: centos:centos6 quay.io/biocontainers/flye:2.9.5--py39hdf45acc_1
using docker
++ set -o pipefail
+ set -eux
+ set -o pipefail
++ find input_1//
++ grep -E '[.]f(ast|)q([.]gz|)$'
+ reads=input_1//SRR27458461.fastq.gz
+ FUNC_RUN_DOCKER quay.io/biocontainers/flye:2.9.5--py39hdf45acc_1 flye --nano-raw input_1//SRR27458461.fastq.gz --out-dir output --threads 8 --genome-size 10M
+ PP_RUN_IMAGE=quay.io/biocontainers/flye:2.9.5--py39hdf45acc_1
+ shift
+ PP_RUN_DOCKER_CMD=("${@}")
++ date +%Y%m%d_%H%M%S_%3N
+ PPDOCNAME=pp20241105_092749_141_309
+ echo pp20241105_092749_141_309
++ id -u
++ id -g
+ docker run --name pp20241105_092749_141_309 -v /data/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/assemble~flye:/data/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/assemble~flye -w /data/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/assemble~flye -v /data/yoshitake.kazutoshi:/data/yoshitake.kazutoshi -u 2007:600 -i --rm quay.io/biocontainers/flye:2.9.5--py39hdf45acc_1 flye --nano-raw input_1//SRR27458461.fastq.gz --out-dir output --threads 8 --genome-size 10M
[2024-11-05 00:27:49] INFO: Starting Flye 2.9.5-b1801
[2024-11-05 00:27:49] INFO: >>>STAGE: configure
[2024-11-05 00:27:49] INFO: Configuring run
[2024-11-05 00:27:54] INFO: Total read length: 264541423
[2024-11-05 00:27:54] INFO: Input genome size: 10000000
[2024-11-05 00:27:54] INFO: Estimated coverage: 26
[2024-11-05 00:27:54] INFO: Reads N50/N90: 9108 / 4635
[2024-11-05 00:27:54] INFO: Minimum overlap set to 5000
[2024-11-05 00:27:54] INFO: >>>STAGE: assembly
[2024-11-05 00:27:54] INFO: Assembling disjointigs
[2024-11-05 00:27:54] INFO: Reading sequences
[2024-11-05 00:27:58] INFO: Counting k-mers:
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
[2024-11-05 00:29:03] INFO: Filling index table (1/2)
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
[2024-11-05 00:29:19] INFO: Filling index table (2/2)
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
[2024-11-05 00:29:59] INFO: Extending reads
[2024-11-05 00:32:14] INFO: Overlap-based coverage: 73
[2024-11-05 00:32:14] INFO: Median overlap divergence: 0.0452029
0% 30% 40% 90% 100%
[2024-11-05 00:34:24] INFO: Assembled 9 disjointigs
[2024-11-05 00:34:24] INFO: Generating sequence
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
[2024-11-05 00:34:25] INFO: Filtering contained disjointigs
0% 10% 20% 30% 40% 50% 60% 70% 80% 100%
[2024-11-05 00:34:26] INFO: Contained seqs: 4
[2024-11-05 00:34:26] INFO: >>>STAGE: consensus
[2024-11-05 00:34:26] INFO: Running Minimap2
[2024-11-05 00:34:49] INFO: Computing consensus
[2024-11-05 00:36:35] INFO: Alignment error rate: 0.008413
[2024-11-05 00:36:35] INFO: >>>STAGE: repeat
[2024-11-05 00:36:35] INFO: Building and resolving repeat graph
[2024-11-05 00:36:35] INFO: Parsing disjointigs
[2024-11-05 00:36:35] INFO: Building repeat graph
0% 20% 40% 60% 80% 100%
[2024-11-05 00:36:37] INFO: Median overlap divergence: 0.00219487
[2024-11-05 00:36:37] INFO: Parsing reads
[2024-11-05 00:36:41] INFO: Aligning reads to the graph
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
[2024-11-05 00:37:07] INFO: Aligned read sequence: 230426379 / 231413392 (0.995735)
[2024-11-05 00:37:07] INFO: Median overlap divergence: 0.00109482
[2024-11-05 00:37:07] INFO: Mean edge coverage: 76
[2024-11-05 00:37:07] INFO: Simplifying the graph
[2024-11-05 00:37:07] INFO: >>>STAGE: contigger
[2024-11-05 00:37:07] INFO: Generating contigs
[2024-11-05 00:37:07] INFO: Reading sequences
[2024-11-05 00:37:11] INFO: Generated 1 contigs
[2024-11-05 00:37:11] INFO: Added 0 scaffold connections
[2024-11-05 00:37:11] INFO: >>>STAGE: polishing
[2024-11-05 00:37:11] INFO: Polishing genome (1/1)
[2024-11-05 00:37:11] INFO: Running minimap2
[2024-11-05 00:37:33] INFO: Separating alignment into bubbles
[2024-11-05 00:39:02] INFO: Alignment error rate: 0.002672
[2024-11-05 00:39:02] INFO: Correcting bubbles
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
[2024-11-05 00:39:28] INFO: >>>STAGE: finalize
[2024-11-05 00:39:28] INFO: Assembly statistics:

Total length:	3002544
Fragments:	1
Fragments N50:	3002544
Largest frg:	3002544
Scaffolds:	0
Mean coverage:	89

[2024-11-05 00:39:28] INFO: Final assembly: /data/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/assemble~flye/output/assembly.fasta
+ post_processing
+ '[' 1 = 1 ']'
+ rm -f /home/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/assemble~flye/pp-singularity-flag
+ '[' '' = y ']'
+ echo 0
+ exit
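
Several figures in the log can be cross-checked with simple arithmetic. The sketch below uses only numbers printed in the log. It assumes the "Estimated coverage" value is the total read length divided by the supplied genome size and truncated to an integer, which matches the logged value but is not confirmed by the log itself.

```python
# Values copied from the log.
total_read_length = 264_541_423   # "Total read length"
genome_size = 10_000_000          # "Input genome size" (from --genome-size 10M)
aligned_bases = 230_426_379       # "Aligned read sequence", numerator
alignable_bases = 231_413_392     # "Aligned read sequence", denominator
assembly_length = 3_002_544       # "Total length" in the assembly statistics

# Assumed formula: integer division of read bases by the supplied genome size.
estimated_coverage = total_read_length // genome_size

# Fraction of read sequence aligned to the repeat graph.
aligned_fraction = round(aligned_bases / alignable_bases, 6)

# Naive coverage relative to the length that was actually assembled.
coverage_vs_assembly = total_read_length / assembly_length

print(estimated_coverage)
print(aligned_fraction)
print(round(coverage_vs_assembly, 1))
```

The supplied genome size of 10M is roughly three times the 3.0 Mb that was assembled, which would explain why the initial estimate of 26 is far below the mean coverage of 89 reported at the end of the run.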