post-assemble~busco_v5

BUSCO (Benchmarking Universal Single-Copy Orthologs) is a tool for assessing the completeness of a genome assembly. Because it is based on evolutionarily informed expectations of the gene content of near-universal single-copy orthologs, the BUSCO metric is complementary to technical metrics such as N50.

input_1: An assembled genome file

input_1/cnv2_edited.fasta

>1
CCTCGGTCGGCGCCATCTCCTCCTCGACTATTTCCTCCTCCATCTCCTTCTCCCTCTCCTCCACTTCTTCTTCTTCCTCCCTTTCTTCTTCACTCTCCTCAACAACCCTTTGCCGCTTCAGCGGACGCCGCCGCTCGGTGGTCCTCCTCGACGGTCCTGCCGCCGCCGGTCCCTCTCCCATACCCATCAAGGACATCCTAGTGTCCATGGATATGAGGCGCGTATGGTCGTCCTCTTGCCTTCTCAGCAGCTGGCGAAGGTACTGCTGAGTCTTCGAATTCGCCTCTCGTAGGGCCCGGTTATCGGCGAGGAGTTGTGCCATTTGACTTTCGAGGACAGCTAGCCGCTCGCCGCTCGGGCCCGCTCCTCCATCCTTCTTCACTACATAAGGACGACCCGAGTACGAGCATCGTACTTTCGAGTCGTGGCAGGGCTTACATATTACGCTACTCCGCTTGCCAGCCTGCTTCAGGCAAGGGATTTTCTTCGACCTGCACCGTTCGCAAGGTTCTCGGTCTTCTTCGTCCTCATCATCGTCGTCGCCGTCGTCGCCGTCGTCTGGGTCCTCATCGACGACCTGAGTAGAAACAAGTCAATAAAGAAAATTTAATGTTAGCCGAAAAAGACGTACCGGGGCCTTGGCTTTACCCTTTCCCTTATTCTTCATCTTCGATATCTCCACAATAGGTCTCCTCGGCGAAGCAGAAACATCCCCGGTGGAAGTTCCGCCTTGACTTCTGGCGGTAGCAGCCTCCGCCAACTTCCTCCGTCGTTCGACGATCTCGTCTTCCTGTTCTCGAGCCCGGACGGCCCGATCGCGAGCATCTTGAGCAGCTTGGCGTCTCGCCGCCGCCGCTCGCTCCGCCGCCTACTTCTTCTTTTCCGCCTCCTCTGCCGCCTTCTTAGCAGCTGCCTCCTCCGCTGCCTTCCTTTTAGCAGCCTCCGCTGCCTTCCGCGCCTTCACCCTCTCAACCCTCGCAAGGGCCTCCCTGATGATTTCGTCTTCGTCTTTGTCGATAGGGGCGCCGGGACTCGGAGGATCGGCGGGGCGAGAAGAGGTACTGGCGGAGGGTTGGGTGGTCGTGGTGGTTCGGCTGGTCGTCATGTTGGTGGACTTAGAAGATTTTGAGAGAGTTAGGGCCTTAGCGGCAAAACCCTGGGCCTTTTATAGGAAAAAAACTGCGCACGGACCAGCGAAAATTTTCCTTCGAGTTGTACGGGGTGTACGGTGAACCGAAAATTTTCACTGATGACTAACGCCAATTCCCGGTACTAACGGCTTTTGAAAATTCTCCTGCTGACTGATGGGCCAAATCCCCGATCCTGACGGCCTTTGAAATTTTTCTTTGGTGACTAACCAACTACCGGTATTAATGGCTTTCGAGGGCTCAAGCCTAAGAGGGGATACTGTCAAGTCCATGCCGCGGGCCTCCATTCCGCCGCTCGGTACTCCGAAGATCCACTCGTTATTGTAATCTCACTCTTATTGTTCTTTCTCGCTACACATACCCTTTCGTGCTAGTCCCTTCTTATTGTCTTTTACTCTACCCTGCAGGTACTACTCGTACAGCTCCTAGTAGTTAGTCTGGCGCTCCGCACCGATGCGCCGTTCCGTAGATATAAATACTGCCTGTACGCTACTGTAGTTCCTCAGTTTAATTATCAACTTAGATCAACTTAGATCTCCCTTTCCAACTTAACTCTCCTCTAAGTCTTCAGGTTCAACGCCTCGAAGTACGTTCTCCTCGACCTTAGGACTTTAACGTTCTCTGGTCCGCTCCTCGTCATCGGCGATACCGCCGCTCCGACCTGGCTGCTACTTAGCTCCTCTCTCTCTACTCTCGCTCTCCGGGGACTAACGGCGTAACGGCGGTGTTGTTAATAACTTCACTACTGTCCCATTCGTCTTCCGCACTTAGACACCCTGTGGCCTTGGCACGGTTACACGGTGTACATGATCTTAACATGAGCATGATATAGCATATATACTATGTTATACA
CCTATCGAACTTAATCTGCTGTAGAGAAACCACTCTGAAGAGGGTGTTGTTCCTTGAATTCAGCCCTCGGGAGTCAGAAAATATATCCAAGAAGCGCCATTATTTTTGCTCCCACGTCCTCAACCTCTTTCCACCACTGCGACTTTACTTTCTCCATCACTGTTTTGTTGCCACAGATCCCCTGTAGCAGCCTGTAACATGGTAAGTTCTCAAACTATCTTTTGAAAAACTTTCTCACCTGTTCTCTAGCTCTTTGATACCGTGTCTTATTCCATCTCCCAGCTCTGCTTGTGAACATTTTCCATCTGCGCAAACTGTTCTCTCCCCCAGGGATTCCTCGTAACAAGTACGTGCTGTCTATGTTGCTCTCTTTCTGATGCACATAGCTGAGCTAATTAGCTTAGCATCTGCAACCCAAACAGCTCTTTAACACCCCTATCCTATCCTCAGCTCCGTTGTGAACACACTCCATTTGTGCAATGCATTTTCTCTCCTCAATACTCCCTGGGGCACTTTTGTACCAAGTAAGTGCTACACGCATTGCTCCTTCCTATGCACATATTCACAGCTGGTTTGATTACTATCTAGCAGCCACTATCTAAAACTCAACATATGACATTGTCCGCATCTCATACACGAAAGTTGTGGTTTTTTGTGTGAGTCCTTGATTTTTTTGCCATCTCCTCTACTCAACACAAGGTGTCTTCATGATTTCTAGCATGGTTCCTGCCCTTTACTTTCACTGCCATCCCAATGCCATCCTCCTCAGTCACCATCATTATTTTGTGAATTGTACCCTTCGCCCACATTATGTTACCATGTTACCATGTTGCCTACTTACAACTGTTCACTCTGTCATTCACTACTTGCTATCCTAATGACATGTTGCCTTATATATATTGCTCATGTCACTCATTTCAATGTGCACAAGTTGAACGAACTTACCCTCCACCTGTAGCTCTGATAATGATCTAATGCAACATAGATTGTGGTGTA

Command

post-assemble~busco_v5 -c 16 -m 32 input_1/cnv2_edited.fasta
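
As the log further down shows, the wrapper turns this command into a single `busco` call inside the container. The sketch below rebuilds that argument list from the input path and the `-c` value. The helper name `build_busco_command` is hypothetical, and the lineage and mode are simply the values seen in the log for this run. The wrapper's `-m 32` does not appear in the logged `busco` call (there `-m` is BUSCO's own mode option), so it is left out here; it is presumably a memory setting for the pipeline, which is an assumption.

```python
import os
import shlex


def build_busco_command(input_path, cpus, lineage="eukaryota_odb10"):
    """Rebuild the busco argument list that appears in the log (illustrative only)."""
    # The output directory in the log is "<input file name>.busco".
    out_name = os.path.basename(input_path) + ".busco"
    return [
        "busco",
        "-i", input_path,   # input assembly
        "-o", out_name,     # output directory name
        "-c", str(cpus),    # CPU threads, taken from the wrapper's -c
        "-l", lineage,      # lineage dataset seen in the log
        "-m", "genome",     # BUSCO mode seen in the log
    ]


cmd = build_busco_command("input_1/cnv2_edited.fasta", 16)
print(shlex.join(cmd))
```

For this page's input the printed command matches the `busco` invocation recorded in the log.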

Output

cnv2_edited.fasta.busco/short_summary.specific.eukaryota_odb10.cnv2_edited.fasta.busco.txt

# BUSCO version is: 5.8.0 
# The lineage dataset is: eukaryota_odb10 (Creation date: 2024-01-08, number of genomes: 70, number of BUSCOs: 255)
# Summarized benchmarking in BUSCO notation for file /data/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/post-assemble~busco_v5/input_1/cnv2_edited.fasta
# BUSCO was run in mode: euk_genome_min
# Gene predictor used: miniprot

	***** Results: *****

	C:97.6%[S:94.5%,D:3.1%],F:1.6%,M:0.8%,n:255,E:8.8%	   
	249	Complete BUSCOs (C)	(of which 22 contain internal stop codons)		   
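
The one-line result in BUSCO notation is easy to pick apart for downstream reporting. The following is a minimal sketch, not part of the pipeline: the function name `parse_busco_summary` is hypothetical, and it assumes the line has the same shape as the one shown above (single-letter keys, percentages marked with `%`, and a plain integer for `n`).

```python
import re


def parse_busco_summary(line):
    """Parse a BUSCO one-line summary into {key: value}.

    Values followed by '%' become floats (percentages); others become ints.
    """
    result = {}
    # Each field looks like "C:97.6%" or "n:255"; brackets and commas are skipped.
    for key, value, pct in re.findall(r"([A-Za-z]):(\d+(?:\.\d+)?)(%?)", line):
        result[key] = float(value) if pct else int(value)
    return result


summary = parse_busco_summary("C:97.6%[S:94.5%,D:3.1%],F:1.6%,M:0.8%,n:255,E:8.8%")
print(summary["C"], summary["M"], summary["n"])
```

Here `C`, `S`, `D`, `F` and `M` are the complete, single-copy, duplicated, fragmented and missing percentages, `n` is the number of BUSCO groups searched, and `E` is the share of complete matches with internal stop codons.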


Log

pp post-assemble~busco_v5 -c 16 -m 32 input_1/cnv2_edited.fasta
PID: 2268820
/home/yoshitake.kazutoshi/files/m256y/pp-dev/yoshitake/PortablePipeline/PortablePipeline/scripts/pp 'post-assemble~busco_v5' -c 16 -m 32 input_1/cnv2_edited.fasta
Checking the realpath of input files.
1
script: /suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev/yoshitake/PortablePipeline/PortablePipeline/scripts/post-assemble~busco_v5
Containers: c2997108/centos7:metacor7 centos:centos6 ezlabgva/busco:v5.8.0_cv1
using docker
++ docker pull --platform=linux/amd64 ezlabgva/busco:v5.8.0_cv1
v5.8.0_cv1: Pulling from ezlabgva/busco
09f376ebb190: Pulling fs layer
83c479b5dcf7: Pulling fs layer
a3ed95caeb02: Pulling fs layer
ea09bc7efc99: Pulling fs layer
8e02477d6853: Pulling fs layer
02dfe434c556: Pulling fs layer
ea09bc7efc99: Waiting
02dfe434c556: Waiting
8e02477d6853: Waiting
a3ed95caeb02: Verifying Checksum
a3ed95caeb02: Download complete
83c479b5dcf7: Verifying Checksum
83c479b5dcf7: Download complete
8e02477d6853: Download complete
02dfe434c556: Verifying Checksum
02dfe434c556: Download complete
09f376ebb190: Verifying Checksum
09f376ebb190: Download complete
09f376ebb190: Pull complete
83c479b5dcf7: Pull complete
a3ed95caeb02: Pull complete
ea09bc7efc99: Verifying Checksum
ea09bc7efc99: Download complete
ea09bc7efc99: Pull complete
8e02477d6853: Pull complete
02dfe434c556: Pull complete
Digest: sha256:c26d4f89b66992bece899e76ec55773a38f1ee0577370381d7f35e749a1f8f1f
Status: Downloaded newer image for ezlabgva/busco:v5.8.0_cv1
docker.io/ezlabgva/busco:v5.8.0_cv1
++ set +ex
++ set -o pipefail
+ set -eux
+ set -o pipefail
++ echo input_1/cnv2_edited.fasta
++ grep '[.]gz$'
++ wc -l
++ true
+ '[' 0 = 1 ']'
++ basename input_1/cnv2_edited.fasta
+ FUNC_RUN_DOCKER ezlabgva/busco:v5.8.0_cv1 busco -i input_1/cnv2_edited.fasta -o cnv2_edited.fasta.busco -c 16 -l eukaryota_odb10 -m genome
+ PP_RUN_IMAGE=ezlabgva/busco:v5.8.0_cv1
+ shift
+ PP_RUN_DOCKER_CMD=("${@}")
++ date +%Y%m%d_%H%M%S_%3N
+ PPDOCNAME=pp20241126_234413_544_15754
+ echo pp20241126_234413_544_15754
++ id -u
++ id -g
+ docker run --name pp20241126_234413_544_15754 -v /data/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/post-assemble~busco_v5:/data/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/post-assemble~busco_v5 -w /data/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/post-assemble~busco_v5 -v /data/yoshitake.kazutoshi:/data/yoshitake.kazutoshi -u 2007:600 -i --rm ezlabgva/busco:v5.8.0_cv1 busco -i input_1/cnv2_edited.fasta -o cnv2_edited.fasta.busco -c 16 -l eukaryota_odb10 -m genome
2024-11-26 14:44:18 INFO:	***** Start a BUSCO v5.8.0 analysis, current time: 11/26/2024 14:44:18 *****
2024-11-26 14:44:18 INFO:	Configuring BUSCO with local environment
2024-11-26 14:44:18 INFO:	Running genome mode
2024-11-26 14:44:18 INFO:	Downloading information on latest versions of BUSCO data...
2024-11-26 14:44:21 INFO:	Input file is /data/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/post-assemble~busco_v5/input_1/cnv2_edited.fasta
2024-11-26 14:44:21 INFO:	Downloading file 'https://busco-data.ezlab.org/v5/data/lineages/eukaryota_odb10.2024-01-08.tar.gz'
2024-11-26 14:44:34 INFO:	Decompressing file '/data/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/post-assemble~busco_v5/busco_downloads/lineages/eukaryota_odb10.tar.gz'
2024-11-26 14:44:35 INFO:	Running BUSCO using lineage dataset eukaryota_odb10 (eukaryota, 2024-01-08)
2024-11-26 14:44:35 INFO:	Running 1 job(s) on bbtools, starting at 11/26/2024 14:44:35
2024-11-26 14:44:38 INFO:	[bbtools]	1 of 1 task(s) completed
2024-11-26 14:44:38 INFO:	Running 1 job(s) on miniprot_index, starting at 11/26/2024 14:44:38
2024-11-26 14:44:41 INFO:	[miniprot_index]	1 of 1 task(s) completed
2024-11-26 14:44:42 INFO:	Running 1 job(s) on miniprot_align, starting at 11/26/2024 14:44:42
2024-11-26 14:56:13 INFO:	[miniprot_align]	1 of 1 task(s) completed
2024-11-26 14:57:05 INFO:	***** Run HMMER on gene sequences *****
2024-11-26 14:57:05 INFO:	Running 255 job(s) on hmmsearch, starting at 11/26/2024 14:57:05
2024-11-26 14:57:08 INFO:	[hmmsearch]	26 of 255 task(s) completed
2024-11-26 14:57:09 INFO:	[hmmsearch]	51 of 255 task(s) completed
2024-11-26 14:57:09 INFO:	[hmmsearch]	77 of 255 task(s) completed
2024-11-26 14:57:10 INFO:	[hmmsearch]	102 of 255 task(s) completed
2024-11-26 14:57:10 INFO:	[hmmsearch]	128 of 255 task(s) completed
2024-11-26 14:57:12 INFO:	[hmmsearch]	153 of 255 task(s) completed
2024-11-26 14:57:12 INFO:	[hmmsearch]	179 of 255 task(s) completed
2024-11-26 14:57:13 INFO:	[hmmsearch]	204 of 255 task(s) completed
2024-11-26 14:57:14 INFO:	[hmmsearch]	230 of 255 task(s) completed
2024-11-26 14:57:15 INFO:	[hmmsearch]	255 of 255 task(s) completed
2024-11-26 14:57:16 INFO:	50 candidate overlapping regions found
2024-11-26 14:57:16 INFO:	1970 exons in total
2024-11-26 14:57:16 WARNING:	22 of 249 Complete matches (8.8%) contain internal stop codons in Miniprot gene predictions
2024-11-26 14:57:16 INFO:

-------------------------------------------------------------------------------------------
|Results from dataset eukaryota_odb10                                                      |
-------------------------------------------------------------------------------------------
|C:97.6%[S:94.5%,D:3.1%],F:1.6%,M:0.8%,n:255,E:8.8%                                        |
|249    Complete BUSCOs (C)    (of which 22 contain internal stop codons)                  |
|241    Complete and single-copy BUSCOs (S)                                                |
|8    Complete and duplicated BUSCOs (D)                                                   |
|4    Fragmented BUSCOs (F)                                                                |
|2    Missing BUSCOs (M)                                                                   |
|255    Total BUSCO groups searched                                                        |
-------------------------------------------------------------------------------------------
2024-11-26 14:57:16 INFO:	BUSCO analysis done with WARNING(s). Total running time: 775 seconds

***** Summary of warnings: *****
2024-11-26 14:57:16 WARNING:busco.busco_tools.hmmer	22 of 249 Complete matches (8.8%) contain internal stop codons in Miniprot gene predictions

2024-11-26 14:57:16 INFO:	Results written in /data/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/post-assemble~busco_v5/cnv2_edited.fasta.busco
2024-11-26 14:57:16 INFO:	For assistance with interpreting the results, please consult the userguide: https://busco.ezlab.org/busco_userguide.html

2024-11-26 14:57:16 INFO:	Visit this page https://gitlab.com/ezlab/busco#how-to-cite-busco to see how to cite BUSCO
+ post_processing
+ '[' 1 = 1 ']'
+ rm -f /home/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/post-assemble~busco_v5/pp-singularity-flag
+ '[' '' = y ']'
+ echo 0
+ exit
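
The percentages in the summary line follow directly from the counts in the results table of the log. The short check below recomputes them as a worked example. The variable names are illustrative; the only assumption beyond the log is that values are rounded to one decimal place, which matches the figures shown. Note that `E` is taken relative to the 249 complete matches rather than to all 255 groups, as the warning line in the log states.

```python
# Counts copied from the results table in the log.
total = 255
complete = 249
single_copy = 241
duplicated = 8
fragmented = 4
missing = 2
internal_stop = 22  # complete matches containing internal stop codons


def pct(count, denominator):
    """Percentage rounded to one decimal place."""
    return round(100 * count / denominator, 1)


percentages = {
    "C": pct(complete, total),
    "S": pct(single_copy, total),
    "D": pct(duplicated, total),
    "F": pct(fragmented, total),
    "M": pct(missing, total),
    "E": pct(internal_stop, complete),  # relative to complete matches, not to n
}
print(percentages)
```

The counts are also internally consistent: single-copy plus duplicated equals complete (241 + 8 = 249), and complete plus fragmented plus missing equals the total (249 + 4 + 2 = 255).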