BUSCO, the Benchmarking Universal Single-Copy Orthologs assessment tool. Because it is based on evolutionarily informed expectations of the gene content of near-universal single-copy orthologs, the BUSCO metric is complementary to technical metrics such as N50.
>1
CCTCGGTCGGCGCCATCTCCTCCTCGACTATTTCCTCCTCCATCTCCTTCTCCCTCTCCTCCACTTCTTCTTCTTCCTCCCTTTCTTCTTCACTCTCCTCAACAACCCTTTGCCGCTTCAGCGGACGCCGCCGCTCGGTGGTCCTCCTCGACGGTCCTGCCGCCGCCGGTCCCTCTCCCATACCCATCAAGGACATCCTAGTGTCCATGGATATGAGGCGCGTATGGTCGTCCTCTTGCCTTCTCAGCAGCTGGCGAAGGTACTGCTGAGTCTTCGAATTCGCCTCTCGTAGGGCCCGGTTATCGGCGAGGAGTTGTGCCATTTGACTTTCGAGGACAGCTAGCCGCTCGCCGCTCGGGCCCGCTCCTCCATCCTTCTTCACTACATAAGGACGACCCGAGTACGAGCATCGTACTTTCGAGTCGTGGCAGGGCTTACATATTACGCTACTCCGCTTGCCAGCCTGCTTCAGGCAAGGGATTTTCTTCGACCTGCACCGTTCGCAAGGTTCTCGGTCTTCTTCGTCCTCATCATCGTCGTCGCCGTCGTCGCCGTCGTCTGGGTCCTCATCGACGACCTGAGTAGAAACAAGTCAATAAAGAAAATTTAATGTTAGCCGAAAAAGACGTACCGGGGCCTTGGCTTTACCCTTTCCCTTATTCTTCATCTTCGATATCTCCACAATAGGTCTCCTCGGCGAAGCAGAAACATCCCCGGTGGAAGTTCCGCCTTGACTTCTGGCGGTAGCAGCCTCCGCCAACTTCCTCCGTCGTTCGACGATCTCGTCTTCCTGTTCTCGAGCCCGGACGGCCCGATCGCGAGCATCTTGAGCAGCTTGGCGTCTCGCCGCCGCCGCTCGCTCCGCCGCCTACTTCTTCTTTTCCGCCTCCTCTGCCGCCTTCTTAGCAGCTGCCTCCTCCGCTGCCTTCCTTTTAGCAGCCTCCGCTGCCTTCCGCGCCTTCACCCTCTCAACCCTCGCAAGGGCCTCCCTGATGATTTCGTCTTCGTCTTTGTCGATAGGGGCGCCGGGACTCGGAGGATCGGCGGGGCGAGAAGAGGTACTGGCGGAGGGTTGGGTGGTCGTGGTGGTTCGGCTGGTCGTCATGTTGGTGGACTTAGAAGATTTTGAGAGAGTTAGGGCCTTAGCGGCAAAACCCTGGGCCTTTTATAGGAAAAAAACTGCGCACGGACCAGCGAAAATTTTCCTTCGAGTTGTACGGGGTGTACGGTGAACCGAAAATTTTCACTGATGACTAACGCCAATTCCCGGTACTAACGGCTTTTGAAAATTCTCCTGCTGACTGATGGGCCAAATCCCCGATCCTGACGGCCTTTGAAATTTTTCTTTGGTGACTAACCAACTACCGGTATTAATGGCTTTCGAGGGCTCAAGCCTAAGAGGGGATACTGTCAAGTCCATGCCGCGGGCCTCCATTCCGCCGCTCGGTACTCCGAAGATCCACTCGTTATTGTAATCTCACTCTTATTGTTCTTTCTCGCTACACATACCCTTTCGTGCTAGTCCCTTCTTATTGTCTTTTACTCTACCCTGCAGGTACTACTCGTACAGCTCCTAGTAGTTAGTCTGGCGCTCCGCACCGATGCGCCGTTCCGTAGATATAAATACTGCCTGTACGCTACTGTAGTTCCTCAGTTTAATTATCAACTTAGATCAACTTAGATCTCCCTTTCCAACTTAACTCTCCTCTAAGTCTTCAGGTTCAACGCCTCGAAGTACGTTCTCCTCGACCTTAGGACTTTAACGTTCTCTGGTCCGCTCCTCGTCATCGGCGATACCGCCGCTCCGACCTGGCTGCTACTTAGCTCCTCTCTCTCTACTCTCGCTCTCCGGGGACTAACGGCGTAACGGCGGTGTTGTTAATAACTTCACTACTGTCCCATTCGTCTTCCGCACTTAGACACCCTGTGGCCTTGGCACGGTTACACGGTGTACATGATCTTAACATGAGCATGATATAGCATATATACTATGTTATACA
CCTATCGAACTTAATCTGCTGTAGAGAAACCACTCTGAAGAGGGTGTTGTTCCTTGAATTCAGCCCTCGGGAGTCAGAAAATATATCCAAGAAGCGCCATTATTTTTGCTCCCACGTCCTCAACCTCTTTCCACCACTGCGACTTTACTTTCTCCATCACTGTTTTGTTGCCACAGATCCCCTGTAGCAGCCTGTAACATGGTAAGTTCTCAAACTATCTTTTGAAAAACTTTCTCACCTGTTCTCTAGCTCTTTGATACCGTGTCTTATTCCATCTCCCAGCTCTGCTTGTGAACATTTTCCATCTGCGCAAACTGTTCTCTCCCCCAGGGATTCCTCGTAACAAGTACGTGCTGTCTATGTTGCTCTCTTTCTGATGCACATAGCTGAGCTAATTAGCTTAGCATCTGCAACCCAAACAGCTCTTTAACACCCCTATCCTATCCTCAGCTCCGTTGTGAACACACTCCATTTGTGCAATGCATTTTCTCTCCTCAATACTCCCTGGGGCACTTTTGTACCAAGTAAGTGCTACACGCATTGCTCCTTCCTATGCACATATTCACAGCTGGTTTGATTACTATCTAGCAGCCACTATCTAAAACTCAACATATGACATTGTCCGCATCTCATACACGAAAGTTGTGGTTTTTTGTGTGAGTCCTTGATTTTTTTGCCATCTCCTCTACTCAACACAAGGTGTCTTCATGATTTCTAGCATGGTTCCTGCCCTTTACTTTCACTGCCATCCCAATGCCATCCTCCTCAGTCACCATCATTATTTTGTGAATTGTACCCTTCGCCCACATTATGTTACCATGTTACCATGTTGCCTACTTACAACTGTTCACTCTGTCATTCACTACTTGCTATCCTAATGACATGTTGCCTTATATATATTGCTCATGTCACTCATTTCAATGTGCACAAGTTGAACGAACTTACCCTCCACCTGTAGCTCTGATAATGATCTAATGCAACATAGATTGTGGTGTA
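The description contrasts the BUSCO metric with technical metrics such as N50. For comparison, here is a minimal sketch of how N50 could be computed from a FASTA file like the sample above. The function names `fasta_lengths` and `n50` are hypothetical and are not part of BUSCO or of this pipeline; the sketch assumes the common definition of N50 as the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly.

```python
def fasta_lengths(lines):
    """Return the sequence length of each FASTA record, in file order.

    `lines` is any iterable of text lines (for example an open file).
    Sequence lines that appear before the first '>' header are ignored.
    """
    lengths = []
    current = None  # length of the record being read, None before the first header
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif line and current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    if not lengths:
        raise ValueError("no sequence lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half of the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


if __name__ == "__main__":
    example = [">a", "ACGT", "AC", ">b", "GG"]
    print(fasta_lengths(example))
    print(n50(fasta_lengths(example)))
```

N50 only describes contiguity. It says nothing about whether the expected genes are present, which is the gap the BUSCO score is meant to fill.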
post-assemble~busco_v5 -c 16 -m 32 input_1/cnv2_edited.fasta
# BUSCO version is: 5.8.0
# The lineage dataset is: eukaryota_odb10 (Creation date: 2024-01-08, number of genomes: 70, number of BUSCOs: 255)
# Summarized benchmarking in BUSCO notation for file /data/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/post-assemble~busco_v5/input_1/cnv2_edited.fasta
# BUSCO was run in mode: euk_genome_min
# Gene predictor used: miniprot
***** Results: *****
C:97.6%[S:94.5%,D:3.1%],F:1.6%,M:0.8%,n:255,E:8.8%
249 Complete BUSCOs (C) (of which 22 contain internal stop codons)
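The one-line summary above packs all percentages into BUSCO's compact notation. A minimal sketch of extracting the fields from such a line follows. `SUMMARY_RE` and `parse_busco_summary` are hypothetical names, not part of BUSCO, and the pattern is modeled only on the line shown in this run; treating the trailing `E` field as optional is an assumption.

```python
import re

# Pattern modeled on the summary line shown above (assumption: E may be absent).
SUMMARY_RE = re.compile(
    r"C:(?P<C>[\d.]+)%\[S:(?P<S>[\d.]+)%,D:(?P<D>[\d.]+)%\],"
    r"F:(?P<F>[\d.]+)%,M:(?P<M>[\d.]+)%,n:(?P<n>\d+)"
    r"(?:,E:(?P<E>[\d.]+)%)?"
)


def parse_busco_summary(line):
    """Parse a BUSCO one-line summary into a dict of percentages plus n."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError("not a BUSCO summary line: %r" % line)
    fields = {}
    for key, value in match.groupdict().items():
        if value is None:
            continue  # optional field not present in this line
        fields[key] = int(value) if key == "n" else float(value)
    return fields


if __name__ == "__main__":
    line = "C:97.6%[S:94.5%,D:3.1%],F:1.6%,M:0.8%,n:255,E:8.8%"
    print(parse_busco_summary(line))
```

In this notation C is the sum of S and D, and C, F and M together account for all n BUSCO groups searched.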
pp post-assemble~busco_v5 -c 16 -m 32 input_1/cnv2_edited.fasta
PID: 2268820
/home/yoshitake.kazutoshi/files/m256y/pp-dev/yoshitake/PortablePipeline/PortablePipeline/scripts/pp 'post-assemble~busco_v5' -c 16 -m 32 input_1/cnv2_edited.fasta
Checking the realpath of input files.
1
script: /suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev/yoshitake/PortablePipeline/PortablePipeline/scripts/post-assemble~busco_v5
Containers: c2997108/centos7:metacor7 centos:centos6 ezlabgva/busco:v5.8.0_cv1
using docker
++ docker pull --platform=linux/amd64 ezlabgva/busco:v5.8.0_cv1
v5.8.0_cv1: Pulling from ezlabgva/busco
09f376ebb190: Pulling fs layer
83c479b5dcf7: Pulling fs layer
a3ed95caeb02: Pulling fs layer
ea09bc7efc99: Pulling fs layer
8e02477d6853: Pulling fs layer
02dfe434c556: Pulling fs layer
ea09bc7efc99: Waiting
02dfe434c556: Waiting
8e02477d6853: Waiting
a3ed95caeb02: Verifying Checksum
a3ed95caeb02: Download complete
83c479b5dcf7: Verifying Checksum
83c479b5dcf7: Download complete
8e02477d6853: Download complete
02dfe434c556: Verifying Checksum
02dfe434c556: Download complete
09f376ebb190: Verifying Checksum
09f376ebb190: Download complete
09f376ebb190: Pull complete
83c479b5dcf7: Pull complete
a3ed95caeb02: Pull complete
ea09bc7efc99: Verifying Checksum
ea09bc7efc99: Download complete
ea09bc7efc99: Pull complete
8e02477d6853: Pull complete
02dfe434c556: Pull complete
Digest: sha256:c26d4f89b66992bece899e76ec55773a38f1ee0577370381d7f35e749a1f8f1f
Status: Downloaded newer image for ezlabgva/busco:v5.8.0_cv1
docker.io/ezlabgva/busco:v5.8.0_cv1
++ set +ex
++ set -o pipefail
+ set -eux
+ set -o pipefail
++ echo input_1/cnv2_edited.fasta
++ grep '[.]gz$'
++ wc -l
++ true
+ '[' 0 = 1 ']'
++ basename input_1/cnv2_edited.fasta
+ FUNC_RUN_DOCKER ezlabgva/busco:v5.8.0_cv1 busco -i input_1/cnv2_edited.fasta -o cnv2_edited.fasta.busco -c 16 -l eukaryota_odb10 -m genome
+ PP_RUN_IMAGE=ezlabgva/busco:v5.8.0_cv1
+ shift
+ PP_RUN_DOCKER_CMD=("${@}")
++ date +%Y%m%d_%H%M%S_%3N
+ PPDOCNAME=pp20241126_234413_544_15754
+ echo pp20241126_234413_544_15754
++ id -u
++ id -g
+ docker run --name pp20241126_234413_544_15754 -v /data/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/post-assemble~busco_v5:/data/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/post-assemble~busco_v5 -w /data/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/post-assemble~busco_v5 -v /data/yoshitake.kazutoshi:/data/yoshitake.kazutoshi -u 2007:600 -i --rm ezlabgva/busco:v5.8.0_cv1 busco -i input_1/cnv2_edited.fasta -o cnv2_edited.fasta.busco -c 16 -l eukaryota_odb10 -m genome
2024-11-26 14:44:18 INFO: ***** Start a BUSCO v5.8.0 analysis, current time: 11/26/2024 14:44:18 *****
2024-11-26 14:44:18 INFO: Configuring BUSCO with local environment
2024-11-26 14:44:18 INFO: Running genome mode
2024-11-26 14:44:18 INFO: Downloading information on latest versions of BUSCO data...
2024-11-26 14:44:21 INFO: Input file is /data/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/post-assemble~busco_v5/input_1/cnv2_edited.fasta
2024-11-26 14:44:21 INFO: Downloading file 'https://busco-data.ezlab.org/v5/data/lineages/eukaryota_odb10.2024-01-08.tar.gz'
2024-11-26 14:44:34 INFO: Decompressing file '/data/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/post-assemble~busco_v5/busco_downloads/lineages/eukaryota_odb10.tar.gz'
2024-11-26 14:44:35 INFO: Running BUSCO using lineage dataset eukaryota_odb10 (eukaryota, 2024-01-08)
2024-11-26 14:44:35 INFO: Running 1 job(s) on bbtools, starting at 11/26/2024 14:44:35
2024-11-26 14:44:38 INFO: [bbtools] 1 of 1 task(s) completed
2024-11-26 14:44:38 INFO: Running 1 job(s) on miniprot_index, starting at 11/26/2024 14:44:38
2024-11-26 14:44:41 INFO: [miniprot_index] 1 of 1 task(s) completed
2024-11-26 14:44:42 INFO: Running 1 job(s) on miniprot_align, starting at 11/26/2024 14:44:42
2024-11-26 14:56:13 INFO: [miniprot_align] 1 of 1 task(s) completed
2024-11-26 14:57:05 INFO: ***** Run HMMER on gene sequences *****
2024-11-26 14:57:05 INFO: Running 255 job(s) on hmmsearch, starting at 11/26/2024 14:57:05
2024-11-26 14:57:08 INFO: [hmmsearch] 26 of 255 task(s) completed
2024-11-26 14:57:09 INFO: [hmmsearch] 51 of 255 task(s) completed
2024-11-26 14:57:09 INFO: [hmmsearch] 77 of 255 task(s) completed
2024-11-26 14:57:10 INFO: [hmmsearch] 102 of 255 task(s) completed
2024-11-26 14:57:10 INFO: [hmmsearch] 128 of 255 task(s) completed
2024-11-26 14:57:12 INFO: [hmmsearch] 153 of 255 task(s) completed
2024-11-26 14:57:12 INFO: [hmmsearch] 179 of 255 task(s) completed
2024-11-26 14:57:13 INFO: [hmmsearch] 204 of 255 task(s) completed
2024-11-26 14:57:14 INFO: [hmmsearch] 230 of 255 task(s) completed
2024-11-26 14:57:15 INFO: [hmmsearch] 255 of 255 task(s) completed
2024-11-26 14:57:16 INFO: 50 candidate overlapping regions found
2024-11-26 14:57:16 INFO: 1970 exons in total
2024-11-26 14:57:16 WARNING: 22 of 249 Complete matches (8.8%) contain internal stop codons in Miniprot gene predictions
2024-11-26 14:57:16 INFO:
-------------------------------------------------------------------------------------------
|Results from dataset eukaryota_odb10 |
-------------------------------------------------------------------------------------------
|C:97.6%[S:94.5%,D:3.1%],F:1.6%,M:0.8%,n:255,E:8.8% |
|249 Complete BUSCOs (C) (of which 22 contain internal stop codons) |
|241 Complete and single-copy BUSCOs (S) |
|8 Complete and duplicated BUSCOs (D) |
|4 Fragmented BUSCOs (F) |
|2 Missing BUSCOs (M) |
|255 Total BUSCO groups searched |
-------------------------------------------------------------------------------------------
2024-11-26 14:57:16 INFO: BUSCO analysis done with WARNING(s). Total running time: 775 seconds
***** Summary of warnings: *****
2024-11-26 14:57:16 WARNING:busco.busco_tools.hmmer 22 of 249 Complete matches (8.8%) contain internal stop codons in Miniprot gene predictions
2024-11-26 14:57:16 INFO: Results written in /data/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/post-assemble~busco_v5/cnv2_edited.fasta.busco
2024-11-26 14:57:16 INFO: For assistance with interpreting the results, please consult the userguide: https://busco.ezlab.org/busco_userguide.html
2024-11-26 14:57:16 INFO: Visit this page https://gitlab.com/ezlab/busco#how-to-cite-busco to see how to cite BUSCO
+ post_processing
+ '[' 1 = 1 ']'
+ rm -f /home/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/post-assemble~busco_v5/pp-singularity-flag
+ '[' '' = y ']'
+ echo 0
+ exit
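The percentages in the results summary can be cross-checked against the counts reported in the log (241 single-copy, 8 duplicated, 4 fragmented, 2 missing, 22 with internal stop codons). The sketch below is illustrative only: `busco_percentages` is a hypothetical helper, rounding to one decimal place is an assumption about how the values are displayed, and the denominator for E is taken to be the number of complete BUSCOs, following the warning "22 of 249 Complete matches (8.8%)".

```python
def busco_percentages(single, duplicated, fragmented, missing, internal_stops=0):
    """Recompute summary percentages from BUSCO category counts."""
    complete = single + duplicated
    total = complete + fragmented + missing
    if total == 0:
        raise ValueError("no BUSCO groups searched")

    def pct(count, denominator):
        # Assumption: values are shown rounded to one decimal place.
        return round(100.0 * count / denominator, 1) if denominator else 0.0

    return {
        "C": pct(complete, total),
        "S": pct(single, total),
        "D": pct(duplicated, total),
        "F": pct(fragmented, total),
        "M": pct(missing, total),
        "n": total,
        # E is relative to complete matches, per the warning in the log.
        "E": pct(internal_stops, complete),
    }


if __name__ == "__main__":
    # Counts taken from the run log above.
    print(busco_percentages(single=241, duplicated=8, fragmented=4,
                            missing=2, internal_stops=22))
```

With the counts from this run the recomputed values agree with the reported line C:97.6%[S:94.5%,D:3.1%],F:1.6%,M:0.8%,n:255,E:8.8%.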