[05/05/24 15:51:35]: /venv/bin/funannotate train -i MyAssembly.fa -o fun --species Species name --cpus 16 --memory 51G --stranded RF --jaccard_clip --left input_2/10-3-f1_S4_L001_R1_001.fastq.gz input_2/10-3-f2_S5_L001_R1_001.fastq.gz input_2/10-3-f3_S6_L001_R1_001.fastq.gz input_2/11-6-m1_S1_L001_R1_001.fastq.gz input_2/11-6-m2_S2_L001_R1_001.fastq.gz input_2/11-6-m3_S3_L001_R1_001.fastq.gz input_2/7-31-f1_S10_L001_R1_001.fastq.gz input_2/7-31-f2_S11_L001_R1_001.fastq.gz input_2/7-31-f3_S12_L001_R1_001.fastq.gz input_2/7-31-m1_S7_L001_R1_001.fastq.gz input_2/7-31-m2_S8_L001_R1_001.fastq.gz input_2/7-31-m3_S9_L001_R1_001.fastq.gz --right input_2/10-3-f1_S4_L001_R2_001.fastq.gz input_2/10-3-f2_S5_L001_R2_001.fastq.gz input_2/10-3-f3_S6_L001_R2_001.fastq.gz input_2/11-6-m1_S1_L001_R2_001.fastq.gz input_2/11-6-m2_S2_L001_R2_001.fastq.gz input_2/11-6-m3_S3_L001_R2_001.fastq.gz input_2/7-31-f1_S10_L001_R2_001.fastq.gz input_2/7-31-f2_S11_L001_R2_001.fastq.gz input_2/7-31-f3_S12_L001_R2_001.fastq.gz input_2/7-31-m1_S7_L001_R2_001.fastq.gz input_2/7-31-m2_S8_L001_R2_001.fastq.gz input_2/7-31-m3_S9_L001_R2_001.fastq.gz
[05/05/24 15:51:35]: OS: Debian GNU/Linux 10, 16 cores, ~ 791 GB RAM. Python: 3.8.12
[05/05/24 15:51:35]: Running 1.8.17
[05/05/24 15:51:36]: fasta version=36.3.8g path=/venv/bin/fasta
[05/05/24 15:51:36]: minimap2 version=2.26-r1175 path=/venv/bin/minimap2
[05/05/24 15:51:36]: hisat2 version=2.2.1 path=/venv/bin/hisat2
[05/05/24 15:51:36]: hisat2-build version=NA path=/venv/bin/hisat2-build
[05/05/24 15:51:36]: Trinity version=2.8.5 path=/venv/bin/Trinity
[05/05/24 15:51:36]: java version=11.0.8-internal path=/venv/bin/java
[05/05/24 15:51:36]: kallisto version=0.46.1 path=/venv/bin/kallisto
[05/05/24 15:51:36]: /venv/opt/pasa-2.4.1/Launch_PASA_pipeline.pl version=NA path=/venv/opt/pasa-2.4.1/Launch_PASA_pipeline.pl
[05/05/24 15:51:36]: /venv/opt/pasa-2.4.1/bin/seqclean version=NA path=/venv/opt/pasa-2.4.1/bin/seqclean
[05/05/24 15:51:36]: trimmomatic version=0.39 path=/venv/bin/trimmomatic
[05/05/24 15:51:36]: minimap2 version=2.26-r1175 path=/venv/bin/minimap2
[05/05/24 15:51:36]: blat version=BLAT v35 path=/venv/bin/blat
[05/05/24 15:51:37]: Multiple inputs for --left and --right detected, concatenating PE reads
[05/05/24 15:51:37]: cat /suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/annotation~funannotate/test/input_2/10-3-f1_S4_L001_R1_001.fastq.gz /suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/annotation~funannotate/test/input_2/10-3-f2_S5_L001_R1_001.fastq.gz /suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/annotation~funannotate/test/input_2/10-3-f3_S6_L001_R1_001.fastq.gz /suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/annotation~funannotate/test/input_2/11-6-m1_S1_L001_R1_001.fastq.gz /suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/annotation~funannotate/test/input_2/11-6-m2_S2_L001_R1_001.fastq.gz /suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/annotation~funannotate/test/input_2/11-6-m3_S3_L001_R1_001.fastq.gz /suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/annotation~funannotate/test/input_2/7-31-f1_S10_L001_R1_001.fastq.gz /suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/annotation~funannotate/test/input_2/7-31-f2_S11_L001_R1_001.fastq.gz /suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/annotation~funannotate/test/input_2/7-31-f3_S12_L001_R1_001.fastq.gz /suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/annotation~funannotate/test/input_2/7-31-m1_S7_L001_R1_001.fastq.gz /suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/annotation~funannotate/test/input_2/7-31-m2_S8_L001_R1_001.fastq.gz /suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/annotation~funannotate/test/input_2/7-31-m3_S9_L001_R1_001.fastq.gz
[05/05/24 15:57:49]: cat /suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/annotation~funannotate/test/input_2/10-3-f1_S4_L001_R2_001.fastq.gz /suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/annotation~funannotate/test/input_2/10-3-f2_S5_L001_R2_001.fastq.gz /suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/annotation~funannotate/test/input_2/10-3-f3_S6_L001_R2_001.fastq.gz /suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/annotation~funannotate/test/input_2/11-6-m1_S1_L001_R2_001.fastq.gz /suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/annotation~funannotate/test/input_2/11-6-m2_S2_L001_R2_001.fastq.gz /suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/annotation~funannotate/test/input_2/11-6-m3_S3_L001_R2_001.fastq.gz /suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/annotation~funannotate/test/input_2/7-31-f1_S10_L001_R2_001.fastq.gz /suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/annotation~funannotate/test/input_2/7-31-f2_S11_L001_R2_001.fastq.gz /suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/annotation~funannotate/test/input_2/7-31-f3_S12_L001_R2_001.fastq.gz /suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/annotation~funannotate/test/input_2/7-31-m1_S7_L001_R2_001.fastq.gz /suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/annotation~funannotate/test/input_2/7-31-m2_S8_L001_R2_001.fastq.gz /suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/annotation~funannotate/test/input_2/7-31-m3_S9_L001_R2_001.fastq.gz
[05/05/24 16:05:35]: Input reads: ('fun/training/left.fq.gz', 'fun/training/right.fq.gz', None)
[05/05/24 16:05:35]: Adapter and Quality trimming PE reads with Trimmomatic
[05/05/24 16:05:35]: trimmomatic PE -threads 16 -phred33 fun/training/left.fq.gz fun/training/right.fq.gz fun/training/trimmomatic/trimmed_left.fastq fun/training/trimmomatic/trimmed_left.unpaired.fastq fun/training/trimmomatic/trimmed_right.fastq fun/training/trimmomatic/trimmed_right.unpaired.fastq ILLUMINACLIP:/venv/lib/python3.8/site-packages/funannotate/config/TruSeq3-PE.fa:2:30:10 SLIDINGWINDOW:4:5 LEADING:5 TRAILING:5 MINLEN:25
[05/05/24 16:08:24]: TrimmomaticPE: Started with arguments:
-threads 16 -phred33 fun/training/left.fq.gz fun/training/right.fq.gz fun/training/trimmomatic/trimmed_left.fastq fun/training/trimmomatic/trimmed_left.unpaired.fastq fun/training/trimmomatic/trimmed_right.fastq fun/training/trimmomatic/trimmed_right.unpaired.fastq ILLUMINACLIP:/venv/lib/python3.8/site-packages/funannotate/config/TruSeq3-PE.fa:2:30:10 SLIDINGWINDOW:4:5 LEADING:5 TRAILING:5 MINLEN:25
Using PrefixPair: 'TACACTCTTTCCCTACACGACGCTCTTCCGATCT' and 'GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT'
ILLUMINACLIP: Using 1 prefix pairs, 0 forward/reverse sequences, 0 forward only sequences, 0 reverse only sequences
Input Read Pairs: 30076546 Both Surviving: 28638946 (95.22%) Forward Only Surviving: 1435686 (4.77%) Reverse Only Surviving: 27 (0.00%) Dropped: 1887 (0.01%)
TrimmomaticPE: Completed successfully
[05/05/24 16:08:24]: pigz -f -p 16 fun/training/trimmomatic/trimmed_left.fastq
[05/05/24 16:11:22]: pigz -f -p 16 fun/training/trimmomatic/trimmed_left.unpaired.fastq
[05/05/24 16:11:32]: pigz -f -p 16 fun/training/trimmomatic/trimmed_right.fastq
[05/05/24 16:14:37]: pigz -f -p 16 fun/training/trimmomatic/trimmed_right.unpaired.fastq
[05/05/24 16:14:37]: Quality trimmed reads: ('fun/training/trimmomatic/trimmed_left.fastq.gz', 'fun/training/trimmomatic/trimmed_right.fastq.gz', None)
[05/05/24 16:14:37]: FASTQ headers seem compatible with Trinity
[05/05/24 16:14:37]: Running read normalization with Trinity
[05/05/24 16:14:37]: /venv/opt/trinity-2.8.5/util/insilico_read_normalization.pl --PARALLEL_STATS --JM 51G --min_cov 5 --max_cov 50 --seqType fq --output fun/training/normalize --CPU 16 --SS_lib_type RF --pairs_together --left fun/training/trimmomatic/trimmed_left.fastq.gz --right fun/training/trimmomatic/trimmed_right.fastq.gz
[05/05/24 16:54:02]: Normalized reads: ('fun/training/normalize/left.norm.fq', 'fun/training/normalize/right.norm.fq', None)
[05/05/24 16:54:02]: Long reads: (None, None, None)
[05/05/24 16:54:02]: Long reads FASTA format: (None, None, None)
[05/05/24 16:54:02]: Long SeqCleaned reads: (None, None, None)
[05/05/24 21:29:16]: Running StringTie on Hisat2 coordsorted BAM
[05/05/24 21:29:16]: stringtie -p 16 --rf fun/training/hisat2.coordSorted.bam
[05/05/24 21:29:37]: Removing poly-A sequences from trinity transcripts using seqclean
[05/05/24 21:29:37]: /venv/opt/pasa-2.4.1/bin/seqclean trinity.fasta -c 16
[05/05/24 21:29:56]: seqclean running options:
seqclean trinity.fasta -c 16
Standard log file: seqcl_trinity.fasta.log
Error log file: err_seqcl_trinity.fasta.log
Using 16 CPUs for cleaning
-= Rebuilding trinity.fasta cdb index =-
Launching actual cleaning process:
psx -p 16 -n 1000 -i trinity.fasta -d cleaning -C '/trinity.fasta:ANLMS100:::11:0' -c '/venv/opt/pasa-2.4.1/bin/seqclean.psx'
Collecting cleaning reports
**************************************************
Sequences analyzed: 60028
-----------------------------------
valid: 59993 (738 trimmed)
trashed: 35
**************************************************
----= Trashing summary =------
by 'dust': 35
------------------------------
Output file containing only valid and trimmed sequences: trinity.fasta.clean
For trimming and trashing details see cleaning report : trinity.fasta.cln
--------------------------------------------------
seqclean (trinity.fasta) finished on machine 5f1d37f6bbc3 in , without a detectable error.
[05/05/24 21:29:56]: minimap2 -ax splice -t 16 --cs -u b -G 3000 fun/training/genome.fasta fun/training/trinity.fasta.clean | samtools sort --reference fun/training/genome.fasta -@ 4 -o fun/training/trinity.alignments.bam -
[05/05/24 21:30:36]: Converting transcript alignments to GFF3 format
[05/05/24 21:30:43]: Converting Trinity transcript alignments to GFF3 format
[05/05/24 21:30:49]: Running PASA alignment step using 59,993 transcripts
[05/05/24 21:30:49]: /venv/opt/pasa-2.4.1/Launch_PASA_pipeline.pl -c /suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/annotation~funannotate/test/fun/training/pasa/alignAssembly.txt -r -C -R -g /suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/annotation~funannotate/test/fun/training/genome.fasta --IMPORT_CUSTOM_ALIGNMENTS /suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/annotation~funannotate/test/fun/training/trinity.alignments.gff3 -T -t /suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/annotation~funannotate/test/fun/training/trinity.fasta.clean -u /suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/annotation~funannotate/test/fun/training/trinity.fasta --stringent_alignment_overlap 30.0 --TRANSDECODER --ALT_SPLICE --MAX_INTRON_LENGTH 3000 --CPU 16 --ALIGNERS blat --transcribed_is_aligned_orient --trans_gtf /suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/annotation~funannotate/test/fun/training/funannotate_train.stringtie.gtf
[05/06/24 16:44:02]: PASA assigned 55,859 transcripts to 35,900 loci (genes)
[05/06/24 16:44:02]: Getting PASA models for training with TransDecoder
[05/06/24 16:44:02]: /venv/opt/pasa-2.4.1/scripts/pasa_asmbls_to_training_set.dbi --pasa_transcripts_fasta Species_name_pasa.assemblies.fasta --pasa_transcripts_gff3 Species_name_pasa.pasa_assemblies.gff3
[05/06/24 16:54:01]: PASA finished. PASAweb accessible via: localhost:port/cgi-bin/index.cgi?db=/suikou/files/m256y/yoshitake.kazutoshi/work/pp-dev/yoshitake/test/annotation~funannotate/test/fun/training/pasa/Species_name_pasa
[05/06/24 16:54:01]: Using Kallisto TPM data to determine which PASA gene models to select at each locus
[05/06/24 16:54:01]: Building Kallisto index
[05/06/24 16:54:01]: /venv/opt/pasa-2.4.1/misc_utilities/gff3_file_to_proteins.pl fun/training/pasa.step1.gff3 fun/training/genome.fasta cDNA
[05/06/24 16:54:43]:
[05/06/24 16:54:43]: kallisto index -i fun/training/getBestModel/bestModel fun/training/getBestModel/transcripts.fa
[05/06/24 16:56:19]:
[build] loading fasta file fun/training/getBestModel/transcripts.fa
[build] k-mer length: 31
[build] warning: clipped off poly-A tail (longer than 10) from 4 target sequences
[build] warning: replaced 875 non-ACGUT characters in the input sequence with pseudorandom nucleotides
[build] counting k-mers ... done.
[build] building target de Bruijn graph ... done
[build] creating equivalence classes ... done
[build] target de Bruijn graph has 83310 contigs and contains 30288416 k-mers
[05/06/24 16:56:19]: Mapping reads using pseudoalignment in Kallisto
[05/06/24 16:56:19]: kallisto quant -i fun/training/getBestModel/bestModel -o fun/training/getBestModel/kallisto --plaintext -t 16 --rf-stranded fun/training/trimmomatic/trimmed_left.fastq.gz fun/training/trimmomatic/trimmed_right.fastq.gz
[05/06/24 16:59:32]:
[quant] fragment length distribution will be estimated from the data
[index] k-mer length: 31
[index] number of targets: 33,046
[index] number of k-mers: 30,288,416
[index] number of equivalence classes: 57,540
[quant] running in paired-end mode
[quant] will process pair 1: fun/training/trimmomatic/trimmed_left.fastq.gz fun/training/trimmomatic/trimmed_right.fastq.gz
[quant] finding pseudoalignments for the reads ... done
[quant] processed 28,638,946 reads, 21,421,278 reads pseudoaligned
[quant] estimated average fragment length: 171.492
[ em] quantifying the abundances ... done
[ em] the Expectation-Maximization algorithm ran for 1,084 rounds
[05/06/24 16:59:32]: Parsing expression value results. Keeping best transcript at each locus.
[05/06/24 17:00:55]: Wrote 19,612 PASA gene models
[05/06/24 17:00:55]: PASA database name: Species_name
[05/06/24 17:00:55]: Trinity/PASA has completed, you are now ready to run funanotate predict, for example:
  funannotate predict -i MyAssembly.fa \
    -o fun -s "Species name" --cpus 16