1.Install Windows Subsystem for Linux (WSL)
Confirm that your Windows 10 version is 1803 (spring 2018), 1809 (fall 2018), or 1903 (spring 2019). To enable WSL, open PowerShell with administrator privileges (right-click the Windows logo at the bottom left of the screen → Windows PowerShell (Admin)).
Paste and execute the following command to enable the WSL function.
Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Windows-Subsystem-Linux
Then restart Windows
2.Install Ubuntu
Left-click the Windows logo at the lower left of the screen and start “Microsoft Store” from the menu. Click Search in the store and enter Ubuntu. Install and launch the Ubuntu 18.04 LTS entry that is displayed. (Other Ubuntu versions seem to work, but 18.04 is recommended.)
To start Ubuntu again later, left-click the Windows logo at the lower left of the screen and choose Ubuntu from the start menu. At the first startup you will be asked to set a user name and password.
3.Install wsl-terminal
The default Windows terminal is very hard to use because of font misalignment. Download the following wsl-terminal archive, unzip it to an appropriate folder, and run “open-wsl.exe” to start WSL. https://github.com/goreliu/wsl-terminal/releases/download/v0.8.13/wsl-terminal-0.8.13.zip
Alternatively, if your Windows version is 1903, you can use Microsoft's new terminal: search for “Windows Terminal” in the Microsoft Store and install it.
4.Install Docker
Docker performs container-type virtualization. A container behaves like an ultra-lightweight virtual PC that starts up in about a second.
Advantages
-It is lightweight: a base guest OS uses only a few tens of megabytes of disk space.
-Docker Hub is available as an image repository of unlimited capacity.
Disadvantages
-Because it uses Linux-specific features, the guest OS must be Linux, and the host OS is limited to a recent Linux.
-It is faster than a virtual machine because it is not fully virtualized, but it demands care with security and requires administrator privileges to use. On supercomputers we will use container-type virtualization software called Singularity instead of Docker; Singularity requires root privileges at installation time. (Singularity is already installed on the University of Tokyo Shirokane supercomputer, so it can be used there without root privileges.)
from https://blog.cloudboost.io/docker-vs-vm-548032d3ef58
Paste the following long command into the WSL window you launched. (To paste, middle-click, or right-click and choose Paste from the menu.)
cat << 'EOF2' | bash
if [ `which docker|wc -l` = 0 ];then
  sudo sed -i 's/%sudo\tALL=(ALL:ALL) ALL/%sudo\tALL=NOPASSWD: ALL/' /etc/sudoers
  sudo sed -i.bak -e "s%http://[^ ]\+%http://ftp.jaist.ac.jp/pub/Linux/ubuntu/%g" /etc/apt/sources.list
  sudo apt-get update
  sudo apt install -y libltdl7 cgroupfs-mount
  cd
  wget https://download.docker.com/linux/ubuntu/dists/xenial/pool/stable/amd64/docker-ce_17.03.3~ce-0~ubuntu-xenial_amd64.deb
  sudo dpkg -i docker-ce_17.03.3~ce-0~ubuntu-xenial_amd64.deb
fi
if [ `id -a $USER|grep "(docker)"|wc -l` = 0 ]; then
  sudo usermod -aG docker $USER
fi
if [ `service docker status|grep " is running"|wc -l` = 0 ]; then
  powershell.exe start-process bash -verb runas -ArgumentList "'"'-c "sudo cgroupfs-mount; sudo service docker start"'"'"
fi
if [ `grep DOCKER ~/.bashrc|wc -l` = 0 ]; then
cat << 'EOF' >> ~/.bashrc
#for docker
alias DOCKER='docker run -it --rm -v $PWD:$PWD -w $PWD'
shopt -s expand_aliases
if [ `service docker status|grep " is running"|wc -l` = 0 ]; then
  powershell.exe start-process bash -verb runas -ArgumentList "'"'-c "sudo cgroupfs-mount; sudo service docker start"'"'"
fi
EOF
fi
if [ "`gcc --version 2> /dev/null`" = "" ]; then
  sudo apt install -y build-essential
fi
EOF2
exit
You will be asked for your password only once; enter it. A dialog will also ask whether to run bash with administrator privileges; click “Yes”.
When the script finishes successfully, the terminal closes, so reopen wsl-terminal. (You may be prompted again to run bash with administrator privileges; click “Yes”.)
Please type the following command.
docker run hello-world
If you can see “Hello from Docker!”, you have installed docker successfully.
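If “Hello from Docker!” does not appear, it can help to find out whether the docker command is missing or the Docker service is simply not running. The following is a minimal sketch, not part of the original setup; the function name check_docker is made up for illustration.

```shell
# Report whether the docker client is installed and the daemon is reachable.
# check_docker is a hypothetical helper name, not a Docker command.
check_docker() {
  if ! command -v docker >/dev/null 2>&1; then
    echo "docker command not found"
    return 1
  fi
  # "docker info" fails when the client cannot talk to the daemon.
  if ! docker info >/dev/null 2>&1; then
    echo "docker daemon not reachable"
    return 2
  fi
  echo "docker is ready"
}

check_docker || true
```

On WSL, "docker daemon not reachable" usually means the service has to be started again with administrator privileges, as the setup script does.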
1.Install Docker
On a Mac, make sure that the OS version is macOS Sierra 10.12 or later.
Download Docker Desktop for Mac from the following URL, double-click the dmg file, and follow the instructions to complete the installation.
https://download.docker.com/mac/stable/Docker.dmg
2.Change the setting of Docker
The memory limit of Docker's virtual machine is low by default, so click the Docker icon (picture of a whale) at the top of the screen, click Preferences …, and open the Advanced tab. Set CPUs to the number of CPU cores, and set Memory to your computer's total memory minus about 1 to 2 GB for the OS.
3.Run Docker
Open Finder and start Applications → Utilities → Terminal.
Type the following command, and check if you can see “Hello from Docker!”
docker run hello-world
4.Install Homebrew
The command-line tools bundled with macOS are about 10 years old, so you need to install newer ones. Open the terminal and paste the following commands.
# If this Ruby installer reports that it is deprecated, use the install command shown on https://brew.sh instead.
/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
# Single quotes keep ${PATH} unexpanded in the file; the redirection must be outside the quotes.
echo 'export PATH=/usr/local/opt/coreutils/libexec/gnubin:/usr/local/bin:/usr/local/sbin:${PATH}' >> ~/.bash_profile
source ~/.bash_profile
# You may not need the next two lines. Run them if brew install gives an error.
mkdir -p /usr/local/sbin /usr/local/opt
sudo chown $USER /usr/local/sbin /usr/local/opt
brew install grep gawk gzip ed htop iftop
brew install gnu-tar gnu-sed gnu-time gnu-getopt
brew install binutils findutils diffutils coreutils moreutils
You will be asked for your Mac password, so enter it.
BioContainers aims to make bioinformatics tools easy to install, for the sake of reproducibility. https://biocontainers.pro/#/registry
More than 7,000 tools appear to be registered at present, but the following kinds of tools are not registered.
-License required (e.g. CLC, USEARCH, SSPACE)
-Huge database (e.g. Polyphen2, PROVEAN)
-Minor tools
Many tools are registered, so first we will learn how to use BioContainers.
Example: If you want to use a mapping tool called BWA
docker run -it --rm -v $PWD:$PWD -w $PWD quay.io/biocontainers/bwa:0.7.17--h84994c4_5 bash
# Since Windows users have registered shortcuts at above setup, the command can be shortened as below.
DOCKER quay.io/biocontainers/bwa:0.7.17--h84994c4_5 bash
To use another tool, change the container name (quay.io/biocontainers/bwa:0.7.17--h84994c4_5) accordingly.
Most commands in each container are installed in /usr/local/bin.
When you have finished running the tool, type exit to leave the container.
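Instead of opening a shell inside the container, a single command can also be run directly. The sketch below wraps the same docker run options in a shell function; the function name biocontainer, the DRY_RUN switch, and the file name ref.fa are assumptions made for illustration. The -it option is dropped because no interactive terminal is needed.

```shell
# Run one command inside a container with the current directory mounted.
# With DRY_RUN=1 the docker command line is printed instead of executed.
# biocontainer and DRY_RUN are hypothetical names, not part of Docker.
biocontainer() {
  image=$1
  shift
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo docker run --rm -v "$PWD:$PWD" -w "$PWD" "$image" "$@"
  else
    docker run --rm -v "$PWD:$PWD" -w "$PWD" "$image" "$@"
  fi
}

# Preview the command that would index a hypothetical reference file ref.fa.
DRY_RUN=1 biocontainer quay.io/biocontainers/bwa:0.7.17--h84994c4_5 bwa index ref.fa
```

Unlike the DOCKER alias, a function like this can be placed in a script and used on a Mac as well.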
The BioContainers website is often down; in that case you can search Quay.io directly.
https://quay.io/organization/biocontainers
When searching there, enter the tool name in all lowercase letters.
This section shows how to run a CentOS 7 container with Docker and save it as a new Docker image under a name of your choice. First, start the CentOS 7 container.
docker run -it centos:7 bash
Install the tools you need. For example, to install wget:
yum install -y wget
When the installation is complete, leave the container by typing exit. To save the container in which the tool is installed, you first need its container ID, so run the following command.
docker ps -a
The container listed at the top is the newest one; make a note of its ID (for example, a94dd36ef032). The following command registers it on your PC as a new image.
docker commit a94dd36ef032 my_dockerhub_id/wget:latest
You can list the registered images with the following command.
docker images
If you have a Docker Hub account, you can publish the image to other people with the following command:
docker push my_dockerhub_id/wget:latest
Unnecessary containers can be deleted by the following command.
docker rm a94dd36ef032
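Looking up the container ID by hand and then committing it can be sketched as one small function. This is an illustration, not the original procedure: the name commit_latest is made up, and it relies on docker ps -l (latest created container) with -q (print only the ID).

```shell
# Commit the most recently created container as a new image.
# Argument: the new image name, e.g. my_dockerhub_id/wget:latest
# commit_latest is a hypothetical helper name.
commit_latest() {
  id=$(docker ps -lq)   # -l: latest created container, -q: ID only
  if [ -z "$id" ]; then
    echo "no container found" >&2
    return 1
  fi
  docker commit "$id" "$1"
}

# Usage, right after leaving the container with exit:
#   commit_latest my_dockerhub_id/wget:latest
```

This assumes that no other container was created in the meantime; if in doubt, check the ID with docker ps -a first.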
The steps above create a new image by hand. Alternatively, an image can be built from a file named Dockerfile.
For example, the homology search tool diamond is not registered in BioContainers, but its author provides a Dockerfile, so you can easily build an image yourself. The procedure is as follows.
git clone https://github.com/bbuchfink/diamond.git
cd diamond
docker build -t diamond . #The name of the image follows -t, so give a descriptive name (in this case, "diamond")
After the build, you can use the docker image as follows.
docker run -it --rm -v $PWD:$PWD -w $PWD diamond sh
Many other tools now distribute a Dockerfile as well.
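To give an idea of what a Dockerfile contains, the CentOS 7 image with wget that was created by hand earlier could be described roughly as follows. This is a sketch under assumptions: the directory name wget-image is arbitrary, and because CentOS 7 has reached end of life, the yum step may fail unless the package repositories are still reachable.

```shell
# Write a two-line Dockerfile into a new directory (wget-image is an arbitrary name).
mkdir -p wget-image
cat << 'EOF' > wget-image/Dockerfile
FROM centos:7
RUN yum install -y wget
EOF

# To build the image from this file, run:
#   docker build -t my_dockerhub_id/wget:latest wget-image
```

FROM names the base image and each RUN line is a command executed while the image is built, so the file serves as a record of how the image was made.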
Because Docker needs administrator (root) privileges, it cannot be used on a shared server. Instead, software called Singularity may be available. Singularity can be used on both the supercomputer at the University of Tokyo Shirokane campus and the supercomputer at the National Institute of Genetics.
Public Docker images registered in Docker Hub, Quay, etc. can be used with Singularity. Log in to the server and run commands like the following. (On our laboratory servers, be sure to move to the “work” folder before performing any analysis. Unfortunately, WSL can create Singularity images but cannot start containers.)
First, convert the docker public image to a singularity image. The example below uses a bwa image from BioContainers.
singularity pull --name bwa.sif docker://quay.io/biocontainers/bwa:0.7.17--h84994c4_5
The image bwa.sif is created. When converting other images, change the image name given to --name and the URL of the Docker image after docker:// as appropriate. The following command then starts the bwa container, which you can use in the same way as with Docker.
singularity shell bwa.sif
To run only bwa:
singularity exec bwa.sif bwa
In addition, to make the shortcut of bwa using singularity, you can add an alias to ~/.bashrc as follows:
shopt -s expand_aliases #to enable bwa aliases in batch scripts.
alias bwa='singularity exec /suikou/files/m48/user2/work/img/bwa.sif bwa' #Specify the image of singularity with an absolute path.
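A shell function is a possible alternative to the alias, because functions work in batch scripts without shopt -s expand_aliases. The sketch below is an assumption-laden example: the variable name BWA_SIF and its default path are hypothetical and should be replaced with the absolute path of your own image.

```shell
# Hypothetical image location; replace with the absolute path of your own .sif file.
BWA_SIF=${BWA_SIF:-$HOME/work/img/bwa.sif}

# Calling "bwa ..." now runs bwa inside the Singularity image,
# passing all arguments through unchanged.
bwa() {
  singularity exec "$BWA_SIF" bwa "$@"
}

# Usage (ref.fa is a placeholder file name):
#   bwa index ref.fa
```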
Starting next time, choose a category of tools to survey (overlap with other people is fine), and report on one or more tools in two weeks. From the second survey onward, compare the results with those of the tools you examined before (computational time, memory usage, disk usage, accuracy, sensitivity, etc.).
Write up the results of your survey on the WordPress blog below. Any language is acceptable, but the screen you show when presenting at the briefing session should be in English. (Write it in English in advance, or confirm that Google Translate turns it into natural English.)
http://www.suikou.fs.a.u-tokyo.ac.jp/blog/
Log in at the following URL. The ID and password are the same as those for the laboratory server; however, change your password and e-mail address after logging in.
http://webpark1634.sakura.ne.jp/blog/wp-login.php
Please contact me if you want to use a new WordPress plugin.
Category and Tool Example
Read QC
-illumina: FASTQC
-nanopore: nanoQC
Read trimming
-illumina: FASTX-toolkit, trimmomatic, sickle
-nanopore: Nanofilt
k-mer analysis
-genome size prediction: KmerGenie, KAT, GenomeScope (docker pull greatfireball/ime_genomescope)
homology search
-blast, MAGICBLAST, last, diamond, ghostx, blat
mapping
-whole genome illumina→genome: bwa aln, bwa mem, bowtie2, subread, soapaligner
-whole genome nanopore→genome: minimap2, minialign, last
-RNA-seq→genome: tophat, hisat2, star
-cDNA→genome: exonerate, GeneWise (https://www.ebi.ac.uk/~birney/wise2/), gmap, spaln, minimap2
-RNA-seq→cDNA: kallisto, RSEM, salmon
-rRNA→rRNA: blast, usearch
assembly
-genome illumina: CLC, SOAPdenovo, Platanus (docker pull c2997108/platanus:1.2.4), ABySS, SPAdes, MaSuRCA, Meraculous
-genome nanopore: canu, flye, Ra (docker pull c2997108/ra:2018-12-11), Redbean (wtdbg), Unicycler, Manta, FALCON
-metagenome illumina: megahit, metaSPAdes (SPAdes)
-RNA-seq illumina: Trinity, TransABySS, SOAPdenovo-Trans, rnaSPAdes (SPAdes)
scaffolding
-illumina: BESST, SSPACE
-illumina RNA-seq: BESST_RNA (https://github.com/ksahlin/BESST_RNA), Rascaf, P_RNA_scaffolder (lost?)
-pacbio, nanopore: LINKS, quickmerge (https://github.com/mahulchak/quickmerge)
-using close species: Chromosomer, MEDUSA, AlignGraph (https://github.com/baoe/AlignGraph)
gap close
-pacbio, nanopore: LR_Gapcloser (https://github.com/CAFS-bioinformatics/LR_Gapcloser), GMcloser
merge assembly
-Metassembler (https://sourceforge.net/projects/metassembler/), GAM-NGS (https://github.com/vice87/gam-ngs), Mix (https://github.com/cbib/MIX)
polishing
-pilon, racon
assembly QC
-genome: QUAST-LG, assembly-stats, REAPR (https://www.sanger.ac.uk/science/tools/reapr), BUSCO
-cDNA: TransRate
clustering
-CD-HIT, SSEARCH, VSEARCH
reference guided assembly
-cufflinks, stringtie, strawberry (https://github.com/ruolin/Strawberry)
SNP calling
-GATK, bcftools mpileup, freebayes, VarScan
SV (Structural Variants) calling
-pindel, breakdancer, Manta
databases
-rDNA: SILVA, RDP
statistics
-DE: edgeR, DESeq2, cuffdiff, sleuth, ballgown
annotation
-transcriptome: dammit, trinotate
pipeline
-shotgun metagenome: MG-RAST, Sunbeam
-16S rRNA metagenome: QIIME
-RNA-seq: SPARTA, VIPER, iDEP
-single cell RNA-seq: zUMIs, seurat
The following is a list of slides submitted when we investigated the enrichment analysis tool two years ago.
The following sequence is the predicted nucleotide sequence of a certain gene in pearl oysters.
atgactctgaaggatgccctcaacaaaagtcacacaaatacaggaaacatgctcacaata cttcaaagctttgaaaatcgtttaaagaagttagagggaacagttgagcctgtttacaat gagacagaaatgctgcggcgcagacaagaaaatatagagaaaactatgacaacactggac aatgtgctgggttactaccatattgctaaagatgtacaagatttgattaaagaaggtcca gtagtttgtggtctggagaagtacctgtctactatggaccggctgctccaagcactgaac tactttaataaacataacccaaccagtctggaagtgacagatgtcatcaaagtatatgat gatggtaaagatacattgaatgcagagttccgtagtttacttggtcgtcactgtcgtccg gtgccggctgttactatactggatttactaggaccagatgaagagttacaaacaatggaa aatgatgcacccatagaacatctgcctgagaaaattgtgaatgatttaaccctcatcgca aagtggctatacaccaatggtaaagctacagagtatatgaaagattacaccaaagtcagg tcccaaatgctcctctactctctgcaggggaactcaataaagcggaaggctaccacggcc ttgatgcagtccccttttgatccaggtcatagaagacaaggctcttataacgaattgaca aaagaggaaagttttgatgttgaaattgatatctacataacagaactaacagcattgctg aaacttattcagaatgaccctgagagatcttcgatgccccgagacggtacagttcatgaa ctgacaaaccataccattatagtactggagcccctgttagattatgctgagacagctggg gccatgttactcacccatggtgaacatgcagttccatctgatgctgtggatgtcaagaaa agtaaactcaagttggctgactatatcactaaggttttgtcagcattaggattaaactta agtaacaaggcagaaacttacagtgatccaatactcagacatgtgttcatgcttaataac tatcactacatactcaagtctttaaaaaggtctggggtattagaattaattcacacatgg aataaagatgtaggacagttttatgaggaccagatacatgaacaaaaaagactttattcc cagagctggagtaaagttctacattttgtactggaaatgaatgagccaatatcccaacaa agaatccagcaaatggagacatcaaagataaaggacaaagaaaagcagaatataaaagac aagttctctggattcaacaaagagttggaagaaatctcacgtgttcagaaagcatacgcc attcctgatccagaactgagggacaatatcaagaaagacaataaagaatatattgtgccg cgatacaagcttttcttagaaaaatttcaacggctgaacttcacaaagaattcagaaaaa tatatgaaatacactgtaaaggatgtggaagaaacacttgataaatttttcgatacttca gcttaa
This sequence is homologous to Exocyst complex component 7 of the closely related species C. gigas, but some exons are missing (the gene prediction is wrong). Investigate this using BLAST's blastx command.
How to use BLAST: https://togotv.dbcls.jp/20170606.html
The amino acid sequence of Exocyst complex component 7 [Crassostrea gigas]
>EKC30356.1 Exocyst complex component 7 [Crassostrea gigas]
MLTILQSFENRLRKLENTVEPVYNETEMLRRRQENIEKTMVTLDNVLGYYHVGKEVEEFIKEGPHNCGLE
KYLSIMDRLVQAHNYFNKHNPTSLELTDVIRVYDDGKEALVIEFRTLLGRHCRPVPPVMVLDMISTDEEL
QGSDDIQLEHLPEKILTELSLISTWLFNNTKNTEYMKDYTRSRSSMLIKSLQGHSFKRRAVITLMQSPFD
PGNKRQGSHAELPKEENLDVEVDIYITELSALLKLIQSEAQLMSGIIADKHHRSVFDNIIQEGLDSVIKN
GELLAVNAKKSIAKHDFINVLSVFPVLKHLRSIKPEFDLTLEGCATPTRAKLTSLLSTLGSTAAKALEEF
ALSIKTDPEKASMPKDGTVHELTNRTIIFLEPLQDYADTAGAMLLLHGEQAAPSEAVDPKKSKMRLADYI
TKTLSALGLNLTIKAETYSDPTLRPVFMLNNYHYILKSLKRSGLLDLIHTWNKDVGQFYEDRINEQKKLY
SESWSRVMHYITEVHEPISQQRIQAMENSKLKDKEKQNIKDKFSGFNKELEDILKIQKGYAIPDPELREQ
MKKDNKDFIIPAFRMFLDKFKRLNFTKNPEKYIKYSVQDVAEVVDKLFDMSA
In addition, download the pearl oyster genome from the following URL and find out which scaffold contains the exons missing from the gene prediction. The command to extract the fasta.gz file is gzip -d.
https://marinegenomics.oist.jp/pearl/download/pfu_genome1.0.fasta.gz
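One possible way to approach the genome search is to look for the C. gigas protein in the extracted genome with tblastn from BLAST+ and then count the hits per scaffold. The sketch below is only a starting point under assumptions: the function names, the file name exoc7.faa, and the e-value cutoff 1e-5 are not from the original exercise.

```shell
# Search a protein against a genome with tblastn and print tabular hits (outfmt 6).
# Arguments: protein FASTA, genome FASTA (both file names are placeholders).
# The e-value cutoff 1e-5 is an arbitrary assumption.
search_genome() {
  makeblastdb -in "$2" -dbtype nucl >/dev/null &&
    tblastn -query "$1" -db "$2" -evalue 1e-5 -outfmt 6
}

# Count hits per scaffold: column 2 of outfmt 6 is the subject (scaffold) ID.
# Output lines are "count<TAB>scaffold", most hits first.
summarize_hits() {
  awk -F '\t' '{ n[$2]++ } END { for (s in n) printf "%d\t%s\n", n[s], s }' |
    sort -k1,1nr -k2,2
}

# Usage, after saving the protein sequence as exoc7.faa:
#   gzip -d pfu_genome1.0.fasta.gz
#   search_genome exoc7.faa pfu_genome1.0.fasta | summarize_hits
```

Scaffolds with many hits are candidates; the hit coordinates in the full tblastn output show which parts of the protein they cover.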