Differences

This shows you the differences between two versions of the page.

how_to [2020/03/06 01:44] – [Access a Bioconductor package source code?] admin
how_to [2021/06/06 12:58] – [Install R locally (e.g. on a cluster)?] admin
Line 86: Line 86:
 ==== Install R locally (e.g. on a cluster)? ====
  
-If you want to install the latest __development__ version of R on your macOS, first install [[https://github.com/fxcoudert/gfortran-for-macOS/releases|Fortran]] if you do not have it. You may also need to update [[https://superuser.com/a/664326|PCRE]].
+Like most Unix programs, R can be installed from source by downloading the [[https://cloud.r-project.org/|source]] code, configuring and compiling it, and then installing the binaries. If you try this simple approach and you get errors like [[https://tdhock.github.io/blog/2017/compiling-R/|these]], it means the dependencies are not available or not up to date on your machine. E.g., if you want to install the latest __development__ version of R on macOS, first install [[https://github.com/fxcoudert/gfortran-for-macOS/releases|Fortran]] if you do not have it. You may also need to update PCRE using brew on [[http://superuser.com/a/664326|macOS]]. Alternatively, you can compile PCRE2 from the [[https://www.linuxfromscratch.org/blfs/view/svn/general/pcre2.html|source]], and then let R know where it is [[https://unix.stackexchange.com/a/149361|using]] CPPFLAGS and LDFLAGS.
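The CPPFLAGS and LDFLAGS approach mentioned above can be sketched roughly as follows. This is only an illustration: the ''$HOME/local'' prefix and the ''R-devel'' directory name are assumptions, not paths taken from this page, and the build step runs only if an unpacked R source tree is actually present.

```shell
#!/bin/sh
# Sketch: point R's configure script at a PCRE2 built under a personal prefix.
# PREFIX and the R-devel directory name are hypothetical, chosen for illustration.
PREFIX="$HOME/local"             # assumed location of the locally built PCRE2
CPPFLAGS="-I$PREFIX/include"     # where the compiler should look for headers
LDFLAGS="-L$PREFIX/lib"          # where the linker should look for libraries
export CPPFLAGS LDFLAGS

# Configure and build only when an unpacked R source tree is present.
if [ -x ./R-devel/configure ]; then
    (cd R-devel && ./configure --prefix="$PREFIX" && make && make install)
else
    echo "R source tree not found; would run: ./configure --prefix=$PREFIX"
fi
```

Exporting the two variables before calling ''configure'' keeps the flags in one place, so the same shell session can be reused if the configure step has to be repeated.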
  
-If you do not have sudo permissions, like when you are working on a cluster, you should either use the software module, or install it locally in your home directory. E.g., you can install R on Stampede or Maverick as follows:
+If you do not have sudo permissions, like when you are working on a cluster, you should either use the software module, or install locally in your home directory. E.g., you can install R on Stampede or Maverick TACC clusters as follows:
  
 <code> <code>
Line 101: Line 101:
 </code> </code>
  
-Now, you can add $HOME/arch to your path by inserting the following line in your .bachrc file:
+If the above works without any error, you can add $HOME/arch to your path by inserting the following line in your .bashrc file:
  
 <code> <code>
Line 107: Line 107:
 </code> </code>
  
-I had to follow [[http://pj.freefaculty.org/blog/?p=315|these]] steps to resolve the bzip2 issue on the Lonestar5 cluster. Oncinfo Lab members can use it if they add the following to their .bashrc
+On some clusters, [[https://tdhock.github.io/blog/2017/compiling-R/|a few]] libraries might not be installed or they might be too old (e.g., zlib, curl, bzip2, xz, pcre). In particular, the bzip2 issue can be resolved by following [[http://pj.freefaculty.org/blog/?p=315|these]] steps on the Lonestar5 cluster. Oncinfo Lab members can use it if they add the following to their .bashrc:
  
 <code> <code>
 export PATH=/home1/03270/zare/Install/bin:$PATH
 </code> </code>
 +
 +Installing R using [[https://datascience.stackexchange.com/questions/77335/anconda-r-version-how-to-upgrade-to-4-0-and-later/86905#86905|conda]] is only a quick and [[https://www.perfectlyrandom.org/2016/04/08/install-xml2-r-package-on-macos/|dirty]], temporary solution. E.g., as of 2021-05-14, the xml2 package that is installed by conda is not compatible with the R 4.0 that is installed using conda; therefore, solving the issue in [[https://stackoverflow.com/questions/37035088/unable-to-install-r-package-due-to-xml-dependency-mismatch|this way]] moves the R version from 4.0 back to 3.0! The time you spend addressing such issues may well exceed the time you need for a clean installation of R from source.
  
 ---- ----
Line 124: Line 126:
  
 ==== Get familiar with machine learning and its applications in computational biology? ====
 +
 +- You can enroll in many online machine learning courses. Some of the best courses in ML can be found [[https://docs.google.com/spreadsheets/d/1AK8lqS-ztMhh8YoOaQ7ScIZmabrQ5AFxAyXKwYWiT04/edit#gid=0|here]].
  
 - Most common ML techniques are very well explained in [[https://scikit-learn.org/stable/user_guide.html|Scikit learn]] with [[https://scikit-learn.org/stable/modules/decomposition.html|illustrations]] and example Python code. These techniques have been implemented in [[https://www.kaggle.com/getting-started/5243|R]] packages including mlr3 and tidymodels.
Line 150: Line 154:
 ==== Convert pdf to MS word? ====
  
-Try whatever you can to avoid conversion! Instead, educate your team and your collaborators to use [[https://www.authorea.com/users/54336|Authorea]], [[https://www.overleaf.com/|Overleaf]] or at least Google Doc. In Google Doc, references can be easily handled using [[https://gsuite.google.com/marketplace/app/paperpile/894076725911|Paperpile]] add-on, and figures can be automatically numbered using the the [[https://gsuite.google.com/marketplace/app/cross_reference/269114033347?pann=cwsdp&hl=en|Cross Reference]] add-on as suggested in these [[https://lcolladotor.github.io/2019/04/02/how-to-write-academic-documents-with-googledocs/#.Xjne6RNKjUI|guidelines]] on how to write academic documents with Google Docs . __Only__ if your biologist collaborators cannot [[http://www.dedoimedo.com/computers/latex.html|unfortunately]] edit the LaTeX source, consider using a conversion tool such as docs.[[https://docs.zone/|zone]]. Alternatively, Acrobat Pro can export a .pdf as a .doc file. If Bibtex is not an option, use [[http://www.easybib.com/|EasyBib]].
+Try whatever you can to avoid conversion! Instead, educate your team and your collaborators to use [[https://www.authorea.com/users/54336|Authorea]], [[https://www.overleaf.com/|Overleaf]] or at least Google Docs. In Google Docs, references can be easily handled using the [[https://gsuite.google.com/marketplace/app/paperpile/894076725911|Paperpile]] add-on (NOT the extension), and figures can be automatically numbered using the [[https://gsuite.google.com/marketplace/app/cross_reference/269114033347?pann=cwsdp&hl=en|Cross Reference]] add-on as suggested in these [[https://lcolladotor.github.io/2019/04/02/how-to-write-academic-documents-with-googledocs/#.Xjne6RNKjUI|guidelines]] on how to write academic documents with Google Docs. Add-ons are not available when editing .docx files.
+__Only__ if your biologist collaborators cannot [[http://www.dedoimedo.com/computers/latex.html|unfortunately]] edit the LaTeX source, consider using a conversion tool such as the Adobe [[https://chrome.google.com/webstore/detail/adobe-acrobat/efaidnbmnnnibpcajpcglclefindmkaj|Acrobat]] Chrome extension or Acrobat Pro, which can export a .pdf as a .doc file. docs.[[https://docs.zone/|zone]] is an online alternative. If you need to separate pages, use [[https://superuser.com/a/1584919|pdfjam]]. If Bibtex is not an option, use [[http://www.easybib.com/|EasyBib]].
  
 ---- ----
Line 231: Line 235:
 ===== Write a scientific paper? =====
  
-Put the figures together and then [[http://www.scidev.net/global/publishing/practical-guide/how-do-i-write-a-scientific-paper-.html|draft]] different [[https://www.nature.com/articles/nmeth.4532?WT.ec_id=NMETH-201712&spMailingID=55474826&spUserID=MTIyMzczNjc4MDI2S0&spJobID=1285409878&spReportId=MTI4NTQwOTg3OAS2|sections]]. Focus the [[http://www.grantcentral.com/strategies-for-avoiding-common-problems-with-research-manuscripts/|Discussion]]. Be careful about [[http://colah.github.io/posts/2019-05-Collaboration/index.html|authorship]].
+Put the figures together and then [[http://www.scidev.net/global/publishing/practical-guide/how-do-i-write-a-scientific-paper-.html|draft]] different [[https://www.nature.com/articles/nmeth.4532?WT.ec_id=NMETH-201712&spMailingID=55474826&spUserID=MTIyMzczNjc4MDI2S0&spJobID=1285409878&spReportId=MTI4NTQwOTg3OAS2|sections]]. Focus the [[http://www.grantcentral.com/strategies-for-avoiding-common-problems-with-research-manuscripts/|Discussion]]. Be careful about [[http://colah.github.io/posts/2019-05-Collaboration/index.html|authorship]]. It might be easier to write the [[https://plos.org/resource/how-to-write-a-great-abstract/?utm_medium=email&utm_source=internal&utm_campaign=modnewsletters&utm_content=modnewsletter|abstract]] //after// other sections are drafted.
  
 ---- ----
 +
  
 ===== Prepare or review computational biology papers for Nature methods? =====
Line 431: Line 436:
 ==== Encrypt a folder? ====
  
-Compress the folder in 7z format using the AES-256 encrypting algorithm. [[https://www.dzhang.com/blog/2018/03/11/using-7-zip-create-aes-256-encrypted-zip-files-command-line|E.g]],
+Archive and compress a ''largeFolder'' folder using [[https://www.cyberciti.biz/tips/linux-how-to-encrypt-and-decrypt-files-with-a-password.html|tar]] and encrypt it using ''gpg'' with the AES-256 encryption algorithm. You will obtain the [[https://crypto.stackexchange.com/a/71078|strongest]] security with these options:
  
 <code> <code>
-7z a -tzip -mem=AES256 -p super-secret.7z super-secret_folder
+tar -cvz largeFolder | gpg --s2k-mode 3 --s2k-count 65011712 --s2k-digest-algo SHA512 --s2k-cipher-algo AES256 --symmetric --no-symkey-cache -o largeFolder.tgz.gpg
 +</code>
  
-7z x super-secret.7z ## Decrypt and uncomperess
+The ''--no-symkey-cache'' option is available in gpg [[https://unix.stackexchange.com/a/557051|version]] >= 2.2.7. On macOS, you need to first install [[https://sourceforge.net/p/gpgosx/docu/Download/|GnuPG]]. An alternative approach is to use [[http://www.dzhang.com/blog/2018/03/11/using-7-zip-create-aes-256-encrypted-zip-files-command-line|7z]], which can be installed using [[http://molecularsciences.org/content/installing-and-running-7-zip-from-mac-terminal/|homebrew]]; however, 7z is Windows-based and thus not recommended.\\ 
 +\\ 
 +To decrypt and uncompress:
 + 
 +<code> 
 +gpg --decrypt --no-symkey-cache largeFolder.tgz.gpg | tar -xzv
 </code> </code>
 +
 +A good password should have at least 12 characters, include both lowercase and uppercase letters, and contain at least one digit and one special character such as !@#$%^&*(). Do not use dictionary words in your password; instead, use a [[https://cybernews.com/best-password-managers/how-to-create-a-strong-password/|passphrase]] "to create strong passwords".
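The character rules above can be checked mechanically. The following is a minimal sketch, not a complete password policy: the function name ''is_strong_password'' is hypothetical, any non-alphanumeric character is assumed to count as "special", and the dictionary-word rule is not checked at all.

```shell
#!/bin/sh
# Sketch: check the length and character-class rules for a password.
# is_strong_password is a hypothetical helper; it does NOT detect dictionary words.
is_strong_password() {
    pw=$1
    # At least 12 characters.
    [ "${#pw}" -ge 12 ] || return 1
    # At least one lowercase letter, one uppercase letter, and one digit.
    case $pw in *[[:lower:]]*) ;; *) return 1 ;; esac
    case $pw in *[[:upper:]]*) ;; *) return 1 ;; esac
    case $pw in *[[:digit:]]*) ;; *) return 1 ;; esac
    # At least one special character (assumed here: anything non-alphanumeric).
    case $pw in *[![:alnum:]]*) ;; *) return 1 ;; esac
    return 0
}
```

For example, ''is_strong_password 'Tr1cky!Passw0rd' && echo ok'' succeeds, while a short or all-lowercase string is rejected. Passing a real password as a command-line argument can leave it in the shell history, so this is better suited to testing candidate patterns than actual secrets.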
  
 ---- ----
 +
  
 ==== Upload a file to Oncinfo and link to it? ====
Line 481: Line 495:
   - Reattach to a screen: ''screen -r $NAME''
   - Quit and [[https://askubuntu.com/questions/356006/kill-a-screen-session|kill]] your screen: ''Ctrl+a then Ctrl+\''
 +
 +----
 +
 +===== Search for public domain images? =====
 +
 +[[https://unsplash.com/|Unsplash]] is among the best [[https://wcmshelp.ucsc.edu/about-images/finding-public-domain-images.html|resources]] with many high-resolution images, which are frequently used by media.
 +
 +----
 +
 +===== Identify Senescence in Cells and Tissues? =====
 +
 +Watch [[https://cellsignal.wistia.com/medias/p9khwp3hzx|this]] quick 6-minute introduction to the senescence concept, and learn about the common markers and kits.
  
 ---- ----