Differences

This shows you the differences between two versions of the page.

how_to [2020/01/13 22:19]
admin [Prepare attractive, scientific presentations ?]
how_to [2020/05/28 07:33] (current)
shiva [Get familiar with machine learning and its applications in computational biology?]
Line 4: Line 4:
  
 ---- ----
 +
 +==== Work with screen sessions? ====
 +
 +There are five main [[http://www.pixelbeat.org/lkdb/screen.html|commands]] for working with screen sessions:
 +
 +  - Start and name a screen: ''​screen -S <NAME of the screen>''​
 +  - Detach from a screen: ''​Ctrl+a d''​
 +  - See the list of active screens: ''​screen -ls''​
 +  - Reattach to a screen: ''​screen -r <NAME of the screen>''​
 +  - Quit and kill your screen: ''​Ctrl+a '' ​ then ''​Ctrl+\''​
 +
 +----
 +
 +
 +==== Read and write excel files in R ====
 +
 +Use the [[https://cran.r-project.org/web/packages/openxlsx/index.html|openxlsx]] package to read, write, and edit xlsx files in R. It simplifies the creation of Excel .xlsx files by providing a high-level interface for writing, styling, and editing worksheets. Because the package is integrated with C++ through 'Rcpp', read/write times are comparable to those of the 'xlsx' and 'XLConnect' packages, with the added benefit of removing the dependency on Java.
 +
 +<​code>​
 +# E.g., writing four data frames to four sheets of an Excel workbook can be done as follows:
 +library(openxlsx)
 +listDataFrames <- list("​GO-BP"​=data.frame(egoBP),​ "​GO-MF"​=data.frame(egoMF),​
 +                       "​KEGG"​ = data.frame(eKEGG),​ "​NCG"​ = data.frame(ncg))
 +xlsFile <- file.path(resultPath,​ paste0(l1, "​_ORA_results.xlsx"​))
 +write.xlsx(x=listDataFrames,​ file=xlsFile)
 +</​code>​
 +
 +----
 +
  
 ==== Set local mirror for Rscript ==== ==== Set local mirror for Rscript ====
Line 56: Line 85:
  
 ==== Install R locally (e.g. on a cluster)? ==== ==== Install R locally (e.g. on a cluster)? ====
 +
 +If you want to install the latest __development__ version of R on macOS, first install [[https://github.com/fxcoudert/gfortran-for-macOS/releases|Fortran]] if you do not have it. You may also need to update [[https://superuser.com/a/664326|PCRE]].
  
 If you do not have sudo permissions,​ like when you are working on a cluster, you should either use the software module, or install it locally in your home directory. E.g., you can install R on Stampede or Maverick as follows: If you do not have sudo permissions,​ like when you are working on a cluster, you should either use the software module, or install it locally in your home directory. E.g., you can install R on Stampede or Maverick as follows:
Line 84: Line 115:
 ---- ----
  
-**Write in xls files from R?** \\ 
-Use [[http://​www.inside-r.org/​packages/​cran/​XLConnect/​docs/​writeWorksheetToFile|writeWorksheetToFile]]() function from [[http://​cran.r-project.org/​web/​packages/​XLConnect/​index.html|XLConnect]] package like below: 
  
-<​code>​ +==== Restore a file deleted in a local git directory? ====
-writeWorksheetToFile(data=matrix(1:​10,​2,​2),​file='​./​temp.xls',​sheet='​test1'​) +
-writeWorksheetToFile(data=matrix(1:​6,​2,​3),​file='​./​temp.xls',​sheet='​test2'​) +
-</​code>​+
  
----- 
- 
-**Restore a file** **deleted ****in a local git directory?​** \\ 
 [[http://stackoverflow.com/questions/9305326/why-doesnt-git-pull-bring-back-directories-that-ive-deleted|Use]] git reset --hard to bring your working directory completely back to the HEAD state. However, this is a [[http://stackoverflow.com/questions/5473/how-can-i-undo-git-reset-hard-head1|dangerous]] command because you may lose local changes that are not committed yet. [[http://stackoverflow.com/questions/9305326/why-doesnt-git-pull-bring-back-directories-that-ive-deleted|Use]] git reset --hard to bring your working directory completely back to the HEAD state. However, this is a [[http://stackoverflow.com/questions/5473/how-can-i-undo-git-reset-hard-head1|dangerous]] command because you may lose local changes that are not committed yet.
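
When only a few tracked files were deleted, a narrower alternative to a hard reset is to check out just those paths from HEAD, which leaves other uncommitted edits alone. The following is a minimal sketch, assuming the file was committed at least once and its deletion has not been staged; the helper names ''list_deleted'' and ''restore_deleted'' are illustrative and not git commands.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not part of git).

# Print tracked files that are missing from the working directory
# (deletions that have not been staged).
list_deleted() {
    git ls-files --deleted
}

# Restore the given path(s) from the last commit (HEAD).
# Other modified files are left untouched.
restore_deleted() {
    git checkout HEAD -- "$@"
}
```

Run ''list_deleted'' first to see what is missing, then pass only the paths you want back to ''restore_deleted''.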
  
 ---- ----
  
-**Get familiar with machine learning and its applications in computational biology? ​** \\+ 
 +==== Get familiar with machine learning and its applications in computational biology? ​==== 
 + 
 +- You can enroll in many online machine learning courses. Some of the best courses in ML can be found [[https://​docs.google.com/​spreadsheets/​d/​1AK8lqS-ztMhh8YoOaQ7ScIZmabrQ5AFxAyXKwYWiT04/​edit#​gid=0|here]]. 
 - Most common ML techniques are very well explained in [[https://​scikit-learn.org/​stable/​user_guide.html|Scikit learn]] with [[https://​scikit-learn.org/​stable/​modules/​decomposition.html|illustrations]] and example Python code. These techniques have been implemented in [[https://​www.kaggle.com/​getting-started/​5243|R]] packages including mlr3 and tidymodels. - Most common ML techniques are very well explained in [[https://​scikit-learn.org/​stable/​user_guide.html|Scikit learn]] with [[https://​scikit-learn.org/​stable/​modules/​decomposition.html|illustrations]] and example Python code. These techniques have been implemented in [[https://​www.kaggle.com/​getting-started/​5243|R]] packages including mlr3 and tidymodels.
  
Line 106: Line 133:
 ---- ----
  
-**Get access to the papers through the library when you are off-campus?** \\ + 
-First add the following to your browser (Chrome or Firefox) bookmarks.+==== Get access to the papers through the library when you are off-campus? ​==== 
 + 
 +In either of these two ways:\\
 +a) First add the following to your browser (Chrome or Firefox) bookmarks.
  
 <​code>​ <​code>​
Line 114: Line 144:
  
 Then, on the journal page, click on the bookmark. Login and start reading. Then, on the journal page, click on the bookmark. Login and start reading.
 +
 +b) Use [[https://​infosec.uthscsa.edu/​two-factor-enrollment|GlobalProtect]],​ which is the University VPN.
  
 ---- ----
  
-**Convert pdf to MS word?** \\ + 
-Try whatever you can to avoid conversion! Instead, educate your team and your collaborators to use [[https://​www.authorea.com/​users/​54336|Authorea]],​ [[https://​www.overleaf.com/​|Overleaf]] or at least google ​doc. __Only__ if your biologist collaborators cannot [[http://​www.dedoimedo.com/​computers/​latex.html|unfortunately]] edit the LaTeX source, consider using a conversion tool such as docs.[[https://​docs.zone/​|zone]]. Alternatively,​ Acrobat Pro can export a .pdf as a .doc file. If Bibtex is not an option, use [[http://​www.easybib.com/​|EasyBib]].+==== Convert pdf to MS word? ==== 
 + 
 +Try whatever you can to avoid conversion! Instead, educate your team and your collaborators to use [[https://www.authorea.com/users/54336|Authorea]], [[https://www.overleaf.com/|Overleaf]] or at least Google Docs. In Google Docs, references can be easily handled using the [[https://gsuite.google.com/marketplace/app/paperpile/894076725911|Paperpile]] add-on, and figures can be automatically numbered using the [[https://gsuite.google.com/marketplace/app/cross_reference/269114033347?pann=cwsdp&hl=en|Cross Reference]] add-on, as suggested in these [[https://lcolladotor.github.io/2019/04/02/how-to-write-academic-documents-with-googledocs/#.Xjne6RNKjUI|guidelines]] on how to write academic documents with Google Docs. __Only__ if your biologist collaborators cannot [[http://www.dedoimedo.com/computers/latex.html|unfortunately]] edit the LaTeX source, consider using a conversion tool such as docs.[[https://docs.zone/|zone]]. Alternatively, Acrobat Pro can export a .pdf as a .doc file. If BibTeX is not an option, use [[http://www.easybib.com/|EasyBib]].
  
 ---- ----
  
-**Enable spell check in Emacs on OS X?** \\+ 
 +==== Enable spell check in Emacs on OS X? ==== 
 The default Aquamacs spell checker has some issues. To replace it, first [[http://​stackoverflow.com/​questions/​19022015/​emacs-on-mac-os-x-how-to-get-spell-check-to-work|install]] Aspell, which is a [[https://​en.wikipedia.org/​wiki/​GNU_Aspell|replacement]] for Ispell: The default Aquamacs spell checker has some issues. To replace it, first [[http://​stackoverflow.com/​questions/​19022015/​emacs-on-mac-os-x-how-to-get-spell-check-to-work|install]] Aspell, which is a [[https://​en.wikipedia.org/​wiki/​GNU_Aspell|replacement]] for Ispell:
  
Line 137: Line 173:
 ---- ----
  
-**Do microaray analysis or anything else in Bioconductor?** \\+==== Do microarray analysis or anything else in Bioconductor? ==== 
 [[http://​manuals.bioinformatics.ucr.edu/​home/​R_BioCondManual#​TOC-Affy|This ]] is an excellent site with many well commented code examples and a lot of handy short-cuts. See also [[:​how_to|Functional analysis tools]]. [[http://​manuals.bioinformatics.ucr.edu/​home/​R_BioCondManual#​TOC-Affy|This ]] is an excellent site with many well commented code examples and a lot of handy short-cuts. See also [[:​how_to|Functional analysis tools]].
  
Line 200: Line 237:
 ---- ----
  
-=====   ​Prepare or review computational biology papers for Nature methods? ​  ===== +===== Prepare or review computational biology papers for Nature methods? =====
- +
-=====   =====+
  
 Read their "​Reviewing computational methods"​ ([[http://​www.nature.com/​nmeth/​journal/​v12/​n12/​full/​nmeth.3686.html|2015]]) and "​Guidelines for algorithms and software in Nature Methods"​ ([[http://​blogs.nature.com/​methagora/​2014/​02/​guidelines-for-algorithms-and-software-in-nature-methods.html|2014]]) articles. Provide source code, pseudocode, compiled executables,​ and the mathematical description. Softwares must be accompanied with documentation,​ sample data and the expected output, and a license (e.g., GPL≥2). Have a look at [[:​the_list_of_computational_biology_papers_in_nature_methods|The list of computational biology papers in Nature Methods]] published in 2015, and the [[https://​www.google.com/​url?​sa=t&​rct=j&​q=&​esrc=s&​source=web&​cd=9&​cad=rja&​uact=8&​ved=0ahUKEwic2Oum3dvJAhUGeSYKHWurD-EQFghQMAg&​url=http%3A%2F%2Ford.ntu.edu.tw%2Ftc%2Fincludes%2FGetFile.ashx%3FmID%3D253%26id%3D1744%26chk%3De15262f3-87bf-4a7e-be6a-4103cbc61968&​usg=AFQjCNHfNXyryLDQMWBRrInOpJVKIL0LCA&​sig2=8C9UE1arY4vi2q_CUdsdiQ|hints]] by an editor of Nature Communications. Read their "​Reviewing computational methods"​ ([[http://​www.nature.com/​nmeth/​journal/​v12/​n12/​full/​nmeth.3686.html|2015]]) and "​Guidelines for algorithms and software in Nature Methods"​ ([[http://​blogs.nature.com/​methagora/​2014/​02/​guidelines-for-algorithms-and-software-in-nature-methods.html|2014]]) articles. Provide source code, pseudocode, compiled executables,​ and the mathematical description. Softwares must be accompanied with documentation,​ sample data and the expected output, and a license (e.g., GPL≥2). 
Have a look at [[:​the_list_of_computational_biology_papers_in_nature_methods|The list of computational biology papers in Nature Methods]] published in 2015, and the [[https://​www.google.com/​url?​sa=t&​rct=j&​q=&​esrc=s&​source=web&​cd=9&​cad=rja&​uact=8&​ved=0ahUKEwic2Oum3dvJAhUGeSYKHWurD-EQFghQMAg&​url=http%3A%2F%2Ford.ntu.edu.tw%2Ftc%2Fincludes%2FGetFile.ashx%3FmID%3D253%26id%3D1744%26chk%3De15262f3-87bf-4a7e-be6a-4103cbc61968&​usg=AFQjCNHfNXyryLDQMWBRrInOpJVKIL0LCA&​sig2=8C9UE1arY4vi2q_CUdsdiQ|hints]] by an editor of Nature Communications.
Line 208: Line 243:
 ---- ----
  
-**Set the default width of fill mode (line length) in emacs?** \\+===== Set the default width of fill mode (line length) in emacs? ​===== 
 [[http://stackoverflow.com/questions/3566727/how-to-set-the-default-width-of-fill-mode-to-80-with-emacs|Use]] 'M-x customize-variable' to set 'fill-column' (100 in Oncinfo). DejaVu Sans Mono (~[[http://www.leancrew.com/all-this/2009/10/the-compleat-menlovera-sans-comparison/|Menlo]] on macOS) at size 18-20 is an [[http://ergoemacs.org/emacs/emacs_unicode_fonts.html|appropriate]] font for programming in Emacs. To use it, you may need to manually edit your .emacs in [[https://stackoverflow.com/questions/4821984/emacs-osx-default-font-setting-does-not-persist|macOS]], and add the following [[https://stackoverflow.com/questions/4879785/can-i-break-the-long-line-in-emacs-non-windows-to-the-next-line|line]]: [[http://stackoverflow.com/questions/3566727/how-to-set-the-default-width-of-fill-mode-to-80-with-emacs|Use]] 'M-x customize-variable' to set 'fill-column' (100 in Oncinfo). DejaVu Sans Mono (~[[http://www.leancrew.com/all-this/2009/10/the-compleat-menlovera-sans-comparison/|Menlo]] on macOS) at size 18-20 is an [[http://ergoemacs.org/emacs/emacs_unicode_fonts.html|appropriate]] font for programming in Emacs. To use it, you may need to manually edit your .emacs in [[https://stackoverflow.com/questions/4821984/emacs-osx-default-font-setting-does-not-persist|macOS]], and add the following [[https://stackoverflow.com/questions/4879785/can-i-break-the-long-line-in-emacs-non-windows-to-the-next-line|line]]:
  
Line 231: Line 267:
 ---- ----
  
-**Convert gene or protein IDs?** \\+===== Convert gene or protein IDs? ===== 
 [[https://www.biostars.org/p/22/|Use]] [[https://biodbnet-abcc.ncifcrf.gov/db/db2db.php|bioDBnet]], BioMart - Ensembl, or the [[https://bioconductor.org/packages/release/bioc/html/AnnotationDbi.html|AnnotationDbi]] package in R to convert between Entrez Gene, RefSeq, Ensembl, and many more. [[https://www.biostars.org/p/22/|Use]] [[https://biodbnet-abcc.ncifcrf.gov/db/db2db.php|bioDBnet]], BioMart - Ensembl, or the [[https://bioconductor.org/packages/release/bioc/html/AnnotationDbi.html|AnnotationDbi]] package in R to convert between Entrez Gene, RefSeq, Ensembl, and many more.
  
 ---- ----
 +
  
 ===== Prepare attractive, scientific presentations ? ===== ===== Prepare attractive, scientific presentations ? =====
Line 245: Line 283:
 ==== Access a Bioconductor package source code? ==== ==== Access a Bioconductor package source code? ====
  
-It is always better to a install the latest version of a package as directed in the corresponding Bioconductor page (e.g., [[https://bioconductor.org/packages/Pigengene|Pigengene]]). If you need to see more details in the source code, or you need the development version, you can clone the source from the Bioconductor using the "Source Repository (Developer Access) " command, which is posted on the corresponding package [[https://bioconductor.org/packages/release/bioc/html/Pigengene.html|page]], e.g.,+It is always better to install the latest version of a package as directed in the corresponding Bioconductor page (e.g., [[https://bioconductor.org/packages/Pigengene|Pigengene]]). If you need to see more details in the source code, or you need the development version, and the package maintainer adds your public SSH [[https://git.bioconductor.org/BiocCredentials/|key]], then you can clone the source from Bioconductor using the "Source Repository (Developer Access)" command, which is posted on the corresponding package [[https://bioconductor.org/packages/release/bioc/html/Pigengene.html|page]], e.g.,
  
 <​code>​ <​code>​
Line 419: Line 457:
 ==== Disable scroll acceleration in macOS? ==== ==== Disable scroll acceleration in macOS? ====
  
-Install and [[https://​www.reddit.com/​r/​osx/​comments/​6kx6zb/​how_to_disable_mouse_scrolling_acceleration/​|use]] USB [[http://​www.usboverdrive.com/​USBOverdrive/​Information.html|Overdrive]] to set Wheel up and down "​Speed"​ to say, 6 lines. The following [[https://​apple.stackexchange.com/​questions/​253111/​how-to-disable-scroll-acceleration-in-macos-sierra|command]] does NOT work:+Install and [[https://​www.reddit.com/​r/​osx/​comments/​6kx6zb/​how_to_disable_mouse_scrolling_acceleration/​|use]] USB [[http://​www.usboverdrive.com/​USBOverdrive/​Information.html|Overdrive]] to set Wheel up and down "​Speed" ​of your mouse to say, 6 lines. The following [[https://​apple.stackexchange.com/​questions/​253111/​how-to-disable-scroll-acceleration-in-macos-sierra|command]] does NOT work:
  
 <​code>​ <​code>​
Line 426: Line 464:
  
 Logitech Control Center may help on the [[https://​support.logi.com/​hc/​en-gb/​articles/​360025297833-Logitech-Control-Center-for-Macintosh-OS-X|Logitech]] MX mice older than 2019. Logitech Control Center may help on the [[https://​support.logi.com/​hc/​en-gb/​articles/​360025297833-Logitech-Control-Center-for-Macintosh-OS-X|Logitech]] MX mice older than 2019.
 +
 +----
 +
 +
 +===== Choose a solid state (SSD) external drive? =====
 +
 +The non-volatile memory express (NVMe) devices are [[https://ssd.borecraft.com/SSD_Buying_Guide_List.pdf|better]] than SATA solid state drives. Good brands include [[https://smile.amazon.com/gp/product/B07X6CKHH1/ref=ox_sc_act_title_1?smid=A29Y8OP2GPR7PE&psc=1|Sabrent]] (Nano is smaller than Pro but gets hot when extensively used), Seagate, Addlink, and Team. As of 2020, a speed of 1000 MB/s is possible using USB 3.2.
 +
 +----
 +
 +===== Work with screen sessions? =====
 +
 +There are five main [[http://www.pixelbeat.org/lkdb/screen.html|commands]] for working with screen sessions:
 +
 +  - Start and name a screen: ''​screen -S $NAME''​
 +  - Detach from a screen: ''​Ctrl+a d''​
 +  - See the list of active screens: ''​screen -ls''​
 +  - Reattach to a screen: ''​screen -r $NAME''​
 +  - Quit and [[https://​askubuntu.com/​questions/​356006/​kill-a-screen-session|kill]] your screen: ''​Ctrl+a then Ctrl+\''​
 +
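
As a minimal sketch of how these commands combine in a script, the helpers below start a detached, named session, check whether it is still listed, and kill it without attaching. The function names ''start_job'', ''session_exists'', and ''kill_session'' are illustrative assumptions, not part of screen, and the sketch assumes plain alphanumeric session names.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not part of screen).

# Start a detached session called $1 that runs the remaining arguments.
start_job() {
    name=$1
    shift
    screen -dmS "$name" "$@"
}

# Succeed if `screen -ls` lists a session named $1.
# Sessions are listed as "<pid>.<name><TAB>(Detached)".
session_exists() {
    screen -ls | grep -q "[0-9][.]$1[[:space:]]"
}

# Kill the session called $1 without attaching to it.
kill_session() {
    screen -S "$1" -X quit
}
```

For example, ''start_job align ./run_alignment.sh'' (a hypothetical script) launches a long job in the background, and ''screen -r align'' reattaches to it later.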
 +----