Differences

This shows you the differences between two versions of the page.

how_to [2019/10/11 15:53]
admin [Use git via proxy or vpn?]
how_to [2020/05/28 07:33] (current)
shiva [Get familiar with machine learning and its applications in computational biology?]
Line 4: Line 4:
  
 ---- ----
 +
 +==== Work with screen sessions? ====
 +
 +Five [[http://www.pixelbeat.org/lkdb/screen.html|commands]] cover most of the work with screen sessions:
 +
 +  - Start and name a screen: ''​screen -S <NAME of the screen>''​
 +  - Detach from a screen: ''​Ctrl+a d''​
 +  - See the list of active screens: ''​screen -ls''​
 +  - Reattach to a screen: ''​screen -r <NAME of the screen>''​
 +  - Quit and kill your screen: ''​Ctrl+a '' ​ then ''​Ctrl+\''​
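The five commands above can also be driven from a script. The following is only a minimal sketch, assuming GNU screen is installed. The session name and the `sleep 60` placeholder job are hypothetical, and the `-dmS` and `-X quit` options stand in for the interactive key bindings.

```shell
#!/bin/sh
# Hypothetical session name, used only for this illustration.
NAME="demo_$$"

if command -v screen >/dev/null 2>&1; then
    # Start a named session that is already detached (-d -m), running a placeholder job.
    screen -dmS "$NAME" sleep 60
    # List the sessions; the new one should be shown as <pid>.<NAME>.
    screen -ls | grep "$NAME"
    # Kill the session from the outside, without attaching to it.
    screen -S "$NAME" -X quit
    STATUS="done"
else
    STATUS="screen not installed"
fi
echo "$STATUS"
```

Starting a session detached is convenient for long jobs on a cluster: you can log out and later reattach with `screen -r` and the same name.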
 +
 +----
 +
 +
 +==== Read and write excel files in R ====
 +
 +Use the [[https://cran.r-project.org/web/packages/openxlsx/index.html|openxlsx]] package to read, write, and edit .xlsx files in R. It provides a high-level interface for writing, styling, and editing worksheets. Because it is built on 'Rcpp' (C++), its read/write times are comparable to those of the 'xlsx' and 'XLConnect' packages, with the added benefit that it does not depend on Java.
 +
 +<​code>​
 +# E.g., write four data frames to four sheets of one Excel workbook.
 +# egoBP, egoMF, eKEGG, ncg, resultPath, and l1 are assumed to be defined already.
 +library(openxlsx)
 +listDataFrames <- list("GO-BP"=data.frame(egoBP), "GO-MF"=data.frame(egoMF),
 +                       "KEGG"=data.frame(eKEGG), "NCG"=data.frame(ncg))
 +xlsFile <- file.path(resultPath, paste0(l1, "_ORA_results.xlsx"))
 +write.xlsx(x=listDataFrames, file=xlsFile)
 +</​code>​
 +
 +----
 +
  
 ==== Set local mirror for Rscript ==== ==== Set local mirror for Rscript ====
Line 18: Line 47:
 ---- ----
  
-Add only [[http://​stackoverflow.com/​questions/​7124726/​git-add-only-modified-changes-and-ignore-untracked-files|modified]] changes and ignore untracked files using git?+===== Add only modified changes and ignore untracked files using git? =====
  
 <​code>​ <​code>​
Line 24: Line 53:
 </​code>​ </​code>​
  
-If there is a conflict at push time, first pull. Now, you need to look for ">>>" in the code, and manually fix the conflict. Then, push again.
+The [[https://stackoverflow.com/questions/7124726/git-add-only-modified-changes-and-ignore-untracked-files|above]] should work if there is no conflict. If the push is rejected because of a conflict, pull first. Then look for the conflict markers ("<<<<<<<", "=======", and ">>>>>>>") in the affected files, fix the conflict manually, commit, and push again.
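The pull, fix, and push cycle can be tried safely in a sandbox. The sketch below is only an illustration under stated assumptions: the two clones (`alice`, `bob`), the bare `remote.git`, the file name, and the identity are all hypothetical, and the conflict is "fixed" by simply overwriting the file.

```shell
#!/bin/sh
# Placeholder identity so that the commits work in a clean environment.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="ex@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="ex@example.com"

BASE="$(mktemp -d)"
cd "$BASE" || exit 1

# A first working copy ("alice") with one commit, published to a bare "remote".
git init -q alice
echo "version 0" > alice/file.txt
git -C alice add file.txt
git -C alice commit -q -m "Base"
git clone -q --bare alice remote.git
git -C alice remote add origin "$BASE/remote.git"

# A second working copy ("bob"), cloned before alice changes the file.
git clone -q remote.git bob

# Alice edits the file and pushes first.
echo "alice's version" > alice/file.txt
git -C alice commit -q -am "Alice edit"
git -C alice push -q origin HEAD

# Bob edits the same line; his push is rejected because the remote has moved on.
echo "bob's version" > bob/file.txt
git -C bob commit -q -am "Bob edit"
git -C bob push -q origin HEAD 2>/dev/null || echo "push rejected"

# Pull first; the merge stops and writes conflict markers into the file.
git -C bob pull -q --no-rebase >/dev/null 2>&1 || echo "conflict"
MARKERS="$(grep -c -E '^(<<<<<<<|=======|>>>>>>>)' bob/file.txt)"

# Fix the file by hand (here: overwrite it), mark it resolved, commit, and push again.
echo "merged version" > bob/file.txt
git -C bob add file.txt
git -C bob commit -q -m "Resolve conflict"
git -C bob push -q origin HEAD

REMOTE_TIP="$(git -C remote.git log -1 --format=%s)"
echo "$MARKERS marker lines found; remote tip: $REMOTE_TIP"
```

Searching for all three marker lines, rather than only one of them, makes it less likely that a leftover marker slips into a commit.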
  
 ---- ----
 +
  
 ==== Use a package that is being developed? ==== ==== Use a package that is being developed? ====
Line 55: Line 85:
  
 ==== Install R locally (e.g. on a cluster)? ==== ==== Install R locally (e.g. on a cluster)? ====
 +
 +If you want to install the latest __development__ version of R on macOS, first install [[https://github.com/fxcoudert/gfortran-for-macOS/releases|Fortran]] if you do not have it. You may also need to update [[https://superuser.com/a/664326|PCRE]].
  
 If you do not have sudo permissions,​ like when you are working on a cluster, you should either use the software module, or install it locally in your home directory. E.g., you can install R on Stampede or Maverick as follows: If you do not have sudo permissions,​ like when you are working on a cluster, you should either use the software module, or install it locally in your home directory. E.g., you can install R on Stampede or Maverick as follows:
Line 83: Line 115:
 ---- ----
  
-**Write in xls files from R?** \\ 
-Use [[http://​www.inside-r.org/​packages/​cran/​XLConnect/​docs/​writeWorksheetToFile|writeWorksheetToFile]]() function from [[http://​cran.r-project.org/​web/​packages/​XLConnect/​index.html|XLConnect]] package like below: 
  
-<​code>​ +==== Restore a file deleted in a local git directory? ====
-writeWorksheetToFile(data=matrix(1:​10,​2,​2),​file='​./​temp.xls',​sheet='​test1'​) +
-writeWorksheetToFile(data=matrix(1:​6,​2,​3),​file='​./​temp.xls',​sheet='​test2'​) +
-</​code>​+
  
----- 
- 
-**Restore a file** **deleted ****in a local git directory?​** \\ 
 [[http://stackoverflow.com/questions/9305326/why-doesnt-git-pull-bring-back-directories-that-ive-deleted|Use]] ''%%git reset --hard%%'' to bring your working directory completely back to the HEAD state. However, this is a [[http://stackoverflow.com/questions/5473/how-can-i-undo-git-reset-hard-head1|dangerous]] command because you will lose any local changes that are not committed yet.
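When only one file was deleted, a narrower alternative to resetting everything is to check out just that file from the last commit. The following is a minimal sketch in a throwaway repository; the file name and the identity are hypothetical.

```shell
#!/bin/sh
set -e
# Throwaway repository, so the sketch does not touch any real work.
REPO="$(mktemp -d)"
cd "$REPO"
git init -q
# Placeholder identity, set only for this repository.
git config user.email "you@example.com"
git config user.name "Example User"

echo "important notes" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# Simulate the accident: the file is deleted from the working directory.
rm notes.txt

# Restore only that file from the last commit (HEAD); other local edits stay untouched.
git checkout HEAD -- notes.txt

RESTORED="$(cat notes.txt)"
echo "$RESTORED"
```

Unlike a hard reset, this touches only the named path, so uncommitted edits in other files are kept.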
  
 ---- ----
  
-**Get familiar with machine learning and its applications in computational biology? ​** \\+ 
 +==== Get familiar with machine learning and its applications in computational biology? ​==== 
 + 
 +- You can enroll in many online machine learning courses. Some of the best courses in ML can be found [[https://​docs.google.com/​spreadsheets/​d/​1AK8lqS-ztMhh8YoOaQ7ScIZmabrQ5AFxAyXKwYWiT04/​edit#​gid=0|here]]. 
 - Most common ML techniques are very well explained in [[https://​scikit-learn.org/​stable/​user_guide.html|Scikit learn]] with [[https://​scikit-learn.org/​stable/​modules/​decomposition.html|illustrations]] and example Python code. These techniques have been implemented in [[https://​www.kaggle.com/​getting-started/​5243|R]] packages including mlr3 and tidymodels. - Most common ML techniques are very well explained in [[https://​scikit-learn.org/​stable/​user_guide.html|Scikit learn]] with [[https://​scikit-learn.org/​stable/​modules/​decomposition.html|illustrations]] and example Python code. These techniques have been implemented in [[https://​www.kaggle.com/​getting-started/​5243|R]] packages including mlr3 and tidymodels.
  
Line 105: Line 133:
 ---- ----
  
-**Get access to the papers through the library when you are off-campus?** \\ + 
-First add the following to your browser (Chrome or Firefox) bookmarks.+==== Get access to the papers through the library when you are off-campus? ​==== 
 + 
 +In either of these two ways:\\
 +a) First add the following to your browser (Chrome or Firefox) bookmarks.
  
 <​code>​ <​code>​
Line 113: Line 144:
  
 Then, on the journal page, click on the bookmark. Login and start reading. Then, on the journal page, click on the bookmark. Login and start reading.
 +
 +b) Use [[https://​infosec.uthscsa.edu/​two-factor-enrollment|GlobalProtect]],​ which is the University VPN.
  
 ---- ----
  
-**Convert pdf to MS word?** \\ + 
-Try whatever you can to avoid conversion! Instead, educate your team and your collaborators to use [[https://​www.authorea.com/​users/​54336|Authorea]],​ [[https://​www.overleaf.com/​|Overleaf]] or at least google ​doc. __Only__ if your biologist collaborators cannot [[http://​www.dedoimedo.com/​computers/​latex.html|unfortunately]] edit the LaTeX source, consider using a conversion tool such as docs.[[https://​docs.zone/​|zone]]. Alternatively,​ Acrobat Pro can export a .pdf as a .doc file. If Bibtex is not an option, use [[http://​www.easybib.com/​|EasyBib]].+==== Convert pdf to MS word? ==== 
 + 
 +Try whatever you can to avoid conversion! Instead, educate your team and your collaborators to use [[https://www.authorea.com/users/54336|Authorea]], [[https://www.overleaf.com/|Overleaf]] or at least Google Docs. In Google Docs, references can be easily handled using the [[https://gsuite.google.com/marketplace/app/paperpile/894076725911|Paperpile]] add-on, and figures can be automatically numbered using the [[https://gsuite.google.com/marketplace/app/cross_reference/269114033347?pann=cwsdp&hl=en|Cross Reference]] add-on, as suggested in these [[https://lcolladotor.github.io/2019/04/02/how-to-write-academic-documents-with-googledocs/#.Xjne6RNKjUI|guidelines]] on how to write academic documents with Google Docs. __Only__ if your biologist collaborators cannot [[http://www.dedoimedo.com/computers/latex.html|unfortunately]] edit the LaTeX source, consider using a conversion tool such as docs.[[https://docs.zone/|zone]]. Alternatively, Acrobat Pro can export a .pdf as a .doc file. If BibTeX is not an option, use [[http://www.easybib.com/|EasyBib]].
  
 ---- ----
  
-**Enable spell check in Emacs on OS X?** \\+ 
 +==== Enable spell check in Emacs on OS X? ==== 
 The default Aquamacs spell checker has some issues. To replace it, first [[http://​stackoverflow.com/​questions/​19022015/​emacs-on-mac-os-x-how-to-get-spell-check-to-work|install]] Aspell, which is a [[https://​en.wikipedia.org/​wiki/​GNU_Aspell|replacement]] for Ispell: The default Aquamacs spell checker has some issues. To replace it, first [[http://​stackoverflow.com/​questions/​19022015/​emacs-on-mac-os-x-how-to-get-spell-check-to-work|install]] Aspell, which is a [[https://​en.wikipedia.org/​wiki/​GNU_Aspell|replacement]] for Ispell:
  
Line 136: Line 173:
 ---- ----
  
-**Do microaray analysis or anything else in Bioconductor?** \\
+==== Do microarray analysis or anything else in Bioconductor? ====
 [[http://​manuals.bioinformatics.ucr.edu/​home/​R_BioCondManual#​TOC-Affy|This ]] is an excellent site with many well commented code examples and a lot of handy short-cuts. See also [[:​how_to|Functional analysis tools]]. [[http://​manuals.bioinformatics.ucr.edu/​home/​R_BioCondManual#​TOC-Affy|This ]] is an excellent site with many well commented code examples and a lot of handy short-cuts. See also [[:​how_to|Functional analysis tools]].
  
Line 193: Line 231:
 ---- ----
  
-**Write a scientific paper?** \\+===== Write a scientific paper? ​===== 
 Put the figures together and then [[http://​www.scidev.net/​global/​publishing/​practical-guide/​how-do-i-write-a-scientific-paper-.html|draft]] different [[https://​www.nature.com/​articles/​nmeth.4532?​WT.ec_id=NMETH-201712&​spMailingID=55474826&​spUserID=MTIyMzczNjc4MDI2S0&​spJobID=1285409878&​spReportId=MTI4NTQwOTg3OAS2|sections]]. Focus the [[http://​www.grantcentral.com/​strategies-for-avoiding-common-problems-with-research-manuscripts/​|Discussion]]. Be careful about [[http://​colah.github.io/​posts/​2019-05-Collaboration/​index.html|authorship]]. Put the figures together and then [[http://​www.scidev.net/​global/​publishing/​practical-guide/​how-do-i-write-a-scientific-paper-.html|draft]] different [[https://​www.nature.com/​articles/​nmeth.4532?​WT.ec_id=NMETH-201712&​spMailingID=55474826&​spUserID=MTIyMzczNjc4MDI2S0&​spJobID=1285409878&​spReportId=MTI4NTQwOTg3OAS2|sections]]. Focus the [[http://​www.grantcentral.com/​strategies-for-avoiding-common-problems-with-research-manuscripts/​|Discussion]]. Be careful about [[http://​colah.github.io/​posts/​2019-05-Collaboration/​index.html|authorship]].
  
 ---- ----
  
-**Prepare or review computational biology papers for Nature methods?** \\+===== Prepare or review computational biology papers for Nature methods? ​===== 
 Read their "Reviewing computational methods" ([[http://www.nature.com/nmeth/journal/v12/n12/full/nmeth.3686.html|2015]]) and "Guidelines for algorithms and software in Nature Methods" ([[http://blogs.nature.com/methagora/2014/02/guidelines-for-algorithms-and-software-in-nature-methods.html|2014]]) articles. Provide source code, pseudocode, compiled executables, and the mathematical description. Software must be accompanied by documentation, sample data and the expected output, and a license (e.g., GPL≥2).
 Have a look at [[:the_list_of_computational_biology_papers_in_nature_methods|The list of computational biology papers in Nature Methods]] published in 2015, and the [[https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=9&cad=rja&uact=8&ved=0ahUKEwic2Oum3dvJAhUGeSYKHWurD-EQFghQMAg&url=http%3A%2F%2Ford.ntu.edu.tw%2Ftc%2Fincludes%2FGetFile.ashx%3FmID%3D253%26id%3D1744%26chk%3De15262f3-87bf-4a7e-be6a-4103cbc61968&usg=AFQjCNHfNXyryLDQMWBRrInOpJVKIL0LCA&sig2=8C9UE1arY4vi2q_CUdsdiQ|hints]] by an editor of Nature Communications.
  
 ---- ----
  
-**Set the default width of fill mode (line length) in emacs?** \\+===== Set the default width of fill mode (line length) in emacs? ​===== 
 [[http://stackoverflow.com/questions/3566727/how-to-set-the-default-width-of-fill-mode-to-80-with-emacs|Use]] 'M-x customize-variable' to set 'fill-column' (100 in Oncinfo). DejaVu Sans Mono (~[[http://www.leancrew.com/all-this/2009/10/the-compleat-menlovera-sans-comparison/|Menlo]] on macOS) at size 18-20 is an [[http://ergoemacs.org/emacs/emacs_unicode_fonts.html|appropriate]] font for programming in Emacs. To set it, you may need to manually edit your .emacs in [[https://stackoverflow.com/questions/4821984/emacs-osx-default-font-setting-does-not-persist|macOS]], and add the following [[https://stackoverflow.com/questions/4879785/can-i-break-the-long-line-in-emacs-non-windows-to-the-next-line|line]]:
  
Line 211: Line 252:
  
 ---- ----
 +
  
 ==== Get older versions using git? ==== ==== Get older versions using git? ====
Line 225: Line 267:
 ---- ----
  
-**Convert gene or protein IDs?** \\+===== Convert gene or protein IDs? ===== 
 [[https://www.biostars.org/p/22/|Use]] [[https://biodbnet-abcc.ncifcrf.gov/db/db2db.php|bioDBnet]], BioMart - Ensembl, or the [[https://bioconductor.org/packages/release/bioc/html/AnnotationDbi.html|AnnotationDbi]] package in R to convert between Entrez Gene, RefSeq, Ensembl, and many more.
  
 ---- ----
  
-**Prepare attractive,** **scientific presentations****?** \\+ 
 +===== Prepare attractive, scientific presentations? =====
 Use a "home slide"​. Also, learn about other tips from Susan [[https://​www.youtube.com/​watch?​v=Hp7Id3Yb9XQ|McConnell]]. Use a "home slide"​. Also, learn about other tips from Susan [[https://​www.youtube.com/​watch?​v=Hp7Id3Yb9XQ|McConnell]].
  
 ---- ----
  
-====   ​Access a Bioconductor package source code?   ==== 
  
-It is always better to a install the latest version of a package as directed in the corresponding Bioconductor page (e.g., [[https://​bioconductor.org/​packages/​Pigengene|Pigengene]]). If you need to see more details in the source code, you can clone the source from the Bioconductor ​mirror, e.g.,+==== Access a Bioconductor package source code? ==== 
 + 
 +It is always better to install the latest version of a package as directed on the corresponding Bioconductor page (e.g., [[https://bioconductor.org/packages/Pigengene|Pigengene]]). If you need to see more details in the source code, or you need the development version, and the package maintainer has added your public ssh [[https://git.bioconductor.org/BiocCredentials/|key]], then you can clone the source from Bioconductor using the "Source Repository (Developer Access)" command, which is posted on the corresponding package [[https://bioconductor.org/packages/release/bioc/html/Pigengene.html|page]], e.g.,
  
 <​code>​ <​code>​
 mkdir ~/proj; cd ~/proj mkdir ~/proj; cd ~/proj
-git clone https://github.com/​Bioconductor-mirror/​Pigengene.git+git clone git@git.bioconductor.org:packages/Pigengene 
 +</code> 
 + 
 +Now, you can build the package from the source using:
 + 
 +<​code>​ 
 +R CMD REMOVE ​Pigengene; R CMD build Pigengene 
 +</​code>​ 
 + 
 +If the build is successful, a tarball will be created. You can install the new package using:
 + 
 +<​code>​ 
 + R CMD INSTALL Pigengene_<​Version>​.tar.gz
 </​code>​ </​code>​
  
 ---- ----
 +
  
 ==== Use git via proxy or vpn? ==== ==== Use git via proxy or vpn? ====
Line 398: Line 457:
 ==== Disable scroll acceleration in macOS? ==== ==== Disable scroll acceleration in macOS? ====
  
-Install and [[https://​www.reddit.com/​r/​osx/​comments/​6kx6zb/​how_to_disable_mouse_scrolling_acceleration/​|use]] USB [[http://​www.usboverdrive.com/​USBOverdrive/​Information.html|Overdrive]] to set Wheel up and down "​Speed"​ to say, 6 lines. The following [[https://​apple.stackexchange.com/​questions/​253111/​how-to-disable-scroll-acceleration-in-macos-sierra|command]] does NOT work:+Install and [[https://​www.reddit.com/​r/​osx/​comments/​6kx6zb/​how_to_disable_mouse_scrolling_acceleration/​|use]] USB [[http://​www.usboverdrive.com/​USBOverdrive/​Information.html|Overdrive]] to set Wheel up and down "​Speed" ​of your mouse to say, 6 lines. The following [[https://​apple.stackexchange.com/​questions/​253111/​how-to-disable-scroll-acceleration-in-macos-sierra|command]] does NOT work:
  
 <​code>​ <​code>​
Line 405: Line 464:
  
 Logitech Control Center may help on the [[https://​support.logi.com/​hc/​en-gb/​articles/​360025297833-Logitech-Control-Center-for-Macintosh-OS-X|Logitech]] MX mice older than 2019. Logitech Control Center may help on the [[https://​support.logi.com/​hc/​en-gb/​articles/​360025297833-Logitech-Control-Center-for-Macintosh-OS-X|Logitech]] MX mice older than 2019.
 +
 +----
 +
 +
 +===== Choose a solid state (SSD) external drive? =====
 +
 +Non-volatile memory express (NVMe) drives are [[https://ssd.borecraft.com/SSD_Buying_Guide_List.pdf|better]] than SATA solid state drives. Good brands include [[https://smile.amazon.com/gp/product/B07X6CKHH1/ref=ox_sc_act_title_1?smid=A29Y8OP2GPR7PE&psc=1|Sabrent]] (Nano is smaller than Pro but gets hot when extensively used), Seagate, Addlink, and Team. As of 2020, a speed of 1000 MB/s is possible using USB 3.2.
 +
 +----
 +
 +===== Work with screen session? =====
 +
 +Five [[http://www.pixelbeat.org/lkdb/screen.html|commands]] cover most of the work with screen sessions:
 +
 +  - Start and name a screen: ''​screen -S $NAME''​
 +  - Detach from a screen: ''​Ctrl+a d''​
 +  - See the list of active screens: ''​screen -ls''​
 +  - Reattach to a screen: ''​screen -r $NAME''​
 +  - Quit and [[https://​askubuntu.com/​questions/​356006/​kill-a-screen-session|kill]] your screen: ''​Ctrl+a then Ctrl+\''​
 +
 +----