Differences

This shows you the differences between two versions of the page.

for_members [2019/02/04 19:50] admin
for_members [2019/04/18 20:31] – [General guidelines for conducting research in the Oncinfo Lab] admin
Line 1: Line 1:
-==== General guidelines for conducting research in Oncinfo lab ====
+==== General guidelines for conducting research in the Oncinfo Lab ====
  
-  - All Google Docs that need to be edited by lab members should be put in the Oncinfo [[https://drive.google.com/?tab=mo&authuser=0#folders/0B5Cpru0UXP0adTZTckg3aEd4SEE|folder]]. They should be kept confidential. Send your Gmail address to Habil to get access to this folder. Then, create a subfolder with your name there, and create a Google Doc in your subfolder. Copy all items from this "For members" page to that Google Doc, and write "**Done**", "**Todo**", or "**Skip**" in front of each item.
+  - All Google Docs that need to be edited by lab members should be put in the Oncinfo [[https://drive.google.com/?tab=mo&authuser=0#folders/0B5Cpru0UXP0adTZTckg3aEd4SEE|folder]]. They should be kept confidential. Send your Gmail address to Habil to get access to this folder. Remind him to add you to his Oncinfo Google group. Then, create a subfolder with your name there, and create a Google Doc in your subfolder. Copy all items from this "For members" page to that Google Doc, and write "**Done**", "**Todo**", or "**Skip**" in front of each item.
   - Pass the online training courses required by the University, e.g., conflict of interest, safety, etc.
   - All experiments and analyses are done on Unix. That is a __real__ Unix system like Linux, OS X, etc., NOT a virtual machine. Start with a [[http://www.ee.surrey.ac.uk/Teaching/Unix/|tutorial]] for beginners or the [[http://nebc.nerc.ac.uk/downloads/courses/Bio-Linux/bl8_latest.pdf|introduction]] to Bio-Linux.
   - [[http://www.r-project.org/|R]] is primarily used for statistical analysis and other scripting purposes in the Oncinfo Lab. [[https://www.coursera.org/course/rprog|This]] is a good online course on R which takes about 1 month to complete. A couple of days should be enough to read [[http://cran.r-project.org/doc/manuals/R-intro.pdf|this]] good guide for starters to get the basic ideas, or to cover the [[http://www.r-tutor.com/r-introduction|introduction]] section from R-Tutorial. [[https://www.datacamp.com/|DataCamp]] facilitates reading about R and running examples at the same time using a browser. Those who know R to some extent can use the book Bioinformatics with R {{:bioinformatics-r-cookbook.pdf|Cookbook}} or [[http://adv-r.had.co.nz/|Advanced]] R by Hadley Wickham to gradually learn more as they proceed in a project. The next step after learning R is to learn [[http://www.nature.com/nmeth/journal/v12/n2/full/nmeth.3252.html|Bioconductor]].
-  - Using [[https://en.wikipedia.org/wiki/Emacs|Emacs]] as a powerful, general-purpose text editor is encouraged ([[http://www2.lib.uchicago.edu/keith/tcl-course/emacs-tutorial.html|tutorial]]). In a terminal, you can start it by typing ''emacs'', even in an SSH session. On Ubuntu you can simply install Emacs using the Software Center, the Synaptic Package Manager, or the following command: ''sudo apt-get install emacs''. On OS X, you can install [[https://emacsformacosx.com/|Emacs]] For Mac OS X, which is better than Aquamacs. Compared to these two, [[https://vigou3.gitlab.io/emacs-modified-macos/|Emacs Modified for macOS]] because it supports [[https://ess.r-project.org/|ESS]] and [[https://www.gnu.org/software/auctex/|AUCTeX]]. You can customize Emacs by editing the .emacs file. Feel free to copy some, but not all, commands from Habil's .emacs file for [[https://www.dropbox.com/s/pdt6fbho57k421d/emacs_UTosx2018?dl=0|macOS]].
+  - Using [[https://en.wikipedia.org/wiki/Emacs|Emacs]] as a powerful, general-purpose text editor is encouraged ([[http://www2.lib.uchicago.edu/keith/tcl-course/emacs-tutorial.html|tutorial]]). In a terminal, you can start it by typing ''emacs'', even in an SSH session. On Ubuntu you can simply install Emacs using the Software Center, the Synaptic Package Manager, or the following command: ''sudo apt-get install emacs''. On OS X, you can install [[https://emacsformacosx.com/|Emacs]] For Mac OS X, which is better than Aquamacs. Compared to these two, [[https://vigou3.gitlab.io/emacs-modified-macos/|Emacs Modified for macOS]] might be better because it supports [[https://ess.r-project.org/|ESS]] and [[https://www.gnu.org/software/auctex/|AUCTeX]]. You can customize Emacs by editing the .emacs file. Feel free to copy some, but not all, commands from Habil's .emacs file for [[https://www.dropbox.com/s/pdt6fbho57k421d/emacs_UTosx2018?dl=0|macOS]].
   - Using proprietary file formats is not professional when you are sharing information (e.g., your CV) with others. The pdf and png formats are OK and portable. Use Google Docs instead of .docx, and Google Presentation instead of .ppt.
   - [[https://www.youtube.com/watch?v=WsofH466lqk|This]] video illustrates transcription ([[https://en.wikipedia.org/wiki/Transcription_(genetics)|wikipedia]], [[https://www.youtube.com/watch?v=5MfSYnItYvg|video 2]]), more videos on [[https://www.youtube.com/watch?v=OEWOZS_JTgk|gene expression]] ([[https://en.wikipedia.org/wiki/Gene_expression|wikipedia]]), [[https://www.youtube.com/watch?v=TfYf_rPWUdY|translation]] ([[https://www.youtube.com/watch?v=5bLEDd-PSTQ|detailed]]), etc.
Line 14: Line 14:
   - Do NOT use spaces in file or folder names. Do NOT include binary files such as png, pdf, RData, etc. in a Bitbucket repository except on an exceptional basis. Instead, use e.g., ''rsync -avz -e ssh <username>@ls5.tacc.utexas.edu'' or ''scp'' to transfer files, and document the exact paths in a readme file in the corresponding folder.
   - If you want to use TACC resources, first [[https://portal.tacc.utexas.edu/account-request|create]] an account, and then ask Habil to add you to a project. A simple test for running a job on the Stampede cluster is the following. Look at their user [[https://portal.tacc.utexas.edu/user-guides/stampede|guide]] or [[https://srcc.stanford.edu/sge-slurm-conversion|this]] table of commands for more details. \\  $ ssh <username>@stampede.tacc.utexas.edu \\  $ cd ~zare \\  login4.stampede(1)$ sbatch -p normal -n 1 -t 3 ./test.sh \\  We usually use Lonestar5 for computing and Ranch for storage of large data.
-  - Every member should upload their photo to their profile in the wiki. To do this, click on your username at the top right, then Account. In addition, everyone should have a photo and their updated CV in pdf format on their personal page. [[:file_view_cv_template.zip_543305154_cv_template.zip|This]] is an optional LaTeX template. The permission of the lab notebooks should be set to "hidden", and it is important that they be updated EVERY day. Write your posts in reverse chronological order so that the newest post comes at the top.
+  - Every member should upload their photo to their profile in the wiki. To do this, click on your username at the top right, then Account. In addition, everyone should have a photo and their updated CV in pdf format on their personal page. [[:file_view_cv_template.zip_543305154_cv_template.zip|This]] is an optional LaTeX template. The permission of any lab notebook (lano) should be set to "hidden", and it is important that it be updated EVERY day. CiviHosting provides us with two edit modes, DW and ckg. Write your posts in reverse chronological order so that the newest post comes at the top.
   - You can install the Google Scholar [[https://chrome.google.com/webstore/detail/google-scholar-button/ldipcbpaocekfooobnbcddclnhejkcpn?hl=en|Button]] add-on for an easier way of searching Google Scholar. Select the paper title and then click on the little blue icon in the top right corner. For any paper that you want to cite on the lab wiki, find it on Google Scholar, click on "More>Cite", and copy the MLA format.
   - Code style in the Oncinfo Lab: We follow Hadley Wickham’s R Style [[http://r-pkgs.had.co.nz/style.html|Guide]] unless another convention is mentioned below. The goal is to include as much code as possible on 1 page so that it is easier to skim, while keeping the overall structure such as proper indentation. \\  When writing R code, use "x <- 5" for assigning a value to a variable. Do NOT use "x = 5" or "x<-5". **Do NOT use underscore, '_', in variable or function names**. Instead of "inverse_of", use "inverseOf" as a variable name so that you can select it by 1 click. Use "inverse.of" as a function name to indicate it is a function, not a variable. Almost all functions must return a list so that extending them will be easy. Use "##" for comments, NOT a single "#". Write the name of the loaded object in a comment in front of load(). Avoid long lines of code. Most lines should be < 90 characters, and all lines must be < 100 characters. Thus, do NOT include spaces around = in function calls. Good example: ''average <- mean(feet[ ,"real"]/12+inches, na.rm=TRUE) ## Spaces only around "<-" and after ","''. The space in "''[ ,''" is OK, which refers to all rows. It is OK not to place a space before the parenthesis after "''if(''", "''for(''", and the like. \\ When a line is long, it usually means you need to extract some of it and define a new variable right above that line. Data structures in R can be ordered from simple to complex as follows: number, vector, matrix, and list. Always use the simplest possible data structure, i.e., do not use a list when you can use a matrix.
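
The file-naming rules in the list above (no spaces in file or folder names, no binary files such as png, pdf, or RData in a Bitbucket repository) could be checked before a commit with a short shell sketch like the one below. This is only an illustration: the function name ''check_names'' and its output labels are hypothetical and not part of any lab tooling, and the list of binary extensions is limited to the three mentioned above.

```shell
#!/usr/bin/env bash
## Hypothetical helper (an assumption, not an existing lab script).
## Lists paths under a directory that break the naming rules:
## a name containing a space, or a binary file (png, pdf, RData).
check_names() {
    local dir="${1:-.}"
    ## Files or folders whose own name contains a space.
    find "$dir" -name '* *' ! -path '*/.git/*' | sed 's/^/space: /'
    ## Binary formats that should be moved with rsync or scp instead.
    find "$dir" -type f \( -name '*.png' -o -name '*.pdf' -o -name '*.RData' \) \
        ! -path '*/.git/*' | sed 's/^/binary: /'
}
```

Calling ''check_names'' with the path of a working copy prints one line per offending path, prefixed with "space:" or "binary:". A file that breaks both rules is listed twice. No output means nothing was flagged.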