Differences

for_members [2019/08/22 00:12] – [General guidelines for conducting research in the Oncinfo Lab] admin
for_members [2019/10/26 16:39] – [General guidelines for conducting research in the Oncinfo Lab] admin
Line 3: Line 3:
   - To join the Oncinfo Lab, you need to: \\ a) Be __organized and disciplined__, otherwise your efforts will be fruitless and lost even if you make important discoveries. Other lab members will enjoy working with you if your code is clean and you can clearly talk about your project. \\ b) __Work hard__, otherwise even a genius will not get anywhere if they do not move. \\ c) Be __talented__. Nobody knows everything that is needed to do multidisciplinary research. You should be able to learn many things that you were not taught in courses. You often need to find novel solutions for small and big challenges that you face because you are the first person who is working on your specific study. \\ d) Be __knowledgeable__ because we are not interested in reinventing the wheel. \\ The above items are ordered based on importance. The most critical one is **discipline**.
   - All Google docs that need to be edited by lab members should be put in the Oncinfo [[https://drive.google.com/?tab=mo&authuser=0#folders/0B5Cpru0UXP0adTZTckg3aEd4SEE|folder]]. They should be kept confidential. Send your Gmail address to Habil to get access to this folder. Remind him to add you to his Oncinfo Google group. Then, create a subfolder with your name there, and create a Google doc in your subfolder. Copy all items from this "For members" page to that Google doc, and write "**Done**", "**Todo**", or "**Skip**" in front of each item.
 +  - If you have a lab computer, add the tag number written on the back of the laptop, your name, and the date you started using it to the [[https://docs.google.com/spreadsheets/d/1A6ouCCPov5VXt7xBCdTh7Cwc6jtPJGGJlekmGnKL5JY/edit#gid=441648294|table]] of computers.
   - Pass the online training courses required by the University, e.g., conflict of interest and safety.
   - All experiments and analyses are done on Unix. That is, a __real__ Unix-like system such as Linux or OS X, NOT a virtual machine. Start with a [[http://www.ee.surrey.ac.uk/Teaching/Unix/|tutorial]] for beginners or the [[http://nebc.nerc.ac.uk/downloads/courses/Bio-Linux/bl8_latest.pdf|introduction]] to Bio-Linux.
Line 11: Line 12:
   - All members should know about the central [[https://en.wikipedia.org/wiki/Central_dogma_of_molecular_biology|dogma]] of molecular biology, which is almost enough biological knowledge to start the majority of projects ([[:dogma.pdf?media=dogma.pdf|pdf]]). Familiarity with some basic concepts such as [[https://en.wikipedia.org/wiki/Exon|exon]], intron, etc. is helpful. Watch the [[https://www.dnalc.org/resources/3d/|animations]] from the DNA Learning Center.
   - Any file or data on this wiki that has restricted permissions, such as some paper pdfs or drafts, should not be shared with nonmembers unless authorized by the PI.
 +  - For future reference, please add links to your presentations and drafts to the [[https://oncinfo.org/drafts|drafts]] page. At a minimum, please include the author, the date, the audience, and the subject.
   - All members should read and follow [[http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1000424|Bill's]] guidelines, and organize their files and folders accordingly, at least to some extent. Start by making a "~/proj" directory in your home folder that will eventually contain a subfolder for each project you are working on. Major subfolders must have a readme file, for example, to describe where the data is coming from. Your code folder must include a runall.R script that sources other scripts. Avoid sourcing scripts in other scripts except for the runall because then following and debugging the pipeline would be difficult.
   - Your code and documents should be stored in a Bitbucket repository like [[https://bitbucket.org/habilzare/genetwork|https://bitbucket.org/habilzare/genetwork]]. Sign up for an [[https://bitbucket.org/account/signup/|account]] and add your photo. Do NOT sign in using your Google account. Only then, send your username to Habil. If you are new to Bitbucket, take [[https://guides.co/g/bitbucket-101/11146|Bitbucket 101]]. You can [[https://confluence.atlassian.com/bitbucket/use-the-ssh-protocol-with-bitbucket-cloud-221449711.html|avoid]] having to manually type a password each time you pull by using ssh. To add a key, click on your photo at the top right corner of the Bitbucket page, then Bitbucket settings, SSH keys, Add key. This trick is not appropriate for TACC clusters because we should not change our .ssh folder there. On the cluster, use https to clone instead of ssh. Do NOT interfere with others' git folders on the cluster. You should //only// clone, pull, and push in your own home or work directory. Do NOT skip this step. Before changing anything in a repository, read and abide by the conventions described in the main readme file.
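The "~/proj" layout described in the guidelines above could be set up roughly as follows. This is only a sketch: the project name "myproject", the subfolder names "code", "data", and "result", and the file name "readme.txt" are assumptions for illustration, not lab conventions stated on this page.

```shell
#!/bin/sh
# Root for all projects; defaults to ~/proj as in the guidelines.
PROJ_ROOT="${PROJ_ROOT:-$HOME/proj}"
# Hypothetical project name; replace with your own.
project="myproject"

# One subfolder per project, with assumed folders for code, data, and results.
mkdir -p "$PROJ_ROOT/$project/code" "$PROJ_ROOT/$project/data" "$PROJ_ROOT/$project/result"

# Each major subfolder gets a readme, e.g. to describe where the data is coming from.
for d in code data result; do
    readme="$PROJ_ROOT/$project/$d/readme.txt"
    [ -f "$readme" ] || echo "Describe the contents of $d/ here." > "$readme"
done

# runall.R is the only script that sources other scripts.
runall="$PROJ_ROOT/$project/code/runall.R"
[ -f "$runall" ] || printf '%s\n' \
    '## Source the analysis scripts in order, e.g.:' \
    '## source("preprocess.R")' > "$runall"
```

Existing readme files and an existing runall.R are left untouched, so the script can be rerun safely when a new subfolder is added.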
Line 16: Line 18:
   - If you want to use TACC resources, first [[https://portal.tacc.utexas.edu/account-request|create]] an account, and then ask Habil to add you to a project. A simple test for running a job on the Stampede cluster is the following. Look at their user [[https://portal.tacc.utexas.edu/user-guides/stampede|guide]] or [[https://srcc.stanford.edu/sge-slurm-conversion|this]] table of commands for more details. \\  $ ssh <username>@stampede.tacc.utexas.edu \\  $ cd ~zare \\  login4.stampede(1)$ sbatch -p normal -n 1 -t 3 ./test.sh \\  We usually use Lonestar5 for computing and Ranch for storage of large data.
   - Every member should upload their photo to their profile in the wiki. To do this, click on your username at the top right, then Account. In addition, everyone should have a photo and their updated CV in pdf format on their personal page. [[:file_view_cv_template.zip_543305154_cv_template.zip|This]] is an optional LaTeX template. The permission of any lab notebook (lano) should be set to "hidden", and it is important that it be updated EVERY day. [[https://civihosting.com/|CiviHosting]] provides us with two edit modes: ckg and DW. Use the one that is more convenient for you. Write your posts in reverse chronological order so that the newest post comes at the top. To facilitate future reference, avoid sending data as attachments. Instead, upload files to your lano and link to them where needed.
-  - You can install Google Scholar [[https://chrome.google.com/webstore/detail/google-scholar-button/ldipcbpaocekfooobnbcddclnhejkcpn?hl=en|Button]] add-on for an easier way of searching Google Scholar. You select the paper title and then click on the little blue icon on the top right corner. For any paper which you want to cite on the lab wiki, find it on Google Scholar, click on "More>Cite" and copy the MLA format.
+  - You can install Google Scholar [[https://chrome.google.com/webstore/detail/google-scholar-button/ldipcbpaocekfooobnbcddclnhejkcpn?hl=en|Button]] add-on for an easier way of searching Google Scholar. You select the paper title and then click on the little blue icon on the top right corner. For any paper which you want to cite on the lab wiki, find it on Google Scholar, click on "More>Cite" and copy the MLA format. Also, use [[https://gsuite.google.com/marketplace/app/paperpile/894076725911|Paperpile]] for easy citation in Google doc, and Math [[https://gsuite.google.com/marketplace/app/math_equations/825973477142|Equations]] for writing and manipulating equations on Google presentations.
   - Code style in Oncinfo lab: We follow Hadley Wickham's R Style [[http://r-pkgs.had.co.nz/style.html|Guide]] unless another convention is mentioned below. The goal is to include as much code as possible on 1 page so that it is easier to skim while keeping the overall structure such as proper indentation. \\  When writing R code, use "x <- 5" for assigning a value to a variable. Do NOT use "x = 5" or "x<-5". **Do NOT use underscore, '_', in variable or function names**. Instead of "inverse_of", use "inverseOf" as a variable name so that you can select it with 1 click. Use "inverse.of" as a function name to indicate it is a function, not a variable. Almost all functions must return a list so that extending them will be easy. Use "##" for comments, NOT a single "#". Write the name of the loaded object in a comment in front of load(). Avoid long lines of code. Most lines should be <90 characters, and all lines must be <100 characters. Thus, do NOT include a space when using = in function calls. Good example: ''average <- mean(feet[ ,"real"]/12+inches, na.rm=TRUE) ## Spaces only around "<-" and after ","''. The space in "''[ ,''" is OK, which refers to all rows. It is OK not to place a space before the parenthesis after "''if(''", "''for(''", and the like. \\ When a line is long, it usually means you need to extract some of it and define a new variable right above that line. Data structures in R can be ordered from simple to complex as follows: number, vector, matrix, and list. Always use the simplest possible data structure, i.e., do not use a list when you can use a matrix.
   - **Never copy code**; instead, generalize your code and write functions. If you are copying more than a line of code, most likely you are doing something wrong.
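For the Stampede test job mentioned in the TACC item above, the page does not show what test.sh contains. A minimal sketch of such a script might look like the following; the job name and output file name are assumptions, and the "#SBATCH" lines are treated as ordinary comments if the script is run directly rather than through sbatch.

```shell
#!/bin/bash
# Hypothetical contents for test.sh; the actual lab script is not shown on this page.
# Job name (assumption):
#SBATCH -J oncinfo-test
# Output file; %j expands to the job ID:
#SBATCH -o test.%j.out

# Report where and when the job ran.
echo "Job started on $(hostname) at $(date)"

# A trivial computation to confirm that the job executed.
result=$((2 + 2))
echo "Sanity check: 2 + 2 = $result"
```

Submitting it with the sbatch command quoted above requests the normal queue, one task, and a three-minute time limit, which is more than enough for a script of this size.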