Differences

This shows you the differences between two versions of the page.

r_test [2019/06/06 06:20]
admin [Question 2:]
r_test [2019/07/12 03:19] (current)
admin [Question 2:]
Line 32:
 ==== Question 2: ====
  
-Using the [[https://bioconductor.org/packages/release/bioc/vignettes/maftools/inst/doc/maftools.html|maftools]] and [[http://bioconductor.org/packages/release/bioc/html/TCGAbiolinks.html|TCGAbiolinks]] packages, determine the 3 most frequently mutated genes in liver cancer. Which of these 3 mutations is most predictive of survival? To answer this question, write a function that takes a gene name as input and saves KM plots in PNG format. Add the p-value as a legend in the plot. Deliverables are similar to question 1.\\
 +**A)** Using the [[https://bioconductor.org/packages/release/bioc/vignettes/maftools/inst/doc/maftools.html|maftools]] and [[http://bioconductor.org/packages/release/bioc/html/TCGAbiolinks.html|TCGAbiolinks]] packages, determine the 3 most frequently mutated genes in liver cancer. Which of these 3 mutations is most predictive of survival? To answer this question, write a function that takes a gene name as input and saves KM plots in PNG format. Add the p-value as a legend in the plot. Deliverables are similar to question 1.
-__Bonus__: Let's define the //impact// of a set of genes to be the p-value of a log-rank test with the null hypothesis that survival does not change when all of these genes are mutated together. Write a function ''most.impact()'' that takes as input two integers, ''k1'' and ''n1'', and, among the ''n1'' most mutated genes, finds the ''k1'' genes with the best impact. Your function should return the names of the best ''k1'' genes and their impact. Run your function for ''k1=3'' and ''n1=3'', ''10'', and ''100''. What is the biological interpretation of your results?
 +
 +\\ 
 +**B)** Let's define the //impact// of a set of genes to be the p-value of a log-rank test with the null hypothesis that survival does not change when all of these genes are mutated together. Write a function ''most.impact()'' that takes as input two integers, ''k1'' and ''n1'', and, among the ''n1'' most mutated genes, finds the ''k1'' genes with the best impact. Your function should return the names of the best ''k1'' genes and their impact. Run your function for ''k1=3'' and ''n1=3'', ''10'', and ''100''. What is the biological interpretation of your results?
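A minimal R sketch of the //impact// defined in part B, assuming the ''survival'' package is installed. The names ''impact()'', ''surv.df'' and ''mut'' are hypothetical and not part of the assignment: ''surv.df'' is assumed to hold follow-up time and event status, and ''mut'' a logical patients-by-genes mutation matrix, both of which would first have to be built from the TCGA data. Simulated data and made-up gene names stand in for the real cohort here.

```r
library(survival)

# Hypothetical inputs (assumptions, not from the assignment):
#   surv.df: data frame with columns `time` (follow-up) and
#            `status` (1 = death, 0 = censored)
#   mut:     logical matrix, rows = patients (same order as surv.df),
#            columns = genes (named)
impact <- function(genes, surv.df, mut) {
  # TRUE for patients in whom every gene of the set is mutated
  all.mutated <- rowSums(mut[, genes, drop = FALSE]) == length(genes)
  # The log-rank test needs both groups to be non-empty
  if (length(unique(all.mutated)) < 2) return(NA_real_)
  d <- data.frame(time = surv.df$time,
                  status = surv.df$status,
                  group = all.mutated)
  test <- survdiff(Surv(time, status) ~ group, data = d)
  # Chi-square p-value with (number of groups - 1) degrees of freedom
  pchisq(test$chisq, df = length(test$n) - 1, lower.tail = FALSE)
}

# Simulated stand-in data: 200 patients, 3 made-up genes
set.seed(1)
n <- 200
mut <- matrix(runif(n * 3) < 0.4, nrow = n,
              dimnames = list(NULL, c("G1", "G2", "G3")))
surv.df <- data.frame(
  time = rexp(n, rate = ifelse(mut[, "G1"], 0.2, 0.1)),
  status = rbinom(n, 1, 0.8)
)

impact("G1", surv.df, mut)            # single-gene set
impact(c("G1", "G2"), surv.df, mut)   # both genes mutated together
```

Returning ''NA'' when one group is empty is a design choice of this sketch; with ''n1=100'' many gene sets may have no patient carrying all mutations, and such sets need to be handled somehow before ranking.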
 + 
 +__Hint:__ Use the ''utils::c?m?n()'' function, where you need to guess the characters hidden by the question marks.
 + 
 +**Bonus**: Implement the ''utils::c?m?n()'' function yourself using dynamic programming. Compare the running time of your implementation with that of the utils implementation on inputs large enough to require at least a couple of minutes.
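One possible reading of the bonus, sketched in base R under the assumption that the task is to enumerate all subsets of a given size. The function name ''subsets.dp()'' and its interface (a vector in, a list of subsets out) are hypothetical and need not match the utils function. The table is filled bottom-up with the recurrence "size-''j'' subsets of the first ''i'' elements = size-''j'' subsets of the first ''i-1'' elements, plus every size-''(j-1)'' subset of the first ''i-1'' elements extended by element ''i''".

```r
# Hypothetical helper (name and interface are assumptions): returns a
# list with all size-k subsets of the vector x, built bottom-up.
subsets.dp <- function(x, k) {
  n <- length(x)
  stopifnot(k >= 0, k <= n)
  # tab[[j + 1]] holds all size-j subsets (as index vectors) of the
  # elements processed so far
  tab <- replicate(k + 1, list(), simplify = FALSE)
  tab[[1]] <- list(integer(0))  # the only subset of size 0
  for (i in seq_len(n)) {
    # Go from large j to small j, so that tab[[j]] still describes
    # the first i - 1 elements when it is read
    for (j in rev(seq_len(min(i, k)))) {
      with.i <- lapply(tab[[j]], function(s) c(s, i))
      tab[[j + 1]] <- c(tab[[j + 1]], with.i)
    }
  }
  # Map index vectors back to the elements of x
  lapply(tab[[k + 1]], function(s) x[s])
}

subsets.dp(c("a", "b", "c"), 2)

# Timing on a larger input; increase the sizes for longer runs
system.time(res <- subsets.dp(1:18, 9))
length(res) == choose(18, 9)
```

This sketch keeps every subset of size up to ''k'' in memory, so it trades memory for simplicity; a fair timing comparison should use the same input sizes for both implementations.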