Differences

This shows you the differences between two versions of the page.

gene_networks_inference [2018/12/14 15:28] – [Related work] admin
gene_networks_inference [2018/12/14 15:32] – [Papers] admin
Line 11:
 ===== Papers =====
 
-  - A. Zainulabadeen et al., Underexpression of Specific Interferon Genes Is Associated with Poor Prognosis of Melanoma, [[http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0170025|PLoS One]] 2017, 12(1).\\  "Using our recently developed gene network model, we identified biological signatures that confidently predict the prognosis of melanoma. We showed that our predictive model assesses the risk more accurately than the traditional Clark staging method."
-  - Foroushani, Amir, et al. "Large-scale gene network analysis reveals the significance of extracellular matrix pathway and homeobox genes in acute myeloid leukemia: an introduction to the Pigengene package and its applications." [[https://bmcmedgenomics.biomedcentral.com/articles/10.1186/s12920-017-0253-6|BMC medical genomics]] 10.1 (2017): 16.\\ \\
-=====   =====
+  - A. Zainulabadeen et al., Underexpression of Specific Interferon Genes Is Associated with Poor Prognosis of Melanoma, [[http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0170025|PLoS One]] 2017, 12(1). \\  "Using our recently developed gene network model, we identified biological signatures that confidently predict the prognosis of melanoma. We showed that our predictive model assesses the risk more accurately than the traditional Clark staging method."
+  - Foroushani, Amir, et al. "Large-scale gene network analysis reveals the significance of extracellular matrix pathway and homeobox genes in acute myeloid leukemia: an introduction to the Pigengene package and its applications." [[https://bmcmedgenomics.biomedcentral.com/articles/10.1186/s12920-017-0253-6|BMC medical genomics]] 10.1 (2017): 16.
+  - Agrahari, Rupesh, et al. "Applications of Bayesian network models in predicting types of hematological malignancies." [[https://www.nature.com/articles/s41598-018-24758-5|Scientific Reports]] 8.1 (2018): 6951.
 
 
Line 156:
  \\  \\ [[:gene_networks_inference|Drafts]], [[:gene_networks_inference|Next steps]]
  
- 
-===== Data ===== 
- 
-  - mRNA expression and mutation data from [[novartis_data|Novartis]].\\  Broad-Novartis Cancer Cell Line Encyclopedia ([[http://www.broadinstitute.org/ccle/home|CCLE]], [[http://www.nature.com/nature/journal/v483/n7391/full/nature11003.html|Barretina]] et al.). About 500 cell lines from different human cancers. The goal here is to predict drug responses, in particular in synergistic settings. Specifically, BROWSE > DATA >\\  a. mRNA expression > gene-centric RMA-normalized mRNA expression data > the gctx [[http://www.broadinstitute.org/ccle/data/browseData?conversationPropagation=begin|file]]. ([[|How to]] read a gctx file?) \\  b. Pharmacological profiling Drug data > Pharmacologic profiles for 24 anticancer drugs across 504 CCLE lines.\\  c. [[/file/view/CCLE_Chris_clustering.csv/535042602/CCLE_Chris_clustering.csv|Clustering]] and gene [[/file/view/CCLE_Chris_GO.csv/535042614/CCLE_Chris_GO.csv|ontology]] analysis done by Dr. Chris [[https://www.linkedin.com/profile/view?id=145098421&authType=NAME_SEARCH&authToken=xR3c&locale=en_US&srchid=1029939271418481101510&srchindex=1&srchtotal=1&trk=vsrp_people_res_name&trkInfo=VSRPsearchId%3A1029939271418481101510%2CVSRPtargetId%3A145098421%2CVSRPcmpt%3Aprimary|Gaiteri]].
-  - RNA-seq data from about a hundred [[alys_data|AML]] and MDS cases are available from the Karsan lab. The goal here is to identify the general underlying mechanisms of the disease, and to compare them with the relapse factors. More specific questions are a) What pathways differ between AML and MDS? b) Are there pathways that can define AML subtypes, which are expected to exist due to differences in prognosis? c) What are the molecular mechanisms of [[http://dsas9a9gxtv2e.cloudfront.net/content/haematol/95/10/1623/F1.large.jpg|transformation]] of some MDS cases to AML?
-  - "[[http://www.ncbi.nlm.nih.gov/geo/|GEO]], a public functional genomics data repository" ([[https://www.biostars.org/p/97370/|source]]). Includes over 4,000 leukemia subjects in the MILES series, a particular microarray study that contains around 400 AML and 300 MDS cases.
-  - NCI-60 cell lines ([[http://www.cbioportal.org/public-portal/study.do?cancer_study_id=cellline_nci60|cBio]] portal, [[http://dtp.nci.nih.gov/branches/btb/ivclsp.html|DTB]]). 
-  - Genentech data set [[http://www.nature.com/nbt/journal/vaop/ncurrent/full/nbt.3080.html|published]] in 2014 (Klijn et al.), RNA-seq for 675 cell lines including 15 AMLs, and response to 5 drugs.
-  - Sanger data set [[http://www.nature.com/nature/journal/v483/n7391/full/nature11005.html|published]] in 2012 (Garnett et al.), similar to CCLE data with 639 cell lines and 130 compounds.
-  - GlaxoSmithKline data set [[http://cancerres.aacrjournals.org/content/70/9/3677.long|published]] in 2010 (Greshock et al.), similar to CCLE data with 311 cell lines and 19 compounds. 
-  - Nucleic Acids Research online Molecular Biology Database [[http://nar.oxfordjournals.org/content/41/D1/D1.abstract?ijkey=a782763acb573716f2620e420a35a6d3fbaa3cf5&keytype2=tf_ipsecsha|Collection]],\\  which lists 1512 miscellaneous online databases.
-  - RNA-seq data of over 100 Xiphophorus fish treated with light under different conditions such as dosage and wavelength. 20-30 controls are also available from Walter [[http://mbrg.chemistry.txstate.edu/|Lab]] [[[https://docs.google.com/spreadsheets/d/1oG2doOyKcfCidYciPgO6Q50nwtUmflU-Ll8-iRkr_6U/edit#gid=0|table]]]. 
-  - [[http://www.reuters.com/article/2015/09/22/us-astrazeneca-cancer-idUSKCN0RM0MG20150922|AstraZeneca's]] crowdsourcing initiative as part of the DREAM Challenge. ~10,000 tested combinations measuring the ability of drugs to destroy cancer cell lines, and the corresponding genomic information.
-  - Breast Cancer datasets: We will examine the generalizability of the method that we developed for haematological malignancies (AML/MDS) by examining its performance on several breast cancer datasets: 209 ER+ samples from Wang et al.'s [[http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=gse2034|dataset]] (GEO) ([[http://www.nature.com/ncomms/journal/v1/n4/full/ncomms1033.html|Paper]]), 201 ER+ samples from Miller et al.'s [[http://www.ncbi.nlm.nih.gov/geo/geo2r/?acc=GSE3494&platform=GPL96|dataset]] (GEO), as well as expression data from the [[https://www.ebi.ac.uk/ega/studies/EGAS00000000083|METABRIC]] study (~2000 samples, hosted by EGA) ([[http://www.nature.com/nature/journal/v486/n7403/full/nature10983.html|paper]]).
-  - Microarray expression profiles of 1005 colorectal cancer patients from 13 independent cohorts ([[http://www.cancer-systemsbiology.org/Papers/JAMA-2015.pdf|paper]]). 
-  - Gene expression [[https://docs.google.com/spreadsheets/d/1oG2doOyKcfCidYciPgO6Q50nwtUmflU-Ll8-iRkr_6U/edit#gid=0|data]] of fish exposed to light (Walter Lab). 
-  - 16 pairs of tumor-normal samples from fish with [[/file/view/count_table_extra_32_samples_cpm.csv/578966565/count_table_extra_32_samples_cpm.csv|melanoma]] (Walter Lab). 
-  - 499 prostate adenocarcinoma ([[https://tcga-data.nci.nih.gov/tcga/tcgaCancerDetails.jsp?diseaseType=PRAD&diseaseName=Prostate%20adenocarcinoma|TCGA]], Provisional) samples. Low risk cases are "Disease Free" for at least 5 years and the "Recurred" ones are high risk. The relevant clinical data are shown in "Disease Free (Months)" and "Disease Free Status" columns in [[http://www.cbioportal.org/study.do?cancer_study_id=prad_tcga#|cBioPortal]], respectively. 
-  - 470 skin [[https://tcga-data.nci.nih.gov/tcga/tcgaCancerDetails.jsp?diseaseType=SKCM&diseaseName=Skin%20Cutaneous%20Melanoma|cutaneous]] melanoma samples from TCGA. The clinical data for survival analysis are shown in "Disease Free (Months)", "Disease Free Status", and "Days to Last Followup" columns. 
-  - 200 AML cases from TCGA (LAML dataset). Available data types include gene expression, DNA methylation, CNV, mutation, etc. TCGA data have moved to [[https://gdc-portal.nci.nih.gov/|GDC]], but the DNA methylation data are not there. Instead, they can be retrieved from the GDC Legacy [[https://gdc-portal.nci.nih.gov/legacy-archive/search/f?filters=%7B%22op%22:%22and%22,%22content%22:%5B%7B%22op%22:%22in%22,%22content%22:%7B%22field%22:%22cases.project.project_id%22,%22value%22:%5B%22TCGA-LAML%22%5D%7D%7D,%7B%22op%22:%22in%22,%22content%22:%7B%22field%22:%22files.data_category%22,%22value%22:%5B%22DNA%20methylation%22%5D%7D%7D%5D%7D&pagination=%7B%22files%22:%7B%22from%22:0,%22size%22:20,%22sort%22:%22cases.project.project_id:asc%22%7D%7D|Archive]] or the original [[https://tcga-data.nci.nih.gov/docs/publications/laml_2012/|paper]].
-  - German [[https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE37642|AMLCG]] 1999 provides microarray data of 562 AML samples. 
-  - Papaemmanuil, Elli, et al. "Genomic classification and prognosis in acute myeloid leukemia." [[http://www.nejm.org/doi/full/10.1056/NEJMoa1516192#t=article|NEJM]] 374.23 (2016): 2209-2221.\\  The mutations of 111 genes in over **1,500 AML** cases are reported. The authors used this information to classify cases into groups and showed that these groups have different prognoses, i.e., [[https://www.mskcc.org/sites/default/files/node/2246/documents/discrete-cpe.pdf|concordance]] (probability estimates) improves from 64% using only the European LeukemiaNet criteria to 71%. Using the alternative allele frequency, they estimated the time of occurrence for the driver mutations. The data are available through the links in the corresponding [[http://www.nature.com/ng/journal/v49/n3/full/ng.3756.html|Nature]] paper [{{ :ng.3756.pdf|pdf}}]. Information on downloading these data is contained in the readme file found in genetwork:~/proj/genetwork/data/AML/gerstung/readme.txt. In particular, we have access to [[https://www.ebi.ac.uk/ega/studies/EGAS00001000275|EGAS00001000275]] through [[https://ega-archive.org/|EGA]] Archives. See [[habils_lab_notebook|Habil's]] note on 2017/09/05 for more detail. Any member of Oncinfo Lab who touches (analyzes or views) these data from Sanger Institute must read and abide by the {{ :sanger_data_agreement_2017-08-09.pdf|agreement}}.
-  - RNA, DNA methylation, whole genome, etc. data of 960 (pediatric?) AML cases are available from [[https://ocg.cancer.gov/programs/target/acute-myeloid-leukemia|TARGET]] AML study. 
-  - AML-NK gene expression data (RNA-Seq) from three datasets (TCGA, Leucegene, and PMP/BCCA). [[https://docs.google.com/a/princeton.edu/document/d/1tB75BDAoG6-ggkoKzxF_f8anTnaP0lOAZ4MG-wEWCyk/edit?usp=sharing|Full description]]. 
-  - [[https://docs.google.com/document/d/1Q6tuMDw4fweRQNttmiG3NEZiBoe2JcRj7M1Qg-_rCgk/edit|List]] of available AML datasets with DNA methylation or gene expression data. 
-  - Genomic Data Commons ([[https://portal.gdc.cancer.gov/repository|GDC]]), which contains TCGA data and more. 
-  - [[https://amp.pharm.mssm.edu/archs4/|ARCHS4]], which was developed at the Icahn School of Medicine at Mount Sinai, and provides tools to download and analyze RNA-Seq data including single-cell gene expression. 
-=====   ===== 
-=====   =====
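
The risk definition given in the removed Data list for the TCGA prostate adenocarcinoma samples (low risk if "Disease Free" for at least 5 years, high risk if "Recurred") could be sketched roughly as below. This is only an illustration: the function name `risk_label`, the 60-month threshold constant, and the exact status strings are assumptions, and the values actually exported by cBioPortal may be spelled differently.

```python
from typing import Optional

# Assumption: "at least 5 years" is expressed in months, matching the
# "Disease Free (Months)" column.
FIVE_YEARS_IN_MONTHS = 60.0


def risk_label(status: str, disease_free_months: float) -> Optional[str]:
    """Return "high", "low", or None for one case.

    status              -- value of the "Disease Free Status" column
    disease_free_months -- value of the "Disease Free (Months)" column
    The status spellings matched here are hypothetical; adjust them to the
    values found in the downloaded clinical table.
    """
    normalized = status.strip().lower()
    if normalized.startswith("recurred"):
        return "high"
    is_disease_free = normalized in ("disease free", "diseasefree")
    if is_disease_free and disease_free_months >= FIVE_YEARS_IN_MONTHS:
        return "low"
    # Disease free but followed for under 5 years: risk cannot be assigned.
    return None


if __name__ == "__main__":
    for status, months in [("Recurred", 12.0), ("Disease Free", 72.0), ("Disease Free", 24.0)]:
        print(status, months, risk_label(status, months))
```

Cases that are disease free but have less than 5 years of follow-up get no label in this sketch, since the definition above does not place them in either group.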