20100729 Gene Network Analysis.ppt

Uploaded by: laozhun  Document ID: 2227034  Upload date: 2023-02-03  Format: PPT  Pages: 32  Size: 3.34 MB

In Silico Rice Gene-Phenotype Associations

Running Header: In Silico Rice Gene-Phenotype Associations

Corresponding Author: F. Alex Feltus, 51 New Cherry St., BRC #302C, Clemson, SC 29634, USA. Tel: +1-864-656-3231; Fax: +1-864-656-4293; E-mail: ffeltus@clemson.edu

Research Category: Genome Analysis

Plant Physiology Preview. Published on July 28, 2010, as DOI:10.1104/pp.110.159459. Copyright 2010 by the American Society of Plant Biologists.

The Association of Multiple Interacting Genes with Specific Phenotypes in Rice (Oryza sativa) Using Gene Co-Expression Networks

Stephen P. Ficklin (1), Feng Luo (2), F. Alex Feltus (1,3)

School of Computing, Clemson University, Clemson, SC 29634, USA. Department of Genetics & Biochemistry, Clemson University, Clemson, SC 29634, USA.

FOOTNOTES

This work was supported in part by Clemson Experiment Station Project #SC-1700381 to FAF.

Corresponding Author: F. Alex Feltus, ffeltus@clemson.edu.

ABSTRACT

Discovering gene sets underlying expression of a given phenotype is of great importance as many phenotypes are the result of complex gene-gene interactions. Gene co-expression networks, built using a set of microarray samples as input, can help elucidate tightly co-expressed gene sets (modules) which are mixed with genes of known and unknown function. Functional enrichment analysis of modules further subdivides the co-expressed gene set into co-functional gene clusters that may co-exist in the module with other functionally related gene clusters. In this study, 45 co-expressed gene modules and 76 co-functional gene clusters were discovered for Oryza sativa (rice), using a global, knowledge-independent paradigm and the combination of two network construction methodologies. Some clusters were enriched for previously characterized mutant phenotypes, providing evidence for specific gene sets (and their annotated molecular functions) that underlie specific phenotypes.

A current challenge in understanding biological systems, especially those related to multicellular eukaryotic organisms, is the understanding of complex gene product interactions and resulting phenotypes. Integrated studies at a systems biology level are critical for unraveling complex genotype-phenotype relationships. These studies are increasingly feasible with high-throughput microarray assays, next-generation sequencing technologies, proteomics, and the wealth of accumulated functional and structural genomics data across species. Oryza sativa (rice) is one of the world's most important food crops, and serves as a model organism for the grass family. An improved understanding of complex interactions among rice genes is of great importance to improve nutritional value, grain yield, cultivation range, disease and stress tolerance of rice and other cereals.

In silico derived networks such as protein-protein interaction, metabolism, transcription, and gene co-expression model real biological interactions and exhibit naturally occurring properties such as small-world, scale-free, modularity and hierarchical characteristics (Ravasz et al., 2002; Barabasi and Oltvai, 2004). Barabasi and Oltvai (2004) provide a review of biological networks, and a brief description of relevant network properties can be found in Supplemental Table S1. One type of biological network, the gene co-expression network, is constructed from microarray gene expression profiles (Stuart et al., 2003; Persson et al., 2005; Luo et al., 2007). Nodes in the network represent microarray probe sets (or genes), and edges between nodes exist when gene expression profiles are significantly correlated (co-expressed) across all samples. In many cases the microarray samples encompass multiple tissue types, growth stages and experimental variables. Networks constructed from mixed sample sets represent a “global”, or meta-analysis view of gene co-expression.

Gene co-expression networks can be applied to a broad range of biological problems. Examples include those constructed to identify functional gene modules in humans (Lee et al., 2004), identification of genes involved with cellulose synthase in Arabidopsis (Persson et al., 2005), identification of biomarkers for glycerol kinase deficient mice (MacLennan et al., 2009), identification of cis-regulatory elements in gene clusters for budding yeast (Mariño-Ramírez et al., 2009), construction of a regulatory network of iron response in Shewanella oneidensis (Yang et al., 2009), and identification of conserved gene clusters across several species (Stuart et al., 2003). For plants, global co-expression networks have been constructed for Arabidopsis (Persson et al., 2005; Wei et al., 2006; Mentzen et al., 2008; Atias et al., 2009; Mao et al., 2009; Wang et al., 2009), barley (Faccioli et al., 2005), rice (Jupiter et al., 2009; Lee et al., 2009), and tobacco (Edwards et al., 2010).

Several online resources exist for plant co-expression networks. For Arabidopsis, online resources for co-expression networks include the Arabidopsis Co-expression Tool (ACT), which allows users to mine genes with similar co-expression patterns as well as functional terms (Manfield et al., 2006), and the Arabidopsis thaliana trans-factor and cis-elements prediction database (ATTED II), which provides a visualization and online data mining tool for co-expression networks in Arabidopsis (Obayashi et al., 2009). The RiceArrayNet (RAN) (Lee et al., 2009) and STARNET 2 (Jupiter et al., 2009) provide similar functionality for rice. An online resource exists for poplar (Ogata et al., 2009) and a similar site named the Coexpressed Biological Processes (CoP) database provides a searchable database of functional associations for co-expression network modules across multiple plant species including rice (Ogata et al., 2010).

Gene co-expression networks do suffer from limitations. First, they cannot provide a full understanding of complex gene-gene interactions because they infer only a single level of interaction: gene co-expression. Also, co-expression can only be measured when genes are consistently co-expressed or when genes are sometimes co-expressed but otherwise consistently silent (Aoki et al., 2007). Additionally, expression of all genes in every environmental or temporal condition cannot be measured and hence co-expression networks do not capture all possible relationships. Moreover, genes that are not co-expressed, but which may be essential, are not captured. Despite these limitations, co-expression networks provide valuable glimpses into complex gene-product interactions.

Once constructed, a gene co-expression network can be examined for sub-networks of co-expressed and possibly co-functional genes. A reduced-bias sub-network discovery method can be performed using knowledge-independent approaches that employ statistical methods to circumscribe non-random gene set interactions. In contrast, gene-guided methods use a priori selected “bait” genes to define gene sets consisting of closely connected neighbors (Persson et al., 2005; Aoki et al., 2007). A knowledge-independent approach provides inferences into the interaction set that might be obscured from gene-guided methods which filter genes based on prior assumptions of the biological system under scrutiny.

Using a knowledge-independent method, co-expression networks can be subdivided into tightly connected gene modules. Modules are defined as sets of highly correlated (connected) genes that form sub-networks and are often connected to the global network through a few connections. It has been shown that modules often consist of genes that participate in similar functions (Stuart et al., 2003; Lee et al., 2004). As a result, genes of unknown function or genes not previously known to participate in molecular pathways can be identified through a “guilt-by-association” inference with genes of known function (Wolfe et al., 2005). Alternatively, function-enriched gene clusters within modules can be identified by counting annotated terms, such as Gene Ontology (GO) (Ashburner et al., 2000), in a set of genes. Functional enrichment of a given term occurs if the term is significantly more abundant in the module relative to its occurrence in the genome background and implies that the module is associated with the mixture of enriched function. Furthermore, gene subsets within modules can be identified that non-randomly share functional terms (co-functional clusters).
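
As a rough illustration of the term-enrichment idea described above, the sketch below computes a one-sided Fisher's exact test p-value for a single annotation term from four counts. The function name, argument names and example counts are assumptions made for illustration only; they are not taken from the study, which may have used a different implementation or test sidedness.

```python
from math import comb


def enrichment_p_value(module_hits, module_size, genome_hits, genome_size):
    """One-sided Fisher's exact test (upper tail of the hypergeometric distribution).

    Returns the probability of finding at least `module_hits` genes carrying a
    term in a module of `module_size` genes, when `genome_hits` of the
    `genome_size` background genes carry that term.
    """
    max_hits = min(module_size, genome_hits)
    total = comb(genome_size, module_size)
    tail = sum(
        comb(genome_hits, k) * comb(genome_size - genome_hits, module_size - k)
        for k in range(module_hits, max_hits + 1)
    )
    return tail / total


# Hypothetical counts: 8 of 20 module genes carry a term that
# 50 of 1,000 background genes carry.
p = enrichment_p_value(module_hits=8, module_size=20, genome_hits=50, genome_size=1000)
print(f"p = {p:.3g}")
```

A small p-value suggests the term is more abundant in the module than expected from the background. In practice a correction for testing many terms would normally be applied as well; that step is not shown here.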

Modules may consist of hundreds of nodes with numerous functional terms and multiple co-functional clusters. Publicly available tools such as DAVID (Dennis et al., 2003; Huang da et al., 2009), EASE (Hosack et al., 2003), Fatigo (Al-Shahrour et al., 2007) and Blast2GO (Gotz et al., 2008) represent some of the tools that exist for functional enrichment analysis.

Recent studies show that co-expression networks can be used to identify a set of candidate genes underlying specific phenotypes. Mutwil et al. demonstrate a novel clustering method for co-expression networks, coupled with associated phenotypic terms, to predict gene sets in Arabidopsis for lethality (Mutwil et al., 2010). Lee et al. show the predictive power of a network for Arabidopsis composed of a diverse set of data (including co-expression data) to predict gene sets associated with lethality and pigmentation (Lee et al., 2010). By prioritization of genes through guilt-by-association, Lee et al. show a ten-fold improvement over screens of random insertion mutants. Both studies demonstrate the applicability of this systems genetics approach for predicting biologically meaningful relationships.

Herein we describe the construction and functional partitioning of a rice gene co-expression network to associate multiple co-expressed gene sets with common molecular function and experimentally verified phenotypes. The underlying implication is that gene sets enriched for known gene lesions may be causal to a specific phenotype, and the molecular functions that are co-enriched for phenotype-associated genes may provide clues to the molecular mechanisms that lead to the phenotype. Each cluster or module is a candidate gene set for studying complex traits where multiple genes may have an effect on phenotypic expression.

RESULTS

The Rice Network

Construction of the rice co-expression network began with a total of 508 Affymetrix rice arrays downloaded from NCBI's Gene Expression Omnibus (GEO) (Supplemental Table S2), which were filtered for outliers and RMA normalized (see Materials and Methods). Pearson correlation between gene expression profiles was used as the underlying metric for co-expression. This study used the strengths of the RMT (Luo et al., 2007) and WGCNA (Langfelder and Horvath, 2008) methods to construct the gene co-expression network. WGCNA was used for module detection and RMT for automatic threshold (signal-to-noise) identification. Figure 1 provides a schematic of steps involved in network construction including RMA normalization, outlier detection and removal, calculation of Pearson correlation values, module detection using WGCNA and determination of a threshold value using RMT.

Co-expression network construction yielded 4,528 nodes (mapped to 4,502 rice loci) connected by 43,144 edges within 45 modules, some of which were later removed after thresholding. Supplemental Table S3 provides a listing of all edges in the co-expression network. The network follows the properties of natural biological networks, namely it is small-world, scale-free, modular and hierarchical. The network demonstrates small-world characteristics with an average distance between any two nodes (path length) of 11. Scale-free behavior is indicated by a negative linear correlation between the number of edges, log(k), and the probability of finding a node with k edges, P(k) (Supplemental Figure S1A). A negative correlation between the number of edges, k, and the clustering coefficient for nodes with k edges, C(k), indicates hierarchical and modular behavior (Supplemental Figure S1B). The average clustering coefficient was 0.318. A graphical representation of the network, generated using Cytoscape (Shannon et al., 2003), can be seen in Figure 2. Nodes in the network are color-coded according to the modules.

In order to explore the relationship between modules, the WGCNA package was used to calculate eigenvectors, or first principal components, for each module. The eigenvector, or eigengene, acts as a representative expression profile for the module and allows for a meta-analytic view of the entire module set. All eigengenes were clustered using WGCNA. Figure 3 provides a view of the modules in the form of a dendrogram that indicates “closeness” of expression similarity of the 45 modules. Each module is numbered from zero to 44 and prefixed with ME, meaning module eigengene. Adjacent modules are more highly similar in terms of expression. It should be noted that these eigenvectors were computed from WGCNA modules prior to removal of edges that were below the RMT-derived hard-threshold.

Mapping of Microarray Probe sets to Rice Loci

Prior to functional enrichment, the mapping of network nodes (microarray probe sets) to annotated rice gene models was necessary to ensure that annotation terms were not over-counted. The Michigan State University (MSU) Rice Genome Annotation version 6.0 contains 56,797 protein coding sequence loci. Of the 57,381 probe sets on the rice microarray, 50,468 mapped to 46,498 loci. Of those mappings, 34,028 probe sets mapped directly with all 11 probes from a single probe set to a gene locus. Of those mappings, 26,382 are unique one-to-one mappings between a probe set and locus. Redundant mappings are those where multiple probe sets map to a single locus. Ambiguous mappings are those where a probe set maps to multiple loci. The distribution of probes, probe sets and loci within the mappings can be observed in the charts of Supplemental Figure S2. There are 17,762 redundant mappings and 4,769 ambiguous mappings. Ambiguity was removed from the mappings, and the remaining redundancy was addressed with a weighted counting method (see Materials and Methods).

Functional Enrichment and Clustering

A functional enrichment analysis was performed to examine enrichment of annotated terms. After counting GO (Ashburner et al., 2000), KEGG (Kanehisa et al., 2008), InterPro (Apweiler et al., 2001), and Tos17 mutant phenotype (Hirochika et al., 1996; Miyao et al., 2003) terms for each module and for the genome background, Fisher's test comparisons were performed for each module to identify functionally enriched terms. Co-functional gene clusters with overlapping function were then identified. Clusters are sub-networks within modules. Nodes in modules are co-expressed and nodes within clusters are both co-expressed and co-functional. Some modules had multiple clusters while others had none. Functional enrichment yielded 2,412 unique enriched terms in all network modules with 939 of these aggregating into clusters. Of the total enriched terms, 21 were unique mutant phenotype terms that associated with 25 clusters. Four mutant phenotype terms wer
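
The mapping categories defined under "Mapping of Microarray Probe sets to Rice Loci" can be sketched as follows. This is a minimal illustration, assuming the mappings are available as (probe set, locus) pairs. The function name and the identifiers in the example are hypothetical, and the weighted counting method referred to in Materials and Methods is not reproduced here.

```python
from collections import defaultdict


def classify_mappings(pairs):
    """Split (probe_set, locus) pairs into ambiguous, redundant and unique mappings.

    Ambiguous: the probe set maps to more than one locus.
    Redundant: after ambiguous probe sets are dropped, several probe sets
               still map to the same locus.
    Unique:    one probe set maps to one locus and nothing else maps there.
    """
    pairs = sorted(set(pairs))

    # Collect every locus each probe set maps to.
    loci_of = defaultdict(set)
    for probe_set, locus in pairs:
        loci_of[probe_set].add(locus)

    ambiguous = [(p, l) for p, l in pairs if len(loci_of[p]) > 1]
    kept = [(p, l) for p, l in pairs if len(loci_of[p]) == 1]

    # Among the remaining mappings, collect the probe sets seen for each locus.
    probes_of = defaultdict(set)
    for probe_set, locus in kept:
        probes_of[locus].add(probe_set)

    redundant = [(p, l) for p, l in kept if len(probes_of[l]) > 1]
    unique = [(p, l) for p, l in kept if len(probes_of[l]) == 1]
    return {"unique": unique, "redundant": redundant, "ambiguous": ambiguous}


# Hypothetical probe set and locus identifiers, for illustration only.
example = [
    ("ps1", "locA"),
    ("ps2", "locB"), ("ps3", "locB"),
    ("ps4", "locC"), ("ps4", "locD"),
]
for category, mappings in classify_mappings(example).items():
    print(category, mappings)
```

Dropping ambiguous probe sets before checking for redundancy mirrors the order described in the text, where ambiguity was removed first and the remaining redundancy handled afterwards.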
