Vol. 37 ( 217 ) No. 5 J. of Math. (PRC) 1, 2, 3 (1., 4372) (2., 4379) (3., 437) :.,,,.,,. : ; ; ; MR(21) : 92B5 : O212.4; O212.5 : A : 255-7797(217)5-193-8 1,.,, [1,2] [3,4],.,, [5]. -, [6]. 1,.,,., DNA,, [7].,,.,.,,.,,. : 215-1-6 : 215-5-6 : (11275259); (9133113). : (199 ),,,, :.
194 Vol. 37 [8],. 2,,. 2.1. PPI, G = (V, E) [9], V V = N, E E. G A, i j, A ij = 1, A ij =,,. N T, N T F. A F. i j, A ij = 1, t, i j, F it > F jt >, i j t., T, T. 2.2,,,, 1. 2.2.1 7 T,, 7 MCL MCODE MINE ClusterONE DPClus SPICi CoAch. 1:
No. 5 : 195 MCL PPI, Expansion Inflation [1,11]. MCODE, PPI,, ( ) [11,14]. MINE MCODE,,, [13]. DPClus, CP [11,14]. ClusterONE, [15]. CoAch,, 1,, 2 [11,16]. SPICi, SPICi,,,,,, [17]. 2.2.2, 7, 7 V 1, V 2,, V 7, V i (i = 1 : 7) N P i (i = 1 : 7), N, P i i. V i, N i, N j,, N k e (1 <= e <= P i ), e, N i, N j,, N k 1,. 7 V 1, V 2,, V 7, V = [V 1, V 2,, V 7 ], V N P (P = 7 P i ),, T. i=1,., [18,19]. Lee Seung, min W,H (V ij log i,j V ij (W H) ij V ij + (W H) ij ). (W ia V iu /(W H) iu (H au V iu /(W H) iu i u H au H au, W ia W ia. H av k W ka W (N K ) H (K P ), W,, U ik = W ik /W i.. U, τ, U ij > τ, N i K j.,, K τ. v
196 Vol. 37 3 3,, <2. 3 3.1 BIOGPS Affymetrix 83 [2], BioGrid [21] -, 83, [2], >1 26, 26 : BDCA 4+ CD15+ CD19+B CD4+ T CD56+ CD71+ CD8+T k - 562 (MOLT-4) HL-6 burkitt(daudi) burkitt(raji). 3.2,, CORUM [22], 2151 324, 3. 3.3. ACC [23],,. MMR (the Maximum Matching Ratio) Paccanaro. oid ACC Thyroid MMR B cell ACC.35.6.25 k=8 k=1 k=12.3.5.2 k=14.25.2.4.15.15.3.1.1.5.2.5.5 1.5 1.1.5 1 Cortex ACC Prefrontal Cortex MMR T cell ACC T cell MMR.35.6.25.3.5.2.25.2.4.15.15.3.1.1.5.2.5.5 1.5 1.1.5 1 2: τ
No. 5 : 197 3.4 MCL,, 3. 5.,.2; MCODE 3, ; MINE 3, ; DPCLUS, d cp,.7.5; ClusterOne ; CoAch ω,,.225.925,.5; SPICi,.1 1,.1. 7, ACC MMR. 3.5, K τ, K,, 6 16, 2, τ,.9,.1. 26, 1. 1: ACC - coach Dpclus mcl mcode mine cluster one spici NMF BDCA 4+++.34.44.42.27.42.5.46.51.67.64.54.49.61.67.68.68 CD15+++.35.39.38.28.44.45.45.51.6.45.43.32.49.52.5.6 CD19+++B.39.5.44.29.46.5.43.55.38.46.43.29.43.52.49.52.31.39.39.25.37.47.38.47 CD4+++ T.41.51.46.33.53.56.52.58 CD56+.35.47.44.29.44.51.44.52 CD71+++.46.56.48.43.51.61.59.62 CD8+++T.37.47.47.3.46.52.5.54.7.64.54.52.64.68.68.71.62.62.56.41.61.63.62.63 k - 562.69.69.56.51.66.71.65.71.59.52.53.42.52.55.53.59 HL-6.63.6.59.47.56.64.62.64 burkitt(daudi).34.45.42.28.43.51.45.5 burkitt(raji).63.57.52.44.59.65.57.63.41.52.44.33.47.55.56.58.39.51.46.35.49.55.52.58.52.56.52.32.62.56.54.64.71.67.53.55.67.66.62.73.47.5.49.27.51.48.52.6.73.66.59.49.64.68.7.74.64.63.59.43.59.63.62.65.65.64.62.51.63.63.65.67, 6 2.1
198 Vol. 37,, 4, : B T, 2. 3.6, 7 26. 7, ACC MMR 26, 1, ACC 24,. 26 7,, 13%. 7, : 51.61% 33.33% 39.53% 122.22% 27.3% 25.% 27.91%, 3,, MCODE, 26, 3% ; MCL, 8% 4%; CLusterONE, 1.96% 3.8%. 25 2 15 CoAch DPClus MCL MCODE MINE Cluster ONE SPICi 1 5.2.4.6.8 1 3:, NMF, 7. 4,.,, 1(TNPO1) CD4 PPP3CA TNPO3, SRP19 TNPO3, IPO5 IPO7 NUTF2 RAN SRP19,, TNPO1 TNPO3,.,, PPI,.,,.,, ACC, MMR,,,,
No. 5 : 199. [1] Aebersold R, Mann M. Mass spectrometry-based proteomics[j]. Nature, 23, 422(6928): 198 27. [2] Ho Y, Gruhler A, Heilbut A, et al. Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry[j]. Nature, 22, 415(6868): 18 183. [3] Ito T, Chiba T, Ozawa R, Yoshida M, Hattori M, Sakaki Y. A comprehensive two-hybrid analysis to explore the yeast protein interactome[j]. Proceed. National Acad. Sci. United States America, 21, 98(8): 4569 4574. [4] Uetz P, Giot L, Cagney G, Mansfield T A, et al. A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae[j]. Nature, 2, 43(677): 623 627. [5] Gavin A C, B sche M, Krause R, et al. Functional organization of the yeast proteome by systematic analysis of protein complexes[j]. Nature, 22, 415(6868): 141 147. [6] Lage K, Karlberg E O, Størling Z M, et al. A human phenome-interactome network of protein complexes implicated in genetic disorders[j]. Nature Biotechnology, 27, 25(3): 39 316. [7] Lo K, Raftery A E, Dombek K M, et al. Integrating external biological knowledge in the construction of regulatory networks from time-series expression data[j]. BMC Sys. Bio., 212, 6(2): 11. [8] Vasmatzis G, Klee E W, Kube D M, Therneau T M, Kosari F. Quantitating tissue specificity of human genes to facilitate biomarker discovery[j]. Bioinformatics, 27, 23(11): 1348 1355. [9] Li D, Li J, Ouyang S, Wang J, Wu S, Wan P, Zhu Y, Xu X, He F. Protein interaction networks of Saccharomyces cerevisiae, Caenorhabditis elegans and Drosophila melanogaster: large-scale organization and robustness[j]. Proteomics, 26, 6(2): 456 461. [1] Enright A J, Dongen S V, Ouzounis C A. An efficient algorithm for largescale detection of protein families[j]. Nucleic Acids Res, 212, 3(7): 1575 1584. [11],,,. [J]., 214, 4(4): 577 593. [12] Bader G D, Hogue C W V. An automated method for finding molecular complexes in large protein interaction networks[j]. BMC Bioinformatics, 23, 4(1): 2. [13] Rhrissorrakrai K, Gunsalus K C. MINE: module identification in networks[j]. BMC Bioinformatics, 211, 12(1): 192. [14] Altaf-Ul-Amin M, Shinbo Y, Mihara K, Kurokawa K, Kanaya S. Development and implementation of an algorithm for detection of protein complexes in large interaction networks[j]. BMC Bioinformatics, 26, 7(1): 27. [15] Nepusz T, Yu H, Paccanaro A. Detecting overlapping protein complexes in protein-protein interaction networks[j]. Nature Methods, 212, 9(5): 471 472. [16] Wu M, Li X L, Kwoh C K, Ng C K. A core-attachment based method to detect protein complexes in PPI networks[j]. BMC Bioinformatics, 29, 1(1): 169. [17] Jiang P, Singh M. SPICi: a fast clustering algorithm for large biological networks[j]. Bioinformatics, 21, 26(8): 115 1111. [18] Lee D D, Seung H S. Learning the parts of objects by non-negative matrix factorization[j]. Nature, 1999, 41(6755): 788 791. [19] Ding C, He X F, Simon H D. On the equivalence of nonnegative matrix factorization and spectral clustering[j]. Siam Intern. Confer. Data Min., 25, 5: 66 61.
11 Vol. 37 [2] Lopes T J, Schaefer M, Shoemaker J, Matsuoka Y, Fontaine J F, Neumann G, Andrade-Navarro M A, Kawaoka Y, Kitano H. Tissue-specific subnetworks and characteristics of publicly available human protein interaction databases[j]. Bioinformatics, 211, 27(17): 2414 2421. [21] Chatr-aryamontri A, Breitkreutz B J, Heinicke S, et al. The Biogrid interaction database: 213 update[j]. Nucleic Acids Research, 213, 41(2): 816 823. [22] Havugimana P C, Hart G T, Nepusz T, et al. A census of human soluble protein complexes[j]. Cell, 212, 15(5): 168 181. [23] Li X, Wu M, Kwoh C K, et al. Computational approaches for detecting protein complexes from protein interaction networks: a survey[j]. BMC Genomics, 21, 11(4): S3. [24] Ou-Yang L, Dai D Q, Zhang X F. Protein complex detection via weighted ensemble clustering based on bayesian nonnegative matrix factorization[j]. Plos One, 213, 8(5): 639 642. [25] Ou-Yang L, Dai D Q, Li X L, Wu M, Zhang X F, Yang P. Detecting temporal protein complexes from dynamic protein-protein interaction networks[j]. BMC Bioinformatics, 214, 15(1): 161 165. [26] Zhang X F, Dai D Q, Ou-Yang L, Yan H. Detecting overlapping protein complexes based on a generative model with functional and topological properties[j]. BMC Bioinformatics, 214, 15(2): 836 842. [27] Zhang W, Zou X F. A new method for detecting protein complexes based on the three node cliques[j]. IEEE/ACM Trans Comput. Biol. Bioinform, 215, 12(4): 879 886. [28]. [J]., 26, 26(1): 67 7. IDENTIFICATION OF THE TISSUE SPECIFIC PROTEIN COMPLEXES DING Xia 1, ZHANG Xiao-fei 2, YI Ming 3 (1.School of Mathematics and Statistics, Wuhan University, Wuhan 4372,China) (2.School of Mathematics and Statistics, Central China Normal University, Wuhan 4379, China) (3.School of Science, Huazhong Agricultural University, Wuhan 437, China) Abstract: In this paper, we study the identification problem of tissue-specific protein complexes. By using a variety of typical clustering algorithm to cluster the network, we construct a tissue-specific protein-protein interaction network based on the protein-protein interaction networks as well as the tissue-specific gene expression data, then merge the results with non-negative matrix factorization model to obtain tissue-specific protein complexes. The results show that clustering effect has been significantly improved, and can identify tissue-specific protein complexes. Keywords: protein-protein interaction networks; complexes identification; tissue-specific; non-negative matrix factorization 21 MR Subject Classification: 92B5