The following information has been received by the server:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
________________________________________________________________________________

reference predict_h26320 (Wed Oct 7 17:37:12 MDT 1998)
from staben@pop.uky.edu
password(###)
resp MAIL
orig HTML
prediction of: -secondary structure (PHDsec)-solvent accessibility (PHDacc)-
return msf format
# Human G3PDH
GKVKVGVDGFGRIGRLVTRAAFNSGKVDIVAINDPFIDLHYMVYMFQYDSTHGKFHGTVKAEDGKLVIDG
KAITIFQERDPENIKWGDAGTAYVVESTGVFTTMEKAGAHLKGGAKRIVISAPSADAPMFVMGVNHFKYA
NSLKIISNASCTTNCLAPLAKVIHDHFGIVEGLMTTVHAITATQKTVDSPSGKLWRGGRGAAQNLIPAST
GAAKAVGKVIPELDGKLTGMAFRVPTANVSVLDLTCRLEKPAKYDDIKKVVKEASEGPLKGILGYTEDEV
VSDDFNGSNHSSIFDAGAGIELNDTFVKLVSWYDNEFGYSERVVDLMAHMASKE
________________________________________________________________________________

Result of PROSITE search (Amos Bairoch):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
please quote: A Bairoch, P Bucher & K Hofmann: The PROSITE database,
its status in 1997. Nucl. Acids Res., 1997, 25, 217-221.
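
The PROSITE patterns reported in this section are written in a syntax that is also valid as a regular expression, so the hits can be re-derived locally. Several of the listed positions are larger than the 334-residue query, so they are presumably not plain residue indices. The following is a minimal sketch, not the server's own code: it assumes Python's standard `re` module, and the names `QUERY`, `PATTERNS`, `scan` and `hits` are illustrative only. It reports 1-based start positions of non-overlapping matches.

```python
import re

# Query sequence exactly as submitted to the server (334 residues).
QUERY = (
    "GKVKVGVDGFGRIGRLVTRAAFNSGKVDIVAINDPFIDLHYMVYMFQYDSTHGKFHGTVKAEDGKLVIDG"
    "KAITIFQERDPENIKWGDAGTAYVVESTGVFTTMEKAGAHLKGGAKRIVISAPSADAPMFVMGVNHFKYA"
    "NSLKIISNASCTTNCLAPLAKVIHDHFGIVEGLMTTVHAITATQKTVDSPSGKLWRGGRGAAQNLIPAST"
    "GAAKAVGKVIPELDGKLTGMAFRVPTANVSVLDLTCRLEKPAKYDDIKKVVKEASEGPLKGILGYTEDEV"
    "VSDDFNGSNHSSIFDAGAGIELNDTFVKLVSWYDNEFGYSERVVDLMAHMASKE"
)

# Patterns copied verbatim from the PROSITE section of the report.
PATTERNS = {
    "ASN_GLYCOSYLATION": r"N[^P][ST][^P]",
    "PKC_PHOSPHO_SITE": r"[ST].[RK]",
    "CK2_PHOSPHO_SITE": r"[ST].{2}[DE]",
    "MYRISTYL": r"G[^EDRKHPFYW].{2}[STAGCN][^P]",
    "GAPDH": r"[ASV]SC[NT]T.{2}[LIM]",
}


def scan(sequence, pattern):
    """Return (1-based start, matched text) for each non-overlapping match."""
    return [(m.start() + 1, m.group()) for m in re.finditer(pattern, sequence)]


hits = {name: scan(QUERY, regex) for name, regex in PATTERNS.items()}

for name, found in hits.items():
    print(name)
    for start, text in found:
        print(f"  {start:5d}  {text}")
```

Because `re.finditer` resumes after the end of each match, a candidate site that overlaps a previous hit is not reported; a lookahead-based scan would be needed to list overlapping sites as well.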
________________________________________________________________________________

--------------------------------------------------------
--------------------------------------------------------

Pattern-ID: ASN_GLYCOSYLATION PS00001 PDOC00001
Pattern-DE: N-glycosylation site
Pattern:    N[^P][ST][^P]
   148      NASC
   389      NVSV
   678      NGSN
   984      NDTF

Pattern-ID: PKC_PHOSPHO_SITE PS00005 PDOC00005
Pattern-DE: Protein kinase C phosphorylation site
Pattern:    [ST].[RK]
    24      SGK
    84      TVK
   228      SLK
   413      TQK
   606      SGK
   853      TCR
  1175      SER

Pattern-ID: CK2_PHOSPHO_SITE PS00006 PDOC00006
Pattern-DE: Casein kinase II phosphorylation site
Pattern:    [ST].{2}[DE]
   102      TTME
   345      SVLD
   624      TEDE
   919      SIFD
  1233      SWYD

Pattern-ID: MYRISTYL PS00008 PDOC00008
Pattern-DE: N-myristoylation site
Pattern:    G[^EDRKHPFYW].{2}[STAGCN][^P]
    57      GTVKAE
   161      GVFTTM
   334      GIVEGL
   536      GGRGAA
   752      GAAKAV
  1044      GSNHSS
  1348      GIELND

Pattern-ID: GAPDH PS00071 PDOC00069
Pattern-DE: Glyceraldehyde 3-phosphate dehydrogenase active site
Pattern:    [ASV]SC[NT]T.{2}[LIM]
   149      ASCTTNCL

________________________________________________________________________________

Result of ProDom domain search (Corpet, Gouzy, Kahn):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- please quote: ELL Sonnhammer & D Kahn, Prot. Sci., 1994, 3, 482-492
________________________________________________________________________________

--- ------------------------------------------------------------
--- Results from running BLAST against PRODOM domains
---
--- PLEASE quote:
---    F Corpet, J Gouzy, D Kahn (1998). The ProDom database
---    of protein domain families. Nucleic Ac Res 26:323-326.
---
--- BEGIN of BLASTP output
BLASTP 1.4.7 [16-Oct-94] [Build 17:06:52 Oct 31 1994]

Reference: Altschul, Stephen F., Warren Gish, Webb Miller, Eugene W. Myers,
and David J. Lipman (1990). Basic local alignment search tool. J. Mol. Biol.
215:403-10.
Query= prot (#) ppOld, human g3pdh /home/phd/server/work/predict_h26320 (334 letters) Database: /home/phd/ut/prodom/prodom_34_2 53,597 sequences; 6,740,067 total letters. Searching..................................................done Smallest Sum High Probability Sequences producing High-scoring Segment Pairs: Score P(N) N 81 p34.2 (147) G3P(77) G3PC(20) G3P1(13) // DEHYDROGENA... 791 5.0e-158 5 78 p34.2 (132) G3P(64) G3PC(19) G3P2(12) // DEHYDROGENA... 90 6.5e-06 1 23361 p34.2 (3) G3P(3) // DEHYDROGENASE GAPDH 3-PHOSPHA... 87 1.3e-05 1 19275 p34.2 (1) MOCA_RHIME // RHIZOPINE CATABOLISM PROTE... 70 0.0021 1 45367 p34.2 (2) CY12(1) CY11(1) // CLONE PROTEIN PRECUR... 76 0.0051 1 29258 p34.2 (1) PF4L_PIG // PLATELET BASIC PROTEIN PRECU... 43 0.0083 2 >81 p34.2 (147) G3P(77) G3PC(20) G3P1(13) // DEHYDROGENASE 3-PHOSPHATE GLYCERALDEHYDE GAPDH CYTOSOLIC CHLOROPLAST PRECURSOR A B GAPDH-2 Length = 295 Score = 791 (362.6 bits), Expect = 5.0e-158, Sum P(5) = 5.0e-158 Identities = 154/183 (84%), Positives = 163/183 (89%) Query: 145 IISNASCTTNCLAPLAKVIHDHFGIVEGLMTTVHAITATQKTVDSPSGKLWRGGRGAAQN 204 I+SNASCTTNCLAPLAKVIHD FGIVEGLMTTVHA TATQKTVD PS K WRGGR AAQN Sbjct: 113 IVSNASCTTNCLAPLAKVIHDKFGIVEGLMTTVHAYTATQKTVDGPSHKDWRGGRAAAQN 172 Query: 205 LIPASTGAAKAVGKVIPELDGKLTGMAFRVPTANVSVLDLTCRLEKPAKYDDIKKVVKEA 264 +IP+STGAAKAVGKVIPEL+GKLTGMAFRVPT NVSV+DLT RLEKPA YD+I +KEA Sbjct: 173 IIPSSTGAAKAVGKVIPELNGKLTGMAFRVPTPNVSVVDLTVRLEKPATYDEINAAIKEA 232 Query: 265 SEGPLKGILGYTEDEVVSDDFNGSNHSSIFDAGAGIELNDTFVKLVSWYDNEFGYSERVV 324 SEGPLKGILGYTED VVS DFNG HSSIFDA AGI LND FVKLVSWYDNE+GYS RVV Sbjct: 233 SEGPLKGILGYTEDPVVSTDFNGDPHSSIFDAKAGIALNDNFVKLVSWYDNEWGYSNRVV 292 Query: 325 DLM 327 DL+ Sbjct: 293 DLV 295 Score = 168 (77.0 bits), Expect = 5.0e-158, Sum P(5) = 5.0e-158 Identities = 32/50 (64%), Positives = 36/50 (72%) Query: 71 KAITIFQERDPENIKWGDAGTAYVVESTGVFTTMEKAGAHLKGGAKRIVI 120 K +F ERDP N+ WG+ G YVVESTGVFTT EKA AHLKGG + VI Sbjct: 35 KIKVVFSERDPANLPWGELGVDYVVESTGVFTTKEKASAHLKGGGAKKVI 84 
Score = 80 (36.7 bits), Expect = 5.0e-158, Sum P(5) = 5.0e-158 Identities = 13/17 (76%), Positives = 15/17 (88%) Query: 41 YMVYMFQYDSTHGKFHG 57 YM YMF+YDSTHG+F G Sbjct: 3 YMAYMFKYDSTHGRFKG 19 Score = 70 (32.1 bits), Expect = 5.0e-158, Sum P(5) = 5.0e-158 Identities = 13/14 (92%), Positives = 13/14 (92%) Query: 126 DAPMFVMGVNHFKY 139 DAPMFVMGVNH KY Sbjct: 93 DAPMFVMGVNHDKY 106 Score = 62 (28.4 bits), Expect = 5.0e-158, Sum P(5) = 5.0e-158 Identities = 11/19 (57%), Positives = 16/19 (84%) Query: 57 GTVKAEDGKLVIDGKAITI 75 GTV+ +DGKLV++GK I + Sbjct: 20 GTVEVKDGKLVVNGKKIKV 38 Score = 57 (26.1 bits), Expect = 7.1e-143, Sum P(5) = 7.1e-143 Identities = 10/14 (71%), Positives = 13/14 (92%) Query: 113 GGAKRIVISAPSAD 126 GGAK+++ISAPS D Sbjct: 78 GGAKKVIISAPSKD 91 >78 p34.2 (132) G3P(64) G3PC(19) G3P2(12) // DEHYDROGENASE 3-PHOSPHATE GLYCERALDEHYDE GAPDH CYTOSOLIC CHLOROPLAST PRECURSOR A B GAPDH-2 Length = 28 Score = 90 (41.3 bits), Expect = 6.5e-06, P = 6.5e-06 Identities = 18/28 (64%), Positives = 22/28 (78%) Query: 5 VGVDGFGRIGRLVTRAAFNSGKVDIVAI 32 VG++GFGRIGRLV RAA V++VAI Sbjct: 1 VGINGFGRIGRLVLRAALERDDVEVVAI 28 >23361 p34.2 (3) G3P(3) // DEHYDROGENASE GAPDH 3-PHOSPHATE GLYCERALDEHYDE PLASMIN RECEPTOR PLASMINOGEN-BINDING PROTEIN Length = 37 Score = 87 (39.9 bits), Expect = 1.3e-05, P = 1.3e-05 Identities = 18/31 (58%), Positives = 22/31 (70%) Query: 4 KVGVDGFGRIGRLVTRAAFNSGKVDIVAIND 34 KVG++GFGRIGRL R V++VAIND Sbjct: 4 KVGINGFGRIGRLALRRIQEVPGVEVVAIND 34 >19275 p34.2 (1) MOCA_RHIME // RHIZOPINE CATABOLISM PROTEIN MOCA. 
Length = 53 Score = 70 (32.1 bits), Expect = 0.0021, P = 0.0021 Identities = 14/34 (41%), Positives = 23/34 (67%) Query: 2 KVKVGVDGFGRIGRLVTRAAFNSGKVDIVAINDP 35 + ++G+ G GR+G++ RAA S V+I A+ DP Sbjct: 3 RFRLGLVGAGRMGQVHVRAAAESSLVEIAAVADP 36 >45367 p34.2 (2) CY12(1) CY11(1) // CLONE PROTEIN PRECURSOR C1 CYTOCHROME HEME PC18I PC13III Length = 90 Score = 76 (34.8 bits), Expect = 0.0051, P = 0.0051 Identities = 14/33 (42%), Positives = 20/33 (60%) Query: 2 KVKVGVDGFGRIGRLVTRAAFNSGKVDIVAIND 34 K+++G DGFGRI R +TR A + + ND Sbjct: 6 KIRIGFDGFGRINRFITRGAAQRNDSKLPSRND 38 >29258 p34.2 (1) PF4L_PIG // PLATELET BASIC PROTEIN PRECURSOR (PBP). Length = 55 Score = 43 (19.7 bits), Expect = 0.0083, Sum P(2) = 0.0083 Identities = 8/15 (53%), Positives = 12/15 (80%) Query: 205 LIPASTGAAKAVGKV 219 L+PA+ GAAK G++ Sbjct: 33 LVPATMGAAKIEGRM 47 Score = 39 (17.9 bits), Expect = 0.0083, Sum P(2) = 0.0083 Identities = 8/22 (36%), Positives = 16/22 (72%) Query: 142 SLKIISNASCTTNCLAPLAKVI 163 SL++ + +SCTT+ P+ +V+ Sbjct: 2 SLRLGAISSCTTSSPFPVLQVL 23 Parameters: E=0.1 B=500 V=500 -ctxfactor=1.00 Query ----- As Used ----- ----- Computed ---- Frame MatID Matrix name Lambda K H Lambda K H +0 0 BLOSUM62 0.318 0.136 0.392 same same same Query Frame MatID Length Eff.Length E S W T X E2 S2 +0 0 334 334 0.10 71 3 11 22 0.20 34 Statistics: Query Expected Observed HSPs HSPs Frame MatID High Score High Score Reportable Reported +0 0 60 (27.5 bits) 791 (362.6 bits) 12 12 Query Neighborhd Word Excluded Failed Successful Overlaps Frame MatID Words Hits Hits Extensions Extensions Excluded +0 0 7169 4475034 902058 3568168 4806 4 Database: /home/phd/ut/prodom/prodom_34_2 Release date: unknown Posted date: 12:24 PM MET DST May 06, 1998 # of letters in database: 6,740,067 # of sequences in database: 53,597 # of database sequences satisfying E: 6 No. 
of states in DFA:  564 (111 KB)
Total size of DFA:  262 KB (320 KB)
Time to generate neighborhood:  0.03u 0.00s 0.03t  Real: 00:00:00
Time to search database:  14.03u 0.08s 14.11t  Real: 00:00:14
Total cpu time:  14.13u 0.10s 14.23t  Real: 00:00:14
--- END of BLASTP output
--- ------------------------------------------------------------
---
--- Again: these results were obtained based on the domain database
--- collected by Daniel Kahn and his coworkers in Toulouse.
---
--- PLEASE quote:
---    F Corpet, J Gouzy, D Kahn (1998). The ProDom database
---    of protein domain families. Nucleic Ac Res 26:323-326.
---
--- The general WWW page is on:
---- ---------------------------------------
---  http://www.toulouse.inra.fr/prodom.html
---- ---------------------------------------
---
--- For WWW graphic interfaces to PRODOM, in particular for your
--- protein family, use the following links (each line is ONE
--- single link for your protein!!):
---
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom1=81 ==> multiple alignment, consensus, PDB and PROSITE links of domain 81
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom2=81 ==> graphical output of all proteins having domain 81
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom1=78 ==> multiple alignment, consensus, PDB and PROSITE links of domain 78
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom2=78 ==> graphical output of all proteins having domain 78
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom1=23361 ==> multiple alignment, consensus, PDB and PROSITE links of domain 23361
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom2=23361 ==> graphical output of all proteins having domain 23361
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom1=19275 ==> multiple alignment, consensus, PDB and PROSITE links of domain 19275
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom2=19275 ==> graphical output of all proteins having domain 19275
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom1=45367 ==> multiple alignment, consensus, PDB and PROSITE links of domain 45367
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom2=45367 ==> graphical output of all proteins having domain 45367
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom1=29258 ==> multiple alignment, consensus, PDB and PROSITE links of domain 29258
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom2=29258 ==> graphical output of all proteins having domain 29258
---
--- NOTE: if you want to use a link, make sure the entire line
--- is pasted as URL into your browser!
---
--- END of PRODOM
--- ------------------------------------------------------------
________________________________________________________________________________

Note: Your protein has a homologue of known structure in PDB!
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

PHD predictions are inferior to a prediction by homology, which is
possible if a protein with known tertiary structure exists in PDB.
For the sequence you sent, there is a known homologue in PDB.
We append the alignment of your sequence to some sequences, among
them the PDB entry. Predicting 3D structure for your sequence is a
straightforward task by using, e.g., SWISS-MODEL (for the address
see: http://www.embl-heidelberg.de/~rost/wwwServices.html).

Should you have sent a known structure to evaluate the PHD
prediction, please note that the performance of PHD is expected to be
superior for proteins used for training the networks.
The list of proteins used for training is:

256b_A, 2aat, 8abp, 6acn, 1acx, 8adh, 3ait, 1ak3_A,
2alp, 9api_A, 9api_B, 8atc_A, 8atc_B, 1azu, 3b5c, 1bbp_A,
1bds, 3blm, 1bmv_1, 1bmv_2, 4bp2, 2cab, 7cat_A, 1cbh,
1cc5, 2ccy_A, 1cd4, 1cdt_A, 3cla, 3cln, 4cms, 4cpa_I,
6cpa, 6cpp, 4cpv, 1crn, 1cse_I, 6cts, 2cyp, 5cyt_R,
3dfr, 6dfr, 3ebx, 1eca, 5er2_E, 1etu, 1fc2_C, 1fc2_D,
1fdl_H, 1fdx, 1fkf, 2fnr, 2fxb, 1fxi_A, 4fxn, 3gap_A,
2gbp, 2gcr, 1gd1_O, 2gls_A, 2gn5, 1gox, 1gp1_A, 4gr1,
1hds_B, 1hip, 6hir, 2hla_A, 3hla_B, 3hmg_A, 3hmg_B, 2hmz_A,
5hvp_A, 2i1b, 3icb, 7icd, 1il8_A, 9ins_B, 1l58, 1lap,
2lbp, 5ldh, 2lh4, 2lhb, 1lrd_3, 2ltn_A, 2ltn_B, 5lyz,
1mcp_L, 4mdh_A, 2mev_1, 2mev_3, 2mev_4, 2mhu, 1mrt, 2or1_L,
1ovo_A, 2pab_A, 1paz, 9pap, 2pcy, 4pfk, 3pgm, 2phh,
2pka_A, 2pka_B, 1pmb_A, 1ppt, 1prc_C, 1prc_H, 1prc_L, 1prc_M,
1pyp, 1r09_2, 1rbp, 1rhd, 4rhv_1, 4rhv_3, 4rhv_4, 1rnh,
3rnt, 7rsa, 2rsp_A, 2rus_A, 4rxn, 1s01, 4sbv_A, 1sdh_A,
4sgb_I, 1sgt, 1sh1, 2sns, 2sod_B, 2stv, 2taa_A, 2tbv_A,
2tgp_I, 1tgs_I, 3tim_A, 6tmn_E, 2tmv_P, 1tnf_A, 4ts1_A, 2tsc_A,
1ubq, 2utg_A, 9wga_A, 2wrp_R, 1wsy_A, 1wsy_B, 4xia_A

For personal messages or questions to the PHD authors, send email to
Predict-Help@EMBL-Heidelberg.DE

Burkhard Rost
EMBL, 69120 Heidelberg, Europe

--- ------------------------------------------------------------
--- 3D homologue: the known structure that appeared to have significant
--- 3D homologue: sequence identity to your protein is:
--- 3D homologue: 3GPD, 1GPD, 1GAE, 1GGA, 4DBV, 1A7K, 1CER, 1HDG, 1NLG, .
--- 3D homologue: Note: we do NOT check whether the similarity
--- 3D homologue: is in the region for which structure has
--- 3D homologue: been determined. Thus, please verify!
--- ------------------------------------------------------------
--- Database used for sequence comparison:
--- SEQBASE RELEASE 34.0 OF EMBL/SWISS-PROT WITH 59021 SEQUENCES

The alignment that has been used as input to the network is:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
________________________________________________________________________________

--- ------------------------------------------------------------
--- MAXHOM multiple sequence alignment
--- ------------------------------------------------------------
---
--- MAXHOM ALIGNMENT HEADER: ABBREVIATIONS FOR SUMMARY
--- ID     : identifier of aligned (homologous) protein
--- STRID  : PDB identifier (only for known structures)
--- PIDE   : percentage of pairwise sequence identity
--- WSIM   : percentage of weighted similarity
--- LALI   : number of residues aligned
--- NGAP   : number of insertions and deletions (indels)
--- LGAP   : number of residues in all indels
--- LSEQ2  : length of aligned sequence
--- ACCNUM : SwissProt accession number
--- NAME   : one-line description of aligned protein
---
--- MAXHOM ALIGNMENT HEADER: SUMMARY
ID         STRID   IDE WSIM LALI NGAP LGAP LEN2 ACCNUM NAME
g3p1_human 3GPD    100  100  334    0    0  334 P00354 GLYCERALDEHYDE 3-PHOSPHAT
g3p_canfa           95   98   63    0    0   63 Q28259 (FRAGMENT).
g3p_pig             91   96  332    0    0  332 P00355 GLYCERALDEHYDE 3-PHOSPHAT
g3p_cavpo           92   97   75    0    0   75 P70685 (FRAGMENT).
g3p_rabit           90   96  332    0    0  332 P46406 GLYCERALDEHYDE 3-PHOSPHAT
g3p_bovin           90   96  320    0    0  320 P10096 (FRAGMENT).
g3p2_human          90   96  334    0    0  334 P04406 GLYCERALDEHYDE 3-PHOSPHAT
g3p_crigr           89   96  332    0    0  332 P17244 GLYCERALDEHYDE 3-PHOSPHAT
g3p_sheep           89   93  250    1   11  250 Q28554 (FRAGMENTS).
g3p_mouse           89   95  332    0    0  332 P16858 GLYCERALDEHYDE 3-PHOSPHAT
g3p_mesau           88   95  312    0    0  312 P51640 (FRAGMENT).
g3p_cotja           88   95  332    0    0  332 Q05025 GLYCERALDEHYDE 3-PHOSPHAT
g3p_rat             88   95  332    0    0  332 P04797 GLYCERALDEHYDE 3-PHOSPHAT
g3p1_jacor          87   93  333    1   30  363 P80534 (GAPDH).
g3p_chick           87   95  332    0    0  332 P00356 GLYCERALDEHYDE 3-PHOSPHAT
g3p_xenla           82   92  332    0    0  332 P51469 GLYCERALDEHYDE 3-PHOSPHAT
g3p2_jacor          96   99   25    0    0   25 P80447 (FRAGMENT).
g3p_homam  1GPD     73   88  331    1    1  333 P00357 GLYCERALDEHYDE 3-PHOSPHAT
g3p3_caebr          73   87  334    2    6  341 P32810 GLYCERALDEHYDE 3-PHOSPHAT
g3p1_drome          72   88  332    1    1  332 P07486 GLYCERALDEHYDE 3-PHOSPHAT
g3p2_caebr          72   87  334    2    6  341 P32809 GLYCERALDEHYDE 3-PHOSPHAT
g3p2_drome          72   87  332    1    1  332 P07487 GLYCERALDEHYDE 3-PHOSPHAT
g3p3_caeel          72   87  334    2    6  341 P17330 GLYCERALDEHYDE 3-PHOSPHAT
g3p_drohy           72   87  332    1    1  332 Q01597 GLYCERALDEHYDE 3-PHOSPHAT
g3p2_caeel          72   87  334    2    6  341 P17329 GLYCERALDEHYDE 3-PHOSPHAT
g3p_schpo           71   88  334    0    0  336 P78958 GLYCERALDEHYDE 3-PHOSPHAT
g3p_schma           71   86  334    1    1  338 P20287 LARVAL SURFACE ANTIGEN) (
g3p_bruma           71   86  333    2    5  339 P48812 GLYCERALDEHYDE 3-PHOSPHAT
g3p_lacdt           70   87  290    0    0  290 P55070 (FRAGMENT).
g3p_podan           70   87  334    0    0  337 P32637 GLYCERALDEHYDE 3-PHOSPHAT
g3p_ustma           70   87  334    0    0  337 P09317 GLYCERALDEHYDE 3-PHOSPHAT
g3p_pharh           70   86  334    1    1  338 O13507 GLYCERALDEHYDE 3-PHOSPHAT
g3p1_caeel          70   86  334    2    6  341 P04970 GLYCERALDEHYDE 3-PHOSPHAT
g3p_aspng           70   87  334    0    0  336 Q12552 GLYCERALDEHYDE 3-PHOSPHAT
g3p4_caeel          70   86  334    2    6  341 P17331 GLYCERALDEHYDE 3-PHOSPHAT
g3p_boled           70   86  290    1    1  291 Q00301 (FRAGMENT).
g3p_coche           69   86  334    0    0  337 P29497 GLYCERALDEHYDE 3-PHOSPHAT
g3pc_ginbi          69   86  334    1    2  340 Q39769 GLYCERALDEHYDE 3-PHOSPHAT
g3p_amamu           69   86  289    0    0  289 P55071 (FRAGMENT).
g3pc_phypa          68   85  333    2    3  342 P34923 GLYCERALDEHYDE 3-PHOSPHAT
g3p_curlu           68   86  334    0    0  337 P28844 GLYCERALDEHYDE 3-PHOSPHAT
g3pc_ranac          68   85  334    1    2  338 P26521 GLYCERALDEHYDE 3-PHOSPHAT
g3px_horvu          68   86  334    1    2  337 P26517 GLYCERALDEHYDE 3-PHOSPHAT
g3p_schco           68   87  334    0    0  337 P32638 GLYCERALDEHYDE 3-PHOSPHAT
g3p_crypa           68   86  334    0    0  337 P19089 GLYCERALDEHYDE 3-PHOSPHAT
g3pc_maize          68   86  334    1    2  337 P08735 GLYCERALDEHYDE 3-PHOSPHAT
g3p_phach           68   86  334    0    0  337 Q01982 GLYCERALDEHYDE 3-PHOSPHAT
g3p_lyosh           68   86  333    0    0  337 Q92243 GLYCERALDEHYDE 3-PHOSPHAT
g3pt_mouse          68   84  333    2    2  440 Q64467 (EC 1.2.1.12) (GAPDH).
g3p_neucr           67   86  334    0    0  338 P54118 (CLOCK-CONTROLLED PROTEIN
g3p_emeni           67   86  334    0    0  336 P20445 GLYCERALDEHYDE 3-PHOSPHAT
g3p_colgl           67   86  334    0    0  338 P35143 GLYCERALDEHYDE 3-PHOSPHAT
g3pc_sinal          67   85  333    1    2  337 P04796 GLYCERALDEHYDE 3-PHOSPHAT
g3pc_pethy          67   85  334    1    2  337 P26520 GLYCERALDEHYDE 3-PHOSPHAT
g3pc_pinsy          67   85  334    1    2  340 P34924 GLYCERALDEHYDE 3-PHOSPHAT
g3p_serma           67   83  293    3    3  294 P24166 (FRAGMENT).
g3p_canal           67   86  331    0    0  331 Q92211 GLYCERALDEHYDE 3-PHOSPHAT
g3pc_arath          67   85  333    1    2  338 P25858 GLYCERALDEHYDE 3-PHOSPHAT
g3p1_triko          67   84  332    1    1  335 P17729 GLYCERALDEHYDE 3-PHOSPHAT
g3p_atrnu           66   85  334    1    2  360 P34783 GLYCERALDEHYDE 3-PHOSPHAT
g3p_picpa           66   86  333    0    0  333 Q92263 GLYCERALDEHYDE 3-PHOSPHAT
g3p_colln           66   85  333    1    1  337 P54117 GLYCERALDEHYDE 3-PHOSPHAT
g3pc_tobac          66   85  324    1    2  326 P09094 (FRAGMENT).
g3p_monan           66   84  329    2    5  331 P53430 GLYCERALDEHYDE 3-PHOSPHAT
g3p_escfe           66   82  293    3    3  294 P24746 (FRAGMENT).
g3pc_pea            66   85  334    1    2  338 P34922 GLYCERALDEHYDE 3-PHOSPHAT
g3pc_mescr          66   84  334    1    2  337 P17878 GLYCERALDEHYDE 3-PHOSPHAT
g3pc_grave          66   85  334    0    0  335 P54270 GLYCERALDEHYDE 3-PHOSPHAT
g3p_erygr           66   86  334    0    0  338 Q00640 GLYCERALDEHYDE 3-PHOSPHAT
g3pc_diaca          66   85  333    1    2  338 P34921 GLYCERALDEHYDE 3-PHOSPHAT
g3pc_horvu          66   84  303    1    2  305 P08477 (FRAGMENT).
g3p_phyin           66   86  332    0    0  332 P26988 GLYCERALDEHYDE 3-PHOSPHAT
g3p1_ecoli 1GAE     66   83  329    3    3  330 P06977 GLYCERALDEHYDE 3-PHOSPHAT
g3pc_antma          66   85  334    1    2  337 P25861 GLYCERALDEHYDE 3-PHOSPHAT
g3p_clapu           66   85  334    0    0  337 Q00584 GLYCERALDEHYDE 3-PHOSPHAT
g3pc_orysa          66   85  334    1    2  337 Q42977 GLYCERALDEHYDE 3-PHOSPHAT
g3pc_petcr          66   84  334    1    2  336 P26519 GLYCERALDEHYDE 3-PHOSPHAT
g3pc_chocr          66   85  334    0    0  335 P34920 GLYCERALDEHYDE 3-PHOSPHAT
g3p_serod           66   82  293    3    3  294 P24753 (FRAGMENT).
g3pc_leime          65   82  329    3    3  330 Q01558 (GAPDH).
g3pc_taxba          65   84  334    1    2  340 Q41595 GLYCERALDEHYDE 3-PHOSPHAT
g3p2_triko          65   86  334    0    0  337 P17730 GLYCERALDEHYDE 3-PHOSPHAT
g3pc_magli          65   85  334    1    2  341 P26518 GLYCERALDEHYDE 3-PHOSPHAT
g3p3_yeast          65   85  331    0    0  331 P00359 GLYCERALDEHYDE 3-PHOSPHAT
g3p2_agabi          65   85  333    0    0  338 P32636 GLYCERALDEHYDE 3-PHOSPHAT
g3pc_crapl          65   84  334    1    2  337 Q42671 GLYCERALDEHYDE 3-PHOSPHAT
g3p_klepn           65   83  303    1    1  303 P24164 (FRAGMENT).
g3p2_yeast          65   85  331    0    0  331 P00358 GLYCERALDEHYDE 3-PHOSPHAT
g3pc_trybb          64   82  329    3    3  330 P10097 (GAPDH).
g3p1_escvu          64   83  294    1    1  294 P24751 (FRAGMENT).
g3p_haein           64   81  332    4    8  339 P44304 GLYCERALDEHYDE 3-PHOSPHAT
g3p_esche           64   82  294    1    1  294 P24750 (FRAGMENT).
g3p_klula           64   84  329    1    1  329 P17819 GLYCERALDEHYDE 3-PHOSPHAT
g3p_triha           64   85  334    0    0  338 P87197 GLYCERALDEHYDE 3-PHOSPHAT
g3p1_yeast          64   84  331    0    0  331 P00360 GLYCERALDEHYDE 3-PHOSPHAT
g3p2_kluma          64   84  331    1    1  331 Q01077 GLYCERALDEHYDE 3-PHOSPHAT
g3p_citfr           64   82  294    1    1  294 P24748 (FRAGMENT).
g3pc_chlre          63   83  334    1    1  341 P49644 GLYCERALDEHYDE 3-PHOSPHAT
g3p_entae           63   82  294    1    1  294 P24163 (FRAGMENT).
g3p_escbl           63   82  294    1    1  294 P24749 (FRAGMENT).
g3p_zygro           63   84  333    0    0  333 P08439 GLYCERALDEHYDE 3-PHOSPHAT
g3p1_syny3          63   81  332    3    5  339 P49433 (GAP-1).
g3p1_salty          62   81  294    1    1  294 P24165 (FRAGMENT).
g3p1_agabi          62   82  333    1    4  337 P32635 GLYCERALDEHYDE 3-PHOSPHAT
g3p1_anava          61   80  328    3    5  334 P34916 GLYCERALDEHYDE 3-PHOSPHAT
g3p_bacfr           61   79  297    3    3  299 Q59199 (FRAGMENT).
g3p_bucap           60   80  331    3    3  332 Q07234 GLYCERALDEHYDE 3-PHOSPHAT
g3p1_giala          56   79  334    1    1  337 P53429 GLYCERALDEHYDE 3-PHOSPHAT
g3p_burso           56   81   76    0    0   76 P52694 (FRAGMENT).
g3pg_trybb 1GGA     55   73  333    7   20  358 P22512 (GAPDH).
g3p_chltr           54   77  334    1    1  341 Q46450 GLYCERALDEHYDE 3-PHOSPHAT
g3pg_trycr          54   73  334    7   20  359 P22513 (GAPDH).
g3p1_bacsu          52   75  331    4    4  334 P09124 GLYCERALDEHYDE 3-PHOSPHAT
g3p_bacme           52   75  331    4    4  334 P23722 GLYCERALDEHYDE 3-PHOSPHAT
g3p_bacst  4DBV     52   75  332    3    3  335 P00362 GLYCERALDEHYDE 3-PHOSPHAT
g3pg_leime 1A7K     51   72  334    6   20  360 Q27890 (GAPDH).
g3p_borbu           50   71  327    6   11  335 P46795 GLYCERALDEHYDE 3-PHOSPHAT
g3p_borhe           49   69  255    5   10  264 P46796 (FRAGMENT).
g3p_theaq  1CER     48   70  328    5    6  331 P00361 GLYCERALDEHYDE 3-PHOSPHAT
g3pa_sinal          48   73  231    2    2  233 P09672 (FRAGMENT).
g3p_mycle           48   70  332    5    9  339 P46713 GLYCERALDEHYDE 3-PHOSPHAT
g3p_myctu           48   70  332    5    9  339 O06822 GLYCERALDEHYDE 3-PHOSPHAT
g3pp_alceu          47   70  331    5    8  336 P50322 GLYCERALDEHYDE 3-PHOSPHAT
g3p_bacco           46   71  332    3    3  335 P15115 GLYCERALDEHYDE 3-PHOSPHAT
g3pc_alceu          46   70  331    5    8  336 P50321 GLYCERALDEHYDE 3-PHOSPHAT
g3p_thema  1HDG     46   72  330    3    3  332 P17721 GLYCERALDEHYDE 3-PHOSPHAT
g3p_clopa           46   70  328    5   10  334 Q59309 / CP 18).
g3p_strpy           46   68  328    6   11  335 P50467 (PLASMINOGEN-BINDING PROT
g3pa_maize          46   70  332    5    7  403 P09315 (EC 1.2.1.12).
g3pb_arath          45   69  333    6    7  402 P25857 (EC 1.2.1.12).
g3pa_spiol          45   70  331    5    6  337 P19866 GLYCERALDEHYDE 3-PHOSPHAT
g3p_pseae           45   68  329    7   10  335 P27726 GLYCERALDEHYDE 3-PHOSPHAT
g3pa_chlre 1NLG_M   45   70  332    5    7  374 P50362 (EC 1.2.1.12).
g3pb_pea            45   69  333    6    7  451 P12859 (EC 1.2.1.12).
g3p2_anava          45   71  332    4    4  336 P34917 GLYCERALDEHYDE 3-PHOSPHAT
g3p2_syny3          45   71  330    5    7  336 P80505 (GAP-2) (NAD(P)-DEPENDENT
g3p_zymmo           45   69  333    4    5  337 P09316 GLYCERALDEHYDE 3-PHOSPHAT
g3p_mycpn           45   69  327    5   11  337 P75358 GLYCERALDEHYDE 3-PHOSPHAT
g3p_streq           45   67  328    6   11  335 Q59906 (PLASMINOGEN-BINDING PROT
g3pb_spiol          44   68  333    6    7  451 P12860 (EC 1.2.1.12).
g3p_strae           44   68  331    5    5  333 P54226 GLYCERALDEHYDE 3-PHOSPHAT
g3pa_arath          44   70  332    5    6  396 P25856 (EC 1.2.1.12).
g3p_xanfl           44   70  333    3    3  335 P51009 GLYCERALDEHYDE 3-PHOSPHAT
g3pb_tobac          44   68  333    6    7  438 P09044 (EC 1.2.1.12) (FRAGMENT).
g3p_corgl           44   69  332    4    4  336 Q01651 GLYCERALDEHYDE 3-PHOSPHAT
g3p1_anasp          59   83   34    0    0   35 P80506 (FRAGMENT).
g3p2_rhosh          43   69  331    4    5  333 P29272 GLYCERALDEHYDE 3-PHOSPHAT
g3pa_grave          43   70  332    5    7  416 P30724 (EC 1.2.1.12).
g3pa_tobac          43   68  333    4    4  392 P09043 (EC 1.2.1.12) (FRAGMENT).
g3p_halva           42   66  328    6    8  335 Q48335 GLYCERALDEHYDE 3-PHOSPHAT
g3pa_pea            42   69  333    4    4  405 P12858 (EC 1.2.1.12).
g3pa_chocr          42   69  333    4    5  414 P34919 (EC 1.2.1.12).
g3p3_ecoli          42   69  329    3    6  332 P33898 GLYCERALDEHYDE 3-PHOSPHAT
g3p_mycge           42   67  327    5   11  337 P47543 GLYCERALDEHYDE 3-PHOSPHAT
g3p2_bacsu          42   70  332    4    4  340 O34425 GLYCERALDEHYDE 3-PHOSPHAT
g3p_strau           42   68  330    4    4  332 Q59800 GLYCERALDEHYDE 3-PHOSPHAT
g3p_lacla           42   66  329    6   12  337 P52987 GLYCERALDEHYDE 3-PHOSPHAT
g3p_helpy           41   64  322    6   19  332 P55971 GLYCERALDEHYDE 3-PHOSPHAT
g3p3_anava          40   68  333    4    4  337 P34918 GLYCERALDEHYDE 3-PHOSPHAT
e4pd_ecoli          35   65  332    3    5  338 P11603 D-ERYTHROSE 4-PHOSPHATE D
---
--- MAXHOM ALIGNMENT: IN MSF FORMAT
MSF of: /home/phd/server/work/predict_h26320-22273.hssp from: 1 to: 334
/home/phd/server/work/predict_h26320-22273.msfRet MSF: 334 Type: P 7-Oct-98 17:39:4 Check: 9276 ..
Name: predict_h263 Len: 334 Check: 5354 Weight: 1.00 Name: g3p1_human Len: 334 Check: 5354 Weight: 1.00 Name: g3p_canfa Len: 334 Check: 6621 Weight: 1.00 Name: g3p_pig Len: 334 Check: 7983 Weight: 1.00 Name: g3p_cavpo Len: 334 Check: 7760 Weight: 1.00 Name: g3p_rabit Len: 334 Check: 7734 Weight: 1.00 Name: g3p_bovin Len: 334 Check: 2082 Weight: 1.00 Name: g3p2_human Len: 334 Check: 9029 Weight: 1.00 Name: g3p_crigr Len: 334 Check: 8884 Weight: 1.00 Name: g3p_sheep Len: 334 Check: 3675 Weight: 1.00 Name: g3p_mouse Len: 334 Check: 9517 Weight: 1.00 Name: g3p_mesau Len: 334 Check: 2920 Weight: 1.00 Name: g3p_cotja Len: 334 Check: 8600 Weight: 1.00 Name: g3p_rat Len: 334 Check: 8670 Weight: 1.00 Name: g3p1_jacor Len: 334 Check: 8797 Weight: 1.00 Name: g3p_chick Len: 334 Check: 8178 Weight: 1.00 Name: g3p_xenla Len: 334 Check: 2504 Weight: 1.00 Name: g3p2_jacor Len: 334 Check: 7993 Weight: 1.00 Name: g3p_homam Len: 334 Check: 9746 Weight: 1.00 Name: g3p3_caebr Len: 334 Check: 177 Weight: 1.00 Name: g3p1_drome Len: 334 Check: 9125 Weight: 1.00 Name: g3p2_caebr Len: 334 Check: 9296 Weight: 1.00 Name: g3p2_drome Len: 334 Check: 8691 Weight: 1.00 Name: g3p3_caeel Len: 334 Check: 9593 Weight: 1.00 Name: g3p_drohy Len: 334 Check: 862 Weight: 1.00 Name: g3p2_caeel Len: 334 Check: 9029 Weight: 1.00 Name: g3p_schpo Len: 334 Check: 117 Weight: 1.00 Name: g3p_schma Len: 334 Check: 1815 Weight: 1.00 Name: g3p_bruma Len: 334 Check: 8912 Weight: 1.00 Name: g3p_lacdt Len: 334 Check: 8348 Weight: 1.00 Name: g3p_podan Len: 334 Check: 1935 Weight: 1.00 Name: g3p_ustma Len: 334 Check: 854 Weight: 1.00 Name: g3p_pharh Len: 334 Check: 4944 Weight: 1.00 Name: g3p1_caeel Len: 334 Check: 2012 Weight: 1.00 Name: g3p_aspng Len: 334 Check: 1010 Weight: 1.00 Name: g3p4_caeel Len: 334 Check: 1805 Weight: 1.00 Name: g3p_boled Len: 334 Check: 8629 Weight: 1.00 Name: g3p_coche Len: 334 Check: 1478 Weight: 1.00 Name: g3pc_ginbi Len: 334 Check: 1748 Weight: 1.00 Name: g3p_amamu Len: 334 Check: 8303 
Weight: 1.00 Name: g3pc_phypa Len: 334 Check: 7769 Weight: 1.00 Name: g3p_curlu Len: 334 Check: 3055 Weight: 1.00 Name: g3pc_ranac Len: 334 Check: 3236 Weight: 1.00 Name: g3px_horvu Len: 334 Check: 378 Weight: 1.00 Name: g3p_schco Len: 334 Check: 4152 Weight: 1.00 Name: g3p_crypa Len: 334 Check: 757 Weight: 1.00 Name: g3pc_maize Len: 334 Check: 584 Weight: 1.00 Name: g3p_phach Len: 334 Check: 2644 Weight: 1.00 Name: g3p_lyosh Len: 334 Check: 2958 Weight: 1.00 Name: g3pt_mouse Len: 334 Check: 5146 Weight: 1.00 Name: g3p_neucr Len: 334 Check: 1236 Weight: 1.00 Name: g3p_emeni Len: 334 Check: 2634 Weight: 1.00 Name: g3p_colgl Len: 334 Check: 279 Weight: 1.00 Name: g3pc_sinal Len: 334 Check: 440 Weight: 1.00 Name: g3pc_pethy Len: 334 Check: 290 Weight: 1.00 Name: g3pc_pinsy Len: 334 Check: 1719 Weight: 1.00 Name: g3p_serma Len: 334 Check: 4450 Weight: 1.00 Name: g3p_canal Len: 334 Check: 6823 Weight: 1.00 Name: g3pc_arath Len: 334 Check: 9905 Weight: 1.00 Name: g3p1_triko Len: 334 Check: 338 Weight: 1.00 Name: g3p_atrnu Len: 334 Check: 2506 Weight: 1.00 Name: g3p_picpa Len: 334 Check: 1668 Weight: 1.00 Name: g3p_colln Len: 334 Check: 520 Weight: 1.00 Name: g3pc_tobac Len: 334 Check: 7315 Weight: 1.00 Name: g3p_monan Len: 334 Check: 8877 Weight: 1.00 Name: g3p_escfe Len: 334 Check: 2793 Weight: 1.00 Name: g3pc_pea Len: 334 Check: 9801 Weight: 1.00 Name: g3pc_mescr Len: 334 Check: 9093 Weight: 1.00 Name: g3pc_grave Len: 334 Check: 8114 Weight: 1.00 Name: g3p_erygr Len: 334 Check: 8879 Weight: 1.00 Name: g3pc_diaca Len: 334 Check: 408 Weight: 1.00 Name: g3pc_horvu Len: 334 Check: 5984 Weight: 1.00 Name: g3p_phyin Len: 334 Check: 3270 Weight: 1.00 Name: g3p1_ecoli Len: 334 Check: 846 Weight: 1.00 Name: g3pc_antma Len: 334 Check: 2397 Weight: 1.00 Name: g3p_clapu Len: 334 Check: 9557 Weight: 1.00 Name: g3pc_orysa Len: 334 Check: 540 Weight: 1.00 Name: g3pc_petcr Len: 334 Check: 216 Weight: 1.00 Name: g3pc_chocr Len: 334 Check: 7391 Weight: 1.00 Name: g3p_serod Len: 334 
Check: 3065  Weight: 1.00
Name: g3pc_leime   Len: 334  Check: 7930  Weight: 1.00
Name: g3pc_taxba   Len: 334  Check: 1084  Weight: 1.00
Name: g3p2_triko   Len: 334  Check: 2808  Weight: 1.00
Name: g3pc_magli   Len: 334  Check:  360  Weight: 1.00
Name: g3p3_yeast   Len: 334  Check: 9259  Weight: 1.00
Name: g3p2_agabi   Len: 334  Check: 1870  Weight: 1.00
Name: g3pc_crapl   Len: 334  Check: 9414  Weight: 1.00
Name: g3p_klepn    Len: 334  Check: 6440  Weight: 1.00
Name: g3p2_yeast   Len: 334  Check: 8380  Weight: 1.00
Name: g3pc_trybb   Len: 334  Check: 4409  Weight: 1.00
Name: g3p1_escvu   Len: 334  Check: 4057  Weight: 1.00
Name: g3p_haein    Len: 334  Check: 6431  Weight: 1.00
Name: g3p_esche    Len: 334  Check: 3011  Weight: 1.00
Name: g3p_klula    Len: 334  Check: 6220  Weight: 1.00
Name: g3p_triha    Len: 334  Check: 1743  Weight: 1.00
Name: g3p1_yeast   Len: 334  Check: 9025  Weight: 1.00
Name: g3p2_kluma   Len: 334  Check: 5541  Weight: 1.00
Name: g3p_citfr    Len: 334  Check: 4618  Weight: 1.00
Name: g3pc_chlre   Len: 334  Check: 9722  Weight: 1.00
Name: g3p_entae    Len: 334  Check: 4599  Weight: 1.00
Name: g3p_escbl    Len: 334  Check: 5811  Weight: 1.00
Name: g3p_zygro    Len: 334  Check: 9016  Weight: 1.00
Name: g3p1_syny3   Len: 334  Check: 6948  Weight: 1.00
Name: g3p1_salty   Len: 334  Check: 4072  Weight: 1.00
Name: g3p1_agabi   Len: 334  Check:  527  Weight: 1.00
Name: g3p1_anava   Len: 334  Check: 3085  Weight: 1.00
Name: g3p_bacfr    Len: 334  Check: 8149  Weight: 1.00
Name: g3p_bucap    Len: 334  Check: 9037  Weight: 1.00
Name: g3p1_giala   Len: 334  Check:  561  Weight: 1.00
Name: g3p_burso    Len: 334  Check: 7371  Weight: 1.00
Name: g3pg_trybb   Len: 334  Check: 4896  Weight: 1.00
Name: g3p_chltr    Len: 334  Check: 9120  Weight: 1.00
Name: g3pg_trycr   Len: 334  Check: 4488  Weight: 1.00
Name: g3p1_bacsu   Len: 334  Check: 3172  Weight: 1.00
Name: g3p_bacme    Len: 334  Check: 3329  Weight: 1.00
Name: g3p_bacst    Len: 334  Check: 5336  Weight: 1.00
Name: g3pg_leime   Len: 334  Check: 3061  Weight: 1.00
Name: g3p_borbu    Len: 334  Check: 4965  Weight: 1.00
Name: g3p_borhe    Len: 334  Check: 7783  Weight: 1.00
Name: g3p_theaq    Len: 334  Check:  924  Weight: 1.00
Name: g3pa_sinal   Len: 334  Check: 6287  Weight: 1.00
Name: g3p_mycle    Len: 334  Check: 6329  Weight: 1.00
Name: g3p_myctu    Len: 334  Check: 4635  Weight: 1.00
Name: g3pp_alceu   Len: 334  Check: 6864  Weight: 1.00
Name: g3p_bacco    Len: 334  Check: 2831  Weight: 1.00
Name: g3pc_alceu   Len: 334  Check: 6618  Weight: 1.00
Name: g3p_thema    Len: 334  Check: 9549  Weight: 1.00
Name: g3p_clopa    Len: 334  Check: 7962  Weight: 1.00
Name: g3p_strpy    Len: 334  Check: 7917  Weight: 1.00
Name: g3pa_maize   Len: 334  Check: 7101  Weight: 1.00
Name: g3pb_arath   Len: 334  Check: 7969  Weight: 1.00
Name: g3pa_spiol   Len: 334  Check: 6778  Weight: 1.00
Name: g3p_pseae    Len: 334  Check: 6504  Weight: 1.00
Name: g3pa_chlre   Len: 334  Check: 8691  Weight: 1.00
Name: g3pb_pea     Len: 334  Check: 7893  Weight: 1.00
Name: g3p2_anava   Len: 334  Check: 3040  Weight: 1.00
Name: g3p2_syny3   Len: 334  Check: 1429  Weight: 1.00
Name: g3p_zymmo    Len: 334  Check: 8528  Weight: 1.00
Name: g3p_mycpn    Len: 334  Check: 9155  Weight: 1.00
Name: g3p_streq    Len: 334  Check: 9463  Weight: 1.00
Name: g3pb_spiol   Len: 334  Check:  134  Weight: 1.00
Name: g3p_strae    Len: 334  Check: 6316  Weight: 1.00
Name: g3pa_arath   Len: 334  Check: 4290  Weight: 1.00
Name: g3p_xanfl    Len: 334  Check: 2316  Weight: 1.00
Name: g3pb_tobac   Len: 334  Check: 9974  Weight: 1.00
Name: g3p_corgl    Len: 334  Check: 4238  Weight: 1.00
Name: g3p1_anasp   Len: 334  Check: 4843  Weight: 1.00
Name: g3p2_rhosh   Len: 334  Check: 2742  Weight: 1.00
Name: g3pa_grave   Len: 334  Check: 1226  Weight: 1.00
Name: g3pa_tobac   Len: 334  Check: 7343  Weight: 1.00
Name: g3p_halva    Len: 334  Check: 7676  Weight: 1.00
Name: g3pa_pea     Len: 334  Check: 7912  Weight: 1.00
Name: g3pa_chocr   Len: 334  Check: 9974  Weight: 1.00
Name: g3p3_ecoli   Len: 334  Check: 3755  Weight: 1.00
Name: g3p_mycge    Len: 334  Check: 6735  Weight: 1.00
Name: g3p2_bacsu   Len: 334  Check: 6929  Weight: 1.00
Name: g3p_strau    Len: 334  Check: 3949  Weight: 1.00
Name: g3p_lacla    Len: 334  Check: 6035  Weight: 1.00
Name: g3p_helpy    Len: 334  Check: 8922  Weight: 1.00
Name: g3p3_anava   Len: 334  Check:  629  Weight: 1.00
Name: e4pd_ecoli   Len: 334  Check: 7106  Weight: 1.00

//

             1                                                   50
predict_h263 GKVKVGVDGF GRIGRLVTRA AFNSGKVDIV AINDPFIDLH YMVYMFQYDS
g3p1_human   GKVKVGVDGF GRIGRLVTRA AFNSGKVDIV AINDPFIDLH YMVYMFQYDS
g3p_canfa    .......... .......... .......... .......... ..........
g3p_pig      ..VKVGVNGF GRIGRLVTRA AFNSGKVDIV AINDPFIDLH YMVYMFQYDS
g3p_cavpo    .......... .......... .......... .......... ..........
g3p_rabit    ..VKVGVNGF GRIGRLVTRA AFNSGKVDVV AINDPFIDLH YMVYMFQYDS
g3p_bovin    ..VKVGVNGF GRIGRLVTRA AFNSGKVDIV AINDPFIDLH YMVYMFQYDS
g3p2_human   GKVKVGVNGF GRIGRLVTRA AFNSGKVDIV AINDPFIDLN YMVYMFQYDS
g3p_crigr    ..VKVGVNGF GRIGRLVTRA AFTSGKVEVV AINDPFIDLN YMVYMFQYDS
g3p_sheep    .......... .......... .......... .......... ..........
g3p_mouse    ..VKVGVNGF GRIGRLVTRA AICSGKVEIV AINDPFIDLN YMVYMFQYDS
g3p_mesau    .......... .RIGRLVTRA AFTSGKVDIV AINDPFIDLN YMVYMFQYDS
g3p_cotja    ..VKVGVNGF GRIGRLVTRA AVLSGKVQVV AINDPFIDLN YMVYMFKYDS
g3p_rat      ..VKVGVNGF GRIGRLVTRA AFSCDKVDIV AINDPFIDLN YMVYMFQYDS
g3p1_jacor   .MVKVGVNGF GRIGRLVTRA AFNSGKvdIV AINDPFIDLN YMVYMFKYDS
g3p_chick    ..VKVGVNGF GRIGRLVTRA AVLSGKVQVV AINDPFIDLN YMVYMFKYDS
g3p_xenla    ..VKVGINGF GCIGRLVTRA AFDSGKVQVV AINDPFIDLD YMVYMFKYDS
g3p2_jacor   ..VKVGVNGF GRIGRLVTRA AFNSGKV... .......... ..........
g3p_homam ..SKIGIDGF GRIGRLVLRA ALSCGA.QVV AVNDPFIALE YMVYMFKYDS g3p3_caebr SKPSVGINGF GRIGRLVLRA AVEKDSVNVV AVNDPFISID YMVYLFQYDS g3p1_drome .MSKIGINGF GRIGRLVLRA AIDKGA.SVV AVNDPFIDVN YMVYLFKFDS g3p2_caebr SKPTVGINGF GRIGRLVLRA AVEKDSVNVV AVNDPFISID YMVYLFQYDS g3p2_drome .MSKIGINGF GRIGRLVLRA AIDKGA.NVV AVNDPFIDVK YMVYLFKFDS g3p3_caeel TKPSVGINGF GRIGRLVLRA AVEKDSVNVV AVNDPFISID YMVYLFQYDS g3p_drohy .MSKIGINGF GRIGRLVLRA AVDKGA.SVV AVNDPFIDVN YMVYLFKFDS g3p2_caeel PKPNVGINGF GRIGRLVLRA AVEKDSVNVD AVNDPFISID YMVYLFQYDS g3p_schpo AIPKVGINGF GRIGRIVLRN ALVAKTIQVV AINDPFIDLE YMAYMFKYDS g3p_schma SRAKVGINGF GRIGRLVLRA AFLKNTVDVV SVNDPFIDLE YMVYMIKRDS g3p_bruma SKPKVGINGF GRIGRLVLRA AVEKDTVDVV AVNDPFINID YMVYMFKYDS g3p_lacdt .......... .......... .LLDPRVKVL AVSDPFIDLQ YMVYMFKYDS g3p_podan MTVKVGINGF GRIGRIVFRN AVEHPDVEIV AVNDPFIEPK YAEYMLKYDS g3p_ustma SQVNIGINGF GRIGRIVFRN SVVHNTANVV AINDPFIDLE YMVYMLKYDS g3p_pharh MAVKVGINGF GRIGRIVLRN AIIHGDIDVV AINDPFIDLE YMVYMFKYDS g3p1_caeel SKANVGINGF GRIGRLVLRA AVEKDTVQVV AVNDPFITID YMVYLFKYDS g3p_aspng MAPKVGINGF GRIGRIVFRN AINHGEVDVV AVNDPFIETH YAAYMLKYDS g3p4_caeel SKANVGINGF GRIGRLVLRA AVEKDTVQVV AVNDPFITID YMVYLFKYDS g3p_boled .......... .......... .LENPEINIT AVNDPFIDLD YMVYMFKYDS g3p_coche MVVKVGINGF GRIGRIVFRN AIEHNDVDIV AVNDPFIEPH YAAYMLKYDS g3pc_ginbi GKIKIGINGF GRIGRLVARV ALLRDDIELV AVNDPFISTD YMTYMFKYDS g3p_amamu .......... .......... 
..LETDLDVV AINDPFIDLA YMVYMFKYDS g3pc_phypa AKIKVGINGF GRIGRLVARV ALERDDIELV AINDPFITPE YMTYMFKYDS g3p_curlu MVVKVGINGF GRIGRIVFRN AIEHNDVEIV AVNDPFIEPH YAAYMLKYDS g3pc_ranac GKIKIGINGF GRIGRLVARV ALARDDVELV AVNDPFITTD YMTYMFKYDT g3px_horvu GKIKIGINGF GRIGRLVARV ALQSDDVELV AVNDPFITTE YMTYMFKYDT g3p_schco MAVKVGINGF GRIGRIVLRN ALQLGNIEVV AINDPFIALD YMVYMFKYDT g3p_crypa MVVKVGINGF GRIGRIVFRN AHEHSDVEIV AVNDPFIEPH YAAYMLKYDS g3pc_maize GKIKIGINGF GRIGRLVARV ALQSEDVELV AVNDPFITTD YMTYMFKYDT g3p_phach MPVKAGINGF GRIGRIVLRN ALLHGDIDVV AVNDPFIDLE YMVYMFKYDS g3p_lyosh .MVNVGINGF GRIGRIVFRN ALLNPKIQVV AINDPFINLE YMVYMFKYDS g3pt_mouse RELTVGINGF GRIGRLVLRV CMEKG.IRVV AVNDPFIDPE YMVYMFKYDS g3p_neucr MVVKVGINGF GRIGRIVFRN AIEHDDIHIV AVNDPFIEPK YAAYMLRYDT g3p_emeni MAPKVGINGF GRIGRIVFRN AIEAGTVDVV AVNDPFIETH YAAYMLKYDS g3p_colgl APIKVGINGF GRIGRIVFRN AIEHPEVEIV AVNDPFIETK YAAYMLKYDS g3pc_sinal KKIKIGINGF GRIGRLVARV ILQRNDVELV AVNDPFITTE YMTYMFKYDS g3pc_pethy AKIKIGINGF GRIGRLVARV ALQRDDVELV AVNDPFISVE YMTYMFKYDS g3pc_pinsy GKIKIGINGF GRIGRLVARV ALTRDDIELV GVNDPFISTD YMSYMFKYDS g3p_serma .......... .....IVFRA AQERSDIEIV AIND.LLDAE YMAYMLKYDS g3p_canal MAIKIGINGF GRIGRLVLRV ALGRKDIEVV AVNDPFIAPD YAAYMFKYDS g3pc_arath KKIRIGINGF GRIGRLVARV VLQRDDVELV AVNDPFITTE YMTYMFKYDS g3p1_triko .VPKVGINGF GRIGRVVLRN ALETGAVEVV ALNDPFIEPH YAEYMFKYDS g3p_atrnu AKVKIGINGF GRIGRLVARV ILQSDDCELV AINDPFITTD YMTYMFKYDS g3p_picpa MAITVGINGF GRIGRLVLRV ALSRADIKVV AINDPFIAPE YAAYMFKYDS g3p_colln APIKVGINGF GRIGRIVFRN AVEHPDVEIV AVNDPFIETK YAAYMLKYDS g3pc_tobac .......... GRIGRLVARV ALQRDDVELV AVNDPFISTD YMTYMFKYDS g3p_monan VVPKVGINGF GRIGRIVFRN AIEHEGVDIV AVNDPFIE.. ..AYMLKYDS g3p_escfe .......... 
.....IVFRA AQKRSDIEIV AIND.LLDAD YMAYMLKYDS g3pc_pea AKIKIGINGF GRIGRLVARV ALKRDDVELV AVNDPFITTD YMTYMFKYDS g3pc_mescr AKVKVGINGF GRIGRLVARV ILQRDDCELV AVNDPFISTD YMTYMFKYDS g3pc_grave TVPQVGINGF GRIGRLVLRA AIEKDTMSVV AINDPFIDLE YMAYMFKFDS g3p_erygr APIKVGINGF GRIGRIVFRN AAQSCEVEVV AVNDPFIEPE YAAYMLKYDS g3pc_diaca APIKIGINGF GRIGRLVARV ILQREDCELV AVNDPFITTE YMTYMFKYDS g3pc_horvu .......... .......... .......... .VNDPFITTD YMTYMFKYDT g3p_phyin ..MNVAINGF GRIGRLVLRA SAKNPLINIV AINDPFVSTT YMEYMLEYDT g3p1_ecoli .TIKVGINGF GRIGRIVFRA AQKRSDIEIV AIND.LLDAD YMAYMLKYDS g3pc_antma APIKIGINGF GRIGRLVARV ALQRDDVELV AVNDPFISTD YMTYMFKYDS g3p_clapu MAVKVGINGF GRIGRIVFRN AVEHPEIEVV AVNDPFIDPE YAAYMLKYDS g3pc_orysa GKIKIGINGF GRIGRLVARV ALQSEDVELV AVNDPFITTD YMTYMFKYDT g3pc_petcr MKMKIGINGF GRIGRLVARV ALMSDDIELV AVNDPFITTE YMTYMFKYDS g3pc_chocr TAPKVGINGF GRIGRLVLRA AIEKGTCQVV AINDPFIDLD YMAYMLKYDS g3p_serod .......... .....IVFRA AQERSDIEIV AIND.LLDAE YMAYMLKYDS g3pc_leime ..VKVGINGF GRIGRVVFRA AQMRPDIEIV GIND.LLDAE YMAYSLKYDS g3pc_taxba GKIKIGINGF GRIGRLVARV ALQRDDIELV AVNDPFISTE SLTSLFKYDS g3p2_triko APIKVGINGF GRIGRIVFRN AVEHPDIEVV AVNDPFIETT YAAYMLKYDS g3pc_magli KKIKIGINGF GRIGRLVARV ALQRDDVELV AVNDPFITTD YMTYMFKYDS g3p3_yeast ..VRVAINGF GRIGRLVMRI ALSRPNVEVV ALNDPFITND YAAYMFKYDS g3p2_agabi .MVKVGINGF GRIGRIVLRN ALQFQDIEVV AVNDPFIDLE YMAYMFKYDS g3pc_crapl AKVKIGINGF GRIGRLVARV ALVRDDVELV AVNDPFITVD YMAYMFKYDT g3p_klepn ......INGF GRIGRIVFRA AQKRSDIEIV AIND.LLDAE YMAYMLKYDS g3p2_yeast ..VRVAINGF GRIGRLVMRI ALQRKNVEVV ALNDPFISND YSAYMFKYDS g3pc_trybb .VIRVGINGF GRIGRVVFRA AQRRNDIEIV GIND.LLDAD YMAYMLKYDS g3p1_escvu .......... .....IVFRA AQKRSDIEIV AIND.LLDAE YMAYMLKYDS g3p_haein MAIKIGINGF GRIGRIVFRA AQHRDDIEVV GIND.LIDVE YMAYMLKYDS g3p_esche .......... 
.....IVFRA AQTRSDIEIV AIND.LLDAE YMAYMLKYDS g3p_klula .MVKVAINGF GRIGRLVLRI ALQRKALEVV AVNDPFISVD YAAYMFKYDS g3p_triha MSIKVGINGF GRIGRILLSN ALEKPELSVV AVNDPFIEPT YAAYMLKYDS g3p1_yeast ..IRIAINGF GRIGRLVLRL ALQRKDIEVV AVNDPFISND YAAYMVKYDS g3p2_kluma .MVRIAINGF GRIGRLVLRI ALSRKNIEVV AINDPFITVD YAAYMFKYDS g3p_citfr .......... .....IVFRA AQERSDIEIV AIND.LLDAD YMAYMLKYDS g3pc_chlre GKIKIGINGF GRIGRLVMRA TMLRPDIEVV AINDPFIDAE YMAYMFKYDS g3p_entae .......... .....IVFRA AQKRSDIEIV GIND.LLDAE YMAYMLKYDS g3p_escbl .......... .....IVFRA AQERSDIEIV AIND.LLDAE YMAYMLKYDS g3p_zygro .MVNVSVNGF GRIGRLVTRI AISRKDINLV AINDPFISTD YAAYMFKYDS g3p1_syny3 .MLKIGINGF GRIGRLVARI AMANPQVTLV GIND.LVPAS NLAYLFKYDS g3p1_salty .......... .....IVFRA AQKRSDIEIV AIND.LLDAE YMAYMLKYDS g3p1_agabi .MVNVGINGF GRIGRLVLRN ALQMQILTVV AVNDPFLDVE YMAYLFKYDS g3p1_anava AKLKVGINGF GRIGRLVLRA GINNPNIEFV GIND.LVPPD NLAYLLKYDS g3p_bacfr .......... ...GRMVFRA AVKNFGNDIQ IVgnDLLDAE YLAYMLKYDS g3p_bucap MTIKIGINGF GRIGRVLFRL AQERENIEVV AIND.LLDPK YIAYMLKYDS g3p1_giala MPIRLGINGF GRIGRMALRA SLNIDGVQVV AINDPFTDCE YMEYMLKYDT g3p_burso .......... .......... .......... .......... .......... g3pg_trybb .TIKVGINGF GRIGRMVFQA LCDDgeIDVV AVVDMNTDAR YFAYQMKYDS g3p_chltr LAMRIVINGF GRIGRLVLRQ ILKRNSPIEV VAINDLVAGD LLTYLFKYDS g3pg_trycr MPIKVGINGF GRIGRMVFQA LCEDgeIDVV AVVDMNTDAE YFAYQMRYDT g3p1_bacsu .AVKVGINGF GRIGRNVFRA ALNNPEVEVV AVND.LTDAN MLAHLLQYDS g3p_bacme .AVKIGINGF GRIGRNVFRA ALKNDNVEVV AIND.LTDAN MLAHLLKYDS g3p_bacst .AVKVGINGF GRIGRNVFRA ALKNPDIEVV AVNDLTANAD GLAHLLKYDS g3pg_leime APIKVGINGF GRIGRMVFQA ICDQgeIDVV AVVDMSTNAE YFAYQMKHDT g3p_borbu ..MKLAINGF GRIGRNVFKI AFERG.IDIV AIND.LTDPK TLAHLLKYDS g3p_borhe .......... .......... .......... ....RFTDPK TLAHLLKYDS g3p_theaq ..MKVGINGF GRIGRQVFRI LHSRGV..EV ALINDLTDNK TLAHLLKYDS g3pa_sinal .......... .......... .......... .......... .......... 
g3p_mycle MTVRVGINGF GRIGRNFYRA LLAQQEHGIA DVqnDITDNS TLAYLLKFDS g3p_myctu MTVRVGINGF GRIGRNFYRA LLAQQEqvEV VAANDITDNS TLAHLLKFDS g3pp_alceu MTIKVAINGY GRIGRNVLRA HYEGGklEIV AIND.LGNAA TNAHLTQYDT g3p_bacco .AVKVGINGF GRIGRNVFRA AVKNPDIEVV AVNDLTANAD GLAHLLKYDS g3pc_alceu MTIKVAINGY GRIGRNVLRA HYEGGkiEIV AIND.LGNAA TNAHLTQYDT g3p_thema ..ARVAINGF GRIGRLVYRI IYERKNPDIE VVanDLTDTK TLAHLLKYDS g3p_clopa .MTKVAINGF GRIGRLALRR ILEVPGLEVV AIND.LTDAK MLAHLFKYDS g3p_strpy .VVKVGINGF GRIGRLAFRR IQNIEGVEVT RIND.LTDPN MLAHLLKYDT g3pa_maize AKLKVAINGF GRIGRNFLRC WHGRGdlDVI AINDTG.GVK QASHLLKYDS g3pb_arath AKLKVAINGF GRIGRNFLRC WHGRKDslEV VVLNDSGGVK NASHLLKYDS g3pa_spiol .KLKVAINGF GRIGRNFLRC WHGRKdlDVV VINDTG.GVK QASHLLKYDS g3p_pseae MTIRLAINGF GRIGRNVLRA LYtrEQLQVV AIND.LGDAA VNAHLFQYDS g3pa_chlre KKIRVAINGF GRIGRNFLRC WHGRQnlDVV AINDSG.GVK QASHLLKYDS g3pb_pea AKLKVAINGF GRIGRNFLRC WHGRKDSPLE VIvnDSGGVK NASHLLKYDS g3p2_anava .MIRVAINGF GRIGRNFARC WLGRENSNIE LVanDTSDPR TNAHLLNYDS g3p2_syny3 ..TRVAINGF GRIGRNFLRC WLGrsQLEVV GINDTS.DPR TNAHLLRYDS g3p_zymmo MAVKVAINGF GRIGRLAARA ILSRPDSGLE LVtnDLGSVE GNAFLFKRDS g3p_mycpn KTIRVAINGF GRIGRLVFRA LLSQKNIEIV AVND.LTHPD TLAHLLKYDS g3p_streq .VVKVGINGF GRIGRLAFRR IQNVEGVEVT RIND.LTDPN MLAHLLKYDT g3pb_spiol AKLKVAINGF GRIGRNFLRC WHGRKDslDV VVVNDSGGVK SATHLLKYDS g3p_strae MTVRIGINGF GRIGRNVFRA AAARSSELEI VAVNDLGDVP TMAHLLAYDS g3pa_arath AKLKVAINGF GRIGRNFLRC WHGRKdlDII AINDTG.GVK QASHLLKYDS g3p_xanfl MSVKVAINGF GRIGRNVLRA IIESGRTDIE VVanDLGPVE TNAHLFRFDS g3pb_tobac AKLKVAINGF GRIGRNFLRC WHGRKDslDV VVVNDSGGVK NASHLLKYDS g3p_corgl MTIRVGINGF GRIGRNFFRA VLERNGDLEV VAVNDLTDNK TLSTLLKFDS g3p1_anasp .KLKVGINGF GRIGRLVLRA GINNPNIEFV GINDL..... .......... 
g3p2_rhosh MTIRVAINGF GRIGRNVLRA IVESGRTDIE VVanDLGQVE TNAHLLRFDS g3pa_grave MKVRVAINGF GRIGRNFIRC WAGrsNMDVV CINDTS.GVK TASHLLKYDS g3pa_tobac AKLKVAINGF GRIGRNFLRC WHGRKDSPLD VIanDTGGVK QASHLLKYDS g3p_halva EPVRVGLNGF GRIGRNVFRA SLHSDDVEIV GINDVM.DDS EIDYFAQYDS g3pa_pea KQLKVAINGF GRIGRNFLRC WHGRKDSPLD VIanDTGGVK QASHLLKYDS g3pa_chocr MKVRVAINGF GRIGRNFIRc aGRSDSNMEV VCINDTSGVK TASHLLKYDS g3p3_ecoli .MSKVGINGF GRIGRLVLGR LLEVKSNIDV VAINDLTSPL ILAYLLKHDS g3p_mycge RTIKVAINGF GRIGRLVFRS LLSKANVEVV AIND.LTQPE VLAHLLKYDS g3p2_bacsu MKVKVAINGF GRIGRMVFRK AMLDDQIQVV AINASYSAET .LAHLIKYDT g3p_strau .MTRIAINGF GRIGRNVLRA LLERDSDLDV VAVNDLTEPA TLARLLAYDT g3p_lacla MVVKVGINGF GRIGRLALRR IQEVEGVEVA HIND.LTDPA MLAHLLKYDT g3p_helpy ..MKIFINGF GRIGRCVLRA ILenPKLEVI GINDPA.NWE ILAYLLEHDS g3p3_anava MKIRVGINGF GRMGRLALRA AWGWPELEFV HINEIKGGAV AAAHLLKFDS e4pd_ecoli .TVRVAINGF GRIGRNVVRA LYESGRriTV VAINELADAA GMAHLLKYDT 51 100 predict_h263 THGKFHGTVK AEDGKLVIDG KAITIFQERD PENIKWGDAG TAYVVESTGV g3p1_human THGKFHGTVK AEDGKLVIDG KAITIFQERD PENIKWGDAG TAYVVESTGV g3p_canfa .......... .......... .......... .......... .......... g3p_pig THGKFHGTVK AENGKLVING KAITIFQERD PANIKWGDAG ATYVVESTGV g3p_cavpo .......... .......... .......... .......... .......... g3p_rabit THGKFHGTVK AENGKLVING KAITIFQERD PANIKWGDAG AEYVVESTGV g3p_bovin THGKFNGTVK AENGKLVING KAITIFQERD PANIKWGDAG AEYVVESTGV g3p2_human THGKFHGTVK AENGKLVING NPITIFQERD PSKIKWGDAG AEYVVESTGV g3p_crigr THGKFKGTVK AENGKLVING KAITIFQERD PANIKWGDAG AEYVVESTGV g3p_sheep .......... .......... 
...TIFQERD PANIKWGDAG AEYVVESTGV g3p_mouse THGKFNGTVK AENGKLVING KPITIFQERD PTNIKWGEAG AEYVVESTGV g3p_mesau THGKFKGTVK AENGKLVING KAITIFQERD PTNIKWGDAG AEYVVESTGV g3p_cotja THGHFKGTVK AENGKLVING NAITIFQERD PSNIKWGDAG AEYVVESTGV g3p_rat THGKFNGTVK AENGKLVING KPITIFQERD PVKIKWGDAG AEYVVESTGV g3p1_jacor THGKFKGTVK AENGKLVING HAITIFQERD PSKIKWGDAG AEYVVESTGV g3p_chick THGHFKGTVK AENGKLVING HAITIFQERD PSNIKWADAG AEYVVESTGV g3p_xenla THGRFKGTVK AENGKLIIND QVITVFQERD PSSIKWGDAG AVYVVESTGV g3p2_jacor .......... .......... .......... .......... .......... g3p_homam THGVFKGEVK MEDGALVVDG KKITVFNEMK PENIPWSKAG AEYIVESTGV g3p3_caebr THGRFKGTVA HEGDYLLVAK ekIKVYNSRD PAEIQWGSAG ADYVVESTGV g3p1_drome THGRFKGTVA AEGGFLVVNG QKITVFSERD PANINWASAG AEYVVESTGV g3p2_caebr THGRFKGTVK HEGDYLIVAN ekIKVYNSKD PAEIQWGAAG ADYVVESTGV g3p2_drome THGRFKGTVA AEGGFLVVNG QKITVFSERD PANINWASAG AEYIVESTGV g3p3_caeel THGRFKGTVA HEGDYLLVAK ekIKVYNSRD PAEIQWGASG ADYVVESTGV g3p_drohy THGRFKGTVS AEGGFLVVNG QKITVFSERD PANINWASAG AEYVVESTGV g3p2_caeel THGRFKGTVA HEGDYLLVAK ekIKVYNSRD PAEIQWGASG ADYVVESTGV g3p_schpo THGRFDGSVE IKDGKLVIDG NAIDVHNERD PADIKWSTSG ADYVIESTGV g3p_schma THGTFPGEVS TENGKLKVNG KLISVHCERD PANIPWDKDG AEYVVESTGV g3p_bruma THGRFKGSVS AEGGKLIVTN ghISVHNSKD PAEIPWGVDG AEYVVESTGV g3p_lacdt VHGRFKGTVE IKDGKLVIDG HPITVFQERD PANIQWGSVG ADYVVESSGV g3p_podan THGVFKGTIQ VSGSDLIVNG KTVKFYTERD PSAIPWKDTG AEYIVESTGV g3p_ustma THGVFNGDIS TKDGKLIVNG KSIAVFAEKD PSNIPWGQAG AHYVVESTGV g3p_pharh THGVFKGSVE IKDGKLVIEG KPIVVYGERD PANIQWGAAG ADYVVESTGV g3p1_caeel THGQFKGTVT YDGDFLIVQK dkIKVFNSKD PAAIAWGSVK ADFVVESTGV g3p_aspng THGQFKGTIE TYEEGLIVNG KKIRFFAERD PAAIPWGTTG ADYIVESTGV g3p4_caeel THGQFKGTVT YDGDFLIVQK dkIKVFNSKD PAAIAWGSVK ADFVVESTGV g3p_boled VHGRFEGEVS TKDGKLVING KAITVFAERD PANIPWGTVG AQYVVESTGV g3p_coche THGQFKGDIK VDGNNLTVNG KTIRFHMEKD PANIPWSETG AYYVVESTGV g3pc_ginbi VHGQWKKHEV kdSNTLLFGE KAVTVFGCRN PEEIPWGETG VEYVVESTGV g3p_amamu VHGRFSGSVE TKDGKLWINQ KPITVFRKRD PVQIPWGSAG AEYIVESTGV 
g3pc_phypa THGQWKKTEV tsEGHLTFGG NPVAVYACRD PSEIPWGKHG ADFVVESTGV g3p_curlu THGQFKGDIK VDGNNLTVNG KTVRFHMEKD PANIPWSETG AYYVVESTGV g3pc_ranac VHGQWKHHEL kdEKTLLFGE KPVTVFGCRN PEEIPWGETG AEFVVESTGV g3px_horvu VHGHWKhiKL KDDKTLLFGE KPVTVFGVRN PEEIPWGEAG ADYVVESTGV g3p_schco VHGRYKGTVE VKDGKLVVDG HAITVFAEKN PADIKWGSAG ADYIVESTGV g3p_crypa QHGNFKGDVT VEGSDLVVGG KKVRFYTERD PAAIPWSETG ADYIVESTGV g3pc_maize VHGHWKhiTL KDSKTLLFGD KPVTVFGIRN PEEIPWGEAG AEYVVESTGV g3p_phach VHGRFKGSVE AKDGKLYVEG KPIHVFAEKD PANIPWGSVG AEYIVESTGV g3p_lyosh VHGRFKGTVE AKDGKLWIQG KPVIVYGEKN PSDIKWGAAG RDYVVESTGV g3pt_mouse THGRYKGNVE HKNGQLVVDN LEINTYQCKD PKEIPWSSIG NPYVVECTGV g3p_neucr THGNFKGTIE VDGADLVVNG KKVKFYTDAD PAAIPWSETG ADYIVESTGV g3p_emeni QHGQFKGTIE TYDEGLIVNG KKIRFHTERD PANIPWGQDG AEYIVESTGV g3p_colgl THGIFNGEIK QEGNDLVING KTVKFYTERD PAAIPWKETG ADYVVESTGV g3pc_sinal VHGQWKhlKV KDEKTLLFGE KPVTVFGIRN PEDIPWGEAG ADFVVESTGV g3pc_pethy VHGQWKhlKA KDDKTLLFGE KPVTVFGIRN PEEIPWGEAG ADYVVESTGV g3pc_pinsy VHGKWKHHEV ndSKTLLFGE KSVAVFGCRN PEEIPWGEVG AEYVVESTGV g3p_serma THGRFNGTVE VKDGHLVVNG KTIRVTAEKD PANLKWNEVG VDVVAEATGI g3p_canal THGRYKGEVT ASGDDLVIDG HKIKVFQERD PANIPWGKSG VDYVIESTGV g3pc_arath VHGQWKHNEL kdEKTLLFGE KPVTVFGIRN PEDIPWAEAG ADYVVESTGV g3p1_triko THGRFKGDIK VDGKDLVIDG KRIKFYQERD PANIPWKDSG AEYIVESTGV g3p_atrnu VHGQWKHHEL kdEKTLLFGE RPVTVFGNRN PEEIPWGQTG ADYVVESTGV g3p_picpa THKAYKGEVS ASGNKINIDG KEITVFQERD PVNIPWGKAG VDYVIESTGV g3p_colln THGIFNGEIA QDGNDLVING KKVKFYTERD PAVIPWKETG ADYVVESTGV g3pc_tobac VHGQWKHHEL kdEKTLLFGE KSVRVFGIRN PEEIPWAEAG ADFVVESTGV g3p_monan THGRFNGAVE FDGNTLIVNG KKIKFYAERD PAQIPWSETG .QYVVESTGV g3p_escfe THGRFDGTVE VKDGHLIVNG KKIRVTAERD PANLKWDEVG VDVVAEATGI g3pc_pea VHGQWKnlTV KDSNTLLFGQ KPVTVFAHRN PEEIPWASTG ADIIVESTGV g3pc_mescr VHGQCKsiKL KDEKTLLFGE TPVAVFGCRN PEEIPWGQAG ADFVVESTGV g3pc_grave THGRYAGSVE TKDGKLIVNG KSITIYGHRD PAEIPWAEAG ADYVVESTGV g3p_erygr THGQFNGDIQ TVEDGLVVNS RNVKFYNKRN PEEIPWAETG AEYIVESTGV g3pc_diaca VHGQWKhiKV KDEKTLLFGE 
KAVTVFGNRN PEEIPWGGTG ADYVVESTGV g3pc_horvu VHGQWKHHEV kdSKTLLFGE KEVAVFGCRN PEEIPWAAAG AEYVVESTGV g3p_phyin VHGKFDGSLS HDETHIFVNG KPIRVFNEMN PENIKWGEEQ VQYVVESTGA g3p1_ecoli THGRFDGTVE VKDGHLIVNG KKIRVTAERD PANLKWDEVG VDVVAEATGL g3pc_antma VHGAWKHHEL kdEKTLLFGE KPVVVFGRRN PEEIPRASTG AEYIVESTGV g3p_clapu SHGVFKGEIK KDADGLIVNG KKVKFHTERD PSAIPWKASG AEYIVESTGV g3pc_orysa VHGQWKHSDI kdSKTLLLGE KPVTVFGIRN PDEIPWAEAG AEYVVESTGV g3pc_petcr VHGQWKklKV KDSKTLLFGD KPLTVFGVRN PEEDPWGEAG AEYVVESTGV g3pc_chocr THGRYAGDVS IKDGKLQVDG NSITVFAHRD PAEIPWATAA ADYIVEATGV g3p_serod THGRFNGSVE VKDGHLVVNG QTIRVTAEKD PANLKWNEVG VDVVAEATGI g3pc_leime THGRFDGTVE VIKGALVVNG KSIRVTSERD PANLKWDEIG VEVVVESTGL g3pc_taxba VHGQWKKHEV kdEKTLLFGE KHVAVFGCRN PEEIPWGEVG AEYVVESTGV g3p2_triko SHGLFKGEVE VDGKDLVVNG KKVRFYTERN PADIKWSETG AEYVVESTGV g3pc_magli VHGQWKHHEL kdSKTLLFGE KPVTVFGVRN PEEIPWGETG AEFVVESTGV g3p3_yeast THGRYAGEVS HDDKHIIVDG KKIATYQERD PANLPWGSSN VDIAIDSTGV g3p2_agabi VHGRFKGTVE VKNGSFVVDG RPMKVFAERD PAAIPWGSVG ADYVVESTGV g3pc_crapl VHGQYKHHEL kdEKTLLFGD KPVAVFGLRN PEEIPWAETG AEYVVESTGV g3p_klepn THGRFDGTVE VKDGHLVVNG KKIRVTAERD PANLKWDEVG VDVVAEATGI g3p2_yeast THGRYAGEVS HDDKHIIVDG HKIATFQERD PANLPWASLN IDIAIDSTGV g3pc_trybb THGRFEGAVE VQGGALVVNG KKIRVTSERD PANLKWNEIN VDVVVESTGL g3p1_escvu THGRFDGTVE VKDGHLVVNG KKIRVTAERD PANLKWDEVG VDVVAEATGI g3p_haein THGRFDGTVE VKDGNLVVNG KTIRVTAERD PANLNWGAIG VDIAVEATGL g3p_esche THGRFDGTVE VKDGHLVVNG KKIRVTAEKD PANLKWNEVG VDVVAEATGI g3p_klula THGRYKGEVT TSGNDLVIDG HKIAVFQEKD PANLPWGKLG VDIVIDSTGV g3p_triha SHGLFKGDIE VDGQNLVVNG KPIRFYSERD PANIKWSETG AEYIVESTGV g3p1_yeast THGRYKGTVS HDDKHIIIDG VKIATYQERD PANLPWGSLK IDVAVDSTGV g3p2_kluma THGRFDGEVS HDGKSLIIDG KKVLVFQERD PATLPWGAEK IDIAIDSTGI g3p_citfr THGRFNGTVE VKDGHLIVNG KKIRVTAERD PANLKWDEVG VDVVAEATGL g3pc_chlre VHKTWPGHVN GSKDGFLVEG RKIHTFTESD PSKINWGAAG ADIVIESTGV g3p_entae THGRFDGTVE VKDGHLVVNG KTIRVTAEKD PANLKWNEIG VDVVAEATGI g3p_escbl THGRFNGTVE VKDGHLIVNG KKIRVTAERD PANLKWNEAG 
VEVVAEATGL g3p_zygro THGRFDGEVS HDKDHIILNG KKVAVFNEKD PAALPWGKLG VDVAIDSTGI g3p1_syny3 THGSYGGTVV AKEEGIVIDD QFIPCFSQRN PAQLPWGDLG ADYVVESTGL g3p1_salty THGRFDGTVE VKDGHLIVNG KKIRVTAERD PANLKWDEVG VDVVADATVI g3p1_agabi VHGRYQGKVE TKDGKLIIDG HKIAAFAERE PANIKWADCG AEYIVESTGV g3p1_anava THGKLRSQVE AKDDGIVIDG HFIPCVSVRN PAELPWGKLG ADYVVESTGL g3p_bacfr VHGRFEGEVA VEDGALIVNG NKIRLTAEMD PANLKWNEVD ADVVVESTGF g3p_bucap THGNFKKDIE VKNNNLIING KKIRITSIKD PEKLMWDKLL IDVVIESTGL g3p1_giala VHGRFDGTIA HSEDSITVNG NKISVFKSMK PEEIPWGKTQ VDIVLECTGR g3p_burso .......... .......... .......... .......... .......... g3pg_trybb VHGKFKHSVs aKDDTLVVNG HRILCVkqRN PADLPWGKLG VEYVIESTGL g3p_chltr THGSFAPQAT FSDGCLVMGE RKIRFLAEKD VQKLPWKDLD VDVVVESTGL g3pg_trycr VHGKFKYEVt aKDDTLVVNG HRILCVkqRN PADLPWGKLG VEYVIESTGL g3p1_bacsu VHGKLDAEVS VDGNNLVVNG KTIEVSAERD PAKLSWGKQG VEIVVESTGF g3p_bacme VHGKLDAEVV VDGSNLVVNG KTIEISAERD PAQLSWGKQG VEIVVESTGF g3p_bacst VHGRLDAEVV VND.GVSVNG KEIIVKAERN PENLAWGEIG VDIVVESTGR g3pg_leime VHGRPKYTVE aaDVLVVNGH RIKCVKAQRN PADLPWGKLG VDYVIESTGL g3p_borbu TFGVYNKKVE SRDGAIVVDG REIKIIAERD PKNLPWAKLG IDVVIESTGV g3p_borhe TFGVYNKKVE SRDGAIVVDG REIKIIAERD PKNLPWGKLG XDVVIESTGV g3p_theaq IYHRFPGEVA YDDQYLYVDG KAIRATAVKD PKEIPWAEAG VGVVIESTGV g3pa_sinal .......... .......... .......... .......... .......... 
g3p_mycle ILGRLPHDVs eEDTIVVGSE KIKALAVREG PAALPWHAFG VDVVVESTGL g3p_myctu ILGRLPCDVG LEGDDTIVVG RAKilAVREG PAALPWGDLG VDVVVESTGL g3pp_alceu VHGRFPGEVS VDGDAFRVNG DRIRVLAQRN PAELPWGELG VDVVMECTGL g3p_bacco VHGRLDAEVV VND.GVSVNG KEIIVKAERN PENLAWGEIG VDIVVESTGR g3pc_alceu VHGRFPGEVS VDGDAFRVNG DRIRVLAQRN PAELPWGELG VDVVMECTGL g3p_thema VHKKFPGKVE YTENSLIVDG KEIKVFAEPD PSKLPWKDLG VDFVIESTGV g3p_clopa SQGRFNGEIE VKEGAFVVNG KEVKVFAEAD PEKLPWGELG IDVVLECTGF g3p_strpy TQGRFDGTVE VKEGGFEVNG NFIKVSAERD PENIDWATDG VEIVLEATGF g3pa_maize TLGIFDADVK pgDNAISVDG KVIKVVSDRN PSNLPWGELG IDLVIEGTGV g3pb_arath MLGTFKAEVK IVDntISVDG KLIKVVSNRD PLKLPWAELG IDIVIEGTGV g3pa_spiol ILGTFDADVk aGDSAISVDG KVIKVVSDRN PVNLPWGDMG IDLVIEGTGV g3p_pseae VHGHFPGEVE HDAESLRVMG DRIAVSAIRN PAELPWKSLG VDIVLECTGL g3pa_chlre TLGTFAADVk vDDSHISVDG KQIKIVSSRD PLQLPWKEMN IDLVIEGTGV g3pb_pea MLGTFKAEVK inNETITVDG KPIKVVSSRD PLKLPWAELG IDIVIEGTGV g3p2_anava SLGVKKVDIT ADDNSITVNG KTIKCVSDRN PENLPWKEWE IDLIIESTGV g3p2_syny3 MLGKLDADIS ADENSITVNG KTIKCVSDRN PLNLPWAEWN VDLVIEATGV g3p_zymmo AHGTYPGTVT TEGNDMVIDG KKIVVTAERD PANLPHKKLG VDIVMECTGI g3p_mycpn AHGEFKKKVV AKDNTLMIDK KKVLVFSEKD PANLPWAEHN IDIVVESTGR g3p_streq TQGRFDGTVE VKEGGFEVNG NFIKVSAERD PENIDWATDG VEIVLEATGF g3pb_spiol ILGTFKADVK IIDntFSIDG KPIKVVSNRD PLKLPWAELG IDIVIEGTGV g3p_strae ILGRFPEEVT AEPGAIRVGD RTIKVLAERD PGALPWGDLG VDIVIESTGI g3pa_arath TLGIFDADVK PSGeaISVDG KIIQVVSNRN PSLLPWKELG IDIVIEGTGV g3p_xanfl VHGRFPGEVK VAGDTIDVGR GPIKVTAVRN PAELPHKELG VDIALECTGI g3pb_tobac MLGTFKADVK IVDntISVDG KHIKVVSSRD PLKLPWAELG IDIVIEGTGV g3p_corgl IMGRLGQEVE YDDDSITVGG KRIAVYAERD PKNLDWAAHN VDIVIESTGF g3p1_anasp .......... .......... .......... .......... .......... 
g3p2_rhosh VHGRFPAKVT SGDDWIDVGR GPIKVTAIRN PAELPW..AG VDMAMECTGI g3pa_grave ILGTFDSDVV AGEDSITVDG KTIKVVSNRN PLELPWKEME IDIVVEATGV g3pa_tobac TLGIFDADVK PVgdGISVDG KVIQVVSDRN PVNLPWGDLG IDLVIEGTGV g3p_halva VMGELEgsVD DGVLTVDGTD FEAGIFHETD PTQLPWDDLD VDVAFEATGI g3pa_pea TLGIFDADVK PVgdGISVDG KVIKVVSDRN PANLPWKELG IDLVIEGTGV g3pa_chocr ILGTFDADVS AGEDTISVNG KTIKIVSNRN PLQLPWKEMN IDIVVEATGV g3p3_ecoli NYGPFPWSVD FTEDSLIVDG KSIAVYAEKE AKNIPWKAKG AEIIVECTGF g3p_mycge AHGELKRKIT VKQNILQIDR KKVYVFSEKD PQNLPWDEHD IDVVIESTGR g3p2_bacsu IHGRYDKEVV AGEDSLIVNG KKVLLLNSRD PKQLPWREYD IDIVVEATGK g3p_strau TSGRLGRPVT VEGNVLVVDG RRITVTAERE PANLPWAELG VDIVLEATGR g3p_lacla TQGRFKGTVE VKEDGFDVNG KFVKVTAERN PEDIQWADSG VEIVLEATGF g3p_helpy VHGLLPKEVR YSNYKLIIGS LEIPVF.... ..NSIKDLKG VDVIIECSGK g3p3_anava VHGRWTPEVE AEGERVLIDS TPLSFSEYGK PEDVPWEDFG VDLVLECSGK e4pd_ecoli SHGRFAWEVR QERDQLFVGD DAIRVLHERS LQSLPWRELG VDVVLDCTGV 101 150 predict_h263 FTTMEKAGAH LKGGAKRIVI SAPSADAPMF VMGVNHFKYA NSLKIISNAS g3p1_human FTTMEKAGAH LKGGAKRIVI SAPSADAPMF VMGVNHFKYA NSLKIISNAS g3p_canfa .......... .......... .APSADAPMF VMGVNHEKYD NSLKIVSNAS g3p_pig FTTMEKAGAH LKGGAKRVII SAPSADAPMF VMGVNHEKYD NSLKIVSNAS g3p_cavpo .......... .......... .......... .......... .......... g3p_rabit FTTMEKAGAH LKGGAKRVII SAPSXDAPMF VMGVNHEKYD NSLKIVSNAS g3p_bovin FTTMEKAGAH LKGGAKRVII SAPSADAPMF VMGVNHEKYN NTLKIVSNAS g3p2_human FTTMEKAGAH LQGGAKRVII SAPSADAPMF VMGVNHEKYD NSLKIISNAS g3p_crigr FTTMEKAGAH LKGGAKRVII SAPSADAPMF VMGVNQDKYD NSLKIVSNAS g3p_sheep FTTMEKAGAH LKGGAKIIII SAPSLDAPMF VMGVNHEKY. .......... 
g3p_mouse FTTMEKAGAH LKGGAKRVII SAPSADAPMF VMGVNHEKYD NSLKIVSNAS g3p_mesau FTTMEKAGAH LKGGAKRVII SAPSADAPMF VMGVNHDKYD NSLKIVSNAS g3p_cotja FTTMEKAGAH LKGGAKRVII SAPSADAPMF VMGVNHEKYD KSLKIVSNAS g3p_rat FTTMEKAGAH LKGGAKRVII SAPSADAPMF VMGVNHEKYD NSLKIVSNAS g3p1_jacor FTTMEKAGAH LKGGAKRVII SAPSRDAPMF VMGVNHEKYD KSLKIVSNAS g3p_chick FTTMEKAGAH LKGGAKRVII SAPSADAPMF VMGVNHEKYD KSLKIVSNAS g3p_xenla FTTTEKASLH LKGGAKRVVI SAPSADAPMF VVGVNHEKYE NSLKVVSNAS g3p2_jacor .......... .......... .......... .......... .......... g3p_homam FTTIEKASAH FKGGAKKVVI SAPSADAPMF VCGVNLEKYS KDMTVVSNAS g3p3_caebr FTTIEKANAH LKGGAKKVII SAPSADAPMF VVGVNHEKYD HAnhIISNAS g3p1_drome FTTIDKASTH LKGGAKKVII SAPSADAPMF VCGVNLDAYS PDMKVVSNAS g3p2_caebr FTTIEKANAH LKGGAKKVII SAPSADAPMF VVGVNHEKYD HAnhIISNAS g3p2_drome FTTIDKASTH LKGGAKKVII SAPSADAPMF VCGVNLDAYK PDMKVVSNAS g3p3_caeel FTTIEKANAH LKGGAKKVII SAPSADAPMF VVGVNHEKYD HAnhIISNAS g3p_drohy FTTTEKASTH LKGGAKKVVI SAPSADAPMF VCGVNLDAYS PDMKVVSNAS g3p2_caeel FTTIEKANAH LKGGAKKVII SAPSADAPMF VVGVNHEKYD HAnhIISNAS g3p_schpo FTTQETASAH LKGGAKRVII SAPSKDAPMY VVGVNEEKFN PSEKVISNAS g3p_schma FTTIDKAQAH IknRAKKVII SAPSADAPMF VVGVNENSYE KSMSVVSNAS g3p_bruma FTTTDKASAH LKGGAKKVII SAPSADAPMF VMGVNNDMYD KAnhIISNAS g3p_lacdt FTTVDKASAH LKGGAKKVII SAPSADAPMF VVGVNLDAYD SKYTVISNAS g3p_podan FTTTEKASAH LKGGAKRVII SAPSADAPMY VMGVNEKTYD GKAAVISNAS g3p_ustma FTTIDKASAH IKGGAKKVVI SAPSADAPMY VCGVNLDAYD PKAQVVSNAS g3p_pharh FTTQEKAELH LKGGAKKVVI SAPSADAPMF VCGVNLDKYD PKYTVVSNAS g3p1_caeel FTTKEKASAH LQGGAKKVII SAPSADAPMY VVGVNHEKYD ASnhVVSNAS g3p_aspng FTTQEKAAAH LKGGAKKVVI SAPSADAPMF VMGVNNTSYT KDINVLSNAS g3p4_caeel FTTKEKASAH LQGGAKKVII SAPSADAPMY VVGVNHEKYD ASnhVISNAS g3p_boled FTTIEKASAH LKGGAKKVII SAPSSDAPMF VCGVNLDAYD PKHTVISNAS g3p_coche FTTTEKAKAH LKGGAKKVVI SAPSPDAPMF VMGVNHETYK PDIEALSNAS g3pc_ginbi FTDKEKAAAH IKGGAKKVVI TAPSKDAPMF VVGVNEHEYK PDLAIVSNAS g3p_amamu FTTTEKASAH LKGGAKKIVI SAPSADAPMF VCGVNLDKYD PKFQVVSNAS g3pc_phypa FTDKDKAAAH LKGGAKKVVI 
SAPSKDAPMF VMGVNENKYS DE.DIVSNAS g3p_curlu FTTTEKAKAH LKGGAKKVVI SAPSADAPMF VMGVNHETYK SDIEVLSNSS g3pc_ranac FTTKEKASAH LKGGAKKVVI SAPSADAPMF VVGVNHTEYK SDIDIVSNAS g3px_horvu FTDKDKAAAH LKGGAKKVVI SAPSKDAPMF VVGVNEDKYT SDVNIVSNAS g3p_schco FTTVEKASLH LQGGAKKVVI SAPSADAPMF VVGVNLDKYD SKYQVISNAS g3p_crypa FTTTEKAKAH LKGGAKKVII SAPSADAPMY VMGVNEKTYD GSGMVISNAS g3pc_maize FTDKDKAAAH LKGGAKKVVI SAPSKDAPMF VVGVNEDKYT SDVNIVSNAS g3p_phach FTTTEKASAH LKGVCKKVII SAPSADAPMF VCGVNLDAYD SKYKVISNAS g3p_lyosh FTTVEKAEGH LKGGAKKVII SAPSADAPMF VMGCNLDQYD PKYTVISNAS g3pt_mouse YLSIEAASAH ISSGARRVVV TAPSPDAPMF VMGVNEKDYN psMTIVSNAS g3p_neucr FTTTEKASAH LKGGAKKVII SAPSADAPMY VMGVNNETYD GSADVISNAS g3p_emeni FTTQEKASAH LKGGAKKVVI SAPSADAPMF VMGVNNETYK KDIQVLSNAS g3p_colgl FTTTDKAKAH LQGGAKKVII SAPSADAPMY VMVVNEKSYD GSADVISNAS g3pc_sinal FTDKDKAAAH LKGGAKKVVI SAPSKDAPMF VVGVNEHEYK SDLNIVSNAS g3pc_pethy FTDKDKAAAH LKGGAKKVII SAPSKDAPMF VVGVNEKEYK SDLTVVSNAS g3pc_pinsy FTDKEKASAH LKAGAKKVVI SAPSKDAPMF VVGVNEHQYK SDVNIVSNAS g3p_serma FLTDETARKH ITAGAKKVVL TGPSKDapMF VRGANFDKYA GQ.DIVSNAS g3p_canal FTKVEGAQKH IDAGAKKVII TAPSADAPMF VVGVNEDKYT PDLKIISNAS g3pc_arath FTDKDKAAAH LKGGAKKVVI SEPSKDAPMF VVGVNEHEYK SDLDIVSNAS g3p1_triko FTTTEKASAH FKGGAKKVII SAPSADAPMY VMGVNEDTYA GA.NVISNAS g3p_atrnu FTDKDKAAAH LKGGAKKVII SAPSKDAPMF VVGVNEHEYK SELNIVSNAS g3p_picpa FTTLEGAQKH IDAGAKKVVI TAPSKDAPMF VVGVNEEKYT SDLNIVSNAS g3p_colln FTTIDKAKAH LQGGAKKVII SAPSADAPMY VMGVNEKSYD GS.AVISQAS g3pc_tobac FTDKDKAAAH LKGGAKKVVI SAPSKDAPMF VVGVNEKEYK PEYDIVSNAS g3p_monan FTKQEKASLH LRGCAKKVII SAPSSDSPMF VMGVNNDQYT KDITVLSNAS g3p_escfe FLTDETARKH ITAGAKKVVL TGPSKdtPMF VKGANFDKYA GQ.DIVSNAS g3pc_pea FTDKDKAAAH LKGGAKKVII SAPSKDAPMF VVGVNENEYK PEFDIISNAS g3pc_mescr FTDKDKAAAH LKGGAKKVVI SAPSKDAPMF VVGVNEHEYK SDLNIVSNAS g3pc_grave FTLKEKAEKH FTGGAKKVII SAPSKDAPMF VCGVNEDKYT PDLNVISNAS g3p_erygr FTTTEKAKAH LKGGAKKVII SAPSADAPMF VMGVNHKSYT TDLEVISNAS g3pc_diaca FTDKDKAAAH LKGGAKKVII SAPSKDAPMF VVGVNEHEYK PELDIVSNAS 
g3pc_horvu FTDKDKAAAH IKGGAKKVII SAPSKDAPMF VCGVNEKEYK SDIDIVSNAS g3p_phyin FTTLEKASTH LKNGVEKVVI SAPSSDAPMF VMGVNHELYE KNMHVVSNAS g3p1_ecoli FLTDETARKH ITAGAKKVVM TGPSKdtPMF VKGANFDKYA GQ.DIVSNAS g3pc_antma FTDKDKAAAH LKGGAKKVII SAPSKDAPMF VVGVNEKEYK SDLHIVSNAS g3p_clapu FTTTEKAKAH LTGGAKKVII SAPSADAPMY VMGVNEKTYD GKADVISNAS g3pc_orysa FTDKEKASSH LKGGAKKVFI SAPSKDAPMF VCGVNEDKYT SDIDIVSNAS g3pc_petcr FTDKDKAAAH LKGGAKKVVI SAPSGNAPMF VVGVNEKEYK KDIDIVSNAS g3pc_chocr FTLKDKAAAH FKGGAKKVVI SAPSKDAPMF VCGVNEAKYT PDLDIISNAS g3p_serod FLTDETARKH IAAGAKKVVL TGPSKdtPMF VMGVNLKSYA GQ.EIVSNAS g3pc_leime FLTQETAHKH IEAGARRVVM TGPPKdtPMF VMGVNHTTYK GQ.PIISNAS g3pc_taxba FTDKDKAAAH LKGGAKKVVI SAPSKDAPMF VVGVNEHEYK SDLTIVSNAS g3p2_triko FTTTEKAKAH LVGGAKKVII SAPSADAPMY VMGVNESDYD GSADVISNAS g3pc_magli FTDKDKAAAH LKGGAKKVII SAPSKDAPMF VVGVNEHEYK SNIDIVSNAS g3p3_yeast FKELDTAQKH IDAGAKKVVI TAPSSTAPMF VMGVNEEKYT SDLKIVSNAS g3p2_agabi FTTIDKASAH LKGGAKKVVI SAPSADAPMY VCGVNLDKYN PKDTIISNAS g3pc_crapl FTEKEKAAAH LKGGAKKVII SAPSKDAPMF VVGVNEKTYT PDIDVVSNAS g3p_klepn FLTDETARKH ITAGAKKVVL TGPSKDNTPM FVRGANFDAY AGQDIVSNAS g3p2_yeast FKELDTAQKH IDAGAKKVVI TAPSSTAPMF VMGVNEEKYT SDLKIVSNAS g3pc_trybb FLSDDTARKH IQAGAKKVVI TGPSKdtPMF VMGVNHTTYK GE.AIVSNAS g3p1_escvu FLTDETARKH ITAGAKKVVL TGPSKDNTPM FVRGANFDTY AGQDIVSNAS g3p_haein FLTDETARKH ITAGAKKVVL TGPSKDapMF VRGVNFNAYA GQ.DIVSNAS g3p_esche FLTDETARKH ITAGAKKVVL TGPSKDDTPM FVKGANFDKY NGQDIVSNAS g3p_klula FKELDSAQKH LDAGAKKVVI TAPSKTAPMF VVGVNEDKYN GE.TIVSNAS g3p_triha FTTIDKAKAH LNGGAKKVII SAPSADAPMF VMGVNHKDYD GTPTVLSNAS g3p1_yeast FKELDTAQKH IDAGAKKVVI TAPSSSAPMF VVGVNHTKYT PDKKIVSNAS g3p2_kluma FKELDSAQKH IDAGAKKVVI TAPSSTAPMF VVGVNEDKYA GQ.TIVSNAS g3p_citfr FLTDETARKH ITAGAKKVVL TGPSKDNTPM FVKGANFDKY EGQDIVSNAS g3pc_chlre FTDIAKATAH LTGGAKKVII TAPSNDAPMY VMGVNHEKYN PAthIISNAS g3p_entae FLTDETARKH ITAGAKKVVL TGPSKDNTPM FVRGANFETY AGQDIVSNRS g3p_escbl FLTDETARKH ITAGAKKVVM TGPSKDSTPM FVRGANFDTY AGQDIVSNAS g3p_zygro FKEMDSANKH 
IEAGAKKVVI TAPSGSAPMY VMGVNEETYT PDQKIVSNAS g3p1_syny3 FTTYATAENH LKAGAKRVII SAPSKDppTF VVGVNHLNYN ADtkIVSNAS g3p1_salty FLTDETARKH ITAGAKKVVL TGPSKDNTPM FVKGANFDKY EGQDIVSNAS g3p1_agabi FKTEELAKEH LKGGAKKVVI TAPGSGVPTY VVGVNLDKYD PKEVVISNAS g3p1_anava FTDSEGASKH LQAGAKRVII SAPTKDprTL LVGVNHDLFD PSkvIVSNAS g3p_bacfr FLTNETARKH IQAGAKKVIM SAPSKDspMF VYGVNHTSYA GQ.DIISNAS g3p_bucap FLTKDTAYKH ILSGAKKVVI TGPSKDdpMF VRGANFDKYK GE.KIVSNAS g3p1_giala FTTKKDAELH ITGGCKRVII SAPSADAPMF VCGCNLETYD PsmKVISNAS g3p_burso .......... .......... .......... .......... .......... g3pg_trybb FTVKSAAEGH LRGGARKVVI SAPASgaKTF VMGVNHNNYN PrqHVVSNAS g3p_chltr FVNRDDAAKH LDSGAKRVLI TAPAkdVPTF VMGVNHQQFD PADVIISNAS g3pg_trycr FTAKAAAEGH LRGGARKVVI SAPASgaKTL VMGVNHHEYN PshHVVSNAS g3p1_bacsu FTKRADAAKH LEAGAKKVII SAPANEEdtI VMGVNEDKYD anHDVISNAS g3p_bacme FTKRADAAKH LEAGAKKVII SAPASDEdtI VMGVNEDKYD anHNVISNAP g3p_bacst FTKREDAAKH LEAGAKKVII SAPAkeNITV VMGVNQDKYD PKahVISNAS g3pg_leime FTDKLKAEGH IKGGAKKVVI SAPASgaKTI VMGVNQHEYS psHHVVSNAS g3p_borbu FSSATSDkvN HAGAKKVILT VPAKDEIKTI VLGVNDHDIN SDLKAVSNAS g3p_borhe FSSATSDkvE AGAGAKKVIL TAPAKDekTI XLGVXDHDIN XDLKAVSNAS g3p_theaq FTDADKAKAH LEGGAKKVII TAPAKGEdtI VMGVNHEAYD PSrhIISNAS g3pa_sinal ..DREGAGKH IQAGAKKVLI TAPGkdIPTY VVGVNAELYS HEDTIISNAS g3p_mycle FTNAAKAKGH LEAGAKKVIV SAPATDPdtI VFGVNDDKYD GSQNIISNAS g3p_myctu FTNAAKAKGH LDAGAKKVII SAPATDEdtI VLGVNDDKYD GSQNIISNAS g3pp_alceu FTSKEKASAH LKGGAKKVII SAPGGkdATI VYGVNHGVLK ATDTVISNAS g3p_bacco FTKREDAAKH LEAGAKKVII SAPAkeNITV VMGVNQDKYD ADahVISNAS g3pc_alceu FTSKEKASAH LRGGAKKVII SAPGGkdATI VYGVNHGVLK ATDTVISNAS g3p_thema FRNREKAELH LQAGAKKVII TAPAKGEdtV VIGCNEDQLK PEHTIISCAS g3p_clopa FTKKEKAEAH VRAGAKKVVI SAPAgdLKTI VFNVNNEDLD GTETVISGAS g3p_strpy FAKKEAAEKH LhnGAKKVVI TAPGgdVKTV VFNTNHDILD GTETVISGAS g3pa_maize FVDREGAGKH IQAGAKKVLI TAPGkdIPTY VVGVNADQYN PDEPIISNAS g3pb_arath FVDGPGAGKH IQAGASKVII TAPaaDIPTY VMGVNEQDYG HDvnIISNAS g3pa_spiol FVDRDGAGKH LQAGAKKVLI TAPGkdIPTY VVGVNEEGYT 
HADTIISNAS g3p_pseae FTSRDKAAAH LQAGAGKVLI SAPGKdeATV VYGVNHRGVV rsHRIVSNAS g3pa_chlre FIDKVGAGKH IQAGASKVLI TAPAKDkpTF VVGVNEGDYK HEYPIISNAS g3pb_pea FVDGPGAGKH IQAGAKKVII TAPaaDIPTY VIGVNEQDYG HEvdIISNAS g3p2_anava FTSKEGALKH VNVGAKKVLI TAPGKNegTF VIGVNHHDYD HNvhIISNAS g3p2_syny3 FVTHEGATKH VQAGAKKVLI TAPgpNIGTY VVGVNAHEYK HeyEVISNAS g3p_zymmo FTNTEKASAH LTAGAKKVLI SAPAKgdRTV VYGVNHKDLT ADDKIVSNAS g3p_mycpn FVSEEGASLH LQAGAKRVII SAPAKQkkTV VYNVNHKIIN AEDKIISAAS g3p_streq FAKKEAAEKP LhnGAKKVVI TAPGgdVKQL FSTLTTSILD GTETVISGAS g3pb_spiol FVDGPGAGKH IQAGAKKVII TAPasDIPTY VVGVNEKDYG HDvnIISNAS g3p_strae FTDAAKARSH VDGGAKKVII AAPASGEdtV VLGVNDGDYD PErtIISNAS g3pa_arath FVDREGAGKH MEAGAKKVII TAPGkdIPTY VVGVNADAYS HDEPIISNAS g3p_xanfl FTSRDKAAAH LAAGAKRVLV SAPADGAdtV VYGVNHEKLT NEHLVVSNAS g3pb_tobac FVDGPGAGKH IQAGAKKVII TAPaaDIPTY VVGVNEQDYS HEvdIISNAS g3p_corgl FTDANAAKAH IEAGAKKVII SAPASNeaTF VYGVNHESYd eNHNVISGAS g3p1_anasp .......... .......... .......... .......... .......... g3p2_rhosh FTTKEKAAAH LQNGAKRVLV SAPCDGarTI VYGVNHATLT ADDLVVSNAS g3pa_grave FVDAVGAGKH IQAGAKKVLI TAPGKGegTF VVGVNDHLYS HdfDIVSNAS g3pa_tobac FVDREGAGKH IQAGAKKVLI TAPGkdIPTY VVGVNADLYN PDEPIISNAS g3p_halva FRTKEDASQH LDAGADKVLI SAPPKGDeqL VYGVNHDEYD GE.DVVSNAS g3pa_pea FVDREGAGRH ITAGAKKVLI TAPGkdIPTY VVGVNADAYT HADDIISNAS g3pa_chocr FVDAPGAGKH IEAGAKKVLI TAPGKGdgTF VVGVNEKDYS HdyDIVSNAS g3p3_ecoli YTSPEKSQAH LDAGAKKVLI SAPAGEMKTI VYNVNDDTLD GNDTIVSVAS g3p_mycge FVSEEGASLH LKAGAKRVII SAPAKEkrTV VYNVNHKTIS SDDKIISAAS g3p2_bacsu FNAKDKAMGH IEAGAKKVIL TAPGKNEdtI VMGVNEDQFD AERhiISNAS g3p_strau FTSAKAARAH LDAGAKKVLV SAPADGAdtL AFGVNTDAYD PDltIVSNAS g3p_lacla FATKEKAEKH LhgGAKKVLI TAPGgdVKTV VFNTNHTILD GSETVISAGS g3p_helpy FLEPKTLENY LLLGAKKVLL SAPFMGepTL VYGVNHFCYQ NQ.AIVSNAS g3p3_anava FRTPATLDPY FKRGVQKVIV AAPVKeaLNI VMGVNDYLYE PEkhLLTAAS e4pd_ecoli YGSREHGEAH IAAGAKKVLF SHPGSndATV VYGVNQDQLR AEHRIVSNAS 151 200 predict_h263 CTTNCLAPLA KVIHDHFGIV EGLMTTVHAI TATQKTVDSP SGKLWRGGRG g3p1_human 
CTTNCLAPLA KVIHDHFGIV EGLMTTVHAI TATQKTVDSP SGKLWRGGRG g3p_canfa CTTNCLAPLA KVIHDHFGIV EGLMTTVHAI TATQ...... .......... g3p_pig CTTNCLAPLA KVIHDHFGIV EGLMTTVHAI TATQKTVDGP SGKLWRDGRG g3p_cavpo .......... .......... .......... .......... .......... g3p_rabit CTTNCLAPLA KVIHDHFGIV EGLMTTVHAI TATQKTVDGP SGKLWRDGRG g3p_bovin CTTNCLAPLA KVIHDHFGIV EGLMTTVHAI TATQKTVDGP SGKLWRDGRG g3p2_human CTTNCLAPLA KVIHDNFGIV EGLMTTVHAI TATQKTVDGP SGKLWRDGRG g3p_crigr CTTNCLAPLA KVIHDNFGIV EGLMTTVHAI TATQKTVDGP SGKLWRDGRG g3p_sheep CTTNCLAPLA KVIHDHFGIV EGLMTTVHAT TATQKTVDGP SGKLWRDGRG g3p_mouse CTTNCLAPLA KVIHDNFGIV EGLMTTVHAI TATQKTVDGP SGKLWRDGRG g3p_mesau CTTTCLAPLA KVIHDNFGIV KGLMTTVHAI TATQKTVDGP SGKLWRDGRG g3p_cotja CTTNCLAPLA KVIHDNFGIV EGLMTTVHAI TATQKTVDGP SGKLWRDGRG g3p_rat CTTNCLAPLA KVIHDNFGIV EGLMTTVHAI TATQKTVDGP SGKLWRDGRG g3p1_jacor CTTNCLAPLA KVIHDNFGIV EGLMTTVHAI TATQKTVDGP SAKLWRDGAG g3p_chick CTTNCLAPLA KVIHDNFGIV EGLMTTVHAI TATQKTVDGP SGKLWRDGRG g3p_xenla CTTNCLAPLA KVINDNFGIV EGLMTTVHAF TATQKTVDGP SGKLWRDGRG g3p2_jacor .......... .......... .......... .......... .......... 
g3p_homam CTTNCLAPVA KVLHENFEIV EGLMTTVHAV TATQKTVDGP SAKDWRGGRG g3p3_caebr CTTNCLAPLA KVINDNFGII EGLMTTVHAV TATQKTVDGP SGKLWRDGRG g3p1_drome CTTNCLAPLA KVINDNFEIV EGLMTTVHAT TATQKTVDGP SGKLWRDGRG g3p2_caebr CTTNCLAPLA KVINDNFGII EGLMTTVHAV TATQKTVDGP SGKLWRDGRG g3p2_drome CTTNCLAPLA KVINDNFEIV EGLMTTVHAT TATQKTVDGP SGKLWRDGRG g3p3_caeel CTTNCLAPLA KVINDNFGII EGLMTTVHAV TATQKTVDGP SGKLWRDGRG g3p_drohy CTTNCLAPLA KVINDNFEIV EVLMTTVHAT TATQKTVDGP SGKLWRDGRG g3p2_caeel CTTNCLAPLA KVINDNFGII EGLMTTVHAV TATQKTVDGP SGKLWRDGRG g3p_schpo CTTNCLAPLA KVINDTFGIE EGLMTTVHAT TATQKTVDGP SKKDWRGGRG g3p_schma CTTNCLAPLA KVIHDKFEIV EGLMTTVHSF TATQKVVDGP SSKLWRDGRG g3p_bruma CTTNCLAPLA KVIHDKFGII EGLMTTVHAT TATQKTVDGP SGKLWRDGRG g3p_lacdt CTTNCLAPLA KVINDKFGIV EGLMSTIHAT TATQKTVDGP SNKDWRGGRA g3p_podan CTTNCLAPLA KVVNDKFGIV EGLMTTVHSY TATQKTVDGP SAKDWRGGRG g3p_ustma CTTNCLAPLA KVIHDKFGIV EGLMTTVHAT TATQKTVDGP SAKDWRGGRA g3p_pharh CTTNCLAPLG KVIHDNYTIV EGLMTTVHAT TATQKTVDGP SNKDWRGGRG g3p1_caeel CTTNCLAPLA KVINDNFGII EGLMTTVHAV TATQKTVDGP SGKLWRDGRG g3p_aspng CTTNCLAPLA KVINDKFGIV EGLMTTVHSY TATQKVVDAP SSKDWRGGRT g3p4_caeel CTTNCLAPLA KVINDNFGII EGLMTTVHAV TATQKTVDGP SGKLWRDGRG g3p_boled CTTNCLAPLA KVVNDKFGIV EGLMTTVHAT TATQKTVDGP SPKDWRGGRA g3p_coche CTTNCLAPLA KVIHDKYTII EGLMTTIHSY TATQKVVDGP SAKDWRGGRT g3pc_ginbi CTTNCLAPLA KVINDRFGIV EGLMTTVHSI TATQKTVDGP SNKDWRGGRG g3p_amamu CTTNCLAPLA KVVNDKFGIV EGLMTTVHAT TATQKTVDGP SAKDWRGGRS g3pc_phypa CTTNCLAPLA KVINDKFGIL EGLMTTVHAT TATQKTVDGP SSKDWRGGRS g3p_curlu CTTNCLAPLA KVIHDKYTII EGLMTTIHSY TATQKVVDGP SAKDWRGGRT g3pc_ranac CTTNCLAPLA KVINDRFGIV EGLMTTVHAM TATQKTVDGP SSKDWRGGRA g3px_horvu CTTNCLAPLA KVINDNFGII EGLMTTVHAI TATQKTVDGP SSKDWRGGRA g3p_schco CTTNCLAPLA KVIHDKYGIA EGLMTTVHAT TATQKTVDGP SHKDWRGGRS g3p_crypa CTTNCLAPLA KVINDEFKII EGLMTTVHSY TATQKTVDGP SAKDWRGGRT g3pc_maize CTTNCLAPLA KVIHDNFGIV EGLMTTVHAI TATQKTVDGP SAKDWRGGRA g3p_phach CTTNCLAPLA KVIHDKFGIV QGLMTSVHAT TATQKTVDGP SNKDWLGGRS g3p_lyosh CTTNCLAPLT KVIHDKYGII 
EGLMSTIHAT TATQKTVDGP SNKDWRGGRA g3pt_mouse CTTNCLAPLA KVIHENFGIV EGLMTTVHSY TATQKTVDGP SKKDWRGGRG g3p_neucr CTTNCLAPLA KVIHDNFTIV EGLMTTVHSY TATQKTVDGP SAKDWRGGRT g3p_emeni CTTNCLAPLA KVINDNFGII EGLMTTVHSY TATQKVVDGP SAKDWRGGRT g3p_colgl CTTNCLAPLA KVINDKFGIV EGLMTTVHSY TATQKTVHGP SAKDWRGGRT g3pc_sinal CTTNCLAPLA KVINDRFGIV EGLMTTVHSI TATQKTVDGP SMKDWRGGRA g3pc_pethy CTTNCLAPLA KVINDRFGIV EGLMTTVHSM TATQKTVDGP SAKDWRGGRA g3pc_pinsy CTTNCLAPLA KVINDKFGIV EGLMTTVHSI TATQKTVDGP SNKDWRGGRG g3p_serma CTTNCLAPLA KVINDNFGIV EGLMTTVHAT TATQKTVDGP SHKDWRGGRG g3p_canal CTTNCLAPLA KVVNDTFGIE EGLMTTVHSI TATQKTVDGP SHKDWRGGRT g3pc_arath CTTNCLAPLA KVINDRFGIV EGLMTTVHSI TATQKTVDGP SMKDWRGGRA g3p1_triko CTTNCLAPLA KTLNDKFTIV EGLMTAIHAY TASQKTVDGP SSKDWRGGRA g3p_atrnu CTTNCLAPLA KVINDKFGIV EGLMTTVHSI TATQKTVDGP SAKDWRGGRA g3p_picpa CTTNCLAPLA KVVNDTFGIE SGLMTTVHSM TATQKTVDGP SHKDWRGGRT g3p_colln CTTNCLAPLA KVINDKYTII EGLMTTVHSY TATQKTVDGP SAKDWRGGRT g3pc_tobac CTTNCLAPLA KVINDRFGIV EGLMTTVHSL TATQKTVDGP SMKDWRGGRA g3p_monan CTTNCLAPLA KVINDKFGIV EGLMTTVHSY TATQKVVDGP SNKDWRGGRT g3p_escfe CTTNCLAPLA KVINDNFGII EGLMTTVHAT TATQKTVDGP SHKDWRGGRG g3pc_pea CTTNCLAPLA KVINDRFGIV EGLMTTVHSI TATQKTVDGP SSKDWRGGRA g3pc_mescr CTTNCLAPLA KVINDRFGIV EGLMTTVHAM TATQKTVDGP SMKDWRGGRA g3pc_grave CTTNCLAPLV KVIHEKYGIE EGLMTTVHAT TATQKTVDGP SQKDWRGGRG g3p_erygr CTTNCLAPLA KVINDEFTII EGLMTTIHSY TATQKTVDGP SAKDWRGGRT g3pc_diaca CTTNCLAPLA KVINDRFGIV EGLMTTVHSI TATQKTVDGP SMKDWRGGRA g3pc_horvu CTTNCPAPLA KVINDRFGIV EGLMTTVHAM TATQKTVDGP SSKDWRGGRA g3p_phyin CTTNCLAPLA KVVNDKFGIK EGLMTTVHAV TATQKTVDGP SKKDWRGGRG g3p1_ecoli CTTNCLAPLA KVINDNFGII EGLMTTVHAT TATQKTVDGP SHKDWRGGRG g3pc_antma CTTNCLAPLA KVINDRFGIV EGLMTTVHSI TATQKTVDGP SAKDWRGGRA g3p_clapu CTTNCLAPLA KVIHDKYTIV EGLMTTVHSY TATQKTVDGP SGKDWRGGRG g3pc_orysa CTTNCLAPLA KVIHDNFGII EGLMTTVHAI TATQKTVDGP SSKDWRGGRA g3pc_petcr CTTNCLAPLA KVLNDKFGIV EGLMTTVHSI TATRKTVDGP SMKDWRGGRA g3pc_chocr CTTNCLAPLV KVIHEKYGIE EGLMTTVHAT TATQKTVDGP SNKDWRGGRG 
g3p_serod CTTNCLAPLA KVINDKFGIV EALMTTVHAT TATQKTVDGP FHKDWRGGRG g3pc_leime CTTNCLAPLA KVVNEKYGIV EGLMTTVHAT TATQKTVDGP SLKDWRGGRG g3pc_taxba CTTNCLAPLA KVINDRFGIV EGLMTTVHSI TATQKTVDGP SNKDWRGGRA g3p2_triko CTTNCLAPLA KVINDNYGIV EGLMTTVHSY TATQKTVDGP SAKDWRGGRG g3pc_magli CTTNCLAPLA KVINDKFGIV EGLMTTVHSI TATQKTVDGP SSKDWRGGRA g3p3_yeast CTTNCLAPLA KVINDAFGIE EGLMTTVHSL TATQKTVDGP SHKDWRGGRT g3p2_agabi CTTNCLATLA KVIHDNFGIV EGLMTTVHAT TATQKTVDGP SHKDWRGGRG g3pc_crapl CTTNCLAPLA KVIHDRFGIV EGLMTTVHSI TATQKTVDGP SSKDWRGGRA g3p_klepn CTTNCLAPLA KVINDNFGIV EGLMTTVHAT TATQKTVDGP SHKDWRGGRG g3p2_yeast CTTNCLAPLA KVINDAFGIE EGLMTTVHSM TATQKTVDGP SHKDWRGGRT g3pc_trybb CTTNCLAPLA KVLNDKFGIV EGLMTTVHAT TATQKTVDGP SQKDWRGGRG g3p1_escvu CTTNCLAPLA KVINDNFGIV EGLMTTVHAT TATQKTVDGP SHKDWRGGRG g3p_haein CTTNCLAPLA RVVHETFGIK DGLMTTVHAT TATQKTVDGP SAKDWRGGRG g3p_esche CTTNCLAPLA KVINDNFGII EGLMTTVHAT TATQKTVDGP SHKDWRGGRG g3p_klula CTTNCLAPIA KIINDEFGID EALMTTVHSI TATQKTVDGP SHKDWRGGRT g3p_triha CTTNGLAPLV KIVNDNFGIV EGLMTTVHSY TATQKTVDGP SGKDWRGGRG g3p1_yeast CTTNCLAPLA KVINDAFGIE EGLMTTVHSM TATQKTVDGP SHKDWRGGRT g3p2_kluma CTTNCLAPLA KIINNAFGIE EGLMTTVHSI TATQKTVDGP SHKDWRGGRT g3p_citfr CTTNCLAPLA KVINDNFGII EGLMTTVHAT TATQKTVDGP SHKDWRGGRG g3pc_chlre CTTNCLAPLA KVVNSKFGIK EGLMTTVHAT TATQKTVDGP SKKDWRGGRA g3p_entae CTTNCLAPLA KVINDNFGII EGLMTTVHAT TATQKTVDGP SHKDWRGGRG g3p_escbl CTTNCLAPLA KVVNDNFGIV EALMTTVHAT TATQKTVDGP SHKDWRGGRG g3p_zygro CTTNCLAPLA KVIHNEFGIK EGLMTTVHSM TATQKTVDGP SHKDWRGGRT g3p1_syny3 CTTNCLAPIA KILDDNFGIV EGLMTTVHAM TATQPTVDGP SKKDFRGGRG g3p1_salty CTTNCLAPLA KVINDNFGII EGLMTTVHAT TATQKTVDGP SHKDWRGGRG g3p1_agabi CTTNCLAVLA KVINDKFGIV EGLMTTVHAT TATQKTVDAP AKKDWRSGRS g3p1_anava CTTNCLAPIA KVINDNFGLT EGLMTTVHAM TATQPTVDGP SKKDWRGGRG g3p_bacfr CTTNCLAPIA KVLNDKFGIV KGLMTTVHAA TATQKTVDGP SKKDWRGGRG g3p_bucap CTTNCLAPLS KVIDDHFGII EGLMTTVHAS TATQKIVDSA SKKDWRGGRG g3p1_giala CTTNCLAPLA MVVNKKFGIK EGLMTTVHAV TATQLPVDGP SKKDWRGGRS g3p_burso .......... .......... 
.......... .......... .......... g3pg_trybb CTTNCLAPLV HVLveGFGIS TGLMTTVHSY TATQKTVDGV SVKDWRGGRA g3p_chltr CTTNCLAPLA KVLLDNFGIE EGLMTTVHAA TATQSVVDGP SRKDWRGGRG g3pg_trycr CTTNCLAPIV HVLveGFGVQ TGLMTTIHSY TATQKTVDGV SVKDWRGGRA g3p1_bacsu CTTNCLAPFA KVLNDKFGIK RGMMTTVHSY TNDQQILDLP H.KDYRRARA g3p_bacme CTTNCLAPFA KVLNDKFGLK RGMMTTVHSY TNDQQILDLP H.KDYRRARA g3p_bacst CTTNCLAPFA KVLHQEFGIV RGMMTTVHSY TNNQRILDLP THKDLRGARA g3pg_leime CTTNCLAPIV HVLteNFGIE TGLMTTIHSY TATQKTVDGV SLKDWRGGRA g3p_borbu CTTNCLAPLA KVLHESFGIE QGLMTTVHAY TNDQRILDLP HSDLRRA.RA g3p_borhe CTTNCLAPLA KVLHESFGIE QGLMTTVHAY TNXQRILDLP HSDLRRA.RA g3p_theaq CTTNSLAPVM KVLEEAFGVE KALMTTVHSY TNDQRLLDLP HKDLRRA.RA g3pa_sinal CTTNCLAPFV KVLDQKFGII KGTMTTTHSY TGDQRLLDAS HRDLRRA.RA g3p_mycle CTTNCLAPLA KVLHDQFGIV KGLMTTVHAY TQDQNLQDGP HSDLRRA.RA g3p_myctu CTTNCLAPLA KVLDDEFGIV KGLMTTIHAY TQDQNLQDGP HKDLRRA.RA g3pp_alceu CTTNCLAPLV KPLHEKLGVV NGLMTTVHSY TNDQVLTDVY HEDLRRA.RS g3p_bacco CTTICLAAFA RVLHQIFGEV SRMMTTAHSY TNIQRILDAA THADLRGARA g3pc_alceu CTTNCLAPLV KPLHEKLGLV NGLMTTVHSY TNDQVLTDVY HEDLRRA.RS g3p_thema CTTNSIAPIV KVLHEKFGIV SGMLTTVHSY TNDQRVLDLP HKDLRRA.RA g3p_clopa CTTNCLAPMA KVLNDKFGIE KGFMTTIHAY TNDQNTLDGP HRkdFRRARA g3p_strpy CTTNCLAPMA KALHDAFGIQ KGLMTTIHAY TGDQMILDGP hgGDLRRARA g3pa_maize CTTNCLAPFV KVLDQKFGII KGTMTTTHSY TGDQRLLDAS HRDLRRA.RA g3pb_arath CTTNCLAPFA KVLDEEFGIV KGTMTTTHSY TGDQRLLDAS HRDLRRA.RA g3pa_spiol CTTNCLAPFV KVLDQKFGII KGTMTTTHSY TGDQRLLDAS HRDLRRA.RA g3p_pseae CTTNCLAPVA QVLHRELGIE HGLMTTIHAY TNDQNLSDVY HPDLYRA.RS g3pa_chlre CTTNCLAPFV KVLEQKFGIV KGTMTTTHSY TGDQRLLDAS HRDLRRA.RA g3pb_pea CTTNCLAPFA KVLDEEFGIV KGTMTTTHSY TGDQRLLDAS HRDLRRA.RA g3p2_anava CTTNCLAPIA KVLNDKFGII KGSMTTTHSY TGDQRLLDAS HRDLRRA.RA g3p2_syny3 CTTNCLAPFG KVINDNFGII KGTMTTTHSY TGDQRILDAS HRDLRRA.RA g3p_zymmo CTTNCLAPVL HVLQQKIGIV RGLMTTVHSF TNDQRILDQI HSDLRRA.RT g3p_mycpn CTTNCLAPMV HVLEKNFGIL HGTMVTVHAY TADQRLQDAP HSDLRRA.RA g3p_streq CTTNCLAPMA KALHDAFGIQ KGLMTTIHAY TGDQMIVDGH rgGDLRRARA 
g3pb_spiol    CTTNCLAPFV KVLDEELGIV KGTMTTTHSY TGDQRLLDAS HRDLRRA.RA
g3p_strae     CTTNCLGVLA KVLHDAVGID SGMMTTVHAY TQDQNLQDAP HKDLRRA.RA
g3pa_arath    CTTNCLAPFV KVLDQKFGII KGTMTTTHSY TGDQRLLDAS HRDLRRA.RA
g3p_xanfl     CTTNCLAPVA KVLNDAVGIE KGFMTTIHAY TGDQPTLDTM HKDLYRA.RA
g3pb_tobac    CTTNCLAPFV KVMDEELGIV KGTMTTTHSY TGDQRLLDAS HRDLRRA.RA
g3p_corgl     CTTNCLAPMA KVLNDKFGIE NGLMTTVHAY TGDQRLHDAP HRDLRRA.RA
g3p1_anasp    .......... .......... .......... .......... ..........
g3p2_rhosh    CTTNCLSPVA KVLHDAIGIA KGFMTTIHSY TGDQPTLDTM HKDLYRA.RA
g3pa_grave    CTTNCMAPFM KVLDDEFGVV RGMMTTTHSY TGDQRLLDAG HRDLRRA.RS
g3pa_tobac    CTTNCLAPFV KVLDQKFGII KGTMTTTHSY TGDQRLLDAS HRDLRRA.RA
g3p_halva     CTTNSITPVA KVLDEEFGIN AGQLTTVHAY TGSQNLMDGP NGKPRRR.RA
g3pa_pea      CTTNCLAPFV KVLDQKFGII KGTMTTTHSY TGDQRLLDAS HRDLRRA.RA
g3pa_chocr    CTTNCMAPFM KVLDDEFGVV RGMMTTTHSY TGDQRLLDAG HRDLRRA.RS
g3p3_ecoli    CTTNCLAPMA KALHDSFGIE VGTMTTIHAY TGTQSLVDGP RGKDLRASRA
g3p_mycge     CTTNCLAPLV HVLEKNFGIV YGTMLTVHAY TADQRLQDAP HNDLRRA.RA
g3p2_bacsu    CTTNCLAPVV KVLDEEFGIE SGLMTTVHAY TNDQKNIDNP HKDLRRA.RA
g3p_strau     CTTNALAPLA KVLDDLAGIE HGFMTTVHAY TQEQNLQDGP HRDPRRA.RA
g3p_lacla     CTTNSLAPMA DALNKNFGVK GGTMTTVHSY TGDQMTLDGP hgGDFRRARA
g3p_helpy     CTTNAIAPIC AILDKAFKIK EGMLTTIHSY TSDQKLIDLA HPLDKRRSRA
g3p3_anava    CTTNCLAPVV KVIHEGLGIK HGIITTIHDN TNTQTLVDAP HKDLRRA.RA
e4pd_ecoli    CTTNCIIPVI KLLDDAYGIE SGTVTTIHSA MHDQQVIDAY HPDLRRT.RA

              201                                                250
predict_h263  AAQNLIPAST GAAKAVGKVI PELDGKLTGM AFRVPTANVS VLDLTCRLEK
g3p1_human    AAQNLIPAST GAAKAVGKVI PELDGKLTGM AFRVPTANVS VLDLTCRLEK
g3p_canfa     .......... .......... .......... .......... ..........
g3p_pig AAQNIIPAST GAAKAVGKVI PELNGKLTGM AFRVPTPNVS VVDLTCRLEK g3p_cavpo ...NIIPAST GAAKAVGKVI PELNGKLTGM AFRVPTPNVS VVDLTCRLEK g3p_rabit AAQNIIPAST GAAKAVGKVI PELNGKLTGM AFRVPTPNVS VVDLTCRLEK g3p_bovin AAQNIIPAST GAAKAVGKVI PELNGKLTGM AFRVPTPNVS VVDLTCRLEK g3p2_human ALQNIIPAST GAAKAVGKVI PELNGKLTGM AFRVPTANVS VVDLTCRLEK g3p_crigr AAQNIIPAST GAAKAVGKVI PELNGKLTGM AFRVPTPNVS VVDLTCRLEK g3p_sheep AAQNIIPAST GAAKAVGKVI PELNGKLTGM AFRVPTPNVS VVDLTCRPEK g3p_mouse AAQNIIPAST GAAKAVGKVI PELNGKLTGM AFRVPTPNVS VVDLTCRLEK g3p_mesau AAQNIIPAST GAAKAVGKVI PELNGKLTGM AFRVPTPNVS VVDLTCRLEK g3p_cotja AAQNIIPAST GAAKAVGKVI PELNGKLTGM AFRVPTPNVS VVDLTCRLEK g3p_rat AAQNIIPAST GAAKAVGKVI PELNGKLTGM AFRVPTPNVS VVDLTCRLEK g3p1_jacor AAQNIIPAST GAAKAVGKVI PELNGKLTGM AFRVPTANVS VVDLTCRLEK g3p_chick AAQNIIPAST GAAKAVGKVI PELNGKLTGM AFRVPTPNVS VVDLTCRLEK g3p_xenla AGQNIIPAST GAAKAVGKVI PELNGKITGM AFRVPTPNVS VVDLTCRLQK g3p2_jacor .......... .......... .......... .......... .......... g3p_homam AAQNIIPSST GAAKAVGKVI PELDGKLTGM AFRVPTPDVS VVDLTVRLGK g3p3_caebr AGQNIIPAST GAAKAVGKVI PELNGKLTGM AFRVPTPDVS VVDLTARLEK g3p1_drome AAQNIIPAAT GAAKAVGKVI PALNGKLTGM AFRVPTPNVS VVDLTVRLGK g3p2_caebr AGQNIIPAST GAAKAVGKVI PELNGKLTGM AFRVPTPDVS VVDLTARLEK g3p2_drome AAQNIIPAST GAAKAVGKVI PALNGKLTGM AFRVPTPNVS VVDLTVRLGK g3p3_caeel AGQNIIPAST GAAKAVGKVI PELNGKLTGM AFRVPTPDVS VVDLTARLEK g3p_drohy ACQNIIPAST GAAKAVGKVI PALNGKLTGM AFRVPTPNVS VVDLTVRLGK g3p2_caeel AGQNIIPAST GAAKAVGKVI PELNGKLTGM AFRVPTPDVS VVDLTARLEK g3p_schpo ASANIIPSST GAAKAVGKVI PALNGKLTGM AFRVPTPDVS VVDLTVKLAK g3p_schma AMQNIIPAST GAAKAVGKVI PALNGKLTGM AFRVPTPDVS VVDLTCRLGK g3p_bruma AGQNIIPAST GAAKAVGKVI PGLNGKLTGM ANRVPTPDVS VVDLTCRLQK g3p_lacdt VNGNIIPSST GAAKAVGKVI PALNGKLTGL AFRVPTNDVS VVDLVVRLEK g3p_podan AAQNIIPSST GAAKAVGKVI PELNGKLTGM AFRVPTSNVS VVDLTCRLEK g3p_ustma AAANIIPSST GAAKRVGKVI PSLNGKLTGM AFRVPTTNVS VVDLTARLEK g3p_pharh AGANIIPSST GAAKAVGKVI PSLNGKLTGM AFRVPTPDVS VVDLVVRIEK g3p1_caeel AGQNIIPAST GAAKAVGKVI PELNGKLTGM 
AFRVPTPDVS VVDLTVRLEK g3p_aspng AAQNIIPSST GAAKAVGKVI PTLNGKLTGM AMRVPTSNVS VVDLTCRLEK g3p4_caeel AGQNIIPAST GAAKAVGKVI PELNGKLTGM AFRVPTPDVS VVDLTVRLEK g3p_boled VNNNIIPSST GAAKAVGKVI PVLNGKLTGL AFRVPTLDVS VVDLVVRLAK g3p_coche AAQNIIPSST GAAKAVGKVI PELNGKLTGM AMRVPTANVS VVDLTVRIEK g3pc_ginbi AGFNIIPSST GAAKAVGKVL PALNGKLTGM AFRVPTPDVS VVDLTVRLEK g3p_amamu VNNNIIPSST GAAKAVGKVI PELNGKLTGL SFRVPTLDVS VVDLVVRIEQ g3pc_phypa AATNIIPSAT GAAKAVGKVL PELNGKLTGM AFRVPTTDVS VVDLTVRLEK g3p_curlu AAQNIIPSST GAAKAVGKVI PELNGKLTGM AMRVPTANVS VVDLTVRIEK g3pc_ranac ASFNIIPSST GAAKAVGKVL PALNGKLTGM AFRVPTVDVS VVGLTVRLEK g3px_horvu ASFNIIPSST GAAKAVGKVL PELNGKLTGM SFRVPTVDVS VVDLTVRTEK g3p_schco VNNNIIPSST GAAKAVGKVI PSLNGRLTGL AFRVPTLDVS VVDLVVRLEK g3p_crypa AAQNIIPSST GAAKAVGKVI PELNGKLTGM SMRVPTSNVS VVDLTVRIEK g3pc_maize ASFNIIPSST GAAKAVGKVL PDLNGKLTGM SFRVPTVDVS VVDLTVRIEK g3p_phach VGNNIIPSST GAAKAVGKVI PSLNGKLNGL AFRVPTVDVS VVDLVVRLEK g3p_lyosh VVNNIIPSST GAAKAVGKVI PSLNGKLTGL SFRVPTIDVS VIDLVVRLEK g3pt_mouse AHQNIIPSST GAAKAVGKVI PELKGKLTGM AFRVPTPNVS VVDLTCRLAK g3p_neucr AAQNIIPSST GAAKAVGKVI PDLNGKLTGM AMRVPTANVS VVDLTARIEK g3p_emeni AATNIIPSST GAAKAVGKVI PSLNGKLTGM AMRVPTSNVS VVDLTVRTEK g3p_colgl AAQNIIPSST GAAKAVGKVI PELNGKLTDM SMRVPTTNVS VVDLTARIEK g3pc_sinal ASFNIIPSST GAAKAVGKVL PQLNGKLTGM SFRVPTVDVS VVDLTVRLEK g3pc_pethy ASFNIIPSST GAAKAVGKVL PALNGKLTGM AFRVPTVDVS VVDLTVRLEK g3pc_pinsy AGFNIIPSST GAAKAVGKVL PALNGKLTGM AFRVPTPDVS VVDLTVRLEK g3p_serma ASQNIIPSST GAAKAVGKVI PELKGKLTGM AFRVPTPNVS VVDLTVRLEK g3p_canal ASGNIIPSST GAAKAVGKVI PELNGKLTGM SLRLPTTDVS VVDLTVRLKK g3pc_arath ASFNIIPSST GAAKAVGKVL PALNGKLTGM SFRVPTVDVS VVDLTVRLEK g3p1_triko AAQNLIPSST GAAKAVGKVI PELAGKVTGM SVRVPTVNVS LVDFTVRFAK g3p_atrnu ASFNIIPSST GAAKAVGKVL PALNGKLTGM SFRVPTVDVS VVDLTVRLEK g3p_picpa ASGNIIPSST GAAKAVGKVI PELNGKLTGL AFRVPTVDVS VVDLTVNLKK g3p_colln AAQNIIPSST GAPKAVGKVI PELNGKLTGM SMRVPTANVS VVDLTVRIEK g3pc_tobac TSFNIIPSST GAAKAVGKVL PALNGKLTGM AFRVPTVDVS VVDLTVRLEK g3p_monan 
AAQNIIPSST GVPKAVGKVI PSLNGKLTGM SMRVPTSNAS VVDLTARLEK g3p_escfe ASQNIIPSST GAAKAVGKVL PELNGKLTGM AFRVPTPNVS VVDLTVRLEK g3pc_pea ASFNIIPSST GAAKAVGKVL PALNGKLTGM SFRVPTVDVS VVDLTVRLEK g3pc_mescr ASFNIIPSST GAAKAVGKVL PALNGKLTGM AFRVPTCDVS VVDLTVRIEK g3pc_grave AGANIIPSST GAAKAVGKVL PELNGKLTGM AFRVPTSDVS VVDLTVRLAT g3p_erygr AAQNIIPSST GAAKAVGKVI PALNGKLTGM AMRVPTANVS VVDLTCRIEK g3pc_diaca ASFNIIPSST GAAKAVGKVL PSLNGKLTGM SFRVPTVDVS VVDLTVRIEK g3pc_horvu ASFNIIPSST GAAKAVGKVL PELNGKLTGM AFRVPTVDVS VVDLTVRLAK g3p_phyin ACFNIIPSST GAAKAVGKVI PSLNGKLTGM SFRVPTADVS VVDLTARLVN g3p1_ecoli ASQNIIPSST GAAKAVGKVL PELNGKLTGM AFRVPTPNVS VVDLTVRLEK g3pc_antma ASFNIIPSST GAAKAVGKVL PQLNGKLTGM SFRVPTVDVS VVDLTVRLEK g3p_clapu AAQNIIPSST GAAKAVGKVI PDLNGKLTGM SMRVPTPNVS VVDLTVRIEK g3pc_orysa ASFNIIPSST GAAKAVGKVL PDLNGKLTGM SFRVPTVDVS VVDLTVRIEK g3pc_petcr ASFNIIPSST GAAKAVGKVL PALNGKLTGM AFRVPTVDVS VVDLTARLEK g3pc_chocr RGRNIIPSST GAAKAVGKVM PELNGKLTGM AFRVPTPDVS VVDLTVRLAS g3p_serod ASQNIIPSST GAAKAVGKVI PELNGKLTGM AFRVPTPNVS VVDLTARLEK g3pc_leime ASQNIIPSST GAPKAVGKVY PALDGKLTGM AFRVPTPNVS VVDLTVRLEK g3pc_taxba AGFNIIPSST GAAKAVGKVL PVLNGKLTGM CFRVPTQDVS VVDLTVKLEK g3p2_triko AAQNIIPSST GAAKAVGKVI PALNGKLTGM SIRVPTANVS VVDLTVRIEK g3pc_magli AGFNIIPSST GAAKAVGKVL PALNGKLTGM AFRVPTVDVS VVDLTVRLEK g3p3_yeast ASGNIIPSST GAAKAVGKVL PELQGKLTGM AFRVPTVDVS VVDLTVKLNK g3p2_agabi VGNNIIPSST GAAKAVGKVI PSLNGKLTGL SMRVPTQDVS VVDLVVRLEK g3pc_crapl ASFNIIPSST GAAKAVGKVL PDLNGKLTGM AFRVPTVDVS VVDLTVTLAK g3p_klepn AAQNIIPSST GAAKAVGKVL PELNGKLTGM AFRVPTPNVS VVDLTVRLEK g3p2_yeast ASGNIIPSST GAAKAVGKVL PELQGKLTGM AFRVPTVDVS VVDLTVKLNK g3pc_trybb AAQNIIPSST GAAKAVGKII PSLNGKLTGM AFRVPTPNVS VVDLTVRLER g3p1_escvu AAQNIIPSST GAAKAVGKVL PELNGKLTGM AFRVPTPNVS VVDLTVRLEK g3p_haein ASQNIIPSST GAAKAVGKVL PALNGKLTGM AFRVPTPNVS VVDLTVNLEK g3p_esche AAQNIIPSST GAAKAVGKVL PELNGKLTGM AFRVPTPNVS VVDLTVRLEK g3p_klula ASGNIIPSST GAAKAVGKVL PELQGKLTGM AFRVPTVDVS VVDLTVKLAK g3p_triha AAQNIIPSST GAAKAVGKVI 
PAMNGKITGM SFRVPSANVS VIDLTVRLEK g3p1_yeast ASGNIIPSST GAAKAVGKVL PELQGKLTGM AFRVPTVDVS VVDLTVKLEK g3p2_kluma ASGNIIPSST GAAKAVGKVL PELQGKLTGM AFRVPTVDVS VVDLTVKLAK g3p_citfr ASQNIIPSST GAAKAVGKVL PELNGKLTGM AFRVPTPNVS VVDLTVRLEK g3pc_chlre VNGNIIPSST GAAKAVGKVL PELKGKLTGM AFRVPTNDVS VVDLTVTLEK g3p_entae AAQNIIPSST GAAKAVGKVL PELNGKLTGM AFRVPTPNVS VVDLTVRLEK g3p_escbl ASQNIIPSST GAAKAVGKVL PELNGKLTGM AFRVPTPNVS VVDLTVRLAK g3p_zygro ASGNIIPSST GAAKAVGKVL PSLQGKLTGM AFRVPTVDVS VVDLTVNLAK g3p1_syny3 AAQNIIPSST GAAKAAALVL PQLKGKLTGM AFRVPTPNVS VVDLTFKTEK g3p1_salty ASQNIIPSST GAAKAVGKVL PELNGKLTGM AFRVPTPNVS VVDLTVRLEK g3p1_agabi VTNNIIPAST GAAKAVTKAI PDLEGKLTGL AFRVPTLDVS VVDLVVRLEK g3p1_anava AAQNIIPSST GAAKAVALVL PELKGKLTGM AFRVPIPDVS VVDLTFKTAK g3p_bacfr ILENIIPSST GAAKAVGKVL PVLNGKLTGM AFRVPTSDVS VVDLTVVLEK g3p_bucap ALQNIIPSST GAAVAVGKVL PNLNGKLTGI AFRVPTANVS VVDLTVRYKK g3p1_giala CGANVIPSST GAAKAVGKVL PALNGKLTGM AFRVPVPDVS VVDLTCTLEK g3p_burso .......... .......... .......... .......... .......... 
g3pg_trybb AALNIIPSTT GAAKAVGMVI PSTQGKLTGM AFRVPTADVS VVDLTFIATR g3p_chltr AFQNIIPAST GAAKAVGLCL PELKGKLTGM AFRVPVADVS VVDLTVKLSS g3pg_trycr AAVNIIPSTT GAAKAVGMVI PSTQGKLTGM SFRVPTPDVS VVDLTFTAAR g3p1_bacsu AAENIIPTST GAAKAVSLVL PELKGKLNGG AMRVPTPNVS LVDLVAELNQ g3p_bacme AAENIIPTST GAAKAVSLVL PELKGKLNGG AMRVPTPNVS LVDLVAELDK g3p_bacst AAESIIPTTT GAAKAVALVL PELKGKLNGM AMRVPTPNVS VVDLVAELEK g3pg_leime AAVNIIPSTT GAAKAVGMVI PSTKGKLTGM SFRVPTPDVS VVDLTFRATR g3p_borbu AALSIIPTST GAAKAVGLVL PELKGKLNGT SMRVPVPTGS IVDLTVQLKK g3p_borhe AALSIIPTST GAAKAVGLVL PELKGKLNGT SMRVPVPTGS IVDLTVQLKK g3p_theaq AAINIIPTTT GAAKATALVL PSLKGRFDGM ALRVPTATGS ISDITALLKR g3pa_sinal AALNIVPTST GAAKAVALVL PNLKGKLNGI ALRVPTPNVS VVDLVVQVSK g3p_mycle AALNVVPTST GAAKAIGLVM PELKGKLDGY ALRVPIPTGS VTDLTADLSK g3p_myctu AALNIVPTST GAAKAIGLVM PQLKGKLDGY ALRVPIPTGS VTDLTVDLST g3pp_alceu ATMSMIPTKT GAAAAVGLVM PELDGRLDGF AVRVPTINVS LVDLSFVAAR g3p_bacco AAESIIDTTN GAAMAVALVL PELKGKLNGM AMRVATANVS VVDLVYELAK g3pc_alceu ATMSMIPTKT GAAAAVGLVM PELDGRLDGF AVRVPTINVS LVDLSFVAAR g3p_thema AAVNIIPTTT GAAKAVALVV PEVKGKLDGM AIRVPTPDGS ITDLTVLVEK g3p_clopa AAVSIIPNST GAAKAIAQVI PELKGKLDGN AQRVPVPTGS VTELISVLKK g3p_strpy GAANIVPNST GAAKAIGLVI PELNGKLDGA AQRVPVPTGS VTELVVTLDK g3pa_maize AALNIVPTST GAAKAVSLVL PNLKGKLNGI ALRVPTPNVS VVDLVVQVSK g3pb_arath AALNIVPTST GAAKAVSLVL PQLKGKLNGI ALRVPTPNVS VVDLVINVEK g3pa_spiol ACLNIVPTST GAAKAVALVL PQLKGKLNGI ALRVPTPNVS VVDLVVQVSK g3p_pseae ATQSMIPTKT GAAEAVGLVL PELAGKLTGL AVRVPVINVS LVDLTVQVAR g3pa_chlre AALNIVPTTT GAAKAVSLVL PSLKGKLNGI ALRVPTPTVS VVDLVVQVEK g3pb_pea AALNIVPTST GAAKAVSLVL PQLKGKLNGI ALRVPTPNVS VVDLVVNVAK g3p2_anava AAINIVPTST GAAKRVALVI PELKGKLNGV ALRVPTPNVS MVDFVVQVEK g3p2_syny3 AAVNIVPTST GAAKAVALVI PELQGKLNGI ALRVPTPNVS VVDLVVQVEK g3p_zymmo ASASMIPTST GAARAVALVI PELKGKLDGI SIRVPTPDVS LVDFTFVPQR g3p_mycpn AACNIVPTTT GAAKAIGLVV PEATGKLNGM ALRVPVLTGS IVELCVALEK g3p_streq GAANIVPNST GARKAIGLVI PELNGKLDGA AQRVPVPTGS VTELVVTLDK g3pb_spiol AALNIVPTST GAAKAVSLVL 
PQLKGKLNGI ALRVPTPNVS VVDLVVNIEK g3p_strae AALNIVPTSS GAAKAIGLVL PELAGRLDAF ALRVPVPTGS VTDLTVTTRR g3pa_arath AALNIVPTST GAAKAVALVL PNLKGKLNGI ALRVPTPNVS VVDLVVQVSK g3p_xanfl AAMSMIPTST GAAKAVGLVL PELAGKLDGT SIRVPTPNVS VIDLKFVAKR g3pb_tobac AALNIVPTST GAAKAVSLVL PQLKGKLNGI ALRVPTPNVS VVDLVVNVAK g3p_corgl AAVNIVPTST GAAKAVALVL PELKGKLDGY ALRVPVITGS ATDLTFNTKS g3p1_anasp .......... .......... .......... .......... .......... g3p2_rhosh AALSMIPTST GAAKAVGLVL PELKGRLDGV SIRVPTPNVS VVDLVFEAAR g3pa_grave AALNIVPTTT GAAKAVALVV PTLAGKLNGI ALRVPTPNVS VCDVVMQVSK g3pa_tobac AALNIVPTST GAAKAVALSS QALRGSSMAL PLRVPTPNVS VVDLVVQVSK g3p_halva AAENIIPTST GAAQAATEVL PELEGKLDGM AIRVPVPNGS ITEFVVDLDD g3pa_pea AALNIVPTST GAAKAVALVL PTLKGKLNGI ALRVPTPNVS VVDLVVQVSK g3pa_chocr AALNIVPTTT GAAKAVALVV PSLKGKLNGI ALRVPTPNVS VCDVVMQVNK g3p3_ecoli AAENIIPHTT GAAKAIGLVI PELSGKLKGH AQRVPVKTGS VTELYRFSEK g3p_mycge AAVNIVPTTT GAAKAIGLVV PEANGKLNGM SLRVPVLTGS IVELSVVLEK g3p2_bacsu CGESIIPTTT GAAKALSLVL PHLKGKLHGL ALRVPVPNVS LVDLVVDLKT g3p_strau AAVNIVPTTT GAAKAIGLVL PNLDGKLSGD SIRVPVPVGS IVELNTTVAR g3p_lacla AAENIVPASS GAAKAIGLVL PELSGLMKGH AQRVSTPTGS ITELVTVLEK g3p_helpy AASNIIPTTT KAALALHKVL PNLKNKMHGH SVRVPSLDVS MIDLSLFLEK g3p3_anava TSLSLIPTTT GSAKAIALIY PELKGKLNGI AVRVPLLNAS LTDCVFEVNR e4pd_ecoli ASQSIIPVDT KLAAGITRFF PQFNDRFEAI AVRVPTINVT AIDLSVTVKK 251 300 predict_h263 PAKYDDIKKV VKEASEGPLK GILGYTEDEV VSDDFNGSNH SSIFDAGAGI g3p1_human PAKYDDIKKV VKEASEGPLK GILGYTEDEV VSDDFNGSNH SSIFDAGAGI g3p_canfa .......... .......... .......... .......... .......... g3p_pig PAKYDDIKKV VKQASEGPLK GILGYTEDQV VSCDFNSDTH SSTFDAGAGI g3p_cavpo PAKYDDIKKV VKQASEGSLK GILGYTED.. .......... .......... 
g3p_rabit AAKYDDIKKV VKQASEGPLK GILGYTEDQV VSCDFNSATH SSTFDAGAGI g3p_bovin PAKYDEIKKV VKQASEGPLK GILGYTEDQV VSCDFNSDTH SSTFDAGAGI g3p2_human PAKYDDIKKV VKQASEGPLK GILGYTEHQV VSSDFNSDTH SSTFDAGAGI g3p_crigr PAKYEDIKKV VKQASEGPLK GILGYTEDQV VSCDFNSDSH SSTFDAGAGI g3p_sheep PAKYDEIKKV VKQASEGPLK GILGYTEDQV VSCDFNSDTH SSTFDAGAGI g3p_mouse PAKYDDIKKV VKQASEGPLK GILGYTEDQV VSCDFNSNSH SSTFDAGAGI g3p_mesau AAKYEDIKKV VKQASEGPLK GILGYTEDQV VSCDFKSDSH SSTFDAGAGI g3p_cotja PAKYDDIKRV VKAAAEGPLK GILGYTEDQV VSCDFNGDSH SSTFDAGAGI g3p_rat PAKYDDIKKV VKQAAEGPLK GILGYTEDQV VSCDFNSNSH SSTFDAGAGI g3p1_jacor PAKYDDIKRV VKQACDGPLK GMLGYTEHQV VSSDFNGDSH SSTFDAGAGI g3p_chick PAKYDDIKRV VKAAADGPLK GILGYTEDQV VSCDFNGDSH SSTFDAGAGI g3p_xenla PAKYDDIKAA IKTASEGPMK GILGYTQDQV VSTDFNGDTH SSIFDADAGI g3p2_jacor .......... .......... .......... .......... .......... g3p_homam ECSYDDIKAA MKTASEGPLQ GFLGYTEDDV VSSDFIGDNR SSIFDAKAGI g3p3_caebr PASLDDIKKV VKAAAEGPLK GVLAYTEDQV VSTDFVSDTH SSIFDAGASI g3p1_drome GATYDEIKAK VEEASKGPLK GILGYTDEEV VSTDFLSDTH SSVFDAKAGI g3p2_caebr PASLDDIKRV IKAAAEGPLK GVLAYTEDQV VSTDFVSDTH SSIFDAGASI g3p2_drome GASYDEIKAK VQEAANGPLK GILGYTDEEV VSTDFLSDTH SSVFDAKAGI g3p3_caeel PASLDDIKKV IKAAADGPMK GILAYTEDQV VSTDFVSDTN SSIFDAGASI g3p_drohy GASYDEIKAK VQEAANGPLK GILGYTDEEV VSTDFLSDTH SSVFEPKAGI g3p2_caeel PASLDDIKKV IKAAADGPMK GILAYTEDQV VSTDFVSDTN SSIFDAGASI g3p_schpo PTNYEDIKAA IKAASEGPMK GVLGYTEDAV VSTDFCGDNH SSIFDASAGI g3p_schma GASYEEIKAA VKAAASGPLK GILEYTEDEV VSSDFVGSTS SSIFDAKAGI g3p_bruma GATMDEIKAA VKEAANGPMK GILEYTEDQV VSTDFTGDTH SSIFDSLACI g3p_lacdt EATYDEIKLA VKEAADGPLK GIIEYTDDLV VSTDFIGSTA SSIFDAGAGI g3p_podan PASYETIKAA LKEASEGELK GILGYTEDEI VSSDLNGNAN SSIFDAKAGI g3p_ustma GASYDEIKAE VKRASENELK GILGYTEDAV VSQDFIGNSH SSIFDAAAGI g3p_pharh GASYEEIKET IKKASQTplK GILNYTDDQV VSTDFTGDSA SSTFDAQGGI g3p1_caeel PASMDDIKKV VKAAADGPMK GILAYTEDQV VSTDFVSDPH SSIFDAGACI g3p_aspng ATSYDEIKKA LKDASENELK GILGYTEDDI VSSDLNGDDH SSIFDAKAGI g3p4_caeel PASMDDIKKV VKAAADGPMK 
GILAYTEDQV VSTDFVSDPH SSIFDTGACI g3p_boled PTSYEEIKTA FKEASEseLK GIVAYTEDAV VSTDFLGHHA SSIFDATGGI g3p_coche GASYDEIKQA VKEASEGSLN GILGYTEDDI VSTDLNGDNR SSIFDAKAGI g3pc_ginbi PASYDEIKAA IKEESEGKLK GILGYTEDDV VSTDFIGDNR SSIFDAKAGI g3p_amamu SATYDEIKEA FREASKGSLK GIIEYTDEHV VSTDFIGHTA SSIFDSLAGI g3pc_phypa PASYDAIKTA IKEASEGQMK GILGYTEDDV VSTDFITDSR SSIFDAKAGI g3p_curlu GASYDEIKQA VKEASEGPLS GILGYTEDDI VTTDLNGDNR SSIFDAKAGI g3pc_ranac KATYADIKAA IKEESEGKLK GILGYTEDDV VSTDFIGDNR SSIFDAKAGI g3px_horvu AASYDDIKKA IKAASEGKLK GIMGYVEEDL VSTDFVGDSR SSIFDAKAGI g3p_schco EASYDEIVAT VKEASEGPLK GILGFTDESV VSTDFTGANE SSIFDSKAGI g3p_crypa GATYEQIKTA VKKAADGPLK GVLAYTEDDV VSTDMNGNPN SSIFDAKAGI g3pc_maize GASYEDIKKA IKAASEGPLK GIMGYVEEDL VSTDFLGDSR SSIFDAKAGI g3p_phach PASYDEIKQA IKEASETTHK GILGYTEEKV VSTDFTGNDN SSIFDRDAGI g3p_lyosh PASYEDIKKT VKEASEGAYK GIIEYTEEQV VSADFIGHHA SSIFDAQAGI g3pt_mouse PASYSAITEA VKAAAKGPLA GILAYTEDQV VSTDFNGNPH SSIFDAKAGI g3p_neucr GATYDEIKEV IKKASEGPLA GILAYTEDEV VSSDMNGNPA SSIFDAKAGI g3p_emeni AVTYDQIKDA VKKASENELK GILGYTEDDI VSTDLNGDTR SSIFDAKAGI g3p_colgl GASYDEIKQA IKEAAEGPLK GVLAYTEDDV VSTDMIGNPN SSIFDAKAGI g3pc_sinal AATYDEIKKA IKEESQGKLK GILGYTEDDV VSTDFVGDNR SSIFDAKAGI g3pc_pethy EATYDEIKAA IKEESEGKLK GIPGHTEDDV VSTDFIGDNR SSIFDAKAGI g3pc_pinsy SATYDEIKAA IKAESEGNLK GILGYTEDAV VSTDFIGDSR SSIFDAQAGI g3p_serma PATYEEIKKA MKDAAEGSMK GVLGYVEDDV VSTDFNGEVL TSVFDAKAGI g3p_canal AASYEEIAPA IKKASEGPLK GVLGYTEDAV VSTDFLGSSY SSIFDEKAGI g3pc_arath AATYEEIKKA IKEESEGKLK GILGYTEDDV VSTDFVGDNR SSIFDAKAGI g3p1_triko DVTYDEVKAA IKEASEGPLK GILAYTEDDI VSTDILTDPH SSTFDAKAGI g3p_atrnu EATYEQIKAA IKEESEGKMK GILGYTEDDV VSTDFVGDNR SSIFDAKAGI g3p_picpa ETTYEEIKSV IKAASEGKLK GVLGYTEDAV VSSDFLGDER SSIFDASAGI g3p_colln GATYDEIKQA IKEAAEGPLK GVLAYTEDDF VSTDMIGNPN SSIFDAKAGI g3pc_tobac EASYDDIKAA IKEESEGKLK GILGFTEDDV VSTDFVGDSR SSIFDAKAGI g3p_monan AATYDEIKQA VKKASERPLK GILGYTEDDV VSSDLNGDPH SSIFDAKAGI g3p_escfe PATYEQIKAA VKAAAEGEMK GVLGYTEDDV VSTDFNGEVC TSVFDAKAGI 
g3pc_pea AATYDEIKAA IKEESEGKLK GILGYTEDDV VSTDFIGDTR SSIFDAKAGI g3pc_mescr AASYEQIKAA IKEESEGKLK GILGYTEDDL VSTDFIGDNR SSIFDAKAGI g3pc_grave ETSYDDIKAT MKAAAEDSMK GILKYTEEAV VSTDFIHEEA SCVFDAGAGI g3p_erygr SATYDEIKKA IKIAAGNELQ GILSYTEDEI VSTDLIGNNH SSIFDAKAGI g3pc_diaca PATYEQVKAA IKEESEGKLK GILGYTEDDV VSTDFVGDNR SSIFDAKAGI g3pc_horvu PATYEQIKAA IKEESEGNLK GILGYVDEDL VSTDFQGDSR SSIFDAKAGI g3p_phyin PASYDEIKAA IKSASENEMK GILGYTEKAV VSSDFIGDSH SSIFDAEAGI g3p1_ecoli AATYEQIKAA VKAAAEGEMK GVLGYTEDDV VSTDFNGEVC TSVFDAKAGI g3pc_antma KATYEQIKAA IKEESEGKLK GILGYTEDDV VSTDFVGDSR SSIFDAKAGI g3p_clapu GATYDEIKAT VKEAANGSLA GILGYTEDDI VSSDMNGNTN SSIFDAKAGI g3pc_orysa AASYDAIKSA IKSASEGKLK GIIGYVEEDL VSTDFVGDSR SSIFDAKAGI g3pc_petcr AATYDEIKAA IKHESETSLK GILGYTEDDV VSTDFVGDSR SSIFDAKAGI g3pc_chocr ETTYEDIKAT MKAAADDSMK GIMKYTEDAV VSTDFIHDDA SCIFDASAGI g3p_serod PASYKEVCAA IKEAAEGELK GVLGYTEDDV VSTDFNGEKL TSVFDAKAGI g3pc_leime PATYKDICAA IKAAAEGEMK GILGYTDDEV VSSDFNGVAL TSVFDVKAGI g3pc_taxba SATYGEIKAA IKEESEGKLK GILGYTEDDV VSTDFIGDSR SSIFDAKAGI g3p2_triko GASYEEITET IKKAADGPLK GVLAYTGDDV VSSDMLGNTN SSIFDIKAGI g3pc_magli AASYEEIKAV IKAESEGKLK GILGYTEEDV VSTDFIGDNR SSIFDAKAGI g3p3_yeast ETTYDEIKKV VKAAAEGKLK GVLGYTEDAV VSSDFLGDSH SSIFDASAGI g3p2_agabi PASYEQIKEV MRKAAEGEYK GIIAYTDEDV VSTDFISDNN SCVFDAKAGI g3pc_crapl EASYDEIKAA IKEESEGKLK GILGYTEEDV VSSDFVGDSR SSIFDAKAGI g3p_klepn AASYEEIKKA IKAASEGAMK GVLGYTEDDV VSTDFNGEVC TSVFDAKAGI g3p2_yeast ETTYDEIKKV VKAAAEGKLK GVLGYTEDAV VSSDFLGDSN SSIFDAAAGI g3pc_trybb PATYKQICDA IKAASEGELK GILGYVDEEI VSSDINGIPL TSVFDARAGI g3p1_escvu AASYEEIKKA IKAASEGAMK GVLGYTEDDV VSTDFNGEVC TSVFDAKAGI g3p_haein PASYDAIKQA IKDAAEgeLK GVLGYTEDAV VSTDFNGCAL TSVFDADAGI g3p_esche AASYEDIKKA IKAAAEGEMK GVLGYTEDDV VSTDFNGEVC TSVFDAKAGI g3p_klula EATYDEIKAA VKKASQGKLK NVVGYTEDSV VSSDFLGDTH STIFDASAGI g3p_triha AASYEEITSA IKKAADGELK GIMAYTSDEV VSTDMLGNNN SSIFDIKAGI g3p1_yeast EATYDQIKKA VKAAAEGPMK GVLGYTEDAV VSSDFLGDTH ASIFDASAGI g3p2_kluma PATYEEIKAV 
VKKASENELK GVMGYTEDAV VSSDFLGDTH SSIFDAAAGI g3p_citfr AASYEEIKKA IKAASEGPMK GVLGYTEDDV VSTDFNGEVC TSVFDAKAGI g3pc_chlre ATTYEDIMKA LKEASEGEMK GVLAYTDEDV VSSDFVTDPA SCTVDAKAGI g3p_entae AASYEEIKKA IKAASEGPMK GVLGYTEDDV VSTDFNGEVC TSVFDAKAGI g3p_escbl PATYEEIKKA MKAASEGAMK GVLGYTEDDV VSTDFNGETC TSVFDAKAGI g3p_zygro ETSYDEIKAA LKKASEGSMK GILGYTEDDV VSSDFLGDAH SSIVDAAAGI g3p1_syny3 ATSYEEICAA MKTAAEGELK GILGYTADDV VSMDFRTDPR SSIFDAGAGI g3p1_salty AATYEQIKAA VKAAAEGEMK GVLGYTEDDV VSTDFNGEVC TSVFDAKAGI g3p1_agabi ETSYDDVKKA MRDAADGKhk GIVDYTEEDV VSTDFVGSNY SMIFDAKAGI g3p1_anava ATSYKEICAA MKQASEGSLA GILGYTDEEV VSTDFQGDTH SSIFDAGAGI g3p_bacfr AATMAEINAA MKEASEGELK GILGYTEDAV VSTDFRGCAN TSIYDSKAGI g3p_bucap TATYNEICEV VKNASEQKMK GILGYTEDEV VSTDFNGKEL TSIFDAKAGL g3p1_giala DATYDEICAE IKRGSENELK GIMTYTNEDV VSSDFLSTTS TCNFDSKAGI g3p_burso ......ICAE MKAQSEGALK GVLGYTEDKV VATDFRGDAR TSIFDAEAGI g3pg_trybb DTSIKEIDAA LKRASKTYMK NILGYTDEEL VSADFISDSR SSIYDSKATL g3p_chltr ATTYEAICEA VKHAANTSMK NIMYYTEEAV VSSDFIGCEY SSIFDAQAGV g3pg_trycr DTSIQEIDAA LKRASKTYMK GILGYTDEEL VSADFINDNR SSIYDSKATL g3p1_bacsu EVTAEEVNAA LKEAAEGDLK GILGYSEEPL VSGDYNGNKN SSTIDALSTM g3p_bacme EVTVEDVNNA LKEAAEGDLK GILGYSEEPL VSGDYNGNIN SSTIDALSTM g3p_bacst EVTVEEVNAA LKAAAEGELK GILAYSEEPL VSRNYNGSTV SSTIDALSTM g3pg_leime DTSIQEIDKA IKKAAQTYMK GILGFTDEEL VSADFINDNR SSVYDSKATL g3p_borbu KdtKEEINSV LRKASETplK GILGYTEDPI VSSDIKGNSH SSIVDGLETM g3p_borhe KdtKEEINSV LKKASETplN GILGYTEDPX VSSDXKGNSH .......... 
g3p_theaq     EVTAEEVNAA LKAAAEGPLK GILAYTEDEI VLQDIVMDPH SSIVDAKLTK
g3pa_sinal    KTFAEEVNAA FRDAAEKELK GILDVCDEPL VSVDFRCSDV SSTIDSSLTM
g3p_mycle     CVSVNEINAV FQDAAEGRLK GILKYVDAPI VSSDIVTDPH SSIFDSGLTK
g3p_myctu     RASVDEINAA FKAAAEGRLK GILKYYDAPI VSSDIVTDPH SSIFDSGLTK
g3pp_alceu    PTTVEEVNGI LKAAAEGELK GILDYNTAPL VSVDFNHNPA SSTFDA.TLT
g3p_bacco     EVTVEEVNAA LKAIAEGELK GILAYSIEPL VIRNYNGSTV SSTIDILSTM
g3pc_alceu    PTTVEEVNGI LKAAAEGELK GILDYNTAPL VSVDFNHNPA SSTFDA.TLT
g3p_thema     ETTVEEVNAV MKEATEGRLK GIIGYNDEPI VSSDIIGTTF SGIFDATITN
g3p_clopa     NVTVEEINAA MKEAA....N ESFGYTEDEI VSADVVGISY GSLFDATLTK
g3p_strpy     NVSVDEINSA MKAAS....N DSFGYTEDPI VSSDIVGVSY GSLFDATQTK
g3pa_maize    KTLAEEVNQA FRDAAANELT GILEVCDVPL VSVDFRCSDV SSTIDASLTM
g3pb_arath    KgtAEDVNEA FRKAANGPMK GILDVCDAPL VSVDFRCSDV STTIDSSLTM
g3pa_spiol    KTFAEEVNAA FRESADNELK GILSVCDEPL VSIDFRCTDV SSTIDSSLTM
g3p_pseae     DTSVDEVNRL LREASKG..S PVLGYNTQPL VSVDFNHDPR SSIFDA.NHT
g3pa_chlre    KTFAEEVNAA FREAANGPMK GVLHVEDAPL VSIDFKCTDQ STSIDASLTM
g3pb_pea      KgsAEDVNAA FRKAAEGPLK GILDVCDVPL VSVDFRCSDV STTIDSSLTM
g3p2_anava    RTITEEVNQA LKDASEGPLK GILDYSELQL VSSDYQGTDA SSIVDASLTL
g3p2_syny3    NTIAEQVNGV LKEAANTSLK GVLEYTDLEL VSSDFRGTDC SSTVDGSLTM
g3p_zymmo     DTTAEEINSV LKAAAdgDMT GVLGYTDEPL VSRDFYSDPH SSTVDSRETA
g3p_mycpn     DATVEQINQA MKKAAS.... ASFRYCEDEI VSSDIVGSEH GSIFDSKLTN
g3p_streq     NVSVDEINAA MKAAS....N DSFGYTEDPI VSSDIVGVSY GSLFDATQTK
g3pb_spiol    VgtAEDVNNA FRKAAAGPLK GVLDVCDIPL VSVDFRCSDF SSTIDSSLTM
g3p_strae     GTSVEEVKEA YAAAASGPYK GLLSYVDAPL VSTDIVG.DP ASLFDAGLT.
g3pa_arath    KTFAEEVNAA FRDSAEKELK GILDVCDEPL VSVDFRCSDF STTIDSSLTM
g3p_xanfl     ATSKEEINEA IIAAASQQLK GILGFTDQPN VSIDFNHNPN SSTFHLDQTK
g3pb_tobac    KgtAEDVNAA FRKAADGPLK GVLAVCDEPL VSVDFRCSDV SSTIDSSLTM
g3p_corgl     EVTVESINAA IKEAAVGEFG ETLAYSEEPL VSTDIVHDSH GSIFDAGLT.
g3p1_anasp    .......... .......... .......... .......... ..........
g3p2_rhosh    DTTVEEVNAA IEAAACGPLK GVLGFTTEPN VSSDFNHDPH SSVFHMDQTK
g3pa_grave    KTFKEEVNGA LLKAANGSMK GIIKYSDEPL VSCDYRGTDE STIIDSSLTM
g3pa_tobac    KTFAEEVNAA FREAADKELK GILDVCDEPL VSVDFRCSDV SSTVDASLTM
g3p_halva     DVTESDVNAA FEDAAAGELE GVLGVTSDDV VSSDILGDPY STQVDLQSTN
g3pa_pea      KTFAEEVNEA FRESAAKELT GILSVCDEPL VSVDFRCTDV SSTVDSSLTM
g3pa_chocr    KTFKEEVNGA LLKASEGAMK GIIKYSDEPL VSCDHRGTDE STIIDSSLTM
g3p3_ecoli    VT.AEEVNNA LKQATTN..N ESFGYTDEEI VSSDIIGSHF GSVFDATQTE
g3p_mycge     SPSVEQVNQA MKRFAS.... ASFKYCEDPI VSSDVVSSEY GSIFDSKLTN
g3p2_bacsu    DVTAEEVNEA FKRAAKTSMY GVLDYSDEPL VSTDYNTNPH SAVIDGLTTM
g3p_strau     DVTRDEVLDA YRAAAQGPLA GVLEYSEDPL VSSDITGNPA SSIFDS.ELT
g3p_lacla     HVTVDEINEA MKAAAD.... ESFGYNVDEI VSSDIIGMAY GSLFDAttDL
g3p_helpy     KAPKDPINDL LIEASKGVLK GVLEIDLKER VSSDFISNPH SVIIAPDLTF
g3p3_anava    PTTVEEINAL LKAASeaPLQ GILGYEERPL VSIDYKDDPR SSIIDALSTM
e4pd_ecoli    PVKANEVNLL LQKAAQGAFH GIVDYTELPL VSVDFNHDPH SAIVDGTQTR

              301                               334
predict_h263  ELNDTFVKLV SWYDNEFGYS ERVVDLMAHM ASKE
g3p1_human    ELNDTFVKLV SWYDNEFGYS ERVVDLMAHM ASKE
g3p_canfa     .......... .......... .......... ....
g3p_pig       ALNDHFVKLI SWYDNEFGYS NRVVDLMVHM ASKE
g3p_cavpo     .......... .......... .......... ....
g3p_rabit     ALNDHFVKLI SWYDNEFGYS NRVVDLMVHM ASKE
g3p_bovin     ALNDHFVKLI SWYDNEFGYS KQ........ ....
g3p2_human    ALNDHFVKLI SWYDNEFGYS NRVVDLMAHM ASKE
g3p_crigr     ALNDNFVKLI SWYDNEFGYS NRVVDLMAYM ASKE
g3p_sheep     ALNDHFVKLI SWYDNEFGYS NRVVDLMVHM ASKE
g3p_mouse     ALNDNFVKLI SWYDNEYGYS NRVVDLMAYM ASKE
g3p_mesau     ALNDNFVKLI SWYDNEFGYS NRV....... ....
g3p_cotja     ALNDNFVKLV SWYDNEFGYS NRVVDLMVHM ASKE
g3p_rat       ALNDNIVKLI SWYDNEYGYS NRVVDLMAYM ASKE
g3p1_jacor    ALNDHFVKLV SWYDNEFGYS NRVVDLMVHM ASKE
g3p_chick     ALNDHFVKLV SWYDNEFGYS NRVVDLMVHM ASKE
g3p_xenla     ALNENFVKLV SWYDNECGYS NRVVDLVCHM ASKE
g3p2_jacor    .......... .......... .......... ....
g3p_homam     QLSKTFVKVV SWYDNEFGYS QRVIDLLKHM QKVD
g3p3_caebr    ILNPNFVKLI SWYDNEFGYS NRVVDLISYI ATKA
g3p1_drome    SLNDKFVKLI SWYDNEFGYS NRVIDLIKYM QSKD
g3p2_caebr    ILNPNFVKLI SWYDNEFGYS NRVVDLISYI ATKA
g3p2_drome    SLNDKFVKLI SWYDNEFGYS NRVIDLIKYM QSKD
g3p3_caeel    SLNPHFVKLV SWYDNEFGYS NRVVDLISYI ATKA
g3p_drohy     SLNDKFVKLI SWYDNEFGYS NRVIDLIKYM QSKD
g3p2_caeel    SLNPHFVKLV SWYDNEFGYS NRVVDLISYI ATKA
g3p_schpo     QLSPQFVKLV SWYDNEWGYS RRVVDLVAYT AAKD
g3p_schma     SLNNNFVKLV SWYDNEFGYS CRVVDLITHM HKVD
g3p_bruma     SLNPNFVKLI AWYDNEYGYS NRVVDLISYI ASR.
g3p_lacdt     QLNKNFAKLI S......... .......... ....
g3p_podan     SLNDNFVKLV SWYDNEWGYS RRVLDLLSYV AKYD
g3p_ustma     SLNNNFVKLV SWYDNEWGYS NRCLDLLVFM AQKD
g3p_pharh     SLNGNFVKLV SWYDNEWGYS ARVCDLVSYI AAQD
g3p1_caeel    SLNPNFVKLV SWYDNEYGYS NRVVDLIGYI ATRG
g3p_aspng     ALNSNFVKLV SWYDNEWGYS RRVVDLIAYI SKVD
g3p4_caeel    SLNPNFVKLV SWYDNEYGYS NRVVDLIGYI ATRG
g3p_boled     MLNDSFVKLI A......... .......... ....
g3p_coche     SLNKNFVKLV SWYDNEWGYS RRVLDLLVYI AKID
g3pc_ginbi    ALSDNFVKLV SWYDNEWGYS SRVIDLIVHM ASTV
g3p_amamu     QLNANFVKLI A......... .......... ....
g3pc_phypa    ALSDTFVKLV AWYDNEWGYS NRVVDLIVHM AKQG
g3p_curlu     SLNKNFVKLV SWYDNEWGYS RRVLDLLVYI AKID
g3pc_ranac    ALNDNCVKLV SWYDNEWGYS SRVVDLIVHM SKTQ
g3px_horvu    ALNDHFVKLV SWYDNEWGYS NRVVDLIRHM AKTQ
g3p_schco     AISKSFVKLI AWYDNEWGYS RRVCDLLVYA AKQD
g3p_crypa     SLNDHFVKLV SWYDNEWGYS RRVLDLISHV AKVD
g3pc_maize    ALNDHFVKLV SWYDNEWGYS NRVVDLIRHM FKTQ
g3p_phach     ALNKTFVKLI SWYDNEWGYS RRCCDLLGYA AKVD
g3p_lyosh     QLNPNFVKLI VWYDNEWGYS ARVCDLLVFA AEQD
g3pt_mouse    ALNDNFVKLV AWYDNEYGYS NRVVDLLRYM FSRE
g3p_neucr     SLNKNFVKLV SWYDNEWGYS RRVLDLISYI SKVD
g3p_emeni     ALNSNFIKLV SWYDNEWGYS RRVVDLISYI SKVD
g3p_colgl     SLNNNFVKLV SWYDNEWGYS RRVLDLLAHV AKVD
g3pc_sinal    ALSDNFVKLV SWYDNEWGYS TRVVDLIIHM SKA.
g3pc_pethy    ALSKNFVKLV SWYDNEMGYS TRVVDLIKHI ASVE
g3pc_pinsy    ALSDNFVKLV SWYDNEWGYS SRVVDLIVHM AATQ
g3p_serma     ALNDNFVKLV .......... .......... ....
g3p_canal     LLSPTFVKLI SWYDNEYGYS TKVVDLLEHV A...
g3pc_arath    ALSDKFVKLV SWYDNEWGYS SRVVDLIVHM SKA.
g3p1_triko ALNKNFVKVM SWYDNEYGYS RRVVDLIVYV SKKD g3p_atrnu CLNGNFVKLV SWYDNEWGYS SRVVDLIRHM SKTT g3p_picpa QLTPSFVKLI SWYDNEYGYS TRVVDLLQHV AKA. g3p_colln SLNNNFVKLV SWYDNEWGYS RRVLDLLAHV AKVD g3pc_tobac ALSKNFVKLV SWYDNEWGYS SRVIDLICHM ASVA g3p_monan ALNSNFVKLF SWYDNEWGYS RRVIDLIAYA QVDA g3p_escfe ALNDNFVKLV .......... .......... .... g3pc_pea ALNDKFVKLV SWYDNELGYS TRVVDLIVHI AKQL g3pc_mescr SLNDNFVKLV SWYDNEWGYS TRVVDLIMHI SKCQ g3pc_grave MLNSRFCKLV AWYDNEWGYS NRVVDLIAHV AKLQ g3p_erygr SLNDNFVKLV AWYDNEWAYS RRVIDLISYI AGKD g3pc_diaca ALNDNFIKLV SWYDNEWGYS TRVVDLIAHI HKT. g3pc_horvu ALNDNFVKLV SWYDNEWGYS TRVVDLIRHM HSTK g3p_phyin ALTDDFVKLV SWYDNEWGYS SRVLDLIEHM VKNE g3p1_ecoli ALNDNFVKLV SWYDNETGYS NKVLDLIAHI SK.. g3pc_antma ALNDNFVKLV SWYDNEWGYS TRVVDLIVHM ASVQ g3p_clapu SLNKNFVKLI AWYDNEWGYS RRVLDLLAYV AKAD g3pc_orysa ALNDNFVKLV AWYDNEWGYS NRVIDLIRHM AKTQ g3pc_petcr ALNGNFVKVV SWYDNEWGYS NRVIDLIRHM ASVA g3pc_chocr MLNSKFCKLV AWYDNEWGYS NRVVDLIAHI SKVQ g3p_serod ALNDNFVKLV .......... .......... .... g3pc_leime SLNDHFVKLV SWYDNETGYS HKVLDLILHT SAR. g3pc_taxba ALNDNFVKLV SWYDNEWGYS SRVIDLIVHM DSTA g3p2_triko SLNKNFVKLV SWYDNEWGYS RRVLDLLAHV AKVD g3pc_magli ALNEHFVKLV SWYDNEWGYS SRVIDLILIV HMAS g3p3_yeast QLSPKFVKLV SWYDNEYGYS TRVVDLVEHV AKA. g3p2_agabi QLSPNFVKLI AWYDNEWGYS RRVCNLLQYV AKED g3pc_crapl ALSKKFVKIV AWYDNEWGYS SRVVDLIRHM AAAK g3p_klepn ALNDNFVKLV .......... .......... .... g3p2_yeast QLSPKFVKLV SWYDNEYGYS TRVVDLVEHV AKA. g3pc_trybb SLNDNFVKLV SWYDNETGYS NKVHDLIAHI TK.. g3p1_escvu ALNDNFVKLV .......... .......... .... g3p_haein ALTDSFVKLV SWYDNETGYS NKVLDLVAHI YNYK g3p_esche ALNDNFVKLV .......... .......... .... g3p_klula QLSPKFVKVV AWYDNEYGYS ERVVDLVEHV A... g3p_triha SLNPNFVKLV SWYDNEWGYS RRVLDLLEHV AKVD g3p1_yeast QLSPKFVKLI SWYDNEYGYS ARVVDLIEYV AKA. g3p2_kluma QLSPQFVKLV SWYDNEFGYS TRVVDLVELV AKN. g3p_citfr ALNDNFVKLV .......... .......... .... g3pc_chlre MLSPTFVKLV AWYDNEWGYS NRVVDLALHV AKKA g3p_entae ALNDNFVKLV .......... .......... .... 
g3p_escbl ALNDNFVKLV .......... .......... .... g3p_zygro QLTPTFVKLV SWYDNEFGYS TRVVDLVEHV AKSA g3p1_syny3 GLNSNFFKVV SWYDNEWGYS CRVIDLMLTM ASKD g3p1_salty ALNDNFVKLV .......... .......... .... g3p1_agabi ALNSRFMKLV AWYDNEWGYA RRVCDEVVYV AKKN g3p1_anava ELNSNFFKVV AWYDNEWGYS NRVVDLMLS. .... g3p_bacfr SLDSNFAKVV S......... .......... .... g3p_bucap SLNKNFAKLI SWYDNETGYS SKVLDLVELV ALK. g3p1_giala MLNSRFVKLV AWYDNEFGYA NKLVELAKYV GSKG g3p_burso ALDGTFIKVV SWYDNEWGYS NKVLEMARVV AK.. g3pg_trybb QNNlrFFKIV SWYDNEWGYS HRVVDLVRHM AARD g3p_chltr ALNDRFFKLV AWYDNEIGYA TRIVDLLEYV QENS g3pg_trycr QNNlrFFKIV SWYDNEWGYS HRVVDLVRHM ASKD g3p1_bacsu VMEGSMVKVI SWYDNESGYS NRVVDLAAYI AKKG g3p_bacme VMEGNMVKVI SWYDNESGYS NRVVDLAQYI AAKG g3p_bacst VIDGKMVKVV SWYDNETGYS HRVVDLAAYI NAKG g3pg_leime QNnkRFFKVV SWYDNEWAYS HRVVDLVRYM AAKD g3p_borbu VLENGFAKIL SWYDNEFGYS TRVVDLAQKL VK.. g3p_borhe .......... .......... .......... .... g3p_theaq ALGN.MVKVF AWYDNEWGYA NRVADLVELV LRKG g3pa_sinal VMGDDMVKVI AWYDNEWGYS QRVVDLADIV ANNW g3p_mycle VIA.SQAKVV SWYDNEWGYS NRLVDLVGLV GKSL g3p_myctu VIDDQ.AKVV SWYDNEWGYS NRLVDLVTLV GKSL g3pp_alceu KVNGTLVKVS AWYDNEWGFS NRMLDTAVAL AHAR g3p_bacco VIDGAMVKVV SWYDNETGYS HRVVALAAYI NAKG g3pc_alceu KVNGTLVKVS AWYDNEWGFS NRMLDTAVAL AHAR g3p_thema VIGGKLVKVA SWYDNEYGYS NRVVDTLELL LKM. g3p_clopa IVdsQLVKTV SWYDNEMSYT SQLVRTLEYF AKIA g3p_strpy VMesQLVKVV SWYDNEMSYT AQLVRTLEYF AKIA g3pa_maize VMGDDMVKVI SWYDNEWGYS QRVVDLADIC ANQW g3pb_arath VMGDDMVKVV AWYDNEWGYS QRVVDLAHLV ASKW g3pa_spiol VMGDDMVKVI AWYDNEWGYS QRVVDLADIV ANKW g3p_pseae KVSGRLVKAM AWYDNEWGFS NRMLDSALAL AAAR g3pa_chlre VMGDDMVKVV AWYDNEWGYS QRVVDLAEVT AKKW g3pb_pea VMGDDMVKVV AWYDNEWGYS QRVVDLAHLV ANKW g3p2_anava VMGNDLVKVM AWYDNEWGYS QRVLDLAELV AEKW g3p2_syny3 VMGGDMVKVI AWYDNEWGYS QRVVDLAEIV AKNW g3p_zymmo VLEGKLARVV AWYDNEWGFS NRMVDTAAQM AKTL g3p_mycpn IIenKLYKVY AWYDNESSYV NQLVRVVNYC AKL. 
g3p_streq VMesQLVKVV SWYDNEMSYT AQLVRTLEYF AKIA g3pb_spiol VMGGDMVKVV AWYDNEWGYS QRVVDLADLV ANKW g3p_strae RVCGPQVKVV GWYDNEWGYS NRLIDLATLI GSSL g3pa_arath VMGDDMVKVI AWYDNEWGYS QRVVDLADIV ANNW g3p_xanfl VMEGTLVRVL SWYDNEWGFS NRMSDTAVAM GKLG g3pb_tobac VMGDDMVKVV AWYDNEWGYS QRVVDLAHLV ANNW g3p_corgl KVSGNTVKVV SWYDNEWGYT CQLLRLTELV ASQA g3p1_anasp .......... .......... .......... .... g3p2_rhosh VMEGRMVRIL SWYDNEWGFS NRMADTAVAM GRLL g3pa_grave VMGDDMLKVV AWYDNEWGYS QRVVDLGEVM ASQW g3pa_tobac VMGDDMVKVI AWYDNEWGYS QRVVDLADIV ANQW g3p_halva VVSG.MTKIL TWYDNEYGFS NRMLDVAEYI TE.. g3pa_pea VMGDDLVKVI AWYDNEWGYS QRVVDLADIV ANNW g3pa_chocr VMGDDMIKVV AWYDNEWGYS QRVVDLGEVM ARQW g3p3_ecoli ItdLQLVKTV AWYDNEYGFV TQLIRTLEKF AKL. g3p_mycge IVemKLYKVY AWYDNESSYV HQLVRVVSYC AKL. g3p2_bacsu VMEDRKVKVL AWYDNEWGYS CRVVDLIRHV AARM g3p_strau RVDGRHIKVV AWYDNEWGFS NRVIDTLQLL AAR. g3p_lacla KDGGQLVKTA AWYDNEMSFT AQLIRTLEYF AKIA g3p_helpy TLEN.MVKIM GWYDNEWGYS NRLVDMAQFM YHY. g3p3_anava VVDETQVKIL AWYDNEWGYV NRMVELARKV ALSL e4pd_ecoli VSGAHLIKTL VWCDNEWGFA NRMLDTTLAM ATVA ________________________________________________________________________________ Prediction of: - secondary structure, by PHDsec - solvent accessibility, by PHDacc PHD: Profile fed neural network systems from HeiDelberg ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Author: Burkhard Rost EMBL, Heidelberg, FRG Meyerhofstrasse 1, 69 117 Heidelberg Internet: Predict-Help@EMBL-Heidelberg.DE All rights reserved. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Secondary structure prediction by PHDsec: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Author: Burkhard Rost EMBL, Heidelberg, FRG Meyerhofstrasse 1, 69 117 Heidelberg Internet: Rost@EMBL-Heidelberg.DE All rights reserved. About the network method ~~~~~~~~~~~~~~~~~~~~~~~ The network procedure is described in detail in: 1) Rost, Burkhard; Sander, Chris: Prediction of protein structure at better than 70% accuracy. J. Mol. Biol., 1993, 232, 584-599. 
A brief description is given in:
   Rost, Burkhard; Sander, Chris: Improved prediction of protein
   secondary structure by use of sequence profiles and neural
   networks. Proc. Natl. Acad. Sci. U.S.A., 1993, 90, 7558-7562.

The PHD mail server is described in:
2) Rost, Burkhard; Sander, Chris; Schneider, Reinhard: PHD - an
   automatic mail server for protein secondary structure prediction.
   CABIOS, 1994, 10, 53-60.

The latest improvement steps (up to 72%) are explained in:
3) Rost, Burkhard; Sander, Chris: Combining evolutionary information
   and neural networks to predict protein secondary structure.
   Proteins, 1994, 19, 55-72.

To be quoted for publications of PHD output:
   Papers 1-3 for the prediction of secondary structure and the
   prediction server.

About the input to the network
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The prediction is performed by a system of neural networks. The input
is a multiple sequence alignment. It is taken from an HSSP file
(produced by the program MaxHom: Sander, Chris & Schneider, Reinhard:
Database of Homology-Derived Structures and the Structural Meaning of
Sequence Alignment. Proteins, 1991, 9, 56-68).

For optimal results the alignment should contain sequences with varying
degrees of sequence similarity relative to the input protein. The
following is an ideal situation:

+-----------------+----------------------+
| sequence:       | sequence identity    |
+-----------------+----------------------+
| target sequence | 100 %                |
| aligned seq. 1  |  90 %                |
| aligned seq. 2  |  80 %                |
| ...             | ...                  |
| aligned seq. 7  |  30 %                |
+-----------------+----------------------+

Estimated Accuracy of Prediction
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A careful cross validation test on some 250 protein chains (in total
about 55,000 residues) with less than 25% pairwise sequence identity
gave the following results:

++================++-----------------------------------------+
|| Qtotal = 72.1% || ("overall three state accuracy")        |
++================++-----------------------------------------+

+----------------------------+-----------------------------+
| Qhelix (% of observed)=70% | Qhelix (% of predicted)=77% |
| Qstrand(% of observed)=62% | Qstrand(% of predicted)=64% |
| Qloop  (% of observed)=79% | Qloop  (% of predicted)=72% |
+----------------------------+-----------------------------+

..........................................................................

These percentages are defined by:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

|                     number of correctly predicted residues
|Qtotal =             --------------------------------------- (*100)
|                     number of all residues
|
|                     no of res correctly predicted to be in helix
|Qhelix (% of obs) =  -------------------------------------------- (*100)
|                     no of all res observed to be in helix
|
|                     no of res correctly predicted to be in helix
|Qhelix (% of pred)=  -------------------------------------------- (*100)
|                     no of all residues predicted to be in helix

..........................................................................

Averaging over single chains
~~~~~~~~~~~~~~~~~~~~~~~~~~~

The most reasonable way to compute the overall accuracies is the above
quoted percentage of correctly predicted residues. However, since the
user is mainly interested in the expected performance of the prediction
for a particular protein, the mean value when averaging over protein
chains might be of help as well.
Computing first the three state accuracy for each protein chain, and then averaging over 250 chains yields the following average: +-------------------------------====--+ | Qtotal/averaged over chains = 72.2% | +-------------------------------====--+ | standard deviation = 9.3% | +-------------------------------------+ .......................................................................... Further measures of performance ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Matthews correlation coefficient: +---------------------------------------------+ | Chelix = 0.63, Cstrand = 0.53, Cloop = 0.52 | +---------------------------------------------+ .......................................................................... Average length of predicted secondary structure segments: . +------------+----------+ . | predicted | observed | +-----------+------------+----------+ | Lhelix = | 10.3 | 9.3 | | Lstrand = | 5.0 | 5.3 | | Lloop = | 7.2 | 5.9 | +-----------+------------+----------+ .......................................................................... The accuracy matrix in detail: +---------------------------------------+ | number of residues with H, E, L | +---------+------+------+------+--------+ | |net H |net E |net L |sum obs | +---------+------+------+------+--------+ | obs H |12447 | 1255 | 3990 | 17692 | | obs E | 949 | 7493 | 3750 | 12192 | | obs L | 2604 | 2875 |19962 | 25441 | +---------+------+------+------+--------+ | sum Net |16000 |11623 |27702 | 55325 | +---------+------+------+------+--------+ Note: This table is to be read in the following manner: 12447 of all residues predicted to be in helix, were observed to be in helix, 949 however belong to observed strands, 2604 to observed loop regions. The term "observed" refers to the DSSP assignment of secondary structure calculated from 3D coordinates of experimentally determined structures (Dictionary of Secondary Structure of Proteins: Kabsch & Sander (1983) Biopolymers, 22, 2577-2637). 
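
As a cross-check, the summary figures quoted in this section can be
recomputed from the accuracy matrix. The following is a minimal Python
sketch and not part of the PHD server: the function names are
illustrative, and the Matthews coefficient uses the standard
one-state-versus-rest formula, which is assumed (not confirmed by this
report) to be the definition behind Chelix, Cstrand and Cloop.

```python
import math

# Rows: observed (DSSP) H, E, L; columns: network prediction H, E, L.
# Counts are copied from the accuracy matrix in this report.
MATRIX = [
    [12447, 1255, 3990],   # obs H
    [949, 7493, 3750],     # obs E
    [2604, 2875, 19962],   # obs L
]
STATES = "HEL"


def q_total(m):
    """Overall three-state accuracy in percent."""
    correct = sum(m[i][i] for i in range(len(m)))
    total = sum(sum(row) for row in m)
    return 100.0 * correct / total


def q_observed(m, i):
    """Percent of residues observed in state i that were predicted correctly."""
    return 100.0 * m[i][i] / sum(m[i])


def q_predicted(m, i):
    """Percent of residues predicted in state i that are correct."""
    return 100.0 * m[i][i] / sum(row[i] for row in m)


def matthews(m, i):
    """Matthews correlation for state i against all other states (assumed definition)."""
    total = sum(sum(row) for row in m)
    tp = m[i][i]
    fn = sum(m[i]) - tp
    fp = sum(row[i] for row in m) - tp
    tn = total - tp - fn - fp
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    print("Qtotal = %.1f%%" % q_total(MATRIX))
    for i, state in enumerate(STATES):
        print(state, "%.1f %.1f %.2f" % (
            q_observed(MATRIX, i), q_predicted(MATRIX, i), matthews(MATRIX, i)))
```

Small differences from the rounded percentages printed in the tables
are possible, since the report gives only rounded values.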
Position-specific reliability index ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The network predicts the three secondary structure types using real numbers from the output units. The prediction is assigned by choosing the maximal unit ("winner takes all"). However, the real numbers contain additional information. E.g. the difference between the maximal and the second largest output unit can be used to derive a "reliability index". This index is given for each residue along with the prediction. The index is scaled to have values between 0 (lowest reliability), and 9 (highest). The accuracies (Qtot) to be expected for residues with values above a particular value of the index are given below as well as the fraction of such residues (%res).: +------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+ | index| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | | %res |100.0| 99.2| 90.4| 80.9| 71.6| 62.5| 52.8| 42.3| 29.8| 14.1| +------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+ | | | | | | | | | | | | | Qtot | 72.1| 72.3| 74.8| 77.7| 80.3| 82.9| 85.7| 88.5| 91.1| 94.2| | | | | | | | | | | | | +------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+ | H%obs| 70.4| 70.6| 73.7| 77.1| 80.1| 83.1| 86.0| 89.3| 92.5| 96.4| | E%obs| 61.5| 61.7| 63.7| 66.6| 69.1| 71.7| 74.6| 77.0| 77.8| 68.1| | | | | | | | | | | | | | H%prd| 77.8| 78.0| 80.0| 82.6| 84.7| 86.9| 89.2| 91.3| 93.1| 95.4| | E%prd| 64.5| 64.7| 67.8| 71.0| 74.2| 77.6| 81.4| 85.1| 89.8| 93.5| +------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+ The above table gives the cumulative results, e.g. 62.5% of all residues have a reliability of at least 5. The overall three-state accuracy for this subset of almost two thirds of all residues is 82.9%. For this subset, e.g., 83.1% of the observed helices are correctly predicted, and 86.9% of all residues predicted to be in helix are correct. .......................................................................... 
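
The "winner takes all" assignment and the reliability index can be
illustrated with a short Python sketch. The report states only that the
index is derived from the difference between the two largest output
units and scaled to 0-9; the exact scaling used below (integer part of
10 times the difference, capped at 9) is an assumption made for
illustration, and the function name is hypothetical.

```python
def predict_with_reliability(outputs, states="HEL"):
    """Return (predicted state, reliability index) for one residue.

    outputs: real-valued network outputs, one per state, each in 0..1.
    The scaling of the index is an assumption, not taken from the report.
    """
    # Rank the output units from largest to smallest value.
    ranked = sorted(range(len(outputs)), key=lambda i: outputs[i], reverse=True)
    best, second = ranked[0], ranked[1]
    # Winner takes all: the state of the maximal unit is the prediction.
    difference = outputs[best] - outputs[second]
    # Assumed scaling of the difference onto the range 0 (lowest) to 9 (highest).
    reliability = min(9, max(0, int(10 * difference)))
    return states[best], reliability


if __name__ == "__main__":
    print(predict_with_reliability([0.5, 0.25, 0.25]))
    print(predict_with_reliability([0.0, 1.0, 0.0]))
```

A clear winner gives a high index, while two nearly equal output units
give an index close to 0, which is why restricting the prediction to
high-index residues raises the expected accuracy.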
The following table gives the non-cumulative quantities, i.e. the values
per reliability index range. These numbers answer the question: how
reliable is the prediction for all residues labeled with the particular
index i.

+------+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| index|  1  |  2  |  3  |  4  |  5  |  6  |  7  |  8  |  9  |
| %res |  8.8|  9.5|  9.3|  9.1|  9.7| 10.5| 12.5| 15.7| 14.1|
+------+-----+-----+-----+-----+-----+-----+-----+-----+-----+
|      |     |     |     |     |     |     |     |     |     |
| Qtot | 46.6| 50.6| 57.7| 62.6| 67.9| 74.2| 82.2| 88.3| 94.2|
|      |     |     |     |     |     |     |     |     |     |
+------+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| H%obs| 36.8| 42.3| 49.5| 55.2| 61.7| 69.9| 78.8| 87.4| 96.4|
| E%obs| 44.7| 44.5| 52.1| 55.4| 60.9| 68.0| 75.9| 81.0| 68.1|
|      |     |     |     |     |     |     |     |     |     |
| H%prd| 49.9| 52.5| 60.3| 64.2| 69.2| 77.5| 85.4| 89.9| 95.4|
| E%prd| 41.7| 47.1| 53.6| 57.0| 64.0| 71.6| 78.8| 88.8| 93.5|
+------+-----+-----+-----+-----+-----+-----+-----+-----+-----+

For example, for residues with Relindex = 5, 64% of all predicted
beta-strand residues are correctly identified.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Solvent accessibility prediction by PHDacc:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Author: Burkhard Rost
        EMBL, Heidelberg, FRG
        Meyerhofstrasse 1, 69 117 Heidelberg
        Internet: Rost@EMBL-Heidelberg.DE

All rights reserved.

About the network method
~~~~~~~~~~~~~~~~~~~~~~~

The network for prediction of secondary structure is described in
detail in:
   Rost, Burkhard; Sander, Chris: Prediction of protein structure at
   better than 70% accuracy. J. Mol. Biol., 1993, 232, 584-599.

The analysis of the prediction of solvent exposure is given in:
   Rost, Burkhard; Sander, Chris: Conservation and prediction of
   solvent accessibility in protein families. Proteins, 1994, 20,
   216-226.

To be quoted for publications of PHD exposure prediction:
   Both papers quoted above.
Definition of accessibility ~~~~~~~~~~~~~~~~~~~~~~~~~~ For training the residue solvent accessibility the DSSP (Dictionary of Secondary Structure of Proteins; Kabsch & Sander (1983) Biopolymers, 22, 2577-2637) values of accessible surface area have been used. The prediction provides values for the relative solvent accessibility. The normalisation is the following: | ACCESSIBILITY (from DSSP in Angstrom) |RELATIVE_ACCESSIBILITY = ------------------------------------- * 100 | MAXIMAL_ACC (amino acid type i) where MAXIMAL_ACC (i) is the maximal accessibility of amino acid type i. The maximal values are: +----+----+----+----+----+----+----+----+----+----+----+----+ | A | B | C | D | E | F | G | H | I | K | L | M | | 106| 160| 135| 163| 194| 197| 84| 184| 169| 205| 164| 188| +----+----+----+----+----+----+----+----+----+----+----+----+ | N | P | Q | R | S | T | V | W | X | Y | Z | | 157| 136| 198| 248| 130| 142| 142| 227| 180| 222| 196| +----+----+----+----+----+----+----+----+----+----+----+ Notation: one letter code for amino acid, B stands for D or N; Z stands for E or Q; and X stands for undetermined. The relative solvent accessibility can be used to estimate the number of water molecules (W) in contact with the residue: W = ACCESSIBILITY /10 The prediction is given in 10 states for relative accessibility, with RELATIVE_ACCESSIBILITY = (PREDICTED_ACC * PREDICTED_ACC) where PREDICTED_ACC = 0 - 9. Estimated Accuracy of Prediction ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ A careful cross validation test on some 238 protein chains (in total about 62,000 residues) with less than 25% pairwise sequence identity gave the following results: Correlation ........... The correlation between observed and predicted solvent accessibility is: ----------- corr = 0.53 ----------- This value ought to be compared to the worst and best case prediction scenario: random prediction (corr = 0.0) and homology modelling (corr = 0.66). 
(Note: homology modelling yields a relatively accurate prediction in 3D
if, and only if, a significantly identical sequence has a known 3D
structure.)

3-state accuracy
................

Often the relative accessibility is projected onto, e.g., 3 states:
   b = buried       (here defined as < 9% relative accessibility),
   i = intermediate ( 9% <= rel. acc. < 36% ),
   e = exposed      ( rel. acc. >= 36% ).

A projection onto 3 states or 2 states (buried/exposed) enables the
compilation of a 3- and 2-state prediction accuracy. PHD reaches an
overall 3-state accuracy of:

   Q3 = 57.5%

(compared to 35% for random prediction and 70% for homology modelling).

In detail:

+-----------------------------------+-------------------------+
| Qburied       (% of observed)=77% | Qb (% of predicted)=60% |
| Qintermediate (% of observed)= 9% | Qi (% of predicted)=44% |
| Qexposed      (% of observed)=78% | Qe (% of predicted)=56% |
+-----------------------------------+-------------------------+

10-state accuracy
.................

The network predicts relative solvent accessibility in 10 states, with
state i (i = 0-9) corresponding to a relative solvent accessibility of
i*i %. The 10-state accuracy of the network is:

   Q10 = 24.5%

..........................................................................

These percentages are defined by:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

|                      number of correctly predicted residues
|Q3 =                  --------------------------------------- (*100)
|                      number of all residues
|
|                      no of res. correctly predicted to be buried
|Qburied (% of obs) =  ------------------------------------------- (*100)
|                      no of all res. observed to be buried
|
|                      no of res. correctly predicted to be buried
|Qburied (% of pred)=  ------------------------------------------- (*100)
|                      no of all residues predicted to be buried

..........................................................................
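
The accessibility definitions in this section (normalisation by the
maximal value per amino acid, the 10 states with state i standing for
i*i %, and the projection onto 3 states) can be sketched in Python as
follows. This is an illustration under stated assumptions, not the
server's code: the function names are hypothetical, and the mapping
from a relative accessibility back to one of the 10 states (integer
part of the square root, capped at 9) is inferred from the statement
that state i corresponds to i*i %.

```python
import math

# Maximal accessibility per amino acid type, copied from the table in
# "Definition of accessibility" (B = D or N, Z = E or Q, X = undetermined).
MAX_ACC = {
    "A": 106, "B": 160, "C": 135, "D": 163, "E": 194, "F": 197,
    "G": 84, "H": 184, "I": 169, "K": 205, "L": 164, "M": 188,
    "N": 157, "P": 136, "Q": 198, "R": 248, "S": 130, "T": 142,
    "V": 142, "W": 227, "X": 180, "Y": 222, "Z": 196,
}


def relative_accessibility(amino_acid, accessibility):
    """Relative accessibility in percent from a DSSP accessibility value."""
    return 100.0 * accessibility / MAX_ACC[amino_acid]


def state_to_relative(state):
    """Relative accessibility (percent) represented by a predicted state 0-9."""
    return state * state


def relative_to_state(relative):
    """Assumed inverse mapping: state i covers i*i % up to (i+1)*(i+1) %."""
    return min(9, int(math.sqrt(relative)))


def three_state(relative):
    """Project a relative accessibility onto b (buried), i (intermediate), e (exposed)."""
    if relative < 9:
        return "b"
    if relative < 36:
        return "i"
    return "e"


if __name__ == "__main__":
    rel = relative_accessibility("G", 42)      # glycine with 42 units exposed
    print(rel, relative_to_state(rel), three_state(rel))
    print([three_state(state_to_relative(s)) for s in range(10)])
```

With these thresholds, predicted states 0-2 project onto buried, 3-5
onto intermediate and 6-9 onto exposed.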
Averaging over single chains
~~~~~~~~~~~~~~~~~~~~~~~~~~~

The most reasonable way to compute the overall accuracies is the above
quoted percentage of correctly predicted residues. However, since the
user is mainly interested in the expected performance of the prediction
for a particular protein, the mean value when averaging over protein
chains might be of help as well. Computing first the correlation between
observed and predicted accessibility for each protein chain, and then
averaging over all 238 chains yields the following average:

+-------------------------------====--+
| corr/averaged over chains =   0.53  |
+-------------------------------====--+
| standard deviation         =  0.11  |
+-------------------------------------+

..........................................................................

Further details of performance accuracy
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The accuracy matrix in detail:
..............................

-------+------------------------------------------------------+-------------
 \ PHD |     0    1    2    3    4    5     6     7    8    9 |    SUM  %obs
-------+------------------------------------------------------+-------------
 OBS 0 |  8611  140    8   44   82  169   772   334   27    0 |  10187  16.6
 OBS 1 |  4367  164    0   50  106  231   738   346   44    3 |   6049   9.8
 OBS 2 |  3194  168    1   68  125  303   951   513   42    7 |   5372   8.7
 OBS 3 |  2760  159    8   80  136  327  1246   746   58   19 |   5539   9.0
 OBS 4 |  2312  144    2   72  166  396  1615  1245  124   19 |   6095   9.9
 OBS 5 |  1873   96    3   84  138  425  1979  1834  187   27 |   6646  10.8
 OBS 6 |  1387   67    1   60   80  278  2237  2627  231   51 |   7019  11.4
 OBS 7 |  1082   35    0   32   56  225  1871  3107  302   60 |   6770  11.0
 OBS 8 |   660   25    0   27   43  136  1206  2374  325   87 |   4883   7.9
 OBS 9 |   325   20    2   27   29   74   648  1159  366  214 |   2864   4.7
-------+------------------------------------------------------+-------------
 SUM   | 26571 1018   25  544  961 2564 13263 14285 1706  487 |
 %pred |  43.3  1.7  0.0  0.9  1.6  4.2  21.6  23.3  2.8  0.8 |
-------+------------------------------------------------------+-------------

Note: This table is to be read in the following manner:
   8611 of all residues predicted to be exposed by 0% were observed
   with 0% relative accessibility. However, 325 of all residues
   predicted to have 0% are observed as completely exposed
   (obs = 9 -> rel. acc. >= 81%). The term "observed" refers to the
   DSSP compilation of area of solvent accessibility calculated from
   3D coordinates of experimentally determined structures (Dictionary
   of Secondary Structure of Proteins: Kabsch & Sander (1983)
   Biopolymers, 22, 2577-2637).

Accuracy for each amino acid:
.............................

+---+------------------------------+-----+-------+------+
|AA | Q3   b%o b%p i%o i%p e%o e%p | Q10 | corr  |  N   |
+---+------------------------------+-----+-------+------+
| A | 59.0  87  60   2  38  66  57 |  31 | 0.530 | 5054 |
| C | 62.0  91  67   5  39  25  21 |  34 | 0.244 |  893 |
| D | 56.5  21  45   6  49  94  57 |  20 | 0.321 | 3536 |
| E | 60.8   9  40   3  41  98  61 |  21 | 0.347 | 3743 |
| F | 63.3  94  67   9  46  29  37 |  27 | 0.366 | 2436 |
| G | 52.1  75  51   1  31  67  53 |  22 | 0.405 | 4787 |
| H | 50.9  63  53  23  45  71  50 |  18 | 0.442 | 1366 |
| I | 64.9  95  68   6  41  30  38 |  34 | 0.360 | 3437 |
| K | 66.6   2  11   2  37  98  67 |  23 | 0.267 | 3652 |
| L | 61.6  93  65   8  44  31  40 |  31 | 0.368 | 5016 |
| M | 60.1  92  64   5  39  45  44 |  29 | 0.452 | 1371 |
| N | 55.5  45  45   8  38  87  59 |  17 | 0.410 | 2923 |
| P | 53.0  48  48   9  39  83  56 |  18 | 0.364 | 2920 |
| Q | 54.3  27  44   7  44  92  56 |  20 | 0.344 | 2225 |
| R | 49.9  15  47  36  47  76  51 |  18 | 0.372 | 2765 |
| S | 55.6  69  53   3  51  81  56 |  22 | 0.464 | 3981 |
| T | 51.8  61  51   8  38  78  53 |  21 | 0.432 | 3740 |
| V | 61.1  93  65   5  40  39  42 |  34 | 0.418 | 4156 |
| W | 56.2  85  62  20  49  29  27 |  21 | 0.318 |  891 |
| Y | 49.7  73  52  33  49  36  38 |  19 | 0.359 | 2301 |
+---+------------------------------+-----+-------+------+

Abbreviations:

AA: amino acid in one-letter code
b%o, i%o, e%o: = Qburied, Qintermediate, Qexposed (% of observed), i.e.
percentage of correct prediction in each state, see above
b%p, i%p, e%p: = Qburied, Qintermediate, Qexposed (% of predicted), i.e.
   probability of correct prediction in each state, see above
b%o: = Qburied (% of observed), see above
Q10: percentage of correctly predicted residues in each of the 10
   states of predicted relative accessibility.
corr: correlation between predicted and observed rel. acc.
N: number of residues in data set

Accuracy for different secondary structure:
...........................................

+--------+------------------------------+----+-------+-------+
| type   | Q3   b%o b%p i%o i%p e%o e%p |Q10 | corr  |   N   |
+--------+------------------------------+----+-------+-------+
| helix  | 59.5  79  64   8  44  80  56 | 27 | 0.574 | 20100 |
| strand | 61.3  84  73   9  46  69  37 | 35 | 0.524 | 13356 |
| loop   | 54.4  64  43  11  44  78  61 | 18 | 0.442 | 27968 |
+--------+------------------------------+----+-------+-------+

Abbreviations as before.

Position-specific reliability index
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The network predicts the 10 states for relative accessibility using real
numbers from the output units. The prediction is assigned by choosing
the maximal unit ("winner takes all"). However, the real numbers contain
additional information. E.g. the difference between the maximal and the
second largest output unit (with the constraint that the second largest
output is computed among all units at least 2 positions off the maximal
unit) can be used to derive a "reliability index". This index is given
for each residue along with the prediction. The index is scaled to have
values between 0 (lowest reliability), and 9 (highest). The accuracies
(Q3, corr, etc.)
to be expected for residues with values above a particular value of the index are given below as well as the fraction of such residues (%res).: +---+------------------------------+----+-------+-------+ |RI | Q3 b%o b%p i%o i%p e%o e%p |Q10 | corr | %res | +---+------------------------------+----+-------+-------+ | 0 | 57.5 77 60 9 44 78 56 | 24 | 0.535 | 100.0 | | 1 | 59.1 76 63 9 45 82 57 | 25 | 0.560 | 91.2 | | 2 | 61.7 79 66 4 47 87 58 | 27 | 0.594 | 77.1 | | 3 | 66.6 87 70 1 51 89 63 | 30 | 0.650 | 57.1 | | 4 | 70.0 89 72 0 83 91 67 | 32 | 0.686 | 45.8 | | 5 | 72.9 92 75 0 0 93 70 | 34 | 0.722 | 35.6 | | 6 | 76.3 95 77 0 0 93 75 | 36 | 0.769 | 24.7 | | 7 | 79.0 97 79 0 0 93 78 | 39 | 0.803 | 16.0 | | 8 | 80.9 98 80 0 0 91 81 | 43 | 0.824 | 9.6 | | 9 | 81.2 99 80 0 0 88 83 | 45 | 0.828 | 5.9 | +---+------------------------------+----+-------+-------+ Abbreviations as before. The above table gives the cumulative results, e.g. 45.8% of all residues have a reliability of at least 4. The correlation for this most reliably predicted half of the residues is 0.686, i.e. a value comparable to what could be expected if homology modelling were possible. For this subset of 45.8% of all residues, 89% of the buried residues are correctly predicted, and 72% of all residues predicted to be buried are correct. .......................................................................... The following table gives the non-cumulative quantities, i.e. the values per reliability index range. These numbers answer the question: how reliable is the prediction for all residues labeled with the particular index i. 
+---+------------------------------+----+-------+-------+ |RI | Q3 b%o b%p i%o i%p e%o e%p |Q10 | corr | %res | +---+------------------------------+----+-------+-------+ | 0 | 40.9 79 40 16 41 21 40 | 14 | 0.175 | 8.8 | | 1 | 45.4 61 46 28 44 48 44 | 17 | 0.278 | 14.1 | | 2 | 47.4 53 52 10 46 80 44 | 19 | 0.343 | 19.9 | | 3 | 52.9 75 59 4 50 77 47 | 23 | 0.439 | 11.4 | | 4 | 60.0 81 63 0 83 84 56 | 25 | 0.547 | 10.1 | | 5 | 65.2 82 70 0 0 93 62 | 28 | 0.607 | 10.9 | | 6 | 71.3 90 72 0 0 94 70 | 31 | 0.692 | 8.8 | | 7 | 76.0 94 76 0 0 95 75 | 34 | 0.762 | 6.3 | | 8 | 80.5 97 81 0 0 94 79 | 39 | 0.808 | 3.8 | | 9 | 81.2 99 80 0 0 88 83 | 45 | 0.828 | 5.9 | +---+------------------------------+----+-------+-------+ For example, for residues with RI = 4 83% of all predicted intermediate residues are correctly predicted as such. The resulting network (PHD) prediction is: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ________________________________________________________________________________ PHD: Profile fed neural network systems from HeiDelberg ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Prediction of: secondary structure, by PHDsec solvent accessibility, by PHDacc and helical transmembrane regions, by PHDhtm Author: Burkhard Rost EMBL, 69012 Heidelberg, Germany Internet: Rost@EMBL-Heidelberg.DE All rights reserved. The network systems are described in: PHDsec: B Rost & C Sander: JMB, 1993, 232, 584-599. B Rost & C Sander: Proteins, 1994, 19, 55-72. PHDacc: B Rost & C Sander: Proteins, 1994, 20, 216-226. PHDhtm: B Rost et al.: Prot. Science, 1995, 4, 521-533. 
Some statistics ~~~~~~~~~~~~~~~ Percentage of amino acids: +--------------+--------+--------+--------+--------+--------+ | AA: | G | A | V | K | D | | % of AA: | 10.2 | 9.9 | 9.6 | 8.1 | 6.9 | +--------------+--------+--------+--------+--------+--------+ | AA: | T | I | S | L | E | | % of AA: | 6.3 | 6.3 | 6.0 | 6.0 | 4.8 | +--------------+--------+--------+--------+--------+--------+ | AA: | F | N | P | R | H | | % of AA: | 4.5 | 3.9 | 3.3 | 3.0 | 3.0 | +--------------+--------+--------+--------+--------+--------+ | AA: | Y | M | Q | W | C | | % of AA: | 2.7 | 2.7 | 1.2 | 0.9 | 0.9 | +--------------+--------+--------+--------+--------+--------+ Percentage of secondary structure predicted: +--------------+--------+--------+--------+ | SecStr: | H | E | L | | % Predicted: | 23.1 | 33.5 | 43.4 | +--------------+--------+--------+--------+ According to the following classes: all-alpha: %H>45 and %E< 5; all-beta : %H<5 and %E>45 alpha-beta : %H>30 and %E>20; mixed: rest, this means that the predicted class is: mixed class PHD output for your protein ~~~~~~~~~~~~~~~~~~~~~~~~~~~ Wed Oct 7 17:39:52 1998 Jury on: 10 different architectures (version 5.94_317 ). Note: differently trained architectures, i.e., different versions can result in different predictions. 
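
The class assignment quoted under "Some statistics" can be written as a
small decision rule. The Python sketch below applies the four
thresholds exactly as stated there, testing them in the order given;
the function name is hypothetical and the code is an illustration, not
the server's implementation.

```python
def structural_class(percent_helix, percent_strand):
    """Assign the predicted structural class from %H and %E.

    Thresholds as quoted in the report:
      all-alpha : %H > 45 and %E < 5
      all-beta  : %H < 5  and %E > 45
      alpha-beta: %H > 30 and %E > 20
      mixed     : everything else
    """
    if percent_helix > 45 and percent_strand < 5:
        return "all-alpha"
    if percent_helix < 5 and percent_strand > 45:
        return "all-beta"
    if percent_helix > 30 and percent_strand > 20:
        return "alpha-beta"
    return "mixed"


if __name__ == "__main__":
    # Values predicted for this protein: 23.1% helix, 33.5% strand.
    print(structural_class(23.1, 33.5))
```

For this protein the helix content of 23.1% is below the 30% needed for
the alpha-beta class, so the rule falls through to "mixed", in
agreement with the class reported above.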
About the protein
~~~~~~~~~~~~~~~~~

HEADER     /home/phd/server/work/predict_h26320-222
COMPND
SOURCE
AUTHOR
SEQLENGTH  334
NCHAIN     1 chain(s) in predict_h26320-22273 data set
NALIGN     160 (=number of aligned sequences in HSSP file)

Abbreviations: PHDsec
~~~~~~~~~~~~~~~~~~~~~

sequence:
   AA : amino acid sequence
secondary structure:
   HEL: H=helix, E=extended (sheet), blank=other (loop)
   PHD: Profile network prediction HeiDelberg
   Rel: Reliability index of prediction (0-9)
detail:
   prH: 'probability' for assigning helix
   prE: 'probability' for assigning strand
   prL: 'probability' for assigning loop
   note: the 'probabilities' are scaled to the interval 0-9, e.g.,
         prH=5 means that the first output node is 0.5-0.6
subset:
   SUB: a subset of the prediction, for all residues with an expected
        average accuracy > 82% (tables in header)
   note: for this subset the following symbols are used:
      L: is loop (for which above " " is used)
    ".": means that no prediction is made for this residue, as the
         reliability is: Rel < 5

Abbreviations: PHDacc
~~~~~~~~~~~~~~~~~~~~~

   SS : secondary structure
   HEL: H=helix, E=extended (sheet), blank=other (loop)
solvent accessibility:
   3st: relative solvent accessibility (acc) in 3 states:
        b = 0-9%, i = 9-36%, e = 36-100%.
   PHD: Profile network prediction HeiDelberg
   Rel: Reliability index of prediction (0-9)
   O_3: observed relative acc. in 3 states: B, I, E
        note: for convenience a blank is used for intermediate (i).
   P_3: predicted relative accessibility in 3 states
   10st: relative accessibility in 10 states: = n corresponds to a relative acc.
           of n*n %
subset:
   SUB:    a subset of the prediction, for all residues with an expected
           average correlation > 0.69 (tables in header)
           note: for this subset the following symbols are used:
              "I": is intermediate (for which above " " is used)
              ".": means that no prediction is made for this residue,
                   as the reliability is: Rel < 4

protein: predict length 334

                  ....,....1....,....2....,....3....,....4....,....5....,....6
         AA      |GKVKVGVDGFGRIGRLVTRAAFNSGKVDIVAINDPFIDLHYMVYMFQYDSTHGKFHGTVK|
         PHD sec | EEEEE HHHHHHHHHHHHH EEEEE HHHHHEEEEEE EEE EEE|
         Rel sec |926888345527999999999722981799981999846863215524323312113577|
 detail: prH sec |000000000158989999999854000000000000127875542232110000000000|
         prE sec |047888632100000000000000014899984000000012346656533345543777|
         prL sec |952111367731000000000145985100015998862111110111256644445211|
 subset: SUB sec |L.EEEE..LL.HHHHHHHHHHH..LL.EEEEE.LLLL.HHH...EE...........EEE|
 ACCESSIBILITY
 3st:    P_3 acc |eebebbbebbbbbbbbbbbbbeeeeebebbbbbeebbebebbbbbbebbbb bebeeebe|
 10st:   PHD acc |970700060002000000000677970600000670070700000070000406077606|
         Rel acc |841554726750881267032155441177540140031415325452131042353132|
 subset: SUB acc |ee.ebbb.bbb.bb..bb....eeee..bbbb..e....e.b..bbe.....b..e....|

                  ....,....7....,....8....,....9....,....10...,....11...,....12
         AA      |AEDGKLVIDGKAITIFQERDPENIKWGDAGTAYVVESTGVFTTMEKAGAHLKGGAKRIVI|
         PHD sec |E EEEE EEEEEE EEEEEEE HHHHHHHHHH EEEE|
         Rel sec |149869975827999972694777722279679999535555359999999846848999|
 detail: prH sec |000000000000000000003110133100000000000010269999999862100000|
         prE sec |520178972148999974100001111310289998762212210000000000028899|
         prL sec |469820017851000015796778755579710000237766520000000137861000|
 subset: SUB sec |..LLEEEELL.EEEEEE.LL.LLLL...LLLEEEEEE.LLLL.HHHHHHHHH.LL.EEEE|
 ACCESSIBILITY
 3st:    P_3 acc |eeeeebbbeeeebebbbeee eebebeeebbebbbebbbbbbeeeebeebbeebbeebbb|
 10st:   PHD acc |777760007776060007764970707970070006000000667707700770066000|
         Rel acc |344416134352523205410553423420253761544060115644426651222999|
 subset: SUB acc |.eee.b..e.e.b....ee..ee.e..e...e.bb.bbb.b...eebee.bee....bbb|

                  ....,....13...,....14...,....15...,....16...,....17...,....18
         AA      |SAPSADAPMFVMGVNHFKYANSLKIISNASCTTNCLAPLAKVIHDHFGIVEGLMTTVHAI|
         PHD sec |EE EEEEEE EEEEE HHHHHHHHHHHHHH EEEEEEEEEEEE|
         Rel sec |935999766999645566789855997267638997799999820677368789899732|
 detail: prH sec |000000000000000101000000000000158898899998854110000000000000|
         prE sec |862000127999762211110127998511000000000000000001578888889755|
         prL sec |036999871000236677888872001477731001100000134778310100100134|
 subset: SUB sec |E.LLLLLLEEEEE.LLLLLLLLLEEEE.LLL.HHHHHHHHHHH..LLL.EEEEEEEEE..|
 ACCESSIBILITY
 3st:    P_3 acc |bbbeeebebbbbbbeeeebeeebebbbebebbbebbbebbebbeeebbbbebbbbbbbbb|
 10st:   PHD acc |000999070000006687077707000706000700060070077600007000000000|
         Rel acc |221486032784351154054504353440835474113553533154734542446251|
 subset: SUB acc |...eee...bbb.b..ee.eee.e.b.eb.b.bebb...be.b...bbb.ebb.bbb.b.|

                  ....,....19...,....20...,....21...,....22...,....23...,....24
         AA      |TATQKTVDSPSGKLWRGGRGAAQNLIPASTGAAKAVGKVIPELDGKLTGMAFRVPTANVS|
         PHD sec | HHHHHHHHH EE E EEEEE EE|
         Rel sec |353213459996300235423225533524889997533123345311227884799919|
 detail: prH sec |011121000002344432243321001146889998752122221110000000000000|
         prE sec |323333220000000000000122232100000000002442222344457886100058|
         prL sec |565545668997644567656546665653110001235425456544531103899940|
 subset: SUB sec |.L.....LLLLL.....L.....LL..L..HHHHHHH.......L.....EEE.LLLL.E|
 ACCESSIBILITY
 3st:    P_3 acc |eeebebbeeeeeeeb eeeebbeebbbebebbbebbbebbbebebebebbbb bbbeebb|
 10st:   PHD acc |677070077766770577670067000706000700070007070606000040007700|
         Rel acc |133051233311531153341314475321123624356345344241312415202446|
 subset: SUB acc |....e.......e...e..e...ebbb......e.b.eb.be.eb.b....b.b...ebb|

                  ....,....25...,....26...,....27...,....28...,....29...,....30
         AA      |VLDLTCRLEKPAKYDDIKKVVKEASEGPLKGILGYTEDEVVSDDFNGSNHSSIFDAGAGI|
         PHD sec |EEEEEEEE HHHHHHHHHHHHH EEEEEEE EEEEE EE E|
         Rel sec |898764221468199999999998719944577631274233214887886132123226|
 detail: prH sec |000011232011499999999998740011000000001000100001101111223221|
         prE sec |888776443210000000000000000026678765312555543101001355321236|
         prL sec |000112224678400000000001259862211234575333346887887423445441|
 subset: SUB sec |EEEEE.....LL.HHHHHHHHHHHH.LL..EEEE...L.......LLLLLL........E|
 ACCESSIBILITY
 3st:    P_3 acc |bbbbbbebeeeee eebeebbeebeeeebebbbebbeeebbbbebeeee bbbbbbebbb|
 10st:   PHD acc |000000607777757706700770977707000600677000060767740000007000|
         Rel acc |773942335662317492536561395416233101043477211215403345234332|
 subset: SUB acc |bb.bb...eee...eeb.e.bee..eee.e.......e.bbb.....ee...bb..e...|

                  ....,....31...,....32...,....33...,....34...,....35...,....36
         AA      |ELNDTFVKLVSWYDNEFGYSERVVDLMAHMASKE|
         PHD sec |EE EEEEEEEE HHHHHHHHHHHHH |
         Rel sec |8467389999994287651468999999998169|
 detail: prH sec |0000000000001101224578999999998420|
         prE sec |8621389999986300001110000000000000|
         prL sec |1377610000002587774210000000001579|
 subset: SUB sec |E.LL.EEEEEEE..LLLL..HHHHHHHHHHH.LL|
 ACCESSIBILITY
 3st:    P_3 acc |ebeeebbebbbbbbe  bbb bbbebbebbbeee|
 10st:   PHD acc |6077700600000065300050006006000799|
         Rel acc |1235435287461110000010672961031559|
 subset: SUB acc |...ee.b.bbbb..........bb.bb....eee|

________________________________________________________________________________

The resulting prediction of globularity is:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
________________________________________________________________________________

---
--- GLOBE: prediction of protein globularity
---
--- nexp = 145 (number of predicted exposed residues)
--- nfit = 137 (number of expected exposed residues)
--- diff = 70.
---        (difference nexp-nfit)
--- =====> your protein appears as compact, as a globular domain
---
---
--- GLOBE: further explanations preliminarily in:
---        http://www.embl-heidelberg.de/~rost/Papers/98globe.html
---
--- END of GLOBE
________________________________________________________________________________

-----------------------------------------------------------------------------
--- PredictProtein: NEWS from January, 1997                               ---
---                                                                       ---
--- Dear user,                                                            ---
---                                                                       ---
--- as of January 1, 1997, EMBL has effectively decided to not            ---
--- support the PredictProtein service by personal resources. I do        ---
--- maintain the program, so to speak, in my private time. However,       ---
--- my contract obliges me to do science, instead. Unfortunately,         ---
--- the computer environment at EMBL is at the same time starting         ---
--- to become increasingly unstable. A consequence of these two re-       ---
--- cent developments is that the PredictProtein service is not as        ---
--- stable as it was.                                                     ---
---                                                                       ---
--- I apologise for the problems this may cause. In particular,           ---
--- I apologise for my inability to reply to the 20-30 daily, per-        ---
--- sonal mails, and suggest to re-submit requests after 24 hours!        ---
---                                                                       ---
--- Hoping that I shall find a more convenient solution for the           ---
--- future of the PredictProtein I remain with my best regards,           ---
---                                                                       ---
--- Burkhard Rost                                                         ---
-----------------------------------------------------------------------------
--- PredictProtein: NEWS from April, 1998                                 ---
---                                                                       ---
--- --------------------------------                                      ---
--- MOVING PredictProtein                                                 ---
--- There appears to be light on the horizon! PP may be having            ---
--- many hiccups over the next months (as I shall leave EMBL). How-       ---
--- ever, the server seems to have a fair chance of survival thanks       ---
--- to major support that is being raised by Columbia University,         ---
--- New York, U.S.A. I hope that this will settle the issue for           ---
--- the years to come ...                                                 ---
--- --------------------------------                                      ---
--- WARNING                                                               ---
--- After a recent major rewriting of most of the PP code,                ---
--- I am afraid that not all errors have been traced by me, yet.          ---
--- Thus, please have mercy and report any bug you'll encounter!          ---
--- THANKS, Burkhard Rost                                                 ---
--- --------------------------------                                      ---
--- NEW PREDICTION DEFAULTS                                               ---
--- * Coiled-coil regions: now by default the program COILS written by    ---
---   Andrei Lupas is run on your sequence. An output is returned if a    ---
---   coiled-coil region has been detected.                               ---
--- * Functional sequence motifs: now by default the PROSITE database     ---
---   written by Amos Bairoch, Philip Bucher and Kay Hofmann is scanned   ---
---   for sequence motifs. An output is returned if any motif has been    ---
---   detected.                                                           ---
--- --------------------------------                                      ---
--- see http://www.embl-heidelberg.de/predictprotein/ppNews.html          ---
--- for a description of the following new options.                       ---
--- NEW INPUT OPTION                                                      ---
--- * Your input sequence(s) in FASTA-list format ("# FASTA list ")       ---
--- NEW OUTPUT OPTIONS                                                    ---
--- * Return also BLASTP output ("return blast")                          ---
--- * Return prediction additionally in RDB format ("return phd rdb")     ---
--- * Return topits hssp ("return topits hssp")                           ---
--- * Return topits strip ("return topits strip")                         ---
--- * Return topits own ("return topits own")                             ---
--- * Return no coils ("return no coils")                                 ---
--- * Return no prosite ("return no prosite")                             ---
-----------------------------------------------------------------------------
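
Reading aid for the PHDacc rows of this report: the abbreviations state that a 10-state value n corresponds to a relative accessibility of n*n %, that the three states are b = 0-9%, i = 9-36%, e = 36-100%, and that the subset row shows "." wherever Rel < 4. The sketch below applies those three rules to one 60-residue block. It is not part of the PHD program; the function names are hypothetical, and the treatment of the shared boundaries (9% counted as i, 36% counted as e) is an assumption that happens to agree with the rows printed in this report.

```python
def ten_state_to_percent(n: int) -> int:
    """Convert a 10-state value n (0-9) to relative accessibility n*n %."""
    if not 0 <= n <= 9:
        raise ValueError(f"10-state value out of range: {n}")
    return n * n


def percent_to_three_state(percent: float) -> str:
    """Map relative accessibility to b/i/e.

    Assumption: the shared boundaries are resolved as 9% -> i and 36% -> e.
    """
    if percent < 9:
        return "b"
    if percent < 36:
        return "i"
    return "e"


def three_state_row(ten_state_row: str) -> str:
    """Build a P_3-style row; intermediate (i) is written as a blank."""
    states = []
    for digit in ten_state_row:
        state = percent_to_three_state(ten_state_to_percent(int(digit)))
        states.append(" " if state == "i" else state)
    return "".join(states)


def subset_row(three_states: str, reliability: str, min_rel: int = 4) -> str:
    """Build a SUB-style row: '.' where Rel < min_rel, 'I' for intermediate."""
    if len(three_states) != len(reliability):
        raise ValueError("rows must have the same length")
    out = []
    for state, rel in zip(three_states, reliability):
        if int(rel) < min_rel:
            out.append(".")
        elif state == " ":
            out.append("I")
        else:
            out.append(state)
    return "".join(out)


# Rows copied from residues 1-60 of the report above.
phd_acc = "970700060002000000000677970600000670070700000070000406077606"
rel_acc = "841554726750881267032155441177540140031415325452131042353132"

p3 = three_state_row(phd_acc)
print(p3)
print(subset_row(p3, rel_acc))
```

The two printed rows can be compared character by character with the "P_3 acc" and "SUB acc" rows of the first block.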