MCP Papers in Press. Published on May 23, 2005 as Manuscript M500024-MCP200
Genetic Association, Post-translational Modification and Protein-protein Interactions in Type 2 Diabetes Mellitus.

Amitabh Sharma£1, Sreenivas Chavali£1, Anubha Mahajan1, Rubina Tabassum1, Vijaya Banerjee1, Nikhil Tandon2, Dwaipayan Bharadwaj1∗

1 Functional Genomics Unit, Institute of Genomics and Integrative Biology, CSIR, Delhi, India
2 Department of Endocrinology, All India Institute of Medical Sciences, New Delhi, India
£ These authors contributed equally to this work.
∗ Corresponding author: Dr. Dwaipayan Bharadwaj, Functional Genomics Unit, Institute of Genomics and Integrative Biology (CSIR), Mall Road, Delhi-110 007, India
Tel: +91 11 2766 6156/6157
Fax: +91 11 2766 7471
E-mail: [email protected]
Running Title: Functional assessment of variations in Type 2 Diabetes Mellitus
Copyright 2005 by The American Society for Biochemistry and Molecular Biology, Inc.
Abbreviations Used: SNPs - Single Nucleotide Polymorphisms; T2DM - Type 2 Diabetes Mellitus; DCVs - Disease Causing Variations; DAVs - Disease Associated nonsynonymous Variations; PSIC - Position Specific Independent Count; CNVs - Control Nonsynonymous Variations
Summary: Type 2 Diabetes Mellitus is a complex disorder with a strong genetic component. Inherited susceptibility to complex disease in humans is most commonly associated with single nucleotide polymorphisms, but the mechanisms by which this occurs are still poorly understood. Here, we analyze the effects of a set of disease causing missense variations in monogenic forms of Type 2 Diabetes Mellitus and a set of disease associated nonsynonymous variations, in comparison with nonsynonymous variations that have no experimental evidence of association with any disease. Analysis of properties such as evolutionary conservation status, solvent accessibility and secondary structure suggests that disease causing variations are associated with extreme changes in the values of parameters relating to evolutionary conservation and/or protein stability. Disease associated variations are more moderately conserved and have a milder effect on protein function and stability. The majority of the genes harboring these variations cluster in or near the insulin signaling network. Most of these variations are identified as potential sites for post-translational modification; some of these predictions are already supported by reported experimental evidence. Overall, our results indicate that Type 2 Diabetes Mellitus may result from a large number of SNPs that impair modular domain function and post-translational modifications involved in signaling. Our emphasis is more on the conserved corresponding residues than on the variation alone. We believe that considering a stretch of peptide sequence surrounding a polymorphism is a better way of defining its role in the manifestation of this disease.
Since most of the variations associated with the disease are rare, we hypothesize that this disease follows a 'Mosaic model' of interaction between a large number of rare alleles and a small number of common alleles, along with the environment, which is somewhat contrary to the existing Common Disease Common Variants model.
Introduction: Type 2 Diabetes Mellitus (T2DM) is a genetically heterogeneous, polygenic disease with a complex inheritance pattern, caused by genetic predisposition together with environmental factors. The precise biochemical defects are unknown but almost certainly include impairments in insulin secretion and insulin action. T2DM is characterized by abnormal glucose homeostasis leading to hyperglycemia, and its primary feature is insulin resistance. The vast majority of insulin resistance in T2DM has been shown to arise from defects at the post-receptor level [1]. T2DM is also heterogeneous in its associated pathological and physiological symptoms, leading to a variety of complications such as coronary heart disease, neuropathy and retinopathy. Genetic dissection of a complex trait generally follows one of two approaches: genome wide scan studies and association studies. Association studies [2] are widely applied to identify the Single Nucleotide Polymorphisms (SNPs) underlying complex phenotypes; SNPs represent the most common form (90%) of genetic variation in humans [3]. Association is defined as a statistical statement about the co-occurrence of alleles or phenotypes. Owing to the application of high-throughput SNP detection techniques, the number of identified SNPs is growing rapidly, enabling detailed statistical studies. Over the past decade many laboratories have sought to clarify the etiology of T2DM by attempting to associate clear differences in metabolic phenotype with mutations or polymorphisms in genes. As a result, a large amount of data has accumulated, associating SNPs in a large number of candidate genes with the disease across different populations. Unlike fully penetrant mutations that cause Mendelian diseases, SNPs involved in complex human phenotypes are not a necessary and sufficient condition defining the phenotype
but their effect depends on many other genetic and environmental components. In other words, SNPs act as risk factors for a specific phenotype in a statistical sense. This raises the question of whether the associated SNPs are only of statistical significance. If not, what might explain the differences in variation statistics across populations shown by Cargill et al. [4]? Identifying the SNPs responsible for specific phenotypes remains a very difficult problem. Several recent studies [5-10] have applied computational methods to predict the potential effects of nonsynonymous coding SNPs in humans. A focus on individual factors that highlights their maximum potential effect (whether positive or deleterious) is often optimistic, as in practice they do not operate in isolation. Instead, they work jointly to generate the disease gene architecture, and hence a study to determine the contribution of these interactions towards the disease is essential. Ideally, the end point of disease gene identification should be functional analysis of the disease associated allele and an understanding of the molecular mechanism by which it causes the disease phenotype. Computational analysis can facilitate this functional characterization. Vitkup et al. [9] have shown that the probability of a nonsynonymous mutation causing a genetic disease increases monotonically with an increase in the degree of evolutionary conservation of the mutation site and a decrease in the solvent accessibility of the site; opposite trends are observed for non-disease polymorphisms. In the current study we have extensively analyzed the effect of nonsynonymous variations on the structure and function of proteins and have attempted to determine their possible role in the disease phenotype.
Experimental procedures: Data set extraction: The data set considered for the study comprises 29 mutations shown to cause monogenic T2DM in families or Maturity Onset Diabetes of the Young (disease causing variations, DCVs); 113 polymorphisms associated with the disease in various populations, in a total of 76 different candidate genes; and, as a control data set, 92 random nonsynonymous variations in 32 genes that have no experimental evidence of association with any disease (Supplementary Table 1 online). The random variations help to distinguish behavior patterns specific to the disease related variations from those that occur by chance. Hence the random variations were selected from throughout the sequences of genes that have been implicated in T2DM. The disease associated polymorphisms fall into four major categories: nonsynonymous (45), regulatory (42), synonymous (11) and intronic SNPs (15). In this study we determine the effect on the phenotype of the disease associated nonsynonymous variations (hereafter referred to as DAVs) in comparison to the control nonsynonymous variations (CNVs). DCVs were obtained by querying Medline for 'Type 2 Diabetes, Mutations'; DAVs by querying for 'Type 2 Diabetes, SNPs' and 'Type 2 Diabetes, Polymorphisms'; and CNVs from the SWISSPROT database [11]. The protein sequences needed for the analysis of all these variations were extracted from SWISSPROT. Relationships between the genes harboring DAVs were determined using Pathway Assist [12]. Pathway Assist is a software application for navigation and analysis of biological pathways, gene-regulation networks and protein interaction maps. It comes with the built-in natural language processing module MedScan and a comprehensive database.
Evaluating evolutionary conservation status of the variations: The best method to evaluate the significance of a variation using evolutionary information is to consider the nature of the change with respect to the variability of the affected residue, as estimated from the wild type sequences of different proteins in a protein family. A set of similar sequences can be characterized by a multiple sequence alignment within common sequence domains (in the case of protein families) or within just a small sequence region (motif). We systematically examined the positions of the variations in motif regions of proteins, using the Pfam database [13] of probabilistic models of protein domains and families derived using the HMM method, and the eMATRIX database [14]. eMATRIX [15] is a minimum risk method for estimating the frequencies of amino acids at conserved positions in a protein family. Minimum risk estimation finds the optimal weighting between a set of observed amino acid counts and a set of pseudo frequencies. Pfam and eMATRIX provide information on the position of the variations in specific domains and in functional motifs, respectively. Residue conservation amongst the homologous proteins was predicted with Scorecons [16]. The Scorecons algorithm scores each residue position in a multiple sequence alignment in terms of conservation. Multiple sequence alignment of homologous proteins was done using the ClustalW [17] algorithm and was formatted in ClustalX (1.81). The mutation matrix of Jones et al. [18] is used to determine the likelihood of a particular residue being replaced by another and to calculate a score based on the variability of each position. Normalized Shannon entropy scores for each amino acid position were calculated using the general formulae [16]:
$C_{ent} = -\sum_{a}^{K} p_a \log_2 p_a \,/\, \log_2[\min(N, K)]$ and $p_a = n_a / N$
where $n_a$ is the number of residues of amino acid type a, N is the number of residues in the sequence database and K is the number of residue types. The program Scorecons http://www.biochem.ucl.ac.uk/cgi-bin/valder/Scorecons_server.pl was used for all calculations. A score of zero indicates a lack of conservation at that position, whereas a score of 1 indicates very high sequence conservation.
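The entropy formula above can be sketched in a few lines of Python. This is only an illustration of the expression as printed, not the Scorecons implementation: the function name `normalized_entropy` is hypothetical, N is taken here to be the number of residues in one alignment column (an assumption), gaps are ignored (an assumption), and K defaults to the 20 standard amino acids (an assumption).

```python
import math
from collections import Counter


def normalized_entropy(column, k=20):
    """Normalized Shannon entropy of one alignment column.

    Implements -sum(p_a * log2(p_a)) / log2(min(N, K)) with p_a = n_a / N.
    Assumptions (not from the paper): N is the number of residues in the
    column, non-letter characters such as gaps are skipped, and K = 20.
    """
    residues = [r for r in column.upper() if r.isalpha()]
    n = len(residues)
    if n == 0:
        raise ValueError("column contains no residues")
    counts = Counter(residues)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    denominator = math.log2(min(n, k))
    if denominator == 0:  # a single residue carries no variability
        return 0.0
    return entropy / denominator


if __name__ == "__main__":
    for col in ("AAAA", "AACC", "ACDE"):
        print(col, round(normalized_entropy(col), 3))
```

As printed, the expression is 0 for an invariant column and 1 for a maximally variable one, so a conservation score on the 0 to 1 scale described for Scorecons would need a further transformation that the text does not spell out.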
Determining the involvement in formation of specific patterns: Non-conserved residues adjacent to conserved residues in the primary sequence are generally less substitutable than other non-conserved residues, reflecting their involvement in functionally important regions [19]. A peptide sequence containing the variant along with ten neighboring residues on either side was selected from the protein sequence, and a pattern search was done using the PROSITE [20] database to determine the involvement of the variants in the formation of specific patterns. PROSITE consists of biologically significant sites, patterns such as phosphorylation and glycosylation sites, and profiles that help to reliably identify specific motifs within a peptide sequence. Sequences in which variants showed potential phosphorylation sites were evaluated for the effect on phosphorylation using NetPhos 2.0. NetPhos 2.0 is an artificial neural-network method that predicts phosphorylation sites in independent sequences with sensitivity in the range of 69-96% [21].
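The peptide selection step described above (the variant plus ten neighboring residues on either side) can be illustrated with a short Python sketch. The helper names `variant_window` and `apply_substitution` are hypothetical, the sequence in the demo is a toy string and not a real protein, and truncating the window at the sequence termini is an assumption about how variants near the ends would be handled.

```python
def variant_window(sequence, position, flank=10):
    """Return (peptide, index) for a variant at a 1-based position.

    The peptide holds the variant residue plus up to `flank` residues on
    each side; it is shorter near the termini (an assumption). `index` is
    the 0-based location of the variant residue inside the peptide.
    """
    if not 1 <= position <= len(sequence):
        raise ValueError("position lies outside the sequence")
    start = max(0, position - 1 - flank)
    end = min(len(sequence), position + flank)
    return sequence[start:end], position - 1 - start


def apply_substitution(sequence, change):
    """Apply a substitution written like 'R15G' (wild type, position, variant)."""
    ref, pos, alt = change[0], int(change[1:-1]), change[-1]
    if sequence[pos - 1] != ref:
        raise ValueError("wild-type residue does not match the sequence")
    return sequence[:pos - 1] + alt + sequence[pos:]


if __name__ == "__main__":
    toy = "ACDEFGHIKLMNPQRSTVWYACDEFGHIKL"  # toy sequence, not a real protein
    wild, idx = variant_window(toy, 15)
    variant, _ = variant_window(apply_substitution(toy, "R15G"), 15)
    print(wild, idx)
    print(variant)
```

Both the wild-type and the variant peptide could then be submitted to a pattern search, so that patterns gained or lost through the substitution become visible.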
Assessing the effect of variation on structural parameters of proteins: Amino acid allelic variants have an impact on protein structure and function, and this impact has been shown to be predictable by analysis of multiple sequence alignments
and protein 3D structures [8]. To assess the effect of the variations on the structure and function of proteins, PolyPhen [22] was used. PolyPhen is a World Wide Web server that automates functional annotation of nonsynonymous SNPs, based on sequence-based characterization of the substitution site and on structural parameters. It provides the PSIC (Position Specific Independent Count) score, calculated from the overall similarity of the sequences that share the amino acid type at this position, and predicts whether a nonsynonymous variation is damaging, i.e. expected to affect protein function, or benign, i.e. most likely lacking a profound phenotypic effect. Large differences in PSIC values (above 1.5) for specific genetic variants might indicate that the substitution of interest is rarely or never observed in the protein family [23]. Variations in the protein core involving a change in the hydrophobic character of a buried residue may result in different degrees of protein destabilization [24]. The hydrophobic effect is measured by the solvent accessible surface area of a protein, the part of its surface in direct contact with solvent. Solvent accessibility was predicted using RVPNET [25], which uses single residue information of neighbors and provides real-valued predictions of accessible surface area. Hydrophobic interactions are considered to be the primary factor stabilizing β-sheets [26]; therefore, secondary structure elements were identified using Chou-Fasman predictions [27].
Statistical evaluation: To compare the DCVs and DAVs with CNVs in the assessment of their effect on the disease phenotype, χ2 tests were performed and p-values were calculated.
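A χ2 comparison of this kind can be sketched as a test on a 2x2 contingency table (variation class by presence or absence of a feature). The text does not say which table layout or correction was used, so the Pearson statistic without continuity correction is an assumption here, the function name `chi_square_2x2` is hypothetical, and the counts in the demo are invented for illustration and are not the study's data.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns (statistic, p_value) with one degree of freedom and no
    continuity correction (an assumption; the paper does not specify).
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs a non-zero total")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For one degree of freedom the upper tail of the chi-square
    # distribution equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


if __name__ == "__main__":
    # Hypothetical counts: rows are two variation classes, columns are
    # "feature present" and "feature absent".
    stat, p = chi_square_2x2(20, 10, 10, 20)
    print(f"chi2 = {stat:.3f}, p = {p:.4f}")
```

Using only the standard library keeps the sketch self-contained; for small expected counts an exact test would usually be preferred.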
Results: Pathway Assist analysis establishes the products of the genes harboring DAVs as potential interacting members of the insulin signaling cascade (Fig. 1). It should be noted, however, that Pathway Assist connects any two input proteins, and some of the proteins it identified while networking the input proteins might not be involved in Type 2 Diabetes as currently understood. Functional segregation of the proteins harboring DAVs identified enzymes as the major class (31%), whereas transcription regulators were the major class harboring DCVs (58%) (Fig. 2). Pfam analyses showed that most of the DCVs (67%), 49% of DAVs and 63% of CNVs lie in functional domains of the respective proteins (Supplementary Table 2 online). Therefore, of the total variations, an average of 56% lie in functional domains of proteins (p=0.02). Further, functional domains occupy 60% of the total sequence space of the identified proteins. eMATRIX analysis revealed that half of the DCVs (50%) and the majority of DAVs (62%) corresponded to functional signatures, compared with only 27% of CNVs. This clearly indicates that the DCVs and DAVs correspond significantly more often to functional signatures than the randomly picked CNVs (p=0.0002). Scorecons analysis (Fig. 3) reveals that DCVs occur mostly at conserved positions (90% above a score of 0.5), whereas DAVs are less conserved (56% above 0.5), in comparison to CNVs (47% above 0.5), which are mostly changes in variable regions with low Scorecons values (p=0.0003). Most of the patterns obtained from PROSITE for DCVs (51.7%) and DAVs (51.1%) represented consensus post-translational modification motifs for phosphorylation, glycosylation and myristoylation (Supplementary Table 2 online), in contrast to only 37% of CNVs. A few peptides showed more than one post-translational motif. Phosphorylation changes predicted by NetPhos 2.0 for the patterns indicated a probable decrease in the phosphorylation of DCV-T608R of IRS1 (
[Figure legend: Common variant; Hollow > Rare variant; > Disease causing; > No role in the disease]
Supplementary Table 1:
List of Disease associated variations, Disease causing variations of Type 2 Diabetes Mellitus and Control Nonsynonymous variations
Regulatory disease associated variations
Gene | Position | Base change | Susceptible allele | Reference
ALR2 | Promoter (-12) | C>G | G | Li, Q. et al. Chin Med J. 115, 209-213 (2002)
ALR2 | Promoter (-106) | C>T | T | Li, Q. et al. Chin Med J. 115, 209-213 (2002)
APM1 | Promoter (-11426) | A>G | G | Harvest, F. Gu, et al. Diabetes. 53, S31-S35 (2004)
APM1 | Promoter (-11377) | G>C | C | Harvest, F. Gu, et al. Diabetes. 53, S31-S35 (2004)
CCR 5 | Promoter (59029) | G>A | A | Nakajima, K. et al. Diabetes. 51, 238-242 (2002)
COX 2 | Promoter | G>C | C | Konheim, Y. L. et al. Hum Genet. 113, 377-381 (2003)
CRP | Promoter | T>C | T | Walford, J. K. et al. Mol Genet Metab. 78, 136-144 (2003)
FOXC2 | 5'UTR (-512) | C>T | C | Ridderstale, M. et al. Diabetes. 51, 3554-3560 (2002)
GFPT2 | 3'UTR | C>T | T | Zhang, H. et al. J Clin Endocrinol Metab. 89, 748-55 (2004)
GLUT 2 | Promoter (-269) | A>C | C | Cha, J. Y. et al. Ann Clin Lab Sci. 32, 114-122 (2002)
GLUT 2 | Promoter (-44) | A>G | G | Cha, J. Y. et al. Ann Clin Lab Sci. 32, 114-122 (2002)
GLUT 2 | Promoter (103) | A>G | G | Cha, J. Y. et al. Ann Clin Lab Sci. 32, 114-122 (2002)
Hepatic GCK | Promoter (-258) | G>A | A | Chiu, K. C. et al. BMC Genetics. 1(1):2 (2000)
IL 6 | Promoter (-634) | C>G | G | Kitamura, A. et al. Diabet Med. 19, 1000-1005 (2002)
IL 6 | Promoter (-174) | C>G | C | Mohlig, M. et al. J Clin Endocrinol Metab. 89, 1885-1890 (2004)
IRS2 | 3'UTR (4064) | T>C | C | Zeng, W. M. et al. Yi Chuan Xue Bao. 30, 785-789 (2003)
ISL1 | Promoter (-47) | A>G | A | Barat-Hauari, M. et al. Diabetes. 51, 1640-1643 (2002)
LIPC | Promoter (-250) | G>A | G | Todorova, B. et al. J Clin Endocrinol Metab. 89, 2019-2023 (2004)
MCP-1 | Promoter (-2518) | A>G | A | Simeoni, E. et al. Diabetologia. 47, 1574-1580 (2004)
ORP150 | Promoter (-429) | G>A | A | Kovacs, P. et al. Diabetes. 51, 1618-1621 (2002)
Pancreatic GCK | Promoter (-30) | G>A | A | Marz, W. et al. Circulation. 109, 2844 (2004)
PC-1 | 3'UTR (2897) | G>A | A | Frittitta, L. et al. Diabetes. 50, 1952-1955 (2001)
PC-1 | 3'UTR (2906) | G>C | C | Frittitta, L. et al. Diabetes. 50, 1952-1955 (2001)
PC-1 | 3'UTR (2948) | C>T | T | Frittitta, L. et al. Diabetes. 50, 1952-1955 (2001)
PCK 1 | Promoter (-232) | C>G | G | Cao, H. et al. J Clin Endocrinol Metab. 89, 898-903 (2004)
PKLR | 3'UTR | C>T | T | Wang, H. et al. Diabetes. 51, 2861-2865 (2002)
PON1 | Promoter (-107) | T>C | T | James, R. W. et al. Diabetes. 49, 1390-1393 (2000)
PTEN | 5'UTR (-9) | C>G | C | Ishihara, H. et al. FEBS Lett. 554, 450-454 (2003)
Resistin | Promoter (-420) | C>G | G | Osawa, H. et al. Am J Hum Genet. 75, 678-86 (2004)
Resistin | 3'UTR (+62) | G>A | G | Tan, M. S. et al. J Clin Endocrinol Metab. 88, 1258-1263 (2003)
TNFα | Promoter (-238) | G>A | A | Shiau, M. Y. et al. Tissue Antigens. 61, 393-397 (2003)
TNFα | Promoter (-308) | G>A | A | Kabaszek, A. et al. Diabetes. 52, 1872-1876 (2003)
TSC-22 | Promoter (-396) | A>G | A | Sugawara, F. et al. Diabetes Res Clin Pract. 60, 191-197 (2003)
UCP1 | 5'UTR | A>C | C | Mori, H. et al. Diabetologia. 44, 373-376 (2001)
UCP2 | Promoter (-866) | G>A | A | D'Adamo, M. et al. Diabetes. 53, 1905-1910 (2004)
UCP3 | Promoter (-55) | C>T | C | Meirhaeghe, A. et al. Diabetologia. 43, 1424-1428 (2000)
VEGF | 5'UTR (-634) | C>G | C | Awata, T. et al. Diabetes. 51, 1635-1639 (2002)
Disease associated non-synonymous variations (DAVs)
Gene | Base change | Amino acid change | Susceptible allele | Reference
ADCYAP1R1 | G>A | G54D | G | Gu, H. F. Hum Mutat. 19, 572-573 (2002)
ADRB2 | C>G | R16G | R | Chang, T. J. et al. Clin Endocrinol (Oxf). 57, 685-690 (2002)
ADRB3 | T>C | W64R | R | Oizumi, T. et al. Diabetes Care. 24, 1579-1583 (2001)
AGT | T>C | M235T | T | Chang, H. R. et al. J Chin Med Assoc. 66, 51-56 (2003)
AGT | C>T | T174M | M | Chang, H. R. et al. J Chin Med Assoc. 66, 51-56 (2003)
APM1 | C>T | R112C | C | Kondo, H. et al. Diabetes. 51, 2325-2328 (2002)
CD38 | C>T | R140W | W | Yagui, K. et al. Diabetologia. 41, 1024-1028 (1998)
FABP2 | G>A | A54T | T | Albala, C. et al. Obesity Research. 12, 340-345 (2004)
GCGR | G>A | G40S | S | Hager, J. et al. Nat Genet. 9, 299-304 (1995)
GFPT2 | A>G | I471V | V | Zhang, H. et al. J Clin Endocrinol Metab. 89, 748-55 (2004)
GYS1 | A>G | M416V | V | Shimomura, H. et al. Diabetologia. 40, 947-952 (1997)
HFE | G>A | C282Y | Y | Moczulski, D. K. et al. Diabetes Care. 24, 1187-1191 (2001)
HFE | C>G | H63D | D | Moczulski, D. K. et al. Diabetes Care. 24, 1187-1191 (2001)
HNF1α | A>C | I27L | L | Chiu, K. C. et al. J Clin Endocrinol Metab. 85, 2178-2183 (2000)
HNF4α | C>T | T130I | I | Zhu, Q. et al. Diabetologia. 46, 567-573 (2003)
ICAM1 | A>G | E469K | K | Kamiuchi, K. et al. Diabet Med. 19, 371-376 (2002)
INSR | G>A | V985M | M | Hart, L. M. et al. Am J Hum Genet. 59, 1119-1125 (1996)
IRS1 | G>A | G972R | R | Jellema, A. et al. Diabetologia. 46, 990-995 (2003)
IRS2 | G>A | G1057D | G | Mammarella, S. et al. Hum Mol Genet. 9, 2517-2521 (2000)
KCNJ11 | C>G | L270V | V | Nielsen, E. M. et al. Diabetes. 52, 573-577 (2003)
KCNJ11 | G>A | E23K | K | Love-Gregory, L. et al. Diabetologia. 46, 136-137 (2003)
LEPR | A>G | Q223R | R | Chiu, K. C. et al. Eur J Endocrinol. 150, 725-729 (2004)
LMNA | C>T | P213S | P | Kamiuchi, K. et al. J Diabetes Complications. 16, 333-337 (2002)
NeuroD | G>A | A45T | T | Ye, L. et al. Zhonghua Yi Xue Yi Chuan Xue Za Zhi. 19, 484-487 (2002)
NOS3 | G>T | E298D | D | Monti, L. D. et al. Diabetes. 52, 1270-1275 (2003)
NPY | C>G | L7V | V | Niskanen, L. et al. Exp Clin Endocrinol Diabetes. 108, 235-236 (2000)
NR3C1 | A>G | N363S | S | Roussel, R. et al. Clin Endocrinol. 59, 237-241 (2003)
NR3C1 | G>A | R23K | R | van Rossum, E. F. et al. Diabetes. 51, 3128-3134 (2002)
PC-1 | A>C | K121Q | Q | Kubaszek, A. et al. J Clin Endocrinol Metab. 89, 2044-2047 (2004)
PGC1 | G>A | G482S | S | Hara, K. et al. Diabetologia. 45, 740-743 (2002)
PLA2G4A | C>G | F479L | L | Wolford, J. K. et al. Mol Genet Metab. 79, 61-66 (2003)
PON1 | A>G | Q191R | R | Murata, M. et al. Diabet Med. 21, 837-844 (2004)
PON2 | T>A | C311S | C | Wang, X. Y. et al. Zhonghua Yi Xue Yi Chuan Xue Za Zhi. 20, 215-219 (2003)
PON2 | C>G | A148G | G | Hegele, R. A. et al. J Clin Endocrinol Metab. 82, 3373-3377 (1997)
PPARγ2 | C>G | P12A | P | Mori, H. et al. Diabetes. 50, 886-890 (2001)
PPP1R3 | G>T | D905Y | Y | Wang, G. et al. Chin Med J. 114, 1258-1262 (2001)
PTPN1 | C>T | P387L | L | Soren, M. et al. Diabetes. 51, 1-6 (2002)
RAGE | G>A | G82S | G | Kumaramnickavel, G. et al. J Diabetes Complications. 16, 391-394 (2002)
SOD2 | C>T | V16A | V | Nomiyama, T. et al. J Hum Genet. 48, 138-141 (2003)
TGFβ | T>C | L10P | P | Wong, T. Y. et al. Kidney Int. 63, 1831-1835 (2003)
UCP1 | A>C | M229L | L | Mori, H. et al. Diabetologia. 44, 373-376 (2001)
UCP2 | C>T | A55V | V | Wang, H. et al. Am J Physiol Endocrinol Metab. 286, E1-7 (2004)
UTS2 | G>A | S89N | N | Wenyi, Z. et al. Diabetologia. 46, 972-976 (2003)
WFS1 | G>A | R456H | R | Minton, J. A. et al. Diabetes. 51, 1287-1290 (2002)
WFS1 | G>A | R611H | H | Minton, J. A. et al. Diabetes. 51, 1287-1290 (2002)
Disease associated synonymous variations
Gene | Base change | Amino acid change | Reference
Alpha 2 Integrin | C>T | F224F | Maeno, T, et al. Diabetes. 51:1523-1528 (2002).
Alpha 2 Integrin | G>A | T246T | Maeno, T, et al. Diabetes. 51:1523-1528 (2002).
Glucocorticoid receptor | G>A | E22E | Van, Rossum Ef, et al. Diabetes. 51:3128-3134 (2002).
PGC 1 | A>G | T394T | Hara, K, et al. Diabetologia. 45:740-3 (2002).
PKLR | C>A | R569R | Wang, H, et al. Diabetes. 51:2861-2865 (2002).
PTP-1B | C>T | P303P | Mok, A, et al. J Clin Endocrinol Metab. 87:724-727 (2002).
SREBP1 | G>C | G952G | Eberle, D, et al. Diabetes. 53:2153-2157 (2004).
SUR 1 | G>A | R1273R | Rissanen, J, et al. Diabetes Care. 23:70-73 (2000).
SUR1 | C>T | T759T | Reis, AF, et al. Diabetes Metab. 28:209-215 (2002).
SUR1 | C>T | T759T | Reis, AF, et al. Diabetes Metab. 28:209-215 (2002).
Syntaxin 1 A | T>C | D68D | Tsunoda, K, et al. Diabetologia. 44:2092-2097 (2001).
Disease associated intronic variations
Gene | Base change | Position | Reference
APM1 | G>T | Intron 3 | Hara, K, et al. Diabetes. 51:536-540 (2002).
APM1 | G>T | Intron 1 | Hara, K, et al. Diabetes. 51:536-540 (2003).
CAPN10 | G>A | Intron 3 | Michael, J Garant et al. Diabetes. 51, 231-237 (2002).
CAPN10 | T>C | Intron 3 | Weedon, MN, et al. Am J Hum Genet. 73:1208 (2003).
GFPT-2 | C>T | Intron 2 | Zhang, H, et al. J Clin Endocrinol Metab. 89:748-755 (2004).
GFPT-2 | A>G | Intron 12 | Zhang, H, et al. J Clin Endocrinol Metab. 89:748-755 (2004).
HNF 1α | C>G | Intron | Gragnoli, C, et al. Diabetologia. 44:1326-1329 (2001).
NOS3 | A>C | Intron 18 | Monti, LD, et al. Diabetes. 52:1270-1275 (2003).
PKLR | C>T | Intron 3 | Wang, H, et al. Diabetes. 51:2861-2865 (2002).
PKLR | C>T | Intron 5 | Wang, H, et al. Diabetes. 51:2861-2865 (2002).
PPARγ | C>T | Intron | Tai, ES, et al. J Lipid Res. 45:674-685 (2004).
SREBP-1C | C>T | Intron 18 | Laudes, M, et al. Diabetes. 53:842-846 (2004).
SUR 1 | T>C | Intron 24 | Ji, L, et al. Zhonghua Yi Xue Za Zhi. 78:774-775 (1998).
Disease causing variations (DCVs)
Gene | Base change | Amino acid change | Reference
Amylin | A>G | S20G | Seino, S, et al. Diabetologia. 44, 906-909 (2001)
APM1 | T>C | I164T | Kondo, H, et al. Diabetes. 51, 2325-2328 (2002)
G3PD | T>C | F635S | Novialis, A, et al. Biochem Biophys Res Commun. 231, 570-572 (1997)
GCK | C>T | A456V | Henrik, B, T, et al. Diabetes. 51, 1240-1246 (2002)
GCK | G>C | G299R | Stoffel, M, et al. Nat Genet. 2, 153-1566 (1992)
GCK | G>A | A188T | Shimada, F, et al. Diabetologia. 36, 433-437 (1993)
GLUT2 | G>A | V197I | Mueckler, M, et al. Diabetologia. 37, 420-427 (1994)
GLUT4 | G>A | V383I | Choi, WH, et al. Diabetes. 40, 1712-1718 (1991)
HNF1α | G>A | G319S | Hegele, RA, et al. Diabetes Care. 22, 524 (1999)
HNF1β | C>T | R177X | Tomura, H, et al. J Biol Chem. 274, 12975-12978 (1999)
HNF1β | C>G | S465R | Furuta, H, et al. J Clin Endocrinol Metab. 87, 3859-3863 (2002)
HNF1β | C>T | R276X | Furuta, H, et al. J Clin Endocrinol Metab. 87, 3859-3863 (2002)
HNF3β | G>A | A86T | Zhu, Q. et al. Diabetologia. 43, 1197-1200 (2000)
HNF4α | G>A | V393I | Hani, EH, et al. J Clin Invest. 101, 521-526 (1998)
HNF4α | C>T | R127W | Furuta, H, et al. Diabetes. 46, 1652-1657 (1997)
INSR | T>G | F382V | Accili, D, et al. EMBO J. 8, 2509-2517 (1989)
IPF-1 | T>C | C18R | Macfarlane, WM, et al. J Clin Invest. 104, R33-R39 (1999)
IPF-1 | G>A | R197H | Macfarlane, WM, et al. J Clin Invest. 104, R33-R39 (1999)
IPF-1 | G>A | D76N | Macfarlane, WM, et al. J Clin Invest. 104, R33-R39 (1999)
IRS1 | C>G | T608R | Esposito, DL, et al. J Clin Endocrinol Metab. 88, 1468-1475 (2003)
IRS2 | C>G | L647V | Almind, K, et al. Diabetologia. 42, 1244-1249 (1999)
ISL-1 | C>T | Q310X | Shimomura, H, et al. Diabetes. 49, 1597-1600 (2000)
LPL | G>A | A71T | Yang, T, et al. Hum Mutat. 21, 453 (2003)
LPL | G>A | V181I | Yang, T, et al. Hum Mutat. 21, 453 (2003)
LPL | G>A | G188E | Yang, T, et al. Hum Mutat. 21, 453 (2003)
NeuroD | G>T | R111L | Malecki, MT, Nat Genet. 23, 323-328 (1999)
Pax4 | C>T | R121W | Shimajiri, Y, et al. Diabetes. 50, 2864-2869 (2001)
PPARγ2 | G>A | V318M | Barroso, I, et al. Nature. 402:880-883 (1999)
PPARγ2 | C>T | P467L | Barroso, I, et al. Nature. 402:880-883 (1999)
Control Nonsynonymous Variations (CNVs) extracted from Swiss-Prot Database
Gene | Amino acid change
ADRB2 | I159F
ADRB2 | I159L
ADRB2 | K375R
ADRB2 | V34M
ADRB3 | T265M
ANGT | L392M
ANGT | L372V
APM1 | H241P
APM1 | R221S
APM1 | V117M
GCGR | P114A
GLUT2 | L478V
GLUT2 | P68L
GLUT2 | T110I
GLUT2 | V101I
GLUT4 | T78S
GLUT4 | A358V
GYS1 | E359G
GYS1 | E619Q
GYS1 | I108M
GYS1 | K130E
GYS1 | N283S
GYS1 | P691A
GYS1 | S706R
HFE | T217I
HFE | V53M
HFE | V59M
HNF1a | A98V
HNF1a | H514R
HNF1a | S487N
HNF1b | G492S
HNF4a | P436S
HXK4 | A11T
HXK4 | D4N
HXK4 | M107T
ICAM1 | K155N
ICAM1 | K56M
ICAM1 | P352L
ICAM1 | R397Q
ICAM1 | R478W
ICAM1 | V315M
INSR | I1023F
INSR | K492Q
INSR | T448I
INSR | Y1361C
IRS1 | A512P
IRS1 | M209T
IRS1 | P158R
IRS1 | P679A
IRS1 | S809F
IRS1 | S892G
KCNJ11 | R195H
KCNJ11 | S385C
LEPR | K109R
LEPR | K204R
LEPR | K656N
LEPR | S675T
LEPR | T85A
LPL | A288T
LPL | A427T
LPL | D36N
LPL | T379A
LPL | V370M
NPY | L22M
NR3C1 | D233N
NR3C1 | F29L
NR3C1 | F65V
NR3C1 | L112F
PC1 | K171Q
PC1 | K173Q
PC1 | T779P
PC1 | Y268H
PGC1 | T612M
PLA2G4 | K651R
PLA2G4 | V224I
PON1 | M54L
PON2 | V172L
PPARg | P40A
PPR3R1 | R883S
RAGE | Q100R
SOD2 | E66V
SOD2 | I82T
SOD2 | S10I
TGF1b | T263I
WFS1 | A684V
WFS1 | E737K
WFS1 | G576S
WFS1 | G674R
WFS1 | I333V
WFS1 | I720V
WFS1 | R708C
WFS1 | V871M
Supplementary Table 2 a) Characterization of functional properties of Disease Associated nonsynonymous Variations (DAVs) of Type 2 Diabetes Mellitus
Gene
Amino acid change
Residue change
ADCYAP1 G54D
Nonpolar> Negative
ADRB2
ADRB3
RVP-NET PROSITE [1] patterns [2]
In Domain (PFAM) [3]
eMATRIX [4]
Hormone_2 83- No signature 110,132-159 motif
PSIC score (polyphen)
Poly Phen
ChouFasman [5]
0.137
Benign
Turn
1.26
Benign
Turn
2.051
Damaging
Helix
Exposed
Near Tyrosine sulfation site
R16G
Positive> Nonpolar
Exposed
Glycosylation site
7 tm 50-326
Beta-2 adrenergic receptor signature I
W64R
Aromatic> Positive
Buried
No pattern
7 tm 54-346
No signature motif
0.61
Benign
Sheet
1.416
Benign
Sheet
AGT
M235T
Polar> Polar
Exposed
No pattern
Selectin superfamily Serpin domain 99complement481 binding repeat signature I
AGT
T174M
Polar> Polar
Exposed
No pattern
Serpin domain 99-481
Angiotensinog en signature V
APM1
Positive> R112 C Polar
Exposed
No pattern
C1q 117-242
Fibrillar collagen Cterminal domain
No prediction
Damaging
Sheet
CD38
R140W
Positive> Aromatic
Exposed
No pattern
ADP-ribosyl cyclase56-296
ADP-ribosyl cyclase
2.296
Damaging
Helix
FABP2
A54T
Nonpolar> Polar
Exposed
Myristoylation site
Lipocolin 1-131
Cytosolic fattyacid binding protein
0.944
Benign
Helix
G40S
Nonpolar> Polar
Exposed
Glycosylation site
HRM 55-126
Glucagon family receptor signature II
0.35
Benign
Sheet
GCGR
GFPT
I471V
Nonpolar> Nonpolar
Buried
P_Phospho_ site
SIS (sugar No signature binding) 377-511 motif
0.913
Benign
Sheet
GYS1
M416V
Polar> Nonpolar
Buried
No pattern
Glycogen No signature synthase 31-663 motif
1.581
Damaging
Helix
3.143
Damaging
Sheet
1.37
Damaging
Helix
C282Y
Polar> Aromatic
HFE
HFE
Exposed
No pattern
H63D
Positive> Negative
Buried
PKC_phospho_ MHC 1 26-202 site
HNF1α
I27L
Nonpolar> Nonpolar
Buried
No pattern
HNF-1 N 1-176
No signature motif
0.45
Benign
Helix
HNF4α
T130I
Aromatic> Nonpolar
Exposed
No pattern
zf-c4 49-124
No signature motif
1.33
Benign
Sheet
ICAM1
E469K
Negative> Positive
Exposed
No pattern
Ig like 404-475
No signature motif
1.43
Benign
Sheet
INSR
V985M
Nonpolar> Polar
Buried
No pattern
No domain
No signature motif
1.512
Damaging
Sheet
IRS 1
G972R
Nonpolar> Positive
Exposed
Myristoylation site
PH 13-115
No signature motif
1.406
Benign
Near Turn
IRS 2
G1057D
Nonpolar> Negative
Exposed
Myristoylation site
PH 31-144
Synapsin
0.45
Benign
Near Turn
L270V
Nonpolar> Nonpolar
IRK 36-309
Kir6.2 inward rectifier K+ channel signature V
1.183
Benign
Helix
KCNJ11
Buried
No pattern
222-294
Immunoglobul in and major histocompatibi lity complex domain Major histocompatibi lity complex protein, Class I
No pattern
IRK 36-309
Kir6.2 inward rectifier K+ channel signature I
0.006
Benign
Helix
KCNJ11
E23K
Negative> Positive
LEPR
Q223R
Polar> Positive
Exposed
No pattern
Leptin receptor Iq No signature 331-422 motif
0.123
Benign
Helix
LECAM-1
P213S
Nonpolar> Polar
Exposed
Glycosylation site
Sushi
No signature motif
0.982
Benign
Helix
NeuroD
A45T
Nonpolar> Polar
Buried
PKC_phospho_ HLH 102-154 site
No signature motif
0.97
Benign
Helix
NOS3
E298D
Negative> Negative
Exposed
No pattern
NO synthase114- No signature 485 motif
1.301
Benign
Sheet
NPY
L7V
Nonpolar> Polar
Buried
No pattern
Signal peptide 1- No signature 26 motif
0.834
Benign
No prediction
NR3C1
N363S
Polar> Polar
Exposed
No pattern
gcr 26-401
No signature motif
0.078
Benign
No prediction
gcr 26-401
Glucocorticoid receptor (3C nuclear receptor) signature I
1.014
Benign
Helix
Exposed
NR3C1
R23K
Positive> Positive
Exposed
RGD Cell attachment sequence
PC-1
K121Q
Positive> Polar
Exposed
Glycosylation site
Somatomedin B Somatomedin 104-144, 145-189 B domain
0.093
Benign
Turn
PGC 1
G482S
Nonpolar> Polar
Exposed
ARG_RICH
PPAR interacting No signature domain 293-339 motif
0.9
Benign
Helix
PLA2G4A F479L
Aromatic> Nonpolar
Buried
CK2_Phospho_ No signature PLA2 B 190-675 site motif
1.775
Damaging
Sheet
PON 1
Q191R
Polar> Positive
Exposed
Myristoylation site
Arylaceterase 1354
Arylesterase
0.996
Benign
Sheet
PON 2
C311S
Polar> Polar
Buried
No pattern
Arylaceterase 1354
Arylesterase
0.453
Benign
Helix
PON2
G148A
Nonpolar> Nonpolar
Exposed
No pattern
Arylesterase 1-354
Arylesterase
0.932
Benign
Helix
PPAR γ2
P12A
Nonpolar> Nonpolar
Exposed
CK2_phospho_site
zf 137-211, hormone receptor 318-500
No signature motif
PPP1R3
D905Y
Negative> Aromatic
Exposed
No pattern
No signature motif
2.336
Damaging
No prediction
PTPN1
P387L
Nonpolar> Nonpolar
Exposed
PKC_phospho_site
y phosphatase 40-276
No signature motif
1.603
Damaging
Helix
RAGE
G82S
Nonpolar> Polar
Exposed
Glycosylation site
1.581
Damaging
Turn
SOD2
A16V
Nonpolar> Nonpolar
Buried
PKC_phospho_site
Superoxide dismutase 25-106, 110-217
0.379
Benign
Sheet
1.823
Damaging
Sheet
2
Benign
Helix
0.093
Benign
Helix
TGF β
L10P
Nonpolar> Nonpolar
Buried
UCP 1
M229L
Polar> Nonpolar
Buried
UCP2
A55V
Nonpolar> Nonpolar
Buried
No domain
VIg domain 31-101 domain(ligand interaction)
No signature motif
Phosphatidylinositol-specific phospholipase X-box domain
Signal peptide 1-24
Transforming growth factor beta 2 precursor signature I
CK2_phospho_site
mit carr 6-102
Mitochondrial energy transfer proteins (carrier protein)
Myristoylation site
mit carr 12-102
Mitochondrial brown fat uncoupling protein signature II
1.8
Damaging
Near Turn
UTS2
S89N
Polar> Polar
Exposed
No pattern
Urotensin 113-124
RGD-Cell attachment sequence
1.163
Benign
Near Turn
WFS1
R456H
Positive> Positive
Exposed
Myristoylation site
429-451 transmembrane
No signature motif
1.125
Benign
Sheet
WFS1
R611H
Positive> Positive
Exposed
No pattern
588-610 transmembrane
No signature motif
1.125
Benign
Helix
PSIC score
PolyPhen
Chou-Fasman
1.504
Damaging
Turn
No prediction
No prediction
Sheet
b) Characterization of functional properties of Disease Causing Variations (DCVs) of Type 2 Diabetes Mellitus
Gene
Amino acid change
Residue change
RVP-NET
Pattern (PROSITE)
IN Domain (PFAM)
eMATRIX
Exposed
PKC_phospho_site
calc_CGRP_IAPP 32-73
Islet amyloid protein (amylin) signature I
I164T
Nonpolar> Polar
Buried
C1q domain signature
G3PD
F635S
Aromatic> Polar
Buried
CK2_phospho_site
EF hand 627-655
No signature motif
1.667
Damaging
Helix
GCK
A456V
Nonpolar> Nonpolar
Buried
No pattern
SIS 91-259
No signature motif
1.15
Benign
Helix
GCK
G299R
Nonpolar> Positive
Exposed
No pattern
Hexokinase_2 219-458
Hexokinase family signature V
2.743
Damaging
Helix(from pdb)
GCK
A188T
Nonpolar> Polar
Buried
No pattern
Hexokinase_1 10-217
2.082
Damaging
Helix(from pdb)
S20G
Polar> Nonpolar
APM1
Amylin
c1q 117-242
Complement C1Q domain signature II
Hexokinase family
Buried
N-myristoylation site
V383I
Nonpolar> Nonpolar
Buried
Major facilitator superfamily
Sug tr domain 26-483
Sugar transporter signature IV
HNF 1α
G319S
Nonpolar> Polar
Buried
No pattern
HNF-1b_C 280-541
HNF 3β
A86T
Nonpolar> Polar
Buried
No pattern
HNF1β
R177X
Positive> Termination
Termination
No pattern
HNF1β
S465R
Polar> Positive
HNF1β
R276X
HNF4α
V393I
V197I
Nonpolar> Nonpolar
GLUT4
GLUT2
Glu tr 13-499 (transmembrane)
No signature motif
3
Benign
Sheet
0.09
Benign
Helix
No signature motif
1.555
Damaging
Sheet
Fork head domain 159-254
No signature motif
1.427
Benign
No prediction
HNF 1 N 1-182
No signature motif
HNF 1b C 312-551
No signature motif
Positive> Termination
Termination
No pattern
HOX 231-314
No signature motif
Nonpolar> Nonpolar
No pattern
Hormone receptor 183-364
No signature motif
0.693
Benign
Helix
Buried
No pattern
Zinc finger, C4 type 49-124
Retinoid X receptor (2B nuclear receptor) signature I
2.675
Damaging
No prediction
Buried
No pattern
Recep_L_domain 359-474
No signature motif
1.912
Damaging
Near Turn
HNF4α
R127W
Positive> Aromatic
INSR
F382V
Aromatic> Nonpolar
Buried
Buried
N-myristoylation site
Termination
Damaging
1.779
Damaging
Termination
Damaging
No prediction
Helix
No prediction
IPF-1
R197H
Positive> Positive
Exposed
No pattern
Homeobox 147-203
Lambda and other repressor Helix-Turn-Helix signature II
IPF-1
C18R
Polar> Positive
Buried
Proline-rich region
Homeobox 147-203
No signature motif
3.168
Damaging
IPF-1
D76N
Negative> Polar
Exposed
CK2_phospho_site
Homeobox 147-203
No signature motif
1.088
Damaging
Near Turn
IRS1
T608R
Polar> Positive
Exposed
ASPPPH 13-115, IRS phosphoserine 160-262
No signature motif
1.297
Damaging
Sheet
IRS2
L647V
Nonpolar> Nonpolar
Buried
Tyrosine sulfation site
PH 31-144,IRS 194-296
No signature motif
1.135
Benign
Near Turn
ISL-1
Q310X
Polar> Termination
Termination
No pattern
Homeobox 182-238
No signature motif
LPL
A71T
Nonpolar> Polar
Buried
N-myristoylation site
Lipase 12-338
Triacylglycerol lipase family signature II
1.656
Damaging
Helix
LPL
V181I
Nonpolar> Nonpolar
Buried
No pattern
Lipase 12-338
No signature motif
1.114
Benign
Helix
G188E
Nonpolar> Negative
Lipase 12-338
Lipoprotein lipase signature V
0.19
Benign
Helix
HLH 102-154
Helix-Loop-Helix dimerization domain
2.565
Damaging
Helix
LPL
NeuroD
R111L
Positive> Nonpolar
Buried
No pattern
Exposed
Bipartite nuclear targeting sequence
1.582
Damaging
Helix
No prediction
Termination
Damaging
No prediction
Pax4
PPARγ2
PPARγ2
R121W
Positive> Aromatic
V318M
P467L
Nonpolar> Polar
Nonpolar> Nonpolar
Exposed
N-myristoylation site
Pax 5-129
Paired box domain
3.143
Damaging
Helix
Buried
Ligand-binding domain
zf 137-211, hormone receptor 318-500
Peroxisome proliferator-activated receptor gamma signature IV
1.479
Damaging
Helix
Exposed
Ligand-binding domain
zf 137-211, hormone receptor 318-500
Peroxisome proliferator-activated receptor (1C nuclear receptor) signature VII
2.956
Damaging
Sheet
PSIC score
PolyPhen
Chou-Fasman
c) Characterization of functional properties of Control Nonsynonymous Variations (CNVs)
Gene
Amino acid change
ADRB2
I -> F
Nonpolar> Aromatic
Buried
ADRB2
I -> L
Nonpolar> Nonpolar
ADRB2
K -> R
ADRB2
Pattern (PROSITE)
IN Domain (PFAM)
eMATRIX
No pattern
7tm_1(50-326)
GPCR superfamily signature IV
1.5
Benign
Helix
Buried
No pattern
7tm_1(50-326)
GPCR superfamily signature IV
0.6
Benign
Helix
Positive> Positive
Exposed
No pattern
No domain
No signature
0.9
Benign
Helix
V -> M
Nonpolar> Nonpolar
Buried
N-myristoylation site(37-42)
No domain
No signature
0.26
Benign
Helix
ADRB3
T -> M
Polar> Nonpolar
Exposed
No pattern
7tm_1(54-346)
No signature
0.175
Benign
No prediction
ANGT
L -> M
Nonpolar> Nonpolar
Buried
No pattern
Serpin(99-481)
No signature
0.7
Benign
Helix
Buried
cAMP- and cGMP-dependent protein kinase phosphorylation site(370-373)
Serpin(99-481)
No signature
1
Benign
No prediction
ANGT
L->V
Residue change
Nonpolar> Nonpolar
RVP-NET
APM1
H -> P
Positive> Nonpolar
Exposed
N-glycosylation site(230-233)
C1q(114-239)
No signature
Benign
Sheet
APM1
R -> S
Positive> Polar
Exposed
Tyrosine sulfation site(218-232)
C1q(114-239)
No signature
Benign
Loop
Benign
Sheet
1.026
Benign
No prediction
APM1
V -> M
Nonpolar> Nonpolar
Buried
No pattern
C1q(114-239)
C-terminal tandem repeated domain in type 4 procollagen
GCGR
P -> A
Nonpolar> Nonpolar
Buried
No pattern
HRM(55-126)
Glucagon receptor signature IV
GLUT2
L -> V
Nonpolar> Nonpolar
Buried
N-myristoylation site(471-476)
Sugar_tr(13-499)
No signature
0.4
Benign
Sheet
GLUT2
P -> L
Nonpolar> Nonpolar
Exposed
N-glycosylation site (62-65)
Sugar_tr(13-499)
No signature
1.2
Benign
Sheet
GLUT2
T -> I
Polar> Nonpolar
Buried
N-myristoylation site(107-112)
Sugar_tr(13-499)
No signature
1.9
Damaging
Sheet
GLUT2
V -> I
Nonpolar> Nonpolar
Buried
No pattern
Sugar_tr(13-499)
No signature
1.4
Benign
Sheet
GLUT4
T -> S
Polar> Nonpolar
Buried
No pattern
Sugar_tr(26-483)
Glucose transporter type 4 (GLUT4)
0.5
Benign
Sheet
GLUT4
A -> V
Nonpolar> Nonpolar
Exposed
N-myristoylation site(359-364)
Sugar_tr(26-483)
No signature
0.3
Benign
No prediction
GYS1
E -> G
Negative> Nonpolar
Exposed
N-glycosylation site(356-359)
Glycogen_syn(31-663)
No signature
1.26
Benign
Sheet
GYS1
E -> Q
Negative> Polar
Exposed
Tyrosine sulfation site(616-629)
Glycogen_syn(31-663)
No signature
1
Benign
Helix
GYS1
I->M
Nonpolar> Nonpolar
Buried
No pattern
Glycogen_syn(31-663)
No signature
1.4
Benign
Sheet
GYS1
K -> E
Positive> Negative
Exposed
No pattern
Glycogen_syn(31-663)
No signature
1.62
Damaging
Helix
GYS1
N -> S
Exposed
No pattern
Glycogen_syn(31-663)
No signature
1.8
Damaging
Helix
Polar>Polar
GYS1
P -> A
Nonpolar> Nonpolar
Exposed
cAMP- and cGMP-dependent protein kinase phosphorylation site(695-698)
No domain
GYS1
S->R
Nonpolar> Positive
Exposed
No pattern
HFE
HFE
T -> I
V -> M
Polar> Nonpolar
Nonpolar> Nonpolar
No signature
2.01
Damaging
Helix
No domain
No signature
1.1
Benign
No prediction
C1-set(211-294)
Major histocompatibility complex protein, Class I
0.54
Benign
No prediction
Exposed
No pattern
Buried
Casein kinase II phosphorylation site (45-48)
MHC_I(26-202)
A Major histocompatibility complex protein, Class I
1.3
Benign
Sheet
A Major histocompatibility complex protein, Class I
0.56
Benign
Sheet
HFE
V -> M
Nonpolar> Nonpolar
Buried
Protein kinase C phosphorylation site(65-67)
MHC_I(26-202)
HNF1a
A -> V
Nonpolar> Nonpolar
Buried
No pattern
HNF-1_N(1-176)
No signature
1.6
Damaging
Helix
HNF1a
H -> R
Positive> Positive
Exposed
No pattern
HNF-1B_C(282-541)
No signature
2.2
Damaging
No prediction
HNF1a
S -> N
Nonpolar> Polar
Buried
No pattern
HNF-1B_C(282-541)
No signature
0.201
Benign
No prediction
HNF1b
G -> S
Nonpolar> Nonpolar
Exposed
N-myristoylation site(464-469)
HNF-1B_C(314-551)
No signature
No prediction
No prediction
HNF4a
P -> S
Nonpolar> Nonpolar
Exposed
No pattern
No domain
No signature
0.16
Benign
No prediction
HXK4
A -> T
Nonpolar> Polar
Exposed
No pattern
Hexokinase_1 (10-217)
No signature
0.193
Benign
Helix
HXK4
D -> N
Negative> Polar
Buried
No pattern
No domain
No signature
0.13
Benign
Helix
HXK4
M -> T
Nonpolar> Polar
No signature
1.1
Benign
Sheet
ICAM1
K -> N
Positive> Polar
Intercellular adhesion molecule/vascular cell
0.9
Benign
No prediction
Buried
Exposed
Casein kinase II phosphorylation site (437-440)
Hexokinase_1 (10-217)
N-myristoylation site(140-145)
No domain
Exposed
No pattern
ICAM_N(5-115)
Intercellular adhesion molecule/vascular cell
Exposed
No pattern
No domain
No signature
2.1
Damaging
Sheet
Positive> Polar
Exposed
N-glycosylation site(385-388)
No domain
No signature
0.7
Benign
No prediction
R -> W
Positive> Aromatic
Exposed
No pattern
No domain
No signature
1.65
Damaging
Sheet
ICAM1
V -> M
Nonpolar> Nonpolar
Buried
No pattern
No domain
No signature
0.7
Benign
Sheet
INSR
I -> F
Nonpolar> Aromatic
Buried
Protein kinase domain(1023-1298)
Furin-like(179-340)
No signature
2.3
Damaging
Helix
INSR
K -> Q
Positive> Polar
Exposed
No pattern
No domain
No signature
0.767
Benign
Sheet
INSR
T -> I
Polar> Nonpolar
Exposed
No pattern
No domain
No signature
1.295
Benign
No prediction
INSR
Y -> C
Aromatic> Polar
Buried
Tyrosine kinase phosphorylation site(1353-1361)
No domain
No signature
2.8
Damaging
No prediction
IRS1
A -> P
Nonpolar> Nonpolar
Buried
No pattern
No domain
No signature
1.042
Benign
No prediction
IRS(160-262)
Insulin receptor substrate-1 PTB domain signature II
2.3
Damaging
Loop
1.7
Damaging
Loop
ICAM1
K -> M
Positive> Nonpolar
ICAM1
P -> L
Nonpolar> Nonpolar
ICAM1
R -> Q
ICAM1
IRS1
M -> T
Nonpolar> Polar
Buried
No pattern
0.9
Benign
Loop
IRS1
P -> R
Nonpolar> Positive
Exposed
No pattern
No domain
Insulin receptor substrate-1 PTB domain signature I
IRS1
P -> A
Nonpolar> Nonpolar
Exposed
No pattern
No domain
No signature
1.4
Benign
No prediction
IRS1
S -> F
Nonpolar> Aromatic
Buried
N-myristoylation site(677-682)
No domain
No signature
1.3
Benign
No prediction
IRS1
S -> G
Nonpolar> Nonpolar
Exposed
No pattern
No domain
No signature
0.4
Benign
No prediction
KCNJ11
R -> H
Positive> Positive
Exposed
No pattern
No domain
No signature
KCNJ11
S -> C
Nonpolar> Polar
Buried
No pattern
No domain
No signature
LEPR
K -> R
Positive> Positive
Exposed
N-glycosylation site(101-104)
No domain
No signature
LEPR
K -> R
Positive> Positive
Exposed
No pattern
No domain
No signature
LEPR
K -> N
Positive> Polar
LEPR
S -> T
Nonpolar> Polar
Exposed
No pattern
0.18
Benign
Sheet
No prediction
No prediction
0.123
Benign
No prediction
0.141
Benign
No prediction
No domain
Long hematopoietin receptor, gp130 family
1.4
Benign
Helix
No domain
Long hematopoietin receptor, gp130 family
1.3
Benign
Sheet
Buried
No pattern
Exposed
N-glycosylation site(81-84)
No domain
No signature
0.5
Benign
Sheet
LPL
A -> T
Nonpolar> Polar
Buried
Casein kinase II phosphorylation site(292-295)
Lipase(12-338)
Triacylglycerol lipase family signature VI
1.6
Damaging
Turn
LPL
A -> T
Nonpolar> Polar
Buried
No pattern
PLAT(341-427)
Lipoprotein lipase signature VII
0.455
Benign
Sheet
LPL
D -> N
Negative> Polar
Exposed
No pattern
Lipase(12-338)
No signature
1.3
Benign
Helix
LPL
T -> A
Polar> Nonpolar
Buried
No pattern
No domain
No signature
1.66
Damaging
No prediction
LPL
V -> M
Nonpolar> Nonpolar
Buried
Casein kinase II phosphorylation site(385-387)
PLAT(341-427)
No signature
0.98
Benign
No prediction
NPY
L -> M
Nonpolar> Nonpolar
Buried
N-myristoylation site(11-16)
No signature
0.7
Benign
Helix
Exposed
Casein kinase II phosphorylation site(241-244)
GCR(26-401)
No signature
1.4
Benign
No prediction
LEPR
NR3C1
T -> A
Polar> Nonpolar
D -> N
Negative> Polar
No domain
Glucocorticoid receptor (3C nuclear receptor) signature I
GCR(26-401)
No signature
Glucocorticoid receptor (3C nuclear receptor) signature III
NR3C1
F -> L
Aromatic> Nonpolar
Buried
Protein kinase C phosphorylation site(32-34)
GCR(26-401)
NR3C1
F -> V
Aromatic> Nonpolar
Buried
N-myristoylation site(68-73)
0.793
Benign
Helix
2
Damaging
No prediction
1.2
Benign
Sheet
NR3C1
L -> F
Nonpolar> Aromatic
Buried
No pattern
GCR(26-401)
PC1
K -> Q
Positive> Polar
Exposed
No pattern
Somatomedin_B(147-189)
Somatomedin B domain
0.169
Benign
No prediction
PC1
K -> Q
Positive> Polar
Exposed
No pattern
Somatomedin_B(147-189)
Somatomedin B domain
0.093
Benign
No prediction
PC1
T -> P
Polar> Nonpolar
Exposed
No pattern
No domain
DNA/RNA nonspecific endonuclease
0.29
Benign
Helix
PC1
Y -> H
Aromatic> Positive
Buried
No pattern
No domain
No signature
0.17
Damaging
Sheet
PGC1
T -> M
Polar> Nonpolar
Buried
No pattern
No domain
No signature
No prediction
No prediction
Exposed
Bipartite nuclear targeting sequence(651-667)
PLA2_B(190-675)
No signature
0.141
Benign
Sheet
Casein kinase II phosphorylation site(215-218)
PLA2_B(190-675)
Lysophospholipase catalytic domain
0.089
Benign
Sheet
K -> R
Positive> Positive
PLA2G4
V -> I
Nonpolar> Nonpolar
Buried
PON1
M -> L
Nonpolar> Nonpolar
Buried
PON2
V -> L
Nonpolar> Nonpolar
Buried
PLA2G4
PPARg
P -> A
Polar> Nonpolar
Exposed
N-myristoylation site(46-51)
Casein kinase II phosphorylation site(165-168)
No pattern
Arylesterase(1-354)
No signature
0.8
Benign
Sheet
Arylesterase(2-354)
No signature
0.283
Benign
Sheet
No domain
Peroxisome proliferator-activated receptor gamma signature I
1.7
Damaging
No prediction
PPR3R1
R -> S
Positive> Nonpolar
Exposed
No pattern
No domain
No signature
0.2
Benign
No prediction
RAGE
Q -> R
Polar> Positive
Exposed
N-myristoylation site(95-100)
No domain
No signature
0.192
Benign
Sheet
SOD2
E -> V
Negative> Nonpolar
Exposed
No pattern
Sod_Fe_N(25-106)
Manganese and iron superoxide dismutase
0.3
Benign
Helix
SOD2
I -> T
Nonpolar> Polar
Buried
No pattern
Sod_Fe_N(25-106)
No signature
0.9
Benign
Helix
SOD2
S -> I
Nonpolar> Nonpolar
Buried
No pattern
No domain
No signature
1.4
Benign
Helix
0.3
Benign
No prediction
Buried
No pattern
No domain
Transforming growth factor beta 1 precursor signature VI
Buried
Casein kinase II phosphorylation site(691-694)
No domain
No signature
0.9
Benign
Helix
Negative> Positive
Exposed
No pattern
No domain
No signature
0.9
Benign
No prediction
G -> S
Nonpolar> Nonpolar
Buried
No pattern
No domain
No signature
1.3
Benign
Helix
WFS1
G -> R
Nonpolar> Positive
Buried
No pattern
No domain
No signature
1.8
Damaging
Helix
WFS1
I -> V
Nonpolar> Nonpolar
Buried
No pattern
No domain
No signature
0.225
Benign
Sheet
WFS1
I -> V
Nonpolar> Nonpolar
Buried
No pattern
No domain
No signature
0.22
Benign
Helix
WFS1
R -> C
Positive> Polar
Exposed
No pattern
No domain
No signature
1.8
Damaging
Sheet
WFS1
V -> M
Nonpolar> Nonpolar
Buried
No pattern
No domain
No signature
0.6
Benign
Sheet
Polar> Nonpolar
TGF1b
T -> I
WFS1
A -> V
Nonpolar> Nonpolar
WFS1
E -> K
WFS1
References:
1. Ahmad, S., Gromiha, M. M. & Sarai, A. Bioinformatics 19, 1849-1851 (2003).
2. Falquet, L. et al. Nucleic Acids Res. 30, 235-238 (2002).
3. Bateman, A. et al. Nucleic Acids Res. 30, 276-280 (2002).
4. Bennett, S. P., Lu, L. & Brutlag, D. L. Nucleic Acids Res. 31, 3328-3332 (2003).
5. Chou, P. Y. & Fasman, G. D. Biochemistry 13, 211-222 (1974).