In-frame seven amino-acid duplication in AIP arose over the last 3000 years, disrupts protein interaction and stability and is associated with gigantism

Objective Mutations in the aryl hydrocarbon receptor-interacting protein (AIP) gene are associated with pituitary adenoma, acromegaly and gigantism. Identical alleles in unrelated pedigrees could be inherited from a common ancestor or result from recurrent mutation events. Design and methods Observational, inferential and experimental study, including: AIP mutation testing; reconstruction of 14 AIP-region (8.3 Mbp) haplotypes; coalescent-based approximate Bayesian estimation of the time to most recent common ancestor (tMRCA) of the derived allele; forward population simulations to estimate current number of allele carriers; proposal of mutation mechanism; protein structure predictions; co-immunoprecipitation and cycloheximide chase experiments. Results Nine European-origin, unrelated c.805_825dup-positive pedigrees (four familial, five sporadic from the UK, USA and France) included 16 affected (nine gigantism/four acromegaly/two non-functioning pituitary adenoma patients and one prospectively diagnosed acromegaly patient) and nine unaffected carriers. All pedigrees shared a 2.79 Mbp haploblock around AIP with additional haploblocks privately shared between subsets of the pedigrees, indicating the existence of an evolutionarily recent common ancestor, the ‘English founder’, with an estimated median tMRCA of 47 generations (corresponding to 1175 years) with a confidence interval (9–113 generations, equivalent to 225–2825 years). The mutation occurred in a small tandem repeat region predisposed to slipped strand mispairing. The resulting seven amino-acid duplication disrupts interaction with HSP90 and leads to a marked reduction in protein stability. Conclusions The c.805_825dup allele, originating from a common ancestor, associates with a severe clinical phenotype and a high frequency of gigantism. The mutation is likely to be the result of slipped strand mispairing and affects protein–protein interactions and AIP protein stability.


Introduction
Although the majority of pituitary adenomas are sporadic, they co-occur in families in about 2-5% of the cases (1). Rarely, they form a part of multi-glandular tumor syndromes, such as MEN1 or Carney complex. More frequently, these families present with isolated pituitary adenomas (familial isolated pituitary adenoma (FIPA) syndrome). A significant proportion (~20%) of these pedigrees carry one of a variety of heterozygous alleles in the aryl hydrocarbon receptor-interacting protein gene (AIP, MIM: 60555) (2,3). AIP is a 330 amino-acid protein comprised of an N-terminal immunophilin-like domain (4) and a C-terminal domain with three tetratricopeptide (TPR) repeats (5) that are thought to be responsible for protein-protein interactions (1,6). Disease-associated AIP alleles reported to date are mainly the result of nonsense, missense, deletion/insertion or splice site mutations, but a few large deletions and a promoter mutation have also been described (1,7). AIP acts as a tumor suppressor gene, requiring a second somatic hit affecting the wild-type (WT) allele, with loss of heterozygosity identified in a number of cases (6,8,9).
Mutations leading to a truncated protein are dispersed over the entire gene, while missense mutations are more common in the C-terminal end of the molecule (6). Several alleles have been reported in different geographic regions and populations, raising the question whether they arose through recurrent mutation events (occurring at mutational hotspots in the gene) or are inherited from a recent common ancestor. The c.910C > T, p.R304* allele, for example, has been identified in several populations independently (Irish, Romanian, English, Italian, Indian, Mexican) (10,11), consistent with this CpG locus being a mutational hotspot, but in two instances, it has also been shown to give rise locally to numerous patients originating from the same ancestor (Ireland and Italy) (10,12).
We have previously reported the c.805_825dup, p.F269_H275dup AIP exon 6 in-frame duplication (rs267606578/EF643650) in three affected members of a family from the United Kingdom (3) and in an apparently sporadic giant from France (13). Here, we report seven additional pedigrees (four from the United Kingdom and three from the USA) with the same allele and provide evidence that they are all derived from a common ancestor. Using a coalescent-based approximate Bayesian computation approach, we estimate the time to most recent common ancestor (tMRCA), and by forward simulation, we estimate the current number of carriers. We hypothesize that slipped strand mispairing led to this unusual duplication and the inserted amino-acids render the protein extremely unstable and, at the same time, disrupt the binding site for crucial partner proteins.

Patients and methods
The study protocol was approved by the local ethics committees. All participants provided signed informed consent before the study. AIP genotyping (sequencing and multiplex ligation-dependent probe amplification) was performed as previously published (9,14). We report nine c.805_825dup-positive pedigrees ( Table 1, Fig. 1): two previously described, diagnosed in the United Kingdom (3) and France (13), and seven new ones, four diagnosed in the United Kingdom and three in the USA. All individuals have primarily European ancestry, with a known family link to the United Kingdom, but were not known to be related; they were living in France, different United Kingdom counties and two USA states (Maryland and Tennessee) (Fig. 1). Four of the nine pedigrees were FIPA families (Fig. 1), while five patients presented as simplex (apparently sporadic) cases.
We genotyped 14 short tandem repeat loci (microsatellites) around AIP (Fig. 2) in the index case of each pedigree, and in a second individual in each of the United Kingdom FIPA 1 and 2 pedigrees, as previously European Journal of Endocrinology (2017) 177, 257-266 described (10). Haplotypes were inferred using PHASE (15), incorporating phasing information from UK FIPA 1 and 2 pedigrees. To obtain tMRCA estimates, we used a combined approach of coalescent theory calculations and simulation in an approximate Bayesian computation framework (16). We used ms (17) for the simulations, combining recombination and mutation rates in order to assess the probability of neither event in the region of interest, as previously described (11). Based on the tMRCA estimates, we employed a forward simulation approach (10,11) to estimate the current number of carriers per generation. We reject simulations that generated <9 carriers, because we observed in our dataset at least 9 unrelated carriers.
A structural model for the mutated C-terminal domain of the AIP protein was generated by templatebased modeling, using the Phyre2 prediction server (18) (top templates PDB: 4aif and 4gcn; Phyre2 confidence score 100.0% and 99.6% respectively). Phyre2 uses the alignment of Hidden Markov models via HHsearch (19) to significantly improve the accuracy of alignment and detection rate. A match with a confidence score >90% indicates that the core of the protein is modeled at high accuracy (2-4 Å root mean square deviation from the native, true structure). However, surface loops can deviate from the native. Regions not covered by templates are modeled ab initio using the Poing algorithm embedded in Phyre2 (18).
The interaction between mutant AIP and HSP90, a wellestablished AIP-interacting partner (5), was investigated by co-immunoprecipitation. The coding sequence of AIP with the c.805_825dup, p.F269_H275dup mutation (20) was sub-cloned into the pcDNA3.0 vector C-terminally to the Myc tag epitope (3). Myc-tagged p.F269_H275dup AIP and HA-tagged HSP90β were co-transfected in HEK293 cells (10 × 10 6 , 20 µg of total plasmid DNA), using Lipofectamine 2000 (Life Technologies) following the manufacturer's instructions. Cells were harvested 48 h later by trypsinization and resuspended in 1.5 mL of lysis buffer (150 mM NaCl, 10 mM Tris-HCl pH 7.5, 10% v/v glycerol, 1% v/v IGEPAL CA-630 (Sigma-Aldrich) supplemented with Complete Protease Inhibitor Cocktail (Roche)). After being cleared by centrifugation, lysates were cleaned up by incubation with 50 µL of Protein G Sepharose 4 Fast Flow (GE Healthcare) and divided in thirds for incubation with 5 µg of anti-Myc (Santa Cruz Biotechnology; clone 9E10) or anti-HA (Sigma-Aldrich H3663) mouse monoclonal antibodies, or mouse IgG (Sigma-Aldrich I5381), as appropriate. The co-immunoprecipitation was performed as previously described (21), and the eluates were resolved by denaturing polyacrylamide gel electrophoresis followed by Western blot using anti-Myc (Santa Cruz Biotechnology; clone 9E10) and anti-HA (Sigma-Aldrich H6908) antibodies. The experiment was repeated twice for confirmation. Protein stability was assessed by cycloheximide chase experiments, as previously described (21). Briefly, Myc-tagged WT, p.F269_H275dup and p.R304* AIP     Table 1. A full colour were overexpressed in HEK293 cells, and the cells were subsequently treated with cycloheximide 20 μg/mL (Abcam) for the time indicated. Cell lysates were prepared for polyacrylamide gel electrophoresis and Western blot with anti-Myc (Santa Cruz Biotechnology; clone 9E10) and anti-GAPDH (Santa Cruz Biotechnology FL-335) antibodies. The normalized protein levels were expressed as percentage of those observed at time 0. The results of at least two independent experiments, in triplicates, were pooled together and analyzed using a one-phase decay equation. The degradation constant (K) was compared between the mutants and the WT protein using the extra sum-of-squares F test. Significance was set for P values <0.05.

Ethical approval
All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.

Results
The majority of the c.805_825dup-positive affected individuals presented with somatotropinoma (14 of 16 patients), but two clinically non-functioning pituitary adenomas (NFPA) were also observed ( Table 1). Two FIPA families were homogenous, with all affected members harboring somatotropinomas and the other two were heterogeneous with both somatotropinomas and clinically NFPAs (Fig. 1). From a total of 16 affected allele carriers (4F/12M), 13 patients (10M) presented with signs/symptoms of pituitary macroadenoma: eight gigantism (1F/7M), three acromegaly (3M), 2 clinically non-functioning pituitary adenomas (2F, one with 5-10% GH-positive cells while for the other immunostaining is not available) ( Table 1). One asymptomatic male patient (UK FIPA 1), identified prospectively after genetic screening, had a GH-secreting pituitary microadenoma (IGF-1 = 1.5 × upper limit of normal) and a 1 cm gastrointestinal stromal tumor; after three years of medical therapy, his pituitary adenoma was successfully operated on, while he remains under surveillance for the gastrointestinal stromal tumor. One patient, originally operated at the age of 15 years when histology showed somatotroph hyperplasia, was suffering from diabetes and hypertension and died following leg surgery at the age of 35 years (UK FIPA 1). One subject (France Simplex) affected by gigantism had dilated cardiomyopathy, aortic dilatation and history of infantile psychosis.
The overall average age of onset of symptoms was 19.1 years (median 18.5, range 9-34).
Haplotype analysis showed a 2.79-3.63 Mbp (1.83-2.59 cM) haploblock around AIP shared among c.805_825dup allele-carrying chromosomes in all nine pedigrees, as well as additional haploblocks shared among subsets of chromosomes, strongly supporting the existence of a recent common ancestor; the 'English founder' (Fig. 2). We observed a different allele of D11S1249 in one subject (US Simplex), on the c.805_825dup allele-carrying haplotype and conditioned our coalescent simulation results on the occurrence of a mutation at this locus.
The estimated current number of carriers per generation -obtained through forward simulations -was 32 (9-299) (median, 95% CI), based on HapMap genetic distances, and 32 (9-250), based on Rutgers map distances ( Fig. 3C and D). These results integrate the uncertainty of the tMRCA estimates. Assuming that three 25-year generations co-exist at present, we estimate that 96 (27-897) (median, 95% CI, HapMap-based) c.805_825dup carriers may exist today, of which only 25 have been identified so far.
We studied the structure of the DNA sequence around the 21 base-pair tandem duplication allele. We have identified a region in the normal sequence with two short repeated sequences separated by an intervening sequence ( Fig. 4A and B), which is typical for a site prone to 'replication slippage'. Tandem duplications associated with short direct repeats can be explained by the model of 'replication slippage' or 'slipped strand mispairing' (25,26). The key feature of this model is that, during replication, the primer strand transiently dissociates from the template, slips backward or forward, and then re-associates at the short direct repeat resulting in a misaligned configuration (Fig. 4A). If the primer strand containing the newly synthesized second (i.e. 3′) repeat dissociates from the template strand and then misaligns at the first (i.e. 5′) direct repeat, continued DNA synthesis will lead to the insertion of one of two direct repeats plus the intervening sequence. In our case, the repeat Clinical Study R Salvatori, S Radian, Y Diekmann, D Iacovazzo and others Founder effect in AIP mutation www.eje-online.org is imperfect (caaggcctac and caaggcccac) (Fig. 4B), the intervening sequence is ttcaagcgggg and the duplicated region is ttcaagcggggcaaggcccac.
The mutation results in the duplication of seven amino-acids in the third TPR domain of AIP (TPR3). These can adopt a helical conformation, thus adding a 1.9 extra helical turn to the TPR3 alpha helix and increasing its length by 10.5 Å. The TPR3 domain is important for the interaction of AIP with other proteins, in particular, HSP90 and TOMM7 peptides (5). Superposition between the WT C-terminal of the AIP molecule (PDB: 4AIF) and the mutant C-terminal AIP model shows that the seven amino acid duplication may displace K266, which is crucial for HSP90 and possibly other partners binding (Fig. 4C, D and E), thus potentially leading to the disruption of AIP function. The results of a co-immunoprecipitation experiment with mutant AIP and HSP90 were consistent with lack of interaction between the two proteins (Fig. 5A), supporting the results of our in silico prediction. As compared with WT AIP, the steady state levels of the p.F269_H275dup protein were markedly reduced, and this led us to hypothesize an effect of the mutation on AIP protein stability as well, consistent with what has already been shown for several other AIP mutations (21). Cycloheximide chase experiments showed that the p.F269_H275dup mutation leads to a markedly unstable protein with a degradation constant K of 1.169, significantly higher than that of WT AIP (0.0593; P < 0.001) and even that of the truncating mutant p.R304* (0.3147; P < 0.05) (Fig. 5B), suggesting shorter protein half-life.

Discussion
AIP mutations are reported in about 20% of FIPA pedigrees (1,27), but a higher percentage has been observed in homogenous acromegaly families. Over 90 AIP mutations have been described to date. The R304 locus seems to be the most prevalent mutational 'hot spot', with mutations affecting the c.C910 and c.G911 base-pairs of this CpG site, resulting in R304* and R304Q alleles reported in various countries and unrelated families (28). A few other mutations have also been described as recurring in diverse geographical regions, such as the p.R271W and the p.R81* (3,29,30,31). In this paper, we report nine apparently unrelated pedigrees all carrying the same unique c.805_825dup, p.Phe269_His275dup AIP allele.  Seven of these pedigrees were not previously reported, and we provide updated information for the other two (3,9,13,20). When the same allele is identified in different families, the question arises whether this is the result of independent recurrent mutational events in a region of the genome that is more prone to mutations ('hot spot', such as CpG sites) (32) or the result of spreading of the disease-associated allele in the population originating from a single founder mutation event. The c.805_825dup,  and of the mutated allele (top row), starting at the c.781 base-pair (NM_003977.2). Note the two 9 base-pair long repeats, the 5′ one marked in purple (caaggcctac) and the 3′ one marked in blue (caaggcccac). These are imperfect repeats as the 7th base-pair (in red font) is t in the 5′ repeat and c in the 3′ repeat. The intervening sequence between the two direct repeats is ttcaagcgggg.
The underlined sequence is duplicated in the upper sequence, i.e. the mutated allele, in which the first copy of the duplicated sequence is underlined and the second copy of the duplicated sequence is italic. This duplication leads to three copies of the 9 bp repeat: the purple is the 5′ repeat, while the blue is the 3′ repeat which has been duplicated. (C, D and E) Cartoon representation of the AIP protein structure based on the crystal structure of the N and the C-terminal domains of the protein (4,5). The first and second alpha helices of the WT AIP TPR2 and TPR3 motifs (C) are presented in light orange and dark orange respectively. Details of the wild-type (D) and mutant (E) AIP bound to the SRMEEVD peptide, a fragment of the HSP90 partner protein (shown in blue). The seven amino-acid duplication in the first helix of AIP TPR3 is shown in purple. Residue K266, which in the wild-type AIP interacts with the HSP90 peptide (or with TOMM20 peptide) ( (25,26,33) (Fig. 4A). Certain structural features of the genome can predispose a particular region to rearrangement, such as repeated sequences located relatively close to each other, raising the possibility that similar duplications might occur independently. On the other hand, while the patients described here reside in three different areas of the world, all the individuals have ancestral links to the United Kingdom. Furthermore, the steady flow of immigration from Britain to North America in recent centuries and the close proximity of England and France are consistent with an evolutionarily recent common ancestor. Indeed, the result of our haplotype analysis strongly indicates that all patients share a recent common ancestor, the 'English founder', who lived approximately 1175 years ago. This estimate has, however, a wide 95% confidence interval of 225-2825 years. The cause of the geographic spreading of this allele is unknown, although mobility in the areas where carriers reside was historically highparticularly since the industrial revolution (34). The penetrance of the pituitary phenotype in patients with mutated AIP is known to be low (about 20%) (27). This is confirmed in the largest c.805_825dup-positive pedigree (UK FIPA 1), with an estimated penetrance of 23% (assuming that 50% of the relatives not genotyped would be carriers). Because of this low penetrance, AIP mutations do not prevent reproduction in the majority of carriers. Furthermore, even patients who develop pituitary adenomas may maintain normal fertility until the adenoma is large enough to cause hypogonadism. Whether the mutated allele confers some kind of survival advantage is unknown (11). It is important to note that four of the patients did not have a family history of pituitary adenoma. Following our haplotype analysis, these are unlikely to be de novo mutations. These patients were tested for AIP mutations because of their clinical characteristics (large adenomas developing at young age, with onset of symptoms ranging between 9 and 20 years), confirming previous reports of relatively high prevalence of AIP mutations in apparently sporadic GH-secreting macroadenomas of the youth (27,35,36).
The c.805_825dup AIP allele is pathogenic, as indicated by the clinical, in silico and experimental data. The mutation is predicted to affect AIP structure and function by altering a crucial interaction site between the third TPR domain of AIP (Fig. 4) and one of its most important partners, HSP90 (5), and this has been confirmed experimentally (Fig. 5A). Mutations at proteinprotein binding sites are a common cause of disease as they can disrupt the interaction between molecules in cellular protein networks, which is required for correct protein function (37). An alternative pathogenic mechanism by which the c.805_825dup (p.F269_H275dup) affects AIP function is by causing the disruption of AIP protein folding, leading to increased proteasomal degradation, as we have previously shown for numerous AIP missense mutations (21). Here, we provided evidence that the p.F269_H275dup mutation markedly affects protein stability, leading to a protein that is significantly less stable compared with the common p.R304* truncating mutant.
In summary, we report the same AIP allele occurring in nine apparently unrelated pedigrees from three different (but linked through gene flow) countries and show that all the affected subjects, the majority suffering from youngonset somatotropinomas, share a common ancestral haplotype. The mutation results in the duplication of seven amino-acids in third TPR domain of AIP, leading to the disruption of protein-protein interactions and markedly reduced protein stability.