Background: Low copy repeats (LCRs) are thought to play an important role in recent gene evolution, especially when they facilitate gene duplications. Duplicate genes are fundamental to adaptive evolution, providing substrates for the development of new or shared gene functions. Moreover, silencing of duplicate genes can have an indirect effect on adaptive evolution by causing genomic relocation of functional genes. These changes are theorized to have been a major factor in speciation. Results: Here we present a novel example showing functional gene relocation within an LCR. We characterize the genomic structure and gene content of eight related LCRs on human Chromosomes 7 and 12. Two members of a novel transmembrane gene family, DPY19L, were identified in these regions, along with six transcribed pseudogenes. One of these genes, DPY19L2, is found on Chromosome 12 and is not syntenic with its mouse orthologue. Instead, the human locus syntenic to mouse Dpy19l2 contains a pseudogene, DPY19L2P1. This indicates that the ancestral copy of this gene has been silenced, while the descendant copy has remained active. Thus, the functional copy of this gene has been relocated to a new genomic locus. We then describe the expansion and evolution of the DPY19L gene family from a single gene found in invertebrate animals. Ancient duplications have led to multiple homologues in different lineages, with three in fish, frogs and birds and four in mammals. Conclusion: Our results show that the DPY19L family has expanded throughout the vertebrate lineage and has undergone recent primate-specific evolution within LCRs.
Li, Zhen; Defoort, Jonas; Tasdighian, Setareh; Maere, Steven; Van de Peer, Yves; De Smet, Riet
Gene duplication is an important mechanism for adding to genomic novelty. Hence, which genes undergo duplication and are preserved following duplication is an important question. It has been observed that gene duplicability, or the ability of genes to be retained following duplication, is a nonrandom process, with certain genes being more amenable to survive duplication events than others. Primarily, gene essentiality and the type of duplication (small-scale versus large-scale) have been shown in different species to influence the (long-term) survival of novel genes. However, an overarching view of "gene duplicability" is lacking, mainly due to the fact that previous studies usually focused on individual species and did not account for the influence of genomic context and the time of duplication. Here, we present a large-scale study in which we investigated duplicate retention for 9178 gene families shared between 37 flowering plant species, referred to as angiosperm core gene families. For most gene families, we observe a strikingly consistent pattern of gene duplicability across species, with gene families being either primarily single-copy or multicopy in all species. An intermediate class contains gene families that are often retained in duplicate for periods extending to tens of millions of years after whole-genome duplication, but ultimately appear to be largely restored to singleton status, suggesting that these genes may be dosage balance sensitive. The distinction between single-copy and multicopy gene families is reflected in their functional annotation, with single-copy genes being mainly involved in the maintenance of genome stability and organelle function and multicopy genes in signaling, transport, and metabolism. The intermediate class was overrepresented in regulatory genes, further suggesting that these represent putative dosage-balance-sensitive genes. © 2016 American Society of Plant Biologists. All rights reserved. PMID:26744215
Gene duplication followed by subfunctionalization and neofunctionalization is of great evolutionary importance. In plant genomes, duplicated genes may result from either polyploidization (homoeologous genes) or segmental chromosome duplications (paralogous genes). In allohexaploid wheat Triticum aestivum L. (2n=6x=42, genome BBAADD), both homoeologous and paralogous copies were found for the regulatory gene Myc, encoding a MYC-like transcription factor in the biosynthesis of flavonoid pigments, anthocyanins, and for the structural gene F3h, encoding one of the key enzymes of flavonoid biosynthesis, flavanone 3-hydroxylase. Of the 5 copies (3 homoeologous and 2 paralogous) of the Myc gene found in T. aestivum, only one plays a regulatory role in anthocyanin biosynthesis, interacting in a complementary manner with another transcription factor (MYB-like) to confer purple pigmentation of the grain pericarp in wheat. The role and functionality of the other 4 copies of the Myc gene remain unknown. Of the 4 functional copies of the F3h gene in T. aestivum, three homoeologues have similar functions. They are expressed in wheat organs colored with anthocyanins or in the endosperm, where they participate in the biosynthesis of uncolored flavonoid substances. The fourth copy (the B-genomic paralogue) is transcribed neither in wheat organs colored with anthocyanins nor in seeds; however, its expression has been observed in roots of aluminium-stressed plants, where the three homoeologous copies are not active. Functional diversification of the duplicated flavonoid biosynthesis genes in wheat may be a reason for the maintenance of the duplicated copies and for preventing their pseudogenization. The study was supported by RFBR (11-04-92707). We also thank Ms. Galina Generalova for technical assistance.
Furihata, Hazuka Y; Suenaga, Kazuya; Kawanabe, Takahiro; Yoshida, Takanori; Kawabe, Akira
PRC2 genes were analyzed for their number of gene duplications, dN/dS ratios and expression patterns among Brassicaceae and Gramineae species. Although both amino acid sequences and copy number of the PRC2 genes were generally well conserved in both Brassicaceae and Gramineae species, we observed that some rapidly evolving genes experienced duplications and expression pattern changes. After multiple duplication events, all but one or two of the duplicated copies tend to be silenced. Silenced copies were reactivated in the endosperm and showed ectopic expression in developing seeds. The results indicated that rapid evolution of some PRC2 genes is initially caused by a relaxation of selective constraint following the gene duplication events. Several loci could become maternally expressed imprinted genes and acquire functional roles in the endosperm.
Arabidopsis thaliana became the model organism for plant studies because of its small diploid genome, rapid lifecycle and short adult size. Its genome was the first among plants to be sequenced, becoming the reference in plant genomics. However, the Arabidopsis genome is characterized by an inherently complex organization, since it has undergone ancient whole genome duplications, followed by gene reduction, diploidization events and extended rearrangements, which relocated and split up the retained portions. These events, together with probable chromosome reductions, dramatically increased the genome complexity, limiting its role as a reference. The identification of paralogs and single copy genes within a highly duplicated genome is a prerequisite to understand its organization and evolution and to improve its exploitation in comparative genomics. This is still controversial, even in the widely studied Arabidopsis genome. This is also due to the lack of a reference bioinformatics pipeline that could exhaustively identify paralogs and singleton genes. We describe here a complete computational strategy to detect both duplicated and single copy genes in a genome, discussing all the methodological issues that may strongly affect the results, their quality and their reliability. This approach was used to analyze the organization of Arabidopsis nuclear protein coding genes, and besides classifying computationally defined paralogs into networks and single copy genes into different classes, it unraveled further intriguing aspects concerning the genome annotation and the gene relationships in this reference plant species. Since our results may be useful for comparative genomics and genome functional analyses, we organized a dedicated web interface to make them accessible to the scientific community.
Background: Gene duplication and exonization of intronic transposed elements are two mechanisms that enhance genomic diversity. We examined whether there is less selection against exonization of transposed elements in duplicated genes than in single-copy genes. Results: Genome-wide analysis of exonization of transposed elements revealed a higher rate of exonization within duplicated genes relative to single-copy genes. The gene for TIF-IA, an RNA polymerase I transcription initiation factor, underwent a humanoid-specific triplication; all three copies of the gene are transcriptionally active, although only one copy retains the ability to generate the TIF-IA protein. Prior to TIF-IA triplication, an Alu element was inserted into the first intron. In one of the non-protein-coding copies, this Alu is exonized. We identified a single point mutation leading to exonization in one of the gene duplicates. When this mutation was introduced into the TIF-IA coding copy, exonization was activated and the level of the protein-coding mRNA was reduced substantially. A very low level of exonization was detected in normal human cells. However, this exonization was abundant in most leukemia cell lines evaluated, although the genomic sequence is unchanged in these cancerous cells compared to normal cells. Conclusion: The definition of the Alu element within the TIF-IA gene as an exon is restricted to certain types of cancers; the element is not exonized in normal human cells. These results further our understanding of the delicate interplay between gene duplication and alternative splicing and of the molecular evolutionary mechanisms leading to genetic innovations. This implies the existence of purifying selection against exonization in single-copy genes, with duplicate genes free from such constraints.
Chain Frédéric JJ
Background: Gene duplication is an important biological phenomenon associated with genomic redundancy, degeneration, specialization, innovation, and speciation. After duplication, both copies continue functioning when natural selection favors duplicated protein function or expression, or when mutations make them functionally distinct before one copy is silenced. Results: Here we quantify the degree to which genetic parameters related to gene expression, molecular evolution, and gene structure in a diploid frog - Silurana tropicalis - influence the odds of functional persistence of orthologous duplicate genes in a closely related tetraploid species - Xenopus laevis. Using public databases and 454 pyrosequencing, we obtained genetic and expression data from S. tropicalis orthologs of 3,387 X. laevis paralogs and 4,746 X. laevis singletons - the most comprehensive dataset for African clawed frogs yet analyzed. Using logistic regression, we demonstrate that the most important predictors of the odds of duplicate gene persistence in the tetraploid species are the total gene expression level and evenness of expression across tissues and development in the diploid species. Slow protein evolution and information density (fewer exons, shorter introns) in the diploid are also positively correlated with duplicate gene persistence in the tetraploid. Conclusions: Our findings suggest that a combination of factors contributes to duplicate gene persistence following whole genome duplication, but that the total expression level and evenness of expression across tissues and through development before duplication are most important. We speculate that these parameters are useful predictors of duplicate gene longevity after whole genome duplication in other taxa.
Ancliff, Mark; Park, Jeong-Man
We present and solve the dynamics of a model for gene duplication showing escape from adaptive conflict. We use a Crow-Kimura quasispecies model of evolution where the fitness landscape is a function of Hamming distances from two reference sequences, which are assumed to optimize two different gene functions, to describe the dynamics of a mixed population of individuals with single and double copies of a pleiotropic gene. The evolution equations are solved through a spin coherent state path integral, and we find two phases: one is an escape from an adaptive conflict phase, where each copy of a duplicated gene evolves toward subfunctionalization, and the other is a duplication loss of function phase, where one copy maintains its pleiotropic form and the other copy undergoes neutral mutation. The phase is determined by a competition between the fitness benefits of subfunctionalization and the greater mutational load associated with maintaining two gene copies. In the escape phase, we find a dynamics of an initial population of single gene sequences only which escape adaptive conflict through gene duplication and find that there are two time regimes: until a time t* single gene sequences dominate, and after t* double gene sequences outgrow single gene sequences. The time t* is identified as the time necessary for subfunctionalization to evolve and spread throughout the double gene sequences, and we show that there is an optimum mutation rate which minimizes this time scale.
Background: Duplicated genes frequently experience asymmetric rates of sequence evolution. Relaxed selective constraints and positive selection have both been invoked to explain the observation that one paralog within a gene-duplicate pair exhibits an accelerated rate of sequence evolution. In the majority of studies where asymmetric divergence has been established, there is no indication as to which gene copy, ancestral or derived, is evolving more rapidly. In this study we investigated the effect of local synteny (gene-neighborhood conservation) and codon usage on the sequence evolution of gene duplicates in the S. cerevisiae genome. We further distinguish the gene duplicates into those that originated from a whole-genome duplication (WGD) event (ohnologs) versus small-scale duplications (SSD) to determine if there exist any differences in their patterns of sequence evolution. Results: For SSD pairs, the derived copy evolves faster than the ancestral copy. However, there is no relationship between rate asymmetry and synteny conservation (ancestral-like versus derived-like) in ohnologs. mRNA abundance and optimal codon usage as measured by the CAI are lower in the derived SSD copies relative to ancestral paralogs. Moreover, in the case of ohnologs, the faster-evolving copy has lower CAI and lowered expression. Conclusions: Together, these results suggest that relaxation of selection for codon usage and gene expression contributes to rate asymmetry in the evolution of duplicated genes and that in SSD pairs, the relaxation of selection stems from the loss of ancestral regulatory information in the derived copy.
Background: Most genes in Arabidopsis thaliana are members of gene families. How do the members of gene families arise, and how are gene family copy numbers maintained? Some gene families may evolve primarily through tandem duplication and high rates of birth and death in clusters, and others through infrequent polyploidy or large-scale segmental duplications and subsequent losses. Results: Our approach to understanding the mechanisms of gene family evolution was to construct phylogenies for 50 large gene families in Arabidopsis thaliana, identify large internal segmental duplications in Arabidopsis, map gene duplications onto the segmental duplications, and use this information to identify which nodes in each phylogeny arose due to segmental or tandem duplication. Examples of six gene families exemplifying characteristic modes are described. Distributions of gene family sizes and patterns of duplication by genomic distance are also described in order to characterize patterns of local duplication and copy number for large gene families. Both gene family size and duplication by distance closely follow power-law distributions. Conclusions: Combining information about genomic segmental duplications, gene family phylogenies, and gene positions provides a method to evaluate contributions of tandem duplication and segmental genome duplication in the generation and maintenance of gene families. These differences appear to correspond meaningfully to differences in functional roles of the members of the gene families.
Fujimura, Koji; Conte, Matthew A.; Kocher, Thomas D.
vasa is a highly conserved RNA helicase involved in animal germ cell development. Among vertebrate species, it is typically present as a single copy per genome. Here we report the isolation and sequencing of BAC clones for Nile tilapia vasa genes. Contrary to a previous report that Nile tilapia have a single copy of the vasa gene, we find evidence for at least three vasa gene loci. The vasa gene locus was duplicated from the original site and integrated into two distant novel sites. For one of these insertions we find evidence that the duplication was mediated by a circular DNA intermediate. This mechanism of gene duplication may explain the origin of isolated gene duplicates during the evolution of fish genomes. These data provide a foundation for studying the role of multiple vasa genes in the development of tilapia gonads, and will contribute to investigations of the molecular mechanisms of sex determination and evolution in cichlid fishes. PMID:22216289
Gout, Jean-Francois; Lynch, Michael
Whole-genome duplications (WGDs) have contributed to gene-repertoire enrichment in many eukaryotic lineages. However, most duplicated genes are eventually lost and it is still unclear why some duplicated genes are evolutionarily successful whereas others quickly turn into pseudogenes. Here, we show that dosage constraints are major factors opposing post-WGD gene loss in several Paramecium species that share a common ancestral WGD. We propose a model where a majority of WGD-derived duplicates preserve their ancestral function and are retained to produce enough of the proteins performing this same ancestral function. Under this model, the expression level of individual duplicated genes can evolve neutrally as long as they maintain a roughly constant summed expression, and this allows random genetic drift toward uneven contributions of the two copies to total expression. Our analysis suggests that once a high level of imbalance is reached, which can require substantial lengths of time, the copy with the lowest expression level contributes a small enough fraction of the total expression that selection no longer opposes its loss. Extension of our analysis to yeast species sharing a common ancestral WGD yields similar results, suggesting that duplicated-gene retention for dosage constraints followed by divergence in expression level and eventual deterministic gene loss might be a universal feature of post-WGD evolution. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved.
Dunand, Christophe; Mathé, Catherine; Lazzarotto, Fernanda; Margis, Rogério; Margis-Pinheiro, Marcia
Phylogenetic, genomic and functional analyses have allowed the identification of a new class of putative heme peroxidases, the so-called APx-R (APx-Related). This new class, mainly present in the green lineage (including green algae and land plants), can also be detected in other unicellular chloroplastic organisms. Except for recent polyploid organisms, only a single copy of the APx-R gene was detected in each genome, suggesting that the majority of the APx-R extra copies were lost after chromosomal or segmental duplications. Similarly, most APx-R co-expressed genes in the Arabidopsis genome do not have conserved extra copies after chromosomal duplications and are predicted to be localized in organelles, as is APx-R. The members of this gene network can be considered unique genes, well conserved through evolution due to strong negative selection pressure and a low evolution rate. © 2011 Landes Bioscience
Fares, Mario A; Sabater-Muñoz, Beatriz; Toft, Christina
Gene duplication generates new genetic material, which has been shown to lead to major innovations in unicellular and multicellular organisms. A whole-genome duplication occurred in the ancestor of Saccharomyces yeast species but 92% of duplicates returned to single-copy genes shortly after duplication. The persisting duplicated genes in Saccharomyces led to the origin of major metabolic innovations, which have been the source of the unique biotechnological capabilities in the Baker's yeast Saccharomyces cerevisiae. What factors have determined the fate of duplicated genes remains unknown. Here, we report the first demonstration that the local genome mutation and transcription rates determine the fate of duplicates. We show, for the first time, a preferential location of duplicated genes in the mutational and transcriptional hotspots of S. cerevisiae genome. The mechanism of duplication matters, with whole-genome duplicates exhibiting different preservation trends compared to small-scale duplicates. Genome mutational and transcriptional hotspots are rich in duplicates with large repetitive promoter elements. Saccharomyces cerevisiae shows more tolerance to deleterious mutations in duplicates with repetitive promoter elements, which in turn exhibit higher transcriptional plasticity against environmental perturbations. Our data demonstrate that the genome traps duplicates through the accelerated regulatory and functional divergence of their gene copies providing a source of novel adaptations in yeast. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Hasselmann, Martin; Lechner, Sarah; Schulte, Christina; Beye, Martin
The most remarkable outcome of a gene duplication event is the evolution of a novel function. Little information exists, however, on how the rise of a novel function affects the evolution of its paralogous sister gene copy. We studied the evolution of the feminizer (fem) gene, from which the gene complementary sex determiner (csd) recently derived by tandem duplication within the honey bee (Apis) lineage. Previous studies showed that fem retained its sex determination function, whereas the rise of csd established a new primary signal of sex determination. We observed a specific reduction of nonsynonymous to synonymous substitution ratios in Apis relative to non-Apis fem. We found a contrasting pattern at two other genetically linked genes, suggesting that hitchhiking effects to csd, the locus under balancing selection, are not the cause of this evolutionary pattern. We also excluded higher synonymous substitution rates by relative rate testing. These results imply that stronger purifying selection is operating at the fem gene in the presence of csd. We propose that csd's new function interferes with the function of Fem protein, resulting in molecular constraints and limited evolvability of fem in the Apis lineage. Elevated silent nucleotide polymorphism in fem relative to the genome-wide average suggests that genetic linkage to the csd gene maintained more nucleotide variation in today's population. Our findings provide evidence that csd functionally and genetically interferes with fem, suggesting that a newly evolved gene and its functions can limit the evolutionary capability of other genes in the genome.
Jakobek Judy L
Background: The biosynthesis of aflatoxin (AF) involves over 20 enzymatic reactions in a complex polyketide pathway that converts acetate and malonate to the intermediates sterigmatocystin (ST) and O-methylsterigmatocystin (OMST), the respective penultimate and ultimate precursors of AF. Although these precursors are chemically and structurally very similar, their accumulation differs at the species level for Aspergilli. Notable examples are A. nidulans, which synthesizes only ST, A. flavus, which makes predominantly AF, and A. parasiticus, which generally produces either AF or OMST. Whether these differences are important in the evolutionary/ecological processes of species adaptation and diversification is unknown. Equally unknown are the specific genomic mechanisms responsible for ordering and clustering of genes in the AF pathway of Aspergillus. Results: To elucidate the mechanisms that have driven formation of these clusters, we performed systematic searches of aflatoxin cluster homologs across five Aspergillus genomes. We found a high level of gene duplication and identified seven modules consisting of highly correlated gene pairs (aflA/aflB, aflR/aflS, aflX/aflY, aflF/aflE, aflT/aflQ, aflC/aflW, and aflG/aflL). With the exception of A. nomius, contrasts of mean Ka/Ks values across all cluster genes showed significant differences in selective pressure between section Flavi and non-section Flavi species. A. nomius mean Ka/Ks values were more similar to partial clusters in A. fumigatus and A. terreus. Overall, mean Ka/Ks values were significantly higher for section Flavi than for non-section Flavi species. Conclusion: Our results implicate several genomic mechanisms in the evolution of ST, OMST and AF cluster genes. Gene modules may arise from duplications of a single gene, whereby the function of the pre-duplication gene is retained in the copy (aflF/aflE) or the copies may partition the ancestral function (aflA/aflB). In some gene modules, the
Background: Rhodobacter sphaeroides 2.4.1 is a metabolically versatile organism that belongs to the α-3 subdivision of Proteobacteria. The present study aimed to identify the extent, history, and role of gene duplications in R. sphaeroides 2.4.1, an organism that possesses two chromosomes. Results: A protein similarity search (BLASTP) identified 1247 orfs (~29.4% of the total protein-coding orfs) that are present in 2 or more copies, 37.5% (234 gene-pairs) of which exist in duplicate copies. The distribution of the duplicate gene-pairs in all Clusters of Orthologous Groups (COGs) differed significantly when compared to the COG distribution across the whole genome. Location plots revealed clusters of gene duplications that possessed the same COG classification. Phylogenetic analyses were performed to determine a tree topology predicting either a Type-A or Type-B phylogenetic relationship. A Type-A phylogenetic relationship shows that a copy of the protein-pair matches more closely with an ortholog from a species closely related to R. sphaeroides, while a Type-B relationship predicts the highest match between both copies of the R. sphaeroides protein-pair. The results revealed that ~77% of the proteins exhibited a Type-A phylogenetic relationship, demonstrating the ancient origin of these gene duplications. Additional analyses on three other strains of R. sphaeroides revealed varying levels of gene loss and retention in these strains. Also, analyses on common gene pairs among the four strains revealed that these genes experience similar functional constraints and undergo purifying selection. Conclusions: Although the results suggest that the level of gene duplication in organisms with complex genome structuring (more than one chromosome) seems to be not markedly different from that in organisms with only a single chromosome, these duplications may have aided in genome reorganization in this group of eubacteria prior to the formation of R. sphaeroides as gene
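The counts quoted in the R. sphaeroides abstract above can be cross-checked with a few lines of arithmetic. The sketch below is not from the study; it assumes that the 37.5% figure refers to the share of multi-copy orfs that fall into the 234 two-copy gene pairs, and the variable names are illustrative only.

```python
# Figures quoted in the abstract.
multi_copy_orfs = 1247    # orfs present in 2 or more copies
multi_copy_share = 0.294  # ~29.4% of all protein-coding orfs
duplicate_pairs = 234     # gene pairs that exist in duplicate copies

# Assumption: each pair contributes two orfs to the multi-copy set.
orfs_in_pairs = 2 * duplicate_pairs
share_in_pairs = orfs_in_pairs / multi_copy_orfs

# Approximate genome-wide orf count implied by the rounded 29.4% figure.
implied_total_orfs = round(multi_copy_orfs / multi_copy_share)

print(f"orfs in duplicate pairs: {orfs_in_pairs}")
print(f"share of multi-copy orfs: {share_in_pairs:.1%}")
print(f"implied total protein-coding orfs: {implied_total_orfs}")
```

Under that reading the pair count and the percentage agree with each other; the implied genome-wide total is only approximate because the 29.4% figure is itself rounded.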
Potier, M; Dutriaux, A; Orti, R; Groet, J; Gibelin, N; Karadima, G; Lutfalla, G; Lynn, A; Van Broeckhoven, C; Chakravarti, A; Petersen, M; Nizetic, D; Delabar, J; Rossier, J
Physical mapping across a duplication can be a tour de force if the region is larger than the size of a bacterial clone. This was the case of the 170- to 275-kb duplication present on the long arm of chromosome 21 in normal human at 21q11.1 (proximal region) and at 21q22.1 (distal region), which we described previously. We have constructed sequence-ready contigs of the two copies of the duplication of which all the clones are genuine representatives of one copy or the other. This required the identification of four duplicon polymorphisms that are copy-specific and nonallelic variations in the sequence of the STSs. Thirteen STSs were mapped inside the duplicated region and 5 outside but close to the boundaries. Among these STSs 10 were end clones from YACs, PACs, or cosmids, and the average interval between two markers in the duplicated region was 16 kb. Eight PACs and cosmids showing minimal overlaps were selected in both copies of the duplication. Comparative sequence analysis along the duplication showed three single-basepair changes between the two copies over 659 bp sequenced (4 STSs), suggesting that the duplication is recent (less than 4 mya). Two CpG islands were located in the duplication, but no genes were identified after a 36-kb cosmid from the proximal copy of the duplication was sequenced. The homology of this chromosome 21 duplicated region with the pericentromeric regions of chromosomes 13, 2, and 18 suggests that the mechanism involved is probably similar to pericentromeric-directed mechanisms described in interchromosomal duplications. Copyright 1998 Academic Press.
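The age estimate in the chromosome 21 abstract above rests on a simple divergence calculation. The sketch below shows one common way such an estimate can be made; it is not necessarily the authors' method, and the substitution rate is an assumed round value rather than a figure from the abstract.

```python
def pairwise_divergence(differences: int, sites: int) -> float:
    """Fraction of compared sites that differ between the two copies."""
    return differences / sites


def divergence_time_years(divergence: float, rate_per_site_per_year: float) -> float:
    """Time since duplication, assuming both copies accumulate changes independently."""
    return divergence / (2 * rate_per_site_per_year)


# From the abstract: three single-basepair changes over 659 bp sequenced.
d = pairwise_divergence(3, 659)

# ASSUMPTION: a neutral rate of 1e-9 substitutions per site per year,
# chosen only for illustration.
ASSUMED_RATE = 1e-9
t = divergence_time_years(d, ASSUMED_RATE)

print(f"divergence: {d:.4f} substitutions per site")
print(f"estimated age: {t / 1e6:.1f} million years (under the assumed rate)")
```

With only three differences the estimate is very coarse, and a different assumed rate would shift it proportionally, which is consistent with the abstract giving only an upper bound.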
Venkatachalam, Ananda B; Parmar, Manoj B; Wright, Jonathan M
Increasing organismal complexity during the evolution of life has been attributed to the duplication of genes and entire genomes. More recently, theoretical models have been proposed that postulate the fate of duplicated genes, among them the duplication-degeneration-complementation (DDC) model. In the DDC model, the common fate of a duplicated gene is loss from the genome owing to nonfunctionalization. Duplicated genes are retained in the genome either by subfunctionalization, where the functions of the ancestral gene are sub-divided between the sister duplicate genes, or by neofunctionalization, where one of the duplicate genes acquires a new function. Both processes occur either by loss or gain of regulatory elements in the promoters of duplicated genes. Here, we review the genomic organization, evolution, and transcriptional regulation of the multigene family of intracellular lipid-binding protein (iLBP) genes from teleost fishes. Teleost fishes possess many copies of iLBP genes owing to a whole genome duplication (WGD) early in the teleost fish radiation. Moreover, the retention of duplicated iLBP genes is substantially higher than the retention of all other genes duplicated in the teleost genome. The fatty acid-binding protein genes, a subfamily of the iLBP multigene family in zebrafish, are differentially regulated by peroxisome proliferator-activated receptor (PPAR) isoforms, which may account for the retention of iLBP genes in the zebrafish genome by the process of subfunctionalization of cis-acting regulatory elements in iLBP gene promoters.
Zeira, Ron; Shamir, Ron
Genome rearrangement problems have been extensively studied due to their importance in biology. Most studied models assumed a single copy per gene. However, in reality, duplicated genes are common, most notably in cancer. In this study, we make a step toward handling duplicated genes by considering a model that allows the atomic operations of cut, join, and whole chromosome duplication. Given two linear genomes, the first with one copy per gene and the second with two copies per gene, we give a linear time algorithm for computing a shortest sequence of operations transforming the first into the second such that all intermediate genomes are linear. We also show that computing an optimal sequence with fewest duplications is NP-hard.
Bekpen, Cemalettin; Künzel, Sven; Xie, Chen; Eaaswarkhanth, Muthukrishnan; Lin, Yen-Lung; Gokcumen, Omer; Akdis, Cezmi A; Tautz, Diethard
Segmental duplications are an abundant source for novel gene functions and evolutionary adaptations. This mechanism of generating novelty was very active during the evolution of primates particularly in the human lineage. Here, we characterize the evolution and function of the SPATA31 gene family (former designation FAM75A), which was previously shown to be among the gene families with the strongest signal of positive selection in hominoids. The mouse homologue for this gene family is a single copy gene expressed during spermatogenesis. We show that in primates, the SPATA31 gene duplicated into SPATA31A and SPATA31C types and broadened the expression into many tissues. Each type became further segmentally duplicated in the line towards humans with the largest number of full-length copies found for SPATA31A in humans. Copy number estimates of SPATA31A based on digital PCR show an average of 7.5 with a range of 5-11 copies per diploid genome among human individuals. The primate SPATA31 genes also acquired new protein domains that suggest an involvement in UV response and DNA repair. We generated antibodies and show that the protein is re-localized from the nucleolus to the whole nucleus upon UV-irradiation suggesting a UV damage response. We used CRISPR/Cas mediated mutagenesis to knockout copies of the gene in human primary fibroblast cells. We find that cell lines with reduced functional copies as well as naturally occurring low copy number HFF cells show enhanced sensitivity towards UV-irradiation. The acquisition of new SPATA31 protein functions and its broadening of expression may be related to the evolution of the diurnal life style in primates that required a higher UV tolerance. The increased segmental duplications in hominoids as well as its fast evolution suggest the acquisition of further specific functions particularly in humans.
BACKGROUND: There has been a surge in studies linking genome structure and gene expression, with special focus on duplicated genes. Although initially duplicated from the same sequence, duplicated genes can diverge strongly over evolution and take on different functions or regulated expression. However, information on the function and expression of duplicated genes remains sparse. Identifying groups of duplicated genes in different genomes and characterizing their expression and function would therefore be of great interest to the research community. The 'Duplicated Genes Database' (DGD) was developed for this purpose. METHODOLOGY: Nine species were included in the DGD. For each species, BLAST analyses were conducted on peptide sequences corresponding to the genes mapped on the same chromosome. Groups of duplicated genes were defined based on these pairwise BLAST comparisons and the genomic location of the genes. For each group, Pearson correlations between gene expression data and semantic similarities between functional GO annotations were also computed when the relevant information was available. CONCLUSIONS: The Duplicated Gene Database provides a list of co-localised and duplicated genes for several species with the available gene co-expression level and semantic similarity value of functional annotation. Adding these data to the groups of duplicated genes provides biological information that can prove useful to gene expression analyses. The Duplicated Gene Database can be freely accessed through the DGD website at http://dgd.genouest.org.
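The per-group co-expression measure mentioned in the methodology can be illustrated with a minimal sketch. This is not the DGD pipeline: the function names `pearson` and `pairwise_coexpression`, the gene labels, and the expression values are all hypothetical, and the sketch assumes each gene has an expression profile measured over the same ordered set of samples.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


def pairwise_coexpression(profiles):
    """Correlate every pair of genes in one group of duplicated genes.

    `profiles` maps a gene name to its expression values across samples.
    """
    return {
        (g1, g2): pearson(profiles[g1], profiles[g2])
        for g1, g2 in combinations(sorted(profiles), 2)
    }


# Hypothetical group of three co-localised duplicates (made-up values).
group = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}
coexpression = pairwise_coexpression(group)
```

Here `geneA` and `geneB` rise together across samples while `geneC` falls, so the first pair is positively correlated and pairs involving `geneC` are negatively correlated.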
Kesterson Robert A
Abstract Background Chromosomal abnormalities affecting human chromosome 15q11-q13 underlie multiple genomic disorders caused by deletion, duplication and triplication of intervals in this region. These events are mediated by highly homologous segments of DNA, or duplicons, that facilitate mispairing and unequal cross-over in meiosis. The gene encoding an amyloid precursor protein-binding protein (APBA2) was previously mapped to the distal portion of the interval commonly deleted in Prader-Willi and Angelman syndromes and duplicated in cases of autism. Results We show that this gene actually maps to a more telomeric location and is partially duplicated within the broader region. Two highly homologous copies of an interval containing a large 5' exon and downstream sequence are located ~5 Mb distal to the intact locus. The duplicated copies, containing the first coding exon of APBA2, can be distinguished by single nucleotide sequence differences and are transcriptionally inactive. Adjacent to APBA2 maps a gene termed KIAA0574. The protein encoded by this gene is weakly homologous to a protein termed X123 that in turn maps adjacent to APBA1 on 9q21.12; APBA1 is highly homologous to APBA2 in the C-terminal region and is distinguished from APBA2 by the N-terminal region encoded by this duplicated exon. Conclusion The duplication of APBA2 sequences in this region adds to a complex picture of different low copy repeats present across this region and elsewhere on the chromosome.
Dyer, Kelly A; White, Brooke E; Bray, Michael J; Piqué, Daniel G; Betancourt, Andrea J
In contrast to the rest of the genome, the Y chromosome is restricted to males and lacks recombination. As a result, Y chromosomes are unable to respond efficiently to selection, and newly formed Y chromosomes degenerate until few genes remain. The rapid loss of genes from newly formed Y chromosomes has been well studied, but gene loss from highly degenerate Y chromosomes has only recently received attention. Here, we identify and characterize a Y to autosome duplication of the male fertility gene kl-5 that occurred during the evolution of the testacea group species of Drosophila. The duplication was likely DNA based, as other Y-linked genes remain on the Y chromosome, the locations of introns are conserved, and expression analyses suggest that regulatory elements remain linked. Genetic mapping reveals that the autosomal copy of kl-5 resides on the dot chromosome, a tiny autosome with strongly suppressed recombination. Molecular evolutionary analyses show that autosomal copies of kl-5 have reduced polymorphism and little recombination. Importantly, the rate of protein evolution of kl-5 has increased significantly in lineages where it is on the dot versus Y linked. Further analyses suggest this pattern is a consequence of relaxed purifying selection, rather than adaptive evolution. Thus, although the initial fixation of the kl-5 duplication may have been advantageous, slightly deleterious mutations have accumulated in the dot-linked copies of kl-5 faster than in the Y-linked copies. Because the dot chromosome contains seven times more genes than the Y and is exposed to selection in both males and females, these results suggest that the dot suffers the deleterious effects of genetic linkage to more selective targets compared with the Y chromosome. Thus, a highly degenerate Y chromosome may not be the worst environment in the genome, as is generally thought, but may in fact be protected from the accumulation of deleterious mutations relative to other nonrecombining
Labbé, Pierrick; Milesi, Pascal; Yébakima, André; Pasteur, Nicole; Weill, Mylène; Lenormand, Thomas
Gene duplications have long been advocated to contribute to the evolution of new functions. The role of selection in their early spread is more controversial. Unless duplications are favored for a direct benefit of increased expression, they are likely detrimental. In this article, we investigated the case of duplications favored because they combine already functionally divergent alleles. Their gene-dosage/fitness relations are poorly known because selection may operate on both overall expression and the duplicates' relative dosage. Using the well-documented case of Culex pipiens resistance to insecticides, we compared strains with various ace-1 allele combinations, including two duplicated alleles carrying both susceptible and resistant copies. The overall protein activity was nearly additive, but, surprisingly, fitness correlated better with the relative proportion of susceptible and resistant copies than with any absolute measure of activity. Gene dosage is thus crucial, with duplications stabilizing a "heterozygote" phenotype. This corroborates the view that these duplications were favored because they fix a permanent heterosis, thereby solving the irreducible trade-off between resistance and synaptic transmission. Moreover, we showed that the contrasting success of the two duplicated alleles in natural populations depends on genetic changes unrelated to ace-1, confirming the probable implication of recessive sublethal mutations linked to structural rearrangements in some duplications. © 2014 The Author(s). Evolution © 2014 The Society for the Study of Evolution.
Kim, Yuseob; Lee, Jang H; Babbitt, Gregory A
Population genetic theory of gene duplication suggests that the preservation of duplicate copies requires functional divergence upon duplication. Genes that can be readily modified to produce new gene expression patterns may thus be duplicated often. In yeast, genes exhibit dichotomous expression patterns based on their promoter architectures. The expression of genes that contain a TATA box or an occupied proximal nucleosome (OPN) tends to be variable and to respond to external signals. On the other hand, genes without a TATA box or with a depleted proximal nucleosome (DPN) are expressed constitutively. We find that recent duplicates in the yeast genome are heavily biased toward TATA box-containing genes and away from DPN genes. This suggests that variably expressed genes, due to the functional organization of their promoters, have higher duplicability than constitutively expressed genes.
Singh, Nagendra K; Dalal, Vivek; Batra, Kamlesh; Singh, Binay K; Chitra, G; Singh, Archana; Ghazi, Irfan A; Yadav, Mahavir; Pandit, Awadhesh; Dixit, Rekha; Singh, Pradeep K; Singh, Harvinder; Koundal, Kirpa R; Gaikwad, Kishor; Mohapatra, Trilochan; Sharma, Tilak R
The high-quality rice genome sequence is serving as a reference for comparative genome analysis in crop plants, especially cereals. However, early comparisons with bread wheat showed complex patterns of conserved synteny (gene content) and colinearity (gene order). Here, we show the presence of ancient duplicated segments in the progenitor of wheat, which were first identified in the rice genome. We also show that single-copy (SC) rice genes, those representing unique matches with wheat expressed sequence tag (EST) unigene contigs in the whole rice genome, show more than twice the proportion of genes mapping to syntenic wheat chromosome as compared to the multicopy (MC) or duplicated rice genes. While 58.7% of the 1,244 mapped SC rice genes were located in single syntenic wheat chromosome groups, the remaining 41.3% were distributed randomly to the other six non-syntenic wheat groups. This could only be explained by a background dispersal of genes in the genome through transposition or other unknown mechanism. The breakdown of rice-wheat synteny due to such transpositions was much greater near the wheat centromeres. Furthermore, the SC rice genes revealed a conserved primordial gene order that gives clues to the origin of rice and wheat chromosomes from a common ancestor through polyploidy, aneuploidy, centromeric fusions, and translocations. Apart from the bin-mapped wheat EST contigs, we also compared 56,298 predicted rice genes with 39,813 wheat EST contigs assembled from 409,765 EST sequences and identified 7,241 SC rice gene homologs of wheat. Based on the conserved colinearity of 1,063 mapped SC rice genes across the bins of individual wheat chromosomes, we predicted the wheat bin location of 6,178 unmapped SC rice gene homologs and validated the location of 213 of these in the telomeric bins of 21 wheat chromosomes with 35.4% initial success. This opens up the possibility of directed mapping of a large number of conserved SC rice gene homologs in wheat
Lafond, Manuel; Chauve, Cedric; Dondi, Riccardo; El-Mabrouk, Nadia
Large-scale methods for inferring gene trees are error-prone. Correcting gene trees for weakly supported features often results in non-binary trees, i.e. trees with polytomies, thus raising the natural question of refining such polytomies into binary trees. One feature pointing toward potential errors in gene trees is the presence of duplications that are not supported by multiple gene copies. We introduce the problem of refining polytomies in a gene tree while minimizing the number of created non-apparent duplications in the resulting tree. We show that this problem can be described as a graph-theoretical optimization problem. We provide a bounded heuristic with guaranteed optimality for well-characterized instances. We apply our algorithm to a set of ray-finned fish gene trees from the Ensembl database to illustrate its ability to correct dubious duplications. The C++ source code for the algorithms and simulations described in the article is available at http://www-ens.iro.umontreal.ca/~lafonman/software.php. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
BACKGROUND: There is increasing interest in the evolution of protein-protein interactions because this should ultimately be informative of the patterns of evolution of new protein functions within the cell. One model proposes that the evolution of new protein-protein interactions and protein complexes proceeds through the duplication of self-interacting genes. This model is supported by data from yeast. We examined the relationship between gene duplication and self-interaction in the human genome. RESULTS: We investigated the patterns of self-interaction and duplication among 34808 interactions encoded by 8881 human genes, and show that self-interacting proteins are encoded by genes with higher duplicability than genes whose proteins lack this type of interaction. We show that this result is robust against the system used to define duplicate genes. Finally we compared the presence of self-interactions amongst proteins whose genes have duplicated either through whole-genome duplication (WGD) or small-scale duplication (SSD), and show that the former tend to have more interactions in general. After controlling for age differences between the two sets of duplicates this result can be explained by the time since the gene duplication. CONCLUSIONS: Genes encoding self-interacting proteins tend to have higher duplicability than proteins lacking self-interactions. Moreover these duplicate genes have more often arisen through whole-genome rather than small-scale duplication. Finally, self-interacting WGD genes tend to have more interaction partners in general in the PIN, which can be explained by their overall greater age. This work adds to our growing knowledge of the importance of contextual factors in gene duplicability.
Abstract Background The threespine stickleback (Gasterosteus aculeatus) has a characteristic reproductive mode; mature males build nests using a secreted glue-like protein called spiggin. Although recent studies reported multiple occurrences of genes that encode this glue-like protein spiggin in threespine and ninespine sticklebacks, it is still unclear how many genes compose the spiggin multi-gene family. Results Genome sequence analysis of threespine stickleback showed that there are at least five spiggin genes and two pseudogenes, whereas a single spiggin homolog occurs in the genomes of other fishes. Comparative genome sequence analysis demonstrated that Muc19, a single-copy mucous gene in human and mouse, is an ortholog of spiggin. Phylogenetic and molecular evolutionary analyses of these sequences suggested that an ancestral spiggin gene originated from a member of the mucin gene family as a single gene in the common ancestor of teleosts, and gene duplications of spiggin have occurred in the stickleback lineage. There was inter-population variation in the copy number of spiggin genes and positive selection on some codons, indicating that additional gene duplication/deletion events and adaptive evolution at some amino acid sites may have occurred in each stickleback population. Conclusion A number of spiggin genes exist in the threespine stickleback genome. Our results provide insight into the origin and dynamic evolutionary process of the spiggin multi-gene family in the threespine stickleback lineage. The dramatic evolution of genes for mucous substrates may have contributed to the generation of distinct characteristics such as "bio-glue" in vertebrates.
Bernt, Matthias; Chen, Kuan Yu; Chen, Ming Chiang
A tandem duplication random loss (TDRL) operation duplicates a contiguous segment of genes, followed by the random loss of one copy of each of the duplicated genes. Although the importance of this operation is founded by several recent biological studies, it has been investigated only rarely from...
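The TDRL operation defined in the first sentence above can be sketched in a few lines. This is an illustration of the definition only, not code from the cited work: the function name `tdrl`, its parameters, and the choice to represent a genome as a list of unique gene labels are assumptions. The optional `keep_first` argument makes the "random" loss deterministic for demonstration.

```python
import random


def tdrl(genome, start, end, keep_first=None, rng=None):
    """Apply one tandem duplication random loss step to genome[start:end].

    The segment is duplicated in tandem, then exactly one copy of each
    duplicated gene is lost. Genes listed in `keep_first` survive in the
    first copy; all other segment genes survive in the second copy.
    If `keep_first` is None, the surviving copy is chosen at random.
    Assumes every gene label occurs once in `genome`.
    """
    segment = genome[start:end]
    if keep_first is None:
        rng = rng or random.Random()
        keep_first = {g for g in segment if rng.random() < 0.5}
    else:
        keep_first = set(keep_first)
    first_copy = [g for g in segment if g in keep_first]
    second_copy = [g for g in segment if g not in keep_first]
    return genome[:start] + first_copy + second_copy + genome[end:]


# Duplicate genes 2..5, keep 2 and 4 from the first copy, 3 and 5 from the second.
result = tdrl([1, 2, 3, 4, 5, 6], 1, 5, keep_first={2, 4})
random_result = tdrl([1, 2, 3, 4, 5, 6], 1, 5, rng=random.Random(0))
```

In the deterministic call, the intermediate genome is 1 2 3 4 5 2 3 4 5 6, and after the losses the gene order becomes 1 2 4 3 5 6. Whatever the random choices, the outcome is always a rearrangement of the same genes, with each copy preserving the original relative order.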
Garcia, Nelson; Messing, Joachim
The TEL2, TTI1, and TTI2 proteins are co-chaperones for heat shock protein 90 (HSP90) to regulate the protein folding and maturation of phosphatidylinositol 3-kinase-related kinases (PIKKs). Referred to as the TTT complex, the genes that encode them are highly conserved from man to maize. TTT complex and PIKK genes exist mostly as single copy genes in organisms where they have been characterized. Members of this interacting protein network in maize were identified and synteny analyses were performed to study their evolution. Similar to other species, there is only one copy of each of these genes in maize which was due to a loss of the duplicated copy created by ancient allotetraploidy. Moreover, the retained copies of the TTT complex and the PIKK genes tolerated extensive retrotransposon insertion in their introns that resulted in increased gene lengths and gene body methylation, without apparent effect in normal gene expression and function. The results raise an interesting question on whether the reversion to single copy was due to selection against deleterious unbalanced gene duplications between members of the complex as predicted by the gene balance hypothesis, or due to neutral loss of extra copies. Uneven alteration of dosage either by adding extra copies or modulating gene expression of complex members is being proposed as a means to investigate whether the data supports the gene balance hypothesis or not.
Zauber, Peter; Marotta, Stephen; Sabbath-Solitare, Marlene
Changes in the number of alleles of a chromosome may have an impact upon gene expression. Loss of heterozygosity (LOH) indicates that one allele of a gene has been lost, and knowing the exact copy number of the gene would indicate whether duplication of the remaining allele has occurred. We were interested to determine the copy number of the Adenomatous Polyposis Coli (APC) gene in sporadic colorectal cancers with LOH. We selected 38 carcinomas with LOH for the APC gene region of chromosome 5, as determined by amplification of the CA repeat region within the D5S346 locus. The copy number status of APC was ascertained using the SALSA® MLPA® P043-B1 APC Kit. LOH for the DCC gene, KRAS gene mutation, and microsatellite instability were also evaluated for each tumor, utilizing standard polymerase chain reaction methods. No tumor demonstrated microsatellite instability. LOH of the DCC gene was also present in 33 of 36 (91.7%) informative tumors. A KRAS gene mutation was present in 16 of the 38 (42.1%) tumors. Twenty-four (63.2%) of the tumors were copy number neutral, 10 (26.3%) tumors demonstrated major loss, while two (5.3%) showed partial loss. Two tumors (5.3%) had copy number gain. Results of APC and DCC LOH, KRAS and microsatellite instability indicate our colorectal cancer cases were typical of sporadic cancers following the 'chromosomal instability' pathway. The majority of our colorectal carcinomas with LOH for the APC gene are copy number neutral. However, one-third of our cases showed copy number loss, suggesting that duplication of the remaining allele is not required for the development of a colorectal carcinoma.
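The reported proportions can be rechecked from the counts given in the abstract. The short sketch below is only an arithmetic check; the dictionary keys and the helper name `pct` are illustrative labels, not terms from the study.

```python
# Tumor counts by APC copy number class, as stated in the abstract.
counts = {
    "copy number neutral": 24,
    "major loss": 10,
    "partial loss": 2,
    "copy number gain": 2,
}
total = sum(counts.values())


def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


percentages = {label: pct(n, total) for label, n in counts.items()}

# Major and partial loss combined, i.e. the "one-third" in the conclusion.
loss_pct = pct(counts["major loss"] + counts["partial loss"], total)

# Other proportions quoted in the abstract.
dcc_loh_pct = pct(33, 36)
kras_pct = pct(16, 38)
```

The four classes account for all 38 tumors, and the combined loss classes (12 of 38) come to roughly 32%, which the abstract rounds to one-third.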
Gout, Jean-François; Duret, Laurent; Kahn, Daniel
Classical studies in Metabolic Control Theory have shown that metabolic fluxes usually exhibit little sensitivity to changes in individual enzyme activity, yet remain sensitive to global changes of all enzymes in a pathway. Therefore, little selective pressure is expected on the dosage or expression of individual metabolic genes, yet entire pathways should still be constrained. However, a direct estimate of this selective pressure had not been evaluated. Whole-genome duplications (WGDs) offer a good opportunity to address this question by analyzing the fates of metabolic genes during the massive gene losses that follow. Here, we take advantage of the successive rounds of WGD that occurred in the Paramecium lineage. We show that metabolic genes exhibit different gene retention patterns than nonmetabolic genes. Contrary to what was expected for individual genes, metabolic genes appeared more retained than other genes after the recent WGD, which was best explained by selection for gene expression operating on entire pathways. Metabolic genes also tend to be less retained when present at high copy number before WGD, contrary to other genes that show a positive correlation between gene retention and preduplication copy number. This is rationalized on the basis of the classical concave relationship relating metabolic fluxes with enzyme expression.
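The two classical results this abstract relies on can be written out briefly. The notation below is the standard textbook one rather than the authors' own, and the hyperbolic form of the flux is an assumed illustration of a concave flux-expression curve, not a model taken from the paper.

```latex
% Flux control coefficient of enzyme i on the steady-state flux J:
C^{J}_{i} = \frac{\partial \ln J}{\partial \ln E_i}

% Summation theorem: scaling every enzyme by the same factor scales the flux
% by that factor, so the control coefficients of a pathway sum to one:
\sum_{i=1}^{n} C^{J}_{i} = 1

% Illustrative concave dependence of the flux on a single enzyme level E
% (assumed hyperbolic form with half-saturation constant K):
J(E) = J_{\max}\,\frac{E}{K + E}
\quad\Longrightarrow\quad
C^{J}_{E} = \frac{\partial \ln J}{\partial \ln E} = \frac{K}{K + E}
```

With many enzymes sharing a total control of one, each individual coefficient is typically small, which is why single-gene dosage changes have little effect while a pathway-wide change does. Under the assumed hyperbolic form, the coefficient K/(K+E) shrinks as E grows, so an extra copy adds less flux when expression is already high, consistent with the lower retention reported for metabolic genes at high preduplication copy number.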
Salaneck, Erik; Ardell, David H; Larson, Earl T; Larhammar, Dan
It has been debated whether the increase in gene number during early vertebrate evolution was due to multiple independent gene duplications or synchronous duplications of many genes. We describe here the cloning of three neuropeptide Y (NPY) receptor genes belonging to the Y1 subfamily in the spiny dogfish, Squalus acanthias, a cartilaginous fish. The three genes are orthologs of the mammalian subtypes Y1, Y4, and Y6, which are located in paralogous gene regions on different chromosomes in mammals. Thus, these genes arose by duplications of a chromosome region before the radiation of gnathostomes (jawed vertebrates). Estimates of duplication times from linearized trees, together with evidence from other gene families, support two rounds of chromosome duplications or tetraploidizations early in vertebrate evolution. The anatomical distribution of mRNA was determined by reverse-transcriptase PCR and was found to differ from mammals, suggesting differential functional diversification of the new gene copies during the radiation of the vertebrate classes.
Abstract Background GATA transcription factors influence many developmental processes, including the specification of embryonic germ layers. The GATA gene family has significantly expanded in many animal lineages: whereas diverse cnidarians have only one GATA transcription factor, six GATA genes have been identified in many vertebrates, five in many insects, and eleven to thirteen in Caenorhabditis nematodes. All bilaterian animal genomes have at least one member each of two classes, GATA123 and GATA456. Results We have identified one GATA123 gene and one GATA456 gene from the genomic sequence of two invertebrate deuterostomes, a cephalochordate (Branchiostoma floridae) and a hemichordate (Saccoglossus kowalevskii). We also have confirmed the presence of six GATA genes in all vertebrate genomes, as well as additional GATA genes in teleost fish. Analyses of conserved sequence motifs and of changes to the exon-intron structure, and molecular phylogenetic analyses of these deuterostome GATA genes support their origin from two ancestral deuterostome genes, one GATA123 and one GATA456. Comparison of the conserved genomic organization across vertebrates identified eighteen paralogous gene families linked to multiple vertebrate GATA genes (GATA paralogons), providing the strongest evidence yet for expansion of vertebrate GATA gene families via genome duplication events. Conclusion From our analysis, we infer the evolutionary birth order and relationships among vertebrate GATA transcription factors, and define their expansion via multiple rounds of whole genome duplication events. As the genomes of four independent invertebrate deuterostome lineages contain single copy GATA123 and GATA456 genes, we infer that the 0R (pre-genome duplication) invertebrate deuterostome ancestor also had two GATA genes, one of each class. Synteny analyses identify duplications of paralogous chromosomal regions (paralogons), from single ancestral vertebrate GATA123 and GATA456
Luís Filipe Costa Castro
BACKGROUND: Aspartic proteases comprise a large group of enzymes involved in peptide proteolysis. This collection includes prominent enzymes globally categorized as pepsins, which are derived from pepsinogen precursors. Pepsins are involved in gastric digestion, a hallmark of vertebrate physiology. An important member among the pepsinogens is pepsinogen C (Pgc). A particular aspect of Pgc is its apparent single copy status, which contrasts with the numerous gene copies found for example in pepsinogen A (Pga). Although gene sequences with similarity to Pgc have been described in some vertebrate groups, no exhaustive evolutionary framework has been considered so far. METHODOLOGY/PRINCIPAL FINDINGS: By combining phylogenetics and genomic analysis, we find an unexpected Pgc diversity in the vertebrate sub-phylum. We were able to reconstruct gene duplication timings relative to the divergence of major vertebrate clades. Before tetrapod divergence, a single Pgc gene tandemly expanded to produce two gene lineages (Pgbc and Pgc2). These have been differentially retained in various classes. Accordingly, we find Pgc2 in sauropsids, amphibians and marsupials, but not in eutherian mammals. Pgbc was retained in amphibians, but duplicated in the ancestor of amniotes giving rise to Pgb and Pgc1. The latter was retained in mammals and probably in reptiles and marsupials but not in birds. Pgb was kept in all of the amniote clade with independent episodes of loss in some mammalian species. Lineage specific expansions of Pgc2 and Pgbc have also occurred in marsupials and amphibians respectively. We find that teleost and tetrapod Pgc genes reside in distinct genomic regions hinting at a possible translocation. CONCLUSIONS: We conclude that the repertoire of Pgc genes is larger than previously reported, and that tandem duplications have modelled the history of Pgc genes. We hypothesize that gene expansion led to functional divergence in tetrapods, coincident with the
Vaszkó, Tibor; Papp, János; Krausz, Csilla; Casamonti, Elena; Géczi, Lajos; Olah, Edith
Due to its palindromic setup, the AZFc (Azoospermia Factor c) region of chromosome Y is one of the most unstable regions of the human genome. It contains eight gene families expressed mainly in the testes. Several types of rearrangement resulting in changes in the cumulative copy number of the gene families were reported to be associated with diseases such as male infertility and testicular germ cell tumors. The best studied AZFc rearrangement is gr/gr deletion. Its carriers show widespread phenotypic variation from azoospermia to normospermia. This phenomenon was initially attributed to different gr/gr subtypes that would eliminate distinct members of the affected gene families. However, studies conducted to confirm this hypothesis have brought controversial results, perhaps, in part, due to the shortcomings of the utilized subtyping methodology. This proof-of-concept paper introduces a novel method aimed at subtyping AZFc rearrangements. It is able to differentiate the partial deletion and partial duplication subtypes of the Deleted in Azoospermia (DAZ) gene family. The keystone of the method is the determination of the copy number of the gene family member-specific variant(s) in a series of sequence family variant (SFV) positions. Most importantly, we present a novel approach for the correct interpretation of the variant copy number data to determine the copy number of the individual DAZ family members in the context of frequent interloci gene conversion. Besides DAZ1/DAZ2 and DAZ3/DAZ4 deletions, not yet described rearrangements such as DAZ2/DAZ4 deletion and three duplication subtypes were also found by the utilization of the novel approach. A striking feature is the extremely high concordance among the individual data pointing to a certain type of rearrangement. In addition to being able to identify DAZ deletion subtypes more reliably than the methods used previously, this approach is the first that can discriminate DAZ duplication subtypes as well
Si, W.; Gu, L.; Yang, S.; Zhang, X.; Memon, S.
Eurosids evolved from the core Eudicots Rosids. The Rosids consist of two large assemblages, Eurosids I (Fabids) and Eurosids II (Malvids), which belong to the largest group of Angiosperms, comprising >40,000 and ∼15,000 species, respectively. Although the evolutionary patterns of the largest class of disease resistance genes, consisting of a nucleotide binding site (NBS) and leucine-rich repeats (LRRs), have been studied in many species, systematic research on NBS-encoding genes has not been performed in different orders of Eurosids II. Here, five Eurosids II species, Gossypium raimondii, Theobroma cacao, Carica papaya, Citrus clementina, and Arabidopsis thaliana, distributed across three orders, were used to gain insights into the evolutionary patterns of the NBS-encoding genes. Our data showed frequent copy number variations of NBS-encoding genes among these species. Phylogenetic tree analysis and the numbers of the NBS-encoding genes in the common ancestor of these species showed that species-specific NBS clades, including multi-copy and single-copy clades, are dominant among these genes. However, not a single clade was found with exactly five copies, one from each of the five species, suggesting rapid turnover with birth and death of the NBS-encoding genes among Eurosids II species. In addition, a strong positive correlation was observed between the Toll/interleukin receptor (TIR) type NBS-encoding genes and species-specific genes, indicating rapid gene loss and duplication. In contrast, non-TIR type NBS-encoding genes in these five species showed two distinct evolutionary patterns.
Hargreaves, Adam D.; Swain, Martin T.; Hegarty, Matthew J.; Logan, Darren W.; Mulley, John F.
Snake venom has been hypothesized to have originated and diversified through a process that involves duplication of genes encoding body proteins with subsequent recruitment of the copy to the venom gland, where natural selection acts to develop or increase toxicity. However, gene duplication is known to be a rare event in vertebrate genomes, and the recruitment of duplicated genes to a novel expression domain (neofunctionalization) is an even rarer process that requires the evolution of novel combinations of transcription factor binding sites in upstream regulatory regions. Therefore, although this hypothesis concerning the evolution of snake venom is very unlikely and should be regarded with caution, it is nonetheless often assumed to be established fact, hindering research into the true origins of snake venom toxins. To critically evaluate this hypothesis, we have generated transcriptomic data for body tissues and salivary and venom glands from five species of venomous and nonvenomous reptiles. Our comparative transcriptomic analysis of these data reveals that snake venom does not evolve through the hypothesized process of duplication and recruitment of genes encoding body proteins. Indeed, our results show that many proposed venom toxins are in fact expressed in a wide variety of body tissues, including the salivary gland of nonvenomous reptiles and that these genes have therefore been restricted to the venom gland following duplication, not recruited. Thus, snake venom evolves through the duplication and subfunctionalization of genes encoding existing salivary proteins. These results highlight the danger of the elegant and intuitive “just-so story” in evolutionary biology. PMID:25079342
The evolution of diversity across the animal kingdom has been accompanied by tremendous gene loss and gain. While comparative genomics has been fruitful in characterizing differences in gene content across highly diverged species, little is known about the microevolution of structural variations that cause these differences in the first place. In order to investigate the genomic impact of structural variations, we made use of genomic and transcriptomic data from the nematode Pristionchus pacificus, which has been established as a satellite model to Caenorhabditis elegans for comparative biology. We exploit the fact that P. pacificus is a highly diverse species for which various genomic data, including the draft genome of a sister species, P. exspectatus, are available. Based on resequencing coverage data for two natural isolates, we identified large (> 2 kb) deletions and duplications relative to the reference strain. By restriction to completely syntenic regions between P. pacificus and P. exspectatus, we were able to polarize the comparison and to assess the impact of structural variations on expression levels. We found that while loss of genes correlates with lack of expression, duplication of genes has virtually no effect on gene expression. Further investigating expression of individual copies at sites that segregate between the duplicates, we found in the majority of cases only one of the copies to be expressed. Nevertheless, we still find that certain gene classes are strongly depleted in deletions as well as duplications, suggesting evolutionary constraint acting on synteny. In summary, our results are consistent with a model in which most structural variations are either deleterious or neutral, and they provide first insights into the microevolution of structural variations in the P. pacificus genome.
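The coverage-based detection of large deletions and duplications described above can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration, not the authors' pipeline: the function name `call_cnv_segments`, the 500 bp window size, and the ratio thresholds (0.25 and 1.75) are hypothetical; only the "> 2 kb" size cutoff comes from the abstract. The input is a list of per-window coverage ratios (isolate over reference, already normalised).

```python
from typing import List, Tuple


def call_cnv_segments(
    ratios: List[float],
    window_bp: int = 500,      # assumed window size
    min_len_bp: int = 2000,    # "large (> 2 kb)" cutoff from the abstract
    del_max: float = 0.25,     # assumed threshold: ratio at or below this = deletion
    dup_min: float = 1.75,     # assumed threshold: ratio at or above this = duplication
) -> List[Tuple[int, int, str]]:
    """Return (start_bp, end_bp, kind) for runs of consecutive windows whose
    coverage ratio suggests a deletion or duplication longer than min_len_bp."""

    def label(ratio: float) -> str:
        if ratio <= del_max:
            return "deletion"
        if ratio >= dup_min:
            return "duplication"
        return "normal"

    # A trailing "normal" label acts as a sentinel that closes the last run.
    labels = [label(r) for r in ratios] + ["normal"]

    segments = []
    run_start, run_kind = 0, None
    for i, kind in enumerate(labels):
        if kind != run_kind:
            if run_kind in ("deletion", "duplication"):
                length = (i - run_start) * window_bp
                if length > min_len_bp:
                    segments.append((run_start * window_bp, i * window_bp, run_kind))
            run_start, run_kind = i, kind
    return segments


example = [1.0, 0.1, 0.1, 0.1, 0.1, 0.1, 1.0, 2.0, 2.0, 1.0, 2.1, 2.0, 1.9, 2.2, 2.0]
print(call_cnv_segments(example))
```

In this toy input the two-window duplication signal (1 kb) is discarded because it does not exceed the 2 kb cutoff, while the two five-window runs are reported.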
Background Constructing species trees from multi-copy gene trees remains a challenging problem in phylogenetics. One difficulty is that the underlying genes can be incongruent due to evolutionary processes such as gene duplication and loss, deep coalescence, or lateral gene transfer. Gene tree estimation errors may further exacerbate the difficulties of species tree estimation. Results We present a new approach for inferring species trees from incongruent multi-copy gene trees that is based on a generalization of the Robinson-Foulds (RF) distance measure to multi-labeled trees (mul-trees). We prove that it is NP-hard to compute the RF distance between two mul-trees; however, it is easy to calculate this distance between a mul-tree and a singly-labeled species tree. Motivated by this, we formulate the RF problem for mul-trees (MulRF) as follows: Given a collection of multi-copy gene trees, find a singly-labeled species tree that minimizes the total RF distance from the input mul-trees. We develop and implement a fast SPR-based heuristic algorithm for the NP-hard MulRF problem. We compare the performance of the MulRF method (available at http://genome.cs.iastate.edu/CBL/MulRF/) with several gene tree parsimony approaches using gene tree simulations that incorporate gene tree error, gene duplications and losses, and/or lateral transfer. The MulRF method produces more accurate species trees than gene tree parsimony approaches. We also demonstrate that the MulRF method infers in minutes a credible plant species tree from a collection of nearly 2,000 gene trees. Conclusions Our new phylogenetic inference method, based on a generalized RF distance, makes it possible to quickly estimate species trees from large genomic data sets. Since the MulRF method, unlike gene tree parsimony, is based on a generic tree distance measure, it is appealing for analyses of genomic data sets, in which many processes such as deep coalescence, recombination, gene duplication and losses as
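For orientation, the classic Robinson-Foulds distance that MulRF generalizes can be sketched for ordinary singly-labeled trees. This is not the MulRF algorithm and not the mul-tree distance itself. It is the rooted, cluster-based variant, chosen here for brevity, and the nested-tuple tree encoding and function names are assumptions made for illustration.

```python
def clusters(tree):
    """Return the set of non-trivial clusters (frozensets of leaf labels)
    of a rooted tree given as nested tuples with string leaves."""
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    root = walk(tree)
    found.discard(root)  # the root cluster is shared by all trees on the same leaves
    return found


def rf_distance(tree_a, tree_b):
    """Number of clusters present in exactly one of the two trees."""
    return len(clusters(tree_a) ^ clusters(tree_b))


t1 = ((("A", "B"), "C"), "D")
t2 = ((("A", "C"), "B"), "D")
print(rf_distance(t1, t2))
```

The two example trees share the cluster {A, B, C} and differ in {A, B} versus {A, C}, so the symmetric difference has two elements.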
Background Dlx (Distal-less) genes have various developmental roles and are widespread throughout the animal kingdom, usually occurring as single copy genes in non-chordates and as multiple copies in most chordate genomes. While the genomic arrangement and function of these genes are well known in vertebrates and arthropods, information about Dlx genes in other organisms is scarce. We investigate the presence of Dlx genes in several annelid species and examine Dlx gene expression in the polychaete Pomatoceros lamarckii. Results Two Dlx genes are present in P. lamarckii, Capitella teleta and Helobdella robusta. The C. teleta Dlx genes are closely linked in an inverted tail-to-tail orientation, reminiscent of the arrangement of vertebrate Dlx pairs, and gene conversion appears to have had a role in their evolution. The H. robusta Dlx genes, however, are not on the same genomic scaffold and display divergent sequences, while, if the P. lamarckii genes are linked in a tail-to-tail orientation, they are a minimum of 41 kilobases apart and show no sign of gene conversion. No expression in P. lamarckii appendage development has been observed, which conflicts with the supposed conserved role of these genes in animal appendage development. These Dlx duplications do not appear to be annelid-wide, as the polychaete Platynereis dumerilii likely possesses only one Dlx gene. Conclusions On the basis of the currently accepted annelid phylogeny, we hypothesise that one Dlx duplication occurred in the annelid lineage after the divergence of P. dumerilii from the other lineages and these duplicates then had varied evolutionary fates in different species. We also propose that the ancestral role of Dlx genes is not related to appendage development.
Bowne, Sara J; Sullivan, Lori S; Wheaton, Dianna K; Locke, Kirsten G; Jones, Kaylie D; Koboldt, Daniel C; Fulton, Robert S; Wilson, Richard K; Blanton, Susan H; Birch, David G; Daiger, Stephen P
To identify the underlying cause of disease in a large family with North Carolina macular dystrophy (NCMD). A large four-generation family (RFS355) with an autosomal dominant form of NCMD was ascertained. Family members underwent comprehensive visual function evaluations. Blood or saliva from six affected family members and three unaffected spouses was collected and DNA tested for linkage to the MCDR1 locus on chromosome 6q12. Three affected family members and two unaffected spouses underwent whole exome sequencing (WES) and subsequently, custom capture of the linkage region followed by next-generation sequencing (NGS). Standard PCR and dideoxy sequencing were used to further characterize the mutation. Of the 12 eyes examined in six affected individuals, all but two had Gass grade 3 macular degeneration features. Large central excavation of the retinal and choroid layers, referred to as a macular caldera, was seen in an age-independent manner in the grade 3 eyes. The calderas are unique to affected individuals with MCDR1. Genome-wide linkage mapping and haplotype analysis of markers from the chromosome 6q region were consistent with linkage to the MCDR1 locus. Whole exome sequencing and custom-capture NGS failed to reveal any rare coding variants segregating with the phenotype. Analysis of the custom-capture NGS sequencing data for copy number variants uncovered a tandem duplication of approximately 60 kb on chromosome 6q. This region contains two genes, CCNC and PRDM13 . The duplication creates a partial copy of CCNC and a complete copy of PRDM13 . The duplication was found in all affected members of the family and is not present in any unaffected members. The duplication was not seen in 200 ethnically matched normal chromosomes. The cause of disease in the original family with MCDR1 and several others has been recently reported to be dysregulation of the PRDM13 gene, caused by either single base substitutions in a DNase 1 hypersensitive site upstream of the CCNC
Tyson, Jess; Majerus, Tamsin M O; Walker, Susan; Armour, John A L
For most cases of colorectal cancer that arise without a family history of the disease, it is proposed that an appreciable heritable component of predisposition is the result of contributions from many loci. Although progress has been made in identifying single nucleotide variants associated with colorectal cancer risk, the involvement of low-penetrance copy number variants is relatively unexplored. We have used multiplex amplifiable probe hybridization (MAPH) in a fourfold multiplex (QuadMAPH), positioned at an average resolution of one probe per 2 kb, to screen a total of 1.56 Mb of genomic DNA for copy number variants around the genes APC, AXIN1, BRCA1, BRCA2, CTNNB1, HRAS, MLH1, MSH2, and TP53. Two deletion events were detected, one upstream of MLH1 in a control individual and the other in APC in a colorectal cancer patient, but these do not seem to correspond to copy number polymorphisms with measurably high population frequencies. In summary, by means of our QuadMAPH assay, copy number measurement data were of sufficient resolution and accuracy to detect any copy number variants with high probability. However, this study has demonstrated a very low incidence of deletion and duplication variants within intronic and flanking regions of these nine genes, in both control individuals and colorectal cancer patients. Copyright © 2010 Elsevier Inc. All rights reserved.
Jessica B Hostetler
Plasmodium vivax causes the majority of malaria episodes outside Africa, but remains a relatively understudied pathogen. The pathology of P. vivax infection depends critically on the parasite's ability to recognize and invade human erythrocytes. This invasion process involves an interaction between P. vivax Duffy Binding Protein (PvDBP) in merozoites and the Duffy antigen receptor for chemokines (DARC) on the erythrocyte surface. Whole-genome sequencing of clinical isolates recently established that some P. vivax genomes contain two copies of the PvDBP gene. The frequency of this duplication is particularly high in Madagascar, where there is also evidence for P. vivax infection in DARC-negative individuals. The functional significance and global prevalence of this duplication, and whether there are other copy number variations at the PvDBP locus, are unknown. Using whole-genome sequencing and PCR to study the PvDBP locus in P. vivax clinical isolates, we found that PvDBP duplication is widespread in Cambodia. The boundaries of the Cambodian PvDBP duplication differ from those previously identified in Madagascar, meaning that current molecular assays were unable to detect it. The Cambodian PvDBP duplication did not associate with parasite density or DARC genotype, and ranged in prevalence from 20% to 38% over four annual transmission seasons in Cambodia. This duplication was also present in P. vivax isolates from Brazil and Ethiopia, but not India. PvDBP duplications are much more widespread and complex than previously thought, and at least two distinct duplications are circulating globally. The same duplication boundaries were identified in parasites from three continents, and were found at high prevalence in human populations where DARC-negativity is essentially absent. It is therefore unlikely that PvDBP duplication is associated with infection of DARC-negative individuals, but functional tests will be required to confirm this hypothesis.
Premise of the study: Primers were developed to amplify 12 intron-less, low-copy nuclear genes in the Hawaiian genus Clermontia (Campanulaceae), a suspected tetraploid. Methods and Results: Data from a pooled 454 titanium run of the partial transcriptomes of seven Clermontia species were used to identify the loci of interest. Most loci were amplified and sequenced directly with success in a representative selection of lobeliads even though several of these loci turned out to be duplicated. Levels of variation were comparable to those observed in commonly used plastid and ribosomal markers. Conclusions: We found evidence of a genome duplication that likely predates the diversification of the Hawaiian lobeliads. Some genes nevertheless appear to be single-copy and should be useful for phylogenetic studies of Clermontia or the entire Lobelioideae subfamily.
Due to the selection pressure imposed by highly variable environmental conditions, stress sensing and regulatory response mechanisms in plants are expected to evolve rapidly. One potential source of innovation in plant stress response mechanisms is gene duplication. In this study, we examined the evolution of stress-regulated gene expression among duplicated genes in the model plant Arabidopsis thaliana. Key to this analysis was reconstructing the putative ancestral stress regulation pattern. By comparing the expression patterns of duplicated genes with the patterns of their ancestors, we found that duplicated genes likely lost and gained stress responses at a rapid rate initially, but that the rate is close to zero when the synonymous substitution rate (a proxy for time) is > approximately 0.8. When considering duplicated gene pairs, we found that partitioning of putative ancestral stress responses occurred more frequently than parallel retention or parallel loss. Furthermore, the pattern of stress response partitioning was extremely asymmetric. An analysis of putative cis-acting DNA regulatory elements in the promoters of the duplicated stress-regulated genes indicated that the asymmetric partitioning of ancestral stress responses is likely due, at least in part, to differential loss of DNA regulatory elements; the duplicated genes losing most of their stress responses were those that had lost more of the putative cis-acting elements. Finally, duplicate genes that lost most or all of the ancestral responses are more likely to have gained responses to other stresses. Therefore, the retention of duplicates that inherit few or no functions seems to be coupled to neofunctionalization. Taken together, our findings provide new insight into the patterns of evolutionary changes in gene stress responses after duplication and lay the foundation for testing the adaptive significance of stress regulatory changes under highly variable biotic and abiotic environments.
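The three fates of ancestral stress responses named above (partitioning, parallel retention, parallel loss) can be made concrete with a small set-based sketch. It assumes, purely for illustration, that the reconstructed ancestral pattern and each duplicate's pattern are available as sets of stress conditions. The function name, dictionary keys, and example stresses are hypothetical and do not come from the study.

```python
def classify_pair(ancestral, copy1, copy2):
    """Sort each stress response of a duplicate pair into a fate category,
    relative to the inferred ancestral response set."""
    return {
        # ancestral response kept by both duplicates
        "parallel_retention": ancestral & copy1 & copy2,
        # ancestral response lost by both duplicates
        "parallel_loss": ancestral - copy1 - copy2,
        # ancestral response kept by exactly one duplicate (partitioning)
        "partitioned_to_copy1": (ancestral & copy1) - copy2,
        "partitioned_to_copy2": (ancestral & copy2) - copy1,
        # responses absent from the ancestor (candidate gains)
        "gained_copy1": copy1 - ancestral,
        "gained_copy2": copy2 - ancestral,
    }


ancestral = {"cold", "salt", "drought", "heat"}
copy1 = {"cold", "salt", "drought"}
copy2 = {"cold", "wounding"}

fates = classify_pair(ancestral, copy1, copy2)
for name, responses in fates.items():
    print(name, sorted(responses))
```

In this made-up pair, partitioning is fully asymmetric (both partitioned responses go to the first copy), and the copy that lost most ancestral responses is the one with a gained response, mirroring the pattern the abstract reports.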
Zevering, C E; Moritz, C; Heideman, A; Sturm, R A
Analysis of mitochondrial DNAs (mtDNAs) from parthenogenetic lizards of the Heteronotia binoei complex with restriction enzymes revealed an approximately 5-kb addition present in all 77 individuals. Cleavage site mapping suggested the presence of a direct tandem duplication spanning the 16S and 12S rRNA genes, the control region and most, if not all, of the gene for the subunit 1 of NADH dehydrogenase (ND1). The location of the duplication was confirmed by Southern hybridization. A restriction enzyme survey provided evidence for modifications to each copy of the duplicated sequence, including four large deletions. Each gene affected by a deletion was complemented by an intact version in the other copy of the sequence, although for one gene the functional copy was heteroplasmic for another deletion. Sequencing of a fragment from one copy of the duplication which encompassed the tRNA(leu)(UUR) and parts of the 16S rRNA and ND1 genes, revealed mutations expected to disrupt function. Thus, evolution subsequent to the duplication event has resulted in mitochondrial pseudogenes. The presence of duplications in all of these parthenogens, but not among representatives of their maternal sexual ancestors, suggests that the duplications arose in the parthenogenetic form. This provides the second instance in H. binoei of mtDNA duplication associated with the transition from sexual to parthenogenetic reproduction. The increased incidence of duplications in parthenogenetic lizards may be caused by errors in mtDNA replication due to either polyploidy or hybridity of their nuclear genomes.
In contrast to S. cerevisiae and C. elegans, analyses based on the current knockout (KO) mouse phenotypes led to the conclusion that duplicate genes had almost no role in mouse genetic robustness. It has been suggested that the bias of the mouse KO database toward ancient duplicates may cause this knockout duplicate puzzle, that is, a very similar proportion of essential genes (PE) between duplicate genes and singletons. In this paper, we conducted an extensive and careful analysis of the mouse KO phenotype data and corroborated a strong effect of duplicate genes on mouse genetic robustness. Moreover, the effect of duplicate genes on mouse genetic robustness is duplication-age dependent, which holds after ruling out the potential confounding effect from coding-sequence conservation, protein-protein connectivity, functional bias, or the bias of duplicates generated by whole genome duplication (WGD). Our findings suggest that two factors, the sampling bias toward ancient duplicates and very ancient duplicates with a proportion of essential genes higher than that of singletons, have caused the mouse knockout duplicate puzzle; meanwhile, the effect of genetic buffering may be correlated with sequence conservation as well as protein-protein interactivity.
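The comparison at the heart of the "knockout duplicate puzzle" is between the proportion of essential genes (PE) among duplicates and among singletons. A minimal sketch of such a comparison follows, using a pooled two-proportion z-test. The choice of test is an assumption (the abstract does not name one), and the counts are invented for illustration only.

```python
import math


def compare_pe(essential_dup, total_dup, essential_single, total_single):
    """Return (PE of duplicates, PE of singletons, z statistic, two-sided p-value)
    from a pooled two-proportion z-test."""
    pe_dup = essential_dup / total_dup
    pe_single = essential_single / total_single
    pooled = (essential_dup + essential_single) / (total_dup + total_single)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_dup + 1 / total_single))
    z = (pe_dup - pe_single) / std_err
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return pe_dup, pe_single, z, p_value


# Hypothetical counts: 300 of 1000 duplicates and 400 of 1000 singletons essential.
pe_dup, pe_single, z, p_value = compare_pe(300, 1000, 400, 1000)
print(f"PE(duplicates)={pe_dup:.2f}  PE(singletons)={pe_single:.2f}  z={z:.2f}  p={p_value:.2g}")
```

A PE that is lower for duplicates than for singletons is what genetic buffering by duplicates would predict; near-equal PE values are the puzzle the abstract describes. Stratifying the same comparison by duplication age would be the natural next step.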
Zhong, Yan; Jia, Yanxiao; Gao, Yang; Tian, Dacheng; Yang, Sihai; Zhang, Xiaohui
Gene duplication supplies the raw materials for novel gene functions, and many gene families arising from duplication experience adaptive evolution. Most studies of young duplicates have focused on mammals, especially humans, whereas reports describing their genome-wide evolutionary patterns across the closely related Drosophila species are rare. The 12 sequenced Drosophila genomes provide the opportunity to address this issue. In our study, 3,647 young duplicate gene families were identified across the 12 Drosophila species, and three types of expansions (species-specific, lineage-specific and complex expansions) were detected in these gene families. Our data showed that the species-specific young duplicate genes predominated (86.6%) over the other two types. Interestingly, independent species-specific expansions in the same gene family have been observed in many species, in some cases in as many as 11 or 12 Drosophila species. Our data also showed that the functional bias observed in these young duplicate genes was mainly related to responses to environmental stimuli and biotic stresses. This study reveals the evolutionary patterns of young duplicates across 12 Drosophila species on a genomic scale. Our results suggest that convergent evolution acts on young duplicate genes after species differentiation and that adaptive evolution may play an important role in duplicate genes for adaptation to ecological factors and environmental changes in Drosophila.
Dosage sensitivity is an important evolutionary force which impacts gene dispensability and duplicability. The newly available data on human copy-number variation (CNV) allow an analysis of the most recent and ongoing evolution. Provided that heterozygous gene deletions and duplications actually change gene dosage, we expect to observe negative selection against CNVs encompassing dosage sensitive genes. In this study, we make use of several sources of population genetic data to identify selection on structural variations of dosage sensitive genes. We show that CNVs can directly affect expression levels of contained genes. We find that genes encoding members of protein complexes exhibit limited expression variation and overlap significantly with a manually derived set of dosage sensitive genes. We show that complexes and other dosage sensitive genes are underrepresented in CNV regions, with a particular bias against frequent variations and duplications. These results suggest that dosage sensitivity is a significant force of negative selection on regions of copy-number variation.
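A claim that a gene class is "underrepresented in CNV regions" is typically backed by a depletion test. The sketch below uses the lower tail of the hypergeometric distribution as one common choice. This is an assumption, since the abstract does not state which test was used, and the gene counts are invented for illustration.

```python
from math import comb


def depletion_p_value(overlap, genome_genes, class_genes, cnv_genes):
    """P(X <= overlap) when cnv_genes are drawn without replacement from
    genome_genes, of which class_genes belong to the class of interest."""
    favourable = sum(
        comb(class_genes, i) * comb(genome_genes - class_genes, cnv_genes - i)
        for i in range(overlap + 1)
    )
    return favourable / comb(genome_genes, cnv_genes)


# Hypothetical numbers: 20,000 genes, 2,000 in protein complexes,
# 1,500 inside CNV regions, only 100 of which are complex members
# (150 would be expected by chance).
p = depletion_p_value(100, 20_000, 2_000, 1_500)
print(f"depletion p-value = {p:.3g}")
```

A small p-value indicates fewer class members inside CNV regions than random placement would give, which is the direction expected under negative selection on dosage-sensitive genes.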
Calvete, Oriol; González, Josefa; Betrán, Esther; Ruiz, Alfredo
Chromosomal inversions are usually portrayed as simple two-breakpoint rearrangements changing gene order but not gene number or structure. However, increasing evidence suggests that inversion breakpoints may often have a complex structure and entail gene duplications with potential functional consequences. Here, we used a combination of different techniques to investigate the breakpoint structure and the functional consequences of a complex rearrangement fixed in Drosophila buzzatii and comprising two tandemly arranged inversions sharing the middle breakpoint: 2m and 2n. By comparing the sequence in the breakpoint regions between D. buzzatii (inverted chromosome) and D. mojavensis (noninverted chromosome), we corroborate the breakpoint reuse at the molecular level and infer that inversion 2m was associated with a duplication of a ∼13 kb segment and likely generated by staggered breaks plus repair by nonhomologous end joining. The duplicated segment contained the gene CG4673, involved in nuclear transport, and its two nested genes CG5071 and CG5079. Interestingly, we found that other than the inversion and the associated duplication, both breakpoints suffered additional rearrangements, that is, the proximal breakpoint experienced a microinversion event associated at both ends with a 121-bp long duplication that contains a promoter. As a consequence of all these different rearrangements, CG5079 has been lost from the genome, CG5071 is now a single copy nonnested gene, and CG4673 has a transcript ∼9 kb shorter and seems to have acquired a more complex gene regulation. Our results illustrate the complex effects of chromosomal rearrangements and highlight the need of complementing genomic approaches with detailed sequence-level and functional analyses of breakpoint regions if we are to fully understand genome structure, function, and evolutionary dynamics. PMID:22328714
Kane L Greer
Duchenne muscular dystrophy is a severe muscle-wasting disease caused by mutations in the dystrophin gene that ablate functional protein expression. Although exonic deletions are the most common Duchenne muscular dystrophy lesion, duplications account for 10–15% of reported disease-causing mutations, and exon 2 is the most commonly duplicated exon. Here, we describe the in vitro evaluation of phosphorodiamidate morpholino oligomers coupled to a cell-penetrating peptide and 2′-O-methyl phosphorothioate oligonucleotides, using three distinct strategies to reframe the dystrophin transcript in patient cells carrying an exon 2 duplication. Differences in exon-skipping efficiencies in vitro were observed between oligomer analogues of the same sequence, with the phosphorodiamidate morpholino oligomer coupled to a cell-penetrating peptide proving the most effective. Differences in exon 2 excision efficiency between normal and exon 2 duplication cells were apparent, indicating that exon context influences oligomer-induced splice switching. Skipping of a single copy of exon 2 was induced in the cells carrying an exon 2 duplication, the simplest strategy to restore the reading frame and generate a normal dystrophin transcript. In contrast, multiexon skipping of exons 2–7 to generate a Becker muscular dystrophy-like dystrophin transcript was more challenging and could only be induced efficiently with the phosphorodiamidate morpholino oligomer chemistry.
BACKGROUND: The opioid system is involved in reward and pain mechanisms and consists in mammals of four receptors and several peptides. The peptides are derived from four prepropeptide genes, PENK, PDYN, PNOC and POMC, encoding enkephalins, dynorphins, orphanin/nociceptin and beta-endorphin, respectively. Previously we have described how two rounds of genome doubling (2R) before the origin of jawed vertebrates formed the receptor family. METHODOLOGY/PRINCIPAL FINDINGS: Opioid peptide gene family members were investigated using a combination of sequence-based phylogeny and chromosomal locations of the peptide genes in various vertebrates. Several adjacent gene families were investigated similarly. The results show that the ancestral peptide gene gave rise to two additional copies in the genome doublings. The fourth member was generated by a local gene duplication, as the genes encoding POMC and PNOC are located on the same chromosome in the chicken genome and all three teleost genomes that we have studied. A translocation has disrupted this synteny in mammals. The PDYN gene seems to have been lost in chicken, but not in zebra finch. Duplicates of some peptide genes have arisen in the teleost fishes. Within the prepropeptide precursors, peptides have been lost or gained in different lineages. CONCLUSIONS/SIGNIFICANCE: The ancestral peptide and receptor genes were located on the same chromosome and were thus duplicated concomitantly. However, genetic linkage has subsequently been lost. In conclusion, the system of opioid peptides and receptors was largely formed by the genome doublings that took place early in vertebrate evolution.
Zimmer, Christoph T; Garrood, William T; Singh, Kumar Saurabh; Randall, Emma; Lueke, Bettina; Gutbrod, Oliver; Matthiesen, Svend; Kohler, Maxie; Nauen, Ralf; Davies, T G Emyr; Bass, Chris
Gene duplication is a major source of genetic variation that has been shown to underpin the evolution of a wide range of adaptive traits [1, 2]. For example, duplication or amplification of genes encoding detoxification enzymes has been shown to play an important role in the evolution of insecticide resistance [3-5]. In this context, gene duplication performs an adaptive function as a result of its effects on gene dosage and not as a source of functional novelty [3, 6-8]. Here, we show that duplication and neofunctionalization of a cytochrome P450, CYP6ER1, led to the evolution of insecticide resistance in the brown planthopper. Considerable genetic variation was observed in the coding sequence of CYP6ER1 in populations of brown planthopper collected from across Asia, but just two sequence variants are highly overexpressed in resistant strains and metabolize imidacloprid. Both variants are characterized by profound amino-acid alterations in substrate recognition sites, and the introduction of these mutations into a susceptible P450 sequence is sufficient to confer resistance. CYP6ER1 is duplicated in resistant strains with individuals carrying paralogs with and without the gain-of-function mutations. Despite numerical parity in the genome, the susceptible and mutant copies exhibit marked asymmetry in their expression with the resistant paralogs overexpressed. In the primary resistance-conferring CYP6ER1 variant, this results from an extended region of novel sequence upstream of the gene that provides enhanced expression. Our findings illustrate the versatility of gene duplication in providing opportunities for functional and regulatory innovation during the evolution of an adaptive trait. Copyright © 2017 The Authors. Published by Elsevier Ltd.. All rights reserved.
Lu, Jianguo; Peatman, Eric; Tang, Haibao; Lewis, Joshua; Liu, Zhanjiang
Gene duplication has had a major impact on genome evolution. Localized (or tandem) duplication resulting from unequal crossing over and whole genome duplication are believed to be the two dominant mechanisms contributing to vertebrate genome evolution. While much scrutiny has been directed toward discerning patterns indicative of whole-genome duplication events in teleost species, less attention has been paid to the continuous nature of gene duplications and their impact on the size, gene content, functional diversity, and overall architecture of teleost genomes. Here, using a Markov clustering algorithm directed approach we catalogue and analyze patterns of gene duplication in the four model teleost species with chromosomal coordinates: zebrafish, medaka, stickleback, and Tetraodon. Our analyses based on set size, duplication type, synonymous substitution rate (Ks), and gene ontology emphasize shared and lineage-specific patterns of genome evolution via gene duplication. Most strikingly, our analyses highlight the extraordinary duplication and retention rate of recent duplicates in zebrafish and their likely role in the structural and functional expansion of the zebrafish genome. We find that the zebrafish genome is remarkable in its large number of duplicated genes, small duplicate set size, biased Ks distribution toward minimal mutational divergence, and proportion of tandem and intra-chromosomal duplicates when compared with the other teleost model genomes. The observed gene duplication patterns have played significant roles in shaping the architecture of teleost genomes and appear to have contributed to the recent functional diversification and divergence of important physiological processes in zebrafish. We have analyzed gene duplication patterns and duplication types among the available teleost genomes and found that a large number of genes were tandemly and intrachromosomally duplicated, suggesting their origin of independent and continuous duplication
Cuypers, Thomas D; Hogeweg, Paulien; Hogeweg, P.
Whole genome duplication has shaped eukaryotic evolutionary history and has been associated with drastic environmental change and species radiation. While the most common fate of WGD duplicates is a return to single copy, retained duplicates have been found enriched for highly interacting genes.
Jeremy D DeBarry
We characterize the prevalence, distribution, divergence, and putative functions of detectable two-copy paralogs and segmental duplications in the Apicomplexa, a phylum of parasitic protists. Apicomplexans are mostly obligate intracellular parasites responsible for human and animal diseases (e.g. malaria and toxoplasmosis). Gene loss is a major force in the phylum. Genomes are small and protein-encoding gene repertoires are reduced. Despite this genomic streamlining, duplications and gene family amplifications are present. The potential for innovation introduced by duplications is of particular interest. We compared genomes of twelve apicomplexans across four lineages and used orthology and genome cartography to map distributions of duplications against genome architectures. Segmental duplications appear limited to five species. Where present, they correspond to regions enriched for multi-copy and species-specific genes, pointing toward roles in adaptation and innovation. We found a phylum-wide association of duplications with dynamic chromosome regions and syntenic breakpoints. Trends in the distribution of duplicated genes indicate that recent, species-specific duplicates are often tandem while most others have been dispersed by genome rearrangements. These trends show a relationship between genome architecture and gene duplication. Functional analysis reveals: proteases, which are vital to a parasitic lifecycle, to be prominent in putative recent duplications; a pair of paralogous genes in Toxoplasma gondii previously shown to produce the rate-limiting step in dopamine synthesis in mammalian cells, a possible link to the modification of host behavior; and phylum-wide differences in expression and subcellular localization, indicative of modes of divergence. We have uncovered trends in multiple modes of duplicate divergence including sequence, intron content, expression, subcellular localization, and functions of putative recent duplicates that
Jeffrey A. Fawcett
Gene conversion is one of the major mutational mechanisms involved in the DNA sequence evolution of duplicated genes. It helps create unique patterns of DNA polymorphism within species and divergence between species. A typical pattern is so-called concerted evolution, in which the divergence between duplicates is kept low for a long time because of frequent exchanges of DNA fragments. In addition, gene conversion affects the DNA evolution of duplicates in various ways, especially when selection operates. Here, we review theoretical models to understand the evolution of duplicates in both neutral and non-neutral cases. We also explain how these theories contribute to interpreting real polymorphism and divergence data by using some intriguing examples.
Baker, Richard H; Narechania, Apurva; Johns, Philip M; Wilkinson, Gerald S
Gene duplication provides an essential source of novel genetic material to facilitate rapid morphological evolution. Traits involved in reproduction and sexual dimorphism represent some of the fastest evolving traits in nature, and gene duplication is intricately involved in the origin and evolution of these traits. Here, we review genomic research on stalk-eyed flies (Diopsidae) that has been used to examine the extent of gene duplication and its role in the genetic architecture of sexual dimorphism. Stalk-eyed flies are remarkable because of the elongation of the head into long stalks, with the eyes and antenna laterally displaced at the ends of these stalks. Many species are strongly sexually dimorphic for eyespan, and these flies have become a model system for studying sexual selection. Using both expressed sequence tag and next-generation sequencing, we have established an extensive database of gene expression in the developing eye-antennal imaginal disc, the adult head and testes. Duplicated genes exhibit narrower expression patterns than non-duplicated genes, and the testes, in particular, provide an abundant source of gene duplication. Within somatic tissue, duplicated genes are more likely to be differentially expressed between the sexes, suggesting gene duplication may provide a mechanism for resolving sexual conflict.
Adomako-Ankomah, Yaw; English, Elizabeth D; Danielson, Jeffrey J; Pernas, Lena F; Parker, Michelle L; Boulanger, Martin J; Dubey, Jitender P; Boyle, Jon P
In Toxoplasma gondii, an intracellular parasite of humans and other animals, host mitochondrial association (HMA) is driven by a gene family that encodes multiple mitochondrial association factor 1 (MAF1) proteins. However, the importance of MAF1 gene duplication in the evolution of HMA is not understood, nor is the impact of HMA on parasite biology. Here we used within- and between-species comparative analysis to determine that the MAF1 locus is duplicated in T. gondii and its nearest extant relative Hammondia hammondi, but not in another close relative, Neospora caninum. Using cross-species complementation, we determined that the MAF1 locus harbors multiple distinct paralogs that differ in their ability to mediate HMA, and that only T. gondii and H. hammondi harbor HMA(+) paralogs. Additionally, we found that exogenous expression of an HMA(+) paralog in T. gondii strains that do not normally exhibit HMA provides a competitive advantage over their wild-type counterparts during a mouse infection. These data indicate that HMA likely evolved by neofunctionalization of a duplicate MAF1 copy in the common ancestor of T. gondii and H. hammondi, and that the neofunctionalized gene duplicate is selectively advantageous. Copyright © 2016 by the Genetics Society of America.
Background: Based on the observation of an increased number of paralogous genes in teleost fishes compared with other vertebrates and on the conserved synteny between duplicated copies, it has been shown that a whole genome duplication (WGD) occurred during the evolution of Actinopterygian fish. Comparative phylogenetic dating of this duplication event suggests that it occurred early on, specifically in teleosts. It has been proposed that this event might have facilitated the evolutionary radiation and the phenotypic diversification of the teleost fish, notably by allowing the sub- or neo-functionalization of many duplicated genes. Results: In this paper, we studied in a wide range of Actinopterygians the duplication and fate of the androgen receptor (AR, NR3C4), a nuclear receptor known to play a key role in sex-determination in vertebrates. The pattern of AR gene duplication is consistent with an early WGD event: it has been duplicated into two genes AR-A and AR-B after the split of the Acipenseriformes from the lineage leading to teleost fish but before the divergence of Osteoglossiformes. Genomic and syntenic analyses in addition to lack of PCR amplification show that one of the duplicated copies, AR-B, was lost in several basal Clupeocephala such as Cypriniformes (including the model species zebrafish), Siluriformes, Characiformes and Salmoniformes. Interestingly, we also found that, in basal teleost fish (Osteoglossiformes and Anguilliformes), the two copies remain very similar, whereas, specifically in Percomorphs, one of the copies, AR-B, has accumulated substitutions in both the ligand binding domain (LBD) and the DNA binding domain (DBD). Conclusion: The comparison of the mutations present in these divergent AR-B with those known in human to be implicated in complete, partial or mild androgen insensitivity syndrome suggests that the existence of two distinct AR duplicates may be correlated to specific functional differences that may be
Hofberger, J.A.; Lyons, E.; Edger, P.P.; Pires, J.C.; Schranz, M.E.
Plants share a common history of successive whole genome duplication (WGD) events retaining genomic patterns of duplicate gene copies (ohnologs) organized in conserved syntenic blocks. Duplication was often proposed to affect the origin of novel traits during evolution. However, genetic evidence
Carelli Francesco N
Background: Segmental duplications (SDs) are blocks of genomic sequence of 1-200 kb that map to different loci in a genome and share a sequence identity > 90%. SDs show at the sequence level the same characteristics as other regions of the human genome: they contain both high-copy repeats and gene sequences. SDs play an important role in genome plasticity by creating new genes and modeling genome structure. Although data is plentiful for mammals, not much was known about the representation of SDs in plant genomes. In this regard, we performed a genome-wide analysis of high-identity SDs on the sequenced grapevine (Vitis vinifera) genome (PN40024). Results: We demonstrate that recent SDs (> 94% identity and >= 10 kb in size) are a relevant component of the grapevine genome (85 Mb, 17% of the genome sequence). We detected mitochondrial and plastid DNA and genes (10% of gene annotation) in segmentally duplicated regions of the nuclear genome. In particular, the nine highest copy number genes have a copy in either or both organelle genomes. Further we showed that several duplicated genes take part in the biosynthesis of compounds involved in plant response to environmental stress. Conclusions: These data show the great influence of SDs and organelle DNA transfers in modeling the Vitis vinifera nuclear DNA structure as well as the impact of SDs in contributing to the adaptive capacity of grapevine and the nutritional content of grape products through genome variation. This study represents a step forward in the full characterization of duplicated genes important for grapevine cultural needs and human health.
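The size and identity thresholds quoted in this abstract can be read as two simple filters over pairwise alignments. The sketch below is illustrative only and is not the authors' pipeline: the `Alignment` record and both function names are hypothetical, and it assumes that alignment length and percent identity have already been computed by some other tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alignment:
    """Hypothetical record for one pairwise alignment between two loci."""
    length_bp: int        # aligned length in base pairs
    identity_pct: float   # sequence identity in percent


def is_segmental_duplication(a: Alignment) -> bool:
    # General definition from the abstract: 1-200 kb and identity > 90%.
    return 1_000 <= a.length_bp <= 200_000 and a.identity_pct > 90.0


def is_recent_sd(a: Alignment) -> bool:
    # "Recent" SDs as stated in the abstract: > 94% identity and >= 10 kb.
    # Assumption: no upper length bound is applied here, because the
    # abstract does not state one for this category.
    return a.length_bp >= 10_000 and a.identity_pct > 94.0
```

Under these assumptions, a 5 kb alignment at 92% identity counts as an SD but not as a recent one, while a 10 kb alignment must exceed 94% identity to count as recent.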
Marcinkowska-Swojak, Malgorzata; Szczerbal, Izabela; Pausch, Hubert; Nowacka-Woszuk, Joanna; Flisikowski, Krzysztof; Dzimira, Stanislaw; Nizanski, Wojciech; Payan-Carreira, Rita; Fries, Ruedi; Kozlowski, Piotr; Switonski, Marek
Although the disorder of sex development in dogs with female karyotype (XX DSD) is quite common, its molecular basis is still unclear. Among mutations underlying XX DSD in mammals are duplication of a long sequence upstream of the SOX9 gene (RevSex) and duplication of the SOX9 gene (also observed in dogs). We performed a comparative analysis of 16 XX DSD and 30 control female dogs, using FISH and MLPA approaches. Our study was focused on a region harboring SOX9 and a region orthologous to the human RevSex (CanRevSex), which was located by in silico analysis downstream of SOX9. Two highly polymorphic copy number variable regions (CNVRs): CNVR1 upstream of SOX9 and CNVR2 encompassing CanRevSex were identified. Although none of the detected copy number variants were specific to either affected or control animals, we observed that the average number of copies in CNVR1 was higher in XX DSD. No copy variation of SOX9 was observed. Our extensive studies have excluded duplication of SOX9 as the common cause of XX DSD in analyzed samples. However, it remains possible that the causative mutation is hidden in highly polymorphic CNVR1.
The Sox gene family is found in a broad range of animal taxa and encodes important gene regulatory proteins involved in a variety of developmental processes. We have obtained clones representing the HMG boxes of twelve Sox genes from grass carp (Ctenopharyngodon idella), one of the four major domestic carps in China. The cloned Sox genes belong to groups B1, B2 and C. Our analyses show that whereas the human genome contains a single copy of Sox4, Sox11 and Sox14, each of these genes has two co-orthologs in grass carp, and the duplication of Sox4 and Sox11 occurred before the divergence of grass carp and zebrafish, which supports the "fish-specific whole-genome duplication" theory. An estimation for the origin of grass carp based on the molecular clock using Sox1, Sox3 and Sox11 genes as markers indicates that grass carp (subfamily Leuciscinae) and zebrafish (subfamily Danioninae) diverged approximately 60 million years ago. The potential uses of Sox genes as markers in revealing the evolutionary history of grass carp are discussed.
Campos, Carla Marques Rondon; Zanardo, Evelin Aline; Dutra, Roberta Lelis; Kulikowski, Leslie Domenici; Kim, Chong Ae
Congenital heart defects (CHD) are the most prevalent group of structural abnormalities at birth and one of the main causes of infant morbidity and mortality. Studies have shown a contribution of copy number variation to the genesis of cardiac malformations. The objective was to investigate gene copy number variation (CNV) in children with conotruncal heart defect. Multiplex ligation-dependent probe amplification (MLPA) was performed in 39 patients with conotruncal heart defect. Clinical and laboratory assessments were conducted in all patients. The parents of the probands who presented abnormal findings were also investigated. Gene copy number variation was detected in 7/39 patients: 22q11.2 deletion, 22q11.2 duplication, 15q11.2 duplication, 20p12.2 duplication, 19p deletion, 15q and 8p23.2 duplication with 10p12.31 duplication. The clinical characteristics were consistent with those reported in the literature associated with the encountered microdeletion/microduplication. None of these changes was inherited from the parents. Our results demonstrate that the technique of MLPA is useful in the investigation of microdeletions and microduplications in conotruncal congenital heart defects. Early diagnosis of copy number variation in patients with congenital heart defect assists in preventing morbidity and decreasing mortality in these patients.
Assogba, Benoît S; Djogbénou, Luc S; Milesi, Pascal; Berthomieu, Arnaud; Perez, Julie; Ayala, Diego; Chandre, Fabrice; Makoutodé, Michel; Labbé, Pierrick; Weill, Mylène
Widespread resistance to pyrethroids threatens malaria control in Africa. Consequently, several countries switched to carbamate and organophosphate insecticides for indoor residual spraying. However, a mutation in the ace-1 gene conferring resistance to these compounds (the ace-1(R) allele) is already present. Furthermore, a duplicated allele (ace-1(D)) recently appeared; characterizing its selective advantage is mandatory to evaluate the threat. Our data revealed that a unique duplication event, pairing a susceptible and a resistant copy of the ace-1 gene, spread through West Africa. Further investigations revealed that, while ace-1(D) confers less resistance than ace-1(R), the high fitness cost associated with ace-1(R) is almost completely suppressed by the duplication for all traits studied. The ace-1 duplication thus represents a permanent heterozygote phenotype, selected, and thus spreading, due to the mosaic nature of mosquito control. It provides malaria mosquitoes with a new evolutionary path that could hamper resistance management.
Daniel J Kliebenstein
BACKGROUND: Most eukaryotic genomes have undergone whole genome duplications during their evolutionary history. Recent studies have shown that the function of these duplicated genes can diverge from the ancestral gene via neo- or sub-functionalization within single genotypes. An additional possibility is that gene duplicates may also undergo partitioning of function among different genotypes of a species leading to genetic differentiation. Finally, the ability of gene duplicates to diverge may be limited by their biological function. METHODOLOGY/PRINCIPAL FINDINGS: To test these hypotheses, I estimated the impact of gene duplication and metabolic function upon intraspecific gene expression variation of segmental and tandem duplicated genes within Arabidopsis thaliana. In all instances, the younger tandem duplicated genes showed higher intraspecific gene expression variation than the average Arabidopsis gene. Surprisingly, the older segmental duplicates also showed evidence of elevated intraspecific gene expression variation albeit typically lower than for the tandem duplicates. The specific biological function of the gene as defined by metabolic pathway also modulated the level of intraspecific gene expression variation. The major energy metabolism and biosynthetic pathways showed decreased variation, suggesting that they are constrained in their ability to accumulate gene expression variation. In contrast, a major herbivory defense pathway showed significantly elevated intraspecific variation suggesting that it may be under pressure to maintain and/or generate diversity in response to fluctuating insect herbivory pressures. CONCLUSION: These data show that intraspecific variation in gene expression is facilitated by an interaction of gene duplication and biological activity. Further, this plays a role in controlling diversity of plant metabolism.
Background: The dystroglycan (DG) complex is a major non-integrin cell adhesion system whose multiple biological roles involve, among others, skeletal muscle stability, embryonic development and synapse maturation. DG is composed of two subunits: α-DG, extracellular and highly glycosylated, and the transmembrane β-DG, linking the cytoskeleton to the surrounding basement membrane in a wide variety of tissues. A single copy of the DG gene (DAG1) has been identified so far in humans and other mammals, encoding for a precursor protein which is post-translationally cleaved to liberate the two DG subunits. Similarly, D. rerio (zebrafish) seems to have a single copy of DAG1, whose removal was shown to cause a severe dystrophic phenotype in adult animals, although it is known that during evolution, due to a whole genome duplication (WGD) event, many teleost fish acquired multiple copies of several genes (paralogues). Results: Data mining of available nucleotide sequences from pufferfish (T. nigroviridis and T. rubripes) and other teleost fish (O. latipes and G. aculeatus) revealed the presence of two functional paralogous DG sequences. RT-PCR analysis proved that both the DG sequences are transcribed in T. nigroviridis. One of the two DG sequences harbours an additional mini-intronic sequence, 137 bp long, interrupting the uncomplicated exon-intron-exon pattern displayed by DAG1 in mammals and D. rerio. A similar scenario emerged also in D. labrax (sea bass), from whose genome we have cloned and sequenced a new DG sequence that also harbours a shorter additional intronic sequence of 116 bp. Western blot analysis confirmed the presence of DG protein products in all the species analysed including two teleost Antarctic species (T. bernacchii and C. hamatus). Conclusion: Our evolutionary analysis has shown that the whole-genome duplication event in the Class Actinopterygii (ray-finned fish) involved also DAG1. We unravelled new important molecular genetic details
Melanie G Mayer
Many nematodes form dauer larvae when exposed to unfavorable conditions, representing an example of phenotypic plasticity and a major survival and dispersal strategy. In Caenorhabditis elegans, the regulation of dauer induction is a model for pheromone, insulin, and steroid-hormone signaling. Recent studies in Pristionchus pacificus revealed substantial natural variation in various aspects of dauer development, i.e. pheromone production and sensing and dauer longevity and fitness. One intriguing example is a strain from Ohio, having extremely long-lived dauers associated with very high fitness and often forming the most dauers in response to other strains' pheromones, including the reference strain from California. While such examples have been suggested to represent intraspecific competition among strains, the molecular mechanisms underlying these dauer-associated patterns are currently unknown. We generated recombinant-inbred-lines between the Californian and Ohioan strains and used quantitative-trait-loci analysis to investigate the molecular mechanism determining natural variation in dauer development. Surprisingly, we discovered that the orphan gene dauerless controls dauer formation by copy number variation. The Ohioan strain has one dauerless copy causing high dauer formation, whereas the Californian strain has two copies, resulting in strongly reduced dauer formation. Transgenic animals expressing multiple copies do not form dauers. dauerless is exclusively expressed in CAN neurons, and both CAN ablation and dauerless mutations increase dauer formation. Strikingly, dauerless underwent several duplications and acts in parallel or downstream of steroid-hormone signaling but upstream of the nuclear-hormone-receptor daf-12. We identified the novel or fast-evolving gene dauerless as inhibitor of dauer development. Our findings reveal the importance of gene duplications and copy number variations for orphan gene function and suggest daf-12 as
Wang, Jun; Tao, Feng; Marowsky, Nicholas C; Fan, Chuanzhu
Gene duplication is a primary means to generate genomic novelties, playing an essential role in speciation and adaptation. Particularly in plants, a high abundance of duplicate genes has been maintained for significantly long periods of evolutionary time. To address the manner in which young duplicate genes were derived primarily from small-scale gene duplication and preserved in plant genomes and to determine the underlying driving mechanisms, we generated transcriptomes to produce the expression profiles of five tissues in Arabidopsis thaliana and the closely related species Arabidopsis lyrata and Capsella rubella. Based on the quantitative analysis metrics, we investigated the evolutionary processes of young duplicate genes in Arabidopsis. We determined that conservation, neofunctionalization, and specialization are three main evolutionary processes for Arabidopsis young duplicate genes. We explicitly demonstrated the dynamic functionalization of duplicate genes along the evolutionary time scale. Upon origination, duplicates tend to maintain their ancestral functions; but as they survive longer, they become more likely to develop distinct and novel functions. The temporal evolutionary processes and functionalization of plant duplicate genes are associated with their ancestral functions, dynamic DNA methylation levels, and histone modification abundances. Furthermore, duplicate genes tend to be initially expressed in pollen and then to gain more interaction partners over time. Altogether, our study provides novel insights into the dynamic retention processes of young duplicate genes in plant genomes. © 2016 American Society of Plant Biologists. All rights reserved.
Mogensen, Mie; Skjørringe, Tina; Kodama, Hiroko
the identified duplicated fragments originated from a single or from two different X-chromosomes, polymorphic markers located in the duplicated fragments were analyzed. RESULTS: Partial ATP7A gene duplication was identified in 20 unrelated patients including one patient with Occipital Horn Syndrome (OHS...
Jiang, Wen-kai; Liu, Yun-long; Xia, En-hua; Gao, Li-zhi
The evolution of genes and genomes after polyploidization has been the subject of extensive studies in evolutionary biology and plant sciences. While a significant number of duplicated genes are rapidly removed during a process called fractionation, which operates after the whole-genome duplication (WGD), another considerable number of genes are retained preferentially, leading to the phenomenon of biased gene retention. However, the evolutionary mechanisms underlying gene retention after WGD remain largely unknown. Through genome-wide analyses of sequence and functional data, we comprehensively investigated the relationships between gene features and the retention probability of duplicated genes after WGDs in six plant genomes, Arabidopsis (Arabidopsis thaliana), poplar (Populus trichocarpa), soybean (Glycine max), rice (Oryza sativa), sorghum (Sorghum bicolor), and maize (Zea mays). The results showed that multiple gene features were correlated with the probability of gene retention. Using a logistic regression model based on principal component analysis, we resolved evolutionary rate, structural complexity, and GC3 content as the three major contributors to gene retention. Cluster analysis of these features further classified retained genes into three distinct groups in terms of gene features and evolutionary behaviors. Type I genes are more prone to be selected by dosage balance; type II genes are possibly subject to subfunctionalization; and type III genes may serve as potential targets for neofunctionalization. This study highlights that gene features are able to act jointly as primary forces when determining the retention and evolution of WGD-derived duplicated genes in flowering plants. These findings thus may help to provide a resolution to the debate on different evolutionary models of gene fates after WGDs. PMID:23396833
Benner Steven A
Background: Blocks of duplicated genomic DNA sequence longer than 1000 base pairs are known as low copy repeats (LCRs). Identified by their sequence similarity, LCRs are abundant in the human genome, and are interesting because they may represent recent adaptive events, or potential future adaptive opportunities within the human lineage. Sequence analysis tools are needed, however, to decide whether these interpretations are likely, whether a particular set of LCRs represents nearly neutral drift creating junk DNA, or whether the appearance of LCRs reflects assembly error. Here we investigate an LCR family containing the sulfotransferase (SULT) 1A genes involved in drug metabolism, cancer, hormone regulation, and neurotransmitter biology as a first step for defining the problems that those tools must manage. Results: Sequence analysis here identified a fourth sulfotransferase gene, which may be transcriptionally active, located on human chromosome 16. Four regions of genomic sequence containing the four human SULT1A paralogs defined a new LCR family. The stem hominoid SULT1A progenitor locus was identified by comparative genomics involving complete human and rodent genomes, and a draft chimpanzee genome. SULT1A expansion in hominoid genomes was followed by positive selection acting on specific protein sites. This episode of adaptive evolution appears to be responsible for the dopamine sulfonation function of some SULT enzymes. Each of the conclusions that this bioinformatic analysis generated using data that has uncertain reliability (such as that from the chimpanzee genome sequencing project) has been confirmed experimentally or by a "finished" chromosome 16 assembly, both of which were published after the submission of this manuscript. Conclusion: SULT1A genes expanded from one to four copies in hominoids during intra-chromosomal LCR duplications, including (apparently) one after the divergence of chimpanzees and humans. Thus, LCRs may
Yang, Yanmei; Wang, Jinpeng; Di, Jianyong
Soybean (Glycine max) is one of the most important crop plants for providing protein and oil. It is important to investigate the soybean genome for its economic and scientific value. Polyploidy is a widespread and recurrent phenomenon during plant evolution, and it can generate massive numbers of duplicated genes, which are an important resource for genetic innovation. Improved sequence alignment criteria and statistical analysis are used to identify and characterize duplicated genes produced by polyploidization in soybean. Based on the collinearity method, duplicated genes by whole genome duplication account for 70.3% in soybean. From the statistical analysis of the molecular distances between duplicated genes, our study indicates that the whole genome duplication event occurred more than once in the genome evolution of soybean, which is often distributed near the ends of chromosomes.
Background: Recently, originalization was proposed to be an effective way of duplicate-gene preservation, in which recombination promotes a high frequency of the original (or wild-type) allele on both duplicated loci. Because a high frequency of the wild-type allele might drive the arising and accumulation of advantageous mutations, it is hypothesized that recombination might enlarge the probability of neofunctionalization (Pneo) of duplicate genes. In this article this hypothesis has been tested theoretically. Results: Results show that through originalization, recombination might not only shorten the mean time to neofunctionalization, but also enlarge Pneo. Conclusions: Therefore, recombination might facilitate neofunctionalization via originalization. Several extensive applications of these results to genomic evolution have been discussed: 1. Time to nonfunctionalization can be much longer than the few million generations expected before; 2. Homogenization on duplicated loci results from not only gene conversion, but also originalization; 3. Although the rate of advantageous mutation is much smaller than that of degenerative mutation, Pneo cannot be expected to be small.
It has been shown that gene body DNA methylation is associated with gene expression. However, whether and how deviation of gene body DNA methylation between duplicate genes can influence their divergence remains largely unexplored. Here, we aim to elucidate the potential role of gene body DNA methylation in the fate of duplicate genes. We identified paralogous gene pairs from Arabidopsis and rice (Oryza sativa ssp. japonica) genomes and reprocessed their single-base resolution methylome data. We show that methylation in paralogous genes nonlinearly correlates with several gene properties including exon number/gene length, expression level and mutation rate. Further, we demonstrated that divergence of methylation level and pattern in paralogs indeed positively correlate with their sequence and expression divergences. This result held even after controlling for other confounding factors known to influence the divergence of paralogs. We observed that methylation level divergence might be more relevant to the expression divergence of paralogs than methylation pattern divergence. Finally, we explored the mechanisms that might give rise to the divergence of gene body methylation in paralogs. We found that exonic methylation divergence more closely correlates with expression divergence than intronic methylation divergence. We show that genomic environments (e.g., flanked by transposable elements and repetitive sequences) of paralogs generated by various duplication mechanisms are associated with the methylation divergence of paralogs. Overall, our results suggest that the changes in gene body DNA methylation could provide another avenue for duplicate genes to develop differential expression patterns and undergo different evolutionary fates in plant genomes.
BACKGROUND: Synonymous DNA substitution rates in the plant chloroplast genome are generally relatively slow and lineage dependent. Non-synonymous rates are usually even slower due to purifying selection acting on the genes. Positive selection is expected to speed up non-synonymous substitution rates, whereas synonymous rates are expected to be unaffected. Until recently, positive selection has seldom been observed in chloroplast genes, and large-scale structural rearrangements leading to gene duplications are hitherto supposed to be rare. METHODOLOGY/PRINCIPAL FINDINGS: We found high substitution rates in the exons of the plastid clpP1 gene in Oenothera (the Evening Primrose family) and three separate lineages in the tribe Sileneae (Caryophyllaceae, the Carnation family). Introns have been lost in some of the lineages, but where present, the intron sequences have substitution rates similar to those found in other introns of their genomes. The elevated substitution rates of clpP1 are associated with statistically significant whole-gene positive selection in three branches of the phylogeny. In two of the lineages we found multiple copies of the gene. Neighboring genes present in the duplicated fragments do not show signs of elevated substitution rates or positive selection. Although non-synonymous substitutions account for most of the increase in substitution rates, synonymous rates are also markedly elevated in some lineages. Whereas plant clpP1 genes experiencing negative (purifying) selection are characterized by having very conserved lengths, genes under positive selection often have large insertions of more or less repetitive amino acid sequence motifs. CONCLUSIONS/SIGNIFICANCE: We found positive selection of the clpP1 gene in various plant lineages to be correlated with repeated duplication of the clpP1 gene and surrounding regions, repetitive amino acid sequences, and increase in synonymous substitution rates. The present study sheds light on the
Wentz, Elisabet; Vujic, Mihailo; Kärrstedt, Ewa-Lotta; Erlandsson, Anna; Gillberg, Christopher
Autism spectrum disorder, severe behaviour problems and duplication of the Xq12 to Xq13 region have recently been described in three male relatives. We aimed to describe the psychiatric comorbidity and dysmorphic features, including craniosynostosis, of two male siblings with autism and duplication of the Xq13 to Xq21 region, and to narrow down the number of duplicated genes proposed to lead to global developmental delay and autism. We performed DNA sequencing of certain exons of the TWIST1 gene, the FGFR2 gene and the FGFR3 gene. We also performed microarray analysis of the DNA. In addition to autism, the two male siblings exhibited severe learning disability, self-injurious behaviour, temper tantrums and hyperactivity, and had no communicative language. Chromosomal analyses were normal. Neither of the two siblings showed mutations of the sequenced exons known to produce craniosynostosis. The microarray analysis detected an extra copy of a region on the long arm of chromosome X, chromosome band Xq13.1-q21.1. Comparison of our two cases with previously described patients allowed us to identify three genes predisposing to autism in the duplicated chromosomal region. Sagittal craniosynostosis is also a new finding linked to the duplication.
Angiosperm genomes differ from those of mammals by extensive and recursive polyploidizations. The resulting gene duplication provides opportunities both for genetic innovation, and for concerted evolution. Though most genes may escape conversion by their homologs, concerted evolution of duplicated genes can last for millions of years or longer after their origin. Indeed, paralogous genes on two rice chromosomes duplicated an estimated 60–70 million years ago have experienced gene conversion in the past 400,000 years. Gene conversion preserves similarity of paralogous genes, but appears to accelerate their divergence from orthologous genes in other species. The mutagenic nature of recombination coupled with the buffering effect provided by gene redundancy, may facilitate the evolution of novel alleles that confer functional innovations while insulating biological fitness of affected plants. A mixed evolutionary model, characterized by a primary birth-and-death process and occasional homoeologous recombination and gene conversion, may best explain the evolution of multigene families.
Niu, Ao-lei; Wang, Yin-qiu; Zhang, Hui; Liao, Cheng-hong; Wang, Jin-kai; Zhang, Rui; Che, Jun; Su, Bing
Homeobox genes are the key regulators during development, and they are in general highly conserved with only a few reported cases of rapid evolution. RHOXF2 is an X-linked homeobox gene in primates. It is highly expressed in the testicle and may play an important role in spermatogenesis. As the male reproductive system is often the target of natural and/or sexual selection during evolution, in this study, we aim to dissect the pattern of molecular evolution of RHOXF2 in primates and its potential functional consequence. We studied sequences and copy number variation of RHOXF2 in humans and 16 nonhuman primate species as well as the expression patterns in human, chimpanzee, white-browed gibbon and rhesus macaque. The gene copy number analysis showed that there had been parallel gene duplications/losses in multiple primate lineages. Our evidence suggests that 11 nonhuman primate species have one RHOXF2 copy, and two copies are present in humans and four Old World monkey species, and at least 6 copies in chimpanzees. Further analysis indicated that the gene duplications in primates had likely been mediated by endogenous retrovirus (ERV) sequences flanking the gene regions. In striking contrast to non-human primates, humans appear to have homogenized their two RHOXF2 copies by the ERV-mediated non-allelic recombination mechanism. Coding sequence and phylogenetic analysis suggested multi-lineage strong positive selection on RHOXF2 during primate evolution, especially during the origins of humans and chimpanzees. All the 8 coding region polymorphic sites in human populations are non-synonymous, implying on-going selection. Gene expression analysis demonstrated that besides the preferential expression in the reproductive system, RHOXF2 is also expressed in the brain. The quantitative data suggests expression pattern divergence among primate species. RHOXF2 is a fast-evolving homeobox gene in primates. The rapid evolution and copy number changes of RHOXF2 had been driven by
Venkatachalam Ananda B
RNAs for both the duplicated copies of fabp1a/fabp1b.1, and fabp7a/fabp7b, but in different tissues. Clofibrate also increased the steady-state level of fabp10a and fabp11a mRNAs and hnRNAs in liver, but not for fabp10b and fabp11b. Conclusion Some duplicated fabp genes have, most likely, retained PPREs, but induction by clofibrate is over-ridden by an, as yet, unknown tissue-specific mechanism(s). Regardless of the tissue-specific mechanism(s), transcriptional control of duplicated zebrafish fabp genes by clofibrate has markedly diverged since the WGD event.
Cook, R Kimberley; Deal, Megan E; Deal, Jennifer A; Garton, Russell D; Brown, C Adam; Ward, Megan E; Andrade, Rachel S; Spana, Eric P; Kaufman, Thomas C; Cook, Kevin R
Interchromosomal duplications are especially important for the study of X-linked genes. Males inheriting a mutation in a vital X-linked gene cannot survive unless there is a wild-type copy of the gene duplicated elsewhere in the genome. Rescuing the lethality of an X-linked mutation with a duplication allows the mutation to be used experimentally in complementation tests and other genetic crosses and it maps the mutated gene to a defined chromosomal region. Duplications can also be used to screen for dosage-dependent enhancers and suppressors of mutant phenotypes as a way to identify genes involved in the same biological process. We describe an ongoing project in Drosophila melanogaster to generate comprehensive coverage and extensive breakpoint subdivision of the X chromosome with megabase-scale X segments borne on Y chromosomes. The in vivo method involves the creation of X inversions on attached-XY chromosomes by FLP-FRT site-specific recombination technology followed by irradiation to induce large internal X deletions. The resulting chromosomes consist of the X tip, a medial X segment placed near the tip by an inversion, and a full Y. A nested set of medial duplicated segments is derived from each inversion precursor. We have constructed a set of inversions on attached-XY chromosomes that enable us to isolate nested duplicated segments from all X regions. To date, our screens have provided a minimum of 78% X coverage with duplication breakpoints spaced a median of nine genes apart. These duplication chromosomes will be valuable resources for rescuing and mapping X-linked mutations and identifying dosage-dependent modifiers of mutant phenotypes.
Jenny U Johansson
Alternative splicing is an evolutionary innovation to create functionally diverse proteins from a limited number of genes. SNAP-25 plays a central role in neuroexocytosis by bridging synaptic vesicles to the plasma membrane during regulated exocytosis. The SNAP-25 polypeptide is encoded by a single copy gene, but in higher vertebrates a duplication of exon 5 has resulted in two mutually exclusive splice variants, SNAP-25a and SNAP-25b. To address a potential physiological difference between the two SNAP-25 proteins, we generated gene targeted SNAP-25b deficient mouse mutants by replacing the SNAP-25b specific exon with a second SNAP-25a equivalent. Elimination of SNAP-25b expression resulted in developmental defects, spontaneous seizures, and impaired short-term synaptic plasticity. In adult mutants, morphological changes in hippocampus and drastically altered neuropeptide expression were accompanied by severe impairment of spatial learning. We conclude that the ancient exon duplication in the Snap25 gene provides additional SNAP-25-function required for complex neuronal processes in higher eukaryotes.
Wang, Jun; Tao, Feng; Marowsky, Nicholas C.; Fan, Chuanzhu
Gene duplication is a primary means to generate genomic novelties, playing an essential role in speciation and adaptation. Particularly in plants, a high abundance of duplicate genes has been maintained for significantly long periods of evolutionary time. To address the manner in which young duplicate genes were derived primarily from small-scale gene duplication and preserved in plant genomes and to determine the underlying driving mechanisms, we generated transcriptomes to produce the expression profiles of five tissues in Arabidopsis thaliana and the closely related species Arabidopsis lyrata and Capsella rubella. Based on the quantitative analysis metrics, we investigated the evolutionary processes of young duplicate genes in Arabidopsis. We determined that conservation, neofunctionalization, and specialization are three main evolutionary processes for Arabidopsis young duplicate genes. We explicitly demonstrated the dynamic functionalization of duplicate genes along the evolutionary time scale. Upon origination, duplicates tend to maintain their ancestral functions; but as they survive longer, they might be likely to develop distinct and novel functions. The temporal evolutionary processes and functionalization of plant duplicate genes are associated with their ancestral functions, dynamic DNA methylation levels, and histone modification abundances. Furthermore, duplicate genes tend to be initially expressed in pollen and then to gain more interaction partners over time. Altogether, our study provides novel insights into the dynamic retention processes of young duplicate genes in plant genomes. PMID:27485883
Cuypers, Thomas D; Hogeweg, Paulien; Hogeweg, P.
Whole genome duplication has shaped eukaryotic evolutionary history and has been associated with drastic environmental change and species radiation. While the most common fate of WGD duplicates is a return to single copy, retained duplicates have been found enriched for highly interacting genes. This pattern has been explained by a neutral process of subfunctionalization and more recently, dosage balance selection. However, much about the relationship between environmental change, WGD and ada...
Amoutzias, Gregory D; Robertson, David L; Oliver, Stephen G; Bornberg-Bauer, Erich
By combining phylogenetic, proteomic and structural information, we have elucidated the evolutionary driving forces for the gene-regulatory interaction networks of basic helix-loop-helix transcription factors. We infer that recurrent events of single-gene duplication and domain rearrangement repeatedly gave rise to distinct networks with almost identical hub-based topologies, and multiple activators and repressors. We thus provide the first empirical evidence for scale-free protein networks emerging through single-gene duplications, the dominant importance of molecular modularity in the bottom-up construction of complex biological entities, and the convergent evolution of networks.
Bemer, Marian; Gordon, Jonathan; Weterings, Koen; Angenent, Gerco C
The MADS-box transcription factor family has expanded considerably in plants via gene and genome duplications and can be subdivided into type I and MIKC-type genes. The two gene classes show a different evolutionary history. Whereas the MIKC-type genes originated during ancient genome duplications, as well as during more recent events, the type I loci appear to experience high turnover with many recent duplications. This different mode of origin also suggests a different fate for the type I duplicates, which are thought to have a higher chance to become silenced or lost from the genome. To get more insight into the evolution of the type I MADS-box genes, we isolated nine type I genes from Petunia, which belong to the Mgamma subclass, and investigated the divergence of their coding and regulatory regions. The isolated genes could be subdivided into two categories: two genes were highly similar to Arabidopsis Mgamma-type genes, whereas the other seven genes showed less similarity to Arabidopsis genes and originated more recently. Two of the recently duplicated genes were found to contain deleterious mutations in their coding regions, and expression analysis revealed that a third paralog was silenced by mutations in its regulatory region. However, in addition to the three genes that were subjected to nonfunctionalization, we also found evidence for neofunctionalization of one of the Petunia Mgamma-type genes. Our study shows a rapid divergence of recently duplicated Mgamma-type MADS-box genes and suggests that redundancy among type I paralogs may be less common than expected.
Li, Lin; Briskine, Roman; Schaefer, Robert; Schnable, Patrick S; Myers, Chad L; Flagel, Lex E; Springer, Nathan M; Muehlbauer, Gary J
Gene duplication is prevalent in many species and can result in coding and regulatory divergence. Gene duplications can be classified as whole genome duplication (WGD), tandem and inserted (non-syntenic). In maize, WGD resulted in the subgenomes maize1 and maize2, of which maize1 is considered the dominant subgenome. However, the landscape of co-expression network divergence of duplicate genes in maize is still largely uncharacterized. To address the consequence of gene duplication on co-expression network divergence, we developed a gene co-expression network from RNA-seq data derived from 64 different tissues/stages of the maize reference inbred B73. WGD, tandem and inserted gene duplications exhibited distinct regulatory divergence. Inserted duplicate genes were more likely to be singletons in the co-expression networks, while WGD duplicate genes were likely to be co-expressed with other genes. Tandem duplicate genes were enriched in the co-expression pattern where co-expressed genes were nearly identical for the duplicates in the network. Older gene duplications exhibit more extensive co-expression variation than younger duplications. Overall, non-syntenic genes primarily from inserted duplications show more co-expression divergence. Also, such enlarged co-expression divergence is significantly related to duplication age. Moreover, subgenome dominance was not observed in the co-expression networks - maize1 and maize2 exhibit similar levels of intra subgenome correlations. Intriguingly, the level of inter subgenome co-expression was similar to the level of intra subgenome correlations, genes from specific subgenomes were not likely to be enriched in co-expression network modules, and the hub genes were not predominantly from any specific subgenome in maize. Our work provides a comprehensive analysis of maize co-expression network divergence for three different types of gene duplications and identifies potential relationships between duplication types
Arsovski, Andrej A; Pradinuk, Julian; Guo, Xu Qiu; Wang, Sishuo; Adams, Keith L
Plant genomes contain large numbers of duplicated genes that contribute to the evolution of new functions. Following duplication, genes can exhibit divergence in their coding sequence and their expression patterns. Changes in the cis-regulatory element landscape can result in changes in gene expression patterns. High-throughput methods developed recently can identify potential cis-regulatory elements on a genome-wide scale. Here, we use a recent comprehensive data set of DNase I sequencing-identified cis-regulatory binding sites (footprints) at single-base-pair resolution to compare binding sites and network connectivity in duplicated gene pairs in Arabidopsis (Arabidopsis thaliana). We found that duplicated gene pairs vary greatly in their cis-regulatory element architecture, resulting in changes in regulatory network connectivity. Whole-genome duplicates (WGDs) have approximately twice as many footprints in their promoters left by potential regulatory proteins than do tandem duplicates (TDs). The WGDs have a greater average number of footprint differences between paralogs than TDs. The footprints, in turn, result in more regulatory network connections between WGDs and other genes, forming denser, more complex regulatory networks than shown by TDs. When comparing regulatory connections between duplicates, WGDs had more pairs in which the two genes are either partially or fully diverged in their network connections, but fewer genes with no network connections than the TDs. There is evidence of younger TDs and WGDs having fewer unique connections compared with older duplicates. This study provides insights into cis-regulatory element evolution and network divergence in duplicated genes. © 2015 American Society of Plant Biologists. All Rights Reserved.
Roux, Julien; Liu, Jialin; Robinson-Rechavi, Marc
The evolutionary history of vertebrates is marked by three ancient whole-genome duplications: two successive rounds in the ancestor of vertebrates, and a third one specific to teleost fishes. Biased loss of most duplicates enriched the genome for specific genes, such as slow evolving genes, but this selective retention process is not well understood. To understand what drives the long-term preservation of duplicate genes, we characterized duplicated genes in terms of their expression patterns. We used a new method of expression enrichment analysis, TopAnat, applied to in situ hybridization data from thousands of genes from zebrafish and mouse. We showed that the presence of expression in the nervous system is a good predictor of a higher rate of retention of duplicate genes after whole-genome duplication. Further analyses suggest that purifying selection against the toxic effects of misfolded or misinteracting proteins, which is particularly strong in nonrenewing neural tissues, likely constrains the evolution of coding sequences of nervous system genes, leading indirectly to the preservation of duplicate genes after whole-genome duplication. Whole-genome duplications thus greatly contributed to the expansion of the toolkit of genes available for the evolution of profound novelties of the nervous system at the base of the vertebrate radiation. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Miyake, Noriko; Abdel-Salam, Ghada; Yamagata, Takanori; Eid, Maha M; Osaka, Hitoshi; Okamoto, Nobuhiko; Mohamed, Amal M; Ikeda, Takahiro; Afifi, Hanan H; Piard, Juliette; van Maldergem, Lionel; Mizuguchi, Takeshi; Miyatake, Satoko; Tsurusaki, Yoshinori; Matsumoto, Naomichi
Coffin-Siris syndrome is a rare congenital malformation and intellectual disability syndrome. Mutations in at least seven genes have been identified. Here, we performed copy number analysis in 37 patients with features of CSS in whom no causative mutations were identified by exome sequencing. We identified a patient with a 9p24.3-p22.2 duplication and another patient with the chromosome der(6)t(6;9)(p25;p21)mat. Both patients share a duplicated 15.8-Mb region containing 46 protein coding genes, including SMARCA2. Dominant negative effects of SMARCA2 mutations may contribute to Nicolaides-Baraitser syndrome. We conclude that their features better resemble Coffin-Siris syndrome, rather than Nicolaides-Baraitser syndrome and that these features likely arise from SMARCA2 over-dosage. Pure 9p duplications (not caused by unbalanced translocations) are rare. Copy number analysis in patients with features that overlap with Coffin-Siris syndrome is recommended to further determine their genetic aspects. © 2016 Wiley Periodicals, Inc.
Kamei, Hiroyasu; Lu, Ling; Jiao, Shuang
Background: Gene duplication is the primary force of new gene evolution. Deciphering whether a pair of duplicated genes has evolved divergent functions is often challenging. The zebrafish is uniquely positioned to provide insight into the process of functional gene evolution due to its amenabilit...
Zhou, Xiaofan; Lin, Zhenguo; Ma, Hong
Background Gene duplication is considered a major driving force for evolution of genetic novelty, thereby facilitating functional divergence and organismal diversity, including the process of speciation. Animals, fungi and plants are major eukaryotic kingdoms and the divergences between them are some of the most significant evolutionary events. Although gene duplications in each lineage have been studied extensively in various contexts, the extent of gene duplication prior to the split of pla...
Pezer, Željka; Chung, Amanda G; Karn, Robert C; Laukaitis, Christina M
The Androgen-binding protein (Abp) gene region of the mouse genome contains 64 genes, some encoding pheromones that influence assortative mating between mice from different subspecies. Using CNVnator and quantitative PCR, we explored copy number variation in this gene family in natural populations of Mus musculus domesticus (Mmd) and Mus musculus musculus (Mmm), two subspecies of house mice that form a narrow hybrid zone in Central Europe. We found that copy number variation in the center of the Abp gene region is very common in wild Mmd, primarily representing the presence/absence of the final duplications described for the mouse genome. Clustering of Mmd individuals based on this variation did not reflect their geographical origin, suggesting no population divergence in the Abp gene cluster. However, copy number variation patterns differ substantially between Mmd and other mouse taxa. Large blocks of Abp genes are absent in Mmm, Mus musculus castaneus and an outgroup, Mus spretus, although with differences in variation and breakpoint locations. Our analysis calls into question the reliance on a reference genome for interpreting the detailed organization of genes in taxa more distant from the Mmd reference genome. The polymorphic nature of the gene family expansion in all four taxa suggests that the number of Abp genes, especially in the central gene region, is not critical to the survival and reproduction of the mouse. However, Abp haplotypes of variable length may serve as a source of raw genetic material for new signals influencing reproductive communication and thus speciation of mice. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Matthew A Carrigan
Gene duplication is a source of molecular innovation throughout evolution. However, even with massive amounts of genome sequence data, correlating gene duplication with speciation and other events in natural history can be difficult. This is especially true in its most interesting cases, where rapid and multiple duplications are likely to reflect adaptation to rapidly changing environments and lifestyles. This may be so for Class I of alcohol dehydrogenases (ADH1s), where multiple duplications occurred in primate lineages in Old and New World monkeys (OWMs and NWMs) and hominoids. To build a preferred model for the natural history of ADH1s, we determined the sequences of nine new ADH1 genes, finding for the first time multiple paralogs in various prosimians (lemurs, strepsirhines). Database mining then identified novel ADH1 paralogs in both macaque (an OWM) and marmoset (a NWM). These were used with the previously identified human paralogs to resolve controversies relating to dates of duplication and gene conversion in the ADH1 family. Central to these controversies are differences in the topologies of trees generated from exonic (coding) sequences and intronic sequences. We provide evidence that gene conversions are the primary source of difference, using molecular clock dating of duplications and analyses of microinsertions and deletions (micro-indels). The tree topology inferred from intron sequences appears to more correctly represent the natural history of ADH1s, with the ADH1 paralogs in platyrrhines (NWMs) and catarrhines (OWMs and hominoids) having arisen by duplications shortly predating the divergence of OWMs and NWMs. We also conclude that paralogs in lemurs arose independently. Finally, we identify errors in database interpretation as the source of controversies concerning gene conversion. These analyses provide a model for the natural history of ADH1s that posits four ADH1 paralogs in the ancestor of Catarrhine and Platyrrhine primates
Kursel, Lisa E; Malik, Harmit S
Despite their essential role in the process of chromosome segregation in most eukaryotes, centromeric histones show remarkable evolutionary lability. Not only have they been lost in multiple insect lineages, but they have also undergone gene duplication in multiple plant lineages. Based on detailed study of a handful of model organisms including Drosophila melanogaster, centromeric histone duplication is considered to be rare in animals. Using a detailed phylogenomic study, we find that Cid, the centromeric histone gene, has undergone at least four independent gene duplications during Drosophila evolution. We find duplicate Cid genes in D. eugracilis (Cid2), in the montium species subgroup (Cid3, Cid4) and in the entire Drosophila subgenus (Cid5). We show that Cid3, Cid4, and Cid5 all localize to centromeres in their respective species. Some Cid duplicates are primarily expressed in the male germline. With rare exceptions, Cid duplicates have been strictly retained after birth, suggesting that they perform nonredundant centromeric functions, independent from the ancestral Cid. Indeed, each duplicate encodes a distinct N-terminal tail, which may provide the basis for distinct protein-protein interactions. Finally, we show some Cid duplicates evolve under positive selection whereas others do not. Taken together, our results support the hypothesis that Drosophila Cid duplicates have subfunctionalized. Thus, these gene duplications provide an unprecedented opportunity to dissect the multiple roles of centromeric histones. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Zeira, Ron; Shamir, Ron
Problems of genome rearrangement are central in both evolution and cancer research. Most genome rearrangement models assume that the genome contains a single copy of each gene and the only changes in the genome are structural, i.e., reordering of segments. In contrast, tumor genomes also undergo numerical changes such as deletions and duplications, and thus the number of copies of genes varies. Dealing with unequal gene content is a very challenging task, addressed by few algorithms to date. More realistic models are needed to help trace genome evolution during tumorigenesis. Here we present a model for the evolution of genomes with multiple gene copies using the operation types double-cut-and-joins, duplications and deletions. The events supported by the model are reversals, translocations, tandem duplications, segmental deletions, and chromosomal amplifications and deletions, covering most types of structural and numerical changes observed in tumor samples. Our goal is to find a series of operations of minimum length that transform one karyotype into the other. We show that the problem is NP-hard and give an integer linear programming formulation that solves the problem exactly under some mild assumptions. We test our method on simulated genomes and on ovarian cancer genomes. Our study advances the state of the art in two ways: it allows a broader set of operations than extant models, thus being more realistic, and it is the first study attempting to reconstruct the full sequence of structural and numerical events during cancer evolution. Code and data are available at https://github.com/Shamir-Lab/Sorting-Cancer-Karyotypes. Supplementary data are available at Bioinformatics online.
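To make the event types named in this abstract concrete, the following is a minimal sketch, assuming a chromosome is represented as a list of signed gene identifiers (the sign encodes orientation). The function names and the representation are hypothetical illustrations; this is not the authors' integer linear programming formulation and is not taken from their repository.

```python
from typing import List

# Assumed representation: a chromosome is a list of signed gene identifiers,
# where the sign encodes orientation, e.g. [1, 2, -3].
Chromosome = List[int]


def reversal(chrom: Chromosome, i: int, j: int) -> Chromosome:
    """Reverse the segment chrom[i:j] and flip the orientation of each gene in it."""
    return chrom[:i] + [-g for g in reversed(chrom[i:j])] + chrom[j:]


def tandem_duplication(chrom: Chromosome, i: int, j: int) -> Chromosome:
    """Insert a second copy of chrom[i:j] immediately after the original segment."""
    return chrom[:j] + chrom[i:j] + chrom[j:]


def segmental_deletion(chrom: Chromosome, i: int, j: int) -> Chromosome:
    """Remove the segment chrom[i:j]."""
    return chrom[:i] + chrom[j:]


# A hypothetical three-event scenario on a toy chromosome.
start = [1, 2, 3, 4]
step1 = reversal(start, 1, 3)            # [1, -3, -2, 4]
step2 = tandem_duplication(step1, 0, 2)  # [1, -3, 1, -3, -2, 4]
step3 = segmental_deletion(step2, 4, 5)  # [1, -3, 1, -3, 4]
```

In these terms, the sorting problem described above asks for the shortest such sequence of events turning one karyotype into another. The sketch only applies events; it does not search for a minimum-length scenario.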
Differential transcriptional modulation of duplicated fatty acid-binding protein genes by dietary fatty acids in zebrafish (Danio rerio): evidence for subfunctionalization or neofunctionalization of duplicated genes
Denovan-Wright Eileen M
Abstract. Background: In the Duplication-Degeneration-Complementation (DDC) model, subfunctionalization and neofunctionalization have been proposed as important processes driving the retention of duplicated genes in the genome. These processes are thought to occur by gain or loss of regulatory elements in the promoters of duplicated genes. We tested the DDC model by determining the transcriptional induction of fatty acid-binding protein (Fabps) genes by dietary fatty acids (FAs) in zebrafish. We chose zebrafish for this study for two reasons: extensive bioinformatics resources are available for zebrafish at zfin.org, and zebrafish contains many duplicated genes owing to a whole genome duplication event that occurred early in the ray-finned fish lineage approximately 230-400 million years ago. Adult zebrafish were fed diets containing either fish oil (12% lipid, rich in highly unsaturated fatty acid), sunflower oil (12% lipid, rich in linoleic acid), linseed oil (12% lipid, rich in linolenic acid), or low fat (4% lipid, low fat diet) for 10 weeks. FA profiles and the steady-state levels of fabp mRNA and heterogeneous nuclear RNA in intestine, liver, muscle and brain of zebrafish were determined. Results: FA profiles assayed by gas chromatography differed in the intestine, brain, muscle and liver depending on diet. The steady-state level of mRNA for three sets of duplicated genes, fabp1a/fabp1b.1/fabp1b.2, fabp7a/fabp7b, and fabp11a/fabp11b, was determined by reverse transcription, quantitative polymerase chain reaction (RT-qPCR). In brain, the steady-state level of fabp7b mRNAs was induced in fish fed the linoleic acid-rich diet; in intestine, the transcript levels of fabp1b.1 and fabp7b were elevated in fish fed the linolenic acid-rich diet; in liver, the level of fabp7a mRNAs was elevated in fish fed the low fat diet; and in muscle, the levels of fabp7a and fabp11a mRNAs were elevated in fish fed the linolenic acid-rich or the low fat diets. In all cases
Yanai, Itai; Camacho, Carlos J.; DeLisi, Charles
A universal property of microbial genomes is the considerable fraction of genes that are homologous to other genes within the same genome. The process by which these homologues are generated is not well understood, but sequence analysis of 20 microbial genomes unveils a recurrent distribution of gene family sizes. We show that a simple evolutionary model based on random gene duplication and point mutations fully accounts for these distributions and permits predictions for the number of gene families in genomes not yet complete. Our findings are consistent with the notion that a genome evolves from a set of precursor genes to a mature size by gene duplications and increasing modifications. (c) 2000 The American Physical Society
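The kind of model described in this abstract can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' model: point mutation is replaced by a single hypothetical parameter, `p_diverge`, the probability that a new copy has diverged far enough to found its own family, and all names and parameter values are chosen only for illustration.

```python
import random
from collections import Counter


def simulate_family_sizes(n_precursors, n_duplications, p_diverge, seed=0):
    """Grow a genome by random gene duplication and tally gene family sizes.

    Assumption (not from the abstract): divergence by point mutation is
    summarised as a fixed probability that a duplicate founds a new family.
    Returns a Counter mapping family size -> number of families of that size.
    """
    rng = random.Random(seed)
    genes = list(range(n_precursors))  # genes[i] is the family label of gene i
    next_family = n_precursors
    for _ in range(n_duplications):
        parent_family = rng.choice(genes)  # duplicate a uniformly chosen gene
        if rng.random() < p_diverge:
            genes.append(next_family)      # the copy founds a new family
            next_family += 1
        else:
            genes.append(parent_family)    # the copy stays in its parent's family
    family_sizes = Counter(genes)          # family label -> size
    return Counter(family_sizes.values())  # size -> number of families


# Hypothetical run: 100 precursor genes grown to a 1,000-gene genome.
hist = simulate_family_sizes(n_precursors=100, n_duplications=900, p_diverge=0.1)
for size in sorted(hist)[:5]:
    print(size, hist[size])
```

Because genes in large families are more likely to be picked for duplication, a run like this tends to produce many small families and a few large ones, which is the qualitative shape a recurrent family-size distribution would need to match.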
Chen, Yuan; Ding, Yun; Zhang, Zuming; Wang, Wen; Chen, Jun-Yuan; Ueno, Naoto; Mao, Bingyu
The evolution of the central nervous system (CNS) is one of the most striking changes during the transition from invertebrates to vertebrates. As a major source of genetic novelties, gene duplication might play an important role in the functional innovation of the vertebrate CNS. In this study, we focused on a group of CNS-biased genes that duplicated during early vertebrate evolution. We investigated the tempo-spatial expression patterns of 33 duplicate gene families and their orthologs during the embryonic development of the vertebrate Xenopus laevis and the cephalochordate Branchiostoma belcheri. Almost all the identified duplicate genes are differentially expressed in the CNS in Xenopus embryos, and more than 50% and 30% of duplicate genes are expressed in the telencephalon and mid-hindbrain boundary, respectively, which are mostly considered as two innovations in the vertebrate CNS. Interestingly, more than 50% of the amphioxus orthologs do not show apparent expression in the CNS in amphioxus embryos as detected by in situ hybridization, indicating that some of the vertebrate CNS-biased duplicate genes might arise from non-CNS genes in invertebrates. Our data accentuate the functional contribution of gene duplication to the CNS evolution of vertebrates and uncover an invertebrate non-CNS history for some vertebrate CNS-biased duplicate genes. Copyright © 2011. Published by Elsevier Ltd.
Gadagkar Sudhindra R
Abstract. Background: The completion of 19 insect genome sequencing projects spanning six insect orders provides the opportunity to investigate the evolution of important gene families, here tubulins. Tubulins are a family of eukaryotic structural genes that form microtubules, fundamental components of the cytoskeleton that mediate cell division, shape, motility, and intracellular trafficking. Previous in vivo studies in Drosophila find a stringent relationship between tubulin structure and function; small, biochemically similar changes in the major alpha 1 or testis-specific beta 2 tubulin protein render each unable to generate a motile spermtail axoneme. This has evolutionary implications: not a single non-synonymous substitution is found in beta 2 among 17 species of Drosophila and Hirtodrosophila flies spanning 60 Myr of evolution. This raises an important question: how do tubulins evolve while maintaining their function? To answer, we use molecular evolutionary analyses to characterize the evolution of insect tubulins. Results: Sixty-six alpha tubulin and eighty-six beta tubulin gene copies were retrieved and subjected to molecular evolutionary analyses. Four ancient clades of alpha and beta tubulins are found in insects, a major isoform clade (alpha 1, beta 1) and three minor, tissue-specific clades (alpha 2-4, beta 2-4). Based on a Homarus americanus (lobster) outgroup, these were generated through gene duplication events on major beta and alpha tubulin ancestors, followed by subfunctionalization in expression domain. Strong purifying selection acts on all tubulins, yet maximum pairwise amino acid distances between tubulin paralogs are large (0.464 substitutions/site beta tubulins, 0.707 alpha tubulins). Conversely, orthologs, with the exception of reproductive tissue isoforms, show little sequence variation except in the last 15 carboxy terminus tail (CTT) residues, which serve as sites for post-translational modifications (PTMs) and interactions
Mueller, Rachel Lockridge; Boore, Jeffrey L.
Extensive gene rearrangement is reported in the mitochondrial genomes of lungless salamanders (Plethodontidae). In each genome with a novel gene order, there is evidence that the rearrangement was mediated by duplication of part of the mitochondrial genome, including the presence of both pseudogenes and additional, presumably functional, copies of duplicated genes. All rearrangement-mediating duplications include either the origin of light strand replication and the nearby tRNA genes or the regions flanking the origin of heavy strand replication. The latter regions comprise nad6, trnE, cob, trnT, an intergenic spacer between trnT and trnP and, in some genomes, trnP, the control region, trnF, rrnS, trnV, rrnL, trnL1, and nad1. In some cases, two copies of duplicated genes, presumptive regulatory regions, and/or sequences with no assignable function have been retained in the genome following the initial duplication; in other genomes, only one of the duplicated copies has been retained. Both tandem and non-tandem duplications are present in these genomes, suggesting different duplication mechanisms. In some of these mtDNAs, up to 25 percent of the total length is composed of tandem duplications of non-coding sequence that includes putative regulatory regions and/or pseudogenes of tRNAs and protein-coding genes along with otherwise unassignable sequences. These data indicate that imprecise initiation and termination of replication, slipped-strand mispairing, and intra-molecular recombination may all have played a role in generating repeats during the evolutionary history of plethodontid mitochondrial genomes.
Noh, Hyun Ji; Ponting, Chris P; Boulding, Hannah C; Meader, Stephen; Betancur, Catalina; Buxbaum, Joseph D; Pinto, Dalila; Marshall, Christian R; Lionel, Anath C; Scherer, Stephen W; Webber, Caleb
Autism Spectrum Disorders (ASD) are highly heritable and characterised by impairments in social interaction and communication, and restricted and repetitive behaviours. Considering four sets of de novo copy number variants (CNVs) identified in 181 individuals with autism and exploiting mouse functional genomics and known protein-protein interactions, we identified a large and significantly interconnected interaction network. This network contains 187 genes affected by CNVs drawn from 45% of the patients we considered and 22 genes previously implicated in ASD, of which 192 form a single interconnected cluster. On average, those patients with copy number changed genes from this network possess changes in 3 network genes, suggesting that epistasis mediated through the network is extensive. Correspondingly, genes that are highly connected within the network, and thus whose copy number change is predicted by the network to be more phenotypically consequential, are significantly enriched among patients that possess only a single ASD-associated network copy number changed gene (p = 0.002). Strikingly, deleted or disrupted genes from the network are significantly enriched in GO-annotated positive regulators (2.3-fold enrichment, corrected p = 2×10^-5), whereas duplicated genes are significantly enriched in GO-annotated negative regulators (2.2-fold enrichment, corrected p = 0.005). The direction of copy change is highly informative in the context of the network, providing the means through which perturbations arising from distinct deletions or duplications can yield a common outcome. These findings reveal an extensive ASD-associated molecular network, whose topology indicates ASD-relevant mutational deleteriousness and that mechanistically details how convergent aetiologies can result extensively from CNVs affecting pathways causally implicated in ASD.
Abstract. Background: Gene and genome duplication is the principal creative force in evolution. Recently, protein subcellular relocalization, or neolocalization, was proposed as one of the mechanisms responsible for the retention of duplicated genes. This hypothesis received support from the analysis of yeast genomes, but has not been tested thoroughly on animal genomes. In order to evaluate the importance of subcellular relocalizations for retention of duplicated genes in animal genomes, we systematically analyzed nuclear encoded mitochondrial proteins in the human genome by reconstructing phylogenies of mitochondrial multigene families. Results: The 456 human mitochondrial proteins selected for this study were clustered into 305 gene families including 92 multigene families. Among the multigene families, 59 (64%) consisted of both mitochondrial and cytosolic (non-mitochondrial) proteins (mt-cy families) while the remaining 33 (36%) were composed of mitochondrial proteins (mt-mt families). Phylogenetic analyses of mt-cy families revealed three different scenarios of their neolocalization following gene duplication: (1) relocalization from mitochondria to cytosol, (2) from cytosol to mitochondria and (3) multiple subcellular relocalizations. The neolocalizations were most commonly enabled by the gain or loss of N-terminal mitochondrial targeting signals. The majority of detected subcellular relocalization events occurred early in animal evolution, preceding the evolution of tetrapods. Mt-mt protein families showed a somewhat different pattern, where gene duplication occurred more evenly in time. However, for both types of protein families, most duplication events appear to roughly coincide with two rounds of genome duplications early in vertebrate evolution. Finally, we evaluated the effects of inaccurate and incomplete annotation of mitochondrial proteins and found that our conclusion of the importance of subcellular relocalization after gene duplication on
Sato, Yukuto; Tsukamoto, Katsumi; Nishida, Mutsumi
Whole-genome duplication (WGD) is believed to be a significant source of major evolutionary innovation. Redundant genes resulting from WGD are thought to be lost or acquire new functions. However, the rates of gene loss and thus temporal process of genome reshaping after WGD remain unclear. The WGD shared by all teleost fish, one-half of all jawed vertebrates, was more recent than the two ancient WGDs that occurred before the origin of jawed vertebrates, and thus lends itself to analysis of gene loss and genome reshaping. Using a newly developed orthology identification pipeline, we inferred the post–teleost-specific WGD evolutionary histories of 6,892 protein-coding genes from nine phylogenetically representative teleost genomes on a time-calibrated tree. We found that rapid gene loss did occur in the first 60 My, with a loss of more than 70–80% of duplicated genes, and produced similar genomic gene arrangements within teleosts in that relatively short time. Mathematical modeling suggests that rapid gene loss occurred mainly by events involving simultaneous loss of multiple genes. We found that the subsequent 250 My were characterized by slow and steady loss of individual genes. Our pipeline also identified about 1,100 shared single-copy genes that are inferred to have become singletons before the divergence of clupeocephalan teleosts. Therefore, our comparative genome analysis suggests that rapid gene loss just after the WGD reshaped teleost genomes before the major divergence, and provides a useful set of marker genes for future phylogenetic analysis. PMID:26578810
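As a back-of-envelope reading of the figures above, the sketch below converts "70–80% of duplicated genes lost in the first 60 My" into an implied per-gene loss rate. It assumes a constant-rate exponential decay, retained(t) = exp(-rate × t), which is a simplification introduced here only for illustration; the abstract itself attributes the rapid phase mainly to events removing several genes at once. The function names are hypothetical.

```python
import math


def implied_loss_rate(fraction_lost, elapsed_my):
    """Per-My loss rate assuming retained(t) = exp(-rate * t) (a simplification)."""
    return -math.log(1.0 - fraction_lost) / elapsed_my


def half_life(rate):
    """Time in My for half of the remaining duplicates to be lost at this rate."""
    return math.log(2.0) / rate


# Bounds quoted in the abstract: 70-80% of duplicates lost within 60 My.
rate_low = implied_loss_rate(0.70, 60.0)
rate_high = implied_loss_rate(0.80, 60.0)

print(f"implied rate: {rate_low:.4f} to {rate_high:.4f} per My")
print(f"implied half-life: {half_life(rate_high):.1f} to {half_life(rate_low):.1f} My")
```

Under this assumption the rapid phase corresponds to a loss of roughly 2 to 3% of the remaining duplicates per million years. The later "slow and steady" phase would need its own, much smaller rate, which the abstract does not quantify.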
Rare copy number variants (CNVs) have a prominent role in the aetiology of schizophrenia and other neuropsychiatric disorders. Substantial risk for schizophrenia is conferred by large (>500-kilobase) CNVs at several loci, including microdeletions at 1q21.1 (ref. 2), 3q29 (ref. 3), 15q13.3 (ref. 2) and 22q11.2 (ref. 4) and microduplication at 16p11.2 (ref. 5). However, these CNVs collectively account for a small fraction (2-4%) of cases, and the relevant genes and neurobiological mechanisms are not well understood. Here we performed a large two-stage genome-wide scan of rare CNVs and report the significant association of copy number gains at chromosome 7q36.3 with schizophrenia. Microduplications with variable breakpoints occurred within a 362-kilobase region and were detected in 29 of 8,290 (0.35%) patients versus 2 of 7,431 (0.03%) controls in the combined sample. All duplications overlapped or were located within 89 kilobases upstream of the vasoactive intestinal peptide receptor gene VIPR2. VIPR2 transcription and cyclic-AMP signalling were significantly increased in cultured lymphocytes from patients with microduplications of 7q36.3. These findings implicate altered vasoactive intestinal peptide signalling in the pathogenesis of schizophrenia and indicate the VPAC2 receptor as a potential target for the development of new antipsychotic drugs.
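The carrier counts reported above (29 of 8,290 patients versus 2 of 7,431 controls) are enough to recompute the carrier frequencies and an odds ratio. The sketch below does so, and adds a one-sided Fisher's exact test built from the hypergeometric distribution. The choice of test is an assumption made for illustration, since the abstract does not state which statistic the authors used, and the function names are hypothetical.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Odds ratio for the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact p-value: P(X >= a) with all margins fixed.

    X is the number of carriers falling in the first row (patients) and
    follows a hypergeometric distribution under the null hypothesis.
    """
    row1 = a + b          # patients
    col1 = a + c          # carriers
    n = a + b + c + d     # everyone
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1))
    return tail / comb(n, col1)


# Counts from the abstract: carriers and non-carriers of 7q36.3 microduplications.
case_carriers, case_total = 29, 8290
ctrl_carriers, ctrl_total = 2, 7431

a, b = case_carriers, case_total - case_carriers
c, d = ctrl_carriers, ctrl_total - ctrl_carriers

case_freq = a / case_total   # about 0.35%, as quoted
ctrl_freq = c / ctrl_total   # about 0.03%, as quoted
or_value = odds_ratio(a, b, c, d)
p_value = fisher_exact_greater(a, b, c, d)

print(f"{case_freq:.2%} vs {ctrl_freq:.2%}; OR = {or_value:.2f}; one-sided p = {p_value:.2e}")
```

The percentages reproduce the 0.35% and 0.03% quoted in the text, and the odds ratio comes out at about 13. The p-value here treats the combined sample as a single table, so it need not match whatever the two-stage design in the study reports.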
Tornow, J; Santangelo, G M
A duplicate copy of the RPL37A gene (encoding ribosomal protein L37) was cloned and sequenced. The coding region of RPL37B is very similar to that of RPL37A, with only one conservative amino-acid difference. However, the intron and flanking sequences of the two genes are extremely dissimilar. Disruption experiments indicate that the two loci are not functionally equivalent: disruption of RPL37B was insignificant, but disruption of RPL37A severely impaired the growth rate of the cell. When both RPL37 loci are disrupted, the cell is unable to grow at all, indicating that rpL37 is an essential protein. The functional disparity between the two RPL37 loci could be explained by differential gene expression. The results of two experiments support this idea: gene fusion of RPL37A to a reporter gene resulted in six-fold higher mRNA levels than was generated by the same reporter gene fused to RPL37B, and a modest increase in gene dosage of RPL37B overcame the lack of a functional RPL37A gene.
Schilter, K F; Reis, L M; Schneider, A; Bardakjian, T M; Abdul-Rahman, O; Kozel, B A; Zimmerman, H H; Broeckel, U; Semina, E V
Anophthalmia/microphthalmia (A/M) represent severe developmental ocular malformations. Currently, mutations in known genes explain less than 40% of A/M cases. We performed whole-genome copy number variation analysis in 60 patients affected with isolated or syndromic A/M. Pathogenic deletions of 3q26 (SOX2) were identified in four independent patients with syndromic microphthalmia. Other variants of interest included regions with a known role in human disease (likely pathogenic) as well as novel rearrangements (uncertain significance). A 2.2-Mb duplication of 3q29 in a patient with non-syndromic anophthalmia and an 877-kb duplication of 11p13 (PAX6) and a 1.4-Mb deletion of 17q11.2 (NF1) in two independent probands with syndromic microphthalmia and other ocular defects were identified; while ocular anomalies have been previously associated with 3q29 duplications, PAX6 duplications, and NF1 mutations in some cases, the ocular phenotypes observed here are more severe than previously reported. Three novel regions of possible interest included a 2q14.2 duplication which cosegregated with microphthalmia/microcornea and congenital cataracts in one family, and 2q21 and 15q26 duplications in two additional cases; each of these regions contains genes that are active during vertebrate ocular development. Overall, this study identified causative copy number mutations and regions with a possible role in ocular disease in 17% of A/M cases. © 2013 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Cohen-Gihon, Inbar; Nussinov, Ruth; Sharan, Roded
During evolution, organisms have gained functional complexity mainly by modifying and improving existing functioning systems rather than creating new ones ab initio. Here we explore the interplay between two processes which during evolution have had major roles in the acquisition of new functions: gene duplication and protein domain rearrangements. We consider four possible evolutionary scenarios: gene families that have undergone none of these event types, only gene duplication, only domain rearrangement, or both events. We characterize each of the four evolutionary scenarios by functional attributes. Our analysis of ten fungal genomes indicates that, at least for the fungal clade, species appear to gain complexity significantly by gene duplication accompanied by the expansion of existing domain architectures via rearrangements. We show that paralogs gaining new domain architectures via duplication tend to adopt new functions compared to paralogs that preserve their domain architectures. We conclude that evolution of protein families through gene duplication and domain rearrangement is correlated with their functional properties. We suggest that in general, new functions are acquired via the integration of gene duplication and domain rearrangements rather than each process acting independently.
Michael H. Kohn
While it remains a matter of some debate, rapid sequence evolution of the coding sequences of duplicate genes is characteristic for early phases past duplication, but long established duplicates generally evolve under constraint, much like the rest of the coding genome. As for coding sequences, it may be possible to infer evolutionary rate, selection, and constraint via contrasts between duplicate gene divergence in the 5 prime regions and in the corresponding synonymous site divergence in the coding regions. Finding elevated rates for the 5 prime regions of duplicated genes, in addition to the coding regions, would enable statements regarding the early processes of duplicate gene evolution. Here, 1 kb of each of the 5 prime regulatory regions of Drosophila melanogaster duplicate gene pairs were mapped onto one another to isolate shared sequence blocks. Genetic distances within shared sequence blocks (d5’) were found to increase as a function of synonymous (dS) and, to a lesser extent, amino-acid (dA) site divergence between duplicates. The rate d5’/dS was found to rapidly decay from values > 1 in young duplicate pairs (dS 0.8. Such rapid rates of 5 prime evolution exceeding 1 (~neutral) predominantly were found to occur in duplicate pairs with low amino-acid site divergence and that tended to be co-regulated when assayed on microarrays. Conceivably, functional redundancy and relaxation of selective constraint facilitate subsequent positive selection on the 5 prime regions of young duplicate genes. This might promote the evolution of new functions (neofunctionalization) or division of labor among duplicate genes (subfunctionalization). In contrast, similar to the vast portion of the non-coding genome, the 5 prime regions of long-established gene duplicates appear to evolve under selective constraint, indicating that these long-established gene duplicates have assumed critical functions.
Bickhart, Derek M.; Xu, Lingyang; Hutchison, Jana L.; Cole, John B.; Null, Daniel J.; Schroeder, Steven G.; Song, Jiuzhou; Garcia, Jose Fernando; Sonstegard, Tad S.; Van Tassell, Curtis P.; Schnabel, Robert D.; Taylor, Jeremy F.; Lewin, Harris A.; Liu, George E.
The diversity and population genetics of copy number variation (CNV) in domesticated animals are not well understood. In this study, we analysed 75 genomes of major taurine and indicine cattle breeds (including Angus, Brahman, Gir, Holstein, Jersey, Limousin, Nelore, and Romagnola), sequenced to 11-fold coverage to identify 1,853 non-redundant CNV regions. Supported by high validation rates in array comparative genomic hybridization (CGH) and qPCR experiments, these CNV regions accounted for 3.1% (87.5 Mb) of the cattle reference genome, representing a significant increase over previous estimates of the area of the genome that is copy number variable (∼2%). Further population genetics and evolutionary genomics analyses based on these CNVs revealed the population structures of the cattle taurine and indicine breeds and uncovered potential diversely selected CNVs near important functional genes, including AOX1, ASZ1, GAT, GLYAT, and KRTAP9-1. Additionally, 121 CNV gene regions were found to be either breed specific or differentially variable across breeds, such as RICTOR in dairy breeds and PNPLA3 in beef breeds. In contrast, clusters of the PRP and PAG genes were found to be duplicated in all sequenced animals, suggesting that subfunctionalization, neofunctionalization, or overdominance play roles in diversifying those fertility-related genes. These CNV results provide a new glimpse into the diverse selection histories of cattle breeds and a basis for correlating structural variation with complex traits in the future. PMID:27085184
Pabón-Mora, Natalia; Hidalgo, Oriane; Gleissberg, Stefan; Litt, Amy
Gene duplication and loss provide raw material for evolutionary change within organismal lineages, as functional diversification of gene copies provides a mechanism for phenotypic variation. Here we focus on the APETALA1/FRUITFULL MADS-box gene lineage evolution. AP1/FUL genes are angiosperm-specific and have undergone several duplications. By far the most significant one is the core-eudicot duplication resulting in the euAP1 and euFUL clades. Functional characterization of several euAP1 and euFUL genes has shown that both function in proper floral meristem identity and axillary meristem repression. Independently, euAP1 genes function in floral meristem and sepal identity, whereas euFUL genes control phase transition, cauline leaf growth, compound leaf morphogenesis and fruit development. Significant variation has been detected in the function of pre-duplication basal-eudicot FUL-like genes, but the underlying mechanisms for change have not been identified. FUL-like genes in the Papaveraceae encode all functions reported for euAP1 and euFUL genes, whereas FUL-like genes in Aquilegia (Ranunculaceae) function in inflorescence development and leaf complexity, but not in flower or fruit development. Here we isolated FUL-like genes across the Ranunculales and used phylogenetic approaches to analyze their evolutionary history. We identified an early duplication resulting in the RanFL1 and RanFL2 clades. RanFL1 genes were present in all the families sampled and are mostly under strong negative selection in the MADS, I and K domains. RanFL2 genes were only identified from Eupteleaceae, Papaveraceae s.l., Menispermaceae and Ranunculaceae and show relaxed purifying selection at the I and K domains. We discuss how asymmetric sequence diversification, new motifs, differences in codon substitutions and likely protein-protein interactions resulting from this Ranunculiid-specific duplication can help explain the functional differences among basal-eudicot FUL-like genes.
Shikimate kinase (SK; EC 2.7.1.71) catalyzes the fifth reaction of the shikimate pathway, which directs carbon from the central metabolism pool to a broad range of secondary metabolites involved in plant development, growth, and stress responses. In this study, we demonstrate the role of plant SK gene duplicate evolution in the diversification of metabolic regulation and the acquisition of novel and physiologically essential function. Phylogenetic analysis of plant SK homologs resolves an orthologous cluster of plant SKs and two functionally distinct orthologous clusters. These previously undescribed genes, shikimate kinase-like 1 (SKL1) and -2 (SKL2), do not encode SK activity, are present in all major plant lineages, and apparently evolved under positive selection following SK gene duplication over 400 MYA. This is supported by functional assays using recombinant SK, SKL1, and SKL2 from Arabidopsis thaliana (At) and evolutionary analyses of the diversification of SK-catalytic and -substrate binding sites based on theoretical structure models. AtSKL1 mutants yield albino and novel variegated phenotypes, which indicate SKL1 is required for chloroplast biogenesis. Extant SKL2 sequences show a strong genetic signature of positive selection, which is enriched in a protein-protein interaction module not found in other SK homologs. We also report the first kinetic characterization of plant SKs and show that gene expression diversification among the AtSK inparalogs is correlated with developmental processes and stress responses. This study examines the functional diversification of ancient and recent plant SK gene duplicates and highlights the utility of SKs as scaffolds for functional innovation.
Gene duplication is the primary force of new gene evolution. Deciphering whether a pair of duplicated genes has evolved divergent functions is often challenging. The zebrafish is uniquely positioned to provide insight into the process of functional gene evolution due to its amenability to genetic and experimental manipulation and because it possesses a large number of duplicated genes. We report the identification and characterization of two hypoxia-inducible genes in zebrafish that are co-orthologs of human IGF binding protein-1 (IGFBP-1). IGFBP-1 is a secreted protein that binds to IGF and modulates IGF actions in somatic growth, development, and aging. Like their human and mouse counterparts, in adult zebrafish igfbp-1a and igfbp-1b are exclusively expressed in the liver. During embryogenesis, the two genes are expressed in overlapping spatial domains but with distinct temporal patterns. While zebrafish IGFBP-1a mRNA was easily detected throughout embryogenesis, IGFBP-1b mRNA was detectable only in advanced stages. Hypoxia induces igfbp-1a expression in early embryogenesis, but induces igfbp-1b expression later in embryogenesis. Both IGFBP-1a and -1b are capable of IGF binding, but IGFBP-1b has much lower affinities for IGF-I and -II because of greater dissociation rates. Overexpression of IGFBP-1a and -1b in zebrafish embryos caused significant decreases in growth and developmental rates. When tested in cultured zebrafish embryonic cells, IGFBP-1a and -1b both inhibited IGF-1-induced cell proliferation but the activity of IGFBP-1b was significantly weaker. These results indicate subfunction partitioning of the duplicated IGFBP-1 genes at the levels of gene expression, physiological regulation, protein structure, and biological actions. The duplicated IGFBP-1 may provide additional flexibility in fine-tuning IGF signaling activities under hypoxia and other catabolic conditions.
Ye, Jun-jie; Ma, Li; Yang, Li-juan; Wang, Jin-huan; Wang, Yue-li; Guo, Hai; Gong, Ning; Nie, Wen-hui; Zhao, Shu-hua
There are many reports on associations between spermatogenesis and partial azoospermia factor c (AZFc) deletions as well as duplications; however, results are conflicting, possibly due to differences in methodology and ethnic background. The purpose of this study is to investigate the association of AZFc polymorphisms and male infertility in the Yi ethnic population resident in Yunnan Province, China. A total of 224 infertile patients and 153 fertile subjects were selected from the Yi ethnic population. The study was performed by sequence-tagged site plus/minus (STS+/-) analysis followed by gene dosage and gene copy definition analysis. Y haplotypes of 215 cases and 115 controls were defined by 12 binary markers using single nucleotide polymorphism on Y chromosome (Y-SNP) multiplex assays based on single base primer extension technology. The distribution of Y haplotypes was not significantly different between the case and control groups. The frequencies of both gr/gr (7.6% vs. 8.5%) and b2/b3 (6.3% vs. 8.5%) deletions do not show significant differences. Similarly, single nucleotide variant (SNV) analysis shows no significant difference of gene copy definition between the cases and controls. However, the frequency of partial duplications in the infertile group (4.0%) is significantly higher than that in the control group (0.7%). Further, we found a case with sY1206 deletion which had two CDY1 copies but removed half of DAZ genes. Our results show that male infertility is associated with partial AZFc duplications, but not with gr/gr or b2/b3 deletions, suggesting that partial AZFc duplications rather than deletions are risk factors for male infertility in the Chinese Yi population.
Background: Salmonids are of interest because of their relatively recent genome duplication, and their extensive use in wild fisheries and aquaculture. A comprehensive gene list and a comparison of genes in some of the different species provide valuable genomic information for one of the most widely studied groups of fish. Results: 298,304 expressed sequence tags (ESTs) from Atlantic salmon (69% of the total), 11,664 chinook, 10,813 sockeye, 10,051 brook trout, 10,975 grayling, 8,630 lake whitefish, and 3,624 northern pike ESTs were obtained in this study and have been deposited into the public databases. Contigs were built and putative full-length Atlantic salmon clones have been identified. A database containing ESTs, assemblies, consensus sequences, open reading frames, gene predictions and putative annotation is available. The overall similarity between Atlantic salmon ESTs and those of rainbow trout, chinook, sockeye, brook trout, grayling, lake whitefish, northern pike and rainbow smelt is 93.4, 94.2, 94.6, 94.4, 92.5, 91.7, 89.6, and 86.2% respectively. An analysis of 78 transcript sets shows Salmo as a sister group to Oncorhynchus and Salvelinus within Salmoninae, and Thymallinae as a sister group to Salmoninae and Coregoninae within Salmonidae. Extensive gene duplication is consistent with a genome duplication in the common ancestor of salmonids. Using all of the available EST data, a new expanded salmonid cDNA microarray of 32,000 features was created. Cross-species hybridizations to this cDNA microarray indicate that this resource will be useful for studies of all 68 salmonid species. Conclusion: An extensive collection and analysis of salmonid RNA putative transcripts indicate that Pacific salmon, Atlantic salmon and charr are 94–96% similar while the more distant whitefish, grayling, pike and smelt are 93, 92, 89 and 86% similar to salmon. The salmonid transcriptome reveals a complex history of gene duplication that is
Background: One of the many gene families that expanded in early vertebrate evolution is the neuropeptide Y (NPY) receptor family of G-protein coupled receptors. Earlier work by our lab suggested that several of the NPY receptor genes found in extant vertebrates resulted from two genome duplications before the origin of jawed vertebrates (gnathostomes) and one additional genome duplication in the actinopterygian lineage, based on their location on chromosomes sharing several gene families. In this study we have investigated, in five vertebrate genomes, 45 gene families with members close to the NPY receptor genes in the compact genomes of the teleost fishes Tetraodon nigroviridis and Takifugu rubripes. These correspond to Homo sapiens chromosomes 4, 5, 8 and 10. Results: Chromosome regions with conserved synteny were identified and confirmed by phylogenetic analyses in H. sapiens, M. musculus, D. rerio, T. rubripes and T. nigroviridis. 26 gene families, including the NPY receptor genes (plus 3 described recently by other labs), showed a tree topology consistent with duplications in early vertebrate evolution and in the actinopterygian lineage, thereby supporting expansion through block duplications. Eight gene families had complications that precluded analysis (such as short sequence length or variable number of repeated domains) and another eight families did not support block duplications (because the paralogs in these families seem to have originated in another time window than the proposed genome duplication events). RT-PCR carried out with several tissues in T. rubripes revealed that all five NPY receptors were expressed in the brain and subtypes Y2, Y4 and Y8 were also expressed in peripheral organs. Conclusion: We conclude that the phylogenetic analyses and chromosomal locations of these gene families support duplications of large blocks of genes or even entire chromosomes. Thus, these results are consistent with two early vertebrate
Inoue, K.; Sugiyama, N.; Kawanishi, C. [Yokohama City Univ., Yokohama (Japan)] [and others]
Pelizaeus-Merzbacher disease (PMD) is an X-linked dysmyelinating disorder caused by abnormalities in the proteolipid protein (PLP) gene, which is essential for oligodendrocyte differentiation and CNS myelin formation. Although linkage analysis has shown the homogeneity at the PLP locus in patients with PMD, exonic mutations in the PLP gene have been identified in only 10% - 25% of all cases, which suggests the presence of other genetic aberrations, including gene duplication. In this study, we examined five families with PMD not carrying exonic mutations in PLP gene, using comparative multiplex PCR (CM-PCR) as a semiquantitative assay of gene dosage. PLP gene duplications were identified in four families by CM-PCR and confirmed in three families by densitometric RFLP analysis. Because a homologous myelin protein gene, PMP22, is duplicated in the majority of patients with Charcot-Marie-Tooth 1A, PLP gene overdosage may be an important genetic abnormality in PMD and affect myelin formation. 38 ref., 5 figs., 2 tabs.
Glessner, Joseph T; Wang, Kai; Cai, Guiqing; Korvatska, Olena; Kim, Cecilia E; Wood, Shawn; Zhang, Haitao; Estes, Annette; Brune, Camille W; Bradfield, Jonathan P; Imielinski, Marcin; Frackelton, Edward C; Reichert, Jennifer; Crawford, Emily L; Munson, Jeffrey; Sleiman, Patrick M A; Chiavacci, Rosetta; Annaiah, Kiran; Thomas, Kelly; Hou, Cuiping; Glaberson, Wendy; Flory, James; Otieno, Frederick; Garris, Maria; Soorya, Latha; Klei, Lambertus; Piven, Joseph; Meyer, Kacie J; Anagnostou, Evdokia; Sakurai, Takeshi; Game, Rachel M; Rudd, Danielle S; Zurawiecki, Danielle; McDougle, Christopher J; Davis, Lea K; Miller, Judith; Posey, David J; Michaels, Shana; Kolevzon, Alexander; Silverman, Jeremy M; Bernier, Raphael; Levy, Susan E; Schultz, Robert T; Dawson, Geraldine; Owley, Thomas; McMahon, William M; Wassink, Thomas H; Sweeney, John A; Nurnberger, John I; Coon, Hilary; Sutcliffe, James S; Minshew, Nancy J; Grant, Struan F A; Bucan, Maja; Cook, Edwin H; Buxbaum, Joseph D; Devlin, Bernie; Schellenberg, Gerard D; Hakonarson, Hakon
Autism spectrum disorders (ASDs) are childhood neurodevelopmental disorders with complex genetic origins. Previous studies focusing on candidate genes or genomic regions have identified several copy number variations (CNVs) that are associated with an increased risk of ASDs. Here we present the results from a whole-genome CNV study on a cohort of 859 ASD cases and 1,409 healthy children of European ancestry who were genotyped with approximately 550,000 single nucleotide polymorphism markers, in an attempt to comprehensively identify CNVs conferring susceptibility to ASDs. Positive findings were evaluated in an independent cohort of 1,336 ASD cases and 1,110 controls of European ancestry. Besides previously reported ASD candidate genes, such as NRXN1 (ref. 10) and CNTN4 (refs 11, 12), several new susceptibility genes encoding neuronal cell-adhesion molecules, including NLGN1 and ASTN2, were enriched with CNVs in ASD cases compared to controls (P = 9.5 × 10⁻³). Furthermore, CNVs within or surrounding genes involved in the ubiquitin pathways, including UBE3A, PARK2, RFWD2 and FBXO40, were affected by CNVs not observed in controls (P = 3.3 × 10⁻³). We also identified duplications 55 kilobases upstream of complementary DNA AK123120 (P = 3.6 × 10⁻⁶). Although these variants may be individually rare, they target genes involved in neuronal cell-adhesion or ubiquitin degradation, indicating that these two important gene networks expressed within the central nervous system may contribute to the genetic susceptibility of ASD.
Fingert, John H; Robin, Alan L; Scheetz, Todd E; Kwon, Young H; Liebmann, Jeffrey M; Ritch, Robert; Alward, Wallace L M
This study investigates the role of TANK-binding kinase 1 (TBK1) gene copy-number variations (i.e., gene duplications and triplications) in the pathophysiology of various open-angle glaucomas. In previous studies, we discovered that copy-number variations in the TBK1 gene are associated with normal-tension glaucoma. Here, we investigated the prevalence of copy-number variations in cohorts of patients with other open-angle glaucomas, namely juvenile-onset open-angle glaucoma (n=30), pigmentary glaucoma (n=209), exfoliation glaucoma (n=225), and steroid-induced glaucoma (n=79), using a quantitative polymerase chain reaction assay. No TBK1 gene copy-number variations were detected in patients with juvenile-onset open-angle glaucoma, pigmentary glaucoma, or steroid-induced glaucoma. A TBK1 gene duplication was detected in one (0.44%) of the 225 exfoliation glaucoma patients. TBK1 gene copy-number variations (gene duplications and triplications) have been previously associated with normal-tension glaucoma. An exploration of other open-angle glaucomas detected a TBK1 copy-number variation in a patient with exfoliation glaucoma, which is the first example of a TBK1 mutation in a glaucoma patient with a diagnosis other than normal-tension glaucoma. A broader phenotypic range may be associated with TBK1 copy-number variations, although mutations in this gene are most often detected in patients with normal-tension glaucoma.
Beth L Dumont
Standard methods of DNA sequence analysis assume that sequences evolve independently, yet this assumption may not be appropriate for segmental duplications that exchange variants via interlocus gene conversion (IGC). Here, we use high quality multiple sequence alignments from well-annotated segmental duplications to systematically identify IGC signals in the human reference genome. Our analysis combines two complementary methods: (i) a paralog quartet method that uses DNA sequence simulations to identify a statistical excess of sites consistent with inter-paralog exchange, and (ii) the alignment-based method implemented in the GENECONV program. One-quarter (25.4%) of the paralog families in our analysis harbor clear IGC signals by the quartet approach. Using GENECONV, we identify 1477 gene conversion tracks that cumulatively span 1.54 Mb of the genome. Our analyses confirm the previously reported high rates of IGC in subtelomeric regions and Y-chromosome palindromes, and identify multiple novel IGC hotspots, including the pregnancy specific glycoproteins and the neuroblastoma breakpoint gene families. Although the duplication history of a paralog family is described by a single tree, we show that IGC has introduced incredible site-to-site variation in the evolutionary relationships among paralogs in the human genome. Our findings indicate that IGC has left significant footprints in patterns of sequence diversity across segmental duplications in the human genome, out-pacing the contributions of single base mutation by orders of magnitude. Collectively, the IGC signals we report comprise a catalog that will provide a critical reference for interpreting observed patterns of DNA sequence variation across duplicated genomic regions, including targets of recent adaptive evolution in humans.
Background: The Azoospermia Factor c (AZFc) region of the human Y chromosome is a unique product of segmental duplication. It consists almost entirely of very long amplicons, represented by different colors, and is frequently deleted in subfertile men. Most of the AZFc amplicons have high sequence similarity with autosomal segments, indicating recent duplication and transposition to the Y chromosome. The Deleted in Azoospermia (DAZ) gene within the red-amplicon arose from an ancestral autosomal DAZ-like (DAZL) gene. It varies significantly between different men with regard to its copy number and the numbers of RNA recognition motifs and DAZ repeats it encodes. We used Southern analyses to study the evolution of DAZ and AZFc amplicons on the Y chromosomes of primates. Results: The Old World monkey rhesus macaque has only one DAZ gene. In contrast, the great apes have multiple copies of DAZ, ranging from 2 copies in bonobos and gorillas to at least 6 copies in orangutans, and these DAZ genes have polymorphic structures similar to those of their human counterparts. Sequences homologous to the various AZFc amplicons are present on the Y chromosomes of some but not all primates, indicating that they arrived on the Y chromosome at different times during primate evolution. Conclusion: The duplication and transposition of AZFc amplicons to the human Y chromosome occurred in three waves, i.e., after the branching of the New World monkey, the gorilla, and the chimpanzee/bonobo lineages, respectively. The red-amplicon, one of the first to arrive on the Y chromosome, amplified by inverted duplication followed by direct duplication after the separation of the Old World monkey and the great ape lineages. Subsequent duplication/deletion in the various lineages gave rise to a spectrum of DAZ gene structure and copy number found in today's great apes.
Chen, H; Liu, W; Roberts, W; Hooker, S; Fedor, H; DeMarzo, A; Isaacs, W; Kittles, R A
Four independent regions within 8q24 near the MYC gene are associated with risk for prostate cancer (Pca). Here, we investigated allelic imbalance (AI) at 8q24 risk variants and MYC gene DNA copy number (CN) in 27 primary Pcas. Heterozygotes were observed in 24 of 27 patients at one or more 8q24 markers and 27% of the loci exhibited AI in tumor DNA. The 8q24 risk alleles were preferentially favored in the tumors. Increased MYC gene CN was observed in 33% of tumors, and the co-existence of increased MYC gene CN with AI at risk loci was observed in 86% (P<0.004 exact binomial test) of the informative tumors. No AI was observed in tumors, which did not reveal increased MYC gene CN. Higher Gleason score was associated with tumors exhibiting AI (P=0.04) and also with increased MYC gene CN (P=0.02). Our results suggest that AI at 8q24 and increased MYC gene CN may both be related to high Gleason score in Pca. Our findings also suggest that these two somatic alterations may be due to the same preferential chromosomal duplication event during prostate tumorigenesis.
Background: Ortholog assignment is a critical and fundamental problem in comparative genomics, since orthologs are considered to be functional counterparts in different species and can be used to infer molecular functions of one species from those of other species. MSOAR is a recently developed high-throughput system for assigning one-to-one orthologs between closely related species on a genome scale. It attempts to reconstruct the evolutionary history of input genomes in terms of genome rearrangement and gene duplication events. It assumes that a gene duplication event inserts a duplicated gene into the genome of interest at a random location (i.e., the random duplication model). However, in practice, biologists believe that genes are often duplicated by tandem duplications, where a duplicated gene is located next to the original copy (i.e., the tandem duplication model). Results: In this paper, we develop MSOAR 2.0, an improved system for one-to-one ortholog assignment. For a pair of input genomes, the system first focuses on the tandemly duplicated genes of each genome and tries to identify among them those that were duplicated after the speciation (i.e., the so-called inparalogs), using a simple phylogenetic tree reconciliation method. For each such set of tandemly duplicated inparalogs, all but one gene will be deleted from the concerned genome (because they cannot possibly appear in any one-to-one ortholog pairs), and MSOAR is invoked. Using both simulated and real data experiments, we show that MSOAR 2.0 is able to achieve a better sensitivity and specificity than MSOAR. In comparison with the well-known genome-scale ortholog assignment tool InParanoid, Ensembl ortholog database, and the orthology information extracted from the well-known whole-genome multiple alignment program MultiZ, MSOAR 2.0 shows the highest sensitivity. Although the specificity of MSOAR 2.0 is slightly worse than that of InParanoid in the real data experiments
Thomas W R Harrop
Widespread use of insecticides has led to insecticide resistance in many populations of insects. In some populations, resistance has evolved to multiple pesticides. In Drosophila melanogaster, resistance to multiple classes of insecticide is due to the overexpression of a single cytochrome P450 gene, Cyp6g1. Overexpression of Cyp6g1 appears to have evolved in parallel in Drosophila simulans, a sibling species of D. melanogaster, where it is also associated with insecticide resistance. However, it is not known whether the ability of the CYP6G1 enzyme to provide resistance to multiple insecticides evolved recently in D. melanogaster or if this function is present in all Drosophila species. Here we show that duplication of the Cyp6g1 gene occurred at least four times during the evolution of different Drosophila species, and the ability of CYP6G1 to confer resistance to multiple insecticides exists in D. melanogaster and D. simulans but not in Drosophila willistoni or Drosophila virilis. In D. virilis, which has multiple copies of Cyp6g1, one copy confers resistance to DDT and another to nitenpyram, suggesting that the divergence of protein sequence between copies subsequent to the duplication affected the activity of the enzyme. All orthologs tested conferred resistance to one or more insecticides, suggesting that CYP6G1 had the capacity to provide resistance to anthropogenic chemicals before they existed. Finally, we show that expression of Cyp6g1 in the Malpighian tubules, which contributes to DDT resistance in D. melanogaster, is specific to the D. melanogaster-D. simulans lineage. Our results suggest that a combination of gene duplication, regulatory changes and protein coding changes has taken place at the Cyp6g1 locus during evolution and this locus may play a role in providing resistance to different environmental toxins in different Drosophila species.
Ito, Masami; Kari, Lila; Kincaid, Zachary; Seki, Shinnosuke
The duplication and repeat-deletion operations are the basis of a formal language theoretic model of errors that can occur during DNA replication. During DNA replication, subsequences of a strand of DNA may be copied several times (resulting in duplications) or skipped (resulting in repeat-deletions). As formal language operations, iterated duplication and repeat-deletion of words and languages have been well studied in the literature. However, little is known about single-step duplications and repeat-deletions. In this paper, we investigate several properties of these operations, including closure properties of language families in the Chomsky hierarchy and equations involving these operations. We also make progress toward a characterization of regular languages that are generated by duplicating a regular language.
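The single-step operations described in the abstract above can be illustrated on individual words. What follows is a minimal sketch, not the authors' formalism: it assumes the usual reading in which a duplication rewrites `u v w` as `u v v w` for a non-empty factor `v`, and a repeat-deletion performs the inverse step. The function names `duplications` and `repeat_deletions` are hypothetical and chosen only for this illustration.

```python
def duplications(word):
    """Return all words obtained by duplicating one non-empty factor of `word`.

    A factor word[i:j] is copied in place: u v w -> u v v w.
    """
    result = set()
    n = len(word)
    for i in range(n):
        for j in range(i + 1, n + 1):
            result.add(word[:j] + word[i:j] + word[j:])
    return result


def repeat_deletions(word):
    """Return all words obtained by deleting one copy of an adjacent repeat.

    Wherever `word` contains a square v v, one copy is removed: u v v w -> u v w.
    """
    result = set()
    n = len(word)
    for i in range(n):
        # k is the length of the candidate repeated factor starting at i.
        for k in range(1, (n - i) // 2 + 1):
            if word[i:i + k] == word[i + k:i + 2 * k]:
                result.add(word[:i + k] + word[i + 2 * k:])
    return result


if __name__ == "__main__":
    print(sorted(duplications("ab")))
    print(sorted(repeat_deletions("abab")))
```

Applying `duplications` to every word of a finite language and taking the union gives the single-step duplication of that language; the closure questions studied in the paper concern what happens when the input language is infinite, which this enumeration does not address.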
Based on whole-genome analysis of Arabidopsis thaliana, there is compelling evidence that angiosperms underwent two whole-genome duplication events early during their evolutionary history. Recent studies have shown that these events were crucial for creation of many important developmental and regulatory genes ...
Gu, Xun; Wang, Yufeng; Gu, Jianying
The classical (two-round) hypothesis of vertebrate genome duplication proposes two successive whole-genome duplication(s) (polyploidizations) predating the origin of fishes, a view now being seriously challenged. As the debate largely concerns the relative merits of the 'big-bang mode' theory (large-scale duplication) and the 'continuous mode' theory (constant creation by small-scale duplications), we tested whether a significant proportion of paralogous genes in the contemporary human genome was indeed generated in the early stage of vertebrate evolution. After an extensive search of major databases, we dated 1,739 gene duplication events from the phylogenetic analysis of 749 vertebrate gene families. We found a pattern characterized by two waves (I, II) and an ancient component. Wave I represents a recent gene family expansion by tandem or segmental duplications, whereas wave II, a rapid paralogous gene increase in the early stage of vertebrate evolution, supports the idea of genome duplication(s) (the big-bang mode). Further analysis indicated that large- and small-scale gene duplications both make a significant contribution during the early stage of vertebrate evolution to build the current hierarchy of the human proteome.
Kordi, Misagh; Bansal, Mukul S
Duplication-Transfer-Loss (DTL) reconciliation has emerged as a powerful technique for studying gene family evolution in the presence of horizontal gene transfer. DTL reconciliation takes as input a gene family phylogeny and the corresponding species phylogeny, and reconciles the two by postulating speciation, gene duplication, horizontal gene transfer, and gene loss events. Efficient algorithms exist for finding optimal DTL reconciliations when the gene tree is binary. However, gene trees are frequently non-binary. With such non-binary gene trees, the reconciliation problem seeks to find a binary resolution of the gene tree that minimizes the reconciliation cost. Given the prevalence of non-binary gene trees, many efficient algorithms have been developed for this problem in the context of the simpler Duplication-Loss (DL) reconciliation model. Yet, no efficient algorithms exist for DTL reconciliation with non-binary gene trees and the complexity of the problem remains unknown. In this work, we resolve this open question by showing that the problem is, in fact, NP-hard. Our reduction applies to both the dated and undated formulations of DTL reconciliation. By resolving this long-standing open problem, this work will spur the development of both exact and heuristic algorithms for this important problem.
Amoutzias, Gregory D; Robertson, David L; Oliver, Stephen G; Bornberg-Bauer, Erich
By combining phylogenetic, proteomic and structural information, we have elucidated the evolutionary driving forces for the gene-regulatory interaction networks of basic helix–loop–helix transcription factors. We infer that recurrent events of single-gene duplication and domain rearrangement repeatedly gave rise to distinct networks with almost identical hub-based topologies, and multiple activators and repressors. We thus provide the first empirical evidence for scale-free protein networks e...
Shao, Mingfu; Moret, Bernard M E
A fundamental problem in comparative genomics is to compute the distance between two genomes in terms of its higher level organization (given by genes or syntenic blocks). For two genomes without duplicate genes, we can easily define (and almost always efficiently compute) a variety of distance measures, but the problem is NP-hard under most models when genomes contain duplicate genes. To tackle duplicate genes, three formulations (exemplar, maximum matching, and any matching) have been proposed, all of which aim to build a matching between homologous genes so as to minimize some distance measure. Of the many distance measures, the breakpoint distance (the number of nonconserved adjacencies) was the first one to be studied and remains of significant interest because of its simplicity and model-free property. The three breakpoint distance problems corresponding to the three formulations have been widely studied. Although we provided last year a solution for the exemplar problem that runs very fast on full genomes, computing optimal solutions for the other two problems has remained challenging. In this article, we describe very fast, exact algorithms for these two problems. Our algorithms rely on a compact integer-linear program that we further simplify by developing an algorithm to remove variables, based on new results on the structure of adjacencies and matchings. Through extensive experiments using both simulations and biological data sets, we show that our algorithms run very fast (in seconds) on mammalian genomes and scale well beyond. We also apply these algorithms (as well as the classic orthology tool MSOAR) to create orthology assignment, then compare their quality in terms of both accuracy and coverage. We find that our algorithm for the "any matching" formulation significantly outperforms other methods in terms of accuracy while achieving nearly maximum coverage.
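For the easy case mentioned in the abstract above (two genomes without duplicate genes), the breakpoint distance can be computed directly by counting nonconserved adjacencies. The sketch below is an illustration under stated assumptions, not the authors' integer-linear-programming method: genomes are modeled as single linear chromosomes given as lists of signed integers, telomere adjacencies are ignored, and the names `adjacencies` and `breakpoint_distance` are hypothetical.

```python
def adjacencies(genome):
    """Return the set of adjacencies of a signed linear genome.

    An adjacency (a, b) read on one strand equals (-b, -a) read on the
    other strand, so each adjacency is stored in a canonical form.
    """
    result = set()
    for a, b in zip(genome, genome[1:]):
        result.add(min((a, b), (-b, -a)))
    return result


def breakpoint_distance(genome1, genome2):
    """Count adjacencies of genome1 that are not conserved in genome2.

    Assumes both genomes contain exactly the same genes, each once
    (no duplicates); raises ValueError otherwise.
    """
    genes1 = sorted(abs(g) for g in genome1)
    genes2 = sorted(abs(g) for g in genome2)
    if genes1 != genes2 or len(set(genes1)) != len(genes1):
        raise ValueError("genomes must share the same duplicate-free gene set")
    return len(adjacencies(genome1) - adjacencies(genome2))


if __name__ == "__main__":
    reference = [1, 2, 3, 4, 5]
    inverted = [1, -3, -2, 4, 5]  # segment (2, 3) inverted
    print(breakpoint_distance(reference, inverted))
```

With duplicate genes this direct count is no longer well defined, because one must first choose which copies correspond to each other; that choice of matching is what the exemplar, maximum matching, and any matching formulations optimize.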
Onsongo, Getiria; Baughn, Linda B; Bower, Matthew; Henzler, Christine; Schomaker, Matthew; Silverstein, Kevin A T; Thyagarajan, Bharat
Simultaneous detection of small copy number variations (CNVs) (<0.5 kb) and single-nucleotide variants in clinically significant genes is of great interest for clinical laboratories. The analytical variability in next-generation sequencing (NGS) and artifacts in coverage data because of issues with mappability along with lack of robust bioinformatics tools for CNV detection have limited the utility of targeted NGS data to identify CNVs. We describe the development and implementation of a bioinformatics algorithm, copy number variation-random forest (CNV-RF), that incorporates a machine learning component to identify CNVs from targeted NGS data. Using CNV-RF, we identified 12 of 13 deletions in samples with known CNVs, two cases with duplications, and identified novel deletions in 22 additional cases. Furthermore, no CNVs were identified among 60 genes in 14 cases with normal copy number and no CNVs were identified in another 104 patients with clinical suspicion of CNVs. All positive deletions and duplications were confirmed using a quantitative PCR method. CNV-RF also detected heterozygous deletions and duplications with a specificity of 50% across 4813 genes. The ability of CNV-RF to detect clinically relevant CNVs with a high degree of sensitivity along with confirmation using a low-cost quantitative PCR method provides a framework for providing comprehensive NGS-based CNV/single-nucleotide variant detection in a clinical molecular diagnostics laboratory. Copyright © 2016 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.
Clarke, Thomas H; Garb, Jessica E; Hayashi, Cheryl Y; Arensburger, Peter; Ayoub, Nadia A
The evolution of specialized tissues with novel functions, such as the silk synthesizing glands in spiders, is likely an influential driver of adaptive success. Large-scale gene duplication events and subsequent paralog divergence are thought to be required for generating evolutionary novelty. Such an event has been proposed for spiders, but not tested. We de novo assembled transcriptomes from three cobweb weaving spider species. Based on phylogenetic analyses of gene families with representatives from each of the three species, we found numerous duplication events indicative of a whole genome or segmental duplication. We estimated the age of the gene duplications relative to several speciation events within spiders and arachnids and found that the duplications likely occurred after the divergence of scorpions (order Scorpionida) and spiders (order Araneae), but before the divergence of the spider suborders Mygalomorphae and Araneomorphae, near the evolutionary origin of spider silk glands. Transcripts that are expressed exclusively or primarily within black widow silk glands are more likely to have a paralog descended from the ancient duplication event and have elevated amino acid replacement rates compared with other transcripts. Thus, an ancient large-scale gene duplication event within the spider lineage was likely an important source of molecular novelty during the evolution of silk gland-specific expression. This duplication event may have provided genetic material for subsequent silk gland diversification in the true spiders (Araneomorphae). © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Kahn, Crystal L.; Mozes, Shay; Raphael, Benjamin J.
Segmental duplications, or low-copy repeats, are common in mammalian genomes. In the human genome, most segmental duplications are mosaics consisting of pieces of multiple other segmental duplications. This complex genomic organization complicates analysis of the evolutionary history of these sequences. Earlier, we introduced a genomic distance, called duplication distance, that computes the most parsimonious way to build a target string by repeatedly copying substrings of a source string. We also showed how to use this distance to describe the formation of segmental duplications according to a two-step model that has been proposed to explain human segmental duplications. Here we describe polynomial-time exact algorithms for several extensions of duplication distance including models that allow certain types of substring deletions and inversions. These extensions will permit more biologically realistic analyses of segmental duplications in genomes.
Background: Gene duplications have been proposed to be the main mechanism involved in genome evolution and in acquisition of new functions. Polydnaviruses (PDVs), symbiotic viruses associated with parasitoid wasps, are ideal model systems to study mechanisms of gene duplications given that PDV genomes consist of virulence genes organized into multigene families. In these systems the viral genome is integrated in a wasp chromosome as a provirus and virus particles containing circular double-stranded DNA are injected into the parasitoids’ hosts and are essential for parasitism success. The viral virulence factors, organized in gene families, are required collectively to induce host immune suppression and developmental arrest. The gene family which encodes protein tyrosine phosphatases (PTPs) has undergone spectacular expansion in several PDV genomes with up to 42 genes. Results: Here, we present strong indications that PTP gene family expansion occurred via classical mechanisms: by duplication of large segments of the chromosomally integrated form of the virus sequences (segmental duplication), by tandem duplications within this form and by dispersed duplications. We also propose a novel duplication mechanism specific to PDVs that involves viral circle reintegration into the wasp genome. The PTP copies produced were shown to undergo conservative evolution along with episodes of adaptive evolution. In particular recently produced copies have undergone positive selection in sites most likely involved in defining substrate selectivity. Conclusion: The results provide evidence about the dynamic nature of polydnavirus proviral genomes. Classical and PDV-specific duplication mechanisms have been involved in the production of new gene copies. Selection pressures associated with antagonistic interactions with parasitized hosts have shaped these genes used to manipulate lepidopteran physiology with evidence for positive selection involved in
Laukaitis, Christina M; Heger, Andreas; Blakley, Tyler D; Munclinger, Pavel; Ponting, Chris P; Karn, Robert C
The draft mouse (Mus musculus) genome sequence revealed an unexpected proliferation of gene duplicates encoding a family of secretoglobin proteins including the androgen-binding protein (ABP) alpha, beta and gamma subunits. Further investigation of 14 alpha-like (Abpa) and 13 beta- or gamma-like (Abpbg) undisrupted gene sequences revealed a rich diversity of developmental stage-, sex- and tissue-specific expression. Despite these studies, our understanding of the evolution of this gene family remains incomplete. Questions arise from imperfections in the initial mouse genome assembly and a dearth of information about the gene family structure in other rodents and mammals. Here, we interrogate the latest 'finished' mouse (Mus musculus) genome sequence assembly to show that the Abp gene repertoire is, in fact, twice as large as reported previously, with 30 Abpa and 34 Abpbg genes and pseudogenes. All of these have arisen since the last common ancestor with rat (Rattus norvegicus). We then demonstrate, by sequencing homologs from species within the Mus genus, that this burst of gene duplication occurred very recently, within the past seven million years. Finally, we survey Abp orthologs in genomes from across the mammalian clade and show that bursts of Abp gene duplications are not specific to the murid rodents; they also occurred recently in the lagomorph (rabbit, Oryctolagus cuniculus) and ruminant (cattle, Bos taurus) lineages, although not in other mammalian taxa. We conclude that Abp genes have undergone repeated bursts of gene duplication and adaptive sequence diversification driven by these genes' participation in chemosensation and/or sexual identification.
Repetitive sequences present a challenge for genome sequence assembly, and highly similar segmental duplications may disappear from assembled genome sequences. Having found a surprising lack of observable phenotypic deviations and non-Mendelian segregation in Arabidopsis thaliana mutants in SEC10, a gene encoding a core subunit of the exocyst tethering complex, we examined whether this could be explained by a hidden gene duplication. Re-sequencing and manual assembly of the Arabidopsis thaliana SEC10 (At5g12370) locus revealed that this locus, comprising a single gene in the reference genome assembly, indeed contains two paralogous genes in tandem, SEC10a and SEC10b, and that a sequence segment of 7 kb in length is missing from the reference genome sequence. Differences between the two paralogs are concentrated in non-coding regions, while the predicted protein sequences exhibit 99% identity, differing only by substitution of five amino acid residues and an indel of four residues. Both SEC10 genes are expressed, although varying transcript levels suggest differential regulation. Homozygous T-DNA insertion mutants in either paralog exhibit a wild-type phenotype, consistent with proposed extensive functional redundancy of the two genes. By these observations we demonstrate that recently duplicated genes may remain hidden even in well-characterized genomes, such as that of A. thaliana. Moreover, we show that the use of the existing A. thaliana reference genome sequence as a guide for sequence assembly of new Arabidopsis accessions or related species has at least in some cases led to error propagation.
The primary objective of this study was to create a genome-wide high resolution map (i.e., >100 bp) of 'rearrangement hotspots' which can facilitate the identification of regions capable of mediating de novo deletions or duplications in humans. A hierarchical method was employed to fragment segmental duplications (SDs) into multiple smaller SD units. Combining an end space free pairwise alignment algorithm with a 'seed and extend' approach, we have exhaustively searched 409 million alignments to detect complex structural rearrangements within the reference-guided assembly of the NA18507 human genome (18× coverage), including the previously identified novel 4.8 Mb sequence from de novo assembly within this genome. We have identified 1,963 rearrangement hotspots within SDs which encompass 166 genes and display an enrichment of duplicated gene nucleotide variants (DNVs). These regions are correlated with increased non-allelic homologous recombination (NAHR) event frequency which presumably represents the origin of copy number variations (CNVs) and pathogenic duplications/deletions. Analysis revealed that 20% of the detected hotspots are clustered within the proximal and distal SD breakpoints flanked by the pathogenic deletions/duplications that have been mapped for 24 NAHR-mediated genomic disorders. FISH validation of selected complex regions revealed 94% concordance with in silico localization of the highly homologous derivatives. Other results from this study indicate that intra-chromosomal recombination is enhanced in genic compared with agenic duplicated regions, and that gene desert regions comprising SDs may represent reservoirs for creation of novel genes. The generation of genome-wide signatures of 'rearrangement hotspots', which likely serve as templates for NAHR, may provide a powerful approach towards understanding the underlying mutational mechanism(s) for development of constitutional and acquired diseases.
Urantowka, Adam Dawid; Hajduk, Kacper; Kosowska, Barbara
Amazona barbadensis is an endangered species of parrot living in northern coastal Venezuela and on several Caribbean islands. In this study, we sequenced the full mitochondrial genome of this species. The mitogenome was 18,983 bp long and contained 13 protein-coding genes, 22 transfer RNA genes, two ribosomal RNA genes, a duplicated control region, and degenerate copies of the ND6 and tRNA (Glu) genes. The high degree of identity between the two copies of the control region suggests their coincident evolution and functionality. Comparative analysis of both control region sequences from four Amazona species revealed 89.1% identity over a region of 1300 bp and indicates the presence of distinctive parts in the two control region copies.
Non-small cell lung cancer (NSCLC) represents a genomically unstable cancer type with extensive copy number aberrations. The relationship of gene copy number alterations and subsequent mRNA levels has only fragmentarily been described. The aim of this study was to conduct a genome-wide analysis of gene copy number gains and corresponding gene expression levels in a clinically well annotated NSCLC patient cohort (n = 190) and their association with survival. While more than half of all analyzed gene copy number-gene expression pairs showed statistically significant correlations (10,296 of 18,756 genes), high correlations, with a correlation coefficient >0.7, were obtained only in a subset of 301 genes (1.6%), including KRAS, EGFR and MDM2. Higher correlation coefficients were associated with higher copy number and expression levels. Strong correlations were frequently based on few tumors with high copy number gains and correspondingly increased mRNA expression. Among the highly correlating genes, GO groups associated with posttranslational protein modifications were particularly frequent, including ubiquitination and neddylation. In a meta-analysis including 1,779 patients we found that survival associated genes were overrepresented among highly correlating genes (61 of the 301 highly correlating genes, FDR adjusted p<0.05). Among them are the chaperone CCT2, the core complex protein NUP107 and the ubiquitination and neddylation associated protein CAND1. In conclusion, in a comprehensive analysis we described a distinct set of highly correlating genes. These genes were found to be overrepresented among survival-associated genes based on gene expression in a large collection of publicly available datasets.
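The per-gene screening step described in this abstract (correlate copy number with expression across tumors, then keep genes with a coefficient above 0.7) can be illustrated with a minimal sketch. It assumes Pearson's coefficient, which the abstract does not specify, and the function names and the toy input layout (gene name mapped to one value per tumor) are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        # A constant profile carries no correlation signal.
        return 0.0
    return sxy / sqrt(sxx * syy)


def highly_correlating(copy_number, expression, threshold=0.7):
    """Return sorted gene names whose copy number and expression
    profiles correlate above the threshold (0.7 as in the abstract).

    Both arguments map a gene name to a list with one value per tumor,
    in the same tumor order (an assumed, illustrative layout).
    """
    return sorted(
        gene
        for gene in copy_number
        if gene in expression
        and pearson(copy_number[gene], expression[gene]) > threshold
    )
```

A real analysis would also need significance testing and multiple-testing correction, which this sketch leaves out.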
Emms, David M; Kelly, Steven
The correct interpretation of any phylogenetic tree is dependent on that tree being correctly rooted. We present STRIDE, a fast, effective, and outgroup-free method for identification of gene duplication events and species tree root inference in large-scale molecular phylogenetic analyses. STRIDE identifies sets of well-supported in-group gene duplication events from a set of unrooted gene trees, and analyses these events to infer a probability distribution over an unrooted species tree for the location of its root. We show that STRIDE correctly identifies the root of the species tree in multiple large-scale molecular phylogenetic data sets spanning a wide range of timescales and taxonomic groups. We demonstrate that the novel probability model implemented in STRIDE can accurately represent the ambiguity in species tree root assignment for data sets where information is limited. Furthermore, application of STRIDE to outgroup-free inference of the origin of the eukaryotic tree resulted in a root probability distribution that provides additional support for leading hypotheses for the origin of the eukaryotes. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Cuypers, Thomas D; Hogeweg, Paulien
Whole genome duplication has shaped eukaryotic evolutionary history and has been associated with drastic environmental change and species radiation. While the most common fate of WGD duplicates is a return to single copy, retained duplicates have been found enriched for highly interacting genes. This pattern has been explained by a neutral process of subfunctionalization and more recently, dosage balance selection. However, much about the relationship between environmental change, WGD and adaptation remains unknown. Here, we study the duplicate retention pattern postWGD, by letting virtual cells adapt to environmental changes. The virtual cells have structured genomes that encode a regulatory network and simple metabolism. Populations are under selection for homeostasis and evolve by point mutations, small indels and WGD. After populations had initially adapted fully to fluctuating resource conditions re-adaptation to a broad range of novel environments was studied by tracking mutations in the line of descent. WGD was established in a minority (≈30%) of lineages, yet, these were significantly more successful at re-adaptation. Unexpectedly, WGD lineages conserved more seemingly redundant genes, yet had higher per gene mutation rates. While WGD duplicates of all functional classes were significantly over-retained compared to a model of neutral losses, duplicate retention was clearly biased towards highly connected TFs. Importantly, no subfunctionalization occurred in conserved pairs, strongly suggesting that dosage balance shaped retention. Meanwhile, singles diverged significantly. WGD, therefore, is a powerful mechanism to cope with environmental change, allowing conservation of a core machinery, while adapting the peripheral network to accommodate change.
Smith, Gilbert; Macias-Muñoz, Aide; Briscoe, Adriana D
Heliconius possess a unique ability among butterflies to feed on pollen. Pollen feeding significantly extends their lifespan, and is thought to have been important to the diversification of the genus. We used RNA sequencing to examine feeding-related gene expression in the mouthparts of four species of Heliconius and one nonpollen feeding species, Eueides isabella. We hypothesized that genes involved in morphology and protein metabolism might be upregulated in Heliconius because they have longer proboscides than Eueides, and because pollen contains more protein than nectar. Using de novo transcriptome assemblies, we tested these hypotheses by comparing gene expression in mouthparts against antennae and legs. We first looked for genes upregulated in mouthparts across all five species and discovered several hundred genes, many of which had functional annotations involving metabolism of proteins (cocoonase), lipids, and carbohydrates. We then looked specifically within Heliconius where we found eleven common upregulated genes with roles in morphology (CPR cuticle proteins), behavior (takeout-like), and metabolism (luciferase-like). Closer examination of these candidates revealed that cocoonase underwent several duplications along the lineage leading to heliconiine butterflies, including two Heliconius-specific duplications. Luciferase-like genes also underwent duplication within lepidopterans, and upregulation in Heliconius mouthparts. Reverse-transcription PCR confirmed that three cocoonases, a peptidase, and one luciferase-like gene are expressed in the proboscis with little to no expression in labial palps and salivary glands. Our results suggest pollen feeding, like other dietary specializations, was likely facilitated by adaptive expansions of preexisting genes, and that the butterfly proboscis is involved in digestive enzyme production. © The Author(s) 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Fadista, João; Bendixen, Christian
Segmental duplications are >1kb segments of duplicated DNA present in a genome with high sequence identity (>90%). They are associated with genomic rearrangements and provide a significant source of gene and genome evolution within mammalian genomes. Although segmental duplications have been...... extensively studied in other organisms, its analysis in pig has been hampered by the lack of a complete pig genome assembly. By measuring the depth of coverage of Illumina whole-genome shotgun sequencing reads of the Tabasco animal aligned to the latest pig genome assembly (Sus scrofa 10 – based also...... and their associated copy number alterations, focusing on the global organization of these segments and their possible functional significance in porcine phenotypes. This work provides insights into mammalian genome evolution and generates a valuable resource for porcine genomics research...
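The depth-of-coverage idea mentioned in this abstract can be sketched roughly as follows: regions present in extra copies attract more aligned reads than unique sequence, so windows with elevated depth are duplication candidates. This is only an illustration, not the authors' pipeline; the function name, the 1 kb window size, the median baseline and the 1.5× ratio threshold are all assumptions made for the example.

```python
from statistics import median


def candidate_duplications(window_depths, window_size=1000, ratio_threshold=1.5):
    """Merge consecutive windows whose read depth is elevated relative
    to the median depth into candidate duplicated intervals.

    window_depths: mean read depth per consecutive, non-overlapping window.
    Returns a list of (start, end) coordinates, end exclusive.
    """
    if not window_depths:
        return []
    baseline = median(window_depths)
    if baseline <= 0:
        raise ValueError("median depth must be positive")

    regions = []
    start = None  # index of the first window in the current elevated run
    for i, depth in enumerate(window_depths):
        elevated = depth / baseline >= ratio_threshold
        if elevated and start is None:
            start = i
        elif not elevated and start is not None:
            regions.append((start * window_size, i * window_size))
            start = None
    if start is not None:
        # The elevated run extends to the end of the sequence.
        regions.append((start * window_size, len(window_depths) * window_size))
    return regions
```

With 1 kb windows, a single elevated window already meets the >1 kb length criterion quoted above; a production method would additionally correct for GC content and mappability.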
ers were developed, and the 1990s, when genome sequencing became ... transposed gene copies have been maintained in the human genome over the past 63 ..... competent artificial chromosome (TAC) libraries as the primary substrates ...
Genomic instability is a hallmark of human cancer, and results in widespread somatic copy number alterations. We used a genome-scale shRNA viability screen in human cancer cell lines to systematically identify genes that are essential in the context of particular copy-number alterations (copy-number associated gene dependencies). The most enriched class of copy-number associated gene dependencies was CYCLOPS (Copy-number alterations Yielding Cancer Liabilities Owing to Partial losS) genes, and spliceosome components were the most prevalent.
Kordi, Misagh; Bansal, Mukul S
Duplication-Transfer-Loss (DTL) reconciliation is a powerful method for studying gene family evolution in the presence of horizontal gene transfer. DTL reconciliation seeks to reconcile gene trees with species trees by postulating speciation, duplication, transfer, and loss events. Efficient algorithms exist for finding optimal DTL reconciliations when the gene tree is binary. In practice, however, gene trees are often non-binary due to uncertainty in the gene tree topologies, and DTL reconciliation with non-binary gene trees is known to be NP-hard. In this paper, we present the first exact algorithms for DTL reconciliation with non-binary gene trees. Specifically, we (i) show that the DTL reconciliation problem for non-binary gene trees is fixed-parameter tractable in the maximum degree of the gene tree, (ii) present an exponential-time, but in-practice efficient, algorithm to track and enumerate all optimal binary resolutions of a non-binary input gene tree, and (iii) apply our algorithms to a large empirical data set of over 4700 gene trees from 100 species to study the impact of gene tree uncertainty on DTL-reconciliation and to demonstrate the applicability and utility of our algorithms. The new techniques and algorithms introduced in this paper will help biologists avoid incorrect evolutionary inferences caused by gene tree uncertainty.
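To make the notion of reconciling a gene tree with a species tree concrete, here is a minimal sketch of the much simpler duplication-only LCA mapping for binary gene trees. It is not the DTL algorithm of this paper: it ignores transfers and losses and does not handle non-binary gene trees. The nested-tuple tree encoding, the convention of labelling each gene leaf with its species name, and the function names are assumptions made for illustration.

```python
def clades(species_tree):
    """Return every clade of a nested-tuple species tree as a frozenset
    of leaf (species) names."""
    found = []

    def walk(node):
        if isinstance(node, str):
            clade = frozenset([node])
        else:
            clade = frozenset().union(*(walk(child) for child in node))
        found.append(clade)
        return clade

    walk(species_tree)
    return found


def lca_clade(species_set, species_clades):
    """Smallest species-tree clade containing all the given species."""
    return min((c for c in species_clades if species_set <= c), key=len)


def count_duplications(gene_tree, species_tree):
    """Count duplication nodes in a binary gene tree under LCA mapping.

    Gene-tree leaves are labelled with the species they come from.
    An internal node is called a duplication when it maps to the same
    species-tree clade as at least one of its children.
    """
    species_clades = clades(species_tree)
    duplications = 0

    def walk(node):
        nonlocal duplications
        if isinstance(node, str):
            return lca_clade(frozenset([node]), species_clades)
        left, right = node  # binary gene trees only
        mapped_left, mapped_right = walk(left), walk(right)
        mapped = lca_clade(mapped_left | mapped_right, species_clades)
        if mapped == mapped_left or mapped == mapped_right:
            duplications += 1
        return mapped

    walk(gene_tree)
    return duplications
```

For the species tree `(("A", "B"), "C")`, the gene tree `((("A", "B"), ("A", "B")), "C")` contains two copies of the A/B subtree, so the node joining them is inferred as a single duplication, while a gene tree identical to the species tree yields none.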
Dong, Shaowei; Adams, Keith L
Polyploidy has occurred throughout plant evolution and can result in considerable changes to gene expression when it takes place and over evolutionary time. Little is known about the effects of abiotic stress conditions on duplicate gene expression patterns in polyploid plants. We examined the expression patterns of 60 duplicated genes in leaves, roots and cotyledons of allotetraploid Gossypium hirsutum in response to five abiotic stress treatments (heat, cold, drought, high salt and water submersion) using single-strand conformation polymorphism assays, and 20 genes in a synthetic allotetraploid. Over 70% of the genes showed stress-induced changes in the relative expression levels of the duplicates under one or more stress treatments with frequent variability among treatments. Twelve pairs showed opposite changes in expression levels in response to different abiotic stress treatments. Stress-induced expression changes occurred in the synthetic allopolyploid, but there was little correspondence in patterns between the natural and synthetic polyploids. Our results indicate that abiotic stress conditions can have considerable effects on duplicate gene expression in a polyploid, with the effects varying by gene, stress and organ type. Differential expression in response to environmental stresses may be a factor in the preservation of some duplicated genes in polyploids. © 2011 The Authors. New Phytologist © 2011 New Phytologist Trust.
van Hooff, Jolien J E; Snel, Berend; Seidl, Michael F
Genomes of the plant-pathogenic genus Phytophthora are characterized by small duplicated blocks consisting of two consecutive genes (2HOM blocks) and by an elevated abundance of similarly aged gene duplicates. Both properties, in particular the presence of 2HOM blocks, have been attributed to a whole-genome duplication (WGD) at the last common ancestor of Phytophthora. However, large intraspecies synteny, compelling evidence for a WGD, has not been detected. Here, we revisited the WGD hypothesis by deducing the age of 2HOM blocks. Two independent timing methods reveal that the majority of 2HOM blocks arose after divergence of the Phytophthora lineages. In addition, a large proportion of the 2HOM block copies colocalize on the same scaffold. Therefore, the presence of 2HOM blocks does not support a WGD at the last common ancestor of Phytophthora. Thus, genome evolution of Phytophthora is likely driven by alternative mechanisms, such as bursts of transposon activity.
The basic leucine zipper (bZIP) transcription factors are the most diverse members of dimerizing transcription factors. In the present study, 50, 116, and 47 bZIP genes were identified in Malus domestica (apple), Prunus persica (peach), and Fragaria vesca (strawberry), respectively. Species-specific duplication was the main contributor to the large number of bZIPs observed in apple. After the WGD in the apple genome, bZIP genes orthologous to those of strawberry were retained on the duplicated regions of the apple genome. However, in the peach ancestor, these syntenic regions were quickly lost or deleted. Positive selection may have contributed to the expansion of clade S as an adaptation to developmental and environmental stresses. In addition, purifying selection was mainly responsible for bZIP sequence-specific DNA binding. The analysis of orthologous pairs between chromosomes indicates that these orthologs derived from one gene duplication located on one of the nine ancient chromosomes in the Rosaceae. The comparative analysis of bZIP genes in three species provides information on the evolutionary fate of bZIP genes in apple and peach after they diverged from strawberry.
Li, Qi; Zhang, Ning; Zhang, Liangsheng; Ma, Hong
Rhomboid proteins are intramembrane serine proteases that are involved in a plethora of biological functions, but the evolutionary history of the rhomboid gene family is not clear. We performed a comprehensive molecular evolutionary analysis of the rhomboid gene family and also investigated the organization and sequence features of plant rhomboids in different subfamilies. Our results showed that eukaryotic rhomboids could be divided into five subfamilies (RhoA-RhoD and PARL). Most orthology groups appeared to be conserved only as single or low-copy genes in all lineages in RhoB-RhoD and PARL, whereas RhoA genes underwent several duplication events, resulting in multiple gene copies. These duplication events were due to whole genome duplications in plants and animals and the duplicates might have experienced functional divergence. We also identified a novel group of plant rhomboid (RhoB1) that might have lost their enzymatic activity; their existence suggests that they might have evolved new mechanisms. Plant and animal rhomboids have similar evolutionary patterns. In addition, there are mutations affecting key active sites in RBL8, RBL9 and one of the Brassicaceae PARL duplicates. This study delineates a possible evolutionary scheme for intramembrane proteins and illustrates distinct fates and a mechanism of evolution of gene duplicates. © 2014 The Authors. New Phytologist © 2014 New Phytologist Trust.
Thomas B Duguet
Helminth parasites rely on fast-synaptic transmission in their neuromusculature to experience the outside world and respond to it. Acetylcholine plays a pivotal role in this and its receptors are targeted by a wide variety of both natural and synthetic compounds used in human health and for the control of parasitic disease. The model Caenorhabditis elegans is characterized by a large number of acetylcholine receptor subunit genes, a feature shared across the nematodes. This dynamic family is characterized by both gene duplication and loss between species. The pentameric levamisole-sensitive acetylcholine receptor, comprised of five different subunits, has been characterized from C. elegans. More recently, cognate receptors have been reconstituted from multiple parasitic nematodes that are found to vary in subunit composition. In order to understand the implications of receptor composition change and the origins of potentially novel drug targets, we investigated a specific example of subunit duplication based on analysis of genome data for 25 species from the 50 helminth genome initiative. We found multiple independent duplications of the unc-29 acetylcholine receptor subunit, where codon substitution rate analysis identified positive, directional selection acting on amino acid positions associated with subunit assembly. Characterization of four gene copies from a model parasitic nematode, Haemonchus contortus, demonstrated that each copy has acquired unique functional characteristics based on phenotype rescue of transgenic C. elegans and electrophysiology of receptors reconstituted in Xenopus oocytes. We found evidence that a specific incompatibility has evolved for two subunits co-expressed in muscle. We demonstrated that functional divergence of acetylcholine receptors, driven by directional selection, can occur more rapidly than previously thought and may be mediated by alteration of receptor assembly. This phenomenon is common among the
Preston, Jill C; Jorgensen, Stacy A; Jha, Suryatapa G
Flowering time is strictly controlled by a combination of internal and external signals that match seed set with favorable environmental conditions. In the model plant species Arabidopsis thaliana (Brassicaceae), many of the genes underlying development and evolution of flowering have been discovered. However, much remains unknown about how conserved the flowering gene networks are in plants with different growth habits, gene duplication histories, and distributions. Here we functionally characterize three homologs of the flowering gene Suppressor Of Overexpression of Constans 1 (SOC1) in the short-lived perennial Petunia hybrida (petunia, Solanaceae). Similar to A. thaliana soc1 mutants, co-silencing of duplicated petunia SOC1-like genes results in late flowering. This phenotype is most severe when all three SOC1-like genes are silenced. Furthermore, expression levels of the SOC1-like genes Unshaven (UNS) and Floral Binding Protein 21 (FBP21), but not FBP28, are positively correlated with developmental age. In contrast to A. thaliana, petunia SOC1-like gene expression did not increase with longer photoperiods, and FBP28 transcripts were actually more abundant under short days. Despite evidence of functional redundancy, differential spatio-temporal expression data suggest that SOC1-like genes might fine-tune petunia flowering in response to photoperiod and developmental stage. This likely resulted from modification of SOC1-like gene regulatory elements following recent duplication, and is a possible mechanism to ensure flowering under both inductive and non-inductive photoperiods.
Brady, Seán G; Litman, Jessica R; Danforth, Bryan N
The placement of the root node in a phylogeny is fundamental to characterizing evolutionary relationships. The root node of bee phylogeny remains unclear despite considerable previous attention. In order to test alternative hypotheses for the location of the root node in bees, we used the F1 and F2 paralogs of elongation factor 1-alpha (EF-1α) to compare the tree topologies that result when using outgroup versus paralogous rooting. Fifty-two taxa representing each of the seven bee families were sequenced for both copies of EF-1α. Two datasets were analyzed. In the first (the "concatenated" dataset), the F1 and F2 copies for each species were concatenated and the tree was rooted using appropriate outgroups (sphecid and crabronid wasps). In the second dataset (the "duplicated" dataset), the F1 and F2 copies were aligned to one another and each copy for each taxon was treated as a separate terminal. In this dataset, the root was placed between the F1 and F2 copies (i.e., paralog rooting). Bayesian analyses demonstrate that the outgroup rooting approach outperforms paralog rooting, recovering deeper clades and showing stronger support for groups well established by both morphological and other molecular data. Sequence characteristics of the two copies were compared at the amino acid level, but little evidence was found to suggest that one copy is more functionally conserved. Although neither approach yields an unambiguous root to the tree, both approaches strongly indicate that the root of bee phylogeny does not fall near Colletidae, as has been previously proposed. We discuss paralog rooting as a general strategy and why this approach performs relatively poorly with our particular dataset. Copyright © 2011 Elsevier Inc. All rights reserved.
Haarsma, Loren; Nelesen, Serita; VanAndel, Ethan; Lamine, James; VandeHaar, Peter
We present a model of the evolution of protein complexes with novel functions through gene duplication, mutation, and co-option. Under a wide variety of input parameters, digital organisms evolve complexes of 2-5 bound proteins which have novel functions but whose component proteins are not independently functional. Evolution of complexes with novel functions happens more quickly as gene duplication rates increase, point mutation rates increase, protein complex functional probability increases, protein complex functional strength increases, and protein family size decreases. Evolution of complexity is inhibited when the metabolic costs of making proteins exceed the fitness gain of having functional proteins, or when point mutation rates become so large that functional proteins undergo deleterious mutations faster than new functional complexes can evolve. Copyright © 2016 Elsevier Ltd. All rights reserved.
Popova, Olga V; Mikhailov, Kirill V; Nikitin, Mikhail A; Logacheva, Maria D; Penin, Aleksey A; Muntyan, Maria S; Kedrova, Olga S; Petrov, Nikolai B; Panchin, Yuri V; Aleoshin, Vladimir V
Many features of mitochondrial genomes of animals, such as patterns of gene arrangement, nucleotide content and substitution rate variation are extensively used in evolutionary and phylogenetic studies. Nearly 6,000 mitochondrial genomes of animals have already been sequenced, covering the majority of animal phyla. One of the groups that escaped mitogenome sequencing is phylum Kinorhyncha, an isolated taxon of microscopic worm-like ecdysozoans. The kinorhynchs are thought to be one of the early-branching lineages of Ecdysozoa, and their mitochondrial genomes may be important for resolving evolutionary relations between major animal taxa. Here we present the results of sequencing and analysis of mitochondrial genomes from two members of Kinorhyncha, Echinoderes svetlanae (Cyclorhagida) and Pycnophyes kielensis (Allomalorhagida). Their mitochondrial genomes are circular molecules approximately 15 Kbp in size. The kinorhynch mitochondrial gene sequences are highly divergent, which precludes accurate phylogenetic inference. The mitogenomes of both species encode a typical metazoan complement of 37 genes, which are all positioned on the major strand, but the gene order is distinct and unique among Ecdysozoa or animals as a whole. We predict four types of start codons for protein-coding genes in E. svetlanae and five in P. kielensis with a consensus DTD in single letter code. The mitochondrial genomes of E. svetlanae and P. kielensis encode duplicated methionine tRNA genes that display compensatory nucleotide substitutions. Two distant species of Kinorhyncha demonstrate similar patterns of gene arrangements in their mitogenomes. Both genomes have duplicated methionine tRNA genes; the duplication predates the divergence of the two species. The kinorhynchs share a few features pertaining to gene order that align them with Priapulida. Gene order analysis reveals that the gene arrangement specific to Priapulida may be ancestral for Scalidophora, Ecdysozoa, and even Protostomia.
Naseer, Muhammad I; Chaudhary, Adeel G; Rasool, Mahmood; Kalamegam, Gauthaman; Ashgan, Fai T; Assidi, Mourad; Ahmed, Farid; Ansari, Shakeel A; Zaidi, Syed Kashif; Jan, Mohammed M; Al-Qahtani, Mohammad H
Epilepsy is a genetically complex but common brain disorder affecting millions of people worldwide across almost all age groups. Novel copy number variations (CNVs) are considered an important cause of numerous neurodevelopmental disorders, including intellectual disability and epilepsy. DNA array-based studies help to explain a more severe clinical presentation of the disease, but interpretation of many detected CNVs is still challenging. To study novel CNVs involving epilepsy-related genes in a Saudi family with six affected and two normal individuals presenting several forms of epileptic seizures, intellectual disability (ID), and minor dysmorphism, we performed high-density whole-genome array-CGH analysis using Agilent SurePrint G3 Hmn CGH 2x400K chips. Using Agilent CytoGenomics 22.214.171.124 software, our results showed de novo deletions, duplications, and deletion plus duplication on different chromosomal regions in the affected individuals that were not seen in the normal father and normal children. Copy number gains were observed on chromosomes 1, 16 and 22, involving the LCE3C, HPR, GSTT2, GSTTP2, DDT and DDTL genes, respectively, whereas deletions were observed in the chromosomal region 8p23-p21 (4303127-4337759), where the potential gene is CSMD1 (OMIM: 612279). Moreover, the deletions and duplications detected by array CGH were validated by simple PCR, using primers designed for the deleted regions from the flanking SNPs, and by quantitative real-time PCR. We found some de novo deletions and duplications in this Saudi family with intellectual disability and epilepsy. Our results suggest that array-CGH should be used as a first-line genetic test for epilepsy unless there is a strong indication for a monogenic syndrome. The advanced high-throughput array-CGH technique used in this study aims to build a database and to identify new mechanisms underlying epileptic disorders, and may help to improve the clinical
Bratlie, Marit S; Johansen, Jostein; Sherman, Brad T; Huang, Da Wei; Lempicki, Richard A; Drabløs, Finn
Gene duplication is a normal evolutionary process. If there is no selective advantage in keeping the duplicated gene, it is usually reduced to a pseudogene and disappears from the genome. However, some paralogs are retained. These gene products are likely to be beneficial to the organism, e.g. in adaptation to new environmental conditions. The aim of our analysis is to investigate the properties of paralog-forming genes in prokaryotes, and to analyse the role of these retained paralogs by relating gene properties to life style of the corresponding prokaryotes. Paralogs were identified in a number of prokaryotes, and these paralogs were compared to singletons of persistent orthologs based on functional classification. This showed that the paralogs were associated with for example energy production, cell motility, ion transport, and defence mechanisms. A statistical overrepresentation analysis of gene and protein annotations was based on paralogs of the 200 prokaryotes with the highest fraction of paralog-forming genes. Biclustering of overrepresented gene ontology terms versus species was used to identify clusters of properties associated with clusters of species. The clusters were classified using similarity scores on properties and species to identify interesting clusters, and a subset of clusters were analysed by comparison to literature data. This analysis showed that paralogs often are associated with properties that are important for survival and proliferation of the specific organisms. This includes processes like ion transport, locomotion, chemotaxis and photosynthesis. However, the analysis also showed that the gene ontology terms sometimes were too general, imprecise or even misleading for automatic analysis. Properties described by gene ontology terms identified in the overrepresentation analysis are often consistent with individual prokaryote lifestyles and are likely to give a competitive advantage to the organism. Paralogs and singletons dominate
Elia, Josephine; Glessner, Joseph T; Wang, Kai; Takahashi, Nagahide; Shtir, Corina J; Hadley, Dexter; Sleiman, Patrick M A; Zhang, Haitao; Kim, Cecilia E; Robison, Reid; Lyon, Gholson J; Flory, James H; Bradfield, Jonathan P; Imielinski, Marcin; Hou, Cuiping; Frackelton, Edward C; Chiavacci, Rosetta M; Sakurai, Takeshi; Rabin, Cara; Middleton, Frank A; Thomas, Kelly A; Garris, Maria; Mentch, Frank; Freitag, Christine M; Steinhausen, Hans-Christoph; Todorov, Alexandre A; Reif, Andreas; Rothenberger, Aribert; Franke, Barbara; Mick, Eric O; Roeyers, Herbert; Buitelaar, Jan; Lesch, Klaus-Peter; Banaschewski, Tobias; Ebstein, Richard P; Mulas, Fernando; Oades, Robert D; Sergeant, Joseph; Sonuga-Barke, Edmund; Renner, Tobias J; Romanos, Marcel; Romanos, Jasmin; Warnke, Andreas; Walitza, Susanne; Meyer, Jobst; Pálmason, Haukur; Seitz, Christiane; Loo, Sandra K; Smalley, Susan L; Biederman, Joseph; Kent, Lindsey; Asherson, Philip; Anney, Richard J L; Gaynor, J William; Shaw, Philip; Devoto, Marcella; White, Peter S; Grant, Struan F A; Buxbaum, Joseph D; Rapoport, Judith L; Williams, Nigel M; Nelson, Stanley F; Faraone, Stephen V; Hakonarson, Hakon
Attention deficit hyperactivity disorder (ADHD) is a common, heritable neuropsychiatric disorder of unknown etiology. We performed a whole-genome copy number variation (CNV) study on 1,013 cases with ADHD and 4,105 healthy children of European ancestry using 550,000 SNPs. We evaluated statistically significant findings in multiple independent cohorts, with a total of 2,493 cases with ADHD and 9,222 controls of European ancestry, using matched platforms. CNVs affecting metabotropic glutamate receptor genes were enriched across all cohorts (P = 2.1 × 10^−9). We saw GRM5 (encoding glutamate receptor, metabotropic 5) deletions in ten cases and one control (P = 1.36 × 10^−6). We saw GRM7 deletions in six cases, and we saw GRM8 deletions in eight cases and no controls. GRM1 was duplicated in eight cases. We experimentally validated the observed variants using quantitative RT-PCR. A gene network analysis showed that genes interacting with the genes in the GRM family are enriched for CNVs in ~10% of the cases (P = 4.38 × 10^−10) after correction for occurrence in the controls. We identified rare recurrent CNVs affecting glutamatergic neurotransmission genes that were overrepresented in multiple ADHD cohorts. PMID:22138692
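The case and control counts reported above can be illustrated with a small worked example. The following is a minimal sketch of a one-sided Fisher's exact test on a 2×2 table in pure Python. The choice of test, the function name `fisher_one_sided`, and the cohort totals used in the usage note are assumptions made for illustration; the abstract does not state which test produced its P values, so the output should not be expected to reproduce them.

```python
from math import comb

def fisher_one_sided(case_hits, case_total, ctrl_hits, ctrl_total):
    """Probability of seeing at least `case_hits` carriers among cases
    under the hypergeometric null of no case/control association."""
    hits = case_hits + ctrl_hits          # all CNV carriers
    total = case_total + ctrl_total       # all individuals
    denom = comb(total, case_total)       # ways to pick the case group
    p = 0.0
    for k in range(case_hits, min(hits, case_total) + 1):
        # k carriers fall among the cases, the rest among the controls
        p += comb(hits, k) * comb(total - hits, case_total - k) / denom
    return p

# Toy table [[3, 1], [1, 3]]: 3 of 4 cases and 1 of 4 controls carry the CNV.
toy_p = fisher_one_sided(3, 4, 1, 4)
```

As a usage example, `fisher_one_sided(10, 1013 + 2493, 1, 4105 + 9222)` would test ten carriers among cases against one among controls, under the assumption (not confirmed by the abstract) that the counts refer to the pooled discovery and replication cohorts.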
Yi, Guoqiang; Qu, Lujiang; Liu, Jianfeng; Yan, Yiyuan; Xu, Guiyun; Yang, Ning
Copy number variation (CNV) is important and widespread in the genome, and is a major cause of disease and phenotypic diversity. Herein, we performed a genome-wide CNV analysis in 12 diversified chicken genomes based on whole genome sequencing. A total of 8,840 CNV regions (CNVRs) covering 98.2 Mb and representing 9.4% of the chicken genome were identified, ranging in size from 1.1 to 268.8 kb with an average of 11.1 kb. Sequencing-based predictions were confirmed at a high validation rate by two independent approaches, including array comparative genomic hybridization (aCGH) and quantitative PCR (qPCR). The Pearson's correlation coefficients between sequencing and aCGH results ranged from 0.435 to 0.755, and qPCR experiments revealed a positive validation rate of 91.71% and a false negative rate of 22.43%. In total, 2,214 (25.0%) predicted CNVRs span 2,216 (36.4%) RefSeq genes associated with specific biological functions. Besides two previously reported copy number variable genes EDN3 and PRLR, we also found some promising genes with potential in phenotypic variation. Two genes, FZD6 and LIMS1, related to disease susceptibility/resistance are covered by CNVRs. The highly duplicated SOCS2 may lead to higher bone mineral density. Entire or partial duplication of some genes like POPDC3 may have great economic importance in poultry breeding. Our results based on extensive genetic diversity provide a more refined chicken CNV map and genome-wide gene copy number estimates, and warrant future CNV association studies for important traits in chickens.
Title 10 (Energy), Section 7.21, Nuclear Regulatory Commission, Advisory Committees (2010-01-01): § 7.21 Cost of duplication of documents. Copies of the records, reports, transcripts, minutes, appendices, working papers, drafts, studies, agenda, or other...
Abstract Background Various evolutionary models have been proposed to interpret the fate of paralogous duplicates, which provide substrates on which evolutionary selection could act. In particular, domestication, as a special form of selection, has played an important role in crop cultivation, with divergence of many genes controlling important agronomic traits. Recent studies have indicated that a pair of duplicate genes was often sub-functionalized from the ancestral functions held by the parental genes. We previously demonstrated that the rice cell-wall invertase (CWI) gene GIF1, which plays an important role in the grain-filling process, was most likely subjected to domestication selection in the promoter region. Here, we report that GIF1 and another CWI gene, OsCIN1, constitute a pair of duplicate genes with differentiated expression and function through independent selection. Results Through synteny analysis, we show that GIF1 and another cell-wall invertase gene, OsCIN1, were paralogues derived from a segmental duplication that originated during genome duplication of grasses. Results based on analyses of population genetics and gene phylogenetic tree of 25 cultivars and 25 wild rice sequences demonstrated that OsCIN1 was also artificially selected during rice domestication with a fixed mutation in the coding region, in contrast to GIF1, which was selected in the promoter region. GIF1 and OsCIN1 have evolved different expression patterns and probably different kinetic parameters of enzymatic activity, with the latter displaying less enzymatic activity. Overexpression of GIF1 and OsCIN1 also resulted in different phenotypes, suggesting that OsCIN1 might regulate other unrecognized biological processes. Conclusion How gene duplication and divergence contribute to genetic novelty and morphological adaptation has been an interesting issue to geneticists and biologists. Our discovery that the duplicated pair of GIF1 and OsCIN1 has experienced sub
Karn, Robert C; Laukaitis, Christina M
In the present article, we summarize two aspects of our work on mouse ABP (androgen-binding protein): (i) the sexual selection function producing incipient reinforcement on the European house mouse hybrid zone, and (ii) the mechanism behind the dramatic expansion of the Abp gene region in the mouse genome. Selection unifies these two components, although the ways in which selection has acted differ. At the functional level, strong positive selection has acted on key sites on the surface of one face of the ABP dimer, possibly to influence binding to a receptor. A different kind of selection has apparently driven the recent and rapid expansion of the gene region, probably by increasing the amount of Abp transcript, in one or both of two ways. We have shown previously that groups of Abp genes behave as LCRs (low-copy repeats), duplicating as relatively large blocks of genes by NAHR (non-allelic homologous recombination). The second type of selection involves the close link between the accumulation of L1 elements and the expansion of the Abp gene family by NAHR. It is probably predicated on an initial selection for increased transcription of existing Abp genes and/or an increase in Abp gene number providing more transcriptional sites. Either or both could increase initial transcript production, a quantitative change similar to increasing the volume of a radio transmission. In closing, we also provide a note on Abp gene nomenclature.
The Dof domain protein family is a classic plant-specific zinc-finger transcription factor family involved in a variety of biological processes. There is great diversity in the number of Dof genes in different plants. However, there are only very limited reports on the characterization of Dof transcription factors in soybean (Glycine max). In the present study, 78 putative Dof genes were identified from the whole-genome sequence of soybean. The predicted GmDof genes were non-randomly distributed within and across 19 out of 20 chromosomes, and 97.4% (38 pairs) were preferentially retained duplicate paralogous genes located in duplicated regions of the genome. Soybean-specific segmental duplications contributed significantly to the expansion of the soybean Dof gene family. These Dof proteins were phylogenetically clustered into nine distinct subgroups among which the gene structure and motif compositions were considerably conserved. Comparative phylogenetic analysis of these Dof proteins revealed four major groups, similar to those reported for Arabidopsis and rice. Most of the GmDofs showed specific expression patterns based on RNA-seq data analyses. The expression patterns of some duplicate genes were partially redundant while others showed functional diversity, suggesting the occurrence of sub-functionalization during subsequent evolution. Comprehensive expression profile analysis also provided insights into the soybean-specific functional divergence among members of the Dof gene family. Cis-regulatory element analysis of these GmDof genes suggested diverse functions associated with different processes. Taken together, our results provide useful information for the functional characterization of soybean Dof genes by combining phylogenetic analysis with global gene-expression profiling.
Solomon, N.M.; Ross, S.; Morgan, T.; Belsky, J.L.; Hol, F.A.; Karnes, P.; Hopwood, N.J.; Myers, S.E.; Tan, A.; Warne, G.L.; Forrest, S.M.; Thomas, P.Q.
INTRODUCTION: Array comparative genomic hybridisation (array CGH) is a powerful method that detects alteration of gene copy number with greater resolution and efficiency than traditional methods. However, its ability to detect disease causing duplications in constitutional genomic DNA has not been
Yockteng, Roxana; Almeida, Ana M R; Morioka, Kelsie; Alvarez-Buylla, Elena R; Specht, Chelsea D
The diversity of floral forms in the plant order Zingiberales has evolved through alterations in floral organ morphology. One striking alteration is the shift from fertile, filamentous stamens to sterile, laminar (petaloid) organs in the stamen whorls, attributed to specific pollination syndromes. Here, we examine the role of the SEPALLATA (SEP) genes, known to be important in regulatory networks underlying floral development and organ identity, in the evolution of development of the diverse floral organs phenotypes in the Zingiberales. Phylogenetic analyses show that the SEP-like genes have undergone several duplication events giving rise to multiple copies. Selection tests on the SEP-like genes indicate that the two copies of SEP3 have mostly evolved under balancing selection, probably due to strong functional restrictions as a result of their critical role in floral organ specification. In contrast, the two LOFSEP copies have undergone differential positive selection, indicating neofunctionalization. Reverse transcriptase-polymerase chain reaction, gene expression from RNA-seq data, and in situ hybridization analyses show that the recovered genes have differential expression patterns across the various whorls and organ types found in the Zingiberales. Our data also suggest that AGL6, sister to the SEP-like genes, may play an important role in stamen morphology in the Zingiberales. Thus, the SEP-like genes are likely to be involved in some of the unique morphogenetic patterns of floral organ development found among this diverse order of tropical monocots. This work contributes to a growing body of knowledge focused on understanding the role of gene duplications and the evolution of entire gene networks in the evolution of flower development.
Harding, Tommy; Roger, Andrew J.; Simpson, Alastair G. B.
The capacity of halophiles to thrive in extreme hypersaline habitats derives partly from the tight regulation of ion homeostasis, the salt-dependent adjustment of plasma membrane fluidity, and the increased capability to manage oxidative stress. Halophilic bacteria, and archaea have been intensively studied, and substantial research has been conducted on halophilic fungi, and the green alga Dunaliella. By contrast, there have been very few investigations of halophiles that are phagotrophic protists, i.e., protozoa. To gather fundamental knowledge about salt adaptation in these organisms, we studied the transcriptome-level response of Halocafeteria seosinensis (Stramenopiles) grown under contrasting salinities. We provided further evolutionary context to our analysis by identifying genes that underwent recent duplications. Genes that were highly responsive to salinity variations were involved in stress response (e.g., chaperones), ion homeostasis (e.g., Na+/H+ transporter), metabolism and transport of lipids (e.g., sterol biosynthetic genes), carbohydrate metabolism (e.g., glycosidases), and signal transduction pathways (e.g., transcription factors). A significantly high proportion (43%) of duplicated genes were also differentially expressed, accentuating the importance of gene expansion in adaptation by H. seosinensis to high salt environments. Furthermore, we found two genes that were lateral acquisitions from bacteria, and were also highly up-regulated and highly expressed at high salt, suggesting that this evolutionary mechanism could also have facilitated adaptation to high salt. We propose that a transition toward high-salt adaptation in the ancestors of H. seosinensis required the acquisition of new genes via duplication, and some lateral gene transfers (LGTs), as well as the alteration of transcriptional programs, leading to increased stress resistance, proper establishment of ion gradients, and modification of cell structure properties like membrane
Transgene copy number has a great impact on the expression level and stability of exogenous genes in transgenic plants. Proper selection of endogenous reference genes is necessary for detection of genetic components in genetically modified (GM) crops by quantitative real-time PCR (qPCR) or by qualitative PCR, especially in sugarcane with its polyploid and aneuploid genomic structure. The qPCR technique has been widely accepted as an accurate, time-saving method for determining copy numbers in transgenic plants and for detecting genetically modified plants to meet regulatory and legislative requirements. In this study, to find a suitable endogenous reference gene and its real-time PCR assay for sugarcane (Saccharum spp. hybrids) DNA content quantification, we evaluated a set of potential “single copy” genes including P4H, APRT, ENOL, CYC, TST and PRR, through qualitative PCR and absolute quantitative PCR. Based on copy number comparisons among different sugarcane genotypes, including five S. officinarum, one S. spontaneum and two S. spp. hybrids, these endogenous genes fell into three groups: a high copy number group (ENOL-3), a medium copy number group (TST-1 and PRR-1), and a low copy number group (P4H-1, APRT-2 and CYC-2). Among these tested genes, P4H, APRT and CYC were the most stable, while ENOL and TST were the least stable across different sugarcane genotypes. Therefore, three primer pairs of P4H-3, APRT-2 and CYC-2 were then selected as the suitable reference gene primer pairs for sugarcane. The test of multi-target reference genes revealed that the APRT gene was a specific amplicon, suggesting this gene is the most suitable to be used as an endogenous reference target for sugarcane DNA content quantification. These results should be helpful for establishing accurate and reliable qualitative and quantitative PCR analysis of GM sugarcane.
Khan, Fayeza F; Carpenter, Danielle; Mitchell, Laura; Mansouri, Omniah; Black, Holly A; Tyson, Jess; Armour, John A L
Multi-allelic copy number variants include examples of extensive variation between individuals in the copy number of important genes, most notably genes involved in immune function. The definition of this variation, and analysis of its impact on function, has been hampered by the technical difficulty of large-scale but accurate typing of genomic copy number. The copy-variable alpha-defensin locus DEFA1A3 on human chromosome 8 commonly varies between 4 and 10 copies per diploid genome, and presents considerable challenges for accurate high-throughput typing. In this study, we developed two paralogue ratio tests and three allelic ratio measurements that, in combination, provide an accurate and scalable method for measurement of DEFA1A3 gene number. We combined information from different measurements in a maximum-likelihood framework which suggests that most samples can be assigned to an integer copy number with high confidence, and applied it to typing 589 unrelated European DNA samples. Typing the members of three-generation pedigrees provided further reassurance that correct integer copy numbers had been assigned. Our results have allowed us to discover that the SNP rs4300027 is strongly associated with DEFA1A3 gene copy number in European samples. We have developed an accurate and robust method for measurement of DEFA1A3 copy number. Interrogation of rs4300027 and associated SNPs in Genome-Wide Association Study SNP data provides no evidence that alpha-defensin copy number is a strong risk factor for phenotypes such as Crohn's disease, type I diabetes, HIV progression and multiple sclerosis.
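The idea of combining several measurements in a maximum-likelihood framework to assign an integer copy number can be sketched as follows. This is a simplified illustration, not the authors' model: it assumes each assay returns an independent estimate of copy number with Gaussian error of known standard deviation, and the function name `assign_copy_number`, the candidate range, and all numeric values are hypothetical.

```python
import math

def assign_copy_number(measurements, sds, candidates=range(2, 13)):
    """Return (most likely integer copy number, confidence).

    measurements: copy-number estimates from independent assays
    sds: assumed standard deviation of each assay (hypothetical values)
    candidates: integer copy numbers considered (an assumed range)
    """
    log_liks = []
    for n in candidates:
        ll = 0.0
        for x, sd in zip(measurements, sds):
            # Gaussian log-likelihood of observing x if the true copy number is n
            ll += -0.5 * ((x - n) / sd) ** 2 - math.log(sd * math.sqrt(2 * math.pi))
        log_liks.append(ll)
    # Normalise the likelihoods (flat prior over candidates) to get a confidence.
    peak = max(log_liks)
    weights = [math.exp(ll - peak) for ll in log_liks]
    best_index = weights.index(max(weights))
    confidence = weights[best_index] / sum(weights)
    return list(candidates)[best_index], confidence

# Three hypothetical assay readings clustered around seven copies.
copy_number, confidence = assign_copy_number([6.8, 7.3, 7.1], [0.4, 0.5, 0.3])
```

Samples whose confidence falls below a chosen threshold could be flagged for retyping rather than forced to an integer; that threshold is a design choice not specified in the abstract.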
Marandel, Lucie; Panserat, Stéphane; Plagnes-Juan, Elisabeth; Arbenoits, Eva; Soengas, José Luis; Bobe, Julien
Glucose-6-phosphatase (G6pc) is a key enzyme involved in the regulation of glucose homeostasis. The present study aims at revisiting and clarifying the evolutionary history of g6pc genes in vertebrates. g6pc duplications happened by successive rounds of whole genome duplication that occurred during vertebrate evolution. g6pc duplicated before or around the Osteichthyes/Chondrichthyes radiation, giving rise to g6pca and g6pcb as a consequence of the second vertebrate whole genome duplication. g6pca was lost after this duplication in Sarcopterygii, whereas both g6pca and g6pcb then duplicated as a consequence of the teleost-specific whole genome duplication. One g6pca duplicate was lost after this duplication in teleosts. Similarly, one g6pcb2 duplicate was lost at least in the ancestor of Percomorpha. The analysis of the evolution of spatial expression patterns of g6pc genes in vertebrates showed that all g6pc were mainly expressed in intestine and liver, whereas teleost-specific g6pcb2 genes were mainly and surprisingly expressed in brain and heart. g6pcb2b, one gene previously hypothesised to be involved in the glucose intolerant phenotype in trout, was unexpectedly up-regulated (as it was in liver) by carbohydrates in trout telencephalon without showing significant changes in other brain regions. This up-regulation is in striking contrast with expected glucosensing mechanisms, suggesting that its positive response to glucose relates to specific unknown processes in this brain area. Our results suggest that the fixation and the divergence of g6pc duplicated genes during vertebrate evolution may lead to adaptive novelty and probably to the emergence of novel phenotypes related to glucose homeostasis.
Jabbour, Florian; Cossard, Guillaume; Le Guilloux, Martine; Sannier, Julie; Nadot, Sophie; Damerval, Catherine
Floral bilateral symmetry (zygomorphy) has evolved several times independently in angiosperms from radially symmetrical (actinomorphic) ancestral states. Homologs of the Antirrhinum majus Cycloidea gene (Cyc) have been shown to control floral symmetry in diverse groups in core eudicots. In the basal eudicot family Ranunculaceae, there is a single evolutionary transition from actinomorphy to zygomorphy in the stem lineage of the tribe Delphinieae. We characterized Cyc homologs in 18 genera of Ranunculaceae, including the four genera of Delphinieae, in a sampling that represents the floral morphological diversity of this tribe, and reconstructed the evolutionary history of this gene family in Ranunculaceae. Within each of the two RanaCyL (Ranunculaceae Cycloidea-like) lineages previously identified, an additional duplication possibly predating the emergence of the Delphinieae was found, resulting in up to four gene copies in zygomorphic species. Expression analyses indicate that the RanaCyL paralogs are expressed early in floral buds and that the duration of their expression varies between species and paralog class. At most one RanaCyL paralog was expressed during the late stages of floral development in the actinomorphic species studied whereas all paralogs from the zygomorphic species were expressed, composing a species-specific identity code for perianth organs. The contrasted asymmetric patterns of expression observed in the two zygomorphic species is discussed in relation to their distinct perianth architecture.
Mullegama, Sureni V; Rosenfeld, Jill A; Orellana, Carmen; van Bon, Bregje W M; Halbach, Sara; Repnikova, Elena A; Brick, Lauren; Li, Chumei; Dupuis, Lucie; Rosello, Monica; Aradhya, Swaroop; Stavropoulos, D James; Manickam, Kandamurugu; Mitchell, Elyse; Hodge, Jennelle C; Talkowski, Michael E; Gusella, James F; Keller, Kory; Zonana, Jonathan; Schwartz, Stuart; Pyatt, Robert E; Waggoner, Darrel J; Shaffer, Lisa G; Lin, Angela E; de Vries, Bert B A; Mendoza-Londono, Roberto; Elsea, Sarah H
Copy number variations associated with abnormal gene dosage have an important role in the genetic etiology of many neurodevelopmental disorders, including intellectual disability (ID) and autism. We hypothesize that the chromosome 2q23.1 region encompassing MBD5 is a dosage-dependent region, wherein deletion or duplication results in altered gene dosage. We previously established the 2q23.1 microdeletion syndrome and report herein 23 individuals with 2q23.1 duplications, thus establishing a complementary duplication syndrome. The observed phenotype includes ID, language impairments, infantile hypotonia and gross motor delay, behavioral problems, autistic features, dysmorphic facial features (pinnae anomalies, arched eyebrows, prominent nose, small chin, thin upper lip), and minor digital anomalies (fifth finger clinodactyly and large broad first toe). The microduplication size varies among all cases and ranges from 68 kb to 53.7 Mb, encompassing a region that includes MBD5, an important factor in methylation patterning and epigenetic regulation. We previously reported that haploinsufficiency of MBD5 is the primary causal factor in 2q23.1 microdeletion syndrome and that mutations in MBD5 are associated with autism. In this study, we demonstrate that MBD5 is the only gene in common among all duplication cases and that overexpression of MBD5 is likely responsible for the core clinical features present in 2q23.1 microduplication syndrome. Phenotypic analyses suggest that 2q23.1 duplication results in a slightly less severe phenotype than the reciprocal deletion. The features associated with a deletion, mutation or duplication of MBD5 and the gene expression changes observed support MBD5 as a dosage-sensitive gene critical for normal development.
Gong, Jiachang; Guo, Jichang
Many advanced image-processing software packages are available for tampering with images. How to determine the authenticity of an image has become an urgent problem. Copy-move is one of the most common image forgery operations. Many methods have been proposed for copy-move forgery detection (CMFD). However, most of these methods are designed for grayscale images and do not use any color information. They are usually not suitable when the duplicated regions have little structure or have undergone various transforms. We propose a CMFD method using local geometrical color invariant features to detect duplicated regions. The method starts by calculating the color gradient of the inspected image. Then, we directly take the color gradient as the input for the scale-invariant feature transform (SIFT) to extract color-SIFT descriptors. Finally, keypoints are matched and clustered before their geometrical relationship is estimated to expose the duplicated regions. We evaluate the detection performance and computational complexity of the proposed method together with several popular CMFD methods on a public database. Experimental results demonstrate the efficacy of the proposed method in detecting duplicated regions with various transforms and poor structure.
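The keypoint-matching step of such a pipeline can be sketched in a few lines. The example below is an illustration only: it assumes descriptors have already been extracted, uses a nearest-neighbour ratio test (a common choice for SIFT-style descriptors, not confirmed as the authors' matcher), and the function name `match_within_image` and the `ratio` and `min_dist` parameters are hypothetical.

```python
import math

def match_within_image(keypoints, descriptors, ratio=0.6, min_dist=10.0):
    """Pair keypoints of a single image whose descriptors are near-identical.

    keypoints: list of (x, y) positions
    descriptors: list of equal-length feature vectors, one per keypoint
    ratio: nearest / second-nearest distance threshold (assumed value)
    min_dist: minimum pixel separation, so a region is not matched to itself
    """
    matches = set()
    for i, d_i in enumerate(descriptors):
        # Distances from descriptor i to every other descriptor, nearest first.
        ranked = sorted(
            (math.dist(d_i, d_j), j)
            for j, d_j in enumerate(descriptors) if j != i
        )
        if len(ranked) < 2:
            continue
        (best, j), (second, _) = ranked[0], ranked[1]
        far_apart = math.dist(keypoints[i], keypoints[j]) >= min_dist
        if far_apart and best < ratio * second:
            matches.add((min(i, j), max(i, j)))  # store each pair once
    return sorted(matches)

# Toy data: keypoints 0 and 1 share almost the same descriptor but lie far apart.
toy_keypoints = [(0, 0), (50, 50), (100, 0), (3, 0)]
toy_descriptors = [[1, 0, 0], [1, 0, 0.01], [0, 1, 0], [0, 0, 1]]
toy_matches = match_within_image(toy_keypoints, toy_descriptors)
```

The matched pairs would then be clustered and fitted with a geometric transform, as the abstract describes; those later steps are not shown here.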
Lempicki Richard A
Abstract Background Gene duplication is a normal evolutionary process. If there is no selective advantage in keeping the duplicated gene, it is usually reduced to a pseudogene and disappears from the genome. However, some paralogs are retained. These gene products are likely to be beneficial to the organism, e.g. in adaptation to new environmental conditions. The aim of our analysis is to investigate the properties of paralog-forming genes in prokaryotes, and to analyse the role of these retained paralogs by relating gene properties to the lifestyle of the corresponding prokaryotes. Results Paralogs were identified in a number of prokaryotes, and these paralogs were compared to singletons of persistent orthologs based on functional classification. This showed that the paralogs were associated with, for example, energy production, cell motility, ion transport, and defence mechanisms. A statistical overrepresentation analysis of gene and protein annotations was based on paralogs of the 200 prokaryotes with the highest fraction of paralog-forming genes. Biclustering of overrepresented gene ontology terms versus species was used to identify clusters of properties associated with clusters of species. The clusters were classified using similarity scores on properties and species to identify interesting clusters, and a subset of clusters was analysed by comparison to literature data. This analysis showed that paralogs are often associated with properties that are important for survival and proliferation of the specific organisms. These include processes like ion transport, locomotion, chemotaxis and photosynthesis. However, the analysis also showed that the gene ontology terms were sometimes too general, imprecise or even misleading for automatic analysis. Conclusions Properties described by gene ontology terms identified in the overrepresentation analysis are often consistent with individual prokaryote lifestyles and are likely to give a competitive
Nakamine, Alisa; Ouchanov, Leonid; Jiménez, Patricia; Manghi, Elina R; Esquivel, Marcela; Monge, Silvia; Fallas, Marietha; Burton, Barbara K; Szomju, Barbara; Elsea, Sarah H; Marshall, Christian R; Scherer, Stephen W; McInnes, L Alison
Duplications of 17(p11.2p11.2) have been associated with various behavioral manifestations including attention deficits, obsessive-compulsive symptoms, autistic traits, and language delay. We are conducting a genetic study of autism and are screening all cases for submicroscopic chromosomal abnormalities, in addition to standard karyotyping, and fragile X testing. Using array-based comparative genomic hybridization analysis of data from the Affymetrix GeneChip(R) Human Mapping Array set, we detected a duplication of approximately 3.3 Mb on chromosome 17p11.2 in a male child with autism and severe expressive language delay. The duplication was confirmed by measuring the copy number of genomic DNA using quantitative polymerase chain reaction. Gene expression analyses revealed increased expression of three candidate genes for the Smith-Magenis neurobehavioral phenotype, RAI1, DRG2, and RASD1, in transformed lymphocytes from Case 81A, suggesting gene dosage effects. Our results add to a growing body of evidence suggesting that duplications of 17(p11.2p11.2) result in language delay as well as autism and related phenotypes. As Smith-Magenis syndrome is also associated with language delay, a gene involved in acquisition of language may lie within this interval. Whether a parent of origin effect, gender of the case, the presence of allelic variation, or changes in expression of genes outside the breakpoints influence the resultant phenotype remains to be determined. (c) 2007 Wiley-Liss, Inc.
The complement system acts as a first line of defense and promotes organism homeostasis by modulating the fates of diverse physiological processes. Multiple copies of component genes have been previously identified in fish, suggesting a key role for this system in aquatic organisms. Herein, we confirm the presence of three different previously reported complement c3 genes (c3.1, c3.2, c3.3) and identify five additional c3 genes (c3.4, c3.5, c3.6, c3.7, c3.8) in the zebrafish genome. Additionally, we evaluate the mRNA expression levels of the different c3 genes during ontogeny and in different tissues under steady-state and inflammatory conditions. Furthermore, while reconciling the phylogenetic tree with the fish species tree, we uncovered an event of c3 duplication common to all teleost fishes that gave rise to an exclusive c3 paralog (c3.7 and c3.8). These paralogs showed a distinct ability to regulate neutrophil migration in response to injury compared with the other c3 genes and may play a role in maintaining the balance between inflammatory and homeostatic processes in zebrafish.
Thomas, N Simon; Harvey, John F; Bunyan, David J; Rankin, Julia; Grigelioniene, Giedre; Bruno, Damien L; Tan, Tiong Y; Tomkins, Susan; Hastings, Robert
Deletions of the SHOX gene are well documented and cause disproportionate short stature and variable skeletal abnormalities. In contrast, interstitial SHOX duplications limited to PAR1 appear to be very rare, and the clinical significance of the only case report in the literature is unclear. Mapping of this duplication has now shown that it includes the entire SHOX gene but little flanking sequence and so will not encompass any of the long-range enhancers required for SHOX transcription. We now describe the clinical and molecular characterization of three additional cases. The duplications all included the SHOX coding sequence but varied in the amount of flanking sequence involved. The probands were ascertained for a variety of reasons: hypotonia and features of Asperger syndrome, Leri-Weill dyschondrosteosis (LWD), and a family history of cleft palate. However, the presence of a duplication did not correlate with any of these features or with evidence of skeletal abnormality. Remarkably, the proband with LWD had inherited both a SHOX deletion and a duplication. The effect of the duplications on stature was variable: height appeared to be elevated in some carriers, particularly in those with the largest duplications, but was still within the normal range. SHOX duplications are likely to be under-ascertained, and more cases need to be identified and characterized in detail in order to accurately determine their phenotypic consequences.
Nuttle, Xander; Giannuzzi, Giuliana; Duyzend, Michael H; Schraiber, Joshua G; Narvaiza, Iñigo; Sudmant, Peter H; Penn, Osnat; Chiatante, Giorgia; Malig, Maika; Huddleston, John; Benner, Chris; Camponeschi, Francesca; Ciofi-Baffoni, Simone; Stessman, Holly A F; Marchetto, Maria C N; Denman, Laura; Harshman, Lana; Baker, Carl; Raja, Archana; Penewit, Kelsi; Janke, Nicolette; Tang, W Joyce; Ventura, Mario; Banci, Lucia; Antonacci, Francesca; Akey, Joshua M; Amemiya, Chris T; Gage, Fred H; Reymond, Alexandre; Eichler, Evan E
Genetic differences that specify unique aspects of human evolution have typically been identified by comparative analyses between the genomes of humans and closely related primates, including more recently the genomes of archaic hominins. Not all regions of the genome, however, are equally amenable to such study. Recurrent copy number variation (CNV) at chromosome 16p11.2 accounts for approximately 1% of cases of autism and is mediated by a complex set of segmental duplications, many of which arose recently during human evolution. Here we reconstruct the evolutionary history of the locus and identify bolA family member 2 (BOLA2) as a gene duplicated exclusively in Homo sapiens. We estimate that a 95-kilobase-pair segment containing BOLA2 duplicated across the critical region approximately 282 thousand years ago (ka), one of the latest among a series of genomic changes that dramatically restructured the locus during hominid evolution. All humans examined carried one or more copies of the duplication, which nearly fixed early in the human lineage, a pattern unlikely to have arisen so rapidly in the absence of selection (P ... sapiens-specific duplication. In summary, the duplicative transposition of BOLA2 at the root of the H. sapiens lineage about 282 ka simultaneously increased copy number of a gene associated with iron homeostasis and predisposed our species to recurrent rearrangements associated with disease.
A fundamental step in the evolution of the visual system is the gene duplication of visual opsins and differentiation between the duplicates in absorption spectra and expression pattern in the retina. However, our understanding of the mechanism of expression differentiation is far behind that of spectral tuning of opsins. Zebrafish (Danio rerio) have two red-sensitive cone opsin genes, LWS-1 and LWS-2. These genes are arrayed in a tail-to-head manner, in this order, and are both expressed in the long member of double cones (LDCs) in the retina. Expression of the longer-wave sensitive LWS-1 occurs later in development and is thus confined to the peripheral, especially ventral-nasal region of the adult retina, whereas expression of LWS-2 occurs earlier and is confined to the central region of the adult retina, shifted slightly to the dorsal-temporal region. In this study, we employed a transgenic reporter assay using fluorescent proteins and P1-artificial chromosome (PAC) clones encompassing the two genes and identified a 0.6-kb "LWS-activating region" (LAR) upstream of LWS-1, which regulates expression of both genes. Under the 2.6-kb flanking upstream region containing the LAR, the expression pattern of LWS-1 was recapitulated by the fluorescent reporter. On the other hand, when LAR was directly conjugated to the LWS-2 upstream region, the reporter was expressed in the LDCs but also across the entire outer nuclear layer. Deletion of LAR from the PAC clones drastically lowered the reporter expression of the two genes. These results suggest that LAR regulates both LWS-1 and LWS-2 by enhancing their expression and that interaction of LAR with the promoters is competitive between the two genes in a developmentally restricted manner. Sharing a regulatory region between duplicated genes could be a general way to facilitate the expression differentiation in duplicated visual opsins.
Martin M Johansson
The human Y chromosome is almost always excluded from genome-wide investigations of copy number variants (CNVs) due to its highly repetitive structure. This chromosome should not be forgotten, not only for its well-known relevance in male fertility, but also for its involvement in clinical phenotypes such as cancers, heart failure and sex-specific effects on brain and behaviour. We analysed Y chromosome data from Affymetrix 6.0 SNP arrays and found that the signal intensities for most of 8179 SNP/CN probes in the male-specific region (MSY) discriminated between a male, background signals in a female and an isodicentric male containing a large deletion of the q-arm and a duplication of the p-arm of the Y chromosome. Therefore, this SNP/CN platform is suitable for identification of gain and loss of Y chromosome sequences. In a set of 1718 males, we found 25 different CNV patterns, many of which are novel. We confirmed some of these variants by PCR or qPCR. The total frequency of individuals with CNVs was 14.7%, including 9.5% with duplications, 4.5% with deletions and 0.7% exhibiting both. Hence, a novel observation is that the frequency of duplications was more than twice the frequency of deletions. Another striking result was that 10 of the 25 detected variants were significantly overrepresented in one or more haplogroups, demonstrating the importance of controlling for haplogroups in genome-wide investigations to avoid stratification. NO-M214(xM175) individuals presented the highest percentage (95%) of CNVs. If they were not counted, 12.4% of the remaining individuals carried CNVs, and the difference between duplications (8.9%) and deletions (2.8%) was even larger. Our results demonstrate that currently available genome-wide SNP platforms can be used to identify duplications and deletions in the human Y chromosome. Future association studies of the full spectrum of Y chromosome variants will demonstrate the potential involvement of gain or loss of Y chromosome sequence in
Conclusion: MLPA was proven to be a powerful tool for the detection of DMD gene deletions and duplications in male patients and female carriers. There was a relatively lower frequency of deletion and a higher frequency of duplication of DMD gene in this population compared to previous reports.
Salari, Keyan; Tibshirani, Robert; Pollack, Jonathan R
DNA copy number alterations (CNA) frequently underlie gene expression changes by increasing or decreasing gene dosage. However, only a subset of genes with altered dosage exhibit concordant changes in gene expression. This subset is likely to be enriched for oncogenes and tumor suppressor genes, and can be identified by integrating these two layers of genome-scale data. We introduce DNA/RNA-Integrator (DR-Integrator), a statistical software tool to perform integrative analyses on paired DNA copy number and gene expression data. DR-Integrator identifies genes with significant correlations between DNA copy number and gene expression, and implements a supervised analysis that captures genes with significant alterations in both DNA copy number and gene expression between two sample classes. DR-Integrator is freely available for non-commercial use from the Pollack Lab at http://pollacklab.stanford.edu/ and can be downloaded as a plug-in application to Microsoft Excel and as a package for the R statistical computing environment. The R package is available under the name 'DRI' at http://cran.r-project.org/. An example analysis using DR-Integrator is included as supplemental material. Supplementary data are available at Bioinformatics online.
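The idea of flagging genes whose expression tracks their DNA copy number can be sketched in a few lines. This is only an illustration under stated assumptions, not DR-Integrator's interface or its actual statistic: it uses a plain Pearson correlation with a fixed cutoff (`min_r`, hypothetical) and omits any significance testing. The gene names and values are invented placeholders.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length samples (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def dosage_concordant_genes(copy_number, expression, min_r=0.8):
    """Return {gene: r} for genes whose expression rises and falls with copy number.

    Both inputs map a gene name to one value per sample, in the same sample order.
    """
    hits = {}
    for gene, cn_values in copy_number.items():
        if gene not in expression:
            continue
        r = pearson(cn_values, expression[gene])
        if r >= min_r:
            hits[gene] = r
    return hits


# Invented paired data for five samples (log-ratio style values).
toy_copy_number = {
    "GENE_A": [-1, 0, 0, 1, 2],
    "GENE_B": [-1, 0, 0, 1, 2],
}
toy_expression = {
    "GENE_A": [-0.9, 0.1, -0.1, 1.2, 1.8],  # follows dosage
    "GENE_B": [0.3, -0.2, 0.4, 0.0, 0.1],   # does not follow dosage
}
print(dosage_concordant_genes(toy_copy_number, toy_expression))
```

A fixed correlation cutoff is the simplest possible filter; a realistic analysis would replace it with a permutation-based or otherwise corrected significance estimate.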
Muhammad I. Naseer
Abstract Background Epilepsy is a genetically complex but common brain disorder, affecting millions of people worldwide in almost all age groups. Novel copy number variations (CNVs) are considered an important cause of numerous neurodevelopmental disorders, including intellectual disability and epilepsy. DNA array-based studies help to explain more severe clinical presentations of the disease, but interpretation of many detected CNVs is still challenging. Results In order to study novel CNVs involving epilepsy-related genes in a Saudi family with six affected and two normal individuals with several forms of epileptic seizures, intellectual disability (ID), and minor dysmorphism, we performed high-density whole-genome array-CGH analysis using Agilent SurePrint G3 Hmn CGH 2x 400 K chips. Our results showed de novo deletions, duplications, and deletion plus duplication in different chromosomal regions in the affected individuals that were not seen in the normal father and normal children, using Agilent CytoGenomics 126.96.36.199 software. Copy number gains were observed on chromosomes 1, 16 and 22, involving the LCE3C, HPR, GSTT2, GSTTP2, DDT and DDTL genes, respectively, whereas deletions were observed in the chromosomal region 8p23-p21 (4303127–4337759), where the potential candidate gene is CSMD1 (OMIM: 612279). Moreover, the deletions and duplications detected by array CGH were validated by simple PCR, using primers designed for the deleted regions with the flanking SNPs, and by quantitative real-time PCR. Conclusions We found several de novo deletions and duplications in a Saudi family with intellectual disability and epilepsy. Our results suggest that array-CGH should be used as a first-line genetic test for epilepsy unless there is a strong indication for a monogenic syndrome. The advanced high-throughput array-CGH technique used in this study aims to build up the database and to identify new mechanisms describing
Yuksel-Apak, Memnune; Bögershausen, Nina; Pawlik, Barbara; Li, Yun; Apak, Selcuk; Uyguner, Oya; Milz, Esther; Nürnberg, Gudrun; Karaman, Birsen; Gülgören, Ayan; Grzeschik, Karl-Heinz; Nürnberg, Peter; Kayserili, Hülya; Wollnik, Bernd
Indian hedgehog (Ihh) signaling is a major determinant of various processes during embryonic development and has a pivotal role in embryonic skeletal development. A specific spatial and temporal expression of Ihh within the developing limb buds is essential for accurate digit outgrowth and correct digit number. Although missense mutations in IHH cause brachydactyly type A1, small tandem duplications involving the IHH locus have recently been described in patients with mild syndactyly and craniosynostosis. In contrast, a ∼600-kb deletion 5' of IHH in the doublefoot mouse mutant (Dbf) leads to severe polydactyly without craniosynostosis, but with craniofacial dysmorphism. We now present a patient resembling acrocallosal syndrome (ACS) with extensive polysyndactyly of the hands and feet, craniofacial abnormalities including macrocephaly, agenesis of the corpus callosum, dysplastic and low-set ears, severe hypertelorism and profound psychomotor delay. Single-nucleotide polymorphism (SNP) array copy number analysis identified a ∼900-kb duplication of the IHH locus, which was confirmed by an independent quantitative method. A fetus from a second pregnancy of the mother by a different spouse showed similar craniofacial and limb malformations and the same duplication of the IHH-locus. We defined the exact breakpoints and showed that the duplications are identical tandem duplications in both sibs. No copy number changes were observed in the healthy mother. To our knowledge, this is the first report of a human phenotype similar to the Dbf mutant and strikingly overlapping with ACS that is caused by a copy number variation involving the IHH locus on chromosome 2q35.
... COMMERCE CLAUSES AND FORMS SOLICITATION PROVISIONS AND CONTRACT CLAUSES Text of Provisions and Clauses 1352... copying in excess of the limits in paragraph (a) of this clause are unallowable without prior written..., duplicating, and copying in excess of the limits specified in paragraph (a) of this clause, a provision...
Abstract Background A large family of viruses that infect bacteria, called phages, is characterized by long tails used to inject DNA into their victims' cells. The tape measure protein got its name because the length of the corresponding gene is proportional to the length of the phage's tail: a fact shown by actually copying or splicing out parts of DNA in exemplar species. A natural question is whether there exist units for these tape measures, and if different tape measures have different units and lengths. Such units would allow us to retrace the evolution of tape measure proteins using their duplication/loss history. The vast number of sequenced phage genomes allows us to attack this problem with a comparative genomics approach. Results Here we describe a subset of phages whose tape measure proteins contain variable numbers of an 11-amino-acid sequence repeat, aligned with sequence similarity, structural properties, and simple arithmetic. This subset provides a unique opportunity for the combinatorial study of phage evolution, without the added uncertainties of multiple alignments, which are trivial in this case, or of protein functions, which are well established. We give a heuristic that reconstructs the duplication history of these sequences, using divergent strains to discriminate between mutations that occurred before and after speciation, or lineage divergence. The heuristic is based on an efficient algorithm that gives an exhaustive enumeration of all possible parsimonious reconstructions of the duplication/speciation history of a single nucleotide. Finally, we present a method that allows us, when possible, to discriminate between duplication and loss events. Conclusions Establishing the evolutionary history of viruses is difficult, in part due to extensive recombination and gene transfers, and high mutation rates that often erase detectable similarity between homologous genes. In this paper, we introduce new tools to address this
Sharp, Andrew J; Hansen, Sierra; Selzer, Rebecca R; Cheng, Ze; Regan, Regina; Hurst, Jane A; Stewart, Helen; Price, Sue M; Blair, Edward; Hennekam, Raoul C; Fitzpatrick, Carrie A; Segraves, Rick; Richmond, Todd A; Guiver, Cheryl; Albertson, Donna G; Pinkel, Daniel; Eis, Peggy S; Schwartz, Stuart; Knight, Samantha J L; Eichler, Evan E
Genomic disorders are characterized by the presence of flanking segmental duplications that predispose these regions to recurrent rearrangement. Based on the duplication architecture of the genome, we investigated 130 regions that we hypothesized as candidates for previously undescribed genomic disorders. We tested 290 individuals with mental retardation by BAC array comparative genomic hybridization and identified 16 pathogenic rearrangements, including de novo microdeletions of 17q21.31 found in four individuals. Using oligonucleotide arrays, we refined the breakpoints of this microdeletion, defining a 478-kb critical region containing six genes that were deleted in all four individuals. We mapped the breakpoints of this deletion and of four other pathogenic rearrangements in 1q21.1, 15q13, 15q24 and 17q12 to flanking segmental duplications, suggesting that these are also sites of recurrent rearrangement. In common with the 17q21.31 deletion, these breakpoint regions are sites of copy number polymorphism in controls, indicating that these may be inherently unstable genomic regions.
Abstract Computing the edit distance between two genomes under certain operations is a basic problem in the study of genome evolution. The double-cut-and-join (DCJ) model has formed the basis for most algorithmic research on rearrangements over the last few years. The edit distance under the DCJ model can be easily computed for genomes without duplicate genes. In this paper, we study the edit distance for genomes with duplicate genes under a model that includes DCJ operations, insertions and deletions. We prove that computing the edit distance is equivalent to finding the optimal cycle decomposition of the corresponding adjacency graph, and give an approximation algorithm with an approximation ratio of 1.5 + ε.
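For the easy case mentioned above (no duplicate genes), the DCJ distance follows from the adjacency graph by the standard formula d = N - (C + I/2), where N is the number of genes, C the number of cycles and I the number of odd paths. The sketch below is restricted, as a simplifying assumption, to genomes made only of circular chromosomes, for which I = 0 and d = N - C. It does not implement the duplicate-gene approximation algorithm of the paper; the function names and the signed-integer encoding are illustrative choices.

```python
def adjacency_partners(genome):
    """Map each gene extremity to the extremity it is adjacent to.

    `genome` is a list of circular chromosomes; each chromosome is a list of
    signed integers (sign = orientation). Extremities are (gene, "t") for the
    tail and (gene, "h") for the head.
    """
    partner = {}
    for chrom in genome:
        for i, gene in enumerate(chrom):
            nxt = chrom[(i + 1) % len(chrom)]  # wrap around: circular chromosome
            right = (abs(gene), "h" if gene > 0 else "t")
            left = (abs(nxt), "t" if nxt > 0 else "h")
            partner[right] = left
            partner[left] = right
    return partner


def dcj_distance(genome_a, genome_b):
    """DCJ distance N - C for two circular genomes over the same duplicate-free gene set."""
    genes_a = sorted(abs(g) for chrom in genome_a for g in chrom)
    genes_b = sorted(abs(g) for chrom in genome_b for g in chrom)
    if genes_a != genes_b or len(set(genes_a)) != len(genes_a):
        raise ValueError("genomes must share one duplicate-free gene set")

    partner_a = adjacency_partners(genome_a)
    partner_b = adjacency_partners(genome_b)

    # Count cycles of the adjacency graph by alternating A- and B-adjacencies.
    seen = set()
    cycles = 0
    for start in partner_a:
        if start in seen:
            continue
        cycles += 1
        x = start
        while x not in seen:
            y = partner_a[x]
            seen.update((x, y))
            x = partner_b[y]
    return len(genes_a) - cycles


print(dcj_distance([[1, 2, 3]], [[1, -2, 3]]))        # one inversion
print(dcj_distance([[1, 2, 3, 4]], [[1, 2], [3, 4]]))  # one fission
```

With duplicate genes the adjacency graph is no longer unique, because each copy in one genome must first be matched to a copy in the other; choosing that matching is what turns the computation into the cycle-decomposition problem described in the abstract.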
Complete mitochondrial (mt) genome sequences with duplicate control regions (CRs) have been detected in various animal species. In Testudines, duplicate mtCRs have been reported in the mtDNA of the Asian big-headed turtle, Platysternon megacephalum, which has three living subspecies. However, the evolutionary pattern of these CRs remains unclear. In this study, we report the complete sequences of duplicate CRs from 20 individuals belonging to three subspecies of this turtle and discuss a micro-evolutionary analysis of the duplicate CRs. Genetic distances calculated with MEGA 4.1 using the complete duplicate CR sequences revealed that within turtle subspecies, genetic distances between orthologous copies from different individuals were 0.63% for CR1 and 1.2% for CR2, respectively, and the average distance between paralogous copies of CR1 and CR2 was 4.8%. Phylogenetic relationships were reconstructed from the CR sequences, excluding the variable number of tandem repeats (VNTRs) at the 3' end, using three methods: neighbor-joining, maximum likelihood algorithm, and Bayesian inference. These data show that the two CRs within an individual were more genetically distant from each other than from their orthologous copies in different individuals within the same subspecies. This suggests independent evolution of the two mtCRs within each P. megacephalum subspecies. Reconstruction of separate phylogenetic trees using different CR components (TAS, CD, CSB, and VNTRs) suggested the role of recombination in the evolution of duplicate CRs. Consequently, recombination events were detected using RDP software with break points at ≈290 bp and ≈1,080 bp. Based on these results, we hypothesize that duplicate CRs in P. megacephalum originated from heterological ancestral recombination of mtDNA. Subsequent recombination could have resulted in homogenization during independent evolutionary events, thus maintaining the functions of duplicate CRs in the mtDNA of P
Xu, Xiu; Xu, Qiong; Zhang, Ying; Zhang, Xiaodi; Cheng, Tianlin; Wu, Bingbing; Ding, Yanhua; Lu, Ping; Zheng, Jingjing; Zhang, Min; Qiu, Zilong; Yu, Xiang
Abstract Background Autistic spectrum disorders (ASDs) are a family of neurodevelopmental disorders with strong genetic components. Recent studies have shown that copy number variations in dosage sensitive genes can contribute significantly to these disorders. One such gene is the transcription factor MECP2, whose loss of function in females results in Rett syndrome, while its duplication in males results in developmental delay and autism. Case presentation Here, we identified a Chinese famil...
Pendleton, Amanda L; Smith, Katherine E; Feau, Nicolas; Martin, Francis M; Grigoriev, Igor V; Hamelin, Richard; Nelson, C Dana; Burleigh, J Gordon; Davis, John M
Rust fungi are a group of fungal pathogens that cause some of the world's most destructive diseases of trees and crops. A shared characteristic among rust fungi is obligate biotrophy, the inability to complete a lifecycle without a host. This dependence on a host species likely affects patterns of gene expansion, contraction, and innovation within rust pathogen genomes. The establishment of disease by biotrophic pathogens is reliant upon effector proteins that are encoded in the fungal genome and secreted from the pathogen into the host's cell apoplast or within the cells. This study uses a comparative genomic approach to elucidate putative effectors and determine their evolutionary histories. We used OrthoMCL to identify nearly 20,000 gene families in proteomes of 16 diverse fungal species, which include 15 basidiomycetes and one ascomycete. We inferred patterns of duplication and loss for each gene family and identified families with distinctive patterns of expansion/contraction associated with the evolution of rust fungal genomes. To recognize potential contributors for the unique features of rust pathogens, we identified families harboring secreted proteins that: (i) arose or expanded in rust pathogens relative to other fungi, or (ii) contracted or were lost in rust fungal genomes. While the origin of rust fungi appears to be associated with considerable gene loss, there are many gene duplications associated with each sampled rust fungal genome. We also highlight two putative effector gene families that have expanded in Cqf that we hypothesize have roles in pathogenicity.
Boghossian, Nansi S; Sicko, Robert J; Giannakou, Andreas; Dimopoulos, Aggeliki; Caggana, Michele; Tsai, Michael Y; Yeung, Edwina H; Pankratz, Nathan; Cole, Benjamin R; Romitti, Paul A; Browne, Marilyn L; Fan, Ruzong; Liu, Aiyi; Kay, Denise M; Mills, James L
Prune belly syndrome (PBS), also known as Eagle-Barrett syndrome, is a rare congenital disorder characterized by absence or hypoplasia of the abdominal wall musculature, urinary tract anomalies, and cryptorchidism in males. The etiology of PBS is largely unresolved, but genetic factors are implicated given its recurrence in families. We examined cases of PBS to identify novel pathogenic copy number variants (CNVs). A total of 34 cases (30 males and 4 females) with PBS identified from all live births in New York State (1998-2005) were genotyped using Illumina HumanOmni2.5 microarrays. CNVs were prioritized if they were absent from in-house controls, encompassed ≥10 consecutive probes, were ≥20 Kb in size, had ≤20% overlap with common variants in population reference controls, and had ≤20% overlap with any variant previously detected in other birth defect phenotypes screened in our laboratory. We identified 17 candidate autosomal CNVs; 10 cases each had one CNV and four cases each had two CNVs. The CNVs included a 158 Kb duplication at 4q22 that overlaps the BMPR1B gene; duplications of different sizes carried by two cases in the intron of the STIM1 gene; a 67 Kb duplication 202 Kb downstream of the NOG gene; and a 1.34 Mb deletion including the MYOCD gene. The identified rare CNVs spanned genes involved in mesodermal, muscle, and urinary tract development and differentiation, which might help in elucidating the genetic contribution to PBS. We did not have parental DNA and cannot identify whether these CNVs were de novo or inherited. Further research on these CNVs, particularly BMP signaling, is warranted to elucidate the pathogenesis of PBS. Copyright © 2017 Elsevier Masson SAS. All rights reserved.
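The five prioritization criteria listed above translate directly into a filter. The following is a minimal sketch of such a filter, not the authors' pipeline: the record layout, field names and example calls are hypothetical, and overlaps are assumed to be pre-computed as fractions of the CNV length.

```python
from dataclasses import dataclass


@dataclass
class CNVCall:
    """One CNV call; all field names are illustrative assumptions."""
    name: str
    size_kb: float
    n_probes: int                  # consecutive probes supporting the call
    in_inhouse_controls: bool      # seen in in-house controls?
    overlap_common: float          # fraction overlapping common population variants
    overlap_other_defects: float   # fraction overlapping variants from other phenotypes


def is_prioritized(call):
    """Apply the five thresholds stated in the abstract."""
    return (
        not call.in_inhouse_controls
        and call.n_probes >= 10
        and call.size_kb >= 20
        and call.overlap_common <= 0.20
        and call.overlap_other_defects <= 0.20
    )


# Invented example calls, one failing each kind of criterion.
toy_calls = [
    CNVCall("call_1", 158.0, 45, False, 0.05, 0.00),  # passes every filter
    CNVCall("call_2", 15.0, 12, False, 0.00, 0.00),   # too small
    CNVCall("call_3", 80.0, 8, False, 0.00, 0.00),    # too few probes
    CNVCall("call_4", 300.0, 90, True, 0.00, 0.00),   # present in controls
    CNVCall("call_5", 60.0, 20, False, 0.35, 0.10),   # overlaps common variants
]
print([c.name for c in toy_calls if is_prioritized(c)])
```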
Grispo, Michael T.; Natarajan, Chandrasekhar; Projecto-Garcia, Joana; Moriyama, Hideaki; Weber, Roy E.; Storz, Jay F.
The majority of bird species co-express two functionally distinct hemoglobin (Hb) isoforms in definitive erythrocytes as follows: HbA (the major adult Hb isoform, with α-chain subunits encoded by the αA-globin gene) and HbD (the minor adult Hb isoform, with α-chain subunits encoded by the αD-globin gene). The αD-globin gene originated via tandem duplication of an embryonic α-like globin gene in the stem lineage of tetrapod vertebrates, which suggests the possibility that functional differentiation between the HbA and HbD isoforms may be attributable to a retained ancestral character state in HbD that harkens back to a primordial, embryonic function. To investigate this possibility, we conducted a combined analysis of protein biochemistry and sequence evolution to characterize the structural and functional basis of Hb isoform differentiation in birds. Functional experiments involving purified HbA and HbD isoforms from 11 different bird species revealed that HbD is characterized by a consistently higher O2 affinity in the presence of allosteric effectors such as organic phosphates and Cl− ions. In the case of both HbA and HbD, analyses of oxygenation properties under the two-state Monod-Wyman-Changeux allosteric model revealed that the pH dependence of Hb-O2 affinity stems primarily from changes in the O2 association constant of deoxy (T-state)-Hb. Ancestral sequence reconstructions revealed that the amino acid substitutions that distinguish the adult-expressed Hb isoforms are not attributable to the retention of an ancestral (pre-duplication) character state in the αD-globin gene that is shared with the embryonic α-like globin gene. PMID:22962007
We used deep sequencing technology to profile the transcriptome, gene copy number, and CpG island methylation status simultaneously in eight commonly used breast cell lines to develop a model for how these genomic features are integrated in estrogen receptor positive (ER+) and negative (ER-) breast cancer. Total mRNA sequencing, gene copy number analysis, and genomic CpG island methylation profiling were carried out using the Illumina Genome Analyzer. Sequences were mapped to the human genome to obtain digitized gene expression data, DNA copy number in reference to the non-tumor cell line (MCF10A), and methylation status of 21,570 CpG islands to identify differentially expressed genes that were correlated with methylation or copy number changes. These were evaluated in a dataset from 129 primary breast tumors. Gene expression in cell lines was dominated by ER-associated genes. ER+ and ER- cell lines formed two distinct, stable clusters, and 1,873 genes were differentially expressed in the two groups. Part of chromosome 8 was deleted in all ER- cells and part of chromosome 17 amplified in all ER+ cells. These loci encoded 30 genes that were overexpressed in ER+ cells; 9 of these genes were overexpressed in ER+ tumors. We identified 149 differentially expressed genes that exhibited differential methylation of one or more CpG islands within 5 kb of the 5' end of the gene and for which mRNA abundance was inversely correlated with CpG island methylation status. In primary tumors we identified 84 genes that appear to be robust components of the methylation signature that we identified in ER+ cell lines. Our analyses reveal a global pattern of differential CpG island methylation that contributes to the transcriptome landscape of ER+ and ER- breast cancer cells and tumors. The role of gene amplification/deletion appears to be more modest, although several potentially significant genes appear to be regulated by copy number aberrations.
Kim, Seon-Hee; Bae, Young-An
Tyrosinase provides an essential activity during egg production in diverse platyhelminths by mediating sclerotization of eggshells. In this study, we investigated the genomic and evolutionary features of tyrosinases in parasitic platyhelminths whose genomic information is available. A pair of paralogous tyrosinases was detected in most trematodes, whereas they were lost in cyclophyllidean cestodes. A pseudophyllidean cestode displaying egg biology similar to that of trematodes possessed an orthologous gene. Interestingly, one of the paralogous tyrosinases appeared to have been multiplied into three copies in Clonorchis sinensis and Opisthorchis viverrini. In addition, a fifth tyrosinase gene that was minimally transcribed through all developmental stages was further detected in these opisthorchiid genomes. Phylogenetic analyses demonstrated that the tyrosinase gene has undergone duplication at least three times in platyhelminths. The additional opisthorchiid gene arose from the first duplication. A paralogous copy generated from these gene duplications, except for the last one, seemed to be lost in the major neodermatans lineages. In C. sinensis, tyrosinase gene expressions were initiated following sexual maturation and the levels were significantly enhanced by the presence of O2 and bile. Taken together, our data suggest that tyrosinase has evolved lineage-specifically across platyhelminths related to its copy number and induction mechanism.
Brian B Tuch
Due to growing throughput and shrinking cost, massively parallel sequencing is rapidly becoming an attractive alternative to microarrays for the genome-wide study of gene expression and copy number alterations in primary tumors. The sequencing of transcripts (RNA-Seq) should offer several advantages over microarray-based methods, including the ability to detect somatic mutations and accurately measure allele-specific expression. To investigate these advantages we have applied a novel, strand-specific RNA-Seq method to tumors and matched normal tissue from three patients with oral squamous cell carcinomas. Additionally, to better understand the genomic determinants of the gene expression changes observed, we have sequenced the tumor and normal genomes of one of these patients. We demonstrate here that our RNA-Seq method accurately measures allelic imbalance and that measurement on the genome-wide scale yields novel insights into cancer etiology. As expected, the set of genes differentially expressed in the tumors is enriched for cell adhesion and differentiation functions, but, unexpectedly, the set of allelically imbalanced genes is also enriched for these same cancer-related functions. By comparing the transcriptomic perturbations observed in one patient to his underlying normal and tumor genomes, we find that allelic imbalance in the tumor is associated with copy number mutations and that copy number mutations are, in turn, strongly associated with changes in transcript abundance. These results support a model in which allele-specific deletions and duplications drive allele-specific changes in gene expression in the developing tumor.
Araud, Tanguy; Graw, Sharon; Berger, Ralph; Lee, Michael; Neveu, Estele; Bertrand, Daniel; Leonard, Sherry
The human α7 neuronal nicotinic acetylcholine receptor gene (CHRNA7) is a candidate gene for schizophrenia and an important drug target for cognitive deficits in the disorder. Activation of the α7*nAChR results in opening of the channel and entry of mono- and divalent cations, including Ca2+, which contributes presynaptically to neurotransmitter release and postsynaptically to downstream changes in gene expression. Schizophrenic patients have low levels of α7*nAChR, as measured by binding of the ligand [125I]-α-bungarotoxin (I-BTX). The structure of the gene, CHRNA7, is complex. During evolution, CHRNA7 was partially duplicated as a chimeric gene (CHRFAM7A), which is expressed in the human brain and elsewhere in the body. The association between a 2 bp deletion in CHRFAM7A and schizophrenia suggested that this duplicate gene might contribute to cognitive impairment. To examine the putative contribution of CHRFAM7A to receptor function, co-expression of α7 and the duplicate genes was carried out in cell lines and Xenopus oocytes. Expression of the duplicate alone yielded protein expression but no functional receptor, and co-expression with α7 caused a significant reduction of the amplitude of the ACh-evoked currents. Reduced current amplitude was not correlated with a reduction of I-BTX binding, suggesting the presence of non-functional (ACh-silent) receptors. This hypothesis is supported by a larger increase of the ACh-evoked current by the allosteric modulator 1-(5-chloro-2,4-dimethoxy-phenyl)-3-(5-methyl-isoxazol-3-yl)-urea (PNU-120596) in cells expressing the duplicate than in the control. These results suggest that CHRFAM7A acts as a dominant negative modulator of CHRNA7 function and is critical for receptor regulation in humans. Copyright © 2011 Elsevier Inc. All rights reserved.
Diploid genomes with divergent chromosomes present special problems for assembly software as two copies of especially polymorphic regions may be mistakenly constructed, creating the appearance of a recent segmental duplication. We developed a method for identifying such false duplications and applied it to four vertebrate genomes. For each genome, we corrected mis-assemblies, improved estimates of the amount of duplicated sequence, and recovered polymorphisms between the sequenced chromosomes. PMID:20219098
Copy number variations (CNVs) are important in relation to diversity and evolution but can sometimes cause disease. The most common genetic cause of the inherited peripheral neuropathy Charcot-Marie-Tooth disease is the PMP22 duplication; otherwise, CNVs have been considered rare. We investigated CNVs in a population-based sample of Charcot-Marie-Tooth (CMT) families. The 81 CMT families had previously been screened for the PMP22 duplication and point mutations in 51 peripheral neuropathy genes, and a genetic cause was identified in 37 CMT families (46%). Index patients from the 44 CMT families with an unknown genetic diagnosis were analysed by whole-genome array comparative genomic hybridization to investigate the entire genome for larger CNVs and by multiplex ligation-dependent probe amplification to detect smaller intragenomic CNVs in MFN2 and MPZ. One patient had the pathogenic PMP22 duplication not detected by previous methods. Three patients had potentially pathogenic CNVs in CNTNAP2, LAMA2, or SEMA5A, that is, genes related to neuromuscular or neurodevelopmental disease. Genotype and phenotype correlation indicated likely pathogenicity for the LAMA2 CNV, whereas the CNTNAP2 and SEMA5A CNVs remained potentially pathogenic. Except for the PMP22 duplication, disease-causing CNVs are rare but may cause CMT in about 1% (95% CI 0–7%) of the Norwegian CMT families.
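The proportions quoted in this abstract can be checked with a few lines of arithmetic. The sketch below recomputes 37/81 and an exact (Clopper-Pearson) binomial confidence interval. It assumes that the "about 1%" figure corresponds to 1 family out of 81; that count is an inference from the text, not something the abstract states, and the abstract does not say which interval method the authors used.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def bisect_decreasing(f, lo=0.0, hi=1.0, steps=100):
    """Find the root of a function that decreases from positive to negative on [lo, hi]."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided confidence interval for a binomial proportion x/n."""
    # Lower bound: the p at which P(X >= x) equals alpha/2.
    lower = 0.0 if x == 0 else bisect_decreasing(
        lambda p: alpha / 2 - (1 - binom_cdf(x - 1, n, p)))
    # Upper bound: the p at which P(X <= x) equals alpha/2.
    upper = 1.0 if x == n else bisect_decreasing(
        lambda p: binom_cdf(x, n, p) - alpha / 2)
    return lower, upper

solved_fraction = 37 / 81                  # families with a known genetic cause
ci_low, ci_high = clopper_pearson(1, 81)   # ASSUMED count: 1 of 81 families
print(f"{solved_fraction:.0%}, CI {ci_low:.1%} to {ci_high:.1%}")
```

Under that assumption the interval rounds to roughly 0% to 7%, which is consistent with the range reported in the abstract.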
Price, Ric N; Uhlemann, Anne-Catrin; Brockman, Alan; McGready, Rose; Ashley, Elizabeth; Phaipun, Lucy; Patel, Rina; Laing, Kenneth; Looareesuwan, Sornchai; White, Nicholas J; Nosten, François; Krishna, Sanjeev
The borders of Thailand harbour the world's most multidrug resistant Plasmodium falciparum parasites. In 1984 mefloquine was introduced as treatment for uncomplicated falciparum malaria, but substantial resistance developed within 6 years. A combination of artesunate with mefloquine now cures more than 95% of acute infections. For both treatment regimens, the underlying mechanisms of resistance are not known. The relation between polymorphisms in the P falciparum multidrug resistant gene 1 (pfmdr1) and the in-vitro and in-vivo responses to mefloquine were assessed in 618 samples from patients with falciparum malaria studied prospectively over 12 years. pfmdr1 copy number was assessed by a robust real-time PCR assay. Single nucleotide polymorphisms of pfmdr1, P falciparum chloroquine resistance transporter gene (pfcrt) and P falciparum Ca2+ ATPase gene (pfATP6) were assessed by PCR-restriction fragment length polymorphism. Increased copy number of pfmdr1 was the most important determinant of in-vitro and in-vivo resistance to mefloquine, and also to reduced artesunate sensitivity in vitro. In a Cox regression model with control for known confounders, increased pfmdr1 copy number was associated with an attributable hazard ratio (AHR) for treatment failure of 6.3 (95% CI 2.9-13.8, p<0.001) after mefloquine monotherapy and 5.4 (2.0-14.6, p=0.001) after artesunate-mefloquine therapy. Single nucleotide polymorphisms in pfmdr1 were associated with increased mefloquine susceptibility in vitro, but not in vivo. Amplification in pfmdr1 is the main cause of resistance to mefloquine in falciparum malaria. Multidrug resistant P falciparum malaria is common in southeast Asia, but difficult to identify and treat. Genes that encode parasite transport proteins may be involved in export of drugs and so cause resistance. In this study we show that increase in copy number of pfmdr1, a gene encoding a parasite transport protein, is the best overall predictor of treatment failure with
Dunaway, Keith W; Islam, M Saharul; Coulson, Rochelle L; Lopez, S Jesse; Vogel Ciernia, Annie; Chu, Roy G; Yasui, Dag H; Pessah, Isaac N; Lott, Paul; Mordaunt, Charles; Meguro-Horike, Makiko; Horike, Shin-Ichi; Korf, Ian; LaSalle, Janine M
Rare variants enriched for functions in chromatin regulation and neuronal synapses have been linked to autism. How chromatin and DNA methylation interact with environmental exposures at synaptic genes in autism etiologies is currently unclear. Using whole-genome bisulfite sequencing in brain tissue and a neuronal cell culture model carrying a 15q11.2-q13.3 maternal duplication, we find that significant global DNA hypomethylation is enriched over autism candidate genes and affects gene expression. The cumulative effect of multiple chromosomal duplications and exposure to the pervasive persistent organic pollutant PCB 95 altered methylation of more than 1,000 genes. Hypomethylated genes were enriched for H2A.Z, increased maternal UBE3A in Dup15q corresponded to reduced levels of RING1B, and bivalently modified H2A.Z was altered by PCB 95 and duplication. These results demonstrate the compounding effects of genetic and environmental insults on the neuronal methylome that converge upon dysregulation of chromatin and synaptic genes. Copyright © 2016 The Author(s). Published by Elsevier Inc. All rights reserved.
Amanda L. Pendleton
Rust fungi are a group of fungal pathogens that cause some of the world’s most destructive diseases of trees and crops. A shared characteristic among rust fungi is obligate biotrophy, the inability to complete a lifecycle without a host. This dependence on a host species likely affects patterns of gene expansion, contraction, and innovation within rust pathogen genomes. The establishment of disease by biotrophic pathogens is reliant upon effector proteins that are encoded in the fungal genome and secreted from the pathogen into the host’s cell apoplast or within the cells. This study uses a comparative genomic approach to elucidate putative effectors and determine their evolutionary histories. We used OrthoMCL to identify nearly 20,000 gene families in proteomes of sixteen diverse fungal species, which include fifteen basidiomycetes and one ascomycete. We inferred patterns of duplication and loss for each gene family and identified families with distinctive patterns of expansion/contraction associated with the evolution of rust fungal genomes. To recognize potential contributors to the unique features of rust pathogens, we identified families harboring secreted proteins that (i) arose or expanded in rust pathogens relative to other fungi, or (ii) contracted or were lost in rust fungal genomes. While the origin of rust fungi appears to be associated with considerable gene loss, there are many gene duplications associated with each sampled rust fungal genome. We also highlight two putative effector gene families that have expanded in Cqf that we hypothesize have roles in pathogenicity.
Benito-Sanz, S; Barroso, E; Heine-Suñer, D; Hisado-Oliva, A; Romanelli, V; Rosell, J; Aragones, A; Caimari, M; Argente, J; Ross, J L; Zinn, A R; Gracia, R; Lapunzina, P; Campos-Barros, A; Heath, K E
Léri-Weill dyschondrosteosis (LWD) is a skeletal dysplasia characterized by disproportionate short stature and the Madelung deformity of the forearm. SHOX mutations and pseudoautosomal region 1 deletions encompassing SHOX or its enhancers have been identified in approximately 60% of LWD and approximately 15% of idiopathic short stature (ISS) individuals. Recently SHOX duplications have been described in LWD/ISS but also in individuals with other clinical manifestations, thus questioning their pathogenicity. The objective of the study was to investigate the pathogenicity of SHOX duplications in LWD and ISS. Multiplex ligation-dependent probe amplification is routinely used in our unit to analyze for SHOX/pseudoautosomal region 1 copy number changes in LWD/ISS referrals. Quantitative PCR, microsatellite marker, and fluorescence in situ hybridization analysis were undertaken to confirm all identified duplications. During the routine analysis of 122 LWD and 613 ISS referrals, a total of four complete and 10 partial SHOX duplications or multiple copy number (n > 3) as well as one duplication of the SHOX 5' flanking region were identified in nine LWD and six ISS cases. Partial SHOX duplications appeared to have a more deleterious effect on skeletal dysplasia and height gain than complete SHOX duplications. Importantly, no increase in SHOX copy number was identified in 340 individuals with normal stature or 104 overgrowth referrals. MLPA analysis of SHOX/PAR1 led to the identification of partial and complete SHOX duplications or multiple copies associated with LWD or ISS, suggesting that they may represent an additional class of mutations implicated in the molecular etiology of these clinical entities.
Sung, Chang Ohk; Choi, Chel Hun; Ko, Young-Hyeh; Ju, Hyunjeong; Choi, Yoon-La; Kim, Nyunsu; Kang, So Young; Ha, Sang Yun; Choi, Kyusam; Bae, Duk-Soo; Lee, Jeong-Won; Kim, Tae-Joong; Song, Sang Yong; Kim, Byoung-Gie
Ovarian clear cell adenocarcinoma (Ov-CCA) is a distinctive subtype of ovarian epithelial carcinoma. In this study, we performed array comparative genomic hybridization (aCGH) and paired gene expression microarray of 19 fresh-frozen samples and conducted integrative analysis. For the copy number alterations, significantly amplified regions (false discovery rate [FDR] q genes demonstrating frequent copy number alterations (>25% of samples) that correlated with gene expression (FDR genes were mainly located on 8p11.21, 8p21.2-p21.3, 8q22.1, 8q24.3, 17q23.2-q23.3, 19p13.3, and 19p13.11. Among the regions, 8q24.3 was found to contain the most genes (30 of 94 genes) including PTK2. The 8q24.3 region was indicated as the most significant region, as supported by copy number, GISTIC, and integrative analysis. Pathway analysis using differentially expressed genes on 8q24.3 revealed several major nodes, including PTK2. In conclusion, we identified a set of 94 candidate genes with frequent copy number alterations that correlated with gene expression. Specific chromosomal alterations, such as the 8q24.3 gain containing PTK2, could be a therapeutic target in a subset of Ov-CCAs. Copyright © 2013. Published by Elsevier Inc.
Junnila, Siina; Kokkola, Arto; Karjalainen-Lindsberg, Marja-Liisa; Puolakkainen, Pauli; Monni, Outi
Gastric cancer is one of the most common malignancies worldwide and the second most common cause of cancer related death. Gene copy number alterations play an important role in the development of gastric cancer and a change in gene copy number is one of the main mechanisms for a cancer cell to control the expression of potential oncogenes and tumor suppressor genes. To highlight genes of potential biological and clinical relevance in gastric cancer, we carried out a systematic array-based survey of gene expression and copy number levels in primary gastric tumors and gastric cancer cell lines and validated the results using an affinity capture based transcript analysis (TRAC assay) and real-time qRT-PCR. Integrated microarray analysis revealed altogether 256 genes that were located in recurrent regions of gains or losses and had at least a 2-fold copy number-associated change in their gene expression. The expression levels of 13 of these genes, ALPK2, ASAP1, CEACAM5, CYP3A4, ENAH, ERBB2, HHIPL2, LTB4R, MMP9, PERLD1, PNMT, PTPRA, and OSMR, were validated in a total of 118 gastric samples using either the qRT-PCR or TRAC assay. All of these 13 genes were differentially expressed between cancerous samples and nonmalignant tissues (p < 0.05) and the association between copy number and gene expression changes was validated for nine (69.2%) of these genes (p < 0.05). In conclusion, integrated gene expression and copy number microarray analysis highlighted genes that may be critically important for gastric carcinogenesis. TRAC and qRT-PCR analyses validated the microarray results and therefore the role of these genes as potential biomarkers for gastric cancer.
Indrasumunar, Arief; Wilde, Julia; Hayashi, Satomi; Li, Dongxue; Gresshoff, Peter M
Association between legumes and rhizobia results in the formation of root nodules, where symbiotic nitrogen fixation occurs. The early stages of this association involve a complex of signalling events between the host and microsymbiont. Several genes dealing with early signal transduction have been cloned, and one of them encodes the leucine-rich repeat (LRR) receptor kinase (SymRK; also termed NORK). The Symbiosis Receptor Kinase gene is required by legumes to establish a root endosymbiosis with Rhizobium bacteria as well as mycorrhizal fungi. Using degenerate primers and BAC sequencing, we cloned duplicated SymRK homeologues in soybean called GmSymRKα and GmSymRKβ. These duplicated genes share high nucleotide (96%) and amino acid (95%) sequence similarity. Sequence analysis predicted a malectin-like domain within the extracellular domain of both genes. Several putative cis-acting elements were found in promoter regions of GmSymRKα and GmSymRKβ, suggesting a participation in lateral root development, cell division and peribacteroid membrane formation. No SymRK mutant is available in soybean; therefore, to determine the functions of these genes, RNA interference (RNAi) of these duplicated genes was performed. For this purpose, an RNAi construct of each gene was generated and introduced into the soybean genome by Agrobacterium rhizogenes-mediated hairy root transformation. RNAi of the GmSymRKβ gene resulted in a greater reduction of nodulation and mycorrhizal infection than RNAi of GmSymRKα, suggesting that it has the major activity of the duplicated gene pair. The results from the important crop legume soybean confirm the joint phenotypic action of GmSymRK genes in both mycorrhizal and rhizobial infection seen in model legumes. Copyright © 2015 Elsevier GmbH. All rights reserved.
Dixit, Rahul; Naskar, Ruchira
In this work, we address the problem of region duplication or copy-move forgery detection in digital images, along with detection of geometric transforms (rotation and rescale) and postprocessing-based attacks (noise, blur, and brightness adjustment). Detection of region duplication, following conventional techniques, becomes more challenging when an intelligent adversary brings about such additional transforms on the duplicated regions. In this work, we utilize Fourier-Mellin transform with log-polar mapping and a color-based segmentation technique using K-means clustering, which help us to achieve invariance to all the above forms of attacks in copy-move forgery detection of digital images. Our experimental results prove the efficiency of the proposed method and its superiority to the current state of the art.
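The role of the log-polar mapping mentioned above can be illustrated with a minimal sketch. It is not the authors' detection pipeline (there is no Fourier transform, segmentation, or K-means step here); it only shows the geometric property such methods rely on: in log-polar coordinates, rotation and rescaling about the origin become plain shifts. The function names and the sample transform are hypothetical.

```python
import math

def to_log_polar(x, y):
    """Map a Cartesian point (relative to the origin) to (log-radius, angle)."""
    return math.log(math.hypot(x, y)), math.atan2(y, x)

def rotate_scale(x, y, angle, scale):
    """Rotate a point by `angle` radians about the origin, then scale it."""
    c, s = math.cos(angle), math.sin(angle)
    return scale * (c * x - s * y), scale * (s * x + c * y)

# Hypothetical transform applied to a duplicated region: 30 degrees, 1.5x.
angle, scale = math.radians(30), 1.5
x, y = 3.0, 1.0

rho0, theta0 = to_log_polar(x, y)
rho1, theta1 = to_log_polar(*rotate_scale(x, y, angle, scale))

d_rho = rho1 - rho0        # shift along the log-radius axis: log(scale)
d_theta = theta1 - theta0  # shift along the angle axis: the rotation angle
print(d_rho, math.log(scale), d_theta, angle)
```

Because both distortions reduce to translations, a translation-invariant descriptor computed on the log-polar grid (for example, a Fourier magnitude) is unaffected by them, which is the usual motivation for Fourier-Mellin features.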
Lucotte, Elise A; Skov, Laurits; Jensen, Jacob Malte
we explore the evolution of human X- and Y-linked ampliconic genes by investigating copy number variation (CNV) and coding variation between populations using the Simons Genome Diversity Project. We develop a method to assess CNVs using the read-depth on modified X and Y chromosome targets containing...... related Y haplogroups, that diversified less than 50,000 years ago. Moreover, X and Y-linked ampliconic genes seem to have a faster amplification dynamic than autosomal multicopy genes. Looking at expression data from another study, we also find that XY-linked ampliconic genes with extensive copy number...
Prolonged human interactions and artificial selection have influenced the genotypic and phenotypic diversity among dog breeds. Because humans and dogs occupy diverse habitats, ecological contexts have likely contributed to breed-specific positive selection. Prior to the advent of modern dog-feeding practices, there was likely substantial variation in dietary landscapes among disparate dog breeds. As such, we investigated one type of genetic variant, copy number variation, in three metabolic genes: glucokinase regulatory protein (GCKR), phytanoyl-CoA 2-hydroxylase (PHYH), and pancreatic α-amylase 2B (AMY2B). These genes code for proteins that are responsible for metabolizing dietary products that originate from distinctly different food types: sugar, meat, and starch, respectively. After surveying copy number variation among dogs with diverse dietary histories, we found no correlation between diet and positive selection in either GCKR or PHYH. Although it has been previously demonstrated that dogs experienced a copy number increase in AMY2B relative to wolves during or after the dog domestication process, we demonstrate that positive selection continued to act on amylase copy number in dog breeds that consumed starch-rich diets in time periods after domestication. Furthermore, we found that introgression with wolves is not responsible for deterioration of positive selection on AMY2B among diverse dog breeds. Together, this supports the hypothesis that the amylase copy number expansion is found universally in dogs.
Edberg Jeffrey C
Background: Copy number variations (CNVs) of the gene CC chemokine ligand 3-like 1 (CCL3L1) have been implicated in HIV-1 susceptibility, but the association has been inconsistent. CCL3L1 shares homology with a cluster of genes localized to chromosome 17q12, namely CCL3, CCL3L2, and CCL3L3. These genes are involved in host defense and inflammatory processes. Several CNV assays have been developed for the CCL3L1 gene. Findings: Through pairwise and multiple alignments of these genes, we have shown that the homology between these genes ranges from 50% to 99% in complete gene sequences and from 70% to 100% in the exonic regions, with CCL3L1 and CCL3L3 being identical. By use of MEGA 4 and BioEdit, we aligned sense primers, anti-sense primers, and probes used in several previously described assays against pre-multiple alignments of all four chemokine genes. Each set of probes and primers aligned and matched with overlapping sequences in at least two of the four genes, indicating that previously utilized RT-PCR based CNV assays are not specific for only CCL3L1. The four available assays measured median copies of 2 and 3-4 in European and African American individuals, respectively. The concordance between the assays ranged from 0.44 to 0.83, suggesting individual discordant calls and inconsistencies between the assays and the gene coverage expected from the known sequence. Conclusions: This indicates that some of the inconsistencies in the association studies could be due to assays that provide heterogeneous results. Sequence information to determine CNV of the three genes separately would make it possible to test whether their association with the pathogenesis of a human disease or phenotype is affected by an individual gene or by a combination of these genes.
Balakrishnan, Christopher N; Ekblom, Robert; Völker, Martin; Westerdahl, Helena; Godinez, Ricardo; Kotkiewicz, Holly; Burt, David W; Graves, Tina; Griffin, Darren K; Warren, Wesley C; Edwards, Scott V
Due to its high polymorphism and importance for disease resistance, the major histocompatibility complex (MHC) has been an important focus of many vertebrate genome projects. Avian MHC organization is of particular interest because the chicken Gallus gallus, the avian species with the best characterized MHC, possesses a highly streamlined minimal essential MHC, which is linked to resistance against specific pathogens. It remains unclear the extent to which this organization describes the situation in other birds and whether it represents a derived or ancestral condition. The sequencing of the zebra finch Taeniopygia guttata genome, in combination with targeted bacterial artificial chromosome (BAC) sequencing, has allowed us to characterize an MHC from a highly divergent and diverse avian lineage, the passerines. The zebra finch MHC exhibits a complex structure and history involving gene duplication and fragmentation. The zebra finch MHC includes multiple Class I and Class II genes, some of which appear to be pseudogenes, and spans a much more extensive genomic region than the chicken MHC, as evidenced by the presence of MHC genes on each of seven BACs spanning 739 kb. Cytogenetic (FISH) evidence and the genome assembly itself place core MHC genes on as many as four chromosomes with TAP and Class I genes mapping to different chromosomes. MHC Class II regions are further characterized by high endogenous retroviral content. Lastly, we find strong evidence of selection acting on sites within passerine MHC Class I and Class II genes. The zebra finch MHC differs markedly from that of the chicken, the only other bird species with a complete genome sequence. The apparent lack of synteny between TAP and the expressed MHC Class I locus is in fact reminiscent of a pattern seen in some mammalian lineages and may represent convergent evolution. Our analyses of the zebra finch MHC suggest a complex history involving chromosomal fission, gene duplication and translocation in the
Burt David W
Background: Due to its high polymorphism and importance for disease resistance, the major histocompatibility complex (MHC) has been an important focus of many vertebrate genome projects. Avian MHC organization is of particular interest because the chicken Gallus gallus, the avian species with the best characterized MHC, possesses a highly streamlined minimal essential MHC, which is linked to resistance against specific pathogens. It remains unclear the extent to which this organization describes the situation in other birds and whether it represents a derived or ancestral condition. The sequencing of the zebra finch Taeniopygia guttata genome, in combination with targeted bacterial artificial chromosome (BAC) sequencing, has allowed us to characterize an MHC from a highly divergent and diverse avian lineage, the passerines. Results: The zebra finch MHC exhibits a complex structure and history involving gene duplication and fragmentation. The zebra finch MHC includes multiple Class I and Class II genes, some of which appear to be pseudogenes, and spans a much more extensive genomic region than the chicken MHC, as evidenced by the presence of MHC genes on each of seven BACs spanning 739 kb. Cytogenetic (FISH) evidence and the genome assembly itself place core MHC genes on as many as four chromosomes with TAP and Class I genes mapping to different chromosomes. MHC Class II regions are further characterized by high endogenous retroviral content. Lastly, we find strong evidence of selection acting on sites within passerine MHC Class I and Class II genes. Conclusion: The zebra finch MHC differs markedly from that of the chicken, the only other bird species with a complete genome sequence. The apparent lack of synteny between TAP and the expressed MHC Class I locus is in fact reminiscent of a pattern seen in some mammalian lineages and may represent convergent evolution. Our analyses of the zebra finch MHC suggest a complex history involving
Neill, Nicholas J; Ballif, Blake C; Lamb, Allen N; Parikh, Sumit; Ravnan, J Britt; Schultz, Roger A; Torchia, Beth S; Rosenfeld, Jill A; Shaffer, Lisa G
Insertions occur when a segment of one chromosome is translocated and inserted into a new region of the same chromosome or a non-homologous chromosome. We report 71 cases with unbalanced insertions identified using array CGH and FISH in 4909 cases referred to our laboratory for array CGH and found to have copy-number abnormalities. Although the majority of insertions were non-recurrent, several recurrent unbalanced insertions were detected, including three der(Y)ins(Y;18)(q?11.2;p11.32p11.32)pat inherited from parents carrying an unbalanced insertion. The clinical significance of these recurrent rearrangements is unclear, although the small size, limited gene content, and inheritance pattern of each suggests that the phenotypic consequences may be benign. Cryptic, submicroscopic duplications were observed at or near the insertion sites in two patients, further confounding the clinical interpretation of these insertions. Using FISH, linear amplification, and array CGH, we identified a 126-kb duplicated region from 19p13.3 inserted into MECP2 at Xq28 in a patient with symptoms of Rett syndrome. Our results demonstrate that although the interpretation of most non-recurrent insertions is unclear without high-resolution insertion site characterization, the potential for an otherwise benign duplication to result in a clinically relevant outcome through the disruption of a gene necessitates the use of FISH to determine whether copy-number gains detected by array CGH represent tandem duplications or unbalanced insertions. Further follow-up testing using techniques such as linear amplification or sequencing should be used to determine gene involvement at the insertion site after FISH has identified the presence of an insertion.
Yuan, Yinyin; Curtis, Christina; Caldas, Carlos; Markowetz, Florian
Copy number aberrations are recognized to be important in cancer as they may localize to regions harboring oncogenes or tumor suppressors. Such genomic alterations mediate phenotypic changes through their impact on expression. Both cis- and trans-acting alterations are important since they may help to elucidate putative cancer genes. However, amidst numerous passenger genes, trans-effects are less well studied because of the computational difficulty of detecting weak and sparse signals in the data, even though they may influence multiple genes on a global scale. We propose an integrative approach to learn a sparse interaction network of DNA copy-number regions with their downstream transcriptional targets in breast cancer. With respect to goodness of fit on both simulated and real data, the performance of sparse network inference is no worse than other state-of-the-art models but with the advantage of simultaneous feature selection and efficiency. The DNA-RNA interaction network helps to distinguish copy-number driven expression alterations from those that are copy-number independent. Further, our approach yields a quantitative copy-number dependency score, which distinguishes cis- versus trans-effects. When applied to a breast cancer data set, numerous expression profiles were impacted by cis-acting copy-number alterations, including several known oncogenes such as GRB7, ERBB2, and LSM1. Several trans-acting alterations were also identified, impacting genes such as ADAM2 and BAGE, which warrant further investigation. An R package named lol is available from www.markowetzlab.org/software/lol.html.
Andersen, Gorm; Andersen, Birgit; Dobritzsch, D.
and related yeasts have two different genes/enzymes to apparently 'distinguish' between the two reactions in a single cell. It is likely that upon duplication similar to 200 million years ago, a specialized Uga1p evolved into a 'novel' transaminase enzyme with broader substrate specificity.......In humans, beta-alanine (BAL) and the neurotransmitter gamma-aminobutyrate (GABA) are transaminated by a single aminotransferase enzyme. Apparently, yeast originally also had a single enzyme, but the corresponding gene was duplicated in the Saccharomyces kluyveri lineage. SkUGA1 encodes a homologue...... to characterize the substrate specificity and kinetic parameters of the four enzymes. It was found that the cofactor pyridoxal 5'-phosphate is needed for enzymatic activity and alpha-ketoglutarate, and not pyruvate, as the amino group acceptor. SkPyd4p preferentially uses BAL as the amino group donor (V...
Kristensen, Lone Krøldrup; Kjaergaard, S; Kirchhoff, Marianne
with muscular hypertrophy and mildly retarded psychomotor development. Array-CGH identified a small duplication of 7q36.3 including the Sonic Hedgehog (SHH) gene in both the aborted foetus and the live born male sib. Neither of the parents carried the 7q36.3 duplication. The consequences of overexpression...
Xu, Xiu; Xu, Qiong; Zhang, Ying; Zhang, Xiaodi; Cheng, Tianlin; Wu, Bingbing; Ding, Yanhua; Lu, Ping; Zheng, Jingjing; Zhang, Min; Qiu, Zilong; Yu, Xiang
Autistic spectrum disorders (ASDs) are a family of neurodevelopmental disorders with strong genetic components. Recent studies have shown that copy number variations in dosage sensitive genes can contribute significantly to these disorders. One such gene is the transcription factor MECP2, whose loss of function in females results in Rett syndrome, while its duplication in males results in developmental delay and autism. Here, we identified a Chinese family with two brothers both inheriting a 2.2 Mb MECP2-containing duplication (151,369,305 - 153,589,577) from their mother. In addition, both brothers also had a 213.7 kb duplication on Chromosome 2, inherited from their father. The older brother also carried a 48.4 kb duplication on Chromosome 2 inherited from the mother, and a 8.2 kb deletion at 11q13.5 inherited from the father. Based on the published literature, MECP2 is the most autism-associated gene among the identified CNVs. Consistently, the boys displayed clinical features in common with other patients carrying MECP2 duplications, including intellectual disability, autism, lack of speech, slight hypotonia and unsteadiness of movement. They also had slight dysmorphic features including a depressed nose bridge, large ears and midface hypoplasia. Interestingly, they did not exhibit other clinical features commonly observed in American-European patients with MECP2 duplication, including recurrent respiratory infections and epilepsy. To our knowledge, this is the first identification and characterization of Chinese Han patients with MECP2-containing duplications. Further cases are required to determine if the above described clinical differences are due to individual variations or related to the genetic background of the patients.
Mental retardation (MR) is the most frequent handicap, affecting 3% of the general population. Genetic causes account for 40% of these cases. The ARX gene (Aristaless-related homeobox gene), located in Xp22.1, belongs to the homeobox gene family. It is considered the most frequently mutated gene after the FMR1 gene and is implicated in various forms of syndromic and nonsyndromic MR. Several types of mutation have been identified in this gene, including deletions/insertions, duplications, and missense and nonsense mutations, responsible for a wide spectrum of phenotypes. The goal of this work was to look for the most frequent ARX mutation, the 24 bp duplication (which causes an expansion of the polyalanine tract of the ARX protein at amino acid positions 144-155), among Tunisian boys presenting in particular with familial forms of nonspecific MR or sporadic forms of nonspecific MR, as well as certain patients presenting with West syndrome. To detect the 24 bp duplication, we used PCR. The 24 bp duplication was not found in our series; this could be explained by the low number of familial cases studied (38 families) and by the absence of linkage studies pointing to an X-linked mode of transmission, in particular for the sporadic cases. (Author)
Lorin, Thibault; Brunet, Frédéric G.; Laudet, Vincent; Volff, Jean-Nicolas
Vertebrate pigmentation is a highly diverse trait mainly determined by neural crest cell derivatives. It has been suggested that two rounds (1R/2R) of whole-genome duplications (WGDs) at the basis of vertebrates allowed changes in gene regulation associated with neural crest evolution. Subsequently, the teleost fish lineage experienced other WGDs, including the teleost-specific Ts3R before teleost radiation and the more recent Ss4R at the basis of salmonids. As the teleost lineage harbors the highest number of pigment cell types and pigmentation diversity in vertebrates, WGDs might have contributed to the evolution and diversification of the pigmentation gene repertoire in teleosts. We have compared the impact of the basal vertebrate 1R/2R duplications with that of the teleost-specific Ts3R and salmonid-specific Ss4R WGDs on 181 gene families containing genes involved in pigmentation. We show that pigmentation genes (PGs) have been globally more frequently retained as duplicates than other genes after Ts3R and Ss4R but not after the early 1R/2R. This is also true for non-pigmentary paralogs of PGs, suggesting that the function in pigmentation is not the sole key driver of gene retention after WGDs. On the long-term, specific categories of PGs have been repeatedly preferentially retained after ancient 1R/2R and Ts3R WGDs, possibly linked to the molecular nature of their proteins (e.g., DNA binding transcriptional regulators) and their central position in protein-protein interaction networks. Taken together, our results support a major role of WGDs in the diversification of the pigmentation gene repertoire in the teleost lineage, with a possible link with the diversity of pigment cell lineages observed in these animals compared to other vertebrates. PMID:29599177
Anthony R Isles
maternally expressed imprinted genes in the contribution of Copy Number Variants (CNVs) at this interval to the incidence of psychotic illness. This work will have tangible benefits for patients with 15q11.2-q13.3 duplications by aiding genetic counseling.
Isles, Anthony R.; Ingason, Andrés; Lowther, Chelsea; Gawlick, Micha; Stöber, Gerald; Potter, Harry; Georgieva, Lyudmila; Pizzo, Lucilla; Ozaki, Norio; Kushima, Itaru; Ikeda, Masashi; Iwata, Nakao; Levinson, Douglas F.; Gejman, Pablo V.; Shi, Jianxin; Sanders, Alan R.; Duan, Jubao; Sisodiya, Sanjay; Costain, Gregory; Degenhardt, Franziska; Giegling, Ina; Rujescu, Dan; Hreidarsson, Stefan J.; Saemundsen, Evald; Ahn, Joo Wook; Ogilvie, Caroline; Stefansson, Hreinn; Stefansson, Kari; O’Donovan, Michael C.; Owen, Michael J.; Bassett, Anne; Kirov, George
expressed imprinted genes in the contribution of Copy Number Variants (CNVs) at this interval to the incidence of psychotic illness. This work will have tangible benefits for patients with 15q11.2-q13.3 duplications by aiding genetic counseling. PMID:27153221
Background Miniature inverted-repeat transposable elements (MITEs) are expected to play important roles in the evolution of genes and genomes in plants, especially in highly duplicated plant genomes. Various MITE families and their roles in plants have been characterized. However, there have been fewer studies of MITE families and their potential roles in the evolution of the recently triplicated Brassica genome. Results We identified a new MITE family, BRAMI-1, belonging to the Stowaway super-family in the Brassica genome. In silico mapping revealed that 697 members are dispersed throughout the euchromatic regions of the B. rapa pseudo-chromosomes. Among them, 548 members (78.6%) are located in gene-rich regions, less than 3 kb from genes. In addition, we identified 516 and 15 members in the 470 Mb and 15 Mb genomic shotgun sequences currently available for B. oleracea and B. napus, respectively. The resulting estimated copy numbers for the entire genomes were 1440, 1464 and 2490 in B. rapa, B. oleracea and B. napus, respectively. By comparison, only 70 members of the related Arabidopsis ATTIRTA-1 MITE family were identified in the Arabidopsis genome. Phylogenetic analysis revealed that BRAMI-1 elements proliferated in the Brassica genus after divergence from the Arabidopsis lineage. MITE insertion polymorphism (MIP) was inspected for 50 BRAMI-1 members, revealing high levels of insertion polymorphism between and within species of Brassica that clarify BRAMI-1 activation periods up to the present. Comparative analysis of the 71 genes harbouring the BRAMI-1 elements with their non-insertion paralogs (NIPs) showed that the BRAMI-1 insertions mainly reside in non-coding sequences and that the expression levels of genes with the elements differ from those of their NIPs. Conclusion A Stowaway family MITE, named BRAMI-1, was gradually amplified and remains present in more than 1400 copies in each of three Brassica species. Overall, 78% of the members were identified in
Segall-Shapiro, Thomas H; Sontag, Eduardo D; Voigt, Christopher A
The internal environment of growing cells is variable and dynamic, making it difficult to introduce reliable parts, such as promoters, for genetic engineering. Here, we applied control-theoretic ideas to design promoters that maintained constant levels of expression at any copy number. Theory predicts that independence to copy number can be achieved by using an incoherent feedforward loop (iFFL) if the negative regulation is perfectly non-cooperative. We engineered iFFLs into Escherichia coli promoters using transcription-activator-like effectors (TALEs). These promoters had near-identical expression in different genome locations and plasmids, even when their copy number was perturbed by genomic mutations or changes in growth medium composition. We applied the stabilized promoters to show that a three-gene metabolic pathway to produce deoxychromoviridans could retain function without re-tuning when the stabilized-promoter-driven genes were moved from a plasmid into the genome.
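The prediction in the abstract above, that an incoherent feedforward loop (iFFL) cancels copy-number dependence when repression is non-cooperative, can be illustrated with a toy steady-state model. The sketch below is an assumption made for illustration and is not the authors' model: the repressor level is taken to be proportional to copy number, the output promoter is repressed through a Hill function, and the function names and parameter values (`alpha`, `beta`, `k`, `hill`) are all hypothetical.

```python
def iffl_output(copy_number, alpha=50.0, beta=100.0, k=1.0, hill=1.0):
    """Toy steady-state output of a promoter inside an iFFL.

    Assumed model (illustrative only):
      repressor = alpha * copy_number
      output    = beta * copy_number / (1 + (repressor / k) ** hill)
    With hill == 1 (non-cooperative repression) and repressor >> k,
    the copy_number terms cancel and the output approaches beta * k / alpha.
    """
    repressor = alpha * copy_number
    return beta * copy_number / (1.0 + (repressor / k) ** hill)


def unregulated_output(copy_number, beta=100.0):
    """Output of a plain promoter: expression scales with copy number."""
    return beta * copy_number


if __name__ == "__main__":
    # Compare how each design responds as copy number rises from 1 to 100.
    for c in (1, 5, 20, 100):
        print(
            c,
            round(unregulated_output(c), 1),
            round(iffl_output(c), 4),            # non-cooperative, nearly flat
            round(iffl_output(c, hill=2.0), 6),  # cooperative, falls with c
        )
```

In this toy model the non-cooperative case stays close to `beta * k / alpha` across a hundredfold change in copy number, while a Hill coefficient of 2 over-compensates and the output falls as copy number rises, which is consistent with the requirement that the negative regulation be non-cooperative.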
Costain, Gregory; Merico, Daniele; Migita, Ohsuke; Liu, Ben; Yuen, Tracy; Rickaby, Jessica; Thiruvahindrapuram, Bhooma; Marshall, Christian R.; Scherer, Stephen W.; Bassett, Anne S.
Structural genetic changes, especially copy number variants (CNVs), represent a major source of genetic variation contributing to human disease. Tetralogy of Fallot (TOF) is the most common form of cyanotic congenital heart disease, but to date little is known about the role of CNVs in the etiology of TOF. Using high-resolution genome-wide microarrays and stringent calling methods, we investigated rare CNVs in a prospectively recruited cohort of 433 unrelated adults with TOF and/or pulmonary atresia at a single centre. We excluded those with recognized syndromes, including 22q11.2 deletion syndrome. We identified candidate genes for TOF based on converging evidence between rare CNVs that overlapped the same gene in unrelated individuals and from pathway analyses comparing rare CNVs in TOF cases to those in epidemiologic controls. Even after excluding the 53 (10.7%) subjects with 22q11.2 deletions, we found that adults with TOF had a greater burden of large rare genic CNVs compared to controls (8.82% vs. 4.33%, p = 0.0117). Six loci showed evidence for recurrence in TOF or related congenital heart disease, including typical 1q21.1 duplications in four (1.18%) of 340 Caucasian probands. The rare CNVs implicated novel candidate genes of interest for TOF, including PLXNA2, a gene involved in semaphorin signaling. Independent pathway analyses highlighted developmental processes as potential contributors to the pathogenesis of TOF. These results indicate that individually rare CNVs are collectively significant contributors to the genetic burden of TOF. Further, the data provide new evidence for dosage sensitive genes in PLXNA2-semaphorin signaling and related developmental processes in human cardiovascular development, consistent with previous animal models. PMID:22912587
Fadista, João; Thomsen, Bo; Holm, Lars-Erik
to genetic variation in cattle. Results We designed and used a set of NimbleGen CGH arrays that tile across the assayable portion of the cattle genome with approximately 6.3 million probes, at a median probe spacing of 301 bp. This study reports the highest resolution map of copy number variation...... in the cattle genome, with 304 CNV regions (CNVRs) being identified among the genomes of 20 bovine samples from 4 dairy and beef breeds. The CNVRs identified covered 0.68% (22 Mb) of the genome, and ranged in size from 1.7 to 2,031 kb (median size 16.7 kb). About 20% of the CNVs co-localized with segmental...... duplications, while 30% encompass genes, of which the majority is involved in environmental response. About 10% of the human orthologous of these genes are associated with human disease susceptibility and, hence, may have important phenotypic consequences. Conclusions Together, this analysis provides a useful...
Background: Colorectal cancer (CRC) is one of the most frequently occurring cancers in Japan, and thus a wide range of methods have been deployed to study the molecular mechanisms of CRC. In this study, we performed a comprehensive analysis of CRC, incorporating copy number aberration (CNA) and gene expression data. For the last four years, we have been collecting data from CRC cases and organizing the information as an “omics” study by integrating many kinds of analysis into a single comprehensive investigation. In our previous studies, we had experienced difficulty in finding genes related to CRC, as we observed higher noise levels in the expression data than in the data for other cancers. Because chromosomal aberrations are often observed in CRC, here we have performed a combination of CNA analysis and expression analysis in order to identify new genes responsible for CRC. This study was performed as part of the Clinical Omics Database Project at Tokyo Medical and Dental University. The purpose of this study was to investigate the mechanism of genetic instability in CRC by this combination of expression analysis and CNA analysis, and to establish a new method for the diagnosis and treatment of CRC. Materials and methods: Comprehensive gene expression analysis was performed on 79 CRC cases using an Affymetrix Gene Chip, and comprehensive CNA analysis was performed using an Affymetrix DNA Sty array. To avoid the contamination of cancer tissue with normal cells, laser micro-dissection was performed before DNA/RNA extraction. Data analysis was performed using original software written in the R language. Result: We observed a high percentage of CNA in colorectal cancer, including copy number gains at 7, 8q, 13 and 20q, and copy number losses at 8p, 17p and 18. Gene expression analysis provided many candidates for CRC-related genes, but their association with CRC did not reach the level of statistical significance. The combination of CNA and gene
Dehal, Paramvir; Boore, Jeffrey L.
The hypothesis that the relatively large and complex vertebrate genome was created by two ancient, whole genome duplications has been hotly debated, but remains unresolved. We reconstructed the evolutionary relationships of all gene families from the complete gene sets of a tunicate, fish, mouse, and human, then determined when each gene duplicated relative to the evolutionary tree of the organisms. We confirmed the results of earlier studies that there remains little signal of these events in numbers of duplicated genes, gene tree topology, or the number of genes per multigene family. However, when we plotted the genomic map positions of only the subset of paralogous genes that were duplicated prior to the fish-tetrapod split, their global physical organization provides unmistakable evidence of two distinct genome duplication events early in vertebrate evolution indicated by clear patterns of 4-way paralogous regions covering a large part of the human genome. Our results highlight the potential for these large-scale genomic events to have driven the evolutionary success of the vertebrate lineage.
Sakudoh, Takashi; Nakashima, Takeharu; Kuroki, Yoko; Fujiyama, Asao; Kohara, Yuji; Honda, Naoko; Fujimoto, Hirofumi; Shimada, Toru; Nakagaki, Masao; Banno, Yutaka; Tsuchida, Kozo
The carotenoid-binding protein (CBP) of the domesticated silkworm, Bombyx mori, a major determinant of cocoon color, is likely to have been substantially influenced by domestication of this species. We analyzed the structure of the CBP gene in multiple strains of B. mori, in multiple individuals of the wild silkworm, B. mandarina (the putative wild ancestor of B. mori), and in a number of other lepidopterans. We found the CBP gene copy number in genomic DNA to vary widely among B. mori strains, ranging from 1 to 20. The copies of CBP are of several types, based on the presence of a retrotransposon or partial deletion of the coding sequence. In contrast to B. mori, B. mandarina was found to possess a single copy of CBP without the retrotransposon insertion, regardless of habitat. Several other lepidopterans were found to contain sequences homologous to CBP, revealing that this gene is evolutionarily conserved in the lepidopteran lineage. Thus, domestication can generate significant diversity of gene copy number and structure over a relatively short evolutionary time. © 2011 by the Genetics Society of America
Abstract Background Deletions and duplications of the PAFAH1B1 and YWHAE genes in 17p13.3 are associated with different clinical phenotypes. In particular, deletion of PAFAH1B1 causes isolated lissencephaly, while deletions involving both PAFAH1B1 and YWHAE cause Miller-Dieker syndrome. Isolated duplications of PAFAH1B1 have been associated with mild developmental delay and hypotonia, while isolated duplications of YWHAE have been associated with autism. In particular, different dysmorphic features associated with PAFAH1B1 or YWHAE duplication have suggested the need to classify patient clinical features into two groups according to which gene is involved in the chromosomal duplication. Methods We analyzed the proband and his family by classical cytogenetic and array-CGH analyses. The putative rearrangement was confirmed by fluorescence in situ hybridization (FISH). Results Using FISH and array-CGH, we identified a family segregating a 17p13.3 duplication that extends 329.5 kilobases and involves the YWHAE gene, but not PAFAH1B1; the proband is affected by a mild dysmorphic phenotype with associated autism and mental retardation. We propose that the BHLHA9, YWHAE, and CRK genes contribute to the phenotype of our patient. The small chromosomal duplication was inherited from his mother, who was affected by bipolar and borderline disorder and was addicted to alcohol. Conclusions We report an additional familial case of a small 17p13.3 chromosomal duplication including only the BHLHA9, YWHAE, and CRK genes. Our observation, together with further cases with similar microduplications that are expected to be diagnosed, will help to better characterise the clinical spectrum of phenotypes associated with 17p13.3 microduplications.
Copy number variations (CNVs) refer to large insertions, deletions and duplications in the genomic structure, ranging from one thousand to several million bases in size. Since the development of next generation sequencing technology, several methods have been developed for detecting copy number variations with high credibility and accuracy. Evidence has shown that CNVs occurring in gene regions can lead to phenotypic changes through alterations in gene structure and dosage. However, it remains unexplored whether CNVs underlie the phenotypic differences between Chinese and Western domestic pigs. Using read-depth methods, we investigated copy number variations in 49 individuals derived from both Chinese and Western pig breeds. A total of 3,131 copy number variation regions (CNVRs) with an average size of 13.4 kb, harboring 1,363 genes, were identified across all individuals. Among them, 129 and 147 CNVRs were Chinese and Western pig specific, respectively. Gene functional enrichments revealed that these CNVRs contribute to strong disease resistance and high prolificacy in Chinese domestic pigs, but strong muscle tissue development in Western domestic pigs. This finding is strongly consistent with the morphologic characteristics of Chinese and Western pigs, indicating that these group-specific CNVRs might have been preserved by artificial selection for the favored phenotypes during independent domestication of Chinese and Western pigs. In this study, we built high-resolution CNV maps in several domestic pig breeds and discovered the group-specific CNVs by comparing Chinese and Western pigs, which could provide new insight into genomic variations during pigs' independent domestication, and facilitate further functional studies of CNV-associated genes.
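The abstract above names read-depth methods without describing them. As a rough illustration of the general idea only, and not the pipeline used in that study, the sketch below compares the sequencing depth in each genomic window with the sample-wide median and flags windows that look like gains or losses. The function name `call_cnv_windows`, the input layout (one mean depth per fixed-size window) and the ratio thresholds of 1.5 and 0.5 (roughly three copies and one copy in a diploid genome) are assumptions chosen for the example.

```python
from statistics import median


def call_cnv_windows(depths, gain=1.5, loss=0.5):
    """Label each window as 'gain', 'loss' or 'normal' from its read depth.

    depths: mean read depth per fixed-size genomic window (hypothetical input).
    gain/loss: illustrative depth-ratio thresholds relative to the median.
    """
    if not depths:
        raise ValueError("depths must not be empty")
    baseline = median(depths)  # assumed to represent the normal two-copy depth
    if baseline <= 0:
        raise ValueError("median depth must be positive")

    calls = []
    for depth in depths:
        ratio = depth / baseline
        if ratio >= gain:
            calls.append("gain")
        elif ratio <= loss:
            calls.append("loss")
        else:
            calls.append("normal")
    return calls


if __name__ == "__main__":
    example_depths = [30, 31, 29, 62, 60, 30, 14, 15, 30]
    print(call_cnv_windows(example_depths))
```

Real read-depth callers additionally correct for GC content and mappability and merge adjacent windows into regions; those steps are omitted here for brevity.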
Marandel, Lucie; Seiliez, Iban; Véron, Vincent; Skiba-Cassy, Sandrine; Panserat, Stéphane
The rainbow trout (Oncorhynchus mykiss) is considered to be a strictly carnivorous fish species that is metabolically adapted for high catabolism of proteins and low utilization of dietary carbohydrates. This species consequently has a "glucose-intolerant" phenotype manifested by persistent hyperglycemia when fed a high-carbohydrate diet. Gluconeogenesis in adult fish is also poorly, if ever, regulated by carbohydrates, suggesting that this metabolic pathway is involved in this specific phenotype. In this study, we hypothesized that the fate of duplicated genes after the salmonid-specific 4th whole genome duplication (Ss4R) may have led to adaptive innovation and that their study might provide new elements to enhance our understanding of gluconeogenesis and poor dietary carbohydrate use in this species. Our evolutionary analysis of gluconeogenic genes revealed that pck1, pck2, fbp1a, and g6pca were retained as singletons after Ss4R, while g6pcb1, g6pcb2, and fbp1b ohnolog pairs were maintained. For all genes, duplication may have led to sub- or neofunctionalization. Expression profiles suggest that the gluconeogenesis pathway remained active in trout fed a no-carbohydrate diet. When trout were fed a high-carbohydrate diet (30%), most of the gluconeogenic genes were either not regulated or downregulated, except for the g6pcb2 ohnologs, whose RNA levels were surprisingly increased. This study demonstrates that Ss4R in trout involved adaptive innovation via gene duplication and via the outcome of the resulting ohnologs. Indeed, maintenance of the ohnologous g6pcb2 pair may contribute in a significant way to the glucose-intolerant phenotype of trout and may partially explain its poor use of dietary carbohydrates. Copyright © 2015 the American Physiological Society.
Eichler Evan E
Abstract Background It has been suggested that chromosomal rearrangements harbor the molecular footprint of the biological phenomena which they induce, in the form, for instance, of changes in the sequence divergence rates of linked genes. So far, all the studies of these potential associations have focused on the relationship between structural changes and the rates of evolution of single-copy DNA and have tried to exclude segmental duplications (SDs). This is paradoxical, since SDs are one of the primary forces driving the evolution of structure and function in our genomes and have been linked not only with novel genes acquiring new functions, but also with overall higher DNA sequence divergence and major chromosomal rearrangements. Results Here we take the opposite view and focus on SDs. We analyze several of the features of SDs, including the rates of intraspecific divergence between paralogous copies of human SDs and of interspecific divergence between human SDs and chimpanzee DNA. We study how divergence measures relate to chromosomal rearrangements, while considering other factors that affect evolutionary rates in single-copy DNA. Conclusion We find that interspecific SD divergence behaves similarly to divergence of single-copy DNA. In contrast, old and recent paralogous copies of SDs do present different patterns of intraspecific divergence. Also, we show that some relatively recent SDs accumulate in regions that carry inversions in sister lineages.
Irina S. Kolesnikova
Sep 1, 2017 ... Asia R. Shorina d, Alexander S. Graphodatsky a, Ekaterina M. Galanina b, Dmitry V. Yudkin a,b,* ... rRNA gene copy numbers on affected acrocentric chromosomes in .... estimated using MS Excel software (Microsoft, USA).
Copy number variations (CNVs), important genetic factors for the study of human diseases, may have as large an effect on phenotype as single nucleotide polymorphisms. Indeed, it is widely accepted that CNVs are associated with differential disease susceptibility. However, the relationships between CNVs and gene expression have not been characterized in the horse. In this study, we investigated the effects of copy number deletion in the blood and muscle transcriptomes of Thoroughbred racing horses. We identified a total of 1,246 deletion-polymorphism CNVs using DNA re-sequencing data from 18 Thoroughbred racing horses. To discover trends between CNV status and gene expression levels, we extracted the CNVs of the four Thoroughbred racing horses for which RNA sequencing data were available. We found that 252 pairs of CNVs and genes were associated in the four horse samples. We did not observe a clear and consistent relationship between the deletion status of CNVs and gene expression levels before and after exercise in blood and muscle. However, we found some pairs of CNVs and associated genes that indicated relationships with gene expression levels: a positive relationship with genes responsible for membrane structure or cytoskeleton and a negative relationship with genes involved in disease. This study will lead to conceptual advances in understanding the relationship between CNVs and global gene expression in the horse.
Challacombe, Jean F [Los Alamos National Laboratory; Eichorst, Stephanie A [Los Alamos National Laboratory; Xie, Gary [Los Alamos National Laboratory; Kuske, Cheryl R [Los Alamos National Laboratory; Hauser, Loren [ORNL; Land, Miriam [ORNL
Bacterial genome sizes range from ca. 0.5 to 10 Mb and are influenced by gene duplication, horizontal gene transfer, gene loss and other evolutionary processes. Sequenced genomes of strains in the phylum Acidobacteria revealed that 'Solibacter usitatus' strain Ellin6076 harbors a 9.9 Mb genome. This large genome appears to have arisen by horizontal gene transfer via ancient bacteriophage and plasmid-mediated transduction, as well as widespread small-scale gene duplications. This has resulted in an increased number of paralogs that are potentially ecologically important (ecoparalogs). Low amino acid sequence identities among functional group members and lack of conserved gene order and orientation in the regions containing similar groups of paralogs suggest that most of the paralogs were not the result of recent duplication events. The genome sizes of cultured subdivision 1 and 3 strains in the phylum Acidobacteria were estimated using pulsed-field gel electrophoresis to determine the prevalence of the large genome trait within the phylum. Members of subdivision 1 were estimated to have smaller genome sizes ranging from ca. 2.0 to 4.8 Mb, whereas members of subdivision 3 had slightly larger genomes, from ca. 5.8 to 9.9 Mb. It is hypothesized that the large genome of strain Ellin6076 encodes traits that provide a selective metabolic, defensive and regulatory advantage in the variable soil environment.
Bazrafshani, Mohammad Reza R; Nowshadi, Pouriaali A; Shirian, Sadegh; Daneshbod, Yahya; Nabipour, Fatemeh; Mokhtari, Maral; Hosseini, Fatemehsadat; Dehghan, Somayeh; Saeedzadeh, Abolfazl; Mosayebi, Ziba
Bladder cancer is a molecular disease driven by the accumulation of genetic, epigenetic, and environmental factors. The aim of this study was to detect deletion/duplication mutations in TP53 gene exons using the multiplex ligation-dependent probe amplification (MLPA) method in patients with transitional cell carcinoma (TCC). The archived formalin-fixed paraffin-embedded tissues from 60 patients with TCC of the bladder were screened for exonic deletions or duplications in each of the 12 TP53 gene exons using MLPA. The pathological sections were examined by three pathologists and categorized according to the WHO scoring guideline as 18 (30%) grade I, 22 (37%) grade II, 13 (22%) grade III, and 7 (11%) grade IV cases of TCC. No TP53 mutations were detected in 24 (40%) of the patients. Among the 60 samples, the mutations observed comprised 15 (25%) deletions, 17 (28%) duplications, and 4 (7%) cases with both deletion and duplication. Of the 12 exons of the TP53 gene, exon 1 was the most subject to exonic deletion: deletion of exon 1 occurred in 11 (35.4%) patients with TCC. In general, most mutations of TP53, either deletion or duplication, were found in exon 1, which was statistically significant. In addition, no relation between TCC tumor grade and any type of mutation was observed in this research. MLPA is a simple and efficient method to analyze genomic deletions and duplications of all 12 exons of the TP53 gene. The finding of this report that most TP53 mutations occur in exon 1 contrasts with other reports suggesting that exons 5-8 are the most frequently mutated exons of the TP53 gene. Mutations of exon 1 of the TP53 gene may play an important role in the tumorigenesis of TCC. © 2015 The Authors. Cancer Medicine published by John Wiley & Sons Ltd.
Tang, Wei; Newton, Ronald J; Weidner, Douglas A
An efficient transgenic eastern white pine (Pinus strobus L.) plant regeneration system has been established using Agrobacterium tumefaciens strain GV3850-mediated transformation and the green fluorescent protein (gfp) gene as a reporter in this investigation. Stable integration of transgenes in the plant genome of pine was confirmed by polymerase chain reaction (PCR), Southern blot, and northern blot analyses. Transgene expression was analysed in pine T-DNA transformants carrying different numbers of copies of T-DNA insertions. Post-transcriptional gene silencing (PTGS) was mostly obtained in transgenic lines with more than three copies of T-DNA, but not in transgenic lines with one copy of T-DNA. In situ hybridization chromosome analysis of transgenic lines demonstrated that silenced transgenic lines had two or more T-DNA insertions in the same chromosome. These results suggest that two or more T-DNA insertions in the same chromosome facilitate efficient gene silencing in transgenic pine cells expressing green fluorescent protein. There were no differences in shoot differentiation and development between transgenic lines with multiple T-DNA copies and transgenic lines with one or two T-DNA copies.
Fungi that have the enzymes cyanase and carbonic anhydrase show a limited capacity to detoxify cyanate, a fungicide employed by both plants and humans. Here, we describe a novel two-gene cluster that comprises duplicated cyanase and carbonic anhydrase copies, which we name the CCA gene cluster, trac...
Chen, Y; Solursh, M
The Msx-1 gene (formerly known as Hox-7) is a member of a discrete subclass of homeobox-containing genes. Examination of the expression pattern of Msx-1 in murine and avian embryos suggests that this gene may be involved in the regionalization of the medio-lateral axis during early development. We have examined the possible functions of Xenopus Msx-1 during early Xenopus embryonic development by overexpression of the Msx-1 gene. Overexpression of Msx-1 causes a left-right mirror-image duplication of primary axial structures, including notochord, neural tube, somites, suckers, and foregut. The developing embryonic heart is also mirror-image duplicated, including looping direction and polarity. These results indicate that Msx-1 may be involved in mesoderm formation as well as left-right patterning in early Xenopus embryonic development.
We describe a neonatal patient with biliary ductopenia featuring duplication of exon 6 of the JAG1 gene. Facial alterations were observed, consisting of a prominent forehead, sunken eyes, upward slanting palpebral fissures, hypertelorism, flat nasal root and prominent chin. From birth, these were accompanied by the development of haematuria and renal failure and by renal Doppler findings indicative of peripheral renal artery stenosis. JAG1 gene mutations on chromosome 20 have been associated with various anomalies, including biliary cholestasis, vertebral abnormalities, eye disorders, heart defects and facial dysmorphia. This syndrome, first described by Alagille, is an infrequent congenital disorder with autosomal dominant inheritance and variable expressivity. Anatomopathological effects include the destruction and disappearance of hepatic bile ducts (ductopenia). The duplication of exon 6 of JAG1 has not previously been described as an alteration related to Alagille syndrome with peripheral renal artery stenosis.
Peng Jingjing; Cai Chao; Qiao Min; Li Hong; Zhu Yongguan
This study investigates the dynamics of pyrene degradation rates, microbial communities, and functional gene copy numbers during the incubation of pyrene-spiked soils. Spiking the soil with pyrene was found to have negligible effects on the bacterial community present. Our results demonstrated that there was a significant difference in nidA gene copy numbers between sampling dates in QZ soil. Mycobacterium 16S rDNA clone libraries showed that more than 90% of the mycobacteria detected in pyrene-spiked soil were closely related to fast-growing PAH-degrading Mycobacterium, while other sequences related to slow-growing Mycobacterium were detected only in the control soil. It is suggested that nidA gene copy number and fast-growing PAH-degrading Mycobacterium could be used as indicators to predict pyrene contamination and its degradation activity in soils.
Carlson Sara E
Background CYCLOIDEA (CYC)-like genes have been implicated in the development of capitulum inflorescences (i.e. flowering heads) in Asteraceae, where many small flowers (florets) are packed tightly into an inflorescence that resembles a single flower. Several rounds of duplication of CYC-like genes have occurred in Asteraceae, and this is hypothesized to be correlated with the evolution of the capitulum, which in turn has been implicated in the evolutionary success of the group. We investigated the evolution of CYC-like genes in Dipsacaceae (Dipsacales), a plant clade in which capitulum inflorescences originated independently of Asteraceae. Two main inflorescence types are present in Dipsacaceae: (1) radiate species contain two kinds of floret within the flowering head (disk and ray), and (2) discoid species contain only disk florets. To test whether a dynamic pattern of gene duplication, similar to that documented in Asteraceae, is present in Dipsacaceae, and whether these patterns are correlated with different inflorescence types, we inferred a CYC-like gene phylogeny for Dipsacaceae based on representative species from the major lineages. Results We recovered within Dipsacaceae the three major forms of CYC-like genes that have been found in most core eudicots, and identified several additional duplications within each of these clades. We found that the number of CYC-like genes in Dipsacaceae is similar to that reported for members of Asteraceae and that the same gene lineages (CYC1-like and CYC2B-like genes) have duplicated in a similar fashion independently in both groups. The number of CYC-like genes recovered for radiate versus discoid species differed, with discoid species having fewer copies of CYC1-like and CYC2B-like genes. Conclusions CYC-like genes have undergone extensive duplication in Dipsacaceae, with radiate species having more copies than discoid species, suggesting a potential role for these genes in the evolution of disk and
Background Progranulin is an epithelial tissue growth factor (also known as proepithelin, acrogranin and PC-cell-derived growth factor) that has been implicated in development, wound healing and in the progression of many cancers. The single mammalian progranulin gene encodes a glycoprotein precursor consisting of seven and one half tandemly repeated non-identical copies of the cystine-rich granulin motif. A genome-wide duplication event hypothesized to have occurred at the base of the teleost radiation predicts that mammalian progranulin may be represented by two co-orthologues in zebrafish. Results The cDNAs encoding two zebrafish granulin precursors, progranulins-A and -B, were characterized and found to contain 10 and 9 copies of the granulin motif respectively. The cDNAs and genes encoding the two forms of granulin, progranulins-1 and -2, were also cloned and sequenced. Both latter peptides were found to be encoded by precursors with a simplified architecture consisting of one and one half copies of the granulin motif. A cDNA encoding a chimeric progranulin which likely arises through the mechanism of trans-splicing between grn1 and grn2 was also characterized. A non-coding RNA gene with antisense complementarity to both grn1 and grn2 was identified which may have functional implications with respect to gene dosage, as well as in restricting the formation of the chimeric form of progranulin. Chromosomal localization of the four progranulin (grn) genes reveals syntenic conservation for grna only, suggesting that it is the true orthologue of mammalian grn. RT-PCR and whole-mount in situ hybridization analysis of zebrafish grns during development reveals that combined expression of grna and grnb, but not grn1 and grn2, recapitulates many of the expression patterns observed for the murine counterpart. This includes maternal deposition, widespread central nervous system distribution and specific localization within the epithelial
Garcia-Longoria, Luz; Hellgren, Olof; Bensch, Staffan
Malaria parasites need to synthesize chitinase in order to cross the peritrophic membrane, which is formed around the mosquito midgut, and complete their life cycle. In mammalian malaria species, the chitinase gene comprises either a long or a short copy. In the avian malaria parasite Plasmodium gallinaceum both copies are present, suggesting that a gene duplication in the ancestor of these extant species preceded the loss of either the long or the short copy in Plasmodium parasites of mammals. Plasmodium gallinaceum is not the most widespread and harmful parasite of birds. This study is the first to search for and identify the chitinase gene in one of the most prevalent avian malaria parasites, Plasmodium relictum. Both copies of P. gallinaceum chitinase were used as reference sequences for primer design. Sequences from different Plasmodium spp. were used to build the phylogenetic tree of the chitinase gene. The gene encoding chitinase was identified in isolates of two mitochondrial lineages of P. relictum (SGS1 and GRW4). The chitinase found in these two lineages consists of both the long (PrCHT1) and the short (PrCHT2) copy. The genetic differences found in the long copy of the chitinase gene between SGS1 and GRW4 were greater than the difference observed for the cytochrome b gene. The identification of both copies in P. relictum sheds light on the phylogenetic relationships of the chitinase gene in the genus Plasmodium. Due to its high variability, the chitinase gene could be used to study the genetic population structure in isolates from different host species and geographic regions.
Lagerström-Fermér, M; Sundvall, M; Johnsen, E; Warne, G L; Forrest, S M; Zajac, J D; Rickards, A; Ravine, D; Landegren, U; Pettersson, U
We present a linkage analysis and a clinical update on a previously reported family with X-linked recessive panhypopituitarism, now in its fourth generation. Affected members exhibit variable degrees of hypopituitarism and mental retardation. The markers DXS737 and DXS1187 in the q25-q26 region of the X chromosome showed evidence for linkage with a peak LOD score (Zmax) of 4.12 at zero recombination fraction (theta(max) = 0). An apparent extra copy of the marker DXS102, observed in the region of the disease gene in affected males and heterozygous carrier females, suggests that a segment including this marker is duplicated. The gene causing this disorder appears to code for a dosage-sensitive protein central to development of the pituitary. PMID:9106538
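To make the reported LOD score easier to read, the following is a minimal sketch of the textbook two-point LOD calculation for phase-known, fully informative meioses. It is not the pedigree analysis used in the study; the function name `lod_score` and its parameters are assumptions introduced here for illustration only.

```python
import math


def lod_score(theta: float, n_meioses: int, n_recombinants: int) -> float:
    """Two-point LOD score for phase-known, fully informative meioses.

    Hypothetical helper: compares the likelihood at recombination
    fraction `theta` with the likelihood under free recombination (0.5).
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    n, k = n_meioses, n_recombinants
    if theta == 0.0:
        # At theta = 0 a single observed recombinant excludes linkage.
        if k > 0:
            return float("-inf")
        log_l_theta = 0.0
    else:
        log_l_theta = k * math.log10(theta) + (n - k) * math.log10(1.0 - theta)
    log_l_null = n * math.log10(0.5)
    return log_l_theta - log_l_null
```

Under these simplifying assumptions, each nonrecombinant meiosis contributes log10(2), about 0.30, to the LOD score at theta = 0, so a score above the conventional threshold of 3 requires at least ten such meioses. Real pedigrees with unknown phase or partially informative markers need dedicated linkage software.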
Background/Aim. Spinal muscular atrophy (SMA) is an autosomal recessive disease characterized by degeneration of alpha motor neurons in the spinal cord and the medulla oblongata, causing progressive muscle weakness and atrophy. The aim of this study was to determine the association between SMN2 gene copy number and disease phenotype in Serbian patients with SMA with homozygous deletion of exon 7 of the SMN1 gene. Methods. The patients were identified using regional Serbian hospital databases. The clinical characteristics investigated were: patients’ gender, age at disease onset, achieved and current developmental milestones, disease duration, current age, and the presence of spinal deformities and joint contractures. The number of SMN1 and SMN2 gene copies was determined using real-time polymerase chain reaction (PCR). Results. Among 43 identified patients, 37 (86.0%) showed homozygous deletion of SMN1 exon 7. One (2.7%) of 37 patients had SMA type I with 3 SMN2 copies, 11 (29.7%) patients had SMA type II with 3.1 ± 0.7 copies, 17 (45.9%) patients had SMA type III with 3.7 ± 0.9 copies, while 8 (21.6%) patients had SMA type IV with 4.2 ± 0.9 copies. There was a progressive increase in the SMN2 gene copy number from type II towards type IV (p < 0.05). A higher SMN2 gene copy number was associated with better current motor performance (p < 0.05). Conclusion. In the Serbian patients with SMA, a higher SMN2 gene copy number correlated with a less severe disease phenotype. A possible effect of other phenotype modifiers should not be neglected.
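The group sizes and mean copy numbers quoted above can be cross-checked with a few lines of arithmetic. The sketch below recomputes each SMA type's share of the 37 patients and a size-weighted mean SMN2 copy number; the dictionary layout and the function name `summarize` are illustrative assumptions, and the pooled mean is a derived figure that the abstract itself does not report.

```python
# (patients, mean SMN2 copies) per SMA type, as quoted in the abstract.
GROUPS = {
    "I": (1, 3.0),
    "II": (11, 3.1),
    "III": (17, 3.7),
    "IV": (8, 4.2),
}


def summarize(groups):
    """Return per-type percentage shares and the size-weighted mean copy number."""
    total = sum(n for n, _ in groups.values())
    shares = {sma_type: round(100 * n / total, 1) for sma_type, (n, _) in groups.items()}
    pooled_mean = sum(n * mean for n, mean in groups.values()) / total
    return total, shares, pooled_mean


total, shares, pooled_mean = summarize(GROUPS)
print(total, shares, round(pooled_mean, 2))
```

The recomputed shares agree with the percentages in the abstract, and the group means rise monotonically from type II to type IV, which is the trend the authors report as significant.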
Jacqueline G. Miller
Centrioles play critical roles in the organization of microtubule-based structures, from the mitotic spindle to cilia and flagella. In order to properly execute their various functions, centrioles are subjected to stringent copy number control. Central to this control mechanism is a precise duplication event that takes place during S phase of the cell cycle and involves the assembly of a single daughter centriole in association with each mother centriole. Recent studies have revealed that posttranslational control of the master regulator Plk4/ZYG-1 kinase and its downstream effector SAS-6 is key to ensuring production of a single daughter centriole. In contrast, relatively little is known about how centriole duplication is regulated at a transcriptional level. Here we show that the transcription factor complex EFL-1-DPL-1 both positively and negatively controls centriole duplication in the Caenorhabditis elegans embryo. Specifically, we find that downregulation of EFL-1-DPL-1 can restore centriole duplication in a zyg-1 hypomorphic mutant and that suppression of the zyg-1 mutant phenotype is accompanied by an increase in SAS-6 protein levels. Further, we find evidence that EFL-1-DPL-1 promotes the transcription of zyg-1 and other centriole duplication genes. Our results provide evidence that in a single tissue type, EFL-1-DPL-1 sets the balance between positive and negative regulators of centriole assembly and thus may be part of a homeostatic mechanism that governs centriole assembly.
Ziemons, Sandra; Koutsantas, Katerina; Becker, Kordula; Dahlmann, Tim; Kück, Ulrich
Multi-copy gene integration into microbial genomes is a conventional tool for obtaining improved gene expression. For Penicillium chrysogenum, the fungal producer of the beta-lactam antibiotic penicillin, many production strains carry multiple copies of the penicillin biosynthesis gene cluster. This discovery led to the generally accepted view that high penicillin titers are the result of multiple copies of penicillin genes. Here we investigated strain P2niaD18, a production line that carries only two copies of the penicillin gene cluster. We performed pulsed-field gel electrophoresis (PFGE), quantitative RT-PCR (qRT-PCR), and penicillin bioassays on production, deletion and overexpression strains generated in the P. chrysogenum P2niaD18 background, in order to determine the copy number of the penicillin biosynthesis gene cluster, the expression of one penicillin biosynthesis gene, and the penicillin titer. Analysis of production and recombinant strains showed that the enhanced penicillin titer did not depend on the copy number of the penicillin gene cluster. This conclusion was strengthened by results with a penicillin null strain lacking pcbC, which encodes isopenicillin N synthase. Reintroduction of one or two copies of the cluster into the pcbC deletion strain restored high transcriptional expression of the pcbC gene, but recombinant strains showed no significantly different penicillin titer compared to parental strains. Here we present a molecular genetic analysis of production and recombinant strains in the P2niaD18 background carrying different copy numbers of the penicillin biosynthesis gene cluster. Our analysis shows that the enhanced penicillin titer does not strictly depend on the copy number of the cluster. Based on these overall findings, we hypothesize that complex regulatory mechanisms are instead prominently implicated in increased penicillin biosynthesis in production strains.
Klopocki, Eva; Lohan, Silke; Brancati, Francesco; Koll, Randi; Brehm, Anja; Seemann, Petra; Dathe, Katarina; Stricker, Sigmar; Hecht, Jochen; Bosse, Kristin; Betz, Regina C; Garaci, Francesco Giuseppe; Dallapiccola, Bruno; Jain, Mahim; Muenke, Maximilian; Ng, Vivian C W; Chan, Wilson; Chan, Danny; Mundlos, Stefan
Indian hedgehog (IHH) is a secreted signaling molecule of the hedgehog family known to play important roles in the regulation of chondrocyte differentiation, cortical bone formation, and the development of joints. Here, we describe that copy-number variations of the IHH locus involving conserved noncoding elements (CNEs) are associated with syndactyly and craniosynostosis. These CNEs are able to drive reporter gene expression in a pattern highly similar to wild-type Ihh expression. We postulate that the observed duplications lead to a misexpression and/or overexpression of IHH and by this affect the complex regulatory signaling network during digit and skull development.
Todd J Treangen
Gene duplication followed by neo- or sub-functionalization deeply impacts the evolution of protein families and is regarded as the main source of adaptive functional novelty in eukaryotes. While there is ample evidence of adaptive gene duplication in prokaryotes, it is not clear whether duplication outweighs the contribution of horizontal gene transfer in the expansion of protein families. We analyzed closely related prokaryote strains or species with small genomes (Helicobacter, Neisseria, Streptococcus, Sulfolobus), average-sized genomes (Bacillus, Enterobacteriaceae), and large genomes (Pseudomonas, Bradyrhizobiaceae) to untangle the effects of duplication and horizontal transfer. After removing the effects of transposable elements and phages, we show that the vast majority of expansions of protein families are due to transfer, even among large genomes. Transferred genes--xenologs--persist longer in prokaryotic lineages possibly due to a higher/longer adaptive role. On the other hand, duplicated genes--paralogs--are expressed more, and, when persistent, they evolve slower. This suggests that gene transfer and gene duplication have very different roles in shaping the evolution of biological systems: transfer allows the acquisition of new functions and duplication leads to higher gene dosage. Accordingly, we show that paralogs share most protein-protein interactions and genetic regulators, whereas xenologs share very few of them. Prokaryotes invented most of life's biochemical diversity. Therefore, the study of the evolution of biological systems should explicitly account for the predominant role of horizontal gene transfer in the diversification of protein families.
Background 1q21.1 Copy Number Variant (CNV) is associated with a highly variable phenotype ranging from congenital anomalies, learning deficits/intellectual disability (ID), to a normal phenotype. Hence, the clinical significance of this CNV can be difficult to evaluate. Here we described the consequences of the 1q21.1 CNV on genome-wide gene expression and function of selected candidate genes within 1q21.1 using cell lines from clinically well described subjects. Methods and Results Eight subjects from 3 families were included in the study: six with a 1q21.1 deletion and two with a 1q21.1 duplication. High resolution Affymetrix 2.7M array was used to refine the 1q21.1 CNV breakpoints and exclude the presence of secondary CNVs of pathogenic relevance. Whole genome expression profiling, studied in lymphoblast cell lines (LBCs) from 5 subjects, showed enrichment of genes from 1q21.1 in the top 100 genes ranked based on correlation of expression with 1q21.1 copy number. The function of two top genes from 1q21.1, CHD1L/ALC1 and PRKAB2, was studied in detail in LBCs from a deletion and a duplication carrier. CHD1L/ALC1 is an enzyme with a role in chromatin modification and DNA damage response while PRKAB2 is a member of the AMP kinase complex, which senses and maintains systemic and cellular energy balance. The protein levels for CHD1L/ALC1 and PRKAB2 were changed in concordance with their copy number in both LBCs. A defect in chromatin remodeling was documented based on impaired decatenation (chromatid untangling) checkpoint (DCC) in both LBCs. This defect, reproduced by CHD1L/ALC1 siRNA, identifies a new role of CHD1L/ALC1 in DCC. Both LBCs also showed elevated levels of micronuclei following treatment with a Topoisomerase II inhibitor suggesting increased DNA breaks. AMP kinase function, specifically in the deletion containing LBCs, was attenuated. Conclusion Our studies are unique as they show for the first time that the 1q21.1 CNV not only
Gouran, Hossein; Chakraborty, Sandeep; Rao, Basuthkar J; Asgeirsson, Bjarni; Dandekar, Abhaya
Duplication of genes is one of the preferred ways for natural selection to add advantageous functionality to the genome without having to reinvent the wheel with respect to catalytic efficiency and protein stability. The duplicated secretory virulence factors of Xylella fastidiosa (LesA, LesB and LesC), implicated in Pierce's disease of grape and citrus variegated chlorosis of citrus species, epitomize the positive selection pressures exerted on advantageous genes in such pathogens. A deeper insight into the evolution of these lipases/esterases is essential to develop resistance mechanisms in transgenic plants. Directed evolution, an attempt to accelerate the evolutionary steps in the laboratory, is inherently simple when targeted for loss of function. A bigger challenge is to specify mutations that endow a new function, such as a lost functionality in a duplicated gene. Previously, we have proposed a method for enumerating candidates for mutations intended to transfer the functionality of one protein into another related protein based on the spatial and electrostatic properties of the active site residues (DECAAF). In the current work, we present in vivo validation of DECAAF by inducing tributyrin hydrolysis in LesB based on the active site similarity to LesA. The structures of these proteins have been modeled using RaptorX based on the closely related LipA protein from Xanthomonas oryzae. These mutations replicate the spatial and electrostatic conformation of LesA in the modeled structure of the mutant LesB as well, providing in silico validation before proceeding to the laborious in vivo work. Such focused mutations allow one to dissect the relevance of the duplicated genes in finer detail than gene knockouts, since they do not interfere with other moonlighting functions, protein expression levels or protein-protein interactions.
Haberer, Georg; Panda, Arup; Das Laha, Shayani; Ghosh, Tapas Chandra; Schäffner, Anton R.
The identification of functionally equivalent, orthologous genes (functional orthologs) across genomes is necessary for accurate transfer of experimental knowledge from well-characterized organisms to others. This frequently relies on automated, coding sequence-based approaches such as OrthoMCL, Inparanoid, and KOG, which usually work well for one-to-one homologous states. However, this strategy does not reliably work for plants due to the occurrence of extensive gene/genome duplication. Frequently, for one query gene, multiple orthologous genes are predicted in the other genome, and it is not clear a priori from sequence comparison and similarity which one preserves the ancestral function. We have studied 11 organ-dependent and stress-induced gene expression patterns of 286 Arabidopsis lyrata duplicated gene groups and compared them with the respective Arabidopsis (Arabidopsis thaliana) genes to predict putative expressologs and nonexpressologs based on gene expression similarity. Promoter sequence divergence as an additional tool to substantiate functional orthology only partially overlapped with expressolog classification. By cloning eight A. lyrata homologs and complementing them in the respective four Arabidopsis loss-of-function mutants, we experimentally proved that predicted expressologs are indeed functional orthologs, while nonexpressologs or nonfunctionalized orthologs are not. Our study demonstrates that even a small set of gene expression data in addition to sequence homologies are instrumental in the assignment of functional orthologs in the presence of multiple orthologs. PMID:27303025
Rocca, Maria Santa; Di Nisio, Andrea; Marchiori, Arianna; Ghezzi, Marco; Opocher, Giuseppe; Foresta, Carlo; Ferlin, Alberto
Testicular germ cell tumor (TGCT) is one of the most heritable forms of cancer. In recent years, much evidence has suggested that constitutional genetic factors, mainly single nucleotide polymorphisms, can increase its risk. However, the possible contribution of copy number variations (CNVs) to TGCT susceptibility has not been substantially addressed. Indeed, an increasing number of studies have focused on the effect of CNVs on gene expression and on the role of these structural genetic variations as risk factors for different forms of cancer. E2F1 is a transcription factor that plays an important role in regulating cell growth, differentiation, apoptosis and response to DNA damage. Therefore, deficiency or overexpression of this protein might significantly influence fundamental biological processes involved in cancer development and progression, including TGCT. We analyzed E2F1 CNVs in 261 cases with TGCT and 165 controls. We found no CNVs in controls, but 17/261 (6.5%) cases showed duplications in E2F1. Blot analysis demonstrated higher E2F1 expression in testicular samples of TGCT cases with three copies of the gene. Furthermore, we observed higher phosphorylation of Akt and mTOR in samples with E2F1 duplication. Interestingly, normal, non-tumoral testicular tissue in a patient with E2F1 duplication showed lower expression of E2F1 and lower AKT/mTOR phosphorylation than adjacent tumor tissue. Furthermore, increased expression of E2F1 obtained in vitro in the NTERA-2 testicular cell line induced increased AKT/mTOR phosphorylation. This study suggests for the first time an involvement of E2F1 CNVs in TGCT susceptibility and supports previous preliminary data on the importance of the AKT/mTOR signaling pathway in this cancer. © 2017 Society for Endocrinology.
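The abstract gives the raw counts (17 of 261 cases and 0 of 165 controls with an E2F1 duplication) but no test statistic. As a rough illustration of how such a 2x2 comparison could be evaluated, the sketch below computes a one-sided Fisher-type exact tail probability from the hypergeometric distribution using only the standard library. The function name and the choice of a one-sided exact test are assumptions made here and may differ from the authors' analysis.

```python
from math import comb


def hypergeom_upper_tail(k: int, n_cases: int, n_controls: int, n_carriers: int) -> float:
    """P(X >= k), where X is the number of carriers falling among the cases
    when n_carriers carriers are distributed at random over all subjects."""
    total = n_cases + n_controls
    denominator = comb(total, n_carriers)
    upper = min(n_cases, n_carriers)
    numerator = sum(
        comb(n_cases, x) * comb(n_controls, n_carriers - x)
        for x in range(k, upper + 1)
    )
    return numerator / denominator


# Counts taken from the abstract: all 17 duplication carriers are cases.
p_one_sided = hypergeom_upper_tail(17, n_cases=261, n_controls=165, n_carriers=17)
print(f"one-sided exact p = {p_one_sided:.2e}")
```

With these counts the tail probability is well below 0.001, consistent with the authors' suggestion that the duplications are enriched among cases. A two-sided test or an odds-ratio estimate with a continuity correction would be needed for a fuller analysis because one cell is zero.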
van Dyk, H.O.; Hoogstraat, M; ten Hoeve, J; Reinders, M.J.T.; Wessels, L.F.A.
The frequent recurrence of copy number aberrations across tumour samples is a reliable hallmark of certain cancer driver genes. However, state-of-the-art algorithms for detecting recurrent aberrations fail to detect several known drivers. In this study, we propose RUBIC, an approach that detects
Ivaničová, Zuzana; Valárik, Miroslav; Pánková, Kateřina; Trávníčková, Martina; Doležel, Jaroslav; Šafář, Jan; Milec, Zbyněk
The ability of plants to identify an optimal flowering time is critical for ensuring the production of viable seeds. The main environmental factors that influence the flowering time include the ambient temperature and day length. In wheat, the ability to assess the day length is controlled by photoperiod (Ppd) genes. Due to its allohexaploid nature, bread wheat carries the following three Ppd-1 genes: Ppd-A1, Ppd-B1 and Ppd-D1. While photoperiod (in)sensitivity controlled by Ppd-A1 and Ppd-D1 is mainly determined by sequence changes in the promoter region, the impact of the Ppd-B1 alleles on the heading time has been linked to changes in the copy numbers (and possibly their methylation status) and sequence changes in the promoter region. Here, we report that plants with the same number of Ppd-B1 copies may have different heading times. Differences were observed among F7 lines derived from crossing two spring hexaploid wheat varieties. Several lines carrying three copies of Ppd-B1 headed 16 days later than other plants in the population with the same number of gene copies. This effect was associated with changes in the gene expression level and methylation of the Ppd-B1 gene.
Muhammad I. Ullah
Objectives: To identify the underlying gene mutation in a large consanguineous Pakistani family. Methods: This is an observational descriptive study carried out at the Department of Biochemistry, Shifa International Hospital, Quaid-i-Azam University, and Atta-ur-Rahman School of Applied Biosciences, National University of Sciences and Technology, Islamabad, Pakistan from 2013-2016. Genomic DNA of all recruited family members was extracted and the TruSight One sequencing panel was used to assess genes associated with a neuromuscular phenotype. Comparative modeling of mutated and wild-type protein was carried out with the PyMOL tool. Results: Clinical investigations of an affected individual showed typical features of Miyoshi myopathy (MM), such as elevated serum creatine kinase (CK) levels, distal muscle weakness, and myopathic changes in electromyography (EMG) and muscle histopathology. Sequencing with the Illumina TruSight One sequencing panel revealed a novel 22-nucleotide duplication (CTTCAACTTGTTTGACTCTCCT) in the DYSF gene (NM_001130987.1_c.897-918dup; p.Gly307Leufs5X), which results in a truncating frameshift mutation and perfectly segregated with the disease in this family. Protein modeling studies suggested a disruption in the spatial configuration of the putative mutant protein. Conclusion: A novel duplication of 22 bases (c.897_918dup; p.Gly307Leufs5X) in the DYSF gene was identified in a family suffering from Miyoshi myopathy. Protein homology analysis suggests a disruptive impact of this mutation on protein function.
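Why a 22-nucleotide duplication produces a frameshift follows from simple arithmetic: codons are read three bases at a time, so any insertion whose length is not a multiple of three shifts the downstream reading frame. The short sketch below checks this for the duplicated sequence and coordinate range quoted in the abstract; the helper names are hypothetical and introduced only for illustration.

```python
# Duplicated sequence and coordinates as quoted in the abstract.
DUPLICATED_SEQUENCE = "CTTCAACTTGTTTGACTCTCCT"
DUP_START, DUP_END = 897, 918  # inclusive coding-sequence positions


def span_length(start: int, end: int) -> int:
    """Length of an inclusive coordinate range."""
    return end - start + 1


def shifts_reading_frame(inserted_length: int) -> bool:
    """An insertion shifts the frame unless its length is a multiple of 3."""
    return inserted_length % 3 != 0


length = len(DUPLICATED_SEQUENCE)
print(length, span_length(DUP_START, DUP_END), shifts_reading_frame(length))
```

The sequence length matches the coordinate span, and 22 leaves a remainder of 1 when divided by 3, so the duplication moves every downstream codon out of frame, in line with the truncating frameshift the authors describe.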
Calves born persistently infected with non-cytopathic bovine viral diarrhea virus (ncpBVDV) frequently develop a fatal gastroenteric illness called mucosal disease. Both the original virus (ncpBVDV) and an antigenically identical but cytopathic virus (cpBVDV) can be isolated from animals affected by mucosal disease. Cytopathic BVDVs originate from their ncp counterparts by diverse genetic mechanisms, all leading to the expression of the non-structural polypeptide NS3 as a discrete protein. In contrast, ncpBVDVs express only the large precursor polypeptide, NS2-3, which contains the NS3 sequence within its carboxy-terminal half. We report here the investigation of the mechanism leading to NS3 expression in 41 cpBVDV isolates. An RT-PCR strategy was employed to detect RNA insertions within the NS2-3 gene and/or duplication of the NS3 gene, two common mechanisms of NS3 expression. RT-PCR amplification revealed insertions in the NS2-3 gene of three cp isolates, with the inserts being similar in size to that present in the cpBVDV NADL strain. Sequencing of one such insert revealed a 296-nucleotide sequence with a central core of 270 nucleotides coding for an amino acid sequence highly homologous (98%) to the NADL insert, a sequence corresponding to part of the cellular J-Domain gene. One cpBVDV isolate contained a duplication of the NS3 gene downstream from the original locus. In contrast, no detectable NS2-3 insertions or NS3 gene duplications were observed in the genome of 37 cp isolates. These results demonstrate that processing of NS2-3 without bulk mRNA insertions or NS3 gene duplications seems to be a frequent mechanism leading to NS3 expression and BVDV cytopathology.
Majerfeld, I.H.; Roper, J.A.
Strains of Aspergillus nidulans which carry a particular segment of chromosome I in duplicate - one segment in normal position, the other translocated to chromosome II - are more resistant to uv light than are strains with a balanced haploid genome. A double dose of the uvsA+ allele, carried on the duplicate segment, determines this enhanced resistance; this is shown by the descending order of resistance of duplication haploids uvsA+/uvsA+, uvsA1/uvsA+ and uvsA1/uvsA1. An unbalanced diploid with three doses of the uvsA+ allele also shows greater resistance than a balanced uvsA+//uvsA+ diploid. However, in balanced diploids the uvsA1 allele appears to be completely recessive; uvsA+//uvsA+ and uvsA+//uvsA1 diploids produce indistinguishable survival curves after uv irradiation. Thus, the uvsA+ gene product is not rate-limiting in repair processes in strains with a balanced genome. The rate-limiting effect observed in these unbalanced strains presumably reflects an interaction of the uvsA+ product and other functions determined by the rest of the genome. Duplication haploids and normal haploids lose photorepairable lesions at similar rates. This observation may be interpreted to indicate that differences in survival are not due to differences in the efficiency of excision of uv-induced pyrimidine dimers.
Background The APOBEC3 (A3) genes play a key role in innate antiviral defense in mammals by introducing directed mutations in the DNA. The human genome encodes seven A3 genes, with multiple splice alternatives. Different A3 proteins display different substrate specificity, but the basic question of how they discern self from non-self remains unresolved. Further, the expression of A3 activities shapes the way both viral and host genomes evolve. Results We present here a detailed temporal analysis of the origin and expansion of the A3 repertoire in mammals. Our data support an evolutionary scenario where the genome of the mammalian ancestor encoded at least one ancestral A3 gene, and where the genome of the ancestor of placental mammals (and possibly of the ancestor of all mammals) already encoded an A3Z1-A3Z2-A3Z3 arrangement. Duplication events of the A3 genes have occurred independently in different lineages: humans, cats and horses. In all of them, gene duplication has resulted in changes in enzyme activity and/or substrate specificity, in a paradigmatic example of convergent adaptive evolution at the genomic level. Finally, our results show that evolutionary rates for the three A3Z1, A3Z2 and A3Z3 motifs have significantly decreased in the last 100 Mya. The analysis constitutes a textbook example of the evolution of a gene locus by duplication and sub/neofunctionalization in the context of a virus-host arms race. Conclusions Our results provide a time framework for identifying ancestral and derived genomic arrangements in the APOBEC loci, and for dating the expansion of this gene family in different lineages through time, as a response to changes in viral/retroviral/retrotransposon pressure. PMID:22640020
Scott Christopher J
Abstract Background The DUB/USP17 subfamily of deubiquitinating enzymes was originally identified as immediate early genes induced in response to cytokine stimulation in mice (DUB-1, DUB-1A, DUB-2, DUB-2A). Subsequently we have identified a number of human family members and shown that one of these (DUB-3) is also cytokine inducible. We originally showed that constitutive expression of DUB-3 can block cell proliferation and more recently we have demonstrated that this is due to its regulation of the ubiquitination and activity of the 'CAAX' box protease RCE1. Results Here we demonstrate that the human DUB/USP17 family members are found on both chromosome 4p16.1, within a block of tandem repeats, and on chromosome 8p23.1, embedded within the copy number variable beta-defensin cluster. In addition, we show that the multiple genes observed in humans and other distantly related mammals have arisen due to the independent expansion of an ancestral sequence within each species. However, it is also apparent, when sequences from humans and the more closely related chimpanzee are compared, that duplication events took place before these species separated. Conclusions The observation that the DUB/USP17 genes, which can influence cell growth and survival, have evolved from an unstable ancestral sequence which has undergone multiple and varied duplications in the species examined marks this as a unique family. In addition, their presence within the beta-defensin repeat raises the question of whether they may contribute to the influence of this repeat on immune related conditions.
Rep, Martijn; van der Does, H Charlotte; Cornelissen, Ben J C
The facultative pathogenic fungus Fusarium oxysporum is known to harbour many different transposable and/or repetitive elements. We have identified Drifter, a novel DNA transposon of the hAT family in F. oxysporum. It was found adjoining SIX1-H, a truncated homolog of the SIX1 avirulence gene in F. oxysporum f. sp. lycopersici. Absence of a target site duplication as well as the 5' part of SIX1-H suggests that transposition of Drifter into the ancestor of SIX1-H was followed by loss of a chromosomal segment through recombination between Drifters. F. oxysporum isolates belonging to various formae speciales harbour between 0 and 5 full-length copies of Drifter and/or one or more copies with an internal deletion. Transcription of Drifter is activated during starvation for carbon or nitrogen.
Kito, Keiji; Ito, Haruka; Nohara, Takehiro; Ohnishi, Mihoko; Ishibashi, Yuko; Takeda, Daisuke
Omics analysis is a versatile approach for understanding the conservation and diversity of molecular systems across multiple taxa. In this study, we compared the proteome expression profiles of four yeast species (Saccharomyces cerevisiae, Saccharomyces mikatae, Kluyveromyces waltii, and Kluyveromyces lactis) grown on glucose- or glycerol-containing media. Conserved expression changes across all species were observed only for a small proportion of all proteins differentially expressed between the two growth conditions. Two Kluyveromyces species, both of which exhibited a high growth rate on glycerol, a nonfermentative carbon source, showed distinct species-specific expression profiles. In K. waltii grown on glycerol, proteins involved in the glyoxylate cycle and gluconeogenesis were expressed in high abundance. In K. lactis grown on glycerol, the expression of glycolytic and ethanol metabolic enzymes was unexpectedly low, whereas proteins involved in cytoplasmic translation, including ribosomal proteins and elongation factors, were highly expressed. These marked differences in the types of predominantly expressed proteins suggest that K. lactis optimizes the balance of proteome resource allocation between metabolism and protein synthesis giving priority to cellular growth. In S. cerevisiae, about 450 duplicate gene pairs were retained after whole-genome duplication. Intriguingly, we found that in the case of duplicates with conserved sequences, the total abundance of proteins encoded by a duplicate pair in S. cerevisiae was similar to that of protein encoded by nonduplicated ortholog in Kluyveromyces yeast. Given the frequency of haploinsufficiency, this observation suggests that conserved duplicate genes, even though minor cases of retained duplicates, do not exhibit a dosage effect in yeast, except for ribosomal proteins. Thus, comparative proteomic analyses across multiple species may reveal not only species-specific characteristics of metabolic processes under
Zhang, Qiucen; Austin, Robert; Vyawahare, Saurabh; Lau, Alexandra
Escherichia coli (E. coli) cells, when challenged with sublethal concentrations of the genotoxic antibiotic ciprofloxacin, cease to divide and form long filaments which contain multiple bacterial chromosomes. These filaments are individual mesoscopic environmental niches which provide protection for a community of chromosomes (as opposed to cells) under mutagenic stress and can provide an evolutionary fitness advantage within the niche. We use comparative genomic hybridization to show that the mesoscopic niche evolves within 20 minutes of ciprofloxacin exposure via replication of multiple copies of genes expressing ATP-dependent transporters. We show that this rapid genomic amplification is done in a time-efficient manner via placement of the genes encoding the pumps near the origin of replication on the bacterial chromosome. The de-amplification of multiple copies back to the wild-type number is a function of the ciprofloxacin exposure duration: the longer the exposure, the slower the removal of the multiple copies. The project described was supported by the National Science Foundation and the National Cancer Institute.
Iacovazzo, Donato; Caswell, Richard; Bunce, Benjamin; Jose, Sian; Yuan, Bo; Hernández-Ramírez, Laura C; Kapur, Sonal; Caimari, Francisca; Evanson, Jane; Ferraù, Francesco; Dang, Mary N; Gabrovska, Plamena; Larkin, Sarah J; Ansorge, Olaf; Rodd, Celia; Vance, Mary L; Ramírez-Renteria, Claudia; Mercado, Moisés; Goldstone, Anthony P; Buchfelder, Michael; Burren, Christine P; Gurlek, Alper; Dutta, Pinaki; Choong, Catherine S; Cheetham, Timothy; Trivellin, Giampaolo; Stratakis, Constantine A; Lopes, Maria-Beatriz; Grossman, Ashley B; Trouillas, Jacqueline; Lupski, James R; Ellard, Sian; Sampson, Julian R; Roncaroli, Federico; Korbonits, Márta
Non-syndromic pituitary gigantism can result from AIP mutations or the recently identified Xq26.3 microduplication causing X-linked acrogigantism (XLAG). Within Xq26.3, GPR101 is believed to be the causative gene, and the c.924G > C (p.E308D) variant in this orphan G protein-coupled receptor has been suggested to play a role in the pathogenesis of acromegaly. We studied 153 patients (58 females and 95 males) with pituitary gigantism. AIP mutation-negative cases were screened for GPR101 duplication through copy number variation droplet digital PCR and high-density aCGH. The genetic, clinical and histopathological features of XLAG patients were studied in detail. 395 peripheral blood and 193 pituitary tumor DNA samples from acromegaly patients were tested for GPR101 variants. We identified 12 patients (10 females and 2 males; 7.8 %) with XLAG. In one subject, the duplicated region contained only GPR101, but not the other three genes found to be duplicated in the previously reported patients, defining a new smallest region of overlap of duplications. While females presented with germline mutations, the two male patients harbored the mutation in a mosaic state. Nine patients had pituitary adenomas, while three had hyperplasia. The comparison of the features of XLAG, AIP-positive and GPR101&AIP-negative patients revealed significant differences in sex distribution, age at onset, height, prolactin co-secretion and histological features. The pathological features of XLAG-related adenomas were remarkably similar. These tumors had a sinusoidal and lobular architecture. Sparsely and densely granulated somatotrophs were admixed with lactotrophs; follicle-like structures and calcifications were commonly observed. Patients with sporadic or familial acromegaly did not have an increased prevalence of the c.924G > C (p.E308D) GPR101 variant compared to public databases. In conclusion, XLAG can result from germline or somatic duplication of GPR101. Duplication of GPR101
Brinkmeyer-Langford, C L; Murphy, W J; Childers, C P; Skow, L C
The assembled genomic sequence of the horse major histocompatibility complex (MHC) (equine lymphocyte antigen, ELA) is very similar to the homologous human HLA, with the notable exception of a large segmental duplication at the boundary of ELA class I and class III that is absent in HLA. The segmental duplication consists of a ∼ 710 kb region of at least 11 repeated blocks: 10 blocks each contain an MHC class I-like sequence and the helicase domain portion of a BAT1-like sequence, and the remaining unit contains the full-length BAT1 gene. Similar genomic features were found in other Perissodactyls, indicating an ancient origin, which is consistent with phylogenetic analyses. Reverse-transcriptase PCR (RT-PCR) of mRNA from peripheral white blood cells of healthy and chronically or acutely infected horses detected transcription from predicted open reading frames in several of the duplicated blocks. This duplication is not present in the sequenced MHCs of most other mammals, although a similar feature at the same relative position is present in the feline MHC (FLA). Striking sequence conservation throughout Perissodactyl evolution is consistent with a functional role for at least some of the genes included within this segmental duplication. © 2010 The Authors, Journal compilation © 2010 Stichting International Foundation for Animal Genetics.
Boghossian, Nansi S; Sicko, Robert J; Kay, Denise M; Rigler, Shannon L; Caggana, Michele; Tsai, Michael Y; Yeung, Edwina H; Pankratz, Nathan; Cole, Benjamin R; Druschel, Charlotte M; Romitti, Paul A; Browne, Marilyn L; Fan, Ruzong; Liu, Aiyi; Brody, Lawrence C; Mills, James L
The cause of posterior urethral valves (PUV) is unknown, but genetic factors are suspected given their familial occurrence. We examined cases of isolated PUV to identify novel copy number variants (CNVs). We identified 56 cases of isolated PUV from all live-births in New York State (1998-2005). Samples were genotyped using Illumina HumanOmni2.5 microarrays. Autosomal and sex-linked CNVs were identified using PennCNV and cnvPartition software. CNVs were prioritized for follow-up if they were absent from in-house controls, contained ≥ 10 consecutive probes, were ≥ 20 Kb in size, had ≤ 20% overlap with variants detected in other birth defect phenotypes screened in our lab, and were rare in population reference controls. We identified 47 rare candidate PUV-associated CNVs in 32 cases; one case had a 3.9 Mb deletion encompassing BMP7. Mutations in BMP7 have been associated with severe anomalies in the mouse urethra. Other interesting CNVs, each detected in a single PUV case included: a deletion of PIK3R3 and TSPAN1, duplication/triplication in FGF12, duplication of FAT1--a gene essential for normal growth and development, a large deletion (>2 Mb) on chromosome 17q that involves TBX2 and TBX4, and large duplications (>1 Mb) on chromosomes 3q and 6q. Our finding of previously unreported novel CNVs in PUV suggests that genetic factors may play a larger role than previously understood. Our data show a potential role of CNVs in up to 57% of cases examined. Investigation of genes in these CNVs may provide further insights into genetic variants that contribute to PUV. © 2015 Wiley Periodicals, Inc.
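The prioritization criteria listed above amount to a conjunction of simple filters, which can be sketched as follows. This is only an illustration under stated assumptions: the `CnvCall` record and its field names are hypothetical and do not correspond to PennCNV or cnvPartition output formats, and the example calls are invented, not study data. Only the thresholds (10 probes, 20 Kb, 20% overlap) and the 32-of-56 case count come from the abstract.

```python
from dataclasses import dataclass


@dataclass
class CnvCall:
    """Hypothetical record for one CNV call; field names are assumptions."""
    n_consecutive_probes: int
    size_kb: float
    seen_in_in_house_controls: bool
    overlap_with_other_phenotypes: float  # fraction between 0.0 and 1.0
    rare_in_reference_controls: bool


def passes_prioritization(call: CnvCall) -> bool:
    """Apply the follow-up criteria described in the abstract."""
    return (
        not call.seen_in_in_house_controls
        and call.n_consecutive_probes >= 10
        and call.size_kb >= 20
        and call.overlap_with_other_phenotypes <= 0.20
        and call.rare_in_reference_controls
    )


# Invented example calls, for illustration only.
large_rare_call = CnvCall(450, 3900.0, False, 0.0, True)
too_small_call = CnvCall(12, 15.0, False, 0.0, True)
print(passes_prioritization(large_rare_call), passes_prioritization(too_small_call))

# 32 of the 56 cases carried at least one candidate CNV.
percent_cases_with_candidate = round(100 * 32 / 56)
print(percent_cases_with_candidate)
```

The last two lines reproduce the "up to 57% of cases" figure from the counts given in the abstract.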
NAC (NAM/ATAF/CUC) proteins constitute one of the biggest plant-specific transcription factor (TF) families and have crucial roles in diverse developmental programs during plant growth. Phylogenetic analyses have revealed both conserved and lineage-specific NAC subfamilies, among which various origins and distinct features were observed. It is reasonable to hypothesize that there should be divergent evolutionary patterns of NAC TFs both between dicots and monocots, and among NAC subfamilies. In this study, we compared the gene duplication and loss, evolutionary rate, and selective pattern among non-lineage specific NAC subfamilies, as well as those between dicots and monocots, through genome-wide analyses of sequence and functional data in six dicot and five grass lineages. The number of genes gained in the dicot lineages was much larger than that in the grass lineages, while fewer gene losses were observed in the grasses than in the dicots. We revealed (1) uneven constitution of Clusters of Orthologous Groups (COGs) and contrasting birth/death rates among subfamilies, and (2) two distinct evolutionary scenarios of NAC TFs between dicots and grasses. Our results demonstrated that relaxed selection, resulting from concerted gene duplications, may have permitted substitutions responsible for functional divergence of NAC genes into new lineages. The underlying mechanism of distinct evolutionary fates of NAC TFs sheds light on how evolutionary divergence contributes to differences in establishing NAC gene subfamilies and thus impacts the distinct features between dicots and grasses.
Murray, Shauna A; Diwan, Rutuja; Orr, Russell J S; Kohli, Gurjeet S; John, Uwe
A group of marine dinoflagellates (Alveolata, Eukaryota), consisting of ∼10 species of the genus Alexandrium, Gymnodinium catenatum and Pyrodinium bahamense, produce the toxin saxitoxin and its analogues (STX), which can accumulate in shellfish, leading to ecosystem and human health impacts. The genes, sxt, putatively involved in STX biosynthesis, have recently been identified, however, the evolution of these genes within dinoflagellates is not clear. There are two reasons for this: uncertainty over the phylogeny of dinoflagellates; and that the sxt genes of many species of Alexandrium and other dinoflagellate genera are not known. Here, we determined the phylogeny of STX-producing and other dinoflagellates based on a concatenated eight-gene alignment. We determined the presence, diversity and phylogeny of sxtA, domains A1 and A4 and sxtG in 52 strains of Alexandrium, and a further 43 species of dinoflagellates and thirteen other alveolates. We confirmed the presence and high sequence conservation of sxtA, domain A4, in 40 strains (35 Alexandrium, 1 Pyrodinium, 4 Gymnodinium) of 8 species of STX-producing dinoflagellates, and absence from non-producing species. We found three paralogs of sxtA, domain A1, and a widespread distribution of sxtA1 in non-STX producing dinoflagellates, indicating duplication events in the evolution of this gene. One paralog, clade 2, of sxtA1 may be particularly related to STX biosynthesis. Similarly, sxtG appears to be generally restricted to STX-producing species, while three amidinotransferase gene paralogs were found in dinoflagellates. We investigated the role of positive (diversifying) selection following duplication in sxtA1 and sxtG, and found negative selection in clades of sxtG and sxtA1, clade 2, suggesting they were functionally constrained. Significant episodic diversifying selection was found in some strains in clade 3 of sxtA1, a clade that may not be involved in STX biosynthesis, indicating pressure for diversification
Stewart Lindsay B
Abstract Background Gene copy number variation (CNV) is responsible for several important phenotypes of the malaria parasite Plasmodium falciparum, including drug resistance, loss of infected erythrocyte cytoadherence and alteration of receptor usage for erythrocyte invasion. Despite the known effects of CNV, little is known about its extent throughout the genome. Results We performed a whole-genome survey of CNV genes in P. falciparum using comparative genome hybridisation of a diverse set of 16 laboratory culture-adapted isolates to a custom designed high density Affymetrix GeneChip array. Overall, 186 genes showed hybridisation signals consistent with deletion or amplification in one or more isolates. There is a strong association of CNV with gene length, genomic location, and low orthology to genes in other Plasmodium species. Sub-telomeric regions of all chromosomes are strongly associated with CNV genes independent from members of previously described multigene families. However, ~40% of CNV genes were located in more central regions of the chromosomes. Among the previously undescribed CNV genes, several that are of potential phenotypic relevance are identified. Conclusion CNV represents a major form of genetic variation within the P. falciparum genome; the distribution of gene features indicates the involvement of highly non-random mutational and selective processes. Additional studies should be directed at examining CNV in natural parasite populations to extend conclusions to clinical settings.
Lai, Yi-Pin; Wang, Liang-Bo; Wang, Wei-An; Lai, Liang-Chuan; Tsai, Mong-Hsun; Lu, Tzu-Pin; Chuang, Eric Y
With the advancement in high-throughput technologies, researchers can simultaneously investigate gene expression and copy number alteration (CNA) data from individual patients at a lower cost. Traditional analysis methods analyze each type of data individually and integrate their results using Venn diagrams. Challenges arise, however, when the results are irreproducible and inconsistent across multiple platforms. To address these issues, one possible approach is to concurrently analyze both gene expression profiling and CNAs in the same individual. We have developed an open-source R/Bioconductor package (iGC). Multiple input formats are supported and users can define their own criteria for identifying differentially expressed genes driven by CNAs. The analysis of two real microarray datasets demonstrated that the CNA-driven genes identified by the iGC package showed significantly higher Pearson correlation coefficients with their gene expression levels and copy numbers than those genes located in a genomic region with CNA. Compared with the Venn diagram approach, the iGC package showed better performance. The iGC package is effective and useful for identifying CNA-driven genes. By simultaneously considering both comparative genomic and transcriptomic data, it can provide better understanding of biological and medical questions. The iGC package's source code and manual are freely available at https://www.bioconductor.org/packages/release/bioc/html/iGC.html .
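The comparison above rests on the Pearson correlation between a gene's copy number and its expression level across samples. The sketch below shows that calculation in plain Python. It is not the iGC API (iGC is an R/Bioconductor package); the function name and the toy values are assumptions chosen purely for illustration and are not data from the study.

```python
import math
from typing import Sequence


def pearson_correlation(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant input")
    return covariance / (spread_x * spread_y)


# Toy per-sample values for one hypothetical gene (not study data).
copy_number = [2.0, 2.0, 3.0, 4.0, 5.0]
expression = [1.1, 0.9, 1.6, 2.2, 2.4]
print(pearson_correlation(copy_number, expression))
```

A gene whose expression tracks its copy number across samples gives a coefficient close to 1, which is the pattern the abstract describes for CNA-driven genes.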
Root and stem rot disease of soybean is caused by the oomycete Phytophthora sojae. The avirulence (Avr) genes of P. sojae control race-cultivar compatibility. In this study, we identify the P. sojae Avr3c gene and show that it encodes a predicted RXLR effector protein of 220 amino acids. Sequence and transcriptional data were compared for predicted RXLR effectors occurring in the vicinity of Avr4/6, as genetic linkage of Avr3c and Avr4/6 was previously suggested. Mapping of DNA markers in an F2 population was performed to determine whether selected RXLR effector genes co-segregate with the Avr3c phenotype. The results pointed to one RXLR candidate gene as likely to encode Avr3c. This was verified by testing selected genes by a co-bombardment assay on soybean plants with Rps3c, thus demonstrating functionality and confirming the identity of Avr3c. The Avr3c gene together with eight other predicted genes are part of a repetitive segment of 33.7 kb. Three near-identical copies of this segment occur in a tandem array. In P. sojae strain P6497, two identical copies of Avr3c occur within the repeated segments whereas the third copy of this RXLR effector has diverged in sequence. The Avr3c gene is expressed during the early stages of infection in all P. sojae strains examined. Virulent alleles of Avr3c that differ in amino acid sequence were identified in other strains of P. sojae. Gain of virulence was acquired through mutation and subsequent sequence exchanges between the two copies of Avr3c. The results illustrate the importance of segmental duplications and RXLR effector evolution in the control of race-cultivar compatibility in the P. sojae and soybean interaction.
Koch Marcus A
Abstract Background Positive selection is recognized as the prevalence of nonsynonymous over synonymous substitutions in a gene. Models of the functional evolution of duplicated genes consider neofunctionalization as key to the retention of paralogues. For instance, duplicate transcription factors are specifically retained in plant and animal genomes and both positive selection and transcriptional divergence appear to have played a role in their diversification. However, the relative impact of these two factors has not been systematically evaluated. Class B MADS-box genes, comprising DEF-like and GLO-like genes, encode developmental transcription factors essential for establishment of perianth and male organ identity in the flowers of angiosperms. Here, we contrast the role of positive selection and the known divergence in expression patterns of genes encoding class B-like MADS-box transcription factors from monocots, with emphasis on the family Orchidaceae and the order Poales. Although in the monocots these two groups are highly diverse and have a strongly canalized floral morphology, there is no information on the role of positive selection in the evolution of their distinctive flower morphologies. Published research shows that in Poales, class B-like genes are expressed in stamens and in lodicules, the perianth organs whose identity might also be specified by class B-like genes, like the identity of the inner tepals of their lily-like relatives. In orchids, however, the number and pattern of expression of class B-like genes have greatly diverged. Results The DEF-like genes from Orchidaceae form four well-supported, ancient clades of orthologues. In contrast, orchid GLO-like genes form a single clade of ancient orthologues and recent paralogues. DEF-like genes from orchid clade 2 (OMADS3-like genes) are under less stringent purifying selection than the other orchid DEF-like and GLO-like genes. In comparison with orchids, purifying selection
Mitchell, Elyse; Douglas, Andrew; Kjaegaard, Susanne
The ability to identify the clinical nature of the recurrent duplication of chromosome 17q12 has been limited by its rarity and the diverse range of phenotypes associated with this genomic change. In order to further define the clinical features of affected patients, detailed clinical information......, potentially contributory copy number changes in a subset of patients, including one patient each with 16p11.2 deletion and 15q13.3 deletion. Our data further define and expand the clinical spectrum associated with duplications of 17q12 and provide support for the role of genomic modifiers contributing...... to phenotypic variability....
M Loredana Marcovecchio
Genome-wide association studies have identified more than 60 single nucleotide polymorphisms associated with Body Mass Index (BMI). Additional genetic variants, such as copy number variations (CNV), have also been investigated in relation to BMI. Recently, the highly polymorphic CNV in the salivary amylase (AMY1) gene, encoding an enzyme implicated in the first step of starch digestion, has been associated with obesity in adults and children. We assessed the potential association between AMY1 copy number and a wide range of BMI in a population of Italian school-children. 744 children (354 boys, 390 girls; mean age (±SD): 8.4±1.4 years) underwent anthropometric assessments (height, weight) and collection of saliva samples for DNA extraction. AMY1 copies were evaluated by quantitative PCR. A significant increase of BMI z-score with decreasing AMY1 copy number was observed in boys (β: -0.117, p = 0.033), but not in girls. Similarly, waist circumference (β: -0.155, p = 0.003, adjusted for age) was negatively influenced by AMY1 copy number in boys. Boys with 8 or more AMY1 copies presented a significantly lower BMI z-score (p = 0.04) and waist circumference (p = 0.01) when compared to boys with fewer than 8 copies. In this pediatric-only, population-based study, a lower AMY1 copy number was found to be associated with increased BMI in boys. These data confirm previous findings from adult studies and support a potential role of a higher copy number of the salivary AMY1 gene in protecting from excess weight gain.
Szmulewicz, Martin N; Ruiz, Luis M; Reategui, Erika P; Hussini, Saeed; Herrera, Rene J
The evolution of the deleted in azoospermia (DAZ) gene family supports prevalent theories on the origin and development of sex chromosomes and sexual dimorphism. The ancestral DAZL gene in human chromosome 3 is known to be involved in germline development of both males and females. The available phylogenetic data suggest that some time after the divergence of the New World and Old World monkey lineages, the DAZL gene, which is found in all mammals, was copied to the Y chromosome of an ancestor to the Old World monkeys, but not New World monkeys. In modern man, the Y-linked DAZ gene complex is located on the distal part of the q arm. It is thought that after being copied to the Y chromosome, and after the divergence of the human and great ape lineages, the DAZ gene in the former underwent internal rearrangements. These included tandem duplications as well as a T > C transition altering an MboI restriction enzyme site in a duplicated sequence. In this study, we report on the ratios of MboI-/MboI+ variant sequences in the DAZ complex in individuals from seven worldwide human populations (Basque, Benin, Egypt, Formosa, Kungurtug, Oman and Rwanda). The ratio of PCR MboI- and MboI+ amplicons can be used to characterize individuals and populations. Our results show a nonrandom distribution of MboI-/MboI+ sequence ratios in all populations examined, as well as significant differences in ratios between populations when compared pairwise. The multiple ratios imply that there has been more than one recent reorganization event at this locus. Considering the dynamic nature of this locus and its involvement in male fertility, we investigated the extent and distribution of this polymorphism. Copyright 2002 S. Karger AG, Basel
Bai, Xufeng; Huang, Yong; Hu, Yong; Liu, Haiyang; Zhang, Bo; Smaczniak, Cezary; Hu, Gang; Han, Zhongmin; Xing, Yongzhong
Transcriptional silencers and copy number variants (CNVs) are associated with gene expression. However, their roles in generating phenotypes have not been well studied. Here we identified a rice quantitative trait locus, SGDP7 (Small Grain and Dense Panicle 7). SGDP7 is identical to FZP (FRIZZY PANICLE), which represses the formation of axillary meristems. The causal mutation of SGDP7 is an 18-bp fragment, named CNV-18bp, which was inserted ~5.3 kb upstream of FZP and resulted in a tandem duplication in the cultivar Chuan 7. The CNV-18bp duplication repressed FZP expression, prolonged the panicle branching period and increased grain yield by more than 15% through substantially increasing the number of spikelets per panicle (SPP) and slightly decreasing the 1,000-grain weight (TGW). The transcription repressor OsBZR1 binds the CGTG motifs in CNV-18bp and thereby represses FZP expression, indicating that CNV-18bp is the upstream silencer of FZP. These findings show that silencer CNVs coordinate a trade-off between SPP and TGW by fine-tuning FZP expression, and that balancing the trade-off could enhance yield potential.
Noordam, Michiel J.; Westerveld, G. Henrike; Hovingh, Suzanne E.; van Daalen, Saskia K. M.; Korver, Cindy M.; van der Veen, Fulco; van Pelt, Ans M. M.; Repping, Sjoerd
The azoospermia factor c (AZFc) region harbors multi-copy genes that are expressed in the testis. Deletions of the AZFc region lead to reduced copy numbers of these genes. Four (partial) AZFc deletions have been described of which the b2/b4 and gr/gr deletions affect semen quality. In most studies,
Zhong, Lei; Yu, Xiaomu; Tong, Jingou
Abstract The Sox gene family is found in a broad range of animal taxa and encodes important gene regulatory proteins involved in a variety of developmental processes. We have obtained clones representing the HMG boxes of twelve Sox genes from grass carp (Ctenopharyngodon idella), one of the four major domestic carps in China. The cloned Sox genes belong to group B1, B2 and C. Our analyses show that whereas the human genome contains a single copy of Sox4, Sox11 and Sox14, each of these genes h...
Martin, Joel; Han, Cliff; Gordon, Laurie A.; Terry, Astrid; Prabhakar, Shyam; She, Xinwei; Xie, Gary; Hellsten, Uffe; Man Chan, Yee; Altherr, Michael; Couronne, Olivier; Aerts, Andrea; Bajorek, Eva; Black, Stacey; Blumer, Heather; Branscomb, Elbert; Brown, Nancy C.; Bruno, William J.; Buckingham, Judith M.; Callen, David F.; Campbell, Connie S.; Campbell, Mary L.; Campbell, Evelyn W.; Caoile, Chenier; Challacombe, Jean F.; Chasteen, Leslie A.; Chertkov, Olga; Chi, Han C.; Christensen, Mari; Clark, Lynn M.; Cohn, Judith D.; Denys, Mirian; Detter, John C.; Dickson, Mark; Dimitrijevic-Bussod, Mira; Escobar, Julio; Fawcett, Joseph J.; Flowers, Dave; Fotopulos, Dea; Glavina, Tijana; Gomez, Maria; Gonzales, Eidelyn; Goodstein, David; Goodwin, Lynne A.; Grady, Deborah L.; Grigoriev, Igor; Groza, Matthew; Hammon, Nancy; Hawkins, Trevor; Haydu, Lauren; Hildebrand, Carl E.; Huang, Wayne; Israni, Sanjay; Jett, Jamie; Jewett, Phillip E.; Kadner, Kristen; Kimball, Heather; Kobayashi, Arthur; Krawczyk, Marie-Claude; Leyba, Tina; Longmire, Jonathan L.; Lopez, Frederick; Lou, Yunian; Lowry, Steve; Ludeman, Thom; Mark, Graham A.; Mcmurray, Kimberly L.; Meincke, Linda J.; Morgan, Jenna; Moyzis, Robert K.; Mundt, Mark O.; Munk, A. Christine; Nandkeshwar, Richard D.; Pitluck, Sam; Pollard, Martin; Predki, Paul; Parson-Quintana, Beverly; Ramirez, Lucia; Rash, Sam; Retterer, James; Ricke, Darryl O.; Robinson, Donna L.; Rodriguez, Alex; Salamov, Asaf; Saunders, Elizabeth H.; Scott, Duncan; Shough, Timothy; Stallings, Raymond L.; Stalvey, Malinda; Sutherland, Robert D.; Tapia, Roxanne; Tesmer, Judith G.; Thayer, Nina; Thompson, Linda S.; Tice, Hope; Torney, David C.; Tran-Gyamfi, Mary; Tsai, Ming; Ulanovsky, Levy E.; Ustaszewska, Anna; Vo, Nu; White, P. Scott; Williams, Albert L.; Wills, Patricia L.; Wu, Jung-Rung; Wu, Kevin; Yang, Joan; DeJong, Pieter; Bruce, David; Doggett, Norman; Deaven, Larry; Schmutz, Jeremy; Grimwood, Jane; Richardson, Paul; et al.
We report here the 78,884,754 base pairs of finished human chromosome 16 sequence, representing over 99.9 percent of its euchromatin. Manual annotation revealed 880 protein coding genes confirmed by 1,637 aligned transcripts, 19 tRNA genes, 341 pseudogenes and 3 RNA pseudogenes. These genes include metallothionein, cadherin and iroquois gene families, as well as the disease genes for polycystic kidney disease and acute myelomonocytic leukemia. Several large-scale structural polymorphisms spanning hundreds of kilobasepairs were identified and result in gene content differences across humans. One of the unique features of chromosome 16 is its high level of segmental duplication, ranked among the highest of the human autosomes. While the segmental duplications are enriched in the relatively gene poor pericentromere of the p-arm, some are involved in recent gene duplication and conversion events which are likely to have had an impact on the evolution of primates and human disease susceptibility.
Abstract Background: Oomycetes of the genus Phytophthora are pathogens that infect a wide range of plant species. For dicot hosts such as tomato, potato and soybean, Phytophthora is even the most important pathogen. Previous analyses of Phytophthora genomes uncovered many genes, large gene families and large genome sizes that can partially be explained by significant repeat expansion patterns. Results: Analysis of the complete genomes of three different Phytophthora species, using a newly developed approach, unveiled a large number of small duplicated blocks, mainly consisting of two or three consecutive genes. Further analysis of these duplicated genes and comparison with the known gene and genome duplication history of ten other eukaryotes, including parasites, algae, plants, fungi, vertebrates and invertebrates, suggest that the ancestor of P. infestans, P. sojae and P. ramorum most likely underwent a whole genome duplication (WGD). Genes that have survived in duplicate are mainly genes that are known to be preferentially retained following WGDs, but genes important for pathogenicity and infection of the different hosts also seem to have been retained in excess. As a result, the WGD might have contributed to the evolutionary and pathogenic success of Phytophthora. Conclusions: The fact that we find many small blocks of duplicated genes indicates that the genomes of Phytophthora species have been heavily rearranged following the WGD. Most likely, the high repeat content in these genomes has played an important role in this rearrangement process. As a consequence, the paucity of retained larger duplicated blocks has greatly complicated previous attempts to detect remnants of a large-scale duplication event in Phytophthora. However, as we show here, our newly developed strategy to identify very small duplicated blocks might be a useful approach to uncover ancient polyploidy events, in particular for heavily rearranged genomes.
Hollox, E J; Atia, T; Cross, G; Parkin, T; Armour, J A L
Subtelomeric regions of the human genome are gene rich, with a high level of sequence polymorphism. A number of clinical conditions, including learning disability, have been attributed to subtelomeric deletions or duplications, but screening for deletion in these regions using conventional cytogenetic methods and fluorescence in situ hybridisation (FISH) is laborious. Here we report that a new method, multiplex amplifiable probe hybridisation (MAPH), can be used to screen for copy number at subtelomeric regions. We have constructed a set of MAPH probes with each subtelomeric region represented at least once, so that one gel lane can assay copy number at all chromosome ends in one person. Each probe has been sequenced and, where possible, its position relative to the telomere determined by comparison with mapped clones. The sensitivity of the probes has been characterised on a series of cytogenetically verified positive controls and 83 normal controls were used to assess the frequency of polymorphic copy number with no apparent phenotypic effect. We have also used MAPH to test a cohort of 37 people selected from males referred for fragile X syndrome testing and found six changes that were confirmed by dosage PCR. MAPH can be used to screen subtelomeric regions of chromosomes for deletions and duplications before confirmation by FISH or dosage PCR. The high throughput nature of this technique allows it to be used for large scale screening of subtelomeric copy number, before confirmation by FISH. In practice, the availability of a rapid and efficient screen may allow subtelomeric analysis to be applied to a wider selection of patients than is currently possible using FISH alone.
Abstract Background: The genome of Paramecium tetraurelia, a unicellular model that belongs to the ciliate phylum, has been shaped by at least 3 successive whole genome duplications (WGD). These dramatic events, which have also been documented in plants, animals and fungi, are resolved over evolutionary time by the loss of one duplicate for the majority of genes. Thanks to a low rate of large scale genome rearrangement in Paramecium, an unprecedentedly large number of gene duplicates of different ages have been identified, making this organism an outstanding model to investigate the evolutionary consequences of polyploidization. The most recent WGD, with 51% of pre-duplication genes still in 2 copies, provides a snapshot of a phase of rapid gene loss that is not accessible in more ancient polyploids such as yeast. Results: We designed a custom oligonucleotide microarray platform for P. tetraurelia genome-wide expression profiling and used the platform to measure gene expression during (1) the sexual cycle of autogamy, (2) growth of new cilia in response to deciliation and (3) biogenesis of secretory granules after massive exocytosis. Genes that are differentially expressed during these time course experiments have expression patterns consistent with a very low rate of subfunctionalization (partition of ancestral functions between duplicated genes), in particular since the most recent polyploidization event. Conclusions: A public transcriptome resource is now available for Paramecium tetraurelia. The resource has been integrated into the ParameciumDB model organism database, providing searchable access to the data. The microarray platform, freely available through NimbleGen Systems, provides a robust, cost-effective approach for genome-wide expression profiling in P. tetraurelia. The expression data support previous studies showing that at short evolutionary times after a whole genome duplication, gene dosage balance constraints and not functional change are
Tsezou, A; Kitsiou, S; Galla, A; Petersen, M B; Karadima, G; Syrrou, M; Sahlèn, S; Blennow, E
We report on two additional cases with duplication of 9p, minor facial anomalies, and developmental delay. Using fluorescence in situ hybridization and single-copy probes, we showed that the first case was a direct duplication, whereas the second case was inverted. The extent of the direct duplication was defined as 9p12→p24 by microdissection and microcloning of the aberrant chromosome and subsequent chromosome-specific comparative genomic hybridization. DNA polymorphism analysis with eight microsatellite markers revealed that the origin of the dup(9p) was maternal in the first case, whereas it was paternal in the second. Copyright 2000 Wiley-Liss, Inc.
van Wieringen, W.N.; van de Wiel, M.A.
The central dogma of molecular biology relates DNA with mRNA. Array CGH measures DNA copy number and gene expression microarrays measure the amount of mRNA. Methods that integrate data from these two platforms may uncover meaningful biological relationships that further our understanding of cancer.
Agelopoulos, Konstantin; Buerger, Horst; Brandt, Burkhard; Greve, Burkhard; Schmidt, Hartmut; Pospisil, Heike; Kurtz, Stefan; Bartkowiak, Kai; Andreas, Antje; Wieczorek, Marek; Korsching, Eberhard
Increased transcription of oncogenes like the epidermal growth factor receptor (EGFR) is frequently caused by amplification of the whole gene or at least of regulatory sequences. The aim of this study was to pinpoint mechanistic parameters occurring during egfr copy number gains leading to a stable EGFR overexpression and high sensitivity to extracellular signalling. A deeper understanding of those marker events might improve early diagnosis of cancer in suspect lesions, early detection of cancer progression and the prediction of egfr targeted therapies. The basal-like/stemness type breast cancer cell line subpopulation MDA-MB-468 CD44high/CD24-/low, carrying high egfr amplifications, was chosen as a model system in this study. Subclones of the heterogeneous cell line expressing low and high EGF receptor densities were isolated by cell sorting. Genomic profiling was carried out for these by means of SNP array profiling, qPCR and FISH. Cell cycle analysis was performed using the BrdU quenching technique. Low and high EGFR expressing MDA-MB-468 CD44+/CD24-/low subpopulations separated by cell sorting showed intermediate and high copy numbers of egfr, respectively. However, during cell culture an increase solely in egfr gene copy numbers occurred in the intermediate subpopulation. This shift was based on the formation of new cells which regained egfr gene copies. By two-parametric cell cycle analysis, clonal effects mediated through a growth advantage of cells bearing higher egfr gene copy numbers could most likely be excluded as the driving force. Subsequently, the detection of a fragile site distal to the egfr gene, sustaining uncapped telomere-less chromosomal ends, the ladder-like structure of the intrachromosomal egfr amplification and a broader range of egfr copy numbers support the assumption that dynamic chromosomal rearrangements, like breakage-fusion-bridge cycles, rather than proliferation drive the gain of egfr copies. Progressive genome modulation
Vulto-van Silfhout, A.T.; de Brouwer, A.F.; de Leeuw, N.; Obihara, C.C.; Brunner, H.G.; Vries, L.B.A. de
De novo genomic aberrations are considered an important cause of autism spectrum disorders. We describe a de novo 380-kb gain in band p22.3 of chromosome 7 in a patient with Asperger syndrome. This duplicated region contains 9 genes including the LFNG gene that is an important regulator of NOTCH
Abe, Hideaki; Aoya, Daiki; Takeuchi, Hiro-Aki; Inoue-Murayama, Miho
Neuregulin 3 (NRG3) plays a key role in central nervous system development and is a strong candidate for human mental disorders. Thus, genetic variation in NRG3 may have some impact on a variety of phenotypes in non-mammalian vertebrates. Recently, genome-wide screening for short insertions and deletions in chicken (Gallus gallus) genomes has provided useful information about structural variation in functionally important genes. NRG3 is one such gene that has a putative frameshift deletion in exon 2, resulting in premature termination of translation. Our aims were to characterize the structure of chicken NRG3 and to compare expression patterns between NRG3 isoforms. Depending on the presence or absence of the 2-bp deletion in chicken NRG3, 3 breeds (red junglefowl [RJF], Boris Brown [BB], and Hinai-jidori [HJ]) were genotyped using flanking primers. In the commercial breeds (BB and HJ), approximately 45% of individuals had at least one exon 2 allele with the 2-bp deletion, whereas there was no deletion allele in RJF. The lack of a homozygous mutant indicated the existence of duplicated NRG3 segments in the chicken genome. Indeed, highly conserved elements consisting of exon 1, intron 1, exon 2, and part of intron 2 were found in the reference RJF genome, and quantitative PCR detected copy number variation (CNV) between breeds as well as between individuals. The copy number of conserved elements was significantly higher in chicks harboring the 2-bp deletion in exon 2. We identified 7 novel transcript variants using total mRNA isolated from the amygdala. Novel isoforms were found to lack the exon 2 cassette, which probably harbored the premature termination codon. The relative transcription levels of the newly identified isoforms were almost the same between chick groups with and without the 2-bp deletion, while chicks with the deletion showed significant suppression of the expression of previously reported isoforms. A putative frameshift deletion and CNV in chicken
... duplicates of all endorsements, amendments, riders, indemnity agreements, and other attachments thereto, and photographically reproduced copies of any letter of credit or amendment thereto, shall be filed with the Regional...
Le Chevanton, L; Leblon, G; Lebilcot, S
We present here the first report of a transformation system developed for the filamentous fungus Sordaria macrospora. Protoplasts from a ura-5 strain were transformed using the cloned Sordaria gene at a frequency of 2 × 10^-5 transformants per viable protoplast (10 per microgram of DNA). Transformation occurred by integration of the donor sequences in the chromosomes of the recipient strain. In 71 cases out of 74, integration occurred outside the ura5 locus; frequently several (two to four) copies were found at a unique integration site. Using the advantage of the spore colour phenotype of the ura5-1 marker, we have shown that the transformed phenotype is stable through mitosis and meiosis in all transformants analysed. No methylation of the duplicated sequences could be observed during meiotic divisions in the transformants.
Borlot, Felippe; Regan, Brigid M; Bassett, Anne S; Stavropoulos, D James; Andrade, Danielle M
Copy number variation (CNV) is an important cause of neuropsychiatric disorders. Little is known about the role of CNV in adults with epilepsy and intellectual disability. To evaluate the prevalence of pathogenic CNVs and identify possible candidate CNVs and genes in patients with epilepsy and intellectual disability. In this cross-sectional study, genome-wide microarray was used to evaluate a cohort of 143 adults with unexplained childhood-onset epilepsy and intellectual disability who were recruited from the Toronto Western Hospital epilepsy outpatient clinic from January 1, 2012, through December 31, 2014. The inclusion criteria were (1) pediatric seizure onset with ongoing seizure activity in adulthood, (2) intellectual disability of any degree, and (3) no structural brain abnormalities or metabolic conditions that could explain the seizures. DNA screening was performed using genome-wide microarray platforms. Pathogenicity of CNVs was assessed based on the American College of Medical Genetics guidelines. The Residual Variation Intolerance Score was used to evaluate genes within the identified CNVs that could play a role in each patient's phenotype. Of the 2335 patients, 143 probands were investigated (mean [SD] age, 24.6 [10.8] years; 69 male and 74 female). Twenty-three probands (16.1%) and 4 affected relatives (2.8%) (mean [SD] age, 24.1 [6.1] years; 11 male and 16 female) presented with pathogenic or likely pathogenic CNVs (0.08-18.9 Mb). Five of the 23 probands with positive results (21.7%) had more than 1 CNV reported. Parental testing revealed de novo CNVs in 11 (47.8%), with CNVs inherited from a parent in 4 probands (17.4%). Sixteen of 23 probands (69.6%) presented with previously cataloged human genetic disorders and/or defined CNV hot spots in epilepsy. Eight nonrecurrent rare CNVs that overlapped 1 or more genes associated with intellectual disability, autism, and/or epilepsy were identified: 2p16.1-p15 duplication, 6p25.3-p25.1 duplication, 8p23.3p
The CRISPR/Cas9 system enables genome editing and somatic cell genetic screens in mammalian cells. We performed genome-scale loss-of-function screens in 33 cancer cell lines to identify genes essential for proliferation/survival and found a strong correlation between increased gene copy number and decreased cell viability after genome editing. Within regions of copy-number gain, CRISPR/Cas9 targeting of both expressed and unexpressed genes, as well as intergenic loci, led to significantly decreased cell proliferation through induction of a G2 cell-cycle arrest.
Hidaka, Taira; Tsushima, Ikuo; Tsumori, Jun
Anaerobic co-digestion of various sewage sludges is a promising approach for greater recovery of energy, but the process is more complicated than mono-digestion of sewage sludge. The applicability of microbial structure analyses and gene quantification to understand microbial conditions was evaluated. The results show that information from gene analyses is useful in managing anaerobic co-digestion and damaged microbes in addition to conventional parameters like total solids, pH and biogas production. Total bacterial 16S rRNA gene copy numbers are the most useful tools for evaluating unstable anaerobic digestion of sewage sludge, rather than mcrA and total archaeal 16S rRNA gene copy numbers, and high-throughput sequencing. First order decay rates of gene copy numbers during pH failure were higher than typical decay rates of microbes in stable operation. The sequencing analyses, including multidimensional scaling, showed very different microbial structure shifts, but the results were not consistent. Copyright © 2017 Elsevier Ltd. All rights reserved.
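As an illustrative aside that is not taken from the study, a first-order decay rate of the kind mentioned above can be estimated from a time series of gene copy numbers by fitting ln N(t) = ln N0 - k·t with ordinary least squares. The Python sketch below assumes this simple model; the function name, sampling days and copy-number values are hypothetical and synthetic.

```python
from math import exp, log

def first_order_decay_rate(times, copies):
    """Estimate k in N(t) = N0 * exp(-k * t) by least squares on ln N(t).

    times  : sampling times (e.g. days)
    copies : gene copy numbers measured at those times (must be > 0)
    Returns k in units of 1/time; a positive k means the copy number declines.
    """
    logs = [log(c) for c in copies]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    sxx = sum((t - mean_t) ** 2 for t in times)
    return -sxy / sxx  # the regression slope is -k

# Synthetic example (hypothetical values): 1e9 copies decaying at 0.2 per day.
days = [0, 1, 2, 4, 7]
copies = [1e9 * exp(-0.2 * d) for d in days]
k = first_order_decay_rate(days, copies)
print(f"estimated decay rate: {k:.3f} per day")
```

With real qPCR data the points scatter around the fitted line, so comparing k between a failing and a stable reactor would also need a confidence interval on the slope, which this sketch does not compute.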
Evert van den Broek
Development of cancer is driven by somatic alterations, including numerical and structural chromosomal aberrations. Currently, several computational methods are available and are widely applied to detect numerical copy number aberrations (CNAs) of chromosomal segments in tumor genomes. However, there is a lack of computational methods that systematically detect structural chromosomal aberrations by virtue of the genomic location of CNA-associated chromosomal breaks and identify genes that appear non-randomly affected by chromosomal breakpoints across (large) series of tumor samples. ‘GeneBreak’ is developed to systematically identify genes recurrently affected by the genomic location of chromosomal CNA-associated breaks by a genome-wide approach, which can be applied to DNA copy number data obtained by array-Comparative Genomic Hybridization (CGH) or by (low-pass) whole genome sequencing (WGS). First, ‘GeneBreak’ collects the genomic locations of chromosomal CNA-associated breaks that were previously pinpointed by the segmentation algorithm that was applied to obtain CNA profiles. Next, a tailored annotation approach for breakpoint-to-gene mapping is implemented. Finally, dedicated cohort-based statistics are incorporated, with correction for covariates that influence the probability of being a breakpoint gene. In addition, multiple testing correction is integrated to reveal recurrent breakpoint events. This easy-to-use algorithm, ‘GeneBreak’, is implemented in R (www.cran.r-project.org) and is available from Bioconductor (www.bioconductor.org/packages/release/bioc/html/GeneBreak.html).
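The three steps described above (collect breaks, map breaks to genes, test for recurrence across the cohort) can be sketched in a few lines of Python. This is a simplified illustration, not the GeneBreak implementation or its R API: every function name and the toy data are hypothetical, and a plain binomial test against a uniform background rate stands in for the package's covariate-corrected statistics. Benjamini-Hochberg is used here as one common choice of multiple testing correction.

```python
from math import comb

def breakpoints_from_segments(segments):
    """Breaks are boundaries between adjacent segments with different calls.
    segments: list of (chrom, start, end, call), sorted by chrom and start."""
    breaks = []
    for prev, cur in zip(segments, segments[1:]):
        if prev[0] == cur[0] and prev[3] != cur[3]:
            breaks.append((cur[0], cur[1]))
    return breaks

def genes_hit(breaks, genes):
    """Names of genes whose interval contains at least one break.
    genes: dict name -> (chrom, start, end)."""
    return {name for chrom, pos in breaks
            for name, (g_chrom, g_start, g_end) in genes.items()
            if g_chrom == chrom and g_start <= pos <= g_end}

def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted, running_min = [0.0] * m, 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def recurrent_breakpoint_genes(cohort, genes):
    """cohort: dict sample -> segments. Returns gene -> (samples hit, FDR).
    Assumption: every gene shares one background break probability."""
    counts = {name: 0 for name in genes}
    for segments in cohort.values():
        for name in genes_hit(breakpoints_from_segments(segments), genes):
            counts[name] += 1
    n = len(cohort)
    background = sum(counts.values()) / (n * len(genes))
    names = list(genes)
    pvals = [binomial_tail(counts[g], n, background) for g in names]
    return {g: (counts[g], q) for g, q in zip(names, benjamini_hochberg(pvals))}

# Toy cohort (hypothetical coordinates and calls), three samples, two genes.
GENES = {"GENE_A": ("chr1", 100, 200), "GENE_B": ("chr1", 500, 600)}
COHORT = {
    "s1": [("chr1", 0, 150, "normal"), ("chr1", 150, 1000, "gain")],
    "s2": [("chr1", 0, 180, "normal"), ("chr1", 180, 1000, "loss")],
    "s3": [("chr1", 0, 1000, "normal")],
}
print(recurrent_breakpoint_genes(COHORT, GENES))
```

The uniform background is the weakest part of this sketch: longer genes and genes covered by more probes are more likely to contain a break by chance, which is the kind of covariate the abstract says the real method corrects for.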
Erin S Kelleher
It frequently has been postulated that intersexual coevolution between the male ejaculate and the female reproductive tract is a driving force in the rapid evolution of reproductive proteins. The dearth of research on female tracts, however, presents a major obstacle to empirical tests of this hypothesis. Here, we employ a comparative EST approach to identify 241 candidate female reproductive proteins in Drosophila arizonae, a repleta group species in which physiological ejaculate-female coevolution has been documented. Thirty-one of these proteins exhibit elevated amino acid substitution rates, making them candidates for molecular coevolution with the male ejaculate. Strikingly, we also discovered 12 unique digestive proteases whose expression is specific to the D. arizonae lower female reproductive tract. These enzymes belong to classes most commonly found in the gastrointestinal tracts of a diverse array of organisms. We show that these proteases are associated with recent, lineage-specific gene duplications in the Drosophila repleta species group, and exhibit strong signatures of positive selection. Observation of adaptive evolution in several female reproductive tract proteins indicates they are active players in the evolution of reproductive tract interactions. Additionally, pervasive gene duplication, adaptive evolution, and rapid acquisition of a novel digestive function by the female reproductive tract point to a novel coevolutionary mechanism of ejaculate-female interaction.
Chaperonin GroEL (Cpn60) requires cofactor GroES (Cpn10) for protein refolding in bacteria that possess single groEL and groES genes in a bicistronic groESL operon. Among 4,861 completely sequenced prokaryotic genomes, 884 possess duplicate groEL genes and 770 possess groEL genes with no neighboring groES. It is unclear whether stand-alone groEL requires groES in order to function and, if required, how duplicate groEL genes and unequal groES genes balance their expression. In Myxococcus xanthus DK1622, we determined that, while the duplicate groELs were alternatively deletable, the single groES that clusters with groEL1 was essential for cell survival. Either GroEL1 or GroEL2 required interactions with GroES for in vitro and in vivo functions. Deletion of groEL1 or groEL2 resulted in decreased expression of both groEL and groES, and ectopic complementation of groEL recovered not only groEL but also groES expression. The addition of an extra groES gene upstream of groEL2 to form a bicistronic operon had almost no influence on groES expression and the cell survival rate, whereas over-expression of groES using a self-replicating plasmid simultaneously increased groEL expression. The results indicated that M. xanthus DK1622 cells coordinate expression of the duplicate groEL and single groES genes for synergistic functions of GroELs and GroES. We proposed a potential regulation mechanism for the expression coordination.
Nevrtalová, Eva; Baloun, Jiří; Hudzieczek, Vojtěch; Čegan, Radim; Vyskot, Boris; Doležel, Jaroslav; Šafář, Jan; Milde, D.; Hobza, Roman
Vol. 251, No. 6 (2014), pp. 1427-1439. ISSN 0033-183X. R&D Projects: GA ČR(CZ) GAP501/12/2220; GA ČR(CZ) GBP501/12/G090; GA ČR(CZ) GP13-34962P; GA ČR(CZ) GA522/09/0083. Institutional support: RVO:68081707. Keywords: Copper; Gene duplication; Metallothionein. Subject RIV: BO - Biophysics; EF - Botanics (UEB-Q). Impact factor: 2.651, year: 2014
Jourda, Cyril; Cardi, Céline; Mbéguié-A-Mbéguié, Didier; Bocs, Stéphanie; Garsmeur, Olivier; D'Hont, Angélique; Yahiaoui, Nabila
Whole-genome duplications (WGDs) are widespread in plants, and three lineage-specific WGDs occurred in the banana (Musa acuminata) genome. Here, we analysed the impact of WGDs on the evolution of banana gene families involved in ethylene biosynthesis and signalling, a key pathway for banana fruit ripening. Banana ethylene pathway genes were identified using comparative genomics approaches and their duplication modes and expression profiles were analysed. Seven out of 10 banana ethylene gene families evolved through WGD and four of them (1-aminocyclopropane-1-carboxylate synthase (ACS), ethylene-insensitive 3-like (EIL), ethylene-insensitive 3-binding F-box (EBF) and ethylene response factor (ERF)) were preferentially retained. Banana orthologues of AtEIN3 and AtEIL1, two major genes for ethylene signalling in Arabidopsis, were particularly expanded. This expansion was paralleled by that of EBF genes which are responsible for control of EIL protein levels. Gene expression profiles in banana fruits suggested functional redundancy for several MaEBF and MaEIL genes derived from WGD and subfunctionalization for some of them. We propose that EIL and EBF genes were co-retained after WGD in banana to maintain balanced control of EIL protein levels and thus avoid detrimental effects of constitutive ethylene signalling. In the course of evolution, subfunctionalization was favoured to promote finer control of ethylene signalling. © 2014 CIRAD New Phytologist © 2014 New Phytologist Trust.
Itokawa, Kentaro; Komagata, Osamu; Kasai, Shinji; Masada, Masahiro; Tomita, Takashi
A cytochrome P450 gene, Cyp9m10, is more than 200-fold overexpressed in a pyrethroid-resistant strain of Culex quinquefasciatus, JPal-per. The haplotype of this strain contains two copies of Cyp9m10 resulting from a recent tandem duplication. In this study, we discovered and isolated a Cyp9m10 haplotype closely related to this duplicated Cyp9m10 haplotype from JHB, a strain used for the recent genome project for this mosquito species. The isolated haplotype (JHB-NIID-B haplotype) shared the same insertion of a transposable element upstream of the coding region with the JPal-per strain but was not duplicated. The JHB-NIID-B haplotype was considered to have diverged from the JPal-per lineage just before the duplication event. Cyp9m10 was moderately overexpressed in larvae with the JHB-NIID-B haplotype. The overexpression in the JHB-NIID-B and JPal-per haplotypes was developmentally regulated in a similar pattern, indicating that both haplotypes share a common cis-acting mutation responsible for the overexpression. The isolated, moderately overexpressed haplotype conferred resistance; however, its efficacy was relatively small. We hypothesized that the first cis-acting mutation modified the consequence of the subsequent duplication in the JPal-per lineage, so that the duplication conferred a stronger phenotypic effect than it would have if it had occurred before the first cis-acting mutation. Copyright © 2011 Elsevier Ltd. All rights reserved.
Albadine, Roula; Latour, Mathieu; Toubaji, Antoun; Haffner, Michael; Isaacs, William B; A Platz, Elizabeth; Meeker, Alan K; Demarzo, Angelo M; Epstein, Jonathan I; Netto, George J
Minute prostatic adenocarcinomas are considered to be of insufficient virulence. Given recent suggestions of TMPRSS2-ERG gene fusion association with aggressive prostatic adenocarcinoma, we evaluated the incidence of TMPRSS2-ERG fusion in minute prostatic adenocarcinomas. A total of 45 consecutive prostatectomies with minute adenocarcinoma were used for tissue microarray construction. A total of 63 consecutive non-minimal, Gleason Score 6 tumors, from a separate PSA Era prostatectomy tissue microarray, were used for comparison. FISH was carried out using ERG break-apart probes. Tumors were assessed for fusion by deletion (Edel) or split (Esplit), duplicated fusions and low-level copy number gain in normal ERG gene locus. Minute adenocarcinomas: Fusion was evaluable in 32/45 tumors (71%). Fifteen out of 32 (47%) tumors were positive for fusion. Six (19%) were of the Edel class and 7 (22%) were classified as combined Edel+Esplit. Non-minute adenocarcinomas (pT2): Fusion was identified in 20/30 tumors (67%). Four (13%) were of Edel class and 5 (17%) were combined Edel+Esplit. Duplicated fusions were encountered in 5 (16%) tumors. Non-minute adenocarcinomas (pT3): Fusion was identified in 19/33 (58%). Fusion was due to a deletion in 6 (18%) tumors. Seven tumors (21%) were classified as combined Edel+Esplit. One tumor showed Esplit alone. Duplicated fusions were encountered in 3 (9%) cases. The incidence of duplicated fusions was higher in non-minute adenocarcinomas (13 vs 0%; P=0.03). A trend for higher incidence of low-level copy number gain in normal ERG gene locus without fusion was noted in non-minute adenocarcinomas (10 vs 0%; P=0.07). We found a TMPRSS2-ERG fusion rate of 47% in minute adenocarcinomas. The latter is not significantly different from that of grade matched non-minute adenocarcinomas. The incidence of duplicated fusion was higher in non-minute adenocarcinomas. Our finding of comparable rate of TMPRSS2-ERG fusion in minute adenocarcinomas may argue
Hills, Mark; Jeyapalan, Jennie N; Foxon, Jennifer L; Royle, Nicola J
Subterminal regions, juxtaposed to telomeres on human chromosomes, contain a high density of segmental duplications, but relatively little is known about the evolutionary processes that underlie sequence turnover in these regions. We have characterized a segmental duplication adjacent to the Xp/Yp telomere, each copy containing a hypervariable array of the DXYS14 minisatellite. Both DXYS14 repeat arrays mutate at a high rate (0.3 and 0.2% per gamete) but linkage disequilibrium analysis across 27 SNPs and a direct crossover assay show that recombination during meiosis is suppressed. Therefore instability at DXYS14a and b is dominated by intra-allelic processes or possibly conversion limited to the repeat arrays. Furthermore some chromosomes (14%) carry only one copy of the duplicon, including one DXYS14 repeat array that is also highly mutable (1.2% per gamete). To explain these and other observations, we propose there is another low-rate mutation process that causes copy number change in part or all of the duplicon.
Titus, Tom A.; Yan, Yi-Lin; Wilson, Catherine; Starks, Amber M.; Frohnmayer, Jonathan D.; Canestro, Cristian; Rodriguez-Mari, Adriana; He, Xinjun; Postlethwait, John H.
Fanconi anemia (FA) is a genetic disease resulting in bone marrow failure, high cancer risk, infertility, and developmental anomalies including microphthalmia, microcephaly, and hypoplastic radius and thumb. Here we present cDNA sequences, genetic mapping, and genomic analyses for the four previously undescribed zebrafish FA genes (fanci, fancj, fancm, and fancn), and show that they reverted to single copy after the teleost genome duplication. We tested the hypothesis that FA genes are expresse...
Zhang, Gui-Min; Zheng, Li; He, Hua; Song, Cheng-Chuang; Zhang, Zi-Jing; Cao, Xiu-Kai; Lei, Chu-Zhao; Lan, Xian-Yong; Qi, Xing-Lei; Chen, Hong; Huang, Yong-Zhen
Copy number variations (CNVs) have recently been recognized as another important source of genetic variability, following single nucleotide polymorphisms (SNPs). The guanylate binding protein 2 (GBP2) gene plays an important role in cell proliferation. This study was performed to determine the presence of GBP2 CNV (relative to Angus cattle) in 466 individuals representing six main cattle breeds from China, identify its relationship with growth, and explore the biological effects of gene expression. There were two CNV regions in the GBP2 gene. Among the three CNV types, the CNV1 loss type (relative to Angus cattle) was more frequent in XN than in other breeds, and the CNV2 loss type (relative to Angus cattle) was more frequent in XN and CDM than in other breeds. Although the GBP2 gene copy number showed no correlation with transcriptional expression in JX (P > .05), transcriptional expression in heart was higher than in other tissues, and the copy number in muscle and fat of JX was higher than in other breeds. Statistical analysis revealed that the GBP2 gene CNV1 and CNV2 were significantly associated with growth traits (P cattle breeds, and our results suggested that the CNVs in GBP2 gene may be considered markers for the molecular breeding of Chinese beef cattle. Copyright © 2018. Published by Elsevier B.V.
Presence of the human Y chromosome in females with Turner Syndrome (TS) enhances the risk of development of gonadoblastoma besides causing several other phenotypic abnormalities. In the present study, we analyzed the Y chromosome in 15 clinically diagnosed TS patients and detected a high level of mosaicism ranging from 45,XO:46,XY = 100:0% in 4; 45,XO:46,XY:46XX = 4:94:2 in 8; and 45,XO:46,XY:46XX = 50:30:20 cells in 3 TS patients, unlike previous reports showing 5-8% cells with Y material. Also, no ring, marker or dicentric Y was observed in any of the cases. Of the two TS patients having an intact Y chromosome in >85% cells, one was exceptionally tall. Both patients were positive for SRY, DAZ, CDY1, DBY, UTY and AZFa, b and c specific STSs. Real Time PCR and FISH demonstrated tandem duplication/multiplication of the SRY and DAZ genes. At the sequence level, SRY was normal in 8 TS patients while the remaining 7 showed either absence of this gene or known and novel mutations within and outside of the HMG box. SNV/SFV analysis showed the normal four copies of the DAZ genes in these 8 patients. All the TS patients showed an aplastic uterus with no ovaries and no symptoms of gonadoblastoma. The present study demonstrates new types of polymorphisms, indicating that no two TS patients have an identical genotype-phenotype. Thus, a comprehensive analysis of a larger number of samples is warranted to reach consensus on the loci affected, so that they can be used as potential diagnostic markers.
Grunnet, Mie; Calatayud, Dan; Schultz, Nicolai Aa.
) poison. Top1 protein, TOP1 gene copy number and mRNA expression, respectively, have been proposed as predictive biomarkers of response to irinotecan in other cancers. Here we investigate the occurrence of TOP1 gene aberrations in cancers of the bile ducts and pancreas. Material and methods. TOP1...
Insulin-like growth factors (IGFs) are key regulators of development, growth, and longevity. In most vertebrate species including humans, there is one IGF-1 gene and one IGF-2 gene. Here we report the identification and functional characterization of 4 distinct IGF genes (termed igf-1a, -1b, -2a, and -2b) in zebrafish. These genes encode 4 structurally distinct and functional IGF peptides. IGF-1a and IGF-2a mRNAs were detected in multiple tissues in adult fish. IGF-1b mRNA was detected only in the gonad and IGF-2b mRNA only in the liver. Functional analysis showed that all 4 IGFs caused similar developmental defects but with different potencies. Many of these embryos had fully or partially duplicated notochords, suggesting that an excess of IGF signaling causes defects in the midline formation and an expansion of the notochord. IGF-2a, the most potent IGF, was analyzed in depth. IGF-2a expression caused defects in the midline formation and expansion of the notochord but it did not alter the anterior neural patterning. These results not only provide new insights into the functional conservation and divergence of the multiple igf genes but also reveal a novel role of IGF signaling in midline formation and notochord development in a vertebrate model.
Palta, Priit; Kaplinski, Lauris; Nagirnaja, Liina; Veidenberg, Andres; Möls, Märt; Nelis, Mari; Esko, Tõnu; Metspalu, Andres; Laan, Maris; Remm, Maido
DNA copy number variants (CNVs) that alter the copy number of a particular DNA segment in the genome play an important role in human phenotypic variability and disease susceptibility. A number of CNVs overlapping with genes have been shown to confer risk for a variety of human diseases, highlighting the relevance of addressing the variability of CNVs at a higher resolution. So far, it has not been possible to deterministically infer the allelic composition of different haplotypes present within the CNV regions. We have developed a novel computational method, called PiCNV, which resolves the haplotype sequence composition within CNV regions in nuclear families based on SNP genotyping microarray data. The algorithm can i) phase normal and CNV-carrying haplotypes in the copy number variable regions, ii) resolve the allelic copies of rearranged DNA sequence within the haplotypes and iii) infer the heritability of identified haplotypes in trios or larger nuclear families. To our knowledge this is the first program available that can deterministically phase null, mono-, di-, tri- and tetraploid genotypes in CNV loci. We applied our method to study the composition and inheritance of haplotypes in CNV regions of 30 HapMap Yoruban trios and 34 Estonian families. For 93.6% of the CNV loci, PiCNV unambiguously phased normal and CNV-carrying haplotypes and followed their transmission in the corresponding families. Furthermore, allelic composition analysis identified the co-occurrence of alternative allelic copies within 66.7% of haplotypes carrying copy number gains. We also observed less frequent transmission of CNV-carrying haplotypes from parents to children compared to normal haplotypes and identified the emergence of several de novo deletions and duplications in the offspring.
Ingason, A; Rujescu, D; Cichon, S
.007) and deletions in 0.12% of cases and 0.04% of controls (P>0.05). The region can be divided into three intervals defined by flanking low copy repeats. Duplications spanning intervals I and II showed the most significant (P = 0.00010) association with schizophrenia. The age of onset in duplication and deletion carriers among cases ranged from 12 to 35 years, and the majority were males with a family history of psychiatric disorders. In a single Icelandic family, a duplication spanning intervals I and II was present in two cases of schizophrenia, and individual cases of alcoholism, attention deficit hyperactivity...
McKinney, C.; Fanciulli, M.; Merriman, M.E.; Phipps-Green, A.; Alizadeh, B.Z.; Koeleman, B.P.; Dalbeth, N.; Gow, P.J.; Harrison, A.A.; Highton, J.; Jones, P.B.; Stamp, L.K.; Steer, S.; Barrera, P.; Coenen, M.J.H.; Franke, B.; Riel, P.L.C.M. van; Vyse, T.J.; Aitman, T.J.; Radstake, T.R.D.J.; Merriman, T.R.
OBJECTIVE: There is increasing evidence that variation in gene copy number (CN) influences clinical phenotype. The low-affinity Fcgamma receptor 3B (FCGR3B) located in the FCGR gene cluster is a CN polymorphic gene involved in the recruitment to sites of inflammation and activation of
Zou, Zhi; Liu, Jianting; Yang, Lifu; Xie, Guishui
Arabidopsis thaliana SAG12, a senescence-specific gene encoding a cysteine protease, is widely used as a molecular marker for the study of leaf senescence. To date, its potential orthologues have been isolated from several plant species such as Brassica napus and Nicotiana tabacum. However, little information is available for rubber tree (Hevea brasiliensis), a rubber-producing plant of the Euphorbiaceae family. This study presents the identification of SAG12-like genes from the rubber tree genome. Results showed an unexpectedly high number of 17 rubber orthologues with a single intron, in contrast to the single copy with two introns in Arabidopsis. The gene expansion was also observed in two other Euphorbiaceae plants, castor bean (Ricinus communis) and physic nut (Jatropha curcas), both of which contain 8 orthologues. Consistent with the absence of recent whole-genome duplication (WGD) events, most duplicates in castor and physic nut resulted from tandem duplications. In contrast, the duplicated HbSAG12H genes were derived from tandem duplications as well as the recent WGD. Expression analysis showed that most HbSAG12H genes were expressed at low levels in the examined tissues except for root and male flower. Furthermore, HbSAG12H1 exhibits a strictly senescence-associated expression pattern in rubber tree leaves, and thus can be used as a marker gene for the study of the senescence mechanism in Hevea.
Bansal, Mukul S; Kellis, Manolis; Kordi, Misagh; Kundu, Soumya
RANGER-DTL 2.0 is a software program for inferring gene family evolution using Duplication-Transfer-Loss reconciliation. This new software is highly scalable and easy to use, and offers many new features not currently available in any other reconciliation program. RANGER-DTL 2.0 has a particular focus on reconciliation accuracy and can account for many sources of reconciliation uncertainty including uncertain gene tree rooting, gene tree topological uncertainty, multiple optimal reconciliations, and alternative event cost assignments. RANGER-DTL 2.0 is open-source and written in C++ and Python. Pre-compiled executables, source code (open-source under GNU GPL), and a detailed manual are freely available from http://compbio.engr.uconn.edu/software/RANGER-DTL/.
Curry, Cynthia J; Rosenfeld, Jill A; Grant, Erica
Chromosome 17p13.3 is a gene-rich region that when deleted is associated with the well-known Miller-Dieker syndrome. A recently described duplication syndrome involving this region has been associated with intellectual impairment, autism and occasional brain MRI abnormalities. We report 34... Older patients were often overweight. Three variant phenotypes included cleft lip/palate (CLP), split hand/foot with long bone deficiency (SHFLD), and a connective tissue phenotype resembling Marfan syndrome. The duplications in patients with clefts appear to disrupt ABR, while the SHFLD phenotype was associated with duplication of BHLHA9 as noted in two recent reports. The connective tissue phenotype did not have a convincing critical region. Our experience with this large cohort expands knowledge of this diverse duplication syndrome.
Costa, Larissa Carvalho; Nalin, Rafael Storto; Ramalho, Magno Antonio Patto; de Souza, Elaine Aparecida
Race 65 of Colletotrichum lindemuthianum, the etiologic agent of anthracnose in common bean, is distributed worldwide and is of great importance in breeding programs for anthracnose resistance. Several alleles conferring resistance to this race have been identified. However, the variability detected within the race has made it difficult to obtain cultivars with durable resistance, because cultivars may react differently to each strain of race 65. Thus, this work aimed to study the inheritance of resistance of common bean lines to different strains of C. lindemuthianum race 65. We used six C. lindemuthianum strains previously characterized as belonging to race 65 through the international set of differential cultivars of anthracnose, and nine commercial cultivars adapted to Brazilian growing conditions and with potential ability to discriminate the variability within this race. To obtain information on the inheritance of resistance of the nine commercial cultivars to the six strains of race 65, these cultivars were crossed two by two in all possible combinations, resulting in 36 hybrids. Segregation in the F2 generations revealed that the resistance to each strain is conditioned by two independent genes with the same function, suggesting that they are duplicated genes, where the dominant allele promotes resistance. These results indicate that the specificity between host resistance genes and pathogen avirulence genes is not limited to races; it also occurs within strains of the same race. Further research may be carried out to establish whether the alleles identified in these cultivars are different from those described in the literature.
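The duplicate-gene interpretation above implies a characteristic F2 ratio, which can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' analysis: it assumes the F1 is heterozygous at two unlinked loci and that a single dominant allele at either locus is enough for resistance, which gives the classical 15:1 resistant-to-susceptible ratio. The function names and the plant counts in the usage note are hypothetical.

```python
from fractions import Fraction
from itertools import product


def f2_resistant_fraction(n_genes: int = 2) -> Fraction:
    """Expected resistant fraction in F2 for n unlinked duplicate dominant genes.

    Assumption: the F1 is heterozygous (R/r) at every locus, and one dominant
    allele R at any locus confers resistance.
    """
    # Every gamete carries one allele per locus; all combinations are equally likely.
    gametes = list(product("Rr", repeat=n_genes))
    resistant = 0
    total = 0
    for egg, pollen in product(gametes, repeat=2):
        total += 1
        # A plant is susceptible only if it is r/r at every locus.
        if any(a == "R" or b == "R" for a, b in zip(egg, pollen)):
            resistant += 1
    return Fraction(resistant, total)


def chi_square_15_to_1(resistant: int, susceptible: int) -> float:
    """Chi-square statistic (1 degree of freedom) against a 15:1 expectation."""
    n = resistant + susceptible
    expected_r = n * 15 / 16
    expected_s = n / 16
    return ((resistant - expected_r) ** 2 / expected_r
            + (susceptible - expected_s) ** 2 / expected_s)
```

With two genes, `f2_resistant_fraction(2)` evaluates to 15/16. For observed F2 counts (for example a hypothetical 150 resistant and 10 susceptible plants), a `chi_square_15_to_1` value below 3.841 would be compatible with the 15:1 hypothesis at the 5% level.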
Salati, Simona; Zini, Roberta; Nuzzo, Simona; Guglielmelli, Paola; Pennucci, Valentina; Prudente, Zelia; Ruberti, Samantha; Rontauroli, Sebastiano; Norfo, Ruggiero; Bianchi, Elisa; Bogani, Costanza; Rotunno, Giada; Fanelli, Tiziana; Mannarelli, Carmela; Rosti, Vittorio; Salmoiraghi, Silvia; Pietra, Daniela; Ferrari, Sergio; Barosi, Giovanni; Rambaldi, Alessandro; Cazzola, Mario; Bicciato, Silvio; Tagliafico, Enrico; Vannucchi, Alessandro M; Manfredini, Rossella
Primary myelofibrosis (PMF) is a Myeloproliferative Neoplasm (MPN) characterized by megakaryocyte hyperplasia, progressive bone marrow fibrosis, extramedullary hematopoiesis and transformation to Acute Myeloid Leukemia (AML). A number of phenotypic driver (JAK2, CALR, MPL) and additional subclonal mutations have been described in PMF, pointing to a complex genomic landscape. To discover novel genomic lesions that can contribute to disease phenotype and/or development, gene expression and copy number signals were integrated and several genomic abnormalities leading to a concordant alteration in gene expression levels were identified. In particular, copy number gain in the polyamine oxidase (PAOX) gene locus was accompanied by a coordinated transcriptional up-regulation in PMF patients. PAOX inhibition resulted in rapid cell death of PMF progenitor cells, while sparing normal cells, suggesting that PAOX inhibition could represent a therapeutic strategy to selectively target PMF cells without affecting normal hematopoietic cells' survival. Moreover, copy number loss in the chromatin modifier HMGXB4 gene correlates with a concomitant transcriptional down-regulation in PMF patients. Interestingly, silencing of HMGXB4 induces megakaryocyte differentiation, while inhibiting erythroid development, in human hematopoietic stem/progenitor cells. These results highlight a previously un-reported, yet potentially interesting role of HMGXB4 in the hematopoietic system and suggest that genomic and transcriptional imbalances of HMGXB4 could contribute to the aberrant expansion of the megakaryocytic lineage that characterizes PMF patients. © 2015 UICC.
Piskur, Jure; Sandrini, Michael P; Knecht, Wolfgang
Deoxyribonucleoside kinases, which catalyse the phosphorylation of deoxyribonucleosides, are present in several copies in most multicellular organisms and therefore represent an excellent model to study gene duplication and specialisation of the duplicated copies through partitioning of substrate...
Liu, Xiang; Li, Shangqi; Peng, Wenzhu; Feng, Shuaisheng; Feng, Jianxin; Mahboob, Shahid; Al-Ghanim, Khalid A; Xu, Peng
The ATP-binding cassette (ABC) gene family is considered to be one of the largest gene families in all forms of prokaryotic and eukaryotic life. Although the ABC transporter genes have been annotated in some species, detailed information about the ABC superfamily and the evolutionary characterization of ABC genes in common carp (Cyprinus carpio) are still unclear. In this research, we identified 61 ABC transporter genes in the common carp genome. Phylogenetic analysis revealed that they could be classified into seven subfamilies, namely 11 ABCAs, six ABCBs, 19 ABCCs, eight ABCDs, two ABCEs, four ABCFs, and 11 ABCGs. Comparative analysis of the ABC genes in seven vertebrate species including common carp, showed that at least 10 common carp genes were retained from the third round of whole genome duplication, while 12 duplicated ABC genes may have come from the fourth round of whole genome duplication. Gene losses were also observed for 14 ABC genes. Expression profiles of the 61 ABC genes in six common carp tissues (brain, heart, spleen, kidney, intestine, and gill) revealed extensive functional divergence among the ABC genes. Different copies of some genes had tissue-specific expression patterns, which may indicate some gene function specialization. This study provides essential genomic resources for future studies in common carp. PMID:27058731
Norshakimah Md Bakri
Background: Several studies in various populations have been conducted to determine candidate genes that could contribute to age-related macular degeneration (AMD) pathogenesis. Objective: The present study was undertaken to determine the association of high temperature requirement A-1 (HTRA1), vascular endothelial growth factor (VEGF) and very-low-density receptor (VLDR) genes with wet AMD subjects in Malaysia. Methods: A total of 125 subjects with wet AMD and 120 subjects without AMD from the Malaysian population were selected for this study. Genomic DNA was extracted and copy number variations (CNVs) were determined using quantitative real-time Polymerase Chain Reaction (qPCR), and the two groups were compared. The demographic characteristics were also recorded. Statistical analysis was carried out using software where a level of P 0.05. Conclusion: Observations of an association between CNVs of VEGF gene and wet AMD have revealed that the CNVs of VEGF gene appear to be a possible contributor to wet AMD subjects in Malaysia. Keywords: Age-related macular degeneration, Copy number variations, VEGF, HTRA1, VLDR genes and Malaysia
Windholz, Jan; Kovacs, Peter; Schlicke, Marina; Franke, Christin; Mahajan, Anubha; Morris, Andrew P; Lemke, Johannes R; Klammt, Jürgen; Kiess, Wieland; Schöneberg, Torsten; Pfäffle, Roland; Körner, Antje
Obesity is genetically heterogeneous and highly heritable, although polymorphisms explain the phenotype in only a small proportion of obese children. We investigated the presence of copy number variations (CNVs) in "classical" genes known to be associated with (monogenic) early-onset obesity in children. In 194 obese Caucasian children selected for early-onset and severe obesity from our obesity cohort we screened for deletions and/or duplications by multiplex ligation-dependent probe amplification reaction (MLPA). As we found one MLPA probe to interfere with a polymorphism in SIM1 we investigated its association with obesity and other phenotypic traits in our extended cohort of 2305 children. In the selected subset of most severely obese children, we did not find CNV with MLPA in POMC, LEP, LEPR, MC4R, MC3R or MC2R genes. However, one SIM1 probe located at exon 9 gave signals suggestive for SIM1 insufficiency in 52 patients. Polymerase chain reaction (PCR) analysis identified this as a false positive result due to interference with single nucleotide polymorphism (SNP) rs3734354/rs3734355. We, therefore, investigated for associations of this polymorphism with obesity and metabolic traits in our extended cohort. We found rs3734354/rs3734355 to be associated with body mass index-standard deviation score (BMI-SDS) (p = 0.003), but not with parameters of insulin metabolism, blood pressure or food intake. In our modest sample of severely obese children, we were unable to find CNVs in well-established monogenic obesity genes. Nevertheless, we found an association of rs3734354 in SIM1 with obesity of early-onset type in children, although not with obesity-related traits.
Han, Yuepeng; Gasic, Ksenija; Sun, Fengjie; Xu, Mingliang; Korban, Schuyler S
An apple starch-branching enzyme SbeI gene (GenBank Accession No. DQ115404) has been isolated, cloned, and sequenced. SbeI is a single-copy gene in the apple genome, consisting of 14 exons and 13 introns and covering 6075 bp. As detected by RT-PCR, the apple SbeI is expressed at very low levels during early stages of fruit development, while the highest levels of mRNA transcripts are observed at approximately 44 days post-pollination. Besides fruits, the apple SbeI is also expressed in buds and flowers, and very weakly in leaves. The genomic structure of SbeI in apple is strikingly similar to those reported so far in grasses (Poaceae), with exons 4 through 13 being of identical lengths in both apple and grasses. Moreover, structural similarities in exon lengths have also been detected in SbeII genes of both grasses and eudicots. These findings prompted the investigation of the evolutionary process of the Sbe gene family in angiosperms. A total of 26 Sbe sequences, representing an array of monocots and eudicots, are investigated in this study. Phylogenetic analysis suggests that Sbe genes duplicated into SbeI and SbeII prior to the divergence of monocots from eudicots. The SbeII gene further duplicated into SbeIIa and SbeIIb prior to the radiation of grasses; however, it is not yet clear whether this duplication event occurred before or after the radiation of the eudicots.
Goostrey, Anna; Jones, Gareth; Secombes, Christopher J
The CXC group of chemokines exert their cellular effects via the CXCR group of G-protein coupled receptors. Six CXCR genes have been identified in humans (CXCR1-6), and homologues to some of these have been isolated from a range of vertebrate species. Here we isolate and characterize CXCR genes from a range of elasmobranch species. One CXCR1/2 gene fragment isolated from Scyliorhinus caniculus (lesser spotted catshark), and two CXCR1/2 copies from each of the elasmobranchs, Cetorhinus maximus (basking shark), Carcharodon carcharias (great white shark), and Raja naevus (cuckoo ray), exhibit high similarity to both CXCR1 and CXCR2. The two copies evident in the cuckoo ray and lamniform sharks provide strong evidence of CXCR1/2 lineage specific duplication in rays and sharks. A CXCR fragment isolated from Lamna ditropis (salmon shark) shows high similarity to a range of CXCR4 genes and strong clustering with CXCR4 gene homologues was apparent during phylogenetic reconstruction.
Gene knockdown using microRNA (miRNA)-based vector constructs is likely to become a prominent gene therapy approach. It was the aim of this study to improve the efficiency of gene knockdown through optimizing the structure of miRNA mimics. Knockdown of two target genes was analyzed: CCR5 and green fluorescent protein. We describe here a novel and optimized miRNA mimic design called mirGE comprising a lower stem length of 13 base pairs (bp), positioning of the targeting strand on the 5′ side of the miRNA, together with nucleotide mismatches in upper stem positions 1 and 12 placed on the passenger strand. Our mirGE proved superior to miR-30 in four aspects: yield of targeting strand incorporation into RNA-induced silencing complex (RISC); incorporation into RISC of correct targeting strand; precision of cleavage by Drosha; and ratio of targeting strand over passenger strand. A triple mirGE hairpin cassette targeting CCR5 was constructed. It allowed CCR5 knockdown with an efficiency of over 90% upon single-copy transduction. Importantly, single-copy expression of this construct rendered transduced target cells, including primary human macrophages, resistant to infection with a CCR5-tropic strain of HIV. Our results provide new insights for a better knockdown efficiency of constructs containing miRNA. Our results also provide the proof-of-principle that cells can be rendered HIV resistant through single-copy vector transduction, rendering this approach more compatible with clinical applications.
Duplications and rearrangements of coding genes are major themes in the evolution of mitochondrial genomes, bearing important consequences in the function of mitochondria and the fitness of organisms. Yu et al. (BMC Genomics 2008, 9:477) reported the complete mt genome sequence of the oyster Crassostrea hongkongensis (16,475 bp) and found that a DNA segment containing four tRNA genes (trnK1, trnC, trnQ1 and trnN), a duplicated (rrnS) and a split rRNA gene (rrnL5') was absent compared with that of two other Crassostrea species. It was suggested that the absence was a novel case of "tandem duplication-random loss" with evolutionary significance. We independently sequenced the complete mt genome of three C. hongkongensis individuals, all of which were 18,622 bp and contained the segment that was missing in Yu et al.'s sequence. Further, we designed primers, verified sequences and demonstrated that the sequence loss in Yu et al.'s study was an artifact caused by placing primers in a duplicated region. The duplication and split of ribosomal RNA genes are unique for Crassostrea oysters and not lost in C. hongkongensis. Our study highlights the need for caution when amplifying and sequencing through duplicated regions of the genome.
Norton, Nadine; Advani, Pooja P.; Serie, Daniel J.; Geiger, Xochiquetzal J.; Necela, Brian M.; Axenfeld, Bianca C.; Kachergus, Jennifer M.; Feathers, Ryan W.; Carr, Jennifer M.; Crook, Julia E.; Moreno-Aspitia, Alvaro; Anastasiadis, Panos Z.; Perez, Edith A.; Thompson, E. Aubrey
Background Invasive lobular carcinoma (ILC) comprises approximately 10–20% of breast cancers. In general, multifocal/multicentric (MF/MC) breast cancer has been associated with an increased rate of regional lymph node metastases. Tumor heterogeneity between foci represents a largely unstudied source of genomic variation in those rare patients with MF/MC ILC. Methods We characterized gene expression and copy number in 2 or more foci from 11 patients with MF/MC ILC (all ER+, HER2-) and adjacent normal tissue. RNA and DNA were extracted from 3x1.5mm cores from all foci. Gene expression (730 genes) and copy number (80 genes) were measured using Nanostring PanCancer and Cancer CNV panels. Linear mixed models were employed to compare expression in tumor versus normal samples from the same patient, and to assess heterogeneity (variability) in expression among multiple ILC foci within an individual. Results 35 and 34 genes were upregulated (FC>2) and down-regulated (FC<0.5), respectively, in ILC tumor relative to adjacent normal tissue, q<0.05. 9/34 down-regulated genes (FIGF, RELN, PROM1, SFRP1, MMP7, NTRK2, LAMB3, SPRY2, KIT) had changes larger than CDH1, a hallmark of ILC. Copy number changes in these patients were relatively few but consistent across foci within each patient. Amplification of three genes (CCND1, FADD, ORAOV1) at 11q13.3 was present in 2/11 patients in both foci. We observed significant evidence of within-patient between-foci variability (heterogeneity) in gene expression for 466 genes (p<0.05 with FDR 8%), including CDH1, FIGF, RELN, SFRP1, MMP7, NTRK2, LAMB3, SPRY2 and KIT. Conclusions There was substantial variation in gene expression between ILC foci within patients, including known markers of ILC, suggesting an additional level of complexity that should be addressed. PMID:27078887
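The fold-change and q-value cut-offs quoted in this abstract (FC>2, FC<0.5, q<0.05) amount to a simple decision rule, sketched below. This is an illustrative reading of the thresholds only, not the authors' linear mixed model pipeline; the function name, the label strings, and the example gene values are hypothetical.

```python
def classify_gene(fold_change: float, q_value: float, q_cutoff: float = 0.05) -> str:
    """Label a gene with the thresholds quoted in the abstract.

    fold_change is tumor relative to adjacent normal tissue (linear scale).
    """
    if q_value >= q_cutoff:
        return "not significant"
    if fold_change > 2:
        return "up-regulated"
    if fold_change < 0.5:
        return "down-regulated"
    # Significant q-value, but the fold change lies within 0.5..2.
    return "not significant"


# Hypothetical (gene, fold change, q-value) triples, not data from the study.
example = [("GENE_A", 3.1, 0.01), ("GENE_B", 0.2, 0.03), ("GENE_C", 4.0, 0.20)]
labels = {name: classify_gene(fc, q) for name, fc, q in example}
```

Checking the q-value first means that a large fold change with a weak q-value, such as the hypothetical GENE_C, is never counted as differentially expressed.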
Arabidopsis thaliana population analysis reveals high plasticity of the genomic region spanning MSH2, AT3G18530 and AT3G18535 genes and provides evidence for NAHR-driven recurrent CNV events occurring in this location.
Zmienko, Agnieszka; Samelak-Czajka, Anna; Kozlowski, Piotr; Szymanska, Maja; Figlerowicz, Marek
Intraspecies copy number variations (CNVs), defined as unbalanced structural variations of specific genomic loci, ≥1 kb in size, are present in the genomes of animals and plants. A growing number of examples indicate that CNVs may have functional significance and contribute to phenotypic diversity. In the model plant Arabidopsis thaliana at least several hundred protein-coding genes might display CNV; however, locus-specific genotyping studies in this plant have not been conducted. We analyzed the natural CNVs in the region overlapping the MSH2 gene, which encodes the DNA mismatch repair protein, and the AT3G18530 and AT3G18535 genes, which encode poorly characterized proteins. By applying multiplex ligation-dependent probe amplification and droplet digital PCR we genotyped those genes in 189 A. thaliana accessions. We found that AT3G18530 and AT3G18535 were duplicated (2-14 times) in 20 and deleted in 101 accessions. MSH2 was duplicated in 12 accessions (up to 12-14 copies) but never deleted. In all but one case, the MSH2 duplications were associated with those of AT3G18530 and AT3G18535. Considering the structure of the CNVs, we distinguished 5 genotypes for this region and determined their frequency and geographical distribution. We defined the CNV breakpoints in 35 accessions with AT3G18530 and AT3G18535 deletions and tandem duplications and showed that they were reciprocal events, resulting from non-allelic homologous recombination between 99%-identical sequences flanking these genes. The widespread geographical distribution of the deletions, supported by the SNP and linkage disequilibrium analyses of the genomic sequence, confirmed the recurrent nature of this CNV. We characterized in detail, for the first time, a complex multiallelic CNV in the Arabidopsis genome. The region encoding MSH2, AT3G18530 and AT3G18535 genes shows enormous variation of copy numbers among natural ecotypes, being a remarkable example of high Arabidopsis genome plasticity. We provided the molecular
Lupins, like other legumes, have a unique biosynthesis scheme of 5-deoxy-type flavonoids and isoflavonoids. A key enzyme in this pathway is chalcone isomerase (CHI), a member of the CHI-fold protein family, encompassing subfamilies of CHI1, CHI2, CHI-like (CHIL), and fatty acid-binding (FAP) proteins. Here, two Lupinus angustifolius (narrow-leafed lupin) CHILs, LangCHIL1 and LangCHIL2, were identified and characterized using DNA fingerprinting, cytogenetic and linkage mapping, sequencing and expression profiling. Clones carrying CHIL sequences were assembled into two contigs. Full gene sequences were obtained from these contigs, and mapped in two L. angustifolius linkage groups by gene-specific markers. A bacterial artificial chromosome fluorescence in situ hybridization approach confirmed the localization of the two LangCHIL genes in distinct chromosomes. The expression profiles of both LangCHIL isoforms were very similar. The highest level of transcription was in roots in the third week of plant growth; thereafter, expression declined. The expression of both LangCHIL genes in leaves and stems was similar and low. Comparative mapping to reference legume genome sequences revealed strong syntenic links; however, the LangCHIL2 contig had a much more conserved structure than LangCHIL1. LangCHIL2 is assumed to be an ancestor gene, whereas LangCHIL1 probably appeared as a result of duplication. As both copies are transcriptionally active, questions arise concerning their hypothetical functional divergence. Screening of the narrow-leafed lupin genome and transcriptome with CHI-fold protein sequences, followed by Bayesian inference of phylogeny and cross-genera synteny survey, identified representatives of all but one (CHI1) main subfamilies. They are as follows: two copies of CHI2, FAPa2 and CHIL, and single copies of FAPb and FAPa1. Duplicated genes are remnants of whole genome duplication which is assumed to have occurred after the divergence of Lupinus, Arachis
Kumazawa, Yoshinori; Endo, Hideki
The mitochondrial genome of the Komodo dragon (Varanus komodoensis) was nearly completely sequenced, except for two highly repetitive noncoding regions. An efficient sequencing method for squamate mitochondrial genomes was established by combining the long polymerase chain reaction (PCR) technology and a set of reptile-oriented primers designed for nested PCR amplifications. It was found that the mitochondrial genome had novel gene arrangements in which genes from NADH dehydrogenase subunit 6 to proline tRNA were extensively shuffled with duplicate control regions. These control regions had 99% sequence similarity over 700 bp. Although snake mitochondrial genomes are also known to possess duplicate control regions with nearly identical sequences, the location of the second control region suggested independent occurrence of the duplication on lineages leading to snakes and the Komodo dragon. Another feature of the mitochondrial genome of the Komodo dragon was the considerable number of tandem repeats, including sequences with a strong secondary structure, as a possible site for the slipped-strand mispairing in replication. These observations are consistent with hypotheses that tandem duplications via the slipped-strand mispairing may induce mitochondrial gene rearrangements and may serve to maintain similar copies of the control region.
Whole genome duplication (WGD) and tandem duplication (TD) are both important modes of gene expansion. However, how whole genome duplication influences tandemly duplicated genes is not well studied. We used Brassica rapa, which has undergone an additional genome triplication (WGT) and shares a common ancestor with Arabidopsis thaliana, Arabidopsis lyrata and Thellungiella parvula, to investigate the impact of genome triplication on tandem gene evolution. We identified 2,137, 1,569, 1,751 and 1,135 tandem gene arrays in B. rapa, A. thaliana, A. lyrata and T. parvula, respectively. Among them, 414 conserved tandem arrays are shared by the 3 species without WGT and were therefore considered to have existed in the diploid ancestor of B. rapa. Thus, after genome triplication, B. rapa would be expected to have 1,242 tandem arrays derived from the 414 conserved tandems. We found that 400 of the 414 tandems had at least one syntenic ortholog in the genome of B. rapa. Furthermore, 294 of the 400 shared syntenic orthologs maintain tandem arrays (more than one gene for each syntenic hit) in B. rapa. For the 294 tandem arrays, we obtained 426 copies of syntenic paralogous tandems in the triplicated genome of B. rapa. In this study, we demonstrated that tandem arrays in B. rapa were dramatically fractionated after WGT when compared either to non-tandem genes in the B. rapa genome or to the tandem arrays in closely related species that have not experienced a recent whole-genome polyploidization event.
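The arithmetic behind the expected and observed tandem-array counts in the abstract above can be checked in a few lines. This is a sketch using only the figures quoted there; the variable names are illustrative, and reading the ratio as a "retained fraction" is an interpretation, not a statistic reported by the authors.

```python
# Figures quoted in the abstract above.
conserved_ancestral_arrays = 414  # tandem arrays shared by the three species without WGT
triplication_factor = 3           # whole-genome triplication gives three subgenome copies
observed_tandem_copies = 426      # syntenic paralogous tandem copies found in B. rapa

# Expected number of arrays if every ancestral array survived in all three subgenomes.
expected_arrays = conserved_ancestral_arrays * triplication_factor

# Share of the expected arrays still present as tandems (assumed interpretation).
retained_fraction = observed_tandem_copies / expected_arrays

print(expected_arrays)              # 1242
print(round(retained_fraction, 3))  # 0.343
```

On these numbers, roughly one third of the arrays expected after triplication remain as tandems, which is consistent with the abstract's description of strong fractionation.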
Genome-wide identification and comparative expression analysis reveal a rapid expansion and functional divergence of duplicated genes in the WRKY gene family of cabbage, Brassica oleracea var. capitata.
Yao, Qiu-Yang; Xia, En-Hua; Liu, Fei-Hu; Gao, Li-Zhi
WRKY transcription factors (TFs), one of the ten largest TF families in higher plants, play important roles in regulating plant development and resistance. To date, little is known about the WRKY TF family in Brassica oleracea. The recently completed genome sequence of cabbage (B. oleracea var. capitata) allows us to systematically analyze WRKY genes in this species. A total of 148 WRKY genes were characterized and classified into seven subgroups that belong to three major groups. Phylogenetic and synteny analyses revealed that the repertoire of cabbage WRKY genes was derived from a common ancestor shared with Arabidopsis thaliana. The B. oleracea WRKY genes were found to be preferentially retained after the whole-genome triplication (WGT) event in its recent ancestor, suggesting that the WGT event had largely contributed to a rapid expansion of the WRKY gene family in B. oleracea. The analysis of RNA-Seq data from various tissues (i.e., roots, stems, leaves, buds, flowers and siliques) revealed that most of the identified WRKY genes were expressed in cabbage, and a large portion of them exhibited patterns of differential and tissue-specific expression, suggesting that these gene members might play essential roles in plant developmental processes. Comparative analysis of the expression level among duplicated genes showed that gene expression divergence was evident among cabbage WRKY paralogs, indicating functional divergence of these duplicated WRKY genes. Copyright © 2014 Elsevier B.V. All rights reserved.
D'Angelo, Debra; Lebon, Sébastien; Chen, Qixuan
Importance: The 16p11.2 BP4-BP5 duplication is the copy number variant most frequently associated with autism spectrum disorder (ASD), schizophrenia, and comorbidities such as decreased body mass index (BMI). Objectives: To characterize the effects of the 16p11.2 duplication on cognitive...... subgroups not observed with the deletion. These results suggest that additional genetic and familial factors contribute to this variability. Additional studies will be necessary to characterize the predictors of cognitive deficits....
With its incredible strength and toughness, spider dragline silk is widely lauded for its impressive material properties. Dragline silk is composed of two structural proteins, MaSp1 and MaSp2, which are encoded by members of the spidroin gene family. While previous studies have characterized the genes that encode the constituent proteins of spider silks, nothing is known about the physical location of these genes. We determined karyotypes and sex chromosome organization for the widow spiders, Latrodectus hesperus and L. geometricus (Araneae, Theridiidae). We then used fluorescence in situ hybridization to map the genomic locations of the genes for the silk proteins that compose the remarkable spider dragline. These genes included three loci for the MaSp1 protein and the single locus for the MaSp2 protein. In addition, we mapped a MaSp1 pseudogene. All the MaSp1 gene copies and pseudogene localized to a single chromosomal region while MaSp2 was located on a different chromosome of L. hesperus. Using probes derived from L. hesperus, we comparatively mapped all three MaSp1 loci to a single region of a L. geometricus chromosome. As with L. hesperus, MaSp2 was found on a separate L. geometricus chromosome, thus again unlinked to the MaSp1 loci. These results indicate orthology of the corresponding chromosomal regions in the two widow genomes. Moreover, the occurrence of multiple MaSp1 loci in a conserved gene cluster across species suggests that MaSp1 proliferated by tandem duplication in a common ancestor of L. geometricus and L. hesperus. Unequal crossover events during recombination could have given rise to the gene copies and could also maintain sequence similarity among gene copies over time. Further comparative mapping with taxa of increasing divergence from Latrodectus will pinpoint when the MaSp1 duplication events occurred and the phylogenetic distribution of silk gene linkage patterns.
Srivastava, Niloo; Manvati, Siddharth; Srivastava, Archita; Pal, Ranjana; Kalaiarasan, Ponnusamy; Chattopadhyay, Shilpi; Gochhait, Sailesh; Dua, Raina; Bamezai, Rameshwar N K
New levels of gene regulation with microRNA (miR) and gene copy number alterations (CNAs) have been identified as playing a role in various cancers. We have previously reported that sporadic breast cancer tissues exhibit significant alteration in H2AX gene copy number. However, how CNA affects gene expression, and what role miR-24-2, a miR known to regulate H2AX expression, plays against the background of the change in copy number, are not known. Further, many miRs, including miR-24-2, are implicated as playing a role in cell proliferation and apoptosis, but their specific target genes and the pathways contributing to them remain unexplored. Changes in gene copy number and mRNA/miR expression were estimated using real-time polymerase chain reaction assays in two mammalian cell lines, MCF-7 and HeLa, and in a set of sporadic breast cancer tissues. In silico analysis was performed to find the putative target for miR-24-2. MCF-7 cells were transfected with precursor miR-24-2 oligonucleotides, and the gene expression levels of BRCA1, BRCA2, ATM, MDM2, TP53, CHEK2, CYT-C, BCL-2, H2AFX and P21 were examined using TaqMan gene expression assays. Apoptosis was measured by flow cytometric detection using annexin V dye. A luciferase assay was performed to confirm BCL-2 as a valid cellular target of miR-24-2. It was observed that H2AX gene expression was negatively correlated with miR-24-2 expression and not in accordance with the gene copy number status, both in cell lines and in sporadic breast tumor tissues. Further, the cells overexpressing miR-24-2 were observed to be hypersensitive to DNA damaging drugs, undergoing apoptotic cell death, suggesting the potentiating effect of miR-24-2-mediated apoptotic induction in human cancer cell lines treated with anticancer drugs. BCL-2 was identified as a novel cellular target of miR-24-2. miR-24-2 is capable of inducing apoptosis by modulating different apoptotic pathways and targeting BCL-2, an antiapoptotic gene. The study suggests
McKinney, Cushla; Fanciulli, Manuela; Merriman, Marilyn E.; Phipps-Green, Amanda; Alizadeh, Behrooz Z.; Koeleman, Bobby P. C.; Dalbeth, Nicola; Gow, Peter J.; Harrison, Andrew A.; Highton, John; Jones, Peter B.; Stamp, Lisa K.; Steer, Sophia; Barrera, Pilar; Coenen, Marieke J. H.; Franke, Barbara; van Riel, Piet L. C. M.; Vyse, Tim J.; Aitman, Tim J.; Radstake, Timothy R. D. J.; Merriman, Tony R.
Objective There is increasing evidence that variation in gene copy number (CN) influences clinical phenotype. The low-affinity Fc gamma receptor 3B (FCGR3B) located in the FCGR gene cluster is a CN polymorphic gene involved in the recruitment to sites of inflammation and activation of
Mater, David Van; Goodman, Barbara K; Wang, Endi; Gaca, Ana M; Wechsler, Daniel S
Lymphoblastic lymphoma is the second most common type of non-Hodgkin lymphoma seen in children. Approximately 90% of lymphoblastic lymphomas arise from T cells, with the remaining 10% being B-cell-lineage derived. Although T-cell lymphoblastic lymphoma most frequently occurs in the anterior mediastinum (thymus), B-cell lymphoblastic lymphoma (B-LBL) predominates in extranodal sites such as skin and bone. Here, we describe a pediatric B-LBL patient who presented with extensive abdominal involvement and whose lymphoma cells displayed segmental duplication of the mixed lineage leukemia (MLL) gene. MLL duplication/amplification has been described primarily in acute myeloid leukemia and myelodysplastic syndrome, with no published reports of discrete MLL duplication/amplification events in B-LBL. The MLL gene duplication noted in this case may represent a novel mechanism for tumorigenesis in B-LBL.
Biedrzycka, Aleksandra; O'Connor, Emily; Sebastian, Alvaro; Migalska, Magdalena; Radwan, Jacek; Zając, Tadeusz; Bielański, Wojciech; Solarz, Wojciech; Ćmiel, Adam; Westerdahl, Helena
Recent work suggests that gene duplications may play an important role in the evolution of immunity genes. Passerine birds, and in particular Sylvioidea warblers, have highly duplicated major histocompatibility complex (MHC) genes, which are key in immunity, compared to other vertebrates. However, reasons for this high MHC gene copy number are still unclear. High-throughput sequencing (HTS) allows MHC genotyping even in individuals with extremely duplicated genes. These HTS data can reveal evidence of selection, which may help to unravel the putative functions of different gene copies, i.e. neofunctionalization. We performed exhaustive genotyping of MHC class I in a Sylvioidea warbler, the sedge warbler, Acrocephalus schoenobaenus, using the Illumina MiSeq technique on individuals from a wild study population. The MHC diversity in 863 genotyped individuals by far exceeds that of any other bird species described to date. A single individual could carry up to 65 different alleles, a large proportion of which are expressed (transcribed). The MHC alleles were of three different lengths differing in evidence of selection, diversity and divergence within our study population. Alleles without any deletions and alleles containing a 6 bp deletion showed characteristics of classical MHC genes, with evidence of multiple sites subject to positive selection and high sequence divergence. In contrast, alleles containing a 3 bp deletion had no sites subject to positive selection and had low divergence. Our results suggest that sedge warbler MHC alleles that either have no deletion, or contain a 6 bp deletion, encode classical antigen presenting MHC molecules. In contrast, MHC alleles containing a 3 bp deletion may encode molecules with a different function. This study demonstrates that highly duplicated MHC genes can be characterised with HTS and that selection patterns can be useful for revealing neofunctionalization. Importantly, our results highlight the need to consider the
Fode, Peder; Jespersgaard, Cathrine; Hardwick, Robert J
There have been conflicting reports in the literature on association of gene copy number with disease, including CCL3L1 and HIV susceptibility, and β-defensins and Crohn's disease. Quantification of precise gene copy numbers is important in order to define any association of gene copy number with...
Aristaless-like homeobox 4 (ALX4) gene is an important transcription regulator in skull and limb development. In humans and mice ALX4 mutations or loss of function result in a number of skeletal and organ malformations, including polydactyly, tibial hemimelia, omphalocele, biparietal foramina, impaired mammary epithelial morphogenesis, alopecia, coronal craniosynostosis, hypertelorism, depressed nasal bridge and ridge, bifid nasal tip, hypogonadism, and body agenesis. Here we show that a complex skeletal malformation of the hind limb in Galloway cattle together with other developmental anomalies is a recessive autosomal disorder most likely caused by a duplication of 20 bp in exon 2 of the bovine ALX4 gene. A second duplication of 34 bp in exon 4 of the same gene has no known effect, although both duplications result in a frameshift and premature stop codon leading to a truncated protein. Genotyping of 1,688 Black/Red/Belted/Riggit Galloway (GA) and 289 White Galloway (WGA) cattle showed that the duplication in exon 2 has allele frequencies of 1% in GA and 6% in WGA and the duplication in exon 4 has frequencies of 23% in GA and 38% in WGA. Neither duplication was detected in 876 randomly selected German Holstein Friesian and 86 cattle of 21 other breeds. Hence, we have identified a candidate causative mutation for tibial hemimelia syndrome in Galloway cattle and selection against this mutation can be used to eliminate the mutant allele from the breed.
Newton, Richard; Wernisch, Lorenz
Inferring gene regulatory relationships from observational data is challenging. Manipulation and intervention is often required to unravel causal relationships unambiguously. However, gene copy number changes, as they frequently occur in cancer cells, might be considered natural manipulation experiments on gene expression. An increasing number of data sets on matched array comparative genomic hybridisation and transcriptomics experiments from a variety of cancer pathologies are becoming publicly available. Here we explore the potential of a meta-analysis of thirty such data sets. The aim of our analysis was to assess the potential of in silico inference of trans-acting gene regulatory relationships from this type of data. We found sufficient correlation signal in the data to infer gene regulatory relationships, with interesting similarities between data sets. A number of genes had highly correlated copy number and expression changes in many of the data sets and we present predicted potential trans-acted regulatory relationships for each of these genes. The study also investigates to what extent heterogeneity between cell types and between pathologies determines the number of statistically significant predictions available from a meta-analysis of experiments. PMID:25148247
Carpenter, Danielle; Dhar, Sugandha; Mitchell, Laura M; Fu, Beiyuan; Tyson, Jess; Shwan, Nzar A A; Yang, Fengtang; Thomas, Mark G; Armour, John A L
The human salivary amylase genes display extensive copy number variation (CNV), and recent work has implicated this variation in adaptation to starch-rich diets, and in association with body mass index. In this work, we use paralogue ratio tests, microsatellite analysis, read depth and fibre-FISH to demonstrate that human amylase CNV is not a smooth continuum, but is instead partitioned into distinct haplotype classes. There is a fundamental structural distinction between haplotypes containing odd or even numbers of AMY1 gene units, in turn coupled to CNV in pancreatic amylase genes AMY2A and AMY2B. Most haplotypes have one copy each of AMY2A and AMY2B and contain an odd number of copies of AMY1; consequently, most individuals have an even total number of AMY1. In contrast, haplotypes carrying an even number of AMY1 genes have rearrangements leading to CNVs of AMY2A/AMY2B. Read-depth and experimental data show that different populations harbour different proportions of these basic haplotype classes. In Europeans, the copy numbers of AMY1 and AMY2A are correlated, so that phenotypic associations caused by variation in pancreatic amylase copy number could be detected indirectly as weak association with AMY1 copy number. We show that the quantitative polymerase chain reaction (qPCR) assay previously applied to the high-throughput measurement of AMY1 copy number is less accurate than the measures we use and that qPCR data in other studies have been further compromised by systematic miscalibration. Our results uncover new patterns in human amylase variation and imply a potential role for AMY2 CNV in functional associations. © The Author 2015. Published by Oxford University Press. PMID:25788522
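The parity argument in the abstract above (two haplotypes with odd AMY1 counts give an even diploid total) can be illustrated with a short sketch. The haplotype copy numbers listed here are illustrative values chosen for the example, not data from the study, and `diploid_total` is a hypothetical helper.

```python
from itertools import product


def diploid_total(haplotype_a: int, haplotype_b: int) -> int:
    """Total AMY1 copies in an individual carrying two haplotypes."""
    return haplotype_a + haplotype_b


# Illustrative odd AMY1 haplotype copy numbers (assumed values).
odd_haplotypes = [1, 3, 5, 7, 9]

# Every pairing of two odd haplotypes yields an even diploid total.
totals = [diploid_total(a, b) for a, b in product(odd_haplotypes, repeat=2)]
all_even = all(total % 2 == 0 for total in totals)
print(all_even)  # True

# Pairing an odd haplotype with an even one yields an odd total instead.
mixed_total = diploid_total(3, 4)
print(mixed_total % 2)  # 1
```

This is why an odd diploid AMY1 count points to at least one even-numbered haplotype, the class that the abstract links to AMY2A/AMY2B rearrangements.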
Bodilis, Josselin; Nsigue-Meilo, Sandrine; Besaury, Ludovic; Quillet, Laurent
Even though the 16S rRNA gene is the most commonly used taxonomic marker in microbial ecology, its poor resolution is still not fully understood at the intra-genus level. In this work, the number of rRNA gene operons, intra-genomic heterogeneities and lateral transfers were investigated at a fine-scale resolution, throughout the Pseudomonas genus. In addition to nineteen sequenced Pseudomonas strains, we determined the 16S rRNA copy number in four other Pseudomonas strains by Southern hybridization and Pulsed-Field Gel Electrophoresis, and studied the intra-genomic heterogeneities by Denaturing Gradient Gel Electrophoresis and sequencing. Although the variable copy number (from four to seven) seems to be correlated with the evolutionary distance, some close strains in the P. fluorescens lineage showed a different number of 16S rRNA genes, whereas all the strains in the P. aeruginosa lineage displayed the same number of genes (four copies). Further study of the intra-genomic heterogeneities revealed that most of the Pseudomonas strains (15 out of 19 strains) had at least two different 16S rRNA alleles. A great difference (5 or 19 nucleotides, essentially grouped near the V1 hypervariable region) was observed only in two sequenced strains. In one of the strains we studied (strain MFY30), we found a difference of 12 nucleotides (grouped in the V3 hypervariable region) between copies of the 16S rRNA gene. Finally, occurrence of partial lateral transfers of the 16S rRNA gene was further investigated in 1803 full-length sequences of Pseudomonas available in the databases. Remarkably, we found that the two most variable regions (the V1 and V3 hypervariable regions) had probably been laterally transferred from another evolutionarily distant Pseudomonas strain for at least 48.3 and 41.6% of the 16S rRNA sequences, respectively. In conclusion, we strongly recommend removing these regions of the 16S rRNA gene in intra-genus diversity studies. PMID:22545126
Vaccari, Carlotta Maria; Tassano, Elisa; Torre, Michele; Gimelli, Stefania; Divizia, Maria Teresa; Romanini, Maria Victoria; Bossi, Simone; Musante, Ilaria; Valle, Maura; Senes, Filippo; Catena, Nunzio; Bedeschi, Maria Francesca; Baban, Anwar; Calevo, Maria Grazia; Acquaviva, Massimo; Lerone, Margherita; Ravazzolo, Roberto; Puliti, Aldamaria
Poland Syndrome (PS) is a rare congenital disorder presenting with agenesis/hypoplasia of the pectoralis major muscle variably associated with thoracic and/or upper limb anomalies. Most cases are sporadic, but familial recurrence, with different inheritance patterns, has been observed. The genetic etiology of PS remains unknown. Karyotyping and array-comparative genomic hybridization (CGH) analyses can identify genomic imbalances that can clarify the genetic etiology of congenital and neurodevelopmental disorders. We previously reported a chromosome 11 deletion in twin girls with pectoralis muscle hypoplasia and skeletal anomalies, and a chromosome 6 deletion in a patient presenting a complex phenotype that included pectoralis muscle hypoplasia. However, the contribution of genomic imbalances to PS remains largely unknown. To investigate the prevalence of chromosomal imbalances in PS, standard cytogenetic and array-CGH analyses were performed in 120 PS patients. Following the application of stringent filter criteria, 14 rare copy number variations (CNVs) were identified in 14 PS patients in different regions outside known common copy number variations: seven genomic duplications and seven genomic deletions, including the two previously reported PS-associated chromosomal deletions. These CNVs ranged from 0.04 to 4.71 Mb in size. Bioinformatic analysis of array-CGH data indicated gene enrichment in pathways involved in cell-cell adhesion, DNA binding and apoptosis processes. The analysis also provided a number of candidate genes possibly causing the developmental defects observed in PS patients, among others REV3L, a gene coding for an error-prone DNA polymerase previously associated with Möbius Syndrome with variable phenotypes including pectoralis muscle agenesis. A number of rare CNVs were identified in PS patients, and these involve genes that represent candidates for further evaluation. Rare inherited CNVs may contribute to, or represent risk factors of PS
Brain arteriovenous malformations (BAVM) are clusters of abnormal blood vessels, with shunting of blood from the arterial to venous circulation and a high risk of rupture and intracranial hemorrhage. Most BAVMs are sporadic, but also occur in patients with Hereditary Hemorrhagic Telangiectasia, a Mendelian disorder caused by mutations in genes in the transforming growth factor beta (TGFβ) signaling pathway. To investigate whether copy number variations (CNVs) contribute to risk of sporadic BAVM, we performed a genome-wide association study in 371 sporadic BAVM cases and 563 healthy controls, all Caucasian. Cases and controls were genotyped using the Affymetrix 6.0 array. CNVs were called using the PennCNV and Birdsuite algorithms and analyzed via segment-based and gene-based approaches. Common and rare CNVs were evaluated for association with BAVM. A CNV region on 1p36.13, containing the neuroblastoma breakpoint family, member 1 gene (NBPF1), was significantly enriched with duplications in BAVM cases compared to controls (P = 2.2×10⁻⁹); NBPF1 was also significantly associated with BAVM in gene-based analysis using both PennCNV and Birdsuite. We experimentally validated the 1p36.13 duplication; however, the association did not replicate in an independent cohort of 184 sporadic BAVM cases and 182 controls (OR = 0.81, P = 0.8). Rare CNV analysis did not identify genes significantly associated with BAVM. We did not identify common CNVs associated with sporadic BAVM that replicated in an independent cohort. Replication in larger cohorts is required to elucidate the possible role of common or rare CNVs in BAVM pathogenesis.
Background Xenografts have been shown to provide a suitable source of tumor tissue for molecular analysis in the absence of primary tumor material. We utilized ES xenograft series for integrated microarray analyses to identify novel biomarkers. Method Microarray technology (array comparative genomic hybridization (aCGH) and microRNA arrays) was used to screen and identify copy number changes and differentially expressed miRNAs of 34 and 14 passages, respectively. Incubated cells used for xenografting (Passage 0) were considered to represent the primary tumor. Four important differentially expressed miRNAs (miR-31, miR-31*, miR-145, miR-106) were selected for further validation by real time polymerase chain reaction (RT-PCR). Integrated analysis of aCGH and miRNA data was performed on 14 xenograft passages by bioinformatic methods. Results The most frequent losses and gains of DNA copy number were detected at 9p21.3, 16q and at 8, 15, 17q21.32-qter, 1q21.1-qter, respectively. The presence of these alterations was consistent in all tumor passages. aCGH profiles of xenograft passages of each series resembled their corresponding primary tumors (passage 0). MiR-21, miR-31, miR-31*, miR-106b, miR-145, miR-150*, miR-371-5p, miR-557 and miR-598 showed recurrently altered expression. These miRNAs were predicted to regulate many ES-associated genes, such as genes of the IGF1 pathway, EWSR1, FLI1 and their fusion gene (EWS-FLI1). Twenty differentially expressed miRNAs were pinpointed in regions carrying altered copy numbers. Conclusion In the present study, ES xenografts were successfully applied for integrated microarray analyses. Our findings showed expression changes of miRNAs that were predicted to regulate many ES-associated genes, such as IGF1 pathway genes, FLI1, EWSR1, and the EWS-FLI1 fusion genes.
Wu, Feilun; You, Lingchong
DNA copy number represents an essential parameter in the dynamics of synthetic gene circuits but typically is not explicitly considered. A new study demonstrates how dynamic control of DNA copy number can serve as an effective strategy to program robust oscillations in gene expression circuits.
Giao Ngoc Nguyen
Reproductive barriers are commonly observed in both animals and plants, in which they maintain species integrity and contribute to speciation. This report shows that a combination of loss-of-function alleles at two duplicated loci, DUPLICATED GAMETOPHYTIC STERILITY 1 (DGS1) on chromosome 4 and DGS2 on chromosome 7, causes pollen sterility in hybrid progeny derived from an interspecific cross between cultivated rice, Oryza sativa, and an Asian annual wild rice, O. nivara. Male gametes carrying the DGS1 allele from O. nivara (DGS1-nivaras) and the DGS2 allele from O. sativa (DGS2-T65s) were sterile, but female gametes carrying the same genotype were fertile. We isolated the causal gene, which encodes a protein homologous to DNA-dependent RNA polymerase (RNAP) III subunit C4 (RPC4). RPC4 facilitates the transcription of 5S rRNAs and tRNAs. The loss-of-function alleles at DGS1-nivaras and DGS2-T65s were caused by weak or nonexpression of RPC4 and an absence of RPC4, respectively. Phylogenetic analysis demonstrated that gene duplication of RPC4 at DGS1 and DGS2 was a recent event that occurred after divergence of the ancestral population of Oryza from other Poaceae or during diversification of AA-genome species.
Drabova, Jana; Trkova, Marie; Hancarova, Miroslava; Novotna, Drahuse; Hejtmankova, Michaela; Havlovicova, Marketa; Sedlacek, Zdenek
Inversions are balanced structural chromosome rearrangements, which can influence gene expression and the risk of unbalanced chromosome constitution in offspring. Many examples of inversion polymorphisms exist in humans, affecting both heterochromatic regions and euchromatin. We describe a novel, 15 Mb long paracentric inversion, inv(21)(q21.1q22.11), affecting more than a third of human 21q. Despite its length, the inversion cannot be detected using karyotyping due to similar band patterns on the normal and inverted chromosomes, and is therefore likely to escape attention. Its identification was aided by the repeated observation of the same pair of 150 kb long duplications present in cis on chromosome 21 in three Czech families subjected to microarray analysis. The finding prompted us to hypothesise that this co-occurrence of two remote duplications could be associated with an inversion of the intervening segment, and this hypothesis proved correct. The inversion was confirmed in a series of FISH experiments, which also showed that the second copy of each of the duplications was always located at the opposite end of the inversion. The presence of the same pair of duplications in additional individuals reported in public databases indicates that the inversion may also be present in other populations. Three out of the total of about 4000 chromosomes 21 examined in our sample carried the duplications and were inverted, corresponding to a carrier frequency of about 1/660. Although the breakpoints affect protein-coding genes, the occurrence of the inversion in normal parents and siblings of our patients and the occurrence of the duplications in unaffected controls in databases indicate that this rare variant is most likely non-pathogenic. The inverted segment carried an identical shared haplotype in the three families studied. The haplotypes, however, diverged very rapidly in the flanking regions, possibly pointing to an ancient founder event at the origin of the
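The carrier frequency quoted in this abstract can be reproduced with a short calculation. The sketch below is illustrative only: it assumes that the roughly 4000 chromosomes come from about 2000 diploid individuals and that each carrier has exactly one inverted chromosome 21. The function name `carrier_frequency` is hypothetical and is not taken from the study.

```python
from fractions import Fraction


def carrier_frequency(variant_chromosomes, total_chromosomes, copies_per_person=2):
    """Fraction of individuals carrying the variant.

    Assumes every carrier is heterozygous (one variant chromosome each),
    so the number of carriers equals the number of variant chromosomes.
    """
    if total_chromosomes <= 0 or copies_per_person <= 0:
        raise ValueError("chromosome counts must be positive")
    individuals = Fraction(total_chromosomes, copies_per_person)
    return Fraction(variant_chromosomes) / individuals


# Figures reported in the abstract: 3 inverted chromosomes out of about 4000.
freq = carrier_frequency(3, 4000)
print(f"carrier frequency = {freq} (about 1 in {float(1 / freq):.0f})")
```

Under these assumptions the result is 3/2000, or roughly one carrier in 667, in line with the "about 1/660" stated in the abstract given that the chromosome total is itself approximate.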
P J Hastings
Chromosome structural changes with nonrecurrent endpoints associated with genomic disorders offer windows into the mechanism of origin of copy number variation (CNV). A recent report of nonrecurrent duplications associated with Pelizaeus-Merzbacher disease identified three distinctive characteristics. First, the majority of events can be seen to be complex, showing discontinuous duplications mixed with deletions, inverted duplications, and triplications. Second, junctions at endpoints show microhomology of 2-5 base pairs (bp). Third, endpoints occur near pre-existing low copy repeats (LCRs). Using these observations and evidence from DNA repair in other organisms, we derive a model of microhomology-mediated break-induced replication (MMBIR) for the origin of CNV and, ultimately, of LCRs. We propose that breakage of replication forks in stressed cells that are deficient in homologous recombination induces an aberrant repair process with features of break-induced replication (BIR). Under these circumstances, single-strand 3' tails from broken replication forks will anneal with microhomology on any single-stranded DNA nearby, priming low-processivity polymerization with multiple template switches generating complex rearrangements, and eventual re-establishment of processive replication.
Matoso, Eunice; Melo, Joana B; Ferreira, Susana I; Jardim, Ana; Castelo, Teresa M; Weise, Anja; Carreira, Isabel M
An insertional translocation (IT) can result in pure segmental aneusomy for the inserted genomic segment, allowing a more accurate clinical phenotype to be defined. Here, we report on two siblings sharing an unbalanced IT inherited from the mother, who has a history of learning difficulty. In an 8-year-old girl with developmental delay, speech disability, and attention-deficit hyperactivity disorder (ADHD), GTG banding analysis showed a subtle interstitial alteration in 21q21. Oligonucleotide array comparative genomic hybridization (array-CGH) analysis showed a 4q13.1-q13.3 duplication spanning 8.6 Mb. Fluorescence in situ hybridization (FISH) with bacterial artificial chromosome (BAC) clones confirmed the rearrangement, a der(21)ins(21;4)(q21;q13.1q13.3). The duplication described involves 50 RefSeq genes, including the EPHA5 gene, which encodes the EphA5 receptor involved in embryonic development of the brain and also in synaptic remodeling and plasticity thought to underlie learning and memory. The same rearrangement was observed in a younger brother with behavioral problems who also exhibits ADHD. ADHD is among the most heritable of neuropsychiatric disorders. There are few reports of patients with duplications involving the proximal region of 4q and a mild phenotype. To the best of our knowledge this is the first report of a duplication restricted to band 4q13. This abnormality could be easily missed in children who have nonspecific cognitive impairment. The presence of this behavioral disorder in the two siblings reinforces the hypothesis that the region involved could include genes involved in ADHD. Copyright © 2013 Wiley Periodicals, Inc.
Human respiratory syncytial virus (HRSV) is the main cause of acute lower respiratory infections in children under 2 years of age and causes repeated infections throughout life. We investigated the genetic variability of RSV-A circulating in Ontario during the 2010-2011 winter season by sequencing and phylogenetic analysis of the G glycoprotein gene. Among the 201 consecutive RSV isolates studied, RSV-A (55.7%) was more commonly observed than RSV-B (42.3%). 59.8% and 90.1% of RSV-A infections were among children ≤12 months and ≤5 years old, respectively. On phylogenetic analysis of the second hypervariable region of the 112 RSV-A strains, 110 (98.2%) clustered within or adjacent to the NA1 genotype; two isolates were GA5 genotype. Eleven (10%) NA1-related isolates clustered together phylogenetically as a novel RSV-A genotype, named ON1, containing a 72 nucleotide duplication in the C-terminal region of the attachment (G) glycoprotein. The predicted polypeptide is lengthened by 24 amino acids and includes a 23 amino acid duplication. Using RNA secondary structural software, a possible mechanism of duplication occurrence was derived. The 23 amino acid ON1 G gene duplication results in a repeat of 7 potential O-glycosylation sites including three O-linked sugar acceptors at residues 270, 275, and 283. Using Phylogenetic Analysis by Maximum Likelihood, a total of 19 positively selected sites were observed among Ontario NA1 isolates; six were found to be codons which reverted to the previous state observed in the prototype RSV-A2 strain. The tendency of codon regression in the G-ectodomain may imply a decreased avidity of antibody to the current circulating strains. Further work is needed to document and further understand the emergence, virulence, pathogenicity and transmissibility of this novel RSV-A genotype with a 72 nucleotide G gene duplication.
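The link between the 72 nucleotide duplication and the 24 amino acid lengthening follows from the three-nucleotide codon. The sketch below only checks that arithmetic. The function names are hypothetical and it does not model the ON1 sequence itself.

```python
CODON_LENGTH = 3  # nucleotides per codon


def inserted_codons(duplication_nt):
    """Return (whole codons, leftover nucleotides) for an insertion length."""
    if duplication_nt < 0:
        raise ValueError("length must be non-negative")
    return divmod(duplication_nt, CODON_LENGTH)


def preserves_reading_frame(duplication_nt):
    """An insertion keeps the downstream frame only if its length is a multiple of 3."""
    return duplication_nt % CODON_LENGTH == 0


codons, leftover = inserted_codons(72)
print(codons, leftover, preserves_reading_frame(72))
```

A 72 nucleotide insertion therefore adds 24 whole codons without shifting the reading frame, which matches the 24 residue lengthening reported for the predicted G polypeptide.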
In this paper we apply a predictive profiling method to genome copy number aberrations (CNA) in combination with gene expression and clinical data to identify molecular patterns of cancer pathophysiology. Predictive models and optimal feature lists for the platforms are developed by a complete validation SVM-based machine learning system. Ranked lists of genome CNA sites (assessed by array comparative genomic hybridization, aCGH) and of differentially expressed genes (assessed by microarray profiling with Affy HG-U133A chips) are computed and combined on a breast cancer dataset for the discrimination of Luminal/ER+ (Lum/ER+) and Basal-like/ER- classes. Different encodings are developed and applied to the CNA data, and predictive variable selection is discussed. We analyze the combination of profiling information between the platforms, also considering the pathophysiological data. A specific subset of patients is identified that has a different response to classification by chromosomal gains and losses and by differentially expressed genes, corroborating the idea that genomic CNA can represent an independent source for tumor classification.
Mitchelson, K R
The small single-copy region (SSCR) of the chloroplast genome of many higher plants typically contains ndh genes encoding proteins that share homology with subunits of the respiratory-chain reduced nicotinamide adenine dinucleotide (NADH) dehydrogenase complex of mitochondria. A map of the lettuce chloroplast SSCR has been determined by Southern cross-hybridization, taking advantage of the high degree of homology between a tobacco small single-copy fragment and a corresponding lettuce chloroplast fragment. The gene order of the SSCR of lettuce and tobacco chloroplasts is similar. The cross-hybridization method can rapidly create a primary gene map of unknown chloroplast fragments, thus providing detailed information on the localization and arrangement of genes and conserved open reading frame regions.
Sara L Martin
Whole genome duplications have occurred recurrently throughout the evolutionary history of eukaryotes. The resulting genetic and phenotypic changes can influence physiological and ecological responses to the environment; however, the impact of genome copy number on evolvability has rarely been examined experimentally. Here, we evaluate the effect of genome duplication on the ability to respond to selection for early flowering time in lines drawn from naturally occurring diploid and autotetraploid populations of the plant Chamerion angustifolium (fireweed). We contrast this with the result of four generations of selection on synthesized neoautotetraploids, whose genic variability is similar to diploids but genome copy number is similar to autotetraploids. In addition, we examine correlated responses to selection in all three groups. Diploid and both extant tetraploid and neoautotetraploid lines responded to selection with significant reductions in time to flowering. Evolvability, measured as realized heritability, was significantly lower in extant tetraploids (b̂_T = 0.31) than diploids (b̂_T = 0.40). Neotetraploids exhibited the highest evolutionary response (b̂_T = 0.55). The rapid shift in flowering time in neotetraploids was associated with an increase in phenotypic variability across generations, but not with change in genome size or phenotypic correlations among traits. Our results suggest that whole genome duplications, without hybridization, may initially alter evolutionary rate, and that the dynamic nature of neoautopolyploids may contribute to the prevalence of polyploidy throughout eukaryotes.
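Realized heritability is conventionally estimated as the slope of the regression of cumulative selection response on cumulative selection differential. The sketch below shows that standard estimator as ordinary least squares. It is an assumption that the study used this form, the function name is hypothetical, and the numbers in the usage lines are invented for illustration rather than taken from the paper.

```python
def realized_heritability(cum_selection_diff, cum_response):
    """Least-squares slope of cumulative response on cumulative selection differential."""
    n = len(cum_selection_diff)
    if n != len(cum_response) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_s = sum(cum_selection_diff) / n
    mean_r = sum(cum_response) / n
    sxx = sum((s - mean_s) ** 2 for s in cum_selection_diff)
    if sxx == 0:
        raise ValueError("selection differentials must vary")
    sxy = sum(
        (s - mean_s) * (r - mean_r)
        for s, r in zip(cum_selection_diff, cum_response)
    )
    return sxy / sxx


# Invented example: four generations of selection, response tracking 40% of
# the cumulative selection differential (arbitrary trait units).
cum_s = [0.0, 1.0, 2.0, 3.0, 4.0]
cum_r = [0.4 * s for s in cum_s]
print(round(realized_heritability(cum_s, cum_r), 2))
```

With real data the points would scatter around the line, and the slope's standard error would be needed before comparing ploidy groups as the abstract does.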
Dzhumagaliev, E.B.; Mazo, A.N.; Baev, A.A. Jr.; Gorelova, T.V.; Arkhipova, I.R.; Shuppe, N.G.; Il'in, Yu.V.
The authors have determined the nucleotide sequences of long terminal repeats (LTRs) and adjacent regions in the transcribed and nontranscribed variants of the mobile dispersed gene mdg3. In its main characteristics, mdg3 is similar to other mdg elements. Its integration into chromosomal DNA duplicates 4 bp of the host DNA, and no specificity of mdg integration at the nucleotide level was detected. mdg3 is flanked by a 5 bp inverted repeat. The variation in LTR length among different mdg copies is mainly due to duplication of certain sequences in the U3 and R regions. mdg3 copies with an LTR length of 267 bp are the most abundant and are completely conserved in their primary structure. They are transcribed in cells of the 67J25D culture but not in the Kc line, where another mdg3 variant with an LTR length of 293 bp is transcriptionally active. S1 mapping of transcription initiation and termination sites has shown that in both mdg3 variants they are localized in the same LTR regions, and that the LTR itself has a characteristic U3-R-U5 structure, like retroviral LTRs. The possible factors involved in the regulation of mdg transcription are discussed.
Cramer, Dina; Serrano, Luis; Schaefer, Martin H
Copy number alterations (CNAs) in cancer patients show a large variability in their number, length and position, but the sources of this variability are not known. CNA number and length are linked to patient survival, suggesting clinical relevance. We have identified genes that tend to be mutated in samples that have few or many CNAs, which we term CONIM genes (COpy Number Instability Modulators). CONIM proteins cluster into a densely connected subnetwork of physical interactions and many of them are epigenetic modifiers. Therefore, we investigated how the epigenome of the tissue-of-origin influences the position of CNA breakpoints and the properties of the resulting CNAs. We found that the presence of heterochromatin in the tissue-of-origin contributes to the recurrence and length of CNAs in the respective cancer type.
Lorentzen, Anders Blomkild; Mitchelmore, Cathy
AIM To investigate if the down-regulation of N-myc Downstream Regulated Gene 2 (NDRG2) expression in colorectal carcinoma (CRC) is due to loss of the NDRG2 allele(s). METHODS The following were investigated in the human colorectal cancer cell lines DLD-1, LoVo and SW-480: NDRG2 mRNA expression levels using quantitative reverse transcription-polymerase chain reaction (qRT-PCR); interaction of the MYC gene-regulatory protein with the NDRG2 promoter using chromatin immunoprecipitation; and NDRG2 promoter methylation using bisulfite sequencing. Furthermore, we performed qPCR to analyse the copy numbers of NDRG2 and MYC genes in the above three cell lines, 8 normal colorectal tissue samples and 40 CRC tissue samples. RESULTS As expected, NDRG2 mRNA levels were low in the three colorectal cancer cell lines, compared to normal colon. Endogenous MYC protein interacted with the NDRG2 core promoter...
Artzy-Randrup, Yael; Rorick, Mary M; Day, Karen; Chen, Donald; Dobson, Andrew P; Pascual, Mercedes
The coexistence of multiple independently circulating strains in pathogen populations that undergo sexual recombination is a central question of epidemiology with profound implications for control. An agent-based model is developed that extends earlier ‘strain theory’ by addressing the var gene family of Plasmodium falciparum. The model explicitly considers the extensive diversity of multi-copy genes that undergo antigenic variation via sequential, mutually exclusive expression. It tracks the dynamics of all unique var repertoires in a population of hosts, and shows that even under high levels of sexual recombination, strain competition mediated through cross-immunity structures the parasite population into a subset of coexisting dominant repertoires of var genes whose degree of antigenic overlap depends on transmission intensity. Empirical comparison of patterns of genetic variation at antigenic and neutral sites supports this role for immune selection in structuring parasite diversity. DOI: http://dx.doi.org/10.7554/eLife.00093.001 PMID:23251784
Bay, Jakob T; Schejbel, Lone; Madsen, Hans O
Complement C4 is a central component of the classical and the lectin pathways of the complement system. The C4 protein exists as two isotypes, C4A and C4B, encoded by the C4A and C4B genes, both of which are found with varying copy numbers. Deposition of C4 has been implicated in kidney graft rejection, but a relationship between graft survival and serum C4 concentration as well as C4 genetic variation has not been established. We evaluated this using a prospective study design of 676 kidney transplant patients and 211 healthy individuals as controls. Increasing C4 gene copy numbers significantly correlated with the C4 serum concentration in both patients and controls. Patients with fewer than four total copies of C4 genes transplanted with a deceased donor kidney experienced a superior 5-year graft survival (hazard ratio 0.46, 95% confidence interval: 0.25-0.84). No significant association...
Jespersgaard, Cathrine; Fode, Peder; Dybdahl, Marianne
BACKGROUND AND PURPOSE OF STUDY: Extensive copy number variation is observed for the DEFA1A3 gene encoding alpha-defensins 1-3. The objective of this study was to determine the involvement of alpha-defensins in colonic tissue from Crohn's disease (CD) patients and the possible genetic association of DEFA1A3 with CD. METHODS: Two-hundred and forty ethnic Danish CD patients were included in the study. Reverse transcriptase PCR assays determined DEFA1A3 expression in colonic tissue from a subset of patients. Immunohistochemical analysis identified alpha-defensin peptides in colonic tissue. Copy...
Larsen, Simon Jonas; do Canto, Luisa Matos; Rogatto, Silvia Regina
Copy number variations (CNVs) are large segments of the genome that are duplicated or deleted. Structural variations in the genome have been linked to many complex diseases. Similar to how genome-wide association studies (GWAS) have helped discover single-nucleotide polymorphisms linked to diseas...
Wang, Tiehui; Gorgoglione, Bartolomeo; Maehr, Tanja; Holland, Jason W.; Vecino, Jose L. González; Wadsworth, Simon; Secombes, Christopher J.
The intracellular suppressors of cytokine signaling (SOCS) family members, including CISH and SOCS1 to 7 in mammals, are important regulators of cytokine signaling pathways. So far, the orthologues of all the eight mammalian SOCS members have been identified in fish, with several of them having multiple copies. Whilst fish CISH, SOCS3, and SOCS5 paralogues are possibly the result of the fish-specific whole genome duplication event, gene duplication or lineage-specific genome duplication may also contribute to some paralogues, as with the three trout SOCS2s and three zebrafish SOCS5s. Fish SOCS genes are broadly expressed and also show species-specific expression patterns. They can be upregulated by cytokines, such as IFN-γ, TNF-α, IL-1β, IL-6, and IL-21, by immune stimulants such as LPS, poly I:C, and PMA, as well as by viral, bacterial, and parasitic infections in member- and species-dependent manners. Initial functional studies demonstrate conserved mechanisms of fish SOCS action via JAK/STAT pathways. PMID:22203897
González Juan R
Abstract Background An important question in genetic studies is to determine those genetic variants, in particular CNVs, that are specific to different groups of individuals. This could help in elucidating differences in disease predisposition and response to pharmaceutical treatments. We propose a Bayesian model designed to analyze thousands of copy number variants (CNVs) where only a few of them are expected to be associated with a specific phenotype. Results The model is illustrated by analyzing three major human groups belonging to HapMap data. We also show how the model can be used to determine specific CNVs related to response to treatment in patients diagnosed with ovarian cancer. The model is also extended to address the problem of how to adjust for confounding covariates (e.g., population stratification). Through a simulation study, we show that the proposed model outperforms other approaches that are typically used to analyze these data when analyzing common copy-number polymorphisms (CNPs) or complex CNVs. We have developed an R package, called bayesGen, that implements the model and estimating algorithms. Conclusions Our proposed model is useful to discover specific genetic variants when different subgroups of individuals are analyzed. The model can address studies with or without a control group. By integrating all data in a unique model we can obtain a list of genes that are associated with a given phenotype as well as a different list of genes that are shared among the different subtypes of cases.
Dinsmore, Polly K.; Klaenhammer, Todd R.
The abiA gene (formerly hsp) encodes an abortive phage infection mechanism which inhibits phage DNA replication. To analyze the effects of varying the abiA gene dosage on bacteriophage resistance in Lactococcus lactis, various genetic constructions were made. An IS946-based integration vector, pTRK75, was used to integrate a single copy of abiA into the chromosomes of two lactococcal strains, MG1363 and NCK203. In both strains, a single copy of abiA did not confer any significant phage resist...
Simsek, Abdurrahman; Zeybek, Nazif; Yagci, Gokhan; Kaymakcioglu, Nihat; Tas, Huseyin; Saglam, Mutlu; Cetiner, Sadettin
Alimentary tract duplications and duplication cysts are rare congenital malformations. The ileum is the most frequently affected site. However, alimentary tract duplications and duplication cysts can occur at any point along the gastrointestinal tract. Early diagnosis and prompt surgical treatment are the best way to prevent associated morbidity. This article presents the cases of three patients admitted to Gulhane Military Medical Academy with signs of acute abdomen, intra-abdominal mass and chronic abdominal pain. These patients were found to have enteric duplication, duplication cyst and/or retro-rectal cyst. The literature on alimentary tract duplications is reviewed.
Venkatachalam, Ananda B; Fontenot, Quenton; Farrara, Allyse; Wright, Jonathan M
With the advent of high-throughput DNA sequencing technology, the availability of genomic sequences from many disparate species has led to the relatively new discipline of genomics, the study of genome structure, function and evolution. Much work has been focused on the role of whole genome duplications (WGD) in the architecture of extant vertebrate genomes, particularly those of teleost fishes, which underwent a WGD early in the teleost radiation >230 million years ago (mya). Our past work has focused on the fate of duplicated copies of a multigene family coding for the intracellular lipid-binding protein (iLBP) genes in the teleost fishes. Defining the evolutionary processes that determined the fate of duplicated genes and generated the structure of extant fish genomes, however, requires comparative genomic analysis with a fish lineage that diverged before the teleost WGD, such as the spotted gar (Lepisosteus oculatus), an ancient, air-breathing, ray-finned fish. Here, we describe the genomic organization, chromosomal location and tissue-specific expression of a subfamily of the iLBP genes that code for fatty acid-binding proteins (Fabps) in spotted gar. Based on this work, we have defined the minimum suite of fabp genes prior to their duplication in the teleost lineages ~230-400 mya. Spotted gar, therefore, serves as an appropriate outgroup, or ancestral/ancient fish, that did not undergo the teleost-specific WGD. As such, analyses of the spatio-temporal regulation of spotted gar genes provide a foundation to determine whether the duplicated fabp genes have been retained in teleost genomes owing to either sub- or neofunctionalization. Copyright © 2017 Elsevier Inc. All rights reserved.
Thomassen, Mads; Skov, Vibe; Eiriksdottir, Freyja; Tan, Qihua; Jochumsen, Kirsten; Fritzner, Niels; Brusgaard, Klaus; Dahlgaard, Jesper; Kruse, Torben A.
The quality of DNA microarray based gene expression data relies on the reproducibility of several steps in a microarray experiment. We have developed a spotted genome wide microarray chip with oligonucleotides printed in duplicate in order to minimise undesirable biases, thereby optimising detection of true differential expression. The validation study design consisted of an assessment of the microarray chip performance using the MessageAmp and FairPlay labelling kits. Intraclass correlation coefficient (ICC) was used to demonstrate that MessageAmp was significantly more reproducible than FairPlay. Further examinations with MessageAmp revealed the applicability of the system. The linear range of the chips was three orders of magnitude, the precision was high, as 95% of measurements deviated less than 1.24-fold from the expected value, and the coefficient of variation for relative expression was 13.6%. Relative quantitation was more reproducible than absolute quantitation and substantial reduction of variance was attained with duplicate spotting. An analysis of variance (ANOVA) demonstrated no significant day-to-day variation
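The two precision figures in this abstract, fold deviation from the expected value and coefficient of variation (CV), can be computed as sketched below. The definitions are assumptions made for illustration: fold deviation is taken as the larger of observed/expected and expected/observed, and CV as the sample standard deviation divided by the mean. The function names and example values are hypothetical, not from the study.

```python
import statistics


def fold_deviation(observed, expected):
    """Symmetric fold difference between an observed and an expected ratio (always >= 1)."""
    if observed <= 0 or expected <= 0:
        raise ValueError("ratios must be positive")
    return max(observed / expected, expected / observed)


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("mean must be non-zero")
    return 100.0 * statistics.stdev(values) / mean


# Invented replicate measurements of one relative expression value.
replicates = [9.0, 10.0, 11.0]
print(coefficient_of_variation(replicates))
print(fold_deviation(2.48, 2.0))
```

Taking the maximum of the two ratios treats over- and under-estimation alike, so a measurement at 1/1.24 of the expected value counts the same as one at 1.24 times it.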
Abstract Background Obsessive-compulsive disorder (OCD) is a clinically and etiologically heterogeneous syndrome. The high frequency of obsessive-compulsive symptoms reported in subjects with the 22q11.2 deletion syndrome (DiGeorge/velocardiofacial syndrome) or Prader-Willi syndrome (15q11-13 deletion of the paternally derived chromosome) suggests that gene dosage effects in these chromosomal regions could increase risk for OCD. Therefore, the aim of this study was to search for microrearrangements in these two regions in OCD patients. Methods We screened the 15q11-13 and 22q11.2 chromosomal regions for genomic imbalances in 236 patients with OCD using multiplex ligation-dependent probe amplification (MLPA). Results No deletions or duplications involving 15q11-13 or 22q11.2 were identified in our patients. Conclusions Our results suggest that deletions/duplications of chromosomes 15q11-13 and 22q11.2 are rare in OCD. Despite the negative findings in these two regions, the search for copy number variants in OCD using genome-wide array-based methods is a highly promising approach to identify genes of etiologic importance in the development of OCD.
Venken, Koen J. T.; Popodi, Ellen; Holtzman, Stacy L.; Schulze, Karen L.; Park, Soo; Carlson, Joseph W.; Hoskins, Roger A.; Bellen, Hugo J.; Kaufman, Thomas C.
We describe a molecularly defined duplication kit for the X chromosome of Drosophila melanogaster. A set of 408 overlapping P[acman] BAC clones was used to create small duplications (average length 88 kb) covering the 22-Mb sequenced portion of the chromosome. The BAC clones were inserted into an attP docking site on chromosome 3L using ΦC31 integrase, allowing direct comparison of different transgenes. The insertions complement 92% of the essential and viable mutations and deletions tested, demonstrating that almost all Drosophila genes are compact and that the current annotations of the genome are reasonably accurate. Moreover, almost all genes are tolerated at twice the normal dosage. Finally, we more precisely mapped two regions at which duplications cause diplo-lethality in males. This collection comprises the first molecularly defined duplication set to cover a whole chromosome in a multicellular organism. The work presented removes a long-standing barrier to genetic analysis of the Drosophila X chromosome, will greatly facilitate functional assays of X-linked genes in vivo, and provides a model for functional analyses of entire chromosomes in other species.
Houcinat, N; Llanas, B; Moutton, S; Toutain, J; Cailley, D; Arveiler, B; Combe, C; Lacombe, D; Rooryck, C
The use of array-comparative genomic hybridization (array-CGH) in routine clinical work has allowed the identification of many new copy number variations (CNV). The 16p13.11 duplication has been implicated in various congenital anomalies and neurodevelopmental disorders, but it has also been identified in healthy individuals. We report a clinical observation of two brothers, born to related parents, each carrying a homozygous 16p13.11 duplication. The propositus had mild intellectual disability and posterior urethral valves with chronic renal disease. His brother was considered a healthy child with only learning disabilities and poor academic performance. However, a routine medical examination at 25 years of age revealed a mild chronic renal disease and ureteropelvic junction obstruction. Furthermore, the father presented with a unilateral renal agenesis; thus it seemed that a "congenital anomalies of kidney and urinary tract" (CAKUT) phenotype segregated in this family. This may be related to the duplication, but we cannot exclude the involvement of additional genetic or non-genetic factors in the urological phenotype. Several cohort studies showed association between this chromosomal imbalance and different clinical manifestations, but rarely with CAKUT. The duplication reported here resembled the previously described larger duplication of 3.4 Mb rather than the more common one of 1.6 Mb. It encompassed at least 11 known genes, including the five ohnologs previously identified. Our observation, in addition to expanding the clinical spectrum of the duplication, provides further support to understanding the underlying pathogenic mechanism. © 2015 Wiley Periodicals, Inc.
Schirtzinger, Erin E.; Tavares, Erika S.; Gonzales, Lauren A.; Eberhard, Jessica R.; Miyaki, Cristina Y.; Sanchez, Juan J.; Hernandez, Alexis; Müeller, Heinrich; Graves, Gary R.; Fleischer, Robert C.; Wright, Timothy F.
Mitochondrial genomes are generally thought to be under selection for compactness, due to their small size, consistent gene content, and a lack of introns or intergenic spacers. As more animal mitochondrial genomes are fully sequenced, rearrangements and partial duplications are being identified with increasing frequency, particularly in birds (Class Aves). In this study, we investigate the evolutionary history of mitochondrial control region states within the avian order Psittaciformes (parrots and cockatoos). To this aim, we reconstructed a comprehensive multi-locus phylogeny of parrots, used PCR of three diagnostic fragments to classify the mitochondrial control region state as single or duplicated, and mapped these states onto the phylogeny. We further sequenced 44 selected species to validate these inferences of control region state. Ancestral state reconstruction using a range of weighting schemes identified six independent origins of mitochondrial control region duplications within Psittaciformes. Analysis of sequence data showed that varying levels of mitochondrial gene and tRNA homology and degradation were present within a given clade exhibiting duplications. Levels of divergence between control regions within an individual varied from 0–10.9% with the differences occurring mainly between 51 and 225 nucleotides 3′ of the goose hairpin in domain I. Further investigations into the fates of duplicated mitochondrial genes, the potential costs and benefits of having a second control region, and the complex relationship between evolutionary rates, selection, and time since duplication are needed to fully explain these patterns in the mitochondrial genome. PMID:22543055
Tran Lan T
Background: Plant polyphenol oxidases (PPOs) are enzymes that typically use molecular oxygen to oxidize ortho-diphenols to ortho-quinones. These commonly cause browning reactions following tissue damage, and may be important in plant defense. Some PPOs function as hydroxylases or in cross-linking reactions, but in most plants their physiological roles are not known. To better understand the importance of PPOs in the plant kingdom, we surveyed PPO gene families in 25 sequenced genomes from chlorophytes, bryophytes, lycophytes, and flowering plants. The PPO genes were then analyzed in silico for gene structure, phylogenetic relationships, and targeting signals. Results: Many previously uncharacterized PPO genes were uncovered. The moss Physcomitrella patens contained 13 PPO genes, and Selaginella moellendorffii (spike moss) and Glycine max (soybean) each had 11 genes. Populus trichocarpa (poplar) contained a highly diversified gene family with 11 PPO genes, but several flowering plants had only a single PPO gene. By contrast, no PPO-like sequences were identified in several chlorophyte (green algae) genomes or Arabidopsis (A. lyrata and A. thaliana). We found that many PPOs contained one or two introns, often near the 3’ terminus. Furthermore, N-terminal amino acid sequence analysis using ChloroP and TargetP 1.1 predicted that several putative PPOs are synthesized via the secretory pathway, a unique finding as most PPOs are predicted to be chloroplast proteins. Phylogenetic reconstruction of these sequences revealed that large PPO gene repertoires in some species are mostly a consequence of independent bursts of gene duplication, while the lineage leading to Arabidopsis must have lost all PPO genes. Conclusion: Our survey identified PPOs in gene families of varying sizes in all land plants except in the genus Arabidopsis. While we found variation in intron numbers and positions, overall PPO gene structure is congruent with the phylogenetic
Rausher, Mark D; Huang, Jie
Although plants and their natural enemies may coevolve for prolonged periods, little is known about how long individual plant defensive genes are involved in the coevolutionary process. We address this issue by examining patterns of selection on the defensive gene threonine deaminase (TD). Tomato (Solanum lycopersicum) has two copies of this gene. One performs the canonical housekeeping function in amino acid metabolism of catalyzing the first reaction in the conversion of threonine to isoleucine. The second copy functions as an antinutritive defense against lepidopteran herbivores by depleting threonine in the insect gut. Wild tobacco (Nicotiana attenuata) also contains a defensive copy. We show that a single copy of TD underwent two or three duplications near the base of the Solanaceae. One copy retains the housekeeping function, whereas a second copy evolved defensive functions. Positive selection occurred on the branch of the TD2 gene tree subtending the common ancestor of the Nicotianoideae and Solanoideae. It also occurred within the Solanoideae clade but not within the Nicotianoideae clade. Finally, it occurred on most branches leading from the common ancestor to S. lycopersicum. Based on recent calibrations of the Solanaceae phylogeny, TD2 experienced adaptive substitutions for a period of 30-50 My. We suggest that the most likely explanation for this result is fluctuating herbivore abundances: When herbivores are rare, relaxed selection increases the likelihood that slightly disadvantageous mutations will be fixed by drift; when herbivores are common, increased selection causes the evolution of compensatory adaptive mutations. Alternative explanations are also discussed. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved.
Heide, Solveig; Keren, Boris; Billette de Villemeur, Thierry; Chantot-Bastaraud, Sandra; Depienne, Christel; Nava, Caroline; Mignot, Cyril; Jacquette, Aurélia; Fonteneau, Eric; Lejeune, Elodie; Mach, Corinne; Marey, Isabelle; Whalen, Sandra; Lacombe, Didier; Naudion, Sophie; Rooryck, Caroline; Toutain, Annick; Caignec, Cédric Le; Haye, Damien; Olivier-Faivre, Laurence; Masurel-Paulet, Alice; Thauvin-Robinet, Christel; Lesne, Fabien; Faudet, Anne; Ville, Dorothée; des Portes, Vincent; Sanlaville, Damien; Siffroi, Jean-Pierre; Moutard, Marie-Laure; Héron, Delphine
To evaluate the role that chromosomal micro-rearrangements play in patients with both corpus callosum abnormality and intellectual disability, we analyzed copy number variations (CNVs) in patients with corpus callosum abnormality/intellectual disability. STUDY DESIGN: We screened 149 patients with corpus callosum abnormality/intellectual disability using Illumina SNP arrays. In 20 patients (13%), we identified at least 1 CNV that likely contributes to the corpus callosum abnormality/intellectual disability phenotype. We confirmed that the most common rearrangement in corpus callosum abnormality/intellectual disability is inverted duplication with terminal deletion of the 8p chromosome (3.2%). In addition to the identification of known recurrent CNVs, such as deletions 6qter, 18q21 (including TCF4), 1q43q44, 17p13.3, 14q12, 3q13, 3p26, and 3q26 (including SOX2), our analysis allowed us to (1) refine the 2 known critical regions associated with 8q21.1 deletion and 19p13.1 duplication relevant for corpus callosum abnormality; (2) report a novel 10p12 deletion including ZEB1, recently implicated in corpus callosum abnormality with corneal dystrophy; and (3) report a novel pathogenic 7q36 duplication encompassing SHH. In addition, 66 variants of unknown significance encompassing candidate genes were identified in 57 patients. Our results confirm the relevance of using microarray analysis as a first-line test in patients with corpus callosum abnormality/intellectual disability. Copyright © 2017 Elsevier Inc. All rights reserved.
Papper, Zack; Jameson, Natalie M; Romero, Roberto; Weckle, Amy L; Mittal, Pooja; Benirschke, Kurt; Santolaya-Forgas, Joaquin; Uddin, Monica; Haig, David; Goodman, Morris; Wildman, Derek E
In anthropoid primates, growth hormone (GH) genes have undergone at least 2 independent locus expansions, one in platyrrhines (New World monkeys) and another in catarrhines (Old World monkeys and apes). In catarrhines, the GH cluster has a pituitary-expressed gene called GH1; the remaining GH genes include placental GHs and placental lactogens. Here, we provide cDNA sequence evidence that the platyrrhine GH cluster also includes at least 3 placenta-expressed genes and phylogenetic evidence that placenta-expressed anthropoid GH genes have undergone strong adaptive evolution, whereas pituitary-expressed GH genes have faced strict functional constraint. Our phylogenetic evidence also points to lineage-specific gene gain and loss in early placental mammalian evolution, with at least three copies of the GH gene present at the time of the last common ancestor (LCA) of primates, rodents, and laurasiatherians. Anthropoid primates and laurasiatherians share gene descendants of one of these three copies, whereas rodents and strepsirrhine primates each maintain a separate copy. Eight of the amino-acid replacements that occurred on the lineage leading to the LCA of extant anthropoids have been implicated in GH signaling at the maternal-fetal interface. Thus, placental expression of GH may have preceded the separate series of GH gene duplications that occurred in catarrhines and platyrrhines (i.e., the roles played by placenta-expressed GHs in human pregnancy may have a longer evolutionary history than previously appreciated).
Hippolyte, Loyse; Maillard, Anne M; Rodriguez-Herreros, Borja; Pain, Aurélie; Martin-Brevet, Sandra; Ferrari, Carina; Conus, Philippe; Macé, Aurélien; Hadjikhani, Nouchine; Metspalu, Andres; Reigo, Anu; Kolk, Anneli; Männik, Katrin; Barker, Mandy; Isidor, Bertrand; Le Caignec, Cédric; Mignot, Cyril; Schneider, Laurence; Mottron, Laurent; Keren, Boris; David, Albert; Doco-Fenzy, Martine; Gérard, Marion; Bernier, Raphael; Goin-Kochel, Robin P; Hanson, Ellen; Green Snyder, LeeAnne; Ramus, Franck; Beckmann, Jacques S; Draganski, Bogdan; Reymond, Alexandre; Jacquemont, Sébastien
Deletions and duplications of the 16p11.2 BP4-BP5 locus are prevalent copy number variations (CNVs), highly associated with autism spectrum disorder and schizophrenia. Beyond language and global cognition, neuropsychological assessments of these two CNVs have not yet been reported. This study investigates the relationship between the number of genomic copies at the 16p11.2 locus and cognitive domains assessed in 62 deletion carriers, 44 duplication carriers, and 71 intrafamilial control subjects. IQ is decreased in deletion and duplication carriers, but we demonstrate contrasting cognitive profiles in these reciprocal CNVs. Deletion carriers present with severe impairments of phonology and of inhibition skills beyond what is expected for their IQ level. In contrast, for verbal memory and phonology, the data may suggest that duplication carriers outperform intrafamilial control subjects with the same IQ level. This finding is reminiscent of special isolated skills as well as contrasting language performance observed in autism spectrum disorder. Some domains, such as visuospatial and working memory, are unaffected by the 16p11.2 locus beyond the effect of decreased IQ. Neuroimaging analyses reveal that measures of inhibition covary with neuroanatomic structures previously identified as sensitive to 16p11.2 CNVs. The simultaneous study of reciprocal CNVs suggests that the 16p11.2 genomic locus modulates specific cognitive skills according to the number of genomic copies. Further research is warranted to replicate these findings and elucidate the molecular mechanisms modulating these cognitive performances. Copyright © 2016 Society of Biological Psychiatry. Published by Elsevier Inc. All rights reserved.
Nishijima, Toshimitsu; Yamamoto, Hidetaka; Nakano, Takafumi; Nakashima, Torahiko; Taguchi, Ken-ichi; Masuda, Muneyuki; Motoshita, Jun-ichi; Komune, Shizuo; Oda, Yoshinao
We investigated the potential roles of HER2 and EGFR and evaluated their prognostic significance in carcinoma ex pleomorphic adenoma (CXPA). We analyzed HER2 and EGFR overexpression status using immunohistochemistry (IHC) and gene copy number gain by chromogenic in situ hybridization (CISH) in 50 cases of CXPA (40 ductal-type and 10 myoepithelial-type CXPAs). Salivary duct carcinoma was the most common histologic subtype of malignant component (n = 21). Immunohistochemistry positivity and chromogenic in situ hybridization positivity were closely correlated in both HER2 and EGFR. HER2 CISH positivity (mostly gene amplification) and EGFR CISH positivity (mostly gene high polysomy) were present in 19 (40%) and 21 (44%) cases, respectively, and were each significantly correlated with poor outcome (P = .0009 and P = .0032, respectively). Dual gain of HER2 and EGFR gene copy numbers was present in 11 cases (23%) and was the most aggressive genotype. HER2 CISH positivity was more frequently present in ductal-type CXPAs (47%) than in myoepithelial-type CXPAs (10%), whereas the prevalence of EGFR CISH positivity was similar in both histologic subtypes (42% and 50%, respectively). Our results suggest that HER2 and EGFR gene copy number gains may play an important role in the progression of CXPA, in particular ductal-type CXPAs. HER2 CISH-positive/EGFR CISH-positive tumors may be the most aggressive subgroup in CXPA. The molecular subclassification of CXPA based on the HER2 and EGFR status may be helpful for prognostic prediction and decisions regarding the choice of therapeutic strategy. Copyright © 2015 Elsevier Inc. All rights reserved.
Karaman, I; Karaman, A; Arda, N; Cakmak, O
Duplications of the gastrointestinal tract are rare anomalies, and rectal duplications account for five percent of alimentary tract duplications. We present an unusual case of rectal duplication, which was located externally in a newborn female, and discuss the types of distal hindgut duplications.
Briscoe, Adriana D.; Bybee, Seth M.; Bernard, Gary D.; Yuan, Furong; Sison-Mangus, Marilou P.; Reed, Robert D.; Warren, Andrew D.; Llorente-Bousquets, Jorge; Chiao, Chuan-Chin
The butterfly Heliconius erato can see from the UV to the red part of the light spectrum with color vision proven from 440 to 640 nm. Its eye is known to contain three visual pigments, rhodopsins, produced by an 11-cis-3-hydroxyretinal chromophore together with long wavelength (LWRh), blue (BRh) and UV (UVRh1) opsins. We now find that H. erato has a second UV opsin mRNA (UVRh2)—a previously undescribed duplication of this gene among Lepidoptera. To investigate its evolutionary origin, we screened eye cDNAs from 14 butterfly species in the subfamily Heliconiinae and found both copies only among Heliconius. Phylogeny-based tests of selection indicate positive selection of UVRh2 following duplication, and some of the positively selected sites correspond to vertebrate visual pigment spectral tuning residues. Epi-microspectrophotometry reveals two UV-absorbing rhodopsins in the H. erato eye with λmax = 355 nm and 398 nm. Along with the additional UV opsin, Heliconius have also evolved 3-hydroxy-DL-kynurenine (3-OHK)-based yellow wing pigments not found in close relatives. Visual models of how butterflies perceive wing color variation indicate this has resulted in an expansion of the number of distinguishable yellow colors on Heliconius wings. Functional diversification of the UV-sensitive visual pigments may help explain why the yellow wing pigments of Heliconius are so colorful in the UV range compared to the yellow pigments of close relatives lacking the UV opsin duplicate. PMID:20133601
Carabajal Paladino, Leonela Z; Nguyen, Petr; Síchová, Jindra; Marec, František
We work on the development of transgenic sexing strains in the codling moth, Cydia pomonella (Tortricidae), which would enable the production of male-only progeny for the population control of this pest using the sterile insect technique (SIT). To facilitate this research, we have developed a number of cytogenetic and molecular tools, including a physical map of the codling moth Z chromosome using BAC-FISH (fluorescence in situ hybridization with bacterial artificial chromosome probes). However, chromosomal localization of unique, single-copy sequences such as a transgene cassette by conventional FISH remains challenging. In this study, we adapted a FISH protocol with tyramide signal amplification (TSA-FISH) for detection of single-copy genes in Lepidoptera. We tested the protocol with probes prepared from partial sequences of Z-linked genes in the codling moth. Using a modified TSA-FISH protocol, we successfully mapped a partial sequence of the Acetylcholinesterase 1 (Ace-1) gene to the Z chromosome and thus confirmed its Z-linkage. A subsequent combination of BAC-FISH with BAC probes containing anticipated neighbouring Z-linked genes and TSA-FISH with the Ace-1 probe allowed the integration of Ace-1 into the physical map of the codling moth Z chromosome. We also developed a two-colour TSA-FISH protocol which enabled the simultaneous localization of two Z-linked genes, Ace-1 and Notch, to the expected regions of the Z chromosome. We showed that TSA-FISH represents a reliable technique for physical mapping of genes on chromosomes of moths and butterflies. Our results suggest that this technique can be combined with BAC-FISH and in the future used for physical localization of transgene cassettes on chromosomes of transgenic lines in the codling moth or other lepidopteran species. Furthermore, the developed protocol for two-colour TSA-FISH might become a powerful tool for synteny mapping in non-model organisms.
Armour, J A L; Barton, D E; Cockburn, D J; Taylor, G R
While methods for the detection of point mutations and small insertions or deletions in genomic DNA are well established, the detection of larger (>100 bp) genomic duplications or deletions can be more difficult. Most mutation scanning methods use PCR as a first step, but the subsequent analyses are usually qualitative rather than quantitative. Gene dosage methods based on PCR need to be quantitative (i.e., they should report molar quantities of starting material) or semi-quantitative (i.e., they should report gene dosage relative to an internal standard). Without some sort of quantitation, heterozygous deletions and duplications may be overlooked and therefore be under-ascertained. Gene dosage methods provide the additional benefit of reporting allele drop-out in the PCR. This could impact on SNP surveys, where large-scale genotyping may miss null alleles. Here we review recent developments in techniques for the detection of this type of mutation and compare their relative strengths and weaknesses. We emphasize that comprehensive mutation analysis should include scanning for large insertions and deletions and duplications. Copyright 2002 Wiley-Liss, Inc.
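The semi-quantitative idea described above (gene dosage reported relative to an internal standard) can be illustrated with a minimal Python sketch. It is not taken from the review: the function names, the use of a known two-copy control sample, and the 0.75 and 1.25 cut-offs are all illustrative assumptions, chosen as midpoints between the expected ratios of roughly 0.5 (heterozygous deletion), 1.0 (normal), and 1.5 (heterozygous duplication).

```python
def relative_dosage(target, reference, control_target, control_reference):
    """Dosage of a target amplicon relative to an internal standard.

    The target/reference signal ratio in the test sample is divided by the
    same ratio in a control sample assumed to carry two copies of the target.
    All argument names are hypothetical; signals may be peak areas or similar.
    """
    if min(reference, control_target, control_reference) <= 0:
        raise ValueError("reference and control signals must be positive")
    return (target / reference) / (control_target / control_reference)


def classify_dosage(ratio, low=0.75, high=1.25):
    """Map a dosage ratio to a tentative call (cut-offs are assumptions)."""
    if ratio < low:
        return "possible heterozygous deletion"
    if ratio > high:
        return "possible heterozygous duplication"
    return "normal dosage"


if __name__ == "__main__":
    # Invented signal values for illustration only.
    ratio = relative_dosage(target=520, reference=1000,
                            control_target=1010, control_reference=1000)
    print(round(ratio, 2), classify_dosage(ratio))
```

A purely qualitative readout (band present or absent) would score all three states as "present", which is why the ratio to an internal standard is the quantity of interest here.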
Lord, Nathan P; Plimpton, Rebecca L; Sharkey, Camilla R; Suvorov, Anton; Lelito, Jonathan P; Willardson, Barry M; Bybee, Seth M
Arthropods have received much attention as a model for studying opsin evolution in invertebrates. Yet, relatively few studies have investigated the diversity of opsin proteins that underlie spectral sensitivity of the visual pigments within the diverse beetles (Insecta: Coleoptera). Previous work has demonstrated that beetles appear to lack the short-wavelength-sensitive (SWS) opsin class that typically confers sensitivity to the "blue" region of the light spectrum. However, this is contrary to established physiological data in a number of Coleoptera. To explore potential adaptations at the molecular level that may compensate for the loss of the SWS opsin, we carried out an exploration of the opsin proteins within a group of beetles (Buprestidae) where short-wave sensitivity has been demonstrated. RNA-seq data were generated to identify opsin proteins from nine taxa comprising six buprestid species (including three male/female pairs) across four subfamilies. Structural analyses of recovered opsins were conducted and compared to opsin sequences in other insects across the main opsin classes (ultraviolet, short-wavelength, and long-wavelength). All nine buprestids were found to express two opsin copies in each of the ultraviolet and long-wavelength classes, contrary to the single copies recovered in all other molecular studies of adult beetle opsin expression. No SWS opsin class was recovered. Furthermore, the male Agrilus planipennis (emerald ash borer, EAB) expressed a third LWS opsin at low levels that is presumed to be a larval copy. Subsequent homology and structural analyses identified multiple amino acid substitutions in the UVS and LWS copies that could confer short-wavelength sensitivity. This work is the first to compare expressed opsin genes against known electrophysiological data that demonstrate multiple peak sensitivities in Coleoptera. We report the first instance of opsin duplication in adult beetles, which occurs in both the UVS and LWS opsin classes
Takahashi, Tadashi; Sato, Atsushi; Ogawa, Masahiro; Hanya, Yoshiki; Oguma, Tetsuya
We describe here the first successful construction of a targeted tandem duplication of a large chromosomal segment in Aspergillus oryzae. The targeted tandem chromosomal duplication was achieved by using strains that had a 5'-deleted pyrG upstream of the region targeted for tandem chromosomal duplication and a 3'-deleted pyrG downstream of the target region. Consequently, strains bearing a 210-kb targeted tandem chromosomal duplication near the centromeric region of chromosome 8 and strains bearing a targeted tandem chromosomal duplication of a 700-kb region of chromosome 2 were successfully constructed. The strains bearing the tandem chromosomal duplication were efficiently obtained from the regenerated protoplast of the parental strains. However, the generation of the chromosomal duplication did not depend on the introduction of double-stranded breaks (DSBs) by I-SceI. The chromosomal duplications of these strains were stably maintained after five generations of culture under nonselective conditions. The strains bearing the tandem chromosomal duplication in the 700-kb region of chromosome 2 showed highly increased protease activity in solid-state culture, indicating that the duplication of large chromosomal segments could be a useful new breeding technology and gene analysis method.
Garry N. Hannan
Hereditary non-polyposis colorectal cancer (HNPCC) is the commonest form of inherited colorectal cancer (CRC) predisposition and by definition describes families which conform to the Amsterdam Criteria or reiterations thereof. In ~50% of patients adhering to the Amsterdam criteria, germline variants are identified in one of four DNA mismatch repair (MMR) genes: MLH1, MSH2, MSH6 and PMS2. Loss of function of any one of these genes results in a failure to repair DNA errors occurring during replication, which can be most easily observed as DNA microsatellite instability (MSI), a hallmark feature of this disease. The remaining 50% of patients without a genetic diagnosis of disease may harbour more cryptic changes within or adjacent to MLH1, MSH2, MSH6 or PMS2 or elsewhere in the genome. We used a high density cytogenetic array to screen for deletions or duplications in a series of patients, all of whom adhered to the Amsterdam/Bethesda criteria, to determine if genomic re-arrangements could account for a proportion of patients that had been shown not to harbour causative mutations as assessed by standard diagnostic techniques. The study has revealed some associations between copy number variants (CNVs) and HNPCC mutation-negative cases and further highlights difficulties associated with CNV analysis.
Ku, Chee-Seng; Pawitan, Yudi; Sim, Xueling; Ong, Rick T H; Seielstad, Mark; Lee, Edmund J D; Teo, Yik-Ying; Chia, Kee-Seng; Salim, Agus
Research on the role of copy number variations (CNVs) in the genetic risk of diseases in Asian populations has been hampered by a relative lack of reference CNV maps for Asian populations outside the East Asians. In this article, we report the population characteristics of CNVs in Chinese, Malay, and Asian Indian populations in Singapore. Using the Illumina Human 1M Beadchip array, we identify 1,174 CNV loci in these populations that were corroborated by findings when the same samples were typed on the Affymetrix 6.0 platform. We identify 441 novel loci not previously reported in the Database of Genomic Variants (DGV). We observe a considerable number of loci that span all three populations and were previously unreported, as well as population-specific loci that are quite common in the respective populations. From this we observe the distribution of CNVs in the Asian Indian population to be considerably different from the Chinese and Malay populations. About half of the deletion loci and three-quarters of duplication loci overlap UCSC genes. Tens of loci show population differentiation and overlap with genes previously known to be associated with genetic risk of diseases. One of these loci is the CYP2A6 deletion, previously linked to reduced susceptibility to lung cancer. © 2010 Wiley-Liss, Inc.
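A statement such as "about half of the deletion loci overlap UCSC genes" rests on an interval-overlap count. The following minimal Python sketch shows one way such a fraction could be computed; it is not the authors' pipeline, the tuple layout (chromosome, start, end) and the function names are assumptions, and the coordinates are invented for illustration.

```python
def intervals_overlap(a, b):
    """True if two (chrom, start, end) intervals share at least one base.

    Coordinates are treated as closed intervals on the same chromosome.
    """
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


def fraction_overlapping(cnv_loci, genes):
    """Fraction of CNV loci that overlap at least one gene interval."""
    if not cnv_loci:
        return 0.0
    hits = sum(1 for cnv in cnv_loci
               if any(intervals_overlap(cnv, gene) for gene in genes))
    return hits / len(cnv_loci)


if __name__ == "__main__":
    # Hypothetical loci and gene annotations, not data from the study.
    deletions = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
    genes = [("chr1", 150, 300), ("chr2", 1000, 2000)]
    print(fraction_overlapping(deletions, genes))
```

The nested scan is quadratic in the number of loci and genes; for genome-scale annotation sets an interval tree or a sorted sweep would be the usual choice.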
Recent progress in the analysis of whole genome sequencing data has resulted in the emergence of paleogenomics, a field devoted to the reconstruction of ancestral genomes. Ancestral karyotype reconstructions have been used primarily to illustrate the dynamic nature of genome evolution. In this paper, we demonstrate how they can also be used to study individual gene families by examining the evolutionary history of relaxin hormones (RLN/INSL) and relaxin family peptide receptors (RXFP). Relaxin family hormones are members of the insulin superfamily, and are implicated in the regulation of a variety of primarily reproductive and neuroendocrine processes. Their receptors are G-protein coupled receptors (GPCRs) and include members of two distinct evolutionary groups, an unusual characteristic. Although several studies have tried to elucidate the origins of the relaxin peptide family, the evolutionary origin of their receptors and the mechanisms driving the diversification of the RLN/INSL-RXFP signaling systems in non-placental vertebrates have remained elusive. Here we show that the numerous vertebrate RLN/INSL and RXFP genes are products of an ancestral receptor-ligand system that originally consisted of three genes, two of which apparently trace their origins to invertebrates. Subsequently, diversification of the system was driven primarily by whole genome duplications (WGD; 2R and 3R), followed by almost complete retention of the ligand duplicates in most vertebrates but massive loss of receptor genes in tetrapods. Interestingly, the majority of 3R duplicates retained in teleosts are potentially involved in neuroendocrine regulation. Furthermore, we infer that the ancestral AncRxfp3/4 receptor may have been syntenically linked to the AncRln-like ligand in the pre-2R genome, and show that syntenic linkages among ligands and receptors have changed dynamically in different lineages. This study ultimately shows the broad utility, with some caveats, of
Fong, Chii Shyang; Kim, Minhee; Yang, T Tony; Liao, Jung-Chi; Tsou, Meng-Fu Bryan
Centrioles are 9-fold symmetric structures duplicating once per cell cycle. Duplication involves self-oligomerization of the centriolar protein SAS-6, but how the 9-fold symmetry is invariantly established remains unclear. Here, we found that SAS-6 assembly can be shaped by preexisting (or mother) centrioles. During S phase, SAS-6 molecules are first recruited to the proximal lumen of the mother centriole, adopting a cartwheel-like organization through interactions with the luminal wall, rather than via their self-oligomerization activity. The removal or release of luminal SAS-6 requires Plk4 and the cartwheel protein STIL. Abolishing either the recruitment or the removal of luminal SAS-6 hinders SAS-6 (or centriole) assembly at the outside wall of mother centrioles. After duplication, the lumen of engaged mother centrioles becomes inaccessible to SAS-6, correlating with a block for reduplication. These results lead to a proposed model that centrioles may duplicate via a template-based process to preserve their geometry and copy number. Copyright © 2014 Elsevier Inc. All rights reserved.
Guo, Yaqiong; Tang, Kevin; Rowe, Lori A; Li, Na; Roellig, Dawn M; Knipe, Kristine; Frace, Michael; Yang, Chunfu; Feng, Yaoyu; Xiao, Lihua
Cryptosporidium hominis is a dominant species for human cryptosporidiosis. Within the species, IbA10G2 is the most virulent subtype responsible for all C. hominis-associated outbreaks in Europe and Australia, and is a dominant outbreak subtype in the United States. In recent years, IaA28R4 has been emerging as a major new subtype in the United States. In this study, we sequenced the genomes of two field specimens from each of the two subtypes and conducted a comparative genomic analysis of the obtained sequences with those from the only fully sequenced Cryptosporidium parvum genome. Altogether, 8.59-9.05 Mb of Cryptosporidium sequences in 45-767 assembled contigs were obtained from the four specimens, representing 94.36-99.47% coverage of the expected genome. These genomes had complete synteny in gene organization and 96.86-97.0% and 99.72-99.83% nucleotide sequence similarities to the published genomes of C. parvum and C. hominis, respectively. Several major insertions and deletions were seen between C. hominis and C. parvum genomes, involving mostly members of multicopy gene families near telomeres. The four C. hominis genomes were highly similar to each other and divergent from the reference IaA25R3 genome in some highly polymorphic regions. Major sequence differences among the four specimens sequenced in this study were in the 5' and 3' ends of chromosome 6 and the gp60 region, largely the result of genetic recombination. The sequence similarity among specimens of the two dominant outbreak subtypes and genetic recombination in chromosome 6, especially around the putative virulence determinant gp60 region, suggest that genetic recombination plays a potential role in the emergence of hyper-transmissible C. hominis subtypes. The high sequence conservation between C. parvum and C. hominis genomes and significant differences in copy numbers of MEDLE family secreted proteins and insulinase-like proteases indicate that telomeric gene duplications could potentially contribute to
Shewmaker Christine K
Background: Camelina sativa, an oilseed crop in the Brassicaceae family, has inspired renewed interest due to its potential for biofuels applications. Little is understood of the nature of the C. sativa genome, however. A study was undertaken to characterize two genes in the fatty acid biosynthesis pathway, fatty acid desaturase (FAD) 2 and fatty acid elongase (FAE) 1, which revealed unexpected complexity in the C. sativa genome. Results: In C. sativa, Southern analysis indicates the presence of three copies of both FAD2 and FAE1 as well as LFY, a known single copy gene in other species. All three copies of both CsFAD2 and CsFAE1 are expressed in developing seeds, and sequence alignments show that previously described conserved sites are present, suggesting that all three copies of both genes could be functional. The regions downstream of CsFAD2 and upstream of CsFAE1 demonstrate co-linearity with the Arabidopsis genome. In addition, three expressed haplotypes were observed for six predicted single-copy genes in 454 sequencing analysis, and results from flow cytometry indicate that the DNA content of C. sativa is approximately three-fold that of diploid Camelina relatives. Phylogenetic analyses further support a history of duplication and indicate that C. sativa and C. microcarpa might share a parental genome. Conclusions: There is compelling evidence for triplication of the C. sativa genome, including a larger chromosome number and three-fold larger measured genome size than other Camelina relatives, three isolated copies of FAD2, FAE1, and the KCS17-FAE1 intergenic region, and three expressed haplotypes observed for six predicted single-copy genes. Based on these results, we propose that C. sativa be considered an allohexaploid. The characterization of fatty acid synthesis pathway genes will allow for the future manipulation of oil composition of this emerging biofuel crop; however, targeted manipulations of oil composition and general
Jeziorczak, Paul M; Warner, Brad W
Enteric duplications have been described throughout the entire gastrointestinal tract. The usual perinatal presentation is an abdominal mass. Foregut duplications are associated with respiratory symptoms, whereas duplications in the midgut and hindgut can present with obstructive symptoms, perforation, nausea, emesis, or hemorrhage, or can be asymptomatic and identified as an incidental finding. These are differentiated from other cystic lesions by the presence of a normal gastrointestinal mucosal epithelium. Enteric duplications are located on the mesenteric side of the native structures and are often singular with tubular or cystic characteristics. Management of enteric duplications often requires operative intervention with preservation of the native blood supply and intestine. These procedures are usually very well tolerated with low morbidity.